Friday, July 1, 2011

How to get gmail.com banned - not that I did this

When I started Mailinator, a LOT of people told me it wouldn't work because websites would ban it right away. Ban it with reckless abandon. Ban it like the new thing on the internet was to just sit around and ban Mailinator all darn day long.

As it turns out, that didn't happen. Sure, some sites do ban Mailinator and some are even really (really) excited about the idea, but in the grand scheme of things, it's not really very many. Thousands of people use Mailinator every day, so clearly, it's a useful tool that many sites accept.

Back in the day however, I sadly fell prey to the words of doom that I was being fed. I mean, holy mackerel - what if sites DO ban it? What then?

So I drew up a plan. A plan, that at this time I can say I may not be fully proud of. A plan that involved guile, wit, a few domain names, and some rate-limiting (thread-safe) data structures.

I write this now because, well, for the most part the war is over and Gotham has grown past needing Batman. Mailinator is not really the rogue tool it once was. Heck, Hotmail supports disposable email now. It's mainstream.

Typically there are two reasons people want to ban Mailinator. A few years ago, people really had some sort of notion that your email somehow equated to your identity. Given the radically insecure setup of email in general, that was really a ridiculous technical assumption. Nonetheless it was pervasive.

Secondly, people banned Mailinator for fear of people abusing their website. Now keep in mind, anything you can do with Mailinator, you can also do with YahooMail or Hotmail. It's just that Mailinator lets you do it faster, but Yahoo is plenty happy to let you sign up for 100 email accounts.

I get occasional emails from people asking me to have Mailinator stop accepting email from their site. Usually for the reason of stopping abuse. If they're nice and it makes sense, I almost always do it. But in my experience, usually when the existence of Mailinator is pinpointed as a cause of abuse, it is in truth merely an avenue that is already inherent to the internet or your website. Even shutting Mailinator down wouldn't solve the problem. The bad-guys just go somewhere else and keep on abusing.

Any sort of abuse is, needless to say, no fun for anyone. Mailinator has specific code built in to detect scripts and stop them.
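For the curious, here is a minimal sketch of what a rate-limiting, thread-safe script detector could look like. It is not Mailinator's actual code: the class name, the 100-hits-per-60-seconds limit, and the choice of a sliding window keyed by IP are all assumptions made for illustration.

```python
import threading
import time
from collections import defaultdict, deque


class ScriptDetector:
    """Flags a client that makes too many requests inside a short time window."""

    def __init__(self, max_hits=100, window_seconds=60.0):
        # Both limits are made-up numbers, not anything from the post.
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent hits
        self._lock = threading.Lock()    # guards _hits across request threads

    def looks_like_script(self, ip, now=None):
        """Record one hit for ip and report whether it is over the limit."""
        now = time.monotonic() if now is None else now
        with self._lock:
            hits = self._hits[ip]
            hits.append(now)
            # Drop timestamps that have fallen out of the window.
            while hits and now - hits[0] > self.window_seconds:
                hits.popleft()
            return len(hits) > self.max_hits
```

A request handler would call `looks_like_script(ip)` once per request and start being unhelpful when it returns True. A human clicking around never gets near the limit; a loop in a script gets there in seconds.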

In truth, Mailinator's system for detecting and shutting down scripts and abuse really only serves one purpose. It's like that silly metal bar people put on the steering wheels of their cars. Let's be real, if a thief really wants your car, some dinky metal bar on the steering wheel isn't going to do diddly to stop him.

Same with Mailinator's anti-abuse code - it won't stop a determined person - but it does make it more of a pain than simply using something else.

So - Dear hacker bad-guy abuser dudes - other disposable email sites probably don't have that sneaky anti-script pain-in-the-butt hacker-stop code like Mailinator. Go use them.

There....

1) Solved hacking and abuse on the internet ? --> Not even maybe
2) Solved a little of the hacking and abuse on the internet possibly for me? ---> DING!

Ok, back to the story - as I said, in the beginning, the idea of wide-spread Mailinator banning scared me a lot. So what did I do? I bought some additional domains for Mailinator.

To this day, you can email bob@thisisnotmyrealemail.com and it will end up at mailinator (in the bob inbox) just like bob@mailinator.com.

Cool. Alternate domains. Problem solved.

Wait a second. How exactly do I tell the world about the alternate domains without telling the people that want to ban them all?

Every few weeks I get an email like:


Hi! Love your service. Can you send me the exhaustive, comprehensive, and complete list of alternate domains so I can pick a nice one that suits my individual personal style? kThxBai


At first, I was like "Neat! People love Mailinator and want to...heeeyy.. waaiiit a second".

If I give them the whole list, then they will, um, have the whole list. And then they can ban the whole list.

Ok. I know. I'll list one random alternate domain on the homepage every time you visit. No one will have the whole list. Just one here, one there. Perfect!
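The "one random alternate domain per page load" idea is about as simple as it sounds. A rough sketch, assuming the domains sit in a plain list (only the first entry below appears in this post; the `.example` names and the function name are invented placeholders):

```python
import random

# Hypothetical list: only the first domain is named in the post.
ALTERNATE_DOMAINS = [
    "thisisnotmyrealemail.com",
    "alt-one.example",
    "alt-two.example",
]


def alternate_domain_for_this_page_load(rng=random):
    """Return one alternate domain, chosen uniformly at random."""
    return rng.choice(ALTERNATE_DOMAINS)
```

Each visitor sees a single domain, and no single page load gives away the rest of the list.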

There, problem solved. Again. Well, sort of.

Soon after I put up this "one random alternate domain per homepage load" system - the scrapers started. Every now and then I'd notice several hundred homepage loads from the same IP in a very short period of time.

They were scripts; scripts that were loading the homepage over and over and scraping out the random alternate domain that was shown. Sneaky. By doing this they could eventually formulate the entire list of alternate domains.
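How many loads would a scraper need? If each load shows one domain picked uniformly at random from a list of n (an assumption, and the post never says how long the list is), this is the classic coupon collector problem, and the expected number of loads is n times the n-th harmonic number. A small sketch, with a made-up function name:

```python
from fractions import Fraction


def expected_loads_to_see_all(domain_count):
    """Expected page loads to see every domain at least once.

    Assumes each load shows one domain chosen uniformly at random.
    """
    harmonic = sum(Fraction(1, k) for k in range(1, domain_count + 1))
    return float(domain_count * harmonic)
```

For a hypothetical list of 10 domains that works out to about 29 loads on average, so "several hundred homepage loads" is more than enough to be fairly sure of having everything.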

Drat. Now what? For a while, nothing. I just let them go. A few months later however, I got an email from a Russian guy (sorry Russian guy, I don't remember your name).


You are dumb. Your homepage is easy to scrape and doesn't change so its easy to scrape your alternate domain. You are dumb.


He was right. Well, I'm not sure about the dumb part, but my homepage was easy to scrape. Someone could probably write a script to scrape it in short order. Probably just took a few minutes.

Could I make it harder to scrape? Well, I could, but it wouldn't really slow anyone down much.

It was then however, I had a flash. An idea of simply epic proportions. A thought so crazy - that dad-burn-it, it just might work.

Let's not make the page scraping harder - let's make it EASIER.

I removed the bit of code that displayed the alternate domain and put it in its own (teensy) webpage. That "webpage" had absolutely nothing in it, except the text for the randomly chosen alternate domain itself.

Then, I embedded my new tiny webpage into the homepage (so it showed as before). Basically, to the viewer of the homepage - nothing was different. You saw the homepage and a randomly generated alternate domain, just where it was.

But to the folks that had been scraping my site, things looked plenty different. In fact, I probably broke all their scrapers (Sorry nice people trying to get all my alternate domains just to ban them! (ok, not really)).

Now here is a finer point of semantics. If you go to the Mailinator homepage, there is some text that says "Here is an alternate domain" followed by, well, a randomly chosen alternate domain.

However, now that I split off that tiny little webpage with JUST the alternate domain in it - you could go there too by typing in the url directly. And you'd see nothing BUT the alternate domain. No surrounding text. No text saying "this is an alternate domain". That little page showed a domain, but made no claim about what it was displaying.

For your browsing pleasure, here's the only direct link to that page that I know of: Go ahead, reload the page a few times. (You can see this also on the Mailinator homepage on the lower left).

After the script guys got over the minor annoyance of their scripts breaking because of my new setup, I'm sure there were office parties across the nation. Mailinator! Now even easier to scrape!

Now for the record, the rest of this post is hypothetical. An unimplemented idea if you will. Who knows - I'll bet nothing you read from here on out ever happened. Just random thoughts. Musings. One big theory. Consider it the random daydreams of a guy who runs a fun email service.

Remember all that script-detecting code from the anti-abuse system? Well, what if I put that in here too I thought. Let's "detect" when a script is hitting our weensy alternate-domain page.

And what if we also detected when the little web page is being viewed not "in" the homepage, but by itself (just like the link above)? And what if, after about 30 page hits or so from the same script, we stopped displaying only actual alternate domains and started sprinkling in some other things? Hmm... but what other things?

I know - how about "gmail.com". Or, um "hotmail.com". Or maybe, "yahoo.com".
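Staying firmly in thought-experiment territory, the idea might be sketched like this. Everything here is assumed for illustration: the names, the `.example` domain, the 20% sprinkle rate, and the `embedded_in_homepage` flag (which a real server might guess at from something like the Referer header). Only the three decoy domains and the "about 30 hits" figure come from the post.

```python
import random
from collections import Counter

REAL_DOMAINS = ["thisisnotmyrealemail.com", "alt-one.example"]  # 2nd entry is made up
DECOY_DOMAINS = ["gmail.com", "hotmail.com", "yahoo.com"]
HIT_THRESHOLD = 30       # "about 30 page hits" from the post
DECOY_PROBABILITY = 0.2  # assumed; the post gives no number for "sprinkling"

# client id -> hits on the tiny page when loaded by itself.
# A plain Counter is not thread-safe; a real server would need a lock.
standalone_hits = Counter()


def domain_to_show(client_id, embedded_in_homepage, rng=random):
    """Pick the text for the tiny alternate-domain page."""
    if embedded_in_homepage:
        # Normal visitors looking at the homepage always get a real domain.
        return rng.choice(REAL_DOMAINS)
    standalone_hits[client_id] += 1
    over_limit = standalone_hits[client_id] > HIT_THRESHOLD
    if over_limit and rng.random() < DECOY_PROBABILITY:
        return rng.choice(DECOY_DOMAINS)
    return rng.choice(REAL_DOMAINS)
```

Note that the page never claims the text is an alternate domain. It just prints a domain.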

What, in our completely and totally hypothetical situation, would that do?

Well, let's see. There are these folks out there running scripts against Mailinator collecting all my alternate domains. Those scripts probably put results in a database or something that connects to their website. When one of their users tries to sign up on their site using one of my alternate domains, it's in their database as a banned site and it's immediately rejected.

Now imagine the wacky fun if somehow, some way, (totally theoretically speaking) some silly person snuck "gmail.com" into that list. I'd guess banning your users trying to sign up with "gmail.com" addresses is probably not what you want.
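To make the failure mode concrete, here is a guess at what the receiving end might look like. This is a sketch of a hypothetical site's sign-up check, not anyone's real code; the function name and the use of a plain set in place of a database are assumptions.

```python
def is_banned(email, banned_domains):
    """Reject a sign-up if the address's domain is on the scraped ban list."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return domain in banned_domains


# The scraper blindly stores whatever text the tiny page showed it.
banned = {"thisisnotmyrealemail.com"}
banned.add("gmail.com")  # oops
```

With that one extra row in place, `is_banned("alice@gmail.com", banned)` is True, and every Gmail user bounces off the sign-up form. Nothing in the check can tell a real alternate domain from a decoy, because the scraper never verified what it collected.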

And, hypothetically speaking, if you had code that would sneak these non-alternate-domains into the page they weren't supposed to be accessing anyway, when would be the best time to set it into action?

Well, those scripts ran at many different times, but just after midnight seemed like a popular time-slot.

If such code existed, making it active Sunday morning from Midnight to 2am seems nice. I mean heck, if my website stopped accepting signups from "gmail.com" on some Sunday morning, I'm sure I'd be downright chipper to hop into the office and find out why.
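If such code existed, the time check could be tiny. A sketch, assuming the window is judged on whatever clock the `moment` value carries (the post doesn't say which timezone), with a made-up function name:

```python
from datetime import datetime


def in_sunday_window(moment):
    """True from Sunday 00:00 up to, but not including, 02:00."""
    # datetime.weekday() counts Monday as 0, so Sunday is 6.
    return moment.weekday() == 6 and moment.hour < 2
```

Something like `in_sunday_window(datetime.now())` would then gate the mischief to the two hours when the fewest people are around to notice.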

Boy. If all that stuff happened - I wonder what kind of email conversations I'd have on that Sunday afternoon? I bet they'd be like:

Your alternate domain list displayed 'gmail.com'!
Hi Fred, no it doesn't. Just reloaded the homepage 10 times, nothing like that. all the best.

or I bet another would be like:

Yahoo.com? What is this some kind of joke?
Sorry, did you mean to email this to Carol Bartz? Not sure what you're talking about.


Phew. Well, that's surely a fun thought experiment. As you can see from the link above however, it surely doesn't do anything like that. Honestly these days, most of the scrapers are gone. I think simply that the internet evolved and more of them simply lost interest in the fight.

Every now and then I'm still asked what I think about banning Mailinator. I've mellowed a lot since the early days and I pretty much always give the same answer. If you think banning Mailinator is going to solve your problem, go ahead. In my experience, it won't. And by asking, I am guessing that you are making some assumptions in your site that will surface as issues in other ways.

And of course, script writers, you now have the direct link to the alternate domain page above. Scrape away. But keep in mind, the best way not to trigger any Mailinator abuse systems is to not do anything "too fast". Those script detectors are pretty fickle little beasts. It's not a bad idea to try and stay on their good side.