About email.php


Spam sucks.

Most of us deal with lots and lots of spam in our E-mail inboxes every day. It comes from all over the world and advertises products and services ranging from fast money-making schemes to "personal enhancement." And almost all of us spend a lot of time weeding through and deleting it.

But there's one thing that spam wouldn't be possible without: E-mail addresses.

That's right. Before you can receive spam, your E-mail address has to become known to a spammer. Spammers don't "magically" learn your E-mail address through the proverbial grapevine; they acquire it in one of several possible ways.

Some of the ways your E-mail address can get to a spammer are as follows:

After seeing all this, it may seem that no matter what you do, your E-mail address is going to end up online. And the truth is that you're right. Almost all users who do anything more with the Internet than check the local weather will inevitably have to give out their E-mail address in a potentially insecure fashion. This very weakness is what the spammers live by, feast on, and drool over every day.

But there is still one more piece of the puzzle. A spammer doesn't have some sort of extra-sensory perception. He or she isn't somehow magically told when you happen to post your E-mail address somewhere. Obtaining all of these public E-mail addresses isn't exactly free or easy. There are a few methods spammers will use to obtain your E-mail address:

The first two of the above methods are pretty much a done deal: there isn't much you can do about them. If you want to keep your E-mail address from ending up on a CD-ROM, you'll probably have to refrain from any online activity that uses your E-mail address as identification. Unless you feel like reading (sometimes extremely lengthy) privacy policies, you will quickly find that you gave your address to a site that said, somewhere in the midst of a sea of words, that it may be sold or distributed to others as part of a business venture.

The second method gives you a bit more room to protect yourself, but again, doing so may shut you out of bulletin boards that you enjoy posting to.

The third and fourth methods here are key. These methods simply involve searching the Internet for E-mail addresses. And that's where the email.php script comes in.

email.php is a PHP script that generates 2,000 completely random, invalid E-mail addresses. The addresses look well-formed enough that most crawlers will accept them and add them to their lists of E-mail addresses. The script also puts a link back to itself on each page, with a random ID tacked onto it, so most crawlers will think it's a different page and recursively download more and more fake E-mail addresses!
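
To make the idea concrete, here is a rough sketch of what such a page generator might look like. It is written in Python purely for illustration and is not the actual email.php source. The function names, the HTML layout, the "id" query parameter, and the use of the reserved .invalid top-level domain are all assumptions made for this example; only the count of 2,000 and the self-link with a random ID come from the description above.

```python
import random
import string

ADDRESS_COUNT = 2000  # figure taken from the description above


def random_label(rng, min_len=4, max_len=10):
    """Return a random lowercase string to use as a mailbox or host name."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def fake_address(rng):
    """Build one well-formed but undeliverable address.

    Assumption: the reserved .invalid top-level domain is used so the
    address can never belong to a real person.
    """
    return f"{random_label(rng)}@{random_label(rng)}.invalid"


def render_page(script_name="email.php", count=ADDRESS_COUNT, rng=None):
    """Return an HTML page of fake addresses plus a link back to itself."""
    if rng is None:
        rng = random.Random()
    lines = ["<html><body>"]
    for _ in range(count):
        addr = fake_address(rng)
        lines.append(f'<a href="mailto:{addr}">{addr}</a><br>')
    # A fresh random ID makes the link look like a page not yet visited.
    page_id = rng.randrange(10**9)
    lines.append(f'<a href="{script_name}?id={page_id}">More addresses</a>')
    lines.append("</body></html>")
    return "\n".join(lines)
```

Calling render_page() returns the page as a string; a real deployment would print it from whatever serves the request.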

Crawling technology is crucial to the Internet. All major search engines employ crawling techniques to scour the Web for new or changed Web pages. When a crawler runs, it starts with a list of known pages and visits each one. Then it looks for links on all of the pages it visited, and visits all of those links. The process repeats for a long, long time. Crawlers are intelligent and keep track of which pages they've already viewed so they don't reload pages that they don't need to. The end result for most search engines is an enormous database of the pages on the Internet. A search engine's crawler will also handle the task of assigning keywords to the pages for use in searching, and possibly maintaining a cached local copy of each page (as Google does).
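
The visit-and-follow loop just described can be sketched in a few lines of Python. This is a simplified illustration, not how any particular search engine works: the crawl function, its fetch_links parameter, and the toy three-page "Web" are made up for the example, and no real network access takes place.

```python
from collections import deque


def crawl(seed_urls, fetch_links, max_pages=1000):
    """Visit pages breadth-first, never loading the same URL twice.

    fetch_links is a caller-supplied function (hypothetical here) that
    returns the list of links found on the page at a given URL.
    """
    visited = set()
    queue = deque(seed_urls)
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue  # already seen; skip the reload
        visited.add(url)
        order.append(url)
        for link in fetch_links(url):
            if link not in visited:
                queue.append(link)
    return order


# A toy "Web" of three pages that all link to one another.
TOY_WEB = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a"]}

pages_in_order = crawl(["a"], lambda url: TOY_WEB.get(url, []))
```

Because of the visited set, each of the three pages is loaded exactly once even though they link to each other in a circle.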

A spammer's crawler doesn't look for keywords, however; it looks for E-mail addresses. Just like a search engine, it will start with a list of known sites and visit all of the links on each page recursively looking for more pages to visit on which to scan for E-mail addresses. It's like a wild animal desperate for food; it will search forever. Most crawlers take each address they find and add it to a central database which the spammer later uses to send out spam E-mails.
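
As a rough idea of what the "look for E-mail addresses" step might involve, here is a small Python sketch. The regular expression is a deliberately loose pattern chosen for this example; real harvesting software may use something quite different, and the harvest function and sample text are hypothetical.

```python
import re

# Loose pattern: something@something.tld. It does not validate addresses,
# it only finds text that looks like one.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(page_text, database):
    """Add every address-like string on the page to the database (a set).

    Returns how many of the addresses were not in the database before.
    """
    found = set(EMAIL_PATTERN.findall(page_text))
    new_count = len(found - database)
    database.update(found)
    return new_count


database = set()
sample = 'Contact <a href="mailto:bob@site.example">bob@site.example</a> or alice@mail.example.'
added = harvest(sample, database)
```

A pattern this simple has no way to tell a real address from a randomly generated one, which is exactly the weakness email.php relies on.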

The email.php script attacks both the third and fourth methods. Since the page contains what would appear to most scripts to be valid E-mail addresses, a spammer searching a search engine for telltale phrases of E-mail addresses will most likely stumble on the page eventually. By the same token, a crawler that searches the Web will see the page, add all of the false E-mail addresses to its database, and then, thanks to the loopback link, will crawl the page...again and again...and again and again...all the while adding bogus, completely useless E-mail addresses to its database.
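
The following Python simulation sketches why the loopback link defeats a crawler's "already visited" check. Everything in it is an assumption made for illustration: the trap page is a stand-in that returns three fake addresses per visit, it uses a counter instead of a random ID so the result is repeatable, and the harvester is a toy that stops only when its page budget runs out.

```python
import itertools
import re

EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LINK_PATTERN = re.compile(r'href="([^"]+)"')


def make_trap():
    """Return a fetch function that imitates a self-linking address page."""
    counter = itertools.count(1)

    def fetch(url):
        n = next(counter)
        addresses = " ".join(f"user{n}x{i}@host{n}.invalid" for i in range(3))
        # Every visit advertises a link the crawler has never seen before.
        return f'{addresses} <a href="email.php?id={n}">More</a>'

    return fetch


def run_harvester(start_url, fetch, page_budget):
    """Crawl from start_url, collecting addresses until the budget is spent."""
    visited, database = set(), set()
    queue = [start_url]
    while queue and len(visited) < page_budget:
        url = queue.pop(0)
        if url in visited:
            continue
        visited.add(url)
        text = fetch(url)
        database.update(EMAIL_PATTERN.findall(text))
        queue.extend(LINK_PATTERN.findall(text))
    return len(visited), len(database)


pages, fakes = run_harvester("email.php", make_trap(), page_budget=50)
```

In this toy run the harvester never runs out of "new" pages on its own; it stops only because of the budget, and every address it collected is bogus.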

This is sweet revenge on the spammer! You know how annoyed you get with deleting spam? Think of how annoyed your spammer will be when he finds thousands, possibly millions, of completely bogus E-mail addresses in his database! This can also curtail the spamming activity itself, because if the bogus addresses go unnoticed, the time a spammer's software spends trying to E-mail them is time during which it can't spam legitimate users!

To top it all off, the page opens with a nice block of convincing text. It disguises itself as a somewhat-legit service that hands out free E-mail addresses. This may be just enough to trick the dumber spammer into crawling the page.

So there you have it. We, the users, can fight spam. This is just one of many ways we can work against it - and it gives a feeling of satisfaction to know that you're not just ignoring it, you're actively fighting back against it.


(C) 2005 Flint Million