Friday, January 2 2004
setting: rural Hurley, Ulster County, New York
Because of lingering psychological impairment from the New Year's hangover, I didn't resume my projects until today.
Some of you may remember a big anti-spam crusade I'd pursued back in August. I'd thought I'd successfully installed SpamBouncer on my Spies email account and that my spam problems would be over. It definitely seemed that way at first, but (much like George W. Bush) I declared victory prematurely. Evidently I did so during a general post-SoBig lull in the global spam volume. When it picked back up again, I found my mailbox getting hammered the same as always. Something in my SpamBouncer installation wasn't working, but damn if I could figure out what it was. The installation is all written in something called Procmail, a computer language completely unlike any other. To me its scripts make about as much sense as Mesopotamian cuneiform. Not only is Procmail cryptic, but its programs are difficult to debug. You basically have to set it up, manually spam it from a Hotmail account, and then examine its logs.
Today I decided to take a different approach. I would replace all the mailto: links on my website with links to a web form that would send me mails containing special headers. I would also maintain a whitelist of known correspondents. But those would be the only emails I would accept. After replacing all the mailto: links on my site with links to the form, all I'd have to do would be string together the snippets of Procmail code to handle these two conditions.
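The two acceptance conditions are simple enough to sketch outside of Procmail. What follows is only a rough Python illustration of the logic, not the actual recipes: the header name, the token, and the whitelisted address are all made-up placeholders, since the real ones aren't given here.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical placeholders: the real header name, token, and whitelist
# are not stated in this entry.
SECRET_HEADER = "X-Site-Feedback"
SECRET_VALUE = "example-token"
WHITELIST = {"friend@example.com"}


def accept(raw_message: str) -> bool:
    """Return True if the message passes either of the two conditions."""
    msg = message_from_string(raw_message)

    # Condition 1: the sender is a known correspondent.
    _, sender = parseaddr(msg.get("From", ""))
    if sender.lower() in WHITELIST:
        return True

    # Condition 2: the message carries the special header that only
    # the web form adds.
    return msg.get(SECRET_HEADER, "").strip() == SECRET_VALUE
```

Anything that fails both checks would be discarded, which is exactly what makes this scheme so harsh on strangers.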
I automated the task of replacing all the mailto: links on my site, of course. First I found an FTP client capable of downloading just the filetypes specified in a filter (CuteFTP does this, but my favorite FTP client does not). Then I downloaded all the .htm, .html, .shtml, and .php files from my site. It's a 600 megabyte site, but these hypertext files came to only about 50 megabytes, a doable download over dialup.
I spent much of today writing and debugging a VBScript robot that recursively waded through a file system and replaced mailto: links with links to a single file at a known place in the file system. It was a tricky programming project, one requiring lots of test runs. Each of these tests parsed through most of what I've written in the past seven years, something that took my computer less than a minute to do. As the wading robot dropped down lower and lower into the directory structure, it had to measure how many directories upward to send the link it was making. When I later explained this process to Gretchen, I told her that it was like calculating the trajectory of an object tossed out of holes of various depths such that the object would always land in the same place outside of the hole.
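The depth calculation is the heart of it: a file sitting N directories below the site root needs N copies of "../" in front of the form's filename. Here is a rough Python sketch of that idea (the robot itself was VBScript, and this is not its code). The form page name feedback.php is a hypothetical stand-in, and the regular expression assumes quoted href attributes, which real-world HTML doesn't always have.

```python
import os
import re

# Matches href="mailto:..." or href='mailto:...' (quoted attributes only).
MAILTO = re.compile(r'href\s*=\s*(["\'])mailto:[^"\']*\1', re.IGNORECASE)
EXTENSIONS = (".htm", ".html", ".shtml", ".php")


def relative_prefix(file_path, root):
    """Return one '../' for each directory between file_path and root."""
    rel_dir = os.path.relpath(os.path.dirname(file_path), root)
    depth = 0 if rel_dir == "." else len(rel_dir.split(os.sep))
    return "../" * depth


def rewrite_links(html, file_path, root, form_page="feedback.php"):
    """Point every mailto: link at the form page in the site root.

    form_page is a hypothetical filename, not the real one.
    """
    link = relative_prefix(file_path, root) + form_page
    return MAILTO.sub(lambda m: "href=" + m.group(1) + link + m.group(1), html)


def rewrite_tree(root):
    """Walk the downloaded site and rewrite each hypertext file in place."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith(EXTENSIONS):
                continue
            path = os.path.join(folder, name)
            # latin-1 round-trips every byte, so odd encodings survive.
            with open(path, encoding="latin-1", newline="") as f:
                original = f.read()
            updated = rewrite_links(original, path, root)
            if updated != original:
                with open(path, "w", encoding="latin-1", newline="") as f:
                    f.write(updated)
```

A page two directories down, say site/a/b/page.html, ends up with links to ../../feedback.php, while a page in the root links to plain feedback.php.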
Actually, it was a little more complicated than the way I've described it because while it was in there, the robot also replaced form links on my father's website with links to a different form mailer page.
After I'd uploaded the modified hypertext, I began to have doubts about my harsh filtering rules. I'd printed my email address on my business cards, and I occasionally get business that way. All those people would be filtered out now.
Much as I didn't want to, I decided to wade back into SpamBouncer's Procmail swamp. Though I hadn't noticed it happening, I must have actually learned something about Procmail during the agonizing work of getting simple whitelist and header-examination Procmail recipes (plagiarized from the web) to work. When I looked at the old .procmail file I'd uploaded back in August, it was immediately obvious where the syntax error in the code was. Once I fixed that, SpamBouncer in all its wonderfully configurable glory was working on my account. I almost wept with joy!
I configured SpamBouncer so that it would throw away mail having even the slightest spamlike trait. All my regular correspondents would still get through (because of my whitelist), and all my website feedback would still get through (because of the secret header tagging). That left random strangers who discovered my email address in print or by word-of-mouth. Such email would still get through, so long as it wasn't trying to sell me something or get me to slap a herbal viagra patch on my penis to collect a fabulous prize from Paris Hilton.