Your leaking thatched hut during the restoration of a pre-Enlightenment state.

 

Hello, my name is Judas Gutenberg and this is my blaag (pronounced as you would the vomit noise "hyroop-bleuach").



   bad parser and bad radio noise
Tuesday, October 25 2011

Today I began working on the web spider that would parse through my files containing the names of various organisms and the ESRI layer data used to plot their ranges on a map, taking relevant data and sticking it into a MySQL database (since that's the database system with which I am most comfortable). It turned out that the format for ESRI captioned map data is identical to that used for dBase files. I don't know anything about dBase other than that it is a 1980s-era database system that one rarely hears about these days. But it didn't matter; PHP has extensions allowing it to read files in the dBase format, and this made it easy for me to quickly read organism data into my MySQL organism table.
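For readers without PHP's dBase extension, here is a minimal Python sketch of what reading such a file involves. It is not the code used for this project; it assumes a plain dBase III-style layout (a 32-byte header, 32-byte field descriptors, fixed-width records) and returns every field as a stripped string. The function name read_dbf_records is made up for illustration, and the MySQL insert step is left out.

```python
import struct


def read_dbf_records(stream):
    """Yield each non-deleted record of a dBase III-style file as a dict of strings."""
    header = stream.read(32)
    # Bytes 4-7: record count; 8-9: header length; 10-11: record length.
    # All three are little-endian.
    num_records, header_len, record_len = struct.unpack("<IHH", header[4:12])

    # The header is 32 bytes, then one 32-byte descriptor per field,
    # then a single 0x0D terminator byte.
    num_fields = (header_len - 33) // 32
    fields = []
    for _ in range(num_fields):
        descriptor = stream.read(32)
        name = descriptor[:11].split(b"\x00", 1)[0].decode("ascii")
        length = descriptor[16]  # field width in bytes
        fields.append((name, length))

    # Skip the terminator (and any padding) so we land on the first record.
    stream.read(header_len - 32 - 32 * num_fields)

    for _ in range(num_records):
        record = stream.read(record_len)
        if len(record) < record_len:
            break  # truncated file
        if record[:1] == b"*":
            continue  # first byte is the deletion flag; "*" marks a deleted record
        row = {}
        position = 1
        for name, length in fields:
            raw = record[position:position + length]
            row[name] = raw.decode("latin-1").strip()
            position += length
        yield row
```

Opened with open(path, "rb"), a file of organism names would come back one dict per record, ready to be handed to whatever database layer does the inserting.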
The next step was to go through these MySQL records and spider additional content and images from the web. The best source for such information (partly because its pages all share the same format) is Wikipedia. It took me a while to develop a spider that could actually read a Wikipedia page without throwing an error (you have to set a User-Agent header or the server will respond with a 403 error), but once I did, all I had to do was use my homemade parsers to slurp data out of identifiable places in each web page's structure.
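The User-Agent requirement is easy to show in a few lines. This is a Python sketch using only the standard library, not the spider described above; the USER_AGENT string and the function names build_request and fetch_page are placeholders invented for the example.

```python
import urllib.parse
import urllib.request

# Hypothetical identifier; a real spider should name itself and give contact details.
USER_AGENT = "OrganismRangeSpider/0.1 (contact: you@example.com)"


def build_request(title):
    """Build a request for a Wikipedia article with an explicit User-Agent header."""
    path = urllib.parse.quote(title.replace(" ", "_"))
    url = "https://en.wikipedia.org/wiki/" + path
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_page(title, timeout=10):
    """Download the article's HTML as text (this makes a network call)."""
    with urllib.request.urlopen(build_request(title), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling fetch_page("Bufo bufo") would return the page's HTML, which can then be passed to a parser.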
Or so I thought; it turned out that I had a bug in one of my workhorse parsing functions, known as parseTwoPartBetween. It takes a string a, looks through it for one string b followed at some point by another string c, and returns what lies between c and a fourth string d. It's great for parsing out the contents of arbitrary HTML nodes. The problem was that it was failing to work properly with multi-character b strings, a bug that hadn't shown up in testing or in most applications. Because the vagueness of the variable names in this complicated function rendered it a complete tear-down, I had to rewrite it from scratch. The old buggy version of the function had 140 lines of code, while the new one had only 93 lines and didn't seem to have any bugs.
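The behavior described for parseTwoPartBetween can be sketched in Python as follows. This is a reconstruction from the description only, not the original function (which is far longer and presumably handles more cases); the parameter names are invented, and returning None when any of the markers is missing is an assumption.

```python
def parse_two_part_between(text, first, start, end):
    """Return the text between `start` and `end`, where `start` must follow `first`.

    In the post's terms: text is a, first is b, start is c, end is d.
    Returns None if any of the three markers cannot be found in order.
    """
    first_at = text.find(first)
    if first_at == -1:
        return None

    # Resume the search after the whole of `first`, however many characters it has.
    start_at = text.find(start, first_at + len(first))
    if start_at == -1:
        return None

    content_at = start_at + len(start)
    end_at = text.find(end, content_at)
    if end_at == -1:
        return None

    return text[content_at:end_at]
```

For example, given the HTML fragment `<th>Family:</th><td>Bufonidae</td>`, calling parse_two_part_between(html, "Family:", "<td>", "</td>") picks out `Bufonidae`.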

While taking a bath this evening, I kept hearing the solar controller resetting, which triggers audible clicks from one or more of its relays. When it's resetting a lot it can sound like a katydid, though I've put code in there to slow it down when this happens. I figured that tonight's behavior had something to do with the fact that the computer it is attached to was in standby, a state that makes its lines more susceptible to the kind of noise that can trigger a communications-based reset. (Remember, this is serial-based Arduino technology, and recent-model Arduinos can be reset via their communications connection so as to make reflashing their firmware effortless.) So when I got out of the bath, I went to investigate. Interestingly, though, the solar controller stopped resetting the moment I turned off the lights in the bathroom where I'd been taking a bath. Evidently the electrical noise generated by the three dimmable compact fluorescent bulbs in that bathroom is strong enough to cause resets on the solar controller when the computer it is attached to happens to be in standby. Mind you, the bathroom is about 15 feet away from the boiler room and uses a separate circuit from anything used by equipment in the boiler room. This is why RF hacking will always be a black art, a field where superstitious ritual works as well as or better than common sense.


For linking purposes this article's URL is:
http://asecular.com/blog.php?111025
