Monday, November 13 2006
I spent much of the day trying to come up with algorithms for rating the data being collected by the various thermal sensors in my solar heating system. Those sensors, by the way, have been reporting only the blandest of information, given that the days have been overcast and unseasonably warm, with temperatures rising and falling within the confines of the Fahrenheit 50s.
By "rating the data," I mean scanning through it and finding points of inflection in the grand waveforms. For the solar panel's sensor, such points of inflection occur when the sun first shines on the panel in the morning or when it starts to fade in the late afternoon. For other parts of the system, the inflection points happen at other times, depending on the speed of the hydronic fluid or whether or not someone has decided to place a load on the system by, for example, taking a shower.
Finding those inflection points in the data is easy when you're scanning through them with your eyes. You'll see the numbers bounce around a certain value or gradually drift upward or downward with the changes in the weather. Then, suddenly, the numbers will start to increase rapidly, a degree per second in some cases. Or they'll drop, though this process is always slower. But how do you find those places in the data so you can flag them? Knowing where they are and when they occur in relation to each other across the various sensors can provide extremely useful information.
Today I wrote a data scanner that looks at a variable-sized "window" of data on either side of each point it analyzes and performs a statistical evaluation. It figures out the highest and lowest data values in the window and then asks if those are within a certain distance of the edge of the window. It then asks if the point in question is statistically closer to the highest value or the lowest one in the window. Then it asks if the same could be said about the point lying to its left or right. Depending on the answers to these questions, it decides whether or not the point is one of inflection. This code was capable of finding inflection points artificially introduced into the data, though I've yet to collect any data interesting enough to analyze with this system.
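For anyone curious what a windowed scan of this sort looks like in code, here is a rough Python sketch. It is a simplified variant of the idea, not the scanner described above: instead of asking every one of those questions, it compares the spread of the readings on either side of a point and flags the point where a flat stretch meets a rapidly changing one. The function name find_knees and the parameters half_window, flat_tol, and min_change are made up for illustration, and the default values are guesses rather than tuned settings.

```python
def find_knees(values, half_window=5, flat_tol=0.5, min_change=3.0):
    """Return indices where a flat stretch meets a rapidly changing one.

    Hypothetical parameters (assumptions, not tuned values):
      half_window: number of readings examined on each side of a point
      flat_tol:    largest spread still counted as "flat"
      min_change:  smallest spread counted as a real rise or fall
    """
    knees = []
    # Only points with a full window on both sides are examined.
    for i in range(half_window, len(values) - half_window):
        left = values[i - half_window : i + 1]   # readings up to and including the point
        right = values[i : i + half_window + 1]  # the point and the readings after it
        left_spread = max(left) - min(left)
        right_spread = max(right) - min(right)

        # Start of a change: quiet on the left, moving on the right,
        # and the very next reading has already left the flat band.
        starts = (
            left_spread <= flat_tol
            and right_spread >= min_change
            and abs(values[i + 1] - values[i]) > flat_tol
        )
        # End of a change: the mirror image of the test above.
        ends = (
            right_spread <= flat_tol
            and left_spread >= min_change
            and abs(values[i] - values[i - 1]) > flat_tol
        )
        if starts or ends:
            knees.append(i)
    return knees


# Artificial data: flat at 50, a climb of one degree per reading, then flat at 58.
readings = [50.0] * 10 + [50.0 + n for n in range(1, 9)] + [58.0] * 10
print(find_knees(readings))
```

On this artificial data the sketch flags the last flat reading before the climb and the first reading of the plateau after it. A change slower than flat_tol per reading would slip past the neighbor test, so real sensor data would likely need different settings or some smoothing first.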
Another problem is figuring out when it's best to dump logs and raw MySQL data that have already been analyzed. This is important, since the system is logging several megabytes a day of numeric text.
By the way, if anyone has a suggestion for a less clunky way to identify inflection points in data, I'd be interested in learning about it. I'm sure greater minds have spent whole lifetimes on this problem and come up with better solutions than the CPU-intensive one that took me a couple of hours.