this recent real-world story about a fuel leak on a plane is a great starting point for this little rant...
A fuel leak on a civilian aircraft caught the attention of Staff Sgt. Bartek Bachleda, 909th Air Refueling Squadron boom operator, during a flight from Chicago to Narita airport, Japan. After alerting the pilots and aircrew, the ranking pilot made the decision to divert the flight to San Francisco.
"I noticed the leak on the left side of the aircraft right behind the wing earlier during take-off," said Sergeant Bachleda.
Sergeant Bachleda continued analyzing the outflow of fuel to be 100 percent sure it was a leak while the plane was reaching cruising altitude. Almost an hour into the flight, he told a stewardess of the possible leak [emphasis added], but was given an unconcerned response.
...
Sergeant Bachleda said the captain and the crew were trying to figure out how the aircraft was losing 6,000 pounds of fuel an hour and then they knew exactly what was going on.
...
While conversing with the captain, the sergeant said he was hesitant at first to inform them about the leak, but he knew it was abnormal. The captain said they would have never made it to Japan if it wasn't for him.
the first draft i read in my feeds said the plane diverted back to chi-town, and then they went to SF because it's the other hub to Narita (i miss you TKO!). anyway, the impression i get from this .mil news report is that it is plausible that the plane could've flown out to the pacific before anyone noticed it was running low on fuel. i'm sure (?) they'd notice in time to divert to some island airstrip, but that's not the point...
for all the complexity and information being managed through the cockpit of the modern airliner, there is nothing that analyzes real-time information and says "you're in flying-gear and your fuel is dropping x% faster than your engines can possibly burn it... LEAK!!!"
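to show how dumb-simple that kind of check could be, here's a minimal sketch. the function name, the 10% tolerance, and all the numbers in the usage lines are made up for illustration; i have no idea what a real flight computer tracks or how it exposes it.

```python
def fuel_leak_suspected(fuel_start_lb, fuel_now_lb, hours_elapsed,
                        max_burn_lb_per_hr, tolerance=0.10):
    """Return True if fuel is dropping faster than the engines could burn it.

    All parameters are hypothetical inputs; tolerance is extra headroom
    (10% by default) so sensor noise does not trip the alarm.
    """
    if hours_elapsed <= 0:
        raise ValueError("hours_elapsed must be positive")
    # observed drop in pounds per hour
    observed_rate = (fuel_start_lb - fuel_now_lb) / hours_elapsed
    # anything beyond the maximum possible burn (plus headroom) is suspicious
    return observed_rate > max_burn_lb_per_hr * (1 + tolerance)


# made-up numbers: engines can burn at most 20,000 lb/hr
print(fuel_leak_suspected(300_000, 274_000, 1.0, 20_000))  # 26,000 lb/hr observed
print(fuel_leak_suspected(300_000, 281_000, 1.0, 20_000))  # 19,000 lb/hr observed
```

the point isn't the math, it's that one comparison against a sane upper bound is enough to turn on a flashing light.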
if you have devices that feed data into log servers, there's a lot of good info available... don't try to build some uber system for parsing data, b/c that's just letting the perfect be the enemy of the good. just parse for simple stuff that is clearly not right, b/c it's better than doing nothing.... who is hitting that explicit deny rule for outbound smtp on the firewall (b/c hosts other than your mail servers trying to send mail might be interesting... right?)? by what percent has the number of messages containing the word 'alert' or 'error' or 'critical' changed over the last hour, day, and week? user xyz has 314 denied access attempts on network shares over the last hour. etc...
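a rough sketch of what "parse for simple stuff" might look like. the log line format in the comment is something i invented for the example (your firewall's will differ), and the function names are hypothetical too. it only covers two of the checks: who is hitting the smtp deny rule, and how the alert/error/critical count has moved.

```python
import re
from collections import Counter

# hypothetical firewall log line, NOT any real vendor's format:
#   "10:15:02 DENY src=10.1.2.3 dst=203.0.113.9 dport=25"
DENY_SMTP = re.compile(r"\bDENY\b.*\bsrc=(?P<src>\S+).*\bdport=25\b")
KEYWORDS = re.compile(r"\b(alert|error|critical)\b", re.IGNORECASE)


def smtp_deny_sources(lines, mail_servers=()):
    """Count denied outbound smtp attempts per source, skipping known mail servers."""
    hits = Counter()
    for line in lines:
        m = DENY_SMTP.search(line)
        if m and m.group("src") not in mail_servers:
            hits[m.group("src")] += 1
    return hits


def keyword_count(lines):
    """Number of lines mentioning alert, error, or critical (case-insensitive)."""
    return sum(1 for line in lines if KEYWORDS.search(line))


def percent_change(current, previous):
    """Percent change from previous to current; None if there is no baseline."""
    if previous == 0:
        return None
    return 100.0 * (current - previous) / previous
```

feed `keyword_count` the lines from this hour and from the hour before, hand both numbers to `percent_change`, and you have your first flashing light. same idea for day and week.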
you're not looking to build some uber silver bullet, just a series of flashing lights to pull your attention to a particular area... without knowing too much about the data, you can still put things within reasonable boundaries and alert when something spikes... add a built-in regexp-based tuner for false positives, and you're all set to learn some new stuff about your environment...
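here's one way the "reasonable boundaries plus regexp tuner" idea could be sketched. this is not my actual code, just an illustration: the mean-plus-three-standard-deviations boundary is one assumption among many possible ones, and the function names and the ignore pattern are hypothetical.

```python
import re
from statistics import mean, stdev


def filter_false_positives(lines, ignore_patterns):
    """Drop lines matching any 'known noise' regexp; this is the tuner."""
    compiled = [re.compile(p) for p in ignore_patterns]
    return [ln for ln in lines if not any(c.search(ln) for c in compiled)]


def is_spike(history, current, sigmas=3.0):
    """True if current is above mean(history) + sigmas * stdev(history).

    history holds counts from earlier intervals (e.g. previous hours).
    The deviation is floored at 1.0 so a perfectly flat history does not
    alert on a change of one or two events.
    """
    if len(history) < 2:
        return False  # not enough data to draw a boundary yet
    boundary = mean(history) + sigmas * max(stdev(history), 1.0)
    return current > boundary


# made-up example: denied share-access attempts per hour for one user
history = [10, 12, 11, 9, 10]
print(is_spike(history, 314))
print(is_spike(history, 12))
```

run the lines through `filter_false_positives` first (say, with a pattern for that one backup account that always gets denied), count what's left, then ask `is_spike`. every time a light flashes for something boring, add a pattern and move on.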
i've got some horribly mangled, embarrassing code stuffed away in this space, and since i'm building things out for some analysis coming up (and hopefully interesting blog foo) maybe i can break this out and get things ready for release... perhaps... if nothing shiny gets in my way... ;)