Splunk is an enterprise-class machine data gathering and analysis tool, capable of consuming hundreds of gigabytes of machine-generated data, making it arbitrarily searchable and presenting it with fancy reports and graphs. I’ve used Splunk in the past for a project and another time to do post-mortem analysis of a hacked webserver.
Since I was working on a driver for the NXT2WIFI and programming an SNMP agent in ROBOTC, I thought, “Why not have Splunk do something with this?” So I hooked up a LEGO Light Sensor, Sound Sensor and a HiTechnic Barometer Sensor. Now I could measure atmospheric pressure, temperature, ambient light level and background sound over WiFi.
I implemented some SNMP OIDs to return appropriate sensor values, depending on the OID queried:
- iso.3.6.1.2.1.1.5.0: name of the brick
- iso.3.6.1.3.1: temperature in C
- iso.3.6.1.3.2: pressure in 1/1000th inch Hg
- iso.3.6.1.3.3: sound
- iso.3.6.1.3.4: ambient light
In order to fetch all the sensor data, I cobbled together a quick script to run on the Splunk host (a Linux VM):
#!/bin/sh
IP=192.168.0.102
DATE=`date`
TEMP=`snmpget -r 0 -c public -v 1 $IP iso.3.6.1.3.1 | sed 's/[^:]\+: //'`
TEMP=`echo "scale=2; $TEMP/100" | bc`
sleep 1
PRESS=`snmpget -r 0 -c public -v 1 $IP iso.3.6.1.3.2 | sed 's/[^:]\+: //'`
sleep 1
SOUND=`snmpget -r 0 -c public -v 1 $IP iso.3.6.1.3.3 | sed 's/[^:]\+: //'`
sleep 1
LIGHT=`snmpget -r 0 -c public -v 1 $IP iso.3.6.1.3.4 | sed 's/[^:]\+: //'`
echo $DATE,$TEMP,$PRESS,$SOUND,$LIGHT
This produced a nice and easy to digest output:
Mon Jun 11 20:46:19 CEST 2012,23.5,30761,34,60
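For anyone who wants to see what the sed and scaling steps in the script do without a brick on the network, here is a minimal sketch that runs them on a canned line. The SAMPLE line is an assumed example of typical snmpget output, not something captured from the NXT, and the function names extract_value and to_degrees are made up for illustration. It also assumes the agent reports temperature in hundredths of a degree, which is what the division by 100 in the script suggests.

```shell
#!/bin/sh
# Sketch: the value-extraction step of the polling script, run on a canned line.
# SAMPLE is an assumed example of snmpget output, not captured from the NXT.
SAMPLE='iso.3.6.1.3.1 = INTEGER: 2350'

# Strip everything up to and including the first ": ", leaving only the value.
# '^[^:]*: ' is a portable spelling of the GNU-only '[^:]\+: ' used in the script.
extract_value() {
    sed 's/^[^:]*: //'
}

# Convert hundredths of a degree to degrees with two decimals.
# awk stands in for bc here only so the sketch has one less dependency.
to_degrees() {
    awk -v raw="$1" 'BEGIN { printf "%.2f\n", raw / 100 }'
}

RAW=$(printf '%s\n' "$SAMPLE" | extract_value)
TEMP=$(to_degrees "$RAW")
echo "$RAW,$TEMP"
```

Swapping the canned line for a real snmpget call gives back the behaviour of the script; the anchored pattern only matters on sed implementations that do not understand `\+`.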
Next I set up Splunk to run this every minute or so. After some tweaking and fiddling, I managed to get it to produce some nifty looking reports:
Overkill? Sure, but it was pretty cool to have my little NXT being queried by enterprise-class tooling! All the SNMP agent code will be part of the driver suite when I get around to publishing it.
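As for the "run this every minute" step, one common way to do it in Splunk is a scripted input. The fragment below is only a sketch of what such a stanza in inputs.conf might look like; it is not necessarily the configuration used for this post. The script name nxt_sensors.sh, its location and the sourcetype name are all assumptions.

```ini
# Hypothetical stanza for $SPLUNK_HOME/etc/system/local/inputs.conf
# nxt_sensors.sh is an assumed file name for the polling script above.
[script://$SPLUNK_HOME/bin/scripts/nxt_sensors.sh]
# Run the script every 60 seconds and index whatever it prints to stdout.
interval = 60
# Assumed sourcetype name, used later to find these events in searches.
sourcetype = nxt_sensors
disabled = 0
```

Each run then becomes one event containing the comma-separated line, which can be split into fields at search time.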
Ah, to have that kind of time. I definitely like the remote sensing capability that you created, Xander. Great lead in for a robotic explorer.
How experienced are you in shell scripting? I’ve been at it for a year, and have posted some (rather lengthy) programs on my blog. (My blog that needs comments on making posts MUCH SHORTER.)
Anyway, I’m now trying to run this on my machine. Since I installed Linux as my desktop directly, it doesn’t have the same software installed. Should I get ‘snmp’ through ‘snpp’ or ‘snmpd’?
I have almost 18 years’ Linux experience, so I’ve done a fair amount of scripting.
Which distribution are you using? Ubuntu? snmpget, which is the program I use in my shell script, is part of the snmp-utils or snmputils package. snmpd is for when you want your desktop machine to be an snmp agent. You don’t need it, at least not for what I was doing with Splunk and the NXT.
Yup, I’m using Ubuntu. I’ve been using Linux full-time since summer of ’09. I’ve tried some other distros, but haven’t found anything I like as much as the pre-2011 Ubuntu releases. Mainly because I miss Compiz being stable.
Come to think of it, I don’t have WiFi for my NXT. Anything for Bluetooth? Or maybe some hints on Android datalogging/image transfer? I’m trying to get my robot-mounted phone back on the roof. All part of a 24-hour panoramic time-lapse picture and one WEIRD conversation: “Hi, my phone is on the roof.” I’ve got a sample here:
http://nicknackgus.deviantart.com/#/d54u5br
It took two or three days to edit that picture. The shell script I used to edit that could be optimized, but I have it running on an old desktop that I only keep for this type of project, so I truly don’t care.
Anything is possible. You can send BT messages from your NXT, and they can be received by your Linux box and turned into something Splunk can ingest and analyse.
According to the graphs, you didn’t get hot under the collar, or even let off steam.
Wow, I wish my coding experiences could be like yours!
Dear Xander,
Would it be possible to use Splunk for real-time processing of the data and drive the robot around using their data processing capabilities?
Or do you recommend something else? I am normally working on the robot-side of things and not so much on the infrastructure-side. And everything “is supposed to be real-time” such as VoltDB, Splunk, Storm, HStreaming, but on closer inspection this is most often not true. Or there are no functionalities to put my own (image) feature extraction methods, object recognition, classification, reservoir computing methods, or random other machine learning techniques “in the cloud”.
It would be great to have more people like you trying to cross that bridge! And it shouldn’t be in someone’s spare time! 🙂
I am not sure how real-time Splunk is. To do object recognition and do real-life stuff with the data (other than make graphs), you’d need some pretty fast processing. There are other data processing systems, like Hadoop, that may also work. I have never used Hadoop, though, but it’s very popular. 30 seconds might be real-time enough for an IT infrastructure but would make your robot look a little silly when you throw a ball at it and you hope for Splunk to gather and analyse the data, so it can figure out a trajectory and tell your robot where to catch it.
I know with Splunk you can have applications that can both insert data and query it but I have not really played with that. It would be interesting to see how quickly you can insert data and run a saved query on it afterwards. There are several SDKs for Splunk, one of which is Python (someone sharing your last name is the father of that particular language). If you have some clever interns, they could certainly explore that possibility.
When doing such things in the cloud, also keep in mind the round trip time between the robot’s sensors and the data analysis system. When you offload this kind of stuff and you have a LOT of sensor data, latency will become a serious issue. On the other hand, simply having sensors attached to a PC and relaying positioning info to the robot turns it into nothing more than an RC car, which is also kind of sad.
I have never played with systems like VoltDB, Storm or HStreaming, so it is hard for me to make a judgement about them.