Saturday, February 04, 2006

Lack of monitoring tools for Linux

I've been looking for a way to monitor a few dozen Linux servers lately, and there just doesn't seem to be a nice integrated tool to do it. In particular, I am looking for something that:
  • Pulls various SNMP data from a list of Linux servers
  • Stores said data for a user-specifiable amount of time
  • Generates useful graphs of said data
  • Sends emails out when said data exceeds certain thresholds
  • Provides a decent web interface for controlling everything
  • Runs under Linux
Maybe I'm just blind, but there doesn't seem to be anything that can do all of the above. I can accomplish some of it using mon, for example, but then I don't have a decent web interface, data retrieval/storage, or graphing. I can use Cacti, but then I don't have good alerting or data storage (RRD files are "lossy"). I would write my own, but then I lose the nice user interface.
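To make the alerting requirement above concrete, here is a minimal sketch of the threshold check, assuming the SNMP values have already been polled into a plain dictionary. The host names, metric names, and limits are made up for illustration, and the sketch prints alert lines rather than sending email.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Threshold:
    metric: str      # hypothetical metric name, e.g. "load1"
    maximum: float   # alert when the sampled value exceeds this


def find_breaches(samples: Dict[str, Dict[str, float]],
                  thresholds: List[Threshold]) -> List[str]:
    """Return one alert line per (host, metric) whose value exceeds its threshold."""
    alerts = []
    for host in sorted(samples):
        for t in thresholds:
            value = samples[host].get(t.metric)
            # Skip metrics that were not collected for this host.
            if value is not None and value > t.maximum:
                alerts.append(f"{host}: {t.metric}={value} exceeds {t.maximum}")
    return alerts


# Made-up sample data standing in for values already polled over SNMP.
samples = {
    "web01": {"load1": 0.7, "disk_pct": 91.0},
    "web02": {"load1": 5.2, "disk_pct": 40.0},
}
thresholds = [Threshold("load1", 4.0), Threshold("disk_pct", 90.0)]

for line in find_breaches(samples, thresholds):
    print(line)
```

A real tool would also need the polling, storage, and mail delivery around this, which is exactly the integration work the post is complaining about.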

Undoubtedly, someone will eventually come out with the complete package that satisfies my every desire. Once that happens, I'll just be one step away from having everything I ever wanted from Linux, with better cluster administration tools being my last hurdle.

20 Comments:

At 6:12 AM, Anonymous said...

Nagios on a central server will take care of most of this. Webmin on each Linux server will allow easy admin for most items. Nagios with Nagiostat/rrdtool will also do all the monitoring/alerting/metrics for routers, firewalls, switches, AS400, Windows, Unix, etc. in one spot.

 
At 6:50 AM, Anonymous said...

I think I have the answer to your prayers. :) I just ran into this neat tool on slashdot.org called Splunk. It does real-time data collection on many different sources and compiles them into a single "Google-like" interface for users to search through. It's really rather nice:

http://www.splunk.com/

 
At 6:56 AM, Anonymous said...

Have a look at BigBrother: http://www.bb4.org/
Live site at our university: https://bb.phys.ethz.ch/bb/

I'd say RRDtool is exactly what you describe as "Stores data for a user-specifiable amount of time", since you can configure it to do just that (e.g. store full data for 3 months, then store the 1h average for 2 more years). It was also invented at our university =) http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/
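As a rough sketch of the retention arithmetic in that example, the snippet below works out the two round-robin archive (RRA) definitions in the `RRA:CF:xff:steps:rows` form that `rrdtool create` accepts. The 5-minute step, the 0.5 xff value, and the day counts (90 days for 3 months, 730 days for 2 years) are assumptions, not something the comment specifies. Since every archive covers time counted back from the present, the hourly archive is sized for 90 + 730 days so that it reaches 2 years beyond the full-resolution window.

```python
STEP_SECONDS = 300  # assumed polling interval (5 minutes); not given in the comment


def rra(cf: str, resolution_seconds: int, retention_days: int,
        xff: float = 0.5) -> str:
    """Build an RRA definition string of the form RRA:CF:xff:steps:rows."""
    if resolution_seconds % STEP_SECONDS != 0:
        raise ValueError("resolution must be a multiple of the step")
    # Number of primary data points consolidated into one stored row.
    steps = resolution_seconds // STEP_SECONDS
    # Number of rows needed to span the retention period at that resolution.
    rows = retention_days * 86400 // resolution_seconds
    return f"RRA:{cf}:{xff}:{steps}:{rows}"


# Full-resolution data for roughly 3 months (assumed to be 90 days).
full = rra("AVERAGE", STEP_SECONDS, 90)
# 1-hour averages reaching 2 years (assumed 730 days) past the full-resolution window.
hourly = rra("AVERAGE", 3600, 90 + 730)

print(full)
print(hourly)
```

The row counts are fixed when the file is created, which is why an RRD file never grows; it is also why older data survives only as averages.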

 
At 7:09 AM, Anonymous said...

Have you checked out Argus? It's open source, and has handled everything I've thrown at it thus far. Excellent software. Extremely easy to set up (unlike BB, Nagios, and the like)...

http://argus.tcp4me.com

 
At 7:58 AM, Anonymous said...

We use OpenNMS. It takes a lot of configuring by hand, but it has worked well so far.

 
At 8:38 AM, Anonymous said...

We use a combination of Nagios for monitoring and alerts, and Cacti for pretty graphs. Not an ideal solution, but it seems to be the best I've found so far.

 
At 8:44 AM, Anonymous said...

Just For Fun Network Monitoring System... www.jffnms.org is an awesome, open source SNMP program that lets you set up satellite locations across the internet, and they all pool into your one MySQL database, which is graphed with more information than you can shake a stick at. You can even perform some functions (restart, etc.) on machines you're monitoring.

 
At 9:03 AM, Blogger Adrian Cockcroft said...

There is also orca (http://www.orcaware.com), which is good for collecting more detailed information on systems. It's relatively simple, very extensible, and multiplatform. The Linux data collector is called procallator.

 
At 9:26 AM, Anonymous said...

Hobbit is a (compatible) GPL rewrite of BigBrother, with the trending features of LARRD built in (i.e. disk, memory, and load graphing work out of the box). Since all configuration is on the server side, setup is also a lot less effort (especially on distros that ship hobbit ...). Adding additional graphs is much easier (edit hobbitgraph.cfg rather than hack up LARRD), and additional data can be collected to rrd files quite easily using the ncv collector.

It does use RRD for most data storage, but that's what most people need for trend analysis.

 
At 1:45 PM, Anonymous said...

Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices.
http://www.cacti.net/

 
At 2:41 PM, Anonymous said...

Take a look at netmrg (www.netmrg.net). I use this with some scripts run out of snmp. Seems to work well.

 
At 10:44 PM, Anonymous said...

Better late than never: zabbix.com is a nice tool that meets your requirements.

 
At 5:47 AM, Blogger p0wer said...

How about Munin? It is targeted towards a single machine, but it can send alerts to Nagios.

 
At 11:13 PM, Anonymous said...

Hyperic (http://www.hyperic.com/) is my weapon of choice.

 
At 10:55 PM, Blogger Answerguy said...

Bash, Perl, Python, PHP and Ruby, to name a few: they "do" monitor, and they will do whatever you ask them to do. Alert, store 'said' data, poll SNMP data, give you a nice UI.

What they won't or can't do is make coffee for you!

 
At 10:17 AM, Anonymous said...

What about Groundwork Open Source? It doesn't add much to Nagios, but it's very easy to set up: http://www.groundworkopensource.com
Look for the free version.

 
At 4:16 PM, Anonymous said...

Monitoring Alerts:
Nagios is horrible to manage, but it works so we use it. I'm not happy about it. I'd replace it in a second.

Graphs and trends:
Cacti is crap. It makes things way more complicated than they need to be. We got rid of it. I switched to Munin. Munin is not perfect, but reasonable. I like the plugins. It's SUPER easy to add plugins and graphs to Munin. Configuration and installation were so-so. All these Perl beasts suck when it comes to installation and configuration.

 
At 2:19 PM, Anonymous said...

Don't forget the incredibly lightweight "mon".

http://www.kernel.org/pub/software/admin/mon/
or
http://mon.wiki.kernel.org/

 
At 9:25 PM, Anonymous said...

aptitude install munin

 
At 3:46 AM, Blogger Unknown said...

For those using Nagios, here is a solution that extends Nagios notifications with email, SMS, and voice messages.

If interested, check the website:
http://www.alarmtilt.com/?action=solutions&browse=nagios

 
