Ask Slashdot: Performance Monitoring for Linux
muadib
wants to know about the following:
"Given the current discussions on tuning, I am trying to
find out if there are any performance monitoring
applications for Linux. I don't mean things like xload,
xosview, etc which provide only a small amount of data.
For anyone who's done benchmarking under NT, I mean
something like their built in perfmon utility that lets
you view and capture just about any statistic on your
system or on a remote system. Capturing is the specific
functionality I'm looking for b/c I'm working on a Linux
device driver, and it would be nice to have historical data
of CPU utilization, interrupts/s, etc. so that I can
compare complete system performance between code revisions."
Anyone working on an NT performance monitor clone?? (Score:1)
You know, mimic the interface, make NT sysadmins feel at home, and let them feel they can reuse some of the $$$$ they spent on training?
KDE does a great job of looking and feeling like Windows, and their ktop (or whatever) does a great job of imitating the NT task manager, but so far I have not seen anything like a Kperfmon.
I know it is traditional to hate all things NT/95, but sysadmin tools with an NT/95 interface would have a very large built-in audience and could possibly persuade managers that Linux is as easy to use as NT. After all, the fact that NT looked like 95 has to have something to do with its acceptance as a server.
There really is no "standard" GUI sysadmin interface for Linux, so why not take advantage of all of Bill Gates's legal groundwork making it legal for you to rip off the look and feel of his sysadmin tools?
If you have any info on NT-alike sysadmin tools (such as a samba interface, event log, etc) let me know at:
motjuste@briefcase.com
Serious Performance Analysis (Score:1)
offer a product called 'Viewpoint' that does what
you guys are really hoping for: there's a UNIX process that reads something like 300 kernel variables at whatever rate you choose (usually every 30 seconds) and then sends that data to a central monitoring program. The central program can talk to hundreds of UNIX, VMS, Unisys, NT, etc. machines at once and plots and correlates
the data provided. The features it has are pretty mind-boggling; look on the web page to get a feel for it. If you look hard enough, I think they even made a Java and a Web version of the frontend. The tools are for enterprise
clients who want to know about the details of
their performance: if the cache hit rate isn't
as good as it should be, if the network is too saturated for best performance, etc. I believe you
could even compare Linux and IRIX's relative
merits by looking at the two's metrics side by side under similar stresses. When I left, they
were adding some modules like an Oracle module
(to correlate kernel metrics with Oracle's SQL performance) and I personally suggested creating
an Apache module (which may or may not exist -
they have an API to program to, so it could happen
if someone cared enough to make it happen)
I should stop here and say that I am pretty sure
this product retails for tens of thousands of
dollars.
My understanding is that there was a port to Linux
done inhouse but I doubt they have rolled it out.
My other understanding is that the company has pretty much gone to shit since they were bought out right after I left; so who knows if they will be clued enough to want to work on Linux. If any of you are very excited about products like this
(i.e., products for managing tens/hundreds of millions of dollars' worth of computers) coming to Linux, I'd suggest going on the web site, finding
a feedback form, and speaking your mind.
NT Perfmon workalike -- for Intel and Alpha (Score:1)
This tool is IMHO the best of the pack out there at the time for really understanding the performance of your programs with respect to caching and processor quirks. Check it out.
Re:need of a whizzy app! (Score:1)
http://www.blakeley.com/resources/vtad [blakeley.com]
Re:Something less braindead the perfmon, I hope. (Score:1)
These are my experiences based on doing capacity planning agents for NT:
If you want to roll your own use the performance registry.
Now the performance registry is an interesting beast. Check out the perfmon code to see how it's done, but the lowdown is that this API is unsafe. Be careful with multi-threaded access to the Perf Registry (actually - don't).
The buffer size assumes UNICODE sizes, so it halves the buffer on every call unless you refresh it (see the API); there is an MS bug report on this.
Calls may return crap even though the return code is OK - use Unicode to check the header.
Use the counter sizes returned by the API, not those specified in the header (they are wrong in some cases, though not most).
The performance registry relies on other DLLs that may fail so it in turn may fail.
The spec is that you return info if you are asked but SQL Server 6.x returns info even when NOT asked and this is bad for performance if you only want some counters. (this is a known bug)
But beware others may do this too.
Actually this is a great idea and I wish it was done properly (i.e., robustly).
It would kill UNIX in this regard if it worked as spec'd.
Linux is even worse than, say, SVR4 UNIX in instrumentation; I/O instrumentation in particular is zero!
BUT its
I've written agent/server type stuff to get info and it ain't too hard, but you've got to have the info to begin with.
give moodss a try... (Score:2)
A thorough and intuitive drag'n'drop scheme is used for most viewer editing tasks: creation, modification, type mutation, destruction,
The module code is the link between the moodss core and the data to be displayed. All the specific code is kept in the module package. Since module data access is entirely customizable through C code, Tcl, HTTP,
Apart from a sample module with random data, ps, cpustats, memstats, diskstats, mounts, route, arp modules for Linux, apache and apachex modules are included (running "wish moodss ps cpustats memstats" mimics the "top" application with a graphic edge).
All the above in rpm, tgz,
Jean-Luc Fontaine
see screenshots, html documentation,
Regards.
Re:Anyone working on an NT performance monitor clon (Score:1)
Re:M$ Hate Lackey? (Score:1)
In the works (Score:2)
I'm currently working on a tool that resembles what you're looking for. It consists of 3 parts and a kernel patch. The patch adds a feature to the kernel that lets a "trace" driver register itself and lets key parts of the kernel call the driver to note that a given event has occurred. In turn, the trace module takes care of the events and puts them in a buffer. When a certain quantity of information is in the driver's buffer, it sends a signal to a trace daemon. The daemon then reads from the driver and appends the trace information to a trace file. The last part is the trace data decoder, which takes the binary data and transforms it into a human-readable format. Because decoding happens after the fact, the impact on the traced system is minimized. As of this time, all the above-mentioned parts are complete. The only thing that remains to be done is to build a GUI for the decoder (right now it works perfectly on the command line). This is what I'm working on right now.
This system lets the observer know exactly what happens at every moment in the system. As for remotely observing a host, this is not a problem; it is actually planned. The trace daemon will offer its services on an IP port that other hosts can contact. If you're interested ... send me an e-mail. I'll have a web page for it as soon as the GUI is complete.
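To illustrate the decoder step described above, here is a minimal sketch in Python. The post does not give the trace file format, so the record layout (1-byte event id, 4-byte seconds, 4-byte microseconds, little-endian) and the event names are purely hypothetical assumptions, not the poster's actual design.

```python
import struct

# HYPOTHETICAL record layout, assumed for illustration only:
# little-endian, 1-byte event id, 4-byte seconds, 4-byte microseconds.
RECORD = struct.Struct("<BII")

# HYPOTHETICAL event ids; the real tool's event set is not known.
EVENT_NAMES = {
    1: "syscall_entry",
    2: "syscall_exit",
    3: "irq_entry",
    4: "irq_exit",
}


def decode_trace(data):
    """Turn a buffer of fixed-size binary records into readable lines."""
    lines = []
    # Ignore a trailing partial record, e.g. if the file was cut short.
    usable = len(data) - len(data) % RECORD.size
    for offset in range(0, usable, RECORD.size):
        event_id, sec, usec = RECORD.unpack_from(data, offset)
        name = EVENT_NAMES.get(event_id, "unknown(%d)" % event_id)
        lines.append("%d.%06d %s" % (sec, usec, name))
    return lines
```

Keeping the on-disk format binary and fixed-size means the daemon only has to append raw bytes; all string formatting is deferred to the decoder, which can run later or on another machine.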
Some of the basics (Score:2)
Forgive me if any of these are obvious -- I'm
not trying to be sarcastic, it's just that some
sys admins don't know this yet:
"top" shows you CPU load, memory usage, and
usage per process -- there are many options in
"top"; check out the man page for it.
"pstree" shows a tree graph of all processes and
who spawned what.
"ps auwx" (or "ps -ef" in Solaris) shows all
current processes.
"netstat" and "ifconfig -a" show network info
such as errors, dropped packets, etc.
Big Brother is a decent package for monitoring
several servers at once. It generates a web page
of colored lights (GIFs) indicating system load, web
daemon status, email daemon status, ftp daemon
status, etc.
I remember reading something in the LJ (Score:1)
I'm guessing the author of the question (only two months wait? Wow, that new hardware really has sped things up.) has already perused the article on SCO releasing sar as open source. This sounds like what he needs, albeit in a command line only form (correct me if I'm wrong.)
Now what I want to see is a graphical heartbeat that looks a bit like cthugha (an oscilloscope on acid, for sound) and uses whatever system stats the operator deems appropriate as parameters for its graphics-generating equations. Now, if only I had studied math a bit harder, I might write it myself... I have the Father Guido Sarducci "5 Minute University" syndrome, "We'll teach you in five minutes everything you'll remember five years after you graduate."
linux perfmon (Score:1)
If you build it right, it will work, and any anomalies usually succumb to thoughtful analysis.
Instrumenting a network at the level you guys are talking about can kill it, severely.
Get ye a good sniffer, and learn to read its utterances.
Re:Cron Proc'ing (Score:1)
is better suited to what we need. Sooner or later we won't have to jump to Excel to get quality graphs drawn.
Uuuh, GNUPlot?
I've not looked at Guppi, but my instinct is that if people want flashier graphs than GNUPlot can produce, the way to go is to extend GNUPlot.
Cron Proc'ing (Score:1)
On a slightly related note, Linux needs some new graphics libraries--GD is good, but it's not Excel. I have the distinct feeling GIMP is better suited to what we need. Sooner or later we won't have to jump to Excel to get quality graphs drawn.
Once you pull the pin, Mr. Grenade is no longer your friend.
Re:Heh, SAR has just been opensourced. (Score:1)
Guppi, perhaps? (Score:1)
Of course, if you do go with the GIMP, you could get some seriously sharp looking graphs for your extra effort.
Re:Some of the basics (Score:1)
Re:SNMP (Score:1)
What is this Crickett/RDD? I can't find mention of it with any web searches. Is it free? A web pointer please!
Re:Monitoring (Score:1)
:g/Sun/s//SCO/g
Re:linux performance mailing list already going (Score:1)
I'm not sure that these lists will overlap too much; the list I started is focused on system-administrator level tools to monitor both the health and load of Linux systems. Its scope is larger than just performance tuning -- it's not only about tweaking systems to run benchmarks better, but about making sure you get notified when your systems go down.
It's not about duplication of effort -- it's about a different perspective and different goals.
Mailing list available for this topic (Score:2)
To join this list, send a message [mailto] consisting of the single word "subscribe" (in the message body, not the subject) to:
The first objective of this list is to gather enough information to build a performance and reliability HOWTO. Many of the attendees of the BOF are on this list. This list is still in its infancy, but I'm sure that the Slashdot effect will change that!
Serious Performance Monitoring Tools (Score:1)
The best tool that I have ever seen is Performance Co-Pilot on SGIs. They recently demo'd this product at a Linux expo running on an SGI Visual Workstation running Linux and I believe they are heading towards open sourcing it (along with a lot of other SGI stuff).
See http://www.sgi.com/software/co-pilot
I have recently written my own tool for DEC Alphas, but it is primitive compared to SGI's tool. Monitoring multiple hosts simultaneously in real-time on the same chart/3D visualisation is non-trivial.
My impression is that there is a good opportunity to add some good instrumentation to Linux using a consistent interface; someone has just got to do it. The other UNIXes suffer from insufficient instrumentation and a lack of public interfaces to get at the information.
re "man proc" (Score:1)
Re:Some of the basics (Score:1)
-- Donovan
SAR on linux (Score:1)
Monitoring (Score:1)
Lando
linux performance mailing list already going (Score:2)
To subscribe to this list, send an email to the following address
with "subscribe" in the body of the message:
linux-perf-request@www-klinik.uni-mainz.de
My colleague, Rich Pettit, has done a lot of work to add perf stats to the
Shameless Plug!!! (Score:2)
I might as well put in a (shameless) plug for our product, RAPS. Check out http://www.foglight.com for more details. Again, it's aimed at the enterprise level so it's not cheap but it has OS level monitoring as well as Sybase and Oracle agents, Netscape and Apache WWW server monitoring.
How about Linux and NT monitoring? (Score:1)
Re:Linux Performance monitoring (Score:1)
Take a look at procinfo.
Re:Something less braindead the perfmon, I hope. (Score:1)
perfmon is actually an excellent and comprehensive utility that has some very nice features and is actually useful in the real world. I use it for database tuning, for example.
I recommend the book (now some years old) on using perfmon. In retrospect it looks like the last hurrah of the VMS crowd before the Win95 mentality took over NT development.
And the fact that perfmon gets no respect is Yet Another Reason Why
Hmmmm.... (Score:1)
*TOP* maybe?
pipe it to a file or something.
Re:M$ Hate Lackey? (Score:1)
Something less braindead the perfmon, I hope. (Score:1)
Perfmon only really performs spot measurements of things like CPU utilization. It can't tell you the true average CPU utilization of a process over a 10 minute interval. It can just tell you the average of instantaneous CPU utilization at 0 minutes and 10 minutes. This bugs the hell out of me, especially since NT keeps a running count of execution time for each process.
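The difference between the two kinds of average can be shown with a little arithmetic. The sketch below is plain Python and makes no claim about how perfmon or NT expose these numbers; the function names and sample figures are made up for illustration.

```python
def average_utilization(cpu_seconds_start, cpu_seconds_end, wall_seconds):
    """True average CPU use (percent) from a running count of execution time.

    The two cpu_seconds values are cumulative CPU time read at the start
    and at the end of the interval; wall_seconds is the interval length.
    """
    if wall_seconds <= 0:
        raise ValueError("interval must be positive")
    return 100.0 * (cpu_seconds_end - cpu_seconds_start) / wall_seconds


def spot_average(samples):
    """Average of instantaneous utilization readings (percent)."""
    return sum(samples) / len(samples)


# A process that had used 12 s of CPU at minute 0 and 192 s at minute 10
# really averaged 30% over the interval ...
true_avg = average_utilization(12.0, 192.0, 600.0)

# ... even if the two spot readings happened to catch it idle and then busy.
spot_avg = spot_average([0.0, 100.0])
```

With only two spot readings the result depends entirely on what the process was doing at those two instants, while the counter difference accounts for everything that happened in between.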
Re:Writing My Own (Score:2)
Why don't you wait for the SAR sources to be
released and adopt it. SAR supposedly gives
a lot of low level statistics. If I remember
correctly, sometime back [ok, longtime back!]
when I was writing an SNMP agent for Acer
Server Manager, we used to make use of SAR's
libraries on UnixWare to get some specific
statistics for instrumentation.
-Sas
Re:Linux Performance monitoring (Score:1)
Re:Hmmmm.... (Score:1)
*SARCASM* maybe?
Really, 'top' only scrapes the surface of what NT's perfmon does ... 'top' is a lot closer to NT's task manager.
I love linux ... but ... NT has some very useful tools.
Monitoring Performance (Score:1)
Re:Writing My Own (Score:1)
As it stands RAPS runs just on Solaris (maybe NT?), and is very cool. I was skeptical at first, having seen a bunch of other crappy monitoring applications that I could write better myself, but this one really does a good job of presenting both low- and high-level stats in very digestible formats.
SE Performance Toolkit (Score:1)
vtad may fit the bill (Score:1)
Heh, SAR has just been opensourced. (Score:2)
It makes me laugh to see this article [slashdot.org] appearing on the same page as this Ask /.
Although, I presume no-one's ported it to Linux yet.
re: "monitor" available for linux? (Score:1)
Is this not true?
Re:Some of the basics (Score:1)
if you know perl, just use its extensive regular expression matching with common monitoring tools....couple this all together and have it output to files, and BINGO!
:) hope this helps
Re: "monitor" available for linux? (Score:1)
my only uncertainty was whether or not there was a linux port....not whether i knew what monitor was....
M$ Hate Lackey? (Score:1)
All I have to say is that they have their merits. Would you go to a (example) christian conservative "Pro-Life" demonstration and just take everything they said as gospel??
Thanks for thinking,
Pad
need of a whizzy app! (Score:1)
An application that pumps snmp and
Then you could use SQL to search for certain criteria/data at a given period in time on a given machine.
Does anything do this for Linux?
I wish I had the time to do this; I suppose it would be worthwhile doing!?!
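As a rough sketch of the idea above (samples stored in a database, then searched with SQL by machine and time period), the following uses Python's built-in sqlite3 module. The table layout, column names, and helper functions are assumptions made for illustration; collecting the values (via SNMP or anything else) is left out entirely.

```python
import sqlite3


def open_store():
    """Create an in-memory database with one (hypothetical) samples table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE samples (host TEXT, metric TEXT, ts INTEGER, value REAL)"
    )
    return conn


def record(conn, host, metric, ts, value):
    """Store one measurement taken on `host` at Unix time `ts`."""
    conn.execute(
        "INSERT INTO samples VALUES (?, ?, ?, ?)", (host, metric, ts, value)
    )


def query_window(conn, host, metric, start_ts, end_ts):
    """Return (ts, value) rows for one host and metric within a time window."""
    cur = conn.execute(
        "SELECT ts, value FROM samples "
        "WHERE host = ? AND metric = ? AND ts BETWEEN ? AND ? "
        "ORDER BY ts",
        (host, metric, start_ts, end_ts),
    )
    return cur.fetchall()
```

For example, after `record(conn, "web1", "load", 100, 0.5)`, calling `query_window(conn, "web1", "load", 50, 150)` returns that one row. A real collector would use a file-backed database rather than ":memory:".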
There's lots of tools... (Score:1)
I use xperfmon++ for real-time monitoring - it looks a lot like the Windows perfmon tool, but does not have the capability to log to a file (this would probably run to only a few lines of code if you wanted to add it). However, almost all the stats can be got from vmstat(8) and netstat(8) -i. Just write a few filters to get rid of the headers and stuff from the output, then paste(1) the results into one file and BINGO!
Rodd
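The filter-and-paste approach described in this post can be sketched in a few lines of Python. The helper names are invented here, and the number of header lines to drop is passed in by the caller because it depends on the tool and its version; the sample text in the usage note is simplified and is not real vmstat or netstat output.

```python
def strip_headers(text, header_lines):
    """Drop the first `header_lines` lines of a tool's captured output."""
    return text.splitlines()[header_lines:]


def paste(*columns):
    """Join the i-th line of each input with a tab, roughly like paste(1).

    Unlike paste(1), this stops at the shortest input instead of padding.
    """
    return ["\t".join(row) for row in zip(*columns)]
```

Given two captured outputs with two header lines each, `paste(strip_headers(vm_text, 2), strip_headers(net_text, 2))` yields one combined line per sample, ready to be appended to a log file.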
Perfmon on Linux (Score:2)
monitoring interface. A majority of the Linux
systems out there run on Intel CPUs. It is
not hard to implement an interface for
programming the CPU counters to measure
a variety of events. However, work needs
to be done to use the data so obtained
in a meaningful manner. The CPU counters can
be coupled with various system tools that give
us system statistics. The bottom line:
1. There are a few drivers and libraries
for Linux that allow you to make use
of CPU counters (at least for the x86).
These don't seem to be much used.
2. Like Intel's VTune tool (which pertains
mainly to code optimization), a tool
could be written for Linux that gives
extensive performance statistics, and
helps in optimizing code. The
infrastructure for such a tool is already
there, IMHO.
3. For interested people, I could find
(from some dust-laden hard disks)
pointers to related code and documents.
I worked on this subject long back.
Linux Performance monitoring (Score:1)
an OpenWin GUI which came with everything you mentioned and more: disk usage, swapping, cpu, interrupts, etc. I had the same tools on my SPARC station under both OpenWin and CDE so I am assuming it comes with the GUI and not the OS.
Does this help?
Re:Some of the basics (Score:1)
Plus, let's say that my system goes down during the test run. I want to be able to look at it the next day and determine at what point that happened and what the state of the machine was right around the time it happened, b/c that might help me track down what happened.
Deepak Saxena
Project Director, Linux Demo Day '99
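One way to get that kind of history, sketched here in Python, is to snapshot /proc/stat at intervals and log the rates between snapshots with a timestamp. The functions take the file's text as a string so they can be tried anywhere; the function names are made up, and the sketch assumes the "cpu" line starts with user, nice, system and idle counts (newer kernels append more fields, which are ignored here).

```python
def parse_proc_stat(text):
    """Extract cumulative CPU ticks and the total interrupt count."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Assumed field order: user, nice, system, idle.
            stats["busy"] = values[0] + values[1] + values[2]
            stats["total"] = sum(values[:4])
        elif fields[0] == "intr":
            # The first number on the intr line is the total since boot.
            stats["intr"] = int(fields[1])
    return stats


def rates(before, after, seconds):
    """CPU utilization and interrupts/s between two parsed snapshots."""
    d_total = after["total"] - before["total"]
    d_busy = after["busy"] - before["busy"]
    cpu_percent = 100.0 * d_busy / d_total if d_total else 0.0
    return {
        "cpu_percent": cpu_percent,
        "interrupts_per_sec": (after["intr"] - before["intr"]) / seconds,
    }
```

Appending each result to a file together with the current time gives a record that can be read back the next day, up to the last sample written before the machine went down.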
Writing My Own (Score:2)
the machine you're sitting at. Email me if you're interested in more info or helping.
- Deepak
Deepak Saxena
Project Director, Linux Demo Day '99
Heisenberg doesn't apply here (Score:2)
Intuitively it would seem that measuring threads in great detail would distort the measurement. However, performance registers on board CPUs (I'm thinking Intel at the moment) allow one to monitor certain aspects of threads with almost no overhead whatsoever. It's perfectly feasible and desirable to monitor code at a fine-grained level.
There are many performance anomalies that can't necessarily be identified at the design level (for example pipeline flushing and cache problems). These detailed measurements can tell you much more about your program's behavior than the standard profiler can.
Re:Tools (Score:1)
I've made the patches to the kernel for my perfstat modifications available as well as the class library for reading the perfstat file. I'm in the process of getting an ftp server set up where I can put these changes. I will also make available a white paper I wrote on my efforts.
I will get this done as soon as possible, but I've got a lot of other work to do that monopolizes my time.
Rich
---- Richard Pettit
---- Performance Architect
---- Foglight Software, Inc.
Linux Performance Monitoring/IOStat (Score:1)
ftp://ftp.uk.linux.org/pub/linux/sct/fs/profili
and for nfs:
ftp://ftp.sce.carleton.ca/pub/rads/iostat-2.0.3
Please see the linux-perf mailing list for more information. Send subscribe requests to
linux-perf-request@www-klinik.uni-mainz.de
SNMP (Score:1)
UCD SNMP MRTG and Cricket URLs (Score:2)
MRTG [ee-staff.ethz.ch]
UCD SNMP for Linux [ucdavis.edu]
MRTG is kinda a bear to work with for monitoring stuff other than a router, but it can be done. For an example you can check out my suso.org stats page [suso.org]. Look on the left side.