dim_STAT User's Guide. by Dimitri |
Add-On Statistics |
One of the most powerful features of dim_STAT is the ability to integrate your own statistic programs with the tool. Once added, they will be considered by dim_STAT as being the same as the standard set of STAT(s) and give you the same kind of service: Online Monitoring, Up-Loading, Analyzing, Reporting, etc. However, the choice of external stat programs is so wide that it's quite impossible to design a wrapper for each and every format. Therefore, I've decided to limit the input recognizer to just 2 formats (which covers maybe 95% of needs) and leave it to you to write, if necessary, your own wrapper and modify the output to one of the supported formats. Formats supported by dim_STAT:
- SINGLE-Line: with one output line per measurement (ex: vmstat)
- MULTI-Line: with several output lines per measurement (ex: iostat)

To be correctly interpreted, your stat program should produce a stable output. This means the same format for data lines, at least one data line per measurement in the MULTI case, a constant time interval, etc. Lines not containing data have to be declared, so that they can be ignored by dim_STAT. NOTE: lines shorter than 4 characters are considered as "spam" and will be ignored! Let's look at some examples...
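To make the two accepted shapes concrete, here is a minimal sketch of what a wrapper's stdout could look like in each format (all names and numbers below are invented for illustration, not from any real stat program):

```shell
# SINGLE-Line: one data line per measurement (two numeric columns here)
single_line() {
  printf '  %d %d\n' 42 100
}

# MULTI-Line: a header line plus one line per monitored object
multi_line() {
  echo 'disk  reads  writes'
  printf '%-5s %5d %6d\n' sd0 10 5
  printf '%-5s %5d %6d\n' sd1 3 1
}

single_line
multi_line
```

Whatever the source of your data, as long as your wrapper keeps printing lines of one of these two shapes at a constant interval, dim_STAT can parse it.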
Example of SINGLE-Line command integration |
Let's assume we want to monitor the read/write cache hit rate on the system. This information can be retrieved using "sar":

$ sar -b 1 1000000000000000
SunOS sting 5.9 Generic_112233-05 sun4u    07/09/2004

18:10:13 bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s
18:10:14       0       1     100       0       0     100       0       0
18:10:15       0      14     100       0       0     100       0       0
18:10:16       0       7     100       0       0     100       0       0
18:10:17       0       0     100       0       0     100       0       0
18:10:18       0       0     100       0       0     100       0       0
18:10:19       0     135     100       0       0     100       0       0
18:10:20       0       0     100       0       0     100       0       0
18:10:21       0      69     100       0       2     100       0       0
18:10:22       0      86     100       0       2     100       0       0
18:10:23       0       0     100       0       0     100       0       0
18:10:24       0       0     100       0       0     100       0       0
18:10:25       0       0     100       0       0     100       0       0
...

What we are interested in are the 4th and 7th columns of the sar output, while ignoring any lines containing "*SunOS*" or "*read*". Following the "Integrate New Add-On-STAT" link:
Step 1: FIRST INFO |
Let's give the new Add-On the name CacheHIT. We need only 2 columns from the output line (4th and 7th value). This is a "Single-Line" output... Click on "New"...
Step 2: INTEGRATION |
During this step we need to explain what we want to run and which information we'll need:

Description: CacheHIT via SAR
Shell Command: sar -b %i 1000000000000000
Ignore Lines: any lines containing "*SunOS*" or "*read*"
Data Descriptions:
- During execution of sar %i will be replaced with the time interval in seconds.
- The command name doesn't matter here because it is only used as an alias for the STAT-service. Have a look at the "access" file section: it is possible to name the shell command "toto" and map it in the access file to /usr/bin/sar as an alias.
Create!!
- ColumnName - leave it as it is, if you don't need to access the database directly. Note: there are 2 reserved columns for Collect-ID and measurement No.
- Data Type - if you're not sure, set it to "Float"; otherwise use "Int"
- Column# on input - in our case we need columns 4 and 7
- Short Name - single word descriptions, here %rcache and %wcache
- Full Name - description to be used where detailed information is needed
- Use in Multi-Host - if you choose "Yes" the corresponding value will be automatically enabled in Multi-Host mode for analyzing several hosts at once.
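Before creating the Add-On, it is worth dry-running the column extraction that dim_STAT will perform. The awk one-liner below is only an illustration of the "ignore lines + pick columns 4 and 7" logic, not dim_STAT's actual parser; the sample is a shortened copy of the sar transcript above:

```shell
# a captured sar sample: a SunOS banner, a header (contains "read"),
# and one data line
sample='SunOS sting 5.9 Generic_112233-05 sun4u
18:10:13 bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s
18:10:14 0 1 100 0 0 100 0 0'

# drop the declared "ignore" lines, keep columns 4 (%rcache) and 7 (%wcache)
printf '%s\n' "$sample" | awk '!/SunOS/ && !/read/ { print $4, $7 }'
```

Note that the header line is caught by the "*read*" pattern (it contains "bread/s"), which is exactly why those two ignore patterns are declared.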
Created! |
What's Next? Will it work now? Yes! IF YOU DID NOT FORGET to give your STAT-service access to this new command! This is a very common error. If you want to collect "CacheHIT" data from server "S", be sure that the STAT-service on "S" is given execution permission for the "sar" command. Add the following lines to your /etc/STATsrv/access file:

# CacheHIT Add-On command
sar /usr/sbin/sar
#

And now it'll work! :-)) NOTE: for security reasons and for a cleaner "stat to command" relationship, it is preferable to create a specific script 'CacheHIT.sh' for our new add-on, and then use it instead of direct access to the 'sar' command. The Add-On shell command then needs to be changed to "CacheHIT %i". Example:

$ cat /etc/STATsrv/bin/CacheHIT.sh
#!/bin/ksh
exec /usr/sbin/sar -b $1 1000000000000000

$ CacheHIT.sh 5
...

$ tail -3 /etc/STATsrv/access
# CacheHIT Add-On command
CacheHIT /etc/STATsrv/bin/CacheHIT.sh
#
Anti-Spam Filter |
IMPORTANT: There is an anti-spam filter feature that is always active during data collecting. It rejects any input line shorter than 4 characters. If your newly made stat command prints only one small column of numbers, you need to add leading spaces to ensure the data is accepted by dim_STAT.
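A minimal sketch of such padding, assuming a wrapper that prints one small number per line (the numbers are invented): awk prepends three spaces, so every line reaches the 4-character minimum.

```shell
# pad a one-column output with leading blanks so every line is >= 4 chars
printf '%s\n' 10 50 40 | awk '{ printf( "   %s\n", $0 ) }'
```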
MULTI-Line Add-On command integration |
Multi-Line integration is quite similar to Single-Line, except for a few additional things:
- Line Separator pattern: this is by default "new-line", but in some cases it can be a header (like iostat)
- Attribute Column: very important! As you have several lines per measurement, you need to distinguish them by something (like the "diskname" column in iostat).
- Use In Multi-Host: more than a simple Yes/No - you should choose SUM and/or AVG for the collected values.
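The three points above can be illustrated with a tiny fake generator (all names and values invented): the repeated header line plays the role of the Line Separator pattern, and the first column ("disk") plays the role of the Attribute Column distinguishing the lines of one measurement:

```shell
# two fake measurements in MULTI-Line shape: each measurement starts
# with the same header line and contains one line per disk
for i in 1 2; do
  echo 'disk  reads  writes'
  echo "sd0   1$i     5"
  echo "sd1   3      1"
done
```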
REAL LIFE EXAMPLE... |
To get an even better feel for the Add-On integration process in dim_STAT, let me tell you a real-life story that happened this year with one of our customers.. So well, once they had understood with dim_STAT what was going on with the system and storage, the customer decided to finally bring more light on what was going wrong (or well) in their application too.. Initially they wrote a lot of debug messages into their log files, but nothing really useful for understanding what was going wrong.. Also, the more data they wrote to the log files, the slower the application worked :-) normal, no? So, as a first step they simplified the logging and got a single file: /var/tmp/appstats.log. Every N seconds a new line was appended to this file, containing just 3 numbers; the last one (the one we're interested in) is the avg TPS during the last time period of M seconds (bigger than N):

# tail -5 /var/tmp/appstats.log
10:17 5 20
10:20 7 30
10:23 2 50
10:26 8 30
10:30 1 10
#

The customer then created a simple monitoring script, AppStats.sh:

# AppStats.sh 5
10
50
40
20
30
^C
#

Within a few minutes the customer had integrated this new stat command as a dim_STAT Add-On, but... 15 minutes later it still had not collected any data... WHY?...
Common Error #1 |
The first problem: the output line is very short, and lines shorter than 4 characters are ignored by the anti-spam filter (as mentioned before)! All we need is to add 3 blank characters at the beginning of each line. Let's have a look at the script source:

#!/bin/bash
#================================================
# AppStats
#================================================
while true
do
   tail -1 /var/tmp/appstats.log
   sleep $1
done | awk '{ printf( "%d\n", $3 ) }'
#================================================

Just add 3 spaces before the %d in { printf( "%d\n", $3 ) } and it'll be OK:

#!/bin/bash
#================================================
# AppStats
#================================================
while true
do
   tail -1 /var/tmp/appstats.log
   sleep $1
done | awk '{ printf( "   %d\n", $3 ) }'
#================================================

The script output now is:

# AppStats.sh 5
   10
   50
   40
   20
   30
^C
#
Common Error #2 |
But that's not all! It still won't work!... Why?.. Because the output of this script is not regular yet!... To check it (as with any other script), just execute it the same way but piped to 'more':

# AppStats.sh 5 | more

...10 minutes later there will still be no output at all! And that's exactly what happens when the STAT-service tries to send data to the dim_STAT server via a process pipe... What is wrong here?.. The problem is inside the script: its output is piped into the 'awk' program, and 'awk' itself is not flushing its output - data stays buffered until the whole 'awk' buffer is filled, and only then is it flushed to the pipe... How to fix it?
- add an fflush() instruction into the awk script (depending on your 'awk' version), or
- change the script so that the 'awk' call is inside the loop.

Updated script:

#!/bin/bash
#================================================
# AppStats
#================================================
while true
do
   tail -1 /var/tmp/appstats.log | awk '{ printf( "   %d\n", $3 ) }'
   sleep $1
done
#================================================

As 'awk' finishes on each loop pass, the data is always flushed and enters the pipe on each iteration.
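If your awk supports fflush() (gawk and most modern awk implementations do), the original one-pipe layout can also be kept. A sketch replaying two fake log lines through the fixed filter:

```shell
# force awk to flush after every line so each value reaches the pipe
# immediately instead of sitting in awk's output buffer
printf '%s\n' '10:17 5 20' '10:20 7 30' |
awk '{ printf( "   %d\n", $3 ); fflush() }'
```

With real data the left side of the pipe would be the while/tail loop; the replay above only demonstrates that fflush() preserves the padded per-line output.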
Continue improvement... |
So, the customer copied the new script into /etc/STATsrv/bin on all the needed servers and added to the end of their /etc/STATsrv/access files:

# AppStats add-on command
AppStats /etc/STATsrv/bin/AppStats.sh
#

On the dim_STAT side the Add-On was integrated as:
- Single-Line
- name: AppStats
- 1 column
- shell command: "AppStats %i"
- value: integer, 1st position, name: TPS

And we started to collect the first data... Within the first 40 minutes, once the customer had fully enjoyed graphing their application TPS levels, one of the developers said it would be nice to also see the avg response time!.. Within an hour they extended their log file line with an additional value showing the avg RespTM. The new script, now showing one more value:

#!/bin/bash
#================================================
# AppStats
#================================================
while true
do
   tail -1 /var/tmp/appstats.log | awk '{ printf( "   %d %d\n", $3, $4 ) }'
   sleep $1
done
#================================================

Then we re-integrated the same script, but now describing 2 columns from the output. And it worked just fine!.. Should I say that during the next few hours they already wanted to add 3 more new columns! :-))
And finally... |
Finally, it was hard for the developers to decide how many stat values they would need on each server, because it depends on the application deployment as well as on the server role.. So, they understood how to extend their script with any other values, but preferred to avoid the Add-On integration step every time they added a new value to their log file.. Well.. nothing is impossible :-) The only way to have a "dynamic" stat list is to improve the AppStats script so that it works like a Multi-Line stat command (just as 'iostat' may show more or fewer disks according to your server configuration).. The idea is simple: turn this output:

# AppStats.sh 5
 TPS AvgTM Users Active
  30    20   200     40
  40    20   200     50
^C
#

into multi-line:

# AppStats.sh 5
 Name     Value
 TPS        30
 AvgTM      20
 Users     200
 Active     40

 Name     Value
 TPS        40
 AvgTM      20
 Users     200
 Active     50
^C
#

And according to the needs, the log file may contain at the same time the value names as well as the values themselves:

# tail -2 /var/tmp/appstats.log
11:12 33 TPS 30 AvgTM 20 Users 200 Active 40
11:22 33 TPS 40 AvgTM 20 Users 200 Active 50

The new script version:

#!/bin/bash
#================================================
# AppStats
#================================================
while true
do
   echo " Name     Value"
   tail -1 /var/tmp/appstats.log | awk '{ printf( " %-8s %3d\n %-8s %3d\n %-8s %3d\n %-8s %3d\n\n", $3, $4, $5, $6, $7, $8, $9, $10 ) }'
   sleep $1
done
#================================================

This script may now be integrated as a Multi-Line Add-On, having 2 columns on the output... And even if the script is extended again with other values, they will just extend the list of lines with names and values.
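You can replay a single log line through the name/value conversion to check the output shape before integrating the wrapper (here extended to all 4 name/value pairs, fields $3 through $10 of the sample log line above):

```shell
# replay one appstats.log line through the Multi-Line conversion:
# each name/value pair becomes one output line
echo '11:12 33 TPS 30 AvgTM 20 Users 200 Active 40' |
awk '{ printf( " %-8s %3d\n %-8s %3d\n %-8s %3d\n %-8s %3d\n\n", $3,$4,$5,$6,$7,$8,$9,$10 ) }'
```

Four data lines come out, one per stat name, which is exactly the shape the Multi-Line recognizer expects between two header separators.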
Pre-Integrated Add-Ons |
To make your life easier, several additional stat programs come pre-integrated (Oracle, Java, Linux, etc.).
They are all installed by default on your dim_STAT server, BUT not all of them are enabled in your STAT-service by default - only commands that need no additional checking are enabled!... As a rule, first check that the add-on works correctly by starting it directly from the STAT-service bin directory on the client side (/etc/STATsrv/bin), and only then enable it via the access file (usually by a simple uncomment in /etc/STATsrv/access)...
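Assuming the disabled entries are simply commented out (as suggested above), enabling one can be scripted. This sketch works on a temporary copy rather than on the real /etc/STATsrv/access, and uses the CacheHIT entry from the earlier example:

```shell
# make a throwaway copy of an access file containing a disabled entry
access=$(mktemp)
printf '%s\n' '#CacheHIT /etc/STATsrv/bin/CacheHIT.sh' > "$access"

# "uncomment" the entry, i.e. strip the leading '#' from its line
sed 's/^#CacheHIT/CacheHIT/' "$access"

rm -f "$access"
```

On a real host you would edit /etc/STATsrv/access in place (and only after verifying the wrapper runs correctly by hand).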
ProcLOAD / UserLOAD |
There are 2 additional psSTAT wrappers:
- ProcLOAD: all output information summarized on-the-fly by process name
- UserLOAD: all output information summarized on-the-fly by user name

These stats are very useful when you have hundreds or thousands of running processes and you want to study groups of processes or users instead of the activity of a single process. Example of output:
# /etc/STATsrv/bin/ProcLOAD.sh 5
PNAME NTOT NACT UsrTM SysTM %CPU VSZ SYSC NLWP VCTX ICTX SIGS InputBLK OutputBLK I/O_CHR
STATcmd 312 58 0.00 0.00 0.0 594112 1472 312 180 2 0 0 0 198874
WebX.mySQL 312 58 0.70 0.04 3.4 1142968 8307 312 1066 82 0 0 0 398649
fsflush 1 1 0.00 0.03 0.4 0 0 1 7 2 0 0 155 0
httpd 7 1 0.00 0.00 0.0 18008 10 7 14 0 0 0 0 0
in.rlogind 1 0 0.00 0.00 0.0 2240 0 1 0 0 0 0 0 0
inetd 1 1 0.00 0.00 0.0 5304 1 4 4 0 0 0 0 0
init 1 0 0.00 0.00 0.0 2400 0 1 0 0 0 0 0 0
java 2 2 0.00 0.00 0.1 455448 255 50 413 1 0 0 0 12
mysqld 1 1 0.24 0.12 2.0 62216 21258 315 1058 30 0 0 342 4448475
nfs4cbd 1 0 0.00 0.00 0.0 2360 0 2 0 0 0 0 0 0
picld 1 1 0.00 0.00 0.0 4632 33 6 3 0 0 0 0 0
psSTAT64 1 1 0.02 0.08 0.3 5856 5006 1 3 2 0 0 0 3146
rpcbind 1 0 0.00 0.00 0.0 2880 0 1 0 0 0 0 0 0
sendmail 2 1 0.00 0.00 0.0 15456 10 2 3 0 0 0 0 0
svc.startd 1 1 0.00 0.00 0.0 10200 9 13 4 0 0 0 0 672
syseventd 1 0 0.00 0.00 0.0 2552 0 14 0 0 0 0 0 0
ttymon 2 0 0.00 0.00 0.0 4648 0 2 0 0 0 0 0 0
utmpd 1 1 0.00 0.00 0.0 1280 0 1 1 0 0 0 0 0
vold 1 0 0.00 0.00 0.0 2912 0 6 0 0 0 0 0 0
wrapper-solari 1 1 0.00 0.00 0.1 3040 237 2 168 2 0 0 0 0
xntpd 1 1 0.00 0.00 0.0 2320 25 1 5 0 5 0 0 0
ypbind 1 0 0.00 0.00 0.0 2360 0 1 0 0 0 0 0 0
^C
Special Solaris 10: ZoneLOAD / PoolLOAD/ TaskLOAD/ ProjLOAD |
Four psSTAT_10 wrappers were added that are specific to Solaris 10 and later:
- ZoneLOAD : all output information grouped on-the-fly by zone id
- ProjLOAD : the same, but grouped by project id
- TaskLOAD : the same, but grouped by task id
- PoolLOAD : the same, but grouped by pool id

These stats give you more extended information compared to the standard 'prstat'. Below are some more details about the output columns (given for ZoneLOAD, but valid for the others too :-)). ZoneLOAD.sh is a shell script wrapper for the psSTAT command that collects all data pre-grouped per Solaris Zone (psSTAT option: -M zone). A note on the last 3 values: at the time I needed them, I did not find any document describing their meaning, so I based my naming on the descriptions given in the /proc structure header files. These values help in some cases to understand which process (or Zone, in the current case) is doing more I/O operations than others, without involving any DTrace script. Description of the values printed per zone (each value is printed per the given time period):
- N_total -- current number of all processes running within a zone
- N_activ -- current number of processes being *active* within a zone per a given time period
- UsrCPU -- total User CPU *time* consumed within a zone per a given time period
- SysCPU -- total System CPU *time* consumed within a zone per a given time period
- CPU% -- percent of CPU busy within a zone - this value depends on whether or not some CPUs are assigned to the zone, so it's still better to monitor CPU% usage within a zone via the "vmstat" command!
- VSize -- total "virtual memory size" in KB of all processes running within a zone (be aware that each process VSZ value may already include several shared libraries or shared memory segments (SHM), and these *same* shared objects may be accounted several times within the total VSize...)
Currently there is no "simple" way to tell how much memory is used by a group of processes (for ex. Oracle processes, etc.) - even though it is still possible to write a script that accounts for each shared object only once, such a script would use a significant amount of CPU time..

So, nobody is perfect, but there is room for improvement! :-))
- SysCalls -- total number of all system calls/sec within a zone
- N_lwp -- current number of LWP (kernel threads) running within a zone
- Vol_CTX -- total number of all voluntary context switches/sec within a zone
- InVol_CTX -- total number of all involuntary context switches/sec within a zone
- Sigs -- total number of all signals/sec within a zone
- I_Blks -- total number of all input I/O blocks/sec within a zone
- O_Blks -- total number of all output I/O blocks/sec within a zone
- IO_Chrs -- total number of all I/O character operations/sec within a zone
netLOAD |
The netLOAD wrapper monitors Solaris network activity. This tool has been included in dim_STAT's STAT-service for a long time. Since v.8.0, netLOAD monitors all network interfaces present in the system (including virtual and loopback). If some indicators are not populated by device drivers, a '-1' value is printed instead. Also, a new '-I' option was added: you may give a fixed list of network interfaces you want to monitor (run '/etc/STATsrv/bin/netLOAD' for more details). In the STAT-service, netLOAD is integrated via the 'netLOAD.sh' script, to provide an easy way to change options. Example of output:

# /etc/STATsrv/bin/netLOAD.sh 5
Name IBytes/s OBytes/s Ipack/s Opack/s Ierr/s Oerr/s Col/s Bytes/s Pack/s Nocanput
lo0 -1.0 -1.0 0.4 0.4 0.0 0.0 0.0 0.0 0.8 0
ce0 26300.6 3840.0 105.2 64.0 0.0 0.0 0.0 30140.6 169.2 0
ce1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0

Name IBytes/s OBytes/s Ipack/s Opack/s Ierr/s Oerr/s Col/s Bytes/s Pack/s Nocanput
lo0 -1.0 -1.0 0.8 0.8 0.0 0.0 0.0 0.0 1.6 0
ce0 27624.4 2688.0 77.2 44.8 0.0 0.0 0.0 30312.4 122.0 0
ce1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0
UDPstat |
UDPstat is a wrapper around the "netstat -s" command on Solaris, made to monitor UDP traffic on the system. While it prints all the main counters (In/Out traffic, In/Out errors), it is particularly interesting for analyzing Input Overflows (and Input Checksum errors as well). Example of output:

# /etc/STATsrv/bin/UDPstat.sh 5
UDP-stat Tot# Delta Val/s
udpInDatagrams 65700 0 0.00
udpInErrors 0 0 0.00
udpOutDatagrams 68321 0 0.00
udpOutErrors 0 0 0.00
udpNoPorts 3514281 0 0.00
udpInCksumErrs 0 0 0.00
udpInOverflows 0 0 0.00
none 0 0 0

UDP-stat Tot# Delta Val/s
udpInDatagrams 65900 200 40.00
udpInErrors 0 0 0.00
udpOutDatagrams 68321 0 0.00
udpOutErrors 0 0 0.00
udpNoPorts 3514281 0 0.00
udpInCksumErrs 0 0 0.00
udpInOverflows 0 0 0.00
none 0 0 0
HAR |
HAR is the Hardware Activity Reporter tool for Solaris 8 and up. Starting with Solaris 8, Sun began to deliver public interfaces for the SPARC and x86 hardware performance counters: libcpc, to access CPU counters, and libpctx, to track a process. HAR differs from other tools in that it combines the low-level counts into higher-level metrics more useful to application programmers, who are typically interested in the following: CPI, FLOPS, MIPS, address bus percentage utilization, cache miss rates, branch and branch miss rates, and stall rates. These metrics help in assessing the fair usage of available processing units, locating bottlenecks and guiding tuning efforts, when needed... Check this valuable article to discover everything about this powerful tool!..
- NOTE: by default the HAR add-on is disabled within the Solaris STAT-service. Why? To get CPU counter data, the Solaris library functions require exclusive access to the chip - for a very short time, but exclusive anyway - so any other process running on the requesting CPU will be moved to another CPU, with some unwanted side effects.. That's why I don't suggest running HAR for a long period on your production system until you fully understand how it works..
Oracle Add-Ons |
NOTE: Originally all these scripts were made as examples to show how easily we may collect data even from Oracle. But with time people started to use them more and more (while I was still expecting that, inspired by the examples, they would add something more optimal :-)). For example, the current scripts connect to and disconnect from the database all the time; a collector keeping the connection open would be more optimal, etc... But well - it's still better than nothing! :-)) Anyway, all the following wrappers need a correctly set Oracle environment for the "Oracle" user. By default the user name is oracle, but it may be changed inside the scripts. It means that:

# su - oracle -c "sqlplus /nolog"

should work correctly and give you a SQL> prompt for the right database instance. Then you may check that:

# /etc/STATsrv/oraEXEC.sh 5

prints the current number of Oracle sessions and the current exec/commit activity. If it doesn't work - fix it before going further :-)) (BTW, there is a dim_STAT user group where you may always ask questions - http://groups.google.com/group/dimstat ) By default all these Add-Ons are already enabled within the dim_STAT database, and all you need is to uncomment them within the STAT-service access file (/etc/STATsrv/access) and start a new collect including Oracle stats :-)) And of course you may add any other one. Some people even collect statspack reports directly into dim_STAT! Oracle Add-Ons:
- oraIO : Oracle I/O stats for data/temp files
- oraEXEC : Oracle SQL QueryExecutions/sec, Commits/sec, Number of Sessions
- oraLATCH : Oracle latch stats
- oraSLEEP : Oracle latch sleeps stats
- oraENQ : Oracle enqueue stats
MySQL Add-Ons |
mysqlSTAT monitors the "show status" output. Each output variable is presented with 3 values:
- current value of the variable
- delta between the current and previous value
- value of delta/sec

And it's up to you to choose from the list of variables what kind of information you're interested in :-) To work properly, this add-on needs to be configured - edit your /etc/STATsrv/bin/mysqlSTAT.sh file to set up the user/password and host/port information.
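The relation between the 3 reported values can be sketched with two fake samples of one counter taken one interval apart (the variable name and numbers are invented for illustration):

```shell
# two readings of one "show status" counter, INTERVAL seconds apart
INTERVAL=5
prev=65700; curr=65900

delta=$((curr - prev))        # delta between current and previous value
rate=$((delta / INTERVAL))    # value of delta/sec

echo "$curr $delta $rate"     # prints: 65900 200 40
```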
mysqlLOAD is oriented toward multi-host monitoring, presenting a compact list of data from the "show status" output. This add-on also needs to be configured to work properly - edit your /etc/STATsrv/bin/mysqlSTAT.sh file to set up the user/password and host/port information. Reported values:
- On -- MySQL Server On-Line flag (0 or 1)
- Sessions -- number of currently connected user sessions (threads)
- InnDirty -- amount of dirty pages in InnoDB
- InnoFree -- amount of free pages in InnoDB
- KeyDirty -- amount of dirty pages in MyISAM Key buffer
- OpFiles -- number of currently open files
- OpTables -- number of currently open tables
- ByteRx/s -- received bytes/sec via network
- ByteTx/s -- sent bytes/sec via network
- Commit/s -- number of COMMIT requests/sec
- Delete/s -- number of DELETE requests/sec
- Insert/s -- number of INSERT requests/sec
- Select/s -- number of SELECT requests/sec
- Update/s -- number of UPDATE requests/sec
- InnDsy/s -- InnoDB Data Sync/sec
- InnDrd/s -- InnoDB Data Read/sec
- InnDwr/s -- InnoDB Data Write/sec
- InnLwr/s -- InnoDB Log Write/sec
- InnLsy/s -- InnoDB Log Sync/sec
- Key_Rd/s -- MyISAM Key Read/sec
- Key_Wr/s -- MyISAM Key Write/sec
- Query/s -- Query/sec execution
- AbrtClnt -- aborted clients (delta)
- AbrtConn -- aborted connections (delta)
- Connects -- number of recent connects (delta)
- SlowReqs -- number of slow requests (delta)
- TabLckWt -- table lock waits (delta)
- Rollback -- called rollbacks (delta)
innodbSTAT monitors the "show innodb status" output ("show engine innodb status" since MySQL 5.5). It works similarly to "mysqlSTAT", but the list of variables is based on InnoDB status only. To work properly, this add-on needs to be configured - edit your /etc/STATsrv/bin/innodbSTAT.sh file to set up the user/password and host/port information.
innodbMUTEX monitors the "show mutex status" output ("show engine innodb mutex" since MySQL 5.5), printing the InnoDB MUTEX-related stats. It is ready to print not only "waits" (the standard output) but also more detailed data (available by compiling InnoDB with debug options, or just by hacking it: counters, spins, real waited time on each mutex, etc.). To work properly, this add-on needs to be configured - edit your /etc/STATsrv/bin/innodbMUTEX.sh file to set up the user/password and host/port information. NOTE: -1 is printed if information is not available. Example of output:

# /etc/STATsrv/bin/innodbMUTEX.sh 5
MUTEX count count/s spin_waits spin_waits/s spin_rounds spin_rounds/s os_waits os_waits/s os_yields os_yields/s os_wait_times os_wait_times/s
db-server-online 1 1 1 1 1 1 1 1 1 1 1 1
buf/buf0buf.c:1122 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
fil/fil0fil.c:1535 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
srv/srv0srv.c:973 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
combined_buf/buf0buf.c:818 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
log/log0log.c:830 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
btr/btr0sea.c:181 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
combined_buf/buf0buf.c:820 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1

MUTEX count count/s spin_waits spin_waits/s spin_rounds spin_rounds/s os_waits os_waits/s os_yields os_yields/s os_wait_times os_wait_times/s
db-server-online 1 1 1 1 1 1 1 1 1 1 1 1
buf/buf0buf.c:1122 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
fil/fil0fil.c:1535 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
srv/srv0srv.c:973 -1 -1 -1 -1 -1 -1 2411 482.200012 -1 -1 -1 -1
combined_buf/buf0buf.c:818 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
log/log0log.c:830 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
btr/btr0sea.c:181 -1 -1 -1 -1 -1 -1 411 82.199997 -1 -1 -1 -1
combined_buf/buf0buf.c:820 -1 -1 -1 -1 -1 -1 0 0.000000 -1 -1 -1 -1
^C
innodbIOSTAT (deprecated, works only with old InnoDB) is an adaptation of a DTrace script published by Neel, with one additional feature: it automatically detects when mysqld is no longer running, or has been started/restarted. And of course you may run it only on a system supporting DTrace :-)
PostgreSQL Add-Ons |
pgsqlSTAT monitors the "pg_stat_bgwriter" and "pg_stat_database" output. Each output variable is presented with 3 values:
- current value of the variable
- delta between the current and previous value
- value of delta/sec
- some values are also presented per database name

And it's up to you to choose from the list of variables what kind of information you're interested in. To work properly, this add-on needs to be configured - edit the /etc/STATsrv/bin/pgsqlSTAT.sh file to set up the user/password and host/port information.

pgsqlLOAD is oriented toward multi-host monitoring, presenting a compact summary (a single line) from the "pg_stat_bgwriter" and "pg_stat_database" output. Please read the excellent howto written by Greg Smith to see how to analyze this data - http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm To work properly, this add-on also needs to be configured - edit the /etc/STATsrv/bin/pgsqlLOAD.sh file to set up the user/password and host/port information. Reported values:
- On -- Server On-Line flag (1/0)
- Sessions -- number of currently connected user sessions (backends)
- Commit/s -- number of executed COMMITs/sec
- Rollback -- number of executed rollbacks (delta)
- B_Read/s -- Block reads/sec
- B_hit/s -- Block read hit/sec
- RowSnd/s -- Rows sent/sec
- RowFch/s -- Rows fetched/sec
- RowIns/s -- Rows inserted/sec
- RowUpd/s -- Rows updated/sec
- RowDel/s -- Rows deleted/sec
- ChpTimed -- Checkpoints triggered by timeout (delta)
- ChptReqs -- Checkpoints triggered by request (delta) - probably ran out of checkpoint segments
- BuffChpt -- Buffers written by checkpoint (delta)
- BufClean -- Buffers cleaned by background writer (delta)
- MxWClean -- number of times Max Written level was reached by background writer (delta)
- BufBkend -- Buffers written by backends (delta)
- BufAlloc -- Allocated buffers (delta)
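As a small arithmetic sketch of why ChpTimed vs ChptReqs is worth watching (the deltas below are invented): a high share of requested checkpoints usually means checkpoints are being forced by segment exhaustion rather than by the timeout, which is a tuning signal discussed in the Greg Smith howto linked above.

```shell
# invented sample deltas, standing in for ChpTimed and ChptReqs
timed=12; reqs=4
total=$((timed + reqs))

# share of checkpoints forced by request rather than by timeout
echo "requested: $((100 * reqs / total))% of $total checkpoints"
```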
jvmSTAT |
This is a wrapper bringing in information from the "jvmstat" package. jvmstat is now officially integrated into the JVM 1.5 distribution and later (and is called "jstat" now). The jvmSTAT wrapper gives you a way to monitor ALL running JVMs on your server at the same time! To run jvmSTAT properly you first of all need jdk 1.5 (or later) installed on your host; check that it works correctly on your server:

# cd /usr/jdk15/bin
# jps
...
#

If you don't see your running JVM(s) in the "jps" output, try to fix that before continuing with the next steps :-) - normally it should work with any JVM since Java version 1.4.2. To get the 'jvmSTAT.sh' wrapper working:
- edit the /etc/STATsrv/bin/jvmSTAT.sh file (from the STAT-service) on each client machine, to set the right path environment with JAVA_HOME pointing to the jdk 1.5 home (ex: JAVA_HOME=/usr/jdk15)
- enable jvmSTAT in the STAT-service on each client (uncomment jvmSTAT in the /etc/STATsrv/access file)
- before starting any new collect including jvmSTAT, be sure that the jvmSTAT Add-On is already installed (Add-On interface from the Main Page)

Then start to collect jvmSTAT data :-)
jvmGC |
This one still exists, but I don't see any reason why anyone would still use it: jvmSTAT is the better solution for any kind of "GC" collection. This wrapper collects on-the-fly information about the GC (garbage collector) activity of any JVM running with the "-verbose:gc" option. Before JVM 1.4.2 the only possible way to get information on the GC activity of the standard JVM was a dump of the log output, so this wrapper is simply based on log file scanning. Usage, assuming you want to see the GC activity of one of your JVMs running on server "J":
0) Install "jvmGC" via the Add-Ons page.
1) jvmGC uses the $LOG file for data input (you may change the name and permissions according to your needs (default filename: /var/tmp/jvm.log); modify it if needed on the server "J" STAT-service side (/etc/STATsrv/bin)).
2) Use the web interface to start a collect including "jvmGC".
3) On server "J", add the "-verbose:gc" option to java in your application start script and redirect the output into the application log file (for ex. app.log).
4) Once you want to monitor your JVM:

$ tail -f app.log | /etc/STATsrv/bin/grepX GC >> /var/tmp/jvm.log

5) Observe jvmGC output data and have fun!
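To see what the pipeline above extracts, here is a stand-in sketch using plain grep in place of the STAT-service's grepX utility, on two fake log lines (one -verbose:gc line and one ordinary application line; the GC line format is only an approximation of real -verbose:gc output):

```shell
# only the GC line survives the filter; with real data this would be
# appended to /var/tmp/jvm.log for jvmGC to scan
printf '%s\n' '[GC 8704K->1752K(31744K), 0.0041 secs]' 'app: request served' |
grep GC
```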
LINUX specific STATs |
Linux Add-Ons:
- LvmSTAT (Linux vmstat)
- LcpuSTAT (Linux mpstat)
- LioSTAT (Linux iostat)
- LnetLOAD (Linux netLOAD)
- LpsSTAT (Linux psSTAT)
- LprcLOAD (Linux ProcLOAD)
- LusrLOAD (Linux UserLOAD)

For details, see the following special Linux note...
Administration tasks |
At any moment you can:
- Edit Add-On Description - in case you made a mistake in a value name, or in the shell command corresponding to your Add-On, you may quickly repair it via the Edit interface (however, you can no longer change MySQL table column names or datatypes - if the error was there, you'd better re-create this Add-On from scratch ;-))
- Save Add-On Description - this gives you an ASCII text file which may be reused for another database. This way you may share with others any new findings and any new tools you found useful!
- Restore Add-On Description - from the information in a given Description file, re-create all the database structures required by the Add-On and fill in all the information required for it to function correctly. WARNING: if you're already using the same Add-On in the current database, all previous data will be destroyed!
- Delete Add-On - removes the Add-On and all corresponding data from the current database...