dim_STAT User's Guide. by Dimitri |
STAT-service |
STAT-service was introduced in dim_STAT in version 3.0 and provides a simple, stable and secure way to collect STAT data on-line from Solaris/SPARC, Solaris/x86 and Linux/x86 servers. Since v.8.1 it is distributed under the GPL with source code, so you may now compile it yourself to collect data from other UNIX platforms. As a pilot example, a package for HP/UX is provided, and any newly ported kits are of course welcome! Since Jun. 2009 a version of the STAT-service daemon rewritten in Perl by Marc KODERER is also available: http://search.cpan.org/~mkoderer/stat_agent-0.09/stat_agent.pl - feel free to try this version too, and don't forget to send your comments and RFEs to Marc! :-)
Install STAT-service |
The STAT-service module is shipped as part of the dim_STAT distribution (dim_STAT-INSTALL/STAT-service directory), either as Solaris packages or as tar archives for manual integration. STAT-service has to be installed on every machine that needs to be monitored. The install must be done as the "root" user.

Package install (".pkg" file):

    # pkgadd -d STATsrv.pkg

Manual install (".tar" file):

    # cd /etc
    # tar xvf /path_to/STATsrv.tar
    # ln -s /etc/STATsrv/STAT-service /etc/rc2.d/S99STATsrv
    # ln -s /etc/STATsrv/STAT-service /etc/rc1.d/K99STATsrv
    # ln -s /etc/STATsrv/STAT-service /etc/rc0.d/K99STATsrv
    # ln -s /etc/STATsrv/STAT-service /etc/rcS.d/K99STATsrv

The software needs to be installed into a special /etc/STATsrv directory, which is the home directory of STAT-service. The contents of this directory are:

    /etc/STATsrv/
        STAT-service -- script to start/stop the service daemon; also defines the port number to listen on (def: 5000)
        access       -- access control file
        /bin         -- contains extended STAT programs/scripts
        /log         -- contains all logged information about service requests

Next step, start the service daemon:

    # /etc/STATsrv/STAT-service start

The way dim_STAT and STAT-service communicate with each other is very simple:

- 1) dim_STAT connects to the STAT-service daemon of the monitored server
- 2) if the service is not available, it waits for a time-out and goes back to 1), or exits if the STAT collect is stopped during this period
- 3) dim_STAT asks for the stat command that it needs
- 4) if there are no permissions for this command or the command is not found, the "command" connection is closed with an error message
- 5) dim_STAT collects the data, maintaining any time-shift due to previous time-outs
- 6) if the TCP connection is broken: go back to 1)
- 7) if STAT is stopped, it closes the connection and exits
- 8) if there was no activity during the "auto-eject" timeout, it closes the connection and goes back to 1)

As you see, this scheme is quite robust and will work after cluster switching, network problems, reboots, etc. Collections can be started once and then left running for a long period. In case you need to collect only during specific time intervals, you may just start and stop the STAT-service through a "cron" job or a similar tool.

Note: it appears that during a halt of the system (a power-off of a running machine), the TCP/IP connections can stay open without ever receiving an error code. When this happens, the collect should be broken off by the "auto-eject" timeout. However, auto-eject can also happen due to a mini-hang of the system or simply of the stat program. In this case you'll see holes in your collects, so take care when interpreting the results.
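The steps above amount to a small reconnect loop. Here is a minimal Python sketch of that loop, not the actual dim_STAT implementation. The wire format is an assumption (the guide does not describe it): the hypothetical `tcp_lines` simply sends the command name and reads text lines back. The names `collect`, `tcp_lines`, `open_lines`, `handle` and `is_stopped` are illustrative only, and the time-shift handling from step 5 is not modeled.

```python
import socket
import time


def tcp_lines(host, port, command, auto_eject):
    """Hypothetical transport: send the command name, then yield output lines.

    ASSUMPTION: the real protocol is not documented here. A read that stays
    idle longer than `auto_eject` seconds raises socket.timeout (an OSError),
    which models the "auto-eject" timeout.
    """
    with socket.create_connection((host, port), timeout=auto_eject) as sock:
        sock.sendall((command + "\n").encode())
        with sock.makefile("r") as stream:
            for line in stream:
                yield line.rstrip("\n")


def collect(open_lines, handle, is_stopped, retry_delay=10.0, sleep=time.sleep):
    """Collect until is_stopped() is true, reconnecting after any failure."""
    while not is_stopped():              # step 7: collect stopped -> exit
        try:
            for line in open_lines():    # steps 1 and 3: connect, request command
                handle(line)             # step 5: store one line of data
                if is_stopped():
                    return
        except OSError:                  # steps 2, 6, 8: refused, broken, idle
            pass
        # A stream that simply ends (e.g. closed by the service, step 4) is
        # treated like a broken connection here; that is an assumption.
        if not is_stopped():
            sleep(retry_delay)           # wait, then go back to step 1
```

A call could look like `collect(lambda: tcp_lines("myhost", 5000, "vmstat", 300), print, lambda: False)`, where the host name and the 300-second timeout are placeholders. Passing the transport in as a function keeps the retry logic testable without a network.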
STAT-service Access control file |
Below is an example of a STAT-service access control file. As you see, you may limit the stat commands accessible to each machine. This task may be done by the host administrator and may be handled completely independently. IMPORTANT:
- the access file is checked by the STAT-service daemon all the time, so you never need to restart the service to activate your modifications.
- since v.8.0 only stat commands that are sure to work on a given system are enabled by default. It's up to you to enable other commands which may need some additional configuration (like jvmSTAT, oraEXEC, etc.) or simply the presence of some software (like VxVM for vxstat) - "enable" just means uncommenting them in your /etc/STATsrv/access file :-)
- since v.8.5 you may add a port number to a command! This gives a way to collect several similar stats from the same host but from different sources :-)
For example, say you're running 3 Oracle database instances on the same server and want to monitor each one in detail, but only one oraEXEC is possible per system because, as is, it accepts only one Oracle SID. You may just make several copies of the same oraEXEC.sh wrapper and assign them to different ports like this:

    command oraEXEC /etc/STATsrv/bin/oraEXEC_sid0.sh
    command oraEXEC:5001 /etc/STATsrv/bin/oraEXEC_sid1.sh
    command oraEXEC:5002 /etc/STATsrv/bin/oraEXEC_sid2.sh

Then you start several STAT-service processes (on ports 5000, 5001 and 5002) and collect data from your server as if it were 3 different hosts :-) (from port 5000 you'll collect data about SID#0, from 5001 about SID#1, from 5002 about SID#2). It's a straightforward approach in such a situation, for MySQL and PostgreSQL as well, and it's still simpler than rewriting the whole thing to accept several databases at the same time.
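With more than a few instances, the "command" lines for such a setup can be generated rather than typed. The following Python sketch is only an illustration: `access_lines` is a hypothetical helper, not part of dim_STAT, and it assumes the first instance keeps the port-less form while each following instance takes the next port number, as in the oraEXEC example.

```python
def access_lines(name, script_pattern, instances, base_port=5000):
    """Return one access-file "command" line per instance.

    ASSUMPTION: the first instance uses the port-less form and each
    following instance gets base_port + its index, mirroring the example.
    """
    lines = []
    for index, instance in enumerate(instances):
        label = name if index == 0 else f"{name}:{base_port + index}"
        lines.append(f"command {label} {script_pattern.format(instance)}")
    return lines


for line in access_lines("oraEXEC", "/etc/STATsrv/bin/oraEXEC_{}.sh",
                         ["sid0", "sid1", "sid2"]):
    print(line)
```

The wrapper copies themselves (oraEXEC_sid0.sh and so on) still have to be created and configured by hand, one per SID.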
    #
    # STAT-service access file
    #
    # Format:
    # ...
    # command name[:port] fullpath
    # ...
    # access IP-address
    # ...
    # command name[:port] fullpath
    # ...
    #
    # By default all machines in the network may access STAT-services
    #
    # Keyword "access" restricts access by IP-address for all following
    # commands till the next "access" section.
    #
    # For example:
    #
    # ====================================================================
    # #
    # # Any host may access vmstat and mpstat collections
    # #
    # command vmstat /usr/bin/vmstat
    # command mpstat /usr/bin/mpstat
    # #
    # # Only machines 129.157.1.[1-3] may access netLOAD collections
    # #
    # access 129.157.1.1
    # access 129.157.1.2
    # access 129.157.1.3
    # command netLOAD.sh /etc/STATsrv/bin/netLOAD.sh
    # #
    # # Only machine 129.157.1.1 may access psSTAT collections
    # #
    # access 129.157.1.1
    # command psSTAT /etc/STATsrv/bin/psSTAT
    # #
    # ====================================================================
    #
    # """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
    # // All following commands should work out of the box... //
    # """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
    command Lvmstat /etc/STATsrv/bin/vmstat
    command Lvmstat:5001 /etc/STATsrv/bin/vmstat2
    command Lmpstat /etc/STATsrv/bin/Lmpstat.sh
    command tailX /etc/STATsrv/bin/tailX
    command LioSTAT /etc/STATsrv/bin/ioSTAT.sh
    command LpsSTAT /etc/STATsrv/bin/psSTAT.sh
    command LPrcLOAD /etc/STATsrv/bin/ProcLOAD.sh
    command LUsrLOAD /etc/STATsrv/bin/UserLOAD.sh
    command LnetLOAD /etc/STATsrv/bin/netLOAD.sh
    command LcpuSTAT /etc/STATsrv/bin/cpuSTAT.sh
    command sysinfo /etc/STATsrv/bin/sysinfo.sh
    command SysINFO /etc/STATsrv/bin/sysinfo.sh
    command IObench /etc/STATsrv/bin/IObench_STAT.sh
    command dbSTRESS /etc/STATsrv/bin/dbSTRESS_STAT.sh
    command dbSTRESS1:5000 /etc/STATsrv/bin/dbSTRESS_STAT.sh
    command dbSTRESS2:5001 /etc/STATsrv/bin/dbSTRESS_STAT.sh
    # """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
    # // Next commands may need some additional configuration //
    # // (see each *.sh to get more details before uncomment) //
    # """""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
    # Java (JVM)
    #command jvmSTAT /etc/STATsrv/bin/jvmSTAT.sh
    # Oracle
    #command oraEXEC /etc/STATsrv/bin/oraEXEC.sh
    #command oraIO /etc/STATsrv/bin/oraIO.sh
    #command oraENQ /etc/STATsrv/bin/oraENQ.sh
    #command oraLATCH /etc/STATsrv/bin/oraLATCH.sh
    #command oraSLEEP /etc/STATsrv/bin/oraSLEEP.sh
    # MySQL
    #command innodbSTAT /etc/STATsrv/bin/innodbSTAT.sh
    #command mysqlSTAT /etc/STATsrv/bin/mysqlSTAT.sh
    #command mysqlLOAD /etc/STATsrv/bin/mysqlLOAD.sh
    # PostgreSQL
    #command pgsqlSTAT /etc/STATsrv/bin/pgsqlSTAT.sh
    #command pgsqlLOAD /etc/STATsrv/bin/pgsqlLOAD.sh
    #
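To make the file's semantics concrete, here is a minimal Python sketch of how such a file could be interpreted. It is not the STAT-service daemon's own code. Two points are assumptions, since the guide does not spell them out: consecutive "access" lines are taken to form one section, and a command without a port is used as a fallback when no entry matches the requested port exactly. `parse_access` and `lookup` are hypothetical names.

```python
def parse_access(text):
    """Parse access-file text into (name, port, path, allowed) rules.

    port is None when the line gives no port; allowed is None (any host)
    or a frozenset of IP addresses.
    """
    rules = []
    allowed = None               # None: no "access" section seen yet
    last_was_access = False
    for raw in text.splitlines():
        fields = raw.split()
        if not fields or fields[0].startswith("#"):
            continue             # blank line or comment
        if fields[0] == "access" and len(fields) == 2:
            # ASSUMPTION: consecutive "access" lines extend one section;
            # an "access" line after a command starts a new section.
            if not last_was_access:
                allowed = set()
            allowed.add(fields[1])
            last_was_access = True
        elif fields[0] == "command" and len(fields) == 3:
            name, _, port = fields[1].partition(":")
            rules.append((name, int(port) if port else None, fields[2],
                          None if allowed is None else frozenset(allowed)))
            last_was_access = False
    return rules


def lookup(rules, name, client_ip, port):
    """Return the script path for a request, or None if it is refused."""
    exact = [r for r in rules if r[0] == name and r[1] == port]
    # ASSUMPTION: port-less entries are a fallback when no exact match exists.
    fallback = [r for r in rules if r[0] == name and r[1] is None]
    for _, _, path, allowed in exact or fallback:
        if allowed is None or client_ip in allowed:
            return path
    return None


SAMPLE = """\
command vmstat /usr/bin/vmstat
access 129.157.1.1
access 129.157.1.2
command netLOAD.sh /etc/STATsrv/bin/netLOAD.sh
access 129.157.1.1
command psSTAT /etc/STATsrv/bin/psSTAT
"""
rules = parse_access(SAMPLE)
print(lookup(rules, "psSTAT", "129.157.1.2", 5000))
```

In this sample, vmstat is open to any host, netLOAD.sh to two addresses, and psSTAT only to 129.157.1.1, so the final lookup is refused.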