Created: 2002-01-21
Last modified: 2007-05-03




dim_STAT User's Guide






by Dimitri


dimitri.kravtchuk@france.sun.com





This information is provided for guidance only and does not commit Sun Microsystems.




Overview...
dim_STAT is a tool for general and detailed performance analysis and monitoring of Solaris and Linux systems.

Main features are:

All STAT data are collected from standard Solaris or Linux programs (vmstat, iostat, etc.) or from special ones (such as psSTAT for user/process activity) and saved in a MySQL database. Collected data are accessed via a Web interface and can be presented in several ways (interactive or static graphs, text, HTML tables). Since v.8.1 there is also a way to collect data from other UNIX systems (HP/UX, AIX, etc.).

dim_STAT can be used for on-line monitoring of one or several hosts at the same time. Data may also be easily post-loaded from output files of stat commands and analyzed in the same manner. At any time, collecting from new stat commands may be added to the tool (via the Add-On interface) to enlarge your view of application workload, RDBMS, your personal STAT program, etc.

By default dim_STAT interfaces Solaris stats (SPARC and x86):

as well as Add-On extensions for both Solaris and Linux/x86:

CPU usage of dim_STAT is very low, even lower than that of standard proctool, top, or perfbar, so the performance picture it gives is closer to reality...


Freeware End User License
LICENSE    
    
This software is released as "freeware". You are encouraged to redistribute    
unmodified copies of this software, as long as no fee is charged for the    
software, directly or indirectly, separately or as part of ("bundled with")    
another software product, without the express permission of the author.    
You may not attempt to reverse compile, modify or disassemble the software    
in whole or in part.    
    
SUPPORT, BUG REPORTS, SUGGESTIONS    
    
You are encouraged to send bug reports and suggestions.    
This software is not supported. Hence, your technical questions may or may    
not be answered. Questions, bug reports, comments and suggestions should    
all be sent to: Dimitri KRAVTCHUK (dimitri.kravtchuk@free.fr or     
dimitrik@sun.com) or into currently dedicated mail-list (dim_STAT@sun.com).    
    
DISCLAIMER    
    
ANY USE BY YOU OF THE SOFTWARE IS AT YOUR OWN RISK. THE SOFTWARE ARE PROVIDED    
FOR USE "AS IS" WITHOUT WARRANTY OF ANY KIND. TO THE MAXIMUM EXTENT PERMITTED    
BY LAW, THE AUTHOR (**) DISCLAIMS ALL WARRANTIES OF ANY KIND, EITHER EXPRESS    
OR IMPLIED, INCLUDING, WITHOUT LIMITATION, IMPLIED WARRANTIES OF    
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.    
THE AUTHOR (**) IS NOT OBLIGATED TO PROVIDE ANY UPDATES TO THE SOFTWARE.    
    
**Dimitri KRAVTCHUK    
      

Installation
The dim_STAT installation package is generally delivered as a TAR archive (dim_STAT.tar) or already "untarred" on CD.

Before install: Verify your disk space - you will need ~50MB for the initial install, mostly to host the Web Server and Database Server data. The database volume will grow with the size of your future STAT collections, and the Web directory may grow with your reports, so be generous from the start and reserve enough space for your data...

During installation: a new user "dim" and group "dim" will be created. User "dim" is the owner of the dim_STAT database and Web server. If you have special rules or restrictions on your system, you may create them yourself beforehand, or choose other user and group names according to your system policy.



STAT-service
STAT-service was introduced in dim_STAT v.3.0 and provides a simple, stable and secure way to collect STATs on-line from Solaris/SPARC, Solaris/x86 and Linux/x86 servers. Since v.8.1 it is distributed under the GPL with source code, so you may now compile it yourself on other platforms and collect data from other UNIX systems. As a pilot example, a package for HP/UX is provided. Any newly ported kits are welcome!...

Main Page
Now installation is finished, and the Database and Web servers are running. Be sure STAT-service is installed and running on all servers you want to monitor... (Note: you'll be surprised, but 90% of initial troubles are due to this stupid thing - people just forget to start STAT-service :))

Once that's done, you are ready to open your preferred Web browser (Java enabled or not - it's up to you) and connect to the dim_STAT Web server. The index page contains links to documentation, presentation, tool history, etc., but the link you need to click is: "Main Page"... (would you believe some people have trouble finding it? :))

Here is a small snapshot of the screen you'll see. (Note: the blue/red histogram has nothing to do with current host activity; it's just an example of a working Java applet. Since v.7.0 it has been converted into an image to simplify and speed up the main interface; if you want to be sure the Java plug-in is working correctly in your browser, click on the 'Preference' link and then on the Java Applet testing link.) Since v.7.0 you have a choice for graph generation: interactive Java applet or static PNG image. Until v.8.0 an old Netscape Navigator 4.5 was shipped in the /apps/httpd directory for Solaris/SPARC users - I don't ship it anymore, as at least Firefox and Opera are available for free download from the Internet. So well - 'Main Page' :))

As you have already guessed, the Main Page groups all the main actions... And you're right! :))

I'll not present action by action, but rather functionality by functionality, in order of operation. However, the shortest working cycle consists of at least:

  • Start STAT collect
  • Analyze/Monitor collected data
  • Stop STAT collect

A few words about the User Interface - don't be surprised if you don't find any "Back" button once you go somewhere from the Main Page: there isn't one! You have to use your browser's back button instead, and it's not because I'm just lazy :)) The reason is very simple: dim_STAT may use a Java applet to present data in graphical mode, but it seems the Web browser runs a dedicated JVM for every graph applet shown, and if you never go back in navigation, all the JVMs stay suspended in browser memory until it crashes with an out-of-memory error... To prevent crashes, I'm forcing you to use the browser's button.

Since v.7.0 you'll see a small toolbar at the top of your page presenting:
   - Currently used Database Name
   - Short links to Home/ Preferences/ Log Admin

Navigation becomes simpler, but be careful about running applets if you're using them!



Preferences
The Preferences page contains a set of key options used by different parts of the application. The most critical ones are grouped here; all other options (if supported) "auto-keep" the last given value (if you have used dim_STAT before, you'll see there are no longer any graph settings here - all graph values are auto-saved every time you use graph view)...

Note: your browser must accept Netscape cookies for this feature to work!

Also, there is no global settings button here, and I did not want to create too many links. So each option has its own validation button - don't forget to click on it to apply your modifications.

Database - Without any special setting, all collected data are stored in the "Default" database (real name is "dim"). However, to avoid possible contention and simplify further administration, it's highly recommended to use different databases for different projects/users/centers/etc. So, within the Database section you may choose the name of the database you want to use, or create and choose a new one. As a reminder, the current (working) database name is always present in the browser title and toolbar of every dim_STAT window as [db-name]. Free and used disk space are shown for the currently used database. (Note: MySQL has quite a small storage footprint, so disk space usage will stay fairly reasonable, but it's a good habit to check from time to time that none of your datafiles exceeds 2GB in size, as the shipped database server is a 32-bit program.)
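
The 2GB check mentioned above can be scripted. This is a minimal sketch, assuming you know which directory holds the database datafiles (its location is not given here, so it is passed as an argument); the function name is only an illustration and not part of dim_STAT:

```shell
# big_datafiles DIR [BLOCKS]
# List regular files under DIR that are larger than BLOCKS 512-byte blocks.
# The default, 4194304 blocks, corresponds to 2GB.
# DIR is an assumption: pass the directory where your datafiles really live.
big_datafiles() {
    find "$1" -type f -size +"${2:-4194304}" -print
}
```

Called as "big_datafiles /path/to/your/datafiles", it prints nothing while every file is still under the limit.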

Host Name List - Here you may give a pre-defined list of hostnames of the servers you usually monitor; then, instead of repeatedly (and possibly wrongly) typing the same names, you'll simply be able to choose the right name from a drop-down list in your browser.

Bookmark Term - if you have never used dim_STAT before, just leave it as it is for the moment. For others - this option was specially created to satisfy everyone who prefers a different name for the "Bookmark" functionality :)) (introduced in dim_STAT in version 4.0; after long discussions we still did not reach any agreement on the name :)) So, you're now free to name it as you like! :))

LOG Messages option - gives you a way to:
     - enable/disable auto-generated time slice messages for easier time interval selection
     - set the message list size (in lines)
     - set the max visible message length (in characters)

Page Colors - you're free to play with page colors if you're not happy with the default setting or simply like to change something from time to time :))

Check Java support - a simple way to check if the dim_STAT applet is working in your browser...


Starting On-Line collecting
Before starting any STAT collect, first check that STAT-service is up and running on every server you want to monitor!!! At least you will be sure you are not hitting the most common error :))

Another point - if you want to monitor any Linux server: be sure you have installed the Linux STAT Add-Ons before starting any collect (see the special Linux section in this document).

Now from the dim_STAT Main Page you may just follow the Start New Collect link. (Note: since v.8.0 there is no longer any separation between single-host and multi-host collects.)

IMPORTANT:



EasySTAT
Since dim_STAT v.7.0, the EasySTAT script is part of STAT-service for Solaris. EasySTAT is designed to simplify BatchLOAD interfacing with stat collecting from "very remote" or "highly secured" hosts.

In a few words, all you need is:

EasySTAT Usage:

   $ /etc/STATsrv/bin/EasySTAT.sh  OutDIR  Interval NbHours [ Title Host Base Batch ]

   options:
       OutDIR    - Output directory for stat collects (def: /var/tmp)
       Interval  - measurement interval for stat commands in sec. (def: 30)
       NbHours   - execution duration in hours (def: 8h)
       Title     - title to use during BatchLOAD processing
       Host      - hostname to use during BatchLOAD processing
       Base      - database name to use during BatchLOAD processing
       Batch     - full path to BatchLOAD binary on your server (def: /apps/ADMIN/BatchLOAD)

EasySTAT Config: by default the script collects 5 main stats



BatchLOAD
The idea of BatchLOAD came (like all other things) from day-to-day needs: sometimes you face customers/users who want to know what is happening on their machines, but don't agree to install any additional software on them... (very constructive approach :)). So, all you can do is ask them to run some stat commands on their systems and send you the output files. And every day, loading their files via the Web interface, you'll wonder harder and harder whether there is any way to do it automatically... Are you ready for BatchLOAD? :))

Once I decided to add a new component to dim_STAT, I also kept in mind other tools, existing or coming, that collect output from stat commands on the machine. All such tools keep data in their own format, so I tried to design the input format for BatchLOAD to be easily adaptable. Of course, I did not intend to create something universal :)), but I hope it should not be too hard to write a script converting from an existing format to BatchLOAD...

Some words about BatchLOAD internals: nothing depends on the names of the loaded files. All needed information is given by command options and inside the loaded file. The loaded file must have special TAGs, at least two: one to give the STAT name and one to confirm the END.

USAGE:

Usage: /apps/ADMIN/BatchLOAD -cmd NEW/ADD options 

    Options [NEW]:   -- force new collect creation
       -base DBname       -- database name
       -ID id             -- Collect ID, if 0 use max+1 id automatically
       -title Title       -- Collect Title
       -host Hostname     -- Collect Host Name
       -isec sec          -- Collect STATs Interval (sec)
       -start datetime    -- Collect Start DateTime in format YYYYMMDDHHMISS
       -skip1 yes/no      -- Yes/No skip first STAT measurement (often wrong values)
       -file Filename     -- Full path to file with STATs outputs
       -verbose on/off    -- verbose output on/off

    Options [ADD]:   -- add to existing collect whenever possible
       -base DBname       -- database name
       -host Hostname     -- Collect Host Name (optional)
       -ID id             -- Collect ID, if 0 : 
                             -- if host is given - use max id used by host
                             -- otherwise, use max (last) id automatically
       -skip1 yes/no      -- Yes/No skip first STAT measurement (often wrong values)
       -file Filename     -- File with STATs outputs
       -verbose on/off    -- verbose output on/off

Example:
$ /apps/ADMIN/BatchLOAD -cmd NEW -ID 0 -base ANT -file `pwd`/vmstat.out -skip1 no -title "Test BatchLOAD" -host V880 -isec 20 -start 20031024100000
$ /apps/ADMIN/BatchLOAD -cmd ADD -ID 0 -base ANT -file `pwd`/iostat.out -skip1 no
$ /apps/ADMIN/BatchLOAD -cmd ADD -ID 0 -base ANT -file `pwd`/mpstat.out -skip1 no -verbose on

In this example, the first line creates a new STAT Collect, automatically using a new ID (max+1), with the title "Test BatchLOAD", and loads the first file, "vmstat.out". The second and third lines just load the next data, "iostat.out" and "mpstat.out", into the newly created Collect. Once that's finished, we may connect to the dim_STAT web server and start analyzing.

Note: several "-file" options may be used at the same time, e.g.:

      $ /apps/ADMIN/BatchLOAD -cmd NEW -ID 0 -base ANT -skip1 no -title "Test BatchLOAD" -host V880 -isec 20 -start 20031024100000 \
          -file `pwd`/vmstat.out -file `pwd`/mpstat.out -file `pwd`/iostat.out
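
When a collect consists of many output files, the "-file" list can be built by a small shell function. This is a sketch only, assuming all files sit in one directory and end in ".out" (a naming choice of this example, not a BatchLOAD requirement). The function name and the BATCHLOAD variable are hypothetical:

```shell
# batchload_new DIR [BatchLOAD options...]
# Run "BatchLOAD -cmd NEW" with one -file option per DIR/*.out file.
# DIR should be a full path, since -file expects the full path to each file.
# BATCHLOAD (hypothetical variable) overrides the location of the binary.
batchload_new() {
    dir=$1
    shift
    given=$#
    for f in "$dir"/*.out; do
        [ -f "$f" ] || continue
        set -- "$@" -file "$f"
    done
    # Nothing was appended: refuse to create an empty collect.
    if [ "$#" -eq "$given" ]; then
        echo "no *.out files found in $dir" >&2
        return 1
    fi
    ${BATCHLOAD:-/apps/ADMIN/BatchLOAD} -cmd NEW "$@"
}
```

For example:

      $ batchload_new /var/tmp/collect -ID 0 -base ANT -skip1 no -title "Test BatchLOAD" -host V880 -isec 20 -start 20031024100000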


File Format of STAT output


The file format is designed to give as much flexibility as possible for data grouping and processing.

Main TAGs are STAT and END:

==> STAT StatName                  -- after this point all following data corresponds
                                      to given STAT command (StatName)
    Currently supported STAT names: 
        VMSTAT
        MPSTAT
        IOSTAT (iostat -x)
        IOSTAT-xn (iostat -xn)
        VXSTAT (vxstat -v)
        psSTAT
    And any other Add-On STAT you are able to create! :))
    such as some already shipped:
        netLOAD
        T3stat
        oraEXEC
        oraIO
        ...

==> END                            -- end of STAT data


At any time the following TAGs may also be inserted:
==> DTSET yyyy-mm-dd hh:mi:ss      -- set date+time point for next STAT data

==> LOGMSG message                 -- add log message into database corresponding
                                      to the currently loading data

Outside of "STAT" - "END" blocks any other lines are ignored.

Note: TAGs must be written exactly as shown: "==> STAT", "==> END", "==> DTSET", "==> LOGMSG". Don't miss any characters, please :))
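
Producing such a file from raw command output takes only a few lines of shell. The sketch below is an illustration under assumptions: both function names are hypothetical, and it assumes the data between the TAGs is the stat command's output exactly as printed.

```shell
# wrap_stat NAME
# Read raw stat output on stdin and print it as one BatchLOAD block:
# a "==> STAT NAME" line, the data unchanged, then "==> END".
wrap_stat() {
    printf '==> STAT %s\n' "$1"
    cat
    printf '==> END\n'
}

# dtset_tag STAMP
# Convert a YYYYMMDDHHMISS timestamp (the format of the -start option)
# into a "==> DTSET yyyy-mm-dd hh:mi:ss" line.
# Input that is not exactly 14 characters long is printed unchanged.
dtset_tag() {
    printf '%s\n' "$1" |
        sed 's/^\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)$/==> DTSET \1-\2-\3 \4:\5:\6/'
}
```

For example:

      $ vmstat 20 180 | wrap_stat VMSTAT > /var/tmp/vmstat.out
      $ dtset_tag 20031024100000
      ==> DTSET 2003-10-24 10:00:00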


Analyzing

Analyzing is quite intuitive, but let's just give some snapshots and a few words about it...

So, once you click on the Analyze link you have two choices:


Let's take the Multi-Host option for the moment, as it's quite easy for a first look.

Also, you can see some other additional options:




Multi-Host Analyzing
Multi-Host analyzing is the simplest to understand and a good place to get started. Let's go!

NOTE: some screenshots may not be up to date and may not exactly match the newest dim_STAT version. I hope it won't cause you too much trouble (they are here only to give a general idea of the interface and choice of actions)...

Main point: as we want to see several hosts at the same time and on the same graph, we cannot observe more than one stat value per graph; however, several graphs may be viewed on the same page.

General steps:
     - Choose STAT collects
     - Choose the time interval you are interested in
     - Choose Graph size/mode attributes
     - Choose STAT data you want to analyze
     - Go!



Single-Host Analyzing
Single-Host is quite similar to Multi-Host, but gives a wider variety of parameters, as it works with only one given STAT collect. Let's use the Demo collect (shipped by default with the dim_STAT database) as an example and analyze its IOSTAT data...

So, you may open your browser now and just follow me step by step, connecting to your dim_STAT server.



Bookmarks
Most bookmarks are pre-defined to save you time. Their number may vary from release to release, but never forget - you may always create your own, keep them as your specific kit, and carry them safely from one base to another. Also, people very quickly get into the habit of using only bookmarks and sometimes get lost - oh, there is no way to see per-network-interface activity! or, no way to see a single process, only top-10!... Don't forget all values are here, just go directly to the STAT interface and you'll find them :)) Then create new bookmarks covering other needs and enjoy! :))

dim_STAT will just give you a way to easily play with any collected data, but the steering wheel is in your hands! :))


dim_STAT CLI
Well, I was really surprised by the strong demand from users for a CLI solution in dim_STAT!.. It seems the Web interface does not make everyone happy all the time :))

Here we are, since v.8.1 there is a CLI module in dim_STAT :)

 # /apps/ADMIN/dim_STAT-CLI

  dim_STAT CLI v.1.4
  Usage: dim_STAT-CLI  [options] 
    Options:
       -Base DBname
       -ID CollectID               (if empty: prints available Collect list)
       -Stat Name                  (if empty: prints available Stat list)
       -Begin YYYYMMDDhhmiss 
       -End YYYYMMDDhhmiss 
       -Out fname

    optional:
       -Title graphtitle           (if empty: uses Collect title)
       -Width size                 (if empty: uses default graph width)
       -Height size                (if empty: uses default graph height)

For the moment it gives you a way to get a single graph in PNG format for a given Database, CollectID and time interval. Stat names correspond directly to your Bookmarks in the given Database, so the more Bookmarks you have made, the more graphs you may generate :))
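
Since each call produces one graph, generating several graphs is a matter of looping over Stat names. This is a minimal sketch, assuming the Stat names you pass exist as Bookmarks in the chosen Database. The function name, the DIMSTAT_CLI variable and the output file naming are choices of this example, not part of the tool:

```shell
# cli_graphs BASE ID BEGIN END STAT...
# Call dim_STAT-CLI once per Stat name, writing one PNG file per Stat.
# DIMSTAT_CLI (hypothetical variable) overrides the location of the binary.
cli_graphs() {
    base=$1 id=$2 begin=$3 end=$4
    shift 4
    for stat in "$@"; do
        # File name derived from the Stat name: spaces and slashes become "_".
        png=$(printf '%s' "$stat" | tr ' /' '__').png
        ${DIMSTAT_CLI:-/apps/ADMIN/dim_STAT-CLI} -Base "$base" -ID "$id" \
            -Stat "$stat" -Begin "$begin" -End "$end" -Out "$png"
    done
}
```

Running the CLI with an empty -Stat option prints the available Stat list, which gives the names to pass to such a loop.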


Administration
Several administration points were already covered in previous sections. Let's talk about some others, more oriented toward day-to-day management...

Add-On Statistics

One of the most powerful features of dim_STAT is the ability to integrate your OWN statistic programs into the tool. Once added, they are treated by the tool like all the others from the standard set of STAT(s) and give you the same kind of service: Online Monitoring, Up-Loading, Analyzing, Reporting, etc.

However, the choice of external stat programs is too wide, and it's quite impossible to design a wrapper that adapts to any format. So, I've decided to limit the input recognizer to just 2 formats (covering maybe 95% of needs) and leave you to write your own wrapper, if necessary, to bring the output into one of the supported formats.

Formats supported by dim_STAT:

     - SINGLE-Line: with one output line per measurement (ex: vmstat)

     - MULTI-Line: with several output lines per measurement (ex: iostat)

To be correctly interpreted, your stat program should produce stable output (same format for data lines, at least one line in the MULTI case, a correctly kept time-out interval, etc.). Lines other than data lines have to be declared to dim_STAT as ignored.

NOTE: lines shorter than 4 characters are considered spam and ignored!
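
A wrapper of the kind mentioned above can be a simple filter. The sketch below assumes a hypothetical stat program that prints "name: value" lines, with a blank line between measurements, and turns each measurement into one SINGLE-Line record. The input format and the function name are inventions of this example:

```shell
# to_single_line
# Read "name: value" lines grouped into blank-line separated records
# (a hypothetical input format) and print one line of space-separated
# values per record.
to_single_line() {
    awk -F': *' '
        NF == 2 { printf "%s%s", sep, $2; sep = " " }   # collect the values
        /^$/    { if (sep != "") print ""; sep = "" }   # a blank line ends a record
        END     { if (sep != "") print "" }             # flush the last record
    '
}
```

Keep the note above in mind: a record that collapses to fewer than 4 characters would be ignored, so a wrapper emitting very short lines may need an extra column or some padding.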

Let's take some examples to make it clearer...


Linux Special Notes
I don't know if I'll surprise you by saying this: all dim_STAT binaries for Solaris SPARC are still compiled on the old and legendary SPARCstation-5 under Solaris 2.6 and work on every later generation of Sun SPARC machines, up to the latest generation with Solaris 10 included. Some unchanged binaries are still here and are even 10 years old! This is called TRUE binary compatibility! :))

Now, may I say the same thing about Linux? :)) Sometimes even the SAME vendor breaks binary compatibility between one distribution version and the next!...

At first, when the main problem was only the different implementations of shared libraries, I recompiled all main dim_STAT programs as static binaries to be sure they would run on every distribution. With time things got worse: a static binary may dump core on some distros... So, the current dim_STAT Linux version ships both dynamic and several static versions of the same binary, generated on different distros. The current v.8.0 is reported to work out-of-the-box on: MEPIS 3.3.1-1, MEPIS 6.0, RHEL 4.x, CentOS 4.x, SuSE 9/10, Fedora Core 3/4/5. Anyway, if you meet any problem during installation or execution of dim_STAT - please contact me directly and we'll try to fix the issue together...

NOTE: PC boxes are quite cheap today, so always ask yourself whether simply buying a $200 PC, installing MEPIS 6.0 (10 minutes), installing dim_STAT (5 minutes) and STARTING to collect from all your servers would not be cheaper/easier/simpler than trying to fix issue after issue :))


Report Tool
This User's Guide is written entirely using the Report Tool :))
As usual, this tool was created to cover my day-to-day needs :))

Quite often I have to write reports to explain performance findings, present observed system/application activity, etc. etc. etc. ... etc. Yes, etc., because sometimes we have to write too much to make things work or simply to protect people from doing stupid things :))

So well, say you've started to write your document for a French customer (so in French), and it turns out the majority of the development team speaks English only (or not only, but not French)... And you start to keep 2 parallel copies of the same document: FR/EN... Then you discover something very important and cannot tell it to the customer (yet) but absolutely need to communicate it internally! So you split once again: FR/EN x Customer/Internal = 4 different documents! The next split gives you 8 documents (still based on the same source of information!)... And more: watching people spend hours (or a whole day) copying and pasting activity graphs from browser/teamquest/best1/patrol/etc into their word processor ... makes me cry :))

So I was really tired of this situation and tried to imagine something different :))


Additional Tools
Since v.5 additional tools have been shipped in the package, but it seems I forgot to present them explicitly, and a lot of users were not informed about them...

FAQ