dim_STAT User's Guide, by Dimitri
Analyzing
Analyzing your STAT data is quite intuitive, but let's give a few screenshots and a few words of comment. Once you click on the "Analyze" link, you have three options:
- Single-Host Analyze
- Multi-Host Analyze
- Multi-Host Extended Analyze
Let's take the Multi-Host option for now, as it's the easiest one :-) There are also some additional options:
- Active ONLY - show only currently running collects
- STATs Status - in Single-Host mode, this option shows the number of already-collected stats (very important to see if something is really being collected)
- Title matching - filter collects by a title pattern
- LOG matching - filter LOG messages by a text pattern
Welcome Analyze!
LOG Messages
A few words about LOG Messages. As we already saw when starting a new collect, you can use an optional parameter, Client Log File, to capture any new text messages appearing in that logfile during the collect. All messages are saved with a timestamp in the same database as the collect data. Alternatively, at any moment you may add such messages manually via the web interface: there is a special "LOG Messages Admin" link, and under every graph view there is a link to add a new message. But when can this be helpful? First, it helps you choose the correct time intervals for analyzing data, without having to remember the exact time slices when something particular happened on the machine. Second, when analyzing the activity on your machine, you'll be able to get a list of every registered event corresponding to the same time interval.

Example 1

Let's say your DBA is on vacation and you're standing in for a few days. A user claims that from time to time something happens on the machine and slows down his work. You start monitoring the system, and indeed you sometimes observe strange activity on the Oracle side. So, instead of writing down the times corresponding to the problem, you simply add two messages, "Something strange" and "Ok now", while you're analyzing activity graphs. Once your DBA comes back, you may just point him to your messages. Also, if somebody else analyzes the same time slices, entering the same perimeter, he or she will be warned by your messages too!

Example 2

Every night you start some batch jobs while nobody else is working on the system. There are several important parts, and you're trying to optimize them or simply check that nothing goes wrong. Let's assume your main batch script looks like:

```shell
#!/bin/sh
start_batch01
start_batch02
start_batch03
start_batch04
...
start_batch20
exit
```

Now, simply add log messages:

```shell
#!/bin/sh
echo "Start Night Batch" >> /var/tmp/log
echo "start batch01" >> /var/tmp/log
start_batch01
echo "start batch02" >> /var/tmp/log
start_batch02
echo "start batch03" >> /var/tmp/log
start_batch03
echo "start batch04" >> /var/tmp/log
start_batch04
...
echo "start batch20" >> /var/tmp/log
start_batch20
echo "End Night Batch" >> /var/tmp/log
exit
```

After that, every time you start a new STAT collect to monitor this machine, give "/var/tmp/log" as the Client Log File name. This way, every time your main batch script runs, every message written into /var/tmp/log will be saved and timestamped in the dim_STAT database. To select the correct time interval for analyzing the workload during, for example, batch04, you only need to click between the messages "start batch04" and "start batch05".
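Repeating the `echo ... >> /var/tmp/log` line by hand twenty times is error-prone, so the pattern can be factored into a small helper. A minimal sketch, where the `log_msg` name and the /tmp/demo_log path are illustrative stand-ins (not part of dim_STAT) for the Client Log File given at collect start:

```shell
#!/bin/sh
# Stand-in for the Client Log File path given when the collect was started.
LOGFILE=/tmp/demo_log
: > "$LOGFILE"                      # start with an empty file for the demo

# Hypothetical helper: append one message line to the log file.
# dim_STAT itself adds the timestamp when it stores the message.
log_msg() {
    echo "$1" >> "$LOGFILE"
}

log_msg "Start Night Batch"
for i in 01 02 03; do
    log_msg "start batch$i"
    # start_batch$i would run here
done
log_msg "End Night Batch"
```

The real batch jobs then run between the `log_msg` calls exactly as in the script above.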
Tasks |
There are two special "Task" tags that may be used with log messages:
```shell
===> TASK_BEGIN: Unique_Task_Name   -- marks the begin of a task execution
===> TASK_END: Unique_Task_Name     -- marks the end
```

The Unique_Task_Name should be a single word of up to 40 characters, unique within the current collect. For example, for 4 batches started in parallel we can add to the script:

```shell
( echo "===> TASK_BEGIN: batch1" >> /tmp/log; batch1.sh; echo "===> TASK_END: batch1" >> /tmp/log ) &
( echo "===> TASK_BEGIN: batch2" >> /tmp/log; batch2.sh; echo "===> TASK_END: batch2" >> /tmp/log ) &
( echo "===> TASK_BEGIN: batch3" >> /tmp/log; batch3.sh; echo "===> TASK_END: batch3" >> /tmp/log ) &
( echo "===> TASK_BEGIN: batch4" >> /tmp/log; batch4.sh; echo "===> TASK_END: batch4" >> /tmp/log ) &
```

When you analyze activity graphs later, you can use the "Show Tasks" button to get a short summary of all tasks executed during the observed period, with their total execution time (if they have finished). This is particularly useful when you start big, long-running jobs in parallel and they are all executed by the same program, so there is otherwise no way to know which process is running which job.
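The "Show Tasks" summary amounts to pairing each TASK_BEGIN with its TASK_END and subtracting the timestamps. A rough sketch of that computation, assuming (purely for illustration; dim_STAT keeps its timestamps in the database instead) a local file where each line is prefixed with epoch seconds:

```shell
#!/bin/sh
# Build a tiny sample log: epoch-seconds timestamp, then the task marker.
cat > /tmp/task_log <<'EOF'
100 ===> TASK_BEGIN: batch1
105 ===> TASK_BEGIN: batch2
160 ===> TASK_END: batch1
190 ===> TASK_END: batch2
EOF

# Pair BEGIN/END markers by task name ($4) and print each task's
# total execution time, as the "Show Tasks" summary reports it.
awk '
/TASK_BEGIN:/ { begin[$4] = $1 }
/TASK_END:/   { if ($4 in begin) printf "%s: %d sec\n", $4, $1 - begin[$4] }
' /tmp/task_log
```

Running it prints one line per finished task, e.g. "batch1: 60 sec"; unfinished tasks (BEGIN with no END yet) are simply skipped, just as "Show Tasks" can only total tasks that have finished.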