

Top Ten onstat Commands

Material submitted by Evgeny Nechaev


onstat -D

INFORMIX-OnLine Version 7.12.UC2 -- On-Line -- Up 122 days 20:48:40 -- 72616 s

Dbspaces

address number flags fchunk nchunks flags owner name

c3fa80e8 1 1    1 1 N informix rootdbs
c3fa84b0 2 2001 2 1 N T informix tempdbs
c3fa8518 3 1    3 1 N informix db1
c3fa8580 4 1    4 1 N informix db2

4 active, 2047 maximum

Chunks

address chk/dbs offset page Rd page Wr pathname

c3fa8150 1 1 0 1259 289 /home/informix/ROOTDBS
c3fa8228 2 2 0 11   11  /home/informix/TEMPDBS
c3fa8300 3 3 0 6    0   /home/informix/db1
c3fa83d8 4 4 0 6    0   /home/informix/db2

4 active, 2047 maximum

Monitor the page reads and page writes for each chunk in the "page Rd" and "page Wr" columns. Comparing these counts across chunks shows how evenly I/O is distributed. Remember that several chunks may reside on a single device, so I/O that looks balanced by chunk can still be concentrated on one disk.
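
As a rough illustration of the comparison described above, the following sketch computes each chunk's share of total page I/O from the sample Chunks output. The function name io_share and the hand-copied data layout are assumptions made for this example; they are not part of onstat.

```python
# Rows copied by hand from the sample Chunks section:
# (chunk number, pages read, pages written, pathname)
chunks = [
    (1, 1259, 289, "/home/informix/ROOTDBS"),
    (2, 11, 11, "/home/informix/TEMPDBS"),
    (3, 6, 0, "/home/informix/db1"),
    (4, 6, 0, "/home/informix/db2"),
]


def io_share(rows):
    """Return {pathname: percent of all page reads plus page writes}."""
    total = sum(rd + wr for _, rd, wr, _ in rows)
    if total == 0:
        return {path: 0.0 for _, _, _, path in rows}
    return {path: 100.0 * (rd + wr) / total for _, rd, wr, path in rows}


shares = io_share(chunks)
for path, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{path:28s} {pct:6.2f}%")
```

In this sample almost all I/O goes to the root dbspace chunk, which is the kind of imbalance the two columns are meant to reveal.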

onstat -F

INFORMIX-OnLine Version 7.12.UC2 -- On-Line -- Up 122 days 20:51:45 -- 72616 s

Fg Writes LRU Writes Chunk Writes

0         103         311

address flusher state data

c3faa444 0      I     0 = 0X0

states: Exit Idle Chunk Lru

Monitor the types of writes occurring in your system. Foreground writes (Fg Writes) should be eliminated. The balance of LRU Writes and Chunk Writes depends on the type of system. An OLTP system should maximize LRU Writes: there will always be some Chunk Writes, but LRU Writes shorten checkpoint duration. In a DSS system, Chunk Writes should be maximized; some LRU Writes may still occur in an effort to eliminate foreground writes.

Also monitor page cleaners (flushers) at checkpoint time. Make sure they are all busy doing Chunk Writes.
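
One way to track the write mix over time is to turn the three counters into percentages. This is a minimal sketch using the counters from the sample output; write_mix is a hypothetical helper name, not an Informix utility.

```python
def write_mix(fg, lru, chunk):
    """Return each write type as a percentage of all buffer writes."""
    total = fg + lru + chunk
    if total == 0:
        return {"fg": 0.0, "lru": 0.0, "chunk": 0.0}
    return {
        "fg": 100.0 * fg / total,
        "lru": 100.0 * lru / total,
        "chunk": 100.0 * chunk / total,
    }


# Counters from the sample: Fg Writes, LRU Writes, Chunk Writes
mix = write_mix(0, 103, 311)
for kind, pct in mix.items():
    print(f"{kind:6s} {pct:6.2f}%")
if mix["fg"] > 0:
    print("foreground writes present; the text says these should be eliminated")
```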

onstat -l

INFORMIX-OnLine Version 7.12.UC2 -- On-Line -- Up 122 days 20:56:26 -- 72616 s

Physical Logging

Buffer bufused bufsize numpages numwrits pages/io

P-2    0       16       274      22        12.45

phybegin physize phypos phyused %used

10003f   500     433    0        0.00

Logical Logging

Buffer bufused bufsize numrecs numpages numwrits recs/pages pages/io

L-2    0       16       2437    113      31        21.6     3.6

address number flags uniqid begin size used %used

c3ecc55c 1    U-B----   7  100233 250  250 100.00
c3ecc578 2    U---C-L   8  10032d 250  106 42.40
c3ecc594 3    F------   0  100427 250  0    0.00
c3ecc5b0 4    F------   0  100521 250  0    0.00
c3ecc5cc 5    U-B----   5  10061b 250  250 100.00
c3ecc5e8 6    U-B----   6  100715 250  250 100.00

Monitor the physical log buffer usage. Observe the bufsize and pages/io columns in the first line of the output. If pages/io divided by bufsize is roughly 75%, the buffer is being used efficiently. If the ratio is less than 75%, the physical log buffer is probably too large. If it is greater than 90%, the buffer is potentially too small. In the sample above the ratio is 12.45 / 16, or about 78%.

Monitor the logical log buffer usage the same way as the physical log buffer. However, if unbuffered logging is used, the buffer is flushed according to the size of transactions rather than how full the buffer is, so this rule may not apply. If most transactions are smaller than a page of the logical log buffer, the ratio may always be low. In that case, just keep the logical log buffers at their defaults.
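
The buffer rule of thumb for both logs can be sketched as below, using the pages/io and bufsize values from the sample output. The helper names are hypothetical, and treating the range from 75% to 90% as acceptable is an assumption, since the text gives no guidance for that range.

```python
def buffer_utilization(pages_per_io, bufsize):
    """Percent of the log buffer filled per write: pages/io over bufsize."""
    return 100.0 * pages_per_io / bufsize


def classify(pct):
    # Thresholds from the text: under 75% suggests the buffer is too large,
    # over 90% suggests it is too small. Calling 75-90% "ok" is an assumption.
    if pct < 75.0:
        return "probably too large"
    if pct > 90.0:
        return "potentially too small"
    return "ok"


physical = buffer_utilization(12.45, 16)  # P-2 line of the sample
logical = buffer_utilization(3.6, 16)     # L-2 line of the sample

print(f"physical log buffer: {physical:.1f}% ({classify(physical)})")
print(f"logical log buffer:  {logical:.1f}% ({classify(logical)})")
```

With unbuffered logging a low logical log figure is expected, so the second result should be read with that caveat in mind.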

Monitor the physical log file near checkpoint time to see whether the percent used (%used) is near 75%. A well-tuned physical log file will be nearly full at checkpoint time. If the physical log file never fills up, it is wasting disk space.

Also monitor the logical log files to be sure that they are being backed up. The sysmaster database can be used to determine which logical log files have been freed. In the systrans table, the smallest tx_loguniq value greater than 0 identifies the last log file containing an open transaction.

onstat -m

INFORMIX-OnLine Version 7.12.UC2 -- On-Line -- Up 122 days 20:57:47 -- 72616 s

Message Log File: /home/informix/online.log

13:09:47 Dataskip is now OFF for all dbspaces
13:09:47 On-Line Mode
13:09:47 Checkpoint Completed: duration was 0 seconds.
14:04:50 Checkpoint Completed: duration was 0 seconds.
14:14:51 Checkpoint Completed: duration was 0 seconds.
14:24:54 Checkpoint Completed: duration was 0 seconds.
14:29:54 Checkpoint Completed: duration was 0 seconds.

Mon Nov 4 11:23:51 1996

11:23:51 Logical Log 7 Complete.
11:27:10 Checkpoint Completed: duration was 0 seconds.

Tue Dec 31 11:16:01 1996

11:16:01 Checkpoint Completed: duration was 0 seconds.
11:21:00 Checkpoint Completed: duration was 0 seconds.
11:26:01 Checkpoint Completed: duration was 0 seconds.
11:36:01 Checkpoint Completed: duration was 0 seconds.

Monitor the online message log file for unusual events. Also keep track of the frequency and duration of checkpoints.
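
Checkpoint frequency and duration can be pulled out of the message log with a small script. The sketch below assumes the line format shown in the sample output and only compares timestamps that fall within a single day; the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as "14:04:50 Checkpoint Completed: duration was 0 seconds."
CKPT = re.compile(
    r"^(\d\d:\d\d:\d\d) Checkpoint Completed: duration was (\d+) seconds\."
)

sample = """\
13:09:47 Dataskip is now OFF for all dbspaces
13:09:47 On-Line Mode
13:09:47 Checkpoint Completed: duration was 0 seconds.
14:04:50 Checkpoint Completed: duration was 0 seconds.
14:14:51 Checkpoint Completed: duration was 0 seconds.
14:24:54 Checkpoint Completed: duration was 0 seconds.
14:29:54 Checkpoint Completed: duration was 0 seconds.
"""


def checkpoints(text):
    """Return a list of (time of day, duration in seconds) per checkpoint."""
    found = []
    for line in text.splitlines():
        match = CKPT.match(line.strip())
        if match:
            when = datetime.strptime(match.group(1), "%H:%M:%S")
            found.append((when, int(match.group(2))))
    return found


def intervals_seconds(cks):
    """Seconds between consecutive checkpoints (same-day entries only)."""
    return [int((b[0] - a[0]).total_seconds()) for a, b in zip(cks, cks[1:])]


cks = checkpoints(sample)
intervals = intervals_seconds(cks)
longest = max(duration for _, duration in cks)
print("intervals (s):", intervals)
print("longest checkpoint (s):", longest)
```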

onstat -p

Informix Dynamic Server Version 9.30.UC2E3   -- On-Line -- Up 15:30:00 -- 30704 Kbytes

Profile
dskreads pagreads bufreads %cached dskwrits pagwrits bufwrits %cached
14864    46995    6301400  99.76   22142    49296    670458   96.70  

isamtot  open     start    read     write    rewrite  delete   commit   rollbk
4931349  8898     511695   1389809  901215   1368     1137     858      0

gp_read  gp_write gp_rewrt gp_del   gp_alloc gp_free  gp_curs 
0        0        0        0        0        0        0       

ovlock   ovuserthread ovbuff   usercpu  syscpu   numckpts flushes 
0        0            165      104.08   2.76     10       384     

bufwaits lokwaits lockreqs deadlks  dltouts  ckpwaits compress seqscans
1457     0        22486669 0        0        0        329      805     

ixda-RA  idx-RA   da-RA    RA-pgsused lchwaits
44       1983     5561     7583       8       

The profile output contains several values worth monitoring.

The read percent cached is important for an OLTP system and should be 95% or higher. This is not always possible because of the way the application uses the data, but it is a starting point. Adding more buffers will generally increase the read percent cached.

The write percent cached can also be monitored, but it is much harder to tune. As the number of buffers increases, so should the write percent cached. 85% or higher is a reasonable initial target, although some applications may not reach it.
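
The two %cached figures can be recomputed from the raw counters, which is useful when comparing two samples taken some time apart. The formula below, 100 * (buffer operations - disk operations) / buffer operations, reproduces the values in the sample output; the helper name and the use of the 95% and 85% targets as simple pass/fail checks are choices made for this sketch.

```python
def percent_cached(buffer_ops, disk_ops):
    """Share of buffer operations that did not need a disk operation."""
    if buffer_ops == 0:
        return 0.0
    return 100.0 * (buffer_ops - disk_ops) / buffer_ops


# Counters from the sample onstat -p output
read_cached = percent_cached(bufreads := 6301400, dskreads := 14864)
write_cached = percent_cached(bufwrits := 670458, dskwrits := 22142)

print(f"read  %cached: {read_cached:.2f} (target 95 or higher)")
print(f"write %cached: {write_cached:.2f} (target 85 or higher)")

read_ok = read_cached >= 95.0
write_ok = write_cached >= 85.0
```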

Lock contention can be monitored with the lokwaits and lockreqs columns. If lokwaits is 1% or more of lockreqs, you may have lock contention. Changing the way applications use isolation levels and locks will help reduce lock contention.
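
The 1% rule is a simple ratio. A minimal sketch with the sample counters follows; lock_wait_percent is a hypothetical helper name.

```python
def lock_wait_percent(lokwaits, lockreqs):
    """Lock waits as a percentage of lock requests."""
    if lockreqs == 0:
        return 0.0
    return 100.0 * lokwaits / lockreqs


# lokwaits and lockreqs from the sample onstat -p output
pct = lock_wait_percent(0, 22486669)
contention_suspected = pct >= 1.0  # threshold taken from the text
print(f"lock waits: {pct:.4f}% of requests, suspected: {contention_suspected}")
```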

Deadlocks within applications can be detected with the deadlks and dltouts columns. deadlks counts the deadlocks detected and aborted by OnLine. dltouts counts the queries whose deadlock timeout expired. Deadlock timeouts only occur for distributed queries. If deadlocks are occurring, changes will be required in the application.

Read-ahead efficiency can be monitored. Add up the values in the ixda-RA, idx-RA, and da-RA columns and compare the sum to RA-pgsused. If the number of pages read ahead is much larger than the number of pages used, too many pages are being read ahead; reduce the RA_PAGES parameter. A side effect of this will be a reduction in the read percent cached.
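
The read-ahead comparison amounts to one sum and one ratio. The sketch below uses the sample counters; the function name is hypothetical, and no threshold is applied because the text only says the two numbers should be nearly equal.

```python
def read_ahead_efficiency(ixda_ra, idx_ra, da_ra, ra_pgsused):
    """Return (pages read ahead, percent of those pages actually used)."""
    pages_read_ahead = ixda_ra + idx_ra + da_ra
    if pages_read_ahead == 0:
        return 0, 0.0
    return pages_read_ahead, 100.0 * ra_pgsused / pages_read_ahead


# ixda-RA, idx-RA, da-RA and RA-pgsused from the sample onstat -p output
pages_read_ahead, efficiency = read_ahead_efficiency(44, 1983, 5561, 7583)
print(f"read ahead: {pages_read_ahead} pages, used: {efficiency:.2f}%")
```

In the sample nearly every page read ahead was used, so RA_PAGES would not need to be reduced on this evidence.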

Additionally, RA_THRESHOLD needs to be set close to RA_PAGES. Otherwise the bufwaits column will increase, because the database engine has to wait for read-aheads to complete before it can use the pages.

onstat -R

# of dirty buffers in each LRU queue

 INFORMIX-OnLine Version 7.12.UC2 -- On-Line -- Up 122 days 20:59:53 -- 72616 s

8 buffer LRU queue pairs

# f/m length % of pair total

0 F   25  100.0% 25
1 m   0   0.0%
2 f   25  100.0% 25
3 m   0   0.0%
4 f   25  100.0% 25
5 m   0   0.0%
6 f   25  100.0% 25
7 m   0   0.0%
8 f   25  100.0% 25
9 m   0   0.0%
10 f  25  100.0% 25
11 m  0   0.0%
12 f  25  100.0% 25
13 m  0   0.0%
14 f  25  100.0% 25
15 m  0   0.0%

0 dirty, 200 queued, 200 total, 256 hash buckets, 2048 buffer size

start clean at 60% (of pair total) dirty, or 15 buffs dirty, stop at 50%

Observe the percentage of pages on the modified side of each queue at checkpoint time and compare it to the LRU_MAX_DIRTY parameter. The percentage should always be less than LRU_MAX_DIRTY. If it is not, then there are not enough page cleaners to perform write requests, there are not enough AIO VPs or KAIO threads to perform physical writes, or the disk controllers or the drives themselves are saturated.

For OLTP systems, reducing LRU_MAX_DIRTY below the percentage of dirty pages in the LRU queues will generally decrease the duration of checkpoints.
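
The per-queue check can be sketched as follows, with the queue lengths copied by hand from the sample output and the limit taken from its "start clean at 60%" line. The names modified_percent and over_limit are hypothetical.

```python
LRU_MAX_DIRTY = 60.0  # percent, from "start clean at 60%" in the sample

# (unmodified queue length, modified queue length) for each of the 8 pairs
pairs = [(25, 0)] * 8


def modified_percent(f_len, m_len):
    """Percent of a queue pair's buffers that sit on the modified side."""
    total = f_len + m_len
    return 100.0 * m_len / total if total else 0.0


percents = [modified_percent(f, m) for f, m in pairs]
# Pairs at or above the limit suggest the page cleaners are not keeping up.
over_limit = [i for i, pct in enumerate(percents) if pct >= LRU_MAX_DIRTY]

for i, pct in enumerate(percents):
    print(f"pair {i}: {pct:5.1f}% modified")
print("pairs at or above LRU_MAX_DIRTY:", over_limit)
```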

onstat -g ioq

INFORMIX-OnLine Version 7.12.UC2 -- On-Line -- Up 122 days 21:06:48 -- 72616 Kbytes

AIO I/O queues:
q name/id    len maxlen totalops  dskread dskwrite  dskcopy
  adt   0      0      0        0        0        0        0 
  msc   0      0      1      114        0        0        0 
  aio   0      0      1     1031       14      476        0 
  pio   0      0      1       27        0       27        0 
  lio   0      0      1      147        0      147        0 
  gfd   3      0      1      211      187       24        0 
  gfd   4      0    152    35991    14627    21364        0 
  gfd   5      0    140      630       50      580        0 

Monitor the len and maxlen columns of the I/O request queues. A maxlen greater than 25, or a len that is consistently 10 or more, indicates that requests are not being serviced fast enough. Adding more VPs will help if the disks or controllers are not already saturated.
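
The thresholds above can be applied to captured output with a short parser. This sketch assumes the column layout shown in the sample (queue name, id, len, maxlen, then counters). A single sample cannot show that len is consistently high, so the len test here only flags the current value.

```python
sample = """\
  adt   0      0      0        0        0        0        0
  msc   0      0      1      114        0        0        0
  aio   0      0      1     1031       14      476        0
  pio   0      0      1       27        0       27        0
  lio   0      0      1      147        0      147        0
  gfd   3      0      1      211      187       24        0
  gfd   4      0    152    35991    14627    21364        0
  gfd   5      0    140      630       50      580        0
"""


def slow_queues(text, max_maxlen=25, max_len=10):
    """Return (name, id) of queues exceeding the thresholds from the text."""
    flagged = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # skip headers and anything not shaped like a queue row
        name, qid = fields[0], int(fields[1])
        qlen, maxlen = int(fields[2]), int(fields[3])
        if maxlen > max_maxlen or qlen >= max_len:
            flagged.append((name, qid))
    return flagged


flagged = slow_queues(sample)
print("queues to investigate:", flagged)
```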

onstat -g lsc

INFORMIX-OnLine Version 7.12.UC2 -- On-Line -- Up 123 days 01:39:41 -- 72616 s
Light Scan Info
descriptor address next_lpage next_ppage ppage_left bufcnt look_aside
 0        c41edcd0     87     500223       490          1  N
 1        c41efe68     85     6000cc       497          1  Y
 1        c41f07b8     83     7000cc       497          1  Y

Use onstat -g lsc to determine if light scans are being utilized. If the size of the table is larger than the size of the buffers and the isolation level is set to dirty read, or committed read with a shared lock on the table, then light scans should happen. Light scans will significantly speed up large table scans. The ppage_left column will display the number of pages still to scan for a given fragment of a table.

onstat -g ntd

INFORMIX-OnLine Version 7.12.UC2 -- On-Line -- Up 122 days 21:01:14 -- 72616
global network information:
  #netscb connects     read    write    q-free  q-limits  q-exceed alloc/max
   5/   8       50     2750    12000    1/   1  279/  10    0/   0    1/   1

Client Type     Calls   Accepted   Rejected       Read      Write
sqlexec         yes           50          0       2700      12000
srvinfx         yes            0          0          0          0
onspace         yes            0          0          0          0
onlog           yes            0          0          0          0
onparam         yes            0          0          0          0
oncheck         yes            0          0          0          0
onload          yes            0          0          0          0
onunload        yes            0          0          0          0
onmonitor       yes            0          0          0          0
dr_accept       yes            0          0          0          0
cdraccept       no             0          0          0          0
ontape          yes            0          0          0          0
srvstat         yes            0          0          0          0
asfecho         yes            0          0          0          0
listener        yes            0          0         50          0
crsamexec       yes            0          0          0          0
safe            yes            0          0          0          0
onutil          yes            0          0          0          0
Totals                        50          0       2750      12000

Monitor the number of accepted versus rejected connections. A large number of rejections means that either the user table has overflowed (check the ovuserthread column of onstat -p) or the network is timing out on the connection.
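
A rejection rate can be derived from the per-client rows. The sketch below assumes the six-column row layout of the sample (client type, calls, accepted, rejected, read, write) and uses three rows copied from it; the function names are hypothetical, and what counts as "a large number" is left to the reader.

```python
sample = """\
sqlexec         yes           50          0       2700      12000
listener        yes            0          0         50          0
cdraccept       no             0          0          0          0
"""


def connection_counts(text):
    """Sum the Accepted and Rejected columns over all client-type rows."""
    accepted = rejected = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # the Totals line and headers have a different shape
        accepted += int(fields[2])
        rejected += int(fields[3])
    return accepted, rejected


def rejection_rate(accepted, rejected):
    """Rejected connections as a percentage of all connection attempts."""
    attempts = accepted + rejected
    return 100.0 * rejected / attempts if attempts else 0.0


accepted, rejected = connection_counts(sample)
rate = rejection_rate(accepted, rejected)
print(f"accepted={accepted} rejected={rejected} rate={rate:.2f}%")
```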

onstat -g ppf

INFORMIX-OnLine Version 7.13.UC2 -- On-Line -- Up 3 days 15:51:58 -- 213816 Kbytes

Partition profiles
partnum    lkrqs lkwts dlks  touts isrd  iswrt isrwt isdel bfrd  bfwrt seqsc rhitratio
0x100001   0     0     0     0     0     0     0     0     0     0     0     0  
0x100002   195   0     0     0     49    0     0     0     152   0     3     95 
0x100003   366   0     0     0     170   0     0     0     432   0     0     94 
0x100004   77    0     0     0     34    0     0     0     84    0     0     89 
0x100005   12    0     0     0     4     0     0     0     12    0     0     75 
0x100006   17    0     0     0     7     0     0     0     20    0     0     65 
0x100007   17    0     0     0     7     0     0     0     17    0     0     83 
0x100008   45    0     0     0     21    0     0     0     48    0     0     82 
0x100009   69    0     0     0     31    0     0     0     97    0     3     93 
0x10000d   0     0     0     0     0     0     0     0     2     0     0     50 
0x100010   2     0     0     0     0     0     0     0     2     0     0     50 
0x100012   22    0     0     0     9     0     0     0     28    0     0     65 
0x100013   26    0     0     0     11    0     0     0     30    0     0     87 
0x100014   3     0     0     0     1     0     0     0     9     0     0     56 
0x100015   60    0     0     0     24    0     0     0     72    0     0     88 
0x100016   3     0     0     0     3     0     0     0     3     0     0     67 
0x100017   0     0     0     0     3     0     0     0     3     0     0     67 
0x100018   0     0     0     0     0     0     0     0     2     0     0     50 
0x10001a   0     0     0     0     0     0     0     0     2     0     0     50 

Monitor the number of read and write operations occurring on open fragments of tables using the isrd, iswrt, isrwt and isdel columns. Since only open tables are listed, sampling onstat -g ppf over time provides a good indication of which tables are most frequently used. Use the seqsc column to determine whether sequential scans are being used to read data from the table. Compare fragments of the same table to determine if I/O is balanced across the fragments.
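
To find the busiest partitions in one sample, the rows can be ranked by the sum of the four ISAM operation columns. This sketch assumes the column order shown in the sample header and uses four rows copied from it; the helper names are hypothetical.

```python
HEADER = ("partnum lkrqs lkwts dlks touts isrd iswrt isrwt isdel "
          "bfrd bfwrt seqsc rhitratio").split()

sample = """\
0x100002   195   0     0     0     49    0     0     0     152   0     3     95
0x100003   366   0     0     0     170   0     0     0     432   0     0     94
0x100004   77    0     0     0     34    0     0     0     84    0     0     89
0x100009   69    0     0     0     31    0     0     0     97    0     3     93
"""


def parse(text):
    """Turn each data row into a dict keyed by the column names."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(HEADER):
            continue
        row = {"partnum": fields[0]}
        row.update({name: int(v) for name, v in zip(HEADER[1:], fields[1:])})
        rows.append(row)
    return rows


def activity(row):
    """Total reads, writes, rewrites and deletes for one partition."""
    return row["isrd"] + row["iswrt"] + row["isrwt"] + row["isdel"]


rows = parse(sample)
busiest = sorted(rows, key=activity, reverse=True)
scanned = [r["partnum"] for r in rows if r["seqsc"] > 0]

for r in busiest:
    print(r["partnum"], activity(r))
print("partitions with sequential scans:", scanned)
```

Running the same ranking on samples taken at intervals, and comparing the totals for fragments of one table, gives the balance check described above.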

 



The site is maintained by the Informix user group in Ukraine.

Hosted by NO-more.