


Performance Tuning

The origin of this draft is unknown.

1. Introduction

1.1 What is performance?

This may seem like an obvious question, but it is not. Different people have different views of what they mean by 'performance'. Also, their view may change depending on circumstances. When discussing performance, the first thing to do is to ensure that all the parties involved agree as to what they are talking about.

1.2 Defining terms of reference

This follows from 1.1 above. When conducting any sort of performance exercise (tuning, investigation, benchmark) one needs to have a clear statement of objectives, metrics, constraints etc.

1.3 Performance investigations

When conducting a performance investigation one should be careful to investigate all areas and not just concentrate on one. For example, a poorly designed/implemented application can result in very poor performance no matter how well OnLine is tuned.

2. Applications

A very important area. A badly designed or implemented application can drag down the performance of the most highly tuned OnLine server.

2.1 Using ESQL/C

Properly used, ESQL/C is the most performant solution for writing Informix client applications. However, it is not necessarily the easiest to use. Most of the performance related issues that apply to ESQL/C apply to ESQL/COBOL, 4GL and NewEra as well.

2.2 SQL usage

  • Prepare frequently used statements
  • May need to open cursors 'WITH REOPTIMIZATION'
  • Use cursors or execute into
  • Multi-statement prepare saves client/server exchanges
  • Use insert cursors for bulk inserts
  • Investigate benefits of FET_BUF_SIZE (not for I-SQL or 4GL)
  • Use least restrictive isolation level that is appropriate
  • Avoid using ROWIDS, use an SQL primary key instead
  • Avoid unnecessary type conversions and beware of 'hidden' type conversions. E.g. in select col1, col2 from tab1 where col1 > 10, if col1 is of type char, the integer 10 will be converted to type char for each row comparison. This statement should be written select col1, col2 from tab1 where col1 > '10'
  • Arithmetic on and conversion to/from 'decimal' is expensive
  • Use UPDATE ... WHERE CURRENT OF where appropriate
  • Avoid frequent switches of database - use distributed queries or multiple connections via CONNECT statement
  • Let the server do the work (but keep it to a minimum)
  • Don't fetch more rows/columns than you need to
  • Only update changed columns; this can reduce lock contention (may avoid locking index entries)
  • Filter and join rows in the server, not the application
  • Sort and group rows in the server (but only when necessary)
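Several of the points above (prepared statements, insert cursors, minimising client/server exchanges) come together in bulk-insert code. A minimal ESQL/C sketch, with the table, columns and input routine purely illustrative:

```c
EXEC SQL BEGIN DECLARE SECTION;
    long cust_no;
    char cust_name[41];
EXEC SQL END DECLARE SECTION;

/* An insert cursor buffers rows client-side and ships them to the
   server in batches, greatly reducing client/server exchanges */
EXEC SQL DECLARE ins_curs CURSOR FOR
    INSERT INTO customer (cust_no, cust_name)
        VALUES (:cust_no, :cust_name);

EXEC SQL BEGIN WORK;
EXEC SQL OPEN ins_curs;

/* get_next_record() is a hypothetical input routine */
while (get_next_record(&cust_no, cust_name))
{
    EXEC SQL PUT ins_curs;      /* buffered, not sent row-by-row */
}

EXEC SQL FLUSH ins_curs;        /* send any remaining buffered rows */
EXEC SQL CLOSE ins_curs;
EXEC SQL COMMIT WORK;
```

Increasing FET_BUF_SIZE enlarges the buffer the insert cursor uses, so more rows travel per exchange.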

2.3 Data Design

  • It is sometimes necessary to de-normalise the data model to reduce joins in order to achieve the required performance, but this should be a 'last ditch' measure
  • Keep row lengths short, certainly less than one OnLine page; long rows split across pages and are bad for performance
  • VARCHARs are a two-edged sword: they can often significantly reduce storage space, but if an update increases the length of a VARCHAR value the row may be split across multiple pages, which hurts performance
  • Only use BLOBs if absolutely necessary; BLOB processing is less performant than for most other types

2.4 Indexing

Avoid highly duplicate indexes.

Highly duplicate indexes are inefficient and degrade performance. If you feel you need to index on a column containing highly duplicate values, try making the index a composite with a second column which contains (nearly) unique values. The server can still use this index for access through the duplicate column but the index will be much more efficiently processed.
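As a sketch (table and column names are invented for illustration), an index on a highly duplicate status column is better created as a composite with a nearly unique column:

```sql
-- Poor: status has only a handful of distinct values
-- CREATE INDEX ix_orders_status ON orders (status);

-- Better: the server can still use this index for filters on
-- status alone, but the index is processed far more efficiently
CREATE INDEX ix_orders_status ON orders (status, order_no);
```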

Choose indexes carefully

The objective of indexing is to facilitate rapid data access. Only create indexes that give a significant performance improvement to SQL operations. Use the Query Plan and testing to determine the best indexing strategy.

Don't over index

Too many indexes are as bad as not enough/incorrect indexes. Each index present on a table adds storage overhead. Indexes also add processing overhead to inserts, deletes and maybe updates on the table.

From V7.2: don't need ascending and descending indexes on the same columns

V7.2 can process an index in reverse so any index is effectively both ascending and descending

Use FILLFACTOR to allow room for inserts

Optimiser's choice of a suitable index is based on the order of index creation

E.g. consider a table with an index on colA and another index on colA, colB and colC. A filter on colA would use whichever index was created first, which may not be the best one.

Utilise index only reads/scans

Consider:

select cola, colb from tab1 where cola = 'ABC123';

If colb is small (e.g. an integer) and cola is indexed then it might be better to make the index a composite on cola and colb. This query could then be satisfied by just reading the index entry without accessing the data row at all.

This technique is only worthwhile for frequently executed queries.
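A sketch of the technique for the query above (assuming the existing index on cola can be replaced):

```sql
-- With both columns in the index, the query
--     select cola, colb from tab1 where cola = 'ABC123'
-- can be satisfied from index pages alone (a key-only read),
-- never touching the data rows
CREATE INDEX ix_tab1_cola_colb ON tab1 (cola, colb);
```

SET EXPLAIN ON can be used to confirm that the plan reports a key-only scan.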

2.5 Stored Procedures

Can reduce client/server traffic

Use SPs to reduce client server traffic by moving complex application logic to the server (of course this increases the server processing overhead).

Avoid small SPs

SPs are quite useful for enforcing application logic etc. but they can be a performance hit. Avoid small SPs with few statements.

Avoid nested or recursive SPs

SPs which call other SPs, which in turn call other SPs, are bad for performance.

SPs can considerably reduce OnLine's ability to parallelise queries

OnLine cannot currently parallelise most queries involving SPs, though it can parallelise the individual SQL statements within the SP.

SPs used in an expression in a WHERE clause will be evaluated for every row examined
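A minimal SPL sketch (procedure, table and column names are hypothetical) showing both sides of the trade-off:

```sql
CREATE PROCEDURE discount_price(price DECIMAL(10,2), pct INT)
    RETURNING DECIMAL(10,2);
    -- Keep the body small; arithmetic on DECIMAL is itself expensive
    RETURN price - price * pct / 100;
END PROCEDURE;

-- Avoid this pattern: the procedure is evaluated once for
-- every row examined
-- SELECT * FROM items WHERE discount_price(price, 10) > 100;
```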

2.6 Query Plan

The Optimiser Query Plan (produced by SET EXPLAIN ON) is very useful in looking at the efficiency of SQL statements. One can see how the effects of statistics, data distributions, indexes, fragmentation and SQL statement composition affect query performance.

Note that the writing of the query plan has a significant impact on performance so do NOT have SET EXPLAIN ON when doing timings or for live running.
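Typical usage when investigating a statement (table names illustrative); on Unix the plan is appended to the file sqexplain.out in the current directory:

```sql
SET EXPLAIN ON;

SELECT c.cust_name, o.order_no
    FROM customer c, orders o
    WHERE c.cust_no = o.cust_no;

SET EXPLAIN OFF;   -- always turn it off again before timing runs
```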

3. Loading data

3.1 Load methods:

High Performance Loader (Parallel load)

Very flexible, parallel load utility. Available as from OnLine 7.2. High performance.

Custom ESQL load programs

Can provide high performance, depending on sophistication and complexity.

Bulk Loader (BLOADER)

Earlier version of parallel load utility. Works with older versions of OnLine. Not officially supported. Good performance.

DBLOAD

Standard load utility. Loads from ASCII files. Adequate performance for smaller data volumes.

DBACCESS LOAD statement

An easy way to load delimited ASCII files.

3.2 FET_BUF_SIZE affects insert cursor buffer

Note that, contrary to the documentation, FET_BUF_SIZE affects the size of the buffer used for insert cursors as well as fetch cursors and is therefore very useful when loading data using ESQL/C, DBLOAD and DBACCESS (but not 4GL).

3.3 Turn off logging during loading

Loading data is faster without logging.

3.4 Add indexes, Referential Integrity and constraints after data is loaded

The presence of indexes, RI etc. can greatly degrade load performance.

4. OnLine

4.1 General points

4.2 Cooked vs. Raw disk space

Always use raw space, never use cooked.

By raw space we mean a raw disk partition accessed via a character mode special file (a 'c' in the first position of the output from ls -l). We do not mean:

  • Block mode special files associated with disk partitions (a 'b' in the first position of the output of ls -l)
  • Volume manager control devices
  • Anything else that isn't a proper raw disk partition

4.3 Update Statistics

This is very important. Ensure that your data distributions and statistics are up to date by running the appropriate UPDATE STATISTICS commands when the distribution of values in any relevant columns has changed significantly.

Run UPDATE STATISTICS HIGH for columns that head an index and use MEDIUM for all other columns. There is an ATG utility that will generate appropriate UPDATE STATISTICS statements.

Run the updates using multiple simultaneous jobs to improve overall throughput as internal parallelism is low for UPDATE STATISTICS processing.

You may need to increase the value of DBUPSPACE environment variable.
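Putting the above together for one table (the table and column names are illustrative):

```sql
-- HIGH distributions for columns that head an index
UPDATE STATISTICS HIGH FOR TABLE orders (cust_no);

-- MEDIUM for the remaining columns; DISTRIBUTIONS ONLY avoids
-- re-gathering the table and index statistics collected above
UPDATE STATISTICS MEDIUM FOR TABLE orders DISTRIBUTIONS ONLY;

-- Re-optimise stored procedures after statistics change
UPDATE STATISTICS FOR PROCEDURE;
```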

4.4 Read Ahead

Useful for non-parallel sequential scans (data only, index only and index/data).

Monitor via onstat -p

If the sum of ixda-RA, idx-RA and da-RA is close to RA-pgsused, the pages being read ahead are being used, so try increasing the read-ahead parameters. If RA-pgsused is much smaller than that sum, pages are being read ahead but not used, so decrease the parameters.

Generally keep RA_PAGES <= 32 and RA_THRESHOLD around half of RA_PAGES
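In ONCONFIG terms the guideline might look like this (starting values to be tuned against onstat -p, not a recommendation):

```
RA_PAGES       32   # pages fetched per read-ahead operation
RA_THRESHOLD   16   # unprocessed pages left when the next read ahead starts
```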

4.5 OLTP vs. DSS

OLTP characteristics:

  • Lots of users
  • High transaction rates
  • Transactions are relatively small in terms of data processed
  • Significant write activity (insert/update/delete)
  • Fast access via indexes
  • High buffer cache usage
  • Limited scope for parallelism

DSS characteristics:

  • Large amounts of data
  • Little write activity (except for temporary space and data loading)
  • Relatively few users
  • Large, complex queries processing lots of data
  • Access mainly via sequential scans
  • Lots of scope for parallelism

4.6 Disk Layout and Fragmentation

Choice of disk hardware:

  • High transfer rates and low seek times
  • Larger numbers of smaller disks better than fewer larger disks
  • Avoid overloading controllers

Separate OnLine disk space from other system activity

  • File systems
  • Swap space

Separate Physical and Logical logs

For any OnLine instance where there is a lot of write activity (insert, update and delete), separate the logical logs and physical logs onto different disks / controllers.

Try not to have anything else on the disks where the physical and logical logs are located

Alternate logical logs on different disks

Avoids contention between current log and log backup.

Minimise head movement

If possible, allocate raw partitions to minimise disk head movement.

Control table fragmentation

Try and prevent tables becoming fragmented into many extents. This is especially important with OnLine V5.0 and earlier.

Size tables accurately if possible

Assign appropriate initial and next extent sizes

Try and keep to <= 8 extents

Consider defragmenting badly fragmented tables

For V7.1 and later:

Extent sizes for detached indexes are derived from the sizes for the associated table.

For fragmented tables, extent sizes are applied per fragment.

Avoid too many chunks

Large numbers of chunks significantly increase checkpoint duration even if there are no dirty pages.

For V6.0 and earlier

Spread dbspaces across multiple disks using multiple chunks and/or O/S striping. Try to evenly spread the I/O load across multiple disks / controllers and put high activity tables in these dbspaces.

Use raw temporary space (DBSPACETEMP)

OnLine can use raw temporary space more efficiently than file system space.

For V7.1 and later

Use data fragmentation wisely

Don't fragment small tables

Don't fragment all tables

Identify the critical tables that are large and accessed most frequently.

Only put one fragment for a table on a disk.

Beware of fragmenting an index by round robin accidentally.

OLTP - fragment data by Round Robin, Detach (and maybe fragment) Indexes

Generally, for best OLTP performance, consider fragmenting large high usage tables by Round Robin in dbspaces on different disks (to spread I/O across multiple disks). Indexes on these tables should be explicitly created in a specific dbspace (on different disk to data fragments) and could optionally be fragmented by expression.
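A sketch of this OLTP layout, with all table, column and dbspace names invented for illustration:

```sql
-- Data spread across three disks by round robin
CREATE TABLE orders (
    order_no    SERIAL,
    cust_no     INTEGER,
    order_date  DATE
) FRAGMENT BY ROUND ROBIN IN dbspace1, dbspace2, dbspace3;

-- Index detached into its own dbspace on a different disk
CREATE INDEX ix_orders_cust ON orders (cust_no) IN dbspace4;
```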

DSS - fragment data by Round Robin or Expression. Few or no indexes.

Generally, for best DSS performance, consider how to best fragment large tables - either Round Robin or Expression. You will likely use different fragmentation for different tables. The objective is to balance overall I/O and also to allow the optimiser to eliminate fragment scans for the most frequently run queries.

Fragmenting by expression

Keep the expression(s) simple

Put the most restrictive expression for a given dbspace first, i.e. use x <= 10 and x >= 1 in dbspace1, NOT x >= 1 and x <= 10 in dbspace1

Put most frequently hit dbspaces before less frequently hit ones

Avoid expressions requiring type conversions

Don't fragment on columns whose values change frequently

Moving a row between fragments is additional overhead

Choose expression carefully

To balance I/O loading, not volume of data

To aid fragment elimination in queries

Avoid the REMAINDER IN clause
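The guidelines above combine into something like the following sketch (names and ranges are illustrative):

```sql
CREATE TABLE sales (
    region_no   INTEGER,
    amount      DECIMAL(12,2)
) FRAGMENT BY EXPRESSION
    -- simple expressions, most restrictive condition first,
    -- most frequently hit dbspace listed first, no REMAINDER clause
    region_no <= 10 AND region_no >= 1  IN dbspace1,
    region_no <= 20 AND region_no >= 11 IN dbspace2,
    region_no <= 30 AND region_no >= 21 IN dbspace3;
```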

Use multiple raw temporary spaces (DBSPACETEMP)

OnLine can use raw temporary space more efficiently than file system space.

OnLine will parallelise some operations involving temporary space if DBSPACETEMP lists multiple dbspaces.

If you use multiple temporary dbspaces, put them on separate disks to each other and other high activity dbspaces.

4.7 OPTCOMPIND

OLTP - set to 0

For OLTP, one normally wants to inhibit the use of hash joins as they are processor intensive.

DSS - set to 2

For DSS, hash joins can give good join performance.

4.8 LRUS

Minimum 4, otherwise = NUMCPUVPS

4.9 Page Cleaners

< 20 disks, 1 thread per disk

>= 20 & <= 100 disks, 1 thread per 2 disks

> 100 disks, 1 thread per 4 disks

4.10 LRU_MIN_DIRTY, LRU_MAX_DIRTY

Use these to control when page cleaning occurs. For OLTP you generally want cleaning to occur evenly rather than all at checkpoints.

4.11 PDQPRIORITY / MGM Parameters

DSS queries are controlled by PDQPRIORITY and MGM.

A DSS query is any query for which PDQPRIORITY is > 0.

DS_MAX_QUERIES

Specifies the maximum number of concurrent DSS queries.

DS_TOTAL_MEMORY

Specifies maximum memory allowed for all DSS operations. Allocated from the virtual portion.

DS_MAX_SCANS

Specifies the maximum number of concurrent DSS scans.

MAXPDQPRIORITY

Specifies the maximum allowed PDQPRIORITY.

PDQPRIORITY

Controls DSS query resource usage and parallelism.

Single DSS queries in isolation

For best performance of single DSS queries run in isolation (e.g. a single query benchmark) set PDQPRIORITY to 100 (and make sure MAXPDQPRIORITY is set to 100 in the ONCONFIG file).
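For example (a sketch; the session statement assumes a 7.x server):

```sql
-- Per session, instead of setting the PDQPRIORITY environment
-- variable in the client's shell:
SET PDQPRIORITY 100;

-- MAXPDQPRIORITY must also be 100 in the ONCONFIG file, as the
-- effective priority is scaled down by MAXPDQPRIORITY
```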

Multiple DSS queries

Choose suitable value for PDQPRIORITY for each query depending on what you want to achieve.

Mixed DSS and OLTP workload

This is tricky to tune for. The objective is normally to try and prevent the DSS queries degrading the OLTP performance too much.

OLTP only

Can either disable PDQ altogether (PDQPRIORITY / MAXPDQPRIORITY = 0) or set PDQPRIORITY / MAXPDQPRIORITY to 1 to allow parallel scan only (be very careful not to get throttled by DS_TOTAL_MEMORY, DS_MAX_SCANS, DS_MAX_QUERIES etc.)

4.12 Buffers

For OLTP, want large buffer pool and high cache hit rate

For typical OLTP workloads, best performance is achieved when most data access is to a page in the OnLine cache. To achieve this one needs a correctly sized cache. The efficiency of the cache is shown by onstat -p. This shows a %cached figure for both reads and writes. You want to get these as high as possible, within the constraints imposed by available memory. It is often possible to get read hit rates above 95% and write hit rates above 85%.

For DSS, a small buffer pool usually suffices

For typical DSS workloads, most access is via light sequential scans which do not go through the buffer pool. As a result there is no need for a large buffer pool. Instead the memory should be allocated to DS_TOTAL_MEMORY to make it available for scan buffers, hash joins etc.

4.13 Buffered vs. Unbuffered logging

Buffered logging more performant than unbuffered

With buffered logging, the logical log buffer is only flushed to disk when it becomes full. With unbuffered logging, every commit forces a log buffer flush. Note that as the logical log buffer is common to all databases, just one high activity database using unbuffered logging can significantly reduce the benefit of using buffered logging.

Buffered logging less 'safe' than unbuffered

Because the log buffer is only flushed when it is full, if there is an OnLine or system failure the contents of the log buffer will be lost. At the time of failure the buffer may well contain commit records that have not yet been written to disk; OnLine fast recovery will roll back those transactions. All databases will be self-consistent but may be inconsistent with the application's view of them, as transactions that the application thought were successfully committed have in fact been rolled back.

4.14 LOGBUFF

Very important when using buffered logging

Less important for unbuffered logging

Unless you have a lot of clients doing write activity or are using in-table BLOBS

Monitor with onstat -l

Tune to have pages/io around about 90%-95% of buffer size on average. Keep increasing until no improvement seen. For mainly unbuffered logging may not be able to get close to this figure.

Not too important for systems with little write activity

4.15 PHYSBUFF

Important for instances with a lot of write activity

Monitor with onstat -l

Tune so that pages/io are around 90%-95% of buffer size on average.

4.16 CKPTINTVL

During normal operations, two main events cause a checkpoint to occur: the checkpoint interval expiring or the physical log becoming 75% full.

The amount of work that occurs between checkpoints determines how long fast recovery will take after a system failure.

If recovery time is important for you, configure the checkpoint interval to ensure acceptable recovery time. Otherwise, you can make the interval high and allow the physical log 75% full condition to drive the bulk of your checkpoints.

4.17 KAIO vs. AIO

KAIO

Kernel AIO (KAIO) is not available in all ports / platforms. Where it is available it is usually the most efficient form of disk I/O but is only available when using raw disk space.

When using KAIO, you still need to configure one or two AIO VPs for explain output etc.

Informix AIO

In the absence of KAIO, the Informix AIO mechanism is quite efficient.

Start with 2 AIO VPs per disk that contains OnLine data (as opposed to logical/physical logs). Monitor AIO VP performance using onstat -g ioq and onstat -g iov. Add or remove AIO VPs as necessary. It does not hurt too much to have a couple of AIO VPs that are mostly idle.

4.18 Connectivity

Client <-> OnLine vs. OnLine <-> OnLine

Choose best communications mechanism

Local clients: Shared Memory or Stream Pipes

For clients running on the same machine as the OnLine server, choose shared memory (ipcshm) or stream pipes (ipcstr) connectivity. Where available, stream pipes is generally faster, more flexible and more secure than shared memory.

Avoid network connections (TCP/IP) for local clients as the performance is much reduced compared to ipcshm or ipcstr connections.

Remote clients: TCP/IP

Use TCP/IP for remote clients. Only use Netware IPX/SPX if there are specific reasons to do so.

Investigate effects of varying the fetch buffer size (FET_BUF_SIZE environment variable) and the socket buffer size (set in sqlhosts file)

Minimise client/server exchanges using the techniques described earlier.

Poll threads and NET VPs

Poll threads handle incoming data from clients

Data from server to client is sent by individual sqlexec threads

A single poll thread can handle about 200 typical clients

Varies depending on many factors, only a guideline

For lots of clients, very active clients, or large amounts of data input (e.g. lots of inserts), may need more poll threads

Poll threads can run in CPU VPs (inline poll) or NET VPs

Only one 'inline' protocol at any one time.

Inline Poll threads

Best performance with lower numbers of clients

Helps TCP/IP considerably

Imposes additional overhead on CPU VPs

Can't have more poll threads than CPU VPs

NET VPs for poll threads

For large numbers of clients NET VPs may offer better performance

Can have as many poll threads as you want

Offloads processing from CPU VPs

Avoid the Relay Module

Try and avoid using the Special Purpose Relay Module to connect pre 6.0 clients to >= 6.0 OnLine. IT IS A PERFORMANCE KILLER.

Move to native 6.x/7.x clients ASAP.

4.19 Dictionary, Distribution and Stored Procedure Cache

OnLine caches data distributions, data dictionary information and Stored Procedures

Individual caches, separate from main buffer cache

Distribution cache

Monitor with onstat -g dsc

Control size with DS_HASHSIZE (number of hash buckets) and DS_POOLSIZE (maximum number of entries)

Dictionary cache

Monitor with onstat -g dic

Control size with DD_HASHSIZE (number of hash buckets) and DD_HASHMAX (maximum entries per bucket)

Stored Procedure cache

Monitor with onstat -g prc

Control size with PC_HASHSIZE (number of hash buckets) and PC_POOLSIZE (maximum number of entries)

4.20 Monitoring Sessions

Sessions can be monitored using onstat and SMI

For problem sessions, investigate:

  • Number of threads
  • Disk I/O
  • Memory usage
  • SQL statements

4.21 Single CPU VP flag

Allows OnLine to eliminate some mutex operations

If you are only using one CPU VP and are sure you will not want to add more dynamically, be sure to set the SINGLE_CPU_VP flag on in the ONCONFIG file.

This allows OnLine to eliminate many mutex calls used for inter CPU VP synchronisation.

4.22 MULTIPROCESSOR flag

Optimises internal OnLine operations for MP systems

Set this flag correctly based on the number of processors in your system.

It allows OnLine to optimise internal operations such as latching and VP idling based on the type of machine (uniprocessor or multiprocessor).

Currently, other than performance, there is no problem if the flag is incorrectly set. This may not remain true in future releases.

4.23 Monitoring OnLine performance

onstat

Command line tool. Displays information about many different aspects of OnLine operation.

SMI

Similar information to onstat but via an SQL interface. Can develop custom monitoring tools.

Onperf / Xtree

Graphical (X/Motif) based performance tool available on some platforms. Can display query trees within OnLine server.

Oncheck

Check table fragmentation, index integrity etc.

5. O/S and Hardware

5.1 Always look at the whole picture

There is usually little point in just looking at OnLine's performance metrics. One needs to consider the system as a whole e.g. disk / controller utilisation, CPU utilisation, network utilisation etc.

5.2 O/S tools: sar, vmstat, iostat etc.

Unix vendors provide various system monitoring utilities. Some common command line utilities are sar, vmstat and iostat. Many vendors offer graphical utilities (e.g. GlancePlus on HP).

You must be very careful when using these utilities. They often report seemingly similar metrics, but the exact meaning of a particular metric can often differ radically between different Unix vendors, or even different Unix releases from the same vendor. Also, these utilities often suffer from subtle bugs that make some metrics unreliable.

5.3 Disks and controllers

Use available O/S monitoring tools to investigate disk I/O

Try to balance I/O evenly across all disks (not just OnLine disks) and controllers

RAID5 can have poor write performance unless there is a large hardware cache

Try to keep %busy below ~30%. If it reaches ~60% you will see degraded performance.

Do not be fooled into worrying too much about iowait time. This metric is difficult to interpret meaningfully and is mostly misleading.

Try to minimise head movement on each disk by careful positioning of raw partitions.

OS / hardware mirroring can often help performance

Generally helps performance more than OnLine mirroring

This does depend on the sophistication of the vendor's implementation.

5.4 Network

Different Unix vendors offer different levels of control over TCP/IP parameters. Some you can often affect are:

Number of Stream Buffers

More buffers may improve performance for heavy client / server loads.

Packet Size

For reliable networks where clients often send / receive large amounts of data, increasing the packet size may improve throughput.

For unreliable networks or situations where clients send / receive small amounts of data a smaller packet size may be better

The packet size should correlate to the socket buffer size specified in the sqlhosts file.

Connect Queue Length

This can affect the performance of connection requests when many requests arrive at once. Increasing this value can reduce the number of client connection retries.

5.5 CPU usage

CPU usage monitoring is an important aspect of tuning OnLine performance

High system or OnLine CPU usage is not necessarily indicative of a performance problem

You want as much of the available resource to be utilised as possible.

This only becomes a problem where processing is delayed due to a lack of resource.

You can monitor the overall CPU usage of OnLine via onstat -p

Individual VP CPU usage can be monitored via onstat -g glo

Overall system CPU usage can be monitored via sar, vmstat, etc.

What should you be looking for:

OnLine thread ready queue (onstat -g rea)

If this persistently shows a large number of threads waiting to run (say more than 10-15), you are short of CPU VP resource.

Lots of idle CPU (sar, vmstat etc.)

If the ready queue indicates a lack of CPU VP resource and there is spare system CPU capacity, try adding another CPU VP.

Generally, one should not have more CPU VPs than physical CPUs but this does not always hold true.

Don't add more CPU VPs if there is no spare processor capacity

Adding more CPU VPs to a system running near 100% CPU utilisation will only make matters worse.

Other processes using large amounts of CPU capacity

Maybe these should run on another machine or during off peak periods

#CPU VPs <= # physical CPUs

DSA design objective is that optimal performance will be obtained with the number of CPU VPs <= the number of physical CPUs.

There may be circumstances where adding CPU VPs over and above the number of physical CPUs may improve performance. If so, these are considered bugs and should be logged via support for investigation and resolution.

6. Conclusion

6.1 Define the problem

Clearly define objectives, metrics and criteria.

6.2 Tune the application

If the application is badly designed / implemented, nothing else matters much!

6.3 Tune OnLine

Optimise OnLine for your target workload.

6.4 Tune the system as a whole

Ensure system resources are adequate and correctly allocated.

 


The site is maintained by the Informix user group in Ukraine.
