The origin of this draft is unknown.
1.1 What is performance?
This may seem like an obvious question, but it is not. Different people have different views of what they mean by 'performance'. Also, their view may change depending on circumstances. When discussing performance, the first thing to do is to ensure that all the parties involved agree as to what they are talking about.
1.2 Defining terms of reference
This follows from 1.1 above. When conducting any sort of performance exercise (tuning, investigation, benchmark) one needs to have a clear statement of objectives, metrics, constraints etc.
1.3 Performance investigations
When conducting a performance investigation one should be careful to investigate all areas and not just concentrate on one. For example, a poorly designed/implemented application can result in very poor performance no matter how well OnLine is tuned.
A very important area. A badly designed or implemented application can drag down the performance of the most highly tuned OnLine server.
2.1 Using ESQL/C
Properly used, ESQL/C is the most performant solution for writing Informix client applications. However, it is not necessarily the easiest to use. Most of the performance-related issues that apply to ESQL/C apply to ESQL/COBOL, 4GL and NewEra as well.
2.2 SQL usage
2.3 Data Design
Avoid highly duplicate indexes.
Highly duplicate indexes are inefficient and degrade performance. If you feel you need to index on a column containing highly duplicate values, try making the index a composite with a second column which contains (nearly) unique values. The server can still use this index for access through the duplicate column but the index will be much more efficiently processed.
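A minimal sketch of the composite-index technique (table and column names are invented for illustration): suppose orders.status holds only a handful of distinct values while order_num is unique.

```
-- status has few distinct values; order_num is (nearly) unique.
-- The composite keeps key entries distinct, so the B-tree is
-- processed efficiently, yet filters on status alone can still use it:
CREATE INDEX ix_orders_status ON orders (status, order_num);

SELECT * FROM orders WHERE status = "P";
```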
Choose indexes carefully
The objective of indexing is to facilitate rapid data access. Only create indexes that give a significant performance improvement to SQL operations. Use the Query Plan and testing to determine the best indexing strategy.
Don't over index
Too many indexes are as bad as too few or incorrect indexes. Each index on a table adds storage overhead. Indexes also add processing overhead to inserts, deletes and possibly updates on the table.
From V7.2: no need for both ascending and descending indexes on the same columns
V7.2 can process an index in reverse, so any index is effectively both ascending and descending.
Use FILLFACTOR to allow room for inserts
Optimiser's choice of a suitable index is based on the order of index creation
For example, consider a table with an index on colA and another index on (colA, colB, colC). A filter on colA alone will use whichever of the two indexes was created first, which may not be the better one.
Utilise index only reads/scans
If a query selects colb filtered on cola, colb is small (e.g. an integer) and cola is indexed, it might be better to make the index a composite on (cola, colb). Such a query can then be satisfied by reading the index entry alone, without accessing the data row at all.
This technique is only worthwhile for frequently executed queries.
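A sketch of an index-only read, using a hypothetical table tab:

```
-- cola is the filter column, colb the one the query returns.
CREATE INDEX ix_tab_ab ON tab (cola, colb);

-- Every column the query touches is in the index, so the engine can
-- satisfy it with a key-only scan and never read the data pages:
SELECT colb FROM tab WHERE cola = 42;
```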
2.5 Stored Procedures
Can reduce client/server traffic
Use SPs to reduce client/server traffic by moving complex application logic to the server (of course this increases the server processing overhead).
Avoid small SPs
SPs are quite useful for enforcing application logic etc. but they can be a performance hit. Avoid small SPs with few statements.
Avoid nested or recursive SPs
SPs which call other SPs, which in turn call other SPs, are bad for performance.
SPs can considerably reduce OnLine's ability to parallelise queries
OnLine cannot currently parallelise most queries involving SPs, though it can parallelise the individual SQL statements within the SP.
SPs used in an expression in a WHERE clause will be evaluated for every row examined
2.6 Query Plan
The Optimiser Query Plan (produced by SET EXPLAIN ON) is very useful in looking at the efficiency of SQL statements. One can see how the effects of statistics, data distributions, indexes, fragmentation and SQL statement composition affect query performance.
Note that the writing of the query plan has a significant impact on performance so do NOT have SET EXPLAIN ON when doing timings or for live running.
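A minimal usage sketch (the join uses the stores demo tables purely as an illustration):

```
-- Plans for subsequent statements are appended to sqexplain.out
-- (in the current directory for local clients; for remote clients,
-- usually in your home directory on the server machine).
SET EXPLAIN ON;

SELECT c.customer_num, o.order_num
    FROM customer c, orders o
    WHERE c.customer_num = o.customer_num;

SET EXPLAIN OFF;  -- remember: OFF again before timing runs
```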
3. Loading data
3.1 Load methods:
High Performance Loader (Parallel load)
Very flexible, parallel load utility. Available as from OnLine 7.2. High performance.
Custom ESQL load programs
Can provide high performance, depending on sophistication and complexity.
Bulk Loader (BLOADER)
Earlier version of parallel load utility. Works with older versions of OnLine. Not officially supported. Good performance.
DBLOAD
Standard load utility. Loads from ASCII files. Adequate performance for smaller data volumes.
DBACCESS LOAD statement
An easy way to load delimited ASCII files.
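A sketch of the LOAD statement (file name and target table are illustrative):

```
-- Reads pipe-delimited ASCII rows from a file in the current
-- directory and inserts them into the named table:
LOAD FROM "customer.unl" DELIMITER "|"
    INSERT INTO customer;
```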
3.2 FET_BUF_SIZE affects insert cursor buffer
Note that, contrary to the documentation, FET_BUF_SIZE affects the size of the buffer used for insert cursors as well as fetch cursors and is therefore very useful when loading data using ESQL/C, DBLOAD and DBACCESS (but not 4GL).
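A sketch of setting the variable before starting a load (Bourne-style shell; 32767 as the maximum for 7.x clients is an assumption worth checking against your release notes):

```shell
# FET_BUF_SIZE is read from the environment by the client-side library.
FET_BUF_SIZE=32767
export FET_BUF_SIZE

# Any load program started from this shell (e.g. dbaccess running a
# LOAD statement, or a custom ESQL/C loader) inherits the setting.
echo "FET_BUF_SIZE=$FET_BUF_SIZE"
```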
3.3 Turn off logging during loading
Loading data is faster without logging.
3.4 Add indexes, Referential Integrity and constraints after data is loaded
The presence of indexes, RI etc. can greatly degrade load performance.
4.1 General points
4.2 Cooked vs. Raw disk space
Always use raw space, never use cooked.
By raw space we mean a raw disk partition accessed via a character mode special file (a 'c' in the first position of the output from ls -l). We do not mean:
Block mode special files associated with disk partitions (a 'b' in the first position of the output of ls -l)
Volume manager control devices
Anything else that isn't a proper raw disk partition
4.3 Update Statistics
This is very important. Ensure that your data distributions and statistics are up to date by running the appropriate UPDATE STATISTICS commands when the distribution of values in any relevant columns has changed significantly.
Run UPDATE STATISTICS HIGH for columns that head an index and use MEDIUM for all other columns. There is an ATG utility that will generate appropriate UPDATE STATISTICS statements.
Run the updates using multiple simultaneous jobs to improve overall throughput as internal parallelism is low for UPDATE STATISTICS processing.
You may need to increase the value of DBUPSPACE environment variable.
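A sketch of the recommended pattern (table and column names from the stores demo schema, used purely as an illustration):

```
-- HIGH distributions for a column that heads an index:
UPDATE STATISTICS HIGH FOR TABLE orders (customer_num);

-- MEDIUM distributions for the remaining columns; DISTRIBUTIONS ONLY
-- skips re-gathering the index information the HIGH run produced:
UPDATE STATISTICS MEDIUM FOR TABLE orders DISTRIBUTIONS ONLY;
```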
4.4 Read Ahead
Useful for non-parallel sequential scans (data only, index only and index/data).
Monitor via onstat -p
If sum of ixda-RA, idx-RA, and da-RA is close to RA-pgsused then try increasing read ahead parameters. If the sum is much smaller than RA-pgsused, decrease the parameters.
Generally keep RA_PAGES <= 32 and RA_THRESHOLD around half of RA_PAGES
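Following the guidance above, an ONCONFIG fragment might look like:

```
# Read ahead in batches of 32 pages; request the next batch when
# 16 unprocessed read-ahead pages remain (half of RA_PAGES):
RA_PAGES      32
RA_THRESHOLD  16
```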
4.5 OLTP vs. DSS
4.6 Disk Layout and Fragmentation
Choice of disk hardware:
Separate OnLine disk space from other system activity
Separate Physical and Logical logs
For any OnLine instance where there is a lot of write activity (insert, update and delete), separate the logical logs and physical logs onto different disks / controllers.
Try not to have anything else on the disks where the physical and logical logs are located
Alternate logical logs on different disks
Avoids contention between current log and log backup.
Minimise head movement
If possible, allocate raw partitions to minimise disk head movement.
Control table fragmentation
Try to prevent tables from becoming fragmented into many extents. This is especially important with OnLine V5.0 and earlier.
For V5.0 and earlier
For V7.1 and later
OLTP - set to 0
For OLTP, one normally wants to inhibit the use of hash joins as they are processor intensive.
DSS - set to 2
For DSS, hash joins can give good join performance.
Minimum 4, otherwise = NUMCPUVPS
4.9 Page Cleaners
< 20 disks, 1 thread per disk
>= 20 & <= 100 disks, 1 thread per 2 disks
> 100 disks, 1 thread per 4 disks
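Applying the bands above, a hypothetical instance with 40 data disks would configure:

```
# ONCONFIG fragment: 40 disks falls in the "1 thread per 2 disks" band
CLEANERS  20
```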
4.10 LRU_MIN_DIRTY, LRU_MAX_DIRTY
Use these to control when page cleaning occurs. For OLTP you generally want cleaning to occur evenly, rather than all at checkpoints.
4.11 PDQPRIORITY / MGM Parameters
DSS queries are controlled by PDQPRIORITY and MGM.
A DSS query is any query for which PDQPRIORITY is > 0.
DS_MAX_QUERIES: specifies the maximum number of concurrent DSS queries.
DS_TOTAL_MEMORY: specifies the maximum memory allowed for all DSS operations. Allocated from the virtual portion.
DS_MAX_SCANS: specifies the maximum number of concurrent DSS scans.
MAXPDQPRIORITY: specifies the maximum allowed PDQPRIORITY.
Controls DSS query resource usage and parallelism.
Single DSS queries in isolation
For best performance of single DSS queries run in isolation (e.g. a single-query benchmark), set PDQPRIORITY to 100. (Also make sure MAXPDQPRIORITY is set to 100 in the ONCONFIG file.)
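A session-level sketch (the table name is a stand-in for your benchmark query):

```
-- Session-level override; MAXPDQPRIORITY in the ONCONFIG file caps
-- the value actually granted.
SET PDQPRIORITY 100;

SELECT COUNT(*) FROM lineitem;   -- illustrative DSS query

SET PDQPRIORITY 0;               -- revert before any OLTP work
```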
Multiple DSS queries
Choose suitable value for PDQPRIORITY for each query depending on what you want to achieve.
Mixed DSS and OLTP workload
This is tricky to tune for. The objective is normally to prevent the DSS queries from degrading OLTP performance too much.
You can either disable PDQ altogether (PDQPRIORITY / MAXPDQPRIORITY = 0) or set PDQPRIORITY / MAXPDQPRIORITY to 1 to allow parallel scans only (be very careful not to get throttled by DS_TOTAL_MEMORY, DS_MAX_SCANS, DS_MAX_QUERIES, etc.).
4.12 Buffer Pool
For OLTP, want large buffer pool and high cache hit rate
For typical OLTP workloads, best performance is achieved when most data access is to a page in the OnLine cache. To achieve this one needs a correctly sized cache. The efficiency of the cache is shown by onstat -p. This shows a %cached figure for both reads and writes. You want to get these as high as possible, within the constraints imposed by available memory. It is often possible to get read hit rates above 95% and write hit rates above 85%.
For DSS, a small buffer pool usually suffices
For typical DSS workloads, most access is via light sequential scans which do not go through the buffer pool. As a result there is no need for a large buffer pool. Instead the memory should be allocated to DS_TOTAL_MEMORY to make it available for scan buffers, hash joins etc.
4.13 Buffered vs. Unbuffered logging
Buffered logging more performant than unbuffered
With buffered logging, the logical log buffer is only flushed to disk when it becomes full. With unbuffered logging, every commit forces a log buffer flush. Note that as the logical log buffer is common to all databases, just one high activity database using unbuffered logging can significantly reduce the benefit of using buffered logging.
Buffered logging less 'safe' than unbuffered
Because the log buffer is only flushed when it is full, its contents will be lost if there is an OnLine or system failure. At the time of failure the buffer may well contain commit records that have not yet been written to disk; OnLine fast recovery will roll those transactions back. All databases will be self-consistent, but they may be inconsistent with the application's view of them, since transactions the application believed were successfully committed have in fact been rolled back.
4.14 Logical Log Buffer
Very important when using buffered logging
Less important for unbuffered logging
Unless you have a lot of clients doing write activity or are using in-table BLOBS
Monitor with onstat -l
Tune to have pages/io at around 90%-95% of buffer size on average. Keep increasing until no improvement is seen. With mainly unbuffered logging you may not be able to get close to this figure.
4.15 Physical Log Buffer
Not too important for systems with little write activity
Important for instances with a lot of write activity
Monitor with onstat -l
Tune so that pages/io are around 90%-95% of buffer size on average.
4.16 Checkpoints
During normal operations, two main events cause a checkpoint to occur: the checkpoint interval expiring or the physical log becoming 75% full.
The amount of work that occurs between checkpoints determines how long fast recovery will take after a system failure.
If recovery time is important for you, configure the checkpoint interval to ensure acceptable recovery time. Otherwise, you can make the interval high and allow the physical log 75% full condition to drive the bulk of your checkpoints.
4.17 KAIO vs. AIO
Kernel AIO (KAIO) is not available in all ports / platforms. Where it is available it is usually the most efficient form of disk I/O but is only available when using raw disk space.
When using KAIO, you still need to configure one or two AIO VPs for explain output etc.
In the absence of KAIO, the Informix AIO mechanism is quite efficient.
Start with 2 AIO VPs per disk that contains OnLine data (as opposed to logical/physical logs). Monitor AIO VP performance using onstat -g ioq and onstat -g iov. Add or remove AIO VPs as necessary. It does not hurt too much to have a couple of AIO VPs that are mostly idle.
4.18 Communications
Client <-> OnLine vs. OnLine <-> OnLine
Choose best communications mechanism
Local clients: Shared Memory or Stream Pipes
For clients running on the same machine as the OnLine server, choose shared memory (ipcshm) or stream pipes (ipcstr) connectivity. Where available, stream pipes are generally faster, more flexible and more secure than shared memory.
Avoid network connections (TCP/IP) for local clients as the performance is much reduced compared to ipcshm or ipcstr connections.
Remote clients: TCP/IP
Use TCP/IP for remote clients. Only use Netware IPX/SPX if there are specific reasons to do so.
Investigate effects of varying the fetch buffer size (FET_BUF_SIZE environment variable) and the socket buffer size (set in sqlhosts file)
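A sqlhosts sketch covering the choices above (server, host and service names are invented; the b= buffer-size option is an assumption to verify against your sqlhosts documentation):

```
# dbservername  nettype    hostname  servicename  options
# local clients: shared memory and stream pipes
demo_shm        onipcshm   myhost    demo_shm
demo_str        onipcstr   myhost    demo_str
# remote clients: TCP/IP sockets, with an enlarged buffer (b= option)
demo_tcp        onsoctcp   myhost    demo_svc     b=32767
```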
Minimise client/server exchanges using the techniques described earlier.
Poll threads and NET VPs
Poll threads handle incoming data from clients
Data from server to client is sent by individual sqlexec threads
A single poll thread can handle about 200 typical clients
Varies depending on many factors, only a guideline
For lots of clients, very active clients, or large amounts of data input (e.g. lots of inserts), may need more poll threads
Poll threads can run in CPU VPs (inline poll) or NET VPs
Only one 'inline' protocol at any one time.
Inline Poll threads
Best performance with lower numbers of clients
Helps TCP/IP considerably
Imposes additional overhead on CPU VPs
Can't have more poll threads than CPU VPs
NET VPs for poll threads
For large numbers of clients NET VPs may offer better performance
Can have as many poll threads as you want
Offloads processing from CPU VPs
Avoid the Relay Module
Try to avoid using the Special Purpose Relay Module to connect pre-6.0 clients to a 6.0 or later OnLine. IT IS A PERFORMANCE KILLER.
Move to native 6.x/7.x clients ASAP.
4.19 Dictionary, Distribution and Stored Procedure Cache
OnLine caches data distributions, data dictionary information and Stored Procedures
Individual caches, separate from main buffer cache
Data distribution cache
Monitor with onstat -g dsc
Control size with DS_HASHSIZE (number of hash buckets) and DS_POOLSIZE (maximum number of entries)
Data dictionary cache
Monitor with onstat -g dic
Control size with DD_HASHSIZE (number of hash buckets) and DD_HASHMAX (maximum entries per bucket)
Stored Procedure cache
Monitor with onstat -g prc
Control size with PC_HASHSIZE (number of hash buckets) and PC_POOLSIZE (maximum number of entries)
4.20 Monitoring Sessions
Sessions can be monitored using onstat and SMI
For problem sessions, investigate:
4.21 Single CPU VP flag
Allows OnLine to eliminate some mutex operations
If you are only using one CPU VP and are sure you will not want to add more dynamically, be sure to set the SINGLE_CPU_VP flag on in the ONCONFIG file.
This allows OnLine to eliminate many mutex calls used for inter CPU VP synchronisation.
4.22 MULTIPROCESSOR flag
Optimises internal OnLine operations for MP systems
Set this flag correctly based on the number of processors in your system.
It allows OnLine to optimise internal operations such as latching and VP idling based on the type of machine (uniprocessor or multiprocessor).
Currently, other than performance, there is no problem if the flag is incorrectly set. This may not remain true in future releases.
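A minimal ONCONFIG sketch for the two cases described above:

```
# Uniprocessor machine running exactly one CPU VP, never added to
# dynamically:
MULTIPROCESSOR  0
SINGLE_CPU_VP   1

# Multiprocessor machine running several CPU VPs:
# MULTIPROCESSOR  1
# SINGLE_CPU_VP   0
```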
4.23 Monitoring OnLine performance
onstat
Command line tool. Displays information about many different aspects of OnLine operation.
SMI (sysmaster database)
Similar information to onstat but via an SQL interface. Can be used to develop custom monitoring tools.
Onperf / Xtree
Graphical (X/Motif) based performance tool available on some platforms. Can display query trees within OnLine server.
oncheck
Check table fragmentation, index integrity etc.
5. O/S and Hardware
5.1 Always look at the whole picture
There is usually little point in just looking at OnLine's performance metrics. One needs to consider the system as a whole e.g. disk / controller utilisation, CPU utilisation, network utilisation etc.
5.2 O/S tools: sar, vmstat, iostat etc.
Unix vendors provide various system monitoring utilities. Some common command line utilities are sar, vmstat and iostat. Many vendors offer graphical utilities (e.g. GlancePlus on HP).
You must be very careful when using these utilities. They often report seemingly similar metrics, but the exact meaning of a particular metric can often differ radically between different Unix vendors, or even different Unix releases from the same vendor. Also, these utilities often suffer from subtle bugs that make some metrics unreliable.
5.3 Disks and controllers
Use available O/S monitoring tools to investigate disk I/O
Try to balance I/O evenly across all disks (not just OnLine disks) and controllers
RAID5 can have poor write performance unless there is a large hardware cache
Try to keep %busy below ~30%. If it reaches ~60% you will see degraded performance.
Do not be fooled into worrying too much about iowait time. This metric is difficult to interpret meaningfully and is mostly misleading.
Try to minimise head movement on each disk by careful positioning of raw partitions.
OS / hardware mirroring can often help performance
Generally helps performance more than OnLine mirroring
Does depend on the sophistication of the vendor's implementation.
Different Unix vendors offer different levels of control over TCP/IP parameters. Some you can often affect are:
Number of Stream Buffers
More buffers may improve performance for heavy client / server loads.
Packet Size
For reliable networks where clients often send / receive large amounts of data, increasing the packet size may improve throughput.
For unreliable networks, or situations where clients send / receive small amounts of data, a smaller packet size may be better.
The packet size should correlate to the socket buffer size specified in the sqlhosts file.
Connect Queue Length
This can affect the performance of connection requests when many requests arrive at once. Increasing this value can reduce the number of client connection retries.
5.6 CPU usage
CPU usage monitoring is an important aspect of tuning OnLine performance
High system or OnLine CPU usage is not necessarily indicative of a performance problem
You want as much of the available resource to be utilised as possible.
This only becomes a problem where processing is delayed due to a lack of resource.
You can monitor the overall CPU usage of OnLine via onstat -p
Individual VP CPU usage can be monitored via onstat -g glo
Overall system CPU usage can be monitored via sar, vmstat, etc.
What should you be looking for:
6.1 Define the problem
Clearly define objectives, metrics and criteria.
6.2 Tune the application
If the application is badly designed / implemented, nothing else matters much!
6.3 Tune OnLine
Optimise OnLine for your target workload.
6.4 Tune the system as a whole
Ensure system resources are adequate and correctly allocated.