1. What is SQL optimization?
SQL optimization involves identifying, analyzing, and optimizing the SQL statements that cause the highest I/O load at the database level. These statements are also called "processing-intensive SQL statements".
Due to the complexity of the topic, this note can only offer an outline description of an analysis. SAP does, however, offer a number of training courses and services if you wish to find out more information (UMEW60 - Empowering Workshop SAP DB/MaxDB Performance Monitoring and Optimization, WB550 - MaxDB Internals): www.service.sap.com/servicecat -> Support Services -> SAP Solution Management Optimization -> SAP Business Process Performance Optimization, or -> SAP Remote Performance Optimization.
2. How important is SQL optimization?
Regular analysis and optimization of expensive SQL statements is the single most important basis for high-performance system operation. Resource-intensive SQL statements are directly responsible for increased I/O and CPU activity and for a poor data cache hit ratio.
3. Which criteria are used to determine the load of an SQL statement?
The main criterion for the load generated by an SQL statement is the number of reads required to return its result.
We distinguish between the following:
-> Number of disk accesses (or physical reads):
Number of blocks that were not found in the data cache in memory and therefore had to be read from disk
=> Measure of the I/O load of an SQL statement
-> Number of data cache accesses (or Logical Reads):
Number of blocks that were read from the data cache in memory
The following criteria are also taken into account, although they usually have little influence on database performance:
-> Number of qualified rows:
Number of table entries returned by an SQL statement
=> Measure of the network load of an SQL statement
As part of SQL optimization, the SQL statements with the most reads (physical reads, logical reads), the most processed rows (ratio of rows read to qualified rows), and the longest runtimes are examined.
From a performance point of view, it does not matter whether a statement was executed frequently and created only a small load each time, or whether it was executed once and triggered a large load. Only the total load caused by an SQL statement is significant, not the load of each individual execution.
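For illustration, a small calculation with purely hypothetical figures: a statement that is executed 10,000 times with 20 logical reads per execution causes 200,000 logical reads in total and is therefore just as relevant for the analysis as a statement that is executed once with 200,000 logical reads.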
In the R/3 System environment, SQL statements are parsed with bind variables (:A0, :A1, and so on), which can be filled with different concrete values at execution time. In contrast to other database systems, SQL statements that differ only in their values are nevertheless regarded as different statements.
The same SQL statement executed with different values can therefore also have different run schedules. Such commands are structurally identical but are listed individually in the Command Monitor with their respective values.
4. Where can I find information on the SQL statements executed?
The central source of information for SQL statements is transaction DB50 -> Problem Analysis -> Performance -> SQL Performance
The Command Monitor and the Resource Monitor, in particular, are the tools used to analyze long-running SQL statements.
a) Command Monitor (Diagnosis Monitor)
The Command Monitor is designed for short-term analysis and should only be activated during an analysis. If the Command Monitor is not activated, no data is available for analysis. For an initial analysis, you can start the Command Monitor with the default values using 'Change monitor settings':
Number of page accesses: >= 1,000
Runtime of SQL statement: >= 1,000 ms
Selectivity: <= 10%
Save parameter values: X
Number of monitoring entries: 3,000
Once the Command Monitor has been activated, all commands that meet at least one of the recording criteria are recorded in the monitor. If the maximum number of monitoring entries is reached, the entries are overwritten cyclically, and the oldest commands are removed from the Command Monitor.
If the system to be analyzed has severe performance problems, the default thresholds for the runtime (for example, to greater than or equal to 10,000 ms) and the number of page accesses (for example, to greater than or equal to 10,000) should be increased so that the critical SQL statements are not removed from the Command Monitor too quickly. For this type of analysis, the selectivity should be increased accordingly, or not specified at all.
Please also see Note 994590 concerning this topic.
b) Resource Monitor (analyze diagnosis)
The Resource Monitor must also be activated explicitly in DB50 by selecting 'Start monitor recording'.
The Resource Monitor is a monitoring tool that can be used to analyze the load of, for example, an entire workday. You can use a resource consumption analysis to identify the most processing-intensive SQL statements. The resources consumed by an SQL statement (for example, runtime and I/O accesses) are measured.
If an SQL statement is executed several times, the total costs are accumulated. Identical commands are stored only once across all sessions.
This makes it possible to recognize SQL statements that have a relatively short runtime per execution but create a considerable database load due to their number of executions. The SQL statements to be displayed can be restricted using additional criteria under 'Limit display'.
5. How do I determine the most processing-intensive SQL statements?
You can use the Resource Monitor to determine the SQL statements that consume the most system resources. Sort the output list by the runtime of an SQL statement, the number of disk I/O operations, or the number of rows read to determine the most expensive SQL statements.
Use the Command Monitor for a detailed analysis.
6. How do I get information on an SQL statement that is currently running?
If Shared SQL is deactivated (the default setting for all MaxDB versions up to 7.6), the Command Monitor must have been activated beforehand for a subsequent analysis.
If SM50/SM66 shows that a work process has been accessing a table for a long time, you can use the client PID in transaction DB50 -> Kernel Threads -> Task Manager -> Active Tasks (column 'Application PID') to determine the corresponding MaxDB task. Using the MaxDB system table 'running_commands', you can then determine the command currently being executed in that session:
select SQL_STATEMENT from running_commands where TASKID = <DB-Task>
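For example, if the Task Manager shows that the work process corresponds to DB task 57 (the task ID is purely an example), the call might look as follows:

select SQL_STATEMENT from running_commands where TASKID = 57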
Menu-driven support for this analysis is not currently available in transaction DB50.
7. Why does the SQL statement at the database level look different to the one in the ABAP source code?
The Open SQL statements executed in the R/3 System are transferred to the database via the database interface (DBI). In several cases, the DBI modifies the SQL statement before it is transferred to the database:
If a column is compared with an empty variable in the WHERE part, the DBI ignores this condition.
If you are using FOR ALL ENTRIES, a long list of values is broken down into several statements with shorter IN lists or OR links, depending on the following DBI parameters (see the sketch below):
-> rsdb/prefer_in_itab_opt:
If the parameter is set to 0 (default setting for R/3 Basis releases up to and including 4.6D), OR links are always used.
If the value is 1 (default setting as of R/3 Basis 6.10), IN lists are used where possible. If, however, the FOR ALL ENTRIES list is referenced several times in the SQL statement, OR links must still be used.
-> The blocking factor parameters, which determine the number of OR operations and the length of the IN lists (if rsdb/prefer_in_itab_opt = 1).
Refer to Notes 48230 and 48244 for more information. Note also that RSPARAM always displays -1 if default values are used. The value used is displayed in the work process traces dev_w* (transaction ST11).
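A rough sketch of the resulting statements (table and field names are only examples; the exact form is determined by the DBI): a FOR ALL ENTRIES selection on a table ZTAB by document number might arrive at the database either with an IN list, as in

SELECT * FROM ZTAB WHERE MANDT = :A0 AND BELNR IN (:A1, :A2, :A3, :A4, :A5)

or with OR links, as in

SELECT * FROM ZTAB WHERE (MANDT = :A0 AND BELNR = :A1) OR (MANDT = :A0 AND BELNR = :A2) OR (MANDT = :A0 AND BELNR = :A3)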
Accesses to R/3 pool and cluster tables (which do not exist as independent tables at the MaxDB level) are converted into accesses to the corresponding table pools or table clusters.
If tables are buffered completely or generically in the R/3 System, the buffers are loaded subsequently, if necessary, using special DBI statements (for example, "SELECT * FROM <table> ORDER BY <primary_key_fields>" for completely buffered tables) that can differ from the statement in the ABAP source code.
For ABAP statements with "SELECT SINGLE" or "UP TO n ROWS" (or ROWNO), the row restriction is not transferred to the database if DISTINCT, aggregate functions, GROUP BY, or ORDER BY are used in the statement. This rule is valid for all databases. For "UP TO n ROWS", the hint FIRST_ROWS is transferred to the database for some databases, but not for MaxDB.
Some operations (for example, kernel procedures, bulk operations, and generations) can generate SQL statements even though no corresponding SQL statement is visible in the ABAP source code.
8. How many processing-intensive SQL statements should be taken into account in SQL optimization?
There is no single, general answer to this question. The answer depends primarily on the optimization potential in each case and on the current system status. For example, in one case, optimizing a single, extremely processing-intensive SQL statement may be enough to make a system that was previously barely usable run well again, whereas in another case more than 50 SQL statements may have to be optimized to achieve the desired performance improvement.
Indicators regarding whether a system is already well tuned or not are outlined in Note 819641.
9. What are selection and join conditions?
In a selection condition, a column is compared with one or several actual values (for example, "MANDT = :A0", "BDAT > '20050821'"). In a join condition, two column values from different tables are compared (for example, "T_00.MANDT = T_01.MANDT").
Selection and join conditions are specified in the WHERE part of the SQL statement and, in the case of views, also in the view definition in transaction SE11. These view conditions must not be overlooked when SQL statements are analyzed.
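The following sketch shows both types of condition in one statement (the tables VBAK and VBAP serve only as an example):

SELECT * FROM VBAK T_00, VBAP T_01
 WHERE T_00.MANDT = :A0               <- selection condition
   AND T_00.MANDT = T_01.MANDT        <- join condition
   AND T_00.VBELN = T_01.VBELN        <- join condition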
For more information on this, see the MaxDB documentation (Note 767598) in the SQL Optimizer handbook.
10. Which conditions are evaluated by the MaxDB Optimizer?
The Optimizer can use the following search conditions to determine the optimal search strategy:
-> Equality conditions: Column = value
-> Range conditions: Column <, <=, >=, > value; Column BETWEEN value AND value
-> IN conditions: Column IN (value, value, ...)
-> LIKE conditions: Column LIKE string value (including %, ?, ...)
-> Subquery conditions: Column = (ANY) <subquery>; Column IN <subquery>
-> Join conditions (join table optimizer): Table1_column = Table2_column
The search conditions are listed here in the order of their rating. That is, under otherwise identical conditions, an equality condition on a column is always rated better than an IN condition on the same column.
The SQL Optimizer may also transform conditions. If, for example, the same single value is specified repeatedly in an IN condition, the optimizer converts this condition into an equality condition (see the sketch below).
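A sketch of such a transformation (column name and values are only examples):

WHERE STATUS IN ('A', 'A', 'A')   is evaluated like   WHERE STATUS = 'A'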
11. What is a MaxDB primary key?
The MaxDB primary key is a UNIQUE index that, unlike a secondary key (index), is implemented directly in the data tree; that is, no separate tree is created for the primary key. The primary key is used as a separator in the B* tree. The records of the table are stored sorted in key sequence.
12. What is a MaxDB secondary key (index)?
A separate B* tree is created for a secondary key (or index).
The secondary key (index) contains no physical addresses pointing to the data tree; instead, it contains logical addresses in the form of primary keys.
13. What is a run schedule?
A run schedule (also called explain or access path) shows how MaxDB accesses the requested data (index access, table scan, key range, key equal, index equal, and so on). It displays the strategy the Optimizer selects to execute a specific SQL statement. Run schedules are used to analyze long-running SQL statements. A run schedule can be displayed only for SELECT statements; other SQL statements must be reformulated to display a run schedule. For example, an UPDATE statement can be converted into a SELECT ... FOR REUSE.
14. How can I display the run schedule for an SQL statement?
In the Command Monitor, in the overview of SQL statements, you can double-click the SQL statement to be analyzed to go to the detail display.
Choose 'Display run schedule of SQL statement' to display the run schedule.
The run schedule of an SQL statement can also be determined in the SQLStudio: select the function 'Replace placeholders in the SQL statement' and then execute the SQL statement with the keyword EXPLAIN or EXPLAIN VIEW.
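A minimal sketch of such an EXPLAIN call (table, column, and value names are only examples):

EXPLAIN SELECT * FROM ZORDERS WHERE MANDT = '100' AND ORDERNO = '0000004711'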
An explain can also be executed from the SQL trace (transaction ST05).
The displayed run schedule is always determined using the indexes and statistics values currently available in the system. If these have changed since the trace was created, the displayed run schedule does not necessarily show the strategy that was used when the trace was created.
15. What are the characteristic components of a run schedule?
The explain displays a block for each table from the SELECT FROM list.
The sequence of strategies reflects the execution sequence.
The sequence of blocks reflects the execution sequence of the join.
COPIED / NOT COPIED indicates whether or not a result set is built.
COSTS is an estimate of the number of pages that must be read (and, if intermediate results are built, written) to return the result.
An overview of the optimizer strategies is available in the MaxDB documentation under Concepts -> Performance -> SQL Optimizer. Note 767598 describes where to find the MaxDB documentation.
16. What is an optimal run schedule?
An optimal run schedule is one that requires the minimum number of pages to be read to return the result.
17. What role do update statistics play?
To determine the optimal access path, the Optimizer requires statistical information. If these statistics are not updated regularly, incorrect strategy decisions may be made, particularly in the join environment, because the statistics no longer reflect the current data distribution.
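A manual statistics update for a single table can be sketched as follows (the table name ZORDERS is only an example; in SAP systems, statistics runs are normally scheduled using the DBA Planning Calendar):

UPDATE STATISTICS ZORDERS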
The one-table optimizer does not require any statistical information as the access strategy is determined here using an evaluation mechanism.
For more information, see Notes 927882 and 808060. For BW systems, additionally refer to Note 797667.
18. Which utilities are available for further analyzing a processing-intensive SQL statement?
In addition to the performance analysis tools already mentioned (Command Monitor and Resource Monitor), the metadata of the tables involved in an SQL statement is also required. Use transaction DB50 -> Problem Analysis -> Tables/Views/Synonyms.
This function provides the following information:
-> when the last update statistics was executed,
-> which estimated value is set for the update statistics,
-> what the optimizer statistics look like,
-> which indexes exist for the table and whether they are active,
-> how the primary key is defined, and
-> the table definition itself.
All this information is used in an SQL analysis.
19. How can I determine what report was used to start the SQL statement?
In SAP releases up to 6.40, the prerequisite for determining the report that executed a processing-intensive SQL statement is that the SAP profile parameter dbs/ada/register_appl_info is set. Set this parameter in the instance profile. For more information, see Note 216208.
If the setting was made in accordance with Note 216208, the list layout in the Command Monitor can be changed so that the calling ABAP program is also displayed. You can then go directly from the Command Monitor to the call point in the ABAP source code. If it is still unclear in which context the source code runs, a where-used list can provide further information.
Information on the triggering transaction can also be determined by capturing a current query in SM66 or by a permanent SQL trace restricted to the table using ST05.
If no report name is displayed in the Command Monitor, the statement was called by a tool or script outside the R/3 System.
If branching to the source code fails, the source code may have been generated dynamically.
You cannot branch to the source code for statements that originated in the R/3 kernel.
20. How can I optimize a processing-intensive SQL statement?
Before a processing-intensive SQL statement is tuned technically, you should always check from the application point of view to what extent the SQL statement is useful at all. If the SQL statement is the result of an inadequately filled input template, a badly qualified SE16 SELECT (report /1BCDWB/DB*), an unfavorable query (report AQ*), or a design error in a customer-specific development, the statement must be optimized from a non-technical point of view or avoided completely. If the statement comes from the SAP standard system, you can use an SAP note search to determine whether a general solution is available.
The following section provides a general overview of classes of processing-intensive SQL statements and of possible technical tuning strategies:
a) Large number of executions
If an SQL statement is processing-intensive primarily due to the large number of executions, you must check whether the number of executions can be reduced in the application.
b) Large number of processed rows for each execution
If an SQL statement reads several data records for each execution, you must check in the application whether the scope of the dataset can be reduced (for example, by specifying more conditions in the WHERE part).
c) Poor ratio of read to qualified records
This indicates a high read load (that several records must be read to return the result).
Check whether resource consumption in the MaxDB is optimal. Consider the following issues:
Does the MaxDB Optimizer fail to decide on the best access path?
If a detailed analysis of the SQL statement shows that MaxDB does not choose the optimal access path, you should first check whether the optimizer statistics correctly reflect the current dataset. The statistics should be updated once a week, for example at the weekend when the workload is minimal, for tables whose size has changed.
Other reasons for an incorrect access path are incorrect MaxDB parameterization or the use of an obsolete Support Package. For more information, see Note 767635.
Note 817934 is also available for MaxDB Support for further analysis.
Can the access be optimized by creating an additional index?
If the existing selective conditions are not sufficiently supported by an index, a corresponding index can be created for optimization or - if no side effects are possible - an existing index can be modified.
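A minimal sketch of such an index creation (table and column names are only examples; in the R/3 System, indexes should be created using transaction SE11 so that they are known in the ABAP Dictionary):

CREATE INDEX "ZORDERS_STATUS" ON ZORDERS (MANDT, STATUS)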
For information about the index, see FAQ Note 928037.
Note that in the R/3 System, some central tables are accessed using special index tables (Note 185530 for SD, Note 191492 for MM/WM, and Note 187906 for PP/PM). In the cases described, the access should be performed using an index table so that no additional index has to be created. The same applies to accesses to the table BSEG, for which index tables such as BSIS, BSAS, BSID, and BSAD exist.
d) Large number of disk reads per execution
A large number of disk reads often occurs with primary key ranges that are restricted only by the MANDT field, or with full table scans.
In this case, use appropriate measures to force an index access (create a new index, modify the statistics, and so on).
21. What should I bear in mind for COLUMN > ' ' (blank) queries?
Bear in mind the following for SQL statements of the type COLUMN > ' ' :
If COLUMN is part of a composite key or index and is not the last column in this key or index, then, because of its architecture, MaxDB must search not only all rows in which COLUMN > ' ', but also all rows in which COLUMN is blank.
You can optimize such statements by creating a single-column index on COLUMN or by creating a composite index in which COLUMN is the last index field (see the sketch below).
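Two possible sketches, assuming a hypothetical table ZTAB in which COLUMN corresponds to the field FLAG:

CREATE INDEX "ZTAB_FLAG" ON ZTAB (FLAG)
or
CREATE INDEX "ZTAB_MANDT_DATUM_FLAG" ON ZTAB (MANDT, DATUM, FLAG)

In the second variant, FLAG is the last index field, as recommended above.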
22. How can I determine which indexes are actually used?
In transaction DB50 -> Problem Analysis -> Tables/Views/Synonyms, you can display the existing indexes on the 'Indexes' tab. A green dot in the 'Use' column indicates that the index has already been used by an SQL statement for an access. A red dot indicates that the index has not been used so far.
For more information, see the FAQs in Note 928037: MaxDB indexes.
23. What must be considered in terms of the index design?
Each additional index means additional overhead when the table is changed and requires additional disk space. Therefore, observe the following general rules:
-> Create as few indexes as possible.
-> Check whether an existing index can be put to good use by changing the application design.
-> If possible, avoid creating any index that is very similar to an existing index.
-> Only change standard indexes or existing indexes in consultation with SAP or at SAP's request.
To determine the optimal index for a specific statement, the actual values in the WHERE condition and their selectivities must be known (see the sketch below).
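A simple way to estimate the selectivity of a column is to compare the number of distinct values with the total number of rows (table and column names are only examples):

SELECT COUNT(DISTINCT STATUS), COUNT(*) FROM ZORDERS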
For more information, see the FAQs in Note 928037: MaxDB indexes.
24. Where can I find further information on SQL optimization?
The MaxDB documentation (Note 767598) provides detailed information on the topic of SQL optimization. We recommend that you look up the following keywords in the glossary of the documentation:
-> Primary key
-> Command monitor
-> Resource monitor
MaxDB Development Support is currently drawing up performance guidelines for optimizing long-running SQL statements.
25. Which training courses deal with the topic of SQL optimization?
For information about MaxDB Tuning, see the SAP Developer Network (SDN) under the following link:
SAP currently provides two training courses that include the topic of SQL optimization.
a) WB550: MaxDB Internals
This course gives you a better insight into MaxDB Internals and devotes one chapter to SQL optimization.
b) UMEW60: MaxDB/SAP DB Performance Monitoring & Optimization
This course allows you to execute an SQL optimization remotely on your own system together with the course provider.
For more information about this training course, enter the course number in the course search of the SAP training catalog on SAPNet: