HANA SQL Optimization Interview Questions and Answer

FAQ SAP HANA SQL Optimization

Here are some top most important and frequently asked SAP HANA SQL Optimization Interview Questions and Answer such as What is an expensive SQL statement, How can the user recognize a specific SQL statement etc.

What are the indications which exist for critical SQL statements?

Alert	Name	Description
39	Long-running statements	long-running SQL statements are identified.

The following SAP HANA alerts indicate problems present in the SQL area:

SQL: "HANA_Configuration_MiniChecks" shows a potentially critical issue (C = 'X') for one of the following individual checks:

Check ID	Details
1110	SQL using in average > 1 connection (last day)
1112	SQL using in average > 1 thread (last hour)
1113	SQL using in average > 1 thread (last day)
1115	Longest running current SQL statement (h)
1118	Longest running current job (s)
1120	Exp. stmt. trace: SQL running > 1 h (last day)
1160	Average COMMIT time (ms)
1162	Average COMMIT I/O time (ms)
1170	Average database request time (ms)
1180	Avg. ABAP buffer loading sessions (last day)
1181	Avg. FDA write sessions (last day)

How can the user recognize a specific SQL statement?

A specific SQL statement can be recognized by the SQL text. Since the text can be quite lengthy and there can be various similar SQL texts it is useful to identify a SQL statements based on a hash value which has been taken from the SQL text. This hash value is known as the statement hash and can be located in column STATEMENT_HASH of SAP HANA performance views like M_SQL_PLAN_CACHE, M_EXPENSIVE_STATEMENTS or M_SERVICE_THREAD_SAMPLES.

The statement ID (column STATEMENT_ID) cannot be utilized for that purpose, since it is based on a connection ID and not on a SQL statement text.

What is an expensive SQL statement?

An expensive SQL statement is referred to as database access which shows high values in areas like:

High execution time: SQL statements are generally very important as far as performance analysis is concerned with the overall runtime being high. The overall runtime counts don’t make a difference, when the statement is executed 1 time with a duration of 1000 seconds or 1 million times with an average duration of 1 ms, because in both cases the statement is mainly responsible for a total duration of 1000 seconds only.
High memory utilization
High CPU consumption
High lock wait time
High amount of returned records
High number of executions
High intra- / inter-node network communication

How can time information in the SQL cache (M_SQL_PLAN_CACHE) be interpreted?

Tables such as M_SQL_PLAN_CACHE, M_SQL_PLAN_CACHE_RESET and HOST_SQL_PLAN_CACHE comprise of many time related columns it is not possible for these to be interpreted.

Time column type	Description
CURSOR	Includes the overall cursor time which comprises of the SAP HANA server execution time and the client time Preparation time is excluded. This is mostly applicable to SELECT and CALL operations, and can be 0 for others (e.g. DML, DDL) If the client performs various tasks between fetches of data, the cursor time will be higher than the SAP HANA server time. Resulting in MVCC issues as the old versions of data have to be maintained until the execution is over.
EXECUTION	Comprises of the execution time (open + fetch + lock wait + close) on SAP HANA server side, excludes table load and preparation time
EXECUTION_OPEN	Includes the open time on SAP HANA server side Entails the actual retrieval of data in case of column store accesses along with early materialization
EXECUTION_FETCH	Has the fetch time on SAP HANA server side contains the actual retrieval of data during the case of row store accesses or late materialization
EXECUTION_CLOSE	includes the close time on SAP HANA server side
TABLE_LOAD	Contains the table load time while preparing,is part of the preparation time
PREPARATION	Encompasses the preparation time
LOCK_WAIT	Covers the transaction lock wait time and internal locks are excluded.

The retrieving date causes the EXECUTION_OPEN or EXECUTION_FETCH, which is generally quite long.

In SQL tuning, one of the critical information is the total elapsed time of the SQL statement, which is the result of the sum of preparation time and execution time.

Is it essential for secondary indexes to provide an optimal performance?

SAP HANA can process data efficiently to ensure a good performance and without using indexes. During the process of frequent, selective accesses to large tables it is always helpful to create additional secondary indexes. It is always advised that column store indexes must always be created on single columns as and when possible, as the single column indexes utilize less memory when compared to multi column indexes.

Should you create optimizer statistics for supporting optimal execution plans?

It is not mandatory to create optimizer statistics for normal SAP HANA tables. SAP HANA controls the optimal execution plans by certain heuristics (e.g. based on unique indexes and constraints), by internally collecting and re-using statistical information and ad-hoc sampling of data.

Remote tables which are accessed via Smart Data Access (SDA, ) can utilize optimizer statistics. They can be created with the CREATE STATISTICS command or - in the context of BW - with report RSSDA_CREATE_TABLE_STAT and displayed via SQL: "HANA_SQL_Statistics" .

How can you determine the details for prepared SQL statements?

For reducing the parsing overhead, bind variables are utilized in various environments like SAP ABAP:

Variant	Example SQL command
Literals	SELECT * FROM DBSTATC WHERE OBJOW = 'SAPR3' AND DBOBJ = 'AFPO'
Bind variables	SELECT * FROM DBSTATC WHERE OBJOW = ? AND DBOBJ = ?

A significant difference can be made for the execution plan, performance and resource consumption when a SQL statement is executed with explicit literals or with bind variables. Hence, it is suggested that you analyze an expensive SQL statement which uses bind variables in a similar manner, i.e. also with bind variables. The same can be obtained on using a prepared SQL statement.