Online Tutorials & Training Materials | STechies.com
Register Login
Advertisement

Database Statistics Interview Questions and Answer

|| 0

Database Statistics Interview Questions and Answer
Stechies

Database statistics FAQ

1. What are database statistics used for?

The Cost Based Optimizer (CBO) requires database statistics in order to use the most suitable access path (for example, index access, full table scan, join types) to access the required data for SELECT, UPDATE and DELETE statements. To do this, the system uses the statistics to select the access path with the fewest expected I/O operations.Database statistics can also be used as part of database monitoring (for example, to determine memory requirements).

2. When are database statistics not required?

No statistics are required if you are using the Rule Based Optimizer (RBO):

R/3 <= 3.1I

SAP table pools and table clusters

Other tables with exception processing in accordance with DBSTATC (doc 122718)

If RULE hints are explicitly specified

Oracle DDIC objects (exception: Large BW systems, doc 558746)

When the TP transport control program is used for an import, statistics are deleted by default on the DDXTT, DDXTF, TATAF, TRBAT, TRBATC and TRBAT2 tables so that you can use the RBO (see doc 564161).

No statistics are required for insert statements either, since the inserts carried out are based on Freelist entries.

As of Oracle Release 10g, you generally require database statistics.

3. What type of statistics are there and where are they stored?

Statistics are saved in the Oracle Data Dictionary. Different types of statistics are available:

Table statistics:

  • Access using the DBA_TABLES view
  • These contain information such as the number of rows (NUM_ROWS), number of blocks used (NUM_BLOCKS), accuracy (SAMPLE_SIZE) or date of last statistics creation (LAST_ANALYZED)

Index statistics:

  • Access using the DBA_INDEXES view
  • Contain information such as the size of the index tree (BLEVEL), number of leaf blocks (LEAF_BLOCKS), number of different keys (DISTINCT_KEYS), key figure for assigning index vs. table (CLUSTERING_FACTOR, see doc 832343), accuracy (SAMPLE_SIZE) or date of last statistics creation (LAST_ANALYZED)

Column statistics:

  • Access using the DBA_TAB_COLUMNS view
  • Contain information such as the number of different values (NUM_DISTINCT), lowest value (LOW_VALUE), highest value (HIGH_VALUE), accuracy (SAMPLE_SIZE) or date of the last statistics creation (LAST_ANALYZED)

Histogram information (optional):

  • Access using the DBA_TAB_HISTOGRAMS view
  • Contains (75 items or less by default) bucket information consisting of the bucket number (ENDPOINT_NUMBER) and bucket final value (ENDPOINT_ACTUAL_VALUE).
  • System statistics (doc 927295)
  • Access using SYS.AUX_STATS$
  • Contains information about the weighting of Single Block Reads, Multi Block Reads and CPU consumption.
  • Only used in the SAP environment as of Oracle Release 10g.

4. What are histograms?

Histograms describe the distribution of column values between the lowest existing value (LOW_VALUE) and the highest existing value (HIGH_VALUE). For more information, see doc 797629.

5. How damaging are old statistics?

No conclusions can be made as to how good or bad the existing statistics are, based on their age. The decisive factor affecting the quality of statistics is how well the current dataset is represented. Even "old" statistics generally still correspond to the current dataset.

6. What are the disadvantages of creating new statistics?

Every change to the statistics can cause changed accesses to the CBO In most cases, changed access caused by new statistics have a positive effect on performance. Such changes can have negative consequences in a small number of cases. In this context, doc the following recommendations:

If you have manually modified the statistics to optimize performance, as described in doc 724545, you should avoid creating any new statistics for the affected tables. However, if you still want to create new statistics, you will need to make the manual changes described in doc 724545 again.

7. How can I restore old statistics?

Creating new statistics always involves the risk that individual accesses may be compromised. In this case, it is helpful if you can retrieve the previous statistics quickly. In addition, the following possibilities are available:

If you use DBMS_STATS to create the statistics, as of Oracle 10g, you can retrieve previous statistics from a table using DBMS_STATS.RESTORE_TABLE_STATS. Using the following command, you can restore the statistics that existed a day ago, for example:

EXEC DBMS_STATS.RESTORE_TABLE_STATS('<owner>', '<table_name>',
SYSDATE - 1);

Otherwise, you can only restore old statistics if you saved them explicitly using DBMS_STATS.EXPORT_TABLE_STATS. You can retrieve statistics that you saved using DBMS_STATS.IMPORT_TABLE_STATS. Refer to doc 448380 for further details.

8. How can database statistics be created?

There are various ways to create database statistics:

Oracle tools and commands:

ANALYZE command (see Oracle documentation)
DBMS_STATS package (see Oracle documentation and doc 448380)

SAP tools:

BRCONNECT 6.10 or higher (see SAP online documentation and doc 403704)
Transaction DB20
Report RSANAORA

9. How does BRCONNECT proceed when creating statistics?

In the usual scenario, BRCONNECT functions in two phases:

a) First, it determines whether new statistics are required: New statistics are only required if the number of table entries since the last statistics creation has changed by more than one threshold value (parameter stats_change_threshold, default: 50%) BRCONNECT then creates new statistics:

b) Creating new statistics: If the first phase resulted in the need for new statistics, and the system has met further conditions (no alternative entry in the DBSTATC control table, no table pool or table cluster, and so on), new statistics are created.

10. What is the difference between DBMS_STATS and ANALYZE?

DBMS_STATS is a more recent method for creating statistics that is expected to replace ANALYZE in the long term. You can observe the following differences when using DBMS_STATS instead of ANALYZE:

Advantages of DBMS_STATS:

  • It is possible to use histograms for global statistics of partitioned objects
  • Table-internal parallel processing is possible (doc 408532)
  • Correct statistics even for columns with identical character strings in the first 32 characters (doc 365480)
  • You can transport statistics to other systems
  • You can back up and reactivate statistics

Disadvantages of DBMS_STATS:

  • DBMS_STATS only determines statistical data that is relevant to CBO. For monitoring purposes, the system does not calculate useful columns such as AVG_SPACE, EMPTY_BLOCKS, AVG_SPACE_FREELIST_BLOCKS or NUM_FREELIST_BLOCKS. As a result, you cannot make assertions about space utilization and fragmentation on the basis of this value with DBMS_STATS statistics, for example.
  • No statistics for cluster tables
  • Oracle 8i: Histograms cannot be created when you use parallel processing.

11. Are the statistics of DBMS_STATS and ANALYZE compatible?

The two types of statistics are not fully compatible, which causes problems when you create DBMS_STATS statistics for objects with ANALYZE statistics and vice versa. For more information, see doc 489352.

When changing the method, you should therefore first delete statistics using the function of the method used to date ("ANALYZE TABLE ... DELETE STATISTICS" or DBMS_STATS.DELETE_..._STATS) before new statistics are created. Deletion of these old statistics has already been implemented in current versions of BRCONNECT and BW.

12. How should statistics usually be created?

To keep statistics up-to-date, SAP recommends that you schedule the following command at least once a week:

brconnect ... -f stats -t all

Irrespective of the number of table lines or (in the case of activated table monitoring) the contents of DBA_TAB_MODIFICATIONS, this run checks which table requires new statistics in the first step. In the second step, new statistics are created for the tables that were determined to need them. You can use transaction DB13 to schedule these runs (see doc 132861).

If new columns or indexes are created (even in transports - refer to doc 657536 for more information), the table statistics must be updated (for instance, using transaction DB20). Other functions such as DB13, BRCONNECT without special force options, or the sole analysis of an index (whether newly created or reorganized) can cause incomplete statistics and then cause considerable performance problems. As of Oracle 10g, Oracle automatically creates index statistics when an index is created. This means that you do not need to also create the statistics manually.

In some situations, statistics are also created implicitly:

After special BW actions such as aggregate rollup or change runs; doc 555030 describes how you can deactivate this automatic creation if required.

At the end of a reorganization with SAPDBA

doc 1057511 describes an alternative to generating statistics regularly using BRCONNECT. This procedure means that experienced administrators do not need to use BRCONNECT as frequently, provided certain precautions are followed.

13. Can I create statistics while the system is running?

Creating statistics does not block any database objects. In theory, therefore, you can create statistics while the system is running. However, since the creation of statistics involves an additional CPU and I/O load, it should be carried out during a period of minimal workload, if at all possible.

14. How can I determine when an object was last analyzed?

For tables, execute the following statement at Oracle level:

SELECT TO_CHAR(LAST_ANALYZED, 'DD.MM.YYYY, HH24:MI:SS') FROM
DBA_TABLES WHERE
OWNER = '<table_owner>' AND TABLE_NAME = '<table_name>';

For indexes, replace DBA_TABLES with DBA_INDEXES and TABLE_NAME with INDEX_NAME in the statement above.

15. How can I determine whether Oracle DDIC tables contain statistics?

You can use the following SELECT statement to determine whether and which objects from the Oracle data dictionary contain statistics:

SELECT TABLE_NAME
FROM DBA_TABLES
WHERE LAST_ANALYZED IS NOT NULL AND OWNER = 'SYS';

doc 138639 describes the cases in which you should create statistics for the Oracle DDIC.

16. How can I determine, whether there are any statistics relating to fixed objects?

As of Oracle 10g, we recommend that you use fixed object statistics (refer to doc 838725). You can check whether these statistics have been created or not as follows:

SELECT TABLE_NAME
FROM DBA_TAB_STATISTICS
WHERE LAST_ANALYZED IS NOT NULL AND OWNER = 'SYS' AND
TABLE_NAME LIKE 'X$%';

If the query returns X$ tables, then fixed object statistics exist.

17. Why do some tables not contain any new statistics even though a statistical run was carried out with BRCONNECT?

Before BRCONNECT creates new statistics, the tool checks whether new statistics are required. No new statistics are created if there have been no significant changes in the number of entries in a table.

18. Why is it that frequently, only the primary indexes have new statistics after a BRCONNECT run?

To determine the current number of table entries in the check phase, BRCONNECT executes an ANALYZE of the primary index. This is much faster than accessing the (larger) table yourself. For this reason, the primary indexes always receive new statistics - even if it turns out that the table does not require any new statistics.The system deviates from the above rule if table monitoring is activated (docs 408527, 628590). In this case, there is no need to analyze the primary index and as a result, the primary indexes do not receive any new statistics.

19. How can I force the creation of statistics with BRCONNECT for a table?

You can do this by additionally specifying the force options

-f collect,allsel,keep

(these ensure that statistics are always created even if changes are made to table rows, entries in DBSTATC and the table type). This type of call is also generated when you create statistics in transaction DB20.

doc that the force options are expert settings that should only be used in exceptional circumstances.

20. How can I use BRCONNECT to create new statistics for all tables?

It may be useful in some situations to create new statistics for all tables without a necessity check. Tables that normally receive no statistics should, of course, be left out of the statistic creation. You can do this using the following call:

brconnect -u / -c -f stats -t all -f collect

21. With what degree of accuracy does BRCONNECT create statistics?

The accuracy of new statistics depends on the size of the table:

Fewer than 10,000 rows: Exact calculation (COMPUTE)

Fewer than 100,000 rows: ESTIMATE with 30 % of the table

Fewer than 1,000,000 rows: ESTIMATE with 10 % of the table

Fewer than 10,000,000 rows: ESTIMATE with 3 % of the table

More than 10,000,000 rows: ESTIMATE with 1% of the table

22. How can I determine tables whose statistics have too low a sample size?

You can use the following statement to determine the statistics of larger tables (more than 50,000 entries), which were created with less than 20% of the accuracy used by BRCONNECT by default:

SELECT TABLE_NAME, NUM_ROWS, SAMPLE_SIZE,
TO_CHAR(LAST_ANALYZED, 'dd.mm.yyyy hh24:mi:ss')
LAST_ANALYZED
FROM DBA_TABLES
WHERE NUM_ROWS > 50000 AND SAMPLE_SIZE > 100 AND
SAMPLE_SIZE < 0.2 * NUM_ROWS *
DECODE(NUM_ROWS, 0, 0, DECODE(TRUNC(LOG(10, NUM_ROWS)),
0, 1, 1, 1, 2, 1, 3, 1, 4, 0.3, 5, 0.1, 6, 0.03, 0.01))

ORDER BY 4 ASC;

Sample sizes < = 100 are percentages that were specified in earlier Oracle releases instead of row numbers, and are therefore excluded from the check.

Check whether there is a plausible explanation for the lower accuracy of the statistics (for example, a DBSTATC entry), otherwise create new statistics.

NUM_ROWS values on the order of approximately 1065 are created for large tables when you use an "ANALYZE TABLE ... ESTIMATE STATISTICS" statement without specifying a sample size (such as when you create statistics manually without using BRCONNECT).

If the SAMPLE_SIZE value is demonstrably lower than the sample size you specified for the last statistic creation, this may be caused by fragmentation effects. For more information, also see doc 821687.

If you implement predefined statistics in accordance with doc 1020260, SAMPLE_SIZE no longer fits the imported values. Therefore, you can ignore the tables mentioned in doc 1020260 here.

If you generate statistics using BRCONNECT after a restructuring of the the system (that is, at a point where no CBO statistics exist yet), a sample size of one percent is used by default. You may have to create the statistics again (using the option "-f collect") to obtain larger sample sizes.

23. What is the DBSTATC table used for?

DBSTATC contains predefined exceptions for calculating statistics with BRCONNECT. For more information, see doc 122718.

Statistics are calculated normally for tables that are not defined in DBSTATC (unless there is another exception, such as table pool/table cluster or R/3 3.1I).

24. How can I define additional exceptions to the standard procedure in creating statistics with BRCONNECT?

In some individual cases, the statistics created by BRCONNECT are not optimal. In these cases, you may require greater accuracy in the analysis or you may need to create statistics with additional histograms. To make these settings permanent, you can add an entry in the DBSTATC table as described in doc 106047.

25. Which statement does BRCONNECT use internally to create statistics?

Oracle 9i or lower

To create statistics, BRCONNECT calls ANALYZE TABLE with several FOR conditions, by default. The following is an example of an estimate with 10%:

ANALYZE TABLE <table_name> ESTIMATE STATISTICS SAMPLE 10 PERCENT

FOR TABLE FOR ALL INDEXES FOR ALL INDEXED COLUMNS SIZE 1;

With this statement, table statistics ("FOR TABLE"), index statistics for all indexes belonging to the table ("FOR ALL INDEXES") and column statistics for all indexed columns ("FOR ALL INDEXED COLUMNS") are created without histograms ("SIZE 1").

As of BRCONNECT 6. 40, the system creates statistics for ALL columns ("FOR ALL COLUMNS") by default.

Oracle 10g or higher

As of Release 10g, by default, BRCONNECT uses DBMS_STATS to create statistics. A typical DBMS_STATS call looks like:

DBMS_STATS.GATHER_TABLE_STATS

( OWNNAME => '"<owner>"',
TABNAME => '"<table_name>"',
ESTIMATE_PERCENT => <percentage>,
METHOD_OPT => 'FOR ALL COLUMNS SIZE 1',
DEGREE => NULL,
CASCADE => TRUE,
NO_INVALIDATE => FALSE);

This statement creates table statistics, index statistics ("CASCADE => TRUE") and all column statistics ("FOR ALL COLUMNS") without histograms ("SIZE 1").

26. How can I control the use of ANALYZE and DBMS_STATS?

You can use the stats_dbms_stats parameter to specify the use of DBMS_STATS for some or all tables or for ANALYZE (see doc 424239).

27. How can I accelerate the creation of statistics?

For more information, see doc 806554.

28. How I can determine whether table, index or column statistics are missing?

During its statistics runs, BRCONNECT checks only the tables and index tables. This means that missing column statistics can no longer be created over a long period of time. In addition, up until Release 6.20, BRCONNECT created statistics only for indexed columns. In the meantime, however, it is advisable to create statistics for all columns (doc 723894).

The following SQL statements allow you to determine the missing statistics manually:

Indexes with missing statistics:

SELECT T.TABLE_NAME, I.INDEX_NAME
FROM DBA_TABLES T, DBA_INDEXES I
WHERE

I.OWNER = T.OWNER AND
I.TABLE_NAME = T.TABLE_NAME AND
T.LAST_ANALYZED IS NOT NULL AND I.LAST_ANALYZED IS NULL

AND INDEX_TYPE != 'LOB';

Indexed columns with missing statistics:

SELECT SUBSTR(T.OWNER, 1, 15) OWNER,
SUBSTR(T.TABLE_NAME, 1, 30) TABLE_NAME,
SUBSTR(TC.COLUMN_NAME, 1, 15) COLUMN_NAME,
T.NUM_ROWS NUM_ROWS
FROM DBA_TABLES T, DBA_TAB_COLUMNS TC, DBA_IND_COLUMNS IC
WHERE

T.OWNER = IC.TABLE_OWNER AND
TC.OWNER = IC.TABLE_OWNER AND
T.TABLE_NAME = IC.TABLE_NAME AND
TC.TABLE_NAME = IC.TABLE_NAME AND
TC.COLUMN_NAME = IC.COLUMN_NAME AND
T.LAST_ANALYZED IS NOT NULL AND TC.LAST_ANALYZED IS NULL

GROUP BY T.OWNER, T.TABLE_NAME, TC.COLUMN_NAME, T.NUM_ROWS
ORDER BY 4 DESC, 1 ASC, 2 ASC, 3 ASC;

All columns with missing statistics:

SELECT SUBSTR(T.OWNER, 1, 15) OWNER,
SUBSTR(T.TABLE_NAME, 1, 30) TABLE_NAME,
SUBSTR(TC.COLUMN_NAME, 1, 15) COLUMN_NAME,
T.NUM_ROWS NUM_ROWS
FROM DBA_TABLES T, DBA_TAB_COLUMNS TC
WHERE

T.OWNER = TC.OWNER AND
T.TABLE_NAME = TC.TABLE_NAME AND
TC.DATA_TYPE NOT IN ('CLOB', 'BLOB', 'NCLOB', 'BFILE',
'LONG', 'LONG RAW') AND
T.LAST_ANALYZED IS NOT NULL AND
TC.LAST_ANALYZED IS NULL

GROUP BY T.OWNER, T.TABLE_NAME, TC.COLUMN_NAME, T.NUM_ROWS
ORDER BY 4 DESC, 1 ASC, 2 ASC, 3 ASC;

If results are returned, new statistics should be created for the relevant tables (to create statistics for all columns with BRCONNECT 6.40 or higher). This also creates all index and column statistics.

As of Oracle 10g, statistics must exist for ALL SAP tables. You can use the following statement to check whether there are still SAP tables without statistics under 10g.

SELECT
T.OWNER,
T.TABLE_NAME,
TO_CHAR(O.CREATED, 'dd.mm.yyyy hh24:mi:ss') CREATION_TIME
FROM
DBA_TABLES T,
DBA_OBJECTS O
WHERE
T.OWNER = O.OWNER AND
T.TABLE_NAME = O.OBJECT_NAME AND
T.OWNER LIKE 'SAP%' AND
T.LAST_ANALYZED IS NULL AND
O.OBJECT_TYPE = 'TABLE';

When the system returns tables, the reason for the missing statistics should be identified and new statistics should be created.

29. What are system statistics?

System statistics are a new type of statistic, available as of Oracle 9, which contains system-specific information about CPU and I/O behavior. For more information, see doc 927295.

30. What should I take into account with respect to the statistics for partitioned objects?

Statistical information for partitions is located in the Oracle views DBA_TAB_PARTITIONS, DBA_IND_PARTITIONS and DBA_PART_COL_STATISTICS.

Partitioned objects may have global (cross-partition) or local (partition-based) statistics. The GLOBAL_STATS field indicates whether or not global statistics have been created.

doc 524341 contains more information about creating statistics for partitioned objects with BRCONNECT. While we previously recommended that you use the ANALYZE method to create statistics, you can also use DBMS_STATS as of BRCONNECT 6.40 (doc 744483).

31. What special features apply to statistics in the BW environment?

Refer to the corresponding information in doc 1013912.

32. Can I manipulate statistics to trigger a preferable access path?

Yes, as of Oracle 8. 1, you can change statistical data using DBMS_STATS packages. For more information on this topic, see doc 724545.

If you change statistics manually, the USER_STATS entry belonging to the statistics is converted from NO to YES. You can therefore use this column value to find out whether statistics were manipulated.

33. Can statistics be transported from one system to another?

Yes, this is also possible as of Oracle 8. 1 using DBMS_STATS packages. For example, you can do this to simulate production system relationships with a large volume of data on a small test system.

34. Can the NUM_ROWS field be filled with decimal figures?

In certain cases, Oracle may fill the NUM_ROWS field (which records the number of rows in the object) with decimal figures instead of integers. This behavior is normal and does not present a problem.

35. Are statistics lost during an Oracle or R/3 upgrade?

Statistics are retained during an Oracle and R/3 upgrade and do not have to be recreated.

36. How are statistical values included in the CBO cost calculation?

doc 750631 describes rules of thumb for the cost calculation of typical database accesses based on parameters and statistics.

37. Why does DB20 report statistics as obsolete, even if they have just been created?

DB20 compares the current number of rows in the table with the number of the rows the last time the statistics were created. If the difference between these numbers exceeds a defined threshold value (default value is 50%), the statistics are regarded as obsolete.

To estimate the current number of table entries, an analysis of the primary index is performed with the default statistics accuracy. If the accuracy is low or the index is heavily fragmented, the estimated result deviates significantly from reality and DB20 reports obsolete statistics, even though the statistics are current.

In most cases, reconstructing the index (see doc 332677) or creating the statistics with greater accuracy helps in this case.

38. Can I use Oracle tools such as DBMS_STATS instead of BRCONNECT for regular statistic creation?

The Oracle package DBMS_STATS provides numerous options for automatically updating obsolete statistics (for example, GATHER STALE, GATHER AUTO), that can be executed periodically based on the Oracle job system. SAP does not support periodic creation of statistics for R/3 using these Oracle mechanisms (refer to doc 105047). Instead, you should always use BRCONNECT.

39. Does BRCONNECT also support new 10g statistics features, such as locking statistics?

doc 892296 describes new functions in BRCONNECT 7.00, which are mainly aimed at supporting the new Oracle 10g functions.

40. Can entries in DBSTATC be transported?

The table DBSTATC is not connected to the SAP transport system. If you want to transport the DBSTATC table, you can do so with an explicit transport using the Workbench Organizer. (Refer to doc 211105).

41. How can I reduce the scope of the statistics history?

As of Oracle 10g, Oracle keeps old statistics for 31 days by default to maintain the option of being able to activate such statistics again using DBMS_STATS.RESTORE_TABLE_STATS. In particular cases, the history statistics can lead to a massive memory requirement in the SYSAUX tablespace. Segments with statistical data and a potentially large memory requirement are, for example:

WRI$_OPTSTAT_IND_HISTORY
I_WRI$_OPTSTAT_HH_ST
I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST
WRI$_OPTSTAT_HISTHEAD_HISTORY
I_WRI$_OPTSTAT_H_ST
WRI$_OPTSTAT_HISTGRM_HISTORY
I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST

The history statistics data can become particularly intense if histogram statistics are created on a large scale, as in the BW environment, for example. To reduce the scope of the authorized history data, you can set the retention period to a decreased number of days (decreased by two or more) as follows:

EXEC DBMS_STATS.ALTER_STATS_HISTORY_RETENTION (<days>);

Alternatively, you can deactivate the complete statistics history as follows:

EXEC DBMS_STATS.ALTER_STATS_HISTORY_RETENTION (0);

To delete all statistics history data up to a certain date, you can execute the following command:

EXEC DBMS_STATS.PURGE_STATS(<date>);

To delete all data up to two days ago, you can use the following command:

EXEC DBMS_STATS.PURGE_STATS(SYSDATE - 2);

42.When We Open Database in Restricted Mode?

Sometimes we need to startup the db in RESTRICT mode to perform some maintenance activities.only users with specialized privileges can access the db.


Advertisement
Advertisement
Advertisement