Online Tutorials & Training Materials | STechies.com
Register Login

Oracle Histograms Interview Questions and Answers

|| || 1

Oracle Histograms Interview Questions and Answers
Stechies

FAQ: Oracle histograms

1. What are histograms?
Histograms are an optional component for database statistics. Database statistics are required as a basis for the decision process of the cost-based optimizer (CBO) when determining the optimum access path (index access, full table scan...) (see Note 588668).

Histograms contain information about how the data of a table column is distributed. If there are no histograms, Oracle assumes the column values are distributed equally. Histograms enable the CBO to determine whether certain values occur frequently, rarely or not at all, and based on this information, the CBO can select an optimum access path.
Example (a table whose column COL1 10000 sometimes has the value 1 and only has the value 2 once):

CREATE TABLE HIST_EXAMPLE (COL1 NUMBER, COL2 NUMBER);
BEGIN
   FOR I IN 1..10000 LOOP
     INSERT INTO HIST_EXAMPLE VALUES (1, I);
   END LOOP;
   INSERT INTO HIST_EXAMPLE VALUES (2, 0);
END;
/

If you now create an index for COL1 as well as statistics with histograms, the CBO may reach different decisions depending on the comparison value:

SELECT * FROM HIST_EXAMPLE WHERE COL1 = 1;

SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=10000)
   TABLE ACCESS (FULL) OF 'HIST_EXAMPLE' (Cost=5 Card=10000)

SELECT * FROM HIST_EXAMPLE WHERE COL1 = 2;

SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=5)
   TABLE ACCESS (BY INDEX ROWID) OF 'HIST_EXAMPLE' (Cost=2 Card=1)
     INDEX (RANGE SCAN) OF 'HIST_EXAMPLE_1' (NON-UNIQUE) (Cost=1 Card=1)

 If "COL1=1", the CBO selects a full table scan because almost all of the table entry meets the condition, and an index access is therefore not necessary. If "COL1=2", the index is used because only one entry is relevant.
Histograms therefore permit different access paths depending on the values that exist in the WHERE condition.

With multiple column values, it is not possible to store an exact statement using value distribution in the form of histograms. Instead, the system only stores a certain maximum amount of information. This value distribution information is referred to as a bucket. A maximum of 75 buckets per column is the default value.

The more buckets that are used, the more exact the histogram information is. At the same time, however, this means that the memory and administration requirements are greater. Generally, 75 buckets are sufficient.

Normal statistics without histograms are always based on exactly one bucket, and represent the lowest and highest values of the column.

3. What kinds of histograms are there?
The following types of histograms exist:

Value-based histograms (FREQUENCY histograms)

If there are fewer different values in the column than the amount of buckets used, each bucket contains one column value and the frequency of this column value.

Height-balanced histograms (HEIGHT BALANCED histograms)

If there are more different values in the column than the amount of buckets used, the column values are divided into intervals of the same size. In this case, an interval is assigned to each bucket.

4. What limitations do histograms have?
Histograms have the following Oracle limitations:

The maximum number of histogram buckets is 254.

In the case of character columns, only the first 32 characters are included in the determination of histograms. If columns only differ in later characters, the histogram does not reflect this difference.

5. How can I create histograms using Oracle?
You have the following options to create histograms using Oracle:

ANALYZE TABLE

As part of the ANALYZE TABLE command for compiling statistics, you can specify

FOR ALL [INDEXED] COLUMNS [SIZE <buckets>]
to create histograms (<buckets> > 1). If youdo not specify the SIZE, 75 buckets are used. For example, with the following command, histograms are created with 75 buckets for all indexed columns of the T100 table:

ANALYZE TABLE T100 COMPUTE STATISTICS FOR ALL INDEXED COLUMNS;

DBMS_STATS

You can also specify the METHOD_OPTS argument as part of the DBMS_STATS command GATHER_*_STATS, the contents of this argument are syntactically identical to the FOR expression of ANALYZE TABLE.

Therefore, the following is comparable with the above ANALYZE TABLE example:

DBMS_STATS.GATHER_TABLE_STATS('SAPR3', 'T100',
 METHOD_OPT=>'FOR ALL INDEXED COLUMNS');

Note that these commands only create additional histograms for existing statistics. A prerequisite in this case is that normal statistics must already exist. See also Note 588668.

6. How can I create histograms using SAP?
The "brconnect -F stats" function enables you to create CBO statistics with and without histograms (see the BRCONNECT online documentation). To create statistics with histograms, you must explicitly use the " -m EH" (ESTIMATE) or " -m CH" (COMPUTE) option. For the continuous generation of histograms for a table, you can create an entry with AMETH=EH or AMETH=CH in the DBSTATC control table.

By default, BRCONNECT always creates statistics without histograms in the R/3 environment.

7. How can I delete histograms?
It is only possible to delete histograms indirectly by creating new statistics without histograms for the relevant table.

8. Where is histogram information stored?
Histogram information is stored in objects of the Oracle data dictionaries in the SYSTEM tablespace in exactly the same way as information about normal statistics. You can access this information using DBA_TAB_HISTOGRAMS (tables) and DBA_PART_HISTOGRAMS (partitions). For each histogram bucket, there is an entry in the DBA view. The ENDPOINT_NUMBER column references the relevant bucket. For volume-based histograms, there is also another entry that defines the starting point for the first bucket.
The actual data is contained in the HIST_HEAD$ and HISTGRM$ tables, in which HISTGRM$ is contained in the C_OBJ#_INTCOL# Oracle cluster.

9. How can I determine whether histograms are created for a table?
The following queries determine for which columns histograms were created and of how many buckets each of these is made up:

Unpartitioned tables:

  • General information:

    SELECT COLUMN_NAME, NUM_BUCKETS FROM DBA_TAB_COLUMNS WHERE
    OWNER = '<table_owner>' AND TABLE_NAME = '<table_name>'
    AND NUM_BUCKETS > 1;

 Note that columns with only one distinct value only ever have one bucket, even if histograms are created. This means that the SELECT described above is not 100% correct. For Oracle 9i and lower, there is no further information about DBA or V$ views.

  • As of Oracle 10g and higher, you can read the existence directly from the HISTOGRAM column:

    SELECT COLUMN_NAME, HISTOGRAM, NUM_BUCKETS FROM DBA_TAB_COLUMNS WHERE
       OWNER = '<table_owner>' AND TABLE_NAME = '<table_name>';

Partitioned tables:

  • General information

    SELECT COLUMN_NAME, PARTITION_NAME, NUM_BUCKETS
    FROM DBA_PART_COL_STATISTICS
    WHERE OWNER = '<table_owner>' AND TABLE_NAME = '<table_name>'
    AND NUM_BUCKETS > 1;

                    Note the restriction described above for columns that have only one distinct value.

  • Oracle 10g and higher:

    SELECT COLUMN_NAME, PARTITION_NAME, HISTOGRAM, NUM_BUCKETS
    FROM DBA_PART_COL_STATISTICS
    WHERE OWNER = '<table_owner>' AND TABLE_NAME = '<table_name>'
    ORDER BY 1, 2;

10. How can I determine for which tables histograms are created?
The following statement delivers all tables on which histograms are created (due to the restrictions described above, histograms are not taken into account in columns with only one distinct value):

SELECT OWNER, TABLE_NAME FROM DBA_TAB_COLUMNS
WHERE NUM_BUCKETS > 1
GROUP BY OWNER, TABLE_NAME
UNION
SELECT OWNER, TABLE_NAME FROM DBA_PART_COL_STATISTICS
WHERE NUM_BUCKETS > 1
GROUP BY OWNER, TABLE_NAME;

As of Oracle 10g, you can also use the following query:

SELECT OWNER, TABLE_NAME FROM DBA_TAB_COLUMNS
WHERE  HISTOGRAM != 'NONE'
GROUP BY OWNER, TABLE_NAME
UNION
SELECT OWNER, TABLE_NAME FROM DBA_PART_COL_STATISTICS
WHERE  HISTOGRAM != 'NONE'
GROUP BY OWNER, TABLE_NAME;

11. When are Oracle histograms useful?
The CBO can only productively analyze histograms if the actual values for selection conditions or join conditions are known. In the R/3 environment, the actual values are replaced by bind variables by default, therefore histograms are not useful here. Histograms can only be analyzed productively under the following conditions:

Using SUBSTITUTE Hints

You can use the DBI Hints SUBSTITUTE VALUES or SUBSTITUTE LITERALS to ensure that literals or all the values of an SQL statement are passed on to the database directly, without being substituted by bind variables. Histograms are useful for this. See also Note 772497.

Database accesses without bind variables

Some SAP components such as BW, Bank Analyzer or EP sometimes execute database accesses without bind variables. In cases where the values are transferred directly, histograms may influence the execution plan.

Selection conditions for views

If you create views with selection conditions, the database receives their values directly - there is no substitution by bind variables. Histograms are useful for this.

Bind Value Peeking active

As of Oracle 9i, Bind Value Peeking is active by default. This means, when you first execute an SQL statement, Oracle determines the contents of the bind variables to determine an optimal access path. Due to many problems with this feature, you must deactivate it in the SAP environment (_OPTIM_PEEK_USER_BINDS=FALSE).

12. Why did SAP also recommend creating histograms in other cases?
Up to and including Oracle 9.2.0. 5, the cost accounting function of the CBO had a bug that meant that even if bind variables were used, accesses that used columns with only one distinct value could be better analyzed by a factor of 100 if histograms were created for these columns (see also Note 750631). This is why we recommended creating histograms in certain cases, in order to force the system to use certain indexes (Note 335415).
This bug has been corrected as of Oracle 9. 2.0.6, so that histograms no longer have any affect on cost accounting.
In the BW environment, histograms are used as standard because concrete values are transferred to Oracle rather than bind variables in many situations (Note 129252).

13. What must I do to ensure that I do not experience performance problems as of Oracle 9.2.0.6 due to the corrected histogram bug?
First, use the checks outlined above to determine whether you use histograms in your system. If histograms exist on non-BW tables (that is, tables with names that do not start with /BIC/ or /BI0/) and there is no plausible explanation for this (for example, deliberate use of SUBSTITUTE_VALUES or SUBSTITUTE_LITERALS hints), these histograms have normally been created to take advantage of the Oracle bug described above.
As of 9.2.0. 6, you must find a different solution for these tables. To do this, proceed as follows before you upgrade to 9.2.0.6:

Delete histograms in a non-production environment, and check whether performance problems occur as a result.

If performance problems occur after you delete histograms due to long SQL statements, optimize these according to Note 766349 using hints or adjustments to the statistics. Transfer these adjustments to the live environment.

Remove the relevant entries with CH and EH from DBSTATC because the histograms are no longer required.

If none of the other measures work, then you can use the command

EVENT='38052 trace name context forever'

as a workaround to bring about the previous behavior in the Oracle profile (see Note 592393 (7)). In addition, you must create histograms on the relevant tables again.

14. When are histograms useful in an R/3 environment?
Histograms are only useful if the actual values are known from the WHERE condition. In the R/3 environment, although binding-variables are used as standard, it is possible to allow Oracle to access the actual values used:

You can use the DBI Hints &SUBSTITUTE VALUES& and &SUBSTITUTE LITERALS& to instruct the database interface of the R/3 System to replace bind variables with values before the SQL statement is sent to Oracle (Note 129385).

If you use Oracle 9i, and the parameter _OPTIM_PEEK_USER_BINDS is not explicitly set to FALSE, Oracle automatically determines the contents of the bind variables during the initial parse. Due to other performance problems, however, we currently recommend that you set the parameter to FALSE (Note 755342).

With current SAP releases, you can set the SAP parameter dbs/ora/substitute_literals to 1 to prevent bind variables from being used in connection with literals (Note 902042). In this case, histograms may be useful on all columns that are accessed with literals (among other things). However, note that the parameter also comes into effect for dynamic WHERE clauses and this may lead to unintended side-effects and problems. Therefore, you should only use this global parameter in exceptional cases and you should always consult with SAP Support (for example, by creating a message under SV-BO-DB).
In such cases, the Oracle optimizer can use the histogram information productively.

15. Where are histograms useful in the SAP environment?
During the following accesses, SUBSTITUTE hints are sometimes used and histograms may be useful for this:

RSDD_TMPNM_ADM (Note 802299)

LTAP / LTBP (Note 811852)

MSEG / MKPF (Note 902157, 902675)

PPC_HEAD

BDCP / BDCPS (Note 706836)

F4 search help (Note 1008433)

Additionally, histograms are sometimes used for objects of the Oracle DDIC as of Oracle 10g (in connection with the statistics creation that is described in Note 838725).

16. Why is ENDPOINT_ACTUAL_VALUE in DBA_TAB_HISTOGRAMS partially unfilled?
The ENDPOINT_ACTUAL_VALUE contains the endpoints of individual histogram buckets in readable form. Therefore, it is suitable for analyses of optimizer decisions in connection with histograms. However, this column is not always filled. DBMS_STATS statistics in particular do not often fill this column. This is not a matter of a bug, but rather the normal behavior (compare information from Oracle bug 3333781), since ENDPOINT_ACTUAL_VALUE is only intended for Oracle-internal purposes.

17. What problems are there regarding histograms?
The following problems may occur when you use histograms:
With Oracle Release 9.2.0.6 or higher, the elimination of the above histogram bug may cause performance problems for statements that were optimized when used during the previous incorrect behavior. The same problem may also occur with Oracle 9.2.0.5 or lower if Bind Value Peeking is active. (Refer to Note 690702).

As of Oracle 9i, performance problems may occur in connection with Bind Value Peeking if you did not deactivate this using _OPTIM_PEEK_USER_BINDS=FALSE (Note 755342).

As described in Note 176754 (32) and (33), the existence of histograms can have an effect on the cost calculation for SQL statements with bind variables.

The C_OBJ#_INTCOL# Oracle cluster that stores histogram information may become very large. See Notes 566665 and 636670 for more information.


Comments

  • 28 Sep 2010 4:56 am
    Good explanation.. clear and precise.. Thank you.

Related Articles

0.0139 seconds.