Sizing SAP HANA In-Memory Database
Important Note: SAP HANA In-Memory Database can now be sized using SAP QuickSizer. Please go to service.sap.com/quicksizer for further information.
Please note that this section describes the sizing of the SAP HANA In-Memory Database as it is used e.g. for replication of ERP data coming from an SAP ERP system. Other applications running on top of SAP HANA may have application specific sizing algorithms.
How to determine the Database Footprint of the data?
Memory sizing of HANA is determined by the amount of data that is to be stored in memory, i.e. the amount of disk space covered by the corresponding database tables, excluding their associated indexes. Note that if the database supports compression, the space of the uncompressed data is needed. Based on this amount of data, we apply a compression factor to determine the size of the RAM needed for HANA.
If data comes from an SAP NetWeaver based system, the attached shell script "get_size.csh" for DB6 and Oracle based NetWeaver systems) or "get_db4_nonbw_size.sh" (for DB4 / iOS based systems - please read the attached README file) can be used to obtain the database footprint of the tables (for DB6 and Oracle based NetWeaver systems).
The scripts can be used in two different modes:
- without parameter: the names of all active tables of the NetWeaver system along with their database size (in MB) will be written to a file "TableSize.txt" which also contains the sum of all table sizes.
- with a filename as parameter: the file should contain a list of tables that are to be replicated (one table name per row, no blanks). Only for those tables which can be found both in the file and in the database (as active tables of the NetWeaver system) the script writes the table names and their respective databse sizes to the file "TableSize.txt".
Using sizing scripts for databases that support compression
If the database dictionary provides information on its compression factors, this information will be included in the computation of the uncompressed data size. At this point of time, only DB6 provides this information. For all other database platforms, the resulting data size has to be multiplied by the database specific average compresion factor. Please contact your DB system administrator to obtain this compression factor.
If data comes from non-NetWeaver systems or out of other database platforms, please proceed as follows:
DB2 for LUW (DB6):
The amount can be estimated using the system view "syscat.tables", assuming hat database statistics are up to date. For each table multiply "npages" by pagesize (and by "avgrowcompressionratio" if deep compression is used).
DB2 for IBM i (DB4):
The amount can be estimated using the catalog view "systablestat" in library "qsys2". For each table use the following expression to calculate the size in megabytes:
( ( number_rows * data_size ) /
( CASE ( number_rows + number_deleted_rows )
WHEN 0 THEN 1
ELSE ( number_rows + number_deleted_rows )
END ) ) / 1024. / 1024.
The amount can be estimated using the system table "tablesizes". Note that the size contains the primary key, however, the impact on sizing is only moderate.
MS SQL Server:
Use stored procedure "sp_spaceused 'table name'" for each table individually.
Use the "bytes" field in system view "dba_segments" (or other *_segments views) for segment types "Table" and "Table Partition". As an alternative, you can also multiply "num_rows" by "avr_row_len" from system view "*_tables".
Exceptional compression behaviour:
In certain cases the actual compression factor can deviate from the average compression that has been observed for various standard scenarios. In particular, database tables which contain data that have already been compressed on application level usually cannot be compressed further. Data distribution can also have significant impact. Columns with a low number of distinct values compress better than columns where nearly all records have different values.
Data that is loaded into HANA will be initially stored as a so called delta index which is uncompressed. This delta index will be merged into the existing (compressed) index at a later point in time. If a huge amount of data needs to be loaded in one request, the amount of memory that is needed to store the delta index can be as much as two times the size of the delta data itself.
CPU resources for HANA are implicitely defined by the certified hardware platforms that satisfy the memory requirements and that are available from your hardware vendor. To estimate the maximum number of active users (i.e. users which cause any kind of activity on the server within the time period of one hour) that can be handled by a HANA server, we have chosen the following approach.
Like in the sizing algorithms for SAP BW and SAP BWA which have been implemented in SAP QuickSizer (for details please refer to service.sap.com/quicksizer) we assume that HANA queries can be divided into three categories ("easy", "medium", "heavy"), which differ in the amount of CPU resources that they require. Typically, "medium" queries use twice as much resources as "easy" ones, while "heavy" queries require ten times as much resourses.
Furthermore, we assume that HANA users can be devided into three categories ("sporadic", "normal", "expert"). The user categories are defined by the frequency of query execution and the mix of queries from different categories. "Sporadic" users typically execute one query per hour, and run 80% "easy" queries and 20% "medium" queries; "normal" users execute 11 queries per hour, and run 50% "easy" queries and 50% "medium" queries; and "expert" users execute 33 queries per hour, and run 100% "heavy" queries. Together with a default distribution on the user categories (70% sporadic, 25% normal, 5% expert) and results from multiuser load tests on certified hardware an average resource requirement of currently 0.2 cores per user has been determined.
The components of SAP HANA and of the SAP In-Memory Database can only be installed by certified hardware partners on validated hardware running a specific operating system. Any other system or content developed with such systems is not supported by SAP. For further information please refer to the information page of the product version. Support Package Stacks (SPS) can be downloaded and applied to appliances according to agreements with the respective hardware partner.
CSS/CSN components for SAP In-Memory Database and SAP HANA
- BC-HAN: SAP High-Performance Analytic Appliance (SAP HANA)
- BC-HAN-MOD: SAP High-Performance Analytic Appliance Modeler
- BC-HAN-LOA: Load Controler
- BC-HAN-REP: Sybase Replication Server
- BC-DB-HDB: SAP In-Memory Computing Engine
- BC-DB-HDB-DBA: Database Administration for HDB
- BC-DB-HDB-INS: Installation HDB
- BC-DB-HDB-PER: Database Persistence for HDB
- BC-DB-HDB-SYS: Database Interface/DBMS for HDB
- BC-DB-HDB-UPG: Upgrade HDB
- BC-DB-HDB-ENG: SAP In-Memory Computing Engine
- BC-DB-HDB-MDX: MDX Engine/Excel Client