A schema is a model that describes the structure of an entire database, usually shown as a diagram. It does this by establishing the relationships between the entities that make up the database. In data warehousing, a multidimensional data model addresses the distinctive needs of very large databases that are used for analysis. There are three main types of multidimensional schema: star schema, snowflake schema, and galaxy schema. Of these, the star and snowflake schemas tend to be the most popular choices for a data warehouse.
Snowflake Schema vs Star Schema
| Star Schema | Snowflake Schema |
| --- | --- |
| The fact table is connected to several dimension tables. | The fact table is connected to dimension tables, which are in turn connected to sub-dimension tables. |
| Data redundancy increases because the dimension data is not normalized. | Data normalization reduces data redundancy. |
| Easy to design and understand. | Harder to design and understand. |
| It is a top-down model. | It is a bottom-up model. |
| It requires more space. | It uses less space. |
| Queries are less complex. | Queries are more complex. |
| Queries take less time to run. | Queries take more time to run. |
| Fewer joins. | More joins. |
What is Star Schema?
The star schema, or star join schema, is one of the simplest data warehouse schemas. It is a data warehousing model whose diagram resembles a star. A fact table sits at the centre of the star, and the dimension tables associated with the fact table make up its arms. The data for every level of a dimension is stored in that dimension's single table. In most cases the fact table is in third normal form, while the dimension tables are denormalized.
Characteristics of Star Schema
- Each dimension in a star schema is represented by a single dimension table.
- All dimension tables are connected to the fact table but not with each other.
- The fact table contains keys and measures. Each dimension table contains a group of attributes.
- The dimension tables are not normalized.
- The star schema is easy to understand due to its uncomplicated structure.
- Because fewer joins are needed, queries run faster.
- Calculations and aggregations such as total sales or total revenue are quick to compute with a star schema.
- Aggregates can be grouped within a query, for example by a dimension attribute, to suit your needs.
- The star schema is very popular, and several business intelligence tools support it.
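The characteristics above can be sketched with Python's built-in `sqlite3` module. This is only a minimal illustration: the table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are all assumptions made up for the example, not part of any particular warehouse.

```python
import sqlite3

# In-memory database; every table, column, and row here is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_id    INTEGER PRIMARY KEY,
        product_name  TEXT,
        category_name TEXT          -- denormalized: repeated for every product
    );
    CREATE TABLE dim_date (
        date_id INTEGER PRIMARY KEY,
        year    INTEGER,
        month   INTEGER
    );
    CREATE TABLE fact_sales (       -- keys pointing at the dimensions, plus measures
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        quantity   INTEGER,
        revenue    REAL
    );
""")
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Laptop", "Electronics"),
    (2, "Phone", "Electronics"),
    (3, "Desk", "Furniture"),
])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (20240101, 2024, 1),
    (20240201, 2024, 2),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (1, 20240101, 2, 2000.0),
    (2, 20240101, 1, 500.0),
    (3, 20240201, 4, 800.0),
    (1, 20240201, 1, 1000.0),
])

# Total revenue per category: a single join from the fact table to one dimension.
star_totals = conn.execute("""
    SELECT p.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category_name
    ORDER BY p.category_name
""").fetchall()
print(star_totals)  # [('Electronics', 3500.0), ('Furniture', 800.0)]
```

Note that `category_name` is stored once per product row. That repetition is the redundancy a star schema accepts in exchange for needing only one join per dimension.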
What is Snowflake Schema?
A snowflake schema is an extension of the star schema in which the dimension tables are normalized. There is a central fact table, which branches out into several dimension tables. These dimension tables are then normalized into sub-dimension tables, so the resulting diagram looks like a snowflake. Because the data is normalized, a single dimension can be spread across more than one table. This model is advantageous when you work with large volumes of data, such as customer details, the calendars of various companies, or the financial products of banks and insurance companies.
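As a rough illustration of how a dimension table is normalized into a sub-dimension table, the sketch below splits a hypothetical product dimension into a product table and a category table. The function name `snowflake_product_dimension`, the field names, and the rows are assumptions invented for this example.

```python
# Denormalized product dimension, as a star schema would store it.
# The rows and field names are made up for illustration.
star_dim_product = [
    {"product_id": 1, "product_name": "Laptop", "category_name": "Electronics"},
    {"product_id": 2, "product_name": "Phone", "category_name": "Electronics"},
    {"product_id": 3, "product_name": "Desk", "category_name": "Furniture"},
]


def snowflake_product_dimension(rows):
    """Split the category attribute out into its own sub-dimension table."""
    category_ids = {}   # category_name -> generated key
    dim_category = []   # sub-dimension table: one row per distinct category
    dim_product = []    # dimension table: now refers to the category by key
    for row in rows:
        name = row["category_name"]
        if name not in category_ids:
            category_ids[name] = len(category_ids) + 1
            dim_category.append(
                {"category_id": category_ids[name], "category_name": name}
            )
        dim_product.append({
            "product_id": row["product_id"],
            "product_name": row["product_name"],
            "category_id": category_ids[name],
        })
    return dim_product, dim_category


dim_product, dim_category = snowflake_product_dimension(star_dim_product)
print(dim_category)
print(dim_product)
```

After the split, each category name is stored once in `dim_category` instead of once per product, which is where the reduced redundancy comes from.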
Characteristics of Snowflake Schema
- At least one of the dimension tables is normalized.
- It is easier to add dimensions to a snowflake schema.
- Most dimension tables that have an associated sub-dimension table have null values for attributes.
- Attributes can be queried independently.
- Snowflake schema uses disk space efficiently.
- Query performance may be diminished due to multiple tables.
- The queries become complex due to a large number of joins.
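The trade-off described in this list can be sketched with Python's built-in `sqlite3` module. The example is a minimal illustration under assumed names: `fact_sales`, `dim_product`, `dim_category`, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database; every table, column, and row here is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_category (     -- sub-dimension: each category stored once
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT
    );
    CREATE TABLE dim_product (      -- normalized dimension: refers to the category by key
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id  INTEGER REFERENCES dim_category(category_id)
    );
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        revenue    REAL
    );
""")
conn.executemany("INSERT INTO dim_category VALUES (?, ?)", [
    (1, "Electronics"),
    (2, "Furniture"),
])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Laptop", 1),
    (2, "Phone", 1),
    (3, "Desk", 2),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", [
    (1, 2000.0),
    (2, 500.0),
    (3, 800.0),
    (1, 1000.0),
])

# Total revenue per category now needs two joins: fact -> product -> category.
snowflake_totals = conn.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_id  = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(snowflake_totals)  # [('Electronics', 3500.0), ('Furniture', 800.0)]

# The category attribute can also be queried on its own, without touching products.
category_names = [
    name for (name,) in conn.execute(
        "SELECT category_name FROM dim_category ORDER BY category_id"
    )
]
print(category_names)  # ['Electronics', 'Furniture']
```

The extra join is the cost of normalization. In return, each category name is stored only once and can be listed or updated independently of the product rows.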
Conclusion
A star schema provides a simple form of data warehousing, while a snowflake schema distributes the data in more detail across normalized tables. This is especially helpful if you are working with large sets of data. The reduced redundancy in a snowflake schema also makes the stored data easier to maintain over a long period of time. However, if you want simpler queries that take less time to run, a star schema is more suitable. Both are useful for analytical purposes, and each has its own merits and demerits.