Showing posts with label schema. Show all posts
Showing posts with label schema. Show all posts

Sunday, August 19, 2018

What is Inferred dimension and Why it is important

Inferred Dimension, also referred to as Early coming Fact or Late coming dimension. These all are different names of same scenario.

If the source systems of your Data warehouse are very well designed, you should never face this scenario. But, sometimes it would happen that the fact data is available in the source but related dimension data does not yet exists.

Consider that Online Movie Ticketing business scenario. Suppose a sales person sold some bulk movie tickets to an organisation at some discount. He provided a discount with a new promotion code, which he is supposed to punch in the system but forgot. When the ETL would process the data it will find that the promo code is not available.

There are 2 ways to handle this scenario:

  • 1st one is to mark the FKey as unknown. This is the simplest option but these sales records could not be used for the reporting purpose unless the dimension record arises or the fact is updated accordingly. Which is an ETL overhead altogether.
  • The other option is to use Inferred Dimension. For any such record we could create an entry in the dimension table and mark it inferred as shown in the image below.
Inferred Dimension
Using this method, our fact record could be used for reporting and a better picture would be generated. At a later load when the promo code details would be available, the dimension record could be updated automatically, depending on what type of Slowly Changing Dimension it is.

Sunday, August 5, 2018

Fact Constellation schema in Data warehousing

Fact Constellation: Fact Constellation allows you to connect multiple facts to dimensions. It is combination of multiple stars connected with each other using dimension, hence called Galaxy or Fact Constellation schema.

As compared to Star and snowflake it is used for more sophisticated application. The dimension tables could be normalized or not as per the requirement.

If you do not understand the basic concepts of data warehousing, please visit my previous article.

If we consider the data warehousing scenario detailed here, below is how is galaxy or fact constellation schema would look like.


Fact Constellation or Galaxy Schema

Characteristics of Fact Constellation schema:

  • Complex design
  • Different variants of aggregations
  •  Suitable for more sophisticate application 

Wednesday, August 1, 2018

Star Schema - The simplest of all

Star Schema : It is simplest data warehouse schema. It is called star because it resembles the star with points radiating out of the center. Fact tables is the center of the star and dimensions sit at the points.
Usually the fact table in the star schema is in 3rd normal form and dimensions are usually de-normalized. It is most simplest and most commonly used schema.

If you do not understand the basic concepts of data warehousing, please visit my previous article.

If we consider the data warehousing scenario detailed here, below is how a star schema would look like.
Star Schema

Although, star schema require just one fact table at the center, but you could surely have multiple stars in your design.

Characteristics of  star schema

  • Simplest design
  • Better query performance.
  • A little extra time to load the data into dimension because of de-normalization.
  • More space because of redundancy in dimensions.
Click here to learn more about different types of schema.
If you liked the post, please share it.
Subscribe to your email and receive new articles on your email