Sunday, August 19, 2018

What is Inferred dimension and Why it is important

Inferred Dimension, also referred to as Early coming Fact or Late coming dimension. These all are different names of same scenario.

If the source systems of your Data warehouse are very well designed, you should never face this scenario. But, sometimes it would happen that the fact data is available in the source but related dimension data does not yet exists.

Consider that Online Movie Ticketing business scenario. Suppose a sales person sold some bulk movie tickets to an organisation at some discount. He provided a discount with a new promotion code, which he is supposed to punch in the system but forgot. When the ETL would process the data it will find that the promo code is not available.

There are 2 ways to handle this scenario:

  • 1st one is to mark the FKey as unknown. This is the simplest option but these sales records could not be used for the reporting purpose unless the dimension record arises or the fact is updated accordingly. Which is an ETL overhead altogether.
  • The other option is to use Inferred Dimension. For any such record we could create an entry in the dimension table and mark it inferred as shown in the image below.
Inferred Dimension
Using this method, our fact record could be used for reporting and a better picture would be generated. At a later load when the promo code details would be available, the dimension record could be updated automatically, depending on what type of Slowly Changing Dimension it is.

No comments:

Post a Comment

If you liked the post, please share it.
Subscribe to your email and receive new articles on your email