Wednesday, July 11, 2018

Generic DateTime dimension in data warehouse

If you have worked on a data warehouse you would surely have a dimension for Date. It is usually called Dim_Date, Dim_Time or Dim_DateTime. Don't get confused by the use of word time, the table is usually create with grain of one day unless there are some very specific reporting requirements.
So what should be columns in this dimension and how it should be populated and maintained. From my experience in datawarehousing, i have created a generic dimension for date which works in almost all scenarios and is does not require any maintainance.
Below is how the structure looks like.

As you could notice apart from the first column, rest of the columns are Computed. So by using this approach we need not to create any ETL for it and hence no maintenance pain.
At the moment there are 27 columns in the table above but you can surely add more as needed.

All we need to do is populate the first column with a simple SQL query and rest of the columns will follow.

Below query inserts the dates in the table from year 2010 to year 2022. You could update the dates as per your need

declare @start_date datetime, @end_date datetime, @cur_date datetime
set @start_date = '2010-01-01'
set @end_date = '2022-12-31'
set @cur_date = @start_date
while @cur_date <= @end_date
begin
insert into [Dim_Date](Date)
select CONVERT(varchar, @cur_date) as Date
set @cur_date = dateadd(dd, 1, @cur_date) 
end

Formulas of the computed fields are available here.

No comments:

Post a Comment

If you liked the post, please share it.
Subscribe to your email and receive new articles on your email