Site icon Martin's Blog

Temporal Fact Tables

If you are a dimensional modeling purist, you may want to get your preferred drink of choice (make it a double) before you continue reading this post. It will definitely challenge your way of thinking, but not in a bad way…I hope.

The word “temporal” simply means “of or relating to time“, and labeling a fact table in such a way may seem redundant or strange…or both. This term however describes a very specific implementation of a fact table with characteristics of a type-2 dimension, as you will see below.

First some Credits

I attended a session at the PASS Summit, presented by Davide Mauri (blog; twitter). He aptly coined the term “Temporal Snapshot Fact Table” to describe this particular type of design approach, and he deserves a lot of credit for having the courage to present such a difficult topic. It is not just a difficult topic to conceptualize and grasp, but also to present.

I left that session wondering if this topic is really well understood by BI professionals, mainly because I found very little (if any) information or examples out there. Hence the reason for this blog post…

I have been involved in a similar type of design implementation for the past 2 years, and hopefully this post will provide some clear information to help you identify when and how to use this kind of fact table. Note that I have omitted the word “Snapshot” from the title…which is probably the only difference of opinion I have to Davide’s given title. In my opinion this is not a snapshot table, because the fact record is not only written once but updated as many times as necessary.

Setting the Scene

Facts are essentially transactions which happen at a specific point in time. They usually do not change after being recorded…I know there are exceptions, but in most cases it will be a simple update if any.

But what if you do not have traditional facts? What if you have to track certain properties of a “dimension” over time which happens to have factual attributes? What if the changes to those properties are too volatile to incorporate them in the dimension table…or even a snapshot fact table?

Let me explain in more detail…

I am involved in a price modeling project for a large retailer. The “pricing tool” allows the business to create rules for certain categories of products, or any level of the product hierarchy to be more precise. These complex rules are then used to recommend what they should be asking for their products. Sales data is not relevant here…we are simply applying different levels of rules to determine what we should be asking for the products.

In order to apply all the relevant rules, we need to track certain attributes of the products over time. Some of the attributes we need to track are:

The typical question we need to answer is: “What are all the attributes of a product on any given day?”.

Up to this point you should still be saying: “Type-2 dimension!!!“…so let’s assume we’ve decided to track all of the above in a type-2 product dimension.

The following will make us think twice about the choice we’ve just made:

A single product dimension would be a really bad choice:

The next argument is that we should have separate dimensions for products, stores/regions and coupons. That should really have been our first course of action, but for the sake of clarification I started with a simplified (and most probably incorrect) approach.

Creating our separate dimensions, we end up with something similar to the following…

Dimensions

Our initial design for the dimensions look good and uncle Ralph will be proud, but we still have to tie it all together somehow. And this is where everything becomes interesting really fast. The first “spanner in the works” is that we don’t have traditional facts.

If we did (in the case of product sales for instance), the typical star schema would have been relatively straight forward:

Typical Star Schema

But unfortunately we do not have a transaction that happens at a point in time. Instead, we are dealing with volatile attributes which are changing rapidly over time.

A solution that comes to mind is a periodic snapshot fact table, but again we will have to deal with too many records. For a snapshot fact table to work in this scenario we will have to take daily snapshots for current and future changes, as well as keep the history in case they want to model on historic attributes.

The snapshot fact table will have a similar outcome to the very first approach where we attempted to track all the changes in a single type-2 dimension. There will be too many records to make it a feasible option.

The Solution

The solution to all of this is astoundingly simple…if you are prepared to step aside from the beaten track for a little while. If we add effective and termination dates to our fact table, we will be able to satisfy all of the requirements.

It will give us the ability to track all attribute changes over time, and we should be able to return the given attributes of any product on any given day. Record counts would also be manageable because we are only generating new records when something changes.

The star-schema design will look something like this:

Temporal Fact Table Solution

You will notice that the coupon dimension has disappeared from the above schema. In our case at least, it turned out that we were only interested in the measures (i.e. coupon amount) originally contained within that dimension. Depending on your specific requirements, you may want to keep it as a separate dimension…be aware though that something like a coupon dimension could be a monster dimension with potentially hundreds of millions of records.

The temporal fact table almost seems too simple to be a viable solution. It is nothing more than a fact table with the characteristics of a type-2 dimension, more specifically the inclusion of effective and termination dates. It works like a dream when used appropriately, and provides a sustainable solution to this type of business requirement.

As with many other things in life though, there is of course a price to pay when choosing a temporal fact table as part of your design:

Conclusion

As BI environments evolve, we may just end up seeing more of this “hybrid-type” design when traditional facts are absent and attributes are too volatile for snapshots.

Hopefully the scenario and explanation above will provide some food for thought…even for the purists at heart, which I am actually one of. Use this approach with caution…you will burn your fingers (and maybe more) if implemented in the wrong type of situation.

Additional Resources

Dimensions: General Information; Slowly Changing Dimensions; Degenerate Dimensions; Monster Dimensions
Fact Tables: General Information
Bridge Tables: General Information

Exit mobile version