Microsoft PowerPivot – Data Model

There are many challenges in building our own PowerPivot.  Most of them are related to data; we probably all heard about “Garbage in, garbage out”.  I am not going to talk about how to clean your data which is a huge subject to discuss.  Instead, I am going to show some basic tips on how to store data in PowerPivot.  We can consolidate all the data into one table or we can normalize the data and store them into many smaller tables.

One big simple table

If we store all data into one table; it can be huge.  A product name can be 20 times longer than a product ID which is a numeric field, e.g. “Contoso Laptop Cooling Hub notebook fan with 4 ports USB hub E80 Black” vs. "754”.  PowerPviot can hold a lot of data but it still has a limit, i.e. 4 GB, so we better make use of available memory while we can.

Normalization (Star Schema)

We can fully normalize our data into multiple small tables, then it would save a lot of memory space.  Normalization is a very common database design in a lot of business application.  However, there would be a performance issue; because the report may require information from many different tables and PowerPivot has to link them all before it can return the requested information.  It can take a long time to process.  We don’t want our CFO to waste his / her time in front of their computer; they want the information and they want it now!

Star Schema

The star schema is the most common data model that is used in PowerPivot.  It is called a star schema because the model looks like a star, with a big fact table at center and linked with multiple dimension tables.

image

We notice that columns from fact table are normally used in the Value section of PowerPivot Field List and columns from dimension tables are found in either Column Labels or Row Label sections.

image

Fact Table

A fact table usually has 2 sets of columns.  The fact columns(or business information) are typically numeric; e.g. TotalCost, TotalSales, and TotalPorfit.  The key columns are identification keys that are used to connected to dimension tables, e.g. SalesKey, DateKey, and ChannelKey.

A fact table normally contains a lot of details records. and can take up as much as 90% of storage.

Dimension Table

Dimension table provides meaning to fact table,  Each dimension table describes certain attributes of a particular business information; e.g. DimStores contains store information such as Name, Manager, Location etc.  They are normally descriptive, textual values and much compact than fact table.

Most dimension tables are denormalized; however, there are situations that you may want to normalize dimension tables; i.e. spilt the tables into multiple smaller tables.  The most typical situation is that if you have a dimension table and some of the columns are hardly used in any report, then you may want to take out those columns and put them in another table.  We can also normalize a table if some of the columns have a lot of empty cells.  Another example is that if there is different update behaviours of the columns, some can be updated by replacing the original value but some cannot be overwritten.

If we normalize the dimension tables, then we create a star schema which may slow down the performance.  It is something that we must make our judgement.

image

Data model design is often one of the most challenging tasks during a PowerPivot development; there is no universal design rule.  And it would change over time, so we have to monitor our PowerPivot and fine tune as it is necessary.  This may be the part that we want to involve an expert.

Andrew Chan is the owner and founder of ALG Inc.

We help you to make better and faster decisions!

About Andrew Chan
Andrew Chan is an Business Consultant who gives you accurate, consistent and timely information so that you can make better and faster decisions. He is an Associate of Society of Actuaries with over 20 years of IT experience. Apart from strong analytical skills and proven technical background, he was also a former system director at Manulife who had extensive project management experience. If you are looking for someone to gather, consolidate, validate, visualize and analyze data, look no further! Andrew can provide the most cost effective business analytics solution so that you can explore, optimize, predict and visualize your business. Don’t guess on any decision, no matter it is finance, operation, marketing or sales! Always ask for evidence!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: