by Julia Gusman, Managing Consultant
Why I care (and you should too)
Working mostly with visual tools (Power BI, Tableau), I often get involved in the final stage of BI projects when the data has been cleansed and transformed and a data warehouse already exists. I often hear from clients “Our IT team has built a data warehouse, everything is there, now we just need to point Power BI / Tableau / etc. at it and create some reports.” Oh boy.
Luckily, having a solid understanding of Dimensional Modelling allows me to quickly see which analysis I can do in Power BI or Tableau and when it is time to extend the data model.
Is Dimensional Modelling Relevant Today?
Dimensional Modelling has been developed by Ralf Kimball in 1996.
Wait, 26 years ago? Is it even relevant today given all the technological advancements like cloud computing, artificial intelligence, big data, machine learning, etc.?
Absolutely! That is the best part. It is and will be fundamental for a long time. If you ask me, that is one of the best skills to acquire if you want to get into, or future-proof your existing skills in, BI and analytics. It does not require learning any coding language or software which will inevitably become obsolete over time (hello Pascal, APL, Microsoft Access, where are you?). Dimensional Modelling is a method for organising data for analysis. It is applicable no matter what technology the data is stored in (on-premises DB, cloud, Excel) and which software is used to analyse it (Tableau, Power BI, etc.).
Do not just take my word for it, check out this article from Microsoft, explaining why Dimensional Modelling is important in Power BI.
Dimensional Modelling is all about business context and yet, I rarely meet business analysts with a solid understanding of it. The result? Projects that build ‘technically’ correct Data Warehouses but end up not answering critical business questions. Yes, it is again about bridging that gap between business and IT.
Let me give you an example
Let us say you work for a university, and you want to analyse the student application process. The same logic would apply if you were looking at insurance claims or IT support tickets, any situation where “things” (e.g., applications, claims, tickets) progress through various stages (submitted, approved, processed, closed, etc.).
Your IT team might have built a dimensional model like the one below. This is a perfectly correct star schema, from a technical perspective.
Using the above model, it will be easy to answer the question: How many applications are currently in which status? All we need to do is sum up the Application Count from the Fact Daily Applications table, filter for today’s date from the Dim Date table and group by Status Description from the Dim Status table. In Power BI you would connect the tables in the model view, drag and drop the fields on the visual and add a filter for today. You could also connect directly to the database and run a SQL query:
Select sum(“Application Count”), “Status Description”
from “Fact Daily Applications” f
inner join “Dim Date” d on f.”Date Key” = d.”Date Key”
inner join “Dim Status” s on f.”Status Key” = s.”Status Key”
where d.Date = today()
group by “Status Description”
Now let us assume we want to find out how long it takes on average, to progress an application form submitted to approval. For each application, you will have to find the date it was in the status ‘submitted,’ find the date it was in the status ‘approved’ and then the time between the dates. Not so straightforward. What you need in this case, is a different model.
Notice, that now we are capturing the status dates and the time it takes for the application to move from one status to the next in the same table. We can easily report on how long the processes are taking and identify any protentional bottlenecks.
Where to from here?
In our two-day course which is completely technology agnostic (you can use pen and paper or PowerPoint or anything else to draw the diagrams), we cover the process of going from the business question to the data model (star schema). You will learn simple steps and standard solutions to common problems as well as get a chance to practice new skills.
Connect with us if you’d like to learn more about Dimensional Modelling and setting up the perfect Data Warehouse.