Today’s post is all about laying the groundwork for some dynamic ranking reporting and analysis using the DAX programming language. From the starting point of an SSAS Tabular solution using the sample AdventureWorks 2014 database, I’ve built a DAX query that returns a Top 10 list of customers along with two attributes (Year, Product Category) and five measures. Today we’ll disassemble this query into its components, explain why the query is structured as it is, and briefly look ahead to parameterizing various elements of the query in future reports.
Why does this matter?
1. Because information workers need the ability to think and ask questions of data ‘in the moment’. Their questions are always changing and they want to maximize the value of their time by focusing on the Top X (events, customers, products, etc).
2. Because business intelligence professionals need solutions that can adapt to changing analytical needs, scale across large volumes of data, and yet still be reasonably simple to manage/administer.
So there’s always a balance between analytical richness, scalability, and simplicity. In general, increased scalability and simplicity comes at the expense of analytical richness and vice versa but by following certain patterns/practices it’s possible to achieve a significant level of both.
*Click the image to zoom
The Query’s Structure
• Define Measures
In SSAS Tabular there’s no performance loss from defining measures scoped locally to the query. Additionally the local query measures help keep the underlying SSAS Tabular model more user friendly as the various intermediate calculations are not exposed in self-service applications such as Excel or Power View. Furthermore the explicit definition of the measures in the query benefits the management and administration of the solution.
• Layers of Table Expressions
In this query I’ve nested a CalculateTable expression within a Summarize function, passed this table to a Filter function (which uses the defined measures), and finally applied a TopN table expression on top of the resulting table to produce the Top 10 table based on Sales (given the filter context).
At first glance this query may look complex and yes, some level of trial-and-error may be needed (I needed it anyway), but it’s the structure that gives the query its analytical flexibility and performance. From a performance or scalability standpoint, the CalculateTable being executed first is critical as this reduces the volume of fact table records that are scanned and evaluated in the subsequent filters. The idea here is to allow the in-memory storage engine with its ability to use multiple threads to handle the simple filters and expressions thereby reducing the workload on the formula engine which can handle more complex filters but may only use a single thread.
Many of the query’s arguments have fixed values (e.g. Category = “Bikes”) but these can easily be swapped out for parameters in SQL Server Reporting Services and via other methods in Excel Services and other tools. For example the user could select from a list of product categories, a list of Years, a minimum value for days since last purchase, and how many customers to show (Top X).
This query or ‘dataset’ is focused on customers but with very minimal change the query could be used to support other business entities represented in the SSAS tabular model such as products, departments, etc. New measures could be added for these queries, the group by columns in the Summarize function could be expanded or modified, the columns to filter in the CalculateTable could be changed, etc. These would all very minor modifications requiring little time or even significant knowledge of SSAS Tabular or DAX.
*DAX queries (and their parameters) against SSAS Tabular models can be implemented in the DMX editor of SQL Server Data Tools.
Next Sunday I’ll follow up on this DAX Query by implementing some parameters and visualization in SSRS. I’ll also look to leverage the query for other kinds of reports and may go a bit deeper into the table and measure expressions themselves though the point of this isn’t the calculations itself but rather the overall design that makes the solution dynamic.