The purpose of this post is to identify a number of behaviors and assumptions which may contribute to suboptimal Power BI projects and overall Power BI environments. A few symptoms resulting from these ‘warning signs’ include inefficient resource utilization, higher costs, reduced performance, reliability issues (e.g. data refresh failures), greater operational support or maintenance, version control issues, a confused and/or frustrated user base, longer delivery timelines, etc.
This is a very broad topic and Power BI has expanded to support many more use cases so I’m just calling out a few of the top, higher level items that I’ve seen in larger deployments. Based on the scale and goals of your Power BI deployment, along with your organization’s policies, available resources, and other factors, these items may be more or less impactful to you.
A new dataset is created for each new report
One of the first metrics I look at when joining a project is the count of reports per dataset. In many cases, I find that this number is very close to 1.0. In other words, whenever a new report is created, a corresponding dataset is created – each PBIX file contains both the dataset and the report.
This is less of an issue if the dataset reflects a connection to an Analysis Services data model but it’s a massive concern if the dataset is actually a data model itself complete with its own source queries, relationships, metrics, etc. If there are import mode tables in the dataset, you could quickly use up memory or run into refresh concurrency limitations refreshing all the datasets. Even if DirectQuery is used for all tables you still have version control issues with each dataset needing to be maintained with new or different calculations, security, etc.
Why does this happen? Here are some of the main reasons:
- The developer or business analyst was assigned the task of building reports and existing datasets/models containing the required data or logic didn’t exist.
- Power BI Desktop is primarily referred to as a ‘report authoring tool’ even though it’s also a data modeling tool. So it can be a bit counter-intuitive to only build a dataset/data model in Power BI Desktop and then only build a live connection report in Power BI Desktop.
- The ability to leverage a Power BI dataset across workspaces hasn’t existed (shared datasets are still in preview) and thus it became common or expected to just build a new dataset for each new workspace.
Before you start building a new Power BI dataset, particularly one that imports data from sources, you should A) at least confirm that a dataset/data model doesn’t already exist with the required data and logic and B) if a new dataset is required, try to think more broadly about how your dataset will support multiple reports.
Power BI Artifact Types Misunderstood
It’s one thing to have limited experience with the primary programming languages of Power BI (DAX, Power Query (M)) but a lack of clarity around the fundamental Power BI artifacts and how they relate to one another is much more detrimental to success with Power BI.
If you’re familiar at all with relational databases, imagine if many users and/or developers had the ability to create database objects and these users didn’t understand the differences between tables, views, functions, and stored procedures. Obviously resource and performance issues (and chaos generally) would arise and you’d surely seek to ensure that only users with a basic knowledge of these object types would be allowed to create anything in the database.
You might draw up a simple vertical diagram for each Power BI that shows the relationships between the artifacts from dashboards at the top to datasets or dataflows at the bottom. For example you might have a single dataset which serves as the source for 5 reports which in turn support 2 summary level dashboards. Additionally, your dataset may be supported by one or several dataflows. This simple exercise forces you to explain why you are (or not) using the different artifact types and to consider the dependencies.
In Power BI, users and other stakeholders may refer to everything as a ‘dashboard’ even though Power BI dashboards are hardly used at all throughout the organization. It may be completely harmless if end users refer to what they’re seeing on their screens as ‘dashboards’ but it’s critical that the teams that design and develop Power BI solutions have a crystal clear understanding of the artifact types and thus use them as they’re intended to be used. Without this understanding, you could end up with one report per dataset as described earlier and you might not be taking advantage of other artifacts like dashboards or dataflows.
Overly Broad User Classifications
It might be tempting to classify users in the organizations into only two segments or personas such as ‘end users’ and ‘creators’. You might logically reason that ‘creators’ will be assigned pro licenses and be trained to develop and publish content while ‘end users’ will be trained on how to consume and access content.
This simple binary distinction may be appropriate when you’re first getting started with Power BI but I’d suggest a bit more granularity reflecting the significantly different skills, features, and complexity associated with developing different kinds of Power BI content. At a minimum, I split the creators into ‘Report Authors’ and ‘Data Modelers’ with the report authors learning to build visually rich and intuitive user experiences based on the robust, secure, and performant datasets created by the data modelers.
The report author doesn’t need to learn much about DAX or dimensional modeling and likewise the data modeler doesn’t need to learn much about data visualization best practices, bookmarks, or user exploration/navigation. If some level of self-service data transformation is part of your Power BI environment, you might consider a third type of user (and associated training) revolving around Power BI dataflows and Power Query (M).
Limited Group Access and Identity Management
I’ll often see a large, manually maintained list of individual users granted access to a Power BI app. The underlying dataset(s) within the workspace published as an app don’t implement row-level security and thus the app permission list is the only access layer and it’s the responsibility of a few users in the workspace to regularly add or remove users from the app.
Obviously there are productivity issues and data risks with this practice but the workspace team publishing the app may not have any idea if a group exists with the required users or if/how such a group can be created to replace the list of individual identities.
Another issue I see is a lack of distinction between the different types of groups (distribution lists, O365 groups, security groups, mail-enabled security groups) and the implications of these differences. Some group types can be used for certain permissions and features in Power BI but not others and thus existing groups, such as distribution lists, may need to be revised to mail-enabled security groups to be utilized by certain Power BI features or settings.
This alone is a big topic so I’ll just close this warning sign with three considerations:
- Identity the team responsible for identity management and/or Azure Activity Directory and partner with them on the creation and management of groups to support Power BI
- Identify existing lists of user identities that should be converted to groups
- Consider requiring row-level security roles in datasets, perhaps as part of certified datasets, for production solutions.
- In other words, don’t exclusively rely on app access for permission.
Flying Blind at the Organizational Level
Power BI teams often have in-depth knowledge of specific projects they’ve been working on but sometimes don’t see the bigger picture of overall Power BI adoption, usage, and resource utilization for their organization. For example, someone on the Power BI team may utilize usage metrics for one or a few specific workspaces but not have any idea how much Power BI Premium capacity has been provisioned, how it’s allocated across workloads and configured, or what reports and dashboards are most popular for the entire organization.
With the Power BI Premium Metrics app, the O365 audit log (and Power BI activity log), and the Power BI Admin role there’s really not a good reason to be completely blind to what’s going inside a given Power BI environment. If you don’t have a basic monitoring solution in place, any known Power BI admin to help you, or any idea on the resources (e.g. pro licenses, premium v-cores) available you might want to pause your Power BI development activities until you get these items in order.
Power BI as ‘Visualization Tool’ Only
Power BI ‘can’ be used exclusively as a data visualization tool but there’s a compelling case to utilize some combination of the broader set of capabilities within the platform including enterprise modeling, paginated reports, AI, and dataflows. As just one example, paginated reports has historically been its entirely separate BI product (SQL Server Reporting Services) but now is integrated within the Power BI service thus providing a single portal for both interactive and paginated report types.
If users and IT personnel aren’t well aware of the diverse capabilities of the platform they may simply assume Power BI can’t solve a problem. For example, certain business teams may be utilizing Alteryx for self-service data transformation processes and have no idea about Power Query (M) or dataflows and the benefits of integrating their data transformation processes within Power BI (and ADLS in the case of dataflows).
Another symptom and example of this overly narrow view is trying to force a Power BI interactive report to do something that is much better suited for a paginated report. The following article does a good job describing the considerations and requirements that would lead to a paginated report (.rdl) rather than an interactive report (.pbix).
Passive, Reactive Administration of Power BI
Let’s say you at least know who your primary Power BI admin is and you know the essentials of your environment ranging from licenses, premium capacity usage, activity logs, etc. Maybe once a week you check the Power BI Premium metrics app for performance as well maybe an on-premises data gateway monitoring solution. Assuming there are no major bottlenecks or issues identified, no action is taken.
In my experience, this more passive, reactive approach to administering Power BI might be sufficient for A) small deployments and B) exclusively top-down deployments in which the IT organization owns all solutions from end-to-end. However, when you have many ongoing Power BI projects and a mix of top-down and self-service or hybrid approaches admins need to be more proactive to avoid problems from surfacing before it’s too late. Power BI activity and resource usage can change very quickly.
For example, if you observe three 1GB datasets in the same Power BI workspace, each with a slightly different name, you might reach out to the dataset owner and/or team leader and ask why the datasets can’t be consolidated into a single dataset to support the required reports. Maybe you currently have ample memory in your premium capacity but this practice of duplicating datasets doesn’t scale so a policy with real enforcement/support from leadership may be needed.
Another example of a more active approach to administering Power BI would be regularly meeting with top Power BI teams to understand the issues they’re encountering, they’re plans for future projects, and to share what you and/or the BI team are planning or working on such as data models, revised governance policies, etc.
DAX is ‘the’ language of Power BI
If I had to pick one language for a Power BI project or perhaps when hiring a Power BI developer I’d easily pick DAX. However, the great majority of my experience in the development of Power BI solutions has involved more than DAX alone. Some level of SQL and/or Power Query (M) has almost always been required in addition to DAX and I’ve also sometimes needed custom expressions in SSRS or paginated reports, custom PowerShell scripts to support Analysis Services data models and admin work, and even some R, Python, MDX, and more.
Here are some consequences of the belief that DAX is the only language you need to learn:
- You’ll use DAX to create columns and tables which could and should be created upstream in the DW, via SQL, or view Power Query (M)
- Your data model(s) will thus contain logic that’s completely unknown to the data warehouse or engineering team
- Your data model(s) will thus consume more memory (if imported) and perform less
- You’ll isolate yourself from the broader set of capabilities in Power BI
- Rather than suggesting to solve a problem or to supplement a solution with relatively simple Power Query (M) expressions, paginated reports, a SQL view, etc you’ll constantly try to embed the required logic in a DAX measure, query, or model and this might not be the right tool for the scenario.
- You’ll isolate yourself from other data and IT teams such as the data science team, the data warehouse or platform team, the IT teams
- It’s difficult to quantify but there’s real value in being able to describe how a given language familiar to someone can be used in Power BI. For example, if you’re working with the data warehouse team on a DirectQuery dataset, being able to speak both DAX and SQL and relational database concepts is critical.
A near-future blog post might be titled something like ‘The Full Stack Power BI Pro’ to discuss this issue in much greater detail. In my view the single tool expert (e.g. DAX/SSAS guru) is increasingly being replaced by more well-rounded BI professionals/engineers.
The main point for now is just that, in general, a proficiency in Power Query (M) and/or SQL and possibly other languages offers great synergies with DAX and you’re likely limiting the value of your solutions and work by exclusively studying and developing DAX.
Force Power BI to Replicate other Platforms
Most large companies have more than one BI tool in operation (e.g. Cognos, MicroStrategy, Tableau, etc) and it’s often the intent or desire to migrate existing solutions from these platforms to Power BI. This seems very rational at a high level and indeed it’s increasingly technically feasible but it also can’t be overstated that there are very real differences between Power BI and whichever platform you’re coming from and this implies careful planning as well as acceptance among multiple stakeholder groups that not all features/behaviors in the current platform will be exactly replicated in the Power BI solutions.
If you take the naive approach that “I just want these 5 reports currently in platform ABC converted into 5 Power BI ‘dashboards'” you might end up with something resembling 5 Power BI dashboards but in a very poor, unsustainable architecture such as 5 distinct Power BI DirectQuery datasets, each supporting its own specific report.
When a user or developer describes a ‘requirement’ that exactly mirrors what they’re familiar with from the current platform you need to A) generalize this need into its larger concept or functionality family such as content distribution, operational reports, or user navigation experience and B) identify the features available in Power BI supporting this broader concept such as email subscriptions, paginated reports, or app navigation, respectively.
When a user or someone more loyal to the current platform points out the differences or ‘loss’ of functionality with Power BI you can probably easily point out a number of new, additional features or advantages available in the Power BI ecosystem as well as roadmap items that will further close any gaps with other platforms. The main point is that some level of change is to be expected and you could create real problems by trying to force Power BI to quickly and easily replicate the solutions that were built for other platforms.
One Size Fits All Solutions
Let’s say that a Power BI project for the sales organization was a major success. It may be appealing for project and team leaders to declare that this project represents the ultimate standard and template to be followed for all other projects and solutions.
It’s entirely possible that significant elements of the Power BI project for the sales organization can and should be leveraged across projects such as how security groups, sanctioned data sources, and shared datasets were created/utilized. However, other projects or business teams may entail completely different scenarios that require data sources not currently in the data warehouse, user experiences and content delivery outside of Power BI interactive reports, and some level of self-service development.
If you try to force the single project approach (e.g. top-down, IT owned) and a single type of reporting artifact or user experience onto other projects you could be missing out on the benefits of other features in the platform (e.g. Analyze in Excel) that were directly applicable to the given scenario. In a worst-case scenario the business team will lose confidence in Power BI and develop their own solution with whatever tools are available to them, typically Excel or maybe another BI tool in the organization.
The broader point is that standardization and policies on core issues (e.g data access, security) is always a good practice but some level of flexibility and open-mindedness about design and delivery approaches is necessary to accommodate the diverse needs and use cases that arise.
Project Silos
You’ve likely heard of ‘data silos’ and this is essentially what I keep referring to when I mention the creation of an excessive number of datasets. However, a similar and correlated issue to be aware of are ‘project silos’ in which all design and development work is completely built from the ground up exclusively around the requirements of the given project.
In many scenarios a project is initiated and/or funded around a particular team or key stakeholders’ needs/requirements and thus the team or developer assigned to the project is ‘heads down’ focused on resolving these requirements. However, there’s a high probability that some of the requirements for Project A, such as customer analytics, overlap with the requirements for Project B and several future projects.
It should be obvious that there are real version control problems and inefficiencies with different project teams independently developing their own BI artifacts to cater to the specific wishes of a single project or project sponsor. You could have many different DAX metrics with slightly different names but similar or slightly different logic thus creating confusion as to which data model or set of DAX metrics (if any) can/should be used going forward.
Another behavior to call out related to project silos is the allocation of a premium capacity to one specific project. I’ve seen entire premium capacity nodes named after a specific project which only required a few small datasets and didn’t even go very far in terms of user adoption/usage. As you know, premium capacity is not cheap and thus you want to be very thoughtful about how you can consolidate many different Power BI solutions or teams to maximize the usage of available resources (e.g. using 20 out of 25GB of RAM) of a premium capacity while still providing enough resources for acceptable scale and performance.
Rewarding Suboptimal Behaviors
Let’s say that one Power BI developer, Jack, hacks together three terrible Power BI datasets but develops three gorgeous, interactive reports on top of these models (which somehow temporarily perform and return reasonable results) and publishes them to a large group of users. Let’s say that another Power BI developer, Hannah, builds a robust Power BI dataset complete with rich metadata, RLS role security definitions, performance testing, etc which is capable of supporting all of Jack’s reports and many more but she hasn’t yet created or published any reports.
It’s entirely possible that Jack will be viewed more favorably than Hannah even though, generally speaking, Hannah has produced more long term value. Obviously this is short sighted and unjust but BI teams are often pressured into delivering ‘something of value’ to the business rapidly and Power BI makes it possible (sometimes very easy) to churn out impressive data visualizations even with very minimal underlying data modeling. A better outcome might be to assign Jack the role of ‘report author’ and Hannah the role of ‘data modeler’ and use role-specific criteria to evaluate their performance rather than the generic ‘Power BI developer’ role.
Business teams and sometimes individuals within the BI/Analytics department may not fully understand or know how to identify solution quality or sustainability issues, the role of datasets in Power BI architectures, etc so this is something to be mindful of.
Everyone Gets a Pro License
Let’s say you’ve determined that some level of self-service will be part of your Power BI environment going forward. To implement this, maybe you’ve provisioned a premium capacity dedicated to self-service solutions to isolate this content from solutions developed and maintained by the BI/Analytics and DW teams. This makes sense but it doesn’t make much sense to give every business user who might possibly want to create or edit content a Power BI Pro license.
In one project I noticed that 770+ pro licenses were assigned to business users and the great majority of these users had not created anything in Power BI Desktop, much less published to the Power BI service. This was still a small fraction of the overall employee base but still posed an unnecessary expense and governance issue.
Not everyone on a given team or within a department that creates Power BI content needs to be assigned a Power BI pro license. A user with a free license can still contribute to projects via Power BI Desktop (locally), the Viewer role (given Premium capacity), and through other supporting tools and tasks. It’s much more manageable to identify the few users on a team or department are that will be considered the ‘power users’ for their given team/dept and only assign these users the pro licenses. As Power BI is rolled out to more teams/departments, you’ll have a much smaller group of pro users to work with in terms of various training and governance policies.
Marketing Confused for Technology
As you absorb various forms of information on Power BI from websites, blogs, books, tech events, etc you might ask a few basic questions such as “Who owns this information source or platform?” and “What are this person or organization’s primary goals?”
If the answers to these questions are A) Microsoft wholly owns or greatly influences this source or individual and B) Microsoft is primarily interested in earning profits for its shareholders then you should at least consider the possibility that what you’re being told or ‘learning’ is at least somewhat shaped to support a narrative or belief that serves Microsoft’s financial interests. Likewise, a presentation at a technology event by a Microsoft partner or consultant may be very informative and helpful but remember that some level of marketing is probably embedded, it’s not 100% technology.
Do you think you can just build a ‘Dashboard in a Day’ for your real-world scenarios? Do you think Power BI is ‘easy’ (just some DAX, right?) or just like an Office 365 app?
I assume you wouldn’t have found this blog if you were that naive but I’d also guess that many of you or key stakeholders in your organization hold certain views of Power BI that exclude important details such as limitations or complexities that don’t exactly get emphasized by various seemingly official information sources.
Let me be clear that that I’m not accusing anyone of any level of deceit. I use MS Docs and various other MS-owned information sources including presentations delivered at MS-controlled tech events in my projects. I’m simply saying, as I think many of my blog posts and those of others in the community have also expressed, that Power BI is by no means simple or easy and that the coordination of a broad set of people and technical skills are required.
Wrapping Up
It’s been quite a while since I wrote a long, text-heavy blog post so I thought I’d change things up a bit.If you’ve only started following this blog recently you may think I only write about Power BI administration and PowerShell so I intend to venture back into other areas soon and try to keep mixing up topics and blog styles to keep things fresh. If blogs like this are interesting you can subscribe to email notification via the Subscribe widget.
Peak Mastering Power BI sales?
According to the Amazon author portal last week was the highest selling week my books have had thus far (for the US at least). This is surprising given how long the books have been out and all the competition from other books and various free information sources. If you’re a new owner of one of my books I hope you find it useful and if you’re interested in a new edition or title I hope you’ll be patient – this is looking like Q4 at this point.
2 comments