Power BI Artifact to Workspace Relationships

As a Power BI admin, particularly when you’re responsible for a large deployment with many teams using the Power BI service in different ways, you want to be able to easily answer common questions about the current state of the deployment and any emerging trends. The Office 365 audit logs for Power BI activities, the Power BI Premium metrics app, data gateway monitoring, and the Power BI Management PowerShell commands are four good tools available to help you, in addition to the Power BI admin portal itself and the Power Platform Admin center.

However, with the possible exception of the metrics app, these tools are generally only starting points – the data they provide is of limited value without some level of customization such as a nightly ETL (extract-transform-load) process to store the audit log data into a database or integration with user and group profile data from Azure Active Directory.

In this blog post, I wanted to follow up on the Power BI Admin Datasets post from earlier this year and provide a sample PowerShell script that makes it possible to build a dataset containing the relationships between app workspaces and Power BI artifacts (reports, dashboards, datasets, data sources).

Artifact Data Model

A sample PowerShell script (.ps1) with the essential logic to support the following relationships in a Power BI dataset is available for download from my GitHub repository:

Power BI Artifact Relationships

Per the above image from the fairly new modeling view in Power BI Desktop, Datasets, Reports, Dashboards, and Data Sources each relate back to the app workspaces in which they’re contained, either directly or indirectly via a relationship to datasets. In this design, reports and data sources are related to workspaces via their relationships to datasets, which have a many-to-one relationship with workspaces. Dashboards are related directly to workspaces, and both data sources and reports have inactive relationships to workspaces that could be utilized via the USERELATIONSHIP() DAX function.

For this blog post and sample script I’ve left out additional logic and modeling that goes into Power BI Premium capacities and Azure AD groups. You can see in the above image that I have distinct diagrams created for each area (e.g. group owners and members to groups; capacity admins and capacity workloads to capacities).

So this isn’t a complex model in itself, but the critical WorkspaceID you need for the relationships is not returned by simply running the Get commands (Get-PowerBIReport, Get-PowerBIDataset, Get-PowerBIDashboard, Get-PowerBIDatasource). The sample script mentioned above handles this by iterating over the active workspaces and executing the given Get command for each workspace.
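As a minimal sketch of that iteration pattern (assuming the MicrosoftPowerBIMgmt module is installed and you’re authenticated as a service admin), here is how the workspace ID can be captured and attached to each report row:

```powershell
# Capture the workspace ID while iterating, since Get-PowerBIReport alone
# does not return it. Assumes an authenticated admin session established
# via Connect-PowerBIServiceAccount.
$Workspaces = Get-PowerBIWorkspace -Scope Organization -All |
    Where-Object { $_.Type -eq "Workspace" -and $_.State -eq "Active" }

$Reports = foreach ($Workspace in $Workspaces) {
    $WSID = $Workspace.Id
    Get-PowerBIReport -Scope Organization -WorkspaceId $WSID |
        Select-Object *, @{ n = "WorkspaceId"; e = { $WSID } }
}
```

The calculated `WorkspaceId` column added via Select-Object is what later supports the relationship back to the workspaces table.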

Artifact Sample Script

So let’s walk through each step of the sample script and provide some additional detail so you can consider using it as is or modifying it for your own purposes:

  1. Authenticate to Power BI
    • The credential of a Power BI service admin is required to use this script.
    • By creating a PowerShell credential and passing it to the Connect-PowerBIServiceAccount cmdlet, the script can execute unattended, such as in a nightly refresh job.
  2. Define output file paths.
    • This will be the directory location and file names to write the data to at the end of the script.
    • You’ll want to decide whether you’ll load a Power BI dataset directly from these files or whether you’ll add a job or package of some kind that loads these files into a database, allowing you to build your dataset on top of a database, which is preferable.
    • In my latest project, an SSIS package is used to load the files to a SQL Server database and a SQL Server Agent job runs this package once the PowerShell script has completed.
  3. Active Workspaces
    • Get all the workspaces for the organization (notice the -All parameter) but filter out the ‘My Workspaces’ and any workspace that isn’t in an ‘active’ state.
      • There definitely could be use cases for analyzing personal workspaces but I’m almost always focused on the group workspaces created for the purpose of sharing with others.
    • The app workspaces are loaded into a variable and this variable is used in the following steps in order to provide access to the workspace ID.
      • For all the tables exported I add a column to note when the data was retrieved.
  4. Datasets
    • A ForEach statement is used to iterate or loop over the individual workspaces in the workspaces object stored in a variable in the preceding step.
    • The $WSID variable is used to store the workspace ID of the given workspace member (think of it like a row of a table) being iterated.
    • The Where-Object cmdlet is used to filter out the Report Usage Metrics Model and Dashboard Usage Metrics Model datasets.
      • Just like excluding personal workspaces, I’m not interested in reporting on these datasets since they aren’t created by business users/teams as part of projects.
  5. Reports
    • The same ForEach approach over active workspaces is used but notice that only a few columns are selected and, after a Sort-Object command, a Get-Unique command is used to remove duplicates.
    • When users publish reports from workspaces to apps a copy of the report is created with a distinct report ID and URL. Therefore, without this approach of removing duplicate report names in the workspace, you’d be at risk of severely over-counting the number of reports published.
  6. Dashboards
    • Just like reports, given the copies created for each published app, it’s necessary to remove duplicate dashboard names.
  7. Datasources
    • Unlike datasets, reports, and dashboards, which were all retrieved by iterating over active workspaces, data sources are retrieved by iterating over each dataset in the datasets object created in step #4.
    • Notice that this object has both a workspace ID and a dataset ID, and both are added to the retrieved data sources object so there’s an option to create relationships to both workspaces and datasets.
  8. Export Objects to Files
    • Finally, the PowerShell objects representing workspaces, datasets, reports, dashboards, and data sources are all exported out to distinct CSV files to support downstream reporting.
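The eight steps above can be sketched in condensed form as follows. This is a minimal sketch rather than the full downloadable script: the output directory and credential handling are placeholder assumptions, and the selected columns are abbreviated.

```powershell
# Condensed sketch of steps 1-8. Requires the MicrosoftPowerBIMgmt module.

# 1. Authenticate with a Power BI service admin credential. For unattended
#    (nightly) runs, build the credential from a securely stored secret
#    instead of the interactive prompt used here.
$Credential = Get-Credential
Connect-PowerBIServiceAccount -Credential $Credential | Out-Null

# 2. Output location (placeholder path) and retrieval timestamp.
$OutDir = "C:\PowerBIAdmin"
$RetrievedDate = Get-Date

# 3. Active group workspaces only (exclude personal and deleted workspaces).
$Workspaces = Get-PowerBIWorkspace -Scope Organization -All |
    Where-Object { $_.Type -eq "Workspace" -and $_.State -eq "Active" } |
    Select-Object *, @{ n = "RetrievedDate"; e = { $RetrievedDate } }

# 4. Datasets per workspace, excluding the usage metrics models.
$Datasets = foreach ($Workspace in $Workspaces) {
    $WSID = $Workspace.Id
    Get-PowerBIDataset -Scope Organization -WorkspaceId $WSID |
        Where-Object { $_.Name -notin "Report Usage Metrics Model", "Dashboard Usage Metrics Model" } |
        Select-Object *, @{ n = "WorkspaceId"; e = { $WSID } },
                         @{ n = "RetrievedDate"; e = { $RetrievedDate } }
}

# 5. Reports per workspace; dedupe by name to avoid counting app copies.
$Reports = foreach ($Workspace in $Workspaces) {
    $WSID = $Workspace.Id
    Get-PowerBIReport -Scope Organization -WorkspaceId $WSID |
        Select-Object Name, DatasetId, @{ n = "WorkspaceId"; e = { $WSID } } |
        Sort-Object Name -Unique   # the full script uses Sort-Object + Get-Unique
}

# 6. Dashboards per workspace; same dedupe approach.
$Dashboards = foreach ($Workspace in $Workspaces) {
    $WSID = $Workspace.Id
    Get-PowerBIDashboard -Scope Organization -WorkspaceId $WSID |
        Select-Object Name, @{ n = "WorkspaceId"; e = { $WSID } } |
        Sort-Object Name -Unique
}

# 7. Data sources: iterate the datasets object, carrying both IDs forward.
$Datasources = foreach ($Dataset in $Datasets) {
    $DSID = $Dataset.Id
    $WSID = $Dataset.WorkspaceId
    Get-PowerBIDatasource -Scope Organization -DatasetId $DSID |
        Select-Object *, @{ n = "DatasetId"; e = { $DSID } },
                         @{ n = "WorkspaceId"; e = { $WSID } }
}

# 8. Export each object to its own CSV for downstream loading.
$Workspaces  | Export-Csv (Join-Path $OutDir "Workspaces.csv")  -NoTypeInformation
$Datasets    | Export-Csv (Join-Path $OutDir "Datasets.csv")    -NoTypeInformation
$Reports     | Export-Csv (Join-Path $OutDir "Reports.csv")     -NoTypeInformation
$Dashboards  | Export-Csv (Join-Path $OutDir "Dashboards.csv")  -NoTypeInformation
$Datasources | Export-Csv (Join-Path $OutDir "Datasources.csv") -NoTypeInformation
```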

Report Samples

With the data model containing the artifact to workspace relationships created, it’s fairly easy to build template report pages that take advantage of the relationships. For example, in the report page sample below you see workspaces, datasets, and even Premium capacity data on the same visual:

  • You may notice that I consider reports per dataset a critical metric. You want this number to be high, indicating that a common, trusted set of datasets is being re-used to support many reports.
  • I also like to create derived columns on the datasets table to distinguish datasets that use only one data source from those that use multiple, and to count the number of data sources. A dataset which retrieves from many data sources is generally less trustworthy and could be a sign that these sources need to be integrated into a data warehouse.
  • At the moment there’s no API endpoint or PowerShell cmdlet for retrieving dataflows. Paginated reports are just now appearing in the Get-PowerBIReport cmdlet, but currently there isn’t a property available to easily distinguish paginated from interactive reports.
  • I left out the part of the script that retrieves Premium capacities to keep this example as simple as possible, but you would typically create a relationship between workspaces and Premium capacities if your organization has them. Given the other relationships between artifacts and workspaces, filtering on a Premium capacity would filter down all the reports, datasets, etc. contained in workspaces hosted on that capacity.
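As a rough illustration, the reports-per-dataset metric can also be sanity-checked directly from the exported CSV files before the report visuals are built. The file path below is an assumption matching whatever export location your script uses:

```powershell
# Count reports per dataset from the exported Reports.csv (path is a placeholder).
$Reports = Import-Csv "C:\PowerBIAdmin\Reports.csv"

# Group the report rows by their DatasetId column and surface the datasets
# supporting the most reports.
$Reports |
    Group-Object DatasetId |
    Select-Object @{ n = "DatasetId";   e = { $_.Name } },
                  @{ n = "ReportCount"; e = { $_.Count } } |
    Sort-Object ReportCount -Descending |
    Select-Object -First 10
```

In the deployed model the same calculation would be a simple measure over the relationship between reports and datasets; this snippet is just a quick check on the raw export.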

Wrapping Up

I was fortunate to speak at the SQL Saturday Los Angeles event last Saturday, and I promised the attendees that I would share the demo scripts in upcoming blogs. This is the first of those blogs, and I intend to post the automated deployment scripts/samples (bulk, incremental) sometime fairly soon.

I’m not always able to respond to comments quickly, but feel welcome to leave any questions or issues you have.

Comments

    1. Yes, looping over the workspaces to get the datasets, reports, and dashboards and looping over the datasets to get their data sources requires time, even if you’re avoiding the personal workspaces and only retrieving active workspaces. In my current project, there are about 400 workspaces (active, not personal workspaces) and the script takes about an hour to run each night. I have other scripts which retrieve Azure Active Directory data (User profile, group members and owners) and this too can take about an hour. Nightly jobs that run for a few hours are generally acceptable to support an admin database and/or analytical data model but there are options to improve performance such as only looping over premium capacity workspaces or a set list of workspaces you need to analyze or using a new REST API. I may blog about this sometime.

      Note: If your organization has thousands of workspaces you may consider revising policies around who can create workspaces and what their scopes should be. For example, only 1-2 users from a department could be authorized to create workspaces for this department and these workspaces should contain many reports and dashboards around a particular topic.



    1. I read in comments on a separate post that there currently isn’t a way to unpack the ConnectionDetails (i.e. in the Datasources output). Is that still the case?


  2. Get-PowerBIDataset : A task was canceled.
    At line:31 char:9
    + Get-PowerBIDataset -Scope Organization -WorkspaceId $WSID | W …
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : WriteError: (Microsoft.Power…tPowerBIDataset:GetPowerBIDataset) [Get-PowerBIDataset], TaskCanceledException
    + FullyQualifiedErrorId : A task was canceled.,Microsoft.PowerBI.Commands.Data.GetPowerBIDataset


  3. Brett – Thanks for the script and great post!
    We’ve tried to execute the script but got an intermittent error (above). When we run the full script, we always get this error.
    But when we add an additional filter (like on the workspace name), it sometimes succeeds; when it fails, it fails with the same error as the full script.

    With the same code executed, it sometimes succeeds and sometimes fails. We cannot figure out why.
    Have you encountered this before? Are we doing something wrong? Thanks!

