Power BI Admin Datasets

If you’re a Power BI Administrator, and particularly if you’re responsible for administering a large and varied Power BI environment, you may consider investing time into the development of a Power BI administration dataset. Just like normal datasets or analytical data models generally, the purpose of this work is to provide a platform for simple and consistent analysis of data as represented in facts and dimensions. In this case, you’re putting yourself and possibly other stakeholders in a position to quickly answer common questions about your Power BI environment.

Typical facts to be included in the Power BI admin dataset are the Office 365 Audit Log events for Power BI activities (e.g. view report) and the Performance Monitor counters for the On-Premises Data Gateway. The dimensions for the dataset include app workspaces, users, premium capacities, and more. By creating a Power BI admin dataset (or possibly a few targeted admin datasets) with relationships between these facts and dimensions and with simple calculation metrics (e.g. count of workspaces), a basic package of reports and dashboards targeted at common administrative questions can be built and published to Power BI.

Tools for Power BI Admins

Before you head down the path of building a custom administrative solution, you should definitely be familiar with the capabilities (and limitations) of the tools provided for Power BI administrators. A couple of examples include the relatively new Workspaces page in the Power BI Admin Portal and the Power BI Premium Capacity Metrics app.

Power BI Administration is clearly a priority for the Power BI team, particularly given Microsoft’s strategy of embedding enterprise BI capabilities (Analysis Services, Reporting Services) into the Power BI platform, and thus new administrative features and capabilities are being added frequently. However, and again this is mostly for large and varied Power BI environments (e.g. several hundred to thousands of active users), I find that custom admin solutions are valuable, if not essential, and my sense is they will likely remain so (in some form or another) for the next 18-24+ months.

This past Saturday I spoke at the Orange County SQL Saturday in Los Angeles, California. The PowerPoint file from this presentation, which provides more details on existing Power BI administration tools as well as common custom admin solutions, is available for download from my GitHub repository.

From Script to Data to Dataset

The bad news is that yes, as a custom admin solution for a Microsoft application, some degree of PowerShell scripting is required. The good news is that, given the Power BI Management module and particularly the Invoke-PowerBIRestMethod cmdlet for calling the Power BI REST API, these PowerShell scripts need not be complex at all. Assuming you’ve already been assigned the Power BI service administrator role, minimal to no experience with PowerShell is required.

Essentially your PowerShell script(s) will write data to CSV or JSON files per the following image:

Exported Power BI Artifact Data Files

These scripts will retrieve data from Power BI (obviously) and most likely from Azure Active Directory (via the AzureAD module) as well, given the importance of Azure AD security groups for implementing data governance policies. Of the files above, Users, Groups, GroupOwners, and GroupMembers are all retrieved from Azure Active Directory via their respective cmdlets.

The capacity JSON files reflect a single command to retrieve all Power BI Premium capacities for the tenant (‘Capacities’) plus the Power BI Premium workloads enabled for each of six distinct premium capacities. A workload is a relatively new but very important concept in Power BI: the resources (CPU, RAM) of a premium capacity node can now be allocated to ‘workloads’ beyond Power BI datasets, such as Dataflows, Paginated Reports, and AI.

Power BI Desktop of course has a rich and mature set of capabilities for accessing files/folders (including JSON format), optionally transforming and enhancing this data, and then modeling it to support visualization and analysis. Plenty of other blogs, documents, and books cover the Power Query Editor’s connect and transform capabilities, so given the length of this blog post I’ll skip ahead to show the Fields List and a few schemas within a single but fairly robust custom Power BI admin dataset:

Per the images above, there are natural one-to-many relationships between many of these artifacts, which you can model in the new Modeling view in Power BI Desktop. In these examples I’m not including the fact tables for Power BI audit log events or gateway performance monitor counters, but you can imagine relating an audit log event to a user table from Azure AD.

You can use the ‘Dataset Configured By’ field from the Datasets artifact table to create a relationship to a Users table from Azure AD, or possibly de-normalize these user columns (e.g. User Name, Department) into the Datasets table. As just two example use cases, this allows you to create reports that break out which data sources are being used by which department (e.g. Finance, Purchasing) and which user is, via the datasets he/she configured, associated with the most reports.

Typically I wouldn’t recommend a bidirectional cross-filtering relationship unless it’s absolutely necessary and the implications for behavior and performance are fully understood. Why two bidirectional relationships are shown above is outside the scope of this post.

Power BI Artifact Script

Both sample scripts (Power BI Artifacts, Azure AD) from this post have been uploaded to my GitHub repository. Please note that you may not need all or many of these tables, at least not at the moment; for example, if you don’t have Power BI Premium capacity yet, you can skip the capacity and workload exports. That said, simply having every workspace, dataset, data source, report, and dashboard for your tenant could be a significant benefit. When you’re just getting started, simply having the count of these items and watching the counts grow (or not) over time can provide a basic indication of the scale and growth of Power BI in your organization.
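
As a minimal sketch of this counting idea (the log path is just an example, and this assumes you’ve already connected via Connect-PowerBIServiceAccount), you could append the daily counts to a small CSV:

#Minimal sketch: log daily artifact counts to a CSV (path is illustrative)
$CountLogPath = "C:\Admin\ArtifactCounts.csv"

[PSCustomObject]@{
    "Date Retrieved" = Get-Date
    "Workspaces"     = (Get-PowerBIWorkspace -Scope Organization -All).Count
    "Datasets"       = (Get-PowerBIDataset -Scope Organization).Count
    "Reports"        = (Get-PowerBIReport -Scope Organization).Count
    "Dashboards"     = (Get-PowerBIDashboard -Scope Organization).Count
} | Export-Csv -Path $CountLogPath -NoTypeInformation -Append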

With all of this said, let’s take a look at one of these scripts:

#1. Connect to Power BI with credential of Power BI Service Administrator

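#NOTE: plain-text credentials are shown for brevity only; consider Get-Credential or a service principal instead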
$User = "somebody@bigcompany.com"
$PW = "abcdef"

$SecPasswd = ConvertTo-SecureString $PW -AsPlainText -Force
$myCred = New-Object System.Management.Automation.PSCredential($User,$SecPasswd)

Connect-PowerBIServiceAccount -Credential $myCred

#2. Declare variables for export paths and the date the data was retrieved

$RetrieveDate = Get-Date 
$BasePath = "C:\Users\Brett Powell\PowerShell Scripts\ArtifactExport\"

#ExportPaths
$DatasetsPath = $BasePath + "Datasets.csv"
$WorkspacesPath = $BasePath + "Workspaces.csv"
$ReportsPath = $BasePath + "Reports.csv"
$DashboardsPath = $BasePath + "Dashboards.csv"
$DatasourcesPath = $BasePath + "DataSources.csv"
$CapacitiesPath = $BasePath + "Capacities.json"
$CapacityAWorkloadPath = $BasePath + "CapacityAWorkloads.json"
$CapacityBWorkloadPath = $BasePath + "CapacityBWorkloads.json"
$CapacityCWorkloadPath = $BasePath + "CapacityCWorkloads.json"
$CapacityDWorkloadPath = $BasePath + "CapacityDWorkloads.json"
$CapacityEWorkloadPath = $BasePath + "CapacityEWorkloads.json"
$CapacityFWorkloadPath = $BasePath + "CapacityFWorkloads.json"

#3. Export Power BI Premium capacities and capacity workloads

#Premium Capacity ID variables
$CapacityA = "The Capacity ID for Premium Capacity A"
$CapacityB = "The Capacity ID for Premium Capacity B"
$CapacityC = "The Capacity ID for Premium Capacity C"
$CapacityD = "The Capacity ID for Premium Capacity D"
$CapacityE = "The Capacity ID for Premium Capacity E"
$CapacityF = "The Capacity ID for Premium Capacity F"

#Premium Capacity Workload URL variables
$CapacityAURL = 'capacities/' + $CapacityA + '/Workloads'
$CapacityBURL = 'capacities/' + $CapacityB + '/Workloads'
$CapacityCURL = 'capacities/' + $CapacityC + '/Workloads'
$CapacityDURL = 'capacities/' + $CapacityD + '/Workloads'
$CapacityEURL = 'capacities/' + $CapacityE + '/Workloads'
$CapacityFURL = 'capacities/' + $CapacityF + '/Workloads'

#Export Premium capacities
Invoke-PowerBIRestMethod -Url 'capacities' -Method Get | Out-File $CapacitiesPath

#Export capacity workloads for each premium capacity
Invoke-PowerBIRestMethod -Url $CapacityAURL -Method Get | Out-File $CapacityAWorkloadPath

Invoke-PowerBIRestMethod -Url $CapacityBURL -Method Get | Out-File $CapacityBWorkloadPath

Invoke-PowerBIRestMethod -Url $CapacityCURL -Method Get | Out-File $CapacityCWorkloadPath

Invoke-PowerBIRestMethod -Url $CapacityDURL -Method Get | Out-File $CapacityDWorkloadPath

Invoke-PowerBIRestMethod -Url $CapacityEURL -Method Get | Out-File $CapacityEWorkloadPath

Invoke-PowerBIRestMethod -Url $CapacityFURL -Method Get | Out-File $CapacityFWorkloadPath

#4. Export Power BI Artifacts

Get-PowerBIDataset -Scope Organization | Select *, @{Name="Date Retrieved";Expression={$RetrieveDate}} | Export-Csv -Path $DatasetsPath

Get-PowerBIReport -Scope Organization | Select *, @{Name="Date Retrieved";Expression={$RetrieveDate}} | Export-Csv -Path $ReportsPath

Get-PowerBIDashboard -Scope Organization | Select *, @{Name="Date Retrieved";Expression={$RetrieveDate}} | Export-Csv -Path $DashboardsPath

Get-PowerBIWorkspace -Scope Organization -All | Select *, @{Name="Date Retrieved";Expression={$RetrieveDate}} | Export-Csv -Path $WorkspacesPath

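#Retrieve the data sources for each dataset; $dsID is captured as each data source streams through the pipeline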
Get-PowerBIDataset -Scope Organization | Foreach {$dsID = $_.Id; Get-PowerBIDatasource -DatasetId $dsID -Scope Organization} | `
    Select *, @{Name = "DatasetID"; Expression={$dsID}}, @{Name = "Date Retrieved"; Expression={$RetrieveDate}} | `
    Export-Csv -Path $DatasourcesPath

Per the code comments, the script above performs the following four steps:

  1. Authenticate to Power BI
  2. Define the file paths as variables
  3. Export Power BI Premium capacity metadata
  4. Export Power BI artifacts

As an alternative to a user credential (a user assigned to the PBI Admin role), you could also use a service principal with the Connect-PowerBIServiceAccount cmdlet. Please also note that the retrieval of the data sources (Get-PowerBIDatasource) at the end of the script can take hours to complete.
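
For reference, a minimal sketch of the service principal option (the IDs are placeholders, and parameter names can vary by module version, so check Get-Help Connect-PowerBIServiceAccount):

#Minimal sketch: authenticate as a service principal (application ID + secret as a PSCredential)
$TenantId = "abcdef"
$AppId = "ghijkl"
$AppSecret = ConvertTo-SecureString "mnopqrs" -AsPlainText -Force
$AppCred = New-Object System.Management.Automation.PSCredential($AppId, $AppSecret)

Connect-PowerBIServiceAccount -ServicePrincipal -Tenant $TenantId -Credential $AppCred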

Azure Active Directory Script

Whether it’s the Power BI artifacts, the Power BI audit log event data, or maybe something separate from Power BI altogether, you probably want to analyze/slice the data based on user profile attributes. Moreover, it’s quite a benefit to be able to quickly identify which users are in which security groups, who the owner(s) of those groups are, and so on, without having to access a specific portal/page or create a one-time ad hoc script.

So let’s look at a script for retrieving Users, Groups, Group Owners, and Group Members:

#1. Authenticate to Azure AD with service principal and certificate thumbprint

$tenant = "abcdef"
$application = "ghijkl"
$thumb = "mnopqrs"

Connect-AzureAD -TenantId $tenant -ApplicationId $application -CertificateThumbprint $thumb

#2. Define variables for users, groups, export paths and date retrieved

$RetrieveDate = Get-Date 
$ADGroups = Get-AzureADGroup -All $true
$ADUsers = Get-AzureADUser -All $true 

$BasePath = "C:\Users\Brett Powell\PowerShell Scripts\ArtifactExport\"

$AzureADUserPath = $BasePath + "Users.csv"
$AzureADGroupPath = $BasePath + "Groups.csv"
$AzureADGroupOwners = $BasePath + "GroupOwners.csv" 
$AzureADGroupMembers = $BasePath + "GroupMembers.csv"

#3. Export Azure AD Users and Groups

$ADUsers | Select *, @{Name="Date Retrieved";Expression={$RetrieveDate}} | Export-Csv -Path $AzureADUserPath

$ADGroups | Select *, @{Name="Date Retrieved";Expression={$RetrieveDate}} | Export-Csv -Path $AzureADGroupPath

#4. Export Azure AD Group Owners and Group Members

#Group Owners
$ADGroups | ForEach {$GroupID = $_.ObjectId; Get-AzureADGroupOwner -ObjectId $GroupID -All $true} | `
    Select DisplayName, UserPrincipalName, @{Name="GroupMembershipType";Expression={"Owner"}}, `
    @{Name="GroupObjectID";Expression={$GroupID}}, @{Name="Date Retrieved";Expression={$RetrieveDate}} | `
    Export-Csv $AzureADGroupOwners

#Group Members
$ADGroups | ForEach {$GroupID = $_.ObjectId; Get-AzureADGroupMember -ObjectId $GroupID -All $true} | `
    Select DisplayName, UserPrincipalName, @{Name="GroupMembershipType";Expression={"Member"}}, `
    @{Name="GroupObjectID";Expression={$GroupID}}, @{Name="Date Retrieved";Expression={$RetrieveDate}} | `
    Export-Csv $AzureADGroupMembers

Per the code comments, the script above performs the following four steps:

  1. Authenticate to Azure AD
  2. Define variables for users, groups, and the export paths
  3. Export the users and groups
  4. Export the group owners and group members

To set up authentication with a service principal per this example, you can follow this process from MS Docs. As an alternative remote authentication approach for the Connect-AzureAD cmdlet, you can create a PowerShell credential based on the identity of a user, like the following snippet:

$TenantID = "abcdef"
$User = "somebody@bigcompany.com"
$PW = "ghijkl"

$SecPasswd = ConvertTo-SecureString $PW -AsPlainText -Force

$myCred = New-Object System.Management.Automation.PSCredential($User,$SecPasswd)

Connect-AzureAD -TenantId $TenantID -Credential $myCred

Similar to the Get-PowerBIDatasource example from the Power BI artifact script, the Azure AD script iterates over the Azure AD groups to get the owners and members of each group. Unlike the Power BI capacity data, all of the Azure AD data can be written out to CSV files, making it easy to import and optionally transform/enhance this data in Power BI.

Admin Dataset Reporting Samples

Just like normal BI projects, almost all of the ‘heavy lifting’ for the Power BI admin dataset is in the ETL (extract-transform-load) process. Once you’re relatively comfortable and stable in accessing the Power BI data you need on a schedule, perhaps via SQL Server Agent or Azure Automation, the analytical and visualization layer is fairly straightforward even if you’re not that experienced with Power BI or Analysis Services.
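
For instance, a minimal sketch of scheduling the artifact script as a nightly Windows scheduled task (the script file name, path, and schedule are hypothetical):

#Hypothetical example: run the artifact export script nightly via Windows Task Scheduler
$Action = New-ScheduledTaskAction -Execute "powershell.exe" -Argument '-File "C:\Admin\ArtifactExport.ps1"'
$Trigger = New-ScheduledTaskTrigger -Daily -At 5am

Register-ScheduledTask -TaskName "PowerBIArtifactExport" -Action $Action -Trigger $Trigger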

Nonetheless, before I wrap up this post I’ll share a few sample report pages which can be developed on top of the admin dataset containing the tables from the two scripts.

It’s always a good idea to follow sound visualization practices (e.g. alignment, distribution) and to take advantage of Power BI’s core features such as the new filter pane. However, the audience for Power BI admin reports may be only a small group of internal BI/IT users or stakeholders looking for a few basic data points, so I wouldn’t get too carried away with the aesthetics.

Wrapping Up

Maybe a time will come when the Power BI Admin portal, or perhaps a dedicated admin app, provides rich analytics of your tenant such that the solution described in this post is unnecessary. Likewise, maybe there will soon be out-of-the-box monitoring capabilities for data gateways, integration with Azure AD, and easy access to the O365 audit log event data. However, my experience suggests that every organization or deployment has its unique admin/monitoring needs, and thus even if an out-of-the-box solution is provided by Microsoft at some point, it may not have everything you need and certainly won’t be as flexible/extensible as a homegrown custom solution. In summary, though of course we’re all stretched for capacity/resources to take on new work, you might consider one or more Power BI admin datasets to aid your administrative efforts.

Also, I’m sorry for the long delay since my last post. I’ve been very busy but I’ll try to start posting again more regularly.

Comments

    1. Thanks Brian – Yes, in a follow-up blog post I can. I’d like to publish a revised script and schema which supports relationships between workspaces and the reports, datasets, and dashboards they contain. As mentioned in a recent post, the workspace containing a given report, dashboard, or dataset isn’t included by the default Get cmdlets.

    2. Hi Brett,
      Thank you so much for the great post. Awesome.
      I would like to be able to get, for each dataset: the database server name, the database name, the database table names, and the path + file name if the dataset is using a flat file. Is this possible? Can you please assist?

      Thanks so much in advance.

      Joe

  1. Hi,
    Thank you for this blog post; it’s a great topic to consider. But as you may know, Power BI just released Gateway Performance Monitoring (Public Preview) in the May 2019 release. Will there be any difference in the types of monitoring? I just installed a gateway on our server and was considering preparing this kind of feature. Can you suggest which one you recommend: using a script, or just using the new feature from the gateway itself? Thank you.

    Regards,
    Sam

      1. No worries – yes, the new gateway monitoring log file (albeit with several limitations noted in the blog) should probably be the source for the PBI Admin dataset, particularly once this becomes GA.

        In addition to this, the scripts should be adjusted to associate the reports, dashboards, and datasets with their workspace. It may also make sense to remove the additional artifacts reflecting the published app version. I’ll address these issues in an upcoming blog post. Regards, Brett

  3. Hi Brett, are you aware of any powershell cmdlets or Rest Api code examples that would allow me to get a list of refresh schedules for all the datasets in my Power BI tenant?

    1. Hi Roger,

      I usually call the API for getting a single dataset’s refresh schedule via Invoke-PowerBIRestMethod. Specifically, I build a URL string like the following: admin/datasets/{datasetID}/refreshSchedule

      So there are a few lines prior to retrieving the refresh schedule to A) get the dataset ID needed and B) build a URL variable that matches the structure required by the refresh schedule API.

      I believe you could build a PowerShell script that loops over every dataset in your tenant and pulls the refresh schedule for each. The problem, however, is that JSON data is returned by the refresh schedule and history APIs and therefore you cannot add columns to the output to identify which dataset or which workspace the refresh schedule is for.
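
      A minimal sketch of that loop (each call returns raw JSON, per the URL format above):

      #Minimal sketch: pull each dataset's refresh schedule as raw JSON
      Get-PowerBIDataset -Scope Organization | ForEach-Object {
          Invoke-PowerBIRestMethod -Url ("admin/datasets/" + $_.Id + "/refreshSchedule") -Method Get -ErrorAction SilentlyContinue
      }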

      For this reason, when I pull refresh history and refresh schedule, I select just a handful of the very most important datasets in the tenant and I add logic in either Power BI Desktop or in a SQL Server database that stores the script output to identify the dataset and workspace. Hope all is well.

      1. Hi Brett,

        Thank you for all the great content – your blog has been a huge help towards stepping up Power BI monitoring in our tenant.

        I was able to customize one of the PowerShell scripts you shared in a different article to also include the refresh history for all returned datasets, and to change the returned JSON’s structure to include the dataset and workspace identifiers:

        #1. Get active workspaces excluding personal workspaces ('My Workspace')
        $RetrieveDate = Get-Date
        $ActiveWorkspaces = Get-PowerBIWorkspace -Scope Organization -All | Where-Object {$_.Type -ne "PersonalGroup" -and $_.State -eq "Active"} | `
            Select-Object *, @{Name="DateRetrieved";Expression={$RetrieveDate}}

        #2. Get datasets in active workspaces including the workspace ID of the dataset
        $Datasets = ForEach ($WS in $ActiveWorkspaces)
        {
            $WSID = $WS.Id
            Get-PowerBIDataset -Scope Organization -WorkspaceId $WSID | Where-Object {$_.Name -ne "Report Usage Metrics Model" -and $_.Name -ne "Dashboard Usage Metrics Model"} | `
                Select-Object *, @{Name="WorkspaceID";Expression={$WSID}}, @{Name="DateRetrieved";Expression={$RetrieveDate}}
        }

        #3. Get refresh history from datasets in active modern workspaces
        $RefreshHistory = ForEach ($DS in $Datasets)
        {
            $DSID = $DS.Id
            $WSID = $DS.WorkspaceID
            $DS1URL = 'https://api.powerbi.com/v1.0/myorg/groups/' + $WSID + '/datasets/' + $DSID + '/refreshes'
            Invoke-PowerBIRestMethod -Url $DS1URL -Method Get -ErrorAction SilentlyContinue | `
                ConvertFrom-Json | Select-Object @{Name="WorkspaceID";Expression={$WSID}}, @{Name="DatasetId";Expression={$DSID}}, @{Name="DateRetrieved";Expression={$RetrieveDate}}, '@odata.context', value
        }

        The above script works fine when I run it with my own credentials after activating the eligible PBI service administrator role. However, when I try running it via a reporting user that has the PBI service administrator role permanently assigned, step #3 returns a 401 (Unauthorized) error. Steps #1 and #2 complete successfully.

        Have you ever experienced reporting users with the appropriate role being locked out of interacting with parts of Power BI’s API?

  4. I was trying to do the exact thing you talked about with PowerShell and the Invoke-PowerBIRestMethod, but was hoping you already had an example script I could borrow. No worries, thx for the reply.

  5. Brett, nice little post here. Got a question: I’m trying to get a list of all datasets and all of their data source paths, and the last line of your "Artifact" PowerShell script gets me *almost* there. However, when I run it, the first two columns are empty (Name, ConnectionString – not sure if those matter?), and ConnectionDetails just shows the object type (Microsoft.PowerBI.Common.Api.Shared.DatasourceConnectionDetails). Is there a way to tweak this portion of the script so we can get the file paths of all the data sources?

    Thanks so much!
    Albert

    1. Thanks Albert. I’m afraid the answer is no, it’s not currently possible to retrieve the actual source connection information via the Get-PowerBIDatasource command (or an alternative with the REST API). In my solutions I add a column to the data sources table identifying the source as DirectQuery if the ConnectionString property is not null, or Import otherwise (it is null). I know there are some improvements coming for admin solutions that should improve visibility, but the details and timeline are unclear at this point. Regards, Brett
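
      That column is a small addition to the data sources export from the post; a minimal sketch:

      #Minimal sketch: flag each data source as DirectQuery (ConnectionString populated) or Import (null)
      Get-PowerBIDataset -Scope Organization | ForEach-Object { Get-PowerBIDatasource -DatasetId $_.Id -Scope Organization } |
          Select-Object *, @{Name="SourceType"; Expression={ if ($_.ConnectionString) {"DirectQuery"} else {"Import"} }}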

      1. Brett, thanks again. I was able to figure out a way to pull the data source path and the DatasetID into an array/CSV, and then used a Join-Object tool to combine that CSV with the Datasets.csv from your tool to make one large dataset list with corresponding data source paths. Took a while, but eventually figured something out! 🙂 Thanks again for the help!!

  6. Federico,
    Basically I ran the command: Invoke-PowerBIRestMethod -Url "https://api.powerbi.com/v1.0/myorg/admin/datasets" -Method Get
    and pulled its results into an array. Then I did a foreach loop over that array and ran the command: Invoke-PowerBIRestMethod -Url "https://api.powerbi.com/v1.0/myorg/admin/datasets/$Data/datasources" -Method Get, and then did further foreach commands to split the data out into single-line entries of a CSV file.
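
    Roughly like this (a sketch from memory; the property names and output path are illustrative):

    #Sketch: list each dataset's data sources via the admin REST endpoints
    $AdminDatasets = (Invoke-PowerBIRestMethod -Url "https://api.powerbi.com/v1.0/myorg/admin/datasets" -Method Get | ConvertFrom-Json).value
    ForEach ($DS in $AdminDatasets)
    {
        $Data = $DS.id
        (Invoke-PowerBIRestMethod -Url "https://api.powerbi.com/v1.0/myorg/admin/datasets/$Data/datasources" -Method Get | ConvertFrom-Json).value |
            Select-Object @{Name="DatasetId";Expression={$Data}}, datasourceType, @{Name="ConnectionDetails";Expression={$_.connectionDetails | ConvertTo-Json -Compress}} |
            Export-Csv "C:\Temp\DatasourcePaths.csv" -NoTypeInformation -Append
    }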

    hope that helps,
    Albert

    1. Thanks, it worked!

      Now I’m trying to get all tables in each dataset using Get-PowerBITable, but I’m getting a ‘NotFound’ error.

      I checked and I have the proper access.

      Any thoughts on how to get a list of all tables contained in each dataset?

      Thanks again
      Fede

    2. Hi Albert,
      can you share your code please?
      I’m also stuck trying to get the ConnectionDetails.Server.
      I tried everything with no success.

      thanks

    1. Thanks Nadav –
      I should probably publish an update on this topic at some point. Power BI will be providing an admin dataset and reports out-of-the-box fairly soon.

      1. Actually, I am struggling with tenant migration right now. I am trying to export and import all gateway data sources. I already have a PowerShell script to export workspaces and reports and import them into the new tenant.

  7. It’s a little strange to me:
    some APIs require you to be a Power BI administrator yet still don’t give access to all workspaces, like this one:

    $result = Invoke-PowerBIRestMethod -Url "https://api.powerbi.com/v1.0/myorg/admin/Groups?%24top=5000&%24expand=reports,dashboards,datasets"

  8. hello Brett,

    unfortunately when there are many workspaces in the tenant, the script throws the error below:
    Get-PowerBIDataset : A task was canceled.
    At line:37 char:16
    + … $dataset = Get-PowerBIDataset -Scope Organization -WorkspaceId $work …
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : WriteError: (Microsoft.Power…tPowerBIDataset:GetPowerBIDataset) [Get-PowerBIDataset], TaskCanceledException
    + FullyQualifiedErrorId : A task was canceled.,Microsoft.PowerBI.Commands.Data.GetPowerBIDataset

    Probably Microsoft is cutting off such queries... do you have any idea how to work around this or optimize?

    Best Regards,
    Anders