Group Email Notification for Dataset Refresh Failure

Power BI dataset owners can opt to receive an email notification when a scheduled refresh fails. While this feature is helpful, some datasets serve as the source for many highly utilized reports and dashboards, so a Power BI administrator or administration team may want a custom BI solution that triggers a group email notification whenever a high-value dataset fails to refresh.

Outlook Email triggered by Power BI Dataset Refresh Failure

Depending on the scenario, a Power BI admin could follow up the auto-generated email with another message that includes the specific error recorded in the refresh history JSON file. For example, the refresh history may identify the specific column that was missing in the rowset or indicate that a gateway was offline.

Components and Requirements

The solution described in this blog post involves the following tools and requirements:

  1. PowerShell
    • A single script (.ps1) retrieves the refresh history for one or more Power BI datasets via the Power BI REST API (Get Refresh History) and writes this data out as JSON files to a network directory folder.
  2. Power BI Administrator Role
    • This user account will be able to access the Power BI workspace and datasets to be monitored for refresh history.
  3. SQL Server Agent
    • Several recent blog posts have used Azure Automation as the scheduling tool of choice, but in this case we’re using an on-premises SQL Server instance with the SQL Server Agent service running.
  4. Power BI Desktop
    • Power Query (M) expressions are used to retrieve the JSON files containing dataset refresh history.
    • Power BI report pages include visuals filtered to count the number of refresh failures in the current day for one or more datasets.
  5. Power BI App Workspace
    • Technically this isn’t necessary, but it’s strongly recommended to publish the Power BI report from #4 (and perhaps other Power BI administration-related content) to an app workspace.
      • This requires a Power BI Pro license.
  6. Power BI Dashboard
    • Card visuals from the Power BI report reflecting the count of refresh failures for a given dataset in the current day are pinned as tiles to a dashboard.
  7. Data Alerts
    • As card-based dashboard tiles, alerts are configured in the event the count of refresh failures for the given dataset is greater than zero.
  8. On-Premises Data Gateway
    • To refresh your Power BI report, which is based on JSON files stored on an on-premises network, you’ll need to configure the files as data sources for your on-premises data gateway.
  9. MS Flow
    • Finally, one MS Flow is configured per tracked dataset, and each flow is triggered by the alert from step #7.
    • In this case, the alert drives an email to a distribution list of users responsible for the dataset.

PowerShell Script

At the heart of the solution is, of course, the call to the Power BI REST API (via PowerShell in this case) to fetch the refresh history for the one or more datasets you want to track.

The REST API for dataset refresh history requires the dataset ID, so this script uses the workspace name and the dataset name to retrieve this value. The script then writes the data out to a folder location for easy access by Power BI.

You can download a sample PowerShell script that supports three Power BI datasets, like the following code, here:

#1. Authenticate to Power BI with Power BI Service Administrator Account
# Note: placeholder credentials shown below; avoid hardcoding passwords in
# production scripts (use a service principal or a secure credential store)

$User = "123456@abc.com"
$PW = "abcdefgh"

$SecPasswd = ConvertTo-SecureString $PW -AsPlainText -Force
$myCred = New-Object System.Management.Automation.PSCredential($User,$SecPasswd)

Connect-PowerBIServiceAccount -Credential $myCred

#2. Export paths 

$RefreshExportPath = "C:\Users\Brett Powell\ArtifactExport\RefreshHistory\"

$Dataset1Path = $RefreshExportPath + "Dataset1.json"
$Dataset2Path = $RefreshExportPath + "Dataset2.json"
$Dataset3Path = $RefreshExportPath + "Dataset3.json"

#3. Get Dataset IDs based on workspace

#Dataset ID for Dataset 1
$DS1WorkspaceID = Get-PowerBIWorkspace -Scope Organization -Name "Workspace Name of Dataset 1" | Where {$_.State -eq "Active"} | ForEach {$_.Id}

$DS1ID = Get-PowerBIDataset -Scope Organization -WorkspaceId $DS1WorkspaceID |
    Where {($_.IsRefreshable -eq $True) -and ($_.Name -eq "Dataset 1 Name")} | ForEach {$_.Id} | Get-Unique -AsString

#Dataset ID for Dataset 2
$DS2WorkspaceID = Get-PowerBIWorkspace -Scope Organization -Name "Workspace Name of Dataset 2" | Where {$_.State -eq "Active"} | ForEach {$_.Id}

$DS2ID = Get-PowerBIDataset -Scope Organization -WorkspaceId $DS2WorkspaceID |
    Where {($_.IsRefreshable -eq $True) -and ($_.Name -eq "Dataset 2 Name")} | ForEach {$_.Id} | Get-Unique -AsString

#Dataset ID for Dataset 3
$DS3WorkspaceID = Get-PowerBIWorkspace -Scope Organization -Name "Workspace Name of Dataset 3" | Where {$_.State -eq "Active"} | ForEach {$_.Id}

$DS3ID = Get-PowerBIDataset -Scope Organization -WorkspaceId $DS3WorkspaceID |
    Where {($_.IsRefreshable -eq $True) -and ($_.Name -eq "Dataset 3 Name")} | ForEach {$_.Id} | Get-Unique -AsString

#4. Refresh API URLs 

$URLStart = 'admin/datasets/'
$URLEnd = '/refreshes'

$DS1URL = $URLStart + $DS1ID + $URLEnd
$DS2URL = $URLStart + $DS2ID + $URLEnd
$DS3URL = $URLStart + $DS3ID + $URLEnd

#5. Export Refresh History files

Invoke-PowerBIRestMethod -Url $DS1URL -Method Get | Out-File $Dataset1Path

Invoke-PowerBIRestMethod -Url $DS2URL -Method Get | Out-File $Dataset2Path

Invoke-PowerBIRestMethod -Url $DS3URL -Method Get | Out-File $Dataset3Path

Per the above example, you essentially find the dataset IDs and build a URL string for the Invoke-PowerBIRestMethod cmdlet, which calls the REST API for refresh history.

Why separate JSON files (one per dataset)? Converting the JSON output doesn’t work for this API (you lose the data), and writing the refresh history for all datasets into a single JSON file causes issues when reading that file into Power BI.
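Purely as an illustration of what the exported files contain, here’s a minimal Python sketch of parsing one file and counting its failed refreshes. The structure assumed here (a top-level record with “@odata.context” and a “value” list of refresh entries) matches the fields expanded by the Power Query expressions later in the post; adjust for any differences in your own output.

```python
import json

def count_failures(json_path):
    """Count failed refreshes in one exported refresh-history file.

    Assumes the file preserves the raw Get Refresh History response:
    a top-level record with "@odata.context" and a "value" list of
    entries (id, refreshType, startTime, endTime, status).
    """
    with open(json_path, encoding="utf-8") as f:
        history = json.load(f)
    # Each entry's "status" is "Completed" or "Failed" per the API output
    return sum(1 for entry in history["value"] if entry["status"] == "Failed")
```

This is the same count the Power BI report computes; having it in a standalone script can be handy for spot-checking the exported files outside of Power BI.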

The Scheduled Job

In this case, I’ve configured a SQL Agent job to run this PowerShell script every 30 minutes on weekdays, starting at 8 AM and ending at 3 PM. Upon execution, the JSON files containing the refresh history data are replaced with new, updated files.

To execute the script remotely, you’ll need either the PowerShell credential object (step #1 in the script) or a service principal with permissions granted to the Power BI REST API.

SQL Server Agent Job with Step to Execute PowerShell Script

SQL Server Agent has been around for a long time and I assume many administrators and readers of this blog are familiar with it. Nonetheless, here’s a link to Agent in MS Docs.

Power BI Desktop

So now we have the refresh history data available to us and updated regularly. We just need, at a minimum, a report that loads these files and applies the essential logic to count the refresh failures and filter for the current day.

Power Query (M) Expressions

Write a single M expression (query) to retrieve and enhance one of the datasets (from its JSON file source), like the following:

let
    Source = Json.Document(File.Contents(Dataset1Path)),
    #"Converted to Table" = Record.ToTable(Source),
    #"Removed Top Rows" = Table.Skip(#"Converted to Table",1),
    #"Expanded Value" = Table.ExpandListColumn(#"Removed Top Rows", "Value"),
    #"Expanded Value1" = Table.ExpandRecordColumn(#"Expanded Value", "Value", {"id", "refreshType", "startTime", "endTime", "status"}, {"id", "refreshType", "startTime", "endTime", "status"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Value1",{{"id", "Refresh ID"}, {"refreshType", "Refresh Type"}, {"startTime", "Refresh Start Time"}, {"endTime", "Refresh End Time"}, {"status", "Refresh Status"}}),
    ColTypes = Table.TransformColumnTypes(#"Renamed Columns",{{"Refresh ID", Int64.Type}, {"Refresh Type", type text}, {"Refresh Start Time", type datetime}, {"Refresh End Time", type datetime}, {"Refresh Status", type text}}),
    WorkspaceCol = Table.AddColumn(ColTypes, "Workspace", each "The Name of the Workspace", type text),
    DatasetCol = Table.AddColumn(WorkspaceCol, "Dataset", each "The Name of the Dataset", type text),
    RefreshDurationCol = Table.AddColumn(DatasetCol, "Refresh Duration", each [Refresh End Time] - [Refresh Start Time], type duration),
    MinutesCol = Table.AddColumn(RefreshDurationCol, "Refresh Minutes", each Duration.Minutes([Refresh Duration]), Int64.Type)
in
    MinutesCol

The workspace and dataset names are not included in the JSON output, so I’ve added these columns via Table.AddColumn(). Additionally, to support other analysis, such as the duration of dataset refreshes, I’ve added two more columns that compute the refresh duration and express it in minutes.

Notice that Dataset1Path is a parameter pointing to the network path of the JSON file for Dataset1. I’d recommend parameters for almost all data sources to support easy, quick changes in the future. The query could be cleaned up a bit more; the intent is just to show the essential transformations for your own application.

Once you’re satisfied with your query for a single dataset, you can simply duplicate this query and change the values to retrieve the other datasets. Finally, you can append the dataset queries into a single refresh history query via the Table.Combine() function.
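The duplicate-and-append pattern above can be sketched outside of Power Query as well. The following Python sketch mirrors the same logic: tag each file’s rows with the workspace and dataset names (which the API output lacks), then append the per-dataset histories into one table, analogous to Table.Combine(). The tuple layout is my own convention for this illustration.

```python
import json

def load_history(json_path, workspace, dataset):
    """Read one exported refresh-history file and tag every entry with
    the workspace and dataset names, which the API output doesn't include."""
    with open(json_path, encoding="utf-8") as f:
        entries = json.load(f)["value"]
    for entry in entries:
        entry["Workspace"] = workspace
        entry["Dataset"] = dataset
    return entries

def combine_histories(sources):
    """Append per-dataset histories into one list of rows, like
    Table.Combine(). `sources` is a list of (path, workspace, dataset)
    tuples, one per tracked dataset."""
    combined = []
    for path, workspace, dataset in sources:
        combined.extend(load_history(path, workspace, dataset))
    return combined
```

As in the M version, keeping one query (or function call) per dataset and appending at the end makes it easy to add a fourth or fifth dataset later.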

The Report

In this case I’m using a relative date filter at the page-level scope to filter down some card visuals containing the count of refresh failures:

Relative Date Filter: Current Day

Yes, we could handle this filtering via DAX too. In this case, to keep the DAX as simple as possible, I have a Total Refreshes measure, which simply counts the rows (COUNTROWS()) in the refresh history table, and a Refresh Failures measure as follows:

Refresh Failures =
IF(
    CALCULATE([Total Refreshes], 'RefreshHistory'[Refresh Status] = "Failed") = BLANK(),
    0,
    CALCULATE([Total Refreshes], 'RefreshHistory'[Refresh Status] = "Failed")
)

You then add some card visuals and filter each down (via a visual-level filter) to a specific dataset, like the following example:

Refresh Failure Cards in Power BI Report

Remember that you’ll also have the successful refresh history data available so it may make sense to create a few reports that analyze the time required to refresh various datasets, how frequently the datasets are being refreshed, etc.
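For that kind of duration analysis, the start/end arithmetic from the M query (Refresh Duration and Refresh Minutes columns) can be sketched in Python as well. The timestamp format string below is an assumption for illustration; adjust it to match the startTime/endTime values in your own exported files.

```python
from datetime import datetime

def refresh_minutes(start_time, end_time):
    """Whole-minute refresh duration from the API's startTime/endTime
    strings, mirroring the Refresh Minutes column in the M query.
    Assumed format: ISO-8601 like "2019-05-01T08:00:00Z"."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    delta = datetime.strptime(end_time, fmt) - datetime.strptime(start_time, fmt)
    return int(delta.total_seconds() // 60)
```

For example, a refresh starting at 8:00 and ending at 8:12:30 yields 12 minutes, matching the M query’s Duration.Minutes() behavior of truncating to whole minutes.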

Some of this data is available in the Power BI Premium metrics app, but (a) not all of your datasets may be in Premium capacity, and (b) even if they are, you won’t be able to configure an alert and group email notification on refresh failures, nor access the error message associated with a failure, as you can with this approach.

The Dashboard

You can now publish your Power BI report to a Power BI workspace, perhaps a workspace you use for other custom Power BI monitoring solutions. Once published to the workspace, you can create a dashboard and pin your card visuals to this dashboard (e.g. ‘Today’s Refresh Failures’) like the following image:

Dashboard Card Tiles with Refresh Failures per Dataset in Current Day

Per the alert icons in the top-left corner of two of the five tiles, an alert has been triggered by the value of the card. In this case, which I assume to be the normal configuration, I’ve set an alert on each tile to fire if the value of the card exceeds zero:

Data Alert for Dataset Refresh Failure

You can configure data alerts for standard card, KPI, and gauge visuals. Most importantly, per the final step, they can be used as the trigger for MS Flows.

*As mentioned at the beginning, you’ll need to add the JSON files as data sources to an on-premises data gateway and schedule a refresh for the Power BI dataset that loads these files and supports the report and dashboard.

The MS Flow; Email Notification

Go to MS Flow and create a new flow which utilizes a Power BI alert as the trigger for an Outlook email like the following image:

MS Flow with Power BI Alert as Trigger for Outlook Email

You’ll create one flow per dataset for which you want the email. Of course, if you’re managing Power BI responsibly, you’ll limit the volume of datasets to avoid version control issues and wasted resources, so you should only need something like this for datasets that support many important, highly visible reports and dashboards.

Wrapping Up

At first glance this may seem like a lot of work, but in reality, especially if you have any experience with PowerShell and the Power BI REST API, it doesn’t take much effort to get set up. I may include a few administrative solutions like this in upcoming Enterprise Power BI books.

Additionally, particularly given the coming Shared Datasets feature, it wouldn’t be surprising if some kind of group notification feature is made available in the near future, such that custom solutions like this aren’t necessary.

If you find this blog and others like it helpful, feel free to subscribe to Insight Quest (click ‘Follow’) and to share any thoughts or questions in the comments.
