Group Email Notification for Dataset Refresh Failure

New Refresh Notification Feature

There’s now a feature in Power BI which allows dataset owners to specify multiple users or groups to receive an email notification in the event of a refresh failure. Here’s the blog post announcement.

Given this new feature, the approach described in this blog post, which uses a data alert in Power BI and an MS Flow to send a custom notification email to the users or teams associated with a dataset, is no longer necessary.

However, particularly for widely used and important datasets (perhaps ‘certified’ datasets), it may still be a good idea to have a process which retrieves dataset refresh history. For example, you may want to regularly monitor whether the time required to refresh a dataset stays within certain boundaries, or compare the refresh times and reliability of different datasets across premium capacities and sources.

Custom Group Email Notification

The owners of Power BI datasets have the option to receive an email notification if a scheduled refresh fails. While this feature is helpful, some datasets serve as the source for potentially many highly utilized reports and dashboards, so a Power BI administrator or administration team may consider a custom BI solution that triggers a group email notification in the event a highly valued dataset fails to refresh.

Outlook Email triggered by Power BI Dataset Refresh Failure

Depending on the scenario, a Power BI admin could follow up the auto-generated email with another message to include the specific error noted in the refresh history JSON file. For example, the refresh history may identify the specific column that was missing in the rowset or that a gateway was offline.
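
For reference, the refresh history JSON returned by the API has roughly the following shape (all values below are illustrative only); the serviceExceptionJson property is where the error detail surfaces:

{
  "@odata.context": "...",
  "value": [
    {
      "id": 123456789,
      "refreshType": "Scheduled",
      "startTime": "2019-05-01T13:00:12.000Z",
      "endTime": "2019-05-01T13:02:45.000Z",
      "status": "Failed",
      "serviceExceptionJson": "{\"errorCode\":\"...\",\"errorDescription\":\"...\"}"
    }
  ]
}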

Components and Requirements

The solution described in this blog post involves the following tools and requirements:

  1. PowerShell
    • A single script (.ps1) will retrieve the refresh history for one or many Power BI datasets via Power BI REST API (Get Refresh History) and write this data out as JSON files to a network directory folder.
  2. Power BI Administrator Role
    • This user account will be able to access the Power BI workspace and datasets to be monitored for refresh history.
  3. SQL Server Agent
    • Several recent blog posts have involved Azure Automation as the scheduling tool of choice but in this case we’re using an on-premises SQL Server database with the SQL Server Agent service running.
  4. Power BI Desktop
    • Power Query (M) expressions are used to retrieve the JSON files containing dataset refresh history.
    • Power BI report pages including visuals which have been filtered to count the number of refresh failures in the current day for one or more datasets.
  5. Power BI App Workspace
    • Technically this isn’t necessary but it’s strongly recommended to publish the Power BI report from #4 (and perhaps other Power BI administration related content) to an app workspace.
      • This implies you’ve been assigned a Power BI Pro license.
  6. Power BI Dashboard
    • Card visuals from the Power BI report reflecting the count of refresh failures for a given dataset in the current day are pinned as tiles to a dashboard.
  7. Data Alerts
    • As card-based dashboard tiles, alerts are configured in the event the count of refresh failures for the given dataset is greater than zero.
  8. On-Premises Data Gateway
    • To refresh your Power BI report, which is based on the JSON files on an on-premises network, you’ll need to configure the files as data sources for your on-premises data gateway.
  9. MS Flow
    • Finally, one MS Flow is configured per dataset being tracked, and each flow is triggered by an alert from step #7.
    • In this case, the alert drives an email to a distribution list of users responsible for the dataset.

PowerShell Script

At the heart of the solution is of course the call you make to the Power BI REST API (via PowerShell in this case) to fetch the refresh history for the one or many datasets you want to track.

The REST API for dataset refresh history requires the dataset ID so this script uses the name of the workspace and the name of the dataset contained in this workspace to retrieve this value. At this point you simply write the data out to a folder location for easy access by Power BI.

You can download a sample PowerShell script that supports three Power BI datasets, like the following code, here:

#1. Authenticate to Power BI with Power BI Service Administrator Account

$User = "123456@abc.com"
$PW = "abcdefgh"

$SecPasswd = ConvertTo-SecureString $PW -AsPlainText -Force
$myCred = New-Object System.Management.Automation.PSCredential($User,$SecPasswd)

Connect-PowerBIServiceAccount -Credential $myCred

#2. Export paths 

$RefreshExportPath = "C:\Users\Brett Powell\ArtifactExport\RefreshHistory\"

$Dataset1Path = $RefreshExportPath + "Dataset1.json"
$Dataset2Path = $RefreshExportPath + "Dataset2.json"
$Dataset3Path = $RefreshExportPath + "Dataset3.json"

#3. Get Dataset IDs based on workspace

#Dataset ID for Dataset 1
$DS1WorkspaceID = Get-PowerBIWorkspace -Scope Organization -Name "Workspace Name of Dataset 1" | Where {$_.State -eq "Active"} | ForEach {$_.Id}

$DS1ID = Get-PowerBIDataset -Scope Organization -WorkspaceId $DS1WorkspaceID |
    Where {($_.IsRefreshable -eq $True) -and ($_.Name -eq "Dataset 1 Name")} | ForEach {$_.Id} | Get-Unique -AsString

#Dataset ID for Dataset 2
$DS2WorkspaceID = Get-PowerBIWorkspace -Scope Organization -Name "Workspace Name of Dataset 2" | Where {$_.State -eq "Active"} | ForEach {$_.Id}

$DS2ID = Get-PowerBIDataset -Scope Organization -WorkspaceId $DS2WorkspaceID |
    Where {($_.IsRefreshable -eq $True) -and ($_.Name -eq "Dataset 2 Name")} | ForEach {$_.Id} | Get-Unique -AsString

#Dataset ID for Dataset 3
$DS3WorkspaceID = Get-PowerBIWorkspace -Scope Organization -Name "Workspace Name of Dataset 3" | Where {$_.State -eq "Active"} | ForEach {$_.Id}

$DS3ID = Get-PowerBIDataset -Scope Organization -WorkspaceId $DS3WorkspaceID |
    Where {($_.IsRefreshable -eq $True) -and ($_.Name -eq "Dataset 3 Name")} | ForEach {$_.Id} | Get-Unique -AsString

#4. Refresh API URLs 

$URLStart = 'admin/datasets/'
$URLEnd = '/refreshes'

$DS1URL = $URLStart + $DS1ID + $URLEnd
$DS2URL = $URLStart + $DS2ID + $URLEnd
$DS3URL = $URLStart + $DS3ID + $URLEnd

#5. Export Refresh History files

Invoke-PowerBIRestMethod -Url $DS1URL -Method Get | Out-File $Dataset1Path

Invoke-PowerBIRestMethod -Url $DS2URL -Method Get | Out-File $Dataset2Path

Invoke-PowerBIRestMethod -Url $DS3URL -Method Get | Out-File $Dataset3Path

Per the above example, you essentially find the dataset IDs and build a URL string for the Invoke-PowerBIRestMethod cmdlet which calls the REST API for refresh history.

Why separate JSON files (one per dataset)? Converting the JSON output does not work for this API (you lose data), and retrieving the refresh history of multiple datasets into a single JSON file causes issues when reading the file into Power BI.

The Scheduled Job

In this case, I’ve configured a SQL Agent job to run this PowerShell script every 30 minutes on weekdays, starting at 8 AM and ending at 3 PM. Upon execution, the JSON files containing the refresh history data are replaced with new, updated files.

To execute the script remotely, either the PowerShell credential object (step #1 in the script) or a service principal with permissions granted to the Power BI REST API is necessary.
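
For the service principal route, the sign-in would look roughly like the following (a sketch; the application ID, secret, and tenant values are placeholders):

# Service principal sign-in (alternative to the user credential in step #1)
$AppId = "00000000-0000-0000-0000-000000000000"
$Secret = ConvertTo-SecureString "app-client-secret" -AsPlainText -Force
$SPCred = New-Object System.Management.Automation.PSCredential($AppId, $Secret)

Connect-PowerBIServiceAccount -ServicePrincipal -Credential $SPCred -Tenant "contoso.onmicrosoft.com"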

SQL Server Agent Job with Step to Execute PowerShell Script

SQL Server Agent has been around for a long time and I assume many administrators and readers of this blog are familiar with it. Nonetheless, here’s a link to Agent in MS Docs.
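
For the job step itself, an Operating system (CmdExec) step that calls the PowerShell executable is one simple option; SQL Server Agent also offers a native PowerShell step type. The script path below is hypothetical:

powershell.exe -ExecutionPolicy Bypass -NoProfile -File "C:\Scripts\DatasetRefreshHistory.ps1"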

Power BI Desktop

So now we have the refresh history data available and updated regularly. We just need, at a minimum, a report which loads these files and applies the essential logic to count the refresh failures and filter for the current day.

Power Query (M) Expressions

Write a single M expression (query) to retrieve and enhance one of the datasets (from JSON file source) like the following:

let
    Source = Json.Document(File.Contents(Dataset1Path)),
    #"Converted to Table" = Record.ToTable(Source),
    #"Removed Top Rows" = Table.Skip(#"Converted to Table",1),
    #"Expanded Value" = Table.ExpandListColumn(#"Removed Top Rows", "Value"),
    #"Expanded Value1" = Table.ExpandRecordColumn(#"Expanded Value", "Value", {"id", "refreshType", "startTime", "endTime", "status"}, {"id", "refreshType", "startTime", "endTime", "status"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Value1",{{"id", "Refresh ID"}, {"refreshType", "Refresh Type"}, {"startTime", "Refresh Start Time"}, {"endTime", "Refresh End Time"}, {"status", "Refresh Status"}}),
    ColTypes = Table.TransformColumnTypes(#"Renamed Columns",{{"Refresh ID", Int64.Type}, {"Refresh Type", type text}, {"Refresh Start Time", type datetime}, {"Refresh End Time", type datetime}, {"Refresh Status", type text}}),
    WorkspaceCol = Table.AddColumn(ColTypes, "Workspace", each "The Name of the Workspace", type text),
    DatasetCol = Table.AddColumn(WorkspaceCol, "Dataset", each "The Name of the Dataset", type text),
    RefreshDurationCol = Table.AddColumn(DatasetCol, "Refresh Duration", each [Refresh End Time] - [Refresh Start Time], type duration),
    MinutesCol = Table.AddColumn(RefreshDurationCol, "Refresh Minutes", each Number.RoundDown(Duration.TotalMinutes([Refresh Duration])), Int64.Type)
in
    MinutesCol

The name of the workspace and the dataset itself are not included in the JSON output, so I’ve added these columns via Table.AddColumn(). Additionally, to support other analysis such as the duration of dataset refreshes, I’ve added two more columns: one that computes the refresh duration and one that converts it to minutes (via Duration.TotalMinutes() rather than Duration.Minutes(), so durations over an hour aren’t truncated to the minutes component).

Notice that Dataset1Path is a parameter holding the network path to the JSON file for Dataset1. I’d recommend parameters for almost all data sources to support easy, quick changes in the future. Moreover, I could clean up the query a bit more – the intent is just to show you the essential transformations for your own application.
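
For reference, a text parameter in Power BI Desktop is itself just a query; a minimal sketch of the Dataset1Path parameter (reusing the export path from the PowerShell script) would be:

"C:\Users\Brett Powell\ArtifactExport\RefreshHistory\Dataset1.json" meta [IsParameterQuery=true, Type="Text", IsParameterQueryRequired=true]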

Once you’re satisfied with your query for a single dataset, you can simply duplicate this query and change the values to retrieve the other datasets. Finally, you can append the dataset queries into a single refresh history query via the Table.Combine() function.
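
The append step can be as simple as the following expression (a sketch; the three query names are illustrative):

let
    // Combine the per-dataset refresh history queries into one table
    Source = Table.Combine({#"Dataset 1 Refresh History", #"Dataset 2 Refresh History", #"Dataset 3 Refresh History"})
in
    Source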

The Report

In this case I’m using relative date filtering at the page-level scope to filter down some card visuals containing the count of refresh failures:

Relative Date Filter: Current Day

Yes, we could handle this filtering via DAX too. In this case, to keep the DAX as simple as possible, I have a Total Refreshes measure which simply counts the number of rows (COUNTROWS()) in the refresh history table and a Refresh Failures measure as the following:

Refresh Failures =
VAR FailureCount =
    CALCULATE([Total Refreshes], 'RefreshHistory'[Refresh Status] = "Failed")
RETURN
    IF(ISBLANK(FailureCount), 0, FailureCount)
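
For reference, the base measure is just a row count over the refresh history table (named RefreshHistory here):

Total Refreshes = COUNTROWS('RefreshHistory')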

You then add some card visuals and filter each down (visual-level filter) to a specific dataset, like the following example:

Refresh Failure Cards in Power BI Report

Remember that you’ll also have the successful refresh history data available so it may make sense to create a few reports that analyze the time required to refresh various datasets, how frequently the datasets are being refreshed, etc.
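
For example, using the Refresh Minutes column added in the M query earlier, a simple starting point for duration analysis might be:

Average Refresh Minutes = AVERAGE('RefreshHistory'[Refresh Minutes])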

Some of this data is available in the Power BI Premium metrics app, but A) not all of your datasets may be in premium capacity, and B) even if they are, you won’t be able to configure an alert and group email notification on the refresh failures, nor access the error message associated with a refresh failure, as you can with this approach.

The Dashboard

You can now publish your Power BI report to a Power BI workspace, perhaps a workspace you use for other custom Power BI monitoring solutions. Once published to the workspace, you can create a dashboard and pin your card visuals to this dashboard (e.g. ‘Today’s Refresh Failures’) like the following image:

Dashboard Card Tiles with Refresh Failures per Dataset in Current Day

Per the alert icons in the top-left corner of two of the five tiles, an alert has been triggered by the value of the card. In this case, which I assume to be the normal case, I’ve configured an alert on each tile to fire if the value of the card exceeds zero:

Data Alert for Dataset Refresh Failure

You can configure data alerts for standard card, KPI, and gauge visuals. Most importantly, per the final step, they can be used as the trigger for MS Flows.

*As mentioned at the beginning, you’ll need to add the JSON files as data sources to an on-premises data gateway and schedule a refresh for the Power BI dataset that loads these files and supports the report and dashboard.

The MS Flow: Email Notification

Go to MS Flow and create a new flow which utilizes a Power BI alert as the trigger for an Outlook email like the following image:

MS Flow with Power BI Alert as Trigger for Outlook Email

You’ll create one flow per dataset that you want the email for. Of course, if you’re managing Power BI responsibly, you’ll try to limit the volume of datasets to avoid version control issues and wasted resources, so you should only need something like this for those datasets which support many important, highly visible reports and dashboards.

Wrapping Up

At first glance this may seem like a lot of work but in reality, especially if you have any experience with PowerShell and the Power BI REST API, this doesn’t have to be that much effort to get set up. I may include a few administrative solutions like this in upcoming Enterprise Power BI books.

Additionally, particularly given the coming Shared Datasets feature, it wouldn’t be surprising if some kind of group notification feature is also made available in the near future such that custom solutions like this aren’t necessary.

If you believe this blog and others like it are helpful feel welcome to subscribe to Insight Quest (click ‘Follow’) and to share any thoughts or questions in the comments.

14 comments

  1. Hi Brett,
    I am trying to implement the same in my system. I used your script, changed the password and the destination folder location; the script executes with no errors but there are no JSON files in the folder. I am very new to PowerShell scripting and am in dire need of implementing this functionality. It is awesome to see what you have done. Could you please help me with this? I greatly appreciate your help. Thanks in advance.


    1. Hi Amy,

      You should make sure you’ve been assigned the Power BI Service Admin role or, less likely, that you’re a global admin for the tenant. If you’re a PBI admin, you should check that the variables which retrieve the workspace and dataset ID values are indeed producing the values expected. You can find these values by navigating to the workspace and dataset settings in the PBI service.

      For example, you would first connect to PBI (Connect-PowerBIServiceAccount) and then highlight the lines of the script which retrieve the workspace and dataset ID for one of the datasets you want to produce the refresh history for. Assuming no errors, you can then just add a line of code to the script that only references one of the variables (no expressions or logic, just the variable itself) and execute this line of the script. If this variable is blank, you likely need to revise the Get-PowerBIWorkspace or Get-PowerBIDataset logic or maybe you’re not a PBI admin or don’t have access to the workspace.
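
      For example, a sanity check can be as simple as the following lines (using the variable names from the script above), run after the ID-retrieval lines:

      $DS1WorkspaceID
      $DS1ID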

      If the workspace you need is the current or O365-based workspace (not the new workspaces) then the Type -eq Workspace filter could be the issue for you. You could switch to $_.Type -eq “Group” if this was the case or remove the Where object filter altogether.

      I’ve been using this approach successfully for months and thus I’m pretty confident it works though certainly some experience with PowerShell and the PBI module is helpful.


  2. Thank you so much for the reply Brett. I will try the same again and come back if there’s any doubt. Can we connect through email? If you don’t mind, please send me a ping at my mail: powerquest139@gmail.com.


  3. Hi Brett, great post, it helped me a lot in keeping track of my refreshes!
    I was just intrigued about the reason you had to explicitly define the datasets you wanted the refresh history for. Have you seen the following script?

    https://github.com/Microsoft/powerbi-powershell/blob/master/examples/Export-RefreshHistory.ps1

    I tested it and it seems to work fine for all datasets and all workspaces in my tenant.

    Thanks a lot, you are great!


    1. Hi Leonardo –

      Thanks for your comment and link – sorry for the delayed response. No, I hadn’t seen or used that script before but its tenant-wide scope could definitely be useful and ultimately superior to the dataset-specific approach from this blog post. Rather than retrieving and importing the refresh history for specific datasets, you’d load all the dataset refresh history and then filter this data to the specific datasets for which you want to drive the group email alert.

      As mentioned in the blog and many times before, you really want to minimize the number of datasets in the tenant. You also may only need/want a group email alert for a few high-priority datasets such as a production dataset for the finance team, a production dataset for merchandising, etc. Each group email implies a distinct MS Flow, and a much more verbose PowerShell script that retrieves all datasets could be unnecessary. Thanks for visiting this blog.


      1. Thank you very much for your attention!
        And thanks for the great content you deliver, congrats.
        Regards


    1. Hi Jay,

      Thanks for your comment and following this blog. I’ve added a section to the top of this blog post that links to the announcement of the new notification feature.

      I’ve been wanting to make this update since the announcement but have been a bit busy. Per the updated section, I still generally think the idea of retrieving dataset refresh history into some form of data store for later analysis has merit.

