Analyze SQL Agent Job History with Power BI

The purpose of this post is to share and describe a Power BI-based solution for analyzing SQL Server Agent job history data. SQL Server Agent has been around for many years, and though my projects increasingly involve other scheduling and orchestration tools such as Azure Automation, SQL Agent remains relevant both for on-premises BI/DW processes and in Azure via SQL Managed Instance.

In fact, the dependency on SQL Agent for scheduled jobs is one of the main reasons SQL Managed Instance was created, as this reduces the friction of migrating on-premises workloads to Azure.

You can download the Power BI template (.PBIT) described in this blog from GitHub:

Power BI Template for SQL Agent Job History

Power BI Template: SQL Instance Parameter Required

I’d strongly recommend you read through this blog post to better understand the context and details of this solution before you attempt to execute the template. At a minimum, you should check that the SQL Agent service is running, note the name of the server it’s running on (the one required parameter per the above image), and confirm that you can query the SQL Agent tables in the MSDB database, such as dbo.sysjobs.

This template uses native T-SQL queries, so in Power BI Desktop you may also want to disable the approval requirement for new native database queries prior to running the template:

Global Security Options in Power BI Desktop

SQL Server Agent Background

I assume many readers of this blog are familiar with SQL Agent and thus can skip ahead to later sections. For others, the SQL Agent service allows for the creation and scheduling of ‘jobs’, with each ‘job’ comprised of a sequence of one or more ‘steps’. For example, a job could first execute a T-SQL stored procedure, then run an SSIS package, and perhaps finally process/refresh an SSAS database or execute a PowerShell script. SQL Agent logs the duration of each step, each instance/execution of the overall job, any retries attempted, error messages, and other useful information.

In a common use case, a BI and data warehouse team will link together multiple steps (or multiple jobs) to support a nightly data warehouse and data model update process. If the nightly SQL Agent job fails, the team/user closest to the technology involved in the given step will review the agent log and apply changes necessary to successfully complete the process from the point of failure and in future instances of the job.

The following image from SQL Server Management Studio (SSMS) is from a four-step SQL Agent job scheduled to occur nightly:

SQL Server Agent Log File Viewer

In the above example, Power BI and Azure AD data is retrieved via PowerShell scripts in steps 1 and 2, then an SSIS package loads this data into a SQL Server database in step 3, and finally a Power BI dataset used for administering Power BI is refreshed in step 4. The entire process executed in about 2.5 hours, with the Azure AD data retrieval step requiring 1.5 hours by itself.

As an aside, some of the PowerShell scripts I’ve featured in recent blog posts containing loops over Power BI app workspaces require significant time (e.g. 52 minutes per the image above), and of course loops are suboptimal from a performance standpoint. There’s a recent blog post on avoiding workspace loops that you may want to consider. However, if you filter your workspaces to exclude personal/‘my’ workspaces prior to executing the loops, a nightly 50+ minute process such as this example might be acceptable for a large Power BI environment.

What problem are we solving?

When you’re getting started with SQL Agent, just viewing the job history and error log files via the graphical interfaces in SSMS might be sufficient. If you want to become more productive and proactive in your analysis, you’ll look for a programmatic solution, such as writing T-SQL queries against the SQL Agent tables in the MSDB system database, including dbo.sysjobs, dbo.sysjobhistory, dbo.sysjobsteps, and others. Over the years, several blog posts have been written with example T-SQL approaches to querying this data, and MS Docs itself provides an example T-SQL query for the job history table.

The Power BI-based solution starts with T-SQL queries but builds on top of this layer to leverage several of Power BI’s rich and integrated features, including Power Query functions, DAX expressions, report visualizations, and more. The core idea of this solution is to have a data model in place (a Power BI ‘dataset’), including fact-to-dimension relationships and measures with built-in calculation logic, to make it relatively easy to build SQL Agent reports and dashboards and to browse/analyze SQL Agent history data.
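For a concrete sense of this layering, here’s a minimal sketch of the kind of query involved, expressed the way this solution submits it from Power Query as a native query (the server name and column list here are illustrative, not the template’s exact statement):

    let
        // Native T-SQL submitted to the msdb system database
        Source = Sql.Database("MyServerName", "msdb",
            [Query = "SELECT j.name AS job_name, h.step_id, h.run_status, h.run_date, h.run_duration
                      FROM dbo.sysjobhistory AS h
                      INNER JOIN dbo.sysjobs AS j ON h.job_id = j.job_id"])
    in
        Source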

Since the data model already includes the SQL Agent data stored in the MSDB database and can be refreshed on a schedule, users (typically DBAs and other IT admins) don’t have to write new or modified SQL queries on an ad hoc basis to answer different questions. Within the PBIX file containing the data model itself, or from a separate file or tool connected to the published dataset, users can drag, drop, and filter to find what they need.

It’s outside the scope of this blog post, but you could leverage any number of tools to analyze the SQL Agent History Power BI dataset beyond Power BI itself. Given XMLA endpoint connectivity, you could use Excel, paginated/SSRS reports, and other third-party BI tools, just as you would with an Analysis Services database.

Report Pages (Visual Layer)

In describing any Power BI solution, I’d prefer to start from the source systems and data, walk through the logic and components of the Power Query Editor, and then cover the Modeling view before saying anything about reports or dashboards. However, experience suggests that introducing the report/visual layer first, or at least a representative sample of it, better addresses the ‘Why does this exist?’ question and helps maintain attention through the two more fundamental layers.

With that being said, the following short slideshow displays the three report pages and highlights two tooltip pages included in this initial version of the template:

The two main report pages of the PBIX file that’s generated from the template are the Job Level and Job Step Level pages. As mentioned earlier, SQL Agent is composed of ‘jobs’, and jobs contain one or many ‘steps’ which execute on a schedule. Therefore, these two pages help address the most basic questions, such as “Do I have any job failures?”, “How long are my jobs taking to complete?”, and “Which steps within my jobs are taking the longest to complete or are failing?”

There’s a report-level filter (impacting all report pages) set for the trailing 30 days, and you can use the Date slider at the top to modify the date range of the report based on the run date of the job. You can also use the slicer at the top right to filter for one or multiple jobs of interest; the slicers are synchronized across the pages such that these selections carry over.

The tooltip pages are intended to provide additional context as you hover over a specific agent job or a specific agent job step. The bar charts on the Job Level and Job Step Level pages will render these tooltip pages to give additional details relevant to these specific items (or ‘dimension members’ if you prefer) including a line chart of the duration for the given job or step over time.

Many additional, more visually impressive and analytically powerful reports could be developed. Depending on the extent of SQL Agent usage, these additional reports could make sense, but for common use cases such as routine monitoring and troubleshooting, these reports or slight variations might be sufficient and represent quite an improvement over ad hoc SQL queries.

The Data Model (Semantic Layer)

At the center of the solution are two star schema models – one for job history at the job level and one at the job step level. You might think of this as the header or order level of a sales or purchase order versus the line level of these transactions which includes the individual products.

The following two images from the Modeling view expose the fact-to-dimension (many-to-one) relationships.

Both the SQL Agent Job Step History and the SQL Agent Job History fact tables have many-to-one relationships with the Date and the SQL Agent Jobs dimension tables. The SQL Agent Job Step History fact table also has an active many-to-one relationship to the SQL Agent Job Steps dimension table.

Each row of the SQL Agent Job History table reflects one instance of a job being executed from start to finish inclusive of all steps. Each row of the SQL Agent Job Step History table reflects the execution of one step of a job. The source table in the MSDB database for both the aggregate/job level fact table and the step-level fact table is dbo.sysjobhistory. Every row of a fact table should have the same meaning and therefore there’s filtering logic in the solution to separate the rows representing the overall job from those representing an individual step of a job.
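As a rough illustration of that split, assuming a staging query named stgJobHistory over dbo.sysjobhistory (in dbo.sysjobhistory, a step_id of 0 marks the ‘(Job outcome)’ row for the overall job):

    // Within a let expression: rows describing the overall job outcome (one per job execution)
    JobLevelRows  = Table.SelectRows(stgJobHistory, each [step_id] = 0),

    // Rows describing individual step executions
    StepLevelRows = Table.SelectRows(stgJobHistory, each [step_id] > 0)

The template’s actual queries filter on the step name rather than step_id, but the intent is the same: each fact table holds rows of a single grain.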

In addition to the relationships, there are display folders of basic DAX measures in each fact table per the following image:

SQL Agent DAX Measures

As their names indicate, these measures either count the number of jobs or steps that succeeded or failed, or apply an aggregation to a duration fact column (in minutes). The presence of the DAX measures and the relationships makes development of the report pages relatively straightforward.

Power Query (M) (Data/ETL Layer)

The fact and dimension tables of the data model are the product of a data transformation process executed via T-SQL statements against the Agent tables in the MSDB system database and Power Query (M) expressions. The T-SQL statements contain logic to prep and stage the Agent tables in MSDB such as adjusting date and time columns into data types and structures more suitable for reporting. The results of these SQL queries are further enhanced and modified with Power Query (M) expressions.
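To give one example of what this prep involves: dbo.sysjobhistory stores run_date as an integer in YYYYMMDD form (e.g. 20230415). The template handles this conversion in its T-SQL statements, but the equivalent logic expressed in M would look roughly like this:

    // Convert an integer like 20230415 into a proper date value
    RunDateToDate = (rd as number) as date =>
        #date(
            Number.IntegerDivide(rd, 10000),                   // year
            Number.IntegerDivide(Number.Mod(rd, 10000), 100),  // month
            Number.Mod(rd, 100)                                // day
        )

    // RunDateToDate(20230415) returns #date(2023, 4, 15)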

Here’s a look at the Power Query Editor:

Power Query Editor

As usual, I like to use query groups to separate the queries being loaded to the data model (the dimension and fact tables) from those that are not loaded but are required for staging/prepping.

One item you might find a bit novel or interesting in this design is that I’m storing T-SQL statements in their own ‘queries’ returning text primitive values – the text value being the T-SQL statement. In this way, the queries which invoke the Sql.Database() function can simply reference these ‘queries’ as the value in the Query parameter. This means that A) the staging queries against the MSDB tables are shorter and cleaner to read and B) the T-SQL statements can be more easily viewed and modified if necessary in the future.
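For example, using names from the template (the server parameter name here is illustrative), AgentJobsTSQL is a ‘query’ whose value is simply text, and the staging query passes it through to the Query option:

    // AgentJobsTSQL — evaluates to a text primitive value
    "SELECT job_id, name, enabled, date_created FROM dbo.sysjobs"

    // stgSQLAgentJobs — references the text value above as the native query
    let
        // SQLServerInstance is the template's required server parameter (name illustrative)
        Source = Sql.Database(SQLServerInstance, "msdb", [Query = AgentJobsTSQL])
    in
        Source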

In this version, I filter out common system Agent jobs such as ‘SSIS Server Maintenance Job’ and ‘Sys Policy History Purge’ in order to focus the solution on user/business-created jobs. There are filters and queries to split job-level and step-level data into their own flows, along with joins to integrate job schedule attributes and to support the relationship between step-level job history and the job steps dimension.

The date table is completely created on the fly via Power Query (M) expressions. For this version, the query creates a date table from the current date (at the time of execution) back 365 days. This might be acceptable for the limited scope of this solution or you might want to replace this date table with an ‘official’ date dimension table if you have one.
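A minimal sketch of that approach (the template’s actual date query adds more columns such as year and month):

    let
        Today     = Date.From(DateTime.LocalNow()),
        StartDate = Date.AddDays(Today, -365),
        DateList  = List.Dates(StartDate, 366, #duration(1, 0, 0, 0)),  // one value per day
        AsTable   = Table.FromList(DateList, Splitter.SplitByNothing(), {"Date"}),
        DateTable = Table.TransformColumnTypes(AsTable, {{"Date", type date}})
    in
        DateTable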

Given the length of this post, I can’t go into significant depth here on the queries, but there are comments included in the M expressions, and I think the structure and syntax are simple enough that an intermediate user of Power Query (M) could reason her way through the queries.

Don’t mix native queries and M, normally

If you’ve been working with Power Query (M) for any significant period of time you know that any logic/transform you apply after a native query will not be folded back to the source system. The general guidance is always to either A) accomplish all required logic in the native query or B) accomplish all required logic in a Power Query (M) expression which can be folded/translated into a query for execution by the source system.
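As a quick illustration (names are hypothetical), everything after the native query below executes in the Power Query engine; the filter is applied locally to the rows the SQL statement returned rather than being folded into it:

    let
        Source    = Sql.Database("MyServerName", "msdb",
            [Query = "SELECT job_id, step_id, run_status FROM dbo.sysjobhistory"]),
        // Executes locally — this predicate is NOT translated back into the T-SQL
        Succeeded = Table.SelectRows(Source, each [run_status] = 1)
    in
        Succeeded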

Therefore, I want to be very clear that I’m only using both tools in the same solution because the scale of this data is very small. When you’re querying DMV and system table sources which contain maybe hundreds or a few thousand rows, some level of processing executed via the Power Query engine is probably acceptable and even advisable in some cases.

In this case, Power Query functions such as #duration, #time, and Duration.TotalMinutes() are used on top of the SQL statements, and in my view this is actually a simpler approach than consolidating all of the logic into much more extensive T-SQL queries.
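For instance, run_duration in dbo.sysjobhistory is an integer in HHMMSS form, so a value of 13542 means 1 hour, 35 minutes, and 42 seconds. A hedged sketch of converting that to total minutes with these functions:

    // Convert an HHMMSS integer such as 13542 into total minutes
    RunDurationToMinutes = (rd as number) as number =>
        Duration.TotalMinutes(
            #duration(
                0,                                                  // days
                Number.IntegerDivide(rd, 10000),                    // hours
                Number.IntegerDivide(Number.Mod(rd, 10000), 100),   // minutes
                Number.Mod(rd, 100)                                 // seconds
            )
        )

    // RunDurationToMinutes(13542) returns 95.7

If you beg to differ, perhaps given relative comfort/skill in T-SQL, you can always refactor the T-SQL queries and remove the M expressions.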

Wrapping Up

If you work with SQL Agent regularly and are at least open to using Power BI, I hope you’re able to give this template a try and find some value in it. If you don’t use SQL Agent, maybe this blog post and sample solution can give you some ideas for leveraging Power BI to build your own monitoring and administrative solutions for the tools/services you’re responsible for.

If you’d like to receive email notification of future blog posts you can click ‘Follow’ in the upper right.

26 comments

  1. Hi

    Thank you for your template. I am doing a test on my SQL Server, but for some reason the stgSQLAgentJobHistoryJobLevel query is not returning anything, resulting in no data in SQL Agent Job History.
    I must be missing something.

    Can you assist me please

    Thanks

    1. Hi

      my issue was related to the last step of stgSQLAgentJobHistoryJobLevel, as it was looking for “(Job outcome)” and I am using the French language :).

      Table.SelectRows(RenameCols, each ([Step Name] = "(Job outcome)"))

      Thank you again for your great work

  2. Hi

    Thanks for the awesome template and article.

    I am currently receiving the below issue when trying to refresh tables.

    I have no idea on the solution to this, please can you advise ?

    Expression.Error: We cannot convert the value null to type Number.
    Details:
    Value=
    Type=[Type]

    1. Hi Andrew,

      Thanks for your note. To troubleshoot the issue, open the Power Query Editor and refresh the preview for the staging queries (stgSQLAgentJobs, stgSQLAgentJobSchedules, etc.) to see which staging query is producing an error. There are four T-SQL statements stored within their own queries (AgentJobHistoryTSQL, AgentJobsTSQL, etc.) – you can copy this code, paste it into SSMS, and try to execute the query against the MSDB system database which contains these tables (e.g. dbo.sysjobs, dbo.sysjobschedules, etc.). Notice that the T-SQL statements contain a WHERE clause for run_date > 0 (or next_run_date or last_run_date > 0).

      You might first run simple SELECT statements against dbo.sysjobhistory, dbo.sysjobs, dbo.sysjobschedules, and dbo.sysjobsteps, as these are the source tables queried by the template. You can then try the SELECT statements used in the template to see which one produces an error. If you don’t have any data in these source tables, or you can’t query the tables, you should ensure that you have SQL Agent jobs running on this instance and that you have permissions to query the MSDB database. I haven’t had any issues with the template, and apparently quite a few readers of this blog have been able to use the template successfully, but I suppose it’s possible that for your data you may need to tweak something – I’d need more detail/access to know for sure.

      Hope any of this helps – thanks for visiting this blog.

  3. Hi Brett,

    Thank you again for the template. Is it possible to add the ability to view ongoing jobs, maybe on a separate page? I have some jobs that take 8-12 hours.
    Is it also possible to keep historical data? On my SQL Server I am keeping logs for only the last 4 weeks, but my wish is to keep the data from all periods in Power BI.

    1. Hi Adel,

      You can build a report page which filters on the Run Status column of the SQL Agent Job Step History table to be ‘In Progress’. The SQL Agent Job History (at the job level) is filtered to retrieve the Job Outcome rows only (Step Name = Job Outcome). However, the Step Level table excludes these rows and thus the In Progress rows from dbo.sysjobhistory are allowed to load for reporting/analysis.

      In terms of keeping all historical data I don’t think there’s an easy way to handle this with Power BI alone. You would need a separate process, possibly with a tool like SSIS, to regularly archive this data and then point the template to these archive tables.

  4. Nice solution you have there, very handy! I have two problems. The first is when you have multiple schedules on one job (many-to-one error). The second is the same as the previous comment (error converting null to number in an unknown column). Regards, David

    1. Thanks David, I’ll consider this for an updated version at some point. I’m currently writing a Second Edition of Mastering Power BI, so I don’t have capacity for this updated tool for 2-3 months.

  5. Hi Brett

    I am getting an error when there are multiple schedules for the same job. It gives an error because the “SQL Agent Jobs” table is on the one side of a relationship in the data model.

    Thanks

    1. Hi Swapnil,

      Yes, if you open the Power Query Editor you can see in the SQL Agent Jobs query (in the Dimension Tables group) that there’s a join between stgSQLAgentJobs and stgSQLAgentJobSchedules. If there are multiple job schedules associated with a single job, this creates additional rows in the Jobs table and thus violates the one-to-many relationship of SQL Agent Jobs to the two fact tables. In the Modeling view, you could revise the relationship between SQL Agent Jobs and SQL Agent Job History to many-to-many. Likewise, you could revise the relationship between SQL Agent Jobs and SQL Agent Job Step History to many-to-many to avoid this error. You can make this change in the Cardinality dropdown of the Edit Relationship menu. The template model assumes there’s either zero or one schedule per job, and this is likely the case for the great majority of jobs, but for multi-schedule jobs you need to switch the cardinality of these relationships in the model.

  6. Great work, this was something I was looking for. I was trying to refresh this Power BI report and am getting the error “the column job_id in table sql_job_agent contains a duplicate value and this is not allowed.” Has anyone faced this error?

    1. Thanks Finny,

      Sorry to hear of your issue and that I couldn’t respond sooner. The model contains one-to-many relationships between SQL Agent Jobs and SQL Agent Job History as well as between SQL Agent Jobs and SQL Agent Job Step History. However, the SQL Agent Jobs query contains a join to job schedule information, and thus, in the event that there are multiple job schedules associated with a single job, this results in additional rows for SQL Agent Jobs – some of which have the same job ID. For tables on the one side of relationships, the relationship column needs to have unique values.

      In the Modeling view, you can revise the relationship cardinality between SQL Agent Jobs and the two SQL Agent history fact tables to many-to-many. Double-click the line connecting the tables, or open the Manage Relationships dialog via the Modeling tab, and then edit the relationship. With many-to-many relationships defined between SQL Agent Jobs and SQL Agent Job History, as well as SQL Agent Jobs and SQL Agent Job Step History, you should avoid the duplicate value error (many instances of the same job ID will be allowed).

  7. Thank you Brett for the nice report.
    I got the below error when I tried to refresh the report via Power BI app.
    There is a gateway installed on the server.
    Am I missing something?

    Refresh failed due to gateway configuration issues.
    This dataset requires a properly configured gateway in order to refresh.
    If you’re using personal mode, please make sure your gateway is online.
    If you’re using enterprise mode, please make sure you’ve added the following data sources in the Gateway
    Management Portal
    SqlServer { server : “xxxx” , database : “msdb” }

    1. Hi Andrew,

      Sorry to hear of this issue. I’ve used the dataset and report with an on-premises data gateway (standard mode, not personal). Assuming the gateway installed is in standard mode you’ll want a gateway admin to A) check that the specific SQL server database is configured as a data source with the necessary authentication and credentials and B) ensure that you have permission to use this data source for this gateway. If the gateway is configured for this SQL Server database and you have permission to the source for this gateway, the dataset settings page should present a dropdown option to associate the dataset with a gateway. The gateway source will light up green if it aligns with the source in the dataset. You may test a separate PBI dataset with a simple query against the MSDB database and see if PBI recognizes this source for a gateway. Generally I wouldn’t recommend using the personal mode gateway so I’d consider this a very last option to get started.

  8. Hi

    First of all, thank you for making this template available. I had already developed a worse model before seeing this post, but I switched my report to work on your data model and implemented a few new things. It works flawlessly; however, when I publish the report to the Power BI service and refresh the dataset, the pages are just blank. The refresh doesn’t fail, but the tables are empty. In Power BI Desktop it works just fine, and I can refresh it in Power BI Desktop, upload the report without refreshing, and it will work until the underlying dataset is updated. I’ve tried to simply upload your template without any modification (except pointing to our DWH) and it’s the same issue. I believe it might have something to do with the dependent queries refreshing before the other queries are done, because sometimes only the “SQL Agent Job Step History” fact table is empty while the “SQL Agent Job History” fact table loaded properly.

    Any idea what could cause this issue?

    1. Hi Rasmus,

      Thanks for giving this template a try. If you can refresh the dataset in Power BI Desktop and the tables load locally, and if the published dataset refreshes successfully, then I’d first query the tables to determine whether they’re indeed empty or whether your report template is empty. If the dataset is in a Premium capacity, you could connect via the XMLA endpoint and check the row counts (COUNTROWS()) of the tables. If the dataset is not in a Premium capacity, you could add a couple of COUNTROWS() measures to the dataset, live connect to the published dataset, and check if the tables are empty.

      If the published dataset tables are not empty after refresh, then I’d think there’s a filter context issue in your report. You might check if there’s any filters being applied at any level and if you can revert the default filters. That’s about all I can think of without reviewing your solution more closely.

      Good news is that I’m working with a client that uses SQL Agent extensively and expect to get approved to deploy a modified version of this template to their environment. This will give me an opportunity to walk through the template again and make a number of enhancements to share with a new blog post.

      Have a Merry Christmas.
      Brett

      1. Hi Brett

        Thank you for your reply

        When live connecting to the power bi data set, the tables are empty as well.

        The dataset’s scheduled refresh history says successful every time it has been refreshed. Sometimes some of the tables are populated with data, but not always. This makes me believe there is a bug or inconsistency in the way the tables are populated, and since some tables are dependent, it might populate a table before the previous ones have been populated completely.

    2. Hi Rasmus,

      I had the same issue with mine; my report would refresh automatically but it would just show blank, yet when I refreshed it manually it would be fine. So I just re-uploaded the Power BI report to a different folder and it strangely started working.

      1. Hi Finny

        Thank you for the suggestion. I tried that and it actually worked the first 2 times it refreshed, but now I’m stuck on the same issue again. Very weird. Let me know if yours still works as expected after some time 🙂

      2. You may have to take the SQL statements from the dataset, execute them against your database, and walk through the applied steps in Power Query to find the steps, scenarios, and dependencies causing the rows to be lost. Sorry for the issue – I’ll look to have an enhanced version sometime soon that’s more resilient.

  9. Haven’t had time to look into the issue since then, but it still persists.

    A couple of ideas for your next iteration (might also do it myself when I get the time for it):

    Make it a DirectQuery model that allows live connection so you can see stuff like what is currently running. Power BI Service only allows periodic updates on imported data.

    Add SSISDB info – which SSIS packages fails/succeed

    Add SSAS DMV queries for SSAS info and status. I have a standalone SSAS model for documentation that I can share if this is of any interest for you.

    Scope might get too big, but just some thoughts.

  10. Thank you for the efforts in developing this. Indeed very helpful. However, one question: how can I publish this to Power BI Report Server and use a parameter to change the SQL Server to be monitored from there? I don’t see any option to do this on PBI RS. Are you aware of any workaround or way to get this working on Power BI Report Server?
