Analyze SQL Agent Job History with Power BI

The purpose of this post is to share and describe a Power BI-based solution for analyzing SQL Server Agent job history data. SQL Server Agent has been around for many years and though increasingly my projects involve other scheduling and orchestration tools like Azure Automation, SQL Agent remains relevant both for on-premises BI/DW processes and in Azure via SQL Managed Instance.

In fact, dependencies on SQL Agent for scheduled jobs is one of the main reasons SQL Managed Instance was created as this reduces the friction in migrating on-premises workloads to Azure.

You can download the Power BI template (.PBIT) described in this blog from GitHub:

Power BI Template for SQL Agent Job History

Power BI Template: SQL Instance Parameter Required

I’d strongly recommend you read through this blog post to better understand the context and details of this solution before you attempt to execute the template. At a minimum, you should check that you have the SQL Agent service running, note the name of the server its running on (the one required parameter per the above image), and confirm that you can query the SQL Agent tables in the MSDB database such as dbo.sysjobs.

This template uses native T-SQL queries so in Power BI Desktop you might also disable the user approval requirements for new native queries prior to running the template:

Global Security Options in Power BI Desktop

SQL Server Agent Background

I assume many readers of this blog are familiar with SQL Agent and thus can skip ahead to later sections. For others, the SQL Agent service allows for the creation and scheduling of ‘jobs’ with each ‘job’ comprised of a sequence of one more ‘steps’. For example, a job could first execute a T-SQL stored procedure, then run an SSIS package, and maybe finally process/refresh an SSAS database or execute a PowerShell script. SQL Agent logs the duration of each step, each instance/execution of the overall job, any retries attempted, error messages, and other useful information.

In a common use case, a BI and data warehouse team will link together multiple steps (or multiple jobs) to support a nightly data warehouse and data model update process. If the nightly SQL Agent job fails, the team/user closest to the technology involved in the given step will review the agent log and apply changes necessary to successfully complete the process from the point of failure and in future instances of the job.

The following image from SQL Server Management Studio (SSMS) is from a four-step SQL Agent job scheduled to occur nightly:

SQL Server Agent Log File Viewer

In the above example, Power BI and Azure AD data is retrieved via PowerShell scripts in steps 1 and 2, then an SSIS package is used to load this data to a SQL Server database in step 3, and finally a Power BI dataset used for administering Power BI is triggered to refresh. This entire process executed in about 2.5 hours with the Azure AD data retrieval step requiring 1.5 hours itself.

As an aside, some of the PowerShell scripts I’ve featured in recent blog posts containing loops over Power BI app workspaces require time (e.g. 52 minutes per image) and of course loops are suboptimal from a performance standpoint. There’s a recent blog post on avoiding workspace loops you may consider. However, if you filter down your workspaces to exclude personal/my workspaces prior to executing the loops, a nightly 50+ minute process such as this example for a large Power BI environment might be acceptable.

What problem are we solving?

When you’re getting started with SQL Agent just viewing the job history and error log files via the graphical interfaces in SSMS might be sufficient. If you want to become more productive and proactive in your analysis, you’ll look for a programmatic solution such as writing T-SQL queries against the SQL Agent tables in the MSDB system database including dbo.sysjobs, dbo.sysjobhistory, dbo.sysjobsteps, and others. Over the years several blog posts have been written containing T-SQL example approaches to querying this data and MS Docs itself provides an example T-SQL query for the job history table.

The Power BI-based solution starts with T-SQL queries but builds on top of this layer to leverage several of Power BI’s rich and integrated features including Power Query functions, DAX expressions, reporting visualizations, and more. The core idea of this solution is to have a data model in place (a Power BI ‘dataset’), including fact to dimension relationships and measures with built-in calculation logic to make it relatively easy to build SQL Agent reports and dashboards and to browse/analyze SQL agent history data.

Since the data model already includes the SQL Agent data stored in the MSDB database and can be refreshed on a schedule, users (typically DBAs and other IT admins) don’t have to write new or modified SQL queries on an ad hoc basis to answer different questions. Within the PBIX file containing the data model itself or a separate file or tool connected to the published dataset users can drag-and-drop and filter to find what they need.

Its outside the scope of this blog post but you could leverage any number of tools to analyze the SQL Agent History Power BI dataset beyond Power BI itself. Given XMLA endpoint connectivity, you could use Excel, Paginated/SSRS reports, and other 3rd Party BI tools just like an Analysis Services database.

Report Pages (Visual Layer)

In describing any Power BI solution I’d prefer to start from the source systems and data and walk through the logic and components of the Power Query Editor and then the Modeling View before saying anything about any reports or dashboards. However, experience suggests that introducing the report/visual layer, at least a representative sample, further addresses the ‘Why does this exist?’ question and will help maintain attention to these two more fundamental layers.

With that being said, the following short slideshow displays the three report pages and highlights two tooltip pages included in this initial version of the template:

The two main report pages of the PBIX file that’s generated from the template are the Job Level and Job Step Level pages. As mentioned earlier, SQL Agent is comprised of ‘jobs’ and ‘jobs’ contain one or many ‘steps’ which will execute on a schedule. Therefore, these two pages help address the most basic questions such as “Do I have any job failures?”, “How long are my jobs taking to complete?”, “Which steps within my jobs are taking the longest to complete or are failing?”, etc.

There’s a report-level filter (impacting all report pages) that’s set for the trailing last 30 days and you can also use the Date slider at the top to modify the date range of the report based on the run date of the job. You can also use the slicer at the top right to filter for one or multiple ‘jobs’ of interest and the slicers are synchronized across the pages such that these selections will be carried over.

The tooltip pages are intended to provide additional context as you hover over a specific agent job or a specific agent job step. The bar charts on the Job Level and Job Step Level pages will render these tooltip pages to give additional details relevant to these specific items (or ‘dimension members’ if you prefer) including a line chart of the duration for the given job or step over time.

Many additional and more visually impressive and analytically powerful reports could be developed. Depending on the extent of SQL Agent usage these additional reports could make sense but for common use cases such as routine monitoring and troubleshooting these reports or slight variations might be sufficient and represent quite an improvement from ad hoc SQL queries).

The Data Model (Sematic Layer)

At the center of the solution are two star schema models – one for job history at the job level and one at the job step level. You might think of this as the header or order level of a sales or purchase order versus the line level of these transactions which includes the individual products.

The following two images from Modeling view expose the fact to dimension (many to one) relationships.

Both the SQL Agent Job Step History and the SQL Agent Job History fact tables have many-to-one relationships with the Date and the SQL Agent Jobs dimension tables. The SQL Agent Job Step History fact table also has an active many-to-one relationship to the SQL Agent Job Steps dimension table.

Each row of the SQL Agent Job History table reflects one instance of a job being executed from start to finish inclusive of all steps. Each row of the SQL Agent Job Step History table reflects the execution of one step of a job. The source table in the MSDB database for both the aggregate/job level fact table and the step-level fact table is dbo.sysjobhistory. Every row of a fact table should have the same meaning and therefore there’s filtering logic in the solution to separate the rows representing the overall job from those representing an individual step of a job.

In addition to the relationships, there are display folders of basic DAX measures in each fact table per the following image:

SQL Agent DAX Measures

As their names indicate, these measures either count the number of jobs or steps that succeeded or failed or apply an aggregation to a duration fact column (in minutes). The presence of the DAX measures and the relationships makes development of the report pages relatively straight forward.

Power Query (M) Data (ETL Layer)

The fact and dimension tables of the data model are the product of a data transformation process executed via T-SQL statements against the Agent tables in the MSDB system database and Power Query (M) expressions. The T-SQL statements contain logic to prep and stage the Agent tables in MSDB such as adjusting date and time columns into data types and structures more suitable for reporting. The results of these SQL queries are further enhanced and modified with Power Query (M) expressions.

Here’s a look at the Power Query Editor:

Power Query Editor

Like normal, I like to use query groups to isolate queries into those that are being loaded to the data model (dimension and fact tables) from those that are not but are required for staging/prepping processes.

One item you might find a bit novel or interesting in this design is that I’m storing T-SQL statements in their own ‘queries’ returning text primitive values – the text value being the T-SQL statement. In this way, the queries which invoke the Sql.Database() function can simply reference these ‘queries’ as the value in the Query parameter. This means that A) the staging queries against the MSDB tables are shorter and cleaner to read and B) the T-SQL statements can be more easily viewed and modified if necessary in the future.

In this version, I filter out common system Agent jobs such as ‘SSIS Server Maintenance Job’ and ‘Sys Policy History Purge’ in order to focus the solution on user/business created jobs. There are filters and queries to split job level and step level data into their own flows, joins to integrate job schedule attributes and to support the relationship between job history at the step level with job steps.

The date table is completely created on the fly via Power Query (M) expressions. For this version, the query creates a date table from the current date (at the time of execution) back 365 days. This might be acceptable for the limited scope of this solution or you might want to replace this date table with an ‘official’ date dimension table if you have one.

Given the length of this post I can’t go into significant depth here on the queries but there are comments included in the M expressions and I think the structure and syntax is simple enough that an intermediate user of Power Query (M) could reason her way through the queries.

Don’t mix native queries and M, normally

If you’ve been working with Power Query (M) for any significant period of time you know that any logic/transform you apply after a native query will not be folded back to the source system. The general guidance is always to either A) accomplish all required logic in the native query or B) accomplish all required logic in a Power Query (M) expression which can be folded/translated into a query for execution by the source system.

Therefore, I want to be very clear that I’m only using both tools in the same solution because the scale of this data is very small. When you’re querying DMV and system table sources which contain maybe hundreds or a few thousand rows, some level of processing executed via the Power Query engine is probably acceptable and even advisable in some cases.

In this case, Power Query functions such as #duration, #time, and Duration.TotalMinutes() are used on top of the SQL statements and in my view this is actually a simpler approach than consolidating all logic into much more extensive T-SQL queries. If you beg to differ, perhaps given relative comfort/skill in T-SQL, you can always re-factor the T-SQL queries and remove the M expressions.

Wrapping Up

If you work with SQL Agent regularly and are at least open to using Power BI I hope you’re able to give this template a try and find some value in it. If you don’t use SQL Agent maybe this blog post and sample solution can give you some idea of leveraging Power BI to build your own monitoring and administrative solutions for the tools/services you are responsible for.

If you’d like to receive email notification of future blog posts you can click ‘Follow’ in the upper right.

10 comments

  1. Hi

    Thank you for your template, I am doing a test on my SQLSERVER but for some reason this query is not returning anything stgSQLAgentJobHistoryJobLevel resulting in having no data in SQL agent Job history.
    I must be missing something.

    Can you assist me please

    Thanks

    Like

    1. Hi

      my issue was related to last step of stgSQLAgentJobHistoryJobLevel as it was looking for “Job outocome” and I am using french language :).

      Table.SelectRows(RenameCols, each ([Step Name] = “(Job outcome)”))

      Thank you again for you great work

      Like

  2. Hi

    Thanks for the awesome template and article.

    I am currently receiving the below issue when trying to refresh tables.

    I have no idea on the solution to this, please can you advise ?

    Expression.Error: We cannot convert the value null to type Number.
    Details:
    Value=
    Type=[Type]

    Like

    1. Hi Andrew,

      Thanks for your note. To troubleshoot the issue, open up the query editor and refresh the preview for the staging queries (stgSQLAgentJobs, stSQLAgentJobSchedules, etc) and see which staging query is producing an error. There are four T-SQL statements stored within their own queries (AgentJobHistoryTSQL, AgentJobsTSQL, etc) – you can copy this code and paste it into SSMS and try to execute the query against the MSDB system database which contains these tables (e.g. dbo.sysjobs, dbo.sysjobschedules, etc). Notice that the T-SQL statements contain a WHERE clause for run_date > 0 (or next_run_date or last_run_date >0).

      You might first run simple SELECT statements against dbo.sysjobhistory dbo.sysjobs, dbo.sysjobschedules, and dbo.sysjobsteps as these are the source tables queried by the template. You can then try the SELECT statements used in the template to see which one produces an error. If you don’t have any data in these source tables or you can’t query the tables you may ensure that you have SQL Agent jobs running on this instance and that you have permissions to query the msdb database. I haven’t had any issues with the template and apparently quite a few readers of this blog have been able to use the template successfully but I suppose it’s possible for your data you may need to tweak something – I’d need more detail/access to know for sure.

      Hope any of this helps – thanks for visiting this blog.

      Like

  3. Hi Brett,

    Thank you again for the template, is it possible to add the possibility to view ongoing jobs maybe on separate page. I have some jobs that take 8-12 hours.
    Is it also possible to keep historical data? on my sql server I am keeping logs for only last 4 weeks but my wish is to keep data from all periods in powerBi.

    Like

    1. Hi Adel,

      You can build a report page which filters on the Run Status column of the SQL Agent Job Step History table to be ‘In Progress’. The SQL Agent Job History (at the job level) is filtered to retrieve the Job Outcome rows only (Step Name = Job Outcome). However, the Step Level table excludes these rows and thus the In Progress rows from dbo.sysjobhistory are allowed to load for reporting/analysis.

      In terms of keeping all historical data I don’t think there’s an easy way to handle this with Power BI alone. You would need a separate process, possibly with a tool like SSIS, to regularly archive this data and then point the template to these archive tables.

      Like

  4. Nice solution you have there, very handy! I have two problems. The first is when you have multipel schedules on one job (many to one error). The second is the same as previous comment (error to convert null to number in unknown column). Regards David

    Like

    1. Thanks David, I’ll consider this to be included in an updated version at some point. I’m currently writing a Second Edition of Mastering Power BI so I don’t have capacity for this updated tool for 2-3 months.

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s