Data Pipelines can orchestrate many activities, creating a flow for data ingestion. One of these activities is the notebook execution activity.
However, every time a data pipeline executes a notebook, it creates a completely new Spark session and Spark pool.
This makes the Data Pipeline very slow and expensive.
How Bad It Can Be
Imagine a pipeline that runs a notebook inside a loop, executing the notebook many times.
Each execution spins up a completely new Spark pool. This is expensive.
Besides being expensive, the default configurations for a Spark session and a capacity will not support these executions running in parallel. You will need to limit the number of parallel notebook executions using the ForEach activity, like in the image below:
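In the pipeline's JSON definition, this limit corresponds to the ForEach activity's `batchCount` property (with `isSequential` set to false). A minimal sketch, with the item list and inner notebook activity left as placeholders:

```json
{
  "name": "RunNotebooks",
  "type": "ForEach",
  "typeProperties": {
    "isSequential": false,
    "batchCount": 4,
    "items": { "value": "@pipeline().parameters.tables", "type": "Expression" },
    "activities": [ ]
  }
}
```

With `batchCount: 4`, at most four notebook executions run at the same time, keeping the pipeline within the session and capacity limits.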
High Concurrency to the Rescue
The solution is to enable High Concurrency for Data Pipelines running notebooks. This can be done in two steps:
- Enable this configuration in the workspace settings
- Configure the session tag in the notebook activity
In the workspace settings, you will find this option under Spark Settings, like in the image below:
After that, the Session Tag configuration defines which notebook activities will use this feature. You can create groups of notebook activities, with each group running in a different session. Any string can be used as the Session Tag.
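In the notebook activity's JSON definition, the tag appears as a property of the activity. A hypothetical sketch (the exact property names may differ in your exported pipeline definition; check it to confirm):

```json
{
  "name": "IngestNotebook",
  "type": "TridentNotebook",
  "typeProperties": {
    "notebookId": "<notebook-id>",
    "workspaceId": "<workspace-id>",
    "sessionTag": "ingestion-group"
  }
}
```

Every notebook activity configured with the same tag, such as "ingestion-group" here, shares one high-concurrency session instead of starting its own Spark pool.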
The High Concurrency Results
The image below shows a comparison between the execution without high concurrency and with high concurrency.
The execution time dropped from almost 13 minutes to less than 3.
References
Fabric Monday 55: Pipelines High Concurrency to Save Your Time and Money
Summary
If you plan to orchestrate notebooks using Data Pipelines, the High Concurrency configuration is essential for you.