{"id":106535,"date":"2025-05-02T02:34:23","date_gmt":"2025-05-02T02:34:23","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=106535"},"modified":"2025-05-02T02:34:24","modified_gmt":"2025-05-02T02:34:24","slug":"query-lineage-in-fabric-lakehouses","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/blogs\/query-lineage-in-fabric-lakehouses\/","title":{"rendered":"Query Lineage in Fabric Lakehouses"},"content":{"rendered":"\n<p>I already wrote about <a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/query-performance-analysis-in-lakehouses\/\">query performance analysis in lakehouses<\/a>, but one secret was left behind: The capability to track the queries back to a semantic model, report and visual in a report, identifying the query lineage.<\/p>\n\n\n\n<p>If you check the text of the queries, at the end of the text you will find content like this:<\/p>\n\n\n\n<p><div><span style=\"font-family: Courier New;font-size: 10pt\">\nOPTION&nbsp;<font color=\"maroon\">(<\/font><font color=\"maroon\">label<\/font>&nbsp;<font color=\"silver\">=<\/font>&nbsp;<font color=\"red\">N&#8221;<\/font><font color=\"maroon\">{&#8220;DatasetId&#8221;:&#8221;1269551b-bf26-47de-b0f0-974fa60f7b08&#8243;,&#8221;Sources&#8221;:[{&#8220;ReportId&#8221;:&#8221;01ab9208-399a-47ec-b444-d03633fc3e1d&#8221;,&#8221;VisualId&#8221;:&#8221;30ac676503a0bd357312&#8243;,&#8221;Operation&#8221;:&#8221;AutoPageRefresh&#8221;}<\/font><font color=\"silver\">]<\/font><font color=\"blue\">}<\/font><font color=\"red\">&#8221;<\/font><font color=\"maroon\">)<\/font>\n<\/span><\/div><\/p>\n\n\n\n<p>This has an interesting meaning:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> We can use this information to track the query lineage <\/li>\n\n\n\n<li>Applications can send lineage (or more) to <strong>SQL<\/strong> using <strong>OPTION (LABEL)<\/strong> statement <\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-applications-tracking-lineage\">Applications Tracking Lineage<\/h2>\n\n\n\n<p>If you develop applications consuming data from a lakehouse SQL Endpoint, they can track the query lineage using the <strong>OPTION(LABEL)<\/strong> in the SQL Statement.<\/p>\n\n\n\n<p>This OPTION is not only accepted by the lakehouse, but the lakehouse will process this option, extract the JSON in this option and include it in the field &#8220;label&#8221; in the &#8220;queryinsights&#8221; view.<\/p>\n\n\n\n<p>The JSON doesn&#8217;t need to have this exact format: it can contain anything your application need and when using custom applications, you can expand the tracking way beyond the query lineage.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-retrieving-the-lineage-from-queryinsights\">Retrieving The Lineage from QueryInsights<\/h2>\n\n\n\n<p>The lineage information received with the query is stored in a field called &#8220;label&#8221;, in JSON format.<\/p>\n\n\n\n<p>We need to use JSON expressions to extract the 4 values from the field: DatasetId, ReportId, VisualId and Operation. Once we confirm the label field contains a JSON value, we can extract the value from the JSON.<\/p>\n\n\n\n<p>This is an example of the DatasetId value:<\/p>\n\n\n\n<p><div><span style=\"font-family: Courier New;font-size: 10pt\"><br><font color=\"blue\">CASE<\/font><br><font color=\"blue\">WHEN<\/font>\u00a0<font color=\"maroon\">ISJSON(label)<\/font>\u00a0<font color=\"silver\">=<\/font>\u00a0<font color=\"black\">1<\/font>\u00a0<font color=\"blue\">THEN<\/font>\u00a0<font color=\"maroon\">JSON_VALUE(label,<\/font>\u00a0<font color=\"red\">&#8216;$.DatasetId&#8217;)<\/font><br><font color=\"blue\">ELSE<\/font>\u00a0<font color=\"blue\">NULL<\/font><br><font color=\"blue\">END<\/font>\u00a0<font color=\"blue\">AS<\/font>\u00a0<font color=\"maroon\">DatasetId<\/font><br><\/span><\/div><\/p>\n\n\n\n<p>The other ones are slightly different because they are in a different position of the JSON:<\/p>\n\n\n\n<p><div><span style=\"font-family: Courier New;font-size: 10pt\"><br><font color=\"blue\">CASE<\/font><br><font color=\"blue\">WHEN<\/font>\u00a0<font color=\"maroon\">ISJSON(label)<\/font>\u00a0<font color=\"silver\">=<\/font>\u00a0<font color=\"black\">1<\/font>\u00a0<font color=\"blue\">THEN<\/font>\u00a0<font color=\"maroon\">JSON_VALUE(label,<\/font>\u00a0<font color=\"red\">&#8216;$.Sources[0].ReportId&#8217;)<\/font><br><font color=\"blue\">ELSE<\/font>\u00a0<font color=\"blue\">NULL<\/font><br><font color=\"blue\">END<\/font>\u00a0<font color=\"blue\">AS<\/font>\u00a0<font color=\"maroon\">ReportId<\/font><br><\/span><\/div><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-the-complete-query-and-execution\">The complete query and execution<\/h2>\n\n\n\n<p>Let&#8217;s consider the query below to retrieve this information:<\/p>\n\n\n\n<p><div><span style=\"font-family: Courier New;font-size: 10pt\"><br><span style=\"color: blue\">SELECT<\/span>\u00a0<span style=\"color: blue\">TOP<\/span>\u00a0<span style=\"color: black\">20<\/span>\u00a0<span style=\"color: maroon\">distributed_statement_id<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">program_name<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">data_scanned_disk_mb<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">data_scanned_memory_mb<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">data_scanned_remote_storage_mb<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: fuchsia;font-style: italic\">Replace<\/span><span style=\"color: maroon\">(<\/span><span style=\"color: fuchsia;font-style: italic\">Replace<\/span><span style=\"color: maroon\">(command<\/span><span style=\"color: silver\">,<\/span>\u00a0<span style=\"color: fuchsia;font-style: italic\">Char<\/span><span style=\"color: maroon\">(<\/span><span style=\"color: black\">13<\/span><span style=\"color: maroon\">)<\/span><span style=\"color: silver\">,<\/span>\u00a0<span style=\"color: red\">&#8221;<\/span><span style=\"color: maroon\">)<\/span><span style=\"color: silver\">,<\/span>\u00a0<span style=\"color: fuchsia;font-style: italic\">Char<\/span><span style=\"color: maroon\">(<\/span><span style=\"color: black\">10<\/span><span style=\"color: maroon\">)<\/span><span style=\"color: silver\">,<\/span>\u00a0<span style=\"color: red\">&#8221;<\/span><span style=\"color: maroon\">)<\/span>\u00a0<span style=\"color: blue\">AS<\/span>\u00a0<span style=\"color: maroon\">command<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">total_elapsed_time_ms<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">start_time<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">end_time<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">allocated_cpu_time_ms<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">status<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">row_count<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">CASE<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">WHEN<\/span>\u00a0<span style=\"color: #FF0080;font-weight: bold\">Isjson<\/span><span style=\"color: maroon\">(label)<\/span>\u00a0<span style=\"color: silver\">=<\/span>\u00a0<span style=\"color: black\">1<\/span>\u00a0<span style=\"color: blue\">THEN<\/span>\u00a0<span style=\"color: #FF0080;font-weight: bold\">Json_value<\/span><span style=\"color: maroon\">(label<\/span><span style=\"color: silver\">,<\/span>\u00a0<span style=\"color: red\">&#8216;$.DatasetId&#8217;<\/span><span style=\"color: maroon\">)<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">ELSE<\/span>\u00a0<span style=\"color: blue\">NULL<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">END<\/span>  <span style=\"color: blue\">AS<\/span>\u00a0<span style=\"color: maroon\">DatasetId<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">CASE<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">WHEN<\/span>\u00a0<span style=\"color: #FF0080;font-weight: bold\">Isjson<\/span><span style=\"color: maroon\">(label)<\/span>\u00a0<span style=\"color: silver\">=<\/span>\u00a0<span style=\"color: black\">1<\/span>\u00a0<span style=\"color: blue\">THEN<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: #FF0080;font-weight: bold\">Json_value<\/span><span style=\"color: maroon\">(label<\/span><span style=\"color: silver\">,<\/span>\u00a0<span style=\"color: red\">&#8216;$.Sources[0].ReportId&#8217;<\/span><span style=\"color: maroon\">)<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">ELSE<\/span>\u00a0<span style=\"color: blue\">NULL<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">END<\/span> <span style=\"color: blue\">AS<\/span>\u00a0<span style=\"color: maroon\">ReportId<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">CASE<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">WHEN<\/span>\u00a0<span style=\"color: #FF0080;font-weight: bold\">Isjson<\/span><span style=\"color: maroon\">(label)<\/span>\u00a0<span style=\"color: silver\">=<\/span>\u00a0<span style=\"color: black\">1<\/span>\u00a0<span style=\"color: blue\">THEN<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: #FF0080;font-weight: bold\">Json_value<\/span><span style=\"color: maroon\">(label<\/span><span style=\"color: silver\">,<\/span>\u00a0<span style=\"color: red\">&#8216;$.Sources[0].VisualId&#8217;<\/span><span style=\"color: maroon\">)<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">ELSE<\/span>\u00a0<span style=\"color: blue\">NULL<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">END<\/span> <span style=\"color: blue\">AS<\/span>\u00a0<span style=\"color: maroon\">VisualId<\/span><span style=\"color: silver\">,<\/span>,<br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">CASE<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">WHEN<\/span>\u00a0<span style=\"color: #FF0080;font-weight: bold\">Isjson<\/span><span style=\"color: maroon\">(label)<\/span>\u00a0<span style=\"color: silver\">=<\/span>\u00a0<span style=\"color: black\">1<\/span>\u00a0<span style=\"color: blue\">THEN<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: #FF0080;font-weight: bold\">Json_value<\/span><span style=\"color: maroon\">(label<\/span><span style=\"color: silver\">,<\/span>\u00a0<span style=\"color: red\">&#8216;$.Sources[0].Operation&#8217;<\/span><span style=\"color: maroon\">)<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">ELSE<\/span>\u00a0<span style=\"color: blue\">NULL<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">END<\/span> <span style=\"color: blue\">AS<\/span>\u00a0<span style=\"color: maroon\">Operation<\/span>,<br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: maroon\">label<\/span><br><span style=\"color: blue\">FROM<\/span>\u00a0\u00a0\u00a0<span style=\"color: maroon\">queryinsights<\/span><span style=\"color: silver\">.<\/span><span style=\"color: maroon\">exec_requests_history<\/span><br><span style=\"color: blue\">WHERE<\/span>\u00a0\u00a0<span style=\"color: maroon\">program_name<\/span>\u00a0<span style=\"color: blue\">IN<\/span>\u00a0<span style=\"color: maroon\">(<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: red\">&#8216;Core\u00a0.Net\u00a0SqlClient\u00a0Data\u00a0Provider&#8217;<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: red\">&#8216;.Net\u00a0SqlClient\u00a0Data\u00a0Provider&#8217;<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: red\">&#8216;Framework\u00a0Microsoft\u00a0SqlClient\u00a0Data\u00a0Provider&#8217;<\/span><span style=\"color: silver\">,<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: red\">&#8216;PowerBIPremium-DirectQuery&#8217;<\/span>\u00a0<span style=\"color: maroon\">)<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">AND<\/span>\u00a0<span style=\"color: maroon\">start_time<\/span>\u00a0<span style=\"color: silver\">&gt;<\/span>\u00a0<span style=\"color: red\">&#8216;2024-12-02&#8217;<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">AND<\/span>\u00a0<span style=\"color: maroon\">command<\/span>\u00a0<span style=\"color: blue\">NOT<\/span>\u00a0<span style=\"color: blue\">LIKE<\/span>\u00a0<span style=\"color: red\">&#8216;%sys.sp_set_session_context%&#8217;<\/span><br>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<span style=\"color: blue\">AND<\/span>\u00a0<span style=\"color: maroon\">status<\/span>\u00a0<span style=\"color: silver\">=<\/span>\u00a0<span style=\"color: red\">&#8216;Succeeded&#8217;<\/span><br><span style=\"color: blue\">ORDER<\/span>\u00a0\u00a0<span style=\"color: blue\">BY<\/span>\u00a0<span style=\"color: maroon\">total_elapsed_time_ms<\/span>\u00a0<span style=\"color: blue\">DESC<\/span>\u00a0<br><\/span><\/div><\/p>\n\n\n\n<p>Our focus is the lineage, but the query above has some interesting details about analysing the performance of a lakehouse:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> It filters only the queries with some specific program names. These are program names from client applications, including power bi. You can decide to filter for a specific set of them. <\/li>\n\n\n\n<li>It filters out queries containing <em>&#8216;sys.sp_set_session_context&#8217;<\/em> . It&#8217;s a query generated by client libraries when starting a session for a user. <\/li>\n\n\n\n<li>The query only retrieves &#8220;Succeeded&#8221; results. You can decide about retrieving the opposite for a different analysis or not filtering by status at all. <\/li>\n\n\n\n<li>The query is using a TOP 20 ordering by <em>total_elapsed_time_ms<\/em> in descending order. This results in a list of the worst queries according to the total elapsed time. You can create different queries using the same method with other metrics available in these tables. <\/li>\n\n\n\n<li>It&#8217;s filtering by <em>start_time<\/em>. In this way, it limits the queries and avoids retrieving old ones. <\/li>\n\n\n\n<li> It&#8217;s removing line breaks from the query text. <\/li>\n<\/ul>\n<\/div>\n\n\n<p>This query needs to be executed at an SQL endpoint. One of the possible methods to automate it is to use notebooks. I wrote about <a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/fabric-query-a-sql-endpoint-from-a-notebook\/\">how to run SQL Endpoint queries in notebooks<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-translating-the-object-names\">Translating the Object Names<\/h2>\n\n\n\n<p>We can use sempy library to translate the name of semantic models and reports. There are different methods to achieve this, but with some limitations<\/p>\n\n\n\n<p>This kind of translation is always based on the workspace. This means we can&#8217;t only ask to translate an Id, we need to know in which workspace the object is.<\/p>\n\n\n\n<p>If we try to list all possible workspaces, we will get as a result only the workspaces we have access to. There is no single API call to translate the object id wherever the object is.<\/p>\n\n\n\n<p>The <strong>sempy<\/strong> library has a method called <strong>resolve_item_name<\/strong> . One of the parameters is the workspace id. In this way, this method can only be used if you know where to look.<\/p>\n\n\n\n<p>The problem of the <strong>resolve_item_name<\/strong> is the need to execute it once for each id to be resolved. We may have too many Ids to resolve, resulting in too many API calls.<\/p>\n\n\n\n<p>A better option will be to use the method <strong>list_items<\/strong> to retrieve all the items from a workspace. We can transform the list into a dataframe and translate the names in the query result. <strong>list_items<\/strong> can receive a collection of workspace ids, allowing the search to be in multiple workspaces.<\/p>\n\n\n\n<p>According to your needs, you may have a list of workspaces where you should search, or you may need to search in all the available ones. You can create a fixed list with some workspace names or use the method <strong>list_workspaces<\/strong> to retrieve a listof all workspaces you have permission to.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-pyspark-code-example\">PySpark Code Example<\/h2>\n\n\n\n<p>The code below is one example, among many possibilities, to translate the Ids inside a dataframe called query_results<\/p>\n\n\n\n<p><div style=\"background: #ffffff;overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em\"><pre style=\"margin: 0;line-height: 125%\" class=\"crayon:false\"><span style=\"color: #008800;font-weight: bold\">import<\/span> <span style=\"color: #0e84b5;font-weight: bold\">sempy.fabric<\/span> <span style=\"color: #008800;font-weight: bold\">as<\/span> <span style=\"color: #0e84b5;font-weight: bold\">fabric<\/span><br><span style=\"color: #008800;font-weight: bold\">import<\/span> <span style=\"color: #0e84b5;font-weight: bold\">pandas<\/span> <span style=\"color: #008800;font-weight: bold\">as<\/span> <span style=\"color: #0e84b5;font-weight: bold\">pd<\/span><br><br><span style=\"color: #888888\"># Authenticate and get workspaces<\/span><br>workspaces <span style=\"color: #333333\">=<\/span> fabric<span style=\"color: #333333\">.<\/span>list_workspaces()<span style=\"color: #333333\">.<\/span>query(<span style=\"background-color: #fff0f0\">'`Is On Dedicated Capacity` == True'<\/span>)<span style=\"color: #333333\">.<\/span>Id<br><br><span style=\"color: #888888\"># Get both datasets and reports in a single call<\/span><br>datasetItemsList <span style=\"color: #333333\">=<\/span> [fabric<span style=\"color: #333333\">.<\/span>list_items(<span style=\"background-color: #fff0f0\">\"SemanticModel\"<\/span>,wsId) <span style=\"color: #008800;font-weight: bold\">for<\/span> wsId <span style=\"color: #000000;font-weight: bold\">in<\/span> workspaces]<br>datasetItems <span style=\"color: #333333\">=<\/span> pd<span style=\"color: #333333\">.<\/span>concat(datasetItemsList, ignore_index<span style=\"color: #333333\">=<\/span><span style=\"color: #007020\">True<\/span>)<br>reportItemsList <span style=\"color: #333333\">=<\/span> [fabric<span style=\"color: #333333\">.<\/span>list_items(<span style=\"background-color: #fff0f0\">\"Report\"<\/span>,wsId) <span style=\"color: #008800;font-weight: bold\">for<\/span> wsId <span style=\"color: #000000;font-weight: bold\">in<\/span> workspaces]<br>reportItems<span style=\"color: #333333\">=<\/span>pd<span style=\"color: #333333\">.<\/span>concat(reportItemsList, ignore_index<span style=\"color: #333333\">=<\/span><span style=\"color: #007020\">True<\/span>)<br>dataset_mapping<span style=\"color: #333333\">=<\/span>{row[<span style=\"background-color: #fff0f0\">\"Id\"<\/span>]: row[<span style=\"background-color: #fff0f0\">\"Display Name\"<\/span>] <span style=\"color: #008800;font-weight: bold\">for<\/span> x,row <span style=\"color: #000000;font-weight: bold\">in<\/span> datasetItems<span style=\"color: #333333\">.<\/span>iterrows()}<br>report_mapping <span style=\"color: #333333\">=<\/span> {row[<span style=\"background-color: #fff0f0\">\"Id\"<\/span>]: row[<span style=\"background-color: #fff0f0\">\"Display Name\"<\/span>] <span style=\"color: #008800;font-weight: bold\">for<\/span> x , row <span style=\"color: #000000;font-weight: bold\">in<\/span> reportItems<span style=\"color: #333333\">.<\/span>iterrows()}<br><br><span style=\"color: #888888\"># Ensure query_results is treated as a list of dictionaries<\/span><br>rows <span style=\"color: #333333\">=<\/span> query_results<span style=\"color: #333333\">.<\/span>select(<span style=\"background-color: #fff0f0\">'*'<\/span>)<span style=\"color: #333333\">.<\/span>collect()<br><br><span style=\"color: #888888\"># Convert Row objects to dictionaries<\/span><br>query_results_list <span style=\"color: #333333\">=<\/span> [row<span style=\"color: #333333\">.<\/span>asDict() <span style=\"color: #008800;font-weight: bold\">for<\/span> row <span style=\"color: #000000;font-weight: bold\">in<\/span> rows]<br><br><span style=\"color: #888888\"># Translate DatasetId and ReportId to their corresponding names<\/span><br><span style=\"color: #008800;font-weight: bold\">for<\/span> row <span style=\"color: #000000;font-weight: bold\">in<\/span> query_results_list:<br>    <span style=\"color: #008800;font-weight: bold\">if<\/span> row[<span style=\"background-color: #fff0f0\">\"DatasetId\"<\/span>] <span style=\"color: #000000;font-weight: bold\">in<\/span> dataset_mapping:<br>       row[<span style=\"background-color: #fff0f0\">\"DatasetName\"<\/span>] <span style=\"color: #333333\">=<\/span> dataset_mapping[row[<span style=\"background-color: #fff0f0\">\"DatasetId\"<\/span>]] <span style=\"color: #888888\"># Replace ID with Name<\/span><br>    <span style=\"color: #008800;font-weight: bold\">if<\/span> row[<span style=\"background-color: #fff0f0\">\"ReportId\"<\/span>] <span style=\"color: #000000;font-weight: bold\">in<\/span> report_mapping:<br>       row[<span style=\"background-color: #fff0f0\">\"ReportName\"<\/span>] <span style=\"color: #333333\">=<\/span> report_mapping[row[<span style=\"background-color: #fff0f0\">\"ReportId\"<\/span>]] <span style=\"color: #888888\"># Replace ID with Name<\/span><br><br><span style=\"color: #888888\"># Convert back to Fabric DataFrame<\/span><br>query_results <span style=\"color: #333333\">=<\/span> spark<span style=\"color: #333333\">.<\/span>createDataFrame(query_results_list)<br><\/pre><\/div><\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li> The list of workspaces is generated, and some shortcuts are used to convert it to a list of Ids to be used in list_items <\/li>\n\n\n\n<li> We list semantic models and reports. The list_items doesn&#8217;t filter by multiple types, or we list all types, or we execute multiple times. If we decide to list all types, Sandeep Pawar described <a href=\"https:\/\/fabric.guru\/one-line-code-to-get-a-list-of-items-from-all-the-fabricpremium-workspaces\" target=\"_self\" rel=\"noopener\"><strong>how to list everything in a single line of code<\/strong><\/a> <\/li>\n\n\n\n<li> Two dictionaries are built to be used as shortcut when translating the Ids <\/li>\n\n\n\n<li> The dataframe is converted to dictionary <\/li>\n\n\n\n<li> All the translation happens in the dictionaries <\/li>\n\n\n\n<li> The result is converted to dataframe <\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-conclusion\">Conclusion<\/h2>\n\n\n\n<p>Using this information it&#8217;s possible to create a report to analyse the performance of a lakehouse, highlighting the most expensive queries from each semantic model, the most expensive semantic models and identifying where the optimization focus needs to be.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I already wrote about query performance analysis in lakehouses, but one secret was left behind: The capability to track the queries back to a semantic model, report and visual in a report, identifying the query lineage. If you check the text of the queries, at the end of the text you will find content like&#8230;&hellip;<\/p>\n","protected":false},"author":50808,"featured_media":106556,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[2,159164],"tags":[158998,158997,159035,159315],"coauthors":[6810],"class_list":["post-106535","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blogs","category-microsoft-fabric","tag-lakehouse","tag-microsoft-fabric","tag-notebook","tag-query-lineage"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/106535","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/50808"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=106535"}],"version-history":[{"count":19,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/106535\/revisions"}],"predecessor-version":[{"id":106555,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/106535\/revisions\/106555"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/106556"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=106535"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=106535"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=106535"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=106535"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}