{"id":92618,"date":"2021-11-01T17:00:54","date_gmt":"2021-11-01T17:00:54","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=92618"},"modified":"2021-10-12T21:28:47","modified_gmt":"2021-10-12T21:28:47","slug":"data-factory-use-a-sql-query-to-create-a-data-source","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/blogs\/data-factory-use-a-sql-query-to-create-a-data-source\/","title":{"rendered":"Data Factory: Use a SQL Query to create a Data Source"},"content":{"rendered":"<p>When we include a data source inside a data flow it always requests a dataset. The dataset, in turn, requires you to point to a table. A dataset can&#8217;t be defined over a query, it needs an object on the linked service, which may be a table or view as you may notice on the image below.<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-92624\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/10\/DataFactorySQL06.png\" alt=\"\" width=\"1027\" height=\"440\" \/><\/p>\n<p>&nbsp;<\/p>\n<h4><em><strong>How can we work around this and use a query as a source for a data flow?<\/strong><\/em><\/h4>\n<p>The secret is on the source object we use on the data flow. We need to select a dataset, as always. However, on the 2nd tab, <em>Source Options<\/em>, we can choose the input type as <em>Query<\/em> and define a <strong>SQL<\/strong> query. The source will ignore the table configuration in the dataset and get the data from the query.<\/p>\n<p>This is how the <em>1st tab<\/em> will look like when we select the dataset:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-92619\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/10\/datafactorySQL01.png\" alt=\"\" width=\"1035\" height=\"452\" \/><\/p>\n<p>This is how the<em> 2nd tab<\/em> looks like with the query defined:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-92621\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/10\/DataFactorySQL03.png\" alt=\"\" width=\"764\" height=\"289\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>After defining the query, we can click the button <em>Import Projection<\/em>. <strong>Data Factory<\/strong> will need to initialize the <strong>Integration Runtime<\/strong>, so it can execute the import of the schema. Once the <strong>Integration Runtime<\/strong> is initialized, the <em>Import Projection<\/em> can proceed. Usually you will need to click the button again.<\/p>\n<p>On the <em>Projection<\/em> tab we will not see anything related to the table at all, only the query results will be there.<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-92620\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/10\/dataFactorySQL02.png\" alt=\"\" width=\"782\" height=\"618\" \/><\/p>\n<p>The schema drift properties on this scenario are still optional. They will act completely over the query. We import the projection schema from the query. The dataflow will accept additional fields beyond the ones defined during development if the schema drift property is enabled, as it usually does.<\/p>\n<h2>The Dataflow Code<\/h2>\n<p>We can say in some ways the data factory data flows have two different languages: <strong>The Data Flow Script<\/strong> (<strong>DFS<\/strong>) and the json syntax. The two buttons on the top right of the <strong>Data Factory<\/strong> screen allow us to see the code.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-92625\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/10\/DataFactorySQL07.png\" alt=\"\" width=\"190\" height=\"53\" \/><\/p>\n<p>The <strong>DFS<\/strong> from this script makes no reference to the dataset at all:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-92622\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/10\/DataFactorySQL04.png\" alt=\"\" width=\"541\" height=\"570\" \/><\/p>\n<p>\n<strong>Data Factory<\/strong> converts the <strong>DFS<\/strong>\u00a0to a single script line in the <strong>JSON<\/strong> file. The <strong>JSON<\/strong> file requires a source dataset specified, but many dataset definitions, such as the table, will be ignored. The resulting <strong>JSON<\/strong> will be like the image below.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-92623\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/10\/DataFactorySQL05.png\" alt=\"\" width=\"1280\" height=\"408\" \/><\/p>\n<p>The dataset, on this example, establishes a link between the data flow and the linked service, but only that.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When we include a data source inside a data flow it always requests a dataset. The dataset, in turn, requires you to point to a table. A dataset can&#8217;t be defined over a query, it needs an object on the linked service, which may be a table or view as you may notice on the&#8230;&hellip;<\/p>\n","protected":false},"author":50808,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[2],"tags":[136322,145486,4472,145487,4824],"coauthors":[6810],"class_list":["post-92618","post","type-post","status-publish","format-standard","hentry","category-blogs","tag-azure-sql","tag-data-factory","tag-dataflow","tag-dfs","tag-etl"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/92618","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/50808"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=92618"}],"version-history":[{"count":1,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/92618\/revisions"}],"predecessor-version":[{"id":92626,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/92618\/revisions\/92626"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=92618"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=92618"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=92618"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=92618"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}