{"id":91956,"date":"2021-07-21T20:05:47","date_gmt":"2021-07-21T20:05:47","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=91956"},"modified":"2021-07-21T20:07:15","modified_gmt":"2021-07-21T20:07:15","slug":"query-private-blob-storage-sql","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/cloud\/azure\/query-private-blob-storage-sql\/","title":{"rendered":"How to query private blob storage with SQL and Azure Synapse"},"content":{"rendered":"<p><strong>The series so far:<\/strong><\/p>\n<ol>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/cloud\/azure\/query-blob-storage-sql-using-azure-synapse\/\">How to query blob storage with SQL using Azure Synapse<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/cloud\/azure\/query-private-blob-storage-sql\/\">How to query private blob storage with SQL and Azure Synapse<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/cloud\/azure\/performance-of-querying-blob-storage-with-sql\/\">Performance of querying blob storage with SQL<\/a>\u00a0<\/li>\n<\/ol>\n\n<p>The queries from the previous article were made against the public container in the blob storage. However, if the container is private, you will need to authenticate with the container. In this article, you&#8217;ll learn how to query private blob storage with SQL.<\/p>\n<p>NOTE: Be sure that the Azure Synapse Workspace and the storage account with the sample files are set up before following along with this article. You will also need to replace your storage account URL each time that a storage account URL is used in the article.<\/p>\n<p>There are three possible authentication methods, and these methods may have some variation according to the type of storage account and the access configuration. I will not dig into details about storage here and leave that for a future article.<\/p>\n<p>The three authentication methods are:<\/p>\n<p><strong>Key:<\/strong> The storage account has two keys. If you provide one of the keys during the connection, you have access to the storage account.<\/p>\n<p><strong>Shared Access Key (SAS):<\/strong> A SAS key is used in a similar way to the key; however, it has additional benefits, such as:<\/p>\n<ul>\n<li>It has an expiration date<\/li>\n<li>It has limits of what permissions the user can have<\/li>\n<\/ul>\n<p><strong>Azure AD:<\/strong> You can make the authentication using Azure Active Directory.<\/p>\n<p>These authentication methods are supported in general by storage accounts. However, <strong>SQL Server On Demand<\/strong> has limitations about what authentication methods it supports for the storage: <strong>SAS<\/strong> and <strong>Azure AD<\/strong> using <strong>Pass Through<\/strong>.<\/p>\n<h2>Authenticating with SAS key<\/h2>\n<p>The first method to try is with the SAS key. Follow along with key management next.<\/p>\n<h3>Understanding key management<\/h3>\n<p>First, you must understand how to generate the <em>SAS<\/em> key. Each storage account has two main keys used to control access to the storage. Using these keys is one of the methods of access, also called <strong>SAK<\/strong>.<\/p>\n<p>The accounts have two keys to allow creating a management policy over them. At any moment, you can revoke one of the keys, and every client application using that key directly or indirectly will lose access.<\/p>\n<p>On the left side panel of the storage account, under settings, you can see the <em>Access Keys<\/em> option. On this option, you find the two main keys, and you can revoke them, generate new keys and cancel the access of all applications depending on these keys, directly or indirectly.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91957\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-27.png\" alt=\"Image showing access keys\" width=\"867\" height=\"725\" \/><\/p>\n<p>You may also notice the <em>Shared Access Signature<\/em> option on the left. That\u2019s where you can generate <em>SAS keys<\/em>. The <em>SAS keys<\/em> are encrypted by one of the main storage account keys. If you regenerate the main <em>SAK<\/em> used for the encryption, all the <em>SAS<\/em> keys encrypted by it will become invalid.<\/p>\n<h3>Good Practice<\/h3>\n<p>One good practice is always using the secondary <em>SAK<\/em> to generate the <em>SAS<\/em> keys because you will be able to regenerate the secondary <em>SAK<\/em> to invalidate all SAS keys without stopping critical applications, making access directly with the primary <em>SAK<\/em>.<\/p>\n<p>You also can break down your applications by how critical they are and generate their <em>SAS<\/em> keys based on the primary or secondary <em>SAK<\/em>.<\/p>\n<p>However, you only have two <em>SAK<\/em>s available. Regenerating one of them will drop the connection from many applications. The best option is still relying on the <em>SAS<\/em> expiration date.<\/p>\n<h3>Generating and managing the SAS key<\/h3>\n<p>The basic and most important detail you need to know about the keys is: They are not managed by Azure. This means that once you generate a <em>SAS<\/em> key, you can\u2019t cancel the key directly. The only way to cancel the <em>SAS<\/em> key is regenerating one of the <em>SAK<\/em>s, which then creates a cascading effect on other keys, as explained before.<\/p>\n<p>As a result, the rule is simple: Take care of the <em>SAS<\/em> keys you generate.<\/p>\n<p>Once you click on the Shared Access Signature page, you will need to fill in the following details to build your SAS key:<\/p>\n<ul>\n<li><strong>Allowed Services<\/strong>. In this example, you only need the blob service.<\/li>\n<li><strong>Allowed Resource Type<\/strong>. Here you specify what API level you would like. <em>Service Management<\/em>, <em>Containers Management<\/em> or <em>Objects<\/em>. You need objects, of course. However, the List Blobs API is related to the Containers resource type. For this example, you don\u2019t need to enable containers management and creation. However, you need the user of the key to be able to list the blobs in the container, so you also need to include the Containers resource type<\/li>\n<li><strong>Allowed Permissions<\/strong>. You will need the following permissions: Read, Write, List, Add, Create<\/li>\n<li><strong>Blob Versioning Permissions<\/strong>: This is not needed for the example.<\/li>\n<li><strong>Start\/Expire date\/time.<\/strong> It\u2019s up to you to decide how your Azure key policy will be. Make sure that the key will not expire while following the example.<\/li>\n<li><strong>Allowed IP Addresses<\/strong>. You can limit the IP addresses that can use this key. If you are using a server application or service to access the blob storage, you can create IP restrictions to make the key more secure. If the service is on Azure, I would build a private network first. In this example, you are running the queries from the Synapse Workbench, so leave the field blank.<\/li>\n<li><strong>Allowed Protocols<\/strong>. By default, it only allows HTTPS access; this is good for security.<\/li>\n<li><strong>Preferred Routing Tier<\/strong>: This one is a bit more complex. Any option is ok for the example; the choice depends on the rules of your environment. See the section below for more information about this functionality.<\/li>\n<li><strong>Signature Key<\/strong>. You can choose any key; it\u2019s not important for the result. This gives you the freedom to plan your storage key policy in the way you prefer.<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91958\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-28.png\" alt=\"Image showing properties when creating the key\" width=\"898\" height=\"792\" \/><\/p>\n<table>\n<tbody>\n<tr bgcolor=\"#B4C6E7\">\n<td>\n<p><strong>Preferred Routing Tier<\/strong><\/p>\n<p>Microsoft has a worldwide network linked to Azure. When you try to access something hosted inside Azure, there are two access possibilities. These possibilities are related to the concept of POP \u2013 Point of Preference.<\/p>\n<p>The POP is the network access where your packages will get into Microsoft Network. It can be the closest POP to the client trying to access or the closest POP to the storage location.<\/p>\n<p><strong>POP Close to the client<\/strong>: The package will get into Microsoft Network as soon as possible, which means it will be faster and safer, but it will increase your networking costs.<\/p>\n<p><strong>POP Close to the Storage<\/strong>: The package will move through the internet and only get into Microsoft Network at the last moment possible. This will reduce the networking costs, reduce the security, and the performance may vary a lot.<\/p>\n<p>In order to use these options, you need first to enable them on the <strong>Network tab<\/strong>. By doing so, Azure creates different URL\u2019s to the storage account, one for the <strong>Microsoft Network routing<\/strong> and another for <strong>Internet Routing<\/strong>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91959\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-29.png\" alt=\"Image showing networking\" width=\"886\" height=\"594\" \/><\/p>\n<p>When you choose to enable the <strong>SAS<\/strong> key for one of these routing options, it\u2019s not only a matter of the key, but also the URL. The key will only work with the correct URL<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>After clicking the Generate button, the resulting SAS key is a set for querystring parameters to be used in a URL, as part of the querystring. You may notice the following details:<\/p>\n<ul>\n<li>Each <em>SAS<\/em> parameter is a pair of <em>\u2018key=value\u2019<\/em> data in the querystring<\/li>\n<li>The querystring is already URL encoded, so you can include it in a URL<\/li>\n<li>The querystring is provided with a starting <em>\u2018?\u2019<\/em>, which is exactly what may be needed to include it in a URL. The question mark is not part of the SAS key; some places accept the question mark, others require the question mark to be removed.<\/li>\n<li>The <strong>sig<\/strong> parameter is the heart of <em>SAS<\/em> key. It\u2019s a value encrypted by the storage key chosen, ensuring the security of the access.<\/li>\n<li>The <strong>sv<\/strong> parameter points to the version of the <em>SAS<\/em> key generator. There are tools able to support up to some specific versions. This example will not face this challenge.<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91960\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-30.png\" alt=\"Image showing keys\" width=\"1547\" height=\"405\" \/><\/p>\n<p>Save the generated <em>SAS<\/em> token as you will need it later.<\/p>\n<h3>Configuring the serverless Pool<\/h3>\n<p>In order to store the <em>SAS<\/em> key safely in Synapse, you use <strong>Credential<\/strong> objects. SQL Server supports two kinds of Credential objects:<\/p>\n<ul>\n<li>Server level Credential<\/li>\n<li>Database Level Credential, introduced in <strong>SQL Server 2016<\/strong><\/li>\n<\/ul>\n<p>The difference is simple: Database level credential provides you more flexibility to move the solution and the database, while server-level credential is re-usable for all databases on the server.<\/p>\n<p>The <strong>Credential<\/strong> object is important because it\u2019s part of the encryption structure in <strong>SQL Server<\/strong> to keep critical information, such as the <em>SAS<\/em> key, safe.<\/p>\n<p>Recommended additional reading: <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/security\/encryption\/encryption-hierarchy?view=sql-server-ver15&amp;WT.mc_id=DP-MVP-4014132\">https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/security\/encryption\/encryption-hierarchy?view=sql-server-ver15&amp;WT.mc_id=DP-MVP-4014132<\/a><\/p>\n<p>The credential will not solve the problem by itself: the <strong>OPENROWSET<\/strong> doesn\u2019t accept the credential object directly. You need to create at least one more object, called <strong>External Objects<\/strong>.<\/p>\n<p>The two existing databases on the Serverless pool can\u2019t be used for that. The existing databases are:<\/p>\n<p><strong>Master<\/strong>: The traditional SQL Server master database<\/p>\n<p><strong>Default<\/strong>: A database to integrate <strong>Spark Pool<\/strong> and SQL On Demand Pool. It\u2019s a replicated database. <strong>Synapse<\/strong> has access to <strong>Spark<\/strong> data through this database.<\/p>\n<p>You must create a new database to hold the objects needed. Call the database <strong>NYTaxi<\/strong>. Open Synapse Studio to run the script. (<a href=\"https:\/\/www.red-gate.com\/simple-talk\/cloud\/azure\/query-blob-storage-sql-using-azure-synapse\/\">The first article in the series<\/a> has the required steps)<\/p>\n<p>According to the way <strong>SQL Server<\/strong> encryption works, you need to create a master key in the database, as you will learn by reading the link provided above. The master key will be responsible for the encryption of the objects inside the database, in this case, the credentials.<\/p>\n<p>Creating the database and credentials (replace with your own SAS key token):<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">Create\u00a0Database\u00a0NYTaxi\r\nGO\r\nUSE NYTAXI\r\nGO\r\nCreate Master Key\r\nGO\r\n\t\r\nCREATE\u00a0DATABASE\u00a0SCOPED\u00a0CREDENTIAL\u00a0[MaltaCredential]\r\nWITH\u00a0IDENTITY='SHARED\u00a0ACCESS\u00a0SIGNATURE',\u00a0\u00a0\r\nSECRET\u00a0=\u00a0'?sv=2019-12-12&amp;ss=bfqt&amp;srt=sco&amp;sp=rwdlacupx&amp;se=2021-12-23T07:34:32Z&amp;st=2020-11-21T23:34:32Z&amp;spr=https&amp;sig=mJpur3JCEp99w5OqHxcXSXGOh4g44rOZfl5j8%2B6St30%3D'\r\nGO<\/pre>\n<p>Once the script is executed, the execution context will be on the <em>NYTaxi<\/em> database. You are still working on the <strong>Serverless Pool<\/strong>, but now with your own database.<\/p>\n<p>Three details to highlight here:<\/p>\n<ul>\n<li>The <strong>IDENTITY<\/strong> value is fixed, always <strong>SHARED ACCESS SIGNATURE<\/strong> when using <em>SAS<\/em> keys.<\/li>\n<li>The <strong>SECRET<\/strong> includes the question mark \u2018?\u2019 . Some features expect the question mark, but others don\u2019t.<\/li>\n<li>The credentials are not visible on the UI. You need to use <em>DMV\u2019s<\/em> to see the existing credentials<\/li>\n<\/ul>\n<p>Checking the existing credentials on the database:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">select\u00a0*\u00a0from\u00a0sys.database_scoped_credentials<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91961\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-31.png\" alt=\"Image showing credentials in query\" width=\"753\" height=\"309\" \/><\/p>\n<h3>Serverless pool databases<\/h3>\n<p>You can look in detail at your new database using the second toolbar icon, located on the left side of the screen.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91962\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-32.png\" alt=\"Image showing menu\" width=\"118\" height=\"397\" \/><\/p>\n<p>The databases from the <strong>Serverless Pool<\/strong> are not easily visible, but once you use the Refresh on the action menu (<em>\u201c\u2026\u201d<\/em> besides <em>Databases<\/em>), they will appear on the <em>Databases<\/em> list.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91963\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-33.png\" alt=\"Image showing refresh\" width=\"521\" height=\"88\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"349\" height=\"119\" class=\"wp-image-91964\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-34.png\" \/><\/p>\n<p>One very important difference between regular databases and the database on the <strong>Serverless Pool<\/strong> is you can\u2019t create tables on the <strong>Serverless Pool<\/strong>. You can create what is called <strong>External Objects<\/strong> and views.<\/p>\n<p><strong>External Objects<\/strong> were introduced in <strong>SQL Server 2016<\/strong> as part of the <strong>Polybase<\/strong> architecture, allowing access to data remotely from many different formats, especially unstructured formats. This doesn\u2019t mean you are using <strong>Polybase<\/strong>: <strong>Synapse Serverless<\/strong> pool has a native provider for blob storage which takes a detour from <strong>Polybase<\/strong>.<\/p>\n<p><strong>External Objects<\/strong> are objects used to map external sources, such as blob storage, to SQL and allow you to query them like regular tables. The <strong>Serverless Pool<\/strong> will not be holding the data, only pointing to it. In the same way, you can create views over the <strong>External Objects<\/strong>, encapsulating more complex queries.<\/p>\n<p>The image below shows exactly this: the database can\u2019t have tables.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91965\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-35.png\" alt=\"Image showing NYTaxi database\" width=\"353\" height=\"257\" \/><\/p>\n<h3>Creating the External Object<\/h3>\n<p><code>OPENROWSET<\/code> doesn\u2019t support the credential directly. You need to create an object called <em>External Data Source<\/em> to use the credential.<\/p>\n<p>Create the external data source, replacing the URL with your URL from your storage created in the first article:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">CREATE\u00a0EXTERNAL\u00a0DATA\u00a0SOURCE\u00a0[demoMalta]\u00a0WITH\u00a0\r\n(\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\r\n\u00a0\u00a0\u00a0\u00a0LOCATION\u00a0=\u00a0'https:\/\/lakedemo.blob.core.windows.net\/datalake',\r\n\u00a0\u00a0\u00a0\u00a0CREDENTIAL=MaltaCredential\r\n);\r\ngo<\/pre>\n<p>The <code>LOCATION<\/code> attribute uses HTTP\/HTTPS protocol, the only protocol accepted by serverless pool to connect to blob storage.<\/p>\n<p>You can check the created <em>Data Source<\/em> in the <em>Databases<\/em> window<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"338\" height=\"279\" class=\"wp-image-91966\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-36.png\" \/><\/p>\n<h3>OPENROWSET with External Data Source<\/h3>\n<p>The query below is the same already used in the previous article of this series, but modified to use the External Data Source:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">select\u00a0Month(Cast(Cast(DateId\u00a0as\u00a0Varchar)\u00a0as\u00a0Date))\u00a0Month,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0count(*)\u00a0Trips\r\n\u00a0from\u00a0\r\n\u00a0\u00a0\u00a0\u00a0OPENROWSET(\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0BULK\u00a0'\/trips\/',\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Data_Source='demoMalta',\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0FORMAT='PARQUET'\r\n\u00a0\u00a0\u00a0\u00a0)\u00a0AS\u00a0[rides]\r\ngroup\u00a0by\u00a0Month(Cast(Cast(DateId\u00a0as\u00a0Varchar)\u00a0as\u00a0Date))\r\norder\u00a0by\u00a0Month<\/pre>\n<p>Some details about this syntax:<\/p>\n<ul>\n<li>The data source makes the syntax easier<\/li>\n<li>You don\u2019t specify the full URL anymore, only the final path<\/li>\n<li>The data source holds the credential, but the data source can also be used on public access scenarios to make the syntax easier<\/li>\n<\/ul>\n<p>The result is similar as before:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91967\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-37.png\" alt=\"Image showing chart of results\" width=\"734\" height=\"424\" \/><\/p>\n<h2>Azure Active Directory pass-through authentication<\/h2>\n<p><strong>Azure AD<\/strong> authentication is the default authentication method used by serverless pool. When you don\u2019t provide any other authentication, Synapse will try to authenticate us using <strong>Azure AD<\/strong>.<\/p>\n<p>The documentation mentions Pass-Through authentication as slower than SAS authentication. However, this is the kind of internal detail that changes every day. While I was writing this article, they had similar performance; sometimes, with Pass-Through having even better performance than SAS.<\/p>\n<p>The requirement for the pass-through is very simple: The user making the query needs to have data permission over the storage.<\/p>\n<p>Since it may not be a good idea to set this permission user by user, the best option is to create an Azure AD group, include the users needed in the group, and then give the permission to the group.<\/p>\n<h3>Create an AD group and set the permission<\/h3>\n<p>Follow these steps to create the AD group and give permissions:<\/p>\n<ul>\n<li>On Azure Portal, open the main menu<\/li>\n<li>Click <em>Azure Active Directory<\/em><\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91968\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-38.png\" alt=\"Image showing Azure AD\" width=\"270\" height=\"161\" \/><\/p>\n<ul>\n<li>Under Manage, click <strong>Groups<\/strong><\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91969\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-39.png\" alt=\"Image shown manage menu\" width=\"240\" height=\"209\" \/><\/p>\n<ul>\n<li>Click <strong>New Group<\/strong> button<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91970\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-40.png\" alt=\"Image showing groups\" width=\"490\" height=\"179\" \/><\/p>\n<ul>\n<li>On <em>Group Type<\/em> dropdown, leave Security<\/li>\n<li>On <em>Group Name<\/em><strong>,<\/strong> you can choose any name; I will call the group <em>DataLakeAdmins<\/em><\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91971\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-41.png\" alt=\"Image showing new group\" width=\"243\" height=\"232\" \/><\/p>\n<ul>\n<li>Under <em>Members<\/em>, click the <em>No Members Selected<\/em> link<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"196\" height=\"70\" class=\"wp-image-91972\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-42.png\" \/><\/p>\n<ul>\n<li>In the <em>Search<\/em> box, type the name of the account you are using. Only part of the name may work; Azure will search for your account.<\/li>\n<li>When you see your account below the <em>Search<\/em> box, select it<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91973\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-43.png\" alt=\"Image showing how to add members\" width=\"690\" height=\"584\" \/><\/p>\n<ul>\n<li>Once your account is selected, click the <em>Select<\/em> button<\/li>\n<li>Click the <em>Create<\/em> button, completing the group creation<\/li>\n<li>Return to the Home of Azure Portal<\/li>\n<li>Locate your storage account, <em>LakeDemo,<\/em> and click on it<\/li>\n<li>Click <em>Access Control (IAM)<\/em> option on the left side menu<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"303\" height=\"162\" class=\"wp-image-91974\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-44.png\" \/><\/p>\n<ul>\n<li>Click the <em>Add<\/em> button and the <em>Add Role Assignment<\/em> option<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91975\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-45.png\" alt=\"Image showing how to add role assignment\" width=\"357\" height=\"185\" \/><\/p>\n<ul>\n<li>On <em>Role<\/em> dropdown, select <em>Storage Blob Data Contributor<\/em><\/li>\n<\/ul>\n<p><strong><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91976\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-46.png\" alt=\"Select role\" width=\"509\" height=\"367\" \/><\/strong><\/p>\n<ul>\n<li>On the <em>Select<\/em> box, type the name of the group, in this case, <em>DataLakeAdmins<\/em><\/li>\n<li>Select the group as soon it appears below the <em>Select<\/em> box<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"491\" height=\"163\" class=\"wp-image-91977\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-47.png\" \/><\/p>\n<ul>\n<li>Click the <em>Save<\/em> button to create the role assignment<\/li>\n<\/ul>\n<p>Now you can execute the query on Synapse without providing the credential, using the pass-through authentication. Note that you will need to close out of the Synapse Workspace and open it again to get the refreshed permissions.<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">select\u00a0Month(Cast(Cast(DateId\u00a0as\u00a0Varchar)\u00a0as\u00a0Date))\u00a0Month,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0count(*)\u00a0Trips\r\n\u00a0from\u00a0\r\n\u00a0\u00a0\u00a0\u00a0OPENROWSET(\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0BULK\u00a0'https:\/\/lakedemo.blob.core.windows.net\/datalake\/trips\/',\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0FORMAT='PARQUET'\r\n\u00a0\u00a0\u00a0\u00a0)\u00a0AS\u00a0[rides]\r\ngroup\u00a0by\u00a0Month(Cast(Cast(DateId\u00a0as\u00a0Varchar)\u00a0as\u00a0Date))\r\norder\u00a0by\u00a0Month<\/pre>\n<h2>Analyzing the log and comparing the methods<\/h2>\n<p>One good way to compare both methods is by using <strong>Azure Monitor<\/strong> and <strong>Log Analytics<\/strong> to compare what happens with each authentication. Both are very important tools when working with <strong>Azure<\/strong>. You can read more about them in this other article <a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/saving-money-log-analytics\/\">https:\/\/www.red-gate.com\/simple-talk\/blogs\/saving-money-log-analytics\/<\/a><\/p>\n<p>Once you have the diagnostics configuration enabled, a simple <strong>Kusto<\/strong> query can show what happens with each Synapse query execution.<\/p>\n<h3>Enabling Diagnostics on the storage account<\/h3>\n<p>The <em>Diagnostics<\/em> configuration in <em>Storage<\/em> <em>Accounts<\/em> is among the features that most change on <strong>Azure<\/strong>. At the time of this writing, there are two <strong>Diagnostics<\/strong> options in <strong>Storage Accounts<\/strong>.<\/p>\n<p>Due to that, the steps below may change:<\/p>\n<ol>\n<li>Open the <em>LakeDemo<\/em> <em>Storage Account<\/em> on the portal<\/li>\n<li>Select <em>Diagnostics Settings (preview)<\/em> on the left tab<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"289\" height=\"395\" class=\"wp-image-91978\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-48.png\" \/><\/p>\n<ol>\n<li>Click on the <em>Disabled<\/em> button besides the <em>blob<\/em> row<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91979\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-49.png\" alt=\"List of storage account\" width=\"967\" height=\"184\" \/><\/p>\n<ol>\n<li>Click the <em>Add Diagnostic Settings link<\/em><\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"242\" height=\"180\" class=\"wp-image-91980\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-50.png\" \/><\/p>\n<ol>\n<li>On the left side, select all the information options to be sent to <em>Log Analytics<\/em><\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"193\" height=\"308\" class=\"wp-image-91981\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-51.png\" \/><\/p>\n<ol>\n<li>On the right side, select <em>Log Analytics<\/em> as the destination of the information<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"349\" height=\"394\" class=\"wp-image-91982\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-52.png\" \/><\/p>\n<ol>\n<li>On the <em>Subscription<\/em> drop down select the subscription where the <em>Log Analytics<\/em> workspace is located<\/li>\n<li>On the <em>Log Analytics<\/em> workspace dropdown, select the workspace which will hold the log information<\/li>\n<li>On the <em>Diagnostic<\/em> <em>Settings<\/em> <em>Name<\/em> textbox, type the name of the diagnostics configuration. In most types, there will be only one, and the name is not very important.<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"391\" height=\"58\" class=\"wp-image-91983\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-53.png\" \/><\/p>\n<ol>\n<li>Click on the <strong>Save<\/strong> button<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"284\" height=\"115\" class=\"wp-image-91984\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-54.png\" \/><\/p>\n<h3>Checking the logs<\/h3>\n<p>Execute this query on <code>Synapse Serverless<\/code>:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">select\u00a0Month(Cast(Cast(DateId\u00a0as\u00a0Varchar)\u00a0as\u00a0Date))\u00a0Month,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0count(*)\u00a0Trips\r\n\u00a0from\u00a0\r\n\u00a0\u00a0\u00a0\u00a0OPENROWSET(\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0BULK\u00a0'\/trips\/',\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Data_Source='demoMalta',\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0FORMAT='PARQUET'\r\n\u00a0\u00a0\u00a0\u00a0)\u00a0AS\u00a0[rides]\r\ngroup\u00a0by\u00a0Month(Cast(Cast(DateId\u00a0as\u00a0Varchar)\u00a0as\u00a0Date))\r\norder\u00a0by\u00a0Month<\/pre>\n<ol>\n<li>Open the <em>Storage Account<\/em> in the portal<\/li>\n<li>Click on <em>Logs (preview)<\/em> on the left side menu<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"281\" height=\"288\" class=\"wp-image-91985\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-55.png\" \/><\/p>\n<ul>\n<li>Close the <em>Queries<\/em> window that appears<\/li>\n<li>It may take a few minutes before the log information is up to date. In the <em>New Query 1<\/em> window, type the following Kusto query:<\/li>\n<\/ul>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">StorageBlobLogs\r\n|\u00a0where\u00a0TimeGenerated\u00a0&gt;\u00a0ago(1h)\r\n| order\u00a0by\u00a0TimeGenerated\u00a0desc\u00a0<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91986\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-56.png\" alt=\"Kusto query for logs\" width=\"383\" height=\"152\" \/><\/p>\n<ul>\n<li>Click the <em>Execute<\/em> button<\/li>\n<\/ul>\n<p>You may notice the following details on the result:<\/p>\n<ul>\n<li>A single <code>OPENROWSET<\/code> query on Synapse over one folder with a single file generated six operations on the storage account<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91987\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-57.png\" alt=\"Log results\" width=\"929\" height=\"300\" \/><\/p>\n<ul>\n<li>The <em>AuthenticationType<\/em> field appears as <em>SAS \u2013 Shared Access Key<\/em><\/li>\n<li>HTTP Status 206 means Partial Content. The <em>GetBlob<\/em> method returns the result in pieces, to control the transfer of big blobs<\/li>\n<li>The total number of calls may vary due to the number of pieces <em>GetBlob<\/em> method returned<\/li>\n<li>The <em>RequesterTenantId<\/em> field is empty because the <em>AuthenticationType<\/em> is <em>SAS<\/em><\/li>\n<\/ul>\n<p>Now, try the pass-through authentication. Execute the following query on <em>Synapse Serverless<\/em>:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">select\u00a0Month(Cast(Cast(DateId\u00a0as\u00a0Varchar)\u00a0as\u00a0Date))\u00a0Month,\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0count(*)\u00a0Trips\r\n\u00a0from\u00a0\r\n\u00a0\u00a0\u00a0\u00a0OPENROWSET(\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0BULK\u00a0'https:\/\/lakedemo.blob.core.windows.net\/datalake\/trips\/',\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0FORMAT='PARQUET'\r\n\u00a0\u00a0\u00a0\u00a0)\u00a0AS\u00a0[rides]\r\ngroup\u00a0by\u00a0Month(Cast(Cast(DateId\u00a0as\u00a0Varchar)\u00a0as\u00a0Date))\r\norder\u00a0by\u00a0Month<\/pre>\n<p>Execute the Kusto query again on the storage log:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">StorageBlobLogs\r\n|\u00a0where\u00a0TimeGenerated\u00a0&gt;\u00a0ago(1h)\r\n|order\u00a0by\u00a0TimeGenerated\u00a0desc\u00a0<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-91988\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2021\/07\/word-image-58.png\" alt=\"results after AD \" width=\"1526\" height=\"352\" \/><\/p>\n<p>You may notice the following:<\/p>\n<ul>\n<li>The query generates eight activities on the storage, two more than the <strong>SAS<\/strong> authenticated query<\/li>\n<li><strong>Synapse<\/strong> first tries to use anonymous access, generating the first two access and resulting in HTTP error 404. After the failed attempt, it reverts to pass-through<\/li>\n<li>The pass-through authentication appears as <em>OAUTH<\/em>, the protocol used by <strong>Azure Active Directory<\/strong><\/li>\n<li>There is a bigger interval between the second failed request and the first <em>OAUTH<\/em> request (0.3ms). On this interval, <strong>Synapse<\/strong> was contacting <strong>Azure Active Directory<\/strong> and requesting the authentication<\/li>\n<li>The <em>RequesterTenentId<\/em> field is now filled with the id of the tenant used for the authentication.<\/li>\n<\/ul>\n<h2>Query private storage with SQL<\/h2>\n<p>The two authentication methods create a safe way to access data in Azure storage. However, it could still be a bit better if you could use its own Synapse Managed Identity as an authentication method to the storage.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Azure storage can be marked Private to control access. Dennes Torres explains how to query private blob storage with SQL and Azure Synapse.&hellip;<\/p>\n","protected":false},"author":50808,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[137091,53],"tags":[124839,145436,145437,5134],"coauthors":[6810],"class_list":["post-91956","post","type-post","status-publish","format-standard","hentry","category-azure","category-featured","tag-azure-blob-storage","tag-azure-synapse","tag-secure-blob-storage","tag-sql-prompt"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/91956","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/50808"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=91956"}],"version-history":[{"count":7,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/91956\/revisions"}],"predecessor-version":[{"id":91998,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/91956\/revisions\/91998"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=91956"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=91956"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=91956"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=91956"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}