{"id":78545,"date":"2018-04-30T14:27:34","date_gmt":"2018-04-30T14:27:34","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=78545"},"modified":"2026-04-14T13:14:48","modified_gmt":"2026-04-14T13:14:48","slug":"sql-server-graph-databases-part-5-importing-relational-data-graph-database","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/sql-server-graph-databases-part-5-importing-relational-data-graph-database\/","title":{"rendered":"SQL Server Graph Databases Part 5: Import Relational Data into Graph Tables"},"content":{"rendered":"<p>SQL Server&#8217;s graph database features &#8211; introduced in SQL Server 2017 &#8211; allow you to model and query relationship data using node and edge tables instead of traditional JOIN chains. This article, Part 5 of the series, covers migrating an existing relational database (AdventureWorks2017) into a graph structure: creating node tables for entities such as customers, stores, and products; creating edge tables to represent relationships between them; and querying the graph data using the MATCH clause. It also demonstrates combining graph and relational queries in the same statement &#8211; allowing incremental adoption without requiring a full migration away from relational tables.<\/p>\n<h4>The series so far:<\/h4>\n<ol>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/sql-development\/sql-server-graph-databases-part-1-introduction\/\">SQL Server Graph Databases - Part 1: Introduction<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/t-sql-programming\/sql-server-graph-databases-part-2-querying-data-graph-database\/\">SQL Server Graph Databases - Part 2: Querying Data in a Graph Database<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/t-sql-programming\/sql-server-graph-databases-part-3-modifying-data-graph-database\/\">SQL Server Graph Databases - Part 3: Modifying Data in a Graph Database<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/t-sql-programming\/sql-server-graph-databases-part-4-working-hierarchical-data-graph-database\">SQL Server Graph Databases - Part 4: Working with hierarchical data in a graph database<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/t-sql-programming\/sql-server-graph-databases-part-5-importing-relational-data-graph-database\/\">SQL Server Graph Databases - Part 5:\u00a0Importing Relational Data into a Graph Database<\/a><\/li>\n<\/ol>\n\n<p>With the release of SQL Server 2017, Microsoft introduced graph database features to support data sets that contain complex relationships between entities. The graph capabilities are integrated into the database engine and require no special configurations or installations. You can use these features in addition to or independently of the traditional relational structures. For example, you might implement a graph database for a new master data management solution that could benefit from both graph and relational tables.<\/p>\n<p>When creating a graph database, you might be working with new data, existing data, or a combination of both. In some cases, the data might already exist in relational tables, which do not support the graph features. Only node and edge tables in a graph database allow you to use the new capabilities, in which case, you must either copy the data over to the graph tables or forget about using the graph features altogether.<\/p>\n<p>For those interested in the first option, this article demonstrates how to move from a relational structure to a graph structure, using data from the <strong>AdventureWorks2017<\/strong> sample database. The database might not represent the type of data you had in mind, but it provides a handy way to illustrate how to migrate to a graph structure, using a relational schema already familiar to many of you. Such a recognizable structure also helps demonstrate various ways to query the data once it\u2019s in the graph tables.<\/p>\n<h2>Moving from Relational Tables to Graph Tables<\/h2>\n<p>The <strong>AdventureWorks2017<\/strong> database includes transactional data related to the fictitious company Adventure Works, which sells bicycles and related equipment to retail outlets and online customers. For this article, we\u2019ll focus on the retail outlets that ordered the products, the sales reps who sold the products, and the vendors who supplied the products, along with such details as the number of items ordered and the amount paid for those items.<\/p>\n<p>To retrieve this type of data from the <strong>AdventureWorks2017<\/strong> database as it exists in its current state, you would be accessing different combinations of the tables shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1142\" height=\"630\" class=\"wp-image-78546\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-236.png\" \/><\/p>\n<p>For those who\u2019ve been around SQL Server documentation for a while, such tables as <strong>SalesOrderHeader<\/strong>, <strong>SalesOrderDetail<\/strong>, <strong>Product<\/strong>, and <strong>Person<\/strong> should be quite familiar because they\u2019re included in countless examples that demonstrate various ways to work with relational data. However, suppose that you now want to pull some of this information into a graph database, in which case, the data model might look more like the one shown in the next figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1140\" height=\"540\" class=\"wp-image-78547\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-237.png\" \/><\/p>\n<p>The data model includes only four nodes (<strong>Stores<\/strong>, <strong>SalesReps<\/strong>, <strong>Vendors<\/strong>, and <strong>Products<\/strong>) and only three edges (<strong>Purchases<\/strong>, <strong>Sells<\/strong>, and <strong>Supplies<\/strong>). Together these nodes and edges define the following relationships:<\/p>\n<ul>\n<li>Stores purchase products<\/li>\n<li>Sales reps sell products<\/li>\n<li>Vendors supply products<\/li>\n<\/ul>\n<p>You\u2019ll define these relationships within the edge tables by mapping the originating node to the terminating node for each relationship, as you saw in the first article in this series. The implication here is that you should first populate the node tables before the edge tables so you can reference the originating and terminating node IDs when defining your relationships.<\/p>\n<h2>Creating and Populating the Node Tables<\/h2>\n<p>Before you can create and populate your node tables, you must determine where to put the tables. For the examples in this article, I created the <strong>graph<\/strong> schema within the <strong>AdventureWorks2017<\/strong> database, using the following T-SQL code:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  USE AdventureWorks2017;\n  GO\n  CREATE SCHEMA graph;  \n  GO<\/pre>\n<p>You do not need to locate the graph tables in the <strong>graph<\/strong> schema or even in the <strong>AdventureWorks2017<\/strong> database. However, if you plan to try out the examples to follow and want to locate the graph tables elsewhere, be sure to update the T-SQL code accordingly.<\/p>\n<p>With the <strong>graph<\/strong> schema in place (or wherever you locate the tables), you can then create and populate the <strong>Stores<\/strong> node table, which includes two user-defined columns, <strong>StoreID<\/strong> and <strong>StoreName<\/strong>, as shown in the following <strong>CREATE<\/strong> <strong>TABLE<\/strong> statement:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS graph.Stores;\n  GO\n  CREATE TABLE graph.Stores (\n    StoreID INT PRIMARY KEY,\n    StoreName NVARCHAR(50) NOT NULL\n  ) AS NODE;\n  INSERT INTO graph.Stores(StoreID, StoreName)\n  SELECT c.CustomerID, s.Name\n  FROM Sales.Customer c INNER JOIN Sales.Store s\n    ON c.StoreID = s.BusinessEntityID;<\/pre>\n<p>The example follows the same procedures used in the first article to create and populate node tables, so be sure to refer back to the article if you\u2019re unsure about what we\u2019re doing here. Keep in mind that you must include the <strong>AS<\/strong> <strong>NODE<\/strong> clause in the <strong>CREATE<\/strong> <strong>TABLE<\/strong> statement. You can also add whatever other user-defined columns you want to include. SQL Server will automatically generate the table\u2019s <strong>$node_id<\/strong> column.<\/p>\n<p>You can then use an <strong>INSERT\u2026SELECT<\/strong> statement to populate the <strong>Stores<\/strong> table, as you would with any SQL Server table. In this case, you must join the <strong>Sales.Customer<\/strong> table to the <strong>Sales.Store<\/strong> table to get the store name. In addition, when supplying values for the <strong>StoreID<\/strong> column in the <strong>Stores<\/strong> table, you should use the <strong>CustomerID<\/strong> value in the <strong>Customer<\/strong> table, rather than use the <strong>StoreID<\/strong> value in that table, because the <strong>SalesOrderHeader<\/strong> table uses the <strong>CustomerID<\/strong> value. This approach helps to keep things simpler when populating the edge tables. SQL Server automatically populates the <strong>$node_id<\/strong> column.<\/p>\n<p>That\u2019s all there is to setting up the <strong>Stores<\/strong> table, and creating and populating the <strong>SalesReps<\/strong> table is even easier:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS graph.SalesReps;\n  GO\n  CREATE TABLE graph.SalesReps (\n    SalesRepID INT PRIMARY KEY,\n    FirstName NVARCHAR(50) NOT NULL,\n    LastName NVARCHAR(50) NOT NULL,\n  ) AS NODE;\n  INSERT INTO graph.SalesReps(SalesRepID, FirstName, LastName)\n  SELECT BusinessEntityID, FirstName, LastName\n  FROM Person.Person\n  WHERE PersonType = 'SP';<\/pre>\n<p>For this example, you can pull all the data directly from the <strong>Person<\/strong> table, limiting the results to those rows with a <strong>PersonType<\/strong> value of <strong>SP<\/strong> (for <em>salesperson<\/em>). If you want to include such information as sales quotas or job titles in the table, you must join the <strong>Person<\/strong> table to the <strong>SalesPerson<\/strong> or <strong>Employee<\/strong> table (or both). For this example, however, the <strong>Person<\/strong> table is enough.<\/p>\n<p>The next table to create and populate is <strong>Products<\/strong>. For this, you can pull all the data from the <strong>Production.Product<\/strong> table:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS graph.Products;\n  GO\n  CREATE TABLE graph.Products (\n    ProductID INT PRIMARY KEY,\n    ProductName NVARCHAR(50) NOT NULL,\n    StandardCost MONEY NOT NULL,\n  ) AS NODE;\n  INSERT INTO graph.Products(ProductID, ProductName, StandardCost)\n  SELECT ProductID, Name, StandardCost\n  FROM Production.Product\n  WHERE FinishedGoodsFlag = 1;<\/pre>\n<p>For this example, when retrieving data from the <strong>Product<\/strong> table, you should include a <strong>WHERE<\/strong> clause that filters the data so that only rows with a <strong>FinishedGoodsFlag<\/strong> value of <strong>1<\/strong> are included. This ensures that you include only salable products in the <strong>Products<\/strong> table.<\/p>\n<p>The final node table is <strong>Vendors<\/strong>, which gets all its data from the <strong>Purchasing.Vendor<\/strong> table:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS graph.Vendors;\n  GO\n  CREATE TABLE graph.Vendors (\n    VendorID INT PRIMARY KEY,\n    AccountNumber NVARCHAR(15) NOT NULL,\n    VendorName NVARCHAR(50) NOT NULL\n  ) AS NODE;\n  INSERT INTO graph.Vendors(VendorID, AccountNumber, VendorName)\n  SELECT BusinessEntityID, AccountNumber, Name\n  FROM Purchasing.Vendor;<\/pre>\n<p>That\u2019s all there is to creating and populating the node tables. Once they\u2019re in place, you can start in on your edge tables.<\/p>\n<h2>Creating and Populating the Edge Tables<\/h2>\n<p>Creating an edge table is just as simple as a node table, with a few notable differences. For the edge table, the table definition requires an <strong>AS<\/strong> <strong>EDGE<\/strong> clause, rather than an <strong>AS<\/strong> <strong>NODE<\/strong> clause, and the user-defined columns are optional. (Node tables require at least one user-defined column.) In addition, SQL Server automatically generates the <strong>$edge_id<\/strong> column, rather than the <strong>$node_id<\/strong> column.<\/p>\n<p>The first edge table is <strong>Orders<\/strong>, which includes three user-defined columns, as shown in the following <strong>CREATE<\/strong> <strong>TABLE<\/strong> statement:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS graph.Orders;\n  GO\n  CREATE TABLE graph.Orders (\n    OrderDate DATETIME NOT NULL,\n    OrderQty SMALLINT NOT NULL,\n    LineTotal MONEY NOT NULL\n  ) AS EDGE;<\/pre>\n<p>After you create the <strong>Orders<\/strong> table, you can add the data, which relies on the <strong>SalesOrderHeader<\/strong> and <strong>SalesOrderDetail<\/strong> tables to supply the values for the user-defined columns and, more importantly, to provide the structure for defining the relationships between the <strong>Stores<\/strong> and <strong>Products<\/strong> nodes:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  INSERT INTO graph.Orders($from_id, $to_id, OrderDate, \n     OrderQty, LineTotal)\n  SELECT s.node1, p.node2, h.OrderDate, d.OrderQty, d.LineTotal\n  FROM Sales.SalesOrderHeader h INNER JOIN Sales.SalesOrderDetail d\n    ON h.SalesOrderID = d.SalesOrderID \n    INNER JOIN (SELECT $node_id AS node1, StoreID FROM graph.Stores) s\n    ON h.CustomerID = s.StoreID\n    INNER JOIN (SELECT $node_id AS node2, ProductID \n                FROM graph.Products) p\n    ON d.ProductID = p.ProductID;<\/pre>\n<p>After joining the <strong>SalesOrderHeader<\/strong> and <strong>SalesOrderDetail<\/strong> tables, the <strong>SELECT<\/strong> statement joins the <strong>SalesOrderHeader<\/strong> table to the <strong>Stores<\/strong> tables, based on the <strong>CustomerID<\/strong> and <strong>StoreID<\/strong> values. The join uses a subquery to retrieve only the <strong>$node_id<\/strong> and <strong>StoreID<\/strong> columns from the <strong>Stores<\/strong> table and to rename the <strong>$node_id<\/strong> column to <strong>node1<\/strong>. The query will fail if you try to use <strong>$node_id<\/strong> in the SELECT list. You can then join the <strong>SalesOrderHeader<\/strong> table to the <strong>Products<\/strong> table, using the same logic as when joining to the <strong>Stores<\/strong> table.<\/p>\n<p>The <strong>node1<\/strong> and <strong>node2<\/strong> columns returned by the <strong>SELECT<\/strong> statement provide the values for the <strong>$from_id<\/strong> and <strong>$to_id<\/strong> columns in the edge table. As you\u2019ll recall from the first article, you must specifically provide these values when inserting data into an edge table. The values are essential to defining the relationships between the originating and terminating nodes. SQL Server automatically populates the <strong>$edge_id<\/strong> column.<\/p>\n<p>The next step is to create and populate the <strong>Sells<\/strong> edge table, which works much the same way as the <strong>Orders<\/strong> table, even when it comes to the user-defined columns. The main difference is that the relationships originate with the <strong>SalesReps<\/strong> table, as shown in the following T-SQL code:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS graph.Sells;\n  GO\n  CREATE TABLE graph.Sells (\n    OrderDate DATETIME NOT NULL,\n    OrderQty SMALLINT NOT NULL,\n    LineTotal MONEY NOT NULL\n  ) AS EDGE;\n  INSERT INTO graph.Sells($from_id, $to_id, OrderDate, \n     OrderQty, LineTotal)\n  SELECT s.node1, p.node2, h.OrderDate, d.OrderQty, d.LineTotal\n  FROM Sales.SalesOrderHeader h INNER JOIN Sales.SalesOrderDetail d\n    ON h.SalesOrderID = d.SalesOrderID \n    INNER JOIN (SELECT $node_id AS node1, SalesRepID \n                FROM graph.SalesReps) s\n    ON h.SalesPersonID = s.SalesRepID\n    INNER JOIN (SELECT $node_id AS node2, ProductID \n                FROM graph.Products) p\n    ON d.ProductID = p.ProductID;<\/pre>\n<p>The fact that the <strong>Orders<\/strong> and <strong>Sells<\/strong> tables include the same user-defined columns points to the possibility that you could create a fifth node table for sales orders and then include columns such as <strong>OrderDate<\/strong> in there. However, this approach could make your schema and queries unnecessarily complicated, while providing little benefit. On the other hand, this approach helps to eliminate duplicate data. As with any database, the exact layout of your graph model will depend on the type of data you\u2019re storing and how you plan to query that data.<\/p>\n<p>The last step is to create and populate the <strong>Supplies<\/strong> table. In this case, the structure for the relationships is available through the <strong>ProductVendor<\/strong> table:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  DROP TABLE IF EXISTS graph.Supplies;\n  GO\n  CREATE TABLE graph.Supplies (\n    StandardPrice MONEY NOT NULL\n  ) AS EDGE;\n  INSERT INTO graph.Supplies($from_id, $to_id, StandardPrice)\n  SELECT v.node1, p.node2, pv.StandardPrice\n  FROM Purchasing.ProductVendor pv\n    INNER JOIN (SELECT $node_id AS node1, VendorID \n                FROM graph.Vendors) v\n    ON pv.BusinessEntityID = v.VendorID\n    INNER JOIN (SELECT $node_id AS node2, ProductID \n                FROM graph.Products) p\n    ON pv.ProductID = p.ProductID;<\/pre>\n<p>The <strong>ProductVendor<\/strong> table does all the product-vendor mapping for you and includes the <strong>StandardPrice<\/strong> values. You need only join this table to the <strong>Vendors<\/strong> and <strong>Products<\/strong> tables to get the originating and terminating node IDs.<\/p>\n<h2>Retrieving Store Sales Data<\/h2>\n<p>With the graph tables now defined and populated, you\u2019re ready to start querying them, just like you saw in the second article in this series. For example, you can use the following <strong>SELECT<\/strong> statement to return information about the products that each store has ordered:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT s.StoreName, p.ProductName, o.OrderQty, o.LineTotal,\n    CAST(o.OrderDate AS DATE) AS OrderDate\n  FROM graph.Stores s, graph.Orders o, graph.Products p\n  WHERE MATCH(s-(o)-&gt;p)\n  ORDER BY s.StoreID;<\/pre>\n<p>The <strong>SELECT<\/strong> statement uses the <strong>MATCH<\/strong> function to specify what data to retrieve. As described in the second article, the function lets you define a search pattern based on the relationships between nodes. You can use the function only in the <strong>WHERE<\/strong> clause of a query that targets node and edge tables. The following table shows part of the results that the <strong>SELECT<\/strong> statement returns. (The statement returns over 60,000 rows.)<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1138\" height=\"528\" class=\"wp-image-78548\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-238.png\" \/><\/p>\n<p>In the above example, the <strong>MATCH<\/strong> clause specifies the relationship <em>store orders product<\/em>. If you were to retrieve the same data directly from the relational tables, you could not use the <strong>MATCH<\/strong> clause. Instead, your query would look similar to the following:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT c.CustomerID, c.StoreID, st.Name StoreName, p.name ProductName, \n    d.OrderQty, d.LineTotal, s.OrderDate\n  FROM sales.SalesOrderHeader s INNER JOIN sales.SalesOrderDetail d\n    ON s.SalesOrderID = d.SalesOrderID\n    INNER JOIN sales.customer c ON s.CustomerID = c.CustomerID\n    INNER JOIN sales.store st ON c.storeid = st.BusinessEntityID\n    INNER JOIN production.product p ON d.ProductID = p.ProductID\n  WHERE p.FinishedGoodsFlag = 1 \n  ORDER BY st.BusinessEntityID;<\/pre>\n<p>Although this query is more complex than the previous one, you can use it without having to create and populate graph tables. As with any data, you\u2019ll have to determine on a case-by-case basis when a graph database will be useful to your circumstances and which structure will deliver the best-performing queries.<\/p>\n<p>Returning now to the graph tables, you can modify the preceding example by grouping the data based on the stores and products, as shown in the following example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT s.StoreName, p.ProductName, \n    SUM(o.OrderQty) AS TotalItems, SUM(o.LineTotal) AS TotalAmount\n  FROM graph.Stores s, graph.Orders o, graph.Products p\n  WHERE MATCH(s-(o)-&gt;p)  \n  GROUP BY s.StoreName, p.ProductName\n  HAVING SUM(o.OrderQty) &gt; 100\n  ORDER BY s.StoreName;<\/pre>\n<p>As you can see, you can use the <strong>MATCH<\/strong> function in conjunction with other clauses, including the <strong>HAVING<\/strong> clause, which in this case, limits the results to rows with a total quantity greater than 100. The following figure shows the data now returned by the <strong>SELECT<\/strong> statement.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1138\" height=\"510\" class=\"wp-image-78549\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-239.png\" \/><\/p>\n<p>When implementing a graph database based on existing relational data, you might want to copy only part of the data set into the graph tables, in which case, you\u2019ll likely need to create queries that can retrieve data from both the graph and relational tables. One way to achieve this is to define a common table expression (CTE) that retrieves the graph data and then use the CTE when retrieving the relational data, as shown in the following example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  WITH StoreOrders AS\n  (\n    SELECT s.StoreName, p. ProductID, p.ProductName, \n      SUM(o.OrderQty) AS TotalItems, SUM(o.LineTotal) AS TotalAmount\n    FROM graph.Stores s, graph.Orders o, graph.Products p\n    WHERE MATCH(s-(o)-&gt;p)  \n    GROUP BY s.StoreName, p.ProductID, p.ProductName\n    HAVING SUM(o.OrderQty) &gt; 100\n  )\n  SELECT so.StoreName, so.ProductName, ps.Name AS ProductSubcategory, \n    pc.Name AS ProductCategory, so.TotalItems, so.TotalAmount\n  FROM StoreOrders so INNER JOIN Production.Product pr\n    ON so.ProductID = pr.ProductID \n    INNER JOIN Production.ProductSubcategory ps\n    ON pr.ProductSubcategoryID = ps.ProductSubcategoryID \n    INNER JOIN Production.ProductCategory pc\n    ON ps.ProductCategoryID = pc.ProductCategoryID\n  ORDER BY so.StoreName;<\/pre>\n<p>In this case, the outer <strong>SELECT<\/strong> statement joins the data from the CTE to the <strong>Product<\/strong>, <strong>ProductSubcategory<\/strong>, and <strong>ProductCategory<\/strong> tables in order to include the product categories and subcategories in the results, as shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1138\" height=\"382\" class=\"wp-image-78550\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-240.png\" \/><\/p>\n<p>Being able to access both graph and relational data makes it possible to implement a graph database for those complex relationships that can justify the additional work, while still retaining the basic relational structure for all other data.<\/p>\n<h2>Retrieving Sales Rep and Vendor Data<\/h2>\n<p>Of course, once you have your graph tables in place, you can run a query against any of them. For example, the following query returns a list of sales reps and the products they have sold, along with details about the orders:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT r.LastName + ', ' + r.FirstName AS FullName, p.ProductName, \n    CAST(s.OrderDate AS DATE) AS OrderDate, s.OrderQty, s.LineTotal\n  FROM graph.SalesReps r, graph.Sells s, graph.Products p\n  WHERE MATCH(r-(s)-&gt;p)\n  ORDER BY r.LastName, r.FirstName;<\/pre>\n<p>As you can see, retrieving information about the <strong>Sells<\/strong> relationships works just like returning data about the <strong>Orders<\/strong> relationships, but now the results are specific to each sales rep, as shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1128\" height=\"558\" class=\"wp-image-78551\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-241.png\" \/><\/p>\n<p>The results shown here are only a small portion of the returned data. The statement actually returns over 60,000 rows. However, you can aggregate the data just as you saw earlier:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk \">  SELECT r.LastName + ', ' + r.FirstName AS FullName, \n    YEAR(s.OrderDate) AS OrderYear,\n    SUM(s.OrderQty) AS TotalItems, CAST(SUM(s.LineTotal) AS INT) \n       AS TotalSales\n  FROM graph.SalesReps r, graph.Sells s, graph.Products p\n  WHERE MATCH(r-(s)-&gt;p)\n  GROUP BY r.LastName, r.FirstName, YEAR(s.OrderDate)\n  ORDER BY r.LastName, r.FirstName, YEAR(s.OrderDate);<\/pre>\n<p>Now the <strong>SELECT<\/strong> statement returns only 58 rows, with the first 10 shown below.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"938\" height=\"566\" class=\"wp-image-78552\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-242.png\" \/><\/p>\n<p>There\u2019s little difference between returning data based on the <strong>Orders<\/strong> relationships or the <strong>Sells<\/strong> relationships, except that the originating nodes are different. You can also take the same approach to retrieve vendor data. Just be sure to update the table alias references as necessary, as shown in the following example:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  WITH VendorProducts AS\n  (\n    SELECT v.VendorName, p.ProductID, p.ProductName, \n      AVG(s.StandardPrice) AvgPrice\n    FROM graph.Vendors v, graph.Supplies s, graph.Products p\n    WHERE MATCH(v-(s)-&gt;p) \n    GROUP BY v.VendorName, p.ProductID, p.ProductName\n  )\n  SELECT vp.VendorName, vp.ProductName, ps.Name AS ProductSubcategory,\n     pc.Name AS ProductCategory\n  FROM VendorProducts vp INNER JOIN Production.Product pr\n    ON vp.ProductID = pr.ProductID \n    INNER JOIN Production.ProductSubcategory ps\n    ON pr.ProductSubcategoryID = ps.ProductSubcategoryID \n    INNER JOIN Production.ProductCategory pc\n    ON ps.ProductCategoryID = pc.ProductCategoryID\n  WHERE pc.Name = 'Components'\n  ORDER BY vp.VendorName, vp.ProductName;<\/pre>\n<p>This should all look familiar to you. The <strong>SELECT<\/strong> statement uses a CTE to join the graph and relational data together. The following table shows the first 10 rows of the 32 that the statement returns.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1246\" height=\"580\" class=\"wp-image-78553\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-243.png\" \/><\/p>\n<p>As you can see, the results include the vendor and product names, along with the product subcategories and categories.<\/p>\n<h2>Digging into the Graph Data<\/h2>\n<p>Once you get the basics down of how to query your graph tables, you can come up with other ways to understand the relationships between the nodes. For example, the following <strong>SELECT<\/strong> statement attempts to identity sales reps who might be focusing too heavily on certain vendors:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">  SELECT r.LastName + ', ' + r.FirstName AS FullName, v.VendorName, \n    CAST(SUM(sl.LineTotal) AS INT) AS TotalAmount\n  FROM graph.SalesReps r, graph.Sells sl, graph.Products p, \n    graph.Supplies sp, graph.Vendors v\n  WHERE MATCH(r-(sl)-&gt;p&lt;-(sp)-v)\n  GROUP BY r.LastName, r.FirstName, v.VendorName\n  HAVING SUM(sl.LineTotal) &gt; \n    (SELECT AVG(LineTotal) FROM graph.Sells) * 50\n  ORDER BY r.LastName, r.FirstName, v.VendorName;<\/pre>\n<p>The statement groups the data by the name of the sales reps and then by the vendors. The statement also includes a <strong>HAVING<\/strong> clause that calculates an amount 50 times the average sales and then compares that to the total sales of each sales rep. Only reps that go over the calculated amount are included in the results, as shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"832\" height=\"488\" class=\"wp-image-78554\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-244.png\" \/><\/p>\n<p>By being able to return this type of information, you can identify patterns that point to anomalies or specific trends in the data set. For instance, suppose you now want to identify the products that stores have bought based on a specified product that they also bought (a scenario sometimes referred to <em>customers who bought this also bought that<\/em>).<\/p>\n<p>One way to get this information is to use a CTE to retrieve the IDs of the stores that ordered the specified product and then, for each store return the list of other products that the store ordered. To achieve this, use the CTE to qualify your query so it returns only the other products that the stores bought:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk \">  WITH StoreIDs AS\n  (\n    SELECT s.StoreID\n    FROM graph.stores s, graph.Orders o, graph.Products p \n    WHERE MATCH(s-(o)-&gt;p)\n      AND p.ProductName = 'Sport-100 Helmet, Blue'\n    GROUP BY s.StoreID, s.StoreName\n    HAVING SUM(o.OrderQty) &gt; 75\n  )\n  SELECT s.StoreName, p.ProductName\n  FROM graph.stores s, graph.Orders o, graph.Products p \n  WHERE MATCH(s-(o)-&gt;p)\n    AND s.StoreId IN (SELECT StoreID FROM StoreIDs)\n    AND p.ProductName &lt;&gt; 'Sport-100 Helmet, Blue'\n  GROUP BY s.StoreName, p.ProductName\n  HAVING SUM(o.OrderQty) &gt; 75\n  ORDER BY StoreName, ProductName;<\/pre>\n<p>The outer <strong>SELECT<\/strong> statement returns the list of products that each of the three stores has ordered. The key is to use the <strong>IN<\/strong> operator in a <strong>WHERE<\/strong> clause condition to compare the <strong>StoreId<\/strong> value to a list of store IDs returned by the CTE. You should also include a <strong>WHERE<\/strong> clause condition to exclude the product <strong>Sport-100 Helmet, Blue<\/strong>. The <strong>SELECT<\/strong> statement returns the results shown in the following figure.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"860\" height=\"534\" class=\"wp-image-78555\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2018\/04\/word-image-245.png\" \/><\/p>\n<p>There are other ways you can get at <em>customers who bought this also bought that <\/em>information, such as using Python or R, but this approach provides a relatively simple way to get the data from a graph database, without having to jump through too many hoops.<\/p>\n<h2>Making the Best of Both Worlds<\/h2>\n<p>Because the graph database features are integrated with the database engine, there\u2019s no reason you can\u2019t work with graph and relational data side-by-side, depending on your application requirements and the nature of your data. This integration also gives you the flexibility to incorporate graph tables into an existing relational structure or make them both part of the design when planning for a new application. Keep in mind, however, that the graph features are still new to SQL Server and lack some of the advanced capabilities available to more established graph products. Perhaps after a couple more releases, SQL Server will be a more a viable contender in the graph database market, at least when used in conjunction with relational data.<\/p>\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>FAQs: SQL Server Graph Databases \u2013 Part 5: Importing Relational Data into a Graph Database<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. How do I import relational data into a SQL Server graph database?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Create node tables (AS NODE) for each entity type in your relational model, then create edge tables (AS EDGE) for each relationship. Populate the node tables using INSERT&#8230;SELECT from your existing relational tables. For edge tables, join the source and target node tables to get the internal $node_id values for each relationship, then insert those pairs into the edge table. You can query the graph tables immediately using the MATCH clause.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. What is the MATCH clause in SQL Server graph queries?<\/h3>\n            <div class=\"faq-answer\">\n                <p>The MATCH clause is used in SELECT statements to specify relationship patterns between graph node and edge tables. It uses an arrow notation &#8211; MATCH(NodeA-(Edge)-&gt;NodeB) &#8211; to express that NodeA has an edge relationship pointing to NodeB. You can chain multiple MATCH conditions to traverse multiple relationship hops in a single query, which would otherwise require multiple JOINs in a relational model.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">3. Can I use graph tables and relational tables in the same SQL Server query?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Yes. SQL Server graph tables are stored in the same database as regular relational tables and can be JOINed together in standard SELECT statements. You can use MATCH to navigate the graph portion of a query while JOINing to regular relational tables for additional data. This allows incremental adoption &#8211; you can add graph relationships to an existing relational database without migrating all data to a pure graph structure.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">4. What SQL Server version do I need for graph database features?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Graph database features (AS NODE, AS EDGE, and MATCH) were introduced in SQL Server 2017. The SHORTEST_PATH function and other advanced graph features were added in SQL Server 2019. For most import and query scenarios, SQL Server 2017 is the minimum requirement. Azure SQL Database and SQL Managed Instance also support SQL Server graph features.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>Learn how to import AdventureWorks relational data into SQL Server graph node and edge tables, query the graph data with MATCH, and combine graph and relational queries side-by-side. Part 5 of the SQL Server Graph Databases series.&hellip;<\/p>\n","protected":false},"author":221841,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":true,"footnotes":""},"categories":[143531],"tags":[5134],"coauthors":[6779],"class_list":["post-78545","post","type-post","status-publish","format-standard","hentry","category-t-sql-programming-sql-server","tag-sql-prompt"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/78545","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221841"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=78545"}],"version-history":[{"count":6,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/78545\/revisions"}],"predecessor-version":[{"id":109621,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/78545\/revisions\/109621"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=78545"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=78545"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=78545"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=78545"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}