{"id":71357,"date":"2017-06-19T17:59:29","date_gmt":"2017-06-19T17:59:29","guid":{"rendered":"https:\/\/www.simple-talk.com\/?p=71357"},"modified":"2021-09-29T16:21:13","modified_gmt":"2021-09-29T16:21:13","slug":"sql-server-r-services-basics","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/bi-sql-server\/sql-server-r-services-basics\/","title":{"rendered":"SQL Server R Services: The Basics"},"content":{"rendered":"<h4>The series so far:<\/h4>\n<ol>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/sql-server-r-services-basics\/\">SQL Server R Services: The Basics<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/sql-server-r-services-digging-r-language\/\">SQL Server R Services: Digging into the R Language<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/sql-server-r-services-working-ggplot2-statistical-graphics\/\">SQL Server R Services: Working with ggplot2 Statistical Graphics<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/sql-server-r-services-working-data-frames\/\">SQL Server R Services: Working with Data Frames<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/sql-server-r-services-generating-sparklines-types-spark-graphs\/\">SQL Server R Services: Generating Sparklines and Other Types of Spark Graphs<\/a><\/li>\n<li><a href=\"https:\/\/www.red-gate.com\/simple-talk\/sql\/bi\/sql-server-r-services-working-multiple-data-sets\/\">SQL Server R Services: Working with Multiple Data Sets<\/a><\/li>\n<\/ol>\n\n<p>With the release of SQL Server 2016, you can now run R scripts within the SQL Server environment, using data that you retrieve directly from a SQL Server database. The ability to run R scripts comes through a new feature: SQL Server R Services. The key to working with R Services is the <strong>sys.sp_execute_external_script<\/strong> system stored procedure, which Microsoft also introduced in SQL Server 2016. Unfortunately, Microsoft documentation is a bit unclear on some of the finer points of using the procedure, and it\u2019s up to you to read between the lines, while trying to make sense of the bigger picture.<\/p>\n<p>To help you better understand the <strong>sp_execute_external_script<\/strong> stored procedure, this article walks you through the process of running R scripts based on SQL Server data. This article is the first in a series about using SQL Server R Services to write R scripts. Although most of the focus of this article is on the stored procedure itself, in subsequent articles, we\u2019ll dig deeper into the R language, after you have a foundation in how to run R scripts that incorporate SQL Server data.<\/p>\n<p>To help you get started, this article walks you through a series of examples that demonstrate the <strong>sp_execute_external_script<\/strong> stored procedure. I created the examples on a local instance of SQL Server 2016, using data from the <strong>AdventureWorks2014<\/strong> database. The R scripts in the examples are very simple and are meant only to introduce you to basic concepts. As we progress through the series, we\u2019ll go into greater depth into R scripting in order to demonstrate more advanced analytical capabilities.<\/p>\n<h2>Getting started with R scripting<\/h2>\n<p>The <strong>sp_execute_external_script<\/strong> stored procedure provides the structure necessary to run R scripts against SQL Server data, directly within the SQL Server environment. To run an R script, you must call the system stored procedure and provide the necessary parameters, adhering to the procedure\u2019s requirements when assigning values. One of those values is the R script itself.<\/p>\n<p>Currently, the stored procedure supports only the R language, but Microsoft plans to add support for Python in SQL Server 2017. Because of this expansion, Microsoft will be changing the name of SQL Server R Services to SQL Server Machine Learning Services and Microsoft R Server to Microsoft Machine Learning Server. For now, we\u2019ll stick with the original names.<\/p>\n<p>Before you can use the <strong>sp_execute_external_script<\/strong> stored procedure, you must enable your SQL Server instance to support external scripts, but before you can do that, you must ensure that R Services is installed correctly and up and running. Unfortunately, this in itself can be a tricky and frustrating process, especially if service packs or cumulative updates are involved. If you have not set up R Services or enabled external scripting and are not sure how to go about this, a good place to start is with the MSDN article <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/advanced-analytics\/r\/set-up-sql-server-r-services-in-database\">Set up SQL Server R Services (In-Database)<\/a>.<\/p>\n<p>Once you have R Services ready to go, you should be able to try out the <strong>sp_execute_external_script<\/strong> stored procedure. The following syntax show the procedure\u2019s basic elements:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">sp_execute_external_script\r\n    @language = N'language',\r\n    @script = N'script',\r\n    [ @input_data_1 = 'SelectStatement' [,] ]\r\n    [ @input_data_1_name =  N'InputDataSet' [,] ]\r\n    [ @output_data_1_name = N'OutputDataSet' [,] ]\r\n    [ @params =  N'@ParameterName DataType [ OUT | OUTPUT ] [ ,...n ]' [,] ]\r\n    [ @ParameterName = 'value' [ OUT | OUTPUT ] [,] [ ,...n ] ]\r\n    [ WITH &lt;execute_option&gt; ] [;]\r\n  &lt;execute_option&gt;::=  \r\n  {\r\n      { RESULT SETS UNDEFINED }\r\n    | { RESULT SETS NONE }\r\n    | { RESULT SETS ( &lt;result_sets_definition&gt; ) }  \r\n  }<\/pre>\n<p>I\u2019ve simplified the syntax a bit for this article, but it still contains most of the basic components. One thing that\u2019s missing is the breakdown for the <strong>&lt;result_sets_definition&gt;<\/strong> element, which is used to provide column names for the data outputted by the R script, as well as to take other steps. We\u2019ll get into the specifics of all this later in the article. In the meantime, if you want to view a more complete version of the syntax, see the Microsoft document <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/system-stored-procedures\/sp-execute-external-script-transact-sql\">sp_execute_external_script (Transact-SQL)<\/a>. Note, however, that the last time I looked at this information, the syntax appeared incorrect, in terms of how some of the optional elements have been bracketed, so proceed with caution.<\/p>\n<p>With that in mind, let\u2019s start with a very basic example of how to use the <strong>sp_execute_external_script<\/strong> stored procedure so you can start to get a feel for the way it all works. At a minimum, when calling the procedure, you must provide values for the <strong>@language<\/strong> and <strong>@script<\/strong> parameters, as shown in the following example:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">EXEC sp_execute_external_script\r\n  \u00a0 @language = N'R',\r\n  \u00a0 @script = N'OutputDataSet &lt;- data.frame(seq(1,4,0.5));'; \u00a0\r\n  GO<\/pre>\n<p>For the <strong>@language<\/strong> parameter, we must specify the value <strong>R<\/strong> (for the R language), which is currently the only value accepted for this parameter. Once support for Python has been added, we\u2019ll be able to specify that language instead.<\/p>\n<p>Next comes the value for the <strong>@script<\/strong> parameter, which defines the R script that we will ultimately run. In this case, the script is broken into two elements, separated by the back arrow (<strong>&lt;-<\/strong>) assignment operator. Essentially, we\u2019re assigning the value on the right side of the operator to the variable on the left side of the operator, similarly to how we would assign a value to a T-SQL variable.<\/p>\n<p>In this case, we\u2019re assigning the value to the <strong>OutputDataSet<\/strong> variable on the left side of the operator. This is the default name used to identify the output data set that is generated when we run the R script. (As you\u2019ll see later in the article, we can provide a different name for the output data set.)<\/p>\n<p>On the right side of the assignment operator is an expression that defines the data we want to return. The <strong>seq<\/strong> function generates a sequence of numbers starting with <strong>1<\/strong>, ending with <strong>4<\/strong>, and incremented by <strong>0.5<\/strong>. The results of that function are converted to a data frame (<strong>data.frame<\/strong>), which provides a table-like structure for handling the data.<\/p>\n<p>When we run the stored procedure, the data frame containing the numeric sequence is assigned to the <strong>OutputDataSet<\/strong> variable, which is returned by the stored procedure when we run it, giving us the results shown in the following table:<\/p>\n<table>\n<thead>\n<tr>\n<td>(No column name)<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>1.5<\/td>\n<\/tr>\n<tr>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>2.5<\/td>\n<\/tr>\n<tr>\n<td>3<\/td>\n<\/tr>\n<tr>\n<td>3.5<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Of course, you\u2019ll want to do far more with your R scripts than what we\u2019ve done here, but this should give you a general idea of how you can use the <strong>sp_execute_external_script<\/strong> stored procedure to run R scripts. No doubt most R scripts you work with will be much more involved. In fact, because of the potential complexities, I find that it is often easier to put the script in a T-SQL variable and then call that variable from within the stored procedure, as shown in the following example:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DECLARE @rscript NVARCHAR(MAX);\r\n  SET @rscript = N'OutputDataSet &lt;- data.frame(seq(1,4,0.5));';\r\n  EXEC sp_execute_external_script\r\n  \u00a0 @language = N'R',\r\n  \u00a0 @script = @rscript; \u00a0\r\n  GO<\/pre>\n<p>The stored procedure will return the same results as the preceding example, only now it\u2019s easier to read and update the R script, or at least it will be as they become more complex.<\/p>\n<h2>Using SQL Server data in R scripts<\/h2>\n<p>As fun as it might be to run R scripts alone in SQL Server, the real power that comes with R Services is the ability to incorporate SQL Server data into those scripts. For that, we need to include the procedure\u2019s <strong>@input_data_1<\/strong> parameter, which specifies the <strong>SELECT<\/strong> statement to use to retrieve data from our database, as shown in the following example:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DECLARE @rscript NVARCHAR(MAX);\r\n  SET @rscript = N'OutputDataSet &lt;- InputDataSet;';\r\n  DECLARE @sqlscript NVARCHAR(MAX);\r\n  SET @sqlscript = N'\r\n  \u00a0 SELECT FirstName, LastName, SalesYTD\r\n  \u00a0 FROM Sales.vSalesPerson\r\n  \u00a0 WHERE SalesYTD &gt; 2000000\r\n  \u00a0 ORDER BY SalesYTD DESC;';\r\n  EXEC sp_execute_external_script\r\n  \u00a0 @language = N'R',\r\n  \u00a0 @script = @rscript,\r\n  \u00a0 @input_data_1 = @sqlscript; \u00a0\r\n  GO<\/pre>\n<p>As with the R script, I\u2019m using a T-SQL variable (<strong>@sqlscript<\/strong>) to store the <strong>SELECT<\/strong> statement. I then assign that variable to the <strong>@input_data_1<\/strong> parameter.<\/p>\n<p>I\u2019ve also updated the R script itself to assign the <strong>InputDataSet<\/strong> value to the <strong>OutputDataSet<\/strong> variable. <strong>InputDataSet<\/strong> is the default name of the variable used to represent the query specified in the <strong>@input_data_1<\/strong> parameter. In other words, I\u2019m assigning the data returned by the <strong>SELECT<\/strong> statement to the <strong>OutputDataSet<\/strong> variable, giving us the results shown in the following table:<\/p>\n<table>\n<thead>\n<tr>\n<td>(No column name)<\/td>\n<td>(No column name)<\/td>\n<td>(No column name)<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Linda<\/td>\n<td>Mitchell<\/td>\n<td>4251368.5497<\/td>\n<\/tr>\n<tr>\n<td>Jae<\/td>\n<td>Pak<\/td>\n<td>4116871.2277<\/td>\n<\/tr>\n<tr>\n<td>Michael<\/td>\n<td>Blythe<\/td>\n<td>3763178.1787<\/td>\n<\/tr>\n<tr>\n<td>Jillian<\/td>\n<td>Carson<\/td>\n<td>3189418.3662<\/td>\n<\/tr>\n<tr>\n<td>Ranjit<\/td>\n<td>Varkey Chudukatil<\/td>\n<td>3121616.3202<\/td>\n<\/tr>\n<tr>\n<td>Jos\u00e9<\/td>\n<td>Saraiva<\/td>\n<td>2604540.7172<\/td>\n<\/tr>\n<tr>\n<td>Shu<\/td>\n<td>Ito<\/td>\n<td>2458535.6169<\/td>\n<\/tr>\n<tr>\n<td>Tsvi<\/td>\n<td>Reiter<\/td>\n<td>2315185.611<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>All we\u2019ve done here is to pass our T-SQL query results to the R script, which returns those same results as that of the <strong>SELECT<\/strong> statement. Of course, this is something we could have done without using an R script; however, by being able to pass our query results to the R script, we can then use the analytical power built into R to do something with that data, which is the whole point of SQL Server R Services and the <strong>sp_execute_external_script<\/strong> stored procedure.<\/p>\n<p>For example, the following R script divides the <strong>SalesYTD<\/strong> totals by <strong>7<\/strong> and then rounds that figure to two decimal points:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DECLARE @rscript NVARCHAR(MAX);\r\n  SET @rscript = N'\r\n  \u00a0 OutputDataSet &lt;- InputDataSet;\r\n  \u00a0 OutputDataSet[,3] &lt;- round(InputDataSet$SalesYTD \/ 7, 2);';\r\n  DECLARE @sqlscript NVARCHAR(MAX);\r\n  SET @sqlscript = N'\r\n  \u00a0 SELECT FirstName, LastName, SalesYTD\r\n  \u00a0 FROM Sales.vSalesPerson\r\n  \u00a0 WHERE SalesYTD &gt; 2000000\r\n  \u00a0 ORDER BY SalesYTD DESC;';\r\n  EXEC sp_execute_external_script\r\n  \u00a0 @language = N'R',\r\n  \u00a0 @script = @rscript,\r\n  \u00a0 @input_data_1 = @sqlscript; \u00a0\r\n  GO<\/pre>\n<p>The idea here is it find the average monthly sales based on the assumption that seven months have passed so far this year and then round that figure. To do so, I\u2019ve added a second line to the R script. To the left of the assignment operator, I\u2019ve specified <strong>OutputDataSet[,3]<\/strong> to indicate that we want to update the third column within data frame.<\/p>\n<p>To the right of the assignment operator, we reference the <strong>InputDataSet<\/strong> variable again, but this time add a dollar sign (<strong>$<\/strong>) and the name of the source column (<strong>SalesYTD<\/strong>). We then divide this amount by <strong>7<\/strong> and use the <strong>round<\/strong> function to round the amount to two decimal points, giving us the results shown in the following table:<\/p>\n<table>\n<thead>\n<tr>\n<td>(No column name)<\/td>\n<td>(No column name)<\/td>\n<td>(No column name)<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Linda<\/td>\n<td>Mitchell<\/td>\n<td>607338.36<\/td>\n<\/tr>\n<tr>\n<td>Jae<\/td>\n<td>Pak<\/td>\n<td>588124.46<\/td>\n<\/tr>\n<tr>\n<td>Michael<\/td>\n<td>Blythe<\/td>\n<td>537596.88<\/td>\n<\/tr>\n<tr>\n<td>Jillian<\/td>\n<td>Carson<\/td>\n<td>455631.2<\/td>\n<\/tr>\n<tr>\n<td>Ranjit<\/td>\n<td>Varkey Chudukatil<\/td>\n<td>445945.19<\/td>\n<\/tr>\n<tr>\n<td>Jos\u00e9<\/td>\n<td>Saraiva<\/td>\n<td>372077.25<\/td>\n<\/tr>\n<tr>\n<td>Shu<\/td>\n<td>Ito<\/td>\n<td>351219.37<\/td>\n<\/tr>\n<tr>\n<td>Tsvi<\/td>\n<td>Reiter<\/td>\n<td>330740.8<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This, of course, is an extremely simple example of R (and something we could have also done in T-SQL), but it demonstrates how you can incorporate SQL Server data into your R scripts. Later in the series, we\u2019ll dig a lot more deeply into R, but for now, we\u2019ll continue to focus on the <strong>sp_execute_external_script<\/strong> stored procedure.<\/p>\n<h2>Naming data elements<\/h2>\n<p>The examples we\u2019ve looked at so far have used the default names for the input and output data sets (<strong>InputDataSet<\/strong> and <strong>OutputDataSet<\/strong>, respectively); however, the <strong>sp_execute_external_script<\/strong> stored procedure lets us provide our own names. For example, to specify the name of the input data set (the <strong>SELECT<\/strong> statement), we include the <strong>@input_data_1_name<\/strong> parameter when calling the stored procedure, as shown in the following example:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DECLARE @rscript NVARCHAR(MAX);\r\n  SET @rscript = N'\r\n  \u00a0 OutputDataSet &lt;- SqlData;\r\n  \u00a0 OutputDataSet[,3] &lt;- round(SqlData$SalesYTD \/ 7, 2);';\r\n  DECLARE @sqlscript NVARCHAR(MAX);\r\n  SET @sqlscript = N'\r\n  \u00a0 SELECT FirstName, LastName, SalesYTD\r\n  \u00a0 FROM Sales.vSalesPerson\r\n  \u00a0 WHERE SalesYTD &gt; 2000000\r\n  \u00a0 ORDER BY SalesYTD DESC;';\r\n  EXEC sp_execute_external_script\r\n  \u00a0 @language = N'R',\r\n  \u00a0 @script = @rscript,\r\n  \u00a0 @input_data_1 = @sqlscript,\r\n  \u00a0 @input_data_1_name = N'SqlData';\r\n  GO<\/pre>\n<p>In this case, the <strong>@input_data_1_name<\/strong> parameter specifies that the stored procedure should use <strong>SqlData<\/strong> to refer to the input data set, rather than the default name, <strong>InputDataSet<\/strong>. Notice that I\u2019ve updated all references to the <strong>InputDataSet<\/strong> variable in the R script to <strong>SqlData<\/strong>. The stored procedure will still return the same results, despite having changed the references within the R script.<\/p>\n<p>This process works much the same for the output data set, only this time we include the <strong>@output_data_1_name<\/strong> parameter when calling the stored procedure:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DECLARE @rscript NVARCHAR(MAX);\r\n  SET @rscript = N'\r\n  \u00a0 MonthlySales &lt;- SqlData;\r\n  \u00a0 MonthlySales[,3] &lt;- round(SqlData$SalesYTD \/ 7, 2);';\r\n  DECLARE @sqlscript NVARCHAR(MAX);\r\n  SET @sqlscript = N'\r\n  \u00a0 SELECT FirstName, LastName, SalesYTD\r\n  \u00a0 FROM Sales.vSalesPerson\r\n  \u00a0 WHERE SalesYTD &gt; 2000000\r\n  \u00a0 ORDER BY SalesYTD DESC;';\r\n  EXEC sp_execute_external_script\r\n  \u00a0 @language = N'R',\r\n  \u00a0 @script = @rscript,\r\n  \u00a0 @input_data_1 = @sqlscript,\r\n  \u00a0 @input_data_1_name = N'SqlData',\r\n  \u00a0 @output_data_1_name = N'MonthlySales';\r\n  GO<\/pre>\n<p>In this case, the <strong>@output_data_1_name<\/strong> parameter specifies <strong>MonthlySales<\/strong> for the output data frame. To accommodate this new name, I changed all references in the R script from <strong>OutputDataSet<\/strong> to <strong>MonthlySales<\/strong>. Once again, the stored procedure still returns the same results.<\/p>\n<p>Up to this point, none of the examples included column names for the returned data. To do so, we must add a <strong>WITH<\/strong> <strong>RESULT<\/strong> <strong>SETS<\/strong> clause when calling the <strong>sp_execute_external_script<\/strong> stored procedure. In the clause, we specify the column names and their data types, as shown in the following example:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DECLARE @rscript NVARCHAR(MAX);\r\n  SET @rscript = N'\r\n  \u00a0 MonthlySales &lt;- SqlData;\r\n  \u00a0 MonthlySales[,3] &lt;- round(SqlData$SalesYTD \/ 7, 2);';\r\n  DECLARE @sqlscript NVARCHAR(MAX);\r\n  SET @sqlscript = N'\r\n  \u00a0 SELECT FirstName, LastName, SalesYTD\r\n  \u00a0 FROM Sales.vSalesPerson\r\n  \u00a0 WHERE SalesYTD &gt; 2000000\r\n  \u00a0 ORDER BY SalesYTD DESC;';\r\n  EXEC sp_execute_external_script\r\n  \u00a0 @language = N'R',\r\n  \u00a0 @script = @rscript,\r\n  \u00a0 @input_data_1 = @sqlscript,\r\n  \u00a0 @input_data_1_name = N'SqlData',\r\n  \u00a0 @output_data_1_name = N'MonthlySales'\r\n  \u00a0 WITH RESULT SETS(\r\n  \u00a0 \u00a0 (FirstName NVARCHAR(50), LastName NVARCHAR(50), MonthlyAvg MONEY));\r\n  GO<\/pre>\n<p>Now the results include the specified column names:<\/p>\n<table>\n<thead>\n<tr>\n<td>FirstName<\/td>\n<td>LastName<\/td>\n<td>MonthlyAvg<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Linda<\/td>\n<td>Mitchell<\/td>\n<td>607338.36<\/td>\n<\/tr>\n<tr>\n<td>Jae<\/td>\n<td>Pak<\/td>\n<td>588124.46<\/td>\n<\/tr>\n<tr>\n<td>Michael<\/td>\n<td>Blythe<\/td>\n<td>537596.88<\/td>\n<\/tr>\n<tr>\n<td>Jillian<\/td>\n<td>Carson<\/td>\n<td>455631.2<\/td>\n<\/tr>\n<tr>\n<td>Ranjit<\/td>\n<td>Varkey Chudukatil<\/td>\n<td>445945.19<\/td>\n<\/tr>\n<tr>\n<td>Jos\u00e9<\/td>\n<td>Saraiva<\/td>\n<td>372077.25<\/td>\n<\/tr>\n<tr>\n<td>Shu<\/td>\n<td>Ito<\/td>\n<td>351219.37<\/td>\n<\/tr>\n<tr>\n<td>Tsvi<\/td>\n<td>Reiter<\/td>\n<td>330740.8<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Instead of including the column names when adding the <strong>WITH<\/strong> <strong>RESULT<\/strong> <strong>SETS<\/strong> clause, we could have specified <strong>UNDEFINED<\/strong> to indicate that no column names should be included, but it wasn\u2019t necessary because this is the default behavior. Another option is to specify <strong>NONE<\/strong>, which prevents the stored procedure from returning the data assigned to the <strong>MonthlySales<\/strong> variable.<\/p>\n<h2>Defining R input parameters<\/h2>\n<p>The <strong>sp_execute_external_script<\/strong> stored procedure also lets us use input parameters to pass data into our <strong>SELECT<\/strong> statement or R script. To do so, we must first use the <strong>@params<\/strong> parameter to declare the variables and then, as a separate step, assign a value to each parameter.<\/p>\n<p>In the following <strong>EXECUTE<\/strong> statement, I\u2019ve included the <strong>@params<\/strong> parameter and declared two input parameters: <strong>@TotalSales<\/strong> and <strong>@TotalMonths<\/strong>:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DECLARE @rscript NVARCHAR(MAX);\r\n  SET @rscript = N'\r\n  \u00a0 MonthlySales &lt;- SqlData;\r\n  \u00a0 MonthlySales[,3] &lt;- round(SqlData$SalesYTD \/ TotalMonths, 2);';\r\n  DECLARE @sqlscript NVARCHAR(MAX);\r\n  SET @sqlscript = N'\r\n  \u00a0 SELECT FirstName, LastName, SalesYTD\r\n  \u00a0 FROM Sales.vSalesPerson\r\n  \u00a0 WHERE SalesYTD &gt; @TotalSales\r\n  \u00a0 ORDER BY SalesYTD DESC;';\r\n  EXEC sp_execute_external_script\r\n  \u00a0 @language = N'R',\r\n  \u00a0 @script = @rscript,\r\n  \u00a0 @input_data_1 = @sqlscript,\r\n  \u00a0 @input_data_1_name = N'SqlData',\r\n  \u00a0 @output_data_1_name = N'MonthlySales',\r\n  \u00a0 @params = N'@TotalSales MONEY, @TotalMonths TINYINT',\r\n  \u00a0 @TotalSales = 2000000,\r\n  \u00a0 @TotalMonths = 7\r\n  \u00a0 WITH RESULT SETS(\r\n  \u00a0 \u00a0 (FirstName NVARCHAR(50), LastName NVARCHAR(50), MonthlyAvg MONEY)); \u00a0\r\n  GO<\/pre>\n<p>When defining the <strong>@params<\/strong> parameter, I assigned the input parameters as a single string value, using a comma to separate the individual variable declarations. Each declaration includes the name of the input variable and its data type. (We can also specify nullability.) After the <strong>@params<\/strong> parameter, I set the value for each input parameter. In this case, I\u2019m using literal values, but as you\u2019ll see later in the article, we can assign other types of values to the input parameters.<\/p>\n<p>In addition to defining the input parameters, I also updated the R script to reference the <strong>@TotalMonths<\/strong> parameter and the <strong>SELECT<\/strong> statement to reference the <strong>@TotalSales<\/strong> parameter. Notice that for the R script, I do not precede the parameter name with an at (<strong>@<\/strong>) symbol and instead treat it just like a variable.<\/p>\n<h2>Adding R scripts to stored procedures<\/h2>\n<p>You can include the <strong>sp_execute_external_script<\/strong> stored procedure in a user-defined stored procedure, making it possible to call your R script just like any other database object. The only tricky part is to connect the input parameters of the user-defined stored procedure to the input parameters of the <strong>sp_execute_external_script<\/strong> stored procedure.<\/p>\n<p>Let\u2019s look at an example of how this works, based on the preceding code. In the following <strong>CREATE<\/strong> <strong>PROCEDURE<\/strong> statement, I include the <strong>@MinSales<\/strong> and <strong>@MonthsYTD<\/strong> input parameters as part of the procedure definition:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DROP PROCEDURE IF EXISTS dbo.GetMonthlyAverages;\r\n  GO\r\n  CREATE PROCEDURE dbo.GetMonthlyAverages\r\n  \u00a0 (@MinSales MONEY, @MonthsYTD TINYINT)\r\n  AS\r\n  \u00a0 SET NOCOUNT ON;\r\n  \u00a0 DECLARE @rscript NVARCHAR(MAX);\r\n  \u00a0 SET @rscript = N'\r\n  \u00a0 \u00a0 MonthlySales &lt;- SqlData;\r\n  \u00a0 \u00a0 MonthlySales[,3] &lt;- round(SqlData$SalesYTD \/ TotalMonths, 2);';\r\n  \u00a0 DECLARE @sqlscript NVARCHAR(MAX);\r\n  \u00a0 SET @sqlscript = N'\r\n  \u00a0 \u00a0 SELECT FirstName, LastName, SalesYTD\r\n  \u00a0 \u00a0 FROM Sales.vSalesPerson\r\n  \u00a0 \u00a0 WHERE SalesYTD &gt; @TotalSales\r\n  \u00a0 \u00a0 ORDER BY SalesYTD DESC;';\r\n  \u00a0 EXEC sp_execute_external_script\r\n  \u00a0 \u00a0 @language = N'R',\r\n  \u00a0 \u00a0 @script = @rscript,\r\n  \u00a0 \u00a0 @input_data_1 = @sqlscript,\r\n  \u00a0 \u00a0 @input_data_1_name = N'SqlData',\r\n  \u00a0 \u00a0 @output_data_1_name = N'MonthlySales',\r\n  \u00a0 \u00a0 @params = N'@TotalSales MONEY, @TotalMonths TINYINT',\r\n  \u00a0 \u00a0 @TotalSales = @MinSales,\r\n  \u00a0 \u00a0 @TotalMonths = @MonthsYTD\r\n  \u00a0 \u00a0 WITH RESULT SETS(\r\n  \u00a0 \u00a0 \u00a0 (FirstName NVARCHAR(50), LastName NVARCHAR(50), MonthlyAvg MONEY)); \u00a0\r\n  GO<\/pre>\n<p>Notice that I assigned the <strong>@MinSales<\/strong> parameter to the <strong>@TotalSales<\/strong> parameter and the <strong>@MonthsYTD<\/strong> parameter to the <strong>@TotalMonths<\/strong> parameter. In this way, the values provided for the <strong>@MinSales<\/strong> and <strong>@MonthsYTD<\/strong> parameters when calling the <strong>GetMonthlyAverages<\/strong> stored procedure are passed to the <strong>@TotalSales<\/strong> and the <strong>@TotalMonths<\/strong> parameters. We can demonstrate this by running the following <strong>EXECUTE<\/strong> statement:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">EXEC dbo.GetMonthlyAverages @MinSales = 2000000, @MonthsYTD = 7;<\/pre>\n<p>We still get the same results as in the previous example, only now we get to persist the structure and have the ability to pass in different values when calling the <strong>GetMonthlyAverages<\/strong> procedure.<\/p>\n<h2>Defining R output parameters<\/h2>\n<p>One other point about the <strong>sp_execute_external_script<\/strong> stored procedure worth noting: You can define an output parameter for returning a scalar value in addition to or in place of the data frame returned by the R script.<\/p>\n<p>The following example updates the preceding one to return only a scalar value, using the <strong>@mean<\/strong> variable and <strong>@MeanOut<\/strong> output parameter:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">DROP PROCEDURE IF EXISTS dbo.GetMonthlySalesMean;\r\n  GO\r\n  CREATE PROCEDURE dbo.GetMonthlySalesMean\r\n  \u00a0 (@MinSales MONEY, @MonthsYTD TINYINT)\r\n  AS\r\n  \u00a0 SET NOCOUNT ON;\r\n  \u00a0 DECLARE @rscript NVARCHAR(MAX);\r\n  \u00a0 SET @rscript = N'\r\n  \u00a0 \u00a0 sales &lt;- SqlData;\r\n  \u00a0 \u00a0 sales[,3] &lt;- SqlData$SalesYTD \/ TotalMonths;\r\n  \u00a0 \u00a0 MeanOut &lt;- mean(sales[,3]);';\r\n  \u00a0 DECLARE @sqlscript NVARCHAR(MAX);\r\n  \u00a0 SET @sqlscript = N'\r\n  \u00a0 \u00a0 SELECT FirstName, LastName, SalesYTD\r\n  \u00a0 \u00a0 FROM Sales.vSalesPerson\r\n  \u00a0 \u00a0 WHERE SalesYTD &gt; @TotalSales\r\n  \u00a0 \u00a0 ORDER BY SalesYTD DESC;';\r\n  \u00a0 DECLARE @mean MONEY;\r\n  \u00a0 EXEC sp_execute_external_script\r\n  \u00a0 \u00a0 @language = N'R',\r\n  \u00a0 \u00a0 @script = @rscript,\r\n  \u00a0 \u00a0 @input_data_1 = @sqlscript,\r\n  \u00a0 \u00a0 @input_data_1_name = N'SqlData',\r\n  \u00a0 \u00a0 @params = N'@TotalSales MONEY, @TotalMonths TINYINT, @MeanOut MONEY OUTPUT',\r\n  \u00a0 \u00a0 @TotalSales = @MinSales,\r\n  \u00a0 \u00a0 @TotalMonths = @MonthsYTD,\r\n  \u00a0 \u00a0 @MeanOut = @mean OUTPUT\r\n  \u00a0 \u00a0 WITH RESULT SETS NONE;\u00a0\r\n  \u00a0 SELECT @mean AS SalesMean;\r\n  GO<\/pre>\n<p>First, I declared the <strong>@mean<\/strong> variable, using the <strong>MONEY<\/strong> data type. I then added the <strong>@MeanOut<\/strong> parameter declaration to the <strong>@params<\/strong> parameter value, again using the <strong>MONEY<\/strong> data type. As part of the declaration, I included the <strong>OUTPUT<\/strong> keyword. Next, I assigned the <strong>@mean<\/strong> variable to the <strong>@MeanOut<\/strong> parameter, again including the <strong>OUTPUT<\/strong> keyword.<\/p>\n<p>Next, I simplified the second line of the R script by removing the <strong>round<\/strong> function, and then added a third line that uses the <strong>mean<\/strong> function to find the mean value of the <strong>SalesYTD<\/strong> averages. I then assigned this value to the <strong>@MeanOut<\/strong> variable so that result will be returned.<\/p>\n<p>Another step I took was to set the <strong>WITH<\/strong> <strong>RESULT<\/strong> <strong>SETS<\/strong> clause <strong>NONE<\/strong> so the data frame itself is not returned. I also removed the <strong>@output_data_1_name<\/strong> parameter because it cannot be used when specifying <strong>NONE<\/strong> for the <strong>WITH<\/strong> <strong>RESULT<\/strong> <strong>SETS<\/strong> clause. If you include this parameter, SQL Server will return an error.<\/p>\n<p>We can now try out our new procedure by running the following <strong>EXECUTE<\/strong> statement:<\/p>\n<pre class=\"theme:ssms2012-simple-talk lang:tsql decode:true\">EXEC dbo.GetMonthlySalesMean @MinSales = 2000000, @MonthsYTD = 7;<\/pre>\n<p>The stored procedure returns only a scalar value, which on my system is 461084.189.<\/p>\n<h2>Working with the R language<\/h2>\n<p>If you followed along with all the examples in this article, you should have a good idea of how to use SQL Server to run an R script that incorporates SQL Server data. Regardless of how complex your R scripts become, the basic stricture for running them is the same. Essentially, you define a <strong>SELECT<\/strong> statement, assign the results to a variable, and use the variable in your R script. In the R script, you can apply whatever logic you want, and then output the final data to the output variable, which is returned when you run the <strong>sp_execute_external_script<\/strong> stored procedure.<\/p>\n<p>Now that you have a foundation for how to work with R in SQL Server, you should be ready to implement more robust scripts. In the articles to follow in this series, we\u2019ll dive deeper into R scripting so you can perform more complex analytics. We\u2019ll also look at how to render those results into meaningful visualizations. As you\u2019ll discover, the R examples we\u2019ve looked at so far barely scratch the surface of R\u2019s potential. But they should at least provide you with a good foundation for moving forward.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It is possible to do a great deal with R within SQL Server, but it is best to start by doing analysis in R on numeric data from SQL Server and returning the results to SQL Server. There is great value to be gained even with this basic foundation. Robert Sheldon is on hand to give you a kick start with the first in his series on beginning with R in SQL Server.&hellip;<\/p>\n","protected":false},"author":221841,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143528,143531],"tags":[5134],"coauthors":[6779],"class_list":["post-71357","post","type-post","status-publish","format-standard","hentry","category-bi-sql-server","category-t-sql-programming-sql-server","tag-sql-prompt"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/71357","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221841"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=71357"}],"version-history":[{"count":6,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/71357\/revisions"}],"predecessor-version":[{"id":76217,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/71357\/revisions\/76217"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=71357"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=71357"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=71357"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=71357"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}