Azure DocumentDB Server-Side Scripting

DocumentDB only uses SQL for querying data. To create procedures and functions, you have to flex your JavaScript skills to create JavaScript functions that are saved to a DocumentDB collection. Robert Sheldon shows how it is done.

Azure DocumentDB is a Microsoft cloud service that offers a platform for implementing NoSQL databases and populating them with JSON documents. To facilitate data access, DocumentDB provides a SQL query language developed to work with the JSON data. You can use the language to access the documents directly through the Azure portal or from within your application code when connecting to a DocumentDB database.

The SQL supported in DocumentDB is limited strictly to querying data. You cannot use the language to create stored procedures or user-defined functions (UDFs), like you can with T-SQL in SQL Server. For DocumentDB, you must turn to server-side scripting, which takes the form of JavaScript functions saved to a DocumentDB collection.

In this article, I show you how to create stored procedures and functions, providing a number of examples that demonstrate what it takes to get started. You can also use use server-side scripting to create triggers, but those we’ll leave for another time.

Adding JSON documents to DocumentDB

To try out the examples in this article, you need to have an Azure account, a DocumentDB account, and a DocumentDB database with at least one collection. You must also have the sample data necessary to run the stored procedures and functions. For an introduction to how to get started with DocumentDB, see my previous article, Microsoft Azure DocumentDB. In the meantime, here’s a quick overview of how I came up with the data for the examples in this article.

I acquired the sample data from the AdventureWorks2016CTP3 database on a local instance of SQL Server 2016 CTP3, which now lets you output a query as a JSON document. I started with the following T-SQL SELECT statement to get the data I needed for the first document:

The statement returns a JSON array, which contains one JSON object, as shown in the following results:

If we were to feed these results into a JSON formatter, the code would look similar to the following, with the JSON object neatly delineated:

We can use either format to create a document in our DocumentDB database. However, we must first remove the outer brackets, which define the array, and use only the outer object as our document.

For the exercises, we’ll create five documents initially, which we can do simply by entering a different BusinessEntityID value in the WHERE clause, using the values 1 through 5. To make all this easier, I’ve included the five documents here, ready for you to copy so you can add each one to your DocumentDB collection:

Again, refer to my previous article for how to get started with DocumentDB, including how to add documents. After you have created the five samples, you can use Document Explorer in the Azure portal to view each one, as shown in the following figure.

2333-7838bf44-c24c-44c7-8942-21ea9f3fa85

In this case, the document with the id property value of ken0 is selected, the first document in the list. You can select any document in Document Explorer to view its contents.

Creating a stored procedure to retrieve data

As with JSON documents, you add a stored procedure to a specific collection. The stored procedure is essentially a JavaScript object made up of two properties: id and body. The id property value is simply a unique identifier for the stored procedure. The body property value is a JavaScript function that does the actual work of retrieving or updating the JSON documents.

You can use a development tool such as Node.js to define and add a stored procedure to a DocumentDB collection, or you can use Script Explorer in the Azure portal to add the stored procedure directly. For this article, we’ll take the latter approach so you can familiarize yourself with the basics of DocumentDB server-side scripting regardless of the environment in which you work.

If you do use a third-party tool, you will have to take additional steps to connect to the DocumentDB database and register the stored procedure, once you’ve defined it. The exact approach you take will depend on a number of factors, such as your development environment or whether you’re using the DocumentDB REST API or one of the supported SDKs. For information about the available APIs and SDKs, see the MSDN article Azure DocumentDB Reference Documentation.

To access Script Explorer, open the Collection pane specific to the collection where you added the sample documents. You can get to the pane through your DocumentDB account and database. In the Collection pane, click the Script Explorer button, either the one at the top or the one near the bottom, which is more of a box. When the Script Explorer pane opens, click Create Stored Procedure. This launches the Stored Procedures pane, as shown in the following figure.

2333-5cc5ad58-61c8-488c-b373-adaeb11b28f

The Stored Procedures pane is where you add your procedure to the DocumentDB collection. When you use the Stored Procedures pane, as opposed using to a third-party tool, you provide the id property value by typing a procedure name in the ID text box. The name must be unique within the collection. In this case, we’ll name our stored procedure GetFullNames.

The rest of the pane is devoted to the body property value, where we define the function that does the bulk of the work. As the ID value suggests, the stored procedure will retrieve a list of full names from the available set of JSON documents. To keep things simple, we’ll also name the function GetFullNames, but you can use any name or treat it as an anonymous function and provide no name. The following JavaScript shows the entire function:

If you’re already a JavaScript pro, this function should be fairly straightforward, but I’ll walk through it for those who are new to JavaScript. We start by specifying the function keyword, followed by the function name, GetFullNames. If we wanted to pass parameter values into the stored procedure, we would specify those parameters within the parentheses that follow the function name. (We’ll be covering parameters in the next example.)

The function body is enclosed in a set of curly braces and includes a number of variable definitions. The first four variables are based on the DocumentDB server-side JavaScript SDK, which is available to every DocumentDB collection. The SDK lets us create the objects we need in our JavaScript to interact with the documents in our collection.

We start with the context variable, using the getContext method to define the variable as a Context object:

The Context object provides access to the current collection, allowing us to perform our database-related operations. We can then use the context variable to call the getCollection method that is available to the Context object:

This allows us to create a Collectio n object that we assign to the coll variable. Through the Collection object, we can execute create, read, update, and delete (CRUD) operations against the JSON documents within the collection.

We can also use the Collection object to call the getSelfLink method, which creates a self-link to the current collection. We’ve assigned the link to the link variable:

Next, we use the context variable to call the getResponse method, allowing us to define the response variable as a Response object:

The Response object will hold the server response to the requested operation.

Now that we’ve set up the objects we need to work with the document data, we can define the query variable to hold the SELECT statement that we’ll run against our collection:

The SELECT statement follows the conventions of the DocumentDB SQL query language, including the CONCAT function used to concatenate the first and last names from each document. The second argument in the CONCAT function is the space added between the two names. Notice that we need to escape the quotes (using a backslash) because they’re embedded in the variable definition, which is a quoted string.

Next we define the run variable by calling the queryDocuments method available to the Collection object:

The method takes four parameter values, which provide the details necessary to execute our query:

  • The link variable specifies the link used to access the collection.
  • The query variable provides the query to be executed.
  • The curly braces serve as a placeholder for feed options such as page size or query scans. For this example, we do not need to use any of these options.
  • The callback argument calls the The callback function defined later in the script. This option specifies what callback operation to run after the query request has been completed.

The callback argument specified as the fourth parameter is usually treated as a function that can itself take up to three parameters. The first is error information, the second is an array of returned documents, and the third can be other options. For now, we need to concern ourselves only with the error information and documents.

The error information is automatically generated by the database engine if any issues should arise, and the document array contains the results of our query, if all goes as planned. To process the callback, we create a function named callback and include the err parameter to handle the errors and the docs parameter to handle the array of documents:

Within the function, we define one if statement and one if…else statement block, giving us the following logic:

  • If an error is generated, return that error.
  • If the docs variable is undefined or empty, return the message No documents were found.
  • Otherwise, use the response variable to call the setBody method, which sets the value of the response object to our array of returned documents.

Notice that the setBody method calls the JSON.stringify method. The method converts a JavaScript value to a JSON string. In this case, JSON.stringify is converting the data in the docs variable to the JSON format.

The final statement in our JavaScript is an if statement that throws an error if the run variable is undefined:

Because we include the if statement, the script will return the specified message if the queryDocuments method returns a false, which can occur as a result of a timeout.

After you have added the complete script to the Stored Procedures pane (whether typing it in directly or copying and pasting what I have here), you need only click the Save button to add the stored procedure to the collection.

Not surprisingly, the next logical step would be to run the stored procedure to try it out. Unfortunately, you cannot. At least not through the Azure portal. For some reason, Microsoft chose not to include the ability to test your stored procedures within Azure. For that, you must use a third-party tool.

If you need a quick solution for testing your stored procedure, you can turn to freeware such as Azure DocumentDB Studio, available through GitHub. At least this way, you know if your stored procedure is working. For example, I ran the GetFullNames procedure in DocumentDB Studio and came up with the following results:

Although DocumentDB Studio had trouble with the accented ‘a’ in Sánchez, it at least provided a quick and easy way to test my procedure. To connect to DocumentDB through DocumentDB Studio, I had to provide the endpoint (URI) and secret key (primary key) for my DocumentDB account.

Adding an input parameter to a stored procedure

Now let’s look at a different example of a stored procedure, one that takes an input parameter. The next procedure, GetFullName, is similar to the last, except that it returns the full name from only one document, based on the specified document ID:

First off, notice that we now include the id parameter with the initial function name: GetFullName(id). As a result, when calling the procedure, we will also need to specify the document’s ID.

The id parameter is then included as part of the SELECT statement assigned to the query variable:

Instead of an ORDER BY clause, we now have a WHERE clause for retrieving only the specified document. The clause compares each document’s id property with the id parameter, returning only the matching document.

If we were to call the procedure after saving it, we should now receive only one full name. For example, I ran the stored procedure in DocumentDB Studio, passing in the id value rob0. The stored procedure returned the name Rob Walters, as expected.

Using stored procedures to update data

DocumentDB’s SQL query language supports only the SELECT statement, which means you cannot use the language to update document data. However, you can create JavaScript stored procedures that can update data. For example, the following stored procedure will change the JobTitle value in the specified document:

The stored procedure’s function, UpdateJobTitle, starts by defining two parameters, id and newtitle. The first parameter is for the document ID, and the second parameter is for the document’s new JobTitle value. This is followed by the same four variables we defined in our previous examples.

Next, we add two if statements to ensure that the parameter values are provided when calling the stored procedure:

If one or both values are missing, the stored procedure will return a message saying that the ID or job title has not been defined. To generate either message, we use a throw statement to create a new Error object that contains the message text.

We next define the query and run variables, which follow the same logic as the preceding example. Note however, that the SELECT clause in the query uses the asterisk (*) wildcard to indicate that the entire document should be returned, rather than specific properties:

The callback function is also similar to the previous example, except for a couple important changes:

The if statement in the if…else statement block checks that the docs array contains data and, if so, runs the UpdateDoc function, defined later in the script. The function takes one argument, the returned document. However, because docs is an array, we must specify the index, which in this case is 0. JavaScript arrays use zero-based indexing, and the docs array contains only one document.

If the expression in the if statement in the if…else block evaluates to false, the else statement kicks in and returns a message stating that the document could not be found.

Now let’s look at the UpdateDoc function, which changes the JobTitle property to the value specified in the newtitle property:

We first set the property to the new title ( doc.JobTitle = new title). We then define a variable named replace, which uses the coll variable to call the replaceDocument method.

The replaceDocument method replaces our original document with the one we’ve just updated. The method takes four arguments. The first is a resource link to the document, which we can reference by specifying the _self property. The second argument is the document itself, which has been updated with the new property value. The third is a placeholder for options such as an indexing directive.

The fourth parameter is a callback function, which we’ve embedded as a method argument, rather than calling another named function. In this case, the second parameter, newdoc, is the updated document. When the function runs, it first checks for errors. If there are none, it uses the response variable to call the setBody method, which sets the value of the Response object, making it possible to update the document in the collection.

If you were to run the UpdateJobTitle stored procedure after saving it, it should update the job title for the specified document. For example, when I ran the stored procedure, I passed in gail0 for the id parameter and Design Engineer 2 for the newtitle parameter. The procedure updated the document as expected.

Creating user-defined functions

Another handy feature in DocumentDB is the ability to create JavaScript UDFs. Before I demonstrate how this works, let’s create another sample document, using the following T-SQL SELECT statement:

This time around, I used the BusinessEntityID value for the document’s id property, and kept the LoginID column in its original form. I’ve included the results here, without the outer brackets, so you can copy and paste them in order to add the document your collection:

A DocumentDB UDF is similar to a stored procedure in that it is essentially a JavaScript object that contains two properties. The first is name, which uniquely identifies the function within the collection, and the second is body, which is a JavaScript function that carries out the logic.

To create a function in Script Explorer, click the Create User Define d Function button, and type a name in the ID text box. This gives us our name property value for the function.

Now let’s create the function component for our body property value. The function will extract the username from the LoginID value, without the domain name. For this example, assume that the domain portion can come in the form of adventure-works or adv-works and we want to account for both forms in our function, as shown in the following JavaScript:

We start by naming the function ExtractID and specifying that it take one parameter, the login value to be converted. Next, we use an if statement to check whether the login value is undefined. If so, we use a throw statement to return a message.

We then create an if…else if…else block for extracting our data, based on the following logic:

  • If the login value contains adventure-works, use the slice method to return a substring starting at the 17th character. (The slice method uses a zero-based index, so we specify 16.)
  • If the login value contains adv-works, use the slice method to return a substring starting at the 11th character.
  • Otherwise, return the login value as is.

The two if expressions also use the match method to compare values, using the login parameter to call the method. We first pass in a simple regular expression, enclosed in forward slashes and followed by the i argument, which indicates that the comparison should be case-insensitive. If the regular expression matches the specified string, the if expression evaluates to true.

After we save the UDF, we can use it in a DocumentDB SELECT statement, as shown in the following example:

When you call a UDF in DocumentDB, you must precede the function name with udf. (period included). The SELECT statement returns the following results:

If we want a bit less clutter in our results, we can add VALUE to the SELECT clause, without specifying an alias:

Now the results are much cleaner:

This should give you the basics of how to create a UDF and use it in a SELECT statement. We can even use it in a query within a stored procedure, so let’s look at how that works.

Using UDFs in your stored procedures

Suppose we want to create a stored procedure that retrieves information about an employee, based on the ID. We might create a function such as the following:

For the most part, the GetemployeeInfo function contains many of the same elements that you saw in earlier examples. However, the SELECT statement in this example calls the ExtractID UDF, allowing us to retrieve only the username when returning the document data.

The stored procedure also contains another difference from previous examples. The callback function includes additional logic to further refine the username:

If the username ends in 0, the 0 will be removed. We again use the match method to check the value, but with a regular expression that is a bit more complex. Essentially, it determines whether the Login property value ends in a 0 that is preceded only by characters, not numbers. This way, the function won’t remove the 0 from a login ID such as jossef10. If the login ID meets the conditions specified by the if expression, the statement then uses the slice method to trim off that last character. Otherwise, the Login value is left as is.

When I ran the stored procedure, I passed in 6 as the document ID and received the following results:

Notice that the Login value is now jossef, without the 0. Suppose we now add another document to the collection, just like that last one, but with an id value of 7 and a Login value of adventure-works\\jossef1, giving us the following JSON document:

If we were to run the stored procedure again, passing in 7 as the document ID, we would now get the following results, with the Login value untouched:

The point of adding this extra logic to the stored procedure is to demonstrate that we can instead add this logic to our function:

First we trim off the domain name and then, when appropriate, we trim off the 0. This allows us to simplify our stored procedure even further and let the function do more of the work:

When we run the GetEmployeeInfo2 stored procedure, passing in 6 as the ID, we get the following results:

If we pass 7 in as the ID, we should get these results:

In both cases, we get the Login value we wanted: a username without the domain name and without ending in 0.

DocumentDB server-side scripting

Because we can use JavaScript to build stored procedures and UDFs in DocumentDB, we can incorporate complex logic into our collections. JavaScript is a widely implemented language with a large user base, making it easier to resolve issues specific to our situations and giving us a great deal of flexibility for persisting logic to our database collections.

Best of all, adding stored procedures and UDFs is a relatively painless process. JavaScript and JSON are incorporated directly into DocumentDB, making server-side scripting a natural extension to DocumentDB solutions. In many cases, you might still want to keep the logic closer to the application, but for repetitive tasks on large document sets, you could find it beneficial to keep some of that logic nearer to where that data resides. The key is in having the JavaScript skills necessary to build the applicable logic.

Keep in mind, however, that DocumentDB is a relatively new service, so there are still some kinks to work out, such as not being able to run stored procedures from Script Explorer or Query Explorer. In addition, DocumentDB is a cloud service, so you pay for the resources you use. Be sure to read the fine print before you do anything. That said, under the right circumstance, server-side scripting in DocumentDB could prove beneficial. At the very least, it might be worth taking for a spin.