There’s a possibility you’ve parachuted into this series on JSON and Oracle at this point. If you are already familiar with what JSON is and how it is created, and are only interested in discovering how to wrestle JSON data from an Oracle database, then you’re welcome, grab a seat. However, if you see the words JSON and Oracle, and scratch your head wondering if it’s a sequel to the movie, Jason and the Argonauts, then you probably should start from Part 1. Don’t worry; you’ll catch up soon enough.
In this article we’ll still be using the very basic json_test table we created in the previous article. However, to fully stretch the muscles of the functionality Oracle have put together for querying JSON data, we’ll need some JSON data with a few levels of complexity.
1 2 3 4 5 |
CREATE TABLE json_test (ID RAW(16) NOT NULL, json_data CLOB NOT NULL, created_date DATE DEFAULT SYSDATE ); |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
{"Title" : "Lemonade", "AlbumArtist" : "Beyonce", "ReleaseYear" : 2016, "Label" : "Columbia Records", "Genre" : "R&B", "Songs" : [{ "Title" : "Pray you catch me", "Artist" : "Beyonce", "Producer" : [{"Name" : "Beyonce"}, {"Name" : "Jeremy McDonald"}]}, { "Title" : "Hold Up", "Artist" : "Beyonce", "Producer": [{"Name" : "Diplo"}, {"Name" : "Beyonce"}]}, { "Title" : "Don't hurt yourself", "Artist" : "Beyonce", "Featuring": "Jack White", "Producer": [{"Name" : "Derek Dixie"}, {"Name" : "Beyonce"}]}, { "Title" : "Sorry", "Artist" : "Beyonce", "Producer": [{"Name" : "Melo-X"}, {"Name" : "Beyonce"}]}, { "Title" : "Freedom", "Artist" : "Beyonce", "Featuring": "Kendrick Lamar", "Producer": [{"Name" : "Just Blaze"}, {"Name" : "Beyonce"}]}], "ExecutiveProducer" : "Beyonce"} |
Once we’ve inserted our data into our table we’re ready to rock and roll.
Dot-Notation
We’ll come on to talk about all the exciting, high-powered functionality that we have for querying our data, but there are times when you need something really simple and reasonably straightforward. The dot-notation syntax allows you retrieve JSON values easily using a syntax that resembles that you’d use in an ordinary sql statement. The simple dot-notation syntax is as follows: the table alias followed by a dot followed by a JSON column followed by one or more JSON object keys (if there is more than one object key, they must all be separated by dots).
There are a few rules that you need to bear in mind:
- Your table must have an alias; you cannot choose to skip this step and simply use the table name. It must be aliased.
- Your column must be a JSON column, by which I mean that it must be signposted as containing JSON data by the existence of an IS JSON check constraint. As always, while this constraint must exist, it does not need to be enabled.
- The JSON object key in your query must be a case-sensitive match for the object key whose value you are seeking to retrieve.
Obey these rules and what you get is the following: if your query matches a single value it’ll return that value, but if it matches multiple values it’ll return them as the elements of a JSON array.
Simple dot-notation JSON queries return all their results as strings, and can only return up to 4000 characters. Breach that number and, irrespective of your data, it will return a null. To lift that kind of weight you’ll need the muscles of Oracle’s JSON functions, and we’ll talk about them soon.
But first, let me lead you down the garden path.
Oracle JSON Path Expressions
One of JSON’s main bragging points is how simple it is; however, in truth, large JSON files can be cobwebs of complex relationships. Oracle’s JSON path expressions are a way of reaching into JSON data and matching particular sections or values. They strongly resemble XPath expressions, mostly for the simple reason that they do for JSON what XPath does for XML: provide a simple means for directly identifying values in the data. For instance:
1 |
$.AlbumArtist |
is a quick way to identify the album artist – Beyonce – from our JSON data. The rules are as follows:
- Each path must begin with a dollar sign ($). This is called the context item, and represents the jumping off point for our path. This point will be identified by the SQL expression that is passed to the Oracle function using the JSON path expression.
The context item is followed by zero or more object steps or array steps. - An object step is a dot (.) followed by an object key (or an asterisk – * – to match all keys). The object step returns the value of the matching object key. An asterisk will return the values of all matching object keys in no particular order.
- An array step is a left bracket ([), followed by one or more zero-based array indexes (or a wildcard – *), and bookmarked by a right bracket (]).
There are a few simple, common sense rules regarding array step indexes. To avoid compile-time errors, the indexes must be in ascending order – [0,2,3] is valid; [0,3,2] is not – and must not contain repetitions – [0,2,3] is valid; [0,2,2] is not.
Returning to our JSON data, we can come up with some examples that illustrate what we’ve just learned about Oracle JSON path expressions.
- $.Title is the value of the Title object key of the context item. In other words, Lemonade.
- $.Songs[1] is the object in the second position in the Songs array. The reason why it is the second object is that array indexes are zero-based.
- $.Songs[0].Title is the value of the key Title of the first element in the Songs array. In other words, Pray you catch me.
- $.Songs[*].Producer[0].Name is the value of the the first Name key in the Producer array of all elements in the Songs array.
It’s all pretty straightforward, but it can make your head hurt if you’re trying to work it out as some kind of mental exercise. What we need is a few functions that actually use JSON path expressions so we can see them in action.
JSON_VALUE
JSON_VALUE returns the scalar value of an object key in JSON data. It accepts two mandatory arguments, the first one – usually a table or view column – specifying the context item, and the second containing the path expression.
Here’s an example:
It’s straightforward enough to be self-explanatory. JSON_VALUE, using our path expression returns the value of a JSON key as a SQL value. It’s worth bearing in mind that JSON_VALUE expects your path expression to identify a single scalar value; its default behaviour if you violate this or any of its rules is to return a null.
The error clause is one of JSON_VALUE’s optional arguments. The default behaviour is NULL ON ERROR. However, you can instruct it to ERROR ON ERROR, in which case it will say “ORA-40470: JSON_VALUE evaluated to multiple values” if your path resolves to more than one value.
By default all values returned by calls to JSON_VALUE are of type VARCHAR2 (with a length of 4000). However, you can call an optional returning clause to specify a data type from a choice of VARCHAR2 or NUMBER. You can also specify the length for VARCHAR2 and precision for NUMBER.
Finally, if your column is a blob you must include the keywords FORMAT JSON to alert Oracle that it should expect JSON data.
JSON_QUERY
On the face of it, JSON_QUERY is very similar to JSON_VALUE; the syntax is close to indistinguishable, and its purpose is superficially identical. There are some differences, though. JSON_QUERY returns one or more values from JSON data; in contrast to JSON_VALUE which can only return a single value. (Additionally, JSON_QUERY, unlike JSON_VALUE, cannot return the NUMBER data type.) One advantage that this difference gives JSON_QUERY, over its close cousin, is that you can use JSON_QUERY to retrieve fragments of a JSON document.
The first argument is the column name (followed by FORMAT JSON if it is of the blob data type). This column is the context item for the next argument.
The next required argument is a path expression.
The next argument is optional and is the returning clause. Unlike JSON_VALUE, JSON_QUERY cannot return numbers; you can only specify VARCHAR2. However, you can use this clause to declare a size for your string.
The next argument is one we haven’t discussed yet. It’s the wrapper clause. This optional parameter allows you specify the form in which JSON_QUERY will return its output. What this means will become clearer once I tell you what the options are:
WITH WRAPPER: returns a string that represents a JSON array of all the values that match your path expression. What this means is that if your path expression, for example, resolves to Beyonce, your function will return [Beyonce]. If, however, it resolves to {Name: Beyonce}, it’ll return [{Name: Beyonce}]. And if your path resolves to a null, it’ll return [ ].
WITHOUT WRAPPER: returns a string that represents the single JSON object or array that matches your path expression. Single. Object or Array. What this means is that if your path resolves to multiple objects or array, or if it resolves to one or more values, it’ll error. Using our earlier examples, Beyonce will error because it is a value. However, the object {Name: Beyonce} will be returned as, well, {Name: Beyonce}. A null will error also, as it is not a single object or array.
The default behaviour for JSON_QUERY is WITHOUT WRAPPER.
Following the wrapper clause as an optional argument for JSON_QUERY is the error clause. There’s a new option for the error clause that only exists for JSON_QUERY; it is EMPTY ON ERROR. This means the function should return an empty array, [ ], on error.
To illustrate the contrast between JSON_QUERY and JSON_VALUE, here’s that path expression that resulted in an error for JSON_VALUE.
Notice how we were able to return multiple values, and how, using the WITH WRAPPER clause, they are wrapped in an array. If we used the WITHOUT WRAPPER option for the same expression, we run up against an error.
JSON_TABLE
If you’re conversant with the Oracle XQuery function XMLTable, then the thinking behind JSON_TABLE will be familiar to you. Basically, it’s a means of transforming JSON data into a relational format – in other words, rows and columns. It’s a really powerful tool, if you think about it. It enables you take JSON data and decompose it to a virtual table upon which you can carry out your everyday SQL operations.
Its syntax is as follows:
The first mandatory argument, as always, is the source JSON column. This is followed by the path expression.
Next is the optional error clause. The options here are, unsurprisingly, ERROR ON ERROR and NULL ON ERROR, with the latter being the default.
Finally, there is the mandatory columns clause. This clause provides the definition for the columns of the virtual table. The column clause deserves a closer look. It begins with the keyword COLUMNS and is followed by the column definitions, ensconced in braces.
The column definitions are made up of a column name, a return type, the keyword PATH and the path expression that resolves to the value you want to populate your column with.
An example would probably help at this juncture. Let’s get a track listing from our album.
1 2 3 4 5 6 7 8 9 |
SELECT jt.* FROM json_test, JSON_TABLE (json_data, '$.Songs[*]' COLUMNS (row_number FOR ORDINALITY, song VARCHAR2(50) PATH '$.Title', artist VARCHAR2(30) PATH '$.Artist', featured_artist VARCHAR2(30) PATH '$.Featuring' )) AS jt; |
There are a few things I should point out. First, notice how the table name – in this case, json_test – must be part of the FROM clause, right before the call to JSON_TABLE.
Second, notice how the main path expression provides the context item for the column path expression.
Finally, a column with the FOR ORDINALITY keyword returns generated row numbers.
Conclusion
And that’s it for this lesson on querying JSON data in an Oracle database. In the final article in this series we’ll be talking about generating JSON data from relational data. Hint: it’s not as easy as it should be.
Reference:
Load comments