Consuming hierarchical JSON documents in SQL Server using OpenJSON

Over the years, Phil was struck by the problems of reading and writing JSON documents with SQL Server. Now that SQL Server 2016 onwards has good JSON support, he thought that his earlier articles on the subject would be forgotten. Not so: they continue to be popular, so he felt obliged to write about how you can use SQL Server's JSON support to speed the process up. … Read more

Missing Data

In the real world of business or scientific reporting and analysis, data can prove to be awkward. It can be plain wrong or it can be altogether missing. Sure, we have the NULL to signify unknown, but that doesn't play well with regular business reporting. There are a number of ways of dealing with missing information, and methods of estimating data from existing data have a long and respectable history. Joe Celko gets to grips with a data topic that is often treated with some trepidation. … Read more

SQL Code Smells

Some time ago, Phil Factor wrote his booklet 'SQL Code Smells', collecting together a whole range of SQL coding practices that could be considered to indicate the need for a review of the code. It was published as 119 code smells, even though there were 120 of them at the time. Phil Factor has continued to collect them and the current state of the art is reflected in this article. There are now around 150 of these smells and SQL Code Guard is committed to covering as many of them as possible. … Read more

Simple SQL: Random Thoughts

How does one get a truly random sample of data of a certain size from a SQL Server database table? Well, there are simple non-portable tricks one can use, such as the NewID() function, but then refining those can be tricky. Take the Rand() function for a start. Can it really provide you with a truly random number? Why doesn't the TABLESAMPLE clause give you a set number of rows? Joe Celko scratches his head a bit, explains some of the issues and invites some suggestions and tricks from readers. … Read more

Statistics in SQL: Student’s t-test

Many undergraduates have misunderstood the name 'Student's' in the t-test to imply that it was designed as a simple test suitable for students. In fact it was William Sealy Gosset, an Englishman publishing under the pseudonym Student, who developed the t-test and t distribution in 1908, as a way of making confident predictions from small sample sizes of normally-distributed variables. As Gosset's employer was Guinness, the brewer, Phil Factor takes a sober view of calculating it in SQL. … Read more

The Basics of Good T-SQL Coding Style – Part 4: Performance

There are several obvious problems with poor SQL Coding habits. It can make code difficult to maintain, or can confuse your team colleagues. It can make refactoring a chore or make testing difficult. The most serious problem is poor performance. You can write SQL that looks beautiful but performs sluggishly, or interferes with other threads. A busy database developer adopts good habits so as to avoid staring at execution plans. Rob Sheldon gives some examples.… Read more

Data in Motion and Data at Rest

Microsoft StreamInsight and Azure Stream Analytics represent a very different model for processing data. They are concerned with complex event processing (CEP): processing streams of event data from such things as sensors to deduce significant patterns and apply filters. Joe Celko discusses the background to the intriguing technology of complex event processing to establish the difference between data at rest and data on the move. … Read more

SQL Server R Services: Digging into the R Language

It is not just the analytic power of R that you get from using SQL Server R Services, but also the great range of packages that can be run in R that provide a daunting range of graphing and plotting facilities. Robert Sheldon shows how you can take data held in SQL Server and, via SQL Server R Services, use an R package called ggplot2 that offers a powerful graphics language for creating elegant and complex plots. … Read more

SQL Data Aggregation Aggravation

When we have to deal with and store a lot of data, it makes sense to aggregate it so that we store only the information we actually need. If we get this right, this works well, but the design of the system takes care and thought because the problems can be subtle and various. Joe Celko describes some of the ways that things can go wrong and end up providing incorrect, inaccurate or misleading results.… Read more

SQL Graph Objects in SQL Server 2017: the Good and the Bad

Graph databases are useful for certain types of database tasks that involve representing and traversing complex relationships between entities. These can be difficult to do in relational databases and even trickier to report on. Until now, we have had the choice of doing it awkwardly in SQL Server or having an ancillary database to tackle this type of task. SQL Server 2017 will be bringing graph capabilities to the product, but will these features prove to be good enough to allow us to dispense with specialised graph databases? Dennes Torres decided to find out. … Read more

The Basics of Good T-SQL Coding Style – Part 3: Querying and Manipulating Data

SQL was designed to be a third-generation language, expressed in syntax close to real language, because it was designed to be easy for untrained people to use. Even so, there are ways of expressing SQL queries and data manipulation that make them easier for the database engine to turn into efficient action, and easier for your colleagues to understand. Robert Sheldon homes in on data querying and manipulation and makes suggestions for team standards in SQL coding. … Read more

Is It Time To Stop Using IsNumeric()?

The old system function IsNumeric() often causes exasperation to a developer who is unfamiliar with the quirks of Transact-SQL. It seems to think a comma, or a number with a 'D' in the middle of it, is a number. Phil Factor explains that though IsNumeric() has its bugs, its real vice is that it doesn't tell you which of the numeric datatypes the string parameter can be coerced into, and that it doesn't check for overflow. Phil comes to the rescue with a couple of useful alternatives, one of which works whatever version of SQL Server you have, and which tell you what datatype the string can be converted to. … Read more

Database Code Analysis

Database code analysis will reduce the number of 'code smells' that creep into your database builds. It will alert the team to mistakes or omissions, such as missing indexes, that are likely to cause performance problems in production. It will allow the Governance and Operations team visibility into production readiness of the code, warning them of security loopholes and vulnerabilities. William Brewer describes the two technical approaches to database code analysis, static and dynamic, and suggests some tools that can help you get started.… Read more

SQL Server R Services: The Basics

It is possible to do a great deal with R within SQL Server, but it is best to start by doing analysis in R on numeric data from SQL Server and returning the results to SQL Server. There is great value to be gained even with this basic foundation. Robert Sheldon is on hand to give you a kick start with the first in his series on beginning with R in SQL Server.… Read more

The Basics of Good T-SQL Coding Style – Part 2: Defining Database Objects

Technical debt is a real problem in database development, where corners have been cut in the rush to keep to dates. The result may work but the problems are in the details: such things as inconsistent naming of objects, or of defining columns; sloppy use of data types, archaic syntax or obsolete system functions. With databases, technical debt is even harder to pay back. Robert Sheldon explains how and why you can get it right first time instead.… Read more

Statistics in SQL: The Kruskal–Wallis Test

Before you report your conclusions about your data, have you checked whether your 'actionable' figures occurred by chance? The Kruskal-Wallis test is a safe way of determining whether samples come from the same population, because it is simple and doesn't rely on a normal distribution in the population. This allows you a measure of confidence that your results are 'significant'. Phil Factor explains how to do it.… Read more

SQL Server User-Defined Functions

User-Defined Functions (UDFs) are an essential part of the database developer's armoury. They are extraordinarily versatile, but just because you can use scalar UDFs even in WHERE clauses, computed columns and check constraints doesn't mean that you should. Multi-statement UDFs come at a cost and it is good to understand all the restrictions and potential drawbacks. Phil Factor gives an overview of user-defined functions: their virtues, their vices and their syntax. … Read more
