Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 40 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.
One of the common problems when you are developing databases is knowing what’s changed and when. You want to detect changes and to work out what has changed in a particular build. You’d have thought it would be an easy problem, when you consider that every database object in SQL Server’s system views has a … Read more
In order for a database system to work, you often need to provide programmable server objects. I’ve written very few databases that didn’t include agent jobs and triggers, or that didn’t require XEvents for diagnostics. These need to be scripted out in just the same way as database objects. It can be done via … Read more
A while ago, I wrote an article, Automated Script-generation with Powershell and SMO, about using SMO to script out a SQL Server database. It has remained surprisingly but agreeably popular. SMO is still there, but is now part of the sqlserver module that is included with SSMS and downloadable. Someone recently asked me whether it was … Read more
New releases of SQL Server arrive at a quick pace, and it's difficult to keep up with the many features introduced in each version. In this article, Phil Factor reviews a feature you may have missed, inline indexes. He covers the syntax and the many ways they can be used and then performs some performance tests to see if they can make a difference with table variables.… Read more
This article is about using the DOS Batch script facility of the Windows command line, together with SQLCMD to write the contents of each table in a database to the local filesystem. It shows how to use temporary stored procedures to advantage. Just to make it a bit harder, I’m doing it in extended JSON … Read more
I’ve often read in forums how people have special utility databases with all their stored procedures and functions for working on the databases on the server. It is great because you don’t want your utilities intruding into the actual databases that you are developing or testing. The problem is that it doesn’t work. Let me … Read more
SQL naming conventions for tables, and all the associated objects such as indexes, constraints, keys and triggers, are important for teamwork. Poorly-named tables and other objects make it difficult to maintain databases. Table names must follow the rules for SQL Server identifiers, and be no more than 128 characters long. It is possible to force SQL Server to … Read more
You have many options when exporting data from a database. In this article, Phil Factor compares several methods including XML and array-in-array JSON for speed and file size.… Read more
So you have a database with JSON in it. Can you validate it? I don’t mean just to ensure that it is valid JSON, but ensure that the JSON contains values that are legitimate. Are NI values, postcodes or bank codes valid? Can the dates or GUIDs be successfully parsed? Are those integers really integers? … Read more
JSON was initially designed for the informal transfer of data that has no schema. It has no concept of a table, or of an array of identical arrays. This means that it must tell you each key for each object, even if the original data object was a table. With the happy minimum of key-value … Read more
JSON is a viable option for transferring data between systems. It has the ability to include schema information along with the data which is an advantage over CSV files. In this article, Phil Factor demonstrates how he takes advantage of JSON when exporting or importing tables. … Read more
If you are, as you should be, checking JSON data in a whole lot of files before you import them into your database, you would do well to use JSON Schema, because you can run a number of checks such as regex checks that can’t be done any other way, and it is usually possible … Read more
You might think that it is easy to get JSON data from a spreadsheet, and there are plenty of utilities around that are based on the idea that it is trivial. If such data was strictly tabular, then it might be. The problem is that any conversion tool makes assumptions about the way that data … Read more
We’re going to set up a web service for a SQL Server database using Node.js on a Windows server. This is intended for a mobile application, but has a variety of other uses where an ODBC connection isn’t possible. This service is purely done as a demonstration for people with a database background, so … Read more
The Extended Events (or XEvents) feature has been part of SQL Server since 2008, but many database professionals struggle to get started using it. In this article, Phil Factor demonstrates several useful Extended Event sessions that measure just one thing in each. He then provides the code necessary to parse the resulting XML into something you can use.… Read more
Having spent a lot of my working life trying to preserve the integrity of data, there was a certain intriguing novelty in the idea of pseudonymizing data. One of the standard techniques of pseudonymization is that of shuffling data columns as though you are shuffling cards. The original values are kept but placed in the … Read more
I haven’t seen a SQL Server table with real unencrypted credit card numbers for several years, and I don’t know of any good reasons to have them stored that way. However, I’ve needed them in the past for testing a web application that had to take credit card details. Generating credit cards in a way … Read more
NoSQL databases like MongoDB are gaining popularity, but using the right tools for the job at hand is most important. In this article, Phil Factor demonstrates how to work with a MongoDB database and how to use PowerShell with MongoDB so that the process can be automated.… Read more
In any real numeric data from a database, you are only rarely going to see any sort of normal distribution of the values. Sales data will rise and fall according to the time of year and the economic cycle. The date of input of a record will vary with the workload. If you plot … Read more
When you are developing an existing database, or demonstrating it, you nowadays need pseudonymized data, or even better, completely anonymized data. This data has to look right at first glance, and it needs to have the same distribution as the real data. Although we are yet to tackle continuous variables with complicated distributions such as … Read more