5 October 2023

Guest post

This is a guest post from Phil Factor. Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 30 years of experience with database-intensive applications.

Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career.

He is a regular contributor to Simple Talk and SQLServerCentral.

Phil Factor

5 October 2023

Using Database Models and Flyway for Automating Routine Development tasks

Database models have all sorts of useful applications during Flyway development to help us automate those repetitive development tasks that otherwise slow down delivery. This article shows how models can help us automate mundane tasks such as generating a build script for any version of a database or deleting the data from every table.

Guest post

A model of SQL database is a hierarchical representation of its metadata, providing details of the structure of the tables, views, procedures and functions. We build these models by extracting and collating information from the RDBMS’s metadata views and saving it in markup such as XML or YAML, or in a language such as JSON.

A database model is mostly useful for scripting processes, or for recording, and later referencing, the design of a database at any particular version. If you use a hierarchical model in JSON, it can be accessed by any scripting language such as PowerShell or Python to find out the structure of the database. Even DOS batch, with the help of JQ.exe, can read from it.

I’ve exploited this fact, in my Flyway Teamwork Framework, to provide PowerShell tasks that can generate a JSON model for each new version of the database, and then compare the current model to the previous one to produce a summary of what changed. I’ve described the various ways in which this can help team-based development in Tracking Development Changes using Database Models with Flyway.

As well as allowing us to compare database versions, a model also has more general uses in helping to automate database development tasks. For example, the model provides the information we need to write routines for tasks such as checking that naming conventions are being followed, or that tables are complete, or for extracting documentation. Because the model also contains all the objects dependencies, we can also use it to automate tasks that rely on knowing the correct dependency order. For example, we can use a model to generate a build script that ensures all objects are created in the correct dependency order, or to produce a script that deletes all data from a database in reverse-dependency order.

To illustrate the point, I’ll demo a few of these uses for a model, using the Flyway Teamwork framework. They provide a standardized way to automate this sort of mundane, but necessary, development task, regardless of RDBMS.

Generating a SQL Script that does a SQL task for every table

Unfortunately, there is no SQL method in common between the major RDBMSs to provide a simple way of executing a given Transact-SQL statement against each table in the database. Even if there was, tasks that affect a whole set of tables often need to happen in the correct dependency order. For example, if you need to build a database, you can’t create a table that has foreign key references to tables that you haven’t yet created. Conversely, if the task is to delete all the data, you can’t delete data from any table to which other tables with existing data still refer.

Even SQL Server ‘s sp_MSforeachtable can’t ensure a task is done in dependency, or reverse-dependency, order. It is certainly possible to achieve similar functionality for other RDBMSs by using their respective system views, cursors, and procedural languages. Fortunately, though, we don ‘t have to do this. Instead, we can steal the idea behind sp_MSforeachtable and use our database model to provide the dependency order.

As an example, I’ve provided a utility that generates code that does the action that you specify for every table, but in the right dependency order for the operation. It is called Script-ForEachTable. You’ll find it in the Resources directory and is just one of the resources that is installed by ‘Preliminary.ps1‘.

All you need to do is provide the SQL for the operation you want to perform as a template, using ‘??‘ as a placeholder for the actual table names. The utility will read the contents of the JSON model, sort the objects into the required dependency order and then iterate through the tables, in that order, replacing the placeholder with the actual table names. The result is a script that you can simply execute to perform the task. It is only the SQL template we use that needs to change for the RDBMS.

Using this utility, or alternatively by reading the table manifest that the $SaveDatabaseModelIfNecessary task in the Flyway Teamwork framework already generates, we can run these operations much more simply, as well as performing more general tasks, such as listing the number of rows in each table.

Deleting the data from every table

The following code uses Script-ForEachTable to auto-generate a script that will delete the contents of all the tables in a SQL Server database, without having to disable or delete any constraints:

$DeletionRoutine=@"

--delete the data from ??, in chunks of 10,000 rows if necessary

Print 'Deleting all the data from ?? '

DECLARE @SoFarDeleted INT=1;

WHILE (@SoFarDeleted > 0 and @@Error=0)

BEGIN

-- Delete a chunk of rows at a time

DELETE TOP (10000) ??;

SET @SoFarDeleted = @@ROWCOUNT;

END

$TheJSONModelLocation=

'S:\work\Github\FlywayTeamwork\Pubs\Versions\current\Reports\DatabaseModel.JSON '

Script-ForEachTable $DeletionRoutine $TheJSONModelLocation;

This will provide the following SQL for every table, doing each table in reverse dependency order:

Print 'Deleting all the data from dbo.authors '

DECLARE @SoFarDeleted INT=1;

WHILE (@SoFarDeleted > 0 and @@Error=0)

BEGIN

-- Delete a chunk of rows at a time

DELETE TOP (10000) dbo.authors;

SET @SoFarDeleted = @@ROWCOUNT;

END

The reason that we are deleting data in smaller batches is that it is useful where large deletions might cause lock contention and other performance issues. The optimal batch size for your RDBMS and data is likely to be different.

Other RDBMSs will have different syntax. PostgreSQL doesn’t allow a LIMIT in a DELETE statement, so it is a bit more awkward.

--delete the data from dbo.prices, in chunks of 10,000 rows if necessary

DO '

DECLARE

ChunkSize INT := 10000;

RowsDeleted INT;

BEGIN

LOOP

DELETE FROM dbo.prices

WHERE ctid IN (

SELECT ctid

FROM dbo.prices

ORDER BY ctid

LIMIT ChunkSize

);

GET DIAGNOSTICS RowsDeleted = ROW_COUNT;

IF RowsDeleted = 0 THEN

EXIT;

END IF;

END LOOP;

END ';

In Oracle, you can achieve a similar result by using PL/SQL and the BULK COLLECT feature to fetch and delete data in chunks.

Count the number of rows in every table

Once you have a utility like this, it comes in handy for other database chores. How about this, for example, as a template for generating SQL scripts to count the number of rows in each table of a database:

DECLARE @count INT, @Feedback Nvarchar(80)

SELECT @count=Count(*) FROM ??

SELECT @Feedback= '?? has '+Convert(VARCHAR(10), @Count)+ ' rows '

PRINT @Feedback

"@ | Script-ForEachTable -PathToModel $TheJSONModelLocation;

A fragment of the script it generates would look like this:

DECLARE @count INT, @Feedback Nvarchar(80)

SELECT @count=Count(*) FROM dbo.publishers

SELECT @Feedback= 'dbo.publishers has '+Convert(VARCHAR(10), @Count)+ ' rows '

PRINT @Feedback

DECLARE @count INT, @Feedback Nvarchar(80)

SELECT @count=Count(*) FROM dbo.Publication_Types

SELECT @Feedback= 'dbo.Publication_Types has '+Convert(VARCHAR(10), @Count)+ ' rows '

PRINT @Feedback

And it would produce a report like this ..

dbo.titleauthor has 1499 rows
dbo.TagTitle has 1000 rows
dbo.sales has 21 rows
dbo.roysched has 86 rows
dbo.editions has 2600 rows
…etc…

Generating build scripts from a model

If you need a build script for each new database version, you can set a Flyway callback to generate the script for you, automatically, using whatever tool the RDBMS provides. The Flyway Teamwork Framework, along with a $SaveDatabaseModelIfNecessary task that provides the manifest that tells you the order for doing the work, also has a $CreateBuildScriptIfNecessary task that provides object-level build scripts for each migration.

Together, this give you everything you need to generate a build script but, of course, it requires a connection to a live database. If you don’t have access to the required version of the database, you can instead generate the build script from the database model. The Convert-modelToScript cmdlet, included with the Framework, will parse a model, and create from it a build script. You’ll find it in the Resources directory.

The following example generates a build script from the model for V1.3 of the Pubs database:

1 2	'<MyPathTo> Pubs\Versions\1.3\Reports\DatabaseModel.JSON' \| Convert-ModelToScript > 'MyBuildScript.SQL '

I get a script that I can then check against my build script to see what’s missing, or I can use to create a database that I can compare with the original. Here is a sample of the code generated…

CREATE TYPE people.PersonalName FROM nvarchar (80) NOT NULL;

CREATE TYPE people.PersonalPhoneNumber FROM varchar (20) NOT NULL;

CREATE TYPE people.PersonalAddressline FROM varchar (60) NULL;

CREATE TYPE people.PersonalPostalCode FROM varchar (15) NOT NULL;

CREATE TYPE people.PersonalNote FROM nvarchar (MAX) NOT NULL;

CREATE TYPE people.PersonalCVC FROM char (3) NOT NULL;

CREATE TYPE people.PersonalTitle FROM nvarchar (20) NOT NULL;

CREATE TYPE people.PersonalPaymentCardNumber FROM varchar (20) NOT NULL;

CREATE TYPE people.PersonalLocation FROM varchar (20) NULL;

CREATE TYPE people.PersonalEmailAddress FROM nvarchar (80) NOT NULL;

CREATE TYPE people.PersonalSuffix FROM nvarchar (20) NULL;

CREATE TYPE dbo.empid FROM char (9) NOT NULL;

CREATE TYPE dbo.tid FROM varchar (6) NOT NULL;

CREATE TYPE dbo.Dollars FROM numeric (9,2) NOT NULL;

CREATE TYPE dbo.id FROM varchar (11) NOT NULL;

CREATE TABLE dbo.authors (

/* The authors of the publications. a publication can have one or more author */

au_id dbo.id -- The key to the Authors Table

CONSTRAINT UPKCL_auidind PRIMARY KEY (au_id),

au_lname nvarchar(80), -- last name of the author

au_fname nvarchar(80), -- first name of the author

phone nvarchar(40) -- the author's phone number

CONSTRAINT AssumeUnknown DEFAULT ('UNKNOWN'),

address nvarchar(80), -- the author=s firest line address

city nvarchar(40), -- the city where the author lives

state char(2), -- the state where the author lives

zip char(5), -- the zip of the address where the author lives

contract bit -- had the author agreed a contract?

);

CREATE TABLE dbo.jobs (

/* These are the job descriptions and min/max salary level */

job_id smallint IDENTITY(1,1) -- The surrogate key to the Jobs Table

CONSTRAINT PK__jobs__6E32B6A51A14E395 PRIMARY KEY (job_id),

job_desc varchar(50) -- The description of the job

CONSTRAINT AssumeANewPosition DEFAULT ('New Position - title not formalized yet'),

min_lvl tinyint, -- the minimum pay level appropriate for the job

max_lvl tinyint -- the maximum pay level appropriate for the job

);

CREATE TABLE dbo.Publication_Types (

/* An edition can be one of several types */

Publication_Type nvarchar(20) --

CONSTRAINT PK__Publicat__66D9D2B37E8D5751 PRIMARY KEY (Publication_Type)

);

CREATE TABLE dbo.publishers (

/* this is a table of publishers who we distribute books for */

pub_id char(8) -- The surrogate key to the Publishers Table

CONSTRAINT UPKCL_pubind PRIMARY KEY (pub_id),

pub_name nvarchar(100), -- The name of the publisher

city nvarchar(100), -- the city where this publisher is based

state char(2), -- Thge state where this publisher is based

country nvarchar(80) -- The country where this publisher is based

CONSTRAINT AssumeItsTheSatates DEFAULT ('USA')

);

CREATE TABLE dbo.stores (

/* these are all the stores who are our customers */

stor_id char(4) -- The primary key to the Store Table

CONSTRAINT UPK_storeid PRIMARY KEY (stor_id),

stor_name nvarchar(80), -- The name of the store

stor_address nvarchar(80), -- The first-line address of the store

city nvarchar(40), -- The city in which the store is based

state char(2), -- The state where the store is base

zip char(5) -- The zipt code for the store

);

CREATE TABLE dbo.TagName (

/* All the categories of publications */

TagName_ID int IDENTITY(1,1) -- The surrogate key to the Tag Table

CONSTRAINT TagnameSurrogate PRIMARY KEY (TagName_ID),

Tag nvarchar(80) -- the name of the tag

CONSTRAINT Uniquetag UNIQUE (Tag)

);

At the moment, it is really just an elaborate test harness for the framework, to verify or inspect a model. For example, I can generate a build script from the model and use the script to create a database. Then, using SQL Compare, I compare the new database to the one that generates the model to ensure they’re identical. Because the code of all DML and procedural code of the routines and views is in the model, this is easier than it sounds, but it is one of those background maintenance jobs. I haven’t yet attempted to conform with individual dialects of all supported RDBMSs because I’ve not got independent schema comparison tools to check against.

Although JSON database object models are very easy to read, they will inevitably still contain a few SQL expressions. Many database objects, such as views, procedures, functions and some constraints cannot be represented without them. This means, for example, that we can’t interrogate the JSON definition of a view to work out the tables on which it depends. However, there are simple ways round these sorts of problems. To determine what tables are used in SQL expressions or batches, the Framework provides a generic SQL Tokenizer that can be used to give you a list of tables referenced within the scripted object such as a view, function or procedure.

Creating a style checker for tables

I’ve also created a Find-TableSmells utility, which is essentially a table style checker for database models. At the moment it only runs with SQL Server, but that is only because I needed it for a SQL Server database that needed tidying up. I’ve yet to update all RDBMSs to the new standard.

1	Find-TableSmells '<path-to>\FlywayTeamwork\Pubs\Versions\1.1\Reports\DatabaseModel.JSON '

On the original Pubs Database, it is rather sniffy about the tables that have no primary keys but otherwise it finds little to criticize.

The dbo.roysched heap has no suitable candidate for a primary key
The dbo.roysched table has a foreign Key name  'FK__roysched__title___15502E78 ' that looks auto-generated
The dbo.pub_info table has a foreign Key name  'FK__pub_info__pub_id__20C1E124 ' that looks auto-generated
The dbo.titles table has no documentation
The dbo.titles table has a foreign Key name  'FK__titles__pub_id__08EA5793 ' that looks auto-generated
The dbo.discounts heap has no index
The dbo.discounts table has a foreign Key name  'FK__discounts__stor___173876EA ' that looks auto-generated
The dbo.sales table has a column name  'qty ' that is incomprehensible

Conclusion

One problem I had when changing from its SQL Server origins to a multi-RDBMS system was the work of scaling out the facilities that are needed for database development. Although it was relatively easy, and quite intriguing to do the first three RDBMSs, it then became a chore. All the various tasks seem to boil down to getting information about the database. If information can be put in a standard model and then that model is used as a source of information for the tasks, the complexity drops away. The same is true of database development, even though the context is different. If diagramming, change-reports, quality-control, searching and so on can be done from a model rather than just the current development ‘state’ how much easier the work can become, and how much easier to understand how and when changes were made.

Tools in this post

Redgate Flyway

Bring stability and speed to database deployments

Find out more

Redgate Flyway

Bring stability and speed to database deployments

Find out more

Redgate Test Data Manager

Redgate Flyway

Redgate Monitor

Overview

Protect

Automate

Monitor

Product articles Redgate Flyway Tracking development changes
Using Database Models and Flyway for…

Guest post

Using Database Models and Flyway for Automating Routine Development tasks

Guest post

Generating a SQL Script that does a SQL task for every table

Deleting the data from every table

Count the number of rows in every table

Generating build scripts from a model

Creating a style checker for tables

Conclusion

Tools in this post

Redgate Flyway

Redgate Flyway

Product Articles

University

Events

Forums

Community

Simple Talk

Redgate Test Data Manager

Redgate Flyway

Redgate Monitor

Overview

Protect

Automate

Monitor

Product articles Redgate Flyway Tracking development changes Using Database Models and Flyway for…

Guest post

Guest post

Generating a SQL Script that does a SQL task for every table

Deleting the data from every table

Count the number of rows in every table

Generating build scripts from a model

Creating a style checker for tables

Conclusion

Tools in this post

Redgate Flyway

Redgate Flyway

You may also like

Redgate Hub

Product Articles

University

Events

Forums

Community

Simple Talk

Product articles Redgate Flyway Tracking development changes
Using Database Models and Flyway for…