Taming Database Documentation with Flyway and MySQL
Database object documentation is essential for explaining to busy developers, and the wider business, the purpose of each object and how to use it. The solution presented in this article consists of a SQL script to allow developers to add comments to MySQL database objects, without affecting the database version, and a simple way to generate a documentation report, in JSON. The SQL script will execute automatically as a callback, during any Flyway Teams migration run, and the report will allow the team to spot any gaps quickly.
The database documentation, in the sense of comments attached to database objects and attributes, isn’t really part of the metadata of a database. However effective the comment is in shedding light on the reason for a table or column, it doesn’t affect the working of a database in any way. For this reason, adding documentation to a database object shouldn’t change the version of a database. It should never be part of a database comparison. This implies that we should not add database object documentation directly to build scripts or the migration scripts. It must be done separately; but how?
If you have a separate script that applies the documentation, you can use it as a Flyway ‘callback script’ and Flyway will apply it automatically at the end of a migration run. You merely name the script according to the Flyway specification and it applies it to the database.
Maintaining documentation for a SQL Server database
I’ve already provided an example of this way of maintaining JSON-based documentation for a SQL Server database. It uses a SQL-based callback called afterMigrate__ApplyTableDescriptions.sql that applies a documentation to a database, and a task script block called $ExecuteTableDocumentationReport
, within DatabaseBuildAndMigrateTasks.ps1, to generate a JSON document of the current documentation of tables and columns, with blanks where more documentation is needed.
In the original SQL Server version, I chose JSON because I found it easy to update the information in the JSON document, and SQL Server can easily read from a JSON document, and use the information to update comments, and to accommodate new tables as and when they are added to the project. Flyway doesn’t get upset if you change a callback. As I mainly use SQL Server, I’m a lot more conscious of the usefulness of comments. SQL Prompt can display them in tooltips. You can apply named comments to almost any database object or attribute, and these can be almost 8000 characters in length. I store structured documentation in YAML in them as well as simple strings.
Challenges of documenting other databases
I wanted to extend support for database documentation, within my Flyway Teamwork framework, to include other databases. Unfortunately, support for comments in other RDBMS is patchy because there is no SQL standard for a comment. The best one can hope for is to document tables and columns. MySQL and MariaDB can’t even add comments to views or the columns of views. SQLite knows nothing of comments but stores the original source of the tables and columns, including the SQL code comment blocks (/* */) and comments (–). This is fine up to a point, but extraction of these comments is a rather delicate operation involving Regex.
The most obvious way of showing the developers what tables and columns still need comments is to produce a report on every successful migration run listing all the tables and columns with blank strings where there are no comments. Then the developers can add the missing comments using an updated version of the afterMigrate__ApplyTableDescriptions.sql
script, in time for the next migration. It is all very civilized.
The only cloud on the horizon for this JSON-based system is that relatively few RDBMSs can read JSON documents into a relational form, so if you need a standard format for documentation across RDBMSs, you’ll be disappointed.
If you are using Flyway community, then your callbacks must be in Java or SQL, not PowerShell. In Flyway Teams or Redgate Deploy, we can use PowerShell in callbacks so we can use this to deal with the JSON data and make the calls to the database to update the documentation far more simply.
A Flyway callback to add documentation to a MySQL or MariaDB database
Some head-scratching was needed before attempting the first cut of a MySQL/MariaDB version of the documentation callback. I decided that the simplest way of representing the documentation was via a multiline VALUES
statement. This means editing the SQL when a table is added, or you wish to alter the comment. It isn’t that much more of a hassle but isn’t nearly as convenient as JSON.
The objective is to add a comment only when strictly necessary. This means that you need to see what comments are there already, and whether they are different and therefore have changed. This is more of a worry with the way that MySQL attaches or alters a comment to a column. The only way to do it is to present the entire column definition for the column for which you are altering the comments. If you get that process wrong, you can change the column definition and thereby trigger a table rebuild or worse. It is a scary process but there is no way around that.
So here is the afterMigrate__ApplyTableDescriptions.sql
script for a MySQL or MariaDB database. It applies comments to both the table and its columns.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 |
-- now put the table comments into a temporary table called TableCommentsWanted CREATE TEMPORARY TABLE TableCommentsWanted(Tablename VARCHAR(128), TheDescription VARCHAR(1024)); INSERT INTO TableCommentsWanted (Tablename, TheDescription)-- used to store all the table comments VALUES -- all the tables either in place or planned in future work ('dbo.authors' , 'The authors of the publications. a publication can have one or more author'), ('dbo.discounts', 'These are the discounts offered by the sales people for bulk orders'), ('dbo.editions', 'A publication can come out in several different editions, of maybe a different type'), ('dbo.employee', 'An employee of any of the publishers'), ('dbo.jobs', 'These are the job descriptions and min/max salary level' ), ('dbo.prices', 'these are the current prices of every edition of every publication'), ('dbo.sales', 'these are the sales of every edition of every publication'), ('dbo.pub_info', 'this holds the special information about every publisher'), ('dbo.publications', 'This lists every publication marketed by the distributor'), ('dbo.publishers', 'this is a table of publishers who we distribute books for'), ('dbo.roysched', 'this is a table of the authors royalty scheduleable'), ('dbo.stores', 'these are all the stores who are our customers'), ('dbo.TagName', 'All the categories of publications'), ('dbo.TagTitle', 'This relates tags to publications so that publications can have more than one'), ('dbo.titleauthor', 'this is a table that relates authors to publications, and gives their order of listing and royalty') ; -- now put the table comments into a temporary table called ColumnCommentsWanted CREATE TEMPORARY TABLE ColumnCommentsWanted (TableObjectName varchar (128), `Type` varchar(20), `Column` VARCHAR(128), `comment` VARCHAR(1024) ); INSERT INTO ColumnCommentsWanted (`TableObjectName`, `TYPE`, `Column`, `comment`) VALUES ( N'dbo.publications', N'TABLE', N'Publication_id', N'The surrogate key to the Publications Table' ), ( N'dbo.publications', N'TABLE', N'title', N'the title of the publicxation' ), ( N'dbo.publications', N'TABLE', N'pub_id', N'the legacy publication key' ), ( N'dbo.publications', N'TABLE', N'notes', N'any notes about this publication' ), ( N'dbo.publications', N'TABLE', N'pubdate', N'the date that it was published' ), --< most of the data excised. >-- ( N'dbo.titleauthor', N'TABLE', N'au_id', N'Foreign key to the author' ), ( N'dbo.titleauthor', N'TABLE', N'title_id', N'Foreign key to the publication' ), ( N'dbo.titleauthor', N'TABLE', N'au_ord', N' the order in which authors are listed' ), ( N'dbo.titleauthor', N'TABLE', N'royaltyper', N'the royalty percentage figure' ), ( N'dbo.titleview', N'VIEW', N'title', N'the name of the title' ), ( N'dbo.titleview', N'VIEW', N'au_ord', N'order in which the authors are listed' ), ( N'dbo.titleview', N'VIEW', N'au_lname', N'author last name' ), ( N'dbo.titleview', N'VIEW', N'price', N'the price of the title' ), ( N'dbo.titleview', N'VIEW', N'ytd_sales', N'year to date sales' ), ( N'dbo.titleview', N'VIEW', N'pub_id', N'the id if the publisher' ), ( N'dbo.titles', N'VIEW', N'title_id', N'The primary key to the Titles table' ), ( N'dbo.titles', N'VIEW', N'title', N'the name of the title' ), ( N'dbo.titles', N'VIEW', N'Type', N'the type/tag' ), ( N'dbo.titles', N'VIEW', N'pub_id', N'the id of the publisher' ), ( N'dbo.titles', N'VIEW', N'price', N'the price of the publication' ), ( N'dbo.titles', N'VIEW', N'advance', N'the advance' ), ( N'dbo.titles', N'VIEW', N'royalty', N'the royalty' ), ( N'dbo.titles', N'VIEW', N'ytd_sales', N'Year to date sales for the title' ), ( N'dbo.titles', N'VIEW', N'notes', N'Notes about the title' ), ( N'dbo.titles', N'VIEW', N'pubdate', N'Date of publication' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'publisher', N'Name of the publisher' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'AudioBook', N'audiobook sales' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'Book', N'Book sales' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'Calendar', N'Calendar sales' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'Ebook', N'Ebook sales' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'Hardback', N'Hardback sales' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'Map', N'Map sales' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'PaperBack', N'Paperback sales' ), ( N'dbo.PublishersByPublicationType', N'VIEW', N'total', N'Total sales' ), ( N'dbo.TitlesAndEditionsByPublisher', N'VIEW', N'Publisher', N'Name of publisher' ), ( N'dbo.TitlesAndEditionsByPublisher', N'VIEW', N'Title', N'the name of the title' ), ( N'dbo.TitlesAndEditionsByPublisher', N'VIEW', N'Listofeditions', N'a list of editions with its price' ) ; -- now we determine what comments need to be inserted or updated into a table -- and insert the results into a temporary table WhatTableToDocument CREATE TEMPORARY TABLE WhatTableToDocument ( TheOrder int NOT NULL AUTO_INCREMENT PRIMARY KEY, Tablename varchar(80), TheDescription VARCHAR(255)); INSERT INTO TableCommentsWanted (Tablename, TheDescription) SELECT Tablename, TheDescription FROM information_schema.`TABLES` INNER JOIN TableCommentsWanted on CONCAT (TABLE_SCHEMA,'.',TABLE_NAME)=TableName AND TABLE_TYPE = 'BASE TABLE' WHERE table_comment <> TheDescription; -- We now need to insert these comments into the tables -- we'll do it one at a time so it requirese several prepared sttements Delimiter $$ CREATE PROCEDURE CommentEachTable() BEGIN set @iiMax = 0; set @ii = 0; SELECT COUNT(*) FROM WhatTableToDocument INTO @iiMax; SET @ii=1; WHILE @ii <= @iiMax DO SET @TABLE=(Select tablename FROM WhatTableToDocument WHERE TheOrder=@ii); SET @COMMENT=(Select TheDescription FROM WhatTableToDocument WHERE TheOrder=@ii); SET @sqlstmt = CONCAT( 'ALTER TABLE ',@Table,' COMMENT ''',@COMMENT,''';'); PREPARE st FROM @sqlstmt; EXECUTE st; DEALLOCATE PREPARE st; SET @ii = @ii + 1; END WHILE; END; $$ DELIMITER ; CALL CommentEachTable; -- execute the temporary procedure /*now we need to insert or update all the comments. This is a bit more complicated because we need to recreate the column exactly as it existed with the addition of the comment */ Create temporary table WhatColumnToDocument ( TheOrder int NOT NULL AUTO_INCREMENT PRIMARY KEY, Theexpression VARCHAR(255)); INSERT INTO WhatColumnToDocument (Theexpression) SELECT CONCAT('ALTER ', comments.type, ' `', TABLE_SCHEMA, '`.`', table_name, '` CHANGE `', column_name, '` `', column_name, '` ', column_type, ' ', IF(collation_name IS NOT NULL,CONCAT ('COLLATE ',collation_name,' '),' '), IF(is_nullable = 'YES', '' , 'NOT NULL '), case when COLUMN_DEFAULT IS NULL then '' ELSE CONCAT (' DEFAULT (',COLUMN_DEFAULT,')') END, extra, ' COMMENT \'', REPLACE(`comment`,'''',''''''), '\' ;') as script FROM ColumnCommentsWanted comments INNER JOIN information_schema.`COLUMNS` ON comments.column=`COLUMNS`.COLUMN_NAME and TABLE_SCHEMA = DATABASE() AND TYPE <> 'VIEW' AND CONCAT (TABLE_SCHEMA,'.',TABLE_NAME)=TableObjectName WHERE COLUMN_COMMENT <> COMMENT; /* now a procedure to do each column that needs to be updated */ Delimiter $$ CREATE PROCEDURE CommentPerRow() BEGIN set @iiMax = 0; set @ii = 0; SELECT COUNT(*) FROM WhatColumnToDocument INTO @iiMax; SET @ii=1; WHILE @ii <= @iiMax DO SET @expression=(Select Theexpression FROM WhatColumnToDocument WHERE TheOrder=@ii); PREPARE st FROM @expression; EXECUTE st; DEALLOCATE PREPARE st; SET @ii = @ii + 1; END WHILE; END; $$ DELIMITER ; -- now mop up if necessary CALL CommentPerRow; DROP PROCEDURE CommentPerRow; DROP PROCEDURE CommentEachTable; -- temporary tables will only last as long as the session is alive. |
Reviewing the documentation
If you want to try this out, I’ve provided an example PubsMySQL project as part of my Flyway Teamwork framework. The above afterMigrate__ApplyTableDescriptions.sql
callback script is included in the Migrations folder for this project so all you need to do is run a migration and the callback will automatically add the documentation specified by the script to the database objects.
We can then execute the $ExecuteTableDocumentationReport
as a post-migration task and it should generate a documentation report (TableDocumentationReport.json) for each final version of any migration run and save it in the Versions/…/Reports folder of the project.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
cd <MyPathTo>\FlywayDevelopments\PubsMySQL . '.\preliminary.ps1' Write-Output @" Processing the $($dbDetails.variant) variant of $($dbDetails.branch) branch of the $($dbDetails.project) project using $($dbDetails.database) database on $($dbDetails.server) server with user $($dbDetails.installedBy)" "@ flyway migrate #Set-Alias MySQL $MySQLAlias -Scope local #Set-Alias MySQLDump $MySQLDumpAlias -Scope Script $PostMigrationTasks = @( $GetCurrentVersion, #checks the database and gets the current version number $ExecuteTableDocumentationReport ) Process-FlywayTasks $dbDetails $PostMigrationTasks |
The resulting documentation should look like this:
Generating a build script with comments
If, having made your project your current working directory and executed ‘preliminary’, as shown in the previous script, you can also then execute this PowerShell to get a ‘build script’ for the current version, containing the DDL of all your tables complete with lots of informative comments:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
$GetListOfTableTypes=@" SELECT CONCAT ('SHOW CREATE ', TYPE,' ',ObjectName,';') as SQLCode FROM (SELECT concat(tables.table_schema,'.',tables.TABLE_NAME) AS ObjectName, case when v.Table_Name IS NULL then 'TABLE' ELSE 'VIEW' END AS "Type" FROM information_schema.`TABLES` LEFT OUTER JOIN information_schema.views v ON tables.TABLE_NAME=v.Table_Name AND v.table_Schema=tables.table_Schema WHERE tables.table_schema IN ('dbo','people') AND tables.TABLE_NAME <> 'flyway_schema_history' And v.Table_Name IS NULL ) as f "@ Execute-SQL $dbDetails $GetListOfTableTypes| ConvertFrom-json| foreach { Execute-SQL $dbDetails "$($_.sqlcode)"}|convertfrom-json| foreach {$_.'Create Table'}| foreach{"`n$($_.replace(', ',",`n "));"}>"$($dbDetails.reportLocation)\current\TableBuildScript.sql" |
Summary
We’ve illustrated a solution to the problem of applying comments easily to a database version in Flyway without changing its version. Comments should never represent a change to a database. They are there for ‘decoration’ because they don’t affect the workings of the database. A comment doesn’t change data or a process. Sure, they make a database easier to understand and maintain if they are done properly.
If you use structured comments, they can be very informative, and are useful in prompting the busy developer to provide all the necessary information, such as, in the case of a function, what it does, how to use it, and why, what all the parameters do and what the output is, who created it and when.
All that remains is to produce a version of this callback for all the relational database systems supported by Flyway!
Tools in this post
Flyway Teams
Ideal for organizations looking to improve collaboration and fine tune their processes during development and the deployment of database changes.