Deploying the same database to many different RDBMSs

With the idea of a generic Dacpac defined by international standard, comes the potential for a Visual Studio developer to use SSDT to create a generic database model to a SQL-92 compliant standard that can then be deployed to any one of the major RDBMSs. The same database model would be deployable to Oracle, MySQL, or SQL Server, for example. Professor Hugh Bin-Haad explains the reasoning and technology behind this.

One of the major problems for developers of a database application is that they have to commit to a particular database when they start development. The syntax of Oracle, PostgreSQL, MySQL, and SQL Server are different enough to require complete rewrites if a tactical decision is taken to change the provider of the RDBMS. Currently, the developer is forced to adopt an ORM that supports all the major database providers to provide a level of abstraction. Although this will work if the database is merely used to hold state about business processes, provide persistence for objects and implement behaviour, more is required if the application is to benefit from the power of a relational system.

As part of my work on the ANSI X5T6 Database Standards Committee, there has been a lot of discussion on the adoption of Microsoft’s DACPAC standard as a standard across all the major relational platforms such as SQL Server, Oracle, MariaDB, MySQL and PostgreSQL. This will allow Visual Studio user to create and develop databases in SSDT which would then be deployable to any major RDBMS.

Currently, DACPACS are zipped files that contain a simple build SQL script, consisting of CREATE statements in the correct reverse-dependency order, an XML-based model of the metadata (database schema), and JSON files of whatever static data is required. Currently, the build script uses the SQL Server dialect, and the XML model contains the definition of SQL Server objects and a string representation of the contents of each routine.

Microsoft has suggested that it is possible to create a DACPAC convention that conforms to the ODBC SQL 3.8 extended grammar. This would allow a DACPAC to be prepared for any ODBC-compliant data source. The Build script would conform to SQL-92 Grammar and the XML model of the schema would model the database as a SQL-92-compliant database, but accept nothing beyond this.

Any ODBC driver will conform to one of three levels of SQL Grammar, the minimum grammar (a subset of the Entry level syntax of SQL-92 – FIPS 127-2.), core grammar, or the extended grammar (SQL 92 compliant). Even the ODBC text driver, with its minimum SQL Grammar will allow SQL Statements to be used to select, delete, update or insert rows. The extended grammar allows the common denominator of the SQL Grammar to be used.

By using DACPAC files that are compliant with this ODBC SQL Extended Grammar, databases, both build and data, can be extracted from, and deployed to, any relational database that conforms to SQL-92 and has an ODBC driver to the same level. The ODBC objects are mapped to the actual RDBMS implementation. This means that SQLPackage.exe and DACFx 3 would be able to deploy to any database for which an ODBC driver exists. Naturally, a SQL Server developer would be able to use the native SQL Server syntax as usual, but the DACPAC that is created would not be generic, and would fail against an Oracle database.

The current problem is that, while ODBC and JDBC has been adopted enthusiastically outside Microsoft, the SQL Grammar defined by ODBC has not progressed beyond SQL-92, which means that many features will not be deployable by a generic DACPAC.

Microsoft are keen on including Microsoft Access in the list of major RDBMS, and would like the DACPAC to be able to deploy to Access. This has led to a certain resistance to standardising on a procedural language such as TSQL or PL-SQL. Microsoft see Access as a key component in their database strategy, in the form of “Big Access” which would connect to cloud data sources, spreadsheets, enterprise data sources and log files, providing functionality for data visualisation, assessing data quality, data blending, and data shaping. However, the development time to bring Access up to current SQL Standards would be prohibitive.

The work needed to define an ODBC standard to SQL 2008 for SQL Server, PostgreSQL, and Oracle would seem fairly achievable but MYSQL and Access would require a major investment in the ODBC drivers. For text and Excel, the task would seem enormous, though there would be little call for a DACPAC that could deploy a spreadsheet or directory of text files.

The discussions of the relevant standards committees have ranged around the possibility of defining DACPACs to three levels, ridiculous, ho-hum and reasonable. This would allow the developer in Visual Studio to develop for a generic database and then select the appropriate vendor to deploy the database to at the point of release, and perhaps test the same database on each rival platform to see which best supported the database application.

It is probable that there will be issues to be resolved and technical issues will require working parties to explore solutions to report back in time. This type of standard requires a measured approach, allowing time for the major vendors to respond fully to the various issues that are flagged up. This means that, allowing for the working parties to report back, for various European bodies to sign up to the recommendations, a working standard will be achievable by 2030, but of course, because this is an April Fool’s gag, it won’t happen anyway.