Deploying Database Developments

When a team is developing a database application, it is a mistake to believe that deployment is a simple task. It isn't.  It has to be planned, and scripted. Alexander Karmanov describes many of the problems you're likely to meet, and provides an example solution that aims to save the DBA from the nightmare complexity of an unplanned deployment.


  1. Deploying Database Developments
  2. Purpose
    1. Goals
    2. Incremental vs. Full Build
    3. Database environment
    4. Build number
    5. Change control (authority)
  3. Incremental update
    1. Change-Log
    2. Incremental Build Mechanism
  4. Database creation
    1. Full Build Mechanism
  5. Database Development
    1. Source code categories
    2. Database project configuration file structure
    3. Database project folder structure
    4. Database development process
      1. Security Configuration
      2. User mapping file
      3. Storage Architecture
      4. Code Development
      5. Database Name Mapping
      6. Database Schema and Data Changes Implementation/Incremental Build Process Integration
      7. Full Build Scripts Generation
      8. Fully vs. Incrementally Built Database Testing
      9. Unit-testing
    5. Model-driven Development and Code Generation
    6. [Nightly] Database Build Process
  6. Installation Packager Integration
  7. Conclusion


After the initial deployment of a database in an IT project, there is often insufficient thought  given to subsequently deploying the inevitable  changes in the database’s structure, configuration, fixes and so on. 

It is not uncommon to find a development in which only one copy of the database exists. Changes to the database’s design are made directly on it, using Management Studio/Query Analyzer. All Test or local database copies are then created from backups of that single copy.

For larger-scale developments, database objects are stored in the form of scripts lodged in a source control system such as Visual SourceSafe, Team Foundation Server or SVN/Subversion.

In both cases, the build process assumes that the database is created from scratch, from the scripts. The synchronization of the development and production database schemas (the deployment process) is typically considered to be a minor problem, and so is usually left until the last moment.

Both these development techniques put the burden on the DBA’s shoulders. He must manually apply the scripts in the proper order, collect the changes, and maintain the history of changes from the development team. The most daunting task is that of synchronizing with the production database system. Often, the latest changes must be rapidly propagated to many servers, whilst keeping these servers in a consistent state. Additional issues are typically discovered along the way: testers may want to receive the latest changes to the stored procedures but keep their data intact, production deployment could require a sudden change of the folders used for the database files, and so on.

There are already several tools that help DBAs with this type of work – Red Gate, SQL Farms, and many others provide tools to identify differences in database structure and data and then execute TSQL scripts against multiple servers to synchronize them. These tools do a decent job in creating scripts that will eliminate any differences in either structure or data, but for me the biggest problem with these tools is that the scripts they produce are usually applicable to a particular combination of source and target databases. Usually there is no time or place to test them thoroughly. As a result the production database may still differ from one used for tests.

Some of the tools are able to detect the difference at the deployment time, but isn’t it then far too late?

It scares me to think that it is the tool that is making the decisions about what column to add or to drop during the production deployment, without a person to analyze and approve such changes. Moreover, doing that at the very last moment makes each deployment unique and unrepeatable; you cannot be confident of a consistent result.

What I want to describe here is an approach I use currently and which has proved to be reliable in several projects.


Goals

The most important aim was to smooth out the process of making changes to the existing databases.

We also needed to

  • store the database project in a source control system
  • create a database from scratch in any known state (build), including the most current one
  • make it easy to configure any environment-specific parameters, removing all environment-specific statements from the code/scripts
  • make the security configuration clear and easily maintainable
  • decrease the need for a high level of special knowledge in the deployment personnel
  • combine all the steps to creating a database into a single script

We needed  a simple way of deploying the databases that my colleagues and I developed. We wanted to ensure that the deployment of changes to the database would be smooth, without needing to search for the correct files in the developer’s machines or in the source control system.

Ideally, the process of creating or updating the application environment should not require  the attention  of  any developer. Any person who is interested in getting his database box up to date or building it from scratch should be able to do it by himself. No DBA knowledge should be required from that person, no T-SQL skills, no database structure knowledge, nothing. All that should be required is  the name of the target SQL Server and a login with sufficient rights. The rest should be preconfigured.

There should be a way to recreate a database as it was at a particular moment, or point in time, such as the release state in a previous build.

The database state should be easily identifiable. Two databases carrying the same build number should be identical in their structure, code and static data.

Parameters that are specific to the server’s environment, such as the location of the database files, their size, growth rates, etc. must be easy to configure. Change of the configuration should not involve any code or script change. I can speak from bitter experience, in which the branch of the code that involved the production server name was never executed, or tested, until the production environment was created. It may sound like common sense  to separate data and code, but in SQL Server development it is not always that common.

For the same reason, security settings (logins, roles, database object permissions) must also be loaded from configuration files rather than being hard-coded in T-SQL scripts.

The deployments must also be repeatable; that is to say, actions that are needed to bring a database from one state to another should be the same in any server environment.

Incremental vs. Full Build

There are two alternative tasks when deploying a database: a full database build from scratch and an incremental upgrade of an existing database.

A full database build involves several steps: you must create the database and the schema objects such as tables and procedures, define permissions, and load the initial data. The final step is to stamp the database with a build number.

An incremental build must retrieve the build number from the target database and use the Change-Log (the incremental builds history log) to determine which database structure change scripts must be applied. It then re-deploys all the stored procedures and redefines permissions as per the configuration files. Finally, it also stamps the database with the new build number.

The structure and code of the database is the same regardless of the way it is built.

Both scenarios share some database project files and each has some specific ones. The way the files are stored and how the project is configured is described below.

Database environment

A database environment is a single unit consisting of one or more databases.

This set of databases is placed into a single SQL Server instance, deployed and updated always together, and never separated. Because of this, a system that comprises several databases is best treated as a unit for deployment, maintenance and development  purposes.

Treating several databases as if they were a tightly integrated unit isn’t always appropriate, as it inevitably limits the flexibility of a solution. One should consider this at the very beginning of development, since retrospectively splitting the environment may be quite complex. I have done it a few times – split one database into many and combined them back. It is certainly possible, and the process that I’m about to describe helped a lot.

Build number

The Build number is an arbitrary string value that identifies the state of the database structure, code and initial data. All databases with the same build number must have an identical database structure, the same code and  static data.

The idea is to have some value, such as a string or number, which could serve as an identifier. It is not, strictly, necessary to have an incrementing value because the order of the builds is defined by the Change-Log rather than the build number value. Nevertheless, it is worth assigning numbers or characters to the builds in some order so you can see at a glance which is the later of two builds.

At one time, we used a table to store this value. It helped to track deployments, but some projects tolerate no changes in database structure: one could not add any table without extensive modeling work to get it through the review boards and other processes. So the current set of scripts uses an extended database property called ‘database build number’. The actual name of the property is arbitrary. It is easier to justify the existence of an extended database property to the client than a new table.
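For illustration, stamping and reading such a property can be done with the standard extended-property procedures. This is only a sketch: the property name follows the convention described above, and the build number value is an arbitrary example.

```sql
-- Stamp the database with a build number (run in the target database).
-- The value '1.0.0023' is an example only; the format is project-defined.
IF EXISTS (SELECT * FROM ::fn_listextendedproperty(N'database build number',
           NULL, NULL, NULL, NULL, NULL, NULL))
    EXEC sp_updateextendedproperty @name = N'database build number', @value = N'1.0.0023';
ELSE
    EXEC sp_addextendedproperty @name = N'database build number', @value = N'1.0.0023';
GO

-- Read the current build number, e.g. at the start of an incremental build:
SELECT value AS [database build number]
FROM ::fn_listextendedproperty(N'database build number',
     NULL, NULL, NULL, NULL, NULL, NULL);
```

The `::` prefix on the function call keeps the script compatible with SQL Server 2000 as well as 2005.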

Change control (authority)

For this solution to work, no changes to the structure or code of the database(s) may be made other than those delivered by the deployment mechanism. A database changed outside the process would no longer conform to any existing build: it would have to be given a special build number, which would require proper analysis, and the development team would need to be notified.

As a protection against such side hacks, a DDL trigger ought to be created in the database to modify the build number whenever an out-of-band change is made, so that it would no longer match any build recorded in the Change-Log. This is not yet in the current implementation.
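Although not part of the current implementation, such a trigger could be sketched as follows (SQL Server 2005 or later; the trigger name and the ‘modified’ marker value are assumptions):

```sql
-- Hypothetical sketch: invalidate the build number whenever a schema object
-- is changed outside the deployment mechanism.
CREATE TRIGGER trg_InvalidateBuildNumber
ON DATABASE
FOR DDL_DATABASE_LEVEL_EVENTS
AS
BEGIN
    -- Mark the database as no longer matching any build in the Change-Log.
    EXEC sp_updateextendedproperty
         @name  = N'database build number',
         @value = N'MODIFIED OUTSIDE DEPLOYMENT';
END;
GO
-- The deployment mechanism itself would have to step around the trigger:
-- DISABLE TRIGGER trg_InvalidateBuildNumber ON DATABASE;  -- before a build
-- ENABLE  TRIGGER trg_InvalidateBuildNumber ON DATABASE;  -- after a build
```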

Incremental update

Let’s first start with the process of deploying modifications to an existing database, before we go on to describe the initial creation of the database environment.

At this point the database structure is known and its state is identified by a build number.

All the changes in structure are coded in DDL and DML that assumes a particular existing state of the structure and data of the database. The order in which they are applied is known and will be the same in all environments.

Code – stored procedures, user-defined functions and triggers not bound to the schema – is expected to be re-deployed in full, as is the security configuration.


Change-Log

The change log is used to track all the changes made to the database structure and static data.

Here is an example of the Change-Log file structure:
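The original listing is not reproduced here, but based on the description that follows, a hypothetical Change-Log might look like this (all element and attribute names are assumptions; file references are relative to the Change-Log location):

```xml
<!-- Hypothetical structure; element and attribute names are illustrative only. -->
<ChangeLog>
  <Build Number="1.0.0021">
    <Step Type="SQLCMD" File="1.0.0021\AddCustomerTable.sql"      DBName="AdventureWorks" />
    <Step Type="OSQL"   File="1.0.0021\PopulateCustomerTypes.sql" DBName="AdventureWorks" />
  </Build>
  <Build Number="1.0.0022">
    <!-- A build with no structural or data changes: tracked for numbering only. -->
  </Build>
  <Build Number="1.0.0023">
    <Step Type="List"  File="1.0.0023\Steps.xml" />
    <Step Type="Batch" File="1.0.0023\LoadReferenceData.cmd" />
  </Build>
</ChangeLog>
```

The `Type` attribute illustrates the four kinds of step discussed below: another script list, T-SQL run via OSQL or SQLCMD, and a Windows batch file.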

Whether the change log is updated by the developers or by the build master is defined by the project team.

In our current implementation, all references to files are relative to the configuration file location. In the Change-Log example, database change scripts are stored in subfolders of the folder where the Change-Log is stored; each build’s scripts are put into a single subfolder named after the build.

The change log, like any other script list used in the process, is not restricted to T-SQL scripts. Currently there are four types of script: another script list, T-SQL scripts to be run with the OSQL utility, T-SQL scripts to be run with the SQLCMD utility, and Windows batch files. The last is needed to run SSIS packages, copy any necessary files, create folders, and so on. There is a way to pass parameters into the batch files. This scheme also handles any number of scripts in the project – no limit is defined for the number of levels or files.

Build entries may have no steps inside them; such entries are used to track builds that contained no structural or data changes, only code changes, for example. Another reason may be to change the build number so as to match the application build numbering. Unless you assign an appropriate build number to the deployed database, there will be no easy way to identify its position in the Change-Log. Of course, developers or the build master could add the missing build numbers to the Change-Log, and the DBA could stamp the database with some known build number, but it is better to avoid such inconsistencies in the first place.

The requirements of the project determine how granular the Change-Log is (how often new builds are created). In one of our projects, it was strictly defined by the deployment plan, so that no build numbers other than those listed in that document would appear in the Change-Log. That meant that we added changes to the latest build, and always upgraded some build previously submitted for testing/deployment. In other projects we built more frequently, so that the minor build version element was in our control – we could add something every other hour.

Incremental Build Mechanism

The Incremental Build mechanism gets the build number from the target database environment. If the environment consists of two or more databases, they all should have the same build number, otherwise the build process stops with an error.

The build number retrieved from the target database environment is then looked up in the Change-Log. If that build number is not found in the Change-Log, the deployment stops with an error – no starting point has been found. If the build number is found, all the scripts from the builds after that build are applied. Builds and scripts are arranged in the Change-Log in the order they are supposed to be applied, so the order is the same in any environment.

After the database structure and data changes are applied, the system recreates all the stored procedures specified in the database Environment-Configuration Files as ‘re-deployable’ (see the configuration file description below). The scripts creating them are executed, so these scripts must take care to replace the existing version of each procedure by dropping it before attempting to create the new version. I prefer to drop all the code objects at once and then recreate them. This ensures that the environments always have the same set of procedures and that no obsolete code (removed from the code base) is left in the database. Unfortunately, that point of view is not always shared; in that case we just recreate the procedures listed, and scripts removing obsolete procedures are delivered as structural changes via incremental scripts in the Change-Log.

After the changes to the database objects are complete, we set up the permissions. All permissions are revoked from the objects and logins, and then recreated from the security configuration files. In this way, the security configuration is refreshed and any unsanctioned access to the database is eliminated. Existing logins and roles are not dropped by this process: if the permission configuration file does not contain anything for them, they are left as orphans and must be removed either manually or by a clean-up script included in the Change-Log. This is mostly justified by the fact that logins may have passwords different from those in the configuration file. Passwords are stored in the configuration file as plain text, so after deployment to secure environments they must be changed manually by a DBA.

Here is a diagram showing how the incremental database update process delivers the changes to a target environment:


Database creation

The second process is that of creating a database from scratch.

Unlike the incremental update, the target environment does not already contain databases from the project being deployed.

Databases are created according to information in the Environment-Configuration File in which four configuration files are specified for each database:

File layout

This refers to a configuration file containing a list of filegroups that must be created in the database. It also provides a definition of the physical files for known SQL Server instances. Instances with identical layouts are grouped together as environments. If the instance name is not found in the configuration file, its filegroups and log file are created in the folder specified as the default for the target instance.
The file also provides some configuration options to be set for the database, in the form of attribute name-value pairs. Each pair also allows a condition – an expression evaluating to 1 or 0. For example, it may be used to detect whether a SQL Server 2000 instance has SP3 applied.
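As an illustration, a hypothetical file-layout configuration might look like this (all element and attribute names, paths and sizes are assumptions):

```xml
<!-- Hypothetical structure; names, paths and sizes are illustrative only. -->
<FileLayout>
  <Environment Name="Production" Instances="PRODSQL01;PRODSQL02">
    <FileGroup Name="PRIMARY" File="E:\Data\AdventureWorks.mdf"    Size="500MB" Growth="100MB" />
    <FileGroup Name="INDEXES" File="F:\Data\AdventureWorks_ix.ndf" Size="200MB" Growth="50MB" />
    <Log File="G:\Logs\AdventureWorks_log.ldf" Size="100MB" Growth="10%" />
  </Environment>
  <!-- Database options as name-value pairs, with an optional condition. -->
  <Options>
    <Option Name="AUTO_UPDATE_STATISTICS" Value="ON" />
    <Option Name="DB_CHAINING" Value="OFF"
            Condition="CAST(SERVERPROPERTY('ProductLevel') AS varchar(8)) >= 'SP3'" />
  </Options>
</FileLayout>
```

An instance not matching any listed environment would fall back to the instance’s default file and log locations, as described above.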

Object and data creation script

This points to a list of scripts which must be executed to create the database objects and to populate the database with initial data.

A script generating full build database schema creation script

As described below, the script that creates the database objects during a full build is derived from an existing, incrementally built database. To avoid manual work and to provide consistent results, this operation must itself be implemented as [a set of] scripts.

Database roles and permissions granted to them

The configuration file specifies the role names; for each role, a list of object-name/permission pairs is given.

For the whole environment there is a build number element – that number is used to label the database(s) built from scratch.

A login/database-user mapping configuration file is also specified for the environment. The logins it lists are created at the moment of database creation or update and mapped to particular database users (thus granting access to those databases), together with the list of roles each database user is assigned to.

Full Build Mechanism

The Full Build mechanism checks the target SQL Server instance to see whether any of the databases it is about to build already exist. If any of them does, it stops with an error.

After that, it creates the databases using the file layout information provided for each database. It tries to find the instance name among the environments listed in that file. If the instance is found, it uses the file parameters provided; otherwise it creates the filegroups using the default database file and log folder, size, growth, etc.

Once the database is created, its objects are created by running the object and data-creation script. That script is a list referring to other (lists of) scripts – T-SQL and command shell – which create the database objects and load the static data. Before executing those scripts, the mechanism checks that all the files exist; if any file is missing, the process stops with an error.

After database objects are created, security is configured for the database. The roles are created and the listed permissions are granted.

If there is a requirement to load some initial data for test purposes, or to load the database from external sources, then additional commands may be bundled with the full build mechanism call in a command (batch) file. Such commands may execute SSIS packages or other programs, or start a SQL Server job to load the data.
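A hypothetical wrapper batch file might look like this (the script and package names are assumptions, not part of the described implementation):

```bat
@echo off
rem 1. Build the environment from scratch on the target instance (passed as %1).
call FullBuild.cmd %1
if errorlevel 1 exit /b 1

rem 2. Load test data from an external source via an SSIS package.
rem    The target server would typically be supplied to the package through
rem    a package configuration rather than hard-coded.
dtexec /F "LoadTestData.dtsx"
if errorlevel 1 exit /b 1

echo Environment built and test data loaded on %1
```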

Here is a diagram that shows how to build the database from scratch in a target environment:



Database Development

In order to make the processes described here work, some principles must be followed.

Primarily they are related to the way script files are organized in the database project folder and to the contents of the scripts.

Source code categories

Database source code is split into six logical categories

  1. Database schema
    This includes all the objects that relate to data storage: tables, views, indexes, constraints, user-defined data types, and user-defined functions (but only if they are used in views or table definitions – default or check constraints, computed column expressions). Any change in these objects affects the data stored in the database.
  2. Code
    This  includes all the objects manipulating or retrieving data: stored procedures, user-defined functions (not used in the schema), and triggers. It also includes CLR code.
    Changes in these objects do not affect the data immediately.
    Triggers are included in this category because they are considered to be a kind of stored procedure called in response to an event. Changing a trigger does not change the data in the table it is attached to.
  3. Data
    This category includes the scripts that insert initial data at database creation, SSIS packages, and any other programs used to populate the database from external sources.
  4. Physical file layout
    This includes the information needed to create the empty database. It specifies the filegroups. For specific environments, different files may be configured – with different names, sizes, etc. Environments are identified by SQL Server instance name.
  5. Security
    This describes database-level roles and the permissions granted to them, the server logins, which databases these logins can access, which database user names these logins should use, and what roles they are assigned in each database.
  6. Change-Log
    This provides the information necessary to upgrade a database environment from any state to the latest, current one. The scope of the category is limited to the physical file layout and to database and data changes, but it can be extended to code and security (though this is not recommended).

This categorization is fairly standard. One could always categorize code objects as part of the database schema; that would mean all changes to them are delivered through the Change-Log. While this is possible, it is much easier to have a single copy of a stored procedure’s (re-)creation script than to search for its current state in the log. The same is true for the data – it is easier to maintain a stand-alone script or data-loading package.

Database project configuration file structure

The deployment of a project requires several configuration files. All of them are accessible from the full/incremental build script. Altogether they provide enough flexibility to describe almost any database project:


The second row of the configuration files is described above in the Database creation section.

See the Full Build and Incremental Build sections above.

Configuration files and script files below that level (referred to by those in the picture) are either T-SQL script files or command shell batch files. There is no mandatory file extension – the method by which they are executed is specified by the Type attribute of the reference, not by the file name.

Database project folder structure

Below is a sample folder structure for the AdventureWorks database that is provided with SQL Server.

The content of some folders is abbreviated to keep the listing on a single page – there are just too many files.


Incremental build folders have been added as an illustration.

Database development process

The development process follows the source code categories.

The overall process can be pictured as follows:



Security Configuration

The security configuration must be specified by the application developers in the project team. The connection characteristics are identified: what connections must exist, what roles are played by whom, what objects should be accessible to these roles, etc. For each database, a list of roles with permissions is created and, for the overall environment, a list of logins and their mappings to database users is created.

Since the deliverables are configuration files (XML documents), they are usually easily mapped to (or derived from) existing documents. In any case, they provide a human-readable format which can be easily understood by most project team members and stakeholders.

The deployment mechanism relies on the existence of the objects in the database that the configuration file refers to. If the objects do not exist, the deployment will stop with an error.

Here is an example of the role and permission configuration file:
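The original listing is not reproduced here; a hypothetical role and permission file might look like this (all element, role and object names are assumptions):

```xml
<!-- Hypothetical structure; names are illustrative only. -->
<DatabaseSecurity DBName="AdventureWorks">
  <Role Name="SalesReader">
    <Permission Object="Sales.SalesOrderHeader"       Type="SELECT" />
    <Permission Object="Sales.usp_GetOrdersByCustomer" Type="EXECUTE" />
  </Role>
  <Role Name="SalesWriter">
    <Permission Object="Sales.usp_InsertOrder" Type="EXECUTE" />
  </Role>
</DatabaseSecurity>
```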

Here is an example of the login/database user mapping file:
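The original listing is not reproduced here; a hypothetical login/database-user mapping file might look like this (all names, and the plain-text password, are illustrative):

```xml
<!-- Hypothetical structure; names are illustrative only. -->
<Logins>
  <Login Name="SalesAppLogin" Password="changeme">
    <!-- The password is plain text here; it must be changed by a DBA after
         deployment to a secure environment, as noted earlier. -->
    <Database Name="AdventureWorks" User="SalesApp">
      <Role Name="SalesReader" />
      <Role Name="SalesWriter" />
    </Database>
  </Login>
</Logins>
```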

Storage Architecture

The storage architecture for the production databases is usually documented as part of the database model. This would include such things as an estimate of the eventual database size, the table-to-filegroup mapping, and the drive letters and folders used. These should be documented not only for the more important production and user-acceptance-test environments, but also for the various development environments: these might include the ‘build box’ or ‘test box’ configurations, which allow the deployment script to be tested – that is, to exercise the branch of the deployment script responsible for creating database files as per the configuration file.

If the particular environment is not found in the configuration file, the databases are created with all the default properties defined for the SQL Server instance.

No instance configuration is changed by the deployment mechanism; it is responsible only for the databases it creates or updates.

Code Development

Code put in the database folders should comply with company or project coding standards.

From the perspective of the process we are describing, code must follow these rules:

  • All the stored procedures must be registered in the configuration file that is specified within the database object creation list; this is because the database creation mechanism has to know what it needs to call. No script is executed unless it can be determined from the Environment-Configuration File. The database in which the object should be created must be specified in the DBName attribute.
  • Code should not switch context to a particular database (except for the system ones, like master or tempdb). The reason is given in the Database Name Mapping section below. The name of the database against which a script must be executed is specified in the configuration file referring to that script.
  • Code should not refer to any other hard-coded database name. If a different database is accessed it should be done using a parameter passed from outside. The Project architecture should specify this convention  from the beginning.
  • Code should not assume any particular option set. If, for example, the procedure needs to have the  QUOTED_IDENTIFIER option set to ON, then the script should set it  and not assume any settings;
  • There should be no permission set in the code. There are dedicated configuration files responsible for the database security.
  • Scripts should not assume any physical file name (if any). Scripts must be designed to have these specified by parameters in the same way as database names.
  • Code should have a preamble that detects whether the procedure already exists and drops it if necessary: you must drop a procedure if it exists before you attempt to create it.
  • Scripts should report their execution verbosely enough, printing messages about the steps they perform (detection and deletion of an existing version of a procedure, a description of the data being inserted, etc.)
  • The code should not use an ALTER statement that assumes the existence of the procedure – it should be dropped and recreated. More generally, code should execute without error no matter how many times it is run. Data-loading scripts should check for the existence of the data they load, whether they are adding or modifying data; a script that creates a procedure should replace the existing procedure rather than fail with an error.
  • Each stored procedure, function or trigger should be put into a separate file. An exception may be made for code that has been generated automatically, as by an Entity-Relationship modelling tool; such scripts usually contain a large number of objects where no individual modification is supposed to happen, so it makes no sense to split them up. If, however, a piece of code, procedure or UDF must be customized, it should be removed from the generated code file and re-created in a separate one. That means that the segregated piece of code must henceforward be maintained separately from the rest of the generated code.
  • Each script must be referenced from the database object creation configuration file. It may be addressed directly or through another script combining similar objects. It makes sense to have a separate script list for stored procedures and functions, another for the data inserting scripts, and so on.
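A re-deployable script that follows these rules might be sketched as below (all object names are illustrative): it avoids switching databases, sets the options it needs, reports its steps, and can be run any number of times.

```sql
-- Hypothetical re-deployable script; object names are illustrative only.
-- Note: no USE statement and no hard-coded database name - the target database
-- is supplied by the configuration file entry that references this script.
SET QUOTED_IDENTIFIER ON;  -- do not assume any particular option settings
GO
PRINT 'Checking for an existing version of dbo.usp_GetCustomer...';
IF OBJECT_ID(N'dbo.usp_GetCustomer', N'P') IS NOT NULL
BEGIN
    PRINT 'Dropping existing dbo.usp_GetCustomer...';
    DROP PROCEDURE dbo.usp_GetCustomer;
END
GO
PRINT 'Creating dbo.usp_GetCustomer...';
GO
CREATE PROCEDURE dbo.usp_GetCustomer @CustomerID int
AS
    SELECT CustomerID, AccountNumber
    FROM Sales.Customer
    WHERE CustomerID = @CustomerID;
GO
-- A data-loading step is made repeatable the same way:
PRINT 'Inserting static data into dbo.CustomerType...';
IF NOT EXISTS (SELECT * FROM dbo.CustomerType WHERE TypeCode = 'R')
    INSERT dbo.CustomerType (TypeCode, Description) VALUES ('R', 'Retail');
GO
```

Note that the script grants no permissions: per the rules above, security is handled entirely by the dedicated configuration files.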

Database Name Mapping

In some environments a single SQL Server instance is shared between different teams or users. It may be used for several purposes, such as development and testing, or for different kinds of testing – regression and user-acceptance testing. In such cases the same database environment is created with the database(s) renamed.

We therefore need to provide a mechanism for easy database-name mapping.

The build mechanism may be parameterized with a configuration file with structure like this one:
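The original example is not reproduced here; a hypothetical mapping file might look like this (element and database names are assumptions):

```xml
<!-- Hypothetical structure; names are illustrative only. -->
<DatabaseNameMapping>
  <!-- "Logical" name as used in the configuration files,
       mapped to the actual name on the target instance. -->
  <Database LogicalName="AdventureWorks" ActualName="AdventureWorks_UAT" />
</DatabaseNameMapping>
```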

It maps the name of the database as it is specified in the configuration file to the name which should be used instead.

One may consider the database name(s) mentioned in all the other configuration files to be “logical” names, which this configuration file maps to “the real” ones.

In the absence of such a mapping file, the database environment creation utility uses the names that are specified in the Environment-Configuration File.

To make this mapping really work, T-SQL scripts should not explicitly switch to any particular user database (this excludes system databases like master or tempdb). Even dynamic scripts should either be parameterized from outside or derive their current database name, but never have the database name hard-coded inside.
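For dynamic SQL, the current database name can be derived at run time rather than hard-coded; a minimal sketch (the table name is illustrative):

```sql
-- Hypothetical sketch: derive the current database name instead of hard-coding it.
DECLARE @db sysname, @sql nvarchar(4000);
SET @db = DB_NAME();  -- resolves to whatever name the database was deployed under
SET @sql = N'SELECT COUNT(*) FROM ' + QUOTENAME(@db) + N'.dbo.ChangeLogHistory';
EXEC sp_executesql @sql;
```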

The deployment mechanism accepts such a file name as an optional parameter, so a developer or build master may create a set of batch files to update or create a particular environment.

There is a potential difficulty with the database-file layout configuration file. An instance in which several database environments are created using the name-mapping mechanism should not be configured in that file. This is because the physical file name is specified per instance, and there is currently no way to configure it for multiple copies of a database.

If no database file layout is specified, the data and log files will be created in the SQL Server instance's default database location, named after the database and filegroup, with the default size and file-growth settings. All these defaults must therefore be configured for the SQL Server instance, especially the default data and log file locations.

Database Schema and Data Changes Implementation/Incremental Build Process Integration

To use the incremental process, changes to the database structure are delivered to the project in the form of scripts that modify that structure. The full build script is generated later from the incrementally built database.

That means that database schema changes are added into the project in two places:

  1. Scripts to update the existing database.
    Such scripts are created by a developer, who must write them to work against the previous build's environment and modify the database to the desired state.
  2. Scripts to create the database objects in the empty database.      
    This script is generated from the database built incrementally. It is derived using a tool (described in the Full Build Scripts Generation section below), so ideally the database schema object creation scripts should not be created or modified manually.
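The interplay of build numbers and incremental scripts can be sketched as follows; the Change-Log record layout shown is illustrative, not the project's actual format:

```python
# Sketch of the incremental-update mechanism: apply only the Change-Log
# entries newer than the database's current build number, in build order.

def incremental_update(current_build, change_log, run_script):
    """change_log: iterable of (build_number, script_path)."""
    for build, script in sorted(change_log):
        if build > current_build:
            run_script(script)   # e.g. hand the file to sqlcmd
            current_build = build
    return current_build         # new build number recorded in the DB

applied = []
new_build = incremental_update(
    2,
    [(3, "003_fix_data.sql"), (1, "001_init.sql"), (2, "002_add_col.sql")],
    applied.append)
print(new_build, applied)  # 3 ['003_fix_data.sql']
```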

Data changes are put in different locations depending on the nature of the data. Changes to non-static (operational) data are delivered as incremental update scripts registered in the Change-Log. Changes to static (reference) data are delivered depending on the way that data is created and maintained in the database.

There are several possible scenarios of data update:

  1. Data loaded initially via T-SQL scripts.
    In this case, the change must be delivered in both places: the full-build script inserting data into an empty database must be modified, and an incremental script updating the existing database must be created and registered in the Change-Log.
  2. Data loaded initially, via some appropriate method, from an external source such as a delimited text file or Excel spreadsheet – a solution using SSIS, DTS or any other ETL tool.
    In that case there are two possible approaches, depending on the capabilities of the data-loading mechanism:
    1. The loading mechanism is intended to load the data only once, when the database is created. In this case changes must be delivered as in the case of the T-SQL scripts described above: the data source is modified, and an incremental script modifying the existing data is created and registered in the Change-Log.
    2. The loading mechanism is able to detect changes and deliver them appropriately. In this case the source data is modified and the loading mechanism is re-run. A command to re-run it may be put into the Change-Log.

This scenario also assumes the data source is versioned in the source control system along with the database project. Usually that is true; otherwise it becomes a data issue and does not affect the development process.

  3. Changed data is the application’s operational data.
    Such data is not loaded initially, so only a script modifying that data is created and registered in the Change-Log.
    For instance, a new version of the application changes the format of a stored identifier. The database must be updated at the proper time, synchronously with the application deployment, so a script updating that data is applied according to the Change-Log.


Full Build Scripts Generation

Whether one does an incremental or a full build, the databases produced must be the same. Keeping the two sets of scripts synchronized manually is a very error-prone process. To avoid that, structural changes are introduced into the project only in the form of scripts that incrementally update the database; the full database build script, or set of scripts, is then extracted from a database that has been upgraded from some previous build. For each database in the Environment-Configuration File, there is an element pointing to a script list that is called when the full build script is generated. In the simplest case, in which all the database schema object creation statements can be put in a single script, that script is put into the list.

In my case it is a shell batch file calling Red-Gate SQL Compare utility.

That script compares a database that has been incrementally built on some SQL Server instance with a snapshot of an empty database. Parameters tell the program not to generate objects that are created in other ways: stored procedures and other code objects not used in the database schema, roles and permissions, and so on. Dependencies are included, however, so functions used in table definitions (e.g. for computed columns and constraints) do appear in the script.
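As a sketch, the batch file's job amounts to assembling a command line like the one built below. The switch names follow SQL Compare's CLI style, but the exact names and the set of objects to exclude should be checked against the documentation for the version in use.

```python
# Illustrative only: assemble a SQL Compare command line that scripts
# an incrementally built database against an empty snapshot.

def full_build_command(server, database, empty_snapshot, out_script,
                       excludes=("StoredProcedure", "Function",
                                 "Role", "User")):
    cmd = ["SQLCompare.exe",
           "/Server1:" + server,
           "/Database1:" + database,
           "/Snapshot2:" + empty_snapshot,
           "/ScriptFile:" + out_script]
    # Code objects, roles and permissions are deployed by their own
    # scripts, so they are excluded from the generated schema script.
    cmd += ["/Exclude:" + e for e in excludes]
    return cmd

print(full_build_command("SQL01", "AdvWorks_Nightly",
                         "empty.snp", "create_schema.sql"))
```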

Certainly, that script will need tailoring for each specific database. For example, if XML schema collections were loaded from external files, their creation statements would be excluded by using the option /exclude:XmlSchemaCollection. But since they simply end up in a script, it is acceptable to script them from the database, even if we lose some comments on them. Another good example is assembly creation: assemblies may be introduced into the database via the incremental build process and then appear in the full build script as a long hex string.


Fully vs. Incrementally Built Database Testing

After the build is completed, the environment that has been created from scratch is compared with the environment that has been built incrementally from some previous build. SQL Compare is used for such comparison (any other tool with similar functionality may be used.)
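As a toy stand-in for that verification step (the real comparison is done by SQL Compare), each "schema" below is just a dict of object name to creation script, and any discrepancy between the fully and incrementally built databases is reported:

```python
# Report objects present in only one build, and objects whose
# definitions differ between the two builds.

def schema_diff(full, incremental):
    only_full = sorted(set(full) - set(incremental))
    only_incr = sorted(set(incremental) - set(full))
    changed = sorted(name for name in set(full) & set(incremental)
                     if full[name] != incremental[name])
    return only_full, only_incr, changed

full = {"dbo.Orders": "CREATE TABLE ... v2", "dbo.Audit": "CREATE TABLE ..."}
incr = {"dbo.Orders": "CREATE TABLE ... v1"}
print(schema_diff(full, incr))  # (['dbo.Audit'], [], ['dbo.Orders'])
```

A successful build is one where all three lists come back empty.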


There are a number of frameworks for unit-testing databases. Automated testing may be incorporated into the build process to ensure that nothing failed after the change.
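As an illustration of how such checks plug into a build, here is a minimal, framework-agnostic harness; `execute` stands in for any real database call (sqlcmd, ADO, a tSQLt run, and so on), and the helper name is hypothetical:

```python
# Minimal sketch of automated post-build checks; not a real framework.

def run_checks(execute, checks):
    """checks: iterable of (name, query, expected_result).
    Returns the names of the checks that failed."""
    failures = []
    for name, query, expected in checks:
        if execute(query) != expected:
            failures.append(name)
    return failures

# Stubbed example: pretend the query returns a canned result.
canned = {"SELECT COUNT(*) FROM Person.Contact": 19972}
print(run_checks(canned.get,
                 [("contacts loaded",
                   "SELECT COUNT(*) FROM Person.Contact", 19972)]))  # []
```

A non-empty result would fail the nightly build before the broken change reaches any shared environment.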

Model-driven Development and Code Generation

A model-driven approach derives the database directly from an Entity-Relationship model, such as one maintained in ERwin.

In our projects, database models were used to define the database structure as it should appear at a particular build. While ERwin features a model-comparison option, it was not that convenient to use.

Whenever the model was changed, the script generated from the model was used to create a new database. That database was then compared against the database from the previous build (using SQL Compare), and the differences were put into a script, or a set of them, depending on the scope of the changes and any existing data issues. These scripts were enhanced with such things as diagnostic messages, some plumbing, and a script header/footer with the script description. Sometimes the script that eliminated the differences required additional changes to handle existing data properly, so alterations were inevitable. When the set of scripts was finalized, it was introduced into both the Change-Log and the source control system.

Code that is generated from the model for the new database structure is used to update the code base.

The difference from "regular" or "manual" development is that the model always provides an image of the desired database state.

Why is the full build script not derived from the model directly? Firstly, the modelling tool does not necessarily know all the RDBMS features; for example, the version of ERwin that we used knew nothing about computed columns or the XML data type. Secondly, the model was sometimes updated later, for various reasons, even though the model-driven approach was used. Thirdly, if the model were used to generate the full-build script(s), it would add another point at which database structure changes could originate. Having the full database build scripts and the incremental scripts originate from two different sources would lead, and actually had led, to significant differences between the fully and incrementally built databases.

[Nightly] Database Build Process

The following diagram depicts the final state of the automated build process. So far it is only partially implemented, and half of it is done manually.


Installation Packager Integration

Most installation-packaging tools support database operations, so it seems worth using them in the database-creation process instead of the VBScript that is used now.

On the other hand, the existing script may be incorporated into the installation package created by some tool.


The process we have described allows database development and deployment to be streamlined.

It makes it far easier, and more accurate, to repeatedly create databases, and allows as many installations of the database environment as required without putting an impossible burden on the DBA.

Alexander has attached a project folder which he created from the AdventureWorks database. It can be downloaded from the bottom of the article, in either .RAR or .ZIP format.

It includes everything described in the article.
There is also a set of command files performing certain commands:
#Build from scratch.cmd – creates the environment from scratch (including code, data, security, everything)
#Delete environment.cmd – deletes the existing environment
#Nightly build from scratch.cmd – gets the most recent project files from SVN and recreates the local environment
#Redeploy code objects.cmd – updates code objects – stored procedures, UDFs, etc.
#Script database objects.cmd – regenerates the database schema object script from the existing DB and updates the environment build number
#Security – logins and db users.cmd – creates server logins and maps them to database users and roles
#Security – roles and permissions.cmd – creates roles in the database(s) and grants permissions
#Update DB schema incrementally.cmd – performs an incremental update of the schema and data, as per the Change-Log, to the latest state
#Update environment.cmd – performs an incremental update of the environment, including code, permissions, everything
The script that performs all these actions is here:
DB Creator AdvWorks\DB Environment Builder\DB Environment Builder.vbs
Database project files are here:
DB Creator AdvWorks\AdventureWorks OLTP Environment