Configuration Management with PowerShell and XML

DevOps, Continuous Delivery & Database Lifecycle Management
Automated Deployment

Intro

To minimize problems in any software delivery process, it is vital to keep every single script in the repository. All aspects of configuration must be scripted, and configuration scripts are part of this “every”. When I first attempted to apply elements of the Continuous Delivery (CD) process, I struggled with the practical difficulties of structuring and storing configuration information in an enterprise environment that has many interdependent .NET applications.

On the .NET platform, configuration information is typically kept in XML files. Configuration files consist of sections, some of which are standard and control built-in features of the framework, whereas others are custom sections created by developers to govern the behavior of the components they write. The .NET framework provides a rich set of classes to read and write such configuration information. Listing 1 shows a sample XML configuration file.

<configuration>
    <appSettings>
        ...
    </appSettings>
    <system.web>
        ...
    </system.web>
</configuration>

Listing 1 – Sample .NET configuration file
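As a minimal sketch of how such a file can be inspected from PowerShell (the language used for deployment later in this article), the snippet below parses a small configuration document and collects its appSettings into a hashtable. The setting names and values are invented for illustration and are not taken from Listing 1.

```powershell
# Hypothetical configuration content; the keys and values are made up for this example.
$configText = @'
<configuration>
  <appSettings>
    <add key="LogLevel" value="Debug" />
    <add key="SmtpHost" value="localhost" />
  </appSettings>
</configuration>
'@

# The [xml] cast parses the text into a System.Xml.XmlDocument.
$config = [xml]$configText

# Collect the settings into a hashtable keyed by setting name.
$settings = @{}
foreach ($entry in $config.configuration.appSettings.add) {
    $settings[$entry.key] = $entry.value
}

# Look up a single setting by name.
$settings['LogLevel']
```

PowerShell exposes XML attributes and child elements as properties, which is why the add elements can be reached with plain dot notation.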

Much of this configuration information will be different from the development version when the application is deployed into production. For example, there might be debug or logging options to change, and there are likely to be connection strings to be changed to point to different databases.

To make this easier, Visual Studio 2010 introduced Web.config Transformations, a simple and straightforward method of transforming config files during publishing and packaging. Depending on the nature of the build configuration, Debug or Release for example, the appropriate conversion is applied to the XML configuration file. The technique is similar to XSLT but, while XSLT is a general transformation language, Web.config Transformations are optimized for the requirements of deployment. Listing 2 shows a simple transformation that removes an attribute.

<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
    <system.web>
        ...
    </system.web>
</configuration>

Listing 2 – An example of Web.Config Transformations

However, out-of-the-box, the feature is not really sophisticated enough to meet the needs of complex environments. This is because:

It requires a separate transformation for each deployment environment. Even with only a couple of environments, it is hard to keep the transformations consistent. How do you verify that a new app setting has been added correctly for every environment? Manual verification and maintenance are time-consuming and error-prone.
It does not allow configuration to be shared between applications. For example, where an application has a UI layer that uses web services, there should be only one parameter defining the port on which the services run, and it should be used for the deployment of both components.

In one of the projects at Objectivity, we dealt with these problems by using Web.config Transformation as a basis for a more sophisticated solution. Let’s look closely at it, as I strongly believe that the structural concepts are broadly universal and can be applied more generally to Windows environments, regardless of deployment language or storage technology.

Types of configuration

Within most .NET-based solutions, we will come across three types of configuration setting:

Application settings – used in *.config files to influence the behavior of the application (Listing 1)
Deployment process parameters – necessary to perform the installation, for example the location of DB backups
Topology – defines servers, application layers and package assignments

I will go through each type and describe in more detail how each type of configuration setting can be organized.

Tokens and application settings

Our configuration management system was based on the idea of a token. A token, in this context, is a unique macro substitution string that does not appear naturally in any config file and can therefore be safely replaced during deployment with the actual, environment-specific value. We chose the following format convention for tokens:

##token_name##

We used these tokens to control configuration items such as:

connection strings
names of application pools
web site names
ports of HTTP and NET.TCP bindings
notification email addresses
paths to DB backups
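Because the ##token_name## convention is so regular, tokens are easy to find mechanically. The following sketch lists the distinct tokens used in a piece of configuration text. The sample text and the regular expression (which assumes that token names consist only of letters, digits and underscores) are assumptions made for illustration.

```powershell
# Hypothetical generic config fragment; one token is used twice.
$template = @'
<add name="Main" connectionString="##DB_CONNECTION_STRING##" />
<add key="NotifyTo" value="##NOTIFICATION_EMAIL##" />
<add key="NotifyCc" value="##NOTIFICATION_EMAIL##" />
'@

# Assumed naming rule: letters, digits and underscores between the ## markers.
$tokenPattern = '##\w+##'

# Find every occurrence, then reduce to a sorted list of distinct token names.
$usedTokens = [regex]::Matches($template, $tokenPattern) |
    ForEach-Object { $_.Value } |
    Sort-Object -Unique

$usedTokens
```

A list like this can be compared with the tokens defined for an environment to spot values that are missing or no longer used.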

Building blocks

From the architectural perspective, our solution had three main building blocks:

Web.config Transformations – the Visual Studio feature helped us to convert the developers’ settings into a more generic configuration template
XML file(s) – used to store tokens (see Listing 3 for an example)
PowerShell scripts – facilitated remote placement of environment-specific values in config files during deployment

<tokens>
    <token key="..." value="..." />
    ...
</tokens>

Listing 3 – Sample XML with tokens

XML works well as a format for configuration information because it is widely known by developers and it integrates well with PowerShell, the language of deployment automation on the Windows platform. Listing 4 demonstrates how easy it is to load a configuration file (defined in Listing 3), iterate over its nodes and access their attributes.

$tokens = [xml] (Get-Content "tokens.xml")

foreach ($token in $tokens.tokens.token)
{
    "$($token.key) = $($token.value)"
}

Listing 4 – XML operations in PowerShell

Transformation process

The configuration settings of the application pass through three transformation stages, illustrated in the diagram:

Developers’ Config Files:
Developers worked on their own versions of the configuration files, adapted to their local environment, and checked them in together with the code changes. Local configuration could be shared because the environments were similar, for example: the same naming of local DBs, sources kept in the same folder, etc. That simplified the development process and helped us to avoid some merge conflicts.
Generic Config Files:
Next, during the Continuous Integration (CI) process on TFS, local configuration was converted with Visual Studio Web.config Transformations into generic files in which environment-specific settings were replaced by tokens. Those generic files were then embedded in MsDeploy packages (ZIP files containing application binaries), which we used for deployment to all environments. This is in line with another Continuous Delivery principle: “Build once, deploy many”.
Environment-specific settings:
Finally, as part of the installation process, PowerShell scripts replaced tokens with the values defined for a given environment. This was done on the destination server by using ‘remoting’ (remote execution). The environment-specific configuration was kept in XML files in the repository together with the code, with a separate token file for each environment.
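The final stage, token replacement, can be sketched in a few lines. This is an illustration only, not the project's actual script: the function name Expand-Tokens, the sample template and the token values are all hypothetical, and remoting is left out.

```powershell
function Expand-Tokens {
    param(
        [string]$Text,        # generic config content containing ##tokens##
        [hashtable]$Tokens    # token name => environment-specific value
    )
    foreach ($name in $Tokens.Keys) {
        # String.Replace performs a literal, case-sensitive replacement, so
        # characters that are special in regular expressions are safe in values.
        $Text = $Text.Replace($name, [string]$Tokens[$name])
    }
    # Fail loudly if any token was left without a value.
    $leftover = [regex]::Matches($Text, '##\w+##') | ForEach-Object { $_.Value }
    if ($leftover) {
        throw "Unresolved tokens: $($leftover -join ', ')"
    }
    return $Text
}

# Hypothetical template line and UAT value.
$template  = '<add name="Main" connectionString="##DB_CONNECTION_STRING##" />'
$uatTokens = @{ '##DB_CONNECTION_STRING##' = 'Server=UAT-SQL01;Database=Phoenix;Integrated Security=True' }

$result = Expand-Tokens -Text $template -Tokens $uatTokens
$result
```

Throwing on leftover tokens is a design choice of this sketch: a config file that still contains ##...## after deployment is almost certainly a mistake, so it is better to stop the installation than to start the application with it.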

The diagram below illustrates how a connection string was altered throughout the process:

Connection string value substitution

Token inheritance

The XML files that stored the token values were organized hierarchically, so that the default values defined in the Root XML could be inherited or overridden for any individual environment. An approach like this minimizes the number of tokens that are kept in environment-specific files and therefore simplifies maintenance.

The Root XML defined:

The global list of tokens
The description of each token – very important from a maintenance perspective. After a few months, nobody remembers what a particular token is used for unless it is documented, no matter how descriptive the name may be
The default values
A flag to indicate whether the token must be defined at environment level – this attribute was used for the configuration health check

Local XML files defined the environment-specific token overrides and were stored in the corresponding environment folder. Here is what the physical structure looked like:

Environments folder
    RootTokens.XML
    <TEST> folder
        Tokens.XML
    <UAT> folder
        Tokens.XML
    <Production> folder
        Tokens.XML
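One way to picture the inheritance is as a merge of two hashtables: the root defaults are loaded first, and the environment file then overwrites any keys it defines. The sketch below assumes the key/value layout used in Listing 4. The function names and sample tokens are hypothetical, and the XML is inlined rather than read from RootTokens.XML and Tokens.XML to keep the example self-contained.

```powershell
# Converts a <tokens> document into a hashtable (token name => value).
function ConvertTo-TokenTable([xml]$Document) {
    $table = @{}
    foreach ($token in $Document.tokens.token) {
        $table[$token.key] = $token.value
    }
    return $table
}

# Starts from the root defaults and lets the environment values win.
function Merge-Tokens([hashtable]$Root, [hashtable]$Overrides) {
    $merged = $Root.Clone()
    foreach ($key in $Overrides.Keys) {
        $merged[$key] = $Overrides[$key]
    }
    return $merged
}

# Hypothetical root defaults.
$rootXml = @'
<tokens>
  <token key="##SMTP_HOST##" value="localhost" />
  <token key="##HTTP_PORT##" value="80" />
</tokens>
'@

# Hypothetical UAT overrides: only the port differs from the default.
$uatXml = @'
<tokens>
  <token key="##HTTP_PORT##" value="8080" />
</tokens>
'@

$tokens = Merge-Tokens (ConvertTo-TokenTable $rootXml) (ConvertTo-TokenTable $uatXml)
$tokens
```

Here the UAT file only needs to mention the port; the SMTP host is inherited from the root.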

Sub-tokens

The design I’ve described, using tokens with inheritance, is very flexible and may seem sufficient for its purpose. However, we found that some of the configuration tokens were interdependent. For example, the HTTP port on which the WCF service runs was declared once for the service deployment and a second time for the UI layer that uses it. As mistakes with such dependencies usually come to light late in the testing phase, we wanted to avoid them as much as possible. So we applied the DRY (Don't Repeat Yourself) principle to configuration files and introduced the concept of sub-tokens. A token’s value could now refer to other tokens, and the deployment process gained another step, the token pre-processing phase, which resolved those dependencies.

Here is a token referring to a sub-token:

<token key="##INSTALL_ROOT##" value="##Services_Install_Root##\INT" />
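A pre-processing phase of this kind could be sketched as repeated substitution until nothing changes. This is a hypothetical illustration rather than the original implementation: the function name Resolve-SubTokens, the circular-reference check and the value D:\Services are assumptions.

```powershell
function Resolve-SubTokens([hashtable]$Tokens) {
    $resolved = $Tokens.Clone()
    # A chain of references cannot be longer than the number of tokens, so
    # needing more passes than that indicates a circular reference.
    for ($pass = 0; $pass -le $resolved.Count; $pass++) {
        $changed = $false
        # Copy the keys first, because values are updated inside the loop.
        foreach ($key in @($resolved.Keys)) {
            $value = [string]$resolved[$key]
            foreach ($match in [regex]::Matches($value, '##\w+##')) {
                $name = $match.Value
                # References to unknown tokens are left untouched.
                if ($resolved.ContainsKey($name)) {
                    $value = $value.Replace($name, [string]$resolved[$name])
                    $changed = $true
                }
            }
            $resolved[$key] = $value
        }
        if (-not $changed) { return $resolved }
    }
    throw 'Circular token reference detected.'
}

$tokens = @{
    '##Services_Install_Root##' = 'D:\Services'                     # hypothetical value
    '##INSTALL_ROOT##'          = '##Services_Install_Root##\INT'   # token from the example above
}

$resolved = Resolve-SubTokens $tokens
$resolved['##INSTALL_ROOT##']
```

With these sample values, ##INSTALL_ROOT## resolves to D:\Services\INT, so the installation root only has to be changed in one place.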

Deployment process parameters

Because PowerShell was our language of automation for the deployment process, we could easily access tokens from the deployment code through a global variable. Here is an example showing how to load tokens (for simplicity, without support for sub-tokens) and use one of them to run a SQL script against a DB:

# Loads a token file into a hashtable (token name => value)
function Get-Tokens($tokenRepo) {
    $hash = @{}
    [xml]$tokenConfig = Get-Content $tokenRepo
    foreach ($token in $tokenConfig.tokens.token) {
        $hash[$token.key] = $token.value
    }
    return $hash
}

$tokens = Get-Tokens -tokenRepo "$environments_dir\$environment\tokens.xml"
$dbConnectionStringToken = "##DB_CONNECTION_STRING##"

Write-Log -message "Db initialization started..."

# function below checks whether token is present and if not, raises an error
Verify-Token $tokens $dbConnectionStringToken
$dbConnectionString = $tokens[$dbConnectionStringToken]

$dbConnectionBuilder = New-Object System.Data.SqlClient.SqlConnectionStringBuilder
$dbConnectionBuilder.set_ConnectionString($dbConnectionString)

RunSqlFile $dbConnectionBuilder (Join-Path $initDbScriptsPath "UpdateDictionaryTables.sql")

Write-Log -message "Db initialization completed."

One could argue whether embedding (and possibly duplicating) token names in PowerShell code is a good practice or not, but we had more pressing concerns on our minds and, in practice, this approach worked for us.

Topology

Describing the variables that affect the topology of the infrastructure via XML is very similar to using UML Infrastructure Diagrams for the given server environment. It defines things like:

The Servers to use (hostnames or IP addresses)
The Logical layers to which those servers belong – this allows us to easily express load-balanced configurations
The Packages to install and on which layer

The Topology XML document was, due to its uniqueness, stored in a separate file with its own bespoke format. I think that a sample will explain more than a raw description, so here is one:

<Platform>
    <Environment>
        <AppTier>
            ...
        </AppTier>
        <AppTier>
            ...
        </AppTier>
        <AppTier>
            ...
        </AppTier>
        <AppTier>
            ...
        </AppTier>
    </Environment>
    <Packages>
        ...
    </Packages>
</Platform>

During the deployment process, PowerShell scripts loaded the XML topology document for the target server-environment and used it to find out which MsDeploy packages to install on which servers. An MsDeploy package contains application artifacts such as: DLLs, HTML, CSS or JavaScript files, XML configuration, etc.
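To make the idea concrete, here is a rough sketch of how a topology document could be turned into a list of (server, package) pairs. Only the element names Platform, Environment, AppTier and Packages come from the sample above. The Server and Package elements, every attribute name, and the host and package assignments are hypothetical, invented for this example.

```powershell
# Hypothetical topology; attribute names and child elements are assumptions.
[xml]$topology = @'
<Platform>
  <Environment>
    <AppTier Id="Web">
      <Server HostName="uat-web01" />
      <Server HostName="uat-web02" />
    </AppTier>
    <AppTier Id="Services">
      <Server HostName="uat-app01" />
    </AppTier>
  </Environment>
  <Packages>
    <Package Id="Phoenix.Web" Tier="Web" />
    <Package Id="Phoenix.Queuing.WindowsService" Tier="Services" />
  </Packages>
</Platform>
'@

# Build a deployment plan: one entry per (server, package) pair.
$plan = foreach ($package in $topology.Platform.Packages.Package) {
    # Find the tier the package is assigned to.
    $tier = $topology.Platform.Environment.AppTier |
        Where-Object { $_.Id -eq $package.Tier }
    # Every server in that tier receives the package.
    foreach ($server in $tier.Server) {
        [pscustomobject]@{ Server = $server.HostName; Package = $package.Id }
    }
}

$plan
```

Assigning packages to tiers rather than to individual servers is what makes load-balanced setups easy to express: adding a server to a tier is enough for it to receive every package of that tier.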

In addition, every package was accompanied by a manifest file (see Listing 5 for a sample). It is an XML document that defines the type of application (web, desktop or Windows service) and its related parameters (application pool and web site parameters in the case of a web application, or the destination folder in the case of a desktop app). In our approach, the manifest could also use tokens and therefore change parameter values between environments. But let’s not dig too deep into the details, because the deployment process itself is a topic worth a separate article.

<deployment>
    <service physicalPath="##Services_Install_Root##\Phoenix.Queuing.WindowsService"
             displayName="Phoenix.Queuing.WindowsService"
             description="Phoenix.Queuing.WindowsService"
             username="NT AUTHORITY\NetworkService">
    </service>
</deployment>

Listing 5 – Application’s manifest file

Conclusions and future directions

The configuration structure I’ve described is not perfect but, nonetheless, it proved sufficient in our case and allowed us to successfully deliver three fairly complex sub-systems. Configuration maintenance and deployment automation were part of our ‘Definition-of-Done’ and every developer was encouraged to become familiar with the configuration architecture. This helped make the project more predictable by removing any justification for postponing deployment-related activities until the last minute.

Configuration-completeness was verified during CI builds as the first step of the process (the so-called build framework self-check). Our feedback loop was very fast so, if something basic went wrong, CI caught it right at the beginning.
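A completeness check of this kind might look like the sketch below, which reports the tokens that the root file marks as mandatory but that an environment file does not define. The attribute name required, the function name Get-MissingTokens and the sample tokens are assumptions, since the article does not show how the flag was stored.

```powershell
function Get-MissingTokens([xml]$Root, [xml]$EnvironmentTokens) {
    # Names of all tokens defined for the environment.
    $defined = @($EnvironmentTokens.tokens.token | ForEach-Object { $_.key })
    # Root tokens flagged as required that the environment does not define.
    @($Root.tokens.token |
        Where-Object { $_.required -eq 'true' -and $defined -notcontains $_.key } |
        ForEach-Object { $_.key })
}

# Hypothetical root file: two tokens must be set per environment.
$rootXml = @'
<tokens>
  <token key="##DB_CONNECTION_STRING##" value="" required="true" description="Main database" />
  <token key="##NOTIFICATION_EMAIL##" value="" required="true" description="Alert recipient" />
  <token key="##HTTP_PORT##" value="80" description="Service port" />
</tokens>
'@

# Hypothetical TEST environment file that forgets the email address.
$testXml = @'
<tokens>
  <token key="##DB_CONNECTION_STRING##" value="Server=TEST-SQL01;Database=Phoenix" />
</tokens>
'@

$missing = @(Get-MissingTokens -Root $rootXml -EnvironmentTokens $testXml)
if ($missing.Count -gt 0) {
    Write-Warning "TEST is missing required tokens: $($missing -join ', ')"
}
```

In a CI build, the warning would more likely be an error that fails the build, so that an incomplete environment file is caught before any deployment is attempted.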

The world and technology inevitably move forward and, with the experience we have gained, we would now do some things differently. For example, my colleagues and I are currently exploring whether INI files could be a better format for configuration than XML. Although XML is powerful and generic, it is also hard for people to read and change, and some valid alternatives (like INI, JSON or a DSL) exist.

Another technological advance could well affect how we manage configuration in the future. In October 2013, Microsoft released version 4 of PowerShell. One of its new features is “Desired State Configuration” (DSC) – declarative language extensions that enable deployment and configuration management. What DSC handles perfectly is the topology, i.e. one of the configuration types discussed above. You would have to add support for tokens on your own but, as far as I know, none of the deployment products running on the Microsoft stack (e.g. “Octopus Deploy” or “Release Management for Visual Studio 2013”) delivers support for configuration that is as sophisticated. Although PowerShell 4 cannot be installed and used in every environment, because it supports only a limited set of Windows versions, I would definitely give it a try wherever possible. Microsoft is investing strongly in PowerShell and there are already several resource kits available that give you the means to configure IIS, change Active Directory, install SQL Server, and manage a cluster or firewall rules. I am sure that the future will bring many improvements in this area.
