Connection Strings, an Overview

We asked Phil to come up with a simple explanation of connection strings. We somehow weren't expecting a 'quote of the day' for your database, or a C# application to gather data from the internet. However, sometimes the oblique approach is the best, especially when the knowledge comes from hard-won experience by a cynical man.

Getting Started

In order to connect to a data source of any type, you’ll probably need a connection string. 

Let’s take a rather extreme example. How about connecting to an HTML page from SQL Server? If your server can ‘see’ the internet then it is pretty simple once you know the correct connection string and the idiosyncrasies

875-OLEDB.jpg 

Or maybe with a slight variation …

And with the connection string you can link to it as a server…

 

875-linked.jpg

Here is the same connection String being used in a simple C# command-line program

…which produces …

875-Dos.jpg

Neat eh? You cannot write to the website, mercifully, with this provider, although if your internet site supports WebDAV, (Frontpage or Sharepoint, for example), then you can use the MSDAIPP provider to do that.  Yes, it is a scary thought.   

The Application perspective

Certainly, although you have other options such as Linq to SQL when connecting to SQL Server, the use of ADO.NET with a connection string will solve a lot of problems when connecting to a database, application, or wide variety of data stores.  The general idea is for the application code to be as independent of the actual data source as possible.  ODBC, JDBC and OLE DB  all provide an interface to the application that is reasonably generic. ODBC even supplied a minimum SQL Grammar based on X/Open (1992) that allowed it to be used with any vendor’s data store.  The connection string contains all the vendor-specific information to allow a connection to be made.  With ADO.NET,  the MDAC  data providers are accessed via a ‘Data Provider Factory’; all you need to do is to choose the Provider (normally OleDB, ODBC, OracleClient, SQLClent, or SQLServerCE) and pass it the connection string and you can, if you choose, do the rest of your coding in a Database-independent way through a set of abstractions that include the datasource, session, command and rowsets.   Connection strings are still at the heart of things with ADO.NET, used to store all the provider-specific information.

Nobody will try to pretend that it is easy to know what to do when a connection string doesn’t work. The errors from the drivers are often arcane, often irrelevant, and not nearly explicit enough to indicate where the problems lie. In debugging the previous illustration, for example, I spent ages scratching my head wondering why it couldn’t find the table in the HTML page until I discovered that the square-brackets delimiting the table name  were obligatory. When looking for a connection string, the web holds a rich store of connection strings that work. It pays to look around before cutting your own from scratch. However, once you have one that works, it is worth taking time to puzzle out why and how it works.

In ODBC and ole DB, all the specifics of a connection are held in a connection string. This wasn’t originally intended for human consumption. When the idea of the connection string was developed, the designers realised that one needed to browse for data.  An ODBC function called SQLBrowseConnect could be used iteratively by applications to build up a connection string, passing back lists of alternatives, or question marks in places where it needed further input, until a connection could be made. The user would see dialog boxes that prompted for such information as passwords, user IDs, or the name of a file.

Syntax

The connection string is actually a list of keyword/attribute pairs.  The keyword is not case-sensitive whereas the value may be. If a ‘keyword=value’ pair occurs more than once in the connection string,  then the value assigned by the last occurrence is used. The Provider keyword is the exception to this rule. The first value is used. If the string is being passed  back  to the application by the driver that needs more information before it can make a connection, then the attributes can be a list of the enumerations of the possible values, where the  data source needs the user to select, for example, a database from a server, or a table from a database.

All Keyword/attribute pairs are delimited by  a semicolon (;) and any semicolon (;) within an attribute or value must be delimited by quotes. If the Keyword or attribute is delimited by double-quotes, then the delimiter must be single-quotes and vice-versa. If a keyword contains an equal sign (=) then it has to be preceded by another equals sign to show that it is part of the keyword.

All spaces are ignored unless they are within strings that are delimited by double-quotes but if you want connection pooling to work, the connection strings have to be exactly the same!

Some of the information in a connection string can be specific to the provider. In our example, for example, “HDR=Yes” means that there is a header row in the table (or in Excel, the cell range or named range, so the provider will not include the first row of the selection into the recordset If you choose “HDR=No“, then the provider will include the first row of the cell range (or named ranged) into the recordset.

Security

The most worrying thing about connection strings, particularly in .NET applications, is that UserID and Password information is passed in them. Normally, the application developer will store a connection string complete, as a logical address of the data store he wants to access. This often leads to problems. Originally, the SQLBrowseConnect function in ODBC would pass back a UID=?; And PWD=?;   within the connection string that was then filled in by the application, after getting the credentials via a dialog box. It was then passed back to ODBC within the connection string.  The credentials were assumed to be volatile, but were passed to the provider as plain text.  This connection string, complete with credentials, was not intended to be stored on disk in an initialisation file.  However, the connection string was soon treated as a static string of serialised information that was held in the registry or initialisation file, or worst of all, embedded in the application itself.  If the connection string contains user_IDs and password credentials, and this is sometimes unavoidable, then one of the easiest ways to get control of a system is to look for them in connection strings stored in plain text around the user’s computer or within an application. Unencrypted connection strings compiled into an application’s source code can be viewed using .NET Reflector or the MSIL Disassembler (Ildasm.exe) tool.  To make matters worse, if a connection string is embedded then the application has to be recompiled whenever the connection string ever changes.  An attacker can inject other values in a connection string in order to access sensitive data.

Where one can use windows security, then the application configuration files can contain the complete connection strings along with other application-specific information.  Where you have to use userID/Password authentication you would add the credentials in later at runtime using the DbConnectionStringBuilder class.  The Connection strings are stored in the connectionStrings section of the configuration element of an application configuration file, or can be contained in separate External configuration files referenced as a configSource.  This allows access to be restricted via file access security. Ideally, the configuration file would be encrypted, and the connection string would be passed to the data provider via a secure link.

Building the connection string dynamically.

Here is a version of our C# program that uses the DbConnectionStringBuilder class

Storing connection strings in configuration files

Application configuration files (web.config or app.config) can be used to hold these connection strings.  Although these files share common elements, their name and location varies according to the application’s host.  They can be used to store any number of connection strings. Before these configuration files became the norm, Universal Data Link files were the recommended way of doing it. These were referenced directly by OLEDB; one merely passed the reference to the file in the form ‘FILE id=”I”><name of the file> in the Connection string.

The simplest way for the application to access the connection strings that are stored in configuration files is to use the  System.Configuration namespace

The ConfigurationManager (or WebConfigurationManager)  is used when working with configuration files on the local computer, and allows you to read in the connection string by its name.

You can use the ConnectionStringSettingsCollection to retrieve connection strings from application configuration files. It contains a collection of ConnectionStringSettings objects, each of which represents a single entry in the connectionStrings section. Its properties map to connection string attributes, allowing you to retrieve a connection string by specifying the name or the provider name.

For our application, we just want one connection string, so we create a file called <name of program> .exe.config  (where <name of program> is the name of the program you compile the following code to) and  put in:

We can now access this from the program

Encrypting connection strings in config files.

You  can encrypt sensitive information in a configuration file. When the encrypted connection string is retrieved at run time, the .NET Framework uses  whatever provider you specify to decrypt the CipherValue and make it available to your application.  The  encryption most commonly used is the RSAProtectedConfigurationProvider but the DPAPIProtectedConfigurationProvider, which uses the Windows built-in cryptographic services,  is also supplied but doesn’t allow the same  encrypted configuration file to be used on  multiple servers the details of encrypting configuration files is handled in great detail in http://msdn.microsoft.com/en-us/library/53tyfkaw.aspx  Encrypting Configuration Information Using Protected Configuration.

DSNs

DSN Connections

System DSN connection strings are stored  in the Windows Registry and are available to all users of the computer. You merely refer to the DSN as a ‘logical’ connection. File-based DSNs are similar, but associated with a user. They were intended to be easier to install  for user applications, but aren’t much used. From the perspective of the connection string, they work the same way. In our first example, we could hav done this…

The control panel has an applet within the administrative tools for managing ODBC Data Sources which provides a wizard like interface to collecting connection settings.

If you have a DSN, then your connection string within the application just needs to specify the provider and the name of the DSN, along with the credentials if you are forced to use them. All other parameters are stored within the DSN.

DSN-less Connections

The connection string that we’ve consistently used in this article doesn’t use a DSN, since I wanted it to work when you paste it into your own IDE and execute it. This type of connection string is sometimes referred to as DSN-less.

In Conclusion

One would like co come up with a clear logical approach to constructing and debugging connection strings.  I simply can’t bring myself to do it, since my own experience from working with connection strings is that, once one goes beyond the standard simple connection to SQL Server, using windows authentication, they can behave like wild feral creatures that struggle to be tamed, and can behave in strange unpredictable ways, emitting incomprehensible cries when their needs are not met. However, the rewards of getting it right are great. One no longer needs to be concerned with the specifics of a provider, but can get on with dealing with data without being concerned with the details of how the data is obtained from the source.

With Connection Strings,  it is worth the pain of discovery.

Interesting Links about Connection Strings