SQL Source Control: The Development Story, Part II

When creating SQL Source Control, the team had to make decisions as to which source control systems the tool would support, and needed to accomodate a range of different development practices. It also had to avoid the potential problems from the fundemental differences between database builds and application builds, David and Stephanie, from the team that wrote the tool, explain what they did, and why.

Before SQL Source Control was released, we chatted to Stephanie Herr, the project manager for SQL Source Control, and David Simner, one of the developers, about all the work that had gone into bringing the product to reality. It was a fascinating, frank, discussion about making a software tool as simple as possible to use, the pursuit of quality, and the technical challenges that an ambitious project like this is likely to encounter and have to overcome. This is the second of two articles based on the interview.

Source Control integration

Stephanie:
Because we are developers, source control is an integral part of our work, and we use Subversion. There are two add-ins for Visual Studio now that allow you to access your SubVersion repositories through the Visual Studio solution explorer, AnkhSVN and VisualSVN. These are the tools that we use at Red Gate.
David:
These two add-ins are the ones that most .NET application developers use with SubVersion.
Stephanie:
They just give you easy access to the source control server. When we started designing SQL Source Control, we wanted to repeat that same concept by making it really easy to use source control by integrating it into Management Studio. We had, however, also realised from the start that there were other source control systems in use. We’ve run polls for the past two and a half years to see what source control systems people are using and how they were changing in popularity over that time. We found that SubVersion and Team Foundation Server are very popular,and both have been increasing in popularity over the past two and a half years. It became an easy choice to integrate SQL Source Control with them.
Simple Talk:
In retrospect, it’s proved to be the right decision because of the increasing degree of domination of the market by those two providers. A lot of the smaller providers of source control have been marginalised or have disappeared in that period.
David:
When you run the polls, Visual SourceSafe (VSS) still comes out first. Visual SourceSafe is the most popular, but it’s also decreasing very rapidly because Microsoft have warned that they will soon withdraw support for it. Team foundation Server (TFS) is being promoted as the logical upgrade path to SourceSafe. Even though a lot of people are using SourceSafe, we’ve made the decision not to support it because there are issues with SourceSafe that would we wouldn’t want SQL Source Control to be associated with. As it is not a particularly good source control solution anyway, most people will be moving away from it within a year or so.
Stephanie:
Yeah, the last update Microsoft had for it was in 2005 and it just hasn’t been getting attention from Microsoft since then. They are really pushing their Team Foundation Server solution, so we didn’t want to invest a lot of time in it. Team Foundation Server has a SQL Server backend, so it’s much more robust and powerful, as is SubVersion.

When developing SQL Source Control, we didn’t want to invest too much time in SourceSafe because we’re still trying to understand the future plans of users who were still on Visual SourceSafe, especially with Microsoft ending mainstream support next year. We wanted to really make sure that there was a business case for investing in it. It is also more technically a challenge to integrate with because it doesn’t have a nice API for us to access.

Simple Talk:
So is there no support at all for Visual SourceSafe in the tool?
Stephanie:
Now we only support SubVersion and Team Foundation Server. Visual SourceSafe is probably the next most requested source control system we get, but…
Simple Talk:
But if it’s only for a year it seems hardly worthwhile.
David:
Yes, given it will take about three months or so to actually write the code to support it. The product is deprecated.
Simple Talk:
And once you’ve done that, testing the Visual SourceSafe integration is surely going to be a nightmare.
Stephanie:
Yeah, I think it’s April 2011 that support ends for Visual SourceSafe, according to the Microsoft website, so it’s even less than a year now. We do support two source control systems and the idea is to keep monitoring the source control popularity and, if there is a business case, to add more in the future: But V1.0 is restricted to just those two source control systems.
Simple Talk:
It would be rather cool to use the TortoiseSVN tool for settings.
Stephanie:
We do use your Tortoise repositories, so if you have browsed something in the repo browser, we use your Tortoise credentials, so we try to use that first and then if it’s not in Tortoise we’ll prompt you for the credentials to your Source Control system, and then we actually send that back to Tortoise’s SVN credentials cache.

If you then use Tortoise to look at the history, which is also something we haven’t talked about yet, you know it’s remembered in Tortoise so you can easily access the repository.

Simple Talk:
That’s very convenient. So that if you were already a SubVersion user and you’ve set up Tortoise then you don’t have to set anything up in SQL Source Control?
David:
It will just pick up your username and password automatically and yeah, the ‘create‘ and ‘Add link’ dialogue will give a drop down list of locations to choose from.
Simple Talk:
Do you support all modes of SubVersion, not just the simple file-level integration?
Stephanie:
We support all SubVersion’s protocols, the file, http, https, SubVersion and then SubVersion secured socket layer.
David:
SubVersion plus SSH?
Stephanie:
Plus SSH…
David:
…which is what you’d typically use if it’s a Linux machine that’s hosting the SubVersion repository; so we just support all of them.
Stephanie:
This makes a lot of sense when you consider that a lot of companies will already have SubVersion set up and be using it for their application development.
Simple Talk:
That’s what worried me, because no power on earth will make people change their SubVersion once it’s set up in a particular mode.
Stephanie:
I think it could be changed per repository, but there should be no issue since we support everything, so there’s no reason to have to change from what you’re currently doing.

Avoiding unnecessary table-rebuilds

Simple Talk:
There were quite a lot of warnings from people who had created home-brew source control systems for SQL Server that there might be difficulties ahead in forcing a database development into a traditional source control model. Do feel confident that you’ve cracked all those problems?
David:
What sort of things do you mean?
Simple Talk:
An example would be the inexact mapping between objects as defined within the system views, so that indexes, parameters and columns are not objects in themselves, they’re just attached to a table object. There are other slightly strange mappings of the various parts of the SQL Server metadata.
David:
On the ‘commit‘ and the’ get latest’ tab we never show you sub-objects like that; so indexes never appear, fields never appear. Only tables appear; so when you change an index or add a field to a table, what you actually see on the ‘commit‘ and the ‘get latest’ tab is the table, so get away from dealing with sub-objects by showing you the objects themselves.
Simple Talk:
So that if you make a change to an index, that would then signal that the table itself has changed rather than the index? Wouldn’t that mean that a table would be subject to an unnecessary rebuild when only an index really has to be changed?
David:
The application works out the optimal way of doing it. From SQL Source Control’s perspective it’s just a change on a table like any other change on a table and then the SQL Compare Engine, which we use behind the scenes, understands what that change means and what it has to do in order to sync that to and from Source Control.
Simple Talk:
Does that take care of all the problems?
David:
It’s caused minor problems in the blue indicators. The blue indicators purely reflect the changes at object-level, rather than showing what would need to be rebuilt. We poll the SQL Server regularly to see what objects are changing and so we need to take an account there of a new or changed index or column, and flag it as a change of the parent table, therefore it is this table that should get the blue indicator on it. Because we took a decision early on to use the SQL Compare Engine to calculate what actually needs to be rebuilt, that’s not really been one of the problems that we’ve had to face.
Stephanie:
I think that’s also one of the differences between what we’ve done and what Microsoft does in their database projects. They actually separate the indexes, columns and parameters out, and treat them as different objects.
Simple Talk:
They don’t do incremental build do they?  I thought that you have to rebuild, using the entire script, if you make a change.
Stephanie:
I think it is in the Premium version that you can do incremental changes. My understanding is that, in the Professional one, you’re just working completely offline and there’s no live database, you’re working on the scripts and then you need to build that into a database. With SQL Source Control, you don’t have to worry about losing your data. All your data is still maintained, so making changes to tables doesn’t impact the data within the table

The dedicated database vs the shared development database

David:
Application developers are used to a dedicated model so, if I’m writing some SQL Source Control, I have my own checkout of the SQL Source Control code. I’m making my own changes locally, nobody else gets to see them and then at some point I commit back to Source Control.

Database developers have got two different ways of working.

The first, more usual, system is where you work on your own private copy of the database which is just for you and no one else connects to it or uses it, you make all your changes in that database and then you commit to Source Control using our tool.

Alternatively, you have a shared model where all the database developers are on the same database, they’re all making the changes on one database, and then they each commit the changes to Source Control. And they might commit their own changes or they might, if more than one is working on the same schema, commit other people’s changes. They might do that on purpose or they might do that accidentally.

So it’s a question of what we ought to encourage people to use, what happens if people use the other one, how our tool should handle it.

Simple Talk:
If you support both ways, then you satisfy both existing practices. You can’t insist on changing the way people choose to work.
Stephanie:
So we do think that there are some people that work on a shared model out of choice , and would want to stay on that shared model even if we had a solution that made the local system easy to administer?.
Simple Talk:
Oh yes, but out of habit. The shared model is adopted widely because it has, up to now, been difficult to re-synchronise or merge all the separate private branches back to the master, and to assess the repercussions of your co-workers’ contributions.
Stephanie:
We do think there might be some people that are currently on a shared model because it’s really hard to share your changes between different database servers.

But one of the advantages of SQL Source Control is you can link all these separate databases to the same repository, the same centralised Source Control repository and then easily share those changes with one another and be able to review those changes before committing them and before getting them out of Source Control.

And once you have that model where everybody is working on their own dedicated database you have all the advantages of Source Control. One, you don’t have to worry about overwriting somebody else’s changes, you don’t have to worry about your changes being overwritten and you also don’t have to worry about making a change that actually breaks something so that it’s actually impacting all the other database developers and maybe even application developers that point to that database.

So having these dedicated databases is really exactly what Source Control is. Getting it to a local working area, making your changes, testing them locally, making sure you’re happy with them before then committing them back to Source Control.

Simple Talk:
So basically you can support either model of database development?
David:
Both of them work in the sense that you won’t get any error messages from the tool, but the usability aspect of a shared model are quite difficult to get your head around, so on the commit tab you’ll see other people’s changes because it’s a shared database.

You could also very easily accidentally commit someone else’s changes. This means that when they go to the commit tab they may wonder where their objects have gone, because someone else has committed them for them.

As well as that there are some technical aspects of how we handle things under the hood that means the program will have false positives, so it will tell you that there are conflicts when there aren’t any. Revert or undo also won’t do the right thing, you’ll often undo back to something further back in time than you meant to undo to.

The undo getting it wrong and the conflicts getting it wrong are more serious issues. The other people committing your stuff and wondering why other people’s changes are there are fairly obvious, based on the fact that it’s a shared database. The fact that we would find it so hard to resolve conflicts and undo will be very, very confusing to people because they don’t understand what’s going on under the hood that’s causing it to get it wrong.

Stephanie:
That’s something we’re talking about right now, how to handle this. So really we recommend using the dedicated approach because you have all those other advantages of not losing one another’s changes and also being able to break things without impacting others.

Now your dedicated database doesn’t have to be on your own local machine, that would be the ideal, but if you don’t want to install SQL Server on a person’s machine you could just have separate databases on the same server.

Simple Talk:
If you are doing a build of the database as part of integration testing, then this could be then be stocked with realistic test data, and data feeds. Then it can be used as a central resource for performance testing, and for sandboxing, as any changes would be lost. This would remove one of the requirements for a shared development database, but I suspect that you’ll still find teams that prefer to work that way.
Stephanie:
We do want to get feedback from people on a centralised shared model and see how we could do it to improve their experience as well. So they can still use it but they have to be mindful that they’re not committing other people’s changes. On a centralised model, or a shared model, to talk of conflicts is a bit strange because everybody is on the same database, and maybe, occasionally over-writing other people’s work. It’s just how we’re processing it behind that, that you see these conflicts.