TortoiseSVN and Subversion Cookbook Part 9: Server, Repository, and Statistics

In the ninth installment of his popular series on using Subversion, Michael describes how to set up a simple Subversion server for a multi-user project and surveys some of the reports, charts, and tables you can get about the activity in your project.

This is the ninth installment of the TortoiseSVN and Subversion Cookbook series, a collection of practical recipes to help you navigate through the occasionally subtle complexities of source control with Subversion and its ubiquitous GUI front-end, TortoiseSVN. So far this series has covered:

  • Part 1: Checkouts and commits in a multiple-user environment.
  • Part 2: Adding, deleting, moving, and renaming files, plus filtering what you add.
  • Part 3: Putting things in and taking things out of source control.
  • Part 4: Sharing source-controlled libraries in other source-controlled projects.
  • Part 5: Embedding revision details within your source files.
  • Part 6: Working with tags and snapshots.
  • Part 7: Managing revisions and working copies.
  • Part 8: Getting the most from log messages.

This installment describes how to set up a production Subversion server, browse your repository once in place, and view a wealth of statistics about your repository and its usage.

Reminder: Refer to the Subversion book and the TortoiseSVN book for further reading as needed, and as directed in the recipes below.

Setting up a Subversion server

In part 3 of this series, I wrote two recipes: Setting up a new repository and Deploying Subversion for a single-user installation. These recipes showed how simple it is to set up a new repository, then went on to touch on some of the finer points of setting up a server. This recipe goes into further detail.

Single User

It is worth repeating a paragraph from this recipe:

‘For a single-user environment, you do not actually need to set up a server for your Subversion repository; you could just use the file:// protocol to access the repository on your local machine. (See either TortoiseSVN for the single user or, for command-line use, Single-User Subversion.) Both of those articles are quite ancient in Internet time but the basics they cover are still valid. Please note, however, that the non-server implementation approach, while possible, is not necessarily the best choice because you get no security with the file:// protocol and you have uncontrolled access to the Subversion database (allowing avenues for corruption).’

Single User or Small Installation

I think that the best approach is to run the svnserve server that is included with the Subversion distribution and, as of TortoiseSVN 1.7, included with TortoiseSVN as well! Svnserve is a lightweight, standalone Subversion server designed for situations where you do not need the full might of an Apache server. In support of this, you can read the thorough discussion of the finer points of the svnserve server by David W. in his answer to this StackOverflow post. With svnserve you use the familiar svn:// protocol to access your repository, thereby gaining the benefit of a controlled interface to your repository. You have a couple of choices for security control as well, as detailed in the svnserve reference above.
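As a rough sketch of what a basic svnserve setup involves, the function below writes a minimal svnserve.conf and passwd file into a repository's conf directory. The function name, the paths in the comments, and the user name and password are placeholders of my own, not anything prescribed by Subversion; the configuration keys themselves (anon-access, auth-access, password-db) are the standard svnserve ones.

```shell
# Write a minimal svnserve.conf and passwd file into a repository's conf directory.
# The user name and password below are placeholders; choose your own.
write_svnserve_conf() {
    conf_dir=$1
    # Deny anonymous access; authenticated users may read and write.
    cat > "$conf_dir/svnserve.conf" <<'EOF'
[general]
anon-access = none
auth-access = write
password-db = passwd
EOF
    # Plain-text user list referenced by password-db above.
    cat > "$conf_dir/passwd" <<'EOF'
[users]
alice = change-me
EOF
}

# Typical sequence (requires the Subversion command-line tools; paths are illustrative):
#   svnadmin create /srv/svn/project1
#   write_svnserve_conf /srv/svn/project1/conf
#   svnserve -d -r /srv/svn
#   svn checkout svn://localhost/project1
```

The -r option makes /srv/svn the root of everything svnserve exposes, so the repository appears as svn://localhost/project1 rather than under its full file system path. On Windows you would more likely register svnserve as a service than launch it by hand.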

The table below compares and contrasts the approaches given so far (file:, svn:, and svn+ssh: protocols) along with the Apache server (http: and https: protocols). This table is my interpretation of the pros and cons discussed in Choosing a Server Configuration in the Subversion book.

| Attribute | No server | svnserve | svnserve over SSH | Apache HTTP server |
| --- | --- | --- | --- | --- |
| Setup | None | Easy | Moderate | Complex |
| Accounts | None | No system accounts needed on server | Leverage existing SSH accounts | No system accounts needed on server |
| Performance | Fast | Stateful protocol; faster than Apache | Stateful protocol (svnserve tunneled over SSH) | Stateless protocol; slower than svnserve |
| Multiple authentication methods available | No | No (but can be configured with SASL) | No | Yes |
| Network traffic encrypted | N/A | No | Yes | Optionally via SSL |
| Advanced logging facilities | No | No | No | Yes |
| Web browsing | No | No | No | Built-in repo browsing |
| Repository mountable as network drive for transparency | No | No | No | Yes |
| Protocol accessible through firewall | No | No (svn) | No (svn) | Yes (http/https) |
| Repository insulated by interface | No | Yes | Yes | Yes |
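To make the columns of the table concrete, here is a small sketch that prints the shape of the repository URL a client would use for each access method. The function name, host names, and paths are hypothetical; only the URL schemes come from the discussion above.

```shell
# Print the repository URL for a given access method.
# Host and path values are illustrative placeholders; path must start with "/".
repo_url() {
    method=$1
    host=$2
    path=$3
    case $method in
        file)       echo "file://$path" ;;           # direct file system access, no server
        svn)        echo "svn://$host$path" ;;       # svnserve
        svn+ssh)    echo "svn+ssh://$host$path" ;;   # svnserve tunneled over SSH
        http|https) echo "$method://$host$path" ;;   # Apache HTTP server
        *)          echo "unknown access method: $method" >&2; return 1 ;;
    esac
}

# Example: repo_url svn example.com /project1
```

Whichever scheme applies, the resulting URL is what you hand to svn checkout or paste into TortoiseSVN. On Windows a file: URL includes the drive letter (for instance file:///C:/repos/project1), which this sketch does not attempt to handle.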

Large Installation

A quick web search would likely point you to the conclusion that the venerable Apache HTTP server is widely used to connect Subversion clients to a Subversion server. There are actually several vendors that supply/support Subversion servers but I am going to limit my remarks to the offering from Collabnet, entitled Subversion Edge, not just because Collabnet was founded by two respected luminaries in computing, Tim O’Reilly and Brian Behlendorf, but also because Collabnet founded the Subversion open source project itself in 2000. (And because my editor thought it would be useful to include here.)

Subversion Edge is actually a package combining a Subversion server, an Apache server, and a web-based repository browser (ViewVC). From the blurb on the Edge website, “Subversion Edge is the answer for easy installation, administration, security, and governance of your Subversion environment.” Well, that sounded like a lot of “marketing-speak” to me. I was frankly quite skeptical. I have set up an Apache installation before. It is not difficult but it is by no means trivial. Collabnet claims that Subversion Edge makes setting up both Apache and Subversion truly simple. So I put it to the test and found that it really required just a few minutes to install and configure Edge, and start browsing my Edge-hosted repository with both the Edge web interface and a separate Subversion client!

The entire stack is administered with a web-based interface. In fact, once you install Edge, the only entry that appears in the menu on your Windows Start Button is a URL (http://localhost:3343/csvn) that launches the web command console. Once you log in, you land on the status page (Figure 9-1), which is clean and informative. Down the left side you have the status of your Subversion server along with the URLs you need to browse with a third-party client and with the supplied repo-browser, ViewVC. The tabs/buttons across the top let you administer the various facets of the system. I opened the Administration tab with some trepidation but found there is little to configure for a basic system. The main settings include just three checkboxes that let you:

  • Switch from http to https for browsing
  • Switch from http to https for the command console
  • Auto-start Subversion when you launch the command console

The last is superfluous in a Windows environment; both the Subversion and Apache servers, installed as Windows services, are configured to auto-start on boot.

Figure 9-1: The Subversion Edge status page provides a URL for your repository for use with other clients as well as a URL to invoke the supplied ViewVC browser interface to your repository.

The next tab, Repositories, gives you access to the repositories on your system. You can both create repositories and discover existing ones (Figure 9-2).

Figure 9-2: The Subversion Edge repository page lists your existing repositories and lets you create new ones.

You can open a new browser window to run ViewVC, the web-based repository browser for Subversion, from either the URL on the status page or the repository list on the Repositories page. Selecting my test-repo1 repository from the previous figure, you see the familiar top-level branches, tags, and trunk nodes of a repository (Figure 9-3, top). You may then drill down to view specific files within ViewVC (Figure 9-3, bottom).

Figure 9-3: The browser-based repository-browser works much like TortoiseSVN’s repository browser, letting you view the log, contents, and annotated (blamed) contents for a file.

While the web-based repo-browser is handy, it is just a repo-browser rather than a full client. It lets you browse directories and files, and view log entries (i.e. history), but that is about the extent of its capabilities. You still need a desktop client to do any real work. I was able to connect both TortoiseSVN and Subversion’s command line client using the URL supplied on the Status page (Figure 9-1) without incident.

Browsing your repository with TortoiseSVN

Typically, you use your working copy to do your day-to-day work, updating frequently and committing as needed. Your working copy is private to you; each person on your team has their own working copy. All of you draw from the same central repository. Occasionally you may want more direct access to the repository. The most common reason for this is to browse through a repository without checking out the files it contains. For large projects with a huge number of files the time to check out might be substantial; if you are looking for something specific it could be much faster to browse the repository and just grab the one or few files you actually need.

TortoiseSVN comes with a repository browser that is accessed, like everything else, through the context menu of a given file or folder (TortoiseSVN >> Repo-browser). This repo-browser provides a Windows Explorer-like GUI to navigate through your source-controlled projects. At first glance, though, it appears to have almost no functionality other than a tree navigation pane on the left. But wait: this is TortoiseSVN, remember, and everything works from context menus, inside the repo-browser as well. Figure 9-4 shows a portion of the context menu for a selected folder. The list of actions includes those for file maintenance (add, rename, delete) but note that Show log and Revision graph appear here just like in Windows Explorer, and both of those give you full access to the rich differencing and reviewing tools of TortoiseSVN.

Figure 9-4: TortoiseSVN’s repository browser also provides access to the full power of TortoiseSVN via context menus just like within Windows Explorer.
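The same browse-and-grab idea works from Subversion's command-line client, using svn list to look around and svn export to fetch a single file without creating a working copy. The sketch below wraps those two commands; the function names, the DRY_RUN switch, and the example URL are my own inventions for illustration.

```shell
# Run an svn subcommand, or just print it when DRY_RUN is set.
svn_do() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "svn $*"
    else
        svn "$@"
    fi
}

# List the entries under a repository URL, then export one file from it.
browse_and_grab() {
    repo_url=$1     # URL of a folder in the repository (placeholder in the example)
    file_path=$2    # path of a single file below that URL
    svn_do list "$repo_url" || return 1       # show what is there, no checkout needed
    svn_do export "$repo_url/$file_path"      # copy just that file to the current folder
}

# Example:
#   browse_and_grab http://example.com/svn/test-repo1/trunk docs/readme.txt
```

Setting DRY_RUN=1 before the call prints the two svn commands rather than running them, which is a convenient way to check the URLs first.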

Browsing your repository with a web browser

If you do not have TortoiseSVN installed you can still browse your repository provided it uses a standard browser protocol: http or https. (So repository URLs using the svn: protocol, for example, do not lend themselves to this recipe.) With no additional support, your Subversion web server can generate web pages that let you walk through the tree hierarchy of your repository. The pages are plain, very plain. They remind me of web pages from the last century! But you can easily enhance the output. The recipe Setting up a Subversion Server showed a web application that does this very thing, ViewVC, integrated into the one-click installation of Collabnet’s Subversion Edge. As shown in Figure 9-3, you get a much enhanced view of your repository complete with buttons to view the log, annotate a file with authorship, or download a given file. Since the indicated recipe already covered ViewVC, I need add nothing here.

Another alternative that is, in some sense, more lightweight than ViewVC is ReposStyle, an open source XSLT style sheet that you can hook up on your server to dress up the plain, default output. Figure 9-5 shows the drab, default output of a Subversion server in the top pane, and the better-designed output after integrating the ReposStyle style sheet in the bottom pane. Notice that you get buttons for individual items (open, view history) as well as a button bar at the top of the content pane (home, up, folder history, and refresh).

Figure 9-5: You can dress up the plain, default output of a Subversion server with various techniques. Here you see the difference provided by applying an XSLT stylesheet from ReposStyle.

The installation of ReposStyle is straightforward; the table below shows the complete installation instructions at the left, with my notes for specifically integrating with Subversion Edge at the right.

Viewing Subversion Statistics

Inevitably, even if your team consists of just one person, you may still want or need to know metrics about your Subversion installation. The lightweight StatSVN utility is a good place to start this exploration. Unfortunately, it appears that development on this tool may have slowed or stalled; at the time of writing the home page indicates the last update was over two years ago and the extensive list of demos (i.e. running StatSVN on a variety of large projects) is just a long list of broken links. Sigh. Nevertheless, the tables and charts StatSVN produces are useful, plus the installation and use of StatSVN is extremely simple, so I do recommend the product.

The StatSVN site provides one sample report that illustrates the depth and breadth of reports, charts, and tables you get. I have combined screen shots from many different pages to provide this thumbnail highlight of the entire report.

  • Lines of code: Monitor your project growth by month (along bottom axis) and by release (dotted verticals indicate tags). Also itemizes by developer, showing a separate plot line for each.
  • Contributions of Developers: See in absolute and percentage terms the contributions of each team member.
  • Words in commit messages: Tag cloud shows frequency of word use by relative word size (left) or view the word list as a table with counts and percentages (right).
  • Tags in the repository: This handy summary of tags provides a history of releases (or other events that you choose to tag) by date and lines of code.
  • Repository inventory: Itemized by folder, this shows file counts and line counts per folder, and even includes deleted items (indicated by the crossed-out folders).
  • Commit activity: Left chart shows by hour of the day, right chart shows by day of the week.
  • Activity: In the scatter plot (left) each dot is a single commit (time of day on Y-axis, calendar days on X-axis). At right, lines of code appear across the top while churn rate (number of lines touched per day) appears at bottom.
  • Files: Count of files over time (left) and average lines per file (right).
  • File Details: Count of files and lines of code by type of file (left) and files with the most revisions (right).
  • Directories: Lines of code per directory (left) and relative sizes of largest directories (right).
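For a quick taste of one of these metrics without any extra tooling, you can tally commits per author straight from the XML that svn log produces. This is only a rough sketch, not how StatSVN itself computes its figures; the function name is my own, and it assumes each author element sits on its own line, which is how the svn client normally writes it.

```shell
# Tally commits per author from `svn log --xml` output.
# Reads the named file(s), or standard input when no file is given.
# Assumes each <author>...</author> element is on a single line.
commits_per_author() {
    sed -n 's:.*<author>\(.*\)</author>.*:\1:p' "$@" |
        sort | uniq -c | sort -k1,1nr -k2 |
        awk '{ count = $1; $1 = ""; sub(/^ +/, ""); print count "\t" $0 }'
}

# Example:
#   svn log --xml | commits_per_author
```

Each output line holds a commit count, a tab, and the author name, with the busiest committer first.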

Installing and using StatSVN is, as mentioned, very simple. In fact, the package consists of just a “readme” file and a Java jar file, so installation is just a matter of putting the jar file in an appropriate folder of your choice. Execution consists of three steps:

  1. Check out everything from your repository; if you already have a working copy, skip this step.
  2. Generate a Subversion log of your entire working copy. The user guide recommends this:
    svn log -v --xml > logfile.log
    Alternatively, you could explicitly specify revisions if desired, e.g.:
    svn log -r HEAD:1 -v --xml > logfile.log
  3. Execute StatSVN with your log file and working copy:
    java -jar /path/to/statsvn.jar logfile.log /path/to/working-copy

There are a number of options you may pass to StatSVN as well. For example, I use -disable-twitter-button to suppress the annoying “Tweet this” appearing in various places. Also, StatSVN spawns a large number of svn diff calls on many threads. When I tried it with the default (25) I had a number of svn diff failures, so I used -threads 5 to reduce the thread count and it worked better for me. See Command Line Options for a list of all of them.
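Putting the steps and the two options above together, a small wrapper might look like the following sketch. The function name, the STATSVN_JAR, STATSVN_THREADS, and DRY_RUN variables, and the jar path are my own placeholders; the svn and StatSVN arguments are the ones already shown in this recipe. It assumes you already have a working copy (step 1).

```shell
# Generate the Subversion log for a working copy, then run StatSVN on it.
# Set DRY_RUN=1 to print the commands instead of executing them.
statsvn_report() {
    wc_dir=$1                                   # path to an existing working copy
    jar=${STATSVN_JAR:-/path/to/statsvn.jar}    # placeholder; point this at your jar
    threads=${STATSVN_THREADS:-5}               # fewer threads than the default of 25
    log_file=$wc_dir/logfile.log

    # Step 2: verbose XML log of the working copy.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "svn log -v --xml $wc_dir > $log_file"
    else
        svn log -v --xml "$wc_dir" > "$log_file" || return 1
    fi

    # Step 3: run StatSVN; options go before the log file and working copy.
    set -- java -jar "$jar" -threads "$threads" -disable-twitter-button "$log_file" "$wc_dir"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example:
#   STATSVN_JAR=/opt/statsvn/statsvn.jar statsvn_report /path/to/working-copy
```

Running it once with DRY_RUN=1 is a cheap way to confirm the paths before StatSVN starts spawning its svn diff calls.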