Test-driven scalability in Exchange Server Archiver

I’ve recently been working on Red Gate’s new Exchange Server Archiver tool. Earlier on in the development process, I was developing part of the Archive Service (the component responsible for extracting messages from Exchange Server and passing them over to be stored).

As with all Red Gate projects, we involve testers from a very early stage, and try to use automated tests wherever possible. In this particular case, Jon had written a test harness for my code that worked roughly as follows:

  • Log on to Exchange, insert some number of messages, then log off
  • Run an archive using a particular set of conditions
  • Log back on, retrieve the message stubs, and compare them to the expected result, and to the now-archived messages in our store
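To make the flow concrete, here’s a minimal, runnable sketch of that insert–archive–verify round trip. The `Mailbox` and `Archiver` classes below are stand-ins I’ve invented for illustration – they are not the real Exchange or Archiver APIs:

```python
# Illustrative stand-ins for the test harness flow: insert messages,
# run an archive under some condition, then verify stubs and the store.

class Mailbox:
    def __init__(self):
        self.messages = {}   # message id -> body
        self.next_id = 0

    def insert_message(self, body):
        self.next_id += 1
        self.messages[self.next_id] = body
        return self.next_id

class Archiver:
    def __init__(self, store):
        self.store = store

    def run(self, mailbox, condition):
        # Move matching messages into the archive store, leaving a stub behind.
        for msg_id, body in list(mailbox.messages.items()):
            if condition(body):
                self.store[msg_id] = body
                mailbox.messages[msg_id] = "<stub>"

def run_test_case(bodies, condition):
    mailbox, store = Mailbox(), {}
    ids = [mailbox.insert_message(b) for b in bodies]          # step 1: insert
    Archiver(store).run(mailbox, condition)                    # step 2: archive
    stubs = [mailbox.messages[i] for i in ids]                 # step 3: verify
    return stubs, store

# Archive anything longer than one character: "big" is stubbed, "x" is left alone.
stubs, store = run_test_case(["big", "x"], lambda b: len(b) > 1)
```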

All this started out running rather nicely, but over time, the system started to show signs of stress. We got up to around the four-thousand-test-case mark, by which time it was taking an unacceptably long time to do what seemed like a simple operation. Specifically, three minutes to insert a message. What!?

My first move was to try and reproduce the problem on my own system, but in the true spirit of evil bugs, everything worked fine – seven seconds! The usual candidates of rebooting servers, checking for available memory and so on didn’t help, and sending messages from Outlook worked fine.

Somewhat confused, we grabbed the beta version of ANTS Profiler 4 and installed that on Jon’s machine. I was involved in the development of ANTS Profiler 3, so was very interested to see how AP4 was looking!

We profiled the tests, which included the archive service code as well, and grabbed the results. The new call tree instantly showed the problem: the message insertion was responsible for 99% of the time taken, and 80% of that was spent in the code that populates the folder tree of a mailbox. Within that, the call tree just went straight off the edge of the screen – it was behaving far more recursively than I’d ever have expected. Surely this wasn’t right?

A quick trip back to Outlook revealed the cause: one of the other tests had created 250 folders nested one inside another, and my code was repeating most of the same work for every folder, rather than doing it just once, which was all that was actually necessary. In a nutshell, as soon as the number of folders in a mailbox became large, the performance went downhill rapidly.

Whilst I didn’t quite believe Jon’s assertion that a 250-deep folder structure was a valid use case, something still had to be done. A quick tweak of the code to bring the repeated effort out of the loop to where it should have been, and the test took ten seconds to run – nearly twenty times faster than before!
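The shape of the bug and the fix can be sketched as follows. These are made-up names, not the actual archive-service code: the slow version re-derives shared per-mailbox data for every folder, so the cost grows with the square of the folder count, while the fix computes it once, outside the loop:

```python
# Count how often the expensive setup runs in each version.
calls = {"count": 0}

def build_property_map(n):
    """Stand-in for the expensive per-mailbox setup that was being repeated."""
    calls["count"] += 1
    return {i: f"prop-{i}" for i in range(n)}

def populate_slow(folders):
    results = []
    for f in folders:
        props = build_property_map(len(folders))  # repeated on every iteration!
        results.append((f, props[f]))
    return results

def populate_fast(folders):
    props = build_property_map(len(folders))      # hoisted: done just once
    return [(f, props[f]) for f in folders]

folders = list(range(250))

calls["count"] = 0
slow_result = populate_slow(folders)
slow_calls = calls["count"]   # once per folder

calls["count"] = 0
fast_result = populate_fast(folders)
fast_calls = calls["count"]   # once in total
```

Both versions produce the same output; only the amount of repeated work differs, which is why the fix was safe to make as “a quick tweak”.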

Before ANTS Profiler, I would have resorted to a liberal spattering of Debug.WriteLine (DateTime.Now) statements in an attempt to see where the time was being spent. And no doubt got very confused when it appeared so many times! Instead, I had all the information in front of me, and solving the problem took a matter of minutes. It also saved hours of time spent executing tests, which I can only imagine left more time for Jon to think of even more evil things to throw at the system…
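For illustration, here’s a sketch (in Python rather than C#) of why that timestamp-spattering approach confuses in recursive code: a single probe line fires once per recursion level, so one `WriteLine` turns into hundreds of lines of output:

```python
# A crude timing probe inside a recursive folder walk, mimicking the
# Debug.WriteLine(DateTime.Now) approach. Names are illustrative only.
import time

def populate_folder_tree(depth, log):
    log.append(("entering populate", time.time()))  # the crude probe
    if depth > 0:
        populate_folder_tree(depth - 1, log)        # recurse into the child folder

log = []
populate_folder_tree(250, log)
# The probe has now fired 251 times, once per nesting level.
```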