When working on a tool which interacts with Microsoft Azure and attempting to measure performance, it’s basically impossible to get anything resembling a fair and consistent test. A test run performed over lunchtime may get a wildly different set of timings to one performed at night when everyone is out of the office. Results can even vary day to day depending on the network traffic around you.

The Cerebrata team were recently working on measuring and hopefully improving transfer speeds to and from Microsoft Azure blob storage, and we were struggling to get any data that we could rely on. The proposed solution: running the tests on VMs in Azure, housed in the same data center as our blob storage. Not perfect by any stretch of the imagination, as all control over the underlying resources is given up, but preferable to running the tests from our office. This way, the main cause of variation in our results (an inconsistent connection) was much less likely to be a problem, thanks to fewer hops to the destination and the ability to use only the high-bandwidth connections in the data center.

Of course, deciding to run the tests in a Microsoft Azure VM is all well and good, but what happens if you want to run them more than once? Remoting into the VM multiple times a day to kick off tests is going to become tedious very quickly and, every time you forget to go kick those tests off, time is going to be wasted. Also, if the tests are still under development and more are being added each day, you have to find a way of getting the required files onto your VM which you don’t mind running through for every update. When all this has been done, where are your results? On the VM? That’s another trip!

We knew that we’d be running these tests multiple times a day over a period of a month or two, so we wanted to automate the process as much as possible. However, there was no need for the tests to run once those few weeks of work were done, so we weren’t interested in involving the build server, deploying Azure VMs on demand, or anything like that. In fact, all the automation lived on the VM itself, and this is how we implemented it.

Setting up the VM

The limiting factor for performance was always going to be bandwidth, so beyond that there were no special VM requirements. In Azure, the size of the VM you run affects the amount of bandwidth you are allocated on the physical network interface card (NIC): unsurprisingly, the larger the VM you select, the more bandwidth you are allotted. However, if the other VMs on the same physical machine aren’t busy using the NIC, you might get a little extra throughput. This is something to be aware of, especially for a test like ours.

We were in an unusual position, because we had chosen to express the different options for the performance runs as code guarded by conditional compilation symbols set at build time. This meant that our VM would need to be able to build C# code during the test runs, and for this we needed to install the Windows SDK. Tests were run using NUnit, so this was installed too and the VM was ready to run tests. Nothing was automated at this point, but we’ll come to that shortly.

Gathering test results

When starting out, output from each test was written to the console and redirected into a text file on the command line if required. This wasn’t going to work when automated, and it also left the results files on the VM, which would mean manually fetching them at some point. The first step was to make each test write its results to a file as well as to the console, so we had a single results file per test. This still left us with text files on the VM, so we decided to upload these files into blob storage, including the test names and dates to identify them. We now had access to all the results files, even if the VM was shut down or destroyed and, using the free Azure Explorer tool, that’s basically like having them on your local machine.
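As a rough illustration of the "write to the console and to a per-test file" step, here is a minimal Python sketch. It is not the original C# test code: the `results_name` and `ResultsWriter` names, the timestamp format, and the folder-per-test layout are all assumptions made for the example. The upload itself is left out; the generated name is simply the sort of thing that could double as the blob name.

```python
from datetime import datetime
from pathlib import Path


def results_name(test_name: str, started_at: datetime) -> str:
    """Build a sortable name that identifies the test and when it started."""
    stamp = started_at.strftime("%Y-%m-%dT%H-%M-%S")
    return f"{test_name}/{stamp}.txt"


class ResultsWriter:
    """Write each result line to the console and to a per-test results file."""

    def __init__(self, directory: Path, test_name: str, started_at: datetime):
        # One file per test run (hypothetical layout: <test name>/<timestamp>.txt).
        self.path = directory / results_name(test_name, started_at)
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def write(self, line: str) -> None:
        print(line)  # still visible when a test is run by hand
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(line + "\n")
```

Putting the test name first and a zero-padded timestamp second means the files list in a sensible order in any storage browser.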


Figure 1 – Azure Explorer

Progress was being made, but looking through hundreds (potentially thousands) of text files full of timings isn’t ideal. It quickly becomes very tedious, so some processing was needed. For this, we wrote an application that would parse each text file, merge the results for identical tests, and then average the timings. This gave us an average for each combination of test configuration and transfer file size. These averages were sorted in ascending order, so the best-performing set of options appeared first in an easy-to-read format. These summary files were also uploaded to blob storage and updated whenever the results-parsing app was run.
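The merge-then-average step can be sketched in a few lines of Python. This is not the original results-parsing application: the `configuration,file size in MB,seconds` line format and the configuration names in the usage example are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarise(lines):
    """Average timings per (configuration, file size).

    Returns (size_mb, average_seconds, configuration) tuples, sorted so that
    the fastest configuration comes first within each file size.
    """
    timings = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between results
        config, size_mb, seconds = line.split(",")
        # Results for identical tests are merged under the same key.
        timings[(config, int(size_mb))].append(float(seconds))
    rows = [(size_mb, mean(values), config)
            for (config, size_mb), values in timings.items()]
    return sorted(rows)
```

For example, feeding it `["block4mb_x8,100,12.0", "block4mb_x8,100,10.0"]` merges the two runs into a single row with an average of 11.0 seconds.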


Figure 2 – Results summary file

The final piece of information we wanted to look at regularly was the distribution of timings for a particular configuration option or set of options. This was achieved by parsing the individual results files again, this time importing the timings data into Excel, which became the basis for our customizable graphs. Again, the Excel file was uploaded to blob storage for resilience and ease of access.
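To give a feel for how raw timings can be laid out for graphing, the sketch below writes one column per configuration. It swaps in a plain CSV file, which Excel can open, for the real Excel workbook the team generated. The function name and the dictionary input are hypothetical.

```python
import csv
import io
from itertools import zip_longest


def timings_to_csv(timings_by_config):
    """Return CSV text with one column of raw timings per configuration."""
    configs = sorted(timings_by_config)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(configs)  # header row: configuration names
    columns = [timings_by_config[name] for name in configs]
    # Configurations may have different numbers of runs; pad short columns.
    for row in zip_longest(*columns, fillvalue=""):
        writer.writerow(row)
    return buffer.getvalue()
```

Keeping every individual timing, rather than only the averages, is what makes it possible to plot the distribution for a given configuration afterwards.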

Having this results-parsing application run at the end of each group of test runs gave us individual results files for a deep dive look into the performance runs, ordered summaries of all the runs, and a spreadsheet primed with data for graphing – all of which were easily accessible in blob storage.


As it happens, automating the test runs was probably the least interesting part of the whole process, as we kept things simple and just used a batch file coupled with Windows Task Scheduler. As mentioned earlier, the VM would be building the tests and libraries, so the source code was present. The batch file would loop through the different configuration options, build each one using the specified compilation symbols, and then run the tests. After this loop was finished, the results-processing app was built and run to generate our summary results files and Excel spreadsheet.
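The build-and-test loop looks roughly like the following when written in Python rather than as the original batch file. The symbol names, project file, and assembly path are hypothetical, and the sketch assumes `msbuild` and the NUnit console runner (`nunit-console`) are on the PATH.

```python
import subprocess

# Hypothetical compilation symbols, one per performance configuration.
CONFIGURATIONS = ["BASELINE", "LARGE_BLOCKS", "PARALLEL_UPLOADS"]


def commands_for(symbol, project="PerfTests.csproj",
                 assembly="bin/Release/PerfTests.dll"):
    """Return the build command and the test command for one configuration."""
    build = ["msbuild", project, "/t:Rebuild", "/p:Configuration=Release",
             f"/p:DefineConstants={symbol}"]
    test = ["nunit-console", assembly]
    return [build, test]


def run_all(symbols, run=subprocess.check_call):
    """Build and test each configuration in turn, stopping on the first failure."""
    for symbol in symbols:
        for command in commands_for(symbol):
            run(command)
```

A full rebuild per configuration matters here: the symbols change which code is compiled, so an incremental build could leave a stale assembly from the previous configuration. Calling `run_all(CONFIGURATIONS)` would then work through every configuration in order.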

Getting code changes

The initial setup to get our baseline test running was pretty quick and, given that the test takes several hours to complete and the more data we have the better, we wanted to start running as soon as possible and add different configurations as we went along. The problem we still needed to solve was how to update the VM with the latest source files before each test run. Having to remote into the VM to check that the previous test run had finished, and then copy the source files across manually from a local machine, would have rendered the rest of the automation pretty much useless.

Luckily, the source code for these performance tests was stored in GitHub, and obviously the VM had access to the internet so we could pull changes down easily. To automate this, another batch file was created which would use the Git shell to pull down the latest changes and run the previously created batch file to get the tests going. As long as this second batch file managing the pull wasn’t modified, we didn’t have to log into the VM at all, and the tests were always run with the latest checked-in changes.
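The "pull, then run" wrapper is small enough to sketch in full. This is a Python stand-in for the second batch file, not the file itself. The repository path and test command are whatever applies on the VM, and `--ff-only` is an extra safety choice made for this example, not something the original necessarily used.

```python
import subprocess


def update_and_run(repo_dir, test_command, run=subprocess.check_call):
    """Bring the checkout up to date, then start the test run from it."""
    # --ff-only refuses to merge, so an unexpected local change on the VM
    # fails loudly and the tests never run against a half-merged tree.
    run(["git", "pull", "--ff-only"], cwd=repo_dir)
    run(test_command, cwd=repo_dir)
```

Because this wrapper is the one piece that cannot update itself during a run, it pays to keep it this small and push everything likely to change into the scripts it calls.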

Switching Windows Task Scheduler to start this new batch file instead of the old one was easy, and we had options for how to schedule and manage the cycles of testing. We could add a check to make sure that the previous test run had completed before the Scheduler kicked things off again, we could simply run on a schedule, or we could run the whole process again as soon as the previous run had finished.
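The first of those options, checking that the previous run has finished, can be done with a simple lock file. The sketch below is one way of doing it in Python, not the check the team actually used. The lock path and the `RunInProgress` name are assumptions.

```python
import os
from contextlib import contextmanager


class RunInProgress(Exception):
    """Raised when a previous test run still holds the lock."""


@contextmanager
def single_run(lock_path):
    """Hold an exclusive lock file for the duration of one test run."""
    try:
        # O_EXCL makes creation fail if the file already exists, so two
        # scheduled runs cannot both believe they acquired the lock.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RunInProgress(lock_path)
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A scheduled job would wrap the whole run in `with single_run(path):` and exit quietly on `RunInProgress`. One caveat: if the VM reboots mid-run the lock file is left behind, so a stale lock has to be cleared by hand or by an age check.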

Wait, this environment doesn’t represent our user base

The automation was all set up and running well, and the results were piling in. It quickly became apparent that larger block sizes and larger numbers of concurrent block uploads gave the quickest times, but this was hardly surprising on a VM with huge bandwidth. This obviously didn’t represent the general use case for our customers, as they would mostly be transferring files from their local networks, which wouldn’t be comparable to our Azure VM.

We wanted Azure Management Studio to be able to transfer files quickly, but not only for users with brilliant connections. We needed to run our tests in an environment with limited bandwidth to see if our findings were the same. To achieve this, we used the Network Emulator Toolkit for Windows – a software emulator which can be set for bandwidth, packet loss, errors, latency and much more. It isn’t the prettiest application but it is extremely useful. Leaving this running on the VM, we could run the tests with a specified bandwidth and network environment to better emulate our customers’ real environments.


Figure 3 – Network Emulator Toolkit


This was a quick and easy way to get some test automation running in Azure which worked really well for us, mainly because it was always only going to be needed for a month or two. Given that we were just running a few VMs for a couple of months, cost wasn’t really an issue so we didn’t even feel the need to shut down the VMs during inactive periods. We could have easily achieved this by using the batch file to shut down the VM after test completion, and then using our build server to run a PowerShell script to start it up again for the next round of tests.

Had the work been carried out now, we could also have taken advantage of the new Microsoft Azure Automation features that are currently in preview.

Equally, had this been for long-term testing or for testing spread over multiple VMs, we would certainly have involved the local build server and written infrastructure to create VMs on demand, run the tests remotely and graphically display the test results. With the new optional VM Agent on Microsoft Azure VMs and the addition of Puppet and Chef to the extensions, there are lots of exciting possibilities to create extensive automated testing infrastructures in Azure, particularly if you’re investing in something you know you’ll be using for a while.