Inside Red Gate – Experimental Results

As a brief interlude from my Concurrent Collections series, I thought I would give a roundup of how the lean startup experiments have been progressing. As you might expect, there have been some good aspects and some bad ones.

The experiments so far

After lots of discussions, arguments, posing and ruling out hypotheses, we came up with two ‘starter’ hypotheses we could test fairly easily:

  1. Customers don’t agree to send data on how they use SmartAssembly (which features they use, the versions of .NET they have installed, etc.) because the consent dialog doesn’t make it clear that the data is all anonymous and aggregated.

    This is a prerequisite for further experimentation. SmartAssembly isn’t a webapp, with Google Analytics and web logs telling us everything. In order to conduct experiments on SmartAssembly, we need to know how customers are using it once it is installed on their machines, and they need to confirm it’s OK for us to collect that data. Our current acceptance rate for usage reporting on SmartAssembly itself is quite low, so we needed to improve this for future experiments.

  2. Customers aren’t using feature usage reporting and error reporting within their own applications because those options are swamped amongst all the obfuscation options.

Experiment 1

Hypothesis: Customers don’t agree to send data on how they use SmartAssembly because the dialog asking for consent doesn’t make it clear the data is all anonymous

The experiment for this is pretty obvious – improve the wording on the consent dialog. This change was applied to SmartAssembly 6.5. We would compare signup rates with 6.2 to see what effect, if any, this change would have.

Result: Inconclusive

After 6.5 had been released, we found that we weren’t collecting the right data from our download and install process to accurately calculate an acceptance percentage. One of the quirks of the existing feature usage instrumentation is that the answer users give to the consent dialog is carried over to all future versions of the product with the same major version.

Since 6.5 is a minor version upgrade from 6.2, this means we couldn’t differentiate between an existing customer downloading 6.5 to upgrade from an earlier 6.x version (who wouldn’t be presented with the new consent dialog) and a new user downloading for the first time. The data we collected couldn’t be interpreted one way or the other; there were too many other variables.
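
To make the confound concrete, here’s a minimal sketch (in Python, with hypothetical field names – our actual instrumentation doesn’t look like this) of why the acceptance percentage couldn’t be calculated:

    # Each record is one 6.5 installation reporting back to us.
    installs = [
        {"consented": True,  "is_first_install": True},   # saw the new consent dialog
        {"consented": False, "is_first_install": False},  # upgrade: answer carried over from 6.x
        # ...
    ]

    # Only first-time installs measure the new dialog; upgrades inherit
    # whatever answer they gave to the old one.
    fresh = [r for r in installs if r["is_first_install"]]
    if fresh:
        acceptance_rate = sum(r["consented"] for r in fresh) / len(fresh)

    # Without a reliable "is_first_install" flag, the only rate we could
    # compute mixed both populations, which tells us nothing about the
    # reworded dialog.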

Experiment 2

Hypothesis: Customers aren’t using feature usage reporting and error reporting within their own applications because those options are swamped amongst all the obfuscation options

To perform this experiment, we produced a version of SmartAssembly that only had merging, signing, feature usage and error reporting available. We only wanted to present this version to customers who we knew were downloading SmartAssembly for the error and/or feature usage reporting. The only place we could reasonably guarantee this was a reporting-specific landing page. So, the download link on that page would be A/B tested between the reporting-only and standard versions of SmartAssembly.
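
For anyone curious what such a split looks like, here’s a rough sketch (in Python; the URLs, function names and bucketing scheme are placeholders, not what we actually ran) of A/B-testing a single download link:

    import hashlib

    # Hypothetical download URLs for the two builds.
    VARIANTS = {
        "reporting_only": "https://example.com/downloads/sa-reporting-only.exe",
        "standard":       "https://example.com/downloads/sa-standard.exe",
    }

    def choose_download_variant(visitor_id: str) -> str:
        """Deterministically bucket a visitor into one side of the test."""
        bucket = int(hashlib.sha1(visitor_id.encode()).hexdigest(), 16) % 2
        variant = "reporting_only" if bucket == 0 else "standard"
        log_event("download_link_served", visitor=visitor_id, variant=variant)
        return VARIANTS[variant]

    def log_event(name: str, **fields) -> None:
        # Stand-in for whatever analytics pipeline records the event;
        # every link the visitor can click should be logged the same way.
        print(name, fields)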

Result: Not enough data

Our limited scope for this test – one specific landing page amongst the many pages from which customers could download SmartAssembly – meant we got very few people downloading the reporting-only version; not enough for any conclusion to be drawn one way or the other.

We had also added a ‘cop-out’ link on the download page so people could be sure of getting the standard version if they did happen to download the reporting-only version and wonder where all the obfuscation went. We suspect a lot of users clicked this link; unfortunately, it was untracked, so we don’t really know.

It’s not all bad news…

So, the first experiments we performed didn’t really work. However, that doesn’t mean they weren’t useful. We worked through a lot of the infrastructure issues that restricted our experimentation, and, most importantly, we learnt things:

  1. Make sure you know what the experiment is for, and what data you will collect to validate or refute your hypotheses. Check that the data you will be collecting can actually be interpreted to answer that question.
  2. For an A/B test, ensure you will have a sensible number of users on both sides to be able to draw conclusions (a rough way to estimate this is sketched below).
  3. Don’t allow users to opt out of experimental builds, even if they’re not marketed as ‘experimental’ versions. Don’t blur the boundaries between experimental and non-experimental builds.
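
On point 2, a back-of-the-envelope calculation would have told us the landing page couldn’t deliver enough traffic. Here’s a rough sketch (in Python; the 10% and 15% conversion figures are made up for illustration) of the standard two-proportion sample-size estimate:

    from math import ceil

    def samples_per_arm(p_baseline: float, p_expected: float,
                        z_alpha: float = 1.96, z_power: float = 0.84) -> int:
        """Rough sample size per variant for a two-proportion test
        (95% confidence, ~80% power)."""
        p_bar = (p_baseline + p_expected) / 2
        delta = abs(p_expected - p_baseline)
        return ceil(2 * p_bar * (1 - p_bar) * (z_alpha + z_power) ** 2 / delta ** 2)

    # Detecting a lift from, say, a 10% to a 15% download rate needs
    # samples_per_arm(0.10, 0.15) ≈ 690 visitors per side – far more than
    # one niche landing page was ever likely to see.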

So what now?

We’ve decided we’re going down the wrong route. We shouldn’t be trying to target existing users of SmartAssembly, who are looking primarily for an obfuscation tool; we need to target new users. We’ve decided to pivot and create a new product, a new website, and a new analytics platform. If you’re interested, there are samples of the kind of data you could have on your applications at http://www.applicationmetrics.com, as well as a sign-up form to receive more information and take part in our experiments.