Tales from a Cloud Software Firm

Following on from a discussion about how people are using the cloud, the Simple-Talk Editorial Team sat down with the CTO of 10th Magnitude to talk about how they deliver Cloud software and services for their clients using agile methodology, and Azure.

The Editorial Team of Simple-Talk was staring thoughtfully into their Theakston’s Old Peculier at the weekly editorial meeting. We’d been ruminating about the extent to which ordinary commercial websites and internet applications had moved to the cloud. Opinions had ranged from the cynical to the nephophilic (cloud loving). How, we had been wondering, are they using the Cloud? The conversation had lost any workable steam-pressure and had lapsed into a contemplation of the strain of hop used in the beer. All of a sudden, someone chirped up ‘Why don’t we ask a customer of one of our Cloud Services?’

So we did.

10th Magnitude develop custom software applications in the Cloud for a wide range of clients who have a new idea or new product they want to bring to market. These clients don’t necessarily have a whole lot of product design or IT knowledge, so 10th Magnitude takes that vision and designs it as a software application, builds it, and then operates it for them in the Cloud. Clients generally don’t deal with any of the technology, design or operations end of things. We contacted Steve Harshbarger, their CTO at the time, to find out more.

Simple-Talk:
Do you manage all the backend services for the applications that you provide for your clients?
Steve H.:
We have some clients that are larger organisations with IT departments, so we help them to transition to the Cloud and work with their IT people; that’s part of the business. But we’re a ground-up development company in which we’re doing everything for the client.
Simple-Talk:
Do you use the same Cloud services that you provide to your customers in your own business?
Steve H.:
Oh yeah, you know we don’t even own a server, and we’re very proud of that.
Simple-Talk:
For your clients, do you use a mix of Cloud providers depending on what they need, or do you usually go with Azure or Amazon?
Steve H.:
There’s a whole mix, but Azure is at the centre of it. We’re very well versed in back-end .Net development, and we find the Azure platform to be very productive for us.
Simple-Talk:
Is this because you can stay in the Microsoft stack?
Steve H.:
Yes, and we’re often integrating with other services, like Red Gate for backups or Postmark for email. When we do payments, there are different payment gateways, so an application typically has four or five different Cloud services integrated, but our custom code typically runs on Azure.
Simple-Talk:
How do you choose which cloud services to provide?
Steve H.:
It actually starts on the business side. Generally we’re looking for a fast, cost-effective route to market. The Cloud takes all the networking, hardware and infrastructure issues off the table, and just lets us talk about the product.
Simple-Talk:
So it’s a business decision first?
Steve H.:
Yeah, it really is. Once they get there, that scalability becomes very important because they can start small in the market and, as the application or customer-base grows, we can turn up the number of resources. That’s always appreciated.
Simple-Talk:
Can you tell me anything about how the services are implemented technically?
Steve H.:
Sure. It depends on whether it’s a web-based or mobile application, and often it’s both. For the web-based stuff, our stack is typically, from the bottom up, a SQL Azure database and Azure storage. On top of that we run web and worker roles, depending on the nature of the work. We tend to use C# .Net 4.0/5.0, MVC 4 and Entity Framework on the server side, and on the frontend it’s really an HTML5, jQuery, CSS, ‘run on any browser’, ‘run on any tablet’ type of approach. For mobile, we’ve done both native Android and iPhone, as well as HTML5 optimised for mobile; it just depends on the situation. The mobile UI will be talking to a REST API on the server.
Simple-Talk:
Do you see much difference on the performance side of things? Is it better to use services like SQL Azure as opposed to just normal SQL Server, or do you use both?
Steve H.:
We’ve never had to benchmark the same application under both scenarios. However, we have specifically load-tested applications that we write for customers before they go online, to make sure they’re going to scale and we’ve never had a problem getting it to the performance level we want.
Simple-Talk:
Generally speaking, the most common criticism of SQL Azure is that of backup. Do your clients see this as a problem? Do you see a continuing requirement for third-party tools to assist with this?
Steve H.:
We’ve used Red Gate’s Cloud Services tool for as long as it’s been out to run our backups nightly, and I just became aware that Microsoft has created its own backup service within Azure that performs a similar backup on a schedule. I haven’t compared the two yet, but I know they charge you for the database it creates temporarily as it’s creating the backup. For now the Red Gate tool continues to be the right solution.
Simple-Talk:
Do your clients tend to get all of the services from the beginning with a complete implementation, or do people usually start with certain aspects of the application, such as Azure websites, and then they add other services as need be?
Steve H.:
It’s more how we roll out features. We have an Agile development methodology and we walk the client through the process of figuring out what their two or three-year vision is, and then we walk them back to the minimal viable product to get started, so that they can get it done quickly at a modest investment. We get the applications out into production, and we’ll start adding new versions when they get feedback. But the core web role, worker role, SQL Azure database stuff is there from the start.
Simple-Talk:
You mentioned that you had an Agile methodology that you walked your clients through? Can you tell me a little more about that?
Steve H.:
There are really three parts to the way that we do Agile: envisioning, development and operations. In the envisioning process, we spend two or three weeks with the customer, laying out what they want to build. We do wire-frames, prototypes, user stories and technical designs, but we do it very quickly and very collaboratively. Out of this comes a roadmap for what needs to be developed. Then we move into our development phase. You can do all the design work you want, but we think that customers don’t really know what they need until they see it. We deliver features every two to three weeks that they can actually test, play with and give us feedback on. These are called iterations in the Agile world, and so we build in flexibility. As it unfolds before them, they may have new ideas, so we iterate. It’s usually a two or three month development process and we like to keep it fairly short. This gives us a production application that we put into operation. At that point it’s up and running in the Cloud, and we’re doing things such as running backups, doing health monitoring and running a help desk in case issues come up. Those are the three steps that we go through to get that up and going.
Simple-Talk:
You say that you work on a three-week delivery cycle during the development of the product. What is it that allows you to deliver at that pace? Does it require continuous integration?
Steve H.:
We make a conscious decision to deliver at that pace. We use the deployment and publishing features built into Visual Studio, which connect to Azure. We have a couple of environments going in parallel. Day to day, people are running in the Azure Emulator on their laptops, but we always have a development environment up on Azure. A couple of times a week the code is pushed up there, tested and run, so you could say it’s semi-continuous: there’s a deployment going on every couple of days.

There’s also a testing environment that the customer has access to, and that’s the one we push to if we’re delivering on a Friday. On the Thursday afternoon we’re typically pushing up to that testing environment, so it’s refreshed every couple of weeks. Then a major release to the production environment could happen every couple of months or so. So that’s how we do it.

Simple-Talk:
Do you use Test-driven development in your Agile process?
Steve H.:
Here and there is the answer. We’ve done an application that processes payments. It’s basically a credit card processing application. For the core logic that actually takes your credit card information, processes the payment, records it, things like that, we do TDD, and there’s a whole suite of unit tests around it because it’s critical and it’s hard to get into a user interface and test all the permutations. In other areas we write specs and test more traditionally and manually, so it just depends on whether it has value or not in a particular area.
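To illustrate the kind of unit-test suite described here, where the permutations of core payment logic are hard to exercise through a user interface, the following is a minimal sketch. The `validate_charge` function and its two rules are hypothetical and are not 10th Magnitude’s code; the Luhn checksum is the standard check-digit scheme for card numbers, and `4111 1111 1111 1111` is a widely used test number.

```python
import unittest


def luhn_valid(card_number: str) -> bool:
    """Return True if the digits in card_number pass the Luhn checksum."""
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if len(digits) < 12:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def validate_charge(card_number: str, amount_cents: int) -> list[str]:
    """Hypothetical rule set: return a list of problems, empty if acceptable."""
    problems = []
    if not luhn_valid(card_number):
        problems.append("invalid card number")
    if amount_cents <= 0:
        problems.append("amount must be positive")
    return problems


class ValidateChargeTests(unittest.TestCase):
    def test_accepts_well_formed_charge(self):
        self.assertEqual(validate_charge("4111 1111 1111 1111", 1999), [])

    def test_rejects_bad_check_digit(self):
        self.assertIn("invalid card number",
                      validate_charge("4111 1111 1111 1112", 1999))

    def test_rejects_zero_and_negative_amounts(self):
        for amount in (0, -1):
            self.assertIn("amount must be positive",
                          validate_charge("4111 1111 1111 1111", amount))
```

Saved as a module, the suite can be run with `python -m unittest`. In a test-first workflow each test would be written before the rule it exercises.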
Simple-Talk:
Would it be fair to say then that you guys use TDD more when it’s to do with more critical functions?
Steve H.:
Yeah, I think if it’s financial or critical, or the logic you’re testing is really complex with a lot of permutations, and just having someone sit down and test it for an hour is not going to cover it, then you’d be more methodical about it. So we like to pick and choose. Personally I think if you use TDD for everything, you’re investing too much in testing, so I like to be more strategic about where to use it.
Simple-Talk:
When you first got started, did you find the Microsoft resources comprehensive? Were you able to figure out everything you needed from documentation when you started using the services?
Steve H.:
I personally started using Azure when it was still in preview mode, so from then to now the improvement in terms of resources has been fantastic.

I feel that the information’s good, and I get a lot out of following Scott Guthrie’s blog. He does a great job at telling you what’s new and why, and giving you a trail head where to find more info.

He just did one a couple of weeks ago; it was like a whole day on Azure, and it was good. He gave a keynote for a couple of hours on the whole platform and everything that’s new, and then you could watch the videos of drilldowns on what you’re interested in.

I tend to watch the keynotes and then, if a technology is relevant to what we’re doing, I may watch that. Stack Overflow is always a good place to go since it is peer-to-peer stuff. Cloud technologies are changing so fast that, whereas five or six years ago you could go buy a book on a subject and it would be pretty current and relevant, now you need other ways to keep abreast of developments.

All Cloud services, not just Azure, are being developed so fast by the big technology companies that the books that cover the current technology don’t exist, so it’s really down to relying on the community for support.

Simple-Talk:
What kind of advice do you give clients when they’re figuring out what sort of services they need?
Steve H.:
I try to use the minimum number of services that accomplish the goal because the more you use, the more connection points, complexity and risk. That’s rule number one. When we design applications, we have to design for performance, scalability and security as well as cost. You can make different design decisions that give the same feature, but one may cost twice as much to run as the other, so you have to be efficient in how you’re spending money. For example, I may have three wildly different processes I want to run in the background and I could put those on three worker roles because that’s convenient, or I can put them on one because then I can share the cost of running that and save the client a lot of money as it scales. We have to think about that pretty actively.
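The trade-off described here, three background processes on three worker roles versus all three sharing one, can be sketched as follows. This is an illustration under stated assumptions rather than Azure code: `Job`, `run_shared_worker` and `monthly_cost` are hypothetical names, the loop uses simulated ticks instead of a real clock, and any hourly rate passed in is a placeholder, not a real price.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    name: str
    every_n_ticks: int          # how often the job should fire
    action: Callable[[], None]  # the background work itself


def run_shared_worker(jobs: list[Job], ticks: int) -> dict[str, int]:
    """Run every job on one shared loop; return how many times each fired."""
    runs = {job.name: 0 for job in jobs}
    for tick in range(1, ticks + 1):
        for job in jobs:
            if tick % job.every_n_ticks == 0:
                job.action()
                runs[job.name] += 1
    return runs


def monthly_cost(instances: int, hourly_rate: float, hours: int = 730) -> float:
    """Cost of keeping `instances` running all month at an assumed hourly rate."""
    return instances * hourly_rate * hours


if __name__ == "__main__":
    jobs = [
        Job("send_email", 1, lambda: None),      # placeholder actions
        Job("resize_images", 5, lambda: None),
        Job("nightly_report", 60, lambda: None),
    ]
    print(run_shared_worker(jobs, ticks=60))
    rate = 0.10  # made-up hourly rate, for comparison only
    print("three roles:", monthly_cost(3, rate), "one role:", monthly_cost(1, rate))
```

Whatever the actual rate, the consolidated design pays for one always-on instance rather than three. The cost of sharing is that the jobs can no longer be scaled or restarted independently, so the choice depends on how heavy each process is.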
Simple-Talk:
What’s the most effective way to identify the biggest cost savings and keep costs down? If you don’t know what you’re doing, I imagine it can get very expensive quite quickly.
Steve H.:
Typically, the largest costs are the compute resources, the web and worker roles. The database and the storage are typically a fraction of this.

In development, we run everything on extra small roles, which will save a lot of money in development.

In development, you’re going to have two or three environments set up. You’ll have a dev environment, a test environment and something for the client to look at, so be very careful of the costs of doing that. The other thing that can bite you is forgetting to turn off cloud resources. It’s easy to spin up new instances in the Cloud; you might bring them up for a load test on Monday morning in order to show your customer the application. If you forget to turn those off, they might run for a month before they show up on a bill.

You really have to be on top of what you’re spinning down, so it’s best to assign that responsibility to someone on the team to keep track of that.

And then when we go to production, we do load testing, so we’ll simulate however many users or hits we think might be reasonable, and that will help us figure out how many web roles we should run initially. If the answer’s two and you didn’t do a load test and just assumed you needed to run ten, then you have all this extra capacity that you’re paying for, so do load testing. There are actually tools in the Azure portal that will show you the utilisation of your resources, so you can see if you’re running at 90% of your CPU or 5%. And if it’s low, maybe you need to use a smaller machine or dial things back.
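The sizing reasoning above can be written down as a small calculation. This is a sketch only: the function names, the 70% headroom target and the 20%/80% CPU thresholds are assumptions chosen for illustration, not figures from the interview or from Azure.

```python
import math


def instances_needed(peak_rps: float, rps_per_instance: float,
                     target_utilisation: float = 0.7, minimum: int = 1) -> int:
    """Estimate how many instances a load-tested peak requires.

    peak_rps           expected peak requests per second
    rps_per_instance   throughput one instance sustained in the load test
    target_utilisation fraction of that throughput to plan for (headroom)
    """
    if rps_per_instance <= 0 or not 0 < target_utilisation <= 1:
        raise ValueError("throughput must be positive and utilisation in (0, 1]")
    usable = rps_per_instance * target_utilisation
    return max(minimum, math.ceil(peak_rps / usable))


def suggest_action(cpu_percent: float, low: float = 20.0, high: float = 80.0) -> str:
    """Turn an observed CPU utilisation figure into a rough recommendation."""
    if cpu_percent < low:
        return "scale down"
    if cpu_percent > high:
        return "scale up"
    return "leave as is"


if __name__ == "__main__":
    # Assumed numbers: 120 requests/s at peak, 100 requests/s per instance.
    print(instances_needed(peak_rps=120, rps_per_instance=100))
    print(suggest_action(5), "|", suggest_action(90))
```

The point of measuring `rps_per_instance` in a load test is that the instance count then comes from evidence rather than from a guess such as ten.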

Simple-Talk:
Is load-testing something that is automated and done as part of integration testing? I imagine that you need far more precision with this, compared with a conventional application based in hosted servers.
Steve H.:
We’ve typically used a service called Neustar. You basically write scripts to simulate what end users would do and then you spin up large numbers of instances, to run a test for however long you like and you might say, “I want to simulate a thousand people hitting the site in one hour” and it’ll run those tests all in parallel. It’ll measure if it succeeded or not and you can look at your own performance on your back end and see how the database performs, that kind of thing.
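The general shape of such a test, scripted users run in parallel with success measured per visit, can be sketched with the Python standard library. This is not the Neustar service or its API: `run_load_test` and `fake_visit` are hypothetical, and `fake_visit` merely sleeps and fails for every tenth user so that the sketch runs without a network.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_load_test(visit: Callable[[int], bool], users: int,
                  concurrency: int = 50) -> dict:
    """Call visit(user_id) for every simulated user and summarise the outcome."""

    def timed(user_id: int) -> tuple[bool, float]:
        start = time.perf_counter()
        try:
            ok = visit(user_id)
        except Exception:
            ok = False  # any exception counts as a failed visit
        return ok, time.perf_counter() - start

    # Up to `concurrency` simulated users are in flight at once.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed, range(users)))

    latencies = sorted(elapsed for _, elapsed in results)
    return {
        "users": users,
        "succeeded": sum(1 for ok, _ in results if ok),
        "slowest_seconds": latencies[-1] if latencies else 0.0,
    }


def fake_visit(user_id: int) -> bool:
    """Stand-in for a scripted user journey; fails for every tenth user."""
    time.sleep(0.001)
    return user_id % 10 != 0


if __name__ == "__main__":
    print(run_load_test(fake_visit, users=1000))
```

In a real test, `visit` would issue the HTTP requests a user would make, and the database and back-end counters would be watched separately while the test runs.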
Simple-Talk:
How do you monitor the live production applications to make sure that their loading matches the capacity you’ve configured? Is this automated?
Steve H.:
Oh, a few tools. We use Pingdom, just in general to monitor uptime. We’ll ping the public site, but we also set up custom counters to check whether the database is up, storage is up and the different worker roles are up, so that tells us in general whether things are up or down. Then for the utilisation, we’re really using the built-in Azure dashboard tools, so we’ll do a manual review of those logs and dashboards in the Azure portal once a week or so. If things are getting tagged or tweaked we’ll take action; that’s about the extent of it.
Simple-Talk:
What do you see as the biggest advantage of using Azure in the Cloud services? What’s the best gain?
Steve H.:
From the point of view of my business it’s the speed. I can get a customer’s app up and running, in production, in half the time I could three or four years ago. My business used to be to write code and turn it over to the customer, and they’d run it on their own hardware or in Rackspace or a provider like that. With Azure, it’s happening twice as fast and the cost is coming down, so we can be nimble and iterate.
Simple-Talk:
Do you see Cloud services as being a fundamental paradigm shift in delivering applications? Or is it an evolution of managed server hosting? Do you see further developments in managed cloud-based services?
Steve H.:
It’s been more of a shift. Managed server hosting always required that you have kind of a network engineer skillset or someone that likes to build out machines and OS’s. The person that would typically deal with hardware is dealing with the [inaudible 00:11:39] managed service provider and most of that goes away with a platform service like Azure, so basically it’s a team of developers who can deal with most of the deployment issues. In that way it’s been a shift, it kind of simplifies what your team can be and cuts out… It’s missing part of the old team, which is good. I haven’t done one of these personally, but I’m aware of much larger projects where you really have to think about the network topology more even on Azure and you get those kinds of folks in, but there are whole classes of applications you can build with a lot less attention to the infrastructure now. So that’s been a pleasant shift I think.
Simple-Talk:
Yeah. What slows down development work for you guys?
Steve H.:
Well, a couple of things. The emulator is still very slow when you deploy code to it from Visual Studio. The time from pressing go to being able to debug your code can be 90 seconds, and if you’re doing that a hundred times a day, that’s a big chunk of your day, so that slows us down. Sometimes there are issues you only see in the Cloud app you deploy. You may not catch them locally. Debugging those can still be a challenge because getting debug data off the remote machines is not as straightforward as you might think. You may have to remote in and you may have to download logs. We write code that writes to our own log, so we run things and look at the log to see what’s going on. It’s kind of primitive in some ways. Those types of things.
Simple-Talk:
Have you guys had any issues with Azure being unavailable?
Steve H.:
There have been a couple of outages. There was a well-publicised one earlier this year. It turned out that an SSL certificate on Microsoft’s end had expired or gone bad, and it took down all their data centres worldwide for a couple of hours. I don’t know if you remember that.

Yeah, so that was frustrating. Our alert system went off and we heard about it. We had to contact all the customers we had applications running for and proactively let them know this was going on, you know, “we’re going to solve it”. We heard from some of the customers while it was going on, so it created an interesting day. Being a Microsoft outage, we didn’t have a lot of control over it. They tell you about it on your dashboard so you know they’re on it but…

Simple-Talk:
Does that interrupt your development work as well or does it just take things that are running in Azure offline?
Steve H.:
Well sure it does, because the teams that are working on current projects will hear about this outage, and until you know what’s going on you think it might be your applications, so you start having to look at running apps and all the code and go into fire-drill mode. Then you figure out, “Oh, it’s Microsoft, okay, I guess we’ll just sit and wait.” So yeah, it can disrupt your day.
Simple-Talk:
You’ve mentioned Red Gate Cloud Services and Pingdom, and I’ve got a couple of other third-party tools listed here. What are your favourites? If you could only keep three, which ones would they be?
Steve H.:
Beanstalk, Postmark and Red Gate Cloud Services. So Beanstalk’s source control, Postmark is a kind of outbound e-mail service and of course Cloud Services for backups.
Simple-Talk:
Are there any interesting lessons you have for people who are just starting out with this kind of development?
Steve H.:
Boy, I’d probably have dozens. If I was sitting someone new down, what would I warn them about? I’m finding that if your application’s going to have a lot of users and scalability is really important, you really have to sit down and design that, because it’s not automatic in the platform. I know you can scale out and add web and worker roles but, I don’t know, let’s say you’re writing a web API and you’re going to have millions of hits a day, so you’re going to queue up these messages and have worker roles take care of them. You have to think about, ‘am I going to use a cache and where?’ ‘When am I going to do database reads and writes?’ ‘Should I do it in the front end or the back end?’

I mean you really have to think through how to use the platform to make it scale; it’s not free. The hard work still exists, so I think my advice would be to become aware of all the tools, features and services in Azure, map those to your requirements and design it out. Don’t just jump in and think it’s all going to work magically because it’s in the Cloud.
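The pattern mentioned in this answer, a front end that only queues messages while background workers do the slow work, can be sketched in-process with Python’s standard `queue` and `threading` modules. This stands in for a cloud queue and worker roles and is not Azure code: `process_all` is a hypothetical name, and upper-casing the message is a placeholder for the real back-end work such as database writes.

```python
import queue
import threading


def process_all(messages: list[str], worker_count: int = 3) -> list[str]:
    """Queue the messages, let worker threads drain them, return the results."""
    inbox: queue.Queue = queue.Queue()
    results: list[str] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            message = inbox.get()
            if message is None:        # sentinel: no more work for this worker
                return
            handled = message.upper()  # placeholder for the real back-end work
            with lock:
                results.append(handled)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for message in messages:           # the "front end" only enqueues
        inbox.put(message)
    for _ in threads:                  # one sentinel per worker
        inbox.put(None)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(sorted(process_all(["order-1", "order-2", "order-3", "order-4"])))
```

Results arrive in whatever order the workers finish, which is why the usage line sorts them. Raising `worker_count` is the in-process analogue of adding worker instances, and the design questions Steve lists, such as where to cache and when to read or write the database, still have to be answered inside the worker.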

Steve is formerly CTO at 10th Magnitude, and currently CTO at New Velocity Media, a multi-platform media studio that specializes in content production and technology including the Multipop video platform and Galahad content distribution engine.