12 Architecture Guidelines for Your System in the Cloud Era

When you are planning to implement a cloud system, there are twelve architectural guidelines that can help you get the most out of the advantages a cloud platform provides.

In my last post I wrote about cloud computing design best practices: entry and exit strategies, cost and risk management, refactoring to meet scalability and up-time, and most important – listening to your customers.

This time we will discuss several architectural issues that you should consider when diving into cloud system implementation:

1. Choose the right server framework for YOU:

There are so many server frameworks these days and so many case studies for each of them (PHP and Facebook, Python and Google, Ruby on Rails and Twitter). Therefore, you should choose the framework that your team feels comfortable with and the one that you feel you’ll be able to grow with. Keep in mind the availability of consultants, new hires, and good reference material.

2. Consider using Open Source:

Everybody is using Open Source. What do you do if your organization isn’t an expert in this field? As I mentioned in my previous post, you should learn, fast. There are plenty of cloud providers that support Windows and .Net, and you may get a head start if you choose technology you’re familiar with. When you grow, you can refactor critical parts of your product. This way you can add technologies to remove your Push/COMET bottlenecks using Erlang* or cut costs by moving the common processes that account for 90% of your costs to LAMP.

* Why should you care about PUSH? If you plan to notify your users instantly about status changes like Facebook does, you should care. Since servers cannot initiate communication with (or PUSH data to) clients on the internet, the client should either 1) poll the server every second (a lot of traffic and resources) or 2) keep a connection open. The latter option is called COMET, and it’s the right solution for large-scale implementations. Erlang is a common choice of programming language for these cases due to its unique features, which we’ll discuss in a future post.
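To make the "keep the connection open" option concrete, here is a minimal sketch in Python. It is only an in-process stand-in: the LongPollChannel class and its method names are hypothetical, and a blocking queue plays the role of the held-open HTTP request. A real COMET server would hold many network connections at once.

```python
import queue
import threading


class LongPollChannel:
    """Hypothetical stand-in for a held-open (COMET-style) connection."""

    def __init__(self):
        self._events = queue.Queue()

    def publish(self, event):
        # Server side: something changed, hand it to the waiting client.
        self._events.put(event)

    def wait_for_event(self, timeout):
        # Client side: block until an event arrives instead of polling
        # every second. Returns None if nothing happened before the timeout,
        # after which a real client would simply reconnect.
        try:
            return self._events.get(timeout=timeout)
        except queue.Empty:
            return None


channel = LongPollChannel()
# Simulate a status change happening shortly after the client starts waiting.
threading.Timer(0.05, channel.publish, args=("status changed",)).start()
received = channel.wait_for_event(timeout=2.0)
nothing_more = channel.wait_for_event(timeout=0.01)
```

The client makes one request and gets the event as soon as it is published, rather than issuing many requests that mostly return nothing.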

3. Offload traffic to CDN:

Extract your static and streaming content to a CDN provider. This move will cut your server and network utilization and will improve your end user experience.

4. Use smart client capabilities:

Make sure your end-user clients can cope with network failures. If you are using Gmail and have seen the "loading…" label instead of a 404 page, you probably know what I’m talking about (otherwise, Google for jQuery).
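One common way for a client to ride out a network failure is to retry with an increasing delay while showing the user a "loading" state. The sketch below is written in Python for brevity (a browser client would do the same in JavaScript). The function name call_with_retry, the delay values, and the flaky_request stand-in are all assumptions for illustration.

```python
import time


def call_with_retry(request, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call request(); on ConnectionError wait and retry with doubled delay."""
    for attempt in range(attempts):
        try:
            return request()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller show an error
            sleep(base_delay * 2 ** attempt)


# Hypothetical request that fails twice and then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("network hiccup")
    return "inbox loaded"


delays = []  # record the waits instead of really sleeping
result = call_with_retry(flaky_request, sleep=delays.append)
```

Passing the sleep function as a parameter keeps the example fast to run; in a real client you would leave the default.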

5. Enable elastic growth:

If your system’s usage pattern is not uniform, consider turning some of your instances on and off to control costs and meet the spikes. While some cloud platforms provide built-in solutions for that (like Amazon’s Auto Scaling), you can always implement it yourself using a monitoring system and your cloud provider’s API.
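If you build this yourself, the core of it is a small decision rule fed by your monitoring system. The following is a rough sketch only: the function name, the CPU thresholds, and the instance limits are assumptions, and the step that actually starts or stops instances through the provider’s API is left out.

```python
def desired_instances(current, avg_cpu, low=0.30, high=0.70, min_n=2, max_n=20):
    """Return how many instances to run, given average CPU load (0.0 to 1.0)."""
    if avg_cpu > high:
        target = current + 1   # spike: add an instance
    elif avg_cpu < low:
        target = current - 1   # quiet period: remove one to save money
    else:
        target = current       # load is in the comfortable band
    # Never go below the floor (availability) or above the ceiling (budget).
    return max(min_n, min(max_n, target))


scale_up = desired_instances(current=4, avg_cpu=0.85)
scale_down = desired_instances(current=4, avg_cpu=0.10)
at_floor = desired_instances(current=2, avg_cpu=0.10)
at_ceiling = desired_instances(current=20, avg_cpu=0.95)
```

Changing one instance at a time is a conservative choice that avoids overreacting to a short burst; larger steps react faster but can oscillate.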

6. Use Data Replication:

Replicate your data to keep it safe and avoid downtime, and make sure you can do this using commodity hardware and software.

7. Balance between NoSQL and SQL:

Choose SQL as a start if that’s where the skills in your business are, but don’t neglect NoSQL when you get larger. When should you consider NoSQL? If your users create structured data (let’s say Word documents) and other users consume this structured object as is (reading the Word document, in this case), it may be wiser to save the item as is rather than break it into atomic pieces, normalize it, and then regenerate it. In these cases I recommend against saving these items as blobs in a database. Databases should be used for what they do best: storing normalized data for aggregations, filtering, and sorting.

8. Consider Map Reduce for Growth:

Performing an aggregation on 100M records might be easy, but performing it on petabytes of data is much more difficult. The MapReduce model solves the problem of large-scale aggregation by distributing the computation to the small servers that actually store the data, rather than doing everything on a single huge server.
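The idea can be shown in a few lines of Python. This is a toy sketch that runs in one process: each list in partitions stands in for the data held by one small server, and the record layout and function names are made up for the example.

```python
from collections import Counter
from functools import reduce


def map_partition(records):
    """Map step: runs next to the data and counts records per country."""
    return Counter(record["country"] for record in records)


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


# Each inner list represents the records stored on one server.
partitions = [
    [{"country": "US"}, {"country": "IL"}],
    [{"country": "US"}, {"country": "US"}],
    [{"country": "IL"}],
]

partials = [map_partition(p) for p in partitions]   # done in parallel in practice
totals = reduce(reduce_counts, partials, Counter())
```

Only the small partial counts travel over the network, not the raw records, which is what makes the approach work at large scale.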

9. Prepare for Scale out:

To avoid bottlenecks, you should be able to split each function in the system across two or more servers. If you think your best solution is purchasing a larger server, you might be heading in the wrong direction.

10. Shard your data:

Scaling out web servers might be easy, but data is usually the largest obstacle to scaling out. Most conservative designs concentrate the data in a single place (the database). Sharding is a common solution for this limitation in the cloud era. Sharding breaks the database into smaller and more manageable data stores.
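A simple way to decide which shard holds a given record is to hash its key. The sketch below assumes four shards and uses plain Python dictionaries as stand-ins for the separate databases; the function names and the key format are hypothetical.

```python
import hashlib

SHARD_COUNT = 4  # assumed number of shards for this example
shards = [dict() for _ in range(SHARD_COUNT)]  # stand-ins for real databases


def shard_for(key, shard_count=SHARD_COUNT):
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def save(key, value):
    shards[shard_for(key)][key] = value


def load(key):
    return shards[shard_for(key)].get(key)


save("user:42", {"name": "Dana"})
found = load("user:42")
missing = load("user:43")
```

Note that with this modulo scheme, changing the number of shards moves most keys to a different shard, so plan the shard count ahead or look into schemes such as consistent hashing.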

11. Use In Memory Databases:

Memory is orders of magnitude faster than disk. Therefore, if you use your database for short-term storage, consider using an in-memory database. If you’re afraid of data loss, consider replicating the data to a remote server.
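For short-term data such as sessions, the behavior you want from an in-memory store looks roughly like the sketch below. The TTLStore class is a hypothetical, single-process illustration and not a replacement for a real in-memory database; the injectable clock is there only so the example can simulate time passing.

```python
import time


class TTLStore:
    """Tiny in-memory key/value store whose entries expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl):
        # Store the value together with the moment it stops being valid.
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return default
        return value


now = [0.0]  # fake clock so we can move time forward by hand
store = TTLStore(clock=lambda: now[0])
store.set("session:abc", {"user_id": 42}, ttl=30)
fresh = store.get("session:abc")
now[0] = 31.0
expired = store.get("session:abc")
```

Everything here lives in the process’s memory and is lost on restart, which is why the replication advice above matters if the data is important.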

12. Keep stressing your servers:

Since traffic and usage are (hopefully) growing, you should always make sure you can serve larger numbers of users. Use tools such as JMeter to make sure your system is always ready for the next step. In the next post we will discuss how the cloud can help you test and fix your systems.
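A dedicated tool like JMeter is the right choice for real tests, but the principle fits in a few lines: send many concurrent requests and look at the slow end of the latency distribution. In this Python sketch, fake_request is a stand-in that just sleeps; in practice you would replace it with a call to your own system. All names and numbers here are assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request():
    """Stand-in for a real HTTP call to the system under test."""
    time.sleep(0.01)
    return 200


def run_load(request, total, concurrency):
    """Issue `total` requests using `concurrency` worker threads."""
    def timed(_):
        start = time.perf_counter()
        status = request()
        return status, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed, range(total)))


results = run_load(fake_request, total=50, concurrency=10)
latencies = sorted(elapsed for _, elapsed in results)
# Rough 95th percentile: the latency that 95% of requests stayed under.
p95 = latencies[int(len(latencies) * 0.95) - 1]
errors = sum(1 for status, _ in results if status != 200)
```

Tracking a high percentile rather than the average shows what your slowest users experience, which is usually where trouble appears first as load grows.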

Bottom Line

If you put a Y next to all 12 items, you’re in a great position. If not, it is time to start addressing the issues here and making sure you can meet your business goals.

Keep Performing,

Moshe Kaplan