
Every developer has learned about queues; most of us have written a similar data structure at least once in school or university. It’s all about a collection with two primary methods: enqueue (to add new items at the back of the queue) and dequeue (to remove the oldest item from the front). Yes, Microsoft Azure Queues are ‘queues’ as you might commonly understand them, but they are also more than just a class or a concept – Microsoft Azure Queues are a ready-to-use service that loosely connects components or applications through the cloud.
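As a refresher, the classic in-memory version can be sketched in a few lines of Python – the service we’ll look at follows the same enqueue/dequeue contract, just over HTTP:

```python
from collections import deque

queue = deque()

# enqueue: add new items at the back (tail) of the queue
queue.append("task 1")
queue.append("task 2")

# dequeue: remove the oldest item from the front (head)
first = queue.popleft()
print(first)  # task 1
```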

Before we move on, it’s worth asking yourself an important question: are you already using cloud technology? Sure, many companies claim to use the cloud, and that’s good, but do they see the cloud merely as a hosting platform, or are they creating applications that really rely on platform-as-a-service (PaaS) features to build failsafe, scalable software? Microsoft Azure has had a strong focus on PaaS right from the beginning, and Azure Queues are one of the services that have been available from the earliest days – and they are almost certainly underused.

What are Azure Queues?

From an architectural view

Azure Queues are queues located in the Microsoft cloud which you can use for exchanging messages between components, either in the cloud or on-premises. A message typically represents a task created by one party (the “producer”) that has to be processed by another (the “consumer”). Each message has a small body and some attributes, such as time-to-live, which you can use to configure your service. Because Azure ensures that a dequeued message is invisible to other listeners, many-producers/many-consumers setups work just as well as one-to-one scenarios. The main architectural benefit is loose coupling (more about advantages and scenarios later in this article).

From a developer’s view

Azure Queues are a RESTful service that you can use to enqueue and dequeue messages, as well as to administer (create, delete) your queues. There are many language-specific wrapper APIs available, so you can develop in your favorite language instead of sending REST calls directly – at the moment (April 2014) .NET, Node.js, Java, PHP, Ruby and Python are officially supported. Azure Queues are part of the Storage Service, which unites Blobs, Tables and Queues under one storage account. The official storage documentation is a good starting point if you want to do a little further reading on how to begin using Queues.
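Under the hood, each operation is a plain HTTP call. The sketch below builds the URL and XML body for inserting a message; the account and queue names are made up, and a real request additionally needs an authentication header, so the snippet only constructs the pieces rather than sending them:

```python
import base64

account = "myaccount"  # hypothetical storage account name
queue = "tasks"        # hypothetical queue name

# Put Message is an HTTP POST against the queue's /messages resource
url = f"https://{account}.queue.core.windows.net/{queue}/messages"

# Message text is conventionally base64-encoded (the .NET API does this by default)
payload = base64.b64encode(b"process order #4711").decode("ascii")
body = f"<QueueMessage><MessageText>{payload}</MessageText></QueueMessage>"

print(url)
print(body)
```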

From a commercial view

Azure Queues are a paid service, billed per use depending on the desired redundancy level, the storage space needed, and the number of transactions (e.g. read, write, delete). As a benchmark, let’s say you choose:

  • Local redundancy (no replication to a separate data center)
  • 20,000 messages processed a day (‘processed’ in this case means inserted, read and deleted)
  • A 40 KB payload per message (roughly 780 MB a day)

… you’ll pay less than $3 a month (April 2014). If you need additional services (e.g. a worker role for processing the queue), further costs apply. To get a better picture of the potential cost of using Azure Queues, take a look at the Price Calculator.
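The figures in the benchmark are easy to verify; the snippet below just redoes the arithmetic (the dollar amounts themselves change over time, so check the calculator for current prices):

```python
messages_per_day = 20_000
payload_kb = 40

# inserting, reading and deleting each message = 3 transactions per message
transactions_per_day = messages_per_day * 3

data_per_day_mb = messages_per_day * payload_kb / 1024
print(transactions_per_day)    # 60000 billable transactions a day
print(round(data_per_day_mb))  # 781 -> roughly the 780 MB/day stated above
```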

Service Bus Queues

But wait! There are two queue technologies available in Microsoft Azure: Azure Queues (as described in this article) and Service Bus Queues. The latter is newer and focuses on publish/subscribe mechanisms (i.e. the ability to receive messages without having to poll the queue) and on integration with the WCF communication stack. On the other hand, Azure Queues are faster and more suitable when storing more than 5 GB of messages in a queue.

For a deeper, more detailed comparison between the two queue technologies, take a look at Microsoft’s official comparison guidance.

In a nutshell: the life of a message

Now that we’ve covered a basic introduction to the concepts, let’s get a deeper understanding by taking a look at the life of a single message: from birth to death – Add to Delete.

Security: It all starts with Authentication

Before you can talk to your queue, you first have to authenticate. There are two options available at this stage: you can either connect with the storage account name and storage account key (retrieved from the management portal), or with a temporary token.

It’s worth bearing in mind that everyone who gets the name/key credentials has full access to all operations in your storage account. Those people can use that access to add and delete queues, read and write messages, and even access Blob and Table storage. To cut a long story short: be careful who’s in possession of these credentials.

The second option enables you to create a temporary token that is valid for a particular queue, and limited by functionality and time. Obviously, this way you can give someone restricted access (e.g. read-only) for a specific time period (e.g. 5 minutes from now), and maintain much tighter control over who has access to your application.
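Conceptually, such a token is an HMAC signature over the granted permissions and validity window, computed with the account key. The following is a simplified illustration of that idea only – the real service defines its own exact string-to-sign format and field order:

```python
import base64
import hashlib
import hmac

account_key = b"my-secret-account-key"  # hypothetical; real keys are longer

def make_token(queue_name: str, permissions: str, start: str, expiry: str) -> str:
    """Illustration of a shared-access-style token, not the real Azure format."""
    string_to_sign = "\n".join([permissions, start, expiry, queue_name])
    signature = hmac.new(account_key,
                         string_to_sign.encode("utf-8"),
                         hashlib.sha256).digest()
    return base64.b64encode(signature).decode("ascii")

# read-only ("r") access to 'tasks' for a five-minute window
token = make_token("tasks", "r", "2014-04-01T12:00:00Z", "2014-04-01T12:05:00Z")
print(token)
```

Because the permissions and time window are part of the signed string, tampering with any of them invalidates the signature.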

Depending on your scenario and the ownership of the involved components you can decide which authentication option fits your needs.

Adding a message: don’t be chatty

Adding a message is actually quite easy – technically speaking, it’s an HTTP POST request that contains a small XML structure in its body.

The main problem at this point is that the maximum message size is restricted to 64 KB. Yes, really… kilobytes. In the early days the limit was 8 KB, so be grateful. As a consequence, the theoretical maximum string length is 65,536 single-byte characters, but when it comes to binary content it gets even worse: the overhead of the necessary base64 encoding (+33%) leaves a maximum useful payload of just 49,152 bytes (48 KB). And in case you’re wondering, the .NET Storage API uses base64 encoding for string and binary payloads by default. The moral of the story: messages are the wrong way to transfer large amounts of data. A message should describe a task and, if applicable, carry a link to the stored data.
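The 48 KB figure follows directly from how base64 works – every 3 raw bytes become 4 encoded characters:

```python
import base64

limit = 64 * 1024            # 65,536 bytes: the maximum message size
raw = b"\x00" * (48 * 1024)  # 49,152 bytes of binary payload

encoded = base64.b64encode(raw)
print(len(encoded))          # 65536 -> exactly at the limit
```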

Redundancy: Where the magic of the cloud begins

When creating a storage account, you have to decide if you want to automatically replicate your data to a second data center. In the case of a failure or outage, Microsoft will first attempt to recover the data at the primary location. That’s the preferred course of action because the whole replication process runs asynchronously, so it’s not guaranteed that all the data has been replicated yet. Also, the switch-over from primary to secondary happens not at the account level but at the subsystem level, so it can affect multiple customers. If the primary location cannot be repaired, Microsoft may decide to fail over to the secondary location. Your application code remains unaffected: producers and consumers won’t even notice that the Eastern US data center is down and that they are already talking to the secondary location in Western US.

It’s worth bearing in mind that the decision has to be made for the whole storage account (blob, tables, queues), and that there are three options available:

  • Locally redundant (data is only replicated within the data center)
  • Geo redundant (as described above)
  • Read-access geo redundant (same as geo redundant, but with additional read-only access to the data in the secondary location).

That’s the magic of the cloud: while you are inserting a small message with only one HTTP request, the cloud machinery automatically starts its work and provides a reliable infrastructure at the level you’ve chosen. Just to be clear: geo redundancy doesn’t come for free, but it’s really, really cheap.

Deferred visibility: I’m there but no one can see me

Usually an added message becomes immediately visible to potential consumers. However, in some cases you might want to defer the appearance of the message, e.g. start a task in one hour or at midnight. Azure Queue messages have a visibility timeout attribute that takes timespans up to 7 days.

Time-to-live: Messages don’t outstay their welcome

To prevent messages from living in the queue forever, or to express that a message is only relevant within the next hour, you can specify a time-to-live attribute. If the message isn’t processed and deleted within the given time range, it gets deleted automatically. The maximum time-to-live allowed is 7 days, which is also the default value.
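Both attributes are passed as query parameters (in seconds) when the message is inserted. A sketch of the resulting request URL – the account and queue names are made up:

```python
from urllib.parse import urlencode

account, queue = "myaccount", "tasks"  # hypothetical names

params = {
    "visibilitytimeout": 3600,  # stay hidden for one hour after insertion
    "messagettl": 4 * 3600,     # expire after four hours if never deleted
}
url = (f"https://{account}.queue.core.windows.net/{queue}/messages?"
       + urlencode(params))
print(url)
```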

Accessing Messages: Sneak a peek, or read it exclusively

There are two different ways of getting messages out of a queue. The first one is just to sneak a peek at the queue without changing anything; you can specify how many messages you want to receive (up to a maximum of 32), but all the messages are still available and visible for other consumers after you’ve looked at them. This functionality is mainly useful for 3rd party tools: if you query your queue just to see what’s going on during development or to debug a problem, you don’t want to prevent messages from actually being processed.

Most of the time you’ll want to read a message and own it exclusively for a specific amount of time. You can choose how many messages to retrieve, from 1 up to 32 per request, and you can also specify how long the messages stay invisible to others – that’s the time you get to process them successfully. If processing takes longer, the messages become visible again and other consumers may start processing them as well. This pattern is called “at least once”: every message will be processed by at least one consumer, and maybe by more.
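The at-least-once behaviour is easy to model. The toy queue below (plain Python, no Azure involved) hides a retrieved message until its visibility timeout elapses, after which another consumer can pick it up again:

```python
class ToyQueue:
    """In-memory model of visibility timeouts; not the real service."""

    def __init__(self):
        self._messages = []

    def put(self, body):
        self._messages.append({"body": body, "invisible_until": 0,
                               "dequeue_count": 0})

    def get(self, now, visibility_timeout):
        # Return the first currently visible message and hide it for a while
        for msg in self._messages:
            if msg["invisible_until"] <= now:
                msg["invisible_until"] = now + visibility_timeout
                msg["dequeue_count"] += 1
                return msg
        return None

q = ToyQueue()
q.put("resize image 42")

first = q.get(now=0, visibility_timeout=30)     # consumer A takes the message
blocked = q.get(now=10, visibility_timeout=30)  # consumer B sees nothing yet
retry = q.get(now=31, visibility_timeout=30)    # timeout elapsed: it's back

print(first["body"], blocked, retry["dequeue_count"])  # resize image 42 None 2
```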

Deleting messages

When you are reading messages, they only become invisible to other consumers – they are not deleted automatically. If you’ve processed a message successfully, then you have to delete it explicitly to remove it from the queue.

It’s also a good pattern to delete messages that could not be processed after several attempts (maybe because of an invalid structure). Every message carries a count of how often it has been dequeued, so this kind of functionality is very easy to implement.
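A minimal sketch of that “poison message” pattern – the threshold and the move-it-aside step are up to you; the helper functions here are placeholders for whatever your application does:

```python
MAX_ATTEMPTS = 5  # hypothetical threshold

def handle(message, process, delete, move_to_error_store):
    """Process a message; give up on repeated failures ('poison message')."""
    if message["dequeue_count"] > MAX_ATTEMPTS:
        move_to_error_store(message)  # keep it around for later inspection...
        delete(message)               # ...but get it out of the queue
        return "discarded"
    process(message)
    delete(message)                   # only delete after successful processing
    return "processed"

# a message that has already failed six times
poison = {"body": "<broken>", "dequeue_count": 6}
print(handle(poison, process=lambda m: None, delete=lambda m: None,
             move_to_error_store=lambda m: None))  # discarded
```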

Azure Queues: where to use them? A typical scenario

As mentioned above, the main reason for using queues is to separate components and loosely connect them.

Imagine two components of your application exchanging data, maybe one of them on-premises and one in the cloud. When using common exchange mechanisms, such as calling a web service from A to B (or vice versa), you have some disadvantages to contend with:

  • You rely on both partners being available simultaneously; if one partner is down, there’s no communication
  • You have to implement retry logic to get things up and running again after an outage
  • It’s hard to scale up when the workload increases

With Azure Queues you have a third player that connects the two components and acts as both a buffer and a mediator. If the consuming component is down, for example, the producer can still insert messages into the queue while waiting for the other component to come back online. And it’s very easy to scale up: just add more consumers (e.g. Worker Roles), and your queue is processed in parallel. Scaling up and down can actually be a very dynamic process – you could add or remove workers automatically depending on the actual queue size, and hence respond quickly to heavy load and save costs when the queue is empty.
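The “add or remove workers depending on queue size” idea boils down to a clamp; the numbers here are purely illustrative:

```python
def desired_workers(queue_length, messages_per_worker=100,
                    min_workers=1, max_workers=20):
    """How many consumers to run for the current backlog (illustrative policy)."""
    needed = -(-queue_length // messages_per_worker)  # ceiling division
    return max(min_workers, min(max_workers, needed))

print(desired_workers(0))       # 1  -> keep a single worker on an empty queue
print(desired_workers(950))     # 10
print(desired_workers(50_000))  # 20 -> capped to keep costs under control
```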

That said, there’s no such thing as a free lunch. There are some disadvantages to using an Azure queue:

  • You have one point of failure – if the Azure Queue is down, you’ve got a problem (see geo redundancy as a solution)
  • Queues are not free (though they are very cheap in most situations)

Wrapping things up, and what’s next

If you really want to exploit the full power of the cloud technology you have access to, you must go a step beyond basic web hosting. Microsoft Azure Queues are one of many options in your toolbox that you can use to build scalable, robust architectures based on the unique capabilities of the cloud. As application demands and designs evolve, parallelism and scalability will become more and more important, and task-based thinking is very well served by queues.

This is just the introduction article to Microsoft Azure Queues, and there are more to come. Next we’ll look at some practical examples of how to start working with queues (complete with lots of code samples), and then we’ll move on to articles about security, performance optimization, available tooling, and real-world scenarios. I’d appreciate any comments or suggestions along the way to help make this series more useful, so feel free to contact me!