The ASs of Distributed Computing

The term “Cloud” is, sometimes intentionally I think, far too vaguely defined. I tend to use the term Distributed Computing to refer to this new way of working. Distributed Computing is simply operating some or all of your computing needs somewhere else – often operated by an external vendor rather than an internal team.

Even that description is probably too simplified to be practical. “Some or all of your computing” leaves a lot open to interpretation, so the industry developed a few more technically specific terms describing which parts of the computing are sourced to another location, and how they are structured. This leads to all kinds of other discussions, such as architecture, performance and pricing.

As with any discipline, focusing on specifics leads to further distinctions that become unworkable at general levels. For instance, we call a canine house pet a “dog”, although a zoologist would argue that this is far too vague – we should use the term Canis lupus familiaris. A fellow zoologist would argue even this is too non-specific, and would classify the animal using even more terms. At some point the description includes whether the dog is domestic or feral, or even if the animal is alive or dead!

But for most of us, stating “When a man’s best friend is a dog, then that dog has a problem” is enough. We all know what the sentence means.

But we run into the same issue with clarifying the Distributed Computing discussion. The industry has defined three primary terms – with more coming each week, it seems – describing the ways you can distribute your computing needs. The terms are:

Infrastructure as a Service
Software as a Service
Platform as a Service

The “as-a-Service” moniker is both useful and much-maligned. I find it quite descriptive, because it explains that what precedes the moniker is being delivered, operated, controlled and maintained by someone else. Although even that might be a little misleading… I’ll explain that nuance in a moment. In general, each of these terms means that the setup, operation and maintenance for each of these areas are handled just like a utility handles providing you power, clean water, and phone service. Note that this means that the choice of the provider of those services matters a great deal.

For instance, let’s take the phone example. Many of us have strong preferences for a particular phone vendor because of the selection of phones, the uptime, charges, coverage and more. Many of those same considerations come into play when you’re selecting not only an architecture, but the vendor that provides it. This moves you away from the straight technology question and toward business questions like security, experience and trust.

Let’s explore where these terms are specific enough, and some of the interesting benefits and considerations when using each paradigm, and a few of the architectural patterns available in each. Understand that I’ll stick to the most basic components of these terms, so there is a lot of room for further interpretation of each. But just like many parts of life, various levels of understanding can be useful – not everything needs to be defined down to the atomic level.

Infrastructure as a Service – IaaS

Moving part or all of an infrastructure was one of the earliest uses of the term “cloud”. And it all started with virtualization.

Computers at their most basic normally involve four components – a Central Processing Unit (CPU), some sort of persistent storage (I/O), non-persistent memory (RAM), and a network interface (NIC). Since the very earliest days of the mainframe, technical design experts realized that these components were merely facilities to store and manipulate data. They theorized, and later built, software components that emulated each of these components – and virtualization was born. As mainframes matured, as early as 1972 an entire operating system was used to hold a virtualization environment. The virtualization we recognize today has been in use since before the microcomputer.

Interesting but useless side note: The earliest names for technology did not sound like an infant named every web site on the planet, and all projects were not named after some child’s stuffed toy. Most had a mix of acronyms or numbers, such as VM/CMS. But the earliest “mascot” for virtualization started with, and remains – a teddy bear!

At first enterprises were slow to adopt virtualization at the PC Server level. The hypervisors weren’t mature enough to handle high-intensity loads, so Virtual Machines (VMs) were limited to things like development systems, testing systems and personal workstations. But as the hypervisors increased in performance, the “one application / one physical server” rule began to relax, to the point that serious, real-time workloads – even databases – became acceptable targets for virtualization.

With virtualization entrenched in the workplace, it became common in a relatively short time not only to collapse multiple physical servers onto fewer physical “hosts”, but to move even those physical machines off the premises. After all, most companies have a data center – a building where the physical computers are housed – since the proper electrical, temperature and safety levels are costly to set up and are better provided outside the location where the company does its regular business. In fact, many of these facilities began to be shared among companies. You might not need an entire data center, and its commensurate costs. Companies sprang up with the sole purpose of housing physical computers for a company’s virtualization environment. These facilities (often called co-locations or colos) provided a service, arguably one of the first “as-a-Service” offerings, of not only hosting these physical computers, but often even providing the hardware itself. And the “cloud” (in practice if not in marketing term) was born.

From there, it was simply a matter of time before the next step in servicing this need was taken – and the formal “Cloud Computing” environment was created by vendors and offered for sale on a wide scale.

Primary Characteristics

So this brief but necessary history lesson brings us to the basic characteristics that set Infrastructure-as-a-Service apart from other architectures.

The first is that the assets (CPU, RAM, NIC and I/O) are presented essentially “raw”. You can treat each asset as if you completely control it – within the boundaries of the vendor that provides it. For instance, in the case of storage (I/O), a “drive” is presented to you for access across the public Internet or in some cases a paid-for direct network tunnel for better performance. But what you do with the storage, and in some cases even the file system you use, is up to you. It’s just a resource for you to consume, located somewhere else.

The second characteristic is that of abstraction. Although that drive is presented to you as if it were a Storage Area Network (SAN) disk or local hard drive, you have no control over the actual physical components underneath. For that matter, you don’t even know what those components are. And in many cases, there may be several more layers of software before you actually hit hardware. This allows the vendor to shift your assets around, replace them with faster/newer/cheaper components, and so on.

Which brings up the third characteristic of IaaS – the provisioning and servicing aspect. In fact, this characteristic is shared among all of the “as-a-Service” designations, and is actually the heart of the “cloud” or Distributed Computing. The key here is that the vendor provides not only the buildings and the facilities, the personnel, the power, cooling, computers and other hardware, but they have some sort of system that allows you to simply request what you want to use – called provisioning. If you’re a technical professional involved in your own company’s infrastructure, this is something you normally provide. Someone submits a request for a computing need, and you’re expected to figure out how much capacity you have, how much you need, and to build and provide the system. A Distributed Computing vendor for IaaS now takes over that task.

The next part of this characteristic is the servicing and maintenance of the systems. Someone has to ensure the system is up, functional and performing well. In the case of IaaS, this is often where the servicing aspect stops.

All of this has an interesting side-effect regarding purchasing and maintaining software for the system. While the vendor handles everything from the facilities through the hardware abstraction, the point of demarcation is the software from the Operating System (OS)-up.
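
As a rough illustration of that point of demarcation, the stack can be modeled as an ordered list of layers with a single cut. This is only a sketch: the layer names and the `split_responsibility` helper are assumptions made for the example, not any vendor’s terminology, and real offerings draw the line in slightly different places.

```python
# Hypothetical layer names, ordered from the building up to your application.
STACK = [
    "facilities",
    "power and cooling",
    "physical hardware",
    "virtualization",
    "operating system",
    "runtime and platform software",
    "application",
]

# Assumption for this sketch: in IaaS the vendor's responsibility ends at
# the hardware abstraction (the virtualization layer).
IAAS_VENDOR_TOP_LAYER = "virtualization"


def split_responsibility(stack, vendor_top_layer):
    """Return (vendor_layers, customer_layers) for a given demarcation point."""
    cut = stack.index(vendor_top_layer) + 1
    return stack[:cut], stack[cut:]


vendor_layers, customer_layers = split_responsibility(STACK, IAAS_VENDOR_TOP_LAYER)
print("Vendor handles:", vendor_layers)
print("You handle:   ", customer_layers)
```

Everything in the second list, starting with the operating system, is software you still have to license, patch and maintain yourself.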

Compatible Use-Cases

Since IaaS simply virtualizes physical servers and handles the provisioning aspects, you could say that most any computing need will run on this architecture. Of course, there are some practicality issues that make that too broad of a blanket statement.

You normally start at the operating system installation. Most of the time, you choose from a selection of pre-configured, approved “images”, which is completely understandable. After all, even virtualized machines need drivers compatible with their host software, and the vendor wants to limit how many they have to support to be able to be responsive. So you’ll need to do some investigation on which operating systems you require for your application.

That being said, since you have complete control over the operating system, many INSTALL.BAT or SETUP.EXE-type programs are suited for IaaS systems. You can also cluster systems or figure out a scale-out solution, but since you don’t have complete control over the hardware and infrastructure, you’ll need to work with your IaaS vendor to ensure things work the way you expect.

Considerations

With all of the “as-a-Service” solutions, there are security considerations. There may or may not be a way to connect your enterprise security environment (such as Active Directory) to the IaaS environment – check with your vendor to see how that can be implemented. Even so, you can’t rely on things like encryption alone to secure your private assets. Any time someone has access to your environment, special care needs to be taken in the security arena. Federation, in which you allow access to an asset not only to your local users but also to folks from other areas (such as customers) with their own security mechanisms, needs to be thought through carefully.

In an IaaS environment, although your software is initially licensed and patched to a certain level for you, from there on these items will be your responsibility. Not only will you need to apply patches for the VM internal environment, but any operating system, coding platform and other software will need to be treated as if you still own it onsite. Upgrading and patching need to be built into your operations just as they always have been.

Another consideration is the aforementioned scale paradigms. Virtual Machines can only reach a certain number of CPUs, NICs and so on before it becomes impossible to scale them “up” any higher. You may have a fabric of software you use in order to scale these systems outward, which is the proper way to handle increasing loads in a Distributed Computing environment. Ensure that you understand how your fabric of choice works with your vendor’s IaaS environment.
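
The arithmetic behind scaling “up” versus “out” can be sketched in a few lines. The CPU counts here are made-up numbers for illustration, and `can_scale_up` and `instances_needed` are hypothetical helpers rather than part of any vendor’s tooling.

```python
import math


def can_scale_up(required_cpus, max_cpus_per_vm):
    """True if a single VM can be made large enough to carry the load."""
    return required_cpus <= max_cpus_per_vm


def instances_needed(required_cpus, cpus_per_vm):
    """Number of identical VMs needed to carry the load when scaling out."""
    if cpus_per_vm <= 0:
        raise ValueError("cpus_per_vm must be positive")
    return max(1, math.ceil(required_cpus / cpus_per_vm))


# Assumed figures: the workload needs 96 CPUs, and the largest VM the
# vendor offers has 64.
print(can_scale_up(96, 64))      # False: one VM cannot grow large enough
print(instances_needed(96, 16))  # 6: six 16-CPU VMs carry the same load
```

Once the first check fails, the only way forward is the second function, which is why the software fabric that spreads work across those instances matters so much.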

Software as a Service – SaaS

Probably one of the easiest “as-a-Service” paradigms to understand is that of Software as a Service. Simply put, this is an environment where you log on to a remote system, use the software via a series of screens and buttons, and log off. There’s often nothing to install, and little to configure to begin using it.

An example that even pre-dates the current “cloud” moniker is a remote financial application. Companies for years have been using a SaaS environment to handle established patterns in software systems like accounting and finance, and payroll. In fact, when I visit many companies to talk about their use of Distributed Computing, they tell me they aren’t currently using any. I then bring up this example and they are surprised to learn they’ve been using SaaS for a long time.

Primary Characteristics

A SaaS offering is composed of a set of software, often on the web, that is running on a set of remote servers. The Operating System, hardware, scale and all other aspects of computing are normally obfuscated from the users.

For the most part, a SaaS offering is set for a particular series of screens or application paths. However, some SaaS offerings can be customized, which is what leads a few purists to debate this term. In fact, some are merely groups of functions that need to be customized before any kind of use, which makes them less SaaS and more “Some-other-function-as-a-Service” – but if there are no Virtual Computing environment factors to consider, no code to write, and nothing to deploy, it normally fits the description of SaaS.

Compatible Use-Cases

For the most part, other than with the caveats already mentioned, SaaS is well-suited to a “best fit” solution. If you have the need to run an office suite of software and your connectivity is robust, there are multiple on-line solutions available – nothing to install, nothing to license, just use and (in many cases) pay.

Considerations

Which brings up the first consideration with a SaaS solution – the cost. It’s important to understand how you’ll be billed for using the service. Free offerings may be fine, but most of those are not licensed to be used within a company. Even if they were, nothing can be operated for free, so it’s either ad-supported or perhaps the vendor has access to your private data.

Support is another consideration. You need to ensure you have support available, that it is robust, and that it works. Since you’re essentially outsourcing an entire function, you need to be confident that your users can get support and training when they need it.

It should be noted that one of the biggest considerations is connectivity. While the SaaS vendor may have amazing facilities, great uptime, perfect support and response, if the users can’t get to it, it’s down for them. Some vendors solve this issue by installing local software that caches the data when offline – like modern e-mail software.
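
The offline-caching idea can be sketched as a small client that queues work locally and delivers it when the service is reachable again. This is a minimal illustration under stated assumptions: `OfflineCachingClient` and `FakeService` are invented for the example, and a real product would also have to deal with conflicts, durable local storage and authentication.

```python
class OfflineCachingClient:
    """Queues items locally and delivers them in order once the service is reachable."""

    def __init__(self, send):
        # `send` is any callable that delivers one item to the remote
        # service and raises ConnectionError when it cannot be reached.
        self._send = send
        self._pending = []

    def save(self, item):
        """Queue the item, then try to deliver everything queued so far."""
        self._pending.append(item)
        return self.flush()

    def flush(self):
        """Deliver queued items in order, stopping at the first failure.

        Returns the number of items delivered by this call.
        """
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break
            self._pending.pop(0)
            delivered += 1
        return delivered

    @property
    def pending(self):
        return list(self._pending)


class FakeService:
    """Stand-in for the remote SaaS endpoint (hypothetical)."""

    def __init__(self):
        self.online = False
        self.received = []

    def receive(self, item):
        if not self.online:
            raise ConnectionError("service unreachable")
        self.received.append(item)


service = FakeService()
client = OfflineCachingClient(service.receive)

client.save("draft 1")   # offline: stays in the local queue
client.save("draft 2")
print(client.pending)    # both drafts are still waiting locally

service.online = True    # connectivity returns
client.flush()
print(service.received)  # both drafts delivered, in order
```

From the user’s point of view the software keeps working while disconnected; the service only sees the data once the connection is back.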

Finally, ensure you understand data management in a SaaS provider. Who owns your data? Who has access to it? How is it backed up, and how is it restored?

Platform as a Service – PaaS

One of the newest, and sometimes a little more complicated, “as-a-Service” offerings is “Platform as a Service” or PaaS. In this paradigm, not only are the hardware, virtualization and other infrastructure controlled, provisioned and maintained, but also the Operating System and in most cases even the scale-out paradigm. You write code, deploy it to the service, and your users use it as a SaaS offering.

Primary Characteristics

If you think about IaaS as the ability to use a Virtual Computer, you can think of PaaS as a complete deployment environment available for your use. Imagine for a moment you write some code, and hand that un-compiled code to a friend for her to compile and run on her server. The PaaS vendor plays the part of that friend: you supply the code, and the vendor supplies everything needed to run it.

Often the PaaS solution comes with multiple components. Not only can you deploy code to run, but storage systems, queues, caches, secure connections, federations, and other services are available.

The way that you interact with the PaaS environment depends on the vendor. In some, you write code on a local system and compile, deploy, test and use the software on the vendor’s PaaS environment. In others, you get a local emulation of the PaaS, and can do everything “offline” until you’re ready to deploy the tested code to “production” – once again, the vendor’s PaaS.

Compatible Use-Cases

There are multiple places where a PaaS makes sense – most involve situations where you need to write your own code, although as an aside many PaaS providers allow pre-configured packages (such as Hadoop for big data and so on) to be deployed with no code at all.

A flexible environment where you need to be able to write and deploy code quickly fits well in a PaaS environment. There’s nothing to plan for or configure, the version is always “now” on the operating system, and it’s just ready to accept your code. There are no licenses to buy, no virus scanner to configure and so on.

Another well-suited use-case is elastic scale – meaning that the system needs to be able to grow, sometimes quickly and massively, and then shrink back down. In some cases this is a manual effort, in others it can be coded directly into the deployed application.
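
When the scaling decision is coded into the application, it often reduces to a rule like the following sketch. The target utilization, the instance limits and the `desired_instances` function are all assumptions for illustration; each PaaS vendor exposes its own mechanism for actually adding or removing instances.

```python
def desired_instances(current, utilization_pct, target_pct=60, minimum=1, maximum=20):
    """Size the pool so that average utilization moves toward target_pct.

    current          number of instances running now
    utilization_pct  average utilization across them, as a whole percentage
    """
    if target_pct <= 0:
        raise ValueError("target_pct must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = -(-(current * utilization_pct) // target_pct)
    # Never drop below the minimum or grow past the agreed maximum.
    return max(minimum, min(maximum, wanted))


print(desired_instances(4, 90))    # 6: busy, so grow from 4 to 6
print(desired_instances(6, 20))    # 2: quiet, so shrink back down
print(desired_instances(10, 200))  # 20: capped at the maximum
```

The upper limit matters as much as the rule itself, since in an elastic environment every extra instance is also an extra line on the bill.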

Anything facing the public Internet that needs to interact with both customers and internal stakeholders fits well in a PaaS. There are even connectors and components to allow federated security, making the deployment even more in line with standard patterns.

If you do in fact have code you have written within your organization, a PaaS solution can provide rapid “hybrid” integration – allowing functions that need the speed or elasticity advantages to be coded directly into the current software. This allows, for instance, customers to enter data on a web page and the internal accounting, sales, fulfillment and other departments to access that data and combine it with their internal data – all without exposing the internal network needlessly.

Considerations

Stateless programming is essential for scale in a PaaS, or actually any kind of Distributed Computing. Since most SETUP.EXE applications expect their own server, they are sometimes not well suited to PaaS.
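
A small sketch shows what “stateless” means in practice: the handler keeps nothing in its own memory between requests, so any instance can serve any request. `SharedStore` is a hypothetical in-memory stand-in for whatever external storage, cache or queue service the PaaS vendor provides, and the shopping-cart scenario is invented for the example.

```python
class SharedStore:
    """Stand-in for an external store shared by every instance (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def handle_add_to_cart(store, session_id, item):
    """Stateless handler: all state arrives as arguments or lives in the store."""
    cart = list(store.get(session_id, []))  # copy, so the stored list is not mutated
    cart.append(item)
    store.put(session_id, cart)
    return cart


store = SharedStore()

# Imagine these two calls landing on two different instances of the
# application. Neither instance remembers anything between requests.
handle_add_to_cart(store, "session-42", "book")
cart = handle_add_to_cart(store, "session-42", "lamp")
print(cart)
```

An application that instead kept the cart in a local variable or on the server’s own disk would lose it as soon as the platform moved or added an instance.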

Backup and recovery is a shared effort between you and the PaaS vendor. There are things you can do to ensure that you are planning for these kinds of events right in your code, and many PaaS vendors provide features in their platform to ensure availability and continuity.

The languages supported by the PaaS vendor are important. Some PaaS vendors lock you into only one or two languages, others provide multiple choices.

There are other “ASs” that are being explored and phased into the lexicon. Data as a Service – DaaS, Reporting as a Service – RaaS, and many others have been announced, with more on the way.

It’s important to note that Distributed Computing is here to stay, but it isn’t meant as a replacement for the way we work today. It’s a supplement, just as every other computing paradigm has been. Physical servers, virtualized private environments, colos, and even mainframes are all still here. The technical professional is most useful when he or she takes the time to learn the several ways of working with technology, and applies them properly to the business problem at hand.

About the author

Buck Woody
