Simple-Talk development: sites and modules

In the last post, I described Simple-Talk’s original architecture: four .NET applications sharing the same database with very little separation between the applications. The problems we had gave us a set of goals for any change in architecture:

  • Replaceability — we’d like to be able to change one component of the site without requiring extensive work on other parts of the site
  • Reusability — if we have multiple sites with some overlap in functionality, we’d like to be able to reuse the implementation of that functionality
  • Repeatable deployments — we should be able to deploy a site with a single command
  • Testability — we should be able to run automated tests against the site, and against individual parts, both to prevent regressions and to do a little test-driven development

Many of these problems came from the lack of separation between components. Any new architecture needs a clear definition of what the components of our site are, and how they communicate with each other.

In our new architecture, a site is made up of instances of modules. A module can only communicate with other modules over HTTP. Other forms of communication, such as through a shared database or the filesystem, are forbidden. This makes communication between modules a little more explicit: if you want to see how one module depends on another, take a look at the HTTP calls. It also encourages looser coupling, since a module can only depend on the HTTP API exposed by another module rather than, say, the design of its database.

For instance, Simple-Talk has an instance of our WordPress module running under the path /blogs. There’s a small HTTP API that we’ve added to WordPress to allow blog posts to be fetched and used on other parts of the site, such as the front page. If we decided that we wanted to use a different blogging platform, we’d have to extend that platform to support the same HTTP API. Once this is done, the rest of the site would be able to communicate without any knowledge that the underlying implementation has changed.
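To make the replaceability argument concrete, here is a minimal Python sketch, not the actual Simple-Talk code. It assumes a hypothetical /api/posts endpoint that returns a JSON list of posts; the endpoint path, the JSON shape and all function names are invented for illustration. Two stand-in blog modules store their posts differently, yet the front-page code cannot tell them apart because it only sees the HTTP API.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener

# Talk to localhost directly, ignoring any proxy settings in the environment.
_opener = build_opener(ProxyHandler({}))


def make_blog_module(load_posts):
    """Build a request handler for a stand-in blog module.

    The /api/posts path and the JSON shape are hypothetical.
    """

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/api/posts":
                self.send_error(404)
                return
            body = json.dumps(load_posts()).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # silence per-request logging for the example

    return Handler


def start_module(handler):
    """Run a module on a free local port; return (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, "http://127.0.0.1:%d" % server.server_port


def front_page_titles(blog_url):
    """The front page depends only on the HTTP API, not on what implements it."""
    with _opener.open(blog_url + "/api/posts") as response:
        posts = json.loads(response.read().decode("utf-8"))
    return [post["title"] for post in posts]


# Two implementations that store posts differently but expose the same API.
LIST_STORE = [{"title": "Sites and modules"}, {"title": "Original architecture"}]
SLUG_STORE = {"1-sites": "Sites and modules", "2-original": "Original architecture"}


def posts_from_list():
    return LIST_STORE


def posts_from_slugs():
    return [{"title": SLUG_STORE[slug]} for slug in sorted(SLUG_STORE)]


def demo():
    """Fetch titles from each implementation in turn."""
    results = []
    for load_posts in (posts_from_list, posts_from_slugs):
        server, url = start_module(make_blog_module(load_posts))
        try:
            results.append(front_page_titles(url))
        finally:
            server.shutdown()
            server.server_close()
    return results
```

Calling demo() returns one list of titles per implementation; the consumer code is identical in both cases, which is the property the HTTP-only rule is meant to protect.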

Let’s say we want to start hosting some blogs at blogs.example.com: we just want to deploy WordPress, where somebody else has already done the hard work of implementing the WordPress module for us. First of all, we need to create a file called site.json, which is going to contain all of the environment-agnostic configuration for our site:

We give the site a name of “blogs.example.com”, and then we list all of the modules that make up this site. In this case, we have a single module called “blogs”, which is an instance of the “wordpress” module. There are some options that almost all modules use, such as the path under which the module sits. In this case, since WordPress will be serving up the entire site, the value for webPath is the root.

There are also settings that are specific to a particular module, such as which features and plugins are enabled in WordPress. In this case, we’ve turned on the XML-RPC interface so that people can use Live Writer or a similar client, and turned on the Akismet plugin to help filter out spam in the comments. This allows us to set up WordPress automatically during the deployment process, rather than the usual method of configuring WordPress through its web interface. Doing so allows us to repeat deployments reliably, which is handy both for automated tests and for ensuring that what we deploy to production is the same as what we’ve been deploying during tests and to staging.
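Putting that description together, a site.json along the following lines would fit. This is a sketch built in Python and serialised to JSON, not the real file: only webPath and the values “blogs.example.com”, “blogs” and “wordpress” come from the text above, while the key names "name", "modules", "module", "options", "enableXmlRpc" and "plugins" are assumptions.

```python
import json

# Hypothetical shape for site.json. Only "webPath" is a key named in the post;
# every other key name here is an assumption made for illustration.
site = {
    "name": "blogs.example.com",
    "modules": {
        "blogs": {                      # the instance name
            "module": "wordpress",      # which module this is an instance of
            "webPath": "/",             # WordPress serves the whole site
            "options": {"enableXmlRpc": True},
            "plugins": ["akismet"],
        }
    },
}

# Serialise to the text that would be saved as site.json.
site_json = json.dumps(site, indent=2, sort_keys=True)
print(site_json)
```

Nothing in this structure mentions hosts, ports or credentials, which is what keeps it environment-agnostic.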

We also create an environment-specific configuration file, such as staging.json:

We specify the main host where the applications will be running, in this case “blogs-staging.example.com”, and where on the filesystem the site should be deployed. We also have some configuration for each module: in this case, we need to state which port WordPress should listen on, and the details it should use for MySQL.
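A staging.json matching that description might look roughly like this. Again this is a Python sketch serialised to JSON rather than the real file: the host name comes from the text, but every key name, the deployment path, the port and the MySQL details are placeholder assumptions.

```python
import json

# Hypothetical shape for staging.json. All key names and all values other than
# the host name are placeholders, not taken from the real configuration.
environment = {
    "host": "blogs-staging.example.com",
    "deployPath": "/srv/sites/blogs-staging",   # placeholder path
    "modules": {
        "blogs": {
            "port": 8080,                        # placeholder port
            "mysql": {
                "host": "localhost",
                "database": "blogs_staging",
                "user": "blogs",
                "password": "change-me",
            },
        }
    },
}

# Serialise to the text that would be saved as staging.json.
staging_json = json.dumps(environment, indent=2, sort_keys=True)
print(staging_json)
```

The per-module section is keyed by the instance name ("blogs"), so the environment file can be matched against the modules listed in site.json.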

Once the configuration is all in place, we run the deploy command that builds up the directories and files required for that site and environment, and deploys the entire site:
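The deploy tool itself is not described in this post, so the following is only a sketch of the kind of work such a command might do before it touches the filesystem: combine the site and environment configuration into one plan per module instance. The function name build_deploy_plan and all configuration keys other than webPath are assumptions.

```python
import posixpath


def build_deploy_plan(site, environment):
    """Merge site and environment configuration into one entry per module.

    Hypothetical helper: the real deploy command may work quite differently.
    """
    plan = []
    for name in sorted(site["modules"]):
        module = site["modules"][name]
        if name not in environment["modules"]:
            raise KeyError("no environment configuration for module %r" % name)
        # Environment-specific settings are layered over the site settings.
        settings = dict(module)
        settings.update(environment["modules"][name])
        plan.append({
            "instance": name,
            "directory": posixpath.join(environment["deployPath"], name),
            "url": "http://%s%s" % (environment["host"], module["webPath"]),
            "settings": settings,
        })
    return plan


# Small stand-ins for site.json and staging.json (placeholder values).
site = {
    "name": "blogs.example.com",
    "modules": {"blogs": {"module": "wordpress", "webPath": "/"}},
}
environment = {
    "host": "blogs-staging.example.com",
    "deployPath": "/srv/sites/blogs-staging",
    "modules": {"blogs": {"port": 8080}},
}

plan = build_deploy_plan(site, environment)
```

Failing early when a module has no environment configuration is one way to keep deployments repeatable: the same inputs either produce the same plan or produce an error, never a partially configured site.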

Deploying a site made up of multiple modules is just a case of changing the site configuration in site.json to include multiple modules. Since each module operates largely independently, there’s little extra complexity in configuring a site with multiple modules instead of just one module.
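As an illustration of a multi-module site, the sketch below lists two instances with different webPath values and picks the instance responsible for a request path by longest matching prefix. The "frontpage" module, the key names other than webPath and the routing rule are all assumptions; the post does not say how requests are actually routed.

```python
# Hypothetical site configuration with two module instances.
site = {
    "name": "www.example.com",
    "modules": {
        "frontpage": {"module": "frontpage", "webPath": "/"},
        "blogs": {"module": "wordpress", "webPath": "/blogs"},
    },
}


def module_for_path(site, request_path):
    """Return the instance whose webPath is the longest prefix of request_path."""
    best_name, best_prefix = None, None
    for name, module in site["modules"].items():
        prefix = module["webPath"].rstrip("/")  # "/" becomes "", "/blogs" stays
        # Match whole path segments only, so "/blogsfoo" is not under "/blogs".
        matches = request_path == prefix or request_path.startswith(prefix + "/")
        if matches and (best_prefix is None or len(prefix) > len(best_prefix)):
            best_name, best_prefix = name, prefix
    return best_name
```

With this rule, a request for /blogs/some-post would go to the "blogs" instance, while / and any unrelated path would fall through to "frontpage".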

Hopefully, this brings us a little closer to some of our goals from the change in architecture:

  • Replaceability — forcing modules to only communicate over HTTP encourages modules not to be dependent on the exact implementation behind an HTTP API, meaning you can replace what implements that API without affecting the users of that API
  • Reusability — different modules can be combined in different sites by creating the relevant configuration JSON files
  • Repeatable deployments — each module knows how to deploy itself for a specific site and environment, so a site can be deployed with a single command
  • Testability — by allowing modules to be deployed independently, we can test each module in isolation

In the next post, I’ll cover how you go about actually implementing a module.