Getting Started with Cloud-hosted Databases using Spawn

This article will get you started with Spawn, a hosted database cloning service running entirely on AWS (Amazon Web Services).

Guest post

This is a guest post from Tonie Huizer. With 20+ years of experience, Tonie likes to read, experiment, talk and write about software. In his stories on Medium, he explains Azure, SQL and other Microsoft technologies.

Spawn makes it easy to create disposable copies of databases for your development and testing environments, and to automate them within DevOps pipelines. To get started, I’ll show you how to set up a Spawn account and walk you through the basics of creating data images and then spinning up data containers hosting your disposable databases.

In subsequent articles, I’ll start to explore how Spawn fits into DevOps pipelines alongside other Redgate tools so that you can create disposable databases to support each stage of your development process. This might include creating databases for each branch of your version control repository, as well as for continuous integration and testing of development changes.

Why Spawn?

The ideal Database DevOps process would allow every developer to spin up their own, ‘private’ copy of a specific version of a database, complete with data, whenever they need it. Each developer would then be free to experiment with different approaches and run destructive tests without interfering with the work of others. For such a system to work, it must be very fast and easy to spin up a database, run tests, and then reset it back to its original state, or remove it, without adding excessive overhead in terms of maintenance or disk space. Also, it should work not only for SQL Server databases, but also for other engines such as Oracle, PostgreSQL, MySQL, MariaDB, MongoDB or Redis.

The first technology that comes to mind for providing this type of ‘data service’ is containers. They are like virtual machines, but a container is short-lived and has a very specific goal, in this case running a database. Many developers will already be familiar with, or at least aware of, Docker containers. However, they will also tell you that using Docker containers for databases requires a considerable amount of ‘docker management’ code, as well as processing power and storage space on the local machine to run the containers.

The premise behind Spawn is that you stop hosting these databases on your own computer and start using disposable databases as a cloud-hosted service. It provides a command line tool (SpawnCtl) and a web interface to create and manage those copies, and makes scripting and automation very simple, with minimal coding.

Step 1: Sign up for Spawn

I’m only going to cover the basics you need to get started quickly with Spawn, and give you a feel for working with disposable, cloud-hosted databases. Please refer to the excellent documentation at https://docs.spawn.cc/ to immerse yourself in more details.

To get started with Spawn, the first step is to sign up at Spawn.cc, using your GitHub account. For small databases, and single user projects, you can use the “Free” account tier, which will allow you to spin up your first spawn-hosted database in minutes, free of charge.

Signing up for Spawn

Spawn can be accessed via a web browser, but the real versatility of the tool is realized via the Spawn command line, SpawnCtl.

Step 2: Install SpawnCtl

Download SpawnCtl for your OS and install it. After installation, it's necessary to authenticate yourself from the terminal. Below I use the spawnctl auth command, from PowerShell, which opens a browser window to complete the authentication.
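
The command itself is simply:

spawnctl auth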

Setting up Spawn

Step 3: Spin up a data container from a public data image

The base of every data container is a data image. This image is just a YAML file specifying the source of the image, such as a database backup, and its location, plus details of the database engine and version, and a name for the image.

Spawn uses the instructions in the data image to spin up the data container, which is the read/write database engine instance of the type and version you specified. From a single data image, each developer can quickly create a ‘clone’ of that image, inside a data container, and then connect to the data container to access any databases and data in the original image. Each data container is isolated from any others.

To help get you going quickly, the team at Spawn have provided some public data images based on familiar sample databases. You can view them either from the Spawn web interface, https://app.spawn.cc/, or using spawnctl get data-images, as follows (if you have any existing private images, then add the --public flag to see the public images as well):

spawnctl get data-images

get a list of public spawn data images

So, let’s now use Spawn to create instant copies of the Stackoverflow database within a cloud-hosted data container, first using spawnctl, and then the web interface.

Spawnctl

The following command will, within a few minutes, create a data container (I called it tutorial-ctl), hosting a SQL Server instance on which the Stackoverflow database is installed. The lifetime argument specifies that the container only runs for an hour.

spawnctl create data-container --image mssql-stackoverflow --name tutorial-ctl --lifetime 1h

use spawnctl to create a data container

The output of the create data-container command includes the connection string for the freshly created container. In case you missed that output, it can be retrieved with this command:

spawnctl get data-container tutorial-ctl -o yaml

Web interface

From the web interface, https://app.spawn.cc/, simply click on the “+” icon for the Stackoverflow public image, give the container a name (I called this one tutorial) and, optionally, an expiry date and then hit “create”:

use Spawn web interface to create a data container

Once the data container is ready, you’ll be presented with its connection string, the port, and the credentials for the sa account. Although using sa is not best practice, it is common for these short-lived databases, and it is the only option available by default when using SQL Server on Linux in a Docker container.

A Select – Delete – Reset example

We can now access either of our cloud-hosted copies of the Stackoverflow database: the one created via the web interface or the one created from the terminal. We can just use sqlcmd with the correct connection string for the data container, as follows:

sqlcmd -S instances.spawn.cc,31614 -U sa -P moJlRMmu4D6aUh2r -d stackoverflow

Alternatively, SQL Server Management Studio, Azure Data Studio or your favourite SQL editor will work too.

Once connected, we can run a test query on the tags table (SELECT COUNT(*) from dbo.tags) in Stackoverflow, and then we DELETE the table:
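
In T-SQL, those two steps look something like this (a sketch; the DELETE removes every row from the tags table):

-- check how many rows are in the tags table
SELECT COUNT(*) FROM dbo.tags;

-- now delete every row from it
DELETE FROM dbo.tags;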

testing out a Spawn-hosted database

The tags data is now lost, but don’t panic, because we can easily get it back just by resetting the data container to its original state:

spawnctl reset data-container tutorial-ctl

reset a data container

Step 4: Create your own data images and containers

The public images are useful for an initial demo, but of course for your Database DevOps processes, you’ll need to create your own shareable data images.

As briefly discussed earlier, a data image is really just a file containing instructions on how to create a data container. Much as a Dockerfile defines a Docker image, a Spawn data image is defined in YAML (YAML Ain’t Markup Language™). The source for the new data image can be an empty database, scripts or a backup. We’re only going to use a SQL Server backup as the source, but please refer to the Spawn documentation for details on the other source types.

Create the data image

Open your favourite code editor and start a new file called dataimage.yml. In this file we will specify the following details:

  • The name of the data image – mine is called whiskytrace
  • The source – in our case a backup
  • The engine and version of the data container – I’m using SQL Server 2017
  • Finally, the folder and file location of the source for the data image – I’ll assume that there is already a SQL Server backup available in the folder C:\SQL\Backup

The file should look something like this; of course, you can vary the database and backup filenames as needed.
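
Below is a minimal sketch of such a file. The field names are my approximation of the schema described in the Spawn documentation (check docs.spawn.cc for the exact format), and the backup filename WhiskyTrace.bak is just a placeholder for whatever .bak file is in C:\SQL\Backup:

# dataimage.yml (sketch only: field names are approximate, see docs.spawn.cc for the exact schema)
name: whiskytrace            # name of the data image
sourceType: backup           # the image is built from a database backup
engine: mssql                # SQL Server
version: 2017                # engine version for the data container
backups:
  - folder: C:\SQL\Backup    # folder containing the backup
    file: WhiskyTrace.bak    # placeholder backup filename; use your own .bak file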

Optionally, the data image can have tags associated with it, so you can include a few more details about the source, such as the fact that the data image comes from a backup of production:
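
For example, you might append a tags section like this (again a sketch; the tag values are purely illustrative):

tags:
  - production-backup        # records that this image came from a production backup
  - whiskytrace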

After saving the file, we submit it to Spawn with the create data-image command:

spawnctl create data-image --file="C:\SQL\Backup\dataimage.yml"

create a custom data image

But something is not right, and the Spawn server suggests that we inspect the log, which we can do with the following command:

spawnctl logs data-image 16089

troubleshoot data image creation

The problem appears to be that the version of SQL Server we specified in our dataimage.yml file is 2017, but the backup was made with SQL Server 2019.

Changing the version to 2019 in dataimage.yml solves that problem:
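
In the sketch above, that is a one-line change:

version: 2019                # must match the SQL Server version that produced the backup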

This time the data image is created successfully:

failed and successful data image creation

To tidy up, we delete the image on which the create failed:

spawnctl delete data-image 16089

deleting data images

Spin up the data container

For large databases, creating the data image will be the slowest step, but with this done, spinning up data containers based on that image should be fast. The syntax for creating a private data container is the same as for a public one:

spawnctl create data-container --image whiskytrace --name first-dram --lifetime 1h

create a private data container

The container is started within a minute and, as you can see, it contains the data; you’ll just have to take my word for it that this is the data that was previously backed up.
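
If you want to check for yourself, connect with sqlcmd just as before, substituting the connection details returned for this container (the port, password and database name below are placeholders):

sqlcmd -S instances.spawn.cc,<port> -U sa -P <password> -d <database>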

connect to the private data container

Cleaning up

In our examples, we specified a lifetime of one hour for the data containers. This means that there’s no cleaning up needed, unless you want to free those resources earlier, which can be done with the delete command:

spawnctl delete data-container tutorial-ctl
spawnctl delete data-container first-dram

Summary

In a few steps, we created our first disposable SQL Server database in the Spawn cloud, and then explored how to create a private data image and then a clone (data container) from that image.

In subsequent articles, we’ll look at the use of Spawn within a team, introduce other DevOps tools such as SQL Change Automation and Flyway Desktop, and finally set up a complete DevOps workflow with Azure Pipelines.