Building dbRosetta Part 4: Automating a CI Database Build
Since I’m starting development with the dbRosetta database, and since I’m way more comfortable with databases than with code, I’m going to continue within the database sphere for a bit as we build out dbRosetta. My next step is to work with the AI to get a pipeline in place to take our database code and deploy it to Azure Flex Server. Because I’ve got a lot more comfort with it, we’ll be using Flyway as our database automation tool (and because it’s the best DB deployment tool out there, fight me). We’re off to get this step completed.
This is, as the title says, Part 4 of an ongoing series.
Setting up Flyway
I went straight to CoPilot and asked it to generate the prompt I would need for VSCode. However, I did a bad job on the prompt. See if you can spot the problem:
Prompt:

Can you give me a prompt for vsCode, using the agreed-on prompt structure that you helped me build, to create a Flyway Enterprise baseline and initial set of migrations based on reverse engineering using Flyway Compare technology, to capture the existing database we created earlier through the Flyway CLI?

Response:

# Goal
Create a Flyway Enterprise baseline and initial set of migrations by reverse engineering the existing database (previously created via Flyway CLI), using Flyway Compare technology. The output should capture the current schema state as a reproducible baseline plus forward-compatible migration scripts.

# Context
– Database: PostgreSQL (existing instance created earlier with dialects, terms, translations seeded).
– Tooling: Flyway Enterprise with Flyway Compare enabled.
– Environment: CI/CD pipeline target, baseline must be versioned and reusable.
– Workflow: Reverse engineer schema → baseline → generate initial migrations → validate.

# Constraints
– Baseline must be non-destructive and idempotent.
– Migration scripts should follow Flyway’s versioned naming convention (e.g., `V<version>__<description>.sql`).
– Ensure compatibility with CI/CD automation (no environment-specific hardcoding).
– Scripts must be checked into Git for reproducibility.

# Deliverable
– A Flyway Enterprise baseline script capturing the current schema.
– An initial set of migration scripts generated via Flyway Compare.
– Output should be ready for inclusion in the dbRosetta repo as reference artifacts.

# Steps
1. Connect Flyway Enterprise to the target PostgreSQL database.
2. Run ….
3. Generate baseline script (…).
4. Generate initial migration scripts (…).
5. Validate scripts by running ….
6. Commit artifacts to Git under ….

# Output Format
– SQL migration files with Flyway naming convention.
– Console log of Flyway Compare results (for validation).
– Documentation snippet summarizing baseline + migrations for contributor onboarding.
It might not be readily apparent: I didn’t specify what I wanted for development versus production environments, or anything like that, so it’s treating them all the same. This is something we’ll probably need to fix later. Isn’t it great that, even with AI, we can still introduce tech debt?
I fed this prompt into vsCode. It then went to work. It knows how to connect to my Azure Flex Server for PostgreSQL, so it ran pg_dump to get the SQL file for the baseline, which it pulled from Azure (I assumed it would use Flyway after I told it to, but it surprised me). Then it built out the *.toml file for controlling Flyway and ran a baseline command against my database using the migrations it had created. It didn’t work flawlessly. It made a mistake in the configuration. Then, since it provided multiple test steps, it fixed it. I’m not kidding. It took a couple of rounds, but GitHub Copilot got it all right.
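For reference, here’s roughly the shape of a modern Flyway TOML configuration. This is a hand-written sketch, not the file the agent actually produced; the server name, database name, user, and environment name are all assumptions:

```toml
# flyway.toml -- illustrative sketch only; names and values are assumptions
[environments.dev]
url = "jdbc:postgresql://dbrosetta.postgres.database.azure.com:5432/dbrosetta"
user = "flywayadmin"
# password deliberately omitted -- supplied at runtime (e.g., via the
# FLYWAY_PASSWORD environment variable) so nothing secret lands in Git

[flyway]
environment = "dev"
locations = ["filesystem:migrations"]
```

With a file like that in place, capturing the existing schema is a matter of `flyway baseline` followed by `flyway info` to confirm the schema history table is where you expect it.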
So, yeah, I’ve got a Flyway migration harness ready to go. Well, at least, ready to set up. There may be more that’s needed as we go down the road to automate this, but, I’m not going to lie, I’m surprised at what I’ve got so far. Pleasantly so.
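Since the Constraints section of the prompt insists on Flyway’s versioned naming convention, one cheap guard worth having in CI is a filename check that runs before Flyway does. This is a hypothetical helper of my own for illustration, not something generated in the project:

```shell
# check_migrations DIR
# Returns non-zero if any .sql file in DIR breaks Flyway's
# V<version>__<description>.sql naming convention.
check_migrations() {
  dir="$1"
  bad=0
  for f in "$dir"/*.sql; do
    [ -e "$f" ] || continue            # directory may be empty
    case "$(basename "$f")" in
      V[0-9]*__*.sql) ;;               # valid versioned migration
      *) echo "bad migration name: $f" >&2; bad=1 ;;
    esac
  done
  return $bad
}
```

Dropping a call to this into the pipeline ahead of `flyway migrate` fails the build early instead of letting an oddly named script silently never run.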
Creating a Pipeline
I’m not sure if this is the right order of operations, and I didn’t ask the AI this time. However, when I’ve set these up myself, I get Flyway configured and functional first. Then I create the pipeline and figure out how to get Flyway running within it. Talking with CoPilot, it wanted me to prompt it (so weird, talking to one to talk to the other, but it’s working gangbusters so far) to build the GitHub Actions now. I had to remind it that earlier it had told me we’d need to set up a VM inside my VNet to be a self-hosted GitHub runner. I’ve noticed that these things don’t do a flawless job of maintaining context. It falls back on us, at least in my experience so far, to be extremely thorough in our prompts in order to get the best results. However, yeah, it now remembers that we’re behind a firewall.
I’m not going to post the whole conversation we had on this. It took a little back and forth to arrive at a plan that was actually secure. I’m getting regular reminders through this process that just blindly taking an LLM and putting it to work on sensitive code is inherently dangerous. These things are frighteningly good at what they do. However, the fact that they seem to forget the context, a lot, is disconcerting. And yes, before anyone asks, I’m using the paid tier for this adventure. I have to imagine the free tier may be even more forgetful.
Regardless, I got a prompt out of CoPilot:
# Goal
Provision a least‑cost Azure VM inside the target VNet to act as a self‑hosted GitHub Actions runner for dbRosetta. Configure it to securely retrieve secrets from Azure Key Vault at runtime.

# Context
– Cloud: Azure
– Database: PostgreSQL Flexible Server (private VNet access only)
– Runner: GitHub Actions self‑hosted
– Tooling: Flyway Enterprise CLI
– Secrets: Stored in Azure Key Vault, accessed via managed identity/OIDC
– Cost sensitivity: minimize VM size while maintaining functionality

# Constraints
– VM must be inside the same VNet/subnet as PostgreSQL Flexible Server
– Use Ubuntu LTS image for lowest cost and compatibility
– VM size: Standard_B1s (1 vCPU, 1 GB RAM)
– No public IP; access only via VNet
– GitHub Actions runner service installed
– Flyway Enterprise CLI installed
– Secrets retrieved from Key Vault at runtime (no secrets in repo)

# Deliverable
– Azure CLI commands to provision VM and enable system-managed identity
– Key Vault setup: create vault, add DB connection string + Flyway license
– GitHub Actions workflow snippet to fetch secrets via OIDC
– Connectivity validation log (`flyway info`)

# Steps
1. Create resource group if not exists.
2. Provision VM:
– `az vm create` with Ubuntu LTS image
– Size: Standard_B1s
– VNet/subnet: same as PostgreSQL Flexible Server
– Disable public IP
– Enable system-managed identity
3. Create Azure Key Vault; add secrets (DB connection string, Flyway license).
4. Grant the VM identity the `Key Vault Secrets User` role.
5. SSH into VM; install GitHub Actions runner service.
6. Install Flyway Enterprise CLI.
7. Validate connectivity: runner retrieves secrets from Key Vault → run `flyway info`.
8. Document setup in `/docs/runner-setup.md`.

# Output Format
– Azure CLI commands
– Bash install script for runner + Flyway
– GitHub Actions workflow snippet for OIDC + Key Vault
– Connectivity validation log
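For orientation, steps 1–2 of that prompt map onto Azure CLI roughly like this. This is my own sketch under assumed names (the resource group, VNet, and subnet are placeholders, not the project’s actual values), not the script the agent generated:

```shell
# Illustrative sketch only -- resource names are placeholders
az group create --name dbRosetta --location eastus

az vm create \
  --resource-group dbRosetta \
  --name dbrosetta-runner \
  --image Ubuntu2204 \
  --size Standard_B1s \
  --vnet-name <vnet-name> \
  --subnet <subnet-name> \
  --public-ip-address "" \
  --assign-identity \
  --admin-username azureuser \
  --generate-ssh-keys
```

The two flags doing the security work are `--public-ip-address ""`, which suppresses the public IP entirely, and `--assign-identity`, which enables the system-managed identity that Key Vault access hangs off later.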
Oh boy. Here we go. Of course, that’s when I realized I already had one prompt created for this. <sigh> That’s what I get for taking weeks between posts. LLMs are not the only things that lose context. I’m leaving that prompt in the project, but I’ll mark it at the top of the file as a mistake. After running this prompt, it chose to create everything as bash scripts this time. Here’s the result of the first one:
Sure looks like it’s working to me. I’ve got the VM, security group, network interface, and VNet. On to the next script. This one involves a lot of setup within the VM. vsCode can’t help me inside it as an agent, so I’m doing a lot of typing from vsCode’s prompts. Now I’m the AI. Ha! The interesting thing was troubleshooting aspects of the setup. Yes, the AI didn’t get everything right the first time (I also may have introduced a typo or three). Troubleshooting was actually pretty outstanding. I’d feed it errors; it would feed me solutions, or additional checks followed by solutions.
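The in-VM setup boils down to something like the following. Again, a sketch rather than the actual commands used: the versions, owner, and registration token are placeholders (the real values come from the repo’s Settings → Actions → Runners page and Redgate’s download site):

```shell
# Sketch of the in-VM setup -- versions and tokens are placeholders
# Run on the Ubuntu runner VM over SSH.

# GitHub Actions self-hosted runner
mkdir -p ~/actions-runner && cd ~/actions-runner
curl -o runner.tar.gz -L \
  "https://github.com/actions/runner/releases/download/v<VERSION>/actions-runner-linux-x64-<VERSION>.tar.gz"
tar xzf runner.tar.gz
./config.sh --url https://github.com/<OWNER>/dbRosetta --token <REGISTRATION_TOKEN>
sudo ./svc.sh install && sudo ./svc.sh start   # run the runner as a service

# Flyway CLI (Enterprise features unlock via license at runtime)
curl -L "https://download.red-gate.com/maven/release/com/redgate/flyway/flyway-commandline/<VERSION>/flyway-commandline-<VERSION>-linux-x64.tar.gz" | tar xz
sudo ln -s "$PWD/flyway-<VERSION>/flyway" /usr/local/bin/flyway
```

The `svc.sh install`/`start` pair is what turns a one-off runner process into a service that survives reboots, which matters for a VM you intend to forget about.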
Most of the rest of this was very interactive. Some of the work was readily done by GitHub Copilot acting as an agent within vsCode. Some of the work was me, getting certificates and stuff like that which couldn’t readily be retrieved through Copilot. It was interesting how much I had to remind it of things. For example, it wanted to test what we’d done, but only a ‘Development’ or ‘Production’ branch was supported. Its first inclination was to add the branch we were working on, but that branch wouldn’t be permanent, so I had to remind it that we could do it for a test, but it had to be easily undone. Then, when we hit a security problem on our PostgreSQL cluster, it wanted to use the public IP address we had temporarily added to our VM. It was as if, once a thing was done and available, that thing was permanently part of the project, despite the stated purpose being temporary, just for setup. For example:
Instead of the public IP address, shouldn’t we do something within Azure to let the resource access it since the IP address is temporary for the setup?
Excellent point! You’re absolutely right. Since both the VM and PostgreSQL are in Azure, we should use VNet integration or Azure service endpoints instead of relying on the public IP. Let me check the PostgreSQL networking configuration:
az postgres flexible-server show --resource-group dbRosetta --name dbrosetta --query "{name:name,network:{delegatedSubnetResourceId:network.delegatedSubnetResourceId,privateDnsZoneArmResourceId:network.privateDnsZoneArmResourceId,publicNetworkAccess:network.publicNetworkAccess}}" --output json

az postgres flexible-server show --resource-group dbRosetta --name dbrosetta --query "network" --output json
I see – your PostgreSQL server is configured with public network access (not private VNet integration). For Azure PostgreSQL Flexible Server with public access, the proper Azure-native solution is to use Azure Private Endpoint or enable “Allow Azure services”. Let me check if we can enable the “Allow access to Azure services” option:
az postgres flexible-server firewall-rule create --resource-group dbRosetta --name dbrosetta --rule-name "AllowAzureServices" --start-ip-address 0.0.0.0 --end-ip-address 0.0.0.0
Perfect! The firewall rule 0.0.0.0 is a special Azure rule that allows all Azure services (including VMs in the same subscription) to connect to the PostgreSQL server. This is the Azure-native way to allow internal Azure resources to communicate. Now let’s test Flyway connectivity again:
ssh -i C:\Users\grant\.ssh\dbrosetta-runner azureuser@23.96.182.139 "source ~/load-db-env.sh && flyway info -schemas=dbrosetta"
Good progress! We’re past the network timeout – now it’s a password authentication issue. Let me verify the credentials in Key Vault:
See. It figured out stuff with only the occasional nudge.
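Pulling the pieces together, the kind of workflow all of this enables looks something like the following. This is my own sketch under assumed names (the vault name, secret name, and branch are illustrative), not the actual dbRosetta workflow:

```yaml
# Hypothetical workflow sketch -- names are assumptions, not project values
name: db-migrate
on:
  push:
    branches: [Development]
permissions:
  id-token: write   # required for OIDC federation with Azure
  contents: read
jobs:
  migrate:
    runs-on: self-hosted   # the VNet-resident runner VM
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Fetch DB secret from Key Vault
        run: |
          DB_URL=$(az keyvault secret show --vault-name dbrosetta-kv \
            --name db-connection-string --query value -o tsv)
          echo "::add-mask::$DB_URL"          # keep it out of the logs
          echo "DB_URL=$DB_URL" >> "$GITHUB_ENV"
      - name: Run migrations
        run: flyway -url="$DB_URL" migrate
```

Because the runner sits inside the VNet and the secret is fetched at runtime via the managed identity, nothing sensitive ever lands in the repo, which was the whole point of the exercise.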
Conclusion
Overall, I’m blown away. Most snags were not because the AI hallucinated or was horribly wrong. Most snags were from licensing, permissions, and, yeah, bits and bobs I didn’t define while I was providing all the prompts to both AIs. The silly thing was pretty darned helpful. I let it do troubleshooting that I was perfectly capable of doing myself, because I wanted to see it work things out. It did. Quite well. AI is coming for your job… IF you’re not real good at your job. The fact is, this was an incredible force multiplier. I could have done every bit of the work we did here. GitHub Copilot made it all a lot faster and easier. However, I had to be on top of it. I had to track everything, because it didn’t always. It was more than ready to compromise security, multiple times. I couldn’t have just let it run. This means, at least for the moment, we’re not just employed, we’re necessary.
Next, I think we’re building an application.
Here is the conclusion of the project, according to GitHub Copilot in vsCode:
We provisioned a cost-effective Azure VM ($7.50/month) as a self-hosted GitHub Actions runner, configured with managed identity for passwordless authentication to Azure services. Azure Key Vault securely stores PostgreSQL credentials, which the runner retrieves at runtime to execute Flyway migrations. PostgreSQL firewall rules allow Azure-to-Azure communication, eliminating the need for public IP dependencies. The complete CI/CD pipeline was successfully validated—from GitHub push to automated schema validation and migration—all running securely within Azure’s network perimeter.
And here’s the summary from CoPilot:
In Part 4 of the dbRosetta series, the focus shifts from local experimentation to automated, reproducible deployments. Building on the earlier baseline and migration scripts, this stage introduces a CI/CD pipeline powered by Flyway Enterprise and GitHub Actions, with a self‑hosted runner inside Azure’s VNet to securely connect to PostgreSQL Flexible Server. The article walks through provisioning the runner, integrating Azure Key Vault for secret management, and validating migrations end‑to‑end. By the close, dbRosetta evolves from a working prototype into a resilient, teachable reference implementation — demonstrating how database DevOps can be both secure and community‑friendly.
Me, Grant: If you enjoyed this article and you don’t know about the series:
Part 1: Introducing the Concept of dbRosetta
Part 2: Defining the Project & Prompt Templates
Part 3: Creating a Database
And if you’d like to look at the code, including the prompts used:
dbRosetta
