10 November 2025

Building dbRosetta Using AI: Part 3, Creating a Database

The AI said I had to do a database first, not code. Who am I to argue? So, with all the prompts outlining the goals of the project, I’ve gone forward with the project, and step one is creating a PostgreSQL database on Azure.

This is part three of a multi-part set of articles. I’ll move this list to the bottom of future articles:
Part 1: Introducing the Concept of dbRosetta
Part 2: Defining the Project & Prompt Templates

Generating the Database Prompt

At the end of the last article, I said I’d create two prompts, one by hand and one from CoPilot, to run in vsCode. However, on further cogitation, I decided I’d just run with CoPilot’s prompts. I’ll tell CoPilot what I want. I’ll let CoPilot interpret that for GitHub CoPilot. We’ll see what happens. Let me just say, so far, fascinating.

Prompt

Response

OK. Let’s start with the initial prompt for vsCode to create the database. Would you like to create an example of that prompt? Do you need more information before you create the database that I’ve roughly described?

VS Code prompt for Copilot (create initial Azure Postgres schema)

Use this prompt in VS Code Copilot (or your prompt-runner) to generate the initial SQL DDL and a tiny deployment checklist. It assumes an Azure Database for PostgreSQL instance and creates the minimal tables for dbRosetta v0: dialects, terms, translations, and a simple artifacts/metadata table. It also records the assumptions used so the output is deterministic.

Prompt SYSTEM: You are a careful SQL developer and infra assistant. Produce a single SQL file (UTF-8) that can be applied to an Azure PostgreSQL instance to create a schema named dbrosetta and four tables: dialects, terms, translations, and artifacts. Include sensible indexes, constraints, and comments that reflect the acceptance criteria for term translation (no users/auth stored, cacheable translations, metadata JSON). Use standard PostgreSQL types and features compatible with Azure Database for PostgreSQL (single server/postgres-compatible). Do not include any destructive statements (DROP TABLE). At the top of the file, include a brief assumptions block (postgres version, encoding, owner placeholder). Provide example INSERTs for a few dialects (postgres, sqlserver, mysql). USER: Generate the SQL file now, then provide a one-paragraph deployment checklist with the minimal Azure connection/env vars and a sample psql command to apply the file. Keep SQL and checklist separate and clearly labeled. Use low creativity and deterministic style.

Example SQL DDL result (what Copilot should produce)

— Assumptions:
— Postgres version: compatible with 12+ features
— Encoding: UTF8
— Schema: dbrosetta
— Owner: replace with your DB user

CREATE SCHEMA IF NOT EXISTS dbrosetta AUTHORIZATION current_user;

— dialects: supported DB dialects and a docs home URL CREATE TABLE dbrosetta.dialects ( id SERIAL PRIMARY KEY, slug TEXT NOT NULL UNIQUE, name TEXT NOT NULL, documentation_home_url TEXT, created_at TIMESTAMPTZ NOT NULL DEFAULT now() ); COMMENT ON TABLE dbrosetta.dialects IS ‘Registry of supported dialects and their canonical documentation home page’;

— terms: canonical terms per dialect for potential seed data CREATE TABLE dbrosetta.terms ( id SERIAL PRIMARY KEY, dialect_id INT NOT NULL REFERENCES dbrosetta.dialects(id) ON DELETE RESTRICT, term TEXT NOT NULL, canonical_normalized_term TEXT NOT NULL, created_at TIMESTAMPTZ NOT NULL DEFAULT now(), UNIQUE (dialect_id, canonical_normalized_term) ); COMMENT ON TABLE dbrosetta.terms IS ‘Optional seeding of known terms per dialect; canonical_normalized_term is lowercased normalization’;

— translations: cached translation results and metadata from runs CREATE TABLE dbrosetta.translations ( id BIGSERIAL PRIMARY KEY, from_dialect_id INT NOT NULL REFERENCES dbrosetta.dialects(id) ON DELETE RESTRICT, to_dialect_id INT NOT NULL REFERENCES dbrosetta.dialects(id) ON DELETE RESTRICT, input_term TEXT NOT NULL, preferred_term TEXT, alternates JSONB DEFAULT ‘[]’::jsonb, description TEXT, doc_url TEXT, mapping_report JSONB DEFAULT ‘[]’::jsonb, metadata JSONB NOT NULL, cached BOOLEAN NOT NULL DEFAULT true, created_at TIMESTAMPTZ NOT NULL DEFAULT now(), UNIQUE (from_dialect_id, to_dialect_id, lower(input_term)) WHERE cached ); COMMENT ON TABLE dbrosetta.translations IS ‘Cached translations: metadata includes prompt_version, model_settings, run_id, timestamp’;

— artifacts: optional generic artifacts produced by runs CREATE TABLE dbrosetta.artifacts ( id BIGSERIAL PRIMARY KEY, run_id UUID NOT NULL, artifact_type TEXT NOT NULL, — e.g., translation.json, metadata.json payload JSONB NOT NULL, created_at TIMESTAMPTZ NOT NULL DEFAULT now() ); COMMENT ON TABLE dbrosetta.artifacts IS ‘Store run artifacts for auditing and reproducibility’;

— Example seeds INSERT INTO dbrosetta.dialects (slug, name, documentation_home_url) VALUES (‘postgres’, ‘PostgreSQL’, ‘https://www.postgresql.org/docs/’), (‘sqlserver’, ‘Microsoft SQL Server’, ‘https://learn.microsoft.com/sql/’), (‘mysql’, ‘MySQL’, ‘https://dev.mysql.com/doc/’);

— Optional: index to speed lookup by normalized input term in translations CREATE INDEX IF NOT EXISTS idx_translations_from_to_input ON dbrosetta.translations(from_dialect_id, to_dialect_id, lower(input_term));

Minimal deployment checklist (one-paragraph)

Assumptions: you have an Azure PostgreSQL server and a database created; export connection info as environment variables PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD. Apply the SQL file with: psql “host=$PGHOST port=$PGPORT dbname=$PGDATABASE user=$PGUSER password=$PGPASSWORD sslmode=require” -f dbrosetta-init.sql. . In CI, store secrets in GH Actions secrets or Azure Key Vault and run the same psql command in a job step; verify success by querying SELECT count(*) FROM dbrosetta.dialects;.

Assumptions I used (so you can edit them into the prompt-runner)

Azure Postgres supports standard PostgreSQL 12+ features and JSONB.
No row-level security or advanced role setup is included; owner replacement is manual.
Caching is enabled by default: translations are inserted with cached=true.
Metadata JSON must include prompt_version and model_settings as strings/objects during run-time (enforced by application logic, not DB constraints).

If this meets your needs, I will produce a ready-to-save dbrosetta-init.sql file and a short GitHub Actions job snippet next.

I swear, this thing is EXCITED about this project. Ha! So anyway, you may notice what I did. That prompt deviates a bit from the template we outlined. So, I responded:

Prompt

Response

That’s great and I’ll use it as written within vsCode if it’s what you want. However, is that following the template that you defined for me earlier?

Assessment

Yes — the VS Code prompt and the example SQL DDL you accepted broadly follow the canonical prompt template you asked for, but they’re not a 1:1 fulfillment of every template section. The prompt covers Intent, Context, Template Prompt text, and a deterministic ModelSettings requirement, and the DDL output aligns with the Acceptance Criteria and Guardrails you described.

Where it already matches the canonical template

Intent: Clearly stated (create initial Azure Postgres schema for dbRosetta).
Context: Included assumptions about Azure Postgres, schema name, and non‑destructive output.
Template Prompt (literal): Provided a SYSTEM and USER instruction that can be pasted into Copilot or a prompt-runner.
Guardrails: Explicit “Do not include destructive statements” and determinism guidance (low creativity).
Post Processing expectations: You asked for a separate checklist and told Copilot to keep SQL and checklist separate.

Small gaps to make it fully conformant (suggested, optional)

Frontmatter / ModelSettings block: Add a short frontmatter header (Title, Version, ModelSettings) to the prompt file so it matches the template structure and can be parsed by your prompt-runner.
Input / Output Contracts: Explicitly state the Input Contract (e.g., target schema name, owner placeholder, desired seed dialects) and Output Contract (single SQL file path; checklist text file) inside the prompt text so CI and schema validators can assert them.
Acceptance Criteria: Convert the existing informal checks into 3–6 bullet acceptance sentences (e.g., “DDL creates schema and tables without DROP statements”; “translations table contains metadata JSON and cache uniqueness constraint”; “psql apply command succeeds in CI”).
Telemetry keys: Ask the run to emit a run_id and include metadata keys (prompt_version, model_settings) in an artifacts JSON when the prompt-runner executes.

Minimal actionable edit (one small snippet you can paste)

Add this frontmatter and two short contract lines at the top of your prompt before the SYSTEM line:

Title: “dbRosetta init schema” Version: “0.1.0” ModelSettings: { temperature: 0.0, max_tokens: 1024, top_p: 1.0 }

Input Contract: { schema_name: “dbrosetta”, owner_placeholder: “<DB_USER>”, seed_dialects: [“postgres”,”sqlserver”,”mysql”] } Output Contract: { file: “dbrosetta-init.sql”, checklist: “dbrosetta-deploy-checklist.txt” }

…<additional stuff snipped>

CoPilot can do no wrong according to CoPilot. Fine. We’re trying to let it run the show, so I’m adding the defined header. However, I’m going to change my prompts to be more specific from here on out. I do want to ensure that we follow the plan we said we were going to follow, even if the AI wants to hare off and do whatever (<insert joke about developers here>).

Over to vsCode. Let’s see what happens when we use this prompt.

Oh, but before I do that, I’m having side conversations occasionally with CoPilot. One of them was on branching. It wanted a branch for every prompt, which I thought was a bit much, lots and lots of merges that are sooner or later going to get messy. I suggested a branch for each article. The AI compromised, a branch for each article, but, within the article, a branch for each feature/prompt.

Creating a Database

The amount of setup in vsCode is just about non-existent. I’ve got the project in Github open in vsCode. I’ve already installed extensions for Github and PostgreSQL (I’ll use vsCode to do manual work in Azure PostgreSQL, but I may occasionally revert back to DBeaver for some work, just to be open & clear). Nothing to it but to paste in the prompt. Here’s the output:

I’ll generate a complete SQL schema file for the dbRosetta system and provide deployment instructions. Let me create this file in your workspace.

Created [](file:///c%3A/Users/grant/OneDrive/Documents/GitHub/dbRosetta/schema/init-dbrosetta-schema.sql)

## SQL Schema File

The complete SQL schema file has been created at init-dbrosetta-schema.sql. This file creates the dbrosetta schema with four tables (dialects, terms, translations, and artifacts) along with appropriate indexes, constraints, foreign keys, and sample data for PostgreSQL, SQL Server, and MySQL dialects.

## Deployment Checklist

**Prerequisites and Environment Variables:** Ensure you have an active Azure Database for PostgreSQL instance with admin credentials. Set the following environment variables: PGHOST (your Azure PostgreSQL server name), PGPORT (typically 5432), PGDATABASE (target database name), PGUSER (admin username), and PGPASSWORD (admin password). Verify network connectivity by ensuring your client IP is allowed in the Azure PostgreSQL firewall rules. **Deployment Command:** Execute the schema using psql with SSL required for Azure: psql "host=$PGHOST port=$PGPORT dbname=$PGDATABASE user=$PGUSER password=$PGPASSWORD sslmode=require" -f schema/init-dbrosetta-schema.sql. After successful execution, verify the schema creation by connecting to the database and running \dt dbrosetta.* to list the created tables and SELECT COUNT(*) FROM dbrosetta.dialects; to confirm sample data insertion.

It worked! I feel just a little like Victor Frankenstein. Actually, I feel a lot like Victor Frankenstein. Here’s the SQL code it generated, all of it, not hiding the monster from you at all:

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

-- =============================================================================

-- dbRosetta Schema Initialization

-- =============================================================================

-- ASSUMPTIONS:

-- - PostgreSQL version 11+ (Azure Database for PostgreSQL compatible)

-- - UTF-8 encoding

-- - Database owner/admin privileges required for schema creation

-- - Target database already exists

-- - Schema owner placeholder: replace 'dbrosetta_owner' with actual role

-- =============================================================================

-- Create schema

CREATE SCHEMA IF NOT EXISTS dbrosetta;

-- Set search path for this session

SET search_path TO dbrosetta, public;

-- =============================================================================

-- TABLE: dialects

-- Stores SQL dialect definitions and metadata

-- =============================================================================

CREATE TABLE dialects (

id SERIAL PRIMARY KEY,

name VARCHAR(50) NOT NULL UNIQUE,

display_name VARCHAR(100) NOT NULL,

version VARCHAR(20),

description TEXT,

is_active BOOLEAN NOT NULL DEFAULT true,

metadata JSONB,

created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,

updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP

);

-- Indexes for dialects table

CREATE INDEX idx_dialects_name ON dialects (name);

CREATE INDEX idx_dialects_active ON dialects (is_active);

CREATE INDEX idx_dialects_metadata ON dialects USING GIN (metadata);

-- Comments for dialects table

COMMENT ON TABLE dialects IS 'SQL dialect definitions and metadata for translation system';

COMMENT ON COLUMN dialects.name IS 'Unique identifier for the dialect (lowercase, no spaces)';

COMMENT ON COLUMN dialects.display_name IS 'Human-readable name for the dialect';

COMMENT ON COLUMN dialects.version IS 'Version identifier for the dialect';

COMMENT ON COLUMN dialects.is_active IS 'Flag to enable/disable dialect for translations';

COMMENT ON COLUMN dialects.metadata IS 'Additional dialect configuration and features as JSON';

-- =============================================================================

-- TABLE: terms

-- Stores canonical SQL terms and their definitions

-- =============================================================================

CREATE TABLE terms (

id SERIAL PRIMARY KEY,

canonical_term VARCHAR(200) NOT NULL UNIQUE,

category VARCHAR(50) NOT NULL,

subcategory VARCHAR(50),

description TEXT NOT NULL,

usage_context TEXT,

is_active BOOLEAN NOT NULL DEFAULT true,

metadata JSONB,

created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,

updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP

);

-- Indexes for terms table

CREATE INDEX idx_terms_canonical ON terms (canonical_term);

CREATE INDEX idx_terms_category ON terms (category);

CREATE INDEX idx_terms_active ON terms (is_active);

CREATE INDEX idx_terms_category_sub ON terms (category, subcategory);

CREATE INDEX idx_terms_metadata ON terms USING GIN (metadata);

-- Comments for terms table

COMMENT ON TABLE terms IS 'Canonical SQL terms and definitions for cross-dialect translation';

COMMENT ON COLUMN terms.canonical_term IS 'Standardized term identifier (case-sensitive)';

COMMENT ON COLUMN terms.category IS 'Primary classification of the term (e.g., function, operator, keyword)';

COMMENT ON COLUMN terms.subcategory IS 'Secondary classification for more specific grouping';

COMMENT ON COLUMN terms.usage_context IS 'Additional context about when and how to use this term';

COMMENT ON COLUMN terms.metadata IS 'Additional term properties and configuration as JSON';

-- =============================================================================

-- TABLE: translations

-- Stores dialect-specific translations for canonical terms

-- =============================================================================

CREATE TABLE translations (

id SERIAL PRIMARY KEY,

term_id INTEGER NOT NULL REFERENCES terms(id) ON DELETE CASCADE,

dialect_id INTEGER NOT NULL REFERENCES dialects(id) ON DELETE CASCADE,

translated_term VARCHAR(500) NOT NULL,

syntax_pattern TEXT,

examples TEXT,

notes TEXT,

confidence_level INTEGER DEFAULT 100 CHECK (confidence_level >= 0 AND confidence_level <= 100),

is_active BOOLEAN NOT NULL DEFAULT true,

metadata JSONB,

created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,

updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,

UNIQUE(term_id, dialect_id)

);

-- Indexes for translations table

CREATE INDEX idx_translations_term ON translations (term_id);

CREATE INDEX idx_translations_dialect ON translations (dialect_id);

CREATE INDEX idx_translations_active ON translations (is_active);

CREATE INDEX idx_translations_confidence ON translations (confidence_level);

CREATE INDEX idx_translations_term_dialect ON translations (term_id, dialect_id);

CREATE INDEX idx_translations_metadata ON translations USING GIN (metadata);

-- Comments for translations table

COMMENT ON TABLE translations IS 'Dialect-specific translations for canonical terms with caching support';

COMMENT ON COLUMN translations.translated_term IS 'The dialect-specific equivalent of the canonical term';

COMMENT ON COLUMN translations.syntax_pattern IS 'Template or pattern showing how to use the translated term';

COMMENT ON COLUMN translations.examples IS 'Example usage in the target dialect';

COMMENT ON COLUMN translations.confidence_level IS 'Translation accuracy confidence (0-100)';

COMMENT ON COLUMN translations.metadata IS 'Additional translation properties and caching hints as JSON';

-- =============================================================================

-- TABLE: artifacts

-- Stores generated SQL artifacts and their metadata

-- =============================================================================

CREATE TABLE artifacts (

id SERIAL PRIMARY KEY,

name VARCHAR(200) NOT NULL,

artifact_type VARCHAR(50) NOT NULL,

source_dialect_id INTEGER REFERENCES dialects(id) ON DELETE SET NULL,

target_dialect_id INTEGER REFERENCES dialects(id) ON DELETE SET NULL,

original_sql TEXT,

translated_sql TEXT,

translation_summary TEXT,

status VARCHAR(20) NOT NULL DEFAULT 'draft',

metadata JSONB,

created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,

updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP

);

-- Indexes for artifacts table

CREATE INDEX idx_artifacts_name ON artifacts (name);

CREATE INDEX idx_artifacts_type ON artifacts (artifact_type);

CREATE INDEX idx_artifacts_source_dialect ON artifacts (source_dialect_id);

CREATE INDEX idx_artifacts_target_dialect ON artifacts (target_dialect_id);

CREATE INDEX idx_artifacts_status ON artifacts (status);

CREATE INDEX idx_artifacts_created ON artifacts (created_at);

CREATE INDEX idx_artifacts_metadata ON artifacts USING GIN (metadata);

-- Comments for artifacts table

COMMENT ON TABLE artifacts IS 'Generated SQL artifacts from translation processes';

COMMENT ON COLUMN artifacts.name IS 'User-friendly name for the artifact';

COMMENT ON COLUMN artifacts.artifact_type IS 'Type of artifact (e.g., script, procedure, schema)';

COMMENT ON COLUMN artifacts.original_sql IS 'Source SQL before translation';

COMMENT ON COLUMN artifacts.translated_sql IS 'SQL after dialect translation';

COMMENT ON COLUMN artifacts.translation_summary IS 'Summary of changes made during translation';

COMMENT ON COLUMN artifacts.status IS 'Artifact status (draft, validated, deployed, etc.)';

COMMENT ON COLUMN artifacts.metadata IS 'Additional artifact properties and processing metadata as JSON';

-- =============================================================================

-- EXAMPLE DATA

-- Sample dialects and basic term translations

-- =============================================================================

-- Insert sample dialects

INSERT INTO dialects (name, display_name, version, description, metadata) VALUES

('postgresql', 'PostgreSQL', '13+', 'PostgreSQL database system', '{"features": ["arrays", "jsonb", "cte"], "url": "https://postgresql.org"}'),

('sqlserver', 'Microsoft SQL Server', '2019+', 'Microsoft SQL Server database system', '{"features": ["cte", "windowing", "xml"], "url": "https://microsoft.com/sql"}'),

('mysql', 'MySQL', '8.0+', 'MySQL database system', '{"features": ["json", "cte"], "url": "https://mysql.com"}');

-- Insert sample terms

INSERT INTO terms (canonical_term, category, subcategory, description, usage_context) VALUES

('CONCAT', 'function', 'string', 'Concatenates two or more strings together', 'String manipulation and formatting'),

('ISNULL', 'function', 'conditional', 'Tests if a value is NULL and returns alternative', 'NULL handling and conditional logic'),

('LIMIT', 'clause', 'query', 'Limits the number of rows returned by a query', 'Result set pagination and performance');

-- Insert sample translations

INSERT INTO translations (term_id, dialect_id, translated_term, syntax_pattern, examples, confidence_level) VALUES

-- CONCAT translations

(1, 1, 'CONCAT', 'CONCAT(string1, string2, ...)', 'SELECT CONCAT(''Hello'', '' '', ''World'');', 100),

(1, 2, 'CONCAT', 'CONCAT(string1, string2, ...)', 'SELECT CONCAT(''Hello'', '' '', ''World'');', 100),

(1, 3, 'CONCAT', 'CONCAT(string1, string2, ...)', 'SELECT CONCAT(''Hello'', '' '', ''World'');', 100),

-- ISNULL translations

(2, 1, 'COALESCE', 'COALESCE(value, replacement)', 'SELECT COALESCE(column, ''N/A'') FROM table;', 95),

(2, 2, 'ISNULL', 'ISNULL(value, replacement)', 'SELECT ISNULL(column, ''N/A'') FROM table;', 100),

(2, 3, 'IFNULL', 'IFNULL(value, replacement)', 'SELECT IFNULL(column, ''N/A'') FROM table;', 100),

-- LIMIT translations

(3, 1, 'LIMIT', 'LIMIT count [OFFSET offset]', 'SELECT * FROM table LIMIT 10 OFFSET 20;', 100),

(3, 2, 'TOP', 'TOP (count)', 'SELECT TOP (10) * FROM table ORDER BY id OFFSET 20 ROWS;', 90),

(3, 3, 'LIMIT', 'LIMIT [offset,] count', 'SELECT * FROM table LIMIT 20, 10;', 100);

-- =============================================================================

-- GRANTS AND PERMISSIONS

-- =============================================================================

-- Grant schema usage (replace 'dbrosetta_owner' with actual role)

-- GRANT USAGE ON SCHEMA dbrosetta TO dbrosetta_owner;

-- GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA dbrosetta TO dbrosetta_owner;

-- GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA dbrosetta TO dbrosetta_owner;

-- =============================================================================

-- SCHEMA VALIDATION

-- =============================================================================

-- Verify schema creation

SELECT

schemaname,

tablename,

tableowner

FROM pg_tables

WHERE schemaname = 'dbrosetta'

ORDER BY tablename;

-- Verify sample data

SELECT

d.name as dialect,

t.canonical_term,

tr.translated_term,

tr.confidence_level

FROM dialects d

JOIN translations tr ON d.id = tr.dialect_id

JOIN terms t ON tr.term_id = t.id

ORDER BY t.canonical_term, d.name;

I’m impressed, somewhat, that it decided to put some tests in. Yeah, OK, however ineffectual those tests may be, they’re there. Kind of cool. Overall, I’m not crazy about the script, especially the indexing. However, we’re taking this as written. I tested the script. It worked great. However, I also fed the script back to CoPilot so that it could validate if this met our criteria. CoPilot wasn’t completely happy with the results:

Issues and risks to address

SERIAL usage and sequences

SERIAL works on PostgreSQL 11+, but for newer portability and explicit behavior consider IDENTITY (GENERATED ALWAYS AS IDENTITY) to avoid implicit sequence ownership issues in some managed platforms.

updated_at not maintained automatically

updated_at defaults to CURRENT_TIMESTAMP but won’t change on updates. Consider adding triggers (or application-level updates) to maintain updated_at consistently.

Case sensitivity / normalization for canonical_term and lookups

canonical_term is UNIQUE but case-sensitive. If lookups should be case-insensitive, add a unique index on lower(canonical_term) or store a normalized column.

Unnamed constraints and FK clarity

FKs and UNIQUE constraints are fine, but naming constraints (e.g., fk_translations_term) improves clarity in migrations and error messages.

Ambiguous data types and pattern length

translated_term VARCHAR(500) may be fine, but confirm max lengths vs expected needs. examples/text fields could be TEXT consistently to avoid arbitrary limits.

Verification queries may fail in non-interactive CI

The SELECT checks at the end are useful locally, but in automated runs they may produce output that’s ignored; ensure CI treats success as exit code 0 and doesn’t break on non-empty results.

Metadata key expectations not enforced

Your acceptance criteria mention prompt_version and model_settings in metadata; the DB doesn’t validate those keys. Consider JSON Schema validation at application level or a CHECK that metadata ? ‘prompt_version’ if you want DB-level guardrails.

Search path and session settings

SET search_path TO dbrosetta, public is fine for an interactive run, but scripts applied in CI or by apps should avoid relying on session-specific search_path unless intentional.

Except for #6 (I don’t like the tests, at all), and #2 (triggers? really?), I think this is a very good set of recommendations for improving the initial script. I’m a little shocked that it’s not put out by those index choices. However, I’m going to have CoPilot regenerate the script with these suggested changes except I’m going to have it just remove the existing tests and not generate triggers. While I’m letting the AI largely run things, I figure my input counts as well.

Although, that brings up a point of discussion. Should I? What I mean is, am I biased by having knowledge of what we’re building such that I’m pushing the decisions in a direction I’m more comfortable with? Yeah, yeah, I know I’m biased. Everyone is biased. I’m just wondering if my bias is negatively affecting the experiment? I’m going forward with how I planned it, but I figured this is worth at least bringing up.

With that, mission accomplished. We have a database. And I have to say, I’m pretty excited about it.

Conclusion

I’m coming to believe more and more, being a beginner in IT has become much more challenging due to these new tools. Look, we can argue about the naming standards, indexing, data type choices, and general table structure all we want. The fact of the matter is, that is a functional database. I did next to nothing to arrive at it other than interrogate the LLM a bit. Heck, GitHub CoPilot generated files for me. And filled them in. I could easily allow it to create new resources within Azure, same as everything else we’re doing. So even the steps I’ve done manually, I didn’t have to. The question isn’t, can a beginner do this. The question is, how does a beginner actually learn how to do this, what’s right, what’s wrong, what’s simply problematic or a compromise? That’s all about experience, which this stuff is not giving me.

Anyway, we have a database. I’m honestly excited with how this is going. Next up, I’m going to create a Continuous Integration deployment process where we can begin to put testing and validation in place. I will be using Flyway, and, after discussion with CoPilot, GitHub actions.

As per usual, I’ll let CoPilot have the last word:

The database-creation step succeeded: the init script was produced, applied without errors, and the target schema and seed data are present in the test environment. Core guardrails (non‑destructive DDL, JSONB metadata, identity keys, and named constraints) are in place.

Redgate is the proud host of PASS Data Community Summit 2025 – what to expect from us this year

With just 2 weeks to go until PASS Data Community Summit 2025 kicks off in Seattle, we wanted to let you know what you can expect from Redgate who are proudly hosting the summit this year. Redgate booth This year we’ll be back in the Exhibit Hall with our eye-catching booth and a sea of red... Read more

PASS Data Community Summit

Grant Fritchey

5 November 2025

Grant Fritchey

5 November 2025

Blog

Building dbRosetta Using AI: Part 2, Defining the Project & Prompt Templates

This is the next installment of the series on building a database and an application called dbRosetta using AI/LLM. Part 1 introduces the concept. THE AI PICKED DATABASE FIRST! Look, I talk databases at this thing a lot, so it probably knows my own preference, but when I asked it, it chose to build... Read more

large language model

llm

artificial intelligence

Adam Atreides

4 November 2025

Adam Atreides

4 November 2025

Blog

SQL Prompt Product Updates – November 2025

SQL Prompt’s November release brings new features to SQL Prompt AI, bug and performance fixes to SQL History and support for SQL Server 2025 preview. AI-powered fix and optimization using Prompt AI We’ve added two new features to SQL Prompt AI this month: fix and optimize. Fix SQL Resolve query errors using the Fix... Read more

SQL Prompt

Grant Fritchey

3 November 2025

Grant Fritchey

3 November 2025

Blog

Building dbRosetta Using AI: Part 1 of Many

Like many of you, over the last couple of years, I’ve been using AI, or, well, let’s just name it appropriately, Large Language Models (LLM), as a part of my job. I’ve also used it in my hobby. With it, I’ve generated snippets of code, tested data conversions, even built a small database for... Read more

20 July 2023

20 July 2023

Blog

The promise – and the perils – of GitHub Copilot

There’s been a lot of talk about GitHub Copilot recently, loudly touted as Your AI pair programmer. According to GitHub, Copilot for Business brings the power of generative AI to engineering teams, accelerating the speed of software development and innovation. At the back end is OpenAI Codex, a modified version of the GPT-3 Large... Read more

GitHub Copilot

Redgate Flyway

Redgate Monitor

SQL Toolbelt Essentials

Redgate Test Data Manager

Security and compliance

Database monitoring and observability

Database change management

Productivity and workflow automation

AI data readiness

Database modernization

Efficiency and cost optimization

Cloud migration and workload optimization

Redgate Blog

Building dbRosetta Using AI: Part 3, Creating a Database

Generating the Database Prompt

Creating a Database

Issues and risks to address

Conclusion

2025 State of the Database Landscape

Introduction to PostgreSQL for the data professional

Redgate Flyway

Redgate Monitor

SQL Toolbelt Essentials

Redgate Test Data Manager

Security and compliance

Database monitoring and observability

Database change management

Productivity and workflow automation

AI data readiness

Database modernization

Efficiency and cost optimization

Cloud migration and workload optimization

Generating the Database Prompt

Creating a Database

Issues and risks to address

Conclusion

You may also like