{"id":107975,"date":"2025-12-15T14:30:00","date_gmt":"2025-12-15T14:30:00","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=107975"},"modified":"2025-12-11T09:47:14","modified_gmt":"2025-12-11T09:47:14","slug":"how-to-build-an-ai-powered-semantic-search-in-postgresql-with-pgvector","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/postgresql\/how-to-build-an-ai-powered-semantic-search-in-postgresql-with-pgvector\/","title":{"rendered":"How to Build an AI-Powered Semantic Search in PostgreSQL with pgvector"},"content":{"rendered":"\n<p><strong>Learn how to build a sleek, AI-powered semantic search engine that lives inside your existing database. We\u2019ll walk through how to store vector embeddings, run similarity-based queries, and turn ordinary text searches into meaning-aware retrieval with nothing more than standard SQL and a vector extension.<\/strong><\/p>\n\n\n\n<p>Imagine trying to describe a movie to a friend without naming it. You might say, <em>\u201cit\u2019s that film about an astronaut stranded on another planet who grows potatoes to survive\u201d<\/em> and, even though you never mentioned the title, your friend instantly knows you\u2019re talking about <em><a href=\"https:\/\/www.imdb.com\/title\/tt3659388\/\" target=\"_blank\" rel=\"noreferrer noopener\">The Martian<\/a><\/em>.<\/p>\n\n\n\n<p>Humans are great at understanding the meaning behind words, not just matching exact <a href=\"https:\/\/dictionary.cambridge.org\/dictionary\/english\/keyword\" target=\"_blank\" rel=\"noreferrer noopener\">keywords<\/a>. Traditional databases, by contrast, have historically required exact matches to find results.<\/p>\n\n\n\n<p>This is changing. 
With the <a href=\"https:\/\/www.postgresql.org\/about\/news\/pgvector-070-released-2852\/\" target=\"_blank\" rel=\"noreferrer noopener\">pgvector extension for PostgreSQL<\/a>, your database can finally search by <em>meaning<\/em> &#8211; not just by matching text patterns. By turning text and other data into high-dimensional vectors (numeric representations of meaning), <a href=\"https:\/\/www.postgresql.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">PostgreSQL<\/a> can perform similarity searches based on intent and context. <\/p>\n\n\n\n<p>This unlocks powerful new capabilities for search, recommendations, and AI-driven applications &#8211; all within the <a href=\"https:\/\/aws.amazon.com\/what-is\/sql\/\" target=\"_blank\" rel=\"noreferrer noopener\">SQL<\/a> database you already know and love.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"531\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/12\/image.png\" alt=\"A graph showing the cycle of User Query all the way to Similarity Search.\" class=\"wp-image-107976\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/12\/image.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/12\/image-300x163.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/12\/image-768x418.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-why-traditional-keyword-search-falls-short\">Why Traditional Keyword Search Falls Short<\/h2>\n\n\n\n<p>Standard <a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/writing-an-efficient-query\/\" target=\"_blank\" rel=\"noreferrer noopener\">SQL queries<\/a> excel at exact matching (finding products in a given category, or documents containing specific keywords), but fall short when it comes to understanding context or <a 
href=\"https:\/\/en.wikipedia.org\/wiki\/User_intent\" target=\"_blank\" rel=\"noreferrer noopener\">intent<\/a>. For example:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li>A shopper might search for <em>\u201caffordable laptops for coding\u201d<\/em>, but your catalog tags products as <em>\u201cbudget-friendly computers for programming.\u201d<\/em> A plain keyword search could miss these matches, even though the intent is the same.<br><br><\/li>\n\n\n\n<li>You want to recommend articles similar to what a user just read, even if the articles use different wording. Exact keyword overlap might be minimal or nonexistent.<br><br><\/li>\n\n\n\n<li>A support chatbot needs to fetch answers that fit the <em>intent<\/em> of a customer\u2019s question, not just the exact words typed.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>These scenarios require <a href=\"https:\/\/www.genesys.com\/en-gb\/definitions\/what-is-semantic-understanding\" target=\"_blank\" rel=\"noreferrer noopener\">semantic understanding<\/a> &#8211; grasping the meaning behind words. Enter <strong>vector embeddings<\/strong>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-are-vector-embeddings\">What are Vector Embeddings?<\/h2>\n\n\n\n<p>A vector embedding is essentially a list of numbers that encodes the semantic essence of data, often text. Think of it as mapping meaning to a point in a high-dimensional space. 
Similar concepts end up as nearby points in that space.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"531\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/12\/image-1.png\" alt=\"A graph showing what vector embeddings are.\" class=\"wp-image-107977\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/12\/image-1.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/12\/image-1-300x163.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2025\/12\/image-1-768x418.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<p><strong>Key features of embeddings:<\/strong><\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li><strong>Semantic representation:<\/strong> Phrases or concepts with similar meaning have similar vectors. For example, <em>\u201cI need a vacation\u201d<\/em> and <em>\u201cLooking for a holiday getaway\u201d<\/em> would produce embeddings close to each other, even though the words differ.<br><br><\/li>\n\n\n\n<li><strong>Measurable similarity:<\/strong> We can measure how close two pieces of text are by computing the distance between their vectors.<br><br><\/li>\n\n\n\n<li><strong>AI-generated:<\/strong> Embeddings come from AI models trained on massive datasets. Models built by <a href=\"https:\/\/openai.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">OpenAI<\/a> or <a href=\"https:\/\/cohere.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Cohere<\/a>, for example, or <a href=\"https:\/\/www.marqo.ai\/course\/introduction-to-sentence-transformers\" target=\"_blank\" rel=\"noreferrer noopener\">sentence transformers<\/a>, convert text into these numeric signatures.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>You can think of a text embedding as an \u201cidea index\u201d &#8211; a numeric signature of a sentence or document. 
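<\/p>

<p>As a toy illustration, here are three made-up 3-dimensional \u201cembeddings\u201d (real models produce hundreds or thousands of dimensions). The Euclidean distance between them is the same comparison pgvector\u2019s <code>&lt;-&gt;<\/code> operator performs inside the database:<\/p>

```python
import math

# Made-up 3-dimensional "embeddings" -- for illustration only;
# real models produce hundreds or thousands of dimensions.
vacation = [0.9, 0.1, 0.3]   # "I need a vacation"
holiday = [0.8, 0.2, 0.3]    # "Looking for a holiday getaway"
invoice = [0.1, 0.9, 0.7]    # "Please pay this invoice"

def euclidean(a, b):
    # Straight-line distance between two points in embedding space
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean(vacation, holiday))  # small distance: similar meaning
print(euclidean(vacation, invoice))  # larger distance: unrelated meaning
```

<p>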
By comparing these signatures, we can quickly find semantically related content in a database.<\/p>\n\n\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-is-the-pgvector-extension-in-postgresql\">What is the pgvector Extension in PostgreSQL?<\/h2>\n\n\n\n<p>Up until now, searching semantically often meant using a specialized vector database or external search service. The <a href=\"https:\/\/github.com\/pgvector\/pgvector\" target=\"_blank\" rel=\"noreferrer noopener\">pgvector<\/a> extension changes that by bringing vector search directly into PostgreSQL. With pgvector, you can:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li>Store embeddings in a native <code>VECTOR<\/code> column.<br><br><\/li>\n\n\n\n<li>Perform similarity queries using vector operators like <code>&lt;-&gt;<\/code> (for distance).<br><br><\/li>\n\n\n\n<li>Index vector columns for blazing-fast nearest-neighbor searches.<br><br><\/li>\n\n\n\n<li>Combine vector search with traditional SQL filters and full-text search.<br><br><\/li>\n<\/ul>\n<\/div>\n\n\n<p>In short, pgvector lets you build AI-powered search and recommendation features right in your existing PostgreSQL database. 
No need for a separate search engine &#8211; just SQL and vectors.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-to-get-started-with-pgvector-in-postgresql\">How to Get Started with pgvector in PostgreSQL<\/h2>\n\n\n\n<p>Let\u2019s walk through the basic steps to enable semantic search in your database.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-install-the-pgvector-extension\">Install the pgvector Extension<\/h3>\n\n\n\n<p>First, enable the <code>vector<\/code> extension:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">CREATE EXTENSION IF NOT EXISTS vector;<\/pre><\/div>\n\n\n\n<p>Most managed PostgreSQL services (<a href=\"https:\/\/aws.amazon.com\/rds\/\" target=\"_blank\" rel=\"noreferrer noopener\">AWS RDS<\/a>, <a href=\"https:\/\/aws.amazon.com\/rds\/aurora\/serverless\/\" target=\"_blank\" rel=\"noreferrer noopener\">Aurora<\/a>, <a href=\"https:\/\/cloud.google.com\/sql?hl=en\" target=\"_blank\" rel=\"noreferrer noopener\">Google Cloud SQL<\/a>, <a href=\"https:\/\/azure.microsoft.com\/en-gb\/products\/azure-sql\/database\" target=\"_blank\" rel=\"noreferrer noopener\">Azure Database<\/a>, etc) support pgvector. Once installed, you get a new <code>vector<\/code> data type.<\/p>\n\n\n\n<p>While most cloud-hosted PostgreSQL services now support pgvector, on-premises installations may require manual extension installation. This is typically straightforward on standard PostgreSQL distributions, but compatibility varies for Postgres \u201cflavors\u201d like <a href=\"https:\/\/www.percona.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Percona<\/a>, <a href=\"https:\/\/www.enterprisedb.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">EDB<\/a> or older <a href=\"https:\/\/www.tigerdata.com\/blog\/self-hosted-or-cloud-database-a-countryside-reflection-on-infrastructure-choices\" target=\"_blank\" rel=\"noreferrer noopener\">self-hosted<\/a> builds. 
<\/p>\n\n\n\n<p>If your environment does not include pgvector by default, you can compile and enable it manually as long as the server allows custom extensions.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-creating-tables-with-vector-column\">Creating Tables with a Vector Column<\/h3>\n\n\n\n<p>Next, create a table to hold your data (documents, products, etc) and include a column for embeddings. For example:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">CREATE TABLE documents (\n  id SERIAL PRIMARY KEY,\n  content TEXT,\n  embedding VECTOR(1536)  -- dimensions depend on your embedding model\n);<\/pre><\/div>\n\n\n\n<p>Here, <code>VECTOR(1536)<\/code> means each embedding has 1536 dimensions. That matches OpenAI\u2019s <code>text-embedding-3-small<\/code> model. If you use a different model (like a Cohere model or an open-source transformer), adjust the dimension accordingly.<\/p>\n\n\n\n<p><strong>Tip:<\/strong> For quick tests you could use a small dimension (like <code>VECTOR(3)<\/code>), but in production you\u2019ll want to match the full dimension of your embedding model for best results.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-generating-and-storing-embeddings\">Generating and Storing Embeddings<\/h3>\n\n\n\n<p>Now, let\u2019s generate embeddings for your data and insert them. 
As an example, here\u2019s how you might use <a href=\"https:\/\/www.python.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Python<\/a> and <a href=\"https:\/\/openai.com\/api\/\" target=\"_blank\" rel=\"noreferrer noopener\">OpenAI\u2019s API<\/a>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">import openai\nimport psycopg\nfrom pgvector.psycopg import register_vector\n\nclient = openai.OpenAI(api_key=\"YOUR_API_KEY\")\n\n# Example text\ntext = \"A suspenseful sci-fi adventure with a twist ending.\"\n\n# Generate an embedding for the text\nresponse = client.embeddings.create(\n    model=\"text-embedding-3-small\",\n    input=text\n)\nembedding = response.data[0].embedding  # This is a list of 1536 floats\n\n# Insert into Postgres\nconn = psycopg.connect(\"dbname=vector_demo user=postgres\")\nregister_vector(conn)  # register the VECTOR type with the driver\nwith conn.cursor() as cur:\n    cur.execute(\n        \"INSERT INTO documents (content, embedding) VALUES (%s, %s)\",\n        (text, embedding)\n    )\nconn.commit()<\/pre><\/div>\n\n\n\n<p>Notice the <code>register_vector<\/code> call from the <a href=\"https:\/\/github.com\/pgvector\/pgvector-python\" target=\"_blank\" rel=\"noreferrer noopener\">pgvector Python package<\/a>: it teaches the <code>psycopg<\/code> driver to convert the Python list into PostgreSQL\u2019s <code>VECTOR<\/code> type; without it, the list would be sent as an ordinary array.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-running-similarity-queries\">Running Similarity Queries<\/h3>\n\n\n\n<p>With data in place, you can find entries similar to a query by comparing embeddings. 
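<\/p>

<p>pgvector also accepts vectors written as plain text literals &#8211; a bracketed, comma-separated list such as <code>'[0.12, -0.44, ...]'<\/code>. If your driver has no registered adapter, you can build that literal yourself; a small helper (hypothetical, not part of pgvector) might look like this:<\/p>

```python
def to_vector_literal(embedding):
    # pgvector parses vector input written as '[v1,v2,...]'
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

print(to_vector_literal([0.12, -0.44, 0.07]))  # [0.12,-0.44,0.07]
```

<p>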
For instance, if you have a query embedding (perhaps from another API call), you can run:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">SELECT id, content\nFROM documents\nORDER BY embedding &lt;-&gt; '[0.12, -0.44, ...]'   -- your query vector here\nLIMIT 5;<\/pre><\/div>\n\n\n\n<p>For example, the query might return results like:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"> id |                   content\n----+-------------------------------------------------------------\n  3 | \"A thrilling sci-fi drama set on a distant colony.\"\n  7 | \"A story about surviving in an alien world.\"\n  1 | \"Suspenseful sci-fi adventure with a twist ending.\"\n  9 | \"Exploration narrative with strong science themes.\"<\/pre><\/div>\n\n\n\n<p>The <code>&lt;-&gt;<\/code> operator calculates the <a href=\"https:\/\/www.geeksforgeeks.org\/maths\/euclidean-distance\/\" target=\"_blank\" rel=\"noreferrer noopener\">Euclidean distance<\/a> between vectors (a smaller distance means more similar). This query returns the 5 documents whose embeddings lie closest to the query embedding &#8211; your most semantically relevant results.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-optimizing-performance-indexing-vectors-for-speed\">Optimizing Performance: Indexing Vectors for Speed<\/h2>\n\n\n\n<p>When you only have a few rows, a full table scan might be fine. However, as your data grows, you\u2019ll want an index on the <code>embedding<\/code> column. pgvector offers two main index types:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li><strong>HNSW (Hierarchical Navigable Small World):<\/strong> A graph-based index that links vectors to their neighbors. It\u2019s very fast and accurate out-of-the-box and doesn\u2019t require extra training. 
Great for datasets up to tens of millions of vectors.<br><br><\/li>\n\n\n\n<li><strong>IVFFlat (Inverted File with Flat Quantization):<\/strong> Clusters vectors into buckets (using <a href=\"https:\/\/www.geeksforgeeks.org\/machine-learning\/k-means-clustering-introduction\/\" target=\"_blank\" rel=\"noreferrer noopener\">k-means<\/a>) so that only certain clusters are searched. Its k-means training runs while the index is built, so create the index after loading your data; it can scale to very large datasets (hundreds of millions of vectors). It lets you trade some recall for speed by adjusting how many clusters you probe.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>Both HNSW and IVFFlat are forms of <a href=\"https:\/\/www.mongodb.com\/resources\/basics\/ann-search\" target=\"_blank\" rel=\"noreferrer noopener\">Approximate Nearest Neighbor (ANN)<\/a> indexing. Instead of scanning every vector, ANN structures explore only a subset of the search space, dramatically reducing query time while maintaining high recall. This trade-off makes ANN indexing well-suited for real-time search applications where speed is more important than perfect precision.<br><br>To create an index on the <code>documents<\/code> table:<\/p>\n\n\n\n<p>For HNSW (using Euclidean distance):<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);<\/pre><\/div>\n\n\n\n<p>Use <code>vector_cosine_ops<\/code> if you want cosine similarity instead.<\/p>\n\n\n\n<p>For IVFFlat (with L2 distance, say 100 lists):<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">CREATE INDEX ON documents USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);\nANALYZE documents;  -- refresh planner statistics\n<\/pre><\/div>\n\n\n\n<p>The <code>lists = 100<\/code> setting means \u201ccluster into 100 groups\u201d. 
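<\/p>

<p>How many lists should you choose? The pgvector README suggests roughly <code>rows \/ 1000<\/code> for tables up to about a million rows and <code>sqrt(rows)<\/code> beyond that. A quick sketch of that rule of thumb:<\/p>

```python
import math

def suggested_ivfflat_lists(row_count):
    # pgvector README guidance: rows / 1000 up to 1M rows, sqrt(rows) beyond
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return math.isqrt(row_count)

print(suggested_ivfflat_lists(100_000))    # 100
print(suggested_ivfflat_lists(4_000_000))  # 2000
```

<p>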
More lists = more clusters (potentially higher accuracy, but more memory and training time). Note that IVFFlat builds its clusters when you run <code>CREATE INDEX<\/code> &#8211; another reason to create the index after loading your data &#8211; while <code>ANALYZE<\/code> keeps the planner\u2019s statistics up to date.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Tuning the Index<\/h3>\n\n\n\n<p>These indexes are approximate (they use nearest-neighbor algorithms), so you can tweak them:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li>For HNSW, increase <code>hnsw.ef_search<\/code> for higher recall (at the cost of slower search). For example:<br><br>&nbsp;<code>SET hnsw.ef_search = 100;<\/code><\/li>\n\n\n\n<li>For IVFFlat, increase <code>ivfflat.probes<\/code> to search more clusters per query:<br><br>&nbsp;<code>SET ivfflat.probes = 10;<\/code><\/li>\n\n\n\n<li>Experiment with these settings on your data to balance accuracy and performance.<\/li>\n<\/ul>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">Combining Semantic and Traditional Search<\/h2>\n\n\n\n<p>Vector search plays nicely with regular SQL filtering and even full-text search. Often, you want to find results that match certain <a href=\"https:\/\/www.ibm.com\/think\/topics\/metadata\" target=\"_blank\" rel=\"noreferrer noopener\">metadata<\/a> or keywords <em>and<\/em> are semantically relevant. This hybrid approach delivers very precise results.<\/p>\n\n\n\n<p>For example, suppose you\u2019re searching for recent sci-fi articles about space. You might write:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">SELECT id, content\nFROM documents\nWHERE category = 'sci-fi'\n  AND published_date &gt; '2023-01-01'\n  AND to_tsvector('english', content) @@ plainto_tsquery('space')\nORDER BY embedding &lt;-&gt; '[0.12, -0.44, ...]'   -- your query vector\nLIMIT 10;<\/pre><\/div>\n\n\n\n<p>This query does four things:<\/p>\n\n\n<div class=\"block-core-list\">\n<ol start=\"1\" class=\"wp-block-list\">\n<li>Filters to <code>category = 'sci-fi'<\/code>.<br><br><\/li>\n\n\n\n<li>Filters to recent content.<br><br><\/li>\n\n\n\n<li>Applies a keyword filter (<code>space<\/code>) using PostgreSQL\u2019s full-text search.<br><br><\/li>\n\n\n\n<li>Ranks results by semantic closeness to the query vector.<br><br><\/li>\n<\/ol>\n<\/div>\n\n\n<p>The end result is a shortlist of recent sci-fi articles about space that are most relevant to the query\u2019s intent.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-practical-use-cases-of-semantic-search-with-pgvector\">Practical Use Cases of Semantic Search with pgvector<\/h2>\n\n\n\n<p>Here are some concrete ways teams are using 
semantic search with pgvector:<\/p>\n\n\n\n<p><strong>Smart Product Search:<\/strong> An online store might take a user query like <em>\u201clightweight laptop good for video editing\u201d<\/em>, convert it to a vector, and run:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Application code\nquery_embedding = get_embedding(\"lightweight laptop good for video editing\")<\/pre><\/div>\n\n\n\n<p>Then in SQL:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">SELECT id, name, price, description\nFROM products\nWHERE category = 'computers' AND in_stock = true\nORDER BY embedding &lt;-&gt; query_embedding\nLIMIT 10;<\/pre><\/div>\n\n\n\n<p>This returns the in-stock computers whose descriptions are semantically closest to the query. It finds relevant laptops even if their descriptions use different wording.<\/p>\n\n\n\n<p><strong>Knowledge Base Chatbots:<\/strong> A support system can embed a user\u2019s question and search FAQs:<\/p>\n\n\n\n<p><code>question_embedding = get_embedding(\"How do I reset my password?\")<\/code><\/p>\n\n\n\n<p>Then:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">SELECT id, question, answer\nFROM knowledge_base\nORDER BY embedding &lt;-&gt; question_embedding\nLIMIT 3;\n<\/pre><\/div>\n\n\n\n<p>The chatbot retrieves the top FAQs about password reset (even if phrased differently), giving better answers.<\/p>\n\n\n\n<p><strong>Content Recommendations:<\/strong> A news site can recommend similar articles. 
Compute an embedding for the current article:<br><br><code>article_embedding = get_embedding(current_article.content)<\/code><\/p>\n\n\n\n<p>Then find similar ones:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">SELECT id, title, summary\nFROM articles\nWHERE id != current_article.id\n  AND published_date &gt; (CURRENT_DATE - INTERVAL '30 days')\nORDER BY embedding &lt;-&gt; article_embedding\nLIMIT 5;\n<\/pre><\/div>\n\n\n\n<p>This finds other recent articles whose content is closest in meaning to what the reader is viewing.<\/p>\n\n\n\n<p>These examples show how semantic search can surface relevant content based on meaning, not just keyword matches.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-best-practices-for-production\">Best Practices for Production<\/h2>\n\n\n\n<p>As you scale up, here are some best practices to follow:<\/p>\n\n\n\n<p><strong>Pick the Right Embedding Model:<\/strong> Different models offer different trade-offs. For many use cases, OpenAI\u2019s <code>text-embedding-3-small<\/code> (1536 dimensions) is a sweet spot of quality and cost. If you need top-tier results, <code>text-embedding-3-large<\/code> (3072 dimensions) is more accurate but costs more. Benchmarks show that larger models (even when dimension-reduced) often outperform older models like <code>text-embedding-ada-002<\/code>. Test and choose a model that fits your accuracy needs and budget.<\/p>\n\n\n\n<p><strong>Batch Your Embedding Requests:<\/strong> To avoid making an <a href=\"https:\/\/www.cloudflare.com\/learning\/security\/api\/what-is-api-call\/\" target=\"_blank\" rel=\"noreferrer noopener\">API call<\/a> per document, send texts in batches. 
For example:<br><\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">texts = [\"Document 1\", \"Document 2\", \"Document 3\", ...]\nresponse = client.embeddings.create(\n    model=\"text-embedding-3-small\",\n    input=texts\n)\nembeddings = [item.embedding for item in response.data]<\/pre><\/div>\n\n\n\n<p>This fetches embeddings for many documents at once, which is more efficient and cost-effective.<\/p>\n\n\n\n<p><strong>Load Data in Bulk:<\/strong> When inserting many rows, use bulk methods. For instance, with <a href=\"https:\/\/www.psycopg.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">psycopg<\/a>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">data = [(texts[i], embeddings[i]) for i in range(len(texts))]\nwith conn.cursor() as cur:\n    cur.executemany(\n        \"INSERT INTO documents (content, embedding) VALUES (%s, %s)\",\n        data\n    )\nconn.commit()\n<\/pre><\/div>\n\n\n\n<p>Or use PostgreSQL\u2019s <code>COPY<\/code> command if you have a large file. Bulk loading dramatically speeds up ingestion.<\/p>\n\n\n\n<p><strong>Ensure Your Index is Used:<\/strong> After creating a vector index, double-check that queries use it. Use <code>EXPLAIN<\/code>:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">EXPLAIN SELECT id, content\nFROM documents\nORDER BY embedding &lt;-&gt; '[0.12, -0.44, ...]'\nLIMIT 5;\n<\/pre><\/div>\n\n\n\n<p>The plan should show something like <code>Index Scan using ..._embedding_idx<\/code>. 
If it falls back to a sequential scan, check your query and index definitions.<\/p>\n\n\n\n<p><strong>Partition Large Tables:<\/strong> If you have a <strong>very<\/strong> large collection (hundreds of millions of vectors), consider <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/postgresql\/postgresql-partitioning-the-most-useful-feature-you-may-never-have-used\/\" target=\"_blank\" rel=\"noreferrer noopener\">partitioning<\/a>. For example:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">CREATE TABLE documents (\n  id SERIAL,\n  category TEXT,\n  content TEXT,\n  embedding VECTOR(1536),\n  PRIMARY KEY (id, category)  -- must include the partition key\n) PARTITION BY LIST (category);\n\nCREATE TABLE documents_tech PARTITION OF documents FOR VALUES IN ('technology');\nCREATE TABLE documents_health PARTITION OF documents FOR VALUES IN ('health');\n-- Add more partitions as needed\n<\/pre><\/div>\n\n\n\n<p>Partitioning by category (or time, region, etc) can improve manageability and query performance at scale. Note that a primary key on a partitioned table must include the partition column, which is why the key above is <code>(id, category)<\/code>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The Future of Vector Search in PostgreSQL<\/h2>\n\n\n\n<p>PostgreSQL\u2019s vector capabilities keep getting better. With each pgvector release, we see enhancements like:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li>More advanced ANN algorithms and index types.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>As ANN methods evolve, pgvector continues to adopt newer algorithms that further improve search quality and reduce latency.<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li>Support for additional distance metrics (<a href=\"https:\/\/www.red-gate.com\/simple-talk\/business-intelligence\/data-science\/comparing-groups-for-similarities-in-power-query-using-cosine-similarity\/#cosine-similarity\" target=\"_blank\" rel=\"noreferrer noopener\">cosine<\/a>, inner product, etc.) 
and vector types.<br><br><\/li>\n\n\n\n<li>Better <a href=\"https:\/\/www.postgresql.org\/docs\/current\/parallel-query.html\" target=\"_blank\" rel=\"noreferrer noopener\">parallel query<\/a> execution for vector searches.<br><br><\/li>\n\n\n\n<li>Smarter <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/performance-sql-server\/execution-plan-basics\/\" target=\"_blank\" rel=\"noreferrer noopener\">query planning<\/a> and optimization for hybrid (vector + text) queries.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>These ongoing improvements mean PostgreSQL is becoming an even more powerful platform for AI-driven applications. You can combine vector search with all the usual SQL features (joins, transactions, etc), enabling simpler architectures. No need for a separate search engine &#8211; your database does it all.<\/p>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-summary-amp-key-takeaways\">Summary &amp; Key Takeaways<\/h2>\n\n\n\n<p>Vector search transforms PostgreSQL from a traditional relational database into a semantic search engine. With pgvector, you can:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li>Store and search data by <em>meaning<\/em> &#8211; not just by keywords.<br><br><\/li>\n\n\n\n<li>Build semantic search, recommendation systems, and other AI-powered features.<br><br><\/li>\n\n\n\n<li>Keep everything within the PostgreSQL database you already use.<br><br><\/li>\n\n\n\n<li>Combine vector similarity with your existing filters, <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/postgresql\/understanding-sql-join-types\/\" target=\"_blank\" rel=\"noreferrer noopener\">joins<\/a>, and full-text search.<br><br><\/li>\n\n\n\n<li>Scale to millions of embeddings with proper indexing.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>The ability to find data by meaning is revolutionizing search and discovery. Best of all, you can implement these capabilities using the tools you already trust. Your database now <em>understands<\/em> data as well as storing it.<\/p>\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>FAQs: How to Build an AI-Powered Semantic Search in PostgreSQL with pgvector<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. What is semantic search?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Semantic search finds results based on meaning (understanding intent and context) &#8211; not exact keywords.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. How does pgvector enable semantic search in PostgreSQL?<\/h3>\n            <div class=\"faq-answer\">\n                <p>pgvector stores high-dimensional vector embeddings of text (or other data) and lets you query by similarity directly in SQL.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">3. What is a vector embedding?<\/h3>\n            <div class=\"faq-answer\">\n                <p>A numeric representation of meaning. Similar concepts produce vectors close to each other in high-dimensional space.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">4. How do I find similar content?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Generate a query embedding and order results by similarity using the <code>&lt;-&gt;<\/code> operator:<\/p>\n<p><code>SELECT id, content<br \/>\nFROM documents<br \/>\nORDER BY embedding &lt;-&gt; '[query_vector]'<br \/>\nLIMIT 5;<br \/>\n<\/code><\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">5. Can I combine semantic and traditional search?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Yes &#8211; filter by metadata, keywords, or full-text search while ranking by semantic similarity.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">6. 
How can I scale searches for large datasets?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Use approximate nearest neighbor (ANN) indexes like HNSW or IVFFlat, and tune their parameters for speed vs. accuracy.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">7. What are some common use cases of semantic search?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Product search, knowledge base retrieval, content recommendations, and finding similar documents or media.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>Learn how to build a sleek, AI-powered semantic search engine that lives inside your existing database. We\u2019ll walk through how to store vector embeddings, run similarity-based queries, and turn ordinary text searches into meaning-aware retrieval with nothing more than standard SQL and a vector extension.&hellip;<\/p>\n","protected":false},"author":346510,"featured_media":107985,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[159169,143523,53,143534],"tags":[159075,159370,158978],"coauthors":[159369],"class_list":["post-107975","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-databases","category-featured","category-postgresql","tag-ai","tag-pgvector","tag-postgresql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/107975","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/346510"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/
v2\/comments?post=107975"}],"version-history":[{"count":9,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/107975\/revisions"}],"predecessor-version":[{"id":108130,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/107975\/revisions\/108130"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/107985"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=107975"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=107975"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=107975"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=107975"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}