{"id":110100,"date":"2026-05-01T12:00:00","date_gmt":"2026-05-01T12:00:00","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=110100"},"modified":"2026-04-23T13:01:52","modified_gmt":"2026-04-23T13:01:52","slug":"what-is-chunking-and-how-does-it-apply-to-vectors-in-sql-server-2025","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/what-is-chunking-and-how-does-it-apply-to-vectors-in-sql-server-2025\/","title":{"rendered":"What is chunking, and how does it apply to vectors in SQL Server 2025?"},"content":{"rendered":"\n<p><strong>In this article, Greg Low explains what chunking is, why it matters for embeddings, and how SQL Server 2025 enables efficient AI-powered vector search.<\/strong><\/p>\n\n\n\n<p>If you&#8217;ve started to work with <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/ai-in-sql-server-2025-embeddings\/\" target=\"_blank\" rel=\"noreferrer noopener\">vector databases and looked at using text embeddings for AI search<\/a>, you might have come across the term <em>chunking<\/em> and wondered what it relates to. In this article, I&#8217;ll explain the concept in general &#8211; and then show how it works in <a href=\"https:\/\/www.red-gate.com\/simple-talk\/collections\/sql-server-2025-articles-guides\/\" target=\"_blank\" rel=\"noreferrer noopener\">SQL Server 2025<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-is-chunking\">What is chunking?<\/h2>\n\n\n\n<p>To generate text embeddings, you send a body of text to an <a href=\"https:\/\/www.red-gate.com\/simple-talk\/ai\/when-and-when-not-to-use-llms-in-your-data-pipeline\/\" target=\"_blank\" rel=\"noreferrer noopener\">AI language model<\/a> that returns a vector of numbers. The number of values that come back (i.e., the dimensions of the vector) is determined by the model you&#8217;re using. 
Embedding the whole text in one call makes sense for product descriptions, categories, short summaries, etc., but it gets more challenging as the text grows. Additionally, the price you pay for using the model usually increases with the number of tokens you send.<\/p>\n\n\n\n<p><em>Chunking<\/em> is the term used to describe breaking a larger document into smaller pieces before you generate embeddings for it. Each piece (or &#8216;chunk&#8217;) of text is then converted to a vector. The vector represents the semantic meaning of the text. We can then store the vectors and use them to search for other text with a similar meaning. So, the chunking determines the unit of meaning you&#8217;re going to work with, and will directly impact search quality.<\/p>\n\n\n\n<p>When working with real documents, many are too large and too broad to embed as a single block. Think policy documents, a technical guide, contracts, a long article, transcripts&#8230;these often contain <em>many<\/em> topics and subtopics. A vector that represents the meaning of a whole large document ends up as a sort of average of everything it contains. That&#8217;s rarely useful unless you&#8217;re just looking for vague similarity. What you generally want is the part of the document that&#8217;s relevant to the search.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-issues-with-retrieval-augmented-generation-rag\">Issues with retrieval-augmented generation (RAG)<\/h2>\n\n\n\n<p>Another problem arises when working with <a href=\"https:\/\/aws.amazon.com\/what-is\/retrieval-augmented-generation\/\" target=\"_blank\" rel=\"noreferrer noopener\">retrieval-augmented generation (RAG)<\/a>. In RAG, relevant chunks of information are sent as part of the context along with the prompt to a language model.<\/p>\n\n\n\n<p>It&#8217;s easy for the context &#8216;window&#8217; to fill up with irrelevant information, which makes it hard to locate the actual <em>useful<\/em> information. 
The size of the context window can also be a limiting factor.<\/p>\n\n\n\n<p>Good chunking balances precision and context. Chunks that are too large cause the problems described above, but if the chunks are too small, there might not be enough related information to allow for correct interpretation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-chunking-methods-explained\">Chunking methods, explained<\/h2>\n\n\n\n<p>There are different ways to chunk text.<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li>Fixed-size chunking splits content into pieces of a predefined length. The length can be measured in characters, words, or tokens. This is the simplest approach and is easy to automate.<br><br><\/li>\n\n\n\n<li>Sentence-based chunking tries to determine and preserve sentence boundaries.<br><br><\/li>\n\n\n\n<li>Paragraph-based chunking follows paragraph boundaries and often lets you work with the document structure much more closely.<br><br><\/li>\n\n\n\n<li>Semantic chunking tries to locate boundaries where the topic being discussed has changed.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>When you&#8217;re chunking text, you also need to consider a concept called <em>overlap<\/em>. You might determine that it&#8217;s valuable to repeat part of the previous chunk at the start of the following one. The aim is to reduce the chance of important sections of text being separated from relevant context because of where chunking boundaries have fallen.<\/p>\n\n\n\n<p>The downside of increasing overlap is that you&#8217;re storing more text, you need to send more text to generate embeddings, and you&#8217;re more likely to get duplicated matches in query results (because they&#8217;re referencing the same text). The amount of overlap can be an important parameter to tune when designing a chunking strategy.<\/p>\n\n\n\n<p>There&#8217;s no single correct answer for how to determine the best chunk size. 
It totally depends on the source text, the model being used, and the prompts coming from the users.<\/p>\n\n\n\n<section id=\"my-first-block-block_955914061c6310a208d4c11b8c9556ab\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Fast, reliable and consistent SQL Server development&#8230;<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            &#8230;with SQL Toolbelt Essentials. 10 ingeniously simple tools for accelerating development, reducing risk, and standardizing workflows.                                    <\/div>\n            <\/div>\n                            <a href=\"https:\/\/www.red-gate.com\/products\/sql-toolbelt-essentials\/\" class=\"btn btn--secondary btn--lg\">Learn more &amp; try for free<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-chunking-in-sql-server-2025-explained\">Chunking in SQL Server 2025, explained<\/h2>\n\n\n\n<p>SQL Server 2025 added several functions for working with text embeddings, vectors, vector-based indexing, and searches.<\/p>\n\n\n\n<p>The <code>CREATE EXTERNAL MODEL<\/code> command allows you to configure how calls can be made to a language model that calculates text embeddings. This can be a cloud-based system like OpenAI, or a local model hosted in a tool like Ollama. 
Learn more about the command <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/sql-server-2025-create-external-model-and-ai_generate_embeddings-commands-explained\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<p>The <code>AI_GENERATE_EMBEDDINGS<\/code> function simplifies the generation of text embeddings once the external model has been defined, and the vector data type provides a convenient storage format for the embeddings. Learn more about the function <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/sql-server-2025-create-external-model-and-ai_generate_embeddings-commands-explained\/#:~:text=Now%20that%20you%20have%20created%20the%20model%2C%20to%20use%20it%20you%20need%20the%20new%20function%20AI_GENERATE_EMBEDDINGS\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>.<\/p>\n\n\n\n<p><code>AI_GENERATE_CHUNKS<\/code> rounds out the capabilities by letting you chunk text before you generate embeddings for it. It&#8217;s a <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/sql-server-functions-the-basics\/#table-valued-functions:~:text=Employees%22.-,Table%2Dvalued%20Functions,-Table%2Dvalued%20Functions\" target=\"_blank\" rel=\"noreferrer noopener\">table-valued function<\/a> that chunks supplied text. 
The current documentation of the syntax shows that it accepts:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li>The text to be chunked (<code>SOURCE<\/code>)<br><br><\/li>\n\n\n\n<li>The type of chunking to use (<code>CHUNK_TYPE<\/code>)<br><br><\/li>\n\n\n\n<li>The size of each chunk when using <code>FIXED<\/code> chunking (<code>CHUNK_SIZE<\/code>)<br><br><\/li>\n\n\n\n<li>(Optional) The amount of overlap (<code>OVERLAP<\/code>)<br><br><\/li>\n\n\n\n<li>(Optional) An option to generate an ID for each set of chunks (<code>ENABLE_CHUNK_SET_ID<\/code>).<\/li>\n<\/ul>\n<\/div>\n\n\n<p>Note that many of these AI functions are still in preview, so the syntax, and the options allowed for each parameter, may change. It&#8217;s also worth noting that vector indexing and vector search have had substantial changes recently.<\/p>\n\n\n\n<p><code>CHUNK_TYPE<\/code> currently only has a single accepted value of <code>FIXED<\/code>. Whenever you see a required parameter with only a single value, it&#8217;s a hint that this is <em>not<\/em> the end of the story!<\/p>\n\n\n\n<p><code>CHUNK_SIZE<\/code> is required for <code>FIXED<\/code> (which is the only option right now), and is a positive number.<\/p>\n\n\n\n<p><code>OVERLAP<\/code> is the percentage of the preceding text to repeat in the current chunk. It takes a value between 0 and 50, and the default is 0 (i.e., no overlap).<\/p>\n\n\n\n<p>If <code>ENABLE_CHUNK_SET_ID<\/code> is 1, then a column called chunk_set_id is returned along with the chunks.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-it-looks-in-practice-an-example-query\">How it looks in practice &#8211; an example query<\/h2>\n\n\n\n<p>Let&#8217;s look at an example query:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \" >DECLARE @TextToChunk nvarchar(max)\n= N'A happy day feels light from the moment it begins. 
The sun seems warmer, '\n+ N'small things go right without effort, and there is a quiet sense of ease '\n+ N'in everything around you. Time spent with people you enjoy, good food, '\n+ N'laughter, and a few simple moments of calm can make the whole day feel '\n+ N'bright and memorable. By the end of it, you feel content, grateful, and '\n+ N'a little reluctant for it to end.';\nSELECT * \nFROM AI_GENERATE_CHUNKS\n(\n    source = @TextToChunk, \n    chunk_type = FIXED,\n    chunk_size = 50,\n    enable_chunk_set_id = 1\n) AS c;<\/pre><\/div>\n\n\n\n<p>That query returns the following data:<\/p>\n\n\n\n<p><img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"203\" class=\"wp-image-110101\" style=\"width: 800px;\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-122311.png\" alt=\"\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-122311.png 697w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/04\/Screenshot-2026-04-23-122311-300x76.png 300w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><\/p>\n\n\n\n<p>The chunk_order column gives each chunk&#8217;s position in the sequence. The chunk_offset column gives the chunk&#8217;s starting location within the original text. The chunk_length column returns the final length of each chunk, and chunk_set_id identifies this set of chunks. Because <code>AI_GENERATE_CHUNKS<\/code> is a table-valued function, it is generally called via <code>CROSS APPLY<\/code> or <code>OUTER APPLY<\/code> to process multiple rows, and that&#8217;s where you would see different chunk sets.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-summary-chunking-in-sql-server-2025\">Summary: Chunking in SQL Server 2025<\/h2>\n\n\n\n<p>These features mean that SQL Server 2025 can support a much stronger workflow for <a href=\"https:\/\/www.graphlit.com\/glossary\/semantic-retrieval\" target=\"_blank\" rel=\"noreferrer noopener\">semantic retrieval<\/a>. 
Text can be stored in tables, split into chunks with <code>AI_GENERATE_CHUNKS<\/code>, converted into embeddings with <code>AI_GENERATE_EMBEDDINGS<\/code>, stored in a vector column, and used with vector functions and vector indexes for similarity search. These capabilities greatly enhance the options available in SQL Server.<\/p>\n\n\n\n<section id=\"my-first-block-block_313bae498194e63a87bd38736774ca74\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Simple Talk is brought to you by Redgate Software<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            Take control of your databases with the trusted Database DevOps solutions provider. Automate with confidence, scale securely, and unlock growth through AI.                                    <\/div>\n            <\/div>\n                            <a href=\"https:\/\/www.red-gate.com\/solutions\/overview\/\" class=\"btn btn--secondary btn--lg\">Discover how Redgate can help you<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>FAQs: Chunking in SQL Server 2025<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. What is chunking in AI, and why is it important?<\/h3>\n            <div class=\"faq-answer\">\n                <p data-start=\"42\" data-end=\"257\">Chunking is the process of splitting large text into smaller sections before generating embeddings, helping improve search relevance and reduce processing costs.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. 
How does chunking affect embedding quality?<\/h3>\n            <div class=\"faq-answer\">\n                <p data-start=\"259\" data-end=\"428\">Smaller, focused chunks produce more accurate vectors, while large chunks can dilute meaning and harm search precision.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">3. What is the ideal chunk size for embeddings?<\/h3>\n            <div class=\"faq-answer\">\n                <p data-start=\"430\" data-end=\"609\">There\u2019s no fixed answer &#8211; the best size depends on your data, model, and use case, but it should balance context with specificity.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">4. What is overlap in chunking and when should you use it?<\/h3>\n            <div class=\"faq-answer\">\n                <p data-start=\"611\" data-end=\"800\">Overlap repeats part of the previous chunk to preserve context, which can improve understanding but increases storage and cost.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">5. How does SQL Server 2025 support chunking?<\/h3>\n            <div class=\"faq-answer\">\n                <p data-start=\"802\" data-end=\"990\">SQL Server 2025 introduces the <code>AI_GENERATE_CHUNKS<\/code> function, allowing you to split text directly in the database before creating embeddings.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">6. 
Can chunking improve RAG (retrieval-augmented generation)?<\/h3>\n            <div class=\"faq-answer\">\n                <p data-start=\"992\" data-end=\"1178\">Yes. Well-designed chunks help ensure only relevant context is retrieved, improving response accuracy and reducing noise.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>Learn what chunking is, why it matters for embeddings, and how SQL Server 2025 enables efficient AI-powered vector search.&hellip;<\/p>\n","protected":false},"author":346483,"featured_media":110106,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143523,53,143524],"tags":[4168,4170,4150,4151,159254,159319],"coauthors":[159368],"class_list":["post-110100","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-databases","category-featured","category-sql-server","tag-database","tag-database-administration","tag-sql","tag-sql-server","tag-sql-server-2025","tag-sqlserver2025publicpreview"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/110100","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/346483"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=110100"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/110100\/revisions"}],"predecessor-version":[{"id":110111,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/110100\/revisions\/110111"}],"wp:featuredmedia":[{"e
mbeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/110106"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=110100"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=110100"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=110100"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=110100"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}