{"id":111077,"date":"2026-06-08T12:00:00","date_gmt":"2026-06-08T12:00:00","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=111077"},"modified":"2026-06-01T11:58:26","modified_gmt":"2026-06-01T11:58:26","slug":"how-to-call-an-ollama-based-ai-text-embeddings-model-from-sql-server-2025","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/how-to-call-an-ollama-based-ai-text-embeddings-model-from-sql-server-2025\/","title":{"rendered":"How to call an Ollama-based AI text embeddings model from SQL Server 2025"},"content":{"rendered":"\n<p><strong>In the <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/how-to-host-an-ai-text-embeddings-model-for-sql-server-using-ollama\/\" target=\"_blank\" rel=\"noreferrer noopener\">first article<\/a> of this <a href=\"https:\/\/www.red-gate.com\/simple-talk\/collections\/ai-text-embeddings-in-sql-server-everything-you-need-to-know\/\" target=\"_blank\" rel=\"noreferrer noopener\">series<\/a>, I explained how to install and configure Ollama to host text embeddings models locally. I also demonstrated how to install Caddy as a proxy to allow SQL Server to use Ollama via https-based calls. In this article, I&#8217;ll show you how to make use of this at the SQL Server end.<\/strong><\/p>\n\n\n\n<p><strong>No time to read? Here&#8217;s a quick overview:<\/strong><br>To generate text embeddings in SQL Server using a locally hosted Ollama model, enable REST endpoints with <code>sp_configure<\/code>, define the model once using <code>CREATE EXTERNAL MODEL<\/code>, then call <code>AI_GENERATE_EMBEDDINGS<\/code> whenever you need a vector. This avoids messy REST code every time and keeps your embedding logic clean and reusable.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-which-text-embeddings-models-should-i-use-in-ollama\">Which text embeddings models should I use in Ollama?<\/h2>\n\n\n\n<p>You&#8217;ll find that <a href=\"https:\/\/ollama.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Ollama<\/a> can download and run many text embeddings models. There is a trade-off between the size of the models, the number of vector dimensions they return, and their effectiveness.<\/p>\n\n\n\n<p>The models I commonly use are:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Model Name<\/strong><\/td><td><strong>Vector Dimensions<\/strong><\/td><\/tr><tr><td>all-minilm<\/td><td>384<\/td><\/tr><tr><td>nomic-embed-text<\/td><td>768<\/td><\/tr><tr><td>mxbai-embed-large<\/td><td>1024<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Even the <strong>all-minilm<\/strong> model is surprisingly effective with standard text. The number of dimensions is important, as vector data can quickly become some of the largest data in your databases if you aren&#8217;t careful.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-to-test-the-caddy-proxy\">How to test the Caddy proxy<\/h2>\n\n\n\n<p>We need to start by enabling the ability to call REST-based services:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">EXEC sp_configure 'external rest endpoint enabled', 1;\nRECONFIGURE;\nGO<\/pre><\/div>\n\n\n\n<p>That allows us to use the <code>sp_invoke_external_rest_endpoint<\/code> system <a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/for-the-love-of-stored-procedures\/\" target=\"_blank\" rel=\"noreferrer noopener\">stored procedure<\/a> to make REST-based calls.<\/p>\n\n\n\n<p>Now that we have Ollama and Caddy up and running, we can check that SQL Server can actually call them by executing the following code:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">DECLARE @ReturnCode int;\nDECLARE @Response nvarchar(max);\n\nEXEC @ReturnCode = sys.sp_invoke_external_rest_endpoint\n  @method  = 'GET',\n  @url     = 'https:\/\/localhost:8443\/api\/tags',\n  @headers = N'{\"Accept\":\"application\/json\"}',\n  @response = @Response OUTPUT;\nSELECT @ReturnCode AS ReturnCode,\n        @Response as Response;<\/pre><\/div>\n\n\n\n<p>You should see a http status code of 200 returned:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"940\" height=\"63\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/05\/image-64.png\" alt=\"\" class=\"wp-image-111080\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/05\/image-64.png 940w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/05\/image-64-300x20.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/05\/image-64-768x51.png 768w\" sizes=\"auto, (max-width: 940px) 100vw, 940px\" \/><\/figure>\n\n\n\n<p>However, if you receive an error that looks like a certificate error, you&#8217;ll need to make sure you trust the root certificate for Caddy. Open <a href=\"https:\/\/www.red-gate.com\/simple-talk\/devops\/database-devops\/questions-use-powershell-shy-ask\/\" target=\"_blank\" rel=\"noreferrer noopener\">PowerShell<\/a> and, in the folder where you have downloaded Caddy, execute the following:<\/p>\n\n\n\n<p><code>.\\caddy_windows_amd64.exe environ<\/code><\/p>\n\n\n\n<p>Look for the value of the <code>caddy.AppDataDir<\/code>. On my system, that was <code>C:\\Users\\Greg\\AppData\\Roaming\\Caddy<\/code>. In that folder, navigate to the subfolder <code>pki\\authorities\\local<\/code> and, in here, you&#8217;ll find the Caddy root certificate <code>root.crt<\/code>. <\/p>\n\n\n\n<p>Using Windows certificate manager (as an administrator), import that certificate into the Certificates subfolder under <strong>Trusted Root Certification Authorities<\/strong>. If you have old Caddy certificates, you should remove them first. You&#8217;ll then need to restart Caddy to complete the process.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-to-retrieve-text-embeddings-via-sql-server-rest\">How to retrieve text embeddings via SQL Server REST<\/h2>\n\n\n\n<p>Given we can make REST-based calls, we could just directly call the Ollama service via the Caddy proxy by doing this:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">DECLARE @Payload nvarchar(max) =\nN'{\"model\":\"nomic-embed-text\",\"prompt\":\"How many products are in stock?\"}';\nEXEC @ReturnCode = sys.sp_invoke_external_rest_endpoint\n  @method  = 'POST',\n  @url     = 'https:\/\/localhost:8443\/api\/embed',\n  @payload = @Payload,\n  @headers = N'{\"Content-Type\":\"application\/json\"}',\n  @response = @Response OUTPUT;\nSELECT @ReturnCode AS ReturnCode,\n       @Response as Response;<\/pre><\/div>\n\n\n\n<p>If you inspect the returned response, you&#8217;ll find the vector for the embeddings.<\/p>\n\n\n\n<section id=\"my-first-block-block_941c58c5cc72316f2aab44816edbf5ba\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Fast, reliable and consistent SQL Server development&#8230;<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            &#8230;with SQL Toolbelt Essentials. 10 ingeniously simple tools for accelerating development, reducing risk, and standardizing workflows.                                    <\/div>\n            <\/div>\n                                            <a href=\"https:\/\/www.red-gate.com\/products\/sql-toolbelt-essentials\/\" class=\"btn btn--secondary btn--lg\" aria-label=\"Learn more &amp; try for free: Fast, reliable and consistent SQL Server development...\">Learn more &amp; try for free<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-using-create-external-model\">Using CREATE EXTERNAL MODEL <\/h2>\n\n\n\n<div id=\"callout-block_995185052f245320b102820398ceb18a\" class=\"callout alignnone\">\n    <div class=\"child-last:mb-0 child-first:mt-0 bg-gray-50 dark:bg-gray-950 p-4xl my-3xl\">\n\n<p><strong>You may also be interested in&#8230;<br><\/strong><a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/sql-server-2025-create-external-model-and-ai_generate_embeddings-commands-explained\/\" target=\"_blank\" rel=\"noreferrer noopener\">SQL Server 2025 CREATE EXTERNAL MODEL and AI_GENERATE_EMBEDDINGS Commands Explained<\/a><\/p>\n\n<\/div>\n<\/div> \n\n\n<p>Now, while we <em>could<\/em> then extract the embeddings from that returned <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/oracle-databases\/json-for-absolute-beginners-part-1-introduction\/\" target=\"_blank\" rel=\"noreferrer noopener\">JSON<\/a>, having all this code every time we want to call a model is messy. Instead, SQL Server allows us to define an <code>EXTERNAL MODEL<\/code> that <em>remembers<\/em> how to call the model:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">CREATE EXTERNAL MODEL AllMiniLM\nAUTHORIZATION dbo\nWITH \n(\n      LOCATION = 'https:\/\/localhost:8443\/api\/embed',\n      API_FORMAT = 'Ollama',\n      MODEL_TYPE = EMBEDDINGS,\n      MODEL = 'all-minilm'\n);<\/pre><\/div>\n\n\n\n<p>The model isn&#8217;t a schema-bound object, so we don&#8217;t have a schema as part of the name. The <code>LOCATION<\/code> is just the URL we need to call &#8211; or the folder if we&#8217;re using an <a href=\"https:\/\/onnx.ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">ONNX<\/a> (Open Neural Network Exchange) runtime. <\/p>\n\n\n\n<p>The <code>API_FORMAT<\/code> lets SQL Server know what to expect when calling and receiving a response. Currently, it can be one of either <a href=\"https:\/\/azure.microsoft.com\/en-us\/products\/ai-foundry\/models\/openai\" target=\"_blank\" rel=\"noreferrer noopener\">Azure OpenAI<\/a>, <a href=\"https:\/\/openai.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">OpenAI<\/a>, Ollama, or ONNX Runtime.<\/p>\n\n\n\n<p>The <code>MODEL_TYPE<\/code> is curious. At present, it can only be <code>EMBEDDINGS<\/code>. However, whenever you see a required parameter and only one permitted value, you just know that the SQL Server team are considering other uses for this.<\/p>\n\n\n\n<p>Finally, we specify the model to use, and that&#8217;s it. It&#8217;s now easier for SQL Server to make calls to the model.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Using AI_GENERATE_EMBEDDINGS<\/h2>\n\n\n\n<p>Once the external model is created, we can call it by using the <code>AI_GENERATE_EMBEDDINGS<\/code> function. That simplifies the code for working with embeddings.<\/p>\n\n\n\n<p>Do bear in mind that, before you can use it, that external AI runtimes must be enabled. Depending on the specific version of SQL Server you&#8217;re using, you might also need to enable preview features:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">EXECUTE sp_configure 'external AI runtimes enabled', 1;\nRECONFIGURE WITH OVERRIDE;\nGO\nAnd then we can call a new function to easily retrieve embeddings.\nSELECT AI_GENERATE_EMBEDDINGS(N'SQL Server loves AI' \n       USE MODEL AllMiniLM);<\/pre><\/div>\n\n\n\n<p>Note that the model that I mentioned here is the name we gave to the external model. I often name the external model based on the text model I&#8217;m calling.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-summary-and-next-steps\">Summary and next steps<\/h2>\n\n\n\n<p>Creating an external model and then calling it via <code>AI_GENERATE_EMBEDDINGS<\/code> makes it easy to work with text embeddings in SQL Server. One further aspect to consider is <em>which<\/em> text to generate the embedding for. SQL Server has provided the <code>AI_GENERATE_CHUNKS<\/code> function to help us with that. We&#8217;ll explore that in the next article.<\/p>\n\n\n\n<section id=\"my-first-block-block_bf405a4d2ddb34e6732ad2da1caf9d03\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Simple Talk is brought to you by Redgate Software<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            Take control of your databases with the trusted Database DevOps solutions provider. Automate with confidence, scale securely, and unlock growth through AI.                                    <\/div>\n            <\/div>\n                                            <a href=\"https:\/\/www.red-gate.com\/solutions\/overview\/\" class=\"btn btn--secondary btn--lg\" aria-label=\"Discover how Redgate can help you: Simple Talk is brought to you by Redgate Software\">Discover how Redgate can help you<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>FAQs: How to call an Ollama-based AI text embeddings model from SQL Server 2025<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. Which Ollama text embedding model should I use with SQL Server?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Common choices are <code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">all-minilm<\/code> (384 dimensions), <code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">nomic-embed-text<\/code> (768 dimensions), and <code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">mxbai-embed-large<\/code> (1024 dimensions). Even <code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">all-minilm<\/code> is surprisingly effective for standard text, and fewer dimensions help keep vector storage manageable.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. How do I enable SQL Server to call Ollama via REST?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Run <code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">sp_configure 'external rest endpoint enabled', 1<\/code> followed by <code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">RECONFIGURE<\/code>. This allows you to use <code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">sp_invoke_external_rest_endpoint<\/code> to make HTTPS calls to Ollama through your Caddy proxy.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">3. What is CREATE EXTERNAL MODEL in SQL Server?<\/h3>\n            <div class=\"faq-answer\">\n                <p><code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">CREATE EXTERNAL MODEL<\/code> defines a reusable reference to an embedding model, storing its URL, API format (Ollama, OpenAI, Azure OpenAI, or ONNX Runtime), and model name, so you don&#8217;t have to write REST call code every time.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">4. How do I generate embeddings in SQL Server with AI_GENERATE_EMBEDDINGS?<\/h3>\n            <div class=\"faq-answer\">\n                <p>After enabling external AI runtimes and creating an external model, call <code class=\"bg-text-200\/5 border border-0.5 border-border-300 text-danger-000 whitespace-pre-wrap rounded-[0.4rem] px-1 py-px text-[0.9rem]\">AI_GENERATE_EMBEDDINGS(N'your text' USE MODEL YourModelName)<\/code>. This returns the vector directly, avoiding the need to parse JSON from raw REST responses.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>Learn how to call locally hosted Ollama embedding models from SQL Server using sp_invoke_external_rest_endpoint, CREATE EXTERNAL MODEL, and AI_GENERATE_EMBEDDINGS.&hellip;<\/p>\n","protected":false},"author":346483,"featured_media":111067,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[159169,143523,53,143524],"tags":[159075,4168,4170,159401,159400,4150,4151],"coauthors":[159368],"class_list":["post-111077","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-databases","category-featured","category-sql-server","tag-ai","tag-database","tag-database-administration","tag-greglowollamaseries","tag-ollama","tag-sql","tag-sql-server"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/111077","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/346483"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=111077"}],"version-history":[{"count":11,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/111077\/revisions"}],"predecessor-version":[{"id":111120,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/111077\/revisions\/111120"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/111067"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=111077"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=111077"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=111077"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=111077"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}