In the first article of this series, I explained how to install and configure Ollama to host text embeddings models locally. I also demonstrated how to install Caddy as a proxy to allow SQL Server to use Ollama via https-based calls. In this article, I’ll show you how to make use of this at the SQL Server end.
No time to read? Here’s a quick overview:
To generate text embeddings in SQL Server using a locally hosted Ollama model, enable REST endpoints with sp_configure, define the model once using CREATE EXTERNAL MODEL, then call AI_GENERATE_EMBEDDINGS whenever you need a vector. This avoids messy REST code every time and keeps your embedding logic clean and reusable.
Which text embeddings models should I use in Ollama?
You’ll find that Ollama can download and run many text embeddings models. There is a trade-off between the size of the models, the number of vector dimensions they return, and their effectiveness.
The models I commonly use are:
| Model Name | Vector Dimensions |
| all-minilm | 384 |
| nomic-embed-text | 768 |
| mxbai-embed-large | 1024 |
Even the all-minilm model is surprisingly effective with standard text. The number of dimensions is important, as vector data can quickly become some of the largest data in your databases if you aren’t careful.
How to test the Caddy proxy
We need to start by enabling the ability to call REST-based services:
|
1 2 3 |
EXEC sp_configure 'external rest endpoint enabled', 1; RECONFIGURE; GO |
That allows us to use the sp_invoke_external_rest_endpoint system stored procedure to make REST-based calls.
Now that we have Ollama and Caddy up and running, we can check that SQL Server can actually call them by executing the following code:
|
1 2 3 4 5 6 7 8 9 10 |
DECLARE @ReturnCode int; DECLARE @Response nvarchar(max); EXEC @ReturnCode = sys.sp_invoke_external_rest_endpoint @method = 'GET', @url = 'https://localhost:8443/api/tags', @headers = N'{"Accept":"application/json"}', @response = @Response OUTPUT; SELECT @ReturnCode AS ReturnCode, @Response as Response; |
You should see a http status code of 200 returned:

However, if you receive an error that looks like a certificate error, you’ll need to make sure you trust the root certificate for Caddy. Open PowerShell and, in the folder where you have downloaded Caddy, execute the following:
.\caddy_windows_amd64.exe environ
Look for the value of the caddy.AppDataDir. On my system, that was C:\Users\Greg\AppData\Roaming\Caddy. In that folder, navigate to the subfolder pki\authorities\local and, in here, you’ll find the Caddy root certificate root.crt.
Using Windows certificate manager (as an administrator), import that certificate into the Certificates subfolder under Trusted Root Certification Authorities. If you have old Caddy certificates, you should remove them first. You’ll then need to restart Caddy to complete the process.
How to retrieve text embeddings via SQL Server REST
Given we can make REST-based calls, we could just directly call the Ollama service via the Caddy proxy by doing this:
|
1 2 3 4 5 6 7 8 9 10 |
DECLARE @Payload nvarchar(max) = N'{"model":"nomic-embed-text","prompt":"How many products are in stock?"}'; EXEC @ReturnCode = sys.sp_invoke_external_rest_endpoint @method = 'POST', @url = 'https://localhost:8443/api/embed', @payload = @Payload, @headers = N'{"Content-Type":"application/json"}', @response = @Response OUTPUT; SELECT @ReturnCode AS ReturnCode, @Response as Response; |
If you inspect the returned response, you’ll find the vector for the embeddings.
Fast, reliable and consistent SQL Server development…
Using CREATE EXTERNAL MODEL
You may also be interested in…
SQL Server 2025 CREATE EXTERNAL MODEL and AI_GENERATE_EMBEDDINGS Commands Explained
Now, while we could then extract the embeddings from that returned JSON, having all this code every time we want to call a model is messy. Instead, SQL Server allows us to define an EXTERNAL MODEL that remembers how to call the model:
|
1 2 3 4 5 6 7 8 9 |
CREATE EXTERNAL MODEL AllMiniLM AUTHORIZATION dbo WITH ( LOCATION = 'https://localhost:8443/api/embed', API_FORMAT = 'Ollama', MODEL_TYPE = EMBEDDINGS, MODEL = 'all-minilm' ); |
The model isn’t a schema-bound object, so we don’t have a schema as part of the name. The LOCATION is just the URL we need to call – or the folder if we’re using an ONNX (Open Neural Network Exchange) runtime.
The API_FORMAT lets SQL Server know what to expect when calling and receiving a response. Currently, it can be one of either Azure OpenAI, OpenAI, Ollama, or ONNX Runtime.
The MODEL_TYPE is curious. At present, it can only be EMBEDDINGS. However, whenever you see a required parameter and only one permitted value, you just know that the SQL Server team are considering other uses for this.
Finally, we specify the model to use, and that’s it. It’s now easier for SQL Server to make calls to the model.
Using AI_GENERATE_EMBEDDINGS
Once the external model is created, we can call it by using the AI_GENERATE_EMBEDDINGS function. That simplifies the code for working with embeddings.
Do bear in mind that, before you can use it, that external AI runtimes must be enabled. Depending on the specific version of SQL Server you’re using, you might also need to enable preview features:
|
1 2 3 4 5 6 |
EXECUTE sp_configure 'external AI runtimes enabled', 1; RECONFIGURE WITH OVERRIDE; GO And then we can call a new function to easily retrieve embeddings. SELECT AI_GENERATE_EMBEDDINGS(N'SQL Server loves AI' USE MODEL AllMiniLM); |
Note that the model that I mentioned here is the name we gave to the external model. I often name the external model based on the text model I’m calling.
Summary and next steps
Creating an external model and then calling it via AI_GENERATE_EMBEDDINGS makes it easy to work with text embeddings in SQL Server. One further aspect to consider is which text to generate the embedding for. SQL Server has provided the AI_GENERATE_CHUNKS function to help us with that. We’ll explore that in the next article.
Simple Talk is brought to you by Redgate Software
FAQs: How to call an Ollama-based AI text embeddings model from SQL Server 2025
1. Which Ollama text embedding model should I use with SQL Server?
Common choices are all-minilm (384 dimensions), nomic-embed-text (768 dimensions), and mxbai-embed-large (1024 dimensions). Even all-minilm is surprisingly effective for standard text, and fewer dimensions help keep vector storage manageable.
2. How do I enable SQL Server to call Ollama via REST?
Run sp_configure 'external rest endpoint enabled', 1 followed by RECONFIGURE. This allows you to use sp_invoke_external_rest_endpoint to make HTTPS calls to Ollama through your Caddy proxy.
3. What is CREATE EXTERNAL MODEL in SQL Server?
CREATE EXTERNAL MODEL defines a reusable reference to an embedding model, storing its URL, API format (Ollama, OpenAI, Azure OpenAI, or ONNX Runtime), and model name, so you don’t have to write REST call code every time.
4. How do I generate embeddings in SQL Server with AI_GENERATE_EMBEDDINGS?
After enabling external AI runtimes and creating an external model, call AI_GENERATE_EMBEDDINGS(N'your text' USE MODEL YourModelName). This returns the vector directly, avoiding the need to parse JSON from raw REST responses.
Load comments