{"id":109233,"date":"2026-04-16T14:13:00","date_gmt":"2026-04-16T14:13:00","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=109233"},"modified":"2026-04-08T12:56:35","modified_gmt":"2026-04-08T12:56:35","slug":"when-and-when-not-to-use-llms-in-your-data-pipeline","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/ai\/when-and-when-not-to-use-llms-in-your-data-pipeline\/","title":{"rendered":"When, and when not, to use LLMs in your data pipeline"},"content":{"rendered":"\n<p><strong>\u201cLet\u2019s add LLMs to the pipeline\u201d has become a familiar refrain in modern data teams, but turning that idea into production reality is where things often go wrong. While large language models can unlock powerful capabilities, they\u2019re just as likely to introduce unnecessary cost, latency, and complexity when misapplied. Drawing on real-world experience building large-scale ML systems, this guide breaks down exactly where LLMs belong in your data pipeline and, just as importantly, where they don\u2019t.<\/strong><\/p>\n\n\n\n<p>Someone on your team just got back from a <a href=\"https:\/\/passdatacommunitysummit.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">conference<\/a>. Or maybe they spent 20 minutes on LinkedIn. Either way, the message lands in Slack: \u201cWe should be using LLMs in our pipeline.\u201d<\/p>\n\n\n\n<p>I\u2019ve watched this play out more times than I can count &#8211; and I get it. <a href=\"https:\/\/www.ibm.com\/think\/topics\/large-language-models\" target=\"_blank\" rel=\"noreferrer noopener\">LLMs (large language models)<\/a> are genuinely exciting. However, what usually follows that Slack message is not so exciting. It&#8217;s a weeks-long detour that ends with an infrastructure bill that makes the VP of Engineering visibly uncomfortable. A pipeline that\u2019s three times slower than before. 
And a compliance team asking questions nobody prepared for.<\/p>\n\n\n\n<p>I\u2019ve spent the last few years building production <a href=\"https:\/\/www.ibm.com\/think\/topics\/machine-learning\" target=\"_blank\" rel=\"noreferrer noopener\">ML (machine learning)<\/a> systems that touch hundreds of millions of users &#8211; customer journey optimization, support analytics, <a href=\"https:\/\/hightouch.com\/blog\/behavioral-data\" target=\"_blank\" rel=\"noreferrer noopener\">behavioral data<\/a> at petabyte scale. The clearest lesson from all of it: the teams that get LLMs right aren\u2019t the ones who use them <em>everywhere<\/em>. They\u2019re the ones who are ruthlessly specific about exactly where they use them.<\/p>\n\n\n\n<p>This article is my attempt to give you a concrete framework for making that call &#8211; not in theory, but in the kind of production environment where things break at 2am and someone has to explain the <a href=\"https:\/\/aws.amazon.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">AWS<\/a> bill.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-why-llms-are-being-over-applied\">Why LLMs are being over-applied<\/h2>\n\n\n\n<p>The appeal of LLMs is obvious. A single model can read a messy support ticket, figure out what the customer actually wants, classify it, summarize it, and draft a reply &#8211; all in one <a href=\"https:\/\/www.red-gate.com\/simple-talk\/sysadmin\/general\/api-monitoring-key-metrics-and-best-practices\/\" target=\"_blank\" rel=\"noreferrer noopener\">API<\/a> call. Three years ago that would\u2019ve been four separate models and a custom <a href=\"https:\/\/www.geeksforgeeks.org\/nlp\/natural-language-processing-nlp-pipeline\/\" target=\"_blank\" rel=\"noreferrer noopener\">NLP (natural language processing) pipeline<\/a>. Now, it\u2019s a weekend prototype.<\/p>\n\n\n\n<p>The problem, however, shows up later. 
<a href=\"https:\/\/docs.themeisle.com\/freshrank\/understanding-token-usage-and-costs\" target=\"_blank\" rel=\"noreferrer noopener\">Token costs<\/a> scale with volume, and latency is real. The output is probabilistic (meaning identical inputs don\u2019t always produce identical outputs), and when something goes wrong, debugging an LLM call is nothing like debugging a <a href=\"https:\/\/www.dataforgelabs.com\/data-transformation-tools\/sql-transformation\" target=\"_blank\" rel=\"noreferrer noopener\">SQL transformation<\/a>. There\u2019s no stack trace for \u201cthe model decided to phrase it differently this time.\u201d<\/p>\n\n\n\n<p>The trap springs during the proof-of-concept phase. The demo looks great. The prototype handles edge cases, tolerates messy input and produces readable output&#8230;but the demo doesn\u2019t run on ten million rows. It doesn\u2019t have a 50-millisecond latency budget. And nobody from compliance is watching when you build it.<\/p>\n\n\n\n<p>That gap between demo performance and production fitness is where most LLM projects run into trouble.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-where-do-llms-genuinely-belong-in-a-data-pipeline\">Where do LLMs genuinely belong in a data pipeline?<\/h2>\n\n\n\n<p>Let\u2019s start with where LLMs actually earn their keep&#8230;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-1-unstructured-text-parsing-and-enrichment\">1. 
Unstructured text parsing and enrichment<\/h3>\n\n\n\n<p>If your pipeline ingests <a href=\"https:\/\/www.pcmag.com\/encyclopedia\/term\/free-form-text\" target=\"_blank\" rel=\"noreferrer noopener\">free-form text<\/a> (support tickets, reviews, medical notes, legal documents, call transcripts), LLMs are doing something <a href=\"https:\/\/en.wikipedia.org\/wiki\/Rule-based_system\" target=\"_blank\" rel=\"noreferrer noopener\">rule-based systems<\/a> genuinely cannot.<\/p>\n\n\n\n<p>A <a href=\"https:\/\/www.red-gate.com\/simple-talk\/featured\/using-regex-in-sql-server-2025-complete-guide\/\" target=\"_blank\" rel=\"noreferrer noopener\">regex<\/a> extracts a phone number. It doesn\u2019t extract the emotional tone of a complaint, the buried risk clause in paragraph 14 of a contract, or the intent behind \u201cthis thing keeps doing the thing again.\u201d In one large-scale support analytics deployment I worked on, LLM-based enrichment of call transcripts gave us classification accuracy that would\u2019ve taken hundreds of hand-crafted rules to approximate &#8211; rules that would have needed updating every time a new product launched. Instead, we simply updated a prompt.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-2-semantic-search-and-rag\">2. Semantic search and RAG<\/h3>\n\n\n\n<p>This is the clearest LLM win in data pipelines, full stop. 
When users need to query a knowledge base or documentation corpus using natural language &#8211; and when keyword matching falls short &#8211; <a href=\"https:\/\/help.openai.com\/en\/articles\/8868588-retrieval-augmented-generation-rag-and-semantic-search-for-gpts\" target=\"_blank\" rel=\"noreferrer noopener\">RAG architectures<\/a> are purpose-built for the job.<\/p>\n\n\n\n<p>RAG isn&#8217;t even just a fallback option &#8211;  for <a href=\"https:\/\/www.lenovo.com\/gb\/en\/glossary\/what-is-retrieve\/\" target=\"_blank\" rel=\"noreferrer noopener\">retrieval tasks<\/a>, it\u2019s the architecturally correct first choice. It&#8217;s traceable (the model can cite its sources), updatable without retraining (swap the vector database, not the model), and doesn\u2019t silently degrade when your data changes. In a separate piece I\u2019ve written on RAG-first architectures, I argue that roughly 70% of production AI use cases are better served by RAG and smart prompting than by fine-tuning. Retrieval is the clearest example of why.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-3-natural-language-to-sql\">3. Natural language to SQL<\/h3>\n\n\n\n<p>What surprises people is that LLMs are actually quite good at translating natural language questions into SQL &#8211; <em>as long as the context is right<\/em>. Pass in your schema, a few example queries, and relevant table descriptions, and you\u2019ve turned a <a href=\"https:\/\/www.ibm.com\/think\/topics\/ai-hallucinations\" target=\"_blank\" rel=\"noreferrer noopener\">hallucination-prone<\/a> model into a reliable query generator.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-4-explaining-what-your-models-found\">4. Explaining what your models found<\/h3>\n\n\n\n<p>Statistical models are good at <a href=\"https:\/\/www.ibm.com\/think\/topics\/anomaly-detection\" target=\"_blank\" rel=\"noreferrer noopener\">detecting anomalies<\/a> but terrible at explaining them to whoever needs to take action. 
An LLM can take the outputs of your <a href=\"https:\/\/www.red-gate.com\/products\/redgate-monitor\/\" target=\"_blank\" rel=\"noreferrer noopener\">monitoring system<\/a> &#8211; such as churn spike, support ticket surge, or regional drop-off &#8211; and turn them into a paragraph a product manager can actually use.<\/p>\n\n\n\n<p>This is also a relatively low-volume use case. You\u2019re not narrating a billion events; you\u2019re narrating a handful of alerts. The cost profile is fine.<\/p>\n\n\n\n<section id=\"my-first-block-block_08656051142c76fa1091d36557d97322\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Future-proof database monitoring with Redgate Monitor<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            Multi-platform database observability for your entire estate. Optimize performance, ensure security, and mitigate potential risks with fast deep-dive analysis, intelligent alerting, and AI-powered insights.                                    <\/div>\n            <\/div>\n                            <a href=\"https:\/\/www.red-gate.com\/products\/redgate-monitor\/\" class=\"btn btn--secondary btn--lg\">Learn more &amp; try for free<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-where-llms-do-not-belong\">Where LLMs do <em>not<\/em> belong<\/h2>\n\n\n\n<p>Now the harder part. LLMs can technically be wedged into almost any pipeline task, but the question is whether you <em>should<\/em>. Usually, the answer is no.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-1-deterministic-transformations\">1. 
Deterministic transformations<\/h3>\n\n\n\n<p>If your transformation has a single correct answer &#8211; convert this timestamp to UTC, calculate a 30-day rolling average, join two tables on customer ID &#8211; use SQL <em>or<\/em> a dataframe operation. That\u2019s it. LLMs add probabilistic variability to tasks that require <a href=\"https:\/\/www.mcherm.com\/deterministic-programming-with-llms.html\" target=\"_blank\" rel=\"noreferrer noopener\">deterministic<\/a> correctness, and they do it at orders-of-magnitude higher cost and latency.<\/p>\n\n\n\n<p>I\u2019ve seen teams run LLM calls to \u201cclean\u201d structured fields that a two-line <a href=\"https:\/\/www.red-gate.com\/simple-talk\/development\/dotnet-development\/10-reasons-python-better-than-c-sharp\/\" target=\"_blank\" rel=\"noreferrer noopener\">Python<\/a> function would have handled the same way, for a thousandth of the price. The justification is usually that the LLM handles edge cases better. Sometimes that\u2019s true. However, it more often means the edge cases haven\u2019t been properly defined. The fix here is to define them &#8211; not hand the problem to a model that charges per token.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-2-high-volume-low-latency-enrichment\">2. High-volume, low-latency enrichment<\/h3>\n\n\n\n<p>Petabyte-scale pipelines and LLM token costs do not coexist peacefully. If you\u2019re enriching a billion user events per day with API calls, the infrastructure bill will not survive a quarterly review. <\/p>\n\n\n\n<p>The lesson I\u2019ve learned working at that scale: LLMs belong at the edges, on the high-value, low-volume decisions. Not in the hot path where you need sub-millisecond responses at fractions of a cent per million operations. 
The table below illustrates the magnitude of this gap:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Approach<\/strong><\/td><td><strong>Latency<\/strong><\/td><td><strong>Cost \/ 1M rows<\/strong><\/td><td><strong>Reliability<\/strong><\/td><\/tr><tr><td>LLM (GPT-4 class)<\/td><td>800ms\u20132s<\/td><td>$800\u2013$2,000<\/td><td>High (with guardrails)<\/td><\/tr><tr><td>LLM (smaller \/ local)<\/td><td>200\u2013500ms<\/td><td>$50\u2013$200<\/td><td>Medium<\/td><\/tr><tr><td>Classical ML<\/td><td>10\u201350ms<\/td><td>$5\u2013$20<\/td><td>High (deterministic)<\/td><\/tr><tr><td>SQL \/ Rule-based<\/td><td>&lt;10ms<\/td><td>&lt;$1<\/td><td>Very High<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Your numbers will vary by model, provider, and workload, but the order-of-magnitude differences are real. They don\u2019t just disappear with optimization.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-3-regulated-or-compliance-sensitive-outputs\">3. Regulated or compliance-sensitive outputs<\/h3>\n\n\n\n<p>If your pipeline produces outputs subject to <a href=\"https:\/\/www.red-gate.com\/blog\/inside-perspectives-the-growing-importance-of-security-and-compliance\/\" target=\"_blank\" rel=\"noreferrer noopener\">regulatory audit<\/a> &#8211; such as fraud scores, healthcare decisions, and financial calculations &#8211; LLMs are the wrong foundation. Regulators want explainability, reproducibility, and deterministic behavior under identical inputs. LLMs don\u2019t naturally offer any of those things.<\/p>\n\n\n\n<p>That doesn\u2019t mean LLMs are useless in regulated environments. They can draft, summarize, and flag items for human review. 
But the authoritative output &#8211; the number that goes in the record, the decision that gets logged &#8211; needs to come from something auditable.<\/p>\n\n\n\n<section id=\"my-first-block-block_778292c6a08b6a7f60c798cf0f6b830a\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Write accurate SQL faster in SSMS with SQL Prompt AI<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            Write or modify queries using natural language, get clear explanations for unfamiliar code, and fix and optimize SQL with ease &#8211; all without leaving SSMS.                                    <\/div>\n            <\/div>\n                            <a href=\"https:\/\/www.red-gate.com\/products\/sql-prompt\/#ai-powered-code\" class=\"btn btn--secondary btn--lg\">Learn more and try for free<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-a-practical-decision-framework\">A practical decision framework<\/h2>\n\n\n\n<p>Before you add an LLM to any pipeline stage, ask these four questions. They&#8217;ve saved me from making several expensive decisions.<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li><strong>Is the output deterministic? <\/strong>If yes, reach for SQL, a dataframe library, or classical ML first. LLMs add cost and variance to problems that don\u2019t need it.<br><br><\/li>\n\n\n\n<li><strong>Is the input unstructured or semantically complex? <\/strong>If yes, LLMs are likely the right call. 
If the input is structured and well-defined, they\u2019re probably not.<br><br><\/li>\n\n\n\n<li><strong>What\u2019s the volume and latency requirement? <\/strong>If you\u2019re processing more than a few million records a day or need sub-100ms responses, run the cost and latency math before you commit.<br><br><\/li>\n\n\n\n<li><strong>Can you explain this output to an auditor? <\/strong>If traceability matters, make sure your LLM integration includes citations, logging, and a path to human review.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>Here\u2019s how that maps across common pipeline tasks:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"781\" height=\"750\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-10.png\" alt=\"Image showing a four-question decision framework for assessing when and where you should use an LLM.\" class=\"wp-image-109239\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-10.png 781w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-10-300x288.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-10-768x738.png 768w\" sizes=\"auto, (max-width: 781px) 100vw, 781px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Task Type<\/strong><\/td><td><strong>Use LLM?<\/strong><\/td><td><strong>Better Alternative<\/strong><\/td><td><strong>Reason<\/strong><\/td><\/tr><tr><td>Unstructured text parsing<\/td><td>Yes<\/td><td>N\/A<\/td><td>LLMs excel at flexible, context-aware extraction<\/td><\/tr><tr><td>Semantic search \/ Q&amp;A<\/td><td>Yes<\/td><td>N\/A<\/td><td>RAG + LLM is purpose-built for this<\/td><\/tr><tr><td>Dynamic SQL generation<\/td><td>Yes<\/td><td>N\/A<\/td><td>LLMs handle schema-aware query generation well<\/td><\/tr><tr><td>Deterministic transforms<\/td><td>No<\/td><td>SQL \/ 
dbt<\/td><td>Rule-based is faster, cheaper, auditable<\/td><\/tr><tr><td>High-volume enrichment<\/td><td>Careful<\/td><td>Classical ML \/ rules<\/td><td>Token costs scale badly at petabyte volume<\/td><\/tr><tr><td>Regulated \/ compliance output<\/td><td>No<\/td><td>Deterministic logic<\/td><td>LLMs lack the explainability and reproducibility audits require<\/td><\/tr><tr><td>Anomaly explanation<\/td><td>Yes<\/td><td>N\/A<\/td><td>LLMs narrate statistical outputs effectively<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-real-life-example-customer-data-enrichment\">Real-life example: customer data enrichment<\/h2>\n\n\n\n<p>Take a task I\u2019ve dealt with directly: enriching customer records with a \u201clikely intent\u201d label based on recent behavior. The input is a mix of <a href=\"https:\/\/hightouch.com\/blog\/clickstream-data\" target=\"_blank\" rel=\"noreferrer noopener\">clickstream<\/a> events, support interactions, and purchase history. The output drives what the customer sees next.<\/p>\n\n\n\n<p>You have three real options:<\/p>\n\n\n<div class=\"block-core-list\">\n<ul class=\"wp-block-list\">\n<li><strong>SQL + <a href=\"https:\/\/seon.io\/resources\/dictionary\/heuristic-rules\/\" target=\"_blank\" rel=\"noreferrer noopener\">rule-based heuristics<\/a>: <\/strong>Fast, cheap, and completely auditable. Works great when your intent categories are stable and well-defined. Falls apart when behavior patterns shift or when the language is ambiguous.<br><br><\/li>\n\n\n\n<li><strong>Classical ML classifier: <\/strong>Trained on labeled historical data. Handles high volume and low latency well. Needs periodic retraining as patterns evolve. How interpretable it is depends on what model you pick.<br><br><\/li>\n\n\n\n<li><strong>LLM with behavioral context in the prompt: <\/strong>Handles novel patterns and nuanced intent better than the other two. Expensive at scale. 
Fine for <a href=\"https:\/\/www.red-gate.com\/simple-talk\/development\/dotnet-development\/syntactic-sugar-and-the-async-pill\/\" target=\"_blank\" rel=\"noreferrer noopener\">async<\/a> batch enrichment; not for real-time serving. Outputs need validation.<\/li>\n<\/ul>\n<\/div>\n\n\n<p>The pattern that\u2019s worked best in the high-volume deployments I\u2019ve been part of is a hybrid: use the classical ML model for the bulk of real-time scoring, and route low-confidence cases to a secondary LLM enrichment step. Let the fast, cheap system handle the straightforward cases. Save the powerful, expensive system for the ones that actually need it. This is the same principle behind RAG-first architectures &#8211; reserve model complexity for where it genuinely moves the needle.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-conclusion-choose-the-right-tool-for-the-right-job\">Conclusion: choose the right tool for the right job<\/h2>\n\n\n\n<p>The most expensive LLM mistake I see isn\u2019t failing to adopt them. It\u2019s adopting them without a clear reason why. LLMs are genuinely transformative for unstructured data, semantic retrieval, natural language querying, and anomaly explanation. They\u2019re the wrong choice for deterministic transforms, high-volume hot paths, and regulated outputs. <\/p>\n\n\n\n<p>The teams getting the most out of LLMs in production aren\u2019t the ones who put them everywhere &#8211; they\u2019re the ones who put them in exactly the right places and fought the urge to do anything more.<\/p>\n\n\n\n<p>The four questions and decision table in this article aren\u2019t exhaustive. They\u2019re meant to create a ten-minute pause before someone adds a $0.80-per-thousand-token model to a pipeline stage that a $0.0001 SQL query was already handling correctly. 
LLMs are powerful &#8211; and that\u2019s precisely why you should be selective about where you use them.<\/p>\n\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>FAQs: When, and when not, to use LLMs in your data pipeline<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. When should you use LLMs in a data pipeline?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Use LLMs for unstructured text tasks like classification, summarization, semantic search, and natural language querying where traditional methods fall short.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. When should you avoid using LLMs?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Avoid LLMs for deterministic tasks, high-volume low-latency processing, and regulated outputs that require consistency and auditability.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">3. Why are LLMs expensive in production?<\/h3>\n            <div class=\"faq-answer\">\n                <p>LLMs incur high token-based costs and latency, especially at scale, making them inefficient for large, real-time data pipelines.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">4. What is the best alternative to LLMs for structured data?<\/h3>\n            <div class=\"faq-answer\">\n                <p>SQL, rule-based systems, and classical machine learning models are faster, cheaper, and more reliable for structured and repeatable tasks.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">5. 
How can you use LLMs efficiently in production?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Apply LLMs selectively: focus on high-value, low-volume use cases and combine them with traditional systems in hybrid architectures.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>\u201cLet\u2019s add LLMs to the pipeline\u201d has become a familiar refrain in modern data teams, but turning that idea into production reality is where things often go wrong. While large language models can unlock powerful capabilities, they\u2019re just as likely to introduce unnecessary cost, latency, and complexity when misapplied. Drawing on real-world experience building large-scale ML systems, this guide breaks down exactly where LLMs belong in your data pipeline and, just as importantly, where they don\u2019t.&hellip;<\/p>\n","protected":false},"author":346673,"featured_media":103103,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[159169,143527,143523],"tags":[159075,4483,4168,159378,4150],"coauthors":[159377],"class_list":["post-109233","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","category-database-administration-sql-server","category-databases","tag-ai","tag-data","tag-database","tag-llm","tag-sql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/109233","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/346673"}],"replies":[
{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=109233"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/109233\/revisions"}],"predecessor-version":[{"id":109468,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/109233\/revisions\/109468"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/103103"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=109233"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=109233"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=109233"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=109233"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}