{"id":101142,"date":"2024-02-06T14:34:03","date_gmt":"2024-02-06T14:34:03","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=101142"},"modified":"2024-09-05T15:48:11","modified_gmt":"2024-09-05T15:48:11","slug":"the-fashionable-truth-about-ai","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/opinion\/opinion-pieces\/the-fashionable-truth-about-ai\/","title":{"rendered":"The Fashionable Truth About AI"},"content":{"rendered":"<p>Over the past year, the topic of AI has really blown up among the general public. <strong>AI<\/strong> was already very important to most corporations, but <a href=\"https:\/\/en.wikipedia.org\/wiki\/Large_language_model\"><strong>Large Language Models<\/strong><\/a> (<strong>LLMs<\/strong>) made it extremely fashionable.<\/p>\n<p>When something becomes fashionable, a lot of people try to ride the wave with no care for the truth about the details and the technology. There are already many online services offering \u201cLLM\u201d and \u201cAI\u201d for different purposes, even ones for which they are not recommended.<\/p>\n<p>Let\u2019s dig into the alphabet soup world of AI and discover what\u2019s fact and what\u2019s only fashion.<\/p>\n<h1>Before LLM there was Machine Learning<\/h1>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"492\" height=\"276\" class=\"wp-image-101145\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/word-image-101142-1.jpeg\" \/><\/p>\n<p>While the term AI has been used a lot for many purposes, when AI was mentioned, companies were often talking about <strong>Machine Learning<\/strong>, which is typically used to analyse trends and predict the probable result in a specific scenario.<\/p>\n<p>Predicting scenarios over existing data is (or was) the main objective of most companies. The most famous scenario is the discovery that men buying diapers at night are prone to buy beer as well. 
Hence, sales will be better if diapers and beer are put together in a store.<\/p>\n<h1>How Machine Learning works<\/h1>\n<p>Using <strong>Machine Learning<\/strong> is about choosing the correct <strong>algorithm<\/strong> to make a <strong>prediction<\/strong>, performing the correct <strong>training<\/strong> and <strong>tuning<\/strong> of the <strong>algorithm<\/strong> for a specific <strong>dataset schema<\/strong> and, once it achieves an acceptable percentage of correct predictions, making it available for company usage.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"710\" height=\"445\" class=\"wp-image-101146\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-diagram-of-a-machine-learning-algorithm-descrip.jpeg\" alt=\"A diagram of a machine learning algorithm\n\nDescription automatically generated\" \/> The professional who executes these tasks is typically a <strong>Data Scientist<\/strong>.<\/p>\n<p>A machine learning algorithm can be summarized as a program capable of applying a mathematical model to a set of data in order to discover specific patterns in that data.<\/p>\n<p>Choosing the correct algorithm for the various scenarios an organization needs requires deep mathematical knowledge. Most people outside the IT area tend to think that IT professions require a lot of mathematical knowledge.<\/p>\n<p>However, among the thousands of professions in the IT area, Data Scientist requires far more mathematical knowledge than most, if not all, typical IT professions, including most programmers.<\/p>\n<p>The machine learning process starts with <strong>training<\/strong> an algorithm. 
This involves preparing the algorithm to process a specific dataset schema and make one or more predictions from the data.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"360\" class=\"wp-image-101147\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-diagram-of-machine-learning-process-description.jpeg\" alt=\"A diagram of machine learning process\n\nDescription automatically generated\" \/><\/p>\n<p>The <strong>dataset schema<\/strong> is the set of fields, with their specific data types, used by one table of data. Training an algorithm involves selecting the set of fields in the dataset which the algorithm should analyse to predict the value of a new field.<\/p>\n<p>For example, let\u2019s consider a credit analysis. Based on a person\u2019s profile, will they repay their debt, or will they stay in debt?<\/p>\n<p>During the training process, the algorithm processes a set of data where the resulting field value is already known. In this way, the algorithm discovers a pattern among all the other fields which leads to a prediction of the resulting field.<\/p>\n<p>For example, in a dataset of bank customers, the algorithm can analyze all the customer profiles and identify patterns in the profile fields which indicate whether a customer will repay the debt or stay in debt.<\/p>\n<p>Once the algorithm finds a pattern, we use the algorithm over a second set of data, for which we know the result but don\u2019t provide it to the algorithm. 
The algorithm predicts the result, and we analyze the percentage of hits and misses.<\/p>\n<h1>Machine Learning was also Fashionable<\/h1>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"670\" height=\"426\" class=\"wp-image-101148\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-diagram-of-data-engineering-description-automat.jpeg\" alt=\"A diagram of data engineering\n\nDescription automatically generated\" \/><\/p>\n<p>In its time, machine learning was also quite fashionable. One of the biggest complaints from Data Scientists was that they were usually hired for this position only to discover that most of the work they had to do was in fact related to <strong>Data Engineering<\/strong> and <strong>Business Intelligence<\/strong>.<\/p>\n<p><strong>Data Engineering<\/strong> tasks are generally those involving data transformation and management, from the production environment to the data intelligence environment, building a trustworthy data repository which can be used for machine learning.<\/p>\n<p>Of course, many specialists would say I\u2019m summarizing too much, forgetting about <strong>Data Architect<\/strong>, <strong>Data Governance<\/strong> and <strong>Data Modeler<\/strong>, but for the purpose of this explanation, it\u2019s ok. 
There are a lot of functions that are involved in providing data to the Machine Learning processes.<\/p>\n<p>The fashion for machine learning, combined with little knowledge about what it really involves, is what made many leaders hire <strong>Data Scientists<\/strong> and put them to work, but not get what they needed because of all the lead-in work.<\/p>\n<p>The <strong>Data Engineering<\/strong> work to create a data intelligence environment and the business intelligence work over this environment should already be done before reaching machine learning implementations.<\/p>\n<p>Over time, it will be interesting to see what behaviours <strong>LLMs<\/strong> will generate as they go from being fashionable to just another tool we all use.<\/p>\n<h1>What\u2019s Business Intelligence<\/h1>\n<p><strong>Business Intelligence<\/strong> is the process of exploring the data generated by the work of <strong>Data Engineers<\/strong>, making discoveries over this data, and delivering actionable information to the company\u2019s decision makers.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"870\" height=\"459\" class=\"wp-image-101149\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-magnifying-glass-over-a-computer-description-au.jpeg\" alt=\"A magnifying glass over a computer\n\nDescription automatically generated\" \/><\/p>\n<p>This process is executed by <strong>Data Analysts<\/strong> with the support of <strong>Business Analysts<\/strong>.<\/p>\n<p>It makes no sense to attempt predictions before you have a trustworthy data repository built by <strong>Data Engineers<\/strong> and before <strong>Data Analysts<\/strong> have had the opportunity to explore it to the extreme, extracting all possible actionable information from it.<\/p>\n<p>Only after this whole process is in place will you have a trustworthy data set and know what information needs to be predicted by machine learning algorithms.<\/p>\n<p>Of course, exceptions to this order of work may 
exist, but that\u2019s what they are: exceptions.<\/p>\n<h1>The Large Language Models are here<\/h1>\n<p><strong>LLMs<\/strong> are completely different from <strong>Machine Learning<\/strong> algorithms. They are not a replacement for, or the evolution of, Machine Learning; they are a very different side of Artificial Intelligence. This new side of AI managed to shift the fashion from Machine Learning to LLMs, but technically, there is no replacement.<\/p>\n<p>Machine learning is still needed in companies together with LLMs. Leaving machine learning behind and replacing it with LLMs would only be a new mistake, the effect of fashion prevailing over solid decisions, caused by poorly educated (and typically excitable!) leadership.<\/p>\n<h2>What are Large Language Models<\/h2>\n<p><strong>LLMs<\/strong> are algorithms, but they are not the kind of algorithm we train to make predictions, such as the machine learning ones.<\/p>\n<p>The big tech companies, starting with <strong>OpenAI<\/strong> (the company), pre-trained <strong>LLMs<\/strong> with a huge amount of language understanding knowledge and general world knowledge up to 2021 (in the case of <strong><a href=\"https:\/\/www.cnet.com\/tech\/chatgpt-isnt-stuck-in-2021-anymore-can-browse-web-for-recent-answers\/\">OpenAI\u2019s <\/a><\/strong><a href=\"https:\/\/www.cnet.com\/tech\/chatgpt-isnt-stuck-in-2021-anymore-can-browse-web-for-recent-answers\/\">initial offering<\/a>).<\/p>\n<p><strong><img loading=\"lazy\" decoding=\"async\" width=\"526\" height=\"328\" class=\"wp-image-101150\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-close-up-of-a-diagram-description-automatically.jpeg\" alt=\"A close-up of a diagram\n\nDescription automatically generated\" \/><\/strong><\/p>\n<p>The algorithm is capable of receiving a specific collection of tokens, processing them using the pre-trained knowledge and returning another collection of tokens as a reply.<\/p>\n<p>When talking about text and language, a token can 
be a single character, a word or part of a word, for example. It depends on how the algorithm was built and configured.<\/p>\n<p>For comparison purposes, the most powerful algorithms currently <a href=\"https:\/\/allenpike.com\/2023\/32k-of-context-in-your-pocket\">are capable of processing 32K<\/a> tokens and there are announcements of new algorithms capable of going up to 120K tokens.<\/p>\n<p>This total of tokens is divided between the input, containing the main question and instructions, and the reply.<\/p>\n<p>The input is basically text (instructions, questions, etc.), which could reach perhaps 1K or 2K tokens. The reply is also text, and would hardly be more than 2K tokens. This makes a total of 4K tokens. What about the other 28K tokens?<\/p>\n<p>The other 28K can vary a lot depending on the implementation and the environment provided. By this I mean the features provided by <strong>OpenAI<\/strong> (on <strong>ChatGPT<\/strong>), by Microsoft (on <strong>Azure OpenAI<\/strong>) and others.<\/p>\n<p>For example, we may be able to upload documents, ask it to analyse a URL, connect to a source of multiple documents, and more.<\/p>\n<h2>The Input: System and User Prompts<\/h2>\n<p>The input can be broken down into two parts: a <strong>System Prompt<\/strong> and a <strong>User Prompt<\/strong>.<\/p>\n<p><strong>System Prompt:<\/strong> It\u2019s a set of instructions about how the algorithm should behave. 
This includes instructions about which area of human knowledge the algorithm should focus on, whether any limits should be applied and which ones, what kind of language should be used to reply, what format should be used to reply and much more.<\/p>\n<p>The algorithm was pre-trained with such a huge amount of knowledge that different instructions can produce completely different results, making the same algorithm seem like completely different applications depending on the instructions sent to it.<\/p>\n<p>This way of working has already created the concept of <strong>Prompt Engineering<\/strong>, the knowledge (or art?) of how best to write instructions to an <strong>LLM<\/strong> to achieve the desired behavior. There is already a belief that this can become a new profession.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"791\" height=\"332\" class=\"wp-image-101151\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-diagram-of-a-diagram-description-automatically.png\" alt=\"A diagram of a diagram\n\nDescription automatically generated\" \/><\/p>\n<p><strong>User Prompt:<\/strong> It\u2019s the input (typically a question) made by the end user.<\/p>\n<p>At this point you may be asking: How do we use them? You may remember the ChatGPT UI, where you ask a question and receive an answer, but where is the difference between System Prompt and User Prompt?<\/p>\n<p>The difference between them becomes visible when using the development API to build an application. You can build a custom website to receive the user input. When submitting the user input to the algorithm, you send the System Prompt, and the user input becomes the User Prompt.<\/p>\n<p>In this way, you can use the same algorithm to build applications which will seem to the user to be completely different from each other. 
The System Prompt is the first element to establish the difference.<\/p>\n<h2>Additional Information for Processing<\/h2>\n<p>The features provided by the APIs define what additional information we can send to the algorithm. This may include:<\/p>\n<ul>\n<li>Files<\/li>\n<li>URLs for web scraping<\/li>\n<li>Connection to Azure Cognitive Services<\/li>\n<li>Connection to Cosmos DB Mongo API<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"778\" height=\"471\" class=\"wp-image-101152\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-screenshot-of-a-computer-description-automatica.png\" alt=\"A screenshot of a computer\n\nDescription automatically generated\" \/><\/p>\n<p>This allows the algorithm to use its power to process this information. The processing is based on language. It can summarize, find answers, and perform any additional processing we request through the prompts.<\/p>\n<p>All this information, including the additional information sent, needs to fit within the token limit for each call to the algorithm. As a result, you can\u2019t send a huge number of files or URLs. You may need to rely on different services working together, such as Azure Cognitive Search, to find the best information to send to the algorithm.<\/p>\n<h2>Summarizing: The difference between Machine Learning and LLM<\/h2>\n<p>After analysing how Machine Learning and LLMs each work, let\u2019s summarize the differences between them.<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Machine Learning<\/strong><\/td>\n<td><strong>Large Language Model<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Models are intended for predictions<\/td>\n<td>Although the reply could technically be called a prediction, it\u2019s not one from the end user\u2019s point of view. The model processes existing data to provide an answer.<\/td>\n<\/tr>\n<tr>\n<td>Model means an algorithm which requires training<\/td>\n<td>Models are pre-trained. 
Additional training is minimal or none.<\/td>\n<\/tr>\n<tr>\n<td>Model training is for one specific purpose<\/td>\n<td>Model pre-training, done by the big tech companies, includes a huge amount of knowledge ingestion<\/td>\n<\/tr>\n<tr>\n<td>The trained model analyses one specific dataset schema<\/td>\n<td>The model provides answers about any knowledge ingested<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1378\" height=\"640\" class=\"wp-image-101153\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-screen-shot-of-a-black-background-description-a.png\" alt=\"A screen shot of a black background\n\nDescription automatically generated\" \/><\/p>\n<p>As you may notice, both have their own space in the corporate environment. They can both be used in the corporate data platform, for different purposes.<\/p>\n<h1>The Alphabet Soup increases: Bots and Co-Pilots<\/h1>\n<p>These two words have recently become very common and are linked to LLMs. 
But what is the difference between the two, and between them and the LLM itself?<\/p>\n<h2>Bots<\/h2>\n<p>A <strong>Bot<\/strong> is chat software built using a tool which combines a language processing feature with a workflow developed to guide the subject of a conversation.<\/p>\n<p>The Bot developer can design a conversation workflow to guide how the bot reacts when questioned about each subject.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"849\" height=\"332\" class=\"wp-image-101154\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-screenshot-of-a-computer-description-automatica-1.png\" alt=\"A screenshot of a computer\n\nDescription automatically generated\" \/><\/p>\n<p>The language processing feature ensures the bot can understand different ways to express the same subject, but that\u2019s the limit of a bot: it has no additional intelligence.<\/p>\n<h2>Co-Pilot<\/h2>\n<p>A <strong>Co-Pilot<\/strong> is a custom chat app implementation using an LLM as a backend. It\u2019s associated with our daily tools, and its system prompt is designed to provide specific results in those tools.<\/p>\n<p>Co-pilots are being so deeply integrated into existing daily tools that these tools are designed to call the backend LLM, get the result and perform specific processing on the result according to the tool\u2019s purpose. For example, find an email, write code, build a report and so on. This is the result of a deep integration of the tool with a backend LLM deployment.<\/p>\n<h2>Difference between Bot, Co-Pilot and LLM<\/h2>\n<p><strong>Bot:<\/strong> Provides a specific, controlled, and predictable set of answers.<\/p>\n<p><strong>Co-Pilot:<\/strong> It\u2019s a specific implementation using an LLM as a backend. 
Its answers are not always predictable.<\/p>\n<p><strong>LLM:<\/strong> It\u2019s the backend algorithm used to implement what it has now become a convention to call a Co-Pilot.<\/p>\n<p>The use cases for each one are very simple:<\/p>\n<ul>\n<li>If you want a chat assistant with predictable and controlled answers, you use a <strong>Bot<\/strong><\/li>\n<li>If you want a tool to help you with creation, search and processing, you use a custom <strong>Co-Pilot<\/strong> built over a backend <strong>LLM<\/strong><\/li>\n<li>If you want the power of an LLM combined with a predictable workflow about a specific subject, you can build a Bot with a defined workflow and, when the Bot doesn\u2019t recognize the subject the user is talking about, have the Bot send the question to a backend LLM, combining both.<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"620\" height=\"629\" class=\"wp-image-101155\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-screenshot-of-a-computer-description-automatica-2.png\" alt=\"A screenshot of a computer\n\nDescription automatically generated\" \/><\/p>\n<p>After analyzing all this, it should be easy to see how wrong it would be to use an LLM as a virtual assistant for an end-user facing site: the answers are not controllable. You may be lucky and get good answers, but they can also be a mess.<\/p>\n<p>Using a Bot, on the other hand, you can control the answers for specific conversation subjects, ensuring the bot provides good answers to your customer and leaving the power of LLM for more creative purposes.<\/p>\n<p>This article came about exactly when I found companies already offering LLMs to be used as virtual assistants on your website in a Software as a Service model. 
Companies like this are not only riding on the fashion, but guiding the end user in the wrong direction.<\/p>\n<h2>Bots, Co-Pilots and Microsoft<\/h2>\n<p>Microsoft has a <strong>Bot Framework<\/strong>, which allows you to build bots completely by code. In the past, the bot framework had many helper tools, such as the <strong>bot composer<\/strong>, but these tools were not updated and are now deprecated. The <strong>Bot Framework<\/strong>, however, remains.<\/p>\n<p>Inside the <strong>Power Platform<\/strong>, Microsoft created <strong>Power Virtual Agents<\/strong>. <strong>Power Virtual Agents<\/strong> is a no-code solution to build bots and integrate them into your applications. Recently, Microsoft rebranded the Power Virtual Agents as Co-Pilots and the portal to build them as <strong>Co-Pilot Studio<\/strong>.<\/p>\n<p>What changed between Power Virtual Agents and Co-Pilots?<\/p>\n<p>Microsoft included the capability to link the PVAs with an LLM in order to provide more powerful answers.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"542\" height=\"452\" class=\"wp-image-101156\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2024\/02\/a-screenshot-of-a-computer-description-automatica-3.png\" alt=\"A screenshot of a computer\n\nDescription automatically generated\" \/><\/p>\n<p>Although this goes in a good direction, using a bot and an LLM together, the configuration options the Co-Pilot offers for the execution parameters of the LLM are limited. 
In my opinion, they will meet the needs of small businesses, exactly as the Power Platform already does, but not the big ones.<\/p>\n<p>It\u2019s important for corporations not to fall for the wonders of a beautiful presentation: they will need, for sure, custom apps calling the LLM (such as an Azure Function calling Azure OpenAI) to control all possible LLM configurations and get all its power.<\/p>\n<h1>Summary<\/h1>\n<p>This is a great alphabet soup and it\u2019s very easy to fall for the wonders presented in speeches which seem more like show business.<\/p>\n<p>However, knowing exactly what each technology is and does makes it easier to identify the traps in the way and pursue the correct solution, built by the right specialists.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Over the past year, the topic of AI has really blown up in the general public. However, AI was already something very important to most corporations, but the Large Language Models (LLM) made it extremely fashionable. 
When something becomes fashionable, a lot of people try to ride the wave with no care about the truth&#8230;&hellip;<\/p>\n","protected":false},"author":50808,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[159169,53,30],"tags":[159075],"coauthors":[6810],"class_list":["post-101142","post","type-post","status-publish","format-standard","hentry","category-ai","category-featured","category-opinion-pieces","tag-ai"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/101142","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/50808"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=101142"}],"version-history":[{"count":6,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/101142\/revisions"}],"predecessor-version":[{"id":103878,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/101142\/revisions\/103878"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=101142"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=101142"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=101142"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=101142"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}