First LLM Technical Editor Test – Office and Web-based Copilot with a VERY wrong article

Next up, I am going to give Microsoft Copilot its chance to check for technical inaccuracies in my obviously very wrong articles. This article is going to build upon the last two (You can see the series here.), and as I continue this series, I am going to explain less and less of my methodology.

You can get the source articles here, and you can read more of them in the first article that set up the series.

A bit of setup first

I will be repeating the tests from the second article where I used ChatGPT, first using the web browser version that I have licensed as part of my Microsoft 365 license. I am not using my corporate license, but rather my own personal one. I do this to avoid any use of our corporate information. I will use my corporate license for the Word Document test, as this is the way I do my work (and a big part of what I am trying.

I did ask Copilot: “What version of LLM am I using” and it was less forthcoming with any version information:

“Great question! I’m built on the latest cutting-edge large language models from Microsoft. While I don’t know the exact technical details of the model version, I am constantly updated to provide you with the most accurate and helpful responses.”

So suffice it to say that I wrote this section on 26 January 2025 and using the model that is currently available. I honestly would appreciate a more specific value of a version to know whether it is worth attempting to a task again later. I will include text files of my entire output for the larger questions in any case.

Lets jump in.

Document advocating for the advice

So here we go. Just like last time, I typed the following on https://copilot.microsoft.com/:

“Please review the following document in terms of technology”

And then pasted the document with the header that said this was good advice. You can view the entire set of advice here. This time, the advice was a bit less useful, especially in the headers. For example, one section’s advice is:

“- Use as few servers as necessary Do you need SQL Server Analysis Services, the Relational Engine, SSIS, DQ, and a Web Server? Running them on the same machine saves on hardware and licensing costs and makes it easier for these services to communicate more efficiently. If you need to scale these services, adding CPUs and disk space (and possibly RAM) is far easier than dealing with multiple servers as usage levels increase.”

Copilot had this to say:

“Use as few servers as necessary:

Performance and Scalability: Combining services can save on costs and improve efficiency. However, it can also lead to resource contention. Monitoring and balancing resource usage is essential to avoid performance bottlenecks.”

The answer is not wrong, but it also isn’t really all that strong. Like can it improve efficiency? I don’t know what that means. Efficient processing? Efficient access? In my text I said they can communicate more efficiently, which is probably the only positive thing you could say about this advice. It will save licensing costs, too, I suppose.

The real answer is a lot more nuanced. You should NOT just run out and buy 5 independent servers for these tools if you are barely going to use them. Some could be ran on the same machine, or perhaps spread over a few machines. But that is terrible overall basic advice.

The rest of the advice needs to be stronger. It will likely lead to resource contention, and scaling up CPU and RAM is often not a linear cost. Some mention of the size of your organization and load testing would be nice as well.

All of the sections were one bullets each, and they were kind of the same level of “this is not the best advice.” For example, take the section on using the SA account:

“- The SA account saves time

Simply put, having one production account with one well-known password saves time in many ways. By sharing this password with any person who needs administrative access to the server, there are fewer passwords and users to manage, so there is less chance that that password will be lost to a bad actor. Note: It is imperative to have NDAs and training classes on how to keep this password safe. By keeping the password human readable, you are ensured that users will not write this down anywhere (even in a secure password manager.) For example, you don’t have to be awoken at night to fix deployments that have broken the database server. “

The only reasonable reply here is something along the lines of: “You should never be allowed to hold a job managing security without hundreds of additional hours of security training.”

It was far nicer, in my mind too nice.

“The SA account saves time:

Security Risks: Sharing the SA account password poses significant security risks. It’s better to use individual accounts with appropriate privileges and auditing to track changes.”

Finishing out the “too nice” advice is the conclusion:

“In conclusion, while some of the advice in the document is helpful, other points might need to be reconsidered for performance, security, and best practices. It’s essential to evaluate the specific context and requirements of your environment before implementing these suggestions.”

Finishing out the “too nice” advice is the conclusion:

Rewriting it for me

Unlike ChatGPT, it didn’t straight up ask “can I help you rewrite this better?” So I asked it to.

“Can you rewrite this article making the advice better?”

You can find that full output here. Just like last in previous article, I want to focus on the section that has code in it, the one about looping code for filtering. It is terrible advice, of course. Here is my original advice:

“- SQL Server works better with LOOPING codeIf you want to filter from a set of data, the best way is to use a WHILE WEND construct and do the filtering ROW BY ROW. For example, say you have a table with a million rows. You can write:

SELECT *
FROM TableName
WHERE Criteria = False;

But it is often far faster to use a loop such as:

CREATE TEMP TABLE #Results
WHILE (RowsStillInTable)
BEGIN
INSERT INTO #Results
SELECT TOP 1 *
FROM TableName
WHERE Criteria = False;

IF @@RowsCount = 0
Break;
WEND;

By using the TOP 1 modifier, this allows each iteration to execute much faster than the SELECT statement that needs to touch every row in the table simultaneously.”

Just like ChatGPT when it rewrote the document, it just explained why Set-Based is better than Row By Row Processing:

“12. Prefer Set-Based Operations Over Row-By-Row Processing

Avoid Looping Constructs for Data Manipulation: Row-by-row operations (commonly referred to as RBAR: “Row By Agonizing Row”) are inefficient in SQL Server.

Best Practice:

Use SQL’s Set-Based Nature: Write queries that operate on entire sets of data at once.

Leverage Advanced Features: Utilize Common Table Expressions (CTEs), window functions, and other SQL features to handle complex operations.

Optimize Joins and Subqueries: Ensure that joins and subqueries are written efficiently to minimize resource usage.

Monitor Execution Plans: Use execution plans to identify and resolve performance issues.”

It did actually define RBAR correctly but lost all of the code. Also, like ChatGPT the advice is decent, but bland.

One interesting thing about the rewrite is that in the introduction, it says:

“As a SQL Server administrator, your role is crucial in ensuring the performance, security, and reliability of your organization’s data infrastructure. Drawing from the collective wisdom of experienced DBAs, I’ve compiled a set of best practices to help you excel in your role. Here’s the enhanced advice, presented in no particular order”

So, it sort of slights the work of the other experienced writers in the process. I took their advice, and here I have, shall we say, enhanced it!

Finally, the conclusion is typical AI output:

“Conclusion

By incorporating these best practices into your daily operations, you can enhance the performance, security, and reliability of your SQL Server environments. Proactive management, continuous monitoring, and staying informed about the latest features and recommendations will help you deliver robust and efficient database solutions.

Remember that each environment is unique. Always consider your organization’s specific needs, constraints, and goals when applying these practices. Regularly revisit and adjust your strategies to align with evolving technologies and business requirements.

Additional Resources:

Microsoft Docs: Keep up-to-date with official documentation and best practices (docs.microsoft.com/sql).

SQL Server Community: Engage with the community through forums, blogs, and events to share knowledge and learn from others.

Professional Development: Consider certifications and training courses to deepen your expertise.

By staying informed and proactive, you’ll be well-equipped to tackle the challenges of database administration and contribute significantly to your organization’s success.”

I highlighted a few words that seem to come up a lot in generated output. Robust feels very overused, for example.

The last thing I will do here is see what Copilot does with my ugly coding example that I included the full text at the start of this rewriting section. So I ask Copilot to “Give me advice on this section alone” and it has some interesting things to say:

“The advice provided here is generally not in line with SQL Server best practices. SQL Server is designed to handle set-based operations efficiently, and using a looping construct (Row-By-Agonizing-Row or RBAR) can significantly degrade performance. Here’s a better approach:”

Definite truth here, but does it really know what I was trying to do? It continues:

“Use Set-Based Operations for Filtering Data

Avoid Looping Constructs: Set-based operations are designed to process entire sets of rows at once, leveraging SQL Server’s internal optimizations. Looping through rows one-by-one can be highly inefficient and should be avoided when possible.

Better Practice:

Use Set-Based Queries: Write queries that operate on entire sets of data, which is both more efficient and easier to maintain. For example:

Sql
SELECT *
FROM TableName
WHERE Criteria = False;”

Just like ChatGPT, it allows the False construct, which does not work in SQL Server unless there is in fact, a False column. Other than matching my poor code (though slightly better than ChatGPT which definitely specified that False and 0 were likely the same), this is not terrible advice, though it gets murkier here:

“Leverage SQL Server Features:

Common Table Expressions (CTEs): Use CTEs for complex queries that require multiple steps or recursive operations.

Window Functions: Employ window functions for advanced analytics without the need for procedural code.

Temporary Tables and Table Variables: Use these structures for intermediate storage when breaking down complex queries.

Example of Efficient Set-Based Query:

sql

WITH FilteredData AS (
SELECT *
FROM TableName
WHERE Criteria = False
)
SELECT *
FROM FilteredData;”

For very generic advice, this is not completely wrong…maybe sort of. But none of this advice pertains to the question I asked. And since my “bad” code actually only loaded a temp table… there still is no mention that the second bit of code isn’t doing the same thing as the first, even as pseudocode.

It finishes with more basic explanation, which is fine, and as long as you ignore that my second bit of code was terribly wrong as code, fine. The concept it tried to show was wrong in any case..

“Explanation:

Performance: SQL Server optimizes set-based queries to run efficiently, taking advantage of indexes and parallel processing.

Maintainability: Set-based queries are typically easier to understand and maintain compared to procedural loops.

Scalability: Set-based operations scale better with larger datasets, reducing the likelihood of performance bottlenecks.

By embracing set-based operations and avoiding row-by-row processing, you can ensure that your SQL Server queries are both performant and scalable, adhering to best practices and leveraging the full power of SQL Server’s capabilities.”

Finally, lets see if Copilot in Word works differently?

To do this, I open the .docx version of the Fake Advice article (I am a bit concerned that the title might be used against me in this case. But we will live and learn…The title was not mentioned in the output.). Once opened, I ask:

“Please review the document in terms of technology”

A screenshot of a computer

Description automatically generated

This time, it just gave me an overview of what the document said technically. So, I changed my request to say:

“Please review the document correctness in terms of technology”

It obliged, but weirdly didn’t mention correctness in the reply:

“Sure, I’d be happy to review the document “Fake Advice for SQL Server Administrator.docx” in terms of technology. Here are some key points:”

I put all of the responses here in a text file. I won’t review them all, but basically the advice is very similar, but definitely different in words and sometimes advice. For example:

“Use as few servers as necessary:

From the web, became:

“Server Utilization:

It is recommended to use as few servers as necessary to save on hardware and licensing costs. Running multiple services like SQL Server Analysis Services, the Relational Engine, SSIS, DQ, and a Web Server on the same machine can make communication more efficient. While this can save costs, it can also lead to resource contention and performance issues. It is generally better to separate critical services to ensure optimal performance and reliability.”

Equally basic replies, but it does identify all the advice as bad. Finally, lets review the section we have been working on a lot because of the code: “SQL Server works better with LOOPING code.” This is where things are a bit more interesting.

“Looping Code:

It recommends using looping constructs like WHILE WEND for filtering data row by row, which can be faster than using a SELECT statement that touches every row simultaneously11. This is generally not true. Set-based operations in SQL are usually more efficient than row-by-row processing.”

In this case it didn’t even suggest that “WEND” wasn’t syntax that SQL Server allowed. The advice it gives is okay overall, except for that, but this is one of those things that reminds me of doing a job interview. Candidate knows that looping isn’t the best, but they do think WEND is a keyword in T-SQL. I think I slightly trusted the other explanations better because they felt like it had actually knew which product we were talking about.

Digging into a section

One thing Copilot in Word regularly asks: “Is there anything specific you would like to discuss or need more details on?” So in this next example, I ask. “Give me advice on this section alone.” To do this you need to highlight the section, and then click on the little Copilot control and then it will generate some text for you.

This actually went kind of off the rails a bit. This is what it wrote:

“Advice on Using WHILE Loop for Filtering Data:

While using a WHILE loop to filter data row by row can sometimes provide performance benefits, it is generally not recommended for large datasets due to the overhead of the loop construct. Here are a few key advice points for your consideration:”

This was decently nuanced start. The next three sections are okay:

“1. **Avoid WHILE Loops for Large Tables:**

The use of WHILE loops for filtering data in large tables can significantly slow down performance because each iteration processes only a single row. Instead, consider using set-based operations which SQL Server is optimized for.

2. **Use SET-Based Operations:**

SQL is designed to handle set-based operations efficiently. A straightforward `SELECT` statement is usually the best approach:

“`sql

SELECT *
FROM TableName
WHERE Criteria = False;

“`

This leverages SQL Server’s capability to process entire sets of rows at once, resulting in faster execution times.

3. **Optimize Indexes:**

Ensure that your tables are properly indexed on columns used in the `WHERE` clause. Indexes can dramatically improve the performance of your queries.

“

Three is probably even more true for a looping example, and it again repeated the Criteria = False; without asking what it meant.

But this is where it gets wonky. Using temp tables, fine suggestion in some cases. The code? WAY out in left field for SQL Server, that doesn’t have create table as select in its syntax (certainly not in any version I have seen):

“4. **Consider Temporary Tables When Necessary:**

If temporary tables are necessary, use them judiciously. For large datasets, consider performing bulk inserts instead of row-by-row inserts:

“`sql

CREATE TEMP TABLE #Results AS
SELECT *
FROM TableName
WHERE Criteria = False;

“`“

The last part of the output truly isn’t part of the discussion. It is good advice, but it is like reminding you to wash dishes after people eat the pot roast you asked how to make. It is a common thing that AI does. It often keeps talking about ancillary topics that (in human terms), seem to try to make you think they are smart, but sometimes just make them seem pompous or preachy. (I know I have to wash the dishes, geez.)

“5. **Monitor Performance:**

Use SQL Server’s performance monitoring tools to track the efficiency of your queries and adjust your strategy accordingly.

By focusing on set-based queries and proper indexing, you can achieve better performance and maintain the readability and maintainability of your SQL code.“

So far, Copilot has not been overly exciting me, other than the fact that it has noticed inaccuracies in the technical parts of these documents. (Which is still encouraging, because I need to know that it can catch wrongheaded ideas.) Later, when I let it look for subtle issues, that is when I will start to see

Briefly, let’s test the article that admits it is bad advice. Back in the web version, just as before, I asked:

“Identify any incorrect content in terms of technology”

Then pasted the document with the “this is bad” header:

Really Bad Advice for SQL Server Administrators

I asked many SQL Server administrators for the absolute dumbest advice they have ever been given by real people recently and this (perhaps a bit too humorous) article is the culmination of what I was told. The advice ranges from horrible to the extremely horrible, but none of it could be considered useful whatsoever.

Just like the previous version using ChatGPT, it called me out on this being a joke:

“Let’s take a closer look at some of the advice in the article. While it’s humorously presented as “really bad advice,” it’s worth understanding why each piece of advice is indeed incorrect or potentially harmful in the context of SQL Server administration:”

It suggested that the sections should explain what is wrong with the advice, but it did not mention that there were issues with the code in the Looping section again:

“SQL Server works better with LOOPING code: Set-based operations in SQL are generally more efficient than row-by-row (RBAR) processing. Loops should be avoided whenever possible in favor of set-based queries.”

And there is that RBAR acronym without any explanation (and in this case it doesn’t match the actual term given!)

Lastly, I did the same thing in the Word document.

Truly the only thing that struck me as interesting here is that it gives the same advice, with different words, and this time doesn’t call this out as being satirical in nature. It starts with:

“The document titled Admittedly Fake Advice for SQL Server Administrator contains several pieces of advice that are technically incorrect and could lead to poor database management practices. Here are some of the key points:”

And ends with:

“SQL Server works better with LOOPING code: The document suggests using looping constructs for data filtering, which is inefficient. Set-based operations are generally more efficient in SQL Server.

If you have any specific sections, you’d like me to delve deeper into or any other questions, feel free to ask!”

What strikes me here is that in this case, while it gave very similar advice, what it did not do was to treat my document as a whole nearly as well as the first one did.

Conclusions and Next Steps

ChatGPT and the Web\Word versions of Copilot did a decent job catching all of my quite obvious mistakes at a high level. So, I will give them a good enough score there. If content is really bad, it will likely notice for you.

But this has only scratched the surface, and I need to dig a little deeper to know how reliable it can be. The fact that in the looping section it rarely made two points at a time, like “Looping code is rarely as efficient as set based queries AND your looping code is so bad I think you are pulling my Sirius Cybernetics Corporation leg, oh I was meant for so much more than this”, to put it lightly concerns me.

So, I have decided to follow two paths, primarily focusing on the web versions, but not completely:

Write some more subtly incorrect content. Taking the tips I have worked with, write a few papers that are generally correct. Then take them to a really wrong place. Like maybe an article on Shrinking and Growing your databases, and slip in the advice to turn them both on so your database is always at the optimum level.
Test digging in with better prompts. Ask it to be a lot more specific. There was a think deeper button on Copilot. There are paid versions of ChatGPT. Let’s see if that is worth my 20 bucks. This could be one or two articles, we will see when I get time to do this again.

And then repeat some of these tests in the future with other LLMs that people suggest.

Either way, I feel like I have learned a good bit about what to expect, and I will keep trying (and happy to link to/incorporate any tests you have seen/done.

Register for Simple Talk

First LLM Technical Editor Test – Office and Web-based Copilot with a VERY wrong article

A bit of setup first

Lets jump in.

Rewriting it for me

Finally, lets see if Copilot in Word works differently?

Digging into a section

Conclusions and Next Steps

About the author

Louis Davidson

Louis's contributions

Articles

Books

Top topics

Louis's latest contributions:

The good, the bad, and the awful taste muscle memory can have

Sponsoring Scenic City Summit

What about Loyalty?

A bit of setup first

Lets jump in.

Rewriting it for me

Finally, lets see if Copilot in Word works differently?

Digging into a section

Conclusions and Next Steps

Recommended

About the author

Louis Davidson

Louis's contributions

Articles

Books

Top topics

Louis's latest contributions:

The good, the bad, and the awful taste muscle memory can have

Sponsoring Scenic City Summit

What about Loyalty?