An editor’s biggest headache: Plagiarism

I have worked as an editor of other people’s technical writing for many years, and you might guess that either language skills or technical abilities would be the largest issue I deal with. Both provide plenty of challenges, but the biggest issue is plagiarism in one shape or another. (Spelling the word plagiarism is kind of fun, too.)

Pretty much anyone who spent time in grade school knows a little bit about the concept of plagiarism, because we were clearly instructed never to copy someone else’s work directly, whether from a schoolmate or a website. (For those of you of a certain age, replace website with the Encyclopaedia Britannica or perhaps Compton’s and it may ring a few more bells.) Plagiarism, however, is more than just copying something word for word and putting it in a term paper. In this article, I want to discuss the depth and breadth of what it really means, and how I try to avoid it in my own writing and when reviewing other people’s work.

I want to state that none of this was taken directly from any specific case, but from works I have come across over many years. I will also note that the most egregious cases are extremely rare, but that is what makes this the biggest headache. It is the rare cases that are so hard to find, and for every article we work on or publish, it is essential to reduce the chances that the work is based on any sort of plagiarism.

Plagiarism: What is it?

The definition of plagiarism from dictionary.com is:

an act or instance of using or closely imitating the language and thoughts of another author without authorization and the representation of that author’s work as one’s own, as by not crediting the original author

What a lot of people don’t realize is that copying other people’s unique thoughts and phrasing is plagiarism too. This is where things get really complicated, because there is a fine line between something that is general knowledge, common idiom, or even public domain, and something that you heard one person say and then repeated as if it were your own. It is wrong even if what was said was not exactly special in and of itself.

I asked my wife, who has a doctorate in education, what academics say about plagiarism. She told me there are two schools of thought among academics. Some say anything you use must be either your own or specifically attributed to the originator. Others say that if you have permission to use something, that is enough.

In the trade press, which most blogs and articles fall into, we are typically closer to the latter. It is rare that you need to attribute every concept you share, because most trade writing takes well-known topics and puts a new spin on them to help others understand. However, it is still good practice to attribute your sources when you are writing an article and get ideas from other sources. This goes for code samples, definitions, and even graphics that you reproduce.

Types of Plagiarism

As I was researching this topic and doing a plagiarism check on my document, I found that this website, https://library.uhv.edu/plagiarisms, had some interesting definitions. In fact, I discovered this website because my plagiarism check found that my text overlapped with this site, as we had both used the same definition of plagiarism from dictionary.com. Perhaps ironically, these very same plagiarism concepts are repeated on many different websites as well, sometimes with slightly different names and organization, and rarely with attribution of where they got them from.

Deliberate versus accidental plagiarism. Deliberate is obvious, but sometimes your text or formulas will be very close to others’ by accident. For example, if you include a well-known formula like E = mc^2 (Einstein’s mass-energy equivalence formula), you are likely to exactly match the text of others.

Sometimes you just happen to use the same words to express the same thing, “It was a dark and stormy night” for example. This is more concerning when it involves specific facts and formulas, or when it happens suspiciously often in a document. It is rarer than you would expect for a document you write to overlap with a lot of other writers accidentally.

Note: I won’t be discussing AI in this blog, but I am becoming more and more convinced that AI plays a part in a kind of hybrid deliberate/accidental plagiarism that is going to continue to be more and more trouble for editors.

Beyond how plagiarism happens, you can break down plagiarism into different types:

Global plagiarism – Just flat slapping your name on someone else’s complete work. This is actually quite common in trade publishing, where the monetary value of people’s work is pretty low. In fact, if you write blogs or books, don’t be surprised at how often your work shows up elsewhere.

Verbatim plagiarism – Very similar to the global version, but typically stealing some text and just using it in your own work. For example, if you are describing the CREATE PROCEDURE statement from SQL Server, you could just go to the product documentation, copy some of the text, and treat it as yours. This is still considered intolerable in trade writing without attribution. With attribution, it can be effective for a paragraph or a table of information that you don’t want to repeat (it is also a good practice because it lets readers know of more resources).

Paraphrased plagiarism – Taking someone else’s text and just rewording it. If you do this, it is best to state where the source material came from, and ideally you are clarifying or adding to it. Just keep in mind it is always good practice to direct people to other sources that you have used when writing about a topic (even if that source is written better than your work, it elevates your work and gives the reader more information).

Self plagiarism – It is commonly understood that you can’t really plagiarize yourself as long as you own the rights. However, if you are supposed to be providing new material, and you use old material, this is not a good situation either. If you are referencing old material, that should be clear.

For more reading, the uhv.edu website references https://www.scribbr.com/plagiarism/types-of-plagiarism, which has some additional excellent examples.

But really, what is Plagiarism in content that sites like Simple-Talk put out?

I think we can all agree that if you take the ideas, thoughts, or words of an individual source directly with no other sources, that is a big issue. Copying a few sentences from online documentation, or from other blogs is typically okay, as long as you attribute.

The real difficulty lies in deciding what is someone else’s work and what is general knowledge or simple, straightforward fact. For example, say you are writing a blog about something very fast and want to include the speed of light. This value is 299,792,458 meters/second, but I clearly don’t know this by heart. I looked it up on Google, and it came up right in the interface at the following link: https://www.google.com/search?q=speed+of+light.

[Image: screenshot of the Google search result for the speed of light]

Do I need to state that this is where I got this value from when that value is most definitely not one that Google originated? Maybe, but not necessarily. Looking around, you can see that value repeated on many websites, and I expect if I cracked one of my old physics textbooks, it would say the same thing (or perhaps it would quote it in miles/second, since I am an American and we don’t always go in for that metric stuff).

The thing is, in trade-level writing, we are all generally aware that nothing we are saying is genuinely our thoughts. I did not come up with relational databases, I didn’t come up with B-tree indexes and their structures, I wasn’t the first to say that an index will make a query faster, and that too many indexes may have an overall negative effect. I didn’t come up with 99.99% of the algorithms I have used in my life and even more so, there are only a few of those algorithms that I even somewhat know where they came from originally!

My only truly unique thought was coming up with a way to implement relative positioning using a calendar table. If I ever see someone else doing the same thing, I would hope they say thank you to me, although they may have come up with the same idea independently. Either way, I sort of think of things the way Elvis Costello once talked about other musicians. Everyone copies each other somewhat. There are only so many ways to solve a problem, like there are only so many ways to put chords together to form music. But there are clearly lines that we don’t cross.

The general rules that I want the people I edit to follow are as follows.

  • Consider where you learned something. Make sure you know that what you are sharing is in your own words, representing your ideas, or generally known concepts. In this case you probably don’t need to attribute where it came from, but if you can it does not hurt. If you learned something in a class or read it in a book, make sure you are not sharing one person’s specific ideas in a way that makes it sound like your own.
  • If you find something interesting that a person says and you want to use it in your writing, give them credit and you are safe and done. This includes concepts, formulas, everything. If you don’t see it as a generally understood concept, attribute it to the first place you read it. If you get permission to use the concept, that is okay, but put it in your own words. I have said this before, but it is important: attributing does not make you look any less smart, but stealing words and ideas makes you look terrible.
  • If you copy someone’s text directly…attribute it. Always. Always. Even if it is acceptable to copy, and even if you have permission. Give credit unless you specifically are told not to under some penalty. And consider noting that you were asked not to attribute if so.
  • If in doubt, give credit. If in doubt and you don’t know who to attribute, consider not using that piece of information.

Discovering Plagiarism

As time passes, it is more and more difficult to plagiarize and get away with it. In this section, I want to cover some of the methods I use to find plagiarism in writing I have worked with.

Tone and quality

This is basically the method that teachers have used for centuries: watching for a change of tone. Let’s be honest here…most students have copied information from a source from time to time, and the first time probably earned you a mild rebuke from the teacher warning you of how wrong that was. As an example, can you spot the issue in this next paragraph where a student is defining a database?

You know, it is like when you want to store information and stuff, you use an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spans formal techniques and practical considerations. It is like, so important to like keep it right, you know.

If it wasn’t already glaringly obvious, I took the words between “you use” and the last sentence directly from https://en.wikipedia.org/wiki/Database.

These changes in tone and quality have always been a flashing red flag that something is wrong. Luckily, using a search engine you can often enter a portion of the text and find where it came from, even if it turns out to be somewhat paraphrased.

Tools

Thanks to the miracle of modern technology, we now have tools that we can paste a document into to check for copied work. For example, I ran the text that preceded this section through Grammarly’s checker, and it found 13% of it was plagiarized (because of the text that is attributed; Grammarly doesn’t know you attributed it!). It isn’t perfect, because it can’t check all the books and publications that are not freely available on the internet, but it does a decent job of catching common cases.

Some things are interesting. For example, the definition of plagiarism that I got from dictionary.com was found at a location other than dictionary.com (image pulled from Grammarly’s plagiarism check):

[Image: screenshot from Grammarly’s plagiarism check showing the matched source]

That site also references dictionary.com (it is https://library.uhv.edu/plagiarism, and it has some more interesting details on plagiarism that I did not cover), so we are basically drawing on the same source.

Some things the tool finds are not deliberate plagiarism. For example, it found the following duplication:

[Image: screenshot of the duplicated phrase flagged by the plagiarism check]

You will frequently find phrases in your writing that match someone else’s and just can’t be helped. If the phrase is really specific, it might be classifiable as accidental plagiarism and would be something to rephrase.

However, if you find too many of these phrases from the same source, it is absolutely time to look deeper and see if any additional tools were used to cobble together your documents.
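To make the idea of "too many phrases from the same source" a little more concrete, here is a minimal sketch of one simple way to measure overlap: count how many of a draft's five-word phrases also appear in a candidate source. This is only an illustration under my own assumptions. It is not how Grammarly or any commercial checker works, and the function names (`shingles`, `overlap_ratio`) and the five-word window are hypothetical choices.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(draft, source, n=5):
    """Fraction of the draft's n-word sequences that also appear in source."""
    draft_shingles = shingles(draft, n)
    if not draft_shingles:
        # Draft is shorter than n words, so there is nothing to compare.
        return 0.0
    shared = draft_shingles & shingles(source, n)
    return len(shared) / len(draft_shingles)


# Hypothetical example: a draft sentence that borrows from a source sentence.
draft = "Small databases can be stored on a file system, you know."
source = ("Small databases can be stored on a file system, while large "
          "databases are hosted on computer clusters or cloud storage.")
print(f"{overlap_ratio(draft, source):.0%} of the draft's phrases match the source")
```

A single shared phrase will barely move this number, which fits the point above: one match can be coincidence, but a high ratio against one source is a reason to look deeper. Paraphrased text would slip past a check this simple.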

Mistakes

Finally, this is kind of the worst-case scenario. You are again writing an article about the speed of light. You go to a source and find that the speed of light is 199,792,458 m/s, and you use that value, happily continuing your writing. Now your fact checker checks that value, and when they do a search on it, it turns out that the exact incorrect statement is used by another website. This not only shows you used their information directly without checking around, but that you didn’t know it was wrong.

In Conclusion, Just Make Sure Your Work Is Yours

That says it all. So, I will stop right there.