How history gets vandalized.
IT MAY SEEM that this article is a big departure from my usual beat, but it really isn't. As a passionate student of local history, I have spent countless hours and many years chasing down primary sources in books, archives, and online articles to chronicle the colonial history of the Ottawa Valley.
In that pursuit, I have discovered instances where some of the stories passed down to us were the product of either misinformation or disinformation. [1] But in the past couple of weeks, I encountered, for the first time, two works produced by Artificial Intelligence (AI) writers, such as the very popular ChatGPT and the lesser-known Google Bard. It looks like we local chroniclers - and maybe all historians - will have our work cut out for us in the future.
So, what is an AI writer, you may ask? This, from the Techradar Pro website:
"An AI writer is a tool that relies on artificial intelligence to automatically generate written content." [2]
Remember "Garbage in - garbage out"?
BUT how do AI writers get their content? Are they mining the internet like a simple search engine? No, they are not. According to Scribbr, [3]
"ChatGPT is an AI language model that was trained on a large body of text from a variety of sources (e.g., Wikipedia, books, news articles, scientific journals). The dataset only went up to 2021, meaning that it lacks information on more recent events. It’s also important to understand that ChatGPT doesn’t access a database of facts to answer your questions. Instead, its responses are based on patterns that it saw in the training data."
At this time, AI writers are readily accessible to the (largely) uninformed public as an online tool for writing text on ANY subject. The sudden advantage to the great unwashed is that, with little knowledge and equally little talent, they can create thoroughly readable articles about history in minutes and publish them online without having to bother with tedious research or sourcing.
On two occasions recently, I encountered several blatantly error-filled articles about our local history published on popular Facebook pages. These pages are devoted to our regional history and are read by thousands of subscribers.
One such author had written a whole series of nicely packaged vignettes and posted them to several Facebook pages. Another author had posted a lengthy reply to a Facebook post.
I messaged both authors to open a polite conversation about sources and intent. That is when I learnt that they had used an AI writer to generate the posts.
The first author explained: "We have been posting daily, light-hearted fun posts about a multitude of information, including local history, building this content with the help of artificial intelligence. The goal of these posts is to bring a short form, interesting read to viewers while also informing them about local history. We are always working to correct information and have never claimed to be a historical page nor a fact page, but a marketing page."
A marketing page!
So, for the purpose of selling their services as web content marketers, the author used an AI writer to invent history because, he writes, "we know that most of the historical data before 1900’s either burned or isn’t cited a lot which makes it hard for the AI to grab the accurate information."
After much back and forth, the author acknowledged that history needs to be fact-based and that AI cannot do that reliably. He then, quite reasonably, decided to take down the historical capsules and resolved to leave history to the historians in the future.
The second author had replied to a post about the history and mystery surrounding a local lake. He had referenced the very dubious facts contained in a short online narrative, and those claims were challenged. He later replied by posting a few error-filled paragraphs on the "full" history of the lake, but when asked about sources, he said he had simply used ChatGPT to do the research. ChatGPT couldn't even get the location of the lake right. He had obviously misunderstood the nature of an AI writer.
Both authors immediately came clean, and neither had imagined that an AI writer could not distinguish between source-based history and fiction. In fact, not being historians, they had never considered the possibility that AI writers might be drawing on garbage articles.
Garbage in - Garbage out!
The Ramifications
MANY will say that the horse has left the barn and there is no closing the barn doors now, but I disagree. As with the internet itself, consumers have found means to shield the innocent from harm - imperfect and mild as those protections may be - and to wage a counteroffensive against those who use the technology maliciously.
Many say how useful AI writers will be for student essays, perhaps not realizing the garbage-in, garbage-out principle involved.
It will take better minds than mine to fix the problem, but ways will have to be found to ensure that ALL AI-generated articles are tagged as such, so that publishers and consumers know what they are reading. Otherwise, it is going to be a nightmare trying to figure out what is true and what isn't.
[1] The difference between the two terms: misinformation is false or inaccurate information arising from rumours, myths, hearsay, insults, and pranks, and can result simply from ignorance. Disinformation is deliberate and includes malicious content such as hoaxes and propaganda meant to distort reality, support a political agenda, or spread fear among the population.
[3] Jack Caulfield, Scribbr, February 17, 2023. https://www.scribbr.com/ai-tools/is-chatgpt-trustworthy/