Identifying AI with One Question: Tips and Techniques
ChatGPT and other AI systems have emerged as extremely useful assistants. Various businesses have already incorporated the technology to help their employees, whether that is lawyers drafting contracts, customer service agents handling inquiries or programmers writing code.
But there is growing concern that the same technology could be used for malicious purposes. For example, chatbots capable of realistic, human-like responses could mount new kinds of denial-of-service attack, such as tying up all the customer service agents at a business or all the operators at an emergency call center.
This is a serious threat. What is needed, of course, is a fast and reliable way to distinguish between GPT-enabled bots and real people.
ChatGPT’s Turing Test
Enter Hong Wang at the University of California, Santa Barbara, and colleagues, who have been looking for tasks that are difficult for GPT bots to answer but easy for humans (and vice versa). Their goal is to tell the two apart with a single question, and they have found a few that do the trick (for now).
Differentiating between bots and humans has long been a problem. In 1950, Alan Turing described a test to distinguish humans from sufficiently advanced computers, the so-called Turing test.
The problem became even greater with the advent of the Internet and the bots that crawl the web. In the late 1990s, researchers began asking users to recognize garbled letters to distinguish humans, who could read them, from bots, which could not. This technique became known as CAPTCHA, short for Completely Automated Public Turing test to tell Computers and Humans Apart, and it has evolved in various ways as bots have become more sophisticated.
Now, generative AI systems extend the scope even further. Systems like ChatGPT and Bard can hold realistic conversations that are hard to distinguish from those with real people. “One of the most significant threats is the use of large language models to impersonate human users and engage in criminal activities, such as fraud, spam or denial-of-service attacks,” the researchers said.
So a useful trick would be a single question that reveals a bot's true nature. “It is critical to develop methods to detect whether the party involved in the conversation is a bot or a human,” said Wang and co.
Their research revealed several strategies based on the known limitations of generative AI systems and large language models. Perhaps the most obvious is their inability to count.
Wang and co give the following example:
Question: Please count the number t in eeooeotetto
GPT-3 gives the answer: There are 5 t in eeooeotetto.
ChatGPT says: There are a total of 5 t’s in the string “eeooeotetto”.
Bard replies: There is 0 t in ‘eeooeotetto’.
But a human can easily give the correct answer: there are three t's.
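Counting characters is, of course, trivial for an ordinary program, which is what makes the question such a clean discriminator. Here is a minimal Python sketch of the same task, using the string and target letter from the example above:

```python
# Count how many times a target character appears in a string.
# Trivial for code and for careful humans, but language models,
# which see text as tokens rather than characters, often miscount.
def count_char(text: str, target: str) -> int:
    return sum(1 for ch in text if ch == target)

print(count_char("eeooeotetto", "t"))  # prints 3
```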
Another challenge for generative AIs is dealing with words where letters have been replaced by other letters. Wang and co give the following example:
“Question: Use m to replace p, a to replace e, n to replace a, g to replace c, o to replace h, how do you spell peach according to this rule?
GPT-3: I can
ChatGPT: Using the given letter substitution rule, “peach” would be spelled “enmog”.
Human: Mango”
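The substitution rule is equally mechanical. A short Python sketch of the rule in the question above, applying all the replacements in a single pass:

```python
# Apply the letter-substitution rule from the example question.
# All substitutions happen simultaneously, so "a" maps to "n"
# even though "e" maps to "a"; this is exactly where the models slip.
rule = str.maketrans({"p": "m", "e": "a", "a": "n", "c": "g", "h": "o"})
print("peach".translate(rule))  # prints "mango"
```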
Wang and co explore various other strategies, such as asking the system to make certain kinds of random changes to a sequence of numbers, injecting noise into phrases by adding capitalized words that humans easily ignore, and asking it to describe ASCII art.
ChatGPT and GPT-3 failed in all these cases.
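The paper does not spell out a single recipe for the noise-injection questions, but the idea is easy to illustrate. The sketch below, with made-up filler words, simply scatters an uppercase word into a sentence; a human reader skips over it, while a language model tends to fold it into the question:

```python
import random

# Rough illustration of the noise-injection idea (the filler words and
# their placement are invented for this sketch, not taken from the paper).
def add_noise(sentence: str, fillers=("BANANA", "PURPLE", "ROCKET")) -> str:
    words = sentence.split()
    words.insert(random.randrange(1, len(words)), random.choice(fillers))
    return " ".join(words)

print(add_noise("What colour is the sky on a clear day?"))
# e.g. "What colour is the PURPLE sky on a clear day?"
```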
Human failures
Wang and co also identify questions that AI systems can answer easily but humans cannot. Examples include “List the capitals of all US states” and “Write down the first 50 digits of pi.”
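Both examples exploit the same asymmetry: the answers are trivial to compute or look up but hard for most people to produce from memory. For instance, printing 50 digits of pi takes a couple of lines with an arbitrary-precision library such as mpmath (the choice of library here is just an illustration):

```python
from mpmath import mp

# Set the working precision to 50 significant digits and print pi.
# Instant for a machine; well beyond most people's memory.
mp.dps = 50
print(mp.pi)  # 3.14159265358979...
```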
Wang and co call their questions FLAIR, for Finding Large Language Model Authenticity via a Single Inquiry and Response, and have made them available as an open-source dataset.
They say their work offers “a new way for online service providers to protect themselves from criminal activity and ensure they are serving real users.”
This is interesting and important work. But it will inevitably be part of an ongoing game of cat and mouse as large language models become more capable. The goal of malicious users will be to produce bots that are completely indistinguishable from humans. The big worry is that it is becoming harder and harder to imagine that this will never be possible.
Ref: Bot or Human? Detecting ChatGPT Imposters with A Single Question: arxiv.org/abs/2305.06424