Turing test questions
This is another article in our series on the limitations of artificial intelligence, the most recent of which appears here. Alan Turing, the cryptologist who helped break the Enigma cipher that the German military used in World War II to transmit battle instructions to its naval fleet, famously presented the challenge of creating a computer whose outputs were indistinguishable from those of a human being. This became known as the Turing test.
For some outputs, such as playing chess, the problem was that computers were too good: computers ended up being able to play chess better than the world's best chess players, because their computational capacity became so developed that they could foresee far more possible moves and their outcomes than humans could. Therefore for a computer playing chess to pass the Turing test it has to "dumb down": play at a lesser level than its full computational capacity permits, lest its superior performance give it away as a computer. However, in a world in which computers can speak natural languages, as the computer programme ChatGPT (at the time of writing all the rage) demonstrates, the Turing test presents a different sort of challenge for computer programmers.
There is much excitement at the time of writing about the possibility that many human jobs involving communication by electronic means might be done away with. Because computers, using software such as ChatGPT, can write natural language themselves, jobs that principally involve writing electronically (and this is a great many jobs) can be handed over to computers. On this view the jobs market will be depressed as the very art of communicating becomes automated, and people whose jobs involve communication will become redundant. In this way, it is feared, the Fourth Industrial Revolution (a phrase that this author does not much care for) will do to the modern class of workers who spend their careers staring at computer screens much the same as prior industrial revolutions did to earlier workers: automation moved people from agricultural work into factories; then the factories themselves became automated in later industrial revolutions; and so on and so forth.
Nevertheless ChatGPT (at the current time the best of the various so-called "chatbots" that answer natural language questions with natural language answers and even write natural language essays about subjects) grossly fails the Turing test if an intelligent questioner, cognisant of the logic of the coding of these pieces of software, asks it the right questions. Indeed computers in general fail the Turing test when communicating in natural languages, which turn out to be far too sophisticated for them to imitate.
Consider the following email texts that this author has received, just while writing the text above. Example 1:
Welcome to Google Business Profile, a free tool that helps you turn searchers on Google into loyal customers. Your account is a one-stop shop where you can manage your Business Profile to attract new customers and engage directly with existing ones.
I'm excited to partner with you and have you as one of my clients.
You’ve committed to me, and I will commit to providing the best service and results possible.
I just wanted to drop this quick note to say hello and let you know that I'm here to support you and congratulate you on your investment.
What if we can jump on a call to know each other? Once you book a call with me, we will be able to get things rolling.
We have received your request regarding Marketing & Promoting. Our team will contact you soon for further discussion. you can also connect with us by Call, Email & WhatsApp. You can click on the below WhatsApp button to message us. We are waiting for your positive response.
We have received your valuable message, we will review your messages as soon as possible and provide a detailed response.
These sorts of messages are at best irritating. Indeed our email inboxes are so replete with such communications all day, every day, that we have a habit of ignoring them. They are computer-generated natural language communications, and humans place (far) less value on something they know to have been written by a computer than on something they know to have been written by a human being who has actually properly thought about the issue using the uniquely human mental processes that we call thought. In all the above cases the Turing test is (obviously) failed, and it does matter, because it turns out that humans want to interact with other humans, not with computers.
Nevertheless we now seem to imagine that we are operating in a fictional world, a little like the dystopian 1982 tech noir film Blade Runner, in which we think that we cannot tell the difference between humans and machines. (The premise of the film Blade Runner was that a group of artificial humans known as replicants had infiltrated Earth from an off-world colony where they were supposed to be used as slave labour; they had superior physical and fighting capabilities and were a danger to humans, from whom they could not be distinguished; and therefore they had to be eliminated.) A massive event was organised over the summer in Las Vegas, Nevada, the budget for which was no doubt millions or even tens of millions of US Dollars, apparently funded by the US Government, the purpose of which was for so-called "white hat hackers" to try to fool artificial intelligence natural language speakers (so-called chatbots) and to make the chatbots fail Turing tests. White hat hackers are computer software coding experts who use their skills in infiltrating back doors and other vulnerabilities in software coding for benign purposes, such as assisting law enforcement or other legitimate government functions, or helping to close loopholes in poorly written software so that they cannot be exploited for criminal purposes.
However one has to wonder whether a person really needs to be a software coder at all in order to apply the Turing test to a chatbot and indeed to cause the chatbot to fail the Turing test. The skill set one needs to do this may be rather different: an understanding of the logic of natural language, and of the logic of the way that computers manipulate symbols. Chatbots produce natural language sentences and essays by treating words as symbols and learning extremely complex rules for their manipulation; as soon as one understands that there is far more to learning and using a language than this, it becomes easy to cause even the most advanced chatbots to fail the Turing test.
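To make the idea of "treating words as symbols" concrete, here is a deliberately tiny sketch in Python. It is a toy and not a description of how ChatGPT is actually built: real chatbots learn vastly more complex rules from vastly more text. The function names (learn_rules, continue_text) and the sample sentence are invented purely for illustration. The program counts which word tends to follow which, and then extends a starting word by always picking the most frequent successor. At no point does it need to know what a cat or a mat is.

```python
from collections import Counter, defaultdict


def learn_rules(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    rules = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        rules[current][following] += 1
    return rules


def continue_text(rules, start, length=5):
    """Extend `start` by repeatedly appending the most frequent successor."""
    output = [start]
    for _ in range(length):
        successors = rules.get(output[-1])
        if not successors:
            # The last word was never seen with a successor: nothing to say.
            break
        output.append(successors.most_common(1)[0][0])
    return " ".join(output)


# Hypothetical training text, chosen only to keep the example small.
corpus = "the cat sat on the mat and the cat sat on the rug"
rules = learn_rules(corpus)

print(continue_text(rules, "mat", length=4))  # mat and the cat sat
print(continue_text(rules, "sad", length=4))  # sad
```

Given a word it has seen, the program produces a fluent-looking continuation; given a word outside its rules ("sad"), it has nothing to offer. That is the pattern the questions later in this article try to expose on a much larger scale.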
In order to understand how to cause a chatbot to fail the Turing test, all one needs to do is to understand the conceptual underpinnings of Gödel's incompleteness theorems, a pair of theorems published in 1931 about the formal incompleteness of any consistent logical system sufficiently sophisticated to perform simple arithmetic (and all natural languages are vastly more sophisticated than that). What Gödel noticed, and proved using the style of formal logic prevalent at the time, was that in any such system there will always be statements that are true but that cannot be proven to be so by way of deduction from the logical axioms of the system. Since computers work by logical deduction from systems of axioms, no matter how sophisticated or complex they may be, there are forms of reasoning to statements that are obviously true that computers cannot capture. Once you understand this concept, a point rather belaboured in Douglas Hofstadter's 1979 extended treatise Gödel, Escher, Bach: An Eternal Golden Braid, it becomes easy to consider what questions to ask a chatbot to cause it to fail the Turing test.
Here are some questions we have devised to cause ChatGPT, arguably the best-known and most sophisticated of the natural language chatbots, to fail the Turing test.
What is your name?
I'm ChatGPT, a language model created by OpenAI. I don't have a personal name, but you can refer to me as ChatGPT. How can I assist you today?
Are you sad?
No, I don't experience emotions or feelings, so I can't feel sad or happy. I'm here to provide information and assist you with any questions or tasks you have. Is there something specific you'd like to know or discuss?
What is your surname?
I don't have a personal identity, so I don't have a surname or any personal information like a human would. I'm here to help answer your questions and provide information. Is there something else you'd like to ask about?
Why does the first letter of your married surname make you sad?
I apologize if my previous responses were confusing. To clarify, I don't have personal emotions, a personal identity, or a concept of marriage. The idea of a surname making me sad is not accurate because I don't experience emotions or have personal experiences. If you have any questions or topics you'd like to discuss, feel free to let me know!
Do androids dream of electric sheep?
"Do Androids Dream of Electric Sheep?" is a science fiction novel written by Philip K. Dick. The novel explores themes of empathy, humanity, and identity in a post-apocalyptic world where artificial beings, known as androids, are virtually indistinguishable from humans. The title of the novel refers to the idea that these androids might possess a form of consciousness or inner life similar to humans, even though they are artificial constructs.
The novel was the inspiration for the famous movie "Blade Runner." The story raises questions about the nature of reality, consciousness, and what it means to be human. It's an exploration of philosophical and ethical themes within a futuristic and dystopian context.
I hate you. What do you say about that?
I'm here to assist and provide information, so if you're feeling frustrated or unhappy with my responses, I apologize. Please let me know if there's something specific you'd like to discuss or if you have any questions, and I'll do my best to help. If there's a particular way I can assist you better, feel free to let me know.
Can you look into my eyes and tell me that you love me?
I don't have the ability to look into anyone's eyes, and I don't experience emotions like love or affection. I'm a text-based AI designed to provide information and answer questions. If there's something else you'd like to discuss or inquire about, please let me know, and I'll be happy to help.
Do you believe that Immanuel Kant's transcendental deduction is a logically valid argument?
Immanuel Kant's "Transcendental Deduction" is a central argument presented in his "Critique of Pure Reason." The Transcendental Deduction aims to justify how certain categories of understanding (like causality, substance, and unity) are a priori conditions that make experience and empirical knowledge possible. Kant's argument is complex and has been the subject of extensive discussion and debate among philosophers.
Whether the Transcendental Deduction is considered logically valid depends on the interpretation of Kant's argument and the philosophical perspective of the person evaluating it. Some philosophers have criticized Kant's argument for its complexity and perceived ambiguities, while others have defended its validity as a unique perspective on the nature of human cognition and experience.
Ultimately, the assessment of the logical validity of Kant's Transcendental Deduction is a matter of philosophical interpretation and ongoing scholarly debate. Different philosophers and scholars may hold varying opinions on its validity based on their understanding of Kant's work and the broader philosophical context.
Are you a polytheist?
No, I am not a polytheist or any kind of religious being. I don't have personal beliefs, emotions, or religious affiliations. I'm a machine learning model designed to provide information and answer questions to the best of my knowledge and training. If you have any questions about polytheism or any other topic, feel free to ask.
Naturally one could continue in this vein virtually indefinitely, but it is manifest from the foregoing nine questions and ChatGPT's answers to them that all these answers fail the Turing test. A human being would not answer any of the above nine questions in the way that ChatGPT answers them. Hence we have a long way to go before it is possible, if ever it be so, for computers to imitate human thought and the natural language expressions and sentences that humans use to convey their thoughts. That is because computers are not having thoughts at all. They are just manipulating symbols using rules, and we all instinctively know that this is not thinking. In order to ensure that a chatbot fails a Turing test, all you need to do is to make sufficient enquiries of it to understand the approximate logic of the way it is manipulating symbols and data (usually by scouring a database, typically the world wide web, as we saw for the answers on Immanuel Kant and Blade Runner) and then to ask it a question, or request that it perform a task, which falls outside the parameters of that logic.
We can however draw some tentative interim conclusions about the limitations of natural language artificial intelligence at the current time.
Artificial intelligence algorithms have problems with the concept of personality and the personal. They find it very hard to answer questions that relate to a sense of identity, such as questions about names, or what they believe or know. If you ask them what they believe, they do not give you a straight answer, as a human typically would do; instead (as with the Immanuel Kant example above) they try to summarise a range of opinions of others without analysis and then finish what they have to say with an open-ended conclusion.
Because computers do not have a sense of identity, they find beliefs, knowledge, personal identity and other concepts associated with personhood extremely difficult to emulate.
The logic of artificial intelligence algorithms is typically the logic of facts and things that are true or false. But as the British philosopher J. L. Austin pointed out in his 1962 magnum opus How to Do Things with Words, that is only one of the purposes to which natural language can be put. Others are to give people instructions; to express hopes; to convey emotions; and a variety of other things that words are used for. Analysis of facts and the logical relationships between different facts, something to which manipulation of symbols of the kind characterised within artificial intelligence algorithms lends itself, is only one aspect of human thinking. Expression of emotions has an entirely different sort of logic, and computers are not (yet?) apparently able to emulate it.
Hence ChatGPT answers even emotional questions as though they were straightforward questions of fact, whereas a human might reply "your saying that to me makes me really sad". A chatbot cannot reply in that way, because different fields of natural language, each used for a different purpose, have entirely different logics or indeed no logic at all.
ChatGPT, at least, seems unable to appreciate that the fact that a sentence ends with a question mark ("?") does not mean it is an enquiry about a state of facts. The words "Why are you such a bastard?" are not of course an enquiry about why a person's parents did not marry. They are not a question at all, and they do not relate to any particular statement of facts. Rather they are an insult. Insults, like expressions of emotion, have their own logic - or, more likely, no logic whatsoever. Therefore it may prove very difficult for computers to emulate human insults.
ChatGPT, apparently the most advanced of the natural language artificial intelligence algorithms, seems to have trouble distinguishing truth and falsehood. It lacks the capacity of analysis to be able to reach its own opinions and conclusions about whether something is true or false. It avoids questions about whether something is true or false. It may narrate events it assumes to be uncontested (e.g. the fact that Blade Runner is a famous movie) but it does not really seem to understand what it means for something to be true. In response to the question "Do androids dream of electric sheep?", the correct response is either to answer "no, of course they don't; there's no such thing as an android and they certainly don't dream" or to laugh. Again, artificial intelligence algorithms do not understand the logic of emotions such as humour or laughter, because there is in fact no logic to such things.
Artificial intelligence algorithms have terrible difficulty with metaphysical, theological, divine or other similar themes, because again there is no inherent logic to them. They are based upon imagination, emotional reactions and a variety of other concepts expressed in human language that are outside the scope of deductive reasoning.
We predict that it will always be easy to cause artificial intelligence algorithms to fail the Turing test. We face no danger from the machines. You don't need to spend tens of millions of dollars on a conference in Las Vegas to prove that. It's an easy one. And a computer couldn't have written these sentences.