Monday, October 10, 2016

Turing test is more than mere theatre: it shows us how we humans think.

© Huma Shah October 2016

Is the Turing test mere theatre?

You might question why anyone bothers to stage Turing test experiments when a computer programme achieved a 50% deception rate in the first instantiation, the inaugural Loebner Prize for Artificial Intelligence in 1991. Back then judges were restricted to asking questions specific to the specialism of each hidden entity, human or machine. Whimsical conversation was the winning topic in 1991, from Joseph Weintraub’s PCTherapist III programme.

Since then machine simulation of natural language has moved on and we see chatbots with characters able to express opinions, share ‘personal information’ and tell it like it is! This from Elbot (Artificial Solutions) during a chat with this blogger on Oct 6, 2016: “You have quite an imagination. Next thing you know you'll say I needed batteries!”

For me the interest comes from talking robots in the movies, most recently and sensationally in Ex Machina, where the female-embodied robot Ava perpetrates the ultimate deception, or does she? Well, it wasn’t quite what Ava’s programmer in the movie, Nathan, had imagined. And of course there is HAL from 2001: A Space Odyssey, who talked and lip-read to a menacing end in Kubrick’s truly glorious cinematic production.

Experiencing the 13th Loebner Prize at the University of Surrey in 2003, I felt that tweaking the format could produce some interesting data. After all, Turing’s imitation game is as much concerned with finding out how we humans think as it is with exploring the intellectual capacity of a machine through its ability to answer any question in a satisfactory and sustained manner – Turing’s words (Computing Machinery and Intelligence, 1950).

Two decades on from the first Turing test instantiation, developer Rollo Carpenter claimed his programme Cleverbot was considered to be 59.3% human, following 1334 votes cast at a 2011 event in Guwahati, India.

Up to the year 2003 the method of practical Turing tests involved a human interrogator interviewing one hidden entity at a time. This is what I have named the viva voce Turing test in my PhD, ‘Deception-detection and Machine Intelligence in Practical Turing tests’. In 2008 I designed a Turing test experiment in which, for the first time, control pairs of two machines and two humans were embedded among pairs of human-machine set-ups. In this layout each interrogator simultaneously questioned a hidden pair and had to decide which was human and which was machine. The 2008 Turing tests were also the first in which school pupils were given the opportunity to participate as judges and hidden humans.

The new book, ‘Turing’s Imitation Game: Conversations with the Unknown’, details that experiment and two follow-up Turing test events: in 2012 at Bletchley Park, held on June 23rd, the 100th anniversary of Alan Turing’s birth, as part of the worldwide centenary celebrations, and in 2014 at The Royal Society London, on the 60th anniversary of Turing’s untimely and sad death.

Each experiment had an incremental purpose, including to measure machine performance in dialogue and to find out whether machines were getting better at answering questions in a satisfactory manner. As readers will learn from the book, we humans do not always answer a question appropriately, so should we be harsh when machines don’t, especially as they are learning programmes and a lot ‘younger’ than some of the youngest human judges?

Implementing Turing tests is actually quite hard work. Finding open-minded human interrogators and human foils for the machines, as well as motivating developers of computer programmes to participate, takes time and persuasion. Not everyone is happy with the conclusions, as evidenced by the many negative and angry comments across tech magazines and newspaper articles, especially after the 2014 experiment. The Turing test does this: it is one of those controversial areas of science that brings out proprietorial attitudes; everyone feels their interpretation is the one Turing intended.

I am really grateful to all the participants, human and machine, who have taken part in our experiments: more than 80 judges and 70 hidden humans. I am grateful too for the ingenuity, patience and collaboration of the developers: Fred Roberts for Elbot; Robby Garner for JFred-TuringHub; Rollo Carpenter for Cleverbot; Robert Medeksza for Ultra Hal; and Vladimir Veselov and his team for Eugene Goostman. You will meet these conversationalists, or chatbots, in the book. I hope it encourages more school pupils and the general public to take an interest in the Turing test and get involved in the challenge. There’s still more to be done here :)

Thursday, October 06, 2016

What is the Turing test?

For the October 2016 launch of Turing's Imitation Game: Conversations with the Unknown, publisher Cambridge University Press asked the authors for answers to fundamental questions on the Turing test. Co-author Huma Shah answers hers below:

Image: Harjit Mehroke

CUP: The Turing test was originally devised by Alan Turing in 1950. Why write a book about it now?

Huma: Turing actually devised his imitation game in his 1948 paper, ‘Intelligent Machinery’, considered the first manifesto of artificial intelligence. Turing’s test aims to investigate the intellectual capacity of machines, so it is as relevant today as when he was developing his ideas more than 60 years ago, especially because we are building more and more computer programmes and robots to conversationally interact and collaborate with humans.

CUP: What reactions have you seen in people who have taken the test?

Huma: Judges and hidden humans have mostly enjoyed their participation. However, when some judges who got it wrong learn that they did not accurately categorise humans as humans and machines as machines, they ask all sorts of questions to mitigate their error, such as ‘Were the humans told to act like machines?’ They were not; all humans in our experiments have always been asked to be themselves. What these judges probably have not realised is that error-making is part of intelligent thinking; it’s one way we learn and improve.

CUP: Why has the Turing test been controversial?

Huma: Because it questions the very nature of what it means to be human, and conversation in natural language is most human. Different interpretations of Turing’s ideas exist as to the purpose of the test, with lots of disagreements, but this is healthy and democratises science and empirical work.

CUP: There is a popular misconception that the Turing test is a test for human-like intelligence in machines. But what is it really?

Huma: No, it is not a test for human-like intelligence but an exploration of whether a machine can ever answer any question put to it in a satisfactory and sustained manner. Of course, the judgement of whether an answer to a particular judge’s question is relevant rests with the interrogator, who might feel a machine’s response is more appropriate than a human’s answer to the same question.

CUP: Has a machine passed the Turing test? What is the significance of that event?

Huma: No, not in the sense that Turing would have envisaged. What was achieved in the 2014 experiment held at The Royal Society London could be said to be the first challenge overcome, that of wrong identification by 30% of a panel of judges. But this rests on an interpretation of one statement in Turing’s 1950 paper that ignores what he said before and after it. We do not yet have in existence the kinds of machines Turing envisaged that would play his imitation game satisfactorily.

CUP: Can machines think?

Huma: It depends on what you mean by thinking :) In place of circular definitions, Turing posed his imitation game and felt that if a machine could answer any question in a satisfactory and sustained manner, then that would not be an easy contrivance.