As the iPhone 4S floods into the hands of the public, users are coming face-to-face with something that they weren’t quite expecting: Apple’s new voice interface, Siri, has an attitude. Ask Siri where to hide a body, for instance, and she’ll give you a list of choices that include a reservoir, a mine, and a swamp. Ask her how much wood could a woodchuck chuck if a woodchuck could chuck wood, and she might tell you the answer is 42 cords, or she might ask you to clarify if it’s an African or European woodchuck.
Joshua Topolsky at This Is My Next began gathering some of the service’s cheekier answers on Wednesday, and now there’s a Tumblr up called Shit That Siri Says that houses an even more extensive, growing collection.
Siri’s answers are cute, but they’re not much different from the “Easter eggs” that sly coders have been slipping into software for decades. Or are they? I want to suggest, in all earnestness, that as Siri’s repertoire of canned responses grows, Apple could end up with a bona fide artificial intelligence, at least in the “weak AI” sense. Siri may be yet another chatterbot, but it’s a chatterbot with a cloud back-end, and that cloudy combination of real-time analytics and continuous deployment makes all the difference.
The roots of intelligence: algos or data?
In its initial incarnation, the part of Siri’s interaction model that responds to jokes, insults, and other casual queries meant only to probe the machine for a clever response puts it in the venerable category of chatterbots. The chatterbot lineage can be traced back to ELIZA, a primitive interactive program that would take English-language input from the user and spit it back out in the form of a question. ELIZA was originally intended as a parody of psychotherapy, and an example exchange might go something like the following:
USER: I feel sad.
ELIZA: Why do you feel sad?
USER: Because I made a mistake
ELIZA: Why did you make a mistake?
USER: I have the flu
ELIZA: Maybe you should see a doctor. I’m merely a psychotherapist.
USER: Habla Espanol?
ELIZA: Now you’re not making any sense!
A chatterbot like ELIZA uses a mix of natural language processing (NLP) and canned responses to take the user’s input and transform it into some kind of intelligible, grammatically correct output. The hard part of making a good chatterbot is the NLP portion. For instance, the program in the example above has to know that “made” is the past tense of “make,” so that it can turn “Because I made a mistake” into “Why did you make a mistake?” This kind of productive, algorithmic knowledge about how to combine a limited vocabulary of nouns, verbs, and modifiers into syntactically correct and at least superficially relevant English is difficult to code.
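To make the transformation step concrete, here is a minimal Python sketch of the kind of rewrite described above. It is not ELIZA’s actual code: the function name `reflect`, the two regular-expression rules, and the tiny `BASE_FORMS` verb table are all assumptions made for illustration. A real chatterbot would need far broader grammatical coverage.

```python
import re

# Hypothetical, deliberately tiny table mapping irregular past-tense
# verbs to their base forms. A real system would need far more coverage.
BASE_FORMS = {"made": "make", "went": "go", "had": "have", "felt": "feel"}

FALLBACK = "Now you're not making any sense!"


def reflect(user_input: str) -> str:
    """Turn a first-person statement into a follow-up question."""
    # Drop surrounding whitespace and trailing sentence punctuation.
    text = user_input.strip().rstrip(".!?")

    # Rule 1: "I feel X" -> "Why do you feel X?"
    m = re.fullmatch(r"(?:because\s+)?i feel (.+)", text, re.IGNORECASE)
    if m:
        return f"Why do you feel {m.group(1)}?"

    # Rule 2: "(Because) I <past verb> X" -> "Why did you <base verb> X?"
    m = re.fullmatch(r"(?:because\s+)?i (\w+) (.+)", text, re.IGNORECASE)
    if m and m.group(1).lower() in BASE_FORMS:
        verb = BASE_FORMS[m.group(1).lower()]
        return f"Why did you {verb} {m.group(2)}?"

    # Nothing matched, so fall back to a stock reply.
    return FALLBACK


print(reflect("I feel sad."))
print(reflect("Because I made a mistake"))
print(reflect("Habla Espanol?"))
```

Even this toy version has to carry explicit knowledge of verb forms, which hints at why the general problem is so much harder than it looks.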
So the art and science of chatterbot coding as it has been practiced since the dawn of UNIX is in designing and implementing a set of NLP algorithms that can take a finite vocabulary of words and turn them into legit-sounding English sentences. The easy part, at least from a computer science perspective, is cooking up a complementary slate of pre-packaged answers: mere strings that the chatterbot produces in response to a set input pattern or in specific situations, like when it doesn’t quite know what to say.
For example, in the above dialog, ELIZA might be hard-coded to match the pattern “have the flu” in the user’s input with the output string “Maybe you should see a doctor. I’m merely a psychotherapist.” This kind of string-to-string mapping doesn’t require any kind of NLP, so there’s no “AI” involved in the popular sense. Ultimately the success of the canned-answers approach to chatterbot making hinges not on the intelligence of the algorithm but on the tirelessness of the coder, who has to think of possible statement/response pairs and then hard-code them into the application. The more statement/response, or input/output, pairs she dreams up to add to the bot, the more intelligent the bot is likely to appear as the user discovers each of these “Easter eggs” in the course of probing the bot’s conversational space.
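The string-to-string mapping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `CANNED_RESPONSES` and `canned_reply` are hypothetical, matching is done by plain substring search, and the “hide a body” reply is illustrative wording rather than Siri’s actual output.

```python
# Hard-coded input/output pairs: a substring to look for, and the
# string to return when it is found. No language processing involved.
CANNED_RESPONSES = {
    "have the flu": "Maybe you should see a doctor. I'm merely a psychotherapist.",
    # Illustrative wording only, not a transcript of any real assistant.
    "hide a body": "Try a reservoir, a mine, or a swamp.",
}

FALLBACK = "Now you're not making any sense!"


def canned_reply(user_input: str, table: dict = CANNED_RESPONSES) -> str:
    """Return the first canned response whose pattern appears in the input."""
    lowered = user_input.lower()
    for pattern, response in table.items():
        if pattern in lowered:
            return response
    # No pattern matched, so use the stock fallback line.
    return FALLBACK


print(canned_reply("I have the flu"))
print(canned_reply("Habla Espanol?"))
```

Making the bot seem smarter here means nothing more than adding entries to the dictionary, which is exactly the coder’s tireless work described above.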
An adult user will quickly exhaust the conversational possibilities of a chatterbot that has a hundred, or even a thousand, hard-coded input/output pairs. But what about 100,000 such pairs? Or 1 million? That’s where the cloud makes things interesting.
Big Data, big smarts
In the traditional world of canned, chatterbot-style “AI,” users had to wait for a software update to get access to new input/output pairs. But since Siri is a cloud application, Apple’s engineers can keep adding these hard-coded input/output pairs to it continuously. Every time an Apple engineer thinks of a clever response for Siri to give to a particular bit of input, that engineer can insert the new pair into Siri’s repertoire instantaneously, so that the very next instant every one of the service’s millions of users will have access to it. Apple engineers can also take a look at the kinds of queries that are popular with Siri users at any given moment, and add canned responses based on what’s trending.
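The loop of serving canned answers, watching which queries go unanswered, and adding new pairs can be sketched in Python as well. Nothing here reflects how Siri’s back-end is actually built. The `ResponseService` class and its method names are hypothetical, and a single in-memory object stands in for what would be a shared server-side store.

```python
from collections import Counter


class ResponseService:
    """Toy stand-in for a server-side table of canned responses."""

    def __init__(self):
        self._responses = {}      # pattern -> canned response
        self._misses = Counter()  # unanswered query -> times seen

    def add_pair(self, pattern: str, response: str) -> None:
        # A new pair is visible to every later call, with no client update.
        self._responses[pattern.lower()] = response

    def reply(self, query: str):
        lowered = query.lower()
        for pattern, response in self._responses.items():
            if pattern in lowered:
                return response
        # Record the miss so engineers can see what users are asking.
        self._misses[lowered] += 1
        return None

    def trending_misses(self, n: int = 3):
        # Most frequent unanswered queries, most common first.
        return [query for query, _ in self._misses.most_common(n)]


service = ResponseService()
service.reply("How much wood could a woodchuck chuck?")
service.reply("How much wood could a woodchuck chuck?")
service.reply("Tell me a joke")
print(service.trending_misses(1))

# An engineer spots the trend and adds a pair. The next query gets it.
service.add_pair("woodchuck chuck", "42 cords.")
print(service.reply("How much wood could a woodchuck chuck?"))
```

The point of the sketch is the deployment model: because the table lives on the server, the moment `add_pair` runs, every client benefits.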
In this way, we can expect Siri’s repertoire of clever comebacks to grow in real time through the collective effort of hundreds of Apple employees and tens or hundreds of millions of users, until it reaches the point where an adult user will be able to carry out a multipart exchange with the bot that, for all intents and purposes, looks like an intelligent conversation.
Note that building an AI by piling Easter egg on top of Easter egg in the cloud isn’t solely the domain of Apple’s Siri. When Google does exactly this—for instance, by showing a five-day weather graphic in response to a local weather search, or by displaying local showtimes in response to a movie search—it’s called a “feature,” not an “Easter egg,” though it’s the same basic principle of “do this specific, clever thing when the user gives this specific input.” Indeed, Google has been at this approach for quite a long time, so I expect that they will shortly be able to reproduce much of Siri’s success on Android. They have the voice recognition capability, the raw data, and the NLP expertise to build a viable Siri competitor, and it seems certain that they’ll do it.
But is it a “real” AI?
A philosopher like John Searle will object that, no matter how clever Siri’s banter seems, it’s not really “AI” because all Siri is doing is shuffling symbols around according to a fixed set of rules without “understanding” any of the symbols themselves. But for the rest of us who don’t care about the question of whether Siri has “intentions” or an “inner life,” the service will be a fully functional AI that can respond flawlessly and appropriately to a larger range of input than any one individual is likely to produce over the course of a typical interaction with it. At that point, a combination of massive amounts of data and a continuous deployment model will have achieved what clever NLP algorithms alone could not: a chatterbot that looks enough like a “real AI” that we can actually call it an AI in the “weak AI” sense of the term.