For many iPhone users, Siri is the closest thing imaginable to a personal assistant. It can schedule appointments, order more dish soap, and even recommend restaurants for your big night out. But it isn’t really a person; it’s a virtual assistant built into Apple devices. Other virtual assistants such as Alexa and Cortana perform similar tasks, which raises the question: How do these technologies understand and communicate with us as if we’re having a conversation? It all comes down to natural language processing (NLP) and machine learning, two important subfields of artificial intelligence (AI).
The fundamentals of NLP work like this: our devices begin by parsing individual words rather than the full sentence. For example, if you ask Siri to “show me pictures of a dog,” it’ll first try to determine what the words “pictures” and “dog” mean before fulfilling the request. This is where machine learning comes into play: smart devices analyze copious amounts of data to produce an accurate result, and getting there requires trial and error. An untrained device may wrongly assume that anything with four legs and fur is a dog and show you pictures of a cat. Over time, though, these devices are given feedback that improves their accuracy and makes them more reliable.
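To make that idea concrete, here’s a toy sketch in Python. It is not how Siri actually works under the hood; it just illustrates the two steps described above: splitting a request into individual words, and nudging a deliberately naive “dog detector” with feedback until it stops mistaking cats for dogs. The features (four_legs, fur, barks), the weights, and the feedback rule are all invented for illustration.

```python
# Toy illustration (not Apple's implementation) of the two steps above:
# 1) break a request into individual words, 2) correct a simple classifier with feedback.

def tokenize(request: str) -> list[str]:
    """Split a request into lowercase words, stripping surrounding punctuation."""
    return [w.strip(".,!?\"'") for w in request.lower().split()]

# An untrained "dog detector": it starts out believing that anything
# with four legs and fur is a dog, just like the example in the text.
weights = {"four_legs": 1.0, "fur": 1.0, "barks": 0.0}

def looks_like_dog(features: dict[str, float]) -> bool:
    score = sum(weights[name] * value for name, value in features.items())
    return score > 1.5

def give_feedback(features: dict[str, float], is_dog: bool) -> None:
    """Nudge the weights toward the correct answer (a crude training step)."""
    error = (1.0 if is_dog else 0.0) - (1.0 if looks_like_dog(features) else 0.0)
    for name, value in features.items():
        weights[name] += 0.5 * error * value

cat = {"four_legs": 1.0, "fur": 1.0, "barks": 0.0}
dog = {"four_legs": 1.0, "fur": 1.0, "barks": 1.0}

print(tokenize("Show me pictures of a dog"))  # ['show', 'me', 'pictures', 'of', 'a', 'dog']
print(looks_like_dog(cat))                    # True -- the untrained model is fooled

for _ in range(3):                            # a few rounds of corrective feedback
    give_feedback(cat, is_dog=False)
    give_feedback(dog, is_dog=True)

print(looks_like_dog(cat))                    # False -- four legs and fur no longer suffice
print(looks_like_dog(dog))                    # True
```

Real systems use far richer features and millions of examples, but the principle is the same: the model starts out wrong, gets corrected, and gradually becomes more reliable.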
Once Siri defines each word, it can figure out how to respond to your request. Here, context is important, especially for words with multiple meanings. Voice assistants often analyze a user’s past behavior to make an educated guess. For instance, if you frequently search for music, Siri is more likely to assume you’re asking about Chicago the band, rather than Chicago, Illinois, when you say, “Tell me about Chicago.” If Siri doesn’t understand a question or gets a request wrong, try rephrasing with a few more context clues to help the algorithm figure it out.
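Here’s a similarly hypothetical sketch of that last idea: picking between two meanings of “Chicago” by counting which topic shows up most in a user’s (made-up) search history. The topic labels and the simple “most frequent topic wins” rule are assumptions for illustration, not Siri’s actual logic.

```python
from collections import Counter

# Candidate interpretations of "Chicago", each tagged with the topic it belongs to.
interpretations = {
    "music": "Chicago (the band)",
    "places": "Chicago, Illinois",
}

def disambiguate(search_history: list[str]) -> str:
    """Choose the interpretation whose topic the user searches for most often."""
    topic_counts = Counter(search_history)
    best_topic = max(interpretations, key=lambda topic: topic_counts[topic])
    return interpretations[best_topic]

# A user who mostly searches for music gets the band...
print(disambiguate(["music", "music", "music", "places"]))  # Chicago (the band)
# ...while a user who mostly looks up places gets the city.
print(disambiguate(["places", "places", "music"]))          # Chicago, Illinois
```

Real assistants weigh many more signals than this, which is also why adding a few extra context clues when you rephrase a request gives the algorithm more to work with.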