How Nuance is blazing a path to a smarter Siri

07.02.2019

Nuance's recently announced Project Pathfinder solution gives us a glimpse at Siri's smarter future.

What is Project Pathfinder?

Making Siri and other voice assistants smarter means machines must get much better at analyzing and understanding real-world conversations, and it means developing AI models capable of handling their context and complexity.

Nuance has a big footprint in conversation-based user interfaces. It was the first company Apple turned to in the early days of Siri, and I guess there are plenty of ex-Nuance engineers beavering away on Apple's voice assistant inside the password-protected R&D innovation mines deep inside Cupertino.


Much of Nuance's present business is focused on developing chatbots for enterprise clients.

That's where Project Pathfinder comes into its own.

"Project Pathfinder demonstrates how machine learning and AI can automate the creation of dialog models by learning from logs of human conversations," Nuance explained.

Pathfinder can mine huge collections of conversational transcripts between agents and customers and automatically build dialog models that can be used to inform two-way conversations between virtual assistants and consumers. This should help conversation designers develop smarter chatbots. It also makes it much easier to spot anomalies in the conversation flow, suggesting problems in the script your chatbots already use.
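To make the idea concrete, here is a minimal, hypothetical Python sketch of that general approach (my own illustration, not Nuance's code): count the turn-to-turn transitions observed in a batch of labeled transcripts, treat the resulting transition probabilities as a crude dialog model, and flag transitions that are rare relative to the other paths out of the same state.

from collections import Counter, defaultdict

# Hypothetical labeled transcripts: each is an ordered list of dialog states,
# e.g. the intent of a customer turn or the action an agent took.
transcripts = [
    ["greeting", "report_outage", "ask_account_number", "confirm_outage"],
    ["greeting", "report_outage", "ask_account_number", "confirm_outage"],
    ["greeting", "billing_question", "ask_account_number", "explain_charge"],
    ["greeting", "report_outage", "explain_charge"],  # an odd flow worth surfacing
]

# Count how often each state follows another across all conversations.
transitions = Counter()
outgoing = defaultdict(int)
for convo in transcripts:
    for a, b in zip(convo, convo[1:]):
        transitions[(a, b)] += 1
        outgoing[a] += 1

# The "dialog model" here is just the observed transitions with probabilities.
dialog_model = {(a, b): n / outgoing[a] for (a, b), n in transitions.items()}

# Flag transitions that are rare compared with other paths out of the same state.
# On toy data like this several legitimate paths also look rare; over thousands
# of real calls, the genuinely odd flows stand out.
anomalies = [pair for pair, p in dialog_model.items() if p < 0.35]

print("Dialog model:", dialog_model)
print("Possible anomalies:", anomalies)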


Building the conversation

When you speak to any kind of voice assistant, you are really interacting with reference models, which try to resolve the intent of your question and find an appropriate response.

To reach the right response, assistants rely on conversation designers. These are real humans who usually build conversation flows on the basis of what they learn from subject-matter experts and some trial and error around user behavior. Pathfinder empowers them to supplement that existing knowledge with deep insights gathered from real conversational interactions that take place inside call centers.

I spoke with Paul Tepper, head of the Technology Advancement Group AI Lab for Nuance Communications.

He explained that the software doesn't simply learn what people are discussing, but also figures out how human agents guide users through the transactions.

This information then makes it possible to add more intelligence to voice assistants/chatbots.

There's a reason Nuance is focused on chatbots rather than addressing the wider needs of voice assistants such as Siri: focus.

Self-learning conversation analytics

Siri and other voice assistants have limitations.

In part, this is because they are built for the mass market, which means they have to handle much more diverse requests than business-focused chatbots.

This creates a lack of focus. It's much, much harder to design AI that can respond to spoken input across all the topics in the world and then offer a sensible response to any kind of follow-up question. It's much easier to develop conversational AI tools that respond to specific needs.

That's why the real innovation is currently to be found in enterprise solutions, which are built to handle a much narrower range of potential requests. That lack of breadth is an advantage: it makes the algorithms that drive these systems a little easier to build, because the conversations are slightly more predictable.

The chatbot at your utility company is focused on the kind of questions you might ask that utility. Human conversations - the kinds of things we might ask Siri - are less focused and less predictable.

I see it this way:

You can yell, "Hey, Siri" at your HomePod to switch tracks or turn the lights on or off, and it knows to expect those things. But if you want to ask it a couple of questions about stock availability at your local mall, or get into an in-depth conversation about what to look for when ordering wood for outdoor furniture, Siri gets a little out of its depth.

And while other voice assistants may (or may not) offer better responses to simple inquiries, they still aren't really capable of maintaining a multi-statement chat. You can't ask Siri a question and then ask several more on the basis of its answer to that question.

So, how do you build AI that's more capable of handling the kind of complex inquiries that characterize human/machine interactions in the real world?

It starts at the call center

Call center chatbots are designed to handle routine inquiries so that the humans who work in those places can sweat the complex tasks. Of course, because these systems work within a narrow domain of topics, they can handle slightly more complex conversations.

In use, Pathfinder figures out what people want (intents), identifies the conversations in its database that share a given intent, and then takes components of those chats and assembles them into a flow-tree-like interface.

Click anywhere on the tree to see the related conversational transcript.

The end result? A flow tree based on multiple topic-related conversations that can be used to inform development of spoken word interfaces.
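To illustrate that flow-tree idea with a hypothetical Python sketch of my own (not Pathfinder's implementation), you could merge conversations that share an intent into a prefix tree whose nodes remember which transcripts pass through them, so that selecting a node surfaces the related chats:

# Hypothetical sketch: merge intent-related conversations into a flow tree.
# Each node records which transcripts pass through it, so selecting a node
# could bring up the related conversational transcripts.

def build_flow_tree(conversations):
    """conversations: list of (transcript_id, [turn, turn, ...]) pairs."""
    root = {"turn": "<start>", "transcripts": [], "children": {}}
    for transcript_id, turns in conversations:
        node = root
        node["transcripts"].append(transcript_id)
        for turn in turns:
            node = node["children"].setdefault(
                turn, {"turn": turn, "transcripts": [], "children": {}}
            )
            node["transcripts"].append(transcript_id)
    return root

# Toy conversations that share the "report_outage" intent.
outage_chats = [
    ("call-0017", ["greeting", "report_outage", "ask_address", "confirm_outage"]),
    ("call-0042", ["greeting", "report_outage", "ask_address", "schedule_callback"]),
]

tree = build_flow_tree(outage_chats)
# Both calls reach the "ask_address" node; its children show where the flows diverge.
node = tree["children"]["greeting"]["children"]["report_outage"]["children"]["ask_address"]
print(node["transcripts"])       # ['call-0017', 'call-0042']
print(list(node["children"]))    # ['confirm_outage', 'schedule_callback']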

Nuance's Pathfinder initiative will, eventually, make these machines capable of handling more complex conversations. While it will take a while to be fully realized (the tool won't be widely available until summer 2019), it shows how conversational analytics, data analysis and AI may empower next-generation voice interfaces, as well as support much more sophisticated human/computer interactions.

How will Nuance tech help Siri?

If Pathfinder can be used to accelerate development of spoken word interfaces for narrower vertical intents - such as navigation, weather information, or call center conversations - then it should also accelerate development of more complex conversational models.

"The promise of these systems is quite great," Tepper told me. "I think we can kind of figure out how to move from more narrow, vertical conversation domains to more sophisticated conversations."

These are the kinds of chats that should truly unlock the hyped power of AI, rather than just letting you ask Siri or Alexa to turn the lights off.

We're talking about the inevitable evolution toward legitimate two-way conversations with bots and voice assistants.

What about privacy?

There's a sour note to this. If every conversational interaction you have with a chatbot or voice assistant is recorded, becomes searchable, and can be analyzed for intent, what happens to your privacy?

I know that most of us are prepared to allow companies that handle a specific task - your bank or utility company, for example - to record a call we make.

Many of us aren't yet aware that some mass market voice assistant technologies also record and keep our conversations in ways that can be traced to us.

Apple does not. All the same, it does keep, for a very short time, obfuscated recordings that cannot be linked to your identity or your account, so it does hold this data.

Apple [engineers] were "pioneers in terms of like doing things like obfuscating data so that they could aggregate over multiple users without actually having it be traceable to any individual user," Tepper tells me.

That's a challenge for conversation design.

"If we you know worked with Apple, for example, and let them use Pathfinder that help accelerate their theory, conversation development there, you know, they'd have to be able to take into consideration the privacy of their users and loading it into the system," Tepper observed.

The thing is, once a conversation has been obfuscated, it is possible for a company to analyze the content of the chat in order to develop more effective machines.
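As a loose illustration of what such obfuscation might involve (an assumption of mine, not Apple's or Nuance's actual pipeline), a pre-processing step could replace account identifiers with unlinkable pseudonyms and redact obvious personal details before any aggregate analysis takes place:

import hashlib
import re
import secrets

# A random salt that is discarded after processing, so the pseudonyms in the
# obfuscated data set cannot later be re-linked to real accounts.
SALT = secrets.token_hex(16)

def pseudonymize(user_id):
    # Replace a real user ID with an unlinkable pseudonym.
    return hashlib.sha256((SALT + user_id).encode()).hexdigest()[:12]

def redact(text):
    # Very rough redaction of phone-number- and email-like strings.
    text = re.sub(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b", "[PHONE]", text)
    text = re.sub(r"\b\S+@\S+\.\S+\b", "[EMAIL]", text)
    return text

raw = [{"user": "jane.doe@example.com",
        "text": "My number is 555-123-4567 and the heating is broken."}]

obfuscated = [{"user": pseudonymize(r["user"]), "text": redact(r["text"])} for r in raw]
print(obfuscated)  # still analyzable for intent, no longer traceable to a person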

Machines that understand context

Now, I have zero clue whether Apple plans to make use of Pathfinder, but the fact that the Nuance solution exists means it is now possible for AI to translate, analyze and draw conclusions from real-life conversational interactions.

It is also becoming possible for smart machines to figure out the intent of a conversation.

It seems reasonable to think that once AI can figure out that much information, it should also be able to use its knowledge of a person's conversational intent to figure out in which topic domain it should look in order to reach an appropriate answer.

That's the big challenge of mass market voice assistants today - they understand the query in very basic terms, but (beyond the data stacks they are given access to) they don't yet know where to look.
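A toy sketch of that routing step (hypothetical intents and domains, purely illustrative) might look like this: once the intent of a query has been classified, the assistant chooses which knowledge domain to search before composing an answer.

# Hypothetical intent-to-domain routing: the classified intent decides where to look.
DOMAIN_FOR_INTENT = {
    "check_stock": "retail_inventory",
    "play_music": "media_library",
    "control_lights": "home_automation",
}

def route(intent, query):
    domain = DOMAIN_FOR_INTENT.get(intent)
    if domain is None:
        return "Sorry, I can't help with that yet."
    # A real assistant would query the chosen domain's backend here.
    return "Searching " + domain + " for: " + query

print(route("check_stock", "Are the blue sneakers available at the local mall?"))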

What's good for business today will inevitably be useful to consumers tomorrow. It's all about creating machines that have deep understanding of our intent - and the capacity to respond intelligently in conversational terms.

Nuance has made Project Pathfinder available to a small number of strategic customers and expects to make the solution more widely available by summer 2019.
