Managing Feedback and Navigating Errors in Voice Experience Design

Developers need to take into account unexpected scenarios in every question of the voice experience.

Octavio Menocal
Voice App Engineer

Developers tend to see their voice applications differently than users do. We assume users will keep trying their best to give a valid response until the app understands them.

That may be true for an already familiar service, such as ordering pizza, a restaurant meal, or a taxi, that adds clear value to the user’s everyday activities. But what if it’s a new idea we want users to try out? If we don’t give users valid guidance or a clue on how to proceed to the next step in our app, and we return an unexpected response, they can get stuck at that step, and then we are in trouble.

Developers need to take into account unexpected scenarios in every question of the voice experience. If we don’t handle these scenarios correctly and return a response that makes no sense in context, users will feel lost, conclude they’re talking to a stupid app, and feel they are wasting their time. Users are impatient, so we need to return the right prompt to show that our app is really listening to them and offering good guidance.

In a voice-first world, context is the key to keeping up a good conversation. Voice apps can understand several intents, but at the time of writing, the context of a question in the flow allows only one (or sometimes a couple) of them to be a valid response. If a user answers with an intent that our app does not expect in that context, we need to explain the type of answer we expect and include clear clues in the reprompt so that the user says a phrase that is valid for the context.

Here are a couple of recommendations for managing difficult scenarios with users:

Include in the reprompt the type of phrases you expect from the user in the context: if users respond with an unexpected intent, catch the failure and send a response that includes the type of phrases your app expects in order to move forward. Here’s an example of a taxi voice app on Alexa:

  • Alexa: Welcome to taxi driver. I can contact a taxi to drive you to your destination, or give you a quote, which would you like?
  • User: I want to go to the airport (Unexpected intent)
  • Alexa: Excellent! I just need to know if you want me to contact a driver, or if you simply need a quote, what’s your interest?
  • User: Oh right, I want you to find a taxi for me.
  • Alexa: Great! You told me you wanted to go to the airport, correct?
  • User: Correct -> (The user is pleased the app remembered the destination and didn’t ask for it again)
  • Alexa: Ok, now, where would you like the driver to pick you up?

(The user does not remember and asks her mom, who is next to her, while still talking to Alexa:
User: Mom, what’s the name of this street?
Mom: William Street, 84404)
Yes! This exchange happened while the user was talking to Alexa, and guess what Alexa heard: “what’s the name of the street we will be at, at 4pm” (Unexpected intent)

  • Alexa: Sorry, I didn’t understand, where would you like the driver to pick you up? -> We didn’t stop the conversation; instead, we handled the FallbackIntent and returned a valid reprompt
  • User: William Street, 84404
  • Alexa: Ok, give me a second, I’m looking for the closest drivers -> This is a progressive response while your app gets information from your external API
  • Alexa: Ok, I found 5 drivers available, the ETA to pick you up is 5 minutes, the price is $40, do you want me to contact the closest driver right now?
  • User: yes
  • Alexa: Excellent, the driver confirmed he’s coming in 5 minutes. I just sent you an email with the receipt of your order. Thanks for using taxi driver, have a nice trip!
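
The exchange above can be sketched as a small context check. This is only a minimal sketch under stated assumptions, not the code behind any real skill: the custom intent names (ContactDriverIntent, QuoteIntent, PickupAddressIntent), the context labels, the slot names, and the handle_turn helper are all hypothetical. Only AMAZON.FallbackIntent, AMAZON.YesIntent, and AMAZON.NoIntent are built-in Alexa intents.

```python
# Hypothetical map: the question just asked -> intents that validly answer it.
EXPECTED_INTENTS = {
    "ASK_SERVICE": {"ContactDriverIntent", "QuoteIntent"},
    "ASK_PICKUP": {"PickupAddressIntent"},
    "CONFIRM_DRIVER": {"AMAZON.YesIntent", "AMAZON.NoIntent"},
}

# Reprompts that spell out the kind of answer each question expects.
REPROMPTS = {
    "ASK_SERVICE": "I just need to know if you want me to contact a driver, "
                   "or if you simply need a quote. Which would you like?",
    "ASK_PICKUP": "Sorry, I didn't understand. "
                  "Where would you like the driver to pick you up?",
    "CONFIRM_DRIVER": "Please say yes to contact the driver, or no to cancel.",
}


def handle_turn(session, intent_name, slots=None):
    """Handle one user turn and return (speech, end_session)."""
    slots = slots or {}
    # Remember useful details even when the intent itself is off-context.
    if "destination" in slots:
        session["destination"] = slots["destination"]

    context = session.setdefault("context", "ASK_SERVICE")
    if intent_name not in EXPECTED_INTENTS[context]:
        # Unexpected intent (AMAZON.FallbackIntent included): stay in the same
        # context and explain what we expect instead of ending the session.
        return REPROMPTS[context], False

    if context == "ASK_SERVICE":
        session["context"] = "ASK_PICKUP"
        return "Ok, now, where would you like the driver to pick you up?", False
    if context == "ASK_PICKUP":
        session["pickup"] = slots.get("address")
        session["context"] = "CONFIRM_DRIVER"
        return "Ok, I found a driver. Do you want me to contact them now?", False
    # context == "CONFIRM_DRIVER": either answer ends the conversation.
    if intent_name == "AMAZON.YesIntent":
        return "Excellent, your driver is on the way. Have a nice trip!", True
    return "Ok, I won't contact a driver. Goodbye!", True
```

Keeping the expected intents in one table per question makes it easy to review, for every step of the flow, what the app does when the user says something else.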

Don’t stop the conversation when an error occurs: some frameworks offer a handler that catches unexpected errors in the code. If your app hits a coding error or an operation with an external resource fails, return a response telling the user that something unexpected happened and that they can come back and try again once the error is fixed. You can also log the error on your server, or send it to your email, Slack, or SMS, so that you are alerted to what went wrong and can work out how to fix it quickly.
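
As an illustration of that idea, here is a minimal, framework-free sketch of a catch-all wrapper. The names with_error_handling, build_response, and the notify callback are hypothetical; notify stands in for whatever alert channel you use (email, Slack, SMS). The returned dictionary follows the general shape of an Alexa JSON response, but treat it as an assumption and check it against your platform's documentation.

```python
import logging
import traceback

logger = logging.getLogger("voice_app")


def build_response(speech, end_session):
    """Build a plain-text response in the general shape of an Alexa response."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": end_session,
        },
    }


def with_error_handling(handler, notify=None):
    """Wrap a request handler so an unexpected error still yields a spoken reply."""
    def wrapped(request):
        try:
            return handler(request)
        except Exception:
            details = traceback.format_exc()
            # Log on the server so the failure can be diagnosed later.
            logger.error("Unhandled error in voice handler:\n%s", details)
            if notify is not None:
                try:
                    notify(details)  # hypothetical alert hook
                except Exception:
                    # A failing alert must never break the reply to the user.
                    logger.exception("Alert delivery failed")
            return build_response(
                "Sorry, something unexpected happened on my side. "
                "Please come back and try again a little later.",
                True,
            )
    return wrapped
```

Wrapping every handler this way means a failed call to an external API produces an apology and an alert rather than silence.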

The Balance by Health Alexa skill, which RAIN built with Meredith Corp., is meant to help users find balance in their lives by focusing on fitness, nutrition, health, and beauty goals on a consistent basis. Besides challenges, users will hear tips on how they can approach them.

In some scenarios, the app expects specific responses from users. Which responses are valid depends on the question asked; this is the voice experience’s context. Below is an example of how we caught unexpected responses from the user and returned a response that clarifies what the app expects.

As you can see, after the skill asked if the user was ready for a challenge, the user asked for sports, thinking there could be a sport-related challenge. That’s not the case, so the skill didn’t understand the request and asked the same question again. After the user’s second failed attempt (asking for soccer), the skill identified that something was wrong and that the user could be lost.

The skill didn’t ask the same question a third time, because the user might have thought the skill was stuck in an endless loop. Instead, the skill gave a help response and guided the user on what they can say to the skill.
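
One way to get this behavior is to count consecutive misses in the session and switch to a help response once the count passes a threshold. The sketch below is an assumption about how such logic could look, not the Balance by Health skill's actual code; MAX_REPEATS, respond_to_miss, respond_to_valid_answer, and the sample prompts are hypothetical.

```python
# Assumption: repeat the question once, then switch to a help response.
MAX_REPEATS = 1


def respond_to_miss(session, question, help_text):
    """Return the next prompt after an answer the app did not expect."""
    misses = session.get("misses", 0) + 1
    session["misses"] = misses
    if misses > MAX_REPEATS:
        # The user is probably lost: stop repeating and explain the options.
        session["misses"] = 0
        return help_text
    return f"Sorry, I didn't catch that. {question}"


def respond_to_valid_answer(session):
    """Reset the miss counter once the user gives an expected answer."""
    session["misses"] = 0


session = {}
question = "Are you ready for a challenge?"
help_text = "You can say yes to hear today's challenge, or no to come back later."
first = respond_to_miss(session, question, help_text)   # repeats the question
second = respond_to_miss(session, question, help_text)  # switches to help
```

Resetting the counter after the help response gives the user a fresh start instead of sending them to help on every later slip.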

Handling Visual Feedback: for multimodal experiences, when you develop for voice, you should find use cases that leverage the screen.

This will push the boundaries of your testing skills, as you need to take into account the different devices your voice app can be used on. But even if your voice app heavily leverages the screen, don’t forget the non-screen devices. These devices could be the majority out there, as they are more accessible than screen devices.
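
A minimal sketch of that fallback follows, assuming the Alexa request envelope lists device capabilities under context.System.device.supportedInterfaces and that screen content is sent as an Alexa.Presentation.APL.RenderDocument directive. The helper names, the token value, and the sample document are hypothetical, so verify the exact fields against the current Alexa documentation before relying on them.

```python
def supports_apl(request_envelope):
    """Return True if the requesting device reports APL (screen) support."""
    interfaces = (
        request_envelope.get("context", {})
        .get("System", {})
        .get("device", {})
        .get("supportedInterfaces", {})
    )
    return "Alexa.Presentation.APL" in interfaces


def build_multimodal_response(request_envelope, speech, screen_document=None):
    """Always return speech; attach a screen directive only when supported."""
    response = {
        "outputSpeech": {"type": "PlainText", "text": speech},
        "shouldEndSession": False,
    }
    if screen_document is not None and supports_apl(request_envelope):
        response["directives"] = [
            {
                "type": "Alexa.Presentation.APL.RenderDocument",
                "token": "drivers-screen",  # hypothetical token
                "document": screen_document,
            }
        ]
    return {"version": "1.0", "response": response}
```

Because the speech is built first and the screen is optional, a device without a display still receives the complete experience.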

Finally, don’t forget to implement unit tests to ensure your code runs smoothly and stays error-free before you proceed to check VUI errors.
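
For instance, a unit test for a fallback handler might look like the sketch below, written with Python's standard unittest module. The fallback_response function is a hypothetical stand-in for your own handler, included only so the test has something to exercise.

```python
import unittest


def fallback_response(question):
    """Hypothetical handler under test: re-ask the question, keep the session open."""
    return {
        "speech": f"Sorry, I didn't understand. {question}",
        "end_session": False,
    }


class FallbackResponseTest(unittest.TestCase):
    def test_reasks_the_question(self):
        result = fallback_response("Where would you like the driver to pick you up?")
        self.assertIn("pick you up", result["speech"])

    def test_keeps_session_open(self):
        result = fallback_response("Which would you like?")
        self.assertFalse(result["end_session"])
```

Saved in a file, these tests can be run with `python -m unittest`. Tests like these check the code paths only; how the prompts actually sound to users still needs to be reviewed separately.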

Portions of this content originally appeared as part of SoundHound’s Finding Your Brand Voice: 6 Ways to Build a Better VUI Guide

