Let’s Talk About Brand Voice

Elissa Dailey
Strategy Director

Cutting through the clutter is hard for any brand, and doing so in a memorable way is even harder. With many voice assistants, we lose an important component of sense memory – sight – and rely mainly on hearing to spark association, memory, and interaction. A literal brand voice therefore becomes even more important to define, differentiate, and master to ensure memorability and impact.

The Perfect Persona: One of the first questions many brands ask themselves when building a voice-based experience is “should we use the device’s voice or a real voice?” Either decision leads to the development of a persona, which then raises further questions: “is this polarizing to our audience(s)?” and “does this accurately represent the brand and the way consumers view it?” Using a real persona in voice-based experiences is akin to selecting a brand ambassador for your newest product. It has the potential to greatly – and immediately – influence consumer perception. Using the device’s persona puts more strain on the voice and tone, as those have to work harder to portray your brand the way you want it represented.

Authentic Voice & Tone: A challenge just as prominent as selecting a persona is appropriately communicating and maintaining brand voice and tone. Just as you inject your brand voice into messaging, marketing, and other communications, you will need to script your desired voice, and the corresponding tonal adjustments, into your dialogue flow, and on top of that make it feel authentic and natural. An added layer of complication is that while many marketing channels are static in their messaging, voice is not: you have to account for what can be hundreds or thousands of interactions, all maintaining a consistent voice.

Anthropomorphism: An additional and rather complicated challenge of voice is that of anthropomorphism, or the state of having human-like characteristics.

When technology is anthropomorphic, which voice inherently is, the expectations around how that technology performs increase.

When your voice experience fails to adapt to or integrate expected anthropomorphic tendencies, users take notice.

So how can these challenges be addressed when accounting for brand voice in voice-based experiences?

The Perfect Persona: When selecting someone to represent your brand in your voice-based experience, you can begin by asking yourself a series of questions. Does this persona give off the qualities of how you see your brand – are they authoritative, welcoming, positive, or serious? If this person is of note, do they have any associated baggage? Can the audience relate to them or does this person sound like they’re the audience’s child or parent (and if so, is that a good or bad thing)? Selecting a persona is just as important as selecting a brand ambassador and should go through just as stringent a vetting process.

Authentic Voice & Tone: Scripts are a key component of building a voice experience, as they quite literally guide the experience from point A to point B.

Just like in any other kind of script (movies, TV, plays, etc.), character consistency and development are important to ensuring fluidity and can increase the potential for a deeper connection with your audience.

To maintain a consistent voice throughout, try building a character profile to define the limitations of the voice and to account for areas of tonal fluctuation throughout the script. Outline, via a storyboard, the values of your brand and how you think those could be communicated or represented, the mood you want to project as well as the one you want to get back from your users, and more. Additionally, storyboard a variety of interactions you expect to have with your users to determine the specific tonal adjustments you’ll want to make in each nuanced situation.
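
One way to make a character profile concrete is to capture it as structured data that scriptwriters and developers can share. The sketch below is only an illustration under stated assumptions: the CharacterProfile class, its field names, and the example situations are hypothetical and not part of any voice platform's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CharacterProfile:
    """Hypothetical character profile for a brand persona."""
    name: str
    core_traits: List[str]   # qualities that stay constant across the script
    avoid: List[str]         # limits of the voice: things the persona never does
    tone_by_situation: Dict[str, str] = field(default_factory=dict)
    default_tone: str = "neutral"

    def tone_for(self, situation: str) -> str:
        """Return the tonal adjustment storyboarded for a situation,
        falling back to the default tone when none was defined."""
        return self.tone_by_situation.get(situation, self.default_tone)


# Example values are invented for illustration only.
profile = CharacterProfile(
    name="Example Brand Assistant",
    core_traits=["welcoming", "positive"],
    avoid=["sarcasm", "jargon"],
    tone_by_situation={
        "greeting": "warm",
        "error": "apologetic",
        "order_confirmed": "upbeat",
    },
    default_tone="friendly",
)

print(profile.tone_for("error"))       # a storyboarded situation
print(profile.tone_for("small_talk"))  # not storyboarded, uses the default
```

Keeping the core traits separate from the per-situation tones mirrors the distinction above: the voice stays fixed while the tone shifts with the situation.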


Anthropomorphism: To give your voice experience some of the same characteristics that we as human beings display when conversing, it is critically important to adopt a variety of anthropomorphic qualities.
  • Natural, authentic language: write scripts the way you’d speak naturally. If your scripted dialogue isn’t something you’d say out loud, but rather something you’d read, consider adjusting. You can test this by “reading lines” with a partner.
  • Contextual relevance: pick conversations back up right where they left off to show the user that you’ve not forgotten the progress they’ve made. This helps the user understand that each conversation builds on past conversations, growing the relationship between user and voice experience.
  • Conversational markers: use words and phrases like “thank you”, “got it”, and “please” to personalize the conversation and speak to the user in a way they’re used to from conversations with other humans.
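
The last two qualities can be sketched in a few lines of code. This is a minimal illustration, assuming the experience keeps a simple per-user session record: the session dictionary, the last_step key, the MARKERS list, and both function names are hypothetical and do not come from any particular voice platform.

```python
import random

# Hypothetical acknowledgement phrases used as conversational markers.
MARKERS = ["Got it.", "Thank you.", "Thanks!"]


def resume_prompt(session: dict) -> str:
    """Contextual relevance: pick the conversation back up where it left off."""
    last_step = session.get("last_step")
    if last_step is None:
        # No stored progress, so greet the user as a first-time visitor.
        return "Welcome! What would you like to do?"
    return f"Welcome back! Last time we were {last_step}. Want to keep going?"


def acknowledge(reply: str, rng: random.Random) -> str:
    """Conversational markers: prefix a scripted reply with an acknowledgement."""
    return f"{rng.choice(MARKERS)} {reply}"


session = {"last_step": "choosing a drink size"}
print(resume_prompt(session))
print(acknowledge("One medium latte, coming up.", random.Random(0)))
```

Varying the marker rather than repeating the same one each turn is a design choice made here to keep the dialogue from sounding mechanical; a real script would choose markers to fit the persona.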

How will you know if you’ve done your brand justice?

To gauge the efficacy of the persona, voice, tonal adjustments, and anthropomorphic characteristics you’ve selected for your voice-based experience, you can explore a few forms of testing. The simplest, which can happen early in your voice design process, is a targeted survey of, or focus group with, identified audiences. A survey can help gauge immediate reactions and can uncover unexpected insights to consider before you actually design and develop a prototype.

Another way to test these elements is through user-testing of a low-fidelity prototype. Having users log their interactions and reactions to those interactions can help you get a sense of how well the experience is providing value and connecting on an emotional level, if at all.

We’ve explored a variety of paths in our work with brand partners to find a perfect, representative brand voice.

We worked with Sesame Street to recreate Elmo’s voice in a kid-targeted voice experience, using the appropriate language, voice, and tonal adjustments to ensure an authentic Elmo and Sesame Street feel. We took thousands of clips and strung them together to form scripted Elmo statements and responses, accounting for a variety of scenarios in which Elmo may have to react.

For Starbucks, we wanted the Alexa Skill to incorporate some of the in-person characteristics of a barista that make the ordering experience successful and unique, so we visited brick-and-mortar locations to listen to how the baristas spoke. This helped us craft the voice and language of the experience, understand the importance of using a customer’s name, and more.

Portions of this content originally appeared as part of SoundHound’s guide, Finding Your Brand Voice: 6 Ways to Build a Better VUI.
