They go by many names, but AI-generated virtual agents that seem to talk, move and gesture like real people look poised to take off in the coming years. Recent fundraising milestones indicate that this space is heating up and that investors see significant applications for this technology across both the real world and the emerging VR and AR landscape.
To dive deeper into this space, I recently sat down with Joe Murphy, who leads US business development for DeepBrainAI, to explore the business potential of virtual humans, examine the technologies that enable them, and discuss the many challenges ahead as this approach to human-machine interaction matures and scales.
- Joe, thanks for your time. First, can you help orient us as to what an “AI human” is, and what are some killer use cases?
Hi Eric, thanks for the opportunity to share. For this conversation, I will use the terms “AI Human,” “Virtual Human,” and “Digital Twin” interchangeably. In general, they are variations on a common theme: a digital representation of a person that looks, sounds, and acts like the actual person.
On the topic of “killer use cases,” I believe that AI Humans represent the logical evolution of virtual assistants and chatbots. So, augmenting any chatbot with an AI Human is the killer use case. What I mean is that most existing and future chatbot use cases can be enhanced with a Virtual Human. For example, when speaking to Alexa or Siri, why do I receive a disembodied voice response from a black box? This is then followed by some awkward conversational turn-taking, which is haphazardly guided by flashing lights and pulsing icons.
Previously, technical limitations meant that a synthesized face risked becoming a case study in the uncanny valley, so the disembodied voice assistant made sense. More recently, video synthesis technology has progressed to the point where Virtual Humans can be indistinguishable from actual humans. So, we are no longer constrained to having a conversation with a faceless black box.
Not to sound overly enthusiastic, but I compare the upcoming Virtual Human chatbot evolution to several other technology shifts in which video was preferred and overtook the audio-only solution.
Entertainment: Radio → TV
Communication: Phone Call → FaceTime Call
Business: Conference Bridge → Zoom Meeting
Each of the paradigms above was noticeably improved with the addition of video. Adding human-centric video almost always creates a more enjoyable and natural interaction. So, we fully expect that adding AI Humans to chatbots will follow this same pattern of acceptance and adoption.
- I understand there have been some recent converging technological advancements that are critical to delivering on the potential of AI humans. What are some of the foundational technologies that make doing this possible today versus even a few years ago?
I won’t get too heavy into the details here, mostly because it is out of my depth ;-). At the highest level, three main drivers come into play and create the perfect storm for AI Humans.
- Ever-increasing CPU and GPU power in the cloud and on the edge.
- Ongoing and incremental improvements with deep learning research.
- Faster download speeds at home and in public, due to both readily available broadband and nationwide 5G access.
- I know from some of our work at RAIN that these AI humans are not just theoretical, they’re already out there being leveraged by businesses and even politicians. Can you give me a few specific examples of AI humans in action?
Oh yes! Let’s get out of the lab and into the real world with some of DeepBrain AI’s Virtual Humans. Each one of these could be a complete case study, but I will provide an overview here and a link with more details.
Banking – KB Bank has implemented AI Humans in several of their branches across Korea. The AI Humans are presented in nearly life-size format via high-tech kiosks in the lobby. Users are greeted by the kiosk and can ask questions about banking services as well as be directed to the appropriate banking staff. More details here.
Retail – The well-known convenience store brand 7-11 has started a proof of concept at a human-less store, supported by a Virtual Human that provides information on the products in stock as well as current price specials. More details here.
Media/Entertainment – Several news stations across Korea and China have used DeepBrain AI to create a Digital Twin for their lead anchors. The “AI Anchors” look and sound like the actual anchors, but they can be used for quick news updates and breaking news throughout the day. More details here.
Politics – Probably one of the most unexpected use-cases, Korean President-Elect Yoon Suk-yeol used an AI Human of himself to effectively communicate with younger voters. From the Wall Street Journal article, “More than 80 clips of Mr. Yoon’s digital self have been shared on social media, attracting more than 70,000 comments since making a debut in January. The daily videos are typically 30 seconds or less. Mr. Yoon’s campaign staffers choose a voter question to answer and write a script for the avatar.” More details here.
- There are some obvious and significant ethical questions to explore here. From a technical execution standpoint, is human-level mimicry and indistinguishability from a real person the goal? How do you ensure that disclosures are made when the AI is in use? How do you avoid and manage the potential for deepfakes to be made without a person’s consent, or at least minimize the damage done from these? Can you share how you’re thinking about these questions at DeepBrainAI?
In the AI Banker and AI Retail scenarios, there really isn’t much of an ethical issue. It is basically a chatbot scenario guided by an AI Human. The AI Anchor and AI Politician scenarios could be a bit trickier if bad actors accessed the AI Human models and user interface. However, the AI models are kept secure by the clients and their IT teams with the most up-to-date security protocols and processes. To date, DeepBrain’s clients have chosen to clearly identify when they are using our AI Humans. For example, in Korea, they referred to the model as AI-Yoon whenever the content was delivered with DeepBrain AI technology. Similarly, the AI Anchors show a clear indication on the screen that it is an “AI Anchor.”
- A lot of people have a visceral reaction to the so-called “uncanny valley” effect of almost-but-not-quite human robots and avatars. Is that something you expect to change over time as we get more exposure to these AI humans, or as the AI improves? Is there data about how people respond to interactions with AI humans vs., say, “traditional” humans or vs. non-embodied “traditional” conversational AI agents?
I have seen many companies that create AI Humans that look like figures from the wax museum come to life. They are clearly deep in the middle of the uncanny valley. DeepBrain AI’s Virtual Humans are nearly indistinguishable from the actual humans. I will let you judge for yourself. Take a look at this video.
As far as data goes, we are still in the process of collecting data based on our clients’ feedback, but so far we are seeing very high usage and engagement across all sectors.
- There are a number of companies working in this field, some more focused on customer service in the real world, others looking farther ahead to the metaverse and the creation of avatars in virtual worlds. What is DeepBrain’s focus and what’s unique about your approach to developing AI humans?
Without giving away too much of our product roadmap, I can share that I consider the 2D AI Humans of today to be the doorway through which brands and businesses will reach the 3D AI Humans of tomorrow and the metaverse.
- What’s the value of investing in this technology now?
Different businesses have different needs and see different benefits from implementing a Virtual Human strategy. For example, news and media organizations can create content faster and at a lower cost. Banks and retail stores can provide instant and reliable answers to customer questions, delivered in a natural and authentic experience. Celebrities and social media influencers can provide a steady stream of content to their fans and followers.
Finally, businesses and brands across the country are pondering, “What’s our strategy for the metaverse?” They need to be working with Virtual Humans today.