
Ask an Expert: The evolution and outlook of voice user interfaces

Mark Ammendolia | December 28, 2020

If you've ever searched for podcasts about conversational AI and voice-enabled technology, there's a very good chance you've come across VUX World. Founded and hosted by Kane Simms and Dustin Coates, VUX World has quickly grown into one of the most popular AI podcasts in its class. From helping brands innovate on Alexa and Google Assistant to explaining how enterprises can use conversational AI to solve core business challenges across existing channels (contact centers, IVRs, websites, etc.), Kane and Dustin are set on helping businesses and individuals educate themselves on this burgeoning tech.

Kane Simms recently joined Voiceflow's Ask an Expert webinar series, where he responded to a host of questions submitted by our community. We've summarized many of his answers, which focus on the evolving voice landscape, enterprise adoption of VUIs, and the pandemic's impact on current industry trends.


More on Kane Simms:

Kane Simms is one of the world's premier voice strategists working with top brands on conversational AI initiatives. He's also a VUX designer who has taught hundreds of students how to build conversational user interfaces. Before starting VUX World, Kane spent six years leading corporate digital transformation initiatives and implementation teams.

A speaker, host, thought leader, and well-known personality, Kane is passionate about helping organizations take advantage of conversational AI, which is leading to the most significant shift in user behavior since the smartphone. (Source: VUX World, LinkedIn)


Full Recording Below:

Voiceflow's AMA with Kane Simms, Co-founder of VUX World

What you'll learn by reading this:

  1. Voice is bigger than Alexa and Google Assistant
  2. Navigating the changing voice landscape
  3. Enterprise adoption of VUIs
  4. The biggest mistakes businesses make when adopting voice technology
  5. The pandemic's impact on industry trends

Q1: What excites you the most about voice tech right now?

Kane: I think what excites me more than anything is that I've got a lot of experience working in service design and digital transformation initiatives. With the team I used to manage at my old job, we'd kind of work across the organization, dig up everything, rebuild it, [and] automate as much as we can. [It was] really end-to-end digital transformation.

And so we've always been looking at voice through that lens. [Asking questions like] how can we use this technology to make companies more efficient? How can we make customer experiences more streamlined? How can we make business units work and function better? How can we increase revenue by providing seamless experiences?

A lot of people get into voice from the Alexa smart speaker ecosystem, and that's where it kind of started for me initially. What happens is you'll discover Alexa or discover the Google Assistant and then you'll start building and playing around with it. Often, this is [focused on] education, interactive audio, or entertainment-based use-cases, and so that's where a lot of people begin their journey.

But as soon as you start looking under the hood and investigating where voice can be used effectively, you start realizing its benefit in doing those things.

I've mentioned improving customer experience, improving access to streamlined services, enabling self-service, freeing up contact centers. That's what is exciting to me, [...] that voice is bigger than Alexa and Google Assistant and that this concept of voice is being broadened to conversational AI. And so we tend to talk about conversational AI now rather than voice specifically.

What excites me most, however, is that having gone through this journey (through entertainment, through leisure, through interactive audio), we're now starting to find areas where the technology is good enough and the businesses are ambitious enough to start really searching for those use-cases where this [technology] can be used to increase productivity or streamline service access.

If you look at Moscow Airport — the ultimate IVR system with a conversational agent — [they are] saving 30% of calls from hitting their contact center. That's what's exciting to me about voice and about conversational AI — real practical application of voice technologies that can be used to measurably enhance customer experiences.


Q2: How do you see voice experiences evolving right now? Do you think we're still a long way from the adoption phase?

Kane: Well, it depends. One of the difficulties about having people adopt stuff right now is that if you're building for Alexa or Google Assistant, you need to draw them into that ecosystem for the purposes that you're creating stuff for.

And that's where a lot of the challenges are. People are always talking about discoverability because the Alexa ecosystem is over here, the Google Assistant ecosystem is over there, but everyone is already on social media, and phoning your contact center, and on your website. And so while people are already behaving one way, we want to bring people over here to [use] this.

People are gradually using these services (Alexa and Google Assistant) for routine things like timers and music. I think that's why it's difficult to get things discovered: you're not only fighting to be that one app that's at the tip of the iceberg, but you're also fighting to bring people over into an ecosystem to discover they can do more than just music.

And so if you think about applying these same technologies in other areas, like in an app where you have a voice interface on top of your experience, or on a website where you can streamline user journeys through voice search, there is no struggle or barrier because the technology is already baked in.

I don't know if it's early in terms of adoption in those channels (maybe it's early in the adoption of third-party skills for Alexa and Google Assistant), but we're starting to see businesses and enterprises wake up a little. All the big players like Microsoft, Amazon, Google, Oracle, and Salesforce have got services. Everyone's building out the capability, and so we're starting to see this beginning to happen at other larger companies.


Q3: Is there an industry that VUIs will impact the most in the future?

Kane: To be honest — it's anything and everything. I think we're going to continue to see new devices. I think smart speakers are a brand new category. There was no such thing as smart speakers before. There was no such thing as smart displays before. And so I think we're going to continue to see new devices pop up. What will they be? Who knows. But I think that we will continue to see a proliferation of different devices with voice interfaces in them.

I also think we're going to see voice interfaces on all our current devices — and they kind of already are. If you've got any Apple gear for example, then you've got Siri on everything. If you've got anything Android, you've got Google Assistant on everything. And so we already see that.

We're going to see those platforms, Alexa and Google Assistant, continue to grow. I think we're going to see more devices, more [voice] interfaces on current devices, and we're going to see all sorts of stuff spring up on the environments that we already go to to get stuff done. Whether it's in apps or in phone lines, [it'll be] everywhere.


Enterprise adoption of VUIs

I think we're going to see enterprise really adopt it and start to open up. But this is the thing — as soon as you start getting into enterprise and begin wanting to integrate with line of business systems, you very quickly realize that many of those systems are legacy-based, and it's very difficult to integrate with them.

A lot of the work that we used to do when we were designing services [involved working with] systems [that] were so old and clunky that you would take data dumps overnight into a data warehouse and integrate with that.

And so what voice technology does is the same as mobile, social media, and the internet, which is to make businesses question how they operate and function. Voice technology is another tool in the digital transformation arsenal that enterprises can use to be more efficient, operate more cost-effectively, and scale what they do. I think we're going to see a lot more enterprise use-cases as well.


Q4: What are the biggest mistakes you've seen businesses and creators make when getting into voice?

Kane: Just because you think you've got a good idea doesn't mean it is. One of the most crucial things that you can do, as painful as it is, is to put something in front of someone who intends to use it as early as you can. You tend to see a lot of hammers searching for a nail, or solutions that people think are a good idea and build, but that might not solve a specific problem. They might not have a specific use-case that's being addressed.

And so we tend to advise that you start with the user. Start with the problem. Even though you're thinking about building a voice solution, forget it at the start and try and do your user research. Speak to your end-users. Speak to the person who is ultimately going to use this thing. Find out how they currently use and access your company [or] how they currently buy your products and access your services. What's wrong with those existing channels, and what are the expectations? Why do they choose a competitor over you? Then decide and see whether or not there is a problem that a voice interface will solve.

This way, you end up finding the nail and then building the hammer, rather than trying to build something that you think is a cool idea.

[Build something] to learn by all means. If it helps you learn, if it helps you understand the platform, if it helps you get your head around how to build [and] design stuff — by all means, build stuff and practice, practice, practice. But to release something into the world — for a business to produce something — it needs to be rooted in what is going to be valuable for the user. If you look at the top 10 skills in the Alexa Skill store in America, none of them are branded. NPR and Jeopardy are the top two [and] the rest of [them] are things that people have found a use-case for and gone ahead and built themselves.

That's the hardest part for businesses — trying to find out where you can align and what you are trying to achieve [in terms of] your business objectives [in relation to] a genuine user need or problem [while] having voice be the solution in the middle.


Q5: Within the voice-tech world, what trends have you seen since the beginning of the pandemic?

Kane: What is happening or what we're noticing is that places that have stores or physical locations that are no longer open [but] can sustain themselves — [like] the big companies — are now looking at ways to increase or improve in-store engagement for after COVID. A lot of conversations we've been having are around how to create in-store experiences using voice technology so that when people go back into stores they don't have to touch quite as much stuff and can self-serve.

So that's a future situation that is being worked on now, so when stores do reopen, you might find voice experiences in stores a little bit more.

In terms of the stuff that's happening right now, I mentioned IVRs and voice in the phone lines and things like that. One of our partner organizations that specializes in IVR software and the creation of interactive IVR systems is run off its feet at the moment, building things for government organizations and for the travel industry, which has been hit really hard.

[It's really] conversational technologies in general. It doesn't need to be voice specific; it could be a chatbot on a website or a Facebook Messenger bot. A really good example is the WhatsApp bot that the World Health Organization released a few weeks back.


And so [this technology] is being used to help organizations manage an increase in demand and scale their customer support and customer service, as well as being thought about now [as a solution] to help plan for when the stores eventually reopen.

Q6: What resources did you use to level up your understanding of VUIs?

Kane: I used to watch the Amazon team's Twitch streams quite a lot. They came in handy. I thought they were really, really good because they take you through a step-by-step [process] of how to build a skill while talking you through absolutely every single part of it. Even though it was very technical, and I'm not a developer, it helped me understand it enough to hold my own in a conversation with either a client or a guest on a podcast.

The Amazon blog was pretty good [as well]. Google's also published quite a lot of stuff. And then much of [my early knowledge] came from actually talking to people, you know?

I also think that teaching is the best form of learning, and so I've been a VUI design instructor for a while now with the VUX Academy. Going through that process helps you uncover things that you might not have thought of before running workshops. We've been doing workshops now for almost two years, and with every single one, you learn something different.

So it's a mixture of trying to learn from the guests that we have on the podcast and learning from the people who are building this stuff at Amazon and Google, but then also through the experience of building skills, through trying things, and exploring tools like Voiceflow and others.
