Consumer-facing AI systems can carry out tasks or services for an individual based on commands or questions.
A JSON file that defines your Actions. This file includes information for the Actions directory listing, account linking information, a list of intents that the Actions can handle, and the actual fulfillment endpoints.
A phrase that opens a specific action when spoken to Google Assistant. Also known as the "invocation phrase" or "implicit invocation intent". An example of this would be "Ok Google, open Trivial Pursuit".
A web tool for testing and debugging Actions in real-time. The simulator lets you test your Actions for all surfaces that the Google Assistant supports, without requiring a physical device.
Amazon has created a development tool that allows you to create, modify and delete skills. Coding required.
The Alexa Developer Console is a conversational platform that allows developers to build, test, distribute and certify Alexa Skills.
This is Amazon's voice-first design language that makes it easy to create visually-rich Alexa skills for millions of Alexa devices with screens. APL enables creators to build interactive voice experiences that include graphics, images, slideshows, and video and to customize them for different device types such as Echo Show, Fire TV and select Fire Tablet devices.
A set of actions or tasks that are accomplished by Alexa — Amazon's voice assistant. Skills are like apps for Alexa. They help customers perform everyday tasks or engage with content naturally through voice.
A collection of APIs, tools and documentation for giving Alexa new capabilities.
This is a term used to describe a state where technology is omnipresent and accessible whenever required.
This is code that allows two software programs to communicate with each other.
Stands for "applications". Apps are pieces of software written for a specific platform that are meant to do a particular task. For example, on the iPhone platform, you could create a calculator "app" that utilizes the software and hardware in the iPhone.
An application with interactive voice response (IVR) systems that automatically answers, directs, and transfers incoming calls to an extension without the need of a phone operator/receptionist.
Computer technology that can identify and process human voice. It is mainly used to convert spoken words into computer text. ASR is also used for authenticating users via their voice and performing an action based on the instructions defined by the user. Typically, automatic speech recognition requires preconfigured or saved voices of the primary user(s). It is also known as Automatic Voice Recognition (AVR).
Programs that automate conversations on the web or in instant messengers.
Refers to the use of messaging apps, speech-based assistants (Amazon Alexa, Google Assistant, etc.) and chatbots to automate communication and enhance machine learning, which can in turn create personalized experiences at scale.
A conversational user interface (CUI) is a platform that houses artificial intelligence-supported voice apps, chatbots and IVRs to have verbal or written interactions with human users. The goal of CUIs? To mimic human conversation.
When something unexpected happens in the conversation between Alexa and the customer. Types of dialogue errors include low confidence errors, timeouts/silence/no input, and false accepts.
A design system that offers a more flexible way to design customer-centric voice experiences. This system involves writing more scripted dialogue between the voice assistant (Ex. Alexa) and the customer so that you can take those conversations and convert them into storyboards.
Dialogflow is a conversational platform that lets developers design and build Google Actions, chatbots, and conversational IVRs. Voiceflow allows you to import your projects to Dialogflow, where you can publish your Actions to Google Assistant. Unlike Voiceflow, coding is required.
When the customer says a command like exit or stop to end the interaction.
When Alexa has mid to high confidence that she correctly understood what the customer said, but she actually misunderstood.
Skills that have been built specifically for Amazon Alexa's 'Flash Briefing' feature, which provides users with news headlines and updates, event information, local weather reports and other forms of short-form content.
A service, app, feed, conversation, or other logic that handles an intent and carries out the corresponding Action.
A set of actions or tasks that are accomplished by Google's voice assistant.
A developer tool that lets you create, maintain, test and publish Actions.
A program interface that uses a computer's graphic capabilities to make it easier to use. GUIs make it possible for users to interact with electronic devices (computers, phones, gaming devices, etc.) through visuals like graphical icons. It is occasionally referred to as "gu-ee".
A happy path is a streamlined path of execution (in a voice app, for example) that follows a default progression of events in which no exceptional or error conditions arise. This is ideal when building the simplest flow of logic through a system or task. Where the "happy path" falls short is in identifying and planning for unexpected inquiries that land outside of the default progression of the event or task.
The physical hardware portion of a platform, such as your physical iPhone. It is a shell that is useless without software giving it instructions for what to do.
With in-skill purchasing (ISP) for Alexa skills, you can make money through your skills by selling digital products to customers.
Tasks your assistant can do for you. Simply put, an intent is the user's intention in a given sentence or command. For example, if the user said "Order me a large mocha coffee", the words "order" and "coffee" would signal the intent: ordering a coffee.
Based upon the idea that a computer needs specific information to understand human language. The interaction model provides the necessary information for a computer to understand and process a given voice request or command. This incorporates the use of utterances, intents and slots which all map out a user's spoken input. (see these definitions for more info).
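As a rough illustration of how utterances, intents and slots fit together, here is a minimal Python sketch. The model layout, the intent names (`OrderCoffeeIntent`, `CancelIntent`) and the `resolve` helper are all hypothetical and only loosely inspired by real platforms. They are not the Alexa or Google schema, and real NLU engines use statistical matching rather than exact patterns.

```python
import re

# Hypothetical interaction model: each intent lists sample utterances,
# and each slot lists the values it is allowed to take.
INTERACTION_MODEL = {
    "OrderCoffeeIntent": {
        "samples": [
            "order me a {size} {flavor} coffee",
            "i want a {size} {flavor} coffee",
        ],
        "slots": {
            "size": ["small", "medium", "large"],
            "flavor": ["mocha", "vanilla", "caramel"],
        },
    },
    "CancelIntent": {"samples": ["cancel", "never mind"], "slots": {}},
}


def resolve(utterance, model=INTERACTION_MODEL):
    """Return (intent_name, slot_values) for the first matching sample, else (None, {})."""
    text = utterance.lower().strip()
    for intent, spec in model.items():
        for sample in spec["samples"]:
            # Escape the sample, then swap each {slot} placeholder for a
            # named group that only accepts that slot's allowed values.
            pattern = re.escape(sample)
            for slot, values in spec["slots"].items():
                group = "(?P<%s>%s)" % (slot, "|".join(map(re.escape, values)))
                pattern = pattern.replace(re.escape("{" + slot + "}"), group)
            match = re.fullmatch(pattern, text)
            if match:
                return intent, match.groupdict()
    return None, {}
```

Calling `resolve("Order me a large mocha coffee")` maps the whole utterance to the coffee-ordering intent and pulls out `large` and `mocha` as slot values. An utterance that matches no sample falls through to `(None, {})`.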
An automated phone system that provides pre-recorded voice responses that can interact with callers, gather information, provide information, and route calls to the appropriate recipients via voice or touchtones on a keypad device.
A device or program enabling a user to communicate with a computer.
When creating a custom Alexa skill, you will need to provide an invocation name that users will use to open your skill. For example, you might say "Alexa, play Game of Thrones Quiz". The invocation name here would be Game of Thrones Quiz.
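To make the parts of that spoken request concrete, the sketch below splits it into a wake word, a launch word and the invocation name. The `WAKE_WORDS` and `LAUNCH_WORDS` sets and the `parse_invocation` helper are assumptions made for illustration. Alexa performs this step internally and accepts more phrasings than shown here.

```python
# Assumed, deliberately small vocabularies for illustration only.
WAKE_WORDS = ("alexa",)
LAUNCH_WORDS = ("open", "play", "launch", "start")


def parse_invocation(utterance):
    """Return the invocation name, or None if the request does not fit the pattern."""
    words = utterance.replace(",", " ").split()
    # Expect: <wake word> <launch word> <invocation name...>
    if len(words) < 3 or words[0].lower() not in WAKE_WORDS:
        return None
    if words[1].lower() not in LAUNCH_WORDS:
        return None
    return " ".join(words[2:])
```

With this sketch, `parse_invocation("Alexa, play Game of Thrones Quiz")` yields the invocation name `Game of Thrones Quiz`.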
When Alexa has low confidence that she correctly understood what the customer said. When this occurs, Alexa cannot proceed in the interaction without asking the question again or ending the interaction.
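The decision described above can be sketched as a small function. The confidence score, the `LOW_CONFIDENCE_THRESHOLD` cutoff and the `MAX_REPROMPTS` limit are assumed values for illustration. Alexa does not necessarily expose or use these exact numbers.

```python
LOW_CONFIDENCE_THRESHOLD = 0.5  # assumed cutoff, not an Alexa constant
MAX_REPROMPTS = 2               # assumed limit before giving up


def handle_recognition(confidence, reprompts_so_far):
    """Decide what to do with a recognition result of the given confidence (0.0 to 1.0)."""
    if confidence >= LOW_CONFIDENCE_THRESHOLD:
        return "proceed"
    # Low confidence: ask the question again, but only a limited number of times.
    if reprompts_so_far < MAX_REPROMPTS:
        return "reprompt"
    return "end"
```

Capping the number of reprompts keeps the customer from getting stuck in a loop when the assistant repeatedly fails to understand.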
A prompt that asks the customer a question intended to elicit a response from a small set of possible options (recommended 5 or fewer). For example, "Hi Mark, you can now hear about the following: your chequing account balance, your savings account balance, or your credit card balance. Which would you like to hear?"
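A prompt of this kind could be assembled from a list of options, as in the sketch below. The `menu_prompt` helper and the `MAX_OPTIONS` constant are hypothetical names. The limit of five follows the recommendation in the definition above.

```python
MAX_OPTIONS = 5  # recommended upper bound from the definition above


def menu_prompt(name, options):
    """Build a menu-style prompt that lists a small set of options."""
    if not 2 <= len(options) <= MAX_OPTIONS:
        raise ValueError("a menu-style prompt should offer 2 to %d options" % MAX_OPTIONS)
    if len(options) == 2:
        listed = " or ".join(options)
    else:
        listed = ", ".join(options[:-1]) + ", or " + options[-1]
    return (
        "Hi %s, you can now hear about the following: %s. "
        "Which would you like to hear?" % (name, listed)
    )
```

Passing the three account balances from the example above reproduces that prompt, while a list of six or more options is rejected.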
Combining voice, touch, text, images, graphics, audio and video in a single user interface. This enhances user interactions by providing information through both auditory and visual means. In a nutshell, it's both GUI and VUI together. Voice (audio) + Graphical Interface (visual). Example: Fire TV
Technology used to aid computers in understanding human natural language. NLP lets people and machines talk to each other “naturally”. An effective NLP system is able to ingest what is said to it, break it down, comprehend its meaning, determine appropriate action, and respond back in a language the user will understand.
NLU can be thought of as a subfield of NLP. NLU more specifically deals with machine reading, or reading comprehension. NLU goes beyond the sentence structure and aims to understand the intended meaning of language. While humans are able to effortlessly handle mispronunciations, swapped words, contractions, colloquialisms, and other quirks, machines are less adept at handling unpredictable inputs. Enter NLU.
A prompt that asks the customer a question intended to elicit a wide range of responses. For example, "What would you like to do?"
A branch of machine learning that utilizes patterns and regularities in data to train systems.
A group of technologies that are used as a base upon which other applications, processes or technologies are developed. In personal computing, a platform is the basic hardware (computer) and software (operating system) on which software applications can be run.
Text that is transmitted in real-time on a device as the user speaks.
In many scenarios, intents alone are not enough to fulfill a request. This is where "slots" come into play. Slots act like traditional form fields in the sense that they can be optional or required depending on what's needed to complete the request. They are variables that relate back to the intent. For example, in the sentence "order me a large mocha coffee", the words "large" and "mocha" would be classified as slots, the options needed to fulfill the user's request.
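The form-field analogy can be sketched as follows: each slot is marked required or optional, and the assistant asks for the first required slot that is still empty. The `Slot` class, the slot names and the prompts are hypothetical. Real platforms configure this through their own dialog models.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    required: bool
    prompt: str = ""  # question to ask when a required slot is missing


# Hypothetical slots for a coffee-ordering intent.
ORDER_COFFEE_SLOTS = [
    Slot("size", required=True, prompt="What size would you like?"),
    Slot("flavor", required=True, prompt="Which flavor would you like?"),
    Slot("milk", required=False),
]


def next_prompt(filled, slots=ORDER_COFFEE_SLOTS):
    """Return the prompt for the first missing required slot, or None when ready to fulfill."""
    for slot in slots:
        if slot.required and not filled.get(slot.name):
            return slot.prompt
    return None
```

If the user only says "order me a large coffee", `next_prompt({"size": "large"})` asks for the flavor. The optional `milk` slot never blocks the request.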
Skills that have been built specifically for controlling smart home appliances.
The code that runs the hardware and makes it useful. Without software, hardware wouldn't have the logic or programs in place to actually do anything. Without hardware to run, software is useless.
An SDK is a collection of software development tools in one installable package. They make it easier for developers to create apps by packaging the necessary tools needed. For example, if you were to build a house, an SDK would include a toolbox specifically for constructing the kitchen. You could still use other tools, or even build your own, but an SDK offers something specific to solving problems or a theme of problems within that area.
The ability of an electronic device to recognize spoken words only and not the individual voice characteristics of the user.
A markup language used to improve the speech output of voice applications (like Alexa or Google Assistant). In simple terms, SSML can help Alexa or Google sound more natural. For example, you can add longer breaks between sentences or even emphasize a certain word.
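As a small illustration of the breaks and emphasis mentioned above, the sketch below builds an SSML string with Python's standard `xml.etree.ElementTree` module. The `<speak>`, `<break>` and `<emphasis>` elements are standard SSML. The `build_ssml` helper and the sample sentence are hypothetical, and each platform supports its own subset of tags and attribute values.

```python
import xml.etree.ElementTree as ET


def build_ssml(first_sentence, emphasized_word, rest, pause="1s"):
    """Return SSML with a pause after the first sentence and one emphasized word."""
    speak = ET.Element("speak")
    speak.text = first_sentence
    # <break time="..."/> inserts a pause between the two sentences.
    pause_tag = ET.SubElement(speak, "break", {"time": pause})
    pause_tag.tail = " "
    # <emphasis level="strong"> stresses a single word.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = emphasized_word
    emphasis.tail = " " + rest
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Your order is ready.", "Two", "coffees are waiting at the counter.")
```

Building the markup through an XML library rather than string concatenation keeps the result well-formed and escapes characters such as `&` in the spoken text.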
Converting written text into artificially produced speech using specialized software. It is also referred to as "read aloud" technology. It works in nearly every personal digital device nowadays, including smartphones, computers and tablets.
These are paths that users follow through an experience. Flows aren't necessarily linear, and can branch out in different paths.
This can be anything the user says. For example, if the user said "order me a large mocha coffee", the entire sentence would be the utterance.
On an Alexa-enabled device with a screen or a display, the viewport is the area of the display that the user can see.
A recorded message that is played by interactive voice response (IVR) systems, message-on-hold systems and other voice processing tools. The goal of the prompt is to guide the user towards their destination — like if they want to see the funds in their savings account or find out the amount they owe on their last credit card statement.
The ability of an electronic security device to recognize the voice of a particular person.
A VUI allows users to interact with a system through voice or speech commands using speech recognition technology (Amazon Alexa, Google Assistant, Siri, Cortana, etc.). It is occasionally referred to as "v-ew-ee".
Programs that automate conversations on phones or voice assistants.
A special word or phrase that is meant to activate a given device once said. An example of these words or phrases would be "Alexa", "Hey Siri" and "Hey Google". These are also called "trigger words".