Engineering

What is the Voiceflow API and how do you use it?

Tyler Han
|
August 30, 2021

Quick Start

If you're itching to get started, check out our pre-written API examples and get a working project in minutes. We offer functional examples in a variety of common programming languages: 
github.com/voiceflow/api-examples

For more detailed info, check out our official documentation:
voiceflow.com/api/dialog-manager

What's this API about?

You may know Voiceflow for design and prototyping, but we know that you can do so much more with your project. As we expand to different platforms, we still want to give you the ability to plug your Voiceflow project in anywhere a conversation could exist: a webpage, a chatbot, a phone call.

The best part about an (HTTP) API is that it can be called from anywhere, on any device, as long as it has an internet connection. It's a universal interface that any programming language or tool can use. Even within a Voiceflow project, you can call APIs through the API block! 

What is a conversation?

Before we dive into the technicals, let's break down what it means to have a conversation - with a computer or with another human.

A conversation starts with you asking something - a "request" - and then getting a "response" back based on what you asked, plus the context of what happened prior.

Now, the request can take a variety of shapes - it could be a voice phrase, selecting a number on the dial-pad, or pressing a button on a chatbot. And the response could be as simple as a line of plain text or as complex as images, videos, or even performing an action on an app.

So a conversation, regardless of whether it's with Alexa, Google Assistant, Facebook Messenger, chatbots, or IVRs, looks like a series of request, response, request, response, request, response...

If you're familiar with API calls, you can see how it's easy to adapt this model into an API interface.
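To make that pattern concrete, here's a tiny sketch in plain Python (the lines of dialogue are made up for illustration) of a conversation as an alternating series of turns:

```python
# A conversation is just alternating turns: request, response, ...
# The dialogue lines here are invented for illustration.
conversation = [
    ("request", "what's on the menu?"),
    ("response", "we have pizza and pasta"),
    ("request", "one large pizza, please"),
    ("response", "coming right up!"),
]

# Every even turn is a request, every odd turn is a response.
roles = [role for role, _ in conversation]
print(roles)
```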

Getting started on Voiceflow

Your Voiceflow project is naturally a fluid conversation model. If you've run your project as a prototype, you'll see that we send a request to launch it and get back a response, which is a series of instructions. The response continues until the next prompt or choice block, where we wait for the next user interaction. From there, you send another request, the next response follows, and so on.

It's a little easier looking at an example with the turns labelled:

The blue is the user request and everything else is part of the response. Turn 1 has an implicit request, which is to launch the conversation.

This is a simple example, but Voiceflow is not limited to being a linear flowchart. You can create complex, open-ended conversations that let the user switch contexts and jump around, even with no lines between blocks! For more info, learn about the Intent block.

💡 What's interesting is that the Voiceflow test tool calls the exact same API described here - so there's nothing stopping you from making something even better than the test tool!

Calling the API

A web API is just a link you go to in order to retrieve things: if I ask a weather API what the weather will be in two days, it gives me back a report. If I ask the Voiceflow API to reply to "tyler" after he requests a pizza, I get back a response.

It's almost as if you're speaking with the API, with each API request representing a turn in the conversation.

There are a few pieces of key information that you need to give to the API endpoint every time:

  • versionID - this helps identify the particular Voiceflow project you are running.
  • API Key - this authenticates you so someone can't just spam your project.
  • userID - to keep track of who is talking to the API. Where {% c-line %}user 2{% c-line-end %} starts their conversation could be a totally different section than {% c-line %}user 1{% c-line-end %}. Each user has their own context and progression within a conversation. We'll talk about this more in the state vs stateless section. Make sure to URI encode this value.
  • request - the actual user action. This could be what they typed, what they said, launching the project, a button pressed, etc.
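As a sketch, here's how those pieces might be assembled into a call. The endpoint path and header names below follow the Dialog Manager docs at the time of writing - double-check them against the current documentation - and the credentials are placeholders:

```python
from urllib.parse import quote

# Placeholder credentials - use the values from your own project's
# Integration > Developer > API page.
API_KEY = "VF.DM.xxxxxxxxxxxx"   # authenticates you
VERSION_ID = "production"        # identifies the project version
USER_ID = "user 1"               # identifies who is talking

# The userID appears in the URL path, so it must be URI-encoded.
url = (
    "https://general-runtime.voiceflow.com"
    f"/state/{VERSION_ID}/user/{quote(USER_ID)}/interact"
)
headers = {
    "Authorization": API_KEY,
    "Content-Type": "application/json",
}
# The actual user action goes in the request body.
body = {"request": {"type": "launch"}}
print(url)
```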

The request is a JSON object called {% c-line %}request{% c-line-end %} with {% c-line %}type{% c-line-end %} and {% c-line %}payload{% c-line-end %} properties, while the response is a JSON array of "traces", each also with a {% c-line %}type{% c-line-end %} and {% c-line %}payload{% c-line-end %}. (You can also make custom requests and traces in your Voiceflow project.)

Here's what it looks like:
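As an illustrative sketch - the trace types and payloads below are simplified, and real responses depend on your project:

```python
# One turn of a conversation, simplified: a request goes in,
# and an array of traces comes back. Payloads are illustrative.
request = {
    "request": {
        "type": "text",
        "payload": "I'd like a large pepperoni pizza",
    }
}

# A typical response: a JSON array of traces,
# each with a type and a payload.
response = [
    {"type": "speak", "payload": {"message": "One large pepperoni, got it!"}},
    {"type": "choice", "payload": {"buttons": [
        {"name": "Add a drink"},
        {"name": "Checkout"},
    ]}},
]

# A client renders each trace according to its type.
for trace in response:
    if trace["type"] == "speak":
        print(trace["payload"]["message"])
```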

Just keep following it up with additional API calls and now you've created a conversation!

A great place for resources is the left bar, under Integration > Developer > API. This is the central portal for managing your API credentials and getting tips.

If this all makes sense and you're ready to get started, check out our code examples and in-depth documentation! These describe all the specific types of requests and responses you might get.

Building out

Now everything so far in this article has been pretty abstract, but we wanted to give you a small taste of what you could do with the Voiceflow API. Here's a gallery of some of the integrations built by the team:

Webchat

Webchat Assistant


Facebook Messenger + Telegram

Facebook Messenger & Telegram Bot


Webchat

Webchat Experience


Slack

Slack chatbot

Customization

Custom Actions

Maybe in your use case for the API, you want a response that charges the user's credit card, or navigates to a different part of the website. You're not limited to what's available on Voiceflow, because with custom actions you can do something like this:
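For instance, a client could watch for a custom trace type and run its own logic. This is a hypothetical sketch - the trace type {% c-line %}charge_card{% c-line-end %} and its payload fields are invented for illustration, since you define your own custom actions in Voiceflow:

```python
# A hypothetical client-side handler for a custom action trace.
# The "charge_card" type and its payload fields are made up -
# custom actions are whatever you define in your project.
def handle_trace(trace):
    if trace["type"] == "charge_card":
        payload = trace["payload"]
        # a real client would call its payment provider here
        return f"charging ${payload['amount']} to card {payload['last4']}"
    elif trace["type"] == "speak":
        return trace["payload"]["message"]
    return None

result = handle_trace(
    {"type": "charge_card", "payload": {"amount": 20, "last4": "4242"}}
)
print(result)
```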

Custom NLP

If you want to use your own Natural Language Processing, this can easily be done by specifying the request type as {% c-line %}intent{% c-line-end %} instead of {% c-line %}text{% c-line-end %}. This prompts Voiceflow to skip our NLP.
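A sketch of what an intent request might look like - the intent and entity names here are made up (they must match the intents defined in your project), and the exact payload shape should be checked against the Dialog Manager docs:

```python
# Sending a pre-resolved intent so Voiceflow skips its own NLP.
# "order_pizza" and "size" are illustrative names - they must
# match what your project defines.
intent_request = {
    "request": {
        "type": "intent",
        "payload": {
            "intent": {"name": "order_pizza"},
            "entities": [{"name": "size", "value": "large"}],
            "query": "I'd like a large pizza",
        },
    }
}
print(intent_request["request"]["type"])
```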

Stateful and Stateless

The Voiceflow API comes in two different flavors - stateful and stateless. It all has to do with the user's {% c-line %}state{% c-line-end %} - so far in this article, we've been referring to the stateful API, which is the easier concept to work with. {% c-line %}state{% c-line-end %} refers to information about the conversation beyond the request that the user just gave - like what block on what flow they are on, what their variables are, and more metadata.

You'll see that the stateful API includes a {% c-line %}{userID}{% c-line-end %} in the URL, while this is absent on the stateless one.

  • With the stateful API, the {% c-line %}state{% c-line-end %} is saved on Voiceflow, so we'll always know what {% c-line %}user 1{% c-line-end %} has done so far in their conversation, and you don't have to provide it in the API call.
  • The stateless API is very similar to the stateful API, with one difference: instead of passing {% c-line %}{userID}{% c-line-end %} in the path parameters, the current {% c-line %}state{% c-line-end %} of the user is passed in each request and a new {% c-line %}state{% c-line-end %} is sent back in every response.
  • The same request with the same {% c-line %}state{% c-line-end %} will always produce the same response.
  • The API works by passing {% c-line %}state{% c-line-end %} back and forth, and Voiceflow will never store user session data in the process. The stateless API doesn't know who it is talking to. If you don't pass in a {% c-line %}state{% c-line-end %}, it will assume you are at the beginning of the flow.
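The round trip can be sketched like this - {% c-line %}interact(){% c-line-end %} below is a local stub standing in for the real HTTP call, so the state-passing pattern can run on its own:

```python
# A sketch of the stateless round trip: the client stores state
# and passes it with every request. interact() is a stub - a real
# implementation would POST request + state to the stateless
# endpoint and return (traces, new_state).
def interact(request, state=None):
    # No state means we're at the beginning of the flow.
    turns = (state or {"turns": []})["turns"] + [request]
    traces = [{"type": "speak", "payload": {"message": f"turn {len(turns)}"}}]
    return traces, {"turns": turns}

state = None  # first call: no state yet
traces, state = interact({"type": "launch"}, state)
traces, state = interact({"type": "text", "payload": "hi"}, state)
print(traces[0]["payload"]["message"])
```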

Here's a quick analogy.

  • The stateful API is like having a normal conversation. It knows it's talking to {% c-line %}user 1{% c-line-end %}, so when they make a request it will give the appropriate response based on all the prior context and what it knows about {% c-line %}user 1{% c-line-end %}.
  • The stateless API is like talking to someone with amnesia - it doesn't know or care who exactly it is talking to. Every time {% c-line %}user 1{% c-line-end %} says something, they also hand over a sheet of paper about themselves and all the previous context. The API listens to what {% c-line %}user 1{% c-line-end %} says and reads the sheet of paper, then responds and hands back an updated sheet of paper with this most recent interaction included. (Don't worry about the API's mental state - it reads and does everything in a fraction of a second.)

The stateful API just happens to keep this sheet of paper (the {% c-line %}state{% c-line-end %}) in its head all the time, because it keeps track of who it is talking to.