What is an NLU?
When building conversational assistants, we want to create natural experiences for the user, assisting them without the interaction feeling too clunky or forced. To create this experience, we typically power a conversational assistant using an NLU.
In the data science world, Natural Language Understanding (NLU) is an area focused on communicating meaning between humans and computers. It covers a number of different tasks, and powering conversational assistants is an active research area. These research efforts usually produce comprehensive NLU models, often referred to as NLUs.
There are many NLUs on the market, ranging from very task-specific to very general. The very general NLUs are designed to be fine-tuned, where the creator of the conversational assistant passes in specific tasks and phrases to the general NLU to make it better for their purpose.
For example, an NLU might be trained on billions of English phrases ranging from the weather to cooking recipes and everything in between. If you’re building a bank app, distinguishing between credit cards and debit cards may be more important than distinguishing between types of pies. To help the NLU model better process finance-related tasks, you would send it examples of the phrases and tasks you want it to get better at, fine-tuning its performance in those areas. Another term used for fine-tuning an NLU is training.
How to train your NLU
Currently, the leading paradigm for building NLUs is to structure your data as intents, utterances, and entities. Intents are general tasks that you want your conversational assistant to recognize, such as ordering groceries or requesting a refund. You then provide phrases, or utterances, grouped under these intents as examples of what a user might say to request each task. These utterances help the NLU generalize to the many ways a user might phrase a request.
For example, at a hardware store, you might ask, “Do you have a Phillips screwdriver?” or “Can I get a cross slot screwdriver?” As a worker in the hardware store, you would be trained to know that cross slot and Phillips screwdrivers are the same thing. Similarly, you would want to train the NLU with this information so that it handles both requests the same way instead of failing on one of them.
Entities, or slots, are typically pieces of information that you want to capture from a user. In our previous example, we might have a user intent of shop_for_item but also want to capture what kind of item it is. We would call this property an entity.
Each entity might have synonyms. In our shop_for_item intent, a cross slot screwdriver can also be referred to as a Phillips. We end up with two entities in the shop_for_item intent (laptop and screwdriver); the latter has two entity options, each with two synonyms.
Many platforms also support built-in entities: common entities that might be tedious to add as custom values. For example, for our check_order_status intent, it would be frustrating to input all the days of the year, so you would just use a built-in date entity type.
All of this information forms a training dataset, which you would use to fine-tune your model. Each NLU following the intent-utterance model uses slightly different terminology and a slightly different dataset format, but they all follow the same principles.
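To make the idea of entity values and synonyms concrete, here is a minimal sketch of how synonyms might be resolved to a single canonical value. Only the Phillips / cross slot pair comes from the example above; the second entity option, the function names, and the simple substring matching are assumptions made for illustration. A real NLU would rely on a trained model, not on string lookup.

```python
# Hypothetical entity options and synonyms for the screwdriver entity.
# Only "phillips" / "cross slot" comes from the article; "flathead" and
# "slotted" are made-up illustrative values.
SCREWDRIVER_VALUES = {
    "phillips": ["phillips", "cross slot"],
    "flathead": ["flathead", "slotted"],
}


def build_synonym_index(values):
    """Map every lowercase synonym to its canonical entity value."""
    index = {}
    for canonical, synonyms in values.items():
        for synonym in synonyms:
            index[synonym.lower()] = canonical
    return index


def resolve_entity_value(utterance, index):
    """Return the canonical value whose synonym appears in the utterance, if any."""
    text = utterance.lower()
    # Check longer synonyms first so multi-word phrases win over shorter ones.
    for synonym in sorted(index, key=len, reverse=True):
        if synonym in text:
            return index[synonym]
    return None


index = build_synonym_index(SCREWDRIVER_VALUES)
print(resolve_entity_value("Do you have a Phillips screwdriver", index))
print(resolve_entity_value("Can I get a cross slot screwdriver", index))
```

Both requests resolve to the same canonical value, which is the behavior the hardware store worker in the example would be trained to have.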
Below is an example of a couple of training dataset formats:
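As a rough illustration, the sketch below shows two hypothetical shapes such a dataset could take: a nested, intent-centric structure and a flat list with one row per utterance. The field names, the check_order_status utterances, and the helper function are assumptions made for this example and do not match the format of any specific NLU platform.

```python
# Format A (hypothetical): one record per intent, with utterances nested inside.
nested_dataset = {
    "intents": [
        {
            "name": "shop_for_item",
            "utterances": [
                "Do you have a Phillips screwdriver",
                "Can I get a cross slot screwdriver",
                "I want to buy a laptop",
            ],
        },
        {
            "name": "check_order_status",
            # These utterances are invented for illustration.
            "utterances": ["Where is my order", "Has my order shipped yet"],
        },
    ],
    "entities": [
        {"name": "screwdriver", "values": {"phillips": ["Phillips", "cross slot"]}},
        {"name": "laptop", "values": {}},
    ],
}


def to_flat_rows(dataset):
    """Format B (hypothetical): one {text, intent} row per training utterance."""
    return [
        {"text": utterance, "intent": intent["name"]}
        for intent in dataset["intents"]
        for utterance in intent["utterances"]
    ]


flat_rows = to_flat_rows(nested_dataset)
for row in flat_rows:
    print(row)
```

Both shapes carry the same information about intents and utterances. The nested form keeps each intent's examples together, while the flat form is closer to the labeled rows that many general-purpose ML tools expect.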
Training an NLU
There are two main ways to train an NLU: cloud-based training and local training.
Training an NLU in the cloud is the most common approach, since many NLUs do not run on your local computer. Cloud-based NLUs can be open source models or proprietary ones, with a range of customization options. Some NLUs allow you to upload your data via a user interface, while others accept it programmatically.
Some frameworks, like Rasa or Hugging Face transformer models, allow you to train an NLU on your local computer. These typically require more setup and are usually undertaken by larger development or data science teams.
Using an NLU
So far we’ve discussed what an NLU is and how we would train it, but how does it fit into our conversational assistant? Under our intent-utterance model, our NLU can provide us with the activated intent and any entities captured. The assistant still needs further instructions on what to do with this information.
In practice, the output of an NLU is usually more comprehensive than just the matched intent, and it typically includes a confidence score for each candidate intent.
"I want to buy a laptop!"
With this output, we would choose the intent with the highest confidence, which is order_burger. We would also have outputs for entities, each of which may include its own confidence score.
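The selection step described above can be sketched as follows. The response shape, field names, scores, and the minimum-confidence threshold are all assumptions made for illustration. Real NLUs each return their own format, and the threshold is a common safeguard, not something this article prescribes.

```python
# Hypothetical NLU response; field names and scores are made up for illustration.
nlu_output = {
    "intents": [
        {"name": "order_burger", "confidence": 0.87},
        {"name": "shop_for_item", "confidence": 0.09},
        {"name": "check_order_status", "confidence": 0.04},
    ],
    "entities": [
        {"entity": "size", "value": "large", "confidence": 0.91},
        {"entity": "date", "value": "tomorrow", "confidence": 0.32},
    ],
}


def top_intent(output, min_confidence=0.5):
    """Return the highest-confidence intent name, or None if nothing is confident enough."""
    best = max(output["intents"], key=lambda intent: intent["confidence"], default=None)
    if best is None or best["confidence"] < min_confidence:
        return None
    return best["name"]


def confident_entities(output, min_confidence=0.5):
    """Keep only the entities whose confidence meets the threshold."""
    return {
        entity["entity"]: entity["value"]
        for entity in output["entities"]
        if entity["confidence"] >= min_confidence
    }


print(top_intent(nlu_output))
print(confident_entities(nlu_output))
```

Returning None when no intent clears the threshold gives the rest of the assistant a clear signal to ask the user to rephrase, so it does not act on a weak guess.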
To decide what to do with the NLU’s output, we would use a dialogue manager. A dialogue manager uses the output of the NLU and a conversational flow to determine the next step.
Dialog Manager Visualized
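As a minimal sketch of this idea, the code below combines an NLU result with a small conversational flow to pick the assistant's next reply. The flow structure, the entity names, the reply wording, and the confidence threshold are hypothetical choices made for this example, not the design of any particular dialogue manager.

```python
# Hypothetical conversational flow: for each intent, the entities that must be
# captured and the reply to give once they are all present.
FLOW = {
    "shop_for_item": {
        "required_entities": ["item"],
        "reply": "Sure, let me find a {item} for you.",
    },
    "check_order_status": {
        "required_entities": ["date"],
        "reply": "Looking up the order you placed on {date}.",
    },
}


def next_step(nlu_output, flow, min_confidence=0.5):
    """Decide the assistant's next reply from NLU output and a flow definition."""
    step = flow.get(nlu_output.get("intent"))
    # Fall back when the intent is unknown or the NLU is not confident enough.
    if step is None or nlu_output.get("confidence", 0.0) < min_confidence:
        return "Sorry, I didn't understand that. Could you rephrase?"
    entities = nlu_output.get("entities", {})
    missing = [name for name in step["required_entities"] if name not in entities]
    # Ask for the first missing entity before completing the task.
    if missing:
        return f"Could you tell me the {missing[0]}?"
    return step["reply"].format(**entities)


print(next_step({"intent": "shop_for_item", "confidence": 0.93, "entities": {"item": "laptop"}}, FLOW))
print(next_step({"intent": "check_order_status", "confidence": 0.81, "entities": {}}, FLOW))
```

Keeping the flow as data separate from the decision logic means new intents can be added without changing the function itself.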
This post won’t go into details on dialogue management, but we will be releasing a post on that topic in the future!
In this section, we learned about NLUs and how we can train them using the intent-utterance model. In the next set of articles, we’ll discuss how to optimize your NLU using an NLU manager.
NLU: Commonly refers to a machine learning model that extracts intents and entities from a user's phrase.
ML: Machine Learning
Fine-tuning: Providing additional context to an NLU or any ML model to get better domain-specific results.
Intent: An action that a user wants to take. Represented by a collection of phrases called utterances.
Utterance: A phrase a user says to perform an action.
Entity: A dynamic item that is captured by the NLU and can be used later in the conversation, for example size or date. Also commonly referred to as a slot.
Built-in Entity: A common entity that is built into the platform. Common examples include date, yes, and no.
Entity Value: An option for an entity that you want to make available to the user. For example, an entity called size might have entity values of small, medium, and large.
Entity Value Synonym: Something a user can say that is a synonym for an entity value, such as mid-sized for medium.
Dialogue manager: A program that uses NLU output to determine the next step in the conversation.