Build a Multilingual Chatbot with Rasa and Heroku – Training the Model
With a pre-trained model your chatbot is able to handle basic interactions, such as greetings. Venture beyond the basics, though, and you quickly find that the bot’s responses don’t make much sense. That’s because the NLP model is too generic: the bot won’t recognise specific intents and entities. Similarly, the model lacks rules for responding in a way that makes sense to the user. In this post, I explain and demonstrate how to train a Rasa bot.
If you haven’t followed the series, I recommend reading part 1 and part 2 first. Feel free to explore the source code. Finally, have a look at the live demo and let me know your feedback in the comments section below.
This post covers the following parts:
- Training Purpose
- From Data to Model
- Stories as Conversation Scripts
- Taking an Action
- Training Data Format
- Training the Model
- Summary
First of all, let’s have a look at the purpose of the training.
Training Purpose
The objective is to provide sample data with hints on intent and entity recognition, as well as guidance on which actions to respond with. Rasa uses the data to generate a new version of the NLP model. As a result, the bot should be better adjusted to whatever you want to achieve with it.
… one of Sara’s skills is to subscribe a new Rasa user to the Rasa newsletter. To do that, the assistant has to know the user’s email, because without it, the assistant is not able to execute the action. So the goal is to teach the assistant that if the user hasn’t provided their email, the assistant should ask for it, but if they did, then the assistant should skip the question and move on to the next state of the conversation.
On training Sara, a demo bot. Source: Designing Rasa Training Stories
From Data to Model
The picture below depicts the process of parsing the training data. The example was taken from Bhavani Ravi’s article about Rasa NLU training in depth.
This is only half of the story, though. It merely shows the understanding part of the process. The user input still has to be transformed into meaningful actions, which is done through so-called stories.
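To make the understanding half concrete, the NLU model turns a raw message into a structured result that looks roughly like this (the message, intent name and confidence score are made up for illustration):

```
{
  "text": "sign me up, my email is example@example.com",
  "intent": { "name": "subscribe_newsletter", "confidence": 0.93 },
  "entities": [
    { "entity": "email", "value": "example@example.com", "start": 24, "end": 43 }
  ]
}
```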
Stories as Conversation Scripts
A story is essentially a conversation model for a particular situation. In the example below, the user wants to subscribe to a newsletter without providing an email address. The chatbot knows it needs to ask for the email address (utter_ask_email) prior to signing the user up.
## story_email_not_provided
* greet
  - utter_greet
* subscribe_newsletter
  - utter_ask_email
* inform{"email": "example@example.com"}
  - action_subscribe_newsletter
Capturing user’s email prior to subscribing to a newsletter. Source: Designing Rasa Training Stories
Taking an Action
Actions are part of Rasa’s core dialogue engine. They are used in tandem with the NLU model. User intents are resolved according to the model and the stories map them to actions.
One action might be to greet the user, another might be to call an API, or query a database. Then you train a probabilistic model to predict which action to take given the history of a conversation.
What is an action. Source: Rasa Blog
The probabilistic model of matching intents against actions can be evolved through interactive learning. More on that later.
Simple actions require no coding: you declare them as custom text messages in the domain file. Advanced logic, such as making a call to an external service or a database, needs to be programmed. The project repo contains an implementation of a user subscription form; a simplified sketch of such a custom action follows below.
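The sketch below is illustrative only, not the exact code from the repo. It assumes Rasa SDK 1.x and a slot named email, and reuses the action_subscribe_newsletter name from the story above. The action name must also be listed in the domain so Rasa knows it exists.

```python
from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionSubscribeNewsletter(Action):
    """Subscribes the user with the email address captured in the `email` slot."""

    def name(self) -> Text:
        # Must match the action name referenced in the stories and the domain.
        return "action_subscribe_newsletter"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        email = tracker.get_slot("email")
        # Placeholder: a real implementation would call a mailing-list API
        # or write to a database here.
        dispatcher.utter_message("Thanks! {} is now subscribed.".format(email))
        return []
```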
Custom actions run on an action server, which is part of the Rasa SDK. On a side note, there is a gotcha when deploying this to Heroku. Stay tuned to learn how to proceed with the deployment.
Training Data Format
You can store the training data as JSON or markdown. JSON is adopted by data generators like Chatito or Tracy. Markdown, on the other hand, is less verbose and therefore easier to read.
Training data can be stored in a single file or as multiple files in a directory. The data set is split into common examples and rules or patterns that help analyse previously unseen data.
Common Examples
They are mandatory and tedious to write. That’s where tools like Chatito or Tracy come into the picture. In my opinion, it pays off to craft a few examples by hand, as it helps you think about the problem domain. The generators help, but generated examples are harder to change and maintain.
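For illustration, a few hand-crafted common examples for a newsletter subscription intent could look like this in the Markdown format (the intent name, entity name and sentences are placeholders, not taken from the project’s data):

```
## intent:subscribe_newsletter
- I want to subscribe to the newsletter
- please sign me up for your newsletter
- can you add [john@example.com](email) to the mailing list?
- my email is [jane@example.com](email)
```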
Rules and Patterns
Obviously, it’s virtually impossible to catch all edge cases by providing specific examples. To make the bot more robust, you want to rely on regular expressions and come up with synonyms. Ideally, you also include some typos or lingo to increase the chance of catching the user’s intent.
## regex:zipcode
- [0-9]{5}
A ZIP code pattern as a regular expression. Source: Rasa Docs
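Synonyms are declared in a similar way: they map alternative spellings or phrasings to a single canonical value that the entity extractor should return. A hypothetical example, not taken from the project:

```
## synonym:newsletter
- mailing list
- email updates
- news letter
```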
Lookup Tables
Lookup tables play an important role in entity recognition. They enumerate all supported values for a particular domain, such as purchasing a meal or a cup of coffee:
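For instance, a lookup table for coffee drinks might look like this (the table name and values are purely illustrative):

```
## lookup:coffee_drinks
- espresso
- cappuccino
- flat white
- latte
```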
Training the Model
The training produces a probability model out of the recognised intents and actions (domain.yml), their mappings (stories.md), NLU data (nlu.md) and the pipeline configuration. I have tweaked the training script to take multilingual support into consideration.
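For reference, a plain single-language training run with the standard Rasa CLI boils down to something like the command below; the multilingual tweaks in the project’s script are not reproduced here.

```
rasa train --domain domain.yml --data data/ --config config.yml --out models/
```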
Summary
At this point you have an understanding of the basics behind a chatbot’s smarts. We have looked at describing the user intents that drive the chatbot’s actions, how entities are recognised, and ultimately how to compile an ML model.
Please let me know your questions and feedback in the comments section below. Thank you and stay tuned for the next post where I show how to deploy the chatbot onto Heroku.