Build a Multilingual Chatbot with Rasa and Heroku – Architecture
In this part of the series we look at the chatbot architecture and design choices. I keep the explanations short and to the point so that you can code along. Perhaps you already have an idea of what to build with Rasa and only need a bit of guidance, or a hint. If you are new to chatbot development, please check the previous part, where I discuss terminology and core NLP principles.
By the end of this post you will have a thorough understanding of how the bot works, and you will be ready to explore the smarts behind it on your own. You can always use the repo as a reference. Don’t forget to play with the live demo and let me know your thoughts in the comments section below.
Here is what’s ahead of us today:
Prerequisites
To build anything with Rasa you are going to need Python. That alone presents a bit of a challenge. Which version to use, 2.x or 3.x? Do I have pip installed? Oh, and how about spinning up a virtual environment? And there is a ton of libraries and transitive dependencies .. Well, that’s why you want to use Docker.
I know. There are several great articles that get you started with Rasa, including their very own tutorial. They all provide guidance on installing everything in your local environment. In my eyes, however, Docker makes the setup much easier, and I will use it throughout this tutorial. Hope you are fine with that. If not, there are alternatives to choose from. Rock away.
Installing Docker is trivial and well documented on all major platforms.
Since we are going to juggle a bunch of services, Docker Compose comes in handy: dependencies among services become easy to manage. Therefore, go ahead and install Docker Compose too.
The hardest part is over 😉
Chatbot Architecture
- Clients submit requests through a web app running on Flask. The chat window is handled by a custom React widget adjusted for the Rasa backend. Each new client subscribes to a conversation channel and therefore receives a unique identifier. A single channel can host multiple clients.
- Request handling is non-blocking. The event handler continuously intercepts new events (client requests) and dispatches them to the agent for background processing.
- The agent runs user messages (queries) through a sequence of steps. First, it tries to understand the intent using an NLU Interpreter. The agent then checks for actions associated with each intent. The intent-to-action resolution is done by a so-called Policy. See the section below for detailed explanations.
- Custom actions, typically more complex operations, are handled by an Action Server. Once an action is finished, the result is fed back to the client as a response.
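To make the non-blocking dispatch in the steps above more tangible, here is a deliberately simplified, self-contained sketch. It is not the project’s actual Sanic-based handler, just a toy `asyncio` illustration: an event handler pulls client requests off a queue and hands each one to the agent as a background task, so it never blocks on a single reply. All names in the sketch are my own.

```python
import asyncio

async def agent_process(message: str) -> str:
    # Stand-in for the real NLU + policy + action work done by the agent.
    await asyncio.sleep(0.01)
    return f"echo: {message}"

async def event_handler(queue: asyncio.Queue, replies: list) -> None:
    # Continuously intercept new events (client requests) ...
    while True:
        message = await queue.get()
        if message is None:  # sentinel: no more client requests
            break
        # ... and dispatch them for background processing. We do not
        # await here, so the handler keeps accepting new events.
        task = asyncio.create_task(agent_process(message))
        task.add_done_callback(lambda t: replies.append(t.result()))
        queue.task_done()

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    replies: list = []
    for msg in ("hello", "weather?"):
        queue.put_nowait(msg)
    queue.put_nowait(None)
    await event_handler(queue, replies)
    await asyncio.sleep(0.05)  # let the background tasks finish
    return replies

replies = asyncio.run(main())
print(replies)
```

The key point is `create_task` plus a completion callback: the handler returns to the queue immediately, and responses are fed back to clients whenever their processing finishes.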
Rasa Framework
Before we look into the project internals, I should briefly explain how the NLP pipeline works.
An Agent is an entry point that plugs into Rasa’s capabilities. In my project, I attach the agent to the channel and let it handle client requests in a non-blocking manner. Under the hood, the agent spins up an instance of a Sanic web server.
The agent comprises the following parts:
- Interpreter – Takes the incoming message and, using NLU, resolves the intent and extracts entities, if any.
- Conversation Tracker – Maintains conversation state by storing it either in memory (by default), or in a database, such as Redis, Mongo or a SQL database. Remembering what the user said in the past is essential for a good user experience.
- Policy – Determines action resolution. Once an intent is recognised, the policy selects the best possible course of action. Which action is executed depends on the policy’s characteristics and configuration.
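To see how these three parts fit together, here is a heavily simplified, self-contained sketch in plain Python. These are not Rasa’s actual classes; real interpreters use trained models and real policies weigh conversation history and confidence scores. The sketch only mirrors the flow: message in, intent out, state recorded, action selected.

```python
from dataclasses import dataclass, field

@dataclass
class Interpreter:
    # Toy NLU: keyword lookup instead of a trained model.
    keywords: dict
    def parse(self, text: str) -> str:
        for word, intent in self.keywords.items():
            if word in text.lower():
                return intent
        return "fallback"

@dataclass
class Tracker:
    # Conversation state kept in memory, like Rasa's default tracker store.
    events: list = field(default_factory=list)
    def record(self, event: str) -> None:
        self.events.append(event)

@dataclass
class Policy:
    # Static intent-to-action mapping; a real policy is far smarter.
    actions: dict
    def next_action(self, intent: str) -> str:
        return self.actions.get(intent, "action_default_fallback")

interpreter = Interpreter({"hello": "greet", "weather": "ask_weather"})
tracker = Tracker()
policy = Policy({"greet": "utter_greet", "ask_weather": "action_fetch_forecast"})

intent = interpreter.parse("Hello bot!")   # -> "greet"
tracker.record(intent)                     # remember what was said
action = policy.next_action(intent)        # -> "utter_greet"
```

Swap the keyword lookup for a trained NLU model and the dict for a learned policy, and you have the shape of what the agent does for every message.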
Project Structure
I tried to keep the structure clean and tidy by separating config from code and dependency management.
- actions – Custom actions. As you will see later on, Rasa exposes an interface that lets you define and implement a custom action. Typically, a custom action interacts with external services, such as a weather forecast or a translation API.
- config – All of the app config, namely intent and action definitions, and the NLP pipeline. The exact pipeline depends on the language. I could, for example, use a spaCy pre-trained model for the English or German version of the bot.
- data – NLU training data and mappings between intents and actions (stories).
- model – The NLU model generated by training the bot on the NLU training data and, optionally, through interactive learning.
- server – Code needed for routing user requests and processing them asynchronously. In fact, it contains all of the necessary implementation, including the front-end layer for client interactions. Hope the internal structure makes sense and is easy to follow. Please let me know in the comments section if that’s not the case. Your feedback is highly appreciated.
Last but not least, the root directory contains the Docker config (the image plus the services and how they depend on each other) and the declaration of project dependencies.
Finally, there is a runner script, server.sh, which lets you interact with the Rasa API in a simple and straightforward manner.
Running the Chatbot
You can start all of the necessary services by running:
./server.sh start
This is equivalent to running Docker Compose directly:
docker-compose up -d
Running the start script instantiates the following services:
- An action server listening on port 5055
- An instance of a Rasa agent attached to port 5500 to handle a chat in English
- An instance of a Rasa agent attached to port 5501 to handle a chat in German
- A Rasa web server listening on port 5005. The server provides an API for training the bot and generating new language models.
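Since each language gets its own agent on its own port, the front end has to route every message to the right service. A minimal sketch of that routing, using the ports from the list above (the endpoint path is Rasa’s standard REST webhook route; the helper’s name and host default are my own):

```python
# Port assignments taken from the service list above.
AGENT_PORTS = {"en": 5500, "de": 5501}

def webhook_url(language: str, host: str = "localhost") -> str:
    """Return the REST webhook URL for the agent serving `language`."""
    try:
        port = AGENT_PORTS[language]
    except KeyError:
        raise ValueError(f"No agent configured for language {language!r}")
    return f"http://{host}:{port}/webhooks/rest/webhook"

webhook_url("en")  # English agent on port 5500
webhook_url("de")  # German agent on port 5501
```

Adding a third language would then be a matter of starting another agent container and extending this mapping.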
Once the bot is up and running, just visit http://localhost:8080. Hope the demo below gives you an idea of how the UI works.
I hope this is enough to get you started. Feel free to clone the repo, play with the example and let me know your thoughts in the comments section below. Next time, we look into training the bot. That’s where things get really interesting. You will be able to generate your own language model and create a bot to your liking.