From Linguistics to Statistics and AI
The 1980s marked a shift toward probabilistic, statistical models. Moore's law made complex computations feasible at scale, and major industry players (IBM among them) successfully adopted large statistical models. The steady increase in computational power drove the evolution of machine learning algorithms.
The new era was dominated by statistical models and saw major improvements in text and speech recognition. Machine learning became the mainstay of natural language processing.
In 1992, AT&T Bell Laboratories automated phone call routing by means of voice recognition. Their system was able to interpret common requests such as a person-to-person call or a collect call. Up until then, these simple operations had required human assistance; with voice recognition in place, the calls could be completed without any manual intervention.
In the same year, researchers at Carnegie Mellon University achieved a significant milestone with their speech recognition project, Sphinx II. Check the PDF in the link; it has many interesting details. The researchers were able to push the error rate down to 5% thanks to "large amounts of training data", courtesy of the Wall Street Journal, which contributed tens of millions of words and a few thousand utterances to the training data set.
In 1997, Sepp Hochreiter and Jürgen Schmidhuber introduced a new kind of recurrent neural network. The long short-term memory (LSTM) network made it possible to process long sequences of data, as is the case in speech recognition. The invention accelerated further advancements in NLP.
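To give a feel for how an LSTM consumes a sequence, here is a minimal sketch using PyTorch. The framework, the layer sizes, and the "acoustic frames" framing are my own illustrative assumptions, not part of the original 1997 work, which long predates modern deep learning libraries.

```python
# Minimal sketch: an LSTM processing a batch of feature sequences,
# loosely analogous to acoustic frames in a speech recognition front end.
# All sizes below are hypothetical and chosen only for illustration.
import torch
import torch.nn as nn

batch_size, seq_len, n_features = 4, 100, 40   # hypothetical batch of 100-step sequences
hidden_size, n_classes = 128, 30               # hypothetical hidden size and label count

lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size, batch_first=True)
classifier = nn.Linear(hidden_size, n_classes)

x = torch.randn(batch_size, seq_len, n_features)  # stand-in for real audio features
outputs, (h_n, c_n) = lstm(x)   # outputs holds the hidden state at every time step
logits = classifier(outputs)    # per-step class scores, e.g. phoneme-like labels
print(logits.shape)             # torch.Size([4, 100, 30])
```

The key property the sketch hints at is that the LSTM carries its cell state across all 100 time steps, which is what lets it retain context over long sequences where plain RNNs tend to forget.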
Thanks for reading! If you liked this post and want to learn more about pioneers in the NLP space, check out the related posts in the section below.