Mercury — A Chat-bot for Food Order Processing using ALBERT & CRF


Unless you have been out of touch with the Deep Learning world, chances are that you have heard about BERT, ALBERT and CRF (Conditional Random Field).

Mercury, named after the Roman messenger of the gods (the counterpart of the Greek god Hermes), is a chatbot service that can be integrated with food-delivery brands such as Swiggy or Zomato, where a user can simply type in their order and send it as text.

Mercury can then extract the essential information from the order and place the order for the User accordingly.

Here is a list of technologies involved in Mercury :

=> ALBERT (which looks like BERT++)

=> CRFs (Conditional Random Field)

=> gRPC (a Remote Procedure Call framework originally developed at Google)

=> JointALBERT Slot-Filling & Intent Classification

=> Flutter (Front-End)

Since this article is about “Mercury”, I will only provide a brief summary of each concept, along with some useful links for a more detailed understanding.

What is BERT?

Let’s take a look at these sentences where the same word has different meanings :

|- I was late to work because I left my phone at home and had to go back.

|- Go straight for a mile and then take a left turn.

|- Some left-wing parties are shifting towards centrist ideas.

How do you differentiate between each meaning of the word left?

These differences are almost certainly obvious to you, but what about a machine? Can it spot the differences? What understanding of language do you have that a machine does not?

Or rather, what understanding of language do you have that machines did not have earlier?

The Answer : Context.

Your brain automatically filters out the incorrect meaning of words depending on the other words in the sentence, i.e. depending on the context.

But how does a machine do it?

This is where BERT, a language model which is bidirectionally trained (this is also its key technical innovation), comes into the picture.

This means that machines can now have a deeper sense of language by deriving contextual meaning of each word.


What is ALBERT?

ALBERT was proposed in 2019 with the goal of improving the training and results of the BERT architecture through several techniques:

=> Parameter Sharing (drop in number of parameters by over 80%)

=> Inter-Sentence Coherence Loss

=> Factorization of Embedding Matrix
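To get a feel for the embedding factorization, here is a small back-of-the-envelope sketch. It covers only the embedding matrix, not the whole model, and the sizes (vocabulary `V`, hidden size `H`, embedding size `E`) are illustrative assumptions, not numbers taken from Mercury. The idea is that one large V x H matrix is replaced by a V x E lookup followed by an E x H projection.

```python
def embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Parameters in a single V x H embedding matrix (BERT-style)."""
    return vocab_size * hidden_size


def factorized_embedding_params(vocab_size: int, embedding_size: int,
                                hidden_size: int) -> int:
    """Parameters in a V x E lookup plus an E x H projection (ALBERT-style)."""
    return vocab_size * embedding_size + embedding_size * hidden_size


# Illustrative sizes (assumptions, not taken from this article).
V, H, E = 30_000, 768, 128

full = embedding_params(V, H)
factorized = factorized_embedding_params(V, E, H)

print(f"unfactorized embedding parameters: {full:,}")
print(f"factorized embedding parameters:   {factorized:,}")
print(f"reduction in embedding parameters: {1 - factorized / full:.1%}")
```

The saving grows with the gap between `E` and `H`, which is why the trick matters most for models with large hidden sizes.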

Results of ALBERT on NLP benchmarks :

ALBERT VS BERT (ALBERT Achieves SOTA Results with 20% Parameters)

What are Conditional Random Fields?

A CRF classifies each input by picking a label for it from a list of potential labels.

I will be going into a little more detail shortly but for now, just understand that CRFs are used for predicting sequences depending upon previous labels in sentences.

They are often used in NLP in various tasks such as Part-Of-Speech Tagging and Named-Entity Recognition since CRFs excel in modelling sequential data such as words in a sentence.

What is gRPC?

It is an open-source, high-performance Remote Procedure Call framework.

Its main advantage is that the client and server can exchange multiple messages over a single TCP connection via the gRPC Bidirectional Streaming API.

Mercury uses the gRPC bidirectional streaming API to implement Speech-To-Text functionality through the Google Speech-To-Text API.
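The shape of a bidirectional stream can be illustrated in plain Python, without gRPC itself. This is only a sketch: `audio_chunks` and `fake_transcriber` are hypothetical names, and the "transcriber" just counts bytes instead of calling any speech service. What it shows is the pattern: the client sends a stream of small requests, and the server yields responses while requests are still arriving.

```python
from typing import Iterable, Iterator


def audio_chunks(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Client side: yield the recording in small pieces, not one big upload."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def fake_transcriber(requests: Iterable[bytes]) -> Iterator[str]:
    """Server side (hypothetical stand-in for a speech service).

    Consumes a stream of requests and yields one response per request,
    so partial results can flow back before the audio has ended.
    """
    received = 0
    for chunk in requests:
        received += len(chunk)
        yield f"partial result after {received} bytes"


# Ten bytes of dummy "audio", sent four bytes at a time.
responses = list(fake_transcriber(audio_chunks(b"x" * 10, chunk_size=4)))
for line in responses:
    print(line)
```

Because both sides are generators, each chunk is processed as soon as it is produced, which is the behaviour that makes live speech-to-text feel responsive.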

Mercury — What’s under the Hood?

What does Mercury do before placing the order for the User?

How does Mercury know that the text it has received is indeed a request for placing an order?

Let’s take a look at this sentence:

“I would like to have 1 non veg Taco, 3 veg Pizzas and 3 cold drinks from Domino’s.”

How does Mercury go from this to something like this?

This is where Joint-ALBERT (Slot-Filling & Intent-Classification) comes into the picture.

Sneak Peek under the hood of Mercury’s Model

Training :

We come up with some desired labels for our model.

Intent Label : <OrderFood>

Slot Labels : <restaurant_name> , <food_name>, <food_type>, <qty>, <O> (<O> means that specific word does not carry much value in the sentence and can be masked or ignored).

We create hundreds of sample sentences with a label associated with each word.
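A single training sample could look something like the sketch below. The exact file format Mercury uses is not shown in this article, so the dictionary layout, the tokenization (punctuation dropped) and the label chosen for each word are my assumptions. The B-/I- prefixes mark the beginning and the continuation of a multi-word slot.

```python
# One hypothetical training sample: a token list, one slot label per token,
# and a single intent label for the whole sentence.
sample = {
    "tokens": ["I", "would", "like", "to", "have",
               "1", "non", "veg", "Taco",
               "3", "veg", "Pizzas",
               "and", "3", "cold", "drinks",
               "from", "Domino's"],
    "slots": ["O", "O", "O", "O", "O",
              "B-qty", "B-food_type", "I-food_type", "B-food_name",
              "B-qty", "B-food_type", "B-food_name",
              "O", "B-qty", "B-food_name", "I-food_name",
              "O", "B-restaurant_name"],
    "intent": "OrderFood",
}

# Slot names listed in the article; "O" is handled separately.
ALLOWED_SLOTS = {"restaurant_name", "food_name", "food_type", "qty"}


def is_valid(sample: dict) -> bool:
    """Check that every token has exactly one well-formed slot label."""
    if len(sample["tokens"]) != len(sample["slots"]):
        return False
    for label in sample["slots"]:
        if label == "O":
            continue
        prefix, _, name = label.partition("-")
        if prefix not in ("B", "I") or name not in ALLOWED_SLOTS:
            return False
    return True


print(is_valid(sample))
```

A check like this is cheap to run over hundreds of hand-written samples and catches misaligned labels before they reach training.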

ALBERT + Conditional Random Field (Joint-ALBERT):

We have already learnt that CRFs excel in modelling sequential data. So how does it help Mercury?

CRFs essentially help in mapping each word to its appropriate label.

For example:

It can map the number “1” to <qty> denoting quantity.

It can map the word “Domino’s” to <restaurant_name>.

Great! So if CRFs can do this, why do we even need ALBERT?

In our original sentence :

“I would like to have 1 non veg Taco, 3 veg Pizzas and 3 cold drinks from Domino’s.”

How does CRF know that the word “non” is a <B-food_type> and the word “veg” is <I-food_type> (B means beginning & I means continuation of B)?

How does CRF know that the word “non” is not the dictionary meaning “anti”?

As you probably already guessed, ALBERT provides CRF the contextual meaning of each word which helps CRF in classifying each word into the correct slot labels.
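Once every word carries a B- or I- label, the labels can be merged back into slot values. The helper below is a minimal sketch of that step, not Mercury's actual post-processing code; `bio_to_spans` is a hypothetical name and the tags are written by hand rather than predicted by a model.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge B-/I- tagged tokens into (slot_name, text) pairs."""
    spans: List[Tuple[str, str]] = []
    current_label: Optional[str] = None
    current_words: List[str] = []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_label:
            # Continuation of the slot that is already open.
            current_words.append(token)
            continue
        # Any other tag closes the slot in progress, if there is one.
        if current_label is not None:
            spans.append((current_label, " ".join(current_words)))
        if prefix in ("B", "I"):  # a stray I- tag is treated as a new slot
            current_label, current_words = label, [token]
        else:  # "O": no slot is open
            current_label, current_words = None, []
    if current_label is not None:
        spans.append((current_label, " ".join(current_words)))
    return spans


tokens = ["1", "non", "veg", "Taco", "from", "Domino's"]
tags = ["B-qty", "B-food_type", "I-food_type", "B-food_name", "O",
        "B-restaurant_name"]
spans = bio_to_spans(tokens, tags)
print(spans)
```

Here "non" and "veg" end up in one `food_type` slot, which is exactly what the B/I distinction is for.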

The CRF performs slot identification by scoring the possible labels of each word against the possible labels of its neighbours and picking the overall label sequence with the highest probability.

Bold Line represents the Most Probable Mapping
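Finding the most probable mapping is usually done with the Viterbi algorithm. The toy version below is a sketch under loud assumptions: the per-word scores (which ALBERT would supply) and the label-to-label transition scores (which the CRF would learn) are made-up numbers, and a real CRF layer works on learned, normalized scores.

```python
from typing import Dict, List


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[str, Dict[str, float]],
            labels: List[str]) -> List[str]:
    """Return the label sequence with the highest total score."""
    # best[i][y] = best score of any path that ends in label y at position i
    best = [{y: emissions[0][y] for y in labels}]
    back: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        prev = best[-1]
        step: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            p = max(labels, key=lambda q: prev[q] + transitions[q][y])
            step[y] = prev[p] + transitions[p][y] + scores[y]
            pointers[y] = p
        best.append(step)
        back.append(pointers)
    # Walk the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


labels = ["O", "B-food_type", "I-food_type"]

# Made-up per-word scores for the words "have", "non", "veg".
emissions = [
    {"O": 2.0, "B-food_type": 0.0, "I-food_type": 0.0},
    {"O": 1.0, "B-food_type": 0.9, "I-food_type": 0.2},
    {"O": 0.1, "B-food_type": 1.0, "I-food_type": 1.1},
]

# Made-up transition scores: I- right after O is heavily penalised,
# B- followed by I- is rewarded.
transitions = {
    "O": {"O": 0.0, "B-food_type": 0.0, "I-food_type": -5.0},
    "B-food_type": {"O": 0.0, "B-food_type": -1.0, "I-food_type": 1.0},
    "I-food_type": {"O": 0.0, "B-food_type": 0.0, "I-food_type": 0.0},
}

path = viterbi(emissions, transitions, labels)
print(path)
```

Picking the best label for each word on its own would tag "non" as O here, leaving "veg" with an I- label that has no beginning. Scoring whole sequences is what lets the CRF avoid that.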

Finally, how is Intent of the sentence predicted?

CRF does this part too by figuring out that “A specific sequence of slot-labels leads to a specific Intent”.

For example :

If the slots <food_type>, <food_name> and <restaurant_name> are found in a sentence, then the sentence probably has the intent <OrderFood>.

Intent Prediction based on Slot Labels
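The intuition can be written down as a hard-coded rule. To be clear, this is only an illustration: the trained model learns the relationship from data rather than following an if-statement, and both `guess_intent` and the "Unknown" fallback are hypothetical names.

```python
from typing import Iterable

# Slots that, per the example above, point towards an order.
ORDER_SLOTS = {"food_type", "food_name", "restaurant_name"}


def guess_intent(slot_labels: Iterable[str]) -> str:
    """Guess the intent from the set of slot names found in a sentence."""
    # Strip the B-/I- prefix and ignore "O" labels.
    found = {label.split("-", 1)[-1] for label in slot_labels if label != "O"}
    return "OrderFood" if ORDER_SLOTS <= found else "Unknown"


order = ["O", "B-qty", "B-food_type", "I-food_type", "B-food_name",
         "O", "B-restaurant_name"]
small_talk = ["O", "O", "O"]

print(guess_intent(order))
print(guess_intent(small_talk))
```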

Flutter Front-End

Mercury has a simple and elegant front-end for the User.

Some Useful Links :

You can watch a quick 3-minute Demo of Mercury on My Mercury Website.

You can also check out my other projects on My Main Website.

BERT Paper: Here is the arxiv BERT Research Paper.

ALBERT Paper: Here is the arxiv ALBERT Research Paper.

CRF Paper: Here is the arxiv CRF + LSTM Research Paper for Sequence Tagging.

JointBERT: Here is the arxiv JointBERT (Intent Classification and Slot Filling) Research Paper.

gRPC Introduction: This will get you started with gRPC Basics.

adios, amigos!


Mercury — A Chat-bot for Food Order Processing using ALBERT & CRF was originally published in Chatbots Life on Medium, where people are continuing the conversation by highlighting and responding to this story.