but my chatbot matches the phrase at 100%, not 70% like in the example. This also happens with other NLP setups and other phrases, not just this one. Why is that? If you type in something completely unrelated to any intent, it won’t match, but anything even slightly similar is always matched with a score of 100%.
Welcome to Lesson 3 in our “Lessons from Our Voice Engine” series, featuring high level insights from our Engineering and Speech Tech teams on how our voice engine works. This lesson is from Siva Reddy Gangireddy, a Senior Speech Recognition Scientist on our Speech Tech team.
What is deep learning?
To understand deep learning, we need a basic understanding of machine learning.
Machine learning is a group of algorithms that focus on learning from data to make predictions and decisions without any explicit programming. It usually involves training a model on huge amounts of data to learn patterns so that predictions and decisions can then be made on new data. For example, the smart speakers we use in daily life are based on machine learning algorithms.
Deep learning is a form of machine learning that’s based on neural networks, a set of algorithms designed to mimic the function of the human brain. Any network with more than three layers is considered a deep neural network and the input is processed through those several layers to predict the desired output. Deep neural networks require huge amounts of data and are extensively used in speech recognition and image recognition. At SoapBox Labs, our models are trained on thousands of hours of audio data and evaluated on in-house datasets regularly.
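As a rough illustration of an input being processed through several layers, here is a minimal sketch in plain Python. The layer sizes, the random weights, and the ReLU activation are illustrative assumptions, not a description of the SoapBox Labs models, and real networks learn their weights from training data.

```python
import random

def relu(values):
    """Zero out negative activations."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def make_layer(n_in, n_out, rng):
    """Random weights stand in for values that training would normally learn."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(inputs, layers):
    """Pass the input through every layer in turn, applying ReLU between layers."""
    activations = inputs
    for i, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if i < len(layers) - 1:
            activations = relu(activations)
    return activations

rng = random.Random(0)
sizes = [4, 8, 8, 8, 2]  # hypothetical: 4 input features, three hidden layers, 2 outputs
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
outputs = forward([0.1, 0.5, -0.3, 0.9], layers)
print(len(outputs))
```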
Why is deep learning important, and how is it used, in kids’ speech recognition?
The goal of speech recognition is to convert users’ speech to text. Given the variations in audio data (such as pronunciation, accent and noise), machine learning algorithms are used to ensure accuracy. Because of its superior performance, especially for understanding kids’ variable speech, deep learning is at the core of SoapBox Labs’ voice engine and solutions like fluency assessments. We also use deep learning to deliver wake word detection, voice activity detection (VAD), and end-to-end speech recognition for on-device speech recognition.
More and more businesses, websites, and social media pages are using chatbots every day, and more customers expect them when interacting with businesses and trying to find what they want to know. With this in mind, one of the worst things you can do is offer a low-quality chatbot that creates more problems than it solves.
In this guide, we aim to provide you with the need-to-know tips that will help you script and write out the perfect chatbot that your customers and potential customers are going to love. Let’s get right into it.
Start with an Introduction
From the moment the chatbot opens, you need to start working on a positive experience, and this means leading with introductions. The easy way to do this is to write something like:
“Hey there, <username pulled from social page>
I’m BusinessBot, and I’m here to help with whatever you need. Say Hello in chat to see options, or click an option below if you know what you’re looking for. Let’s get things started!”
As you can see, this introduction starts the interaction off strong and positively begins solving whatever problem someone has come to your page to solve.
Guide Your User
You know when you call up a telephone helpline, and you’re told to choose the options to direct you to the right department, but the options are confusing and don’t really help you find where you want to go? You don’t want that to happen with people using your chatbot, so make sure you’re making special efforts to be precise and lead people to where they want to go.
This might take a little bit of trial and error as people come to you with various queries that you may not have expected, but be proactive in addressing issues, getting feedback and fixing issues, and you’ll be able to guide your users exactly to where they need to be as quickly as possible. The more efficient your chatbot experience, the happier your customers will be!
“However you’re typing your chatbot, you need to ensure the language you’re using is in touch with the rest of your brand. For example, is your brand professional and formal or casual and informal? What kind of language and writing style should you be using to reflect this? This is something you need to consider before you even start writing so your text will remain consistent throughout,” shares Mark Taylor, a writer at Ukservicesreviews and Simplegrad.
Remember, customers are used to talking conversationally in chatbot-styled text boxes, so continue this on with your chatbot in order to resonate most effectively with your customers.
Define Your Option Goals
“There’s no point trying to cover every basis with your chatbot options because this is only going to make your bot confusing and hard to navigate and understand. Instead, you need to make sure your bot options are purposeful and have a goal, which means taking the time to define your goals,” explains Marie Harper, a scriptwriter at Studydemic and Assignment Services.
Are you trying to solve your customer’s problems, make sales, educate and inspire, or inform? Of course, you might have the goal to do all of these, but making your priorities clear in your head will massively help when it comes to structuring your chatbot while ensuring the experience is as smooth as possible.
Proofread Your Content!
This point should go without saying, but it’s incredible to see how many businesses let this simple point slip through the cracks. After writing your content and before you go live, you must absolutely make sure that you’re proofreading your work so it’s free from errors.
Any spelling mistakes, grammar errors, or typos are going to stick out of your content like a sore thumb, and it just looks unprofessional, makes your business lose credibility, and overall lessens the quality of your chatbot experience. The fewer mistakes, the better!
Chatbot personality traits and their effect on user satisfaction amongst Gen Z: A research and hotel case study
I remember when I used a chatbot for the first time. It was the chatbot of Dutch shipping company, Post NL. My package still hadn’t arrived 3 days after its expected delivery date. Slightly annoyed and unable to immediately talk to a real company representative, I got in touch with chatbot Daan. Unfortunately, chatbot Daan did not comprehend my questions, lacked personality, and was unable to redirect me to a real human. All this resulted in me leaving the conversation more frustrated than I was to begin with.
This unsuccessful encounter led me to choose the topic of chatbots as inspiration for my thesis research on artificial intelligence. More specifically, I wanted to dive into the effect of personality within customer service chatbots on Gen Z’s user satisfaction.
Why Gen Z, you might be asking? That’s because research indicates that Gen Z and Millennials are most likely to agree that chatbots make it easier and quicker for their issues to get resolved. Understanding which personality traits within customer service chatbots trigger a positive response for one of these generations therefore seemed like valuable knowledge.
So, I went ahead and started researching which of the following personality dimensions from the ‘Big Five’ model* my population responded to best:
Extraversion: Social, talkative, assertive, and funny.
However, only adding personality without taking into account the background of the company the chatbot is operating for does not have the desired effect. It’s still necessary for companies to align their chatbots with their brand and users.
So, in order for companies to deploy an effective customer service chatbot, it needs to:
Identify its customer segments
Map out what the brand stands for
Decide upon the necessary functionalities
And finally add the personality layer that the customer segment has a preference for
I decided to test out these steps and create a prototype chatbot for hypothetical case hotel CitizenM, as this is a hotel that focuses on a segment that is generally more technologically competent.
Two prototype chatbots were created: one with the added personality dimension, and one without. After letting test participants interact with both, they were asked to fill out a user satisfaction survey.
The chatbot with the added personality scored significantly higher in all areas. Although it is not safe to generalize the sample outcome for the entire population, the test did make it evident that adding personality traits to a chatbot indeed has a positive effect on user satisfaction.
Although further research on the topic is recommended, I hope this article serves as a guideline on how to create an effective chatbot for hotels and other companies offering some sort of customer service.
If I learned one thing from this research it’s that a one-size-fits-all chatbot doesn’t cut it anymore. In order for companies to create a successful digital customer experience, they will need to have a personalized approach.
* One of the dimensions, neuroticism, was left outside of the research scope as it’s only associated with negative traits.
With the world moving online after the pandemic, it is now more critical than ever for businesses to provide the best possible customer experiences in order to survive in this competitive market. A crucial part of this customer-centric strategy requires brands to support their customers in a language they prefer.
Most brands choose English as their primary language for all types of customer communications. However, using a single language for a diverse customer base that speaks many different languages not only creates unnecessary barriers but also frustrates the end customer. Various statistics show how businesses are missing out on tremendous opportunities by not supporting multilingual conversations:
29% of businesses have lost customers because they don’t offer multilingual support
72% of consumers are more likely to buy when help/information is in their own language
70% of customers are more loyal to brands that support their native language
These numbers tell us that while English is a popular language, it does not suffice for all customers. This particularly holds true for a country like India, which is known for its language diversity.
The need for Hinglish
The digital revolution in India has exponentially broadened the Internet user base in the country to include large numbers of non-English speakers, who now outnumber English speakers. While Hindi & English are more popular than all other vernacular languages, their mix has given rise to a new language: Hinglish, with 350 million+ speakers versus only 125 million+ English speakers.
This shift in language is primarily because users find texting and having conversations in Hinglish more convenient. It is easier for users to type their Hindi queries in Latin script instead of Devanagari script. Since a vast majority of India lives in semi-urban and rural regions and isn’t as fluent in English, it becomes even more critical for brands to adapt to the customer’s preferred language instead of taking a one-size-fits-all approach with English or Hindi.
Here are some ways users often frame their queries in Hinglish:
“Mujhe help chahiye”
“Mera order kahan hai”
“Mujhe refund chahiye”
When we checked with two of our customers, Howtouse and Jiomobility, we found that lack of Hinglish support was the reason behind 40–80% of the bot breaks happening. These numbers were expected to go even higher with Diwali, a prestigious festival in India, just a few months away.
Why do basic chatbots fail in supporting Hinglish?
When users ask queries in Hinglish, basic chatbots fail to respond and either transfer it to an agent or send an incorrect response. Not only does it impact the customer experience negatively but it also costs extra human resources even when there’s no need.
The reason for this gap is that chatbot localization is not as easy as taking an English-language chatbot and translating all its content to Hinglish. Most basic chatbots mechanically translate chats back and forth without understanding the context. A fully functional multilingual chatbot needs to be able to decipher the language, understand exactly what the user wants, and respond naturally. Legacy translation services are not equipped with this natural language understanding, which can make or break multilingual customer experiences.
Hinglish chatbots powered by Linguist Pro
At Haptik, we understand the gaps of traditional translation services and instead use a hybrid of basic translation and custom native translation capabilities, called Linguist Pro, to power the most accurate multilingual conversations. Learn more about Linguist Pro here.
Haptik first launched Hinglish with Jiomobility, whose mission is to make superior internet & network accessible to every person in every corner of India. This strategy helps the brand to:
Expand user base to tier-2, tier-3 cities of India & provide localized services
Provide flexibility to customers to speak in their lingo and colloquialisms
Resolve a higher number of queries with automation & accuracy
How does it work
When we look at how the technology behind Hinglish chatbots works, we can broadly categorize it into 3 simple steps:
Language auto-detection: As soon as a user initiates a conversation in Hinglish, a Haptik multilingual chatbot automatically detects the language in the first three messages and understands that the user is more comfortable reading Hindi than English for the rest of the conversation.
Natural Language Understanding using Transliteration: To further understand what the user wants to say in the query, Haptik’s proprietary NLU engine transliterates the Hindi text from Latin script to Devanagari script. Once the transliteration is complete, the NLU engine uses deep learning algorithms to understand the query in Hindi and respond accurately.
“Family plan mein kaise add kar sakte hain” is transliterated to “family plan में prepaid कैसे ऐड कर सकते हैं”
Respond in preferred language: Brands can choose either Hindi or Hinglish as their bot language to resolve and respond to their user queries without requiring any human agent
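The three steps above can be sketched in a few lines of Python. This is a toy illustration only: the marker-word list, the 0.3 threshold, and the lookup table are assumptions made for the example and say nothing about how Haptik’s proprietary engine actually detects or transliterates Hinglish. A production system would use trained models for both steps.

```python
# Hypothetical marker words; a real system would use a trained language identifier.
HINGLISH_MARKERS = {"mujhe", "mera", "chahiye", "kahan", "hai", "kaise", "mein", "hain"}

# Toy lookup table standing in for a real transliteration model.
TRANSLITERATION = {
    "mujhe": "मुझे",
    "chahiye": "चाहिए",
    "mera": "मेरा",
    "kahan": "कहाँ",
    "hai": "है",
}

def detect_language(messages, window=3, threshold=0.3):
    """Step 1: look only at the first `window` messages and count Hinglish marker words."""
    words = [w.strip("?.,!").lower() for m in messages[:window] for w in m.split()]
    if not words:
        return "english"
    hits = sum(1 for w in words if w in HINGLISH_MARKERS)
    return "hinglish" if hits / len(words) >= threshold else "english"

def transliterate(message):
    """Step 2: map known Latin-script Hindi words to Devanagari; leave other words as typed."""
    return " ".join(TRANSLITERATION.get(w.lower(), w) for w in message.split())

def reply_language(detected, brand_choice="hinglish"):
    """Step 3: reply in the brand's chosen bot language when the user writes Hinglish."""
    return brand_choice if detected == "hinglish" else "english"

messages = ["Mujhe refund chahiye"]
detected = detect_language(messages)
print(detected, "|", transliterate(messages[0]), "|", reply_language(detected, "hindi"))
```

Words missing from the lookup table, such as “refund” here, pass through unchanged, which mirrors how English terms stay in Latin script in the example above.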
To Sum Up
There’s no doubt that multilingual chatbots, especially Hinglish ones, are an indispensable tool for brands that want to strengthen their presence in India. For Hinglish chatbots, translation and NLU are the two major components that can make or break the customer experience. Fortunately, Haptik excels at both by using Linguist Pro. Join us in our mission to help brands break all language barriers by making conversations customer-centric & truly vernacular.
Before you start building your chatbot, you need to decide how you want to use it. Is it going to be an interactive customer support tool, or a tool to help your sales team generate leads?
2. Build a prototype
Once you’ve decided how you want to use your chatbot, the next step is to build a prototype. This is basically a working model of your chatbot that you can try out and ask people for feedback on.
It’s a good idea to build a prototype before you start coding, because it gives you a better understanding of how your chatbot will work and helps you to spot any problems in the user experience early on.
3. Don’t rely only on canned responses
If you want your chatbot to have a more natural and conversational feel, it’s a good idea to use a combination of dynamic responses and canned responses.
Canned responses are pre-made responses that you can use in some situations, while dynamic responses are responses that are generated using your conversation history.
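To make the difference concrete, here is a minimal sketch, not tied to any particular chatbot platform. The intent names, the reply texts, and the “#1234” order-number format are all made up for the example.

```python
# Canned responses: fixed text keyed by intent (hypothetical intents and wording).
CANNED = {
    "opening_hours": "We are open 9am to 5pm, Monday to Friday.",
}

def find_order_number(history):
    """Scan earlier messages, newest first, for a token like '#1234'."""
    for message in reversed(history):
        for token in message.split():
            token = token.strip(".,!?")
            if token.startswith("#") and token[1:].isdigit():
                return token
    return None

def respond(intent, history):
    """Use a canned reply when one exists, otherwise build a reply from the conversation history."""
    if intent in CANNED:
        return CANNED[intent]
    if intent == "order_status":
        order = find_order_number(history)
        if order:
            return f"Thanks for waiting. I'm checking order {order} for you now."
        return "Sure, I can check that. What's your order number?"
    return "Sorry, I didn't catch that. Could you rephrase?"

print(respond("order_status", ["Hi", "My order is #1234."]))
```

The canned reply is the same for everyone, while the order-status reply changes depending on what the user said earlier in the chat.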
Even if you use a combination of dynamic and canned responses, the responses you get from your chatbot will probably be one-word answers.
To make your chatbot feel more human, you need to add more detail to these responses. For example, if you ask your chatbot “What’s your favorite movie?” it could reply with “Star Wars”. A better response would be “I like Star Wars because it’s full of action and adventure.”
5. Use analytics to track and improve your chatbot
To make sure your chatbot is getting used and to see how people are interacting with it, it’s important to set up analytics tools for your chatbot.
If you’re using the Microsoft Bot Framework, you can use the built-in analytics tools.
If you’re using Chatfuel, you can use the built-in analytics tools or use a third-party analytics tool like HotJar or Google Analytics.
The process of building a chatbot is straightforward, but it can be time-consuming. It’s also easy to make mistakes when you’re building your chatbot, which can mean you have to go back and redo things.
But if you follow these 5 tips, you’ll be able to build a chatbot that’s easy to use and is better than most of the chatbots you’ll find online.
With Examples from Google Analytics and Adobe Analytics
This is part I of the never-ending story on how to deal with Bots in your Analytics data. I review common, yet usually insufficient or even completely failing approaches. Why did I give up on AI-driven solutions like ReCaptcha, Akamai Bot Manager or Ad Fraud Detection tools? How good are the built-in Bot Filters? Should you at least maintain Bot Filters/Segments on top of GA views/AA Virtual Report Suites? Why does Server-Side Tracking exacerbate the Bot issues? I will finally give a peek at a client who saw Bot Traffic surging to over 40%, a case which made me reconsider entirely how to approach Bot Filtering.
The topic is as old as the world of Web Analytics: Bots (e.g. “crawlers” or “spiders”). They come without warning, wasting your time and money, and often causing spam in your data that is hard or impossible to repair.
Common Approaches to Bot Filtering
A: Web Analytics Tools’ built-in Bot Filters
Let me be short for a change and note that the archaic “Filter by IP” or “Bot Rules” interfaces are a nuisance to maintain and not a sensible option anyway. Bot Rules can handle only User Agent and IP addresses as rules. IP addresses can’t be used for filtering if you obfuscate or truncate the IP, which everybody does these days in Europe. And there are only a few Bots that are recognizable through their User Agent.
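A quick sketch shows both limits. It assumes that IP truncation zeroes the last octet of an IPv4 address, and the User Agent pattern and sample addresses are made up for illustration; neither GA nor Adobe is being reproduced here.

```python
import ipaddress
import re

# Illustrative pattern: only catches Bots that announce themselves in the User Agent.
BOT_UA_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)

def is_declared_bot(user_agent):
    """True only if the User Agent openly contains a Bot-like keyword."""
    return bool(BOT_UA_PATTERN.search(user_agent))

def truncate_ipv4(ip):
    """Zero the last octet (assumed truncation scheme), e.g. 203.0.113.57 -> 203.0.113.0."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)

bot_ip = "203.0.113.57"     # hypothetical Bot address
human_ip = "203.0.113.200"  # hypothetical human visitor on the same network

# After truncation both addresses are identical, so an IP rule cannot separate them.
print(truncate_ipv4(bot_ip), truncate_ipv4(human_ip))

# A Bot that sends a regular browser User Agent slips past the User Agent rule.
print(is_declared_bot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(is_declared_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0"))
```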
Google Analytics has made matters worse by no longer giving you data on network providers. That data, like Adobe’s “Domain” dimension, used to be one of the best ways to identify and filter Bots (at least after the fact). That being said, GA’s “exclude bots and spiders” flag is not much better than Adobe’s built-in Bot Filter. If you compare a GA View with and without the Bots flag, the difference is usually tiny. The views I looked at showed a mere 1% of Sessions less with the Bot Filtering flag.
Google Analytics 4 has Bot Filtering applied by default, and you cannot remove it, nor is there any way to verify what it does (black box):
“At this time, you cannot disable known bot traffic exclusion or see how much known bot traffic was excluded.”
B: Pretend Bots are irrelevant
Life with Bots is a plague. Many try to ignore Bots altogether and pretend that they don’t have much of an impact on the data, because their traffic is too small to be relevant, or that it is constantly at about the same level, so there is always the same margin of distortion. In some cases, this is true. Bots don’t attack each site the same way. Yet more often, it is wrong. Bots can strongly affect your entire Conversion Rate (I remember my first GA client in 2018 having nearly 50% of their sessions generated by Bots). But more frequently, they mess up the data for specific reports.
See this example of a comparably small Bot that spammed the site search with just about a thousand Pageviews, “typing” queries that only a bot can type, thus messing up our zero-results search terms report. This is a nuisance for the Search Management Team, because they prefer optimizing zero-result searches of humans:
C: Apply and Maintain Custom Filters / VRS Segments
Others spend a lot of time creating and updating View Filters in GA which unfortunately still only filter out data when it’s already too late (not retroactively). A common practice in Adobe Analytics is to keep enhancing “Bot Segments” in AA and put those on top of your Virtual Report Suites (see chapter “why you should use Virtual Report Suites”). That filters out Bots everywhere and retroactively AND reduces complexity for your users because they don’t ever need to learn about Bots nor see any Bot Segments. However, those filters/segments grow and grow, slow down queries, are prone to errors and a pain in the ass to maintain. And they become impossible to maintain once you deal with a true Bot rush (see the client case further down). Still, you need this approach at least to some degree.
D: Specialized AI-driven Bot Detection Solutions
Others again try to piggyback on Bot detection solutions. There are the simpler ones that believe they can do it all just by analyzing mouse movements. And there are the known Bot eaters like ReCaptcha, Akamai’s Bot Manager or PerimeterX (who claimed to have an Adobe Analytics integration, but disappeared when asked about specifics). These solutions are usually based on a mix of behavioral algorithms and pattern matches: The algorithmic part checks for certain bot-like behaviors of a user (usually identified by an IP address), while the pattern matches check the UserAgent/IP address against long lists of known bots. If you integrate the solutions the right way, they can drop their verdict (usually a “Bottiness score” or a “Bot/no Bot” flag) back into the browser or via a request to your server-side Tag Manager, and thus make this information consumable for Analytics tracking logic.
In my experience, none of these solutions has proven reliable or practical enough for the type of Bot filtering that is needed for Analytics (disclaimer: I have not tried ReCaptcha, but from reading about it, it will have the same issues). Why?
If the solutions say someone is likely a bot, this is often true, but too often also not.
Moreover, they miss out on way too many real bots.
Most importantly, the AI part in them usually means they can’t make their behavior-based Bot verdict until after the first couple of Pageviews — because they first need to see some behaviour before getting a reliable score. So in the moment when our Tag Management System has to decide whether to track that first Pageview, the Bot detectors can’t tell yet whether this is a Bot or a human (or a zombie).
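The timing problem in the last point can be simulated with a toy detector. Everything here is hypothetical: the three-Pageview minimum, the one-second threshold, and the class itself are stand-ins, not the behavior of any real Bot detection product.

```python
MIN_PAGEVIEWS_FOR_VERDICT = 3  # hypothetical: the detector needs this much behavior first

class BehavioralDetector:
    """Toy detector that judges a visitor by the average gap between Pageviews."""

    def __init__(self):
        self.gaps = {}

    def observe(self, visitor_id, seconds_since_last):
        self.gaps.setdefault(visitor_id, []).append(seconds_since_last)

    def verdict(self, visitor_id):
        """Return None until enough Pageviews were seen, then 'bot' or 'human'."""
        seen = self.gaps.get(visitor_id, [])
        if len(seen) < MIN_PAGEVIEWS_FOR_VERDICT:
            return None
        return "bot" if sum(seen) / len(seen) < 1.0 else "human"

def pageviews_tracked(detector, visitor_id, gaps):
    """Count Pageviews the Tag Manager sends to Analytics before a Bot verdict arrives."""
    tracked = 0
    for gap in gaps:
        detector.observe(visitor_id, gap)
        if detector.verdict(visitor_id) != "bot":
            tracked += 1  # no verdict yet (or human): the hit is tracked and billed
    return tracked

detector = BehavioralDetector()
print(pageviews_tracked(detector, "fast-crawler", [0.2, 0.2, 0.2, 0.2, 0.2]))
```

Even with a perfect verdict from the third Pageview on, the first two hits of this Bot are already in the Analytics data.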
It’s not the fault of the solutions, they just don’t mesh with the way Analytics tracking works
I am not saying these solutions suck. It’s not their fault. But first, they are built to be Bot “Managers”, not Bot “Filters”. So they are built to prevent excessive load on your servers and fraud attempts, but often don’t mind if a slow-moving crawler checks out thousands of product pages. And second, Analytics solutions unfortunately don’t give you an option to say:
“Please delete everything we already tracked from this Bot, and also tell our Analytics vendor that they shall not bill the server calls incurred by this dude.”
So at the client we will take a closer look at in a bit, we could never use the signals from these Bot Detection solutions.
AI-driven Bot Detection is not as good as it sounds
After defending these Bot Detection solutions somewhat, I have to lash out at them a bit. First, I was shocked how many really obvious and really traffic-heavy Bots (easily identifiable by their network domains) one particular expensive solution missed completely, even after days of the same IP addresses spamming the site. That was the nail in the coffin for my attempt to piggyback on them for Analytics-oriented Bot Filtering. Multiple improvement rounds did not change much. See some examples:
Special Case: Ad Fraud Detection Tools, or how to lose a lot of money due to overzealous AI
Some of these Ad Fraud Detection tools block ads for users whom they believe to be Bots or any other kind of user that likely won’t buy (“window shoppers” etc.). For example, the solution the client tested (I won’t name it here) stopped showing Google ads to supposed “Bots/non-converters/enemies/etc.”. The goal was to have fewer clicks that end up not converting anyway, because you pay for each click after all. So they expected a decrease in traffic in exchange for an increase in Return on Ad Spend. And the tool vendors bragged about the millions they would save.
The solution did lower the traffic drastically, but also dragged down Revenue. The Conversion Rate and the Return on Ad Spend increased only a bit. After switching off the solution again, Revenue and Traffic both skyrocketed. It was clear now: the ad fraud detection blocked ads for way too many humans.
If you want to avoid such costly test drives, demand that the Bot Filter vendor does a “dry run“ where their tool runs, but does not really filter anything and simply marks users it would filter out. Then compare that sample to your Analytics, e.g. via a common user ID key or an IP address (Bot Filtering is one of the “legitimate uses” for tracking PII like the IP) to see what their tool would have filtered out. I demanded this, but got only about 5 IP addresses they would have filtered (of which 2 were from humans), and then I was shut out from the discussion. The test run started, and the client lost a lot of money.
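Such a dry-run comparison does not need to be fancy. The sketch below assumes you have the vendor’s list of would-be-filtered IPs and a list of IPs that your Analytics data shows to be clearly human (for example, because they purchased). The addresses are invented, and the 5-flagged, 2-human split simply mirrors the numbers from my case.

```python
def dry_run_report(flagged_ids, known_human_ids):
    """Compare what the vendor would have filtered against visitors known to be human."""
    flagged = set(flagged_ids)
    humans_flagged = flagged & set(known_human_ids)
    share = len(humans_flagged) / len(flagged) if flagged else 0.0
    return {
        "flagged": len(flagged),
        "humans_flagged": sorted(humans_flagged),
        "human_share": share,
    }

# Hypothetical dry-run output from the vendor.
vendor_flagged = [
    "198.51.100.1", "198.51.100.2", "198.51.100.3", "198.51.100.4", "198.51.100.5",
]
# Hypothetical IPs that Analytics ties to completed orders.
converted_in_analytics = ["198.51.100.2", "198.51.100.5", "198.51.100.77"]

report = dry_run_report(vendor_flagged, converted_in_analytics)
print(f"{report['human_share']:.0%} of the flagged IPs belong to known humans")
```

The share is only a lower bound, because humans who did not convert are not in the comparison list.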
So anytime “AI” is mentioned as a cure-all method, always be over-cautious, because usually the people offering this AI haven’t understood the complexity of the problem sufficiently yet.
But back to our main topic: How can we get those Bots out of the Analytics data reliably, and before they get into Analytics in the first place? Let’s introduce the client example that changed the way I approached Bot Filtering entirely.
The Client Case: An uncontrollable Bot Rush
Approach C (maintaining segments on top of Virtual Report Suites) sums up my life with Bots as well. At least every month, someone reported that some on-site search report looked weird. Then we found out that some new crawler had taken to crawling all potential search result pages for products associated with brand “Mickey Mouse” and competitors. Another frequent case was freakingly low Product Conversion Rates for certain brands or products, going down to near zero because of Bots.
The “solution” was usually to find a clearly identifying but not too complex (ideally one condition) criterion for that crawler/pingbot/whatever (usually the Network Domain was the best indicator), add that criterion to the Bot Segments, then tell people to reload the report, and then we could go back to actually useful work.
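In code terms, such a one-condition Bot Segment boils down to a lookup. The sketch below uses plain Python on exported visit rows rather than the Adobe Analytics segment builder, and the domain names and visits are invented.

```python
# Hypothetical network domains that were identified as Bot sources.
BOT_DOMAINS = {"crawler-farm.example", "pingbot.example"}

def is_bot_visit(visit):
    """One-condition criterion: the visit's network domain is on the Bot list."""
    return visit["network_domain"] in BOT_DOMAINS

def bot_share(visits):
    """Share of visits matched by the Bot Segment (0.0 if there are no visits)."""
    if not visits:
        return 0.0
    return sum(1 for v in visits if is_bot_visit(v)) / len(visits)

visits = [
    {"id": 1, "network_domain": "telecom.example"},
    {"id": 2, "network_domain": "crawler-farm.example"},
    {"id": 3, "network_domain": "telecom.example"},
    {"id": 4, "network_domain": "isp.example"},
]

# The report is then built from everything the Bot Segment does not match.
human_visits = [v for v in visits if not is_bot_visit(v)]
print(f"Bot share: {bot_share(visits):.0%}, visits kept: {len(human_visits)}")
```

Every newly discovered crawler means one more entry in the domain list, which is why the list keeps growing.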
This worked decently well while the Bot traffic (Visits) was below 10–15% of the traffic measured by Analytics. Occasionally people asked why data from the past had changed, but it was not grave usually. I am saying “the traffic measured by Analytics” because many Bots are nice enough and do not execute Analytics scripts. Or they understand that this will make it easier for the website to eventually catch and block them. Or they are too stupid.
Over 40% Bots, the slippery Type
But in late 2020, some crazy Bot wave started, and the 10–15% went up to over 40% until March ’21. The Bots became like slippery worms, changing their network domain names and IP addresses all the time, so after finding and adding 100 new Bot domains on Monday to our Bot segment, we could add another 50 on Tuesday, Wednesday and Thursday. It was insane. The expensive IT-held Bot Management tool detected … wait for it … nothing!
Server-Side Tracking exacerbates the Bot Issues
Curiously, this problem only affected the server-side tracking technologies (actually Client-to-Server-to-Vendor). Why is that? To make a long story way too short, in server-side tracking, your browser never sends a request to “google-analytics.com/collect” or “…omtrdc.net/b/ss” etc… Thus, even if the Bot is one of those who do not want to get tracked by Analytics, it can’t evade tracking! So if you switch to Server-Side, get ready to deal with that additional Bot traffic.
But I digress… So we had this massive increase in Bot traffic, and it felt like shoveling water out of a house without a roof during massive rainfalls. Sisyphus live. We asked around whether IT or Marketing or agencies or anybody could help explain what was going on. Nothing relevant surfaced.
And … that’s it for part I. Read Part II to find out how we solved the problem by turning the concept of Bot Filtering upside down. Coming soon!