-
Call Center Experience with Voice Agents: Challenges, Use Cases, and Case Study
This is part 2 of a series about how call centers can evolve with the help of Conversational AI solutions. Part 1 covered choosing the right technology for call center automation and the opportunity to grow with Conversational AI solutions for business. Here we take a closer look at a specific scenario of Conversational AI in finance and at general use cases of voice agents for both end customers and live agents.
Customer Experience with Voice Agents
Voice assistants are growing in popularity. The ability of a solution to understand voice, interpret intent and meaning, and provide value to users is improving at an incredible rate. The voice channel presents many challenges, and a voice agent can create significant benefit in addressing them.
Challenges with Voice
- Determining why a customer is calling (their intent).
- Customer authentication and verification.
- Troubleshooting.
- Inconsistent customer service.
- First contact resolution.
- Agent engagement and productivity.
- Accurate log analysis and data collection.
An automated voice agent can help to mitigate some of the challenges listed above.
Customer Experience With Voice Assistant
Authentication of the user can be a challenge, depending on how you integrate voice into the solution. If you’re using an IVR, asking security questions is part of the normal process flow for user authentication. But for an embeddable voice solution within an application installed on a mobile device, where the user has already authenticated, you should be able to make certain presumptions about authentication: any data already presented in the app should be available through the voice agent without additional verification. Where security needs to be enhanced, it’s a matter of identifying how to integrate voice technology into existing workflows.
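To make that presumption concrete, here is a minimal sketch; the session fields (`channel`, `authenticated`) are hypothetical names for illustration, not from any specific product:

```python
# Hedged sketch: deciding whether a voice session needs security
# questions. Field and channel names here are assumptions.

def needs_security_questions(session: dict) -> bool:
    """Skip security questions when the voice agent is embedded in an
    app session that has already authenticated the user."""
    if session.get("channel") == "mobile_app" and session.get("authenticated"):
        return False  # inherit the app's authentication
    return True       # IVR and unknown channels keep the normal flow
```

An IVR caller would still get the normal verification flow, while an authenticated in-app session would not.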
Inconsistent customer service is also a problem. People react differently to the personality, mood, language, accents, word selections, and slang of live agents. A voice agent can mitigate many of those elements by providing the same answer each time, ensuring every customer receives the same information for the same question.
Use Cases of Customer Experience with Voice Agents
- Intent capture & intelligent call routing.
- Conversation transcription.
- Handle thousands of calls simultaneously, making them a solution for peak times or off-hours support.
- Handle transactions during the call routing stage.
- Authenticating customers through natural conversation.
- Personalized service based on customer history.
One key use case of voice systems is conversation transcription. Its first value is the ability to capture all customer activity in a transcript for review and analysis, which can be used to quantify new intents and actions. Its second value is letting the bot “listen” to how a live conversation between a user and an agent is going, so the bot can give the live agent proactive recommendations on how to support the customer.
An AI bot doesn’t have to be customer-facing; it can be a right hand for the live agents to create efficiency and allow them to be more effective.
An additional benefit of a Conversational AI solution is that of volume management. Scaling an automated service for a short period is much more viable than scaling up a live agent call center, especially in short bursts. Upgrading an automation service can be done in a matter of minutes or hours, allowing for quick support when needed. The ability to provide support and consistent experience in off-hours or on holidays has huge value.
Embracing omnichannel opportunities and automation offers huge potential for enterprises. Transactions through call routing are also very important, since you need to understand who the right live agent is for a particular user engagement. You don’t want to transfer someone to a live agent and then find out it was the wrong department; that’s a poor customer experience. Using an automation tool to identify where the user needs to be transferred and to whom, and to manage expectations around wait times and availability, is key to a successful engagement.
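The routing step described above can be sketched as a simple intent-to-queue lookup; the department names, intents, and wait times below are invented for illustration:

```python
# Illustrative intent-based call routing with a wait-time expectation.
# The mapping is a stand-in for real workforce-management data.

DEPARTMENTS = {
    "billing_question": {"queue": "billing", "avg_wait_min": 4},
    "loan_application": {"queue": "loans", "avg_wait_min": 9},
}

def route_call(intent: str) -> str:
    """Pick the right queue for a detected intent and manage the
    wait-time expectation; unknown intents go to a general queue."""
    dept = DEPARTMENTS.get(intent)
    if dept is None:
        return "Transferring you to a general agent."
    return (f"Transferring you to {dept['queue']}; "
            f"current wait is about {dept['avg_wait_min']} minutes.")
```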
How do you measure the success of implementing a Conversational AI solution?
Bringing quantifiable numbers to the table is important, and an analytical approach can provide this data, including:
- How many conversations were initiated?
- How many conversations were resolved within the bot?
- How many had escalations to a live agent?
- What was the reduction in wait times for live agents as a result of the inclusion of a bot?
- What intents were not able to be identified by the bot?
These are some of the metrics that can be measured to create success criteria. But the key metric for a Conversational AI bot’s relevance is user satisfaction, and a method to measure it should be part of the equation.
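A minimal sketch of computing such metrics from a conversation log; the log and its field names are assumptions for illustration, not from any specific analytics platform:

```python
# Hypothetical conversation log; each entry records whether the bot
# handled the conversation and whether the intent was recognized.
conversations = [
    {"handled_by_bot": True,  "intent_recognized": True},
    {"handled_by_bot": False, "intent_recognized": True},   # escalated to a live agent
    {"handled_by_bot": False, "intent_recognized": False},  # intent not identified
]

total = len(conversations)
contained = sum(c["handled_by_bot"] for c in conversations)
unrecognized = sum(not c["intent_recognized"] for c in conversations)

containment_rate = contained / total           # answered within the bot
escalation_rate = (total - contained) / total  # handed off to live agents
```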
AI Chatbot to Human Agent Handoff: Case Study for a Financial Institution
Let’s take a look at an experience Master of Code has provided, where we were able to create a live agent handoff system. Through the use of questions and the bot’s understanding of the user’s intent, we were able to transfer the user to the right department to get the needed information.
Use Cases: billing and account management FAQs, plus specific live agent handoff dependent on 12 topics. The AI chatbot for banking and financial services was built by Master of Code on a partner platform. The client serves financial institutions, financial planners, and broker-dealers.
Download Finance Use Cases for Conversational AI report with Top Examples
As you can see, there was a 30% reduction in transfers. This came both from the Conversational AI in finance solution answering some of the questions directly and from its ability to find the right agent. A 30% reduction in transfers means that 30% of those calls were handled by the Conversational AI, so queue wait times for live agents dropped significantly. Live agents no longer had to rush through a conversation to get to the next call in the queue, which made for a more effective experience.
1.4M hours were saved by implementing Erica, the automated customer service tool at Bank of America. Ready to transform? Get in touch
Call Center Experience with Voice agent: Challenges, Use Cases and Case Study was originally published in Chatbots Life on Medium, where people are continuing the conversation by highlighting and responding to this story.
-
Digital Avatars Bring New Meaning to Multitasking
Half visible man hiding behind digital code. Image by Peter Linforth, courtesy of Pixabay. Imagine writing a blog at home, teaching a writing course, and attending a Zoom meeting simultaneously. Believe it or not, digital AI companies are making exactly that possible. I’m not talking about gaming avatars, but doppelgangers that sound like you and use your facial and body expressions. These digital twins are bringing new meaning to multitasking.
The Use of Digital Twins in Business is Limitless
In November of 2021, HourOne debuted REALS, a platform where anyone can create one or more digital twins for business use. The business can create AI synthetic assistants that can do just about anything.
These AI assistants can answer phones, make appointments, and give a presentation in 4 different languages. Unfortunately, you will still have to get your own coffee, but your AI assistant can order your lunch and have it delivered!
You have complete control over how they interact with your customers. You can program and re-program in virtually (no pun intended!) minutes, making it easy to revise a marketing task that isn’t going as well as planned, or reproduce it using a different assistant and new languages.
This will be an asset to small businesses and gig workers who can put this technology to great use with international clients and remotely located workers. Business is ever-changing, and how you brand and market your business is essential to its success.
REALS stated that you can set up any presentation using “thousands of text lines in multiple languages.” This assistant can give the same presentation in numerous countries simultaneously, something that would take a human presenter months and cost the company thousands of dollars.
The look of marketing will be changed as companies create an interactive avatar that becomes synonymous with their brand. Imagine your kids being able to interact with a character on the cereal box.
Personal Use of Digital Doppelgangers Will Expand Your World
While these interactive characters will be a great marketing tool, they can also be created for personal use, especially in the metaverse. Imagine visiting with your sister on a beach in Greece, or talking to family members and friends, for a monthly platform fee!
Housebound seniors and the disabled will be able to create and use an avatar to experience virtual life, find people who share their experiences, make and visit new friends, and experience life in a way they haven’t been able to before.
3-D Software That Creates Collaboration
Nvidia’s Omniverse will launch free 3-D software that will allow real-time collaboration between businesses. They believe that increasing access will benefit us all.
Since its beta launch one year ago, there have been 100,000 downloads by innovators using it to enhance their workflows.
This will expand collaboration between industries and make working with companies worldwide possible for companies big and small.
The use of digital avatars is in its infancy; its use in business and the metaverse will change human interaction in untold ways. The changes they offer are going to bring new meaning to multitasking!
-
Dubai residents can now ask ‘Fares’ about any municipal issue
Dubai has decided to go one step further by introducing a virtual assistant that answers citizens’ questions and requests regarding municipal issues. Using conversational AI, the voice assistant ‘Fares’ gives citizens multiple ways to chat or talk with Dubai officials, including via WhatsApp.
Virtual municipal service
It’s an advanced way of serving the people of Dubai: they can communicate with ‘Fares’ by calling the city hotline or connecting through the WhatsApp number, and ask any question, whether about the city and its services or about the status of a request they made earlier. ‘Fares’ also offers an option to verify rumors about the city. Citizens can even ask about the status of their housing tax payments and report any issues.
“Communication with #DubaiMunicipality is easier through ‘Fares’, the virtual assistant who answers your inquiries around the clock and helps you submit your services’ reports,” the city’s official Twitter feed explained. “‘Fares’ is available via WhatsApp, the website, and the Municipality’s unified application. Visit our website and learn more about him.”
Many cities and countries have opted to move forward with smart services for their citizens, so Dubai isn’t the first to take such an initiative; governments in India and Russia, and the US state of West Virginia, have also been experimenting with conversational AI.
-
Call Center Automation using AI-Powered Chatbot
Welcome to our discussion on call center evolution using AI-powered chat and voice agents. Our intent today is to share our experiences and observations as one of the leading Conversational AI companies about how chat and voice assistants can take call center automation experiences to the next level, providing value to both customers and call center agents, regardless of the communication channel. Then in part 2, we’ll discuss Call Center experience with voice agent: Challenges, Use Cases, and Case Study.
How to choose the right technology for a Conversational AI solution for Call Centers?
There is no single platform or technology that’s a golden ticket to a successful Conversational AI experience for call center automation. In general, for every industry and field, it takes multiple technologies and multiple systems to create an effective solution. Bringing those systems together is something Master of Code has experience delivering, which has let us be recognized as a trusted partner by some significant providers of Conversational AI solutions in the market, including Amazon and Microsoft.
We work with many platforms, based on customer needs and the selected solution, and have the knowledge and skill to create Conversational AI experiences within an existing platform and optimize it, not just deliver for the cloud. This allows us to understand what works and what doesn’t, and to provide recommendations and guidance throughout the lifecycle of the engagement.
Implementing a Conversational AI experience within a call center
There are a few fundamental components that must exist, beginning with the call center tools implemented in an organization. There is no one right or wrong tool, simply what works best for your organization. All of the major solutions, such as Cisco, RingCentral, Zendesk, and many more, bring value to the call center automation experience, and they enable customers to enter a queue to engage with live agents.
Opportunities for call center automation with Conversational AI
- Reduce repetitive requests to agents by answering easy questions.
- Reduce wait times for users, resulting in much more favorable agent stats.
- Bring a high level of conversational automation into the equation.
Working with a conversational platform allows you to marry the live agent component to the automation piece in a much simpler fashion. In many cases, these call center solutions have a conversational element, either pre-built or through partnerships that can be leveraged. Otherwise, the two systems can talk to one another via existing APIs or through custom integrations.
Top Conversational AI channels and types for customer engagement
When deciding how you want to engage customers, you need to identify the most applicable channels and conversation types. That could mean adding a chatbot to your website, working through existing digital support channels such as Apple Business Chat, Facebook Messenger, or Microsoft Teams, or replacing a tone-based IVR phone system with a Conversational AI-based one.
Conversational platform for Call Center
Channels and communication methodologies drive the use case priority and provide a foundation for measuring success. Since each channel will offer different ways of user engagement, strong knowledge of the channel and what is available within it is key to creating that optimal experience. This selection, which can grow as your needs change, is one of the fundamental pieces that can drive digital engagement for your brand.
- Workforce management tools. These allow for agent planning, and also help accurately route a customer to the appropriate live agent, person, or department. The faster and more readily a solution can give the user an answer, the more positive the experience.
- Agent assist. Useful in determining the customer’s need and finding the right agent or workflow to execute the request.
- Menu-driven navigation systems. These can be low-cost to implement, but they create a much more linear experience and provide limited metrics. You might know how many people follow a certain path, but you don’t necessarily get insight into what else they’re looking to do within your chatbot.
- Analytics. By converting the experience to a Conversational AI flow, the amount of data you get increases dramatically. You will see what people want to do, identify new flows and user experiences, and have data-centric metrics to support your growth decisions.
Download a Conversational Flow Chart Diagram with the Scenario of Building Dialogues for your Chatbot
In addition, you can add in an NLP solution, either a cloud-based one like Microsoft LUIS or an on-prem solution such as RASA. Based on organizational needs, you can fine-tune that experience to flow in a way that is virtually seamless to the end-user.
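To illustrate the job an NLU layer performs, here is a toy keyword-based intent matcher; a real deployment would train a model in a platform such as LUIS or Rasa rather than match keywords, and the intents below are invented:

```python
# Toy intent matcher: scores each intent by keyword overlap with the
# utterance, falling back when nothing matches. Purely illustrative.

INTENT_KEYWORDS = {
    "check_balance": {"balance", "account"},
    "reset_password": {"password", "reset", "login"},
}

def classify(utterance: str) -> str:
    """Return the best-matching intent, or "fallback" when none match."""
    words = set(utterance.lower().split())
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "fallback"
```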
It is important to select a tool for call center automation based on your business needs, such as supported languages, what type of PII concerns exist, and the technological constraints of each provider.
Finding the right NLP to manage, understand, and train for your call center automation is key. This can extend further into other AI automation components such as sentiment analysis, document analysis, visual recognition, and other cognitive services. Having a long-term strategy helps with the right selection and can save time and money down the road.
Additionally, we cannot forget the line of business tools that house the detailed data that is needed to make the conversation useful to end-users. This is where users can authenticate themselves, perform appropriate tasks, and access CRM, ERP, or other operational services to allow a live agent to engage and answer user questions. As a bot obtains more access to information, its value continues to increase, resulting in customers who can obtain assistance much more quickly.
Value of integrating Conversational AI solutions for call center automation
Whether it is the channel itself, a workforce management tool, NLU or other cognitive systems, line-of-business tools, or an analytics platform, we cannot deny the importance of integrations. No-code systems may have challenges in obtaining the information needed for an effective exchange, while a low-code approach will, at minimum, allow these custom solutions to be developed and provide value.
Omnichannel support allows the bot to work alongside any channel and over multiple communication methods. With the release of a new channel, businesses will just need to create the experience for that channel based on existing flows. But if the Conversational platform does not support it, then an investigation of optimal experience and implementation in that channel will be required.
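One way to keep a single flow across channels is a small rendering adapter; the channel names and payload shapes below are assumptions for illustration:

```python
# Sketch of a per-channel rendering adapter: one conversational flow,
# channel-specific output formats (all invented for this example).

def render(message: str, channel: str) -> dict:
    """Format the same flow output for the capabilities of each channel."""
    if channel == "voice":
        return {"ssml": f"<speak>{message}</speak>"}   # spoken channels
    if channel == "messenger":
        return {"text": message, "quick_replies": ["Yes", "No"]}
    return {"text": message}                           # default web chat
```

Adding a new channel then only means adding a new branch, while the flows themselves stay untouched.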
Integration makes for more performant conversations because information can be presented in a more conversational manner. Providing information from a Conversational AI solution in the same way a live agent would makes for a more pleasant experience. We’re not limited to assisting a customer directly; the goal is to provide the right solution to a problem, and a right hand for live agents may be that solution.
API data unification allows us to bring all of the data points together into a coherent message. We may get some data from a CRM, an inventory system, an authentication or SSO platform, or directly from a database. Knowing where we can get the data from means that we can unify the experience and merge the data in a way that makes sense for the user.
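A minimal sketch of that unification, with placeholder functions standing in for real CRM and order-system calls (the names and data are invented):

```python
# Hypothetical unification of data points from several backends into
# one coherent message for the user.

def fetch_crm(user_id: str) -> dict:     # placeholder CRM lookup
    return {"name": "Dana"}

def fetch_orders(user_id: str) -> dict:  # placeholder order-system lookup
    return {"open_orders": 2}

def unified_summary(user_id: str) -> str:
    """Merge the data points and phrase them as one reply."""
    profile = {**fetch_crm(user_id), **fetch_orders(user_id)}
    return f"{profile['name']}, you have {profile['open_orders']} open orders."
```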
Value of Integration in Call Center Automation
Integration provides flexibility, such as adding data sources without impacting existing ones: if a CRM upgrades to a new version with new APIs, we update the connectivity library and ensure we get the same information with no changes to the Conversational AI flow. The system can also be configured to fall back to the previous iteration, remaining operational even when downstream services are challenged.
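That fallback behavior can be sketched as a thin connectivity wrapper; the endpoint functions here are hypothetical stand-ins, not a real CRM API:

```python
# Sketch of a connectivity layer that stays operational by falling back
# to the previous API version when the new one is unavailable.

def crm_v2() -> dict:
    raise ConnectionError("v2 API unavailable")  # simulate an outage

def crm_v1() -> dict:
    return {"source": "crm_v1", "status": "ok"}

def call_with_fallback(primary, fallback) -> dict:
    """Try the upgraded integration first; fall back if it is challenged."""
    try:
        return primary()
    except Exception:
        return fallback()
```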
Also read: Three Secrets Behind Impactful Troubleshooting Chatbot Conversation Flows
User request translation is a key value proposition for both a bot and a human agent. Letting a bot handle the interpretation allows for a more probabilistic understanding of the request, which can lead to routing to the appropriate person or simply pulling down the right information. Integrating with NLP and other business services to extract that data and respond effectively to the user’s request is a huge value statement.
The Conversational AI approach is much more natural, and the more we make it human-like, the more users will engage and perform actions without the need for a live agent. Building that trust that Conversational AI solutions can answer those questions is key. It can provide a 24/7 support model in languages that perhaps your office can’t do directly through live agents.
Many enterprises have older systems that require a more hands-on approach to obtaining data. It can be a mainframe system, something written in a language and platform that is no longer viable, or a subject matter expert who has the knowledge of how to support it but has left the organization. Accessing legacy APIs and the ability to provide the data into a modern system provides significant value to the customer and to the business, which a no-code solution may not be able to provide.
Having built many extensive Conversational AI solutions, we at Master of Code are well versed in finding the right efficiencies and use cases, bringing information at the right time to create an optimal experience. With each of our partners, we work with stakeholders to best understand the ability to implement Conversational AI solutions. It includes choosing the right technology for the task at hand, data sources, and integrations to generate the best experience for users. The objective is to create efficiency and address customer concerns quickly and correctly.
Take the first step in leveling up your Conversational AI experience:
GET IN TOUCH WITH US
-
How to Design and Write a Chatbot in 10 Steps
One question that comes up a lot when a business or organization wants to create a chatbot is how exactly do you take your idea and turn it into a real chatbot? Before you’re ready to have automated conversations with your customers at scale, there’s a process of strategy, conversation design, and testing that needs to happen. Here are the 10 steps you should take to go from idea to working chatbot prototype that’s ready to be built.
- Define the purpose. This is the most important thing to determine for your chatbot. Why are you creating a bot in the first place? If you are taking an existing interaction or process and automating it, what is that current experience like, and how could a bot help the customer or improve the process?
- Define the goal. Now that you understand the why, what will this bot do? What is the valuable outcome that a user will gain by interacting with it? It is essential to define your chatbot strategy before the writing begins, so you know exactly what the bot will do and why that goal is important.
- Outline the steps. With that end goal, work backward to determine all of the steps necessary to reach that goal. You can likely gain the information from the business website, sales materials, interviews with customers or sales agents, etc.
- Define the audience and personality. In order to design an experience that converts, it’s crucial to know what the user wants and what their sentiment will be during the interaction. To connect with the audience, you have to know them! How do they buy? What are their challenges? How familiar are they with your topics? These are all questions you will need to answer.
- Map the flows. Create a visual guide of your steps, and fill in the ways these connect to each other. The main goal with creating a flow map is to visualize how a user would go from entry to exit and where they might want to — or be able to — cross paths into other flows. There are several tools you can use to easily create a flow map that represents your chatbot user journey. My two favorites are draw.io and Lucidchart.
- Write the key flows (Hello, Main, Outcome). With these conversation design best practices in mind, it’s time to write! Begin with the most necessary parts of your chatbot conversation, and write the beginning-to-end “ideal” experience. This will also help you discover offshoot flows you need to add. I’ve created an online template for writing all of the dialogue for your chatbot, including these flows, but you may want to write yours in another internal tool or document, depending on your specific needs.
- Write the secondary flows (other answers to main questions, about, contact, etc.). When writing the main flows (and in your flow map), you will notice there are many points where a flow needs to extend to offer another path if a user selects a different answer. After you create the main flows, go back and fill in anything that is currently a dead end and make sure all of your flows connect.
- Create a demo/prototype. Turn your 2D writing into a working, 3D conversation. Creating a video mockup or working example of a chatbot is the best way to illustrate the experience and demonstrate it for your team. Use a tool, like Botmock or Botsociety, to easily create a working demo so you can see how the conversation really flows.
- Edit the experience. After using your prototype and sharing it with your team, you’ve likely uncovered messages that are too long, along with dead ends or pathways that don’t make sense. As necessary, go back and edit the copy to make sure it’s as effective as possible.
- Test the experience. You can continue to test the experience in a prototype mode or in a production-ready bot. Repeat this as necessary. There are a few different ways to test a chatbot (or a prototype) before you release it out into the world. There is role-playing, usability testing, or getting user feedback from other chatbot professionals online. Once you have an experience you are happy with, it’s a good idea to test it with a small group of customers and scale up to something that’s available for everyone to use.
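The mapped flows from the steps above can be sketched as a simple state machine, where each state lists the user choices that lead onward; the flow and choice names are invented for this example:

```python
# Minimal flow map as a state machine: state -> {user choice -> next state}.
# Flow names (hello/main/outcome/etc.) follow the key flows in step 6.

FLOWS = {
    "hello":   {"get started": "main", "about": "about"},
    "about":   {"back": "hello"},
    "main":    {"buy": "outcome", "talk to a human": "contact"},
    "contact": {},
    "outcome": {},
}

def next_state(state: str, choice: str) -> str:
    """Unrecognized answers re-prompt the same state, so nothing dead-ends."""
    return FLOWS[state].get(choice.lower(), state)
```

Writing the flows this way makes dead ends easy to spot: any state whose transitions are empty, like "contact" above, is an intentional exit rather than an accidental one.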
Once you have completed these steps, you are ready to build and release a chatbot. You should feel confident that the experience will be enjoyable for the customers and help the business reach its goals.
Feel a little bit overwhelmed by the process? Not sure where to start? Want to go even further? Learn the basics of conversation design, the ins and outs of this exact process and create a prototype when you take my online course, Chatbot Writing & Design.
Chatbot Writing & Design Course * UX Writers Collective
This post originally appeared on discover.bot
I’m Hillary, the Head of Marketing & Conversation Design at Mav.
Want to chat bots? Network with 1500+ conversation designers when you Join my Private Facebook Group!
-
UX Case Study: Automating Outbound Collection Calls through a Voicebot for a Micro-Finance Company
A voicebot understands natural language and interacts with users to remind them about due payments.
Background
I work at an IT services company (Simpragma) which aims to revolutionize contact centres with automation expertise built over a decade. The company has built voicebots, chatbots, social media messenger bots, visual IVRs, and more for renowned brands, where I have contributed as a user experience designer.
Overview
This project was done for a leading micro-finance company (we will call it “B Finance”) which has over 5 million customers. They wanted to automate payment reminder calls, as the task is repetitive in most cases. I will take you through the design process of the Collections voicebot at a glance; some stats may be altered for confidentiality. Since we had prior knowledge of B Finance’s customer base and had developed a voicebot for them earlier, this project was more like a feature rollout: release to a small segment of customers first, test, and improve for the final launch. Some specifics to keep in mind while going through any UX case study:
- UX maturity of the design team at Simpragma.
- Duration for research and ideation: 2 weeks.
- Business impact on B Finance in terms of expenditure and ROI.
- The transition for B Finance’s customers from receiving calls from a human customer service executive to an AI/ML-based voicebot.
Prior Knowledge
How does the Voice bot function
The voicebot uses ASR (automatic speech recognition) to convert speech to text and NLP (natural language processing) to understand human language. The response is then generated using TTS (text-to-speech). The AI predicts the bot’s response from a predefined library, and machine learning helps train the voicebot over time for complex scenarios.
Voicebot process for understanding customer queries, making a decision, and delivering a response. For more information on the voicebot process, see the links below:
- What is Voice bot: An Ultimate Guide for Voicebot AI
- Voicebots 101 Guide: applications, benefits, best practices – SentiOne
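As a rough sketch of that ASR → NLP → TTS loop, here is a toy end-to-end turn; every function is a stand-in for a real engine, and the intent and reply are invented for illustration:

```python
# Toy voicebot turn: ASR -> intent detection -> response lookup -> TTS.
# Each stage is a placeholder for a real speech/NLP engine.

RESPONSES = {"due_emi": "Your EMI of Rs. 1,200 is due on the 5th."}

def asr(audio: str) -> str:   # speech-to-text stand-in
    return audio              # pretend the "audio" is already transcribed

def nlu(text: str) -> str:    # intent-detection stand-in
    return "due_emi" if "emi" in text.lower() else "fallback"

def tts(text: str) -> bytes:  # text-to-speech stand-in
    return text.encode("utf-8")

def handle_turn(audio: str) -> bytes:
    intent = nlu(asr(audio))
    reply = RESPONSES.get(intent, "Let me connect you to an agent.")
    return tts(reply)
```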
Earlier Project with B Finance -Inbound customer query resolution Automation
This was not a completely new project from scratch. Our team automated B Finance’s inbound customer service calls earlier and provides continuous support for upgrading workflows, scripts, intents, and any other issues. B Finance was able to answer approximately 5,000 calls per day with 50 telephony lines and 22 agents, and approximately 3,000 calls were going unresolved. After automating the calls, B Finance scaled up 3x easily, as it did not require hiring and training many more agents. 80 percent of calls were handled by the voicebot, and unique cases would end up with human agents if needed. The voicebot was available 24/7 and is continuously trained through machine learning to handle more complex queries over time.
Customer Base
The earlier project with B Finance gave us a lot of insight into their customer base. Most customers have taken small loans for products like mobile phones, home appliances, or other electrical gadgets. As the majority of users were located outside main cities:
- We understood customers’ level of literacy and understanding of technology.
- The majority of the customers speak Hindi in different styles.
- They would use different kinds of words for the same query, so we were able to develop a library of intents (synonymous words) that the bot can recognize to understand the user’s query. For instance, some users might say “Due EMI” and some might say “pending installment,” but they mean the same thing. Using such intents, the bot could understand the similarities as well as the differences.
- A lot of users perceive a voicebot as human, which can set higher expectations for getting their issues resolved. We made sure to build a human-like bot, but not an exact imitation of a human customer care agent.
- As everyone has a different grasping speed, we tuned the bot to speak at an average speech rate, but some customers might still interrupt it. The voicebot was built to adapt to, understand, and answer such customers as well.
- Some customers understood that the bot is not a real human and demanded to speak to one. If the customer is adamant, the voice bot transfers the call to a human agent. It also transfers calls in certain other specified cases.
Project Brief
After getting the inbound calls automated, B Finance approached us to automate their outbound collection calls. Like any other financial institution, B Finance generates revenue from the difference between what it lends and what it receives back in the form of EMIs (principal + interest). Regular payments ensure a fund balance for B Finance so it can lend money to new and existing customers. Thus, collecting payments on time and reminding customers to pay is a crucial operation for B Finance.
Customers need to be reminded regularly about upcoming and overdue EMI payments. If a customer hasn't paid the EMI on time, it affects their credit score, and they have to pay late-payment penalty charges after the grace period.
This project would cover approximately 8 lakh (800,000) customers with 300 calling lines every day.
As the client was already handling payment reminders and due collections manually through human agents, they were able to give us data on the different scenarios to be automated, along with scripts for the voice bot.
These scripts would not be finalized until we developed a minimum viable product for testing and improvements.
First Draft
Insights from the scripts
The scripts were not just dialogue between the bot and the user; they also captured what the business needed customers to be reminded about.
- Regular Payment Reminder– Reminding customers in advance about upcoming EMI payments.
- NACH reminder– NACH (National Automated Clearing House) is a facility offered by banks that lets customers pay EMIs directly from their bank accounts. Sometimes customers forget to update their NACH mandate with their banks and have to be reminded before the due date to avoid late payments.
- Pending EMI reminder– In case a customer forgets to pay EMI on the due date, B Finance reminds them to pay to avoid penalties and negative effects on their Credit score.
- NACH update pending/failed– If the NACH mandate was not approved, B Finance cannot debit the customer's bank account, so the customer is asked to pay by other methods.
- Smart debit failed– If NACH/smart debit was set up but the customer's account could not be debited due to a low balance, they can either pay by other methods or add funds to the bank account, as B Finance will re-attempt the debit in a few days.
- Advance reminder call– Overlaps a bit with the regular reminder, but can be a necessary step for customers with a bad payment history.
- Follow-ups– Scripts for follow-up calls were shared too.
- Disconnection– The script might change if a call is disconnected due to network issues.
- Customer declines/doesn't answer– The bot might have to be more assertive or might need a change of plan.
Quick Secondary Research
In a B2B scenario, it wasn't easy to get an understanding of competitors' products. After spending a couple of hours getting demos from competitors, I moved on to secondary research and came up with some crucial steps to follow for our MVP.
- Introduce yourself
- Confirm Name/DOB etc (Identify correctly)
- Check availability of customer (good time to talk)
- Tell the intent of the call
- Empathize with customer
- Help the customer with ease of payment or alternatives if you can
- Inform about consequences like late payment charges etc.
- Inform the customer properly about the next steps
- Send necessary documents if required
- Give a contact number the customer can call back if needed
Initial Conversation Flow
The initial flows were created for an initial conversation with the client. Having a conversational flow in place makes it easier for both the development team and the client to understand the voice bot–customer interaction. It also acts as a base for the information architecture, where the dev team can design the decision tree based on the flow at a later stage.
Here is a glimpse of one crucial flow, the pending EMI reminder, out of the six flows that I created for discussion. The other flows have certain similarities, as the nature of customer responses was similar. It should give you an idea of what I presented to the client.
Some more cases were added along with each flow:
- Follow up — if the customer doesn't pay even after the reminder call, the voice bot calls back and explains the consequences of delaying.
- Network issue– Voice bot will call back after some time and apologize for the inconvenience caused.
- Customer not answering– If the customer responds after some time, the voice bot is more assertive and urges the customer to pay. If a customer doesn't answer at all, call logs are maintained and a human agent tries to reach them.
For the current release, the phone lines are limited to 300 for eight lakh customers. Assuming an average call takes 4 minutes and operational hours are limited to 12 per day as per government guidelines, the voice bot will be able to call approximately 43,000 customers per day.
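The arithmetic behind that estimate can be sketched as follows. Note that the ~80% line-utilization factor is my assumption (accounting for dialing gaps, unanswered calls, and wrap-up time); 300 lines running 12 hours at a flat 4 minutes per call would otherwise give 54,000 calls.

```python
# Back-of-the-envelope call-capacity estimate (a sketch; the 0.8
# utilization factor is an assumption, not a figure from the project).
def daily_call_capacity(lines, hours_per_day, avg_call_minutes, utilization=0.8):
    # Productive minutes available on each line per day.
    minutes_per_line = hours_per_day * 60 * utilization
    return int(lines * minutes_per_line / avg_call_minutes)

print(daily_call_capacity(300, 12, 4))  # 43200, roughly the ~43,000 cited
```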
If the due date has already passed, the flow would be similar, with just the intent message changed. For a general reminder, the flow would look like this:
Client feedback
We had a quick call with the client and went through the flow together. We even did a small role-play exercise where one participant was the user and the other played the voice bot. A few important points that we discovered:
1. Checking availability was time-consuming and did not fall into the bracket of a 'must do' or 'should do' at this stage, so it can be added later. Since the reminder call itself was crucial for the business to generate revenue and was the client's priority, we decided to skip this step.
2. The client updated us on the modes of payment: customers need not visit a branch to make payments.
3. For any dispute with the dealer, or if the customer has already paid, a human agent would call back if the voice bot is not able to resolve the issue.
I wanted more insights from B Finance's customers, but due to a time crunch it wasn't possible. We decided that I would analyze their reactions and behavior from call recordings taken after the first release.
I still went ahead and did a quick qualitative study of agents and process managers working in collections for various companies.
Qualitative Insights
Questionnaire
- What is the collection process that you follow?
- How often do you call the customer?
- Is there a right time to call the customer?
- What is your tone while talking to customers?
- What is the strategy to make the customer pay their dues?
- What is the repeat rate of defaulters? Are you able to check the history of late payments for one customer?
- What are the consequences of not paying on time?
- How do you deal with people who do not pay regularly?
Insights from Interviews with Collection Agents and the Process Head
- It is not easy to get users to pay, as they come up with various excuses.
- It's human nature that agents might get aggressive or rude when dealing with similar situations every day. There is pressure to meet targets, as their earnings largely depend on incentives.
- Financial institutions use incentive-based income plans for agents to avoid low performance, as revenue depends on EMI payments from customers.
- There is no hypothecation in microfinance loans, where the annual income of households is low. The company cannot claim the items back from customers, nor would reselling these products serve a greater purpose. Collecting installments is the only way to grow the business.
- Agents might inform customers about the repercussions of a low CIBIL score, but users are either unaware of it, as they are from low-income groups, or simply not bothered by a low score.
- Agents go out of their way to collect EMI amounts from customers. To date, there is no ideal way to assure timely payments from all customers. Bad debt is a huge problem for financial institutions.
- Though not ideal, agents call customers' relatives and friends to build pressure, as customers might pay out of embarrassment. Similarly, they scare customers with the prospect of recovery agents turning up at their office or residential addresses.
- The idea of embarrassing people does work to some extent. It might be a societal factor as well; we need more research on that, which I might take up in the next phase.
- The call frequency is higher and more assertive tones are used for habitual offenders.
- People block or avoid calls, so agents might use alternative numbers.
Selection Matrix
Based on the above insights, I tried to map out the selection criteria for calling. There were various differentiators and categories to be followed:
- Customers with good payment history vs bad payment history: Voicebot would need to understand this to change the tone with help of a script.
- Pre and post notification: voice bot needs to identify whether it is calling before or after the due date.
- Call intent: The voice bot needs to inform the customer of the reason for the call, like a general payment reminder, pending EMI, missed auto-debit, or failed NACH.
- Intensity of reminders: Customers need to be reminded, and the ones who still have not paid might have to be reminded again with increasing intensity.
- Timeline: Intensity has to increase with timelines. It depends both on business requirements and voice bot availability.
- Identification: The customer needs to be identified properly before proceeding with the call; otherwise, the records are updated.
A glimpse of the selection matrix for customers with good payment history
A glimpse of the selection matrix for customers with bad payment history
2nd Draft (Ideation)
So I started building flows according to the selection matrix, where the intensity increases every 30 days if a customer doesn't pay, until at 60 days the case goes to field recovery teams.
The call frequency also increased with time and in the case of call avoiders, human agents would reach back.
As the voice bot had certain limitations, human agents would take over in unique dispute cases, disagreements over payment, payment gateway issues, etc., as it was too early for the voice bot to handle them.
While I was creating flows with different intensities of messages over time for defaulters, I did a quick check with my developers. I felt the selection matrix was overwhelming, with too many flows to follow for a prototype.
The developers shared my assessment, and we concluded to pare it down.
3rd Draft (Ideation)
For the prototype to be developed quickly with minimal effort, yet good enough to be tested in a real environment with approximately 150 people, I reduced the complicated matrix to a simple one.
A glimpse of the simpler selection matrix for voice bot flows
After a lot of ideation, I was able to combine all flows except the NACH reminder, as it did not require the customer to state agreement or disagreement to pay.
Flow for Development
The combined flow would be used to develop the prototype. We chose some must-do tasks to take care of in the prototype, as all problems can't be solved at once.
- The voice bot will fetch customer data through an API to determine the due date, due amount, and other necessary details.
- The voice bot would confirm whether it is speaking to the right customer. If not, it would try to get the right phone number and record it; later, this would be manually verified and updated by human agents.
- The voice bot would then introduce the intent of the call after checking which condition applies:
  - 10 days before the due date
  - 1 day before the due date
  - After the due date: pending EMI, NACH failed, smart debit failed, or re-debit
4. The script would get more severe over time for customers who do not pay. For the current prototype, we also removed the differentiator between good and bad payment history.
(0–15 days) — normal severity
(15–30 days) — medium severity
(30–60 days) — high severity
(above 60 days) — critical severity
5. The voice bot will give clarity on the payable amount: due amount + interest + late payment charges.
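The severity bands above can be expressed as a simple lookup. This is an illustrative sketch, not the production logic; the function name and tier labels are mine.

```python
# Map days past due to a reminder-severity tier, following the
# bands described above (labels are illustrative).
def reminder_severity(days_overdue):
    if days_overdue <= 15:
        return "normal"
    elif days_overdue <= 30:
        return "medium"
    elif days_overdue <= 60:
        return "high"
    return "critical"

print(reminder_severity(45))  # high
```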
6. In case of any dispute where a customer does not agree to pay, the voice bot will end the call with a message that a customer care agent will call back.
Development of prototype
As we already had prior knowledge and some modules ready for the same client, development for the first release would take 4 to 6 weeks, during which I work closely with developers to gather insights and note down gaps and opportunities.
Scripts and Intents
Scripts are generally improved over time with every release. Similarly, new intents (words spoken by a customer that the bot can use to understand their query) are continuously added to the library for training the voice bot.
Testing
Like every other contact center, we record and audit the calls along with an audit manager from the client side. We listen to 10 calls per day to understand what went right and what did not. Then we divide tasks according to tech issues or process issues and make improvements continuously.
The data we get from calls is also used to train the bot every day on new intents, and the scripts for the voice bot are modified as well.
Next Release
Testing will give us insights for improvements in the next releases, but we already have some key factors to look at.
- Currently, the phone lines are limited to 300, so a customer might get a follow-up call only after 18 days. We need to increase phone lines or find a way to prioritize follow-ups for regular defaulters.
- We need to add and test certain steps, like 'checking availability with the customer', which we removed earlier.
- We need to personalize the calls and change scripts accordingly.
- Train bot for complex scenarios to reduce human agent intervention.
- Get qualitative insight from B Finance’s customers.
- If a customer doesn't answer, should the voice bot use the alternate number to call back, or call the customer's friends and family to reach out?
- Should there be a ground team for collecting EMI offline?
- I will work on ways of convincing people to pay. To date, offline physical recovery has worked best in cases of bad debt. Empathetic user research with such customers will yield great insights.
- If customers block one number, what would be an alternate route to reach them?
I will add the further progress of this project in the next case study. Stay tuned.
Ux Case study-Automating Outbound Collection calls through Voice-bot for a Micro-Finance Company was originally published in Chatbots Life on Medium, where people are continuing the conversation by highlighting and responding to this story.
-
It’s 2022 — Time to Level up your Conversational AI Experiences
The Conversational AI (CAI) space has come a long way over the last few years. As users prefer to communicate over digital channels, the demand for conversational AI continues to grow. Organizations worldwide are increasing their CAI investments in response to this trend and maturing how they leverage Conversational AI to supplement customer service agent interactions to deliver seamless customer experiences across a multitude of channels.
Customers’ expectations have also matured due to the proliferation and ubiquity of conversational interfaces and virtual assistants. They now demand easy, effective interactions that are personal and contextual to their current needs.
To truly elevate mundane conversations, improve engagement, and add value to their customers in 2022 and beyond, organizations need to focus more on the following practices to craft delightful experiences.
From FAQs, to transactional experiences
Conversational interface projects often start with a proof of concept involving launching a virtual assistant that can automate responses to frequently asked questions (FAQs) via chat or voice. Organizations that want to increase customer satisfaction and achieve business goals need to start looking beyond just FAQs to reap the actual benefits of conversational AI.
Today’s advanced Conversational AI systems that utilize natural language understanding (NLU) can automate many complex transactions to make life easier for customers and internal teams. For example, banks could enable bill payments via virtual assistants instead of just navigating customers to a ‘how to pay’ webpage. A food retailer could allow customers to order food using a virtual agent rather than just navigating to a ‘menu’ page on their website. Check out more Use Cases of Conversational AI in the Finance industry to increase customer satisfaction and automate your processes.
From FAQs to embeddable conversational solutions
Building a transactional virtual assistant does not necessarily mean total call center automation with all possible transactions. Organizations need to take a structured approach and use data to prioritize key transactions that are high-volume and high-impact. This will help them deliver more value to their customers and move them closer to meeting their business objectives.
From linear, to flexible and cyclic conversations
Organizations need to be mindful that they are creating experiences for real people who are on the other end of the virtual assistants. Therefore it is paramount to keep customers in mind during the entire process. This shift is profound and places the onus on organizations to deliver a seamless user experience to lessen the user’s cognitive burden.
To continue providing a fluid customer experience, organizations need to anticipate and map out every possible scenario, query, and customer response. They need to design flexible conversations so that customers can converse using their own words in addition to picking from pre-defined menus. They should also be able to change the direction of dialogue or request additional information along the conversation’s path. Lastly, the conversation design needs to be cyclical so customers can pivot and circle back to the conversation as per their preference without starting over. Human to human conversations themselves are not linear and neither should conversational interfaces.
From release and forget, to iterating and tuning
Many organizations that build virtual assistants invest in upfront research and design to understand the customer journey and context. They sometimes, however, drop the ball on iterating and fine-tuning the experience after releasing the virtual assistants to actual customers.
It is crucial for organizations to monitor and evaluate actual conversations to really understand what is working and what isn’t. Reviewing user sessions to investigate errors and determine how to improve the experience should be an integral part of an ongoing sustainment plan. Continuous iteration or ‘bot tuning’ is another critical practice for maintaining a balance of necessary intents and their training data. Tuning could involve various activities like adding, removing, or modifying utterances. Removing intents that don’t add value is just as important as creating new ones.
This results in customer experiences that are as seamless and as simple to navigate as possible. It also increases customer engagement and containment within the conversational experience.
From pre-defined answers, to Natural Language Understanding and Conversation Design
At its core, conversation design aims to mimic human conversations to make digital systems like virtual assistants easy and intuitive to use. The challenge is to make interactions with these systems feel less robotic by understanding the context and purpose of the customer in order to direct them to relevant solutions.
Rule-Based Chatbots vs Conversational AI
Many organizations, however, still employ hard-coded or rule-based pattern matching with small rule-sets for their conversational interfaces. This results in higher abandonment rates, low engagement, and perceived project failures.
Natural Language Understanding (NLU) technologies utilize machine learning and training data that allows them to understand user utterances without the need to manually hard code all the pattern matching logic. NLU platforms also provide hooks into domain-specific knowledge bases and forums.
Download Guide: Conversation Design and How to Approach It.
By integrating and maximizing the power of NLU platforms, organizations can enable virtual assistants to respond to human queries efficiently and effectively, improving customer engagement and providing an overall positive customer service experience.
Natural Language Understanding (NLU) for agents
From disjointed multiple bots, to a seamlessly integrated omnichannel experience
With an increase in the development of virtual agents, some larger organizations are facing a new challenge. Individual departments are creating conversational interfaces with a narrow scope of handling queries related to very specific use-cases or business functions such as HR or IT. As a result, access to and discoverability of the numerous virtual assistants becomes a challenge for users.
When appropriate for their situation, organizations can overcome this challenge with the introduction of a “master virtual assistant”. This assistant can be made responsible for handling a range of tasks for the customer by understanding their intent and routing the request to the use-case-specific virtual agent. For example, a financial institution may have separate chatbots to handle commercial and consumer mortgage use cases and a master chatbot that seamlessly manages the interactions across them.
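The routing idea behind a "master virtual assistant" can be sketched in a few lines. The intent names and handler functions below are hypothetical, and a real system would resolve the intent with an NLU model rather than receive it directly.

```python
# A minimal sketch of a master assistant routing a recognized intent
# to a use-case-specific bot (all names here are illustrative).
def handle_commercial_mortgage(query):
    return f"[commercial bot] handling: {query}"

def handle_consumer_mortgage(query):
    return f"[consumer bot] handling: {query}"

ROUTES = {
    "commercial_mortgage": handle_commercial_mortgage,
    "consumer_mortgage": handle_consumer_mortgage,
}

def master_assistant(intent, query):
    # Fall back to a human agent when no specialized bot matches.
    handler = ROUTES.get(intent)
    if handler is None:
        return "Sorry, let me connect you to a live agent."
    return handler(query)

print(master_assistant("commercial_mortgage", "current rates?"))
```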
Read more about which processes could be automated for HR with the help of AI.
Is your organization ready to level up its conversational AI experience?
Organizations need to remember that launching a virtual assistant isn’t the destination but a journey. It’s essential to keep in mind that success with conversational AI depends on more than just technology. An elegant conversation design based on research and continuous optimization is also crucial to make virtual assistants more intelligent, intuitive, and engaging.
Whether you’re looking to develop the knowledge and capabilities to scale your conversational AI strategy in-house or find a partner to work with — MOC is available to assist.
Take the first step in leveling up your CAI experience
GET IN TOUCH
It’s 2022 — Time to Level up your Conversational AI Experiences was originally published in Chatbots Life on Medium, where people are continuing the conversation by highlighting and responding to this story.
-
Model Performance and Problem Definition when dealing with Unbalanced Data.
In this post, I am going to talk about the different metrics that we can use to measure classifier performance when we are dealing with unbalanced data.
Before defining any metric, let's talk a little about what an unbalanced dataset is and the problems we might face when dealing with this kind of data. In machine learning, when we talk about data balance, we are referring to the number of instances among the different classes in our dataset. There are two cases.
Balanced data
When it comes to the distribution of classes in a dataset, there can be several scenarios depending on the proportion of instances in each class. Let's look at an example using a binary dataset.
Class distribution reference image
The figure above illustrates the feature distribution of two different classes. As can be observed, instances belonging to the red class are more frequent than those of the blue class. The more frequent class is called the majority class, while the class with fewer samples is called the minority class.
The last plot is an example of an unbalanced dataset; as we can see, the class distribution is not the same for all classes. If the data distribution among classes were similar, the dataset would be balanced. The following image shows this case.
In this case, the scatter chart above shows the data distribution of two classes; here, the number of instances for each class is similar.
Problems with unbalanced data
We always want to work with perfect data, that is, balanced datasets. There are several problems when dealing with unbalanced datasets; when it comes to modeling, the most significant might be bias. Since there are more instances belonging to one of the classes, the model will tend to bias its predictions towards the majority class.
Another issue that arises when we are dealing with unbalanced data refers to the metrics we are using to measure model performance. And this is the main problem this post is going to tackle.
Too good to be true
When it comes to unbalanced datasets, we should be really careful about choosing and interpreting metrics. For example, let's say we have a dataset with 90 instances belonging to one class and 10 to the other. If we have a classifier that always outputs the majority class, we will get an accuracy of 90%, which many might think is great, but in reality the model is unable to distinguish between the two classes, which is the main task of any classifier.
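A few lines of Python make this "accuracy paradox" concrete; the data here is synthetic, matching the 90/10 split in the example.

```python
# A degenerate classifier that always predicts the majority class
# scores 90% accuracy on a 90/10 dataset while learning nothing.
labels = ["majority"] * 90 + ["minority"] * 10
predictions = ["majority"] * 100  # always the same output

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(accuracy)  # 0.9
```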
Evaluating classification models. Accuracy, Precision and Recall.
We can use several metrics to measure performance, such as precision and recall. The link above points to a post about these metrics. One of their limitations is that they use a single threshold value, so the information they provide is restricted.
Thresholds and classifiers.
When we look at the output of classifiers, we usually see a discrete output. However, many classifiers are capable of outputting probabilities instead. For example, in scikit-learn this method is usually called predict_proba, and it returns the probabilities of all classes.
By default, the threshold is usually 0.5, and it is tempting to always choose this value. However, thresholds are problem-dependent, and we have to think about what the best option is for each case.
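As a sketch of how the threshold choice changes predictions, here is the thresholding step applied by hand to a list of probabilities; with scikit-learn, these scores would typically come from `clf.predict_proba(X)[:, 1]`.

```python
# Applying different decision thresholds to predicted probabilities
# (the scores below are made up for illustration).
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]

def predict_with_threshold(scores, threshold=0.5):
    return [1 if s >= threshold else 0 for s in scores]

print(predict_with_threshold(scores, 0.5))  # [0, 0, 0, 1, 1, 1]
print(predict_with_threshold(scores, 0.3))  # [0, 1, 1, 1, 1, 1]
```

Lowering the threshold flips borderline cases to the positive class, which raises recall at the cost of precision.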
Using charts for model comparison.
We already talked about the importance of choosing the right threshold for each problem. To do so, it is common to use some charts; here I am going to walk through two of them: the precision-recall (PR) curve and the receiver operating characteristic (ROC) curve.
The ROC curve shows the variation of the true positive and false positive rates for different thresholds. We can use this plot to compare different models. Let's take a look at the following example.
The chart above shows the ROC curve for two different models. The x-axis shows the false positive rate, while the y-axis shows the true positive rate. In this case, we can see that the DecisionTreeClassifier seems to be slightly better than the GradientBoostingClassifier, since its true positive rate is higher. However, both algorithms are quite good, and there is not too much information to extract from this plot.
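Each point on a ROC curve is just a (false positive rate, true positive rate) pair at one threshold. Here is a hand-rolled sketch on a tiny toy dataset; real curves would come from something like `sklearn.metrics.roc_curve`.

```python
# Compute one (FPR, TPR) point of a ROC curve at a given threshold
# (toy labels and scores, for illustration only).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

def roc_point(y_true, scores, threshold):
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, y_true))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, y_true))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, y_true))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, y_true))
    return fp / (fp + tn), tp / (tp + fn)  # (FPR, TPR)

for t in (0.2, 0.5):
    print(t, roc_point(y_true, scores, t))
```

Sweeping the threshold from 1 down to 0 traces the curve from (0, 0) to (1, 1).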
Precision Recall Curve
We can use the PR curve to get much more information about the models at a glance; the following chart shows the PR curve for the aforementioned models.
The PR curve shows the variation of recall and precision through different threshold values. The chart above illustrates the PR curve for the two classifiers. In this case, what we are looking at are the precision and recall values, which allow us to make a better comparison. We can observe that the PR curves remain similar for both classifiers for most thresholds, and just after a recall value of 0.8, both curves suddenly drop to a minimum of 0.825 in precision.
Context matters.
Now, to choose the right model we have to take into consideration the use case, or the problem case. In a problem where it is important to keep the number of false negatives low (high recall), it might be convenient to choose the Decision Tree Classifier rather than the Gradient Boosting Classifier.
Although the last chart does not allow a more realistic comparison, the main idea is to always take into consideration the application of the model. If we want to distinguish between malignant and benign tumors, it will always be more convenient to have high recall (low false negatives), since we do not want to diagnose a malignant tumor as benign. In this particular case, that error can put patients' lives at risk.
Unbalanced data and problem definition
It is important to notice that the last graphic shows precision and recall values for different threshold values, which is excellent for measuring models trained on unbalanced data. Precision and recall can be read directly off the graph, letting us evaluate the false negative and false positive ratios at a glance.
However, there is one important point to take into consideration: defining the positive class. The metrics we went through are based on the definition of a positive class and a negative class. Let's go through this classic example once again. Imagine an experiment with 100 samples, where 90 samples correspond to the class benign and 10 to malignant. There is an imaginary classifier, and in this case let's say that our positive class is the benign tumors. After training the classifier, we get something like this:
TP = 80, TN=0, FP=15, FN=5
Recall = 0.94
Precision = 0.84
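Recomputing the stated metrics directly from these confusion counts confirms them:

```python
# Recall and precision from the confusion counts given above.
tp, tn, fp, fn = 80, 0, 15, 5

recall = tp / (tp + fn)     # 80 / 85
precision = tp / (tp + fp)  # 80 / 95
print(round(recall, 2), round(precision, 2))  # 0.94 0.84
```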
As we mentioned before, for this particular case, in which we need to distinguish between malignant and benign tumors, it is important to get a high recall. However, that is true only because we are interested in identifying malignant tumors.
In the last example, we defined the majority class as the positive class, so it is tricky to use these metrics to measure performance, since we get a high number of true positives purely from the problem definition. The numbers show that the model is incapable of detecting malignant tumors (the negative class), since the true negative ratio is equal to zero. The main takeaway is that clarity in problem definition is paramount.
If you want to keep in contact with me and know more about this type of content, I invite you to follow me on Medium and check my profile on LinkedIn. You also can subscribe to my blog and receive updates every time I create new content.
Model Performance and Problem Definition when dealing with Unbalanced Data. was originally published in Chatbots Life on Medium, where people are continuing the conversation by highlighting and responding to this story.
-
What are Large Language Models?
A look at LLMs and their popularity
Photo by Patrick Tomasso on Unsplash Advances in natural language processing (NLP) have been in the news lately, with special attention paid to large language models (LLMs) like OpenAI’s GPT-3. There have been some bold claims in the media — could models like this soon replace search engines or even master language?
But what exactly are these large language models, and why are they suddenly so popular?
What’s a language model?
As humans, we’re pretty good at reading a passage and knowing where the author might be heading. Of course, we can’t predict exactly what the author will write next — there are far too many options for that — but we notice abrupt changes or out-of-place words, and can make a stab at filling in endings of sentences. We intuitively know that a message saying “I’ll give you a call, how about” is likely to end with “tomorrow” or “Thursday”, and not “yesterday” or “green”.
This task of predicting what might come next is exactly what a language model (LM) does. From some starting text, the language model predicts words that are likely to follow. Do this repeatedly, and the language model can generate longer fragments of text. For all the recent interest, language models have been around for a long time. They’re built (or trained) by analysing a bunch of text documents to figure out which words, and sequences of words, are more likely to occur than others.
One method of building LMs, called n-grams, has been around for a long time. These models are quick and easy to build, so people have trained them on many different kinds of text. Examples include text generated from Shakespeare: “King Henry. What! I will go seek the traitor Gloucester. Exeunt some of the watch. A great banquet serv’d in;” and from Alice in Wonderland: “Alice was going to begin with,’ the mock turtle said with some surprise that the was”. Train one of these models on something else, like articles from the Financial Times, and the model will predict an entirely different style of text.
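The mechanics of an n-gram model can be sketched in a few lines of Python. This is a minimal, illustrative bigram model — the tiny corpus and the function names are invented for the example, and real n-gram models use far larger corpora plus smoothing of the counts:

```python
from collections import defaultdict, Counter

def train_bigram_model(text):
    """Count which word follows which in the training text."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts

def generate(model, start, length=8):
    """Repeatedly append the most likely next word."""
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

corpus = ("the king will go seek the traitor "
          "the king will go to the banquet")
model = train_bigram_model(corpus)
print(generate(model, "the"))
```

Because the model only ever looks at the previous word, the generated text quickly loops and drifts — exactly the shallow coherence described above.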
N-gram models aren’t good at predicting text that’s coherent beyond a few words. There’s no intent or agency behind what they’re saying; they create sequences of words that might seem sensible at first glance, but not when you read them closely. They’re simply regurgitating patterns in the training data, not saying anything new and interesting. These models have mostly been used in applications like autocorrect, machine translation, and speech recognition, to provide some knowledge about likely sequences of words into a bigger task.
The emergence of large language models
There’s always been a drive to use more and more data for training AI models, and LMs are no exception. In the past decade, this has only accelerated. Training a model on more text means the model has potential to learn more and more about the patterns in language. More data is one part of the ‘large’ in ‘large language models’.
The second part of ‘large’ comes from the size of the models themselves. The past 15 years have seen neural networks become a popular choice of model, and they have grown larger and larger in terms of the number of parameters they contain.
GPT-3, for example, has 175 billion parameters and was trained on around 500 billion tokens. Tokens are words, or pieces of words. Most of that text data was scraped from the web, though some comes from books. The combination of lots of data and large models makes LLMs expensive to train, and so only a handful of organisations have been able to do so. However, they’ve been able to better model much longer sequences of words, and the text they generate is more fluent than that generated by earlier LMs. For example, given an initial text prompt to write an article about creativity, GPT-3 generated the following as a continuation:
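To illustrate what “pieces of words” means, here is a toy greedy longest-match subword tokenizer. The vocabulary below is invented purely for the example; real LLM tokenizers, such as GPT-3’s byte-pair encoding, learn their vocabularies from data and differ in the details:

```python
def tokenize(word, vocab):
    """Greedily split a word into the longest matching subword pieces."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest vocabulary entry that matches at position i.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No vocabulary entry matches: emit the character on its own.
            tokens.append(word[i])
            i += 1
    return tokens

vocab = {"token", "ization", "iz", "ation", "er", "s"}
print(tokenize("tokenization", vocab))  # → ['token', 'ization']
print(tokenize("tokenizers", vocab))    # → ['token', 'iz', 'er', 's']
```

Common words end up as single tokens, while rarer words are split into several pieces — which is why the token count of a text is usually larger than its word count.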
The word creativity is used and abused so much that it is beginning to lose its meaning. Every time I hear the word creativity I cannot but think of a quote from the movie, “The night they drove old dixie down”. “Can you tell me where I can find a man who is creative?” “You don’t have to find him, he’s right here.” “Oh, thank god. I thought I was going to have to go all over town.”
This is far more readable and fluent than the earlier examples, but it’s worth noting that “The night they drove old dixie down” is a song, and not a movie, and it has no lyrics or lines about a man who is creative. These facts are hallucinated by the model because the sequences of words are probable. As readers, we naturally try and infer the author’s meaning in this passage, but the computer has no agency — it really wasn’t trying to say anything when it generated the passage.
How do language models relate to other NLP technology?
NLP is a broad field — language modeling is just one NLP task and there are many other things you might want to do with text. Some examples include translating text from one language to another, identifying entities like names and locations in your text, or classifying text by topic.
To build models for these other NLP tasks, you can’t just analyse a bunch of documents as you would for language modeling. Instead, you need labelled data — i.e. text that is labelled with the entities or topics that you’re interested in, or, in the case of machine translation, text that means the same thing in two languages. Labelling data is time-consuming and expensive, and a barrier to building good NLP models.
Why all the hype about LLMs?
The bold claims about large language models are inspired by some of their interesting emergent behaviour.
First is that these models can be used as a type of interactive chatbot. By learning appropriate continuations of a text input, they can generate appropriate responses in a conversation. The current generation of chatbots are hand-crafted systems with carefully designed conversations, and they take a lot of effort to create. LLMs offer the possibility of chatbots that are simpler to build and maintain.
The second is that because these models have been trained on a lot of data, they can generate a huge variety of texts, including some that are unexpected. Give GPT-3 the text input, or prompt, to translate text into another language and it’s seen enough multilingual text to have a good go at the translation. That’s without ever being explicitly trained to do translation! The ability to recast NLP tasks into text generation ones and use LLMs to do them is powerful.
A third ability is that LLMs can be fine-tuned to different NLP tasks. An LLM has learned a lot about language during its training, and that knowledge is useful for all NLP tasks. It’s possible to make some small changes to the structure of the LLM so that it classifies topics rather than predicting the next word, while still retaining most of what it has learned about patterns in language. Then it’s easy to fine-tune on a small amount of topic-labelled data and build a topic classifier. This approach, first building an LLM on a large dataset (or, more realistically, using one that a large company has built and released) and then fine-tuning it on a specific task, is a relatively new way of building NLP models. It makes it possible to build NLP models using far less labelled data than building the same model from scratch, and is cheaper and faster. For this reason, LLMs have been dubbed ‘Foundation Models’.
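The core idea — keep what was learned in pretraining, train only a small task-specific head on a little labelled data — can be sketched with a deliberately toy example. Everything here is a hypothetical stand-in: the “frozen features” are a fixed word list rather than a neural network, and the head is a simple perceptron rather than a fine-tuned transformer layer:

```python
# Stand-in for features learned during pretraining (frozen, never updated).
PRETRAINED_VOCAB = ["stocks", "market", "goal", "match", "bank", "team"]

def features(text):
    """Map text to a fixed vector using the frozen vocabulary."""
    words = text.lower().split()
    return [words.count(w) for w in PRETRAINED_VOCAB]

def train_head(examples, epochs=10, lr=1.0):
    """Train a tiny linear head on top of the frozen features (perceptron)."""
    weights = [0.0] * len(PRETRAINED_VOCAB)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:  # label: 1 = finance, 0 = sport
            x = features(text)
            pred = 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0
            err = label - pred
            weights = [w + lr * err * v for w, v in zip(weights, x)]
            bias += lr * err
    return weights, bias

def classify(text, weights, bias):
    x = features(text)
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0

labelled = [("the market and stocks rallied", 1),
            ("the team scored a late goal", 0),
            ("bank shares lifted the market", 1),
            ("a tense match for the home team", 0)]
weights, bias = train_head(labelled)
print(classify("stocks fell as the bank reported losses", weights, bias))
```

Only the handful of head parameters are trained; the feature extractor is untouched. That division of labour is why a few labelled examples can go a long way once a good pretrained model exists.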
But what are the downsides?
LLMs have some interesting behaviours, and many state-of-the-art NLP models are now based on LLMs. But, as they say, there is no free lunch! There are some downsides to these models that need to be taken into account.
One of the biggest issues is the data these models are trained on. As with the Shakespeare and Alice in Wonderland examples, LLMs generate text in a similar style to the text they were trained on. This is obvious in those two examples because of their distinct styles. Even when LLMs are trained on a wide variety of internet text, the model’s output remains heavily dependent on the training data — it’s just less immediately obvious in the text they generate.
It’s especially problematic when the training data contains opinions and views which are controversial or offensive. There are many examples of LLMs generating offensive text. It’s not feasible to construct a neutral training set (it raises the question, ‘neutral’ according to whose values?). Most text contains its author’s views to a varying extent, or some perspective (bias) about the time and place it was written. Those biases and values inevitably make their way through to the model output.
As in the creativity example above, LLMs can hallucinate facts and generate text which is just wrong. Because of their ease of use and the superficial fluency of the text they generate, they can be used to quickly create large amounts of text containing errors and misinformation.
The impact of these downsides is exacerbated by there being just a handful of LLMs which are fine-tuned and deployed in many different applications, thus reproducing the same issues again and again.
In summary, large language models are large neural networks trained on lots of data. They have the ability to generate text that’s far more fluent and coherent than previous language models, and they can also be used as a strong foundation for other NLP tasks. Yet, as with all machine learning models, they have several downsides that are still being figured out.
What are Large Language Models? was originally published in Chatbots Life on Medium, where people are continuing the conversation by highlighting and responding to this story.