Category: Chat

  • What is Automatic Speech Recognition?

    Speech recognition is concerned with having computers recognize human speech and translate it into text. It is also referred to as speech-to-text, as it converts human speech into a text-based format. The field is a combination of:

    • Linguistics
    • Computer Science
    • Electrical Engineering, etc.

    This technology gives machines the ability to understand the human voice and put it to use in forms like:

    • Converting speech to text
    • Supplying speech as commands to execute a process
    • Identifying a user from saved voice segments, etc.

    ASR (automatic speech recognition) combined with IVR (interactive voice response) can enable users to speak responses instead of typing them or pressing buttons on their phones.

    How does Automatic Speech Recognition work?

    Speech recognition systems are divided into two main categories:

    • Speaker-dependent
    • Speaker-independent

    Speaker-dependent systems are structured in such a way that they need to be trained, a step sometimes referred to as enrollment. The way it works is pretty basic: the speaker reads a text or a series of isolated vocabulary words into the system, which then processes the recordings and associates them with its text libraries. Systems that do not rely on vocal training are called speaker-independent systems.

    The basic sequence of events in automatic speech recognition software goes as follows:

    • As you speak, your voice is recorded by the software via an audio feed.
    • It then creates a file of the words you spoke into the device.
    • It then cleans the file, removing unwanted background noise and normalizing the volume.
    • The audio is broken down into phonemes, the basic building-block sounds of language and words.
    • The ASR software then uses statistical models to analyze the phonemes and deduce words and complete sentences.
    • Once this process is complete, the ASR system can understand the whole conversation and respond to you in a meaningful manner (a toy sketch of the decoding step follows below).
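
    To make the last two steps concrete, here is a toy Python sketch of decoding phonemes into words. The phoneme labels, lexicon, and word probabilities are all invented for illustration; real ASR systems use trained acoustic and language models rather than a hand-written table.

    ```python
    # Toy decoder: match runs of phonemes against a lexicon and, where several
    # words fit the same sounds, prefer the statistically more likely one.
    # All phoneme labels, words, and probabilities below are made up.

    LEXICON = {                                       # phoneme run -> candidates
        ("HH", "AH", "L", "OW"): ["hello"],
        ("W", "ER", "L", "D"): ["world", "whirled"],  # homophones compete
    }
    WORD_PROB = {"hello": 0.6, "world": 0.3, "whirled": 0.001}

    def decode(phonemes):
        """Greedily consume the longest phoneme run the lexicon recognizes."""
        words, i = [], 0
        while i < len(phonemes):
            for length in range(len(phonemes) - i, 0, -1):
                chunk = tuple(phonemes[i:i + length])
                if chunk in LEXICON:
                    words.append(max(LEXICON[chunk], key=WORD_PROB.get))
                    i += length
                    break
            else:
                i += 1  # no match: skip the phoneme (e.g. residual noise)
        return " ".join(words)

    print(decode(["HH", "AH", "L", "OW", "W", "ER", "L", "D"]))  # hello world
    ```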

    What are the two main types of Automatic Speech Recognition software?

    The two main types of ASR software are:

    • Directed Dialogue
    • Natural Language Conversations

    Directed Dialogue:

    Directed dialogue conversations are the simpler version of automatic speech recognition. The machine interface walks through a series of yes/no-type questions with an extremely limited set of valid responses.

    It can be found in automated telephone banking and other common customer service interfaces.
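
    As a rough illustration, a directed-dialogue flow can be modeled as a small state machine in which every prompt accepts only a fixed set of answers. The states, prompts, and scripted caller answers below are invented, not taken from any real banking system.

    ```python
    # A toy directed-dialogue flow: each state offers a prompt plus the only
    # answers it will accept, and anything off-menu is rejected.
    FLOW = {
        "start": ("Do you want to check your balance? (yes/no)",
                  {"yes": "balance", "no": "goodbye"}),
        "balance": ("Checking or savings? (checking/savings)",
                    {"checking": "goodbye", "savings": "goodbye"}),
        "goodbye": ("Thank you for calling.", {}),
    }

    def run(answers):
        """Walk the flow with a scripted list of caller answers."""
        state = "start"
        for answer in answers:
            prompt, transitions = FLOW[state]
            print(prompt, "->", answer)
            if answer not in transitions:   # limited responses: reject the rest
                print("Sorry, please choose one of the listed options.")
                continue
            state = transitions[answer]
        print(FLOW[state][0])

    run(["yes", "checking"])
    ```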

    Natural Language Conversations:

    Natural language conversations are the more complex and capable version of ASR. Instead of limiting the words you can use, the system tries to simulate an actual conversation, allowing you to speak to it in an open-ended way. You can see this approach in popular virtual assistants like:

    • Alexa
    • Google Assistant
    • Siri
    • Microsoft Cortana
    • Bixby, etc.

    How does NLP (Natural Language Processing) work?

    NLP is much more important than directed dialogue for future developments in ASR technology, because it works by simulating human conversation.

    The natural language processing software in an ASR system can hold a vocabulary of more than 60,000 words, giving it over 200 trillion possible word combinations (roughly the number of possible three-word sequences: 60,000³ ≈ 2.16 × 10¹⁴).

    This huge number of potential combinations makes it impractical for an NLP-based automated speech recognition system to scan its whole vocabulary and process each word individually. Instead, natural language systems are programmed to react to selected, tagged keywords that give context to longer requests.

    Contextual clues help the system quickly narrow down exactly which words you are saying, so it can find the right response.

    A good example: if you ask "what time is it?" or "what day is it?", the NLP system focuses on the keywords "time" and "day" to find the right response.
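
    Here is a minimal sketch of this keyword-driven approach, assuming a hand-written keyword table. Real systems use trained models; the keywords and responses below are invented for illustration.

    ```python
    # Respond by scanning for tagged keywords instead of parsing every word.
    from datetime import datetime

    KEYWORD_RESPONSES = {
        "time": lambda: datetime.now().strftime("It is %H:%M."),
        "day":  lambda: datetime.now().strftime("Today is %A."),
    }

    def respond(utterance):
        """Return the handler output for the first known keyword found."""
        for word in utterance.lower().rstrip("?.!").split():
            if word in KEYWORD_RESPONSES:
                return KEYWORD_RESPONSES[word]()
        return "Sorry, I did not understand that."

    print(respond("What time is it?"))  # keys off the keyword "time"
    print(respond("What day is it?"))   # keys off the keyword "day"
    ```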

    What is Human Tuning?

    NLP works through two main mechanisms: human tuning and active learning. Human tuning is the simpler of the two. It involves manually adding commonly used phrases that the system has heard in conversation but that were not initially in its vocabulary: conversation logs are reviewed and the missing phrases are added to the ASR software. This is done to expand the system's comprehension of speech so that it can keep answering new questions.

    What is Active Learning?

    Active learning is the second, more sophisticated mechanism, and is usually used with NLP versions of speech recognition technology. Unlike human tuning, active learning keeps adopting new words continuously: the system is programmed to keep learning from its previous conversations and to keep expanding its vocabulary on its own.

    Such software can also pick up individual speech habits and communicate more effectively. In practice this means it learns a user's behavior and provides a personalized experience based on their preferences.
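
    To make the contrast between the two mechanisms concrete, here is a hedged sketch that treats the vocabulary as a simple set of phrases. The function names, phrases, and frequency threshold are invented; production systems are far more involved.

    ```python
    from collections import Counter

    vocabulary = {"check my balance", "transfer money"}  # toy starting vocabulary

    def human_tuning(approved_phrases):
        """Human tuning: a person reviews conversation logs and manually
        approves which previously unknown phrases get added."""
        vocabulary.update(approved_phrases)

    def active_learning(conversation_logs, min_occurrences=3):
        """Active learning: adopt phrases the system keeps hearing,
        with no human in the loop."""
        counts = Counter(phrase
                         for log in conversation_logs
                         for phrase in log
                         if phrase not in vocabulary)
        vocabulary.update(p for p, n in counts.items() if n >= min_occurrences)

    human_tuning({"show my statement"})        # added after manual review
    active_learning([["freeze my card"]] * 3)  # adopted automatically
    print(sorted(vocabulary))
    ```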

    What is Automatic Speech Recognition? was originally published in Chatbots Life on Medium.

  • Is it possible to make a chrome extension chatbot?

    Is it possible to take the URL of the page the extension is opened on and feed it into the chatbot?

    submitted by /u/largomouth3

  • Resources for creating chatbot

    Hi, I'm looking for up-to-date resources (books, videos, articles) on how to create an open-domain chatbot.

    submitted by /u/TonyGodmann

  • How to maintain chatbot regression tests with minimum effort

    One of the biggest and most hated challenges in software development is writing test cases and maintaining them. Chatbot development is no different. At Botium we don't write regression tests, we generate them. This article shows you how we do this with the least possible effort.

    To reach the best coverage you have to define all possible conversations in your conversation model. Implementing and maintaining them manually is time-consuming and often boring, not to mention the human mistakes that can easily slip in even with a pretty simple chatbot. We have implemented a Crawler tool that helps you do it in a very simple way.

    Botium Crawler concept

    The Crawler detects buttons and quick replies and builds conversations along them. The conversation tree is traversed with a slightly customized depth-first algorithm. Each time the Crawler reaches an open-ended question (meaning no button or quick reply is found), the conversation ends, the path is marked as visited, and a new conversation is started from the beginning, with a 'conversation start message', so that the context of each conversation stays intact. (The Crawler starts its conversations with the messages defined in the 'conversation start messages' parameter.) When all paths in the conversation tree have been visited, the session ends and you get all the possible conversations as the result, giving you a full regression test. Let's see how it works in practice in Botium Box.
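
    As an illustration of the idea (not Botium's actual implementation), here is a minimal Python sketch: a depth-first walk over a toy button tree that restarts from the start message for every path, and ends a conversation whenever no buttons or quick replies are found. The tree and all names are invented.

    ```python
    # Toy conversation tree: each message maps to the buttons the bot offers
    # in reply; an empty list stands for an open-ended question.
    TREE = {
        "START": ["Accounts", "Cards"],
        "Accounts": ["Balance", "Statement"],
        "Cards": [],                  # open-ended: the path ends here
        "Balance": [], "Statement": [],
    }

    def crawl(start_message):
        """Depth-first walk; each recorded path conceptually replays the
        conversation from the start message so its context stays clean."""
        conversations, stack = [], [[start_message]]
        while stack:
            path = stack.pop()
            buttons = TREE[path[-1]]
            if not buttons:                    # open-ended question reached
                conversations.append(path)     # one full test conversation
            for button in buttons:
                stack.append(path + [button])  # extend the path per button
        return conversations

    for convo in crawl("START"):
        print(" -> ".join(convo))
    ```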

    Register a Crawler project

    For better understanding I use a very simple banking chatbot example that mixes buttons and open-ended questions.

    With the quick start you can define a Crawler project in three simple steps. First, choose an existing chatbot or register a new one. Second, configure some basic settings of the conversation crawler. Third, save the Crawler project, or start the first Crawler session immediately.

    After finishing the registration you will be redirected to the dashboard of your Crawler project. Here you can see previous Crawler sessions and the current execution settings.

    Crawler session result

    During a Crawler session, one parallel process is started for each 'conversation start message' defined in the execution settings. These processes detect all possible conversations along buttons and quick replies, as described in the Crawler concept above.

    The example banking chatbot has bot-initiated conversations. The Crawler can detect buttons and quick replies in the welcome messages as well, so in this case we can leave the 'conversation start messages' field empty.

    And here is the biggest value of the Crawler: the generated conversations. This chatbot is pretty small and simple, so in this case only five conversations are generated. For a more complex chatbot, hundreds of test cases can be found, which would be an enormous amount of work to create manually.

    The other feature, just as useful as the generated conversations, is the flowchart, which shows the whole detected conversation tree in visual form and gives you a clear big picture of your chatbot.

    Open-ended questions

    As you can see in the previous section, at the bottom of the flowchart there are open-ended questions like 'Which date would be best for you? We need 24 hours …'. At these points the conversation stops from the Crawler's point of view, although with human interaction it could continue. We have a solution for this problem as well.

    For open-ended questions you can define multiple user answers. In the next Crawler session these responses are treated as if they were buttons. After adding a few user responses at the ends of the unfinished conversations, the flowchart became much bigger and the number of generated conversations doubled.

    How to use the generated conversations

    As you can see, with a few minutes of easy work we generated ten conversations for this bot. With a fully button-based bot there is even less to do: just press the start button and wait a few minutes.

    But what can we do with these conversations? They are, in effect, test scripts. By clicking 'Copy Test Scripts into Test Set' you can copy them into a new or an existing test set.

    A test set is a collection of test cases that can be added to a test project. At this point a regression test with pretty good coverage is ready for this bot.

    Crawler configurations

    • Conversation start messages
      The messages the Crawler starts its conversations with. One parallel job is started for each start message.
    • Maximum conversation steps
      The depth of the conversation in the conversation flow (one step is a user-bot message pair). When the configured depth is reached, the conversation is stopped and marked as successfully ended.
    • Number of welcome messages
      Some chatbots initiate the conversation without user interaction. In this case you have to specify how many welcome messages the bot will send.
      If the bot has welcome messages and you don't specify any start message, the Crawler tries to find quick replies and buttons in the welcome messages and starts the conversations along them.
    • Wait for prompt
      Many chatbots answer with multiple bot messages. Here you can define how long the Crawler waits for further bot messages after each simulated user message.
    • Exit criteria
      With a complex chatbot we often want to test only a certain part of the conversation tree. You can define exit criteria to exclude parts of the tree.
      If the text of any quick reply or button matches any of the exit criteria, that conversation is stopped there and marked as successfully ended.
    • Merge utterances
      All text messages are saved as utterances. The Crawler recognizes non-unique utterances and merges them into one. (A hypothetical settings sketch follows below.)
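
    For illustration only, the options above could be collected into a settings object like the following Python sketch. The key names and values are invented and are not Botium Box's actual configuration syntax; see the Botium documentation for the real format.

    ```python
    # Hypothetical crawler settings mirroring the options described above.
    crawler_settings = {
        "conversation_start_messages": ["Hi", "Help"],  # two parallel jobs
        "max_conversation_steps": 10,        # one step = a user-bot message pair
        "number_of_welcome_messages": 1,     # for bot-initiated conversations
        "wait_for_prompt_ms": 2000,          # wait for multi-message bot replies
        "exit_criteria": ["Talk to an agent"],  # prune matching branches
        "merge_utterances": True,            # de-duplicate identical utterances
    }
    ```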

    Conclusion

    Botium Crawler is a very powerful tool for creating regression tests. It can generate all happy-path test cases without user interaction for a fully button-based chatbot, and with minimal user interaction for a partially button-based one. With the flowchart you can get an overview of your chatbot's conversation tree and detect cycles.
    It's a brand-new tool, so there is still a lot of room for improvement, and we already have many ideas. For example, we would like to introduce different tree-traversal algorithms for better performance, so you will be able to choose the algorithm that best fits your chatbot. You will also be able to use regular expressions as exit criteria, we plan to make the open-ended question feature handier, and so on.
    Without proper tools you will be lost. The Crawler feature is part of our flagship product, Botium Box, which helps you on your path to successful chatbot testing.

    How to maintain chatbot regression tests with minimum effort was originally published in Chatbots Life on Medium.

  • How technical are APIs?

    The purpose of this article is not to start a debate on which parts of a highly scalable SaaS-based platform require technological prowess to understand and which are simply logical components. We intend to help you fit a complex system into a logical block in a way that a layperson can understand.

    API means Talking

    Let's take an example where you are talking to your friend. If the two of you speak the same language, it's easy for one person to understand what the other is saying.

    But imagine having the same discussion with a person who doesn't speak the language. It seems tricky, right?

    Replace the people in this example with systems. APIs can be thought of as the communication medium between systems. Let's take an example. You have two systems; System A is Engati, and System B is a ticketing system. Say you'd like to create a ticket in System B through Engati. To achieve this, both systems need to communicate in one language; they need to follow one API standard.

    Now you must be wondering whether there are different languages for communicating between these systems. Of course there are. Just as languages like English, French, Italian, and Hindi exist, there are multiple ways for systems to communicate, like HTTP REST, SOAP, GraphQL, and sockets. And just as English is the language most people use to communicate globally, systems most often support REST API-based communication (then again, there is always a group of people who don't like to speak English)!

    What is a REST API?

    Just as every language has grammar, syntax, and dos and don'ts, communication between systems has its own set of rules. Let's look at the grammar equivalents for a REST API.

    Going back to our example of Engati attempting to create a ticket in the ticketing system: that is a create operation, so Engati has to invoke a REST API in the ticketing system. To invoke the API, you need to know the grammar equivalents, i.e., the URL, the parameters, the request type (which for a create operation should be POST), and the request body.

    Logically, the flow looks like this: Engati sends a POST request with the ticket details in the body to the ticketing system's URL, and receives a response describing the created ticket.
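
    For illustration, here is a hedged Python sketch of such a create call using only the standard library. The endpoint URL and body fields are invented placeholders, not Engati's or any real ticketing system's API; the actual request and response formats come from the target system's documentation.

    ```python
    # Sketch of a create-ticket call: POST a JSON body to the system's URL.
    # The URL and field names below are hypothetical placeholders.
    import json
    import urllib.request

    url = "https://ticketing.example.com/api/tickets"         # hypothetical endpoint
    body = {"subject": "Password reset", "priority": "high"}  # hypothetical fields

    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),   # the request body
        headers={"Content-Type": "application/json"},
        method="POST",                           # a create operation uses POST
    )
    with urllib.request.urlopen(request) as response:
        print(response.status, response.read().decode("utf-8"))
    ```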

    Just as you use a dictionary to understand the words of a language, we need to look at a system's documentation to understand its request and response formats. Exactly what is sent and what is received is specific to each system we talk to.

    The next time you are looking for an integration between two systems, remember that under all the technical layers, APIs are just two people (systems) talking to each other.

    Engati's extensive documentation can help you navigate its APIs and build rich functionality around the platform.

    This article was originally published in Engati blogs.

    How technical are APIs? was originally published in Chatbots Life on Medium.