Applications of Deep Learning in Speech Recognition for Kids

Welcome to Lesson 3 in our “Lessons from Our Voice Engine” series, featuring high-level insights from our Engineering and Speech Tech teams on how our voice engine works. This lesson is from Siva Reddy Gangireddy, a Senior Speech Recognition Scientist on our Speech Tech team.

What is deep learning?

To understand deep learning, we need a basic understanding of machine learning.

Machine learning is a group of algorithms that learn from data to make predictions and decisions without being explicitly programmed for each case. It usually involves training a model on large amounts of data so that it learns patterns, which it then uses to make predictions and decisions on new data. For example, the smart speakers we use in daily life rely on machine learning algorithms.
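To make "learning from data" concrete, here is a minimal sketch of a nearest-centroid classifier in plain Python. It is only an illustration of the train-then-predict idea, not how a production voice engine works. The feature names (loudness, pitch), the toy data, and the `train` and `predict` helpers are all hypothetical.

```python
from statistics import mean

def train(examples):
    """Learn one centroid (mean feature vector) per label from labeled data."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }

def predict(model, features):
    """Return the label whose centroid is closest to the new data point."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

# Hypothetical toy data: (loudness, pitch) features with a label for each example.
training_data = [
    ([0.9, 0.8], "speech"),
    ([0.8, 0.7], "speech"),
    ([0.1, 0.2], "silence"),
    ([0.2, 0.1], "silence"),
]

model = train(training_data)          # patterns are learned from the data
print(predict(model, [0.7, 0.9]))     # prediction on a point the model has not seen
```

Nothing in `predict` hard-codes what "speech" looks like. The rule comes entirely from the examples passed to `train`, which is the sense in which no explicit programming is involved.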

Deep learning is a form of machine learning that’s based on neural networks, a set of algorithms designed to mimic the function of the human brain. A network with more than three layers is generally considered a deep neural network. The input is processed through those layers, one after another, to predict the desired output. Deep neural networks require large amounts of data and are used extensively in speech recognition and image recognition. At SoapBox Labs, our models are trained on thousands of hours of audio data and regularly evaluated on in-house datasets.
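The idea of an input passing through several layers can be sketched in a few lines of plain Python. This is an illustrative toy, assuming randomly initialized (untrained) weights, ReLU activations, and a softmax output. The layer sizes and helper names are hypothetical and do not describe SoapBox Labs' models.

```python
import math
import random

def relu(values):
    """Keep positive values, replace negative values with zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def make_layer(n_inputs, n_outputs, rng):
    """Create random (untrained) weights and zero biases for one layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases

def apply_layer(layer, inputs):
    """Weighted sum of the inputs plus a bias, for each unit in the layer."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]

def forward(layers, inputs):
    """Pass the input through every hidden layer, then the output layer."""
    activations = inputs
    for layer in layers[:-1]:
        activations = relu(apply_layer(layer, activations))
    return softmax(apply_layer(layers[-1], activations))

rng = random.Random(0)
sizes = [4, 8, 8, 8, 3]  # 4 input features, three hidden layers, 3 output classes
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

probabilities = forward(layers, [0.2, -0.1, 0.5, 0.3])
print(probabilities)
```

Training would adjust the weights so that the output probabilities match labeled examples. That step is omitted here, and it is where the large amounts of data come in.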

Why is deep learning important, and how is it used, in kids’ speech recognition?

The goal of speech recognition is to convert users’ speech to text. Because audio data varies widely (in pronunciation, accent and background noise, for example), machine learning algorithms are used to keep recognition accurate. Because of its superior performance, especially for understanding kids’ variable speech, deep learning is at the core of SoapBox Labs’ voice engine and of solutions like fluency assessments. We also use deep learning to deliver wake word detection, voice activity detection (VAD), and end-to-end speech recognition that runs on-device.
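For readers unfamiliar with VAD, the sketch below shows what the task looks like: deciding, frame by frame, whether audio contains voice. Note that it swaps in a classic energy-threshold baseline rather than deep learning, purely to illustrate the input and output of the task. The frame size, threshold, synthetic signal, and function names are assumptions for this example.

```python
import math

def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def detect_voice(samples, frame_size=160, threshold=0.01):
    """Flag each full frame as active (True) when its energy exceeds the threshold."""
    flags = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        flags.append(frame_energy(frame) > threshold)
    return flags

# Synthetic signal: silence, then a 220 Hz tone standing in for voice, then silence.
sample_rate = 16000
silence = [0.0] * 320
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(320)]

flags = detect_voice(silence + tone + silence)
print(flags)  # [False, False, True, True, False, False]
```

A fixed energy threshold breaks down quickly with background noise or quiet speakers, which is one reason learned models are preferred for this task.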

Catch up on our previous “Lessons from Our Voice Engine”:

#1: Natural Language Processing (NLP)

#2: Custom Language Models (CLMs)


Applications of Deep Learning in Speech Recognition for Kids was originally published in Chatbots Life on Medium, where people are continuing the conversation by highlighting and responding to this story.

