Speech Recognition Engineer Armin Saeb brings you the fifth installment of our “Lessons from Our Voice Engine” series, featuring high-level insights from our Engineering and Speech Tech teams on how our voice engine works.
What is an acoustic model (AM)?
Acoustic models (AMs) are key components of any speech recognition engine. An AM describes the statistical properties of sound events and links the acoustic signal to phonemes or other linguistic units. Hidden Markov Models (HMMs) are one of the most common types of AM. Other acoustic models include Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs).
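To make the idea of "connecting acoustic information with phonemes" a little more concrete, here is a minimal sketch of how an HMM can score a sequence of sounds using the standard forward algorithm. Everything in it is an assumption for illustration: the two phoneme-like states, the three quantized sound symbols, and all of the probabilities are made up, and it is not a description of how the SoapBox engine is built. Real acoustic models work on continuous audio features and are far larger.

```python
# Hypothetical toy model: two phoneme-like states and three quantized
# acoustic symbols. All names and numbers are invented for illustration.
STATES = ["s", "iy"]
SYMBOLS = ["hiss", "buzz", "tone"]

# Probability of starting in each state.
START = {"s": 0.8, "iy": 0.2}

# Probability of moving from one state (outer key) to the next (inner key).
TRANS = {
    "s": {"s": 0.6, "iy": 0.4},
    "iy": {"s": 0.1, "iy": 0.9},
}

# Probability that a state produces each acoustic symbol.
EMIT = {
    "s": {"hiss": 0.7, "buzz": 0.2, "tone": 0.1},
    "iy": {"hiss": 0.1, "buzz": 0.2, "tone": 0.7},
}


def forward_likelihood(observations):
    """Return P(observations | model) using the forward algorithm."""
    if not observations:
        return 1.0
    # alpha[state] = probability of the observations so far, ending in state.
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {
            s: EMIT[s][obs] * sum(alpha[prev] * TRANS[prev][s] for prev in STATES)
            for s in STATES
        }
    return sum(alpha.values())


if __name__ == "__main__":
    plausible = forward_likelihood(["hiss", "hiss", "tone"])
    implausible = forward_likelihood(["tone", "hiss", "buzz"])
    print(plausible > implausible)
```

With these made-up numbers, a sequence that starts with "hiss" and ends with "tone" scores higher than one that starts with "tone", because the toy model expects the "s" state first and the "iy" state after it. A recognizer compares scores like these across candidate words to decide which one best explains the audio.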
Why are AMs important for SoapBox?
AMs play an even more critical role at SoapBox than in typical adult-focused voice engines because recognizing children’s speech is much more challenging. Kids have smaller vocal tracts and slower, more variable speech patterns. They are often in noisy environments and use a lot of spontaneous speech, imaginative words, ungrammatical phrases, and incorrect pronunciations!
At SoapBox we work hard to design and train AMs that can cater to all of this complexity and do the heavy lifting of accurately converting children’s speech to text.
Catch up on our previous “Lessons from Our Voice Engine”:
- #1: Natural Language Processing (NLP)
- #2: Custom Language Models (CLMs)
- #3: Deep Learning
- #4: Debiasing
While you’re here, check out our latest insights on voice tech for kids.
What Are Acoustic Models and Why Are They Needed in Speech Recognition for Kids? was originally published in Chatbots Life on Medium.