IBM Watson has developed a system capable of transcribing conversational speech with very low error rates.
The company said that the new system was evaluated on the NIST Switchboard corpus, a scientific benchmark that consists of telephone conversations.
IBM claims that the new system has a word error rate of just 8%, which was achieved with the help of advances in applications of deep learning.
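For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below shows that standard calculation; the function name and sample sentences are illustrative assumptions, and it is not IBM's scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not taken from Switchboard): one substituted
# word out of four reference words.
print(word_error_rate("please call me back", "please call we back"))
```

By this measure, an 8% rate corresponds to roughly eight word errors for every hundred words in the reference transcript.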
The company’s Speech to Text service is capable of converting human voice into written word using machine intelligence.
After a transcription is created, it is sent back to the client and is retroactively updated as more speech comes in.
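The retroactive updating described above can be pictured as a client that keeps one hypothesis per stretch of speech and overwrites it whenever the service sends a revised guess. The sketch below is a minimal illustration under that assumption; the `TranscriptBuffer` class, its methods, and the sample phrases are hypothetical and are not part of IBM's API.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptBuffer:
    """Hypothetical client-side store for a transcript that is revised over time."""

    # Maps a segment index to (text, is_final).
    segments: dict = field(default_factory=dict)

    def update(self, index: int, text: str, final: bool = False) -> None:
        current = self.segments.get(index)
        if current is not None and current[1]:
            # Segment was already finalized; ignore anything that arrives later.
            return
        self.segments[index] = (text, final)

    def text(self) -> str:
        # Join the latest hypothesis for each segment in spoken order.
        return " ".join(self.segments[i][0] for i in sorted(self.segments))


buf = TranscriptBuffer()
buf.update(0, "I scream")                 # early guess
buf.update(0, "ice cream", final=True)    # revised once more audio arrives
buf.update(1, "is cold")
print(buf.text())
```

Keeping the segments keyed by index means a revision replaces the earlier guess in place rather than being appended after it.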
The recognition models can also be trained for different languages and for specific domains.
Developers can build applications using IBM’s cognitive services, which are available to the developer community through the Watson Developer Cloud.
According to Michael Picheny, senior manager of IBM Speech and Language Algorithms, speech recognition technology has come a long way in the past five years, but it is a mistake to consider it a "solved problem," as there is still room for improvement.