Korean voice assistant highlights tech’s insatiable hunger for language data
Thursday, October 18, 2018
Posted by: Dana Walker
Language data is big business. The sub-industry that supplies training corpora for language
technologies, ranging from natural language processing to machine translation, is enjoying a resurgence thanks to AI.
Nearly every language-related, AI-powered technology is driving demand, from speech recognition, sentiment analysis, question answering, and summarization to, of course, neural machine translation (NMT). Language data has always been necessary for technologies such as statistical MT, but NMT and other neural network-based solutions are even more data-hungry. What’s more, these technologies require high-quality, domain-specific language data to produce equally high-quality output.