
When Gang Xu, a 46-year-old Beijing resident, needs to contact his Canadian tenant about rent payments or electricity bills, he opens an app called iFlytek Input on his smartphone, taps the microphone icon, and starts talking. The software converts his spoken Chinese into English text messages and sends them to the tenant. It also translates the tenant's English text messages into Chinese, making two-way communication smooth.
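In rough outline, the app chains two machine-learning steps, speech recognition followed by machine translation, before handing the result to a messaging channel. The Python sketch below is purely illustrative: the function names, signatures, and dummy outputs are hypothetical stand-ins, not iFlytek's actual API, which the article does not describe.

```python
# Purely illustrative pipeline; transcribe_speech, translate_text, and
# send_message are hypothetical placeholders, not iFlytek's real API.

def transcribe_speech(audio: bytes, language: str = "zh-CN") -> str:
    """Stand-in for a speech-recognition service (audio in, Chinese text out)."""
    return "房租下周一到期"  # dummy result so the sketch runs end to end

def translate_text(text: str, source: str = "zh", target: str = "en") -> str:
    """Stand-in for a machine-translation service."""
    return "The rent is due next Monday"  # dummy result

def send_message(recipient: str, body: str) -> None:
    """Stand-in for delivering the translated message as a text."""
    print(f"To {recipient}: {body}")

def voice_to_foreign_text(audio: bytes, recipient: str) -> None:
    """The flow described above: speak, transcribe, translate, send."""
    chinese_text = transcribe_speech(audio)      # speech recognition
    english_text = translate_text(chinese_text)  # machine translation
    send_message(recipient, english_text)        # outgoing text message

voice_to_foreign_text(b"<recorded audio>", "tenant")
```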
More than 500 million people in China use iFlytek Input to overcome communication barriers. Some use it to dictate text messages by voice while driving; others use it to talk with speakers of other Chinese dialects. The app was developed by iFlytek, a Chinese AI company that applies deep learning to fields such as speech recognition, natural language processing, machine translation, and data mining (see "50 Smartest Companies 2017").
Courts use the company's speech recognition technology to transcribe lengthy legal proceedings; business call centers use its speech synthesis technology to generate automated responses; and Didi, a popular Chinese ride-hailing app, uses iFlytek technology to broadcast ride orders to drivers.
But while impressive advances in speech recognition and instant translation let Xu communicate with his Canadian tenant, language understanding and translation remain extraordinarily difficult tasks for machines (see "AI's Language Problem").
Xu recalls a misunderstanding that arose when he tried to ask his tenant what time he would get home from work so they could sign a lease renewal. The text message the app sent instead read, "When do you go to work today?" Looking back, he believes the problem was probably the wording of his question: "Until what time will you be working today?" "Sometimes, depending on the context, the meaning doesn't come across," says Xu, who nonetheless still relies on the app to communicate.
Xu's experience underscores why it is so important for a company like iFlytek to collect as much data as possible from real-world interactions. The free app has been gathering such data since its launch in 2010.
The company's developer platform, the iFlytek Open Platform, provides voice-based AI technology to more than 400,000 developers in industries such as smart homes and the mobile internet. iFlytek is valued at roughly 80 billion yuan ($12 billion) and has international ambitions, including a subsidiary in the United States and plans to expand to other languages. Meanwhile, in China it is changing how industries such as driving, healthcare, and education interact with their users.

In August, iFlytek launched a voice assistant for drivers called Xiaofeiyu (Little Flying Fish). To keep drivers' attention on the road, it has no screen and no buttons. Once connected to the internet and the driver's smartphone, it can place calls, play music, and search for routes and restaurants via voice commands. Unlike voice assistants designed for the home, Xiaofeiyu is built to recognize speech in noisy environments.
Min Chu, vice president of AISpeech, another Chinese company working on voice-based human-computer interaction, says in-car voice assistants are in some ways more promising than smart speakers or the virtual assistants built into smartphones. When a driver's eyes and hands are occupied, relying on voice commands makes sense. And once drivers grow used to doing things by voice, the assistant can become a content provider, recommending entertainment options rather than passively executing requests, which could open up a new business model.
Although artificial intelligence has the potential to reduce costs and improve patient outcomes in healthcare, many hospitals are reluctant to take decisive steps for fear of disrupting an already strained system with too few doctors and too many patients.
At Anhui Provincial Hospital, which is piloting AI, voice technology is transforming many aspects of care. Ten voice-enabled robots powered by iFlytek technology greet visitors in the lobby of the outpatient department and take some of the load off the overstretched registration staff. Patients can tell a voice assistant their symptoms and find out which department can help them.
According to data the hospital has collected since June, the assistant has directed patients to the correct department in 84 percent of cases.
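The triage step amounts to mapping a free-form symptom description to a department. As a purely illustrative toy, the Python sketch below does this with hand-written keyword rules; the hospital's real system is not described in detail here and presumably relies on models trained on far more data.

```python
# Toy illustration of symptom-to-department routing; the department names and
# keywords are invented for this example, not taken from the hospital's system.

DEPARTMENT_KEYWORDS = {
    "cardiology":       ["chest pain", "palpitations", "shortness of breath"],
    "gastroenterology": ["stomach ache", "nausea", "heartburn"],
    "orthopedics":      ["back pain", "joint pain", "sprain"],
    "dermatology":      ["rash", "itching", "acne"],
}

def route_patient(symptom_description: str) -> str:
    """Return the department whose keywords best match the description."""
    text = symptom_description.lower()
    scores = {
        dept: sum(keyword in text for keyword in keywords)
        for dept, keywords in DEPARTMENT_KEYWORDS.items()
    }
    best_dept, best_score = max(scores.items(), key=lambda kv: kv[1])
    return best_dept if best_score > 0 else "general medicine"

print(route_patient("I have chest pain and shortness of breath"))  # cardiology
```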
Doctors at the hospital also use iFlytek technology to dictate patients' vital signs, medications, and other information into a mobile app, which converts the speech into written records. The app uses voiceprint technology as a signature system that is difficult to falsify, and the data it gathers will eventually be used to improve its algorithms.
Although voice AI is becoming more useful in a range of scenarios, one fundamental problem remains: machines do not understand the answers they generate, says Xiaoyun Wang, a professor at Peking University who researches natural language processing. The AI responds to voice queries by searching a huge amount of data for an appropriate response, but it has no real understanding of what it says.
In other words, the natural language processing technology behind today's voice assistants relies on a rigid set of rules, which is what led to the kind of misunderstanding Xu experienced.
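One way to picture this limitation: many assistants select a reply by matching the user's words against previously stored question-and-answer pairs rather than by interpreting intent. The Python toy below, with an invented two-entry response bank, shows how a paraphrase can pull up the wrong reply purely because of surface word overlap, much like the mix-up Xu ran into.

```python
# Minimal sketch of retrieval-style response selection: pick the stored reply
# whose question shares the most words with the user's query. Nothing here
# "understands" the query; the response bank below is invented for illustration.

RESPONSE_BANK = {
    "when do you go to work today": "I leave for the office at 9 am.",
    "when will you get home from work today": "I should be home around 7 pm.",
}

def pick_response(query: str) -> str:
    query_words = set(query.lower().split())

    def overlap(stored_question: str) -> int:
        return len(query_words & set(stored_question.split()))

    best_question = max(RESPONSE_BANK, key=overlap)
    return RESPONSE_BANK[best_question]

# A question about finishing work shares more surface words with the
# "go to work" entry, so the wrong reply comes back.
print(pick_response("what time do you finish work today"))
# -> "I leave for the office at 9 am."
```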
Changing the way machines handle language would help companies build voice-AI devices that become an integral part of daily life. "Whoever makes a breakthrough in natural language processing will have an advantage in the market," says Chu.

