MIT Technology Review: From Transportation to Health Care, iFlytek's AI Is Used by 500 Million People

On September 14, MIT Technology Review published a report on iFlytek. The English original and its Chinese translation follow:

iFlytek’s voice recognition technology is everywhere in China, and that’s what’s making it smarter every day.

科大訊飛的語音識別技術在中國隨處可見,這也是為什麼這項技術每天都在變得更聰明的原因。

Image: iFlytek's 2016 annual launch event

by Yiting Sun

September 14, 2017

When Gang Xu, a 46-year-old Beijing resident, needs to communicate with his Canadian tenant about rent payments or electricity bills, he opens an app called iFlytek Input on his smartphone, taps an icon that looks like a microphone, and begins talking. The software turns his Chinese verbal messages into English text messages, and sends them to the Canadian tenant. It also translates the tenant’s English text messages into Chinese ones, creating a seamless cycle of bilingual conversation.

當46歲的北京市民徐剛(Gang Xu,音)需要與加拿大的租客就租金或電費賬單溝通時,他會打開智能手機上名為“訊飛輸入法”的應用,點擊看起來像是麥克風的圖標,隨後開始說話。這款軟件將他的中文語音消息轉換為英文文本消息,並發送給加拿大的租客。軟件還可以將租客用英文編寫的文本消息轉換為中文文本消息,從而創造出雙語對話的無縫循環。

In China, over 500 million people use iFlytek Input to overcome obstacles in communication such as the one Xu faces. Some also use it to send text messages through voice commands while driving, or to communicate with a speaker of another Chinese dialect. The app was developed by iFlytek, a Chinese AI company that applies deep learning in a range of fields such as speech recognition, natural-language processing, machine translation, and data mining (see “50 Smartest Companies 2017”).

在中國,有超過5億人使用訊飛輸入法來解決類似徐剛的溝通障礙。某些人在開車時使用語音功能去發送短信,而另一些人則使用該工具與操其他方言的中國人交流。這款應用由中國的人工智能公司科大訊飛開發。該公司將深度學習技術應用於語音識別、自然語言處理、機器翻譯和數據挖掘等領域。

Court systems use its voice-recognition technology to transcribe lengthy proceedings; business call centers use its voice synthesis technology to generate automated replies; and Didi, a popular Chinese ride-hailing app, also uses iFlytek’s technology to broadcast orders to drivers.

法庭使用科大訊飛的語音識別技術來聽寫冗長的訴訟程序;商業呼叫中心使用語音合成技術來生成自動回覆;而受歡迎的中國打車應用滴滴也使用科大訊飛的技術來向司機廣播訂單。

But while some impressive progress in voice recognition and instant translation has enabled Xu to talk with his Canadian tenant, language understanding and translation for machines remains an incredibly challenging task (see “AI’s Language Problem”).

然而,儘管語音識別和即時翻譯等技術已經取得了長足的進步,讓徐剛可以與他的加拿大租客交流,但讓機器去理解和翻譯語言仍是挑戰巨大的任務。

Xu recalls a misunderstanding when he tried to ask his tenant when he would get off work to come sign the lease renewal. But the text message sent by the app was “What time do you go to work today?” In retrospect, he figures that it was probably because of the wording of his question: you’ll work until what time today? “Sometimes, depending on the context, I can’t get my meaning across,” says Xu, who still depends on it for communication.

徐剛回憶,有一次他問租客,什麼時候下班過來續簽租約,這時軟件發生了理解錯誤。當時,該軟件發送的短信是:“你今天什麼時候去上班?”徐剛懷疑,這是由於他的提問方式,因為當時的具體問題是:“你今天要上班到什麼時候?”他表示:“某些時候,取決於具體語境,我無法準確表達自己的意思。”不過,他仍然依靠訊飛輸入法去溝通。

Xu’s story highlights why it’s so important for a company like iFlytek to gather as much data from real-world interactions as possible. The app, which is free, has been collecting that data since it launched in 2010.

徐剛的遭遇也證明,為何對類似科大訊飛這樣的公司來說,儘可能多地從現實世界互動中收集數據如此重要。自2010年發佈以來,這款免費軟件一直在收集這些數據。

iFlytek’s developer platform, called iFlytek Open Platform, provides voice-based AI technologies to over 400,000 developers in various industries such as smart home and mobile Internet. The company has international ambitions, including a subsidiary in the U.S. and an effort to expand into languages other than Chinese. Meanwhile, the company is changing the way many industries such as driving, health care, and education interact with their users in China.

Image: iFlytek campus

科大訊飛的開發者平臺,即訊飛開放平臺,為智能家居和移動互聯網等多個行業的40餘萬開發者提供了基於語音的人工智能功能。同時也在關注國際市場。科大訊飛已經在美國成立了子公司,同時也在拓展除中文以外的其他語言。與此同時,科大訊飛也在改變中國交通、醫療和教育等行業與用戶互動的方式。

In August, iFlytek launched a voice assistant for drivers called Xiaofeiyu (Little Flying Fish). To ensure safe driving, it has no screen and no buttons. Once connected to the Internet and the driver’s smartphone, it can place calls, play music, look for directions, and search for restaurants through voice commands. Unlike voice assistants intended for homes, Xiaofeiyu was designed to recognize voices in a noisy environment.

今年8月,科大訊飛推出了名為“小飛魚”的司機語音助手。為了保障安全駕駛,這款產品沒有提供屏幕和按鈕。在連接互聯網和司機的智能手機之後,設備可以通過語音命令打電話、播放音樂、查找路線,以及搜索餐廳。與瞄準家庭用戶的語音助手不同,小飛魚的設計就是為了在嘈雜環境中識別語音。

Min Chu, the vice president of AISpeech, another Chinese company working on voice-based human-computer interaction technologies, says voice assistants for drivers are in some ways more promising than smart speakers and virtual assistants embedded in smartphones. When the driver’s eyes and hands are occupied, it makes more sense to rely on voice commands. In addition, once drivers become used to getting things done using their voice, the assistant can also become a content provider, recommending entertainment options instead of passively handling requests. This way, a new business model will evolve.

中國另一家專注於人機語音交互技術的公司思必馳(AISpeech)副總裁初敏表示,從某些方面來看,面向司機的語音助手要比智能手機內置的虛擬助手,以及智能音箱更有前景。當司機的雙眼和雙手被佔據時,語音命令會更有意義。此外,一旦司機習慣於用語音來完成操作,那麼這樣的助手就可以成為內容分發渠道,向司機推薦娛樂選擇,而不僅僅是被動地處理請求。這將帶來全新的商業模式。

In the health-care industry, although artificial intelligence has the potential to reduce costs and improve patient outcomes, many hospitals are reluctant to take the plunge for fear of disrupting an already strained system that has few doctors but lots of patients.

在醫療行業,儘管人工智能有望降低成本,改善治療效果,但許多醫院擔心,這將擾亂供需平衡已經緊張的醫療系統,因此並不願意冒險。

At the Anhui Provincial Hospital, which is running a number of trials using AI, voice-based technologies are transforming many aspects of its service. Ten voice assistants in the shape of a robot girl use iFlytek’s technology to greet visitors in the lobby of the outpatient department and offer relief for overworked receptionists. Patients can tell the voice assistant what their symptoms are, and then find out which department can help.

安徽省立醫院正在使用人工智能開展一系列試驗。例如,該醫院利用語音技術去改革服務的多個方面。在門診大廳裡,10名“機器人女孩”使用科大訊飛的技術充當語音助手,給繁忙的問詢處提供幫助。病人可以告訴語音助手,他們的症狀是什麼,而這些助手會指導他們掛什麼科室的號。

Based on the data collected by the hospital since June, the voice assistant directed patients to the right department 84 percent of the time.

基於該醫院自6月份以來收集的數據,語音助手提供指導的準確率已達到84%。

Doctors at the hospital are also using iFlytek to dictate a patient’s vital signs, medications taken, and other bits of information into a mobile app, which then turns everything into written records. The app uses voice print technology as a signature system that cannot be falsified. The app is collecting data that will improve its algorithms over time.

醫院醫生也在使用科大訊飛的技術,通過移動應用去記錄病人的生命體徵、藥物記錄,以及其他信息。這款應用隨後將把所有信息轉換為文字記錄。該應用使用聲紋技術作為簽名系統,因此信息無法被篡改。與此同時,該應用不斷收集數據,優化算法。

Although voice-based AI techniques are becoming more useful in different scenarios, one fundamental challenge remains: machines do not understand the answers they generate, says Xiaojun Wan, a professor at Peking University who does research in natural-language processing. The AI responds to voice queries by searching for a relevant answer in the vast amount of data it was fed, but it has no real understanding of what it says.

研究自然語言處理的北京大學教授萬小軍指出,儘管基於語音的人工智能技術在多種場景中都越來越有用,但基礎性挑戰依然存在:機器無法理解它們產生的答案。人工智能對語音命令的響應是通過在海量數據中搜索具有相關性的答案,實際上並不真正理解自己所說的內容。

In other words, the natural-language processing technology that powers today’s voice assistants is based on a set of rigid rules, resulting in the kind of misunderstanding Xu went through.

換句話說,當前語音助手使用的自然語言處理技術基於一套嚴格的規則,這也導致了徐剛遭遇的問題。

Changing the way machines process language will help companies create voice-based AI devices that will become an integral part of our daily life. “Whoever makes a breakthrough in natural-language processing will enjoy an edge in the market,” says Chu.

如果可以對機器處理語言的方式進行變革,那麼企業就可以開發出能深入人們日常生活的語音智能設備。初敏表示:“任何人如果能在自然語言處理領域取得突破,都可以在市場上獲得優勢。”

(Translation: Chen Hua)
