Teaching Machines to Understand Us
By Tom Simonite
MIT Technology Review, Vol. 118, No. 5, 2015

 



A reincarnation of one of the oldest ideas in artificial intelligence could finally make it possible to truly converse with our computers. And Facebook has a chance to make it happen first.


The first time Yann LeCun revolutionized artificial intelligence, it was a false dawn. It was 1995, and for almost a decade, the young Frenchman had been dedicated to what many computer scientists considered a bad idea: that crudely mimicking certain features of the brain was the best way to bring about intelligent machines. But LeCun had shown that this approach could produce something strikingly smart and useful. Working at Bell Labs, he made software that roughly simulated neurons and learned to read handwritten text by looking at many different examples. Bell Labs’ corporate parent, AT&T, used it to sell the first machines capable of reading the handwriting on checks and written forms. To LeCun and a few fellow believers in artificial neural networks, it seemed to mark the beginning of an era in which machines could learn many other skills previously limited to humans. It wasn’t.


“This whole project kind of disappeared on the day of its biggest success,” says LeCun. On the same day he celebrated the launch of bank machines that could read thousands of checks per hour, AT&T announced it was splitting into three companies dedicated to different markets in communications and computing. LeCun became head of research at a slimmer AT&T and was directed to work on other things; in 2002 he would leave AT&T, soon to become a professor at New York University. Meanwhile, researchers elsewhere found that they could not apply LeCun’s breakthrough to other computing problems. The brain-inspired approach to AI went back to being a fringe interest.


LeCun, now a stocky 55-year-old with a ready smile and a sideways sweep of dark hair touched with gray, never stopped pursuing that fringe interest. And remarkably, the rest of the world has come around. The ideas that he and a few others nurtured in the face of over two decades of apathy and sometimes outright rejection have in the past few years produced striking results in areas like face and speech recognition. Deep learning, as the field is now known, has become a new battleground between Google and other leading technology companies that are racing to use it in consumer services. One such company is Facebook, which hired LeCun from NYU in December 2013 and put him in charge of a new artificial-intelligence research group, FAIR, that today has 50 researchers but will grow to 100. LeCun’s lab is Facebook’s first significant investment in fundamental research, and it could be crucial to the company’s attempts to become more than just a virtual social venue. It might also reshape our expectations of what machines can do.


Deep Learning’s Leaders

| No. | Name | Working in | Intro |
| --- | --- | --- | --- |
| 1 | Geoff Hinton | Google & University of Toronto | Did his PhD on artificial neural networks in the 1970s. Showed how to train larger, “deep” neural networks on large data sets in the 2000s, and proved their power for speech and image recognition. |
| 2 | Yann LeCun | Facebook | Got interested in neural networks as an undergraduate, and later pioneered the use of deep learning for image recognition. Now leads a group at Facebook trying to create software that understands text and can hold conversations. |
| 3 | Yoshua Bengio | IBM & University of Montreal | Started working on artificial neural networks after meeting LeCun at Bell Labs in the 1980s. Was one of the first to apply the technique to understanding words and language. Now working with IBM to improve its Watson software. |
| 4 | Andrew Ng | Baidu | Led a project at Google that worked out how neural networks could be trained on millions of pieces of data, allowing greater accuracy. Now oversees research at Baidu, which is working on improved speech recognition. |
| 5 | Demis Hassabis | DeepMind | Worked on AI in the games industry, then researched neuroscience to get ideas about building intelligence. He founded DeepMind, which Google bought last year and runs as a quasi-independent unit. |


 

Facebook and other companies, including Google, IBM, and Microsoft, have moved quickly to get into this area in the past few years because deep learning is far better than previous AI techniques at getting computers to pick up skills that challenge machines, like understanding photos. Those more established techniques require human experts to laboriously program certain abilities, such as how to detect lines and corners in images. Deep-learning software figures out how to make sense of data for itself, without any such programming. Some systems can now recognize images or faces about as accurately as humans.
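To make the contrast concrete, here is a minimal sketch of the kind of hand-programmed rule the older techniques relied on. It is not taken from any system described in this article: the function names, the tiny test image, and the threshold of 128 are all assumptions chosen for illustration. A person decides in advance that an "edge" is a large brightness jump between neighboring pixels and writes that rule down explicitly.

```python
def horizontal_gradient(image):
    """Return |right - left| for each pair of neighboring pixels in every row."""
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]


def find_edges(image, threshold=128):
    """Mark positions where brightness jumps by at least `threshold`.

    The threshold is a hand-picked value (an assumption for this sketch),
    which is exactly the kind of choice a human expert had to make.
    """
    return [
        [1 if g >= threshold else 0 for g in row]
        for row in horizontal_gradient(image)
    ]


# A tiny 3x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
edges = find_edges(image)  # each row marks the single dark-to-bright boundary
```

Every detail here (what counts as an edge, which direction to look in, how big a jump matters) was fixed by the programmer. A deep-learning system would instead be expected to arrive at comparable detectors on its own from example images.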


Now LeCun is aiming for something much more powerful. He wants to deliver software with the language skills and common sense needed for basic conversation. Instead of having to communicate with machines by clicking buttons or entering carefully chosen search terms, we could just tell them what we want as if we were talking to another person. “Our relationship with the digital world will completely change due to intelligent agents you can interact with,” he predicts. He thinks deep learning can produce software that understands our sentences and can respond with appropriate answers, clarifying questions, or suggestions of its own.


Agents that answer factual questions or book restaurants for us are one obvious — if not exactly world-changing — application. It’s also easy to see how such software might lead to more stimulating video-game characters or improve online learning. More provocatively, LeCun says systems that grasp ordinary language could get to know us well enough to understand what’s good for us. “Systems like this should be able to understand not just what people would be entertained by but what they need to see regardless of whether they will enjoy it,” he says. Such feats aren’t possible using the techniques behind the search engines, spam filters, and virtual assistants that try to understand us today. They often ignore the order of words and get by with statistical tricks like matching and counting keywords. Apple’s Siri, for example, tries to fit what you say into a small number of categories that trigger scripted responses. “They don’t really understand the text,” says LeCun. “It’s amazing that it works at all.” Meanwhile, systems that seem to have mastered complex language tasks, such as IBM’s Jeopardy! winner Watson, do it by being super-specialized to a particular format. “It’s cute as a demonstration, but not work that would really translate to any other situation,” he says.
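The keyword-counting approach described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how Siri or any real assistant is implemented: the categories, keyword sets, and scripted replies below are hypothetical, invented for this example. The sketch counts words without regard to order, picks the category with the most keyword hits, and returns a canned response.

```python
from collections import Counter

# Hypothetical categories and keywords, invented for illustration only.
INTENT_KEYWORDS = {
    "weather": {"weather", "rain", "forecast", "umbrella"},
    "restaurant": {"book", "table", "restaurant", "dinner"},
}

# Hypothetical scripted responses, one per category (None means no match).
SCRIPTED_RESPONSES = {
    "weather": "Here is today's forecast.",
    "restaurant": "Which restaurant would you like to book?",
    None: "Sorry, I didn't catch that.",
}


def bag_of_words(utterance):
    """Lowercase, split on whitespace, strip punctuation, and count words.

    Word order is thrown away: only the counts survive.
    """
    words = (w.strip(".,!?") for w in utterance.lower().split())
    return Counter(w for w in words if w)


def classify(utterance):
    """Return the category with the most keyword hits, or None if there are none."""
    counts = bag_of_words(utterance)
    scores = {
        intent: sum(counts[word] for word in keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def respond(utterance):
    """Look up the scripted reply for the matched category."""
    return SCRIPTED_RESPONSES[classify(utterance)]
```

Because only word counts are kept, `respond("Book a table for dinner")` and `respond("dinner for table a book")` give the same reply, and "the dog bit the man" is indistinguishable from "the man bit the dog". That is the sense in which such systems get by without really understanding the text.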


In contrast, deep-learning software may be able to make sense of language more the way humans do. Researchers at Facebook, Google, and elsewhere are developing software that has shown progress toward understanding what words mean. LeCun’s team has a system capable of reading simple stories and answering questions about them, drawing on faculties like logical deduction and a rudimentary understanding of time.


However, as LeCun knows firsthand, artificial intelligence is notorious for blips of progress that stoke predictions of big leaps forward but ultimately change very little. Creating software that can handle the dazzling complexities of language is a bigger challenge than training it to recognize objects in pictures. Deep learning’s usefulness for speech recognition and image detection is beyond doubt, but it’s still just a guess that it will master language and transform our lives more radically. We don’t yet know for sure whether deep learning is a blip that will turn out to be something much bigger.

