
Autumn Harvest.....

Shanxi Institute of Finance and Economics, 78jitong, 19781017--19820715

 
 
 


May 19, 2016

2016-05-19 13:21:17 | Category: Default

 

Big Data’s Mathematical Mysteries

Machine learning works spectacularly well, but mathematicians aren’t quite sure why

At a dinner I attended some years ago, the distinguished differential geometer Eugenio Calabi volunteered to me his tongue-in-cheek distinction between pure and applied mathematicians. A pure mathematician, when stuck on the problem under study, often decides to narrow the problem further and so avoid the obstruction. An applied mathematician interprets being stuck as an indication that it is time to learn more mathematics and find better tools.

I have always loved this point of view; it explains how applied mathematicians will always need to make use of the new concepts and structures that are constantly being developed in more foundational mathematics. This is particularly evident today in the ongoing effort to understand “big data”: data sets that are too large or complex to be understood using traditional data-processing techniques.

Our current mathematical understanding of many techniques that are central to the ongoing big-data revolution is inadequate, at best. Consider the simplest case, that of supervised learning, which has been used by companies such as Google, Facebook and Apple to create voice- or image-recognition technologies with a near-human level of accuracy. These systems start with a massive corpus of training samples — millions or billions of images or voice recordings — which are used to train a deep neural network to spot statistical regularities. As in other areas of machine learning, the hope is that computers can churn through enough data to “learn” the task: Instead of being programmed with the detailed steps necessary for the decision process, the computers follow algorithms that gradually lead them to focus on the relevant patterns.

Ingrid Daubechies

[Photo: Ingrid Daubechies. Credit: David von Becker]

In mathematical terms, these supervised-learning systems are given a large set of inputs and the corresponding outputs; the goal is for a computer to learn the function that will reliably transform a new input into the correct output. To do this, the computer breaks down the mystery function into a number of layers of unknown functions called sigmoid functions. These S-shaped functions look like a street-to-curb transition: a smoothened step from one level to another, where the starting level, the height of the step and the width of the transition region are not determined ahead of time.
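To make the street-to-curb picture concrete, here is a minimal Python sketch of one such S-shaped function. The parameters `start`, `height` and `width` mirror the three quantities named above. The extra `center` parameter (where the step sits) and the choice of the logistic formula are assumptions made for illustration; the text does not commit to a particular formula.

```python
import math


def sigmoid_step(x, start=0.0, height=1.0, center=0.0, width=1.0):
    """Smooth step rising from `start` to `start + height` around `center`.

    `width` sets how wide the transition region is: a small width gives a
    sharp curb, a large width gives a gentle ramp.
    """
    return start + height / (1.0 + math.exp(-(x - center) / width))


# Sample the curve at a few positions for a gentle and a sharp transition.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    gentle = sigmoid_step(x, width=1.0)
    sharp = sigmoid_step(x, width=0.1)
    print(f"x={x:+.1f}  gentle={gentle:.3f}  sharp={sharp:.3f}")
```

In a trained network these parameters are not chosen by hand; they are the unknowns that training adjusts.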

Inputs enter the first layer of sigmoid functions, which spits out results that can be combined before being fed into a second layer of sigmoid functions, and so on. This web of resulting functions constitutes the “network” in a neural network. A “deep” one has many layers.
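The layering just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production network: "combining" results is taken to mean a weighted sum plus an offset, which is the usual convention, and the weights in `NETWORK` are made-up numbers rather than trained values.

```python
import math


def sigmoid(z):
    """Standard logistic S-curve, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One layer: each node combines all inputs, then applies a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs through every layer in turn."""
    values = inputs
    for weights, biases in layers:
        values = layer(values, weights, biases)
    return values


# Hypothetical two-layer network: 2 inputs -> 3 hidden nodes -> 1 output.
NETWORK = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, 0.1, -0.2]),
    ([[1.0, -2.0, 0.5]], [0.0]),
]

output = forward([0.3, 0.7], NETWORK)
print(output)
```

A "deep" network in this sketch is simply a longer `NETWORK` list.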

[Illustration: Olena Shmahalo/Quanta Magazine]

Decades ago, researchers proved that these networks are universal, meaning that they can approximate essentially any function to any desired accuracy. Other researchers later proved a number of theoretical results about the unique correspondence between a network and the function it generates. But these results assume networks that can have extremely large numbers of layers and of function nodes within each layer. In practice, neural networks use anywhere between two and two dozen layers. Because of this limitation, none of the classical results come close to explaining why neural networks and deep learning work as spectacularly well as they do.
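The universality claim can be made tangible with a hand-built example. The sketch below is neither a trained network nor the proof; it is one standard intuition, with every name and number chosen for illustration. The difference of two steep sigmoids forms a narrow "bump", and a single layer of such bumps, each scaled by the target's value at the bump's midpoint, traces out the target function on the interval from 0 to 1.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bump(x, left, right, sharpness):
    """Difference of two steep sigmoids: near 1 between left and right, near 0 elsewhere."""
    return sigmoid(sharpness * (x - left)) - sigmoid(sharpness * (x - right))


def approximate(target, x, n_bumps=50, sharpness=500.0):
    """Approximate target(x) on [0, 1] with a sum of n_bumps scaled bumps."""
    width = 1.0 / n_bumps
    total = 0.0
    for k in range(n_bumps):
        left = k * width
        midpoint = left + width / 2
        total += target(midpoint) * bump(x, left, left + width, sharpness)
    return total


def target(x):
    """Illustrative target function: one full period of a sine wave."""
    return math.sin(2 * math.pi * x)


grid = [i / 200 for i in range(201)]
max_error = max(abs(approximate(target, x) - target(x)) for x in grid)
print(f"largest error on the grid: {max_error:.3f}")
```

Using more bumps shrinks the error, which is the sense in which the guarantee relies on very large numbers of nodes.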

It is the guiding principle of many applied mathematicians that if something mathematical works really well, there must be a good underlying mathematical reason for it, and we ought to be able to understand it. In this particular case, it may be that we don’t even have the appropriate mathematical framework to figure it out yet. (Or, if we do, it may have been developed within an area of “pure” mathematics from which it hasn’t yet spread to other mathematical disciplines.)

Another technique used in machine learning is unsupervised learning, which is used to discover hidden connections in large data sets. Let’s say, for example, that you’re a researcher who wants to learn more about human personality types. You’re awarded an extremely generous grant that allows you to give 200,000 people a 500-question personality test, with answers that vary on a scale from one to 10. Eventually you find yourself with 200,000 data points in 500 virtual “dimensions” — one dimension for each of the original questions on the personality quiz. These points, taken together, form a lower-dimensional “surface” in the 500-dimensional space in the same way that a simple plot of elevation across a mountain range creates a two-dimensional surface in three-dimensional space.

What you would like to do, as a researcher, is identify this lower-dimensional surface, thereby reducing the personality portraits of the 200,000 subjects to their essential properties — a task that is similar to finding that two variables suffice to identify any point in the mountain-range surface. Perhaps the personality-test surface can also be described with a simple function, a connection between a number of variables that is significantly smaller than 500. This function is likely to reflect a hidden structure in the data.
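A toy version of this search fits in a short Python sketch. Everything in it is an assumption made for illustration: the data are simulated (5 questions driven by 2 hidden traits, rather than 500 questions), the scores are centered numbers rather than answers from one to 10, and the method is plain principal component analysis, which only detects flat surfaces. Real data call for the more geometric tools described next.

```python
import random

# Hypothetical loadings: how strongly each of 5 questions reflects 2 hidden traits.
LOADINGS = [[1.0, 0.0], [1.0, 0.5], [1.0, -0.5], [0.0, 1.0], [1.0, 0.0]]


def simulate_answers(n_people, noise=0.05, seed=0):
    """Simulated scores: a mix of two hidden traits plus a little noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_people):
        traits = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        rows.append([
            sum(w * t for w, t in zip(loading, traits)) + rng.gauss(0.0, noise)
            for loading in LOADINGS
        ])
    return rows


def covariance(rows):
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(matrix, iterations=500):
    """Power iteration: returns the dominant eigenvalue and its direction."""
    d = len(matrix)
    v = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    value = sum(v[i] * matrix[i][j] * v[j] for i in range(d) for j in range(d))
    return value, v


def estimate_dimension(rows, threshold=0.01):
    """Count directions that carry more than `threshold` of the total variance."""
    work = covariance(rows)
    total = sum(work[i][i] for i in range(len(work)))
    count = 0
    for _ in range(len(work)):
        value, v = top_eigenpair(work)
        if value > threshold * total:
            count += 1
        # Deflate: remove the direction just found before looking for the next.
        for i in range(len(work)):
            for j in range(len(work)):
                work[i][j] -= value * v[i] * v[j]
    return count


answers = simulate_answers(200)
hidden_dims = estimate_dimension(answers)
print(hidden_dims)
```

Here five recorded variables turn out to be governed by two, the analogue of two coordinates sufficing to locate any point on the mountain-range surface.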

In the last 15 years or so, researchers have created a number of tools to probe the geometry of these hidden structures. For example, you might build a model of the surface by first zooming in at many different points. At each point, you would place a drop of virtual ink on the surface and watch how it spread out. Depending on how the surface is curved at each point, the ink would diffuse in some directions but not in others. If you were to connect all the drops of ink, you would get a pretty good picture of what the surface looks like as a whole. And with this information in hand, you would no longer have just a collection of data points. Now you would start to see the connections on the surface, the interesting loops, folds and kinks. This would give you a map for how to explore it.
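One way to read the ink-drop picture is as a random walk on the data points, loosely in the spirit of diffusion-based methods. The sketch below is an interpretation, not a specific published algorithm: the curved "surface" is a made-up arc of 40 points in the plane, and the Gaussian weighting with `bandwidth=0.05` is an assumed choice. Ink placed on one end of the arc spreads to neighbors along the curve, and almost none reaches the far end, even though the two ends are much closer in a straight line than along the arc.

```python
import math


def make_arc(n_points=40, sweep=1.5 * math.pi):
    """Points on three quarters of a unit circle: a curved 1-D 'surface' in 2-D."""
    return [
        (math.cos(sweep * k / (n_points - 1)), math.sin(sweep * k / (n_points - 1)))
        for k in range(n_points)
    ]


def transition_matrix(points, bandwidth=0.05):
    """Row i gives the fractions of ink at point i that move to each point."""
    rows = []
    for x1, y1 in points:
        weights = [
            math.exp(-((x1 - x2) ** 2 + (y1 - y2) ** 2) / bandwidth)
            for x2, y2 in points
        ]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def diffuse(ink, matrix, steps):
    """Let the ink spread for a number of steps; the total amount is conserved."""
    n = len(ink)
    for _ in range(steps):
        ink = [sum(ink[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return ink


points = make_arc()
matrix = transition_matrix(points)

drop = [0.0] * len(points)
drop[0] = 1.0  # place one drop of ink on the first point of the arc
ink = diffuse(drop, matrix, steps=5)

for index in (0, 1, 5, 20, 39):
    print(f"point {index:2d}: ink = {ink[index]:.6f}")
```

Repeating this from many starting points and comparing the resulting ink patterns is what reveals which points are near each other on the surface itself.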

These methods are already leading to interesting and useful results, but many more techniques will be needed. Applied mathematicians have plenty of work to do. And in the face of such challenges, they trust that many of their “purer” colleagues will keep an open mind, follow what is going on, and help discover connections with other existing mathematical frameworks. Or perhaps even build new ones.
