In 1960, the Nobel Prize-winning theoretical physicist Eugene Wigner published an article titled “The Unreasonable Effectiveness of Mathematics in the Natural Sciences”. His point was that, at least in the physical and chemical worlds, mathematics describes the behavior of nature to an uncannily accurate degree, which leads to foundational debates on whether nature arises out of mathematics. Of course there are counter-views, which I’ll get to later.
A similar argument was proposed recently for the surprising effectiveness of neural nets, though this time founded on physics rather than mathematics. (But if you accept that physics arises out of mathematics, perhaps it is a part of the same argument after all.)
First, a quick review of how a neural net (NN) works and how effective NNs have become. An NN is a stack of layers in which each layer is a plane of simple elements, where each element takes a small part of an input (an image, or speech or some other complex stimulus) and calculates a simple function based on that input. In image recognition, for example, elements in the first plane would recognize simple characteristics like edges, each in a small segment of the image. Those are passed on to a second plane to recognize slightly more complex structures based on those edges, and so on. These systems are not programmed in a conventional sense; they must be trained to recognize objects, but once sufficiently trained, they have proven able to beat humans at differentiating characteristics in images as closely related as different breeds of dog.
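To make the layering concrete, here is a minimal sketch in Python of the two-plane idea described above. It is a toy under stated assumptions: the function names (`edge_layer`, `column_layer`), the 2x2 kernel and the 4x4 image are all hypothetical, and the weights are hand-picked rather than trained, which is exactly what a real NN would not do.

```python
# Toy illustration only: weights are hand-picked, not learned by training.

def relu(x):
    """Simple nonlinearity: pass positive values, clip negatives to zero."""
    return max(0.0, x)

def edge_layer(image, kernel):
    """First plane: each element sees one small 2x2 patch of the image."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 1):
        row = []
        for c in range(cols - 1):
            total = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(2)
                for j in range(2)
            )
            row.append(relu(total))
        out.append(row)
    return out

def column_layer(edges):
    """Second plane: average edge responses down each column to score
    a slightly larger structure, a vertical line."""
    rows = len(edges)
    return [
        sum(edges[r][c] for r in range(rows)) / rows
        for c in range(len(edges[0]))
    ]

# Hypothetical kernel that responds to a dark-to-bright step, left to right.
VERTICAL_EDGE = [[-1.0, 1.0], [-1.0, 1.0]]

# Hypothetical 4x4 image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

edges = edge_layer(image, VERTICAL_EDGE)   # 3x3 grid of local edge responses
scores = column_layer(edges)               # one score per column position
print(scores)
```

The only column position with a non-zero score is the one straddling the boundary between the dark and bright halves. In a trained network the kernel values would be found from examples rather than written by hand, and there would be many kernels and many more layers.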
Deep neural nets (deep learning) have proven to be more than an incremental refinement of existing methods: they radically improve accuracy, and that in turn has led to an explosion in applications in image, speech and text recognition. That naturally leads to improved applications for voice-based control, car safety and autonomous cars, medical advances and more. Sufficiently accurate voice-based control alone is likely to dramatically change the way we interact with automation; witness Amazon’s Alexa and similar systems.
An article in Fortune details the activity in this area, particularly investment activity, both internal and VC-funded. We all know about work that Google, Facebook, IBM, Microsoft and others are doing. What you may not know is that equity funding of AI startups reached $1B last quarter. Apparently VCs are now wary of any startup that doesn’t have such an angle, if only because they are losing interest in devices or software controlled through menus and clicks. Their view is that natural language (e.g. speech) interfaces are now the hot direction.
So there’s definitely money to be made, but of course VCs don’t give a hoot about the fundamentals of physics, mathematics or the ontology of those domains. But debate in those areas might have something to say about how long-lived this direction could be, so let’s get back to that topic. Henry Lin at Harvard and Max Tegmark at MIT, both physicists, have proposed a reason why neural nets should be so effective and their claim is grounded in physics.
Their reasoning works like this. The physics of the universe can be modeled extremely well with low-order equations and with a small handful of relatively simple functions, much smaller certainly than the range of all possible functions. They attribute this to causal sequences in the evolution of the universe. The universe started from a completely ordered state (the big bang) but is still nowhere near an entropic death, as evidenced by structure in the cosmic microwave background (CMB), for example. This, they assert, is why we see significant structure and why we are able to explain physics with a limited set of equations and functions – evolution through causal sequences leads to relatively simple behavior, at least up to the current era.
So, they argue, the effectively hierarchical structure of recognition in deep neural nets is sufficient to recognize the complexity of systems we encounter in nature, whether the CMB or galaxies, or dog breeds or speech, because they need not model arbitrarily complex systems. The hierarchical structure of how systems evolve in nature, whether cosmological or biological, as evidenced in quite universal characteristics like symmetry and locality, ensures they can be modeled with excellent accuracy by neural nets. (In fairness, I am greatly oversimplifying their argument; you can find their arXiv paper through one of the links below.)
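To get a feel for how much smaller "a small handful of relatively simple functions" is than "all possible functions", here is a back-of-envelope count in Python. This is my own illustration, not a calculation taken from the paper, and the numbers chosen (20 binary inputs, polynomials up to degree 4) are arbitrary assumptions. It compares the number of entries needed to tabulate an arbitrary function of the inputs with the number of coefficients in a low-order polynomial of the same inputs.

```python
from math import comb

def lookup_table_entries(n_inputs, values_per_input):
    """Entries needed to tabulate an arbitrary function: one per input combination."""
    return values_per_input ** n_inputs

def polynomial_coefficients(n_inputs, max_degree):
    """Number of monomials of total degree <= max_degree in n_inputs variables."""
    return comb(n_inputs + max_degree, max_degree)

n = 20                                   # assumed number of inputs
table = lookup_table_entries(n, 2)       # arbitrary function of 20 binary inputs
poly = polynomial_coefficients(n, 4)     # polynomial of degree at most 4

print(table)   # 1048576
print(poly)    # 10626
```

The table grows exponentially with the number of inputs, while the polynomial count grows only as a power of it, so the gap widens very quickly as inputs are added. That is the sense in which a model restricted to simple functions has far less to represent.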
This steers closer to philosophy than science, which doesn’t necessarily make it uninteresting to the more grounded among us, but it does open the floor to counter-arguments. In fact, there were early counters to Wigner’s position. One (interestingly from Hamming) was that humans see what they look for, another that we often find it essential to create new mathematics to fit a requirement (the simple equations and functions we construct are perhaps more to fit within our limited mental capacity than they are a characteristic of nature). The same arguments could be made about the neural net/physics connection. Still, physics (and engineering even more so) is about approximation. If, to sufficient accuracy, a simple model will work, then the deeper “reality” (whatever that might be) may be unimportant for practical applications, though still important for deeper understanding.
To wrap up, since I’m in a philosophical mood, Tegmark and others have written a paper on a concept started (I think) by Roger Penrose on the relationship between Mathematics, Matter and the Mind – a sort of Penrose triangle. A question here is whether one of these three is most fundamental or one most derived, or whether one or more is unrelated and simply an artefact of our attempt to model. Wigner’s position was that matter derives from mathematics. One of Hamming’s positions was that we create mathematics to model what we want to see – that Mind is fundamental, Mathematics derives from the Mind and perhaps our view of Matter is just our way to reduce the natural world into this framework. But for neural nets, who cares – it seems they may already have the power to model with the accuracy we need, at least for now.