Learning From Machine Learning

Our new machine-learning systems would have to dumb themselves down considerably to explain to us how they are coming up with their conclusions. In so doing, they would have to falsify their representation of that world — a representation that is arguably superior to ours not only in its breadth but also in its very structure. This has implications for how we educate our children.

The place to begin is with the fact that machine learning systems can’t always explain the conclusions they reach.

The traditional example of a machine learning system is one that gets trained to separate email into spam and not-spam. You kickstart the process by feeding in email that humans have classified into those two categories. The computer looks for regularities in each set that may well have escaped human notice. For example, in a study from 2002 it turned out that the inclusion of the word “describe” was, oddly, a near-certain indicator that the message was *not* spam. In this case, the program can indeed tell us why it classified any particular message the way it did. For example, maybe it had the words “Get rich quick” and “Easy,” each of which has a heavy spam-weight.
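To make the idea of an explainable spam filter concrete, here is a minimal sketch in Python. It is an illustration only, not the method used in the 2002 study: the function names `train_word_weights` and `explain`, the four tiny training messages, and the choice of smoothed log-odds as the "spam-weight" are all assumptions made for this example.

```python
import math
from collections import Counter

def train_word_weights(spam_msgs, ham_msgs):
    """Learn one weight per word: positive leans spam, negative leans not-spam."""
    # Count how many messages in each class contain each word.
    spam_counts = Counter(w for m in spam_msgs for w in set(m.lower().split()))
    ham_counts = Counter(w for m in ham_msgs for w in set(m.lower().split()))
    weights = {}
    for word in set(spam_counts) | set(ham_counts):
        # Add-one smoothing keeps a word seen in only one class from giving log(0).
        p_spam = (spam_counts[word] + 1) / (len(spam_msgs) + 2)
        p_ham = (ham_counts[word] + 1) / (len(ham_msgs) + 2)
        weights[word] = math.log(p_spam / p_ham)
    return weights

def explain(message, weights):
    """Classify a message and list the words that drove the decision."""
    words = set(message.lower().split())
    contributions = {w: weights[w] for w in words if w in weights}
    score = sum(contributions.values())
    label = "spam" if score > 0 else "not spam"
    # Strongest evidence first, whichever direction it points.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return label, ranked

# Made-up training data, hand-labeled the way a human would kickstart the process.
spam = ["get rich quick", "easy money get rich"]
ham = ["please describe the meeting", "describe your easy plan"]
weights = train_word_weights(spam, ham)

label, reasons = explain("please describe the plan", weights)
print(label, reasons)
```

Because the verdict is just a sum of per-word weights, the list returned by `explain` is the whole story: a person can read it and see which words tipped the balance.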

Since then, machine learning has entered much deeper waters. Artificial neural networks running “deep learning” software can perform analyses of such staggering complexity that humans simply can’t understand how they come up with their conclusions. For example, imagine a computer was set up to analyze a set of A-B ads. These are two versions of an online ad, each with a minor variation such as having the price on the left or right, or using a green or blue background. A site like Amazon will serve up the two versions randomly to maybe 100,000 visitors and is likely to discover that one of the variants gets clicked on a few percentage points more than the other. From then on, all visitors see the version that proved itself to be more effective. But usually no one knows why one version works better than another. In fact, if we knew why (for example, that blue backgrounds always do better than green), Amazon wouldn’t need to run A-B tests.
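The arithmetic behind an A-B test can be sketched in a few lines of Python. The click counts below are invented for illustration, the function name `ab_test` is hypothetical, and the two-proportion z-test is one common way to compare click rates, not necessarily what any particular site uses.

```python
import math

def ab_test(clicks_a, views_a, clicks_b, views_b):
    """Compare two click rates with a two-proportion z-test."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate: the click rate if the two variants were really identical.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Invented numbers: 100,000 visitors split evenly between green (A) and blue (B).
rate_a, rate_b, z, p_value = ab_test(2500, 50_000, 2750, 50_000)
print(f"A: {rate_a:.3f}  B: {rate_b:.3f}  z: {z:.2f}  p: {p_value:.4f}")
```

The test can say that variant B is very unlikely to be winning by chance. It says nothing about why B wins, which is exactly the gap the essay is pointing at.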

Now imagine that we feed into the deep learning system everything we know about the 100,000 people on whom the A-B test was run. Imagine that for each person we have, say, 5,000 data points. This is indeed the number of pieces of information that Cambridge Analytica, the analytics company hired by the Donald Trump campaign, has supposedly claimed to have about each of 320 million US citizens. Imagine the deep learning system notes subtle and sometimes inexplicable relationships among this data. Maybe the blue background works better for men over 50 who had just clicked on three pages that had orange headlines and are coming from a page that reminds them of their home town, but not for white teen-age girls who got there after checking Google to help them with their math homework, and then only if the next day is a holiday. Now imagine it getting far, far more complex than that. Machine learning programs are not limited in the variables they can track. They can notice what seem like irrelevant associations and assign them weights. So, it is conceivable that a program could accurately predict which variant of an ad gets more clicks, but could only explain why by showing us a set of connections and weights that surpass the brain’s comprehension.
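A toy example in Python shows how quickly simple explanations break down once variables interact. Everything here is hypothetical: the rule inside `clicks`, the two variables, and the 1,000 simulated visitors are assumptions chosen to make the point with only two data points per person rather than 5,000.

```python
from itertools import product

def clicks(background, over_50):
    """Hypothetical rule: blue works on older visitors, green on younger ones."""
    return (background == "blue") == over_50

# 250 simulated visitors for each of the four combinations.
visitors = [
    (background, over_50)
    for background, over_50 in product(["blue", "green"], [True, False])
    for _ in range(250)
]

def click_rate(rows):
    """Share of the given visitors who click."""
    return sum(clicks(background, over_50) for background, over_50 in rows) / len(rows)

# Looking at background color alone, the two variants appear identical.
rate_blue = click_rate([v for v in visitors if v[0] == "blue"])
rate_green = click_rate([v for v in visitors if v[0] == "green"])

# Looking at color and age together, the pattern is stark.
rate_blue_older = click_rate([v for v in visitors if v == ("blue", True)])
rate_blue_younger = click_rate([v for v in visitors if v == ("blue", False)])

print(rate_blue, rate_green, rate_blue_older, rate_blue_younger)
```

In this sketch neither "blue is better" nor "green is better" is true, yet the outcome is fully determined by the combination of two variables. With thousands of variables the number of such combinations grows far beyond what a person can inspect, even though a program can still assign each one a weight.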

The ability of deep learning systems to “understand” online human behavior without being able to explain it applies just as well to events happening in the real world. Such events are also subject to the subtle and interacting effects of thousands of variables. This flies in the face of the simplifying assumption we in the West have grown up making as we navigate the real world: the physical world is governed by a handful of simple laws that make it all predictable. There undoubtedly are simple laws, but they blind us to the fact that almost all of what happens is unpredictable: physics explains the braking distance of automobiles, but those simple rules don’t let us predict that the car ahead of us is going to stop, or that there’s going to be a traffic jam because of a breakdown, or whether a bird will fly into our headlights, or even what will be on the radio when the accident happens. Those of us of a certain age can’t even predict whether we’re going to find our keys in time. There are just too many variables.

If that’s the case, then the machine’s view of the world is more reflective of the world’s nature than is our brain’s simplification of it down to a handful of laws.

This matters to education.

We certainly want to continue to teach the simple laws and regularities to our children. But we perhaps also should be teaching them that these laws apply to a universe so complex that it is never going to be fully clear to us. In this the universe is less like a clockwork and more like a plume of smoke. Both are governed by simple rules, but the predictability of the clockwork is the exception.

We may also want to change the attitude we convey about the tools we use in our quest for knowledge. We have too often considered computers, calculators, and the Internet as shortcuts; if you use them, we imply, then you’re not really learning anything. But much of our most advanced research is already being done using machine learning algorithms operating on vast quantities of data. We are going to rely on their conclusions even though we cannot understand them; we’ll rely on them because they will prove themselves reliable, and because we understand that their way of thinking is more in tune with the enormous complexity of the world than ours is. These are our new tools of thought, as much as rulers and protractors are. It’s just that we can understand how rulers and protractors do their job.

It is humbling to learn that our native equipment for learning is in fact poorly matched to the task we’ve given it. The brain is great for surviving, but not well designed for knowing. Yet there is also something glorious in creating tools that are now suddenly boosting us past the fundamental limitations that have been our condition until now. This is an evolutionary advance in the human ability to think. We should be conveying that sense of excitement to our children.