Not too long ago, the word algorithm was pretty obscure. According to a Google search, its occurrence (in books, at least) was essentially nil until around World War II (the very beginning of the computer era), when it started to creep up.1 A recent Google search for the word returned nearly 150 million hits;2 a search for news stories containing the word returned well over 500,000 hits;3 and Amazon offered over 30,000 books with the word in their titles.4
Some of these books suggest that all we need—rather than love—is algorithms, such as The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World (Domingos 2015), The Advent of the Algorithm: The Idea that Rules the World (Berlinski 2000), and Algorithms to Live By: The Computer Science of Human Decisions (Christian and Griffiths 2016). Some of these books suggest that the combination of algorithms and big data5 means that science itself—the entire enterprise of investigation and discovery—is obsolete.
It sounds like algorithms are pretty important. So what, exactly, is an algorithm?
As originally used in the early nineteenth century, algorithm (which, like the word algebra, traces back to the work of the ninth-century mathematician al-Khwarizmi) meant a sequence of operations guaranteed to eventually produce the answer to a particular problem. Of course, computers hadn’t been invented yet, so the operations were performed by people, and the problems were all in pure mathematics.6 The classical idea of algorithm embodies a few simple notions:
• It’s a set of rules.
• The rules specify a sequence of steps that can be performed mechanically, without any judgment.
• The sequence of rules eventually produces a result—or stops without producing one. In other words, it never just goes on and on forever.
• If it finds one, the result is provably correct.
Examples of early classical algorithms include Euclid’s Algorithm, which finds the greatest common divisor of two whole numbers; the Sieve of Eratosthenes, which finds prime numbers; and Binary Search, which finds an item in a sorted list. More prosaic examples are the procedures7 you learned in elementary school to do arithmetic—the ways you do addition, subtraction, multiplication, and division (if you still do them yourself) are all algorithms.
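To make the classical notion concrete, here is a minimal sketch of Euclid’s Algorithm in Python. The function name and the error handling are illustrative choices, not part of any historical formulation. It is a fixed set of rules, it can be followed mechanically, it always stops, and its result can be proved correct.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative whole numbers."""
    if a < 0 or b < 0:
        raise ValueError("expects non-negative whole numbers")
    # Rule: replace (a, b) with (b, a mod b) until the remainder is zero.
    # The second number shrinks at every step, so the loop must stop.
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(1071, 462))  # 21
```

No judgment is needed at any step, which is exactly why the same procedure can be carried out by a person with pencil and paper.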
Implicit in the original notion of algorithm is perfect knowledge—that the numbers on which the algorithm is to operate are known. For example, in the Traveling Sales Professional Problem,8 it’s assumed that the map is fixed and certain—that roads won’t be closed, that new roads won’t open, that traffic conditions can be ignored, and so on.
Algorithm used to be contrasted to heuristic. A heuristic is just like an algorithm except that its result can’t be proved correct—and probably isn’t. Implicit in the idea of a heuristic is that it’s good enough for practical purposes, a rule of thumb. A good heuristic produces a result that’s likely to be close to the best most of the time.
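The contrast can be sketched on a toy version of the route-finding problem. Everything here is hypothetical: the five-city map, the straight-line distances, and the function names. The exhaustive search is an algorithm in the classical sense, because trying every ordering proves that the answer is the shortest. The nearest-neighbor rule is a heuristic: quick and usually reasonable, with no such guarantee.

```python
from itertools import permutations
from math import dist

# Hypothetical map: city name -> (x, y) position; distances are straight lines.
CITIES = {"A": (0, 0), "B": (1, 5), "C": (5, 2), "D": (6, 6), "E": (8, 3)}

def tour_length(order):
    """Length of the round trip that visits the cities in the given order."""
    legs = zip(order, order[1:] + order[:1])
    return sum(dist(CITIES[p], CITIES[q]) for p, q in legs)

def exact_tour(start="A"):
    """Algorithm: try every ordering, so the result is provably the shortest."""
    rest = [c for c in CITIES if c != start]
    return min(([start] + list(p) for p in permutations(rest)), key=tour_length)

def nearest_neighbor_tour(start="A"):
    """Heuristic: always go to the closest unvisited city. Fast, but unproven."""
    tour, unvisited = [start], set(CITIES) - {start}
    while unvisited:
        here = CITIES[tour[-1]]
        nxt = min(sorted(unvisited), key=lambda c: dist(here, CITIES[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

best, guess = exact_tour(), nearest_neighbor_tour()
print(round(tour_length(best), 2), round(tour_length(guess), 2))
```

The heuristic can never beat the exhaustive search, only match it or fall short. The exhaustive search, in turn, becomes hopeless as the number of cities grows, which is why rules of thumb are used in practice.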
It’s more than curious that algorithms are all the rage while heuristics are the forgotten sibling—the same Google News search that produces over half a million results for algorithm yields only a little more than ten thousand for heuristic. (Algorithm certainly sounds fifty times better than heuristic!) But none of the algorithms lately in the news actually are—at best, they’re heuristics, but even saying that is giving them too much credit. Because even heuristic applies to a procedure whose results can be evaluated objectively.
Many of the supposed algorithms are embodied in so-called recommendation engines—see You May Also Like: Taste in an Age of Endless Choice (Vanderbilt 2016)—from Amazon, Yelp, Netflix, Pandora, and so on. Although there’s now even a computer science discipline called “recommendation theory,” recommendations can only be evaluated subjectively—a far cry from the original notion of algorithm—or even heuristic. Whether a program has correctly calculated the optimal route for a spaceflight to Mars isn’t a matter of opinion or survey.
Whether algorithm or heuristic, there’s supposed to be a coherent set of rules that can presumably be reviewed and understood by a human being. Neither algorithm nor heuristic is supposed to be an oracle that makes announcements to be uncomprehendingly accepted and obeyed. For the newsworthy algorithms (the ones that are going to save the world), this is pure fiction. No one can really say specifically what the rules are or specifically what they’re supposed to do, only that they’re supposed to do a good job.
Why, then, is the word algorithm used at all, rather than the more familiar (and less pretentious) phrase computer program? After all, all supposed algorithms are implemented in computer programs—no one is talking about performing them by hand. And therein lies the intended distinction: The computer program merely implements the algorithm. The algorithm itself is the idea underlying it. This, in turn, implies that there’s an actual idea behind the program, rather than the usual kludge behind more pedestrian programs.
In some cases, this may actually be true—or, at least, may once have been true. For example, the original idea behind Google webpage ranking was borrowed from the way in which academic papers have been comparatively evaluated.9 A useful way to do so is simply to count how many citations a paper receives—the more often a paper is cited by other papers in reputable journals, the more likely it’s worthy of attention. This idea can be applied recursively, so that each citation can in turn be weighted by how many times that paper is itself cited, and so on. Applied to pages on the internet, each page can be given a score based on how many other pages hyperlink to it. And, analogously, the value of each page hyperlinking to a page can itself be measured in the same way.
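The recursive scoring idea can be sketched in a few lines. This is only an illustration of the published idea, not Google’s code: the function name, the four-page web, the damping value of 0.85, and the fixed number of rounds are all assumptions made for the example.

```python
def rank_pages(links, damping=0.85, rounds=50):
    """links maps each page to the list of pages it hyperlinks to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    score = {p: 1.0 / len(pages) for p in pages}
    for _ in range(rounds):
        # Every page starts each round with a small baseline score.
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page in pages:
            # A page with no outgoing links shares its score with every page.
            targets = links.get(page) or pages
            share = damping * score[page] / len(targets)
            for t in targets:
                new[t] += share  # a link passes on part of the linker's own score
        score = new
    return score

# Hypothetical four-page web: three pages link to "home".
web = {"home": ["about"], "about": ["home"], "blog": ["home"], "shop": ["home", "blog"]}
scores = rank_pages(web)
print(max(scores, key=scores.get))  # home
```

Repeating the rounds is what makes the scoring recursive: a link from a highly scored page is worth more than a link from an obscure one.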
Unfortunately, once this was understood and implemented, it could be gamed. So-called link farms were created for the sole purpose of containing hundreds or thousands of hyperlinks to a particular webpage to boost its ranking on Google and the other search engines. This, in turn, led to an arms race between the search engines and the practitioners of the dark arts of search engine optimization (SEO). Whatever integrity and intelligibility the Google PageRank algorithm once had, opacity became an asset rather than a liability—a feature, not a bug.10
Google is said to modify one aspect or other of its search engine algorithm hundreds of times a year.11 This is actually a very old technique called trial and error. You don’t like the way unscrupulous SEO magicians are gaming you to get their clients higher in your results? Make a change somewhere and try it out. Like the results? Great! No? Try something else. Six months later, when the SEO wizards have figured out a new trick to boost their clients’ rankings, try another change.
Except for companies and websites whose revenues rise and fall according to their ranking in search results, changes to the Google algorithms probably aren’t all that consequential. But some algorithms—applied in much more consequential situations such as hiring, loan-making, housing, juror selection, and medical diagnosis—have less trivial results (see Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy [O’Neil 2016]). The worldwide stock market crash of 1987 known as Black Monday is believed to have been caused—at least in part—by algorithmic trading. Since then, markets are not infrequently hit by what has come to be known as a flash crash. For example, a recent headline reads “Oil went through a flash crash overnight, putting an already fragile market on edge.” Tellingly, “Traders blamed forced margin calls and computer trading for the so-called flash crash but were not quite sure what caused the drop.” That’s right: No one knows exactly what these algorithms are supposed to do or what they’re actually doing.
In many cases, what we ask an algorithm to do is effectively impossible. For example, the occurrence of terrorism is so rare (fewer than one person in a million commits an act of terrorism in the United States per year12) that there are no reliable indicators. Even if a statistically reliable pattern could be found, any algorithm would find overwhelmingly more false positives than true ones. For example, suppose a terrorism identification algorithm were 99.99 percent accurate—that is, it misidentified an innocent person as a terrorist only 0.01 percent of the time. Applied to the general population of the United States of about 330,000,000, it would identify some 33,000 people as terrorists. If only ten of those people actually were terrorists, such an algorithm would actually still be wrong some 99.97 percent of the time!
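The arithmetic is easy to check. The sketch below uses only the figures given above. The variable names are illustrative, and it assumes, as the example does, that the ten actual terrorists are among the people flagged.

```python
population = 330_000_000      # United States, figure from the text
false_positive_rate = 0.0001  # 0.01 percent of people wrongly flagged
actual_terrorists = 10        # figure from the text

# Assumption: the flagged total is the text's round figure, and every
# actual terrorist is included in it.
flagged = round(population * false_positive_rate)
wrongly_flagged = flagged - actual_terrorists
share_wrong = wrongly_flagged / flagged

print(flagged)               # 33000
print(f"{share_wrong:.2%}")  # 99.97%
```

The accuracy figure sounds impressive only until it is multiplied by the size of the population being screened.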
The sort of algorithm meant to find terrorists is of a type known as pattern-recognition or pattern-matching. The idea is that you show a lot of examples of what you’re looking for (and not looking for) to a deep learning program. It finds patterns that it then applies to new cases by seeking correlations between input and output. What could possibly go wrong?
Paradoxically, having a lot of data to learn from (“big data”) makes things worse rather than better. The more data you give the algorithm to analyze, the more it will find random, spurious correlations. And if you haven’t identified and gathered the metric that’s actually causal, no amount of data will help. (This is why the notion that algorithms obviate science itself is so misguided—science involves much more than finding correlations in existing data. The fundamental goal of science is to find causal relationships—and this often involves inventing new instruments, tools, and materials to make new observations and measurements guided by new insights and new ideas.)
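A small simulation illustrates the point about spurious correlations. It rests on stated assumptions: “more data” is taken to mean more measured variables, every value is random noise, and the sample size and seed are arbitrary. Nothing here is related to anything else, yet the strongest correlation found can only grow as more columns are examined.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(0)  # fixed seed so the run is repeatable
SAMPLES = 20
outcome = [random.gauss(0, 1) for _ in range(SAMPLES)]  # pure noise

strongest = 0.0
strongest_by_count = {}
for n_columns in range(1, 1001):
    column = [random.gauss(0, 1) for _ in range(SAMPLES)]  # unrelated noise
    strongest = max(strongest, abs(pearson(outcome, column)))
    if n_columns in (10, 100, 1000):
        strongest_by_count[n_columns] = strongest

for n_columns, r in strongest_by_count.items():
    print(f"{n_columns:>5} unrelated columns: strongest |r| = {r:.2f}")
```

A program that simply reports the best-looking pattern will always have something to report, whether or not anything causal lies behind it.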
In many cases, what we expect the algorithm to do is completely unreasonable. For example, after some people used Facebook’s recently introduced Live facility to stream videos of horrendous crimes as they were being committed, Facebook apparently thought they could devise an algorithm that could scan all the streaming videos and automatically determine in real time what was being shown. Presumably, this would include being able to distinguish between, for example, an actual murder and a high school play depicting one.13 It should come as a surprise to no one (except, perhaps, Mark Zuckerberg) that this effort wasn’t successful. Facebook has given up on this and has hired 3,000 people—actual human beings!—to screen the live feeds and use their judgment.
If not videos, what about still images? Google’s image search facility is extremely impressive—type in anything you can think of, and hundreds or thousands of photographs and other graphics—most but not all of them appropriate—are immediately returned. Does Google actually have an algorithm that can scan every photograph and every graphic on the web and figure out what’s in it?
Well, no. Google’s image search algorithm relies on two methods:
• Almost all of the photographs and other images on the web have file names and other associated text.
• People are paid a few pennies a picture to look at them and add tags (through facilities such as Amazon’s Mechanical Turk).
In other words, image search is actually just text search that returns images. When you perform an image search by pointing to or uploading an actual photograph, Google looks for a file containing identical (or overlapping) data. Only when it finds such a match can it figure out what the photograph actually depicts by looking at the associated text. This is all very useful and convenient and impressive, but there’s no wizard (or algorithm) behind the curtain. (When it doesn’t find one, Google makes an extremely generic guess. For example, upload a photograph of a wooden loom, and Google guesses that it’s a piece of furniture.)
Before the advent of the internet and the ever-changing world wide web, it was usually demanded that a new computer program be comprehensively (if not exhaustively) tested. In some cases, it still is. For example, if you’ve ever made a mistake filling out an online form, you know that your input is checked against its expected format—for example, a phone number consists of ten digits. Before that form went live, the quality assurance department undoubtedly tested every field to see if it properly rejected bad input and properly processed good input. A crucial metric of the quality of a testing effort is coverage—the proportion of test cases to all possible inputs (or types of input), usually expressed as a percentage. (Where possible, testing is automated—for every test case, the input and the corresponding output are specified.)
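As a sketch of what such automated testing looks like, here is a hypothetical phone-number check together with its test cases. The function name, the punctuation it tolerates, and the sample numbers are all assumptions made for illustration.

```python
import re

def is_valid_phone(text: str) -> bool:
    """Accept exactly ten digits, ignoring spaces, dots, dashes, and parentheses."""
    digits = re.sub(r"[\s().-]", "", text)
    return re.fullmatch(r"\d{10}", digits) is not None

# Each test case pairs an input with the output it is expected to produce.
TEST_CASES = [
    ("5551234567", True),
    ("(555) 123-4567", True),
    ("555-1234", False),      # too few digits
    ("55512345678", False),   # too many digits
    ("555-12E-4567", False),  # contains a letter
    ("", False),
]

failures = [(given, want) for given, want in TEST_CASES if is_valid_phone(given) != want]
print("all passed" if not failures else failures)
```

Six cases cover each type of input this field is likely to see: well formed, too short, too long, wrong characters, and empty. That kind of coverage is possible only because the set of input types is small.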
But, in many cases, this kind of comprehensive testing isn’t even conceivable. For example, the input to early artificial neural nets (a kind of pattern recognition algorithm) typically consisted of a binary grid of 8 by 8 cells—in other words, a chessboard in which each square could be filled (1) or empty (0). Even an array as simple as that has 18,446,744,073,709,551,616 different possible inputs.14 Such artificial neural nets were typically trained on a few hundred cases and then tested with a few hundred more before being set to work picking stocks or race horses. Even with a thousand training cases and another thousand test cases, the coverage is on the order of 0.00000000000001 percent.
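The chessboard arithmetic can be checked directly. The sketch below uses the figures from the example, with illustrative variable names, and counts both the training cases and the test cases as covered inputs.

```python
cells = 8 * 8                    # the 8-by-8 binary grid
possible_inputs = 2 ** cells     # each cell is either filled or empty
cases_used = 1_000 + 1_000       # training cases plus test cases

coverage_percent = 100 * cases_used / possible_inputs

print(f"{possible_inputs:,}")     # 18,446,744,073,709,551,616
print(f"{coverage_percent:.1e}")  # 1.1e-14
```

Expressed as a percentage, the coverage comes to roughly one hundred-trillionth of one percent.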
Modern image processing programs handle files that are astronomically larger. A modern digital camera takes photographs with millions of pixels, each of which can take a million or more possible values. The number of possible inputs is considerably larger than the number of quantum particles in the universe. Your image processing program will work amazingly well almost all the time—but one day may suddenly undergo the equivalent of a flash crash.
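A rough calculation shows how lopsided the comparison is. The figures are assumptions chosen for illustration: a 10-megapixel photograph, 24-bit color (about 16.7 million values per pixel), and the commonly cited rough estimate of 10^80 atoms in the observable universe.

```python
from math import log10

pixels = 10_000_000          # assumed 10-megapixel photograph
values_per_pixel = 2 ** 24   # assumed 24-bit color

# The count of possible images is values_per_pixel ** pixels, far too large
# to print, so work with its number of decimal digits instead.
digits_in_image_count = pixels * log10(values_per_pixel)
digits_in_particle_count = 80  # rough estimate: about 10**80 atoms

print(f"possible images: about 10^{digits_in_image_count:,.0f}")
print(f"atoms in the universe: about 10^{digits_in_particle_count}")
```

Under these assumptions, the number of possible images has more than seventy million digits, while the particle estimate has eighty-one.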
These days, saying you’ve got an algorithm is saying almost nothing at all. Applied outside the narrow field of abstract problems of pure mathematics, the word algorithm implies a purity, an integrity, a correctness that’s simply unattainable in the real world.
- https://www.amazon.com/s/ref=nb_sb_noss_2?url=search-alias%3Dstripbooks&field-keywords=algorithm. Most of these are technical.
- Another term that has recently become ubiquitous.
- Pure mathematics is the study of numbers in the abstract—without any consideration for what they might count or measure in the real world. Of course, many of these problems were abstracted from real problems in the real world, but they were studied without any reference to those real-world problems that inspired them.
- Another term for algorithm is effective procedure.
- Originally, the Traveling Salesman Problem. It’s the problem of finding the best (shortest, fastest, cheapest) route visiting all the cities on a map connected by a defined set of roads between them—for any map with any set of cities and any configuration of roads. Finding the best route between just two cities is a related but much easier problem. (Of course, in the mathematical version of the problem, all of these things—cities, roads, speeds, distances, costs—are abstracted.) Google Maps and other routing apps almost always do well enough, but they can’t be guaranteed to yield perfect results.
- See http://ilpubs.stanford.edu:8090/422/1/1999-66.pdf.
- The totality of Google software is said to be about two billion lines of code (see https://www.wired.com/2015/09/google-2-billion-lines-codeand-one-place/). That’s well beyond the ability of anyone to read and understand.
- In 2015, nine people were involved with terrorist acts in the United States.
- This wouldn’t necessarily be equivalent to achieving full and complete human-level artificial intelligence—but it would come close.
- This recalls the fable of the mathematician who bested the emperor who asked how he would like to be repaid for a minor service. The mathematician asked for a single grain of rice for the first square on a chessboard, two grains for the second square, four grains for the third square, and so on. The emperor readily agreed, only to learn that the payment would bankrupt the kingdom.
- Berlinski, David. 2000. The Advent of the Algorithm: The Idea That Rules the World. Boston: Houghton Mifflin Harcourt.
- Christian, Brian, and Tom Griffiths. 2016. Algorithms to Live By: The Computer Science of Human Decisions. New York: Henry Holt and Co.
- Domingos, Pedro. 2015. The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. New York: Hachette Book Group.
- O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York: Crown.
- Vanderbilt, Tom. 2016. You May Also Like: Taste in an Age of Endless Choice. New York: Knopf.