The Learning Machine
By David Alan Grier
 

I was unable to attend the Chinese National Computing Congress this year; other tasks kept me close to home. I was sorry to miss the meeting. I had wanted to see some old friends and meet a few new ones. Most important, I had wanted to learn how Chinese researchers were approaching the current problems in computer science and assess how their ideas might interact with the work in the US, Europe, and India. Increasingly, we have come to appreciate that most computer science problems no longer have a single correct answer. Instead, we get the most efficient or workable solutions by combining ideas from multiple viewpoints.

The notion that any engineering problem is best solved by a single solution was the foundation for much of 20th century engineering. It was developed by a group of engineers who were trying to make factories more efficient and organized. They hypothesized that if all their workers used different methods, all but one of them were working inefficiently. To improve their factories, they decided to try to identify the one best way to complete a task and then teach this method to all their workers.

This idea was embraced by computer science and supported by other economic and social forces. I have a hard time convincing my students that all the early computers were slightly different. They had different operations, different programming languages, and different instruction sets. The fact that you could program one brand was no guarantee that you could program another.

My students are only mildly polite when I tell them this story. They give me the same attention that they would give to an aged grandparent who was talking about living on a farm and feeding the goats each morning. After all, they have lived their entire lives in an age that had a small number of computer architectures, a slightly larger number of operating systems that are based on a common model, and a large body of applications that can run on any system. Their experience is echoed by the lessons of their parents, who also worked in a unified computing environment. The only meaningful decision that they make is choosing between Apple and Lenovo, iOS and Android. To most of us, that decision is not particularly important. Both kinds of computers and both kinds of operating systems are fundamentally the same.

Yet, we are seeing a growing interest in heterogeneous computers—computers that combine radically different kinds of processors. Most commonly, heterogeneous computers combine two different kinds of machines: the common central processing unit (CPU) and the graphical processing unit (GPU). CPUs take a single stream of instructions and apply it to a single data stream. A GPU takes a single stream of instructions but applies it to multiple data streams. In theory, GPUs can be much faster for certain problems than CPUs. As a result, the final heterogeneous machine should be substantially faster than a machine made solely of either kind of processor.
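The difference between the two processing models can be made concrete with a toy count of instruction issues. The sketch below is only an illustration, not a model of any real processor: the function name instruction_issues and the lane width of 32 are assumptions chosen for the example, and it ignores memory transfer, branching, and clock speed.

```python
import math

def instruction_issues(n_elements: int, lanes: int) -> int:
    """Count how many times one instruction must be issued to cover
    n_elements when each issue is applied to `lanes` data elements at once."""
    return math.ceil(n_elements / lanes)

# Hypothetical workload: apply one operation to a million data elements.
n = 1_000_000

# CPU-like model: one instruction stream, one data stream.
cpu_issues = instruction_issues(n, lanes=1)

# GPU-like model: one instruction stream applied to 32 data streams
# in lockstep (32 is an assumed width, used only for illustration).
gpu_issues = instruction_issues(n, lanes=32)

print(cpu_issues, gpu_issues)
```

Under these assumptions the GPU-like model needs far fewer issues for the same work, which is the sense in which it can be faster "in theory." The advantage disappears when the data elements cannot all follow the same instruction stream.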

Some computer scientists argue that heterogeneous technology will ultimately give us exascale computers that can deliver 10^18 floating-point operations per second. The heterogeneous approach, explains one team of researchers in a recent IEEE article, “provides an effective balance of high-throughput, energy-efficient GPU resources coupled with CPU cores optimized for single-threaded performance.”

Fundamentally, the idea of combining two or more kinds of processors in a network is not new. During the 70s and early 80s, many high-speed computers consisted of two different, tightly linked kinds of processors. Although one or two of these machines—notably those of the Cray Corporation—were important for a time, all were eventually overshadowed by machines constructed of multiple, inexpensive, identical processors. Such machines were cheaper to build and easier to use.

When you are combining processors with two different architectures, CPUs and GPUs, you face the problem of finding the one right way to program the machine. Some programs work well on the CPU but not the GPU. Other programs work better on the GPU and slower on the CPU. Of course, some programs do not clearly belong on either processor. Some data will run fastest on the CPU; other data will run best on the GPU. Early in my career, I spent a year or two learning to program such a machine. My work constantly required me to decide which processor would run the bulk of the problem. I had to choose one processor or the other, even though I knew that sometimes programs would run better if I had made the other choice.

Recent research on heterogeneous computing attempts to challenge the idea that there is only one right way to divide work between CPUs and GPUs. It supports the hardware architecture with software tools that analyze the nature of each program and decide which parts of the code should be assigned to which processor. If these tools do their work well—and I hope they do—this software will be able to analyze programs dynamically and decide when they should be moved from one processor to another, when they aren’t a good match to the original assignment.
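To give a feel for what such a tool has to decide, here is a minimal sketch of a cost-based assignment rule. It is not taken from any of the research mentioned here. The Task fields, the cost constants, and the function names are all hypothetical, and a real scheduler would measure these quantities rather than assume them.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    n_items: int              # number of data elements to process
    parallel_fraction: float  # share of the work that is data-parallel (0 to 1)

# Assumed costs in arbitrary time units; invented for illustration only.
CPU_COST_PER_ITEM = 1.0           # CPU cost for any item
GPU_PARALLEL_COST_PER_ITEM = 0.05 # GPU cost for data-parallel items
GPU_SERIAL_COST_PER_ITEM = 4.0    # GPU cost for items that must run serially
GPU_LAUNCH_OVERHEAD = 5_000.0     # fixed cost of moving work to the GPU

def estimate_cpu(task: Task) -> float:
    """Estimated time if the whole task stays on the CPU."""
    return task.n_items * CPU_COST_PER_ITEM

def estimate_gpu(task: Task) -> float:
    """Estimated time if the whole task is moved to the GPU."""
    parallel_items = task.n_items * task.parallel_fraction
    serial_items = task.n_items - parallel_items
    return (GPU_LAUNCH_OVERHEAD
            + parallel_items * GPU_PARALLEL_COST_PER_ITEM
            + serial_items * GPU_SERIAL_COST_PER_ITEM)

def assign(task: Task) -> str:
    """Pick the processor with the lower estimated time."""
    return "GPU" if estimate_gpu(task) < estimate_cpu(task) else "CPU"

for t in (Task("small", 1_000, 0.95),
          Task("large", 1_000_000, 0.95),
          Task("branchy", 1_000_000, 0.50)):
    print(t.name, assign(t))
```

With these invented numbers, the small task stays on the CPU because the launch overhead dominates, the large data-parallel task moves to the GPU, and the task with a large serial share stays on the CPU. A dynamic tool would rerun a decision of this kind as it observes how the program actually behaves.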

We could argue, of course, that heterogeneous computers will not abolish the idea that there is one right way to operate a computer. We could claim that a heterogeneous computer has one right way to handle each program and each dataset. In the process, we would have to acknowledge that preparing a program for such a machine is too complex for a human programmer. At best, a human programmer can prepare the program and verify that it is sufficiently correct. The machine itself will have to monitor the program and determine a way to execute the code quickly and efficiently, even though it might not be the one right way.

Of course, heterogeneous computing is not the only form of computation that relies on software to find the best way to utilize our hardware resources. We’ve long had tools that refactor software, that search for parallelism, that reorganize our algorithms, that adjust the load on a network of machines. All of them are steps away from the idea that we have to find the one best solution and toward accepting the notion that our machines can find a solution that is good enough.



About David Alan Grier

David Alan Grier is a writer and scholar on computing technologies and was President of the IEEE Computer Society in 2013. He writes for Computer magazine. You can find videos of his writings at video.dagrier.net. He has served as editor in chief of IEEE Annals of the History of Computing, as chair of the Magazine Operations Committee and as an editorial board member of Computer. Grier formerly wrote the monthly column “The Known World.” He is an associate professor of science and technology policy at George Washington University in Washington, DC, with a particular interest in policy regarding digital technology and professional societies. He can be reached at grier@computer.org.