The Future of Programming

How tools and good practices will make programming easier

"It's tough to make predictions, especially about the future." -- Yogi Berra

By Eugene Eric Kim

Eugene is a freelance consultant and writer. He can be reached at eekim@eekim.com.

In 1893, to celebrate the 400th anniversary of Columbus landing in America, the American Press Association asked a group of leading journalists, entrepreneurs, engineers, and politicians to hypothesize on the state of the world in 100 years. The resulting essays were far-ranging, imaginative, and mostly wrong. As computer scientist Peter Denning recently observed, "Among the most striking features of the 1893 forecasts is the remarkable paucity of predictions that actually came true."

Predicting the future is easier said than done, and yet, we persist in trying to do it. As futile as it may seem to forecast the future of programming, if we're going to try, it's helpful to recognize certain fundamental characteristics of programming and programmers. We know, for example, that programming is hard. We know that the industry is driven by the desire to make programming easier. And we know, as Perl creator Larry Wall has often observed, that programmers are lazy, impatient, and excessively proud.

This first condition formed the basis of Frederick Brooks's classic text on software engineering, The Mythical Man-Month (Addison-Wesley, 1995; ISBN 0201835959), first published in 1975, in which he wrote:

As we look to the horizon of a decade hence, we see no silver bullet. There is no single development, in either technology or management technique, which by itself promises even one order of magnitude improvement in productivity, in reliability, in simplicity.

Brooks's prediction was dire and, unfortunately, accurate. There was no silver bullet, and as far as we can tell, there never will be. However, programming is undoubtedly easier today than it was in the past, and the latter two principles of programming and programmers explain why. Programming became easier because the software industry was motivated to make it so, and because lazy and impatient programmers wouldn't accept anything less. And there is no reason to believe that this will change in the future.

Easier programming means producing code faster, whether that code is of higher quality or merely functional. The software industry's interest in easier programming is driven by simple economics. Code that's written faster or that's of higher quality translates to value, and value translates to money. This is especially important today, as we decry the shortage of excellent programmers.

More importantly, easier programming means, to paraphrase Wall once again, making simple tasks easy, and hard tasks possible. The former not only appeals to the lazy programmer in all of us, but it also makes programming accessible to a wider class of people, addressing the programmer shortage to some extent.

However, making hard tasks possible is where the greatest potential lies. New technologies discussed in this magazine, from molecular to quantum computing, create a whole new class of problems that today's programming tools and methodologies do not adequately address. The only way we will be able to take full advantage of these innovations is to come up with ways to simplify programming.

Because we are driven to make software development easier, I have no doubt that we will. The real question is how. While there may be no silver bullet, there are certainly a slew of stones and other blunt objects that we can fling at the problem, and many of the articles in this issue describe such innovations. By exploring the current trends in programming, while keeping in mind the aforementioned three principles, we can paint a picture of what software development may be like over the next century.

Programming Languages

The introduction of Fortran in 1957 was a watershed in the history of programming languages because it was the first high-level language that was a truly practical alternative to assembly language. At the time, computers were expensive, and it cost a lot of money to run programs. However, John Backus, Fortran's creator, realized that it was actually more expensive to develop and maintain programs than it was to run them, a fact that we take for granted today, but that required quite a bit of insight to recognize in the 1950s.

Fortran was not the first high-level programming language invented, nor was it particularly exceptional. In a 1981 presentation on the history of Fortran, Backus admitted, "As far as we were aware, we simply made up the language as we went along." What separated it from other languages was the quality of its compilers, which generated decent machine code with acceptable performance. Although the resulting code was significantly slower than hand-coded assembly language, it was fast enough for the run-time costs to be more than offset by the savings in development and maintenance time.

Today, programming remains expensive and good programmers scarce. And, as in 1957, we see very high-level languages as a possible solution. Many of these languages already exist, some of which are so good that people don't even realize they are programming languages. For instance, VisiCalc cocreator Bob Frankston has pointed out that developing spreadsheets is equivalent to writing software. In this sense, spreadsheets are a type of very high-level programming language, and their users could legitimately be considered programmers.

However, in general, very high-level languages inhabit a relatively small niche, because their compilers are inadequate. In many organizations, for example, product managers use visual languages to model business rules and create fully functional prototypes. Engineers are then needed to reimplement pieces of the resulting code in order to achieve acceptable performance.

As we learned from Fortran, for any breakthrough in language design to have a significant impact on software development, the language must be accompanied by a good compiler that generates production-quality code. In other words, we can make programming easier by making compilers smarter.

Today's compilers are very good. Most programmers cannot write machine code that outperforms what a compiler produces. However, there are still many ways we can improve compilers. One recent trend is for compilers to analyze the dynamic run-time behavior of programs in order to come up with smarter optimizations. This is the automated equivalent of programmers profiling their code to identify areas of poor performance, and then hand optimizing that code.

Mark Wegman and his colleagues at IBM introduce a similar dynamic optimization technique in this magazine that specifically targets distributed-computing applications. They also suggest an intriguing scenario in which compilers automatically choose the library functions with the algorithms and data structures best suited to a particular application.

This might work in the following manner. We know, for example, that the classic quicksort performs poorly when a list is already mostly sorted. Good programmers who recognize this characteristic in a particular application would know not to use a quicksort. Now, if programmers can recognize such empirical characteristics and make well-defined decisions accordingly, why can't compilers do the same? A smart compiler might look for certain characteristics in an application and choose the algorithm or data structure best suited to its observations.
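As a toy illustration of this kind of algorithm selection (the function names and the 90-percent threshold are invented for this sketch, not taken from any real compiler), a selector might measure how sorted the input already is and choose an algorithm accordingly:

```python
def naive_quicksort(xs):
    # Classic quicksort with a first-element pivot: degrades toward
    # O(n^2) comparisons when the input is already mostly sorted.
    if len(xs) <= 1:
        return list(xs)
    pivot, rest = xs[0], xs[1:]
    return (naive_quicksort([x for x in rest if x < pivot])
            + [pivot]
            + naive_quicksort([x for x in rest if x >= pivot]))

def insertion_sort(xs):
    # Near O(n) on nearly sorted input, O(n^2) in the worst case.
    out = list(xs)
    for i in range(1, len(out)):
        j = i
        while j > 0 and out[j - 1] > out[j]:
            out[j - 1], out[j] = out[j], out[j - 1]
            j -= 1
    return out

def sortedness(xs):
    # Fraction of adjacent pairs already in order: a cheap empirical
    # characteristic a "smart compiler" might observe about the data.
    if len(xs) < 2:
        return 1.0
    in_order = sum(1 for a, b in zip(xs, xs[1:]) if a <= b)
    return in_order / (len(xs) - 1)

def adaptive_sort(xs):
    # Choose the algorithm based on an observed property of the input,
    # the way a profiling compiler might choose a library routine.
    if sortedness(xs) > 0.9:
        return insertion_sort(xs)
    return naive_quicksort(xs)
```

A compiler making this choice statically, from profiling data rather than at run time, would pay the measurement cost once instead of on every call.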

Improving Development Tools

These types of optimization capabilities enable language designers to mask some of the complexities of programming -- what Fred Brooks calls the "essential difficulties" of software -- which, in turn, would make programming accessible to wider audiences. However, while these advanced techniques can be fascinating to think about, there are simpler ways to make programming easier.

Soon after Linus Torvalds first released the Linux kernel in the early 1990s, he decided to port it from C to C++. He reversed this decision after a few months for a number of reasons, one of which was that C++ compilers were much slower than C compilers. The Linux kernel required about an hour to compile on the fastest PCs available at the time. An additional 10 minutes would have significantly decreased the productivity of developers who, on an average day, might recompile the kernel several times.

From the programmer's perspective, the speed of the compilers can be just as important as the speed of the code they produce. Late-binding languages and incremental compilers both significantly reduce compilation time, and consequently, are likely to become pervasive technologies in the future.

Late-binding languages, including the various scripting languages and even Java to some extent, are already widely used in development environments. Because compilation time is significantly faster for scripting languages than for a statically bound language like C, development time is greatly reduced. As a result, these languages are great for prototyping algorithms and applications.

However, because these languages often have more than acceptable run-time speeds as well, they are also being used more and more in production systems. Witness the World Wide Web, where Perl and Java rule the server side, because data input and output, not application run-time speed, are usually the performance bottleneck in web applications. As microprocessors continue to get faster and compilers smarter, we may eschew traditional compiled languages entirely for scripting languages or other hybrid interpreted/compiled languages.

Incremental compilers are interesting, because they show how something as simple as a file can have an adverse effect on development time. For years, programmers have separated code into distinct files, compiling each file individually, then linking the code to form a single, unified binary. Because code in one file may affect code in several other files, whenever a file changes, no matter how small the modification, most compilers will recompile all of the files that depend on that file. In other words, if you have a million-line program and you change a comment in one file, that file and every file that depends on it will be recompiled, needlessly wasting valuable time.
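The bookkeeping behind file-granular recompilation is straightforward to sketch. This illustrative Python fragment (the file names and dependency map are hypothetical) inverts a file-level dependency graph and walks it transitively to find everything that must be rebuilt when one file changes:

```python
def dependents_to_recompile(changed, deps):
    # deps maps each file to the files it depends on.
    # Invert the graph so we can ask "who depends on this file?"
    rdeps = {}
    for f, uses in deps.items():
        for u in uses:
            rdeps.setdefault(u, set()).add(f)
    # Walk transitively from the changed file: every file that
    # directly or indirectly depends on it gets rebuilt.
    to_rebuild, stack = set(), [changed]
    while stack:
        f = stack.pop()
        for d in rdeps.get(f, ()):
            if d not in to_rebuild:
                to_rebuild.add(d)
                stack.append(d)
    to_rebuild.add(changed)
    return to_rebuild
```

Note that the changed file drags in all of its dependents regardless of what changed inside it; tracking dependencies at a finer granularity than files is exactly what would shrink that set.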

Incremental compilers solve this problem by identifying dependencies at a more granular level, usually tracking them in a local database with a proprietary format. Because of their importance and complexity, incremental compilers have become a hot research area. However, this problem might not have existed in the first place if there had been a standard way of modularizing source code at a more granular level than files.

One proposed solution that several people have started to explore is to wrap XML tags around source code. If every logical block were encapsulated by XML, it would be possible to build more granular dependency trees for compilers to use. This way, if a comment were added to a file, for example, a compiler would be smart enough to know not to recompile anything.
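A minimal sketch of this idea, assuming an invented XML vocabulary (the <file>, <function>, and <comment> element names are illustrative, not part of any standard), might fingerprint only the code-bearing blocks, so that a comment-only edit produces no change and triggers no recompilation:

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical XML encapsulation of a source file.
OLD = """<file name="math.c">
  <comment>Fast integer routines.</comment>
  <function name="square">int square(int x) { return x * x; }</function>
</file>"""

# Same file with only its comment edited.
NEW = """<file name="math.c">
  <comment>Fast integer routines, reviewed recently.</comment>
  <function name="square">int square(int x) { return x * x; }</function>
</file>"""

def code_fingerprint(xml_text):
    # Hash only the blocks that affect generated code,
    # skipping <comment> elements entirely.
    root = ET.fromstring(xml_text)
    h = hashlib.sha256()
    for block in root:
        if block.tag != "comment":
            h.update((block.text or "").encode())
    return h.hexdigest()

def needs_recompile(old_xml, new_xml):
    return code_fingerprint(old_xml) != code_fingerprint(new_xml)
```

Here the comment-only edit leaves the fingerprint unchanged, so a compiler consulting it would skip the file entirely.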

This is essentially what current incremental compilers already do. However, embedding source code in XML has a number of other advantages, many of which have been noted by Eric Armstrong, an independent consultant who is working on one such project. One of its most important consequences would be the ability to link to and from source code using the XLink specification.

Cleanly annotating source code via linking would help solve one of the messy problems of software development: documenting code. There have been many improvements in this area over the years, but they have mostly been incremental. For instance, one of C++'s great innovations was the slash-slash comment, which allowed programmers to create single-line annotations easily, and which helped circumvent C's inability to nest comments.

Despite such improvements, comments are a relatively poor method for documenting software. One of their biggest problems is that they clutter source code, making it hard to read. Languages such as Java and Perl encourage embedding full documentation within source code, which can later be extracted using the appropriate tools. In theory, this is nice because it encourages programmers to write documentation and code at the same time using the same tool. In reality, the resulting clutter makes editing and viewing source code far more difficult.

Code generation tools, such as UML tools, cause similar problems because they maintain synchronization information between diagrams and source code using comments. These comments, which are often cryptic and poorly formatted, only serve to confuse people who are reading the raw source code. As a result, many developers choose to avoid the code-generation features of these tools altogether.

Embedding source code in XML would allow programmers to keep source code and documentation physically separate while maintaining the association using links. For this approach to be successful, however, source-code editors and viewers that handle XML encapsulation transparently are absolutely vital. This seems to be a reasonable expectation. For a long time now, people have been using word processors rather than text editors for writing documents. Programmers should have similar tools that transparently provide the features enabled by a richer, standard file format.

Programming Practices

In a recent Dr. Dobb's Journal article ("Open Source Meets Big Iron," June 2000), Pete Beckman and Greg Wilson noted that, despite all of the studies that show how good programming practices improve programmers' productivity, "most programmers still start coding without a design, then go on to short-change testing and set wildly unrealistic delivery schedules."

To some extent, better tools will encourage programmers to adopt better practices. In the early 1980s, Donald Knuth observed that the programs he had written for his many books tended to have very few errors, because he had thought them through very carefully. This observation motivated Knuth to codify this design and coding process into the literate programming methodology, and he developed a system called "WEB" to encourage the development of literate programs.

On the flip side, tools sometimes discourage good practices. In the 1960s, some people argued that batch-processing computers were better for productivity than time-shared computers because they forced programmers to think carefully about their design before they wrote actual code. Interactive computers with source-code editors allowed programmers to write code faster, but the quality of the resulting code was worse.

Ultimately, if we are to see more programmers adopt good practices, changing the programming culture is just as important as developing good tools. Programmers need to realize that the process of writing good, understandable code may be slower, but saves time and money in the long run. As Bruce Schneier notes in this magazine, understandable code leads to more secure and reliable software and more productive programmers.

Changing this culture will not be easy, and may not be realistic, but there is hope on the horizon. One social phenomenon that may help persuade programmers to write better code is the widespread acceptance of open-source software. People naturally pay more attention to what they are doing when others are going to judge the results of their work. Open source means that programmers' code will be potentially reviewed by a huge audience of their peers, and the subsequent social effect may ultimately prove to be the most important consequence of the open-source movement.

Another programming practice that may require significant cultural change is software reuse. Writing code that is easily reusable is a challenging technical problem, and tools and technology play an important role in making software reuse feasible. However, even in situations where reuse is not technically difficult, programmers will sometimes inexplicably avoid it.

For example, Jack Ganssle, an embedded-systems developer who distributes a regular newsletter entitled The Embedded Muse (http://www.ganssle.com/newsletter.htm), recently wrote that surveys show 70 percent of real-time operating systems are custom built, even though at least 80 different commercial real-time operating systems currently exist.

Open-source software provides other examples of reuse not happening when it should. Do a search on any category of software, and you will likely find a number of open-source versions, most of which are at about the same level of stability and provide the exact same functionality. This is ironic, considering that the availability of free source code should, in theory, encourage reuse. Why would anyone write yet another e-mail client, for example, when there are already 20 different open-source versions that all do the same thing?

Perceived costs may be one reason why more developers do not make an effort to reuse code. Egotistical pleasure is a more likely reason. (Remember that we programmers suffer from hubris, despite our laziness and impatience.)

The best way to change this attitude is to make reuse as invisible as possible. This has already happened in many ways. Most of us never write memory allocation or screen-output functions, because standard programming libraries, compilers, and operating systems take care of these for us. Every time UNIX users pipe a series of programs together, or develop a new, single-function program that reads and writes text from standard input and output, these users are practicing reuse, whether they are aware of it or not.

The Expected Unexpected

So far, we have examined trends in software development without making any bold predictions. However, if we take these trends to their logical extremes, surprising changes are likely to occur.

Computer pioneer Doug Engelbart likes to ask, "What would happen if we were all 100 times bigger or smaller?" Many people might initially respond, "Nothing. We wouldn't notice any difference." After all, people have similar physical capabilities -- the ability to walk and eat, for example -- regardless of shape or size.

The problem is that whenever you take any trend to its extreme, there are always unexpected consequences. If we were all 100 times bigger, we wouldn't be able to move, because body mass grows with the cube of linear size while muscle strength grows only with the square. This is why the largest mammals, whales, live in the water. If we were all 100 times smaller, we would all die of thirst, because our mouths would not be able to penetrate the surface tension of water. This is why mosquitoes have needle-shaped mouths.

The most pervasive trend in computers is Moore's Law, which tells us that processor speed and memory capacity increase exponentially over time, while price remains constant. Moore's Law has enabled us to make some fairly accurate, short-term predictions about software. Fifteen years ago, researchers had good algorithms for doing speech recognition and playing chess, but processors weren't fast enough for these algorithms to be practical. However, thanks to Moore's Law, it was easy to predict that 15 years later, processors would be powerful enough to handle these algorithms, and that we would have readily available speech recognition and software that could beat grandmasters at chess.

What happens if we take Moore's Law to its extreme? This is not so easy to predict. Quantum computing is an excellent example of this. As David Cory and Raymond Laflamme explain in this magazine, quantum computers are able to solve previously intractable problems, like factoring very large integers. You could simplistically (and not quite accurately) view quantum computers as really, really fast processors.

If, suddenly, we can factor very large integers in reasonable amounts of time, what are the implications for public key cryptography? The answer is simple: It stops working. Now, what are the implications for electronic commerce, with an infrastructure that relies on public key cryptography? That is a far less obvious and a far more intriguing question.

How these unexpected changes will affect software development in the future is impossible to know. However, once again, Moore's Law gives us some final insight into predicting what this world might be like. Moore's Law is not a description of any natural law, but a reflection of human behavior, and a belief in our abilities. It is in our nature to innovate and to improve, and as long as software and computers play an important role in our lives, we will continue to focus our efforts in these directions. Ultimately, we can be confident in our views of the future, because we are the ones who are creating it.

DDJ