In this report, I'll look over a few recently launched products, as well as review the history of Web servers and examine what is involved in writing a server. Space constraints preclude detailed product examinations and comparisons, but I'll provide enough background and context to guide your own investigations. But first, let's discuss why you'd want a server.
A Web server is the program that sits on the other end of the wire from your Web browser - the distaff side of the client/server equation. The conversation between Web client and server adheres to the HyperText Transfer Protocol (HTTP), the means by which a client requests Web pages (HTML documents) from a server.
Who needs a server? Anyone who is tired of surfing through people's pages containing endless pictures of their pets, possessions, or offspring, and who wants to inflict this same tedium on others. Personal Web pages are the '90s equivalent of home video, except that you don't have to visit someone else's house to fall asleep - you can do so in the comfort of your own home.
In addition to satisfying personal penchants, Web sites can benefit small businesses, allowing them to reach out to customers, and corporate workgroups, allowing them to share information internally - a poor man's Lotus Notes. The World Wide Web's ability to dissolve geographic boundaries is immensely appealing. (At a recent family gathering, I encountered a cousin, known more for his football ability than his computer literacy, who confided that he wanted to use the Internet to get out of the corporate rat race by moving his family to the countryside and opening up a virtual storefront on the World Wide Web to sell sports memorabilia via an online catalog. Presumably, similar scenarios are happening across the country at summer barbeques and beach outings.)
One attractive option for the impatient is to rent space on a commercial Web server. At the moment, the Web space-rental business is thriving. One rent-a-space company, Network Wizards (http://www.catalog.com) of Menlo Park, CA, stopped accepting customers for several months to take a breather and consolidate after a period of rapid growth earlier this year. Their price is certainly attractive: $10 per month, which includes up to 200 MB of data transferred per month (10 cents per megabyte thereafter). This sets you up on a fast machine hooked to the Internet backbone via a T1 line, maintained 24 hours a day by a former member of the Internet Engineering Task Force. Don't rush to pick up the phone, however. Effective July 31st, the company again put a hold on new accounts: "We will spend the next few months adding more features and enhancing our service, and planning for future expansion. We do not have any date for when we might start accepting new accounts again." Not to worry; there are similar companies across the country still looking for customers. Post a query on the comp.infosystems.www.misc newsgroup, and you'll be inundated with responses.
Given these bargain-basement rental rates, why bother running your own server? One reason is that few of the space-rental firms offer the more sophisticated back-end processing (forms and CGI scripts, clickable imagemaps, user authorization, secure credit-card transactions, virtual host domains, and so on) now deemed necessary for commercial ventures. When these features are provided at all, they tend to come in scaled-down form; for example, an input form may generate an offsite e-mail message rather than invoke CGI processing.
Even though the proletarian clientele of CompUSA and other mass-market superstores has become accustomed to the eccentricities of DOS and Windows, the vagaries of UNIX and Linux (such as rebuilding the kernel to add support for a new device, or using vi to edit configuration files) remain mysterious and inaccessible to most.
To address this accessibility gap, several vendors have released packages that bring many of the capabilities of high-end UNIX servers to low-end desktop platforms (various flavors of Windows, plus Macintosh). These products include O'Reilly & Associates Website server (Windows 95 and NT), StarNine Webstar (Macintosh), Quarterdeck WebServer (Windows 3.1), Frontier Technologies SuperWeb (Windows 3.1 and NT), and NetWings (Macintosh). Some of these packages are priced as low as $130 (Quarterdeck).
In addition, there is a slew of higher-end products that run on traditional UNIX boxes and have also been released for Windows NT: Process Software Purveyor, Personal Library PL Server, Quadralay WebWorks, Verity Topic Information Server, and the Netscape Communications and Commerce servers. Even though they "run on Windows," these packages can be considered high-end because of their price tags - up to $15,000 for the Verity software, for instance.
McCool was not alone. Other individuals created servers for similar reasons. For example, Marc VanHeyningen says "I decided to build a server in Perl rather than relying on existing code such as CERN's server in order to maximize the flexibility. If it is easier to write a new server than it is to add functionality to the old one, something seems wrong." VanHeyningen's httpd.pl is about 800 lines of Perl code. His server was taken over by Tony Sanders at BSDI, who enhanced and cleaned up the code, a common refrain in the history of Web servers. Meanwhile, back at CERN, Ari Luotonen assumed responsibility for server development in the fall of 1993 and took the CERN server from version 2.12 to 3.0, growing its codebase from 1500 to 15,000 lines of code.
A small-scale arms race ensued between server authors, adding features such as the CGI interface, server-side includes, access control, better caching, and so on. The CGI (Common Gateway Interface) specification for connecting external programs to the HTTP daemon evolved out of a facility designed by McCool known as "htbin," which was influenced by Plexus's capability for dynamically loading custom modules. Luotonen had a similar facility in the CERN server. Luotonen and McCool hashed out the differences and came up with a joint specification.
John Frank, a mathematics professor at Northwestern University, wrote the WN server as a successor to GN, a gopher server he had written earlier "as sort of a hobby." WN is about 10,000 lines of rather sparse C code, and includes built-in support for full-text search (done on the fly rather than via an index), imagemaps (rather than using an external gateway), server-side includes, redirection of URLs, filters, and ranges. Frank's admitted focus was on flexibility: "An HTTP server should do more than just serve files. It should play an active role in both navigation and presentation." There is also an emphasis on simplicity of operation, avoiding use of CGI. (You can browse through the source code to WN at Frank's Web site, http://hopf.math.nwu.edu.)
Simon Spero, who was affiliated with the heavily trafficked sunsite.unc.edu (over 100,000 hits per day) at the University of North Carolina, wrote the MDMA server as a way of exploring various ideas for getting increased performance, "up to 10 times faster than other servers." Spero used lightweight threads rather than heavyweight processes to service multiple connections, and a more-streamlined Binary Gateway Interface (BGI) rather than the low-bandwidth CGI facility. The C++ code is large, complex, and unreliable, however. Spero's program was only briefly used in production; sunsite is now using NCSA 1.4.
There are many more servers than those listed here: The Yahoo directory lists 50 entries, including packages for DEC/VAX and IBM mainframes, as well as servers written in Tcl/Tk, Lisp, and LPC4. However, deployment is limited, perhaps in some cases only to the author's personal machine.
In response to flames such as "you must be the worst programmer on the planet," Rob McCool wrote a confessional and entertaining explanation entitled "Why HTTPd Sucks." In it, he describes the rationale behind some design and implementation decisions found in the NCSA server. Much of this will be familiar to any experienced programmer: A small-scale project is successful and grows rapidly in ways unanticipated by its author. The codebase reached its nadir in Version 1.3, afflicted by inadequate performance, spaghetti code, cryptic comments, and unfettered global variables. McCool explains that every change made sense to him at the time it was added. The root cause was his lack of experience: "See, my prior experience to HTTPd was school projects (ha) and video games. Nintendo and Sega don't pay extra for documented code." By the time he realized the consequences of his actions, it was too late.
Despite the criticism, Marc Andreessen thought highly enough of McCool's programming skills to bring him from NCSA to Silicon Valley when founding Netscape Communications. (Interestingly, Ari Luotonen from CERN is also there, working on Netscape's proxy server technology.) The Netscape Netsite server appeared six months after the company's founding, and press releases tout the increased performance and enhanced features.
The Netscape server now comes in various flavors. The basic version, now called "Netscape Communications Server," lists for $1495 ($795 for Windows NT). There is also a more sophisticated, secure-transaction-enabled version known as the "Commerce Server," priced at $5000. In addition, the company says: "a support contract must be purchased for each copy of the Server software purchased." This brings the cost of the Communications server to almost $2000, and that of the Commerce server to almost $6000.
In response to the question, "How is Netsite server software different from the NCSA httpd server software?", the company responds with a list of enhancements, beginning with "lightning fast response times and superior handling of peak loads."
However, improving on NCSA 1.3 performance seems to be an easy enough target, because there are so many people doing this, including at NCSA itself. Browsing through the util.c module in the source for the current release (1.4.2), you'll encounter a change to the getline() function, accompanied by the following comment: "[fixing the] single-character-read brain damage, which Rob McCool has described as the worst implementation decision of his entire life." Instead of using a buffered read, as in the stdio function fread(), McCool used a call to read() to get the next character; as you may recall, this is a system call that results in a user-to-kernel-mode transition. It turns out there's a valid, if hack-ish, reason for using read(), but because Mosaic sends over 1.5 KB of headers with every HTTP request, each such request takes over 1500 system calls to complete. Robert Thau of MIT's AI Lab came up with a partial fix, and memorialized McCool's infelicitous decision with a code comment.
Benchmarking server software is trickier than most performance evaluations, and there are few credible studies around. An April 1995 study by NCSA's Robert McGrath compared NCSA 1.3, CERN 3.0, NCSA 1.4, and Netscape Communications Server. The overall finding was that NCSA 1.4, in its default configuration (which uses a technique known as "process preforking," essentially spawning a whole bunch of processes at startup and keeping them around, ready to service requests), was the best-performing server, with roughly double the throughput of NCSA 1.3. The Netscape server (version 1.0, since superseded by 1.1) was "a close second." Given the kernel-mode read() brouhaha, it is not surprising that the Netscape server now makes the fewest system calls of any of the servers, possibly because it uses memory-mapped I/O rather than file read().
One motivation for creating the Apache server is uncertainty about NCSA's licensing of the source code, which is currently in the public domain. Apparently, this may change. (Spyglass, a private company with an exclusive license to NCSA Mosaic code, sublicenses Mosaic to companies such as Spry, Quarterdeck, CompuServe, and the like, and has announced a server product.)
The Apache group firmly states: "Apache is and will be a public domain server." In addition, Apache is currently "much faster" than NCSA 1.3, and "will ultimately be faster" than NCSA 1.4. Although Apache group members are located in different parts of the country, the organization's server is hosted by Organic Online of San Francisco. Many of the founders of Organic were formerly affiliated with HotWired, Wired magazine's popular Web site. Brian Behlendorf, CTO and cofounder of Organic Online, is a prime mover behind the Apache project. Behlendorf was production director of HotWired, which at its debut, offered facilities found nowhere else on the World Wide Web; for example, BBS-style conferencing and user authorization. These features are now finding their way into the mainstream.
Although the Apache software is still in beta, the HotWired site is running it and getting 400,000 hits per day on two Silicon Graphics machines. Another site at the MIT AI Lab gets about 100,000 hits per day.
A recent survey of 13,000 Web users by Georgia Tech's Graphics, Visualization, and Usability (GVU) Center produced some interesting results. The study, conducted in April and May of this year, focused on a number of topics. In the case of HTTP servers, the survey found that the most popular server is NCSA (38.6 percent), followed by MacHTTP (20.8 percent) and CERN (18.5 percent), leaving about 10 percent for all other servers. (The figures for Europe are different: There, CERN takes the number-one position at 34.9 percent.)
The second-place showing of MacHTTP is a bit surprising. Most press accounts, both mainstream and trade, go on at length about large-scale UNIX systems (the past) and Windows-based PCs (the ostensible future, at least for Microsoft-dominated corporate America), giving short shrift to the Mac as an Internet platform. Obviously, the survey was taken before any of the current wave of software was released, but you still wonder how many analysts' prognostications were grounded in real-world numbers.
The Macintosh's significant presence in academia certainly helped it garner second place. It also helped to be an early entrant. According to Chuck Shotton, author of MacHTTP, his was the third server written, after CERN and NCSA.
Shotton's programming experience includes writing Ada programs for spy satellites and space stations. He was assistant director of the academic computing department at the University of Texas-Houston (UTH) when he decided to write a Web server. He says he hacked together MacHTTP "over a weekend," using some TCP/IP code he had lying around. It was originally written in Think C, but Shotton switched to the Metrowerks compiler (along with everyone else, it seems).
MacHTTP is shareware, available on a number of ftp sites, including that of Biap Systems, the company Shotton founded when he left UTH. MacHTTP was recently revamped and reborn as "WebStar," and is now marketed through an exclusive arrangement with StarNine Technologies (Berkeley, CA). Shotton claims that WebStar is about four times faster than MacHTTP, and can handle 500,000 hits per day on a PowerMac 7100. (Two sites running WebStar are http://www.apple.com and http://quicktime.apple.com.) One reason for MacHTTP/WebStar's performance is the use of AppleEvents rather than CGI for back-end processing. AppleEvents allow one application to communicate directly with another and avoid the clumsy manipulations required by CGI. The AppleEvent facility also enables remote administration, allowing the site administrator to monitor multiple server machines simultaneously from a single workstation: examining outputs, memory usage, connection status, and so on.
MacHTTP is the hands-down easiest server software to install and operate, requiring only a single mouse-click to get you on the air.
Apple has demonstrated, however, that it has ample ability to take a competitive advantage (or lucky break) and squander it utterly.
Still, one positive sign is Apple's recent introduction of its Internet Server Solution for the World Wide Web - basically various models of PowerMac with a software bundle that includes Webstar, Netscape Navigator, AppleSearch, Acrobat, Hypercard, and CGI scripts.
One strong reason to go with WebStar (or MacHTTP) is the level of support. There is a large, knowledgeable user base (Shotton claims "66% of the commercial Web server market"), and Shotton is an informative and visible presence on the comp.infosystems.www.servers.mac newsgroup. He also answers the phone at Biap Systems.
If there is a counterpart to Shotton in the Windows world, it has to be Robert Denny, author of WinHTTP, a shareware server for 16-bit Windows, and WebSite, a 32-bit server for Windows 95 and Windows NT, marketed by O'Reilly & Associates. Denny is an equally visible and knowledgeable presence in comp.infosystems.www.servers.windows.
Denny was founder and CEO of Alisa Systems, a company making cross-platform e-mail integration tools. At age 49, he got bored and decided he wanted to write code again. In his spare time, he crowbarred the NCSA source code onto the Windows 3.1 platform. At that time, the NCSA server followed an execution model that used the fork() system call to spawn a new process to serve each HTTP request. As you recall, fork() makes a complete copy of the data in the parent process to pass to the child. This is one area that has been optimized in subsequent NCSA implementations via preforking. Denny chose the alternative approach of migrating to a multithreaded model.
In the NCSA implementation, communication between parent and child occurs via data in global variables. Porting NCSA HTTPd to 16-bit Windows required implementing a thread dispatcher in assembly language, eliminating all global variables and rebuilding the per-transaction data structures. As you can imagine, the original code (which was a bear) was transmogrified into a different kind of animal.
One source of misbehavior was the various TCP/IP stacks for Windows, such as Trumpet and Chameleon. These stacks had been tested primarily as clients (owing to the lack of server software) and were unaccustomed to the level of activity a Web server engenders. Certain bugs shook loose and proved elusive, taking the stack authors as long as a year to fix.
In contrast to WinHTTP, WebSite represents a "totally new" server. Its feature set is based on "thousands of email messages and lots of user requests" regarding Denny's earlier creation. Naturally, Denny takes advantage of the thread facilities provided by the Win32 API, as well as other capabilities found in the new-generation Windows environment, such as the registry. WebSite sports such enhancements as dynamic priority control, which throttles the connection set based on the amount of bandwidth available.
Both WinHTTP and WebSite use the WinCGI interface for passing control to external programs. WinCGI is a Windows-specific interface defined by Denny as a more flexible substitute for CGI. It allows you to do CGI-style processing using Visual Basic - much easier than writing Perl scripts. You can use ODBC and Microsoft Access to access, say, your company's data.
As with the other Web servers mentioned earlier, WebSite has many more features than can be discussed here. One nice touch is the WebView utility (written by Jay Weber and colleagues at Enterprise Integration Technologies) that comes bundled with WebSite. This provides a powerful, graphical means for managing hierarchies of HTML documents. WebSite, which lists for $495, is now being sold in bookstores.
Although graphical user interfaces take much of the difficulty out of installing and administering Web sites, certain issues will never go away. One is the conceptual hurdle behind TCP/IP and related protocols. And if you're on the PC-hardware platform, you may have to suffer through installing a network card, only to have it conflict with your SCSI adapter or internal modem.
At the high end, the Netscape server line looks like high-quality technology from an experienced team. The company has earned its reputation for quality software. If you value support and want to go with a company whose present prospects will make it the Microsoft of the Year 2000, Netscape is a safe choice. In any case, it is easy enough to try out: You can download a 60-day trial version for free by filling out an online form.
On the other hand, if you value control of source code and an open, collaborative development process, and have a repertoire of UNIX skills, Apache and NCSA merit consideration. Lastly, if you are a hacker (or wish to become one), you have seemingly unlimited choices. If you don't like the ones listed in http://www.dobbs.com/dddu/servers.html, what are you waiting for? Isn't the weekend coming up?
Ray Valdes is senior technical editor for Dr. Dobb's Journal and can be contacted at ray@valdes.com