Trends in Operating System Design

Will we gain portability at the expense of performance?

Peter D. Varhol

Peter is chair of the graduate computer science department at Rivier College in New Hampshire. He can be contacted at varholp@alpha.acast.nova.edu.


Over the past several years, we've witnessed a number of trends affecting operating-system design, foremost among them a move to modularity. Operating systems such as Microsoft's Windows NT, IBM's OS/2, and others are split into discrete components, each having a small, well-defined interface, and each communicating with the others via intertask message passing. The lowest level is the microkernel, which provides only essential OS services, such as context switching. Windows NT, for example, also includes a hardware-abstraction layer beneath its microkernel that enables the rest of the OS to run irrespective of the processor underneath. This high level of OS portability is a primary driving force behind the modular, microkernel-based push.

For an example of a modular operating-system architecture, there's no better place to look than QNX Software's QNX operating system. QNX is a real-time OS with a UNIX-like command language. QNX consists of a tiny (around 8-Kbyte) microkernel that handles only process scheduling and dispatch, interprocess communication, interrupt handling, and low-level network services, all of which are accessible through 14 kernel calls. The size and simplicity of the kernel allows it to fit entirely in the internal cache of processors such as the 80486.

A minimal QNX system can be built by adding a process-manager module, which creates and manages processes and process memory. To use a QNX system outside an embedded or diskless system, a file system and device manager can be added. These managers run outside kernel space, so the kernel remains small. For more details, see the accompanying text box entitled "QNX: A Scalable, Microkernel-Based Operating System," as well as "A Message-Passing Operating System," by Dan Hildebrand (DDJ, September 1988).

Likewise, IBM's Workplace operating system (see Figure 1) is based on the Mach 3.0 microkernel, although IBM-specific extensions (developed with the OSF Research Institute) support parallel processors and real-time operations. This implementation counts five sets of features in its core design: interprocess communication (IPC), virtual-memory support, processes and threads, host and processor sets, and I/O and interrupt support.

Process dispatch is in the microkernel, but process scheduling is not. The design goal behind this distinction is to separate policy from mechanism. In this case, dispatch is a core mechanism that need never change, but scheduling is a policy that might. This lets you swap the default scheduler for one that provides stronger support for real time, for example, or for a specialized scheduling policy for nonstandard uses.
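The policy/mechanism split described above can be sketched in a few lines of Python. This is a toy illustration with invented names, not IBM's code: the dispatch loop is the fixed mechanism, and the scheduling policy is just a function that can be swapped out.

```python
# Toy illustration of separating policy from mechanism (invented names):
# the dispatch mechanism never changes; only the policy function does.

from collections import deque

def fifo_policy(ready):
    """Policy: run processes in arrival order."""
    return ready.popleft()

def priority_policy(ready):
    """Alternative policy: always run the highest-priority process."""
    best = max(ready, key=lambda p: p["priority"])
    ready.remove(best)
    return best

def dispatch_all(processes, policy):
    """Mechanism: repeatedly hand the CPU to whichever process the
    policy selects, until the ready queue is empty."""
    ready = deque(processes)
    order = []
    while ready:
        proc = policy(ready)
        order.append(proc["name"])   # "run" the process
    return order

procs = [{"name": "editor", "priority": 1},
         {"name": "pager",  "priority": 3},
         {"name": "logger", "priority": 2}]

print(dispatch_all(list(procs), fifo_policy))      # ['editor', 'pager', 'logger']
print(dispatch_all(list(procs), priority_policy))  # ['pager', 'logger', 'editor']
```

Swapping in a real-time discipline would mean writing one new policy function; the dispatch mechanism is untouched, which is the point of the design.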

Above the microkernel, IBM implements personality-neutral services (PNSs) that implement a policy rather than a mechanism, and run outside kernel space. Memory management, for instance, is divided between the microkernel and a PNS. The kernel itself operates the paging functions of the CPU. The pager, operating outside the kernel, determines the page-replacement strategy--that is, which pages will be removed from memory to accommodate a page brought in as a result of a page fault. The pager implements a policy, and the policy can be changed through the use of an alternative pager. IBM is providing a default pager to boot Workplace OS, but the primary paging mechanism is actually the file system, which provides memory-mapped file I/O, caching, and virtual-memory policies, combined.
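The same split applies to paging: the fault-handling mechanism stays fixed, while the page-replacement policy lives in a replaceable pager. The Python sketch below uses invented structures (it is not Workplace OS code) to show two interchangeable policies driving one mechanism.

```python
# Toy sketch of a replaceable pager (invented names, not Workplace OS):
# the mechanism handles faults; the victim-selection policy is pluggable.

class FIFOPager:
    """Policy: evict the page that has been resident longest."""
    def choose_victim(self, resident):
        return next(iter(resident))   # dicts preserve insertion order

class LRUPager:
    """Alternative policy: evict the least recently used page."""
    def choose_victim(self, resident):
        return min(resident, key=resident.get)

def access(page, resident, frames, pager, clock):
    """Mechanism: on a fault, ask the pager which page to evict."""
    faults = 0
    if page not in resident:
        faults = 1
        if len(resident) >= frames:
            victim = pager.choose_victim(resident)
            del resident[victim]       # page out the victim
    resident[page] = clock             # record time of (re)use
    return faults

def run(trace, frames, pager):
    """Count page faults for a reference trace under a given pager."""
    resident, total = {}, 0
    for clock, page in enumerate(trace):
        total += access(page, resident, frames, pager, clock)
    return total

trace = ["A", "B", "C", "A", "D", "A", "B"]
print(run(trace, 3, FIFOPager()))   # 6 faults
print(run(trace, 3, LRUPager()))    # 5 faults
```

Changing the system's paging behavior means substituting a different pager object; the fault-handling mechanism is never recompiled.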

PNSs include not only traditional OS services (such as the file system and device drivers), but also networking and even database engines. Behind this strategy is IBM's belief that placing application-oriented services such as these close to the microkernel can improve the efficiency of data transfers and queries. Third-party database vendors such as Oracle can then embed database engines as PNSs to improve performance or make more-direct use of kernel services.

The third layer of modules, closest to the user, is composed of individual personalities. A "personality" is the appearance and behavior of an operating system from the standpoint of the end user. OS/2 can be one personality, Windows another, UNIX a third. The personality looks like the operating system, and system services behave in the expected manner, but many of the services are actually implemented at the PNS level, differently than in the original OS. IBM has demonstrated a UNIX personality, which was simply the entire OSF/1 image running on top of Mach.

Objects and Distributed Computing

Another major trend is objects finding their way into operating systems. The primary characteristic of objects that makes them worth using in an operating system is encapsulation. This makes possible, for example, object-embedding technologies such as Microsoft's object linking and embedding (OLE) that would have been difficult (if not impossible) using a file-based data paradigm.

Objects and message passing go hand in hand. In a classic object-oriented system, messages carry data objects along with instructions on what to do with that data. In an OS, message passing helps modularize the operating-system architecture, since the transfer of data is not dependent upon having a function to call.
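The idea that a message carries both data and an instruction about what to do with it can be shown in miniature. All names in this Python sketch are invented; the point is only that the sender needs no function to call, just a selector and a payload.

```python
# Minimal sketch of message passing to an object (invented names):
# the message names an operation; the receiver decides how to act.

class FileObject:
    def __init__(self, text=""):
        self.text = text

    def handle(self, selector, payload):
        """Dispatch on the selector carried inside the message."""
        handlers = {"append": self._append, "length": self._length}
        return handlers[selector](payload)

    def _append(self, payload):
        self.text += payload

    def _length(self, _):
        return len(self.text)

def send(obj, selector, payload=None):
    """Deliver a message; no direct function call crosses the boundary."""
    return obj.handle(selector, payload)

doc = FileObject()
send(doc, "append", "hello ")
send(doc, "append", "world")
print(send(doc, "length"))   # 11
```

Because the sender and receiver share only the message format, either side can be replaced, or moved to another process or machine, without the other noticing.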

Operating systems such as QNX and Windows NT already use message passing, at least to some extent. Message passing in NT supports networking as well as security. For example, the security gateways check every system message to ensure that the user has the privileges to send that message. Consequently, data and instructions are under better control than in a traditional OS.

Among the emerging object technologies are IBM's System Object Model (SOM) and Distributed System Object Model (DSOM), Microsoft's Component Object Model (COM), the Object Management Group's Common Object Request Broker Architecture (CORBA), NeXT's Portable Distributed Objects (PDO), and Taligent's Taligent Operating Environment (TOE).

Performance is an Issue

One question that's hounded message-based operating systems from the start is performance. Does communicating with different components through message passing--as opposed to straight function calls--hurt performance? It clearly can (although QNX claims that its message-passing architecture offers performance comparable to that of traditional architectures). In object-oriented languages such as Smalltalk, vendors claim decent but hardly stellar performance for message passing. Whether the OS queues messages, or whether a message blocks until the recipient executes a receive (as in QNX), it is easy to see that this mechanism can be slower than a function call.

These new operating systems use a variety of techniques to improve message-passing performance. One common approach, used by IBM, is a shared memory space, so that data doesn't have to be copied from one memory address to another. However, the two processes must still establish a connection before the shared-memory approach can work. Because the exchange remains a two-step process (connect, then exchange), it is still more time consuming than a straight procedure call.
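The copy-versus-share distinction can be modeled in a few lines of Python. This is a deliberately simplified illustration with invented names: copying duplicates the payload on every send, while shared memory pays a one-time connection cost and thereafter both sides reference the same buffer.

```python
# Toy model of message copying vs. shared memory (invented names).

def send_by_copy(payload):
    """Each message is duplicated into the receiver's address space."""
    return bytes(payload)            # a fresh copy on every send

class SharedRegion:
    """Step 1: both processes connect to a single region."""
    def __init__(self, size):
        self.buf = bytearray(size)

def send_by_sharing(region, payload):
    """Step 2: the sender writes in place; nothing is copied out."""
    region.buf[:len(payload)] = payload
    return region.buf                # receiver sees the same memory

msg = bytearray(b"page of data")
copied = send_by_copy(msg)
region = SharedRegion(64)
shared = send_by_sharing(region, msg)

print(copied is msg)        # False: a second copy of the data exists
print(shared is region.buf) # True: one buffer, two views
```

The `SharedRegion` setup is the "connect" step the text describes; once it exists, repeated exchanges avoid the per-message copy, which is where the savings come from.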

Windows NT takes this one step further, with a special implementation of the local procedure call for Win32 applications called the "quick LPC." This technique opens one port to establish the connection between processes, then passes multiple messages through a shared memory space without the need to send additional messages through the port. However, there is a trade-off: NT assigns a thread to every instance of the quick LPC, and this uses up system resources.

Another performance issue revolves around memory utilization. If higher-level OS services run in user space, as they do with QNX and Workplace OS, there's a trade-off between memory efficiency and speed. Kernel processes cannot be swapped out to disk, while user processes can. This means that an OS that relies on user processes may run in less memory, at the expense of speed. One solution to this is a configurable kernel. The next release of OSF/1 will let system administrators determine whether to run large parts of the OS in kernel or user space. Thus, you'll be able to tune the OS for specific needs.

Overall, reduced performance may be a consequence of the direction operating systems are taking, as has been true over the last few years with windowed systems. One alternative is to give priority to a particular operation at the expense of others, as Windows NT does with I/O and context switching.

Conclusion

The modularity of emerging operating systems will not be very noticeable to application programmers in the near future. Most of us will be programming on commercial versions with most of the major building blocks built in. There will probably be about the same number of APIs, although it may be important to know which module a particular API applies to.

The benefits will be primarily indirect. The unified approaches to OSs, for example, mean that porting applications will be easier. Particularly with IBM's multiple personalities, the OS issue may not even matter, as long as the CPU is the same.

The big change will come with objects. Both OLE and Apple's OpenDoc (a compound-document architecture designed for sharing text, graphics, and video objects across operating systems) will require that developers understand and adhere to the underlying object model so that they can take advantage of hot links between the data objects in compound documents. Applications will have access to OS services that will fundamentally change how we view data.

The bad news is that there are competing object models. OpenDoc includes support for Microsoft's OLE 2.0 spec, so an OLE application should work with an OpenDoc operating system, but not vice versa. Other object models will have their own ways of doing things. Multiple-personality systems such as the Workplace OS will ease some of the learning curve, but versatile programmers will have to know not only C++ and objects, but how multiple operating systems use them.

Figure 1: IBM's Workplace operating system is based on the Mach 3.0 microkernel architecture.

QNX: A Scalable, Microkernel-Based Operating System

The operating system of the future may best be modeled by QNX Software's QNX, a 32-bit multitasking OS that utilizes a tiny microkernel. QNX takes a modular approach to services that lets you choose only those services necessary for a particular use. QNX is not an implementation of UNIX, despite its UNIX-like command language and POSIX compliance. It is a separate and distinct operating system from the ground up, and it uses technologies just now starting to come into the mainstream.

The heart of QNX is its microkernel, which implements interprocess communication, low-level network services, process scheduling, and interrupt dispatching; see Figure 2. Process scheduling is real time with preemption, and scheduling is prioritized with round-robin, FIFO, and adaptive-scheduling disciplines. All kernel services are available through 14 APIs, so the ways to access the kernel services are limited.

QNX is a message-passing operating system that utilizes blocking versions of Send, Receive, and Reply function calls. Messages don't queue--the message facility is a process-to-process copy, which QNX claims provides performance comparable to function calls. You can construct your own message queues using built-in messaging primitives.
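The blocking Send/Receive/Reply rendezvous can be sketched with Python threads. The class and method names below are hypothetical, not the actual QNX API: the sender blocks until the server has both received the message and replied, and nothing is queued on its behalf.

```python
# Thread-based sketch of QNX-style synchronous messaging (hypothetical
# names): Send blocks until Reply; the kernel queues nothing.

import threading
import queue

class Channel:
    def __init__(self):
        # A one-slot rendezvous point, not a message queue.
        self._msgs = queue.Queue(maxsize=1)

    def send(self, msg):
        """Client side: blocks until the server replies (SEND/REPLY-blocked)."""
        done = threading.Event()
        slot = {"msg": msg, "done": done, "reply": None}
        self._msgs.put(slot)   # blocks if the server hasn't Received yet
        done.wait()            # remains blocked until the server Replies
        return slot["reply"]

    def receive(self):
        """Server side: blocks until a client sends (RECEIVE-blocked)."""
        return self._msgs.get()

    def reply(self, slot, answer):
        """Server side: deliver the answer and unblock the sender."""
        slot["reply"] = answer
        slot["done"].set()

ch = Channel()

def server():
    slot = ch.receive()
    ch.reply(slot, slot["msg"].upper())   # a trivial echo service

threading.Thread(target=server).start()
print(ch.send("hello"))   # HELLO
```

Because the exchange is a direct process-to-process handoff, a message queue, when needed, is something you build on top of these primitives rather than something the kernel maintains for you.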

However, the microkernel does not include a process manager, device managers, or a file system. The process manager, Proc, provides services such as process creation and accounting, memory management, inheritance, and pathname-space management. Together, the kernel and Proc provide the features necessary to implement a bare-bones operating system. Fsys (the file-system manager) and Dev (the device manager) can be added for more robustness. Like other QNX processes, device drivers run in user space, but use a specific API to access a kernel interrupt vector.

The networking manager (Net) is an optional component tied directly into the microkernel. There is a private interface between the kernel and the network manager, so that any messages passed from a local to a remote process are queued to the network manager. Net manages the sending and receiving of messages, essentially merging the microkernels on different nodes into a single, virtual microkernel.

The message-passing architecture, combined with networking services, produces a seamless, distributed system. From the standpoint of user processes, there is no difference between a local call and a call across the network. Likewise, all services above the microkernel are transparently accessible to all processes, whether or not they are local. For data acquisition, QNX can use a private connection between microkernels on a network. This lets you mirror a data-acquisition process without generating traffic on a network being used for other activities.

QNX can be extended. New modules can be developed in user space and debugged at the source level while still providing services normally associated with the kernel. QNX claims that customized OS services can be easily developed by application programmers. Because of the small number of APIs in the kernel and the limited number in the other QNX-provided components, the QNX learning curve isn't as steep as UNIX's.

The QNX microkernel consists of 605 lines of source code. A complete implementation of all the services necessary for process management, device management, the file system, and networking is under 16,000 lines. QNX also conforms to POSIX 1003.1, 1003.2 (shell and utilities), and 1003.4 (real time). With POSIX compliance and a similar command-line interface, is it possible to use QNX in place of UNIX? From my own experiments, the answer appears to be yes, at least in some circumstances. QNX Software is not positioning QNX as a general-purpose operating system, but there's no reason why it can't be used for almost any purpose.

--P.D.V.

An Interview with Linus Torvalds, Creator of Linux

Sing Li

Sing, a products architect with microWonders in Toronto, specializes in embedded-systems development, GUI portability, UNIX system programming, and device drivers. You can contact him on CompuServe at 70214,3466.


Linus Torvalds is a student at the University of Helsinki (Finland) working towards a master's degree in computer science. In 1990, he took an operating-systems course on UNIX and C and became hooked on OS design. Linus wanted to make his 386 PC function like the Sun workstations at the university. What started out as a protected-mode utility posted on the Internet eventually resulted in Linux, a widely popular 32-bit, protected-mode, preemptive multitasking operating system that runs on 386 PCs.

The Linux project now involves hundreds of programmers worldwide. It is available at ftp sites around the world, the most popular distributions being the MCC (Manchester Computer Center) in England and SLS (Softlanding Linux System) in Canada. The full distribution consists of kernel sources, C and C++ compilers, man pages, basic utilities, networking support, the X Window System, XView/OpenLook, DOS emulators, and much more. A comprehensive list of Linux distribution sites for downloading, as well as related information, is available electronically (see "Availability," page 3).

Linux supports an unlimited number of concurrent users. Each application runs in its own protected address space, greatly reducing the chance of system crashes brought on by ill-behaved applications. Applications on Linux can make use of either static or dynamically linked libraries.

Virtual memory is supported through demand paging, and up to a total of 256 Mbytes of usable swap space can be configured. Executables are demand loaded, which ensures efficient memory usage as well as better system performance. The memory manager supports shared executable pages with copy-on-write. There is a common memory-cache pool for both system and application use, which ensures that memory is best utilized wherever it is needed.
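Copy-on-write, mentioned above, is easy to model. In this Python sketch (invented structures, and a real kernel would track sharing with reference counts rather than the comparison used here), a fork shares all page frames, and a frame is duplicated only when one side writes to it.

```python
# Toy model of copy-on-write page sharing after a fork (invented names).

class Frame:
    """A physical page frame holding some data."""
    def __init__(self, data):
        self.data = data

class AddressSpace:
    def __init__(self, frames):
        self.pages = list(frames)   # page table: page number -> Frame

    def fork(self):
        """Share every frame with the child; copy nothing up front."""
        return AddressSpace(self.pages)

    def write(self, page, data, other):
        """Copy the frame first if it is still shared with `other`.
        (A real kernel uses reference counts; this is a simplification.)"""
        frame = self.pages[page]
        if frame is other.pages[page]:
            frame = Frame(frame.data)    # the copy happens only now
            self.pages[page] = frame
        frame.data = data

parent = AddressSpace([Frame("code"), Frame("heap")])
child = parent.fork()
print(parent.pages[0] is child.pages[0])   # True: shared after fork

child.write(1, "child heap", parent)
print(parent.pages[1] is child.pages[1])   # False: copied on write
print(parent.pages[1].data)                # heap (parent unchanged)
```

The payoff is that a fork costs almost nothing, and pages that are never written (such as program code) are never duplicated at all.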

Kernel support of networking is included for TCP/IP, both over standard Ethernet hardware and over asynchronous lines (via SLIP, serial-line Internet protocol). The operating system supports various national or customized keyboards. The PC console can act as multiple virtual terminals under Linux, using hot-key switching. Each virtual terminal acts independently and can be in either graphic or character mode.

I recently linked up with Linus over the Internet and asked him about the history (and future) of Linux.

SL: What was your motivation behind building Linux?

LT: I bought my first PC clone in early '91, and while I didn't want to run MS-DOS on it, I couldn't afford a real OS for it either. I ended up buying Minix, which I knew of from an OS course, and while it wasn't really what I hoped for, I still had a reasonable UNIX clone on my desk.

LT: Linux didn't start out to be an operating system: I just played around with the hardware to learn about the new machine, and found the memory management and process switching of the 386 especially interesting. After tinkering for a few months, my small project eventually became something that looked more and more like an OS. So I decided I wanted to create something that I could use instead of [running] Minix on my machine.

When I decided to create my own OS, compatibility became a major factor. I wanted to write just the OS: I didn't want to rewrite every program under the sun. That is still very much true, and Linux seems to be one of the easier UNIXs to port things to--it's a good mix of POSIX/SysV/BSD/SunOS4. My search for the POSIX documentation also got me in touch with arl, who was later to create the Linux directory on nic.funet.fi, the site where I still release my kernels.

SL: With commercial flavors of UNIX, standards are a hotly debated topic. What's your viewpoint on standards compliance?

LT: Simple adherence to standards isn't the Linux way. I (and others) have tried to make the system as usable as possible, and added some features just because they were interesting. I've strived for a simple and clean design within those constraints--at least as long as it's efficient. (I hate inefficient code and still fall back to checking the compiler output every now and then.)

SL: Since Linux seems to run almost everything that plain-vanilla UNIX will, exactly how different is the internal architecture between commercial UNIX and Linux?

LT: Well, the basic design has similarities: The kernel is monolithic, and processes aren't forcibly preempted while in kernel mode. So the architecture per se doesn't necessarily differ too wildly, but the actual code is likely to be rather different.

SL: Tell us about your programming style when dealing with developing a multitasking OS which runs a wide variety of software on a variety of hardware configurations.

LT: I try to avoid subtle code: If it isn't obvious what a routine does, it's likely to be buggy (or become so after a few changes). The way the scheduling works is rather hard to follow at times, and some of the file-system checks can seem incomprehensible unless you know what is happening. (I dislike locking, so the file-system code has to be very careful in order to avoid race conditions.) One of my personal favorites may be the select() code, which is definitely not obvious, but avoids races in interesting ways.

One of the most challenging aspects has been the wide variety of PC hardware: Drivers which work on most machines can fail subtly on others. Linux has good support for different kinds of hardware, but it has in some cases been a real trial to get it all to work, and there are still occasionally reports of machines that simply don't work correctly with Linux. It can be rather frustrating at times.

SL: What's in the future for Linux?

LT: I expect to continue working on it the same way I have so far: no real long-term planning, only a general idea about what I want to have. I, personally, have been handling only the actual kernel for a long time now, and I expect to continue with that: I hope others will find interesting projects in Linux (both in the kernel and in user space), as they have so far. I hope the Windows-emulation project will work out, along with the iBCS2 ("real i386 unix" binary compatibility) project: Those will open up new user areas when they arrive.

Figure 2: The QNX microkernel.

A Conversation with E. Douglas Jensen

Michael Floyd

Doug Jensen, technical director for real-time computer systems at Digital Equipment Corporation (DEC), has had a long career developing real-time systems. While an associate professor at Carnegie-Mellon University (CMU), Jensen developed the notion of a decentralized OS and created the Alpha OS kernel. Jensen's technology is now incorporated in DEC's Libra OS kernel. I recently spoke with Jensen about the use of microkernel technology in real-time operating environments, its benefits, and its future.

DDJ: How much of your work on the Alpha OS kernel is embodied in the Libra OS architecture?

EDJ: The Libra OS architecture embodies my understanding and experience from over 27 years of research and advanced-technology development in real-time computers and operating systems. My Alpha OS kernel at CMU is one of the primary intellectual progenitors of the Libra OS. Another is the Mach 3 kernel, which forms the commercial and standards context for the Alpha and other new real-time OS technologies in Libra. The concepts of distributed threads, time-value functions, and best-effort scheduling are based directly on extensions of Alpha kernel functionality.

DDJ: You say you've created a new paradigm for resource management in real-time systems. Describe this paradigm and tell us its relevance to other microkernels.

EDJ: The Libra OS architecture reflects the important expansion of real-time computing from its roots in small scale, centralized, low-level, sampled-data subsystems. Many real-time computing systems are becoming more complex and decentralized as they move up in the application-control hierarchy. But most traditional real-time concepts and techniques don't scale up. These small-scale ideas include hard deadlines as the only kind of computation-completion timeliness constraint, the requirement for application programmers to somehow map computation-completion time constraints onto fixed priorities, and the limitation of the real-time OS's responsibility for computational timeliness to starting the highest-priority computation as quickly as possible. These notions all are based on the pretense that a system can be deterministic, which is an oversimplification that usually works adequately in small scale but not in the large--much as Newton's "law" of gravity was revealed by relativistic physics to be a small-scale simplification of space-time curvature.

Libra's real-time paradigm is a generalization of the traditional concepts and techniques which allows the domain of real-time computing to encompass larger-scale, more-dynamic, more-decentralized applications. For example, time constraints can be expressed in terms of the benefit a computation provides, as a function of the time that computation completes execution. Libra OSs accept responsibility for adaptively managing resources according to those time constraints, to attain the best system timeliness possible under the current conditions. And Libra does this on an end-to-end basis across physically dispersed computing nodes. In contrast, commercial real-time OS and executive products are centralized--"distributed real-time" systems are actually non-real-time networks of centralized real-time nodes, without OS-enforced end-to-end timeliness.
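The time-value idea Jensen describes can be sketched concretely. In this Python illustration (all numbers and names invented, and exhaustive search standing in for a real best-effort algorithm), each job's benefit is a function of its completion time, a hard deadline is just a step function, and the scheduler picks the ordering that accrues the most total value.

```python
# Toy sketch of time-value-function scheduling (invented numbers):
# benefit depends on completion time, not just on meeting a deadline.

from itertools import permutations

def step_value(deadline, value):
    """A hard deadline as a special-case time-value function."""
    return lambda t: value if t <= deadline else 0.0

def decaying_value(soft_deadline, value, decay):
    """A softer constraint: value tapers off past the soft deadline."""
    return lambda t: value if t <= soft_deadline else max(
        0.0, value - decay * (t - soft_deadline))

def accrued_value(jobs, order):
    """Total benefit when jobs run back-to-back in `order`."""
    t, total = 0.0, 0.0
    for i in order:
        t += jobs[i]["cost"]        # execution time elapses
        total += jobs[i]["tvf"](t)  # benefit depends on completion time
    return total

def best_effort_schedule(jobs):
    """Exhaustive search over orderings (fine for a toy example;
    a real scheduler would use a heuristic)."""
    return max(permutations(range(len(jobs))),
               key=lambda order: accrued_value(jobs, order))

jobs = [
    {"cost": 2.0, "tvf": step_value(deadline=2.0, value=10.0)},
    {"cost": 1.0, "tvf": decaying_value(3.0, 6.0, decay=2.0)},
    {"cost": 3.0, "tvf": decaying_value(4.0, 8.0, decay=1.0)},
]
order = best_effort_schedule(jobs)
print(order, accrued_value(jobs, order))   # (0, 1, 2) 22.0
```

Under this formulation, a fixed-priority scheduler is just one degenerate policy; the scheduler above instead maximizes accrued benefit, which is the "best effort" notion the interview refers to.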

DDJ: What benefits do microkernels currently offer, and how will they evolve over, say, the next six years?

EDJ: The real-time application domain implies that it is no longer possible for one or two kinds of real-time OSs--a small real-time executive and a full-function real-time UNIX, for instance--to meet user needs. It even appears that a general-purpose, real-time distributed OS may be theoretically impossible. The only feasible solution may be a modular OS that can be configured to meet the needs of particular real-time applications; microkernels will facilitate this structure. The classical layered organization of OSs and system and application software will relax to more of a "depends on" hierarchy of distributed objects. A modular OS is more than an unconstrained collection of building blocks--for manageability, it requires an OS architecture specification which all these different configurations comply with. First-generation microkernels exist today, but this kind of modular OS--real-time or not--is still in the research stage.


Copyright © 1994, Dr. Dobb's Journal