APPLYING WORKSTATION TECHNOLOGY TO CASE

Large-scale programming power through a distributed development environment

David B. Leblang

David B. Leblang is a senior engineer at Apollo Computer Inc. He has been with Apollo since 1982, where his interests include software configuration management, computer-aided software engineering, and distributed computing systems. He received an M.S. in computer science from Boston University and an S.B. in computer science from M.I.T. Formerly, he worked at DEC Research. He is a member of the IEEE Computer Society and a member of the ACM.


To manage the complexity of a large software development effort, a project team requires many specialized tools in order to finish on time and within budget. A project would likely use requirements analysis and design tools, document editors, coding and debugging tools, project management and cost estimation tools, and a configuration management system. These tools must work together to form an integrated project support environment. In addition, since large projects are distributed over many computer systems, these tools must also recognize and support a distributed development environment.

Networked workstations, with distributed operating environments, provide the computing and graphics power necessary for sophisticated CASE tools while also providing the high degree of data sharing needed by a large development team. By combining the techniques of object-oriented programming and distributed systems programming, a new class of high-productivity CASE solutions can be built that take full advantage of the graphics, network, and computation power of modern workstations.

This article describes some general problems encountered in large development projects and some ways the power of distributed computing and the integration of CASE tools can be used to overcome them.

Large-Scale Problems

Large-scale development has particular characteristics not usually found in smaller efforts. In large efforts the system architects and designers are often outnumbered by programmers, technical writers, and sometimes even by marketing personnel. Although the designers may also write code, outline documentation, and help with market brochures, the bulk of the work will be done by people who do not have the designers' intimate insight into the operation of the software. In fact, with very large applications there may be no single person who understands the entire system.

Under these conditions the only way to succeed is to write down the requirements, functional specifications, and design in such a clear and unambiguous way that many small teams can proceed independently and still have their pieces fit together later. The high level of common understanding achieved through a common set of requirement and design documents makes the coordination of individual development teams possible.

Large-scale efforts also require version control and configuration management. During the development of a large project, the requirements, design, code, and other objects change as the project progresses. For example, in a recent project using Apollo's DSEE configuration management system, 20 different people made 3,600 changes in 18 months to a relatively small source code library (100,000 lines); the average code module now has 40 versions. The source library was only one of approximately 50 libraries used to build the product. Some libraries had more than ten variations of the code being worked on at the same time, and one particular library, shared by many projects, was accessed by 330 different people in the last year. Obviously, a powerful version control and configuration management system is needed.
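The bookkeeping behind such a system can be illustrated with a small sketch in modern Python (module and author names here are hypothetical, and a real system like DSEE records far more about each version than a number and an author):

```python
# Toy sketch of per-module version histories: each check-in records an
# author and assigns the next version number. Names are hypothetical.
class ModuleHistory:
    def __init__(self, name):
        self.name = name
        self.versions = []          # list of (version_number, author)

    def check_in(self, author):
        version = len(self.versions) + 1
        self.versions.append((version, author))
        return version

    def latest(self):
        return len(self.versions)

library = {}                        # module name -> ModuleHistory

def check_in(module, author):
    history = library.setdefault(module, ModuleHistory(module))
    return history.check_in(author)

check_in("parser.c", "alice")
check_in("parser.c", "bob")
print(library["parser.c"].latest())   # 2
```

Multiply this by 50 libraries, hundreds of users, and parallel variants of the same module, and the need for automated, concurrency-safe configuration management becomes clear.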

Large-Scale Solutions

CASE tools evolved from the need to solve the unique problems of large-scale development efforts. These tools generally fall into one of three categories: development tools, process tools, and integration tools.

CASE development tools are oriented toward particular phases in the software life cycle (see U.S. Department of Defense, 1985), including the definition of requirements, design analysis, coding, testing, debugging, and more. CASE process tools are oriented toward activities that are common to all life cycle phases (for example, version control, configuration management, and project management).

The distinction between development tools and process tools is not always clear. CASE integration tools maintain abstract relationships between objects managed by individual CASE tools. (For example, relating a requirement specification to the design modules that satisfy the requirements, or relating a design module to the source code modules that implement the design.) Integration tools help to bridge the gap between life cycle phases and to support the tracing of requirements and project management.

Mechanisms that can relate the CASE objects managed by different tools are a current hot topic in the CASE community. A discussion of the issues and some possible solutions is presented later in this article.

Workstation CASE Tools Versus PC CASE Tools

Some CASE tools run on PCs (for example, requirements and design editors), but these tools tend to be oriented toward single users. They do not address the issues of large-scale distributed development. For example, workstation CASE tools tend to address the problems of version control, distributed and concurrent access to objects, and the protection of objects. Large projects will define thousands of design and code objects, as well as other objects, which will be distributed across many machines. Unlike PC CASE tools, workstation CASE tools must operate in a distributed environment in order to manage their objects and the relationships of those objects.

The software architecture of these tools is usually quite different from that of a single-user tool. The workstation tools often run as multiple processes; one process may act as a server that provides access to a database while many users run client processes that talk to the database server. Another architectural difference is that the tools are written to run in a very large virtual address space, rather than being squeezed into a small physical address space. The larger displays available on workstations are used to provide more sophisticated user interfaces.

Although PCs are evolving toward larger address spaces, larger displays, and better support for networking, the basic single-user philosophy still persists. In some sense a workstation is a very high-end PC with a network connection. The CASE applications that take advantage of the high-end features will be more useful in a large project environment.

A "CASE workstation" today usually runs Unix and is configured with a 1-to 7-MIP CPU, around 8 Mbytes of memory, about 200 Mbytes of hard-disk storage, and a 19-inch, 1280-by-1024 bitmap display with multiple windows. These systems are always connected to a 10- to 12M-bit network, and they generally cost between $5K and $10K. They are approximately five to ten times as powerful as an entire timesharing system from the late 1970s. By the early 1990s, workstations will have performance exceeding 100 MIPs and a network bandwidth of over 1OOM-bits.

Workstations are used in large development efforts such as aerospace systems, telephone switching systems, automotive electronics, and other projects involving hundreds of people and millions of lines of code. They are also used by much smaller projects that require good team coordination.

The challenge for builders of workstation CASE products is to exploit the hardware advances that are producing faster CPUs, faster networks, and larger displays, and to build software products that increase individual and group productivity.

Network Computing Technology

A large development team, perhaps as large as 300 to 500 people building several million lines of code, consists of many members who need to share code, designs, documents, and other types of objects. A transparent distributed file system enables a user to access a file stored on any machine in the network as if it were a local file. A workstation environment with transparent distributed file access (see Leach et al., 1983) allows team members to have their own dedicated computers and still share a single (very large) file system. This combines the best of time-sharing and personal computing. It increases productivity by eliminating the error-prone process of maintaining multiple copies of files (and of trying to keep them synchronized by exchanging floppies). Any project that is large enough to have sources and documents on multiple disks will benefit from a transparent distributed file system.

A network computing environment allows users to share computing power as easily as a distributed file system allows them to share files. A network computing environment also makes it easy to write client/server-based tools. This is important for CASE tools supporting large development efforts because the tools need to coordinate their access to such shared resources as a database. Apollo's Network Computing System (NCS) (see Dineen et al., 1987) is a portable implementation of a network computing environment that runs on Unix, MS-DOS, VAX/VMS, and other systems. It enables different types of machines, from a 16-bit IBM PC to a 64-bit Cray supercomputer, to share computations as well as data. This allows a network of computers to function as a single large computer by providing heterogeneous distributed computing services (see Notkin et al., 1987).

Key facilities provided by NCS include remote procedure call, service-location brokerage, and data conversion. Remote procedure call (RPC) is a mechanism that lets an application make an ordinary-looking procedure call that executes on a remote machine. Expensive operations like image rotation and database queries can be executed on special-purpose servers, and in addition, several operations can be executed in parallel on different machines. Service-location brokerage acts like a "yellow pages," enabling a client program to find a service on the network. Automatic data conversion smooths the differences between data representations on the various machines, making it easier to exchange parameters.

NCS also provides an interface definition language and stub compiler. These help the user define the client/server interface and generate the code that converts ordinary procedure calls into message-passing operations.
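The division of labor between client and server stubs can be illustrated with a short sketch, using an in-process function call to stand in for the network transport (the procedure names are hypothetical, and real NCS stubs also perform service location and data conversion):

```python
# Sketch of what RPC stubs do: the client stub marshals arguments into
# a message; the server stub unmarshals them and calls the real procedure.
import json

def server_dispatch(message):
    # Server-side stub: unpack the request and invoke the named procedure.
    request = json.loads(message)
    procedures = {"add": lambda a, b: a + b}
    result = procedures[request["proc"]](*request["args"])
    return json.dumps({"result": result})

def remote_add(a, b):
    # Client-side stub: an ordinary-looking procedure call that actually
    # packages its arguments as a message and "sends" it to the server.
    message = json.dumps({"proc": "add", "args": [a, b]})
    reply = server_dispatch(message)      # the transport stands in here
    return json.loads(reply)["result"]

print(remote_add(2, 3))   # 5
```

An interface definition language lets a stub compiler generate both halves of this marshaling code automatically, so the application programmer writes only the procedure and its ordinary-looking call.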

Integration of CASE Tools

A large project will use many different CASE tools, each with its own user interface, database of objects, and function. Users want all of the tools they use to work together. The vagueness of this statement reflects the fact that there are actually several different problems that need solving.

"CASE integration" is a general term that refers to a set of solutions for these problems. There is no widely accepted CASE integration solution at present; however, there are many proposed solutions and partial solutions. The goal of a CASE integration facility is to provide a substrate for the CASE tools of many vendors.

One common problem is the portability of CASE tools: a project wants to use several different CASE tools but can't unless they all run on a common platform. The widespread use of Unix for workstations is helping with the portability problem. In addition, a variety of portable CASE tool interfaces are being proposed; these would be layered on top of existing operating systems. Unix itself is not quite standard, so there are several efforts underway to standardize a version of it.

Since most CASE tools use graphics, a portable graphics interface is also needed. The X Window System graphics standard has rapidly gained acceptance. X does not, however, deal with the problem of variations in user interfaces, which can cause learning-curve discrepancies among users. A variety of "look and feel" standards are emerging on top of graphics standards to address this problem. These standards attempt to define the way menus and the mouse operate in general.

The ability to exchange bulk data between CASE tools, as CAD design tools exchange data with CAD board layout tools, has led to the acceptance of the EDIF standard for information interchange. Extensions to EDIF and other data exchange standards are under discussion.

Many workstation CASE tools have adopted an object-oriented user interface; that is, they represent their objects (project tasks, design modules, documents, and so on) as graphics symbols on the screen. Users interact with the tool by pointing at objects and clicking mouse buttons. Design and project management tools tend to be bubble- and arc-oriented. The bubbles on the screen represent objects, and arcs represent relationships between the objects. For example, if one task must be completed before another task can be started, a project management system may indicate this by drawing an arc between a bubble that represents the first task and another bubble that represents the second. Users can navigate among the bubbles, create new bubbles, and perform various types of analysis on the bubbles and arcs.
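The bubble-and-arc model reduces to a simple data structure; a minimal sketch, with hypothetical task names, shows the precedence check a project management tool performs:

```python
# Bubbles are tasks; arcs are "must finish before" relationships.
# Task names are hypothetical.
precedes = {
    ("design", "code"),   # design must finish before coding starts
    ("code", "test"),     # coding must finish before testing starts
}

done = set()              # bubbles marked complete so far

def can_start(task):
    # A task may start only when every predecessor bubble is done.
    return all(pred in done for (pred, succ) in precedes if succ == task)

print(can_start("code"))     # False: "design" is not done yet
done.add("design")
print(can_start("code"))     # True
```

Navigation, bubble creation, and analyses such as critical-path computation are all operations over this same graph of bubbles and arcs.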

Other tools, like documentation and code management tools, have desktops with icons that represent objects. Still other tools may represent the objects they manage in different ways. Although the details of user interfaces may vary, each tool displays its objects and the intra-tool relationships between them.

Perhaps the most fundamental problem that needs to be addressed by CASE integration is the object correspondence problem. "Object correspondence" refers to the problem of relating objects managed by one tool to objects managed by another (for example, relating a requirement paragraph to the design specifications and code modules that implement it). This is a major problem for manual systems and a primary job for CASE integration systems. Although intra-tool relationships are maintained by individual CASE tools, inter-tool relationships must be maintained by a third-party agent. A standard for the CASE tool/third-party agent interface is currently being discussed in many forums. Ultimately, users want to be able to create and navigate inter-tool relationships as easily as they create and navigate intra-tool relationships.
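Such a third-party agent can be sketched as a small registry that records typed links between objects it does not itself manage (the tool names, object identifiers, and relation names here are all hypothetical):

```python
# Sketch of an inter-tool relationship registry. Each object is named
# by a (tool, object-id) pair; each link carries a relation name.
links = []   # list of ((tool, object), relation, (tool, object))

def relate(source, relation, target):
    links.append((source, relation, target))

def related_to(source, relation):
    # Navigate from an object along a given relation to its targets.
    return [t for (s, r, t) in links if s == source and r == relation]

relate(("req-editor", "REQ-4.2"), "satisfied-by", ("designer", "D-17"))
relate(("designer", "D-17"), "implemented-by", ("code-mgr", "parser.c"))

print(related_to(("req-editor", "REQ-4.2"), "satisfied-by"))
# [('designer', 'D-17')]
```

Tracing a requirement to its code is then a walk along these links; keeping them valid as new versions of each object appear is the hard part of the object correspondence problem.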

A large project may contain thousands of objects with tens of thousands of relationships. As new objects and new versions of objects are created, the relationships must be maintained. Requirements often evolve as a large system is built; verifying that the design and code match the requirements is known as the "requirements traceability problem" and is a special case of the object correspondence problem.

Intra-tool Versus Inter-Tool Relationships

Large projects use many tools, which are often supplied by different vendors. Maintaining the relationships between the objects managed by these tools requires some degree of standardization. A variety of integration schemes are currently being debated. In addition to solving requirements traceability problems, it would be useful to let a user point at one object on the screen and have the integration facility navigate to related objects. This is very similar to hypertext navigation in systems like HyperCard.

Conclusion

Large development efforts require sophisticated tools in order to succeed. Networked workstations provide the opportunity for a class of tools, known as distributed CASE tools, that address the problems of large-scale efforts. These tools tend to use a large graphics display to provide a good user interface. They operate in a network environment, support version control and concurrent access, and tend to provide the "hooks" needed to support CASE integration tools.

Emerging standards in the areas of graphics and user-interface styles, portable operating system interfaces, and network computing environments are helping to bring CASE tools closer to the ideal goal of an integrated project support environment (IPSE). The 1990s will see continued growth in the availability of workstation-based CASE tools.

References

Brooks, F. The Mythical Man-Month, Reading, MA: Addison-Wesley, 1975.

DeRemer, F. and Kron, H. "Programming in the Large Versus Programming in the Small." IEEE Transactions on Software Engineering, vol. 2, no. 2, (June 1976), pp. 80 - 86.

Dineen, T.; Leach, P.; Mishkin, N.; Pato, J.; and Wyant, G. "The Network Computing Architecture and System: An Environment for Developing Distributed Applications." Proceedings of the Summer 87 USENIX Conference, June 1987.

Leach, P.; Levine, P.; Dorous, B.; Hamilton, J.; Nelson, D.; and Stumpf, B. "The Architecture of an Integrated Local Network." IEEE Journal on Selected Areas in Communications, November 1983, pp. 842 - 857.

Leblang, D.B. and Chase, R.P., Jr. "Parallel Software Configuration Management in a Network Computing Environment." IEEE Software, November 1987.

Leblang, D.B.; Chase, R.P., Jr.; and McLean, G.D., Jr. "The DOMAIN Software Engineering Environment for Large-Scale Software Development Efforts." Proceedings of the 1st International Conference on Computer Workstations, San Jose, Calif.: IEEE Computer Society, November 1985, pp. 266 - 280.

Liskov, B. and Zilles, S. "Programming with Abstract Data Types." ACM SIGPLAN Notices, vol. 9, no. 4 (1974).

Notkin, D.; Hutchinson, N.; Sanislo, J.; and Schwartz, M. "Heterogeneous Computing Environments: Report on the ACM SIGOPS Workshop on Accommodating Heterogeneity." Communications of the ACM, February 1987, pp. 132 - 140.

U.S. Department of Defense, DOD-STD-2167. Defense System Software Development. Washington, DC 20301: June 1985.

Wirth, N. "Program Development by Stepwise Refinement." CACM, vol. 14, no. 4 (April 1971).

Yourdon, E. and Constantine, L.L. Structured Design. Englewood Cliffs, N.J.: Prentice-Hall, 1979.