Bill and Lynne are the authors of the 386BSD CD-ROM and can be contacted through the DDJ offices.
Very high-speed networking has become a key component in the race to rapidly and economically deliver large amounts of information. The high visibility of the information highway, the increasing interest in multimedia applications, and the demands of high-profile public-policy issues (such as rapid and confidential access to medical information on a national scale) ensure that very high-speed gigabit networks will be implemented within the next few years, current technology notwithstanding.
In "Very High-Speed Networking" (DDJ, August 1992), we outlined a number of hardware and software approaches which could be useful in achieving the required gigabit rates. Unfortunately, very little work of practical substance has been forthcoming. Many hardware solutions--including protocol engines--have recently fallen out of vogue, primarily due to cost constraints. Changing to different transmission technologies (FDDI and SONET, for instance) has also proved difficult (replacing the infrastructure is costly), so the focus is back on improving rates on existing, copper transmission lines. Popular software solutions (such as header prediction) have been successful, but generally have been mined out.
Even though there has been a great deal of talk about gigabit testbeds, the relative lack of interface hardware has been a stumbling block. Most projects have assembled a gigabit platform by using banks of T3 (45-Mbit) or FDDI (100-Mbit) interfaces, since few interface standards exist in this rarefied area, but these testbeds tend to be beyond the economic reach of most software and hardware application-development groups. However, one extant standard that may be within reach, HiPPI (high-performance parallel interface), allows for an 800-Mbit link to supercomputers; see "HiPPI and High-Performance LANs" by Andy Nicholson (DDJ, June 1993).
In this article, we'll examine two HiPPI-based projects--the PC-Supercomputer HiPPI Project and Project SIGNA--both of which utilize the 386BSD publicly accessible research software. However, any system using TCP/IP (Windows NT, for instance) can also be so modified, assuming you have the patience and access to kernel source code.
The Los Alamos National Laboratory (LANL) views supercomputer resources as a kind of "numerical" science laboratory of simulation, and PCs and workstations as the "visualization" devices (terminals) which provide rapid access to these shared resources. By placing all these computer resources (such as oddball supercomputer architectures, massive data stores, and tape/optical backup) on the same high-speed network, LANL can effectively "remove" the bottlenecks that occur in managing an information system that deals in extremely large objects (for example, in a plasma-reaction simulation, where data is shipped to whichever facility needs it at the moment, even between clusters of supercomputers).
To accomplish this, Richard Thomsen, Michael McGowan, and Craig Idler of LANL developed a special HiPPI-based interface to connect these supercomputers and high-speed storage devices to the Internet at very high rates. Since these devices could not work with the Internet protocols directly, a PC running 386BSD is used to interpret the protocol headers stripped off the incoming packet, with the remaining data payload redirected to a separate HiPPI link for reliable delivery to the target hardware. The combination of dual HiPPI interfaces, 486 PC, and software effectively produces a TCP/IP protocol engine running at HiPPI rates (see the accompanying text box entitled, "The LANL HiPPI Protocol Engine Hardware").
Hardware solutions, while intriguing, are usually out of reach for most software programmers. Still, the prospect of developing a scalable network-interface technology is very desirable. Even though hardware interfaces are still evolving, the software technology, coupled with fast (100-MIPS), inexpensive processor technologies and memory systems (greater than 512K write-back caches), is now available.
SIGNA (short for "simplified Internet gigabit networking architecture") is designed as a guide to inexpensively exploring Internet gigabit-networking technologies by running extremely high-speed protocols on an ordinary PC via 386BSD software.
The SIGNA approach currently emphasizes the most minimal of gigabit networking applications: client operation of a PC with a single application. However, when gigabit hardware interfaces become available, a SIGNA platform could allow client PCs to access supercomputer "servers" (as, for example, during image uploading and downloading) and other client PCs (such as in video teleconferencing). By dedicating PC resources to a single "bursty" application, you can essentially create "gigabit-terminal equipment."
Key considerations in the 386BSD SIGNA design included:
To guarantee real-time application response, it is necessary to add a limited real-time mechanism to the 386BSD kernel. This mechanism allows a special single process to preempt the kernel on demand. This special case carefully "violates" the UNIX model of restricted preemption to achieve a rapid response to data delivery; it is not intended as a general-purpose mechanism for real-time programs.
Extant device and driver interfaces, which place the burden of buffer allocation and packet extraction on the device driver, are not appropriate for gigabit-network interfaces. Gigabit-networking interfaces must cope with the fact that while processor speed is increasing, memory-system bandwidth is not keeping pace. Operations involving the most bandwidth (the packet-data payload) are costly; if you require more than a single pass over the packet, you overload the memory-system bandwidth and "get behind" in processing a packet. One way to avoid this is to use extensive amounts of memory (arranged as frame buffers) to assemble and present the link-layer packets in transit. Such memory-based devices require novel device-driver interfaces.
Finally, Internet Core protocol structures (TCP, UDP, IP, ICMP) must themselves be modified to eliminate copies and reduce checksum overhead. By operating on descriptors instead of copying the packet around during processing, you can reduce the average passes required per packet from three to one--a significant reduction in memory overhead; see Figure 3. This is done by combining the copy and checksum operations directed to protocol headers and data. The descriptors selectively reference header/data portions of the packet in place in the interface's buffer.
Header prediction can also be enhanced through a "clustering" mechanism, which synchronizes a half-duplex stream of packets. This effectively locks out other system activity during peak-rate transfers.
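The combined copy-and-checksum operation described above can be sketched in a few lines of C. This is a minimal illustration of the technique, not the actual 386BSD kernel code; the function name and interface are invented for this example. The point is that the payload crosses the memory bus once, with the one's-complement Internet checksum (RFC 1071) accumulated as the bytes are copied.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Copy len bytes from src to dst while accumulating the one's-complement
 * sum used by the Internet checksum -- one pass over the payload instead
 * of separate copy and checksum passes.  Returns the folded, complemented
 * 16-bit checksum of the copied data. */
uint16_t copy_and_checksum(void *dst, const void *src, size_t len)
{
    const uint8_t *s = src;
    uint8_t *d = dst;
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)((s[0] << 8) | s[1]);  /* big-endian 16-bit word */
        d[0] = s[0];
        d[1] = s[1];
        s += 2; d += 2; len -= 2;
    }
    if (len == 1) {                 /* odd trailing byte, zero-padded */
        sum += (uint32_t)(s[0] << 8);
        d[0] = s[0];
    }
    while (sum >> 16)               /* fold carries back into low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;          /* one's complement of the sum */
}
```

In the real system, the descriptor mechanism decides which byte ranges (headers versus payload, in place in the interface buffer) get fed through a loop like this one.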
The LANL HiPPI project, exhibited at the Supercomputer '92 and '93 conferences, is possibly the only successful protocol-engine design ever put into operation. Even software-testbed designs (including Project SIGNA) cannot match the current speed of good protocol-engine designs due to the limitation in the memory system used by the processor itself. As such, anyone interested in getting a hands-on, operational, protocol-engine testbed should look at this design carefully. It could save a company years in design and development costs and also bring very high-speed networking that much closer to reality.
Because gigabit hardware technologies are still a matter of speculation, software-only approaches (such as SIGNA) and testbeds are more than just interesting. Both 600-Mbit ATM (MAN) and 100-Mbit Ethernet might offer affordable desktop bandwidth in the near future, while SONET scaled to multi-gigabit levels offers the possibility of metropolitan-network interconnections. Even HiPPI, originally a supercomputer mass-storage interface, has been demonstrated as a network-interconnect standard. With the recent standardization of the HiPPI serial standard, the cost of implementation has lowered drastically.
While gigabit networking is considered solely the province of the data industry, knowledge of telephony techniques provides insight into design considerations and constraints. In fact, both the SIGNA and LANL HiPPI testbeds could be viewed simply as gigabit-terminal equipment. In addition, new gigabit-networking technologies must rely on switching technologies instead of routing technologies, since the required data rates prohibit the delay imposed by the interim retransmission of a packet.
The inevitable reunion of the data-networking and telecommunications industries will be spurred on by the demand for global very high-speed gigabit networking, although probably not in the manner either of these industries has separately forecast. Ironically, the experts most suited to leading the charge are at risk of being most blind to these new possibilities, since they are used to seeing them only in terms of their respective disciplines.
In the meantime, hardware projects like LANL's HiPPI project and software-testbed engines like SIGNA will provide us with the knowledge and experience needed when very high-speed networking solutions become available. Perhaps they will encourage entrepreneurs from both industries to take the initiative and offer ad hoc solutions, creating a whole new information industry. In any case, the demand for very high-speed networks is real, and that demand will be satisfied--one way or another.
The LANL protocol engine (see Figure 1) consists of two CBI (crossbar interface) cards attached to an ordinary EISA PC. Each CBI card has two unidirectional HiPPI ports (one input, one output), each used to manage one half circuit of the communications between an Internet network and a non-Internet-capable application host. Only data and requests for Internet service flow across the application link, and only Internet-protocol (IP) datagrams appear on the network link. It is the sole responsibility of the PC to handle the transformation of the application's requests into appropriate Internet-protocol operations without ever seeing the application's data (handling only pointers to the data). In this case, the PC is the actual Internet host which operates on behalf of the external host computer.
The key to this architecture is the design of each CBI (see Figure 2), which is built around a large (4-Mbyte) block of video RAM (VRAM). The VRAM has three ports: two serial (one in and one out) for receiving and transmitting HiPPI, and one parallel, bidirectional port that allows the PC to access TCP/IP header and HiPPI Link Layer information. Each board has a port on the network and a port connected to the application host (which runs the network application connected to the network). The data is buffered between the network and the application host solely in the VRAM while the PC arranges the details of the network transfer.
While the roles of application and network are split between two hosts, you could design a delivery mechanism to the application running on the same PC (sort of a "socket protocol engine" for the particular application program) if necessary. This approach can also be used on a single PC or workstation.
By stratifying the design of protocol processing into scalable sections, you can cope with any degree of bandwidth on a networking implementation. Given the rate of technology change, switching a gigabit per second between computers will be routine in less than a decade.
The choice of a PC/supercomputer connection presented some novel problems which had to be resolved to make the LANL HiPPI project fly. One of the most critical issues dealt with the rate of information itself: While a supercomputer has no trouble churning out TCP/IP in order to source a HiPPI link, how could a PC handle it? The secret was to decouple the overhead of the data payload from the protocol processing so that the overhead per packet is fixed, regardless of the size of the packet. Assuming a maximum packet size of 64 Kbytes (2^19 bits, or 512 Kbits), a packet rate of 2^11, or 2048, packets per second would be necessary to support a data rate of a gigabit (2^30 bits per second).
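The arithmetic above reduces to a one-line division, shown here as a C sketch (the function name is ours, introduced only for illustration): the required packet rate is the target bit rate divided by the bits per packet, so larger packets directly lower the per-second protocol-processing load.

```c
#include <stdint.h>

/* Packets per second needed to sustain a given bit rate with
 * fixed-size packets; both arguments are expressed in bits. */
uint64_t packets_per_second(uint64_t bits_per_second, uint64_t packet_bits)
{
    return bits_per_second / packet_bits;
}
```

With a gigabit (2^30 bits per second) and 64-Kbyte packets (2^19 bits), this yields 2^11, or 2048, packets per second.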
Since these packet rates are achievable with a carefully tuned PC Internet implementation, the real key to high-speed networking is to find ways to scale packet-data payload delivery. The LANL CBI project addresses this through clever hardware design. The TCP protocol has two requirements on its data payload: a delivery requirement and a checksum across the span of both the payload and a special, pseudo-protocol header. A hardware-checksum mechanism offloads from the networking implementation a portion of the protocol processing that increases with packet payload.
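For reference, here is a software sketch of what such a checksum unit computes: the standard TCP checksum over a 12-byte pseudo-header (source and destination IP addresses, a zero byte, the protocol number, and the TCP length) followed by the segment itself. This is an illustrative implementation with invented function names, not the CBI firmware; it simply shows why the work grows with payload size and is worth offloading.

```c
#include <stdint.h>
#include <stddef.h>

/* One's-complement sum accumulator (RFC 1071 style). */
static uint32_t cksum_add(uint32_t sum, const uint8_t *p, size_t len)
{
    while (len > 1) {
        sum += (uint32_t)((p[0] << 8) | p[1]);
        p += 2; len -= 2;
    }
    if (len)                         /* odd trailing byte, zero-padded */
        sum += (uint32_t)(p[0] << 8);
    return sum;
}

/* TCP checksum: covers the pseudo-header plus the entire segment
 * (header and payload).  The segment's own checksum field is assumed
 * to be zeroed by the caller before computing. */
uint16_t tcp_checksum(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t *segment, uint16_t seg_len)
{
    uint8_t ph[12];
    ph[0] = src_ip >> 24; ph[1] = src_ip >> 16;
    ph[2] = src_ip >> 8;  ph[3] = src_ip;
    ph[4] = dst_ip >> 24; ph[5] = dst_ip >> 16;
    ph[6] = dst_ip >> 8;  ph[7] = dst_ip;
    ph[8] = 0;            ph[9] = 6;               /* protocol: TCP */
    ph[10] = seg_len >> 8; ph[11] = seg_len & 0xff;

    uint32_t sum = cksum_add(0, ph, sizeof ph);
    sum = cksum_add(sum, segment, seg_len);        /* grows with payload */
    while (sum >> 16)                              /* fold carries */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

The `cksum_add` call over the segment is the part that scales with packet size; everything else is fixed per-packet overhead of the kind the PC can comfortably handle.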
A second hardware mechanism eliminates the remaining payload overhead from delivering the data to the application. Essentially, the PC never touches the data inside the packets--it merely manages the association of hardware data-buffer pointers between the two interfaces. The PC simply does the bookkeeping of the protocol, which is the same whether the packet is 64 bytes or 64 Kbytes.
--W.F.J.
More information on the LANL HiPPI Project, including documentation on the CBI, is available via ftp at the Internet site ftp.lanl.gov in the /pub/cbi directory. For information on Project SIGNA, 386BSD, or pointers to further information about the LANL HiPPI Project, please send e-mail to wjolitz@cardio.ucsf.edu.
Figure 1: The LANL protocol engine. Figure 2: The design of each CBI. Figure 3: Reducing the average passes required per packet; (a) three passes; (b) one pass.
Copyright © 1994, Dr. Dobb's Journal