OBJECT SWAPPING

Familiar memory management techniques don't always apply to OOPS

Every object, except for Integers and Characters, has an entry in a large array structure called the "object table" or OT (see Figure 1). The OT entry contains the current address of the object in memory or on disk if it is swapped out. It also contains two flags pertaining to the object's state. One flag is called the "touched bit," which indicates whether or not an object has been accessed. The other flag is called the "dirty bit," which indicates whether or not the object has been changed.

For objects that are swapped into memory, a header is placed at the beginning of the object. This header contains a few flag bits, the disk address where the object was swapped in from, and an age field used for discarding objects from memory. The header also contains other information used in Actor, such as the object's size.

A block of memory is allocated at program initialization time to hold all of the memory resident static objects. This is called the "swap area." A typical size for good performance is around 50K-bytes. The top 15 percent of this area is called the "overflow area." The overflow area will be explained later.

Processes Involved in Object Swapping

Static-object creation

Static-object referencing

Static-object aging

Static-object deletion

Swap-area compaction

Static object creation occurs when the executing program explicitly requests it. An area based on the size of the new object is allocated at the current allocation pointer (see Figure 2). The disk address in the swapping header (see Figure 3) is initialized with a special signature to designate that it has no disk address yet. The age is initialized to a low value. An OT entry is allocated from a free list, and initialized with the object's memory address. New objects are marked as dirty in their OT entry. The object is now ready for use by the system.

Objects are accessed via their object pointer (OP) value. The OP is simply the offset into the OT. Object access is shown in Example 1. Addressing turns an OP into a physical address for the object. This process swaps the object in from disk if necessary. An important property of addressing an object is that no other object's physical address may be changed in the process. This stability of addresses is required when the system needs to address more than one object at a time. Moving one object would invalidate the other object's address. If swapping in the addressed object requires more memory than is available in the swap area, a fatal error would occur. However, Actor is internally tuned for this not to happen under normal conditions. If an object is already in memory, its touched bit is just set in the OT and its memory address is returned.

Example 1: Object access

  Set the object's touched bit
  if object not in memory
  then
     Swap object in
     Update OT entry
  endif
  Get the object's address

Periodically during system execution, one of two things will happen: All needed objects are accessed or the system runs out of memory in the swap area. Before running out of memory, something must happen to free up memory. This is the job of the aging process.

As the system executes a routine is called periodically that walks through a few object table entries. When the end of the OT is reached it starts over at the beginning.

For each OT entry the touched bit is tested. If the object was touched since the last pass, the age field in the object header is set to 0, indicating the object was recently used. As the system runs, accessing objects and aging them, the age fields indicate which objects were recently used and which were not. In this way, objects can be discarded on a least recently used (LRU) basis.

When an object reaches a certain age it is tested for being changed. The dirty bit in the OT is used to determine this. If the object is unchanged, the swap file on disk still contains a correct version of the object. In this case, the OT entry is updated with the old disk address and the memory block is marked "discarded." If the object has changed, it is written back to its allocated location on disk. The aging process is shown in Example 2.

Example 2: The aging process

  for "a few" OT entries
     if object is touched
     then
       Set object's age to 0
     else
       if object's age is too old
       then
         if object is dirty
         then
            write object to disk
         endif
         Set object address in OT to
           disk address
         Mark object memory as
          "discarded"
         else
          Increment object's age
         endif
       endif
  endfor

Except during periods of many object creations, objects are usually not dirty. The typical ratio is about 500 to 1 for clean to dirty objects. This means that discarded objects rarely cause any disk activity; only addresses are moved around in memory.

A special case occurs when a newly created object is swapped out for the first time. This is detected by a special invalid disk address. Space for the new object is allocated at the end of the object swap file and the object is written out. It is then treated like all the other objects in the system.

If an object is accessed again after being discarded but before the swap area is compacted, a new memory block is allocated and the object is swapped in. This does leave an old, unreferenced copy of the object somewhere else in the swap space, but this dual allocation of memory for a single object rarely occurs.

Swap Area Compaction is the means by which space used by discarded objects in the swap area is recovered. As objects are swapped in, the current allocation pointer into the swap area eventually reaches the beginning of the overflow area. A flag is set, indicating that swap area compaction is desired. At the moment of access, the system may be using the address of other objects, so compaction does not occur immediately. It's deferred until specific sync points are reached. Sync points are places in the execution of Actor when there are no object addresses in use, or more correctly, the addresses are at a known location. The system may keep executing and swapping objects into the swap area before a sync point is reached. This is why there is a reserved overflow area. All objects swapped in after requesting compaction but before reaching a sync point must fit in the overflow area.

When a sync point is reached, execution stops for a moment while the swap area is compacted. The compaction process updates the physical address in the OT of any static objects currently in the swap area. It does this by first overwriting the memory address in the OT with some of the data in the object header. It then overwrites the object's saved data area with the OP of the object. This allows an efficient sweep through the swap area to move all valid objects to the beginning of the swap area. During this sweep all object headers are restored from information saved in the OT entry and the OT entry is updated using our current location in the swap area for the object's address. Example 3 shows pseudocode for swap area compaction.

Example 3: Pseudocode for swap area compaction

  for each valid OT Entry
    save object header info in
      OT entry
    save OP in object header
  endfor
  Set allocation pointer to point at
      the start of the swap area
  for each object in the swap area
      if valid object
      then
        Move object to current
          allocation pointer
        Restore object header
          from the OT using saved OP
        Set OT address equal to
          allocation pointer
        Add size of object to
          allocation pointer
      else
        Skip this object
      endif
  endfor

The final process in object swapping is the deletion of objects from the swap file. Like object creation, this is done when the executing program explicitly requests it. The equivalent of a mark and sweep garbage collection is done using a temporary disk file to store any objects to be kept. As this proceeds, the OT entry for the object is updated with the new offset into the temporary disk file. On completion, the temporary disk file becomes the new object swap file, the swap area allocation pointer is reset to the beginning of the swap area, and the system starts running again. As objects are now accessed they are swapped in from the new compacted swap file. This sweep through the swap file is much like the swap area compaction process but it is applied to the entire swap file.

To save the state of the system, a procedure known as "taking a snapshot" is used. A snapshot simply creates an image file that contains copies of the OT, the swap area, and the current swap file. To start up the Actor system, the snapshot process is reversed: The OT and swap area are read from the image file into memory and a swap file (which is a working copy of the swap file area in the image file) is created. Dynamic objects are also stored in the snapshot file and loaded into memory when the snapshot file is restored.

Conclusions,

In this article, we have shown how to implement a virtual memory system that is optimized for object-oriented languages. It's unique design reduces memory required by approximately 87 percent, allowing much better utilization of available memory. Because special hardware support is not required, this architecture can run on many kinds of machines.

The performance delivered is usually quite high. Tests have measured object "swap in" rates as high as 320 objects per second. Average performance is about 75 to 100 objects per second. Disk caching and disk speed can affect this greatly.

New operating environments with built-in virtual memory support can still benefit from object swapping. Unless enough physical memory for all running programs exists, considerable paging activity may occur.

By using object swapping, more complex programs can be run before performance severely degrades from paging activity. A developer will be able to produce an application that runs efficiently on a larger base of machines. This also allows more sophistication to be incorporated into the application for a specific hardware configuration. Large applications can easily run on 640K machines.

Implementing an object swapping system in an application written in C can be done, but will require the programmer to explicitly call access routines whenever an object is needed. A paging system is easier for the programmer, but does not make optimal use of memory. Having the object swapping system integrated inside a language such as Actor, makes its use transparent to the programmer. This allows Actor to make optimal use of memory and free the program from having to do anything specific to support it.

References

1. Kaehler, Ted, "Virtual Memory for an Object-Oriented Language," Byte, August 1981.

2. Kaehler, Ted and Krasner, Glenn, LOOM - Large Object-Oriented Memory for Smalltalk-80 Systems, Smalltalk-80 Bits of History, Words of Advice, pp 251-270, Krasner, Glenn, editor, Addison-Wesley, 1983.

3. Duff, Charles, "Designing an Efficient Language," Byte, August 1986.

_OBJECT SWAPPING_ by Jan Bottorff and Jim Bolland