STRUCTURED PROGRAMMING

Humpty-Duntemann's Handy Object-Oriented Glossary

     Window
     |
     |
     ---Form
          |
          |
          ---Field
               |
               |
               |---IntegerField
               |
               |
               |---BooleanField
               |
               |
               |---StringField

A class hierarchy is created by defining abstract objects (see also) at the top of the hierarchy, and giving those abstract objects all of the most generally applicable code and data for the whole hierarchy. The abstract objects thus exist as "broadcast stations" from which the most general code and data may be inherited. As you move toward the leaves, each child class adds code and data that is more and more specific in nature, until the leaves of the hierarchy tree are objects that do fully useful tasks.

In the mini-hierarchy shown, the Window class might be nothing more than a rectangular subset of the screen with a border. It contains only X,Y position values and flags indicating whether or not it is currently visible or active. Its methods would allow it to be dragged around the screen and made visible or invisible, but nothing more. The Form class might add a border and mechanisms for vertical/horizontal scrolling. The Field class adds generalized methods for setting and returning a pointer to a value, but does not yet commit to any particular type of value. Only at the leaves of the hierarchy do classes like BooleanField provide a completely useful object -- in this case, a field for the entry and editing of Boolean values. Remember that BooleanField retains everything its parents provided: Windows, drag methods, scroll bars, and so on. It only adds the last and most specific parts of the object: Those parts catering to the Boolean data type.

A class hierarchy is an extremely powerful tool for managing the complexity of an OOP application. It distributes data and functionality along a line from general to specific, and allows the programmer to zero in on only the portion of the functionality that is being worked on at any given time.
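
As a rough Turbo Pascal 5.5 sketch of the mini-hierarchy above, the declarations might look something like the following. The field names ScrollX, ScrollY, ValuePtr, and Value are assumptions made up for illustration, and the real classes would carry far more code than this.

```pascal
PROGRAM HierarchySketch;

TYPE
  { Top of the hierarchy: only the most general data and code }
  Window = OBJECT
    X, Y    : Integer;
    Visible : Boolean;
    PROCEDURE MoveTo(NewX, NewY : Integer);
  END;

  { Each child adds something more specific to what it inherits }
  Form = OBJECT(Window)
    ScrollX, ScrollY : Integer;   { hypothetical scroll offsets }
  END;

  Field = OBJECT(Form)
    ValuePtr : Pointer;           { not yet committed to any type }
  END;

  { A leaf: finally commits to one data type }
  BooleanField = OBJECT(Field)
    Value : Boolean;
  END;

PROCEDURE Window.MoveTo(NewX, NewY : Integer);
BEGIN
  X := NewX;
  Y := NewY;
END;

VAR
  B : BooleanField;

BEGIN
  B.MoveTo(10, 5);          { inherited all the way down from Window }
  B.Visible := True;        { also inherited from Window }
  B.Value := False;         { added by BooleanField itself }
  WriteLn(B.X, ',', B.Y);   { prints 10,5 }
END.
```

Note that BooleanField declares only one field of its own, yet it can be moved around like any Window.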

Class Tree

The Actor term for class hierarchy (see also).

Dynamic Objects

When objects are allocated on the heap, they are dynamic objects, just as ordinary variables that are allocated on the heap are dynamic variables. You're unlikely to run into this term unless you're using or reading about Turbo Pascal 5.5. In QuickPascal and Macintosh Object Pascal, all objects must be allocated on the heap, so like it or not they're all dynamic, and thus the term loses its purpose and isn't used. Turbo Pascal 5.5, however, allows objects to be allocated statically in the data segment exactly as ordinary variables are. So just as in all Pascals you can have static variables and dynamic variables, in Turbo Pascal you can have static objects and dynamic objects.
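
Here is a minimal sketch of the two allocation styles side by side in Turbo Pascal 5.5. The Counter object type and its methods are hypothetical, invented only to have something to allocate.

```pascal
PROGRAM DynamicSketch;

TYPE
  Counter = OBJECT
    Count : Integer;
    PROCEDURE Reset;
    PROCEDURE Bump;
  END;
  CounterPtr = ^Counter;

PROCEDURE Counter.Reset;
BEGIN
  Count := 0;
END;

PROCEDURE Counter.Bump;
BEGIN
  Count := Count + 1;
END;

VAR
  StaticCounter  : Counter;      { static object: allocated in the data segment }
  DynamicCounter : CounterPtr;   { pointer to a dynamic object on the heap }

BEGIN
  StaticCounter.Reset;
  StaticCounter.Bump;

  New(DynamicCounter);           { allocate the dynamic object }
  DynamicCounter^.Reset;
  DynamicCounter^.Bump;
  DynamicCounter^.Bump;

  WriteLn(StaticCounter.Count, ' ', DynamicCounter^.Count);  { prints 1 2 }

  Dispose(DynamicCounter);       { give the heap space back }
END.
```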

Early Binding

In early binding, the compiler bakes the address of a method into the call instruction at compile time, so the call is fully resolved before the program ever runs. Static methods (see also) are early-bound. Compare late binding (see also).

Encapsulation

At the heart of the notion of OOP is the practice of defining code (in the form of methods, see also) and data together as a single-named entity called an object. This rolling up of methods and data into a unified bundle is part and parcel of the term encapsulation. The problem is, some people (notably, the Smalltalk partisans) take it considerably further, and hold that encapsulation means that users of an object cannot directly know of or reference the object's data. The data lives at the center of the capsule, as it were, and can only be accessed through one of the methods that surround it.

In other words, if a Boolean field within an object is named Visible, users of the object cannot perform a test such as

     IF MyObject.Visible THEN
       MyObject.Drag
     ELSE
       MyObject.Relocate;

Instead, a method named IsVisible would have to be defined to return the value of the data field Visible, and two additional methods (perhaps Show and Hide) would be needed to flip the Boolean state of Visible inside the object.

This is literally true in Smalltalk. Smalltalk's encapsulation of data is absolute. In other OOP languages, this sort of absolute encapsulation is allowed, but not enforced. C++, QuickPascal, and Turbo Pascal 5.5 allow ordinary references to an object's data from anywhere within the object's scope. In other words, if the object itself is "visible," its data fields are also visible. The programmer can choose the degree to which data fields are encapsulated within objects.

Absolute encapsulation places an unavoidable performance burden on some types of programs, hence the more lenient encapsulation of C++ and Object Pascal. In most cases, it's a good idea to access object data only through the object's methods, but C++ and Object Pascal people can break that rule according to their own judgment -- and must accept the consequences if they judge badly.
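
To make the trade-off concrete, here is a small Turbo Pascal 5.5 sketch. The Gadget object type is hypothetical; Show, Hide, and IsVisible are the accessor methods suggested earlier in this entry. The last statement shows the direct field reference that Turbo Pascal 5.5 permits and Smalltalk would not.

```pascal
PROGRAM EncapsulationSketch;

TYPE
  Gadget = OBJECT
    Visible : Boolean;
    PROCEDURE Show;
    PROCEDURE Hide;
    FUNCTION IsVisible : Boolean;
  END;

PROCEDURE Gadget.Show;
BEGIN
  Visible := True;
END;

PROCEDURE Gadget.Hide;
BEGIN
  Visible := False;
END;

FUNCTION Gadget.IsVisible : Boolean;
BEGIN
  IsVisible := Visible;
END;

VAR
  MyObject : Gadget;

BEGIN
  { The disciplined route: touch the data only through methods }
  MyObject.Hide;
  MyObject.Show;
  IF MyObject.IsVisible THEN
    WriteLn('visible')
  ELSE
    WriteLn('hidden');

  { The lenient route: legal here, but it bypasses the methods }
  MyObject.Visible := False;
END.
```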

Extendibility

Through the property of inheritance (see also), objects may be extended without access to the objects' source code. The mechanism used is overriding methods (see override.) A child class of an existing class defines a new method with the same name as an inherited method. This new method adds some new functionality to the inherited method and replaces it within the child class.

Inheritance

One of the fundamental characteristics of OOP is that a class (see also) may be defined in terms of an existing class. The new class inherits all definitions made within the existing class (called its "parent class") and any definitions its parent class may have inherited from classes further up the class hierarchy (see also.)

Inherited data and methods may be used as freely as data and methods defined within the class itself.

Instance

The term instance is not limited to OOP situations, but it's used more frequently in connection with objects than with earlier data structures like arrays and records. An instance is a realization in memory of the description we call a class or an object type. Classes or object types are defined after the TYPE reserved word, whereas instances are declared after VAR. Instances are thus object variables, which we normally just call objects.

Late Binding

Late binding defers the choice of which method a call will execute until the moment the call is made at run time. Many different mechanisms are used to implement late binding, but most involve a behind-the-scenes table of method addresses specific to a given class or object type. To execute a late-bound method DoIt belonging to object MyGadget (in Object Pascal terms, the statement MyGadget.DoIt;) the code must first determine the class of MyGadget, locate the method table for that class, look up the table entry for method DoIt, and then pass control to the address at DoIt's entry in the table.

Late binding makes polymorphism (see also) possible.

Method

Through encapsulation (see also) code and data are combined into a single-named entity called an object. The code portion of an object consists of some number of routines called methods. The headers of an object's methods are defined within the object's definition, and the bodies are defined later in the program source.

In the Location object definition shown later in the entry for virtual, there are four methods: MoveTo, Show, Hide, and IsVisible. In Turbo Pascal 5.5 and QuickPascal, methods may be either procedures or functions and are defined in almost exactly the same way. In Turbo Pascal 5.5, methods may also be either static methods or virtual methods (see also.) All of QuickPascal's methods are the equivalent of Turbo Pascal's virtual methods; static methods are not present in QuickPascal.

Message

This term is used only by Smalltalk and Actor. A message is a command to an object. When received by the object, the message selects which method is to be executed in response to that message.

This scheme (called "message-passing") is the functional paradigm for polymorphism (see also) in Smalltalk and Actor. The message selects its method at run time through late binding (see also.)

In Object Pascal and C++, the name of a virtual method is analogous to the message, and late binding resolves the name of a virtual method to the address of the code implementing the correct method for the true class of the object making the method call.

Multiple Inheritance

In most OOP languages, each class or object type has only one parent, from which it inherits everything defined by that parent or inherited by that parent from classes higher in the class hierarchy. Some newer OOP languages not yet available for DOS (including C++ 2.0 and Eiffel) allow a class to inherit from two or more parent classes. A class hierarchy thus becomes a sort of class web, and there are significant questions as to what happens when more than one entity with the same name is inherited by a single class.

Note that Smalltalk, Actor, Turbo Pascal 5.5 and QuickPascal do not implement multiple inheritance, so I won't go into much more detail here.

Object

An object is an instance (see also) of a class or object type. By an instance I only mean that it's a variable, with an allocation of memory somewhere, and has some special properties that make it an object and not a record. In Smalltalk and (to a slightly lesser extent) Actor, everything is an object. In C++, Turbo Pascal 5.5, and QuickPascal, objects coexist peaceably with older, simpler types such as integers, Booleans, characters, and records.

Object type

Borland International redefined some of the jargon when they released Turbo Pascal as their implementation of Object Pascal. Their term object type is synonymous with class (see also) used in nearly all other OOP languages. An object type or class can be seen as a set of instructions by which the compiler builds objects in memory. In this it is no different from an ordinary record or array type definition that specifies how many bytes in size a variable of that type will be, and how some of those bytes in combination represent different flavors of data.

Again, think of object types or classes as templates by which the compiler whacks out actual objects as needed. These objects are called instances (see also) of the object type. An instance is just a variable of an object type.

Override

Object classes ordinarily inherit all methods from their parent class. The child class has the option, however, of redefining a method it inherits from its parent class or (through the parent class) some more distant ancestor class. This process of redefining an inherited method is called overriding the ancestor's method. This is the primary way that objects are extendible. A child class overrides an existing method and adds new or more specific behavior to the overriding method.

In Turbo Pascal 5.5, methods are overridden simply by redefining them under the same name. In QuickPascal, however, the reserved word "OVERRIDE" must be included after the method header of the overriding method:

   TYPE
        Point = OBJECT(Location)
            Color : Integer;
            PROCEDURE Show; OVERRIDE;
            FUNCTION GetColor : Integer;
            PROCEDURE SetColor(NewColor : Integer);
        END;
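
For comparison, here is a minimal sketch of the Turbo Pascal 5.5 way, where no extra reserved word is needed. The Describe method is a hypothetical stand-in, and both methods are left static to keep the example short. Since Turbo Pascal 5.5 has no INHERITED, the overriding method reaches its parent's version by qualifying the call with the parent type's name.

```pascal
PROGRAM OverrideSketch;

TYPE
  Location = OBJECT
    X, Y : Integer;
    PROCEDURE Describe;
  END;

  Point = OBJECT(Location)
    Color : Integer;
    PROCEDURE Describe;    { same name: overrides Location.Describe }
  END;

PROCEDURE Location.Describe;
BEGIN
  Write('at ', X, ',', Y);
END;

PROCEDURE Point.Describe;
BEGIN
  Location.Describe;              { reuse the parent's behavior ... }
  WriteLn(' in color ', Color);   { ... then add something more specific }
END;

VAR
  P : Point;

BEGIN
  P.X := 3;
  P.Y := 4;
  P.Color := 2;
  P.Describe;    { prints: at 3,4 in color 2 }
END.
```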

Parent Class

(In Turbo Pascal 5.5, parent object type.) Through the property of inheritance (see also), an object class may be defined as the child of an existing class. This existing class is called the parent class of the new class. Smalltalk calls this a superclass (see also). QuickPascal (but not Turbo Pascal 5.5) objects can reference data and methods defined within their parent classes by qualifying a reference with the reserved word "INHERITED."

Parent classes are sometimes called "ancestor classes."

Polymorphism

From the Greek for "many shapes," polymorphism allows a single method name to act as a doorway to numerous separate methods, with the actual method called chosen by the language's late binding mechanisms at the time the call is made.

This is most tersely explained by example. In an object hierarchy, an abstract class (see also) called "Field" implements a generic data field not committed to any given type of data. Field defines numerous generic methods for manipulating data, including GetValue, PutValue, and Edit. Field has numerous child classes, one for each specific type of data: IntField, BooleanField, CharField, StringField, and so on. Each of these child classes overrides Field's generic methods with methods specific to the child classes' own data types. That is, IntField.Edit edits an integer value, StringField.Edit edits a string value, and so on.

One of the subtler rules of OOP is that assignment compatibility and pointer compatibility are extended down an object hierarchy. In other words, in our example an object of class Field may be assigned an object of class IntField, StringField, BooleanField, and so forth. (But not the other way around!) A pointer defined as pointing to the Field class may also point to any of Field's child classes. Such an assignment is called a polymorphic assignment.

This means that an object of class Field may in fact be a mask over an object of class StringField or any of Field's other child classes. Precisely which class wears the Field mask doesn't matter. Given a Field object named MyField, when the method call MyField.Edit is made, late binding selects the correct Edit method at the time the call is made. If a StringField object had been assigned to MyField, then the StringField.Edit method would be called. On the other hand, if a BooleanField object had been assigned to MyField, then the BooleanField.Edit method would be called -- and the decision is not made until the call itself is made.

In a sense, what polymorphism allows us to do is say to an arbitrary Field object: "Go edit yourself!" and the Field object will select the correct Edit method to use. One command -- Edit -- has a different shape for each different child class of Field. Polymorphism!
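
A minimal Turbo Pascal 5.5 sketch of the "Go edit yourself!" idea follows. It makes some assumptions: the WriteLn calls stand in for real editing code, and the Init constructor is added because Turbo Pascal 5.5 requires a constructor call before an object's virtual methods can be used. Pointers are used for the polymorphic assignment.

```pascal
PROGRAM PolymorphismSketch;

TYPE
  FieldPtr = ^Field;
  Field = OBJECT
    CONSTRUCTOR Init;
    PROCEDURE Edit; VIRTUAL;
  END;

  StringFieldPtr = ^StringField;
  StringField = OBJECT(Field)
    PROCEDURE Edit; VIRTUAL;
  END;

  BooleanFieldPtr = ^BooleanField;
  BooleanField = OBJECT(Field)
    PROCEDURE Edit; VIRTUAL;
  END;

CONSTRUCTOR Field.Init;
BEGIN
  { Nothing to set up; calling the constructor prepares late binding }
END;

PROCEDURE Field.Edit;
BEGIN
  WriteLn('Field.Edit');
END;

PROCEDURE StringField.Edit;
BEGIN
  WriteLn('StringField.Edit');
END;

PROCEDURE BooleanField.Edit;
BEGIN
  WriteLn('BooleanField.Edit');
END;

VAR
  MyField : FieldPtr;
  S       : StringFieldPtr;
  B       : BooleanFieldPtr;

BEGIN
  New(S, Init);
  New(B, Init);

  MyField := S;      { polymorphic assignment: child to parent pointer }
  MyField^.Edit;     { prints StringField.Edit }

  MyField := B;
  MyField^.Edit;     { prints BooleanField.Edit }

  Dispose(S);
  Dispose(B);
END.
```

The two MyField^.Edit statements are textually identical; only the object behind the Field mask differs.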

This is difficult business, but it is enormously powerful. I'll deal with polymorphism at length in a future column.

Static

The word static is currently the bad boy of OOP, and single-handedly causes more confusion than all other OOP terms combined, including that old devil polymorphism.

The problem mainly involves Turbo Pascal 5.5. In a Turbo Pascal context, the adjective "static" has two uses (and two opposites) with wildly different contexts. One use involves the two kinds of methods: Static methods are early-bound methods, as opposed to virtual methods, which are late-bound methods. The other use involves the two ways objects can be created in memory: A static object is an object allocated in the data segment, whereas a dynamic object is an object allocated on the heap. Thus in one context "static" is the opposite of "virtual," while in the other context "static" is the opposite of "dynamic."

The confusion is compounded by the fact that the term "static" is not used at all in QuickPascal, so you Microsofters are probably wondering what all the hoohah is about.

(I've provided more detailed definitions of static methods, static objects, dynamic objects, and virtual methods elsewhere. To pin down your complete understanding of the very slippery term static, be sure to read them all!)

Static Methods

Only C++ and Turbo Pascal 5.5 allow the definition of static methods. A static method is a method subject to early binding (see also) only. Methods subject to late binding are virtual methods (see also). Note that the term virtual methods is not used in QuickPascal because late binding is applied to all methods, making all methods virtual and thus making the distinction between virtual and static methods unnecessary.

Methods are static by default in Turbo Pascal 5.5. To make a method virtual you must add the reserved word "VIRTUAL" immediately after the method header. In the "Location" object definition shown in the entry for virtual, the MoveTo and IsVisible methods are static, whereas the Show and Hide methods are virtual.

Because static methods are early-bound, a static method call is identical to an ordinary procedure or function call. There is an additional level of indirection involved in making a Turbo Pascal 5.5 virtual method call, so static methods are slightly faster than virtual methods. (But only slightly.)

Static methods may be overridden, but are not subject to polymorphism (see also).
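
The difference shows up clearly when an object is reached through a pointer to its parent type. In this Turbo Pascal 5.5 sketch, the types Base and Child and the method names are all hypothetical. Both methods are overridden, but only the virtual one responds polymorphically.

```pascal
PROGRAM StaticVsVirtualSketch;

TYPE
  BasePtr = ^Base;
  Base = OBJECT
    CONSTRUCTOR Init;
    PROCEDURE StaticName;             { static: early-bound }
    PROCEDURE VirtualName; VIRTUAL;   { virtual: late-bound }
  END;

  ChildPtr = ^Child;
  Child = OBJECT(Base)
    PROCEDURE StaticName;             { overrides the static method }
    PROCEDURE VirtualName; VIRTUAL;   { overrides the virtual method }
  END;

CONSTRUCTOR Base.Init;
BEGIN
END;

PROCEDURE Base.StaticName;
BEGIN
  WriteLn('Base.StaticName');
END;

PROCEDURE Base.VirtualName;
BEGIN
  WriteLn('Base.VirtualName');
END;

PROCEDURE Child.StaticName;
BEGIN
  WriteLn('Child.StaticName');
END;

PROCEDURE Child.VirtualName;
BEGIN
  WriteLn('Child.VirtualName');
END;

VAR
  P : BasePtr;
  C : ChildPtr;

BEGIN
  New(C, Init);
  P := C;            { a Base pointer wearing a Child object }

  P^.StaticName;     { prints Base.StaticName: fixed by P's declared type }
  P^.VirtualName;    { prints Child.VirtualName: chosen at run time }
  C^.StaticName;     { prints Child.StaticName: the override is still there }

  Dispose(C);
END.
```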

Static Objects

A static object is an object allocated in a program's data segment rather than on the heap. The term is another one that Borland International borrowed from C++ in implementing Turbo Pascal 5.5. QuickPascal, Smalltalk, and Actor make no use of the term, because in those languages all objects are dynamically allocated.

A static object is very much like a super-record having the special properties characteristic of object-oriented programming. As with ordinary records, you create static objects in the VAR section of your program:

     VAR
         MyRecord : RecordType;
         MyObject : ObjectType;

Nothing magical about it, no pointers, no need to mess with New or Dispose. Static objects are a good place to start learning about objects, since you can concentrate on using objects without worrying so much about creating them correctly.

Subclass

Like the term superclass, subclass is commonly used only with regard to Smalltalk. A subclass is a child class. In other words, if class B inherits from class A, class B is a subclass of class A.

Superclass

Within an object hierarchy (see also) every class has only one immediate ancestor, called its "parent" or superclass. This term is commonly used only in Smalltalk.

Virtual

The term virtual was one of several borrowed from C++ by Borland International and applied to their implementation of Object Pascal released as Turbo Pascal 5.5. It is not used in QuickPascal nor in Smalltalk nor Actor.

"VIRTUAL" is one of four new reserved words in Turbo Pascal 5.5. It is a qualifier placed after a method header in the object type definition:

   TYPE
     Location =
       OBJECT
         X,Y : Integer;
         Visible : Boolean;
         PROCEDURE MoveTo (NewX,NewY : Integer);
         PROCEDURE Show; VIRTUAL;
         PROCEDURE Hide; VIRTUAL;
         FUNCTION IsVisible : Boolean;
       END;

The use of the VIRTUAL reserved word makes the method that precedes it a virtual method, which is a method that may be late bound. (See late binding.)

Keep in mind that the word "virtual" in this context has nothing whatsoever to do with virtual memory or other storage issues. Also keep in mind that functionally, all methods in QuickPascal are virtual methods. In Turbo Pascal there is the option of defining methods as static methods (see also) and the reserved word VIRTUAL was chosen to differentiate between early and late bound methods.

Virtual Methods

Except in Turbo Pascal 5.5 and C++, all methods are virtual methods, so the term is not used much outside of Turbo Pascal 5.5. Virtual methods are defined as methods that take part in late binding (see also).

In early binding (see also) the address of a method is baked into the "CALL" machine instruction that performs the method call, by the compiler at compile-time. In late binding, the actual address of the method being called is not determined until the time that the call takes place. How this is done varies from language to language (and is done differently, in fact, by Turbo Pascal 5.5 and QuickPascal) but almost always involves a table of method addresses hidden inside the object that owns the methods. To make a late-bound method call, the address of the desired method is looked up in the table of method addresses, and then control is passed to that address.

Note that all virtual methods of the same name must have identical method headers, including the identical type and order of parameters.

The real power of virtual methods involves polymorphism, and is not an easy thing to describe in a paragraph or two. I'll take up the subject of late binding and polymorphism in detail in a future column.

Algorithms Over Easy

The difference between a sort library and a book of sort algorithms is the difference between giving a guy a fish and teaching him how to fish. If you don't know how your tools work, you're at their mercy -- which is akin to the feeling of downing your last sardine with no more in the can.

The best algorithms book I've ever seen crossed my desk last week: Turbo Algorithms, by Keith Weiskamp, Namir Shammas, and Ron Pronk. Space is short, so I can't describe it in detail. But the (rather remarkable) idea is this: The authors present a fairly large number of useful algorithms and then implement each one in all four Turbo languages -- Pascal, C, Basic, and Prolog. As you might imagine, the book is by needs terse, but it is nicely written and has some of the best technical figures I've seen in a long time. The algorithms cover sorting, searching, stacks, queues, binary and AVL (balanced) trees; singly-linked, doubly-linked, and circular lists; and word and token string processing. That's a lot of ground to cover with real code in four languages, but somehow, it works. I found the math section pretty tough going, but that could be my own aversion to the subject, and the rest of it was both graspable and immediately useful.

As it happens, Turbo Algorithms is one in a series of three solid books by the same authors, all of them spanning the four Turbo languages; but as Wiley's book distribution system is as brain-dead as their books are good, you may have to order them through your local bookstore. They are, however, well worth the wait.

Those Old Release-Level Blues

In my September column I raised some eyebrows by reporting that "Users of MS Pascal 5.0 should note that QuickPascal is only broadly compatible with MS Pascal...." Microsoft's current major release level of their command-line Pascal compiler is 4, not 5, a mistake ascribable to late-night, bleary-eye syndrome. Although the next version of Microsoft Pascal has yet to be announced, Microsoft has confirmed that any future release will be upwardly compatible with QuickPascal 1.0. The aboriginals who can only count 1 .. 2 .. many had their finger on something -- how are we ever going to keep these things straight in the year 2000, when we're dealing with MS Basic 21.0 and Turbo Pascal 17.5?

Products Mentioned

Turbo Algorithms: A Programmer's Reference
Keith Weiskamp, Namir Shammas, and Ron Pronk
John Wiley & Sons, Inc., 1989
ISBN 0-471-61009-7
Softcover, 444 pages, $26.95
Listings diskette $24.95

Turbo Language Essentials: A Programmer's Reference
Keith Weiskamp, Namir Shammas, and Ron Pronk
John Wiley & Sons, Inc., 1989
ISBN 0-471-60907-2
Softcover, 500 pages, $24.95

Turbo Libraries: A Programmer's Reference
Keith Weiskamp, Namir Shammas, and Ron Pronk
John Wiley & Sons, Inc., 1989
ISBN 0-471-61005-4
Softcover, 478 pages, $26.95
Listings diskette $24.95

QuickC with Quick Assembler
Microsoft, Inc.
16011 NE 36th Way
Redmond WA 98073
206-882-8088
$199