C PROGRAMMING

Dear Al ...

Al Stevens

Every now and then I spend time catching up on the letters from readers. In recent times my mailbox has contained some interesting comments on the contents of the column and the C language. Being hard-pressed for new copy this month (July) and as a result of a time-consuming boondoggle trip to a jazz festival in England, I thought I'd cop out and let some of you do my job for me.

Here then is the mostly reader-written "C Programming" column for this month.

Token Pasting

In two previous columns I discussed the ANSI preprocessor ## token-pasting operator, complaining that its purpose is not obvious. Even though the ANSI documentation explains the behavior of ##, it does not adequately explain the operator's reason for being. Several readers responded with their comments, and some sent code that illustrates their use of the ## operator. Because those readers were good enough to enlighten me, I thought it only proper that I should pass the good news on to the rest of you.

Robert L. White of Danbury, Connecticut presents an example from a menu interpreter: I use [the ## operator] frequently in macro definitions to simplify the generation of tables. It tends to make the differences in tables stand out better.

Here's an example from a menu interpreter. This thing used tables to describe various menus. In order to keep our sanity, we developed a naming convention. Lists of valid keys for a given menu went in the XXX_Text array, and lists of functions went in the XXX_Funcs array. The addresses of these various lists were stored in a structure called a Menu Descriptor.

  struct MenuDesc
  {
        int            *keys;
        char        **Text;
        int            (**Funcs)();
  };

We could have had a table that looked like this:

struct MenuDesc AllMenus[] =
{
    Main_Keys, Main_Text, Main_Funcs,
    Brain_Keys, Brain_Text, Brain_Funcs,
    Bone_Keys, Bone_Text, Bone_Funcs
   /*  ... etc. ... */
   };

But, using the following macro, we made the table simpler.

#define MENU(name)\
              name##_Keys, name##_ Funcs,
       struct MenuDesc AllMenus[] =
       {
    MENU(Main),
    MENU(Brain),
    MENU(Bone),
        /*  ... etc. ... */
       };

Our tables were actually much more complex than this, but this gives you an idea of how we used the token-pasting operator to simplify the source code. Token-pasting also increases the code's validity by eliminating opportunities for misspelling.

Mr. White's application of the ## operator hits home because his MenuDesc structure closely resembles the one that I used in the menu drivers for the SMALLCOM project last year. His example illustrates one of the strengths of the ## operator -- its ability to reduce repetitious code and therefore reduce the potential for coding errors.

Fred Smith of Stoneham, Massachusetts sends us some history: I have just re-read your column in the July 1990 issue, and note that you are apparently unaware of the origin of token pasting in C. Don't feel bad, for that is certainly not widespread information. It may be that I don't have all the details either, but here's what I know about it: There is at least one preprocessor in widespread use on Unix systems (I have heard it called "the Reiser preprocessor") which allows the use of an empty comment "/**/" as a token-pasting operator in preprocessor statements. This preprocessor is used on (at least) many BSD-derived systems, such as Sun workstations. Example:

  #define pasteup(a,b) (a/**/b)

to give you the "Reiser" equivalent of your example on page 134 of the July issue.

This is a horrible kludge, as it directly violates what K&R said about what the preprocessor does with comments, i.e., comments are replaced with white space. Clearly, at least in the case of an empty comment, this preprocessor actually replaces them with nothing. Now, this is occasionally useful, but it certainly isn't C! (to say nothing of being highly non-portable!).

At my last job I had the (mis)fortune to be involved with a (very) large C program which was full of non-portable Sunisms and BSD-isms, including this one. An interesting use was made of the token-pasting hack, though. It was used to create a general-purpose queue (or linked list) management package implemented entirely in the preprocessor.

There were at least a dozen macros defined as part of the package -- one for creating in the preprocessor the declaration of the data structures you wanted, one to initialize it, one to add a new element, one to remove an element, macros for traversing the list, allocating space for each structure, etc.

The following is a sample macro which shows the (simplified) flavor of these macros:

#define Q_NEXT(a,b)\(Q_/**/a ->b.nextchild)

so that if this were used in a program like this:

child=Q_NEXT(pending_illo_list, current_item);

the expansion would look like this:

child=(Q_pending_illo_list -> current_item.nextchild);

I admit that this capability is useful and powerful. However, I think that this kind of power is easily abused because it makes for code that can be extremely terse and difficult to read, even with the macro expansions in hand (note that the actual macros in the package I describe were much more complex than the example I give here). Because of both the obfuscation factor, and the blatant non-portability of token pasting (either the ANSI sort or the Reiser sort) I would think, not twice, but more like twenty times before perpetrating such code.

Mr. Smith goes further to reveal that the pre-ANSI token-pasting technique I found in a C++ header file works with Microsoft C 5.1 and fails with QuickC 2.0. His linked-list example hints at ways to use the preprocessor token-pasting to build a primitive form of parameterized data types, an issue currently being addressed by the C++ community.

Steve Pritchard of Willowdale, Ontario uses the ## token paster to hide the elements of C syntax and make the operative values more prominent when defining a table. He tells us his design philosophies: In programming I try to use the compiler to assist me in making code understandable and maintainable. I know there are diversities of thought as to whether this means using exotic tricks or just using the basic compiler functions. In the style of coding I use, I sequence in the following priority:

    1. Removal of multiple, scattered dependencies/references

    2. Removal of redundant information

    3. Keeping things simple

Mr. Pritchard sent a lot of code with his letter. I will try to capture the essence of his token-pasting ideas with this fragment that uses the ## operator to eliminate clutter in a table of hexadecimal values.

   #define h(n) 0x##n
   #define tbl(a,b,c,d) h(a),h(b),h(c),h(d)
   int mytable[] =
   {
         tbl(02,09,2a,ff),
         tbl(00,1a,1b,e5)
   };

This code is not an exact extract from Mr. Pritchard's examples, which are in the context of a bigger problem, irrelevant to this discussion. I took the liberty of recasting the technique into a more general example.

There is room for debate about whether the elimination of the 0x notation in the table initializers adds to readability or confusion. By hiding the C syntax, you could mislead the reader of the code. That 02 initializer looks suspiciously like an octal constant. The ff looks like an identifier. Others look like syntax errors. The camouflage is effective only if the reader knows how the coder used the preprocessor. But there is the germ of an idea here. You can use this technique effectively to initialize unusually big tables and thus make the code easier to read but only if you use eye-catching comments and place the macros nearby the tables where the reader can readily find them.

Probably the most ambitious of the token-pasters is Gregory Colvin of Boulder, Colorado who writes: I have found token-pasting invaluable for automatically creating the many declarations necessary to imitate C++ style objects in C code.

We are using macros to implement a user interface library that needs to be portable across all the popular GUI systems. Having spent the last year working in Object Pascal and C++ on the Mac, I didn't want to give up object-oriented programming. But my team at work insists on sticking with C. These macros have proven a reasonable compromise. I also found they had the virtue of removing some of the mystery from object-oriented programming by making the implementation much more apparent.

Mr. Colvin sent several source files that do what he requires. The code includes definitions of a base member class complete with common methods. Then it implements a form of class inheritance by defining a CLASS macro that associates the inherited members and methods of a child class to those of the base. The ## operator pastes the common elements of the various structure names to that of the specific class. I have not used Mr. Colvin's approach but have often thought that such a technique might be possible. I have included the three source files, class.h, class.c, and test.c as Listing One, page 147, Listing Two, page 148, and Listing Three, page 148 for your enjoyment and experimentation.

Greg Colvin works at Information Access Systems in Englewood, Colorado. The group he works in uses a version of this code in a standard user interface tool kit.

An Author Responds

I reviewed two MIDI books by Jim Conger in July. He sent me a nice thank-you letter wherein he corrected my impression of his views of computers and people. Here, in part, is Jim's response.

Your quotes about computers and MIDI threatening musicians are actually from the preface written by my able technical editor, Mark Gavin. I stuck to writing code and explanations.

My own vision of the future role of computers in music is less imaginative than that described in your column. I do not see computers replacing composers any more than I can see computers replacing authors. Like word processors of music, MIDI sequencers have made the task of writing much more efficient and (to me) more satisfying. The authors and composers remain.

MIDI equipment can replace performers, but with mixed results. Even my best efforts on an expensive MIDI setup are far less expressive than those on a 300 year old wooden flute. Synthetic music seems best in the background of a film, not on the front of a stage.

My late father, also a programmer, continually encouraged me to try algorithmic composition. This would seem a natural for someone like myself; long trained in both programming and music. After a few experiments I discarded the sterile outputs, similar to the nonsense poetry written by word pattern analyzers. Intellectually interesting once, but of no emotional content.

The limits of what computers can do seem to result from the limits built into people. We are not beings of logic but are beings of emotion. Music addresses our emotions directly and fails when it attempts to reach us through our conscious thoughts. Millions of years of patterning lead us to a person, talking, singing, or performing with an instrument. Computers will have a difficult time filling this role.

Binding Codes and Messages

The letter from Robert White also included this gem: I have what I think is an innovative C programming technique that I want to pass on to other programmers. Some of my fellow programmers think it is too exotic because it uses the macro preprocessor and enumerated constants which many C programmers (can you believe it?) think are advanced elements of C language. [Not DDJ readers, Bob. -- AS]

The idea is to guarantee that symbolically named indexes into a table always reference the correct element of the table even after several iterations of the code. Sometimes this is hard to guarantee when several developers are working on the same project.

The classic example of such a pairing of index to object is the use of error codes to look up error strings. Here is the simple way of doing this.

   /* ----- ERRCODE.H ----- */
   #define SUCCESS                  0
   #define BrainError                 1
   #define BoneError                 2


  /* ----- ERRTEXT.C ----- */
   char *ErrorStrings[] =
   {
         "No error",
         "Brain error",
         "Bone error"
         };

What if one of the programmers on the project has an old version of the list of error codes and doesn't know it? He or she could spend a good portion of the day trying to track down the wrong error.

Wouldn't it be wonderful if there was a way of specifying the error code and the error string on the same source code line? If the same source code line specified both the code and the string they would always be in sync.

Here's how to do it. We construct a new include file that will serve double duty. When we want it to generate the error strings, we will set a flag for the preprocessor and have the preprocessor spit out a list of error strings. If the flag is not defined, we will force the preprocessor to spit out a set of enumerated constants guaranteed to match the list of error strings.

We do this by creating two definitions for a simple macro. If the flag is defined, the macro expands to the first argument only. If the flag is not defined, the macro expands to the second argument.

   #if defined Make The String Table
   #define Err(string,code) string
   #else   #define Err(string,code) code
   #endif

The include file will contain the following invocations of this macro:

  /* ----- ERRCODE.H ----- */
  Err("No Error",            SUCCESS),
  Err("Brain Error",        BrainError),
  Err("Bone Error",        BoneError)

To define the table of error messages we now do this:

   #define MakeTheStringTable
   char *ErrorStrings[] =
   {
          #include "errcode.h"
   );

To define the table of indexes we do this:

   #undef MakeTheStringTable
   enum ErrorCode
   {
          #include "errcode.h"
   };

There is no need to limit this technique to generating constants and tables. It is really an aligning technique for any set of lists. The most common way of aligning various lists is to combine them into records or structures. This way you only have one list and all related information is defined on one source line. However, sometimes this is not possible or desirable and lists must be defined as separate arrays. Through a cleverly defined macro inside an include file, elements of both lists can be defined on the same line of source code.

This solution is simple and effective. It solves a problem that I've had maintaining error codes which were implemented exactly the way that Mr. White suggests.

TopSpeed C

I looked at TopSpeed C in April, and JPI responded with a letter to the editor in August. I chided TSC for not allowing you to call an interrupt function the way that Turbo C and Microsoft C do. Without this capability, an interrupt-driven program cannot always chain interrupts. I mentioned this deficiency to the president of JPI in February, and he agreed then that it needed to be addressed. JPI reports in their letter, however, that the _chain_intr( ) function provides a way for you to call an interrupt function in TSC. Not always enough. If the documentation is correct, this function chains -- rather than calls -- the called interrupt function to a calling interrupt function. This approach has two deficiencies. First, you can execute an interrupt function only from within another interrupt function. Second, the executed interrupt function returns to the place from where the calling function was invoked, not to the calling function itself. There are many circumstances where an interrupt driver needs to directly call an interrupt function which returns to the calling function. TSC does not allow this, and that restriction renders TSC useless for the development of many interrupt-driven programs.

Avast, Maties

The pirates thrive, and I jump from my mailbox to my soapbox. A letter from Kwee S. Phua of Victoria, Australia tells me he bought two of my C books and needs to get the source code. Dr. Phua says both books were printed in Asia by TP Publications of Singapore. Dr. Phua found my name in DDJ and wrote me here. I have never heard of TP Publications and neither has MIS Press, the publisher of my books. But apparently when TP builds illegal copies of computer books, they lack the grace to include the diskette order forms. I guess they are too stupid to bootleg the diskettes as well. Bob Williams, the boss at MIS Press, tells me the Singapore book stores are loaded with pirated copies of books by Norton, Duncan, Schildt, and other notable authors. I shall not feel left out now, being in such prominent company. But Bob also says that nothing can be done about it. Programmers and authors should know that thieves have franchised a significant segment of their marketplace -- Scurvy dogs.

_C PROGRAMMING COLUMN_ by Al Stevens [LISTING ONE]



/************ CLASS.H COPYRIGHT 1990 GREGORY COLVIN ************
This program may be distributed free with this copyright notice.
***************************************************************/

#include <stddef.h>
#include <assert.h>

/** All objects must be descendants of the Base class, so we
    define the members and methods of Base here. **/
#define Base_MEMBERS
#define Base_METHODS                                           \
    Base *(*create) (void *table);                             \
    Base *(*clone)  (void *self);                              \
    Base *(*copy)   (void *self, Base *other);                 \
    void  (*destroy)(void *self);

typedef struct Base_methods Base_Methods;
typedef struct Base_members Base;
struct Base_members {
    Base_Methods *methods;
};
struct Base_methods {
    char *name;
    size_t size;
    Base_Methods *selfTable;
    Base_Methods *nextTable;
    Base_METHODS
};
extern Base_Methods Base_Table, *Class_List;
Base_Methods *TableFromName(char *name);
Base *ObjectFromName(char *name);

/** The CLASS macro declares a Child class of the Parent. Note
    that Child##_MEMBERS, Child##_METHODS, Parent##_MEMBERS, and
    Parent##_METHODS must be already defined. All methods except
    create() require the first parameter, self, to be a pointer
    to an object of Child type; it can be declared void to stop
    compiler warnings when Parent methods are invoked, at the
    possible cost of warnings when methods are bound. The method
    table, Child##_Table, must be defined as a global structure. **/
#define CLASS(Parent,Child)                                    \
typedef struct Child##_methods Child##_Methods;                \
typedef struct Child##_members Child;                          \
struct Child##_members {                                       \
    Child##_Methods *methods;                                  \
    Parent##_MEMBERS                                           \
    Child##_MEMBERS                                            \
};                                                             \
struct Child##_methods {                                       \
    char *name;                                                \
    size_t size;                                               \
    Child##_Methods *selfTable;                                \
    Base_Methods *nextTable;                                   \
    Parent##_METHODS                                           \
    Child##_METHODS                                            \
};                                                             \
extern Child##_Methods Child##_Table

/** The INHERIT and BIND macros allow for binding functions to
    object methods at run-time. They must be called before
    objects can be created or used. **/
#define INHERIT(Parent,Child)                                  \
    Child##_Table = *(Child##_Methods*)&Parent##_Table;        \
    Child##_Table.name = #Child;                               \
    Child##_Table.size = sizeof(Child);                        \
    Child##_Table.selfTable = &Child##_Table;                  \
    Child##_Table.nextTable = Class_List;                      \
    Class_List = (Base_Methods *)&Child##_Table

#define BIND(Class,Method,Function)                            \
    Class##_Table.Method = Function

/** The CREATE macro allocates and initializes an object pointer
    by invoking the create() method for the specified Class.
    The DEFINE macro declares an object structure as an
    automatic or external variable; it can take initializers
    after a WITH, and ends with ENDDEF. The NAMED macro creates
    an object pointer from a class name. **/
#define CREATE(Class)                                          \
    (Class *)(*Class##_Table.create)(&Class##_Table)

#define DEFINE(Class,ObjectStruct)                             \
    Class ObjectStruct = { &Class##_Table
#define WITH ,
#define ENDDEF }

#define NAMED(ClassName) ObjectFromName(ClassName)

/** The VALID macro tests the method table self reference. **/
#define VALID(ObjectPtr)                                       \
    ((ObjectPtr)->methods->selfTable == (ObjectPtr)->methods)

/** The SEND and CALL macros invoke object methods, through the
    object's methods pointer with SEND, or directly from a
    method table with CALL. Method parameters besides self may
    be sent using WITH, and END terminates the invocation. **/
#define SEND(Message,ObjectPtr)                                \
(   assert(VALID(ObjectPtr)),                                  \
    (*((ObjectPtr)->methods->Message))((ObjectPtr)

#define CALL(Class,Method,ObjectPtr)                           \
(   assert(VALID(ObjectPtr)),                                  \
    assert(Class##_Table.selfTable == &Class##_Table),         \
    (*(Class##_Table.Method))((ObjectPtr)

#define END ))

/** The DESTROY macro invokes an objects destroy() method **/
#define DESTROY(ObjectPtr) SEND(destroy,(ObjectPtr))END




[LISTING TWO]


/************ CLASS.C COPYRIGHT 1990 GREGORY COLVIN ************
This program may be distributed free with this copyright notice.
***************************************************************/
#include <assert.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include "class.h"

Base *BaseCreate(Base_Methods *table)
{   Base *new = (Base *)calloc(1,table->size);
    assert(new);
    new->methods = table;
    return new;
}

Base *BaseClone(Base *self)
{   Base *new =  (Base *)malloc(self->methods->size);
    assert(new);
    memcpy(new,self,self->methods->size);
    return new;
}

Base *BaseCopy(Base *self, Base *other)
{   memcpy(self,other,self->methods->size);
    return self;
}

void BaseDestroy(Base *self)
{  if (self)
       free(self);
}

Base_Methods Base_Table = {
    "Base",
    sizeof(Base),
    &Base_Table,
    0,
    BaseCreate,
    BaseClone,
    BaseCopy,
    BaseDestroy
};
Base_Methods *Class_List = &Base_Table;

Base_Methods *TableFromName(char *name)
{   Base_Methods *table;
    char *pname, *tname=table->name;
    for (table=Class_List; table; table=table->nextTable)
        for (pname=name; *pname == *tname++; pname++ )
            if (!*pname)
                return table;
    return 0;
}

Base *ObjectFromName(char *name)
{   Base_Methods *table = TableFromName(name);
    if (table)
        return table->create(table);
    return 0;
}





[LISTING THREE]


/** TEST.C  Add one to argv[1] the hard way. This program serves
no real purpose except invoking most of the CLASS macros. **/
#include <stdio.h>
#include <stdlib.h>
#include "class.h"

/** Define My class members and methods. **/
#define My_MEMBERS int stuff;
#define My_METHODS void (*set)(void*,int); int (*next)(void*);
CLASS(Base,My);

/** Define space for My method table. **/
My_Methods My_Table;

/** Define functions to implement My methods. **/
void MySet(My *self, int new)
{ self->stuff = new;
}

int MyNext(My *self)
{ return ++self->stuff;
}

/** A function to be called in main to initialize My method table. **/
void MyInit( void )
{ INHERIT(Base,My);
  BIND(My,set,MySet);
  BIND(My,next,MyNext);
}

/** Make My class do something. **/
My *test( int i )
{ int j;
#if 1
  My *objectPtr = CREATE(My);   /* One way to make an object */
#else
  DEFINE(My,object)ENDDEF;      /* Another way */
  My *objectPtr= (My *)SEND(clone,&object)END;
#endif
  CALL(My,set,objectPtr) WITH i END;
  if (j = SEND(next,objectPtr)END)
    return objectPtr;
  DESTROY(objectPtr);
  return 0;
}

main(int argc, char **argv)
{ int arg = atoi(argv[1]);
  My *out;
  MyInit();
  if (out = test(arg)) {
    printf("%d\n",out->stuff);
    DESTROY(out);
  } else
    printf("0\n");
}