WATCOM C7.0

Good things do get better

John M. Dlugosz

John is a free-lance writer and software developer and can be reached at P.O. Box 867506, Plano TX 75086, or on CompuServe at 74066,3717.


You'll notice right away that Watcom is unique. As you open the box you'll find a library-quality magazine holder containing several slim books, a package of disks, and a foam block that prevents the contents from rattling around during shipping. Throw the foam away and you've got a convenient caddy, the perfect size for storing books. This kind of functionality sums up Watcom C7.0.

Documentation

Watcom's documentation consists of a Library Reference Manual, Language Reference (including a Programmer's Guide), Optimizing Compiler and Tools User's Guide, Express C User's Guide, and Graphics Library Reference. Among the smaller documents are an Addendum to the Library Reference, Watcom Editor User's Guide, a Read Me First, and a set of 8.5- x 3-inch fanfold reference cards (one covers the compiler and tools use, including C syntax, another the library, and a third the graphics library), that fits nicely under the keyboard.

The no-nonsense installation instructions are written with the assumption that you know what you're doing, unlike other instructions I've recently seen. They tell you what is on the disks and how the installation program (which is excellent) will arrange the files on the hard disk. You can do without the install program if you like.

The User's Manual tells you how to run the compiler, debugger, make utility, and associated tools, and about the pragmas, calling conventions, and other important information. The Language Reference manual is a reference to the C syntax, and provides compiler-dependent information (bit field organization, for instance). Everything you need to know about the compiler and the language implementation seems to be listed, except how multi-byte character constants (such as 'ab' -- Watcom, is that Ox6162 or Ox6261?) are handled. One shortcoming, however, is that the books neither reference each other nor have a common index.

The information in the User's Guide for the compiler and the debugger is disorganized, though the index is excellent. I thought the book suffered from an attempt to arrange sections in a "tutorial" order instead of as a "pure" reference manual. Nevertheless, as a reference for someone who knows what he is looking for, the documentation is a winner.

Utilities

The driver WCL is a typical C compiler driver. You can list file names (with wild cards) and options on the command line to compile and link the files. You can give the program OBJ files, LIB files, and C source files, but not ASM files. Unfortunately the options apply to all the files, rather than just those files that are listed after the option switch, which means that you cannot give different options to different files.

You can reduce memory requirements and increase speed by using WCC, the C compiler, in make files or anywhere else the driver is not needed or wanted.

Because the linker, WLINK, does not use the same commands as Microsoft's linker (and those that imitate it), you'll have problems with your existing linker response files. (WLINK is modeled after PLINK.) I took an existing project and tried to switch to Watcom's compiler: first switched to the WCC command in the make file (with the proper switches), keeping my old linker so I would not have to rewrite the linker response file. I ran into two problems: First, other linkers cannot handle Watcom's OBJ modules that contain debugging information. Second, because Watcom's OBJ files use a different format for storing the default library search, other linkers will not find it, and you will have to add the library names to the linker response file.

The linker does, however, support multiple overlays, and lets you include or exclude debugging information on a module-by-module basis.

Also included is a window-based debugger called WVIDEO. I'm not sure how much memory it needs, but I know it requires much less than CodeView. Because it loads symbols from the disk module by module, you can include debug information on everything and not run out of memory. All windows are the full width of the screen, can be moved or resized vertically, and can overlap.

The debugger is powerful, and has a macro language that controls all the features and functions. You can use the command line to type in commands or select options (using the Alt key) from a menu along the top of the screen. You must use the up- and down-arrow keys, not the initial keys, and press Return to select choices from submenus, or you can choose items with a mouse. The Tab key moves the cursor from window to window; with a source or assembly listing window, for instance, you can move the cursor and inspect different parts of the program.

Commands (or a sequence of commands) can be assigned to a single key, and are context-sensitive based on the window you are in. So you can assign a key to set a breakpoint at the cursor position in the source window, another to single step one step, or whatever you like. The file called PROFILE.DBG is loaded when the session is started and contains all the Hot-key assignments, window layout, and everything else you start up with. You can modify this file to arrange things the way you like.

The Debugger

Debugger macros can refer to variables in your program, or to variables defined by the debugger. A printf-like function displays values, an If statement makes conditional choices, and long scripts can be stored in disk files. What I liked best, however, is that breakpoints can trigger macro execution.

The debugger is not perfect, however, as there is no "watch" window. (I'd like a window that would display the results of a number of expressions that I want to keep an eye on.) A "command" window, however, can be configured to function as a watch window. The command window is considerably more powerful than a typical "watch." It will execute a statement list, which presumably contains some PRINT statements, every time the window is updated.

The other thing I don't like about the debugger is that an array of chars is always displayed as the address of the first element. If I want to see the value of a string, I must use a format string like ? {%s} name to display the contents of name in a meaningful way. Unfortunately the same thing happens when displaying an entire structure -- if you print a structured type variable, all the fields are printed. You can also bring up an interactive inspector, which shows the fields, and then select any of them to display. In both forms, strings show up as their address.

When debugging is complete, the WSTRIP utility will remove debugging information from the EXE file. All in all, the debugger is good enough for you to invest the time it takes to learn how to use it well.

The Compiler

Though the compiler does not generate a compiled listing, a separate utility, WDISASM, disassembles an OBJ file and correlates it with the original C source. This is the best such utility I have seen. You can specify many details, like putting register names in all caps, using [BX-2] form instead of 2[BX], including symbol references, and producing an assembled listing or assembly source code only. With full symbolic information in the OBJ file, it does a great job. Other tools included are a library manager, a touch utility, a make utility, and a patch utility.

The compiler itself is the heart of the package. It is somewhat slower than other compilers, but it finds all the errors before it starts crunching away on the code generation part, so the development turn-around time is very fast. One nice touch is that if you see a warning you don't like, the Break key stops compilation and aborts instantly.

The Code Generator

The code generator is terrific. It passes parameters in registers, which can do wonders for both size and speed. In some cases the object code produced was a third of the size of Microsoft's. For short and simple functions, much of the overhead of a function call is eliminated, and it becomes more of an assembly language subroutine than a high-level language function call. Figure 1, for instance, shows a simple function that has been optimized. Notice that a stack frame is not generated and that the resulting function is a simple subroutine that doesn't have the overhead of a high-level language call.

Figure 1: A simple function that has been optimized

  Function

            const int c= 5;

            int foo (int x)
            {
            int y;
            y=2*x+c;
            return y;
            }

            int d;

            void bar( )
            {
            d= foo(d);
            }

  Code generated by Watcom C7.0

  foo_         shl        AX,1
               add        AX,_c
               ret

  bar_         mov        AX,_d
               call       foo_
               mov        _d,AX
               ret

  Compare this with:

  foo          push       BP
               mov        BP,SP
               mov        AX,[BP+4]
               shl        AX,1
               add        AX,_c
               pop        BP
               ret

  bar          push       BP
               mov        BP,SP
               mov        AX,_d
               push       AX
               call       foo
               add        SP,2
               mov        _d,AX
               pop        BP
               ret

With optimization disabled, the code generator still generates fair code, so some simple improvements must be built into the code generator on a primitive level. Without optimization, the symbolic debug information is superb. All the line numbers and local symbol information are exactly where I would expect, without quirks. Even normal optimization is easy to follow in the debugger, with the exception of loop optimization, where invariant expressions are hoisted out.

There are numerous ways to fine-tune code generation. You can use a pragma to modify function attributes, and you can specify the calling sequence in some detail (including how parameters are passed, how the call is made, how the value is returned, who clears any parameters from the stack, which registers are not preserved by the function, what name the linker will know it by, and other information). You can even have a "function" that generates a sequence of bytes (which should correspond to some meaningful assembly language) instead of a call instruction. For example, I had a function called bios_int( ) that was written in assembly language. It took four unsigned parameters and loaded them into the AX, BX, CX, and DX registers, did an INT 10h call, and returned the value from AX. With Watcom 7.0, I can define a pragma saying that int_bios receives its parameters in AX, BX, CX, DX, returns a value in AX, and generates an INT 10h instruction when the function is called instead of calling a function. In other words, the assembly language function was completely eliminated and replaced by intrinsic code generation that does the same thing.

The amount of information that can be specified by pragmas is enormous. You can fine-tune a program's performance, call functions that were compiled by a different compiler, and fix linkage problems.

One interesting feature is that you can use a pragma to specify exactly what "cdecl," "pascal," and "fortran" modifiers mean. For example, you can have cdecl mean the Microsoft C calling convention, or you could have it mean something else if you were using libraries produced for another compiler. This can be set up to use existing assembly language routines. There is one quirk, though. In large models, the DS register is used freely and does not necessarily point to DGROUP when your assembly language function is called. A compiler switch can be used to eliminate this behavior, and DS points to DGROUP throughout the program. It would be better to allow the pragma to specify that DS be loaded before the function is called.

As I said before, errors in the source are quickly found. You can even call the compiler with a switch for syntax-check only. For the most part, the check finds mistakes in the source that you would expect a modern day compiler to find, though it can miss some. Figure 2 illustrates that it did not warn me that I was passing a far pointer where a long was expected. Neither does it always catch type errors where typedefs are involved.

Figure 2: Example demonstrating how a parameter of incorrect type is passed without warning

           typedef long BIG;

           void foo (BIG);

           void bar ()
           {
           char *buf;
           foo (buf);
           }

In addition to this compiler, the package includes an Express C integrated environment compiler, though most professional programmers would rather have a regular version.

ANSI Conformance

Because the packaging proudly proclaimed "100% ANSI," I decided to delve into the conformance issue. I tried the home version of the Plum-Hall validation suite, and found that the symbol HUGE_VAL caused a linker error because, it seems, the value is actually defined in the floating point library. The library name was not included for linking, even though a variable of type double was defined and a floating point comparison was made. This shows that you might define a double variable and call printf( ), and have it not work because the printf( ) linked in was taken from the regular library. This probably would have worked on a more realistic program. I brought this to Watcom's attention, and they indicated that they'll move the symbol into the regular library. (Watcom ought to link the floating point library in if any floating point variable is defined, as the method they are using does not catch all the cases where it is needed.)

For things like the order of translation, double-slash comments, line continuation, required headers, and even trigraphs, C7.0 came through with flying colors.

While the const and volatile key words are recognized, the semantics are not all available. A direct constant variable is treated properly, but pointers to constants are not. In a case like const char *s;, the assignment *s=c; is not caught as an error. As for volatile, it does not always prevent the kinds of optimization it is supposed to. Figure 3, which is part of a const semantics test I ran, indicates that Watcom 7.0 does not implement pointers to const values.

Figure 3: This test indicated that Watcom C7.0 does not implement pointers to const values

  void f101( )
  {
  const char *const s=g;
  *s= c;//error, but allowed by Watcom
  s= p;//error
  }

I looked at the standard library and found all the standard headers. I then checked some of the more exotic aspects of ANSI-ism. The library has locale functions (probably the single most difficult thing to implement) but only the "C" locale is supported, so you can't call setlocale( ) (or if you do call it with C, it doesn't actually have to do anything!). Likewise, localeconv( ) can be a trivial function, and any functions that depend on the current locale don't have to do anything special because the locale never changes.

There are multi-byte character handling functions like mblen( ), which gives the number of bytes in the multi-byte character pointed to, and mbtowc( ), which converts a multi-byte character to a wide character. But these functions are trivial because the library does not actually support character sets that have characters comprising more than one byte. Many of the new functions (40 to be exact) are listed in the Addendum, but not in the Library Reference nor in the quick reference card.

The standard library is rich, having all the ANSI required functions and all the DOS expected functions, like FP_SEG( ) and intdos( ). Functions like _splitpath( ), _heapchk( ), and _heap-shrink( ) (which frees any unused memory back to DOS in preparation for a system( ) call) are welcome additions to this version. The library supports most of the niceties that I've encountered on other compilers with one exception: I would like to be able to automatically expand wild card file names in the argv[ ] list passed to main( ).

Wrapping It Up

Watcom 7.0's strong suit is its code generation. With its efficient calling conventions, good optimization, and powerful pragmas, it can generate tight and fast code. It has a rich library, and is a good first cut at an ANSI-conforming implementation. The linker handles overlays well and the debugger can set up complex debugging scenarios. I would be wary of the flaws in compile time error checking, though. That Watcom 7.0 occasionally misses bad parameter types and doesn't fully implement const detracts from the overall evaluation. Watcom 7.0 is worth using for the code generator if that is what is important to you.

Product Information

Watcom C7.0: requires an IBM PC/XT/AT and PS/2 and compatibles with 512K memory, DOS 2.0, or later.

Watcom C7.0/386: is exactly the same as C7.0 but (obviously) requires a 386-based machine. C7.0/386 compiles 32-bit 80386 native code. The initial release does not include a linker or debugger, although future releases will. This release does use Phar Lap DOS Extender Tools and AI Architect's OS/386. $895. Watcom Products Inc., 415 Phillip Street, Waterloo, ON N2L 3X2, Canada; 1-800-265-4555.


Copyright © 1989, Dr. Dobb's Journal