EXAMINING ROOM

THE STATE-OF-THE-ART IN MODULA-2

Kent Porter

Kent Porter, DDJ's senior technical editor, also writes the "Structured Programming" column. Kent can be reached through CompuServe at 76704,51 or through MCI: kporter.

In the past couple of years, Modula-2 has progressed from the status of a newcomer to that of a viable language for both applications and systems software development. DDJ last compared Modula-2 compilers in the August 1986 issue. Since then, the state of the art has advanced dramatically, and so has interest in Modula-2 among DDJ readers. Recent introductions have still further strengthened Modula-2 as a potential successor to both Pascal and C. It's time to take Modula-2 seriously, and consequently this article surveys four leading Modula-2 development systems.

Three of these Modula-2 compilers have been introduced this year. The old-timer is Logitech, which is now in Version 3.0 and has been around for several years. The newcomers are FTL from Workman Associates, TopSpeed by Jensen and Partners International (JPI), and Stony Brook. Others are available as well, but review copies did not arrive in time to be included.

To one degree or another, all of these products provide an "environment"--a source program editor that interacts with the compiler--along with optional command-line compilation and widely varying program-development aids. Two compilers have their own linkers, and the others use the DOS linker. Without exception, they all support the core language as defined by Wirth, plus they offer extensions. And, of course, some claim to be the fastest, a clamor that will be dehyped with the benchmarks later in this article.

While support for the basic elements of Modula-2 effects a degree of standardization, it does not ensure portability of a Modula-2 source program from one implementation to another. Like C, Modula-2 is a limited language that derives much of its power and flexibility from external routines. For example, Wirth recommends that I/O service routines reside in certain libraries and take thus-and-such parameters, but he doesn't mandate them. Nor does any formal standard for Modula-2 yet exist. Consequently, implementors take liberties. These liberties inevitably lead to discrepancies that diminish portability.

Table 1, page 67, shows how each implementation imports the external routines that write a string and a newline to the display (WriteString and WriteLn in Wirth's naming).

Table 1: Incompatibilities in declaring common external I/O routines

   Wirth:               FROM InOut IMPORT WriteString, WriteLn;
   FTL:                 FROM Terminal IMPORT WriteString, WriteLn;
   Logitech:            FROM InOut IMPORT WriteString, WriteLn;
   TopSpeed:            FROM IO IMPORT WrStr, WrLn;
   Stony Brook:         FROM InOut IMPORT WriteString, WriteLn;

Logitech and Stony Brook implement Wirth's recommended InOut library, but FTL and TopSpeed do not. In fact, TopSpeed doesn't even use the same procedure names (although it does provide alternative libraries that do). Therefore, moving a program from one implementation to another is likely to necessitate making changes to accommodate the deviations from Wirth.

Deviations are not necessarily bad, and certainly not unique to Modula-2. If you buy one compiler and stick with it (as most programmers do), you'll never confront the issues of portability. Languages governed by formal standards also cause headaches in porting between compilers. Ada comes as close as any language to being universal, yet even it is imperfect.

Turning to specifics, this article profiles each of the products. Space is limited, so the focus is on the main features, which are summarized in Table 2, on page 67. The performance aspects are covered later in the section "Benchmarking the Compilers."

Table 2: Modula-2 compilers: features, requirements, and pricing

                        FTL          LOGITECH     TOPSPEED      STONY BROOK
Editor:
  Style                 Wordstar     Point        Turbo         Brief-like
  Customizable          Macros       .INI files   .MNU file     No
Compiler:
  Switches              7            21           12            18
  Directives            None         6            24            None
  Memory models         2<1>         1<5>         Note<10>      6
  Overlays              No           Yes          Yes<9>        No
  Inline assembly       Yes          No           No            No
  8087 support          Yes          Yes          Yes           Yes
  '286 support          Yes<2>       Yes          No            No
Linker                  Included     DOS<6>       Integrated    DOS
  Switches              11<3>        N/A          7             N/A
Debugger                Symbolic     Note<7>      Symbolic<9>   Symbolic
Make                    Yes          Optional     Integrated    Yes
Librarian               Yes          No           Integrated    No
Other utilities         Yes          Optional     No            No
Standard modules        28           63           13            23
Standard identifiers    233          479          259           157
System requirements:
  Lowest DOS version    2.0          2.0          2.0           2.0
  Memory                256K         512K         384K          ?
  Floppies              2            2            2             2
  Hard disk             N/R          Recomm.      Recomm.       Recomm.
  Color monitor         N/R          N/R          Recomm.       N/R
Base price              $49.95<4>    $99.00<8>    $99.95        ?
Version tested          1.08         3.03         1.10          1.10

Notes:

1.   FTL small and large models are separate products.
2.   In large model only, via a linker switch
3.   In large model. Four for small model linker
4.   Small model. Large model is $79.95
5.   Large model only
6.   Logitech linker available with optional toolkit (not evaluated here)
7.   Base package includes post-mortem debugger. Interactive symbolic debugger
     available in optional toolkit.
8.   $249.00 with optional toolkit
9.   Optional and extra
10.  Programmer-defined memory models

To compile:    M2 /C yourmod
To link:       M2 /L yourmod

TopSpeed deviates from the de facto Wirth standard in the naming of default I/O routines. For example, Wirth specifies the module InOut exporting WriteString to output a string and WriteLn for a newline. TopSpeed's implementation instead uses WrStr and WrLn, respectively, exported from a library called IO. This might make you uncomfortable, especially if you're just learning Modula-2. For that reason, one of the three distribution diskettes contains a directory called \CORE, which has a number of compatibility modules (one of which is InOut) that allow you to use Wirth's recommended libraries and standard procedure calls. All you have to do is compile them, and TopSpeed becomes compliant with the Wirth standard and with other Modula-2 compilers.

Of the development systems reviewed here, only JPI's TopSpeed supports dot-addressable graphics with intrinsic calls such as Circle, Line, and Polygon. It's not yet at the level of Borland's .BGI drivers, but ahead of the other Modula products. TopSpeed also provides advanced text manipulation and cursor control.

An optional symbolic debugger comparable to CodeView is available from JPI, as is a $59.95 toolkit. This toolkit includes an assembler and ROM-burning software, plus modules for communications, terminate-and-stay-resident programs (TSRs), critical error handlers, EMS, and overlays. It also provides source code for TopSpeed's start-up and run-time libraries.

DDJ doesn't give unqualified raves very often, but there's no question about it in this case: JPI's TopSpeed Modula-2 is first-rate.

Stony Brook

As the benchmark discussion later reveals, Stony Brook delivers outstanding performance. Unfortunately, its tools don't measure up in comparison with the others. Stony Brook comes with an editor, compiler, libraries, and a rudimentary debugger. That's all.

The editor is somewhat like Brief, though less capable. For example, you can't customize it, and you can only work with two files at a time on a split screen. The Alt-C command enables you to compile the file in the active window from within the editor. An error positions the cursor on the offender and generates a message explaining why the compiler choked. It's usually sufficient to suggest a fix.

But editing and compiling are all you can do from within the editor. To link and run, you must exit to DOS. Although there's nothing wrong with this strategy, integration is not as good as with the other versions reviewed here.

The debugger is primitive: better than nothing, but not as good as it might be. There's nothing visual about it. You type commands and get answers on a blind line-by-line basis, without the benefit of watch windows and dancing execution bars showing where you are in the source. The only way to work with this debugger is with a line-numbered source listing by your side.

I like Stony Brook Modula-2, and I've said so in the "Structured Programming" column. Even its competitors make admiring noises about it. But performance, and not features, sells it. If Stony Brook can retain the performance while ramping up the features and libraries in future releases, it will have a strong development system.

Benchmarking the Compilers

Benchmarks are always controversial, of course, and these will probably be no exception. In evaluating these results, bear in mind that any given benchmark measures performance of a certain limited set of activities. Therefore, a benchmark provides only a general idea of how one computing element (a compiler in this case) stacks up against others of its kind. The set of benchmarks as a whole provides a reasonably good feel for overall relative performance, but the real test is your particular application on your machine.

These tests were all performed on a Tandon AT clone running at 8 MHz with no-wait-state memory and a 35 ms, 40-Mbyte hard disk. The timing figures were provided by a supervising program that ties into mode 2 of the 8253 timing chip, which provides a granularity of 838 nanoseconds. The timer program carries seconds out to four decimal places, which have been rounded to two places in the tables. The figures are the average of five executions each, and represent the total of load-and-run time.
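The 838-nanosecond granularity follows from the timer chip's input clock. The sketch below is not the supervising program used for these tests; it only reproduces the arithmetic, assuming the standard PC 8253 input frequency of 1.193182 MHz. The names PIT_CLOCK_HZ, tick_ns, and ticks_to_seconds are hypothetical.

```python
# Assumption: the 8253 timer on a PC/AT is clocked at the standard 1.193182 MHz.
PIT_CLOCK_HZ = 1_193_182


def tick_ns(clock_hz: int = PIT_CLOCK_HZ) -> float:
    """Duration of one timer tick in nanoseconds."""
    return 1e9 / clock_hz


def ticks_to_seconds(ticks: int, clock_hz: int = PIT_CLOCK_HZ) -> float:
    """Convert a raw tick count into elapsed seconds."""
    return ticks / clock_hz


if __name__ == "__main__":
    # One tick, rounded to whole nanoseconds.
    print(round(tick_ns()), "ns per tick")
    # A hypothetical count of ten million ticks, shown to four decimal places
    # the way the timer program reports seconds.
    print(f"{ticks_to_seconds(10_000_000):.4f} s")
```

At that resolution, rounding the reported times to two decimal places discards far more precision than the timer itself loses.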

In all cases, the compiler/linker command-line options were set to yield the best execution time for the resulting program. When possible, overhead operations (such as subscript range checking) were disabled. Where alternative libraries were available--as in the case of TopSpeed's limited run-time system and FTL's 8087-only floating-point, for example--I used them. Every effort was made to achieve the most favorable results.

The averages summarizing all except Table 5 are geometric, not arithmetic. A geometric average gives equal weight to all rows. A row that is much "bigger" or "smaller" than others doesn't dominate the results. For example:

                    A            B
1                 90.0         80.0
2                  5.0         10.0
                  ------     ------
Total             95.0         90.0
Arith.avge.       47.5         45.0
Geom.avge.        21.2         28.3

Based on the arithmetic average, B seems to outperform A. Test 1 is inordinately long and Test 2 is very short. Giving equal weight to both tests by using a geometric average, you see that, in fact, A outperforms B.
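To make the distinction concrete, here is a minimal sketch that computes both averages for the two sample columns. It assumes the geometric average is taken in the usual way, as the n-th root of the product of n values (computed here through logarithms). The function names are illustrative only and are not part of the benchmark suite.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed via logs to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Sample times from the two-row example (lower is better).
a = [90.0, 5.0]
b = [80.0, 10.0]

if __name__ == "__main__":
    for name, times in (("A", a), ("B", b)):
        print(name,
              "arith:", round(arithmetic_mean(times), 1),
              "geom:", round(geometric_mean(times), 1))
    # The arithmetic mean favors B; the geometric mean favors A.
    print("arithmetic winner:", "A" if arithmetic_mean(a) < arithmetic_mean(b) else "B")
    print("geometric winner: ", "A" if geometric_mean(a) < geometric_mean(b) else "B")
```

The reversal happens because A is twice as fast as B on the short test and only about 12 percent slower on the long one. The geometric average weighs those ratios rather than the raw seconds.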

The contenders were benchmarked with nine test programs shown in Listings One through Nine starting on page 100. If necessary, I changed library names in IMPORT statements and procedure calls to conform to the requirements of the tested implementation, all without altering the underlying algorithm. To the greatest extent possible, all programs are identical across all tested compilers.

FTL and TopSpeed provide their own linkers, which were used to convert .OBJ modules compiled by those products into .EXE files. Logitech and Stony Brook require the DOS linker (the version I used here was 5.0120). The compile and link steps were performed separately on a command-line basis. I did not test the make utility for any of these compilers.

I used the Logitech translator to convert Dhrystone 1.1 from Pascal into Modula-2, and manually translated sieve, fib, and acker from C. The fpmath, qsort, and shsort programs were obtained from BIX and modified slightly, primarily to remove unnecessary output statements. The cortn and ncortn programs were written as new benchmarks specifically for Modula-2. These will be discussed later.

Table 3, on page 69, shows what the benchmark programs measure (in order of importance for each program).

Table 3: What the benchmark programs measure

dhrystone                 A statistically balanced mix of operations
sieve                     Array indexing and integer arithmetic
fib                       Integer arithmetic and recursion
acker                     Recursion and integer arithmetic
fpmath                    Transcendental floating-point arithmetic
     Without 80x87        FP emulation package efficiency
     With 80x87           Efficiency of generated code using the 80287 math coprocessor
qsort                     QuickSort algorithm
shsort                    Shell-Metzner sort algorithm
cortn                     Coroutine context switching speed*
ncortn                    Same task as cortn, but without coroutines*

* Modula-2 supports coroutines: tasks that call each other as equals rather than as superior/subordinate. The difference between cortn and ncortn measures the time required for 100,000 coroutine context switches (see sidebar for more information).

Now for the results.

Compile and Link Times

Compile/link time is important because it affects the productivity of programmers; the less time from source to .EXE, the less idle time spent by the programmer.

For these tests, I dropped out of the integrated environment and measured time based on command-line mode. This is because accurate timings cannot be obtained within an environment. Actual execution times are shown in Table 4, on page 74, without the gap and typing time between compile and link. Systems with a make option automate the process and eliminate the inter-step gap, and so the totals for these systems are probably very close to the actual time.

Table 4: Modula-2 average compile and link times

Program:  Step       FTL       Logitech    TopSpeed    Stony Brook
sieve     Compile    2.70      5.51        3.71        2.65
          Link       3.48      3.70        2.72        2.87
          Total      6.18      9.21        6.44        5.51
fib       Compile    2.53      5.07        3.66        2.65
          Link       3.57      3.70        2.72        2.86
          Total      6.10      8.77        6.38        5.51
acker     Compile    2.64      5.18        3.65        2.81
          Link       3.63      3.70        2.66        2.97
          Total      6.26      8.87        6.30        5.78
fpmath    Compile    3.76      10.72       5.94        4.62
          Link       15.24     8.25        4.45        5.17
          Total      19.01     18.97       10.39       9.79
qsort     Compile    2.79      5.84        4.15        3.13
          Link       3.62      3.69        2.84        3.30
          Total      6.41      9.53        6.99        6.43
shsort    Compile    2.79      6.15        4.44        3.19
          Link       3.67      3.68        2.96        3.29
          Total      6.46      9.83        7.40        6.47
cortn     Compile    3.45      5.44        4.34        2.91
          Link       8.50      3.57        3.30        3.35
          Total      11.95     9.00        7.64        6.26
ncortn    Compile    2.68      5.02        3.76        2.75
          Link       3.39      3.64        2.94        3.19
          Total      6.07      8.66        6.70        5.93
Geometric averages   8.20      10.37       7.48*       6.60
* Deduct 2.8 sec. for the integrated environment

The apparent winner here is Stony Brook, followed very closely by TopSpeed from JPI and then FTL. The spread is less than four seconds from first to third place. Logitech significantly trails in last place, yet its performance can hardly be characterized as bad.

Were it possible to accurately measure elapsed time within an environment, TopSpeed would undoubtedly emerge the winner. This is because the command M2 loads the complete integrated environment, and then switches to batch mode based on the command line. To compile and link SIEVE.MOD, the commands are M2 /C SIEVE and M2 /L SIEVE. When the environment is already active, it's not necessary to reload M2 from disk in order to do a link. Tests with the empty program

MODULE nothing; BEGIN END nothing.

revealed that the load time for TopSpeed is 2.8 seconds. Of those tested, TopSpeed is the only compiler with an integrated linker. Therefore, the compile/link totals should be adjusted downward by 2.8 seconds to estimate time within the environment, and TopSpeed emerges the clear winner at 4.68 seconds average.

Code Size

Table 5, below, shows the size in bytes of .EXE files flowing out of the compile/link process. Code size is not especially important for most applications. That is, a smaller .EXE is not necessarily better. It is important when the application is very large, or consists of many .EXE files that consume great amounts of disk space. For example, if a commercial program needs two distribution diskettes instead of one because of the summed .EXE file sizes, it becomes a cost factor. Otherwise, who cares?

Table 5: Modula-2 benchmarks: code size

Program:               FTL       Logitech    TopSpeed       Stony Brook
sieve                  3584      3949        687            826
fib                    3584      3891        683            818
acker                  3584      3921        707            850
fpmath    (w/o '87)    25088     31225       13608          12138
          (with '87)   25088     30842       13608          12106
qsort                  3584      4079        839            960
shsort                 3584      4149        917            1002
cortn                  13312     4498        2671           1504
ncortn                 3584      3973        727            846
Geometric
averages               10569     10569       2819           2793

Compilers tend to insert hunks of code as a matter of course. An example is a routine to initialize global variables. Such code cliches become fixed overhead in the .EXE file size. The NOTHING program cited previously yields the following .EXE file sizes in bytes:

FTL          3584
Logitech     3797
TopSpeed      609
Stony Brook   754

These represent the minimum size overhead of every .EXE file. If you want to determine the approximate amount of real working code included in a specific .EXE, subtract these fixed amounts.
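As a worked example of that subtraction, the sketch below takes the sieve row of Table 5 and removes each compiler's fixed overhead as measured with the empty program. The dictionary names and the working_code_size helper are hypothetical conveniences; the byte counts come from the tables in this article.

```python
# Fixed overhead: .EXE size in bytes of the empty NOTHING program.
FIXED_OVERHEAD = {
    "FTL": 3584,
    "Logitech": 3797,
    "TopSpeed": 609,
    "Stony Brook": 754,
}

# .EXE size in bytes of the sieve benchmark (Table 5).
SIEVE_SIZE = {
    "FTL": 3584,
    "Logitech": 3949,
    "TopSpeed": 687,
    "Stony Brook": 826,
}


def working_code_size(exe_sizes, overhead=FIXED_OVERHEAD):
    """Approximate bytes of real working code: total size minus fixed overhead."""
    return {name: exe_sizes[name] - overhead[name] for name in exe_sizes}


if __name__ == "__main__":
    for name, size in working_code_size(SIEVE_SIZE).items():
        print(f"{name:12s} {size:5d} bytes")
```

Treat the results as approximations only. FTL's sieve file is exactly the same size as its empty program, so the subtraction yields zero, which suggests that the linker's file-size granularity hides small differences.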

If smaller isn't necessarily better, neither is larger (by definition) worse. As an example, note that Logitech yields an enormous .EXE file for fpmath, yet delivers second-best runtime performance in Table 6, page 76, while FTL produces a smaller file size and enormously slower execution for the same program. Performance doesn't have to do with file size, but instead with code efficiency. That's why file size is relatively unimportant.

Table 6: Modula-2 benchmarks: average run times

Program:                  FTL          Logitech      TopSpeed      Stony Brook
sieve                     8.61         8.31          3.33        3.85
fib                       38.39        22.92         22.83       24.28
acker                     20.71        11.82         10.19       12.14
fpmath    (w/o '87)       255.15       51.86         71.21       49.05
          (with '87)      12.08        45.28         10.30       11.63
qsort                     7.85         6.00          4.04        3.62
shsort                    11.69        11.53         8.98        10.38
cortn                     5.59         6.22          10.30       12.85
ncortn                    2.29         2.32          1.68        1.48
     overhead             3.30         3.90          8.62        11.37
     % o'head             58.97        62.68         83.73       88.48
Geometric
averages                  28.99        24.62         17.20       18.16

Run Times

Table 6 lists the run times for the benchmarks. This is where "the rubber meets the road." All programmers want to produce fast applications, and they will sacrifice a certain amount of productivity and file size to do so. There is no equivocating here; either the resulting application is fast, or it's not.

Overall, JPI's TopSpeed Modula-2 lives up to its name, followed closely by Stony Brook. Logitech comes in a distant third, with FTL biting the dust.

What hurts FTL is its disastrously slow floating-point emulation. TopSpeed--out in front almost everywhere else--is third here, but still 3.58 times faster than FTL. If you remove the fpmath tests, the geometric averages are less widely spread, but still in the same order:

TopSpeed        8.67
Stony Brook     9.46
Logitech       10.80
FTL            13.11

These represent the performance rankings of the compilers' output .EXE files, and they're confirmed by the Dhrystone results.

But before leaving run times, compare the effects of adding an 80x87 math coprocessor to the hardware platform in the fpmath test. To perform these tests, it was necessary to recompile and relink fpmath with each compiler. The 80x87 has a negligible effect for Logitech, but for FTL the time plummets from 255 seconds to a mere 12, which is only a whisker behind first-place TopSpeed. Clearly, all but Logitech have expended great effort in exploiting the '87. The message here is if you're crunching a lot of floating-point numbers, buy an '87 and don't use Logitech.

Dhrystone

An industry-standard benchmark, Dhrystone is generally regarded as the most objective single predictor of compiled program performance. It's a synthetic application containing a statistically balanced mix of operations characteristic of the "typical" systems program. Dhrystone was devised by studying hundreds of non-floating-point programs, then constructing a 100-statement algorithm containing the following:

53 assignments
32 control statements
15 function and procedure calls

The entire program loops 50,000 times. Each iteration is one Dhrystone unit. Therefore, 50,000 divided by elapsed time yields the number of Dhrystones per second. The higher the number, the better.

Table 7, this page, shows Dhrystone results for the compilers. Stony Brook has the fastest raw compile/link time at 8.89 seconds. If you adjust TopSpeed's performance downward by 2.8 seconds to estimate environment time as discussed previously, TopSpeed becomes the fastest at 8.4. TopSpeed also wins by a wide margin in terms of .EXE size and Dhrystones per second. Logitech and FTL share last place, with FTL leading by a hair. Stony Brook is exactly halfway between first and last place.

Table 7: Modula-2 Dhrystone results

Program:            FTL        Logitech    TopSpeed     Stony Brook
Compile             4.62       12.26       7.17         5.11
Link                7.37       4.84        4.04         3.79
Total               11.99      17.10       11.20*       8.89
Size                9728       8957        2827         5638
Seconds             40.43      40.60        31.77       35.70
Dhrystones/sec      1237       1232         1574        1401
* Est. 8.4 sec. in the integrated environment
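The Dhrystones-per-second row follows directly from the Seconds row: 50,000 iterations divided by elapsed time. A minimal sketch of that calculation, using the run times from Table 7 (the names LOOPS, RUN_SECONDS, and dhrystones_per_second are illustrative only):

```python
LOOPS = 50_000  # iterations of the Dhrystone main loop

# Elapsed run time in seconds for each compiler's Dhrystone .EXE (Table 7).
RUN_SECONDS = {
    "FTL": 40.43,
    "Logitech": 40.60,
    "TopSpeed": 31.77,
    "Stony Brook": 35.70,
}


def dhrystones_per_second(seconds: float, loops: int = LOOPS) -> int:
    """One loop iteration is one Dhrystone; report the rate as a whole number."""
    return round(loops / seconds)


if __name__ == "__main__":
    for name, secs in RUN_SECONDS.items():
        print(f"{name:12s} {dhrystones_per_second(secs):5d} Dhrystones/sec")
```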

Conclusions

JPI's TopSpeed emerges as the clear winner. Its performance is spectacular, and it offers a powerful integrated environment similar to the Turbo languages, plus an optional symbolic debugger and toolkit. TopSpeed earns a standing ovation.

Stony Brook places a respectable second in all performance categories, running overall neck and neck with TopSpeed. The editor, debugger, and make utility form a useful, but rudimentary, toolkit for program development.

If you're a tools junkie, you'll love Logitech. Its optional toolkit furnishes oodles of them, and they're good. So is the performance of its output .EXE files--provided you turn off the default switches in the M2C.CFG file. Logitech's compile/link performance is lackluster, though, and its .EXE files are the largest of the crop. Nevertheless, the average execution time is fairly close to that of the leaders. The grand old man of Modula-2 compilers for the PC is still a formidable contender.

Although FTL ranks third in compile/link time and code size, its average .EXE run time consigns it to last place in performance. The only area where it really shines is in coroutine switching. What hurts FTL most is inefficient floating-point emulation. Overall, it's a serviceable compiler with an excellent set of tools--a genuine bargain at only $49.95.

So there you have the current state of the art in PC-based Modula-2 development systems, and the state of the art is very good indeed.

A New Benchmark for Modula-2 Compilers

Among the mainstream languages, Modula-2 is the first to support concurrent processes through standard procedure calls. Most computers are single-thread machines, so concurrency is achieved by time-division multiplexing: one process runs for a time, then yields the machine to another process, which takes its turn and then reverts to the first, and so on until both complete. In Modula-2 terminology such processes are called coroutines. Coroutines exist as equals, each periodically deferring to the other, which picks up where it most recently left off.

An issue with coroutines is context switching (the saving and restoring of the machine state during control transfers). This entails overhead, and the question becomes one of quantifying it. A benchmark is clearly needed for comparing the relative performance of competing compilers in handling coroutine switching. That's what DDJ presents here.

In fact, two benchmarks are needed that perform exactly the same task -- one using coroutines and the other not. The Modula-2 programs CORTN.MOD and NCORTN.MOD (see Listings, page 100) both generate a 50,000-character string in lowercase, then shift that string to uppercase. The difference is that CORTN uses a coroutine to count the number of characters shifted, whereas NCORTN uses a normal procedure call. Granted that the task is trivial, but this is consistent with benchmarks in general. The objective is to measure the relative amount of overhead introduced by coroutines.

So here's what you do. Compile and run CORTN and NCORTN, timing the execution period for each as precisely as possible. Then calculate the overhead introduced by coroutine switching. If C is the CORTN run time and N is that for NCORTN, the percent overhead (O) for coroutine switching is

O = 100 * (C - N) / C

Thus, if CORTN runs in 10.30 seconds and NCORTN runs in 1.68 seconds, the overhead for coroutines is about 83.7 percent.

In comparing coroutine performance, the lower the overhead percentage, the more efficient a particular compiler is at handling coroutines. For example, in Table 6 accompanying this article, FTL has the most efficient coroutine handling because its percentage overhead is the lowest.
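Here is a minimal sketch of the overhead calculation, applied to the cortn and ncortn run times published in the run-time table. The function and dictionary names are hypothetical. Because the published times are rounded to two places, the percentages computed here can differ in the second decimal from the table's figures, which were presumably derived from the unrounded timings.

```python
# (cortn seconds, ncortn seconds) for each compiler, from the run-time table.
COROUTINE_TIMES = {
    "FTL": (5.59, 2.29),
    "Logitech": (6.22, 2.32),
    "TopSpeed": (10.30, 1.68),
    "Stony Brook": (12.85, 1.48),
}


def overhead_percent(c: float, n: float) -> float:
    """Share of the CORTN run time spent on coroutine switching, in percent."""
    return 100.0 * (c - n) / c


def most_efficient(times=COROUTINE_TIMES) -> str:
    """Compiler with the lowest coroutine overhead percentage."""
    return min(times, key=lambda name: overhead_percent(*times[name]))


if __name__ == "__main__":
    for name, (c, n) in COROUTINE_TIMES.items():
        print(f"{name:12s} {overhead_percent(c, n):6.2f}%")
    print("lowest overhead:", most_efficient())
```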

DDJ would like to place these Modula-2 benchmarks and the method for evaluating them in the public domain. If you have comments, please submit them in writing to DDJ, Attn. Kent Porter, at the address on the masthead, or to kporter on MCI or 76704,51 on CompuServe. The results will be published in a future issue. -- KP

_THE STATE OF MODULA-2_ by Kent Porter [LISTING ONE]



MODULE dry;

  FROM Storage
    IMPORT ALLOCATE, DEALLOCATE, Available, InstallHeap, RemoveHeap;
  FROM Strings
    IMPORT CompareStr;

(*
 *   "DHRYSTONE" Benchmark Program
 *
 *   Version:   Mod2/1
 *   Date:      05/03/86
 *   Author:      Reinhold P. Weicker,  CACM Vol 27, No 10, 10/84 pg. 1013
 *         C version translated from ADA by Rick Richardson
 *         Every method to preserve ADA-likeness has been used,
 *         at the expense of C-ness.
 *         Modula-2 version translated from C by Kevin Northover.
 *         Again every attempt made to avoid distortions of the original.
 *   Machine Specifics:
 *         The time function is system dependent, one is
 *         provided for the Amiga.  Your compiler may be different.
 *         The LOOPS constant is initially set for 50000 loops.
 *         If you have a machine with large integers and is
 *         very fast, please change this number to 500000 to
 *         get better accuracy.
 *         You can also time the program with a stopwatch when it
 *         is lightly loaded (no interlaced 4 bit deep Amiga screens ...).
 *
 **************************************************************************
 *
 *   The following program contains statements of a high-level programming
 *   language (Modula-2) in a distribution considered representative:
 *
 *   assignments         53%
 *   control statements      32%
 *   procedure, function calls   15%
 *
 *   100 statements are dynamically executed.  The program is balanced with
 *   respect to the three aspects:
 *      - statement type
 *      - operand type (for simple data types)
 *      - operand access
 *         operand global, local, parameter, or constant.
 *
 *   The combination of these three aspects is balanced only approximately.
 *
 *   The program does not compute anything meaningful, but it is
 *   syntactically and semantically correct.
 *
 *)

(* Accuracy of timings and human fatigue controlled by next two lines *)

  CONST
    LOOPS = 50000;

  TYPE
    Enumeration = (Ident1, Ident2, Ident3, Ident4, Ident5);
    OneToThirty = CARDINAL;
    OneToFifty = CARDINAL;
    CapitalLetter = CHAR;
    String30 = ARRAY [0..30-1] OF CHAR;
    Array1Dim = ARRAY [0..50] OF CARDINAL;
    Array2Dim = ARRAY [0..50], [0..50] OF CARDINAL;
    RecordPtr = POINTER TO RecordType;
    RecordType = RECORD
                   PtrComp: RecordPtr;
                   Discr: Enumeration;
                   EnumComp: Enumeration;
                   IntComp: OneToFifty;
                   StringComp: String30;
                 END;

    (*
     * Package 1
     *)

  VAR

    IntGlob: CARDINAL;
    BoolGlob: BOOLEAN;
    Char1Glob: CHAR;
    Char2Glob: CHAR;
    Array1Glob: Array1Dim;
    Array2Glob: Array2Dim;
    PtrGlb: RecordPtr;
    PtrGlbNext: RecordPtr;


  PROCEDURE Proc7(IntParI1, IntParI2: OneToFifty;
                  VAR IntParOut: OneToFifty);

    VAR

      IntLoc: OneToFifty;
  BEGIN
    IntLoc := IntParI1+2;
    IntParOut := IntParI2+IntLoc;
  END Proc7;


  PROCEDURE Proc3(VAR PtrParOut: RecordPtr);
  BEGIN
    IF (PtrGlb <> NIL) THEN

      PtrParOut := PtrGlb^.PtrComp
    ELSE
      IntGlob := 100
    END;
    Proc7(10, IntGlob, PtrGlb^.IntComp);
  END Proc3;


  PROCEDURE Func3(EnumParIn: Enumeration): BOOLEAN;

    VAR
      EnumLoc: Enumeration;
    VAR Func3Result: BOOLEAN;
  BEGIN
    EnumLoc := EnumParIn;
    Func3Result := EnumLoc = Ident3;
    RETURN Func3Result
  END Func3;


  PROCEDURE Proc6(EnumParIn: Enumeration;
                  VAR EnumParOut: Enumeration);
  BEGIN
    EnumParOut := EnumParIn;
    IF ( NOT Func3(EnumParIn)) THEN
      EnumParOut := Ident4
    END;
    CASE EnumParIn OF
        Ident1:
        EnumParOut := Ident1
      | Ident2:
        IF (IntGlob > 100) THEN

          EnumParOut := Ident1
        ELSE
          EnumParOut := Ident4
        END
      | Ident3:
        EnumParOut := Ident2
      | Ident4:
      | Ident5:
        EnumParOut := Ident3

      ELSE
    END;
  END Proc6;



  PROCEDURE Proc1(PtrParIn: RecordPtr);
  BEGIN
    WITH PtrParIn^ DO

      PtrComp^ := PtrGlb^;
      IntComp := 5;
      PtrComp^.IntComp := IntComp;
      PtrComp^.PtrComp := PtrComp;
      Proc3(PtrComp^.PtrComp);
      IF (PtrComp^.Discr = Ident1) THEN
        PtrComp^.IntComp := 6;
        Proc6(EnumComp, PtrComp^.EnumComp);
        PtrComp^.PtrComp := PtrGlb^.PtrComp;
        Proc7(PtrComp^.IntComp, 10, PtrComp^.IntComp);


      ELSE
        PtrParIn^ := PtrComp^
      END;
    END;
  END Proc1;


  PROCEDURE Proc2(VAR IntParIO: OneToFifty);

    VAR

      IntLoc: OneToFifty;
      EnumLoc: Enumeration;
  BEGIN
    IntLoc := IntParIO+10;
    REPEAT

      IF (Char1Glob = 'A') THEN

        DEC(IntLoc, 1);
        IntParIO := IntLoc-IntGlob;
        EnumLoc := Ident1;
      END;
    UNTIL EnumLoc = Ident1;
  END Proc2;


  PROCEDURE Proc4;

    VAR

      BoolLoc: BOOLEAN;
  BEGIN
    BoolLoc := Char1Glob = 'A';
    BoolLoc := BoolLoc OR BoolGlob;
    Char2Glob := 'B';
  END Proc4;


  PROCEDURE Proc5;
  BEGIN
    Char1Glob := 'A';
    BoolGlob := FALSE;
  END Proc5;


  PROCEDURE Proc8(VAR Array1Par: Array1Dim;
                  VAR Array2Par: Array2Dim;
                  IntParI1, IntParI2: OneToFifty);

    VAR

      IntLoc: OneToFifty;
      IntIndex: OneToFifty;
  BEGIN
    IntLoc := IntParI1+5;
    Array1Par[IntLoc] := IntParI2;
    Array1Par[IntLoc+1] := Array1Par[IntLoc];
    Array1Par[IntLoc+30] := IntLoc;
    FOR IntIndex := IntLoc TO (IntLoc+1) DO
      Array2Par[IntLoc][IntIndex] := IntLoc
    END;
    Array2Par[IntLoc][IntLoc-1] := Array2Par[IntLoc][IntLoc-1]+1;
    Array2Par[IntLoc+20][IntLoc] := Array1Par[IntLoc];
    IntGlob := 5;
  END Proc8;


  PROCEDURE Func1(CharPar1, CharPar2: CapitalLetter): Enumeration;

    VAR
      CharLoc1, CharLoc2: CapitalLetter;
      Func1Result: Enumeration;
  BEGIN
    CharLoc1 := CharPar1;
    CharLoc2 := CharLoc1;
    IF (CharLoc2 <> CharPar2) THEN
      Func1Result := (Ident1)
    ELSE
      Func1Result := (Ident2)
    END;
    RETURN Func1Result
  END Func1;


  PROCEDURE Func2(VAR StrParI1, StrParI2: String30): BOOLEAN;

    VAR
      IntLoc: OneToThirty;
      CharLoc: CapitalLetter;
      Func2Result: BOOLEAN;
  BEGIN
    IntLoc := 2;
    WHILE (IntLoc <= 2) DO
      IF (Func1(StrParI1[IntLoc], StrParI2[IntLoc+1]) = Ident1) THEN
        CharLoc := 'A';
        INC(IntLoc, 1);
      END;
    END;
    IF (CharLoc >= 'W') AND (CharLoc <= 'Z') THEN
      IntLoc := 7
    END;
    IF CharLoc = 'X' THEN
      Func2Result := TRUE
    ELSIF CompareStr (StrParI1, StrParI2) > 0 THEN
      INC(IntLoc, 7);
      Func2Result := TRUE
    ELSE
      Func2Result := FALSE
    END;
    RETURN Func2Result
  END Func2;



  PROCEDURE Proc0;

    VAR

      IntLoc1: OneToFifty;
      IntLoc2: OneToFifty;
      IntLoc3: OneToFifty;
      CharLoc: CHAR;
      CharIndex: CHAR;
      EnumLoc: Enumeration;
      String1Loc, String2Loc: String30;
      i, LoopMax: CARDINAL;


  BEGIN
    LoopMax := LOOPS;
    NEW(PtrGlbNext);
    NEW(PtrGlb);
    PtrGlb^.PtrComp := PtrGlbNext;
    PtrGlb^.Discr := Ident1;
    PtrGlb^.EnumComp := Ident3;
    PtrGlb^.IntComp := 40;
    PtrGlb^.StringComp := 'DHRYSTONE PROGRAM, SOME STRING';
    String1Loc := "DHRYSTONE PROGRAM, 1'ST STRING";
    FOR i := 1 TO LoopMax DO

      Proc5;
      Proc4;
      IntLoc1 := 2;
      IntLoc2 := 3;
      String2Loc := "DHRYSTONE PROGRAM, 2'ND STRING";
      EnumLoc := Ident2;
      BoolGlob :=  NOT Func2(String1Loc, String2Loc);
      WHILE (IntLoc1 < IntLoc2) DO

        IntLoc3 := 5*IntLoc1-IntLoc2;
        Proc7(IntLoc1, IntLoc2, IntLoc3);
        INC(IntLoc1, 1);
      END;
      Proc8(Array1Glob, Array2Glob, IntLoc1, IntLoc3);
      Proc1(PtrGlb);
      CharIndex := 'A';
      WHILE CharIndex <= Char2Glob DO

        IF (EnumLoc = Func1(CharIndex, 'C')) THEN
          Proc6(Ident1, EnumLoc)
        END;
        CharIndex := VAL(CHAR, ORD(CharIndex)+1);
      END;
      IntLoc3 := IntLoc2*IntLoc1;
      IntLoc2 := IntLoc3 DIV IntLoc1;
      IntLoc2 := 7*(IntLoc3-IntLoc2)-IntLoc1;
      Proc2(IntLoc1);
    END;
  END Proc0;



  (* The Main Program is trivial *)

BEGIN
  Proc0;
END dry.

[LISTING TWO]



MODULE sieve;
(* Eratosthenes sieve prime number program, Byte Magazine *)

  CONST size = 8190;

  VAR
    psn, k, prime, iter : INTEGER;
    flags : ARRAY [0..size] OF BOOLEAN;

BEGIN
  FOR iter := 1 TO 25 DO
    FOR psn := 0 TO size DO
      flags[ psn ] := TRUE;
    END (* for *);
    FOR psn := 0 TO size DO
      IF flags[ psn ] THEN  (* prime *)
        prime := psn + psn + 3;
        k := psn + prime;
        WHILE k <= size DO  (* cancel multiples *)
          flags[ k ] := FALSE;
          k := k + prime;
        END (* while *);
      END (* if then *);
    END (* for *);
  END (* for *);
END sieve.

[LISTING THREE]



MODULE fib;

(* Berkeley standard benchmark *)
(* Computes largest 16-bit Fibonacci number *)
(* Tests compiler recursion efficiency and CPU throughput *)

  CONST
    TIMES = 10;
    VALUE = 24;

  VAR
    i: INTEGER;
    f: CARDINAL;
    (* ----------------------------------------------------------- *)

  PROCEDURE fibonacci(n: INTEGER): CARDINAL;
    VAR fibonacciResult: CARDINAL;
  BEGIN
    IF n >= 2 THEN
      fibonacciResult := fibonacci(n-1)+fibonacci(n-2)
    ELSE
      fibonacciResult := n
    END;
    RETURN fibonacciResult
  END fibonacci; (* --------------------------- *)


BEGIN (* main *)
  FOR i := 1 TO TIMES DO
    f := fibonacci(VALUE)
  END;
END fib.

[LISTING FOUR]




MODULE acker;



(* Berkeley standard benchmark *)
(* Ackermann's function: ack (2, 4) *)
(* Tests recursion and integer math *)
(* Repeats 10,000 times *)



  VAR
    loop, r: INTEGER;
    (* ---------------------------------------------------------- *)




  PROCEDURE ack(x1, x2: INTEGER): INTEGER;

    VAR
      result: INTEGER;
      ackResult: INTEGER;
  BEGIN
    IF x1 = 0 THEN

      result := x2+1
    ELSIF x2 = 0 THEN
      result := ack(x1-1, 1)
    ELSE
      result := ack(x1-1, ack(x1, x2-1))
    END;
    ackResult := result;
    RETURN ackResult
  END ack; (* --------------------------- *)


BEGIN (* main *)
  FOR loop := 1 TO 10000 DO
    r := ack(2, 4)
  END;
END acker.

[LISTING FIVE]



MODULE FPMath;
(* Benchmarks floating point math package *)

  FROM MathLib0 IMPORT arctan, exp, ln, sin, sqrt;
  FROM InOut    IMPORT Write, WriteLn, WriteString;

  CONST
    pi = 3.1415927;
    nloops = 5;

  VAR
    i, j: INTEGER;
    angle, result, argument: REAL;

BEGIN
  WriteString('SQUARE ROOTS   ');
  FOR i := 1 TO nloops DO
    Write ('.');
    argument := 0.0;
    WHILE argument <= 1000.0 DO
      result := sqrt (argument);
      argument := argument + 1.0
    END;
  END; (* FOR *)

  WriteLn;
  WriteString('LOGS           ');
  FOR i := 1 TO nloops DO
    Write ('.');
    argument := 0.1;
    WHILE argument <= 1000.1 DO
      result := ln (argument);
      argument := argument + 1.0
    END;
  END; (* FOR *)

  WriteLn;
  WriteString('EXPONENTIALS   ');
  FOR i := 1 TO nloops DO
    Write ('.');
    argument := 0.1;
    WHILE argument <= 10.0 DO
      result := exp (argument);
      argument := argument + 0.01
    END;
  END; (* FOR *)

  WriteLn;
  WriteString('ARCTANS        ');
  FOR i := 1 TO nloops DO
    Write ('.');
    argument := 0.1;
    WHILE argument <= 10.0 DO
      angle := arctan (argument);
      argument := argument + 0.01
    END;
  END; (* FOR *)

  WriteLn;
  WriteString('SINES          ');
  FOR i := 1 TO nloops DO
    Write ('.');
    angle := 0.0;
    WHILE angle <= 2.0 * pi DO
      result := sin (angle);
      angle := angle + pi / 360.0
    END;
  END; (* FOR *)
  WriteLn;
END FPMath.

[LISTING SIX]



MODULE QSort;

(* The test uses QuickSort to measure recursion speed. *)
(* A descending array is created by the program and is *)
(* sorted into ascending order. The process is         *)
(* performed MAXITER times.                            *)

CONST SIZE = 1000;
      MAXITER = 50;

TYPE NUMBERS = ARRAY[1..SIZE] OF CARDINAL;

VAR Iter, Offset, I, J, Temporary : CARDINAL;
    A : NUMBERS;

PROCEDURE InitializeArray ;
(* Procedure to initialize array *)

VAR I : CARDINAL;

BEGIN
    FOR I := 1 TO SIZE DO
        A[I] := SIZE - I + 1
    END; (* FOR I *)
END InitializeArray;

PROCEDURE QuickSort;
(* Procedure to perform a QuickSort *)

PROCEDURE Sort(Left, Right : CARDINAL);

VAR i, j : CARDINAL;
    Data1, Data2 : CARDINAL;

BEGIN
    i := Left; j := Right;
    Data1 := A[(Left + Right) DIV 2];
    REPEAT
        WHILE A[i] < Data1 DO INC(i) END;
        WHILE Data1 < A[j] DO DEC(j) END;
        IF i <= j THEN
            Data2 := A[i]; A[i] := A[j]; A[j] := Data2;
            INC(i); DEC(j)
        END;
    UNTIL i > j;
    IF Left < j  THEN Sort(Left,j)  END;
    IF i < Right THEN Sort(i,Right) END;
END Sort;

BEGIN (* QuickSort *)
    Sort(1,SIZE);
END QuickSort;

BEGIN (* Main *)
    FOR Iter := 1 TO MAXITER  DO
       InitializeArray;
       QuickSort
    END; (* FOR Iter  *)
END QSort.

[LISTING SEVEN]



MODULE ShSort;
(* Tests Shell sort speed on an integer array of ARSIZE elements.  *)
(* Creates an array ordered from smaller to larger, then sorts it  *)
(* into reverse order. Repeats NSORTS times, reversing the sort    *)
(* direction on each pass.                                         *)

CONST ARSIZE = 1000;
      NSORTS = 20;

TYPE NUMBERS = ARRAY [1..ARSIZE] OF INTEGER;

VAR IsInOrder, Ascending : BOOLEAN;
    Iter, Offset, I, J, Temporary : CARDINAL;
    Ch : CHAR;
    A : NUMBERS;

PROCEDURE InitializeArray ;
     (* Initialize array *)
BEGIN
    FOR I := 1 TO ARSIZE DO
        A [I] := I
    END; (* FOR I *)
END InitializeArray;

PROCEDURE ShellSort ;
     (* Shell-Metzner sort *)

    PROCEDURE Swap;
         (* Swap elements A[I] and A[J] *)
    BEGIN
       IsInOrder := FALSE;
       Temporary := A[I];
       A[I] := A[J];
       A[J] := Temporary;
    END Swap;

BEGIN
    (* Toggle 'Ascending' flag *)
    Ascending := NOT Ascending;
    Offset := ARSIZE;
    WHILE Offset > 1 DO
        Offset := Offset DIV 2;
        REPEAT
            IsInOrder := TRUE;
            FOR J := 1 TO (ARSIZE - Offset) DO
                I := J + Offset;
                IF Ascending THEN
                    IF A[I] < A[J] THEN Swap END
                ELSE
                    IF A[I] > A[J] THEN Swap END
                END; (* IF Ascending *)
            END; (* FOR J *)
        UNTIL IsInOrder;
    END; (* WHILE Offset *)
END ShellSort;

BEGIN (* Main *)
    InitializeArray;
    Ascending := TRUE;
    FOR Iter := 1 TO NSORTS DO
       ShellSort
    END;
END ShSort.

[LISTING EIGHT]



MODULE cortn;

(* Benchmark to test speed of coroutine switching *)
(* Shifts NCHARS characters to upper-case         *)
(* Two transfers per character                    *)

FROM SYSTEM IMPORT NEWPROCESS, TRANSFER, ADDRESS, BYTE, ADR;

CONST  NCHARS = 50000;
       WorkSize = 1000;

VAR    ch : ARRAY [1..NCHARS] OF CHAR;
       ShiftWork, CountWork : ARRAY [1..WorkSize] OF BYTE;
       count, chval, c : CARDINAL;
       main, shifter, counter : ADDRESS;

PROCEDURE CountProc;
    (* Increments count *)
BEGIN
  REPEAT
    count := count + 1;
    TRANSFER (counter, shifter);
  UNTIL FALSE;
END CountProc;

PROCEDURE ShiftProc;
    (* Shifts char at 'count' to upper case *)
BEGIN
  REPEAT
    IF (ch [count] >= 'a') AND (ch [count] <= 'z') THEN
      ch [count] := CHR (ORD (ch [count]) - 32)
    END;
    TRANSFER (shifter, counter);
  UNTIL count > NCHARS;    (* stop only after ch [NCHARS] has been shifted *)
  TRANSFER (shifter, main);
END ShiftProc;

BEGIN  (* Main program *)

  (* Load array with lower-case letters *)
  chval := ORD ('a');
  FOR c := 1 TO NCHARS DO
    ch [c] := CHR (chval);
    chval := chval + 1;
    IF chval > ORD ('z') THEN
      chval := ORD ('a');
    END;
  END;

  (* Set up coroutines *)
  NEWPROCESS (CountProc, ADR (CountWork), WorkSize, counter);
  NEWPROCESS (ShiftProc, ADR (ShiftWork), WorkSize, shifter);

  (* Dispatch the controlling task *)
  count := 1;
  TRANSFER (main, shifter);
END cortn.

[LISTING NINE]



MODULE ncortn;

(* Does the same thing as CORTN.MOD, but without  *)
(* coroutine switching                            *)
(* Subtract run time for this from time for CORTN *)
(* to find out actual coroutine overhead          *)

CONST  NCHARS = 50000;
       WorkSize = 1000;

VAR    ch : ARRAY [1..NCHARS] OF CHAR;
       count, chval, c : CARDINAL;

PROCEDURE CountProc;
    (* Increments count *)
BEGIN
  count := count + 1;
END CountProc;

PROCEDURE ShiftProc;
    (* Shifts all chars in array 'ch' to upper case *)
BEGIN
  REPEAT
    IF (ch [count] >= 'a') AND (ch [count] <= 'z') THEN
      ch [count] := CHR (ORD (ch [count]) - 32)
    END;
    CountProc;          (* Substitute call for TRANSFER *)
  UNTIL count > NCHARS;    (* stop only after ch [NCHARS] has been shifted *)
END ShiftProc;

BEGIN  (* Main program *)

  (* Load array with lower-case letters *)
  chval := ORD ('a');
  FOR c := 1 TO NCHARS DO
    ch [c] := CHR (chval);
    chval := chval + 1;
    IF chval > ORD ('z') THEN
      chval := ORD ('a');
    END;
  END;

  (* Dispatch the controlling task *)
  count := 1;
  ShiftProc;
END ncortn.