Kent Porter, DDJ's senior technical editor, also writes the "Structured Programming" column. Kent can be reached through CompuServe at 76704,51 or through MCI: kporter.
In the past couple of years, Modula-2 has progressed from the status of a newcomer to that of a viable language for both applications and systems software development. DDJ last compared Modula-2 compilers in the August 1986 issue. Since then, the state of the art has advanced dramatically, and so has interest in Modula-2 among DDJ readers. Recent introductions have still further strengthened Modula-2 as a potential successor to both Pascal and C. It's time to take Modula-2 seriously, and consequently this article surveys four leading Modula-2 development systems.
Three of these Modula-2 compilers have been introduced this year. The old-timer is Logitech, which is now in Version 3.0 and has been around for several years. The newcomers are FTL from Workman Associates, TopSpeed by Jensen and Partners International (JPI), and Stony Brook. Others are available as well, but review copies did not arrive in time to be included.
To one degree or another, all of these products provide an "environment"--a source program editor that interacts with the compiler--along with optional command-line compilation and widely varying program-development aids. Two compilers have their own linkers, and the others use the DOS linker. Without exception, they all support the core language as defined by Wirth, plus they offer extensions. And, of course, some claim to be the fastest, a clamor that will be dehyped with the benchmarks later in this article.
While support for the basic elements of Modula-2 effects a degree of standardization, it does not ensure portability of a Modula-2 source program from one implementation to another. Like C, Modula-2 is a limited language that derives much of its power and flexibility from external routines. For example, Wirth recommends that I/O service routines reside in certain libraries and take thus-and-such parameters, but he doesn't mandate them. Nor does any formal standard for Modula-2 yet exist. Consequently, implementors take liberties. These liberties inevitably lead to discrepancies that diminish portability.
Table 1, page 67, shows a sample of declaring the external routines WriteString and WriteLn, which write a string and a newline, respectively, to the display.
Table 1: Importing WriteString and WriteLn

Wirth:        FROM InOut IMPORT WriteString, WriteLn;
FTL:          FROM Terminal IMPORT WriteString, WriteLn;
Logitech:     FROM InOut IMPORT WriteString, WriteLn;
TopSpeed:     FROM IO IMPORT WrStr, WrLn;
Stony Brook:  FROM InOut IMPORT WriteString, WriteLn;
Logitech and Stony Brook implement Wirth's recommended InOut library, but FTL and TopSpeed do not. In fact, TopSpeed doesn't even use the same procedure names (although it does provide alternative libraries that do). Therefore, moving a program from one implementation to another is likely to necessitate making changes to accommodate the deviations from Wirth.
Deviations are not necessarily bad, and certainly not unique to Modula-2. If you buy one compiler and stick with it (as most programmers do), you'll never confront the issues of portability. Languages governed by formal standards also cause headaches in porting between compilers. Ada comes as close as any language to being universal, yet even it is imperfect.
Turning to specifics, this article profiles each of the products. Space is limited, so the focus is on the main features, which are summarized in Table 2, on page 67. The performance aspects are covered later in the section "Benchmarking the Compilers."
Table 2: Feature summary

                         FTL        Logitech    TopSpeed     Stony Brook
Editor:
  Style                  Wordstar   Point       Turbo        Brief-like
  Customizable           Macros     .INI files  .MNU file    No
Compiler:
  Switches               7          21          12           18
  Directives             None       6           24           None
  Memory models          2<1>       1<5>        Note<10>     6
  Overlays               No         Yes         Yes<9>       No
  Inline assembly        Yes        No          No           No
  8087 support           Yes        Yes         Yes          Yes
  '286 support           Yes<2>     Yes         No           No
Linker                   Included   DOS<6>      Integrated   DOS
  Switches               11<3>      N/A         7            N/A
Debugger                 Symbolic   Note<7>     Symbolic<9>  Symbolic
Make                     Yes        Optional    Integrated   Yes
Librarian                Yes        No          Integrated   No
Other utilities          Yes        Optional    No           No
Standard modules         28         63          13           23
Standard identifiers     233        479         259          157
System requirements:
  Lowest DOS version     2.0        2.0         2.0          2.0
  Memory                 256K       512K        384K         ?
  Floppies               2          2           2            2
  Hard disk              N/R        Recomm.     Recomm.      Recomm.
  Color monitor          N/R        N/R         Recomm.      N/R
Base price               $49.95<4>  $99.00<8>   $99.95       ?
Version tested           1.08       3.03        1.10         1.10

Notes:
 1. FTL small and large models are separate products.
 2. In large model only, via a linker switch.
 3. In large model. Four for small model linker.
 4. Small model. Large model is $79.95.
 5. Large model only.
 6. Logitech linker available with optional toolkit (not evaluated here).
 7. Base package includes post-mortem debugger. Interactive symbolic debugger available in optional toolkit.
 8. $249.00 with optional toolkit.
 9. Optional and extra.
10. Programmer-defined memory models.
FTL
A few years ago, no one would have believed that $49.95 would ever buy so much program-development software. Workman Associates' FTL Modula-2 comes with an editor, compiler, symbolic debugger, and other utilities, plus a generous array of standard modules. Considering the quantity and quality of the software, FTL delivers the most per buck of the development systems reviewed here.
The editor uses the whole screen and operates with a Wordstar-like command structure. If you don't like Wordstar, or if you want to add custom commands, you can build your own macros to do anything within reason. For more than basic editing operations, the ^O command pops up an options menu that provides (among other things) interactive compiling, linking, and running of the program under development. The options menu also provides control over editing windows, access to the DOS shell, and other useful features. If you really want to change the editor, the source code--written in FTL Modula-2--is available at a modest extra cost.
Both the compiler and the proprietary linker are also available at the command-line level, and respond to a number of switches that influence the resulting .EXE file. For example, the command ML myfile/C/V invokes the linker and tells it to create a compact (64K) code segment and the minimal possible data segment. You can also enter switches from the editor's options menu.
FTL furnishes a symbolic, full-screen debugger with source tracking, breakpoints, trace-back, variable watching, and other features. Most operations are driven by function keys, and a pop-up window on the right side of the display provides reminders. A slick display option enables you to chase down a pointer to see the record it indicates, control the format of real numbers, and view portions of arrays. An associated execution profiler reports the total amount of time spent executing each statement in a program -- useful for tuning an application.
Among the utilities are the MLU librarian (which enables you to create and manage libraries of .OBJ modules), a documenter that shows dependencies among modules, and a make utility. FTL also has a TRIMMER utility to remove unused hunks of code from the .EXE file.
FTL Modula-2 comes in two versions. The small model, which handles programs of up to 64K of code and 64K of data, sells for $49.95. The large model compiler, which costs $30 more and is the one tested here, handles programs of any size (subject, of course, to the usual segmentation rules). The large model includes optional 80x87 coprocessor support. Because of file-name conflicts, the two systems--if you have both--must reside in different directories. Workman Associates sells the optional editor toolkit for $39.95. Additionally, FTL Modula-2 compilers are available for CP/M (Z80) and the Atari ST.
If you're looking for an inexpensive compiler to learn Modula-2 and develop some software, this is the one to buy. It gives you a lot for the money. If you also want high performance, you'd better check the benchmark report later in this article.
Logitech
The oldest and best-known of the Modula development systems, the Logitech compiler is a modular offering composed of a core package and an extremely fine accessory toolkit. This is consistent with the nature of the language it supports. The $99 core package consists of the POINT editor, compiler, post-mortem debugger, and Logitech's impressive libraries. For $249 you can get the compiler and the toolkit bundled together. This is well worth the extra cash for serious programmers.
Because this is the same Logitech of mouse fame, POINT is a heavily mouse-oriented, menu-driven editor. You can use it without the mouse, but you will not want to. POINT is infinitely reconfigurable via .INI files that use a quasi-programming language for specifying menus, actions to be taken, and such. The editor itself is friendly, fast, and capable of a wide array of word processor-like functions.
Of primary interest in program development is POINT's M2ASSIST pull-down menu, which has selections leading to the syntax checker, compiler, and linker (among other things). You can also run your program without leaving POINT.
Logitech includes two versions of the compiler, called the fully linked and overlay versions. The fully linked version requires 512K; the overlay version requires 290K. The compile-time penalty for running the overlay version is in the range of 15 to 20 percent.
To save you the nuisance of typing command-line switches, the compiler looks to a text file called M2C.CFG. This file contains all the switches. To activate a particular switch, place a slash in front of it. Disabling the defaults for switches S, F, R, T, A, and O dramatically reduces the code size and execution times of .EXE files. If you elect to leave the defaults active, the compiler inserts all types of run-time checks. It's a good idea to have the defaults on during development, then turn them off for the final compile in order to optimize size and speed.
The core package's PMD utility is a symbolic post-mortem debugger. If a program crashes, you can run PMD to find out the values of all variables, state information, the point of failure, and the procedure call chain. In order for PMD to work, the failing program must have imported Logitech's DebugPMD module, which activates on abnormal termination and writes a memory dump to disk.
The toolkit furnishes a more conventional interactive debugger, which enables you to watch a program run. The debugger subdivides the display into windows that contain the executing source code, the calling chain, breakpoints, watched variables, and a module list with an indicator to the current module. You can also open other windows to dump memory, view the application screen, and so forth. A number of commands provide user control over the session and the display.
The toolkit raises Logitech Modula-2 to the level of an advanced professional development system. In addition to the run-time debugger, it includes a proprietary linker, a make utility, a disassembler, a cross-reference program, and other useful utilities. It also provides source code for the libraries so that you can make modifications and enhancements, then simply recompile. Of the products reviewed, Logitech furnishes the most comprehensive set of tools.
The libraries are also extremely rich. They offer almost twice as many intrinsic procedures and identifiers (479) as those of any other Modula vendor. Logitech's longevity in this market is obvious, and they haven't neglected their responsibility to enhance the product. Looking at some of the calls, you might wonder how you'd ever use them. The point is that if you ever need them, they're there.
Logitech also deserves applause for its superb documentation. The package as a whole has four softcover manuals that are well-organized and indexed, clearly written, and replete with screen shots and other graphics. The 412-page compiler manual includes a lucid introduction to Modula for Pascal programmers.
Logitech has also recently begun shipping a $349 OS/2 version of the package. To date, this is the only Modula-2 compiler for OS/2.
JPI's TopSpeed Modula-2
Although new to the United States, TopSpeed from Jensen and Partners International (JPI) has been marketed in Europe for the past year under the name JPI Modula-2. Because it attracted favorable notice there, its reputation has preceded it on this side of the Atlantic. This recognition is well-deserved, because TopSpeed is surely one of the finest new products introduced to date in the PC arena. If you need anything to convince you to move up to Modula-2, this is it.
The essence of TopSpeed's system is an integrated environment that bears a striking resemblance to that of the Turbo languages. Even many of the Alt-commands are the same. If you're familiar with Turbo Pascal, Turbo C, and so forth, you'll feel right at home from the start. And even if you're not comfortable at the beginning, you soon will be because the commands are intuitive. For example, Alt-C means compile, Alt-R means run after make, and so on. You can always break out of a misdirected command with Esc.
The Wordstar-based editor is reconfigurable by editing the M2.MNU file. This allows you to modify the default menu and add new commands (such as your own utilities) and integrate them into the user interface. You can work with as many as four configurable editing windows at a time, with different files in each or concurrently updated instances of the same file in multiple windows.
Compilation is somewhat different than in the Turbo environments. The compiler doesn't stop on the first error. Instead, it goes all the way through and displays a red meter to show percentage completion. When it's done, the editor rests the cursor on the first offense. You can fix the error, then move to the next error with the F8 key. At each line containing an error, a red-backed diagnosis appears near the bottom of the display. When you've repaired the damage, you recompile with one of the Alt-commands.
The environment integrates the compiler and linker, and it's blazingly fast. If you resist environments (and I can't imagine why you would), you can use your own favorite editor and operate the compiler/linker on a command-line basis. The commands are as follows:
To compile: M2 /C yourmod
To link:    M2 /L yourmod
TopSpeed deviates from the de facto Wirth standard in the naming of default I/O routines. For example, Wirth specifies the module InOut exporting WriteString to output a string and WriteLn for a newline. TopSpeed's implementation instead uses WrStr and WrLn, respectively, exported from a library called IO. This might make you uncomfortable, especially if you're just learning Modula-2. For that reason, one of the three distribution diskettes contains a directory called \CORE, which has a number of compatibility modules (one of which is InOut) that allow you to use Wirth's recommended libraries and standard procedure calls. All you have to do is compile them, and TopSpeed becomes compliant with the Wirth standard and with other Modula compilers.
Of the development systems reviewed here, only JPI's TopSpeed supports dot-addressable graphics with intrinsic calls such as Circle, Line, and Polygon. It's not yet at the level of Borland's .BGI drivers, but ahead of the other Modula products. TopSpeed also provides advanced text manipulation and cursor control.
An optional symbolic debugger comparable to CodeView is available from JPI, as is a $59.95 toolkit. This toolkit includes an assembler and ROM-burning software, plus modules for communications, terminate-and-stay-resident programs (TSR's), critical error handlers, EMS, and overlays. It also provides source code for TopSpeed's start-up and run-time libraries.
DDJ doesn't give unqualified raves very often, but there's no question about it in this case: JPI's TopSpeed Modula-2 is first-rate.
Stony Brook
As the benchmark discussion later reveals, Stony Brook delivers outstanding performance. Unfortunately, their tools don't measure up in comparison with the others. Stony Brook comes with an editor, compiler, libraries, and a rudimentary debugger. That's all.
The editor is somewhat like Brief, though less capable. For example, you can't customize it, and you can only work with two files at a time on a split screen. The Alt-C command enables you to compile the file in the active window from within the editor. An error positions the cursor on the offender and generates a message explaining why the compiler choked. It's usually sufficient to suggest a fix.
But editing and compiling are all you can do from within the editor. To link and run, you must exit to DOS. Although there's nothing wrong with this strategy, integration is not as good as with the other versions reviewed here.
The debugger is primitive: better than nothing, but not as good as it might be. There's nothing visual about it. You type commands and get answers on a blind line-by-line basis, without the benefit of watch windows and dancing execution bars showing where you are in the source. The only way to work with this debugger is with a line-numbered source listing by your side.
I like Stony Brook Modula-2, and I've said so in the "Structured Programming" column. Even its competitors make admiring noises about it. But performance, and not features, sells it. If Stony Brook can retain the performance while ramping up the features and libraries in future releases, they'll have a strong development system.
Benchmarking the Compilers
Benchmarks are always controversial, of course, and these will probably be no exception. In evaluating these results, bear in mind that any given benchmark measures performance of a certain limited set of activities. Therefore, a benchmark provides only a general idea of how one computing element (a compiler in this case) stacks up against others of its kind. The set of benchmarks as a whole provides a reasonably good feel for overall relative performance, but the real test is your particular application on your machine.
These tests were all performed on a Tandon AT clone running at 8 MHz with no-wait-state memory and a 35 ms, 40-Mbyte hard disk. The timing figures were provided by a supervising program that ties into mode 2 of the 8253 timing chip, which provides a granularity of 838 nanoseconds. The timer program carries seconds out to four decimal places, which have been rounded to two places in the tables. The figures are the average of five executions each, and represent the total of load-and-run time.
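The 838-nanosecond granularity follows from the rate at which the 8253 is clocked. The sketch below works through that arithmetic in Python, purely for illustration. It assumes the usual PC-compatible timer input of about 1.193182 MHz, a figure that is not stated in this article; the function names are hypothetical.

```python
# Assumption: the 8253 on a PC-compatible is fed a clock of roughly
# 1.193182 MHz. The article gives only the resulting 838 ns granularity.
PIT_CLOCK_HZ = 1_193_182


def tick_ns(clock_hz: int = PIT_CLOCK_HZ) -> float:
    """Length of one timer tick in nanoseconds."""
    return 1e9 / clock_hz


def ticks_to_seconds(ticks: int, clock_hz: int = PIT_CLOCK_HZ) -> float:
    """Convert a raw tick count into elapsed seconds."""
    return ticks / clock_hz


if __name__ == "__main__":
    print(f"one tick = {tick_ns():.1f} ns")
    # Rounded the way the tables are: two decimal places.
    print(f"3,000,000 ticks = {ticks_to_seconds(3_000_000):.2f} s")
```

At this resolution, rounding the reported times to two decimal places discards far more precision than the timer itself does.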
In all cases, the compiler/linker command-line options were set to yield the best execution time for the resulting program. When possible, overhead operations (such as subscript range checking) were disabled. Where alternative libraries were available--as in the case of TopSpeed's limited run-time system and FTL's 8087-only floating-point, for example--I used them. Every effort was made to achieve the most favorable results.
The averages summarizing all except Table 5 are geometric, not arithmetic. A geometric average gives equal weight to all rows. A row that is much "bigger" or "smaller" than others doesn't dominate the results. For example:
Test             A        B
1             90.0     80.0
2              5.0     10.0
            ------   ------
Total         95.0     90.0
Arith. avg.   47.5     45.0
Geom. avg.    21.2     28.3
Based on the arithmetic average, B seems to outperform A. But Test 1 is inordinately long and Test 2 is very short, so Test 1 dominates the arithmetic figure. Giving equal weight to both tests by using a geometric average, you see that, in fact, A outperforms B.
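The two-test example can be checked with a few lines of code. This is a minimal sketch in Python, not the program used to prepare the tables, and the function names are hypothetical. Computed directly from the two rows, the geometric averages come out near 21.2 for A and 28.3 for B, so the ordering argued above holds: B wins on the arithmetic average, A wins on the geometric one.

```python
import math


def arithmetic_mean(values):
    """Sum divided by count; a single large row dominates the result."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Times for Test 1 and Test 2 from the example above (lower is better).
a = [90.0, 5.0]
b = [80.0, 10.0]

if __name__ == "__main__":
    print(f"arithmetic: A={arithmetic_mean(a):.1f}  B={arithmetic_mean(b):.1f}")
    print(f"geometric:  A={geometric_mean(a):.1f}  B={geometric_mean(b):.1f}")
```

The geometric average rewards B's 10-second advantage on Test 1 less than it penalizes B for taking twice as long on Test 2, because it compares ratios rather than differences.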
The contenders were benchmarked with nine test programs shown in Listings One through Nine starting on page 100. If necessary, I changed library names in IMPORT statements and procedure calls to conform to the requirements of the tested implementation, all without altering the underlying algorithm. To the greatest extent possible, all programs are identical across all tested compilers.
FTL and TopSpeed provide their own linkers, which were used to convert .OBJ modules compiled by those products into .EXE files. Logitech and Stony Brook require the DOS linker (the version I used here was 5.0120). The compile and link steps were performed separately on a command-line basis. I did not test the make utility for any of these compilers.
I used the Logitech translator to convert Dhrystone 1.1 from Pascal into Modula-2, and manually translated sieve, fib, and acker from C. The fpmath, qsort, and shsort programs were obtained from BIX and modified slightly, primarily to remove unnecessary output statements. The cortn and ncortn programs were written as new benchmarks specifically for Modula-2. These will be discussed later.
Table 3, on page 69, shows what the benchmark programs measure (in order of importance for each program).
Table 3: What the benchmark programs measure

dhrystone        A statistically balanced mix of operations
sieve            Array indexing and integer arithmetic
fib              Integer arithmetic and recursion
acker            Recursion and integer arithmetic
fpmath           Transcendental floating-point arithmetic
  Without 80x87  FP emulation package efficiency
  With 80x87     Efficiency of generated code using the 80287 math coprocessor
qsort            QuickSort algorithm
shsort           Shell-Metzner sort algorithm
cortn            Coroutine context switching speed*
ncortn           Same task as cortn, but without coroutines*
Now for the results.
Compile/link time is important because it affects the productivity of programmers; the less time from source to .EXE, the less idle time spent by the programmer.
For these tests, I dropped out of the integrated environment and measured time based on command-line mode. This is because accurate timings cannot be obtained within an environment. Actual execution times are shown in Table 4, on page 74, without the gap and typing time between compile and link. Systems with a make option automate the process and eliminate the inter-step gap, and so the totals for these systems are probably very close to the actual time.
Table 4: Compile and link times (seconds)

Program             FTL   Logitech  TopSpeed  Stony Brook
sieve    Compile    2.70     5.51      3.71       2.65
         Link       3.48     3.70      2.72       2.87
         Total      6.18     9.21      6.44       5.51
fib      Compile    2.53     5.07      3.66       2.65
         Link       3.57     3.70      2.72       2.86
         Total      6.10     8.77      6.38       5.51
acker    Compile    2.64     5.18      3.65       2.81
         Link       3.63     3.70      2.66       2.97
         Total      6.26     8.87      6.30       5.78
fpmath   Compile    3.76    10.72      5.94       4.62
         Link      15.24     8.25      4.45       5.17
         Total     19.01    18.97     10.39       9.79
qsort    Compile    2.79     5.84      4.15       3.13
         Link       3.62     3.69      2.84       3.30
         Total      6.41     9.53      6.99       6.43
shsort   Compile    2.79     6.15      4.44       3.19
         Link       3.67     3.68      2.96       3.29
         Total      6.46     9.83      7.40       6.47
cortn    Compile    3.45     5.44      4.34       2.91
         Link       8.50     3.57      3.30       3.35
         Total     11.95     9.00      7.64       6.26
ncortn   Compile    2.68     5.02      3.76       2.75
         Link       3.39     3.64      2.94       3.19
         Total      6.07     8.66      6.70       5.93
Geometric averages  8.20    10.37      7.48*      6.60
* Deduct 2.8 sec. for the integrated environment
The apparent winner here is Stony Brook, followed very closely by TopSpeed from JPI and then FTL. The spread is less than four seconds from first to third place. Logitech significantly trails in last place, yet its performance can hardly be characterized as bad.
Were it possible to accurately measure elapsed time within an environment, TopSpeed would undoubtedly emerge the winner. This is because the command M2 loads the complete integrated environment, and then switches to batch mode based on the command line. To compile and link SIEVE.MOD, the commands are M2 /C SIEVE and M2 /L SIEVE. When the environment is already active, it's not necessary to reload M2 from disk in order to do a link. Tests with the empty program
MODULE nothing; BEGIN END nothing.
revealed that the load time for TopSpeed is 2.8 seconds. Of those tested, TopSpeed is the only compiler with an integrated linker. Therefore, the compile/link totals should be adjusted downward by 2.8 seconds to estimate time within the environment, and TopSpeed emerges the clear winner at 4.68 seconds average.
Table 5, below, shows the size in bytes of .EXE files flowing out of the compile/link process. Code size is not especially important for most applications. That is, a smaller .EXE is not necessarily better. It is important when the application is very large, or consists of many .EXE files that consume great amounts of disk space. For example, if a commercial program needs two distribution diskettes instead of one because of the summed .EXE file sizes, it becomes a cost factor. Otherwise, who cares?
Table 5: .EXE file sizes (bytes)

Program               FTL   Logitech  TopSpeed  Stony Brook
sieve                3584     3949       687        826
fib                  3584     3891       683        818
acker                3584     3921       707        850
fpmath (w/o '87)    25088    31225     13608      12138
       (with '87)   25088    30842     13608      12106
qsort                3584     4079       839        960
shsort               3584     4149       917       1002
cortn               13312     4498      2671       1504
ncortn               3584     3973       727        846
Geometric averages  10569    10569      2819       2793
Compilers tend to insert hunks of code as a matter of course. An example is a routine to initialize global variables. Such code cliches become fixed overhead in the .EXE file size. The "nothing" program cited previously yields the following .EXE file sizes in bytes:
FTL           3584
Logitech      3797
TopSpeed       609
Stony Brook    754
These represent the minimum size overhead of every .EXE file. If you want to determine the approximate amount of real working code included in a specific .EXE, subtract these fixed amounts.
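As an illustration of that subtraction, the sketch below (Python, with hypothetical names) takes the sieve row of the .EXE size table and removes each compiler's fixed overhead as measured with the empty program. The figures are the ones published in this article; nothing else is assumed.

```python
# Fixed overhead: .EXE size of the empty program, in bytes.
BASELINE = {"FTL": 3584, "Logitech": 3797, "TopSpeed": 609, "Stony Brook": 754}

# .EXE size of the sieve benchmark, in bytes.
SIEVE = {"FTL": 3584, "Logitech": 3949, "TopSpeed": 687, "Stony Brook": 826}


def working_code(exe_sizes, baseline=BASELINE):
    """Approximate bytes of real working code: file size minus fixed overhead."""
    return {name: size - baseline[name] for name, size in exe_sizes.items()}


if __name__ == "__main__":
    for name, size in working_code(SIEVE).items():
        print(f"{name:12s} {size:5d} bytes")
```

FTL comes out at zero because its sieve and empty-program files are the same size. One reading, offered here only as an inference from the repeated 3584 figure, is that FTL's linker rounds file sizes up to a fixed block, which hides small programs entirely. For the other three compilers the sieve itself accounts for roughly 70 to 150 bytes.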
If smaller isn't necessarily better, neither is larger (by definition) worse. As an example, note that Logitech yields an enormous .EXE file for fpmath, yet delivers second-best run-time performance in Table 6, page 76, while FTL produces a smaller file size and enormously slower execution for the same program. Performance doesn't have to do with file size, but instead with code efficiency. That's why file size is relatively unimportant.
Table 6: Run times (seconds)

Program               FTL   Logitech  TopSpeed  Stony Brook
sieve                8.61     8.31      3.33       3.85
fib                 38.39    22.92     22.83      24.28
acker               20.71    11.82     10.19      12.14
fpmath (w/o '87)   255.15    51.86     71.21      49.05
       (with '87)   12.08    45.28     10.30      11.63
qsort                7.85     6.00      4.04       3.62
shsort              11.69    11.53      8.98      10.38
cortn                5.59     6.22     10.30      12.85
ncortn               2.29     2.32      1.68       1.48
  overhead           3.30     3.90      8.62      11.37
  % o'head          58.97    62.68     83.73      88.48
Geometric averages  28.99    24.62     17.20      18.16
Table 6 lists the run times for the benchmarks. This is where "the rubber meets the road." All programmers want to produce fast applications, and they will sacrifice a certain amount of productivity and file size to do so. There is no equivocating here; either the resulting application is fast, or it's not.
Overall, JPI's TopSpeed Modula-2 lives up to its name, followed closely by Stony Brook. Logitech comes in a distant third, with FTL biting the dust.
What hurts FTL is its disastrously slow floating-point emulation. TopSpeed--out in front almost everywhere else--is third here, but still 3.58 times faster than FTL. If you remove the fpmath tests, the geometric averages are less widely spread, but still in the same order:
TopSpeed      8.67
Stony Brook   9.46
Logitech     10.80
FTL          13.11
These represent the performance rankings of the compilers' output .EXE files, and they're confirmed by the Dhrystone results.
But before leaving run times, compare the effects of adding an 80x87 math coprocessor to the hardware platform in the fpmath test. To perform these tests, it was necessary to recompile and relink fpmath with each compiler. The 80x87 has a negligible effect for Logitech, but for FTL the time plummets from 255 seconds to a mere 12, which is only a whisker behind first-place TopSpeed. Clearly, all but Logitech have expended great effort in exploiting the '87. The message here is if you're crunching a lot of floating-point numbers, buy an '87 and don't use Logitech.
An industry-standard benchmark, Dhrystone is generally regarded as the most objective single predictor of compiled program performance. It's a synthetic application containing a statistically balanced mix of operations characteristic of the "typical" systems program. Dhrystone was devised by studying hundreds of non-floating-point programs, then constructing a 100-statement algorithm containing the following:
53 assignments
32 control statements
15 function and procedure calls
The entire program loops 50,000 times. Each iteration is one Dhrystone unit. Therefore, 50,000 divided by elapsed time yields the number of Dhrystones per second. The higher the number, the better.
Table 7, this page, shows Dhrystone results for the compilers. Stony Brook has the fastest raw compile/link time at 8.89 seconds. If you adjust TopSpeed's performance downward by 2.8 seconds to estimate environment time as discussed previously, TopSpeed becomes the fastest at 8.4. TopSpeed also wins by a wide margin in terms of .EXE size and Dhrystones per second. Logitech and FTL share last place in Dhrystones per second, with FTL leading by a hair. Stony Brook is almost exactly halfway between first and last place.
Table 7: Dhrystone results

                      FTL   Logitech  TopSpeed  Stony Brook
Compile (sec)        4.62    12.26      7.17       5.11
Link (sec)           7.37     4.84      4.04       3.79
Total (sec)         11.99    17.10     11.20*      8.89
Size (bytes)         9728     8957      2827       5638
Run time (sec)      40.43    40.60     31.77      35.70
Dhrystones/sec       1237     1232      1574       1401
* Est. 8.4 sec. in the integrated environment
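The Dhrystones-per-second row can be reproduced from the run times above. This is a minimal sketch in Python with hypothetical names; it simply divides the 50,000 iterations by each measured run time and rounds to a whole number.

```python
LOOPS = 50_000  # iterations of the Dhrystone loop, as stated in the text

# Measured run times in seconds, from the Dhrystone table.
RUN_SECONDS = {
    "FTL": 40.43,
    "Logitech": 40.60,
    "TopSpeed": 31.77,
    "Stony Brook": 35.70,
}


def dhrystones_per_second(seconds: float, loops: int = LOOPS) -> float:
    """One loop iteration is one Dhrystone, so the rate is loops / time."""
    return loops / seconds


if __name__ == "__main__":
    ranked = sorted(RUN_SECONDS, key=RUN_SECONDS.get)  # fastest first
    for name in ranked:
        rate = dhrystones_per_second(RUN_SECONDS[name])
        print(f"{name:12s} {round(rate):5d} Dhrystones/sec")
```

Because the rate is just the reciprocal of the run time scaled by a constant, ranking by Dhrystones per second is the same as ranking by run time with the order reversed.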
JPI's TopSpeed emerges as the clear winner. Its performance is spectacular, and it offers a powerful integrated environment similar to the Turbo languages, plus an optional symbolic debugger and toolkit. TopSpeed earns a standing ovation.
Stony Brook places a respectable second in all performance categories, running overall neck and neck with TopSpeed. The editor, debugger, and make utility form a useful, but rudimentary, toolkit for program development.
If you're a tools junkie, you'll love Logitech. Its optional toolkit furnishes oodles of them, and they're good. So is the performance of its output .EXE files--provided you turn off the default switches in the M2C.CFG file. Logitech's compile/link performance is lackluster, though, and its .EXE files are the largest of the crop. Nevertheless, the average execution time is fairly close to that of the leaders. The grand old man of Modula-2 compilers for the PC is still a formidable contender.
Although FTL ranks third in compile/link time and code size, its average .EXE run time consigns it to last place in performance. The only area where it really shines is in coroutine switching. What hurts FTL most is inefficient floating-point emulation. Overall, it's a serviceable compiler with an excellent set of tools--a genuine bargain at only $49.95.
So there you have the current state of the art in PC-based Modula-2 development systems, and the state of the art is very good indeed.
Among the mainstream languages, Modula-2 is the first to support concurrent processes through standard procedure calls. Most computers are single-thread machines, so concurrency is achieved by time-division multiplexing: one process runs for a time, then yields the machine to another process, which takes its turn and then reverts to the first, and so on until both complete. In Modula-2 terminology such processes are called coroutines. Coroutines exist as equals, each periodically deferring to the other, which picks up where it most recently left off.
An issue with coroutines is context switching (the saving and restoring of the machine state during control transfers). This entails overhead, and the question becomes one of quantifying it. A benchmark is clearly needed for comparing the relative performance of competing compilers in handling coroutine switching. That's what DDJ presents here.
In fact, two benchmarks are needed that perform exactly the same task -- one using coroutines and the other not. The Modula-2 programs CORTN.MOD and NCORTN.MOD (see Listings, page 100) both generate a 50,000-character string in lowercase, then shift that string to uppercase. The difference is that CORTN uses a coroutine to count the number of characters shifted, whereas NCORTN uses a normal procedure call. Granted that the task is trivial, but this is consistent with benchmarks in general. The objective is to measure the relative amount of overhead introduced by coroutines.
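To make the shape of the two programs concrete, here is a rough analog in Python. It is not the code of CORTN.MOD or NCORTN.MOD, and a Python generator is only an approximation of a Modula-2 coroutine (which is created with NEWPROCESS and resumed with TRANSFER). All function names are hypothetical. The point it illustrates is that both variants do identical work and differ only in how the counter is reached.

```python
import string


def make_text(n: int = 50_000) -> str:
    """Build an n-character lowercase string by cycling through a..z."""
    letters = string.ascii_lowercase
    return "".join(letters[i % 26] for i in range(n))


def counter():
    """Coroutine: each time it is resumed, it counts one more character."""
    count = 0
    while True:
        yield count  # suspend here and hand control back to the caller
        count += 1


def shift_with_coroutine(text: str):
    """Uppercase the text, resuming the counting coroutine once per character."""
    co = counter()
    next(co)  # run the coroutine up to its first suspension point
    out, count = [], 0
    for ch in text:
        out.append(ch.upper())
        count = co.send(None)  # transfer control; it resumes where it left off
    return "".join(out), count


def shift_with_call(text: str):
    """Same task, but the count is kept by an ordinary procedure call."""
    count = 0

    def bump():
        nonlocal count
        count += 1

    out = []
    for ch in text:
        out.append(ch.upper())
        bump()
    return "".join(out), count
```

Timing `shift_with_coroutine(make_text())` against `shift_with_call(make_text())` would give Python's own switching overhead, which says nothing about the Modula-2 compilers reviewed here; the sketch is meant only to show the structure of the comparison.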
So here's what you do. Compile and run CORTN and NCORTN, timing the execution period for each as precisely as possible. Then calculate the overhead introduced by coroutine switching. If C is the CORTN run time and N is that for NCORTN, the percent overhead (O) for coroutine switching is
O = 100 * (C - N) / C
Thus, if CORTN runs in 10.30 seconds and NCORTN runs in 1.68 seconds, the overhead for coroutines is about 83.7 percent.
In comparing coroutine performance, the lower the overhead percentage, the more efficient a particular compiler is at handling coroutines. For example, in Table 4 accompanying this article, FTL has the most efficient coroutine handling because its percentage overhead is the lowest.
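The calculation can be sketched as a small Modula-2 program. This is only an illustration: the module name, the PercentOverhead procedure, and the run times of 8.00 and 2.00 seconds are made up for the example, and it assumes the InOut library module that Wirth recommends, which a given implementation may name or define differently.

```modula-2
MODULE Overhead;
(* Illustrative sketch: percent overhead of coroutine switching *)
(* The run times below are hypothetical, not measured results  *)

FROM InOut IMPORT WriteString, WriteCard, Write, WriteLn;

VAR hundredths : CARDINAL;

PROCEDURE PercentOverhead (c, n : REAL) : REAL;
  (* c = CORTN run time, n = NCORTN run time, both in seconds *)
BEGIN
  RETURN 100.0 * (c - n) / c
END PercentOverhead;

BEGIN
  (* Round to hundredths of a percent so that only CARDINAL output is needed *)
  hundredths := TRUNC (PercentOverhead (8.00, 2.00) * 100.0 + 0.5);
  WriteString ('Coroutine overhead: ');
  WriteCard (hundredths DIV 100, 1);
  Write ('.');
  Write (CHR (ORD ('0') + (hundredths MOD 100) DIV 10));
  Write (CHR (ORD ('0') + hundredths MOD 10));
  WriteString (' percent');
  WriteLn
END Overhead.
```

With these made-up times the overhead works out to 75.00 percent: three quarters of CORTN's run time would be spent switching between coroutines.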
DDJ would like to place these Modula-2 benchmarks and the method for evaluating them in the public domain. If you have comments, please submit them in writing to DDJ, Attn. Kent Porter, at the address on the masthead, or to kporter on MCI or 76704,51 on CompuServe. The results will be published in a future issue. -- KP
_THE STATE OF MODULA-2_
by
Kent Porter
[LISTING ONE]
MODULE dry;

FROM Storage
  IMPORT ALLOCATE, DEALLOCATE, Available, InstallHeap, RemoveHeap;
FROM Strings
  IMPORT CompareStr;

(*
* "DHRYSTONE" Benchmark Program
*
* Version: Mod2/1
* Date: 05/03/86
* Author: Reinhold P. Weicker, CACM Vol 27, No 10, 10/84 pg. 1013
* C version translated from ADA by Rick Richardson
* Every method to preserve ADA-likeness has been used,
* at the expense of C-ness.
* Modula-2 version translated from C by Kevin Northover.
* Again every attempt made to avoid distortions of the original.
* Machine Specifics:
* The time function is system dependent; one is
* provided for the Amiga. Your compiler may be different.
* The LOOPS constant is initially set for 50000 loops.
* If you have a machine with large integers and is
* very fast, please change this number to 500000 to
* get better accuracy.
* You can also time the program with a stopwatch when it
* is lightly loaded (no interlaced 4 bit deep Amiga screens ...).
*
**************************************************************************
*
* The following program contains statements of a high-level programming
* language (Modula-2) in a distribution considered representative:
*
* assignments 53%
* control statements 32%
* procedure, function calls 15%
*
* 100 statements are dynamically executed. The program is balanced with
* respect to the three aspects:
* - statement type
* - operand type (for simple data types)
* - operand access
* operand global, local, parameter, or constant.
*
* The combination of these three aspects is balanced only approximately.
*
* The program does not compute anything meaningful, but it is
* syntactically and semantically correct.
*
*)
(* Accuracy of timings and human fatigue controlled by next two lines *)
CONST
  LOOPS = 50000;

TYPE
  Enumeration = (Ident1, Ident2, Ident3, Ident4, Ident5);
  OneToThirty = CARDINAL;
  OneToFifty = CARDINAL;
  CapitalLetter = CHAR;
  String30 = ARRAY [0..30-1] OF CHAR;
  Array1Dim = ARRAY [0..50] OF CARDINAL;
  Array2Dim = ARRAY [0..50], [0..50] OF CARDINAL;
  RecordPtr = POINTER TO RecordType;
  RecordType = RECORD
                 PtrComp: RecordPtr;
                 Discr: Enumeration;
                 EnumComp: Enumeration;
                 IntComp: OneToFifty;
                 StringComp: String30;
               END;

(*
 * Package 1
 *)
VAR
  IntGlob: CARDINAL;
  BoolGlob: BOOLEAN;
  Char1Glob: CHAR;
  Char2Glob: CHAR;
  Array1Glob: Array1Dim;
  Array2Glob: Array2Dim;
  PtrGlb: RecordPtr;
  PtrGlbNext: RecordPtr;

PROCEDURE Proc7(IntParI1, IntParI2: OneToFifty;
                VAR IntParOut: OneToFifty);
VAR
  IntLoc: OneToFifty;
BEGIN
  IntLoc := IntParI1+2;
  IntParOut := IntParI2+IntLoc;
END Proc7;

PROCEDURE Proc3(VAR PtrParOut: RecordPtr);
BEGIN
  IF (PtrGlb <> NIL) THEN
    PtrParOut := PtrGlb^.PtrComp
  ELSE
    IntGlob := 100
  END;
  Proc7(10, IntGlob, PtrGlb^.IntComp);
END Proc3;

PROCEDURE Func3(EnumParIn: Enumeration): BOOLEAN;
VAR
  EnumLoc: Enumeration;
VAR Func3Result: BOOLEAN;
BEGIN
  EnumLoc := EnumParIn;
  Func3Result := EnumLoc = Ident3;
  RETURN Func3Result
END Func3;

PROCEDURE Proc6(EnumParIn: Enumeration;
                VAR EnumParOut: Enumeration);
BEGIN
  EnumParOut := EnumParIn;
  IF ( NOT Func3(EnumParIn)) THEN
    EnumParOut := Ident4
  END;
  CASE EnumParIn OF
    Ident1:
      EnumParOut := Ident1
  | Ident2:
      IF (IntGlob > 100) THEN
        EnumParOut := Ident1
      ELSE
        EnumParOut := Ident4
      END
  | Ident3:
      EnumParOut := Ident2
  | Ident4:
  | Ident5:
      EnumParOut := Ident3
  ELSE
  END;
END Proc6;

PROCEDURE Proc1(PtrParIn: RecordPtr);
BEGIN
  WITH PtrParIn^ DO
    PtrComp^ := PtrGlb^;
    IntComp := 5;
    PtrComp^.IntComp := IntComp;
    PtrComp^.PtrComp := PtrComp;
    Proc3(PtrComp^.PtrComp);
    IF (PtrComp^.Discr = Ident1) THEN
      PtrComp^.IntComp := 6;
      Proc6(EnumComp, PtrComp^.EnumComp);
      PtrComp^.PtrComp := PtrGlb^.PtrComp;
      Proc7(PtrComp^.IntComp, 10, PtrComp^.IntComp);
    ELSE
      PtrParIn^ := PtrComp^
    END;
  END;
END Proc1;

PROCEDURE Proc2(VAR IntParIO: OneToFifty);
VAR
  IntLoc: OneToFifty;
  EnumLoc: Enumeration;
BEGIN
  IntLoc := IntParIO+10;
  REPEAT
    IF (Char1Glob = 'A') THEN
      DEC(IntLoc, 1);
      IntParIO := IntLoc-IntGlob;
      EnumLoc := Ident1;
    END;
  UNTIL EnumLoc = Ident1;
END Proc2;

PROCEDURE Proc4;
VAR
  BoolLoc: BOOLEAN;
BEGIN
  BoolLoc := Char1Glob = 'A';
  BoolLoc := BoolLoc OR BoolGlob;
  Char2Glob := 'B';
END Proc4;

PROCEDURE Proc5;
BEGIN
  Char1Glob := 'A';
  BoolGlob := FALSE;
END Proc5;

PROCEDURE Proc8(VAR Array1Par: Array1Dim;
                VAR Array2Par: Array2Dim;
                IntParI1, IntParI2: OneToFifty);
VAR
  IntLoc: OneToFifty;
  IntIndex: OneToFifty;
BEGIN
  IntLoc := IntParI1+5;
  Array1Par[IntLoc] := IntParI2;
  Array1Par[IntLoc+1] := Array1Par[IntLoc];
  Array1Par[IntLoc+30] := IntLoc;
  FOR IntIndex := IntLoc TO (IntLoc+1) DO
    Array2Par[IntLoc][IntIndex] := IntLoc
  END;
  Array2Par[IntLoc][IntLoc-1] := Array2Par[IntLoc][IntLoc-1]+1;
  Array2Par[IntLoc+20][IntLoc] := Array1Par[IntLoc];
  IntGlob := 5;
END Proc8;

PROCEDURE Func1(CharPar1, CharPar2: CapitalLetter): Enumeration;
VAR
  CharLoc1, CharLoc2: CapitalLetter;
VAR Func1Result: Enumeration;
BEGIN
  CharLoc1 := CharPar1;
  CharLoc2 := CharLoc1;
  IF (CharLoc2 <> CharPar2) THEN
    Func1Result := (Ident1)
  ELSE
    Func1Result := (Ident2)
  END;
  RETURN Func1Result
END Func1;

PROCEDURE Func2(VAR StrParI1, StrParI2: String30): BOOLEAN;
VAR
  IntLoc: OneToThirty;
  CharLoc: CapitalLetter;
VAR Func2Result: BOOLEAN;
BEGIN
  IntLoc := 2;
  WHILE (IntLoc <= 2) DO
    IF (Func1(StrParI1[IntLoc], StrParI2[IntLoc+1]) = Ident1) THEN
      CharLoc := 'A';
      INC(IntLoc, 1);
    END;
  END;
  IF (CharLoc >= 'W') AND (CharLoc <= 'Z') THEN
    IntLoc := 7
  END;
  IF CharLoc = 'X' THEN
    Func2Result := TRUE
  ELSIF CompareStr (StrParI1, StrParI2) > 0 THEN
    INC(IntLoc, 7);
    Func2Result := TRUE
  ELSE
    Func2Result := FALSE
  END;
  RETURN Func2Result
END Func2;

PROCEDURE Proc0;
VAR
  IntLoc1: OneToFifty;
  IntLoc2: OneToFifty;
  IntLoc3: OneToFifty;
  CharLoc: CHAR;
  CharIndex: CHAR;
  EnumLoc: Enumeration;
  String1Loc, String2Loc: String30;
  i, LoopMax: CARDINAL;
BEGIN
  LoopMax := LOOPS;
  NEW(PtrGlbNext);
  NEW(PtrGlb);
  PtrGlb^.PtrComp := PtrGlbNext;
  PtrGlb^.Discr := Ident1;
  PtrGlb^.EnumComp := Ident3;
  PtrGlb^.IntComp := 40;
  PtrGlb^.StringComp := 'DHRYSTONE PROGRAM, SOME STRING';
  String1Loc := "DHRYSTONE PROGRAM, 1'ST STRING";
  FOR i := 0 TO LoopMax DO
    Proc5;
    Proc4;
    IntLoc1 := 2;
    IntLoc2 := 3;
    String2Loc := "DHRYSTONE PROGRAM, 2'ND STRING";
    EnumLoc := Ident2;
    BoolGlob := NOT Func2(String1Loc, String2Loc);
    WHILE (IntLoc1 < IntLoc2) DO
      IntLoc3 := 5*IntLoc1-IntLoc2;
      Proc7(IntLoc1, IntLoc2, IntLoc3);
      INC(IntLoc1, 1);
    END;
    Proc8(Array1Glob, Array2Glob, IntLoc1, IntLoc3);
    Proc1(PtrGlb);
    CharIndex := 'A';
    WHILE CharIndex <= Char2Glob DO
      IF (EnumLoc = Func1(CharIndex, 'C')) THEN
        Proc6(Ident1, EnumLoc)
      END;
      CharIndex := VAL(CHAR, ORD(CharIndex)+1);
    END;
    IntLoc3 := IntLoc2*IntLoc1;
    IntLoc2 := IntLoc3 DIV IntLoc1;
    IntLoc2 := 7*(IntLoc3-IntLoc2)-IntLoc1;
    Proc2(IntLoc1);
  END;
END Proc0;

(* The Main Program is trivial *)
BEGIN
  Proc0;
END dry.

[LISTING TWO]
MODULE sieve;
(* Eratosthenes sieve prime number program, Byte Magazine *)
CONST size = 8190;
VAR
  psn, k, prime, iter : INTEGER;
  flags : ARRAY [0..size] OF BOOLEAN;
BEGIN
  FOR iter := 1 TO 25 DO
    FOR psn := 0 TO size DO
      flags[ psn ] := TRUE;
    END(* for *);
    FOR psn := 0 TO size DO
      IF flags[ psn ]
      THEN (* prime *)
        prime := psn + psn + 3;
        k := psn + prime;
        WHILE k <= size DO (* cancel multiples *)
          flags[ k ] := FALSE;
          k := k + prime;
        END(* while *);
      END(* if then *);
    END(* for *);
  END(* for *);
END sieve.

[LISTING THREE]
MODULE fib;
(* Berkeley standard benchmark *)
(* Computes largest 16-bit Fibonacci number *)
(* Tests compiler recursion efficiency and CPU thruput *)
CONST
  TIMES = 10;
  VALUE = 24;
VAR
  i: INTEGER;
  f: CARDINAL;

(* ----------------------------------------------------------- *)
PROCEDURE fibonacci(n: INTEGER): CARDINAL;
VAR fibonacciResult: CARDINAL;
BEGIN
  IF n >= 2 THEN
    fibonacciResult := fibonacci(n-1)+fibonacci(n-2)
  ELSE
    fibonacciResult := n
  END;
  RETURN fibonacciResult
END fibonacci; (* --------------------------- *)

BEGIN (* main *)
  FOR i := 1 TO TIMES DO
    f := fibonacci(VALUE)
  END;
END fib.

[LISTING FOUR]
MODULE acker;
(* Berkeley standard benchmark *)
(* Ackermann's function: ack (2, 4) *)
(* Tests recursion and integer math *)
(* Repeats 10,000 times *)
VAR
  loop, r: INTEGER;

(* ---------------------------------------------------------- *)
PROCEDURE ack(x1, x2: INTEGER): INTEGER;
VAR
  result: INTEGER;
VAR ackResult: INTEGER;
BEGIN
  IF x1 = 0 THEN
    result := x2+1
  ELSIF x2 = 0 THEN
    result := ack(x1-1, 1)
  ELSE
    result := ack(x1-1, ack(x1, x2-1))
  END;
  ackResult := result;
  RETURN ackResult
END ack; (* --------------------------- *)

BEGIN (* main *)
  FOR loop := 1 TO 10000 DO
    r := ack(2, 4)
  END;
END acker.

[LISTING FIVE]
MODULE FPMath;
(* Benchmarks floating point math package *)
FROM MathLib0 IMPORT arctan, exp, ln, sin, sqrt;
FROM InOut IMPORT Write, WriteLn, WriteString;
CONST
  pi = 3.1415927;
  nloops = 5;
VAR
  i, j: INTEGER;
  angle, result, argument: REAL;
BEGIN
  WriteString('SQUARE ROOTS ');
  FOR i := 1 TO nloops DO
    Write ('.');
    argument := 0.0;
    WHILE argument <= 1000.0 DO
      result := sqrt (argument);
      argument := argument + 1.0
    END;
  END; (* FOR *)
  WriteLn;

  WriteString('LOGS ');
  FOR i := 1 TO nloops DO
    Write ('.');
    argument := 0.1;
    WHILE argument <= 1000.1 DO
      result := ln (argument);
      argument := argument + 1.0
    END;
  END; (* FOR *)
  WriteLn;

  WriteString('EXPONENTIALS ');
  FOR i := 1 TO nloops DO
    Write ('.');
    argument := 0.1;
    WHILE argument <= 10.0 DO
      result := exp (argument);
      argument := argument + 0.01
    END;
  END; (* FOR *)
  WriteLn;

  WriteString('ARCTANS ');
  FOR i := 1 TO nloops DO
    Write ('.');
    argument := 0.1;
    WHILE argument <= 10.0 DO
      angle := arctan (argument);
      argument := argument + 0.01
    END;
  END; (* FOR *)
  WriteLn;

  WriteString('SINES ');
  FOR i := 1 TO nloops DO
    Write ('.');
    angle := 0.0;
    WHILE angle <= 2.0 * pi DO
      result := sin (angle);
      angle := angle + pi / 360.0
    END;
  END; (* FOR *)
  WriteLn;
END FPMath.

[LISTING SIX]
MODULE QSort;
(* The test uses QuickSort to measure recursion speed *)
(* An ordered array is created by the program and is *)
(* reverse sorted. The process is performed 'MAXITER'*)
(* number of times. *)
CONST SIZE = 1000;
      MAXITER = 50;
TYPE NUMBERS = ARRAY[1..SIZE] OF CARDINAL;
VAR Iter, Offset, I, J, Temporary : CARDINAL;
    A : NUMBERS;

PROCEDURE InitializeArray ;
(* Procedure to initialize array *)
VAR I : CARDINAL;
BEGIN
  FOR I := 1 TO SIZE DO
    A[I] := SIZE - I + 1
  END; (* FOR I *)
END InitializeArray;

PROCEDURE QuickSort;
(* Procedure to perform a QuickSort *)

  PROCEDURE Sort(Left, Right : CARDINAL);
  VAR i, j : CARDINAL;
      Data1, Data2 : CARDINAL;
  BEGIN
    i := Left; j := Right;
    Data1 := A[(Left + Right) DIV 2];
    REPEAT
      WHILE A[i] < Data1 DO INC(i) END;
      WHILE Data1 < A[j] DO DEC(j) END;
      IF i <= j THEN
        Data2 := A[i]; A[i] := A[j]; A[j] := Data2;
        INC(i); DEC(j)
      END;
    UNTIL i > j;
    IF Left < j THEN Sort(Left,j) END;
    IF i < Right THEN Sort(i,Right) END;
  END Sort;

BEGIN (* QuickSort *)
  Sort(1,SIZE);
END QuickSort;

BEGIN (* Main *)
  FOR Iter := 1 TO MAXITER DO
    InitializeArray;
    QuickSort
  END; (* FOR Iter *)
END QSort.

[LISTING SEVEN]
MODULE ShSort;
(* Tests Shell sort speed on an integer array of ARSIZE elements. *)
(* Creates an array ordered from smaller to larger, then sorts it *)
(* into reverse order. Repeats NSORTS times. *)
CONST ARSIZE = 1000;
      NSORTS = 20;
TYPE NUMBERS = ARRAY [1..ARSIZE] OF INTEGER;
VAR IsInOrder, Ascending : BOOLEAN;
    Iter, Offset, I, J, Temporary : CARDINAL;
    Ch : CHAR;
    A : NUMBERS;

PROCEDURE InitializeArray ;
(* Initialize array *)
BEGIN
  FOR I := 1 TO ARSIZE DO
    A [I] := I
  END; (* FOR I *)
END InitializeArray;

PROCEDURE ShellSort ;
(* Shell-Metzner sort *)

  PROCEDURE Swap;
  (* Swap elements A[I] and A[J] *)
  BEGIN
    IsInOrder := FALSE;
    Temporary := A[I];
    A[I] := A[J];
    A[J] := Temporary;
  END Swap;

BEGIN
  (* Toggle 'Ascending' flag *)
  Ascending := NOT Ascending;
  Offset := ARSIZE;
  WHILE Offset > 1 DO
    Offset := Offset DIV 2;
    REPEAT
      IsInOrder := TRUE;
      FOR J := 1 TO (ARSIZE - Offset) DO
        I := J + Offset;
        IF Ascending
          THEN IF A[I] < A[J] THEN Swap END
          ELSE IF A[I] > A[J] THEN Swap END
        END; (* IF Ascending *)
      END; (* FOR J *)
    UNTIL IsInOrder;
  END; (* End of while-loop *)
END ShellSort;

BEGIN (* Main *)
  InitializeArray;
  Ascending := TRUE;
  FOR Iter := 1 TO NSORTS DO
    ShellSort
  END;
END ShSort.

[LISTING EIGHT]
MODULE cortn;
(* Benchmark to test speed of coroutine switching *)
(* Shifts NCHARS characters to upper-case *)
(* Two transfers per character *)
FROM SYSTEM IMPORT NEWPROCESS, TRANSFER, ADDRESS, BYTE, ADR;
CONST NCHARS = 50000;
      WorkSize = 1000;
VAR ch : ARRAY [1..NCHARS] OF CHAR;
    ShiftWork, CountWork : ARRAY [1..WorkSize] OF BYTE;
    count, chval, c : CARDINAL;
    main, shifter, counter : ADDRESS;

PROCEDURE CountProc;
(* Increments count *)
BEGIN
  REPEAT
    count := count + 1;
    TRANSFER (counter, shifter);
  UNTIL FALSE;
END CountProc;

PROCEDURE ShiftProc;
(* Shifts char at 'count' to upper case *)
BEGIN
  REPEAT
    IF (ch [count] >= 'a') AND (ch [count] <= 'z') THEN
      ch [count] := CHR (ORD (ch [count]) - 32)
    END;
    TRANSFER (shifter, counter);
  UNTIL count = NCHARS;
  TRANSFER (shifter, main);
END ShiftProc;

BEGIN (* Main program *)
  (* Load array with lower-case letters *)
  chval := ORD ('a');
  FOR c := 1 TO NCHARS DO
    ch [c] := CHR (chval);
    chval := chval + 1;
    IF chval > ORD ('z') THEN
      chval := ORD ('a');
    END;
  END;
  (* Set up coroutines *)
  NEWPROCESS (CountProc, ADR (CountWork), WorkSize, counter);
  NEWPROCESS (ShiftProc, ADR (ShiftWork), WorkSize, shifter);
  (* Dispatch the controlling task *)
  count := 1;
  TRANSFER (main, shifter);
END cortn.

[LISTING NINE]
MODULE ncortn;
(* Does the same thing as CORTN.MOD, but without *)
(* coroutine switching *)
(* Subtract run time for this from time for CORTN *)
(* to find out actual coroutine overhead *)
CONST NCHARS = 50000;
      WorkSize = 1000;
VAR ch : ARRAY [1..NCHARS] OF CHAR;
    count, chval, c : CARDINAL;

PROCEDURE CountProc;
(* Increments count *)
BEGIN
  count := count + 1;
END CountProc;

PROCEDURE ShiftProc;
(* Shifts all chars in array 'ch' upper case *)
BEGIN
  REPEAT
    IF (ch [count] >= 'a') AND (ch [count] <= 'z') THEN
      ch [count] := CHR (ORD (ch [count]) - 32)
    END;
    CountProc; (* Substitute call for TRANSFER *)
  UNTIL count = NCHARS;
END ShiftProc;

BEGIN (* Main program *)
  (* Load array with lower-case letters *)
  chval := ORD ('a');
  FOR c := 1 TO NCHARS DO
    ch [c] := CHR (chval);
    chval := chval + 1;
    IF chval > ORD ('z') THEN
      chval := ORD ('a');
    END;
  END;
  (* Dispatch the controlling task *)
  count := 1;
  ShiftProc;
END ncortn.