LETTERS

C++ and the 386

Dear DDJ,

As one of what I would imagine is a very small number of Intek C++ users, I read with interest the exchange between Mac Cutchins of Intek and Al Stevens in your March 1990 "Letters" section. When I first received the Intek package I had many of the same problems that Al did, including a not-so-amusing little bug that resulted in the compiler only working every second time. Several of their header files would not compile due to typos and bugs of one sort or another. Intek technical support was always polite but rarely helpful, and I sometimes got the impression that I was the only user they had actually suckered into buying the product. To be fair, they did supply an upgrade when I complained that the old version of the PharLap binder they had used prevented its working correctly with the VCPI standard and hence precluded the use of QEMM and DESQview. I was also notified (by telephone no less) of their upgrade to 2.0, and it was reasonably priced and delivered promptly.

I was eventually able to contrive the necessary patches, batch files, and bug fixes so that it would reliably compile things from my Brief editor and I could pretend I was working with a real development tool. Still one might ask why bother with it when there are other alternatives.

First and foremost, because Intek C++ is the only product that works with the MetaWare High C or WATCOM compilers; thus it is the only way to produce code for 386 protected-mode programs running under DOS extenders. Also, as Mac Cutchins points out, Intek's use of 386 protected mode means you are not concerned with running out of memory when compiling large source modules. The large and numerous header files that C++ encourages can quickly exhaust the memory of real-mode compilers such as Zortech. Many of my large library modules would have to be split up and might still give problems compiling under Zortech. Finally, and unexpectedly, the translator itself is quite robust once it is running. It correctly compiled code segments where Zortech version 1.2 gave spurious errors. Version 2.0 of Zortech seemed more robust in my limited testing of it, but could not compile many of my files due to the memory limitation problem.

I am the only one in our shop using C++ at the moment, but that will change in the near future and I dread having to invest in more copies of the Intek product. With each new issue of DDJ I carefully scan all the ads and announcements for a Turbo C++ 386 or something similar. The OS/2 version of Zortech is tempting, but I need to use too many PharLap programs to make that feasible just yet. When all is said and done, Intek has the singular advantage of being the only product available under DOS for creating really large C++ applications. If something else is available I would love to know about it.

Craig Morris

Calgary, Alberta, Canada

DDJ responds: Thanks for your insights, Craig. Just within the last few days, DDJ contributing editor Andrew Schulman started an in-depth look at C++ implementations for the 386, beginning with Intek C++ and MicroWay's NDP C++. We're looking forward to sharing his findings sometime in the near future.

Trick Trade-offs

Dear DDJ,

This message is in regards to Tim Paterson's article "Assembly Language Tricks of the Trade" in the March 1990 DDJ.

I've always enjoyed reading articles about the tricks and magic that other programmers use. If we assume some things, though, we can do your Binary-To-ASCII Conversion one better.

If we assume that the Carry and Auxiliary Carry are clear, then a binary value in the range 00-0F in AL can be converted to ASCII by:

  daa                         ; 00-09, 10-15
  add          al,0F0h        ; F0-F9 NC, 00-05 CY
  adc          al,040h        ; 30-39, 41-46 ('0'-'9','A'-'F')

Since we usually want to convert a BYTE to two ASCII characters, this is usually preceded by masking and/or shifting some other value. These operations will clear the Carry and Auxiliary Carry, so everything's OK.

Yet another trick: You mention using the AAM and AAD instructions for Binary/Decimal Conversion. There is an undocumented "extension" to these instructions, which is often useful. The opcodes for AAD and AAM are:

  AAD    =    D5 0A
  AAM    =    D4 0A

If the 0As look a little suspicious, it's because they are the divisors used in the conversion. The instruction sequence D4 10 is equivalent to separating the byte in AL into its upper/lower nibbles, placing the upper nibble into the lower nibble of AH and leaving just the lower nibble in AL. This also happens to clear the Carry and Auxiliary Carry flags. Sooooo ... used in conjunction with the binary-to-ASCII conversion code above, it yields an extremely compact, brutally fast byte-to-two-ASCII-digits conversion. Neat, eh?
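For readers without an assembler handy, the combined effect of the two tricks can be sketched in C. This is purely an illustration of the arithmetic (the function names are invented for the example); the real payoff is, of course, the tiny machine-code sequence itself:

```c
/* C sketch of the DAA / ADD / ADC nibble-to-ASCII trick, assuming the
   Carry and Auxiliary Carry flags are clear on entry. */
static unsigned char hex_digit(unsigned char n)  /* n in the range 00h-0Fh */
{
    unsigned v = n;
    if (v > 9)
        v += 6;                           /* daa: 0A-0F become 10-15 */
    unsigned carry = (v + 0xF0) > 0xFF;   /* add al,0F0h: CY set for 0A-0F */
    v = (v + 0xF0) & 0xFF;                /* F0-F9 NC, 00-05 CY */
    v = (v + 0x40 + carry) & 0xFF;        /* adc al,040h: '0'-'9','A'-'F' */
    return (unsigned char)v;
}

/* The "AAM 16" (D4 10) step is just a nibble split: */
static void byte_to_hex(unsigned char b, unsigned char out[2])
{
    out[0] = hex_digit(b >> 4);    /* upper nibble, as AAM 16 leaves in AH */
    out[1] = hex_digit(b & 0x0F);  /* lower nibble, left in AL */
}
```

For example, byte_to_hex(0x3F, buf) leaves '3' and 'F' in buf.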

Keith Moore

Fort Worth, Texas

Tim responds: I am aware of the tricks Keith mentions. However, both rely on undocumented features of the 8086 family, which is a very dangerous practice.

The only instructions which are documented to affect the Auxiliary Carry (AC) flag in a specific way are arithmetic instructions (not including shifts). Masking and shifting instructions are documented as leaving the AC flag undefined. Thus it is very unlikely that the state of the AC flag will be known when Keith's instruction is executed, and the method could easily fail.

Testing with a debugger may leave the impression that masking, for example, leaves the AC flag clear. However, did you check this on an 8088, 8086, 286, or 386? What about the 33-MHz 386, which uses a different mask set than the slower versions? Are you sure the 386SX, 486, and 586 (which no one has seen yet) all work that way?

The same thing can be said for using variants of the AAM and AAD instructions to multiply or divide by something other than ten. Eleven years ago I discovered that the 8086 used the second byte of those instructions as an immediate value. But does a 486? If it does, then the 486 has a bug -- it should perform an invalid-opcode trap if the second byte is not 0AH. Or else Intel needs to document that it works.

There are too many different processors in the family -- and too many different manufacturers -- to consider using undocumented features. Let's all play by the rules.

But Basic Already Does That...

Dear DDJ,

No one is a bigger fan of Jeff Duntemann than I, but he completely missed the boat in his Modula-2 discussion (DDJ, February 1990). As Jeff went over the list of omissions in both Pascal and Modula-2, I kept saying to myself, "But QuickBASIC already does that." In my opinion, Microsoft QuickBASIC overcomes all of the shortcomings of both Pascal and Modula-2, with a language that is both fully structured and incredibly easy to use.

For example, Jeff laments Pascal's inability to view a list of procedures, and praises that feature in Modula-2. But QuickBASIC has had a "View Subs" menu for years. He then compares Pascal's ability to use a varying number and type of parameters for built-in statements, as opposed to Modula with its separate WriteString, ReadInt, and so forth. Again, QuickBASIC (and even interpreted BASIC!) has always had that capability. Worse still, procedures in either language cannot accept a truly "open ended" array. And again, QuickBASIC lets you pass any array -- with any number of dimensions and any range of upper and lower bounds -- to any subroutine. How else could one write a usable sort routine?!

I won't belabor the remaining list of advantages that QuickBASIC has over the "Wirth" languages. No, I won't dwell on QuickBASIC's many data types, automatic support for a coprocessor, TRUE dynamic strings, world-coordinate graphics, or its ability to manage an entire project without requiring all of the files to be in the same directory. (Yeah, that's a good one -- multiple copies of your debugged subroutines scattered all over a disk.) And I won't even belabor QuickBASIC's outstanding support for fully interrupt-driven communications. Where Jeff is bragging about a 100-line Comm program he wrote in an hour using Modula-2, I maintain the same could be done in, say, 20 lines in ten minutes using QuickBASIC.

Indeed, if any language is the rightful successor to "king" Turbo Pascal, surely it is QuickBASIC.

Ethan Winer

Stamford, Connecticut

Editor's note: Ethan is president of Crescent Software, developers of QuickBASIC add-on tools.

Forth-Coming

Dear DDJ,

I read Martin Tracy's article, "Zen Forth," with great interest (DDJ, January 1990). As a Forth programmer myself, I'm interested in Forth systems and applications. I even wrote a Forth system for sale (CorrectForth -- I published it as a product of Correct Software, Inc.). I have a number of comments on the implementation and on what look like bugs in the source code.

First, you could put the address of colon into the register DI. Then colon looks like this:

  LABEL COLON  BP SP XCHG  SI PUSH  BP SP XCHG  SI POP  NEXT  C;

(the CFA code of a colon definition is DI CALL). The result is a system about 2 percent faster than one using a JMP COLON, at the cost of numerous changes to the source code (string operators, FIND, etc.). The changes are minor and would involve saving and restoring DI. Another change would be to use register ES to point to RAM, thus increasing the amount of code space and data space available. Only string operations would be affected, and would involve saving and restoring ES. Then, too, you could dedicate another register to hold the next-to-top-of-stack value. This speeds up the system by 10 percent, since lots of Forth words use two parameters. The system as published in DDJ runs the Sieve of Eratosthenes benchmark in 46 seconds, but the new improved system in 45 seconds. Time counts in real-time applications!

The source code bugs are as follows:

Screen     Page        Bug(s)
13         98          use of TRUE (a code-defined word) in =, <, U<
14         98          same as above, only for 0=, 0<
37         102         use of SP0 in DEPTH

The reason I'd call them bugs is that I don't think the metacompiler Martin was using would execute words defined in the metacompiler's target dictionary. If it did, I'd think twice before I'd use such a "feature" -- I might be cross-compiling for a processor that could not execute host code ...!

Overall, this system sings pretty good. I counted on that -- Mr. Tracy's been in the Forth community much longer than I have. The choice of a DTC (direct threaded code) implementation of Forth is the best in my opinion, since it has the best trade-off of size vs. speed. If you want speed and don't care about size, go for STC (subroutine threaded code), like Small C did. If you want really tight code (say you only have 4K of ROM), go for TTC (token threaded code). If you want speed and just have to have small size, go for DTC. The high-level words run at an acceptable speed, and provided you choose the proper words to code in assembler (CODE definitions, for the knowledgeable), you'll get screaming speed at little cost.
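To make the threading trade-off concrete, here is a toy inner interpreter sketched in C. It is purely illustrative -- the names (lit, plus, halt) are invented for the example, a real DTC Forth such as ZEN jumps straight to machine code rather than through C function pointers, and cell-to-function-pointer casts are implementation-defined rather than strictly portable C:

```c
#include <stdint.h>

/* Each primitive is a C function; a "thread" is an array of cells holding
   either a primitive's address or an inline literal. run() is the classic
   NEXT loop: fetch the next cell, advance, execute. */
typedef void (*prim)(void);

static intptr_t stack[16];
static int sp;
static const intptr_t *ip;                /* thread instruction pointer */

static void push(intptr_t v)   { stack[sp++] = v; }
static intptr_t pop_cell(void) { return stack[--sp]; }

static void lit(void)  { push(*ip++); }   /* next cell holds a literal */
static void plus(void) { intptr_t b = pop_cell(); push(pop_cell() + b); }
static void halt(void) { ip = 0; }        /* stop the inner interpreter */

static void run(const intptr_t *thread)
{
    for (ip = thread; ip != 0; )
        ((prim)*ip++)();                  /* NEXT: fetch, advance, execute */
}
```

Running the thread for the Forth phrase 3 4 + -- that is, { lit, 3, lit, 4, plus, halt } -- leaves 7 on the stack.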

Russell McCale

New York, New York

Martin responds: Thank you for your interest in ZEN Forth. I am writing this letter to answer some of the many questions I have received.

ZEN is a personal dialect I have been developing and porting for several years. Most recently, I have been using it to track the development of the ANSI X3J14 proposed standard. The current state of the standard is reflected in a working document called BASIS. The BASIS changes every three months.

The most recent BASIS is BASIS 10, and I have written ZEN1_10 to match it. ZEN1_10 means version 1, release 10. I have posted ZEN1_10 on GEnie and on BIX, and will continue to post new versions there.

ZEN1_10 is not meant to be a development system, but rather a simple and efficient Forth dialect. I have provided only the source code, for your study, and an executable file that you can use to load a text file to test a program for ANSI compatibility.

Yes, you are missing documentation, assembler, metacompiler, etc. These will not be written until the draft proposal dpANS is ready, which is at least nine months away. The current release was created by a Forth-to-assembler-source translator. The next release will probably be written in Turbo C or C++.

More on Algorithm Patenting

Dear DDJ,

The compression algorithms have been in my conscious path for a search to reduce some of my voluminous writings. I have corresponded with you and Mark Nelson about this, and although I could never get his C program to run with "Let's C" from Mark Williams, I read with interest what some of the law types have to say about it.

Having been in the chemical field for some 30 years, I have come across many snafus of the Patent Office. I leave to your imagination why these snafus occur; not in the least is the heavy burden of research of prior art before patents are granted. Many times patents were granted on chemical procedures or compounds that were in direct conflict with prior art. These were easy to deal with. Usually showing prior art would annul the patent rights right on the spot.

It may have become a bit more difficult today, since our society is the most litigious in the world and lawyers, in and out of government, seem to thrive on perpetuating their own income at the expense of the general population. Lawyers have become the true leeches of this society. I am not surprised that some two-bit lawyer would claim the LZW routine to be patentable, while the real inventors lived some 50 years ago and may have been dead for a while. After all, lawyers have to make money too.

Paul A. Elias

Fountain Hills, Arizona

Location IS Everything

Dear DDJ,

I'm working with Softaid's hardware 8088 emulator, and found Mark Nelson's January 1990 article ("Location is Everything!") on an exe-to-hex locate utility useful and instructive. However, I had to move the STACK segment in his START.ASM file in front of the other data segments to make the locate program behave correctly; this with Borland's TASM 1.0, C 2.0, and TLINK 2.0, which combination I assume uses some slight unanticipated variation of the 5 million sacred ways of ordering segments and groups. Without this change, the stack segment wound up, in a test file, a paragraph after the rest of the data, and since LOCATE uses this value to figure out where all the data is, it wouldn't relocate properly. (This is because -- I would figure, but heaven only knows -- the exe stack record LOCATE uses was in fact the genuine offset of the stack, not of some trifling DGROUP, no matter what START.ASM says.)

Once over that minor difficulty I was able, using various C, TASM, and TLINK debugging options, to include line numbers and globals in an output map file which the Softaid SLD (source level debugger) program and utilities could translate, download, and more or less understand -- that is, I could step and breakpoint in source (public variables wound up in the wrong place, but I'm sure a little more hacking could fix that). SLD is a great and powerful thing capable of much more, or so I am told, and inasmuch as the Softaid system is thousands of dollars, we're spending a few hundred more for a sophisticated locator program. But it's nice to have an extra emergency tool, and using/fiddling Mr. Nelson's program was just the bit of 8088-in-ROM exercise I needed to get in the mood. Thanks for the help.

J.G. Owen

Fort Salonga, New York

Round and Round We Go ... Maybe

Dear DDJ,

Recently I had the chance to put to use the parametric circle algorithm described in Robert Zigon's article in the January issue of DDJ ("Parametric Circles"). Shortly thereafter, I came across Joseph M. Hovanes Jr.'s letter in the March issue, citing the shortcomings of this algorithm when compared to Bresenham's algorithm.

Although Bresenham's algorithm is more efficient, the parametric approach does have several advantages. First, the eight-way symmetry that Mr. Hovanes mentions can be applied when drawing a parametric circle, too. Second, only floating-point additions and multiplications (i.e., no trig functions) are performed inside the loop. If your computer has a floating-point coprocessor, the execution time is within the same order of magnitude as integer arithmetic.

Lastly, if you need to draw only part of a circle (i.e., an arbitrary circular arc), the parametric algorithm can be easily adapted to start and stop where you please. After examining Bresenham's algorithm for quite a while, I'm pretty sure that it can only draw a complete circle, or one of the eight symmetric sectors.

Ben White

Mountain View, California

It's All in the Numbers

Dear DDJ,

The major point Michael Swaine makes in his November 1989 "Swaine's Flames" -- that we should not blindly accept "numerical" answers -- is well taken. Unfortunately, in the second example of incorrect use of numeric things, I believe he is in error and John Paulos is correct. In my 15 years hanging around research laboratories, I have always understood two values to be different by "two orders of magnitude" to mean different by a factor of 10^2, not, as he claims, by 10^100. If this were the case, the term would not come up very often, since 10^100 is a very large number -- about equal to the number of atoms in the universe.

I ran across a better example of incorrect number usage in an IBM ad. This ad states that the footprint of their new printer (291 square inches) is 33 percent smaller than H.P.'s LaserJet (432 square inches). Give or take a square inch, this is correct. However, the ad then concludes from this fact "And that gives you 33 percent more usable workspace." This proclamation, while sounding somehow reasonable, is correct for only one of all possible workspaces.

For example, my computer/printer space is a fairly typical 80 x 32 inches (2560 square inches). If I had a LaserJet, I would have 2560 - 432 = 2128 square inches of "usable" workspace. (Is a printer really useless?) If, according to IBM, I purchase their product to replace the LaserJet, I will have 33 percent more workspace, or 2128 x 1.33 = 2830 square inches more than the area of my table with no printer at all. Good deal, it saves buying a bigger desk!

In fact, if each time I buy an IBM printer, I get 33 percent more workspace, the purchase of 118 of them should give me control of the entire surface of the earth. However, if I need still more room, even if I only purchase one a day, inside of a year I can have the lateral dimensions of my workspace increasing at an average speed greater than light. But that, as we know, would be ridiculous.
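The figure of 118 does check out. Here is a minimal C sketch of the compound-growth arithmetic (the function name is invented for the example, and the Earth's surface is taken as roughly 5.1e8 square kilometers, at about 1.55e9 square inches per square kilometer):

```c
/* How many 33 percent workspace "gains" turn a 2128-square-inch desk
   into the surface of the Earth? */
static int printers_to_cover_earth(void)
{
    double workspace = 2128.0;           /* square inches, LaserJet removed */
    double earth = 5.101e8 * 1.55e9;     /* km^2 times square inches per km^2 */
    int printers = 0;
    while (workspace < earth) {
        workspace *= 1.33;               /* each purchase "adds" 33 percent */
        printers++;
    }
    return printers;                     /* 118, as claimed above */
}
```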

Of course, in my example, what really happens is that after the purchase of an IBM laser printer, I would have 2560 - 291 = 2269 square inches of workspace; 2269/2128 is about 7 percent more than before. This is of some benefit, of course, but it doesn't sound very impressive -- and the point of using numbers at all is to impress people -- right?

Jeffry Stetson

Villigen, Switzerland

I Fought the Law But I Won

Dear DDJ,

I was a little bit surprised by Duntemann's One Law of Portability (DDJ, March 1990): that it's virtually impossible to take source code for an on-line program and recompile it on an entirely different computer with little if any modification.

Actually, I know that it can be done, since I've done exactly that by moving Ryan-McFarland COBOL code between an IBM PC compatible and a minicomputer running Unix. And come to think of it, why can't any higher-level language include verbs that mean "display this on the user's screen" and "place the user's keyboard input into this memory location," regardless of whether the code is compiled and executed on a PC, a VAX, or a 3090?

Jacob Stein

Monsey, New York