DOS + 386 = 4 GIGABYTES!

Directly address 4 gigabytes of memory in DOS from your C or assembly language applications

Al Williams

While Al's programming endeavors range from AI to real-time control software, he specializes in system-level software. Be sure to look for Al's 386 DOS extender in an upcoming DDJ. Al can be reached via CompuServe (72010, 3574) or at 310 Ivy Glen Court, League City, TX 77573.


Ever since Intel introduced the 8088 and 8086, programmers have chafed at the 64K limit imposed by the 8086's segmented architecture. Dealing with data structures greater than 64K has required great feats of legerdemain and been all but impossible in some high-level languages. The 80286 came along, but it still used 64K segments; and though the 286 can address 16 Mbytes of memory, DOS knows only how to deal with the first megabyte. Then the 80386 arrived on the scene. At last, programmers could define segments ranging in size from 1 byte to 4 gigabytes!

Unfortunately, DOS still limits programmers to 1 Mbyte. In this article, I'll show a method for accessing the entire 80386 address space (4 gigabytes) as one flat range of addresses. I'll also provide support for accessing memory from C, controlling the 80286/80386 address lines, allocating extended memory, and adding assembly language to C programs without an assembler. The programs presented all compile under Microsoft C 5.1 with or without the Microsoft assembler, MASM 5.1. Mix's PowerC also compiles these programs.

Addressing Revisited

Recall that the 8086 uses a model of memory addressing known as segmentation to break memory into pieces (or segments). Inside each segment, each particular byte has a unique offset. To address a byte of memory, both its segment and offset must be known. A full address is usually specified as SSSS:OOOO, where SSSS and OOOO represent the segment (or segment selector) and the offset, respectively. The exact interpretation of the segment selector depends on the operating mode of the 386.

In real mode, all segments are exactly 64 Kbytes long. The first segment starts at the bottom of memory, and each consecutive segment starts 16 bytes after the previous segment. Because a 16-bit number (the selector) represents a segment, the address space covers 65,536 x 16 = 1,048,576 bytes (or 1 Mbyte). Memory above 1 Mbyte is normally not accessible in real mode.
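To make the arithmetic concrete, here is a minimal sketch in standard C. It is not part of SEG4G, and the names real_mode_linear and real_mode_span are invented for this illustration. It forms the linear address the processor computes from a real-mode segment and offset, and gives the size of the resulting address space.

```c
#include <assert.h>

/* Hypothetical helper: linear address formed in real mode.
   The segment value is multiplied by 16 and the offset is added. */
unsigned long real_mode_linear(unsigned int seg, unsigned int off)
{
    assert(seg <= 0xFFFFu && off <= 0xFFFFu);   /* both are 16-bit values */
    return ((unsigned long)seg << 4) + (unsigned long)off;
}

/* Number of possible segment values times their 16-byte spacing. */
unsigned long real_mode_span(void)
{
    return 65536UL * 16UL;      /* 1,048,576 bytes, or 1 Mbyte */
}
```

Because consecutive segments start only 16 bytes apart, many segment:offset pairs name the same byte. For example, 1234:0010 and 1235:0000 both yield the linear address 0x12350.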

One interesting difference between an 8086 and an 80386 (or 80286) in real mode is the handling of addresses at the 1-Mbyte boundary. Generating an address, for example, of FFFF:0011 on an 80386 actually addresses above the 1-Mbyte limit. An 8086 will "wrap-around" so that the address generated is actually the same as 0000:0001. Since some programs may depend on this wrap-around, 80286 and 80386 motherboard designers have added a "gate" for the address line above 1 Mbyte (the A20 line). With the gate turned on, wrap-around doesn't occur. With the gate turned off, as it usually is, addresses appear to wrap around as on an 8086.
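The wrap-around can be modeled in a few lines of standard C. This is only a sketch of the behavior described above, assuming the gate does nothing more than force address bit 20 to zero when it is off. The function name bus_address is hypothetical.

```c
#include <assert.h>

/* Hypothetical model of the A20 gate. With the gate off, address
   bit 20 is forced to zero, so addresses wrap as on an 8086. */
unsigned long bus_address(unsigned int seg, unsigned int off, int a20_on)
{
    unsigned long linear = ((unsigned long)seg << 4) + (unsigned long)off;
    assert(a20_on == 0 || a20_on == 1);
    if (!a20_on)
        linear &= ~(1UL << 20);     /* mask address line A20 */
    return linear;
}
```

With the gate on, FFFF:0011 comes out as 0x100001. With the gate off, the same address comes out as 0x000001, the byte at 0000:0001.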

Protected mode treats segments differently. In protected mode, a segment selector contains a 13-bit number that indexes into one of two tables known as descriptor tables. One bit in the selector determines which table to use, and 2 bits control segment use (the privilege level, which we won't use), for a total of 16 bits. The descriptor table stores the segment's start address, length, and other pertinent data. Figure 1 shows a segment selector and a partial descriptor table. For our purposes, we need only part of the information in the Global Descriptor Table (GDT).
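The following sketch splits a selector into the three fields just described. It assumes the standard 80386 layout (index in the upper 13 bits, table indicator in bit 2, privilege level in bits 0 and 1). The structure and function names are made up for this example.

```c
#include <assert.h>

struct selector_fields {
    unsigned int index;   /* 13-bit index into the descriptor table */
    unsigned int table;   /* 0 = global table (GDT), 1 = local table (LDT) */
    unsigned int rpl;     /* 2-bit requested privilege level */
};

/* Hypothetical helper: break a 16-bit selector into its fields. */
struct selector_fields decode_selector(unsigned int sel)
{
    struct selector_fields f;
    assert(sel <= 0xFFFFu);
    f.index = (sel >> 3) & 0x1FFFu;
    f.table = (sel >> 2) & 1u;
    f.rpl   = sel & 3u;
    return f;
}
```

The listings load ES and GS with the selector 8, which decodes to index 1 in the GDT at privilege level 0, that is, the second 8-byte entry in the table.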

Each entry in the GDT is 8 bytes long. If the 80386 loaded each entry from memory every time it accessed memory, performance would suffer greatly. To prevent this from happening, the 80386 caches each entry internally whenever the program loads a segment register. In real mode, the descriptor tables are not used, so loading a segment register changes only the cached base address and never the cached limit.
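As an illustration of how those 8 bytes are laid out, the sketch below packs a base, a 20-bit limit, an access byte, and a 4-bit flags field into four 16-bit words. The field names follow the _GDT structure in Listing Three; the function pack_descriptor and its parameters are assumptions made for this example, not part of the library.

```c
#include <assert.h>

struct gdt_words {
    unsigned int limit;      /* limit bits 0-15 */
    unsigned int base;       /* base bits 0-15 */
    unsigned int access;     /* base bits 16-23 (low byte), access rights (high byte) */
    unsigned int hi_limit;   /* limit bits 16-19, flags, base bits 24-31 */
};

/* Hypothetical helper: build one descriptor.
   flags is the upper nibble of byte 6; bit 3 (value 8) is the
   granularity bit, which scales the limit by 4K. */
struct gdt_words pack_descriptor(unsigned long base, unsigned long limit20,
                                 unsigned int access, unsigned int flags)
{
    struct gdt_words d;
    assert(limit20 <= 0xFFFFFUL && access <= 0xFFu && flags <= 0xFu);
    d.limit    = (unsigned int)(limit20 & 0xFFFFUL);
    d.base     = (unsigned int)(base & 0xFFFFUL);
    d.access   = (unsigned int)((base >> 16) & 0xFFUL) | (access << 8);
    d.hi_limit = (unsigned int)((limit20 >> 16) & 0xFUL) | (flags << 4)
               | (unsigned int)(((base >> 24) & 0xFFUL) << 8);
    return d;
}
```

Calling pack_descriptor(0, 0xFFFFF, 0x92, 0x8) produces the words 0xFFFF, 0, 0x9200, and 0x8F, the same values as the 4 gigabyte data segment entry in Listing Three.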

Note that in real mode, any two numbers you put together form a valid address. In protected mode, however, only certain segment selector values are valid, and a segment's length determines which offsets are legal. If you try to use a segment improperly or address outside of its range, the 80386 will generate an error. Intel recommends loading all of the segment registers with selectors that have a 64K limit before switching from protected mode back to real mode. If, however, you disregard the documentation and load the segment registers with selectors that have a different limit, the 386 retains that limit in real mode. The technique presented here therefore loads segment registers with a 4 gigabyte limit in protected mode before returning to real mode.

The Plan

To successfully address the entire memory space from real mode, you must perform the following steps:

    1. Disable interrupts, including Non-Maskable Interrupts (NMI)

    2. Switch to protected mode

    3. Load one or more segment registers with a "big" (4 gigabyte) segment

    4. Switch back to real mode

    5. Enable interrupts

Once these steps are performed, the segment registers remain affected until a processor reset or until another protected-mode program reloads them. Because real mode does not use segment descriptors, the cached limits are never reloaded.

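For DOS use, it is desirable to provide routines to convert between segmented and linear addresses, control the A20 line, allocate and free extended memory, and read, write, and move data at any linear address.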

Listings One through Five show the SEG4G library that performs these functions.

Some Assembly Required

Obviously, to switch modes and perform other 386 magic, we need some assembly language routines. However, not everyone has access to an assembler that generates 80386 protected-mode code. Because of this, you may select one of three different methods to generate the assembly language code. The first method uses Microsoft's assembler (MASM Version 5.1). The second and third methods are for Microsoft C Version 5.1 and Mix's PowerC (see Listing Five, page 112), respectively, and do not require an assembler.

While PowerC provides an asm() function, Microsoft does not. The macro contained in ASMFUNC.H (Listing One, page 110) remedies this absence. This macro allows you to create a character array containing the machine code you want to execute and then call it as a function, complete with arguments and an integer return value.
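To see how a macro of this shape parses, consider the following sketch in standard C. It is not the ASMFUNC.H macro itself: the far keyword is dropped because only 16-bit DOS compilers accept it, and an ordinary C function stands in for the array of machine code so that the example runs anywhere. The names callfunc, generic_fn, and call_through_cast are hypothetical.

```c
#include <assert.h>

/* Same shape as asmfunc: a cast to a function pointer type,
   followed by a dereference. */
#define callfunc *(int (*)(int))

typedef void (*generic_fn)(void);   /* type-erased function pointer */

static int add_one(int x)
{
    return x + 1;
}

/* Hypothetical helper: call a function through the cast macro. */
int call_through_cast(int arg)
{
    generic_fn p = (generic_fn)add_one;   /* forget the real signature */
    assert(p != 0);
    return (callfunc p)(arg);   /* expands to (*(int (*)(int)) p)(arg) */
}
```

Here call_through_cast(41) returns 42. In the library the operand is a character array holding machine code rather than a function pointer, but the expression is built the same way.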

Before compiling, you must select one of the assembly methods (ASM, DATA, or POWER) at the top of SEG4G.C (Listing Three, page 110). If you pick ASM, you must assemble SEG51.ASM (Listing Four, page 111) separately and link it with SEG4G. Be sure to change the .MODEL directive at the top of SEG51 to match the model you are using for your C programs. In addition, if you use an Intel Inboard 386/PC, set the variable inboard, also defined near the top of SEG4G.C, to 1.

Using SEG4G Library

To force the segment limit on the GS and ES registers to 4 gigabytes, call the extend_seg( ) routine. This call modifies the registers until the computer is rebooted. If you plan to access extended memory, you must also enable the A20 line by calling the a20( ) function. The call a20(1) turns on A20, and a20(0) turns it off again.

The library defines a new data type, the LPTR. This is simply a 32-bit linear address pointer implemented as an unsigned long. For example, the start of the CGA video buffer (B800:0000) is equal to an LPTR of 0xB8000. Two of the supplied functions convert LPTRs to C far pointers and vice versa. Call linear_to_seg( ) or seg_to_linear( ), as appropriate.
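The two conversions can be modeled in portable C as shown below. This sketch mirrors the arithmetic of seg_to_linear( ) and linear_to_seg( ) in Listing Three, but it uses a plain structure in place of a far pointer so that it compiles outside DOS. The seg_off structure and the model_ prefixed names are assumptions made for this example.

```c
#include <assert.h>

typedef unsigned long LPTR;          /* as in SEG4G.H */

struct seg_off {                     /* hypothetical stand-in for a far pointer */
    unsigned int seg;
    unsigned int off;
};

/* Segment times 16, plus offset. */
LPTR model_seg_to_linear(struct seg_off p)
{
    return ((LPTR)p.seg << 4) + p.off;
}

/* This normalized form (offset 0 to 15) only covers addresses
   below 1 Mbyte, because the segment must fit in 16 bits. */
struct seg_off model_linear_to_seg(LPTR lin)
{
    struct seg_off p;
    assert(lin <= 0xFFFFFUL);
    p.seg = (unsigned int)(lin >> 4);
    p.off = (unsigned int)(lin & 0xF);
    return p;
}
```

Converting B800:0000 gives 0xB8000, and converting 0xB8123 back gives B812:0003, which names the same byte as B800:0123.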

While preparing to access memory, you may wish to allocate extended memory. The most common method for allocating extended memory is the "top-down" method. This method temporarily reduces the amount of extended memory reported by the BIOS. For example, if you have 1024K of extended memory and you allocate 24K, other programs calling the BIOS will be told that only 1000K of memory is available. Because extended memory always starts at the same place, the memory is allocated top-down. The only major program that does not use this method is DOS's VDISK (or RAMDRIVE). It uses a peculiar scheme that varies from version to version of DOS. However, because VDISK uses memory from the bottom up, you can control allocation of each so that they never overlap.
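The bookkeeping behind the top-down method can be sketched as follows. This is a simplified model, not the code in EXTMEM.C: it assumes that extended memory starts at the 1 Mbyte mark and that an allocated block begins exactly where the reduced BIOS figure ends. The addresses the library actually returns may differ. The names EXT_BASE, reported_k, and block_start are invented for this example.

```c
#include <assert.h>

#define EXT_BASE 0x100000UL      /* extended memory starts at 1 Mbyte */

/* Hypothetical model: the size (in K) other programs are told
   after taken_k Kbytes have been carved off the top. */
unsigned int reported_k(unsigned int total_k, unsigned int taken_k)
{
    assert(taken_k <= total_k);
    return total_k - taken_k;
}

/* Hypothetical model: linear address where the carved-off block begins. */
unsigned long block_start(unsigned int total_k, unsigned int taken_k)
{
    return EXT_BASE + (unsigned long)reported_k(total_k, taken_k) * 1024UL;
}
```

With 1024K of extended memory and 24K allocated, reported_k returns 1000, and under this model the block would begin at linear address 0x1FA000.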

SEG4G provides several functions to manage extended memory allocation. Most of these functions are only of interest if you plan to stay resident or run other programs that use extended memory from inside your program. If you don't do either of these things, you can simply check how much extended memory is available and then use it as you see fit. To check the amount of extended memory that is available, call ext_size( ), which returns the number of 1K pages that are free.

If you actually need to allocate extended memory, you may use the ext_alloc( ) and ext_realloc( ) functions. These functions each take the number of 1K pages desired and return an LPTR to the start of the memory block. If the request cannot be honored, the routines return (LPTR)-1L.

Note that these functions are not like the traditional malloc( ) functions found in the standard C library. You should not call them repeatedly to allocate small chunks of memory -- allocate all of the extended memory you need in one call. This is especially true of programs that are resident. If another program has allocated extended memory after your first call to ext_alloc( ), you will be unable to expand your memory allocation.

When you have finished using the extended memory allocated, free it with the ext_free( ) function. This function frees all of the extended memory allocated in your program. Exercise caution when using ext_free( ) with other programs that use extended memory. ext_free(1) forcibly frees all extended memory allocated since the first call to ext_alloc( ). If you have allocated extended memory, you must call ext_free(1) before you exit your program. Failure to do so will lock up the computer. Calling ext_free(0) attempts to free up the memory, but won't forcibly do so if another program has also allocated extended memory.

Once you have done all of the required setup, you are ready to access memory. The functions big_read( ), big_write( ), and big_xfer( ) will read, write, and move blocks of memory, respectively. These functions do not have to operate on extended memory -- they work on any linear address.

The use of big_read( ) and big_write( ) is straightforward. The big_xfer( ) function, however, becomes more efficient when you obey certain rules. In particular, performance is best when you move 32-bit words that are aligned on 32-bit boundaries. For example, moving 128 bytes from location 0x42050 to location 0xb8000 is very fast; moving 127 bytes is somewhat less efficient, and moving 128 bytes from location 0x42051 to location 0xb8000 is also somewhat slower. The big_xfer( ) function tries to optimize transfers by making as many full-word moves as possible. It also attempts to move as much on word boundaries as possible.
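The sketch below models how a transfer is divided into byte moves and 32-bit moves. It follows the control flow of big_xfer in Listing Four as I read it (bytes are moved one at a time only while neither address is aligned, then as many 32-bit moves as fit, then the leftover bytes), but it only counts the moves and touches no memory. The xfer_plan structure and the plan_xfer function are hypothetical.

```c
#include <assert.h>

struct xfer_plan {
    unsigned long lead_bytes;   /* single bytes moved before the dword loop */
    unsigned long dwords;       /* 32-bit moves */
    unsigned long tail_bytes;   /* single bytes moved after the dword loop */
};

/* Hypothetical model of the move schedule; no memory is accessed. */
struct xfer_plan plan_xfer(unsigned long src, unsigned long dst,
                           unsigned long count)
{
    struct xfer_plan p;
    p.lead_bytes = 0;
    /* Move bytes while neither address sits on a 4-byte boundary. */
    while (count != 0 && (src & 3UL) != 0 && (dst & 3UL) != 0) {
        src++;
        dst++;
        count--;
        p.lead_bytes++;
    }
    p.dwords = count >> 2;
    p.tail_bytes = count & 3UL;
    assert(p.lead_bytes <= 3);  /* src becomes aligned within 3 steps */
    return p;
}
```

For 128 bytes from 0x42050 to 0xb8000 the plan is 32 dword moves and nothing else. For 127 bytes it is 31 dword moves plus 3 trailing bytes. For 128 bytes from 0x42051 to 0xb8000 the plan is again 32 dword moves, but every read is misaligned, which accounts for the slower transfer.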

Examples

TEST.C (Listing Six, page 112) shows an example program using the SEG4G library. (If you are using VDISK or RAMDRIVE, be sure that you have at least 2K of extended memory not being used by the RAM disk before running this program.) TEST calls extend_seg( ) to set up the 4 gigabyte segments, enables A20, and then attempts to allocate 1K of extended memory. If successful, it writes a data byte to the entire block and then tries to read it back. Next, the block is expanded to 2K and freed. At this point a loop executes so you can examine memory anywhere in the computer's address range. Figure 2 shows a session with the test program and the RAMDRIVE driver installed. Notice the RAMDRIVE message at the start of extended memory. When you are ready to leave the program, enter a Ctrl-Z.

Figure 2: Typical session using TEST.C with the RAMDRIVE driver installed

  C:\SEG4G>SEG4G

  1280K of extended memory available
  1K of extended memory allocated at 10FC00. 1279K remains.
  Data written to extended memory

  Data read back OK.
  Expanding allocation to 2K
  2K of extended memory allocated at 10F800. 1278K remains.
  Extended memory freed.
  1280K Available.
  Enter ^Z to quit.

  Address and count?  0x100000 256
  MICROSOFT.EMM.CTRL.VERSION.1.00.CONTROL.BLOCK . . . @ . . .
  . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
  . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

  Enter ^Z to quit.
  Address and count?  ^Z

  C:\SEG4G>

Listing Seven, page 115, shows BLKTEST.C, an example of using big_xfer( ). Because the program writes directly to the screen, you must change the COLOR define to match the type of display your computer has.

Conclusion

The SEG4G library offers a fast, simple method to access the entire 386 memory range from DOS. Even programs that are not running on a 386 can make use of the extended memory allocation, the A20 control routines, and the assembly language interface macro presented here. SEG4G can help implement memory-intensive applications such as expanded memory drivers, RAM caches, speech/video buffers, and databases.

The extended memory allocation routines allow SEG4G to coexist peacefully with other extended memory-aware programs, but they won't protect it from applications that assume they own all of the extended memory available. In addition, DOS extenders, multitaskers, memory managers, and other software that use protected or virtual 8086 mode may not be compatible with SEG4G.

As with any undocumented feature, this one could vanish at any time. However, it is unlikely that the segment cache scheme used in the 386 will change any time soon. While SEG4G may not be the answer to all of your memory problems, it can provide you with more usable space under DOS, along with some working experience with the 80386's protected mode.

Bibliography

Turley, James L., Advanced 80386 Programming Techniques, Osborne/ McGraw-Hill, Berkeley, Calif., 1988.

Intel Corporation, 80386 Programmer's Reference Manual, Intel Corp., Santa Clara, Calif., 1986.

[LISTING ONE]



/**************************************************************************
 * The SEG4G Library by Al Williams. ASMFUNC.H--This header allows an array to
 * be executed as assembly code. The routine is called in far model regardless
 * of the model the C program is compiled in.
 **************************************************************************/

#define asmfunc *(int (far *)())




[LISTING TWO]


/**********************************************************************
 * SEG4G.H--Header for SEG4G Library  -- Williams                     *
 **********************************************************************/
typedef unsigned long LPTR;

/* set this variable to 0 for normal 386, 1 for Intel INBOARD 386/PC */
extern int inboard;

/* Function prototypes */
LPTR seg_to_linear(void far *p);
void far *linear_to_seg(LPTR lin);
void extend_seg(void);
void a20(int flag);
unsigned int big_read(LPTR address);
void big_write(LPTR address,unsigned int byte);
void big_xfer(LPTR src, LPTR dst, unsigned long count);

unsigned int ext_size(void);
LPTR ext_alloc(unsigned size);
LPTR ext_realloc(unsigned size);
int ext_free(int exitflag);




[LISTING THREE]


/***************************************************************************
 * The SEG4G Library by Al Williams. SEG4G.C--These subroutines will allow *
 * a 386 to access a linear address space of 4 Gigabytes. You may select   *
 * one of three methods for incorporating assembly language subroutines    *
 * into the C programs. The three methods are:                             *
 *   ASM   - Use Microsoft's MASM 5.1                                      *
 *   DATA  - Use the asmfunc macro defined in ASMFUNC.H                    *
 *   POWER - Use the asm function present in POWER C                       *
 * You must select one of the three methods below:                         *
 ***************************************************************************/
#define ASM   1
#define DATA  2
#define POWER 3
/* make your selection here: */
#define METHOD DATA

/* If using an INTEL INBOARD 386/PC set this variable to 1 */
int inboard=0;

#include <dos.h>
#include "seg4g.h"

/* Only include asmfunc.h if required */
#if METHOD==DATA
#include "asmfunc.h"
#endif

/* Keyboard controller defines */
#define RAMPORT   0x70
#define KB_PORT   0x64
#define PCNMIPORT 0xA0
#define INBA20    0x60
#define INBA20ON  0xDF
#define INBA20OFF 0xDD

/* Redefinitions for POWERC */
#ifdef __POWERC
#define _enable enable
#define _disable disable
#endif

/*************************************************************************
 * convert a far pointer to a linear address                             *
 *************************************************************************/
LPTR seg_to_linear(void far *p)
  {
  return (((unsigned long)FP_SEG(p))<<4)+FP_OFF(p);
  }

/*************************************************************************
 * convert a linear address to a far pointer                             *
 *************************************************************************/
void far *linear_to_seg(LPTR lin)
  {
  void far *p;
  FP_SEG(p)=(unsigned int)(lin>>4);
  FP_OFF(p)=(unsigned int)(lin&0xF);
  return p;
  }

/* Global descriptor table */
struct _GDT
   {
   unsigned int limit;
   unsigned int base;
   unsigned int access;
   unsigned int hi_limit;
   };

static struct _GDT  GDT[2] =
   {
     {0,0,0,0},                   /* unusable GDT slot 0 */
     {0xFFFF,0,0x9200,0x8F}      /* 4 Gig data segment */
   };

/* FWORD pointer to GDT */
struct fword
   {
   unsigned int limit;
   unsigned long linear_add;
   };

static struct fword  gdtptr;   /* fword ptr to gdt */

#if METHOD==POWER || METHOD==DATA

/* Protected mode assembly language routine */
static unsigned char code[]={
#if METHOD==DATA
         0x55,               /* PUSH BP               */
         0x89, 0xe5,         /* MOV BP,SP             */
         0x1e,               /* PUSH DS               */
         0xc5, 0x5e, 0x06,   /* LDS BX,[BP+6]         */
         0x0F, 0x01, 0x17,   /* LGDT FWORD PTR [BX]   */
         0x1f,               /* POP DS                */
         0x0f, 0x20, 0xc0,   /* MOV EAX,CR0           */
         0x0c, 0x01,         /* OR AL,1               */
         0x0f, 0x22, 0xc0,   /* MOV CR0, EAX          */
         0xeb, 0x00,         /* JMP SHORT 00          */
         0xbb, 0x08, 0x00,   /* MOV BX,8              */
         0x8e, 0xeb,         /* MOV GS,BX             */
         0x8e, 0xc3,         /* MOV ES,BX             */
         0x24, 0xfe,         /* AND AL,0FEH           */
         0x0f, 0x22, 0xc0,   /* MOV CR0,EAX           */
         0x5d,               /* POP BP                */
         0xcb};              /* RETF                  */
#else
         0x0f, 0x01, 0x17,   /* LGDT [BX]             */
         0x0f, 0x20, 0xc0,   /* MOV EAX,CR0           */
         0x0c, 0x01,         /* OR AL,1               */
         0x0f, 0x22, 0xc0,   /* MOV CR0,EAX           */
         0xEB, 0x00,         /* JMP SHORT 0           */
         0xbb, 0x08, 0x00,   /* MOV BX,8              */
         0x8e, 0xeb,         /* MOV GS,BX             */
         0X8e, 0xc3,         /* MOV ES,BX             */
         0x24, 0xfe,         /* AND AL,0FEH           */
         0x0f, 0x22, 0xc0,   /* MOV CR0,EAX           */
         0xC3 };             /* RETN                  */
#endif
#endif

/*************************************************************************
 * Adjust the limits of the ES and GS registers to 4GB                    *
 *************************************************************************/
void extend_seg()
  {

/* compute linear address and limit of GDT */
  gdtptr.linear_add=seg_to_linear((void far *)GDT);
  gdtptr.limit=15;

/* disable regular interrupts */
  _disable();

/* disable NMI */
  if (inboard)
    outp(PCNMIPORT,0);
  else
    outp(RAMPORT,inp(RAMPORT)|0x80);

/* call protected mode code */
#if METHOD==ASM
  protsetup(&gdtptr);
#elif METHOD==DATA
  (asmfunc code)((void far *)&gdtptr);
#else
  asm(code,&gdtptr);
#endif
/* Turn interrupts back on */
  _enable();

/* Turn NMI back on */
  if (inboard)
    outp(PCNMIPORT,0x80);
  else
    outp(RAMPORT,inp(RAMPORT)&0x7F);
  }

/* macro to clear keyboard port */
#define keywait() { while (inp(KB_PORT)&2); }

/*************************************************************************
 * General purpose routine to allow A20 (flag=1) or disable A20 (flag=0) *
 *************************************************************************/
void a20(int flag)
  {
  if (inboard)
    {
    outp(INBA20,flag?INBA20ON:INBA20OFF);
    }
  else
    {
    keywait();
    outp(KB_PORT,flag?0xbc:0xb4);
    keywait();
    outp(KB_PORT,flag?0xbc:0xb4);
    keywait();
    }
  }

#if METHOD==DATA || METHOD==POWER
/* Assembly code to read a byte */
static unsigned char rcode[]={
#if METHOD==DATA
         0x55,                         /* PUSH BP             */
         0x89, 0xe5,                   /* MOV BP,SP           */
         0x33, 0xc0,                   /* XOR AX,AX           */
         0x8e, 0xe8,                   /* MOV GS,AX           */
         0x66, 0x8b, 0x46, 0x06,       /* MOV EAX,[BP+6]      */
         0x65, 0x67, 0x8a, 0x00,       /* MOV AL,GS:[EAX]     */
         0x32, 0xe4,                   /* XOR AH,AH           */
         0x5d,                         /* POP BP              */
         0xcb};                        /* RETF                */
#else
         0x31, 0xC0,                   /* XOR AX,AX           */
         0x65, 0x8e, 0xC0,             /* MOV GS,AX           */
         0x66, 0x8b, 0x07,             /* MOV EAX,[BX]        */
         0x65, 0x67, 0x8a, 0x00,       /* MOV AL,GS:[EAX]     */
         0xC3 };                       /* RETN                */
#endif

/* Assembly code to write a byte */
static unsigned char wcode[]={
#if METHOD==DATA
         0x55,                         /* PUSH BP             */
         0x89, 0xe5,                   /* MOV BP,SP           */
         0x33, 0xc0,                   /* XOR AX,AX           */
         0x8e, 0xe8,                   /* MOV GS,AX           */
         0x66, 0x8b, 0x46, 0x06,       /* MOV EAX,[BP+6]      */
         0x8b, 0x5e, 0x0a,             /* MOV BX,[BP+10]      */
         0x65, 0x67, 0x88, 0x18,       /* MOV GS:[EAX],BL     */
         0x5d,                         /* POP BP              */
         0xcb};                        /* RETF                */
#else
         0x31, 0xC0,                   /* XOR AX,AX           */
         0x65, 0x8e, 0xC0,             /* MOV GS,AX           */
         0x66, 0x8b, 0x07,             /* MOV EAX,[BX]        */
         0x65, 0x67, 0xc6, 0x00, 0x00, /* MOV GS:[EAX],??     */
         0xC3 };                       /* RETN                */
#endif

/* Assembly code to block move bytes */
static unsigned char xcode[]={
#if METHOD==DATA
         0x55,                         /* PUSH BP             */
         0x89, 0xe5,                   /* MOV BP,SP           */
         0x06,                         /* PUSH ES             */
         0x56,                         /* PUSH SI             */
         0X57,                         /* PUSH DI             */
         0x33, 0xc0,                   /* XOR AX,AX           */
         0x8e, 0xC0,                   /* MOV ES,AX           */
         0X66, 0X8B, 0X76, 0X06,       /* MOV ESI,[BP+6]      */
         0X66, 0X8B, 0X7E, 0X0A,       /* MOV EDI,[BP+0A]     */
         0X66, 0X8B, 0X4E, 0X0E,       /* MOV ECX,[BP+0E]     */
         0XFC,                         /* CLD                 */
         0X67, 0XE3, 0X29,             /* JECXZ XEXIT         */
         0XF7, 0XC6, 0X03, 0X00,       /* TEST SI,3           */
         0x74, 0x0D,                   /* JZ XMAIN            */
         0XF7, 0XC7, 0X03, 0X00,       /* TEST DI,3           */
         0x74, 0x07,                   /* JZ XMAIN            */
         0X67, 0X26, 0XA4,             /* MOVSB ES:           */
         0x66, 0X49,                   /* DEC ECX             */
         0XEB, 0XEA,                   /* JMP XTEST           */
         0X51,                         /* PUSH CX             */
         0X66, 0XC1, 0XE9, 0X02,       /* SHR ECX,2           */
         0XF3, 0X67, 0X66, 0X26, 0XA5, /* REP MOVSD ES:       */
         0X59,                         /* POP CX              */
         0X80, 0XE1, 0X03,             /* AND CL,3            */
         0XE3, 0X06,                   /* JCXZ XEXIT          */
         0X67, 0X26, 0XA4,             /* MOVSB ES:           */
         0X49,                         /* DEC CX              */
         0XEB, 0XF8,                   /* JMP XBYTE           */
         0X5F,                         /* POP DI              */
         0X5E,                         /* POP SI              */
         0x07,                         /* POP ES              */
         0X5D,                         /* POP BP              */
         0XCB};                        /* RETF                */
#else
         0x55,                         /* PUSH BP             */
         0x89, 0xe5,                   /* MOV BP,SP           */
         0x06,                         /* PUSH ES             */
         0x33, 0xc0,                   /* XOR AX,AX           */
         0x8E, 0xC0,                   /* MOV ES,AX           */
         0X66, 0XBE,                   /* MOV ESI,            */
           0X00, 0X00, 0x00, 0x00,     /* SRC ADDRESS         */
         0x66, 0xBF,                   /* MOV EDI,            */
           0x00, 0x00, 0x00, 0x00,     /* DST ADDRESS         */
         0x66, 0xB9,                   /* MOV ECX,            */
           0x00, 0x00, 0x00, 0x00,     /* COUNT               */
         0XFC,                         /* CLD                 */
         0X67, 0XE3, 0X29,             /* JECXZ XEXIT         */
         0XF7, 0XC6, 0X03, 0X00,       /* TEST SI,3           */
         0x74, 0x0D,                   /* JZ XMAIN            */
         0XF7, 0XC7, 0X03, 0X00,       /* TEST DI,3           */
         0x74, 0x07,                   /* JZ XMAIN            */
         0X67, 0X26, 0XA4,             /* MOVSB ES:           */
         0x66, 0X49,                   /* DEC ECX             */
         0XEB, 0XEA,                   /* JMP XTEST           */
         0X51,                         /* PUSH CX             */
         0X66, 0XC1, 0XE9, 0X02,       /* SHR ECX,2           */
         0XF3, 0X67, 0X66, 0X26, 0XA5, /* REP MOVSD ES:       */
         0X59,                         /* POP CX              */
         0X80, 0XE1, 0X03,             /* AND CL,3            */
         0XE3, 0X06,                   /* JCXZ XEXIT          */
         0X67, 0X26, 0XA4,             /* MOVSB ES:           */
         0X49,                         /* DEC CX              */
         0XEB, 0XF8,                   /* JMP XBYTE           */
         0x07,                         /* POP ES              */
         0X5D,                         /* POP BP              */
         0xC3 };                       /* RETN                */
#endif

/*************************************************************************
 * Read a single byte from extended memory given a linear address        *
 *************************************************************************/
unsigned int big_read(LPTR address)
  {
#if METHOD==DATA
  return (asmfunc rcode)(address);
#else
  return asm(rcode,&address)&0xFF;
#endif
  }

/*************************************************************************
 * Write a single byte to extended memory given a linear address         *
 *************************************************************************/
void big_write(LPTR address,unsigned int byte)
  {
#if METHOD==DATA
  (asmfunc wcode)(address,byte);
#else
  wcode[12]=byte;
  asm(wcode,&address);
#endif
  }

/*************************************************************************
 * Block move a number of bytes from one area to another                 *
 *************************************************************************/
void big_xfer(LPTR src,LPTR dst,unsigned long count)
  {
#if METHOD==DATA
  (asmfunc xcode)(src,dst,count);
#else
  *(LPTR *)&xcode[10]=src;
  *(LPTR *)&xcode[16]=dst;
  *(unsigned long *)&xcode[22]=count;
  asm(xcode,(void *)0);
#endif
  }

#endif




[LISTING FOUR]


.MODEL LARGE,C

.386P

.CODE

; SEG51.ASM
; Routine to goto protected mode and reset ES and GS registers to 4GB

IF @DataSize
protsetup proc fpointer:dword,c
          push ds
          lds bx,fpointer
ELSE
protsetup proc fpointer:word,c
          mov bx,fpointer
ENDIF
          lgdt fword ptr [bx]                    ; Load GDT
IF @DataSize
          pop ds
ENDIF
          mov eax,cr0                            ; Goto prot mode
          or al,1
          mov cr0,eax
          jmp short nxtlbl                       ; Purge instruction
nxtlbl:   mov bx,8                               ; prefetch
          mov gs,bx                              ; Load gs/es
          mov es,bx
          and al,0feh                            ; Go back to real mode
          mov cr0,eax
          ret
protsetup endp

; Read a byte from an LPTR
big_read proc address:dword,c
         xor ax,ax                               ; zero GS
         mov gs,ax
         mov eax,address                         ; Load LPTR
         mov al,gs:[eax]                         ; Load byte
         xor ah,ah                               ; Zero AH
         ret
big_read endp

; Write a byte to an LPTR address
big_write proc address:dword, byt:word,c
          xor ax,ax                              ; Zero GS
          mov gs,ax
          mov eax,address                        ; Load LPTR
          mov bx,byt                             ; Load byte
          mov byte ptr gs:[eax],bl               ; Store byte -> LPTR
          ret
big_write endp

; Block move bytes between LPTR's
big_xfer proc source:dword, dest:dword, count:dword,c
         push es
         push si
         push di
         xor ax,ax                               ; Zero ES
         mov es,ax
         mov esi,source                          ; load source buffer
         mov edi,dest                            ; load dest buffer
         mov ecx,count                           ; load count
         cld
; The following code tries its best to make efficient moves
; by trying to move bytes until word alignment is achieved
xtest:
         jecxz xexit                             ; done?
         test si,3                               ; SI word aligned?
         jz short xmain
         test di,3                               ; DI word aligned?
         jz short xmain
;         test cl,3                               ; Even number of dwords
;         jz short xmain                          ; to move?
         movs es:[esi],byte ptr es:[edi]         ; Move a byte
         dec ecx                                 ; update count
         jmp short xtest                         ; Recheck alignments
xmain:
         push cx
         shr ecx,2                               ; Calculate number of dwords
                                                 ; And move all of them
         rep movs dword ptr es:[esi],dword ptr es:[edi]
         pop cx
         and cl,3                                ; Move left over bytes
xbyte:   jcxz xexit                              ; If any
         movs es:[esi],byte ptr es:[edi]
         dec cx
         jmp short xbyte
xexit:
         pop di
         pop si
         pop es
         ret
big_xfer endp

          end




[LISTING FIVE]


/***************************************************************************
 * The SEG4G Library by Al Williams. EXTMEM.C--These subroutines manage    *
 * top down allocation of extended memory.                                 *
 **************************************************************************/
#include <dos.h>
#include "seg4g.h"

static void far *old15;
static int installed=0;
static unsigned e_size, e_alloc;

/* redefinitions for POWERC */
#ifdef __POWERC
#define _FAR
#define _dos_getvect(n) getvect(n)
#define _dos_setvect(n,p) setvect(n,p)

/* This is a kludge to get POWERC to chain to the next level of interrupt */
unsigned __chain[14]= { 0x559c,0xe589,0xb850,0, 0x4687, 0x87fe, 0x46,\
                        0xb850, 0, 0x4687, 0x5d00, 0xeafa, 0, 0 };
void far *__cptr;

#define _chain_intr(ptr) { __cptr=(void far *)__chain; \
                           __chain[12]=FP_OFF(ptr); \
                           __chain[13]=FP_SEG(ptr); __chain[3]=Rip;\
                           __chain[8]=Rcs; Rcs=FP_SEG(__cptr);\
                           Rip=FP_OFF(__cptr);\
                           return; }

#define INTREGS unsigned Rbp, unsigned Rdi, \
                unsigned Rsi, unsigned Rds, \
                unsigned Res, unsigned Rdx, \
                unsigned Rcx, unsigned Rbx, \
                unsigned Rax, unsigned Rip, \
                unsigned Rcs, unsigned Rflags
#else
#define _FAR far
#define INTREGS unsigned Res, unsigned Rds, \
                unsigned Rdi, unsigned Rsi, \
                unsigned Rbp, unsigned Rsp, \
                unsigned Rbx, unsigned Rdx, \
                unsigned Rcx, unsigned Rax
#endif

/* private routine to capture requests for extended memory size */
static void interrupt _FAR trap15(INTREGS)
  {
  if ((Rax&0xFF00) != 0x8800)
    _chain_intr(old15);
  Rax=e_size;
  return;
  }

/***************************************************************************
 * Get extended memory size (in K) from BIOS                               *
 ***************************************************************************/
unsigned int ext_size()
  {
  union REGS r;
  r.h.ah=0x88;
  int86(0x15,&r,&r);
  return r.x.ax;
  }

/***************************************************************************
 * Allocate memory in 1K blocks. Returns start address of block or         *
 * (LPTR) -1 if unable to allocate memory                                  *
 ***************************************************************************/
LPTR ext_alloc(unsigned size)
  {
  if (installed)
    return ext_realloc(size+e_alloc);
  e_alloc=size;
  e_size=ext_size();
  if (e_size<size) return (LPTR) -1L;
  e_size-=size;
  old15=_dos_getvect(0x15);
  _dos_setvect(0x15,trap15);
  installed=1;
  return 0x100000L+e_size*1024L;   /* long math: 16-bit int would overflow */
  }

/***************************************************************************
 * Attempt to change the size of an allocated block (size in K).           *
 * Returns start address or (LPTR) -1 if unsuccessful                      *
 ***************************************************************************/
LPTR ext_realloc(unsigned size)
  {
  if (!installed)
    return ext_alloc(size);
  if (size>e_alloc+e_size) return (LPTR)-1L;
  if (size<e_alloc)
    {
    e_size+=e_alloc-size;
    e_alloc=size;
    }
  else if (size>e_alloc)
    {
    if (_dos_getvect(0x15)!=trap15)
      return (LPTR) -1L;
    e_size-=size-e_alloc;
    e_alloc=size;
    }
  return 0x100000L+e_size*1024L;   /* long math: 16-bit int would overflow */
  }

/***************************************************************************
 * Free the extended block. Always call before exiting your program!       *
 * If exitflag is nonzero, INT 15 is always restored to the saved vector;  *
 * ext_free returns -1 if another program had captured INT 15, else 0.     *
 * If exitflag is 0 and another program has captured INT 15, the vector is *
 * left alone, the block is marked free, and ext_free returns 1. If INT 15 *
 * is still ours, the vector is restored and ext_free returns 0.           *
int ext_free(int exitflag)
  {
  int rc=0;
  if (!installed) return rc;
  if (_dos_getvect(0x15)==trap15||exitflag)
    {
    if (_dos_getvect(0x15)!=trap15) rc=-1;
    installed=0;
    _dos_setvect(0x15,old15);
    }
  else
    {
    e_size+=e_alloc;
    e_alloc=0;
    rc=1;
    }
  return rc;
  }
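
The bookkeeping behind ext_alloc() and ext_realloc() is easier to follow without the interrupt plumbing. The following is a minimal sketch of the top-down arithmetic only, written in portable C. The names ext_model, model_resize, EXT_BASE, and MODEL_FAIL are hypothetical and are not part of SEG4G; free_k plays the role of e_size (what the INT 15 trap reports to other programs) and alloc_k plays the role of e_alloc.

```c
#include <stdint.h>

#define EXT_BASE   UINT32_C(0x100000)    /* extended memory starts at 1 Mbyte */
#define MODEL_FAIL UINT32_C(0xFFFFFFFF)  /* plays the role of (LPTR) -1 */

/* Hypothetical model state; not part of SEG4G. Sizes are in Kbytes. */
struct ext_model {
    uint32_t free_k;    /* still reported as available */
    uint32_t alloc_k;   /* held by this program at the top of memory */
};

/* Resize the single top-of-memory block to size_k Kbytes.
 * Returns the block's linear start address, or MODEL_FAIL. */
uint32_t model_resize(struct ext_model *m, uint32_t size_k)
{
    uint32_t total_k = m->free_k + m->alloc_k;

    if (size_k > total_k)
        return MODEL_FAIL;          /* state is left unchanged */
    m->alloc_k = size_k;
    m->free_k = total_k - size_k;
    /* The block begins just above the memory still reported as free. */
    return EXT_BASE + m->free_k * UINT32_C(1024);
}
```

With 3072K reported, a 1K request in this model yields 0x3FFC00, and growing the block to 2K moves its start down to 0x3FF800. On a 16-bit compiler the K-to-bytes multiplication has to be carried out in long arithmetic, since 64K or more times 1024 does not fit in 16 bits.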





[LISTING SIX]


/*************************************************************************
 * TEST.C--Example program for the SEG4G library                         *
 *************************************************************************/
#include <stdio.h>
#include <ctype.h>
#include <signal.h>
#include <stdlib.h>
#include "seg4g.h"
main()
  {
  LPTR ad,aptr;
  int ct=1024,i;
  int data=0xAA;
/* Ignore breaks */
  signal(SIGINT,SIG_IGN);
  printf("%uK of extended memory available\n",ext_size());
/* allocate 1K of extended */
  ad=ext_alloc(1);
  if (ad==-1L)
    {
    printf("Not enough extended memory. Only %uK available.\n",ext_size());
    exit(1);
    }
  printf("1K of extended mem allocated at %8lX. %uK remains.\n",ad,ext_size());
/* Make 4GB segments */
  extend_seg();
/* Turn on A20 */
  a20(1);
/* Write data to block */
  aptr=ad;
  for (i=0;i<ct;i++)
    {
    big_write(aptr++,data);
    }
  printf("Data written to extended memory\n\n");
/* Read it back */
  aptr=ad;
  for (i=0;i<ct;i++)
    {
    if (big_read(aptr++)!=data)
      {
      printf("Error reading extended memory\n\n");
      ext_free(1);
      a20(0);
      exit(1);
      }
    }
  printf("Data read back OK.\nExpanding allocation to 2K\n");
/* Expand memory allocation for no good reason */
  ad=ext_realloc(2);
  if (ad==-1L)
    {
    printf("Not enough extended memory. Only %uK is available.\n",ext_size());
/* Release the INT 15 trap and turn off A20 before exiting */
    ext_free(1);
    a20(0);
    exit(1);
    }
  printf("2K of extended mem allocated at %8lX. %uK remains.\n",ad,ext_size());
/* Free memory */
  ext_free(1);
  printf("Extended memory freed. %uK Available.\n",ext_size());
/* Enter memory examine loop */
  while (1)
    {
    printf("Enter ^Z to quit.\nAddress and count? ");
    if (scanf("%li %i",&ad,&ct)!=2)
      {
      a20(0);
      exit(0);
      }
    while (ct--)
      {
      data=big_read(ad++);
      printf("%c",isgraph(data)?data:'.');
      }
    printf("\n\n");
    }
  }





[LISTING SEVEN]


/*************************************************************************
 * BLKTEST.C--Example block move program for the SEG4G library           *
 *************************************************************************/
#include <stdio.h>
#include <dos.h>
#include "seg4g.h"

/* Set COLOR to 0 if you have a monochrome monitor */
#define COLOR 1

#define SCREEN_SIZE 4000
#define ALIGN_SIZE 3

unsigned char pattern[SCREEN_SIZE+ALIGN_SIZE];

main()
  {
  LPTR data,screen;
  unsigned char far *p;
  int i;
  extend_seg();
#if COLOR
  screen=0xb8000;
#else
  screen=0xb0000;
#endif
  p=pattern;
/* Align to the next 4-byte boundary */
/* This isn't required, but does make big_xfer() more efficient */
  while (FP_OFF(p)&3) p++;
  data=seg_to_linear(p);
  for (i=0;i<SCREEN_SIZE;i+=4)
    {
    p[i]='A';
    p[i+3]=p[i+1]=0x70;
    p[i+2]='B';
    }
  big_xfer(data,screen,(unsigned long)SCREEN_SIZE);
  }
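
BLKTEST leans on two facts: an 80x25 text screen is 4000 bytes of character/attribute pairs, and a real-mode segment:offset pair maps to the linear address segment*16+offset, which is presumably what seg_to_linear() computes. The sketch below shows both in portable C. The names model_linear and fill_cells are hypothetical and not part of SEG4G.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real-mode mapping: segment * 16 + offset. */
uint32_t model_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Fill a text-mode screen image. Even bytes hold characters, odd bytes
 * hold attributes; 0x70 is black on white (reverse video). The cells
 * alternate 'A' and 'B', as in BLKTEST's pattern[] buffer. */
void fill_cells(unsigned char *buf, size_t bytes)
{
    size_t i;

    for (i = 0; i + 1 < bytes; i += 2) {
        buf[i] = (i & 2) ? 'B' : 'A';   /* character byte */
        buf[i + 1] = 0x70;              /* attribute byte */
    }
}
```

In this model, segment B800 offset 0 maps to 0xB8000 (the color text buffer) and FFFF:0010 maps to 0x100000, the first byte above 1 Mbyte.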