**********************************************
*                                            *
*           RELOCATABLE ASSEMBLER            *
*                                            *
**********************************************

1.0 CONTENTS
-------------

Assembler
  Source statement syntax
    Label field
    Operation field
    Operand field
      Addressing modes
      Expressions
      Symbols
      Constants
  Special opcodes
  Assembler directives
    BSZ
    END
    EQU
    FCB
    FCC
    FDB
    INCL
    NAM
    OPT
    PAGE
    RMB
    TTL
    XDEF
    XREF
  Instruction summary
  Error messages
  Warning messages
  Example
  Object module format
Linker
  Commands
    CUR
    DEF
    END
    EXIT
    LIB
    LOAD
    MAP
    MO
    OBJ
    STR
  Examples

2.0 ASSEMBLER
--------------

This is a two-pass assembler whose output is compatible with the linking loader.
Large source files can be assembled into relocatable object code, with optional listings and include files. This document serves as a reference for the assembler, not as an introduction to assembly language programming.

SOURCE STATEMENT SYNTAX

Source statements consist of up to 80 characters in the following form:

     LABEL  OPERATION  OPERAND  COMMENTS

where each field is separated by one or more spaces.

LABEL FIELD

The label field must always start in column 1. If no label is present, there must be at least one blank space preceding the operation field. An asterisk '*' in the first column marks the line as a comment, and the line is not processed any further. Otherwise, the label field defines a symbol.

A valid symbol can contain at most six characters. The first character of a symbol must be one of the following: 'A' thru 'Z', or '.'. The rest of the symbol consists of the characters 'A' thru 'Z', '0' thru '9', '.', '$', or '_' (underscore).

The following symbols are reserved by the assembler: 'A', 'B', 'CC', 'DP', 'PC', 'S', 'U', 'X', and 'Y'.

A symbol can only be defined once in the label field. A symbol defined in this manner is assigned the value of the location counter (except with the EQU directive). Each unique label or external reference symbol requires 13 bytes in the symbol table. Unless noted otherwise in the configuration manual, the assembler has room for 1000 symbols.

OPERATION FIELD

The operation field can be either an opcode or an assembler directive. Opcodes correspond directly to the machine instructions, whereas directives control the assembly process. Refer to appendix A for a list of opcodes.

OPERAND FIELD

The operand field must use one of the following addressing modes.

ADDRESSING MODES

1) Immediate addressing = #expression

Immediate addressing uses information that immediately follows the operation in memory.
If the operation references a two byte register (D, S, U, X, Y), then a two byte immediate value is generated; otherwise, a one byte value is generated. The immediate value is interpreted either as a two's complement signed value (one byte in range -128 to 127 or two bytes in range -32768 to 32767) or as an unsigned value (one byte in range 0 to 255 ($FF) or two bytes in range 0 to 65535 ($FFFF)).

2) Relative addressing = expression

Relative addressing is used by branch instructions. The offset is a one byte value for short branches, with a range of -128 to 127 bytes from the start of the next instruction. The offset is a two byte value for long branches, with a range of -32768 to 32767 bytes from the start of the next instruction.

3) Extended addressing = expression

Extended addressing uses two bytes to contain the address of the operand. This allows addressing of the full memory range of $0000 to $FFFF.

4) Direct addressing = string

The FCC directive stores ASCII characters into consecutive bytes of memory. The number defines the number of characters after the ',' to be output. There can only be as many characters as there is room for on the source line: no more characters are output than are present on the line, regardless of the count specified.

The second format of the FCC directive specifies the characters to output between 2 identical delimiters. The delimiter is the first non-blank character after the FCC directive.

FDB - Form double byte

     [label] FDB expression{,expression} [comment]

The value of each expression of the FDB directive fills 2 bytes, and the values are stored successively in the object program. The expression(s) may be of type absolute, relocatable, or external.

INCL - Include source from file

     INCL

The path name given is opened and source is read from the path until end of file. At that point the path is closed and source continues from the main file. Include directives may not be nested.
NAM - Assign program name

     NAM string [comment]

The NAM directive specifies the name of the relocatable program module. Only the first 6 characters of the specified string are used. If no NAM directive is specified, a default blank name is output.

OPT - Assembler options

The OPT directive is used to control the format of the assembler output. Options not recognized by the assembler are ignored (i.e. they do not generate errors). The following is a list of options recognized by the assembler:

     L     Print the listing from this point on (default). This only has an effect if the 'L' option is specified in the command line.
     NOL   Do not print the listing from this point on.
     W     Do issue warning messages (default). This only has an effect if the 'W' option is specified in the command line.
     NOW   Do not issue warning messages.

PAGE - Move listing to next page

     PAGE

This directive causes the listing to move to the top of the next page.

RMB - Reserve memory bytes

     [label] RMB expression [comment]

This directive is identical to BSZ.

TTL - Set page title

     TTL string

The TTL directive causes the page title to be set to the string in the operand field.

XDEF - External definition

     XDEF symbol{,symbol} [comment]

The XDEF directive is used to specify that the list of symbols is defined within the current program and that the definitions are passed through to the linker.

XREF - External reference

     XREF symbol{,symbol} [comment]

The XREF directive is used to specify that the list of symbols is referenced within the current program but is defined in another program (via XDEF).

ERROR MESSAGES

174  Invalid auto increment/decrement format.
     Single auto increment or decrement was specified in the indirect mode (e.g. LDB [Y+]).

175  Invalid index register format.
     One of the accumulators was specified as the offset in the index mode but was not followed by one of the index registers.

176  Invalid expression for PSH/PUL.
     The register list following one of the instructions PSHS, PULS, PSHU, PULU contained symbols that are not registers.

177  Incompatible register for PSH/PUL instruction.
     The register list for the PSHS/PULS instructions contained the register 'S', or the register list for the PSHU/PULU instructions contained the register 'U'.

178  Invalid register operand specification.
     Undefined register name encountered in indexed addressing mode.

179  Incompatible register pair.
     The registers of an EXG or TFR instruction were not the same size (i.e. two 16 bit registers or two 8 bit registers).

202  Label or opcode error.
     A label or opcode symbol does not begin with an alphabetic character or period.

203  Error in operand expression.
     Incompatible symbol types in expression.

204  Operand needed.
     An operand was not found when expected.

205  Label error.
     Invalid character in label.

207  Undefined opcode.
     The symbol in the opcode field is not a valid opcode or directive.

208  Branch out of range.
     The operand resulted in an offset greater than 129 bytes forward or 126 bytes backward from the first byte of the branch instruction.

209  Illegal addressing mode.
     The specified addressing mode in the operand field is not valid with this instruction type.

210  Byte overflow.
     The operand's value exceeded one byte. The most significant 8 bits of the 16 bit expression must be all zeros or all ones for a one byte two's complement field.

212  Directive operand error.
     A syntax error was detected in the operand field of a directive.

214  FCB directive syntax error.
     The structure of the FCB directive is syntactically incorrect.

215  FDB directive error.
     The structure of the FDB directive is syntactically incorrect.

216  Directive operand error.
     The directive's operand field is missing, is terminated by an invalid terminator, or contains an expression with an invalid operator.

219  No END statement.
     The END directive was not found at the end of the last source file. The END directive is automatically supplied.

222  Symbol table overflow.
     The symbol table has overflowed. This is a fatal error and terminates the assembler during pass one.

223  The directive must or must not have a label.
     Depending on the directive used, the label field must be blank or must contain a valid symbol.

234  Multiply defined symbol.
     An attempt was made to define a symbol that was already defined.

241  Illegal symbol using an expression.
     An undefined, forward reference, external reference, or relocatable symbol was used illegally in an expression.

243  XREF or XDEF directive operand error.
     An invalid symbol or no operand was detected in the operand field of the XDEF or XREF directive.

WARNING MESSAGES

1    Long branch not required.
     The destination could have been reached with a short branch. This warning can also be issued when using PCR relative addressing.

7    Extended addressing.
     This could result in non position independent code.

8    Non absolute immediate addressing.
     This could result in non position independent code.

EXAMPLE

The following routine is one of the Pascal callable routines contained within the assembler. The purpose of the routine is to find the first occurrence of a '+' or '-' not following a single quote ('). The position of the sign is returned to the calling routine. If no sign is found, then a zero is returned.
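
The behaviour described above can be modelled in a few lines of Python as a cross-check. This is a minimal sketch and not part of the assembler: the function name get_sign, the quote parameter, and the use of a Python str in place of the Pascal dynamic string are assumptions made for illustration, and the 8 bit character counter of the real routine is not modelled.

```python
def get_sign(s: str, quote: str = "'") -> int:
    """Return the 1-based position of the first '+' or '-' in s that does
    not directly follow a quote character, or 0 if there is none.

    Hypothetical model of the routine described in the text; s stands in
    for the Pascal dynamic string.
    """
    i = 0
    while i < len(s):
        ch = s[i]
        if ch in "+-":
            return i + 1      # positions are counted from 1
        if ch == quote:
            i += 1            # skip the character that follows the quote
        i += 1
    return 0                  # end of string reached, sign not found


if __name__ == "__main__":
    for text in ("12+3", "A'+B-", "abc"):
        print(repr(text), get_sign(text))
```

The position is counted from 1 so that 0 remains free to mean "not found", which matches the return convention stated above.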

 byte   user stack upon           byte   user stack upon
 len    routine entry             len    routine exit

 BOS ------------------           BOS ---------------------
  2  :  string address :           2  :  position of sign :
 TOS ------------------           TOS ---------------------

         XDEF   GET_SI
*
* FUNCTION GET_SIGN (VAR STR : STRING) : INTEGER; EXTERNAL;
*
GET_SI   LDX    0,U       Get string address
         LDA    0,X+      Get dynamic string length
         STA    0,U       Store temporarily on stack
GS1      DECA             Decrement character count
         BMI    GS3       End of string, sign not found
         LDB    0,X+      Get next character from string
         CMPB   #'+       See if valid +
         BEQ    GS2       Found sign, end of search
         CMPB   #'-       See if valid -
         BEQ    GS2       Found sign, end of search
         CMPB   #' '      See if need to ignore next char
         BNE    GS1       Don't ignore next char, go get next char
         DECA             Decrement character count
         BMI    GS3       While skipping next char, end of string found
         LDB    0,X+      Ignore next char
         BRA    GS1       Go get char after ignored char
GS2      SUBA   0,U       Compute distance from end of string
         NEGA             Compute char position of sign
         CLR    0,U       Upper part of integer is always zero
         STA    1,U       Store function return value
         RTS              Return to Pascal program
GS3      CLR    0,U       Sign not found, return zero
         CLR    1,U       Lower byte of return value
         RTS              Return to Pascal program
         END              End of file

OBJECT MODULE FORMAT

The following is the output object module format created by the relocatable assembler. The file is recorded in a binary record format where a record consists of:

     D L X X X . . . X X X C CR

where:

     D    is the ASCII character 'D' and signifies the start of record.
     L    is a byte that is the length (data plus checksum).
     X    is the data of the record.
     C    is the 2's complement checksum (starting with L).
     CR   is a carriage return.

Record types:

2) Header

     '2' : $00 : NAME : 'OB'

This record precedes the object module. The six character module name (NAME) is the name that was specified in the NAM directive of the assembler.

3) External symbol definition (ESD)

     '3' : SYMTYPE/SECT : DEF

where SYMTYPE is the upper nibble and has the following format:

     $0 - Load section definition
     $2 - Label definition
     $3 - External reference

SECT is the lower nibble and has the following format:

     $0 - ASCT or any (for XREF)
     $1 - BSCT - not implemented - size set to zero
     $2 - CSCT - not implemented - size set to zero
     $3 - DSCT - not implemented - size set to zero
     $4 - PSCT

This record marks definitions of XREFs, XDEFs, and the size of the code.

If SYMTYPE = 0 and SECT = 0 then DEF is a 2 byte ASCT section length (always zero) and a 2 byte ASCT start location (always zero).
If SYMTYPE = 0 and SECT = 1 then DEF is a 2 byte BSCT section length (always zero).
If SYMTYPE = 0 and SECT = 2 then DEF is a 2 byte CSCT section length (always zero).
If SYMTYPE = 0 and SECT = 3 then DEF is a 2 byte DSCT section length (always zero).
If SYMTYPE = 0 and SECT = 4 then DEF is a 2 byte value of the total PSCT size.
If SYMTYPE = 2 (XDEF) then DEF is a 6 byte name followed by the 2 byte relative address (if SECT = 4 PSCT) or its absolute address (if SECT = 0 ASCT).
If SYMTYPE = 3 (XREF) then DEF is a 6 byte name and SECT = 0 (any section will match).

SYMTYPE/SECT : DEF can be repeated 0 or more times.

4) Program

     '4' : ESD INDEX : $00 : RELADDR : BYTES

This record contains program code. ESD INDEX will always be $0004, which indicates that the code loads into PSCT. RELADDR is the 2 byte address where the data starts to load, relative to the start of the program. BYTES are up to 122 bytes of code to load.

5) Relocation and linking record

     '5' : FLG : RELADDR : ESD INDEX

This record marks addresses which must be relocated by the linker and marks addresses which have external references to resolve. This record corresponds to program bytes which were included in the last type '4' record (program). FLG is a 1 byte flag that is $00 if the relocation is to be added, or $08 if the relocation is to be subtracted.
RELADDR is the relative address of the program word that is to be relocated. ESD INDEX indicates which external symbol definition corresponds with this relocation (0 means the 1st ESD physically in the module, 1 the 2nd, etc.).

FLG : RELADDR : ESD INDEX can be repeated 0 or more times.

6) Terminator

     '6' : SECT : RELADDR

This record marks the end of the object module. RELADDR is the start execution offset referenced to the start of section SECT.

3.0 LINKER
----------

The linking loader is a two-pass loader which can accept input object modules from the relocatable assembler. On the first pass, all external symbol values are defined (and relocated if necessary), all object modules which are to be loaded from libraries are determined, and the size of the load module is determined. On pass two, the load module is produced using the information from pass one, satisfying external references and relocating specified addresses.

COMMANDS

A command cannot exceed 80 characters. Pass one terminates when the OBJA command is entered, at which point all commands entered during pass one are repeated and echoed by the linker until the OBJA command is encountered the second time. At this point, entry resumes from the input path for acceptance of the map commands (if used) and the exit command.

CUR - Set current location

     CURP=[\]$

This command modifies the current loader location counter. If the '\' is not specified, the location counter will be set to + start load address. Otherwise the causes all future modules to be loaded at an address which is a power of two relative to the start of the section. This option remains in effect until another CURP command is encountered.

DEF - Loader symbol definition

     DEF:=$

This command defines a global symbol and inserts it into the global symbol table. The symbol is defined as absolute.

END - Ending address

     ENDP=$

This command sets the end load address of the load module.
If the actual module would be smaller, then the load module is padded with zeros (memory image operating systems only).

ERRORS: If the module exceeds the ENDP address, then an 'ENDP ADDRESS EXCEEDED' error message will result.

EXIT - Exit linker

     EXIT

This command terminates the linker.

LIB - Library search

     LIB={,}

This command directs the loader to load object modules in the specified files only if the object module satisfies an unresolved external reference from an object module previously loaded. Only one pass is made through the libraries.

ERRORS: A 'MULTIPLY DEFINED SYMBOL' error can result if the same name is defined in more than one loaded module. An 'UNDEFINED SYMBOL' error can result if there is no name definition to satisfy a reference.

LOAD - Load file

     LOAD={,}

This command directs the linker to load the specified files. There can be one or more object modules in a file.

ERRORS: A 'MULTIPLY DEFINED SYMBOL' error can result if the same name is defined in more than one loaded module. An 'UNDEFINED SYMBOL' error can result if there is no name definition to satisfy a reference.

MAP - Print load map

     MAP [U][S][M][D]
     MAPC
     MAPF

This command displays the current state of the modules loaded.

     U   lists any undefined symbols.
     S   lists the memory size of the modules loaded plus symbol table usage.
     M   lists the starting load address of all modules loaded.
     D   lists each loaded module along with the external definitions for that module.

MAPC is equivalent to MAPS (for backwards compatibility). MAPF is equivalent to MAPUSDM (for backwards compatibility).

MO - Map output

     MO=

This command overrides the load map specification in the command line. Any path valid in the command line redirection is valid with this command.

OBJ - Produce load module

     OBJA=

This command specifies the output load module produced by the linker. The OBJA command terminates pass one of the linker and consequently starts pass two.
The output file is filled during pass two and is automatically closed upon executing the OBJ command during pass two of the linker.

STR - Starting address

     STRP=$

This command sets the starting load address of the load module.

EXAMPLES

Example #1:

The following is an example of a minimum linker specification. The input is one assembly language program (TEST1.RO:1) where the starting load address and the starting execution address are both $2000. The output file name is TEST1.LO:1. A map will be listed on the terminal on pass 2 after the output file has been created. MDOS file name conventions are used in these examples. The second set of commands, for pass two, was echoed by the linker, not entered by the operator.

     ?STRP=$2000
     ?LOAD=TEST1:1
     ?OBJA=TEST1:1
     ?STRP=$2000
     ?LOAD=TEST1:1
     ?OBJA=TEST1:1
     ?MAPF
     ?EXIT

To run linker:

     LL

Example #2:

The following is an example of a more involved linker specification. A Pascal program is to be linked with an assembler language routine. The Pascal program (compiled and assembled) is in file TESTP.CA:1. The Pascal stack initialisation routine is in file TESTP.PA:1. The assembler language routine is in file TEST2.R:1. The start load address and the start execution address are both $4000. The Pascal runtime routine library is in file RL.RO:1. The output file is TEST.CM:1. A map is to be listed on the printer in pass 2 (pagesize = 66).

     ?STRP=$4000
     ?LOAD=TESTP.PA:1,TESTP.CA:1,TEST2:1
     ?LIB=RL:1
     ?OBJA=TEST.CM:1
     ?STRP=$4000
     ?LOAD=TESTP.PA:1,TESTP.CA:1,TEST2:1
     ?LIB=RL:1
     ?OBJA=TEST.CM:1
     ?MAPF
     ?EXIT

To run linker:

     LL >P P=66     (MDOS command line)
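
For readers who want to inspect or generate the modules that the linker consumes, the record framing given in the OBJECT MODULE FORMAT section (D, L, data, C, CR) can be sketched in Python. This is an illustration under stated assumptions, not the assembler's actual code: the function names make_record and header_record are hypothetical, the checksum is assumed to be the value that makes L, the data bytes, and C sum to zero modulo 256, and short module names are assumed to be padded with spaces.

```python
def make_record(data: bytes) -> bytes:
    """Frame data as 'D', L, data, C, CR.

    Assumptions: L counts the data bytes plus the checksum byte, and C is
    the two's complement of the 8 bit sum of L and the data bytes.
    """
    length = len(data) + 1                     # data plus checksum
    if length > 0xFF:
        raise ValueError("record too long for a one byte length")
    checksum = (-(length + sum(data))) & 0xFF  # two's complement, 8 bits
    return b"D" + bytes([length]) + data + bytes([checksum]) + b"\r"


def header_record(name: str) -> bytes:
    """Build a type '2' header record: '2' : $00 : NAME : 'OB'.

    Assumption: NAME is cut or space padded to exactly six characters.
    """
    padded = name[:6].ljust(6).encode("ascii")
    return make_record(b"2" + b"\x00" + padded + b"OB")


if __name__ == "__main__":
    print(header_record("TEST1").hex(" "))
```

Under the checksum assumption above, a reader of the file can validate a record by adding every byte from L through C and checking that the low 8 bits of the sum are zero.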