RMP
TMS320C40 Silicon Errors
Rev 1.x, 2.x Silicon (Document revision 3.0)
Last Modified: 7/1/93
Revision 2.3 Silicon
====================
Revision 2.3 silicon started shipping to customers on November 2, 1992.
Revision 2.3 silicon has the numbers "22" or higher as the first two digits
in the seven digit lot number on the device (22xxxxx). All silicon shipping
today is Rev 2.3.
Revision 2.0 Silicon
====================
Rev 2.0 silicon started shipping to all customers on August 3, 1992. Rev 2.x
silicon can be identified by the lot code on the device: the date code begins
with the letters "EA" (EAxxxx).
Revision 1.0 Silicon
====================
Rev 1 silicon shipped from July 1991 through July 1992.
*****************************************************************************
The following lot numbers had problems with the boot loader. However, an
affected device will not produce erratic results; it will not boot up at all.
2064070
2064071
2151191
2152265
2160979
2196891
*****************************************************************************
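The lot-number rules above can be checked mechanically. The following is a
minimal sketch in Python, assuming the lot number is supplied as a seven-digit
string read off the package; the function names are illustrative only and are
not part of any TI tool.

```python
# Lot numbers listed in this document as having the boot loader problem.
BAD_BOOT_LOADER_LOTS = {
    "2064070", "2064071", "2151191", "2152265", "2160979", "2196891",
}


def _check_lot(lot_number: str) -> None:
    """Reject anything that is not a seven-digit lot number string."""
    if len(lot_number) != 7 or not lot_number.isdigit():
        raise ValueError("expected a seven-digit lot number, got %r" % lot_number)


def is_rev_2_3(lot_number: str) -> bool:
    """Rev 2.3 rule from this document: first two digits are 22 or higher."""
    _check_lot(lot_number)
    return int(lot_number[:2]) >= 22


def has_boot_loader_problem(lot_number: str) -> bool:
    """True if the lot is one of the lots listed as failing to boot."""
    _check_lot(lot_number)
    return lot_number in BAD_BOOT_LOADER_LOTS
```

The check covers only the Rev 2.3 rule and the listed lots. It does not try to
tell Rev 1.x from Rev 2.0, which this document identifies by date code instead.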
ERROR 1. Fetch control logic
PLANNED TO BE FIXED IN PG 3.0
A branch type instruction normally disables instruction fetches as soon as the
branch is decoded. On Rev 2.x and Rev 1.x, a hold-everything pipeline conflict
delays the signal which disables instruction fetches. After the fix,
instruction fetches will be disabled as soon as the branch is decoded, which
prevents the double fetch of the instruction after a branch.
If all five of the following conditions are true, there is a possibility that
a program fetch will be corrupted. If any one of these conditions is not met,
there will NOT be a problem.
1) The external ready is pulled high (NOT ready) while the instruction after a
branch is being re-fetched, and the bus data changes while the bus is
tri-stated with ( ^CE(0-1), ^DE, and ^AE). The C40 differs from the C30 in
that the external ready must be used to prevent reads or writes from
completing while the C40 is tri-stated.
2) The cache must be enabled.
3) A branch type instruction (branch, trap, call, return, RPTB, RPTS) is used
(Delayed branches are not affected). The instruction following the branch
may be corrupted.
4) The problem may occur if the instructions are being fetched from either the
local or global port (the peripheral bus is not affected) and a pipeline
conflict occurs at one of the other ports. Below are 3 conditions which may
cause a pipeline conflict:
o If the instruction before a branch attempts to read from a port (other
than the port used to fetch instructions) and the port is not ready.
o If either of the 2 instructions before a branch type instruction does a
multi-cycle fetch from a port (other than the port used to fetch
instructions).
o If either of the 2 instructions before a branch type instruction attempts
to store to a port (other than the port used to fetch instructions) and
that port is not ready.
5) The timing of ready going high, the timing of ( ^CE(0-1), ^DE, or ^AE) going
high, and the timing of the pipeline conflict on one of the other ports,
all have to happen at specific times to cause this problem. These timings
may be hard to control.
This is a problem even with single cycle reads.
WORK AROUND:
1) Insert 1 or 2 NOPs before the branch type instruction to avoid this
problem. Instructions which do not fetch or store data to ports can be used
in place of the NOPs.
ERROR 2. RETIUD instruction
PLANNED TO BE FIXED IN PG 3.3
A problem occurs when a register ready conflict occurs in the cycle after a
delayed return from interrupt (RETIUD). The RETIUD executes but the stack
pointer is not decremented.
WORK AROUND:
Eliminating the register conflict that occurs in the cycle after the RETIUD
instruction eliminates the problem. These sequences need to be changed:
These sequences need to be changed            | Fix 1             | Fix 2
----------------------------------------------------------------------------------
RETIUD                                        | RETIUD            | RETIUD
STI ARn,*AR1      ;store auxreg(n)            | NOP               | STI AR2,*AR1
STI AR0,*ARn      ;auxreg(n) used as address  | STI AR2,*AR1      | NOP
NOP                                           | STI AR0,*AR2      | STI AR0,*AR2
                                              |                   |
RETIUD                                        | RETIUD            | RETIUD
LDI R0,ARn        ;load auxreg(n)             | NOP               | LDI R0,AR2
STI AR0,*ARn      ;auxreg(n) used as address  | LDI R0,AR2        | NOP
NOP                                           | STI AR0,*AR2      | STI AR0,*AR2
                                              |                   |
RETIUD                                        | RETIUD            | RETIUD
STI DP,*AR1       ;store data page pointer    | NOP               | STI DP,*AR1
LDI @data,R0      ;DP used in address         | STI DP,*AR1       | NOP
NOP                                           | LDI @data,R0      | LDI @data,R0
                                              |                   |
RETIUD                                        | RETIUD            | RETIUD
LDI R0,DP         ;load data page pointer     | NOP               | LDI R0,DP
LDI @data,R0      ;DP used in address         | LDI R0,DP         | NOP
NOP                                           | LDI @data,R0      | LDI @data,R0
                                              |                   |
RETIUD                                        | RETIUD            | RETIUD
LDI 1,IR0         ;load IR0                   | NOP               | LDI 1,IR0
STI R7,*AR2(IR0)  ;IR0 used in address        | LDI 1,IR0         | NOP
NOP                                           | STI R7,*AR2(IR0)  | STI R7,*AR2(IR0)
                                              |                   |
RETIUD                                        | RETIUD            | RETIUD
STI IR0,*AR1      ;store IR0                  | NOP               | STI IR0,*AR1
STI R7,*AR2(IR0)  ;IR0 used in address        | STI IR0,*AR1      | NOP
NOP                                           | STI R7,*AR2(IR0)  | STI R7,*AR2(IR0)
                                              |                   |
RETIUD                                        | RETIUD            | RETIUD
LDI 100,BK        ;load BK                    | NOP               | LDI 100,BK
STI R7,*++AR2%    ;BK used in address         | LDI 100,BK        | NOP
NOP                                           | STI R7,*++AR2%    | STI R7,*++AR2%
                                              |                   |
RETIUD                                        | RETIUD            | RETIUD
STI BK,*AR1       ;store BK                   | NOP               | STI BK,*AR1
STI R7,*++AR2%    ;BK used in address         | STI BK,*AR1       | NOP
NOP                                           | STI R7,*++AR2%    | STI R7,*++AR2%
ERROR 3. Cache Update Logic
PLANNED TO BE FIXED IN PG 3.3
Under one specific condition, the TMS320C40 will execute an incorrect opcode
because of a bug in the cache update logic. This bug exists only on PG 3.2 and
earlier C40 silicon. It will be fixed in later C40 silicon.
When the C40 cache is enabled, the cache freeze bit, CF, in the status
register, ST, is used to freeze (CF = 1) and unfreeze (CF = 0) cache updates.
The cache problem occurs under the following conditions:
1. The value of the CF bit goes through a 1-0-1 sequence, and
2. During the CF = 0 period, no instruction fetch is started and a
multi-cycle instruction fetch is in progress (a single-cycle program
fetch does not cause the problem).
When both conditions are met, the opcode of the multi-cycle fetch instruction
is put into the cache without updating the cache segment address register.
Therefore, if this corrupted cache segment is executed again before it is
updated with an instruction from another address, the incorrect opcode will be
executed.
An interrupt service routine (ISR) is the most likely place to find this
problem. Usually, the ST value is saved on the stack at the beginning of the
ISR and restored before the RETI or RETID instruction. Therefore, if an
interrupt is pending before the return from the ISR and the instruction after
the return is a multi-cycle fetch instruction, the above conditions can
occur. If the corrupted cache segment is executed again before it is updated
with another address, for instance because the cache is frozen (CF = 1), the
system will behave unpredictably. An example that might cause a problem is
shown below:
: :
: : ; CF = 0
Interrupt occurs ; CF set to 1 and PCF set to 0
PUSH ST
: :
ANDN 0800H,ST ; CF = 0
: : ; 4 segment cache filled
: : ; with these instructions
: :
POP ST ; CF set to 1 and PCF set to 0
RETI ; CF set to 0
Interrupt occurs again ; CF set to 1 and PCF set to 0
Repeat above sequence
Although the cache is unfrozen in the above interrupt routine, this does not
mean the problem cannot occur if the cache remains frozen in the ISR. However,
if the cache is frozen in the ISR, the device is less likely to run the
corrupted cache contents again before the faulty cache segment address is
updated. Normally, after the next RETI, instructions will be fetched and the
cache error is cleared.
WORK AROUND:
The workaround for this problem is to force the CF value to equal the PCF
value before saving the ST register or before RETI/RETID. Examples are shown
below:
Example 1:
: :
: : ; CF = 0
Interrupt occurs ; CF set to 1 and PCF set to 0
ANDN 0800H,ST ; Set CF = 0 (PCF = 0)
PUSH ST ; Save the ST value
: :
ANDN 0800H,ST ; CF = 0
: :
: :
: :
POP ST ; CF set to 0 and PCF set to 0
RETI ; CF set to 0
Example 2:
: :
: : ; CF = 0
Interrupt occurs ; CF set to 1 and PCF set to 0
PUSH ST
: :
ANDN 0800H,ST ; CF = 0
: :
: :
: :
POP ST ; CF set to 1 and PCF set to 0
ANDN 0800H,ST ; Set CF = 0 (PCF = 0)
RETI ; CF set to 0
ERROR 4. TOIEEE instruction
FIXED IN PG 2.3
If the floating-point number is a negative power of two (-2.0, -4.0, -8.0,
etc.; in other words s=1 AND f=0), the TOIEEE instruction will convert the
number to an incorrect IEEE number.
The resulting IEEE number will be 1/4 of the value of the C40 floating-point
number.
For example: If the input data is -32.0 (=0x04800000) in C40 format, the
output from TOIEEE instruction will be -8.0 (=0xC1000000) in IEEE format.
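The arithmetic in this example can be reproduced off-target. The sketch below
is written in Python and assumes the C40 single-precision layout implied by
the example above: an 8-bit two's-complement exponent in bits 31-24, the sign
bit s in bit 23, and a 23-bit fraction f in bits 22-0, with exponent -128
representing zero. The function names are illustrative and are not part of
any TI tool.

```python
import struct


def c40_to_float(word: int) -> float:
    """Decode a 32-bit C40-format word into a Python float."""
    exponent = (word >> 24) & 0xFF
    if exponent >= 0x80:                 # two's-complement exponent
        exponent -= 0x100
    if exponent == -128:                 # reserved encoding for zero
        return 0.0
    sign = (word >> 23) & 1
    fraction = (word & 0x7FFFFF) / float(1 << 23)
    # Mantissa is 01.f for s=0 and 10.f (two's complement) for s=1.
    mantissa = (-2.0 + fraction) if sign else (1.0 + fraction)
    return mantissa * 2.0 ** exponent


def float_to_ieee_bits(value: float) -> int:
    """Return the IEEE single-precision bit pattern of a float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def is_toieee_affected(word: int) -> bool:
    """Condition stated in this document: s=1 AND f=0."""
    return ((word >> 23) & 1) == 1 and (word & 0x7FFFFF) == 0


word = 0x04800000                               # -32.0 in C40 format
correct = float_to_ieee_bits(c40_to_float(word))        # 0xC2000000
faulty = float_to_ieee_bits(c40_to_float(word) / 4.0)   # 0xC1000000
```

Here `correct` is the IEEE pattern a fixed device should produce, and `faulty`
matches the -8.0 (=0xC1000000) result quoted above for affected silicon.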
ERROR 5. BcondAF/BcondAT instruction
FIXED IN PG 2.3
If the first instruction after BcondAF or BcondAT is a multi-cycle memory
read, the 3 instructions after the branch may not be annulled.
Beware that even if the memory is normally single-cycle, it may be multi-cycle
when changing pages, for a read immediately following a store, etc.
Inserting a NOP after BcondAT or BcondAF will solve this problem.
NOTE : Neither instruction is used by the C Compiler.
ERROR 6. Return PC address corruption on interrupts
FIXED IN PG 2.0
The C40 CPU contains a four deep pipeline: fetch, decode, read, and execute.
When the interrupt signal is recognized, the C40 will flush the pipeline
before serving the interrupt. The C40 will complete the instructions at the
"read" and "decode" stages of the pipeline and store the program counter (PC)
of the instruction at the "fetch" stage of the pipeline onto the stack,
allowing the C40 to return to the original stack location after the interrupt.
However, if the interrupt signal is recognized in the "read" or "decode" phase
of certain instructions, such as a load or store of the stack pointer (SP), a
parallel store, or a store immediate value (STIK), then the wrong return
address from an interrupt service routine may be stored onto the stack.
Specifically:
a) if the instruction before the interrupt is in the "decode" phase and is a
stack pointer (SP) loading instruction such as:
1) LDI SP,AR1
2) LDA SP,AR3
3) STI SP,*AR1
4) PUSH SP
or if the instruction in the "decode" or "read" phase is a stack pointer
storing instruction such as:
1) SUBI 2,SP
2) LDI IR0,SP
3) ADDI3 3,R0,SP
4) POP SP
Note: PUSHing or POPing other CPU registers will not cause the problem.
b) if the instruction in the "decode" phase is a parallel store instruction
such as:
1) STI R1,*+AR2(1) || STI R3,*AR4
2) STF R5,*AR3++(1) || STF R2,*-AR5(1)
c) if the instruction in the "decode" phase is a store immediate instruction
with the immediate value equal to -12 or the five least significant bits
of the destination address equal to 10100b:
1) STIK -12,*+AR3(4)
2) STIK -12,@1000h
3) STIK 0,@F914h
Note: STIK 1,@F913h or STIK -1,@F915h WON'T cause the problem.
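The STIK condition in case c) can be written as a simple predicate. The
following is a minimal sketch in Python, assuming the immediate value and the
destination address are known when the check is made; for indirect addressing
the address is generally not known statically, so it may be passed as None.
The function name is illustrative only.

```python
from typing import Optional


def stik_at_risk(immediate: int, address: Optional[int] = None) -> bool:
    """Apply the case c) rule from this document to one STIK instruction.

    The instruction is at risk when the immediate value is -12, or when the
    five least significant bits of the destination address equal 10100b.
    Pass address=None when the destination address is not known.
    """
    if immediate == -12:
        return True
    if address is not None and (address & 0x1F) == 0b10100:
        return True
    return False
```

For the examples above, `stik_at_risk(0, 0xF914)` is true, while
`stik_at_risk(1, 0xF913)` and `stik_at_risk(-1, 0xF915)` are false.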
WORK AROUND:
The delayed unconditional branch instruction (BUD) can be used to frame those
instructions, since an interrupt cannot occur in the decode/read/execute
phases of a delayed branch instruction.
.
.
NOP
LDI 125,AR2
PUSH AR3
Add this to prevent ----> BUD $+4
the interrupt occurring at LDA SP,AR3
next three instructions LDI @V_ADDR,AR1
NOP
.
.
NOTE: The "BUD" instruction will shield the "decode" and "read" phases of
the next instruction and only shield the "decode" phase of the second
instruction after it. Therefore if the instruction is storing data to SP
register, it can only be protected in the first instruction after the
"BUD $+4" instruction. Other cases can be protected in the first and second
instruction after the "BUD $+4" instruction. Some examples are shown below:
Example 1:                                  Example 2:
              .                                           .
              .                                           .
              NOP                                         NOP
              LDI 125,AR2                                 LDI 25,AR2
              PUSH AR3                                    PUSH AR3
              BUD $+4                                     BUD $+4
              NOP                                         LDI @V_ADDR,AR1
Shielded--->  ADDI3 3,SP,AR3                              NOP
              LDI @V_ADDR,AR1               NOT---->      LDA SP,AR3
              .                             Shielded      .
              .                                           .

Example 3:                                  Example 4:
              .                                           .
              .                                           .
              NOP                                         NOP
              LDI 125,AR2                                 LDI 125,AR2
              PUSH AR3                                    PUSH AR3
              BUD $+4                                     BUD $+4
Shielded--->  SUBI3 3,SP                                  NOP
              LDI @V_ADDR,AR1               NOT---->      LDI R5,SP
              NOP                           Shielded      LDI @V_ADDR,AR1
              .                                           .
              .                                           .

Example 5:                                  Example 6:
              .                                           .
              .                                           .
              MPYF *AR1,R1                                MPYI *AR1,R1
              BUD $+4                                     BUD $+4
              ADDF R1,R2                                  ADDI R1,R2
Shielded--->  STF R2,*AR3                                 LDI *AR2++,R3
           || STF R1,*AR1++                 NOT---->      STI R2,*AR3
              LDF *AR2++,R3                 Shielded   || STI R1,*AR1++
              .                                           .
              .                                           .

Example 7:                                  Example 8:
              .                                           .
              .                                           .
              LDHI 010H,AR6                               LDPK 02FH
              BUD $+4                                     BUD $+4
              OR 041H,AR6                                 LDI @F800,R2
Shielded--->  STIK -12,*AR5                               ADDI *AR2,R2
              LDI *AR6,R1                   NOT---->      STIK 0,@FB14H
              .                             Shielded      .
              .                                           .
ERROR 7. Multi-cycle external memory read
FIXED IN PG 2.0
When the CPU performs two indirect reads in the same cycle (resulting from a
3-operand or parallel instruction), and the operand decoded from the "source 1"
field requires more than 1 wait state, the wrong value may be read.
In all of the parallel instructions and some 3-operand instructions, two of the
operands are read from memory locations by indirect addressing. If these two
memory locations are from different ports, then the TMS320C40 CPU can perform
two data reads in the same cycle (it doesn't mean that the data read will be
completed in the same cycle). There is a problem when source 1 (src1) is a
three or more cycle load (due to wait states or RDY input) and source 2 (src2)
is from another port. The src1 load completes in 2 cycles, even though it
should wait for 3 or more cycles.
For example, if the configuration of the external buses is:
AR0 points to --> Internal bus
AR1 points to --> Local bus - 0 wait state
AR2 points to --> Local bus - 2 wait state
AR3 points to --> Global bus - 2 wait state
All 3-operand and parallel instructions that have two indirect addressing
data loads from different ports will have this problem.
WORK AROUND:
Exchanging the src1 and src2 data fields, or changing to single data accesses,
removes the problem. For the above examples, the workarounds are:
1.    LDI *AR3,R1                 OR      LDI *AR1,R0
   || LDI *AR1,R0                      || LDI *AR3,R1
2.    ADDF3 *+AR3(5),*AR0,R1      OR      LDF *+AR3(5),R2
                                          ADDF3 R2,*AR0,R1
3.    CMPI3 *+AR3(3),*+AR1(2)     OR      LDI *+AR3(3),R1
                                       || CMPI3 *+AR1(2),R1
This problem DOES NOT affect DMA loads, parallel loads to the same port, or
multi-cycle stores. The following examples are thus NOT AFFECTED:
                                  src1        src2
1.    LDI *AR2,R0                 3 cycles    3 cycles
   || LDI *AR2,R1
2.    SUBI3 *AR3,*+AR3(5),R1      3 cycles    3 cycles
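The rule for this error can be summarised as a predicate on the two read
operands. The following is a minimal sketch in Python, assuming each operand
is described only by the port it reads from and the number of wait states on
that access. The Operand type and the port names are hypothetical, and the
sketch does not model how assembler operand order maps onto the src1 and src2
fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operand:
    """One indirect read operand (hypothetical description)."""
    port: str          # e.g. "internal", "local", "global"
    wait_states: int   # 2 wait states corresponds to a 3-cycle read


def dual_read_at_risk(src1: Operand, src2: Operand) -> bool:
    """Rule from this document: src1 needs more than 1 wait state
    (a three or more cycle load) and src2 is read from another port."""
    return src1.wait_states > 1 and src1.port != src2.port


# Bus configuration used in the example above.
AR0 = Operand("internal", 0)
AR1 = Operand("local", 0)
AR2 = Operand("local", 2)
AR3 = Operand("global", 2)
```

With this configuration, `dual_read_at_risk(AR3, AR1)` is true, while the
exchanged order `dual_read_at_risk(AR1, AR3)` and the same-port case
`dual_read_at_risk(AR2, AR2)` are false.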
ERROR 8. Pipeline conflict on the DBcond & DBcondD instructions
FIXED IN PG 2.0
Since the TMS320C40 auxiliary register is modified in the decode phase when it
is used as the pointer in the indirect addressing mode, the TMS320C40 has
pipeline protection to ensure its auxiliary register update sequence. However,
this pipeline protection fails on the decrement and delayed branch instructions
(DBcond and DBcondD) with zero wait state program memory.
For example, the following program will cause the problem:
. .
. .
AR5 is decremented-----> ADDI 1,AR5,R2
before the ADDI DBU(D) AR5,LOOP1
. .
. .
When the "DBU" instruction is in the "decode" phase and the "ADDI" instruction
is in the "read" phase, AR5 has already been decremented by one.
WORK AROUND:
Simply adding one instruction between "ADDI" and "DBU" will solve the problem.
For example:
.
.
ADDI 1,AR5,R2
NOP
DBU(D) AR5,LOOP1
.
.
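This workaround can be applied mechanically to a source listing. The sketch
below is a hypothetical helper written in Python, not a TI tool. It assumes
one instruction per line, no label on the branch line, and that any mention
of the auxiliary register in the immediately preceding line (ignoring its
comment) counts as a conflict, which is deliberately conservative.

```python
import re

# Matches DBcond / DBcondD with an auxiliary register as first operand.
_DBCOND = re.compile(r"^\s*DB\w*\s+(AR[0-7])\s*,", re.IGNORECASE)


def pad_dbcond(lines):
    """Return a copy of lines with a NOP inserted before any DBcond whose
    auxiliary register is referenced by the line immediately before it."""
    out = []
    for line in lines:
        match = _DBCOND.match(line)
        if match and out:
            reg = re.escape(match.group(1))
            previous_code = out[-1].split(";")[0]   # drop trailing comment
            if re.search(r"\b" + reg + r"\b", previous_code, re.IGNORECASE):
                indent = line[: len(line) - len(line.lstrip())]
                out.append(indent + "NOP")
        out.append(line)
    return out
```

For example, `pad_dbcond(["ADDI 1,AR5,R2", "DBU AR5,LOOP1"])` returns the
three-line sequence shown above, with the NOP between the two instructions.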
A related error is as follows. If the first instruction after BcondAF or
BcondAT is a multi-cycle memory read, then the 3 instructions after the branch
may not be annulled.
Even if the memory is normally single-cycle, it may be multi-cycle when
changing pages, for a read immediately following a store, etc.
ERROR 9. DMA errors
FIXED IN PG 2.0
There are three known errors on the C40 DMA logic:
1) ERROR A - In the first cycle after the completion of an auto-init,
the previous value of the control register is used.
WORK AROUND - Use the same configuration of the DMA functions in the
autoinitialization.
2) ERROR B - After reset or in the DMA fixed priority mode, DMA3 has highest
priority instead of DMA0.
WORK AROUND - Change priority scheme accordingly.
3) ERROR C - The Autoinit Sync. bit will not disable the auto-init
synchronization requirements. Whenever the DMA is in Synchronous transfer
mode, the autoinit will be in the same synchronous mode.
WORK AROUND - Avoid using this mode.
Version 4.40 of the Floating Point C Compiler has the option of implementing
these workarounds into the code during compilation. Check the C Compiler
user's guide for details.
| Device: TMS320C4x Category: Device Information |
Title: C4x Silicon Errata Source: TI Apps Date: 1/4/98 GenId: c40r12se |