Numeric Data Representation and Overflow
The basic precision of the CD2458 is 24 bits. While the instructions
and the program memory have a 16-bit data width, the D Bus and the data
path elements, including program memory, can all have a fixed-point precision
N selected between 16 and 32 bits in the CD245X architecture. The CD2458 uses
N = 24. The accumulator A Bus and the total product P Bus have
a corresponding precision of 48 (2N) bits.
Data is represented in two's complement form with an implicit binary
point to the right of the sign bit, which is the most significant bit (MSB).
Implicit values range between
+1.0 - 2^(-N+1) and -1.0.
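The fractional format above can be sketched in a few lines of Python. This is only an illustrative model, not part of the CD2458 tool set: the helper names `to_fraction` and `from_fraction` are hypothetical, and the clamping in `from_fraction` is a convenience of the sketch rather than a documented device behavior.

```python
N = 24  # word width of the CD2458, as stated in the text


def to_fraction(word: int, n: int = N) -> float:
    """Interpret an n-bit two's complement word as a fraction in [-1.0, +1.0)."""
    word &= (1 << n) - 1              # keep only the low n bits
    if word & (1 << (n - 1)):         # sign bit set: value is negative
        word -= 1 << n
    return word / (1 << (n - 1))      # binary point sits right of the sign bit


def from_fraction(value: float, n: int = N) -> int:
    """Quantize a fraction to an n-bit word (hypothetical helper, clamps out-of-range input)."""
    scaled = round(value * (1 << (n - 1)))
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    scaled = max(lo, min(hi, scaled))
    return scaled & ((1 << n) - 1)
```

With N = 24, the largest word 0x7FFFFF maps to 1.0 - 2^(-23) and 0x800000 maps to -1.0, which matches the range given above.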
It is common in digital signal processors to model the clipping or limiting
of signals that happens with analog components. This limiting is often required to
ensure system stability. Limiting of numeric overflow is provided
in the CD2458 by executing the MOD sat instruction. When the MOD sat
instruction is executed, either positive or negative overflow in the AH
register results in substituting the corresponding positive or negative
full-scale value in AH or AH/AL. For example, with a 24-bit AH, 0x7FFFFF is
substituted on a positive overflow and 0x800000 on a negative overflow.
Note that this protection applies to both AH and AL when the MOD sat is
executed in double-word mode. When the OM flag is set to "0", the MOD sat
instruction checks the OVF flag and the N flag to determine whether it should
replace the current AH or AH/AL value. If the OM flag is set to "1", a special
"sticky overflow mode" is activated, in which the OVFC is incremented by one
at every positive overflow and decremented by one at every negative overflow.
The MOD sat instruction then checks this OVFC to determine whether to replace
the AH or AH/AL value with the corresponding full-scale value.
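The single-word saturation behavior with the OM flag at "0" can be sketched as follows. This is a simplified model under stated assumptions: `add24` and `mod_sat` are hypothetical names, the flags are modeled as Python booleans, and the rule "overflow with N set means positive overflow" is the usual two's complement convention, assumed here rather than taken from the CD2458 documentation.

```python
MASK = 0xFFFFFF       # 24-bit AH
SIGN = 0x800000
POS_FULL = 0x7FFFFF   # positive full scale, from the text
NEG_FULL = 0x800000   # negative full scale, from the text


def signed24(word: int) -> int:
    """Sign-extend a 24-bit word to a Python int."""
    word &= MASK
    return word - 0x1000000 if word & SIGN else word


def add24(a: int, b: int):
    """Wrapping 24-bit add. Returns (result, ovf, n) as a model of AH and two flags."""
    total = signed24(a) + signed24(b)
    result = total & MASK
    ovf = not (-0x800000 <= total <= 0x7FFFFF)   # true sum does not fit in 24 bits
    n = bool(result & SIGN)                      # sign of the wrapped result
    return result, ovf, n


def mod_sat(result: int, ovf: bool, n: bool) -> int:
    """Model of MOD sat with OM = 0 (assumed flag interpretation)."""
    if not ovf:
        return result
    # After a wrap, the result's sign is the opposite of the true sum's sign.
    return POS_FULL if n else NEG_FULL
```

For example, adding 0x600000 (0.75) to itself wraps to 0xC00000 with the overflow and negative flags set, and `mod_sat` then substitutes 0x7FFFFF.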
Numeric overflow in the accumulator does not necessarily lead to incorrect
results if the final total value of a summation lies within the range that can
be represented in the accumulator. This is common with FIR filters. Thus,
overflow protection is optional and should be applied through the "sticky
overflow mode" only when it is really needed; otherwise it can reduce the
dynamic range of the computation unnecessarily.
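The point about intermediate overflow can be checked with a small model. The sketch below uses a 24-bit accumulator for brevity (the real accumulator is 48 bits wide) and assumes one plausible reading of the sticky mode: the overflow counter nets out positive and negative overflows, and saturation is applied only if the counter is nonzero at the end. The names `accumulate` and `sticky_sat` are hypothetical.

```python
MASK = 0xFFFFFF
SIGN = 0x800000


def signed24(word: int) -> int:
    """Sign-extend a 24-bit word to a Python int."""
    word &= MASK
    return word - 0x1000000 if word & SIGN else word


def accumulate(terms):
    """Wrapping accumulation. Returns (acc, ovfc), with ovfc modeling the OVFC counter."""
    acc, ovfc = 0, 0
    for term in terms:
        total = signed24(acc) + signed24(term)
        if total > 0x7FFFFF:
            ovfc += 1          # positive overflow
        elif total < -0x800000:
            ovfc -= 1          # negative overflow
        acc = total & MASK     # two's complement wraparound
    return acc, ovfc


def sticky_sat(acc: int, ovfc: int) -> int:
    """Assumed sticky-mode rule: saturate only if overflows did not cancel out."""
    if ovfc > 0:
        return 0x7FFFFF
    if ovfc < 0:
        return 0x800000
    return acc
```

Summing 0.75 + 0.75 - 0.75 (0x600000, 0x600000, 0xA00000) overflows after the second term, but the third term wraps the accumulator back to 0x600000 and the counter back to zero, so no saturation is needed. Stopping after the second term leaves the counter at +1, and the sketch then substitutes 0x7FFFFF.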
Timing
The CD2458 uses a conventional three-step instruction sequence: fetch
instruction, decode instruction and fetch operands, and finally execute
the instruction. This pipeline sequence is illustrated below for the ADD
A, (Rij) instruction. This pipeline is normally invisible to the user
except where an instruction affects the program counter (PC), in which case a
dummy machine cycle is automatically inserted to restore the pipeline.
Many instructions execute in one clock cycle (CKOUT). Obvious exceptions
are two-word instructions with immediate data fields, and branches or
subroutine calls with full PRAM address fields. Other exceptions are incrementing,
decrementing and bit operations on registers other than AH, which also
take two clock cycles. Three clock cycles are taken when a three-word
long-immediate instruction (24-bit immediate data) is executed, or when the
program memory is read indirectly. Four cycles are consumed when the program
memory is written back indirectly. Stack-related instructions, such as a return
from a subroutine or a load from memory to a register using the stack pointer,
take an additional clock cycle for the pre-incrementing of the
SP register. Whenever the PC is the destination register, an
additional dummy cycle is inserted to allow the instruction pipeline to
refill.
These variations in instruction execution are normally invisible to the
user because an operation can be considered complete at the end of the
last clock cycle of its execution. The only exception is the
multiplier. The product for a given set of X and Y register inputs is
available only from the cycle after the one in which the Y register is
loaded.
ALU AND SHIFTER
The arithmetic and logical operations in the CD2458 are accomplished
with a full-function 48-bit ALU and a 48-bit Shift unit. (An additional, more
powerful barrel shifter is described later.) ALU operands on the b input of the
ALU are 24 bits wide when they come through the Shift unit from the D Bus, or
48 bits wide when they come from the multiplier total product output. Operands
on the a input come from the accumulator registers AH and AL. Results are
returned to the accumulator registers AH and AL, or fed back to the D Bus
directly for instructions that modify registers other than AH and AL, such as
the bit operations and increment or decrement.
Most operations are for 24-bit data and operate only on the 24-bit accumulator
register AH. However, the multiply and the double-word add and subtract
instructions are 48 bits wide and operate on both AH and AL. The MOD OpA
instruction may be either single-word (24 bits) or double-word (48 bits). The
results of the ALU operation control the N, OV, Z and CY flags in the status
register ST one cycle after the related ALU instruction is executed. The actual
ALU operation in these instructions takes place one cycle after the
current execution cycle. The multiplier and ALU are pipelined by one cycle, but
this is transparent to the user. For example, one instruction can modify AH
and the next instruction can move the result to another register.
No dummy wait cycle is necessary.
MULTIPLIER
The multiplier takes either two 24-bit signed operands in the X and Y
data registers or a pair of one signed and one unsigned number. The multiplier
produces either a signed 47-bit product (sign + 46 bits) in the 48 bits
of the PH and PL data registers or a 48-bit product (sign + 47 bits). The MSB
is the sign bit with the implicit binary point to its right. The LSB of PL is
zero when a signed and unsigned pair is fed into X and Y. For the case
of -1.0 x -1.0 (0x800000 x 0x800000) in two-signed-input mode, the result
is the number nearest to +1.0 that can be represented in the 48-bit format
(0x7FFFFFFFFFFF).
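A minimal sketch of the signed x signed case follows. It assumes the product is aligned so that the binary point sits to the right of the MSB of the 48-bit PH/PL pair, which implies a one-bit left shift of the raw integer product. The function names are hypothetical, and the signed/unsigned mode is not modeled.

```python
MASK24 = 0xFFFFFF
MASK48 = 0xFFFFFFFFFFFF
POS_FULL48 = 0x7FFFFFFFFFFF   # nearest representable value to +1.0 in 48 bits


def signed24(word: int) -> int:
    """Sign-extend a 24-bit word to a Python int."""
    word &= MASK24
    return word - 0x1000000 if word & 0x800000 else word


def multiply_signed(x: int, y: int) -> int:
    """Fractional product of two signed 24-bit words as a 48-bit PH:PL word."""
    product = signed24(x) * signed24(y)   # raw integer product, binary point after 2 bits
    product <<= 1                         # assumed alignment: binary point right of the MSB
    if product > POS_FULL48:              # only reachable for -1.0 x -1.0
        product = POS_FULL48
    return product & MASK48


def split_product(p: int):
    """Split a 48-bit product into its (PH, PL) halves."""
    return (p >> 24) & MASK24, p & MASK24
```

Under these assumptions, 0.5 x 0.5 (0x400000 x 0x400000) gives PH = 0x200000 and PL = 0, that is 0.25, and 0x800000 x 0x800000 gives the positive full-scale value instead of wrapping to -1.0.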
DATA REGISTERS
The CD2458 core has eight essential 24-bit read/write data registers
with the capacity for eight additional functional registers. Although
the function of the added registers is user defined, some dedicated functions
such as repeat instruction, barrel shifting, and normalizing/de-normalizing are
predefined on the CD2458.
Numeric Registers
AH and AL Registers. The high and low 24-bit halves of the accumulator.
Only double-word operations by the ALU treat them together as a 48-bit
register. Single-word ALU operations are on AH alone. Other operations
are selectable on either register individually.
X and Y Registers. The two input N-bit operands for the multiplier.
May be used as temporary general purpose data registers as long as it
is recognized that loading the Y register initiates a multiply operation.
This not only produces a product in PH and PL registers but also increases
power consumption.
PH and PL Registers. The high and low N-bit halves of the multiplier
product. These may be used as temporary general purpose data registers
but the operation of the multiplier should be fully understood with regard
to timing and precedence.
Status Register
ST Register. The 24-bit Status register contains not only the five condition
flags, but also controls interrupts, overflow protection mode, address
pointer loop operations and user input/output signals. The N, OV, Z, and
CY flags are set or reset one cycle after the related ALU instruction
is executed. The instruction that follows the ALU instruction can still use
the resulting flag contents without being affected by this internal delay.
Program Counter
PC Register. The 24-bit program counter holds program memory addresses.
Loading this register always introduces a one-clock-cycle NOP delay to
allow the new address to fill the instruction pipeline. Bit-operation
instructions cannot use the PC as a Reg data register.
External Function Registers
BF Register. The BF(EXT0) register is assigned to the 24-bit
parallel I/O port dedicated to the host interface in emulator control.
This port may also be used to establish on-the-fly communication between
the PC host and the CD2458 DSP.
RC Register. The RC(EXT1) register is the Repeat Counter
register. A set of two eight-bit numbers is written to this register
to start the repeat operation. No special instructions for the repeat
operation exist.
TH and TL Registers. The TH|TL(EXT3,EXT2) registers are assigned
as a shadow AH|AL register. There are special instructions for communication
between these registers and the AH|AL double-word register.
SP Register. The SP(EXT4) register is assigned as the only Stack Pointer
in the CD2458.
TR and PM Registers. The TR(EXT6) and PM(EXT7) registers are used for barrel
shifting, normalization, RAM pointer modification and the overflow counter.
These registers may also be used as additional temporary registers.
MEMORIES
There are three possible operational memories in a CD2458: the data memories
RAM0 and RAM1 and the program memory PRAM.
The bottom 256 words of the data memories are directly addressable with
the 9-bit DRAM address field. The lower 512 words are indirectly addressable
through the corresponding address pointer registers when loaded with the
9-bit SIMM immediate data field. The full 16M address spaces are indirectly
addressable through the corresponding 24-bit address pointer registers.
The Ri pointer registers, R0-3, address RAM0 and the Rj pointer registers,
R4-7, address RAM1.
When the program memory PRAM is used for data storage, its full 16M address
space can be addressed indirectly using any of the address pointer registers
Rij, R0-7, or any of the data registers Reg.
ADDRESS POINTER REGISTERS
Indirect Addressing
The eight address pointer registers R0-7 generate indirect addresses
for the data and program memories. The Ri registers, R0-3, address RAM0
and PRAM while the Rj registers address RAM1 and PRAM. Indirect addresses
are set by loading the pointer registers with short or long immediate
data instructions, or by transfers from memory or other data registers.
The short immediate data load can only specify a 9-bit address in the
lowest 512 words, but takes only one instruction clock cycle. The pointer
registers may be modified with the MODR Ri, Rj instruction or be self-modifying
after they are used.
Address pointer modification with looping creates no-overhead circular
buffers that are useful in digital signal processing. The loop sizes for
each DRAM are selected in the PRL0/PRL1 fields of the Status register.
The loop sizes may be specified in those fields in one of three different
ways each:
As a power of two, 2^N, where N = 2 to 8. Buffers then start on boundaries
of 2^N words and are 4, 8, 16 ... 256 words long.
As a multiple of 64, 64(P+1), where P = 0 to 14. Buffers then start on
1024-word boundaries and are 64, 128, 192 ... 960 words long.
As a multiple of 2, 2(Q+1), where Q = 0 to 30. Buffers then start on
64-word boundaries and are 2, 4, 6 ... 62 words long.
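The three loop-size options can be summarized in a short model. This sketch does not reproduce the actual PRL0/PRL1 bit encodings, which are not given here. The mode names, `loop_geometry`, and `next_address` are hypothetical, and the sketch assumes that a buffer occupies the first words after its start boundary and that the pointer wraps modulo the buffer length.

```python
def loop_geometry(mode: str, param: int):
    """Return (length, boundary) in words for one of the three loop-size options."""
    if mode == "pow2":                     # 2^N, N = 2 to 8
        if not 2 <= param <= 8:
            raise ValueError("N must be 2..8")
        length = 1 << param
        return length, length              # buffer starts on a 2^N-word boundary
    if mode == "mult64":                   # 64(P+1), P = 0 to 14
        if not 0 <= param <= 14:
            raise ValueError("P must be 0..14")
        return 64 * (param + 1), 1024
    if mode == "mult2":                    # 2(Q+1), Q = 0 to 30
        if not 0 <= param <= 30:
            raise ValueError("Q must be 0..30")
        return 2 * (param + 1), 64
    raise ValueError("unknown mode")


def next_address(addr: int, step: int, mode: str, param: int) -> int:
    """Advance a pointer by step words with wraparound inside its circular buffer."""
    length, boundary = loop_geometry(mode, param)
    base = addr - (addr % boundary)        # start of the buffer containing addr
    offset = (addr - base + step) % length
    return base + offset
```

For instance, with an 8-word power-of-two buffer, a pointer at 0x107 stepped by +1 returns to 0x100, and a pointer at 0x100 stepped by -1 goes to 0x107.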
Stacks
The address pointer register SP(EXT4) can be used as a LIFO stack pointer.
It is incremented before the address is accessed when using the POP Reg
instruction. Popping data from a downward-growing stack is done
only with the POP Reg instruction. The SP is automatically incremented
by one, without any looping boundary restriction, when the POP instruction is
executed. This pre-incrementing step takes an additional clock cycle over
the usual post-incrementing.
The program counter is pushed onto the stack by interrupts, the Reset
signal, and subroutine calls. Looping boundaries do not affect
stack operation on interrupts, Reset, or subroutine calls and returns.
These are automatically treated as non-looping operations.
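The pre-incrementing pop can be illustrated with a small software model. Only the pop behavior (increment SP, then read) is taken from the text. The push side (store at SP, then decrement) is an assumption chosen to be consistent with a downward-growing stack, and the `StackModel` class is hypothetical.

```python
class StackModel:
    """Toy model of a downward-growing stack addressed by SP."""

    ADDR_MASK = 0xFFFFFF   # 24-bit address pointer
    DATA_MASK = 0xFFFFFF   # 24-bit data words

    def __init__(self, top: int):
        self.sp = top & self.ADDR_MASK
        self.mem = {}      # sparse memory: address -> word

    def push(self, value: int) -> None:
        # Assumed behavior: write at SP, then post-decrement.
        self.mem[self.sp] = value & self.DATA_MASK
        self.sp = (self.sp - 1) & self.ADDR_MASK

    def pop(self) -> int:
        # From the text: SP is incremented before the address is accessed.
        self.sp = (self.sp + 1) & self.ADDR_MASK
        return self.mem[self.sp]
```

Pushing two values and popping twice returns them in reverse order and leaves SP at its starting address, with no loop-boundary wrapping applied.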