CD2450 Description
The CD2450 is a family of design modules for creating a high-performance
digital signal processing integrated circuit. These configurable core
processor, memory and peripheral modules produce a small, low-cost design
comparable to a custom one, but with a much shorter development cycle,
lower development costs and lower risk. A CD2450 design provides
the flexibility of a programmable processor with a fixed instruction set,
while having the performance and low cost of hardwired circuits configured
for the particular application. This is possible because of the modular
architecture connecting the programmable core processor with its memories
and a wide variety of peripheral digital and analog interfaces.
The core processor with a powerful instruction set has selectable data-path
precision, register set size and interrupt structure. Memories may be
on- and off-chip, ROM and RAM, and with different speeds and sizes. Peripheral
interface circuits support common functions and protocols yet are configurable
for the requirements of the application system. All modules are custom
designed to fit together for a compact layout with minimum interconnection
delays. Figure 1 shows a typical layout of modules and buses for a comprehensive
design using most module types.
Architecture
The block diagram in Figure 1
illustrates the architecture of the CD2450. This is an example system showing
many of the possible configurations of the modules on-chip with possible
external circuits. In this example the data precision is 16 bits, with data
buses sized accordingly. Address buses support the maximum address spaces.
Core Processor
The left-hand portion of the Figure
is the irreducible internal core processor. Arrayed along the internal D
bus are the Multiplier, ALU, Shifter, the eight Address Pointers, and the
eight Data and Status Registers including the Program Counter. All data
path elements are configurable to a precision N between 16 and 24 bits.
There is a corresponding increase in the clock cycle from 20 ns as
the precision is increased above 16 bits. Note that the Accumulator Bus A and
Product Bus PT have 2N-bit precision.
The multiplier is pipelined in two stages, producing a 31-bit product
in PH and PL every 20 ns with a latency of 40 ns for new arguments in
the X and Y registers. The ALU is full-function with double-word addition
and subtraction with the 32-bit accumulator ACCH and ACCL (the carry is
saved at the middle of the 32 bits during addition). The shifter SHIFT covers a byte-wide
range of left logical shifts and one of right arithmetic shifts on the
b ALU input register.
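The multiply-accumulate path described above can be sketched in Python for the 16-bit configuration. This is a minimal model under stated assumptions, not the CD2450 logic itself: the function names are hypothetical, the product is held as a 32-bit two's-complement value split into PH and PL, and the double-word addition is modelled as two 16-bit additions linked by the saved carry. The timing helper only restates the 20 ns cycle and two-stage pipeline given in the text.

```python
MASK16 = 0xFFFF


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def multiply(x, y):
    """Signed 16 x 16 multiply; returns (PH, PL) as unsigned 16-bit halves."""
    product = (to_signed(x, 16) * to_signed(y, 16)) & 0xFFFFFFFF
    return product >> 16, product & MASK16


def accumulate(acch, accl, ph, pl):
    """Add PH:PL to ACCH:ACCL as two 16-bit additions linked by a carry."""
    low = accl + pl
    carry = low >> 16              # carry out of the low word (assumed mechanism)
    high = acch + ph + carry       # carry out of the high word is discarded
    return high & MASK16, low & MASK16


def pipelined_time_ns(n_products, cycle_ns=20, stages=2):
    """Time until the last of n back-to-back products is available."""
    return (stages + n_products - 1) * cycle_ns


# Usage: accumulate 3 * (-2) and then 300 * 300.
acch, accl = 0, 0
for x, y in [(3, -2 & MASK16), (300, 300)]:
    ph, pl = multiply(x, y)
    acch, accl = accumulate(acch, accl, ph, pl)
total = to_signed((acch << 16) | accl, 32)
```

The second accumulation in the usage lines crosses the word boundary, so the carry from the low word is what makes the high word come out right.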
There are two sets of four address pointer registers RP0 and RP1 for
the two data memories RAM0 and RAM1. Pointers may be indexed and loop
in arbitrarily sized buffers. One register in each set may be used as
a stack pointer for program instructions. The stack in RAM0 is used for
interrupts. The three interrupts are part of the System Function portion
of the core. They have a three level priority structure and may be used
internally with interface modules or with external system signals. The
two user-defined inputs and two user-defined outputs also handle system
signals through non-interrupt programmed transfers. The other System Function
signals (see Table 2) are the synchronous clock control RDY or WAIT, used
with slower external memory or I/O, and the asynchronous clock control SLEEP,
used to reduce power dissipation.
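The indexed, looping address pointers mentioned above behave like a circular-buffer index. The following Python sketch shows one way to picture that behaviour. The class name, its fields and the post-modify ordering are assumptions made for illustration, since the text does not describe how the hardware implements the wrap.

```python
class AddressPointer:
    """Hypothetical model of one address pointer looping in a buffer."""

    def __init__(self, base, length, index=1):
        if length <= 0:
            raise ValueError("buffer length must be positive")
        self.base = base        # first address of the buffer
        self.length = length    # buffer size in words; any positive size
        self.index = index      # step applied after each access
        self.offset = 0

    def post_modify(self):
        """Return the current address, then step by index with wrap-around."""
        address = self.base + self.offset
        self.offset = (self.offset + self.index) % self.length
        return address


# Usage: a 5-word buffer at 0x100 stepped by 2 visits every word and loops.
pointer = AddressPointer(base=0x100, length=5, index=2)
visited = [pointer.post_modify() for _ in range(6)]
```

A buffer length of 5 is used deliberately to show that the wrap does not depend on a power-of-two size.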
Memories
Within the center portion of Figure
2 are optional modules which may be added on-chip with the processor core.
Usually the largest of these are the two Data Memories RAM0 and RAM1 and
the Program Memory PM. Each memory may be RAM, ROM or both, and
is added along its respective Data and Address buses. Sizes may be individually
determined for each type, but the total for each memory bus cannot exceed
64k words. The number of words need not be a power of two. Word
sizes are 16 to 24 bits and may differ from one memory to another and from the core
processor precision N. The Program Memory PM generally remains 16 bits wide even
though it can also be used as data memory.
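The sizing rules above can be expressed as a small configuration check. This Python sketch is illustrative only: the MemoryBlock type and check_bus function are hypothetical names, and "64k words" is assumed to mean 65,536.

```python
from dataclasses import dataclass

MAX_WORDS_PER_BUS = 64 * 1024   # assumption: "64k" means 65,536 words


@dataclass
class MemoryBlock:
    kind: str        # "RAM" or "ROM"
    words: int       # need not be a power of two
    word_bits: int   # 16 to 24


def check_bus(blocks):
    """Return a list of rule violations for the blocks on one memory bus."""
    problems = []
    for block in blocks:
        if block.kind not in ("RAM", "ROM"):
            problems.append(f"unknown memory kind: {block.kind}")
        if not 16 <= block.word_bits <= 24:
            problems.append(f"word size {block.word_bits} outside 16 to 24 bits")
    total = sum(block.words for block in blocks)
    if total > MAX_WORDS_PER_BUS:
        problems.append(f"{total} words exceeds {MAX_WORDS_PER_BUS} on this bus")
    return problems


# Usage: a mixed RAM and ROM bus with sizes that are not powers of two.
ok_bus = [MemoryBlock("RAM", 3000, 16), MemoryBlock("ROM", 1500, 16)]
bad_bus = [MemoryBlock("RAM", 60000, 16), MemoryBlock("ROM", 6000, 32)]
```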
Peripherals
Additional user defined Data Registers
REG[15:8] can be added either on-chip or externally. These registers may
be used to expand the general register set or they may be part of an interface.
They are added on the RAM1 Data Bus, which is an extension of the D Bus,
and the REG Address Bus. Figure 2 illustrates these additional registers
distributed between on-chip, within on-chip interfaces and externally.
Input/Output Interfaces are also added along the RAM1 Data Bus. Registers
within them may be Data Registers on the REG Address bus or memory-mapped
on the RAM1 Address bus. Digital interfaces are available for bit-serial
or parallel data transfers with buffers up to 24 bits. Buffers may be
dual-ported. Both synchronous and asynchronous protocol modules are available
with programmed or interrupt driven transfers.
A/D conversion of analog inputs and D/A conversion for analog outputs
are done with configurable modules. A sigma-delta design is used with
an oversampling ratio of 256 and a 24-bit precision reconstruction filter.
Resolution is selectable up to 14 bits.
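To illustrate the principle of oversampled conversion, the Python sketch below models a first-order sigma-delta modulator followed by a plain averaging decimator over 256 samples. Both choices are simplifying assumptions: the text does not state the modulator order or the filter structure of the CD2450 modules, and all function names are hypothetical.

```python
def sigma_delta_bits(x, n):
    """First-order modulator: n one-bit samples (+1 or -1) for a constant
    input x in the open range (-1, 1)."""
    integrator = 0.0
    bits = []
    for _ in range(n):
        y = 1 if integrator >= 0 else -1
        integrator += x - y     # accumulate the error between input and output
        bits.append(y)
    return bits


def decimate(bits, osr=256):
    """Average each complete block of osr one-bit samples (assumed filter)."""
    return [sum(bits[i:i + osr]) / osr
            for i in range(0, len(bits) - osr + 1, osr)]


def quantize(value, resolution_bits=14):
    """Scale a value in [-1, 1) to a signed integer of the chosen resolution."""
    scale = 1 << (resolution_bits - 1)
    return max(-scale, min(scale - 1, round(value * scale)))


# Usage: four output samples for a constant input of 0.3.
samples = decimate(sigma_delta_bits(0.3, 4 * 256))
```

With this simple averaging filter each output sample lies within 2/256 of the input, because the integrator state stays bounded. A practical design would use a sharper filter to reach the stated resolution.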
A DRAM controller module provides the necessary address multiplexing
and timing signals for the use of these slower, lower-cost external memories.
The module is configured for the individual DRAM chip size and its data
organization.
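Address multiplexing for a DRAM means presenting one address as a row part and a column part in sequence on shared pins. The Python sketch below shows the split, assuming the upper bits form the row and the lower bits the column. The function name and the 9-bit by 9-bit example organization are hypothetical and are not taken from the CD2450 module.

```python
def multiplex_address(address, row_bits, col_bits):
    """Split a linear address into (row, column) for a multiplexed DRAM."""
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address does not fit the configured DRAM size")
    row = address >> col_bits                 # assumed: upper bits are the row
    column = address & ((1 << col_bits) - 1)  # lower bits are the column
    return row, column


# Usage: a 256K-word device organized as 512 rows by 512 columns.
row, column = multiplex_address(0x12345, row_bits=9, col_bits=9)
```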
External System
The Program Memory, the RAM0 and
the RAM1 Buses can all be extended off-chip to provide additional or alternate
locations for registers, memory and input/output interfaces. Table 2 gives
a summary of the data, address, and timing and control signals that are
generated. Data bus precisions are for a 16-bit example.
Instruction Set
The configurability of the CD2450
architecture provides for high cost-effectiveness by allowing the selection
of just the silicon area needed for the application. The compact instruction
set contributes to this same goal. Because the instructions are powerful
it takes few of them to perform a given task. Thus program memory sizes
are smaller and performance is higher because fewer instruction clock cycles
are used.
Table 3 summarizes the instruction set. Note the wide choice of operands
for most operations and the variations in the basic instruction that are
possible. Subroutines, indirect addressing, and self-indexing reduce program
length through code reuse. No-overhead loop indexing, long and short immediate
data fields, powerful bit operation instructions and conditional execution
all contribute to minimum code length.
The support of RAM in the Program Memory often reduces total system
memory costs because it allows downloading only those program routines
needed for a particular operation from slower, less expensive off-chip ROM.
Program RAM makes for easy software upgrades as well. The fact that Program
Memory can be used as data memory, and not just as immediate data, gives
additional freedom in memory usage as tasks change.
A full copy of the CD2450 Description, with figures and tables, is available
as a PostScript or PDF file.