embedded power for your project



32-bit and more - microprocessor power for every application

40 years of microprocessors:

The range of powerful processors for specialized applications has become hard to survey. On the one hand, configurable architectures are replacing conventional board-level design for small production runs; on the other, established silicon vendors keep devising ever more refined variations of their standard architectures, so that many long-standing wishes are coming true. A processor that keeps six stepper motors in check simultaneously and is Bluetooth-capable on top? Why not - automotive suppliers in particular are very receptive to such ideas, because volumes in the millions beckon in that industry.

Some of the most powerful SOCs (systems on chip) built into upper-mid-range cars and above are based on the ARM Cortex-A9 and A15, soon to be joined by the high-performance ARMv8 Cortex-A5x and A7x families. Perhaps Intel would have done better to keep developing the StrongARM/XScale architecture it inherited from Digital?

Evolution in Architectures

Today's embedded processor choices include:

Chip-level MP - Multiprocessing on-chip is a reality. Implementations range from mobile SOCs integrating a DSP plus a microcontroller to multiple cores on a common bus.

True systems-on-chip - SOCs have gone beyond a processor, its memory bus, and peripherals. They are taking on the aspects of complete systems, with multiple processors, common memory, peripherals, and sophisticated system buses to tie it all together.

Parallel processing element designs - Alternative architectures with multiple processing elements that can deliver massive amounts of processing power from highly parallel computing structures; think of array processors and the like.

Extensible processors - RISCs that can be extended at the ISA level, relying on system-level logic synthesis to integrate the designs.

Add-on functionality - RISC and DSP architectures that let third parties and vendors add logic functionality, relying on logic synthesis to integrate the new functions into the design.

And last, but not least, CPU speeds are up. Embedded processors are moving up the speed curve; some already pass the 1 GHz barrier, bringing the associated EMC and heat problems with them.

Silicon-Driven Revolution

It is a technical commonplace, even a cliché, that silicon technology follows Moore's Law, roughly doubling the number of transistors (or functionality, clock rates, capabilities) every 18 months to two years - and embedded designs are reaping the benefits of this relentless march up the silicon curve.

Modern low-power silicon offers ever higher clock rates, so processors need more on-chip memory to minimize off-chip access delays. Many are moving toward large on-chip L2 caches to keep processing local.

Chip-Level MP

In the 21st century, chip-level multiprocessing became a reality. SOCs moved from being a way to integrate a processor with its peripherals on one piece of silicon to taking on the characteristics of true systems. Multiple processors on an FPGA became a working reality - one that designers could count on to deliver a large amount of processing power within a realistic silicon budget.

SOC multiprocessing ranges from paired processors, such as a RISC paired with a microcontroller, to full-scale MP architectures with multiple RISC processors. In addition, a new class of MP processing has emerged: multiple processors arranged in sequential processing order or in processing arrays. This latter class represents the deployment of specialized math, vector, graphics, or media processors, which collectively can deliver a very high level of performance at modest clock rates. Now the software must become capable of feeding processor arrays with enough tasks to turn the gain in silicon capability into a real advantage in application processing speed.

Taking advantage of today's plentiful silicon, vendors are packing multiple processors on a single die to minimize design chip counts and costs.

Clocks vs. Execution Units

There's a new variation on an age-old tradeoff: clock rates vs. execution units. The idea is that we don't have to go faster if we do more in parallel: run many execution units at slower clock rates and get GHz-level performance without straining the silicon. It's a variation of the "wider rather than faster" design theme - and, if you think about it, that's precisely what superscalar RISC, VLIW, and SIMD are all about: deploying more execution units in parallel.

Sounds good, but most superscalar RISCs, VLIWs, or SIMDs can't get that many execution units chugging away in parallel. A 4-way superscalar RISC runs 4 execution units in parallel; at best, an 8-way VLIW like TI's C6x has 8 units executing in parallel. SIMDs do a bit better, especially for 8-bit operations: a 128-bit SIMD unit like the one in Motorola's PowerPC G4 does 16 operations in parallel - but if you need 16-bit accuracy, only 8.

However, there's another way to get massive amounts of execution MIPS at relatively low clock rates: up the number of parallel execution units that can be deployed in tandem. Today's emerging parallel designs are all over the map architecturally, but all get their top-level performance by ganging multiple execution units for massive parallelism.

Several dynamically reconfigurable MP designs pair an on-chip RISC host (such as an ARC core) with a 32-bit reconfigurable processing fabric, configurable through FPGA-like programmable local and layer interconnects and datapath cells. Examples of such architectures can be found at Stretch, Altera, Atmel, Xilinx, and more companies to come.

Through the looking glass:

RISC, Superscalar, VLIW, and SIMD

Today's processor design techniques include RISC, superscalar, VLIW, and SIMD. Each of these techniques enables designers to get more out of their silicon: squeezing down cycle logic, executing instructions in parallel, or multiplying the number of operations a single instruction performs. The trick is to get more done in the same amount of clock time.

RISC - In classic RISCs, the trick was to squeeze down the register-to-ALU-to-register cycle for higher execution speeds. One way to go faster was to simplify the logic: simplify the instruction set, use fixed-length instruction words, use a load/store architecture (operate only on registers), and pipeline execution in sequential stages so the next instruction can start before the current one finishes. These design techniques enabled RISCs to run faster than the older CISC (complex instruction set computer) processors.

Superscalar - The next step up in RISC performance was superscalar execution. Superscalar designs can issue more than one RISC instruction per cycle, using multiple execution units to execute multiple instructions in parallel; for example, many RISCs can issue and execute an integer and a floating-point instruction at the same time. But superscalar design techniques ran into natural limits: the more instructions you issue, the more intermediate state you have to hold in case something goes wrong, such as taking a branch, which negates the instructions that follow it in sequence. Superscalar has settled out into implementations that can issue 2, 3, or 4 instructions in parallel.

VLIW - Some newer design techniques have evolved from RISC, including VLIW and SIMD. VLIW (very long instruction word) implementations are a relatively successful attempt to bypass the problems of superscalar RISC. VLIW is much like superscalar RISC; both techniques issue a number of RISC-style instructions per cycle. The difference is that a superscalar RISC decides dynamically, in hardware, which instructions to issue and how to handle intermediate scheduling problems, whereas VLIW lets the compiler handle the scheduling, with the hardware simply receiving and issuing a block of instructions.

SIMD - SIMD (single instruction, multiple data) has been around a long time: a single instruction controls the same operation on multiple data elements - an ADD instruction, for example, causes n units to do an add. SIMD has proved to be a very powerful mechanism, especially for 8-, 16-, and 32-bit DSP and graphics operations done on large register words. SIMD was a natural extension for the floating-point units of RISC and x86 PC processors - pioneered by Sun for its SPARC and picked up by Intel for the Pentium - enabling one instruction to be applied to multiple fields of a floating-point register word. For a 64-bit word, that can be 8 8-bit adds, 4 16-bit adds, or 2 32-bit adds, delivering an 8x, 4x, or 2x speedup. SIMD has since been extended to other architectures and designs: Motorola's PowerPC G4 pairs a 128-bit vector engine co-processor with a G3 PPC core, and the latest SIMD designs are moving to a separate 128-bit vector unit instead of the earlier 64-bit floating-point execution units.

(by techonline2000, revised and updated by Bernhard Kockoth embeddedexpert.com 2008)


Embedded Expert 2016 - All brands, trademarks, and trade names are the property of their respective owners.

© BK media systems 2002, 2016.

German law requires an Impressum.