Processor Design: System-On-Chip Computing for ASICs and FPGAs

Processor pipelines seemingly receive a disproportionate share of the limelight as the glamorous darlings of the processor-design community. However, a processor's performance depends on more than just its execution pipeline. As with any engineering discipline, good processor performance depends on balanced design. Many factors contribute to a processor's (or a system's) overall performance, and any of these factors can poison the operational efficiency of the perfect pipeline running real-world applications if it's not in balance with the others. Designers must employ an expanded range of design decisions and new technologies to produce balanced, cost-effective systems [60].
Advances made in processor designs over the past decade have increased microprocessor instruction-issue rates much faster than main-memory bandwidth has increased or main-memory access latency has decreased. These advances include both circuit improvements, which caused clock rates to rise at roughly 30% per year from 1985 to 2005 [129], and architectural improvements such as wider instructions, VLIW architectures, and speculative execution. Consequently, microprocessor accesses to bulk or main memory have become expensive in terms of time. This trend forces architectural and system-level design changes including:
Wider connections to main memory (more pins)
Larger and more efficient instruction and data caches
Memory-centric system architectures
Each of these new approaches delivers benefits and incurs costs.
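To get a feel for the size of the imbalance described above, the following minimal Python sketch compounds the roughly 30% annual clock-rate growth cited for 1985 to 2005 and compares it with a memory improvement rate. The memory figure (ASSUMED_MEMORY_GROWTH, 7% per year) is a hypothetical placeholder chosen only for illustration and does not come from the text; the function and constant names are likewise invented for this example.

```python
def compound_growth(annual_rate: float, years: int) -> float:
    """Return the total growth factor after compounding annual_rate for years."""
    return (1.0 + annual_rate) ** years


# From the text: clock rates rose at roughly 30% per year from 1985 to 2005.
CLOCK_RATE_GROWTH = 0.30
YEARS = 2005 - 1985

# ASSUMPTION: hypothetical annual improvement in main-memory access time.
# This value is illustrative only and is not taken from the text.
ASSUMED_MEMORY_GROWTH = 0.07

clock_factor = compound_growth(CLOCK_RATE_GROWTH, YEARS)
memory_factor = compound_growth(ASSUMED_MEMORY_GROWTH, YEARS)

# Ratio of the two factors: how much faster the processor clock improved
# than main memory over the same period, under the assumed memory rate.
gap_factor = clock_factor / memory_factor

print(f"Clock-rate growth over {YEARS} years: {clock_factor:.0f}x")
print(f"Assumed memory improvement over {YEARS} years: {memory_factor:.1f}x")
print(f"Resulting processor-memory gap: {gap_factor:.0f}x")
```

Compounding 30% per year over 20 years yields a clock-rate increase of roughly 190 times. Under any plausibly slower memory improvement rate, a main-memory access that once cost a few clock cycles comes to cost many more, which is the pressure behind the design changes listed above.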
Burger et al. [60] divide a processor's execution time into three components, which help explain how a processor's design might be better balanced. The three components are: