Processor Design: System-On-Chip Computing for ASICs and FPGAs

Michael Bedford Taylor, Walter Lee, Jason Eric Miller, David Wentzlaff, Ian Bratt, Ben Greenwald, Henry Hoffmann, Paul Johnson, Jason Kim, James Psota, Arvind Saraf, Nathan Shnidman, Volker Strumpen, Matt Frank, Rodric Rabbah, Saman Amarasinghe, and Anant Agarwal
Massachusetts Institute of Technology
The physical realities of wire delay and power consumption seriously challenge the ability of microprocessor designers to continue designing monolithic architectures with centralized resources. Materials and process changes have proven insufficient to solve the fundamental physics problems, and it is increasingly difficult for existing architectures to turn chip resources into higher performance at tractable cost. Fast-moving VLSI technology will soon offer tens of billions of transistors, massive chip-level wire bandwidth for local interconnect, and a modestly larger number of pins. Processors need to convert these abundant chip-level resources into power-efficient application performance while mitigating the negative effects of wire delay.
This chapter discusses the architecture of the Raw Microprocessor, an early multicore processor developed at MIT [411,447]. Raw is a tiled multicore architecture containing 16 homogeneous tiles arranged in a grid. Each tile contains a processor, caches, and several mesh routers. Raw is a general-purpose multicore architecture in that it supports various models of computation, including instruction-level parallelism (ILP), streaming, data-level parallelism (DLP), and thread-level parallelism (TLP). Raw's point-to-point interconnection networks between tiles support these models by routing both scalar operands and streams with extremely low latency between architecturally exposed function units. The Raw chip was successfully fabricated and demonstrated in 2002. Figure 14.1...