Networks on Chips: Technology and Tools

Packet preparation is one of the key stages of NI architectures, and its latency is likely to have a significant impact on overall communication latency, as shown by early NoC prototypes [32].
Three implementations are possible [13]:
Software-based library: The operating system is extended to include primitives that implement all the needed communication services.
On-core module: The processor core itself is modified and parameterized, and the services are implemented in hardware.
Hardware-based wrapper: An independent hardware block (the NI) is located between the processor core and the interconnect.
The implementation of the packetization strategy varies depending on the reconfigurability and programmability of a specific core. In general, the choice is a trade-off among performance, area, and flexibility [13]. The packet communication process has essentially three stages: packet preparation, packet transmission, and packet handling at the receiver. In this chapter, we focus primarily on the packet preparation stage for communication over the network. The packet handling stage at the destination is essentially complementary and exhibits similar tendencies.
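As a rough illustration of why preparation and receiver-side handling are complementary, the following minimal sketch builds a packet from a payload and then reassembles it. The flit width, the header fields, and the names prepare_packet and handle_packet are assumptions made for this example only; they are not taken from [13] or from any particular NI design.

```python
from dataclasses import dataclass
from typing import List

FLIT_BYTES = 4  # assumed flit width (32 bits); hypothetical, not from the text


@dataclass
class Packet:
    header: dict                # hypothetical routing/control fields
    payload_flits: List[bytes]  # payload cut into fixed-size flits


def prepare_packet(src_id: int, dest_id: int, payload: bytes) -> Packet:
    """Packet preparation: build a header and cut the payload into flits."""
    # Pad the payload with zero bytes so it fills a whole number of flits.
    pad = (-len(payload)) % FLIT_BYTES
    padded = payload + b"\x00" * pad
    flits = [padded[i:i + FLIT_BYTES] for i in range(0, len(padded), FLIT_BYTES)]
    # The header records the unpadded length so the receiver can strip padding.
    header = {"src": src_id, "dest": dest_id, "length": len(payload)}
    return Packet(header, flits)


def handle_packet(packet: Packet) -> bytes:
    """Receiver-side handling: reassemble the flits and drop the padding."""
    data = b"".join(packet.payload_flits)
    return data[:packet.header["length"]]
```

In this sketch a 5-byte payload occupies two flits, the second padded with zeros, and handle_packet recovers the original bytes from the length field. A software library, an on-core module, or a hardware wrapper would each perform equivalent steps at different cost in latency, area, and flexibility.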
Without loss of generality, let us restrict our analysis to a simple distributed memory environment and study packetization implementation in this context. The system consists of a core that can access separate memory cores spread across the on-chip network. To the software executing on the processor core, the memory appears as one contiguous block present at a single location. The processor core is aware of the distributed nature of...