Scientific Computing on Itanium-Based Systems

The Itanium floating-point architecture has been designed to combine high performance and good accuracy. A large floating-point register set of 128 registers is provided, and almost all operations can read their arguments from, and write their results to, arbitrary registers. Together with register rotation for software-pipelined loops, this large number of registers allows the encoding of common algorithms without running short of registers or needing to move data between them in elaborate ways. Registers can store floating-point numbers in a variety of formats, and the rounding of results is determined by a flexible combination of several selectable defaults and additional instruction completers. The basic arithmetic operation, the floating-point multiply-add (fused multiply-add), allows higher accuracy and performance in many common algorithms. Several additional features are also present to support common programming idioms.
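The accuracy benefit of a fused multiply-add comes from rounding a*b + c once rather than twice. The following is a minimal sketch of that effect, not Itanium code: it emulates the single rounding in Python with exact rational arithmetic, assuming the host uses IEEE double precision. The function names and the input values are chosen here purely for illustration.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add for finite inputs.

    The product and sum are formed exactly as rationals, and the
    result is rounded to a float only once at the end.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Multiply, round, then add, round: two roundings in total."""
    return a * b + c


# Illustrative inputs: the exact product is 1 - 2**-60, which does not
# fit in a 53-bit significand and rounds to 1.0 when computed on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

separate = separate_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)

print(separate == 0.0)        # True: the small term was lost in the first rounding
print(fused == -(2.0**-60))   # True: the single rounding keeps it
```

With separate operations the rounding error of the product is lost before the addition takes place; with a single rounding it survives, which is why a fused operation helps in algorithms that depend on small residuals.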
Binary floating-point numbers are real numbers representable in the form

$$(-1)^s \times 2^e \times m_0.m_1 m_2 \cdots m_{p-1}$$

where
The sign s is either 0 or 1
The exponent e is a positive or negative integer
The significand [1] $m_0.m_1 m_2 \cdots m_{p-1}$ is a sequence of binary digits (each $m_i$ is either 0 or 1)
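The sign, exponent and significand described above can be read off a concrete number. The following is a minimal sketch, assuming Python floats are IEEE double precision with p = 53 and that the input is finite and nonzero. The helper names `decompose` and `recompose` are hypothetical and used only for illustration.

```python
import math

P = 53  # assumed precision of a Python float (IEEE double)


def decompose(x: float) -> tuple[int, int, str]:
    """Return (s, e, digits) for a finite nonzero float.

    `digits` is the string m_0 m_1 ... m_{p-1}, with the binary point
    understood to follow m_0, so that
    x = (-1)**s * 2**e * m_0.m_1 ... m_{p-1}.
    """
    if x == 0.0 or not math.isfinite(x):
        raise ValueError("expected a finite nonzero number")
    s = 1 if x < 0 else 0
    # frexp gives abs(x) = frac * 2**exp with 0.5 <= frac < 1.
    frac, exp = math.frexp(abs(x))
    e = exp - 1  # shift so that the leading digit m_0 is 1
    # frac has at most 53 significant bits, so this scaling is exact.
    digits = format(int(frac * 2**P), f"0{P}b")
    return s, e, digits


def recompose(s: int, e: int, digits: str) -> float:
    """Rebuild the float from its sign, exponent and digit string."""
    magnitude = math.ldexp(float(int(digits, 2)), e - (P - 1))
    return -magnitude if s else magnitude


s, e, digits = decompose(-6.5)
print(s, e, digits[:4])  # 1 2 1101, since 6.5 is 1.101 (binary) times 2**2
```

The sketch always normalizes so that m_0 is 1 and does not enforce any bound on the exponent.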
In standard floating-point implementations there is a limit both on the range of the exponent, which must fall in a range $E_{min} \le e \le E_{max}$, and on the precision $p$, which is the number of digits permitted in the significand. These three parameters $E_{min}$, $E_{max}$ and