Chapter 3: Floating-Point Architecture

The Itanium floating-point architecture has been designed to combine high performance and good accuracy. A large floating-point register set of 128 registers is provided, and almost all operations can read their arguments from, and write their results to, arbitrary registers. Together with register rotation for software-pipelined loops, this large number of registers allows the encoding of common algorithms without running short of registers or needing to move data between them in elaborate ways. Registers can store floating-point numbers in a variety of formats, and the rounding of results is determined by a flexible combination of several selectable defaults and additional instruction completers. The basic arithmetic operation, the floating-point multiply-add (fused multiply-add), allows higher accuracy and performance in many common algorithms. Several additional features are also present to support common programming idioms.
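The accuracy benefit of a fused multiply-add comes from rounding once instead of twice. The sketch below is an illustration only, not Itanium code. It emulates a single-rounding multiply-add in Python using exact rational arithmetic, assuming Python floats are IEEE double precision. The function name `fused_multiply_add` and the sample values are hypothetical choices made for this example.

```python
from fractions import Fraction

def fused_multiply_add(a, b, c):
    """Emulate a*b + c with a single rounding at the end.

    Fraction(x) converts a float exactly, so the product and sum are
    computed without error; float() then rounds the exact result once.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# Sample inputs chosen so that the intermediate product is not representable:
# a*b is exactly 1 - 2**-60, which rounds to 1.0 in double precision.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = a * b + c                    # product rounded, then sum rounded
fused = fused_multiply_add(a, b, c)    # exact product and sum, rounded once

print("two roundings:", unfused)
print("one rounding: ", fused)
```

With these inputs the separately rounded computation loses the small term entirely and returns zero, while the single-rounding version keeps it.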

Available Floating-Point Formats

Binary floating-point numbers are real numbers representable in the form

$$(-1)^s \times 2^e \times m_0.m_1 m_2 \cdots m_{p-1}$$

where

The sign $s$ is either 0 or 1
The exponent $e$ is an integer (positive, negative, or zero)
The significand ^[1] $m_0.m_1 m_2 \cdots m_{p-1}$ is a sequence of binary digits (each $m_i$ is either 0 or 1)
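The sign, exponent, and significand described above can be extracted from a concrete number. The following is a minimal sketch, assuming Python floats are IEEE double precision with 53 significand digits. The helper names `decompose` and `recompose` and the constant `P` are hypothetical, introduced only for illustration.

```python
import math

P = 53  # assumed precision of a Python float (IEEE double)

def decompose(x):
    """Return (s, e, digits) so that x = (-1)**s * 2**e * m0.m1...m_{P-1}.

    digits is a string of P binary digits; the binary point sits after
    the first digit. Only finite nonzero inputs are handled.
    """
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        raise ValueError("this sketch handles only finite nonzero values")
    s = 1 if x < 0 else 0
    frac, exp = math.frexp(abs(x))    # abs(x) == frac * 2**exp, 0.5 <= frac < 1
    e = exp - 1                       # rescale so the leading digit m0 is 1
    n = int(frac * 2**P)              # significand as a P-bit integer (exact)
    digits = format(n, "b").zfill(P)
    return s, e, digits

def recompose(s, e, digits):
    """Rebuild the float from its sign, exponent, and significand digits."""
    n = int(digits, 2)
    return (-1) ** s * math.ldexp(n, e - (len(digits) - 1))

s, e, digits = decompose(-6.5)        # 6.5 is 1.101 (binary) times 2**2
print(s, e, digits[0] + "." + digits[1:])
```

Round-tripping a value through `decompose` and `recompose` returns the original number, which is a quick check that the three fields fully describe it.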

In standard floating-point implementations there is a limit both on the range of the exponent, which must fall in a range $E_{\min} \le e \le E_{\max}$, and on the precision $p$, which is the number of digits permitted in the significand. These three parameters $E_{\min}$, $E_{\max}$ and


Scientific Computing on Itanium-Based Systems
