Programming with Intel Extended Memory 64 Technology: Migrating Software for Optimal 64-bit Performance

Specific Optimizations for Intel EM64T

Most of the optimizations that leverage Intel EM64T focus on the use of 32- and 64-bit registers and their respective operands. In some cases, the 64-bit items work better; in others, the 32-bit constructs should be preferred.
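One reason the 32-bit form is sometimes preferable is encoding size. The sketch below compares a few hand-assembled x86-64 byte sequences for the same operation at both operand widths. The table `ENCODINGS` and the helper `extra_bytes` are illustrative names, not part of the original text, and the three instruction pairs are only a small sample.

```python
# Hand-assembled x86-64 encodings for matching 32-bit and 64-bit operations.
# The 64-bit forms carry a REX.W prefix byte (0x48); "mov rax, 1" also uses
# a longer opcode form than "mov eax, 1".
ENCODINGS = {
    "xor eax, eax": bytes([0x31, 0xC0]),
    "xor rax, rax": bytes([0x48, 0x31, 0xC0]),
    "add eax, ebx": bytes([0x01, 0xD8]),
    "add rax, rbx": bytes([0x48, 0x01, 0xD8]),
    "mov eax, 1":   bytes([0xB8, 0x01, 0x00, 0x00, 0x00]),
    "mov rax, 1":   bytes([0x48, 0xC7, 0xC0, 0x01, 0x00, 0x00, 0x00]),
}


def extra_bytes(instr32: str, instr64: str) -> int:
    """Return how many more bytes the 64-bit encoding needs than the 32-bit one."""
    return len(ENCODINGS[instr64]) - len(ENCODINGS[instr32])


if __name__ == "__main__":
    pairs = [
        ("xor eax, eax", "xor rax, rax"),
        ("add eax, ebx", "add rax, rbx"),
        ("mov eax, 1", "mov rax, 1"),
    ]
    for i32, i64 in pairs:
        print(f"{i64!r} costs {extra_bytes(i32, i64)} more byte(s) than {i32!r}")
```

A byte or two per instruction looks small, but it accumulates across a hot routine, which is why operand width matters for the code-size rule that follows.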

Reduce Code Bloat Wherever It Does Not Interfere with Other Optimizations

The rationale for this rule is reduction of code size. Reducing the size of routines loaded into memory might not immediately seem like a performance optimization, but in fact it is. Before code can be executed, it is read from the main system RAM into the L2 cache. It is then transformed, sent down the pipeline, and moved into the L1 cache in the form of "micro ops." To appreciate what is going on and how code size affects it, we must examine this process in more detail.

When a program is launched, it goes through a series of steps before its first instruction actually runs. First, the program is copied into RAM, where the program loader resolves linkage references that are embedded in the executable. These references take several forms. The most common is an address in the executable that is encoded as an offset from the program load address. Because the load address is now known, each such offset is resolved to the actual address of the specific byte. Another important kind of reference is a call into a dynamic library. Not all called libraries are loaded at this point, but many are, and certainly all the ones necessary to begin operation.
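The offset-resolution step described above can be sketched as follows. This is a simplified model, not the behavior of any particular operating system's loader: it assumes each reference is a 64-bit little-endian slot holding an offset from the load address, and the names `resolve_offsets`, `slot_offsets`, and `load_address` are hypothetical.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF  # keep results within 64 bits


def resolve_offsets(image: bytes, slot_offsets, load_address: int) -> bytes:
    """Return a copy of `image` in which every listed 8-byte slot, which
    initially holds an offset from the load address, holds the absolute
    address instead (assumed layout: little-endian 64-bit slots)."""
    patched = bytearray(image)
    for slot in slot_offsets:
        (relative,) = struct.unpack_from("<Q", patched, slot)
        absolute = (load_address + relative) & MASK64
        struct.pack_into("<Q", patched, slot, absolute)
    return bytes(patched)


if __name__ == "__main__":
    # Two slots holding offsets 0x10 and 0x2000 from the start of the program.
    image = struct.pack("<QQ", 0x10, 0x2000)
    loaded = resolve_offsets(image, [0, 8], load_address=0x400000)
    print([hex(v) for v in struct.unpack("<QQ", loaded)])
```

The loader has to visit every such slot before execution begins, so the number of references in an executable is one of the startup costs this process incurs.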
