The Software Optimization Cookbook: High-Performance Recipes for IA-32 Platforms, Second Edition

One of the most basic operations of a computer language is the conditional branch. Unfortunately, conditional branches are also among the most difficult instructions for the processor to execute efficiently because they can break the in-order flow of instructions. Sometimes branches are executed in a single clock while other times they can take many dozens of clocks. For a detailed discussion of the reason that branching is a problem for processors, refer to Chapter 5 "Processor Architecture."
Branches come in two basic forms: conditional and unconditional. Conditional branches either jump to a designated instruction (taken branch) or go to the next instruction (fall through). Unconditional branches always jump to a new location. That location may be known beforehand as in the case of direct jumps or may not be known until executed, as for indirect jumps. Table 7.1 shows examples of the two types of branches. It should be noted that the call instructions are forms of unconditional branches.
| Conditional | Unconditional |
|---|---|
|
if (a > 10) a = 10; |
Fn(a); |
|
do { a++;} while (a < 10); |
goto end; |
|
(a > 10) ? a=10 : a=0; |
return a; |
|
for (a=0; a<10; a++) b++; |
__asm { int 3 }; |
|
while (!eof) Read_another_byte(); |
fnPointer(a); |
To determine the next instruction, the Pentium 4 and Pentium M processors use a branch target buffer (BTB) and branch history to predict the outcome of a branch. The...