**9.1 ERROR LOCATION OF FAULTY PACKAGES AND FAULTY CHIPS**

As was shown in Figure 1.11 in Subsection 1.4.1, error location falls midway between the functions of error correction and error detection. In the codes designed by Wolf and Elspas, the codeword is divided into *p* distinct bytes, each *b* bits long. The code detects *e* (< *b*) or fewer errors, all occurring within a single byte, and identifies that byte. For this reason the code is referred to as the **S**ingle **e**-bit (within a **b**-bit byte) **E**rror **L**ocating code, or S_{e/b}EL code. For instance, if we let **E**_{i} (**E**_{j}) be the set of *e* or fewer errors occurring within the *i*-th (*j*-th) byte, the code must satisfy the relation
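Assuming **H** denotes the parity-check matrix of the code (a symbol not yet introduced at this point, used here only for illustration), the locating requirement can be sketched in syndrome form as:

$$
\mathbf{H}\,\mathbf{e}^{T} \neq \mathbf{0} \quad \text{for all nonzero } \mathbf{e} \in \mathbf{E}_i,
\qquad
\mathbf{H}\,\mathbf{e}_1^{T} \neq \mathbf{H}\,\mathbf{e}_2^{T} \quad \text{for all nonzero } \mathbf{e}_1 \in \mathbf{E}_i,\ \mathbf{e}_2 \in \mathbf{E}_j,\ i \neq j.
$$

Two errors within the same byte may share a syndrome, which is what separates error location from error correction.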

The number of check bits *r* is bounded from below by

where ⌈*x*⌉ is the smallest integer not less than *x*.

In general, the error locating code is derived from the tensor product of the parity-check matrices [WOLF65a].

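**Definition 9.1** Let **X** = (*x*_{i,j}) be an *a* × *b* matrix and **Y** = (*y*_{i,j}) be a *c* × *d* matrix. The matrix **Z**, defined as the tensor product of **X** and **Y**, is the *ac* × *bd* matrix given by

$$
\mathbf{Z} = \mathbf{X} \otimes \mathbf{Y} =
\begin{bmatrix}
x_{1,1}\mathbf{Y} & \cdots & x_{1,b}\mathbf{Y} \\
\vdots & & \vdots \\
x_{a,1}\mathbf{Y} & \cdots & x_{a,b}\mathbf{Y}
\end{bmatrix}.
$$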

Let **H**_{D} be the *ρ* × *b* parity-check matrix for a binary (*b*, *b* − *ρ*) linear code *C*_{D} that detects the class of errors *E*_{D}. Let **H**_{C} be the *m* × *p* parity-check matrix for a nonbinary (*p*, *p* − *m*) linear code *C*_{C}, with symbols from GF(2^{ρ}), that corrects the class of errors *E*_{C}. Here, a column vector of *ρ*-bit length in the parity-check matrix of *C*_{D} corresponds to a symbol in *C*_{C}, that is, a symbol from GF(2^{ρ}). Finally, let **C** be the binary (*pb*, *pb* − *ρm*) linear code with the *ρm* × *pb* parity-check matrix **H** given by

$$
\mathbf{H} = \mathbf{H}_{C} \otimes \mathbf{H}_{D}.
$$

**Theorem 9.1** *If all binary byte errors corresponding to the erroneous bytes are within class E_{D} and if the erroneous bytes form a pattern of errors over GF(2^{ρ}) that falls in class E_{C}, then code **C** detects the errors and identifies the erroneous bytes.*

If *C*_{D} is an *e*-bit (within a *b*-bit byte) error detecting code with *ρ* check bits and *C*_{C} is a single-symbol error correcting code on GF(2^{ρ}), then **C** is an S_{e/b}EL code.
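As a minimal sketch of this construction, consider the simplest case *ρ* = 1, where GF(2^{ρ}) is just GF(2) and the tensor product of the two check matrices reduces to an ordinary Kronecker product modulo 2. The choices below are assumptions made for illustration only: *C*_{D} is a single parity check over a 4-bit byte (so *e* = 1), *C*_{C} is the binary (7, 4) Hamming code, and the helper names `kron_gf2`, `hamming_check_matrix`, `syndrome`, and `locate_byte` are hypothetical.

```python
def kron_gf2(X, Y):
    """Tensor (Kronecker) product of two 0/1 matrices: block (i, j) is x[i][j] * Y."""
    return [
        [(x * y) % 2 for x in x_row for y in y_row]
        for x_row in X
        for y_row in Y
    ]


def hamming_check_matrix(m):
    """m x (2^m - 1) binary Hamming check matrix; column j is the binary form of j + 1."""
    p = 2**m - 1
    return [[(col >> row) & 1 for col in range(1, p + 1)] for row in range(m)]


def syndrome(H, word):
    """Syndrome of a 0/1 word with respect to H, computed over GF(2)."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]


def locate_byte(s):
    """Map a syndrome to a 0-based byte index, or None when the syndrome is zero.

    Relies on the column ordering chosen in hamming_check_matrix.
    """
    value = sum(bit << row for row, bit in enumerate(s))
    return value - 1 if value else None


b, m = 4, 3                      # assumed byte length and number of rows of H_C
H_D = [[1] * b]                  # single parity check over one byte (rho = 1)
H_C = hamming_check_matrix(m)    # single-error correcting code over GF(2), p = 7
H = kron_gf2(H_C, H_D)           # 3 x 28 check matrix of the resulting code

error = [0] * (len(H_C[0]) * b)
error[9] = 1                     # flip one bit inside byte 2 (bits 8..11)
print(locate_byte(syndrome(H, error)))  # prints 2
```

With these choices the code has *ρm* = 3 check bits and length *pb* = 28. The syndrome of a single-bit error equals the column of the Hamming matrix belonging to the affected byte, so the byte is identified but the bit within it is not. A two-bit error inside one byte passes the parity check unnoticed, which is consistent with *e* = 1.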

The codes described above apply only to errors affecting fewer than *b* bits. If the maximum number of errors located by the codes is equal to *b*, then the S_{b/b}EL code is an S*b*EC code. This is shown in the following theorem [VAID92].

**Theorem 9.2** *An error locating code that can locate all single-byte errors is a single-byte error correcting code.*
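A short sketch of why this holds, assuming **H** denotes the parity-check matrix and that locating a byte error includes detecting it (this is an illustrative argument, not necessarily the proof given in [VAID92]):

$$
\mathbf{e}_1 \neq \mathbf{e}_2 \text{ in byte } i
\;\Rightarrow\;
\mathbf{e}_1 + \mathbf{e}_2 \neq \mathbf{0} \text{ is itself an error in byte } i
\;\Rightarrow\;
\mathbf{H}(\mathbf{e}_1 + \mathbf{e}_2)^{T} \neq \mathbf{0}
\;\Rightarrow\;
\mathbf{H}\mathbf{e}_1^{T} \neq \mathbf{H}\mathbf{e}_2^{T}.
$$

Together with the requirement that errors in different bytes produce different syndromes, every single-byte error then has its own nonzero syndrome, which is the condition for single-byte error correction. The argument relies on the located class containing every error pattern of a byte; for *e* < *b* the sum of two located errors can fall outside the class.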

From the result above, the existing error locating codes are not always suitable for application to byte-organized semiconductor memory systems.

In general, a semiconductor memory module has a hierarchical organization consisting of memory cards or memory packages on which memory chips are mounted. A memory card on which *b*-bit byte-organized RAM chips are mounted provides a data output of *B* bits, where *B* is a multiple of *b*, that is, *B* = *p* × *b*. The clustered data output from a package or card is called here a *block*; its length is *B* bits. The clustered data output from a chip is called a *byte*; its length is *b* bits.

So we now have a new class of error locating codes that pertains to byte-organized systems. We introduce the term *block* to denote a set of bytes. Each codeword is divided into disjoint blocks, and each block is subdivided into bytes. This new class of codes locates an erroneous block that contains a single-byte error. We call these codes *single b-bit byte (within a B-bit block) error locating codes* (i.e., S_{b/p×b}EL codes) or *block error locating codes*. We will also use the terms *code length in bits*, *code length in bytes*, and *code length in blocks* to denote the length of a codeword in bits, bytes, and blocks, respectively. Figure 9.1 illustrates these relations.
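The bit, byte, and block hierarchy can be sketched as simple index arithmetic. This is only an illustration under stated assumptions: the codeword is taken to be a whole number of blocks, bits are numbered from 0, and the function names and the sample values (72 bits, *b* = 4, *p* = 2) are hypothetical.

```python
def locate_position(bit_index, b, p):
    """Split a 0-based bit index into (block, byte within block, bit within byte)."""
    block, offset = divmod(bit_index, p * b)   # B = p * b bits per block
    byte, bit = divmod(offset, b)
    return block, byte, bit


def code_lengths(n_bits, b, p):
    """Code length in bits, bytes, and blocks, assuming n_bits is a multiple of p * b."""
    if n_bits % (p * b):
        raise ValueError("code length must be a whole number of blocks")
    return n_bits, n_bits // b, n_bits // (p * b)


print(code_lengths(72, b=4, p=2))      # (72, 18, 9)
print(locate_position(29, b=4, p=2))   # (3, 1, 1)
```

A block error locating code reports only the first component of such a position, whereas a byte error locating code reports the byte as well.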

The predominant errors, even in byte-organized semiconductor memory chips, are soft errors induced by α particles and external noise. These errors still tend to manifest as single-bit errors in byte-organized RAM chips. Therefore an error locating code that can also correct single-bit errors is very useful. We call these codes *single-bit error correcting and single b-bit byte (within a B-bit block) error locating codes* (i.e., SEC-S_{b/p×b}EL codes [FUJI94]) or *block error locating codes with single-bit error correction capability*. In this regard, codes such as the S_{b/p×b}EL codes and the SEC-S_{b/p×b}EL codes discussed above can also be called *codes for locating the package / card with faulty chips*. Once the faulty package / card is located by a code and replaced by a fault-free one, the system can recover and proceed with normal operation. As for the location of faulty chips, we depend on codes such as the *single-bit error correcting and single e-bit (within a b-bit byte) error locating codes* (i.e., SEC-S_{e/b}EL codes [KITA95]) or *byte error locating codes with single-bit error correction capability*. Codes of this type are called *codes for locating faulty chips*.
