Algorithms on Strings

Chapter 5: Structures for Indexes

Overview

In this chapter, we present data structures for storing the suffixes of a text. These structures are designed to provide direct and fast access to the factors of the text. They make it possible to work on the factors of the string in almost the same way as the suffix array of Chapter 4 does, but the emphasis is placed on the structuring of the data rather than on the algorithms that search the text.

The main application of these techniques is to provide the basis of an index implementation as described in Chapter 6. Direct access to the factors of a string also allows a large number of other applications. In particular, the structures can be used for pattern matching by treating them as search machines (see Chapter 6).

Two types of objects are considered in this chapter, trees and automata, together with their compact versions. Trees factorize the common prefixes of the strings in the set. Automata additionally factorize their common suffixes. The structures are presented in decreasing order of size.

The representation of the suffixes of a string by a trie (Section 5.1) has the advantage of being simple, but it can require memory space that is quadratic in the length of the string. The (compact) suffix tree (Section 5.2) avoids this drawback and admits an implementation in linear memory space.
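The quadratic growth can be observed on a small example. The following is a minimal sketch, not the construction of Section 5.1: it builds a suffix trie naively with nested Python dictionaries, and the names build_suffix_trie, count_nodes, and contains_factor are hypothetical helpers introduced here for illustration. Each node of the trie corresponds to one distinct factor of the text, so for a text of the form a^k b^k the trie has (k + 1)^2 nodes.

```python
def build_suffix_trie(text):
    """Naive suffix trie: one nested dict per node, keyed by symbol.

    Illustrative only; it takes time quadratic in len(text).
    """
    root = {}
    for start in range(len(text)):
        node = root
        # Insert the suffix text[start:] symbol by symbol.
        for symbol in text[start:]:
            node = node.setdefault(symbol, {})
    return root


def count_nodes(node):
    """Number of nodes in the trie rooted at `node` (root included)."""
    return 1 + sum(count_nodes(child) for child in node.values())


def contains_factor(root, pattern):
    """True if `pattern` is a factor of the indexed text."""
    node = root
    for symbol in pattern:
        if symbol not in node:
            return False
        node = node[symbol]
    return True


trie = build_suffix_trie("aabb")
print(count_nodes(trie))            # 9 nodes: the 8 nonempty factors plus the root
print(contains_factor(trie, "ab"))  # True
print(contains_factor(trie, "ba"))  # False
```

Reading a pattern from the root answers a factor query in time proportional to the pattern length, which is the direct access to factors mentioned above. The price is one node per distinct factor.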

Minimizing the suffix trie (in the sense of automata) gives the minimal suffix automaton described in Section 5.4.
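The effect of minimization can be illustrated by counting states. The sketch below is an assumption-laden illustration rather than the construction of Section 5.4: it builds a suffix trie whose nodes carry a terminal flag (the empty suffix is assumed to count as a suffix, so the root is terminal), then merges nodes that have the same flag and the same transitions to already merged classes. Because a trie is acyclic, this bottom-up merging yields the number of states of the minimal automaton. The names build_marked_suffix_trie, count_trie_nodes, and count_minimal_states are hypothetical.

```python
def build_marked_suffix_trie(text):
    """Suffix trie in which nodes that spell a suffix are flagged terminal."""
    root = {"children": {}, "terminal": True}  # assumption: empty suffix accepted
    for start in range(len(text)):
        node = root
        for symbol in text[start:]:
            node = node["children"].setdefault(
                symbol, {"children": {}, "terminal": False}
            )
        node["terminal"] = True  # text[start:] is a suffix
    return root


def count_trie_nodes(node):
    """Number of nodes in the trie rooted at `node` (root included)."""
    return 1 + sum(count_trie_nodes(c) for c in node["children"].values())


def count_minimal_states(root):
    """Count equivalence classes of trie nodes under automaton minimization.

    Two nodes are merged when they agree on the terminal flag and on the
    class reached by each outgoing symbol.
    """
    classes = {}

    def class_of(node):
        signature = (
            node["terminal"],
            tuple(sorted((s, class_of(c)) for s, c in node["children"].items())),
        )
        # A new signature receives the next free class number.
        return classes.setdefault(signature, len(classes))

    class_of(root)
    return len(classes)


trie = build_marked_suffix_trie("aabb")
print(count_trie_nodes(trie))      # 9
print(count_minimal_states(trie))  # 6
```

For the text aabb, the nine nodes of the trie collapse to six states: for instance, the nodes reached by aab and ab are merged because each is followed only by b.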
