From Algorithms on Strings
In this chapter, we address the problem of searching for a pattern in a text when the pattern represents a finite set of strings. We present solutions based on the utilization of automata. Note first that the utilization of an automaton as solution of the problem is quite natural: given a finite language X ? A*, locating all the occurrences of strings belonging to X in a text y ? A* amounts to determine all the prefixes of y that ends with a string of X ; this amounts to recognize the language A* X ; and as A* X is a regular language, this can be realized by an automaton. We additionally note that such solutions particularly suit to cases where a pattern has to be located in data that have to be processed in an online way: data flow analysis, downloading, virus detection, etc.
The utilization of an automaton for locating a pattern has already been discussed in Section 1.5. We complete here the subject by specifying how to obtain the deterministic automata mentioned at the beginning of this section. Complexities of the methods exposed at the end of Section 1.5 and that are valid for nondeterministic automata are also compared with those presented in this chapter.
The plan is decomposed as follows. We exhibit a type of deterministic and complete automata recognizing the language A* X. We consider two reduced implementations of this type of automata.
Products & Services
Topics of Interest
In this chapter, we consider the problem of searching for all the occurrences of a fixed string in a text. The methods described here are based on combinatorial properties. They apply when the string...
8.3 Approximate Pattern Matching with Mismatches In this section, we restrict the approximate pattern matching to the search for all the occurrences of a string x of length m in a string y of length...
Overview In this chapter, we present data structures for storing the suffixes of a text. These structures are conceived for providing a direct and fast access to the factors of the text. They allow...
Overview In this chapter, we are interested in the approximate search for fixed strings. Several notions of approximation on strings are considered: jokers, differences, and mismatches. A joker is...
8.2 Approximate Pattern Matching with Differences In this section, we consider the approximate pattern matching with differences: locating all the factors of y that are at a given maximal distance k...