Chapter 2: Pattern Matching Automata
In this chapter, we address the problem of searching for a pattern in a text when the pattern represents a finite set of strings. We present solutions based on the utilization of automata. Note first that the utilization of an automaton as solution of the problem is quite natural: given a finite language X ? A*, locating all the occurrences of strings belonging to X in a text y ? A* amounts to determine all the prefixes of y that ends with a string of X ; this amounts to recognize the language A* X ; and as A* X is a regular language, this can be realized by an automaton. We additionally note that such solutions particularly suit to cases where a pattern has to be located in data that have to be processed in an online way: data flow analysis, downloading, virus detection, etc.
The utilization of an automaton for locating a pattern has already been discussed in Section 1.5. We complete here the subject by specifying how to obtain the deterministic automata mentioned at the beginning of this section. Complexities of the methods exposed at the end of Section 1.5 and that are valid for nondeterministic automata are also compared with those presented in this chapter.
The plan is decomposed as follows. We exhibit a type of deterministic and complete automata recognizing the language A* X. We consider two reduced implementations of this type of automata.