Algorithms on Strings

In this section, we consider the approximate pattern matching with differences: locating all the factors of y that are at a given maximal distance k of x. We set m = x and n = y, and we assume k ? N and k < m ? n. The distance between two strings is defined here as the minimal number of differences between these two strings. A difference can be one of the edit operations: substitution, deletion or insertion (see Section 7.1). The problem corresponds to the utilization of a simplified notion of edit distance. The standard solutions designed to solve the problem consist in using the dynamic programming technique introduced in Chapter 7. We describe three variations around this technique.
We first examine a problem a bit more general for which the cost of the edit operations is not necessarily the unit. It consists thus of the ordinary edit distance (see Chapter 7). Aligning x with a factor of y amounts to align x with a prefix of y considering that the insertion of any number of letters of y at the beginning of x is not penalizing. With the table T of Section 7.2 we can check that, in order to solve the problem, it is sufficient then to initialize to zero the values of the first line of the table. The...