Algorithms on Strings

8.3: Approximate Pattern Matching with Mismatches

8.3 Approximate Pattern Matching with Mismatches

In this section, we restrict the approximate pattern matching to the search for all the occurrences of a string x of length m in a string y of length n with at most k mismatches ( k ? N, k < m ? n). We recall from Chapter 7 that the Hamming distance between two strings u and ? of the same length is the number of mismatches between u and ?, and is defined by

The problem can then be expressed as the search for all the positions j = 0 , 1 , , n ? m on y that satisfy the inequality Ham( x, y[ j .. j + m ? 1]) ? k.

Search automaton

A natural solution to this problem consists in using an automaton that recognizes the language A*{ w : Ham( x, w) ? k}. This extends the method developed in Chapter 2. To do this, we can consider the nondeterministic automaton defined as follows:

  • each state is a pair ( ?, i) where ? is the level of the state and i is its depth, with 0 ? ? ? k, ?1 ? i ? m ? 1, and ? ? i + 1,

  • the initial state is (0 ,

UNLIMITED FREE
ACCESS
TO THE WORLD'S BEST IDEAS

SUBMIT
Already a GlobalSpec user? Log in.

This is embarrasing...

An error occurred while processing the form. Please try again in a few minutes.

Customize Your GlobalSpec Experience

Category: Power Plugs
Finish!
Privacy Policy

This is embarrasing...

An error occurred while processing the form. Please try again in a few minutes.