Algorithms on Strings

Chapter 8: Approximate Patterns

Overview

In this chapter, we are interested in the approximate search for fixed strings. Several notions of approximation on strings are considered: jokers, differences, and mismatches.

A joker is a symbol meant to represent all the letters of the alphabet. The solutions to the problem of searching a text for a pattern containing jokers use specific methods that are described in Section 8.1.

More generally, approximate pattern matching consists in locating all the occurrences of factors inside a text y that are similar to a string x. It consists in producing the positions of the factors of y that are at distance at most k from x, for a given natural integer k. We assume in the rest that k < x ? y. We consider two distances for measuring the approximation: the edit distance and the Hamming distance.

The edit distance between two strings u and v, that are not necessarily of the same length, is the minimum cost of a sequence of elementary edit operations between these two strings (see Section 7.1). The method at the basis of approximate pattern matching is a natural extension of the alignment method by dynamic programming of Chapter 7. It can be improved by using a restricted notion of distance obtained by considering the minimum number of edit operations rather than the sum of their costs. With this distance, the problem is known as the approximate pattern matching with

UNLIMITED FREE
ACCESS
TO THE WORLD'S BEST IDEAS

SUBMIT
Already a GlobalSpec user? Log in.

This is embarrasing...

An error occurred while processing the form. Please try again in a few minutes.

Customize Your GlobalSpec Experience

Category: Bar Graph Arrays
Finish!
Privacy Policy

This is embarrasing...

An error occurred while processing the form. Please try again in a few minutes.