Data Quality

Chapter 8: The Purdue University Data Quality Project

OVERVIEW

Consistency and completeness are two important dimensions of data quality [1], [9], [20]. Both, however, are routinely compromised by errors introduced into a database system, whether accidentally or intentionally. Such errors produce inconsistent, incomplete, and erroneous data elements. For example, a small variation in how a data object is represented may cause the same object to be stored as a separate, apparently unique instance. To improve the accuracy of the data stored in a database system, the data must be compared either with its real-world counterparts or with other data stored in the same or a different system.

This chapter was written by Vassilios S. Verykios at Drexel University under the direction of Ahmed Elmagarmid at Purdue University. It profiles one aspect of the ongoing Purdue research projects: the problem of matching records that refer to the same entity by computing their similarity. In this context, exact record matching has limited applicability, since even simple errors such as character transpositions cannot be captured in the record linking process. The methodology deploys advanced data mining techniques to deal with the high computational and inferential complexity of approximate record matching.
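The limits of exact matching can be illustrated with a small sketch. It is not the chapter's methodology, which is not described in this overview. It assumes a standard edit distance that counts adjacent transpositions (optimal string alignment), and the function names, the field-averaging rule, and the 0.85 threshold are all hypothetical choices made for illustration.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Allowed edits: insertion, deletion, substitution, and transposition
    of two adjacent characters, each with cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "ki" versus "ik".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical after cleanup."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - osa_distance(a, b) / longest


def records_match(r1: dict, r2: dict, threshold: float = 0.85) -> bool:
    """Hypothetical decision rule: average the field similarities over the
    fields both records share and compare against an assumed threshold."""
    fields = r1.keys() & r2.keys()
    if not fields:
        return False
    average = sum(similarity(r1[f], r2[f]) for f in fields) / len(fields)
    return average >= threshold


rec_a = {"name": "Verykios", "city": "Philadelphia"}
rec_b = {"name": "Veryikos", "city": "Philadelphia"}  # "ki" transposed

exact = rec_a == rec_b                 # exact comparison misses the pair
approx = records_match(rec_a, rec_b)   # approximate comparison links it
```

Here the two records differ only by one transposed character pair in the name, so an exact comparison treats them as different entities while the similarity-based rule links them. Comparing every pair of records this way is quadratic in the number of records, which hints at the computational cost that motivates more advanced techniques.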

[1] Ballou, D. P. and H. L. Pazer, "Modeling Data and Process Quality in Multi-input, Multi-output Information Systems," Management Science, 31(2), 1985, pp. 150-162.

[9] Huang, K., Y. Lee and R. Wang, Quality Information and Knowledge. Prentice Hall, Upper Saddle River, N.J., 1999.

[20] Redman, T. C., ed. Data Quality for...
