Data Quality

The widely held assumption in the database community that the data stored in databases are of good quality is not always borne out in the real world. In practice, "dirty" data pervade databases for various reasons [15], [17], [22], [24], [35], [46]. Although many in the database community acknowledge that data quality problems exist, they view them simply as accuracy and integrity problems. It is now well established that the scope of data quality goes beyond accuracy and integrity to include other aspects, such as believability and timeliness, which are equally, if not more, important from the end-user's perspective [1], [6], [40], [47]. Researchers in the data quality field must address these issues.
In this book, we have presented an exposé of the state-of-the-art concepts, techniques, and models for data quality in the database area developed over the last decade at the MIT Total Data Quality Management research program. We have also profiled data quality research projects in other research institutions. Much of the research has been followed up by additional work, as presented below.
[15] English, L. P., Improving Data Warehouse and Business Information Quality. John Wiley & Sons, New York, NY, 1999.
[17] Huang, K., Y. Lee and R. Wang, Quality Information and Knowledge. Prentice Hall, Upper Saddle River, NJ, 1999.
[22] Laudon, K. C., "Data Quality and Due Process in Large Interorganizational Record Systems," Communications of the ACM, 29(1),...