@MISC{Duan07dataquality, author = {Rong Duan and Tom Au and Wei Jiang}, title = {Data Quality Assessment via Robust Clustering}, year = {2007} }
Abstract
Although data mining is widely used in business and industry to improve decision making, data quality has long been ignored in practice, so the analytical results derived from data mining methods are often questionable and too unreliable to represent useful knowledge or aid decision making. This paper proposes a generic framework for data quality assessment in nonhomogeneous environments based on robust clustering analysis. In particular, trimmed clustering methods are proposed to robustly characterize groups of similar observations, and the trimmed observations are then evaluated to assess their outlyingness based on their distance to the cluster profiles. Simulation studies show the effectiveness of the proposed framework.
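The trimming idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes trimmed k-means with Euclidean distance as the trimmed clustering method and cluster means as the "cluster profiles", neither of which the abstract specifies. The function name `trimmed_kmeans`, its parameters (`trim_frac`, `init`), and the toy data are all hypothetical.

```python
import numpy as np


def trimmed_kmeans(X, k, trim_frac=0.1, init=None, n_iter=100, seed=0):
    """Illustrative trimmed k-means (an assumed stand-in, not the paper's method).

    Returns the cluster centers, each point's nearest-center label, each
    point's distance to that center (used here as an outlyingness score),
    and a boolean mask marking the trimmed points.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    n_keep = n - int(trim_frac * n)  # points that are allowed to shape the centers

    if init is None:
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init, dtype=float).copy()

    def nearest(c):
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1), np.sqrt(d2.min(axis=1))

    for _ in range(n_iter):
        labels, dist = nearest(centers)
        keep = np.argsort(dist)[:n_keep]  # drop the farthest points this round
        new_centers = centers.copy()
        for j in range(k):
            members = keep[labels[keep] == j]
            if members.size > 0:
                new_centers[j] = X[members].mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break

    labels, dist = nearest(centers)
    trimmed = np.ones(n, dtype=bool)
    trimmed[np.argsort(dist)[:n_keep]] = False
    return centers, labels, dist, trimmed


# Toy data (made up for illustration): two tight groups plus two suspect records.
rng = np.random.default_rng(42)
group_a = rng.uniform(-1.0, 1.0, size=(20, 2))
group_b = rng.uniform(9.0, 11.0, size=(20, 2))
suspect = np.array([[50.0, 50.0], [-40.0, 60.0]])
X = np.vstack([group_a, group_b, suspect])

# Start from one point in each group so the run does not depend on random init.
centers, labels, dist, trimmed = trimmed_kmeans(
    X, k=2, trim_frac=0.05, init=X[[0, 20]]
)
print("trimmed row indices:", np.flatnonzero(trimmed))
print("outlyingness of trimmed rows:", dist[trimmed])
```

In this sketch the trimmed points never influence the cluster means, so the profiles stay robust, and the distance of each trimmed point to its nearest profile serves as a simple score for how suspect that record is. With random initialization the procedure can settle in a poor local optimum, so several restarts would be a sensible addition.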