Results 1 - 10 of 5,799
Data Cleaning: Problems and Current Approaches
- IEEE Data Engineering Bulletin, 2000
"... We classify data quality problems that are addressed by data cleaning and provide an overview of the main solution approaches. Data cleaning is especially required when integrating heterogeneous data sources and should be addressed together with schema-related data transformations. In data warehouse ..."
Cited by 279 (8 self)
An Extensible Framework for Data Cleaning
- In ICDE, 2000
"... Data integration solutions dealing with large amounts of data have been strongly required in the last few years. Besides the traditional data integration problems (e.g. schema integration, local to global schema mappings), three additional data problems have to be dealt with: (1) the absence of un ..."
Cited by 74 (0 self)
... of universal keys across different databases, which is known as the object identity problem, (2) the existence of keyboard errors in the data, and (3) the presence of inconsistencies in data coming from multiple sources. Dealing with these problems is globally called the data cleaning process. In this work, we ...
Special Issue on Data Cleaning
- 2000
"... We classify data quality problems that are addressed by data cleaning and provide an overview of the main solution approaches. Data cleaning is especially required when integrating heterogeneous data sources and should be addressed together with schema-related data transformations. In data warehous ..."
Data Cleaning Methods
- 2003
"... Data Cleaning methods are used for finding duplicates within a file or across sets of files. This overview provides background on the Fellegi-Sunter model of record linkage. The Fellegi-Sunter model provides an optimal theoretical classification rule. Fellegi and Sunter introduced methods for au ..."
Cited by 20 (1 self)
An Interactive Framework for Data Cleaning
- 2000
"... Cleaning organizational data of discrepancies in structure and content is important for data warehousing and Enterprise Data Integration (EDI). Current commercial solutions for data cleaning involve many iterations of time-consuming "data quality" analysis to find errors, and long-runnin ..."
Cited by 4 (0 self)
Email Data Cleaning
- In 5th International Conference on Knowledge and Data Discovery KDD’05, 2005
"... Addressed in this paper is the issue of ‘email data cleaning’ for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus it is necessary to clean it before mining. Several products offer email cleaning features, however, the types of noises that c ..."
Cited by 22 (4 self)
THE IMPACT OF DATA CLEANING ON INTERNAL VALIDITY
"... Any number you want: the impact of data cleaning on internal validity ..."
Declarative Data Cleaning: Language, Model, and Algorithms
- In VLDB, 2001
"... The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. This holds regardless of the application - relational database joining, web-related, or scientific. In all cases, ex ..."
Cited by 125 (6 self)
Continuous Data Cleaning
"... In declarative data cleaning, data semantics are encoded as constraints and errors arise when the data violates the constraints. Various forms of statistical and logical inference can be used to reason about and repair inconsistencies (errors) in data. Recently, unified approaches that repa ..."
Cited by 7 (2 self)
The LLUNATIC Data-Cleaning Framework
"... Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods r ..."
Cited by 10 (3 self)