Results 1 - 10 of 260
Duplicate Detection in Probabilistic Data
"... Abstract — Collected data often contains uncertainties. Probabilistic databases have been proposed to manage uncertain data. To combine data from multiple autonomous probabilistic databases, an integration of probabilistic data has to be performed. Until now, however, data integration approaches hav ..."
Cited by 5 (2 self)
A linear-time probabilistic counting algorithm for database applications
- ACM Transactions on Database Systems
, 1990
"... We present a probabilistic algorithm for counting the number of unique values in the presence of duplicates. This algorithm has O(q) time complexity, where q is the number of values including duplicates, and produces an estimation with an arbitrary accuracy prespecified by the user using only a smal ..."
Cited by 102 (5 self)
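The snippet above does not spell out the algorithm itself. One well-known technique that fits its description (a single pass over all q values, a small fixed-size structure, an estimate rather than an exact count) is bitmap-based linear counting, sketched below. The function name `linear_count`, the bitmap size `m`, and the choice of BLAKE2 as the hash function are assumptions made for illustration and are not taken from the listing.

```python
import hashlib
import math


def linear_count(values, m=4096):
    """Estimate the number of distinct items in `values` (illustrative sketch).

    Each value is hashed to one of m bitmap slots in a single pass. The
    fraction of slots left empty is then inverted to estimate the number
    of distinct values: n_hat = -m * ln(empty_slots / m).
    """
    bitmap = bytearray(m)  # one byte per slot, for simplicity
    for v in values:
        digest = hashlib.blake2b(repr(v).encode("utf-8"), digest_size=8).digest()
        bitmap[int.from_bytes(digest, "big") % m] = 1

    empty = m - sum(bitmap)
    if empty == 0:
        # Every slot is occupied, so the estimator is undefined; m is too small.
        raise ValueError("bitmap saturated; increase m")
    return -m * math.log(empty / m)


# 5000 values with heavy duplication, 500 of them distinct.
sample = [i % 500 for i in range(5000)]
estimate = linear_count(sample)
```

Duplicates always hash to the same slot, so they never change the bitmap, which is why the estimate depends only on the number of distinct values. A larger `m` costs more memory and gives a tighter estimate.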
2005: Uncertainty in predictions of the climate response to rising levels of greenhouse gases
- Nature
"... The range of possibilities for future climate evolution 1-3 needs to be taken into account when planning climate change mitigation and adaptation strategies. This requires ensembles of multidecadal simulations to assess both chaotic climate variability and model response uncertainty As a first ste ..."
Cited by 175 (9 self)
as a 'model version') an initial-condition ensemble 22 is used, creating an ensemble of ensembles. Each individual member of this grand ensemble (referred to here as a 'simulation') explores the response to changing boundary conditions 22 by including a period with doubled CO2
Self-determination and persistence in a real-life setting: Toward a motivational model of high school dropout.
- Journal of Personality and Social Psychology
, 1997
"... The purpose of this study was to propose and test a motivational model of high school dropout. The model posits that teachers, parents, and the school administration's behaviors toward students influence students' perceptions of competence and autonomy. The less autonomy supportive the so ..."
Cited by 183 (19 self)
section). Actual dropout behavior was assessed through a dichotomous variable that reflected enrollment status the following fall semester (0 = re-enrolled; 1 = dropped out). The variance-covariance matrix of the 22 observed variables was used as the database for the analysis. The variance
Creating Probabilistic Databases from Duplicated Data
- VLDB Journal (manuscript)
"... Abstract A major source of uncertainty in databases is the presence of duplicate items, i.e., records that refer to the same real world entity. However, accurate deduplication is a difficult task and imperfect data cleaning may result in loss of valuable information. A reasonable alternative appro ..."
probabilistic database out of a dirty relation of duplicated data and overview the challenges raised in utilizing this framework for large relations of string data. We study the problem of associating probabilities with duplicates that are detected using state-of-the-art scalable approximate join
Whom You Know Matters: Venture Capital Networks and Investment Performance
- Journal of Finance
, 2007
"... Abstract Many financial markets are characterized by strong relationships and networks, rather than arm's-length, spot-market transactions. We examine the performance consequences of this organizational choice in the context of relationships established when VCs syndicate portfolio company inv ..."
Cited by 138 (8 self)
are obtained from Thomson Financial's Venture Economics database. Venture Economics began compiling data on venture capital investments in 1977, and has since backfilled the data to the early 1960s. Gompers and Lerner (1999) investigate the completeness of the Venture Economics database and conclude
Probabilistic Reasoning Models for Face Recognition
- In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition
, 1998
"... We introduce in this paper two probabilistic reasoning models (PRM-1 and PRM-2) which combine the Principal Component Analysis (PCA) technique and the Bayes classifier and show their feasibility on the face recognition problem. The conditional probability density function for each class is modeled u ..."
Cited by 33 (7 self)
using the within class scatter and the Maximum A Posteriori (MAP) classification rule is implemented in the reduced PCA subspace. Experiments carried out using 1107 facial images corresponding to 369 subjects (with 169 subjects having duplicate images) from the FERET database show that the PRM approach
Studies in Probabilistic Sequence Alignment and Evolution
, 1998
"... The complete sequencing of whole genomes presents opportunities for detailed study of molecular evolution. This thesis combines theoretical developments of Bayesian approaches in bioinformatics with analysis of duplications in the recently completed C. elegans genome. Developments in the Bayesi ..."
Cited by 11 (0 self)
parameters in C. elegans are calculated from the data and reported. A method of dating gene duplications using alignments between conserved introns is presented and compared to existing methods using Bayesian techniques developed earlier in the dissertation. Amongst the principal agents involved
KEY-BASED BLOCKING OF DUPLICATES IN ENTITY-INDEPENDENT PROBABILISTIC DATA (Research-in-Progress)
"... Abstract: Currently, in many application areas the demand on probabilistic data grows. Duplicate entity representations are an essential problem of data quality, for certain databases as well as for probabilistic databases. Traditional duplicate detection approaches are based on pairwise comparisons ..."
DPLOT Database graphics
"... This user manual is a "Database graphics in examples". The examples show how to look for the values recorded in the DELPHI database, how to create Ntuples from the database etc. The reference section on the manual gives full description of the DPLOT commands. ..."