Results 1 - 10 of 547,917
Model-Based Clustering, Discriminant Analysis, and Density Estimation
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2000
"... Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little ..."
Cited by 557 (28 self)
for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster
How much should we trust differences-in-differences estimates? Quarterly Journal of Economics 119:249–75
, 2004
"... Most papers that employ Differences-in-Differences estimation (DD) use many years of data and focus on serially correlated outcomes but ignore that the resulting standard errors are inconsistent. To illustrate the severity of this issue, we randomly generate placebo laws in state-level data on fema ..."
Cited by 775 (1 self)
Formalising trust as a computational concept
, 1994
"... Trust is a judgement of unquestionable utility — as humans we use it every day of our lives. However, trust has suffered from an imperfect understanding, a plethora of definitions, and informal use in the literature and in everyday life. It is common to say “I trust you,” but what does that mean? T ..."
Cited by 518 (5 self)
? This thesis provides a clarification of trust. We present a formalism for trust which provides us with a tool for precise discussion. The formalism is implementable: it can be embedded in an artificial agent, enabling the agent to make trust-based decisions. Its applicability in the domain of Distributed
Estimating the number of clusters in a dataset via the Gap statistic
, 2000
"... We propose a method (the "Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within-cluster dispersion to that expected under an appropriate reference ..."
Cited by 492 (1 self)
principal components. 1 Introduction Cluster analysis is an important tool for "unsupervised" learning, the problem of finding groups in data without the help of a response variable. A major challenge in cluster analysis is estimation of the optimal number of "clusters". Figure 1 (top right) shows
OPTICS: Ordering Points To Identify the Clustering Structure
, 1999
"... Cluster analysis is a primary method for database mining. It is either used as a standalone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ..."
Cited by 511 (49 self)
the intrinsic clustering structure accurately. We introduce a new algorithm for the purpose of cluster analysis which does not produce a clustering of a data set explicitly; but instead creates an augmented ordering of the database representing its density-based clustering structure. This cluster
LOF: Identifying Density-Based Local Outliers
 PROCEEDINGS OF THE 2000 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA
, 2000
"... For many KDD applications, such as detecting criminal activities in E-commerce, finding the rare instances or the outliers, can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for m ..."
Cited by 499 (14 self)
For many KDD applications, such as detecting criminal activities in E-commerce, finding the rare instances or the outliers, can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier. This degree is called the local outlier factor (LOF) of an object. It is local in that the degree depends on how isolated the object is with respect to the surrounding neighborhood. We give a detailed formal analysis showing that LOF enjoys many desirable properties. Using real-world datasets, we demonstrate that LOF can be used to find outliers which appear to be meaningful, but can otherwise not be identified with existing approaches. Finally, a careful performance evaluation of our algorithm confirms that our approach of finding local outliers can be practical.
Estimating the Support of a High-Dimensional Distribution
, 1999
"... Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified value between 0 and 1. We propo ..."
Cited by 766 (29 self)
Power-law distributions in empirical data
 ISSN 0036-1445. doi: 10.1137/070710111. URL http://dx.doi.org/10.1137/070710111
, 2009
"... Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the t ..."
Cited by 589 (7 self)
estimates for power-law data, based on maximum likelihood methods and the Kolmogorov-Smirnov statistic. We also show how to tell whether the data follow a power-law distribution at all, defining quantitative measures that indicate when the power law is a reasonable fit to the data and when it is not. We
Mean shift, mode seeking, and clustering
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1995
"... Mean shift, a simple iterative procedure that shifts each data point to the average of data points in its neighborhood, is generalized and analyzed in this paper. This generalization makes some k-means-like clustering algorithms its special cases. It is shown that mean shift is a mode-seeki ..."
Cited by 620 (0 self)
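The procedure summarized in this entry (shift each point to the average of the data points in its neighborhood, and repeat) can be illustrated with a minimal sketch. This is not the paper's generalized formulation: it assumes one-dimensional data, a flat kernel of fixed radius, and a fixed data set that is not itself moved. The function name `mean_shift_1d` and the parameters `bandwidth` and `iterations` are hypothetical, chosen only for this illustration.

```python
def mean_shift_1d(points, bandwidth, iterations=50):
    """Move each query point toward a local mode of `points`.

    Assumptions (illustrative only): 1-D data, a flat kernel of radius
    `bandwidth`, and the original data kept fixed while the queries move.
    """
    shifted = list(points)  # every data point starts as its own query
    for _ in range(iterations):
        updated = []
        for x in shifted:
            # Neighborhood: original points within `bandwidth` of the query.
            neighbors = [p for p in points if abs(p - x) <= bandwidth]
            if neighbors:
                # Shift the query to the average of its neighborhood.
                updated.append(sum(neighbors) / len(neighbors))
            else:
                # Guard against an empty neighborhood; leave the query as is.
                updated.append(x)
        shifted = updated
    return shifted


if __name__ == "__main__":
    data = [0.8, 1.0, 1.2, 4.9, 5.0, 5.1]
    print(mean_shift_1d(data, bandwidth=1.0))
```

With two well-separated groups, each query settles at the average of its own group, so points that end at the same location can be read as one cluster.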
CONDENSATION - conditional density propagation for visual tracking
 International Journal of Computer Vision
, 1998
"... The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses "factored sampling", previously appli ..."
Cited by 1499 (12 self)