Results 1 - 10
of
8,786
Accurate Methods for the Statistics of Surprise and Coincidence
- COMPUTATIONAL LINGUISTICS
, 1993
"... Much work has been done on the statistical analysis of text. In some cases reported in the literature, inappropriate statistical methods have been used, and statistical significance of results have not been addressed. In particular, asymptotic normality assumptions have often been used unjustifiably ..."
Abstract
-
Cited by 1057 (1 self)
- Add to MetaCart
small samples. These tests can be implemented efficiently, and have been used for the detection of composite terms and for the determination of domain-specific terms. In some cases, these measures perform much better than the methods previously used. In cases where traditional contingency table methods
Statistical pattern recognition: A review
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
"... The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques ..."
Abstract
-
Cited by 1035 (30 self)
- Add to MetaCart
, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved
A Bayesian Framework for the Analysis of Microarray Expression Data: Regularized t-Test and Statistical Inferences of Gene Changes
- Bioinformatics
, 2001
"... Motivation: DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory ..."
Abstract
-
Cited by 491 (6 self)
- Add to MetaCart
or fold methods, and partly compensate for the lack of replication. Availability: The approach is implemented in a software called Cyber-T accessible through a Web interface at www.genomics.uci.edu/software.html. The code is available as Open Source and is written in the freely available statistical
A Sequential Algorithm for Training Text Classifiers
, 1994
"... The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was ..."
Abstract
-
Cited by 631 (10 self)
- Add to MetaCart
The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers
Mega: molecular evolutionary genetic analysis software for microcomputers
- CABIOS
, 1994
"... A computer program package called MEGA has been developed for estimating evolutionary distances, reconstructing phylogenetic trees and computing basic statistical quantities from molecular data. It is written in C+ + and is intended to be used on IBM and IBM-compatible personal computers. In this pr ..."
Abstract
-
Cited by 505 (10 self)
- Add to MetaCart
. In this program, various methods for estimating evolutionary distances from nucleotide and amino acid sequence data, three different methods of phylogenetic inference (UPGMA, neighbor-joining and maximum parsimony) and two statistical tests of topological differences are included. For the maximum parsimony method
Model-Based Analysis of Oligonucleotide Arrays: Model Validation, Design Issues and Standard Error Application
, 2001
"... Background: A model-based analysis of oligonucleotide expression arrays we developed previously uses a probe-sensitivity index to capture the response characteristic of a specific probe pair and calculates model-based expression indexes (MBEI). MBEI has standard error attached to it as a measure of ..."
Abstract
-
Cited by 775 (28 self)
- Add to MetaCart
better ranking statistic for filtering genes. We can assign reliability indexes for genes in a specific cluster of interest in hierarchical clustering by resampling clustering trees. A software dChip implementing many of these analysis methods is made available. Conclusions: The model-based approach
Real-Time Computing Without Stable States: A New Framework for Neural Computation Based on Perturbations
"... A key challenge for neural modeling is to explain how a continuous stream of multi-modal input from a rapidly changing environment can be processed by stereotypical recurrent circuits of integrate-and-fire neurons in real-time. We propose a new computational model for real-time computing on time-var ..."
Abstract
-
Cited by 469 (38 self)
- Add to MetaCart
-varying input that provides an alternative to paradigms based on Turing machines or attractor neural networks. It does not require a task-dependent construction of neural circuits. Instead it is based on principles of high dimensional dynamical systems in combination with statistical learning theory, and can
Fast approximate nearest neighbors with automatic algorithm configuration
- In VISAPP International Conference on Computer Vision Theory and Applications
, 2009
"... nearest-neighbors search, randomized kd-trees, hierarchical k-means tree, clustering. For many computer vision problems, the most time consuming component consists of nearest neighbor matching in high-dimensional spaces. There are no known exact algorithms for solving these high-dimensional problems ..."
Abstract
-
Cited by 455 (2 self)
- Add to MetaCart
nearest-neighbors search, randomized kd-trees, hierarchical k-means tree, clustering. For many computer vision problems, the most time consuming component consists of nearest neighbor matching in high-dimensional spaces. There are no known exact algorithms for solving these high
X-means: Extending K-means with Efficient Estimation of the Number of Clusters
- In Proceedings of the 17th International Conf. on Machine Learning
, 2000
"... Despite its popularity for general clustering, K-means suffers three major shortcomings; it scales poorly computationally, the number of clusters K has to be supplied by the user, and the search is prone to local minima. We propose solutions for the first two problems, and a partial remedy for the t ..."
Abstract
-
Cited by 418 (5 self)
- Add to MetaCart
) measure. The innovations include two new ways of exploiting cached sufficient statistics and a new very efficient test that in one K-means sweep selects the most promising subset of classes for refinement. This gives rise to a fast, statistically founded algorithm that outputs both the number of classes
Cluster-Based Scalable Network Services
, 1997
"... This paper has benefited from the detailed and perceptive comments of our reviewers, especially our shepherd Hank Levy. We thank Randy Katz and Eric Anderson for their detailed readings of early drafts of this paper, and David Culler for his ideas on TACC's potential as a model for cluster prog ..."
Abstract
-
Cited by 400 (36 self)
- Add to MetaCart
This paper has benefited from the detailed and perceptive comments of our reviewers, especially our shepherd Hank Levy. We thank Randy Katz and Eric Anderson for their detailed readings of early drafts of this paper, and David Culler for his ideas on TACC's potential as a model for cluster
Results 1 - 10
of
8,786