CiteSeerX

Results 1 - 10 of 8,786

Accurate Methods for the Statistics of Surprise and Coincidence

by Ted Dunning - COMPUTATIONAL LINGUISTICS, 1993
"... Much work has been done on the statistical analysis of text. In some cases reported in the literature, inappropriate statistical methods have been used, and statistical significance of results have not been addressed. In particular, asymptotic normality assumptions have often been used unjustifiably ..."
Cited by 1057 (1 self)
... small samples. These tests can be implemented efficiently, and have been used for the detection of composite terms and for the determination of domain-specific terms. In some cases, these measures perform much better than the methods previously used. In cases where traditional contingency table methods ...

Statistical pattern recognition: A review

by Anil K. Jain, Robert P. W. Duin, Jianchang Mao - IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2000
"... The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques ..."
Cited by 1035 (30 self)
... cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved ...

A Bayesian Framework for the Analysis of Microarray Expression Data: Regularized t-Test and Statistical Inferences of Gene Changes

by Pierre Baldi, Anthony D. Long - Bioinformatics, 2001
"... Motivation: DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory ..."
Cited by 491 (6 self)
... or fold methods, and partly compensate for the lack of replication. Availability: The approach is implemented in a software called Cyber-T accessible through a Web interface at www.genomics.uci.edu/software.html. The code is available as Open Source and is written in the freely available statistical ...

A Sequential Algorithm for Training Text Classifiers

by David D. Lewis, William A. Gale, 1994
"... The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was ..."
Cited by 631 (10 self)

Mega: molecular evolutionary genetic analysis software for microcomputers

by Sudhir Kumar, Koichiro Tamura, Masatoshi Nei - CABIOS, 1994
"... A computer program package called MEGA has been developed for estimating evolutionary distances, reconstructing phylogenetic trees and computing basic statistical quantities from molecular data. It is written in C++ and is intended to be used on IBM and IBM-compatible personal computers. In this pr ..."
Cited by 505 (10 self)
... In this program, various methods for estimating evolutionary distances from nucleotide and amino acid sequence data, three different methods of phylogenetic inference (UPGMA, neighbor-joining and maximum parsimony) and two statistical tests of topological differences are included. For the maximum parsimony method ...

Model-Based Analysis of Oligonucleotide Arrays: Model Validation, Design Issues and Standard Error Application

by Cheng Li, Wing Hung Wong, 2001
"... Background: A model-based analysis of oligonucleotide expression arrays we developed previously uses a probe-sensitivity index to capture the response characteristic of a specific probe pair and calculates model-based expression indexes (MBEI). MBEI has standard error attached to it as a measure of ..."
Cited by 775 (28 self)
... better ranking statistic for filtering genes. We can assign reliability indexes for genes in a specific cluster of interest in hierarchical clustering by resampling clustering trees. A software dChip implementing many of these analysis methods is made available. Conclusions: The model-based approach ...

Real-Time Computing Without Stable States: A New Framework for Neural Computation Based on Perturbations

by Wolfgang Maass, Thomas Natschläger, Henry Markram
"... A key challenge for neural modeling is to explain how a continuous stream of multi-modal input from a rapidly changing environment can be processed by stereotypical recurrent circuits of integrate-and-fire neurons in real-time. We propose a new computational model for real-time computing on time-var ..."
Cited by 469 (38 self)
... time-varying input that provides an alternative to paradigms based on Turing machines or attractor neural networks. It does not require a task-dependent construction of neural circuits. Instead it is based on principles of high dimensional dynamical systems in combination with statistical learning theory, and can ...

Fast approximate nearest neighbors with automatic algorithm configuration

by Marius Muja, David G. Lowe - In VISAPP International Conference on Computer Vision Theory and Applications, 2009
"... Keywords: nearest-neighbors search, randomized kd-trees, hierarchical k-means tree, clustering. Abstract: For many computer vision problems, the most time consuming component consists of nearest neighbor matching in high-dimensional spaces. There are no known exact algorithms for solving these high-dimensional problems ..."
Cited by 455 (2 self)

X-means: Extending K-means with Efficient Estimation of the Number of Clusters

by Dan Pelleg, Andrew Moore - In Proceedings of the 17th International Conf. on Machine Learning, 2000
"... Despite its popularity for general clustering, K-means suffers three major shortcomings; it scales poorly computationally, the number of clusters K has to be supplied by the user, and the search is prone to local minima. We propose solutions for the first two problems, and a partial remedy for the t ..."
Cited by 418 (5 self)
... ) measure. The innovations include two new ways of exploiting cached sufficient statistics and a new very efficient test that in one K-means sweep selects the most promising subset of classes for refinement. This gives rise to a fast, statistically founded algorithm that outputs both the number of classes ...

Cluster-Based Scalable Network Services

by Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, Paul Gauthier, 1997
"... This paper has benefited from the detailed and perceptive comments of our reviewers, especially our shepherd Hank Levy. We thank Randy Katz and Eric Anderson for their detailed readings of early drafts of this paper, and David Culler for his ideas on TACC's potential as a model for cluster prog ..."
Cited by 400 (36 self)

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University