Results 11  20
of
768,158
Learning the Kernel Matrix with SemiDefinite Programming
, 2002
"... Kernelbased learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information ..."
Abstract

Cited by 780 (22 self)
 Add to MetaCart
is contained in the socalled kernel matrix, a symmetric and positive definite matrix that encodes the relative positions of all points. Specifying this matrix amounts to specifying the geometry of the embedding space and inducing a notion of similarity in the input spaceclassical model selection
Indexing by latent semantic analysis
 JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
, 1990
"... A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higherorder structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The p ..."
Abstract

Cited by 3723 (35 self)
 Add to MetaCart
. The particular technique used is singularvalue decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries
Missing value estimation methods for DNA microarrays
, 2001
"... Motivation: Gene expression microarray experiments can generate data sets with multiple missing expression values. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. For example, methods such as hierarchical clustering and Kmeans clu ..."
Abstract

Cited by 476 (26 self)
 Add to MetaCart
Motivation: Gene expression microarray experiments can generate data sets with multiple missing expression values. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. For example, methods such as hierarchical clustering and K
Predicting Transmembrane Protein Topology with a Hidden Markov Model: Application to Complete Genomes
 J. MOL. BIOL
, 2001
"... ..."
Using Linear Algebra for Intelligent Information Retrieval
 SIAM REVIEW
, 1995
"... Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users' requests and those in or assigned to documents in a database. Because of the tremendous diversity in the words people use to describe the same document, lexical ..."
Abstract

Cited by 672 (18 self)
 Add to MetaCart
, lexical methods are necessarily incomplete and imprecise. Using the singular value decomposition (SVD), one can take advantage of the implicit higherorder structure in the association of terms with documents by determining the SVD of large sparse term by document matrices. Terms and documents represented
Stochastic Perturbation Theory
, 1988
"... . In this paper classical matrix perturbation theory is approached from a probabilistic point of view. The perturbed quantity is approximated by a firstorder perturbation expansion, in which the perturbation is assumed to be random. This permits the computation of statistics estimating the variatio ..."
Abstract

Cited by 886 (35 self)
 Add to MetaCart
and the eigenvalue problem. Key words. perturbation theory, random matrix, linear system, least squares, eigenvalue, eigenvector, invariant subspace, singular value AMS(MOS) subject classifications. 15A06, 15A12, 15A18, 15A52, 15A60 1. Introduction. Let A be a matrix and let F be a matrix valued function of A
Probabilistic Latent Semantic Analysis
 In Proc. of Uncertainty in Artificial Intelligence, UAI’99
, 1999
"... Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of twomode and cooccurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas. Compared to standard Latent Sema ..."
Abstract

Cited by 760 (9 self)
 Add to MetaCart
Semantic Analysis which stems from linear algebra and performs a Singular Value Decomposition of cooccurrence tables, the proposed method is based on a mixture decomposition derived from a latent class model. This results in a more principled approach which has a solid foundation in statistics. In order
Probabilistic Latent Semantic Indexing
, 1999
"... Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized ..."
Abstract

Cited by 1207 (11 self)
 Add to MetaCart
model is able to deal with domainspecific synonymy as well as with polysemous words. In contrast to standard Latent Semantic Indexing (LSI) by Singular Value Decomposition, the probabilistic variant has a solid statistical foundation and defines a proper generative data model. Retrieval experiments
For Most Large Underdetermined Systems of Linear Equations the Minimal ℓ1norm Solution is also the Sparsest Solution
 Comm. Pure Appl. Math
, 2004
"... We consider linear equations y = Φα where y is a given vector in R n, Φ is a given n by m matrix with n < m ≤ An, and we wish to solve for α ∈ R m. We suppose that the columns of Φ are normalized to unit ℓ 2 norm 1 and we place uniform measure on such Φ. We prove the existence of ρ = ρ(A) so that ..."
Abstract

Cited by 560 (10 self)
 Add to MetaCart
We consider linear equations y = Φα where y is a given vector in R n, Φ is a given n by m matrix with n < m ≤ An, and we wish to solve for α ∈ R m. We suppose that the columns of Φ are normalized to unit ℓ 2 norm 1 and we place uniform measure on such Φ. We prove the existence of ρ = ρ(A) so
A Comparative Study on Feature Selection in Text Categorization
, 1997
"... This paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods were evaluated, including term selection based on document frequency (DF), information gain (IG), mutual information (MI), ..."
Abstract

Cited by 1294 (15 self)
 Add to MetaCart
precision) . DF thresholding performed similarly. Indeed we found strong correlations between the DF, IG and CHI values of a term. This suggests that DF thresholding, the simplest method with the lowest cost in computation, can be reliably used instead of IG or CHI when the computation of these measures
Results 11  20
of
768,158