Results 11 - 20 of 15,927

How much should we trust differences-in-differences estimates?

by Marianne Bertrand, Esther Duflo, Sendhil Mullainathan, 2003
"... Most papers that employ Differences-in-Differences estimation (DD) use many years of data and focus on serially correlated outcomes but ignore that the resulting standard errors are inconsistent. To illustrate the severity of this issue, we randomly generate placebo laws in state-level data on femal ..."
Cited by 828 (1 self)
... into account the auto-correlation of the data) works well when the number of states is large enough. Two corrections based on asymptotic approximation of the variance-covariance matrix work well for moderate numbers of states and one correction that collapses the time series information into a “pre” and “post” ...
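The placebo exercise the snippet describes is easy to reproduce in miniature. A hedged sketch, with synthetic AR(1) state panels standing in for the paper's wage data (all sizes, the AR coefficient, and the naive OLS standard error are illustrative choices, not the authors' exact setup):

    import numpy as np

    rng = np.random.default_rng(0)
    n_states, n_years, n_sims = 50, 21, 200
    rejections = 0

    for _ in range(n_sims):
        # Serially correlated state-level outcomes: AR(1) within each state
        eps = rng.normal(size=(n_states, n_years))
        y = np.zeros((n_states, n_years))
        for t in range(1, n_years):
            y[:, t] = 0.8 * y[:, t - 1] + eps[:, t]

        # Placebo "law": a random half of states treated from a random year on
        treated = rng.choice(n_states, n_states // 2, replace=False)
        start = rng.integers(5, n_years - 5)
        d = np.zeros((n_states, n_years))
        d[treated, start:] = 1.0

        # Two-way fixed-effects DD estimate via the within transformation
        yt = y - y.mean(1, keepdims=True) - y.mean(0) + y.mean()
        dt = d - d.mean(1, keepdims=True) - d.mean(0) + d.mean()
        beta = (dt * yt).sum() / (dt * dt).sum()
        resid = yt - beta * dt
        # Naive OLS standard error, ignoring the serial correlation --
        # exactly the practice the paper shows to be misleading
        se = np.sqrt((resid ** 2).sum() / (n_states * n_years - 2) / (dt * dt).sum())
        rejections += abs(beta / se) > 1.96

    # With iid errors this would be near 0.05; serial correlation pushes it far higher
    print("placebo rejection rate:", rejections / n_sims)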

The Dantzig selector: statistical estimation when p is much larger than n

by Emmanuel Candes, Terence Tao, 2005
"... In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ ..."
Cited by 879 (14 self)
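The estimator named in the title has a clean operational form: minimize ||x||_1 subject to ||A'(y - Ax)||_inf <= lambda, which is a linear program. A small sketch with scipy (the dimensions, noise level, and lambda below are arbitrary choices for illustration):

    import numpy as np
    from scipy.optimize import linprog

    def dantzig_selector(A, y, lam):
        """min ||x||_1  s.t.  ||A.T @ (y - A @ x)||_inf <= lam,
        as an LP via the standard split x = u - v with u, v >= 0."""
        p = A.shape[1]
        G, b = A.T @ A, A.T @ y
        c = np.ones(2 * p)                          # objective: sum(u) + sum(v)
        # -lam <= b - G(u - v) <= lam, written as two stacks of inequalities
        A_ub = np.vstack([np.hstack([G, -G]), np.hstack([-G, G])])
        b_ub = np.concatenate([b + lam, lam - b])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
        return res.x[:p] - res.x[p:]

    rng = np.random.default_rng(1)
    n, p, k = 40, 100, 5                            # n << p, k-sparse truth
    A = rng.normal(size=(n, p)) / np.sqrt(n)
    x_true = np.zeros(p)
    x_true[:k] = 3.0
    y = A @ x_true + 0.1 * rng.normal(size=n)
    x_hat = dantzig_selector(A, y, lam=0.3)
    print("largest recovered coefficients:", np.round(np.sort(np.abs(x_hat))[-k:], 2))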

Biclustering algorithms for biological data analysis: a survey.

by Sara C. Madeira, Arlindo L. Oliveira - IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2004
"... Abstract A large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results of the application of standard clustering methods to genes are limited. These limited results are imposed by the existence of a num ..."
Cited by 481 (15 self)
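Among the bicluster quality measures the survey discusses, Cheng and Church's mean squared residue is the easiest to state. A sketch on synthetic data (the planted additive bicluster and index sets are hypothetical):

    import numpy as np

    def mean_squared_residue(X, rows, cols):
        """Cheng-Church mean squared residue of the bicluster X[rows, cols];
        it is 0 when every entry is a row effect plus a column effect."""
        B = X[np.ix_(rows, cols)]
        H = B - B.mean(axis=1, keepdims=True) - B.mean(axis=0, keepdims=True) + B.mean()
        return float((H ** 2).mean())

    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 30))
    r, c = np.arange(10), np.arange(5)
    # Plant an additive (row effect + column effect) bicluster
    X[np.ix_(r, c)] = rng.normal(size=(10, 1)) + rng.normal(size=(1, 5))
    print("planted bicluster MSR:", mean_squared_residue(X, r, c))        # ~0
    print("random submatrix MSR:", mean_squared_residue(X, r + 40, c + 20))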

Sequential minimal optimization: A fast algorithm for training support vector machines

by John C. Platt - Advances in Kernel Methods - Support Vector Learning, 1999
"... This paper proposes a new algorithm for training support vector machines: Sequential Minimal Optimization, or SMO. Training a support vector machine requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possi ..."
Cited by 461 (3 self)
... possible QP problems. These small QP problems are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because matrix computation ...
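The analytic two-multiplier step is the heart of SMO. A condensed sketch of a simplified variant (random choice of the second multiplier instead of Platt's full selection heuristics, linear kernel only; the clipping bounds and threshold updates follow the paper):

    import numpy as np

    def smo_train(X, y, C=1.0, tol=1e-3, max_passes=5):
        """Simplified SMO for a linear-kernel SVM, labels in {-1, +1}.
        Each step optimizes exactly two Lagrange multipliers analytically."""
        n = X.shape[0]
        K = X @ X.T
        alpha, b = np.zeros(n), 0.0
        passes = 0
        while passes < max_passes:
            changed = 0
            for i in range(n):
                Ei = (alpha * y) @ K[:, i] + b - y[i]
                if (y[i] * Ei < -tol and alpha[i] < C) or (y[i] * Ei > tol and alpha[i] > 0):
                    j = np.random.randint(n - 1)
                    j += (j >= i)                     # pick some j != i
                    Ej = (alpha * y) @ K[:, j] + b - y[j]
                    ai, aj = alpha[i], alpha[j]
                    # Box [L, H] that keeps the equality constraint feasible
                    if y[i] != y[j]:
                        L, H = max(0, aj - ai), min(C, C + aj - ai)
                    else:
                        L, H = max(0, ai + aj - C), min(C, ai + aj)
                    eta = 2 * K[i, j] - K[i, i] - K[j, j]
                    if L == H or eta >= 0:
                        continue
                    alpha[j] = np.clip(aj - y[j] * (Ei - Ej) / eta, L, H)
                    if abs(alpha[j] - aj) < 1e-5:
                        continue
                    alpha[i] = ai + y[i] * y[j] * (aj - alpha[j])
                    # Threshold update (Platt's b1/b2 rules)
                    b1 = b - Ei - y[i] * (alpha[i] - ai) * K[i, i] \
                         - y[j] * (alpha[j] - aj) * K[i, j]
                    b2 = b - Ej - y[i] * (alpha[i] - ai) * K[i, j] \
                         - y[j] * (alpha[j] - aj) * K[j, j]
                    b = b1 if 0 < alpha[i] < C else (b2 if 0 < alpha[j] < C else (b1 + b2) / 2)
                    changed += 1
            passes = passes + 1 if changed == 0 else 0
        return (alpha * y) @ X, b                     # weight vector, threshold

    X = np.vstack([np.random.randn(20, 2) + 2, np.random.randn(20, 2) - 2])
    y = np.array([1.0] * 20 + [-1.0] * 20)
    w, b = smo_train(X, y)
    print("training accuracy:", np.mean(np.sign(X @ w + b) == y))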

Concept Decompositions for Large Sparse Text Data using Clustering

by Inderjit S. Dhillon, Dharmendra S. Modha - Machine Learning, 2000
"... . Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as high-dimensional and sparse vectors--a few thousand dimensions and a sparsity of 95 to 99 ..."
Cited by 407 (27 self)
... empirically demonstrate that, owing to the high-dimensionality and sparsity of the text data, the clusters produced by the algorithm have a certain "fractal-like" and "self-similar" behavior. As our second contribution, we introduce concept decompositions to approximate the matrix ...
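The clustering algorithm here is spherical k-means, and a concept decomposition approximates each document as a least-squares combination of the resulting concept vectors. A sketch with a random nonnegative matrix standing in for tf-idf rows (all sizes hypothetical):

    import numpy as np

    def spherical_kmeans(Xn, k, iters=20, seed=0):
        """Spherical k-means on unit-norm rows of Xn: assign by cosine
        similarity, recompute each concept vector as the normalized mean."""
        rng = np.random.default_rng(seed)
        C = Xn[rng.choice(len(Xn), k, replace=False)]
        for _ in range(iters):
            labels = np.argmax(Xn @ C.T, axis=1)
            for j in range(k):
                members = Xn[labels == j]
                if len(members):
                    m = members.sum(axis=0)
                    C[j] = m / np.linalg.norm(m)
        return C

    rng = np.random.default_rng(1)
    X = np.abs(rng.normal(size=(200, 50)))           # stand-in for tf-idf data
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    C = spherical_kmeans(Xn, k=8)
    # Concept decomposition: least-squares approximation of Xn in span(C)
    Z, *_ = np.linalg.lstsq(C.T, Xn.T, rcond=None)
    print("relative error:", np.linalg.norm(Xn - (C.T @ Z).T) / np.linalg.norm(Xn))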

On the distribution of the largest eigenvalue in principal components analysis

by Iain M. Johnstone - Ann. Statist., 2001
"... Let x �1 � denote the square of the largest singular value of an n × p matrix X, all of whose entries are independent standard Gaussian variates. Equivalently, x �1 � is the largest principal component variance of the covariance matrix X ′ X, or the largest eigenvalue of a p-variate Wishart distribu ..."
Cited by 422 (4 self)
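Johnstone's result is that, suitably centred and scaled, x_(1) follows the Tracy-Widom law of order 1; the centring and scaling constants below are the ones from the paper. A quick Monte Carlo sketch (sample sizes arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, reps = 200, 80, 500
    mu = (np.sqrt(n - 1) + np.sqrt(p)) ** 2
    sigma = (np.sqrt(n - 1) + np.sqrt(p)) * (1 / np.sqrt(n - 1) + 1 / np.sqrt(p)) ** (1 / 3)

    samples = np.empty(reps)
    for r in range(reps):
        X = rng.normal(size=(n, p))
        # Largest eigenvalue of X'X = squared largest singular value of X
        samples[r] = np.linalg.svd(X, compute_uv=False)[0] ** 2

    z = (samples - mu) / sigma
    # The Tracy-Widom(1) law has mean ~ -1.21 and standard deviation ~ 1.27
    print(f"centred/scaled largest eigenvalue: mean {z.mean():.2f}, sd {z.std():.2f}")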

The grid file: an adaptable, symmetric multikey file structure

by J. Nievergelt, H. Hinterberger, K. C. Sevcik - In Trends in Information Processing Systems, Proc. 3rd ECI Conference, A. Duijvestijn and P. Lockemann, Eds., Lecture Notes in Computer Science 123, 1981
"... Traditional file structures that provide multikey access to records, for example, inverted files, are extensions of file structures originally designed for single-key access. They manifest various deficiencies in particular for multikey access to highly dynamic files. We study the dynamic aspects of ..."
Cited by 426 (4 self)
... of file structures that treat all keys symmetrically, that is, file structures which avoid the distinction between primary and secondary keys. We start from a bitmap approach and treat the problem of file design as one of data compression of a large sparse matrix. This leads to the notions of a grid ...
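The two ingredients the abstract alludes to are linear scales (one per key) and a grid directory of cells mapped to buckets. A toy in-memory sketch (bucket capacity and split policy are simplifications; a real grid file shares buckets between cells so that only the overflowing bucket is physically split, while this toy splits a whole grid slice):

    import bisect

    class GridFile2D:
        """Toy grid file over two keys, treated symmetrically. Linear scales
        (sorted split points per axis) define a grid of cells; each cell
        holds a bucket of (x, y, value) records."""

        def __init__(self, capacity=4):
            self.capacity = capacity
            self.xscale, self.yscale = [], []    # linear scales
            self.grid = [[[]]]                   # grid[i][j] -> bucket

        def _cell(self, x, y):
            return bisect.bisect(self.xscale, x), bisect.bisect(self.yscale, y)

        def lookup(self, x, y):
            i, j = self._cell(x, y)
            return [v for (rx, ry, v) in self.grid[i][j] if (rx, ry) == (x, y)]

        def insert(self, x, y, value):
            i, j = self._cell(x, y)
            self.grid[i][j].append((x, y, value))
            if len(self.grid[i][j]) > self.capacity:
                self._split(i, j)

        def _split(self, i, j):
            bucket = self.grid[i][j]
            if len(self.xscale) <= len(self.yscale):       # refine the x scale
                cut = sorted(r[0] for r in bucket)[len(bucket) // 2]
                if cut in self.xscale:
                    return
                bisect.insort(self.xscale, cut)
                new_slice = []
                for cell in self.grid[i]:
                    new_slice.append([r for r in cell if r[0] >= cut])
                    cell[:] = [r for r in cell if r[0] < cut]
                self.grid.insert(i + 1, new_slice)
            else:                                           # refine the y scale
                cut = sorted(r[1] for r in bucket)[len(bucket) // 2]
                if cut in self.yscale:
                    return
                bisect.insort(self.yscale, cut)
                for row in self.grid:
                    row.insert(j + 1, [r for r in row[j] if r[1] >= cut])
                    row[j] = [r for r in row[j] if r[1] < cut]

    gf = GridFile2D()
    for x in range(8):
        for y in range(8):
            gf.insert(x, y, f"rec-{x}-{y}")
    print(gf.lookup(3, 5))                                  # -> ['rec-3-5']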

Using DryadLINQ for Large Matrix Operations

by Thomas L. Rodeheffer, Frank McSherry, 2011
"... DryadLINQ [7] is a system that facilitates the construc-tion of distributed execution plans for processing large amounts of data on clusters containing potentially thou-sands of computers. In this paper, we explore how to use DryadLINQ to perform basic matrix operations on large matrices. DryadLINQ ..."
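DryadLINQ itself is not generally available, but the plan such a system would build for matrix multiplication decomposes the operands into tiles whose partial products are independent tasks. A sketch of that tiling in plain numpy (block size arbitrary; the distribution layer is elided and the tiles simply run in a loop):

    import numpy as np
    from itertools import product

    def blocked_matmul(A, B, bs):
        """Tile-based matrix multiply: each per-tile partial product is an
        independent task that a DryadLINQ-style plan could ship to a worker."""
        n, k = A.shape
        k2, m = B.shape
        assert k == k2
        C = np.zeros((n, m))
        for i, j, l in product(range(0, n, bs), range(0, m, bs), range(0, k, bs)):
            C[i:i+bs, j:j+bs] += A[i:i+bs, l:l+bs] @ B[l:l+bs, j:j+bs]
        return C

    rng = np.random.default_rng(0)
    A, B = rng.normal(size=(96, 64)), rng.normal(size=(64, 80))
    assert np.allclose(blocked_matmul(A, B, bs=32), A @ B)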

The Determinants of Credit Spread Changes.

by Pierre Collin-Dufresne, Robert S. Goldstein, J. Spencer Martin - Journal of Finance, 2001
"... ABSTRACT Using dealer's quotes and transactions prices on straight industrial bonds, we investigate the determinants of credit spread changes. Variables that should in theory determine credit spread changes have rather limited explanatory power. Further, the residuals from this regression are ..."
Cited by 422 (2 self)
... the probability of default. Changes in the Probability or Magnitude of a Downward Jump in Firm Value: implied volatility smiles in observed option prices suggest that markets account for the probability of large negative jumps in firm value. Thus, increases in either the probability or the magnitude of a negative ...
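The paper's two steps - regress spread changes on candidate determinants, then apply principal components analysis to the residuals - can be mimicked on synthetic data. A sketch (the factor structure and all magnitudes are invented, purely to show the mechanics):

    import numpy as np

    rng = np.random.default_rng(0)
    n_bonds, n_months = 25, 120
    # Hypothetical "determinants" (stand-ins for, e.g., changes in the
    # riskless rate and in volatility) plus a latent common shock
    factors = rng.normal(size=(n_months, 2))
    common = rng.normal(size=n_months)
    dspread = (0.3 * factors @ rng.normal(size=(2, n_bonds))
               + np.outer(common, np.ones(n_bonds))
               + 0.5 * rng.normal(size=(n_months, n_bonds)))

    # Regress each bond's spread changes on the candidate determinants
    X = np.hstack([np.ones((n_months, 1)), factors])
    beta, *_ = np.linalg.lstsq(X, dspread, rcond=None)
    resid = dspread - X @ beta

    # PCA on the residuals: how much does the first component explain?
    s = np.linalg.svd(resid - resid.mean(0), compute_uv=False)
    print("rough overall R^2:", 1 - resid.var() / dspread.var())
    print("first PC's share of residual variance:", s[0] ** 2 / (s ** 2).sum())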

Online learning for matrix factorization and sparse coding

by Julien Mairal, Francis Bach, Jean Ponce, Guillermo Sapiro, 2010
"... Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the large-scale matrix factorization problem that consists of learning the basis set in order to ad ..."
Cited by 330 (31 self)
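The paper's online algorithm alternates a sparse-coding step with a block-coordinate dictionary update driven by accumulated sufficient statistics. A condensed single-pass sketch (plain ISTA replaces the paper's LARS-based sparse coding, and mini-batching is omitted):

    import numpy as np

    def ista(D, x, lam, iters=50):
        """Sparse coding: minimize 0.5*||x - D a||^2 + lam*||a||_1 by ISTA."""
        L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
        a = np.zeros(D.shape[1])
        for _ in range(iters):
            g = a - (D.T @ (D @ a - x)) / L
            a = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0)
        return a

    def online_dict_learning(X, k, lam=0.1, seed=0):
        """One pass of online dictionary learning: code each sample, accumulate
        sufficient statistics A and B, then update atoms one at a time."""
        rng = np.random.default_rng(seed)
        m = X.shape[1]
        D = rng.normal(size=(m, k))
        D /= np.linalg.norm(D, axis=0)
        A, B = np.zeros((k, k)), np.zeros((m, k))
        for x in X:
            a = ista(D, x, lam)
            A += np.outer(a, a)
            B += np.outer(x, a)
            for j in range(k):                   # block-coordinate dictionary update
                if A[j, j] > 1e-10:
                    u = (B[:, j] - D @ A[:, j]) / A[j, j] + D[:, j]
                    D[:, j] = u / max(np.linalg.norm(u), 1.0)
        return D

    rng = np.random.default_rng(1)
    X = rng.normal(size=(300, 20))               # 300 signals of dimension 20
    D = online_dict_learning(X, k=30)
    print("dictionary shape:", D.shape)          # (20, 30), overcomplete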