Results 11 - 20 of 15,927
How much should we trust differences-in-differences estimates?
, 2003
"... Most papers that employ Differences-in-Differences estimation (DD) use many years of data and focus on serially correlated outcomes but ignore that the resulting standard errors are inconsistent. To illustrate the severity of this issue, we randomly generate placebo laws in state-level data on femal ..."
Cited by 828 (1 self)
into account the auto-correlation of the data) works well when the number of states is large enough. Two corrections based on asymptotic approximation of the variance-covariance matrix work well for moderate numbers of states and one correction that collapses the time series information into a “pre” and “post
The Dantzig selector: statistical estimation when p is much larger than n
, 2005
"... In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R^p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ ..."
Cited by 879 (14 self)
Biclustering algorithms for biological data analysis: a survey.
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
, 2004
"... A large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results of the application of standard clustering methods to genes are limited. These limited results are imposed by the existence of a num ..."
Cited by 481 (15 self)
Sequential minimal optimization: A fast algorithm for training support vector machines
- Advances in Kernel Methods-Support Vector Learning
, 1999
"... This paper proposes a new algorithm for training support vector machines: Sequential Minimal Optimization, or SMO. Training a support vector machine requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possi ..."
Cited by 461 (3 self)
possible QP problems. These small QP problems are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because matrix computation
Concept Decompositions for Large Sparse Text Data using Clustering
- Machine Learning
, 2000
"... Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as high-dimensional and sparse vectors--a few thousand dimensions and a sparsity of 95 to 99 ..."
Cited by 407 (27 self)
empirically demonstrate that, owing to the high-dimensionality and sparsity of the text data, the clusters produced by the algorithm have a certain "fractal-like" and "self-similar" behavior. As our second contribution, we introduce concept decompositions to approximate the matrix
On the distribution of the largest eigenvalue in principal components analysis
- ANN. STATIST
, 2001
"... Let x_(1) denote the square of the largest singular value of an n × p matrix X, all of whose entries are independent standard Gaussian variates. Equivalently, x_(1) is the largest principal component variance of the covariance matrix X′X, or the largest eigenvalue of a p-variate Wishart distribu ..."
Cited by 422 (4 self)
The grid file: an adaptable, symmetric multikey file structure
- In Trends in Information Processing Systems, Proc. 3rd ECI Conference, A. Duijvestijn and P. Lockemann, Eds., Lecture Notes in Computer Science 123
, 1981
"... Traditional file structures that provide multikey access to records, for example, inverted files, are extensions of file structures originally designed for single-key access. They manifest various deficiencies in particular for multikey access to highly dynamic files. We study the dynamic aspects of ..."
Cited by 426 (4 self)
of file structures that treat all keys symmetrically, that is, file structures which avoid the distinction between primary and secondary keys. We start from a bitmap approach and treat the problem of file design as one of data compression of a large sparse matrix. This leads to the notions of a grid
Using DryadLINQ for Large Matrix Operations
, 2011
"... DryadLINQ [7] is a system that facilitates the construction of distributed execution plans for processing large amounts of data on clusters containing potentially thousands of computers. In this paper, we explore how to use DryadLINQ to perform basic matrix operations on large matrices. DryadLINQ ..."
The Determinants of Credit Spread Changes.
- Journal of Finance
, 2001
"... Using dealer's quotes and transactions prices on straight industrial bonds, we investigate the determinants of credit spread changes. Variables that should in theory determine credit spread changes have rather limited explanatory power. Further, the residuals from this regression are ..."
Cited by 422 (2 self)
the probability of default. Changes in the Probability or Magnitude of a Downward Jump in Firm Value Implied volatility smiles in observed option prices suggest that markets account for the probability of large negative jumps in firm value. Thus, increases in either the probability or the magnitude of a negative
Online learning for matrix factorization and sparse coding
, 2010
"... Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the large-scale matrix factorization problem that consists of learning the basis set in order to ad ..."
Cited by 330 (31 self)