Results 1–10 of 68
Hyperfeatures – multilevel local coding for visual recognition
In ECCV, 2006
Abstract

Cited by 71 (2 self)
Abstract. Histograms of local appearance descriptors are a popular representation for visual recognition. They are highly discriminant and have good resistance to local occlusions and to geometric and photometric variations, but they are not able to exploit spatial co-occurrence statistics at scales larger than their local input patches. We present a new multilevel visual representation, ‘hyperfeatures’, that is designed to remedy this. The starting point is the familiar notion that to detect object parts, in practice it often suffices to detect co-occurrences of more local object fragments – a process that can be formalized as comparison (e.g. vector quantization) of image patches against a codebook of known fragments, followed by local aggregation of the resulting codebook membership vectors to detect co-occurrences. This process converts local collections of image descriptor vectors into somewhat less local histogram vectors – higher-level but spatially coarser descriptors. We observe that as the output is again a local descriptor vector, the process can be iterated, and that doing so captures and codes ever larger assemblies of object parts and increasingly abstract or ‘semantic’ image properties. We formulate the hyperfeatures model and study its performance under several different image coding methods including clustering-based Vector Quantization, Gaussian Mixtures, and combinations of these with Latent Dirichlet Allocation. We find that the resulting high-level features provide improved performance in several object image and texture image classification tasks.
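The coding-and-pooling loop this abstract describes — quantize local descriptors against a codebook, then pool codebook-membership histograms over neighbourhoods to form the next level's descriptors — can be sketched in a few lines. A toy NumPy illustration with hard vector quantization; the function names, grid layout, and 2×2 pooling window are my own choices, not from the paper:

```python
import numpy as np

def quantize(descriptors, codebook):
    """Assign each descriptor (one per row) to its nearest codebook centre."""
    # squared Euclidean distance to every centre, argmin per descriptor
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

def hyperfeature_level(grid, codebook, window=2):
    """One hyperfeature level: quantize an H x W x D grid of local
    descriptors, then pool codebook-membership histograms over
    window x window neighbourhoods. The output is a coarser grid whose
    'descriptors' are K-dim histograms (K = codebook size), so the
    same function can be applied again with a new codebook."""
    H, W, D = grid.shape
    K = codebook.shape[0]
    labels = quantize(grid.reshape(-1, D), codebook).reshape(H, W)
    Ho, Wo = H - window + 1, W - window + 1
    out = np.zeros((Ho, Wo, K))
    for i in range(Ho):
        for j in range(Wo):
            patch = labels[i:i + window, j:j + window].ravel()
            out[i, j] = np.bincount(patch, minlength=K) / patch.size
    return out
```

Because each level's output is again a grid of local descriptor vectors, iterating the function with fresh codebooks is exactly the "process can be iterated" observation in the abstract.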
Large-scale behavioral targeting
In ACM Conference on Knowledge Discovery and Data Mining (KDD), 2009
Abstract

Cited by 44 (2 self)
Behavioral targeting (BT) leverages historical user behavior to select the ads most relevant to users to display. The state of the art in BT derives a linear Poisson regression model from fine-grained user behavioral data and predicts click-through rate (CTR) from user history. We designed and implemented a highly scalable and efficient solution to BT using the Hadoop MapReduce framework. With our parallel algorithm and the resulting system, we can build over 450 BT-category models from Yahoo's entire user base within one day, a scale unattainable with prior systems. Moreover, our approach has yielded a 20% CTR lift over the existing production system by leveraging the well-grounded probabilistic model fitted from a much larger training dataset. Specifically, our major contributions include: (1) a MapReduce statistical learning algorithm and implementation that achieve optimal data parallelism, task parallelism, and load balance in spite of the typically skewed distribution of domain data; (2) an in-place feature vector generation algorithm with linear time complexity O(n) regardless of the granularity of the sliding target window; (3) an in-memory caching scheme that significantly reduces the number of disk I/Os to make large-scale learning practical; and (4) highly efficient data structures and sparse representations of models and data to enable fast model updates. We believe that our work makes significant contributions to solving large-scale machine learning problems of industrial relevance in general. Finally, we report comprehensive experimental results, using an industrial proprietary codebase and datasets.
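The underlying learner named here, linear Poisson regression on nonnegative count features, can be fitted with multiplicative EM-style updates. A serial toy sketch; nothing below reproduces the paper's Hadoop implementation, and the function and parameter names are illustrative:

```python
import numpy as np

def fit_linear_poisson(X, y, iters=200):
    """Multiplicative EM-style updates for a linear Poisson model
    E[y_i] = w . x_i with nonnegative features X, counts y, and
    nonnegative weights w (the kind of model the abstract describes;
    this toy version is serial, not MapReduce)."""
    n, d = X.shape
    # uniform nonnegative initialization at roughly the right scale
    w = np.full(d, y.mean() / max(X.mean() * d, 1e-12))
    col_sums = X.sum(axis=0)
    for _ in range(iters):
        rate = X @ w + 1e-12            # current predicted means
        w *= (X.T @ (y / rate)) / (col_sums + 1e-12)
    return w

def predict_ctr(w, x, views):
    """Expected CTR = predicted clicks / page views (illustrative)."""
    return (x @ w) / views
```

Each update multiplies every weight by the ratio of observed to predicted counts attributed to that feature, so weights stay nonnegative and the Poisson likelihood increases monotonically.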
Discrete Component Analysis
In Subspace, Latent Structure and Feature Selection Techniques, 2006
Abstract

Cited by 42 (5 self)
This article presents a unified theory for analysis of components in discrete data, and compares the methods with techniques such as independent component analysis, nonnegative matrix factorisation and latent Dirichlet allocation. The main families of algorithms discussed are a variational approximation, Gibbs sampling, and Rao-Blackwellised Gibbs sampling. Applications are presented for voting records from the United States Senate for 2003, and for the Reuters-21578 newswire collection.
A Spectral Algorithm for Latent Dirichlet Allocation
Abstract

Cited by 42 (10 self)
Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third-order moments, which may be estimated with documents containing just three words). The method, called Excess Correlation Analysis, is based on a spectral decomposition of low-order moments via two singular value decompositions (SVDs). Moreover, the algorithm is scalable, since the SVDs are carried out only on k × k matrices, where k is the number of latent factors (topics) and is typically much smaller than the dimension of the observation (word) space.
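The role of the first SVD can be seen in isolation: under the model, the population word-pair co-occurrence moment factors through the topic-word matrix and has rank at most k, so its top singular vectors span the topic subspace. A minimal sketch of that idea only; full Excess Correlation Analysis additionally whitens and uses the third-order moment, which this omits:

```python
import numpy as np

def topic_subspace(pair_moment, k):
    """SVD step: the population pair moment E[x1 x2^T] factors as
    B diag(c) B^T for a V x k topic-word matrix B with positive
    mixing weights c, so its top-k left singular vectors span the
    column space of B."""
    U, _, _ = np.linalg.svd(pair_moment)
    return U[:, :k]
```

Projecting any topic-word column onto the returned subspace recovers it exactly, which is the rank argument the abstract's scalability claim rests on: all subsequent work happens in k dimensions.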
Learning Author-Topic Models from Text Corpora
ACM Transactions on Information Systems, 2008
Abstract

Cited by 27 (2 self)
We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a two-stage stochastic process. An author is represented by a probability distribution over topics, and each topic is represented as a probability distribution over words. The probability distribution over topics in a multi-author paper is a mixture of the distributions associated with the authors. The topic-word and author-topic distributions are learned from data in an unsupervised manner using a Markov chain Monte Carlo algorithm. We apply the methodology to three large text corpora: 150,000 abstracts from the CiteSeer digital library, 1,740 papers from the Neural Information Processing Systems (NIPS) Conferences, and 121,000 emails from the Enron corporation. We discuss in detail the interpretation of the results discovered by the system including specific topic and author models, ranking of authors by topic and topics by author, parsing of abstracts by topics and authors, and detection of unusual papers by specific authors. Experiments based on perplexity scores for test documents and precision-recall for document retrieval are used to illustrate systematic differences between the proposed author-topic model and a number of alternatives. Extensions to the model, allowing (for example) generalizations of the notion of an author, are also briefly discussed.
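The two-stage generative process described above can be written down directly as a sampler. A hedged sketch with made-up variable names; the paper's actual contribution, the MCMC inference that learns these distributions, is omitted:

```python
import numpy as np

def generate_doc(authors, theta, phi, n_words, rng):
    """Sample one document from the author-topic generative process:
    each word picks a uniformly random author from the paper's author
    list, then a topic z from that author's topic distribution
    (a row of theta), then a word from that topic's word distribution
    (a row of phi)."""
    words = []
    for _ in range(n_words):
        a = rng.choice(authors)                      # stage 1: author
        z = rng.choice(len(phi), p=theta[a])         # stage 2: topic
        w = rng.choice(phi.shape[1], p=phi[z])       # emit a word
        words.append(int(w))
    return words
```

Averaging over the author choice is what makes a multi-author paper's topic distribution a mixture of its authors' distributions, as the abstract states.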
The Infinite Gamma-Poisson Feature Model
Abstract

Cited by 24 (0 self)
We present a probability distribution over nonnegative integer-valued matrices with possibly an infinite number of columns. We also derive a stochastic process that reproduces this distribution over equivalence classes. This model can play the role of the prior in nonparametric Bayesian learning scenarios where multiple latent features are associated with the observed data and each feature can have multiple appearances or occurrences within each data point. Such data arise naturally when learning visual object recognition systems from unlabelled images. Together with the nonparametric prior we consider a likelihood model that explains the visual appearance and location of local image patches. Inference with this model is carried out using a Markov chain Monte Carlo algorithm.
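A finite-K version of such a prior is easy to simulate and shows why the infinite limit is well behaved: only a handful of columns are ever non-zero. A sketch, assuming the usual Gamma(α/K, 1) finite construction; the parameter names are mine, and the paper's likelihood model and MCMC inference are omitted:

```python
import numpy as np

def sample_gamma_poisson_matrix(n, K, alpha, rng):
    """Finite-K approximation to an infinite gamma-Poisson feature
    model: per-column rates lam_k ~ Gamma(alpha/K, 1), then integer
    feature counts Z[i, k] ~ Poisson(lam_k). As K grows, the number
    of columns with any non-zero entry stays finite (roughly
    alpha * log(1 + n) in expectation), which is what makes taking
    K -> infinity sensible."""
    lam = rng.gamma(alpha / K, 1.0, size=K)
    return rng.poisson(lam, size=(n, K))
```

Unlike the beta-Bernoulli (Indian buffet) construction, entries are counts rather than bits, matching the "multiple appearances or occurrences within each data point" the abstract emphasizes.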
Multilevel image coding with hyperfeatures
International Journal of Computer Vision, 2008
Beta-negative binomial process and Poisson factor analysis
In AISTATS, 2012
Abstract

Cited by 19 (8 self)
A beta-negative binomial (BNB) process is proposed, leading to a beta-gamma-Poisson process, which may be viewed as a “multi-scoop” generalization of the beta-Bernoulli process. The BNB process is augmented into a beta-gamma-gamma-Poisson hierarchical structure, and applied as a nonparametric Bayesian prior for an infinite Poisson factor analysis model. A finite approximation for the beta process Lévy random measure is constructed for convenient implementation. Efficient MCMC computations are performed with data augmentation and marginalization techniques. Encouraging results are shown on document count matrix factorization.
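The augmentation this abstract builds on starts from the gamma-Poisson mixture representation of the negative binomial distribution. A quick simulation check of that identity; the parameterization below follows the common mean r·p/(1−p) convention, which may differ from the paper's:

```python
import numpy as np

def nb_via_gamma_poisson(r, p, size, rng):
    """Negative binomial via its gamma-Poisson mixture representation:
    draw a rate lam ~ Gamma(shape=r, scale=p/(1-p)), then
    x ~ Poisson(lam). Marginally x is NB(r, p) with mean r*p/(1-p)
    and variance r*p/(1-p)^2."""
    lam = rng.gamma(r, p / (1 - p), size=size)
    return rng.poisson(lam)
```

Mixing the Poisson rate through a gamma (and the gamma's parameters through further beta/gamma layers) is exactly the kind of hierarchy the beta-gamma-gamma-Poisson structure stacks up.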
Predicting user tasks: I know what you're doing
In 20th National Conference on Artificial Intelligence (AAAI-05), Workshop on Human Comprehensible Machine Learning, 2005
Abstract

Cited by 19 (2 self)
Knowledge workers spend the majority of their working hours processing and manipulating information. These users face continual costs as they switch between tasks to retrieve and create information. The TaskTracer project at Oregon State University is investigating the possibilities of a desktop software system that will record in detail how knowledge workers complete tasks, and intelligently leverage that information to increase efficiency and productivity. Our approach combines human-computer interaction and machine learning to assign each observed action (opening a file, saving a file, sending an email, cutting and pasting information, etc.) to a task for which it is likely being performed. In this paper we report on ways we have applied machine learning in this environment and lessons learned so far.
Multiscale Topic Tomography
In ACM KDD, 2007
Abstract

Cited by 18 (3 self)
Modeling the evolution of topics with time is of great value in automatic summarization and analysis of large document collections. In this work, we propose a new probabilistic graphical model to address this issue. The new model, which we call the Multiscale Topic Tomography Model (MTTM), employs non-homogeneous Poisson processes to model the generation of word counts. The evolution of topics is modeled through a multiscale analysis using Haar wavelets. One of the new features of the model is that it captures the evolution of topics at various timescales of resolution, allowing the user to zoom in and out of the timescales. Our experiments on Science data using the new model uncover some interesting patterns in topics. The new model is also comparable to LDA in predicting unseen data, as demonstrated by our perplexity experiments.
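The Haar multiscale analysis of a word-count time series can be sketched as a simple averaging pyramid: each level halves the time resolution, and the detail coefficients record what the coarsening loses. An unnormalized Haar variant for readability; the paper's estimator ties such coefficients to Poisson process parameters, which this omits:

```python
import numpy as np

def haar_pyramid(counts):
    """Multiscale view of a time series of word counts. Each level
    averages adjacent bins (the 'zoomed-out' series); details[j]
    holds the half-differences lost going from level j to j+1.
    The input length must be a power of two."""
    levels = [np.asarray(counts, dtype=float)]
    details = []
    while len(levels[-1]) > 1:
        c = levels[-1]
        details.append((c[0::2] - c[1::2]) / 2.0)
        levels.append((c[0::2] + c[1::2]) / 2.0)
    return levels, details
```

Reading the levels from last to first is the "zoom in and out of the timescales" operation: the coarsest level is the overall average, and each finer level restores one octave of temporal detail.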