Results 1–10 of 14
Graph-based Submodular Selection for Extractive Summarization
, 2009
Abstract

Cited by 11 (5 self)
We propose a novel approach for unsupervised extractive summarization. Our approach builds a semantic graph for the document to be summarized. Summary extraction is then formulated as optimizing submodular functions defined on the semantic graph. The optimization is theoretically guaranteed to be near-optimal under the framework of submodularity. Extensive experiments on the ICSI meeting summarization task on both human transcripts and automatic speech recognition (ASR) outputs show that the graph-based submodular selection approach consistently outperforms the maximum marginal relevance (MMR) approach, a concept-based approach using integer linear programming (ILP), and a recursive graph-based ranking algorithm using Google’s PageRank.
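The near-optimality guarantee alluded to above is the classic (1 - 1/e) bound for greedy maximization of a monotone submodular function under a cardinality constraint (Nemhauser et al.). A minimal sketch of that greedy loop; the facility-location-style coverage objective and the toy similarity matrix are illustrative assumptions, not the paper's actual semantic-graph construction:

```python
# Greedy maximization of a monotone submodular objective under a
# cardinality budget; guaranteed to reach >= (1 - 1/e) of the optimum.

def coverage(selected, sim):
    # Facility-location objective: each sentence is "covered" by its most
    # similar selected sentence. Monotone and submodular in `selected`.
    return sum(max((sim[i][j] for j in selected), default=0.0)
               for i in range(len(sim)))

def greedy_summarize(sim, k):
    selected, candidates = [], set(range(len(sim)))
    while len(selected) < k and candidates:
        # take the sentence with the largest marginal gain
        # (sorted() makes tie-breaking deterministic)
        best = max(sorted(candidates),
                   key=lambda j: coverage(selected + [j], sim)
                               - coverage(selected, sim))
        selected.append(best)
        candidates.remove(best)
    return selected

# toy pairwise sentence-similarity matrix for a 4-sentence "document"
sim = [[1.0, 0.8, 0.1, 0.0],
       [0.8, 1.0, 0.2, 0.1],
       [0.1, 0.2, 1.0, 0.7],
       [0.0, 0.1, 0.7, 1.0]]
print(greedy_summarize(sim, 2))  # picks one sentence from each cluster
```

A real summarizer would cache marginal gains (lazy greedy) rather than re-evaluating the coverage function from scratch at every step.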
Optimal Selection of Limited Vocabulary Speech Corpora
Abstract

Cited by 11 (6 self)
We address the problem of finding a subset of a large speech data corpus that is useful for accurately and rapidly prototyping novel and computationally expensive speech recognition architectures. To solve this problem, we express it as an optimization problem over submodular functions. Quantities such as the vocabulary size (or quality) of a set of utterances, or the quality of a bundle of word types, are submodular functions, which makes finding optimal solutions possible. We are, moreover, able to express our approach using graph cuts, leading to a very fast implementation even on large initial corpora. We show results on the Switchboard-I corpus, demonstrating improved results over previous techniques for this purpose. We also demonstrate the variety of the resulting corpora that may be produced using our method. Index Terms: corpus subset selection, submodularity, LVCSR
Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints: Extended arXiv version
, 2013
Abstract

Cited by 11 (7 self)
We investigate two new optimization problems — minimizing a submodular function subject to a submodular lower bound constraint (submodular cover) and maximizing a submodular function subject to a submodular upper bound constraint (submodular knapsack). We are motivated by a number of real-world applications in machine learning including sensor placement and data subset selection, which require maximizing a certain submodular function (like coverage or diversity) while simultaneously minimizing another (like cooperative cost). These problems are often posed as minimizing the difference between submodular functions [9, 25], which is in the worst case inapproximable. We show, however, that by phrasing these problems as constrained optimization, which is more natural for many applications, we achieve a number of bounded approximation guarantees. We also show that both these problems are closely related, and an approximation algorithm solving one can be used to obtain an approximation guarantee for the other. We provide hardness results for both problems, thus showing that our approximation factors are tight up to log factors. Finally, we empirically demonstrate the performance and good scalability properties of our algorithms.
Machine Learning Paradigms for Speech Recognition: An Overview
, 2013
Abstract

Cited by 7 (1 self)
Automatic Speech Recognition (ASR) has historically been a driving force behind many machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, structured sequence learning, Bayesian learning, and adaptive learning. Moreover, ML can and occasionally does use ASR as a large-scale, realistic application to rigorously test the effectiveness of a given technique, and to inspire new problems arising from the inherently sequential and dynamic nature of speech. On the other hand, even though ASR is available commercially for some applications, it is largely an unsolved problem — for almost all applications, the performance of ASR is not on par with human performance. New insight from modern ML methodology shows great promise to advance the state of the art in ASR technology. This overview article provides readers with an overview of modern ML techniques as utilized in current, and as relevant to future, ASR research and systems. The intent is to foster further cross-pollination between the ML and ASR communities than has occurred in the past. The article is organized according to the major ML paradigms that are either popular already or have potential for making significant contributions to ASR technology. The paradigms presented and elaborated in this overview include: generative and discriminative learning; supervised, unsupervised, semi-supervised, and active learning; adaptive and multi-task learning; and Bayesian learning. These learning paradigms are motivated and discussed in the context of ASR technology and applications. We finally present and analyze recent developments in deep learning and learning with sparse representations, focusing on their direct relevance to advancing ASR technology.
Fast Semidifferential-based Submodular Function Optimization
Abstract

Cited by 7 (2 self)
We present a practical and powerful new framework for both unconstrained and constrained submodular function optimization based on discrete semidifferentials (sub- and super-differentials). The resulting algorithms, which repeatedly compute and then efficiently optimize submodular semigradients, offer new and generalize many old methods for submodular optimization. Our approach, moreover, takes steps towards providing a unifying paradigm applicable to both submodular minimization and maximization, problems that historically have been treated quite distinctly. The practicality of our algorithms is important since interest in submodularity, owing to its natural and wide applicability, has recently been in ascendance within machine learning. We analyze theoretical properties of our algorithms for minimization and maximization, and show that many state-of-the-art maximization algorithms are special cases. Lastly, we complement our theoretical analyses with supporting empirical experiments.
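One way to picture the semigradient idea on the maximization side: any permutation of the ground set induces a modular lower bound on f through its chain of marginal gains, and a modular function is trivial to optimize exactly (under a cardinality constraint, keep the k largest entries). Iterating this minorize-maximize loop gives a simple local search. The sketch below is a hedged illustration only; the permutation heuristic and the toy concept-coverage objective are assumptions standing in for the paper's actual algorithms:

```python
# Semigradient-style maximization: alternate between (1) building a modular
# lower bound of f from chain marginal gains along a permutation, and
# (2) exactly maximizing that bound under the cardinality constraint.

def modular_lower_bound(f, perm):
    # h(j) = f(prefix + j) - f(prefix); with f(empty) = 0, submodularity
    # makes h a modular lower bound on f that is tight on chain prefixes.
    h, prefix, prev = {}, [], f([])
    for j in perm:
        cur = f(prefix + [j])
        h[j] = cur - prev
        prefix.append(j)
        prev = cur
    return h

def semigradient_maximize(f, ground, k, iters=20):
    S = list(ground[:k])              # arbitrary feasible starting set
    for _ in range(iters):
        # put the current solution first so the bound is tight at S
        perm = S + [j for j in ground if j not in S]
        h = modular_lower_bound(f, perm)
        # maximizing a modular function under |S| <= k: take top-k entries
        new_S = sorted(ground, key=lambda j: h[j], reverse=True)[:k]
        if set(new_S) == set(S):
            break                     # fixed point of the scheme
        S = new_S                     # f(S) never decreases across steps
    return sorted(S)

# toy monotone submodular objective: number of distinct "concepts" covered
concepts = {0: {"a", "b"}, 1: {"b"}, 2: {"c", "d"}, 3: {"d"}}
f = lambda S: len(set().union(*(concepts[j] for j in S)))
print(semigradient_maximize(f, [0, 1, 2, 3], 2))
```

Because the bound is tight at the current solution and lower-bounds f everywhere else, each accepted step can only increase f(S), which is what makes the loop a well-defined ascent procedure.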
Using Document Summarization Techniques for Speech Data Subset Selection
Abstract

Cited by 7 (6 self)
In this paper we leverage methods from submodular function optimization developed for document summarization and apply them to the problem of subselecting acoustic data. We evaluate our results on data subset selection for a phone recognition task. Our framework shows significant improvements over random selection and previously proposed methods using a similar amount of resources.
Submodular feature selection for high-dimensional acoustic score spaces
 in Proceedings of ICASSP
, 2013
Abstract

Cited by 6 (5 self)
We apply methods for selecting subsets of dimensions from high-dimensional score spaces, and subsets of data for training, using submodular function optimization. Submodular functions provide theoretical performance guarantees while simultaneously retaining extremely fast and scalable optimization via an accelerated greedy algorithm. We evaluate this approach on two applications: data subset selection for phone recognizer training, and semi-supervised learning for phone segment classification. Interestingly, the first application uses submodularity twice: first for score space subselection and then for data subset selection. Our approach is computationally efficient but still consistently outperforms a number of baseline methods. Index Terms — feature selection, Fisher kernel, acoustic similarity, graph-based learning, submodularity
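The "accelerated greedy algorithm" referenced here is usually Minoux's lazy greedy: because submodularity means marginal gains only shrink as the solution grows, stale gains stored in a max-heap remain valid upper bounds, so most are never recomputed. A small sketch; the set-coverage objective and the `covers` map are toy assumptions standing in for the paper's acoustic score-space objective:

```python
import heapq

def lazy_greedy(f, ground, k):
    # Minoux's accelerated greedy. Heap entries are (-gain, element, round);
    # a gain computed in an earlier round is only an upper bound now, so it
    # must be refreshed before the element can be accepted.
    S, fS = [], f([])
    heap = [(-(f([j]) - fS), j, 0) for j in ground]
    heapq.heapify(heap)
    rnd = 0
    while len(S) < k and heap:
        neg_gain, j, stamp = heapq.heappop(heap)
        if stamp == rnd:
            S.append(j)               # gain is current: accept greedily
            fS = f(S)
            rnd += 1
        else:
            # stale: recompute the true marginal gain and push back
            heapq.heappush(heap, (-(f(S + [j]) - fS), j, rnd))
    return S

# toy objective: how many score-space "dimensions" the selection covers
covers = {0: {1, 2}, 1: {2}, 2: {3}, 3: {1, 3}}
f = lambda S: len(set().union(*(covers[j] for j in S)))
print(lazy_greedy(f, [0, 1, 2, 3], 2))
```

On a toy instance this returns the same set as plain greedy, just with fewer objective evaluations; at corpus scale that saving is what makes submodular selection practical.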
Submodular subset selection for large-scale speech training data
 In ICASSP, 2014b
Abstract

Cited by 5 (4 self)
We address the problem of subselecting a large set of acoustic data to train automatic speech recognition (ASR) systems. To this end, we apply a novel data selection technique based on constrained submodular function maximization. Though NP-hard, the combinatorial optimization problem can be approximately solved by a simple and scalable greedy algorithm with constant-factor guarantees. We evaluate our approach by subselecting data from 1300 hours of conversational English telephone data to train two types of large-vocabulary speech recognizers, one with Gaussian mixture model (GMM) based acoustic models, and another based on deep neural networks (DNNs). We show that training data can be reduced significantly, and that our technique outperforms both random selection and a previously proposed selection method utilizing comparable resources. Notably, using the submodular selection method, the DNN system using only about 5% of the training data is able to achieve performance on par with the GMM system using 100% of the training data; with the baseline subset selection methods, however, the DNN system is unable to match this. Index Terms — speech processing, automatic speech recognition, machine learning, large-scale systems
An application of the submodular principal partition to training data subset selection
 in NIPS Workshop on Discrete Optimization in Machine Learning: Submodularity, Sparsity & Polyhedra
, 2010
Abstract

Cited by 3 (3 self)
We address the problem of finding a subset of a large training data set (corpus) that is useful for accurately and rapidly prototyping novel and computationally expensive machine learning architectures. To solve this problem, we express it as a minimization problem over a weighted sum of modular and submodular functions. Quantities such as the number of classes (or quality) in a set of samples, or the quality of a bundle of classes, are submodular functions, which makes finding optimal solutions possible. We apply the principal partition to our problem such that solutions for all possible trade-offs between a modular function and a submodular function can be found efficiently. We show results for speech recognition on the Switchboard-I corpus, demonstrating improved results over previous techniques for this purpose. We also demonstrate the variety of the resulting corpora that may be produced using our method.
Unsupervised submodular subset selection for speech data
, 2014
Abstract

Cited by 3 (3 self)
We conduct a comparative study on selecting subsets of acoustic data for training phone recognizers. The data selection problem is approached as a constrained submodular optimization problem. Previous applications of this approach required transcriptions or acoustic models trained in a supervised way. In this paper we develop and evaluate a novel and entirely unsupervised approach, and apply it to TIMIT data. Results show that our method consistently outperforms a number of baseline methods while being computationally very efficient and requiring no labeling. Index Terms — speech processing, automatic speech recognition, machine learning