Results 1-10 of 275
A Comparison of Algorithms for Maximum Entropy Parameter Estimation
Cited by 290 (2 self)
Conditional maximum entropy (ME) models provide a general purpose machine learning technique which has been successfully applied to fields as diverse as computer vision and econometrics, and which is used for a wide variety of classification problems in natural language processing. However, the flexibility of ME models is not without cost. While parameter estimation for ME models is conceptually straightforward, in practice ME models for typical natural language tasks are very large, and may well contain many thousands of free parameters. In this paper, we consider a number of algorithms for estimating the parameters of ME models, including iterative scaling, gradient ascent, conjugate gradient, and variable metric methods. Surprisingly, the standardly used iterative scaling algorithms perform quite poorly in comparison to the others, and for all of the test problems, a limited-memory variable metric algorithm outperformed the other choices.
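As an illustration of the estimation problem this abstract describes, the following is a minimal sketch of a conditional ME model trained by plain gradient ascent, one of the methods the abstract names. It is not the paper's implementation: the function names, the step size, and the toy data are all assumptions made for this example. The gradient used is the standard one for these models, the observed feature count minus the expected feature count under the model.

```python
import math


def cond_probs(weights, feats_by_label):
    """Return p(y | x) for one input x.

    feats_by_label[y] is the feature vector f(x, y) for candidate label y.
    """
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in feats_by_label]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def log_likelihood_and_grad(weights, data):
    """Conditional log-likelihood and its gradient.

    data is a list of (feats_by_label, gold_label_index) pairs.
    """
    ll = 0.0
    grad = [0.0] * len(weights)
    for feats_by_label, gold in data:
        p = cond_probs(weights, feats_by_label)
        ll += math.log(p[gold])
        for i in range(len(weights)):
            expected = sum(p[y] * feats_by_label[y][i] for y in range(len(p)))
            # observed feature value minus model expectation
            grad[i] += feats_by_label[gold][i] - expected
    return ll, grad


def gradient_ascent(data, n_features, step=0.5, iters=200):
    """Fixed-step gradient ascent (step and iters are illustrative choices)."""
    weights = [0.0] * n_features
    for _ in range(iters):
        _, grad = log_likelihood_and_grad(weights, data)
        weights = [w + step * g for w, g in zip(weights, grad)]
    return weights


# Hypothetical toy data: two labels, feature i fires only for label i.
TOY_DATA = [
    ([[1.0, 0.0], [0.0, 1.0]], 0),
    ([[1.0, 0.0], [0.0, 1.0]], 0),
    ([[1.0, 0.0], [0.0, 1.0]], 1),
]

if __name__ == "__main__":
    w = gradient_ascent(TOY_DATA, n_features=2)
    print(cond_probs(w, TOY_DATA[0][0]))
```

On this toy data the fitted model should assign label 0 a probability close to its empirical frequency. A limited-memory variable metric method would consume the same log-likelihood and gradient function but choose its steps differently.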
Jacobian-free Newton-Krylov methods: a survey of approaches and applications
 J. Comput. Phys
Cited by 204 (6 self)
Jacobian-free Newton-Krylov (JFNK) methods are synergistic combinations of Newton-type methods for superlinearly convergent solution of nonlinear equations and Krylov subspace methods for solving the Newton correction equations. The link between the two methods is the Jacobian-vector product, which may be probed approximately without forming and storing the elements of the true Jacobian, through a variety of means. Various approximations to the Jacobian matrix may still be required for preconditioning the resulting Krylov iteration. As with Krylov methods for linear problems, successful application of the JFNK method to any given problem is dependent on adequate preconditioning. JFNK has potential for application throughout problems governed by nonlinear partial differential equations and integro-differential equations. In this survey article we place JFNK in context with other nonlinear solution algorithms for both boundary value problems (BVPs) and initial value problems (IVPs). We provide an overview of the mechanics of JFNK and attempt to illustrate the wide variety of preconditioning options available. It is emphasized that JFNK can be wrapped (as an accelerator) around another nonlinear fixed point method (interpreted as a preconditioning process, potentially with significant code reuse). The aim of this article is not to trace fully the evolution of JFNK, nor to provide proofs of accuracy or optimal convergence for all of the constituent methods, but rather to present the reader with a perspective on how JFNK may be applicable to problems of physical interest and to provide sources of further practical information. A review paper solicited by the Editor-in-Chief of the Journal of Computational
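One common way to probe the Jacobian-vector product mentioned in this abstract is a forward finite difference, J(u)v ≈ (F(u + εv) − F(u)) / ε, which needs only residual evaluations. The sketch below shows that single step, not a full Newton-Krylov solver. The function names, the example residual, and the heuristic for choosing ε are assumptions made for this illustration.

```python
import math
import sys


def jacobian_vector_product(F, u, v, eps=None):
    """Approximate J(u) v by a forward difference without forming J.

    F maps a list of floats to a list of floats (the nonlinear residual).
    """
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    if eps is None:
        # Illustrative heuristic: scale sqrt(machine epsilon) by the sizes
        # of u and v. Other choices of perturbation size are possible.
        norm_u = math.sqrt(sum(x * x for x in u))
        eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    base = F(u)
    perturbed = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(perturbed, base)]


def example_residual(u):
    """Hypothetical 2x2 nonlinear system, used only to check the probe."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]


if __name__ == "__main__":
    # The exact Jacobian at (1, 2) is [[2, 1], [1, 4]], so J v = [1, -3]
    # for v = (1, -1); the probe should agree to several digits.
    print(jacobian_vector_product(example_residual, [1.0, 2.0], [1.0, -1.0]))
```

A Krylov solver needs the Jacobian only through products of this kind, which is why the matrix itself never has to be stored.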
The GrADS project: Software support for high-level grid application development
 International Journal of High Performance Computing Applications, 2001
Cited by 161 (24 self)
Advances in networking technologies will soon make it possible to use the global information infrastructure in a qualitatively different way, as a computational resource as well as an information resource. This idea for an integrated computation and information resource called the Computational Power Grid has been described by the recent book entitled The Grid: Blueprint for a New Computing Infrastructure [18]. The Grid will connect the nation’s computers, databases, instruments, and people in a seamless web, supporting emerging computation-rich application concepts such as remote computing, distributed supercomputing, tele-immersion, smart instruments, and data mining. To realize this vision, significant scientific and technical obstacles must be overcome. Principal among these is usability. Because the Grid will be inherently more complex than existing computer systems, programs that execute on the Grid will reflect some of this complexity. Hence, making Grid resources useful and accessible to scientists and engineers will require new software tools that embody major advances in both the theory and practice of building Grid applications. The goal of the Grid Application Development Software (GrADS) Project is to simplify distributed heterogeneous computing in the same way that the World Wide Web simplified information sharing
deal.II – a general purpose object oriented finite element library
 ACM TRANS. MATH. SOFTW
Cited by 104 (28 self)
An overview of the software design and data abstraction decisions chosen for deal.II, a general purpose finite element library written in C++, is given. The library uses advanced object-oriented and data encapsulation techniques to break finite element implementations into smaller blocks that can be arranged to fit users' requirements. Through this approach, deal.II supports a large number of different applications covering a wide range of scientific areas, programming methodologies, and application-specific algorithms, without imposing a rigid framework into which they have to fit. A judicious use of programming techniques makes it possible to avoid the computational costs frequently associated with abstract object-oriented class libraries. The paper presents a detailed description of the abstractions chosen for defining geometric information of meshes and the handling of degrees of freedom associated with finite element spaces, as well as of linear algebra, input/output capabilities and of interfaces to other software, such as visualization tools. Finally, some results obtained with applications built atop deal.II are shown to demonstrate the powerful capabilities of this toolbox.
A scalable modular convex solver for regularized risk minimization
 In KDD. ACM, 2007
Cited by 78 (16 self)
A wide variety of machine learning problems can be described as minimizing a regularized risk functional, with different algorithms using different notions of risk and different regularizers. Examples include linear Support Vector Machines (SVMs), Logistic Regression, Conditional Random Fields (CRFs), and Lasso amongst others. This paper describes the theory and implementation of a highly scalable and modular convex solver which solves all these estimation problems. It can be parallelized on a cluster of workstations, allows for data locality, and can deal with regularizers such as ℓ1 and ℓ2 penalties. At present, our solver implements 20 different estimation problems, can be easily extended, scales to millions of observations, and is up to 10 times faster than specialized solvers for many applications. The open source code is freely available as part of the ELEFANT toolbox.
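The modular view in this abstract, one objective with interchangeable losses and regularizers, can be sketched as an empirical risk term plus a weighted penalty, J(w) = (1/m) Σ loss(y_i ⟨w, x_i⟩) + λ Ω(w). The code below only evaluates such an objective for a linear binary classifier. It is not the ELEFANT API, and every name in it is hypothetical.

```python
import math


def hinge_loss(margin):
    """Loss used by linear SVMs."""
    return max(0.0, 1.0 - margin)


def logistic_loss(margin):
    """Loss used by logistic regression, written to avoid overflow."""
    if margin >= 0.0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def l1_penalty(w):
    return sum(abs(x) for x in w)


def l2_penalty(w):
    return 0.5 * sum(x * x for x in w)


def regularized_risk(w, data, loss, regularizer, lam):
    """Average loss over (x, y) pairs with y in {-1, +1}, plus lam * penalty."""
    total = 0.0
    for x, y in data:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        total += loss(margin)
    return total / len(data) + lam * regularizer(w)


if __name__ == "__main__":
    # Hypothetical toy data: two points, one per class.
    data = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
    w = [1.0, -1.0]
    print(regularized_risk(w, data, hinge_loss, l2_penalty, lam=0.1))
    print(regularized_risk(w, data, logistic_loss, l1_penalty, lam=0.1))
```

Swapping the `loss` or `regularizer` argument changes the estimation problem without touching the rest of the code, which is the kind of modularity the abstract refers to.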
Discriminative language modeling with conditional random fields and the perceptron algorithm
 In Proc. ACL, 2004
Cited by 77 (8 self)
This paper describes discriminative language modeling for a large vocabulary speech recognition task. We contrast two parameter estimation methods: the perceptron algorithm, and a method based on conditional random fields (CRFs). The models are encoded as deterministic weighted finite state automata, and are applied by intersecting the automata with word-lattices that are the output from a baseline recognizer. The perceptron algorithm has the benefit of automatically selecting a relatively small feature set in just a couple of passes over the training data. However, using the feature set output from the perceptron algorithm (initialized with their weights), CRF training provides an additional 0.5% reduction in word error rate, for a total 1.8% absolute reduction from the baseline of 39.2%.
A Component Architecture for High-Performance Scientific Computing
 Intl. J. High-Performance Computing Applications, 2004
Cited by 63 (20 self)
The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local high-performance connections between components in a language-independent manner. The design places minimal requirements on components
A compiler for variational forms
 ACM Trans. Math. Software
Cited by 58 (21 self)
As a key step towards a complete automation of the finite element method, we present a new algorithm for automatic and efficient evaluation of multilinear variational forms. The algorithm has been implemented in the form of a compiler, the FEniCS Form Compiler FFC. We present benchmark results for a series of standard variational forms, including the incompressible Navier–Stokes equations and linear elasticity. The speedup compared to the standard quadrature-based approach is impressive; in some cases the speedup is as large as a factor of 1000.
Programming matrix algorithms-by-blocks for thread-level parallelism
 ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE
Cited by 47 (19 self)
With the emergence of thread-level parallelism as the primary means for continued performance improvement, the programmability issue has reemerged as an obstacle to the use of architectural advances. We argue that evolving legacy libraries for dense and banded linear algebra is not a viable solution due to constraints imposed by early design decisions. We propose a philosophy of abstraction and separation of concerns that provides a promising solution in this problem domain. The first abstraction, FLASH, allows algorithms to express computation with matrices consisting of contiguous blocks, facilitating algorithms-by-blocks. Operand descriptions are registered for a particular operation a priori by the library implementor. A runtime system, SuperMatrix, uses this information to identify data dependencies between suboperations, allowing them to be scheduled to threads out-of-order and executed in parallel. But not all classical algorithms in linear algebra lend themselves to conversion to algorithms-by-blocks. We show how our recently proposed LU factorization with incremental pivoting and a closely related algorithm-by-blocks for the QR factorization, both originally designed for out-of-core computation, overcome this difficulty. Anecdotal evidence regarding the development of routines with a core functionality demonstrates how the methodology supports high productivity while experimental results suggest
Performance and productivity in parallel programming via processor virtualization
 in Proc. of the First Intl. Workshop on Productivity and Performance in High-End Computing (at HPCA 10), 2004
Cited by 41 (18 self)
We have been pursuing a research program aimed at enhancing productivity and performance in parallel computing at the Parallel Programming Laboratory of University of Illinois for the past decade. We summarize the basic approach, and why it has improved (and will further improve) both productivity and performance. The centerpiece of our approach is a technique called processor virtualization: the program computation is divided into a large number of chunks (called virtual processors), which are mapped to processors by an adaptive, intelligent runtime system. The runtime system also controls communication between virtual processors. This approach makes possible a number of runtime optimizations. We argue that the following strategies are necessary to improve productivity in parallel programming:
• Automated resource management via processor virtualization
• Modularity via concurrent composability
• Reusability via frameworks, libraries, and multiparadigm interoperability
Of these, the first two directly benefit from processor virtualization, while the last is indirectly impacted. We describe our research on all these fronts.
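To make the mapping step concrete, the sketch below assigns many virtual processors to a smaller number of physical processors, always placing the next-heaviest chunk on the least-loaded processor. This greedy heuristic is a stand-in chosen for illustration, not the adaptive runtime system the abstract describes. The function name and the load figures are hypothetical.

```python
import heapq


def map_virtual_processors(loads, n_physical):
    """Greedy mapping of virtual processors to physical processors.

    loads[i] is the estimated work of virtual processor i.
    Returns a dict {virtual_index: physical_index}.
    """
    # Min-heap of (accumulated load, physical processor index).
    heap = [(0.0, p) for p in range(n_physical)]
    heapq.heapify(heap)
    assignment = {}
    # Place the heaviest chunks first, each on the least-loaded processor.
    for vp in sorted(range(len(loads)), key=lambda i: -loads[i]):
        total, p = heapq.heappop(heap)
        assignment[vp] = p
        heapq.heappush(heap, (total + loads[vp], p))
    return assignment


def per_processor_load(loads, assignment, n_physical):
    """Sum the load placed on each physical processor."""
    totals = [0.0] * n_physical
    for vp, p in assignment.items():
        totals[p] += loads[vp]
    return totals


if __name__ == "__main__":
    loads = [5.0, 3.0, 3.0, 2.0, 2.0, 1.0]  # hypothetical work estimates
    mapping = map_virtual_processors(loads, n_physical=2)
    print(mapping)
    print(per_processor_load(loads, mapping, 2))
```

Because there are more virtual processors than physical ones, a runtime system can rerun a mapping like this whenever measured loads change, without the program itself being rewritten.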