Results 1 - 10 of 54
A Historical Application Profiler for Use by Parallel Schedulers
- In Job Scheduling Strategies for Parallel Processing, 1997
Abstract - Cited by 61 (0 self)
Scheduling algorithms that use application and system knowledge have been shown to be more effective at scheduling parallel jobs on a multiprocessor than algorithms that do not. This paper focuses on obtaining such information for use by a scheduler in a network of workstations environment. The log files from three parallel systems are examined to determine both how to categorize parallel jobs for storage in a job database and what job information would be useful to a scheduler. A Historical Profiler is proposed that stores information about programs and users, and manipulates this information to provide schedulers with execution time predictions. Several preemptive and non-preemptive versions of the FCFS, EASY and Least Work First scheduling algorithms are compared to evaluate the utility of the profiler. It is found that both preemption and the use of application execution time predictions obtained from the Historical Profiler lead to improved performance.
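A minimal sketch of the idea, not the paper's implementation: a profiler keeps past run times per (user, program) pair and predicts the next run time as their mean, and a Least Work First queue orders jobs by predicted run time multiplied by processor count. The class name `HistoricalProfiler`, the mean-based predictor, the job dictionary fields, and the default run time are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


class HistoricalProfiler:
    """Hypothetical profiler: stores past run times keyed by (user, program)."""

    def __init__(self, default_runtime=60.0):
        self.history = defaultdict(list)
        # Assumed fallback prediction for jobs with no recorded history.
        self.default_runtime = default_runtime

    def record(self, user, program, runtime):
        """Store the observed run time (in seconds) of a finished job."""
        self.history[(user, program)].append(runtime)

    def predict(self, user, program):
        """Predict the run time as the mean of past runs (an assumed predictor)."""
        runs = self.history.get((user, program))
        return mean(runs) if runs else self.default_runtime


def least_work_first(queue, profiler):
    """Order jobs by predicted work = predicted run time * processors requested."""
    return sorted(
        queue,
        key=lambda job: profiler.predict(job["user"], job["program"]) * job["procs"],
    )
```

A scheduler would call `record` when a job finishes and `least_work_first` each time it picks the next job to start.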
Perspectives on system identification
- In Plenary talk at the proceedings of the 17th IFAC World Congress, Seoul, South Korea, 2008
Abstract - Cited by 47 (1 self)
System identification is the art and science of building mathematical models of dynamic systems from observed input-output data. It can be seen as the interface between the real world of applications and the mathematical world of control theory and model abstractions. As such, it is a ubiquitous necessity for successful applications. System identification is a very large topic, with different techniques that depend on the character of the models to be estimated: linear, nonlinear, hybrid, nonparametric, etc. At the same time, the area can be characterized by a small number of leading principles, e.g. to look for sustainable descriptions by proper decisions in the triangle of model complexity, information contents in the data, and effective validation. The area has many facets and there are many approaches and methods. A tutorial or a survey in a few pages is not quite possible. Instead, this presentation aims at giving an overview of the “science” side, i.e. basic principles and results, and at pointing to open problem areas in the practical, “art”, side of how to approach and solve a real problem.
Molecular Modeling Of Proteins And Mathematical Prediction Of Protein Structure
- SIAM Review, 1997
Abstract - Cited by 41 (4 self)
This paper discusses the mathematical formulation of and solution attempts for the so-called protein folding problem. The static aspect is concerned with how to predict the folded (native, tertiary) structure of a protein, given its sequence of amino acids. The dynamic aspect asks about the possible pathways to folding and unfolding, including the stability of the folded protein. From a mathematical point of view, there are several main sides to the static problem: the selection of an appropriate potential energy function; the parameter identification by fitting to experimental data; and the global optimization of the potential. The dynamic problem entails, in addition, the solution of (because of multiple time scales very stiff) ordinary or stochastic differential equations (molecular dynamics simulation), or (in case of constrained molecular dynamics) of differential-algebraic equations. A theme connecting the static and dynamic aspect is the determination and formation of...
Estimation of Parameters and Eigenmodes of Multivariate Autoregressive Models
2001
Abstract - Cited by 40 (2 self)
Dynamical characteristics of a complex system can often be inferred from analyses of a stochastic time series model fitted to observations of the system. Oscillations in geophysical systems, for example, are sometimes characterized by principal oscillation patterns, eigenmodes of estimated autoregressive (AR) models of first order. This paper describes the estimation of eigenmodes of AR models of arbitrary order. AR processes of any order can be decomposed into eigenmodes with characteristic oscillation periods, damping times, and excitations. Estimated eigenmodes and confidence intervals for the eigenmodes and their oscillation periods and damping times can be computed from estimated model parameters. As a computationally efficient method of estimating the parameters of AR models from high-dimensional data, a stepwise least squares algorithm is proposed. This algorithm computes model coefficients and evaluates criteria for the selection of the model order stepwise for AR models of successively decreasing order. Numerical simulations indicate that, with the least squares algorithm, the AR model coefficients and the eigenmodes derived from the coefficients are estimated reliably and that the approximate 95% confidence intervals for the coefficients and eigenmodes are rough approximations of the confidence intervals inferred from the simulations.
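As a rough illustration of the eigenmode decomposition for the first-order case, the sketch below fits a zero-mean AR(1) model by ordinary least squares (not the stepwise algorithm the abstract proposes) and converts each eigenvalue λ of the coefficient matrix into an oscillation period 2π/|arg λ| and a damping time -1/ln|λ|, both in units of the sampling interval. The function names and the zero-mean assumption are illustrative choices, not taken from the paper.

```python
import numpy as np


def fit_var1(x):
    """Ordinary least squares estimate of A in x[t+1] = A @ x[t] + noise.

    x has shape (T, m); the series is assumed to have zero mean.
    """
    past, future = x[:-1], x[1:]
    # Solve past @ A.T ~= future in the least squares sense.
    a_transposed, *_ = np.linalg.lstsq(past, future, rcond=None)
    return a_transposed.T


def eigenmode_times(a):
    """Return (periods, damping_times) for the eigenmodes of the AR(1) matrix a."""
    lam = np.linalg.eigvals(a)
    with np.errstate(divide="ignore"):
        # Real positive eigenvalues have zero phase, giving an infinite period.
        periods = 2.0 * np.pi / np.abs(np.angle(lam))
        # Eigenvalues on the unit circle give an infinite damping time.
        damping_times = -1.0 / np.log(np.abs(lam))
    return periods, damping_times
```

For a stable model all eigenvalues lie inside the unit circle, so every damping time is positive and finite.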
Variable Selection for Regression Models
1998
Abstract - Cited by 32 (2 self)
A simple method for subset selection of independent variables in regression models is proposed. We expand the usual regression equation to an equation that incorporates all possible subsets of predictors by adding indicator variables as parameters. The vector of indicator variables dictates which predictors to include. Several choices of priors can be employed for the unknown regression coefficients and the unknown indicator parameters. The posterior distribution of the indicator vector is approximated by means of the Markov chain Monte Carlo algorithm. We select subsets with high posterior probabilities. In addition to linear models, we consider generalized linear models.
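The indicator-vector formulation can be sketched for a small number of predictors. The example below deliberately swaps in a simpler technique than the abstract describes: instead of Markov chain Monte Carlo with explicit priors, it enumerates every indicator vector and uses exp(-BIC/2), normalized over all subsets, as a crude stand-in for the posterior probability of each subset. The function name, the absence of an intercept, and the BIC approximation are assumptions for illustration only.

```python
import itertools

import numpy as np


def subset_posteriors(X, y):
    """Approximate posterior probability of each indicator vector gamma.

    gamma[j] == 1 means predictor j is included. Exhaustive enumeration
    costs 2**p least squares fits, so this is only feasible for small p.
    """
    n, p = X.shape
    bic = {}
    for gamma in itertools.product([0, 1], repeat=p):
        cols = [j for j in range(p) if gamma[j]]
        if cols:
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            resid = y - X[:, cols] @ beta
        else:
            resid = y  # empty model: no predictors, no intercept
        rss = float(resid @ resid)
        bic[gamma] = n * np.log(rss / n) + len(cols) * np.log(n)
    best = min(bic.values())
    # Subtract the best BIC before exponentiating for numerical stability.
    weights = {g: np.exp(-0.5 * (b - best)) for g, b in bic.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}
```

Subsets would then be ranked by these approximate probabilities, mirroring the step of selecting subsets with high posterior probability.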
Proactive Management of Software Aging
2001
Abstract - Cited by 26 (2 self)
this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.
Combination of Machine Scores for Automatic Grading of Pronunciation Quality
1998
Abstract - Cited by 21 (5 self)
This work is part of an effort aimed at developing computer-based systems for language instruction; we address the task of grading the pronunciation quality of the speech of a student of a foreign language. The automatic grading system uses SRI's Decipher™ continuous speech recognition system to generate phonetic segmentations. Based on these segmentations and probabilistic models we produce different pronunciation scores for individual or groups of sentences that can be used as predictors of the pronunciation quality. Different types of these machine scores can be combined to obtain a better prediction of the overall pronunciation quality. In this paper we review some of the best-performing machine scores, and discuss the application of several methods based on linear and nonlinear mapping and combination of individual machine scores to predict the pronunciation quality grade that a human expert would have given. We evaluate these methods in a database that consists of pronunciation-quality-graded speech from American students speaking French. With predictors based on spectral match and on durational characteristics, we find that the combination of scores improved the prediction of the human grades and that nonlinear mapping and combination methods performed better than linear ones. Characteristics of the different nonlinear methods studied are discussed.
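The linear variant of score combination can be sketched as an ordinary least squares regression from machine scores to human grades. This is only a generic illustration: the function names are hypothetical, the abstract does not specify the regression details, and the nonlinear methods it reports as performing better are not shown.

```python
import numpy as np


def fit_linear_combination(machine_scores, human_grades):
    """Fit weights and a bias mapping machine scores to human grades.

    machine_scores has shape (n_utterances, n_scores); human_grades has
    shape (n_utterances,). The last returned entry is the bias term.
    """
    n = machine_scores.shape[0]
    design = np.column_stack([machine_scores, np.ones(n)])
    weights, *_ = np.linalg.lstsq(design, human_grades, rcond=None)
    return weights


def predict_grade(machine_scores, weights):
    """Predict grades as a weighted sum of the machine scores plus the bias."""
    return machine_scores @ weights[:-1] + weights[-1]
```

In practice the weights would be fitted on one set of graded speech and the predicted grades compared against human grades on held-out speech.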
The Role of Model Validation for Assessing the Size of the Unmodeled Dynamics
- IEEE Transactions on Automatic Control, 1997
Abstract - Cited by 20 (3 self)
The problem of assessing the quality of a given, or estimated model is a central issue in system identification. Various new techniques for estimating bias and variance contributions to the model error have been suggested in the recent literature. In this contribution, classical model validation procedures are placed at the focus of our attention. We discuss the principles by which we reach confidence in a model through such validation techniques, and also how the distance to a "true" description can be estimated this way. In particular, we stress how the typical model validation procedure gives a direct measure of the model error of the model test, without referring to its ensemble properties. Several model error bounds are developed for various assumptions about the disturbances entering the system.
Feature (gene) selection in gene expression-based tumor classification
- Molecular Genetics and Metabolism (Mol. Genet. Metab.), 2001
Abstract - Cited by 19 (0 self)
There is increasing interest in changing the emphasis of tumor classification from morphologic to molecular. Gene expression profiles may offer more information than morphology and provide an alternative to morphology-based tumor classification systems. Gene selection involves a search for gene subsets that are able to discriminate tumor tissue from normal tissue, and may have either clear biological interpretation or some implication in the molecular mechanism of the tumorigenesis. Gene selection is a fundamental issue in gene expression-based tumor classification. In the formation of a discriminant rule, the number of genes is large relative to the number of tissue samples. Too many genes can harm the performance of the tumor classification system and increase the cost as well. In this report, we discuss criteria and illustrate techniques for reducing the number of genes and selecting an optimal (or near optimal) subset of genes from an initial set of genes for tumor classification. The practical advantages of gene selection over other methods of reducing the dimensionality (e.g., principal components) include its simplicity, future cost savings, and higher likelihood of being adopted in a clinical setting. We analyze the expression profiles of 2000 genes in 22 normal and 40 colon tumor tissues, 5776 sequences in 14 human mammary epithelial cells and 13 breast tumors, and 6817 genes in 47 acute lymphoblastic leukemia and 25 acute myeloid leukemia samples. Through these three examples, we show that using 2 or 3 genes can achieve more than 90% accuracy of classification. This result implies that after initial investigation of tumor classification using microarrays, a small
Robust Optical Flow Computation Based On Least-Median-of-Squares Regression
1999
Abstract - Cited by 16 (2 self)
An optical flow estimation technique is presented which is based on the least-median-of-squares (LMedS) robust regression algorithm enabling more accurate flow estimates to be computed in the vicinity of motion discontinuities. The flow is computed in a blockwise fashion using an affine model. Through the use of overlapping blocks coupled with a block shifting strategy, redundancy is introduced into the computation of the flow. This eliminates blocking effects common in most other techniques based on blockwise processing and also allows flow to be accurately computed in regions containing three distinct motions. A multiresolution

