On the impact of data input sets on statistical compiler tuning
- In Proceedings of the Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL)
, 2006
Cited by 4 (0 self)
In recent years, several approaches have been proposed to use profile information in compiler optimization. This profile information can be used at the source level to guide loop transformations as well as in the backend to guide low-level optimizations. At the same time, profile-guided library generators such as Atlas, Spiral, and FFTW have been proposed that tune their routines for the underlying hardware. These approaches have led to excellent performance improvements. However, a possible drawback is that applications are optimized using a single data input or a limited set of data inputs. It is well known that programs can exhibit vastly differing behaviors for different inputs. Therefore, it is not clear whether the reported performance numbers remain valid for inputs other than those used to optimize the program. In this paper, we address this problem for a specific statistical compiler tuning method. We use three different platforms and several SPECint2000 benchmarks. We show that when we tune the compiler using train data, we obtain a compiler setting that still performs well for reference data. These results suggest that profile-guided optimization may be more stable than is sometimes believed and that a limited number of train data sets is sufficient to obtain a well-optimized program for all inputs.
Pseudo-random number generation for sketch-based estimations
- TODS
Cited by 4 (1 self)
The exact computation of aggregate queries, like the size of join of two relations, usually requires large amounts of memory – constrained in data-streaming – or communication – constrained in distributed computation – and large processing times. In this situation, approximation techniques with provable guarantees, like sketches, are one possible solution. The performance of sketches depends crucially on the ability to generate particular pseudo-random numbers. In this paper we investigate both theoretically and empirically the problem of generating k-wise independent pseudo-random numbers and, in particular, that of generating 3-wise and 4-wise independent pseudo-random numbers that are fast range-summable (i.e., they can be summed up in sub-linear time). Our specific contributions are: (a) we provide a thorough comparison of the various pseudo-random number generating schemes, (b) we study both theoretically and empirically the fast range-summation property of the 3-wise and 4-wise independent generating schemes, (c) we provide algorithms for the fast range-summation of two 3-wise independent schemes, BCH and Extended Hamming, (d) we show convincing theoretical and empirical evidence that the Extended Hamming scheme performs as well as any 4-wise independent scheme for estimating the size of join of two relations using AMS-sketches, even though it is only 3-wise independent. We use this scheme to generate estimators that significantly outperform the state-of-the-art solutions for two problems – size of spatial joins and selectivity estimation.
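As a rough illustration of the ideas in this abstract, the sketch below builds a 3-wise independent ±1 family in the BCH style (the sign is determined by one seed bit XORed with the GF(2) inner product of a seed and the index) and uses it in an AMS-style estimator for the size of a join. The function names, the tiny 3-bit domain, and the exhaustive averaging over all seeds are assumptions made for this example and are not taken from the paper; the Extended Hamming scheme and fast range-summation are not shown.

```python
from itertools import combinations

def parity(x: int) -> int:
    """Parity (0 or 1) of the number of set bits in x."""
    return bin(x).count("1") & 1

def bch3_sign(seed: int, s0: int, i: int) -> int:
    """Return +1 or -1 for index i: (-1)^(s0 XOR <seed, i>) over GF(2)."""
    return 1 - 2 * (s0 ^ parity(seed & i))

def all_seeds(n_bits: int):
    """Enumerate every (seed, s0) pair for an n_bits-wide index domain."""
    return [(seed, s0) for seed in range(1 << n_bits) for s0 in (0, 1)]

def sketch(freqs: dict, seed: int, s0: int) -> int:
    """AMS sketch of a frequency vector: sum of freq[i] * sign(i)."""
    return sum(count * bch3_sign(seed, s0, i) for i, count in freqs.items())

def mean_join_estimate(f: dict, g: dict, n_bits: int) -> float:
    """Average the basic estimator sketch(f) * sketch(g) over all seeds."""
    seeds = all_seeds(n_bits)
    return sum(sketch(f, s, b) * sketch(g, s, b) for s, b in seeds) / len(seeds)

def max_product_bias(n_bits: int, k: int) -> int:
    """Largest |sum over all seeds of the product of k distinct signs|.

    A value of 0 for k = 1, 2 and 3 means the family is 3-wise independent.
    """
    worst = 0
    for idx in combinations(range(1 << n_bits), k):
        total = 0
        for seed, s0 in all_seeds(n_bits):
            prod = 1
            for i in idx:
                prod *= bch3_sign(seed, s0, i)
            total += prod
        worst = max(worst, abs(total))
    return worst

f = {0: 2, 3: 1, 5: 4}   # hypothetical relation F: join value -> frequency
g = {3: 3, 5: 1, 6: 7}   # hypothetical relation G: join value -> frequency
exact = sum(c * g.get(v, 0) for v, c in f.items())
estimate = mean_join_estimate(f, g, n_bits=3)
```

Averaging over every seed is done here only so that the estimate equals the exact join size; a practical sketch would draw a small number of random seeds and combine the resulting estimators. The bias check also shows that this family stops being independent at k = 4, since four indices whose XOR is zero always multiply to +1.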
Design of Reconfigurable Composite Microsystems Based on Hardware/Software Codesign Principles
- IEEE Trans. Computer-Aided Design
, 2002
Cited by 3 (1 self)
Composite microsystems that integrate mechanical and fluidic components with electronics are emerging as the next generation of system-on-a-chip. Custom microsystems are expensive, inflexible, and unsuitable for high-volume production. The authors address this problem by leveraging hardware/software codesign principles to design reconfigurable composite microsystems. They partition the system design parameters into nonreconfigurable and reconfigurable categories. In this way, operational flexibility is enhanced and the microsystems are designed for a wider range of applications. In addition, the Taguchi robust design method is used to make the system robust, and response surface methodologies are used to explore the widest performance range for the system. A case study is presented for a microvalve, which serves as a representative microelectrofluidic device.
Model Validation For A Complex Jointed Structure
, 2001
Cited by 3 (3 self)
An overview of the modeling and validation of a complex engineering simulation performed at the Los Alamos National Laboratory is presented. The application discussed represents the highly transient response of an assembly with complex joints subjected to an impulsive load. The primary sources of nonlinearity were the contact mechanics. Several tests were conducted to assess the degree of experimental uncertainty and the variability of the geometry of the test article and its assembly procedures, and to provide reference data for model validation. After presenting the experiment and the corresponding numerical simulation, several issues of model validation are addressed. They include data reduction, feature extraction, design of computer experiments, statistical effects analysis, and model updating. It is shown how these tools can help the analyst gain confidence regarding the predictive quality of the simulation.
Blocked Regular Fractional Factorial Designs with Minimum Aberration
, 2006
Cited by 3 (2 self)
This paper considers the construction of minimum aberration (MA) blocked factorial designs. Based on coding theory, the concept of minimum moment aberration due to Xu [Statist. Sinica 13 (2003) 691–708] for unblocked designs is extended to blocked designs. The coding theory approach studies designs in a row-wise fashion and therefore links blocked designs with nonregular and supersaturated designs. A lower bound on the blocked wordlength pattern is established. It is shown that a blocked design has MA if it originates from an unblocked MA design and achieves the lower bound. It is also shown that a regular design can be partitioned into maximal blocks if and only if it contains a row without zeros. Sufficient conditions are given for constructing MA blocked designs from unblocked MA designs. The theory is then applied to construct MA blocked designs for all 32 runs, 64 runs up to 32 factors, and all 81 runs with respect to four combined wordlength patterns.
Two-Level Nonregular Designs From Quaternary Linear Codes
, 2006
Cited by 3 (2 self)
A quaternary linear code is a linear space over the ring of integers modulo 4. Recent research in coding theory shows that many famous nonlinear codes such as the Nordstrom and Robinson (1967) code and its generalizations can be simply constructed from quaternary linear codes. This paper explores the use of quaternary codes to construct two-level nonregular designs. A general construction of nonregular designs is described and some theoretic results are obtained. Many nonregular designs constructed by this method have better statistical properties than regular designs of the same size in terms of resolution and aberration. A systematic construction procedure is proposed and a collection of nonregular designs with 16, 32, 64, 128, 256 runs and up to 64 factors is presented. Key words and phrases: Fractional factorial design, generalized minimum aberration, generalized resolution, MacWilliams identity, quaternary code.
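To make the idea of moving from a quaternary code to a two-level array concrete, the sketch below enumerates the codewords of a small linear code over the integers modulo 4 and converts each symbol to two binary columns with the standard Gray map (0 → 00, 1 → 01, 2 → 11, 3 → 10). The use of the Gray map, the generator matrix, and the function names are assumptions chosen for illustration; the paper's systematic construction procedure is not reproduced here.

```python
from itertools import product

# Standard Gray map from Z4 symbols to pairs of bits.
GRAY = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}

def quaternary_codewords(generator):
    """All Z4-linear combinations of the rows of the generator matrix."""
    k, n = len(generator), len(generator[0])
    words = []
    for coeffs in product(range(4), repeat=k):
        word = tuple(
            sum(c * row[j] for c, row in zip(coeffs, generator)) % 4
            for j in range(n)
        )
        words.append(word)
    return words

def gray_design(generator):
    """Two-level array: rows are runs, columns are factors coded as +1/-1."""
    design = []
    for word in quaternary_codewords(generator):
        bits = [b for symbol in word for b in GRAY[symbol]]
        design.append([1 - 2 * b for b in bits])   # bit 0 -> +1, bit 1 -> -1
    return design

# Hypothetical generator matrix over Z4 (2 rows, 4 columns).
G = [[1, 0, 1, 2],
     [0, 1, 1, 3]]

design = gray_design(G)
n_runs, n_factors = len(design), len(design[0])
column_sums = [sum(run[j] for run in design) for j in range(n_factors)]
```

With two generator rows the code has 4^2 = 16 codewords, so the array has 16 runs, and each of the 4 quaternary columns yields 2 two-level columns. The column sums only check that every factor is balanced between its two levels; they say nothing about resolution or aberration.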
Statistical Based Non-Linear Model Updating Using Feature Extraction
, 2001
Cited by 2 (2 self)
model fidelity for non-linear systems. The approach investigates several mechanisms to assist the analyst in updating an analytical model based on experimental data and statistical analysis of parameter effects. The first is a new approach to data reduction called feature extraction. This is an expansion of the update metrics to include specific phenomena or character of the response that is critical to model application. It extends the classical linear updating paradigm of utilizing the eigen-parameters or FRFs to include such devices as peak acceleration, time of arrival, or standard deviation of model error. The next expansion of the updating process is the inclusion of statistics-based parameter analysis to quantify the effects of uncertain or significant-effect parameters in the construction of a meta-model. This provides indicators of the statistical variation associated with parameters as well as confidence intervals on the coefficients of the resulting meta-model. Also included in this method is the investigation of linear parameter effect screening using a partial factorial variable array for simulation. This is intended to aid the analyst in eliminating from the investigation the parameters that do not have a significant variation effect on the feature metric. Finally, the ability of the model to replicate the measured response variation is examined.
Alternative Sampling Methods for Estimating Multivariate Normal Probabilities
Cited by 2 (0 self)
We study the performance of alternative sampling methods for estimating multivariate normal probabilities through the GHK simulator. The sampling methods are randomized versions of some quasi-Monte Carlo samples (Halton, Niederreiter, Niederreiter-Xing sequences and lattice points) and some samples based on orthogonal arrays (Latin hypercube, orthogonal array and orthogonal array based Latin hypercube samples). In general, these samples turn out to have a better performance than Monte Carlo and antithetic Monte Carlo samples. Improvements over these are large for low-dimensional (4 and 10) cases and still significant for dimensions as large as 50.
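For readers unfamiliar with the samplers named in this abstract, here is a minimal sketch of two of them: a Halton sequence randomized by a random shift modulo 1, and a Latin hypercube sample. The choice of random shifting as the randomization, the function names, and the parameters are assumptions for illustration; the abstract does not say which randomization the authors used, and the GHK simulator itself is not shown.

```python
import random

def radical_inverse(index: int, base: int) -> float:
    """Reflect the base-`base` digits of `index` about the radix point."""
    result, scale = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * scale
        scale /= base
    return result

def shifted_halton(n: int, bases, rng: random.Random):
    """First n Halton points, each coordinate shifted by a random offset mod 1."""
    shift = [rng.random() for _ in bases]
    return [
        [(radical_inverse(i, b) + s) % 1.0 for b, s in zip(bases, shift)]
        for i in range(1, n + 1)
    ]

def latin_hypercube(n: int, dim: int, rng: random.Random):
    """n points in [0, 1)^dim with exactly one point per stratum per axis."""
    columns = []
    for _ in range(dim):
        strata = list(range(n))
        rng.shuffle(strata)                      # random stratum order per axis
        columns.append([(k + rng.random()) / n for k in strata])
    return [list(point) for point in zip(*columns)]

rng = random.Random(0)                           # fixed seed for repeatability
halton_points = shifted_halton(8, bases=(2, 3), rng=rng)
lhs_points = latin_hypercube(8, dim=2, rng=rng)
```

Both samplers return points in the unit cube; to feed a simulator that expects normal draws, the coordinates would still need to be transformed, for example through an inverse normal distribution function.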
Pessimistic Cost-Sensitive Active Learning of Decision Trees
- Data Mining and Knowledge Discovery
, 2008
Cited by 2 (2 self)
In business applications such as direct marketing, decision-makers are required to choose the action that maximizes a utility function. Cost-sensitive learning methods can help them achieve this goal. In this paper, we introduce Pessimistic Active Learning (PAL). PAL employs a novel pessimistic measure, which relies on confidence intervals and is used to balance the exploration/exploitation trade-off. In order to acquire an initial sample of labeled data, PAL applies orthogonal arrays of fractional factorial design. PAL was tested on ten datasets using a decision tree inducer. A comparison of these results to those of other methods indicates PAL’s superiority.

