Results 1 - 10 of 675
Preliminary guidelines for empirical research in software engineering
- IEEE Transactions on Software Engineering, 2002
"... ..."
(Show Context)
Experimentation in software engineering
- IEEE Transactions on Software Engineering, 1986
"... ..."
Comparing Detection Methods For Software Requirements Inspections: A Replicated Experiment
- 1995
"... Software requirements specifications (SRS) are often validated manually. One such process is inspection, in which several reviewers independently analyze all or part of the specification and search for faults. These faults are then collected at a meeting of the reviewers and author(s). Usually, revi ..."
Abstract
-
Cited by 198 (22 self)
- Add to MetaCart
Software requirements specifications (SRS) are often validated manually. One such process is inspection, in which several reviewers independently analyze all or part of the specification and search for faults. These faults are then collected at a meeting of the reviewers and author(s). Usually, reviewers use Ad Hoc or Checklist methods to uncover faults. These methods force all reviewers to rely on nonsystematic techniques to search for a wide variety of faults. We hypothesize that a Scenario-based method, in which each reviewer uses different, systematic techniques to search for different, specific classes of faults, will have a significantly higher success rate. We evaluated this hypothesis using a 3 × 2⁴ partial factorial, randomized experimental design. Forty-eight graduate students in computer science participated in the experiment. They were assembled into sixteen three-person teams. Each team inspected two SRS using some combination of Ad Hoc, Checklist or Scenario meth...
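As a rough illustration of the design vocabulary above, here is a minimal Python sketch that enumerates the cells of a small factorial crossing and randomly assigns sixteen teams to them. Only the three-level detection-method factor comes from the abstract; the two binary factors named below are hypothetical placeholders.

```python
# Hypothetical illustration: randomized assignment of teams to the
# cells of a factorial design. Only the 3-level "method" factor is
# taken from the abstract; the binary factors are placeholders.
import itertools
import random

methods = ["AdHoc", "Checklist", "Scenario"]            # 3-level factor
binary_factors = {"specification": ["SRS-A", "SRS-B"],  # placeholder
                  "round": [1, 2]}                      # placeholder

# Full crossing; a *partial* factorial would keep a balanced subset
# of these cells instead of all of them.
cells = list(itertools.product(methods, *binary_factors.values()))

teams = [f"team-{i:02d}" for i in range(16)]
random.shuffle(teams)                 # randomize before assignment

assignment = {t: cells[i % len(cells)] for i, t in enumerate(teams)}
for team in sorted(assignment):
    print(team, assignment[team])
```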
Stemming Algorithms - A Case Study for Detailed Evaluation
- Journal of the American Society for Information Science, 1996
"... The majority of information retrieval experiments are evaluated by measures such as average precision and average recall. Fundamental decisions about the superiority of one retrieval technique over another are made solely on the basis of these measures. We claim that average performance figures n ..."
Abstract
-
Cited by 182 (4 self)
- Add to MetaCart
The majority of information retrieval experiments are evaluated by measures such as average precision and average recall. Fundamental decisions about the superiority of one retrieval technique over another are made solely on the basis of these measures. We claim that average performance figures need to be validated with a careful statistical analysis and that there is a great deal of additional information that can be uncovered by looking closely at the results of individual queries. This paper is a case study of stemming algorithms that describes a number of novel approaches to evaluation and demonstrates their value.
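A minimal sketch of the per-query view the abstract argues for, using made-up average-precision values for two hypothetical stemmers; a paired significance test would then operate on the same per-query differences.

```python
# Illustrative per-query comparison of two stemmers; the average
# precision (AP) values are made up for the sketch.
from statistics import mean

ap_a = [0.41, 0.38, 0.52, 0.47, 0.30, 0.61, 0.44, 0.50]
ap_b = [0.43, 0.35, 0.58, 0.47, 0.36, 0.60, 0.49, 0.55]

print(f"mean AP: A={mean(ap_a):.3f}  B={mean(ap_b):.3f}")

diffs = [b - a for a, b in zip(ap_a, ap_b)]
wins_b = sum(d > 0 for d in diffs)    # queries where B is better
wins_a = sum(d < 0 for d in diffs)    # queries where A is better
print(f"B wins {wins_b}, A wins {wins_a}, ties {diffs.count(0.0)}")
# A paired test (e.g. scipy.stats.wilcoxon on the per-query diffs)
# would then check whether the difference is statistically reliable.
```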
Sketch-based Change Detection: Methods, Evaluation, and Applications
- In Internet Measurement Conference, 2003
"... Traffic anomalies such as failures and attacks are commonplace in today's network, and identifying them rapidly and accurately is critical for large network operators. The detection typically treats the traffic as a collection of flows that need to be examined for significant changes in traffic ..."
Abstract
-
Cited by 165 (17 self)
- Add to MetaCart
Traffic anomalies such as failures and attacks are commonplace in today's network, and identifying them rapidly and accurately is critical for large network operators. The detection typically treats the traffic as a collection of flows that need to be examined for significant changes in traffic pattern (e.g., volume, number of connections). However, as link speeds and the number of flows increase, keeping per-flow state is either too expensive or too slow. We propose building compact summaries of the traffic data using the notion of sketches. We have designed a variant of the sketch data structure, k-ary sketch, which uses a constant, small amount of memory, and has constant per-record update and reconstruction cost. Its linearity property enables us to summarize traffic at various levels. We then implement a variety of time series forecast models (ARIMA, Holt-Winters, etc.) on top of such summaries and detect significant changes by looking for flows with large forecast errors. We also present heuristics for automatically configuring the model parameters. Using a ...
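A minimal sketch of a k-ary sketch consistent with the description above: a few hash rows of counters, constant-cost updates, and an estimator that subtracts out the row mean. The hash choice and the row/width parameters are illustrative assumptions, not the paper's configuration.

```python
# Hedged sketch of a k-ary sketch: H hash rows of K counters,
# O(H) updates, and a per-key estimate that removes the row mean.
# blake2b hashing and the rows/width values are illustrative.
import hashlib
import statistics

class KarySketch:
    def __init__(self, rows=5, width=1024):
        self.rows, self.width = rows, width
        self.table = [[0.0] * width for _ in range(rows)]
        self.total = 0.0

    def _bucket(self, row, key):
        h = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8)
        return int.from_bytes(h.digest(), "big") % self.width

    def update(self, key, value=1.0):
        self.total += value
        for r in range(self.rows):
            self.table[r][self._bucket(r, key)] += value

    def estimate(self, key):
        k = self.width
        ests = []
        for r in range(self.rows):
            v = self.table[r][self._bucket(r, key)]
            ests.append((v - self.total / k) / (1.0 - 1.0 / k))
        return statistics.median(ests)      # median over the rows

sk = KarySketch()
for flow, volume in [("10.0.0.1", 500), ("10.0.0.2", 20)] * 3:
    sk.update(flow, volume)
print(round(sk.estimate("10.0.0.1")))       # ≈ 1500
```

Linearity means two sketches can be subtracted counter-by-counter, which is what lets a forecast model run on sketch summaries and flag keys with large forecast errors.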
Flexural rigidity of microtubules and actin filaments measured from thermal fluctuations in shape.
- Journal of Cell Biology, 1995
"... Abstract. Microtubules are long, proteinaceous filaments that perform structural functions in eukaryotic cells by defining cellular shape and serving as tracks for intracellular motor proteins. We report the first accurate measurements of the flexural rigidity of microtubules. By analyzing the ther ..."
Abstract
-
Cited by 163 (4 self)
- Add to MetaCart
(Show Context)
Microtubules are long, proteinaceous filaments that perform structural functions in eukaryotic cells by defining cellular shape and serving as tracks for intracellular motor proteins. We report the first accurate measurements of the flexural rigidity of microtubules. By analyzing the thermally driven fluctuations in their shape, we estimated the mean flexural rigidity of taxol-stabilized microtubules to be 2.2 × 10⁻²³ N·m² (with 6.4% uncertainty) for seven unlabeled microtubules and 2.1 × 10⁻²³ N·m² (with 4.7% uncertainty) for eight rhodamine-labeled microtubules. These values are similar to earlier, less precise estimates of microtubule bending stiffness obtained by modeling flagellar motion. A similar analysis on seven rhodamine-phalloidin-labeled actin filaments gave a flexural rigidity of 7.3 × 10⁻²⁶ N·m² (with 6% uncertainty), consistent with previously reported results. The flexural rigidity of these microtubules corresponds to a persistence length of 5,200 μm, showing that a microtubule is rigid over cellular dimensions. By contrast, the persistence length of an actin filament is only ~17.7 μm, perhaps explaining why actin filaments within cells are usually cross-linked into bundles. The greater flexural rigidity of a microtubule compared to an actin filament mainly derives from the former's larger cross-section. If tubulin were homogeneous and isotropic, then the microtubule's Young's modulus would be ~1.2 GPa, similar to Plexiglas and rigid plastics. Microtubules are expected to be almost inextensible: the compliance of cells is due primarily to filament bending or sliding between filaments rather than the stretching of the filaments themselves.
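A quick arithmetic check of the persistence lengths quoted above, using L_p = kappa / (k_B T) with kappa the flexural rigidity. Room temperature is an assumption here (the abstract does not restate T), and accounts for the small difference from the quoted 5,200 μm.

```python
# Arithmetic check of the quoted numbers: L_p = kappa / (k_B * T).
kB = 1.380649e-23            # Boltzmann constant, J/K
T = 298.0                    # K (assumed room temperature)

def persistence_length_um(kappa_Nm2):
    return kappa_Nm2 / (kB * T) * 1e6    # metres -> micrometres

print(f"microtubule: {persistence_length_um(2.2e-23):,.0f} um")  # ~5,300
print(f"actin:       {persistence_length_um(7.3e-26):.1f} um")   # ~17.7
```

The actin result reproduces the abstract's ~17.7 μm exactly, which is also what pins down the garbled rigidity value as 7.3 × 10⁻²⁶ N·m².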
Some New Three Level Designs for the Study of Quantitative Variables
- 1960
"... This article describes some methods which enable us to construct small designs for quantitative factors, while maintaining as much orthogonality of the design as possible. To calculate the D- ..."
Abstract
-
Cited by 150 (1 self)
- Add to MetaCart
This article describes some methods which enable us to construct small designs for quantitative factors, while maintaining as much orthogonality of the design as possible. To calculate the D-
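For concreteness, a sketch of the best-known member of this family of three-level designs, the three-factor Box-Behnken construction: a two-level factorial on each pair of factors with the remaining factor held at its mid level, plus centre runs. The centre-point count below is an illustrative choice.

```python
# Sketch of the classic three-factor Box-Behnken construction.
# Coded levels are -1, 0, +1; 12 edge runs plus centre runs.
from itertools import combinations, product

def box_behnken_3(n_center=3):            # n_center is illustrative
    runs = []
    for i, j in combinations(range(3), 2):    # each pair of factors
        for a, b in product((-1, 1), repeat=2):  # 2x2 on that pair
            run = [0, 0, 0]
            run[i], run[j] = a, b
            runs.append(run)
    runs += [[0, 0, 0]] * n_center            # centre points
    return runs

for run in box_behnken_3():
    print(run)                 # 12 edge runs + 3 centre runs
```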
Efficient algorithms for minimizing cross validation error
- In Proceedings of the Eleventh International Conference on Machine Learning, 1994
"... Model selection is important in many areas of supervised learning. Given a dataset and a set of models for predicting with that dataset, we must choose the model which is expected to best predict future data. In some situations, such as online learning for control of robots or factories, data is che ..."
Abstract
-
Cited by 150 (7 self)
- Add to MetaCart
Model selection is important in many areas of supervised learning. Given a dataset and a set of models for predicting with that dataset, we must choose the model which is expected to best predict future data. In some situations, such as online learning for control of robots or factories, data is cheap and human expertise costly. Cross validation can then be a highly effective method for automatic model selection. Large scale cross validation search can, however, be computationally expensive. This paper introduces new algorithms to reduce the computational burden of such searches. We show how experimental design methods can achieve this, using a technique similar to a Bayesian version of Kaelbling’s Interval Estimation. Several improvements are then given, including (1) the use of blocking to quickly spot near-identical models, and (2) schemata search: a new method for quickly finding families of relevant features. Experiments are presented for robot data and noisy synthetic datasets. The new algorithms speed up computation without sacrificing reliability, and in some cases are more reliable than conventional techniques.
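A minimal sketch of the brute-force baseline that such algorithms aim to prune: score each candidate model by k-fold cross-validation error and keep the best. The toy data and the polynomial model family are illustrative assumptions, not the paper's experiments.

```python
# Baseline sketch: exhaustive model selection by k-fold CV error.
import numpy as np
rng = np.random.default_rng(0)

x = rng.uniform(0, 1, 60)
y = 2 * x + 0.5 * x**2 + rng.normal(0, 0.1, 60)   # toy regression data

def cv_error(degree, k=5):
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coeffs, x[test])
        errs.append(np.mean((pred - y[test]) ** 2))
    return float(np.mean(errs))

# Brute force over candidate models; the paper's algorithms prune
# this search (e.g. blocking quickly spots near-identical models).
for d in range(1, 6):
    print(d, round(cv_error(d), 5))
```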
Comparative Studies Of Metamodeling Techniques Under Multiple Modeling Criteria
- Structural and Multidisciplinary Optimization, 2000
"... 1 Despite the advances in computer capacity, the enormous computational cost of complex engineering simulations makes it impractical to rely exclusively on simulation for the purpose of design optimization. To cut down the cost, surrogate models, also known as metamodels, are constructed from and ..."
Abstract
-
Cited by 134 (8 self)
- Add to MetaCart
Despite the advances in computer capacity, the enormous computational cost of complex engineering simulations makes it impractical to rely exclusively on simulation for the purpose of design optimization. To cut down the cost, surrogate models, also known as metamodels, are constructed from and then used in lieu of the actual simulation models. In the paper, we systematically compare four popular metamodeling techniques (Polynomial Regression, Multivariate Adaptive Regression Splines, Radial Basis Functions, and Kriging) based on multiple performance criteria using fourteen test problems representing different classes of problems. Our objective in this study is to investigate the advantages and disadvantages of these four metamodeling techniques using multiple modeling criteria and multiple test problems rather than a single measure of merit and a single test problem.
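As a concrete instance of one of the four compared techniques, here is a small radial-basis-function metamodel fitted with a linear solve and used in lieu of an "expensive" simulation. The test function, Gaussian kernel, width, and jitter are illustrative choices, not the paper's benchmark setup.

```python
# Sketch: Gaussian RBF metamodel as a surrogate for a costly model.
import numpy as np
rng = np.random.default_rng(1)

def expensive_sim(X):               # stand-in for a costly simulation
    return np.sin(3 * X[:, 0]) + X[:, 1] ** 2

X_train = rng.uniform(-1, 1, (25, 2))   # small design of experiments
y_train = expensive_sim(X_train)

def rbf_kernel(A, B, width=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width**2))

# Solve for RBF weights; a tiny jitter keeps the system well-posed.
weights = np.linalg.solve(rbf_kernel(X_train, X_train) +
                          1e-8 * np.eye(len(X_train)), y_train)

X_test = rng.uniform(-1, 1, (5, 2))
pred = rbf_kernel(X_test, X_train) @ weights
print(np.c_[pred, expensive_sim(X_test)])   # surrogate vs. true values
```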
Evaluating recommendation systems
- In Recommender Systems Handbook, 2011
"... Abstract Recommender systems are now popular both commercially and in the research community, where many approaches have been suggested for providing recommendations. In many cases a system designer that wishes to employ a rec-ommendation system must choose between a set of candidate approaches. A f ..."
Abstract
-
Cited by 85 (2 self)
- Add to MetaCart
(Show Context)
Recommender systems are now popular both commercially and in the research community, where many approaches have been suggested for providing recommendations. In many cases a system designer that wishes to employ a recommendation system must choose between a set of candidate approaches. A first step towards selecting an appropriate algorithm is to decide which properties of the application to focus upon when making this choice. Indeed, recommendation systems have a variety of properties that may affect user experience, such as accuracy, robustness, scalability, and so forth. In this paper we discuss how to compare recommenders based on a set of properties that are relevant for the application. We focus on comparative studies, where a few algorithms are compared using some evaluation metric, rather than absolute benchmarking of algorithms. We describe experimental settings appropriate for making choices between algorithms. We review three types of experiments, starting with an offline setting, where recommendation approaches are compared without user interaction, then reviewing user studies, where a small group of subjects experiment with the system and report on the experience, and finally describe large scale online experiments, where real user populations interact with the system. In each of these cases we describe types of questions that can be answered, and suggest protocols for experimentation. We also discuss how to draw trustworthy conclusions from the conducted experiments. We then review a large set of properties, and explain how to evaluate systems given relevant properties. We also survey a large set of evaluation metrics in the context of the property that they evaluate.
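A minimal sketch of the offline setting described above: hold out some known user-item interactions, ask the recommender for a ranked list, and score it with precision@k, one of the metric types the chapter surveys. The users, items, and rankings below are toy values.

```python
# Offline evaluation sketch: score hidden interactions with precision@k.
def precision_at_k(recommended, held_out, k=5):
    hits = sum(1 for item in recommended[:k] if item in held_out)
    return hits / k

# Toy data: per-user hidden test items and a recommender's ranked lists.
held_out = {"u1": {"i3", "i7"}, "u2": {"i2"}}
recs = {"u1": ["i7", "i1", "i3", "i9", "i4"],
        "u2": ["i5", "i2", "i8", "i6", "i0"]}

scores = [precision_at_k(recs[u], held_out[u]) for u in held_out]
print(sum(scores) / len(scores))    # mean precision@5 over users
```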