PROBABILITY INEQUALITIES FOR SUMS OF BOUNDED RANDOM VARIABLES
, 1962
Cited by 1128 (2 self)
Upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt. It is assumed that the range of each summand of S is bounded or bounded above. The bounds for Pr(S - ES > nt) depend only on the endpoints of the ranges of the summands and the mean, or the mean and the variance of S. These results are then used to obtain analogous inequalities for certain sums of dependent random variables such as U statistics and the sum of a random sample without replacement from a finite population.
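As a rough illustration of the kind of bound described above, the sketch below evaluates the widely quoted range-only form usually called Hoeffding's inequality, Pr(S - ES >= nt) <= exp(-2 n^2 t^2 / sum_i (b_i - a_i)^2), for independent summands with X_i in [a_i, b_i]. The formula is stated from general knowledge rather than from the abstract itself, and the function name and argument layout are hypothetical.

```python
import math


def hoeffding_upper_bound(ranges, t):
    """Upper bound on Pr(S - ES >= n*t) for independent X_i in [a_i, b_i].

    `ranges` is a list of (a_i, b_i) pairs, one per summand (hypothetical
    interface, chosen only for this illustration).
    """
    n = len(ranges)
    if n == 0:
        raise ValueError("at least one summand is required")
    if t <= 0:
        raise ValueError("t must be positive")
    # Sum of squared range widths, the only property of the summands used.
    width_sq = sum((b - a) ** 2 for a, b in ranges)
    if width_sq == 0:
        # Every summand is constant, so S never exceeds its mean.
        return 0.0
    return math.exp(-2.0 * (n * t) ** 2 / width_sq)


# Example: 100 summands, each confined to [0, 1], deviation t = 0.1 per summand.
bound = hoeffding_upper_bound([(0.0, 1.0)] * 100, 0.1)
print(f"Pr(S - ES >= 10) <= {bound:.4f}")
```

The bound shrinks exponentially in n for fixed t and equal ranges, which is why it is often used to size samples.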
On the effectiveness of the test-first approach to programming
- IEEE Transactions on Software Engineering
, 2005
Cited by 49 (2 self)
Test-Driven Development (TDD) is based on formalizing a piece of functionality as a test, implementing the functionality such that the test passes, and iterating the process. This paper describes a controlled experiment for evaluating an important aspect of TDD: In TDD, programmers write functional tests before the corresponding implementation code. The experiment was conducted with undergraduate students. While the experiment group applied a test-first strategy, the control group applied a more conventional development technique, writing tests after the implementation. Both groups followed an incremental process, adding new features one at a time and regression testing them. We found that test-first students on average wrote more tests and, in turn, students who wrote more tests tended to be more productive. We also observed that the minimum quality increased linearly with the number of programmer tests, independent of the development strategy employed. Index Terms: General programming techniques, coding tools and techniques, testing and debugging, testing strategies, productivity,
The Willingness to Pay/Willingness to Accept Gap, the “Endowment Effect” and Experimental Procedures for Eliciting Valuations. Social Sciences working paper 1132
, 2002
Cited by 30 (2 self)
We conduct experiments to explore the possibility that subject misconceptions, as opposed to a particular theory of preferences referred to as the “endowment effect,” account for reported gaps between willingness to pay (“WTP”) and willingness to accept (“WTA”). Two facts are evident in the literature. First, there is no consensus regarding the nature or robustness of the WTA-WTP gap. Secondly, while experimenters are very concerned to avoid subject misconceptions, there is no consensus about their fundamental properties or how they might be avoided. Instead, experimenters have revealed different conceptions of the phenomenon through different types of experimental procedures and controls. Such controls involve the role of anonymity, elicitation mechanisms, practice, training and binding outcome experiences applied separately or in different combinations. The resulting pattern of research leaves open the possibility that the widely differing reports of a gap between WTP and WTA could be due to an incomplete science regarding
Cooperative Bug Isolation
, 2004
Cited by 27 (3 self)
Statistical debugging uses lightweight instrumentation and statistical models to identify program behaviors that are strongly predictive of failure. However, most software is mostly correct; nearly all monitored behaviors are poor predictors of failure. We propose an adaptive monitoring strategy that mitigates the overhead associated with monitoring poor failure predictors. We begin by monitoring a small portion of the program, then automatically refine instrumentation over time to zero in on bugs. We formulate this approach as a search on the control-dependence graph of the program. We present and evaluate various heuristics that can be used for this search. We also discuss the construction of a binary instrumentor for incorporating the feedback loop into post-deployment monitoring. Performance measurements show that adaptive bug isolation yields an average performance overhead of 1% for a class of large applications, as opposed to 87% for realistic sampling-based instrumentation and 300% for complete binary instrumentation.
A mitotic form of the Golgi apparatus in HeLa cells
- J. Cell
, 1987
Cited by 18 (8 self)
Galactosyltransferase, a marker for trans-Golgi cisternae in interphase cells, was localized in mitotic HeLa cells embedded in Lowicryl K4M by immunoelectron microscopy. Specific labeling was found only over multivesicular structures that we term Golgi clusters. Unlike Golgi stacks in interphase cells, these clusters lacked elongated cisternae and ordered stacking of their components but did comprise two distinct regions, one containing electron-lucent vesicles and the other, smaller, vesiculo-tubular structures. Labeling for galactosyltransferase was found predominantly over the latter region. Both structures were embedded in a dense matrix that excluded ribosomes and the cluster was often bounded by cisternae of the rough endoplasmic reticulum, sometimes on all sides. Clusters were
Machine learning methods for predicting failures in hard drives: A multiple-instance application
- Journal of Machine Learning Research
, 2005
Cited by 17 (1 self)
We compare machine learning methods applied to a difficult real-world problem: predicting computer hard-drive failure using attributes monitored internally by individual drives. The problem is one of detecting rare events in a time series of noisy and nonparametrically-distributed data. We develop a new algorithm based on the multiple-instance learning framework and the naive Bayesian classifier (mi-NB) which is specifically designed for the low false-alarm case, and is shown to have promising performance. Other methods compared are support vector machines (SVMs), unsupervised clustering, and non-parametric statistical tests (rank-sum and reverse arrangements). The failure-prediction performance of the SVM, rank-sum and mi-NB algorithm is considerably better than the threshold method currently implemented in drives, while maintaining low false alarm rates. Our results suggest that nonparametric statistical tests should be considered for learning problems involving detecting rare events in time series data. An appendix details the calculation of rank-sum significance probabilities in the case of discrete, tied observations, and we give new recommendations about when the exact calculation should be used instead of the commonly-used normal approximation. These normal approximations may be particularly inaccurate for rare event problems like hard drive failures.
Does distributed development affect software quality? An empirical case study of Windows Vista
- In ICSE ’09: Proceedings of the 2009 IEEE 31st International Conference on Software Engineering
, 2009
Cited by 17 (8 self)
It is widely believed that distributed software development is riskier and more challenging than collocated development. Prior literature on distributed development in software engineering and other fields discusses various challenges, including cultural barriers, expertise transfer difficulties, and communication and coordination overhead. We evaluate this conventional belief by examining the overall development of Windows Vista and comparing the postrelease failures of components that were developed in a distributed fashion with those that were developed by collocated teams. We found a negligible difference in failures. This difference becomes even less significant when controlling for the number of developers working on a binary. We also examine component characteristics such as code churn, complexity, dependency information, and test code coverage and find very little difference between distributed and collocated components. Further, we examine the software process and phenomena that occurred during the Vista development cycle and present ways in which the development process utilized may be insensitive to geography by mitigating the difficulties introduced in prior work in this area.
Studying the Chaos of Code Development
, 2003
Cited by 12 (4 self)
As large software systems evolve, controlling their complexity is a major challenge for many companies, as they strive to deliver future releases on time and within budget. Several source code based metrics have been proposed to assist in determining the complexity of code to help control development costs and outcome. In this
Optimizing Classifier Performance Via the Wilcoxon-Mann-Whitney Statistic
, 2003
Cited by 11 (0 self)
Cross entropy and mean squared error are typical cost functions used to optimize classifier performance. The goal of the optimization is usually to achieve the best correct classification rate. However, for many two-class real-world problems, the ROC curve is a more meaningful performance measure. We demonstrate that minimizing cross entropy or mean squared error does not necessarily maximize the area under the ROC curve (AUC). We then consider alternative objective functions for training a classifier to maximize the AUC directly. We propose an objective function that is an approximation to the Wilcoxon-Mann-Whitney statistic, which is equivalent to AUC. The proposed objective function is differentiable, so gradient-based methods can be used to train the classifier. After discussing the improved results of the new objective function over several UCI data sets, we apply the new objective function to real-world customer behavior prediction problems for a wireless service provider and a cable service provider, and achieve reliable and significant improvements in the ROC curve.
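The equivalence between the Wilcoxon-Mann-Whitney statistic and AUC mentioned above can be sketched as follows: the statistic is the fraction of (positive, negative) score pairs that are ranked correctly. Counting ties as one half is a common convention assumed here. The sigmoid-smoothed variant is only a generic differentiable surrogate added for illustration; it is not claimed to be the objective proposed in the paper, and all function names and the `tau` parameter are hypothetical.

```python
import math


def wmw_auc(pos_scores, neg_scores):
    """Exact Wilcoxon-Mann-Whitney estimate of AUC (ties count as 0.5)."""
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    total = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                total += 1.0   # pair ranked correctly
            elif p == q:
                total += 0.5   # tie (assumed convention)
    return total / (len(pos_scores) * len(neg_scores))


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def smooth_wmw(pos_scores, neg_scores, tau=0.1):
    """Generic sigmoid surrogate: replaces the step 1[p > q] by sigmoid((p - q) / tau).

    Illustrative assumption only, not the paper's objective function.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    total = sum(_sigmoid((p - q) / tau) for p in pos_scores for q in neg_scores)
    return total / (len(pos_scores) * len(neg_scores))


pos = [0.9, 0.8]
neg = [0.1, 0.85]
print(wmw_auc(pos, neg))      # 3 of the 4 pairs are ranked correctly
print(smooth_wmw(pos, neg))   # smooth value that approaches wmw_auc as tau shrinks
```

Because the surrogate is a smooth function of the scores, it can be differentiated with respect to model parameters, whereas the exact pair count is piecewise constant.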
Discriminative keyword spotting
- In Proc. of Workshop on Non-Linear Speech Processing
, 2007
Cited by 8 (6 self)
This paper proposes a new approach for keyword spotting, which is not based on HMMs. The proposed method employs a new discriminative learning procedure, in which the learning phase aims at maximizing the area under the ROC curve,

