Results 1–10 of 579
PROBABILITY INEQUALITIES FOR SUMS OF BOUNDED RANDOM VARIABLES
, 1962
"... Upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt. It is assumed that the range of each summand of S is bounded or bounded above. The bounds for Pr(SES> nt) depend only on the endpoints of the ranges of the s ..."
Abstract

Cited by 2217 (2 self)
Upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt. It is assumed that the range of each summand of S is bounded or bounded above. The bounds for Pr(S − ES ≥ nt) depend only on the endpoints of the ranges of the summands and the mean, or the mean and the variance of S. These results are then used to obtain analogous inequalities for certain sums of dependent random variables such as U-statistics and the sum of a random sample without replacement from a finite population.
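For the common special case where every summand takes values in the same interval [a, b], the first of these bounds is Hoeffding's inequality (standard statement, reconstructed here; the paper's general form replaces (b − a)² by the averaged squared ranges of the individual summands):

```latex
% S = X_1 + \cdots + X_n, the X_i independent with a \le X_i \le b.
\Pr\left( S - \mathbb{E}S \ge nt \right)
  \le \exp\!\left( -\frac{2 n t^2}{(b-a)^2} \right), \qquad t > 0.
```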
The Willingness to Pay/Willingness to Accept Gap, the “Endowment Effect” and Experimental Procedures for Eliciting Valuations
, 2002
"... ..."
On the effectiveness of the test-first approach to programming
 IEEE Transactions on Software Engineering
, 2005
"... Abstract—TestDriven Development (TDD) is based on formalizing a piece of functionality as a test, implementing the functionality such that the test passes, and iterating the process. This paper describes a controlled experiment for evaluating an important aspect of TDD: In TDD, programmers write fu ..."
Abstract

Cited by 80 (2 self)
Abstract—Test-Driven Development (TDD) is based on formalizing a piece of functionality as a test, implementing the functionality such that the test passes, and iterating the process. This paper describes a controlled experiment for evaluating an important aspect of TDD: in TDD, programmers write functional tests before the corresponding implementation code. The experiment was conducted with undergraduate students. While the experiment group applied a test-first strategy, the control group applied a more conventional development technique, writing tests after the implementation. Both groups followed an incremental process, adding new features one at a time and regression testing them. We found that test-first students on average wrote more tests and, in turn, students who wrote more tests tended to be more productive. We also observed that the minimum quality increased linearly with the number of programmer tests, independent of the development strategy employed. Index Terms—General programming techniques, coding tools and techniques, testing and debugging, testing strategies, productivity.
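The test-first workflow the experiment group followed can be illustrated with a minimal sketch using Python's built-in unittest (the function name `slugify` and its behavior are hypothetical, chosen only to show the ordering of steps): the failing test is written first, then just enough implementation to make it pass.

```python
import unittest

# Step 1 (test-first): formalize the desired behavior as tests
# before any implementation exists. At this point they fail.
class TestSlugify(unittest.TestCase):
    def test_lowercases_and_joins_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_strips_surrounding_whitespace(self):
        self.assertEqual(slugify("  Trim Me  "), "trim-me")

# Step 2: write just enough implementation to make the tests pass,
# then iterate: next feature, next failing test, next implementation.
def slugify(title: str) -> str:
    return "-".join(title.strip().lower().split())
```

Running `python -m unittest` after each step gives the regression-testing rhythm both groups in the experiment followed.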
Informed trading in stock and option markets
 Journal of Finance
, 2004
"... We investigate the contribution of option markets to price discovery, using a modification of Hasbrouck’s (1995) “information share ” approach. Based on five years of stock and options data for 60 firms, we estimate the option market’s contribution to price discovery to be about 17 percent on averag ..."
Abstract

Cited by 59 (2 self)
We investigate the contribution of option markets to price discovery, using a modification of Hasbrouck’s (1995) “information share” approach. Based on five years of stock and options data for 60 firms, we estimate the option market’s contribution to price discovery to be about 17 percent on average. Option market price discovery is related to trading volume and spreads in both markets, and stock volatility. Price discovery across option strike prices is related to leverage, trading volume, and spreads. Our results are consistent with theoretical arguments that informed investors trade in both stock and option markets, suggesting an important informational role for options.
Does distributed development affect software quality? an empirical case study of windows vista
 In ICSE ’09: Proceedings of the 2009 IEEE 31st International Conference on Software Engineering
, 2009
"... It is widely believed that distributed software development is riskier and more challenging than collocated development. Prior literature on distributed development in software engineering and other fields discuss various challenges, including cultural barriers, expertise transfer difficulties, and ..."
Abstract

Cited by 53 (13 self)
It is widely believed that distributed software development is riskier and more challenging than collocated development. Prior literature on distributed development in software engineering and other fields discusses various challenges, including cultural barriers, expertise transfer difficulties, and communication and coordination overhead. We evaluate this conventional belief by examining the overall development of Windows Vista and comparing the post-release failures of components that were developed in a distributed fashion with those that were developed by collocated teams. We found a negligible difference in failures. This difference becomes even less significant when controlling for the number of developers working on a binary. We also examine component characteristics such as code churn, complexity, dependency information, and test code coverage and find very little difference between distributed and collocated components. Further, we examine the software process and phenomena that occurred during the Vista development cycle and present ways in which the development process utilized may be insensitive to geography by mitigating the difficulties introduced in prior work in this area.
Cooperative Bug Isolation
, 2004
"... Statistical debugging uses lightweight instrumentation and statistical models to identify program behaviors that are strongly predictive of failure. However, most software is mostly correct; nearly all monitored behaviors are poor predictors of failure. We propose an adaptive monitoring strategy tha ..."
Abstract

Cited by 53 (4 self)
Statistical debugging uses lightweight instrumentation and statistical models to identify program behaviors that are strongly predictive of failure. However, most software is mostly correct; nearly all monitored behaviors are poor predictors of failure. We propose an adaptive monitoring strategy that mitigates the overhead associated with monitoring poor failure predictors. We begin by monitoring a small portion of the program, then automatically refine instrumentation over time to zero in on bugs. We formulate this approach as a search on the control-dependence graph of the program. We present and evaluate various heuristics that can be used for this search. We also discuss the construction of a binary instrumentor for incorporating the feedback loop into post-deployment monitoring. Performance measurements show that adaptive bug isolation yields an average performance overhead of 1% for a class of large applications, as opposed to 87% for realistic sampling-based instrumentation and 300% for complete binary instrumentation.
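The "search on the control-dependence graph" can be pictured with a schematic sketch (not the paper's algorithm; the graph representation, `score` function, and greedy expansion heuristic here are all hypothetical): keep a frontier of monitored program points, and repeatedly expand instrumentation to the children of the point that looks most predictive of failure.

```python
import heapq

# Schematic sketch of adaptive predicate search over a control-dependence
# graph (CDG). cdg_children maps a program point to the points
# control-dependent on it; score(n) is any estimate of how strongly
# observations at n predict failure (all names here are hypothetical).
def adaptive_search(cdg_children, roots, score, budget):
    """Greedily expand instrumentation toward the best failure predictors."""
    frontier = [(-score(n), n) for n in roots]   # max-heap via negated scores
    heapq.heapify(frontier)
    monitored = set(roots)                       # start small: roots only
    while frontier and len(monitored) < budget:
        _, best = heapq.heappop(frontier)        # strongest predictor so far
        for child in cdg_children.get(best, ()):
            if child not in monitored:           # refine: instrument children
                monitored.add(child)
                heapq.heappush(frontier, (-score(child), child))
    return monitored
```

The budget caps how much of the program is instrumented at once, which is the source of the low overhead the abstract reports.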
Optimizing classifier performance via an approximation to the Wilcoxon-Mann-Whitney statistic
 in Proc. 20th Int. Conf. Mach. Learn.
, 2003
"... When the goal is to achieve the best correct classification rate, cross entropy and mean squared error are typical cost functions used to optimize classifier performance. However, for many realworld classification problems, the ROC curve is a more meaningful performance measure. We demonstrate that ..."
Abstract

Cited by 49 (4 self)
When the goal is to achieve the best correct classification rate, cross entropy and mean squared error are typical cost functions used to optimize classifier performance. However, for many real-world classification problems, the ROC curve is a more meaningful performance measure. We demonstrate that minimizing cross entropy or mean squared error does not necessarily maximize the area under the ROC curve (AUC). We then consider alternative objective functions for training a classifier to maximize the AUC directly. We propose an objective function that is an approximation to the Wilcoxon-Mann-Whitney statistic, which is equivalent to the AUC. The proposed objective function is differentiable, so gradient-based methods can be used to train the classifier. We apply the new objective function to real-world customer behavior prediction problems for a wireless service provider and a cable service provider, and achieve reliable improvements in the ROC curve.
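The idea can be sketched in a few lines. The Wilcoxon-Mann-Whitney statistic counts correctly ranked (positive, negative) score pairs; a differentiable approximation replaces the 0/1 indicator with a smooth function of the score difference. The sigmoid used below is a common illustrative stand-in, not the paper's proposal (the paper uses a specific polynomial-style approximation):

```python
import math

def wmw_auc(pos_scores, neg_scores):
    """Exact Wilcoxon-Mann-Whitney statistic: the fraction of
    (positive, negative) score pairs ranked correctly, with ties
    counted as half. This equals the AUC."""
    pairs = [(p, n) for p in pos_scores for n in neg_scores]
    correct = sum(1.0 if p > n else (0.5 if p == n else 0.0)
                  for p, n in pairs)
    return correct / len(pairs)

def soft_auc(pos_scores, neg_scores, beta=10.0):
    """Differentiable surrogate: replace the 0/1 indicator with a
    sigmoid of the score difference, so gradients can flow back to
    the classifier's parameters. (Illustrative stand-in; the paper
    proposes a different smooth approximation.)"""
    pairs = [(p, n) for p in pos_scores for n in neg_scores]
    return sum(1.0 / (1.0 + math.exp(-beta * (p - n)))
               for p, n in pairs) / len(pairs)
```

As beta grows, `soft_auc` approaches `wmw_auc`; a moderate beta keeps the gradient informative during training.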
On the Relative Value of Cross-Company and Within-Company Data for Defect Prediction
 EMPIRICAL SOFTWARE ENGINEERING
"... We propose a practical defect prediction approach for companies that do not track defect related data. Specifically, we investigate the applicability of crosscompany (CC) data for building localized defect predictors using static code features. Firstly, we analyze the conditions, where CC data can ..."
Abstract

Cited by 44 (17 self)
We propose a practical defect prediction approach for companies that do not track defect-related data. Specifically, we investigate the applicability of cross-company (CC) data for building localized defect predictors using static code features. Firstly, we analyze the conditions where CC data can be used as is. These conditions turn out to be quite few. Then we apply principles of analogy-based learning (i.e., nearest neighbor (NN) filtering) to CC data in order to fine-tune these models for localization. We compare the performance of these models with that of defect predictors learned from within-company (WC) data. As expected, we observe that defect predictors learned from WC data outperform the ones learned from CC data. However, our analyses also yield defect predictors learned from NN-filtered CC data, with performance close to, but still not better than, WC data. Therefore, we perform a final analysis for determining the minimum number of local defect reports needed to learn WC defect predictors. We demonstrate in this paper that the minimum number of data samples required to build effective defect predictors can be quite small and can be collected quickly within a few months. Hence, for companies with no local defect data, we recommend a two-phase approach.
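The NN-filtering step can be sketched as follows (a minimal sketch, assuming instances are numeric static-code feature vectors; the function names, distance metric, and value of k are illustrative, not the study's exact settings): for each within-company instance, keep its k nearest cross-company instances, and train only on the union of those selections.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nn_filter(cc_rows, wc_rows, k=10):
    """Analogy-based filtering of cross-company (CC) training data:
    for each within-company (WC) instance, select its k nearest CC
    instances by feature distance; the filtered training set is the
    union of all selections. (Schematic sketch of NN filtering.)"""
    selected = set()
    for wc in wc_rows:
        nearest = sorted(range(len(cc_rows)),
                         key=lambda i: euclidean(cc_rows[i], wc))[:k]
        selected.update(nearest)
    return [cc_rows[i] for i in sorted(selected)]
```

The filtered subset discards CC instances that resemble no local code, which is what pulls the CC predictor's performance toward the WC one.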
A mitotic form of the Golgi apparatus in HeLa cells
 J. Cell
, 1987
"... Abstract. Galactosyltransferase, a marker for transGolgi cisternae in interphase cells, was localized in mitotic HeLa cells embedded in Lowicryl K4M by immunoelectron microscopy. Specific labeling was found only over multivesicular structures that we term Golgi clusters. Unlike Golgi stacks in inte ..."
Abstract

Cited by 40 (13 self)
Abstract. Galactosyltransferase, a marker for trans-Golgi cisternae in interphase cells, was localized in mitotic HeLa cells embedded in Lowicryl K4M by immunoelectron microscopy. Specific labeling was found only over multivesicular structures that we term Golgi clusters. Unlike Golgi stacks in interphase cells, these clusters lacked elongated cisternae and ordered stacking of their components but did comprise two distinct regions, one containing electron-lucent vesicles and the other, smaller, vesiculotubular structures. Labeling for galactosyltransferase was found predominantly over the latter region. Both structures were embedded in a dense matrix that excluded ribosomes, and the cluster was often bounded by cisternae of the rough endoplasmic reticulum, sometimes on all sides. Clusters were …
Machine learning methods for predicting failures in hard drives: A multiple instance application
, 2005
"... We compare machine learning methods applied to a difficult realworld problem: predicting computer harddrive failure using attributes monitored internally by individual drives. The problem is one of detecting rare events in a time series of noisy and nonparametricallydistributed data. We develop ..."
Abstract

Cited by 39 (1 self)
We compare machine learning methods applied to a difficult real-world problem: predicting computer hard-drive failure using attributes monitored internally by individual drives. The problem is one of detecting rare events in a time series of noisy and non-parametrically distributed data. We develop a new algorithm based on the multiple-instance learning framework and the naive Bayes classifier (mi-NB) which is specifically designed for the low false-alarm case, and is shown to have promising performance. Other methods compared are support vector machines (SVMs), unsupervised clustering, and non-parametric statistical tests (rank-sum and reverse arrangements). The failure-prediction performance of the SVM, rank-sum, and mi-NB algorithms is considerably better than the threshold method currently implemented in drives, while maintaining low false-alarm rates. Our results suggest that non-parametric statistical tests should be considered for learning problems involving detecting rare events in time series data. An appendix details the calculation of rank-sum significance probabilities in the case of discrete, tied observations, and we give new recommendations about when the exact calculation should be used instead of the commonly used normal approximation. These normal approximations may be particularly inaccurate for rare-event problems like hard drive failures.
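The rank-sum test the abstract refers to can be sketched with the normal approximation it discusses (a minimal sketch: midranks are assigned to ties, but the variance carries no tie correction; the paper's appendix gives the exact calculation for tied observations, which is what it recommends for heavily discrete data like SMART attributes):

```python
import math

def rank_sum_z(x, y):
    """Wilcoxon rank-sum test via the normal approximation: rank the
    pooled samples, sum the ranks of x, and standardize. Large |z|
    suggests the two samples come from different distributions."""
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):                      # assign midranks to ties
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid = (i + j) / 2.0 + 1.0               # average rank of the tie run
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = mid
        i = j + 1
    n, m = len(x), len(y)
    w = sum(ranks[:n])                          # rank sum of the x sample
    mean = n * (n + m + 1) / 2.0                # E[W] under the null
    var = n * m * (n + m + 1) / 12.0            # Var[W], untied case
    return (w - mean) / math.sqrt(var)
```

For drive monitoring, x would be a window of recent SMART attribute values and y a reference sample from healthy operation; a z beyond a chosen threshold flags the drive.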