Results 1–10 of 200
PROBABILITY INEQUALITIES FOR SUMS OF BOUNDED RANDOM VARIABLES
, 1962
Abstract

Cited by 1498 (2 self)
Upper bounds are derived for the probability that the sum S of n independent random variables exceeds its mean ES by a positive number nt. It is assumed that the range of each summand of S is bounded or bounded above. The bounds for Pr(S − ES ≥ nt) depend only on the endpoints of the ranges of the summands and the mean, or the mean and the variance of S. These results are then used to obtain analogous inequalities for certain sums of dependent random variables such as U statistics and the sum of a random sample without replacement from a finite population.
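The first of these bounds is the inequality now known as Hoeffding's inequality; for independent X_i with a_i ≤ X_i ≤ b_i and S = X_1 + … + X_n it reads:

```latex
\Pr\left(S - \mathbb{E}S \ge nt\right)
  \le \exp\!\left( -\frac{2 n^{2} t^{2}}{\sum_{i=1}^{n} (b_i - a_i)^{2}} \right)
```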
The Willingness to Pay/Willingness to Accept Gap, the “Endowment Effect” and Experimental Procedures for Eliciting Valuations
, 2002
On the effectiveness of the test-first approach to programming
 IEEE Transactions on Software Engineering
, 2005
Abstract

Cited by 55 (2 self)
Abstract—Test-Driven Development (TDD) is based on formalizing a piece of functionality as a test, implementing the functionality such that the test passes, and iterating the process. This paper describes a controlled experiment for evaluating an important aspect of TDD: in TDD, programmers write functional tests before the corresponding implementation code. The experiment was conducted with undergraduate students. While the experiment group applied a test-first strategy, the control group applied a more conventional development technique, writing tests after the implementation. Both groups followed an incremental process, adding new features one at a time and regression testing them. We found that test-first students on average wrote more tests and, in turn, students who wrote more tests tended to be more productive. We also observed that the minimum quality increased linearly with the number of programmer tests, independent of the development strategy employed. Index Terms—General programming techniques, coding tools and techniques, testing and debugging, testing strategies, productivity.
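As a hypothetical illustration of the test-first cycle this abstract describes (the function `slugify` and its behaviour are invented for the example), the test is written before the code it exercises:

```python
# Step 1 (test first): specify the behaviour as a test before any
# implementation exists. `slugify` is a hypothetical function.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Trim  me  ") == "trim-me"

# Step 2: write just enough implementation to make the test pass.
def slugify(text):
    return "-".join(text.lower().split())

# Step 3: run the test; on green, refactor, then repeat with the next feature.
test_slugify()
```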
Optimizing classifier performance via an approximation to the Wilcoxon-Mann-Whitney statistic
 in Proc. 20th Int. Conf. Mach. Learn.
, 2003
Abstract

Cited by 39 (4 self)
When the goal is to achieve the best correct classification rate, cross entropy and mean squared error are typical cost functions used to optimize classifier performance. However, for many real-world classification problems, the ROC curve is a more meaningful performance measure. We demonstrate that minimizing cross entropy or mean squared error does not necessarily maximize the area under the ROC curve (AUC). We then consider alternative objective functions for training a classifier to maximize the AUC directly. We propose an objective function that is an approximation to the Wilcoxon-Mann-Whitney statistic, which is equivalent to the AUC. The proposed objective function is differentiable, so gradient-based methods can be used to train the classifier. We apply the new objective function to real-world customer behavior prediction problems for a wireless service provider and a cable service provider, and achieve reliable improvements in the ROC curve.
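A minimal sketch of the idea, assuming a margin-based polynomial surrogate of the kind used in this family of methods (the margin `gamma` and exponent `p` are illustrative parameters, not the paper's reported settings):

```python
import numpy as np

def wmw_auc(pos_scores, neg_scores):
    """Exact Wilcoxon-Mann-Whitney statistic: the fraction of
    (positive, negative) pairs ranked correctly, which equals the AUC."""
    diffs = pos_scores[:, None] - neg_scores[None, :]
    return np.mean(diffs > 0)

def surrogate_loss(pos_scores, neg_scores, gamma=0.3, p=2):
    """Differentiable approximation: penalise each pair where the
    positive score fails to exceed the negative score by a margin gamma.
    Being smooth in the scores, it admits gradient-based training."""
    diffs = pos_scores[:, None] - neg_scores[None, :]
    penalty = np.where(diffs < gamma, (gamma - diffs) ** p, 0.0)
    return penalty.mean()
```

Driving the surrogate loss to zero forces every positive to outrank every negative by the margin, i.e. AUC = 1.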
Cooperative Bug Isolation
, 2004
Abstract

Cited by 35 (3 self)
Statistical debugging uses lightweight instrumentation and statistical models to identify program behaviors that are strongly predictive of failure. However, most software is mostly correct; nearly all monitored behaviors are poor predictors of failure. We propose an adaptive monitoring strategy that mitigates the overhead associated with monitoring poor failure predictors. We begin by monitoring a small portion of the program, then automatically refine instrumentation over time to zero in on bugs. We formulate this approach as a search on the control-dependence graph of the program. We present and evaluate various heuristics that can be used for this search. We also discuss the construction of a binary instrumentor for incorporating the feedback loop into post-deployment monitoring. Performance measurements show that adaptive bug isolation yields an average performance overhead of 1% for a class of large applications, as opposed to 87% for realistic sampling-based instrumentation and 300% for complete binary instrumentation.
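The refinement loop might be pictured as a best-first search over the program's control-dependence graph, expanding instrumentation only beneath sites that look failure-predictive; the graph, scores, and budget below are invented for illustration and are not the paper's heuristics:

```python
import heapq

def adaptive_refine(cdg, scores, roots, budget):
    """Best-first expansion of instrumentation sites.
    cdg: dict mapping a site to its control-dependent children;
    scores: dict mapping a site to an observed failure-predictiveness
    score (higher = more suspicious). All names are illustrative."""
    # Max-heap via negated scores; start by monitoring only the roots.
    frontier = [(-scores[s], s) for s in roots]
    heapq.heapify(frontier)
    instrumented = []
    while frontier and len(instrumented) < budget:
        _, site = heapq.heappop(frontier)
        instrumented.append(site)
        # Refine: make the children of the most suspicious monitored
        # sites candidates for the next round of instrumentation.
        for child in cdg.get(site, []):
            heapq.heappush(frontier, (-scores[child], child))
    return instrumented
```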
A mitotic form of the Golgi apparatus in HeLa cells
 J. Cell
, 1987
Abstract

Cited by 33 (13 self)
Abstract. Galactosyltransferase, a marker for trans-Golgi cisternae in interphase cells, was localized in mitotic HeLa cells embedded in Lowicryl K4M by immunoelectron microscopy. Specific labeling was found only over multivesicular structures that we term Golgi clusters. Unlike Golgi stacks in interphase cells, these clusters lacked elongated cisternae and ordered stacking of their components but did comprise two distinct regions, one containing electron-lucent vesicles and the other, smaller, vesiculotubular structures. Labeling for galactosyltransferase was found predominantly over the latter region. Both structures were embedded in a dense matrix that excluded ribosomes, and the cluster was often bounded by cisternae of the rough endoplasmic reticulum, sometimes on all sides. Clusters were ...
Does distributed development affect software quality? An empirical case study of Windows Vista
 In ICSE ’09: Proceedings of the 2009 IEEE 31st International Conference on Software Engineering
, 2009
Abstract

Cited by 29 (10 self)
It is widely believed that distributed software development is riskier and more challenging than collocated development. Prior literature on distributed development in software engineering and other fields discusses various challenges, including cultural barriers, expertise transfer difficulties, and communication and coordination overhead. We evaluate this conventional belief by examining the overall development of Windows Vista and comparing the post-release failures of components that were developed in a distributed fashion with those that were developed by collocated teams. We found a negligible difference in failures. This difference becomes even less significant when controlling for the number of developers working on a binary. We also examine component characteristics such as code churn, complexity, dependency information, and test code coverage, and find very little difference between distributed and collocated components. Further, we examine the software process and phenomena that occurred during the Vista development cycle and present ways in which the development process utilized may be insensitive to geography by mitigating the difficulties introduced in prior work in this area.
Machine learning methods for predicting failures in hard drives: A multiple-instance application
 Journal of Machine Learning research
, 2005
Abstract

Cited by 26 (1 self)
We compare machine learning methods applied to a difficult real-world problem: predicting computer hard-drive failure using attributes monitored internally by individual drives. The problem is one of detecting rare events in a time series of noisy and nonparametrically distributed data. We develop a new algorithm based on the multiple-instance learning framework and the naive Bayesian classifier (mi-NB) which is specifically designed for the low false-alarm case, and is shown to have promising performance. Other methods compared are support vector machines (SVMs), unsupervised clustering, and nonparametric statistical tests (rank-sum and reverse arrangements). The failure-prediction performance of the SVM, rank-sum, and mi-NB algorithms is considerably better than the threshold method currently implemented in drives, while maintaining low false-alarm rates. Our results suggest that nonparametric statistical tests should be considered for learning problems involving detecting rare events in time series data. An appendix details the calculation of rank-sum significance probabilities in the case of discrete, tied observations, and we give new recommendations about when the exact calculation should be used instead of the commonly used normal approximation. These normal approximations may be particularly inaccurate for rare-event problems like hard-drive failures.
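For reference, a self-contained sketch of the rank-sum normal approximation with midranks and the standard tie correction (this is the generic textbook form the abstract cautions about, not the paper's code):

```python
import numpy as np

def rank_sum_z(x, y):
    """Normal approximation to the two-sample rank-sum test statistic,
    using midranks for tied observations and the usual tie correction
    to the variance."""
    n1, n2 = len(x), len(y)
    combined = np.concatenate([x, y])
    # Midranks: each run of tied values gets the average of its ranks.
    order = np.argsort(combined, kind="mergesort")
    sorted_vals = combined[order]
    ranks = np.empty(len(combined))
    i = 0
    while i < len(sorted_vals):
        j = i
        while j < len(sorted_vals) and sorted_vals[j] == sorted_vals[i]:
            j += 1
        ranks[order[i:j]] = (i + j + 1) / 2.0   # 1-based ranks i+1 .. j
        i = j
    w = ranks[:n1].sum()                        # rank sum of first sample
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    # Tie correction: subtract a term driven by the sizes of tie groups.
    _, counts = np.unique(combined, return_counts=True)
    tie_term = np.sum(counts**3 - counts)
    var_w = n1 * n2 / 12.0 * (
        (n1 + n2 + 1) - tie_term / ((n1 + n2) * (n1 + n2 - 1)))
    return (w - mean_w) / np.sqrt(var_w)
```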
Weighted constraints and gradient restrictions on place co-occurrence
 in Muna and Arabic. Natural Language and Linguistic Theory
, 2008
Abstract

Cited by 20 (3 self)
Abstract. This paper documents a restriction against the co-occurrence of homorganic consonants in the root morphemes of Muna, a western Austronesian language, and compares the Muna pattern with the much-studied similar pattern in Arabic. As in Arabic, the restriction applies gradiently: its force depends on the place of articulation of the consonants involved, and on whether the homorganic consonants are similar in terms of other features. Muna differs from Arabic in the relative strengths of these other features in affecting co-occurrence rates of homorganic consonants. Along with the descriptions of these patterns, this paper presents phonological analyses of the Muna and Arabic patterns in terms of weighted constraints, as in Harmonic Grammar. This account uses a gradual learning algorithm that acquires weights that reflect the relative frequency of different sequence types in the two languages. The resulting grammars assign the sequences acceptability scores that correlate with a measure of their attestedness in the lexicon. This application of Harmonic Grammar illustrates its ability to capture both gradient and categorical patterns.
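In Harmonic Grammar, a candidate's harmony is the negated weighted sum of its constraint violations, so graded co-occurrence penalties fall out of the weights; a minimal sketch with invented constraint names, weights, and violation counts:

```python
def harmony(violations, weights):
    """Harmony of a candidate: negated weighted sum of its constraint
    violations. Higher (less negative) harmony = more acceptable."""
    return -sum(weights[c] * v for c, v in violations.items())

# Hypothetical weights for two place co-occurrence constraints.
weights = {"*SIMILAR-PLACE": 3.0, "*IDENTICAL-PLACE": 1.0}

# Two hypothetical root candidates and their violation counts.
root_ok  = {"*SIMILAR-PLACE": 0, "*IDENTICAL-PLACE": 0}
root_bad = {"*SIMILAR-PLACE": 1, "*IDENTICAL-PLACE": 1}
```

Because violations are weighted rather than strictly ranked, the resulting scores are gradient, which is what lets the grammar track co-occurrence frequencies in the lexicon.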
Leave-one-out cross-validation based model selection criteria for weighted LS-SVMs
 in IJCNN’06. Vancouver: IEEE
Abstract

Cited by 20 (4 self)
Abstract — While the model parameters of many kernel learning methods are given by the solution of a convex optimisation problem, the selection of good values for the kernel and regularisation parameters, i.e. model selection, is much less straightforward. This paper describes a simple and efficient approach to model selection for weighted least-squares support vector machines, and compares a variety of model selection criteria based on leave-one-out cross-validation. An external cross-validation procedure is used for performance estimation, with model selection performed independently in each fold to avoid selection bias. The best entry based on these methods was ranked in joint first place in the WCCI-2006 performance prediction challenge, demonstrating the effectiveness of this approach.
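Leave-one-out residuals need not be computed by refitting n times; for the closely related kernel ridge formulation they follow in closed form from the hat matrix, which is the kind of shortcut such criteria rely on (a sketch under that assumption, not the paper's exact weighted LS-SVM derivation):

```python
import numpy as np

def loo_residuals(K, y, lam):
    """Closed-form leave-one-out residuals for kernel ridge regression:
    e_i^loo = (y_i - f(x_i)) / (1 - H_ii), with hat matrix
    H = K (K + lam*I)^-1, avoiding n separate refits."""
    n = len(y)
    H = K @ np.linalg.inv(K + lam * np.eye(n))   # hat matrix
    residuals = y - H @ y                        # ordinary training residuals
    return residuals / (1.0 - np.diag(H))

def press(K, y, lam):
    """PRESS model-selection criterion: sum of squared LOO residuals,
    to be minimised over kernel and regularisation parameters."""
    return float(np.sum(loo_residuals(K, y, lam) ** 2))
```

A grid or gradient search over `lam` (and any kernel parameters) using `press` gives a cheap stand-in for full leave-one-out model selection.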