Characterizing and Predicting Which Bugs Get Fixed: An Empirical Study of Microsoft Windows
Cited by 7 (3 self)
We performed an empirical study to characterize factors that affect which bugs get fixed in Windows Vista and Windows 7, focusing on factors related to bug report edits and relationships between people involved in handling the bug. We found that bugs reported by people with better reputations were more likely to get fixed, as were bugs handled by people on the same team and working in geographical proximity. We reinforce these quantitative results with survey feedback from 358 Microsoft employees who were involved in Windows bugs. Survey respondents also mentioned additional qualitative influences on bug fixing, such as the importance of seniority and interpersonal skills of the bug reporter. Informed by these findings, we built a statistical model to predict the probability that a new bug will be fixed (the first known one, to the best of our knowledge). We trained it on Windows Vista bugs and got a precision of 68% and recall of 64% when predicting Windows 7 bug fixes. Engineers could use such a model to prioritize bugs during triage, to estimate developer workloads, and to decide which bugs should be closed or migrated to future product versions.
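As a reminder of what the reported precision and recall figures measure, here is a minimal sketch. It is not the study's model; the function name `precision_recall` and the toy labels are hypothetical, and `True` is assumed to mean "bug was fixed".

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # predicted fixed, was fixed
    fp = sum(1 for p, a in pairs if p and not a)    # predicted fixed, was not
    fn = sum(1 for p, a in pairs if a and not p)    # missed a bug that was fixed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the Windows study.
predicted = [True, True, True, False, False, True]
actual = [True, False, True, True, False, False]
p, r = precision_recall(predicted, actual)
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision answers "of the bugs predicted to be fixed, how many were?", while recall answers "of the bugs that were fixed, how many did the model flag?".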
The Missing Links: Bugs and Bug-fix Commits
Cited by 7 (1 self)
Empirical studies of software defects rely on links between bug databases and program code repositories. This linkage is typically based on bug-fixes identified in developer-entered commit logs. Unfortunately, developers do not always report which commits perform bug-fixes. Prior work suggests that such links can be a biased sample of the entire population of fixed bugs. The validity of statistical hypothesis testing based on linked data could well be affected by bias. Given the wide use of linked defect data, it is vital to gauge the nature and extent of the bias, and try to develop testable theories and models of the bias. To do this, we must establish ground truth: manually analyze a complete version history corpus, and nail down those commits that fix defects, and those that do not. This is a difficult task, requiring an expert to compare versions, analyze changes, find related bugs in the bug database, reverse-engineer missing links, and finally record their work for use later. This effort must be repeated for hundreds of commits to obtain a useful sample of reported and unreported bug-fix commits. We make several contributions. First, we present Linkster, a tool to facilitate link reverse-engineering. Second, we evaluate this tool, engaging a core developer of the Apache HTTP web server project to exhaustively annotate 493 commits that occurred during a six-week period. Finally, we analyze this comprehensive data set, showing that there are serious and consequential problems in the data.
Cross-project Defect Prediction: A Large Scale Experiment on Data vs. Domain vs. Process
Cited by 6 (2 self)
Prediction of software defects works well within projects as long as there is a sufficient amount of data available to train any models. However, this is rarely the case for new software projects and for many companies. So far, only a few studies have focused on transferring prediction models from one project to another. In this paper, we study cross-project defect prediction models on a large scale. For 12 real-world applications, we ran 622 cross-project predictions. Our results indicate that cross-project prediction is a serious challenge, i.e., simply using models from projects in the same domain or with the same process does not lead to accurate predictions. To help software engineers choose models wisely, we identified factors that do influence the success of cross-project predictions. We also derived decision trees that can provide early estimates for precision, recall, and accuracy before a prediction is attempted. Categories and Subject Descriptors: D.2.8 [Software Engineering]: Metrics—Performance measures, Process metrics, Product metrics. D.2.9 [Software Engineering]: Management—Software
An Extensive Comparison of Bug Prediction Approaches
Cited by 3 (0 self)
Reliably predicting software defects is one of software engineering’s holy grails. Researchers have devised and implemented a plethora of bug prediction approaches varying in terms of accuracy, complexity and the input data they require. However, the absence of an established benchmark makes it hard, if not impossible, to compare approaches. We present a benchmark for defect prediction, in the form of a publicly available data set consisting of several software systems, and provide an extensive comparison of the explanative and predictive power of well-known bug prediction approaches, together with novel approaches we devised. Based on the results, we discuss the performance and stability of the approaches with respect to our benchmark and deduce a number of insights on bug prediction models.
Ownership, Experience and Defects: a fine-grained study of Authorship
In Proceedings ICSE 2011, 2011
Cited by 3 (3 self)
Recent research indicates that “people” factors such as ownership, experience, organizational structure, and geographic distribution have a big impact on software quality. Understanding these factors, and properly deploying people resources, can help managers improve quality outcomes. This paper considers the impact of code ownership and developer experience on software quality. In a large project, a file might be entirely owned by a single developer, or worked on by many. Some previous research indicates that more developers working on a file might lead to more defects. Prior research considered this phenomenon at the level of modules or files, and thus does not tease apart and study the effect of contributions of different developers to each module or file. We exploit a modern version control system to examine this issue at a fine-grained level. Using version history, we examine contributions to code fragments that are actually repaired to fix bugs. Are these code fragments “implicated” in bugs the result of contributions from many, or from one? Does experience matter? What type of experience? We find that implicated code is more strongly associated with a single developer’s contribution; our findings also indicate that an author’s specialized experience in the target file is more important than general experience. Our findings suggest that quality control efforts could be profitably targeted at changes made by single developers with limited prior experience on that file.
Clones: What is that Smell?
Cited by 2 (0 self)
Clones are generally considered bad programming practice in software engineering folklore. They are identified as a bad smell and a major contributor to project maintenance difficulties. Clones inherently cause code bloat, thus increasing project size and maintenance costs. In this work, we try to validate the conventional wisdom empirically to see whether cloning makes code more defect prone. This paper analyzes the relationship between cloning and defect proneness. We find that, first, the great majority of bugs are not significantly associated with clones. Second, we find that clones may be less defect prone than non-cloned code. Finally, we find little evidence that clones with more copies are actually more error prone. Our findings do not support the claim that clones are really a “bad smell”. Perhaps we can clone, and breathe easy, at the same time. Keywords: software clone; empirical software engineering; software maintenance; software evolution.
Detecting design rule violations
2010
Cited by 1 (1 self)
In this paper, we present an approach to detect design rule violations that could cause software defects, modularity decay, or expensive refactorings. Our approach is to compute the discrepancies between how components should change together based on the modular structure framed by design rules, and how components actually changed together as revealed by how modification requests were fulfilled. Our contributions include a design violation detection framework and a design-rule based impact scope prediction algorithm. We evaluated our approach using the version history of three large-scale open source software projects. We examined all identified violations to check whether they were refactored or recognized by the developers in later versions. Our results show that (1) on average 73% of the violations we identified were either recognized or refactored in later releases (when using 0.5 confidence and varying support from 2 to 10 in Hadoop); (2) our approach can identify problematic design violations much earlier than manual identification by developers; and (3) the identified violations cover multiple bad smells, such as tight coupling and code clone. Categories and Subject Descriptors: D.2.7 [Software Engineering]: Maintenance and Enhancement—refactoring, restructuring; D.2.10 [Software Engineering]:
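The abstract mentions confidence and support thresholds without defining them. The sketch below shows one common association-rule reading of those terms applied to co-change data. It covers only the "how components actually changed together" half of the approach, and the definitions, the function name `co_change_rules`, and the file names are all assumptions rather than the paper's algorithm.

```python
from collections import Counter
from itertools import combinations


def co_change_rules(commits, min_support=2, min_confidence=0.5):
    """Mine directed co-change rules (src -> dst) from lists of changed files.

    Assumed definitions (not confirmed by the abstract):
      support(src, dst)     = number of commits that changed both files
      confidence(src -> dst) = support(src, dst) / commits that changed src
    """
    file_count = Counter()
    pair_count = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_count.update(unique)
        pair_count.update(combinations(unique, 2))

    rules = {}
    for (a, b), support in pair_count.items():
        if support < min_support:
            continue
        for src, dst in ((a, b), (b, a)):
            confidence = support / file_count[src]
            if confidence >= min_confidence:
                rules[(src, dst)] = (support, confidence)
    return rules


# Hypothetical change history: each inner list is one commit.
commits = [
    ["A.java", "B.java"],
    ["A.java", "B.java", "C.java"],
    ["A.java"],
    ["B.java", "C.java"],
    ["A.java", "C.java"],
]
rules = co_change_rules(commits)
for (src, dst), (support, confidence) in sorted(rules.items()):
    print(f"{src} -> {dst}: support={support}, confidence={confidence:.2f}")
```

Under this reading, a violation candidate would be a mined rule between two components that the design structure says should be independent; that comparison step is not shown here.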
Ownership and Experience in Fix-Inducing Code
Cited by 1 (0 self)
Software defects cost the U.S. economy billions of dollars annually [1]. Recent research indicates that “people” factors such as ownership, experience, organizational structure, and geographic distribution have a big impact on software quality. This paper considers the impact of code ownership and developer experience on software quality. In a large project, a file might be entirely owned by a single developer, or worked on by many. Some previous research indicates that more developers working on a file might lead to more defects. We examine this issue at a fine-grained level. Using version control history, we examine contributions to code fragments that are actually repaired to fix bugs. Are these “troubled” code fragments the result of contributions from many, or from one? Does experience matter? What type of experience? We find that fix-inducing code is more strongly associated with a single developer’s contribution; our findings also indicate that an author’s specific experience in the target file is more important than general experience. Our findings suggest that quality control efforts could be profitably targeted at changes made by single developers with limited prior experience on that file.
Detecting Software Modularity Violations
Cited by 1 (0 self)
This paper presents Clio, an approach that detects modularity violations, which can cause software defects, modularity decay, or expensive refactorings. Clio computes the discrepancies between how components should change together based on the modular structure, and how components actually change together as revealed in version histories. We evaluated Clio using 15 releases of Hadoop Common and 10 releases of Eclipse JDT. The results show that hundreds of violations identified using Clio were indeed recognized as design problems or refactored by the developers in later versions. The identified violations cover multiple symptoms of poor design, some of which are not easily detectable using existing approaches. Categories and Subject Descriptors: D.2.7 [Software Engineering]: Maintenance and Enhancement—refactoring, restructuring; D.2.10 [Software Engineering]: Design—modularity violation, refactoring. Keywords: modularity violation detection, refactoring, bad code smells, design structure matrix.
Recalling the “imprecision” of cross-project defect prediction
In the 20th ACM SIGSOFT FSE, 2012
Cited by 1 (1 self)
There has been a great deal of interest in defect prediction: using prediction models trained on historical data to help focus quality-control resources in ongoing development. Since most new projects don’t have historical data, there is interest in cross-project prediction: using data from one project to predict defects in another. Sadly, results in this area have largely been disheartening. Most experiments in cross-project defect prediction report poor performance, using the standard measures of precision, recall and F-score. We argue that these IR-based measures, while broadly applicable, are not as well suited for the quality-control settings in which defect prediction models are used. Specifically, these measures are taken at specific threshold settings (typically thresholds of the predicted probability of defectiveness returned by a logistic regression model). However, in practice, software quality control processes choose from a range of time-and-cost vs. quality tradeoffs: how many files shall we test? How many shall we inspect? Thus, we argue that measures based on a variety of tradeoffs, viz., 5%, 10%, or 20% of files tested/inspected, would be more suitable. We study cross-project defect prediction from this perspective. We find that cross-project prediction performance is no worse than within-project performance, and substantially better than random prediction!
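One way to read the budget-based measures argued for above is "what share of the defective files do we catch if we inspect only the top N% of files, ranked by predicted risk?". The sketch below illustrates that reading. The paper's exact measure is not given in the abstract, so the function `defects_found_at`, its rounding rule, and the toy scores are assumptions.

```python
import math


def defects_found_at(scores, defective, percent):
    """Share of defective files found when inspecting the top `percent`% of files.

    scores    -- predicted defect probability per file (higher = riskier)
    defective -- True if the file actually turned out to be defective
    percent   -- inspection budget as a percentage of the number of files
    """
    ranked = sorted(zip(scores, defective), key=lambda item: item[0], reverse=True)
    budget = math.ceil(percent * len(ranked) / 100)  # assumed: round the budget up
    found = sum(1 for _, is_defective in ranked[:budget] if is_defective)
    total = sum(1 for is_defective in defective if is_defective)
    return found / total if total else 0.0


# Hypothetical predictions for ten files; three are actually defective.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
defective = [True, False, True, False, False, True, False, False, False, False]

for percent in (5, 10, 20):
    share = defects_found_at(scores, defective, percent)
    print(f"top {percent}% of files: {share:.2f} of defects found")
```

Unlike precision and recall at a single probability threshold, this measure is tied to an explicit inspection budget, which is the tradeoff the abstract argues quality-control teams actually face.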

