Mining version histories to guide software changes
- In 26th International Conference on Software Engineering (ICSE 2004), 2004
Cited by 236 (20 self)
We apply data mining to version histories in order to guide programmers along related changes: “Programmers who changed these functions also changed...”. Given a set of existing changes, such rules (a) suggest and predict likely further changes, (b) show up item coupling that is undetectable by program analysis, and (c) prevent errors due to incomplete changes. After an initial change, our ROSE prototype can correctly predict 26% of further files to be changed, and 15% of the precise functions or variables. The topmost three suggestions contain a correct location with a likelihood of 64%.
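The idea of "programmers who changed X also changed Y" rules can be illustrated with a small co-change miner. This is a minimal sketch, not the ROSE implementation: the function names, the thresholds `min_support` and `min_confidence`, and the toy commit history are all assumptions made for illustration. Support is the number of commits in which both items changed; confidence is that count divided by the number of commits that changed the first item.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(transactions, min_support=2, min_confidence=0.5):
    """Return rules (a, b, support, confidence): 'changes to a also changed b'.

    transactions: iterable of sets of item names changed together.
    Thresholds are illustrative defaults, not values from the paper.
    """
    item_count = Counter()
    pair_count = Counter()
    for changed in transactions:
        items = set(changed)
        item_count.update(items)
        # Count each ordered pair (a, b) once per transaction.
        pair_count.update(permutations(sorted(items), 2))

    rules = []
    for (a, b), support in pair_count.items():
        confidence = support / item_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    # Strongest rules first; ties broken by name for a stable order.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


def suggest(rules, changed_item, top_n=3):
    """Suggest up to top_n further items to change after changed_item."""
    return [b for a, b, _, _ in rules if a == changed_item][:top_n]


# Hypothetical commit history: each set is one commit's changed files.
history = [
    {"parser.c", "parser.h"},
    {"parser.c", "parser.h", "lexer.c"},
    {"parser.c", "docs/grammar.txt"},
    {"lexer.c", "lexer.h"},
]
rules = mine_cochange_rules(history)
print(suggest(rules, "parser.c"))
```

With this toy history, only the pair parser.c / parser.h reaches the support threshold, so a change to parser.c yields parser.h as the single suggestion. Note that such a rule can link a source file to a non-code file (for example documentation), which is the kind of coupling program analysis would not see.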
Does Code Decay? Assessing the Evidence from Change Management Data
- IEEE Transactions on Software Engineering, 1998
Cited by 124 (8 self)
A central feature of the evolution of large software systems is that change -- which is necessary to add new functionality, accommodate new hardware and repair faults -- becomes increasingly difficult over time. In this paper we approach this phenomenon, which we term code decay, scientifically and statistically. We define code decay, and propose a number of measurements (code decay indices) on software, and on the organizations that produce it, that serve as symptoms, risk factors and predictors of decay. Using an unusually rich data set (the fifteen-plus year change history of the millions of lines of software for a telephone switching system), we find mixed but on the whole persuasive statistical evidence of code decay, which is corroborated by developers of the code. Suggestive indications that perfective maintenance can retard code decay are also discussed.
The Daikon system for dynamic detection of likely invariants
- 2006
Cited by 89 (8 self)
Daikon is an implementation of dynamic detection of likely invariants; that is, the Daikon invariant detector reports likely program invariants. An invariant is a property that holds at a certain point or points in a program; these are often used in assert statements, documentation, and formal specifications. Examples include being constant (x = a), non-zero (x ≠ 0), being in a
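The general idea of dynamic invariant detection can be sketched as follows: propose a set of candidate properties, then keep those that no observed execution falsifies. This is only an illustration of the principle, not Daikon's implementation; the function name, the candidate templates (constant, non-zero, pairwise ordering), and the trace are assumptions chosen for the example.

```python
def detect_likely_invariants(samples):
    """Return descriptions of candidate invariants that hold on every sample.

    samples: list of dicts mapping variable name -> observed integer value,
    all recorded at the same program point (hypothetical trace format).
    """
    if not samples:
        return []
    names = sorted(samples[0])
    candidates = {}
    for x in names:
        first = samples[0][x]
        # Candidate: the variable is constant.
        candidates[f"{x} == {first}"] = lambda s, x=x, first=first: s[x] == first
        # Candidate: the variable is never zero.
        candidates[f"{x} != 0"] = lambda s, x=x: s[x] != 0
    for x in names:
        for y in names:
            if x != y:
                # Candidate: an ordering between two variables.
                candidates[f"{x} <= {y}"] = lambda s, x=x, y=y: s[x] <= s[y]
    # Keep only the candidates that no sample falsifies.
    return [desc for desc, holds in candidates.items()
            if all(holds(s) for s in samples)]


# Hypothetical values observed at a loop head over three iterations.
trace = [
    {"i": 0, "n": 5},
    {"i": 1, "n": 5},
    {"i": 4, "n": 5},
]
print(detect_likely_invariants(trace))
```

The reported properties are "likely" rather than proven: they hold on the observed runs only, and a further execution could still falsify them.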
Where the Bugs Are
- Proceedings of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), 2004
Cited by 58 (0 self)
The ability to predict which files in a large software system are most likely to contain the largest numbers of faults in the next release can be a very valuable asset. To accomplish this, a negative binomial regression model using information from previous releases has been developed and used to predict the numbers of faults for a large industrial inventory system. The files of each release were sorted in descending order based on the predicted number of faults and then the first 20% of the files were selected. This was done for each of fifteen consecutive releases, representing more than four years of field usage. The predictions were extremely accurate, correctly selecting files that contained between 71% and 92% of the faults, with the overall average being 83%. In addition, the same model was used on data for the same system’s releases, but with all fault data prior to integration testing removed. The prediction was again very accurate, ranging from 71% to 93%, with the average being 84%. Predictions were made for a second system, and again the first 20% of files accounted for 83% of the identified faults. Finally, a highly simplified predictor was considered which correctly predicted 73% and 74% of the faults for the two systems.
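The evaluation step described above (sort files by predicted fault count, take the first 20%, and measure what share of the actual faults they contain) can be sketched as follows. The regression model itself is not reproduced; the predicted values, file names, and fault counts are made-up inputs, and rounding the 20% cutoff up to a whole file is an assumption.

```python
import math


def share_of_faults_in_top_files(predicted, actual, fraction=0.20):
    """Select the top `fraction` of files by predicted faults.

    predicted: dict file -> predicted number of faults (from any model).
    actual:    dict file -> faults actually found in the release.
    Returns (selected_files, share_of_actual_faults_in_selection).
    """
    # Descending by prediction; ties broken by file name for determinism.
    ranked = sorted(predicted, key=lambda f: (-predicted[f], f))
    # Assumption: round the cutoff up, and always select at least one file.
    k = max(1, math.ceil(fraction * len(ranked)))
    selected = ranked[:k]
    total = sum(actual.values())
    captured = sum(actual.get(f, 0) for f in selected)
    return selected, (captured / total if total else 0.0)


# Hypothetical predictions and observed faults for ten files.
predicted = {"a.c": 9.1, "b.c": 6.4, "c.c": 1.2, "d.c": 0.8, "e.c": 0.5,
             "f.c": 0.4, "g.c": 0.3, "h.c": 0.2, "i.c": 0.1, "j.c": 0.0}
actual = {"a.c": 10, "b.c": 5, "c.c": 3, "d.c": 1, "e.c": 1,
          "f.c": 0, "g.c": 0, "h.c": 0, "i.c": 0, "j.c": 0}

selected, share = share_of_faults_in_top_files(predicted, actual)
print(selected, f"{share:.0%}")
```

In this toy data the two top-ranked files hold 15 of the 20 faults, so the selected 20% of files accounts for 75% of the faults.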
Predicting risk of software changes
- Bell Labs Technical Journal, 2000
Cited by 54 (14 self)
Reducing the number of software failures is one of the most challenging problems of software production. We assume that software development proceeds as a series of changes and model the probability that a change to software will cause a failure. We use predictors based on the properties of a change itself. Such predictors include size in lines of code added, deleted, and unmodified; diffusion of the change and its component subchanges, as reflected in the number of files, modules, and subsystems touched, or changed; several measures of developer experience; and the type of change and its subchanges (fault fixes or new code). The model is built on historic information and is used to predict the risk of new changes. In this paper we apply the model to 5ESS® software updates and find that change diffusion and developer experience are essential to predicting failures. The predictive model is implemented as a Web-based tool to allow timely prediction of change quality. The ability to predict the quality of change enables us to make appropriate decisions regarding inspection, testing, and delivery. Historic information on software changes is recorded in many commercial software projects, suggesting that our results can be easily and widely applied in practice.
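To make the notion of a change-level risk model concrete, here is a minimal sketch that maps a few of the predictor types named above (size, diffusion, developer experience, change type) to a failure probability. The abstract does not state the model form, so the logistic form is an assumption, and every coefficient below is a made-up placeholder rather than a fitted value.

```python
import math

# Hypothetical coefficients for illustration only; not fitted to any data.
COEFFICIENTS = {
    "intercept": -3.0,
    "log_lines_added": 0.4,          # larger changes: assumed riskier
    "files_touched": 0.3,            # more diffusion: assumed riskier
    "is_fault_fix": 0.5,             # change type indicator
    "log_developer_changes": -0.35,  # more experience: assumed less risky
}


def change_failure_probability(lines_added, files_touched, is_fault_fix,
                               developer_changes):
    """Return an illustrative probability that a change causes a failure."""
    c = COEFFICIENTS
    z = (c["intercept"]
         + c["log_lines_added"] * math.log1p(lines_added)
         + c["files_touched"] * files_touched
         + c["is_fault_fix"] * (1 if is_fault_fix else 0)
         + c["log_developer_changes"] * math.log1p(developer_changes))
    # Logistic link maps the linear score to the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


small = change_failure_probability(lines_added=10, files_touched=1,
                                   is_fault_fix=False, developer_changes=500)
diffuse = change_failure_probability(lines_added=400, files_touched=12,
                                     is_fault_fix=True, developer_changes=5)
print(f"small change: {small:.3f}, diffuse change: {diffuse:.3f}")
```

In practice the coefficients would be estimated from historic change data, and the resulting probability could be used to decide which changes receive extra inspection or testing.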
Static verification of dynamically detected program invariants: Integrating Daikon and ESC/Java
- 2001
Cited by 51 (3 self)
This paper shows how to integrate two complementary techniques for manipulating program invariants: dynamic detection and static verification. Dynamic detection proposes likely invariants based on program executions, but the resulting properties are not guaranteed to be true over all possible executions. Static verification checks that properties are always true, but it can be difficult and tedious to select a goal and to annotate programs for input to a static checker. Combining these techniques overcomes the weaknesses of each: dynamically detected invariants can annotate a program or provide goals for static verification, and static verification can confirm properties proposed by a dynamic tool. We have
Exploring Software Evolution Using Spectrographs
- 2004
Cited by 32 (2 self)
Software systems become progressively more complex and difficult to maintain. To facilitate maintenance tasks, project managers and developers often turn to the evolution history of the system to recover various kinds of useful information, such as anomalous phenomena and lost design decisions. An informative visualization of the evolution history can help cope with this complexity by highlighting conspicuous evolution events using strong visual cues. In this paper, we present a scalable visualization technique called evolution spectrographs (ESG). An evolution spectrograph portrays the evolution of a spectrum of components based on a particular property measurement. We describe several special-purpose spectrographs and discuss their use in understanding and supporting software evolution through the case studies of three large software systems (OpenSSH, KOffice and FreeBSD).
A Survey on Software Clone Detection Research
- School of Computing TR 2007-541, Queen’s University, 2007
Cited by 32 (7 self)
Code duplication, that is, copying a code fragment and reusing it by pasting with or without modifications, is a well-known code smell in software maintenance. Several studies show that about 5% to 20% of a software system can consist of duplicated code, which is basically the result of copying existing code fragments and using them by pasting with or without minor modifications. One of the major shortcomings of such duplicated fragments is that if a bug is detected in a code fragment, all the other fragments similar to it should be investigated to check for the same bug. Refactoring of the duplicated code is another prime issue in software maintenance, although several studies claim that refactoring of certain clones is not desirable and that there is a risk in removing them. However, it is also widely agreed that clones should at least be detected. In this paper, we survey the state of the art in clone detection research. First, we describe the clone terms commonly used in the literature along with their corresponding mappings to the commonly used clone types. Second, we provide a review of the existing
An empirical study of software reuse vs. defect-density and stability
- In ICSE ’04: Proceedings of the 26th International Conference on Software Engineering, 2004
Cited by 28 (9 self)
The paper describes results of an empirical study in which hypotheses about the impact of reuse on defect-density and stability, and about the impact of component size on defects and defect-density in the context of reuse, are assessed using historical data (“data mining”) on defects, modification rate, and software size of a large-scale telecom system developed by Ericsson. The analysis showed that reused components have lower defect-density than non-reused ones. Reused components have more defects with highest severity than the total distribution, but fewer defects after delivery, which shows that these are given higher priority to fix. The number of defects increases with component size for non-reused components, but not for reused components. Reused components were less modified (more stable) than non-reused ones between successive releases, even if reused components must incorporate evolving requirements from several application products. The study furthermore revealed inconsistencies and weaknesses in the existing defect reporting system, by analyzing data that was hardly treated systematically before.

