The Effects of Over and Under Sampling on Fault-prone Module Detection
- First International Symposium on Empirical Software Engineering and Measurement, 2007
Abstract - Cited by 2 (1 self)
The goal of this paper is to improve the prediction performance of fault-prone module prediction models (fault-proneness models) by employing over- and under-sampling methods, which are preprocessing procedures for a fit dataset. The sampling methods are expected to improve prediction performance when the fit dataset is imbalanced, i.e., when there is a large difference between the number of fault-prone modules and not-fault-prone modules. So far, there has been no research reporting the effects of applying sampling methods to fault-proneness models. In this paper, we experimentally evaluated the effects of four sampling methods (random over sampling, synthetic minority over sampling, random under sampling and one-sided selection) applied to four fault-proneness models (linear discriminant analysis, logistic regression analysis, neural network and classification tree) by using two module sets of industry legacy software. All four sampling methods improved the prediction performance of the linear and logistic models, while the neural network and classification tree models did not benefit from the sampling methods. The improvements in F1-values for the linear and logistic models were 0.078 at minimum, 0.224 at maximum and 0.121 on average.
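The two simplest sampling methods named in this abstract, random over sampling and random under sampling, together with the F1-value used to report the improvement, can be sketched as follows. This is a minimal illustration in Python and not the authors' implementation: the function names, the fixed seed, and the toy data are assumptions, and the `minority` label is assumed to be the genuinely smaller class.

```python
import random
from collections import Counter


def _split_indices(labels, minority):
    """Return (minority indices, majority indices) for a label list."""
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    return minority_idx, majority_idx


def random_over_sample(rows, labels, minority, seed=0):
    """Duplicate randomly chosen minority rows until the classes are balanced."""
    rng = random.Random(seed)
    minority_idx, majority_idx = _split_indices(labels, minority)
    shortfall = len(majority_idx) - len(minority_idx)
    extra = [rng.choice(minority_idx) for _ in range(shortfall)]
    keep = majority_idx + minority_idx + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_under_sample(rows, labels, minority, seed=0):
    """Discard randomly chosen majority rows until the classes are balanced."""
    rng = random.Random(seed)
    minority_idx, majority_idx = _split_indices(labels, minority)
    kept_majority = rng.sample(majority_idx, len(minority_idx))
    keep = kept_majority + minority_idx
    return [rows[i] for i in keep], [labels[i] for i in keep]


def f1_score(actual, predicted, positive):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy fit dataset (made up): 2 fault-prone modules (label 1) out of 10.
rows = list(range(10))
labels = [1, 1] + [0] * 8
_, over_labels = random_over_sample(rows, labels, minority=1)
_, under_labels = random_under_sample(rows, labels, minority=1)
print(Counter(over_labels), Counter(under_labels))
```

Sampling is applied only to the fit dataset; the F1-value would then be computed on a separate, unsampled test set.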
Characterizing the Differences Between Pre- and Post-Release Versions of Software
Abstract - Cited by 1 (0 self)
Many software producers utilize beta programs to predict post-release quality and to ensure that their products meet the quality expectations of users. Prior work indicates that software producers need to adjust predictions to account for differences in usage environments and usage scenarios between beta populations and post-release populations. However, little is known about how usage characteristics relate to field quality and how usage characteristics differ between beta and post-release. In this study, we examine application crash, application hang, system crash, and usage information from millions of Windows® users to 1) examine the effects of usage characteristics differences on field quality (e.g., which usage characteristics impact quality), 2) examine usage characteristics differences between beta and post-release (e.g., do impactful usage characteristics differ), and 3) report experiences adjusting field quality predictions for Windows. Among the 18 usage characteristics that we examined, the five most important were: the number of applications executed, whether the machine was pre-installed by the original equipment manufacturer, two sub-populations (two language/geographic locales), and whether Windows was 64-bit (not 32-bit). We found each of these usage characteristics to differ between beta and post-release, and by adjusting for the differences, the accuracy of field quality predictions for Windows improved by ~59%.
Balancing time-to-market and quality in embedded systems
Abstract
Finding a balance between the time-to-market and quality of a delivered product is a daunting task. The optimal release moment is not easily found. We propose to use historical project data to monitor the progress of running projects. From the data we inferred a formula providing a rough indication of the number of defects given the effort spent thus far (effort-to-defect formula). Furthermore, we provide a worst-case bound on the allowed number of residual defects at the end of a project in order to achieve the required level of quality. For this purpose we slightly modified a well-known reliability growth model by Bishop and Bloomfield. It turned out that the software in Philips' MRI scanners has a defect rate of 1 per 1,175 device-years of observation. This coincides with the second highest safety integrity level (SIL3) as defined in the IEC 61508 standard. The highest level (SIL4) is only attainable by applying redundancy. Finally, we combine the effort-to-defect formula with the reliability growth model to monitor the progress of a project and to determine when the required level of quality will be reached. We show that a common
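The step from "1 defect per 1,175 device-years" to SIL3 can be checked with a short calculation. The sketch below is not taken from the paper: it assumes continuous operation (8,760 hours per year), treats every defect as a dangerous failure, and uses the IEC 61508 target failure rates per hour for high-demand or continuous mode. The function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: devices run continuously

# IEC 61508 bands for high-demand/continuous mode, as
# (lower bound, upper bound, SIL) in dangerous failures per hour.
SIL_BANDS = [
    (1e-9, 1e-8, 4),
    (1e-8, 1e-7, 3),
    (1e-7, 1e-6, 2),
    (1e-6, 1e-5, 1),
]


def failures_per_hour(device_years_per_defect):
    """Convert 'one defect per N device-years' into a rate per hour."""
    return 1.0 / (device_years_per_defect * HOURS_PER_YEAR)


def sil_for_rate(rate_per_hour):
    """Return the SIL whose band contains the rate, or None if outside all bands."""
    for lower, upper, level in SIL_BANDS:
        if lower <= rate_per_hour < upper:
            return level
    return None


rate = failures_per_hour(1175)
print(f"{rate:.2e} failures per hour -> SIL{sil_for_rate(rate)}")
# prints "9.72e-08 failures per hour -> SIL3"
```

Under these assumptions the rate sits just below the 1e-7 upper limit of the SIL3 band, which agrees with the classification stated in the abstract.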
Information Systems III
Abstract
While the financial consequences of software errors on the developer's side have been explored extensively, the cost arising for the end user has been largely neglected. One reason is the difficulty of linking errors in the code with the emerging failure behavior of the software. The problem becomes even more difficult when trying to predict failure probabilities based on models or code metrics. In this paper we take a first step towards a cost prediction model by exploring the possibilities of modeling the financial consequences of already identified software failures. Firefox, a well-known open source application, is used as a test subject. Historically identified failures are modeled using fault trees. To identify expenses, usage profiles are employed to depict the interaction with the system. The presented approach demonstrates that failure cost can be modeled for an organization using a specific piece of software by establishing a relationship between user behavior, software failures, and cost. As future work, an extension with software error prediction techniques and an empirical validation of the model are planned.
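One way to connect a fault tree, a usage profile, and cost is sketched below. It is an illustrative simplification and not the model from the paper: basic events are assumed independent, the gate functions and `expected_failure_cost` are hypothetical helpers, and all probabilities, usage counts, and costs are made-up numbers.

```python
from math import prod


def or_gate(probabilities):
    """Output event occurs if at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def and_gate(probabilities):
    """Output event occurs only if all independent input events occur."""
    return prod(probabilities)


def expected_failure_cost(p_failure_per_use, uses_per_year, cost_per_failure):
    """Expected yearly cost of a failure, given how often the feature is used."""
    return p_failure_per_use * uses_per_year * cost_per_failure


# Hypothetical fault tree: the top-level failure occurs if a rendering bug
# triggers, OR if a plugin fault AND a low-memory condition coincide.
p_top = or_gate([0.001, and_gate([0.01, 0.05])])

# Hypothetical usage profile: 2,000 page loads per user per year,
# with each failure costing 5.0 currency units of lost working time.
yearly_cost = expected_failure_cost(p_top, uses_per_year=2000, cost_per_failure=5.0)
print(f"P(top event) = {p_top:.7f}, expected yearly cost = {yearly_cost:.3f}")
```

The usage profile enters only through `uses_per_year`; a richer profile would assign a separate frequency to each user action that can reach the top event.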
Defect Prediction using Combined Product and Project Metrics: A Case Study from the Open Source “Apache” MyFaces Project Family
Abstract
The quality evaluation of open source software (OSS) products, e.g., defect estimation and prediction approaches of individual releases, gains importance with increasing OSS adoption in industry applications. Most empirical studies on the accuracy of defect prediction and software maintenance focus on product metrics as predictors that are available only when the product is finished. Only a few prediction models consider information on the development process (project metrics) that seems relevant to quality improvement of the software product. In this paper, we investigate defect prediction with data from a family of widely used OSS projects based both on product and project metrics as well as on combinations of these metrics. The main results of the data analysis are (a) a set of project metrics prior to product release that had strong correlation to potential defect growth between releases and (b) a combination of product and project metrics enables a more accurate defect prediction than the application of one single type of measurement. Thus, the combined application of project and product metrics can (a) improve the accuracy of defect prediction, (b) enable better guidance of the release process from a project management point of view, and (c) help identify areas for product and process improvement.
A Catalog of Techniques that Predict Information about the Count or Rate of Field Defects
- 2006
Abstract
assurance, reliability, risk management, planning, software reliability growth models,

