Results 1 - 10
of
13
Mining exception-handling rules as sequence association rules
- In Proc. ICSE
, 2009
"... Programming languages such as Java and C++ provide exception-handling constructs to handle exception conditions. Applications are expected to handle these exception conditions and take necessary recovery actions such as releasing opened database connections. However, exceptionhandling rules that des ..."
Abstract
-
Cited by 14 (6 self)
- Add to MetaCart
Programming languages such as Java and C++ provide exception-handling constructs to handle exception conditions. Applications are expected to handle these exception conditions and take necessary recovery actions such as releasing opened database connections. However, exceptionhandling rules that describe these necessary recovery actions are often not available in practice. To address this issue, we develop a novel approach that mines exceptionhandling rules as sequence association rules of the form “(FC 1 c...FC n c) ∧ FCa ⇒ (FC 1 e...FC m e)”. This rule describes that function call FCa should be followed by a sequence of function calls (FC 1 e...FC m e) when FCa is preceded by a sequence of function calls (FC 1 c...FC n c). Such form of rules is required to characterize common exceptionhandling rules. We show the usefulness of these mined rules by applying them on five real-world applications (including 285 KLOC) to detect violations in our evaluation. Our empirical results show that our approach mines 294 real exception-handling rules in these five applications and also detects 160 defects, where 87 defects are new defects that are not found by a previous related approach. 1
MAPO: Mining and recommending API usage patterns
- In European Conference on Object-Oriented Programming (ECOOP
, 2009
"... Abstract. To improve software productivity, when constructing new software systems, programmers often reuse existing libraries or frameworks by invoking methods provided in their APIs. Those API methods, however, are often complex and not well documented. To get familiar with how those API methods a ..."
Abstract
-
Cited by 12 (2 self)
- Add to MetaCart
Abstract. To improve software productivity, when constructing new software systems, programmers often reuse existing libraries or frameworks by invoking methods provided in their APIs. Those API methods, however, are often complex and not well documented. To get familiar with how those API methods are used, programmers often exploit a source code search tool to search for code snippets that use the API methods of interest. However, the returned code snippets are often large in number, and the huge number of snippets places a barrier for programmers to locate useful ones. In order to help programmers overcome this barrier, we have developed an API usage mining framework and its supporting tool called MAPO (Mining API usage Pattern from Open source repositories) for mining API usage patterns automatically. A mined pattern describes that in a certain usage scenario, some API methods are frequently called together and their usages follow some sequential rules. MAPO further recommends the mined API usage patterns and their associated code snippets upon programmers ’ requests. Our experimental results show that with these patterns MAPO helps programmers locate useful code snippets more effectively than two state-of-the-art code search tools. To investigate whether MAPO can assist programmers in programming tasks, we further conducted an empirical study. The results show that using MAPO, programmers produce code with fewer bugs when facing relatively complex API usages, comparing with using the two state-of-the-art code search tools. 1
Penumbra: automatically identifying failure-relevant inputs using dynamic tainting
- In Proc. of Symposium on Software Testing and Analysis (2009), ACM
"... Most existing automated debugging techniques focus on reducing the amount of code to be inspected and tend to ignore an important component of software failures: the inputs that cause the failure to manifest. In this paper, we present a new technique based on dynamic tainting for automatically ident ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
Most existing automated debugging techniques focus on reducing the amount of code to be inspected and tend to ignore an important component of software failures: the inputs that cause the failure to manifest. In this paper, we present a new technique based on dynamic tainting for automatically identifying subsets of a program’s inputs that are relevant to a failure. The technique (1) marks program inputs when they enter the application, (2) tracks them as they propagate during execution, and (3) identifies, for an observed failure, the subset of inputs that are potentially relevant for debugging that failure. To investigate feasibility and usefulness of our technique, we created a prototype tool, penumbra, and used it to evaluate our technique on several failures in real programs. Our results are promising, as they show that penumbra can point developers to inputs that are actually relevant for investigating a failure and can be more practical than existing alternative approaches.
Alattin: Mining Alternative Patterns for Detecting Neglected Conditions
"... Abstract—To improve software quality, static or dynamic verification tools accept programming rules as input and detect their violations in software as defects. As these programming rules are often not well documented in practice, previous work developed various approaches that mine programming rule ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
Abstract—To improve software quality, static or dynamic verification tools accept programming rules as input and detect their violations in software as defects. As these programming rules are often not well documented in practice, previous work developed various approaches that mine programming rules as frequent patterns from program source code. Then these approaches use static defect-detection techniques to detect pattern violations in source code under analysis. These existing approaches often produce many false positives due to various factors. To reduce false positives produced by these mining approaches, we develop a novel approach, called Alattin, that includes a new mining algorithm and a technique for detecting neglected conditions based on our mining algorithm. Our new mining algorithm mines alternative patterns in example form “P1 or P2”, where P1 and P2 are alternative rules such as condition checks on method arguments or return values related to the same API method. We conduct two evaluations to show the effectiveness of our Alattin approach. Our evaluation results show that (1) alternative patterns reach more than 40% of all mined patterns for APIs provided by six open source libraries; (2) the mining of alternative patterns helps reduce nearly 28 % of false positives among detected violations. Keywords-code search; frequent itemset mining; alternative patterns; I.
Mining API Error-Handling Specifications from Source Code
"... Abstract. API error-handling specifications are often not documented, necessitating automated specification mining. Automated mining of error-handling specifications is challenging for procedural languages such as C, which lack explicit exception-handling mechanisms. Due to the lack of explicit exce ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
Abstract. API error-handling specifications are often not documented, necessitating automated specification mining. Automated mining of error-handling specifications is challenging for procedural languages such as C, which lack explicit exception-handling mechanisms. Due to the lack of explicit exception handling, error-handling code is often scattered across different procedures and files making it difficult to mine error-handling specifications through manual inspection of source code. In this paper, we present a novel framework for mining API errorhandling specifications automatically from API client code, without any user input. In our framework, we adapt a trace generation technique to distinguish and generate static traces representing different API run-time behaviors. We apply data mining techniques on the static traces to mine specifications that define correct handling of API errors. We then use the mined specifications to detect API error-handling violations. Our framework mines 62 error-handling specifications and detects 264 real error-handling defects from the analyzed open source packages. 1 1
Improving Software Quality via Code Searching and Mining
"... Enormous amount of open source code is available on the Internet and various code search engines (CSE) are available to serve as a means for searching in open source code. However, usage of CSEs is often limited to simple tasks such as searching for relevant code examples. In this paper, we present ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Enormous amount of open source code is available on the Internet and various code search engines (CSE) are available to serve as a means for searching in open source code. However, usage of CSEs is often limited to simple tasks such as searching for relevant code examples. In this paper, we present a generic life-cycle model that can be used to improve software quality by exploiting CSEs. We present three example software development tasks that can be assisted by our life-cycle model and show how these three tasks can contribute to improve the software quality. We also show the application of our life-cycle model with a preliminary evaluation. 1.
Performance Debugging in the Large via Mining Millions of Stack Traces
"... Abstract—Given limited resource and time before software release, development-site testing and debugging become more and more insufficient to ensure satisfactory software performance. As a counterpart for debugging in the large pioneered by the Microsoft Windows Error Reporting (WER) system focusing ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract—Given limited resource and time before software release, development-site testing and debugging become more and more insufficient to ensure satisfactory software performance. As a counterpart for debugging in the large pioneered by the Microsoft Windows Error Reporting (WER) system focusing on crashing/hanging bugs, performance debugging in the large has emerged thanks to available infrastructure support to collect execution traces with performance issues from a huge number of users at the deployment sites. However, performance debugging against these numerous and complex traces remains a significant challenge for performance analysts. In this paper, to enable performance debugging in the large in practice, we propose a novel approach, called StackMine, that mines callstack traces to help performance analysts effectively discover highly impactful performance bugs (e.g., bugs impacting many users with long response delay). As a successful technology-transfer effort, since December 2010, StackMine has been applied in performance-debugging activities at a Microsoft team for performance analysis, especially for a large number of execution traces. Based on real-adoption experiences of StackMine in practice, we conducted an evaluation of StackMine on performance debugging in the large for Microsoft Windows 7. We also conducted another evaluation on a third-party application. The results highlight substantial benefits offered by StackMine in performance debugging in the large for large-scale software systems. I.
Mining API Specifications from Source Code for Improving Software Reliability
"... A software system interacts with third-party libraries through various Application Program Interfaces (APIs). Using these APIs correctly often needs to follow certain programming rules, i.e., API specifications. API specifications specify the required checks (on API input parameters and return value ..."
Abstract
- Add to MetaCart
A software system interacts with third-party libraries through various Application Program Interfaces (APIs). Using these APIs correctly often needs to follow certain programming rules, i.e., API specifications. API specifications specify the required checks (on API input parameters and return values) and other APIs to be invoked before (preconditions) and after (postconditions) an API call. Incorrect usage of APIs (in short, API violations) can lead to security and robustness problems, two primary hindrances for the reliable operation of a software system. Hence, for a software system, adherence to the specifications, which govern the correct usage of APIs used by the system, is paramount for software reliability. Specifications, when known, can be formally written for third-party APIs and statically verified against a software system. This dissertation addresses two main problems faced by programmers in effectively and correctly reusing third-party APIs. (1) Formal API specifications are complicated and lengthy mainly due to the various API details (such as input/return type, error-flag codes, and return values for APIs on success/failure) and language-specific syntax considerations required for the specification to be accurate and complete. Hence, manually writing a large number of formal API specifications, when known, for static verification is often inaccurate or incomplete,
NEGWeb: Detecting Neglected Conditions via Mining Programming Rules from Open Source Code
"... Neglected conditions, also referred as missing paths, are known to be an important class of software defects. Revealing neglected conditions around individual API calls in an application requires the knowledge of programming rules that must be obeyed while reusing those APIs. To mine those implicit ..."
Abstract
- Add to MetaCart
Neglected conditions, also referred as missing paths, are known to be an important class of software defects. Revealing neglected conditions around individual API calls in an application requires the knowledge of programming rules that must be obeyed while reusing those APIs. To mine those implicit programming rules and hence to detect neglected conditions, we develop a novel framework, called NEGWeb, that substantially expands mining scope to billions of lines of open source code available on the web by leveraging a code search engine. We evaluated NEGWeb to detect violations of mined rules in local code bases or open source code bases. In our evaluation, we show that NEGWeb finds three real defects in Java code reported in the literature and also finds three previously unknown defects in a large-scale open source project called Columba (91, 508 lines of Java code) that reuses 541 classes and 2225 methods. We also report a high percentage of real rules among the top 25 reported patterns mined for APIs provided by five popular open source applications.
ETH Zurich
"... In statically-typed programming languages, the compiler ensures that method arguments are passed in the expected order by checking the type of each argument. However, calls to methods with multiple equally-typed parameters slip through this check. The uncertainty about the correct argument order of ..."
Abstract
- Add to MetaCart
In statically-typed programming languages, the compiler ensures that method arguments are passed in the expected order by checking the type of each argument. However, calls to methods with multiple equally-typed parameters slip through this check. The uncertainty about the correct argument order of equally-typed arguments can cause various problems, for example, if a programmer accidentally reverses two arguments. We present an automated, static program analysis that detects such problems without any input except for the source code of a program. The analysis leverages the observation that programmer-given identifier names convey information about the semantics of arguments, which can be used to assign equally-typed arguments to their expected position. We evaluate the approach with a large corpus of Java programs and show that our analysis finds relevant anomalies with a precision of 76%.

