Estimating the Numbers of End Users and End User Programmers
- In IEEE Symp. on Visual Languages and Human-Centric Computing, 2005
Cited by 57 (17 self)
In 1995, Boehm predicted that by 2005, there would be “55 million performers” of “end user programming” in the United States. The original context and method which generated this number had two weaknesses, both of which we address. First, it relies on undocumented, judgment-based factors to estimate the number of end user programmers based on the total number of end users; we address this weakness by identifying specific end user subpopulations and then estimating their sizes. Second, Boehm's estimate relies on additional undocumented, judgment-based factors to adjust for rising computer usage rates; we address this weakness by integrating fresh Bureau of Labor Statistics (BLS) data and projections as well as a richer estimation method. With these improvements to Boehm’s method, we estimate that in 2012 there will be 90 million end users in American workplaces. Of these, we anticipate that over 55 million will use spreadsheets or databases (and therefore may potentially program), while over 13 million will describe themselves as programmers, compared to BLS projections of fewer than 3 million professional programmers. We have validated our improved method by generating estimates for 2001 and 2003, then verifying that our estimates are consistent with existing estimates from other sources.
GoalDebug: A Spreadsheet Debugger for End Users
- In 29th IEEE Int. Conf. on Software Engineering, 2007
Cited by 16 (11 self)
We present a spreadsheet debugger targeted at end users. Whenever the computed output of a cell is incorrect, the user can supply an expected value for a cell, which is employed by the system to generate a list of change suggestions for formulas that, when applied, would result in the user-specified output. The change suggestions are ranked using a set of heuristics. In previous work, we had presented the system as a proof of concept. In this paper, we describe a systematic evaluation of the effectiveness of inferred change suggestions and the employed ranking heuristics. Based on the results of the evaluation, we have extended both the change inference process and the ranking of suggestions. An evaluation of the improved system shows that the change inference process and the ranking heuristics have both been substantially improved and that the system performs effectively.
Unsupervised Inference of Data Formats in Human-Readable Notation
- Proceedings of 9th International Conference on Enterprise Integration Systems (ICEIS'07), 2007
Cited by 14 (10 self)
One common approach to validating data such as email addresses and phone numbers is to check whether values conform to some desired data format. Unfortunately, users may need to learn a specialized notation such as regular expressions to specify the format, and even after learning the notation, specifying formats may take substantial time. To address these problems, this paper introduces Topei, a system that infers a format from an unlabeled collection of examples (which may contain errors). The generated format is presented as understandable English, so users can review and customize the format. In addition, the format can be used to automatically check data against the format and find outliers that do not match. Topei shows substantially higher precision and recall than an alternate algorithm (Lapis) on test data. Topei’s usefulness is demonstrated by integrating it with spreadsheet, database, and web services systems.
Topes: Reusable Abstractions for Validating Data
- Proc. 30th Intl. Conf. Software Engineering
Cited by 13 (7 self)
Programmers often omit input validation when inputs can appear in many different formats or when validation criteria cannot be precisely specified. To enable validation in these situations, we present a new technique that puts valid inputs into a consistent format and that identifies “questionable” inputs which might be valid or invalid, so that these values can be double-checked by a person or a program. Our technique relies on the concept of a “tope”, which is an application-independent abstraction describing how to recognize and transform values in a category of data. We present our definition of topes and describe a development environment that supports the implementation and use of topes. Experiments with web application and spreadsheet data indicate that using our technique improves the accuracy and reusability of validation code and also improves the effectiveness of subsequent data cleaning such as duplicate identification.
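The idea of recognizing a value, flagging it as questionable, and transforming it into a consistent format can be illustrated with a small Python sketch. This is not the authors' implementation: the phone-number formats, the confidence values, and the names `PHONE_FORMATS`, `recognize`, and `classify` are all assumptions chosen for illustration.

```python
import re

# Hypothetical formats for one data category (US phone numbers).
# Each entry pairs a pattern with an assumed confidence:
# 1.0 means clearly valid, 0.5 means questionable.
PHONE_FORMATS = [
    (re.compile(r"\((\d{3})\) (\d{3})-(\d{4})"), 1.0),
    (re.compile(r"(\d{3})-(\d{3})-(\d{4})"), 1.0),
    (re.compile(r"(\d{3})\.(\d{3})\.(\d{4})"), 1.0),
    (re.compile(r"(\d{3})(\d{3})(\d{4})"), 0.5),  # bare digits: plausible but ambiguous
]


def recognize(value):
    """Return (confidence, canonical form); confidence 0.0 means no format matched."""
    text = value.strip()
    for pattern, confidence in PHONE_FORMATS:
        match = pattern.fullmatch(text)
        if match:
            area, exchange, line = match.groups()
            # Transform every recognized format into one consistent format.
            return confidence, f"{area}-{exchange}-{line}"
    return 0.0, None


def classify(value):
    """Label a value as valid, questionable, or invalid, with its canonical form."""
    confidence, canonical = recognize(value)
    if confidence == 1.0:
        label = "valid"
    elif confidence > 0.0:
        label = "questionable"
    else:
        label = "invalid"
    return label, canonical
```

Under these assumptions, values labeled "questionable" would be routed to a person or a second check rather than rejected outright, and the canonical form could feed later cleaning steps such as duplicate identification.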
Identifying Categories of End Users Based on the Abstractions That They Create
, 2005
Cited by 4 (3 self)
Software created by end users often lacks key quality attributes that professional programmers try to ensure through the use of abstraction. Yet to date, large-scale studies of end users have not examined end user software usage at a level which is sufficiently fine-grained to determine the extent to which they create abstractions. To address this, we deployed an online survey to Information Week subscribers to ask about not only software usage but also feature usage related to abstraction creation. Most respondents did create abstractions. Moreover, through factor analysis, we found that features fell into three clusters--when users had a propensity to use one feature, then they also had a propensity to use other features in the same cluster. These clusters
Dimension Inference in Spreadsheets
Cited by 4 (4 self)
We present a reasoning system for inferring dimension information in spreadsheets. This system can be used to check the consistency of spreadsheet formulas and can be employed to detect errors in spreadsheets. We have prototypically implemented the system as an add-in to Excel. In an evaluation of this implementation we were able to detect dimension errors in almost 50% of the investigated spreadsheets, which shows (i) that the system works reliably in practice and (ii) that dimension information can be well exploited to uncover errors in spreadsheets.
Integrating automated test generation into the WYSIWYT spreadsheet testing methodology
- ACM Trans. Softw. Eng. Methodol., 2006
Cited by 3 (1 self)
Spreadsheet languages, which include commercial spreadsheets and various research systems, have had a substantial impact on end-user computing. Research shows, however, that spreadsheets often contain faults. Thus, in previous work, we presented a methodology that helps spreadsheet users test their spreadsheet formulas. Our empirical studies have shown that end users can use this methodology to test spreadsheets more adequately and efficiently; however, the process of generating test cases can still represent a significant impediment. To address this problem, we have been investigating how to incorporate automated test case generation into our testing methodology in ways that support incremental testing and provide immediate visual feedback. We have utilized two techniques for generating test cases, one involving random selection and one involving a goal-oriented approach. We describe these techniques and their integration into our testing environment, and report results of an experiment examining their effectiveness and efficiency.
AND ERWIG, M.: Automatic Detection of Dimension Errors in Spreadsheets
- Journal of Visual Languages and Computing
Cited by 3 (0 self)
We present a reasoning system for inferring dimension information in spreadsheets. This system can be used to check the consistency of spreadsheet formulas and thus is able to detect errors in spreadsheets. Our approach is based on three static analysis components. First, the spatial structure of the spreadsheet is analyzed to infer a labeling relationship among cells. Second, cells that are used as labels are lexically analyzed and mapped to potential dimensions. Finally, dimension information is propagated through spreadsheet formulas. An important aspect of the rule system defining dimension inference is that it works bi-directionally, that is, not only "downstream" from referenced arguments to the current cell, but also "upstream" in the reverse direction. This flexibility makes the system robust and turns out to be particularly useful in cases when the initial dimension information that can be inferred from headers is incomplete or ambiguous. We have implemented a prototype system as an add-in to Excel. In an evaluation of this implementation we were able to detect dimension errors in almost 50% of the investigated spreadsheets, which shows (i) that the system works reliably in practice and (ii) that dimension information can be well exploited to uncover errors in spreadsheets.
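The second and third components described above (mapping labels to dimensions, then propagating dimensions through formulas) can be sketched in Python. This is a simplified illustration and not the authors' rule system: it covers only the "downstream" direction, and the lookup table `LABEL_DIMENSIONS`, the tuple-based formula encoding, and the function names are assumptions.

```python
# Hypothetical lookup from header words to dimensions (base dimension -> exponent).
LABEL_DIMENSIONS = {
    "miles": {"length": 1},
    "feet": {"length": 1},
    "hours": {"time": 1},
}


class DimensionError(Exception):
    """Raised when a formula combines incompatible dimensions."""


def combine(left, right, sign):
    """Add (sign=+1) or subtract (sign=-1) exponents, dropping zero entries."""
    merged = dict(left)
    for unit, exponent in right.items():
        merged[unit] = merged.get(unit, 0) + sign * exponent
    return {unit: exp for unit, exp in merged.items() if exp != 0}


def infer(expr, cell_labels):
    """Infer the dimension of a formula given the header label of each cell.

    Formulas are nested tuples: ("ref", "A2") or (op, left, right),
    where op is one of "+", "-", "*", "/".
    """
    if expr[0] == "ref":
        label = cell_labels[expr[1]]
        return LABEL_DIMENSIONS[label.lower()]
    op, left, right = expr
    left_dim = infer(left, cell_labels)
    right_dim = infer(right, cell_labels)
    if op in ("+", "-"):
        # Addition and subtraction require both operands to share a dimension.
        if left_dim != right_dim:
            raise DimensionError(f"cannot apply {op} to {left_dim} and {right_dim}")
        return left_dim
    if op == "*":
        return combine(left_dim, right_dim, +1)
    if op == "/":
        return combine(left_dim, right_dim, -1)
    raise ValueError(f"unknown operator: {op}")
```

With cells labeled "Miles" and "Hours", dividing one by the other yields a length-per-time dimension, while adding them raises `DimensionError`, which is the kind of inconsistency such a checker would report.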
Interactive fault localization techniques in a spreadsheet environment
- IEEE Trans. Softw. Eng., 2006
Cited by 2 (1 self)
End-user programmers develop more software than any other group of programmers, using software authoring devices such as multimedia simulation builders, e-mail filtering editors, by-demonstration macro builders, and spreadsheet environments. Despite this, there has been only a little research on finding ways to help these programmers with the dependability of the software they create. We have been working to address this problem in several ways, one of which includes supporting end-user debugging activities through interactive fault localization techniques. This article investigates fault localization techniques in the spreadsheet domain, the most common type of end-user programming environment. We investigate a technique previously described in the research literature, and two new techniques. We present the results of an empirical study to examine the impact of two individual factors on the effectiveness of fault localization techniques. Our results reveal several insights into the contributions such techniques can make to the end-user debugging process, and highlight key issues of interest to researchers and practitioners who may design and evaluate future fault localization techniques.
Combining Spatial and Semantic Label Analysis
Cited by 2 (1 self)
Labels in spreadsheets can be exploited for finding errors in spreadsheet formulas. Previous approaches have used either the positional information of labels or their interpretation as dimensions for checking the consistency of formulas. In this paper we demonstrate how these two approaches can be combined. We have formalized a combined reasoning system and have implemented a corresponding prototype system. We have evaluated the system on the EUSES spreadsheet corpus. The evaluation has demonstrated that adding a syntactic, spatial analysis to a dimension inference can significantly improve the rate of detected errors.

