Results 1 - 10 of 47
Approaches to the Automatic Discovery of Patterns in Biosequences
1995. Cited by 125 (21 self)
This paper is a survey of approaches and algorithms used for the automatic discovery of patterns in biosequences. Patterns with the expressive power in the class of regular languages are considered, and a classification of pattern languages in this class is developed, covering those patterns which are the most frequently used in molecular bioinformatics. A formulation is given of the problem of the automatic discovery of such patterns from a set of sequences, and an analysis presented of the ways in which an assessment can be made of the significance and usefulness of the discovered patterns. It is shown that this problem is related to problems studied in the field of machine learning. The largest part of this paper comprises a review of a number of existing methods developed to solve this problem and how these relate to each other, focusing on the algorithms underlying the approaches. A comparison is given of the algorithms, and examples are given of patterns that have been discovered...
Version Space Algebra and its Application to Programming by Demonstration
In ICML, 2000. Cited by 48 (13 self)
Machine learning research has been very successful at producing powerful, broadly applicable classification learners. However, many practical learning problems do not fit the classification framework well, and as a result the initial phase of suitably formulating the problem and incorporating the relevant domain knowledge can be very difficult and time-consuming. Here we propose a framework to systematize and speed this process, based on the notion of version space algebra. We extend the notion of version spaces beyond concept learning, and propose that carefully-tailored version spaces for complex applications can be built by composing simpler, restricted version spaces. We illustrate our approach with SMARTedit, a programming by demonstration application for repetitive text-editing that uses version space algebra to guide a search over text-editing action sequences. We demonstrate the system on a suite of repetitive text-editing problems and present experimental re...
Incremental concept learning for bounded data mining
Information and Computation, 1999. Cited by 41 (29 self)
Important refinements of concept learning in the limit from positive data considerably restricting the accessibility of input data are studied. Let c be any concept; every infinite sequence of elements exhausting c is called a positive presentation of c. In all learning models considered the learning machine computes a sequence of hypotheses about the target concept from a positive presentation of it. With iterative learning, the learning machine, in making a conjecture, has access to its previous conjecture and the latest data item coming in. In k-bounded example-memory inference (k is a priori fixed) the learner is allowed to access, in making a conjecture, its previous hypothesis, its memory of up to k data items it has already seen, and the next element coming in. In the case of k-feedback identification, the learning machine, in making a conjecture, has access to its previous conjecture, the latest data item coming in, and, on the basis of this information, it can compute k items and query the database of previous data to find out, for each of the k items, whether or not it is in the database (k is again a priori fixed). In all cases, the sequence of conjectures has to converge to a hypothesis
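The iterative learning model described in this abstract can be illustrated with a toy learner. The sketch below is not taken from the paper: the concept class (sets of positive multiples of some unknown integer m) and the function names `iterative_update` and `run_learner` are assumptions chosen only because the update rule needs nothing beyond the previous conjecture and the latest data item.

```python
from math import gcd
from typing import Iterable, List


def iterative_update(previous: int, item: int) -> int:
    """Compute the next conjecture from the previous one and the newest item.

    Hypothetical toy class: the target concept is the set of positive
    multiples of an unknown m. The conjecture is a candidate value for m.
    The value 0 encodes "no conjecture yet", since gcd(0, x) == x.
    """
    return gcd(previous, item)


def run_learner(presentation: Iterable[int]) -> List[int]:
    """Feed a (finite prefix of a) positive presentation to the learner.

    Returns the sequence of conjectures. The learner keeps no memory of
    earlier data items, only its previous conjecture.
    """
    hypothesis = 0
    conjectures = []
    for item in presentation:
        hypothesis = iterative_update(hypothesis, item)
        conjectures.append(hypothesis)
    return conjectures


if __name__ == "__main__":
    # Prefix of a positive presentation of the multiples of 6.
    print(run_learner([12, 18, 30, 6, 24]))
```

On this prefix the conjectures stabilize once enough items have been seen to pin down the generator, which is the toy analogue of converging to a correct hypothesis in the limit.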
An Experimental Evaluation of Continuous Testing During Development
In ISSTA, 2004. Cited by 40 (5 self)
... to continuously run regression tests in the background, providing rapid feedback about test failures as source code is edited. It is intended to reduce the time and energy required to keep code well-tested and prevent regression errors from persisting uncaught for long periods of time. This paper reports on a controlled human experiment to evaluate whether students using continuous testing are more successful in completing programming assignments. We also summarize users' subjective impressions and discuss why the results may generalize.
Programming By Demonstration Using Version Space Algebra
2001. Cited by 37 (7 self)
Programming by demonstration enables users to easily personalize their applications, automating repetitive tasks simply by executing a few examples. We formalize programming by demonstration as a machine learning problem: given the changes in the application state that result from the user's demonstrated actions, learn the general program that maps from one application state to the next. We present a methodology for learning in this space of complex functions. First we extend version spaces to learn arbitrary functions, not just concepts. Then we introduce the version space algebra, a method for composing simpler version spaces to construct more complex spaces. Finally, we apply our version space algebra to the text-editing domain and describe an implemented system called SMARTedit that learns repetitive text-editing procedures by example. We evaluate our approach by measuring the number of examples required for the system to learn a procedure that works on the remainder of examples, and by an informal user study measuring the effort users spend using our system versus performing the task by hand. The results show that SMARTedit is capable of generalizing correctly from as few as one or two examples, and that users generally save a significant amount of effort when completing tasks with SMARTedit's help.
Outlier Finding: Focusing User Attention on Possible Errors
2001. Cited by 29 (3 self)
When users handle large amounts of data, errors are hard to notice. Outlier finding is a new way to reduce errors by directing the user's attention to inconsistent data which may indicate errors. We have implemented an outlier finder for text, which can detect both unusual matches and unusual mismatches to a text pattern. When integrated into the user interface of a PBD text editor and tested in a user study, outlier finding substantially reduced errors. KEYWORDS: programming-by-demonstration, PBD, intelligent user interfaces, text editing, pattern matching, search-and-replace, LAPIS, cluster analysis, unsupervised learning
Interactive Simultaneous Editing of Multiple Text Regions
2001. Cited by 29 (4 self)
Simultaneous editing is a new method for automating repetitive text editing. After describing a set of regions to edit (records), the user can edit any one record and see equivalent edits applied simultaneously to all other records. The essence of simultaneous editing is generalizing the user's selection in one record to equivalent selections in the other records. We describe a generalization method that is fast (suitable for interactive use), domain-specific (capable of using high-level knowledge such as Java and HTML syntax), and under user control (generalizations can be corrected or overridden). Simultaneous editing has applications in source code editing, HTML editing, and scripting, among others.
1 Introduction
Text editing is full of small repetitive tasks. Examples include:
- Replace the string "Hashtable" with "Map" throughout a program
- Reformat a list of phone numbers from "(xxx) yyyzzzz" to "+1 xxx yyy zzzz"
- Insert print statements to trace entry and exit from each...
MarkItUp! - An incremental approach to document structure recognition
P. Fankhauser and Yi Xu. Electronic Publishing, 1993. Cited by 27 (3 self)
Abstraction: We distinguish three kinds of abstraction: (1) Implicit abstraction arises from the fact that we unify each right-hand side of each rule with an eventually occurring new example, regardless of the nesting depth. Obviously, the resulting grammar will be more generic than the top level disjunction of the old grammar with the grammar derived from the new example. (2) Another kind of abstraction takes place during merging when only trivial unification is possible. (3) The third kind of abstraction is applied after the example has been merged into the old grammar to further simplify the inferred grammar. For merging new examples with the existing grammar in a more tolerant way (2) we introduce three additional rules: Whereas the unification Rules 4a and 4b merge only sequences with a common prefix or suffix, the abstraction Rules 7a-c merge sequences with a number of common subsequences interleaved by distinct subsequences. 7. Abstraction merge of sequen...
SWYN: A Visual Representation for Regular Expressions
2001. Cited by 25 (1 self)
People find it difficult to create and maintain abstractions. We often deal with abstract tasks by using notations that make the structure of the abstraction visible. PBE systems sometimes make it more difficult to create abstractions. The user has to second-guess the results of the inference algorithm, and sometimes cannot see any visual representation of the inferred result, let alone manipulate it easily. SWYN (See What You Need) addresses these issues in the context of constructing regular expressions from examples. It provides a visual representation that has been evaluated in empirical user testing, and an induction interface that always allows the user to see and modify the effects of the supplied examples. The results demonstrate the potential advantages of more strictly applying cognitive dimensions analysis and direct manipulation principles when designing systems for programming by example.
Repeat and Predict - Two Keys to Efficient Text Editing
In Conference on Human Factors in Computing Systems, 1994. Cited by 23 (2 self)
We propose a simple and powerful predictive interface technique for text editing tasks. With our technique, called dynamic macro creation, when a user types a special "repeat" key after doing repetitive operations in a text editor, an editing sequence corresponding to one iteration is detected, defined as a macro, and executed at the same time. Although the technique is simple, a wide range of repetitive tasks can be performed just by typing the repeat key. When we use another special "predict" key for conventional prediction techniques in addition to the repeat key, a wider range of prediction schemes can be performed depending on the order of using these two keys.
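The repeat-key idea in this abstract can be sketched in a few lines. The abstract does not state the exact detection rule, so the sketch below assumes one plausible reading: on pressing repeat, find the shortest non-empty sequence X such that the operation history ends with X X, treat X as the macro, and replay it once. The names `detect_repeated_unit` and `press_repeat` and the string encoding of editing operations are hypothetical.

```python
from typing import List, Optional, Sequence


def detect_repeated_unit(history: Sequence[str]) -> Optional[List[str]]:
    """Return the shortest non-empty X such that history ends with X X.

    Returns None when the tail of the history contains no immediate
    repetition. Choosing the shortest X is an assumption of this sketch.
    """
    n = len(history)
    for size in range(1, n // 2 + 1):
        last = list(history[n - size:])
        before = list(history[n - 2 * size:n - size])
        if last == before:
            return last
    return None


def press_repeat(history: List[str]) -> List[str]:
    """Simulate the repeat key: replay the detected unit once.

    "Executing" the macro is modelled by appending its operations to a
    copy of the history; the input list is left unchanged.
    """
    unit = detect_repeated_unit(history)
    if unit is None:
        return list(history)
    return list(history) + unit


if __name__ == "__main__":
    ops = ["open", "down", "insert #", "down", "insert #"]
    print(detect_repeated_unit(ops))
    print(press_repeat(ops))
```

A real editor would record keystrokes or commands rather than strings and would run the macro against the buffer, but the tail-repetition search is the part the abstract describes.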

