Results 1-10 of 391
Wrapper Induction for Information Extraction
1997
"... The Internet presents numerous sources of useful informationtelephone directories, product catalogs, stock quotes, weather forecasts, etc. Recently, many systems have been built that automatically gather and manipulate such information on a user's behalf. However, these resources are usually form ..."
Abstract

Cited by 519 (30 self)
The Internet presents numerous sources of useful information: telephone directories, product catalogs, stock quotes, weather forecasts, etc. Recently, many systems have been built that automatically gather and manipulate such information on a user's behalf. However, these resources are usually formatted for use by people (e.g., the relevant content is embedded in HTML pages), so extracting their content is difficult. Wrappers are often used for this purpose. A wrapper is a procedure for extracting a particular resource's content. Unfortunately, hand-coding wrappers is tedious. We introduce wrapper induction, a technique for automatically constructing wrappers. Our techniques can be described in terms of three main contributions. First, we pose the problem of wrapper construction as one of inductive learn...
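As a rough illustration of what a wrapper does (a minimal sketch of a simple left-right delimiter wrapper, not the induction algorithm the paper itself presents), the following Python fragment extracts attribute tuples from a page whose content sits between fixed delimiter strings; the sample page and delimiters are hypothetical.

```python
# Minimal sketch of a left-right (LR) delimiter wrapper. Assumes relevant
# attribute values sit between fixed delimiter strings; the sample page and
# delimiters below are hypothetical, not taken from the paper.

def lr_extract(page, delimiters):
    """Extract tuples whose k-th attribute lies between delimiters[k] = (left, right)."""
    tuples, pos = [], 0
    while True:
        record = []
        for left, right in delimiters:
            start = page.find(left, pos)
            if start == -1:
                return tuples          # no further records
            start += len(left)
            end = page.find(right, start)
            record.append(page[start:end])
            pos = end + len(right)
        tuples.append(tuple(record))

page = "<b>Congo</b> <i>242</i> <b>Egypt</b> <i>20</i>"
print(lr_extract(page, [("<b>", "</b>"), ("<i>", "</i>")]))
# -> [('Congo', '242'), ('Egypt', '20')]
```

Wrapper induction, as described in the paper, learns delimiters of this kind automatically from labeled example pages rather than having them hand-coded.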
Selection of relevant features and examples in machine learning
Artificial Intelligence, 1997
"... In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been mad ..."
Abstract

Cited by 423 (1 self)
In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been made on these topics in both empirical and theoretical work in machine learning, and we present a general framework that we use to compare different methods. We close with some challenges for future work in this area.
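As a minimal illustration of the feature-selection problem the survey addresses (not a method taken from the survey itself), the sketch below scores binary features by their empirical mutual information with the label and keeps the top-ranked ones; the data and the choice of scoring function are assumptions for the example.

```python
# Minimal filter-style feature selection sketch: rank binary features by their
# empirical mutual information with a binary label and keep the top k.
# Illustrative assumption only; not a method defined in the survey.
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in bits) between two binary arrays."""
    mi = 0.0
    for xv in (0, 1):
        for yv in (0, 1):
            pxy = np.mean((x == xv) & (y == yv))
            px, py = np.mean(x == xv), np.mean(y == yv)
            if pxy > 0:
                mi += pxy * np.log2(pxy / (px * py))
    return mi

def select_features(X, y, k):
    scores = [mutual_information(X[:, j], y) for j in range(X.shape[1])]
    return np.argsort(scores)[::-1][:k]      # indices of the k highest-scoring features

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.column_stack([y ^ (rng.random(200) < 0.1).astype(int),  # relevant: noisy copy of y
                     rng.integers(0, 2, 200),                  # irrelevant
                     rng.integers(0, 2, 200)])                 # irrelevant
print(select_features(X, y, 1))   # typically [0], the relevant feature
```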
Improving generalization with active learning
Machine Learning, 1994
"... Abstract. Active learning differs from "learning from examples " in that the learning algorithm assumes at least some control over what part of the input domain it receives information about. In some situations, active learning is provably more powerful than learning from examples alone, g ..."
Abstract

Cited by 419 (1 self)
Active learning differs from "learning from examples" in that the learning algorithm assumes at least some control over what part of the input domain it receives information about. In some situations, active learning is provably more powerful than learning from examples alone, giving better generalization for a fixed number of training examples. In this article, we consider the problem of learning a binary concept in the absence of noise. We describe a formalism for active concept learning called selective sampling and show how it may be approximately implemented by a neural network. In selective sampling, a learner receives distribution information from the environment and queries an oracle on parts of the domain it considers "useful." We test our implementation, called an SG-network, on three domains and observe significant improvement in generalization.
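A toy version of the selective-sampling idea, greatly simplified from the paper's SG-network: for a noise-free one-dimensional threshold concept, the learner tracks its version space and queries the oracle only on points whose label is still undetermined. The concept class, threshold, and budget below are hypothetical.

```python
# Toy selective-sampling loop for a noise-free 1-D threshold concept: the
# learner keeps its version space (an interval for the unknown threshold) and
# queries the oracle only on candidate points whose label is still undetermined.
# A simplified stand-in for illustration, not the paper's SG-network.
import random

THETA = 0.37                          # hidden concept (hypothetical): label 1 iff x >= THETA

def oracle(x):
    return int(x >= THETA)

def selective_sampling(n_candidates=1000, budget=20):
    lo, hi = 0.0, 1.0                 # the threshold is known to lie in [lo, hi]
    queries = 0
    for _ in range(n_candidates):
        x = random.random()           # unlabeled candidate from the input distribution
        if not (lo < x < hi):         # label already implied by past answers; skip the query
            continue
        if queries == budget:
            break
        queries += 1
        if oracle(x):
            hi = x                    # threshold must be at or below x
        else:
            lo = x                    # threshold must be above x
    return (lo + hi) / 2, queries

estimate, used = selective_sampling()
print(f"estimated threshold {estimate:.3f} after {used} queries")
```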
Cryptographic Limitations on Learning Boolean Formulae and Finite Automata
Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing, 1989
"... In this paper we prove the intractability of learning several classes of Boolean functions in the distributionfree model (also called the Probably Approximately Correct or PAC model) of learning from examples. These results are representation independent, in that they hold regardless of the syntact ..."
Abstract

Cited by 311 (16 self)
In this paper we prove the intractability of learning several classes of Boolean functions in the distribution-free model (also called the Probably Approximately Correct or PAC model) of learning from examples. These results are representation independent, in that they hold regardless of the syntactic form in which the learner chooses to represent its hypotheses. Our methods reduce the problems of cracking a number of well-known public-key cryptosystems to the learning problems. We prove that a polynomial-time learning algorithm for Boolean formulae, deterministic finite automata or constant-depth threshold circuits would have dramatic consequences for cryptography and number theory: in particular, such an algorithm could be used to break the RSA cryptosystem, factor Blum integers (composite numbers equivalent to 3 modulo 4), and detect quadratic residues. The results hold even if the learning algorithm is only required to obtain a slight advantage in prediction over random guessing. The techniques used demonstrate an interesting duality between learning and cryptography. We also apply our results to obtain strong intractability results for approximating a generalization of graph coloring.
Principles and Methods of Testing Finite State Machines: A Survey
Proceedings of the IEEE, 1996
"... With advanced computer technology, systems are getting larger to fulfill more complicated tasks, however, they are also becoming less reliable. Consequently, testing is an indispensable part of system design and implementation; yet it has proved to be a formidable task for complex systems. This moti ..."
Abstract

Cited by 244 (13 self)
With advanced computer technology, systems are getting larger to fulfill more complicated tasks; however, they are also becoming less reliable. Consequently, testing is an indispensable part of system design and implementation; yet it has proved to be a formidable task for complex systems. This motivates the study of testing finite state machines to ensure the correct functioning of systems and to discover aspects of their behavior. A finite state machine contains a finite number of states and produces outputs on state transitions after receiving inputs. Finite state machines are widely used to model systems in diverse areas, including sequential circuits, certain types of programs, and, more recently, communication protocols. In a testing problem we have a machine about which we lack some information; we would like to deduce this information by providing a sequence of inputs to the machine and observing the outputs produced. Because of its practical importance and theoretical interest, the problem of testing finite state machines has been studied in different areas and at various times. The earliest published literature on this topic dates back to the 1950s. Activities in the 1960s and early 1970s were motivated mainly by automata theory and sequential circuit testing. The area seemed to have mostly died down until a few years ago, when the testing problem was resurrected and is now being studied anew due to its applications to conformance testing of communication protocols. While some old problems which had been open for decades were resolved recently, new concepts and more intriguing problems from new applications emerge. We review the fundamental problems in testing finite state machines and techniques for solving these problems, tracing progress in the area from its inception to the present and the state of the art. In addition, we discuss extensions of finite state machines and some other topics related to testing.
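As a small concrete instance of the testing problem (a brute-force sketch, not one of the survey's techniques), the following compares a specification Mealy machine against a black-box implementation on all short input sequences and reports the first output discrepancy; both machines are hypothetical.

```python
# Toy conformance-testing sketch: drive a specification Mealy machine and a
# black-box implementation with the same input sequences and compare outputs.
# Brute-force over short sequences; a stand-in for illustration, not one of the
# survey's techniques. Both machines below are hypothetical.
from itertools import product

class MealyFSM:
    def __init__(self, transitions, start):
        # transitions[(state, input)] = (next_state, output)
        self.t, self.start = transitions, start
    def run(self, inputs):
        state, out = self.start, []
        for a in inputs:
            state, o = self.t[(state, a)]
            out.append(o)
        return out

# Specification: output 1 exactly when the number of 'a' inputs seen so far is odd.
spec = MealyFSM({('even', 'a'): ('odd', 1), ('odd', 'a'): ('even', 0),
                 ('even', 'b'): ('even', 0), ('odd', 'b'): ('odd', 1)}, 'even')
# Faulty implementation: one transition emits the wrong output.
impl = MealyFSM({('even', 'a'): ('odd', 1), ('odd', 'a'): ('even', 0),
                 ('even', 'b'): ('even', 0), ('odd', 'b'): ('odd', 0)}, 'even')

def find_discrepancy(spec, impl, alphabet, max_len):
    for n in range(1, max_len + 1):
        for seq in product(alphabet, repeat=n):
            if spec.run(seq) != impl.run(seq):
                return seq            # a test sequence that exposes the fault
    return None

print(find_discrepancy(spec, impl, ['a', 'b'], 3))   # -> ('a', 'b')
```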
Discovering Models of Software Processes from Event-Based Data
ACM Transactions on Software Engineering and Methodology, 1998
"... this article we describe a Markov method that we developed specifically for process discovery, as well as describe two additional methods that we adopted from other domains and augmented for our purposes. The three methods range from the purely algorithmic to the purely statistical. We compare the m ..."
Abstract

Cited by 230 (7 self)
In this article we describe a Markov method that we developed specifically for process discovery, as well as two additional methods that we adopted from other domains and augmented for our purposes. The three methods range from the purely algorithmic to the purely statistical. We compare the methods and discuss their application in an industrial case study.
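A minimal sketch in the spirit of frequency-based process discovery (an assumption-laden stand-in, not the article's actual Markov method): count event successions in logged traces and keep transitions whose empirical probability clears a threshold. The event names and the threshold are hypothetical.

```python
# Minimal frequency-based process-discovery sketch: count pairwise event
# successions in logged traces and keep transitions whose empirical probability
# clears a threshold. An illustrative stand-in; the event names and threshold
# are hypothetical, not the article's actual Markov method.
from collections import Counter, defaultdict

def discover(traces, threshold=0.3):
    counts = defaultdict(Counter)
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[a][b] += 1
    model = {}
    for event, successors in counts.items():
        total = sum(successors.values())
        model[event] = {b: round(c / total, 2)
                        for b, c in successors.items() if c / total >= threshold}
    return model

traces = [["checkout", "review", "fix", "review", "commit"],
          ["checkout", "review", "commit"],
          ["checkout", "fix", "review", "commit"]]
for event, edges in discover(traces).items():
    print(event, "->", edges)
```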
Wrapper Induction: Efficiency and Expressiveness
Artificial Intelligence, 2000
"... The Internet presents numerous sources of useful informationtelephone directories, product catalogs, stock quotes, event listings, etc. Recently, many systems have been built that automatically gather and manipulate such information on a user's behalf. However, these resources are usually formatt ..."
Abstract

Cited by 210 (12 self)
The Internet presents numerous sources of useful information: telephone directories, product catalogs, stock quotes, event listings, etc. Recently, many systems have been built that automatically gather and manipulate such information on a user's behalf. However, these resources are usually formatted for use by people (e.g., the relevant content is embedded in HTML pages), so extracting their content is difficult. Most systems use customized wrapper procedures to perform this extraction task. Unfortunately, writing wrappers is tedious and error-prone. As an alternative, we advocate wrapper induction, a technique for automatically constructing wrappers. In this article, we describe six wrapper classes, and use a combination of empirical and analytical techniques to evaluate the computational tradeoffs among them. We first consider expressiveness: how well the classes can handle actual Internet resources, and the extent to which wrappers in one class can mimic those in another. We then...
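To illustrate what a richer wrapper class buys (a hypothetical sketch of an HLRT-style head-left-right-tail wrapper, in the spirit of the classes the article compares, not code from the article), head and tail delimiters confine extraction to the body of the page, so markup that also appears in a banner or footer is ignored.

```python
# Sketch of an HLRT-style (head-left-right-tail) wrapper: head and tail
# delimiters confine extraction to the body of the page, so markup that also
# appears in a banner or footer is ignored. The sample page and delimiters are
# hypothetical, not taken from the article.

def hlrt_extract(page, head, tail, delimiters):
    body_start = page.find(head) + len(head)
    body = page[body_start: page.find(tail, body_start)]
    tuples, pos = [], 0
    while True:
        record = []
        for left, right in delimiters:
            start = body.find(left, pos)
            if start == -1:
                return tuples
            start += len(left)
            end = body.find(right, start)
            record.append(body[start:end])
            pos = end + len(right)
        tuples.append(tuple(record))

page = "<b>Welcome!</b><hr><b>Congo</b><i>242</i><b>Egypt</b><i>20</i><hr>End"
print(hlrt_extract(page, "<hr>", "<hr>", [("<b>", "</b>"), ("<i>", "</i>")]))
# -> [('Congo', '242'), ('Egypt', '20')]; the '<b>Welcome!</b>' banner is skipped
```

A plain left-right wrapper applied to the same page would mistakenly pair the banner text with the first country code, which is the kind of expressiveness gap the article's comparison of wrapper classes examines.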
Computational Limitations on Learning from Examples
Journal of the ACM, 1988
"... Abstract. The computational complexity of learning Boolean concepts from examples is investigated. It is shown for various classes of concept representations that these cannot be learned feasibly in a distributionfree sense unless R = NP. These classes include (a) disjunctions of two monomials, (b) ..."
Abstract

Cited by 192 (10 self)
The computational complexity of learning Boolean concepts from examples is investigated. It is shown for various classes of concept representations that these cannot be learned feasibly in a distribution-free sense unless R = NP. These classes include (a) disjunctions of two monomials, (b) Boolean threshold functions, and (c) Boolean formulas in which each variable occurs at most once. Relationships between learning of heuristics and finding approximate solutions to NP-hard optimization problems are given. Categories and Subject Descriptors: F.1.1 [Computation by Abstract Devices]: Models of Computation: relations among models; F.1.2 [Computation by Abstract Devices]: Modes of Computation: probabilistic computation; F.1.3 [Computation by Abstract Devices]: Complexity Classes: reducibility and completeness; I.2.6 [Artificial Intelligence]: Learning: concept learning; induction
Learning Decision Trees using the Fourier Spectrum
1991
"... This work gives a polynomial time algorithm for learning decision trees with respect to the uniform distribution. (This algorithm uses membership queries.) The decision tree model that is considered is an extension of the traditional boolean decision tree model that allows linear operations in each ..."
Abstract

Cited by 182 (10 self)
This work gives a polynomial time algorithm for learning decision trees with respect to the uniform distribution. (This algorithm uses membership queries.) The decision tree model that is considered is an extension of the traditional Boolean decision tree model that allows linear operations in each node (i.e., summation of a subset of the input variables over GF(2)). This paper shows how to learn in polynomial time any function that can be approximated (in the L2 norm) by a polynomially sparse function (i.e., a function with only polynomially many nonzero Fourier coefficients). The authors demonstrate that any function f whose L1 norm (i.e., the sum of absolute values of the Fourier coefficients) is polynomial can be approximated by a polynomially sparse function, and prove that Boolean decision trees with linear operations are a subset of this class of functions. Moreover, it is shown that the functions with polynomial L1 norm can be learned deterministically. The algorithm can a...
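As a hedged illustration of the basic Fourier machinery involved (not the paper's full learning algorithm), the sketch below estimates a single Fourier coefficient of a ±1-valued Boolean function under the uniform distribution using membership queries; the target function is a hypothetical example.

```python
# Sketch of the basic Fourier tool involved: estimate one coefficient
# f_hat(S) = E_x[ f(x) * chi_S(x) ] of a Boolean function f: {0,1}^n -> {-1,+1}
# under the uniform distribution, using membership queries (calls to f).
# Illustrative only; the target function below is a hypothetical example.
import random

def chi(S, x):
    """Character chi_S(x) = (-1) ** (sum of x[i] for i in S)."""
    return (-1) ** sum(x[i] for i in S)

def estimate_coefficient(f, n, S, samples=20000):
    total = 0.0
    for _ in range(samples):
        x = [random.randint(0, 1) for _ in range(n)]   # uniform random input
        total += f(x) * chi(S, x)                       # membership query on x
    return total / samples

def f(x):
    # Parity of x[0] and x[2]: its only nonzero Fourier coefficient is at S = {0, 2}.
    return chi({0, 2}, x)

print(round(estimate_coefficient(f, 4, {0, 2}), 2))   # close to 1.0
print(round(estimate_coefficient(f, 4, {1}), 2))      # close to 0.0
```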