Results 1 - 10 of 717,624
Wrapper Induction for Information Extraction
, 1997
"... The Internet presents numerous sources of useful information---telephone directories, product catalogs, stock quotes, weather forecasts, etc. Recently, many systems have been built that automatically gather and manipulate such information on a user's behalf. However, these resources are usually ..."
Cited by 612 (30 self)
are usually formatted for use by people (e.g., the relevant content is embedded in HTML pages), so extracting their content is difficult. Wrappers are often used for this purpose. A wrapper is a procedure for extracting a particular resource's content. Unfortunately, hand-coding wrappers is tedious. We
Extracting patterns and relations from the world wide web
- In WebDB Workshop at 6th International Conference on Extending Database Technology, EDBT’98
, 1998
"... Abstract. The World Wide Web is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists may be scattered across thousands of independent information sources in many different formats. In this paper, we consider the problem of e ..."
Cited by 462 (1 self)
of extracting a relation for such a data type from all of these sources automatically. We present a technique which exploits the duality between sets of patterns and relations to grow the target relation starting from a small sample. To test our technique we use it to extract a relation of (author,title) pairs
Maximum entropy Markov models for information extraction and segmentation
, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Cited by 554 (18 self)
Bandera: Extracting Finite-state Models from Java Source Code
- IN PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING
, 2000
"... Finite-state verification techniques, such as model checking, have shown promise as a cost-effective means for finding defects in hardware designs. To date, the application of these techniques to software has been hindered by several obstacles. Chief among these is the problem of constructing a fini ..."
Cited by 653 (35 self)
), and difficult to optimize (which is necessary to combat the exponential complexity of verification algorithms). In this paper, we describe an integrated collection of program analysis and transformation components, called Bandera, that enables the automatic extraction of safe, compact finite-state models from
State Transition Analysis: A Rule-Based Intrusion Detection Approach
- IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
, 1995
"... This paper presents a new approach to representing and detecting computer penetrations in real-time. The approach, called state transition analysis, models penetrations as a series of state changes that lead from an initial secure state to a target compromised state. State transition diagrams, the g ..."
Cited by 347 (19 self)
system, and these diagrams form the basis of a rule-based expert system for detecting penetrations, called the State Transition Analysis Tool (STAT). The design and implementation of a UNIX-specific prototype of this expert system, called USTAT, is also presented. This prototype provides a further
A Compositional Approach to Performance Modelling
, 1996
"... Performance modelling is concerned with the capture and analysis of the dynamic behaviour of computer and communication systems. The size and complexity of many modern systems result in large, complex models. A compositional approach decomposes the system into subsystems that are smaller and more ea ..."
Cited by 746 (102 self)
easily modelled. In this thesis a novel compositional approach to performance modelling is presented. This approach is based on a suitably enhanced process algebra, PEPA (Performance Evaluation Process Algebra). The compositional nature of the language provides benefits for model solution as well
Learning probabilistic relational models
- In IJCAI
, 1999
"... A large portion of real-world data is stored in commercial relational database systems. In contrast, most statistical learning methods work only with "flat" data representations. Thus, to apply these methods, we are forced to convert our data into a flat form, thereby losing much ..."
Cited by 619 (31 self)
of the dependency structure in a model. Moreover, we show how the learning procedure can exploit standard database retrieval techniques for efficient learning from large datasets. We present experimental results on both real and synthetic relational databases.
The empirical case for two systems of reasoning
, 1996
"... Distinctions have been proposed between systems of reasoning for centuries. This article distills properties shared by many of these distinctions and characterizes the resulting systems in light of recent findings and theoretical developments. One system is associative because its computations ref ..."
Cited by 631 (4 self)
and can simultaneously generate different solutions to a reasoning problem. The rule-based system can suppress the associative system but not completely inhibit it. The article reviews evidence in favor of the distinction and its characterization.
A Signal Processing Approach To Fair Surface Design
, 1995
"... In this paper we describe a new tool for interactive free-form fair surface design. By generalizing classical discrete Fourier analysis to two-dimensional discrete surface signals -- functions defined on polyhedral surfaces of arbitrary topology --, we reduce the problem of surface smoothing, or fai ..."
Cited by 668 (15 self)
, or fairing, to low-pass filtering. We describe a very simple surface signal low-pass filter algorithm that applies to surfaces of arbitrary topology. As opposed to other existing optimization-based fairing methods, which are computationally more expensive, this is a linear time and space complexity algorithm
A Maximum Entropy approach to Natural Language Processing
- COMPUTATIONAL LINGUISTICS
, 1996
"... The concept of maximum entropy can be traced back along multiple threads to Biblical times. Only recently, however, have computers become powerful enough to permit the widescale application of this concept to real world problems in statistical estimation and pattern recognition. In this paper we des ..."
Cited by 1341 (5 self)
describe a method for statistical modeling based on maximum entropy. We present a maximum-likelihood approach for automatically constructing maximum entropy models and describe how to implement this approach efficiently, using as examples several problems in natural language processing.