Designing Adaptive Information Extraction for the Semantic Web in Amilcare
- Annotation for the Semantic Web, Frontiers in Artificial Intelligence and Applications. IOS
, 2003
Artequakt: Generating Tailored Biographies with Automatically Annotated Fragments from the Web
- Presented at the Semantic Authoring, Annotation and Knowledge Markup (SAAKM) 2002 Workshop at the 15th European Conference on Artificial Intelligence (ECAI 2002)
Cited by 29 (9 self)
The Artequakt project is working towards automatically generating narrative biographies of artists from knowledge that has been extracted from the Web and maintained in a knowledge base. An overview of the system architecture is presented here and the three key components of that architecture are explained in detail, namely knowledge extraction, information management and biography construction. Conclusions are drawn from the initial experiences of the project and future plans are described.
SystemT: an Algebraic Approach to Declarative Information Extraction
Cited by 12 (6 self)
As information extraction (IE) becomes more central to enterprise applications, rule-based IE engines have become increasingly important. In this paper, we describe SystemT, a rule-based IE system whose basic design removes the expressivity and performance limitations of current systems based on cascading grammars. SystemT uses a declarative rule language, AQL, and an optimizer that generates high-performance algebraic execution plans for AQL rules. We compare SystemT’s approach against cascading grammars, both theoretically and with a thorough experimental evaluation. Our results show that SystemT can deliver result quality comparable to the state-of-the-art and an order of magnitude higher annotation throughput.
Towards a cultural heritage digital library
- In JCDL
, 2003
Cited by 7 (1 self)
Abstract: This paper surveys research areas relevant to cultural
Blueprint for a High Performance NLP Infrastructure
, 2003
Cited by 6 (0 self)
Natural Language Processing (NLP) system developers face a number of new challenges. Interest is increasing for real-world systems that use NLP tools and techniques. The quantity of text now available for training and processing is increasing dramatically. Also, the range of languages and tasks being researched continues to grow rapidly. Thus it is an ideal time to consider the development of new experimental frameworks. We describe the requirements, initial design and exploratory implementation of a high performance NLP infrastructure.
SOCIS: Scene of Crime Information System
, 2001
Cited by 4 (4 self)
this technical meeting, we realised that field work was necessary for gathering the data we needed. So, we decided to spend four days at the Rotherham police station, collecting data and following Officer Hawley at the crime scenes he would attend, in order to see how a SOC is recorded and documented. Dr Saggion spent two days (the 10th and 11th of January 2001) at Rotherham and Ms Katerina Pastra spent another two days (the 5th and 6th of February 2001)
LearningPinocchio: Adaptive information extraction for real world applications
- In Proceedings of the 2nd Workshop on Robust Methods in Analysis of Natural Language Data (ROMAND 2002)
, 2002
Cited by 3 (2 self)
The new frontier of research on Information Extraction from texts is portability without any knowledge of Natural Language Processing. The market potential is very large in principle, provided that a suitable easy-to-use and effective methodology is provided. In this paper we describe LearningPinocchio, a system for adaptive Information Extraction from texts that is having good commercial and scientific success. Real world applications have been built and evaluation licenses have been released to external companies for application development. In this paper we outline the basic algorithm behind the scenes and present a number of applications developed with LearningPinocchio. Then we report on an evaluation performed by an independent company. Finally we discuss the general suitability of this IE technology for real world applications and draw some conclusions.
GIR experiments with Forostar at GeoCLEF 2007
- in ImageCLEF 2007
, 2007
Cited by 3 (2 self)
In this paper we describe our Geographic Information Retrieval experiments with Forostar, our GIR application on the GeoCLEF 2007 corpus and query set. We compare the results from orthogonal text with no geographic entities and only geographic entities with standard text retrieval and combined text and geographic relevance methods. The text and named entity analysis and retrieval methods of Forostar are described in detail. We also detail our placename disambiguation and geographic relevance ranking methods. The paper concludes with an analysis of our results including significance testing where we show our baseline method, in fact, to be best. Finally we identify weaknesses in our approach and ways in which the system could be optimised and improved.
Generating Adaptive Hypertext Content from the Semantic Web
- 1st Int. Workshop on Hypermedia and the Semantic Web, HyperText'03
, 2003
Cited by 3 (2 self)
Accessing and extracting knowledge from online documents is crucial for the realisation of the Semantic Web and the provision of advanced knowledge services. The Artequakt project is an ongoing investigation tackling these issues to facilitate the creation of tailored biographies from information harvested from the web. In this paper we will present the methods we currently use to model, consolidate and store knowledge extracted from the web so that it can be re-purposed as adaptive content. We look at how Semantic Web technology could be used within this process and also how such techniques might be used to provide content to be published via the Semantic Web.
TextGrid and eHumanities
- In E-SCIENCE ’06: Proc. of the Second IEEE International Conf. on e-Science and Grid Computing
, 2006
Cited by 2 (0 self)
TextGrid is a new Grid project in the framework of the German D-Grid initiative, with the aim to deploy Grid technologies for humanities scholars working on historical (German) texts. Its two roots, humanities computing and eScience (Grid computing used by research together with modern communication technologies), are the basis for TextGrid to provide pioneer work in eHumanities. After summarizing Humanities Computing and modern network technologies, community expectations in the fields of philological edition and other application areas are set forth, from which functional requirements such as modularity, distribution, etc. are distilled. The first version of the TextGrid architecture was designed in accordance with these requirements, and focuses on openness by standard conformance and encapsulation. It provides storage Grid services via a pure Web Services interface to dedicated Web Services tools for different aspects of text processing, analysis and retrieval. This platform aims to provide easily usable tools for scholars, but also specifies interfaces for external program developers to add functionality.

