Results 1 - 5 of 5
Evolving GATE to Meet New Challenges in . . .
, 1998
Cited by 11 (2 self)
In this paper we present recent work on GATE, a widely-used framework and graphical development environment for creating and deploying Language Engineering components and resources in a robust fashion. The GATE architecture has facilitated the development of a number of successful applications for various language processing tasks (such as Information Extraction, dialogue and summarisation), the building and annotation of corpora and the quantitative evaluations of LE applications. The focus of this paper is on recent developments in response to new challenges in Language Engineering: Semantic Web, integration with Information Retrieval and data mining, and the need for machine learning support.
iASA: Learning to Annotate the Semantic Web
Cited by 8 (5 self)
With the advent of the Semantic Web, there is a great need to upgrade existing web content to semantic web content. This can be accomplished through semantic annotations. Unfortunately, manual annotation is tedious, time-consuming and error-prone. In this paper, we propose a tool, called iASA, that learns to automatically annotate web documents according to an ontology. iASA is based on the combination of information extraction (specifically, the Similarity-based Rule Learner, SRL) and machine learning techniques. Using linguistic knowledge and optimal dynamic window size, SRL produces annotation rules of better quality than comparable semantic annotation systems. Similarity-based learning efficiently reduces the search space by avoiding pseudo rule generalization. In the annotation phase, iASA exploits ontology knowledge to refine the annotation it proposes. Moreover, our annotation algorithm exploits machine learning methods to correctly select instances and to predict missing instances. Finally, iASA provides an explanation component that explains the nature of the learner and annotator to the user. Explanations can greatly help users understand the rule induction and annotation process, so that they can focus on correcting rules and annotations quickly. Experimental results show that iASA can reach high accuracy quickly.
NLP Technologies and the Semantic Web: Risks, Opportunities and Challenges
- In 8th Conference of the AI*IA (Italian Association for Artificial Intelligence), 2003
Cited by 2 (0 self)
In this paper we provide a set of hypotheses about possible interactions between the rising paradigm of the Semantic Web and NLP technologies. We show that NLP has a role to play both in the creation and maintenance of the Semantic Web and in its access by humans. We also provide the skeleton of a running application which emulates a situation where the Semantic Web has reached its mature state.
Automatic Semantic Subject Indexing of Web Documents in Highly Inflected Languages
Cited by 1 (0 self)
Structured semantic metadata about unstructured web documents can be created using automatic subject indexing methods, avoiding laborious manual indexing. A successful automatic subject indexing tool for the web should work with texts in multiple languages and be independent of the domain of discourse of the documents and controlled vocabularies. However, analyzing text written in a highly inflected language requires word form normalization that goes beyond rule-based stemming algorithms. We have tested the state-of-the-art automatic indexing tool Maui on Finnish texts using three stemming and lemmatization algorithms and tested it with documents and vocabularies of different domains. Both of the lemmatization algorithms we tested performed significantly better than a rule-based stemmer, and the subject indexing quality was found to be comparable to that of human indexers.
Question Answering Biographic Information and Social Network Powered by the Semantic Web
After several years of development, the vision of the Semantic Web is gradually becoming reality. Large data repositories have been created and offer semantic information in a machine-processable form for various domains. Semantic Web data can be published on the Web, gathered automatically, and reasoned about. All these developments open interesting perspectives for building a new class of domain-specific, broad-coverage information systems that overcome a long-standing bottleneck of AI systems, the notoriously incomplete knowledge base. We present a system that shows how the wealth of information in the Semantic Web can be interfaced with humans once again, using natural language for querying and answering rather than technical formalisms. Whereas current Question Answering systems typically select snippets from Web documents retrieved by a search engine, we utilize Semantic Web data, which allows us to provide natural-language answers that are tailored to the current dialog context. Furthermore, we show how to use natural language processing technologies to acquire new data and enrich existing data in a Semantic Web framework. Our system has acquired a rich biographic data resource by combining existing Semantic Web resources, which are discovered from semi-structured textual data in Web pages, with information extracted from free natural language texts.

