SemTag and Seeker: Bootstrapping the semantic web via automated semantic annotation
- Proceedings of the 12th International Conference on World Wide Web (WWW’03)
, 2003
Cited by 120 (4 self)
This paper describes Seeker, a platform for large-scale text analytics, and SemTag, an application written on the platform to perform automated semantic tagging of large corpora. We apply SemTag to a collection of approximately 264 million web pages, and generate approximately 434 million automatically disambiguated semantic tags, published to the web as a label bureau providing metadata regarding the 434 million annotations. The final version of this paper will reflect new data labeling one billion pages, rather than the 264 million pages reported on herein. To our knowledge, this is the largest-scale semantic tagging effort to date. We describe the Seeker platform, discuss the architecture of the SemTag application, describe a new disambiguation algorithm specialized to support ontological disambiguation of large-scale data, evaluate the algorithm, and present our final results with information about acquiring and making use of the semantic tags. We argue that automated large-scale semantic tagging of ambiguous content can bootstrap and accelerate the creation of the semantic web.
S-CREAM -- Semi-automatic CREAtion of Metadata
, 2002
Cited by 118 (23 self)
Richly interlinked, machine-understandable data constitute the basis for the Semantic Web. We provide a framework, S-CREAM, that allows for creation of metadata and is trainable for a specific domain. Annotating web
CREAM -- Creating relational metadata with a component-based, ontology-driven annotation framework
, 2001
Cited by 98 (18 self)
Richly interlinked, machine-understandable data constitutes the basis for the Semantic Web. Annotating web documents is one of the major techniques for creating metadata on the Web. However, annotation tools so far are restricted in their capabilities of providing richly interlinked and truly machine-understandable data. They basically allow the user to annotate with plain text according to a template structure, such as Dublin Core. We here present CREAM (Creating RElational, Annotation-based Metadata), a framework for an annotation environment that allows the construction of relational metadata, i.e. metadata that comprises class instances and relationship instances. These instances are not based on a fixed structure, but on a domain ontology. We discuss some of the requirements one has to meet when developing such a framework, e.g. the integration of a metadata crawler, inference services, document management and information extraction, and describe its implementation, viz. Ont-O-Mat, a component-based, ontology-driven annotation tool.
Authoring and Annotation of Web Pages in CREAM
, 2002
Cited by 82 (15 self)
Richly interlinked, machine-understandable data constitute the basis for the Semantic Web. We provide a framework, CREAM, that allows for creation of metadata. While the annotation mode of CREAM allows the creation of metadata for existing web pages, the authoring mode lets authors create metadata, almost for free, while putting together the content of a page. As a particularity of our framework, CREAM allows the creation of relational metadata, i.e. metadata that instantiate interrelated definitions of classes in a domain ontology rather than a comparatively rigid template-like schema such as Dublin Core. We discuss some of the requirements one has to meet when developing such an ontology-based framework, e.g. the integration of a metadata crawler, inference services, document management and a meta-ontology, and describe its implementation, viz. Ont-O-Mat, a component-based, ontology-driven Web page authoring and annotation tool.
A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet
- In EKAW-2000 Workshop “Ontologies and Text”, Juan-Les-Pins
, 2000
Cited by 52 (4 self)
Focused access to knowledge resources such as intranet documents plays a vital role in knowledge management and generally supports the shift towards a Semantic Web. Ontologies act as a conceptual backbone for semantic document access by providing a common understanding and conceptualization of a domain. Building domain-specific ontologies is a time-consuming and expensive manual construction task. This paper describes our current and ongoing work in supporting semi-automatic ontology acquisition from a corporate intranet of an insurance company. We present a comprehensive architecture and generic method for discovering a domain-tailored ontology from given intranet resources. 1 Introduction: The amount of information available to corporate employees has grown drastically with the use of intranets. Unfortunately, this growth of available information has made access to useful or necessary information much more difficult, because access is usually based on ...
SEAL - A Framework for Developing SEmantic PortALs
, 2001
Cited by 39 (9 self)
The core idea of the Semantic Web is to make information accessible to human and software agents on a semantic basis. Hence, Web sites may feed directly from the Semantic Web exploiting the underlying structures for human and machine access. We have developed a domain-independent approach for developing semantic portals, SEAL (SEmantic portAL), that exploits semantics for providing and accessing information at a portal as well as constructing and maintaining the portal. In this paper we focus on semantics-based means that make semantic Web sites accessible from the outside, i.e. semantics-based browsing, semantic querying, querying with semantic similarity, and machine access to semantic information. In particular, we focus on methods for acquiring and structuring community information as well as methods for sharing information.
Knowledge Portals -- Ontologies at Work
- AI MAGAZINE
, 2001
Cited by 33 (11 self)
Knowledge portals provide views onto domain-specific information on the World Wide Web, thus helping their users find relevant, domain-specific information. The construction of intelligent access and the provisioning of information to knowledge portals, however, has remained an ad hoc task requiring extensive manual editing and maintenance by the knowledge portal providers. To reduce this effort, we use ontologies as a conceptual backbone for providing, accessing and structuring information in a comprehensive approach for building and maintaining knowledge portals. We present one research and one commercial case study that show how our approach is used in practice.
An Annotation Framework for the Semantic Web
- IN PROCEEDINGS OF THE FIRST WORKSHOP ON MULTIMEDIA ANNOTATION
, 2001
Cited by 30 (2 self)
Creating metadata by annotating documents is one of the major techniques for putting machine-understandable data on the Web. Though there exist many tools for annotating web pages, few of them fully support the creation of semantically interlinked metadata, as is necessary for a truly Semantic Web. In this paper, we present an ontology-based annotation environment, OntoAnnotate, which offers comprehensive support for the creation of semantically interlinked metadata by human annotators.
Bootstrapping an ontology-based information extraction system
- STUDIES IN FUZZINESS AND SOFT COMPUTING, INTELLIGENT EXPLORATION OF THE WEB
, 2002
Cited by 25 (2 self)
Automatic intelligent web exploration will benefit from shallow information extraction techniques if the latter can be brought to work within many different domains. The major bottleneck for this, however, lies in the so far difficult and expensive modeling of lexical knowledge, extraction rules, and an ontology that together define the information extraction system. In this paper we present a bootstrapping approach that allows for the fast creation of an ontology-based information extraction system relying on several basic components, viz. a core information extraction system, an ontology engineering environment and an inference engine. We make extensive use of machine learning techniques to support the semi-automatic, incremental bootstrapping of the domain-specific target information extraction system.
CREAM -- CREAting Metadata for the Semantic Web
, 2003
Cited by 22 (3 self)
Richly interlinked, machine-understandable data constitute the basis for the Semantic Web. We provide a framework, CREAM, that allows for creation of metadata. While the annotation mode of CREAM allows the creation of metadata for existing web pages, the authoring mode lets authors create metadata, almost for free, while putting together the content of a page. As a

