An Adaptation of the Vector-Space Model for Ontology-Based Information Retrieval
, 2006
Cited by 46 (19 self)
Semantic search has been one of the motivations of the Semantic Web since it was envisioned. We propose a model for the exploitation of ontology-based knowledge bases to improve search over large document repositories. In our view of Information Retrieval on the Semantic Web, a search engine returns documents rather than, or in addition to, exact values in response to user queries. For this purpose, our approach includes an ontology-based scheme for the semiautomatic annotation of documents, and a retrieval system. The retrieval model is based on an adaptation of the classic vector-space model, including an annotation weighting algorithm, and a ranking algorithm. Semantic search is combined with conventional keyword-based retrieval to achieve tolerance to knowledge base incompleteness. Experiments are shown where our approach is tested on corpora of significant scale, showing clear improvements with respect to keyword-based search.
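The abstract names three ingredients (annotation weighting, vector-space ranking, and a combination with keyword search) without giving formulas. The following is a minimal sketch of how such pieces could fit together, not the authors' actual model: the TF-IDF-style weighting, the cosine ranking, the linear mix, and every name (`annotation_weights`, `cosine`, `combined_score`, `alpha`) are assumptions made for illustration.

```python
import math


def annotation_weights(instance_counts, n_docs, doc_freq):
    """TF-IDF-style weights for the ontology instances annotating one document.

    instance_counts: {instance: occurrences in this document}
    n_docs: total number of documents in the collection
    doc_freq: {instance: number of documents annotated with it}
    (Assumed weighting scheme; the abstract does not specify one.)
    """
    max_count = max(instance_counts.values())
    return {
        inst: (count / max_count) * math.log(n_docs / doc_freq[inst])
        for inst, count in instance_counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(key, 0.0) for key, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def combined_score(semantic, keyword, alpha=0.5):
    """Mix semantic and keyword scores (hypothetical linear combination).

    When the knowledge base yields no semantic evidence, the keyword
    score is used alone, which is one way to tolerate KB incompleteness.
    """
    if semantic == 0.0:
        return keyword
    return alpha * semantic + (1 - alpha) * keyword
```

In this sketch a document with no matching annotations is still ranked by its keyword score instead of dropping out of the result list.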
An ontology-based information retrieval model
- In ESWC
, 2005
Cited by 20 (6 self)
Semantic search has been one of the motivations of the Semantic Web since it was envisioned. We propose a model for the exploitation of ontology-based KBs to improve search over large document repositories. Our approach includes an ontology-based scheme for the semi-automatic annotation of documents, and a retrieval system. The retrieval model is based on an adaptation of the classic vector-space model, including an annotation weighting algorithm, and a ranking algorithm. Semantic search is combined with keyword-based search to achieve tolerance to KB incompleteness. Our proposal is illustrated with sample experiments showing improvements with respect to keyword-based search, and providing ground for further research and discussion.
Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language
- In SIGMOD ’10: Proceedings of International Conference on Management of Data
, 2010
Cited by 8 (1 self)
Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KB), with data and schema items numbering in the hundreds of thousands or millions. Formulating information needs with conventional structured query languages is difficult due to the sheer size of schema information available to the user. We address this challenge by proposing a new query language that blends keyword search with structured query processing over large information graphs with rich semantics.
A Categorization Scheme for Semantic Web Search Engines
, 2006
Cited by 6 (1 self)
Semantic web search engines are evolving, and many prototype systems and some implementations have been developed. However, there are differing views on what a semantic search engine should do. In this paper, a categorization scheme for semantic web search engines is introduced and elaborated. For each category, its components are described according to a proposed general architecture, and the various approaches employed in these components are discussed. We also propose some factors to evaluate systems in each category.
Search on the semantic web
- IEEE Computer
, 2005
Cited by 6 (0 self)
The Semantic Web provides a way to encode information and knowledge on web pages in a form that is easier for computers to understand and process. This article discusses the issues underlying the discovery, indexing and search over web documents that contain semantic web markup. Unlike conventional Web search engines, which use information retrieval techniques designed for documents of unstructured text, Semantic Web search engines must handle documents comprised of semi-structured data. Moreover, the meaning of data is defined by associated ontologies that are also encoded as semantic web documents whose processing may require a significant amount of reasoning. We describe Swoogle, an implemented semantic web search engine that discovers, analyzes, and indexes knowledge encoded in semantic web documents throughout the Web, and illustrate its use to help human users and software agents find relevant knowledge.
Information retrieval support for ontology construction and use
- in Proceedings 3rd International Semantic Web Conference (ISWC
, 2004
Cited by 4 (3 self)
Information retrieval can contribute towards the construction of ontologies and the effective usage of ontologies. We use collocation-based keyword extraction to suggest new concepts, and study the generation of hyperlinks to automate the population of ontologies with instances. We evaluate our methods within the setting of a digital library project, using information retrieval evaluation methodology. Within the same setting we study retrieval methods that complement the navigational support offered by the semantic relations in most ontologies to help users explore the ontology.
Ontosearch: A full-text search engine for the semantic web
- In Proc. of the 21st National Conf. on Artificial Intelligence and the 18th Innovative Applications of Artificial Intelligence Conf
, 2006
Cited by 4 (0 self)
OntoSearch, a full-text search engine that exploits ontological knowledge for document retrieval, is presented in this paper. Unlike other ontology-based search engines, OntoSearch does not require a user to specify the associated concepts of his/her queries. The domain ontology in OntoSearch is in the form of a semantic network. Given a keyword-based query, OntoSearch infers the related concepts through a spreading activation process in the domain ontology. To provide personalized information access, we further develop algorithms to learn and exploit a user ontology model based on a customized view of the domain ontology. The proposed system has been applied to the domain of searching scientific publications in the ACM Digital Library. The experimental results support the efficacy of the OntoSearch system in using domain ontology and user ontology for enhanced search performance.
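Spreading activation over a semantic network can be illustrated with a short generic sketch. This is not OntoSearch's algorithm, which the abstract does not detail: the graph encoding, the `decay` and `threshold` parameters, the step limit, and the function name `spread_activation` are all assumptions chosen for illustration.

```python
def spread_activation(graph, seeds, decay=0.5, threshold=0.1, max_steps=3):
    """Propagate activation from seed concepts through a weighted graph.

    graph: {concept: {neighbor: edge_weight}}
    seeds: {concept: initial activation}, e.g. concepts matched by query keywords
    Returns {concept: accumulated activation}.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed < threshold:
                    continue  # too weak to keep spreading along this edge
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        if not next_frontier:
            break  # nothing left to propagate
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


# Toy semantic network (hypothetical concepts and weights).
network = {
    "search": {"retrieval": 1.0},
    "retrieval": {"ranking": 0.8, "index": 0.1},
}
related = spread_activation(network, {"search": 1.0})
```

With these toy weights, "retrieval" and "ranking" receive activation while "index" falls below the threshold and is never activated. The activated concepts could then be used to expand the keyword query.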
Produce and Consume Linked Data with Drupal!
Cited by 3 (0 self)
Currently a large number of Web sites are driven by Content Management Systems (CMS), which manage textual and multimedia content but also inherently carry valuable information about a site’s structure and content model. Exposing this structured information to the Web of Data has so far required considerable expertise in RDF and OWL modelling and additional programming effort. In this paper we tackle one of the most popular CMS: Drupal. We enable site administrators to export their site content model and data to the Web of Data without requiring extensive knowledge of Semantic Web technologies. Our modules create RDFa annotations and, optionally, a SPARQL endpoint for any Drupal site out of the box. Likewise, we add the means to map the site data to existing ontologies on the Web with a search interface to find commonly used ontology terms. We also allow a Drupal site administrator to include existing RDF data from remote SPARQL endpoints on the Web in the site. When brought together, these features allow networked RDF Drupal sites that reuse and enrich Linked Data. We finally discuss the adoption of our modules and report on a use case in the biomedical field and the current status of its deployment.
NPBibSearch - an Ontology Augmented Bibliographic Search Engine, http://ipc755.inf-nf.uni.jena.de/NPBibSearch
- In Proc. of SWAP 2005, the 2nd Italian Semantic Web Workshop
, 2005
Cited by 2 (2 self)
Ontologies are considered to be the state-of-the-art technology for the development and evolution of the Semantic Web. Today, the use of semantic markup in the World Wide Web (WWW) is rather poor. Therefore, search engines and software agents often use external ontologies for applying information retrieval tasks in the WWW. We have developed NPBibSearch, an ontology augmented search engine tool for bibliographical search in the restricted domain of NP-complete problems, an important subject in theoretical computer science. In connection with the keyword-based full-text retrieval of the Google web APIs service, NPBibSearch searches the database of the Electronic Colloquium of Computational Complexity (ECCC), guided by a simple ontology driven navigation tool that unfolds the domain of NP-complete decision problems to the user.
Combining Semantics, Context, and Statistical Evidence in Genomics Literature Search
Cited by 2 (1 self)
We present an information retrieval model for combining evidence from concept-based semantics, term statistics, and context for improving search precision of genomics literature by accurately identifying concise, variable length passages of text to answer a user query. The system combines a dimensional data model for indexing scientific literature at multiple levels of document structure and context with a rule-based query processing algorithm. The query processing algorithm uses an iterative information extraction technique to identify query concepts, and a retrieval function for systematically combining concepts with term statistics at multiple levels of context. We define context by variable length passages of text and different levels of document lexical structure including terms, sentences, paragraphs, and entire documents. Our results demonstrate improved search results in the presence of varying levels of semantic evidence, and higher performance using retrieval functions that combine document as well as sentence and passage level information versus using document, sentence or passage level information alone. Initial results are promising. When ranking documents based on the most relevant extracted passages, the results exceed the state-of-the-art by 13.89% as assessed by the TREC 2005 Genomics track collection of 4.5 million MEDLINE citations.

