Semi-automatic data-driven ontology construction system
- PASCAL EPRINTS (2006) HTTP://EPRINTS.PASCAL-NETWORK.ORG/PERL/OAI2. – WORKING GROUP SUMMARY 15
, 2006
Cited by 15 (4 self)
In this paper we present a new version of the OntoGen system for semi-automatic, data-driven ontology construction. The system is based on a novel ontology learning framework which formalizes and extends the role of the machine learning and text mining algorithms used in the previous version. New features include a larger set of supported ontology formats (RDFS and OWL), supervised methods for concept discovery (based on active learning), the addition of new instances to an ontology, and an improved user interface (based on comments from users).
ADVANCING TOPIC ONTOLOGY LEARNING THROUGH TERM EXTRACTION
Cited by 1 (0 self)
This paper presents a novel methodology for topic ontology learning from text documents. The proposed methodology, named OntoTermExtraction, is based on OntoGen, a semi-automated tool for topic ontology construction, upgraded by using an advanced terminology extraction tool in an iterative, semi-automated ontology construction process. This process consists of (a) document clustering to find the nodes in the topic ontology, (b) term extraction from document clusters, (c) populating the term vocabulary and keyword extraction, and (d) choosing the concept names by comparing the best ranked terms with the
Exploring Wikipedia and DMoz as Knowledge Bases for Engineering a User Interests Hierarchy for Social Network Applications
The growth of social networks in recent years has created opportunities for interesting data mining problems, such as interest or friendship recommendations. A global ontology over the interests specified by the users of a social network is essential for accurate recommendations. We propose, evaluate and compare three approaches to engineering a hierarchical ontology over user interests. The proposed approaches make use of two popular knowledge bases, Wikipedia and Directory Mozilla, to extract interest definitions and/or relationships between interests. More precisely, the first approach uses Wikipedia to find interest definitions, the latent semantic analysis technique to measure the similarity between interests based on their definitions, and an agglomerative clustering algorithm to group similar interests into higher-level concepts. The second approach uses the Wikipedia Category Graph to extract relationships between interests, while the third approach uses Directory Mozilla to extract relationships between interests. Our results show that the third approach, although the simplest, is the most effective for building a hierarchy over user interests.
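The first approach in the abstract above (definition similarity followed by agglomerative clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes plain bag-of-words cosine similarity for latent semantic analysis, uses average linkage as an assumed merge criterion, and the interest definitions, the function names and the threshold are all hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case whitespace tokenisation into a term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def agglomerate(definitions, threshold):
    """Average-linkage agglomerative clustering over interest definitions.

    Repeatedly merges the two most similar clusters until no pair has an
    average similarity of at least `threshold`.
    """
    vectors = {name: bag_of_words(text) for name, text in definitions.items()}
    clusters = [[name] for name in sorted(vectors)]

    def avg_sim(c1, c2):
        total = sum(cosine(vectors[x], vectors[y]) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = avg_sim(clusters[i], clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        if best[0] < threshold:
            break
        _, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical interest definitions, invented for illustration only.
definitions = {
    "soccer": "team sport played with a ball on a field",
    "basketball": "team sport played with a ball on a court",
    "painting": "visual art made with paint on canvas",
    "drawing": "visual art made with pencil on paper",
}
groups = agglomerate(definitions, threshold=0.5)
print(groups)
```

Each merged group would correspond to a candidate higher-level concept; recording the merge order instead of stopping at a threshold would yield a full hierarchy.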
A Service Oriented Framework for Natural Language Text
, 2009
This paper describes a text enrichment framework and the corresponding document representation model that integrates natural language processing, information extraction, entity resolution, automatic document categorization and summarization. We also describe the implementation of the framework and give several illustrative use cases where the service-oriented approach has proven to be useful. Summary: A framework for enriching natural-language text is described.
The Open University KMI Annotating Knowledge Resources
1.1 Motivation and Research Problem
1.1.1 A motivating scenario
1.1.2 Theoretical constraints
Automated Text Classification in the DMOZ Hierarchy - Project Plan
, 2009
The goal of this project is to build a text classifier [2, 4, 7] for the DMOZ hierarchy of classification labels. Based on previous successes [3, 5, 6], the project will focus on non-parametric methods such as nearest-neighbour algorithms, exploring different feature representations. Various approaches will be implemented and evaluated in the initial stage, focusing on the use of the classification hierarchy to improve performance and on the development of language-independent classification techniques. In the latter stage, work will focus on making the most successful approach(es) as efficient as possible.
1.2 Motivation
The growth in the availability of on-line digital text documents has spurred considerable interest in Information Retrieval and Text Classification. The Internet in particular represents a considerable opportunity for many corporations and individuals to exchange ideas and to access products and services. Automating the management of this wealth of Internet hypertext is becoming increasingly important as the volume of new material continues to grow at a substantial rate. The DMOZ open directory project [1] is an on-line service which provides a searchable and browsable, hierarchically organised directory to facilitate access to the Internet's resources. DMOZ is a collaborative effort of over 56,000 volunteers who contribute and renew Internet content for a growing list of over 718 thousand categories. This represents a considerable convenience for users of the Internet and also a valuable resource for Data Mining applications.
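The nearest-neighbour approach named in the project goal can be sketched as follows. This is only an illustrative baseline under stated assumptions, not the project's classifier: it uses a flat label set rather than the DMOZ hierarchy, raw term frequencies as the feature representation, cosine similarity as the distance measure, and a tiny hypothetical training set. The names `vectorize` and `knn_classify` are invented for this example.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (one possible feature representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(training, text, k=3):
    """Return the majority label among the k most similar training documents."""
    query = vectorize(text)
    scored = sorted(
        ((cosine(query, vectorize(doc)), label) for doc, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled documents; real DMOZ categories are hierarchical.
training = [
    ("football league match scores", "Sports"),
    ("tennis tournament match results", "Sports"),
    ("basketball league playoff scores", "Sports"),
    ("python programming language tutorial", "Computers"),
    ("java programming compiler tutorial", "Computers"),
    ("linux kernel programming guide", "Computers"),
]
label = knn_classify(training, "programming tutorial for the rust language", k=3)
print(label)
```

Because the method is non-parametric, there is no training step: all cost is paid at query time, which is why efficiency becomes a concern as the number of labelled documents grows.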

