OntoMiner: automated metadata and instance mining from news websites
Cited by 2 (0 self)
Abstract: RDF/XML has been widely recognised as the standard for annotating online web documents and for transforming the HTML web into the so-called Semantic Web. In order to enable widespread usability of the Semantic Web, there is a need to bootstrap large, rich and up-to-date domain ontologies that organise the most relevant concepts, their relationships and instances. In this paper, we present automated techniques for bootstrapping and populating specialised domain ontologies by organising and mining a set of relevant overlapping websites. We develop algorithms that detect and utilise HTML regularities in the web documents to turn them into hierarchical semantic structures encoded as XML. Next, we present tree-mining algorithms that identify key domain concepts and their taxonomical relationships. We also extract semi-structured concept instances annotated with their labels whenever they are available. We also report experimental evaluation for the news, travel and shopping domains to demonstrate the efficacy of our algorithms.
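The idea of finding concepts shared by overlapping websites can be illustrated with a minimal sketch. It is not the paper's tree-mining algorithm: it assumes each site has already been converted into a nested dictionary of labels, and it simply keeps the parent-child label pairs that occur in at least `min_support` sites. The names `edges`, `frequent_edges`, `site_trees`, and `min_support`, and the sample labels, are hypothetical.

```python
from collections import Counter


def edges(tree, parent=None):
    """Yield (parent_label, child_label) pairs from a nested-dict tree."""
    for label, children in tree.items():
        if parent is not None:
            yield (parent, label)
        yield from edges(children, label)


def frequent_edges(site_trees, min_support):
    """Return the edges that occur in at least min_support site trees."""
    support = Counter()
    for tree in site_trees:
        # A set ensures each edge is counted at most once per site.
        support.update(set(edges(tree)))
    return {edge for edge, count in support.items() if count >= min_support}


# Hypothetical label hierarchies for three overlapping news sites.
site_trees = [
    {"News": {"Sports": {"Tennis": {}}, "Business": {}}},
    {"News": {"Sports": {"Golf": {}}, "Business": {}, "Weather": {}}},
    {"News": {"Sports": {"Tennis": {}}, "Travel": {}}},
]

result = frequent_edges(site_trees, min_support=2)
```

With these sample trees, the pairs that survive are the ones seen on at least two sites, which can be read as candidate taxonomic (parent, child) relationships.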
Technická univerzita v Košiciach (Technical University of Košice)
, 2004
excellent black tea from York, which has been keeping me awake for the last couple of weeks during my hectic schedule, but he has also found time to review the document and make grammatical corrections. And last but not least, thanks to my always supportive sister, parents and friends. Sorry for putting you all on one side for a while -- will try to make it up to you in the near future.
Thesis title: Semiautomatická konštrukcia ontológií z textov (Semi-automatic Construction of Ontologies from Texts)
Department: Department of Cybernetics and Artificial Intelligence, FEI, Technical University of Košice
Author: Dávid Celjuska
Thesis supervisor: Ing. Ján Paralič, PhD.
Thesis consultant: Dr. Maria Vargas-Vera, KMi -- Knowledge Media Institute, The Open University, United Kingdom
Date: 9 May 2004
Keywords: system for semi-automatic population of ontologies with new instances, ontology construction, semi-automatic population of ontologies with new instances, instance, text, rule confidence, computation of rule confidence, natural ...
Prontolearn: Unsupervised . . . GENERATION USING PROBABILISTIC METHODS
, 2010
An ontology is a formal, explicit specification of a shared conceptualization [1, 2]. Formalizing an ontology for a domain is a tedious and cumbersome process, constrained by the knowledge acquisition bottleneck (KAB). A large number of text corpora exist that can be used for classification in order to create ontologies, with the intention of providing better support for the intended parties. In our research we provide a novel unsupervised, bottom-up ontology generation method. The method is based on lexico-semantic structures and Bayesian reasoning, which expedite the ontology generation process. The process also provides evidence that domain experts can use to build ontologies with top-down approaches.
A Strategy to support Information Extraction from Natural Language at production time
, 2008
Toward the vision of intelligent authoring tools and question answering systems, it is important to find a strategy that accelerates information extraction (IE) so that it can occur on the fly. This thesis presents a novel rule-based system that extracts surface-level information from domain-specific texts using lightweight natural language processing techniques. A practical implementation of this system is provided as a component of the General Architecture for Text Engineering (GATE). The system achieves an F-measure of over 84% on the named entity recognition task at a processing speed of around 65 kilobytes of text per second.
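For readers unfamiliar with the metric, the F-measure quoted above combines precision and recall into a single score. The sketch below uses the standard weighted harmonic mean. The abstract does not report the thesis's precision and recall, so the input values here are purely illustrative assumptions, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing is correct
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Illustrative values only; they are not taken from the thesis.
score = f_measure(precision=0.8, recall=0.9)
print(f"F1 = {score:.3f}")
```

Because the harmonic mean is dominated by the smaller of the two inputs, a system cannot reach a high F-measure by maximising precision or recall alone.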

