Theoretical foundations for enabling a web of knowledge
- In Foundations of Information and Knowledge Systems, Sixth International Symposium (FoIKS 2010) (accepted paper), 2010
Cited by 4 (4 self)
Abstract. The current web is a web of linked pages. Frustrated users
AUTOMATIC EXTRACTION FROM AND REASONING ABOUT GENEALOGICAL RECORDS: A PROTOTYPE
Cited by 1 (0 self)
There is great interest in family history research on the web, and a great many competing genealogical websites contain large amounts of data-rich, unstructured, primary genealogical records. The problem is that, even after these records are made machine-readable, it is labor-intensive for humans to make them searchable. What we need are computer tools that can automatically produce indices and databases from the data-rich, unstructured genealogical records and can identify individuals and events, determine relationships, and put families together. We propose here a possible solution: a specialized ontology, built specifically for the information extraction of primary genealogical records, with expert logic and rules to infer genealogical facts and assemble relationship links between persons with respect to the genealogical events in their lives. The deliverables of this solution are a set of specialized extraction ontologies used to extract parish or town records, a marked-up version of the original document, a data file of individuals and events, and rules used to define family relationships and manipulate the data file. The solution also provides the ability to query over the rules and data files. An evaluation of the prototype solution shows that the extraction has good recall and precision results and that inferred facts are correct.
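As a rough illustration of the kind of rule-based inference this abstract describes, the following Python sketch derives sibling and grandparent links from extracted child-parent facts. The data layout, the names, and the two rules are hypothetical and are not taken from the prototype itself.

```python
from itertools import permutations

# Hypothetical extracted facts: (child, parent) pairs, as might be
# produced from christening entries in a parish record.
CHILD_PARENT = {
    ("Anna", "Hans"),
    ("Peter", "Hans"),
    ("Hans", "Jakob"),
}


def infer_siblings(child_parent):
    """Rule: two distinct people who share a parent are siblings."""
    children_by_parent = {}
    for child, parent in child_parent:
        children_by_parent.setdefault(parent, set()).add(child)
    siblings = set()
    for children in children_by_parent.values():
        # permutations yields ordered pairs of distinct children
        for a, b in permutations(sorted(children), 2):
            siblings.add((a, b))
    return siblings


def infer_grandparents(child_parent):
    """Rule: a parent of a person's parent is that person's grandparent."""
    return {
        (child, grandparent)
        for child, parent in child_parent
        for middle, grandparent in child_parent
        if parent == middle
    }


if __name__ == "__main__":
    print(sorted(infer_siblings(CHILD_PARENT)))
    print(sorted(infer_grandparents(CHILD_PARENT)))
```

A real system would likely express such rules declaratively and attach them to the ontology; plain functions are used here only to keep the sketch self-contained.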
Ontologies for Multilingual Extraction
In our global society, multilingual barriers sometimes prohibit and often discourage people from accessing a wider variety of goods and services. We propose multilingual extraction ontologies as an approach to resolving these issues. Our ontologies provide a conceptual framework for a narrow domain of interest. Grounding narrow-domain ontologies linguistically enables them to map relevant utterances and text to meaningful concepts in the ontology. Our prior work includes leveraging large-scale lexicons and terminology resources for grounding and augmenting ontological content [14]. Linguistically grounding ontologies in multiple languages enables cross-language communication within the scope of the various ontologies' domains. We quantify the success of linguistically grounded ontologies by measuring precision and recall of extracted concepts, and we can gauge the success of automated cross-linguistic-mapping construction by measuring the speed of creation and the accuracy of generated lexical resources.
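The precision and recall measures mentioned in this abstract can be sketched as follows. This is a minimal Python illustration using the standard set-based definitions; the function name and the example concept labels are assumptions, not part of the authors' evaluation code.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted concepts against a gold standard.

    precision = correct extractions / all extractions
    recall    = correct extractions / all gold-standard concepts
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical concept labels for a narrow-domain ontology.
    extracted = {"Price", "Make", "Model", "Color"}
    gold = {"Price", "Make", "Year"}
    p, r = precision_recall(extracted, gold)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four extracted concepts are correct, and two of the three gold-standard concepts are found.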
KBB: A Knowledge-Bundle Builder for Bio-Research
We propose research into and development of a “Knowledge-Bundle Builder for Bio-Research.” We direct our proposed research at the broad Challenge Area 04: Clinical Research and the specific Challenge Topic 04-NS-102: Developing web-based entry and data-management tools for clinical research. The volume of biological data is enormous and increasing rapidly. Unfortunately, the information a bio-researcher needs is scattered across various repositories and the published literature. To do their work, bio-researchers need a system that can efficiently locate, extract, and organize available bio-information so that it can be analyzed and scientific hypotheses can be verified. Currently, bio-researchers manually search for information of interest in thousands of data sources (either online repositories or publications) to achieve their goals. This process is tedious and time-consuming. As a specific example, to do a recent study about associations between lung cancer and TP53 polymorphism, researchers needed to: (1) do a keyword-based search on the SNP data repository for “tp53” within organism “homo sapiens”; (2) from the returned records, open each record page one by one and find those coding SNPs that have a minor allele frequency greater than 1%; (3) for each qualifying SNP, record the SNP ID and many properties of the SNP; (4)
Management
Building a database of facts extracted from historical documents to enable database-like query and search would reduce the tedium of gleaning facts of interest from historical documents. We propose a solution in which the historical documents themselves constitute the stored database. In our solution, we use information-extraction techniques to produce a conceptualized external annotation of facts found in each document, and we superimpose the conceptualization over the document collection. The annotation process populates the conceptualization, producing a repository of extracted facts, and a reasoner obtains inferred facts from these extracted facts. Our query interface accepts free-form queries and converts them to formal queries over the extracted and inferred facts. In addition to standard query results, displayed results include images of the original documents with results highlighted, along with reasoning chains for inferred facts grounded in these highlighted facts. Along with giving the implementation status of our proof-of-concept prototype, we present results for extraction accuracy and efficiency and point to current and future work needed to enable a practical solution for the envisioned historical-document database.

