Yago: A Large Ontology from Wikipedia and WordNet
, 2007
Cited by 43 (11 self)
This article presents YAGO, a large ontology with high coverage and precision. YAGO has been automatically derived from Wikipedia and WordNet. It comprises entities and relations, and currently contains more than 1.7 million entities and 15 million facts. These include the taxonomic Is-A hierarchy as well as semantic relations between entities. The facts for YAGO have been extracted from the category system and the infoboxes of Wikipedia and have been combined with taxonomic relations from WordNet. Type checking techniques help us keep YAGO’s precision at 95% – as proven by an extensive evaluation study. YAGO is based on a clean logical model with a decidable consistency. Furthermore, it allows representing n-ary relations in a natural way while maintaining compatibility with RDFS. A powerful query model facilitates access to YAGO’s data.
Confidence Intervals for a Binomial Proportion And Asymptotic Expansions
, 1999
Cited by 10 (0 self)
We address the classic problem of interval estimation of a binomial proportion. The Wald interval $\hat{p} \pm z_{\alpha/2}\, n^{-1/2} (\hat{p}(1-\hat{p}))^{1/2}$ is currently in near universal use. We first show that the coverage properties of the Wald interval are persistently poor and defy virtually all conventional wisdom. We then proceed to a theoretical comparison of the standard interval and four additional alternative intervals by asymptotic expansions of their coverage probabilities and expected lengths. The four additional interval methods we study in detail are the score-test interval (Wilson (1927)), the likelihood-ratio-test interval, a Jeffreys prior Bayesian interval and an interval suggested in Agresti and Coull (1998). The asymptotic expansions for coverage show that the first three of these alternative methods have coverages that fluctuate about the nominal value, while the Agresti-Coull interval has a somewhat larger and more nearly conservative coverage function. For the five interval methods we al...
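To make the comparison in this abstract concrete, here is a minimal Python sketch of three of the intervals it names: the Wald interval, the Wilson score interval, and the Agresti-Coull interval. The function names and the example counts are illustrative assumptions, not taken from the paper, and the formulas are the standard textbook forms rather than the paper's asymptotic expansions.

```python
from math import sqrt
from statistics import NormalDist


def z_value(conf=0.95):
    """Two-sided standard normal quantile z_{alpha/2} for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def wald_interval(x, n, conf=0.95):
    """Standard (Wald) interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = z_value(conf)
    p = x / n
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_interval(x, n, conf=0.95):
    """Score-test (Wilson) interval."""
    z = z_value(conf)
    p = x / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def agresti_coull_interval(x, n, conf=0.95):
    """Agresti-Coull interval: a Wald-style interval around an adjusted proportion."""
    z = z_value(conf)
    n_adj = n + z * z
    p_adj = (x + z * z / 2) / n_adj
    half = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half


# Hypothetical data: 0 successes in 20 trials.
for name, fn in [("Wald", wald_interval),
                 ("Wilson", wilson_interval),
                 ("Agresti-Coull", agresti_coull_interval)]:
    print(name, fn(0, 20))
```

With zero observed successes the Wald interval collapses to a single point, which is one simple way to see the poor coverage the abstract describes. The Agresti-Coull interval shares its center with the Wilson interval but is never shorter.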
Confidence Intervals for Probabilities of Default
, 2005
Cited by 8 (3 self)
In this paper we conduct a systematic comparison of confidence intervals around estimated probabilities of default (PD) using several analytical approaches as well as parametric and nonparametric bootstrap methods. We do so for two different PD estimation methods, cohort and duration (intensity), with 22 years of credit ratings data. We find that the bootstrapped intervals for the duration based estimates are relatively tight when compared to either analytic or bootstrapped intervals around the less efficient cohort estimator. We show how the large differences between the point estimates and confidence intervals of these two estimators are consistent with non-Markovian migration behavior. Surprisingly, even with these relatively tight confidence intervals, it is impossible to distinguish notch-level PDs for investment grade ratings, e.g. a $PD_{AA-}$ from a $PD_{A+}$. However, once the speculative grade barrier is crossed, we are able to distinguish quite cleanly notch-level estimated PDs. Conditioning on the state of the business cycle helps: it is easier to distinguish adjacent PDs in recessions than in expansions.
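The nonparametric bootstrap mentioned in this abstract can be illustrated for the simplest case, a one-period cohort default rate. The sketch below is not the authors' procedure: the function names, the percentile method, and the toy portfolio of 1,000 obligors with 20 defaults are all assumptions made for illustration.

```python
import random


def cohort_pd(outcomes):
    """Cohort estimator: share of obligors that defaulted (1 = default, 0 = survived)."""
    return sum(outcomes) / len(outcomes)


def bootstrap_pd_interval(outcomes, conf=0.95, replicates=1000, seed=0):
    """Percentile bootstrap interval for the cohort PD.

    Resamples obligors with replacement and reads the interval off the
    empirical quantiles of the re-estimated default rates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    estimates = sorted(
        cohort_pd(rng.choices(outcomes, k=n)) for _ in range(replicates)
    )
    alpha = 1 - conf
    lo_idx = int((alpha / 2) * replicates)
    hi_idx = int((1 - alpha / 2) * replicates) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical portfolio: 20 defaults among 1,000 obligors.
portfolio = [1] * 20 + [0] * 980
point = cohort_pd(portfolio)
low, high = bootstrap_pd_interval(portfolio)
print(point, low, high)
```

Two adjacent rating grades are only distinguishable in this sense when their intervals do not overlap, which is the kind of comparison the abstract reports.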
YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia
- Commun. ACM
Cited by 6 (2 self)
We are grateful for input from various people’s work: Edwin Lewis-Kelham for implementing the YAGO2 user interface, Gerard de Melo for his help on integrating his Universal WordNet, and Erdal Kuzey for his work on named events and time facts in Wikipedia. We would also like to thank the people who helped evaluate the quality of YAGO2 by manual assessment, most notably, Ndapandula Nakashole, Stephan Seufert, Erdal Kuzey, and ...
We present YAGO2, an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space. YAGO2 is built automatically from Wikipedia, GeoNames, and WordNet. It contains 80 million facts about 9.8 million entities. Human evaluation confirmed an accuracy of 95% of the facts in YAGO2. In this paper, we present the extraction methodology, the integration of the spatio-temporal dimension, and our knowledge representation SPOTL, an extension of the original SPO-triple
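As a rough illustration of what extending a subject-predicate-object (SPO) triple with the time and space anchors described in this abstract might look like, here is a small Python sketch. The class name, field names, and the example fact are hypothetical and do not reflect YAGO2's actual storage format or relation names.

```python
from typing import NamedTuple, Optional


class SpotlFact(NamedTuple):
    """An SPO triple with optional time and location annotations (illustrative only)."""
    subject: str
    predicate: str
    obj: str
    time: Optional[str] = None      # assumed: an ISO-style date string
    location: Optional[str] = None  # assumed: a place identifier

    def as_triple(self):
        """Drop the time and location anchors to recover the plain SPO triple."""
        return (self.subject, self.predicate, self.obj)


# Hypothetical identifiers, chosen only for illustration.
fact = SpotlFact("Albert_Einstein", "wasBornIn", "Ulm",
                 time="1879-03-14", location="Ulm")
print(fact.as_triple())
```

Keeping the two extra fields optional lets untimed, unlocated facts remain ordinary triples, which matches the abstract's description of SPOTL as an extension of the SPO model rather than a replacement.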
Integrating YAGO into the Suggested Upper Merged Ontology
, 2008
Cited by 5 (4 self)
Ontologies are becoming more and more popular as background knowledge for intelligent applications. Up to now, there has been a schism between manually assembled, highly axiomatic ontologies and large, automatically constructed knowledge bases. This paper discusses how the two worlds can be brought together by combining the high-level axiomatizations from the Suggested Upper Merged Ontology (SUMO) with the extensive world knowledge of the YAGO ontology. The result is a new large-scale formal ontology, which provides information about millions of entities such as people, cities, organizations, and companies.
DETC2007-35158 UPDATING UNCERTAINTY ASSESSMENTS: A COMPARISON OF STATISTICAL APPROACHES
Cited by 4 (4 self)
The performance of a product that is being designed is affected by variations in material, manufacturing process, use, and environmental variables. As a consequence of uncertainties in these factors, some items may fail. Failure is taken very generally, but we assume that it is a random event that occurs at most once in the lifetime of an item. The designer wants the probability of failure to be less than a given threshold. In this paper, we consider three approaches for modeling the uncertainty in whether or not the failure probability meets this threshold: a classical approach, a precise Bayesian approach, and a robust Bayesian (or imprecise probability) approach. In some scenarios, the designer may have some initial beliefs about the failure probability. The designer also has the opportunity to obtain more information about product performance (e.g. from either experiments with actual items or runs of a simulation program that provides an acceptable surrogate for actual performance). The different approaches for forming and updating the designer's beliefs about the failure probability are illustrated and compared under different assumptions of available information. The goal is to gain insight into the relative strengths and weaknesses of the approaches. Examples are presented for illustrating the conclusions.
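One common way to realize a precise Bayesian update of a failure probability is a conjugate Beta prior with binomial test data. The sketch below assumes that model, which the abstract does not specify, and uses hypothetical function names, prior parameters, and test counts. It computes the posterior probability that the failure probability lies below the designer's threshold by simple midpoint integration.

```python
from math import exp, lgamma, log


def update_beta(a, b, failures, trials):
    """Conjugate update: Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return a + failures, b + (trials - failures)


def prob_below(threshold, a, b, steps=20000):
    """P(failure probability < threshold) under Beta(a, b), by the midpoint rule.

    Assumes a >= 1, b >= 1 and 0 < threshold <= 1 so the density stays bounded.
    """
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    h = threshold / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += exp(log_norm + (a - 1) * log(p) + (b - 1) * log(1 - p))
    return total * h


# Hypothetical scenario: uniform prior, then 10 tests with no failures.
prior = (1.0, 1.0)
posterior = update_beta(*prior, failures=0, trials=10)
print(prob_below(0.1, *prior))      # belief before testing
print(prob_below(0.1, *posterior))  # belief after testing
```

A robust Bayesian variant would repeat this calculation over a set of priors and report the resulting range of probabilities instead of a single number.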
Thinking positively
- Risk
, 2005
Cited by 4 (0 self)
How to come up with numerical PD estimates if there are no default observations? Katja Pluto and Dirk Tasche propose a statistically based methodology to derive non-zero probabilities of default for credit portfolios with none or very few observed defaults. Their most prudent estimation principle delivers results for any desired degree of conservatism, and can be applied to both uncorrelated and correlated default events. The estimates could serve as a basis for bank internal credit risk management and regulatory purposes alike.
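A simple way to see how a non-zero PD can come out of zero observed defaults is an upper confidence bound for a binomial proportion. The sketch below covers only a single pool of obligors with independent defaults, and the function names and example numbers are assumptions. It is not the full methodology of the article, which also addresses several rating grades and correlated defaults.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def upper_pd_bound(defaults, n, gamma=0.9, tol=1e-12):
    """Largest PD still compatible with the data at confidence level gamma.

    Solves P(X <= defaults | n, p) = 1 - gamma for p by bisection; the
    binomial CDF is decreasing in p, so the root is unique.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(defaults, n, mid) > 1 - gamma:
            lo = mid  # bound can still be pushed higher
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical pool: 100 obligors, no defaults, 90% confidence.
print(upper_pd_bound(0, 100, gamma=0.9))
```

For zero defaults the equation reduces to (1 - p)^n = 1 - gamma, so the bound has the closed form 1 - (1 - gamma)^(1/n). Raising gamma gives a more conservative estimate.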
Reducing conservatism of exact small-sample methods of inference for discrete data
- TH SYMPOSIUM OF THE IASC, ROME 28 AUGUST - 1
, 2006
Dynamic Probability Estimator for Machine Learning
- IEEE Trans. on Neural Networks
Cited by 1 (1 self)
An efficient algorithm for dynamic estimation of probabilities, without division, on an unlimited number of input data is presented. The method estimates probabilities of the sampled data from the raw sample count, while keeping the total count value constant. Accuracy of the estimate depends on the counter size, rather than on the total number of data points. The estimator follows variations of the incoming data probability within a fixed window size, without explicit implementation of the windowing technique. The total design area is very small and all probabilities are estimated concurrently. The dynamic probability estimator was implemented using a programmable gate array from Xilinx. The performance of this implementation is evaluated in terms of the area efficiency and execution time. This method is suitable for the highly integrated design of artificial neural networks where a large number of dynamic probability estimators can work concurrently. Index Terms: Classification, entropy, machine learning, neural network hardware, probability estimator.

