Results 1-10 of 71
Yago: A Large Ontology from Wikipedia and WordNet
, 2007
Cited by 72 (11 self)
This article presents YAGO, a large ontology with high coverage and precision. YAGO has been automatically derived from Wikipedia and WordNet. It comprises entities and relations, and currently contains more than 1.7 million entities and 15 million facts. These include the taxonomic IsA hierarchy as well as semantic relations between entities. The facts for YAGO have been extracted from the category system and the infoboxes of Wikipedia and have been combined with taxonomic relations from WordNet. Type checking techniques help us keep YAGO’s precision at 95% – as proven by an extensive evaluation study. YAGO is based on a clean logical model with a decidable consistency. Furthermore, it allows representing n-ary relations in a natural way while maintaining compatibility with RDFS. A powerful query model facilitates access to YAGO’s data.
YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia
 Commun. ACM
Cited by 43 (12 self)
We are grateful for input from various people’s work: Edwin Lewis-Kelham for implementing the YAGO2 user interface, Gerard de Melo for his help on integrating his Universal WordNet, and Erdal Kuzey for his work on named events and time facts in Wikipedia. We would also like to thank the people who helped evaluate the quality of YAGO2 by manual assessment, most notably, Ndapandula Nakashole, Stephan Seufert, Erdal Kuzey, and
We present YAGO2, an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space. YAGO2 is built automatically from Wikipedia, GeoNames, and WordNet. It contains 80 million facts about 9.8 million entities. Human evaluation confirmed an accuracy of 95% of the facts in YAGO2. In this paper, we present the extraction methodology, the integration of the spatio-temporal dimension, and our knowledge representation SPOTL, an extension of the original SPO-triple
Confidence intervals for a binomial proportion and asymptotic expansions
 Ann. Statist
, 2002
Cited by 20 (1 self)
We address the classic problem of interval estimation of a binomial proportion. The Wald interval $\hat{p} \pm z_{\alpha/2}\, n^{-1/2} (\hat{p}(1-\hat{p}))^{1/2}$ is currently in near universal use. We first show that the coverage properties of the Wald interval are persistently poor and defy virtually all conventional wisdom. We then proceed to a theoretical comparison of the standard interval and four additional alternative intervals by asymptotic expansions of their coverage probabilities and expected lengths. The four additional interval methods we study in detail are the score-test interval (Wilson), the likelihood-ratio-test interval, a Jeffreys prior Bayesian interval and an interval suggested by Agresti and Coull. The asymptotic expansions for coverage show that the first three of these alternative methods have coverages that fluctuate about the nominal value, while the Agresti–Coull interval has a somewhat larger and more nearly conservative coverage function. For the five interval methods we also investigate asymptotically their average coverage relative to distributions for p supported within (0, 1). In terms of expected length, asymptotic expansions show that the Agresti–Coull interval is always the longest of these. The remaining three are rather comparable and are shorter than the Wald interval except for p near 0 or 1. These analytical calculations support and complement the findings and the recommendations in Brown, Cai and DasGupta (Statist. Sci. (2001) 16
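For orientation, the closed-form intervals named in this abstract can be sketched in Python as follows. This is a minimal illustration using the standard textbook formulas, not code from the paper; the function names and the example counts (2 successes in 40 trials) are assumptions chosen for illustration. The likelihood-ratio and Jeffreys intervals are left out because they need numerical root-finding or Beta quantiles.

```python
import math
from statistics import NormalDist


def z_value(conf=0.95):
    """Two-sided standard normal quantile z_{alpha/2} for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def wald_interval(x, n, conf=0.95):
    """Standard (Wald) interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = z_value(conf)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(x, n, conf=0.95):
    """Score-test (Wilson) interval."""
    z = z_value(conf)
    p_hat = x / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def agresti_coull_interval(x, n, conf=0.95):
    """Agresti-Coull interval: Wald formula applied to adjusted counts."""
    z = z_value(conf)
    n_adj = n + z * z
    p_adj = (x + z * z / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half


if __name__ == "__main__":
    # Illustrative counts only (not taken from the paper).
    for name, fn in [("Wald", wald_interval),
                     ("Wilson", wilson_interval),
                     ("Agresti-Coull", agresti_coull_interval)]:
        lo, hi = fn(2, 40)
        print(f"{name:14s} [{lo:.4f}, {hi:.4f}]")
```

With zero observed successes the Wald interval collapses to the single point 0, which is one simple way to see why its coverage is poor near the boundary.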
Confidence Intervals for Probabilities of Default
, 2005
Cited by 9 (3 self)
In this paper we conduct a systematic comparison of confidence intervals around estimated probabilities of default (PD) using several analytical approaches as well as parametric and nonparametric bootstrap methods. We do so for two different PD estimation methods, cohort and duration (intensity), with 22 years of credit ratings data. We find that the bootstrapped intervals for the duration-based estimates are relatively tight when compared to either analytic or bootstrapped intervals around the less efficient cohort estimator. We show how the large differences between the point estimates and confidence intervals of these two estimators are consistent with non-Markovian migration behavior. Surprisingly, even with these relatively tight confidence intervals, it is impossible to distinguish notch-level PDs for investment grade ratings, e.g. a PD_AA from a PD_A+. However, once the speculative grade barrier is crossed, we are able to distinguish quite cleanly notch-level estimated PDs. Conditioning on the state of the business cycle helps: it is easier to distinguish adjacent PDs in recessions than in expansions.
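As a rough illustration of the nonparametric bootstrap idea mentioned above, the sketch below resamples a list of one-period default indicators and reads off a percentile interval for the cohort default rate. It is a deliberately simplified stand-in: the paper works with rating histories and also covers the duration estimator, neither of which is reproduced here. The function name, the seed, and the sample data (20 defaults among 1000 obligors) are all assumptions for illustration.

```python
import random


def bootstrap_pd_interval(defaults, n_boot=2000, conf=0.95, seed=0):
    """Percentile bootstrap interval for a cohort default rate.

    `defaults` is a list of 0/1 indicators, one per obligor (hypothetical input
    format). Obligors are resampled with replacement, which assumes independence.
    """
    rng = random.Random(seed)
    n = len(defaults)
    rates = sorted(sum(rng.choices(defaults, k=n)) / n for _ in range(n_boot))
    lo_idx = int((1 - conf) / 2 * n_boot)
    hi_idx = int((1 + conf) / 2 * n_boot) - 1
    return rates[lo_idx], rates[hi_idx]


if __name__ == "__main__":
    # Illustrative data only: 20 defaults among 1000 obligors.
    sample = [1] * 20 + [0] * 980
    point = sum(sample) / len(sample)
    lo, hi = bootstrap_pd_interval(sample)
    print(f"cohort PD = {point:.4f}, bootstrap interval = [{lo:.4f}, {hi:.4f}]")
```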
Integrating YAGO into the Suggested Upper Merged Ontology
, 2008
Cited by 8 (4 self)
Ontologies are becoming more and more popular as background knowledge for intelligent applications. Up to now, there has been a schism between manually assembled, highly axiomatic ontologies and large, automatically constructed knowledge bases. This paper discusses how the two worlds can be brought together by combining the high-level axiomatizations from the Suggested Upper Merged Ontology (SUMO) with the extensive world knowledge of the YAGO ontology. The result is a new large-scale formal ontology, which provides information about millions of entities such as people, cities, organizations, and companies.
Confidence Intervals for a Binomial Proportion And Edgeworth Expansions
, 1999
Cited by 5 (3 self)
We address the classic problem of interval estimation of a binomial proportion. The Wald interval $\hat{p} \pm z_{\alpha/2}\, n^{-1/2} (\hat{p}(1-\hat{p}))^{1/2}$ is currently in near universal use. We first show that the coverage properties of the Wald interval are persistently poor and defy virtually all conventional wisdom. We then proceed to a theoretical comparison of the standard interval and four additional alternative intervals by asymptotic expansions of their coverage probabilities and expected lengths. Fortunately, the asymptotic expansions are remarkably accurate at rather modest sample sizes, such as n = 40, or sometimes even n = 20. The expansions show that an interval suggested in Agresti and Coull (1998) dominates the score interval (Wilson (1927)), the Jeffreys prior Bayesian interval, and also the standard interval in coverage probability. However, the asymptotic expansions for expected lengths show that the Agresti–Coull interval is always the longest of these, and the Jeffreys pr...
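The coverage claim about the Wald interval can be checked exactly for a given n and p by summing binomial probabilities over the outcomes whose interval contains p. The sketch below does this by brute force rather than by the asymptotic expansions the paper develops; the function name and the grid of p values are assumptions for illustration.

```python
import math
from statistics import NormalDist


def wald_coverage(n, p, conf=0.95):
    """Exact coverage probability of the Wald interval at true proportion p."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    coverage = 0.0
    for x in range(n + 1):
        p_hat = x / n
        half = z * math.sqrt(p_hat * (1 - p_hat) / n)
        # Add P(X = x) whenever the interval built from x contains the true p.
        if p_hat - half <= p <= p_hat + half:
            coverage += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return coverage


if __name__ == "__main__":
    # Illustrative grid at the sample size n = 40 mentioned in the abstract.
    for p in (0.05, 0.1, 0.2, 0.5):
        print(f"n=40, p={p}: coverage = {wald_coverage(40, p):.4f}")
```

At n = 40 and p = 0.05, for instance, the outcome x = 0 alone has probability of roughly 0.13 and yields a degenerate interval that misses p, so the actual coverage falls well short of the nominal 95%.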
Thinking positively
 Risk
, 2005
Cited by 4 (0 self)
How to come up with numerical PD estimates if there are no default observations? Katja Pluto and Dirk Tasche propose a statistically based methodology to derive nonzero probabilities of default for credit portfolios with none or very few observed defaults. Their most prudent estimation principle delivers results for any desired degree of conservatism, and can be applied to both uncorrelated and correlated default events. The estimates could serve as a basis for bank-internal credit risk management and regulatory purposes alike.
DETC2007-35158: Updating Uncertainty Assessments: A Comparison of Statistical Approaches
Cited by 4 (4 self)
The performance of a product that is being designed is affected by variations in material, manufacturing process, use, and environmental variables. As a consequence of uncertainties in these factors, some items may fail. Failure is taken very generally, but we assume that it is a random event that occurs at most once in the lifetime of an item. The designer wants the probability of failure to be less than a given threshold. In this paper, we consider three approaches for modeling the uncertainty in whether or not the failure probability meets this threshold: a classical approach, a precise Bayesian approach, and a robust Bayesian (or imprecise probability) approach. In some scenarios, the designer may have some initial beliefs about the failure probability. The designer also has the opportunity to obtain more information about product performance (e.g. from either experiments with actual items or runs of a simulation program that provides an acceptable surrogate for actual performance). The different approaches for forming and updating the designer's beliefs about the failure probability are illustrated and compared under different assumptions of available information. The goal is to gain insight into the relative strengths and weaknesses of the approaches. Examples are presented for illustrating the conclusions.
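To make the precise Bayesian approach concrete, here is a minimal sketch assuming a Beta prior on the failure probability with integer parameters, updated by binomial test results. The abstract does not state which prior the authors use, so the Beta(1, 1) default, the function names, and the example numbers are assumptions. The Beta CDF is computed through its standard binomial-tail identity, which holds for integer parameters only.

```python
import math


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a, b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    m = a + b - 1
    return sum(math.comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a, m + 1))


def prob_meets_threshold(failures, trials, threshold, prior_a=1, prior_b=1):
    """Posterior probability that the failure probability is below `threshold`.

    Prior is Beta(prior_a, prior_b) (an assumed, hypothetical choice);
    the posterior after the tests is Beta(prior_a + failures,
    prior_b + trials - failures).
    """
    a = prior_a + failures
    b = prior_b + trials - failures
    return beta_cdf_int(threshold, a, b)


if __name__ == "__main__":
    # Illustrative scenario: threshold 0.1, failure-free tests of growing size.
    for n in (10, 20, 50):
        print(f"{n} tests, 0 failures: P(p < 0.1) = {prob_meets_threshold(0, n, 0.1):.4f}")
```

A robust Bayesian variant could repeat the same calculation over a set of priors and report the lower and upper posterior probabilities, though that extension is not shown here.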
Estimating Probabilities of Default for Low Default Portfolios, working paper; available at http://www.defaultrisk.com/pp_score_45.htm
, 2005
Cited by 4 (1 self)
For credit risk management purposes in general, and for allocation of regulatory capital by banks in particular (Basel II), numerical assessments of creditworthiness are indispensable. These assessments are expressed in terms of probabilities of default (PD) that should incorporate a certain degree of conservatism in order to reflect the prudential risk management style banks are required to apply. In the case of credit portfolios that suffered no defaults at all, or only very few defaults over years, the resulting naive zero or close-to-zero estimates would clearly not involve sufficient conservatism. As an attempt to overcome this issue, we suggest the most prudent estimation principle. This means estimating the PDs by upper confidence bounds while guaranteeing at the same time a PD ordering that respects the differences in credit quality indicated by the rating grades. The methodology is most easily applied under an assumption of independent default events but can be adapted to the case of correlated defaults without too much effort.
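The simplest case of the principle described above, no observed defaults and independent default events, can be sketched as follows. With n obligors and no defaults, the one-sided upper confidence bound at level gamma solves (1 - p)^n = 1 - gamma. To respect the PD ordering, the bound for each grade pools that grade with all worse grades. This is a reading of the abstract under those stated assumptions, not the authors' code; the function names, the grade sizes, and gamma = 0.9 are chosen for illustration.

```python
def upper_bound_no_defaults(n, gamma):
    """Upper confidence bound for a PD when n independent obligors show no defaults.

    Solves (1 - p)**n = 1 - gamma for p.
    """
    return 1.0 - (1.0 - gamma) ** (1.0 / n)


def most_prudent_pds(obligors_per_grade, gamma=0.9):
    """Most prudent PD estimates for grades ordered from best to worst quality.

    Each grade's estimate uses the obligors in that grade and in all worse
    grades, so estimates are non-decreasing from best to worst. Assumes every
    grade has at least one obligor and no defaults were observed anywhere.
    """
    estimates = []
    remaining = sum(obligors_per_grade)
    for n in obligors_per_grade:
        estimates.append(upper_bound_no_defaults(remaining, gamma))
        remaining -= n
    return estimates


if __name__ == "__main__":
    # Illustrative portfolio: grades A, B, C with 100, 400 and 300 obligors.
    for grade, pd in zip("ABC", most_prudent_pds([100, 400, 300], gamma=0.9)):
        print(f"grade {grade}: PD upper bound = {pd:.4%}")
```

Cases with a few observed defaults or with correlated defaults need a binomial tail or a correlation model in place of the closed-form bound, and are not covered by this sketch.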