Yago: A Large Ontology from Wikipedia and WordNet (2007)
Cited by 87 (12 self)
Abstract
This article presents YAGO, a large ontology with high coverage and precision. YAGO has been automatically derived from Wikipedia and WordNet. It comprises entities and relations, and currently contains more than 1.7 million entities and 15 million facts. These include the taxonomic IsA hierarchy as well as semantic relations between entities. The facts for YAGO have been extracted from the category system and the infoboxes of Wikipedia and have been combined with taxonomic relations from WordNet. Type checking techniques help us keep YAGO's precision at 95%, as proven by an extensive evaluation study. YAGO is based on a clean logical model with a decidable consistency. Furthermore, it allows representing n-ary relations in a natural way while maintaining compatibility with RDFS. A powerful query model facilitates access to YAGO's data.
YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia (2010)
Cited by 46 (11 self)
Abstract
We present YAGO2, an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space. YAGO2 is built automatically from Wikipedia, GeoNames, and WordNet. It contains 80 million facts about 9.8 million entities. Human evaluation confirmed an accuracy of 95% of the facts in YAGO2. In this paper, we present the extraction methodology, the integration of the spatiotemporal dimension, and our knowledge representation SPOTL, an extension of the original SPO triple …
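The SPOTL idea above extends subject-predicate-object triples with time and location components. A minimal illustrative sketch in Python (the tuple fields, helper names, and example facts are our own choices for illustration, not YAGO2's actual schema):

```python
from typing import NamedTuple, Optional, Tuple

class SpotlFact(NamedTuple):
    """A fact as a (Subject, Predicate, Object, Time, Location) tuple."""
    subject: str
    predicate: str
    obj: str
    time: Optional[Tuple[int, int]] = None   # validity interval (from_year, to_year)
    location: Optional[str] = None           # geo-entity anchoring the fact

# Hypothetical facts, for illustration only.
facts = [
    SpotlFact("Albert_Einstein", "wasBornIn", "Ulm", time=(1879, 1879), location="Ulm"),
    SpotlFact("Albert_Einstein", "hasWonPrize", "Nobel_Prize_in_Physics", time=(1921, 1921)),
]

def holds_during(fact: SpotlFact, year: int) -> bool:
    """True if the fact's validity interval covers the given year."""
    return fact.time is not None and fact.time[0] <= year <= fact.time[1]

# Spatiotemporal filtering that plain SPO triples cannot express:
born_1879 = [f for f in facts if f.predicate == "wasBornIn" and holds_during(f, 1879)]
```

The point of the extra two components is exactly this kind of query: restricting facts by when and where they hold.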
Confidence intervals for a binomial proportion and asymptotic expansions (Ann. Statist., 2002)
Cited by 26 (1 self)
Abstract
We address the classic problem of interval estimation of a binomial proportion. The Wald interval p̂ ± z_{α/2} n^{−1/2} (p̂(1 − p̂))^{1/2} is currently in near universal use. We first show that the coverage properties of the Wald interval are persistently poor and defy virtually all conventional wisdom. We then proceed to a theoretical comparison of the standard interval and four additional alternative intervals by asymptotic expansions of their coverage probabilities and expected lengths. The four additional interval methods we study in detail are the score-test interval (Wilson), the likelihood-ratio-test interval, a Jeffreys prior Bayesian interval and an interval suggested by Agresti and Coull. The asymptotic expansions for coverage show that the first three of these alternative methods have coverages that fluctuate about the nominal value, while the Agresti–Coull interval has a somewhat larger and more nearly conservative coverage function. For the five interval methods we also investigate asymptotically their average coverage relative to distributions for p supported within (0, 1). In terms of expected length, asymptotic expansions show that the Agresti–Coull interval is always the longest of these. The remaining three are rather comparable and are shorter than the Wald interval except for p near 0 or 1. These analytical calculations support and complement the findings and the recommendations in Brown, Cai and DasGupta (Statist. Sci. (2001) 16 …
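The intervals compared above all have closed forms, and because the number of successes is discrete, exact coverage at any p is a finite binomial sum. A hedged Python sketch (our own illustration, not the authors' code; z = 1.96 gives nominal 95% intervals):

```python
import math

Z = 1.959963985  # z_{alpha/2} for nominal 95% coverage

def wald(x, n, z=Z):
    """Standard (Wald) interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p = x / n
    h = z * math.sqrt(p * (1 - p) / n)
    return p - h, p + h

def wilson(x, n, z=Z):
    """Score-test (Wilson) interval."""
    p = x / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    h = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - h, center + h

def agresti_coull(x, n, z=Z):
    """Agresti-Coull interval: Wald form after adding z^2/2 successes and failures."""
    n_t = n + z * z
    p_t = (x + z * z / 2) / n_t
    h = z * math.sqrt(p_t * (1 - p_t) / n_t)
    return p_t - h, p_t + h

def coverage(interval, n, p):
    """Exact coverage at p: total binomial mass of the x whose interval contains p."""
    return sum(
        math.comb(n, x) * p ** x * (1 - p) ** (n - x)
        for x in range(n + 1)
        if interval(x, n)[0] <= p <= interval(x, n)[1]
    )
```

Running `coverage` on a grid of p values reproduces the qualitative picture described in the abstract: the Wald interval's coverage collapses near the boundary (at x = 0 it degenerates to a single point), while the Wilson and Agresti–Coull intervals stay close to nominal, with Agresti–Coull never shorter than Wilson.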
Confidence Intervals for Probabilities of Default (2005)
Cited by 15 (4 self)
Abstract
In this paper we conduct a systematic comparison of confidence intervals around estimated probabilities of default (PD) using several analytical approaches as well as parametric and nonparametric bootstrap methods. We do so for two different PD estimation methods, cohort and duration (intensity), with 22 years of credit ratings data. We find that the bootstrapped intervals for the duration-based estimates are relatively tight when compared to either analytic or bootstrapped intervals around the less efficient cohort estimator. We show how the large differences between the point estimates and confidence intervals of these two estimators are consistent with non-Markovian migration behavior. Surprisingly, even with these relatively tight confidence intervals, it is impossible to distinguish notch-level PDs for investment grade ratings, e.g. a PD_AA from a PD_A+. However, once the speculative grade barrier is crossed, we are able to distinguish quite cleanly notch-level estimated PDs. Conditioning on the state of the business cycle helps: it is easier to distinguish adjacent PDs in recessions than in expansions.
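The nonparametric bootstrap interval for the cohort estimator amounts to resampling the default indicators with replacement and taking percentiles of the re-estimated PDs. A minimal sketch on made-up data (illustrative only; the paper additionally covers duration-based estimation and parametric bootstraps):

```python
import random

def bootstrap_pd_interval(defaults, n_boot=2000, alpha=0.05, seed=42):
    """Nonparametric percentile-bootstrap interval around a cohort PD estimate.

    defaults: list of 0/1 default indicators, one per obligor.
    Returns (point_estimate, (lower, upper)).
    """
    rng = random.Random(seed)
    n = len(defaults)
    boot = []
    for _ in range(n_boot):
        # Resample obligors with replacement and re-estimate the PD.
        resample = [defaults[rng.randrange(n)] for _ in range(n)]
        boot.append(sum(resample) / n)
    boot.sort()
    lower = boot[int(n_boot * alpha / 2)]
    upper = boot[int(n_boot * (1 - alpha / 2)) - 1]
    return sum(defaults) / n, (lower, upper)

# Hypothetical cohort: 1000 obligors with 5 observed defaults.
pd_hat, (lo, hi) = bootstrap_pd_interval([1] * 5 + [0] * 995)
```

Even this toy case shows why distinguishing adjacent investment-grade PDs is hard: with few defaults, the interval is wide relative to the point estimate.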
DEMPSTER-SHAFER INFERENCE WITH WEAK BELIEFS
Cited by 14 (11 self)
Abstract
Beliefs specified for predicting an unobserved realization of pivotal variables in the context of the fiducial and Dempster-Shafer (DS) inference can be weakened for credible inference. We consider predictive random sets for predicting an unobserved random sample from a known distribution, e.g., the uniform distribution U(0, 1). More specifically, we choose our beliefs for inference in two steps: (i) define a class of weak beliefs in terms of DS models for predicting an unobserved sample, and (ii) seek a belief within that class to balance the tradeoff between credibility and efficiency of the resulting DS inference. We call this approach the Maximal Belief (MB) method. The MB method is illustrated with two examples: (1) inference about µ based on a sample of size n from the Gaussian model N(µ, 1), and (2) inference about the number of outliers (µi ≠ 0) based on the observed data X1, ..., Xn with the model Xi ∼ N(µi, 1), independently. The first example shows that MB-DS analysis does a type of conditional inference. The second example demonstrates that MB posterior probabilities are easy to interpret for hypothesis testing.
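The flavor of the first example can be sketched by predicting the pivotal U = Φ(X − µ) with a random set rather than a point. The sketch below uses a simple symmetric predictive random set (our own choice for illustration, not necessarily the paper's maximal belief) and checks the calibration property that the MB method trades off against efficiency:

```python
import math
import random

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def plausibility(mu0, x):
    """Plausibility that mu = mu0 given one observation x ~ N(mu, 1),
    when the pivotal U = Phi(x - mu) is predicted by the symmetric
    random set S(U) = {u : |u - 1/2| <= |U - 1/2|} (illustrative choice)."""
    u = phi(x - mu0)
    return 1.0 - abs(2.0 * u - 1.0)

# Calibration check: at the true mu, the plausibility is Uniform(0, 1),
# so rejecting mu0 whenever pl(mu0) <= alpha errs with frequency ~ alpha.
rng = random.Random(0)
alpha = 0.1
trials = 20000
rejections = sum(
    plausibility(0.0, rng.gauss(0.0, 1.0)) <= alpha for _ in range(trials)
)
error_rate = rejections / trials
```

The Monte Carlo check illustrates what "credible inference" buys: however the predictive random set is chosen within the weak-belief class, the resulting plausibilities are not overconfident at the truth.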
Integrating YAGO into the Suggested Upper Merged Ontology (2008)
Cited by 11 (6 self)
Abstract
Ontologies are becoming more and more popular as background knowledge for intelligent applications. Up to now, there has been a schism between manually assembled, highly axiomatic ontologies and large, automatically constructed knowledge bases. This paper discusses how the two worlds can be brought together by combining the high-level axiomatizations from the Suggested Upper Merged Ontology (SUMO) with the extensive world knowledge of the YAGO ontology. The result is a new large-scale formal ontology, which provides information about millions of entities such as people, cities, organizations, and companies.
Estimating Probabilities of Default for Low Default Portfolios (working paper, 2005; available at http://www.defaultrisk.com/pp_score_45.htm)
Cited by 6 (1 self)
Abstract
For credit risk management purposes in general, and for allocation of regulatory capital by banks in particular (Basel II), numerical assessments of creditworthiness are indispensable. These assessments are expressed in terms of probabilities of default (PD) that should incorporate a certain degree of conservatism in order to reflect the prudential risk management style banks are required to apply. In the case of credit portfolios that suffered no defaults at all, or only very few defaults, over the years, the resulting naive zero or close-to-zero estimates would clearly not involve sufficient conservatism. As an attempt to overcome this issue, we suggest the most prudent estimation principle. This means estimating the PDs by upper confidence bounds while guaranteeing at the same time a PD ordering that respects the differences in credit quality indicated by the rating grades. The methodology is most easily applied under an assumption of independent default events but can be adapted to the case of correlated defaults without too much effort.
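Under the independence assumption, the most prudent estimation principle reduces to binomial upper confidence bounds computed on pooled grades: the bound for grade i uses all obligors and all defaults of grade i and every worse grade, which enforces the required PD ordering. A hedged sketch with made-up portfolio data:

```python
import math

def binom_cdf(d, n, p):
    """P(X <= d) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(d + 1))

def upper_bound_pd(n, d, gamma=0.9):
    """Smallest p with P(X <= d; n, p) <= 1 - gamma: the one-sided
    upper confidence bound at level gamma, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if binom_cdf(d, n, mid) > 1.0 - gamma:
            lo = mid
        else:
            hi = mid
    return hi

def most_prudent_pds(obligors, defaults, gamma=0.9):
    """PD bound per grade (best grade first): grade i pools grades i..K."""
    return [
        upper_bound_pd(sum(obligors[i:]), sum(defaults[i:]), gamma)
        for i in range(len(obligors))
    ]

# Made-up portfolio: three grades, best first, with 0/1/2 observed defaults.
pds = most_prudent_pds([100, 80, 60], [0, 1, 2], gamma=0.9)
```

In the zero-default case the bound has the closed form 1 − (1 − γ)^(1/n), so even a portfolio with no defaults at all receives a strictly positive, conservatively chosen PD.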
Confidence Intervals for a Binomial Proportion and Edgeworth Expansions (1999)
Cited by 5 (4 self)
Abstract
We address the classic problem of interval estimation of a binomial proportion. The Wald interval p̂ ± z_{α/2} n^{−1/2} (p̂(1 − p̂))^{1/2} is currently in near universal use. We first show that the coverage properties of the Wald interval are persistently poor and defy virtually all conventional wisdom. We then proceed to a theoretical comparison of the standard interval and four additional alternative intervals by asymptotic expansions of their coverage probabilities and expected lengths. Fortunately, the asymptotic expansions are remarkably accurate at rather modest sample sizes, such as n = 40, or sometimes even n = 20. The expansions show that an interval suggested in Agresti and Coull (1998) dominates the score interval (Wilson (1927)), the Jeffreys prior Bayesian interval, and also the standard interval in coverage probability. However, the asymptotic expansions for expected lengths show that the Agresti–Coull interval is always the longest of these, and the Jeffreys pr...
Thinking positively (Risk, 2005)
Cited by 4 (0 self)
Abstract
How can one come up with numerical PD estimates if there are no default observations? Katja Pluto and Dirk Tasche propose a statistically based methodology to derive nonzero probabilities of default for credit portfolios with no or very few observed defaults. Their most prudent estimation principle delivers results for any desired degree of conservatism, and can be applied to both uncorrelated and correlated default events. The estimates could serve as a basis for bank-internal credit risk management and regulatory purposes alike.