Results 1 - 10 of 13,565

Building a Large Annotated Corpus of English: The Penn Treebank

by Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz - Computational Linguistics, 1993
"... There is a growing consensus that significant, rapid progress can be made in both text understanding and spoken language understanding by investigating those phenomena that occur most centrally in naturally occurring unconstrained materials and by attempting to automatically extract information abou ..."
Cited by 2740 (10 self)
and comparison of the adequacy of parsing models. In this paper, we review our experience with constructing one such large annotated corpus--the Penn Treebank, a corpus consisting of over 4.5 million words of American English. During the first three-year phase of the Penn Treebank Project (1989

Building A Large Annotated Corpus of English: The Penn Treebank

by Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz, 1993
"... In this paper, we review our experience with constructing one such large annotated corpus--the Penn Treebank, a corpus consisting of over 4.5 million words of American English. During the first three-year phase of the Penn Treebank Project (1989-1992), this corpus has been annotated for part-of-spee ..."

Building a large annotated corpus of English: the Penn Treebank

by Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz, 1993
"... this paper, we review our experience with constructing one such large annotated corpus---the Penn Treebank, a corpus ..."
Cited by 2 (0 self)

A large annotated corpus for learning natural language inference

by Samuel R. Bowman, Gabor Angeli, Christopher Potts, Christopher D. Manning
"... Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations. However, machine learning research in this area has been dramatically limi ..."
Cited by 1 (0 self)
dramatically limited by the lack of large-scale resources. To address this, we introduce the Stanford Natural Language Inference corpus, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. At 570K pairs, it is two orders of magnitude

Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English

by Daniel Dahlmeier, Hwee Tou Ng, Siew Mei Wu
"... We describe the NUS Corpus of Learner English (NUCLE), a large, fully annotated corpus of learner English that is freely available for research purposes. The goal of the corpus is to provide a large data resource for the development and evaluation of grammatical error correction systems. Although NU ..."
Cited by 11 (0 self)

The Proposition Bank: An Annotated Corpus of Semantic Roles

by Martha Palmer, Paul Kingsbury, Daniel Gildea - Computational Linguistics, 2005
"... The Proposition Bank project takes a practical approach to semantic representation, adding a layer of predicate-argument information, or semantic role labels, to the syntactic structures of the Penn Treebank. The resulting resource can be thought of as shallow, in that it does not represent corefere ..."
Cited by 556 (22 self)
coreference, quantification, and many other higher-order phenomena, but also broad, in that it covers every instance of every verb in the corpus and allows representative statistics to be calculated. We discuss the criteria used to define the sets of semantic roles used in the annotation process

LabelMe: A Database and Web-Based Tool for Image Annotation

by B. C. Russell, A. Torralba, K. P. Murphy, W. T. Freeman, 2008
"... We seek to build a large collection of images with ground truth labels to be used for object detection and recognition research. Such data is useful for supervised learning and quantitative evaluation. To achieve this, we developed a web-based tool that allows easy image annotation and instant sha ..."
Cited by 679 (46 self)

ImageNet: A large-scale hierarchical image database

by Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei - In CVPR, 2009
"... The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce her ..."
Cited by 840 (28 self)
of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image

A Maximum Entropy Model for Part-Of-Speech Tagging

by Adwait Ratnaparkhi, 1996
"... This paper presents a statistical model which trains from a corpus annotated with Part-Of-Speech tags and assigns them to previously unseen text with state-of-the-art accuracy (96.6%). The model can be classified as a Maximum Entropy model and simultaneously uses many contextual "features" t ..."
Cited by 580 (1 self)

The Berkeley FrameNet Project

by Collin F. Baker, Charles J. Fillmore, John B. Lowe - In Proceedings of the COLING-ACL, 1998
"... FrameNet is a three-year NSF-supported project in corpus-based computational lexicography, now in its second year (NSF IRI-9618838, "Tools for Lexicon Building"). The project's key features are (a) a commitment to corpus evidence for semantic and syntactic generalizations, and (b) the repr ..."
Cited by 643 (3 self)
(semantic and syntactic) of several thousand words and phrases, each accompanied by (c) a representative collection of annotated corpus attestations, which jointly exemplify the observed linkings between "frame elements" and their syntactic realizations (e.g. grammatical function, phrase type

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University