Results 1 - 10 of 260

Corpus-based induction of syntactic structure: Models of dependency and constituency

by Dan Klein - In Proceedings of the 42nd Annual Meeting of the ACL, 2004
"... We present a generative model for the unsupervised learning of dependency structures. We also describe the multiplicative combination of this dependency model with a model of linear constituency. The product model outperforms both components on their respective evaluation metrics, giving the best pu ..."
Cited by 229 (9 self)

Corpus-based schema matching

by Jayant Madhavan, Philip A. Bernstein, Anhai Doan, Alon Halevy - In ICDE, 2005
"... Schema Matching is the problem of identifying corresponding elements in different schemas. Discovering these correspondences or matches is inherently difficult to automate. Past solutions have proposed a principled combination of multiple algorithms. However, these solutions sometimes perform rather ..."
Cited by 167 (18 self)

Collective Latent Dirichlet Allocation

by Zhi-yong Shen, Yi-dong Shen
"... In this paper, we propose a new variant of Latent Dirichlet Allocation (LDA): Collective LDA (C-LDA), for multiple corpora modeling. C-LDA combines multiple corpora during learning such that it can transfer knowledge from one corpus to another; meanwhile it keeps a discriminative node which represent ..."
Cited by 1 (0 self)

Combining Multiple Knowledge Sources for Discourse Segmentation

by Diane J. Litman, Rebecca J. Passonneau - In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, 1995
"... We predict discourse segment boundaries from linguistic features of utterances, using a corpus of spoken narratives as data. We present two methods for developing segmentation algorithms from training data: hand tuning and machine learning. When multiple types of features are used, results approach ..."
Cited by 72 (5 self)

Retrieving Spoken Documents by Combining Multiple Index Sources

by G. J. F. Jones, J. T. Foote, K. Spärck Jones, S. J. Young, 1996
"... This paper presents domain-independent methods of spoken document retrieval. Both a continuous-speech large vocabulary recognition system, and a phone-lattice word spotter, are used to locate index units within an experimental corpus of voice messages. Possible index terms are nearly unconstrained; ..."
Cited by 69 (6 self)

Mining Association Rules in Multiple Relations

by Luc Dehaspe, Luc De Raedt - In Proceedings of the 7th International Workshop on Inductive Logic Programming, 1997
"... The application of algorithms for efficiently generating association rules is so far restricted to cases where information is put together in a single relation. We describe how this restriction can be overcome through the combination of the available algorithms with standard techniques from the field of inductive logic programming. We present the system Warmr, which extends Apriori [2] to mine association rules in multiple relations. We apply Warmr to the natural language processing task of mining part-of-speech tagging rules in a large corpus of English. Keywords: association rules ..."
Cited by 102 (8 self)

Inferring strategies for sentence ordering in multidocument news summarization

by Regina Barzilay, Noemie Elhadad, Kathleen R. McKeown - Journal of Artificial Intelligence Research, 2002
"... The problem of organizing information for multidocument summarization so that the generated summary is coherent has received relatively little attention. While sentence ordering for single document summarization can be determined from the ordering of sentences in the input article, this is not the case for multidocument summarization where summary sentences may be drawn from different input articles. In this paper, we propose a methodology for studying the properties of ordering information in the news genre and describe experiments done on a corpus of multiple acceptable orderings we developed ..."
Cited by 118 (10 self)

Learning Methods for Combining Linguistic Indicators to Classify Verbs

by Eric V. Siegel, 1997
"... Fourteen linguistically-motivated numerical indicators are evaluated for their ability to categorize verbs as either states or events. The values for each indicator are computed automatically across a corpus of text. To improve classification performance, machine learning techniques are employed to combine multiple indicators. Three machine learning methods are compared for this task: decision tree induction, a genetic algorithm, and log-linear regression."
Cited by 52 (3 self)

Corpus support for machine translation at LDC

by Xiaoyi Ma, Christopher Cieri - In Proceedings of LREC 2006: Fifth International Conference on Language Resources and Evaluation, 2006
"... This paper describes LDC's efforts in collecting, creating and processing different types of linguistic data, including lexicons, parallel text, multiple translation corpora, and human assessment of translation quality, to support the research and development in Machine Translation. Through a combination of different procedures and core technologies, the LDC was able to create very large, high quality, and cost-efficient corpora, which have contributed significantly to recent advances in Machine Translation. Multiple translation corpora and human assessment together facilitate, validate ..."
Cited by 7 (1 self)

Corpus-based Evidence for the Construal of Adjectives with Multiple Meanings: The Case of New

by Christine S. Sing
"... Semantically complex linguistic structures have regularly presented a challenge to traditional accounts of semantics as well as cognitive semantics. In this respect, adjectival meaning has proved to be a particular challenge, which results from the mixed properties residing in the category ADJECTIVE ... given that adjectives may assign different types of properties, which, inevitably, will have an impact on their construal. What is more, the findings of the corpus analysis at hand suggest that adjectival meaning cannot be restricted to adjective-noun combinations alone. Therefore, this paper sets out ..."