Using the Web to Obtain Frequencies for Unseen Bigrams
Computational Linguistics, 2003
Cited by 104 (2 self)
This article shows that the Web can be employed to obtain frequencies for bigrams that are unseen in a given corpus. We describe a method for retrieving counts for adjective-noun, noun-noun, and verb-object bigrams from the Web by querying a search engine. We evaluate this method by demonstrating: (a) a high correlation between Web frequencies and corpus frequencies; (b) a reliable correlation between Web frequencies and plausibility judgments; (c) a reliable correlation between Web frequencies and frequencies recreated using class-based smoothing; (d) a good performance of Web frequencies in a pseudodisambiguation task.
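As a rough illustration of evaluation step (a), the sketch below computes a Pearson correlation between log-transformed Web counts and corpus counts. The counts are hypothetical toy values, and both the log transform and the 0.5 offset for zero counts are assumptions of this sketch, not details taken from the abstract.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def log_counts(counts):
    """Log-transform counts; the 0.5 offset keeps zero counts finite
    (an assumption of this sketch)."""
    return [math.log10(c + 0.5) for c in counts]


# Hypothetical toy data: one entry per bigram, not real measurements.
web_counts = [120000, 4500, 300, 0, 88000]
corpus_counts = [340, 12, 1, 0, 150]

r = pearson(log_counts(web_counts), log_counts(corpus_counts))
print(f"correlation of log counts: {r:.3f}")
```

A rank-based coefficient could be substituted for `pearson` if the counts are too skewed for a linear measure.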
Probabilistic Syntax
2002
Cited by 27 (1 self)
istic methods for syntax, just as for a long time McCarthy and Hayes (1969) discouraged exploration of probabilistic methods in Artificial Intelligence. Among his arguments were that: (i) Probabilistic models wrongly mix in world knowledge (New York occurs more in text than Dayton, Ohio, but for no linguistic reason), (ii) Probabilistic models don't model grammaticality (neither Colorless green ideas sleep furiously nor Furiously sleep ideas green colorless have previously been uttered -- and hence must be estimated to have probability zero, Chomsky wrongly assumes -- but the former is grammatical while the latter is not), and (iii) Use of probabilities does not meet the goal of describing the mind-internal I-language as opposed to the observed-in-the-world E-language. This chapter is not meant to be a detailed critique of Chomsky's arguments -- Abney (1996) provides a survey and a rebuttal, and Pereira (2000) has further useful discussion -- but some of these concerns are still importa
Using the Web to Overcome Data Sparseness
In Proceedings of EMNLP-02, 2002
Cited by 25 (1 self)
This paper shows that the web can be employed to obtain frequencies for bigrams that are unseen in a given corpus. We describe a method for retrieving counts for adjective-noun, noun-noun, and verb-object bigrams from the web by querying a search engine. We evaluate this method by demonstrating that web frequencies correlate with frequencies obtained from a carefully edited, balanced corpus.
Determinants of Adjective-Noun Plausibility
1999
Cited by 16 (5 self)
This paper explores the determinants of adjective-noun plausibility by using correlation analysis to compare judgements elicited from human subjects with five corpus-based variables: co-occurrence frequency of the adjective-noun pair, noun frequency, conditional probability of the noun given the adjective, the log-likelihood ratio, and Resnik's (1993) selectional association measure. The highest correlation is obtained with the co-occurrence frequency, which points to the strongly lexicalist and collocational nature of adjective-noun combinations.
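Two of the corpus-based variables named above can be sketched from raw counts. The snippet below computes the conditional probability of the noun given the adjective and a log-likelihood ratio over a 2x2 contingency table. Using the G² form of the statistic is an assumption here, since the abstract does not say which formulation was used, and all function and parameter names are hypothetical.

```python
import math


def conditional_probability(pair_count, adjective_count):
    """Estimate P(noun | adjective) as count(adj, noun) / count(adj)."""
    return pair_count / adjective_count


def log_likelihood_ratio(pair_count, adjective_count, noun_count, total_pairs):
    """G^2 statistic for an adjective-noun pair from a 2x2 contingency table.

    Assumption of this sketch: G^2 = 2 * sum(O * ln(O / E)) over the four
    cells, with expected counts E taken from the row and column totals.
    """
    o11 = pair_count                      # adjective with this noun
    o12 = adjective_count - pair_count    # adjective with other nouns
    o21 = noun_count - pair_count         # other adjectives with this noun
    o22 = total_pairs - adjective_count - noun_count + pair_count
    observed = [[o11, o12], [o21, o22]]
    row_totals = [o11 + o12, o21 + o22]
    col_totals = [o11 + o21, o12 + o22]

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                e = row_totals[i] * col_totals[j] / total_pairs
                g2 += o * math.log(o / e)
    return 2.0 * g2


# Hypothetical counts for one adjective-noun pair.
print(conditional_probability(50, 100))
print(log_likelihood_ratio(50, 100, 100, 1000))
```

When the pair occurs exactly as often as independence predicts, the statistic is zero; larger values indicate a stronger association between the adjective and the noun.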
WebExp: A Java Toolbox for Web-Based Psychological Experiments - Users' Guide for WebExp 2.1
1998
Cited by 16 (8 self)
This User's Guide explains the installation and use of WebExp, a set of Java classes for conducting psychological experiments over the World Wide Web. The WebExp toolbox consists of two modules: the WebExp server, which is a stand-alone Java application, and the WebExp client, which is implemented as a Java applet. The server application runs on the Web server that hosts the experiment, and waits for client applets to connect to it. The client runs remotely on the machine of the experimental participant. It administers the experiment and connects to the server application to download the experimental stimuli, and to store the subject's responses. WebExp offers the following features for conducting Web-based experiments: Two experimental paradigms are supported: magnitude estimation and sentence completion. Both within-subject and between-subject designs can be used. Automatic subject authentication is achieved by conducting basic plausibility checks on the subject's data ...
A Probabilistic Account of Logical Metonymy
2003
Cited by 15 (1 self)
In this article we investigate logical metonymy, that is, constructions in which the argument of a word in syntax appears to be different from that argument in logical form (e.g., enjoy the book means enjoy reading the book, and easy problem means a problem that is easy to solve). The systematic variation in the interpretation of such constructions suggests a rich and complex theory of composition on the syntax/semantics interface. Linguistic accounts of logical metonymy typically fail to describe exhaustively all the possible interpretations, or they don't rank those interpretations in terms of their likelihood. In view of this, we acquire the meanings of metonymic verbs and adjectives from a large corpus and propose a probabilistic model that provides a ranking on the set of possible interpretations. We identify the interpretations automatically by exploiting the consistent correspondences between surface syntactic cues and meaning. We evaluate our results against paraphrase judgments elicited experimentally from humans and show that the model's ranking of meanings correlates reliably with human intuitions.
Evaluating and Combining Approaches to Selectional Preference Acquisition
In Proc. of the EACL, 2003
Cited by 15 (0 self)
Previous work on the induction of selectional preferences has been mainly carried out for English and has concentrated almost exclusively on verbs and their direct objects. In this paper, we focus on class-based models of selectional preferences for German verbs and take into account not only direct objects, but also subjects and prepositional complements. We evaluate model performance against human judgments and show that there is no single method that overall performs best. We explore a variety of parametrizations for our models and demonstrate that model combination enhances agreement with human ratings.
Gradient Well-Formedness in Optimality Theory
1998
Cited by 15 (5 self)
A minor modification in the framework of Optimality Theory (Prince and Smolensky 1993) is suggested which enables it to model phenomena where consultant intuitions are gradient, falling somewhere between complete well-formedness and complete ill-formedness. The proposal consists of assigning to certain constraints bands of values along a reified continuum of constraint strictness. When a particular form can be generated only by assigning a constraint a strictness value within a designated “fringe” of the strictness band, the grammar generates the form marked with an intermediate degree of well-formedness. The proposal is tested against data involving light and dark /l/ in American English, using a set of gradient intuitions obtained from ten native speaker consultants. A
Constraints on linguistic coreference: structural vs. pragmatic factors
Proceedings of the 23rd annual conference of the Cognitive Science Society, Mahawa, 2001
Cited by 10 (3 self)
Binding theory is the component of grammar that regulates the interpretation of noun phrases. Certain syntactic configurations involving picture noun phrases (PNPs) are problematic for the standard formulation of binding theory, which has prompted competing proposals for revisions of the theory. Some authors have proposed an account based on structural constraints, while others have argued that anaphors in PNPs are exempt from binding theory, but subject to pragmatic restrictions. In this paper, we present an experimental study that aims to resolve this dispute. The results show that structural factors govern the binding possibilities in PNPs, while pragmatic factors play only a limited role. However, the structural factors identified differ from the ones standardly assumed.
The robustness of critical period effects in second language acquisition
Studies in Second Language Acquisition 22, 2000
Cited by 8 (0 self)
This study was designed to test the Fundamental Difference Hypothesis (Bley-Vroman, 1988), which states that, whereas children are known to learn language almost completely through (implicit) domain-specific mechanisms, adults have largely lost the ability to learn a language without reflecting on its structure and have to use alternative mechanisms, drawing especially on their problem-solving capacities, to learn a second language. The hypothesis implies that only adults with a high level of verbal analytical ability will reach near-native competence in their second language, but that this ability will not be a significant predictor of success for childhood second language acquisition. A study with 57 adult Hungarian-speaking immigrants confirmed the hypothesis in the sense that very few adult immigrants scored within the range of child arrivals on a grammaticality judgment test, and that the few who did had high levels of verbal analytical ability; this ability was not a significant predictor for

