Results 1 - 10 of 68
Integrating topics and syntax
- In Advances in Neural Information Processing Systems 17, 2005
Cited by 89 (12 self)
Statistical approaches to language learning typically focus on either short-range syntactic dependencies or long-range semantic dependencies between words. We present a generative model that uses both kinds of dependencies, and can be used to simultaneously find syntactic classes and semantic topics despite having no representation of syntax or semantics beyond statistical dependency. This model is competitive on tasks like part-of-speech tagging and document classification with models that exclusively use short- and long-range dependencies respectively.
Topics in semantic representation
- Psychological Review, 2007
Cited by 48 (8 self)
Processing language requires the retrieval of concepts from memory in response to an ongoing stream of information. This retrieval is facilitated if one can infer the gist of a sentence, conversation, or document and use that gist to predict related concepts and disambiguate words. This article analyzes the abstract computational problem underlying the extraction and use of gist, formulating this problem as a rational statistical inference. This leads to a novel approach to semantic representation in which word meanings are represented in terms of a set of probabilistic topics. The topic model performs well in predicting word association and the effects of semantic association and ambiguity on a variety of language-processing and memory tasks. It also provides a foundation for developing more richly structured statistical models of language, as the generative process assumed in the topic model can easily be extended to incorporate other kinds of semantic and syntactic structure.
ABL: Alignment-Based Learning
- 2000
Cited by 29 (1 self)
This paper introduces a new type of grammar learning algorithm, inspired by string edit distance (Wagner and Fischer, 1974). The algorithm takes a corpus of flat sentences as input, and returns a corpus of labelled, bracketed sentences. The method works on pairs of unstructured sentences that have one or more words in common. When two sentences are divided into parts that are the same in both sentences and parts that are different, this information is used to find parts that are interchangeable. These parts are taken as possible constituents of the same type. After this alignment learning step, the selection learning step selects the most probable constituents from all possible constituents. This method was used to bootstrap structure on the ATIS corpus (Marcus et al., 1993) and on the OVIS corpus (Bonnema et al., 1997). While the results are encouraging (we obtained up to 89.25% non-crossing brackets precision), this paper will point out some of the shortcomings of our approach and will suggest possible solutions.
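The alignment learning step in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses Python's `difflib.SequenceMatcher` as a stand-in for the edit-distance alignment the abstract mentions, and the function name `align_hypotheses` and the example sentences are invented for illustration. It handles only word spans that are substituted for one another between shared context, and it omits the selection learning step entirely.

```python
from difflib import SequenceMatcher


def align_hypotheses(sent_a, sent_b):
    """Return pairs of differing word spans from two aligned sentences.

    Hypothetical helper: each returned pair is a candidate for
    "interchangeable parts", i.e. possible constituents of the same type.
    Sentences with no words in common yield no hypotheses.
    """
    a, b = sent_a.split(), sent_b.split()
    # autojunk=False keeps difflib from discarding frequent tokens.
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    opcodes = matcher.get_opcodes()

    # The method described above only applies when the pair shares words.
    if not any(tag == "equal" for tag, *_ in opcodes):
        return []

    pairs = []
    for tag, i1, i2, j1, j2 in opcodes:
        # "replace" marks spans that differ between otherwise equal parts.
        if tag == "replace":
            pairs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return pairs
```

Calling `align_hypotheses("show me flights from Boston to Dallas", "show me flights from Atlanta to Dallas")` pairs `Boston` with `Atlanta` as one candidate of the same constituent type.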
Extracting semantic representations from word co-occurrence statistics: A computational study
- Behavior Research Methods, 2007
Cited by 25 (2 self)
In a previous paper we presented a systematic computational study of the extraction of semantic representations from the word-word co-occurrence statistics of large text corpora. The conclusion was that semantic vectors of Pointwise Mutual Information (PMI) values from very small co-occurrence windows, together with a cosine distance measure, consistently resulted in the best representations across a range of psychologically relevant semantic tasks. This paper extends that study by investigating the use of three further factors, namely the application of stop-lists, word stemming, and dimensionality reduction using Singular Value Decomposition (SVD), that have been used to provide improved performance elsewhere. It also introduces an additional semantic task and explores the advantages of using a much larger corpus. This leads to the discovery and analysis of improved SVD-based methods for generating semantic representations (that provide new state-of-the-art performance on a standard TOEFL task) and the identification and discussion of problems and misleading results that can arise without a full systematic study.
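A minimal sketch of the representation this abstract describes, PMI vectors built from a small co-occurrence window and compared by cosine, might look as follows. It is an illustration under stated assumptions, not the paper's procedure: the function names `pmi_vectors` and `cosine`, the default window of one word on each side, and the choice to treat unobserved word-context pairs as zero are all assumptions, and stop-lists, stemming, and SVD are left out.

```python
import math
from collections import Counter, defaultdict


def pmi_vectors(tokens, window=1):
    """Build sparse PMI vectors from word-word co-occurrence counts.

    Assumption: only observed (word, context) pairs get a component;
    unobserved pairs are implicitly zero.
    """
    pair_counts = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_counts[(word, tokens[j])] += 1

    total = sum(pair_counts.values())
    word_totals, ctx_totals = Counter(), Counter()
    for (word, ctx), n in pair_counts.items():
        word_totals[word] += n
        ctx_totals[ctx] += n

    vectors = defaultdict(dict)
    for (word, ctx), n in pair_counts.items():
        # PMI = log( p(word, ctx) / (p(word) * p(ctx)) )
        vectors[word][ctx] = math.log(
            (n * total) / (word_totals[word] * ctx_totals[ctx])
        )
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(x * v[k] for k, x in u.items() if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

In a toy token stream such as `"the cat sat the dog sat".split()`, `cat` and `dog` occur in identical contexts, so their vectors point in the same direction and `cosine` rates them as maximally similar.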
Becoming Syntactic
Cited by 24 (1 self)
Psycholinguistic research has shown that the influence of abstract syntactic knowledge on performance is shaped by particular sentences that have been experienced. To explore this idea, the authors applied a connectionist model of sentence production to the development and use of abstract syntax. The model makes use of (a) error-based learning to acquire and adapt sequencing mechanisms and (b) meaning–form mappings to derive syntactic representations. The model is able to account for most of what is known about structural priming in adult speakers, as well as key findings in preferential looking and elicited production studies of language acquisition. The model suggests how abstract knowledge and concrete experience are balanced in the development and use of syntax.
Poverty of the stimulus? A rational approach
- In the Proceedings of the 2006 Cognitive Science conference, 2006
Cited by 22 (5 self)
The Poverty of the Stimulus (PoS) argument holds that children do not receive enough evidence to infer the existence of core aspects of language, such as the dependence of linguistic rules on hierarchical phrase structure. We reevaluate one version of this argument with a Bayesian model of grammar induction, and show that a rational learner without any initial language-specific biases could learn this dependency given typical child-directed input. This choice enables the learner to master aspects of syntax, such as the auxiliary fronting rule in interrogative formation, even without having heard directly relevant data (e.g., interrogatives containing an auxiliary in a relative clause in the subject NP).
Environmental Determinants of Lexical Processing Effort
- 2000
Cited by 15 (2 self)
A central concern of psycholinguistic research is explaining the relative ease or difficulty involved in processing words. In this thesis, we explore the connection between lexical processing effort and measurable properties of the linguistic environment. Distributional information (information about a word's contexts of use) is easily extracted from large language corpora in the form of co-occurrence statistics. We claim that such simple distributional statistics can form the basis of a parsimonious model of lexical processing effort.
Neural network processing of natural language: I. Sensitivity to serial, temporal and abstract structure of language in the infant
- 2000
The Acquisition of Word Meaning through Global Lexical Co-occurrences
- 2000
Cited by 13 (2 self)
Introduction: The acquisition of word meaning has been extensively studied for the last thirty years in the field of language acquisition. However, the question of how children acquire word meaning remains highly controversial today. Recently, a number of computational studies have examined the emergence of lexical representations in connectionist networks or similar statistical systems, suggesting that word meaning can be acquired by the computation of statistical regularities inherent in the input data. In particular, Elman (1990, 1998) showed that categories of nouns and verbs, and subcategories of animates versus inanimates (within nouns), and transitives versus intransitives (within verbs), can emerge from the network's computing of the lexical co-occurrence properties in the input. Redington, Chater, and Finch (1998) also demonstrated that the use of distributional properties in a large-scale speech corpus allows a statistical system to acquire syntactic categories. These s
Addressing the Learnability of Verb Subcategorizations with Bayesian Inference
- 2000
Cited by 11 (1 self)
Elman (1993) has shown that simple syntactic systems can be learned solely on the basis of distributions of words in text presentation. However, Pinker (1989) has proposed that children must make use of verbs' semantic representations in order to infer their syntactic subcategorizations (semantic bootstrapping). Results reported here demonstrate how Bayesian statistical inference can provide an alternative, and much simpler, account of how subcategorizations are learned. The acquisition mechanism described here suggests that syntactic acquisition may involve a much larger component of learning, and less innate knowledge, than is presumed within mainstream generative theory.

