## Acquiring selectional preferences in a Thai lexical database (2004)

Venue: | The 1st International Joint Conference on Natural Language Processing (IJCNLP-04) |

Citations: | 2 - 2 self |

### BibTeX

```bibtex
@INPROCEEDINGS{Kruengkrai04acquiringselectional,
  author    = {Canasai Kruengkrai and Thatsanee Charoenporn and Virach Sornlertlamvanich and Hitoshi Isahara},
  title     = {Acquiring Selectional Preferences in a {Thai} Lexical Database},
  booktitle = {The 1st International Joint Conference on Natural Language Processing (IJCNLP-04)},
  year      = {2004}
}
```


### Abstract

In this paper, we consider the problem of enriching a Thai lexical database by extending the semantic information with selectional preferences. We propose a novel approach for acquiring selectional preferences of verbs, which is motivated by the tree cut model. We apply a model selection technique called the Bayesian Information Criterion (BIC). Given a semantic hierarchy, our goal is to generalize initial noun classes to the most plausible levels on that hierarchy. We present an iterative algorithm for generalization. The algorithm performs agglomerative merging on the semantic hierarchy in a bottom-up manner. The BIC is used to measure the improvement of the model both locally and globally. In our experiments, we consider the Web as a large corpus. We also propose approaches for extracting examples from the Web. Preliminary experimental results are given to show the feasibility and effectiveness of our approach.

### Citations

1559 | Introduction to WordNet: An on-line lexical database
- Miller, Beckwith, et al.
- 1990
Citation Context ...tural language processing have been interested in the problem of acquiring large semantic knowledge for natural language understanding systems. The availability of lexical databases, such as WordNet (Miller et al., 1993) and EuroWordNet (Vossen, 1999), appears to be useful for many different research areas, including word sense disambiguation, machine translation, information retrieval, etc. More recently, the reeme... |

804 | Accurate methods for the statistics of surprise and coincidence
- Dunning
- 1993
Citation Context ...dow size +3. nouns that have insignificant dependence of the target verb, we measure dependence between words by using statistics taken from all the snippets. We apply the log likelihood ratio (LLR) (Dunning, 1994) for selecting the most optimal nouns. Given the verb v and the noun n occurring within window size z, a fast version of the LLR can be calculated as follows (Tanaka, 2002): LLRz(v, n) = k11log k11N ... |
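The LLR formula in the excerpt is cut off, and the fast variant from Tanaka (2002) is not recoverable from it. As a point of reference, a minimal sketch of Dunning's log-likelihood ratio (the G² statistic) for a 2×2 verb–noun contingency table, with cell names `k11`…`k22` chosen to match the excerpt's notation:

```python
import math

def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table.

    k11: snippets where the verb and noun co-occur within the window
    k12: the noun occurs without the verb
    k21: the verb occurs without the noun
    k22: neither occurs
    """
    N = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)   # marginal counts for the noun
    cols = (k11 + k21, k12 + k22)   # marginal counts for the verb

    def term(k, r, c):
        # Each observed cell contributes k * log(observed / expected),
        # where expected = r * c / N under independence.
        return k * math.log(k * N / (r * c)) if k > 0 else 0.0

    return 2 * (term(k11, rows[0], cols[0]) + term(k12, rows[0], cols[1])
                + term(k21, rows[1], cols[0]) + term(k22, rows[1], cols[1]))
```

Under independence the statistic is 0; strong verb–noun association within the window drives it up, which is what makes it usable for ranking candidate nouns.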

733 | Foundations of Statistical Natural Language Processing
- Manning, Schütze
- 1999
Citation Context ...ame verb is the object. To deal with this limitation, we are interested in semantic constraints that are analogous to syntactic constraints called selectional preferences or selectional restrictions (Manning and Schütze, 1999). For example, the subject of the Thai verb ‘check’ prefers to be humans, the subject of the verb ‘fly’ tends to be birds or airplanes, and the object of the verb ‘drink’ prefers to be bever... |

374 | Automatic Word Sense Discrimination
- Schütze
- 1998
Citation Context ... input data of the algorithm in the form of co-occurrence tuples without human participants. For the algorithm, several methods for solving noun sense ambiguity will be investigated (Yarowsky, 1995) (Schütze, 1998). In (Abe and Li, 1996), the authors show that combining the association norm with the MLE can improve the accuracy of generalization. We believe that it can be effectively applied to our algorithm. ... |

235 | Selection and information: a class-based approach to lexical relationships - Resnik - 1993 |

176 | Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables
- Chickering, Heckerman
- 1997
Citation Context ...BIC(mi) = l̂i(D) − (pi/2) · log|D|, (1) where l̂i(D) is the log-likelihood of the data D according to mi, and pi is the number of independent parameters. The BIC has several interesting characteristics (Chickering and Heckerman, 1997). On the one hand, it is independent of the prior. On the other hand, it is exactly minus the MDL. We adopt the tree cut model to characterize the probabilistic model of the semantic hierarchy. Let m... |
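The equation in the excerpt is the standard BIC score, with a higher score meaning a better model. A one-line sketch (function and variable names are illustrative, not from the paper):

```python
import math

def bic_score(log_likelihood, n_params, n_samples):
    """BIC(m_i) = l_hat_i(D) - (p_i / 2) * log|D|; higher is better."""
    return log_likelihood - (n_params / 2) * math.log(n_samples)

# Comparing two hypothetical models on 1000 samples: the second fits the data
# slightly better but pays a complexity penalty for its two extra parameters,
# so the simpler model wins.
m1 = bic_score(-1200.0, n_params=3, n_samples=1000)
m2 = bic_score(-1198.0, n_params=5, n_samples=1000)
```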

106 | Generalizing case frames using a thesaurus and the MDL principle
- Li, Abe
- 1998
Citation Context ...in Algorithms 1 and 2. The algorithm iterates until it cannot find leaf nodes to merge or there remains one class. Figure 3 illustrates an example of how the algorithm works, which is reproduced from (Li and Abe, 1998). Given the verb fly with the syntactic relationship subject, the co-occurring nouns are: crow (2), eagle (2), bird (4), and bee (2), where numbers in the parentheses indicate the co-occurring freque... |
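The Li and Abe example above can be sketched end to end. The mini-hierarchy, the uniform-within-class tree cut distribution, and the candidate cuts below are illustrative reconstructions, not the paper's exact data or its agglomerative search; the point is only how BIC trades likelihood against cut size (a cut with k classes has k − 1 free parameters):

```python
import math

# Illustrative hierarchy: ANIMAL -> BIRD -> {crow, eagle, bird}; ANIMAL -> bee
leaves_of = {
    "crow": ["crow"], "eagle": ["eagle"], "bird": ["bird"], "bee": ["bee"],
    "BIRD": ["crow", "eagle", "bird"],
    "ANIMAL": ["crow", "eagle", "bird", "bee"],
}
freq = {"crow": 2, "eagle": 2, "bird": 4, "bee": 2}   # subjects of 'fly'
N = sum(freq.values())

def log_likelihood(cut):
    """Tree cut model: a class's probability mass is spread uniformly
    over its leaves, and each class's mass is its relative frequency."""
    ll = 0.0
    for cls in cut:
        members = leaves_of[cls]
        f = sum(freq.get(m, 0) for m in members)
        if f:
            ll += f * math.log((f / N) / len(members))
    return ll

def bic(cut):
    # A cut with k classes has k - 1 free probability parameters.
    return log_likelihood(cut) - (len(cut) - 1) / 2 * math.log(N)

cuts = [["crow", "eagle", "bird", "bee"], ["BIRD", "bee"], ["ANIMAL"]]
best = max(cuts, key=bic)
```

With these toy counts, each merge step raises the BIC, mirroring the bottom-up generalization the excerpt describes: the leaf-level cut scores worst, merging the birds into BIRD improves it, and the fully generalized cut scores best.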

57 | Bayesian model selection and model averaging
- Wasserman
- 2000
Citation Context ...given verb, we propose the use of a model selection technique called the Bayesian Information Criterion (Wasserman, 1999). We also propose an efficient algorithm for searching candidate sets and selecting semantic constraints that ... |

32 | Learning word association norms using tree cut pair models
- Abe, Li
- 1996
Citation Context ...lgorithm in the form of co-occurrence tuples without human participants. For the algorithm, several methods for solving noun sense ambiguity will be investigated (Yarowsky, 1995) (Schütze, 1998). In (Abe and Li, 1996), the authors show that combining the association norm with the MLE can improve the accuracy of generalization. We believe that it can be effectively applied to our algorithm. Acknowledgements The au... |

31 | On learning more appropriate selectional restrictions
- Ribas
- 1995
Citation Context ...able classes on a semantic hierarchy for predicates. Most algorithms for selectional preference induction are based on corpus-based approaches. The process can be broadly classified into three steps (Ribas, 1995). The steps are to create the space of candidate classes from examples, evaluate the appropriateness of the candidates using some statistical measures, and select the most optimal candidates to stand... |

27 | Using semantic preferences to identify verbal participation in role switching alternations - McCarthy - 2000 |

26 | Hiding a semantic hierarchy in a Markov model
- Abney, Light
- 1999
Citation Context ...troducing a weighting factor to the log-likelihood of the data. Recently, other statistical approaches, such as the Bayesian Networks (Ciaramita and Johnson, 2000) and the hidden Markov models (HMM) (Abney and Light, 1999), have been investigated. This paper presents a novel approach for selectional preference acquisition, which is motivated by the tree cut model. We apply a model selection technique called the Bayesi... |

18 | Explaining away ambiguity: Learning verb selectional preference with bayesian networks
- Ciaramita, Johnson
- 2000
Citation Context ...s. Wagner (2000) proposed a variation of the tree cut model by introducing a weighting factor to the log-likelihood of the data. Recently, other statistical approaches, such as the Bayesian Networks (Ciaramita and Johnson, 2000) and the hidden Markov models (HMM) (Abney and Light, 1999), have been investigated. This paper presents a novel approach for selectional preference acquisition, which is motivated by the tree cut mo... |

18 | Enriching a Lexical Semantic Net with Selectional Preferences by Means of Statistical Corpus Analysis - Wagner - 2000 |

12 | Language acquisition in the MDL framework
- Rissanen, Ristad
- 1994
Citation Context ...s conditional probability distributions over possible partitions of nouns using the maximum likelihood estimate, and selects the best partition through the Minimum Description Length (MDL) principle (Rissanen and Ristad, 1994). McCarthy (2000) also applied the tree cut model to the problem of identifying diathesis alternations. Wagner (2000) proposed a variation of the tree cut model by introducing a weighting factor to t... |

10 | On Acquiring Appropriate Selectional Restrictions from Corpora Using a Semantic Taxonomy
- Ribas
- 1995
Citation Context ...able classes on a semantic hierarchy for predicates. Most algorithms for selectional preference induction are based on corpus-based approaches. The process can be broadly classified into three steps (Ribas, 1995). The steps are to create the space of candidate classes from examples, evaluate the appropriateness of the candidates using some statistical measures, and select the most optimal candidates to stand... |

7 | Measuring the Similarity between Compound Nouns in Different Languages Using Non-parallel Corpora - Tanaka - 2002 |

2 | EuroWordNet General Document
- Vossen
- 1999
Citation Context ...terested in the problem of acquiring large semantic knowledge for natural language understanding systems. The availability of lexical databases, such as WordNet (Miller et al., 1993) and EuroWordNet (Vossen, 1999), appears to be useful for many different research areas, including word sense disambiguation, machine translation, information retrieval, etc. More recently, the reemergence of ontology researches i... |