A Statistical Analysis of Morphemes in Japanese Terminology (1998)

by Kyo Kageura
Venue: COLING-ACL

Results 1 - 2 of 2

An application and evaluation of the C/NC-value approach for the automatic term recognition of multi-word units in Japanese

by Hideki Mima, Sophia Ananiadou - Int. J. on Terminology
Cited by 4 (2 self)
Abstract. Technical terms are important for knowledge mining, especially as vast amounts of multilingual documents are available over the Internet. Thus, a domain- and language-independent method for term recognition is necessary to automatically recognize terms in Internet documents. The C/NC-value method is an efficient domain-independent multi-word term recognition method that combines linguistic and statistical knowledge. Although the C/NC-value method was originally based on the recognition of nested terms in English, our aim is to evaluate the application of the method to other languages and to show its feasibility in a multi-language environment. In this paper, we describe the application of the C/NC-value method to Japanese texts. Several experiments analysing the performance of the method using the NACSIS Japanese AI-domain corpus demonstrate that the method can be utilized to realize a practical domain- and language-independent term recognition system.

Keywords: automatic term recognition, C-value, NC-value, nested terms, term context word

39 Distributions in text

by Marco Baroni, 2005
The frequency of words and other linguistic units plays a central role in all branches of corpus linguistics. Indeed, the use of frequency information distinguishes corpus-based methodology from other approaches to language. Thus, not surprisingly, the distribution of frequencies of words and combinations of