An application and evaluation of the C/NC-value approach for the automatic term recognition of multi-word units in Japanese
- Int. J. on Terminology
Cited by 4 (2 self)
Abstract. Technical terms are important for knowledge mining, especially as vast amounts of multilingual documents are available over the Internet. A domain- and language-independent method for term recognition is therefore needed to recognize terms automatically in Internet documents. The C/NC-value method is an efficient domain-independent multi-word term recognition method that combines linguistic and statistical knowledge. Although the C/NC-value method was originally based on the recognition of nested terms in English, our aim is to evaluate its application to other languages and to show its feasibility in multilingual environments. In this paper, we describe the application of the C/NC-value method to Japanese texts. Several experiments analysing the performance of the method on the NACSIS Japanese AI-domain corpus demonstrate that the method can be used to realize a practical domain- and language-independent term recognition system.
Keywords: automatic term recognition, C-value, NC-value, nested terms, term context word
39 Distributions in text
, 2005
The frequency of words and other linguistic units plays a central role in all branches of corpus linguistics. Indeed, the use of frequency information distinguishes corpus-based methodology from other approaches to language. Thus, not surprisingly, the distribution of frequencies of words and combinations of

