Results 1 - 2 of 2
Developing A Persian Chunker Using a Hybrid Approach
Abstract
Text segmentation is the process of recognizing the boundaries of text constituents such as sentences, phrases, and words. This paper focuses on phrase segmentation, also known as chunking. The task poses different problems in different natural languages, depending on their linguistic features and prescribed writing conventions. In this paper, we discuss the problems and solutions specific to the Persian language and present our system for Persian phrase segmentation. Our system uses a hybrid method for automatic chunking of Persian texts. The method first applies a rule-based approach to create a tagged corpus for training a neural network, and then uses a multilayer perceptron neural network and Fuzzy C-Means Clustering to chunk new sentences. Experimental results show an average precision of 85.7% for the chunking result.
Rio de Janeiro, Brasil
Abstract
Current Natural Language Processing tools provide shallow semantics for textual data. This kind of knowledge could be used in the Semantic Web. In this paper, we describe F-EXT-WS, a Portuguese Language Processing Service that is now available on the Web. The first version of this service provides Part-of-Speech Tagging, Noun Phrase Chunking, and Named Entity Recognition. All of these tools were built with the Entropy Guided Transformation Learning algorithm, a state-of-the-art Machine Learning algorithm for such tasks. We show the service architecture and interface. We also report on some experiments to evaluate the system’s performance. The service is fast and reliable.

