Guided Learning for Bidirectional Sequence Classification
, 2007
Cited by 31 (2 self)
In this paper, we propose guided learning, a new learning framework for bidirectional sequence classification. The tasks of learning the order of inference and training the local classifier are dynamically incorporated into a single Perceptron-like learning algorithm. We apply this novel learning algorithm to POS tagging. It obtains an error rate of 2.67% on the standard PTB test set, which represents a 3.3% relative error reduction over the previous best result on the same data set, while using fewer features.
Exploiting semantic role labeling, WordNet and Wikipedia for coreference resolution
- In Proc. of HLT/NAACL
, 2006
Cited by 31 (5 self)
In this paper we present an extension of a machine-learning-based coreference resolution system which uses features induced from different semantic knowledge sources. These features represent knowledge mined from WordNet and Wikipedia, as well as information about semantic role labels. We show that semantic features indeed improve the performance on different referring expression types such as pronouns and common nouns.
The CoNLL-2008 Shared Task on Joint Parsing of Syntactic and Semantic Dependencies
Cited by 29 (0 self)
The Conference on Computational Natural Language Learning is accompanied every year by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2008 the shared task was dedicated to the joint parsing of syntactic and semantic dependencies. This shared task not only unifies the shared tasks of the previous four years under a unique dependency-based formalism, but also extends them significantly: this year’s syntactic dependencies include more information such as named-entity boundaries; the semantic dependencies model roles of both verbal and nominal predicates. In this paper, we define the shared task and describe how the data sets were created. Furthermore, we report and analyze the results and describe the approaches of the participating systems.
Knowledge derived from Wikipedia for computing semantic relatedness
- JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
, 2007
Cited by 16 (1 self)
Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question of whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine-learning-based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can be easily used for languages other than English by computing semantic relatedness for a German dataset.
Arabic Diacritization through Full Morphological Tagging
, 2007
Cited by 7 (2 self)
We present a diacritization system for written Arabic which is based on a lexical resource. It combines a tagger and a lexeme language model. It improves on the best results reported in the literature.
Targeting sentiment expressions through supervised ranking of linguistic configurations
- In 3rd Int’l AAAI Conference on Weblogs and Social Media (ICWSM)
, 2009
Cited by 6 (0 self)
User-generated content is extremely valuable for mining market intelligence because it is unsolicited. We study the problem of analyzing users’ sentiment and opinion in their posts (blogs, message boards, etc.) with respect to topics expressed as a search query. In the scenario we consider, the matches of the search query terms are expanded through coreference and meronymy to produce a set of mentions. The mentions are contextually evaluated for sentiment and their scores are aggregated (using a data structure we introduce, called the sentiment propagation graph) to produce an aggregate score for the input entity. A crucial part of the contextual evaluation of individual mentions is finding which sentiment expressions are semantically related to (target) which mentions; this is the focus of our paper. We present an approach where potential target mentions for a sentiment expression are ranked using supervised machine learning (Support Vector Machines), where the main features are the syntactic configurations (typed dependency paths) connecting the sentiment expression and the mention. We have created a large English corpus of product discussion blogs annotated with semantic types of mentions, coreference, meronymy and sentiment targets. The corpus proves that coreference and meronymy are not marginal phenomena but are really central to determining the overall sentiment for the top-level entity. We evaluate a number of techniques for sentiment targeting and present results which we believe push the current state of the art.
Cross-Linguistic Sentiment Analysis: From English to Spanish
Cited by 5 (2 self)
We explore the adaptation of English resources and techniques for text sentiment analysis to a new language, Spanish. Our main focus is the modification of an existing English semantic orientation calculator and the building of dictionaries; however we also compare alternate approaches, including machine translation and Support Vector Machine classification. The results indicate that, although language-independent methods provide a decent baseline performance, there is also a significant cost to automation, and thus the best path to long-term improvement is through the inclusion of language-specific knowledge and resources.
Designing and evaluating a Russian tagset
- In LREC’08
, 2008
Cited by 4 (2 self)
This paper reports the principles behind designing a tagset to cover Russian morphosyntactic phenomena, modifications of the core tagset, and its evaluation. The tagset and associated morphosyntactic specifications are based on the MULTEXT-East framework, while the decisions in designing it were aimed at achieving a balance between parameters important for linguists and the possibility to detect and disambiguate them automatically. The final tagset contains about 600 tags and achieves about 95% accuracy on the disambiguated portion of the Russian National Corpus. We have also produced a test set of tagging models and corpora that can be shared with other researchers.
Assigning function labels to unparsed text
- In Procs of RANLP’05, Borovets
, 2005
Cited by 2 (2 self)
In this paper, we propose a novel solution to the problem of assigning function labels to syntactic constituents. This task is a useful intermediate step between syntactic parsing and semantic role labelling. What distinguishes our proposal from other attempts in function or semantic role labelling is that we perform the learning of function labels at the same time as parsing. We reach state-of-the-art performance both on parsing and function labelling. Our results indicate that function label information is located in the lower levels of the parse tree, and that, similarly to other function and semantic labelling results, the main difficulty lies in distinguishing constituents that bear a function label from constituents that do not.
Natural language processing (almost) from scratch. arXiv:1103.0398v1
, 2011
Cited by 2 (1 self)
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.

