Results 1 - 10 of 45
UMBC EBIQUITY-CORE: Semantic Textual Similarity Systems
Cited by 30 (2 self)
We describe three semantic text similarity systems developed for the *SEM 2013 STS shared task and the results of the corresponding three runs. All of them shared a word similarity feature that combined LSA word similarity and WordNet knowledge. The first, which achieved the best mean score of the 89 submitted runs, used a simple term alignment algorithm augmented with penalty terms. The other two runs, ranked second and fourth, used support vector regression models to combine larger sets of features.
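The alignment-with-penalty idea mentioned in the abstract can be illustrated in a few lines. This is a hypothetical toy sketch, not the UMBC system: `word_sim` and its lookup table stand in for the paper's combined LSA + WordNet score, and the penalty scheme is simplified to a per-unaligned-token deduction.

```python
# Toy term-alignment similarity sketch (hypothetical, not the UMBC system).
def word_sim(w1, w2, table):
    """Stand-in for a combined LSA + WordNet word-similarity score."""
    if w1 == w2:
        return 1.0
    return table.get((w1, w2), table.get((w2, w1), 0.0))

def align_score(s1, s2, table, penalty=0.1):
    """Align each token in s1 to its best match in s2, averaging the match
    scores and deducting a small penalty per unaligned token."""
    if not s1 or not s2:
        return 0.0
    total, unaligned = 0.0, 0
    for w in s1:
        best = max(word_sim(w, v, table) for v in s2)
        if best > 0.0:
            total += best
        else:
            unaligned += 1
    return max(0.0, (total - penalty * unaligned) / len(s1))

# Hypothetical similarity table for illustration only.
table = {("cat", "feline"): 0.8}
s = align_score(["the", "cat", "sat"], ["the", "feline", "sat"], table)
```

With every token aligned, the score is simply the mean of the best per-token similarities, here (1.0 + 0.8 + 1.0) / 3.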
Task 3: Cross-level semantic similarity
In Proceedings of SemEval-2014, 2014
Cited by 17 (3 self)
This paper introduces a new SemEval task on Cross-Level Semantic Similarity (CLSS), which measures the degree to which the meaning of a larger linguistic item, such as a paragraph, is captured by a smaller item, such as a sentence. High-quality data sets were constructed for four comparison types using multi-stage annotation procedures with a graded scale of similarity. Nineteen teams submitted 38 systems. Most systems surpassed the baseline performance, with several attaining high performance for multiple comparison types. Further, our results show that comparisons of semantic representation increase performance beyond what is possible with text alone.
Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence
Cited by 6 (1 self)
We present a simple, easy-to-replicate monolingual aligner that demonstrates state-of-the-art performance while relying on almost no supervision and a very small number of external resources. Based on the hypothesis that words with similar meanings represent potential pairs for alignment if located in similar contexts, we propose a system that operates by finding such pairs. In two intrinsic evaluations on alignment test data, our system achieves F1 scores of 88–92%, demonstrating 1–3% absolute improvement over the previous best system. Moreover, in two extrinsic evaluations our aligner outperforms existing aligners, and even a naive application of the aligner approaches state-of-the-art performance in each extrinsic task.
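The intrinsic evaluation above scores predicted word alignments against gold alignments with F1. A minimal sketch of that metric, assuming alignments are represented as (token index in text 1, token index in text 2) pairs:

```python
def f1(predicted, gold):
    """F1 over sets of alignment pairs (index in text 1, index in text 2)."""
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)          # correctly predicted pairs
    p, r = tp / len(predicted), tp / len(gold)
    return 2 * p * r / (p + r) if p + r else 0.0

# Illustrative data: 2 of 3 predicted pairs appear in a 2-pair gold standard.
score = f1({(0, 0), (1, 2), (2, 1)}, {(0, 0), (1, 2)})
```

Here precision is 2/3 and recall is 1, giving F1 = 0.8.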
Towards Dynamic Word Sense Discrimination with Random Indexing
Cited by 4 (1 self)
Most distributional models of word similarity represent a word type by a single vector of contextual features, even though words commonly have more than one sense. The multiple senses can be captured by employing several vectors per word in a multi-prototype distributional model; the prototypes can be obtained by first constructing all the context vectors for the word and then clustering similar vectors to create sense vectors. Storing and clustering context vectors can be expensive, though. As an alternative, we introduce Multi-Sense Random Indexing, which performs on-the-fly (incremental) clustering. To evaluate the method, a number of measures for word similarity are proposed, both contextual and non-contextual, including new measures based on optimal alignment of word senses. Experimental results on the task of predicting semantic textual similarity do not, however, show a systematic difference between single-prototype and multi-prototype models.
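The on-the-fly clustering described above can be sketched as follows. This is a simplified, hypothetical version: each incoming context vector is folded into the nearest existing sense vector if their cosine similarity clears a threshold, otherwise it seeds a new sense. The threshold value and plain-list vectors are illustrative choices, not the paper's configuration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (lists of floats)."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def add_context(senses, ctx, threshold=0.5):
    """Incremental sense clustering sketch: merge ctx into the most similar
    existing sense vector if similarity >= threshold, else start a new sense."""
    best_i, best_s = -1, threshold
    for i, sv in enumerate(senses):
        s = cosine(sv, ctx)
        if s >= best_s:
            best_i, best_s = i, s
    if best_i >= 0:
        senses[best_i] = [a + b for a, b in zip(senses[best_i], ctx)]
    else:
        senses.append(list(ctx))
    return senses

# Three context vectors: the first and third are similar, the second is not.
senses = []
for ctx in ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]):
    add_context(senses, ctx)
```

No context vectors need to be stored; each sense is just a running sum, which is what makes the incremental approach cheap.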
SemantiKLUE: Robust semantic similarity at multiple levels using maximum weight matching
In Proceedings of SemEval 2014: International Workshop on Semantic Evaluation, 2014
Cited by 4 (1 self)
Being able to quantify the semantic similarity between two texts is important for many practical applications. SemantiKLUE combines unsupervised and supervised techniques into a robust system for measuring semantic similarity. At the core of the system is a word-to-word alignment of two texts using a maximum weight matching algorithm. The system participated in three SemEval-2014 shared tasks and the competitive results are evidence for its usability in that broad field of application.
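A maximum weight matching over a word-pair similarity matrix can be sketched directly. This toy version searches all one-to-one assignments exhaustively, which is only feasible for tiny inputs; a real system like the one above would use a polynomial-time matching algorithm (e.g. the Hungarian method), and the matrix values here are made up for illustration.

```python
from itertools import permutations

def max_weight_alignment(sim):
    """Exhaustive maximum-weight one-to-one alignment for a small similarity
    matrix sim[i][j] (rows = tokens of text 1, columns = tokens of text 2)."""
    n, m = len(sim), len(sim[0])
    if n > m:  # transpose so that rows <= columns, then flip pairs back
        flipped = max_weight_alignment(
            [[sim[i][j] for i in range(n)] for j in range(m)])
        return sorted((i, j) for j, i in flipped)
    best_w, best = float("-inf"), []
    for cols in permutations(range(m), n):
        w = sum(sim[i][c] for i, c in enumerate(cols))
        if w > best_w:
            best_w, best = w, [(i, c) for i, c in enumerate(cols)]
    return best

# Illustrative matrix: token 0 best matches column 1, token 1 column 0.
pairs = max_weight_alignment([[0.1, 0.9], [0.9, 0.2]])
```

The crossed alignment wins here (total weight 1.8 versus 0.3 for the diagonal), which is exactly the behavior a greedy left-to-right pairing can miss.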
iKernels-Core: Tree Kernel Learning for Textual Similarity
Cited by 3 (0 self)
This paper describes the participation of the iKernels system in the Semantic Textual Similarity (STS) shared task at *SEM 2013. Unlike the majority of approaches, where a large number of pairwise similarity features are used to learn a regression model, our model directly encodes the input texts into syntactic/semantic structures. Our systems rely on tree kernels to automatically extract a rich set of syntactic patterns and learn a similarity score correlated with human judgements. We experiment with different structural representations derived from constituency and dependency trees. While showing large improvements over the top results from the previous year's task (STS-2012), our best system ranks 21st out of the 88 total participants in the STS-2013 task. Nevertheless, a slight refinement to our model makes it rank 4th.
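The tree-kernel idea can be made concrete with a minimal subset-tree kernel in the Collins-and-Duffy style: the kernel counts tree fragments shared by two parses via a recursive delta over node pairs. This is a simplified sketch on nested-tuple trees, not the iKernels implementation (which uses richer structures and a decay factor).

```python
def delta(t1, t2):
    """Shared-fragment count rooted at a node pair; trees are nested tuples
    (label, child1, child2, ...), leaves are 1-tuples like ('cat',)."""
    if len(t1) == 1 or len(t2) == 1:      # leaves match iff identical
        return 1 if t1 == t2 else 0
    # productions must match: same parent label, same ordered child labels
    if t1[0] != t2[0] or [c[0] for c in t1[1:]] != [c[0] for c in t2[1:]]:
        return 0
    out = 1
    for c1, c2 in zip(t1[1:], t2[1:]):
        out *= 1 + delta(c1, c2)
    return out

def tree_kernel(t1, t2):
    """Sum the delta over all pairs of nodes from the two trees."""
    def nodes(t):
        yield t
        for c in t[1:]:
            yield from nodes(c)
    return sum(delta(a, b) for a in nodes(t1) for b in nodes(t2))

# A tiny illustrative parse tree.
t = ("S", ("NP", ("the",), ("cat",)), ("VP", ("sat",)))
```

The self-kernel `tree_kernel(t, t)` counts every common fragment of the tree with itself; a learner such as SVR or an SVM then uses these kernel values in place of explicit feature vectors.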
DLS@CU: Sentence similarity from word alignment
In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval-2014)
Cited by 3 (0 self)
We describe a set of top-performing systems at the SemEval 2015 English Semantic Textual Similarity (STS) task. Given two English sentences, each system outputs the degree of their semantic similarity. Our unsupervised system, which is based on word alignments across the two input sentences, ranked 5th among 73 submitted system runs with a mean correlation of 79.19% with human annotations. We also submitted two runs of a supervised system which uses word alignments and similarities between compositional sentence vectors as its features. Our best supervised run ranked 1st with a mean correlation of 80.15%.
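One plausible reading of "sentence similarity from word alignment" is the proportion of tokens, across both sentences, that participate in the alignment. The sketch below is a hypothetical simplification (the actual DLS@CU system restricts this to content words and adds further refinements):

```python
def alignment_similarity(tokens1, tokens2, alignments):
    """Similarity as the fraction of tokens in both sentences that are
    aligned; alignments are (index in tokens1, index in tokens2) pairs."""
    a1 = {i for i, _ in alignments}   # aligned positions in sentence 1
    a2 = {j for _, j in alignments}   # aligned positions in sentence 2
    return (len(a1) + len(a2)) / (len(tokens1) + len(tokens2))

# Only "sat" aligns across the two toy sentences.
sim = alignment_similarity(["the", "cat", "sat"], ["a", "dog", "sat"], [(2, 2)])
```

With one aligned token on each side and six tokens total, the score is 2/6.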
SOFTCARDINALITY-CORE: Improving Text Overlap with Distributional Measures for Semantic Textual Similarity
Cited by 2 (2 self)
Soft cardinality has been shown to be a very strong text-overlapping baseline for the task of measuring semantic textual similarity (STS), obtaining 3rd place in SemEval-2012. At the *SEM-2013 shared task, besides the plain text-overlapping approach, we tested within soft cardinality two distributional word-similarity functions derived from the ukWaC corpus. Unfortunately, we combined these measures with other features using regression, obtaining positions 18th, 22nd and 23rd among the 90 participating systems in the official ranking. After the release of the gold standard annotations of the test data, we observed that using only the similarity measures without combining them with other features would have obtained positions 6th, 7th and 8th; moreover, an arithmetic average of these similarity measures would have ranked 4th (mean=0.5747). This paper describes both the 3 systems as they were submitted and the similarity measures that would have obtained those better results.
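Soft cardinality itself is compact enough to sketch: each element of a token set contributes the inverse of its total similarity to the set, so near-duplicates count for less than one each, and with an exact-match similarity the measure reduces to ordinary cardinality. The similarity functions below are toy stand-ins, not the paper's distributional measures.

```python
def soft_cardinality(tokens, sim, p=1.0):
    """Soft cardinality of a token collection under similarity function sim;
    p controls how sharply similarity discounts near-duplicates."""
    return sum(1.0 / sum(sim(a, b) ** p for b in tokens) for a in tokens)

# Exact match: soft cardinality reduces to the ordinary count.
exact = lambda a, b: 1.0 if a == b else 0.0

# Toy similarity treating "car" and "cars" as half-similar.
plural = lambda a, b: 1.0 if a == b else (0.5 if {a, b} == {"car", "cars"} else 0.0)
```

Under `plural`, the set {"car", "cars"} has soft cardinality 2/1.5 = 4/3 rather than 2: the two near-duplicates are partially collapsed. An STS score is then typically built from soft cardinalities of the two texts and their union, in the spirit of a softened Jaccard coefficient.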
NTNU: Measuring semantic similarity with sublexical feature representations and soft cardinality
In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval-2014), 2014
Cited by 2 (1 self)
The paper describes the approaches taken by the NTNU team to the SemEval 2014 Semantic Textual Similarity shared task. The solutions combine measures based on lexical soft cardinality and character n-gram feature representations with lexical distance metrics from TakeLab’s baseline system. The final NTNU system is based on bagged support vector machine regression over the datasets from previous shared tasks and shows highly competitive performance, being the best system on three of the datasets and third best overall (on weighted mean over all six datasets).
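Character n-gram representations like the ones mentioned above are easy to illustrate: a text becomes the set of its overlapping n-character substrings, and two texts are compared by set overlap. The Dice coefficient used here is one common choice; the NTNU system's exact feature set is richer than this sketch.

```python
def char_ngrams(text, n=3):
    """Overlapping character n-grams: a sublexical representation that is
    robust to morphology and small spelling differences."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def dice(a, b):
    """Dice coefficient over two n-gram collections."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0

# Spelling variants share trigrams even though the words differ as tokens.
d = dice(char_ngrams("color"), char_ngrams("colour"))
```

"color" and "colour" share the trigrams "col" and "olo", so they score 4/7 despite being distinct word types, which is exactly the robustness sublexical features buy.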
Fact Checking: Task definition and dataset construction
Cited by 2 (1 self)
In this paper we introduce the task of fact checking, i.e. the assessment of the truthfulness of a claim. The task is commonly performed manually by journalists verifying the claims made by public figures. Furthermore, ordinary citizens need to assess the truthfulness of the increasing volume of statements they consume. Thus, developing fact checking systems is likely to be of use to various members of society. We first define the task and detail the construction of a publicly available dataset using statements fact-checked by journalists available online. Then, we discuss baseline approaches for the task and the challenges that need to be addressed. Finally, we discuss how fact checking relates to mainstream natural language processing tasks and can stimulate further research.