Results 1 - 10 of 19
Learning Deep Structured Semantic Models for Web Search using Clickthrough Data
"... Latent semantic models, such as LSA, intend to map a query to its relevant documents at the semantic level where keyword-based matching often fails. In this study we strive to develop a series of new latent semantic models with a deep structure that project queries and documents into a common low-di ..."
Abstract
-
Cited by 38 (15 self)
- Add to MetaCart
Latent semantic models, such as LSA, aim to map a query to its relevant documents at the semantic level, where keyword-based matching often fails. In this study we develop a series of new latent semantic models with a deep structure that project queries and documents into a common low-dimensional space, where the relevance of a document given a query is readily computed as the distance between them. The proposed deep structured semantic models are discriminatively trained by maximizing the conditional likelihood of the clicked documents given a query, using clickthrough data. To make our models applicable to large-scale Web search, we also use a technique called word hashing, which is shown to effectively scale our semantic models to the large vocabularies common in such tasks. The new models are evaluated on a Web document ranking task using a real-world data set. Results show that our best model significantly outperforms other latent semantic models, which were considered state of the art prior to the work presented in this paper.
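As a concrete illustration of the word-hashing step described above, the sketch below breaks words into letter trigrams and scores query-document relevance by cosine similarity. It is a minimal stand-in, not the paper's implementation: the toy trigram vocabulary and the absence of the deep projection layers are simplifications for illustration.

```python
# Sketch of DSSM-style letter-trigram "word hashing" plus cosine relevance.
# Illustrative only: the trigram vocabulary below is built from a toy corpus.
import numpy as np

def letter_trigrams(word):
    """Break a word into letter trigrams, e.g. 'good' -> '#go','goo','ood','od#'."""
    w = f"#{word}#"
    return [w[i:i + 3] for i in range(len(w) - 2)]

def hash_text(text, trigram_index):
    """Map raw text to a sparse letter-trigram count vector."""
    v = np.zeros(len(trigram_index))
    for word in text.lower().split():
        for tri in letter_trigrams(word):
            if tri in trigram_index:
                v[trigram_index[tri]] += 1.0
    return v

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Toy trigram vocabulary; the point of word hashing is that the trigram
# space stays small even when the word vocabulary is very large.
corpus = ["deep structured semantic model", "semantic search model"]
tris = sorted({t for s in corpus for w in s.split() for t in letter_trigrams(w)})
index = {t: i for i, t in enumerate(tris)}

q = hash_text("semantic model", index)
d = hash_text("deep structured semantic model", index)
print(f"relevance ~ cosine = {cosine(q, d):.3f}")
```

In the full model, these hashed vectors feed a deep network and the cosine is taken between the network outputs rather than the raw trigram counts.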
Convolutional Neural Networks for Sentence Classification
"... We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vec-tors for sentence-level classification tasks. We show that a simple CNN with lit-tle hyperparameter tuning and static vec-tors achieves excellent results on multi-ple benchmarks. Lear ..."
Abstract
-
Cited by 21 (0 self)
- Add to MetaCart
(Show Context)
We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.
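The architecture the abstract describes is compact enough to sketch. Below is a minimal PyTorch rendering of a CNN with parallel filter widths, max-over-time pooling, and a static-versus-fine-tuned embedding switch; the vocabulary size, filter counts, and widths are illustrative defaults, not the paper's exact hyperparameters.

```python
# Minimal sketch of a sentence-classification CNN over word vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceCNN(nn.Module):
    def __init__(self, vocab=10000, dim=300, classes=2,
                 widths=(3, 4, 5), filters=100, static=True):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.emb.weight.requires_grad = not static  # static vs. fine-tuned vectors
        self.convs = nn.ModuleList([nn.Conv1d(dim, filters, w) for w in widths])
        self.drop = nn.Dropout(0.5)
        self.out = nn.Linear(filters * len(widths), classes)

    def forward(self, tokens):                      # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)        # (batch, dim, seq_len)
        # One feature map per filter width, then max-over-time pooling.
        pooled = [F.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.out(self.drop(torch.cat(pooled, dim=1)))

logits = SentenceCNN()(torch.randint(0, 10000, (4, 20)))
print(logits.shape)  # torch.Size([4, 2])
```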
Multilingual Models for Compositional Distributional Semantics
- In Proceedings of ACL, 2014
"... We present a novel technique for learn-ing semantic representations, which ex-tends the distributional hypothesis to mul-tilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of semantically equivalent sentences, while maintaining suf ..."
Abstract
-
Cited by 21 (1 self)
- Add to MetaCart
We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of semantically equivalent sentences, while maintaining sufficient distance between those of dissimilar sentences. The models do not rely on word alignments or any syntactic information and are successfully applied to a number of diverse languages. We extend our approach to learn semantic representations at the document level, too. We evaluate these models on two cross-lingual document classification tasks, outperforming the prior state of the art. Through qualitative analysis and the study of pivoting effects we demonstrate that our representations are semantically plausible and can capture semantic relationships across languages without parallel data.
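A minimal sketch of the training signal described here: sentence embeddings are formed by additive composition of word vectors, and a margin loss pulls parallel sentences together while pushing a sampled "noise" sentence away. The composition function, margin, and toy vocabularies are assumptions for illustration.

```python
# Hedged sketch of additive composition + a noise-contrastive margin loss.
import torch

def compose(word_vecs, sentence):           # additive composition over words
    return torch.stack([word_vecs[w] for w in sentence]).sum(dim=0)

def hinge_loss(src, tgt, noise, margin=1.0):
    d_pos = ((src - tgt) ** 2).sum()        # parallel pair: should be close
    d_neg = ((src - noise) ** 2).sum()      # noise pair: should be far
    return torch.clamp(margin + d_pos - d_neg, min=0.0)

dim = 64
en = {w: torch.randn(dim, requires_grad=True) for w in "the cat sat".split()}
de = {w: torch.randn(dim, requires_grad=True) for w in "die katze sass ging".split()}

loss = hinge_loss(compose(en, "the cat sat".split()),
                  compose(de, "die katze sass".split()),
                  compose(de, "die katze ging".split()))
loss.backward()  # gradients flow into both languages' word vectors
print(float(loss))
```

Note that no word alignments are needed: the loss is defined entirely at the sentence level, which is what lets the approach scale across language pairs.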
Learning continuous phrase representations for translation modeling
- In ACL, 2014
"... This paper tackles the sparsity problem in estimating phrase translation probabilities by learning continuous phrase representa-tions, whose distributed nature enables the sharing of related phrases in their represen-tations. A pair of source and target phrases are projected into continuous-valued v ..."
Abstract
-
Cited by 20 (5 self)
- Add to MetaCart
(Show Context)
This paper tackles the sparsity problem in estimating phrase translation probabilities by learning continuous phrase representations, whose distributed nature enables the sharing of related phrases in their representations. A pair of source and target phrases are projected into continuous-valued vector representations in a low-dimensional latent space, where their translation score is computed by the distance between the pair in this new space. The projection is performed by a neural network whose weights are learned on parallel training data. Experimental evaluation has been performed on two WMT translation tasks. Our best result improves the performance of a state-of-the-art phrase-based statistical machine translation system trained on WMT 2012 French-English data by up to 1.3 BLEU points.
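To make the scoring idea concrete, the sketch below projects bag-of-words phrase vectors into a shared low-dimensional space and scores a phrase pair by similarity there. The single random tanh layer stands in for the trained network, and all dimensions are toy-sized assumptions.

```python
# Illustrative sketch: phrase pairs scored by similarity in a latent space.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 1000, 50
W_src = rng.normal(0, 0.1, (DIM, VOCAB))   # learned on parallel data in the paper
W_tgt = rng.normal(0, 0.1, (DIM, VOCAB))   # here: random placeholders

def bag_of_words(token_ids):
    v = np.zeros(VOCAB)
    np.add.at(v, token_ids, 1.0)
    return v

def phrase_vec(token_ids, W):
    return np.tanh(W @ bag_of_words(token_ids))

def translation_score(src_ids, tgt_ids):
    s, t = phrase_vec(src_ids, W_src), phrase_vec(tgt_ids, W_tgt)
    return s @ t / (np.linalg.norm(s) * np.linalg.norm(t))

print(translation_score([3, 17, 42], [5, 99]))
```

Because related phrases share active dimensions, a rare phrase pair can still receive a sensible score, which is exactly the sparsity relief the abstract claims.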
Clickthrough-Based Latent Semantic Models for Web Search
- In Proceedings of SIGIR, 2011
"... This paper presents two new document ranking models for Web search based upon the methods of semantic representation and the statistical translation-based approach to information retrieval (IR). Assuming that a query is parallel to the titles of the documents clicked on for that query, large amounts ..."
Abstract
-
Cited by 18 (9 self)
- Add to MetaCart
(Show Context)
This paper presents two new document ranking models for Web search, based upon the methods of semantic representation and the statistical translation-based approach to information retrieval (IR). Assuming that a query is parallel to the titles of the documents clicked on for that query, large numbers of query-title pairs are constructed from clickthrough data; two latent semantic models are learned from this data. One is a bilingual topic model within the language modeling framework. It ranks documents for a query by the likelihood of the query being a semantics-based translation of each document. The semantic representation is language independent and learned from query-title pairs, with the assumption that a query and its paired titles share the same distribution over semantic topics. The other is a discriminative projection model within the vector space modeling framework. Unlike Latent Semantic Analysis and its variants, the projection matrix in our model, which is used to map term vectors into semantic space, is learned discriminatively such that the distance between a query and its paired title, both represented as vectors in the projected semantic space, is smaller than the distance between the query and the titles of other documents that have no clicks for that query. These models are evaluated on the Web search task using a real-world data set. Results show that they significantly outperform their corresponding state-of-the-art baseline models.
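The discriminative projection model's training signal can be sketched compactly: learn a single projection so that a query lands closer, in cosine terms, to its clicked title than to an unclicked one. The pairwise margin loss below is a stand-in for the paper's actual objective, and the term-vector sizes are toy values.

```python
# Sketch of a discriminatively trained projection from term space to a
# latent space, driven by clicked vs. unclicked query-title pairs.
import torch
import torch.nn.functional as F

TERMS, LATENT = 500, 32
proj = torch.nn.Linear(TERMS, LATENT, bias=False)   # the projection matrix

def project(term_vec):
    return F.normalize(proj(term_vec), dim=-1)      # unit vectors: dot = cosine

q  = torch.rand(TERMS)   # query term vector
dp = torch.rand(TERMS)   # title of a clicked document
dn = torch.rand(TERMS)   # title of an unclicked document

sim_pos = project(q) @ project(dp)
sim_neg = project(q) @ project(dn)
loss = torch.clamp(0.2 + sim_neg - sim_pos, min=0)  # pairwise margin objective
loss.backward()                                     # updates the projection
```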
Question answering using enhanced lexical semantic models.
- In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2013
"... Abstract In this paper, we study the answer sentence selection problem for question answering. Unlike previous work, which primarily leverages syntactic analysis through dependency tree matching, we focus on improving the performance using models of lexical semantic resources. Experiments show that ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
(Show Context)
In this paper, we study the answer sentence selection problem for question answering. Unlike previous work, which primarily leverages syntactic analysis through dependency tree matching, we focus on improving performance using models of lexical semantic resources. Experiments show that our systems can be consistently and significantly improved with rich lexical semantic information, regardless of the choice of learning algorithm. When evaluated on a benchmark dataset, the MAP and MRR scores are increased by 8 to 10 points compared to one of our baseline systems using only surface-form matching. Moreover, our best system also outperforms previous work that makes use of the dependency tree structure by a wide margin.
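As an illustration of what enhanced lexical semantics can buy over surface matching, the sketch below augments word overlap with WordNet synonym and hypernym lookups via NLTK (requires nltk.download('wordnet')). The two feature names are invented for the example and are not the paper's feature set.

```python
# Sketch: surface-form overlap vs. lexically enhanced overlap, so that e.g.
# "purchase" in a question can match "buy" in a candidate sentence.
from nltk.corpus import wordnet as wn

def related_words(word):
    rel = {word}
    for syn in wn.synsets(word):
        rel.update(l.lower() for l in syn.lemma_names())      # synonyms
        for hyper in syn.hypernyms():
            rel.update(l.lower() for l in hyper.lemma_names())  # is-a parents
    return rel

def overlap_features(question, sentence):
    q, s = question.lower().split(), set(sentence.lower().split())
    surface = sum(w in s for w in q) / len(q)
    semantic = sum(bool(related_words(w) & s) for w in q) / len(q)
    return {"surface_overlap": surface, "semantic_overlap": semantic}

print(overlap_features("where did he purchase the car",
                       "he decided to buy the car in boston"))
```

Features like these can be fed to essentially any learner, which matches the abstract's observation that the gains hold regardless of the learning algorithm.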
Learning semantic representations for the phrase translation model
- In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL), 2014
"... Abstract This paper presents a novel semantic-based phrase translation model. A pair of source and target phrases are projected into continuous-valued vector representations in a lowdimensional latent semantic space, where their translation score is computed by the distance between the pair in this ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
(Show Context)
This paper presents a novel semantic-based phrase translation model. A pair of source and target phrases are projected into continuous-valued vector representations in a low-dimensional latent semantic space, where their translation score is computed by the distance between the pair in this new space. The projection is performed by a multi-layer neural network whose weights are learned on parallel training data; the learning aims to directly optimize the quality of end-to-end machine translation results. Experimental evaluation has been performed on two Europarl translation tasks, English-French and German-English. The results show that the new semantic-based phrase translation model significantly improves the performance of a state-of-the-art phrase-based statistical machine translation system, leading to a gain of 0.7-1.0 BLEU points.
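Since the model is trained to help an end-to-end SMT system, its score ultimately enters the decoder as one more log-linear feature whose weight is tuned alongside the usual translation and language-model features. The sketch below shows that integration point; the feature values and weights are made up for illustration.

```python
# Sketch: a semantic phrase-pair score as an extra log-linear SMT feature.
def loglinear_score(features, weights):
    """Standard log-linear model: weighted sum of feature values."""
    return sum(weights[name] * value for name, value in features.items())

hypothesis_features = {
    "phrase_translation_logprob": -2.1,
    "language_model_logprob": -4.3,
    "semantic_phrase_score": 0.62,   # distance-based score from the neural model
}
weights = {"phrase_translation_logprob": 1.0,
           "language_model_logprob": 0.5,
           "semantic_phrase_score": 0.8}

print(loglinear_score(hypothesis_features, weights))
```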
Multi-relational latent semantic analysis.
- In Proceedings of EMNLP. Association for Computational Linguistics, 2013
"... Abstract We present Multi-Relational Latent Semantic Analysis (MRLSA) which generalizes Latent Semantic Analysis (LSA). MRLSA provides an elegant approach to combining multiple relations between words by constructing a 3-way tensor. Similar to LSA, a lowrank approximation of the tensor is derived u ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
We present Multi-Relational Latent Semantic Analysis (MRLSA), which generalizes Latent Semantic Analysis (LSA). MRLSA provides an elegant approach to combining multiple relations between words by constructing a 3-way tensor. Similar to LSA, a low-rank approximation of the tensor is derived using a tensor decomposition. Each word in the vocabulary is thus represented by a vector in the latent semantic space, and each relation is captured by a latent square matrix. The degree to which two words hold a specific relation can then be measured through simple linear algebraic operations. We demonstrate that by integrating multiple relations from both homogeneous and heterogeneous information sources, MRLSA achieves state-of-the-art performance on existing benchmark datasets for two relations, antonymy and is-a.
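A rough numpy rendering of the construction: stack one word-by-word matrix per relation into a 3-way tensor, extract a shared low-rank word space, and represent each relation as a small latent matrix. The shared-SVD decomposition below is a simplified stand-in for the paper's tensor decomposition, and all sizes are toy values.

```python
# Hedged sketch of the MRLSA idea: shared word vectors, one latent square
# matrix per relation, relation degree via simple linear algebra.
import numpy as np

rng = np.random.default_rng(0)
n_words, n_rel, rank = 20, 2, 5                  # e.g. synonymy and antonymy
tensor = rng.random((n_words, n_words, n_rel))   # slice r: relation-r counts

# Shared word space: left singular vectors of the mode-1 unfolding.
unfolded = tensor.reshape(n_words, -1)
W, _, _ = np.linalg.svd(unfolded, full_matrices=False)
W = W[:, :rank]                                  # each word -> rank-dim vector

# One latent square matrix per relation.
S = [W.T @ tensor[:, :, r] @ W for r in range(n_rel)]

def relation_degree(i, j, r):
    """Degree to which word i holds relation r with word j."""
    return W[i] @ S[r] @ W[j]

print(relation_degree(0, 1, r=0))
```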
Sketch-based 3D Shape Retrieval using Convolutional Neural Networks
"... Retrieving 3D models from 2D human sketches has re-ceived considerable attention in the areas of graphics, im-age retrieval, and computer vision. Almost always in state of the art approaches a large amount of “best views ” are computed for 3D models, with the hope that the query sketch matches one o ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
Retrieving 3D models from 2D human sketches has received considerable attention in the areas of graphics, image retrieval, and computer vision. Almost always, state-of-the-art approaches compute a large number of “best views” for 3D models, with the hope that the query sketch matches one of these 2D projections of the 3D models using predefined features. We argue that this two-stage approach (view selection, then matching) is pragmatic but also problematic, because the “best views” are subjective and ambiguous, which makes the matching inputs obscure. This imprecise nature of matching further makes it challenging to choose features manually. Instead of relying on the elusive concept of “best views” and hand-crafted features, we propose to define our views using a minimalist approach and to learn features for both sketches and views. Specifically, we drastically reduce the number of views to only two predefined directions for the whole dataset. Then, we learn two Siamese Convolutional Neural Networks (CNNs), one for the views and one for the sketches. The loss function is defined on the within-domain as well as the cross-domain similarities. Our experiments on three large datasets demonstrate that our method is significantly better than state-of-the-art approaches, outperforming them on all conventional metrics.
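The loss structure described above, within-domain plus cross-domain similarity terms over two CNN towers, can be sketched as follows. The tiny convolutional tower and the way examples are paired here are placeholders, not the paper's architecture or sampling scheme.

```python
# Sketch: two Siamese towers (sketches, views) with contrastive terms
# applied both within each domain and across domains.
import torch
import torch.nn as nn
import torch.nn.functional as F

def tower():
    return nn.Sequential(nn.Conv2d(1, 8, 5), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(8, 32))

sketch_net, view_net = tower(), tower()

def contrastive(a, b, same, margin=1.0):
    d = F.pairwise_distance(a, b)
    return torch.mean(same * d ** 2 +
                      (1 - same) * torch.clamp(margin - d, min=0) ** 2)

sketches, views = torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64)
same = torch.randint(0, 2, (8,)).float()    # 1 if pair depicts the same shape

s, v = sketch_net(sketches), view_net(views)
loss = (contrastive(s, v, same)                    # cross-domain term
        + contrastive(s[:4], s[4:], same[:4])      # within sketches (toy pairing)
        + contrastive(v[:4], v[4:], same[:4]))     # within views (toy pairing)
loss.backward()
```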
Using Metafeatures to Increase the Effectiveness of Latent Semantic Models in Web Search
- Borisov, A.; Serdyukov, P.; de Rijke, M.
"... Using metafeatures to increase the effectiveness of latent semantic models in web search Borisov, A.; Serdyukov, P.; de Rijke, M. General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other th ..."
Abstract
- Add to MetaCart
(Show Context)
In web search, latent semantic models have been proposed to bridge the lexical gap between queries and documents that arises because searchers and content creators often use different vocabularies and language styles to express the same concept. Modern search engines simply use the outputs of latent semantic models as features for a so-called global ranker. We argue that this is not optimal, because a single value output by a latent semantic model may be insufficient to describe all aspects of the model's prediction, and thus some information captured by the model is not used effectively by the search engine. To increase the effectiveness of latent semantic models in web search, we propose to create metafeatures, feature vectors that describe the structure of the model's prediction for a given query-document pair, and pass them to the global ranker along with the models' scores. We provide simple guidelines to represent the latent semantic model's prediction with more than a single number, and illustrate these guidelines using several latent semantic models. We test the impact of the proposed metafeatures on a web document ranking task using four latent semantic models. Our experiments show that (1) through the use of metafeatures, the performance of each individual latent semantic model can be improved by 10.2% and 4.2% in NDCG scores at truncation levels 1 and 10, respectively; and (2) the performance of a combination of latent semantic models can be improved by 7.6% and 3.8% at the same truncation levels.
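A minimal sketch of what a metafeature vector might contain for a cosine-style model: the per-dimension contributions to the score plus summary statistics over them, handed to the global ranker alongside the score itself. The specific statistics are illustrative assumptions, not the paper's guidelines.

```python
# Sketch: describe a latent semantic model's prediction with a feature
# vector instead of a single score.
import numpy as np

def metafeatures(query_vec, doc_vec, top_k=3):
    qn = query_vec / (np.linalg.norm(query_vec) + 1e-9)
    dn = doc_vec / (np.linalg.norm(doc_vec) + 1e-9)
    contrib = qn * dn                      # per-dimension cosine contributions
    score = contrib.sum()                  # the usual single-number output
    stats = [score, contrib.max(), contrib.min(), contrib.std(),
             *np.sort(contrib)[-top_k:]]   # largest contributions
    return np.array(stats)

rng = np.random.default_rng(1)
feats = metafeatures(rng.normal(size=128), rng.normal(size=128))
print(feats.shape)  # this vector goes to the global ranker with the score
```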