Back to the Past: Supporting Interpretations of Forgotten Stories by Time-aware Re-Contextualization (2015)
Venue: | In 8th ACM International Conference on Web Search and Data Mining |
Citations: | 2 - 1 self |
Citations
4360 | Latent Dirichlet allocation
- Blei, Ng, et al.
- 2003
Citation Context ... Diversity. This class of features is aimed to compare the dissimilarity between document d and context c on a higher level by representing them using topics. We use latent Dirichlet allocation (LDA) =-=[3]-=- to model a set of implicit topics distribution of the document and context. We define this feature as follows. $R_1(c, d) = \sqrt{\sum_{k=1}^{m} \left(p(z_k \mid d) - p(z_k \mid c)\right)^2}$ where m is the number of topics and $z_k$ is the... |
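The topic-based feature quoted above is the Euclidean distance between the topic distributions of a document and a context. A minimal sketch follows, assuming the two distributions p(z|d) and p(z|c) have already been inferred by some LDA implementation; the function name `topic_diversity` and the example values are hypothetical and not taken from the paper.

```python
import math

def topic_diversity(p_doc, p_ctx):
    """Euclidean distance between two topic distributions of equal length.

    p_doc[k] stands for p(z_k | d) and p_ctx[k] for p(z_k | c).
    """
    if len(p_doc) != len(p_ctx):
        raise ValueError("topic distributions must cover the same topics")
    return math.sqrt(sum((pd - pc) ** 2 for pd, pc in zip(p_doc, p_ctx)))

# Hypothetical distributions over m = 3 topics (illustrative values only).
doc_topics = [0.7, 0.2, 0.1]
ctx_topics = [0.1, 0.2, 0.7]
print(topic_diversity(doc_topics, ctx_topics))
```

A larger value indicates that the context covers topics that the document does not, which is the dissimilarity the excerpt describes.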
1153 | A language modeling approach to information retrieval
- Ponte, Croft
- 1998
Citation Context ...of contextualization candidates. Later, learning to select relevant context items is applied to this ranked list. 5.1 Retrieval Model For the retrieval step, we use query-likelihood language modeling =-=[31]-=- to determine the similarity of a query with the context. In particular, given a query q generated by using one of the methods described in Section 4 for the document d, we compute the likelihood of g... |
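The retrieval step quoted above scores each context by the likelihood of generating the query from it. The sketch below illustrates query-likelihood scoring under one assumption that the excerpt does not state: Dirichlet smoothing with parameter `mu`. All names (`query_log_likelihood`, `contexts`) and the toy data are hypothetical.

```python
import math
from collections import Counter

def query_log_likelihood(query_terms, context_terms, collection_counts,
                         collection_len, mu=2000.0):
    """Log-likelihood of the query under a context language model.

    Dirichlet smoothing (an assumption here) backs off to collection
    statistics so that unseen query terms do not zero out the score.
    """
    ctx_counts = Counter(context_terms)
    ctx_len = len(context_terms)
    score = 0.0
    for term in query_terms:
        p_coll = collection_counts.get(term, 0) / collection_len
        p = (ctx_counts[term] + mu * p_coll) / (ctx_len + mu)
        if p == 0.0:
            # The term occurs nowhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score

# Toy contexts standing in for indexed paragraphs (illustrative only).
contexts = {
    "c1": "berlin wall fell in 1989".split(),
    "c2": "stock market prices rose".split(),
}
collection = Counter(t for terms in contexts.values() for t in terms)
collection_len = sum(collection.values())

query = ["berlin", "wall"]
ranked = sorted(
    contexts,
    key=lambda c: query_log_likelihood(query, contexts[c], collection, collection_len),
    reverse=True,
)
print(ranked)
```

Ranking by log-likelihood rather than raw likelihood avoids numerical underflow for longer queries and leaves the ordering unchanged.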
322 | Learning to link with wikipedia
- Milne, Witten
- 2008
Citation Context ...ext at the respective time for its interpretation. Just adding information, which is related to the entities and concepts mentioned in the text, as it is done in Wikification approaches, for example, =-=[28, 29]-=- or for a domain specific case [17], is not sufficient for many reasons. First, we require a kind of a virtual time-travel, in which - by the information about the past - we are mentally transported i... |
269 | Predicting Query Performance.
- Cronen-Townsend, Zhou, et al.
- 2002
Citation Context ...ormation is that the information needs of the users are made explicit, possibly leading to more effective queries. We formulate queries by combining annotations via Query Performance Prediction (QPP) =-=[9]-=-, using both pre–retrieval [15] and post–retrieval [5] features. The former are based only on the query and corpus-based statistics, while the latter also analyze the retrieved list of results. In li... |
265 | Wikify!: linking documents to encyclopedic knowledge.
- Mihalcea, Csomai
- 2007
Citation Context ...ext at the respective time for its interpretation. Just adding information, which is related to the entities and concepts mentioned in the text, as it is done in Wikification approaches, for example, =-=[28, 29]-=- or for a domain specific case [17], is not sufficient for many reasons. First, we require a kind of a virtual time-travel, in which - by the information about the past - we are mentally transported i... |
256 | Support vector regression machines
- Drucker, Burges, et al.
- 1997
Citation Context ... phase we are interested in retrieving as many contextualization candidates as possible. In this work we predict query performance with a regression model learned via Support Vector Regression (SVR) =-=[12]-=-. In this model, each learning sample s = (fq, rq) consists of a feature vector fq describing query q (as well as the document it refers to) and its recall rq, i.e., the label to be predicted. Note ... |
213 | Learning algorithms for keyphrase extraction
- Turney
- 2000
206 | Novelty and diversity in information retrieval evaluation
- Clarke, Kolla, et al.
- 2008
Citation Context ...not looking for more information on the current context, but we try to re-construct the original context of a document. The contextualization task is also related to the diversification problem in IR =-=[7, 38, 39]-=-. [Figure 2: Time-aware re-contextualization approach.] In [38], different metrics are proposed to measure redundancy in order to investigate the novelty and redundancy of relevant documents in filterin... |
175 | KEA: Practical automatic keyphrase extraction.
- Witten, Paynter, et al.
- 1999
Citation Context ...ment information already available in the documents. Automatically formulating queries from text [18] can be done by using tf–idf, mutual information, natural language processing, or machine learning =-=[34, 36, 37]-=-. Assuming the presence of basic metadata and structure for documents, as in [33], some of the methods in our paper build queries by exploiting the title and lead paragraph of documents. Similarly to [14]... |
155 | Novelty and redundancy detection in adaptive filtering
- Zhang, Callan, et al.
- 2002
Citation Context ...s, which are used for retrieving adequate contextualization candidates from an underlying knowledge source. In the second step, we rank the candidates. Similarly to diversification approaches (e.g., =-=[38]-=-), this requires balancing two goals: high content-based and temporal relevance for the text to be contextualized, on the one hand, and complementarity for providing information that cannot already be fou... |
124 | The Stanford CoreNLP natural language processing toolkit.
- Manning, Surdeanu, et al.
- 2014
Citation Context ... considered paragraphs as contextualization units. In this particular snapshot, we obtain 4,414,920 Wikipedia articles that contain 25,708,539 paragraphs. For each paragraph, we used Stanford CoreNLP =-=[26]-=- for tokenization, entity annotation and temporal expression extraction. In addition, anchor texts found in the paragraph hyperlinks are also extracted. We used Apache Solr to index the annotated par... |
104 | Inferring query performance using pre-retrieval predictors.
- He, Ounis
- 2004
Citation Context ...present queries and the document it belongs to are described in the rest of this section. It is composed of novel temporal features for query performance prediction, along with more standard features =-=[4, 16, 30]-=-. Linguistic Features. We compute a family of linguistic features [30] for a query by considering its text and the document it refers to. This results in a set of features both at query and document l... |
100 | Robust disambiguation of named entities in text.
- Hoffart, Yosef, et al.
- 2011
Citation Context ...or example, enables an automated linkage of entity and concept mentions with Wikipedia pages. Meanwhile, a lot of progress has been made in further developing the entity disambiguation step (see, e.g., =-=[19]-=-), which is crucial for robust linking of entity mentions to Wikipedia entity pages or entity representations in other knowledge bases, such as YAGO, DBpedia or Freebase. Entity linking, or Entity Di... |
84 | Discovering key concepts in verbose queries
- Bendersky, Croft
- 2008
Citation Context ...d described before, we will not discuss them further. 4.3 Learning to Select Hook-based Queries Different methods based on ranking and selection of query terms from an initial query might be employed =-=[1, 24, 27]-=-, considering the entire set of hooks for a document as the initial query. We explore an adaptive method which formulates queries based on the characteristics of the input document and hooks. Our appr... |
71 | Query-Free news search.
- Henzinger, Chang, et al.
- 2005
Citation Context ...esults (contexts) for finding the ones that are not only topically and temporally relevant, but also complement information already available in the documents. Automatically formulating queries from text =-=[18]-=- can be done by using tf–idf, mutual information, natural language processing, or machine learning [34, 36, 37]. Assuming the presence of basic metadata and structure for documents, as in [33], some o... |
65 | Temporal profiles of queries.
- Jones, Diaz
- 2007
Citation Context ...tiveness for time-sensitive queries. Basically, there are two types of temporal information particularly useful for time-aware information retrieval: (1) the publication or creation time of a document =-=[20, 22]-=-, and (2) temporal expressions mentioned in a document or a query [2]. Aforementioned works address one of two main aspects for temporal relevance, namely, recency ranking [10, 11] or time-dependent r... |
51 | A language modeling approach for temporal information needs.
- Berberich, Bedathur, et al.
- 2010
Citation Context ...oral information particularly useful for time-aware information retrieval: (1) the publication or creation time of a document [20, 22], and (2) temporal expressions mentioned in a document or a query =-=[2]-=-. Aforementioned works address one of two main aspects for temporal relevance, namely, recency ranking [10, 11] or time-dependent ranking [2, 22]. The first aspect takes into account the freshness of ... |
41 | Towards recency ranking in web search.
- Dong, Chang, et al.
- 2010
Citation Context ... time of a document [20, 22], and (2) temporal expressions mentioned in a document or a query [2]. Aforementioned works address one of two main aspects for temporal relevance, namely, recency ranking =-=[10, 11]-=- or time-dependent ranking [2, 22]. The first aspect takes into account the freshness of web documents, whereas the second aspect considers temporal information needs and the temporal profiles of docu... |
39 | From Reading to Retrieval: Freeform Ink Annotations as Queries
- Golovchinsky, Price, et al.
- 1999
Citation Context ... 37]. Assuming the presence of basic metadata and structure for documents, as in [33], some of the methods in our paper build queries by exploiting the title and lead paragraph of documents. Similarly to =-=[14]-=-, we also explore approaches that assume the availability of manual annotations as seeds for query formulation. The advantage of having such additional information is that the information needs of the... |
38 | Query by document
- Yang, Bansal, et al.
- 2009
Citation Context ...ment information already available in the documents. Automatically formulating queries from text [18] can be done by using tf–idf, mutual information, natural language processing, or machine learning =-=[34, 36, 37]-=-. Assuming the presence of basic metadata and structure for documents, as in [33], some of the methods in our paper build queries by exploiting the title and lead paragraph of documents. Similarly to [14]... |
29 | Determining time of queries for re-ranking search results.
- Kanhabua, Nørvåg
- 2010
Citation Context ...tiveness for time-sensitive queries. Basically, there are two types of temporal information particularly useful for time-aware information retrieval: (1) the publication or creation time of a document =-=[20, 22]-=-, and (2) temporal expressions mentioned in a document or a query [2]. Aforementioned works address one of two main aspects for temporal relevance, namely, recency ranking [10, 11] or time-dependent r... |
29 | Linguistic features to predict query difficulty.
- Mothe, Tanguy
- 2005
Citation Context ...present queries and the document it belongs to are described in the rest of this section. It is composed of novel temporal features for query performance prediction, along with more standard features =-=[4, 16, 30]-=-. Linguistic Features. We compute a family of linguistic features [30] for a query by considering its text and the document it refers to. This results in a set of features both at query and document l... |
24 | A survey of pre-retrieval query performance predictors.
- Hauff, Hiemstra, et al.
- 2008
Citation Context ...on needs of the users are made explicit, possibly leading to more effective queries. We formulate queries by combining annotations via Query Performance Prediction (QPP) [9], using both pre–retrieval =-=[15]-=- and post–retrieval [5] features. The former are based only on the query and corpus-based statistics, while the latter also analyze the retrieved list of results. In line with the previous work on ti... |
23 | Estimating the query difficulty for information retrieval.
- Carmel, Yom-Tov
- 2010
Citation Context ...e made explicit, possibly leading to more effective queries. We formulate queries by combining annotations via Query Performance Prediction (QPP) [9], using both pre–retrieval [15] and post–retrieval =-=[5]-=- features. The former are based only on the query and corpus-based statistics, while the latter also analyze the retrieved list of results. In line with the previous work on time–aware performance pr... |
17 | Learning to rank for freshness and relevance.
- Dai, Shokouhi, et al.
- 2011
Citation Context ... time of a document [20, 22], and (2) temporal expressions mentioned in a document or a query [2]. Aforementioned works address one of two main aspects for temporal relevance, namely, recency ranking =-=[10, 11]-=- or time-dependent ranking [2, 22]. The first aspect takes into account the freshness of web documents, whereas the second aspect considers temporal information needs and the temporal profiles of docu... |
17 | Linking Online News and Social Media
- Tsagkias, Rijke, et al.
- 2011
Citation Context ...In [21], for example, news articles are enriched with related predictions – sentences containing temporal references to the future – retrieved from other documents in the same collection. Other works =-=[13, 33, 35]-=- exploit social media (e.g., Twitter) as external sources when processing news articles. In [35], the most interesting tweets regarding a given news are selected by formulating the tweet selection as ... |
16 | Generating links to background knowledge: a case study using narrative radiology reports
- He, Rijke, et al.
- 2011
Citation Context ...rpretation. Just adding information, which is related to the entities and concepts mentioned in the text, as it is done in Wikification approaches, for example, [28, 29] or for a domain specific case =-=[17]-=-, is not sufficient for many reasons. First, we require a kind of a virtual time-travel, in which - by the information about the past - we are mentally transported into the time of content creation, i... |
13 | A term dependency-based approach for query terms ranking.
- Lee, Chen, et al.
- 2009
Citation Context ...d described before, we will not discuss them further. 4.3 Learning to Select Hook-based Queries Different methods based on ranking and selection of query terms from an initial query might be employed =-=[1, 24, 27]-=-, considering the entire set of hooks for a document as the initial query. We explore an adaptive method which formulates queries based on the characteristics of the input document and hooks. Our appr... |
10 | Ranking related news predictions
- Kanhabua, Blanco, et al.
- 2011
Citation Context ...t considers temporal information needs and the temporal profiles of documents. Retrieving and processing external information to be added to documents has gained increasing interest in recent years. In =-=[21]-=-, for example, news articles are enriched with related predictions – sentences containing temporal references to the future – retrieved from other documents in the same collection. Other works [13, 33... |
10 | Automatic selection of social media responses to news
- Štajner, Thomee, et al.
- 2013
Citation Context ...In [21], for example, news articles are enriched with related predictions – sentences containing temporal references to the future – retrieved from other documents in the same collection. Other works =-=[13, 33, 35]-=- exploit social media (e.g., Twitter) as external sources when processing news articles. In [35], the most interesting tweets regarding a given news are selected by formulating the tweet selection as ... |
9 | Joint topic modeling for event summarization across news and social media streams.
- Gao, Li, et al.
- 2012
Citation Context ...In [21], for example, news articles are enriched with related predictions – sentences containing temporal references to the future – retrieved from other documents in the same collection. Other works =-=[13, 33, 35]-=- exploit social media (e.g., Twitter) as external sources when processing news articles. In [35], the most interesting tweets regarding a given news are selected by formulating the tweet selection as ... |
7 | Compact Query Term Selection using Topically Related Text
- Maxwell, Croft
- 2013
Citation Context ...d described before, we will not discuss them further. 4.3 Learning to Select Hook-based Queries Different methods based on ranking and selection of query terms from an initial query might be employed =-=[1, 24, 27]-=-, considering the entire set of hooks for a document as the initial query. We explore an adaptive method which formulates queries based on the characteristics of the input document and hooks. Our appr... |
4 | Learning for Search Result Diversification
- Zhu, Lan, et al.
- 2014
Citation Context ...not looking for more information on the current context, but we try to re-construct the original context of a document. The contextualization task is also related to the diversification problem in IR =-=[7, 38, 39]-=-. [Figure 2: Time-aware re-contextualization approach.] In [38], different metrics are proposed to measure redundancy in order to investigate the novelty and redundancy of relevant documents in filterin... |
3 | Evaluation of machine-learning protocols for technology-assisted review in electronic discovery.
- Cormack, Grossman
- 2014
Citation Context ...approaches, which focus on precision metrics, we consider the performances of queries in terms of recall, which have been recently remarked and considered in different information retrieval scenarios =-=[8, 25]-=-. 3. APPROACH OVERVIEW In the general contextualization model underlying our approach we distinguish the information items d to be contextualized and the context source, where the information for the ... |
3 | ReQ-ReC: High Recall Retrieval with Query Pooling and Interactive Classification.
- Li, Wang, et al.
- 2014
Citation Context ...approaches, which focus on precision metrics, we consider the performances of queries in terms of recall, which have been recently remarked and considered in different information retrieval scenarios =-=[8, 25]-=-. 3. APPROACH OVERVIEW In the general contextualization model underlying our approach we distinguish the information items d to be contextualized and the context source, where the information for the ... |
1 | Query performance prediction for IR
- Carmel, Kurland
- 2012
Citation Context ...present queries and the document it belongs to are described in the rest of this section. It is composed of novel temporal features for query performance prediction, along with more standard features =-=[4, 16, 30]-=-. Linguistic Features. We compute a family of linguistic features [30] for a query by considering its text and the document it refers to. This results in a set of features both at query and document l... |
1 | Bridging temporal context gaps using time-aware re-contextualization
- Ceroni, Tran, et al.
- 2014
Citation Context ...essed in this paper is how such context can be computed for helping in the interpretation of past or forgotten stories, e.g., from a news archive. We call this process time-aware re-contextualization =-=[6]-=- or contextualization, for short. The process automatically provides complementing information to a textual document, which reflects required but not expressed context for fully understanding it. Alth... |
1 | Time-based query performance predictors
- Kanhabua, Nørvåg
- 2011
Citation Context ...s. The former are based only on the query and corpus-based statistics, while the latter also analyze the retrieved list of results. In line with the previous work on time–aware performance predictors =-=[23]-=-, we investigate novel features for QPP that explicitly take the temporal dimension into account. Differently from the previously mentioned approaches, which focus on precision metrics, we consider th... |
1 | Query-performance prediction: Setting the expectations straight
- Raiber, Kurland
- 2014
Citation Context ...r values and high correlation value, if compared with the performances in predicting query precision reported in previous works (e.g. =-=[5, 32]-=-), show that the recall of queries in our task can be predicted quite accurately by using the features described in Section 4.3.2. [Figure 3: Recall curves of document-based and hook-based methods.] Feature Analysis. In order to analyze which are the most important fe... |