Results 1 - 10 of 57
Facts or friends?: distinguishing informational and conversational questions in social Q&A sites
- In CHI, 2009
"... Tens of thousands of questions are asked and answered every day on social question and answer (Q&A) Web sites such as Yahoo Answers. While these sites generate an enormous volume of searchable data, the problem of determining which questions and answers are archival quality has grown. One major ..."
Abstract - Cited by 62 (3 self)
Tens of thousands of questions are asked and answered every day on social question and answer (Q&A) Web sites such as Yahoo Answers. While these sites generate an enormous volume of searchable data, the problem of determining which questions and answers are archival quality has grown. One major component of this problem is the prevalence of conversational questions, identified both by Q&A sites and academic literature as questions that are intended simply to start discussion. For example, a conversational question such as “do you believe in evolution?” might successfully engage users in discussion, but probably will not yield a useful web page for users searching for information about evolution. Using data from three popular Q&A sites, we confirm that humans can reliably distinguish between these conversational questions and other informational questions, and present evidence that conversational questions typically have much lower potential archival value than informational questions. Further, we explore the use of machine learning techniques to automatically classify questions as conversational or informational, learning in the process about categorical, linguistic, and social differences between different question types. Our algorithms approach human performance, attaining 89.7% classification accuracy in our experiments. Author Keywords: Q&A, online community, machine learning.
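The classification task this abstract describes can be illustrated with a minimal bag-of-words Naive Bayes sketch. This is an illustrative toy, not the paper's feature set or its 89.7%-accuracy model, and the example questions are invented:

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label). Returns per-label word counts,
    per-label document counts, and the vocabulary."""
    counts = {}       # label -> Counter of word frequencies
    docs = Counter()  # label -> number of training documents
    for text, label in examples:
        docs[label] += 1
        counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {w for c in counts.values() for w in c}
    return counts, docs, vocab

def classify(text, counts, docs, vocab):
    """Pick the label maximizing the log posterior, with Laplace smoothing."""
    total_docs = sum(docs.values())
    best_label, best_score = None, -math.inf
    for label, c in counts.items():
        score = math.log(docs[label] / total_docs)  # log prior
        total = sum(c.values())
        for w in text.lower().split():
            score += math.log((c[w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy examples in the spirit of the paper's distinction
examples = [
    ("do you believe in evolution", "conversational"),
    ("what is your favorite movie", "conversational"),
    ("do you like rain", "conversational"),
    ("how do i install python", "informational"),
    ("what is the capital of france", "informational"),
    ("how to fix a flat tire", "informational"),
]
model = train(examples)
```

On this toy data, `classify("do you think cats are better than dogs", *model)` comes out conversational, since second-person wording dominates the conversational class.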
Discovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow
- Proc. KDD, 2012
"... Question answering (Q&A) websites are now large repositories of valuable knowledge. While most Q&A sites were initially aimed at providing useful answers to the question asker, there has been a marked shift towards question answering as a community-driven knowledge creation process whose end ..."
Abstract - Cited by 39 (0 self)
Question answering (Q&A) websites are now large repositories of valuable knowledge. While most Q&A sites were initially aimed at providing useful answers to the question asker, there has been a marked shift towards question answering as a community-driven knowledge creation process whose end product can be of enduring value to a broad audience. As part of this shift, specific expertise and deep knowledge of the subject at hand have become increasingly important, and many Q&A sites employ voting and reputation mechanisms as centerpieces of their design to help users identify the trustworthiness and accuracy of the content. To better understand this shift in focus from one-off answers to a group knowledge-creation process, we consider a question together with its entire set of corresponding answers as our fundamental unit of analysis, in contrast with the focus on individual question-answer pairs that characterized previous work. Our investigation considers the dynamics of the community activity that shapes the set of answers, both how answers and voters arrive over time and how this influences the eventual outcome. For example, we observe significant assortativity in the reputations of co-answerers, relationships between reputation and answer speed, and that the probability of an answer being chosen as the best one strongly depends on temporal characteristics of answer arrivals. We then show that our understanding of such properties is naturally applicable to predicting several important quantities, including the long-term value of the question and its answers, as well as whether a question requires a better answer. Finally, we discuss the implications of these results for the design of Q&A sites.
Designing Incentives for Online Question and Answer Forums
"... In this paper, we provide a simple game-theoretic model of an online question and answer forum. We focus on factual questions in which user responses aggregate while a question remains open. Each user has a unique piece of information and can decide when to report this information. The asker prefers ..."
Abstract - Cited by 29 (5 self)
In this paper, we provide a simple game-theoretic model of an online question and answer forum. We focus on factual questions in which user responses aggregate while a question remains open. Each user has a unique piece of information and can decide when to report this information. The asker prefers to receive information sooner rather than later, and will stop the process when satisfied with the cumulative value of the posted information. We consider two distinct cases: a complements case, in which each successive piece of information is worth more to the asker than the previous one; and a substitutes case, in which each successive piece of information is worth less than the previous one. A best-answer scoring rule is adopted to model Yahoo! Answers, and is effective for substitutes information, where it isolates an equilibrium in which all users respond in the first round. But we find that this rule is ineffective for complements information, isolating instead an equilibrium in which all users respond in the final round. In addressing this, we demonstrate that an approval-voting scoring rule and a proportional-share scoring rule can enable the most efficient equilibrium with complements information, under certain conditions, by providing incentives for early responders as well as the user who submits the final answer.
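The difference between the scoring rules this abstract compares can be sketched in a few lines. The numbers are assumed illustrative contribution values, not the paper's formal game-theoretic model: a proportional-share rule splits a fixed reward in proportion to each responder's contribution, whereas a best-answer rule pays everything to one winner, leaving early responders with nothing:

```python
def proportional_share(contributions, reward=100.0):
    """Split a fixed reward among responders in proportion to the value
    of the information each contributed."""
    total = sum(contributions.values())
    return {user: reward * value / total for user, value in contributions.items()}

def best_answer(contributions, reward=100.0):
    """Give the entire reward to the single highest-value responder."""
    winner = max(contributions, key=contributions.get)
    return {user: (reward if user == winner else 0.0) for user in contributions}

# Assumed values for a complements scenario: each successive piece of
# information is worth more than the previous one.
values = {"early": 10.0, "middle": 20.0, "late": 40.0}
payoffs = proportional_share(values)
```

Under proportional share, the early responder still receives a positive payoff (100 × 10/70 ≈ 14.3), while the best-answer rule pays early responders nothing, which hints at why the latter pushes everyone toward responding in the final round with complements information.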
Ranking Community Answers by Modeling Question-Answer Relationships via Analogical Reasoning
"... The method of finding high-quality answers has a significant impact on users ’ satisfaction in a community question answering system. However, due to the lexical gap between questions and answers as well as spam typically contained in user-generated content, filtering and ranking answers is very cha ..."
Abstract - Cited by 20 (2 self)
The method of finding high-quality answers has a significant impact on users’ satisfaction in a community question answering system. However, due to the lexical gap between questions and answers as well as spam typically contained in user-generated content, filtering and ranking answers is very challenging. Existing solutions mainly focus on generating redundant features, or finding textual clues using machine learning techniques; none of them ever consider questions and their answers as relational data but instead model them as independent information. Meanwhile, they only consider the answers of the current question, and ignore any previous knowledge that would be helpful to bridge the lexical and semantic gap. We assume that answers are connected to their questions with various types of links, i.e., positive links indicating high-quality answers, negative links indicating incorrect answers or user-generated spam, and propose an analogical reasoning-based approach which measures the analogy between the new question-answer linkages and those of some previous relevant knowledge which contains only positive links; the candidate answer which has the most analogous link to the supporting set is assumed to be the best answer. We conducted our experiments based on 29.8 million Yahoo! Answers question-answer threads and showed the effectiveness of our proposed approach.
A Generalized Framework of Exploring Category Information for Question Retrieval in Community Question Answer Archives
- In Proceedings of the 19th International World Wide Web Conference (WWW 2010), 2010
"... Community Question Answering (CQA) has emerged as a popu-lar type of service where users ask and answer questions and ac-cess historical question-answer pairs. CQA archives contain very large volumes of questions organized into a hierarchy of categories. As an essential function of CQA services, que ..."
Abstract - Cited by 14 (0 self)
Community Question Answering (CQA) has emerged as a popular type of service where users ask and answer questions and access historical question-answer pairs. CQA archives contain very large volumes of questions organized into a hierarchy of categories. As an essential function of CQA services, question retrieval in a CQA archive aims to retrieve historical question-answer pairs that are relevant to a query question. In this paper, we present a new approach to exploiting category information of questions for improving the performance of question retrieval, and we apply the approach to existing question retrieval models, including a state-of-the-art question retrieval model. Experiments conducted on real CQA data demonstrate that the proposed techniques are capable of outperforming a variety of baseline methods significantly.
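One generic way to fold category information into question retrieval is to smooth each question's language model with its category's language model in a query-likelihood score. This is a sketch of that general idea with invented toy data, not necessarily the paper's exact formulation:

```python
from collections import Counter

def category_lm_score(query, question, category, lam=0.5, eps=1e-9):
    """Query-likelihood score for a historical question, interpolating the
    question's language model with its category's language model:
        P(w | question, category) = (1 - lam) * P(w | question) + lam * P(w | category)
    query, question, and category are token lists; the category model is
    built from all tokens of questions filed under that category."""
    q_counts, c_counts = Counter(question), Counter(category)
    q_len, c_len = len(question), len(category)
    score = 1.0
    for w in query:
        p_question = q_counts[w] / q_len
        p_category = c_counts[w] / c_len
        score *= (1 - lam) * p_question + lam * p_category + eps  # eps avoids zeros
    return score

# Hypothetical tokenized questions sharing one category
category = "how do i install a python package best pizza recipe ever".split()
relevant = "how do i install a python package".split()
off_topic = "best pizza recipe ever".split()
s_relevant = category_lm_score(["install", "python"], relevant, category)
s_off_topic = category_lm_score(["install", "python"], off_topic, category)
```

Category smoothing keeps the off-topic question's score nonzero (its category still contains the query terms) while the question that actually matches the query scores strictly higher.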
Competition-based user expertise score estimation
- In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011
"... ABSTRACT In this paper, we consider the problem of estimating the relative expertise score of users in community question and answering services (CQA). Previous approaches typically only utilize the explicit question answering relationship between askers and answerers and apply link analysis to add ..."
Abstract - Cited by 12 (2 self)
In this paper, we consider the problem of estimating the relative expertise score of users in community question and answering services (CQA). Previous approaches typically only utilize the explicit question answering relationship between askers and answerers and apply link analysis to address this problem. The implicit pairwise comparison between two users that is implied in the best answer selection is ignored. Given a question-answering thread, it is likely that the expertise score of the best answerer is higher than the asker's and all other non-best answerers'. The goal of this paper is to explore such pairwise comparisons inferred from best answer selections to estimate the relative expertise scores of users. Formally, we treat each pairwise comparison between two users as a two-player competition with one winner and one loser. Two competition models are proposed to estimate user expertise from pairwise comparisons. Using the NTCIR-8 CQA task data with 3 million questions and introducing answer quality prediction based evaluation metrics, the experimental results show that the pairwise comparison based competition model significantly outperforms link analysis based approaches (PageRank and HITS) and pointwise approaches (number of best answers and best answer ratio) for estimating the expertise of active users. Furthermore, it is shown that pairwise comparison based competition models have better discriminative power than other methods. It is also found that answer quality (best answer) is an important factor to estimate user expertise.
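The pairwise-competition idea can be sketched with a simple Elo-style rating update. This is an illustrative stand-in, not either of the paper's two competition models: each thread yields competitions in which the best answerer "beats" the asker and every non-best answerer, and ratings shift by a surprise-weighted amount:

```python
def expected_win(r_winner, r_loser):
    """Logistic (Elo-style) probability that the first player wins."""
    return 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))

def elo_update(r_winner, r_loser, k=32.0):
    """Winner gains, loser drops, by the same surprise-weighted amount."""
    gain = k * (1.0 - expected_win(r_winner, r_loser))
    return r_winner + gain, r_loser - gain

def rate_thread(ratings, asker, best_answerer, other_answerers, k=32.0):
    """Treat one Q&A thread as a set of two-player competitions, each won
    by the user whose answer was selected as best."""
    for loser in [asker] + list(other_answerers):
        ratings[best_answerer], ratings[loser] = elo_update(
            ratings[best_answerer], ratings[loser], k)
    return ratings

# Hypothetical users, all starting from the same baseline rating
ratings = {"asker": 1200.0, "best": 1200.0, "other": 1200.0}
rate_thread(ratings, asker="asker", best_answerer="best",
            other_answerers=["other"])
```

After one thread the best answerer's rating rises above the baseline while the asker's and the non-best answerer's fall below it; repeated over many threads, this yields the kind of relative expertise ordering the abstract describes.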
Predicting web searcher satisfaction with existing community-based answers
- In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011
"... Community-based Question Answering (CQA) sites, such as Yahoo! Answers, Baidu Knows, Naver, and Quora, have been rapidly growing in popularity. The resulting archives of posted answers to questions, in Yahoo! Answers alone, already exceed in size 1 billion, and are aggressively indexed by web search ..."
Abstract - Cited by 12 (2 self)
Community-based Question Answering (CQA) sites, such as Yahoo! Answers, Baidu Knows, Naver, and Quora, have been rapidly growing in popularity. The resulting archives of posted answers to questions, in Yahoo! Answers alone, already exceed 1 billion in size, and are aggressively indexed by web search engines. In fact, a large number of search engine users benefit from these archives, by finding existing answers that address their own queries. This scenario poses new challenges and opportunities for both search engines and CQA sites. To this end, we formulate a new problem of predicting the satisfaction of web searchers with CQA answers. We analyze a large number of web searches that result in a visit to a popular CQA site, and identify unique characteristics of searcher satisfaction in this setting, namely, the effects of query clarity, query-to-question match, and answer quality. We then propose and evaluate several approaches to predicting searcher satisfaction that exploit these characteristics. To the best of our knowledge, this is the first attempt to predict and validate the usefulness of CQA archives for external searchers, rather than for the original askers. Our results suggest promising directions for improving and exploiting community question answering services in pursuit of satisfying even more Web search queries.
Analyzing and predicting question quality in community question answering services
- In Proc. of CQA Workshop (WWW), 2012
"... Users tend to ask and answer questions in community question answering (CQA) services to seek information and share knowledge. A corollary is that myriad of questions and answers appear in CQA service. Accordingly, volumes of studies have been taken to explore the answer quality so as to provide a p ..."
Abstract - Cited by 11 (0 self)
Users tend to ask and answer questions in community question answering (CQA) services to seek information and share knowledge. As a result, myriad questions and answers appear in CQA services. Accordingly, many studies have explored answer quality so as to provide a preliminary screening for better answers. However, to our knowledge, less attention has so far been paid to question quality in CQA. Knowing question quality helps in finding and recommending good questions, as well as in identifying bad ones that hinder the CQA service. In this paper, we conduct two studies to investigate the question quality issue. The first study analyzes the factors of question quality and finds that the interaction between askers and topics accounts for differences in question quality. Based on this finding, in the second study we propose a Mutual Reinforcement-based Label Propagation (MRLP) algorithm to predict question quality. We experiment with Yahoo! Answers data and the results demonstrate the effectiveness of our algorithm in distinguishing high-quality questions from low-quality ones. Categories and Subject Descriptors: H.3.4 [System and Software]: question answering (fact retrieval) systems; H.3.5 [Online Information Services]: Web-based services
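The mutual-reinforcement idea behind asker-topic label propagation can be sketched generically. This is not the paper's MRLP algorithm, just an illustrative asker-question propagation on invented data: asker scores are estimated from the quality of their questions, and unlabeled questions inherit from their askers, iterated to a fixed point:

```python
def propagate(question_asker, seed_labels, alpha=0.5, iters=100):
    """question_asker: dict question_id -> asker_id.
    seed_labels: dict question_id -> known quality in [0, 1]; other
    questions are unlabeled. Alternately update asker scores (mean quality
    of their questions) and question scores (blend of the question's own
    seed label or previous score with its asker's score)."""
    q_score = {q: seed_labels.get(q, 0.5) for q in question_asker}
    askers = set(question_asker.values())
    a_score = {a: 0.5 for a in askers}
    for _ in range(iters):
        for a in askers:
            owned = [q_score[q] for q, u in question_asker.items() if u == a]
            a_score[a] = sum(owned) / len(owned)
        for q, a in question_asker.items():
            anchor = seed_labels.get(q, q_score[q])
            q_score[q] = alpha * anchor + (1 - alpha) * a_score[a]
    return q_score

# q2 is unlabeled; it was asked by the same (hypothetical) user as the
# known-good q1, so its score drifts upward via the shared asker.
scores = propagate({"q1": "u1", "q2": "u1", "q3": "u2"},
                   {"q1": 1.0, "q3": 0.0})
```

The unlabeled q2 converges toward a high score because its asker also produced a known high-quality question, while q3, whose asker has only a known low-quality question, stays near zero.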
You’ve Got Answers: Towards Personalized Models for Predicting Success in Community Question Answering
"... Question answering communities such as Yahoo! Answers have emerged as a popular alternative to general-purpose web search. By directly interacting with other participants, information seekers can obtain specific answers to their questions. However, user success in obtaining satisfactory answers vari ..."
Abstract - Cited by 9 (0 self)
Question answering communities such as Yahoo! Answers have emerged as a popular alternative to general-purpose web search. By directly interacting with other participants, information seekers can obtain specific answers to their questions. However, user success in obtaining satisfactory answers varies greatly. We hypothesize that satisfaction with the contributed answers is largely determined by the asker’s prior experience, expectations, and personal preferences. Hence, we begin to develop personalized models of asker satisfaction to predict whether a particular question author will be satisfied with the answers contributed by the community participants. We formalize this problem, and explore a variety of content, structure, and interaction features for this task using standard machine learning techniques. Our experimental evaluation over thousands of real questions indicates that indeed it is beneficial to personalize satisfaction predictions when sufficient prior user history exists, significantly improving accuracy over a “one-size-fits-all” prediction model.
Supporting synchronous social Q&A throughout the question lifecycle
- In WWW, 2011
"... Synchronous social Q&A systems exist on the Web and in the enterprise to connect people with questions to people with answers in real-time. In such systems, askers ’ desire for quick answers is in tension with costs associated with interrupting numerous candidate answerers per question. Supporti ..."
Abstract - Cited by 7 (2 self)
Synchronous social Q&A systems exist on the Web and in the enterprise to connect people with questions to people with answers in real-time. In such systems, askers’ desire for quick answers is in tension with costs associated with interrupting numerous candidate answerers per question. Supporting users of synchronous social Q&A systems at various points in the question lifecycle (from conception to answer) helps askers make informed decisions about the likelihood of question success and helps answerers face fewer interruptions. For example, predicting that a question will not be well answered may lead the asker to rephrase or retract the question. Similarly, predicting that an answer is not forthcoming during the dialog can prompt system behaviors such as finding other answerers to join the conversation. As another example, predictions of asker satisfaction can be assigned to completed conversations and used for later retrieval. In this paper, we use data from an instant-messaging-based synchronous social Q&A service deployed to an online community of over two thousand users to study the prediction of: (i) whether a question will be answered, (ii) the number of candidate answerers that the question will be sent to, and (iii) whether the asker will be satisfied by the answer received. Predictions are made at many points of the question lifecycle (e.g., when the question is entered, when the answerer is located, halfway through the asker-answerer dialog, etc.). The findings from our study show that we can learn capable models for these tasks using a broad range of features derived from user profiles, system interactions, question setting, and the dialog between asker and answerer. Our research can lead to more sophisticated and more useful real-time Q&A support.