Knowledge sharing and Yahoo Answers: Everyone knows something
- Proceedings of WWW'08
, 2008
Cited by 185 (4 self)
Yahoo Answers (YA) is a large and diverse question-answer forum, acting not only as a medium for sharing technical knowledge, but as a place where one can seek advice, gather opinions, and satisfy one’s curiosity about a countless number of things. In this paper, we seek to understand YA’s knowledge sharing activity. We analyze the forum categories and cluster them according to content characteristics and patterns of interaction among the users. While interactions in some categories resemble expertise sharing forums, others incorporate discussion, everyday advice, and support. With such a diversity of categories in which one can participate, we find that some users focus narrowly on specific topics, while others participate across categories. This not only allows us to map related categories, but to characterize the entropy of the users’ interests. We find that lower entropy correlates with receiving higher answer ratings, but only for categories where factual expertise is primarily sought after. We combine both user attributes and answer characteristics
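The "entropy of the users' interests" mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the paper's exact definition, so the code below assumes plain Shannon entropy (in bits) over the number of answers a user posts in each category; the function name, the category labels, and the counts are all hypothetical.

```python
import math
from collections import Counter


def interest_entropy(category_counts):
    """Shannon entropy (bits) of a user's answers across categories.

    Assumption: entropy is taken over per-category answer counts;
    the paper may define it differently.
    """
    total = sum(category_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in category_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical users: one answers in a single category, one spreads evenly.
focused_user = Counter({"Programming": 40})
broad_user = Counter({"Programming": 10, "Cooking": 10, "Sports": 10, "Pets": 10})

print(interest_entropy(focused_user))  # narrow focus, lowest possible entropy
print(interest_entropy(broad_user))    # even spread over four categories
```

Under this assumed definition, a user who answers in only one category has zero entropy, and a user spread evenly over four categories has two bits.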
Finding high-quality content in social media with an application to community-based question answering
- In Proceedings of WSDM
, 2008
Cited by 184 (14 self)
The quality of user-generated content varies drastically from excellent to abuse and spam. As the availability of such content increases, the task of identifying high-quality content in sites based on user contributions (social media sites) becomes increasingly important. Social media in general exhibit a rich variety of information sources: in addition to the content itself, there is a wide array of non-content information available, such as links between items and explicit quality ratings from members of the community. In this paper we investigate methods for exploiting such community feedback to automatically identify high-quality content. As a test case, we focus on Yahoo! Answers, a large community question/answering portal that is particularly rich in the amount and types of content and social interactions available in it. We introduce a general classification framework for combining the evidence from different sources of information, that can be tuned automatically for a given social media type and quality definition. In particular, for the community question/answering domain, we show that our system is able to separate high-quality items from the rest with an accuracy close to that of humans.
Design Lessons from the Fastest Q&A Site in the West
- CHI 2011
, 2011
Cited by 68 (1 self)
This paper analyzes a Question & Answer site for programmers, Stack Overflow, that dramatically improves on the utility and performance of Q&A systems for technical domains. Over 92% of Stack Overflow questions about expert topics are answered, in a median time of 11 minutes. Using a mixed methods approach that combines statistical data analysis with user interviews, we seek to understand this success. We argue that it is not primarily due to an a priori superior technical design, but also to the high visibility and daily involvement of the design team within the community they serve. This model of continued community leadership presents challenges to both CSCW systems research as well as to attempts to apply the Stack Overflow model to other specialized knowledge domains.
Identifying Topical Authorities in Microblogs
, 2011
Cited by 58 (1 self)
Content in microblogging systems such as Twitter is produced by tens to hundreds of millions of users. This diversity is a notable strength, but also presents the challenge of finding the most interesting and authoritative authors for any given topic. To address this, we first propose a set of features for characterizing social media authors, including both nodal and topical metrics. We then show how probabilistic clustering over this feature space, followed by a within-cluster ranking procedure, can yield a final list of top authors for a given topic. We present results across several topics, along with results from a user study confirming that our method finds authors who are significantly more interesting and authoritative than those resulting from several baseline conditions. Additionally, our algorithm is computationally feasible in near real-time scenarios, making it an attractive alternative for capturing the rapidly changing dynamics of microblogs.
Questions in, knowledge in?: a study of Naver’s question answering community
- In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’09
, 2009
Cited by 55 (2 self)
Large general-purposed community question-answering sites are becoming popular as a new venue for generating knowledge and helping users in their information needs. In this paper we analyze the characteristics of knowledge generation and user participation behavior in the largest question-answering online community in South Korea, Naver Knowledge–iN. We collected and analyzed over 2.6 million question/answer pairs from fifteen categories between 2002 and 2007, and have interviewed twenty-six users to gain insights into their motivations, roles, usage and expertise. We find altruism, learning, and competency are frequent motivations for top answerers to participate, but that participation is often highly intermittent. Using a simple measure of user performance, we find that higher levels of participation correlate with better performance. We also observe that users are motivated in part through a point system to build a comprehensive knowledge database. These and other insights have significant implications for future knowledge generating online communities.
Learning to Recognize Reliable Users and Content in Social Media with Coupled Mutual Reinforcement
- WWW 2009, Madrid (Track: Data Mining; Session: Graph Algorithms)
, 2009
Cited by 50 (2 self)
Community Question Answering (CQA) has emerged as a popular forum for users to pose questions for other users to answer. Over the last few years, CQA portals such as Naver and Yahoo! Answers have exploded in popularity, and now provide a viable alternative to general purpose Web search. At the same time, the answers to past questions submitted in CQA sites comprise a valuable knowledge repository which could be a gold mine for information retrieval and automatic question answering. Unfortunately, the quality of the submitted questions and answers varies widely, increasingly so, to the point that a large fraction of the content is not usable for answering queries. Previous approaches for retrieving relevant and high quality content have been proposed, but they require large amounts of manually labeled data – which
Efficient top-k querying over social-tagging networks
- In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08)
, 2008
Cited by 49 (4 self)
Online communities have become popular for publishing and searching content, as well as for finding and connecting to other users. User-generated content includes, for example, personal blogs, bookmarks, and digital photos. These items can be annotated and rated by different users, and these social tags and derived user-specific scores can be leveraged for searching relevant content and discovering subjectively interesting items. Moreover, the relationships among users can also be taken into consideration for ranking search results, the intuition being that you trust the recommendations of your close friends more than those of your casual acquaintances. Queries for tag or keyword combinations that compute and rank the top-k results thus face a large variety of options that complicate the query processing and pose efficiency challenges. This paper addresses these issues by developing an incremental top-k algorithm with two-dimensional expansions: social expansion considers the strength of relations among users, and semantic expansion considers the relatedness of different tags. It presents a new algorithm, based on principles of threshold algorithms, by folding friends and related tags into the search space in an incremental on-demand manner. The excellent performance of the method is demonstrated by an experimental evaluation on three real-world datasets, crawled from del.icio.us, Flickr, and LibraryThing.
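For readers unfamiliar with the "principles of threshold algorithms" this abstract refers to, the sketch below shows the basic, textbook threshold algorithm for top-k retrieval over per-tag score lists. It is not the paper's algorithm: it omits the social and semantic expansions entirely, assumes non-negative scores combined by summation, and uses hypothetical names and data throughout.

```python
import heapq


def threshold_top_k(score_lists, k):
    """Basic threshold algorithm: top-k items by summed score.

    score_lists: list of dicts mapping item -> non-negative score,
    one dict per query tag (an assumed input layout).
    """
    # Sorted access: walk each list in descending score order.
    sorted_lists = [sorted(d.items(), key=lambda kv: -kv[1]) for d in score_lists]
    seen = {}  # item -> total score, filled in via random access
    max_depth = max((len(s) for s in sorted_lists), default=0)
    for depth in range(max_depth):
        # Threshold: best total any not-yet-seen item could still reach.
        threshold = 0.0
        for entries in sorted_lists:
            if depth < len(entries):
                item, score = entries[depth]
                threshold += score
                if item not in seen:
                    seen[item] = sum(d.get(item, 0.0) for d in score_lists)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        # Stop early once the k-th best seen item meets the threshold.
        if len(top) == k and top[-1][1] >= threshold:
            return top
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])


# Hypothetical per-tag scores for four items.
tag_a = {"x": 0.9, "y": 0.8, "z": 0.1}
tag_b = {"y": 0.7, "w": 0.6, "x": 0.2}
result = threshold_top_k([tag_a, tag_b], k=2)
print([item for item, _ in result])
```

The early-stop test is what makes the approach incremental: lists are read only as deep as needed to prove that no unseen item can enter the top k.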
Discovering Value from Community Activity on Focused Question Answering Sites: A Case Study of Stack Overflow
- Proc. KDD, 2012
Cited by 39 (0 self)
Question answering (Q&A) websites are now large repositories of valuable knowledge. While most Q&A sites were initially aimed at providing useful answers to the question asker, there has been a marked shift towards question answering as a community-driven knowledge creation process whose end product can be of enduring value to a broad audience. As part of this shift, specific expertise and deep knowledge of the subject at hand have become increasingly important, and many Q&A sites employ voting and reputation mechanisms as centerpieces of their design to help users identify the trustworthiness and accuracy of the content. To better understand this shift in focus from one-off answers to a group knowledge-creation process, we consider a question together with its entire set of corresponding answers as our fundamental unit of analysis, in contrast with the focus on individual question-answer pairs that characterized previous work. Our investigation considers the dynamics of the community activity that shapes the set of answers, both how answers and voters arrive over time and how this influences the eventual outcome. For example, we observe significant assortativity in the reputations of co-answerers, relationships between reputation and answer speed, and that the probability of an answer being chosen as the best one strongly depends on temporal characteristics of answer arrivals. We then show that our understanding of such properties is naturally applicable to predicting several important quantities, including the long-term value of the question and its answers, as well as whether a question requires a better answer. Finally, we discuss the implications of these results for the design of Q&A sites.
Crowdsourcing and knowledge sharing: Strategic user behavior on Taskcn
- In Proceedings of the 9th ACM Conference on Electronic Commerce, EC ’08
, 2008
Cited by 34 (3 self)
Witkeys are a thriving type of web-based knowledge sharing market in China, supporting a form of crowdsourcing. In a Witkey site, users offer a small award for a solution to a task, and other users compete to have their solution selected. In this paper, we examine the behavior of users on one of the biggest Witkey websites in China, Taskcn.com. On Taskcn, we observed several characteristics in users' activity over time. Most users become inactive after only a few submissions. Others keep attempting tasks. Over time, users tend to select tasks where they are competing against fewer opponents to increase their chances of winning. They will also, perhaps counterproductively, select tasks with higher expected rewards. Yet, on average, they do not increase their chances of winning, and in some categories of tasks, their chances actually decrease. This does not paint the full picture, however, because there is a very small core of successful users who manage not only to win multiple tasks, but to increase their win-to-submission ratio over time. This core group proposes nearly 20% of the winning solutions on the site. The patterns we observe on Taskcn, we believe, hold clues to the future of crowdsourcing and freelance marketplaces, and raise interesting design implications for such sites.