Mining meaning from Wikipedia
2009
Cited by 76 (2 self)
Wikipedia is a goldmine of information, not just for its many readers but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval; using it for information extraction; and using it as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced.
Measuring self-focus bias in community-maintained knowledge repositories
In Proc. C&T, 2009
Cited by 17 (7 self)
Self-focus is a novel way of understanding a type of bias in community-maintained Web 2.0 graph structures. It goes beyond previous measures of topical coverage bias by encapsulating both node- and edge-hosted biases in a single holistic measure of an entire community-maintained graph. We outline two methods to quantify self-focus, one of which is very computationally inexpensive, and present empirical evidence for the existence of self-focus using a “hyperlingual” approach that examines 15 different language editions of Wikipedia. We suggest applications of our methods and discuss the risks of ignoring self-focus bias in technological applications.
Determinants of Wikipedia Quality: The Roles of Global and Local Contribution Inequality
In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work (CSCW)
Cited by 16 (1 self)
The success of Wikipedia and the relatively high quality of its articles seem to contradict conventional wisdom. Recent studies have begun shedding light on the processes contributing to Wikipedia’s success, highlighting the role of coordination and contribution inequality. In this study, we expand on these works in two ways. First, we make a distinction between global (Wikipedia-wide) and local (article-specific) inequality and investigate both constructs. Second, we explore both direct and indirect effects of these inequalities, exposing the intricate relationships between global inequality, local inequality, coordination, and article quality. We tested our hypotheses on a sample of Wikipedia articles using structural equation modeling and found that global inequality exerts a significant positive impact on article quality, while the effect of local inequality is indirect and is mediated by coordination.
Author Keywords: Wikipedia, quality, contributing inequality, coordination.
WikiTranslate: Query Translation for Cross-lingual Information Retrieval using only Wikipedia
Cited by 14 (0 self)
This paper presents WikiTranslate, a system which performs query translation for cross-lingual information retrieval (CLIR) using only Wikipedia to obtain translations. Queries are mapped to Wikipedia concepts, and the corresponding translations of these concepts in the target language are used to create the final query. WikiTranslate is evaluated by searching with topics in Dutch, French and Spanish in an English data collection. The system achieved 67% of the performance of the monolingual baseline.
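The mapping step described in this abstract can be illustrated with a minimal sketch. It is not WikiTranslate's actual implementation: the function name `translate_query`, the two lookup tables, and the toy Dutch-to-English data are all hypothetical, and the sketch assumes that each query term matches at most one Wikipedia concept and that translations come from a precomputed table of cross-language links.

```python
def translate_query(query_terms, concept_index, interlanguage_links, target_lang):
    """Translate query terms by looking up cross-language links of matched concepts.

    concept_index: hypothetical map from a lowercased source-language term
        to the title of the Wikipedia page (concept) it matches.
    interlanguage_links: hypothetical map from a concept title to a dict of
        {language code: title of the corresponding page in that language}.
    """
    translated = []
    for term in query_terms:
        concept = concept_index.get(term.lower())
        if concept is None:
            continue  # term matches no concept: dropped in this sketch
        title = interlanguage_links.get(concept, {}).get(target_lang)
        if title is not None:
            translated.append(title)
    return translated


# Toy data (assumed for illustration only).
concept_index = {"fiets": "Fiets", "geschiedenis": "Geschiedenis"}
interlanguage_links = {
    "Fiets": {"en": "Bicycle"},
    "Geschiedenis": {"en": "History"},
}

english_query = translate_query(
    ["Fiets", "geschiedenis", "xyzzy"], concept_index, interlanguage_links, "en"
)
```

Terms without a matching concept or without a target-language link are simply skipped here; a real system would need a fallback for them.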
Terabytes of Tobler: evaluating the first law in a massive, domain-neutral representation of world knowledge
In COSIT ’09, 2009
Cited by 10 (7 self)
The First Law of Geography states, “everything is related to everything else, but near things are more related than distant things.” Despite the fact that it is to a large degree what makes “spatial special,” the law has never been empirically evaluated on a large, domain-neutral representation of world knowledge. We address the gap in the literature about this critical idea by statistically examining the multitude of entities and relations between entities present across 22 different language editions of Wikipedia. We find that, at least according to the myriad authors of Wikipedia, the First Law is true to an overwhelming extent regardless of language-defined cultural domain.
Defining, understanding, and supporting open collaboration: Lessons from the literature
American Behavioral Scientist, 2013
Cited by 7 (2 self)
The past twenty years have seen broad popularization of a relatively novel kind of human enterprise: open collaboration. Open collaboration projects are distributed, collaborative efforts made possible because of changes in information and communication technology that facilitate cooperative activities. The groundswell of open collaboration could be felt in the open source movement of the 90s but became unmistakable with the growth of projects like Wikipedia and, in particular, the maturation of research to help explain how and why such systems work, who participates, and when they might fail. By now thousands of scholars have written about open collaboration systems, many hundreds of thousands of people have participated in them, and millions of people use products of open collaboration every day. This special issue of American Behavioral Scientist assembles interdisciplinary scholarship that examines different aspects of open collaboration and the diverse systems that support it. The goal of this short introductory piece is to define open collaboration and contextualize a set of articles that span multiple disciplines and methods in a common vocabulary and history. We provide a definition of open collaboration and situate the phenomenon within an interrelated set of scholarly and ideological movements. We then examine the properties of open collaboration systems that have given rise to research and review major areas of scholarship, including the works in this issue, and close with a
Wikipedia Pages as Entry Points for Book Search
In Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM 2009), 2009
Cited by 6 (2 self)
A lot of the world’s knowledge is stored in books, which, as a result of recent mass-digitisation efforts, are increasingly available online. Search engines, such as Google Books, provide mechanisms for searchers to enter this vast knowledge space using queries as entry points. In this paper, we view Wikipedia as a summary of this world knowledge and aim to use this resource to guide users to relevant books. Thus, we investigate possible ways of using Wikipedia as an intermediary between the user’s query and a collection of books being searched. We experiment with traditional query expansion techniques, exploiting Wikipedia articles as rich sources of information that can augment the user’s query. We then propose a novel approach based on link distance in an extended Wikipedia graph: we associate books with Wikipedia pages that cite these books and use the link distance between these nodes and the pages that match the user query as an estimation of a book’s relevance to the query. Our results show that a) classical query expansion using terms extracted from query pages leads to increased precision, and b) link distance between query and book pages in Wikipedia provides a good indicator of relevance that can boost the retrieval score of relevant books in the result ranking of a book search engine.
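The link-distance idea in this abstract can be sketched as a breadth-first search. This is only an illustration under stated assumptions, not the authors' method: the function names, the toy graph, and the book identifiers are hypothetical; links are treated as directed; and a book's distance is taken as the minimum over the pages that cite it, which is one of several possible choices.

```python
from collections import deque


def link_distances(graph, sources):
    """Multi-source BFS: fewest links from any source page to each reachable page."""
    dist = {page: 0 for page in sources}
    queue = deque(dist)
    while queue:
        page = queue.popleft()
        for neighbour in graph.get(page, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[page] + 1
                queue.append(neighbour)
    return dist


def book_distances(graph, query_pages, citing_pages):
    """Distance from the query pages to each book, via the pages citing that book.

    citing_pages: hypothetical map from a book id to the Wikipedia pages citing it.
    Books whose citing pages are all unreachable are omitted from the result.
    """
    dist = link_distances(graph, query_pages)
    result = {}
    for book, pages in citing_pages.items():
        reachable = [dist[p] for p in pages if p in dist]
        if reachable:
            result[book] = min(reachable)
    return result


# Toy link graph (assumed for illustration only): page -> pages it links to.
graph = {
    "Whale": ["Moby-Dick", "Ocean"],
    "Moby-Dick": ["Herman Melville"],
    "Ocean": ["Tide"],
}
citing_pages = {"book_a": ["Moby-Dick"], "book_b": ["Tide"], "book_c": ["Unlinked"]}

distances = book_distances(graph, ["Whale"], citing_pages)
```

A smaller distance would suggest a more relevant book; how that distance is folded into a retrieval score is left open here, since the abstract does not specify it.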
VidWiki: Enabling the Crowd to Improve the Legibility of Online Educational Videos
Cited by 6 (0 self)
Videos are becoming an increasingly popular medium for communicating information, especially for online education. Recent efforts by organizations like Coursera, edX, Udacity and Khan Academy have produced thousands of educational videos with hundreds of millions of views in their attempt to make high quality teaching available to the masses. As a medium, videos are time-consuming to produce and cannot be easily modified after release. As a result, errors or problems with legibility are common. While text-based information platforms like Wikipedia have benefitted enormously from crowdsourced contributions for the creation and improvement of content, the various limitations of video hinder the collaborative editing and improvement of educational videos. To address this issue, we present VidWiki, an online platform that enables students to iteratively improve the presentation quality and content of educational videos. Through the platform, users can improve the legibility of handwriting, correct errors, or translate text in videos by overlaying typeset content such as text, shapes, equations, or images. We conducted a small user study in which 13 novice users annotated and revised Khan Academy videos. Our results suggest that with only a small investment of time on the part of viewers, it may be possible to make meaningful improvements in online educational videos.
Author Keywords: Online education; massive open online course;
Multilinguals and Wikipedia editing
In Proc. WebSci 2014, ACM, 2014
Cited by 3 (2 self)
This article analyzes one month of edits to Wikipedia in order to examine the role of users editing multiple language editions (referred to as multilingual users). Such multilingual users may serve an important function in diffusing information across different language editions of the encyclopedia, and prior work has suggested this could reduce the level of self-focus bias in each edition. This study finds multilingual users are much more active than their single-edition (monolingual) counterparts. They are found in all language editions, but smaller-sized editions with fewer users have a higher percentage of multilingual users than larger-sized editions. About a quarter of multilingual users always edit the same articles in multiple languages, while just over 40% of multilingual users edit different articles in different languages. When non-English users do edit a second language edition, that edition is most frequently English. Nonetheless, several regional and linguistic cross-editing patterns are also present.
Coordination and learning in Wikipedia: Revisiting the dynamics of exploitation and exploration.
Research in the Sociology of Organizations, 2013
Cited by 2 (1 self)
The evolution of Wikipedia betrays an increasing reliance on policies and guidelines, signalling certain stabilisation in the knowledge making processes underlying the encyclopaedia. We interpret such a state of affairs as reflecting the need to provide a few principles and guidelines of coordination, in a context that has otherwise been marked by vast diversity, high membership turnover and the lack of traditional exploitative structures. Rather than reflecting bureaucratisation and a shift away from its constitutive principles, the consolidation of these coordinative mechanisms further embeds the distinctive profile of knowledge making processes characteristic of the online encyclopaedia. They reinforce the diversity of the collective (rather than individual capabilities and skills) as the primary source of knowledge and render the mechanisms of harvesting