Results 1 - 4 of 4
Wikipedia vandalism detection: Combining natural language, metadata, and reputation features
- In CICLing’11: Proceedings of the 12th International Conference on Intelligent Text Processing and Computational Linguistics, LNCS 6609, 2011
Cited by 6 (3 self)
Abstract. Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatiotemporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performance of the resulting joint system improves on the state of the art set by all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism and for the task of locating vandalism in the complete set of Wikipedia revisions.
filtering
Wikipedia is an online encyclopedia that anyone can access and edit. It has become one of the most important sources of knowledge online, and many third-party projects rely on it for a wide range of purposes. The open model of Wikipedia allows pranksters, lobbyists, and spammers to attack the integrity of the encyclopedia, which endangers it as a public resource. This is known in the community as vandalism. A plethora of methods have been developed within the Wikipedia and scientific communities to tackle this problem. We have participated in this effort and developed one of the leading approaches. Our research aims to create a fully working antivandalism system and deploy it in the real world.
Edit wars in Wikipedia
Abstract. We present a new, efficient method for automatically detecting severe conflicts, or ‘edit wars’, in Wikipedia and evaluate this method on six different language Wikipedias. We discuss how the numbers of edits and reverts on such pages deviate from those on pages following the general workflow, and argue that earlier work has significantly overestimated the contentiousness of the Wikipedia editing process.
Combining Natural Language, Metadata, and Reputation Features
Abstract. Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatiotemporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performance of the resulting joint system improves on the state of the art set by all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism and for the task of locating vandalism in the complete set of Wikipedia revisions.

