Ensembles for Unsupervised Outlier Detection: Challenges and Research Questions [Position Paper]
Abstract

Cited by 5 (3 self)
Ensembles for unsupervised outlier detection is an emerging topic that has been neglected for a surprisingly long time (although there are reasons why this is more difficult than supervised ensembles or even clustering ensembles). Aggarwal recently discussed algorithmic patterns of outlier detection ensembles, identified traces of the idea in the literature, and remarked on potential as well as unlikely avenues for future transfer of concepts from supervised ensembles. Complementary to his points, here we focus on the core ingredients for building an outlier ensemble, discuss the first steps taken in the literature, and identify challenges for future research.
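The abstract does not spell out how ensemble members are combined, so the following is only a minimal sketch of one common pattern: bring each detector's scores onto a shared scale, then average them per point. The function names `min_max_normalize` and `combine_by_average` are hypothetical, and min-max scaling with a plain mean is an assumption made for illustration, not the method proposed in the paper.

```python
def min_max_normalize(scores):
    """Rescale one detector's scores to [0, 1] so detectors become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant score vector carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combine_by_average(score_lists):
    """Average the normalized scores of several detectors, point by point.

    score_lists: one list of outlier scores per detector, all of equal
    length, where a higher score means "more outlying" (an assumption).
    """
    normalized = [min_max_normalize(scores) for scores in score_lists]
    num_points = len(normalized[0])
    num_detectors = len(normalized)
    return [
        sum(detector[i] for detector in normalized) / num_detectors
        for i in range(num_points)
    ]
```

Calling `combine_by_average([scores_a, scores_b])` with two equal-length score lists returns one combined score per point. Averaging is only one option; taking the maximum or combining ranks are alternatives with different trade-offs.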
Netray: Visualizing and mining billion-scale graphs
in Advances in Knowledge Discovery and Data Mining, 2014
Abstract

Cited by 2 (1 self)
How can we visualize billion-scale graphs? How to spot outliers in such graphs quickly? Visualizing graphs is the most direct way of understanding them; however, billion-scale graphs are very difficult to visualize since the amount of information overflows the resolution of a typical screen. In this paper we propose NETRAY, an open-source package for visualization-based mining on billion-scale graphs. NETRAY visualizes graphs using the spy plot (adjacency matrix patterns), distribution plot, and correlation plot, which involve careful node ordering and scaling. In addition, NETRAY efficiently summarizes scatter clusters of graphs in a way that finds outliers automatically, and makes it easy to interpret them visually. Extensive experiments show that NETRAY handles very large graphs with billions of nodes and edges efficiently and effectively. Specifically, among the various datasets that we study, we visualize in multiple ways the YahooWeb graph, which spans 1.4 billion webpages and 6.6 billion links, and the Twitter who-follows-whom graph, which consists of 62.5 million users and 1.8 billion edges. We report interesting clusters and outliers spotted and summarized by NETRAY.
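To illustrate why a spy plot of a very large graph has to be scaled down, here is a minimal sketch that bins the adjacency matrix into a coarse grid and counts the edges per cell, so the result fits a screen. The function `binned_spy_counts` and its parameters are hypothetical; NETRAY's actual node ordering and scaling are not described in this abstract, and the sketch assumes nodes are already numbered `0 .. num_nodes - 1` in the desired order.

```python
def binned_spy_counts(edges, num_nodes, grid_size):
    """Count edges per cell of a grid_size x grid_size view of the adjacency matrix.

    edges: iterable of (source, destination) node-id pairs.
    Each cell aggregates a block of roughly num_nodes / grid_size rows and columns.
    """
    counts = [[0] * grid_size for _ in range(grid_size)]
    for src, dst in edges:
        # Integer arithmetic maps a node id to its grid row or column.
        row = src * grid_size // num_nodes
        col = dst * grid_size // num_nodes
        counts[row][col] += 1
    return counts
```

Each cell can then be drawn as one pixel whose intensity reflects its edge count (often on a log scale), which keeps the output size independent of the number of nodes.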
Rapid Distance-Based Outlier Detection via Sampling
Abstract
Distance-based approaches to outlier detection are popular in data mining, as they do not require modeling the underlying probability distribution, which is particularly challenging for high-dimensional data. We present an empirical comparison of various approaches to distance-based outlier detection across a large number of datasets. We report the surprising observation that a simple, sampling-based scheme outperforms state-of-the-art techniques in terms of both efficiency and effectiveness. To better understand this phenomenon, we provide a theoretical analysis of why the sampling-based approach outperforms alternative methods based on k-nearest neighbor search.
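The abstract does not give the details of the sampling-based scheme, so the sketch below only shows one plausible reading: draw a small random sample once, then score every point by its distance to the nearest sampled point. The function name `sampling_outlier_scores`, the Euclidean metric, and the choice to exclude a point from its own comparison set are all assumptions made for illustration.

```python
import math
import random


def sampling_outlier_scores(points, sample_size, seed=0):
    """Score each point by its distance to the nearest member of a random sample.

    points: list of equal-length numeric tuples.
    sample_size: number of points to sample (at least 2, at most len(points)).
    A larger score means the point lies farther from the sampled data.
    """
    if not 2 <= sample_size <= len(points):
        raise ValueError("sample_size must be between 2 and len(points)")
    rng = random.Random(seed)
    sample_idx = rng.sample(range(len(points)), sample_size)
    scores = []
    for i, point in enumerate(points):
        # Exclude the point itself so sampled points do not trivially score 0.
        dists = [math.dist(point, points[j]) for j in sample_idx if j != i]
        scores.append(min(dists))
    return scores
```

Scoring costs one distance computation per (point, sample member) pair, so the work grows with `len(points) * sample_size` rather than with the square of the dataset size that a naive k-nearest neighbor search would need.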