Anomaly Detection: A Survey
, 2007
Abstract

Cited by 186 (4 self)
Anomaly detection is an important problem that has been researched within diverse research areas and application domains. Many anomaly detection techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection. We have grouped existing techniques into different categories based on the underlying approach adopted by each technique. For each category we have identified key assumptions, which are used by the techniques to differentiate between normal and anomalous behavior. When applying a given technique to a particular domain, these assumptions can be used as guidelines to assess the effectiveness of the technique in that domain. For each category, we provide a basic anomaly detection technique, and then show how the different existing techniques in that category are variants of the basic technique. This template provides an easier and more succinct understanding of the techniques belonging to each category. Further, for each category, we identify the advantages and disadvantages of the techniques in that category. We also provide a discussion on the computational complexity of the techniques since it is an important issue in real application domains. We hope that this survey will provide a better understanding of the different directions in which research has been done on this topic, and how techniques developed in one area can be applied in domains for which they were not intended to begin with.
Mixed memory Markov models: decomposing complex stochastic processes as mixtures of simpler ones
, 1998
Abstract

Cited by 62 (1 self)
We study Markov models whose state spaces arise from the Cartesian product of two or more discrete random variables. We show how to parameterize the transition matrices of these models as a convex combination, or mixture, of simpler dynamical models. The parameters in these models admit a simple probabilistic interpretation and can be fitted iteratively by an Expectation-Maximization (EM) procedure. We derive a set of generalized Baum-Welch updates for factorial hidden Markov models that make use of this parameterization. We also describe a simple iterative procedure for approximately computing the statistics of the hidden states. Throughout, we give examples where mixed memory models provide a useful representation of complex stochastic processes. Keywords: Markov models, mixture models, discrete time series. 1. Introduction. The modeling of time series is a fundamental problem in machine learning, with widespread applications. These include speech recognition (Rabiner, 1989), natu...
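The convex-combination idea in this abstract can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: the function name, the two binary chains, and all numbers below are hypothetical. The next-state distribution of one chain is formed as a weighted mixture of simple pairwise transition rows, one per conditioning chain.

```python
import itertools


def mixed_memory_transition(weights, component_matrices, prev_states):
    """Distribution over one chain's next state, given the previous
    states of all chains, as a convex combination of pairwise
    transition rows (weights are non-negative and sum to 1)."""
    n_next = len(component_matrices[0][0])
    probs = [0.0] * n_next
    for w, matrix, state in zip(weights, component_matrices, prev_states):
        for j in range(n_next):
            probs[j] += w * matrix[state][j]
    return probs


# Hypothetical example: two binary chains; we model the next state of chain 0.
weights = [0.7, 0.3]                    # mixing weights (assumed values)
A_self = [[0.9, 0.1], [0.2, 0.8]]       # chain 0 -> chain 0 (assumed values)
A_cross = [[0.5, 0.5], [0.1, 0.9]]      # chain 1 -> chain 0 (assumed values)

# Full transition table over the Cartesian product of previous states.
table = {
    prev: mixed_memory_transition(weights, [A_self, A_cross], prev)
    for prev in itertools.product(range(2), repeat=2)
}

for prev, row in table.items():
    print(prev, row)
```

Because each component row is a probability distribution and the weights sum to one, every row of the combined table is again a distribution. The table has one row per joint previous state, but it is determined by the two small matrices and the mixing weights alone.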
1997], Error bounds for functional approximation and estimation using mixtures of experts
Abstract

Cited by 6 (3 self)
We examine some mathematical aspects of learning unknown mappings with the Mixture of Experts Model (MEM). Specifically, we observe that the MEM is at least as powerful as a class of neural networks, in a sense that will be made precise. Upper bounds on the approximation error are established for a wide class of target functions. The general theorem states that $\inf \|f - f_n\|_p \le c / n^{r/d}$ holds uniformly for $f \in W_p^r(L)$ (a Sobolev class over $[-1, 1]^d$), where $f_n$ belongs to an $n$-dimensional manifold of normalized ridge functions. The same bound holds for the MEM as a special case of the above. The stochastic error, in the context of learning from i.i.d. examples, is also examined. An asymptotic analysis establishes the limiting behavior of this error, in terms of certain pseudo-information matrices. These results substantiate the intuition behind the MEM, and motivate applications.
Outlier Detection: A Survey
, 2007
Abstract

Cited by 1 (0 self)
Outlier detection is an important problem that has been researched within diverse knowledge disciplines and application domains. Many of these techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and comprehensive overview of the research on outlier detection. We hope that this survey will provide a better understanding of the different directions in which research has been done on this topic, and how techniques developed in one area can be applied to applications for which they were not intended to begin with.
Practical Generation of Video Textures using the Auto-Regressive Process
, 2004
Abstract
Recently, there have been several attempts at creating 'video textures', that is, synthesising new (potentially infinitely long) video clips based on existing ones. One method for achieving this is to transform each frame of the video into an eigenspace using Principal Components Analysis so that the original sequence can be viewed as a signature through a low-dimensional space. A new sequence can be generated by moving through this space and creating 'similar' signatures. These signatures may be derived using an autoregressive process (ARP). Such an ARP assumes that the signature has Gaussian statistics. For many sequences this assumption is valid; however, some sequences are strongly non-linearly correlated, in which case their statistical properties are non-Gaussian. We examine two methods by which such non-linearities may be overcome. The first is by modelling the non-linearity automatically using a spline, and the second using a combined appearance model. New video sequences created using these approaches contain images never present in the original sequence and appear very convincing.
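The baseline pipeline described in this abstract (project frames into a PCA eigenspace, fit an autoregressive process to the resulting signature, then generate a new signature and map it back to frames) might be sketched as follows. This is only an illustration under stated assumptions: the synthetic "video", the first-order AR model, the two retained components, and the independent per-component Gaussian noise are all choices made here, not details taken from the paper, and the spline and combined appearance model extensions are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a video: 200 frames of 64 "pixels" each.
t = np.arange(200)
frames = (
    np.outer(np.sin(0.1 * t), rng.normal(size=64))
    + np.outer(np.cos(0.1 * t), rng.normal(size=64))
    + 0.01 * rng.normal(size=(200, 64))
)

# PCA via SVD: keep k principal components as the eigenspace.
k = 2
mean = frames.mean(axis=0)
_, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
basis = vt[:k]                              # shape (k, 64)
signature = (frames - mean) @ basis.T       # shape (200, k)

# Fit a first-order AR model x_t ~ x_{t-1} @ B by least squares (assumption:
# order 1; the noise is modelled as independent Gaussian per component).
x_prev, x_next = signature[:-1], signature[1:]
B, *_ = np.linalg.lstsq(x_prev, x_next, rcond=None)
noise_std = (x_next - x_prev @ B).std(axis=0)

# Generate a new signature and map it back to frame space.
n_new = 300
x = signature[-1]
new_signature = np.empty((n_new, k))
for i in range(n_new):
    x = x @ B + rng.normal(scale=noise_std)
    new_signature[i] = x
new_frames = new_signature @ basis + mean   # shape (n_new, 64)
```

The generated clip can be longer than the source because each new signature point depends only on the previous one and fresh noise. The Gaussian noise term is where the assumption questioned in the abstract enters.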
Anomaly Detection for Symbolic Sequences . . .
, 2009
Abstract
This thesis deals with the problem of anomaly detection for sequence data. Anomaly detection has been a widely researched problem in several application domains such as system health management, intrusion detection, healthcare, bioinformatics, fraud detection, and mechanical fault detection. Traditional anomaly detection techniques analyze each data instance (as a univariate or multivariate record) independently, and ignore the sequential aspect of the data. Often, anomalies in sequences can be detected only by analyzing data instances together as a sequence, and hence cannot be detected by traditional anomaly detection techniques. The problem of anomaly detection for sequence data is a rich area of research for two main reasons. First, sequences can be of different types, e.g., symbolic sequences, time series data, etc., and each type of sequence poses a unique set of problems. Second, anomalies in sequences can be defined in multiple ways and hence there are different problem formulations. In this thesis we focus on solving one particular problem formulation called semi-supervised anomaly detection. We study the problem