Results 1–10 of 22
Latent Semantic Kernels
Abstract

Cited by 87 (7 self)
Kernel methods like Support Vector Machines have successfully been used for text categorization. A standard choice of kernel function has been the inner product between the vector-space representations of two documents, in analogy with classical information retrieval (IR) approaches. Latent Semantic Indexing (LSI) has been successfully used for IR purposes as a technique for capturing semantic relations between terms and inserting them into the similarity measure between two documents. One of its main drawbacks, in IR, is its computational cost. In this paper we describe how the LSI approach can be implemented in a kernel-defined feature space. We provide experimental results demonstrating that the approach can significantly improve performance, and that it does not impair it.
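The core idea behind LSI-based similarity can be sketched in a few lines (this is a minimal illustration of the underlying linear algebra, not the paper's kernel-space implementation; the toy matrix and names are ours): project term-document vectors onto the top singular directions, then compare documents by inner products in that latent space.

```python
import numpy as np

def lsi_kernel(X, k):
    """Toy LSI similarity: project term-document columns onto the
    top-k left singular vectors, then take inner products there.
    X has shape (n_terms, n_docs); k is the latent dimension."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uk = U[:, :k]      # top-k term directions
    Z = Uk.T @ X       # documents in latent space, shape (k, n_docs)
    return Z.T @ Z     # Gram (kernel) matrix between documents

# Tiny term-document matrix: rows = terms, columns = documents.
X = np.array([[2.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 3.0, 1.0]])
K = lsi_kernel(X, k=2)
print(K.shape)   # document-by-document similarity matrix
```

The paper's contribution is that this projection can be carried out entirely in a kernel-defined feature space, avoiding the explicit SVD over the raw term space.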
A Linear Programming Approach to Novelty Detection
, 2001
Abstract

Cited by 67 (5 self)
Novelty detection involves modeling the normal behaviour of a system, hence enabling detection of any divergence from normality. It has potential applications in many areas, such as detection of machine damage or highlighting abnormal features in medical data. One approach is to build a hypothesis estimating the support of the normal data, i.e. constructing a function which is positive in the region where the data is located and negative elsewhere. Recently, kernel methods have been proposed for estimating the support of a distribution and they have performed well in practice; training involves the solution of a quadratic programming problem. In this paper we propose a simpler kernel method for estimating the support based on linear programming. The method is easy to implement and can learn large datasets rapidly. We demonstrate the method on medical and fault detection datasets.
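A linear program of this general flavor can be written down directly. The sketch below is a simplified formulation under our own assumptions (the λ trade-off and the unit-threshold constraint are illustrative, not the paper's exact program): fit f(x) = Σⱼ αⱼ k(x, xⱼ) with nonnegative coefficients by minimizing Σα + λΣξ subject to f(xᵢ) ≥ 1 − ξᵢ.

```python
import numpy as np
from scipy.optimize import linprog

def rbf(A, B, gamma=0.5):
    """Gaussian kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def lp_support(X, lam=1.0, gamma=0.5):
    """Simplified LP support estimator: minimize sum(alpha) +
    lam * sum(xi) subject to (K @ alpha)_i >= 1 - xi_i, with
    alpha >= 0 and xi >= 0."""
    n = len(X)
    K = rbf(X, X, gamma)
    c = np.concatenate([np.ones(n), lam * np.ones(n)])  # [alpha; xi]
    A_ub = np.hstack([-K, -np.eye(n)])                  # -(K a + xi) <= -1
    b_ub = -np.ones(n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (2 * n))
    return res.x[:n]

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))                 # "normal" training data
alpha = lp_support(X)
score = rbf(np.array([[0.0, 0.0], [8.0, 8.0]]), X) @ alpha
# The in-distribution point should score well above the far-away one.
print(score[0] > score[1])
```

Points with low f(x) are flagged as novel; because the objective is linear, standard LP solvers handle large datasets faster than the quadratic programs behind support estimation via one-class SVMs.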
Uniform Object Generation for Optimizing One-class Classifiers
 Journal of Machine Learning Research
, 2001
Abstract

Cited by 46 (4 self)
In one-class classification, one class of the data, called the target class, has to be distinguished from the rest of the feature space. It is assumed that only examples of the target class are available. This classifier has to be constructed such that objects not originating from the target set, by definition outlier objects, are not classified as target objects. In previous research the support vector data description (SVDD) is proposed to solve the problem of one-class classification. It fits a hypersphere around the target data, and by the introduction of kernel functions, more flexible descriptions are obtained. In the optimization of the SVDD, two parameters have to be given beforehand by the user. To automatically optimize the values for these parameters, the error on both the target and the outlier data has to be estimated. Because no outlier examples are available, we propose a method for generating artificial outliers, uniformly distributed in a hypersphere. A (relatively) efficient estimate for the volume covered by the one-class classifier is obtained, and so an estimate for the outlier error. Results are shown for artificial data and for real-world data. Keywords: support vector classifiers, one-class classification, novelty detection, outlier detection
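Sampling uniformly from a hypersphere is the concrete ingredient of the outlier-generation step. A standard recipe (assumed here; the paper may use a different one) is to draw a direction from an isotropic Gaussian and a radius proportional to u^(1/d), so that the points are uniform in volume rather than in radius.

```python
import numpy as np

def uniform_in_hypersphere(n, d, center=None, radius=1.0, seed=0):
    """Draw n points uniformly from a d-dimensional ball: a Gaussian
    draw normalized to unit length gives a uniform direction, and
    radius * u**(1/d) makes the density uniform in volume."""
    rng = np.random.default_rng(seed)
    g = rng.normal(size=(n, d))
    directions = g / np.linalg.norm(g, axis=1, keepdims=True)
    radii = radius * rng.random(n) ** (1.0 / d)
    pts = directions * radii[:, None]
    if center is not None:
        pts = pts + center
    return pts

outliers = uniform_in_hypersphere(1000, d=5)
print(np.linalg.norm(outliers, axis=1).max())  # never exceeds the radius
```

The fraction of such artificial outliers accepted by the one-class classifier then estimates the volume it covers, and hence its outlier error.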
PEBL: Web Page Classification without Negative Examples
 IEEE Transactions on Knowledge and Data Engineering
, 2004
Abstract

Cited by 30 (0 self)
Web page classification is one of the essential techniques for Web mining because classifying Web pages of an interesting class is often the first step of mining the Web. However, constructing a classifier for an interesting class requires laborious preprocessing such as collecting positive and negative training examples. For instance, in order to construct a "homepage" classifier, one needs to collect a sample of homepages (positive examples) and a sample of non-homepages (negative examples). In particular, collecting negative training examples requires arduous work and caution to avoid bias. This paper presents a framework, called Positive Example Based Learning (PEBL), for Web page classification which eliminates the need for manually collecting negative training examples in preprocessing. The PEBL framework applies an algorithm, called Mapping-Convergence (M-C), to achieve classification accuracy (with positive and unlabeled data) as high as that of a traditional SVM (with positive and negative data). M-C runs in two stages: the mapping stage and the convergence stage. In the mapping stage, the algorithm uses a weak classifier that draws an initial approximation of "strong" negative data. Based on the initial approximation, the convergence stage iteratively runs an internal classifier (e.g., SVM) which maximizes margins to progressively improve the approximation of negative data. Thus, the class boundary eventually converges to the true boundary of the positive class in the feature space. We present the M-C algorithm with supporting theoretical and experimental justifications. Our experiments show that, given the same set of positive examples, the M-C algorithm outperforms one-class SVMs, and it is almost as accurate as the traditional SVMs.
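The two-stage loop can be sketched as follows. This is an illustrative skeleton only: the weak classifier and the internal classifier here are a distance-to-centroid stand-in (the paper uses a 1-DNF-style weak classifier and an SVM internally), so only the control flow mirrors M-C.

```python
import numpy as np

def mapping_convergence(pos, unlabeled, iters=5):
    """Illustrative M-C skeleton: the mapping stage marks the
    unlabeled points farthest from the positive centroid as "strong"
    negatives; the convergence stage repeatedly trains a stand-in
    nearest-centroid classifier and moves newly classified negatives
    out of the remaining pool."""
    centroid = pos.mean(axis=0)
    dist = np.linalg.norm(unlabeled - centroid, axis=1)
    strong = dist > np.percentile(dist, 75)   # weak classifier (assumption)
    neg = unlabeled[strong]
    pool = unlabeled[~strong]
    for _ in range(iters):
        if len(pool) == 0:
            break
        neg_centroid = neg.mean(axis=0)
        d_pos = np.linalg.norm(pool - centroid, axis=1)
        d_neg = np.linalg.norm(pool - neg_centroid, axis=1)
        new_neg = d_neg < d_pos               # classified negative this round
        if not new_neg.any():
            break
        neg = np.vstack([neg, pool[new_neg]])
        pool = pool[~new_neg]
    return neg, pool                          # negatives, residual (likely positive)

rng = np.random.default_rng(1)
pos = rng.normal(0, 1, size=(50, 2))
unl = np.vstack([rng.normal(0, 1, size=(50, 2)),
                 rng.normal(6, 1, size=(50, 2))])
neg, residual = mapping_convergence(pos, unl)
print(len(neg), len(residual))
```

With a margin-maximizing internal classifier in place of the centroid stand-in, each iteration tightens the boundary around the positive class, which is the convergence behavior the paper analyzes.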
Sparse Kernel Feature Analysis
, 1999
Abstract

Cited by 27 (2 self)
Kernel Principal Component Analysis (KPCA) has proven to be a versatile tool for unsupervised learning, however at a high computational cost due to the dense expansions in terms of kernel functions. We overcome this problem by proposing a new class of feature extractors employing ℓ1 norms in coefficient space instead of the reproducing kernel Hilbert space in which KPCA was originally formulated. Moreover, the modified setting allows us to efficiently extract features maximizing criteria other than the variance, much in a projection-pursuit fashion.
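A useful consequence of ℓ1-constrained coefficients is that each extracted feature concentrates on a single kernel function k(·, xᵢ). The greedy sketch below is our simplified illustration of that sparsity (not the paper's exact algorithm): pick the training sample whose centered kernel column carries the most variance, deflate, and repeat.

```python
import numpy as np

def sparse_kernel_features(X, n_features=2, gamma=0.5):
    """Greedy sketch of sparse kernel feature extraction: each
    feature is a single kernel function k(., x_i), chosen to
    maximize the variance of its centered kernel column; the
    kernel matrix is then deflated along that direction."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * d2)
    K = K - K.mean(axis=0)            # center each column over the data
    chosen = []
    for _ in range(n_features):
        var = (K ** 2).mean(axis=0)
        i = int(np.argmax(var))
        chosen.append(i)
        v = K[:, i] / np.linalg.norm(K[:, i])
        K = K - np.outer(v, v @ K)    # deflate: remove this direction
    return chosen

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))
chosen = sparse_kernel_features(X, n_features=3)
print(chosen)   # indices of the selected training samples
```

In contrast to dense KPCA eigenvectors, evaluating such a feature on a new point costs a single kernel evaluation.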
A Support Vector Method for Clustering
 Advances in Neural Information Processing Systems 13
, 2001
Abstract

Cited by 15 (3 self)
We present a novel method for clustering using the support vector machine approach. Data points are mapped to a high-dimensional feature space, where support vectors are used to define a sphere enclosing them. The boundary of the sphere forms, in data space, a set of closed contours containing the data. Data points enclosed by each contour are defined as a cluster. As the width parameter of the Gaussian kernel is decreased, these contours fit the data more tightly and splitting of contours occurs. The algorithm works by separating clusters according to valleys in the underlying probability distribution, and thus clusters can take on arbitrary geometrical shapes. As in other SV algorithms, outliers can be dealt with by introducing a soft margin constant, leading to smoother cluster boundaries. The structure of the data is explored by varying the two parameters. We investigate the dependence of our method on these parameters and apply it to several data sets.
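The cluster-assignment step admits a compact sketch. Below we assume the enclosing contour is already available as a boolean `inside(p)` test (here a toy kernel-density threshold stands in for the trained sphere): two points share a cluster when every sampled point on the segment between them stays inside, and clusters are the connected components of that adjacency graph.

```python
import numpy as np

def cluster_labels(X, inside, n_checks=10):
    """Labeling step in the style of support vector clustering:
    join i and j when all sampled points on segment [X[i], X[j]]
    satisfy inside(p); return connected-component labels."""
    n = len(X)
    parent = list(range(n))
    def find(a):                       # union-find with path halving
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i in range(n):
        for j in range(i + 1, n):
            ts = np.linspace(0.0, 1.0, n_checks)
            if all(inside(X[i] + t * (X[j] - X[i])) for t in ts):
                parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    ids = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [ids[r] for r in roots]

# Toy stand-in for the trained sphere: a kernel-density threshold.
X = np.vstack([np.random.default_rng(3).normal(0, 0.3, (20, 2)),
               np.random.default_rng(4).normal(5, 0.3, (20, 2))])
def inside(p):
    return np.exp(-((X - p) ** 2).sum(1) / 0.5).sum() > 0.3
labels = cluster_labels(X, inside)
print(len(set(labels)))   # two well-separated blobs -> 2 clusters
```

Segments between the two blobs pass through a low-density valley where `inside` fails, which is exactly how valleys in the distribution separate clusters.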
Real-time Object Classification and Novelty Detection for Collaborative Video Surveillance
 In Proceedings of the International Joint Conference on Neural Networks
, 2002
Abstract

Cited by 10 (0 self)
To conduct real-time video surveillance using low-cost commercial off-the-shelf hardware, system designers typically define the classifiers prior to the deployment of the system so that the performance of the system can be optimized for a particular mission. This implies the system is restricted to interpreting activity in the environment in terms of the originally specified context. Ideally the system should allow the user to provide additional context in an incremental fashion as conditions change. Given the volumes of data produced by the system, it is impractical for the user to periodically review and label a significant fraction of the available data. We explore a strategy for designing a real-time object classification process that aids the user in identifying novel, informative examples for efficient incremental learning.
A Mixture Approach to Novelty Detection Using Training Data with Outliers
 Lecture Notes in Computer Science
, 2001
Abstract

Cited by 10 (0 self)
This paper describes an approach to handle multivariate training data which contain outliers. The aim is to analyze the training patterns and to detect anomalous patterns. Therefore, we explicitly model the existence of outliers in the training data using a widespread outlier distribution. Indicator variables assign each pattern to either the outlier distribution or the distribution of normal patterns. Thus we can estimate the data distribution using the EM algorithm or Data Augmentation.
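The mixture idea is concrete enough to sketch in one dimension (a minimal illustration under our own assumptions: a single Gaussian for normal patterns plus a uniform outlier component on a known range; the paper's model may differ). The E-step responsibilities play the role of the indicator variables.

```python
import numpy as np

def em_with_outliers(x, lo, hi, iters=50):
    """EM for a two-component mixture: a Gaussian for normal
    patterns plus a broad uniform outlier component on [lo, hi].
    r[i] is the posterior probability that x[i] is normal."""
    mu, sigma, pi = np.median(x), x.std(), 0.9
    u = 1.0 / (hi - lo)                         # uniform outlier density
    for _ in range(iters):
        g = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        r = pi * g / (pi * g + (1 - pi) * u)    # E-step: P(normal | x)
        mu = (r * x).sum() / r.sum()            # M-step: weighted estimates
        sigma = np.sqrt((r * (x - mu) ** 2).sum() / r.sum())
        pi = r.mean()
    return mu, sigma, r

rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(2.0, 0.5, 200), rng.uniform(-10, 10, 20)])
mu, sigma, r = em_with_outliers(x, -10, 10)
print(mu, sigma)   # recovers roughly (2.0, 0.5) despite the outliers
```

Because the outlier component absorbs the anomalous patterns, the Gaussian parameters are estimated robustly, and patterns with low responsibility r can be reported as anomalies.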
Outlier Detection with One-class Kernel Fisher Discriminants
 Advances in Neural Information Processing Systems 17
, 2005
Abstract

Cited by 7 (0 self)
The problem of detecting “atypical objects” or “outliers” is one of the classical topics in (robust) statistics. Recently, it has been proposed to address this problem by means of one-class SVM classifiers. The main conceptual shortcoming of most one-class approaches, however, is that in a strict sense they are unable to detect outliers, since the expected fraction of outliers has to be specified in advance. The method presented in this paper overcomes this problem by relating kernelized one-class classification to Gaussian density estimation in the induced feature space. Having established this relation, it is possible to identify “atypical objects” by quantifying their deviations from the Gaussian model. For RBF kernels it is shown that the Gaussian model is “rich enough” in the sense that it asymptotically provides an unbiased estimator for the true density. In order to overcome the inherent model selection problem, a cross-validated likelihood criterion for selecting all free model parameters is applied.
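The "deviation from a Gaussian model" idea has a simple finite-dimensional analogue (the sketch below works in input space; the paper performs the analogous computation in the kernel-induced feature space): fit a Gaussian and score each point by its squared Mahalanobis distance, so no outlier fraction needs to be fixed in advance.

```python
import numpy as np

def atypicality(X):
    """Score each row of X by its squared Mahalanobis distance to a
    Gaussian fitted on X; large scores mark "atypical" objects."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    prec = np.linalg.inv(cov)
    diff = X - mu
    # einsum computes diff_i^T @ prec @ diff_i for every row i.
    return np.einsum('ij,jk,ik->i', diff, prec, diff)

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 1, size=(200, 2)), [[9.0, 9.0]]])
scores = atypicality(X)
print(int(np.argmax(scores)))   # the appended outlier scores highest
```

A threshold on this score (e.g. a chi-squared quantile) then yields an outlier decision without pre-specifying the expected outlier fraction.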
Distributed Surveillance and Reconnaissance Using Multiple Autonomous ATVs: CyberScout
 IEEE Transactions on Robotics and Automation
, 2002
Abstract

Cited by 7 (0 self)
The objective of the CyberScout project is to develop an autonomous surveillance and reconnaissance system using a network of all-terrain vehicles. In this paper, we focus on two facets of this system: 1) vision for surveillance and 2) autonomous navigation and dynamic path planning. In the area of vision-based surveillance, we have developed robust, efficient algorithms to detect, classify, and track moving objects of interest (person, people, or vehicle) with a static camera. Adaptation through feedback from the classifier and tracker allows the detector to use grayscale imagery, yet perform as well as prior color-based detectors. We have extended the detector using scene mosaicing to detect and index moving objects when the camera is panning or tilting. The classification algorithm performs well (less than 8% error rate for all classes) with coarse inputs (20 × 20-pixel binary image chips), has unparalleled rejection capabilities (rejects 72% of spurious detections), and can flag novel moving objects. The tracking algorithm achieves highly accurate (96%) frame-to-frame correspondence for multiple moving objects in cluttered scenes by determining the discriminant relevance of object features. We have also developed a novel mission coordination architecture, CPAD (Checkpoint/Priority/Action Database), which performs path planning via checkpoint and dynamic priority assignment, using statistical estimates of the environment's motion structure. The motion structure is used to make both pre-planning and reactive behaviors more efficient by applying global context. This approach is more computationally efficient than centralized approaches and exploits robot cooperation in dynamic environments better than decoupled approaches.