Results 1–10 of 124
Statistical Dependency Analysis with Support Vector Machines
 In Proceedings of IWPT
, 2003
Abstract

Cited by 160 (1 self)
In this paper, we propose a method for analyzing word-word dependencies in a deterministic bottom-up manner using Support Vector Machines. We experimented with dependency trees converted from Penn Treebank data, and achieved over 90% accuracy of word-word dependency. Though the result is a little worse than that of the most up-to-date phrase-structure-based parsers, it looks satisfactorily accurate considering that our parser uses no information from phrase structures.
Learning globally-consistent local distance functions for shape-based image retrieval and classification
 In ICCV
, 2007
Abstract

Cited by 150 (3 self)
We address the problem of visual category recognition by learning an image-to-image distance function that attempts to satisfy the following property: the distance between images from the same category should be less than the distance between images from different categories. We use patch-based feature vectors common in object recognition work as a basis for our image-to-image distance functions. Our large-margin formulation for learning the distance functions is similar to formulations used in the machine learning literature on distance metric learning; however, we differ in that we learn local distance functions, a different parameterized function for every image of our training set, whereas typically a single global distance function is learned. This was a novel approach first introduced in Frome, Singer, & Malik, NIPS 2006. In that work we learned the local distance functions independently, and the outputs of these functions could not be compared at test time without the use of additional heuristics or training. Here we introduce a different approach that has the advantage that it learns distance functions that are globally consistent, in that they can be directly compared for purposes of retrieval and classification. The outputs of the learning algorithm are weights assigned to the image features, which is intuitively appealing in the computer vision setting: some features are more salient than others, and which are more salient depends on the category, or image, being considered. We train and test using the Caltech 101 object recognition benchmark. Using fifteen training images per category, we achieved a mean recognition rate of 63.2% ...
Extracting relations with integrated information using kernel methods
 In Proceedings of the Annual Meeting of the ACL
, 2005
Abstract

Cited by 90 (6 self)
Entity relation detection is a form of information extraction that finds predefined relations between pairs of entities in text. This paper describes a relation detection approach that combines clues from different levels of syntactic processing using kernel methods. Information from three different levels of processing is considered: tokenization, sentence parsing, and deep dependency analysis. Each source of information is represented by kernel functions. Then composite kernels are developed to integrate and extend individual kernels so that processing errors occurring at one level can be overcome by information from other levels. We present an evaluation of these methods on the 2004 ACE relation detection task, using Support Vector Machines, and show that each level of syntactic processing contributes useful information for this task. When evaluated on the official test data, our approach produced very competitive ACE value scores. We also compare SVM with kNN on different kernels.
CONTRAlign: discriminative training for protein sequence alignment
 In International Conference on Research in Computational Molecular Biology (RECOMB)
, 2006
Abstract

Cited by 41 (5 self)
In comparative structural biology studies, analyzing or predicting protein three-dimensional structure often begins with identifying patterns of amino acid substitution via protein sequence alignment. While the evolutionary information obtained from alignments can provide insights into protein structure, constructing accurate alignments may be difficult when proteins share significant structural similarity but little sequence similarity. Indeed, for modern alignment tools, alignment quality drops rapidly when the sequences compared have lower than 25% identity, the "twilight zone" of protein alignment [1].
A hierarchy of support vector machines for pattern detection
 JMLR
, 2006
Abstract

Cited by 15 (0 self)
We introduce a computational design for pattern detection based on a tree-structured network of support vector machines (SVMs). An SVM is associated with each cell in a recursive partitioning of the space of patterns (hypotheses) into increasingly finer subsets. The hierarchy is traversed coarse-to-fine and each chain of positive responses from the root to a leaf constitutes a detection. Our objective is to design and build a network which balances overall error and computation. Initially, SVMs are constructed for each cell with no constraints. This “free network” is then perturbed, cell by cell, into another network, which is “graded” in two ways: first, the number of support vectors of each SVM is reduced (by clustering) in order to adjust to a predetermined, increasing function of cell depth; second, the decision boundaries are shifted to preserve all positive responses from the original set of training data. The limits on the numbers of clusters (virtual support vectors) result from minimizing the mean computational cost of collecting all detections subject to a bound on the expected number of false positives. When applied to detecting faces in cluttered scenes, the patterns correspond to poses and the free network is already faster and more accurate than applying a single pose-specific SVM many times. The graded network promotes very rapid processing of background regions while maintaining the discriminatory power of the free network.
Transforming strings to vector spaces using prototype selection
, 2006
Abstract

Cited by 14 (5 self)
A common way of expressing string similarity in structural pattern recognition is the edit distance. It allows one to apply the kNN rule in order to classify a set of strings. However, compared to the wide range of elaborate classifiers known from statistical pattern recognition, this is only a very basic method. In the present paper we propose a method for transforming strings into n-dimensional real vector spaces based on prototype selection. This allows us to subsequently classify the transformed strings with more sophisticated classifiers, such as support vector machines and other kernel-based methods. In a number of experiments, we show that the recognition rate can be significantly improved by means of this procedure.
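The embedding described in this abstract can be sketched in a few lines: each string is mapped to the vector of its edit distances to a small set of selected prototype strings, after which any vector-space classifier applies. A minimal sketch assuming plain Levenshtein distance; the `embed` helper, the prototype set, and the toy strings are illustrative, not from the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # substitution
        prev = cur
    return prev[n]

def embed(string, prototypes):
    """Map a string to an n-dimensional real vector of edit distances."""
    return [edit_distance(string, p) for p in prototypes]

prototypes = ["kitten", "flaw"]      # n = 2 hypothetical selected prototypes
vec = embed("sitting", prototypes)   # 2-dimensional vector for "sitting"
```

The choice of prototypes determines the geometry of the resulting vector space, which is why the paper treats prototype selection as the central design question.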
Frequency sensitive competitive learning for balanced clustering on high-dimensional hyperspheres
 IEEE Transactions on Neural Networks
, 2004
Abstract

Cited by 13 (8 self)
Competitive learning mechanisms for clustering in general suffer from poor performance for very high-dimensional (> 1000) data because of “curse of dimensionality” effects. In applications such as document clustering, it is customary to normalize the high-dimensional input vectors to unit length, and it is sometimes also desirable to obtain balanced clusters, i.e., clusters of comparable sizes. The spherical k-means (spkmeans) algorithm, which normalizes the cluster centers as well as the inputs, has been successfully used to cluster normalized text documents in 2000+ dimensional space. Unfortunately, like regular k-means and its soft EM-based version, spkmeans tends to generate extremely imbalanced clusters in high-dimensional spaces when the desired number of clusters is large (tens or more). In this paper, we first show that the spkmeans algorithm can be derived from a certain maximum likelihood formulation using a mixture of von Mises-Fisher distributions as the generative model, and in fact it can be considered as a batch-mode version of (normalized) competitive learning. The proposed generative model is then adapted in a principled way to yield three frequency sensitive competitive learning variants that are applicable to static data and produce high-quality, well-balanced clusters for high-dimensional data. Like k-means, each iteration is linear in the number of data points and in the number of clusters for all three algorithms. We also propose a frequency sensitive algorithm to cluster streaming data. Experimental results on clustering of high-dimensional text data sets are provided to show the effectiveness and applicability of the proposed techniques.
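The spkmeans step this abstract builds on can be sketched directly: both the data points and the cluster centers live on the unit hypersphere, and assignment is by maximum cosine similarity (dot product) rather than Euclidean distance. A minimal batch-mode sketch; the deterministic first-k initialization is an illustrative simplification, not the paper's procedure.

```python
import numpy as np

def spkmeans(X, k, iters=20):
    """Batch spherical k-means: unit-normalize inputs and centers,
    assign each point to the center with largest dot product."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # project inputs to sphere
    centers = X[:k].copy()                            # simple deterministic init
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmax(X @ centers.T, axis=1)     # cosine-similarity assignment
        for j in range(k):
            members = X[labels == j]
            if len(members):
                c = members.sum(axis=0)
                centers[j] = c / np.linalg.norm(c)    # re-normalize the center
    return labels, centers
```

The frequency sensitive variants of the paper modify the assignment step so that clusters that have already absorbed many points become less attractive, which is what yields the balanced clusters.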
Kernel PCA for similarity-invariant shape recognition
 In Neurocomputing
, 2006
Abstract

Cited by 12 (3 self)
We present in this paper a novel approach for shape description based on kernel principal component analysis (KPCA). The strength of this method resides in the similarity (rotation, translation and particularly scale) invariance of KPCA when using a family of triangular conditionally positive definite kernels. Besides this invariance, the method provides an effective way to capture nonlinearities in shape geometry. A given two-dimensional curve is described using the eigenvalues of the underlying manifold modeled in a high-dimensional Hilbert space. Using Fourier analysis, we show that this eigenvalue description captures low to high variations of the shape frequencies. Experiments conducted on standard databases, including the SQUID, the Swedish and the Smithsonian leaf databases, show that the method is effective in capturing invariance and generalizes well for shape matching and retrieval.
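The core computation behind an eigenvalue-based shape descriptor of the kind this abstract describes is: build the kernel (Gram) matrix of the curve's points, double-center it, and take its eigenvalue spectrum. A minimal sketch using an RBF kernel as a stand-in for the paper's triangular conditionally positive definite kernels; with RBF the spectrum is invariant to translation and rotation (it depends only on pairwise distances) but not to scale, which is exactly why the paper's kernel family matters.

```python
import numpy as np

def kpca_eigenvalues(points, gamma=1.0):
    """Descending eigenvalues of the double-centered kernel matrix of a point set."""
    X = np.asarray(points, dtype=float)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)            # RBF kernel (Gram) matrix; illustrative choice
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                     # double-centering, as in standard KPCA
    vals = np.linalg.eigvalsh(Kc)      # ascending eigenvalues of a symmetric matrix
    return vals[::-1]                  # return them in descending order
```

Two curves can then be compared by a distance between their leading eigenvalues, giving a descriptor that ignores where in the plane the shape sits.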
Tissue classification using cluster features for lesion detection in digital cervigrams
 Proc. SPIE Medical Imaging
, 2008
Abstract

Cited by 11 (7 self)
In this paper, we propose a new method for automated detection and segmentation of different tissue types in digitized uterine cervix images using mean-shift clustering and support vector machines (SVM) classification on cluster features. We specifically target the segmentation of precancerous lesions in a NCI/NLM archive of 60,000 cervigrams. Due to large variations in image appearance in the archive, color and texture features of a tissue type in one image often overlap with those of a different tissue type in another image. This makes reliable tissue segmentation in a large number of images a very challenging problem. In this paper, we propose the use of powerful machine learning techniques such as Support Vector Machines (SVM) to learn, from a database with ground truth annotations, critical visual signs that correlate with important tissue types, and to use the learned classifier for tissue segmentation in unseen images. In our experiments, SVM performs better than unsupervised methods such as Gaussian Mixture clustering, but it does not scale very well to large training sets and does not always guarantee improved performance given more training data. To address this problem, we combine SVM and clustering so that the features we extract for classification are features of clusters returned by the mean-shift clustering algorithm. Compared to classification using individual pixel features, classification by cluster features greatly reduces the dimensionality of the problem, thus it is more efficient while producing results with comparable accuracy.
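The dimensionality reduction at the heart of this abstract's cluster-feature idea is simple to state: group the pixels first, then hand the classifier one feature vector per group instead of one per pixel. A hypothetical sketch where a precomputed labeling stands in for mean-shift and the per-cluster feature is just the mean of its member pixels' features; all names and the toy data are illustrative, not from the paper.

```python
from collections import defaultdict

def cluster_features(pixel_features, cluster_labels):
    """Reduce per-pixel feature vectors to one mean feature vector per cluster."""
    groups = defaultdict(list)
    for feat, lab in zip(pixel_features, cluster_labels):
        groups[lab].append(feat)
    means = {}
    for lab, feats in groups.items():
        dim = len(feats[0])
        means[lab] = [sum(f[d] for f in feats) / len(feats) for d in range(dim)]
    return means

pixels = [[0.9, 0.1], [1.1, -0.1], [0.1, 0.9], [-0.1, 1.1]]  # toy color features
labels = [0, 0, 1, 1]              # e.g. produced by mean-shift clustering
feats = cluster_features(pixels, labels)  # classifier now sees 2 items, not 4
```

A classifier trained on such cluster-level vectors sees orders of magnitude fewer examples per image, which is the efficiency gain the abstract reports.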
Attractor Modeling and Empirical Nonlinear Model Reduction of Dissipative Dynamical Systems
 In International Journal of Bifurcation and Chaos in Applied Sciences and Engineering (to appear)
, 2007
Abstract

Cited by 9 (1 self)
In a broad sense, model reduction means producing a low-dimensional dynamical system that replicates, either approximately, or more strictly, exactly and topologically, the output of a dynamical system. Model reduction has an important role in the study of dynamical systems and also in engineering problems. In many cases, there exists a good low-dimensional model for even very high-dimensional systems, even infinite-dimensional systems in the case of a PDE with a low-dimensional attractor. The theory of global attractors approaches these issues analytically, and focuses on finding (depending on the question at hand) a slow manifold, inertial manifold, or center manifold on which a restricted dynamical system represents the interesting behavior of the original system; the main issue depends on defining a stable invariant manifold in which the dynamical system is invariant. These approaches are analytical in nature, however, and are therefore not always appropriate for dynamical systems known only empirically through a dataset. Empirically, the collection of tools available is much more restricted, and is essentially linear in nature. Usually, variants of Galerkin's method project the dynamical system onto a linear subspace of functions spanned by modes of some chosen spanning set. Even the popular Karhunen-Loève decomposition (POD) method is exactly such a method. As such, it ...