Generalized Discriminant Analysis Using a Kernel Approach
, 2000
Cited by 150 (2 self)
We present a new method, which we call Generalized Discriminant Analysis (GDA), for nonlinear discriminant analysis using a kernel function operator. The underlying theory is close to that of Support Vector Machines (SVM) insofar as the GDA method maps the input vectors into a high-dimensional feature space. In the transformed space, linear properties make it easy to extend and generalize classical Linear Discriminant Analysis (LDA) to nonlinear discriminant analysis. The formulation is expressed as an eigenvalue problem. By using different kernels, one can cover a wide class of nonlinearities. For both simulated data and alternate kernels, we give classification results as well as the shape of the separating function. The results are confirmed using real data to perform seed classification. 1. Introduction Linear discriminant analysis (LDA) is a traditional statistical method which has proven successful on classification problems [Fukunaga, 1990]. The p...
The Correlation Ratio as a New Similarity Measure for Multimodal Image Registration
, 1998
Cited by 73 (16 self)
Over the last five years, new "voxel-based" approaches have allowed important progress in multimodal image registration, notably due to the increasing use of information-theoretic similarity measures.
Multimodal Image Registration by Maximization of the Correlation Ratio
, 1998
Cited by 18 (5 self)
Over the last five years, new "voxel-based" approaches have allowed important leaps in multimodal image registration, notably due to the increasing use of information-theoretic similarity measures. Their wide success has led to the progressive abandonment of measures using standard image statistics (mean and variance). Until now, such measures have essentially been based on heuristics. In this paper, we address the determination of a new measure based on standard statistics from a theoretical point of view. We show that it naturally leads to a known concept of probability theory, the correlation ratio. In our derivation, we take as the hypothesis the functional dependence between the image intensities. This means that one image is considered as a model of the other. Although such a hypothesis is not valid in every circumstance, it enables us to implicitly incorporate an a priori smoothness model. We also demonstrate preliminary results of multimodal rigid registration involving Magnetic Resonance (MR), Computed Tomography (CT), and Positron Emission Tomography (PET) images. These results suggest that the correlation ratio provides a good trade-off between accuracy and robustness.
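As an illustration of the quantity named in the abstract above, here is a minimal sketch of the correlation ratio in its standard probability-theory form, eta^2(Y|X) = 1 - E[Var(Y|X)] / Var(Y), computed on paired intensity samples. It is not the authors' registration method: the function name `correlation_ratio`, the use of each distinct value of `x` as its own bin, and the convention of returning 0.0 when `y` is constant are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Sequence


def correlation_ratio(x: Sequence[int], y: Sequence[float]) -> float:
    """Return eta^2(Y|X) = 1 - E[Var(Y|X)] / Var(Y) for paired samples.

    x holds discrete intensities (each distinct value is one bin);
    y holds the intensities of the other image at the same positions.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")

    n = len(y)
    mean_y = sum(y) / n
    total_var = sum((v - mean_y) ** 2 for v in y) / n
    if total_var == 0.0:
        # Assumed convention: a constant y carries no variance to explain.
        return 0.0

    # Group the y values by the x intensity they are paired with.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)

    # Sum of squared deviations of y around its per-bin mean.
    within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        within += sum((v - group_mean) ** 2 for v in values)

    return 1.0 - within / (n * total_var)
```

A value of 1 means `y` is fully determined by `x` (functional dependence), and 0 means that knowing `x` does not reduce the variance of `y` at all. The measure is asymmetric, which matches the abstract's remark that one image is treated as a model of the other.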
A writer identification and verification system
- Pattern Recognition Letters
, 2005
Cited by 14 (1 self)
In this paper, we propose to apply an Information Retrieval model to the writer identification task. A set of local features is defined by clustering the graphemes produced by a segmentation procedure. Then a text-based Information Retrieval model is applied. After a first indexing step, this model no longer requires image access to the database for responding to a specific query, thus making the process particularly effective. Image queries are handwritten documents projected into the feature space prior to the retrieval of the suitable responses. The retrieved writer hypotheses are then analysed during a verification phase. We use a mutual information criterion to verify whether two documents were produced by the same writer. Hypothesis testing is used for this purpose. The method is tested on two different databases and appears to be robust on both, making the approach very promising for large-scale applications in the domain of handwritten document querying and writer verification.
Bagging Equalizes Influence
, 2002
Cited by 9 (1 self)
Bagging constructs an estimator by averaging predictors trained on bootstrap samples. Bagged estimates almost consistently improve on the original predictor. It is thus important to understand the reasons for this success, and also for the occasional failures. It is widely believed that bagging is effective thanks to the variance reduction stemming from averaging predictors. However, seven years from its introduction, bagging is still not fully understood. This paper provides experimental evidence supporting the hypothesis that bagging stabilizes prediction by equalizing the influence of training examples. This effect is detailed in two different frameworks: estimation on the real line and regression. Bagging's improvements/deteriorations are explained by the goodness/badness of highly influential examples, in situations where the usual variance reduction argument is at best questionable. Finally, reasons for the equalization effect are advanced. They support that other resampling strategies such as half-sampling should provide qualitatively identical effects while being computationally less demanding than bootstrap sampling.
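The first sentence of the abstract above can be sketched for the "estimation on the real line" setting as follows. This is only an illustrative sketch, not the paper's experimental code: the function name `bagged_estimate`, its parameters, and the default of 200 bootstrap replicates are assumptions chosen for the example.

```python
import random
import statistics
from typing import Callable, Sequence


def bagged_estimate(
    sample: Sequence[float],
    estimator: Callable[[Sequence[float]], float],
    n_bootstrap: int = 200,
    seed: int = 0,
) -> float:
    """Average an estimator over bootstrap resamples of `sample`."""
    if not sample:
        raise ValueError("sample must be non-empty")

    rng = random.Random(seed)  # local generator keeps runs reproducible
    n = len(sample)
    estimates = []
    for _ in range(n_bootstrap):
        # A bootstrap sample: n draws from the data, with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))

    # Bagging: the final estimate is the mean of the per-resample estimates.
    return statistics.fmean(estimates)
```

For example, `bagged_estimate([1.0, 2.0, 3.0, 100.0], statistics.median)` returns a bagged median of a sample containing one outlying point. Because a given point is absent from a sizeable fraction of the resamples, its pull on the averaged estimate differs from its pull on the plain estimator, which is the kind of influence effect the abstract discusses.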
An Introduction to Symbolic Data Analysis and the Sodas Software
- Journal of Symbolic Data Analysis
, 2003
Extending an Existing Specialized Semantic Lexicon
- Proceedings of the First International Conference on Language Resources and Evaluation
, 1998
Cited by 4 (1 self)
There is a constant need to extend and tune specialized vocabularies to account for new words and new word usages. This paper addresses the issue of characterizing the semantic class of such words. We test the hypothesis that the analysis of word distribution in a representative corpus, as obtained by robust NLP tools, can help identify words with similar meanings and decide on the most likely category for a given word based on the categories of its neighbors. We report on an experiment with a moderate-size corpus of patient discharge summaries collected during the MENELAS project, taking as categories the high-level axes of the SNOMED nomenclature, and processing the corpus with the ZELLIG suite of tools. We attempt to quantify the extent to which this process succeeds in proposing a correct category for a given word of the corpus while we vary several parameters of the method. The percentage of correctly categorized words (precision) ranges between 50 and 75 %, while the best p...
Building Detection from High Resolution Colour Images
, 1998
Cited by 4 (0 self)
We describe a new method for the detection and reconstruction of buildings in dense urban areas using high-resolution aerial images. Our approach begins with the generation of a dense digital elevation model (DEM). A sparse disparity map is densified using a region-based segmentation of the left aerial image: each detected region is tested for planarity in the disparity map. A strategy is proposed to optimize the generation of these planar surfaces, taking into account the noise present in the sparse disparity map and the robustness and complexity of different algorithms for planar approximation. The second step of our approach deals with the generation of building hypotheses. Based on the DEM previously computed, geometric and colorimetric criteria are used for the fusion of parallel regions, for the detection of symmetrical regions in the 3D object space and for the reconstruction of roof buildings. Experimental results are presented on a scene in the suburb of Bruxelles with colour ima...
On the Correctness of Multimedia Applications
- In 11th EuroMicro Conf. on Real Time Systems. IEEE
, 1999
Cited by 3 (1 self)
This paper provides a method to verify that the properties we consider critical to building a correct multimedia application are met. These properties are the compatibility of the protocols employed by the computing elements to interact with each other and the timeliness of the application. The method relies on (i) an original model that enables a software architect to specify the basic elements of their multimedia application according to the two aforementioned properties, and (ii) two algorithms that determine the best level of perceived quality of service that can be guaranteed by the application. 1. Introduction A multimedia application is characterized by its ability to handle quasi-periodic streams. Typically, these streams transport a large amount of data (e.g. video or audio streams) and have soft temporal constraints, i.e. missing a deadline does not have any critical consequences for users of the application. In general, temporal constraints apply either on a single stream or on ...
μ-SOM : Weighting features during clustering
- PROCEEDINGS OF THE 5TH WORKSHOP ON SELF-ORGANIZING MAPS (WSOM’05)
, 2005
Cited by 3 (2 self)
Real-life datasets used in marketing studies contain a lot of redundant features which may prevent data-mining techniques such as self-organizing maps from discovering relevant clusters. An extension of the batch Kohonen's algorithm is proposed in this paper to avoid the large amount of work which is required by data preprocessing if redundancy isn't treated explicitly by the training method. The proposed approach integrates a weighting of variables built on a simultaneous clustering of both observations and variables and avoids the side effects of redundancy. An application to market segmentation is then briefly described to validate the learning algorithm introduced; identified clusters of products and motivations are used to simplify the analysis of the consumer segmentation by giving the user a first rough description of the different groups.

