## Evidence Combination Techniques For Robust Classification Of Short-Duration Oceanic Signals (1992)

Venue: SPIE Conf. on Adaptive and Learning Systems, SPIE Proc.

Citations: 18 (12 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Ghosh92evidencecombination,
  author    = {Joydeep Ghosh and Steven Beck and Chen-chau Chu},
  title     = {Evidence Combination Techniques For Robust Classification Of Short-Duration Oceanic Signals},
  booktitle = {SPIE Conf. on Adaptive and Learning Systems, SPIE Proc.},
  year      = {1992},
  pages     = {266--276}
}
```

### Abstract

The identification and classification of underwater acoustic signals is an extremely difficult problem because of low SNRs and a high degree of variability in the signals emanated from the same type of sound source. Since different classification techniques have different inductive biases, a single method cannot give the best results for all signal types. Rather, more accurate and robust classification can be obtained by combining the outputs (evidences) of multiple classifiers based on neural network and/or statistical pattern recognition techniques. In this paper, four approaches to evidence combination are presented and compared using realistic oceanic data. The first method uses an entropy-based weighting of individual classifier outputs. The second is based on combination of confidence factors in a manner similar to that used in MYCIN. The other two methods use majority voting and averaging, with little extra computational overhead. All these techniques give better results than tho...
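The abstract's last two schemes (majority voting and averaging) and a plausible form of the entropy-based weighting can be sketched in a few lines. All function names below are ours, and the particular entropy formula is an assumption for illustration, not the paper's exact weighting:

```python
import numpy as np

def average_combine(outputs):
    """Average the output vectors of all classifiers."""
    return np.mean(np.asarray(outputs, dtype=float), axis=0)

def majority_vote(outputs):
    """Each classifier votes for its top class; the most-voted class wins."""
    outputs = np.asarray(outputs, dtype=float)
    votes = np.argmax(outputs, axis=1)
    return int(np.bincount(votes, minlength=outputs.shape[1]).argmax())

def entropy_weighted_combine(outputs, eps=1e-12):
    """Weight each classifier by the inverse entropy of its output
    distribution, so confident (low-entropy) classifiers count more.
    This particular formula is an assumption, not the paper's."""
    p = np.asarray(outputs, dtype=float)
    p = p / p.sum(axis=1, keepdims=True)
    entropy = -(p * np.log(p + eps)).sum(axis=1)
    w = 1.0 / (entropy + eps)
    return (w / w.sum()) @ p
```

For example, three classifiers emitting `[0.8, 0.1, 0.1]`, `[0.6, 0.3, 0.1]`, and `[0.2, 0.5, 0.3]` all pick class 0 under each rule, but entropy weighting lets the confident first classifier dominate the combined distribution.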

### Citations

4179 |
Pattern Classification and Scene Analysis
- Duda, Hart
- 1973
Citation Context: ...t classes have substantial overlap in their density distributions, then the statistically optimal expected correct classification achieved by Bayes decision theory can be significantly less than 100% [8]. For real-life, difficult signals such as the underwater acoustic signals considered in this paper, the apriori class distributions are not known, and thus Bayesian a posteriori probabilities are not...

519 |
Fast Learning in Networks of Locally-Tuned Processing Units
- Moody, Darken
- 1989
Citation Context: ...l" classifiers using feed-forward networks [16]. These include the Multi-Layer Perceptron (MLP) as well as kernel-based classifiers such as those employing Radial or Elliptical Basis Functions (EBFs) [3, 18]. These networks can serve as non-parametric, adaptive classifiers that learn through examples [16], without requiring a good apriori mathematical model for the underlying signal characteristics. If t...

484 |
Multivariate functional interpolation and adaptive networks
- Broomhead, Lowe
- 1988
Citation Context: ...l" classifiers using feed-forward networks [16]. These include the Multi-Layer Perceptron (MLP) as well as kernel-based classifiers such as those employing Radial or Elliptical Basis Functions (EBFs) [3, 18]. These networks can serve as non-parametric, adaptive classifiers that learn through examples [16], without requiring a good apriori mathematical model for the underlying signal characteristics. If t...

445 | Optimal brain damage
- LeCun, Denker, et al.
- 1990
Citation Context: ...s and conventional techniques such as decision trees, K nearest neighbor, Gaussian mixtures, and CART can be found in [19, 23]. For this study, we chose the MLP augmented with weight decay strategies [6], and the EBF network, since these classifiers are comparatively insensitive to noise and to irrelevant inputs [2]. 2.1 MLP with Weight Decay The standard fully-connected MLP network that adapts weigh...
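The context above mentions an MLP "augmented with weight decay strategies". As a minimal sketch under standard assumptions, weight decay adds an L2 penalty to the error so each gradient step also shrinks weights toward zero; the function name and constants below are illustrative, not the paper's:

```python
import numpy as np

def sgd_step_with_weight_decay(w, grad, lr=0.01, decay=1e-3):
    """One SGD step on E(w) + (decay/2)*||w||^2: the penalty contributes
    decay * w to the gradient, pulling unneeded weights toward zero."""
    return w - lr * (grad + decay * w)
```

Weights that receive little error gradient decay steadily toward zero, which is what makes the resulting network "parsimonious".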

380 |
Computer systems that learn
- Weiss, Kulikowski
- 1991
Citation Context: ...thin each category, is available in [16, 19]. Comparisons between these classifiers and conventional techniques such as decision trees, K nearest neighbor, Gaussian mixtures, and CART can be found in [19, 23]. For this study, we chose the MLP augmented with weight decay strategies [6], and the EBF network, since these classifiers are comparatively insensitive to noise and to irrelevant inputs [2]. 2.1 MLP...

283 |
Neural network classifiers estimate Bayesian a posteriori probabilities
- Richard, Lippmann
- 1991
Citation Context: ...networks trained by minimizing either the expected mean square error (MSE) or cross-entropy, and by using a 1 of M teaching function, yield network outputs that estimate posterior class probabilities [11, 20, 21]. These estimations have been observed to be very good for low-dimensional input patterns, at least in regions where there are sufficient training patterns [20, 21]. Detailed experiments for high-dime...

105 |
A Model for Inexact Reasoning in Medicine
- Shortliffe, Buchanan
- 1975
Citation Context: ...llel combination of rules in expert systems. Certainty factors were introduced in the MYCIN expert system for reasoning in expert systems under uncertainty, and reflect the confidence in a given rule [22]. The original method of rule combination in MYCIN was later expressed in a more probabilistic framework by Heckerman [13], and serves as the basis for the method proposed below: First, the outputs, w...

66 |
Pattern classification using neural networks
- Lippmann
- 1989
Citation Context: ...rtificial neural network (ANN) approaches to problems in the field of pattern recognition and signal processing have led to the development of various "neural" classifiers using feed-forward networks [16]. These include the Multi-Layer Perceptron (MLP) as well as kernel-based classifiers such as those employing Radial or Elliptical Basis Functions (EBFs) [3, 18]. These networks can serve as non-parame...

59 |
A Probabilistic Approach to the Understanding and Training of Neural Network Classifiers
- Gish
- 1990
Citation Context: ...networks trained by minimizing either the expected mean square error (MSE) or cross-entropy, and by using a 1 of M teaching function, yield network outputs that estimate posterior class probabilities [11, 20, 21]. These estimations have been observed to be very good for low-dimensional input patterns, at least in regions where there are sufficient training patterns [20, 21]. Detailed experiments for high-dime...

28 |
Practical characteristics of neural network and conventional pattern classi
- Ng, Lippmann
- 1991
Citation Context: ...pattern recognition techniques tailored to recognizing short-duration oceanic signals [10]. 2 Overview of ANN Classifiers Used Our experiences, corroborated by those of several other researchers (see [19] for example), show that classification error rates are similar across different ANN classifiers when they are powerful enough to form minimum error decision regions, if they are properly tuned, and w...

26 |
Predicting the future: advantages of semilocal units
- Hartman, Keeler
- 1991
Citation Context: ...ons, they are not so practical and efficient for high dimensional inputs. This is because a much larger number of kernel nodes are required to cover higher dimensional spaces with adequate resolution [12]. Moreover, they are more sensitive to noise and to irrelevant inputs [2]. The abovementioned drawbacks can be countered by using Elliptical Basis Function (EBF) networks that employ Gaussian basis fu...

23 | Mapping neural networks onto message-passing multicomputers
- Ghosh, Hwang
- 1989
Citation Context: ...roperly tuned, and when sufficient training data is available. Practical characteristics such as training time, classification time and memory requirements, however, can differ by orders of magnitude [9]. Also, the classifiers differ in their robustness against noise, effects of small training sets, and in their ability to handle high-dimensional inputs [2]. A good review of probabilistic, hyperplane...

13 |
Pattern recognition properties of neural networks
- Makhoul
- 1991
Citation Context: ...w values of MSE, $f_c(x)$ approximates $P(c \mid x)$ according to Eq. 7. Let $f_{c,i}(x)$ be the output of the node that denotes membership in class $c$ in the $i$th neural classifier. We expect that, for all $i, x$ [17]:

$$\sum_c f_{c,i}(x) \approx 1. \quad (8)$$

Similarly, if the posteriori estimate is very good, one would expect for all $c, i$:

$$\frac{1}{N} \sum_{j=1}^{N} f_{c,i}(x_j) \approx P(c), \quad (9)$$

where $j$ indexes the $N$ training data samples, and $P(c)$ is o...
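Equations (8) and (9) in the context above are necessary conditions on a classifier's outputs if they are to behave like posterior estimates, and both are cheap to check. A minimal sketch (the function name and tolerance are ours):

```python
import numpy as np

def posterior_sanity_checks(f, priors, tol=0.1):
    """f has shape (N, n_classes): row j holds one classifier's outputs
    for sample x_j.  Checks Eq. (8): each row sums to ~1, and Eq. (9):
    the mean output for class c approximates the prior P(c)."""
    eq8_ok = bool(np.all(np.abs(f.sum(axis=1) - 1.0) < tol))
    eq9_ok = bool(np.all(np.abs(f.mean(axis=0) - priors) < tol))
    return eq8_ok, eq9_ok
```

A classifier that fails either check is unlikely to yield trustworthy posterior estimates for evidence combination.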

13 |
Least squares learning and approximation of posterior probabilities on classification problems by neural network models
- Shoemaker, Carlin, et al.
- 1991
Citation Context: ...networks trained by minimizing either the expected mean square error (MSE) or cross-entropy, and by using a 1 of M teaching function, yield network outputs that estimate posterior class probabilities [11, 20, 21]. These estimations have been observed to be very good for low-dimensional input patterns, at least in regions where there are sufficient training patterns [20, 21]. Detailed experiments for high-dime...

10 |
A hybrid neural network classifier of short duration acoustic signals
- Beck, Deuser, et al.
- 1991
Citation Context: ...signal source type and number of training and test samples are given in Table 1. Signal preprocessing as well as the extraction of good feature vectors is crucial to the performance of the classifiers [1]. The 25-dimensional feature vectors extracted from the raw signals consist of: 16 coefficients of Gabor wavelets - a multiscale representation that does not assume signal stationarity [5], 1 value de...

9 |
Probabilistic interpretations for MYCIN's certainty factors, in Uncertainty in
- Heckerman
- 1986
Citation Context: ... expert systems under uncertainty, and reflect the confidence in a given rule [22]. The original method of rule combination in MYCIN was later expressed in a more probabilistic framework by Heckerman [13], and serves as the basis for the method proposed below: First, the outputs, which are in the range [0,1], are mapped into confidence factors (CFs) in the range [-1,1] using a log transformation. Then...
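The combination step described in this context can be sketched with MYCIN's standard combining rule [22]. The context only says the [0,1] outputs are mapped to [-1,1] by "a log transformation", so the tanh-of-log-odds form below is one plausible reading, not necessarily the paper's:

```python
import math

def output_to_cf(y, eps=1e-6):
    """Map a classifier output in [0,1] to a CF in (-1,1) via log-odds.
    This exact transform is an assumption; the source specifies only
    'a log transformation'."""
    y = min(max(y, eps), 1.0 - eps)
    return math.tanh(math.log(y / (1.0 - y)))

def combine_cf(cf1, cf2):
    """MYCIN's certainty-factor combination rule."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1.0 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1.0 + cf1)
    return (cf1 + cf2) / (1.0 - min(abs(cf1), abs(cf2)))
```

Two agreeing positive CFs reinforce each other (0.6 and 0.5 combine to 0.8), while CFs of opposite sign partially cancel.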

8 |
Noise sensitivity of static neural classifiers
- Beck, Ghosh
- 1992
Citation Context: ... however, can differ by orders of magnitude [9]. Also, the classifiers differ in their robustness against noise, effects of small training sets, and in their ability to handle high-dimensional inputs [2]. A good review of probabilistic, hyperplane, kernel and exemplar-based classifiers that discusses the relative merit of various schemes within each category, is available in [16, 19]. Comparisons bet...

7 |
Multilayer Feedforward Potential Function Network
- Lee, Kil
- 1988
Citation Context: ...fferent dimensions. Thus, for an EBF network, Eq. 2 is used with

$$R_j(x_p) = e^{-\frac{1}{2} \sum_k (x_{pk} - x_{jk})^2 / \sigma_{jk}^2}. \quad (3)$$

The EBF network is a type of Gaussian Potential Function Network [15] which involves segmentation of the input domain into several potential fields in form of Gaussians. The Gaussian potential functions of this scheme need not be radially symmetric functions. Instead, ...

5 |
Efficient training procedures for adaptive kernel classifiers
- Chakravarthy, Ghosh, et al.
- 1991
Citation Context: ...kernel/hidden node to the ith output node. A well known example is the Radial Basis Function (RBF) network wherein a radially symmetric function is chosen for $R(x)$, i.e., $R_j(x) = R(\lVert x - x_j \rVert)$ [4, 18]. If Gaussian functions are chosen as the basis functions, we have $R_j(x) = e^{-\frac{1}{2} \lVert x - x_j \rVert^2 / \sigma^2}$, where $\sigma$ determines the width of the receptive field. The $j$th hidden node has a maximum outp...
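The radially symmetric Gaussian kernel quoted in this context, and the elliptical per-dimension variant described in the EBF context above, can be written directly. Helper names are ours; `sigma` is the receptive-field width:

```python
import numpy as np

def rbf_activation(x, center, sigma):
    """Radially symmetric Gaussian kernel exp(-||x - x_j||^2 / (2 sigma^2)),
    peaking at 1 when x coincides with the center."""
    return float(np.exp(-0.5 * np.sum((x - center) ** 2) / sigma ** 2))

def ebf_activation(x, center, sigmas):
    """Elliptical variant: one width per input dimension, so the
    receptive field need not be radially symmetric."""
    return float(np.exp(-0.5 * np.sum(((x - center) / sigmas) ** 2)))
```

With all entries of `sigmas` equal, the elliptical kernel reduces to the radial one.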

4 |
Large automatic learning, rule extraction and generalization
- Denker, et al.
- 1987
Citation Context: ... that adapts weights using "backpropagation" of error is perhaps the most commonly used static ANN classifier [16]. Selective pruning of weights improves the generalization capability of MLP networks [7, 6], and also serves to reduce the effective number of parameters, making the resultant "parsimonious" feed-forward networks more suitable for noisy, high-dimensional inputs. For this reason, the first n...

3 |
Adaptive kernel classifiers for short-duration oceanic signals
- Ghosh, et al.
- 1991
(Show Context)
Citation Context ...a larger project on the design of a detection and classification system that uses a hybrid of ANN and statistical pattern recognition techniques tailored to recognizing short-duration oceanic signals =-=[10]-=-. 2 Overview of ANN Classifiers Used Our experiences, corroborated by those of several other researchers (see [19] for example), show that classification error rates are similar across different ANN c... |