## On-line Handwriting Recognition with Support Vector Machines - A Kernel Approach (2002)

Venue: In Proc. of the 8th IWFHR

Citations: 92 (8 self)

### BibTeX

@INPROCEEDINGS{Bahlmann02on-linehandwriting,

author = {Claus Bahlmann and Bernard Haasdonk and Hans Burkhardt},

title = {On-line Handwriting Recognition with Support Vector Machines - A Kernel Approach},

booktitle = {In Proc. of the 8th IWFHR},

year = {2002},

pages = {49--54}

}

### Abstract

In this contribution we describe a novel classification approach for on-line handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel, which we call the Gaussian DTW (GDTW) kernel. This kernel approach has a main advantage over common HMM techniques: it does not assume a model for the generative class-conditional densities. Instead, it directly addresses the problem of discrimination by creating class boundaries and is thus less sensitive to modeling assumptions. By incorporating DTW in the kernel function, general classification problems with variable-sized sequential data can be handled. In this respect the proposed method can be applied straightforwardly to all classification problems where DTW gives a reasonable distance measure, e.g. speech recognition or genome processing. We show experiments with this kernel approach on the UNIPEN handwriting data, achieving results comparable to an HMM-based technique.
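As a rough illustration of the idea in the abstract, the following Python sketch plugs a DTW distance into a Gaussian-type kernel, $K(T, R) = \exp(-\gamma\, D(T, R))$. The abstract does not spell out the formula, so the squared Euclidean local distance, the normalization by warping-path length, and the names `dtw_distance` and `gdtw_kernel` are assumptions made here for illustration. The default `gamma=1.8` is the value quoted in the citation contexts on this page.

```python
import math


def dtw_distance(T, R):
    """Length-normalized DTW cost between two sequences of feature vectors.

    Assumption: squared Euclidean local distance, steps (1,0), (0,1), (1,1),
    and division of the accumulated cost by the length of the chosen path.
    """
    def local(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n, m = len(T), len(R)
    inf = float("inf")
    # cell[i][j] = (accumulated cost, path length) of the cheapest path to (i, j)
    cell = [[(inf, 0)] * m for _ in range(n)]
    cell[0][0] = (local(T[0], R[0]), 1)
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            preds = []
            if i > 0:
                preds.append(cell[i - 1][j])
            if j > 0:
                preds.append(cell[i][j - 1])
            if i > 0 and j > 0:
                preds.append(cell[i - 1][j - 1])
            # Cheapest predecessor; ties are broken in favor of the shorter path.
            best_cost, best_len = min(preds)
            cell[i][j] = (best_cost + local(T[i], R[j]), best_len + 1)
    total, length = cell[n - 1][m - 1]
    return total / length


def gdtw_kernel(T, R, gamma=1.8):
    """Gaussian-type kernel on top of the DTW distance (illustrative sketch)."""
    return math.exp(-gamma * dtw_distance(T, R))


# Two pen trajectories of different lengths (hypothetical 2-D features).
a = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
b = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
c = [(0.0, 0.0), (1.0, 0.0)]
k_ab = gdtw_kernel(a, b)
k_ac = gdtw_kernel(a, c)
```

Sequences of different lengths are compared directly, which is the property that lets such a kernel handle variable-sized sequential data.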

### Citations

2530 | A tutorial on support vector machines for pattern recognition
- Burges
- 1998
Citation Context ...g. We show experiments with this kernel approach on the UNIPEN handwriting data, achieving results comparable to an HMM-based technique. 1. Introduction The utilization of support vector machine (SVM) [2, 4] classifiers has gained immense popularity in the last years. SVMs have achieved excellent recognition results in various pattern recognition applications [4]. Also in off-line optical character recog...

1667 | Fundamentals of Speech Recognition
- Rabiner, Juang
- 1993
Citation Context ...ata and a comparison to UNIPEN results of other recognition techniques are presented in section 4. Section 5 provides a conclusion of this contribution. 2. Background 2.1. Dynamic time warping In DTW [15] a distance $D(T, R)$ from two vector sequences $T = (t_1, \ldots, t_{N_T})$ and $R = (r_1, \ldots, r_{N_R})$ is determined. In on-line HWR the vectors $t_i \in \mathbb{R}^F$ and $r_j \in \mathbb{R}^F$ are usually computed from the local neighborhood ...

1126 | Fast training of support vector machines using sequential minimal optimization
- Platt
- 1999
Citation Context ...lied in this case. Each pattern is typically represented by about 10–80 sample points. 4.3. Two-class experiments We have trained the SVM-GDTW with the sequential minimal optimization (SMO) algorithm [14], using a third-party Matlab SVM toolbox [3]. For the following experiments the SVM and kernel parameters were set to $C = 1$ and $\gamma = 1.8$, respectively. In the first investigation we were concerned whethe...

809 | Statistical Methods for Speech Recognition
- Jelinek
- 1998
Citation Context ...n the literature [15]. These only allow forward steps of size 1 in $T$, $R$ or in both of them, i.e. $\phi(n+1) - \phi(n)$ equals $(1, 0)$, $(0, 1)$ or $(1, 1)$. Usual dynamic programming and beam search strategies [11] are applied to reduce the computational complexity when minimizing (2). The DTW technique itself in combination with a minimum distance classifier [17, 18] as well as the incorporation of statistical...
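The step constraint quoted in this context can be made concrete with a small dynamic-programming sketch in Python that also recovers the warping path. It is an illustration only: the function name `dtw_path`, the caller-supplied `dist` function, and the full-table search (no beam pruning) are assumptions, not details taken from the paper.

```python
def dtw_path(T, R, dist):
    """Return (accumulated cost, warping path) for sequences T and R.

    The path is a list of index pairs; consecutive pairs differ by
    (1, 0), (0, 1) or (1, 1), i.e. forward steps of size 1 in T, R or both.
    """
    n, m = len(T), len(R)
    inf = float("inf")
    acc = [[inf] * m for _ in range(n)]    # accumulated cost table
    back = [[None] * m for _ in range(n)]  # predecessor of each cell
    acc[0][0] = dist(T[0], R[0])
    for i in range(n):
        for j in range(m):
            d = dist(T[i], R[j])
            for di, dj in ((1, 0), (0, 1), (1, 1)):
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0 and acc[pi][pj] + d < acc[i][j]:
                    acc[i][j] = acc[pi][pj] + d
                    back[i][j] = (pi, pj)
    # Walk the predecessor links back from the end point to (0, 0).
    path = [(n - 1, m - 1)]
    while path[-1] != (0, 0):
        i, j = path[-1]
        path.append(back[i][j])
    path.reverse()
    return acc[n - 1][m - 1], path


# Scalar toy sequences; R repeats the middle sample of T.
cost, path = dtw_path([0, 1, 2], [0, 1, 1, 2], lambda a, b: abs(a - b))
steps = [(q[0] - p[0], q[1] - p[1]) for p, q in zip(path, path[1:])]
```

Filling the whole table costs time proportional to the product of the two sequence lengths; beam search, as mentioned in the context, would prune cells with high accumulated cost instead of visiting all of them.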

294 | Large margin DAGs for multiclass classification
- Platt, Cristianini, et al.
- 2000
Citation Context ...in (3) is limited to a subset of the $P_i$, which therefore is called the set of support vectors. Extensions of the binary classification to the multi-class situation are suggested in several approaches [2, 13]. 3. Gaussian dynamic time warping kernel As indicated in the introduction, when dealing with sequential on-line handwriting data we cannot simply employ the basic SVM framework given by (3)–(6). Diff...
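One way to build a multi-class decision from binary classifiers, in the spirit of the DAG approach named in this entry's title, is to eliminate one candidate class per pairwise comparison. The Python sketch below is a generic illustration; `dag_predict` and the `pairwise_decide` callback are hypothetical names, and the nearest-prototype rule stands in for trained two-class SVMs.

```python
def dag_predict(x, classes, pairwise_decide):
    """Pick one class using len(classes) - 1 pairwise decisions.

    pairwise_decide(x, a, b) must return either a or b; here it is a
    placeholder for a trained two-class classifier (an assumption).
    """
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[-1]
        if pairwise_decide(x, a, b) == a:
            remaining.pop()    # b lost the comparison and is eliminated
        else:
            remaining.pop(0)   # a lost the comparison and is eliminated
    return remaining[0]


# Toy stand-in for binary classifiers: prefer the nearer class prototype.
calls = []


def nearest_prototype(x, a, b):
    calls.append((a, b))
    return a if abs(x - a) <= abs(x - b) else b


label = dag_predict(2.9, [0, 1, 2, 3, 4], nearest_prototype)
```

With N classes this evaluates N - 1 binary classifiers per test pattern, although N(N - 1)/2 pairwise classifiers still have to be trained.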

172 | Using the Fisher kernel method to detect remote protein homologies
- Jaakkola
- 1999
Citation Context ...arting point for linking SVMs with sequential data is the so-called kernel, as will be shown. Some work has been done in other research areas dealing with kernels for sequential data. Jaakkola et al. [10] developed an SVM kernel in their application of protein homology detection and refer to it as Fisher kernel. Watkins [19] describes several explicit kernels for sequential data and shows that the joi...

150 | Training invariant support vector machines
- DeCoste, Schölkopf
- 2002
Citation Context ...cations [4]. Also in off-line optical character recognition (OCR) they have been shown to be comparable or even superior to the standard techniques like Bayesian classifiers or multilayer perceptrons [5]. SVMs are discriminative classifiers based on Vapnik’s structural risk minimization principle. They can implement flexible decision boundaries in high dimensional feature spaces. The implicit regular...

129 | Dynamic alignment kernels
- Watkins
- 1999
Citation Context ...ther research areas dealing with kernels for sequential data. Jaakkola et al. [10] developed an SVM kernel in their application of protein homology detection and refer to it as Fisher kernel. Watkins [19] describes several explicit kernels for sequential data and shows that the joint probability of two sequences according to a pair HMM is a proper SVM kernel under certain conditions. Since the kernels...

106 | UNIPEN project of on-line data exchange and recognizer Benchmarks
- Guyon, Schomaker, et al.
- 1994
Citation Context ...We shall start with a short review of dynamic time warping (DTW) and SVMs in the following section. Section 3 then introduces the GDTW kernel. Experimental results with this GDTW kernel on the UNIPEN [7] data and a comparison to UNIPEN results of other recognition techniques are presented in section 4. Section 5 provides a conclusion of this contribution. 2. Background 2.1. Dynamic time warping In DT...

104 | Input space versus feature space in kernel-based methods
- Schölkopf, Mika, et al.
- 1999
Citation Context ...onally one can omit or interrupt applicable kernel evaluations by pruning techniques for the DTW. Furthermore techniques exist that decrease the number of support vectors posterior to the SVM training [16]. These techniques produce SVMs which are up to ten times faster without large losses in classification accuracy. 5. Conclusion We have presented a novel approach for the recognition of on-line handwr...

84 | Support vector machines
- Cristianini, Ricci
- 2008
Citation Context ...g. We show experiments with this kernel approach on the UNIPEN handwriting data, achieving results comparable to an HMM-based technique. 1. Introduction The utilization of support vector machine (SVM) [2, 4] classifiers has gained immense popularity in the last years. SVMs have achieved excellent recognition results in various pattern recognition applications [4]. Also in off-line optical character recog...

46 | Cursive Script Recognition by Elastic Matching
- Tappert
- 1982
Citation Context ... dynamic programming and beam search strategies [11] are applied to reduce the computational complexity when minimizing (2). The DTW technique itself in combination with a minimum distance classifier [17, 18] as well as the incorporation of statistical knowledge to this concept [1] have been successfully applied to handwriting recognition. 2.2. Support vector classification Here, we provide a brief introd...

37 | MATLAB support vector machine toolbox (v0.55 [ http://theoval.sys.uea.ac.uk/~gcc/svm/toolbox
- Cawley
- 2000
Citation Context ...represented by about 10–80 sample points. 4.3. Two-class experiments We have trained the SVM-GDTW with the sequential minimal optimization (SMO) algorithm [14], using a third-party Matlab SVM toolbox [3]. For the following experiments the SVM and kernel parameters were set to $C = 1$ and $\gamma = 1.8$, respectively. In the first investigation we were concerned whether an SVM-GDTW is able to classify clearly se...

34 | Tangent distance kernels for support vector machines
- Haasdonk, Keysers
- 2002
Citation Context ...uaranteed to be the global optimum. In fact general pd cannot be proven for (7), as simple counterexamples can be found. Nevertheless such kernels can produce good results like in our case and others [5, 8]. Recalling the fact that positive definite kernels are characterized by the property of generating kernel matrices $K_{ij} = K(P_i, P_j)$ with solely nonnegative eigenvalues $\lambda_i$, some reasons for the good r...
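The eigenvalue criterion mentioned in this context can be checked numerically on a given kernel matrix. The following Python sketch estimates the smallest eigenvalue of a symmetric matrix with shifted power iteration, using only the standard library. The function name `smallest_eigenvalue`, the fixed iteration count, and the starting vector are assumptions chosen for illustration; a numerical library's symmetric eigensolver would normally be preferable.

```python
import math


def smallest_eigenvalue(K, iters=1000):
    """Estimate the smallest eigenvalue of a symmetric matrix K.

    Power iteration is run on B = s*I - K, where s bounds every |eigenvalue|
    of K (Gershgorin), so the largest eigenvalue of B equals s - lambda_min.
    Caveat: it fails if the start vector is orthogonal to the relevant
    eigenvector, and it converges slowly when eigenvalues of B are close.
    """
    n = len(K)
    s = max(sum(abs(x) for x in row) for row in K)
    B = [[(s if i == j else 0.0) - K[i][j] for j in range(n)] for i in range(n)]
    v = [1.0 / (i + 1) for i in range(n)]  # deliberately not a constant vector
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # B maps v to zero, so the Rayleigh quotient below is 0
        v = [x / norm for x in w]
    Bv = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
    rayleigh = sum(a * b for a, b in zip(v, Bv)) / sum(a * a for a in v)
    return s - rayleigh


# A positive definite matrix (eigenvalues 1 and 3) and an indefinite one
# (eigenvalues -1 and 3); neither is a kernel matrix from the paper.
lam_pd = smallest_eigenvalue([[2.0, 1.0], [1.0, 2.0]])
lam_indef = smallest_eigenvalue([[1.0, 2.0], [2.0, 1.0]])
```

A clearly negative result on a kernel matrix indicates that the kernel is not positive definite on that data set, while a nonnegative result only says something about the patterns at hand.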

31 | Writer independent online handwriting recognition using an hmm approach
- Hu, Lim, et al.
- 2000
Citation Context ...d. chosen 20 %/20 % Train/Test 3.8 % rand. chosen 40 %/40 % Train/Test Train-R01/V07 4.5 % rand. chosen 20 %/20 % Train/Test 3.2 % rand. chosen 40 %/40 % Train/Test MLP [12] 3.0 % DevTest-R02/V02 HMM [9] 3.2 % DAG-SVM-GDTW SDTW [1] HMM [9] 6.4 % DAG-SVM-GDTW SDTW [1] Train-R01/V06 4 % "bad characters" removed Train-R01/V07 7.6 % rand. chosen 20 %/20 % Train/Test 7.6 % rand. chosen 40 %/40 % Train/Tes...

24 | Measuring HMM similarity with the Bayes probability of error and its application to online handwriting recognition
- Bahlmann, Burkhardt
- 2001
Citation Context ...computational complexity when minimizing (2). The DTW technique itself in combination with a minimum distance classifier [17, 18] as well as the incorporation of statistical knowledge to this concept [1] have been successfully applied to handwriting recognition. 2.2. Support vector classification Here, we provide a brief introduction to support vector classification. For more details and geometrical ...

10 | Character Recognition Experiments Using UNIPEN Data
- Parizeau, Lemieux, et al.
- 2004
Citation Context ...DTW [1] Train-R01/V07 4.0 % rand. chosen 20 %/20 % Train/Test 3.8 % rand. chosen 40 %/40 % Train/Test Train-R01/V07 4.5 % rand. chosen 20 %/20 % Train/Test 3.2 % rand. chosen 40 %/40 % Train/Test MLP [12] 3.0 % DevTest-R02/V02 HMM [9] 3.2 % DAG-SVM-GDTW SDTW [1] HMM [9] 6.4 % DAG-SVM-GDTW SDTW [1] Train-R01/V06 4 % "bad characters" removed Train-R01/V07 7.6 % rand. chosen 20 %/20 % Train/Test 7.6 % ra...

8 | Adaptive character recognizer for a hand-held device: implementation and evaluation setup
- Vuori, Aksela, et al.
- 2000
Citation Context ... dynamic programming and beam search strategies [11] are applied to reduce the computational complexity when minimizing (2). The DTW technique itself in combination with a minimum distance classifier [17, 18] as well as the incorporation of statistical knowledge to this concept [1] have been successfully applied to handwriting recognition. 2.2. Support vector classification Here, we provide a brief introd...

7 | Strategies for combining on-line and off-line information in an on-line handwriting recognition system. ICDAR’01, pp. 412-416
- Gauthier, Artieres, et al.
- 2001
Citation Context ...%/20 % Train/Test Train-R01/V07 13.0 % rand. chosen 10 %/10 % Train/Test 11.4 % rand. chosen 20 %/20 % Train/Test 9.7 % rand. chosen 66 %/33 % Train/Test MLP [12] 14.4 % DevTest-R02/V02 HMM-NN hybrid [6] 13.2 % Train-R01/V07 HMM [9] 14.1 % Train-R01/V06 4 % "bad characters" removed sured CTkernel ≈ 0.001 sec in our implementation on an AMD Athlon 1200MHz. The asymptotic training time of the two-class...