Results 1 - 4 of 4
An ultra low power, ultra miniature voice command system based on hidden markov models
- in IEEE ICASSP
- Cited by 5 (1 self)
A real-time HMM-based isolated word recognition system is implemented on an ultra-low-power miniature DSP system. The DSP system consumes less than 1 milliwatt, much less than what is considered "low-resource" today. It has a very small footprint and requires only a single hearing-aid-sized 1-volt battery. The HMM and MFCC feature extraction algorithms are implemented efficiently by running three processing units concurrently: in addition to the DSP core, an input/output processor creates frames of input speech signals, and a WOLA filterbank unit performs windowing, FFT and vector multiplications. A system evaluation using a vocabulary of 18 words shows a success rate of more than 99%.
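The framing and windowing steps mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's WOLA filterbank: the function names, the Hamming window, and the frame and hop sizes are assumptions chosen only for illustration.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sample list into overlapping frames (hypothetical helper)."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]

def hamming(n):
    """Standard Hamming window of length n (assumed window shape)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def windowed_frames(signal, frame_len, hop):
    """Apply the window to every frame before any FFT stage."""
    w = hamming(frame_len)
    return [[x * c for x, c in zip(f, w)] for f in frames(signal, frame_len, hop)]

if __name__ == "__main__":
    samples = [math.sin(0.1 * n) for n in range(64)]
    out = windowed_frames(samples, frame_len=16, hop=8)
    print(len(out), "frames of", len(out[0]), "samples")
```

In the system described above, frame creation and windowing run on separate units from the DSP core; the sketch only shows the data flow, not that concurrency.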
A High-Speed, Low-Resource ASR Back-End Based on Custom Arithmetic
With the skyrocketing popularity of mobile devices, new processing methods tailored to a specific application have become necessary for low-resource systems. This work presents a high-speed, low-resource speech recognition system using custom arithmetic units, where all system variables are represented by integer indices and all arithmetic operations are replaced by hardware-based table lookups. To this end, several reordering and rescaling techniques, including two accumulation structures for Gaussian evaluation and a novel method for the normalization of Viterbi search scores, are proposed to ensure low entropy for all variables. Furthermore, a discriminatively inspired distortion measure is investigated for scalar quantization of forward probabilities to maximize the recognition rate. Finally, heuristic algorithms are explored to optimize system-wide resource allocation. Our best bit-width allocation scheme requires only 59 kB of ROM to hold the lookup tables, and its recognition performance with various vocabulary sizes in both clean and noisy conditions is nearly as good as that of a system using a 32-bit floating-point unit. Simulations on various architectures show that, on most modern processor designs, we can expect a cycle-count speedup of at least three times over systems with floating-point units. Additionally, the memory bandwidth is reduced by over 70% and the offline storage for model parameters is reduced by 80%.
Index Terms: alpha recursion, bit-width allocation, custom arithmetic, discriminative distortion measure, forward probability normalization and scaling, high speed, low resource, normalization, quantization, speech recognition.
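The idea of representing variables by integer indices and replacing arithmetic with table lookups can be sketched as follows. This is only an illustration under stated assumptions: the uniform codebook, the helper names, and the table layout are hypothetical and do not reproduce the paper's quantizers, bit-width allocation, or normalization method.

```python
import bisect

def nearest_index(codebook, value):
    """Index of the codeword closest to value (codebook sorted ascending)."""
    pos = bisect.bisect_left(codebook, value)
    if pos == 0:
        return 0
    if pos == len(codebook):
        return len(codebook) - 1  # saturate at the largest codeword
    before, after = codebook[pos - 1], codebook[pos]
    return pos if after - value < value - before else pos - 1

def build_table(codebook, op):
    """Precompute op for every index pair; results are stored as indices."""
    return [[nearest_index(codebook, op(a, b)) for b in codebook]
            for a in codebook]

# Assumed uniform codebook: -4.0 to 4.0 in steps of 0.5 (17 codewords).
CODEBOOK = [i * 0.5 for i in range(-8, 9)]
ADD_TABLE = build_table(CODEBOOK, lambda a, b: a + b)

if __name__ == "__main__":
    i = nearest_index(CODEBOOK, 1.0)
    j = nearest_index(CODEBOOK, 1.5)
    k = ADD_TABLE[i][j]  # the "addition" is a single lookup on indices
    print(i, j, k, CODEBOOK[k])
```

At run time only indices are stored and moved, which is where the reductions in memory bandwidth and storage reported in the abstract would come from; table size grows with the product of the operand codebook sizes, so the bit widths have to be chosen carefully.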
FIXED-POINT IMPLEMENTATIONS OF SPEECH RECOGNITION SYSTEMS
Fixed-point hardware implementations of signal processing algorithms can often achieve higher performance with lower computational requirements than a floating-point implementation. However, such systems are hard to design because of the difficulty of addressing quantization issues. This paper presents an optimization approach to determining the wordlengths of fixed-point operators in a speech recognition system. This approach enables users to achieve the same result as a floating-point implementation with minimum hardware resources, resulting in reduced cost and perhaps lower power consumption. These techniques lead to an automated, optimization-based design methodology for fixed-point signal processing systems. An object-oriented library, called Fixed, was developed to simulate fixed-point quantization effects. Quantization effects during recognition were analyzed, and appropriate wordlengths that balance hardware cost and calculation accuracy were determined for the operators.
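Simulating quantization effects and searching for a sufficient wordlength can be sketched in a few lines. This is not the Fixed library from the paper: the function names, the round-and-saturate behavior, and the simple linear search over fractional bits are all assumptions made for illustration.

```python
def quantize(x, word_bits, frac_bits):
    """Round x to signed fixed point with word_bits total bits, saturating."""
    scale = 1 << frac_bits
    max_q = (1 << (word_bits - 1)) - 1
    min_q = -(1 << (word_bits - 1))
    q = max(min_q, min(max_q, round(x * scale)))
    return q / scale

def dot_fixed(a, b, word_bits, frac_bits):
    """Dot product with every operand, product and sum quantized."""
    acc = 0.0
    for x, y in zip(a, b):
        xq = quantize(x, word_bits, frac_bits)
        yq = quantize(y, word_bits, frac_bits)
        prod = quantize(xq * yq, word_bits, frac_bits)
        acc = quantize(acc + prod, word_bits, frac_bits)
    return acc

def min_frac_bits(a, b, tol, word_bits=32, max_frac=24):
    """Smallest fractional wordlength whose error stays within tol, or None."""
    exact = sum(x * y for x, y in zip(a, b))
    for frac_bits in range(1, max_frac + 1):
        if abs(dot_fixed(a, b, word_bits, frac_bits) - exact) <= tol:
            return frac_bits
    return None

if __name__ == "__main__":
    a, b = [0.1, 0.2, 0.3], [0.4, 0.5, 0.6]
    print(min_frac_bits(a, b, tol=1e-3))
```

A real wordlength optimizer would measure recognition accuracy rather than the error of a single dot product, and would assign a separate wordlength to each operator instead of one global value.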
Implementation of an Intonational Quality Assessment System for a Handheld Device
In this paper, we describe an implementation of an intonational quality assessment system for foreign language learning on a handheld portable device. The Viterbi algorithm is employed to perform the forced alignment that indicates the boundaries of each phoneme, and a pitch detector is used to extract the intonational features. The tonal pitch type of each segmented syllable is classified and the tendency of the pitch movement is measured. The score of the spoken sentence is then generated from this information. We have implemented this system on an ARM7 RISC processor based system. For real-time operation, we applied fixed-point arithmetic to the signal processing kernels and rearranged the algorithm flow of the system. As a result, the system runs in real time at a 60 MHz CPU clock frequency.
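Viterbi forced alignment of frames to a known phoneme sequence can be sketched as below. The integer scores stand in for fixed-point log-likelihoods; the function name, the one-state-per-phoneme topology, and the example scores are assumptions and are not taken from the paper.

```python
NEG = -10**9  # stands in for "unreachable" in integer arithmetic

def forced_align(scores):
    """scores[t][s]: integer log-likelihood of frame t under phoneme s,
    with phonemes listed in spoken order. Returns one phoneme index per
    frame, moving left to right from the first phoneme to the last.
    Assumes at least as many frames as phonemes."""
    n_frames, n_phones = len(scores), len(scores[0])
    best = [[NEG] * n_phones for _ in range(n_frames)]
    back = [[0] * n_phones for _ in range(n_frames)]
    best[0][0] = scores[0][0]
    for t in range(1, n_frames):
        for s in range(n_phones):
            stay = best[t - 1][s]
            move = best[t - 1][s - 1] if s > 0 else NEG
            if move > stay:
                best[t][s], back[t][s] = move + scores[t][s], s - 1
            else:
                best[t][s], back[t][s] = stay + scores[t][s], s
    # Trace back from the last phoneme at the last frame.
    path = [n_phones - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

if __name__ == "__main__":
    scores = [[0, -9], [0, -9], [-9, 0], [-9, 0]]
    print(forced_align(scores))
```

The frames where the returned index changes mark the phoneme boundaries, which is the information a pitch-based scoring stage would then use to segment the utterance.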

