Results 11 – 20 of 1,183
Learning to Construct Knowledge Bases from the World Wide Web
, 2000
Cited by 201 (4 self)
The World Wide Web is a vast source of information accessible to computers, but understandable only to humans. The goal of the research described here is to automatically create a computer-understandable knowledge base whose content mirrors that of the World Wide Web. Such a knowledge base would enable much more effective retrieval of Web information and promote new uses of the Web to support knowledge-based inference and problem solving. Our approach is to develop a trainable information extraction system that takes two inputs. The first is an ontology that defines the classes (e.g., company, person, employee, product) and relations (e.g., employed_by, produced_by) of interest when creating the knowledge base. The second is a set of training data consisting of labeled regions of hypertext that represent instances of these classes and relations. Given these inputs, the system learns to extract information from other pages and hyperlinks on the Web. This article describes our general a...
A Survey of Socially Interactive Robots
, 2002
Cited by 194 (24 self)
This paper reviews "socially interactive robots": robots for which social human-robot interaction is important. We begin by discussing the context for socially interactive robots, emphasizing the relationship to other research fields and the different forms of "social robots". We then present a taxonomy of design methods and system components used to build socially interactive robots. Finally, we describe the impact of these robots on humans and discuss open issues. An expanded version of this paper, which contains a survey and taxonomy of current applications, is available as a technical report [61].
Efficient Retrieval of Similar Time Sequences Under Time Warping
, 1997
Cited by 173 (3 self)
Fast similarity searching in large time-sequence databases has attracted a lot of research interest [1, 5, 2, 6, 3, 10]. All of these approaches use the Euclidean distance (L2), or some variation of Lp metrics. Lp metrics lead to efficient indexing, thanks to feature extraction (e.g., by keeping the first few DFT coefficients) and subsequent use of fast spatial access methods for the points in feature space. In this work we examine a popular, field-tested dissimilarity function, the "time warping" distance function, which permits local accelerations and decelerations in the rate of the signals or sequences. This function is natural and suitable for several applications, like matching of voice, audio, and medical signals (e.g., electrocardiograms). However, from the indexing viewpoint it presents two major challenges: (a) it does not lead to any natural "features", precluding the use of spatial access methods, and (b) it is quadratic (O(len1 · len2)) in the length of the sequences involved. Here we ...
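The quadratic O(len1 · len2) cost the abstract mentions comes from the dynamic-programming recurrence behind the time-warping distance. A minimal sketch follows; the absolute-difference local cost and the three-way step pattern are common textbook choices, not necessarily the exact variant used in the paper:

```python
def dtw_distance(a, b):
    """Dynamic-programming time-warping distance between two sequences.

    Runs in O(len(a) * len(b)) time. The local cost |a[i] - b[j]| is an
    illustrative choice; other cost functions fit the same recurrence.
    """
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = cost of the best warping path aligning a[:i] with b[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # local accelerations/decelerations: match, stretch, or shrink
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return D[n][m]
```

Because a sample may align with several samples of the other sequence, a locally slowed-down copy of a signal still has distance zero, which is exactly what the Euclidean distance fails to capture.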
Hidden Markov processes
 IEEE Trans. Inform. Theory
, 2002
Cited by 170 (3 self)
Abstract—An overview of statistical and information-theoretic aspects of hidden Markov processes (HMPs) is presented. An HMP is a discrete-time finite-state homogeneous Markov chain observed through a discrete-time memoryless invariant channel. In recent years, the work of Baum and Petrie on finite-state finite-alphabet HMPs was expanded to HMPs with finite as well as continuous state spaces and a general alphabet. In particular, statistical properties and ergodic theorems for relative entropy densities of HMPs were developed. Consistency and asymptotic normality of the maximum-likelihood (ML) parameter estimator were proved under some mild conditions. Similar results were established for switching autoregressive processes, which generalize HMPs. New algorithms were developed for estimating the state, parameter, and order of an HMP, for universal coding and classification of HMPs, and for universal decoding of hidden Markov channels. These and other related topics are reviewed in this paper. Index Terms—Baum–Petrie algorithm, entropy ergodic theorems, finite-state channels, hidden Markov models, identifiability, Kalman filter, maximum-likelihood (ML) estimation, order estimation, recursive parameter estimation, switching autoregressive processes, Ziv inequality.
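The definition "a Markov chain observed through a memoryless channel" translates directly into the standard forward recursion for an HMP's likelihood. A minimal finite-state, finite-alphabet sketch (the variable names are ours, not the paper's notation):

```python
def forward_likelihood(obs, pi, A, B):
    """P(obs) for an HMP: a Markov chain with initial distribution pi and
    transition matrix A, seen through a memoryless channel with emission
    probabilities B[state][symbol]."""
    n = len(pi)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [B[s][o] * sum(alpha[r] * A[r][s] for r in range(n))
                 for s in range(n)]
    return sum(alpha)
```

In practice one works with scaled or log-domain alphas to avoid numerical underflow on long observation sequences.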
A Stochastic Model of Human-Machine Interaction for Learning Dialog Strategies
 IEEE Transactions on Speech and Audio Processing
, 2000
Cited by 167 (4 self)
Abstract—In this paper, we propose a quantitative model for dialog systems that can be used for learning the dialog strategy. We claim that the problem of dialog design can be formalized as an optimization problem with an objective function reflecting different dialog dimensions relevant for a given application. We also show that any dialog system can be formally described as a sequential decision process in terms of its state space, action set, and strategy. With additional assumptions about the state transition probabilities and cost assignment, a dialog system can be mapped to a stochastic model known as a Markov decision process (MDP). A variety of data-driven algorithms for finding the optimal strategy (i.e., the one that optimizes the criterion) are available within the MDP framework, based on reinforcement learning. For an effective use of the available training data we propose a combination of supervised and reinforcement learning: the supervised learning is used to estimate a model of the user, i.e., the MDP parameters that quantify the user’s behavior. Then a reinforcement learning algorithm is used to estimate the optimal strategy while the system interacts with the simulated user. This approach is tested for learning the strategy in an air travel information system (ATIS) task. The experimental results we present in this paper show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups. Index Terms—Dialog systems, Markov decision process, reinforcement learning, sequential decision process, speech, spoken
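The "reinforcement learning against a simulated user" loop the abstract describes can be sketched with tabular Q-learning, one of the standard data-driven MDP algorithms. The two-action toy environment below (ask a clarification question vs. guess an answer) is entirely hypothetical, a stand-in for the paper's ATIS setup:

```python
import random

def q_learning(start, actions, step, episodes=500, alpha=0.2, gamma=0.95, eps=0.1):
    """Tabular Q-learning over dialog costs. step(state, action) plays one
    turn with the simulated user and returns (next_state, cost, done);
    the greedy policy minimizes expected discounted cost."""
    Q = {}
    for _ in range(episodes):
        s, done = start, False
        while not done:
            for a in actions:
                Q.setdefault((s, a), 0.0)
            a = (random.choice(actions) if random.random() < eps
                 else min(actions, key=lambda b: Q[(s, b)]))  # epsilon-greedy
            s2, cost, done = step(s, a)
            future = 0.0 if done else gamma * min(Q[(s2, b)] for b in actions)
            Q[(s, a)] += alpha * (cost + future - Q[(s, a)])
            s = s2
    return Q

# Hypothetical simulated user: asking costs one turn and completes the
# dialog; guessing ends it too but at a higher cost (a likely error).
def toy_user(state, action):
    return ("done", 1.0 if action == "ask" else 3.0, True)

random.seed(0)
Q = q_learning("need_info", ["ask", "guess"], toy_user)
```

The learned Q-values make "ask" the greedy choice in the initial state, illustrating how the optimal strategy emerges from interaction with the user model rather than from hand-crafted rules.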
Polynomial Splines and Their Tensor Products in Extended Linear Modeling
 Ann. Statist
, 1997
Cited by 142 (14 self)
ANOVA-type models are considered for a regression function or for the logarithm of a probability function, conditional probability function, density function, conditional density function, hazard function, conditional hazard function, or spectral density function. Polynomial splines are used to model the main effects, and their tensor products are used to model any interaction components that are included. In the special context of survival analysis, the baseline hazard function is modeled and nonproportionality is allowed. In general, the theory involves the L2 rate of convergence for the fitted model and its components. The methodology involves least squares and maximum likelihood estimation, stepwise addition of basis functions using Rao statistics, stepwise deletion using Wald statistics, and model selection using BIC, cross-validation, or an independent test set. Publicly available software, written in C and interfaced to S/SPLUS, is used to apply this methodology to...
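The "tensor products of polynomial splines for interaction components" idea can be sketched with a least-squares fit. The truncated-power basis used here is one common construction, not necessarily the one in the paper's software:

```python
import numpy as np

def spline_basis(x, knots, degree=3):
    """Univariate polynomial spline basis in truncated-power form: the
    monomials up to `degree` plus one truncated power per knot."""
    cols = [x ** d for d in range(degree + 1)]
    cols += [np.maximum(x - k, 0.0) ** degree for k in knots]
    return np.column_stack(cols)

def tensor_product_fit(x1, x2, y, knots1=(), knots2=()):
    """Least-squares fit of a two-way interaction surface using the tensor
    product of two univariate spline bases, mirroring the ANOVA-type
    decomposition described in the abstract."""
    B1 = spline_basis(np.asarray(x1, float), knots1)
    B2 = spline_basis(np.asarray(x2, float), knots2)
    # tensor product: every pairwise product of the univariate basis columns
    X = np.einsum("ij,ik->ijk", B1, B2).reshape(len(y), -1)
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, float), rcond=None)
    return X @ coef  # fitted values
```

Any surface in the span of the products (for example y = x1 · x2) is recovered exactly; the stepwise addition/deletion and BIC selection in the paper decide which of these tensor-product terms actually enter the model.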
Improving timbre similarity: How high is the sky?
 Results in Speech and Audio Sciences
Cited by 141 (14 self)
Abstract. We report on experiments done in an attempt to improve the performance of a music similarity measure which we introduced earlier. The technique aims at comparing music titles on the basis of their global “timbre”, which has many applications in the field of Music Information Retrieval. Such measures of timbre similarity have seen growing interest lately, and every contribution (including ours) is yet another instantiation of the same basic pattern recognition architecture, only with different algorithm variants and parameters. Most give encouraging results with little effort, and imply that near-perfect results would follow simply by fine-tuning the algorithms’ parameters. However, such systematic testing over large, interdependent parameter spaces is both difficult and costly, as it requires working on a whole general meta-database architecture. This paper contributes in two ways to the current state of the art. We report on extensive tests over very many parameters and algorithmic variants, whether already envisioned in the literature or not. This leads to an improvement over existing algorithms of about 15% R-precision. But most importantly, we describe many variants that surprisingly do not lead to any substantial improvement. Moreover, our simulations suggest the existence of a “glass ceiling” at an R-precision of about 65% which probably cannot be overcome by pursuing such variations on the same theme.
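The figure of merit behind the "glass ceiling at about 65%" claim is R-precision, which is quick to state in code (the variable names here are ours):

```python
def r_precision(ranked_ids, relevant_ids):
    """R-precision: the fraction of the top-R retrieved items that are
    relevant, where R is the total number of relevant items for the query."""
    R = len(relevant_ids)
    if R == 0:
        return 0.0
    hits = sum(1 for doc in ranked_ids[:R] if doc in relevant_ids)
    return hits / R
```

Unlike precision at a fixed cutoff, R-precision adapts the cutoff to each query's number of relevant items, which makes scores comparable across queries with very different amounts of ground truth.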
Continuous Probabilistic Transform for Voice Conversion
 IEEE Transactions on Speech and Audio Processing
, 1998
Cited by 129 (4 self)
Abstract—Voice conversion, as considered in this paper, is defined as modifying the speech signal of one speaker (source speaker) so that it sounds as if it had been pronounced by a different speaker (target speaker). Our contribution includes the design of a new methodology for representing the relationship between two sets of spectral envelopes. The proposed method is based on the use of a Gaussian mixture model of the source speaker's spectral envelopes. The conversion itself is represented by a continuous parametric function which takes into account the probabilistic classification provided by the mixture model. The parameters of the conversion function are estimated by least squares optimization on the training data. This conversion method is implemented in the context of the HNM (harmonic plus noise model) system, which allows high-quality modifications of speech signals. Compared to earlier methods based on vector quantization, the proposed conversion scheme results in a much better match between the converted envelopes and the target envelopes. Evaluation by objective tests and formal listening tests shows that the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.
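The "continuous parametric function that takes into account the probabilistic classification" can be sketched as a posterior-weighted sum of per-component affine maps. This scalar version is our simplification of the paper's vector formulation, with illustrative parameter names:

```python
import math

def gmm_posteriors(x, weights, means, variances):
    """P(component i | x) under a Gaussian mixture fitted to the source
    speaker's spectral parameters (scalar here for readability)."""
    dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    return [d / total for d in dens]

def convert(x, weights, means, variances, nu, gamma):
    """Continuous conversion: F(x) = sum_i p_i(x) * (nu_i + gamma_i * (x - mu_i)).
    The per-component targets nu_i and slopes gamma_i would be fitted by
    least squares on aligned source/target training frames."""
    post = gmm_posteriors(x, weights, means, variances)
    return sum(p * (n + g * (x - m))
               for p, n, g, m in zip(post, nu, gamma, means))
```

Because the posteriors vary smoothly with x, the mapping interpolates continuously between the per-component transforms, avoiding the hard class boundaries (and audible discontinuities) of vector-quantization-based conversion.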
Signal modeling techniques in speech recognition
 PROCEEDINGS OF THE IEEE
, 1993
Cited by 126 (5 self)
We have seen three important trends develop in the last five years in speech recognition. First, heterogeneous parameter sets that mix absolute spectral information with dynamic, or time-derivative, spectral information have become common. Second, similarity transform techniques, often used to normalize and decorrelate parameters in some computationally inexpensive way, have become popular. Third, the signal parameter estimation problem has merged with the speech recognition process so that more sophisticated statistical models of the signal’s spectrum can be estimated in a closed-loop manner. In this paper, we review the signal processing components of these algorithms. These algorithms are presented as part of a unified view of the signal parameterization problem in which there are three major tasks: measurement, transformation, and statistical modeling. This paper is by no means a comprehensive survey of all possible techniques of signal modeling in speech recognition. There are far too many algorithms in use today to make an exhaustive survey feasible (and cohesive). Instead, this paper is meant to serve as a tutorial on signal processing in state-of-the-art speech recognition systems and to review those techniques most commonly used. In keeping with this goal, a complete mathematical description of each algorithm has been included in the paper.
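The first trend, mixing absolute spectral parameters with time-derivative ("dynamic") ones, is usually implemented with a regression formula over a window of neighboring frames. A common recipe is sketched below; exact window sizes vary across recognizers:

```python
def delta_features(frames, K=2):
    """Append time-derivative parameters to each frame of static spectral
    features, using the regression estimate over a +/-K frame window
    (frame edges are padded by repeating the boundary frame)."""
    T, dim = len(frames), len(frames[0])
    denom = 2 * sum(k * k for k in range(1, K + 1))
    out = []
    for t in range(T):
        deltas = []
        for d in range(dim):
            num = sum(k * (frames[min(t + k, T - 1)][d] -
                           frames[max(t - k, 0)][d])
                      for k in range(1, K + 1))
            deltas.append(num / denom)
        out.append(frames[t] + deltas)  # static followed by dynamic
    return out
```

On a linear ramp the regression recovers the slope exactly, which is why these coefficients capture the local trajectory of the spectrum rather than its absolute level.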
Progressive Image Coding for Noisy Channels
 IEEE SIGNAL PROCESSING LETTERS
, 1997
Cited by 123 (9 self)
We cascade an existing image coder with carefully chosen error control coding, and thus produce a progressive image compression scheme whose performance on a noisy channel is significantly better than that of previously known techniques. The main idea is to trade off the available transmission rate between source coding and channel coding in an efficient manner. This coding system is easy to implement and has acceptably low complexity. Furthermore, effectively no degradation due to channel noise can be detected; instead, the penalty paid due to channel noise is a reduction in source coding resolution. Detailed numerical comparisons are given that can serve as benchmarks for comparisons with future encoding schemes. For example, for the 512 × 512 Lena image, at a transmission rate of 1 b/pixel, and for binary symmetric channels with bit error probabilities 10^-3, 10^-2, and 10^-1, the proposed system outperforms previously reported results by at least 2.6, 2.8, and 8.9 dB, respectively.
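The core trade-off, splitting a fixed transmission budget between source bits and error-control redundancy, can be illustrated with a toy expected-distortion calculation. Everything here (the exponential distortion model, the failure distortion of 1.0, the candidate list) is a stand-in, not the paper's actual rate-allocation machinery:

```python
def best_rate_split(total_rate, candidates):
    """Pick the source/channel rate split with the lowest expected
    distortion. Each candidate is (r, p_fail): a channel code of rate r
    leaves r * total_rate bits/pixel for source coding and fails to
    decode with probability p_fail. Distortion is modeled by the toy
    exponential law D(R) = 2**(-2 R), and a decoder failure is assumed
    to yield distortion 1.0."""
    def expected_distortion(r, p_fail):
        d_ok = 2.0 ** (-2.0 * r * total_rate)
        return (1 - p_fail) * d_ok + p_fail * 1.0
    return min(candidates, key=lambda c: expected_distortion(*c))
```

The illustrative numbers behave as the abstract argues: on a noisy channel a stronger (lower-rate) code wins despite carrying fewer source bits, while on a clean channel all of the budget should go to the source coder.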