Results 1 - 10 of 14
Discovering frequent arrangements of temporal intervals
- In Proceedings of the 5th IEEE International Conference on Data Mining (ICDM'05), 2005
Abstract - Cited by 25 (5 self)
In this paper we study a new problem in temporal pattern mining: discovering frequent arrangements of temporal intervals. We assume that the database consists of sequences of events, where an event occurs during a time interval. The goal is to mine arrangements of event intervals that appear frequently in the database. There are many applications where these types of patterns can be useful, including network, scientific, and financial applications. Efficient methods to find frequent arrangements of temporal intervals using both breadth-first and depth-first search techniques are described. The performance of the proposed algorithms is evaluated and compared with other approaches on real datasets (American Sign Language streams and network data) and large synthetic datasets.
The American Sign Language lexicon video dataset
- In IEEE Workshop on Computer Vision and Pattern Recognition for Human Communicative Behavior Analysis (CVPR4HB), 2008
Abstract - Cited by 16 (9 self)
The lack of a written representation for American Sign Language (ASL) makes it difficult to do something as commonplace as looking up an unknown word in a dictionary. The majority of printed dictionaries organize ASL signs (represented in drawings or pictures) based on their nearest English translation; so unless one already knows the meaning of a sign, dictionary look-up is not a simple proposition. In this paper we introduce the ASL Lexicon Video Dataset, a large and expanding public dataset containing video sequences of thousands of distinct ASL signs, as well as annotations of those sequences, including start/end frames and class label of every sign. This dataset is being created as part of a project to develop a computer vision system that allows users to look up the meaning of an ASL sign. At the same time, the dataset can be useful for benchmarking a variety of computer vision and machine learning methods designed for learning and/or indexing a large number of visual classes, and especially approaches for analyzing gestures and human communication.
Automatic Detection of Relevant Head Gestures in American Sign Language Communication
- In Proceedings of the International Conference on Pattern Recognition (ICPR), 2002
Abstract - Cited by 12 (1 self)
An automated system for detection of head movements is described. The goal is to label relevant head gestures in video of American Sign Language (ASL) communication. In the system, a 3D head tracker recovers head rotation and translation parameters from monocular video. Relevant head gestures are then detected by analyzing the length and frequency of the motion signal's peaks and valleys. Each parameter is analyzed independently, because a number of relevant head movements in ASL are associated with major changes around one rotational axis. No explicit training of the system is necessary. Currently, the system can detect "head shakes." In experimental evaluation, classification performance is compared against ground-truth labels obtained from ASL linguists. Initial results are promising, as the system matches the linguists' labels in a significant number of cases.
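The peak-and-valley analysis described here can be sketched in a few lines. The following Python is a hedged illustration, not the paper's detector: it finds alternating extrema of sufficient amplitude in a single rotation signal (e.g., yaw for a "head shake") and flags the window when enough alternations occur. The thresholds are made up for the example.

```python
def find_extrema(signal, min_amp):
    """Local peaks and valleys whose height relative to the signal mean
    exceeds min_amp; returns a list of ("peak"|"valley", index) tuples."""
    mean = sum(signal) / len(signal)
    ext = []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] > signal[i + 1] and signal[i] - mean >= min_amp:
            ext.append(("peak", i))
        elif signal[i - 1] > signal[i] < signal[i + 1] and mean - signal[i] >= min_amp:
            ext.append(("valley", i))
    return ext

def is_head_shake(yaw, min_amp=2.0, min_alternations=3):
    """Heuristic: a head shake shows several alternating peaks and
    valleys in the yaw (left-right rotation) signal."""
    ext = find_extrema(yaw, min_amp)
    alternations = sum(1 for a, b in zip(ext, ext[1:]) if a[0] != b[0])
    return alternations >= min_alternations
```

A synthetic oscillating yaw signal such as `[0, 5, 0, -5, 0, 5, 0, -5, 0]` is flagged, while a flat signal is not. Analyzing each rotation axis independently, as the abstract notes, lets the same routine be reused per parameter.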
‘Can you see me now?’ An objective metric for predicting intelligibility of compressed American Sign Language video
- In Proc. SPIE Vol. 6492, Human Vision and Electronic Imaging, 2007
Abstract - Cited by 10 (3 self)
For members of the Deaf Community in the United States, current communication tools include TTY/TTD services, video relay services, and text-based communication. With the growth of cellular technology, mobile sign language conversations are becoming a possibility. Proper coding techniques must be employed to compress American Sign Language (ASL) video for low-rate transmission while maintaining the quality of the conversation. In order to evaluate these techniques, an appropriate quality metric is needed. This paper demonstrates that traditional video quality metrics, such as PSNR, fail to predict subjective intelligibility scores. By considering the unique structure of ASL video, an appropriate objective metric is developed. Face and hand segmentation is performed using skin-color detection techniques. The distortions in the face and hand regions are optimally weighted to create an objective intelligibility score for a distorted sequence. The objective intelligibility metric performs significantly better than PSNR in terms of correlation with subjective responses.
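The core idea of the metric can be sketched as a region-weighted distortion score. The sketch below is an assumption-laden illustration: the weights are placeholders, not the optimally fitted values from the paper, and the masks are taken as given rather than produced by skin-color segmentation.

```python
import numpy as np

def weighted_intelligibility(ref, dist, face_mask, hand_mask,
                             w_face=0.6, w_hand=0.3, w_bg=0.1):
    """Region-weighted distortion score for one frame.
    ref/dist: grayscale arrays; masks: boolean arrays for face/hand pixels.
    Weights are illustrative, not the paper's optimized values."""
    err = (ref.astype(float) - dist.astype(float)) ** 2
    bg_mask = ~(face_mask | hand_mask)

    def region_mse(mask):
        return err[mask].mean() if mask.any() else 0.0

    wmse = (w_face * region_mse(face_mask)
            + w_hand * region_mse(hand_mask)
            + w_bg * region_mse(bg_mask))
    # Express as a PSNR-like dB score (higher = more intelligible)
    return 10 * np.log10(255.0 ** 2 / wmse) if wmse > 0 else float("inf")
```

Unlike plain PSNR, distortion confined to the background barely moves this score, while the same distortion on the face or hands lowers it sharply, which is the behavior the abstract argues correlates with subjective intelligibility.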
A human-computer interface using symmetry between eyes to detect gaze direction
- IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2008
Abstract - Cited by 9 (3 self)
In the cases of paralysis so severe that a person’s ability to control movement is limited to the muscles around the eyes, eye movements or blinks are the only way for the person to communicate. Interfaces that assist in such communication are often intrusive, require special hardware, or rely on active infrared illumination. A nonintrusive communication interface system called EyeKeys was therefore developed, which runs on a consumer-grade computer with video input from an inexpensive Universal Serial Bus camera and works without special lighting. The system detects and tracks the person’s face using multiscale template correlation. The symmetry between left and right eyes is exploited to detect if the person is looking at the camera or to the left or right side. The detected eye direction can then be used to control applications such as spelling programs or games. The game “BlockEscape” was developed to evaluate the performance of EyeKeys and compare it to a mouse substitution interface. Experiments with EyeKeys have shown that it is an easily used computer input and control device for able-bodied people and has the potential to become a practical tool for people with severe paralysis.
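The symmetry cue mentioned in the abstract can be sketched compactly: if the user looks at the camera, the left-eye patch mirrored horizontally should closely resemble the right-eye patch. The following Python is a minimal illustration of that one cue under assumed inputs (pre-cropped, equal-size grayscale patches and a made-up threshold); the full EyeKeys system also distinguishes left from right gaze, which this sketch does not.

```python
import numpy as np

def gaze_from_symmetry(left_eye, right_eye, threshold=0.8):
    """Return 'center' when the mirrored left-eye patch correlates
    strongly with the right-eye patch, else 'side'.
    Patches: equal-size grayscale arrays; threshold is illustrative."""
    mirrored = np.fliplr(left_eye).astype(float)
    right = right_eye.astype(float)
    a = mirrored - mirrored.mean()
    b = right - right.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    if denom == 0:
        return "center"  # featureless patches: no evidence of asymmetry
    corr = (a * b).sum() / denom
    return "center" if corr >= threshold else "side"
```

A patch compared against its own mirror image scores as 'center', while a structurally different right-eye patch scores as 'side'; in the real system this decision would then drive the spelling program or game input.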
Exploiting Phonological Constraints for Handshape Inference in ASL Video
Abstract - Cited by 5 (0 self)
Handshape is a key linguistic component of signs, and thus, handshape recognition is essential to algorithms for sign language recognition and retrieval. In this work, linguistic constraints on the relationship between start and end handshapes are leveraged to improve handshape recognition accuracy. A Bayesian network formulation is proposed for learning and exploiting these constraints, while taking into consideration inter-signer variations in the production of particular handshapes. A Variational Bayes formulation is employed for supervised learning of the model parameters. A non-rigid image alignment algorithm, which yields improved robustness to variability in handshape appearance, is proposed for computing image observation likelihoods in the model. The resulting handshape inference algorithm is evaluated using a dataset of 1500 lexical signs in American Sign Language (ASL), where each lexical sign is produced by three native ASL signers.
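The paper's model is a Bayesian network trained with Variational Bayes; as a much simpler stand-in for the core idea, the sketch below rescores per-frame start/end handshape hypotheses with a start-to-end co-occurrence prior. All scores, labels, and the prior table here are invented for illustration.

```python
import math

def rescore(start_scores, end_scores, prior):
    """Pick the jointly most likely (start, end) handshape pair by
    combining per-frame likelihoods with a co-occurrence prior
    P(end | start). A tiny stand-in for full Bayesian-network inference."""
    best, best_score = None, float("-inf")
    for s, ps in start_scores.items():
        for e, pe in end_scores.items():
            p = prior.get((s, e), 1e-6)  # smooth unseen pairs
            score = math.log(ps) + math.log(pe) + math.log(p)
            if score > best_score:
                best, best_score = (s, e), score
    return best
```

The effect the abstract describes falls out directly: an end handshape that is individually plausible but linguistically incompatible with the detected start handshape is down-weighted by the prior and loses to a compatible pair.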
Mining Frequent Arrangements of Temporal Intervals
- Under consideration for publication in Knowledge and Information Systems, 2008
Abstract - Cited by 4 (2 self)
The problem of discovering frequent arrangements of temporal intervals is studied. It is assumed that the database consists of sequences of events, where an event occurs during a time interval. The goal is to mine temporal arrangements of event intervals that appear frequently in the database. The motivation of this work is the observation that in practice most events are not instantaneous but occur over a period of time, and different events may occur concurrently. Thus, there are many practical applications that require mining such temporal correlations between intervals, including the linguistic analysis of annotated data from American Sign Language as well as network and biological data. Three efficient methods to find frequent arrangements of temporal intervals are described; the first two are tree-based and use breadth-first and depth-first search to mine the set of frequent arrangements, whereas the third one is prefix-based. The above methods apply efficient pruning techniques that include a set of constraints that add user-controlled focus into the mining process. Moreover, based on the extracted patterns, a standard method for mining association rules is employed that applies different interestingness measures to evaluate the significance of the discovered patterns.
Motion Mining
- In Proceedings of the 2nd Int'l Workshop on Multimedia Databases and Image Communication (MDIC'01), 2001
Abstract - Cited by 1 (0 self)
A long-term research effort to support data mining applications for video databases of human motion is described. Due to the spatio-temporal nature of human motion data, novel methods for indexing and mining databases of time-series data of human motion are required. Further, since data mining requires a significant sample size to accurately model patterns in the data, algorithms that automatically extract motion trajectories and time-series data from video are required. A preliminary system for estimating human motion in video, as well as for indexing and data mining of the resulting motion databases, is described.
Modeling Signs using Functional Data Analysis
Abstract
We present a functional data analysis (FDA) based method to statistically model continuous signs of American Sign Language (ASL) for use in the recognition of signs in continuous sentences. We build models in the Space of Probability Functions (SoPF) that capture the evolution of the relationships among the low-level features (e.g., edge pixels) in each frame. The distribution (histogram) of the horizontal and vertical displacements between all pairs of edge pixels in an image frame forms the relational distribution. We represent the sequence of relational distributions, corresponding to the sequence of image frames in a sign, as a sequence of points in a multi-dimensional space, capturing the salient variations in these relational distributions over time; we call this space the SoPF. Each sign model consists of a mean sign function and covariance functions, capturing the variability of each sign in the training set. We use functional data analysis to arrive at this model. Recognition and sign localization are performed by correlating this statistical model with any given sentence. We also present a method to infer and learn sign models, in an unsupervised manner, from sentence samples containing the sign; there is no need for manual intervention.
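The relational distribution at the heart of this abstract is concrete enough to sketch: a normalized 2-D histogram of displacements between all pairs of edge pixels in a frame. The function below is an illustration only; the bin count, displacement cap, and use of absolute displacements are assumptions, not the paper's exact parameterization.

```python
import numpy as np
from itertools import combinations

def relational_distribution(edge_pixels, bins=8, max_disp=32):
    """2-D histogram of |dx|, |dy| displacements between all pairs of
    edge pixels in one frame, normalized to sum to 1. One such
    distribution per frame yields the trajectory modeled in the SoPF."""
    hist = np.zeros((bins, bins))
    for (x1, y1), (x2, y2) in combinations(edge_pixels, 2):
        dx, dy = abs(x1 - x2), abs(y1 - y2)
        if dx < max_disp and dy < max_disp:
            hist[int(dx * bins / max_disp), int(dy * bins / max_disp)] += 1
    total = hist.sum()
    return hist / total if total else hist
```

Computing this histogram for every frame of a sign gives the sequence of points whose mean function and covariance functions the FDA model then summarizes.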
Graduate Group Chairperson
, 2005
Abstract
In many ways, writing the Acknowledgments section of a dissertation is almost harder than the work itself. So many people contribute in so many ways that it is usually hard to keep track of it all. This is not true in my case; the debts I owe to others for helping me bring this thesis to completion are very clear to me. There were those in the beginning who helped me formulate and shape my ideas; there were those in the middle who helped me design FORM and who performed the Herculean task of gathering and annotating the data that is the meat of what follows; and, finally, there were those who served to sharpen the writing and presentation of the work to make it much better than it was. To all of these, let me say that my thanks here does justice neither to your contribution nor to the depth of my appreciation. However, I hope it will serve. This thesis would not have been the same without you. All of what is good in it has your stamp on it, and that which is not so good remains because I simply did not listen well enough. Firstly, I would like to acknowledge my parents. In very different ways, they both served to make me the curious person I am today.