Results 1 - 5 of 5
Unsupervised topic modelling for multi-party spoken discourse
- In COLING-ACL 2006, 2006
- Cited by 42 (8 self)
We present a method for unsupervised topic modelling which adapts methods used in document classification (Blei et al., 2003; Griffiths and Steyvers, 2004) to unsegmented multi-party discourse transcripts. We show how Bayesian inference in this generative model can be used to simultaneously address the problems of topic segmentation and topic identification: automatically segmenting multi-party meetings into topically coherent segments with performance which compares well with previous unsupervised segmentation-only methods (Galley et al., 2003) while simultaneously extracting topics which rate highly when assessed for coherence by human judges. We also show that this method appears robust in the face of off-topic dialogue and speech recognition errors.
Multimodal integration for meeting group action segmentation and recognition
- In Proc. Workshop on Machine Learning for Multimodal Interaction (MLMI), 2005
- Cited by 15 (7 self)
We address the problem of segmentation and recognition of sequences of multimodal human interactions in meetings. These interactions can be seen as a rough structure of a meeting, and can be used either as input for a meeting browser or as a first step towards a higher semantic analysis of the meeting. A common lexicon of multimodal group meeting actions, a shared meeting data set, and a common evaluation procedure enable us to compare the different approaches. We compare three different multimodal feature sets and four modelling infrastructures: a higher semantic feature approach, multi-layer HMMs, a multi-stream DBN, as well as a multi-stream mixed-state DBN for disturbed data.
A multimodal discourse ontology for meeting understanding
- Machine Learning for Multimodal Interaction: 2nd International Workshop, MLMI 2005, 2006
- Cited by 11 (3 self)
In this paper, we present a multimodal discourse ontology that serves as a knowledge representation and annotation framework for the discourse understanding component of an artificial personal office assistant. The ontology models components of natural language, multimodal communication, multi-party dialogue structure, meeting structure, and the physical and temporal aspects of human communication. We compare our models to those from the research literature and from similar applications. We also highlight some algorithms that are used to perform automatic processing and understanding using these models and suggest elements of the ontology that may be of immediate interest to meeting annotation by human or automated means.
Writer identification for smart meeting room systems
- In Proc. 7th IAPR Workshop on Document Analysis Systems, volume 3872 of LNCS, 2006
- Cited by 3 (2 self)
In this paper we present a text-independent on-line writer identification system based on Gaussian Mixture Models (GMMs). This system has been developed in the context of research on Smart Meeting Rooms. The GMMs in our system are trained using two sets of features extracted from a text line. The first feature set is similar to feature sets previously used in signature verification systems. It consists of information gathered for each recorded point of the handwriting, while the second feature set contains features extracted from each stroke. While both feature sets perform very favorably, the stroke-based feature set outperforms the point-based feature set in our experiments. We achieve a writer identification rate of 100% for writer sets with up to 100 writers. Increasing the number of writers to 200, the identification rate decreases to 94.75%.
General Examination
People spend many hours in meetings during their working lives. The growing need for help in keeping records in meetings and searching through them has been recognized, and several groups around the world are working on a meeting browser or a summarization tool. In this research, we propose the development of a classification system that uses machine learning techniques to segment and detect meeting acts, which are high-level interactions among meeting participants as a group (e.g. negotiation, reporting, discussion, planning). As in other data-driven tasks, this requires a large amount of data, but labeling data can be costly, time-consuming and error-prone. To address this problem, semi-supervised learning techniques are often applied, in which a small amount of data is labeled and used, together with a large body of unlabeled data, to train a classifier. In this study, we propose to use and extend a novel semi-supervised learning algorithm, the contrast classifier approach, which exploits the contrast between the distributions of labeled and unlabeled data. We will also present our research plan to investigate the impact of different labeling mechanisms on the performance of existing and proposed semi-supervised learning techniques, especially in the presence of imbalanced class distribution.

