Mark My Words! Linguistic Style Accommodation in Social Media
The psycholinguistic theory of communication accommodation accounts for the general observation that participants in conversations tend to converge to one another’s communicative behavior: they coordinate in a variety of dimensions including choice of words, syntax, utterance length, pitch and gestures. In its almost forty years of existence, this theory has been empirically supported exclusively through small-scale or controlled laboratory studies. Here we address this phenomenon in the context of Twitter conversations. Undoubtedly, this setting is unlike any other in which accommodation was observed and, thus, challenging to the theory. Its novelty comes not only from its size, but also from the non-real-time nature of conversations, from the 140-character length restriction, from the wide variety of social relation types, and from a design that was initially not geared towards conversation at all. It is therefore not clear a priori whether accommodation is robust enough to occur under the constraints of this new environment. To investigate this, we develop a probabilistic framework that can model accommodation and measure its effects. We apply it to a large Twitter conversational dataset specifically developed for this task. This is the first time the hypothesis of linguistic style accommodation has been examined (and verified) in a large-scale, real-world setting. Furthermore, when investigating concepts such as stylistic influence and symmetry of accommodation, we discover a complexity of the phenomenon which was never observed before. We also explore the potential relation between stylistic influence and network features commonly associated with social status.
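One way to make "measuring accommodation" concrete is to compare how often a reply shows a stylistic marker after a tweet that shows it against how often replies show it overall. The sketch below is an illustration under that assumption, not the framework the abstract refers to; the function name, the marker labels and the toy data are all hypothetical.

```python
def accommodation(pairs, marker):
    """Estimate P(reply has marker | original has marker) - P(reply has marker).

    `pairs` is a list of (original_markers, reply_markers) tuples, each a set
    of stylistic marker labels. This definition is an assumption made for
    illustration; a positive value suggests convergence on the marker.
    """
    if not pairs:
        return None
    # Baseline: how often any reply exhibits the marker.
    baseline = sum(marker in reply for _, reply in pairs) / len(pairs)
    # Conditional: the same rate, restricted to replies whose original
    # tweet already exhibited the marker.
    triggered = [marker in reply for original, reply in pairs if marker in original]
    if not triggered:
        return None
    conditional = sum(triggered) / len(triggered)
    return conditional - baseline


# Hypothetical toy data: four (original, reply) exchanges.
toy_pairs = [
    ({"article"}, {"article"}),
    ({"article"}, {"article"}),
    (set(), set()),
    (set(), {"article"}),
]
score = accommodation(toy_pairs, "article")
```

Here the conditional rate is 2/2 and the baseline is 3/4, so the toy score is 0.25.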
Toward Learning and Evaluation of Dialogue Policies with Text Examples
We present a dialogue collection and enrichment framework that is designed to explore the learning and evaluation of dialogue policies for simple conversational characters using textual training data. To facilitate learning and evaluation, our framework enriches a collection of role-play dialogues with additional training data, including paraphrases of user utterances, and multiple independent judgments by external referees about the best policy response for the character at each point. As a case study, we use this framework to train a policy for a limited-domain tactical questioning character, reaching promising performance. We also introduce an automatic policy evaluation metric that recognizes the validity of multiple conversational responses at each point in a dialogue. We use this metric to explore the variability in human opinion about optimal policy decisions, and to automatically evaluate several learned policies in our example domain.
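A metric that accepts several valid responses per dialogue point can be sketched as the fraction of points where the policy's choice matches at least one referee's choice. This is an assumed, simplified form for illustration, not necessarily the metric the abstract introduces; the response labels and data are hypothetical.

```python
def multi_reference_agreement(policy_choices, referee_choices):
    """Fraction of dialogue points where the policy's response is among the
    responses chosen by any referee at that point.

    `policy_choices` is a list of response labels, one per dialogue point.
    `referee_choices` is a list of sets, one per point, holding every
    response that at least one referee judged best.
    """
    if len(policy_choices) != len(referee_choices):
        raise ValueError("one referee set is required per dialogue point")
    if not policy_choices:
        return 0.0
    hits = sum(
        1 for choice, accepted in zip(policy_choices, referee_choices)
        if choice in accepted
    )
    return hits / len(policy_choices)


# Hypothetical three-point dialogue.
policy = ["greet", "refuse", "answer"]
referees = [{"greet"}, {"answer", "offer"}, {"answer"}]
agreement = multi_reference_agreement(policy, referees)
```

The policy matches a referee at the first and third points only, so the toy agreement is 2/3.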
Short Text Conceptualization Using a Probabilistic Knowledgebase
In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
Most text mining tasks, including clustering and topic detection, are based on statistical methods that treat text as bags of words. Semantics in the text is largely ignored in the mining process, and mining results often have low interpretability. One particular challenge faced by such approaches lies in short text understanding, as short texts lack enough content from which statistical conclusions can be drawn easily. In this paper, we improve text understanding by using a probabilistic knowledgebase that is as rich as our mental world in terms of the concepts (of worldly facts) it contains. We then develop a Bayesian inference mechanism to conceptualize words and short text. We conducted comprehensive experiments on conceptualizing textual terms, and clustering short pieces of text such as Twitter messages. Compared to purely statistical methods such as latent semantic topic modeling or methods that use existing knowledgebases (e.g., WordNet, Freebase and Wikipedia), our approach brings significant improvements in short text understanding as reflected by the clustering accuracy.
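To illustrate what Bayesian conceptualization of a short text can look like, the sketch below scores each candidate concept with a naive Bayes rule, P(concept | terms) ∝ P(concept) · ∏ P(term | concept). The independence assumption, the smoothing constant and all probabilities are hypothetical choices for illustration, not the paper's knowledgebase or inference mechanism.

```python
from math import prod


def conceptualize(terms, prior, likelihood, smoothing=1e-6):
    """Return a normalized posterior over concepts for a bag of terms.

    `prior[c]` is P(c); `likelihood[c][t]` is P(t | c). Terms unseen under a
    concept receive the small `smoothing` probability instead of zero.
    """
    scores = {
        concept: prior[concept]
        * prod(likelihood[concept].get(term, smoothing) for term in terms)
        for concept in prior
    }
    total = sum(scores.values())
    return {concept: score / total for concept, score in scores.items()}


# Hypothetical toy knowledgebase with two concepts.
prior = {"company": 0.5, "fruit": 0.5}
likelihood = {
    "company": {"apple": 0.3, "microsoft": 0.4},
    "fruit": {"apple": 0.4, "banana": 0.4},
}

alone = conceptualize(["apple"], prior, likelihood)
in_context = conceptualize(["apple", "microsoft"], prior, likelihood)
```

With these toy numbers, "apple" alone leans toward `fruit`, while "apple" together with "microsoft" leans strongly toward `company`, which is the kind of disambiguation that context in a short text is meant to provide.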
Unsupervised Modeling of Dialog Acts in Asynchronous Conversations
In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence
We present unsupervised approaches to the problem of modeling dialog acts in asynchronous conversations; i.e., conversations where participants collaborate with each other at different times. In particular, we investigate a graph-theoretic deterministic framework and two probabilistic conversation models (i.e., HMM and HMM+Mix) for modeling dialog acts in emails and forums. We train and test our conversation models on (a) temporal order and (b) graph-structural order of the datasets. Empirical evaluation suggests (i) the graph-theoretic framework that relies on lexical and structural similarity metrics is not the right model for this task, (ii) conversation models perform better on the graph-structural order than the temporal order of the datasets and (iii) HMM+Mix is a better conversation model than the simple HMM model.
Building a Conversational Model from Two-Tweets
The current problem in building a conversational model from Twitter data is the scarcity of long conversations. According to our statistics, more than 90% of conversations in Twitter are composed of just two tweets. Previous work has utilized only conversations lasting longer than three tweets for dialogue modeling so that more than a single interaction can be successfully modeled. This paper verifies, by experiment, that two-tweet exchanges alone can lead to conversational models that are comparable to those built from longer conversations. This finding increases the value of Twitter as a dialogue corpus and opens the possibility of better conversational modeling using Twitter data.
The Imagination of Crowds: Conversational AAC Language Modeling using Crowdsourcing and Large Data Sources
Augmented and alternative communication (AAC) devices enable users with certain communication disabilities to participate in everyday conversations. Such devices often rely on statistical language models to improve text entry by offering word predictions. These predictions can be improved if the language model is trained on data that closely reflects the style of the users' intended communications. Unfortunately, there is no large dataset consisting of genuine AAC messages. In this paper we demonstrate how we can crowdsource the creation of a large set of fictional AAC messages. We show that these messages model conversational AAC better than the currently used datasets based on telephone conversations or newswire text. We leverage our crowdsourced messages to intelligently select sentences from much larger sets of Twitter, blog and Usenet data. Compared to a model trained only on telephone transcripts, our best performing model reduced perplexity on three test sets of AAC-like communications by 60–82% relative. This translated to a potential keystroke savings in a predictive keyboard interface of 5–11%.
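For readers unfamiliar with the phrase "reduced perplexity by N% relative", the sketch below shows the standard arithmetic: perplexity is 2 raised to the negative mean log2 probability per token, and a relative reduction is the drop divided by the baseline. The numbers used are hypothetical and are not taken from the paper.

```python
def perplexity(log2_probs):
    """Perplexity from per-token log2 probabilities assigned by a model."""
    if not log2_probs:
        raise ValueError("at least one token probability is required")
    return 2 ** (-sum(log2_probs) / len(log2_probs))


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical example: every token is assigned probability 1/2.
toy_ppl = perplexity([-1.0, -1.0, -1.0])

# Hypothetical perplexities for a baseline and an improved model.
toy_gain = relative_reduction(200.0, 80.0)
```

In this toy case the perplexity is 2.0, and moving from 200 to 80 is a 60% relative reduction.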
Actions Speak as Loud as Words: Predicting Relationships from Social Behavior Data
In recent years, new studies concentrating on analyzing user personality and finding credible content in social media have become quite popular. Most such work augments features from textual content with features representing the user’s social ties and the tie strength. Social ties are crucial in understanding the network that people are part of. However, textual content is extremely useful in understanding topics discussed and the personality of the individual. We bring a new dimension to this type of analysis with methods to compute the type of ties individuals have and the strength of the ties in each dimension. We present a new genre of behavioral features that are able to capture the “function” of a specific relationship without the help of textual features. Our novel features are based on the statistical properties of communication patterns between individuals such as reciprocity, assortativity, attention and latency. We introduce a new methodology for determining how such features can be compared to textual features, and show, using Twitter data, that our features can be used to capture contextual information present in textual features very accurately. Conversely, we also demonstrate how textual features can be used to determine social attributes related to an individual.
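As an illustration of a behavioral feature that needs no text, the sketch below computes a simple reciprocity score from a log of directed messages: the share of sender-to-recipient links that are also answered in the opposite direction. This definition and the toy log are assumptions made for illustration; the paper's own feature definitions may differ.

```python
def reciprocity(messages):
    """Share of distinct directed links (sender, recipient) whose reverse
    link also appears in the log. Self-addressed messages are ignored.
    """
    links = {(sender, recipient) for sender, recipient in messages if sender != recipient}
    if not links:
        return 0.0
    mutual = sum(1 for sender, recipient in links if (recipient, sender) in links)
    return mutual / len(links)


# Hypothetical message log: a and b talk to each other, c never answers a.
toy_log = [("a", "b"), ("b", "a"), ("a", "b"), ("a", "c")]
toy_reciprocity = reciprocity(toy_log)
```

The toy log has three distinct links, two of which are mutual, giving a reciprocity of 2/3.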

