An analysis of the user occupational class through Twitter content
Cited by 6 (2 self)
Social media content can be used as a complementary source to the traditional methods for extracting and studying collective social attributes. This study focuses on the prediction of the occupational class for a public user profile. Our analysis is conducted on a new annotated corpus of Twitter users, their respective job titles, posted textual content and platform-related attributes. We frame our task as classification using latent feature representations such as word clusters and embeddings. The employed linear and, especially, non-linear methods can predict a user’s occupational class with strong accuracy for the coarsest level of a standard occupation taxonomy which includes nine classes. Combined with a qualitative assessment, the derived results confirm the feasibility of our approach in inferring a new user attribute that can be embedded in a multitude of downstream applications.
The Role of Personality, Age and Gender in Tweeting about Mental Illnesses
In Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, NAACL, 2015
Cited by 4 (1 self)
Mental illnesses, such as depression and post traumatic stress disorder (PTSD), are highly underdiagnosed globally. Populations sharing similar demographics and personality traits are known to be more at risk than others. In this study, we characterise the language use of users disclosing their mental illness on Twitter. Language-derived personality and demographic estimates show surprisingly strong performance in distinguishing users that tweet a diagnosis of depression or PTSD from random controls, reaching an area under the receiver-operating characteristic curve – AUC – of around .8 in all our binary classification tasks. In fact, when distinguishing users disclosing depression from those disclosing PTSD, the single feature of estimated age shows nearly as strong performance (AUC = .806) as using thousands of topics (AUC = .819) or tens of thousands of n-grams (AUC = .812). We also find that differential language analyses, controlled for demographics, recover many symptoms associated with the mental illnesses in the clinical literature.
Mental Illness Detection at the World Well-Being Project for the CLPsych 2015 Shared Task
Cited by 1 (1 self)
This article is a system description and report on the submission of the World Well-Being Project from the University of Pennsylvania in the ‘CLPsych 2015’ shared task. The goal of the shared task was to automatically determine Twitter users who self-reported having one of two mental illnesses: post traumatic stress disorder (PTSD) and depression. Our system employs user metadata and textual features derived from Twitter posts. To reduce the feature space and avoid data sparsity, we consider several word clustering approaches. We explore the use of linear classifiers based on different feature sets as well as their combination using a linear ensemble. This method is agnostic of illness-specific features, such as lists of medicines, thus making it readily applicable in other scenarios. Our approach ranked second in all tasks on average precision and showed best results at .1 false positive rates.
Discovering User Attribute Stylistic Differences via Paraphrasing
User attribute prediction from social media text has proven successful and useful for downstream tasks. In previous studies, differences in user trait language use have been limited primarily to the presence or absence of words that indicate topical preferences. In this study, we aim to find linguistic style distinctions across three different user attributes: gender, age and occupational class. By combining paraphrases with a simple yet effective method, we capture a wide set of stylistic differences that are exempt from topic bias. We show their predictive power in user profiling, conformity with human perception and psycholinguistic hypotheses, and potential use in generating natural language tailored to specific user traits.
Inferring the Socioeconomic Status of Social Media Users based on Behaviour and Language
This paper presents a method to classify social media users based on their socioeconomic status. Our experiments are conducted on a curated set of Twitter profiles, where each user is represented by the posted text, topics of discussion, interactive behaviour and estimated impact on the microblogging platform. Initially, we formulate a 3-way classification task, where users are classified as having an upper, middle or lower socioeconomic status. A nonlinear, generative learning approach using a composite Gaussian Process kernel provides significantly better classification accuracy (75%) than a competitive linear alternative. By turning this task into a binary classification – upper vs. medium and lower class – the proposed classifier reaches an accuracy of 82%.
Classifying Tweet Level Judgements of Rumours in Social Media
Social media is a rich source of rumours and corresponding community reactions. Rumours reflect different characteristics, some shared and some individual. We formulate the problem of classifying tweet level judgements of rumours as a supervised learning task. Both supervised and unsupervised domain adaptation are considered, in which tweets from a rumour are classified on the basis of other annotated rumours. We demonstrate how multi-task learning helps achieve good results on rumours from the 2011 England riots.