ReGroup: Interactive Machine Learning for On-Demand Group Creation in Social Networks
To appear in Proceedings of CHI 2012, 2012
Cited by 2 (0 self)
Abstract
We present ReGroup, a novel end-user interactive machine learning system for helping people create custom, on-demand groups in online social networks. As a person adds members to a group, ReGroup iteratively learns a probabilistic model of group membership specific to that group. ReGroup then uses its currently learned model to suggest additional members and group characteristics for filtering. Our evaluation shows that ReGroup is effective for helping people create large and varied groups, whereas traditional methods (searching by name or selecting from an alphabetical list) are better suited for small groups whose members can be easily recalled by name. By facilitating on-demand group creation, ReGroup can enable in-context sharing and potentially encourage better online privacy practices. In addition, applying interactive machine learning to social network group creation introduces several challenges for designing effective end-user interaction with machine learning. We identify these challenges and discuss how we address them in ReGroup.
Author Keywords: Interactive machine learning, social network group creation, access control lists, example and feature-based interaction.
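The abstract does not say which probabilistic model ReGroup learns, so the following is only an illustrative sketch of the suggest-as-you-add loop, not the authors' implementation. It assumes a naive Bayes-style log-odds score over categorical profile features, and it treats every friend not yet selected as a non-member. The function name `suggest_members`, the feature schema, and the sample friends are all hypothetical.

```python
import math

def suggest_members(friends, selected, top_k=3):
    """Rank unselected friends by a naive Bayes-style log-odds score.

    friends:  dict name -> dict of categorical features (hypothetical schema).
    selected: set of names already added to the group.
    Assumption: everyone not yet selected counts as a non-member.
    """
    features = sorted({f for attrs in friends.values() for f in attrs})
    inside = [friends[n] for n in friends if n in selected]
    outside = [friends[n] for n in friends if n not in selected]

    def smoothed(rows, feature, value, n_values):
        # Laplace smoothing keeps unseen values from zeroing out the score.
        hits = sum(1 for r in rows if r.get(feature) == value)
        return (hits + 1) / (len(rows) + n_values)

    scores = {}
    for name, attrs in friends.items():
        if name in selected:
            continue
        score = 0.0
        for f in features:
            n_values = len({a.get(f) for a in friends.values()})
            p_in = smoothed(inside, f, attrs.get(f), n_values)
            p_out = smoothed(outside, f, attrs.get(f), n_values)
            score += math.log(p_in / p_out)
        scores[name] = score
    # Highest log-odds first: the most group-like candidates.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

friends = {
    "Ana": {"city": "Seattle", "employer": "UW"},
    "Ben": {"city": "Seattle", "employer": "UW"},
    "Cal": {"city": "Seattle", "employer": "UW"},
    "Dee": {"city": "Boston", "employer": "Acme"},
    "Eli": {"city": "Boston", "employer": "UW"},
}
print(suggest_members(friends, {"Ana", "Ben"}))  # ['Cal', 'Eli', 'Dee']
```

Calling the function again after each new selection mimics the iterative relearning the abstract describes: the ranking shifts as the selected set grows.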
Research Statement
Abstract
I am motivated by the prospect of computers that learn, by interacting and collaborating with humans, how to solve problems. Such systems might take several forms. For example, imagine you are an entrepreneur and you want to train a computer to help you analyze what people are saying about your products. You have domain knowledge about your business and the decisions you want the system to make, such as identifying positive vs. negative product reviews across the Internet. You might want to initialize the system with your background knowledge (e.g., the words “wonderful” and “terrible” indicate high and low customer satisfaction, respectively), inspect substantial amounts of relevant text data, and then allow it to ask questions to help refine its understanding of your goals (e.g., is “predictable” a positive word for your product? The answer may depend on whether you make kitchen appliances or write novels). Alternatively, imagine you are a biologist with a high-throughput laboratory technique to test hundreds of proteins in tandem. You would like it to analyze hundreds (even thousands) of these measurements, induce hypotheses that might explain the data and communicate them to you (which you might want to edit based on your knowledge or intuition), and let it propose subsequent experiments in order to refine these hypotheses, or potentially discover other proteins with the properties you study.
Toward Interactive Training and Evaluation
Abstract
Machine learning often relies on costly labeled data, and this impedes its application to new classification and information extraction problems. This has motivated the development of methods for leveraging abundant prior knowledge about these problems, including methods for lightly supervised learning using model expectation constraints. Building on this work, we envision an interactive training paradigm in which practitioners perform evaluation, analyze errors, and provide and refine expectation constraints in a closed loop. In this paper, we focus on several key subproblems in this paradigm that can be cast as selecting a representative sample of the unlabeled data for the practitioner to inspect. To address these problems, we propose stratified sampling methods that use model expectations as a proxy for latent output variables. In classification and sequence labeling experiments, these sampling strategies reduce accuracy evaluation effort by as much as 53%, provide more reliable estimates of F1 for rare labels, and aid in the specification and refinement of constraints.
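To make the sampling idea concrete, here is a minimal sketch of stratified accuracy estimation. It is not the paper's method: the abstract does not specify how strata are formed, so this sketch assumes equal-size strata cut along sorted model confidence, with proportional allocation of the inspection budget. The function name, its parameters, and the synthetic data are hypothetical.

```python
import random

def stratified_accuracy_estimate(confidences, is_correct,
                                 n_strata=4, budget=40, seed=0):
    """Estimate accuracy from a small inspected sample.

    confidences: model confidence per item, used as the proxy for the
                 unknown correctness of each prediction.
    is_correct:  callable index -> bool; stands in for a practitioner
                 inspecting one item (hypothetical oracle).
    """
    rng = random.Random(seed)
    # Sort item indices by confidence and cut into equal-size strata.
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    size = -(-len(order) // n_strata)  # ceiling division
    strata = [order[k:k + size] for k in range(0, len(order), size)]

    estimate = 0.0
    inspected = 0
    for stratum in strata:
        # Proportional allocation, with at least one item per stratum.
        n = max(1, round(budget * len(stratum) / len(order)))
        n = min(n, len(stratum))
        sample = rng.sample(stratum, n)
        inspected += n
        stratum_accuracy = sum(is_correct(i) for i in sample) / n
        # Weight each stratum by its share of the full data set.
        estimate += stratum_accuracy * len(stratum) / len(order)
    return estimate, inspected

# Synthetic check: an item is correct with probability equal to its confidence.
data_rng = random.Random(1)
conf = [data_rng.random() for _ in range(1000)]
truth = [data_rng.random() < c for c in conf]
estimate, inspected = stratified_accuracy_estimate(conf, truth.__getitem__)
print(f"estimated accuracy {estimate:.2f} from {inspected} inspected items")
```

Grouping items with similar confidence tends to lower the variance within each stratum, which is why a stratified sample can need fewer inspected items than a uniform one for the same precision.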
Socioscope: Spatio-Temporal Signal Recovery from Social Media
Abstract
Many real-world phenomena can be represented by a spatio-temporal signal: where, when, and how much. Social media is a tantalizing data source for those who wish to monitor such signals. Unlike most prior work, we assume that the target phenomenon is known and we are given a method to count its occurrences in social media. However, counting is plagued by sample bias, incomplete data, and, paradoxically, data scarcity, all issues inadequately addressed by prior work. We formulate signal recovery as a Poisson point process estimation problem. We explicitly incorporate human population bias, time delays and spatial distortions, and spatio-temporal regularization into the model to address the noisy count issues. We present an efficient optimization algorithm and discuss its theoretical properties. We show that our model is more accurate than commonly-used baselines. Finally, we present a case study on wildlife roadkill monitoring, where our model produces qualitatively convincing results.
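As a rough illustration of why population bias matters for count data, the sketch below assumes a heavily simplified model: the count in each cell is Poisson with mean equal to a known activity level times the underlying intensity. This is not the paper's model; it ignores time delays, spatial distortions, and regularization, and the function names and numbers are made up for the example.

```python
import math

def poisson_log_likelihood(counts, activity, intensity):
    """Log-likelihood of counts x_i ~ Poisson(activity_i * intensity_i)."""
    total = 0.0
    for x, g, f in zip(counts, activity, intensity):
        mean = g * f
        total += x * math.log(mean) - mean - math.lgamma(x + 1)
    return total

def mle_intensity(counts, activity):
    """Unregularized maximum-likelihood estimate: f_i = x_i / g_i."""
    return [x / g for x, g in zip(counts, activity)]

counts = [12, 3, 30]         # posts counted per cell (made-up numbers)
activity = [4.0, 1.0, 10.0]  # relative number of active users per cell (made-up)
print(mle_intensity(counts, activity))  # [3.0, 3.0, 3.0]
```

The raw counts differ by a factor of ten, yet the estimated intensity is identical in every cell once each count is divided by its activity level. With sparse counts this per-cell estimate becomes noisy, which is the kind of problem the regularization mentioned in the abstract is meant to address.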

