Results 1 - 5 of 5
A Linear Programming Formulation for Global Inference in Natural Language Tasks
- In Proceedings of CoNLL-2004
, 2004
Cited by 91 (26 self)
The typical processing paradigm in natural language processing is the "pipeline" approach, where learners are used at one level, their outcomes are used as features for a second level of predictions, and so on. In addition to accumulating errors, it is clear that the sequential processing is a crude approximation to a process in which interactions occur across levels and downstream decisions often interact with previous decisions. This work develops a general...
Inferring private information using social network data
, 2008
Cited by 16 (1 self)
Online social networks, such as Facebook, are increasingly utilized by many users. These networks allow people to publish details about themselves and connect to their friends. Some of the information revealed inside these networks is private and it is possible that corporations could use learning algorithms on the released data to predict undisclosed private information. In this paper, we propose an effective, scalable inference attack for released social networking data to infer undisclosed private information about individuals. We then explore the effectiveness of possible sanitization techniques that can be used to combat such an inference attack.
Preventing Private Information Inference Attacks on Social Networks
, 2009
Cited by 7 (1 self)
On-line social networks, such as Facebook, are increasingly utilized by many people. These networks allow users to publish details about themselves and connect to their friends. Some of the information revealed inside these networks is meant to be private. Yet it is possible that corporations could use learning algorithms on released data to predict undisclosed private information. In this paper, we explore how to launch inference attacks using released social networking data to predict undisclosed private information about individuals. We then devise three possible sanitization techniques that could be used in various situations. Then, we explore the effectiveness of these techniques by implementing them on a data set obtained from the Dallas/Fort Worth, Texas network of the Facebook social networking application and attempting to use methods of collective inference to discover sensitive attributes of the data set. We show that we can decrease the effectiveness of both local and relational classification algorithms by using the sanitization methods we described. Further, we discover a problem domain where collective inference degrades the performance of classification algorithms for determining private attributes.
Effective use of phrases in language modeling to improve information retrieval
- 2004 Symposium on AI & Math Special Session on Intelligent Text Processing
, 2004
Cited by 2 (1 self)
Traditional information retrieval models treat the query as a bag of words, assuming that the occurrence of each query term is independent of the positions and occurrences of others. Several of these traditional models have been extended to incorporate positional information, most often through the inclusion of phrases. This has shown improvements in effectiveness on large, modern test collections. The language modeling approach to information retrieval is attractive because it provides a well-studied theoretical framework that has been successful in other fields. Incorporating positional information into language models is intuitive and has shown significant improvements in several language-modeling applications. However, attempts to integrate positional information into the language-modeling approach to IR have not shown consistent significant improvements. This paper provides a broader exploration of this problem. We apply the backoff technique to incorporate a bigram phrase language model with the traditional unigram one and compare its performance to an interpolation of a conditional bigram model with the unigram model. While this novel application of backoff does not improve effectiveness, we find that our formula for interpolating a conditional bigram model with the unigram model yields significantly different results from prior work. Namely, it shows an 11% relative improvement in average precision on one query set, while yielding no improvement on the other two.
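As a rough illustration of what interpolating a conditional bigram model with a unigram model can look like when scoring a query against a document, here is a minimal sketch. It is not the paper's formula: the function name `interpolated_bigram_score`, the mixing weight `lam`, the floor `eps`, and the use of raw maximum-likelihood counts without collection smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def interpolated_bigram_score(query, doc, lam=0.7, eps=1e-9):
    """Log-likelihood of `query` under a document model that mixes a
    conditional bigram estimate with a unigram estimate.

    Hypothetical sketch: `lam` weights the bigram estimate, and `eps`
    is a small floor that keeps log() defined for unseen terms.
    """
    unigrams = Counter(doc)
    bigrams = Counter(zip(doc, doc[1:]))
    n = len(doc)

    score = 0.0
    prev = None
    for term in query:
        p_uni = unigrams[term] / n if n else 0.0
        if prev is not None and unigrams[prev] > 0:
            # Maximum-likelihood estimate of P(term | prev) in the document.
            p_bi = bigrams[(prev, term)] / unigrams[prev]
        else:
            # No usable history: fall back to the unigram estimate.
            p_bi = p_uni
        score += math.log(lam * p_bi + (1.0 - lam) * p_uni + eps)
        prev = term
    return score


doc = "new york city is in new york state".split()
in_order = interpolated_bigram_score(["new", "york"], doc)
reversed_order = interpolated_bigram_score(["york", "new"], doc)
```

With `lam=0` the score reduces to the bag-of-words unigram model, so term order no longer matters; with `lam > 0` the query in document order scores higher than the reversed one.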
Chapter 1: Link Prediction in Social Networks
Link prediction is an important task for analyzing social networks which also has applications in other domains such as information retrieval, bioinformatics and e-commerce. There exist a variety of techniques for link prediction, ranging from feature-based classification and kernel-based methods to matrix factorization and probabilistic graphical models. These methods differ from each other with respect to model complexity, prediction performance, scalability, and generalization ability. In this article, we survey some representative link prediction methods by categorizing them by the type of the models. We largely consider three types of models: first, the traditional (non-Bayesian) models, which extract a set of features to train a binary classification model. Second, the probabilistic approaches, which model the joint probability among the entities in a network by Bayesian graphical models. And finally, the linear algebraic approach, which computes the similarity between the nodes in a network by rank-reduced similarity matrices. We discuss various existing link prediction models that fall in these broad categories and analyze their strengths and weaknesses. We conclude the survey with a discussion on recent developments and future research directions.
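To make the first category concrete, the sketch below computes two neighborhood-based features for a candidate node pair, of the kind that could be fed to a binary classifier. The abstract does not name specific features; common-neighbor count and Jaccard similarity are chosen here as familiar examples, and the function name `neighborhood_features` and the adjacency-dictionary graph representation are assumptions for illustration.

```python
def neighborhood_features(graph, u, v):
    """Return simple topological features for the candidate link (u, v).

    `graph` is assumed to map each node to the set of its neighbors.
    The feature choice is illustrative, not taken from the survey.
    """
    nu = graph.get(u, set())
    nv = graph.get(v, set())
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        # Jaccard similarity: shared neighbors over all neighbors of either node.
        "jaccard": len(common) / len(union) if union else 0.0,
    }


# Small undirected example graph stored as adjacency sets.
graph = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
features = neighborhood_features(graph, "a", "d")
```

Here nodes `a` and `d` are not yet connected but share the neighbors `b` and `c`, so the pair gets a relatively high score on both features; a classifier trained on such vectors would then decide whether the link is likely to form.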

