Hypergraph-based anomaly detection of high-dimensional co-occurrences
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2009
Cited by 6 (0 self)
Abstract—This paper addresses the problem of detecting anomalous multivariate co-occurrences using a limited number of unlabeled training observations. A novel method based on a hypergraph representation of the data is proposed to deal with this very high-dimensional problem. Hypergraphs constitute an important extension of graphs that allow edges to connect more than two vertices simultaneously. A variational Expectation-Maximization algorithm for detecting anomalies directly on the hypergraph domain without any feature selection or dimensionality reduction is presented. The resulting estimate can be used to calculate a measure of anomalousness based on the false-discovery rate. The algorithm has O(np) computational complexity, where n is the number of training observations and p is the number of potential participants in each co-occurrence event. This efficiency makes the method ideally suited for very high-dimensional settings, and the method requires no tuning, bandwidth, or regularization parameters. The proposed approach is validated on both high-dimensional synthetic data and the Enron email database, where p > 75,000, and it is shown that it can outperform other state-of-the-art methods. Index Terms—Anomaly detection, co-occurrence analysis, unsupervised learning, variational methods, social networks.
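As a rough illustration of the hypergraph representation mentioned in this abstract, each co-occurrence event can be stored as one hyperedge, that is, a set of participants of any size. The sketch below only shows the data structure; it is not the paper's variational EM algorithm, and the names `events` and `build_incidence` as well as the sample participants are hypothetical.

```python
from collections import defaultdict


def build_incidence(events):
    """Map each participant (vertex) to the indices of the hyperedges containing it."""
    incidence = defaultdict(set)
    for idx, edge in enumerate(events):
        for vertex in edge:
            incidence[vertex].add(idx)
    return dict(incidence)


# Hypothetical co-occurrence events, e.g. the sender and recipients of one email each.
events = [
    frozenset({"alice", "bob", "carol"}),         # a hyperedge joining three vertices
    frozenset({"alice", "dave"}),                 # an ordinary two-vertex edge
    frozenset({"bob", "carol", "dave", "erin"}),  # a hyperedge joining four vertices
]

incidence = build_incidence(events)
print(sorted(incidence["alice"]))  # indices of the events that involve alice
```

Unlike an ordinary graph, nothing here forces an event to be broken into pairwise links; the whole participant set is kept as a single unit.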
Profiling of a network behind an infectious disease outbreak
, 2009
Cited by 6 (6 self)
I describe a method to estimate a social network topology and diffusion parameters from the time sequence data of an infectious disease outbreak. The method is applicable to a stochastic diffusion process in a metapopulation and
Node discovery problem for a social network, eprint arxiv.org/abs/0710
, 2007
Cited by 4 (3 self)
This paper presents a practical heuristic algorithm to address a node discovery problem. The node discovery problem is to discover a clue to a person who does not appear in the observed records but is functionally relevant in affecting the decision-making and behavior of an organization. We define two kinds of topological relevance of a node in a social network (global and local relevance). The association between topological relevance and functional relevance is studied with a few example networks in criminal organizations. We propose a heuristic algorithm to infer an invisible, functionally relevant person. Its performance (precision, recall, and F value) is demonstrated with a simulation experiment using a network derived from the Watts-Strogatz (WS) model.
1 Node discovery problem
The activity of an organization is often under the influence of an invisible relevant person. The term "invisible" means that the influence is not seen directly by the method applied in the observation procedure. This phenomenon arises intentionally or unintentionally. Let us show two examples.
1. Criminal organization: A commander tries to conceal himself and avoid leaving any traces in communication and meeting logs, which are the basic intelligence available to the police. Otherwise, exposure and arrest of a relevant pilot would have been fatal damage to the terrorist organization in the 9/11 attack.
2. Manufacturing company: A salesperson happens to be a close friend of an expert factory engineer through a common friend: a product designer.
Optimal structural inference of signaling pathways from unordered and overlapping gene sets
 Bioinformatics
, 2012
Cited by 2 (2 self)
Motivation: A plethora of bioinformatics analyses has led to the discovery of numerous gene sets, which can be interpreted as discrete measurements emitted from latent signaling pathways. Their potential to infer signaling pathway structures, however, has not been sufficiently exploited. Existing methods accommodating discrete data do not explicitly consider the signal cascading mechanisms that characterize a signaling pathway. Novel computational methods are thus needed to fully utilize gene sets and to broaden the scope from pairwise interactions alone to the more general cascading events in the inference of signaling pathway structures. Results: We propose a gene set-based Simulated Annealing (SA) algorithm for the reconstruction of signaling pathway structures. A signaling pathway structure is a directed graph containing up to a few hundred nodes and many overlapping signal cascades, where each
Node discovery in a networked organization
, 803
Cited by 1 (0 self)
Abstract—In this paper, I present a method to solve a node discovery problem in a networked organization. Covert nodes are nodes that are not directly observable. They affect social interactions but do not appear in the surveillance logs that record the participants of those interactions. Discovering the covert nodes is defined as identifying the suspicious logs where the covert nodes would appear if they became overt. A mathematical model is developed for the maximum likelihood estimation of the network behind the social interactions and for the identification of the suspicious logs. Precision, recall, and F measure characteristics are demonstrated with a dataset generated from a real organization and with computationally synthesized datasets. The performance is close to the theoretical limit for any covert node in networks of any topology and size if the ratio of the number of observations to the number of possible communication patterns is large.
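The precision, recall, and F measure named in this abstract can be computed from a set of flagged logs and a set of truly suspicious logs. The sketch below assumes the balanced F measure (harmonic mean of precision and recall), which the abstract does not specify; the function name `precision_recall_f` and the log identifiers are hypothetical.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Assumption: balanced F measure, i.e. the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical identifiers: logs flagged as suspicious vs. logs truly affected by a covert node.
flagged = {"log3", "log7", "log9", "log12"}
truth = {"log3", "log9", "log15"}

precision, recall, f_measure = precision_recall_f(flagged, truth)
print(precision, recall, f_measure)
```

Here two of the four flagged logs are correct and two of the three true logs are found, so precision is 2/4 and recall is 2/3.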
Secure Friend Discovery via Privacy-Preserving and Decentralized Community Detection
Cited by 1 (0 self)
The problem of secure friend discovery on a social network has long been proposed and studied. The requirement is that a pair of nodes can make befriending decisions with minimum information exposed to the other party. In this paper, we propose to use community detection to tackle the problem of secure friend discovery. We formulate the first privacy-preserving and decentralized community detection problem as a multi-objective optimization. We design the first protocol to solve this problem, which transforms community detection into a series of Private Set Intersection (PSI) instances using Truncated Random Walk (TRW). Preliminary theoretical results show that our protocol can uncover communities with overwhelming probability and preserve privacy. We also discuss future work, potential extensions, and variations.
GSGS: A Computational Approach to Reconstruct Signaling Pathway Structures from Gene Sets
Cited by 1 (1 self)
Abstract—Reconstruction of signaling pathway structures is essential to decipher complex regulatory relationships in living cells. The existing computational approaches often rely on unrealistic biological assumptions and do not explicitly consider signal transduction mechanisms. Signal transduction events refer to linear cascades of reactions from the cell surface to the nucleus and characterize a signaling pathway. In this paper, we propose a novel approach, Gene Set Gibbs Sampling (GSGS), to reverse engineer signaling pathway structures from gene sets related to the pathways. We hypothesize that signaling pathways are structurally an ensemble of overlapping linear signal transduction events, which we encode as Information Flows (IFs). We infer signaling pathway structures from gene sets, referred to as Information Flow Gene Sets (IFGSs), corresponding to these events. Thus, an IFGS only reflects which genes appear in the underlying IF but not their ordering. GSGS offers a Gibbs sampling-like procedure to reconstruct the underlying signaling pathway structure by sequentially inferring IFs from the overlapping IFGSs related to the pathway. In the proof-of-concept studies, our approach is shown to outperform the existing state-of-the-art network inference approaches using both continuous and discrete data generated from benchmark networks in the DREAM initiative. We perform a comprehensive sensitivity analysis to assess the robustness of our approach. Finally, we implement GSGS to reconstruct signaling mechanisms in breast cancer cells.
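The distinction this abstract draws between an Information Flow (an ordered cascade) and its Information Flow Gene Set (membership only) can be made concrete with a small sketch. This is only an illustration of the data the inference starts from, not the GSGS procedure itself; the helper `to_ifgs` and the gene names are hypothetical.

```python
from itertools import permutations


def to_ifgs(information_flow):
    """Drop the ordering of an Information Flow, keeping only gene membership."""
    return frozenset(information_flow)


# Hypothetical gene names: two different linear cascades over the same genes.
flow_a = ("G1", "G2", "G3")
flow_b = ("G3", "G1", "G2")

# Both cascades collapse to the same unordered gene set.
print(to_ifgs(flow_a) == to_ifgs(flow_b))  # True

# Every ordering of the set is a candidate cascade that inference must choose among.
candidates = list(permutations(sorted(to_ifgs(flow_a))))
print(len(candidates))  # 6 orderings for three genes
```

The number of candidate orderings grows factorially with the size of the gene set, which suggests why a sampling-based search is attractive over exhaustive enumeration.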
Node Discovery Problem for a Social Network
A node discovery problem is defined as the problem of discovering a covert node within a social network. The covert node is a person who is not directly observable. The person transmits influence to neighbors and affects the resulting collaborative activities (e.g. meetings) within a social network, but does not appear in any information reported by the intelligence. Throughout this study, the information comes from data that record the participants of collaborative activities. Discovery of the covert node refers to the retrieval of the data and the corresponding collaborative activities that result from the influence of the covert node. The nodes that appear commonly in the retrieved data are likely to neighbor the covert node. Two methods are presented for detecting covert nodes within a social network. A novel statistical inference method is discussed and compared with a conventional heuristic method (data crystallization). The statistical inference method employs maximum likelihood estimation and outlier detection techniques. The performance of the methods is demonstrated with test datasets that are generated from computationally synthesized networks and from a real organization.
Author: Dr. Yoshiharu Maeno is a founder management consultant and scientist at Social Design Group. He has developed mathematical methods to reveal the topological structure and to profile the information diffusion in social networks.
Correspondence: Contact Yoshiharu Maeno at Sengoku 1638F, Bunkyoku, Tokyo 1120011, or email
Online Methods for Network Endpoint Localization
Online techniques are presented for estimating the source and destination of a suspect transmission through a network based on the activation pattern of sensors placed on network components. A hierarchical Bayesian model is used to relate routing, tracking, and topological parameters. A controlled Markovian routing model is used in conjunction with a recursive EM algorithm to derive adaptive routing and tracking parameter estimates. Previously developed semidefinite programming methods are used to account for any prior topological information through Monte Carlo estimates of the topology parameters. Convergence of the routing and tracking parameter estimates is proven, and it is shown that their asymptotic estimates are fixed points of an exact EM algorithm. Approximate methods based on permutation clustering are presented to reduce the complexity of sums that arise in the estimator formulas. A multi-armed bandit approach to the design problem of online probe scheduling is also presented. Finally, the effectiveness of the new methods is illustrated through a variety of tracking simulations inspired by real-world scenarios and involving real Internet data. Speedy performance and good accuracy are observed.