Results 1 - 10 of 27
A survey of outlier detection methodologies
- Artificial Intelligence Review, 2004
Cited by 80 (3 self)
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
Anomaly Detection: A Survey
, 2007
Cited by 69 (1 self)
Anomaly detection is an important problem that has been researched within diverse research areas and application domains. Many anomaly detection techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection. We have grouped existing techniques into different categories based on the underlying approach adopted by each technique. For each category we have identified key assumptions, which are used by the techniques to differentiate between normal and anomalous behavior. When applying a given technique to a particular domain, these assumptions can be used as guidelines to assess the effectiveness of the technique in that domain. For each category, we provide a basic anomaly detection technique, and then show how the different existing techniques in that category are variants of the basic technique. This template provides an easier and more succinct understanding of the techniques belonging to each category. Further, for each category, we identify the advantages and disadvantages of the techniques in that category. We also provide a discussion on the computational complexity of the techniques since it is an important issue in real application domains. We hope that this survey will provide a better understanding of the different directions in which research has been done on this topic, and how techniques developed in one area can be applied in domains for which they were not intended to begin with.
Investigating ancient duplication events in the Arabidopsis genome
- J. Struct. Funct. Genomics, 2003
Cited by 3 (0 self)
The complete genomic analysis of Arabidopsis thaliana has shown that a major fraction of the genome consists of paralogous genes that probably originated through one or more ancient large-scale gene or genome duplication events. However, the number and timing of these duplications still remain unclear, and several different hypotheses have been put forward recently. Here, we reanalyzed duplicated blocks found in the Arabidopsis genome described previously and determined their date of divergence based on silent substitution estimations between the paralogous genes and, where possible, by phylogenetic reconstruction. We show that methods based on averaging protein distances of heterogeneous classes of duplicated genes lead to unreliable conclusions and that a large fraction of blocks duplicated much more recently than assumed previously. We found clear evidence for one large-scale gene or even complete genome duplication event somewhere between 70 and 90 million years ago. Traces pointing to a much older (probably more than 200 million years) large-scale gene duplication event could be detected. However, for now it is impossible to conclude whether these old duplicates are the result of one or more large-scale gene duplication events.
Order stability in supply chains: Coordination risk and the role of coordination stock
, 2004
Cited by 2 (0 self)
The bullwhip effect describes the tendency for the variance of orders in supply chains to increase as one moves upstream from consumer demand. Previous research attributes this phenomenon to both operational and behavioral causes. Operational causes are features of the institutional setting that lead rational agents to amplify changes in demand, while behavioral causes arise from suboptimal decision-making. This paper examines causes of the bullwhip through experiments with a serial supply chain, using the Beer Distribution Game. Unlike prior studies, we control all four commonly cited operational causes of the bullwhip, including uncertainty about customer demand. We eliminate demand uncertainty completely by making customer demand constant and known to all participants. Despite these controls, order amplification, instability, and supply line underweighting remain pervasive. We propose a new behavioral cause of the bullwhip, coordination risk, that arises when players place excessive orders to address the perceived risk that others will not behave optimally. We test two strategies to mitigate coordination risk: (1) holding additional on-hand inventory, and (2) creating common knowledge by informing participants of the optimal policy. Both strategies reduce, but do not eliminate, the bullwhip effect. Holding excess inventory reduces order amplification by providing a buffer against the
Mechanistic considerations for carcinogenic risk estimation: chloroform
- Environ. Health Perspect. 46, 1982
Cited by 2 (0 self)
Chloroform has been reported to induce cancer in rodents after chronic administration of high doses by gavage. However, the interpretation of these findings is hampered by a lack of knowledge concerning the relative roles of genetic and nongenetic mechanisms in these bioassays. The present studies were carried out in male B6C3F1 mice in order to investigate the potential of chloroform to induce genetic damage and/or organ toxicity at the sites where tumors have been observed in the various bioassays. These studies revealed that carcinogenic doses of chloroform produced severe necrosis at the sites where tumors later developed. This was demonstrated by light microscopy as well as by determination of the cellular regeneration index following administration of 3H-thymidine. Noncarcinogenic doses of chloroform failed to induce these responses. In contrast, studies of DNA alkylation and DNA repair in vivo failed to give any indication that chloroform had produced the type of genetic alterations associated with known genotoxic chemicals. These data suggest that the primary mechanism of chloroform-induced carcinogenesis is nongenetic in nature. If the same mechanism predominates in man, there should be little to no carcinogenic risk associated with exposure to noncytotoxic levels of chloroform.
Is This My Hand I See Before Me? The Rubber Hand Illusion in Reality, Virtual Reality, and Mixed Reality
Cited by 2 (0 self)
This paper presents a first study in which a recently reported intermodal perceptual illusion known as the rubber hand illusion is experimentally investigated under mediated conditions. When one’s own hand is placed out of view and a visible fake hand is repeatedly stroked and tapped in synchrony with the unseen hand, subjects report a strong sense in which the fake hand is experienced as part of their own body. In our experiment, we investigated this illusion under three conditions: (i) unmediated condition, replicating the original paradigm, (ii) virtual reality (VR) condition, where both the fake hand and its stimulation were projected on the table in front of the participant, and (iii) mixed reality (MR) condition, where the fake hand was projected, but its stimulation was unmediated. Dependent measures included self-report (open-ended and questionnaire-based) and drift, that is, the offset between the felt position of the hidden hand and its actual position. As expected, the unmediated condition produced the strongest illusion, as indicated both by self-report and drift towards the rubber hand. The VR condition produced a more convincing subjective illusion than the MR condition, although no difference in drift was found between the mediated conditions. Results are discussed in terms of perceptual mechanisms underlying the rubber hand illusion, and the illusion’s relevance to understanding telepresence.
Keywords: Rubber hand illusion, multisensory integration, body image, virtual reality, mixed reality, telepresence
OUTLIER DETECTION
Cited by 1 (0 self)
In many data analysis tasks a large number of variables are being recorded or sampled. One of the first steps towards obtaining a coherent analysis is the detection of outlying observations. Although outliers are often considered as an error or noise, they may carry important information. Detected outliers are candidates for aberrant data that may otherwise adversely
Combinatorial QSAR Modeling of Chemical Toxicants Tested against
, 2007
Cited by 1 (0 self)
Selecting the most rigorous quantitative structure-activity relationship (QSAR) approaches is of great importance in the development of robust and predictive models of chemical toxicity. To address this issue in a systematic way, we have formed an international virtual collaboratory consisting of six independent groups with shared interests in computational chemical toxicology. We have compiled an aqueous toxicity data set containing 983 unique compounds tested in the same laboratory over a decade against Tetrahymena pyriformis. A modeling set including 644 compounds was selected randomly from the original set and distributed to all groups that used their own QSAR tools for model development. The remaining 339 compounds in the original set (external set I) as well as 110 additional compounds (external set II) published recently by the same laboratory (after this computational study was already in progress) were used as two independent validation sets to assess the external predictive power of individual models. In total, our virtual collaboratory has developed 15 different types of QSAR models of aquatic toxicity for the training set. The internal
This un-edited manuscript has been accepted for publication in
, 2007
"... Biophysical Journal and is freely available on BioFast at ..."
Congestion Location Detection: Methodology, Algorithm, and Performance
Can an end-host running multiple TCP sessions detect not just the occurrence, but also the location of congestion? This paper answers this question through new analytic results on the two underlying technical difficulties: synchronization effects of loss and delay in TCP and distributed hypothesis testing using only local loss and delay data, as well as practical algorithm development and extensive simulations. It presents a Congestion Location Detection algorithm that effectively allows an end host to distributedly detect whether congestion happens in the local access link or in more remote links. This further enables the practical usage of low-priority congestion control protocols.

