Adaptive Performance Prediction for Distributed Data-Intensive Applications
, 1999
Cited by 34 (3 self)
The computational grid is becoming the platform of choice for large-scale distributed data-intensive applications. Accurately predicting the transfer times of remote data files, a fundamental component of such applications, is critical to achieving application performance. In this paper, we introduce a performance prediction method, ARM (Adaptive Regression Modeling), to determine data transfer times for network-bound distributed data-intensive applications. We demonstrate the effectiveness of the ARM method on two distributed data applications, SARA (Synthetic Aperture Radar Atlas) and SRB (Storage Resource Broker), and discuss how it can be used for application scheduling. Our experiments demonstrate that applying the ARM method to these applications predicted data transfer times in wide-area multi-user grid environments with accuracy of 88% or better.
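The abstract does not spell out how ARM builds its regression, so the following is only a minimal sketch of the general idea of regression-based transfer-time prediction that adapts to recent conditions. It assumes a simple linear relationship between file size and transfer time, fitted by least squares over a sliding window of recent transfers. The class name `SlidingWindowRegressor`, the window size, and the units are hypothetical and are not taken from the paper.

```python
from collections import deque


class SlidingWindowRegressor:
    """Predict transfer time from file size using only recent observations.

    Illustrative assumption: time = intercept + slope * size, refitted on
    the last `window` transfers. This is not the paper's ARM method.
    """

    def __init__(self, window=20):
        # Old samples fall out automatically once the window is full.
        self.samples = deque(maxlen=window)

    def observe(self, size_mb, seconds):
        """Record one completed transfer."""
        self.samples.append((size_mb, seconds))

    def predict(self, size_mb):
        """Return the predicted transfer time in seconds for a file size."""
        n = len(self.samples)
        if n == 0:
            raise ValueError("no observations yet")
        mean_x = sum(x for x, _ in self.samples) / n
        mean_y = sum(y for _, y in self.samples) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in self.samples)
        if sxx == 0:
            # All observed sizes are identical: fall back to the mean time.
            return mean_y
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in self.samples)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        return intercept + slope * size_mb
```

Because the window discards old samples, the fitted line follows changes in network load: once enough new transfers are observed, earlier conditions no longer influence the prediction.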
Decomposition in Data Mining: An Industrial Case Study
- IEEE TRANSACTIONS ON ELECTRONICS PACKAGING MANUFACTURING
, 2000
Cited by 24 (11 self)
Data mining offers tools for discovery of relationships, patterns, and knowledge in large databases. The knowledge extraction process is computationally complex and therefore a subset of all data is normally considered for mining. In this paper, numerous methods for decomposition of data sets are discussed. Decomposition enhances the quality of knowledge extracted from large databases by simplification of the data mining task. The ideas presented are illustrated with examples and an industrial case study. In the case study reported in this paper, a data mining approach is applied to extract knowledge from a data set. The extracted knowledge is used for the prediction and prevention of manufacturing faults in wafers.
DATA MINING FOR INTRUSION DETECTION -- A Critical Review
Cited by 13 (0 self)
Data mining techniques have been successfully applied in many different fields including marketing, manufacturing, process control, fraud detection, and network management. Over the past five years, a growing number of research projects have applied data mining to various problems in intrusion detection. This chapter surveys a representative cross section of these research efforts. Moreover, four characteristics of contemporary research are identified and discussed in a critical manner. Conclusions are drawn and directions for future research are suggested. Note: This article is an excerpt of the original work published in D. Barbara and S. Jajodia, editors, Applications of Data Mining in Computer Security, Kluwer Academic Publishers, Boston, 2002.
Intrusion Detection Systems Using Decision Trees and Support Vector Machines
- INTERNATIONAL JOURNAL OF APPLIED SCIENCE AND COMPUTATIONS
, 2004
Cited by 7 (6 self)
The security of computers and the networks that connect them is becoming increasingly significant. Intrusion detection is a mechanism for providing security to computer networks. Although some mechanisms for intrusion detection already exist, there is a need to improve their performance. Data mining techniques are a new approach to intrusion detection. In this paper we investigate and evaluate the decision tree data mining technique as an intrusion detection mechanism and compare it with Support Vector Machines (SVM). Intrusion detection with decision trees and SVM was tested on the benchmark 1998 DARPA Intrusion Detection data set. Our research shows that decision trees give better overall performance than SVM.
A MIXED-INTEGER PROGRAMMING APPROACH TO THE CLUSTERING PROBLEM WITH AN APPLICATION IN CUSTOMER SEGMENTATION
, 2005
Cited by 1 (0 self)
and have found that it is complete and satisfactory in all respects, and that any and all revisions required by the final examining committee have been made.
Collusion in The U.S. Crop Insurance Program: Applied Data Mining
- in KDD ’02: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
This paper quantitatively analyzes indicators of Agent (policy seller), Adjuster (indemnity claim adjuster), and Producer (policy purchaser/holder) indemnity behavior suggestive of collusion in the United States Department of Agriculture (USDA) Risk Management Agency (RMA) national crop insurance program. According to guidance from the federal law and using six indicator variables of indemnity behavior, those entities equal to or exceeding 150% of the county mean (computed using a simple jackknife procedure) on all entity-relevant indicators were flagged as "anomalous." Log linear analysis was used to test (1) hierarchical node-node arrangements and (2) a non-recursive model of node information sharing. A chi-square-distributed deviance statistic identified the optimal log linear model. The results of the applied data mining technique used here suggest that non-recursive triplet and Agent-Producer doublet collusion probabilistically accounts for the greatest proportion of waste, fraud, and abuse in the federal crop insurance program. Triplets and Agent-Producer doublets need detailed investigation for possible collusion. This data mining technique provided a high level of confidence when 24 million records were quantitatively analyzed for possible fraud, waste, or other abuse of the crop insurance program administered by the USDA RMA, and suspect entities were reported to USDA. The technique can be applied wherever vast amounts of data are available to detect patterns of collusion or conspiracy that may be of interest to criminal justice or intelligence agencies.
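The flagging rule described above can be sketched as follows. This is a hedged illustration, not the authors' procedure: it assumes that the "simple jackknife" means a leave-one-out county mean (each entity is compared against the mean of the other entities in its county), and it handles a single indicator. The function name `flag_anomalous` and the record layout are hypothetical.

```python
from collections import defaultdict


def flag_anomalous(records, threshold=1.5):
    """Return the set of entities at or above `threshold` times the
    leave-one-out mean of their county for one indicator.

    records: iterable of (entity, county, value) tuples.
    Assumption: the jackknife county mean excludes the entity under test.
    """
    by_county = defaultdict(list)
    for entity, county, value in records:
        by_county[county].append((entity, value))

    flagged = set()
    for members in by_county.values():
        n = len(members)
        if n < 2:
            # No peers to compare against in a single-entity county.
            continue
        total = sum(value for _, value in members)
        for entity, value in members:
            # Mean of every other entity in the same county.
            loo_mean = (total - value) / (n - 1)
            # Skip zero values so an all-zero county flags nobody.
            if value > 0 and value >= threshold * loo_mean:
                flagged.add(entity)
    return flagged
```

Since the abstract requires an entity to meet the threshold on all of its relevant indicators, one way to extend the sketch is to run it once per indicator and intersect the resulting sets.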
Collusion in The U.S. Crop Insurance
This paper quantitatively analyzes indicators of Agent (policy seller), Adjuster (indemnity claim adjuster), Producer (policy purchaser/holder) indemnity behavior suggestive of collusion in the United States Department of Agriculture (USDA) Risk Management Agency (RMA) national crop insurance program. According to guidance from the federal law and using six indicator variables of indemnity behavior, those entities equal to or exceeding 150% of the county mean (computed using a simple jackknife procedure) on all entity-relevant indicators were flagged as “anomalous.” Log linear analysis was used to test (1) hierarchical node-node arrangements and (2) a non-recursive model of node information sharing. Chi-square distributed deviance statistic identified the optimal log linear model. The results of the applied data mining technique used here suggest that the non-recursive triplet and agent-producer doublet collusion probabilistically accounts for the greatest proportion of waste, fraud, and abuse in the federal crop insurance program. Triplet and agent-producer doublets need detailed investigation for possible collusion. Hence, this data mining technique provided a high level of confidence when 24 million records were quantitatively analyzed for possible fraud, waste, or other abuse of the crop insurance
Performance Analysis of Data Mining Tools Cumulating with a Proposed Data Mining Middleware
Data mining has become increasingly popular for revealing important knowledge in an organization’s databases, and this has led to the emergence of a variety of data mining tools to help in decision making. The present study describes a test bed for investigating five major data mining tools, namely IBM Intelligent Miner, SPSS Clementine, SAS Enterprise Miner, Oracle Data Miner, and Microsoft Business Intelligence Development Studio. The study focuses on the performance of these tools. The results provide a review of the tools, and a data mining middleware that adopts their strengths is proposed. Key words: knowledge discovery, performance metrics, test bed
Collusion in The U.S. Crop Insurance
- Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.
This paper quantitatively analyzes indicators of Agent (policy seller), Adjuster (indemnity claim adjuster), Producer (policy purchaser/holder) indemnity behavior suggestive of collusion in the United States Department of Agriculture (USDA) Risk Management Agency (RMA) national crop insurance program. According to guidance from the federal law and using six indicator variables of indemnity behavior, those entities equal to or exceeding 150% of the county mean (computed using a simple jackknife procedure) on all entity-relevant indicators were flagged as “anomalous.” Log linear analysis was used to test (1) hierarchical node-node arrangements and (2) a non-recursive model of node information sharing. Chi-square distributed deviance statistic identified the optimal log linear model. The results of the applied data mining technique used here suggest that the non-recursive triplet and agent-producer doublet collusion probabilistically accounts for the greatest proportion of waste, fraud, and abuse in the federal crop insurance program. Triplet and agent-producer doublets need detailed investigation for possible collusion. Hence, this data mining technique provided a high level of confidence when 24 million records were quantitatively analyzed for possible fraud, waste, or other abuse of the crop insurance

