The WEKA Data Mining Software: An Update
Cited by 175 (6 self)
More than twelve years have elapsed since the first public release of WEKA. In that time, the software has been rewritten entirely from scratch, evolved substantially and now accompanies a text on data mining [35]. These days, WEKA enjoys widespread acceptance in both academia and business, has an active community, and has been downloaded more than 1.4 million times since being placed on SourceForge in April 2000. This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Low-Cost Traffic Analysis Of Tor
In Proceedings of the 2005 IEEE Symposium on Security and Privacy. IEEE CS, 2005
Cited by 101 (7 self)
Tor is the second generation Onion Router, supporting the anonymous transport of TCP streams over the Internet. Its low latency makes it very suitable for common tasks, such as web browsing, but insecure against traffic-analysis attacks by a global passive adversary. We present new traffic-analysis techniques that allow adversaries with only a partial view of the network to infer which nodes are being used to relay the anonymous streams and therefore greatly reduce the anonymity provided by Tor. Furthermore, we show that otherwise unrelated streams can be linked back to the same initiator. Our attack is feasible for the adversary anticipated by the Tor designers. Our theoretical attacks are backed up by experiments performed on the deployed, albeit experimental, Tor network. Our techniques should also be applicable to any low latency anonymous network. These attacks highlight the relationship between the field of traffic-analysis and more traditional computer security issues, such as covert channel analysis. Our research also highlights that the inability to directly observe network links does not prevent an attacker from performing traffic-analysis: the adversary can use the anonymising network as an oracle to infer the traffic load on remote nodes in order to perform traffic-analysis.
Strategies for Sound Internet Measurement
IMC'04, 2004
Cited by 55 (2 self)
Conducting an Internet measurement study in a sound fashion can be much more difficult than it might first appear. We present a number of strategies drawn from experiences for avoiding or overcoming some of the pitfalls. In particular, we discuss dealing with errors and inaccuracies; the importance of associating meta-data with measurements; the technique of calibrating measurements by examining outliers and testing for consistencies; difficulties that arise with large-scale measurements; the utility of developing a discipline for reliably reproducing analysis results; and issues with making datasets publicly available. We conclude with thoughts on the sorts of tools and community practices that can assist researchers with conducting sound measurement studies.
The Performance of Reliable Server Pooling Systems in Different Server Capacity Scenarios
In Proceedings of the IEEE TENCON ’05, Melbourne/Australia, Nov. 2005. ISBN, 2005
Cited by 26 (22 self)
Reliable Server Pooling (RSerPool) is a protocol framework for server pool management and session failover, currently under standardization by the IETF RSerPool WG. While the basic ideas of RSerPool are not new, their combination into one architecture is. Some research into the performance of RSerPool for certain specific applications has been made, but a detailed, application-independent sensitivity analysis of the system parameters is still missing. The goal of this paper is to systematically investigate RSerPool's load distribution behaviour on changes of workload and system parameters, to determine basic guidelines on designing efficient RSerPool systems. In this paper, we focus particularly on scenarios of server pools consisting of servers with unequal capacities.
A Comparison of Statistical Significance Tests for Information Retrieval Evaluation
2007
Cited by 24 (1 self)
Information retrieval (IR) researchers commonly use three tests of statistical significance: the Student’s paired t-test, the Wilcoxon signed rank test, and the sign test. Other researchers have previously proposed using both the bootstrap and Fisher’s randomization (permutation) test as nonparametric significance tests for IR but these tests have seen little use. For each of these five tests, we took the ad-hoc retrieval runs submitted to TRECs 3 and 5-8, and for each pair of runs, we measured the statistical significance of the difference in their mean average precision. We discovered that there is little practical difference between the randomization, bootstrap, and t tests. Both the Wilcoxon and sign test have a poor ability to detect significance and have the potential to lead to false detections of significance. The Wilcoxon and sign tests are simplified variants of the randomization test and their use should be discontinued for measuring the significance of a difference between means.
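As an illustration of the randomization (permutation) test named in the abstract above, here is a minimal sketch, not the authors' implementation. It assumes two runs are summarized by paired per-topic scores (for example, average precision per topic) and estimates a two-sided p-value by Monte Carlo sampling of random sign flips. The function name `paired_randomization_test`, the `trials` and `seed` parameters, and the small tolerance are all hypothetical choices made for this example.

```python
import random


def paired_randomization_test(scores_a, scores_b, trials=10000, seed=0):
    """Estimate a two-sided p-value for the difference in mean paired scores.

    Illustrative sketch only: under the null hypothesis the two systems are
    interchangeable on every topic, so each per-topic difference is equally
    likely to carry either sign.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired test needs equal-length score lists")
    if not scores_a:
        raise ValueError("need at least one paired observation")

    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)

    extreme = 0
    for _ in range(trials):
        # Randomly swap the system labels within each topic (a sign flip).
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        # Count permutations at least as extreme as the observed difference;
        # the tolerance guards against floating-point rounding.
        if abs(permuted) >= observed - 1e-12:
            extreme += 1
    return extreme / trials
```

Sampling sign flips rather than enumerating all 2^n assignments keeps the cost fixed at `trials` iterations, at the price of a p-value that is itself an estimate.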
Variation-Aware Application Scheduling and Power Management for Chip Multiprocessors
2008
Cited by 20 (2 self)
Within-die process variation causes individual cores in a Chip Multiprocessor (CMP) to differ substantially in both static power consumed and maximum frequency supported. In this environment, ignoring variation effects when scheduling applications or when managing power with Dynamic Voltage and Frequency Scaling (DVFS) is suboptimal. This paper proposes variation-aware algorithms for application scheduling and power management. One such power management algorithm, called LinOpt, uses linear programming to find the best voltage and frequency levels for each of the cores in the CMP, maximizing throughput at a given power budget. In a 20-core CMP, the combination of variation-aware application scheduling and LinOpt increases the average throughput by 12–17% and reduces the average ED² by 30–38%, all relative to using variation-aware scheduling together with a simple extension to Intel’s Foxton power management algorithm.
An Experimentation Workbench for Replayable Networking Research
In Proceedings of the Symposium on Networked System Design and Implementation, 2007
Cited by 19 (2 self)
The networked and distributed systems research communities have an increasing need for “replayable” research, but our current experimentation resources fall short of satisfying this need. Replayable activities are those that can be re-executed, either as-is or in modified form, yielding new results that can be compared to previous ones. Replayability requires complete records of experiment processes and data, of course, but it also requires facilities that allow those processes to actually be examined, repeated, modified, and reused. We are now evolving Emulab, our popular network testbed management system, to be the basis of a new experimentation workbench in support of realistic, large-scale, replayable research. We have implemented a new model of testbed-based experiments that allows people to move forward and backward through their experimentation processes. Integrated tools help researchers manage their activities (both planned and unplanned), software artifacts, data, and analyses. We present the workbench, describe its implementation, and report how it has been used by early adopters. Our initial case studies highlight both the utility of the current workbench and additional usability challenges that must be addressed.
Highway hierarchies star
9TH DIMACS IMPLEMENTATION CHALLENGE, 2006
Cited by 19 (8 self)
We study two speedup techniques for route planning in road networks: highway hierarchies (HH) and goal directed search using landmarks (ALT). It turns out that there are several interesting synergies. Highway hierarchies yield a way to implement landmark selection more efficiently and to store landmark information more space efficiently than before. ALT gives queries in highway hierarchies an excellent sense of direction and allows some pruning of the search space. For computing shortest distances and approximately shortest travel times, this combination yields a significant speedup over HH alone. We also explain how to compute actual shortest paths very efficiently.
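The goal-directed half of this combination (ALT: A* search, landmarks, triangle inequality) can be sketched briefly. The code below is an illustrative sketch under stated assumptions, not the paper's implementation, and it does not include highway hierarchies at all. It assumes a connected, undirected graph stored as an adjacency dictionary with non-negative weights. The names `dijkstra`, `alt_lower_bound`, and `alt_query` are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Exact distances from source; used here to precompute landmark tables."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def alt_lower_bound(landmark_dists, u, target):
    """Triangle-inequality bound: |d(L,u) - d(L,target)| <= d(u,target).

    Assumes an undirected graph in which every node is reachable from
    every landmark.
    """
    return max((abs(d[u] - d[target]) for d in landmark_dists), default=0)


def alt_query(graph, landmark_dists, source, target):
    """A* search guided by landmark bounds; returns (distance, nodes settled)."""
    best = {source: 0}
    heap = [(alt_lower_bound(landmark_dists, source, target), source)]
    settled = set()
    while heap:
        _, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        if u == target:
            return best[u], len(settled)
        for v, w in graph.get(u, []):
            nd = best[u] + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                # Priority = distance so far + lower bound on the remainder.
                priority = nd + alt_lower_bound(landmark_dists, v, target)
                heapq.heappush(heap, (priority, v))
    return float("inf"), len(settled)
```

A typical use would precompute `landmark_dists = [dijkstra(graph, L) for L in landmarks]` once and reuse it across queries. Because the landmark bound never overestimates the remaining distance and is consistent on undirected graphs, the search still returns exact shortest distances.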
SimProcTC – The Design and Realization of a Powerful Tool-Chain for OMNeT++ Simulations
Cited by 18 (16 self)
In this paper, we introduce our Open Source simulation
Mitigating parameter variation with dynamic fine-grain body biasing
In International Symposium on Microarchitecture, 2007
Cited by 16 (2 self)
Parameter variation is detrimental to a processor’s frequency and leakage power. One proposed technique to mitigate it is Fine-Grain Body Biasing (FGBB), where different parts of the processor chip are given a voltage bias that changes the speed and leakage properties of their transistors. This technique has been proposed for static application, with the bias voltages being programmed at manufacturing time for worst-case conditions. In this paper, we introduce Dynamic FGBB (D-FGBB), which allows the continuous re-evaluation of the bias voltages to adapt to dynamic conditions. Our results show that D-FGBB is very versatile and effective. Specifically, with the processor working in normal mode at fixed frequency, D-FGBB reduces the leakage power of the chip by an average of 28–42% compared to static FGBB. Alternatively, with the processor working in a high-performance mode, D-FGBB increases the processor frequency by an average of 7–9% compared to static FGBB, or 7–16% compared to no body biasing. Finally, we also show that D-FGBB can be synergistically combined with Dynamic Voltage and Frequency Scaling (DVFS), creating an effective means to manage power.

