Java for On-line Distributed Monitoring of Heterogeneous Systems and Services
- The Computer Journal, 2002
Cited by 9 (9 self)
This paper presents the design of a Java-based Monitoring Application Programming Interface (MAPI) for the on-line monitoring of Web services. MAPI overcomes Internet platform heterogeneity and lets one observe the state of systems and applications during execution. MAPI collects monitoring data at different levels of abstraction, as required. At the application level, it dynamically interacts with the Java Virtual Machine (JVM) to gather detailed information about the execution of Java-based services. At the kernel level, it enables access to system indicators at the monitored target (either Java-based or external to the JVM), such as the CPU and memory usage of all active processes.
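The two monitoring levels described above can be illustrated with a minimal sketch. MAPI's own interface is not shown in this abstract, so the example assumes the standard java.lang.management API (which postdates the paper) as a stand-in; the class name MonitorSnapshot and the indicator names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of MAPI: samples a few JVM-level and
// system-level indicators through the standard platform MXBeans.
final class MonitorSnapshot {
    static Map<String, Number> sample() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();

        Map<String, Number> snapshot = new LinkedHashMap<>();
        // Application (JVM) level indicators.
        snapshot.put("heapUsedBytes", memory.getHeapMemoryUsage().getUsed());
        snapshot.put("liveThreads", threads.getThreadCount());
        // System level indicators; the load average is negative on
        // platforms that do not report it.
        snapshot.put("availableProcessors", os.getAvailableProcessors());
        snapshot.put("systemLoadAverage", os.getSystemLoadAverage());
        return snapshot;
    }

    public static void main(String[] args) {
        sample().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Per-process CPU and memory figures for processes outside the JVM, which the abstract mentions, are not covered by this sketch; they would typically need platform-dependent code.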
M-JavaMPI: A Java-MPI Binding with Process Migration Support
- In Proc. CCGrid 2002, 2002
Cited by 6 (2 self)
Several Java bindings to the Message Passing Interface (MPI) software have been developed in the past for high-performance parallel Java-based computing with message passing. None of them, however, addressed the issue of supporting transparent Java process migration for dynamic load distribution and balancing. This paper presents a middleware, called M-JavaMPI, that runs on top of the standard JVM to support transparent Java process migration and communication redirection. The middleware allows Java processes to migrate freely and transparently between machines to achieve load balancing, and migrated processes can continue communicating with other processes using MPI. Process migration is achieved by capturing and restoring the execution context at the Java bytecode level using the Java Virtual Machine Debugger Interface (JVMDI). Post-migration interprocess communication is enabled via a restorable Java-MPI API. Tests using a 16-node cluster have shown that our mechanism yields considerable performance gain through migration.
GEMS: Gossip-Enabled Monitoring Service for Scalable Heterogeneous Distributed Systems
- Cluster Comput
Cited by 2 (0 self)
Gossip protocols have proven to be an effective means of detecting failures in large distributed systems asynchronously, without the limitations associated with reliable multicasting for group communications. In this paper, we discuss the development and features of a Gossip-Enabled Monitoring Service (GEMS), a highly responsive and scalable resource monitoring service, to monitor health and performance information in heterogeneous distributed systems. GEMS has many novel and essential features such as detection of network partitions and dynamic insertion of new nodes into the service. Easily extensible, GEMS also incorporates facilities for distributing arbitrary system and application-specific data. We present experiments and analytical projections demonstrating scalability, fast response times and low resource utilization requirements, making GEMS a potent solution for resource monitoring in distributed computing.
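The gossip-style failure detection that GEMS builds on can be sketched as follows. This is an illustration of the general heartbeat-table technique only, not GEMS's implementation; the class GossipView, its methods, and the tick-based clock are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: one node's view in a gossip-style failure detector.
final class GossipView {
    private final Map<String, Long> heartbeats = new HashMap<>();
    private final Map<String, Long> lastUpdated = new HashMap<>();
    private final String self;
    private final long timeout; // ticks without progress before a node is suspected

    GossipView(String self, long timeout, long now) {
        this.self = self;
        this.timeout = timeout;
        heartbeats.put(self, 0L);
        lastUpdated.put(self, now);
    }

    // Called once per gossip round: advance this node's own heartbeat.
    void tick(long now) {
        heartbeats.merge(self, 1L, Long::sum);
        lastUpdated.put(self, now);
    }

    // Merge a peer's table, keeping the larger heartbeat for each node and
    // recording the local time at which progress was last observed.
    void merge(Map<String, Long> remote, long now) {
        for (Map.Entry<String, Long> e : remote.entrySet()) {
            Long known = heartbeats.get(e.getKey());
            if (known == null || e.getValue() > known) {
                heartbeats.put(e.getKey(), e.getValue());
                lastUpdated.put(e.getKey(), now);
            }
        }
    }

    // Copy of the heartbeat table, as it would be sent to a random peer.
    Map<String, Long> table() {
        return new HashMap<>(heartbeats);
    }

    // Nodes whose heartbeat has not advanced within the timeout.
    Set<String> suspected(long now) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Long> e : lastUpdated.entrySet()) {
            if (!e.getKey().equals(self) && now - e.getValue() > timeout) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

In use, each node would call tick once per round, send table() to a randomly chosen peer, and merge whatever tables it receives. Partition detection and the distribution of application data, which the abstract attributes to GEMS, are outside this sketch.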
Distributed particle simulation method on adaptive collaborative system
- Future Gener. Comput. Syst
Cited by 1 (0 self)
This paper presents a distributed N-body method based on an adaptive collaborative system model. The collaborative system is formed by the distributed objects in a cluster. The system can be reconfigured during the computation to fully utilize the computing power of the cluster. The method is implemented in Java and RMI to support distributed computing on heterogeneous platforms. A distributed tree structure is designed for communication-efficient computation of the method. The performance test shows satisfactory speedup and portability of the N-body method on both homogeneous and heterogeneous clusters. The collaborative system model is applicable to various applications and is expandable to wide-area environments. Keywords: N-body; Distributed object; Collaborative system; Java
Fast and Scalable Real-time Monitoring System for Beowulf Clusters
- Lecture Notes in Computer Science, 2001
Cited by 1 (0 self)
Fast real-time monitoring of system information is important for understanding parallel systems, especially the large cluster systems that have appeared recently. Making a monitoring system both fast and scalable is still a challenging task. This paper presents the design and implementation of a fast, real-time monitoring system called SCMS/RMS. This system is part of a more comprehensive cluster management tool called SCMS. SCMS/RMS is designed to be flexible, highly scalable, and efficient. The paper describes many techniques used to increase monitoring speed and achieve high scalability. An experiment conducted on a 72-node Beowulf cluster shows that SCMS/RMS is very fast and highly scalable.
Java-based On-line Monitoring of Heterogeneous Resources and Systems
- 7th Workshop HP OpenView University Association, 2000
Cited by 1 (1 self)
The diffusion of Web-based multimedia services and the emerging competition among service providers require enriching the Internet infrastructure with mechanisms to manage and control service quality and availability. These goals require monitoring mechanisms that ascertain the state of resources and applications in the global distributed system; such monitoring should be a core functionality of any infrastructure for Web service provision. The paper describes the design and implementation of a Java-based Application Programming Interface (API) to uniformly monitor heterogeneous resources and systems over the Internet. The monitoring tool operates at different levels of abstraction. On the one hand, it can instrument the Java Virtual Machine (JVM) to handle several types of events produced by Java applications. On the other hand, it can inspect machine-specific information (e.g., CPU and memory utilization) that is typically hidden by the JVM and available via platform-dependent modules (currently developed for Windows NT, Solaris and Linux). The implemented monitoring tool can be integrated in any Java-based Web service infrastructure and is currently part of the SOMA mobile agent platform.
GEMS: Gossip-Enabled Monitoring Service for Heterogeneous Distributed Systems
- http://www.hcs.ufl.edu/pubs/GEMS2002.pdf, submitted to Journal of Network and Systems Management
Cited by 1 (1 self)
Gossip protocols provide a scalable means for detecting failures in heterogeneous distributed systems in an asynchronous manner without the limits associated with group communication. In this paper, we discuss the development and features of a hierarchical Gossip-Enabled Monitoring Service (GEMS), which extends the gossip-style failure detection service to support resource monitoring. By dividing the system into groups of nodes and layers of communication, the GEMS paradigm scales well. Easily extensible, GEMS incorporates facilities for distributing arbitrary system and application-specific data. We present experiments and analytical projections demonstrating fast response times and low resource utilization requirements, making GEMS a superior solution for resource monitoring in distributed computing. We also demonstrate the utility of GEMS through the development of a simple dynamic load balancing service for which GEMS forms the information base.
Towards A Model-Based Autonomic Reliability Framework for Computing Clusters
One of the primary problems with computing clusters is ensuring that they maintain a reliable working state most of the time, to justify the economics of operation. In this paper, we introduce a model-based hierarchical reliability framework that enables periodic monitoring of vital health parameters across the cluster and provides for autonomic fault mitigation. We also discuss some of the challenges faced by autonomic reliability frameworks in cluster environments, such as non-determinism in task scheduling in standard operating systems such as Linux and the need for synchronized execution of monitoring sensors across the cluster. Additionally, we present a solution to these problems in the context of our framework, which uses a feedback-controller-based approach to compensate for the scheduling jitter in non-real-time operating systems. Finally, we present experimental data that illustrates the effectiveness of our approach.
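A feedback loop of the kind mentioned above might look like the sketch below. The abstract does not specify the controller design, so this assumes a simple proportional-integral scheme that shortens a sensor's sleep interval according to how late its previous wake-up was; the class JitterCompensator and the gains kp and ki are hypothetical.

```java
// Hypothetical sketch, not the paper's controller: adjusts the sleep interval
// of a periodic sensor so that wake-ups track a fixed schedule even when the
// scheduler wakes the task late.
final class JitterCompensator {
    private final long periodNanos;
    private final double kp;        // proportional gain on the latest lateness
    private final double ki;        // integral gain on accumulated lateness
    private double integralNanos;   // learned estimate of the steady wake-up delay

    JitterCompensator(long periodNanos, double kp, double ki) {
        this.periodNanos = periodNanos;
        this.kp = kp;
        this.ki = ki;
    }

    // latenessNanos is the actual wake-up time minus the intended wake-up
    // time. The returned value is how long to sleep before the next sample.
    long nextSleepNanos(long latenessNanos) {
        integralNanos += ki * latenessNanos;
        double sleep = periodNanos - kp * latenessNanos - integralNanos;
        return Math.max(0L, Math.round(sleep));
    }
}
```

With kp set to 1, each cycle cancels the lateness just observed, while the integral term gradually absorbs a constant wake-up delay so that the residual lateness shrinks from cycle to cycle. Synchronizing sensors across nodes, which the abstract also raises, is not addressed here.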

