GEMS: Gossip-Enabled Monitoring Service for Scalable Heterogeneous Distributed Systems

by Rajagopal Subramaniyan, Pirabhu Raman, Alan D. George, Matthew Radlinski
Venue: Cluster Comput
Citations:6 - 2 self

Documents Related by Co-Citation

9 Understanding fault tolerant distributed systems – F Cristian - 1991
4 The SAM-Grid Fabric services, talk at the – G Garzoglio
5 Distributed Computing: Fundamentals – H Attiya, J Welch - 2004
8 Experience producing simulated events for the DZero experiment on the SAM-Grid, presented at Computing – G Garzoglio, I Terekhov, J Snow, A Nishandar, S Jain - 2004
41 Supermon: A High-Speed Cluster Monitoring System – Matthew J. Sottile, Ronald G. Minnich - 2002
80 A Fault Detection Service for Wide Area Distributed Computations – Paul Stelling, Ian Foster, Carl Kesselman, Craig Lee, Gregor Von Laszewski - 1998
299 The Grid: Blueprint for a New Computing Infrastructure – I Foster, C Kesselman (eds.) - 1999
1999 The Anatomy of the Grid - Enabling Scalable Virtual Organizations – Ian Foster, Carl Kesselman, Steven Tuecke - 2001
1057 Condor - a hunter of idle workstations – M Litzkow, M Livny, M Mutka - 1988
193 The Ganglia Distributed Monitoring System: Design, Implementation And Experience – Matthew L. Massie, Brent N. Chun, David E. Culler - 2004
47 MRNet: A Software-Based Multicast/Reduction Network for Scalable Tools – Philip C. Roth, Dorian C. Arnold, Barton P. Miller - 2003
6 Proactive Fault Tolerance Using Preemptive Migration – C. Engelmann, G. R. Vallée, T. Naughton, S. L. Scott
54 Exascale computing study: Technology challenges in achieving exascale systems, DARPA-IPTO – P Kogge - 2008
2 Group file operations for scalable tools and middleware – M J Brim, B P Miller - 1619
3 A framework for scalable, parallel performance monitoring. Concurrency and Computation: Practice and Experience – A Nataraj, A D Malony, A Morris, D C Arnold, B P Miller
21 The International Exascale Software Project Roadmap – Jack Dongarra, Pete Beckman, Terry Moore, Patrick Aerts, Giovanni Aloisio, David Barkai, Taisuke Boku, Barbara Chapman, Xuebin Chi, Alok Choudhary, Sudip Dosanjh, Thom Dunning, Ro Fiore, Al Geist, Robert Harrison, Mark Hereld, Michael Heroux, Koh Hotta, Yutaka Ishikawa, Zhong Jin, Fred Johnson, Sanjay Kale, Richard Kenway, David Keyes, Bill Kramer, Jesus Labarta, Alain Lichnewsky, Bob Lucas, Satoshi Matsuoka, Paul Messina, Peter Michielse, Bernd Mohr, Matthias Mueller, John Shalf, David Skinner, Marc Snir, Thomas Sterling, Rick Stevens, Fred Streitz, Bob Sugar, Aad Van Der Steen, Jeffrey Vetter, Peg Williams, Robert Wisniewski, Kathy Yelick
5 OVIS-2: A robust distributed architecture for scalable RAS – J M Brandt, B J Debusschere, A C Gentile, J R Mayo, P P Pébay, D Thompson, M H Wong
1 Lightweight Online Performance Monitoring and Tuning with Embedded Gossip – Wenbin Zhu, Patrick G. Bridges, Arthur B. Maccabe - 2008
2 Major computer science challenges at exascale - 2009