Clusterpunch: a distributed mini-benchmark system for clusters

Other Tools

Other cluster distribution, monitoring, and visualization tools worth exploring:

Top Supercomputers

Courtesy of the Top 500 List (v200211, Hans Meuer, Erich Strohmaier, Jack Dongarra, Horst D. Simon)

Top 5 Supercomputers

1 Earth Simulator Center Yokohama
Earth-Simulator (NEC)
35,860 Gflops
5,120 CPUs | Research | Japan

2 Los Alamos National Laboratory Los Alamos
ASCI Q - AlphaServer SC ES45/1.25 GHz (Hewlett-Packard)
7,727 Gflops
4,096 CPUs | Research | USA

3 Los Alamos National Laboratory Los Alamos
ASCI Q - AlphaServer SC ES45/1.25 GHz (Hewlett-Packard)
7,727 Gflops
4,096 CPUs | Research | USA

4 Lawrence Livermore National Laboratory Livermore
ASCI White, SP Power3 375 MHz (IBM)
7,226 Gflops
8,192 CPUs | Research Energy | USA

5 Lawrence Livermore National Laboratory Livermore
MCR Linux Cluster Xeon 2.4 GHz - Quadrics (Linux NetworX/Quadrics)
5,694 Gflops
2,304 CPUs | Research | USA

Top Canadian Installation

196 High Performance Computing Virtual Laboratory Kingston
Fire 15k/Fire 6800/Sun Fire Link (Sun)
321 Gflops
336 CPUs | Research | Canada

Bottom 6 Supercomputers

How fast does your cluster have to be to make it onto the list? Here are the bottom six entries on the list (ranks 495 to 500).

495 France Telecom
SuperDome/HyperPlex (Hewlett-Packard)
195.8 Gflops
128 CPUs | Industry Telecomm | France

496 Government
SuperDome/HyperPlex (Hewlett-Packard)
195.8 Gflops
128 CPUs | Classified | UK

497 Government
SuperDome/HyperPlex (Hewlett-Packard)
195.8 Gflops
128 CPUs | Classified | UK

498 Government
SuperDome/HyperPlex (Hewlett-Packard)
195.8 Gflops
128 CPUs | Classified | UK

499 Government
SuperDome/HyperPlex (Hewlett-Packard)
195.8 Gflops
128 CPUs | Classified | UK

500 LG-EDS Systems
SuperDome/HyperPlex (Hewlett-Packard)
195.8 Gflops
128 CPUs | Industry Information Service | Korea

Big Brother Sean MacGuire, Robert-André Croteau

Big Brother monitors system and network services for availability. Your current network status is displayed on a color-coded web page in near-real time. When problems are detected, you're immediately notified by e-mail, pager, or text messaging.
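The kind of availability check described above can be sketched roughly as follows. This is only a minimal illustration and not Big Brother's actual implementation: the function names `check_tcp_service` and `status_color`, the two-color mapping, and the example host/port list are all hypothetical.

```python
import socket


def check_tcp_service(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or name lookup failed.
        return False


def status_color(is_up):
    """Map an availability result to a display color (illustrative mapping)."""
    return "green" if is_up else "red"


if __name__ == "__main__":
    # Hypothetical service list; replace with hosts and ports you actually run.
    services = [("localhost", 22), ("localhost", 80)]
    for host, port in services:
        up = check_tcp_service(host, port)
        print(f"{host}:{port} {status_color(up)}")
```

A real monitor would run such checks on a schedule and feed the results into a status page and a notification channel; the sketch covers only the check itself.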

Big Sister Thomas Aeby

Big Sister is a clone of Sean MacGuire's Big Brother. Big Sister can monitor networked systems, provide a simple view of the current network status, generate alarms on status changes, generate a history of status changes, and interoperate with other Big Sister or Big Brother instances or foreign network monitors (such as HP Openview).
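Generating alarms on status changes while keeping a history of those changes can be sketched as below. This is a hedged illustration of the idea only; the `StatusTracker` class, its method names, and the alarm message format are my own assumptions and do not reflect Big Sister's code or configuration.

```python
import time


class StatusTracker:
    """Track per-service status, record changes, and report alarms on change."""

    def __init__(self):
        self.current = {}   # service name -> last known status
        self.history = []   # list of (timestamp, service, old status, new status)

    def update(self, service, status, now=None):
        """Record a status; return an alarm string if it changed, else None."""
        now = time.time() if now is None else now
        old = self.current.get(service)
        if old == status:
            return None  # unchanged: nothing to record, nothing to alarm
        self.current[service] = status
        self.history.append((now, service, old, status))
        if old is None:
            return None  # first observation is recorded but not alarmed
        return f"ALARM {service}: {old} -> {status}"
```

For example, feeding the tracker `green`, `green`, `red` for one service yields a single alarm on the third update and two history entries (the first observation and the change).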

Ganglia toolkit UC Berkeley Computer Science (Millennium Cluster Project) & NPACI Rocks Cluster Group

Ganglia provides a complete real-time monitoring and execution environment that is in use by hundreds of universities, private and government laboratories, and commercial cluster implementors around the world. Ganglia is as simple to install and use on a 16-node cluster as on a 512-node cluster, as has been proven by its use on multiple 500+ node clusters.

PVM: Parallel Virtual Machine Jack Dongarra, Al Geist, Weicheng Jiang, Jim Kohl, Robert Manchek, Phil Papadopoulos, Vaidy Sunderam, Markus Fischer (project members list)

PVM (Parallel Virtual Machine) is a software package that permits a heterogeneous collection of Unix and/or Windows computers hooked together by a network to be used as a single large parallel computer. Large computational problems can thus be solved more cost-effectively by using the aggregate power and memory of many computers. The software is very portable. The source, which is available free through netlib, has been compiled on everything from laptops to CRAYs.

PVM enables users to exploit their existing computer hardware to solve much larger problems at minimal additional cost. Hundreds of sites around the world use PVM to solve important scientific, industrial, and medical problems, and it also serves as an educational tool for teaching parallel programming. With tens of thousands of users, PVM has become the de facto standard for distributed computing worldwide.

MPI: Message Passing Standard

MPI is a library specification for message-passing, proposed as a standard by a broadly based committee of vendors, implementors, and users.
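To give a feel for the message-passing style, here is a toy sketch in which numbered ranks cooperate purely by sending and receiving messages. It is loosely modeled on point-to-point operations such as MPI_Send and MPI_Recv, but it does not use MPI at all: it swaps in Python threads and queues, and the `Communicator` class and `parallel_sum` function are hypothetical names invented for this illustration.

```python
import queue
import threading


class Communicator:
    """Toy communicator with one inbox queue per rank (not a real MPI API)."""

    def __init__(self, size):
        self.size = size
        self._inboxes = [queue.Queue() for _ in range(size)]

    def send(self, obj, dest):
        """Deliver obj to the inbox of rank dest."""
        self._inboxes[dest].put(obj)

    def recv(self, rank):
        """Block until a message arrives in the inbox of the given rank."""
        return self._inboxes[rank].get()


def parallel_sum(size, values):
    """Each non-zero rank sends its partial sum to rank 0, which adds them up."""
    comm = Communicator(size)
    result = {}

    def worker(rank):
        partial = sum(values[rank::size])  # this rank's share of the data
        if rank == 0:
            total = partial
            for _ in range(size - 1):
                total += comm.recv(0)
            result["total"] = total
        else:
            comm.send(partial, dest=0)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["total"]
```

In a real MPI program the ranks would be separate processes, possibly on different nodes, and the library would carry the messages over the cluster interconnect.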

HPL: High-Performance Linpack Benchmark Petitet A, Whaley RC, Dongarra J, Cleary A Innovative Computing Laboratory, University of Tennessee

HPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.
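The idea behind a Linpack-style measurement can be sketched on a single machine as follows. This is not HPL: it is a plain, unoptimized Gaussian elimination with partial pivoting in pure Python, and the function names are hypothetical. The operation count 2n^3/3 + 2n^2 is the commonly quoted LINPACK figure and is an assumption here; HPL's own reporting formula may differ in the lower-order term.

```python
import random
import time


def solve_dense(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (in place)."""
    n = len(a)
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry in column k up.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_style_gflops(n, seed=0):
    """Time the solve of a random n x n system; return (Gflops, max residual)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    a0 = [row[:] for row in a]  # keep copies to check the answer afterwards
    b0 = b[:]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = time.perf_counter() - start
    residual = max(
        abs(sum(a0[i][j] * x[j] for j in range(n)) - b0[i]) for i in range(n)
    )
    flops = (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2  # assumed operation count
    return flops / elapsed / 1e9, residual
```

Python floats are 64-bit doubles, so the arithmetic matches the precision named above, but the resulting rate says far more about the Python interpreter than about the hardware. Figures like those on the Top 500 list come from tuned, distributed implementations.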

MRTG: Multi-Router Traffic Grapher Tobias Oetiker, Dave Rand

The Multi Router Traffic Grapher (MRTG) is a tool to monitor the traffic load on network links. MRTG generates HTML pages containing graphical images which provide a LIVE visual representation of this traffic. MRTG is based on Perl and C and works under UNIX and Windows NT.
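Assuming the traffic load is derived from a periodically polled byte counter on each link (an assumption on my part; the description above does not say how the data is collected), the rate between two polls could be computed as in this sketch. The function name, the 32-bit counter width, and the 300-second interval in the example are illustrative choices, not MRTG settings taken from its documentation.

```python
def traffic_rate_bps(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average bits per second between two readings of a byte counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << counter_bits
    # The modular difference handles a single counter wraparound.
    delta = (curr_octets - prev_octets) % modulus
    return delta * 8 / interval_s


if __name__ == "__main__":
    # 375,000 bytes transferred over 300 seconds.
    print(traffic_rate_bps(1_000, 376_000, 300))
```

The modular subtraction matters on busy links, where a fixed-width counter can roll over between two polls; without it the computed rate would come out negative.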

RRDTool: Round-Robin Database Tool Tobias Oetiker

If you know MRTG, you can think of RRDtool as a reimplementation of MRTG's graphing and logging features, orders of magnitude faster and more flexible than you ever thought possible. RRD is a system to store and display time-series data (e.g. network bandwidth, machine-room temperature, server load average). It stores the data in a very compact way that will not expand over time, and it presents useful graphs by processing the data to enforce a certain data density. It can be used either via simple wrapper scripts (from shell or Perl) or via frontends that poll network devices and put a friendly user interface on it.
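The "will not expand over time" property comes from round-robin storage: a fixed number of slots that are overwritten in a circle, with raw samples consolidated before they are stored. The sketch below shows that idea only; the `RoundRobinArchive` class, its parameter names, and the choice of averaging are my own simplifications and not RRDtool's file format or API.

```python
class RoundRobinArchive:
    """Fixed-size archive that averages `steps` raw samples into one point."""

    def __init__(self, rows, steps):
        self.rows = rows
        self.steps = steps
        self.slots = [None] * rows   # storage never grows beyond `rows`
        self.next_slot = 0
        self._pending = []

    def update(self, value):
        """Add a raw sample; store a consolidated point every `steps` samples."""
        self._pending.append(value)
        if len(self._pending) == self.steps:
            self.slots[self.next_slot] = sum(self._pending) / self.steps
            # Advance in a circle, so the oldest point is overwritten next.
            self.next_slot = (self.next_slot + 1) % self.rows
            self._pending = []

    def fetch(self):
        """Return stored points from oldest to newest, skipping empty slots."""
        ordered = self.slots[self.next_slot:] + self.slots[:self.next_slot]
        return [v for v in ordered if v is not None]
```

With `rows=3` and `steps=2`, feeding the values 1 through 10 leaves only the three most recent two-sample averages (5.5, 7.5, 9.5) in the archive, however long the updates continue.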

If you would like me to add your favourite tool to this list, let me know. If you are one of the developers or maintainers of any of the tools listed here and notice a mistake, inform me right away.