According to Marco Annaratone, general manager at Scyld, the company currently has about 200 installations of its commercialized implementation of the open source Beowulf clustering software for Linux. Some of those clusters have thousands of nodes, and many of the customers are Fortune 1000-class firms as well as government and academic research institutions.
At BioIT World, Scyld was touting the fact that Beowulf can run simulation software aimed at the exploding bioinformatics market. Specifically, Scyld demonstrated a Linux cluster running an application called Assisted Model Building with Energy Refinement (AMBER), which is a simulation package used for molecular dynamics. Scyld also rattled off a list of other parallel high performance computing (HPC) applications, such as BLAST, GROMACS, HMMer, mpiBLAST, NAMD, and Phylip, all of which do computationally intensive bioinformatics work such as molecular simulation and sequence analysis.
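Codes like these lean on MPI-style message passing to spread work across the compute nodes in a cluster. As a rough illustration of that pattern only, and not of any of the packages named above, here is a minimal Python sketch; it assumes the mpi4py library and an MPI launcher such as mpirun are available on the cluster, and the "work items" are made up:

    # Minimal sketch of how an MPI-style job carves work across cluster nodes.
    # Assumes mpi4py and an MPI runtime are installed; the work items are
    # invented for illustration, much as mpiBLAST splits a sequence database.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # this process's ID within the parallel job
    size = comm.Get_size()   # total number of processes across the nodes

    work_items = ["item-%d" % i for i in range(1000)]
    my_items = work_items[rank::size]   # each rank takes its own slice

    total = comm.reduce(len(my_items), op=MPI.SUM, root=0)
    if rank == 0:
        print("%d processes handled %d items" % (size, total))

Launched with something like mpirun -np 32 python job.py, the same script runs as one process per slot, spread across however many compute nodes the cluster offers.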
As previously reported, Beowulf Series 29 cz-5 is based on the Linux 2.4.27 kernel, like prior Series 29 releases. The latest release, however, weaves Ganglia, an open source, distributed system monitoring program with a graphical dashboard, into BeoMaster, the Beowulf cluster management software at the heart of Scyld's product that makes a cluster look like one big SMP box as far as systems management is concerned.
As with other clustering architectures, Beowulf has a master node in the cluster that runs all of the clustering and management features, while the compute nodes run a streamlined Linux kernel that has been stripped of all of the daemons a normal Linux distro carries. This architecture means that compute nodes spend their time computing, not managing. Ganglia was created at the University of California at Berkeley, and the standard setup of this software requires monitoring daemons running on all nodes.
Scyld has woven Ganglia into BeoMaster in such a way that the agents that already feed BeoMaster data from the compute nodes now feed data into the Ganglia management program as well. Ganglia supplements BeoMaster; it does not replace it. With Series 29 cz-5, Scyld has also added support for Penguin's own BladeRunner blade servers and has expanded its support for different InfiniBand interconnect products. And Scyld has just announced that it now offers high availability clustering and failover for the master node, so the failure of the master node doesn't crash a running job.
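For readers unfamiliar with Ganglia, the stock setup works roughly like this: each node runs a gmond daemon, and any gmond will hand back an XML dump of the metrics it knows about over its usual default TCP port, 8649. The sketch below is a loose illustration of reading that data; it assumes a vanilla Ganglia install, a master host reachable by that name, and a standard metric name, and it says nothing about how BeoMaster is actually wired to Ganglia internally:

    # Rough sketch of pulling cluster metrics from a Ganglia gmond daemon,
    # which by default answers on TCP port 8649 with an XML dump of hosts
    # and metrics. The hostname "master" and the metric picked out here are
    # assumptions for illustration, not details of Scyld's integration.
    import socket
    import xml.etree.ElementTree as ET

    def read_gmond_xml(host="master", port=8649):
        chunks = []
        with socket.create_connection((host, port)) as sock:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks)

    root = ET.fromstring(read_gmond_xml())
    for node in root.iter("HOST"):
        metrics = {m.get("NAME"): m.get("VAL") for m in node.iter("METRIC")}
        print(node.get("NAME"), "load_one =", metrics.get("load_one"))

The point of Scyld's integration, as described above, is that the compute nodes don't have to run that per-node daemon at all; the data BeoMaster already collects is funneled into Ganglia's dashboard instead.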
While Scyld has been focused on the HPC market for the past several years, Annaratone says the features that have been put into Beowulf make it a much easier sell into enterprises. He says Scyld is not trying to chase down the high-profile, multi-teraflops supercomputing deals that get all the press, but rather has created a product that is suitable for big companies that need to do market analysis and simulations and that might need a few hundred nodes.
Scyld's Beowulf is, in essence, its own Linux distribution, and, more importantly for commercial enterprises that don't have cheap and eager grad students as high-tech labor, it installs like a single system, administers like a single system, and, for the most part, is programmatically used as a single system even though it is a cluster with dozens or hundreds of nodes.
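To make the single-system point concrete: from the master node, an administrator can inspect and drive every compute node without logging into any of them. The short sketch below assumes the bpsh and bpstat utilities from the BProc toolset that, as we understand it, underpins Scyld's environment; treat it as illustrative rather than as documented product behavior.

    # Illustrative sketch of the "administer like a single system" idea: run
    # commands on compute nodes from the master as if they were local jobs.
    # The bpsh/bpstat commands and the node number are assumptions drawn from
    # the BProc toolset, not details confirmed by Scyld for this release.
    import subprocess

    # Show node status (node numbers, up/down state) as seen from the master.
    subprocess.run(["bpstat"], check=True)

    # Run "uname -r" on compute node 0; the output comes back to the master's
    # terminal as if the command had run locally.
    subprocess.run(["bpsh", "0", "uname", "-r"], check=True)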
With early Linux clusters using cheap interconnects, cheap boxes, and open source software, it was easy to demonstrate that a Beowulf cluster offered very good flops for the dollar. But the management costs for these clusters were much higher than for parallel clusters of big RISC/Unix SMP boxes.
The addition of the Ganglia management features is, therefore, something that companies engaged in bioengineering and other corporations that have no experience with the nitty-gritty of Linux clusters, and no desire to learn it, really want. For many companies, in fact, a Linux cluster will be the first supercomputer they ever own, and maybe the only one. So getting the cluster provisioning and management tools right is a big deal.
"Small and mid-sized biotech companies do not have the resources to hire a cluster administrator," said Annaratone. "They want to hire people to do drug discovery."
What Scyld is not going to do, says Annaratone, is prebundle its clustering software exclusively on Penguin's own X86 servers. While the company will sell so-called bright cluster configurations to customers who want racks of ready-to-go cluster nodes, selling exclusively on Penguin's iron would limit the company's addressable market.
Back in February, as Scyld was getting ready for the general availability of the cz-5 release, the company said it expected to sell the software for $3,500 on the master node and $500 per compute node. As we were going to press, Annaratone said the pricing was just being finalized and that a master node would cost several thousand dollars and the compute nodes would cost several hundred. This price includes the software license and support for one year.
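If the final pricing lands where those February figures suggested, a hypothetical cluster with one master node and 63 compute nodes would work out to roughly $3,500 plus 63 times $500, or about $35,000 for the first year of licenses and support.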
When asked what he thought about Sun Microsystems’s public relations campaign to shift companies away from building and owning their own clusters and toward using a shared utility model, Annaratone said Scyld and Penguin did not have any plans to build out an alternative to the Sun Grid–although Penguin clearly has most of the pieces to do so if it wanted to.
"We personally believe that intranet grids are the wave of the future," says Annaratone. "There will be business opportunities here. Extranet and planetary grids are right now a vision."