Intel Linux Beowulf Class Cluster

by Brian D. Haymore

Introduction:

Linux has for some time been working its way into various parts of the University of Utah campus. It has been used as an operating system for web servers, dialup servers, database servers, and mail servers, among many other uses. The Center of Excellence in Space Data and Information Sciences at NASA's Goddard Space Flight Center is credited with the first effort to bring Linux into the compute server world: they networked a cluster of Intel based PCs together to act as one greater machine. Their project was named Beowulf, and all similar clusters are now referred to as Beowulf class clusters. Linux runs on a vast variety of hardware including x86 (Intel, AMD & others), Sparc, UltraSparc, PPC, DEC Alpha, MIPS and many others.

The attraction to using Linux and Intel systems for a cluster is primarily price/performance. PC, or commodity, hardware can be purchased at a very affordable price from many different vendors. The low price/performance of these clusters is quite appealing to computational scientists, but it is unclear which applications can benefit from this technology. Note that a large contributor to the high price of traditional parallel supercomputers is the interconnect system. To keep a Beowulf cluster price/performance competitive, commodity parts have to be used for the interconnect as well. Given the reduced performance of a commodity interconnect, applications that require low latency or high bandwidth between processors may not run effectively on Beowulf clusters.

To allow our users to gain experience with this technology, CHPC has built a Beowulf Class Linux Cluster from 33 Intel based PCs. We have named our creation ICE Box.

Hardware:

ICE Box is built from a dual processor master node and 32 single processor compute nodes. The system has a total of 8.5 gigabytes of memory and over 260 gigabytes of disk space. The master node is a dual processor Intel Pentium II 450MHz machine with 512MB of RAM. It houses 56GB of user disk space as well as 5GB of system disk space. Three 100 megabit Ethernet cards connect the master node to the slave nodes as well as to the outside world. The slave nodes are all mirror images of each other: each is built around an Intel Pentium II 350MHz processor with 256MB of RAM and a 5GB hard drive, and each is connected to the cluster via a 100 megabit Ethernet card. All of the network cards are interconnected through an Ethernet switch, which provides all cluster communication.

The system has a front end master node. The end user logs into the master node, develops and tests code there, and submits it to a scheduling system to be run on the slave compute nodes. End users are not allowed to log directly into the slave nodes. This design keeps the system simple both to administer and to use.

Software:

We are running Linux, a freely developed UNIX-like operating system, as the OS for ICE Box. This decision was made in consultation with users testing the system.

We have installed MPICH 1.1.1, an implementation of the MPI (Message Passing Interface) standard, to provide a means of writing parallel code. This package offers libraries for C, Fortran 77, Fortran 90 and C++, so end users can choose which language they wish to code in. We have also installed PVM (Parallel Virtual Machine) version 3, which is available free through netlib.
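As a quick sketch of the typical workflow, an MPI program can be compiled with the MPICH compiler wrappers and launched with mpirun. The source files hello.c and hello.f are hypothetical examples, and the wrapper names assume a standard MPICH installation on the user's path:

      # Compile a C MPI program with the MPICH compiler wrapper.
      mpicc -o hello hello.c

      # Compile a Fortran 77 MPI program.
      mpif77 -o hello_f hello.f

      # Launch the program on four processors.
      mpirun -np 4 ./hello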

We have three compilers for our users to choose from on our system. The first is the GNU Compiler suite, which includes ANSI C, C++ and Fortran 77 compilers. We also have the EGCS Compiler Suite, which includes all of the above languages and adds support for Fortran 90. In addition to these two free compilers we have also opted to offer the Portland Group Compiler suite. The Portland Group Compiler offers superior performance with Fortran code as well as other optimizations that make it more desirable than the GNU and EGCS compilers in many situations. In most situations we recommend users choose the Portland Group Compiler.
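For reference, the command names below are the ones the GNU and Portland Group suites typically install; the exact names and paths on ICE Box may differ, so treat this as a sketch rather than a definitive listing:

      # GNU suite
      gcc -O2 -o prog prog.c      # ANSI C
      g++ -O2 -o prog prog.cc     # C++
      g77 -O2 -o prog prog.f      # Fortran 77

      # Portland Group suite
      pgcc -O2 -o prog prog.c     # C
      pgf77 -O2 -o prog prog.f    # Fortran 77
      pgf90 -O2 -o prog prog.f90  # Fortran 90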

Beowulf Cluster Practicality & Performance:

After ICE Box was assembled and ready to run we began a testing phase. The goal was not only to determine the performance of the cluster, but also to learn its characteristic strengths and weaknesses. We quickly learned that we would need to use the network very efficiently. Even with the network being a potential bottleneck for the system, we have achieved very good performance. Carleton Detar from the physics department and Jerry Schuster from the geophysics department are helping us test the system. Physics was able to get their code up and running first and began the tests. The results were very satisfactory for this application, and Carleton reported that he was able to port his code to this platform with very little effort. Geophysics finished next and began testing their code. After some initial debugging, Geophysics reported that their code ran with 90% efficiency across the cluster. After these two strong starts we began a polishing phase that is helping us learn more about the characteristics of ICE Box.

Conclusion:

We have only begun to understand all of the possibilities that a Beowulf Class Cluster has to offer. The cost effectiveness of commodity hardware combined with the free and open nature of Linux has produced a very viable system for people looking for inexpensive ways to achieve supercomputing performance.

Many more applications need to be tested before we can decide which problems are suitable for this architecture. People who wish to experiment with this platform should contact our director, Dr. Julio Facelli by sending him email at facelli@chpc.utah.edu.


PBS Tips

by Kristina Bogar, CHPC Consultant

On November 2, 1998, a new batch system referred to as PBS was installed on raptor, the Onyx2/Origin2000 system, and on inca/maya, the PowerCHALLENGE cluster. PBS, the Portable Batch System, replaced the previous batch system, DQS. Since PBS's installation, we have collected some helpful tips on writing batch scripts for this queueing system (a complete sample script appears after the list):

  1. If the first line in the script is something other than a #PBS command or a # comment, PBS will ignore the rest of the script. If the first line of the script is a blank line, PBS will also ignore the rest of the script.
  2. Sample PBS batch scripts can be found at: http://www.chpc.utah.edu/software/docs/pbs.html.
  3. A list of PBS commands can be found at the bottom of: http://www.chpc.utah.edu/software/docs/pbs.html.
  4. Scripts can only be submitted from the batch domain they are to run on (i.e., jobs to be run on raptor must be submitted from raptor, and jobs to be run on inca/maya must be submitted from inca).
  5. PBS will fail if you have tty-dependent commands in your .profile, .cshrc, .login, and .logout files. Under the heading "Creating a PBS Batch Script" at http://www.chpc.utah.edu/origin2000 or http://www.chpc.utah.edu/power_challenge, there are suggestions on how to modify these dot files to avoid problems with PBS.
  6. If you receive the following message: "stty:: Function not implemented", you can ignore it as long as you have changed your dot files so that they do not invoke any tty commands while in batch mode.
  7. If you receive a message similar to the following one: "PBS: job killed: ncpus 16.06 exceeded limit 16", you will need to request one more CPU than you actually use. This fix can be accomplished by the following lines in your script:
      #PBS -l ncpus=17,walltime=2:00:00
      mpirun -np 16
      For a Gaussian job, the number of CPUs is not specified on the line which runs Gaussian. Instead, a directive is used in the Gaussian input file. This directive takes the format %NProc=#, where # specifies the number of CPUs Gaussian is allowed to use (note that it may use fewer than this number). In the example above, the line
      %NProc=16
      is included in the input file. If you have told PBS that you are using two more CPUs than you really are (i.e. you asked for 18 but are only using 16) and PBS still kills your job because PBS thinks you are using more CPUs than requested, increase the PBS parameter ncpus to three more than needed and contact the CHPC consultants. Unless a consultant has told you otherwise, do not request more than three CPUs above your actual need.
  8. The following line defines the resources needed by the job:
      #PBS -l ncpus=4,walltime=1:00:00
      This job requires four CPUs and a maximum of one hour to run. The default for ncpus is one and for walltime is 60 minutes. If the user does not explicitly specify these quantities, the default values will be used by PBS.
  9. Walltime, not total CPU time, is used to monitor jobs. If the flag cput is set but the walltime is not specified, PBS will kill the job after 60 minutes, no matter what value cput has been assigned.
  10. In case the system where your code is running crashes or is rebooted, you will need to be aware of the qsub flag -r. If -r y is included in your script, your program will be rerun after the system is brought back on-line. If you are not sure you want your code rerun automatically, you will need to use -r n. In the case of checkpointing, the user should make arrangements in their code for the new data to be appended to the existing data or stored in a new file. The default for this flag is y (yes).
  11. If you're having trouble with PBS, the first troubleshooting technique we suggest is to type "qsub" on the command line. Without any arguments, the qsub command accepts standard input, so it will wait for you to type something. Type:
      /usr/bin/printenv
      then press the enter key, and then type ^d (control-d), which ends your input. This should create a file in your current directory called STDIN.oXXX, where XXX is a job number. This file will tell you all about your environment.

If you need further help, send the file to consult@chpc.utah.edu.
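Putting several of these tips together, here is a minimal sketch of a PBS batch script for a 16-way MPI job. The executable name myprog and the resource values are illustrative assumptions, not site defaults:

      #PBS -l ncpus=17,walltime=2:00:00
      #PBS -r n

      # Move to the directory the job was submitted from.
      cd $PBS_O_WORKDIR

      # Run a 16-way MPI job; one extra CPU is requested above (see tip 7).
      mpirun -np 16 ./myprog

The first line is a #PBS directive (tip 1), one more CPU is requested than mpirun uses (tip 7), and -r n prevents an automatic rerun after a system crash (tip 10). The script would be submitted with qsub from the machine it is to run on (tip 4).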

CHPC Web Site

by Stefano Foresti, CHPC Staff Scientist

CHPC's web site has undergone a major upgrade, as you can see at: http://www.chpc.utah.edu/

The primary goal of the site re-design was to make information easy to find, accommodating the different requirements and backgrounds of our users. The CHPC Home Page contains news relevant to frequent visitors and users of CHPC systems, and is a gateway to other information with minimal clicks.

The user interface has been designed with a uniform look and feel that provides a sense of context and optimizes the content on a page, its readability, and its download time. The layout renders properly in different browsers. Frames are used in certain sections to improve navigability, but the content still displays on older browsers that do not support them.

The information has been structured so that relevant information can be found in little time and with minimal effort, mainly by navigating the links within the pages and, if needed, with the aid of a search engine.

The information in the site has significantly expanded and more information and interactive features will be added in the future. We welcome feedback on the functionality and content of the web site at webmaster@chpc.utah.edu.

The web site has also been designed to allow for easier administration and maintenance. The pages are developed with templates that provide a uniform layout and global navigation. Server-side includes allow global changes to be propagated throughout the site instead of editing the pages individually. If you need advice on planning and designing a large or complex web site, please contact Stefano Foresti at: stefano@chpc.utah.edu.

Proposals and allocation requests for computer time on the IBM SP, the SGI Origin 2000 and the SGI Power Challenge are due by March 1st, 1999. We must have this information if you wish to be considered for an allocation of time for the Spring 1999 quarter and/or the subsequent three quarters. This is for additional computer time above the default amount given to each account for the quarter.

  1. Information on the allocation process and relevant forms are located on the web at http://www.chpc.utah.edu/policies/alloc_policy.html. Also, please use the form located at http://www.chpc.utah.edu/policies/allocation.html when you send in your request, as only those requests following this format will be considered.
  2. You may request computer time for up to four quarters.
  3. Spring Quarter (Apr-June) allocations go into effect on April 1, 1999.
  4. Only faculty members can request additional computer time for themselves and those working with them. Please consolidate all projects onto one proposal to be listed under the requesting faculty member.
  5. Send your proposal and relevant information to the attention of: DeeAnn Raynor, Admin. Officer, CHPC, 405 INSCC, FAX: 585-5366, email: admin@chpc.utah.edu, tel: 581-5253.

Note: CHPC uses a calendar quarter system, and deadlines have not changed as a result of the semester conversion.

MP-000938 COMPUTER PROFESSIONAL (ELECTRONIC SECURITY ANALYST) Bachelor's degree in Computer Science, Information Systems, Technical Communications or related field or equivalency, three years experience which includes all phases of Information Technology and electronic security implementation required. Demonstrated human relations skills and experience in a large research university environment preferred. Works with University entities in planning security implementations, intrusion detection, intrusion response, and regular security assessments. Maintains security related WWW content, develops and presents security related educational materials. Applicants must submit a University of Utah Application for Employment (available on our Web Site at www.Personnel.utah.edu). Grade: BB1. Posted: 12-23-98. Review of applications will begin 12-29-98 and continue until position is filled. (Center for High Performance Computing).

MP-000939 COMPUTER PROFESSIONAL (APPLICATION SUPPORT) Bachelor's degree in Computer Science, Information Systems or related field or equivalency, two years experience which includes compilation, installation, and configuration of applications in a UNIX environment, demonstrated effective oral and written communication skills required. Experience with C, C++, FORTRAN, Java, and shell programming and Solaris, AIX and IRIX operating systems preferred. Duties include documentation, license server management, application problem diagnosis and limited user support. Applicants must submit a University of Utah Application for Employment (available on our Web Site at www.Personnel.utah.edu). Grade: BB1. Posted: 12-23-98. Review of applications will begin 12-29-98 and continue until position is filled. (Center for High Performance Computing).

In the CHPC News, Vol. 9, No. 4, September 1998, some comments were made regarding the DQS Batch system and its developers. These views in no way reflect the opinion of CHPC, and we humbly retract those statements. We sincerely apologize to the Florida State University Supercomputer Computations Research Institute.
