CHPC News, February 2001, Volume 12, Number 2
by Jimmy Miklavcic, Digital Video Specialist
The Center for High Performance Computing (CHPC) has been investigating new technologies in videoconference services during the past two years. CHPC has placed into service the second Access Grid Node in the Lecture Hall (RM 110) of the Intermountain Network and Scientific Computation Center (INSCC). The first node was installed at the Scientific Computing and Imaging Institute in Merrill Engineering.
Components of an AG Node (image courtesy of Argonne National Lab)
The Access Grid (AG) is part of the Alliance Access Grid project. The Access Grid Node (AGN) is a designed space that supports the high-end audio/video technology needed to provide a compelling and productive research experience. It comprises large-format multimedia display, presentation and interaction environments, interfaces to grid middleware and interfaces to remote visualization environments. The Access Grid supports distributed meetings, collaborative sessions, seminars, lectures, tutorials and training. It uses multicast protocols across the Abilene Network (Internet2).
The AG sets itself apart from normal desktop-to-desktop tools by being a group-to-group communications environment. This node will enable both formal and informal group interactions. As participants in the Alliance Access Grid project, we will assist in developing a prototype environment to conduct meetings, site visits and other research and educational events.
All AG Nodes have the same basic components so that all sites can communicate within the same environment. Each site's basic configuration includes the following hardware specifications.
- Three video projectors aligned and registered visually into a single display device.
- Three video cameras, one to cover the speaker, one the audience and one the node operator.
- Three microphones, one for the speaker, one for the audience and one for the node operator.
- One PC-based computer system, running Linux, for video capture (agvideo.chpc.utah.edu).
- One PC-based computer system, running Linux, for audio broadcast and receive (agaudio.chpc.utah.edu).
- One PC-based computer system, running Windows 98, for audio control (agcontrol.chpc.utah.edu). This system communicates with and controls the Gentner AP400 Echo Canceller.
- One PC-based computer system, running Windows 2000, for the display system (agdisplay.chpc.utah.edu).
So how does all this work in a cohesive manner? The systems are brought together by a web-based software system called Virtual Venues, developed at Argonne National Laboratory. This software allows an AG operator (a.k.a. node op) to control, manipulate, broadcast and receive several audio and video streams, distributed PowerPoint presentations and remote window manipulation. When the AGN is first activated, a connection to the Access Grid Lobby is established. This virtual lobby is the main entrance to several other venues named after local brewpubs in the Portland, Oregon area. To attend an event, the node op selects among the various virtual rooms and creates a new connection with the other participating sites that have entered the same virtual room.
Most videoconference systems only allow three to four simultaneous views of meeting participants. This system, borrowing from MBone technology, uses VIC and RAT for the multimedia communications. VIC is the video capturing, broadcasting and receiving software of the AGN video system, and RAT (Robust Audio Tool) is the audio component. VIC allows us to display numerous video streams on the three-projector display. Each site broadcasts a minimum of three video streams at nearly 30 frames per second (fps) and at a rate of 300 kilobits per second (Kbps). If we participate in a five-institution meeting, we can view fifteen simultaneous video streams, including our own.
RAT simply takes a stereo signal, digitizes it into a multicast packet stream and sends it to the other sites at a rate of 300 Kbps with an 8 kHz sample rate. We can also listen to the other sites. Currently all microphones pass through our Gentner AP400 Echo Canceller. It strips our portion of the signal from the incoming audio stream so that we do not hear ourselves; without it, we would hear an echo.
The display system uses the three projectors as a single display. This allows the operator to place any window anywhere on the display space to create an intelligible viewing of the event for the audience. Additional display tools include Distributed PowerPoint (DPPT) and Virtual Network Computing (VNC).
The DPPT allows the host site to present its PowerPoint presentation and have full control of its timing as it is displayed on all the remote sites. The VNC is a distributed shell that allows the host site to display a system window on all remote sites to demonstrate programming and coding techniques and examples.
An AGN operator at each site controls all this interaction. Communication among these important people occurs on a back channel, where they synchronize the event as they chat through a MUD system. This process is analogous to stagehands communicating with each other over two-way headsets during a theater performance. The node op is responsible for running a trouble-free collaborative event, allowing participants to focus on the content of the event.
Here in the INSCC we have made some enhancements to the lecture hall to create an environment in which it is easy to use the node and participate in AG events. We have removed the white boards on the north wall and replaced them with a combined writing and projection surface, so the entire wall can be used simultaneously as a white board and a projection screen. Speakers can write comments or add emphasis directly on the wall while a projected image is displayed. The board can then be erased cleanly. The surface cleans with water, so no special cleaners are required. The top of the displayed image is placed close to the ceiling to allow space for writing below it.
A camera mounted on the south wall is focused on the speaker area and can be remotely controlled to zoom and pan. A second camera is mounted on the north wall focused on the audience and a third is used in the operator's control room. We'll be able to add other video sources such as DVD player, VTR or additional cameras as needed.
Microphones will be placed throughout the room so that participants may speak from their seats. The speaker will be outfitted with a wireless microphone. We will be able to incorporate other audio sources as well, such as CD and tape. Speakers are mounted in the ceiling just above the front row and in the back. There is some research being conducted into sending surround sound audio over IP, so we are preparing early for that possibility.
We have hired a part-time AG operator who will be responsible for operating and maintaining our AG Node. Sam Liston is a graduate of the University of Utah Communications Department and currently works for Channel 2 as a camera operator. He will announce upcoming AG events and schedule an AG Virtual Venue when we are hosting an event. The lecture hall must still be scheduled through the CHPC offices in the INSCC building.
We are looking forward to seeing this room used often. This AGN is available to any person or group that has an interest in using this technology to collaborate with colleagues around the nation. In fact, work has begun to make the Access Grid Network an international infrastructure.
For more information please visit the following web sites.
by Anita Orendt, Molecular Science, CHPC
CHPC now has a stand-alone web-based installation of BLAST (Basic Local Alignment Search Tool) for University researchers. BLAST, developed by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine of the National Institutes of Health, is a sequence similarity search tool designed to support analysis of nucleotide and protein sequences. In the past, researchers have accessed the NCBI search site. However, with increased traffic the turn-around time for a search has become unpredictable and in some cases very long. Therefore, it was requested that CHPC maintain an independent site reserved for use by the university. All the software tools to establish such a site were provided by the NCBI.
The CHPC installation of BLAST is on visualice.chpc.utah.edu. No CHPC accounts are necessary to access the BLAST search engines. To access, users need only open their web browser from within the utah.edu domain and go to http://visualice.chpc.utah.edu/. When this site is opened, the user will see six buttons dealing with BLAST. The top (blue and white) button is a link to the NCBI BLAST pages. This is a great source of information on BLAST and its use, including a frequently-asked-questions link, a BLAST tutorial, a BLAST usage course, and links to conducting BLAST searches on the NCBI installation. The other five buttons are links to the actual search input pages. The only search tool that has been rigorously tested is BLAST. MegaBLAST, PsiBLAST [for both Psi (position specific iterated) and Phi (pattern hit initiated) BLAST], and Blast2 are all functional but testing has been minimal. RPS-Blast (Reverse Position Specific) requires a different database format than the other installations and has not yet been implemented. Please contact me if you can provide a good test case for Psi/Phi Blast or have need of RPS Blast.
To do a basic BLAST search, the user must input a search sequence, specified either as a sequence in FASTA format or as an Accession or GI (GenInfo Identifier) number. The input sequence can be either an amino acid or a DNA sequence. The user must also choose the program to use from the following choices:
- Blastp - searches an amino acid query sequence against a protein database.
- Blastn - searches a DNA (nucleotide) query sequence against a nucleotide database.
- Blastx - searches a nucleotide query sequence translated in all reading frames against a protein database.
- Tblastn - searches a protein query sequence against a dynamically translated nucleotide database.
- Tblastx - searches a six-frame translation of a nucleotide query sequence against a six-frame translation of a nucleotide sequence database; not allowed with the large non-redundant database on the web-based versions due to time requirements.
The user must also choose the database, making sure that it is a proper database for use with the program chosen, from the following choices:
Nucleotide databases: nt (non-redundant), drosoph.nt, est, est_mouse, est_human, est_others, mito.nt, sts, vector, pdbnt, alu.n, gss, htgs, ecoli.nt, yeast.nt, and patnt.
Protein databases: nr (non-redundant), drosoph.aa, swissprot, pdbaa, alu.a, ecoli.aa, yeast.aa, pataa, mito.aa.
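The pairing rules above (each program expects one kind of query and one kind of database) can be summarized in a small lookup table. The sketch below is illustrative only and is not part of BLAST or the CHPC web pages: the function names are hypothetical, and it assumes that the "large non-redundant database" barred for Tblastx is the nucleotide database nt.

```python
# Query type and database type expected by each program, as listed in the article.
PROGRAMS = {
    "blastp":  ("protein", "protein"),
    "blastn":  ("nucleotide", "nucleotide"),
    "blastx":  ("nucleotide", "protein"),
    "tblastn": ("protein", "nucleotide"),
    "tblastx": ("nucleotide", "nucleotide"),
}

# Database names as given in the article.
NUCLEOTIDE_DBS = {
    "nt", "drosoph.nt", "est", "est_mouse", "est_human", "est_others",
    "mito.nt", "sts", "vector", "pdbnt", "alu.n", "gss", "htgs",
    "ecoli.nt", "yeast.nt", "patnt",
}
PROTEIN_DBS = {
    "nr", "drosoph.aa", "swissprot", "pdbaa", "alu.a", "ecoli.aa",
    "yeast.aa", "pataa", "mito.aa",
}


def database_type(database):
    """Return 'nucleotide' or 'protein', or None for an unknown database name."""
    if database in NUCLEOTIDE_DBS:
        return "nucleotide"
    if database in PROTEIN_DBS:
        return "protein"
    return None


def check_search(program, query_type, database):
    """Return a list of problems; an empty list means the choices are consistent."""
    program = program.lower()
    if program not in PROGRAMS:
        return [f"unknown program: {program}"]
    want_query, want_db = PROGRAMS[program]
    problems = []
    if query_type != want_query:
        problems.append(f"{program} expects a {want_query} query")
    db_type = database_type(database)
    if db_type is None:
        problems.append(f"unknown database: {database}")
    elif db_type != want_db:
        problems.append(f"{program} expects a {want_db} database")
    # Assumption: the barred "large non-redundant database" is nt.
    if program == "tblastx" and database == "nt":
        problems.append("tblastx is not allowed with the non-redundant database")
    return problems


if __name__ == "__main__":
    print(check_search("blastx", "nucleotide", "swissprot"))
    print(check_search("blastn", "nucleotide", "nr"))
```

A check like this only catches mismatched choices; it says nothing about whether the sequence itself is valid.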
There are also search options that can be specified for the sensitivity of the search, though in most cases the default options are sufficient. The user is directed to the NCBI help pages for information on setting these parameters.
Users of the NCBI site will notice that the names of the databases are slightly different from those used at the NCBI site; these are the names as provided on the FTP site. Updated databases are downloaded from the NCBI's FTP site overnight on a daily basis. The one exception is the est database; because of its size, the update is still downloaded and installed manually, so it only gets done on days that I am in (normally each weekday). Tools exist to support and format user-developed databases.
Once an input sequence is specified along with the program and database, the user need only select the search button. An output window will appear, with the BLAST citation and the database information. The page will be completed once the search is finished. The output window has a graphical display of the distribution of hits found, along with a listing of the hits with significant alignments and links to the matching sequences' Entrez entries. Statistical information about the search and information on the date and size of the database are found at the end of the output window.
If there are any problems or questions dealing with the use of the CHPC site, please feel free to contact me at Orendt@chpc.utah.edu or at 587-9434.
by Anita Orendt, Molecular Science, CHPC
NWChem 4.0 is now installed on raptor and installation on icebox should follow shortly.
Version 4.0 has several new modules, including the ONIOM method for treating different portions of a molecular system at different levels of theory, COSMO for solvation treatments, NBO - Natural Bond Orbital analysis, and SODFT - Spin orbit DFT. Users are directed to the NWChem webpages at:
http://www.emsl.pnl.gov:2080/docs/nwchem/nwchem.html for more information on NWChem 4.0.
On February 8-9, 2001, CHPC will host the "2nd Cluster Computing in the Sciences" conference. The conference will take place in the new Heritage Center near the University Guest House.
The conference will provide attendees a chance to hear speakers from national labs, universities, the computing industry and the oil industry. CHPC will present tutorials on building your own Beowulf cluster, an introduction to MPI and an introduction to OpenMP.
For more information about this conference, please see http://www.chpc.utah.edu/cc
In March 2001 CHPC expects delivery of a new high performance computing platform: an 8-node (32-processor) COMPAQ ES40 Sierra Cluster. This is a smaller version of the 45 million dollar "Terascale" computing system, funded by the NSF, which the Pittsburgh Supercomputing Center is installing.
Each of the 8 SMP nodes will contain four 667 MHz Alpha 21264 CPUs, each with 64 KB of I-cache and 64 KB of D-cache on chip and 8 MB of L2 cache per CPU.
Each node will have 8 GB of memory, for a total of 64 GB. The memory is 100 MHz ECC.
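The totals quoted above follow directly from the per-node figures. As a quick sanity check, here is a minimal sketch; the names are illustrative, and the 16-node case assumes that any nodes added later would have the same configuration as the first eight, which the article does not state.

```python
# Per-node figures from the article.
CPUS_PER_NODE = 4
MEMORY_GB_PER_NODE = 8


def cluster_totals(nodes):
    """Return total processors and memory for a cluster of identical nodes."""
    return {
        "processors": nodes * CPUS_PER_NODE,
        "memory_gb": nodes * MEMORY_GB_PER_NODE,
    }


if __name__ == "__main__":
    print(cluster_totals(8))    # the system as purchased
    print(cluster_totals(16))   # hypothetical expansion with identical nodes
```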
The Compaq comes with a 16-port Quadrics Supercomputers World (QSW) switch. Each of the 8 nodes has a switch card that MPI programs will use. With our current purchase, we can expand to 16 nodes without a switch upgrade. Features of the switching interconnect are:
- Elan-3 PCI adaptor:
  - DMA driven
  - Get and put
  - 20 MB/s/rail bi-directional
- Elite "fat tree" switch:
  - 8-way x-bar chips
  - 16 ports
  - Up to 20 m cables
  - 0.035 µs switch latency
- Multiple virtual circuits and load balancing.
- < 3 µs DMA/Shmem
- < 5.5 µs MPI
We will be running the Tru64 UNIX 5.0 operating system, and plan to have the full suite of compilers and message passing libraries available. The Compaq MPI is highly optimized to make the most of the Sierra Cluster by communicating over shared memory and memory channels, and is compatible with MPICH 1.1.1.
The Compaq will come with software from QSW called Resource Management Software (RMS). RMS delivers a rich suite of tools, and in many ways portrays the cluster as a single machine with many processors.
We will also have the Compaq Portable Math Library (CPML) and the Compaq Extended Math Library (CXML) which includes BLAS, LAPACK, Sparse Linear System Solvers and Signal Processing routines.
We will be using a port of PBS from the Pittsburgh Supercomputing Center that uses RMS. We also intend to use the Maui scheduler, although the Maui port to this platform is not yet complete. Our goal is to have a consistent interface on all of our batch systems.
Users will be required to use multiples of 4 processors as we do not plan to allow node sharing. There will be a front-end interactive machine from which jobs will be compiled, debugged and submitted.
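Because nodes will not be shared, a job's processor count maps onto whole 4-CPU nodes. The sketch below shows one way to reason about that. It is an assumption-laden illustration, not the behavior of PBS, RMS or Maui: the article only says that requests must be multiples of 4, so rounding a request up to the next whole node (rather than rejecting it) is a choice made here for illustration, and the function names are hypothetical.

```python
# Figures from the article: 8 nodes with 4 processors each.
CPUS_PER_NODE = 4
TOTAL_NODES = 8


def nodes_for(requested_cpus):
    """Return the number of whole nodes needed to cover a processor request."""
    if requested_cpus < 1:
        raise ValueError("at least one processor must be requested")
    nodes = -(-requested_cpus // CPUS_PER_NODE)  # ceiling division
    if nodes > TOTAL_NODES:
        raise ValueError("request exceeds the size of the cluster")
    return nodes


def allocated_cpus(requested_cpus):
    """Return the processor count after rounding up to whole nodes."""
    return nodes_for(requested_cpus) * CPUS_PER_NODE


if __name__ == "__main__":
    for request in (4, 10, 32):
        print(request, "->", nodes_for(request), "node(s),",
              allocated_cpus(request), "processors")
```

A request for 10 processors, for example, would occupy three nodes and therefore 12 processors under this rounding rule.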
Disk (Global Scratch)
The Compaq comes with 864 gigabytes of disk space, provided by 24 (twenty-four) 36-gigabyte Fibre Channel disks. The disks will serve as a large "global scratch" area and will work with Compaq's Cluster File System (CFS) and Parallel File System (from Compaq) to deliver high-speed file I/O from each node. Each node will use QSW's parallel file system across the Quadrics switch for high-speed I/O.
We have not yet established when users will have access to this system, but we plan to make it available to the CHPC community as soon as possible.