Genetic Algorithm to Model Crystal Structures

by Julio C. Facelli, Director, Center for High Performance Computing

In recent years the field of crystal engineering, whose goal is to design solids with specific properties,[1] has received significant contributions from advances in computer modeling of organic materials.[2] Organic crystalline or polycrystalline materials are prevalent in many industries, including pharmaceuticals, agrochemicals, pigments, dyes, explosives and specialty chemicals. Molecular structure modeling is well understood, and there are well-established methodologies whose choice depends on the size of the compounds under consideration. These include ab initio, semiempirical and empirical methods that can be used to obtain molecular structures of organic compounds.[3] This is not the case when attempting to predict their crystalline structures, where modeling techniques are still in developmental stages.[4]

[Figure: predicted and experimental structure diagram]

Polymorphism is one of the greatest challenges in crystal engineering today. Increasingly, experimental evidence shows that organic crystals may exist in a number of polymorphs that fall within a narrow range of energies. This phenomenon is not well understood, so it is necessary to develop a better understanding of the conditions under which organic solids crystallize in different polymorphic forms. This is important because different polymorphs can exhibit very different properties, such as shelf life, bioavailability, solubility, morphology, vapor pressure, density, color and shock sensitivity. Therefore, when designing new solid organic materials, it is important to be able to predict how many polymorphs are possible, their crystalline structures, and how their macroscopic properties depend on those structures. Single-crystal diffraction studies for a large number of polymorphs are in general impractical, and alternative methods using powder diffraction, solid state NMR and computational modeling are highly desirable. Furthermore, one cannot be certain that all possible polymorphs have been discovered experimentally, or that those found previously can be reproduced. Thus, methods that could predict potential polymorphic structures using molecular modeling techniques would be extremely valuable.

Practical methods that can lead to accurate crystalline structures in an effective manner are required to make significant advances in crystal engineering. Numerous approaches have been attempted to develop these methods;[5-10] a comprehensive review of the existing methods to predict crystal structures is given in Ref. 4. A cursory study of this reference makes it clear that better and more efficient methods are necessary for practical applications.

Our recent research efforts,[11], [12] with Dr. Marta Ferraro and Lic. Victor Bazterra of the University of Buenos Aires, have focused on using Genetic Algorithms[13], [14] (GA) to predict crystalline structures. GA are a family of search techniques rooted in the ideas of Darwinian biological evolution. These methods are based on the principle of survival of the fittest, in which each string, or genome, represents a trial solution to the problem. In every generation, the genomes or "individuals" compete with each other in the population for survival and produce offspring for the next generation according to prescribed propagation rules. One of the advantages of genetic algorithms is that they can provide not only a global minimum, but also information on other states with energies close to the minimum. Operators analogous to crossover, mutation and natural selection are employed to build a search mechanism that can explore and learn the multidimensional parameter space and determine which regions of that space provide good solutions to the problem.
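The three operators can be illustrated with a toy example. The sketch below is not the MGAC code: the genome length, population size, mutation rate, tournament selection, the elitism step and the sum-of-squares "energy" are all assumptions chosen only to keep the example short. A small fixed-seed random number generator makes the run repeatable.

```c
#include <stddef.h>
#include <string.h>

#define GENES 4   /* hypothetical genome length */
#define POP   8   /* hypothetical population size */

/* Small deterministic generator so that every run is repeatable. */
static unsigned int rng_state = 12345u;

static unsigned int rng_next(void) {
    rng_state = rng_state * 1103515245u + 12345u;
    return (rng_state >> 16) & 0x7fffu;
}

static double rng_unit(void) { return rng_next() / 32768.0; }  /* in [0,1) */

/* Toy fitness function: sum of squares, with its minimum (0) at the origin.
   It stands in for a real energy evaluation. */
double energy(const double *g) {
    double e = 0.0;
    for (size_t i = 0; i < GENES; i++) e += g[i] * g[i];
    return e;
}

static size_t best_index(double pop[POP][GENES]) {
    size_t best = 0;
    for (size_t i = 1; i < POP; i++)
        if (energy(pop[i]) < energy(pop[best])) best = i;
    return best;
}

/* Natural selection analogue: of two random individuals, the lower energy wins. */
static size_t select_parent(double pop[POP][GENES]) {
    size_t a = rng_next() % POP, b = rng_next() % POP;
    return energy(pop[a]) <= energy(pop[b]) ? a : b;
}

/* Crossover analogue: the child takes genes before the cut from one parent
   and the remaining genes from the other. */
static void crossover(const double *p1, const double *p2, double *child) {
    size_t cut = 1 + rng_next() % (GENES - 1);
    for (size_t i = 0; i < GENES; i++) child[i] = (i < cut) ? p1[i] : p2[i];
}

/* Mutation analogue: each gene is perturbed with probability 0.2. */
static void mutate(double *g) {
    for (size_t i = 0; i < GENES; i++)
        if (rng_unit() < 0.2) g[i] += (2.0 * rng_unit() - 1.0) * 0.5;
}

/* Evolve a random population and return the lowest energy found. */
double run_ga(int generations) {
    double pop[POP][GENES], next[POP][GENES];
    rng_state = 12345u;                       /* same start for every call */
    for (size_t i = 0; i < POP; i++)
        for (size_t j = 0; j < GENES; j++)
            pop[i][j] = 20.0 * rng_unit() - 10.0;
    for (int gen = 0; gen < generations; gen++) {
        /* Elitism (an assumption of this sketch): keep the best unchanged. */
        memcpy(next[0], pop[best_index(pop)], sizeof next[0]);
        for (size_t k = 1; k < POP; k++) {
            size_t p1 = select_parent(pop), p2 = select_parent(pop);
            crossover(pop[p1], pop[p2], next[k]);
            mutate(next[k]);
        }
        memcpy(pop, next, sizeof pop);
    }
    return energy(pop[best_index(pop)]);
}
```

Because the best individual is always carried over, the value returned by run_ga(50) can never exceed the one returned by run_ga(0), which is simply the best energy in the random starting population.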

Using the resources at the Center for High Performance Computing, we have developed a new computational package named Modified Genetic Algorithm for Crystal and Cluster structures (MGAC). The modular nature of this library allows for great flexibility in the selection of the fitness function, which can be calculated using empirical potentials, enthalpy, ab initio energy functions or even experimental constraints. The MGAC method has been implemented in C++ using parallel techniques (MPI), making it very portable as well as easy to maintain, upgrade and run effectively on parallel architectures. A full account of these research projects is given in two papers that will appear in the April 8th issue of The Journal of Chemical Physics.[11], [12] As an example, we present here the results for benzene.

[Table I: crystallographic parameters for the predicted, reference, and experimental structures in benzene]

Benzene: The crystal structure of benzene is known experimentally to belong to the Pbca symmetry space group with four molecules per unit cell. Therefore, a genome of 27 parameters or genes is necessary to represent this crystal. The initial population was randomly created with cell parameters belonging to the following intervals: [-12, 12] for the crystallographic axis, [-10, 10] for the molecular positions and [0, 2π] for the Euler angles.
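One reading that is consistent with the count of 27 genes is three cell axes plus, for each of the four molecules, three position coordinates and three Euler angles (3 + 4 x 6 = 27). The sketch below builds a random genome under that assumed layout, using the intervals quoted above. The ordering of the genes and the random number generator are assumptions made for illustration, not details taken from MGAC.

```c
#include <stddef.h>

#define MOLECULES 4
#define GENES (3 + 6 * MOLECULES)   /* assumed layout: 3 axes + 6 genes per molecule */
#define TWO_PI 6.283185307179586

typedef struct { double lo, hi; } interval;

/* Interval for gene i, assuming genes 0..2 are the cell axes and each
   molecule then contributes 3 position genes followed by 3 Euler angles. */
interval gene_interval(size_t i) {
    interval axis  = { -12.0, 12.0 };
    interval pos   = { -10.0, 10.0 };
    interval euler = {   0.0, TWO_PI };
    if (i < 3) return axis;
    return ((i - 3) % 6 < 3) ? pos : euler;
}

/* Small deterministic generator so that the example is repeatable. */
static unsigned int seed_state = 2002u;

static double unit_random(void) {
    seed_state = seed_state * 1103515245u + 12345u;
    return ((seed_state >> 16) & 0x7fffu) / 32768.0;   /* in [0,1) */
}

/* Fill a genome with values drawn uniformly from each gene's interval. */
void random_genome(double genome[GENES]) {
    for (size_t i = 0; i < GENES; i++) {
        interval r = gene_interval(i);
        genome[i] = r.lo + (r.hi - r.lo) * unit_random();
    }
}

/* Return 1 if every gene lies inside its interval, 0 otherwise. */
int genome_in_bounds(const double genome[GENES]) {
    for (size_t i = 0; i < GENES; i++) {
        interval r = gene_interval(i);
        if (genome[i] < r.lo || genome[i] > r.hi) return 0;
    }
    return 1;
}

/* Convenience check: build one random genome and verify its bounds. */
int random_genome_is_valid(void) {
    double genome[GENES];
    random_genome(genome);
    return genome_in_bounds(genome);
}
```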

Table I compares the crystallographic parameters for the MGAC predicted structure, the reference structure and the experimental structure of benzene. The MGAC predicted structure and the reference structure are almost identical, but there are somewhat larger differences between the MGAC predicted structure and the experimental structure. From the comparison between the reference and the experimental structures, it can be inferred that these discrepancies are associated with the "approximation defects", and not with the optimization method. Even so, the differences between the MGAC predicted structure and the experimental one are quite small. Figure 2 presents a pictorial comparison of the experimental and the MGAC predicted structures, showing that MGAC correctly predicts the position and the orientation of the molecules inside the cells.

This work has been partially funded by the NSF Chemistry and International Divisions and by the Argentine Research Council (CONICET).


  1. G. R. Desiraju, Science 278, 404 (1997).
  2. A. Gavezzotti, "Theoretical Aspects and Computer Modelling of Molecular Solid State", Wiley, New York (1997).
  3. J. B. Foresman and Æ. Frisch, "Exploring Chemistry with Electronic Structure Methods", Gaussian, Inc., Pittsburgh (1996).
  4. J. P. M. Lommerse, W. D. S. Motherwell, H. L. Ammon, J. D. Dunitz, A. Gavezzotti, D. W. M. Hofmann, F. J. J. Leusen, W. T. M. Mooij, S. L. Price, B. Schweizer, M. U. Schmidt, B. P. van Eijck, P. Verwer and D. E. Williams, Acta Cryst. B56, 697 (2000).
  5. A. Gavezzotti and G. Filippini, J. Am. Chem. Soc. 118, 7153 (1996).
  6. C. B. Aakeroy, M. Nieuwenhuyzen and S. L. Price, J. Am. Chem. Soc. 120, 8986 (1998).
  7. H. Karfunkel and R. Gdanitz, J. Comput. Chem. 13, 1171 (1992).
  8. A. Gavezzotti, Faraday Discuss. 106, 63 (1997).
  9. D. W. M. Hofmann and T. Lengauer, J. Mol. Struct. 474, 13 (1999).
  10. B. P. van Eijck, W. T. M. Mooij and J. Kroon, Acta Cryst. B51, 99 (1995).
  11. V. E. Bazterra, M. B. Ferraro and J. C. Facelli, Modified Genetic Algorithm to Model Crystal Structures. I. Benzene, Naphthalene and Anthracene, J. Chem. Phys., April 8, 2002.
  12. V. E. Bazterra, M. B. Ferraro and J. C. Facelli, Modified Genetic Algorithm to Model Crystal Structures. II. Determination of a Polymorphic Structure of Benzene Using Enthalpy Minimization, J. Chem. Phys., April 8, 2002.
  13. K. F. Man, K. S. Tang and S. Kwong, "Genetic Algorithms", Springer, Berlin (1999).
  14. D. E. Goldberg, "Genetic Algorithms in Search, Optimisation and Machine Learning", Addison-Wesley, New York (1989).

Introduction to Parallel I/O at the Center for High Performance Computing

by Martin Cuma, Scientific Applications Programmer, Center for High Performance Computing

While the parallelization of programs provides the speedup needed to tackle demanding problems, input and output of data to permanent storage is still mostly done serially or, at best, individually from each parallel process. This can seriously affect the efficiency of the program.

Figure 1

There has been a lot of effort in the code development community to come up with a portable and easy-to-use parallel I/O solution. The Message Passing Interface (MPI) Forum created an MPI-I/O specification as a part of the MPI-2 standard [1]. Most computer vendors, as well as several public domain MPI distributions, have implemented this standard. While MPI-I/O forms the software layer for parallel I/O, a hardware system that meets the requirements of MPI-I/O (e.g. large I/O bandwidth, multiple process access to the same file) is necessary for the physical storage. One such system is PVFS, the Parallel Virtual File System [2], installed on the Icebox cluster as the /scratch/global file system. In this article, we first describe the physical setup of PVFS. Then we go over several ways to do parallel I/O and concentrate on efficient use of the MPI-I/O features. We conclude with a glance at some more advanced MPI-I/O features.


PVFS has been developed with the idea of providing a fast parallel I/O solution for Linux-based clusters. To achieve high performance, PVFS spreads the data across several I/O nodes, thus providing multiple network paths for data access and eliminating single-path bottlenecks. The PVFS layout is shown schematically in Figure 1. The system consists of several parts. The metadata server stores metadata about the PVFS files (file name, directory, ownership, distribution on the I/O nodes). The I/O server handles storing and retrieving data stored on the I/O nodes. The two are often located on the same physical computer, denoted as Management Node in the figure. The files are stored on the I/O Nodes. Typically a file is spread across the I/O nodes in blocks of a certain size in a round-robin fashion. As the figure suggests, if we do parallel I/O from the compute nodes, the I/O bandwidth will be limited by the collective bandwidth of either the compute nodes or the I/O nodes, whichever is smaller.
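The round-robin distribution can be made concrete with a little arithmetic. The sketch below maps a byte offset in a file to the I/O node that holds it and to the position within that node's share of the file. It is an illustration of the idea rather than PVFS code: the function name, the assumption that striping starts on node 0, and the block size used in the usage note are all hypothetical.

```c
#include <stdint.h>

typedef struct {
    unsigned node;          /* index of the I/O node holding the byte */
    uint64_t local_offset;  /* position within that node's part of the file */
} stripe_loc;

/* Locate a file byte under round-robin striping, assuming block 0 lives
   on node 0, block 1 on node 1, and so on, wrapping around. */
stripe_loc locate(uint64_t file_offset, uint64_t block_size, unsigned num_io_nodes) {
    uint64_t block = file_offset / block_size;   /* which block of the file */
    stripe_loc loc;
    loc.node = (unsigned)(block % num_io_nodes);
    /* Blocks already stored on this node, plus the position inside the block. */
    loc.local_offset = (block / num_io_nodes) * block_size + file_offset % block_size;
    return loc;
}
```

With six I/O nodes and a hypothetical block size of 65536 bytes, locate(65536, 65536, 6) lands on node 1 at local offset 0, while the seventh block of the file wraps back around to node 0.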

Figure 2

PVFS provides users with several options for doing the I/O. The simplest way is using standard Unix I/O functions. While the data will be distributed on the I/O nodes and the higher bandwidth will be utilized, the operations will be performed through expensive system kernel requests. A faster option is to use the PVFS API (Application Program Interface) functions, which bypass the kernel. In this case a PVFS library must be linked with the applications.

An MPI-I/O implementation called ROMIO [3], which is a part of the public domain MPICH [4] distribution, has been interfaced with PVFS, providing parallel I/O capabilities for MPI applications on the Linux platform. On the CHPC's Icebox Linux cluster, the PVFS file system is located at /scratch/global. The physical system consists of six I/O nodes and one management node, connected with gigabit Ethernet, with a total storage capacity of approximately 980 GB. Since the computational nodes are connected to this system using 100BaseT Ethernet (100 Mbit/s, or 12.5 MB/s, per link), the peak I/O bandwidth this system is able to achieve is 12.5*numprocs MB/s. Apart from Unix file I/O, users have a choice of using either the parallel I/O functions provided with the PVFS API, or the MPI-I/O interface shipped with ROMIO. Details on how to compile and use both are on our website [5]. In Figure 2 we illustrate the I/O bandwidth that can be achieved with the native PVFS and MPI-I/O calls. The benchmark measures the aggregate I/O bandwidth for 2 to 32 processors, each of which writes or reads 64 MB blocks in a single common file. The PVFS writes achieve performance fairly close to the peak bandwidth, which for 32 processors is 400 MB/s. MPI-I/O starts to lag behind with larger numbers of processors, an issue we have not yet resolved. The slower read performance can be explained by several buffering issues related to the Linux kernel.
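The peak figure follows directly from the link speeds. The sketch below reproduces the 12.5*numprocs estimate and caps it at the aggregate bandwidth of the I/O nodes. The cap rests on two assumptions that are not stated in the text: that each I/O node has a single gigabit link (125 MB/s), and that each process has a 100BaseT link to itself.

```c
/* Estimated peak aggregate I/O bandwidth in MB/s. */
double peak_bandwidth(int numprocs, int io_nodes) {
    const double compute_link = 12.5;   /* 100 Mbit/s per compute process */
    const double io_link = 125.0;       /* 1 Gbit/s per I/O node (assumed one link each) */
    double compute_side = compute_link * numprocs;
    double io_side = io_link * io_nodes;
    /* The slower side of the network sets the limit. */
    return compute_side < io_side ? compute_side : io_side;
}
```

For 32 processes and six I/O nodes this gives 400 MB/s, the value quoted above. Under the same assumptions the compute side stops being the limit only beyond 60 processes, where the six I/O nodes saturate at 750 MB/s.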

Figure 3

Apart from the Icebox, MPI-I/O is also supported on the Compaq Sierra cluster on the /scratch/global file system. This file system consists of Compaq's AdvFS (Advanced File System) connected physically to nodes 0 and 1 of the cluster and via the Quadrics switch to all the other nodes. Instructions for compiling MPI-I/O code on the Sierra are also available on our website [5].


Programmers writing MPI programs using standard Unix I/O generally have two options for performing the I/O. The first is to communicate the data to one process that then writes to a single file, as illustrated in Figure 3a. The other is for each process to write to a different file (Figure 3b). Both of these methods are inefficient and inconvenient. In the first, we waste time in communication, thus limiting the application's scalability. The second produces numerous files that can be difficult to manage, and it may not be convenient to restart the calculation on a different number of nodes. MPI-I/O provides a framework for single file access from multiple processors, illustrated in Figure 3c. As with MPI communication, there are individual and collective file access functions, as well as blocking and non-blocking file access. Apart from this, MPI-I/O also provides supporting functions for improved I/O performance, such as special data types for non-contiguous file access and a mechanism for passing hints to the MPI-I/O implementation. We note here that all MPI-I/O access is performed in binary format; there are no functions for text-based I/O.


Individual I/O functions are those that can be called from each process separately. They provide the most convenient way of converting code from serial Unix I/O. However, their performance may lag behind that of the collective I/O operations, which are performed together by all processes. In Figure 4, we provide an example of individual MPI-I/O to a single file. To open a file, each process calls the function MPI_File_open. The first argument to this function is a communicator; in this case we choose MPI_COMM_SELF, which represents a single process. The next argument is the file name. Note that we pass a path in the /scratch/global file system, which is the only one on Icebox that supports MPI-I/O. The third argument specifies the file open mode; here we combine MPI_MODE_WRONLY and MPI_MODE_CREATE to indicate that we are creating the file for write-only access. The fourth argument, of type MPI_Info, can be used to pass additional hints to the implementation; we don't want to pass any, so we use the predefined MPI_INFO_NULL.

Figure 4

The last argument returns the handle to the file. In order for each process to write to a different part of the file, we need to specify a file offset, which can be done in several ways. The simplest is to use the function MPI_File_seek, analogous to the Unix function lseek. The first argument is the file handle. The second is the file offset, which is different for each process. For portability reasons, the offset must be of type MPI_Offset, since different platforms may use different integer types to represent file offsets. The last argument specifies the starting point from which the offset is calculated; in our case MPI_SEEK_SET denotes the start of the file. We write into the file using MPI_File_write. This function is individual since processes can use it independently. The first argument is again the file handle, the second the data, the third their count, the fourth their datatype, and the fifth returns the status of the operation, of type MPI_Status (similar to the MPI receive operations). Reading from a file is done using MPI_File_read with an identical argument list. The file is closed with the MPI_File_close function, which takes only one argument, the file handle. There is another simple way to specify the file offset, shown in Figure 5. For a change, this piece of code reads from a file.
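The offset arithmetic is the part that differs from serial code, so it is worth spelling out. The sketch below does not call MPI: it uses an in-memory byte array as a stand-in for the shared file and a loop as a stand-in for the processes, and it assumes that every process writes the same number of integers. The helper names are hypothetical. In an MPI program, the value returned by rank_offset would be the offset passed to MPI_File_seek.

```c
#include <stddef.h>
#include <string.h>

/* Byte offset at which a given rank starts writing, assuming every rank
   writes `count` ints (an assumption of this sketch). */
size_t rank_offset(int rank, size_t count) {
    return (size_t)rank * count * sizeof(int);
}

/* Stand-in for "seek to offset, then write": copy ints into the file image. */
static void write_at(unsigned char *file, size_t offset, const int *buf, size_t count) {
    memcpy(file + offset, buf, count * sizeof(int));
}

/* Emulate nranks processes writing their blocks into one common file,
   then read the file back and count the values that are out of place.
   Returns 0 if every block landed where expected, -1 on bad input. */
int emulate_ranks(int nranks) {
    enum { COUNT = 4, MAX_RANKS = 16 };
    unsigned char file[MAX_RANKS * COUNT * sizeof(int)];
    int buf[COUNT], check[COUNT], bad = 0;

    if (nranks < 1 || nranks > MAX_RANKS) return -1;

    /* The order of the writes does not matter: each rank owns its region. */
    for (int rank = nranks - 1; rank >= 0; rank--) {
        for (int i = 0; i < COUNT; i++) buf[i] = rank * 100 + i;
        write_at(file, rank_offset(rank, COUNT), buf, COUNT);
    }
    for (int rank = 0; rank < nranks; rank++) {
        memcpy(check, file + rank_offset(rank, COUNT), sizeof check);
        for (int i = 0; i < COUNT; i++)
            if (check[i] != rank * 100 + i) bad++;
    }
    return bad;
}
```

Because the regions do not overlap, no coordination between the writers is needed, which is what allows the individual MPI-I/O functions to be called independently by each process.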

Figure 5

Instead of a combination of MPI_File_seek and MPI_File_read, we specify an explicit file offset using the single function MPI_File_read_at, which takes one extra argument representing the file offset. Note also that in this example we used the communicator MPI_COMM_WORLD to open the file, which is also allowed for individual file access. While the individual file access functions can be called by all the processes in the communicator that opened the file, it is usually more beneficial to use collective I/O functions for this purpose, since the implementation can apply extra optimizations when the I/O is done collectively. The collective I/O functions have the same syntax as the individual functions, with the addition of the suffix _all to each function name. In Figure 6, we present an example of a collective I/O write using the third possible way to specify the offset, the file view. A file view in MPI defines which portion of a file is "visible" to the process.

Figure 6

A read or write can only access data from this portion of the file. When a file is first opened, its entire contents are visible to the process. There are two reasons to specify a different file view. First, a file view lets us specify a unit of data access other than the default byte. This can be either a basic data type, such as integer or float, or a derived data type for more complicated data structures. Second, a file view lets us specify which parts of the file should be skipped by a particular process, which is the case in our example. File views are specified by the triplet displacement, etype and filetype, arguments two to four of the MPI_File_set_view function. The displacement specifies the number of bytes to be skipped from the start of the file. The etype is the basic unit of file access; in our case it is an integer, MPI_INT. The filetype specifies the portion of the file visible to the process. In our example, the filetype is again MPI_INT. In the case of noncontiguous file access, detailed further down in this article, this could be a derived data type created for this purpose. The fifth argument to MPI_File_set_view specifies the data storage representation; "native" indicates the representation native to the platform, which is non-portable. There are two other representations, "internal" and "external32", that can provide cross-platform compatibility of binary files at a moderate cost in performance.


In parallel applications, each processor often needs to access many small pieces of data located noncontiguously in the file. Separate reads or writes for each of these pieces are very costly because of the relatively high Unix I/O latency. MPI-I/O allows users to create derived data types for noncontiguous file access, thus combining the I/O requests, which speeds them up considerably. Derived data types are also commonly used to create specific file views for easier data access.

Figure 7

In Figure 7 we present a simple example of using a derived data type that creates a file view for writing an array into a file in a noncontiguous manner. What we want is to make the particular processor write two integers, then skip four, then write two more, and so on. On top of that, we want to skip the first five integers at the beginning of the file, which can be thought of as a file header. Code that achieves this is in Figure 8. As we noted before, a file view is specified by three variables. We set the etype to MPI_INT, since our data are integers. The filetype, the visible portion of the file, is two integers followed by a gap of four. We create this datatype in several steps. First, we call MPI_Type_contiguous, which creates a datatype two integers long. Then we call MPI_Type_create_resized, which increases the extent of the datatype to six integers; however, only the two integers from the previously created type have a physical representation and will be written to the file. This creates the gap of four integers. We commit this type with MPI_Type_commit (all newly created derived datatypes must be committed before use). Then we call MPI_File_open to open the file, and MPI_File_set_view to set the file view. In this function, we pass our newly created data type as the filetype, MPI_INT as the etype, and 5*sizeof(int) as the displacement, which is the initial file displacement or the header of the file. This value must always be in bytes. MPI_File_write_all will then do a collective noncontiguous write of the array buf, which is stored contiguously in memory. In the end, we have to free the derived data type by calling MPI_Type_free.
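To see where each element of buf ends up, it helps to write down the mapping that this file view implies. The sketch below is plain C arithmetic, not MPI code, and the function name and its parameters are hypothetical. It returns the byte position in the file of the i-th integer written through a view with a header of `header` integers and a filetype that exposes `block` integers out of every `extent`.

```c
#include <stddef.h>

/* Byte offset in the file of the i-th int written through the view.
   header: ints skipped at the start of the file (the displacement)
   block:  ints visible in each repetition of the filetype
   extent: total length of the filetype in ints, gap included */
size_t view_offset(size_t i, size_t header, size_t block, size_t extent) {
    size_t repetition = i / block;   /* which copy of the filetype */
    size_t within = i % block;       /* position inside the visible part */
    return (header + repetition * extent + within) * sizeof(int);
}
```

For the example above (header 5, block 2, extent 6), elements 0 and 1 of buf go to integer positions 5 and 6 in the file, elements 2 and 3 to positions 11 and 12, and so on, leaving a gap of four integers after each pair.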

Figure 8

Apart from this simple example, the MPI-2 specification contains several new data types that can simplify noncontiguous file access, and whose description is beyond the scope of this article. These include the Darray and Subarray types, used in distributing multidimensional data across the processors, and the indexed block type, used to specify irregularly distributed arrays.


In this article, we have tried to familiarize users with the parallel I/O capabilities available at the CHPC. We would like to encourage users to take advantage of this potential, as we believe that it can improve their programs' performance. The author would be happy to assist with any problems that may arise in the process.


  1. MPI-2 standard, see
  2. PVFS - Parallel Virtual File System,
  3. ROMIO - A high performance, portable MPI-I/O implementation,
  4. MPICH - A portable implementation of MPI,
  5. CHPC's parallel I/O webpage -

CHPC to retire SGI Origin 2000

by Julia Harrison, Associate Director, Center for High Performance Computing

CHPC is planning to retire the SGI Origin 2000 (raptor) in July of 2002. The retirement of the system was approved at the last Faculty Advisory Board meeting, following the staff recommendation. The computational capacity of this system will be replaced by adding capacity at ICEBOX. This is possible because eliminating the SGI system will save significant maintenance funds that can be used to increase the computational capacity of the Center. We estimate that with the savings in maintenance we will be able to add to the Center, in the next year alone, compute capacity amounting to twice the current capacity of raptor.

The current plan is to give the 64-processor system, which has been primarily used for batch, numerically intensive work, to the SCI group in the School of Computing. SCI plans to trade in raptor along with rapture, a 32-node Origin 2000 which has been used mostly for visualization work, for a new O3000 system.

The new system is to be used for visualization work and will be administered by the SCI group in the School of Computing. While the new system will not be administered or supported by CHPC, CHPC users who need this high-end visualization equipment will still be supported on the new system by making appropriate arrangements with the SCI group. Any numerically intensive, batch-style work should be migrated to another CHPC system before July 1st, 2002. We have conducted a careful analysis of the workload of the system and determined that there will not be significant hurdles in accomplishing this migration. Users are encouraged to contact CHPC as soon as possible to receive help in migrating their applications to ICEBOX or other CHPC systems.

Last Modified: October 06, 2008 @ 21:09:11