The extreme power of the Genetic Algorithm techniques in automated fitting and assigning of complex spectra.

Introduction

Rotationally resolved electronic spectroscopy provides a valuable tool for determination of molecular structures in different electronic states. An implicit problem of this method is that for larger molecules the spectra rapidly get very congested. Additionally, overlapping bands due to (i) close spaced vibronic bands (ii) other isotopomers with similar zero-point energy shifts or (iii) split bands due to large amplitude internal motions might complicate the experimental spectrum further. All these factors make a straightforward assignment of single rovibronic lines and therefore line position assigned fits impossible. An alternative approach is unassigned fits of the spectra using genetic algorithms (GA) with special cost functions for evaluation of the level of the fit. It has been shown by Hageman et al. [1] that the GA with a properly defined cost function was capable of performing automated fitting of spectra without any prior knowledge of the molecular parameters. The cost function used by Hageman et al. [1] is able to smooth the error landscape and therefore allows the GA to locate the global minimum. Unfortunately this method is quite time consuming, compared to other cost functions like simple least squares or peak picking functions. The automated fitting of several overlapping bands requires therefore fast parallel processing and large computing times.

In recent work [2] we show how the computing time of the cost function can be reduced drastically, so that the automated fit of a rovibronic spectrum can be performed in less than one hour using a standard desktop PC. The performance of the GA for spectra simulations has been described in detail elsewhere [1]. A good introduction into the vocabulary and theory of genetic algorithms as a tool for solving optimization problems can be found in [3, 4].

We extend the automated fit to the case of several overlapping spectra, i.e., the fitting of molecular parameters that belong to different molecular species or spectral components. The method is applied to a synthetic spectrum, which consists of two completely overlapping bands, in order to adapt the internal parameters for the GA fit. The refined method is then applied to a series of experimental rovibronic spectra of isotopomers of phenol and benzonitrile and clusters thereof. The discussed results demonstrate the extreme power of the GA in automated fitting and assigning of complex spectra. . The discussed results demonstrate the extreme power of the GA in automated fitting and assigning of complex spectra. It opens the road to the analysis of complex spectra of biomolecules and their building blocks.

Click on the image to follow a GA automated fitting of a spectrum

The genetic algorithm

A description of the GA used in this investigation can be found in [1]. The GA library PGAPack version 1.0, which can run on parallel processors, has been used [5]. The calculations were performed on four processors of a SUN UltraSPARC 333 MHz and on a 2.6 GHz PC with two processors under Linux. The genetic algorithm is basically a global optimizer, which uses concepts copied from natural reproduction and selection processes. For a detailed description of the GA the reader is referred to the original literature [6,7,8]. We shortly introduce the elements of the GA, which will be used in the following.

The performance of the GA depends on internal parameters like mutation rate, elitism, crossover probability and population size, which therefore should also be optimized for a given problem. Fortunately this meta-optimization results in similar parameters for quite different problems of optimization. The meta-optimization for some of the parameters is described in ref. [2].

A short slideshow on the Genetic Algorithm

References

[1] J. A. Hageman, R. Wehrens, R. de Gelder, W. L. Meerts, and L. M. C. Buydens, J. Chem. Phys. 113 (2000) 7955-7962.
[2] W.L. Meerts, M. Schmitt and G. C. Groenenboom. Can. J. Chem. 82 (2004) 804-819.
[3] R. E. Haupt and S. E. Haupt, Practical Genetic Algorithms, Wiley-Interscience, New York, 1988.
[4] M. Mitchell, An Introduction to Genetic Algorithms (Complex Adaptive Systems), MIT Press, 1998.
[5] D. Levine, PGAPack V1.0, PgaPack can be obtained via anonymous ftp from: ftp://ftp.mcs.anl.gov/pub/pgapack/.
[6] J. H. Holland, Adaption in Natural and Artificial Systems, MI: The University of Michigan Press, Ann-Arbor, 1975.
[7] D. E. Goldberg, Genetic Algorithms in search, optimisation and machine learning, Addison-Wesley, Reading Massachusetts, 1989.
[8] I. Rechenberg, Evolutionsstrategie - Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, Frommann-Holzboog, Stuttgart, 1973


This project is performed in collaboration with Michael Schmitt (University of Dusseldorf)