by Steve Stein
Gas chromatography/mass spectrometry (GC/MS) has long been the method of choice for identifying volatile compounds in complex mixtures. This method can fail, however, when acquired spectra are “contaminated” with extraneous mass spectral peaks, which commonly arise from co-eluting compounds and ionization chamber contaminants. These extraneous peaks pose a serious problem for automated identification methods: they can cause identifications to be missed by reducing the spectrum comparison factor below a pre-set identification threshold. In addition, spurious peaks in a spectrum increase the risk of false identifications. Perhaps worst of all, the added uncertainty leads to a general loss of confidence in the reliability of identifications made by GC/MS, especially for trace components in complex mixtures, a key application area for this technique.
The most common method for extracting “pure” spectra for a chromatographic component from acquired spectra is to subtract spectra in a selected “background” region of the chromatogram from spectra at the component maximum. This, however, is only appropriate when background signal levels are relatively constant (ionization chamber contamination, for example). Moreover, highly complex chromatograms may have no identifiable “background” region.
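The background subtraction described above can be sketched as follows. This is a minimal illustration, not a description of any particular instrument software: the function name, the dictionary representation of a spectrum (m/z mapped to abundance), and the sample values are all assumptions made for the example.

```python
def subtract_background(apex, background_scans):
    """Subtract the mean background spectrum from the spectrum at the peak apex.

    Each spectrum is a dict mapping m/z to abundance (an assumed representation).
    Ions whose abundance does not exceed the background are dropped.
    """
    n = len(background_scans)
    mean_bg = {}
    for scan in background_scans:
        for mz, abundance in scan.items():
            mean_bg[mz] = mean_bg.get(mz, 0.0) + abundance / n

    cleaned = {}
    for mz, abundance in apex.items():
        residual = abundance - mean_bg.get(mz, 0.0)
        if residual > 0:
            cleaned[mz] = residual
    return cleaned


# Hypothetical data: m/z 207 stands in for a roughly constant contaminant ion.
apex = {43: 900.0, 57: 400.0, 207: 120.0}
background = [{43: 20.0, 207: 110.0}, {43: 40.0, 207: 130.0}]
cleaned = subtract_background(apex, background)
print(cleaned)
```

The sketch works only because the contaminant level is the same in the background scans as at the apex. If the interfering signal changes across the peak, as it does for a co-eluting compound, the subtraction leaves a residue or removes genuine signal.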
An automated approach for dealing with contaminated spectra is to assume that acquired mass spectral peaks that do not match a reference spectrum originate from impurities. While this method can identify trace components embedded in complex background spectra, it can also produce false positive identifications for target compounds having simple spectra (i.e., when target compounds have spectra which are, in effect, embedded in the spectra of other compounds in the analyzed mixture).
AMDIS is an integrated set of procedures for first extracting pure component spectra and related information from complex chromatograms and then using this information to determine whether the component can be identified as one of the compounds represented in a reference library. The practical goal is to reduce the effort involved in identifying compounds by GC/MS while maintaining the high level of reliability associated with traditional analysis.
Since the inception of GC/MS, there has been a continuing interest in extracting “pure” component spectra from complex chromatograms. Biller and Biemann devised a simple method in which the extracted spectrum is composed of all mass spectral peaks that maximize simultaneously. Colby improved the resolution of this method by computing more precise ion maximization times. Herron, Donnelly and Sovocol demonstrated the utility of Colby's method in the analysis of environmental samples.
Another computationally facile approach for extracting spectra, based on subtraction of adjacent scans (“backfolding”), has recently been proposed. An advantage of this approach is that it does not explicitly require maximization.
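The idea of collecting ions that maximize simultaneously can be illustrated with a short sketch. It is a simplified reading of the approach, assuming ion chromatograms stored as lists of abundances per scan and a plain three-point local-maximum test; the function name and the data are hypothetical.

```python
def ions_maximizing_at(chromatograms, scan):
    """Return {m/z: abundance} for ions whose chromatogram peaks at `scan`.

    `chromatograms` maps m/z to a list of abundances, one per scan.
    A simple three-point test decides whether an ion maximizes at the scan.
    """
    spectrum = {}
    for mz, trace in chromatograms.items():
        if 0 < scan < len(trace) - 1:
            if trace[scan - 1] < trace[scan] >= trace[scan + 1]:
                spectrum[mz] = trace[scan]
    return spectrum


# Hypothetical ion chromatograms for two overlapping components.
chromatograms = {
    43: [5, 20, 80, 30, 10, 5],
    57: [2, 10, 40, 15, 4, 2],
    91: [1, 3, 10, 40, 90, 35],      # co-eluting compound that peaks later
    207: [50, 50, 50, 50, 50, 50],   # constant background ion
}
first = ions_maximizing_at(chromatograms, 2)
second = ions_maximizing_at(chromatograms, 4)
print(first, second)
```

With whole-scan resolution, two components whose maxima fall in the same scan cannot be separated, which is the limitation that more precise maximization times address.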
A more computationally intensive approach developed by Dromey et al., called the “model peak” method, extracts ion profiles that have similar shapes. As in the Biller/Biemann procedure, this method uses maxima in ion chromatograms to detect chromatographic components. The shape of the most prominent of these maximizing ion chromatograms is used to represent the shape of the actual chromatographic component. Ion chromatograms with this shape are extracted by a simple least-squares procedure. This method was successfully used for target compound identification in a large-scale EPA study. Rosenthal proposed an improvement to the peak perception logic for this method.
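A least-squares extraction of this kind might look like the sketch below. It assumes each ion chromatogram is fitted as a scaled copy of the model profile plus a constant baseline; the constant baseline, the function names, and the data are assumptions made for illustration and are not taken from the published method.

```python
def fit_to_model(trace, model):
    """Least-squares fit of trace[i] ~ scale * model[i] + baseline.

    Returns (scale, baseline). The constant baseline is a simplifying assumption.
    """
    n = len(model)
    mean_m = sum(model) / n
    mean_y = sum(trace) / n
    s_mm = sum((m - mean_m) ** 2 for m in model)
    s_my = sum((m - mean_m) * (y - mean_y) for m, y in zip(model, trace))
    scale = s_my / s_mm
    baseline = mean_y - scale * mean_m
    return scale, baseline


def extract_spectrum(chromatograms, model):
    """Build a spectrum from the ions that follow the model profile.

    The abundance assigned to each ion is its fitted scale times the model apex.
    """
    apex = max(model)
    spectrum = {}
    for mz, trace in chromatograms.items():
        scale, _ = fit_to_model(trace, model)
        if scale > 0:
            spectrum[mz] = scale * apex
    return spectrum


# Hypothetical model profile and ion chromatograms.
model = [0, 1, 4, 1, 0]
chromatograms = {
    43: [10, 12, 18, 12, 10],    # 2 * model on a baseline of 10
    57: [0, 3, 12, 3, 0],        # 3 * model, no baseline
    207: [50, 50, 50, 50, 50],   # flat background, no model-shaped signal
}
scale_43, baseline_43 = fit_to_model(chromatograms[43], model)
spectrum = extract_spectrum(chromatograms, model)
print(scale_43, baseline_43, spectrum)
```

Because the baseline term absorbs the constant contribution, the flat background ion receives a scale of zero and does not appear in the extracted spectrum, while m/z 43 is recovered with its baseline removed.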
A number of matrix-based approaches have been proposed that make no assumptions concerning component peak shape. These methods generally process an abundance data matrix indexed by (m/z, elution time) pairs. Sets of ions whose abundances are correlated with one another are extracted. While diverse approaches have been described, to our knowledge none of them has been fully implemented and tested for general-purpose use.
The model peak method of Dromey et al. was selected as the basis for spectrum extraction both because it has been shown to produce reliable results in large-scale tests and because it follows an approach similar to that of an analyst. However, its ability to extract weak signals was found to be poor. The origin of this problem was that it could not establish thresholds for distinguishing signal from noise. This problem was solved in the present work by processing ion abundances in signal-to-noise units rather than as absolute abundances, which permitted the rational setting of thresholds throughout the spectrum extraction process. Chemical identification was based on an optimized spectrum comparison function described earlier, extended to incorporate other information derived from GC/MS data. Analysis of test results led to the development of further refinements in the spectrum comparison process. The overall process involves four sequential steps: 1) noise analysis, 2) component perception, 3) spectral “deconvolution”, 4) compound identification.
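To illustrate what working in signal-to-noise units can mean, the sketch below converts an abundance to a multiple of its expected noise. The noise model is an assumption made for this example only: noise is taken to grow with the square root of the abundance, scaled by a hypothetical `noise_factor` that would come from the noise analysis step. The text above does not specify the formula, and the threshold value is likewise illustrative.

```python
import math


def to_snr(abundance, noise_factor):
    """Express an abundance in units of its expected noise.

    Assumed noise model (not stated in the text): noise = noise_factor * sqrt(abundance).
    """
    if abundance <= 0:
        return 0.0
    return abundance / (noise_factor * math.sqrt(abundance))


def is_signal(abundance, noise_factor, threshold=4.0):
    """Apply one threshold, in noise units, regardless of absolute abundance."""
    return to_snr(abundance, noise_factor) >= threshold


strong = to_snr(400.0, 2.0)   # 400 / (2 * 20)
weak = to_snr(4.0, 2.0)       # 4 / (2 * 2)
print(strong, weak, is_signal(400.0, 2.0), is_signal(4.0, 2.0))
```

The benefit of this form is that a single threshold applies to both weak and strong ions, whereas a fixed absolute-abundance cutoff would be too strict for one or too lenient for the other.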