Annotated Recurrent Unidentified Spectra (ARUS)

To assist with the compound identification problem, the NIST Mass Spectrometry Data Center has developed a novel type of mass spectral library, one that includes all recurrent unidentified mass spectra in a material. Unlike traditional spectral libraries, which consist of reference spectra of known compounds derived from neat standards, these libraries are derived from recurring spectra of unknown identity in the target material itself: spectra are extracted, clustered, and, where possible, annotated prior to entry into a library. Building the library follows a procedure similar to the one described for libraries of neat compounds, though with a different set of spectrum and measurement annotations.

In general, this type of library is useful for many common tasks in ‘omics studies: (i) determining where, how often, and under what conditions certain ions are observed; (ii) assigning a class ID to compounds that are not in current tandem mass spectral libraries or are not commercially available; and (iii) connecting samples in an unambiguous way for control-case studies or interlaboratory comparisons (each molecular feature is represented by a spectrum in the library).

We provide eight spectral libraries of annotated recurring spectra found in human urine and plasma samples from sixteen different NIST reference materials (Table 1; pooled, nine plasma SRMs and seven urine SRMs):

Table 1. NIST human plasma, serum, and urine Standard Reference Materials (SRMs).

Plasma SRM/RM Number    Brief Description
1950                    Metabolites in Frozen Human Plasma
909c                    Frozen Human Serum
967a                    Creatinine in Frozen Human Serum
968e                    Fat-Soluble Vitamins, Carotenoids, and Cholesterol in Human Serum
971                     Hormones in Frozen Human Serum
972a                    Vitamin D Metabolites in Frozen Human Serum
1951c                   Lipids in Frozen Human Serum
3950                    Vitamin B6 in Frozen Human Serum
956d                    Electrolytes in Frozen Human Serum

Urine SRM/RM Number     Brief Description
3667                    Creatinine in Frozen Human Urine
3671                    Nicotine Metabolites in Human Urine (Frozen; 3 levels: 3671.1, 3671.2, 3671.3)
3672                    Organic Contaminants in Smokers' Urine (Frozen)
3673                    Organic Contaminants in Non-Smokers
3674                    Organic Contaminants in Fortified Smokers

Detailed information about these materials can be found at and the corresponding Certificate of Analysis.

The present libraries are the third version of those originally described (Rapid Commun. Mass Spectrom. 2016, 30, 581–593 and Anal. Chem. 2019, 91 (18), 12021–12029). They include spectra that have not been identified by the NIST Tandem Mass Spectral Library. Spectral annotation is performed using our Hybrid Search. Compound names in the library follow the nomenclature Name_Adduct Type_Score_DeltaMass_Formula_LibID, where Name, DeltaMass, Score, and Formula are derived from the best hit in a hybrid search, and LibID is a sequential number assigned to spectra in the archive. Spectra with no match (known unknowns) are simply named by their cluster numbers. The Hybrid Search Match Factor is also shown. Nreps in the comment field gives the number of original spectra used to make the consensus spectrum / total spectra found, along with the collision energy.

Among other uses, these libraries are intended to help users identify compounds in urine and plasma that are not included in the NIST 20 Tandem Library. Users interested in metabolomics may find the libraries useful for connecting samples in an unambiguous way in control-case studies or interlaboratory comparisons.

The libraries cover positive and negative ion modes. Separate libraries are provided for fragmentation by collision cell (HCD) and ion trap (IT), all at high mass accuracy. The name of each library gives the material and fragmentation conditions. These libraries can be searched using the NIST Mass Spectral Search Program (NISTMS.EXE). A version of MS Search is available for download from this website; earlier versions should also work.
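The naming convention described above can be taken apart programmatically when library hits are exported for further processing. The following is a minimal Python sketch, not part of the NIST tooling. It assumes that the six fields are separated by single underscores, that Score and DeltaMass are numeric, that LibID is an integer, and that only the Name field may itself contain underscores (hence the split from the right). The function name, the dataclass, and the example entry are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArusName:
    """Fields of a Name_Adduct Type_Score_DeltaMass_Formula_LibID entry."""
    name: str
    adduct: str
    score: float
    delta_mass: float
    formula: str
    lib_id: int


def parse_arus_name(entry: str) -> Optional[ArusName]:
    """Split an annotated entry name into its six fields.

    Returns None when the entry does not fit the six-field pattern,
    e.g. for unmatched spectra named only by a cluster number.
    """
    # Split from the right so underscores inside the compound name
    # (an assumption about how names may look) stay in the Name field.
    parts = entry.rsplit("_", 5)
    if len(parts) != 6:
        return None
    name, adduct, score, delta_mass, formula, lib_id = parts
    try:
        return ArusName(name, adduct, float(score), float(delta_mass),
                        formula, int(lib_id))
    except ValueError:
        # Score, DeltaMass, or LibID was not numeric: treat as unparsed.
        return None


if __name__ == "__main__":
    # Made-up entry for illustration; not taken from an actual library.
    print(parse_arus_name("Creatinine_[M+H]+_850_14.0157_C4H7N3O_1234"))
```

Entries that come back as None can be handled separately as known unknowns identified only by their cluster number.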

The libraries are offered “as is” and without warranty of any kind and are intended for research purposes only.

Download the Readme file for help with installing and using these libraries: Readme_ARUS.PDF (1.76 MiB). It is also included with each library.
Download the tab-delimited text file 2013-0828_1950_1.mgf.tsv (271.88 KiB). This file contains a worked example of the combined use of the ARUS libraries, the tandem mass spectral library, and the hybrid search method, using a single run of a plasma sample downloaded from

Description Download Link
Plasma, HCD, negative mode (57.72 MiB)
Plasma, HCD, positive mode (140.42 MiB)
Plasma, IT, negative mode (15.97 MiB)
Plasma, IT, positive mode (36.61 MiB)
Urine, HCD, negative mode (100.61 MiB)
Urine, HCD, positive mode (412.31 MiB)
Urine, IT, negative mode (27.77 MiB)
Urine, IT, positive mode (37.21 MiB)
chemdata/arus.txt · Last modified: 2020/06/29 15:33 (external edit)
