Solving the correspondence problem in analytical chemistry: Automated methods for alignment and quantification of multiple signals

Abstract: When statistical data analysis techniques are applied to analytical chemical data, every variable must correspond to the same signal across all samples for the analysis to produce meaningful results. Peak shifts in NMR and chromatography destroy that correspondence and create data matrices that have to be aligned before analysis. In this thesis, new methods are introduced that allow for automated transformation from unaligned raw data to aligned data matrices where each column corresponds to a unique signal. These methods are based on linear multivariate models for the peak shifts and on the Hough transform for establishing the parameters of these linear models. Methods for quantification under difficult conditions, such as crowded spectral regions, noisy data, and unknown peak identities, are also introduced. These methods include automated peak selection and a robust method for background subtraction. This thesis focuses on the processing of the data; the experimental work is secondary and is not discussed in great detail. All the developed methods are combined into a full procedure that goes from raw data to a table of concentrations in a matter of minutes. The procedure is applied to 1H-NMR data from biological samples, which is one of the toughest alignment tasks in the field of analytical chemistry. It is shown that the procedure consistently performs on the same level as much more labor-intensive manual techniques such as Chenomx NMRSuite spectral profiling. Several kinds of datasets are evaluated using the procedure. Most of the data come from the field of metabolomics, where the goal is to establish concentrations of as many small molecules as possible in biological samples.
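
To illustrate the idea of using a Hough transform to establish the parameters of a linear peak-shift model, the following is a minimal sketch and not the thesis's implementation. It assumes a single known per-sample covariate t (hypothetical; the thesis describes multivariate models) and a peak whose position follows position = intercept + slope * t. The function name hough_line_fit, the parameter grids, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def hough_line_fit(t, positions, slopes, intercepts, tol):
    """Vote over a (slope, intercept) grid and return the best-supported pair.

    t         : covariate value of the sample each detected peak came from
    positions : detected peak positions (same length as t), identities unknown
    tol       : how close a peak must lie to a candidate line to vote for it
    """
    t = np.asarray(t, dtype=float)
    positions = np.asarray(positions, dtype=float)
    votes = np.zeros((len(slopes), len(intercepts)), dtype=int)
    for i, b in enumerate(slopes):
        # Intercept each peak would imply if the true slope were b.
        implied = positions - b * t
        for j, a in enumerate(intercepts):
            votes[i, j] = np.sum(np.abs(implied - a) <= tol)
    i, j = np.unravel_index(np.argmax(votes), votes.shape)
    return slopes[i], intercepts[j], votes[i, j]

# Synthetic example (made-up numbers): one peak drifting linearly over
# ten samples, plus three unrelated peaks acting as clutter.
t_true = np.arange(10)
pos_true = 2.0 + 0.05 * t_true
t_all = np.concatenate([t_true, [0, 3, 7]])
pos_all = np.concatenate([pos_true, [1.7, 2.4, 1.9]])

slope, intercept, support = hough_line_fit(
    t_all, pos_all,
    slopes=np.linspace(-0.1, 0.1, 21),
    intercepts=np.linspace(1.5, 2.5, 101),
    tol=0.004,
)
print(slope, intercept, support)
```

Because each detected peak only casts votes, clutter peaks that do not lie on a common line gather little support. That robustness to unidentified and spurious peaks is the general reason a voting scheme suits this kind of alignment problem.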