Chemometric Spectroscopy is the term used to describe the direct combination of a spectroscopic measurement with a chemometric data evaluation procedure, typically a multivariate statistical method [1, 2], which allows the qualitative and quantitative properties of a sample to be deduced quickly and reliably from its spectrum. Multiple linear regression (MLR) or partial least squares (PLS) regression is usually used for the quantitative determination of individual or combined parameters, while substance identification generally relies on techniques such as discriminant or cluster analysis.
These procedures have been proven in combination with near-infrared spectroscopy (NIRS) over the last two decades and have spread to other instrumental techniques. Many applications have been described for samples from the agricultural, food, chemical, pharmaceutical, biotechnology and medical fields, and some have become established as official methods.
Building on this success, the technique is now being extended with improvements to quality control, operation, the consistency of operating procedures and data formats, and the integration of products from different manufacturers. These developments are expected to result in significant changes in software.
A hypothetical problem based on fact
A laboratory manager would like to introduce a new analysis procedure to determine the dry matter content of maize. NIR spectroscopy offers a quick and easy method of analysis without complicated sample preparation. The laboratory possesses an advanced type of spectrometer such as an FT-NIR or grating monochromator made by Company A. Routine measurements in field laboratories will use compact and economical diode array spectrometers from Company B. The sample spectra and analysis results must be logged in a LIMS database.
Company A offers software for routine measurement and software for method development, which uses a proprietary, unpublished file format. This software has a data export format which is compatible with the LIMS. Company B’s software only supports the operation of its own instrument, up to and including storage of the recorded spectra. To create a new calibration model Company B recommends the software from Company C, and its instrument uses a published interface to transfer spectra in Company C’s data format.
If the user wants to establish and validate a calibration on the research spectrometer from Company A, before being able to use it on the field instruments he must first:
- use Company B’s software to perform the spectral measurement
- import the data using Company A’s chemometric software
- archive the data to the LIMS and apply it to the calibration model
- convert the calibration to Company C’s data format.
This type of non-standardisation is commonly encountered, and results in excessive use of staff time as well as introducing the possibility of transfer errors.
Fig. 1: All the SL programs share a common data structure, using compact relational databases containing all the spectra, sample data and calibration models together with other data sets. The SL Database Viewer provides ready access to the data.
A solution to the problem: modular software with a unified data format
SensoLogic has developed software which overcomes the type of problem described above. The SL software family for Chemometric Spectroscopy comprises a set of software modules covering the routines needed for calibration development (SL Calibration Workshop), for routine measurement (SL Predictor) and for other specialised tasks. A common data structure is used throughout. The software incorporates the original drivers from well-known spectrometer manufacturers together with tools to simplify data transfer. The program modules can be configured to users’ needs and there is a tool kit for developing further applications (the SL Application Development Kit) which provides the variety and flexibility needed to work in different ways with different instruments.
Data is stored in a Chemometric Project File (CPF) which contains all the data relevant to a particular project in a single file. This contains the following information and data (see also Figure 1):
- Samples with ID and their physical or chemical reference data
- Spectra, organised into series and libraries
- User-configurable transformations for spectral pre-treatment
- Calibration models, ready to use in the methods and applications.
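The internal layout of a CPF file is not described in this article, so the following is only an illustrative sketch of how a single-file relational project store for the items listed above might be organised. It uses Python's built-in sqlite3 module; every table, column and function name is hypothetical and is not SensoLogic's schema or API.

```python
import sqlite3

# Hypothetical schema: NOT the real CPF layout, only a single-file relational sketch.
SCHEMA = """
CREATE TABLE sample (
    id INTEGER PRIMARY KEY,
    lims_id TEXT UNIQUE,      -- identifier linking the sample to the LIMS
    variety TEXT,
    harvest_year INTEGER
);
CREATE TABLE reference_value (
    sample_id INTEGER REFERENCES sample(id),
    parameter TEXT,           -- e.g. 'dry_matter' or 'protein'
    value REAL
);
CREATE TABLE spectrum (
    id INTEGER PRIMARY KEY,
    sample_id INTEGER REFERENCES sample(id),
    series TEXT,              -- name of the series or library
    absorbances TEXT          -- comma-separated values, for simplicity
);
CREATE TABLE calibration_model (
    id INTEGER PRIMARY KEY,
    name TEXT,
    parameter TEXT,
    pretreatment TEXT         -- description of the spectral transformation
);
"""

def create_project(path=":memory:"):
    """Create an empty project store (in memory by default)."""
    con = sqlite3.connect(path)
    con.executescript(SCHEMA)
    return con

def add_sample(con, lims_id, variety, harvest_year, references, absorbances, series):
    """Store one sample with its reference values and one spectrum."""
    cur = con.execute(
        "INSERT INTO sample (lims_id, variety, harvest_year) VALUES (?, ?, ?)",
        (lims_id, variety, harvest_year),
    )
    sample_id = cur.lastrowid
    con.executemany(
        "INSERT INTO reference_value VALUES (?, ?, ?)",
        [(sample_id, name, value) for name, value in references.items()],
    )
    con.execute(
        "INSERT INTO spectrum (sample_id, series, absorbances) VALUES (?, ?, ?)",
        (sample_id, series, ",".join(str(a) for a in absorbances)),
    )
    con.commit()
    return sample_id

def spectra_with_reference(con, parameter):
    """Return (lims_id, spectrum, reference value) rows for one parameter."""
    rows = con.execute(
        "SELECT s.lims_id, sp.absorbances, r.value "
        "FROM sample s "
        "JOIN spectrum sp ON sp.sample_id = s.id "
        "JOIN reference_value r ON r.sample_id = s.id "
        "WHERE r.parameter = ? ORDER BY s.id",
        (parameter,),
    ).fetchall()
    return [(lims, [float(a) for a in text.split(",")], value)
            for lims, text, value in rows]
```

Keeping samples, reference values and spectra in linked tables of one file means a calibration data set for any parameter can be assembled with a single query, which is the practical benefit of a common project file.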
As an example, the data for a maize analysis project might include all the measured spectra with information on the variety, harvest year, region of origin or other significant variables and a LIMS identifier linking each sample with analysis data such as dry-matter or protein. All the developed calibration models together with the resulting tested methods for dry-matter and protein prediction are documented together with validated and released methods for single or multi-component analysis.
The CPF file provides the common platform for the individual SL programs, so that for example SL Predictor can be run on the lab computer which is connected to the spectrometer while SL Calibration Workshop is in use on a PC in the office.
The first step in calibration development is to capture the sample spectra with SL Predictor or to use previously recorded data from a source such as JCAMP or SPC. The sample data are imported into the project file. Next, SL Calibration Workshop is used to generate the calibration models. This procedure uses well-proven tools and is focussed on the essential operations so as to keep the user time requirement to a minimum. For quantitative analysis a multiple linear regression (MLR) can be applied directly to the wavelength variables of the spectra, or the regression can be applied to factors derived from a variance analysis, as in Principal Component Regression (PCR) or Partial Least Squares Regression (PLSR). For qualitative analysis a discriminant module for spectral libraries and a cluster model are available, both with factoring by Principal Component Analysis (PCA).
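To illustrate the factor-based route, here is a minimal sketch of principal component regression using NumPy: the mean-centred spectra are decomposed into factors, and a least-squares regression is then fitted on the factor scores. This is a textbook formulation under simple assumptions (no spectral pre-treatment, no cross-validation), not the implementation used in SL Calibration Workshop, and the function names are hypothetical.

```python
import numpy as np

def fit_pcr(X, y, n_factors):
    """Principal component regression.

    X: (n_samples, n_wavelengths) spectra, y: (n_samples,) reference values.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # The rows of Vt are the principal component loadings of the centred spectra.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_factors].T                  # (n_wavelengths, n_factors)
    scores = Xc @ loadings                       # (n_samples, n_factors)
    # Ordinary least squares on the scores instead of the raw wavelengths.
    q, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    coef = loadings @ q                          # regression vector per wavelength
    return {"x_mean": x_mean, "y_mean": y_mean, "coef": coef}

def predict(model, X):
    """Apply a fitted PCR model to one or more spectra."""
    return (X - model["x_mean"]) @ model["coef"] + model["y_mean"]
```

Regressing on a few factors rather than on every wavelength avoids the collinearity problem that arises when a spectrum has many more wavelength variables than there are calibration samples.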
Calibrations which appear to be suitable are then incorporated into a Method which is tested by measuring an independent set of samples not included in the calibration and, if successful, may subsequently be validated.
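The article does not state which statistics are reported for the independent test set. As a sketch, the figures commonly used for this purpose are the bias, the root mean square error of prediction (RMSEP) and the bias-corrected standard error of prediction (SEP). The function name below is hypothetical.

```python
import math

def validation_statistics(predicted, reference):
    """Bias, RMSEP and SEP for an independent test set."""
    n = len(predicted)
    residuals = [p - r for p, r in zip(predicted, reference)]
    bias = sum(residuals) / n
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)
    # SEP removes the bias and uses n - 1 degrees of freedom.
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    return {"bias": bias, "rmsep": rmsep, "sep": sep}
```

A large bias with a small SEP typically points to a systematic offset, for example between instruments, rather than to a poor calibration model.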
Three freely selectable combinations from validated methods can be combined and released for routine analysis. For example, a quantitative application for dry-matter measurement entitled “Maize” which has been proven during one year’s harvest may be extended by adding calibration samples from the new crop and also by adding the sample data for a further measurement such as protein. The project file containing the Maize application can then be used for direct measurement using SL Predictor.
Fig. 2: Before starting routine measurement with SL Predictor, a one-time set-up procedure configures preferences such as user entries, result output format and storage and automatic sample numbering, so that subsequent routine operation is simplified.
Before the first measurement a few simple preparatory steps are carried out to simplify the daily work. For example, the user selects the established application, or calibration, to be used for routine analysis and can, if required, enable the entry of analysis values for other parameters that are already available for the samples. Figure 2 shows an example with an additional “Quality Index”.
After the sample has been measured the application automatically checks whether the sample is valid for the calibration being used and marks the result accordingly in green or red (Figure 3). In the case of red, a code indicates which aspect of the outlier check failed:
- H: Leverage outlier. The spectrum lies too far outside the calibration data.
- S: Spectrum reconstruction error. The factor model used could not reconstruct this spectrum.
- R: Range outlier. The estimated value lies outside the calibration range.
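The three checks can be sketched for a PCA factor model as follows. The leverage is computed from the scores of the new spectrum, the reconstruction error from the part of the spectrum the factors cannot reproduce, and the range check from the calibration reference values. The limits used here (three times the average leverage, three times the RMS calibration residual) are common rules of thumb assumed for illustration; the article does not state the limits SL Predictor applies, and all names are hypothetical.

```python
import numpy as np

def build_factor_model(X, y, n_factors):
    """PCA factor model of the calibration spectra plus assumed outlier limits."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_factors].T                       # loadings
    T = Xc @ P                                 # calibration scores
    resid_norms = np.linalg.norm(Xc - T @ P.T, axis=1)
    return {
        "x_mean": x_mean,
        "P": P,
        "TtT_inv": np.linalg.inv(T.T @ T),
        "h_limit": 3.0 * n_factors / X.shape[0],              # assumed limit
        "s_limit": 3.0 * np.sqrt(np.mean(resid_norms ** 2)),  # assumed limit
        "y_min": float(np.min(y)),
        "y_max": float(np.max(y)),
    }

def check_sample(model, x, y_pred):
    """Return a string of failed checks, e.g. 'HR'; empty means 'green'."""
    xc = x - model["x_mean"]
    t = xc @ model["P"]                                    # scores of the new spectrum
    leverage = float(t @ model["TtT_inv"] @ t)
    residual = float(np.linalg.norm(xc - model["P"] @ t))  # unexplained part
    codes = ""
    if leverage > model["h_limit"]:
        codes += "H"
    if residual > model["s_limit"]:
        codes += "S"
    if not (model["y_min"] <= y_pred <= model["y_max"]):
        codes += "R"
    return codes
```

A sample can fail more than one check at once, which is why the sketch returns a combined code rather than a single flag.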
Fig. 3: During a series of measurements all entries, spectra, results and status information are clearly displayed. The main screen of SL Predictor here shows that the protein value was acceptable but the dry-matter result is marked as an outlier and is therefore invalid. The Quality Index was entered manually. The measurement uncertainty for each measurement can be optionally displayed.
As well as checking outliers the software provides an estimate of the measurement error. This indicates the limits within which a reference analysis would have a 95% probability of agreeing with the measured result. SL Predictor calculates this according to the method of ASTM E1655, part 15.4.1.
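A widely used leverage-based form of such a limit is the predicted value plus or minus t times the standard error of calibration (SEC) times the square root of one plus the leverage of the sample. The sketch below assumes this form; whether it matches the cited section of ASTM E1655 term for term should be checked against the standard itself. The default t value of 1.96 is a large-sample approximation for 95 %, and for small calibration sets the Student's t value for the calibration degrees of freedom should be supplied instead. The function name is hypothetical.

```python
import math

def prediction_interval(y_pred, sec, leverage, t_value=1.96):
    """Approximate 95 % limits for one prediction (assumed leverage-based form).

    sec: standard error of calibration; leverage: leverage of the new sample.
    """
    half_width = t_value * sec * math.sqrt(1.0 + leverage)
    return y_pred - half_width, y_pred + half_width
```

Because the leverage enters the formula, a sample close to the centre of the calibration data receives a narrower interval than one near its edge.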
Results from a qualitative analysis are presented with similar clarity. For a conformity analysis (“Is this sample the same as X or not?”) a positively identified sample is marked as OK. If the result is uncertain, the program shows the user a list of substances with the most similar spectra from the selected library.
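As a simple illustration of how a list of the most similar library spectra can be produced, the sketch below ranks library entries by their correlation with the measured spectrum. Plain correlation is used here as a stand-in; it is not the PCA-based discriminant model described earlier, and the function name and library layout are hypothetical.

```python
import numpy as np

def most_similar(spectrum, library, top_n=3):
    """Rank library entries by correlation with the measured spectrum.

    library: dict mapping substance name to a spectrum of the same length.
    Returns a list of (name, correlation) pairs, best match first.
    """
    def correlation(a, b):
        a = a - a.mean()
        b = b - b.mean()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    ranked = sorted(
        ((correlation(spectrum, ref), name) for name, ref in library.items()),
        reverse=True,
    )
    return [(name, r) for r, name in ranked[:top_n]]
```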
Results can be printed in report form or sent as an ASCII file to a LIMS. They are also stored in the CPF project file.
Fig. 4: Spectrometers are controlled using original manufacturer’s drivers, which are integrated into the SL software. In this example a diode array spectrometer (the Corona, made by Carl Zeiss Jena GmbH) is controlled with a synchronised sample carousel or additional baseline correction.
Wherever possible the SL software incorporates the original drivers from a wide range of spectrometer manufacturers so as to ensure a seamless connection to the spectrometer and without tying the user to a particular manufacturer or type. All the necessary operating functions can be carried out from the SL Predictor user interface (Figure 4).
All instrument-specific settings and procedures are implemented in close co-operation with the spectrometer manufacturers and in accordance with their recommendations. For a single-beam spectrometer, for example, a reference measurement can be performed with simultaneous optimisation of the signal level. SL Predictor adds other useful functions for all instrument types, such as a baseline measurement to correct the absorbance (extinction) value for a particular sample container or for another type of interference which would not be corrected by a reference measurement.
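The arithmetic behind these two corrections can be sketched as follows, assuming the usual definition of absorbance as the negative decadic logarithm of the ratio of sample to reference intensity, with the baseline (for example an empty sample container) subtracted afterwards. Dark-current correction and signal-level optimisation are omitted, and the function names are hypothetical.

```python
import math

def absorbance(sample_counts, reference_counts):
    """Absorbance per wavelength from single-beam sample and reference intensities."""
    return [-math.log10(s / r) for s, r in zip(sample_counts, reference_counts)]

def baseline_corrected(sample_abs, baseline_abs):
    """Subtract a baseline absorbance spectrum, e.g. of the empty container."""
    return [a - b for a, b in zip(sample_abs, baseline_abs)]
```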
This provides the user with a program which is ready to use, so that during installation it is only necessary to select the instrument driver(s) to be loaded: the system is then ready for use with the selected spectrometer.
In addition to the main applications the SL software family offers a range of useful accessory programs. These are bundled together in the SL Utilities. There are tools to support the direct or indirect import of existing data into a CPF database, for example the batch import of individual SPC spectra, optionally combined with the associated reference analysis values in an Excel file.
The “Subset Selection” tool allows representative sets of spectra to be assembled for calibration and validation by selecting significant spectra, either with the help of a Gauss-Jordan procedure [3] or at random.
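A selection of this kind can be sketched in the spirit of the unique-sample selection of Honigs et al. listed in the references: the spectrum containing the largest absolute value is selected, its contribution is eliminated from all remaining spectra by a Gauss-Jordan-style pivot at that wavelength, and the step is repeated. This is a simplified reading of that approach, not the code of the SL Utilities, and the function name is hypothetical.

```python
import numpy as np

def select_unique_samples(X, n_select):
    """Pick spectra that are most 'unique' by repeated pivot elimination.

    X: (n_samples, n_wavelengths). Returns the selected row indices in order.
    """
    R = np.array(X, dtype=float)          # working copy, modified below
    remaining = list(range(R.shape[0]))
    selected = []
    for _ in range(min(n_select, len(remaining))):
        sub = np.abs(R[remaining])
        row, col = np.unravel_index(np.argmax(sub), sub.shape)
        pivot = remaining[row]
        if R[pivot, col] == 0.0:
            break                         # nothing unexplained is left
        selected.append(pivot)
        remaining.remove(pivot)
        # Subtract the scaled pivot spectrum so the pivot wavelength becomes zero.
        factors = R[remaining, col] / R[pivot, col]
        R[remaining] -= np.outer(factors, R[pivot])
    return selected
```

Spectra that are mixtures of already selected ones shrink towards zero after elimination, so they are unlikely to be chosen again, which keeps the subset small but representative.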
SL Database Viewer (Figure 1) provides an independent tool for quickly viewing and managing CPF project files, and the SL Application Development Kit offers the possibility to read from and write to CPF files from other software programs and to use the analysis methods they contain.
The user in our original maize analysis example can now, provided spectrometer manufacturers A and B have supplied their drivers, perform his dry-matter analysis as easily as this:
- use SL Predictor with both spectrometers for sample measurement
- have instant access to the recorded data in SL Calibration Workshop in order to develop, optimise and validate a calibration model, and archive the results to a LIMS
- use the calibration immediately for routine use, with the spectrometer from either Company A or B, and archive the analysis results to the LIMS.
In this way all the procedures are brought together in a single, easy-to-use workflow, so that the work becomes more efficient and the risk of errors is reduced, while the quality assurance requirements for chemometric analysis are still met.
[1] T. Næs, T. Isaksson, T. Fearn, T. Davies:
Multivariate Calibration and Classification
NIR Publications, Chichester, 2002
[2] H. Mark:
Principles and Practice of Spectroscopic Calibration
Wiley, New York, 1991
[3] D.E. Honigs, G.M. Hieftje, H.L. Mark, T.B. Hirschfeld:
Unique-Sample Selection via Near-Infrared Spectral Subtraction
Analytical Chemistry, vol. 57, no. 12, October 1985, 2299-2303
With grateful thanks to KWS Saat AG / Einbeck for permission to publish maize spectra in the application example.