Department of Marine Sciences, University of Gothenburg, Kristineberg 566, 45178 Fiskebäckskil, Sweden.
Anal Chem. 2021 Dec 14;93(49):16360-16368. doi: 10.1021/acs.analchem.1c02618. Epub 2021 Nov 22.
Herein we report a deep-learning method for removing instrumental noise and unwanted spectral artifacts from Fourier transform infrared (FTIR) or Raman spectra, particularly in automated applications in which a large number of spectra must be acquired within a limited time. Automated batch workflows that allow only a few seconds per measurement, with no possibility of manually optimizing measurement parameters, often produce challenging and heterogeneous datasets. A prominent example is the automated spectroscopic measurement of particles in environmental samples to determine their content of microplastic (MP) particles. Effective spectral identification is hampered by low signal-to-noise ratios and baseline artifacts because spectral post-processing and analysis must likewise be performed automatically, without adjusting parameters for each spectrum. We demonstrate the application of a simple autoencoding neural network to the reconstruction of spectra affected by complex distortions, such as high levels of noise, baseline bending, interferences, or distorted bands. Once trained on appropriate data, the network is able to remove all unwanted artifacts in a single pass, without the need to tune spectrum-specific parameters and with high computational efficiency. It therefore offers great potential for monitoring applications that involve a large number of spectra and limited analysis time, provided that representative data from already completed experiments are available.
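The general idea of a denoising autoencoder can be illustrated with a minimal sketch. It is not the network described in the paper, whose architecture and training data are not given in this abstract. Everything here is an assumption made for illustration: the synthetic spectra (Gaussian bands from a hypothetical three-entry reference library, corrupted with Gaussian noise and a curved baseline), the single hidden layer, the plain gradient-descent training, and the names `make_spectra`, `train_autoencoder`, and `denoise`. The network receives corrupted spectra as input and is trained to output the corresponding clean spectra.

```python
import numpy as np


def make_spectra(n, n_points=64, seed=0):
    """Return (clean, corrupted) synthetic spectra. Purely illustrative data."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    # Hypothetical reference library: two band centres per "material".
    library = [(0.2, 0.6), (0.35, 0.8), (0.5, 0.7)]
    clean = np.zeros((n, n_points))
    for i in range(n):
        centres = library[rng.integers(len(library))]
        for c in centres:
            amplitude = rng.uniform(0.5, 1.0)
            clean[i] += amplitude * np.exp(-0.5 * ((x - c) / 0.03) ** 2)
    noise = rng.normal(0.0, 0.2, size=clean.shape)
    slope = rng.uniform(-0.5, 0.5, size=(n, 1))
    curve = rng.uniform(-0.5, 0.5, size=(n, 1))
    baseline = slope * x + curve * x ** 2  # bent baseline, one per spectrum
    return clean, clean + noise + baseline


def train_autoencoder(noisy, clean, n_hidden=16, lr=0.005, n_steps=1000, seed=0):
    """Fit a one-hidden-layer autoencoder mapping corrupted to clean spectra."""
    rng = np.random.default_rng(seed)
    n, d = noisy.shape
    params = {
        "W1": rng.normal(0.0, 0.1, (d, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, d)),
        "b2": np.zeros(d),
    }
    losses = []
    for _ in range(n_steps):
        h = np.tanh(noisy @ params["W1"] + params["b1"])  # encoder
        out = h @ params["W2"] + params["b2"]             # linear decoder
        err = out - clean
        # Loss: squared error summed over points, averaged over spectra.
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Backpropagation of that loss through both layers.
        g_out = 2.0 * err / n
        g_h = (g_out @ params["W2"].T) * (1.0 - h ** 2)
        grads = {
            "W2": h.T @ g_out,
            "b2": g_out.sum(axis=0),
            "W1": noisy.T @ g_h,
            "b1": g_h.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]  # full-batch gradient descent
    return params, losses


def denoise(params, spectra):
    """Single forward pass; no per-spectrum parameters are tuned."""
    h = np.tanh(spectra @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]


if __name__ == "__main__":
    clean, noisy = make_spectra(256, seed=0)
    params, losses = train_autoencoder(noisy, clean)
    test_clean, test_noisy = make_spectra(64, seed=1)
    restored = denoise(params, test_noisy)
    print("training loss, first and last step:", losses[0], losses[-1])
    print("test MSE before:", np.mean((test_noisy - test_clean) ** 2))
    print("test MSE after: ", np.mean((restored - test_clean) ** 2))
```

Once trained, `denoise` is a pair of matrix products per batch, which is what makes a single-pass correction inexpensive for large numbers of spectra. How well such a toy model restores spectra depends entirely on how representative the training data are, so the printed errors should be read as a sanity check rather than a benchmark.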