Detection of People Positive to COVID-19 through ATR-FTIR Spectra Analysis of Saliva using Machine Learning

  • Gustavo Jesús Vazquez-Zapien, Escuela Militar de Medicina, Centro Militar de Ciencias de la Salud, Secretaría de la Defensa Nacional
  • Monica Maribel Mata-Miranda, Escuela Militar de Medicina, Centro Militar de Ciencias de la Salud, Secretaría de la Defensa Nacional
  • Adriana Martinez-Cuazitl, Escuela Militar de Medicina, Centro Militar de Ciencias de la Salud, Secretaría de la Defensa Nacional
  • Melissa Guerrero-Ruiz, Escuela Militar de Medicina, Centro Militar de Ciencias de la Salud, Secretaría de la Defensa Nacional
  • Francisco Garibay-Gonzalez, Escuela Militar de Medicina, Centro Militar de Ciencias de la Salud, Secretaría de la Defensa Nacional
  • Miguel Sanchez-Brito, Instituto Politécnico Nacional, Escuela Superior de Computación
Keywords: Saliva, ATR-FTIR, Machine learning, COVID-19, Diagnosis


COVID-19 is an infectious disease caused by the SARS-CoV-2 virus, which spreads mainly through droplets released from the nose or mouth of an infected person. Although vaccines that effectively reduce the effects of the infection have been developed, the most effective way to contain the spread of the virus is widespread testing to detect and isolate possible carriers. However, the response time and cost of current tests make this option impractical. Herein, we compare several machine learning methodologies to propose a reliable strategy for detecting people positive for COVID-19 by analyzing saliva spectra obtained by Fourier transform infrared (FTIR) spectroscopy. After analyzing 1275 spectra with 7 strategies commonly used in machine learning, we concluded that a multivariate linear regression model (MLMR) is the best option for identifying possibly infected persons. According to our results, the displacement observed in the amide I region of the spectrum is fundamental: the change in slope caused by this displacement reliably establishes a boundary that allows us to characterize carriers of the virus. Because this approach is faster and cheaper than reverse transcriptase polymerase chain reaction (RT-PCR), it could be reliably applied as a preliminary step before RT-PCR.
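To make the idea of a slope-based boundary in the amide I region more concrete, the following is a minimal sketch on purely synthetic data. It is not the authors' MLMR and uses no real spectra. Everything in it is an assumption made for illustration: the 1600 to 1700 cm^-1 window, the Gaussian band shape, the hypothetical band centers (1650 and 1640 cm^-1), the band width, the noise level, the 1620 to 1650 cm^-1 fitting window, and all function names. It fits an ordinary least-squares line to one flank of the band, uses the fitted slope as a single feature, and places the decision boundary at the midpoint between the two class means.

```python
import math
import random


def synthetic_spectrum(center, rng, width=25.0, noise=0.01):
    """Return (wavenumbers, absorbances) for one noisy Gaussian band.

    Purely synthetic stand-in for the amide I band; all values are assumptions.
    """
    wavenumbers = list(range(1600, 1701, 2))  # assumed window, cm^-1
    absorbances = [
        math.exp(-((w - center) ** 2) / (2 * width ** 2)) + rng.gauss(0.0, noise)
        for w in wavenumbers
    ]
    return wavenumbers, absorbances


def flank_slope(wavenumbers, absorbances, lo=1620, hi=1650):
    """Ordinary least-squares slope of absorbance vs. wavenumber on [lo, hi]."""
    pts = [(w, a) for w, a in zip(wavenumbers, absorbances) if lo <= w <= hi]
    n = len(pts)
    mean_w = sum(w for w, _ in pts) / n
    mean_a = sum(a for _, a in pts) / n
    sxy = sum((w - mean_w) * (a - mean_a) for w, a in pts)
    sxx = sum((w - mean_w) ** 2 for w, _ in pts)
    return sxy / sxx


def fit_boundary(slopes_negative, slopes_positive):
    """Midpoint between class means, plus which side counts as positive."""
    mean_neg = sum(slopes_negative) / len(slopes_negative)
    mean_pos = sum(slopes_positive) / len(slopes_positive)
    return (mean_neg + mean_pos) / 2.0, mean_pos > mean_neg


def predict(slope, boundary, positive_is_higher):
    """Classify one slope value as positive (True) or negative (False)."""
    return (slope > boundary) == positive_is_higher


def run_demo(seed=0, n_per_class=50):
    """Train and evaluate on synthetic spectra; returns held-out accuracy."""
    rng = random.Random(seed)
    # Hypothetical band centers; the size and direction of the shift are invented.
    centers = {False: 1650.0, True: 1640.0}

    def sample_slopes(label):
        return [
            flank_slope(*synthetic_spectrum(centers[label], rng))
            for _ in range(n_per_class)
        ]

    boundary, positive_is_higher = fit_boundary(
        sample_slopes(False), sample_slopes(True)
    )
    correct = 0
    for label in (False, True):
        for slope in sample_slopes(label):
            correct += predict(slope, boundary, positive_is_higher) == label
    return correct / (2 * n_per_class)


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic data: {run_demo():.2f}")
```

A single slope feature keeps the sketch readable; a multivariate model would instead regress the class label on several spectral variables at once, and real spectra would need preprocessing that is not shown here.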





How to Cite
Vazquez-Zapien, G. J., Mata-Miranda, M. M., Martinez-Cuazitl, A., Guerrero-Ruiz, M., Garibay-Gonzalez, F., & Sanchez-Brito, M. (2022). Detection of People Positive to COVID-19 through ATR-FTIR Spectra Analysis of Saliva using Machine Learning. Mexican Journal of Biomedical Engineering, 43(3), 44–59.
Research Articles