Automated Mapping of Real-world Oncology Laboratory Data to LOINC.

In this study, we seek to determine the efficacy of automated mapping methods in reducing the manual burden of mapping laboratory data to LOINC® in a nationwide, electronic health record-derived, oncology-specific dataset. We developed novel encoding methodologies to vectorize free-text lab data and evaluated logistic regression, random forest, and k-nearest neighbors (KNN) machine learning classifiers. All machine learning models performed significantly better than deterministic baseline algorithms. The best classifiers were random forests, which predicted the correct LOINC code 94.5% of the time. Ensemble classifiers further increased accuracy, with the best ensemble classifier predicting the same code 80.5% of the time with an accuracy of 99%. We conclude that an automated laboratory mapping model can both reduce manual mapping time and increase the quality of mappings, suggesting that automated mapping is a viable tool in a real-world oncology dataset.
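
As a rough illustration of the general idea of vectorizing free-text lab descriptions and classifying them, the following minimal sketch uses character trigram counts with a cosine-similarity KNN vote. It is not the authors' encoding or model: the trigram scheme, the function names (`char_ngrams`, `cosine`, `knn_predict`), and the toy lab strings in `TOY_LABELED` are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt

# Toy, invented lab descriptions paired with LOINC codes (illustration only;
# not data from the study).
TOY_LABELED = [
    ("hemoglobin blood g/dl", "718-7"),
    ("hgb whole blood", "718-7"),
    ("hemoglobin", "718-7"),
    ("creatinine serum mg/dl", "2160-0"),
    ("creat serum", "2160-0"),
    ("creatinine", "2160-0"),
]

def char_ngrams(text, n=3):
    """Vectorize free text as counts of overlapping character n-grams."""
    # Lowercase, collapse whitespace, and pad so word boundaries are encoded.
    s = f" {' '.join(text.lower().split())} "
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def knn_predict(query, labeled, k=3, n=3):
    """Return the majority code among the k most similar labeled strings."""
    q = char_ngrams(query, n)
    ranked = sorted(
        labeled,
        key=lambda item: cosine(q, char_ngrams(item[0], n)),
        reverse=True,
    )
    votes = Counter(code for _, code in ranked[:k])
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    for raw in ["Hemoglobin (blood)", "serum creatinine"]:
        print(raw, "->", knn_predict(raw, TOY_LABELED))
```

Character n-grams are used here only because they tolerate abbreviations and punctuation in lab names; a production mapper would need far more training data and additional fields such as units and specimen type.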

AMIA ... Annual Symposium proceedings. AMIA Symposium. 2021;2021:611-620.

ISSN 1942-597X

Authors: Jonathan Kelly, Chen Wang, Jianyi Zhang, Spandan Das, Anna Ren, Pradnya Warnekar

©2021 AMIA - All rights reserved.

PMID 35308998

