Data Element Mapping in the Data Privacy Era.

Secondary use of health data is hampered in part by substantial semantic heterogeneity. Many efforts are under way to align local terminologies with international standards. Given growing concerns about data privacy, we focused on machine learning methods that align biological data elements using aggregated features that could be shared as open data. We propose a three-step methodology: feature engineering, a blocking strategy, and supervised learning. The first results, although modest, are encouraging for the future development of this approach.
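
To make the three steps named in the abstract concrete, here is a minimal Python sketch. The abstract does not say which aggregated features, blocking key, or classifier were used, so everything below is an assumption chosen for illustration: mean and standard deviation as the aggregates, the measurement unit as the blocking key, and a one-feature decision stump as the supervised learner. The element codes and values are made up.

```python
from dataclasses import dataclass
from itertools import product
from statistics import mean, pstdev


@dataclass(frozen=True)
class Aggregate:
    """Shareable summary of one local data element (no patient-level rows)."""
    code: str
    unit: str
    mean: float
    std: float


def summarize(code, unit, values):
    # Step 1 (feature engineering): reduce raw values to aggregates.
    # Assumption: mean and population standard deviation are the features.
    return Aggregate(code, unit, mean(values), pstdev(values))


def candidate_pairs(source, target):
    # Step 2 (blocking): keep only pairs that share a unit.
    # Assumption: the unit is the blocking key.
    return [(s, t) for s, t in product(source, target) if s.unit == t.unit]


def distance(s, t):
    # Pair feature: gaps between means and between stds, scaled by the larger mean.
    scale = max(abs(s.mean), abs(t.mean), 1e-9)
    return abs(s.mean - t.mean) / scale + abs(s.std - t.std) / scale


def fit_threshold(labeled):
    # Step 3 (supervised learning): a one-feature decision stump.
    # labeled is a list of (distance, is_match); return the cut with best accuracy.
    def accuracy(cut):
        return sum((d <= cut) == y for d, y in labeled) / len(labeled)
    return max(sorted(d for d, _ in labeled), key=accuracy)


# Hypothetical data elements from two sites, invented for this example.
site_a = [summarize("A:GLU", "mmol/L", [5.1, 5.6, 4.9, 6.2]),
          summarize("A:NA", "mmol/L", [138, 141, 140, 139])]
site_b = [summarize("B:glucose", "mmol/L", [5.0, 5.8, 5.3, 6.0]),
          summarize("B:sodium", "mmol/L", [139, 140, 142, 137]),
          summarize("B:hb", "g/dL", [13.5, 14.1, 12.9, 15.0])]

# Hypothetical labeled pair distances used to train the stump.
training = [(0.05, True), (0.1, True), (0.9, False), (1.2, False)]

cut = fit_threshold(training)
pairs = candidate_pairs(site_a, site_b)
matches = [(s.code, t.code) for s, t in pairs if distance(s, t) <= cut]
```

Only the `Aggregate` records would need to leave each site in this sketch, which is the privacy argument behind aggregated features. A realistic version would use richer aggregates and a stronger classifier.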

Studies in health technology and informatics. 2022 May;294:332-336.

ISSN 1879-8365

Authors: Romain Griffier, Sébastien Cossin, François Konschelle, Fleur Mougin, Vianney Jouhet

PMID 35612087
