Derivation and validation of a machine learning record linkage algorithm between emergency medical services and the emergency department


Linking emergency medical services (EMS) electronic patient care reports (ePCRs) to emergency department (ED) records can provide clinicians access to vital information that can alter management. It can also create rich databases for research and quality improvement. Unfortunately, previous attempts at ePCR and ED record linkage have had limited success. In this study, we use supervised machine learning to derive and validate an automated record linkage algorithm between EMS ePCRs and ED records.

All consecutive ePCRs from a single EMS provider between June 2013 and June 2015 were included. A primary reviewer matched ePCRs to a list of ED patients to create a gold standard. Age, gender, last name, first name, social security number, and date of birth were extracted. Data were randomly split into 80% training and 20% test datasets. We derived missing indicators, identical indicators, edit distances, and percent differences. A multivariate logistic regression model was trained using 5-fold cross-validation, using label k-fold, L2 regularization, and class reweighting.

A total of 14 032 ePCRs were included in the study. Interrater reliability between the primary and secondary reviewer had a kappa of 0.9. The algorithm had a sensitivity of 99.4%, a positive predictive value of 99.9%, and an area under the receiver-operating characteristic curve of 0.99 in both the training and test datasets. Date-of-birth match had the highest odds ratio of 16.9, followed by last name match (10.6). Social security number match had an odds ratio of 3.8. We were able to successfully derive and validate a record linkage algorithm from a single EMS ePCR provider to our hospital EMR.
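The derived features named in the abstract (missing indicators, identical indicators, edit distances, and percent differences) could be computed for one candidate ePCR/ED pair roughly as follows. This is a minimal sketch, not the authors' implementation: the field names, the lower-casing and whitespace stripping, the choice of Levenshtein distance as the edit distance, and the definition of percent difference as |a - b| / max(a, b) are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical field names; the abstract lists the fields but not a schema.
TEXT_FIELDS = ("last_name", "first_name", "ssn", "dob", "gender")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def _norm(value: Optional[str]) -> str:
    """Assumed normalization: None becomes '', text is stripped and lower-cased."""
    return "" if value is None else str(value).strip().lower()


def pair_features(ems: dict, ed: dict) -> dict:
    """Build one feature row for a candidate (ePCR, ED record) pair."""
    feats = {}
    for field in TEXT_FIELDS:
        x, y = _norm(ems.get(field)), _norm(ed.get(field))
        missing = x == "" or y == ""
        feats[f"{field}_missing"] = int(missing)
        feats[f"{field}_identical"] = int(not missing and x == y)
        # Edit distance is set to 0 when a value is missing; the missing
        # indicator carries that information instead (an assumption).
        feats[f"{field}_edit"] = 0 if missing else edit_distance(x, y)

    a, b = ems.get("age"), ed.get("age")
    age_missing = a is None or b is None
    feats["age_missing"] = int(age_missing)
    # Assumed definition of percent difference; max(..., 1) avoids dividing by 0.
    feats["age_pct_diff"] = 0.0 if age_missing else abs(a - b) / max(a, b, 1)
    return feats


# Made-up example records, not study data.
ems_record = {"last_name": "Smith", "first_name": "Jon", "ssn": None,
              "dob": "1980-01-02", "gender": "M", "age": 35}
ed_record = {"last_name": "Smyth", "first_name": "John", "ssn": "123456789",
             "dob": "1980-01-02", "gender": "M", "age": 36}
features = pair_features(ems_record, ed_record)
```

Rows like `features` would then form the design matrix for the classifier. In scikit-learn, for example, a comparable setup might use `LogisticRegression(penalty="l2", class_weight="balanced")` with `GroupKFold(n_splits=5)`, the current name for what older releases called label k-fold. Whether the study used that library or those settings is not stated in the abstract.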

Journal of the American Medical Informatics Association
Yoni Halpern
PhD student

Google Research

David Sontag
Associate Professor of EECS

My research focuses on advancing machine learning and artificial intelligence, and using these to transform health care.