Title: A comparison of structured data query methods versus natural language processing to identify metastatic melanoma cases from electronic health records

Authors: Jinghua He; Lawrence Mark; Charity Hilton; Joel Martin; Jarod Baker; Jon Duke; Siu L. Hui; Xiaochun Li; Paul Dexter

Addresses: Merck & Co., Inc., Center for Observational and Real-World Evidence (CORE), 770 Sumneytown Pike, West Point, PA 19486, USA ' Indiana University School of Medicine, Indianapolis, IN, USA ' Regenstrief Institute, Indianapolis, IN, USA ' Regenstrief Institute, Indianapolis, IN, USA ' Regenstrief Institute, Indianapolis, IN, USA ' College of Computing, Georgia Institute of Technology, USA ' Indiana University School of Medicine, Indianapolis, IN, USA; Regenstrief Institute, Indianapolis, IN, USA ' Indiana University School of Medicine, Indianapolis, IN, USA ' Indiana University School of Medicine, Indianapolis, IN, USA; Regenstrief Institute, Indianapolis, IN, USA; Eskenazi Health, Indianapolis, IN, USA

Abstract: The relative efficacy of natural language processing (NLP) of text reports compared to structured data queries for identifying patients with metastatic cancer from electronic health records (EHRs) remains unclear. Such identification is critical for recruiting potential study candidates for cancer trials, particularly trials of cancer chemotherapy. To this end, we performed a direct comparison between NLP and structured data query methods for identifying patients with metastatic melanoma. Using EHR data from two large institutions, we found that NLP of text reports identified close to three times as many patients with metastatic melanoma as a structured data query algorithm (1,727 vs. 607 patients). Using an external tumour registry, we also found that NLP had much higher sensitivity than the structured query for identifying such patients (67% vs. 35%). Our results emphasise the importance of employing NLP criteria when identifying potential cancer study candidates with metastatic disease.

Keywords: efficacy; natural language processing; NLP; structured data queries; identification; identifying patients; electronic health records; EHRs.

DOI: 10.1504/IJCMH.2019.104364

International Journal of Computational Medicine and Healthcare, 2019 Vol.1 No.1, pp.101 - 111

Published online: 06 Jan 2020
