Abstract
Background/Aims One of the primary challenges in observational research based on electronic health record (EHR) data is judging whether a patient had a given disease as of a specific date. The problem list, encounter records, and prescription medication orders are all sources of coded diagnoses, but their use varies across providers and diseases. The objective of this study was to test the sensitivity of different diagnosis criteria for classifying patients with specific diseases in the EHR.
Methods A deidentified dataset was assembled for a randomly selected population of 10,000 patients (age 18+) who had at least one encounter per year from 2007–11, including all problem list, encounter, and prescription diagnoses. For each of the 17 diseases in the Charlson Comorbidity Index, we counted the total number of patients who had 1 or more problem list diagnoses, encounter diagnoses, prescription diagnoses, and combinations of the three. Because many past projects at our institution have used 1 problem list entry or 2 encounter diagnoses (“P1/E2”) as an inclusion criterion, this criterion was highlighted as a benchmark for comparison.
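The P1/E2 inclusion rule described above can be sketched as a simple classification function. This is a minimal illustration, not the study's actual pipeline: the record format, field names, and the `meets_p1_e2` helper are all hypothetical, and the ICD code used is just an example.

```python
from collections import Counter

# Hypothetical record format: (patient_id, source, code), where "source" is one
# of "problem_list", "encounter", or "prescription". These names are
# illustrative assumptions, not taken from the study's dataset.
def meets_p1_e2(records, patient_id, disease_codes):
    """Return True if the patient has >=1 problem-list diagnosis or
    >=2 encounter diagnoses whose code falls in disease_codes."""
    counts = Counter(
        source
        for pid, source, code in records
        if pid == patient_id and code in disease_codes
    )
    return counts["problem_list"] >= 1 or counts["encounter"] >= 2

records = [
    (101, "encounter", "I50.9"),     # heart failure, first encounter dx
    (101, "encounter", "I50.9"),     # second encounter dx -> qualifies
    (102, "prescription", "I50.9"),  # prescription dx alone does not qualify
    (103, "problem_list", "I50.9"),  # one problem-list entry -> qualifies
]
hf_codes = {"I50.9"}
print([pid for pid in (101, 102, 103) if meets_p1_e2(records, pid, hf_codes)])
# -> [101, 103]
```

The other 19 criteria the study tested would follow the same pattern, varying only the sources counted and the thresholds required.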
Results Although ideally every patient with a disease would have it documented consistently throughout the EHR, this study confirmed that this is not the case. The “P1/E2” criterion usually ranked between 5th and 9th (out of 20 criteria tested), suggesting that it casts a fairly wide net for most diseases. We were surprised to find that the time elapsed between two events did not generally affect the results; i.e., requiring patients to have two encounter diagnoses >90 days apart did not yield substantially fewer patients than requiring them >1 day apart. Overall sensitivity varied dramatically by disease: for example, for Neoplasms, the 4 least restrictive criteria yielded similar numbers of patients, followed by a sharp drop-off. For Heart Failure, the number of patients declined much more gradually as criteria became more restrictive, suggesting greater consistency across the EHR.
Conclusions We hope empirical data like these can help researchers better understand how diagnosis criteria affect their cohorts in retrospective studies.

