?:abstract
|
-
Ideally, healthcare systems should be able to learn from historical data, including whether common exposures are associated with adverse clinical outcomes. Unfortunately, structured clinical data, such as encounter diagnostic codes in electronic health records, suffer from multiple limitations and biases that hinder effective learning. We hypothesized that a machine learning approach to automate ascertainment of clinical events and disease history from medical notes would improve upon structured data and enable the estimation of real-world risks. We tested this approach on a timely question: estimating the risk of delayed adverse cardiovascular events (i.e., events occurring after the index infection) in patients infected with respiratory viruses. Using 4,151 cardiologist-labeled notes as the gold standard, we trained a series of neural network models to automate event adjudication for heart failure hospitalization, acute coronary syndrome, stroke, and coronary revascularization, and to identify past medical history of heart failure. Though performance varied by task, in nearly all cases our models surpassed structured data in sensitivity at a given specificity level and enabled principled evaluation of classification thresholds, which is typically impossible with diagnostic codes. Deploying our models on more than 17 million notes for 267,596 patients across an extensive integrated delivery network, we found that patients infected with respiratory syncytial virus had a 23% increased risk of delayed heart failure hospitalization over a subsequent 4-year period compared with propensity-score-matched patients who had the same test with a negative result (p = 0.003, log-rank). In contrast, we found no such increased risk in patients with a positive influenza viral test compared with those with a negative test (rate ratio 0.98, p = 0.71).
We conclude that convolutional neural network-based models enable accurate clinical labeling at scale, thereby unlocking timely insights from unstructured clinical data.
|