Automated Extraction of Time References from Clinical Notes in a Heart Failure Telehealth Network

Fabian Wiesmueller1, Alphons Eggerth1, Karl Kreiner1, Dieter Hayn1, Bernhard Pfeifer1, Gerhard Pölzl2, Tim Egelseer-Bründl3, Günter Schreier1
1AIT Austrian Institute of Technology GmbH, Graz, Austria, 2Department of Internal Medicine III, Cardiology and Angiology, Medical University Innsbruck, Innsbruck, Austria, 3Landesinstitut für Integrierte Versorgung – LIV Tirol, Innsbruck, Austria


Abstract

Aims: In the ageing population, the number of people affected by chronic heart diseases is rising. Being widespread and severe, heart failure is a burden not only to patients but also to health care systems. Starting in 2012, a heart failure telehealth network called “HerzMobil” was established to reduce hospital readmission rates and improve outcomes in heart failure patients. Patients use blood pressure measuring devices and weight scales along with off-the-shelf smartphones to transmit their data daily. Health professionals can access these data via a web portal. Besides entering various structured data, they can also communicate via free-text notes. These notes (currently 27,287 in total) may include important details, such as a change in medication, and can thus provide essential information for the further treatment process. Automated analysis of clinical notes and transformation to structured data would improve the availability of this information for further automated processing.

Methods: As a first step towards a fully automated analysis of the HerzMobil notes, we extracted all dates and time references. For implementation, we chose Python because it is widely used in science, freely available, powerful and intuitive. Because no sufficiently accurate Python library existed, we developed a customized Python script. This software module was mostly based on regular expressions and a set of tailored rules. Finally, we compared our results to parsedatetime, considered the most accurate Python library available.

Results: Using a manually pre-annotated subset of notes (250 with time references and 250 without time references) for evaluation, our approach achieved a significantly higher accuracy (98.4%) than the state of the art, i.e. the parsedatetime library (81.6%).

Conclusion: The achieved accuracy of 98.4% indicates that our approach presents a viable basis for the next step, i.e. transforming all extracted time references to timestamps and linking them to specific events.
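
The regular-expression approach named in the Methods can be illustrated with a minimal Python sketch. The abstract does not give the actual patterns or rules, so everything below is an assumption made for illustration: the pattern list, the names TIME_REFERENCE_PATTERNS, extract_time_references and has_time_reference, the day.month.year date format, and the small set of relative terms are all hypothetical and are not the authors' implementation.

```python
import re

# Hypothetical patterns for illustration only; the rules actually used
# for the HerzMobil notes are not described in the abstract.
TIME_REFERENCE_PATTERNS = [
    # Numeric dates in day.month[.year] form, e.g. 03.05.2019, 3.5.19, 03.05.
    ("date", re.compile(
        r"\b(?:0?[1-9]|[12]\d|3[01])\.(?:0?[1-9]|1[0-2])\.(?:\d{4}|\d{2})?(?!\d)"
    )),
    # Clock times in 24-hour form, e.g. 8:05 or 14:30
    ("time", re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b")),
    # A few relative expressions (German and English), assumed for the example
    ("relative", re.compile(
        r"\b(?:heute|morgen|gestern|today|tomorrow|yesterday)\b",
        re.IGNORECASE,
    )),
]


def extract_time_references(note):
    """Return (label, matched_text) pairs in order of appearance in the note."""
    matches = []
    for label, pattern in TIME_REFERENCE_PATTERNS:
        for m in pattern.finditer(note):
            matches.append((m.start(), label, m.group(0)))
    matches.sort()  # order by position in the text
    return [(label, text) for _, label, text in matches]


def has_time_reference(note):
    """True if at least one pattern matches anywhere in the note."""
    return bool(extract_time_references(note))


if __name__ == "__main__":
    example = "Medikation am 03.05.2019 um 14:30 angepasst, Kontrolle morgen."
    for label, text in extract_time_references(example):
        print(label, text)
```

A real rule set would need many more patterns (written-out month names, weekdays, durations, abbreviations) plus rules to reject look-alikes such as dosages or version numbers; the sketch only shows the general shape of pattern matching followed by simple post-processing.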