Intelligent audit code generation from free text in the context of neurosurgery
Thesis posted on 03.03.2017, 01:20 by Khademi Habibabadi, Sedigheh
Clinical auditing requires structured data for aggregation and analysis of patterns. Clinicians, however, need to record clinical encounters in written or spoken language, not only because it fits naturally into their workflow but also for its expressivity, precision, and capacity to convey all required information, which codified structured data cannot match. Structured data must therefore be obtained from clinical text in a later step, a task known as information extraction. Specialised areas of medicine use their own clinical language and clinical coding systems, which creates unique challenges for the extraction process. This research is devoted to creating a novel semi-automated method for generating codified auditing data from clinical notes recorded in a neurosurgical department in an Australian teaching hospital. The department has its own audit coding system, and the language used in its clinical notes is highly specific to the neurological and neurosurgical domains, which necessitated a customised approach. The principles of Design Science Research were followed to design a method that combines Natural Language Information Extraction and Machine Learning techniques. The method was tested by developing a computer programme that incorporates text extraction algorithms trained and tested on data supplied by the neurosurgical department. The software implements rules that were initially provided by a domain expert and extended during development, combined with a custom-built machine learning-based prediction system. The software architecture was informed by the requirement that it be an instantiation of the method and therefore capable of evaluation within the department’s computer systems. To the author’s knowledge, there has been no previously published research addressing the challenges of codifying neurosurgical-specific audit categories from free text.
By combining highly specific rule-based information extraction with the weighted word counts of a machine learning component, the method demonstrates a novel approach to building applications that solve this codification problem.
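
The general idea of pairing rules with weighted word counts can be illustrated with a minimal Python sketch. It is not the thesis software: the audit codes, regular expressions, word weights, threshold, and function names below are all hypothetical, and the weights are set by hand here, whereas a machine learning component would learn them from labelled notes. The sketch assumes rules take precedence, that weighted word counts are used only when no rule fires, and that notes with no confident suggestion are left for manual review, in keeping with a semi-automated workflow.

```python
import re
from collections import Counter

# Hypothetical audit codes and rule patterns, for illustration only.
RULES = [
    (re.compile(r"\bcsf leak\b", re.IGNORECASE), "CSF_LEAK"),
    (re.compile(r"\bwound infection\b", re.IGNORECASE), "WOUND_INFECTION"),
]

# Hypothetical per-code word weights. Hand-set here; a trained
# component would estimate them from labelled clinical notes.
WEIGHTS = {
    "CSF_LEAK": {"csf": 2.0, "leak": 1.5, "rhinorrhoea": 1.0},
    "WOUND_INFECTION": {"wound": 1.0, "infection": 2.0, "pus": 1.5, "erythema": 1.0},
}


def tokenize(text):
    """Lowercase the note and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_codes(text):
    """Return, for each code, the sum of weight * word count over its weighted words."""
    counts = Counter(tokenize(text))
    return {
        code: sum(weight * counts[word] for word, weight in weights.items())
        for code, weights in WEIGHTS.items()
    }


def suggest_code(text, threshold=2.0):
    """Suggest an audit code and report which component produced it."""
    # 1. Highly specific rules are tried first.
    for pattern, code in RULES:
        if pattern.search(text):
            return code, "rule"
    # 2. Otherwise fall back to the weighted word counts.
    scores = score_codes(text)
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return best, "weighted"
    # 3. No confident suggestion: leave the note for a human coder.
    return None, "review"
```

For example, `suggest_code("Post-op CSF leak noted.")` is resolved by a rule, while a note such as "Erythema and pus around the wound" matches no rule and is resolved by its word weights. The sketch ignores negation ("no CSF leak"), which a real clinical extraction system would have to handle.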