Data Mining - A case study of the Victorian Coroner’s database
Report posted on 2022-07-25, 00:37, authored by J Ceddia
This report describes the process undertaken to ‘data mine’ a dataset comprising the Victorian case management data for the Coroner’s office at the Victorian Institute of Forensic Medicine (VIFM). The study has three aims: (i) determining which data in the database are relevant to the study, (ii) identifying which data mining techniques are appropriate for these data, and (iii) evaluating the implications of any results. To support the first aim, the data structure and the data preparation phase are discussed, together with the exploratory statistics and the derivation of additional fields used in further analysis, such as ICD-10 codes for cause of death, body mass index and statistical subdivision. For the second aim, SPSS is used for statistical analysis, while clustering and association rule mining are performed with freely available tools such as Magnum Opus (trial version) and WEKA. The last aim requires comparing and interpreting the results against other sources such as the Australian Bureau of Statistics (ABS). For example, the distribution of sex in this database is 66% male to 34% female, in contrast to the nearly equal male-to-female ratio that the ABS reports for the population. Applying free-text clustering tools such as growing self-organizing feature maps could be the next stage of analysis. However, while the technique can be applied, interpreting its results would require a domain expert such as a pathologist.
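As an illustration of the kind of field derivation mentioned above, the following is a minimal sketch of computing body mass index from height and weight, using the standard definition of weight in kilograms divided by the square of height in metres. The field names (`case_id`, `weight_kg`, `height_cm`), the units and the sample records are assumptions made for this example; they are not taken from the VIFM database or from the report's own method.

```python
from typing import Optional


def body_mass_index(weight_kg: Optional[float],
                    height_cm: Optional[float]) -> Optional[float]:
    """Return BMI in kg/m^2, or None if a measurement is missing or non-positive."""
    if weight_kg is None or height_cm is None:
        return None
    if weight_kg <= 0 or height_cm <= 0:
        return None
    height_m = height_cm / 100.0  # convert centimetres to metres
    return weight_kg / (height_m ** 2)


# Made-up records for illustration only; the field names are hypothetical.
records = [
    {"case_id": "A", "weight_kg": 80.0, "height_cm": 200.0},
    {"case_id": "B", "weight_kg": None, "height_cm": 170.0},
]

# Add the derived field to each record, leaving it as None when it
# cannot be computed.
for record in records:
    record["bmi"] = body_mass_index(record["weight_kg"], record["height_cm"])
```

Returning `None` for missing or implausible measurements, rather than a placeholder number, keeps incomplete cases from distorting later statistics or clustering.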