Monash University

MML Inference of Decision Trees, Graphs and Forests

thesis
posted on 2017-03-16, 04:38 authored by Peter J. Tan
The Minimum Message Length (MML) principle is a general framework that recasts inductive inference as a coding process. Using the same assumptions as Bayesian inference, prediction and modelling, MML represents general inductive and statistical inference problems as a data encoding process that conveys the data to a receiver in a two-part message. The first part encodes the inferred model. The second part encodes the data in light of the inferred model. MML states that the model with the shortest two-part message is the best approximation to the true model.
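The two-part message idea can be illustrated with a minimal sketch. This is a deliberately crude toy for binary data, not one of the coding schemes developed in the thesis: the "model" is a Bernoulli parameter picked from a uniform grid, the first part costs log2(grid size) bits, and the second part is the negative log-likelihood of the data in bits. All function names and the grid construction are assumptions made for illustration.

```python
import math

def data_bits(ones, zeros, p):
    """Second part: bits needed to encode the data given Bernoulli parameter p."""
    return -(ones * math.log2(p) + zeros * math.log2(1 - p))

def two_part_length(ones, zeros, grid_size):
    """Shortest two-part message when p is restricted to a grid of grid_size values.

    First part: log2(grid_size) bits to name the chosen grid point (toy assumption).
    Second part: data_bits under that grid point.
    """
    model_bits = math.log2(grid_size)
    # Grid midpoints avoid p = 0 and p = 1, where log2 would be undefined.
    candidates = [(i + 0.5) / grid_size for i in range(grid_size)]
    best_p = min(candidates, key=lambda p: data_bits(ones, zeros, p))
    return model_bits + data_bits(ones, zeros, best_p), best_p

def best_grid(ones, zeros, grid_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Return (total_bits, grid_size, p) for the shortest two-part message."""
    results = []
    for g in grid_sizes:
        total, p = two_part_length(ones, zeros, g)
        results.append((total, g, p))
    return min(results)

if __name__ == "__main__":
    for ones, zeros in [(50, 50), (90, 10)]:
        total, g, p = best_grid(ones, zeros)
        print(f"{ones} ones, {zeros} zeros: grid={g}, p={p}, total={total:.2f} bits")
```

A finer grid lets the second part shrink but makes the first part longer, so the shortest total message balances model complexity against fit. For balanced data the single-point grid (p = 0.5, zero model cost) wins, while skewed data justifies paying a few bits to state a better parameter.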
   
   Decision trees, decision graphs and decision forests are popular supervised learning methods in machine learning. In this dissertation, the MML principle is applied to build machine learning schemes for decision trees, decision graphs and decision forests. Two novel MML inference schemes are developed. One is an MML coding scheme for oblique decision trees, which are decision trees with linear discriminant functions at their internal nodes. The other is an MML coding scheme for decision graphs with multi-way joins and dynamic attributes. A decision forest learning scheme based on MML oblique decision trees is also presented.
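As a small illustration of what distinguishes an oblique decision tree from a conventional one, the sketch below contrasts the two kinds of internal-node test. The function names, weights and thresholds are hypothetical and are not taken from the thesis; the sketch shows only the test at a single node, not how the thesis infers or encodes it.

```python
def axis_parallel_split(x, attribute, threshold):
    """Conventional test: compare a single attribute with a threshold."""
    return x[attribute] > threshold

def oblique_split(x, weights, threshold):
    """Oblique test: compare a linear combination of attributes with a threshold.

    The decision boundary is the hyperplane sum(w_i * x_i) = threshold.
    """
    return sum(w * xi for w, xi in zip(weights, x)) > threshold

if __name__ == "__main__":
    point = [1.0, 2.0]
    print(axis_parallel_split(point, attribute=0, threshold=1.5))       # tests x0 > 1.5
    print(oblique_split(point, weights=[1.0, 1.0], threshold=2.5))      # tests x0 + x1 > 2.5
```

An axis-parallel test is the special case of an oblique test in which exactly one weight is non-zero.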
   
   Experiments were conducted across a range of problems using data from the University of California Machine Learning Repository and the Singapore Data Mining Centre. These experiments showed that, compared to other popular decision tree models such as C4.5 and C5, models generated by the MML inference schemes achieved favourable results in both classification and probabilistic predictive accuracy. The study showed that MML inference schemes are able to find the optimal trade-off between the complexity of these structured models and goodness of fit for a given set of data.

Campus location

Australia

Principal supervisor

David Dowe

Additional supervisor 1

Trevor Dix

Year of Award

2017

Department, School or Centre

Information Technology (Monash University Clayton)

Course

Doctor of Philosophy

Degree Type

DOCTORATE

Faculty

Faculty of Information Technology
