Learning and Inference of Probabilistic Finite State Machines using MML and Applications to Classification Problem
Thesis posted on 19.06.2017 by VIDYA SAIKRISHNA
This thesis examines the problem of learning Probabilistic Finite State Machines (PFSMs) from text data and applies the learned machines to text classification. PFSMs capture regularities and patterns in text data very effectively, and this capability is combined with the compression provided by the Minimum Message Length (MML) principle. Several approaches are developed and applied to two-class classification scenarios, such as classifying spam and non-spam emails in the Enron spam datasets and predicting individuals in the Activities of Daily Living datasets. The approaches produce significant results and outperform existing classification methods.