Anomaly detection using isolation

Liu, Fei Tony

doi:10.4225/03/58901b400c59b

4597651_monash_80106.pdf (1.4 MB)

thesis

posted on 2017-01-31, 05:06 authored by Liu, Fei Tony

Anomaly detection is the process of discovering unusual data patterns that are different from the majority of the data. It has been used for fraud detection in the credit card and insurance industries, as well as other applications such as intrusion detection, industrial damage detection, and medical and public health anomaly detection. Alongside predictive modelling, link analysis and cluster analysis, anomaly detection forms one of the four pillars in data mining research and applications. Anomalies are data points that are intrinsically few in number and different from other normal data. Due to these intrinsic properties, anomalies are highly susceptible to isolation. This thesis proposes the first isolation-based anomaly detectors that detect anomalies purely based on the concept of isolation. The proposed method is fundamentally different from all existing methods that determine anomalies using distance-based or density-based approaches. Isolation-based anomaly detectors estimate the susceptibility to isolation for each data point without employing any computationally expensive distance or density measures. This fundamental difference allows a significantly lower processing time, higher detection accuracy and the ability to detect a wider range of anomalies, such as clustered anomalies. This thesis explains how isolation-based anomaly detectors work in separating anomalies from the majority of data, even when there is a high volume of data. In addition, an extensive empirical evaluation and an investigation of high-dimensional data are provided. Finally, we discuss possible extensions of this novel method, such as handling categorical data and data streams.
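
The idea of "susceptibility to isolation" described in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm, only a toy under stated assumptions: data are numeric tuples, a region is split repeatedly on a randomly chosen attribute at a randomly chosen value, and the number of splits needed to leave a point alone is averaged over random subsamples. The function names, the subsample size of 64 and the 200 trials are all hypothetical choices made for illustration. No distance or density is computed.

```python
import random


def isolation_depth(point, data, rng, depth=0, max_depth=50):
    """Number of random axis-aligned splits until `point` is alone in its region."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))      # pick a random attribute
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:                         # no spread on this attribute: stop (a simplification)
        return depth
    split = rng.uniform(lo, hi)          # pick a random split value within the range
    # Keep only the rows that fall on the same side of the split as `point`.
    if point[dim] < split:
        same_side = [row for row in data if row[dim] < split]
    else:
        same_side = [row for row in data if row[dim] >= split]
    return isolation_depth(point, same_side, rng, depth + 1, max_depth)


def mean_isolation_depth(point, data, trials=200, sample_size=64, seed=0):
    """Average isolation depth of `point` over random subsamples of `data`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        sample = rng.sample(data, min(sample_size, len(data)))
        # The point is added to each subsample so that it can be isolated from it.
        total += isolation_depth(point, sample + [point], rng)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical data: a single Gaussian cluster standing in for "normal" points.
    normal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
    print("far point    :", mean_isolation_depth((8.0, 8.0), normal))
    print("central point:", mean_isolation_depth((0.0, 0.0), normal))
```

In this toy setting, a point far from the cluster should typically need fewer splits than a point at its centre, so a shorter average depth suggests a more anomalous point.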

Campus location

Australia

Principal supervisor

Kai Ming Ting

Year of Award

2011

Department, School or Centre

Information Technology (Monash University Gippsland)

Course

Doctor of Philosophy

Degree Type

DOCTORATE

Faculty

Faculty of Information Technology

Keywords

monash:80106; Outlier; Anomaly; 1959.1/523154; Machine learning; ethesis-20110606-21573; Open access; Data mining; 2011; thesis(doctorate)

Licence

In Copyright
