Monash University
Browse

Discovering Significant Patterns

Download (333.82 kB)
report
posted on 2022-07-25, 00:35 authored by G I Webb
Exploratory pattern discovery techniques, such as association rule discovery, explore large search spaces of potential patterns to find those that satisfy some user-specified constraints. Due to the large number of patterns considered, they suffer from an extreme risk of type-1 error, that is, of finding patterns that appear due to chance alone to satisfy the constraints on the sample data. This paper proposes techniques to overcome this problem by applying well-established statistical practices. These al- low the user to enforce a strict upper limit on the risk of experiment wise error. Empirical studies demonstrate that standard exploratory pattern discovery techniques can discover numerous spurious patterns when applied to random data and when applied to real-world data result in large numbers of patterns that are rejected when subjected to statistical evaluation on holdout data. They also reveal that modification of the pattern discovery process to anticipate subsequent statistical evaluation can in- crease the number of patterns that are accepted by statistical evaluation on holdout data.

History

Technical report number

2006/202

Year of publication

2006

Usage metrics

    Monash Information Technology Technical Reports

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC