Monash University

Towards Efficient Transformers via Network Quantization

Thesis posted on 2025-04-14, authored by Jing Liu
This thesis tackles the high computational and memory demands of Transformers, proposing "Green AI" methods that reduce cost without sacrificing performance. The central technique is network quantization, which lowers the numerical precision of weights and activations so that models can run efficiently on a wider range of hardware. The two main approaches, quantization-aware training (QAT) and post-training quantization (PTQ), offer distinct trade-offs: QAT simulates quantization during training and typically achieves higher accuracy, while PTQ quantizes an already-trained model and is the more cost-effective option for very large models. The thesis introduces novel methods for both QAT and PTQ that minimize the performance loss incurred by quantization, supporting efficient and sustainable AI applications.
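As a rough illustration of the basic operation that both QAT and PTQ build on (this is not code from the thesis), the sketch below applies uniform symmetric quantization to a weight tensor and measures the error introduced; the bit-widths, per-tensor scale choice, and function names are illustrative assumptions.

    import numpy as np

    def fake_quantize(x, num_bits=8):
        """Uniform symmetric quantization: map float values onto a
        num_bits integer grid and back, returning the dequantized
        tensor and the mean squared quantization error."""
        qmax = 2 ** (num_bits - 1) - 1        # e.g. 127 for signed 8-bit
        scale = np.max(np.abs(x)) / qmax      # one scale for the whole tensor
        q = np.clip(np.round(x / scale), -qmax, qmax)  # integer values
        x_hat = q * scale                      # dequantize to simulate low precision
        return x_hat, np.mean((x - x_hat) ** 2)

    # Example: quantize a random weight matrix at 8 and 4 bits
    w = np.random.randn(4, 4).astype(np.float32)
    for bits in (8, 4):
        _, err = fake_quantize(w, num_bits=bits)
        print(f"{bits}-bit mean squared error: {err:.6f}")

In PTQ this quantize-dequantize step is applied to a trained model, whereas in QAT it is inserted into the forward pass during training so the network learns to compensate for the rounding error.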

History

Campus location

Australia

Principal supervisor

Bohan Zhuang

Additional supervisor 1

Jianfei Cai

Year of Award

2025

Department, School or Centre

Data Science & Artificial Intelligence

Course

Doctor of Philosophy

Degree Type

DOCTORATE

Faculty

Faculty of Information Technology
