Monash University

Towards Open-World Vision Applications by Learning Image-Region Representation

thesis
posted on 2024-05-04, 19:39 authored by DUY SON DAO
Open-world vision applications use computer vision techniques to analyze and understand visual data in dynamic environments. Researchers are developing novel approaches for learning image-region representations, i.e., representations that focus on regions within an image rather than on the image as a whole. Region-level representations offer advantages such as a more comprehensive understanding of visual content, enhanced adaptability, and the integration of contextual information. The main challenges in learning them for open-world vision applications are model generalization and data availability. This research proposes learning frameworks for two tasks: Open-Vocabulary Multi-Label Classification (OVML) and Open-Vocabulary Semantic Segmentation (OVS).

Campus location

Australia

Principal supervisor

Jianfei Cai

Additional supervisor 1

Dinh Phung

Year of Award

2024

Department, School or Centre

Data Science & Artificial Intelligence

Additional Institution or Organisation

Data Science and Artificial Intelligence

Course

Doctor of Philosophy

Degree Type

DOCTORATE

Faculty

Faculty of Information Technology
