A semantic SLAM model for autonomous mobile robots using content based image retrieval techniques

Tan, Choon Ling

doi:10.4225/03/589010dbc8376

monash_85496.pdf (2.79 MB)

thesis

posted on 2017-01-31, 04:21 authored by Tan, Choon Ling

Localization and environmental mapping are two fundamental functions for an autonomous mobile robot. This thesis develops a new framework that allows a robot with a vision sensor to achieve both functions simultaneously. The approach interprets video images for their meaning and generates a map and localization data from these meanings. The experiments show promising results for this new approach.
If robots are to perform robustly with no a priori knowledge of their environment, they must be able to perform Simultaneous Localization and Mapping (SLAM), whereby the robot incrementally builds a map of the environment it is navigating while simultaneously keeping track of its location within that map. SLAM might be considered a solved problem: a large body of literature, discussed in this thesis, provides sound models for building robust solutions. However, most, if not all, current state-of-the-art SLAM techniques rely on a tight loop of detecting and tracking low-level features to update the robot's current pose, or location. We argue that these methods are brittle and do not offer general-purpose solutions to the problem. This thesis takes a cognitive approach to the subject and develops a new SLAM model based on extracting semantic information from the robot's sensor data.
In this thesis, we develop a new SLAM framework which analyses video streams for semantic content, drawing inspiration from the Content Based Image Retrieval (CBIR) research area. We use the well-established Tamura texture features to decompose the video stream into a grid of lexemes (or recognized categories), which we then use to construct grammatical sentences. These sentences form place descriptions and are used for constructing the environmental map and for localization.
In contrast to engineered methods, our framework does not return precise location information, as we argue it is enough to know roughly where one is located. We have implemented a proof-of-concept model and tested it in both indoor and outdoor environments. The results show that our model can construct useful semantic descriptions from the video stream and use these descriptions to implement SLAM. Although the derived semantic descriptions are fairly coarse (owing to the limitations of the Tamura texture features), the technique could be refined by adopting a richer set of feature vectors; we leave this as future work.
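The pipeline described in the abstract (texture features per grid cell, lexemes, sentences, place matching) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes only one Tamura feature (contrast, computed as the standard deviation divided by the fourth root of the kurtosis), and everything else is hypothetical: the lexeme names, the thresholds, the flat row-major "sentence" in place of a real grammar, and the similarity-based matching rule.

```python
def tamura_contrast(pixels):
    """Tamura contrast of a flat list of grey levels: sigma / kurtosis**0.25."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    if var == 0:
        return 0.0  # perfectly uniform patch
    mu4 = sum((p - mean) ** 4 for p in pixels) / n
    kurtosis = mu4 / var ** 2
    return var ** 0.5 / kurtosis ** 0.25


def to_lexeme(contrast):
    """Map a contrast value to a category. Thresholds are hypothetical."""
    if contrast < 5.0:
        return "smooth"
    if contrast < 40.0:
        return "textured"
    return "rough"


def frame_to_sentence(frame, rows, cols):
    """Split a greyscale frame (list of pixel rows) into a rows x cols grid
    and return the cell lexemes in row-major order as a tuple."""
    cell_h, cell_w = len(frame) // rows, len(frame[0]) // cols
    sentence = []
    for r in range(rows):
        for c in range(cols):
            cell = [frame[y][x]
                    for y in range(r * cell_h, (r + 1) * cell_h)
                    for x in range(c * cell_w, (c + 1) * cell_w)]
            sentence.append(to_lexeme(tamura_contrast(cell)))
    return tuple(sentence)


def similarity(a, b):
    """Fraction of grid positions whose lexemes agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def localize_or_map(place_map, sentence, threshold=0.75):
    """Return the index of the best-matching known place, or add the
    sentence to the map as a new place if nothing matches well enough."""
    best_id, best_score = None, 0.0
    for place_id, stored in enumerate(place_map):
        score = similarity(sentence, stored)
        if score > best_score:
            best_id, best_score = place_id, score
    if best_id is not None and best_score >= threshold:
        return best_id
    place_map.append(sentence)
    return len(place_map) - 1


# Usage: two synthetic 8x8 frames stand in for video frames.
flat = [[100] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
place_map = []
visits = [localize_or_map(place_map, frame_to_sentence(f, 2, 2))
          for f in (flat, checker, flat)]
```

In this sketch the map is simply a list of previously seen sentences, and the place index returned is deliberately coarse, in keeping with the abstract's argument that it is enough to know roughly where one is located.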

History

Campus location

Australia

Principal supervisor

Simon Egerton

Additional supervisor 1

Velappa Ganapathy

Year of Award

2011

Department, School or Centre

Information Technology (Monash University Malaysia)

Course

Doctor of Philosophy

Degree Type

DOCTORATE

Faculty

Faculty of Information Technology

Keywords

Image patches; Semantic signatures; thesis(doctorate); monash:85496; CBIR; Computer vision; Open access; Semantic SLAM; ethesis-20120213-184126; 1959.1/577568; 2011

Licence

In Copyright

