Integrating the surrounding image information within a high-level conceptual framework for symbolic image indexing and retrieval on the WWW

posted on 17.02.2017, 01:43 by Paizi, Wan Fariza
The contextual information of Web images is investigated to enrich their index characterizations with semantic descriptors and thereby bridge the semantic gap (i.e. the gap between the low-level content-based description of images and their semantic interpretation). Although we are motivated by the availability of rich knowledge on the Web and the relative success of commercial search engines in indexing images using the surrounding text in webpages, we are aware that the unpredictable quality of the surrounding text is a major limiting factor. To improve its quality, we describe a multifaceted semantic concept-based indexing model which analyzes the semantics of image contextual information and classifies it into five broad semantic concept classes (or facets): signal, object, abstract, scene, and relational. This faceted indexing model relates to users’ levels of image description and meets user needs for specific/compound queries.
We first highlight the contextual information that is relevant for the semantic characterization of Web images and study its statistical properties in terms of location and semantic nature. A user study is conducted to validate the results. The results suggest that several locations consistently contain textual information relevant to the image. The importance of each location depends on the type of webpage: relevant contextual information is distributed differently across the locations for different webpage types. The most frequently found semantic concept classes are object and abstract. Another important outcome of the user study is that a webpage is not an atomic unit and can be further partitioned into smaller segments. Segments containing images are of interest and are termed image segments. The study shows that users typically single out the textual information they consider relevant to the image from the text bounded within the image segment.
Hence, the second contribution is a DOM tree-based webpage segmentation algorithm that automatically partitions webpages into image segments. Its effectiveness is validated on a number of datasets, including the human-labeled dataset resulting from our user study, and experiments demonstrate that our method consistently achieves better results than existing Web image context extractors.
The final contribution is a fully automatic facet classification algorithm for our proposed multifaceted semantic concept-based image indexing framework. The classification is performed on image contextual information extracted from any general webpage. Natural language processing techniques are used to break the contextual information into a set of disambiguated indexing terms (i.e. words or non-compositional phrases) and to discover the syntactic relations between the terms. A knowledge base is then used to classify each term into the corresponding facet(s), giving the single-facet concepts. The single-facet concepts, together with their identified relationships, form the multifaceted concepts. Encouraging precision and recall results are reported, which demonstrate the effectiveness of the proposed faceted indexing method over baseline methods.
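The abstract does not spell out the segmentation algorithm, so the following is only a rough illustration of what extracting an image segment from a DOM tree can look like. It is a minimal sketch under a simple assumed heuristic (climb from each image to the nearest block-level ancestor that contains text), not the method developed in the thesis. The names `Node`, `TreeBuilder`, `image_segments`, `VOID_TAGS`, and `BLOCK_TAGS` are hypothetical and introduced for this example only.

```python
from html.parser import HTMLParser

# Assumption: tags treated as void (no closing tag) and as segment boundaries.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}
BLOCK_TAGS = {"div", "td", "li", "figure", "section", "article", "p", "body"}


class Node:
    """A minimal DOM node holding child nodes and text strings in document order."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []

    def text(self):
        """Concatenate all text found in this subtree."""
        parts = []
        for child in self.children:
            parts.append(child if isinstance(child, str) else child.text())
        return " ".join(p for p in parts if p)


class TreeBuilder(HTMLParser):
    """Build a simple tree from HTML and remember every <img> node."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root
        self.images = []

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag == "img":
            self.images.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        cleaned = " ".join(data.split())
        if cleaned:
            self.current.children.append(cleaned)


def image_segments(html):
    """Return one candidate segment (src, alt, surrounding text) per image."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    segments = []
    for img in builder.images:
        node = img.parent
        # Heuristic: stop at the first block-level ancestor that carries text.
        while node.parent is not None and not (node.tag in BLOCK_TAGS and node.text()):
            node = node.parent
        segments.append({
            "src": img.attrs.get("src"),
            "alt": img.attrs.get("alt") or "",
            "text": node.text(),
        })
    return segments


if __name__ == "__main__":
    sample = (
        '<body><div><a href="x"><img src="a.jpg" alt="cat"></a>'
        "<p>A cat on a sofa.</p></div>"
        "<div><p>Unrelated footer.</p></div></body>"
    )
    for segment in image_segments(sample):
        print(segment)
```

In this sketch the text of the second `div` is excluded from the image's segment because the climb stops at the first enclosing block that already contains text. A real extractor would need a more careful stopping criterion than this fixed tag list.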


Campus location


Principal supervisor

Christopher Hugh Messom

Additional supervisor 1

Mohammad Belkhatir

Year of Award


Department, School or Centre

School of Information Technology (Monash University Malaysia)


Doctor of Philosophy

Degree Type



Faculty of Information Technology