Scene analysis for robotic watercraft
Thesis, posted on 17.02.2017 by Walia, Rahul
This thesis describes image processing techniques developed for the analysis of visual scenes encountered during robotic navigation on water. The research was characterized by a lack of relevant literature that could provide a starting platform for approaching vision-guided navigation on water. Deciding on the best discipline in which to conduct the research was an interesting challenge, as the problem of water robot navigation could be approached from marine engineering, robotics, oceanography, mathematics, pattern recognition, and image processing perspectives. Consequently, the literature reviewed in this thesis extends to eclectic but related (in application) scientific disciplines. A variety of these scientific techniques were researched, refined, attempted, and abandoned with varying degrees of success. The thesis documents some of these techniques (successful or otherwise) along with theory and analysis. The parlance of the thesis adheres to robotics and vision terminology and conventions.

For a robotic watercraft to navigate successfully using computer vision, it must integrate two major components: vision and navigation. Computer vision can be viewed as the means to achieve the end of navigation. The uniqueness of the operating environment simultaneously challenges and assists the two components in different ways. Generally, the absence of pathways makes water-based navigation easier than its land-based counterpart. Scene analysis for the purpose of vision-guided water navigation is characterized by the following contrasting features:

1. Unreliable photometry: The dynamics of water, and consequently of the camera, make it difficult to prepare a valid mathematical model or to extract relevant features that describe water.

2. Reliable and sparse scene composition: Unlike land scenes, water scenes are less cluttered and are mainly composed of water, sky, clouds, and an occasional foreground object.
Navigation on water can be achieved by obstacle detection and avoidance. Consequently, the phrase "scene analysis" has the objective interpretation of obstacle detection, which is one of the key areas of research in this thesis. Scene analysis as described herein is predominantly a sequential attrition of scene components (clouds, sky, and water) in a water scene, with a view to detecting and locating a foreground object in the water. For example, clouds are not identified as a separate component but are eliminated from the scene by a mathematical technique that makes clouds transparent in the grayscale images. The sequential steps (along with associated problems and solutions) in locating the obstacles in a water scene are:

1. Generating a homogeneous sky: This eliminates false-positive identification of clouds as obstacles. Grayscale Pseudo Spectra Images (PSI) were generated from the tri-colour images at fixed wavelengths. It was experimentally established that a PSI produces a similar response for sky and clouds, thereby preventing the clouds from appearing in the grayscale images. In addition, PSIs increase the contrast between sky and water, enabling easy detection of the horizon. The main contribution of the research into PSI is the development of a mathematical basis for generating images at various discrete wavelengths. PSIs are envisaged to have applications beyond cloud elimination.

2. Identifying the horizon: This is done with the objective of defining the spatial spread of water in the image. Enclosing ellipses are used to distinguish the horizon from the other edges. The method is better suited to water, being: (a) faster than the conventional Hough Transform; (b) robust, able to detect both straight and curved horizons; and (c) simple, maximizing a mathematical criterion derived from the skew and the zeroth moment of the ellipses.
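The thesis's exact mathematical basis for PSI generation is not reproduced in this abstract. As a hypothetical illustration only, the sketch below collapses an RGB image to a single-wavelength grayscale response by weighting the three channels with assumed Gaussian spectral sensitivities; the peak wavelengths (R ~600 nm, G ~550 nm, B ~450 nm) and the spread are placeholder assumptions, not values from the thesis.

```python
import numpy as np

def pseudo_spectral_image(rgb, wavelength_nm):
    """Collapse an RGB image to a grayscale response at one wavelength.

    Hypothetical sketch: each channel's spectral sensitivity is modelled
    as a Gaussian centred on a nominal peak (R ~600 nm, G ~550 nm,
    B ~450 nm); the channels are then mixed with normalised weights.
    """
    peaks = np.array([600.0, 550.0, 450.0])  # assumed R, G, B peaks (nm)
    sigma = 50.0                             # assumed sensitivity spread (nm)
    w = np.exp(-0.5 * ((wavelength_nm - peaks) / sigma) ** 2)
    w /= w.sum()                             # weights sum to one
    # Contract the channel axis against the weights -> grayscale image.
    return np.tensordot(rgb.astype(float), w, axes=([-1], [0]))

# A 2x2 toy "scene": blue-dominant sky pixels and near-white cloud pixels.
scene = np.array([[[0.1, 0.3, 0.9], [0.9, 0.9, 0.9]],
                  [[0.0, 0.2, 0.8], [0.8, 0.8, 0.9]]])
psi = pseudo_spectral_image(scene, 450.0)  # sample near the blue peak
```

Sampling near the blue peak gives sky and cloud pixels similar grayscale values, which is the qualitative effect the thesis reports (clouds becoming transparent in the PSI).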
3. Identifying the foreground (obstacles): The literature review and pilot studies revealed that images and videos captured by cameras are subject to too much variability to support a reliable appearance-based obstacle recognition algorithm. To achieve robust obstacle detection, the obstacle was instead identified by its boundary with the water. It was found that the obstacle's boundary edge (with water) in a grayscale image has the following invariant characteristics:

(a) Spatial scarcity: The number of edges in a grayscale image created by the boundary of the foreground (obstacle) with the background (water) is very small compared with the total number of edges in the image.

(b) High derivative magnitude: The magnitude of the derivative of image intensity at the obstacle-boundary edge is higher than that of edges not due to the boundary.

Using these invariant characteristics of the boundary of the foreground with the background, a theoretical framework of image statistics in scale-space was prepared. This framework can identify the presence or absence of the obstacle boundary (discontinuity) and locate the boundary of the obstacle if it is present. Specifically, the magnitude of the Sobel derivative of a grayscale image is taken into scale-space, i.e. convolved with a Gaussian kernel of increasing standard deviation. At each scale, the statistical parameter Otsu's Threshold (OT) (Otsu, 1979) is calculated. The plot of Otsu's Threshold against increasing scale enables identification and location of the foreground boundary. Mathematical proofs are provided that the OT has differing plots in the presence and absence of a foreground boundary (and therefore of an obstacle). The theoretical research has yielded the following results, proved via theorems and experimentation:

(a) An expression for the PDF and CDF of the derivative of a discontinuity in scale-space.

(b) Bimodality of the PDF.

(c) Unbalance of the PDF.
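The scale-space procedure described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: Otsu's threshold is computed from the between-class-variance criterion of the 1979 paper, the Sobel derivative and Gaussian smoothing come from SciPy, and the input is a synthetic scene with one bright square standing in for an obstacle.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def otsu_threshold(values, bins=256):
    """Otsu (1979): the threshold maximizing between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist.astype(float) / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                  # class-0 probability mass
    mu = np.cumsum(p * centers)        # cumulative mean
    mu_t = mu[-1]                      # global mean
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (mu_t * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(between)]

def ot_vs_scale(gray, sigmas):
    """OT of the Sobel gradient magnitude, smoothed at each scale."""
    gx, gy = sobel(gray, axis=1), sobel(gray, axis=0)
    grad = np.hypot(gx, gy)
    return [otsu_threshold(gaussian_filter(grad, s).ravel()) for s in sigmas]

# Synthetic water scene: low-contrast background plus one bright square
# whose border plays the role of the obstacle-water boundary.
rng = np.random.default_rng(0)
img = rng.normal(0.4, 0.02, (64, 64))
img[24:40, 24:40] += 0.5              # the foreground discontinuity
curve = ot_vs_scale(img, sigmas=[1, 2, 4, 8])
```

Plotting `curve` against the scales reproduces the kind of OT-versus-scale trace the thesis analyses; the thresholds shrink as smoothing attenuates the gradient magnitudes.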
(d) Scale-life: the range of scales over which the discontinuity can be statistically identified as a separate mode in the PDF. Scale-life is a function of the magnitude of the discontinuity and the upper bound of the error.

(e) An analytical expression for the OT of the derivative of a discontinuity in scale-space.

(f) Differing plots of the OT in the presence and absence of a discontinuity.

(g) An algorithm for the simultaneous detection of the discontinuity, the threshold, and the scale appropriate to the discontinuity.

(h) Validation of the algorithm on synthetic and natural images.

The results of the research into discontinuities can be generalized to a variety of scientific and engineering problems that involve detecting and locating discontinuities.
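Result (f), that the OT plot differs in the presence and absence of a discontinuity, can be demonstrated on a one-dimensional signal. The sketch below is an illustrative assumption-laden toy, not the thesis's proof: it compares the OT-versus-scale curve of the smoothed |derivative| for a noise signal with and without a unit step, using a simple brute-force Otsu search.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def otsu(values, bins=128):
    """Brute-force Otsu threshold: maximize between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    p = hist / hist.sum()
    best_t, best_var = centers[0], -1.0
    for k in range(1, bins):
        w0, w1 = p[:k].sum(), p[k:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (p[:k] * centers[:k]).sum() / w0   # class means below/above
        m1 = (p[k:] * centers[k:]).sum() / w1
        var = w0 * w1 * (m0 - m1) ** 2          # between-class variance
        if var > best_var:
            best_t, best_var = centers[k], var
    return best_t

def ot_curve(signal, sigmas):
    """OT of the |derivative| after Gaussian smoothing at each scale."""
    d = np.abs(np.diff(signal))
    return np.array([otsu(gaussian_filter1d(d, s)) for s in sigmas])

rng = np.random.default_rng(1)
sigmas = [1, 2, 4, 8]
flat = rng.normal(0.0, 0.02, 200)        # no discontinuity
step = flat.copy()
step[100:] += 1.0                        # one unit-step discontinuity
ot_with, ot_without = ot_curve(step, sigmas), ot_curve(flat, sigmas)
```

With the step present, the derivative's spike sustains an elevated threshold across scales (its scale-life), whereas the pure-noise curve stays at the noise level, so the two OT plots separate, which is the qualitative behaviour result (f) describes.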