Posted on 2023-12-13, 13:58, authored by XUELIAN CHENG
3D scene understanding involves extracting critical geometric details from 2D images, including depth maps, camera position, and surface normals, which provide insight into the spatial arrangement of a scene. It also encompasses high-level vision tasks such as object recognition and semantic segmentation. In contrast to manual feature engineering, deep learning methods aim to improve efficiency and accuracy in learning visual geometry. This thesis focuses on three vital aspects: depth estimation from stereo inputs, object detection and segmentation, and 3D scene reconstruction. All three use neural networks to enhance geometry understanding, offering robust and efficient solutions to these challenges.