Investigation into the Influence of Biological Depth Cues on Monocular Depth Estimation for the Improvement of an Automated Privacy-Preserving Video Processing System

Monocular depth estimation (MDE) in computer vision is the process of generating a depth map from a single 2-D image. This task is far from trivial, since the same 2-D image can be produced by infinitely many 3-D scenes. Fortunately, for many images captured in the real world the approximate depth is quite clear, and many MDE methods exist that produce results close to the ground truth. A depth map for an image has many use cases, including the anonymisation of video material. Humans and some other animals can also get an idea of the depth of what they are observing when using only a single eye; for this they use so-called biological cues.
Examples of these cues are the size of the observed objects and linear perspective. This work focuses on using prior knowledge about these biological cues to extract related information, which is used as extra input for an MDE method. The goals are to determine the effect of these explicitly extracted cues on an MDE method and to find out whether these cues can instead be learned by the MDE method itself, so that they are used implicitly. As a secondary contribution, a new data set containing RGB-depth image pairs is to be created.
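
To make the object-size cue concrete, the following minimal sketch shows how a known real-world object height yields a depth estimate under an ideal pinhole camera model. It is an illustration only and not the extraction method used in this work; the function name `depth_from_known_size`, the focal length of 1000 pixels, and the assumed person height of 1.7 m are all hypothetical.

```python
def depth_from_known_size(focal_length_px: float,
                          real_height_m: float,
                          pixel_height: float) -> float:
    """Estimate the distance to an object of known real-world height.

    Assumes an ideal pinhole camera, where an object of height H at
    depth Z appears with image height h = f * H / Z, so Z = f * H / h.
    """
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_length_px * real_height_m / pixel_height


# Hypothetical values: the same 1.7 m tall person seen at two image sizes.
# Halving the apparent height doubles the estimated depth.
near = depth_from_known_size(1000.0, 1.7, 340.0)
far = depth_from_known_size(1000.0, 1.7, 170.0)
print(f"near: {near:.1f} m, far: {far:.1f} m")
```

A per-object estimate of this kind is one possible form of explicitly extracted cue information; how such values are encoded and passed to an MDE method is a separate design choice that this sketch does not address.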