Understanding the depth order of surfaces in the natural world is one of the most fundamental operations of the visual systems of many species. Humans reliably perceive the depth order of visually adjacent surfaces when there is relative motion between them, such that one surface appears or disappears behind another. To develop a fast, robust, and reliable algorithm for determining the depth order of regions in natural-scene video, we have adapted a computational model of primate vision that fits important classical and recent psychophysical data on ordinal depth from motion. The algorithm uses dense optic flow to delineate moving surfaces and to determine their depth order relative to the static parts of the environment. It categorizes surfaces according to whether they are emerging, disappearing, unoccluded, or doubly occluded. We have tested the algorithm on real video in which pedestrians and cars pass sometimes behind and sometimes in front of trees. Because the algorithm extracts surfaces and labels their depth order, it is suitable as a low-level pre-processing step for complex surveillance applications. Our implementation of the algorithm uses the open-source HPE Cognitive Computing Toolkit and can be scaled to very large video streams.
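The abstract describes the pipeline only at a high level: compute dense optic flow, delineate moving surfaces, and label their boundaries by occlusion type. The following is a minimal illustrative sketch of that idea, not the paper's neural model or its HPE Cognitive Computing Toolkit implementation. It assumes OpenCV's Farneback dense flow as a stand-in for the model's flow stage; the function name ordinal_depth_labels and all thresholds are hypothetical.

import numpy as np
import cv2  # OpenCV, assumed available; not the paper's HPE CCT implementation

def ordinal_depth_labels(prev_gray, next_gray, motion_thresh=1.0, div_thresh=0.1):
    """Hypothetical sketch: label moving-region boundary pixels by occlusion type.

    Intuition from the abstract: a surface losing pixels at its edge is
    going behind another surface (disappearing), while one gaining pixels
    is coming out from behind it (emerging). Here we approximate the cue
    with dense optic flow and its divergence at motion boundaries.
    """
    # Dense optic flow between consecutive grayscale frames (Farneback).
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, next_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    # Moving surfaces vs. the static environment.
    speed = np.linalg.norm(flow, axis=2)
    moving = speed > motion_thresh

    # Flow divergence: convergence (negative) at a motion boundary suggests
    # texture deletion (surface disappearing behind a static occluder);
    # positive divergence suggests texture accretion (surface emerging).
    du_dx = np.gradient(flow[..., 0], axis=1)
    dv_dy = np.gradient(flow[..., 1], axis=0)
    div = du_dx + dv_dy

    # Boundary pixels of the moving region (morphological gradient).
    kernel = np.ones((3, 3), np.uint8)
    boundary = cv2.morphologyEx(
        moving.astype(np.uint8), cv2.MORPH_GRADIENT, kernel).astype(bool)

    # Per-pixel labels; the paper's fourth category (doubly occluded) needs
    # region-level reasoning beyond this per-pixel sketch.
    labels = np.zeros(prev_gray.shape, dtype=np.uint8)   # 0 = background
    labels[boundary & (div < -div_thresh)] = 1           # disappearing edge
    labels[boundary & (div > div_thresh)] = 2            # emerging edge
    labels[boundary & (np.abs(div) <= div_thresh)] = 3   # unoccluded edge
    return labels

Called on consecutive grayscale frames, e.g. labels = ordinal_depth_labels(frame0, frame1), the sign of the divergence at a motion boundary serves as a crude proxy for accretion versus deletion of surface texture, which is the motion cue the abstract builds on.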
Gennady Livitz, Harald Ruda, and Ennio Mingolla, "A Neurally-Inspired Algorithm for Detecting Ordinal Depth from Motion Signals in Video Streams," in Proc. IS&T Int’l. Symp. on Electronic Imaging: Human Vision and Electronic Imaging, 2017, pp. 160-166, https://doi.org/10.2352/ISSN.2470-1173.2017.14.HVEI-137