Efficient dense stereo with occlusions for new view-synthesis by four-state dynamic programming

Title: Efficient dense stereo with occlusions for new view-synthesis by four-state dynamic programming
Publication Type: Journal Article
Year of Publication: 2007
Authors: Criminisi, A, Blake, A, Rother, C, Shotton, J, Torr, PHS
Journal: International Journal of Computer Vision
Volume: 71
Pagination: 89–110
ISSN: 0920-5691
Keywords: Dense stereo, Gaze correction, Image-based rendering, Video-conferencing
Abstract

A new algorithm is proposed for efficient stereo and novel view synthesis. Given the video streams acquired by two synchronized cameras, the proposed algorithm synthesises images from a virtual camera in an arbitrary position near the physical cameras. The new technique is based on an improved, dynamic-programming, stereo algorithm for efficient novel view generation. The two main contributions of this paper are: (i) a new four-state matching graph for dense stereo dynamic programming that supports accurate occlusion labelling; (ii) a compact geometric derivation for novel view synthesis by direct projection of the minimum cost surface. Furthermore, the paper presents an algorithm for the temporal maintenance of a background model to enhance the rendering of occlusions and reduce temporal artefacts (flicker); and a cost aggregation algorithm that acts directly in the three-dimensional matching cost space. The proposed algorithm has been designed to work with input images with a large disparity range, a common practical situation. The enhanced occlusion handling capabilities of the new dynamic programming algorithm are evaluated against those of the most powerful state-of-the-art dynamic programming and graph-cut techniques. Four-state DP is also evaluated against the disparity-based Middlebury error metrics, and its performance is found to be amongst the best of the efficient algorithms. A number of examples demonstrate the robustness of four-state DP to artefacts in stereo video streams. These include demonstrations of cyclopean view synthesis in extended conversational sequences, synthesis from a freely translating virtual camera and, finally, basic 3D scene editing. © 2006 Springer Science + Business Media, LLC.

DOI: 10.1007/s11263-006-8525-1
Citation Key: Criminisi2007