Title: An efficient three-dimensional prediction structure for coding light field video content using the MV-HEVC standard
Authors: Joseph Khoury; Nusrat Mehajabin; Mahsa T. Pourazad; Panos Nasiopoulos; Victor C.M. Leung
Addresses: Electrical and Computer Engineering, University of British Columbia, 2332 Main Mall, Vancouver, BC, V6T1Z4, Canada ' Electrical and Computer Engineering, University of British Columbia, 2332 Main Mall, Vancouver, BC, V6T1Z4, Canada ' Electrical and Computer Engineering, University of British Columbia, 2332 Main Mall, Vancouver, BC, V6T1Z4, Canada ' Electrical and Computer Engineering, University of British Columbia, 2332 Main Mall, Vancouver, BC, V6T1Z4, Canada ' Electrical and Computer Engineering, University of British Columbia, 2332 Main Mall, Vancouver, BC, V6T1Z4, Canada
Abstract: Light field cameras have emerged in the consumer market as a technology that captures richer visual information than legacy cameras. While traditional photography captures only a 2D projection of the scene, a light field camera records both the intensity and the direction of light. As a result, this technology opens new opportunities for applications such as remote surgery, autonomous driving, augmented reality, and digital health. However, one of its main problems is the size of the captured data, which significantly increases consumers' bandwidth requirements. Numerous solutions have been proposed to compress light field content efficiently, but none of them fully considers the intricacies of that content. This paper proposes a three-dimensional prediction structure for compressing light field video content using the multi-view extension of HEVC (MV-HEVC). The inter-view structure exploits the correlations between views in two directions and the high degree of resemblance between views around the centre of each frame. Experimental results show a BD-rate gain of 50.89%, while subjective tests show a BD-rate improvement of 65.83% in mean opinion score over the state-of-the-art method. The result is more visually appealing quality at a significantly reduced bitrate, which facilitates practical implementations of this emerging technology.
Keywords: HEVC; light field; multi-view video coding; prediction structure.
DOI: 10.1504/IJMIS.2022.121281
International Journal of Multimedia Intelligence and Security, 2022 Vol.4 No.1, pp.47 - 64
Received: 27 Sep 2021
Accepted: 20 Nov 2021
Published online: 03 Mar 2022