Advancing Video Compression With Error Resilience And Content Analysis
In this thesis, two aspects of video coding improvement are discussed, namely error resilience and coding efficiency.
With the growing volume of video being created and consumed, better compression tools are needed to provide reliable and fast transmission. Many popular video coding standards, such as the VPx and H.26x families, achieve compression by exploiting spatial and temporal dependencies in the source video signal. These dependencies make the encoded bitstream vulnerable to transmission errors, because a lost packet corrupts not only its own frame but also the frames predicted from it. In this thesis, we investigate an error-resilient video coding method for VP9 bitstreams that uses error resilience packets. An error resilience packet consists of encoded keyframe contents and the prediction signals for each non-keyframe. Experimental results show that the proposed method is effective under typical packet loss conditions.
In the second part of the thesis, we first present an automatic stillness feature detection method for groups of pictures (GOPs). The encoder adaptively chooses the coding structure for each GOP based on its stillness feature to optimize coding efficiency.
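As a rough illustration of the idea of choosing a coding structure per GOP from a stillness feature, the following is a minimal sketch and not the detection method developed in the thesis. It assumes that stillness can be approximated by the mean absolute difference between consecutive luma frames. The threshold value, the function names, and the structure labels are all hypothetical and do not correspond to any encoder option.

```python
from typing import List, Sequence

# A frame is modeled as rows of 8-bit luma samples (an assumption for this sketch).
Frame = Sequence[Sequence[int]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute difference between two equally sized luma frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def gop_is_still(frames: List[Frame], threshold: float = 1.0) -> bool:
    """Hypothetical stillness test: every consecutive frame pair in the GOP
    differs by less than `threshold` on average."""
    return all(
        mean_abs_diff(prev, cur) < threshold
        for prev, cur in zip(frames, frames[1:])
    )


def choose_coding_structure(frames: List[Frame]) -> str:
    """Pick a placeholder coding-structure label for one GOP."""
    # The labels are illustrative only; they are not libvpx or libaom settings.
    return "still_structure" if gop_is_still(frames) else "default_structure"


# Usage: a static GOP and a GOP with a large brightness change.
still_gop = [[[10, 10], [10, 10]] for _ in range(4)]
moving_gop = [[[0, 0], [0, 0]], [[50, 50], [50, 50]]]
print(choose_coding_structure(still_gop))
print(choose_coding_structure(moving_gop))
```

A real detector would likely work on motion-compensated or block-level statistics rather than raw frame differences; the sketch only shows where a per-GOP decision would plug into the encoder.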
Second, we propose a content-based video coding method. Modern video codecs, including the newly developed AOM/AV1, use hybrid coding techniques to remove spatial and temporal redundancy. However, efficiently exploiting statistical dependencies as measured by mean squared error (MSE) does not always produce the best psychovisual result. One interesting approach is to encode only visually relevant information and to use a different coding method for “perceptually insignificant” regions in the frame. In this thesis, we introduce a texture analyzer that runs before encoding and uses convolutional neural networks to identify detail-irrelevant texture regions in each frame. The texture region is then reconstructed based on one set of motion parameters. We show that, for many standard test sets, the proposed method achieves significant data rate reductions.
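The limitation of MSE as a quality criterion can be seen in a small numeric sketch. The sample values below are made up for illustration and are not taken from the thesis: two distortions of the same 16-sample block yield an identical MSE even though the error is spread evenly in one case and concentrated in a single sample in the other.

```python
from typing import Sequence


def mse(original: Sequence[int], decoded: Sequence[int]) -> float:
    """Mean squared error between two equally long sample sequences."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


# Illustrative 16-sample flat block (values are assumptions for this sketch).
original = [100] * 16
spread = [102] * 16                 # every sample is off by 2
concentrated = [108] + [100] * 15   # a single sample is off by 8

print(mse(original, spread))        # 16 * 2**2 / 16 = 4.0
print(mse(original, concentrated))  # 8**2 / 16 = 4.0
```

Because MSE cannot tell these two cases apart, a codec that minimizes it alone has no way to account for how visible each distortion is, which is the motivation for treating perceptually insignificant regions differently.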