Adaptive streaming is fast becoming the most widely used method for video delivery to the end users over the internet. The ITU-T P.1203 standard is the first standardized quality of experience model for audiovisual HTTP-based adaptive streaming. This recommendation has been trained and validated for H.264 and resolutions up to and including full-HD. The paper provides an extension for the existing standardized short-term video quality model mode 0 for new codecs i.e., H.265, VP9 and AV1 and resolutions larger than full-HD (e.g. UHD-1). The extension is based on two subjective video quality tests. In the tests, in total 13 different source contents of 10 seconds each were used. These sources were encoded with resolutions ranging from 360p to 2160p and various quality levels using the H.265, VP9 and AV1 codecs. The subjective results from the two tests were then used to derive a mapping/correction function for P.1203.1 to handle new codecs and resolutions. It should be noted that the standardized model was not re-trained with the new subjective data, instead only a mapping/correction function was derived from the two subjective test results so as to extend the existing standard to the new codecs and resolutions.
Encoders of AOM/AV1 codec consider an input video sequence as succession of frames grouped in Golden-Frame (GF) groups. The coding structure of a GF group is fixed with a given GF group size. In the current AOM/AV1 encoder, video frames are coded using a hierarchical, multilayer coding structure within one GF group. It has been observed that the use of multilayer coding structure may result in worse coding performance if the GF group presents consistent stillness across its frames. This paper proposes a new approach that adaptively designs the Golden-Frame (GF) group coding structure through the use of stillness detection. Our new approach hence develops an automatic stillness detection scheme using three metrics extracted from each GF group. It then differentiates those GF groups of stillness from other non-still GF groups and uses different GF coding structures accordingly. Experimental result demonstrates a consistent coding gain using the new approach.
Google started the WebM Project in 2010 to develop open source, royalty--free video codecs designed specifically for media on the Web. Subsequently, Google jointly founded a consortium of major tech companies called the Alliance for Open Media (AOM) to develop a new codec AV1, aiming at a next edition codec that achieves at least a generational improvement in coding efficiency over VP9. This paper proposes a new coding tool as one of the many efforts devoted to AOM/AV1. In particular, we propose a second ALTREF_FRAME in the AV1 syntax, which brings the total reference frames to seven on top of the work presented in [11]. ALTREF_FRAME is a constructed, no-show reference obtained through temporal filtering of a look-ahead frame. The use of twoALTREF_FRAMEs adds further flexibility to the multi-layer, multi-reference symmetric framework, and provides a great potential for the overall Rate- Distortion (RD) performance enhancement. The experimental results have been collected over several video test sets of various resolutions and characteristics both texture- and motion-wise, which demonstrate that the proposed approach achieves a consistent coding gain, compared against the AV1 baseline as well as against the results in [11]. For instance, using overall-PSNR as the distortion metric, an average bitrate saving of 5.880% in BDRate is obtained for the CIF-level resolution set, and 4.595% on average for the VGA-level resolution set.