PV 2009

May 11-12, 2009


Video Source Coding

A New H.264/AVC Error Resilience Model Based on Regions of Interest
Fadi Boulos, Wei Chen, Benoît Parrein, and Patrick Le Callet (Nantes Atlantique Universités)

Video transmission over the Internet can be subject to packet loss, which reduces the end-user’s Quality of Experience (QoE). Solutions aimed at improving the robustness of a video bitstream can be used to mitigate this problem. In this paper, we propose a new Region of Interest-based error resilience model to protect the most important part of the picture from distortions. We conduct eye-tracking tests in order to collect the Region of Interest (RoI) data. Then, we apply in the encoder an intra-prediction restriction algorithm to the macroblocks belonging to the RoI. Results show that, while no significant overhead is introduced, the perceived quality of the video’s RoI, measured by means of a perceptual video quality metric, increases in the presence of packet loss compared to the traditional encoding approach.

Rate-Distortion Model For Motion Prediction Efficiency in Scalable Wavelet Video Coding
Chia-Yang Tsai (National Taipei University of Technology), Hsueh-Ming Hang (National Chiao Tung University)

A rate-distortion model for motion prediction efficiency in scalable wavelet video coding is proposed in this paper. The Lagrangian multiplier is widely used to solve rate-distortion optimization problems in video coding, especially in mode decision and rate-constrained motion estimation. Unlike non-scalable video coding, scalable wavelet video coding must operate under multiple bit-rate conditions and has an open-loop structure. Therefore, the conventional rate-distortion optimization technique is not suitable for the scalable wavelet case. By analyzing the rate-distortion trade-off due to the different bits allocated to motion information, we propose a motion prediction gain (MPG) metric to measure motion coding efficiency. Based on the MPG metric, a new cost function for mode decision is proposed. Compared with the conventional Lagrangian multiplier optimization method, our experiments show that the new mode decision procedure generally improves PSNR performance, particularly for combined SNR and temporal scalability.
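For context, the conventional Lagrangian mode decision that the MPG-based cost function replaces can be sketched in a few lines (the mode names, distortion values, and rates below are made-up illustrative numbers, not from the paper):

```python
# Conventional Lagrangian mode decision: pick the coding mode that
# minimizes the rate-distortion cost J = D + lambda * R.
def select_mode(candidates, lam):
    """candidates: list of (mode_name, distortion, rate_bits) tuples."""
    return min(candidates, key=lambda m: m[1] + lam * m[2])

modes = [("INTRA", 120.0, 96), ("INTER", 40.0, 160), ("SKIP", 300.0, 1)]
best_mid = select_mode(modes, lam=0.5)    # mid-range lambda favors INTER here
best_low = select_mode(modes, lam=10.0)   # large lambda (tight bit budget) favors SKIP
```

A scalable wavelet coder cannot commit to a single lambda because the same bitstream is decoded at several rates, which is exactly the mismatch the proposed MPG metric addresses.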

New Insights Into Improving Compression Efficiency for Distributed Video Coding
Guogang Hua (Qualcomm Inc), Chang Wen Chen (SUNY Buffalo)

In this paper, we investigate a fundamental issue in distributed video coding (DVC) that, once resolved, would substantially improve its compression efficiency. This fundamental issue is the underlying relation between the distribution of the prediction errors and the compression efficiency of DVC. In the current approach to DVC, after the construction of the prediction frame at the decoder, the difference between the prediction and the current frame, or the prediction error, is inversely related to the correlation between these frames. Most existing approaches in DVC attempt to maximize this correlation, or to minimize the prediction error, in order to achieve the desired video compression efficiency. Recently, research in DVC has reached a plateau in terms of coding efficiency. We believe one key to taking DVC off this performance plateau is to design a better scheme to represent the correlation more effectively. In this research, we develop a theoretical analysis proving that, in order to reduce the number of bits to be sent to the decoder, the distribution of the prediction errors needs to be as concentrated as possible. From a practical point of view, we also show that the error control codes adopted for DVC achieve higher efficiency when the distribution of prediction errors is more concentrated.

Modeling Rate and Perceptual Quality of Scalable Video as Functions of Quantization and Frame Rate and Its Application in Scalable Video Adaptation
Yao Wang, Zhan Ma, Yen-Fu Ou (Polytechnic Institute of NYU)

This paper investigates the impact of frame rate and quantization on the bit rate and perceptual quality of a scalable video with temporal and quality scalability. We propose a rate model and a quality model, both in terms of the quantization stepsize and the frame rate. The quality model is derived from our earlier quality model expressed in terms of the PSNR of decoded frames and the frame rate. Both models are developed based on the key observation from experimental data that the relative reduction of either rate or quality when the frame rate decreases is largely independent of the quantization stepsize. This observation enables us to express both rate and quality as products of separate functions of quantization stepsize and frame rate. The proposed rate and quality models are analytically tractable, each requiring only two content-dependent parameters. Both models fit the measured data very accurately, with high Pearson correlation. We further apply these models to rate-constrained bitstream adaptation, where the problem is to determine the optimal combination of quality and temporal layers that provides the highest perceptual quality under a given bandwidth constraint.
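The separable structure can be illustrated with a toy model (the functional forms and every parameter value below are assumptions for illustration, not the paper's fitted models): rate and quality each factor into a quantization term times a frame-rate term, and adaptation becomes a small search over layer combinations.

```python
import math

# Toy separable models: R(q, t) and Q(q, t) each factor into a
# quantization-stepsize term and a frame-rate term (assumed forms).
def rate(q, t, R_max=1000.0, a=1.2, b=0.6, q_min=16.0, t_max=30.0):
    return R_max * (t / t_max) ** b * (q / q_min) ** (-a)

def quality(q, t, Q_max=100.0, c=2.0, d=1.0, q_min=16.0, t_max=30.0):
    temporal = (1 - math.exp(-c * t / t_max)) / (1 - math.exp(-c))
    return Q_max * temporal * (q / q_min) ** (-d)

# Rate-constrained adaptation: choose the (stepsize, frame rate) pair that
# maximizes modeled quality within the bandwidth budget.
def adapt(budget, qs=(16, 24, 32, 48), ts=(7.5, 15, 30)):
    feasible = [(q, t) for q in qs for t in ts if rate(q, t) <= budget]
    return max(feasible, key=lambda p: quality(*p))
```

At a generous budget the finest quantization at full frame rate wins; as the budget shrinks, the model trades frame rate against quantization rather than always dropping one or the other.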

Error Control of Video System

Generation of Redundant Frame Structure for Interactive Multiview Video Streaming
Gene Cheung (HP Labs Japan), Antonio Ortega, Ngai-Man Cheung (Univ. of Southern California)

While multiview video coding focuses on the rate-distortion performance of compressing all frames of all views, we address the problem of designing a frame structure to enable interactive multiview streaming, where clients can interactively switch views during video playback. Thus, as a client is playing back successive frames (in time) for a given view, it can send a request to the server to switch to a different view while continuing uninterrupted temporal playback. Noting that standard tools for random access (i.e., I-frame insertion) can be inefficient for this application, we propose a technique where redundant representations of some frames can be stored to facilitate view switching. We first present an optimal algorithm with exponential running time that generates such a frame structure so that the expected transmission rate is optimally traded off with total storage. We then present methods to reduce the algorithm complexity for practical use. We show in our experiments that we can generate redundant frame structures offering a range of tradeoff points with transmission and storage, including ones that outperform simple I-frame insertion structures by up to 48% in terms of bandwidth efficiency for similar storage costs.

Network Coding-Based Wireless Media Transmission Using POMDP
Dong Nguyen, Thinh Nguyen (Oregon State Univ.)

We consider the problem of joint network coding and packet scheduling for multimedia transmission from the Access Point (AP) to multiple receivers in 802.11 networks. The state of the receivers is described by a hidden Markov model, and the AP acts as a decision maker that employs a partially observable Markov decision process (POMDP) to optimize the media transmission. Importantly, we introduce a simulation-based dynamic programming algorithm as a solution tool for our POMDP formulation. Our simulation-based algorithm simplifies the modeling process and reduces the computational complexity of the solution process. Our simulation results demonstrate that the proposed scheme outperforms both a network coding scheme that does not use optimization techniques and the traditional retransmission scheme.
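To make the POMDP machinery concrete, the core ingredient is a Bayesian belief update over the hidden receiver state; a minimal sketch over a two-state channel follows (all matrices are illustrative placeholders, not from the paper):

```python
# Minimal POMDP ingredient: belief update b' ∝ O(o | s') * sum_s T(s' | s) b(s).
def belief_update(belief, T, O, obs):
    """belief: P(state); T[s][sp]: transition prob; O[sp][obs]: obs likelihood."""
    n = len(belief)
    predicted = [sum(belief[s] * T[s][sp] for s in range(n)) for sp in range(n)]
    unnorm = [predicted[sp] * O[sp][obs] for sp in range(n)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

T = [[0.9, 0.1],   # states: 0 = good channel, 1 = bad channel
     [0.5, 0.5]]
O = [[0.8, 0.2],   # observations: 0 = ACK received, 1 = no ACK
     [0.3, 0.7]]
belief = belief_update([0.5, 0.5], T, O, obs=0)  # an ACK shifts belief toward "good"
```

A POMDP controller at the AP would then choose the coding/scheduling action maximizing expected long-term reward under this belief, which is the step the paper's simulation-based algorithm approximates.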

Frame Loss Error Concealment for Spatial Scalability Using Hallucination
Qirong Ma (Univ. of Washington), Feng Wu (Microsoft Research Asia), Jian Lou (Univ. of Washington), and Ming-Ting Sun (Univ. of Washington)

We present a new error concealment algorithm for spatially scalable video coding with frame loss in the enhancement layer, based on the technique of hallucination. For a lost enhancement layer frame, error concealment is performed by hallucinating it from its base layer frame, using a database trained from previously decoded frames near the lost one. Simulation results show that the proposed method significantly outperforms state-of-the-art error concealment algorithms for SVC.

Rate-Distortion Based Mode Selection For Video Coding over Wireless Networks with Burst Losses
Yiting Liao and Jerry D. Gibson (UC Santa Barbara)

Video communications over wireless networks suffer various patterns of losses, including both random packet losses and burst losses. Previous error-resilient techniques consider only the average packet loss rate to enhance error robustness for video transmission. However, loss patterns, and burst losses in particular, have a great impact on video quality. In this paper, we propose a method that takes both random and burst losses into account to further improve the error resilience of video coding. Our method estimates the end-to-end distortion based on the recursive optimal per-pixel estimate (ROPE), including both random and burst losses, and applies it to rate-distortion (RD)-based optimal mode selection. We apply our method in two cases. For single description video coding, we estimate the reconstructed pixel values under random and burst losses and calculate the overall distortion. For multiple description video coding, we estimate the end-to-end distortion for multiple state video coding (MSVC) by considering the network conditions and multiple state recovery, in order to reduce the error propagation due to packet loss in both descriptions. Simulation results show that our proposed method achieves better performance than MSVC and the original ROPE (which considers only the average packet loss rate) over wireless networks with burst losses.
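The ROPE bookkeeping, simplified here to an intra-coded pixel with previous-frame concealment under a single i.i.d. loss probability (the paper's contribution is extending such recursions to burst-loss patterns), tracks the first and second moments of each decoder-side pixel:

```python
# Simplified ROPE recursion for one intra-coded pixel (illustrative sketch).
def rope_intra(orig, recon, prev_E, prev_E2, p):
    """Returns (E, E2, expected_distortion) for the decoder-side pixel.
    prev_E / prev_E2: moments of the concealment pixel from the previous frame."""
    E = (1 - p) * recon + p * prev_E        # received with prob 1-p, else concealed
    E2 = (1 - p) * recon ** 2 + p * prev_E2
    dist = orig ** 2 - 2 * orig * E + E2    # E[(orig - decoded)^2]
    return E, E2, dist
```

With p = 0 the expected distortion collapses to the plain squared encoding error, and the recursion propagates the concealment uncertainty frame to frame otherwise.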

Special Session: Peer-to-Peer Video Streaming

Optimal Server Bandwidth Allocation Among Multiple P2P Multicast Live Video Streaming Sessions
Aditya Mavlankar, Jeonghun Noh, and Bernd Girod (Stanford Univ.)

We consider a server that simultaneously streams multiple video channels. Each video channel is delivered to a set of receivers using peer-to-peer (P2P) live multicast. We propose a framework for allocating server bandwidth to minimize distortion across the peer population and across all channels. The optimization problem considers the rate, distortion, audience size, and peer churn associated with each channel. Network simulations demonstrate a reduction of the mean distortion across the peer population due to the proposed server bandwidth allocation. We also highlight the general scope of the proposed framework, which makes it applicable to scenarios beyond the one considered in this paper. Optimization metrics other than average distortion can also be accommodated in the framework.
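One simple way to realize such an allocation, sketched here under assumed convex distortion-rate curves rather than the paper's actual optimization, is to hand out bandwidth units greedily by audience-weighted marginal distortion reduction:

```python
# Illustrative greedy server-bandwidth allocation (not the paper's optimizer):
# repeatedly award one bandwidth unit to the channel whose audience-weighted
# marginal distortion reduction is largest.
def allocate(total_units, channels):
    """channels: list of {'audience': int, 'distortion': rate_units -> distortion}."""
    alloc = [0] * len(channels)
    for _ in range(total_units):
        gains = [c['audience'] * (c['distortion'](alloc[i]) - c['distortion'](alloc[i] + 1))
                 for i, c in enumerate(channels)]
        best = max(range(len(channels)), key=lambda i: gains[i])
        alloc[best] += 1
    return alloc

# A popular channel (audience 100) and a niche one (audience 10) with the
# same convex distortion curve: the first few units all go to the popular one.
channels = [{'audience': 100, 'distortion': lambda r: 1.0 / (r + 1)},
            {'audience': 10,  'distortion': lambda r: 1.0 / (r + 1)}]
```

For concave utility (convex distortion) curves this greedy rule reproduces the water-filling character of the optimal allocation, which is why audience size and per-channel rate-distortion behavior both enter the decision.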

Resource Trade-off in P2P Streaming
Majed Alhaisoni (University of Essex), Antonio Liotta (Eindhoven University of Technology), and Mohammed Ghanbari (University of Essex)

P2P TV has emerged as a powerful alternative to the traditional client-server paradigm for multimedia streaming. It has proven to be a valid substitute for online applications offering video-on-demand and real-time video, mainly due to the scalability and resiliency that P2P lends to these applications. Recently, various P2P platforms such as Sopcast, Joost, Zattoo, and Babelgum have become widely popular tools for delivering both real-time and video-on-demand services. However, these P2P TV approaches do exhibit points of failure and limitations. For instance, most P2P applications are designed to balance CPU load and memory but not network resources, or vice versa. Through an experimental study, this paper unveils the strengths (e.g., good resilience to end-to-end delay and jitter) and shortcomings (e.g., poor resource optimization and load balancing) of these tools and makes proposals for improving P2P IPTV performance. Our findings are based on the analysis of traffic traces from the Zattoo, Joost, Sopcast, and Babelgum P2P streaming applications.

Foresighted Joint Resource Reciprocation and Scheduling Strategies for Real-Time Video Streaming over Peer-to-Peer Networks
Sunghoon Ivan Lee, Hyunggon Park, Mihaela van der Schaar (Univ. of California, Los Angeles)

We consider peer-to-peer (P2P) networks, where multiple heterogeneous and self-interested peers are sharing multimedia data. In this paper, we propose a novel scheduling algorithm for real-time video streaming over dynamic P2P networks. The proposed scheduling algorithm is foresighted, since it enables each peer to maximize its long-term video quality by efficiently utilizing its limited resources (e.g., uploading bandwidth) over time, while explicitly considering the time-varying resource reciprocation behaviors of its associated peers. To successfully design the scheduling algorithm, we consider a distinct buffer structure that allows the peers to model the resource reciprocation behavior as a reciprocation game. Then, each peer can determine its foresighted decisions based on a Markov Decision Process (MDP). The simulation results show that the proposed algorithm significantly improves the average video quality, compared to other existing scheduling strategies. Moreover, simulation results also show that the proposed algorithm can flexibly and effectively operate in heterogeneous P2P networks.

Understanding the Flash Crowd in P2P Live Video Streaming Systems
Fangming Liu, Bo Li, Lili Zhong (Hong Kong Univ. of Science and Technology), Baochun Li (Univ. of Toronto)

Peer-to-Peer (P2P) live video streaming systems have recently received significant attention, with commercial deployments gaining increased popularity on the Internet. It is evident from our experiences with real-world systems that it is not uncommon to have hundreds of thousands of users trying to join a program in the first few minutes of a live broadcast. This phenomenon, unique to live streaming systems and referred to as the flash crowd, poses significant challenges to system design. In this paper, we develop a mathematical model to capture the inherent relationship between time and scale in P2P streaming systems under a flash crowd. Specifically, we show that there is an upper bound on the system scale with respect to a time constraint. In addition, our analysis brings forth an in-depth understanding of the effects of the gossip protocol and peer churn.
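A back-of-the-envelope version of the time-versus-scale relationship (purely illustrative; the paper's model also accounts for gossip signaling and churn): if the server seeds s new peers per round and every admitted peer forwards the stream to u further peers per round, the population absorbed within T rounds is bounded geometrically.

```python
# Toy time-vs-scale bound for flash-crowd absorption (illustrative only).
def max_scale(s, u, T):
    """Upper bound on peers admitted in T rounds: each round the existing
    peers each admit u newcomers, and the server admits s more."""
    peers = 0
    for _ in range(T):
        peers = peers * (1 + u) + s
    return peers

# With server capacity 10 and one forward per peer per round,
# three rounds absorb at most 10, then 30, then 70 peers.
bound = max_scale(10, 1, 3)
```

Inverting the bound gives the flavor of the paper's result: for a fixed join-delay budget T, the admissible crowd size cannot exceed a quantity geometric in T.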

Experiences with a Large-Scale Deployment of the Stanford Peer-to-Peer Multicast
Jeonghun Noh, Pierpaolo Baccichet, Bernd Girod (Stanford Univ.)

Traditionally, a large number of dedicated media servers have been deployed to serve a large population of viewers for a single streaming event. However, maintaining media servers is not only costly but also usually requires over-provisioning, due to the difficulty of predicting the peak size of an audience. Peer-to-Peer (P2P) streaming is a new approach that overcomes these difficulties inherent in server-based streaming. We have developed the Stanford Peer-to-Peer Multicast (SPPM) protocol for live multicast streaming. SPPM constructs multiple multicast trees to push media streams to the population of peers, thereby achieving low end-to-end transmission delay. The degradation of video quality due to peer churn and packet loss in the network is reduced by video-aware packet scheduling and retransmission. In this paper, we present lessons learned from the deployment of a commercial variant of SPPM for a large-scale streaming event which attracted more than 33,000 viewers. We collected server logs and analyzed user statistics as well as system performance. The results show that our system achieves a low end-to-end delay of only a few seconds with an average packet loss ratio of around 1%. We also found that improving peer-to-peer connectivity can substantially enhance the aggregate uplink capacity of P2P systems.

Video System

Equitable Quality Video Streaming over DSL
Barry Crabtree, Mike Nilsson, Pat Mulroy and Steve Appleby (BT Innovate)

Video streaming has frequently been deployed using constant bit rate video encoding and transmission, as these are easily implemented and make network provisioning, although requiring admission control, straightforward. However, this approach is sub-optimal both for the user, who experiences time-varying quality, and for the network operator, who cannot fully utilize the network during periods of low usage and is faced with admission control and session refusal at busy times. Instead, we propose an equitable quality scheme in which the available network bandwidth is divided between the concurrent sessions so that the same quality is delivered in each. We show that we can increase the average quality (Mean Opinion Score) by 0.74, or alternatively double the number of sessions that can be supported at the same average overall quality.
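Equalizing quality across sessions reduces to a one-dimensional search: find the common quality level at which the sessions' summed rate requirements exactly fill the link. A bisection sketch (the linear rate curves below are placeholders, not real content models):

```python
# Bisection for the common quality Q with sum_i R_i(Q) == bandwidth.
def equitable_quality(rate_curves, bandwidth, lo=0.0, hi=5.0, iters=60):
    """rate_curves: per-session increasing functions, quality -> required bit rate."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if sum(f(mid) for f in rate_curves) > bandwidth:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Two sessions with (placeholder) linear rate curves sharing a 10 Mbit/s link:
# the complex session (3*q) automatically receives the larger share.
q_star = equitable_quality([lambda q: 2 * q, lambda q: 3 * q], bandwidth=10.0)
```

Monotonicity of each rate-quality curve is what makes the single bisection sufficient, regardless of how many sessions share the link.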

Cross-Layer Optimization with Complete and Incomplete Knowledge for Delay-Sensitive Applications
Fangwen Fu, Mihaela van der Schaar (UCLA)

In this paper, we first formulate the cross-layer design as a non-linear constrained optimization problem by assuming complete knowledge of the dynamically changing application characteristics and the underlying time-varying network conditions. By decomposing the cross-layer optimization problem, we determine the necessary message exchanges between layers for achieving the optimal cross-layer solution and explicitly show how the cross-layer strategies selected for one data unit (DU, e.g. a packet) will impact its neighboring DUs as well as the DUs that depend on it. However, the attributes (e.g. distortion impact, delay deadline, etc.) of future DUs as well as the network conditions are often unknown in the considered real-time applications. The impact of current cross-layer actions on future DUs can be characterized by a state-value function in the Markov decision process (MDP) framework. Based on the value iteration solution to the MDP, we develop a low-complexity cross-layer optimization algorithm that uses online learning for each DU transmission. This online optimization utilizes information only about previously transmitted DUs and past experienced network conditions, and can be implemented in real time to cope with unknown source characteristics, network dynamics, and resource constraints. Our numerical results demonstrate the efficiency of the proposed online algorithm.
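For reference, the state-value function at the heart of the MDP formulation can be computed offline by standard value iteration; a toy instance follows (the states, actions, and rewards are illustrative, and the paper's point is precisely avoiding this offline sweep via online learning):

```python
# Standard value iteration: V(s) = max_a [ R(s,a) + gamma * sum_s' T(s,a,s') V(s') ].
def value_iteration(states, actions, T, R, gamma=0.9, eps=1e-8):
    """T[s][a][sp]: transition probability; R[s][a]: immediate reward."""
    V = {s: 0.0 for s in states}
    while True:
        V_new = {s: max(R[s][a] + gamma * sum(T[s][a][sp] * V[sp] for sp in states)
                        for a in actions)
                 for s in states}
        if max(abs(V_new[s] - V[s]) for s in states) < eps:
            return V_new
        V = V_new

# One-state sanity check: the better action pays 5 forever, so V = 5 / (1 - 0.9) = 50.
V = value_iteration(states=["s"], actions=["send", "wait"],
                    T={"s": {"send": {"s": 1.0}, "wait": {"s": 1.0}}},
                    R={"s": {"send": 5.0, "wait": 1.0}})
```

The online algorithm in the paper replaces the known T and R above with estimates learned from previously transmitted DUs and observed network conditions.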

Multi-Layer Video Broadcasting With Low Channel Switching Delays
Cheng-Hsin Hsu and Mohamed Hefeeda (Simon Fraser University)

Modern mobile devices, despite their small sizes, can run many multimedia applications that were previously only possible on stationary workstations. Mobile devices, however, have quite heterogeneous resources, which poses a challenge for mobile TV broadcast networks. We study the problem of broadcasting multi-layer video streams to mobile devices with heterogeneous resources. We propose broadcast schemes that allow each mobile device to selectively receive a few (or all) layers of the complete video streams and achieve proportional energy saving. We also propose a broadcast scheme that achieves low channel switching delay, which is important to the user experience. We analyze the performance of the proposed schemes analytically. Most importantly, we have implemented them in a real mobile TV testbed. We conduct extensive experiments to show the practicality and efficiency of the proposed schemes. The experimental results show that channel switching delays of less than 200 msec and energy savings between 75% and 95% are possible under typical system parameters of mobile TV networks.
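The "proportional energy saving" intuition follows from time-sliced broadcasting: the radio stays on only while the bursts carrying the subscribed layers are on air. A rough model (all numbers assumed; real analysis must also include wake-up overhead, which this sketch ignores):

```python
# Time-sliced broadcast energy model: radio-on time is proportional to the
# aggregate rate of the layers actually received.
def energy_saving(layer_rates_kbps, received_layers, burst_rate_kbps, window_ms=1000):
    """Fraction of radio-on time saved when only the first `received_layers`
    layers of a time-sliced broadcast are received."""
    data_kbits = sum(layer_rates_kbps[:received_layers]) * window_ms / 1000.0
    on_ms = data_kbits / burst_rate_kbps * 1000.0
    return 1.0 - on_ms / window_ms

layers = [256, 256, 512]  # base + two enhancement layers (kbps), made-up values
base_only = energy_saving(layers, 1, burst_rate_kbps=10240)
all_layers = energy_saving(layers, 3, burst_rate_kbps=10240)
```

With these placeholder numbers, the savings land in the same 75-95% range the abstract reports for typical mobile TV parameters: a device taking only the base layer idles its radio far longer than one taking every layer.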

Analysis of Authentication Schemes for Nonscalable Video Streams
Mohamed Hefeeda and Kianoosh Mokhtarian (Simon Fraser University)

The problem of multimedia stream authentication has received significant attention in previous work, and various solutions have been proposed. These solutions, however, have not been rigorously analyzed and contrasted with each other, and thus their relative suitability for different streaming environments is not clear. In this paper, we conduct a comprehensive analysis and comparison of the main authentication schemes proposed in the literature. To perform this analysis, we propose five performance metrics: computation cost, communication overhead, receiver buffer size, delay, and tolerance to packet losses. We derive analytic formulas for these metrics for all schemes and analyze them numerically. In addition, we implement all schemes in a simulator to study their performance in different environments. Our detailed analysis reveals the merits and shortcomings of each scheme and provides guidelines on choosing the most appropriate scheme for a given application. Our analysis also helps in designing new authentication schemes and improving existing ones.
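As one concrete point in the design space such a comparison covers, the classic hash-chaining family amortizes a single signature over a whole stream: each packet carries the hash of its successor, so only the head digest needs signing (a minimal sketch; practical schemes add loss tolerance, which this strictly linear chain lacks):

```python
import hashlib

# Hash-chain stream authentication sketch: packet i carries H(packet i+1),
# and only the digest of the first packet must be digitally signed.
def build_chain(payloads):
    anchor = b""
    packets = []
    for payload in reversed(payloads):
        packets.append((payload, anchor))            # (payload, hash of next packet)
        anchor = hashlib.sha256(payload + anchor).digest()
    packets.reverse()
    return packets, anchor                           # anchor = digest to be signed

def verify_chain(packets, signed_digest):
    expected = signed_digest
    for payload, next_hash in packets:
        if hashlib.sha256(payload + next_hash).digest() != expected:
            return False
        expected = next_hash
    return expected == b""
```

The scheme's metrics fall out directly: one signature per stream (low computation), one hash per packet (low communication overhead), but zero tolerance to packet loss, which is exactly the kind of trade-off the five proposed metrics quantify.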

Special Session: Congestion Control and FEC for Video System

The Use of MulTCP for the Delivery of Equitable Quality Video
Pat Mulroy, Steve Appleby, Mike Nilsson, Barry Crabtree (BT Innovate)

Instead of constant bit rate encoding and delivery, with its known problems of variable quality, incomplete use of the network, and session rejection, we describe a system in which independent video servers, using a set of fixed-quality encodings and analyzing the statistics of the encoded video they are about to deliver, vary the aggressiveness of TCP rate control so as to obtain unequal shares of the contended network, in such a way that more complex video sequences get a bigger share. Compared to an equal share of bit rate, there is much less variation in quality between video streams delivered at the same time. This paper reports on the use of MulTCP, a server-side modification to TCP, to apportion bandwidth in a manner more appropriate to the demands of the content being streamed. We present results of NS-2 simulations of our autonomous rate-adaptive streaming server using MulTCP as the transport, and show how similar qualities across content types are maintained when streaming over a contended network.
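MulTCP's knob is the number N of virtual TCP flows it emulates; under the standard square-root TCP throughput model the achieved share scales roughly linearly with N, so complexity-proportional shares follow from apportioning N across streams (the complexity scores below are made-up placeholders):

```python
import math

# MulTCP emulates N virtual TCP flows; with the simplified TCP throughput
# model MSS / (RTT * sqrt(2p/3)), its steady-state share scales with N.
def multcp_throughput(n_virtual, mss_bytes, rtt_s, loss_p):
    single = mss_bytes / (rtt_s * math.sqrt(2 * loss_p / 3))
    return n_virtual * single

def virtual_flows_from_complexity(complexities, total_n):
    """Apportion the virtual-flow budget in proportion to content complexity."""
    s = sum(complexities)
    return [total_n * c / s for c in complexities]
```

A server streaming one simple and one complex sequence would thus hand the complex one three times the virtual flows, and hence roughly three times the contended bandwidth, which is the mechanism behind equalized delivered quality.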

Congestion State-Based Dynamic FEC Algorithm for Media Friendly Transport Layer
Hulya Seferoglu (UC Irvine), Ulaş C. Kozat, M. Reha Civanlar and James Kempf (DoCoMo USA Labs)

TCP-friendliness has been adopted as the most important property for the design of new media-specific transport layers in the Internet. The TCP protocol is mainly concerned with achieving as much throughput as possible while preventing long-term congestion. Various TCP protocol designs do this by inducing brief episodes of network congestion, measuring them, then quickly reducing the offered load to remove the congestion. Media flows, on the other hand, are very sensitive to even brief episodes of congestion. The question therefore arises: how can we protect media flows against TCP-induced network congestion? In this paper, we focus on combining the TCP Friendly Rate Control (TFRC) protocol with Forward Error Correction (FEC) to achieve such protection. We observe that FEC methods that rely solely on loss statistics generate significant overhead in terms of the redundant parity packets transmitted over the network. Accordingly, we investigate the loss and delay characteristics in several TCP-induced congestion scenarios in order to identify potential periods of increased congestion and apply FEC protection judiciously during those periods. We find that efficient models can indeed be developed and incorporated into a dynamic FEC framework, achieving a substantially better overhead vs. reliability tradeoff (e.g., up to 60% improvement in the high-reliability region) than an FEC approach that uses a fixed coding rate to satisfy a given reliability target.
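The overhead-versus-reliability tradeoff being tuned can be made concrete with the static baseline the paper compares against: for a (k+m, k) erasure code under i.i.d. packet loss p, choose the smallest parity count m meeting a residual-loss target (the dynamic scheme instead concentrates this protection in the congestion episodes it predicts):

```python
from math import comb

# Static FEC baseline: a (k+m, k) erasure-coded block is unrecoverable
# exactly when more than m of its k+m packets are lost.
def residual_loss(k, m, p):
    n = k + m
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(m + 1, n + 1))

def parity_needed(k, p, target):
    """Smallest parity count m with residual block-loss probability <= target."""
    m = 0
    while residual_loss(k, m, p) > target:
        m += 1
    return m
```

At p = 0.1, protecting 10 data packets to a 1% residual loss already costs 4 parity packets (40% overhead) if applied uniformly, which illustrates why applying FEC only during predicted congestion periods pays off.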

Predictive Control for Efficient Statistical Multiplexing of Digital Video Programs
Nesrine Changuel, Bessem Sayadi (Alcatel-Lucent Bell-Labs France), Michel Kieffer (Univ. Paris)

In broadcast and multicast systems, video programs are transmitted over a constant bit rate channel. Apart from bandwidth constraints, the control of each encoder involved in the statistical multiplexing has to satisfy several other constraints: minimum quality, fairness, and smoothness. This paper proposes an improved control scheme for multiplexing H.264/AVC-encoded video programs considering all of these constraints. The scheme groups the pictures of each program into Sets of Pictures (SOPs), whose control parameters are determined from the parameters of both past and future SOPs. This leads to smooth multiplexed programs and bounded quality differences between programs. Previous results using only past-SOP regulation showed an increase in multiplexing efficiency compared to CBR allocation. Using the proposed regulation process, higher multiplexing efficiency is achieved, allowing a potential increase in the number of multiplexed programs.

Forward and Retransmitted Systematic Lossy Error Protection For IPTV Video Multicast
Zhi Li (Stanford Univ), Xiaoqing Zhu (Cisco Systems Inc), Ali C. Begen (Cisco Systems Inc), Bernd Girod (Stanford Univ)

Emerging IPTV deployments combine Forward Error Correction (FEC) and packet retransmissions to resist the impulse noise of the Digital Subscriber Line (DSL) links. In this work, we re-engineer such a solution to improve its robustness against impulse noise while keeping it backward-compatible with the current network infrastructure. We propose forward and retransmitted Systematic Lossy Error Protection (SLEP/SLEPr), which effectively provides error resiliency at the expense of some slight loss of video quality. We demonstrate its effectiveness through a set of experiments. We further present an analytical model for SLEP/SLEPr and show that the experimental results can be explained by the analysis.

Wireless Video

Distributed Rate Control for Video Streaming over Wireless Networks with Intersession Network Coding
Hulya Seferoglu, Athina Markopoulou (UC Irvine)

In this paper, we study the problem of rate control over wireless networks with intersession network coding. We formulate the general problem as a utility maximization problem, show that it has a distributed solution and demonstrate convergence through numerical simulations. Rate control for video streaming fits naturally within this framework. However, the time-varying nature of video content implies time-varying utilities and affects the underlying network coding opportunities. A key observation is that by delaying some scenes and by optimizing the rate allocation over longer time intervals, we can create more network coding opportunities and thus achieve higher total utility. We develop distributed rate allocation schemes that exploit this observation and demonstrate the benefit through numerical simulations.

An Evaluation of Quality of Service for H.264 over 802.11e WLANs
Richard MacKenzie (University of Leeds), David Hands (BT Innovate) and Timothy O’Farrell (Swansea Univ.)

802.11 wireless local area networks are now a common feature in the home. In order to meet the quality of service (QoS) demands of the increasing number of multimedia applications on these home networks, the 802.11e amendment was developed. A suitable video coding standard for these multimedia applications is H.264, due to its high compression and error resilience. In this paper, we investigate how the quality of H.264 video is affected as the number of concurrent video streams sent over a multi-rate 802.11e network is increased. Several packet mapping schemes are compared. We show that the mapping schemes which differentiate video packets based on their frame type are more successful at maintaining acceptable video quality when congestion occurs, providing a more gradual quality degradation as congestion increases rather than the cliff-edge quality drop that tends to occur with the other mapping schemes. These differentiated schemes are more successful for videos that do not have a high amount of temporal activity. We also identify that impairments caused by congestion tend to occur towards the bottom of each frame when the flexible macroblock ordering (FMO) feature of H.264 is not used, and that the use of FMO can reduce this effect.

Online Learning for Wireless Video Transmission with Limited Information
Yu Zhang, Fangwen Fu, Mihaela van der Schaar (UCLA)

In this paper, we address the problem of joint packet scheduling at the application layer and power and rate allocation at the physical layer for delay-sensitive video streaming over slow-varying flat-fading wireless channels. Our goal is to find the optimal cross-layer policy that maximizes the cumulative received video quality while minimizing the total transmission energy. We first formulate the cross-layer optimization using a systematic layered Markov Decision Process (MDP) framework and then propose a layered real-time dynamic programming (RTDP) algorithm for solving this cross-layer optimization problem by combining policy updates with real-time decision making. This approach reduces the high complexity of the conventionally used offline dynamic programming methods. Moreover, to accommodate cases where the network environment dynamics (e.g. state transition probabilities) are unknown or non-stationary (e.g. change over time), we further improve our RTDP method by collecting the required network information and estimating the dynamics online, using a model-free approach. Based on this information, a user (a transmitter-receiver pair) can adaptively change its policy to cope in real time with the experienced environment dynamics. We also prove the convergence of this RTDP method (which complies with the layered architecture of the OSI stack). Finally, our numerical experiments show that the proposed RTDP solutions outperform conventional offline DP methods for real-time video streaming.

Cross-layer Optimization for the Scalable Video Codec over WLAN
Carolina Blanch, Tong Gan, Antoine Dejonghe, Bart Masschelein (Interuniversity Microelectronics Center)

A major limitation for wireless video communication on portable devices is the limited energy supply, which makes efficient energy usage a critical issue. In this paper we focus on minimizing the energy of the two main energy consumers on the device: the video encoding and wireless communication tasks. For this purpose, we develop a cross-layer approach that explores the tradeoff between coding and communication energies, exploiting the power-rate tradeoffs and flexibility of the Scalable Video Codec. Our results show that by adapting the codec configuration at runtime to the specific scenario we can save up to 40% of the total energy without video quality loss. Moreover, our approach is of low complexity and easily deployable.