The Role of Signal Processing in the Multimedia Communications Revolution
Getting Internet Video Ready for Prime Time
Bernd Girod,
Ya-Qin Zhang,
The keynote will be presented by Jenq-Neng Hwang on Ya-Qin Zhang's behalf.
A personal history of Perceptual Coding -- Tripping over the cobblestones to MP3 and beyond
James D. (JJ) Johnston, Microsoft, USA
KEYNOTE 1
When:
Where: Crystal Ballroom
Title: The Role of Signal Processing in the Multimedia Communications Revolution
Speaker: Lawrence Rabiner
Abstract:
We are now in the
midst of a Multimedia Communications Revolution in which virtually every aspect
of telecom is changing in ways that would have been considered unthinkable just
a decade or so ago. Perhaps the greatest
challenge in realizing this communications revolution is to figure out how to
provide a range of new services that seamlessly integrate text, sound, image,
and video information and to do it in a way that preserves the ease-of-use and
interactivity of conventional telephony, irrespective of the bandwidth or means
of access of the connection to the service.
In order to achieve this overarching goal, there are a number of
technological problems that must be considered, including:
· compression and coding of multimedia signals, including algorithmic issues, standards issues, and transmission issues;
· synthesis and recognition of multimedia signals, including speech, images, handwriting, and text;
· organization, storage, and retrieval of multimedia signals;
· access methods to the multimedia signal;
· searching;
· browsing.
In each of these areas a great deal of
progress has been made in the past few years, driven in part by the relentless
growth in processing and storage capacity of VLSI chips, and in part by the
availability of broadband access to and from the home and to and from wireless
connections.
It is the purpose of this talk to review the
status of the technology in each of the areas listed above and to illustrate
some of the challenges and limitations of current capabilities.
Speaker’s bio:
Lawrence Rabiner was born in
From
1962 through 1964, he participated in the cooperative program in Electrical
Engineering at AT&T Bell Laboratories.
During this period Dr. Rabiner worked on
digital circuitry, military communications problems, and problems in binaural
hearing. Dr. Rabiner
joined AT&T Bell Labs in 1967 as a Member of the Technical Staff. He was promoted to Supervisor in 1972,
Department Head in 1985, Director in 1990, and Functional Vice President in
1995. He joined AT&T Labs in 1996 as
Director of the Speech and Image Processing Services Research Lab, and was
promoted to Vice President of Research in 1998, where he managed a broad
research program in communications, computing, and information sciences
technologies. Dr. Rabiner
retired at the end of March 2002 and is now a Professor of Electrical and
Computer Engineering at
When:
Where: Crystal Ballroom
Title: Getting Internet Video Ready for Prime Time
Speaker: Bernd Girod
Abstract:
A decade after the introduction of video streaming,
Internet video is finally getting ready for prime time. Despite the well-known challenges of congestion, packet loss, and delay jitter, Internet video will soon look better than conventional broadcast television. This is in no small measure due to advances in media processing and communication, which enable efficient and robust media delivery. In this talk, I review recent advances and current challenges in Internet video delivery and consider some of the key questions of
real-time transport. Is best-effort good enough? How hard are media delivery deadlines? How can congestion be avoided? Should transport mechanisms be media-aware? Should we bother with packet scheduling? Can multipath routing help? I will
argue that a cross-layer paradigm comprising network-adaptive
media processing and media-aware transport is essential for superior system performance and show examples from our current research on IPTV delivery over wireless home networks and P2P live video multicast.
Speaker’s bio:
Bernd Girod is
Professor of Electrical Engineering and (by courtesy) Computer Science in the Information Systems
Laboratory of Stanford University,
California. He was Chaired Professor of Telecommunications in the Electrical Engineering Department of
the University of Erlangen-Nuremberg from 1993 to 1999. His research
interests are in the areas of networked
media systems and video signal compression. Prior visiting or regular faculty positions include MIT, Georgia
Tech, and Stanford. He has been
involved with several startup ventures as founder, director, investor, or advisor, among them
Vivo Software, 8x8 (Nasdaq: EGHT), and RealNetworks
(Nasdaq: RNWK). Since 2004, he has served as the Chairman of the new Deutsche Telekom Laboratories in
When:
Where: Crystal Ballroom
Title: Advances in
Speaker: Ya-Qin Zhang
The keynote will be presented by Jenq-Neng Hwang on Ya-Qin Zhang's behalf.
Abstract:
We see a continued convergence of the mobile, computer,
and consumer electronics industries, with rapid advances in smart devices,
communications and networking, and new applications and services. New intelligent devices are emerging with
powerful 32-bit embedded processors and multi-tasking operating systems. The
continued evolution from 2G/2.5G to 3G and advances in PAN/LAN/WAN lead to
an all-IP infrastructure with high-speed access, multi-radio technology, always-on
capability, and seamless connectivity. While voice continues to be a critical
driving force for synchronous communications, new data-centric applications,
such as messaging, media, push-to-talk, email, web browsing, location-based
services, and corporate data access, create the most exciting opportunities for
operators, OEMs/ODMs, developers, consumers, and businesses.
This talk presents
Microsoft’s vision of seamless mobile computing that enables (a) deep
connectivity of mobile devices with desktop PCs, backend servers, the web, and
other devices; (b) automatic detection, seamless roaming, and soft handover in a
multi-radio environment with the “best” QoS and
a consistent user experience; (c) a natural user interface with voice dialing,
voice command, TTS, ink, and vision; and (d) a powerful platform and ecosystem
with compelling applications and
services developed by ISVs, OEMs, and operators. The talk will discuss new
advances in the embedded and mobile space, and in particular highlight a
plethora of new devices built on Windows CE, PocketPC,
and Smartphone platforms. The talk will also touch on
a few examples of our active research work on mobility and networking across
Microsoft Research labs, including seamless roaming, mobile media, navigation,
and mesh networks.
Speaker's bio:
Ya-Qin Zhang is Corporate Vice President of
Microsoft Corporation, and President of Microsoft China Research and
Development Group. Previously, he was the Corporate Vice President of Microsoft Corporation
responsible for product development of Microsoft’s Mobile and Embedded
Division, including the WinCE operating system, Smartphone,
PocketPC, and other Windows Mobile platforms and
devices. Before that, he was the Managing Director of Microsoft Research Asia,
Microsoft’s basic research arm in the Asia-Pacific region. From 1994 to 1999, he
was the Director of Multimedia Technology Laboratory at Sarnoff Corporation in
Princeton, NJ (RCA Laboratories). He was with GTE (now Verizon)
Corp. in Waltham, MA from 1989 to 1994. He has published over 200 refereed
papers in leading international conferences and journals. He has been granted
over 50 US patents in digital video, Internet, multimedia, wireless and
satellite communications. Many of the technologies he and his team developed
have become the basis for start-up ventures, commercial products, and
international standards. He serves on the Board of Directors of five high-tech IT companies. He served as the Editor-in-Chief of
the IEEE Transactions on Video Technology, on the editorial boards of seven other
professional journals, and on over a dozen conference committees. He has been a
key contributor to the ISO/MPEG and ITU standardization efforts in digital
video and multimedia. Ya-Qin is a Fellow of the IEEE.
Ya-Qin received his B.S. and M.S. in Electrical
Engineering from the
When: Banquet
Where: TBA
Title: A personal history of Perceptual Coding -- Tripping over the cobblestones to MP3 and beyond
Speaker: James D. (JJ) Johnston
Abstract:
Somewhere in the
middle of the 1970s, while I was working on speech coding at Bell Labs, it became
completely obvious that there was more to the world than SNR or any kind of
Mean Squared Error criterion. When the Alliant™ minicomputers arrived at Bell Labs in the early
1980s, I wrote a test program for the first one that built a perceptual model
and applied it to an FFT filterbank (overlap-add,
non-critically sampled), using lots of features of the new language then
available, mostly in order to test the performance of the computer. The memory
size of the older computers had prevented this kind of work; it would not fit
in the memory available. The results
were surprising; in fact, considering perception led to an enormous drop in the
necessary bit rate, so much so that I had to spend some time ensuring that the
results weren’t just a wild mistake.
The end result of this programming exercise was called “PXFM”. After
working on PXFM, Bob Safranek and I, with some
collaboration from Phil Chou (then a new Bell Labs MTS) and Ruth Rosenholtz, created a perceptual image coder called “PIC”,
and determined rather quickly that even primitive coding methods, along with
some guidance from perceptual models, worked remarkably well for images as well.
Just about then, a surprise encounter with Karlheinz
Speaker's bio:
JJ is currently employed at Microsoft Corporation as an audio
architect. He is retired from the Speech Processing
Software and Technology Research Department of AT&T Labs - Research
in Florham Park, NJ. Before that, he was employed by
AT&T Bell Laboratories, in the Acoustics Research Department under Dr. J.
L. Flanagan, and in the Signal Processing Research Department.
His original assignments involved using analog
signal processing to do speech coding (APCM, ADPCM, SBC) for testing of
algorithms, sampling rates, and quantizer
resolutions. His first IEEE paper detailed the hardware construction of an
ADPCM implementation using analog multipliers and integrators to provide both
step-size and predictor "calculation", in a form that allowed
sampling rate and quantizer resolution changes.
Since then, he has worked in analog signal
processing, speech coding, voice privacy, quadrature
mirror filter design, and perceptual coding of both audio and images. During
this work on perceptual audio coding, he was the primary investigator of
the early PXFM audio coder, which was reported on at the ASSP Digital Audio
Meeting in
During this time, he also did an investigation of
coding of still-frame images using a forward-driven perceptual model with Dr.
R. J. Safranek, also of AT&T Bell Laboratories.
This image coder, called PIC (for Perceptual Image Coder), used very simple
techniques to provide state-of-the-art still-image compression. He was until
recently the primary researcher and inventor of AT&T's contributions to the
MPEG-2 AAC audio coding algorithm. He also represented AT&T in the ANSI-accredited
group X3L3.1, and X3L3.1 in the ISO-MPEG-AUDIO (AAC) arena in
support of the AAC algorithm.
He received his BSEE and MSEE from