Multimedia Communications : 1. Introduction Institut Sains dan Teknologi Nasional - Jakarta


Page 1: 1 Multimedia Communications Introduction

Multimedia Communications :

1. Introduction

Institut Sains dan Teknologi Nasional - Jakarta

Page 2: 1 Multimedia Communications Introduction

Reference Text: Multimedia Communications: Applications, Networks, Protocols and Standards, Fred Halsall, Addison-Wesley, 1st edition (2002), ISBN: 0-201-39818-4.

Page 3: 1 Multimedia Communications Introduction

What is Multimedia?

01/22/2007

Multimedia is a combination of text, art, sound, animation, and video.

Slide: Courtesy, Hung Nguyen

Page 4: 1 Multimedia Communications Introduction

Multimedia Description

Introduction to Multimedia

Multimedia is an integration of continuous media (e.g. audio, video) and discrete media (e.g. text, graphics, images) through which digital information can be conveyed to the user in an appropriate way.

• Multi: many, much, multiple

• Medium: an intervening substance through which something is transmitted or carried on

Page 5: 1 Multimedia Communications Introduction

Why Multimedia Computing?

• Application driven, e.g. medicine, sports, entertainment, education

• Information can often be better represented using audio, video, or animation than using text, images, and graphics alone.

• Information is distributed using computer and telecommunication networks.

• Integration of multiple media places demands on:
    • computation power
    • storage requirements
    • networking requirements

Page 6: 1 Multimedia Communications Introduction

Multimedia Information Systems

Technical challenges:

• Sheer volume of data
    • Need to manage huge volumes of data.

• Timing requirements
    • Among components of data computation and communication.
    • Must work internally within given timing constraints; real-time performance is required.

• Integration requirements
    • Need to process traditional media (text, images) as well as continuous media (audio/video).
    • Media are not always independent of each other; synchronization among the media may be required.

Page 7: 1 Multimedia Communications Introduction

High Data Volume of Multimedia Information

• Speech: 8000 samples/s at 1 byte/sample, 8 Kbytes/s

• CD Audio: 44,100 samples/s, 2 bytes/sample, 2 channels (stereo), 176 Kbytes/s

• Satellite Imagery: 180 × 180 km² at 30 m² resolution, 600 MB/image (60 MB compressed)

• NTSC Video: 30 fps, 640 × 480 pixels, 3 bytes/pixel, about 30 Mbytes/s (2-8 Mbits/s compressed)
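The raw rates on this slide follow from multiplying the sampling parameters together. The short sketch below reproduces that arithmetic; the function name `raw_rate` is a hypothetical helper, and the 1 byte/sample for speech and the 2 channels for CD audio are assumptions implied by the quoted totals rather than stated on the slide.

```python
def raw_rate(samples_per_s, bytes_per_sample, channels=1):
    """Uncompressed data rate in bytes per second."""
    return samples_per_s * bytes_per_sample * channels


# Speech: 1 byte/sample is assumed, since 8000 samples/s is quoted as 8 Kbytes/s.
speech_rate = raw_rate(8000, 1)

# CD audio: two channels (stereo) are assumed to reach the quoted 176 Kbytes/s.
cd_rate = raw_rate(44_100, 2, channels=2)

# Video: pixels per second (width * height * fps) times bytes per pixel.
video_rate = raw_rate(640 * 480 * 30, 3)

for name, rate in [("speech", speech_rate), ("CD audio", cd_rate), ("video", video_rate)]:
    print(f"{name}: {rate / 1000:.1f} Kbytes/s")
```

The video figure comes out slightly under the rounded 30 Mbytes/s quoted above, which is consistent with the slide giving an approximate value.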

Page 8: 1 Multimedia Communications Introduction

Technology Incentive

• Growth in computational capacity
    • MM workstations with audio/video processing capability
    • Dramatic increase in CPU processing power
    • Dedicated compression engines for audio, video, etc.

• Rise in storage capacity
    • Large-capacity disks (several gigabytes)
    • Increase in storage bandwidth, e.g. disk array technology

• Surge in available network bandwidth
    • High-speed fiber optic networks (gigabit networks)
    • Fast packet switching technology

Page 9: 1 Multimedia Communications Introduction

Application Areas

• Residential services
    • Video-on-demand
    • Video phone/conferencing systems
    • Multimedia home shopping (MM catalogs, product demos and presentations)
    • Self-paced education

• Business services
    • Corporate training
    • Desktop MM conferencing, MM e-mail

Page 10: 1 Multimedia Communications Introduction

Application Areas

• Education
    • Distance education: MM repository of class videos
    • Access to digital MM libraries over high-speed networks

• Science and technology
    • Computational visualization and prototyping
    • Astronomy, environmental science

• Medicine
    • Diagnosis and treatment, e.g. MM databases that provide support for queries on scanned images, X-rays, assessments, response, etc.

Page 11: 1 Multimedia Communications Introduction

Classification of Media

• Perception medium: how do humans perceive information in a computer?
    • Through seeing: text, images, video
    • Through hearing: music, noise, speech

• Representation medium: how is the computer information encoded?
    • Using formats for representing information
    • ASCII (text), JPEG (image), MPEG (video)

• Presentation medium: through which medium is information delivered by the computer or introduced into the computer?
    • Via I/O tools and devices
    • Paper, screen, speakers (output media)
    • Keyboard, mouse, camera, microphone (input media)

Page 12: 1 Multimedia Communications Introduction

Classification of Media (cont.)

• Storage medium: where will the information be stored?
    • Storage media: floppy disk, hard disk, tape, CD-ROM, etc.

• Transmission medium: over what medium will the information be transmitted?
    • Using information carriers that enable continuous data transmission (networks)
    • Wire, coaxial cable, fiber optics

• Information exchange medium: which information carrier will be used for information exchange between different places?
    • Direct transmission using computer networks
    • Combined use of storage and transmission media (e.g. electronic mail)

Page 13: 1 Multimedia Communications Introduction

Media Concepts

Each medium defines:

• Representation values, which determine the information representation of different media
    • Continuous representation values (e.g. electromagnetic waves)
    • Discrete representation values (e.g. text characters in digital form)

• Representation space, which determines the surroundings where the media are presented
    • Visual representation space (e.g. paper, screen)
    • Acoustic representation space (e.g. stereo)

Page 14: 1 Multimedia Communications Introduction

Media Concepts (cont.)

Representation dimensions of a representation space are:

• Spatial dimensions
    • Two-dimensional (2D graphics)
    • Three-dimensional (holography)

• Temporal dimensions
    • Time independent (document): discrete media. Information consists of a sequence of individual elements without a time component.
    • Time dependent (movie): continuous media. Information is expressed not only by its individual value but also by its time of occurrence.

Page 15: 1 Multimedia Communications Introduction

Multimedia Systems

Qualitative and quantitative evaluation of multimedia systems:

• Combination of media: continuous and discrete

• Levels of media independence: some media types (audio/video) may be tightly coupled, others may not

• Computer-supported integration: timing, spatial, and semantic synchronization

• Communication capability

Page 16: 1 Multimedia Communications Introduction

Data Streams

Distributed multimedia communication systems:

• Data of discrete and continuous media are broken into individual units (packets) and transmitted.

Data stream:

• A sequence of individual packets that are transmitted in a time-dependent fashion.

• Transmission of information carrying different media leads to data streams with varying features:
    • Asynchronous
    • Synchronous
    • Isochronous

Page 17: 1 Multimedia Communications Introduction

Data Stream Characteristics

• Asynchronous transmission mode
    • Provides for communication with no time restriction.
    • Packets reach the receiver as quickly as possible, e.g. protocols for email transmission.

• Synchronous transmission mode
    • Defines a maximum end-to-end delay for each packet of a data stream.
    • May require intermediate storage.
    • E.g. an audio connection established over a network.

• Isochronous transmission mode
    • Defines a maximum and a minimum end-to-end delay for each packet of a data stream, so the delay jitter of individual packets is bounded.
    • E.g. transmission of video over a network.
    • Intermediate storage requirements are reduced.
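The three transmission modes differ only in which end-to-end delay bounds they impose. As a rough sketch of that distinction, the hypothetical helper `meets_delay_bounds` below checks a list of measured packet delays against an optional maximum and minimum; the delay values and bounds are made-up illustration data, not figures from the slides.

```python
def meets_delay_bounds(delays_ms, max_delay_ms=None, min_delay_ms=None):
    """Return True if every packet delay respects the given bounds.

    asynchronous: no bounds           (both arguments None)
    synchronous:  maximum delay only  (max_delay_ms set)
    isochronous:  maximum and minimum (both set), which bounds the jitter
    """
    for delay in delays_ms:
        if max_delay_ms is not None and delay > max_delay_ms:
            return False
        if min_delay_ms is not None and delay < min_delay_ms:
            return False
    return True


# Hypothetical measured end-to-end delays in milliseconds.
delays = [42, 55, 48, 61, 45]

asynchronous_ok = meets_delay_bounds(delays)
synchronous_ok = meets_delay_bounds(delays, max_delay_ms=80)
isochronous_ok = meets_delay_bounds(delays, max_delay_ms=80, min_delay_ms=50)

# In the isochronous case the jitter cannot exceed max - min.
jitter_bound_ms = 80 - 50
```

Here the stream satisfies the synchronous bound but not the isochronous one, because some packets arrive earlier than the assumed minimum delay. This is also why a synchronous receiver may need intermediate storage: early packets have to be held until they are due.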

Page 18: 1 Multimedia Communications Introduction

Data Stream Characteristics

Data stream characteristics for continuous media can be based on:

• Time intervals between complete transmission of consecutive packets
    • Strongly periodic data streams: constant time interval
    • Weakly periodic data streams: intervals follow a periodic function with finite period
    • Aperiodic data streams

• Data size: amount of data in consecutive packets
    • Strongly regular data streams: constant amount of data
    • Weakly regular data streams: amount varies periodically with time
    • Irregular data streams

• Continuity
    • Continuous data streams
    • Discrete data streams
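The first two classifications apply the same test to different measurements: inter-packet intervals for periodicity and packet sizes for regularity. The sketch below is one possible way to express that test, assuming exact (for example integer) values and requiring a repeating pattern to be observed at least twice; the function names and sample values are hypothetical.

```python
def pattern_kind(values):
    """Classify a sequence as 'strong' (constant), 'weak' (repeating) or 'none'."""
    if len(set(values)) <= 1:
        return "strong"
    n = len(values)
    # Try every candidate period that fits at least twice into the sequence.
    for period in range(2, n // 2 + 1):
        if all(values[i] == values[i % period] for i in range(n)):
            return "weak"
    return "none"


PERIODICITY = {"strong": "strongly periodic", "weak": "weakly periodic", "none": "aperiodic"}
REGULARITY = {"strong": "strongly regular", "weak": "weakly regular", "none": "irregular"}


def classify_intervals(intervals):
    """Classify a stream by the time intervals between consecutive packets."""
    return PERIODICITY[pattern_kind(intervals)]


def classify_sizes(sizes):
    """Classify a stream by the amount of data in consecutive packets."""
    return REGULARITY[pattern_kind(sizes)]


print(classify_intervals([20, 20, 20, 20]))          # constant interval
print(classify_intervals([10, 20, 30, 10, 20, 30]))  # repeating interval pattern
print(classify_sizes([900, 120, 400, 75]))           # no visible pattern
```

With real measurements the equality test would need a tolerance, since timestamps and sizes are rarely exactly repeatable.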

Page 19: 1 Multimedia Communications Introduction

Classification based on time intervals

[Figure: timing diagrams for a strongly periodic data stream (constant interval T), a weakly periodic data stream (intervals T1, T2, T3 repeating with period T), and an aperiodic data stream (intervals T1, T2, ...)]

Page 20: 1 Multimedia Communications Introduction

Classification based on packet size

[Figure: packet size over time t for a strongly regular data stream (constant size D1), a weakly regular data stream (sizes D1, D2, D3 repeating with period T), and an irregular data stream (sizes D1, D2, D3, ..., Dn)]

Page 21: 1 Multimedia Communications Introduction

Classification based on continuity

[Figure: packets D1, D2, D3, D4 of size D shown for a continuous data stream and for a discrete data stream]

Page 22: 1 Multimedia Communications Introduction

Logical Data Units

Continuous media consist of a time-dependent sequence of individual information units called Logical Data Units (LDUs).

• A symphony consists of independent movements.
• A movement consists of notes.
• Notes are sequences of samples.

Granularity of LDUs:

• Symphony, movement, individual notes, grouped samples, individual samples
• Film, clip, frame, raster, pixel

Duration of an LDU:

• Open LDU: duration not known in advance
• Closed LDU: predefined duration

Page 23: 1 Multimedia Communications Introduction

Granularity of Logical Data Units

[Figure: granularity hierarchy, from coarsest to finest: Film, Clip, Frame, Blocks, Pixels]

Page 24: 1 Multimedia Communications Introduction

Multimedia Components Simplified

Multimedia can be viewed as the combination of audio, video, and data and how they interact with the user (more than the sum of the individual components).

[Figure: Multimedia at the center of Audio, Video, and Data]

Page 25: 1 Multimedia Communications Introduction

Background

• Fast-paced emergence of applications in medicine, education, travel, etc.

• Characterized by large documents that must be communicated with short delays

• Glamorous applications such as distance learning and video teleconferencing

• Applications that are enhanced by video are often seen as a driver for the development of multimedia networks

Page 26: 1 Multimedia Communications Introduction

Forces Driving Communications That Facilitate Multimedia Communications

• Evolution of communications and data networks

• Increasing availability of almost unlimited bandwidth on demand

• Availability of ubiquitous access to the network

• Ever-increasing amount of memory and computational power

• Sophisticated terminals

• Digitization of virtually everything

Page 27: 1 Multimedia Communications Introduction

New Information System Paradigm

[Figure: integration of multimedia integrated communication (broadband link) with multimedia processing (workstation, PC)]

Slide: Courtesy, Hung Nguyen

Page 28: 1 Multimedia Communications Introduction

Elements of Multimedia Systems

Two key communication modes:

• Person-to-person
• Person-to-machine

[Figure: person-to-person mode: User, Interface, Transport, Interface, User; person-to-machine mode: User, Interface, Transport, Processing, Storage and Retrieval]

Slide: Courtesy, Hung Nguyen

Page 29: 1 Multimedia Communications Introduction

Multimedia Networks

The world has been wrapped in copper and glass fiber and can be viewed as a “hair ball” with physical, wireless, and satellite entry/exit points.

• Physical: LAN-WAN connections
• Wireless: cellular telephony, wireless PC connectivity
• Satellite: INMARSAT, Thuraya, ACeS, etc.

Page 30: 1 Multimedia Communications Introduction

Multimedia Communication Model


Partitioning of information objects into distinct types, e.g., text, audio, video

Standardization of service components per information type

Creation of platforms at two levels – network service and multimedia communication

Define general applications for multiple use in various multimedia environments

Define specific applications, e.g. e-commerce, tele-training, … using building blocks from platform and general applications

Page 31: 1 Multimedia Communications Introduction

Requirements

• User requirements
    • Fast preparation and presentation
    • Dynamic control of multimedia applications
    • Intelligent support to users
    • Standardization

• Network requirements
    • High speed and variable bit rates
    • Multiple virtual connections using the same access
    • Synchronization of different information types
    • Suitable standardized services along with support

Page 32: 1 Multimedia Communications Introduction

Network Requirements

• ATM-BISDN and SS7 have enabled the switching-based communications capabilities over the PSTN that support the necessary services.

• ATM-BISDN-SS7 will evolve to all-optical “switchless” networks based on packet transfer.

Page 33: 1 Multimedia Communications Introduction

Packet Transfer Concept

• Allows voice, video, and data to be dealt with in a common format

• More flexible than circuit switching, which it can emulate while allowing the multiplexing of varied bit rate data streams

• Dynamic allocation of bandwidth

• Handles Variable Bit Rate (VBR) directly

Page 34: 1 Multimedia Communications Introduction

Considerations

• Buffering is required for constant bit rate data such as audio.

• Re-sequencing and recovery capabilities must be provided over networks where packets may be received in an order different from that transmitted, or dropped.
    • In an ATM network some packets can be dropped while others may not (e.g. voice vs. bank transfer data packets).

• Optimum packet lengths for voice, video, and data differ in an ATM network.

• IP packets over the Internet may arrive in a different order or be dropped.
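Re-sequencing is usually done with sequence numbers and a small receive buffer. The sketch below is a minimal illustration under the assumption that every packet carries an integer sequence number starting at 0; the class `ReorderBuffer` and its methods are hypothetical names, not part of any protocol mentioned in the slides.

```python
class ReorderBuffer:
    """Release packets in sequence order even if they arrive out of order."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # next sequence number owed to the application
        self.pending = {}          # out-of-order packets waiting for a gap to fill

    def _drain(self):
        """Collect every buffered packet that is now in order."""
        released = []
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released

    def push(self, seq, payload):
        """Accept one packet; return the payloads that can be delivered now."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate or too late, ignore it
        self.pending[seq] = payload
        return self._drain()

    def skip_missing(self):
        """Declare the awaited packet lost; return (lost_seq, released payloads)."""
        lost = self.next_seq
        self.next_seq += 1
        return lost, self._drain()


buf = ReorderBuffer()
first = buf.push(0, "a")    # in order, delivered at once
held = buf.push(2, "c")     # arrives early, held back
caught_up = buf.push(1, "b")  # fills the gap, releases both
buf.push(4, "e")            # packet 3 never arrives
lost, after_loss = buf.skip_missing()
```

Whether a missing packet is skipped (as is tolerable for voice) or re-requested (as for bank transfer data) is the recovery decision the slide refers to; this sketch only shows the skip case.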

Page 35: 1 Multimedia Communications Introduction

Digital Video Signal Transport

The following figure will be examined over the course of the semester.

[Figure: digital video signal transport chain from Video to Users]

• Encoder: transformation, quantization, entropy coding, bit-rate control
• Application (sending side): data structuring
• Network: multiplexing/routing; overhead (FEC), re-transmission
• Receiving side: error detection, loss detection, error correction, erasure correction
• Application (receiving side): re-synchronization
• Decoder: de-quantization, entropy decoding, inverse transformation, loss concealment, post-processing

Page 36: 1 Multimedia Communications Introduction

Quality of Service (QoS)

The set of parameters that defines the properties of media streams.

Four QoS layers can be defined:

1. User QoS: perception of the multimedia data at the user interface (“qualitative”)

2. Application QoS: parameters such as end-to-end delay (“quantitative”)

3. System QoS: requirements on the communications services derived from the application QoS

4. Network QoS: parameters such as network load and performance

Page 37: 1 Multimedia Communications Introduction

Applications of Multimedia

Business - Business applications for multimedia include presentations, training, marketing, advertising, product demos, databases, catalogues, instant messaging, and networked communication.

Schools - Educational software can be developed to enrich the learning process.

Slide: Courtesy, Hung Nguyen

Page 38: 1 Multimedia Communications Introduction

Applications of Multimedia


Home - Most multimedia projects reach the homes via television sets or monitors with built-in user inputs.

Public places - Multimedia will become available at stand-alone terminals or kiosks to provide information and help.

Slide: Courtesy, Hung Nguyen

Page 39: 1 Multimedia Communications Introduction

Compact Disc Read-Only (CD-ROM)


CD-ROM is the most cost-effective distribution medium for multimedia projects.

It can contain up to 80 minutes of full-screen video or sound.

CD burners are used for reading discs and converting the discs to audio, video, and data formats.

Slide: Courtesy, Hung Nguyen

Page 40: 1 Multimedia Communications Introduction

Digital Versatile Disc (DVD)


Multilayered DVD technology increases the capacity of current optical technology to 18 GB.

DVD authoring and integration software is used to create interactive front-end menus for films and games.

DVD burners are used for reading discs and converting the disc to audio, video, and data formats.

Slide: Courtesy, Hung Nguyen

Page 41: 1 Multimedia Communications Introduction

Multimedia Communications

Multimedia communications is the delivery of multimedia to the user by electronic or digitally manipulated means.

[Figure: Multimedia Communications as the combination of:]

• Audio communications (telephony, sound, broadcast)
• Video communications (video telephony, TV/HDTV)
• Data, text, image communications (data transfer, fax…)

Slide: Courtesy, Hung Nguyen

Page 42: 1 Multimedia Communications Introduction

Multimedia Terms

Page 43: 1 Multimedia Communications Introduction

Alternative Types of Media used in Multimedia Applications

Page 44: 1 Multimedia Communications Introduction

Multimedia Communications Networks

Page 45: 1 Multimedia Communications Introduction

Multimedia Networks and Their Services

Page 46: 1 Multimedia Communications Introduction

Multimedia Networks and Their Services

Page 47: 1 Multimedia Communications Introduction

Audio-Visual Integration

Page 48: 1 Multimedia Communications Introduction

Application in Biometrics – Bimodal Person Verification

• Existing methods for person verification are mainly based on a single modality, which limits their security and robustness.

• Audio-visual integration using a camera and microphone makes person verification more reliable.

Slide: Courtesy, Hung Nguyen

Page 49: 1 Multimedia Communications Introduction

Joint Audio-Video Coding

• Correlation between audio and video can be used to achieve more efficient coding.

• Predictive coding of audio and video information is used to construct an estimate of the current frame (cross-modal redundancy).

• The difference between the original and estimated signal can be transmitted as parameters.

• The decision on what and how to send is based on Rate Distortion (R-D) criteria.

• Reconstruction is done at the receiver according to agreed-upon decoding rules.
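A common way to express a rate-distortion decision is to minimize a Lagrangian cost J = D + λR over the available options. The slides do not say which criterion the author has in mind, so the sketch below should be read as one plausible formulation; the option names, rates, distortions, and λ values are all hypothetical.

```python
def choose_option(options, lam):
    """Pick the option with the lowest Lagrangian cost J = D + lam * R.

    options maps a name to (rate_bits, distortion).
    """
    def cost(name):
        rate_bits, distortion = options[name]
        return distortion + lam * rate_bits

    return min(options, key=cost)


# Hypothetical choices for one frame: send nothing and rely on the
# prediction, send only the prediction error, or send the full parameter.
options = {
    "nothing": (0, 9.0),
    "difference": (4, 2.0),
    "parameter": (12, 0.5),
}

cheap_bits = choose_option(options, lam=0.1)      # bits are cheap, favour quality
balanced = choose_option(options, lam=1.0)
expensive_bits = choose_option(options, lam=5.0)  # bits are costly, favour low rate
```

Raising λ makes each transmitted bit more expensive, so the choice moves from sending the full parameter toward sending nothing.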

Slide: Courtesy, Hung Nguyen

Page 50: 1 Multimedia Communications Introduction

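Cross-Modal Predictive Coding

[Figure: block diagram. Visual analysis produces parameter X; an audio-to-visual (A-to-V) mapping produces the estimate X̂; a rate-distortion (R-D) decision module chooses whether to send parameter X, the difference X − X̂, or nothing.]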

Slide: Courtesy, Hung Nguyen

Page 51: 1 Multimedia Communications Introduction

Importance of Interaction

• Multimedia is more than the combination of text, audio, video, and data.

• Interaction among media is important.

• Consider a poorly dubbed movie:
    • Audio not synchronized with video
    • Lip movements inconsistent with language
    • Audio dynamic range inconsistent with the scene

Slide: Courtesy, Hung Nguyen

Page 52: 1 Multimedia Communications Introduction

Media Interaction

[Figure: interactions among Audio, Text, Image, and Video within Multimedia (process and model). Labels in the figure: lip synch, face animation, joint A/V coding; compression, synthesis, 3D sound; sign language, lip reading; speech recognition, text-to-speech; compression, graphics, database indexing/retrieval; translation, natural language.]

Slide: Courtesy, Hung Nguyen

Page 53: 1 Multimedia Communications Introduction

Bimodality of Human Speech

Human speech is produced by vibration of the vocal cords and configuration of the vocal tract, with muscles that also generate facial expressions.

Audio + Visual → Perceived

ba + ga → da

pa + ga → ta

ma + ga → na

Slide: Courtesy, Hung Nguyen

Page 54: 1 Multimedia Communications Introduction

Basic Definitions

• The basic unit of acoustic speech is called a phoneme.

• In the visual domain, the basic unit of mouth movement is called a viseme.
    • A viseme is the smallest visibly distinguishable unit of speech.
    • A viseme can contain several phonemes, which thus form one viseme group.
    • There is a many-to-one mapping between phonemes and visemes.
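A many-to-one mapping is naturally represented as a lookup table whose inverse gives the viseme groups. The sketch below uses a tiny illustrative table: the grouping of /p/, /b/, /m/ and of /f/, /v/ reflects sounds that look alike on the lips, but the table and the viseme labels are assumptions for illustration, not a standard inventory from the slides.

```python
from collections import defaultdict

# Illustrative (assumed) many-to-one table: several phonemes share one viseme.
PHONEME_TO_VISEME = {
    "p": "bilabial",
    "b": "bilabial",
    "m": "bilabial",
    "f": "labiodental",
    "v": "labiodental",
}


def to_visemes(phonemes):
    """Map a phoneme sequence to the corresponding viseme sequence."""
    return [PHONEME_TO_VISEME[p] for p in phonemes]


def viseme_groups(mapping):
    """Invert the table: each viseme with the sorted phonemes it contains."""
    groups = defaultdict(list)
    for phoneme, viseme in mapping.items():
        groups[viseme].append(phoneme)
    return {viseme: sorted(members) for viseme, members in groups.items()}


groups = viseme_groups(PHONEME_TO_VISEME)
print(groups)
print(to_visemes(["p", "b", "m"]))  # three phonemes, one visible mouth shape
```

Because the mapping is many-to-one, it cannot be inverted uniquely, which is one reason lip reading alone is ambiguous.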

Slide: Courtesy, Hung Nguyen

Page 55: 1 Multimedia Communications Introduction

Lip Reading System

• Application to support hearing-impaired persons

• People learn to understand spoken language by combining visual content with lexical, syntactic, semantic, and pragmatic information

• Automated lip reading systems
    • Speech recognition is possible using only visual information
    • Can be integrated with speech recognition systems to improve accuracy

Slide: Courtesy, Hung Nguyen

Page 56: 1 Multimedia Communications Introduction

Lip Synchronization

• Applications
    • In VTC (video teleconferencing), where video frames are dropped (low bandwidth requirement) but audio must still be continuous
    • In non-real-time use such as dubbing in a studio, where the recorded voice is full of background noise

• Time-warping is commonly used in both audio and video modes
    • Time-frequency analysis
    • Video time-warping could be used for VTC
    • Audio time-warping could be used for dubbing
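The slides do not name a specific time-warping algorithm. Dynamic time warping (DTW) is a widely used one, so the sketch below uses it as an assumed stand-in to show how two sequences of different lengths can be aligned; the one-dimensional feature values are hypothetical.

```python
def dtw(a, b):
    """Dynamic time warping between two non-empty numeric sequences.

    Returns (total_cost, path), where path lists aligned index pairs (i, j).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: best cost of aligning the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover which samples were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Hypothetical feature tracks: the second one lingers on the middle value.
audio_track = [1, 2, 3]
video_track = [1, 2, 2, 3]
total_cost, alignment = dtw(audio_track, video_track)
print(total_cost, alignment)
```

The alignment path shows which samples of one track should be stretched or repeated to stay in step with the other, which is the basic operation behind warping audio to match lips or video to match audio.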

Slide: Courtesy, Hung Nguyen

Page 57: 1 Multimedia Communications Introduction

Lip Tracking

• Aims to prevent too much jerkiness in the motion rendering and too much loss in lip synchronization

• Involves real-time analysis of the 3-dimensional video signal plus one temporal dimension

• Produces meaningful parameters
    • Classification of mouth images into visemes
    • Measures of dimension, e.g. mouth widths and heights

• Analysis tools: Fourier Transform, Karhunen-Loeve Transform (KLT), Probability Density Function (pdf) estimation

Slide: Courtesy, Hung Nguyen

Page 58: 1 Multimedia Communications Introduction

Audio-to-Visual Mapping for Lip Tracking

• Conversion of acoustic speech to mouth shape parameters

• A mapping of phonemes to visemes
    • Could be most precisely implemented with a complete speech recognizer followed by a look-up table
    • High computational overhead plus table look-up complexity

• There is no need to recognize the spoken word to achieve audio-to-visual mapping
    • Physical relationships exist between vocal tract shape and the sound produced
    • Functional relationships exist between speech and visual parameters

Slide: Courtesy, Hung Nguyen

Page 59: 1 Multimedia Communications Introduction

Classification-Based Conversion Approaches for Lip Tracking

• Two-step process
    • Classification of the acoustic signal using VQ (vector quantization), HMM (hidden Markov model), and NN (neural network)
    • Mapping of the acoustic classes into corresponding visual outputs, which are then averaged to get a centroid

• Shortcomings
    • Error resulting from averaging visual vectors to get the visual centroid
    • Not a continuous mapping: finite output levels
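The two-step process can be sketched with the simplest of the listed classifiers, vector quantization. In the illustration below, an acoustic feature vector is assigned to its nearest codeword, and each class maps to the average of the visual vectors seen with it during training. The codebook, the training pairs, and all function names are hypothetical.

```python
def nearest_class(codebook, vector):
    """Step 1: index of the codeword closest to the acoustic vector (VQ)."""
    def squared_distance(codeword):
        return sum((c - v) ** 2 for c, v in zip(codeword, vector))

    return min(range(len(codebook)), key=lambda k: squared_distance(codebook[k]))


def train_visual_centroids(codebook, pairs):
    """Average the visual vectors of all training pairs falling into each class."""
    buckets = {}
    for acoustic, visual in pairs:
        buckets.setdefault(nearest_class(codebook, acoustic), []).append(visual)
    return {
        k: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for k, vectors in buckets.items()
    }


def acoustic_to_visual(codebook, centroids, acoustic):
    """Step 2: output the visual centroid of the acoustic vector's class."""
    return centroids[nearest_class(codebook, acoustic)]


# Hypothetical 2-D acoustic codebook and (acoustic, visual) training pairs,
# where a visual vector could stand for (mouth width, mouth height).
codebook = [(0.0, 0.0), (10.0, 10.0)]
pairs = [
    ((0.1, 0.2), (1.0, 2.0)),
    ((0.3, -0.1), (3.0, 4.0)),
    ((9.5, 10.2), (8.0, 6.0)),
]
centroids = train_visual_centroids(codebook, pairs)
mouth_shape = acoustic_to_visual(codebook, centroids, (1.0, 1.0))
```

Both shortcomings from the slide are visible here: the output for class 0 is an average that matches neither training example exactly, and the converter can only ever produce as many distinct mouth shapes as there are classes.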

Slide: Courtesy, Hung Nguyen

Page 60: 1 Multimedia Communications Introduction

Classification-Based Conversion

[Figure: classes in phoneme space mapped to centroids in viseme space]

Slide: Courtesy, Hung Nguyen

Page 61: 1 Multimedia Communications Introduction

Audio and Visual Integration for Lip Reading Applications

Three major steps:

• Audio-visual pre-processing: Principal Component Analysis (PCA) has been used for feature extraction

• Pattern recognition strategy (HMM, NN, time-warping…)

• Integration strategy (decision making)
    • Heuristic rules to incorporate knowledge of phonemes about the two modalities
    • Combination of independent evaluation scores for each modality
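Combining independent scores from each modality is often done with a weighted sum, sometimes called late fusion. The slides do not specify a combination rule, so the sketch below is only one simple possibility; the score values, the class labels, and the `audio_weight` parameter are hypothetical.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight=0.5):
    """Weighted sum of per-class scores from two independent recognizers.

    Returns (best_class, fused_scores). Only classes scored by both
    modalities are considered.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    shared = audio_scores.keys() & visual_scores.keys()
    fused = {
        label: audio_weight * audio_scores[label]
        + (1.0 - audio_weight) * visual_scores[label]
        for label in shared
    }
    return max(fused, key=fused.get), fused


# Hypothetical recognizer outputs for one utterance.
audio_scores = {"ba": 0.6, "da": 0.3, "ga": 0.1}
visual_scores = {"ba": 0.1, "da": 0.5, "ga": 0.4}

equal_trust, _ = fuse_scores(audio_scores, visual_scores, audio_weight=0.5)
trust_audio, _ = fuse_scores(audio_scores, visual_scores, audio_weight=0.9)
```

A heuristic rule of the kind the slide mentions could set the weight per situation, for example lowering `audio_weight` when the acoustic channel is noisy.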

Slide: Courtesy, Hung Nguyen