- Number of slides: 62
INF 5070 – Media Storage and Distribution Systems Data Formats and Codecs 30/8 – 2004
Why codecs and formats?
- Codecs (coders/decoders)
  - Determine how information is represented
  - Important for servers and distribution systems
    - Required sending speed
    - Amount of loss allowed
    - Buffers required
    - …
- Formats
  - Determine how data is stored
  - Important for servers and distribution systems
    - Where is the data?
    - Where is the data about the data?
INF 5070 – media servers and distribution systems 2004 Carsten Griwodz & Pål Halvorsen
Media data
Media data
- Medium: "thing in the middle"
  - Here: means to distribute and present information
- Media affect human-computer interaction
- The mantra of multimedia users
  - Speaking is faster than writing
  - Listening is easier than reading
  - Showing is easier than describing
Dependence of Media
- Time-independent media (discrete media)
  - Text
  - Graphics
- Time-dependent media (continuous media)
  - Audio
  - Video
- Interdependent media
  - Multimedia
- "Continuous" refers to the user's impression of the data, not necessarily to its representation
- Combined video and audio is multimedia; the relations between them must be specified
Dependence of Media
- Defined by the presentation of the data, not its representation
- Discrete media
  - Text
  - Graphics
  - Video stills (image displayed by pausing a video stream)
- Continuous media
  - Audio
  - Video
  - Animation
  - Ticker news (continuously scrolling text)
- Multimedia
  - Multiplexed audio and video
Properties of a Multimedia System
- Flexibility
  - Provide mechanisms to handle all kinds of media, in particular discrete and continuous media
  - A VCR and a desktop publishing system for text and graphics are not multimedia systems
  - An editor with voice annotation is a multimedia system
- Integration
  - Independent media storage
  - Computer-controlled media combination
- Definition
  - A multimedia system is characterized by the integrated computer-controlled handling of independent discrete and continuous media
Coding for distribution
Compression - Necessity
- E.g., video sequence
  - 25 images/s (PAL standard)
  - 3 bytes/pixel
    - YUV (luminance + 2 chrominance values)
    - RGB (red-green-blue values)
  - Image resolution 640 * 480 pixels
- Data rate = 640 * 480 * 3 bytes * 25/s = 23,040,000 bytes/s ≈ 22 MByte/s
  - Approx. 1/16 stream over Ethernet
  - Approx. 1/2 stream over Fast Ethernet
- Compression is necessary
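The arithmetic on this slide can be checked with a short sketch. The link rates of 10 Mbit/s for Ethernet and 100 Mbit/s for Fast Ethernet are assumptions (nominal rates, no protocol overhead); the slide only gives the rounded fractions.

```python
def raw_video_rate(width, height, bytes_per_pixel, frames_per_second):
    """Uncompressed video data rate in bytes per second."""
    return width * height * bytes_per_pixel * frames_per_second

rate = raw_video_rate(640, 480, 3, 25)   # bytes per second
rate_mbyte = rate / 2**20                # in MByte/s (2**20 bytes)

# Assumed nominal link rates, converted from bit/s to byte/s.
ethernet = 10_000_000 / 8
fast_ethernet = 100_000_000 / 8

# Fraction of one uncompressed stream that fits on each link.
fraction_ethernet = ethernet / rate
fraction_fast_ethernet = fast_ethernet / rate
```

Under these assumptions Ethernet carries about 1/18 of a stream and Fast Ethernet a little over 1/2, in line with the rounded figures on the slide.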
Compression – General Requirements
- Dependence on application type:
  - Dialogue mode
  - Retrieval mode
Compression – Mode Dependent Requirements
- Dialogue and retrieval mode requirements:
  - Synchronization of audio, video, and other media
- Dialogue mode requirements:
  - End-to-end delay < 150 ms
  - Compression and decompression in real-time
  - Symmetric
- Retrieval mode requirements:
  - Fast forward and backward data retrieval
  - Random access within 1/2 s
  - Asymmetric
- We look mainly at retrieval mode!
Compression Categories
Basic Encoding Steps
Run-Length Coding
- Assumption
  - Long sequences of identical symbols
- Example
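Since the slide's example did not survive, here is a minimal sketch of run-length coding. The function names and the (symbol, run length) pair representation are illustrative assumptions, not the notation used in the lecture.

```python
def rle_encode(symbols):
    """Collapse runs of identical symbols into (symbol, run_length) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([s, 1])       # start a new run
    return [(s, n) for s, n in runs]

def rle_decode(runs):
    """Expand (symbol, run_length) pairs back into the symbol sequence."""
    out = []
    for s, n in runs:
        out.extend([s] * n)
    return out
```

The coding only pays off when the assumption on the slide holds: input without long runs produces one pair per symbol and grows instead of shrinking.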
Bit-Plane Coding
- Assumption
  - Even longer sequences of identical bits
- Example
  - Absolute values: 10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 1, 0, …, 0, 0
  - Sign bits: 0, x, 1, x, 0, 0, x, x, 1, x, x, 0, x, …, x, x
  - Each bit plane is coded as a sequence of (number of 0s before a 1, end-of-plane flag) symbols:
    - MSB: (0, 1)
    - MSB-1: (2, 1)
    - MSB-2: (0, 0)(1, 0)(2, 0)(1, 0)(0, 0)(2, 1)
    - MSB-3: (5, 0)(8, 1)
- Up to 20% savings over run-length coding can be achieved
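The symbol format on this slide can be sketched as follows. The input block is a hypothetical 16-value sequence chosen so that the output reproduces the four symbol lists shown for MSB to MSB-3; it is not claimed to be the slide's exact input. Sign bits are left out, and returning an empty list for an all-zero plane is an assumption of this sketch.

```python
def bitplane_symbols(values):
    """Code each bit plane of non-negative integers as (run, eop) symbols.

    run = number of 0s before a 1, eop = 1 for the last 1 in the plane.
    Planes are returned from MSB down to LSB. Assumes a non-empty input.
    """
    n_planes = max(values).bit_length()
    planes = []
    for shift in range(n_planes - 1, -1, -1):
        bits = [(v >> shift) & 1 for v in values]
        ones = [i for i, b in enumerate(bits) if b]
        symbols, prev = [], -1
        for k, pos in enumerate(ones):
            eop = 1 if k == len(ones) - 1 else 0
            symbols.append((pos - prev - 1, eop))
            prev = pos
        planes.append(symbols)   # empty list for an all-zero plane (assumption)
    return planes

# Hypothetical block of absolute values (assumption, see lead-in).
block = [10, 0, 6, 0, 0, 3, 0, 2, 2, 0, 0, 2, 0, 0, 1, 0]
planes = bitplane_symbols(block)
```

The upper planes contain very few 1s, so they reduce to one or two symbols each, which is where the gain over coding the values directly comes from.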
Huffman Coding
- Assumption
  - Some symbols occur more often than others
  - E.g., character frequencies of the English language
- Fundamental principle
  - Frequently occurring symbols are coded with shorter bit strings
Huffman Coding
- Example
  - Characters to be encoded:
    - A, B, C, D, E
  - Probability to occur:
    - p(A)=0.3, p(B)=0.3, p(C)=0.1, p(D)=0.15, p(E)=0.15
Huffman
- Table and example of application to data stream
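The code table for the five symbols above can be derived with a short sketch. The exact bit strings depend on how ties are broken, so they may differ from the table shown in the lecture; only the code lengths are determined by the probabilities. The function name and the dict-based tree merging are illustrative choices.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Return {symbol: bit string} for a dict of symbol probabilities.

    Assumes at least two symbols.
    """
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees.
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c0.items()}
        merged.update({s: "1" + code for s, code in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

probs = {"A": 0.3, "B": 0.3, "C": 0.1, "D": 0.15, "E": 0.15}
code = huffman_code(probs)
average_length = sum(probs[s] * len(code[s]) for s in probs)
```

For these probabilities three symbols get 2-bit codes and two get 3-bit codes, for an average of 2.25 bits per symbol compared with 3 bits for a fixed-length code.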
JPEG
- "JPEG": Joint Photographic Experts Group
- International Standard:
  - For digital compression and coding of continuous-tone still images:
    - Gray-scale
    - Color
  - Since 1992
- Joint effort of:
  - ISO/IEC JTC 1/SC 2/WG 10
  - Commission Q.16 of CCITT SG VIII
- Compression rate of 1:10 yields reasonable results
JPEG
- Very general compression scheme
- Independence of
  - Image resolution
  - Image and pixel aspect ratio
  - Color representation
  - Image complexity and statistical characteristics
- Well-defined interchange format of encoded data
- Implementation in
  - Software only
  - Software and hardware
JPEG
- Sequence of compression steps
- Different resolutions possible
- Lossy or lossless mode
  - Lossless compression factor ~1.6:1
- Symmetrical codec
JPEG – Baseline Mode: Quantization
- Use of quantization tables for the DCT coefficients
- Map an interval of real numbers to one integer number
- Allows a different granularity for each coefficient
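A minimal sketch of table-based quantization follows. The 2x2 coefficient block and step sizes are made-up values for illustration (JPEG uses 8x8 blocks and tables), and Python's built-in `round` is used for rounding to the nearest integer.

```python
def quantize(coefficients, table):
    """Map each DCT coefficient to an integer level using its own step size."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coefficients, table)]

def dequantize(levels, table):
    """Reconstruct approximate coefficients from the integer levels."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

# Hypothetical values: small steps keep detail, large steps discard it.
coefficients = [[-415.4, 30.2], [4.5, -1.9]]
table = [[16, 11], [12, 14]]

levels = quantize(coefficients, table)
reconstructed = dequantize(levels, table)
```

Every real value within half a step of a multiple of the step size maps to the same integer, which is the lossy part; small coefficients become 0 and compress well in the later entropy coding.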
JPEG – 4 Modes of Compression
Motion JPEG
- Use series of JPEG frames to encode video
- Pro
  - Lossless mode
  - Frame-accurate seeking
  - Arbitrary frame rates
  - Arbitrary frame skipping
  - Scaling through progressive mode
  - Min transmission delay = 1/framerate
  - Supported by popular frame grabbers
  - These are advantages for editing, playback, distribution and conferencing
- Contra
  - Series of JPEG-compressed images
  - No standard, no specification
    - Worse, several competing quasi-standards
  - No relation to audio
  - No inter-frame compression
H.261 (p x 64)
- International Standard
- Video codec for video conferences at p x 64 kbit/s (ISDN):
  - Real-time encoding/decoding, max. signal delay of 150 ms
  - Constant data rate
- Intraframe coding
  - DCT as in JPEG baseline mode
- Interframe coding, motion estimation
  - Search of similar macroblock in previous image and compare
    - Position of this macroblock defines motion vector
    - Difference between similar macroblocks
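The motion estimation step can be illustrated with a small full-search block matcher. This is a sketch under assumptions: the sum of absolute differences as similarity measure, the search range, and the block size parameter are illustrative and not taken from the H.261 specification.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]

def find_motion_vector(prev, cur, top, left, size, search):
    """Find the most similar block in the previous frame.

    Returns ((dy, dx), residual): the motion vector and the pixel
    difference between the current block and its best match.
    """
    target = get_block(cur, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > len(prev) or x + size > len(prev[0]):
                continue  # candidate block would leave the frame
            cost = sad(target, get_block(prev, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    _, dy, dx = best
    ref = get_block(prev, top + dy, left + dx, size)
    residual = [[t - r for t, r in zip(t_row, r_row)]
                for t_row, r_row in zip(target, ref)]
    return (dy, dx), residual
```

Only the motion vector and the residual need to be coded; when the match is good the residual is close to zero and compresses well.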
MPEG (Moving Picture Experts Group)
- International Standard:
  - Compression of audio and video for playback (1.5 Mbit/s)
  - Real-time decoding
- Sequence of I-, P-, and B-frames:
  - Random access
    - at I-frames
    - at P-frames: i.e., decode previous I-frame first
    - at B-frames: i.e., decode I- and P-frames first
MPEG-2
- From MPEG-1 to MPEG-2
  - Improvement in quality
    - From VCR to TV to HDTV
    - No CD-ROM based constraints
  - Higher data rates
    - MPEG-1: about 1.5 Mbit/s
    - MPEG-2: 2-100 Mbit/s
- Evolution
  - 1994: International Standard
  - Also later known as H.262
  - Prominent role for digital TV in DVB (digital video broadcasting) and DVD (digital video disk)
  - Commercial MPEG-2 realizations available
MPEG-2
- Beyond MPEG-1:
  - Higher quality encoding
  - Higher data rates
  - Interleaved modes
- Use cases
  - Broadcast quality production
    - DVB-T: Terrestrial
    - DVB-S: Satellite
    - DVB-C: Cable
  - Program Stream
    - for post-processing, storage, and DVD distribution
  - Transport Stream
    - for broadcasting, error resilience
- Scaling:
  - Signal-to-noise ratio (SNR) scaling - progressive compression, error correcting codes
  - Spatial scaling - several pixel resolutions
  - Temporal scaling - frame dropping
MPEG-4
- MPEG-4 (ISO 14496) originally
  - Targeted at systems with very scarce resources
  - To support applications like
    - Mobile communication
    - Videophone and e-mail
  - Max. data rates and dimensions (roughly)
    - Between 4800 and 64000 bit/s
    - 176 columns x 144 lines x 10 frames/s
- Further demand
  - To provide enhanced functionality to allow for analysis and manipulation of image contents
MPEG-4
- Hence: find standardized ways to
  - Represent units of aural, visual or audiovisual content
    - "audio/visual objects" or AVOs
    - object coding independent of other objects, surroundings and background
    - natural and synthetic objects
  - Compose these objects together
    - i.e. creation of compound objects that form audiovisual scenes
  - Multiplex and synchronize the data associated with AVOs
    - for transportation over network channels providing a QoS (Quality of Service)
  - Interact with the audiovisual scene generated at the decoder's site
MPEG-4: Scope
- Definition of
  - "System Decoder Model"
    - specification for decoder implementations
  - Description language
    - binary syntax of an AV object's bitstream representation
    - scene description information
  - Corresponding concepts, tools and algorithms, especially for
    - content-based compression of simple and compound audiovisual objects
    - manipulation of objects
    - transmission of objects
    - random access to objects
    - animation
    - scaling
    - error robustness
MPEG-4: Scope
- Targeted bit rates for video and audio:
  - VLBV core
    - "Very Low Bit-rate Video"
    - 5 - 64 Kbit/s
    - image sequences with CIF resolution and up to 15 frames/s
  - Higher-quality video
    - 64 Kbit/s - 4 Mbit/s
    - quality like digital TV
  - Natural audio coding
    - 2 - 64 Kbit/s
MPEG-4: Video and Image Encoding
- Encoding / decoding of
  - Rectangular images and video
    - coding similar to MPEG-1/2
    - motion prediction
    - texture coding
  - Images and video of arbitrary shape
    - as done in conventional approach
      - 8 x 8 DCT or shape-adaptive DCT
    - plus coding of shape and transparency information
- Encoder
  - Must generate timing information
    - speed of the encoder clock = time base
    - desired decoding times and/or expiration times
      - by using time stamps attached to the stream
  - Can specify the minimum buffer resources needed for decoding
MPEG-4: Composition of Scenes
- Scene description includes:
  - Tree to define hierarchical relationships between objects
  - Objects' positions in space and time
    - by converting the objects' local coordinate system into a global coordinate system
  - Attribute value selection
    - e.g. pitch of sound, color, texture, animation parameters
- Description based on some VRML concepts
  - VRML = "Virtual Reality Modeling Language"
- Interaction with scenes
  - e.g. change viewing point, drag object, start/stop streams, select language
MPEG-4: Example of a Composition
MPEG-4: Synthetic Objects
- Visual objects:
  - Virtual parts of scenes
    - e.g. virtual background
  - Animation
    - e.g. animated faces
- Audio objects:
  - "Text-to-speech"
    - speech generation from given text and prosodic parameters
    - face animation control
  - "Score driven synthesis"
    - music generation from a score
    - more general than MIDI
  - Special effects
MPEG-4: Error Handling
- Mobile communication:
  - Low bit-rate (< 64 Kbit/s)
  - Error-prone
- MPEG-4 concepts for error handling:
  - Resynchronization
    - enables receiver to "tune in" again
    - based on markers within bitstream
  - Data recovery
    - enables receiver to reconstruct lost data
    - encode data in an error-resilient manner
  - Error concealment
    - enables receiver to bridge gaps in data
    - e.g. by repeating parts of old frames
Network-aware coding
Network-aware coding
- Adapt to reality of the Internet
- Content
  - Is created once, off-line
  - Is sent many times, under different circumstances
- No guarantees concerning
  - Throughput
  - Jitter
  - Packet loss
- Sending rate
  - Must adhere to rules
  - Often: don't send more than TCP would
  - Can't send at the best available encoding rate
Approaches
- Simulcast
- Scalable coding
  - SNR Scalability
  - Temporal Scalability
  - Spatial Scalability
  - Fine Grained Scalability
- Multiple Description Coding
Simulcast
- Choose a set of sending rates
- During content creation
  - Encode content in best possible quality below that sending rate
- During transmission
  - Choose version with the best admissible quality
[Figure: quality over sending rate; curves: best possible quality at possible sending rate, 3 simulcast rates, single rate codec]
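The choice made during transmission can be sketched in a few lines. The function name and the use of plain numbers for rates are illustrative assumptions; how the available rate is measured is outside the sketch.

```python
def choose_version(encoded_rates, available_rate):
    """Pick the highest pre-encoded rate that does not exceed the available rate.

    Returns None when even the lowest version is too fast for the path.
    """
    admissible = [r for r in encoded_rates if r <= available_rate]
    return max(admissible) if admissible else None
```

For example, with versions at 64, 256 and 1024 kbit/s and 300 kbit/s available, the 256 kbit/s version is sent and the remaining 44 kbit/s stay unused, which is the step shape of the simulcast curve in the figure.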
Scalable coding
- Typically used as layered coding
  - A base layer
    - Provides basic quality
    - Must always be transferred
  - One or more enhancement layers
    - Improve quality
    - Transferred if possible
[Figure: quality over sending rate; curves: best possible quality at possible sending rate, base layer, enhancement layer]
Temporal Scalability
- Frames can be dropped
  - In a controlled manner
  - Frame dropping does not violate dependencies
  - Low gain example: B-frame dropping in MPEG-1
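B-frame dropping can be sketched as a filter over the frame types of a stream. The frame pattern in the usage note is a hypothetical example; in MPEG-1 no other frame is predicted from a B-frame, which is why removing them leaves all remaining dependencies intact.

```python
def drop_b_frames(frame_types):
    """Return the indices of the frames that remain after dropping all B-frames."""
    return [i for i, t in enumerate(frame_types) if t != "B"]

# Hypothetical frame pattern in display order.
kept = drop_b_frames("IBBPBBPBB")
```

Here three of nine frames remain, so the frame rate drops to a third. The saving in data rate is smaller than that, because B-frames are the most strongly compressed frames in the stream.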
Spatial Scalability
- Idea
  - Base layer
    - Downsample the original image (code only 1 pixel instead of 4)
    - Send like a lower resolution version
  - Enhancement layer
    - Subtract base layer pixels from all pixels
    - Send like a normal resolution version
  - If enhancement layer arrives at client
    - Decode both layers
    - Add layers
- Example
  - Original pixels: 72 61 75 83
  - Base layer: 73 (less data to code)
  - Enhancement layer: -1 -12 2 10 (better compression due to low values)
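The slide's numeric example can be reproduced with a small sketch. Downsampling by the rounded mean of the four pixels is an assumption; it is consistent with the numbers on the slide (mean 72.75, base pixel 73), but the slide does not name the filter. Repeating the base pixel when the enhancement layer is missing is also an illustrative choice.

```python
def encode_spatial(pixels):
    """Split a group of four pixels into one base pixel and four residuals."""
    base = int(sum(pixels) / len(pixels) + 0.5)   # rounded mean (assumption)
    enhancement = [p - base for p in pixels]
    return base, enhancement

def decode_spatial(base, enhancement=None):
    """Reconstruct four pixels from the base layer and, if present, the residuals."""
    if enhancement is None:
        return [base] * 4                          # base layer only
    return [base + e for e in enhancement]

base, enhancement = encode_spatial([72, 61, 75, 83])
```

With both layers the original pixels are recovered exactly; with the base layer alone the client shows a lower resolution image.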
Spatial Scalability
[Block diagram: raw video passes through DS, DCT, Q and VLC stages to produce the base layer, the enhancement layer and enhancement layer 2]
- DS: downsampling
- DCT: discrete cosine transformation
- Q: quantization
- VLC: variable length coding
SNR Scalability
- SNR: signal-to-noise ratio
- Idea
  - Base layer
    - Is regularly DCT encoded
    - A lot of data is removed using quantization
  - Enhancement layer is regularly DCT encoded
    - Run inverse DCT on quantized base layer
    - Subtract from original
    - DCT encode the result
  - If enhancement layer arrives at client
    - Add base and enhancement layer before running inverse DCT
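The layering idea can be sketched on a list of coefficients. To stay short, this sketch works directly on DCT coefficients and leaves out the transform itself, which is a simplification of the steps listed above; the step sizes and coefficient values are hypothetical.

```python
def snr_encode(coefficients, coarse_step, fine_step):
    """Coarsely quantize for the base layer, then quantize what was lost."""
    base = [round(c / coarse_step) for c in coefficients]
    residual = [c - b * coarse_step for c, b in zip(coefficients, base)]
    enhancement = [round(r / fine_step) for r in residual]
    return base, enhancement

def snr_decode(base, coarse_step, enhancement=None, fine_step=None):
    """Reconstruct from the base layer, adding the enhancement layer if it arrived."""
    out = [b * coarse_step for b in base]
    if enhancement is not None:
        out = [o + e * fine_step for o, e in zip(out, enhancement)]
    return out

coefficients = [100, -37, 12, 5]                     # hypothetical values
base, enhancement = snr_encode(coefficients, 16, 2)
base_only = snr_decode(base, 16)
both = snr_decode(base, 16, enhancement, 2)
```

The resolution is the same in both cases; the enhancement layer only reduces the quantization error, which is what raises the signal-to-noise ratio.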
SNR Scalability
[Block diagram: raw video, DCT, Q and VLC producing the base layer; IQ, subtraction, Q and VLC producing the enhancement layer]
- DCT: discrete cosine transformation
- Q: quantization
- IQ: inverse quantization
- VLC: variable length coding
Fine Grained Scalability
- Idea
  - Cut off compressed tail bits of samples
- Base layer
  - As in SNR coding
- Enhancement layer
  - Use bit-plane coding for enhancement layer instead of run-level coding
  - Cut tail bits off until data rate is reached
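The effect of cutting off tail bits can be sketched on uncoded values. This sketch truncates at bit-plane boundaries and works on non-negative integers, which is a simplification; the function name and sample values are illustrative assumptions.

```python
def truncate_bitplanes(values, planes_kept):
    """Keep only the `planes_kept` most significant bit planes of each value.

    Assumes a non-empty list of non-negative integers.
    """
    total = max(values).bit_length()
    drop = max(total - planes_kept, 0)
    return [(v >> drop) << drop for v in values]

residuals = [10, 6, 3, 1]   # hypothetical enhancement-layer values
```

Each additional plane that fits into the available rate refines every value a little, so quality follows the sending rate in small steps instead of the few large steps of layered coding.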
Fine Grained Scalability
[Figure: bit planes MSB, MSB-1, MSB-2, MSB-3, … coded as (0, 1); (2, 1); (0, 0)(1, 0)(2, 0)(1, 0)(0, 0)(2, 1); (5, 0)(8, 1)]
[Figure: quality over sending rate; curves: best possible quality at possible sending rate, goal of FGS]
Fine Grained Scalability
[Block diagram: raw video, DCT, Q and VLC producing the base layer; IQ, subtraction, Q and BC producing the enhancement layer]
- DCT: discrete cosine transformation
- Q: quantization
- IQ: inverse quantization
- VLC: variable length coding
- BC: bit-plane coding
Fine Grained Scalability
[Block diagram: FGS encoder with motion estimation; blocks: motion estimation (motion vectors), Q, IQ, IDCT, VLC (base layer), BC (enhancement layer)]
Multiple Description Coding
- Idea
  - Encode data in two streams
  - Each stream has acceptable quality
  - Both streams combined have good quality
  - The redundancy between both streams is low
- Problem
  - The same relevant information must exist in both streams
  - Old problem: started for audio coding in telephony
  - Currently a hot topic
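One simple way to build two descriptions is to split a sample sequence into even and odd samples. This is an illustrative scheme chosen for brevity, not necessarily the one discussed in the lecture; filling the gaps by sample repetition is also an assumption.

```python
def mdc_split(samples):
    """Split a sample sequence into two descriptions: even and odd samples."""
    return samples[0::2], samples[1::2]

def mdc_merge(even=None, odd=None):
    """Reconstruct from both descriptions, or approximate from a single one."""
    if even is not None and odd is not None:
        # Both arrived: interleave to recover the original sequence.
        return [even[i // 2] if i % 2 == 0 else odd[i // 2]
                for i in range(len(even) + len(odd))]
    received = even if even is not None else odd
    # One description lost: repeat each received sample to fill the gaps.
    return [s for s in received for _ in range(2)]
```

Either stream alone gives a half-rate approximation, and both together give the original. The scheme relies on neighbouring samples being similar, which is the relevant information that both streams must carry.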
Multimedia File Formats
Overview
- File formats
  - Define the storage of media data on disks
  - Specify synchronization
  - Specify timing
  - Contain metadata
- They allow
  - Interchange of data without interpretation
    - Copying
    - Platform independence
  - Management
  - Editing
  - Retrieval for presentation
- Needed for all asynchronous applications
File Format Examples
- Streaming format
  - File format and wire format are identical
  - MPEG-1, DVI
- Streamable format
  - File format specifies wire format(s)
  - MPEG-4, Quicktime, Windows Media, Real Video
RTP Recorder Solution
- Pragmatic generic solution
- Stores and sends all MBone sessions
- No interpretation of data
- Interpretation of network timestamps
- Derivation of synchronization information
Stored Motion JPEG
- Motion JPEG Chunk File Format (UC Berkeley)
  - Specifies entire clip's length in s+ns
  - Contains sequence of images
  - Each image in Independent JPEG Group's JFIF format
- AVI MJPEG DIB (Microsoft)
  - Supports audio interleaving
  - Time-stamped data chunks
  - One frame per AVI RIFF data chunk
  - Hack for file size > 1 GB
- Quicktime (Apple)
  - Dedicated tracks for interleaving and timing
  - One frame per field
  - Several fields per sample
  - Formats A: full JFIF images, B: QT headers and data only
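The "one frame per AVI RIFF data chunk" layout can be illustrated by walking a sequence of RIFF chunks. Each chunk consists of a four-character identifier, a 32-bit little-endian payload size, and the payload, padded to an even length. The sketch reads a flat run of chunks from memory and does not descend into RIFF or LIST containers; the sample identifiers and payloads are made up for the example.

```python
import struct

def riff_chunks(data):
    """Yield (fourcc, payload) for consecutive RIFF chunks in a bytes object."""
    pos = 0
    while pos + 8 <= len(data):
        fourcc, size = struct.unpack_from("<4sI", data, pos)
        payload = data[pos + 8:pos + 8 + size]
        yield fourcc.decode("ascii"), payload
        pos += 8 + size + (size & 1)   # skip the pad byte after odd-sized payloads

# Hypothetical data: one video chunk (odd size, so padded) and one audio chunk.
sample = (b"00dc" + struct.pack("<I", 3) + b"abc" + b"\x00"
          + b"01wb" + struct.pack("<I", 2) + b"hi")
chunks = list(riff_chunks(sample))
```

A server can use the size field to jump from frame to frame without interpreting the JPEG data inside each chunk.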
Quicktime File Format
- Run-time choice of tracks
  - availability of codecs
  - bandwidth
  - language
MPEG-4 File Format
Other File Formats
- Real Video
  - Not published
  - Supports various codecs
  - Supports various encoding formats per file
  - Supports dynamic selection
  - Supports dynamic scaling ("stream thinning")
- AVI is published
  - Uses Resource Interchange File Format (RIFF)
  - Supports various codecs
- ASF / Windows Media File Format
  - Submitted as MPEG-4 proposal (but refused)
  - ASF files can include Windows binary code
  - ASF is patented in the USA
Summary
- Storage and distribution system must support:
  - Discrete media such as text and graphics
  - Continuous media such as audio and video
  - Interrelated multiplexed media
- Encoding format and file format must be distinguished
  - Separation of file format and wire format
  - Streamable files vs. streaming format
- Trend towards
  - Formats that define presentation environments
  - Interaction of encoding format and application
  - Interaction of client and server
- Influence on distribution systems?