4ea1b33739b046e8feed20efceb68ef5.ppt
- Number of slides: 63
Multimedia Coding Standards (MPEG-4 Part 10/H.264) 최영호
Contents
• 0. H.264/MPEG-4 Part 10 Overview
• 1. Introduction
• 2. Video Format and Quality
• 3. Video Coding Concepts
• 4. MPEG-4 and H.264
• 5. Applications
H.264/AVC Application Areas
• Broadcast
  – over cable, satellite, cable modem, DSL, terrestrial
• Interactive or serial storage
  – on optical and magnetic storage devices, DVD
• Conversational services
  – over ISDN, Ethernet, LAN, DSL, wireless and mobile networks
• Video-on-demand or multimedia streaming services
  – over cable modem, DSL, ISDN, LAN, wireless networks
• Multimedia messaging services
  – over DSL, ISDN
H.264/AVC Features
• Range of bit rates and picture sizes
  – from very low bit rates: mobile and dial-up devices
  – to entertainment-quality standard-definition television services, HDTV and beyond
• Technical design goals
  – high coding efficiency
  – robustness to network environments
• Potentially interesting features (not included in the first version of the standard)
  – Support of arbitrarily-shaped video objects
  – Some forms of bit rate scalability
  – 4:2:2 and 4:4:4 chroma formats
  – Color sampling accuracies exceeding eight bits per color component
New developments (1)
• Coding efficiency (1): improved prediction design aspects
  – Variable block-size motion compensation with small block sizes
  – Quarter-sample accuracy for motion compensation
  – Motion vectors over picture boundaries
  – Multiple reference picture motion compensation
  – Decoupling of referencing order from display order
  – Decoupling of picture representation methods from the ability to use a picture for reference
  – Weighted prediction
  – Improved "Skipped" and "Direct" motion inference
  – Directional spatial prediction for intra coding
  – In-the-loop deblocking filtering
New developments (2)
• Coding efficiency (2): non-prediction aspects
  – Small block-size transform
  – Hierarchical block transform
  – Short word-length transform
  – Exact-match transform
  – Arithmetic entropy coding
  – Context-adaptive entropy coding
New developments (3)
• Robustness to data errors/losses and flexibility for operation over a variety of network environments
  – Parameter set structure
  – NAL unit syntax structure
  – Flexible slice size
  – Flexible macroblock ordering
  – Arbitrary slice ordering
  – Redundant pictures
  – Data partitioning
  – SP/SI synchronization switching pictures
H.264/AVC vs. MPEG-4 Part 2
• MPEG-4 Part 2
  – Provides a new level of creativity and flexibility in representing digital visual content using video objects
  – 19 profiles
• H.264/AVC
  – Efficient compression of generic camera-shot rectangular video pictures with robustness to network losses
  – 3 profiles
• Note
  – H.261 or MPEG-1 standard: about 25 pages
  – H.264 and MPEG-4 standards: about 250 and 500 pages
  – An understanding of the key concepts is required
1. Introduction
• MPEG-4 Visual
  – a rich, interactive on-line world bringing together synthetic, natural, video, image, 2D and 3D 'objects'
  – to move away from a restrictive reliance on rectangular video images
  – to provide an open, flexible framework for visual communications that uses the best features of efficient video communication and object-oriented processing
• H.264/AVC
  – highly efficient and reliable video communications, supporting two-way, streaming and broadcast applications and robust to channel transmission problems
  – aiming to do what previous standards did, but in a more efficient, robust and practical way
MPEG-4 Visual & H.264
• MPEG-4 Visual
  – Developed by MPEG (an ISO working group)
  – Part 2 of the MPEG-4 group of standards
  – MPEG-4 was first conceived in 1993
  – Part 2 published in 1999
• H.264
  – Started by the Video Coding Experts Group (VCEG), an ITU-T working group
  – In its final stages, the standard was completed by a joint team of VCEG and MPEG
  – Published by ISO/IEC (as MPEG-4 Part 10) and ITU-T (as H.264) in 2003
2. Video Format and Quality
• 2.1 Introduction
• 2.2 Sampling Formats
• 2.3 Video Formats
• 2.4 Subjective and Objective Quality
2.1 Introduction
• Capture (analog video → digital video)
  – Spatial samples, temporal samples
• Color spaces
  – RGB
  – YCbCr
2.2 Sampling Formats
• 4:2:0 (YV12)
  – Cb and Cr each have half the vertical and half the horizontal resolution of Y
  – 12 bits per pixel
  – Video conferencing, digital television, DVD
  – 4:2:0 interlaced video: the chroma information is allocated alternately to the top field and the bottom field
• 4:2:2 (YUY2)
  – Chroma vertical resolution is the same as the Y resolution
  – Chroma horizontal resolution is half the Y resolution
  – High-quality color reproduction
2.3 Video Formats (1)
Format | Luminance resolution | Bits per frame (4:2:0, 8 bits per sample)
Sub-QCIF | 128 × 96 | 147456
Quarter CIF (QCIF) | 176 × 144 | 304128
CIF | 352 × 288 | 1216512
4CIF | 704 × 576 | 4866048
• 4CIF: standard-definition television, DVD
• CIF/QCIF: video conferencing applications
• QCIF/SQCIF: mobile multimedia applications
• Television coding: ITU-R Recommendation BT.601-5
  – Luminance signal: sampled at 13.5 MHz
  – Chrominance signals: sampled at 6.75 MHz
  – Produces a 4:2:2 signal
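The bits-per-frame column in the table above follows directly from the sampling format. The sketch below is one way to reproduce it; the function name `bits_per_frame` and the subsampling lookup table are illustrative choices, not part of the slides.

```python
# Hypothetical helper: raw size of one frame for a given chroma sampling format.
# (horizontal, vertical) subsampling factors applied to each chroma component.
CHROMA_SUBSAMPLING = {
    "4:2:0": (2, 2),  # half horizontal, half vertical resolution
    "4:2:2": (2, 1),  # half horizontal, full vertical resolution
    "4:4:4": (1, 1),  # full resolution
}


def bits_per_frame(width, height, bits_per_sample=8, sampling="4:2:0"):
    """Return the number of bits in one uncompressed frame."""
    h_factor, v_factor = CHROMA_SUBSAMPLING[sampling]
    luma_samples = width * height
    # Two chroma components (Cb and Cr), each subsampled.
    chroma_samples = 2 * (width // h_factor) * (height // v_factor)
    return (luma_samples + chroma_samples) * bits_per_sample


if __name__ == "__main__":
    for name, (w, h) in {"Sub-QCIF": (128, 96), "QCIF": (176, 144),
                         "CIF": (352, 288), "4CIF": (704, 576)}.items():
        print(name, bits_per_frame(w, h))
```

For 4:2:0 at 8 bits per sample this works out to 12 bits per pixel on average, which matches the figure quoted for that format.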
2.3 Video Formats (2)
Parameter | 30 Hz frame rate | 25 Hz frame rate
Fields per second | 60 | 50
Lines per complete frame | 525 | 625
Luminance samples per line | 858 | 864
Chrominance samples per line | 429 | 432
Bits per sample | 8 | 8
Total bit rate | 216 Mbps | 216 Mbps
Active lines per frame | 480 | 576
Active samples per line (Y) | 720 | 720
Active samples per line (Cr, Cb) | 360 | 360
• Each sample: levels 0 and 255 are reserved for synchronization
• Luminance signal: 16 (black) to 235 (white)
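The 216 Mbps total and the 13.5 MHz luminance sampling rate can be checked from the other rows of the table. The sketch below does the arithmetic; the function names are illustrative, and it assumes that the "30 Hz" system actually runs at 30000/1001 frames per second, a value that does not appear on the slide.

```python
from fractions import Fraction


def luma_sample_rate(samples_per_line, lines_per_frame, frame_rate):
    """Samples per second, counting every line of the complete frame."""
    return samples_per_line * lines_per_frame * frame_rate


def total_bit_rate(luma_rate, chroma_rate, bits_per_sample=8):
    """One luminance stream plus two chrominance streams (Cb and Cr)."""
    return (luma_rate + 2 * chroma_rate) * bits_per_sample


if __name__ == "__main__":
    # 25 Hz system: 864 samples/line, 625 lines.
    print(luma_sample_rate(864, 625, 25))
    # "30 Hz" system: 858 samples/line, 525 lines, assumed 30000/1001 Hz.
    print(luma_sample_rate(858, 525, Fraction(30000, 1001)))
    # 13.5 MHz luminance and 6.75 MHz chrominance at 8 bits per sample.
    print(total_bit_rate(13_500_000, 6_750_000))
```

Both systems give the same luminance sampling rate, which is why a single total bit rate applies to both columns.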
2.4 Subjective Quality Measurement
• Factors that affect perceived quality
  – Spatial fidelity (distortion etc.)
  – Temporal fidelity (whether motion appears natural)
  – Viewing environment
  – Viewer's state of mind
• ITU-R BT.500 (subjective quality measurement)
  – Subjective quality evaluation
  – Double Stimulus Continuous Quality Scale
  – Excellent to Bad: 5 intervals
2.4 Objective Quality Measurement
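The body of this slide was not recovered. Peak signal-to-noise ratio (PSNR) is a widely used objective measure for video coding, so the sketch below assumes that is what the slide covered. It computes PSNR = 10·log10((2^n − 1)² / MSE) for frames given as lists of rows; the function names are illustrative.

```python
import math


def mse(original, decoded):
    """Mean squared error between two equally sized frames (lists of rows)."""
    total = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            total += (o - d) ** 2
            count += 1
    return total / count


def psnr(original, decoded, bits_per_sample=8):
    """PSNR in dB; infinite when the frames are identical."""
    error = mse(original, decoded)
    if error == 0:
        return math.inf
    peak = (1 << bits_per_sample) - 1  # 255 for 8-bit samples
    return 10 * math.log10(peak * peak / error)


if __name__ == "__main__":
    frame = [[10, 10], [10, 10]]
    noisy = [[11, 9], [10, 12]]
    print(round(psnr(frame, noisy), 2))
```

PSNR is cheap to compute but does not always track the subjective scores produced by the viewing tests described on the previous slide.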
3. Video Coding Concepts
• 3.1 Compression
• 3.2 Video CODEC
• 3.3 Temporal Model
• 3.4 Image Model
• 3.5 Entropy Coder
• 3.6 DPCM/DCT CODEC
3.1 Compression
• Data compression: removal of redundancy
  – Removing statistical redundancy: lossless
    • JPEG-LS: compression ratio of about 3 to 4
  – Removing subjective redundancy: lossy
• CODEC model common to H.264 and MPEG-4 Visual
  – Block-based motion compensation
  – Transform
  – Quantization
  – Entropy coding
3.2 Video CODEC
• Encoder components
  – Temporal model: produces a residual frame and a set of model parameters (motion vectors)
  – Spatial model: produces a set of quantized transform coefficients
  – Entropy encoder: removes statistical redundancy
[Figure: encoder block diagram. Video input → temporal model (using stored frames) → residual → spatial model → coefficients → entropy encoder → encoded output; motion vectors go from the temporal model to the entropy encoder]
3.3 Temporal Model - (1)
• (1) Prediction from the previous video frame
  – The previous frame is used as the predictor
• Causes of changes between frames
  – Object motion
    • Rigid object motion: moving car
    • Deformable object motion: moving arm
  – Camera motion
    • Panning, tilting, zoom, rotation
  – Uncovered regions
  – Lighting conditions
• For the first two cases (object motion and camera motion), the trajectory of each pixel between frames can be estimated: this is the optical flow. Because of the computation time and the amount of information it adds, optical flow is not used for prediction.
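Prediction from the previous frame amounts to coding the sample-by-sample difference between two frames. The following is a minimal sketch of that idea with frames held as lists of rows; the function names are illustrative and no motion compensation is applied.

```python
def frame_residual(current, previous):
    """Residual = current frame minus the predictor (the previous frame)."""
    return [[c - p for c, p in zip(row_c, row_p)]
            for row_c, row_p in zip(current, previous)]


def reconstruct(previous, residual):
    """Decoder side: add the residual back onto the same predictor."""
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(previous, residual)]


def energy(residual):
    """Sum of squared residual samples: a rough measure of what remains to code."""
    return sum(r * r for row in residual for r in row)


if __name__ == "__main__":
    previous = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]
    current = [[10, 10, 10], [10, 10, 50], [10, 10, 10]]  # bright sample moved right
    residual = frame_residual(current, previous)
    print(residual, energy(residual))
```

In the usage example a single moving sample leaves two non-zero residual values, which is the kind of leftover energy that motion compensation aims to reduce.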
3.3 Temporal Model - (2)
• (2) Motion-compensated prediction of a macroblock
  – MPEG-1, MPEG-2, MPEG-4, H.261, H.263, H.264
  – Source video: 4:2:0 format
• (3) Region-based motion compensation
  – MPEG-4 Visual includes a number of tools that support region-based compensation and coding (covered later)
[Figure: a macroblock covering a 16×16 color region: four Y blocks (0 to 3), one Cb block (4) and one Cr block (5)]
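Motion-compensated prediction needs a motion vector for each block. One common way to find it is a full search that minimises the sum of absolute differences (SAD). The sketch below illustrates that approach under stated assumptions: the function names, the search range and the helper `shifted_copy` are hypothetical, and the standards themselves do not mandate any particular search method.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(abs(cur[top + y][left + x] - ref[top + dy + y][left + dx + x])
               for y in range(size) for x in range(size))


def best_motion_vector(cur, ref, top, left, size=16, search=7):
    """Full search over +/- `search` samples; returns (dy, dx, sad) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= top + dy and top + dy + size <= height
                    and 0 <= left + dx and left + dx + size <= width):
                continue
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


def shifted_copy(ref, dy, dx):
    """Hypothetical test helper: cur[y][x] = ref[y + dy][x + dx], clamped at the edges."""
    h, w = len(ref), len(ref[0])
    return [[ref[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
             for x in range(w)] for y in range(h)]


if __name__ == "__main__":
    ref = [[7 * x + 13 * y + x * y for x in range(16)] for y in range(16)]
    cur = shifted_copy(ref, 1, 2)
    print(best_motion_vector(cur, ref, top=6, left=6, size=4, search=3))
```

Here the vector (dy, dx) points from the block position in the current frame to the matching block in the reference frame. Real encoders usually replace the exhaustive search with faster strategies.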
3.4 Image Model – (1)
• Purpose of the image model
  – Decorrelate image or residual data
  – Convert it into a form that can be compressed efficiently by an entropy coder
• Components of the image model
  – Transformation
    • Decorrelates and compacts the data
  – Quantization
    • Reduces the precision of the transformed data
  – Reordering
    • Arranges the data to group together significant values
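3.4 Image Model – (2)
• (0) Predictive image coding
  – H.264 intra coding (applied in the transform domain)
  – The encoder forms its prediction from reconstructed sample values, the same values that are available to the decoder
[Figure: current sample X and its previously coded neighbors A, B and C]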
3.4 Image Model – (3)
• (1) Transform coding
  – Requirements for a transform
    • Data in the transform domain should be decorrelated
    • Reversible
    • Computationally tractable
  – Types
    • Block-based transforms
      – Karhunen-Loeve Transform (KLT)
      – Singular Value Decomposition (SVD)
      – DCT (used by MPEG-4 Visual; H.264 uses a variant of the DCT)
    • Image-based transforms
      – Operate on the entire image
      – Discrete Wavelet Transform (DWT) (MPEG-4 Visual)
3.4 Image Model – DCT (4)
• Matrix expression for the DCT
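The formula on this slide was not recovered. A standard matrix form of the N×N DCT is Y = A·X·Aᵀ with inverse X = Aᵀ·Y·A, where A[i][j] = C_i·cos((2j+1)iπ / 2N), C_0 = sqrt(1/N) and C_i = sqrt(2/N) for i > 0. The sketch below assumes that form; the function names are illustrative.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix A of size n x n."""
    rows = []
    for i in range(n):
        scale = math.sqrt((1 if i == 0 else 2) / n)
        rows.append([scale * math.cos((2 * j + 1) * i * math.pi / (2 * n))
                     for j in range(n)])
    return rows


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def forward_dct(block):
    """Y = A X A^T"""
    a = dct_matrix(len(block))
    return matmul(matmul(a, block), transpose(a))


def inverse_dct(coeffs):
    """X = A^T Y A"""
    a = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(a), coeffs), a)


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]
    for row in forward_dct(flat):
        print([round(v, 6) for v in row])
```

For a flat block all the energy lands in the single DC coefficient, which illustrates the compaction property listed among the goals of the image model.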
3.4 Image Model – Wavelet (5)
3.4 Image Model – (6)
– Quantization
• Scalar quantization
  – One input sample → one quantized output value
    • Linear quantiser
    • Non-linear quantiser
• Vector quantization
  – A group of input samples → a group of quantized values
    • Maps a block of image values to a single value (codeword)
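A linear scalar quantiser can be sketched as a division by a step size followed by rounding, with rescaling as the inverse step. The code below is a minimal illustration for integer inputs; the function names and the choice of rounding halves away from zero are assumptions, not taken from the slides.

```python
def quantise(value, step):
    """Forward quantiser: nearest integer to value / step, halves rounded away from zero."""
    magnitude = (2 * abs(value) + step) // (2 * step)
    return magnitude if value >= 0 else -magnitude


def rescale(level, step):
    """Inverse quantiser: map the level back to a reconstructed value."""
    return level * step


if __name__ == "__main__":
    for step in (1, 2, 3, 5):
        levels = [quantise(x, step) for x in range(-4, 5)]
        print(step, levels, [rescale(q, step) for q in levels])
```

A larger step size produces fewer distinct levels and therefore fewer bits, at the cost of a reconstruction error of up to half a step.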
3.4 Image Model – (7)
– Reordering and zero encoding
• For the DCT
  – Scan
    • Figure 3.41 (next slide)
    • Figure 3.42 (next slide)
  – Run-level encoding
    • Input array: 16, 0, 0, -3, 5, 6, 0, 0, 0, 0, -7
    • Output values (run, level): (0, 16), (2, -3), (0, 5), (0, 6), (4, -7)
    • Three-dimensional (run, level, last): (0, 16, 0), (2, -3, 0), (0, 5, 0), (0, 6, 0), (4, -7, 1)
[Figures: zigzag scan order (frame block); zigzag scan order (field block)]
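The reordering and run-level steps can be sketched as follows. The zigzag generator produces the usual frame-block scan for an N×N block, and the run-level functions reproduce the example above, assuming that four zeros precede the −7 (which is what the (4, −7) pair implies). All function names are illustrative.

```python
def zigzag_order(n):
    """(row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes each anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run from bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def run_level(coeffs):
    """(run, level) pairs: number of zeros before each non-zero coefficient."""
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs


def run_level_last(coeffs):
    """(run, level, last) triples; last = 1 marks the final non-zero coefficient."""
    pairs = run_level(coeffs)
    return [(run, level, 1 if i == len(pairs) - 1 else 0)
            for i, (run, level) in enumerate(pairs)]


if __name__ == "__main__":
    print([r * 4 + c for r, c in zigzag_order(4)])
    print(run_level_last([16, 0, 0, -3, 5, 6, 0, 0, 0, 0, -7]))
```

Trailing zeros after the last non-zero coefficient are never emitted, which is why the "last" flag is enough to tell the decoder that the block is finished.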
3.5 Entropy Coder - (1)
• Inputs to the entropy coder
  – Quantized transform coefficients
    • Run-level or zerotree encoded
  – Motion vectors
  – Markers
    • Indicate a resynchronization point in the sequence
  – Headers
    • Macroblock headers, picture headers, sequence headers
  – Supplementary information
• Entropy coding techniques covered here
  – Predictive pre-coding
  – Modified Huffman
  – Arithmetic coding
3.5 Entropy Coder - (2) Predictive Coding
• Highly correlated symbols
  – DC values of neighboring intra-coded blocks of pixels
  – Neighboring motion vectors
    • Similar to the prediction structures described earlier
  – Quantization parameter
    • The quantization parameter may need to change with network conditions; when it does, the difference from the previous value is coded rather than the full parameter
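Predictive pre-coding of a slowly varying parameter can be illustrated with plain differential coding. The sketch below assumes the predictor is simply the previous value and that encoder and decoder agree on a starting value; the function names and the sample quantization parameter values are hypothetical.

```python
def to_differences(values, initial=0):
    """Replace each value by its difference from the previous one."""
    previous = initial
    diffs = []
    for v in values:
        diffs.append(v - previous)
        previous = v
    return diffs


def from_differences(diffs, initial=0):
    """Decoder side: accumulate the differences to recover the values."""
    previous = initial
    values = []
    for d in diffs:
        previous += d
        values.append(previous)
    return values


if __name__ == "__main__":
    qp_values = [26, 26, 28, 27, 27]  # hypothetical per-macroblock QP values
    deltas = to_differences(qp_values, initial=26)
    print(deltas)
    print(from_differences(deltas, initial=26))
```

The differences cluster around zero, so a variable-length code can give them short codewords.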
3.5 Entropy Coder - (3) Variable length coding
• (1) Huffman code
  – Based on symbol probabilities
    • Sort the data items in increasing order of probability.
    • Merge the two lowest-probability data items into a single node and assign the sum of their probabilities to the merged node.
    • Re-sort and repeat.
Vector | Probability
-2 | 0.1
-1 | 0.2
0 | 0.4
1 | 0.2
2 | 0.1
3.5 Entropy Coder - (4) Variable length coding
• Generating the Huffman code
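The merge-and-repeat procedure can be sketched with a priority queue. The code below is one possible implementation for the probability table on this slide; the function names and the tie-breaking rule are assumptions, so the individual codewords may differ from those on the original slide even though the average length is the same.

```python
import heapq
from itertools import count

PROBABILITIES = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}


def huffman_code(probabilities):
    """Return {symbol: bit string} built by repeatedly merging the two least likely nodes."""
    if len(probabilities) == 1:
        return {symbol: "0" for symbol in probabilities}
    tie = count()  # unique tie-breaker so the heap never compares dictionaries
    heap = [(p, next(tie), {symbol: ""}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)  # lowest probability
        p1, _, codes1 = heapq.heappop(heap)  # second lowest
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def average_length(codes, probabilities):
    """Expected number of bits per symbol."""
    return sum(p * len(codes[s]) for s, p in probabilities.items())


if __name__ == "__main__":
    codes = huffman_code(PROBABILITIES)
    for symbol in sorted(codes):
        print(symbol, codes[symbol])
    print(round(average_length(codes, PROBABILITIES), 3))
```

For this table any valid Huffman code averages 2.2 bits per symbol, compared with 3 bits for a fixed-length code over five symbols.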
3.5 Entropy Coder - (5) Variable length coding
• Drawbacks of Huffman coding
  – The decoder must use the same codeword table as the encoder
  – The probability table is difficult to compute
• Pre-calculated VLC tables (MPEG-4 Visual Simple Profile)
  – Transform coefficients (TCOEF)
    • 3D coding of quantized coefficients (run, level, last)
    • 102 specific combinations of (run, level, last) in total
    • All other combinations: ESCAPE code (0000011) followed by a 13-bit fixed-length code describing the values of run, level and last
  – Motion vector difference (MVD)
    • Has a true Huffman tree structure
[Table: MPEG-4 Visual Transform Coefficients (TCOEF) VLC]
[Table: MPEG-4 Visual Transform Coefficients (TCOEF) VLC (continued)]
3.5 Entropy Coder - (6) Arithmetic Coding
• A practical alternative to Huffman coding
• Converts a sequence of data symbols into a single fractional number
Vector | Probability | Sub-range
-2 | 0.1 | 0-0.1
-1 | 0.2 | 0.1-0.3
0 | 0.4 | 0.3-0.7
1 | 0.2 | 0.7-0.9
2 | 0.1 | 0.9-1.0
3.5 Entropy Coder - (7) Arithmetic Coding
• Encoding the sequence (0, -1, 0, 2)
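The worked figure for this slide was not recovered, but the encoding can be reproduced from the sub-range table on the previous slide. The sketch below narrows the interval symbol by symbol using exact fractions; the function names are illustrative, and a practical coder would use finite-precision integer arithmetic instead.

```python
from fractions import Fraction

# Sub-ranges from the table: symbol -> (low, high) within [0, 1).
SUB_RANGES = {
    -2: (Fraction(0), Fraction(1, 10)),
    -1: (Fraction(1, 10), Fraction(3, 10)),
    0: (Fraction(3, 10), Fraction(7, 10)),
    1: (Fraction(7, 10), Fraction(9, 10)),
    2: (Fraction(9, 10), Fraction(1)),
}


def encode_range(symbols):
    """Return the final (low, high) interval after coding all symbols."""
    low, high = Fraction(0), Fraction(1)
    for symbol in symbols:
        width = high - low
        s_low, s_high = SUB_RANGES[symbol]
        low, high = low + width * s_low, low + width * s_high
    return low, high


def decode(value, n_symbols):
    """Recover n_symbols symbols from any value inside the final interval."""
    low, high = Fraction(0), Fraction(1)
    symbols = []
    for _ in range(n_symbols):
        width = high - low
        position = (value - low) / width
        for symbol, (s_low, s_high) in SUB_RANGES.items():
            if s_low <= position < s_high:
                symbols.append(symbol)
                low, high = low + width * s_low, low + width * s_high
                break
    return symbols


if __name__ == "__main__":
    low, high = encode_range([0, -1, 0, 2])
    print(float(low), float(high))
    print(decode(Fraction("0.394"), 4))
```

The intervals narrow as [0.3, 0.7), [0.34, 0.42), [0.364, 0.396) and finally [0.3928, 0.396). Any number in the last interval, for example 0.394, identifies the whole sequence.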
3.5 Entropy Coder - (8) Context-based Arithmetic Coding
• Successful entropy coding
  – Depends on an accurate model of symbol probability
• Context-based Arithmetic Coding (CAC)
  – Uses local spatial and/or temporal characteristics to estimate the probability
  – Bi-level image compression in JBIG
  – Coding of binary shape 'masks' in MPEG-4 Visual
  – Entropy coding in the H.264 Main Profile
3.6 DPCM/DCT CODEC - (1)
3.6 DPCM/DCT CODEC - (2)
4. MPEG-4 & H.264
• 4.1 Standards Development Process
• 4.2 Using the Standards
• 4.3 MPEG-4 Visual/Part 2 Overview
• 4.4 H.264/MPEG-4 Part 10 Overview
• 4.5 Comparison of MPEG-4 and H.264
• 4.6 Related Standards
  – 4.6.1 JPEG, JPEG 2000
  – 4.6.2 MPEG-1, MPEG-2
  – 4.6.3 H.261, H.263
  – 4.6.4 Other parts of MPEG-4
4.1 Standards Development Process
• MPEG is responsible for creating, maintaining and improving ISO/IEC 14496 (MPEG-4)
  – MPEG: a working group of ISO/IEC
• H.264 recommendation
  – Other names
    • MPEG-4 Part 10
    • Advanced Video Coding
    • H.26L
  – Joint effort: the Joint Video Team (JVT) completed H.264/MPEG-4 Part 10, which is published by ISO/IEC and ITU-T
• MPEG
  – Developed MPEG-1 and MPEG-2; MPEG-7 and MPEG-21 are under development
• Video Coding Experts Group (VCEG): a working group of ITU-T
  – Developed H.261, H.263 and the early version of H.26L
ISO MPEG
• MPEG
  – ISO/IEC JTC 1/SC 29/WG 11
    • ISO: International Organization for Standardization
    • IEC: International Electrotechnical Commission
  – Aim: to develop standards for compression, processing and representation of moving pictures and audio
    • MPEG-1: Compression of video and audio for CD playback
    • MPEG-2: Storage and broadcasting of "television-quality" video and audio
      (MP3: MPEG Layer 3 audio coding)
    • MPEG-4: Coding of audio-visual objects
    • MPEG-7: Multimedia content representation
    • MPEG-21: Multimedia framework
MPEG Sub-groups
Sub-group | Responsibilities
Requirements | Establishing the requirements of new standards according to industry needs
Systems | Combining audio, video and related information; carrying the combined data on delivery mechanisms
Description | Declaring and describing digital media items
Video | Coding of moving images
Audio | Coding of audio
Synthetic Natural Hybrid Coding | Coding of synthetic audio and video for integration with natural audio and video
Integration | Conformance testing and reference software
Test | Methods of subjective quality assessment
Implementation | Experimental frameworks, feasibility studies, implementation guidelines
Liaison | Relations with other relevant groups and bodies
Joint Video Team | Described on the JVT slide
ITU-T VCEG
• ITU-T
  – Develops standards for telecommunication
  – VCEG
    • A working group of the ITU-T Standardization Sector
    • Responsible for a series of standards related to video communication over telecommunication networks and computer networks
  – Organized into sub-groups
    • SG 16: responsible for multimedia services, systems and terminals
      – Working Party 3 of SG 16: media coding
        » H.264 is the outcome of Question 6
        » VCEG's official name: ITU-T SG 16 Q.6
VCEG standards
• H.261
  – Video teleconferencing standard
• H.263
  – More efficient than H.261
• H.263+/H.263++
  – Extend the capabilities of H.263
• H.26L
  – The long-term project that followed H.263 and became the basis of H.264
• ITU-T H.264 and ISO/IEC MPEG-4 Part 10
• Unlike MPEG documents, VCEG documents are publicly available.
  – 1996 to 2002: http://standards.picte.com/ftp/video-site/
  – Material from 2002 onward: ftp://ftp.imtc-files.org/jvt-experts/
JVT
• Joint Video Team
• ISO/IEC JTC 1/SC 29/WG 11 (MPEG) & ITU-T SG 16 Q.6 (VCEG)
  – The core coding mechanism of MPEG-4 Visual (Part 2) was based on the older H.263; after extensive experiments it was agreed that H.26L was the best candidate, and the JVT was formed
• 1993: MPEG-4 project launched. Early results of H.263 project produced
• 1995: MPEG-4 call for proposals (efficient video coding, content-based functionalities). H.263 chosen as core video coding tool
• 1998: Call for proposals for H.26L
• 1999: MPEG-4 Visual standard published. Initial test model of H.26L defined
• 2000: MPEG call for proposals for advanced video coding tools
• 2001: Edition 2 of MPEG-4 Visual published. H.26L adopted as basis for proposed MPEG-4 Part 10. JVT formed
• 2002: Amendments 1 and 2 to MPEG-4 Visual Edition 2 published. H.264 frozen
• 2003: H.264/MPEG-4 Part 10 published
Deciding the Content of a Standard
• A work plan is agreed, with
  – A set of functional and performance objectives for the new standard
• Decision on the basic technology
  – Competitive trials are carried out
  – (e.g.) block-based motion compensation, transformation and quantization of the residual
• Development of the detail of the standard
  – Proposals for algorithms, methods or functionalities from companies and organizations
• Software model (reference software) developed together with a document (test model)
• Draft standard
• International Standard
4.2 Using the Standards
• MPEG-4 Visual: 539 A4 pages
• H.264: 250 A4 pages
• The standards
  – Do not specify a video encoder
  – Specify
    • the syntax of a coded bitstream
    • the semantics of these syntax elements (what they mean)
    • the process by which the syntax elements may be decoded to produce visual information
4.3 MPEG-4 Visual/Part 2 - (1)
• Data
  – Moving video (rectangular frames)
  – Video objects (arbitrary-shaped regions of moving video)
  – 2D and 3D mesh objects (representing deformable objects)
  – Animated human faces and bodies
  – Static texture (still images)
• Provides a set of coding tools that support the following application areas
  – Legacy video applications: digital TV broadcasting
  – Object-based video applications
  – Rendered computer graphics using 2D and 3D deformable mesh geometry / animated human faces and bodies
  – Hybrid video applications combining real-world video, still images and computer-generated graphics
  – Streaming video over the Internet and mobile channels
  – High-quality video editing and distribution for the studio production environment
4.3 MPEG-4 Visual/Part 2 - (2)
• Introduction
  – Overview of some of the target applications and data types, with a particular emphasis on 2D and 3D mesh objects
• Section 1
  – Describes the scope of the standard
• Section 2
  – References other standards documents
• Section 3
  – Contains a useful list of terminology and definitions
• Section 4
  – Lists standard symbols and abbreviations
• Section 5
  – Explains the conventions for describing the syntax of the standard
4.3 MPEG-4 Visual/Part 2 - (3)
• Section 6
  – Describes the syntax and semantics of MPEG-4 Visual
  – Defines the acceptable parameters of an MPEG-4 Visual bitstream
• Section 7
  – Describes a set of processes for decoding an MPEG-4 Visual bitstream
  – Defines the series of steps required to decode a compliant bitstream and convert it to a visual scene or visual object
  – Defines how an MPEG-4 Visual bitstream should be decoded
• Section 8
  – Discusses how separately decoded visual objects should be composed to make a visual scene
• Section 9
  – Defines a set of conformance points known as 'profiles' and 'levels'
• Annexes A-O
4.4 H.264/MPEG-4 Part 10 – (1)
• Narrower scope than MPEG-4 Visual
• Targets
  – Two-way video communication
  – Coding for broadcast and high-quality video
  – Video streaming over packet networks
  – Support for robust transmission over networks is built in
4.4 H.264/MPEG-4 Part 10 – (2)
• Introduction
  – Lists some target applications
  – Explains the concept of Profiles and Levels
  – Gives an overview of the coded representation
• Sections 1-5
  – Preamble to the detail
• Section 6
  – Input and output data formats: 4:2:0 progressive and interlaced video
  – Sampling rate
  – Format of the coded bitstream
  – The order of processing of video formats
  – Defines methods for finding the 'neighbors' of a coded element
• Section 7
  – Describes the syntax and semantics
4.4 H.264/MPEG-4 Part 10 – (3)
• Section 8
  – Describes the processes involved in decoding slices
    • Picture boundary detection/reference picture management
    • Intra/inter prediction
    • Transform coefficient decoding
    • Reconstruction
    • Built-in deblocking filter
• Section 9
  – Describes how a coded bitstream should be 'parsed'
    • Variable-length codes
    • Context-adaptive binary arithmetic codes
• Annex A
  – Defines the Baseline, Main and Extended profiles and 'levels'
• Annex B
  – Defines the format of a byte stream
• Annex C: Hypothetical Reference Decoder
4.5 Comparison of MPEG-4 and H.264
Item | MPEG-4 Visual | H.264
Data types | Rectangular video frames and fields, arbitrary-shaped video objects, still texture and sprites, synthetic or synthetic-natural hybrid video objects, 2D/3D mesh objects | Rectangular video frames and fields
Number of profiles | 19 | 3
Compression efficiency | Medium | High
Video streaming | Scalable coding supported | Switching slices
MC minimum block size | 8×8 | 4×4
MV accuracy | Half or quarter-pixel | Quarter-pixel
Transform | 8×8 DCT | 4×4 DCT approximation
Built-in deblocking filter | No | Yes
Licence payment | Yes | No (Baseline profile); probably (Main and Extended profiles)
4.6 Related Standards
• JPEG
  – 8×8 DCT → quantization → reordering → run-level coding → variable-length entropy coding
• JPEG 2000
  – Uses the DWT; similar to the still texture coding of MPEG-4 Visual
• MPEG-1
  – Optimized for a compressed video bit rate of 1.2 Mbps
• MPEG-2
  – Supports interlaced formats
• H.261
  – First widely used standard for videoconferencing
• H.263
  – Better performance than H.261
Overall Structure of MPEG-4 (1)
• Part 1 Systems
  – Scene description, multiplexing of audio, video and related information, synchronization, buffer management, intellectual property management
• Part 2 Visual
  – Coding of "natural" and "synthetic" visual objects
• Part 3 Audio
  – Coding of natural and synthetic audio objects
• Part 4 Conformance Testing
  – Conformance conditions, test procedures, test bitstreams
• Part 5 Reference Software
• Part 6 Delivery Multimedia Integration Framework
  – Session protocol for multimedia streaming
• Part 7 Optimized Visual Reference Software (Technical Report)
Overall Structure of MPEG-4 (2)
• Part 8 Carriage of MPEG-4 over IP
  – The mechanism for carrying MPEG-4 coded data over IP networks
• Part 9 Reference Hardware Description
  – VHDL descriptions of MPEG-4 coding tools (Technical Report)
• Part 10 Advanced Video Coding
• Part 11 Scene Description and Application Engine
• Part 12 ISO Base Media File Format
• Part 13 Intellectual Property Management and Protection Extensions
• Part 14 MPEG-4 File Format
• Part 15 AVC File Format
• Part 16 Animation Framework Extensions
5. Application (1)
Application | Requirements | MPEG-4 profiles | H.264 profiles
Broadcast television | Coding efficiency, interlace, reliability (over a controlled distribution channel), low-complexity decoder | ASP | Main
Streaming video | Coding efficiency, scalability, reliability (over an 'uncontrolled' packet-based network) | ARTS or FGS | Extended
Video storage | Coding efficiency, interlace, low-complexity decoder | ASP | Main
Video conferencing | Coding efficiency, reliability, low latency, low-complexity encoder and decoder | SP | Baseline
Mobile | Coding efficiency, reliability, low latency, low-complexity encoder and decoder, low power consumption | SP | Baseline
Studio | Lossless or near-lossless, interlace, efficient transcoding | Studio | Main
SP: Simple, ASP: Advanced Simple, ARTS: Advanced Real Time Simple, FGS: Fine Granular Scalability
5. Application (2)
Platform | Advantages | Disadvantages
Dedicated HW | Performance and power efficiency (best) | Inflexible, high development cost
DSP / media processor | Performance and power efficiency (good), flexibility | Limited choice of CODECs, medium development cost, single vendor
Embedded processor | Power efficiency (good), flexibility | Poor performance, single vendor
General purpose processor (PC) | Performance (medium/good), flexibility (best), wide choice of CODECs | Poor power efficiency