
bc3258b6d78b250850d55712d04f55aa.ppt
- Number of slides: 28
Advanced Video Capabilities of HD DVD-Video
Kilroy Hughes, Digital Media Architect, Microsoft Corporation
Contents
• What is HD DVD-Video Format?
• Video Capabilities
• Video/Graphics Layout Model (2D)
• Video/Graphics Composition Model (3D)
• Presentation and Synchronization Model (4D)
• Programming
• Animation
• Resource Management
• Output
• Conclusion
What is HD DVD-Video?
• “HD DVD-Video Format” is an APPLICATION format (i.e. content format) defined by the DVD Forum for use on various storage media
• The HD DVD-Video Application format is currently specified for use on:
– HD DVD-ROM discs (blue laser, 15 – 60 GB)
– DVD-ROM discs (red laser, 4.7 – 16.8 GB)
– R/W storage (flash memory, hard disk, etc.)
• The format can combine video, audio, text, and graphics from optical disc, internal player storage, local area network storage, and program streams from the Web into realtime interactive video presentations
Advanced Video Capabilities
• Simultaneous presentation of:
– 1 stream of up to 1920 x 1080P30 HD video (MPEG-2, H.264, or SMPTE 421M)
– 1 stream of up to 720 x 576P30 video (SD required, HD optional)
– 3 streams of up to 8 channel audio; streams can be from different sources
– 1 stream of text and graphics Subtitles, or bitmap Subpictures
– 16 Applications with programmed text, images, drawing, and animation
– A graphical cursor controlled by a pointing device (e.g. joystick, mouse, trackball, pressure pad, etc.)
• Z-order and alpha blend of graphics objects, and alpha blending of graphics and video planes
• Independent scaling, clipping, and positioning of all video and graphics objects
• Property animation (i.e. object size, position, transparency, color, etc. can be changed over time)
• Frame accurate composition and animation based on timecodes derived from video time (video frame/position) or Application time (elapsed time)
• Network support that enables updating presentations on optical disc with new content and programming that can be downloaded and streamed from the Web (e.g. new subtitles, new commentary, new movie trailers, new menus, new storyboard guide, new video games, etc.)
2D Video/Graphics Layout Model
[Diagram: the Canvas coordinate space, the full screen Display Aperture, and an Application Region, with sample text and video objects marked visible or invisible]
• All origins are upper left: Canvas coordinates start at (0, 0) and Application coordinates start at (0, 0); the axes are labeled up to +2^31 - 1
• The Full Screen Display Aperture is author specified (e.g. 1920 x 1080, 1280 x 720); in the example its lower right corner is at (1920, 1080)
• Note: Only App text/graphics inside both the App Region and the Aperture are visible
• Example of a video object with portions outside the visible Aperture, with coordinate labels (-200, 1000) and (2220, 1000). (To pan right, video object position would be animated left, etc.)
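The visibility rule on this slide (App text/graphics must lie inside both the App Region and the Aperture) amounts to a rectangle intersection. The following is a minimal sketch, assuming every rectangle has already been converted to canvas coordinates; the `Rect` type, the function names, and the sample coordinates are illustrative and are not taken from the HD DVD-Video specification.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle; origin is upper left, y grows downward."""
    left: int
    top: int
    right: int   # exclusive edge
    bottom: int  # exclusive edge


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlap of two rectangles, or None if they do not overlap."""
    left, top = max(a.left, b.left), max(a.top, b.top)
    right, bottom = min(a.right, b.right), min(a.bottom, b.bottom)
    if left >= right or top >= bottom:
        return None
    return Rect(left, top, right, bottom)


def visible_app_graphics(obj: Rect, app_region: Rect, aperture: Rect) -> Optional[Rect]:
    """App text/graphics are visible only inside both the App Region and the Aperture."""
    clip = intersect(app_region, aperture)
    return intersect(obj, clip) if clip is not None else None


# Hypothetical values: a 1920 x 1080 Aperture and a video object wider than it.
aperture = Rect(0, 0, 1920, 1080)
wide_video = Rect(-200, 0, 2220, 1080)
visible_video = intersect(wide_video, aperture)
```

Animating `wide_video.left` toward more negative values shifts which part of the video falls inside the Aperture, which is the "pan right by animating left" effect the slide describes.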
3D Multi-Plane Composition Model
[Diagram: planes stacked along the Z-axis, listed here from the Point of View (front) to the back]
• Cursor
• Interactive Graphics (Application and Object Z-order)
• Subtitles
• Secondary Video (Alpha key & Rect)
• Primary Video and background (Opaque)
• An “Object opacity style” label appears on three of the planes
3D Multi-Application and Object Composition Model
[Diagram: Z-ordered Applications in the Interactive Graphics Plane, each with an Application Region that contains z-ordered text and graphics objects]
• Applications are Z-ordered: Z=0, Z=1, … Z=N
• Objects in each Application Region are z-ordered: z=0.1 … z=0.n; z=1.0, z=1.1 … z=1.n; z=N.0, z=N.1 … z=N.n
• Text and graphics objects are contained in an Application’s 3D Region
• Painter’s algorithm draws objects from back to front, from z=N.n to z=0.0, with “Source Over” mixing
• Application and Object Z-orders can be dynamically changed by programming
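The back-to-front drawing order with “Source Over” mixing can be sketched for a single pixel as follows. This is an illustration under stated assumptions, not player code: colors are straight (non-premultiplied) RGBA floats, a smaller Z value means closer to the viewer (as on the slide, where z=0.0 is drawn last), and the `GraphicsObject` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Color = Tuple[float, float, float, float]  # straight RGBA, each component 0.0..1.0


@dataclass
class GraphicsObject:
    app_z: int    # Application Z-order (0 is frontmost)
    obj_z: int    # object z-order inside its Application (0 is frontmost)
    color: Color  # color this object contributes at the pixel being composited


def source_over(src: Color, dst: Color) -> Color:
    """Porter-Duff 'source over' for straight-alpha colors."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)

    def mix(s: float, d: float) -> float:
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (mix(sr, dr), mix(sg, dg), mix(sb, db), out_a)


def composite_pixel(objects: Iterable[GraphicsObject]) -> Color:
    """Painter's algorithm: draw from z=N.n (back) to z=0.0 (front)."""
    result: Color = (0.0, 0.0, 0.0, 0.0)  # start fully transparent
    back_to_front = sorted(objects, key=lambda o: (o.app_z, o.obj_z), reverse=True)
    for obj in back_to_front:
        result = source_over(obj.color, result)
    return result


# Half-transparent blue in the front Application over opaque red in a back Application.
pixel = composite_pixel([
    GraphicsObject(app_z=0, obj_z=0, color=(0.0, 0.0, 1.0, 0.5)),
    GraphicsObject(app_z=1, obj_z=0, color=(1.0, 0.0, 0.0, 1.0)),
])
```

Because the sort key is recomputed on every call, changing an object's `app_z` or `obj_z` between frames changes the drawing order, which mirrors the slide's point that Z-orders can be changed by programming.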
Video Keying and Blending
• The Primary Video Plane is opaque, and any area not filled with video will show a designated background color
• The Secondary Video Plane can be “luma” and chroma keyed, can have transparent objects called “clear rectangles”, and can set an Opacity style property (alpha value) for the entire video object
– “Luma key” treats author designated sub-black pixels as transparent to the Primary Video below it (intended for professionally pre-produced blue screen or rotoscoped mattes)
– “Chroma key” allows authors to designate a transparent color range, with the caveat that color quantization and block transforms used in video compression may result in rough edges and unintended areas of opacity or transparency (may be appropriate for “live video” overlay)
– A video alpha channel for alpha per pixel is not supported
• “Clear Rectangles” are layout objects defined in Graphics Plane Applications that “cut a hole” through any graphics objects in the same area and reveal either the Primary or Secondary Video beneath as designated
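The keying rules above can be sketched as a per-pixel alpha decision for the Secondary Video Plane. This is only an illustration: the function name, parameter names, and the way key ranges are expressed are assumptions, since the slide does not give the format's actual key syntax. It assumes 8-bit Y'CbCr samples, where nominal black is Y = 16, so values below 16 count as sub-black.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # inclusive (low, high) bounds for one chroma component


def secondary_video_alpha(
    y: int,
    cb: int,
    cr: int,
    opacity: float = 1.0,
    luma_key_max: Optional[int] = None,
    chroma_key: Optional[Tuple[Range, Range]] = None,
) -> float:
    """Return the alpha used to blend one Secondary Video pixel over Primary Video.

    luma_key_max: hypothetical author setting; luma at or below it is transparent.
    chroma_key:   hypothetical author setting; ((cb_lo, cb_hi), (cr_lo, cr_hi)).
    opacity:      the single alpha value applied to the whole video object.
    """
    if luma_key_max is not None and y <= luma_key_max:
        return 0.0  # sub-black pixel: Primary Video shows through
    if chroma_key is not None:
        (cb_lo, cb_hi), (cr_lo, cr_hi) = chroma_key
        if cb_lo <= cb <= cb_hi and cr_lo <= cr <= cr_hi:
            return 0.0  # pixel color falls in the designated transparent range
    return opacity  # there is no per-pixel alpha channel, only keyed or not keyed
```

The result is always either 0.0 or the object's opacity, which reflects the slide's statement that a per-pixel video alpha channel is not supported. Compression noise that pushes a pixel just outside the chroma range flips it to fully opaque, which is the rough-edge caveat noted for chroma keying.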
Primary and Secondary Clear Rectangles
[Diagram labels: Graphics Plane, Secondary Clear Rect, Secondary Video, Primary Clear Rect]
Primary Video Plane
Secondary Video Overlay (Not Keyed)
Secondary Video with Chroma Key
An Image in the Graphics Plane Overlaying Primary Video
Example of a “Clear Rectangle” Punching Through Graphics to Video
Secondary Video with Clear Rectangle to Secondary Video Plane
Secondary Video with Clear Rectangle to Primary Video Plane
Presentation and Synchronization Model
• HD DVD-Video uses an XML presentation language referred to as “iHD” for frame accurate video and graphics presentation and animation
• A “Title Timeline” is specified for each presentation sequence (a Title); Video Clips, Audio Clips, Subtitles, Applications, and Resources are laid out in sequences on that timeline, and these sequences are called Tracks
• Multiple Titles can be combined in a Playlist, which contains all the valid content and playback sequences defined for a disc and its associated downloaded and streamed content
• iHD Applications use a timing language that can reference the timecode of a Title, which is synchronized to a frame of video or audio on each Track, so iHD Applications can create deterministic, frame accurate, interactive graphics and video presentations
• Simple interactive video applications without interactive graphics can be created with only a Playlist, video and audio Program Streams, and Time Map indexes for those Program Streams
Playlists
• Typically multiple Titles in a Playlist
• Each Title has its own timeline and Title:Timecode
• Video Clips sequence to form Video Tracks
• Audio Clips sequence to form Audio Tracks
• Subtitle Segments sequence to form Subtitle Tracks
• Application Segments sequence to form Application Tracks
• Application Resource Tracks span one Application
• Title Resource Tracks span multiple Applications
• Playlist Applications and Resources span multiple Titles
• Playlists also specify:
– Configuration information such as Aperture size
– Navigation mapping of Tracks for remote controls
– Media attributes that identify codec, resolution, active area, source frame rate, number of audio channels, nominal bitrate, etc.
Playlist Title with 3 Video Clips
[Diagram: a Title Timeline marked Ch 1, Ch 2, Ch 3, and End; a Video Track with Video Clips 1 to 3; an Audio Track with Audio Clips 1 to 3]
• Program Stream “Clips” can be segments of the same or different files. They are combined on the Title timeline and “spliced” on playback
• Three Time Map files, TMAP (File1.MAP), TMAP (File2.MAP), and TMAP (File3.MAP), provide timecode → byte offset indexes for three video files
• The three video files are P-storage A/V (File1.EVOB), Disc A/V (File2.EVOB), and Web A/V (File3.EVOB)
• File/byte offsets are used to play Program Streams from files or via the HTTP: protocol
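The timecode to byte offset lookup described on this slide can be sketched as a two-step resolution: find the Clip that covers a Title time, then search that Clip's Time Map. The `Clip` layout, the field names, and the use of seconds are assumptions made for illustration; the real TMAP file structure is not described in these slides.

```python
import bisect
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Clip:
    title_start: float             # Title Timeline time where the clip begins (seconds)
    title_end: float               # Title Timeline time where the clip ends (seconds)
    source: str                    # file or URL holding the Program Stream
    clip_in: float                 # time inside the source that plays at title_start
    tmap: List[Tuple[float, int]]  # sorted (time in source, byte offset) entries


def locate(track: List[Clip], title_time: float) -> Tuple[str, int]:
    """Map a Title time to (source, byte offset of the nearest earlier index entry)."""
    for clip in track:
        if clip.title_start <= title_time < clip.title_end:
            file_time = clip.clip_in + (title_time - clip.title_start)
            times = [t for t, _ in clip.tmap]
            i = bisect.bisect_right(times, file_time) - 1
            if i < 0:
                raise ValueError("time precedes the first Time Map entry")
            return clip.source, clip.tmap[i][1]
    raise ValueError("no clip covers this Title time")


# Hypothetical track: two clips spliced on one Title Timeline.
video_track = [
    Clip(0.0, 10.0, "File1.EVOB", 0.0, [(0.0, 0), (5.0, 1000)]),
    Clip(10.0, 20.0, "File2.EVOB", 30.0, [(30.0, 5000), (35.0, 9000)]),
]
```

With a byte offset in hand, a player could seek in a local file or request the data starting at that offset over HTTP, which is why the same index works for disc, persistent storage, and Web sources.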
Playlist with Secondary Video
[Diagram: a Title Timeline marked Ch 1, Ch 2, Ch 3, and End, with tracks labeled Main Video, Sub Video, and Application]
• Clips: Video Clips 1 to 3 and Audio Clips 1 to 3
• Applications: Menu (App 1), Tablet PC (App 2), Tracking (App 3)
• Resources: App 1 Resources, App 2 Resources, App 3 Resources
• 4D layout of content that can be shown in Primary Video Plane, Secondary Video Plane, and Graphics Plane with additional control by iHD Application programs
Playlist Resource Management
[Diagram: a Title Timeline marked Ch 1, Ch 2, Ch 3, and End, with Main Video, Sub Video, Application, and App Resources tracks]
• Main Video and Sub Video tracks carry Video Clips and Audio Clips
• Applications: Menu (App 1), Tablet PC (App 2), Tracking (App 3)
• Resources: App 1 Resources, App 2 Resources, App 3 Resources
• The Resource Track on the bottom schedules loading and unloading of all required Application files into a 64 MB File Cache so they are instantly accessible to the user during any portion of the Title when that App is “valid”
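One way to reason about a Resource Track schedule is to check that the resources loaded at any moment never exceed the File Cache. The sketch below is a hypothetical authoring-side check, not part of the format: the tuple layout and names are invented for illustration, and treating “64 MB” as 64 x 1024 x 1024 bytes is an assumption.

```python
from typing import List, Tuple

MB = 1024 * 1024
FILE_CACHE_BYTES = 64 * MB  # slide says 64 MB; binary megabytes are an assumption

# (name, size in bytes, load time, unload time) on the Title Timeline, in seconds
Resource = Tuple[str, int, float, float]


def peak_cache_use(resources: List[Resource]) -> int:
    """Sweep the load/unload events in time order and return the peak bytes held."""
    events = []
    for _name, size, load_at, unload_at in resources:
        events.append((load_at, size))
        events.append((unload_at, -size))
    events.sort()  # at equal times, unloads (negative deltas) sort before loads
    in_use = peak = 0
    for _time, delta in events:
        in_use += delta
        peak = max(peak, in_use)
    return peak


def fits_file_cache(resources: List[Resource]) -> bool:
    return peak_cache_use(resources) <= FILE_CACHE_BYTES


# Hypothetical schedules: back-to-back fits, overlapping does not.
sequential = [("App 1 Resources", 40 * MB, 0.0, 100.0),
              ("App 2 Resources", 30 * MB, 100.0, 150.0)]
overlapping = [("App 1 Resources", 40 * MB, 0.0, 100.0),
               ("App 2 Resources", 30 * MB, 50.0, 150.0)]
```

Here `fits_file_cache(sequential)` holds because App 1's files are unloaded at the moment App 2's are loaded, while the overlapping schedule would need 70 MB between 50 and 100 seconds.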
iHD Programming
• Optimized mix of Declarative and Procedural languages
• Declarative Markup language handles most presentation needs with simple tags and reliable, realtime performance using native code and hardware
• Compact ECMAScript Procedural language provides full programmability through content and player APIs, author handled events, and state machine
Animation
• Property animation
– Any object (graphics, text, drawing, video) can change its properties over time in response to simple markup statements
– Properties include position, size, opacity, color, z-order, etc.
• Bitmap animation
– Bitmap animations are a sequence of images that capture a pre-rendered animation
– Playback can use a timed sequence of PNG or JPG image files (good for frame accuracy and for trick modes such as reverse play), or a single MNG file
• Cel animation
– Cel animation combines bitmap or property animated objects with separate backgrounds. Performance is improved because the entire frame does not have to be stored and redrawn each frame, and it is more flexible because animated foreground objects can be added, removed, and controlled by programming and user input
• Animation can be synchronized to the Title Clock, Application Clock, or Page Clock
– If an animation is synchronized to the Title Clock, it will pause when video pauses, jump to a timecoded animation frame or state when the video jumps to that timecode, play slow when the video plays slow, etc. One thing this enables is “video tracking hotspots”, which are graphics or interactive regions superimposed over “objects” in the video, such as a halo that follows a person who walks around, appearing in and disappearing from the video
– If an animation is synchronized to the Application Clock, it will continue to run or loop regardless of video playback
– If an animation is synchronized to an Application “page”, it can be run each time the page is loaded; for instance to do a menu build, or “fly in” a video image
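Property animation tied to a clock can be pictured as a pure function of the clock value: given the current frame number of whichever clock drives the animation, return the property value. The sketch below uses linear interpolation between keyframes; the keyframe representation and function name are assumptions for illustration and do not reproduce the iHD markup syntax.

```python
from typing import List, Tuple

Keyframe = Tuple[int, float]  # (clock time in frames, property value)


def animated_value(keyframes: List[Keyframe], frame: int) -> float:
    """Linearly interpolate a property at the given clock frame.

    keyframes must be sorted by frame; values are held before the first
    keyframe and after the last one.
    """
    if frame <= keyframes[0][0]:
        return keyframes[0][1]
    if frame >= keyframes[-1][0]:
        return keyframes[-1][1]
    for (f0, v0), (f1, v1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            return v0 + (v1 - v0) * (frame - f0) / (f1 - f0)
    raise ValueError("keyframes are not sorted by frame")


# Hypothetical fade-in of an object's opacity over 48 frames of the Title Clock.
fade_in = [(0, 0.0), (48, 1.0)]
opacity_at_frame_24 = animated_value(fade_in, 24)
```

Because the result depends only on the clock value, feeding it the Title Clock gives the behavior described above: a paused video yields the same frame number and so a frozen animation, and a jump to a timecode yields the animation state for that timecode. Feeding it a free-running Application Clock instead keeps the animation moving regardless of video playback.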
Audio/Video Output Synchronization
• Most “DVD” video is 24 frame per second progressive source, such as movies and episodic television
• HD DVD-Video perpetuates the practice of encoding 24P source as 30i by adding repeat field flags to generate 60 Hz timing and (optionally) 3:2 pulldown
• The HD DVD-V system is capable of ignoring the repeat flags and outputting pure 24 frame per second video, text, and graphics over HDMI … but
• The current consumer electronics industry direction is to apply 3:2 pulldown and convert to 60 fields per second somewhere in the display pipeline in order to generate a raster signal for analog connections to CRT displays
• It is very important that new HD displays and their HDMI inputs support 1080P24 input mode. Scaling and refresh should be handled in the display with methods appropriate for its particular display technology (which will rarely be CRT), and not add an extra step of inverse telecine detection, deinterlacing, scaling, and filtering
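The 3:2 pulldown cadence mentioned above can be illustrated by expanding 24 progressive frames into 60 fields: frames alternately contribute two and three fields, with the third being a repeated field. This is a sketch of the general cadence only; which frame in the pattern receives the repeat, and the tuple representation, are assumptions rather than details from the HD DVD-Video format.

```python
from typing import Hashable, Iterable, List, Tuple

Field = Tuple[Hashable, str]  # (source frame identifier, "top" or "bottom")


def pulldown_fields(frames: Iterable[Hashable]) -> List[Field]:
    """Expand progressive frames into fields using a 2, 3, 2, 3, ... cadence."""
    fields: List[Field] = []
    next_is_top = True  # field parity must keep alternating across frames
    for index, frame in enumerate(frames):
        count = 3 if index % 2 else 2  # every second frame gets a repeated field
        for _ in range(count):
            fields.append((frame, "top" if next_is_top else "bottom"))
            next_is_top = not next_is_top
    return fields


def ignore_repeats(fields: Iterable[Field]) -> List[Hashable]:
    """Recover the progressive frame sequence by collapsing runs of one frame.

    Assumes consecutive source frames have distinct identifiers.
    """
    frames: List[Hashable] = []
    for frame, _parity in fields:
        if not frames or frames[-1] != frame:
            frames.append(frame)
    return frames


four_frames = pulldown_fields("ABCD")
one_second = pulldown_fields(range(24))
```

Four film frames become ten fields, so 24 frames become 60 fields per second. `ignore_repeats` corresponds to the slide's point that a player can disregard the repeat flags and output the original 24 frames per second.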
The 50 Hz/60 Hz “Problem”
• The legacy solution of +4% speed shift from 24 Hz to 25 Hz no longer works with compressed digital audio outputs (and was never really satisfactory)
• HD DVD-V format requires that video be encoded at either 50 Hz or 60 Hz, so most content will be 24P encoded with 60 Hz timing
• Europe’s “HD Ready” logo indicates a display will handle both 50 Hz and 60 Hz HDMI input, but what about 24 Hz?
• Unless Europe (and other 50 Hz regions) require 24 Hz on HDMI displays, the options are:
– Wait for a format converted 50 Hz version of each disc
– Watch the 60 Hz version at 30i with 3:2 pulldown
– Speed shift 24P to 25P and watch at 50 Hz with pitch shifted uncompressed audio over HDMI
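The size of the 24 to 25 frame per second speed shift can be worked out directly. The helper below is illustrative arithmetic only; the function name and the 120 minute example runtime are assumptions, and the pitch figure applies when audio is simply played faster without pitch correction.

```python
import math
from typing import Dict


def speed_shift(source_fps: float = 24.0,
                target_fps: float = 25.0,
                runtime_minutes: float = 120.0) -> Dict[str, float]:
    """Quantify playing source_fps material at target_fps."""
    ratio = target_fps / source_fps
    return {
        "speedup_percent": (ratio - 1.0) * 100.0,
        # Pitch rise of uncorrected audio, in equal-tempered semitones.
        "pitch_shift_semitones": 12.0 * math.log2(ratio),
        "shifted_runtime_minutes": runtime_minutes / ratio,
    }


shift = speed_shift()
```

For the default values this gives a speedup of roughly 4.17 percent (the slide's "+4%"), a pitch rise of roughly 0.71 semitone, and a 120 minute film that runs 115.2 minutes.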
The Interlace “Problem”
• Most new DVD players and displays today support 480P over analog component interfaces at various refresh rates (e.g. 72 Hz refresh)
• But, the encoded video has reduced vertical resolution intended to reduce flicker on interlaced CRT displays (done by CCD sensors that mix adjacent “scan” lines, optical filters, FIR filters on resampling, etc.)
• Deinterlace chips can’t restore the vertical resolution that was thrown away (a separate issue from the number of scan lines)
• The industry needs to change this production and display model for HD DVD-V and BD!!!
– Encode 1080P24 video at full vertical resolution to enable full resolution progressive display
– Players must apply anti-alias and interlace filtering if they subsample and sequentially output 540 line fields for 1080i30 signal output (also applies to generated text and graphics)
Take Aways on HD DVD-Video
• XML Playlists accomplish “on the fly” editing and mixing in the player like EDLs or AAF on video editing work stations
• Players include an HD video and graphics “blender” that alpha blends multiple planes of video, graphics and text in realtime with frame and pixel accuracy
• Resources from various storage and network sources are marshaled and managed for realtime presentations that can be interactively navigated by users
• Advanced audio and video codecs provide state of the art quality and efficiency including 1080P video and mathematically lossless 8 channel audio
• Programmable and network updatable user experiences create new entertainment possibilities that combine the flexibility of the Web with the high quality and reliable consumer experience of DVD-Video
Thank You