bd34772393a11bd0f5ab386ff2bc98d4.ppt
- Number of slides: 28
LYU 0102: XML for Interoperable Digital Video Library • Recent years have seen a rapid increase in the use of multimedia information • New approach: DIGITAL VIDEO LIBRARY • Automated video and audio indexing • Navigation, visualization • Search and retrieval • Video segmentation and summarization
Video Information • Integration of speech, language, and image processing • Text processing • Audio processing • Image processing • Video processing
Digital Video Library System Overview
Techniques to segment data
Techniques we may apply • VOCR • Scene changes • Text processing • Face detection • Storage as XML
Techniques to be discussed • VOCR • Scene changes • Storage and editing with XML
Video OCR for Digital News
Detection of Text Region • A video news program comprises a huge number of frames • Roughly detect text regions first • Increases processing speed • Reduces processing cost
Detection of Text Region • A typical text region can be characterized as a horizontal rectangular structure • With clustered sharp edges • With high contrast against the background
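The characterization above can be turned into a rough detector. The following is only a minimal sketch, assuming a grayscale frame stored as a list of pixel rows; the function names and both thresholds are hypothetical and not taken from the slides. It marks rows with many sharp horizontal edges and groups consecutive marked rows into horizontal bands.

```python
def edge_density_per_row(frame, edge_threshold=50):
    """Fraction of neighbouring pixel pairs in each row with a sharp jump."""
    densities = []
    for row in frame:
        sharp = sum(
            1 for left, right in zip(row, row[1:])
            if abs(right - left) >= edge_threshold
        )
        densities.append(sharp / max(len(row) - 1, 1))
    return densities


def candidate_text_bands(frame, edge_threshold=50, density_threshold=0.3):
    """Return (first_row, last_row) bands of consecutive edge-dense rows."""
    bands = []
    start = None
    densities = edge_density_per_row(frame, edge_threshold)
    for y, density in enumerate(densities):
        if density >= density_threshold:
            if start is None:
                start = y  # a new band begins here
        elif start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(densities) - 1))
    return bands
```

Only the returned bands would then be passed to the more expensive recognition steps, which is where the speed gain comes from.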
Image Enhancement • Sub-pixel Interpolation: – To magnify the text area – To increase the resolution of the caption • Multi-frame Integration: – Non-caption areas move between frames while the caption stays relatively stable – To reduce the variability of the background
Character Segmentation • Vertical projection profile • Character segmentation
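Multi-frame integration can be illustrated with a per-pixel minimum over several frames. This is a sketch under a stated assumption, not necessarily the method used in the project: it assumes the caption is brighter than the background and stays at the same position, so the minimum keeps the caption bright while the changing background is pushed toward its darkest value. The function name is hypothetical.

```python
def integrate_frames(frames):
    """Per-pixel minimum over equally sized grayscale frames."""
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [min(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]
```

For dark captions on a bright background the same idea would use the maximum instead.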
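A vertical projection profile counts the text pixels in every column; columns with a zero count are gaps between characters. The sketch below assumes a binarized text line given as rows of 0/1 values, and the function names are illustrative only.

```python
def vertical_projection(binary):
    """Count text pixels (value 1) in every column."""
    return [sum(column) for column in zip(*binary)]


def segment_characters(binary):
    """Return (first_col, last_col) spans separated by empty columns."""
    spans = []
    start = None
    profile = vertical_projection(binary)
    for x, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = x  # first column of a new character
        elif start is not None:
            spans.append((start, x - 1))
            start = None
    if start is not None:
        spans.append((start, len(profile) - 1))
    return spans
```

Touching or broken characters would need extra handling that this sketch does not attempt.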
Character Recognition • Binarize the character image with threshold • Filter the binary image with morphological filter • Filter the character image with connected component filter
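Two of the steps above, thresholding and the connected component filter, can be sketched as follows. The threshold value, the minimum component size, and the function names are assumptions chosen for illustration; the morphological filter is left out to keep the example short.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Map a grayscale image to 0/1 using a fixed threshold."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray]


def remove_small_components(binary, min_size=3):
    """Drop 4-connected foreground blobs smaller than min_size pixels."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    result = [row[:] for row in binary]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) < min_size:
                for cy, cx in component:
                    result[cy][cx] = 0  # treat tiny blobs as noise
    return result
```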
Post-Processing • Further improves the recognition rate 1. Use dictionary words to refine the recognized characters 2. Integrate the recognition results of multiple frames
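Both post-processing ideas can be sketched with the Python standard library. This is an illustration under assumptions: voting picks the most frequent reading among frames, and dictionary refinement uses `difflib.get_close_matches` as a stand-in for whatever matching the project actually used. The function names, the cutoff, and the sample words are hypothetical.

```python
from collections import Counter
import difflib


def vote_across_frames(readings):
    """Pick the most frequent OCR reading among several frames."""
    return Counter(readings).most_common(1)[0][0]


def correct_with_dictionary(word, dictionary, cutoff=0.6):
    """Replace a word with its closest dictionary entry, if one is close enough."""
    matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else word
```

A word with no sufficiently close dictionary entry is returned unchanged, so proper names are not forced into wrong corrections.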
Scene change • Scene change detection is an effective method for segmenting a video sequence into significant components
Existing Methods • Image difference method • Histogram difference method using DC coefficient images • Our method – Histogram difference method with a dynamic threshold
Scene change • Capture a frame from the video every 0.05 second • Each captured frame is a 24-bit image, 8 bits for each color (red R, green G, blue B) • For each pixel, take the two most significant bits of each color • Classify the pixels into 64 different classes • Build a color histogram
Scene change • Compare the histogram with that of the previous frame • For each column of the histogram, calculate the difference • Sum all the differences • If (total difference) > threshold => scene change • Use the first frame as the key frame
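Taking the two most significant bits of each of the three colors gives 6 bits, hence 2^6 = 64 classes. A minimal sketch, assuming pixels are given as (R, G, B) tuples with 8-bit values; the function names and the bit order of the class index are choices made for this example.

```python
def color_class(r, g, b):
    """Map a 24-bit RGB pixel to one of 64 classes (top 2 bits per color)."""
    return ((r >> 6) << 4) | ((g >> 6) << 2) | (b >> 6)


def color_histogram(pixels):
    """Count how many pixels fall into each of the 64 color classes."""
    histogram = [0] * 64
    for r, g, b in pixels:
        histogram[color_class(r, g, b)] += 1
    return histogram
```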
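The comparison steps can be sketched as below. The slides do not say how the dynamic threshold is computed, so this example simply takes the threshold as a parameter; that, and the function names, are assumptions.

```python
def histogram_difference(current, previous):
    """Sum of absolute per-column differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(current, previous))


def detect_scene_changes(histograms, threshold):
    """Return indices of frames that start a new scene (frame 0 always does)."""
    if not histograms:
        return []
    key_frames = [0]
    for i in range(1, len(histograms)):
        if histogram_difference(histograms[i], histograms[i - 1]) > threshold:
            key_frames.append(i)  # first frame of the new scene is the key frame
    return key_frames
```

With histograms built from frames sampled every 0.05 second, the returned indices map back to timestamps by multiplying by 0.05.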
XML • Extensible Markup Language • Lets you create your own markup language for describing content • Looks like a big database
Advantages of using XML • Platform and system independent • Create your own tags • Adopts Unicode • Universal format • Easy to search
Design schema • Starts with choosing a vocabulary • Words and phrases that are able to describe the extracted video information content and can therefore be used as tag names • Show relationships between vocabulary entries
XML Parser • A parser is an interface between an XML document and the application program • Document Object Model (DOM)
How to present XML • Tree model becomes very similar to an XML schema • Represented as nodes that show element/attribute names or the text content and their relative places within the XML
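To show how an application reads video metadata through a DOM parser, here is a small sketch using Python's standard `xml.dom.minidom`. The tag and attribute names (`video`, `scene`, `caption`, `start`, `title`) are a hypothetical vocabulary invented for this example, not the project's actual schema.

```python
from xml.dom import minidom

# Hypothetical document describing two detected scenes and their captions.
SAMPLE = """<video title="Evening News">
  <scene start="0.00">
    <caption>WEATHER</caption>
  </scene>
  <scene start="12.35">
    <caption>SPORTS</caption>
  </scene>
</video>"""


def list_scenes(xml_text):
    """Return (start_time, caption_text) for every scene element."""
    document = minidom.parseString(xml_text)
    scenes = []
    for scene in document.getElementsByTagName("scene"):
        caption = scene.getElementsByTagName("caption")[0]
        text = caption.firstChild.data.strip()
        scenes.append((float(scene.getAttribute("start")), text))
    return scenes
```

The DOM keeps the whole document in memory as a tree of nodes, which suits browsing and editing small metadata files.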
OUR TOOL
COMING • EXTRACT SECONDARY INFORMATION
THE END