- Number of slides: 46
Detection and Extraction of Artificial Text from Videos. Christian Wolf and Jean-Michel Jolion, 10th July 2001. Project France Télécom Research & Development 001 B 575. Laboratoire de Reconnaissance de Formes et Vision, Bât. Jules Verne, INSA, 69621 Villeurbanne CEDEX. http://rfv.insa-lyon.fr/~{wolf,jolion}
Plan of the presentation (slides per section): Introduction (6); Detection (8); Image enhancement, multiple frame integration (3); Binarisation of the text boxes (10); Setup of the experiments (11); Results for detection, binarisation and OCR (6); Conclusion and outlook (2); total 46. Intro Detection Enhancement Binarisation Experiments Results
Content-based image retrieval. [Diagram labels: example image, indexing phase, similarity function, result]
Similarity measures. [Figure labels: similar, not similar]
Indexing using text. [Diagram labels: keyword ("Patrick Mayhew"), keyword-based search, indexing phase, result. Example text in the indexed frames: "Patrick Mayhew", "Min. chargé de l'irlande de Nord", "ISRAEL", "Jerusalem", "montage T. Nouel", ...]
Video properties. [Figure labels: 80 px, 12 px, 8 px]
Text extraction: general scheme. [Diagram stages: video; detection of the text in single frames; tracking; image enhancement (multiple frame integration); segmentation/binarisation; OCR. Example output: "EVENEMENT", "ACTU", "SPELEOS", "Gouffre Berger (Isére)", "aujourd'hui", "France 3 Alpes", "un spéléologue sauveteur"]
Detection in single frames. [Diagram boxes: video; calculation of the gradient; accumulation; binarisation; mathematical morphology; connected components analysis; verification of geometric constraints; verification of special cases; combination of the rectangles; list of rectangles]
Detection in single frames: examples
A filter for text detection: accumulation of horizontal gradients. Justification: text forms a regular texture containing vertical edges which are aligned horizontally. [Figure labels: W, M-W]
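The idea of accumulating horizontal gradients can be illustrated with a minimal sketch. It is not the authors' filter: the central-difference derivative, the window width `w` and the function name `accumulated_horizontal_gradient` are assumptions made for illustration, and the image is a plain list of rows of gray values.

```python
def accumulated_horizontal_gradient(image, w=5):
    """For every pixel, sum |dI/dx| over a horizontal window of w pixels."""
    half = w // 2
    out = []
    for row in image:
        n = len(row)
        # Absolute horizontal derivative (central difference, zero at the borders).
        grad = [0] * n
        for x in range(1, n - 1):
            grad[x] = abs(row[x + 1] - row[x - 1])
        # Accumulate the gradient magnitudes inside the window (clipped at the borders).
        acc = []
        for x in range(n):
            lo, hi = max(0, x - half), min(n, x + half + 1)
            acc.append(sum(grad[lo:hi]))
        out.append(acc)
    return out


# A flat background row versus a row of alternating dark and bright strokes.
flat = [[10] * 12]
strokes = [[0, 0, 255, 255] * 3]
response_flat = accumulated_horizontal_gradient(flat)
response_strokes = accumulated_horizontal_gradient(strokes)
```

Regions with many vertical edges side by side, as in a line of characters, give a high response, while flat regions give none.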
Mathematical morphology. Operations: close; deletion of small bridges between the components; dilate (special) to connect characters; erode horizontally; dilate horizontally.
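As an illustration of the horizontal operations only, here is a small sketch of dilation, erosion and closing with a 1-D horizontal structuring element on a binary image. The element width `k`, the border handling and the function names are assumptions; the "special" dilation and the bridge deletion of the slide are not reproduced.

```python
def dilate_h(img, k=3):
    """Horizontal dilation: a pixel becomes 1 if any pixel in its window is 1."""
    half = k // 2
    return [[1 if any(row[max(0, x - half):x + half + 1]) else 0
             for x in range(len(row))] for row in img]


def erode_h(img, k=3):
    """Horizontal erosion: a pixel stays 1 only if all pixels in its window are 1."""
    half = k // 2
    return [[1 if all(row[max(0, x - half):x + half + 1]) else 0
             for x in range(len(row))] for row in img]


def close_h(img, k=3):
    """Closing = dilation followed by erosion; fills gaps narrower than the element."""
    return erode_h(dilate_h(img, k), k)


row = [[1, 0, 1, 0, 0, 0, 1]]
closed = close_h(row)   # the one-pixel gap is filled, the three-pixel gap is kept
```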
Detection in video sequences. Detection per single frame (list of rectangles per frame); tracking (keeping track of text occurrences over the frame number, i.e. time); suppression of false alarms; image enhancement (multiple frame integration).
Integration of the rectangles into text occurrences. At every new frame, the detected rectangles must be matched with the stored text occurrences. [Diagram labels: frame nr. (time); list of rectangles detected for the current frame; text occurrences: list containing the most recent rectangle of each text occurrence] The integration is done using overlap information (overlap matrix).
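The matching step can be sketched as follows. This is an assumption-laden illustration, not the authors' procedure: rectangles are `(x0, y0, x1, y1)` tuples, the overlap measure (intersection area divided by the smaller rectangle's area), the `threshold` value and the function names are all hypothetical.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def overlap(a, b):
    """Intersection area relative to the smaller of the two rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    if w <= 0 or h <= 0:
        return 0.0
    return (w * h) / min(area(a), area(b))


def integrate(occurrences, detections, threshold=0.5):
    """Match the detections of the current frame against the stored occurrences.

    occurrences holds the most recent rectangle of each text occurrence.
    Returns the updated list and the overlap matrix (detections x occurrences).
    """
    matrix = [[overlap(d, o) for o in occurrences] for d in detections]
    updated = list(occurrences)
    for i, d in enumerate(detections):
        row = matrix[i]
        best = max(range(len(row)), key=row.__getitem__) if row else None
        if best is not None and row[best] >= threshold:
            updated[best] = d      # same text occurrence, keep its newest rectangle
        else:
            updated.append(d)      # no sufficient overlap: start a new occurrence
    return updated, matrix


stored = [(10, 10, 110, 30)]
current = [(12, 10, 112, 30), (200, 200, 260, 220)]
updated, matrix = integrate(stored, current)
```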
Suppression of false alarms: examples. [Figure panels: all detections; after suppression of false alarms]
Image enhancement. Multiple frame integration (averaging): integration of multiple frames to create a single image of higher quality. Super-resolution (interpolation): robust bi-linear and robust bi-cubic interpolation. An additional weight is included in the interpolation scheme, using Fi (i-th image), M (mean image) and V (standard deviation image). [Figure labels: M1, M2, M3, M4]
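The averaging part can be sketched in a few lines, assuming the tracked text box has already been cropped from each frame at the same size. Only the mean image and the standard deviation image are computed; the robust interpolation weight of the slide is not reproduced, and the function name `integrate_frames` is hypothetical.

```python
import math


def integrate_frames(frames):
    """Per-pixel mean and standard deviation over a list of equally sized frames."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[y][x] for f in frames) / n for x in range(w)] for y in range(h)]
    std = [[math.sqrt(sum((f[y][x] - mean[y][x]) ** 2 for f in frames) / n)
            for x in range(w)] for y in range(h)]
    return mean, std


# Three one-row frames: the first pixel varies over time, the second is stable.
frames = [[[10, 100]], [[20, 100]], [[30, 100]]]
mean_image, std_image = integrate_frames(frames)
```

A pixel that is stable over time (typically text) has a small standard deviation, while a pixel on a moving background has a large one.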
Interpolation: examples. [Figure panels: bi-linear interpolation; robust bi-cubic interpolation]
Interpolation: thresholded examples. [Figure panels: bi-linear interpolation; robust bi-cubic interpolation]
Binarisation. Different binarisation algorithms have been implemented and evaluated: • Fisher/Otsu and windowed Fisher/Otsu algorithm • Yanowitz-Bruckstein • Niblack, Sauvola • Our adaptive version of Niblack/Sauvola's method.
Binarisation methods. Yanowitz-Bruckstein: the threshold surface is calculated from the edge information. Windowed Fisher, Niblack, Sauvola: the threshold surface is calculated from the statistics collected in a window which is shifted across the image. [Figures: threshold surfaces]
Binarisation by Niblack. Niblack proposed a method which calculates a threshold surface by gliding a rectangular window over the image and calculating statistics on this window: $T = m + k \cdot s$, where m = mean, s = standard deviation, k = parameter (= -0.2).
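A minimal sketch of a Niblack-style threshold surface, using the commonly cited form T = m + k*s with k = -0.2. The window half-size `half`, the clipping of the window at the image borders and the assumption that text is darker than the background are choices made for this illustration only.

```python
import math


def window_stats(image, y, x, half):
    """Mean and standard deviation of the window centred on (y, x), clipped at the borders."""
    vals = [image[j][i]
            for j in range(max(0, y - half), min(len(image), y + half + 1))
            for i in range(max(0, x - half), min(len(image[0]), x + half + 1))]
    m = sum(vals) / len(vals)
    s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
    return m, s


def niblack_surface(image, half=1, k=-0.2):
    """Threshold surface T = m + k * s, one value per pixel."""
    surface = []
    for y in range(len(image)):
        row = []
        for x in range(len(image[0])):
            m, s = window_stats(image, y, x, half)
            row.append(m + k * s)
        surface.append(row)
    return surface


def binarise(image, surface):
    """1 marks a text pixel (assumed darker than its local threshold)."""
    return [[1 if image[y][x] < surface[y][x] else 0
             for x in range(len(image[0]))] for y in range(len(image))]


image = [[200, 200, 200],
         [200, 20, 200],
         [200, 200, 200]]
binary = binarise(image, niblack_surface(image))
```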
Binarisation by Niblack: problems. Light textures in the background are classified as text with small contrast. [Figure: example]
Binarisation: improvement by Sauvola. To overcome these problems, Sauvola et al. proposed an improved formula to calculate the threshold: $T = m \cdot (1 + k \cdot (s/R - 1))$, where m = mean, s = standard deviation, k = parameter (= 0.5), R = parameter (dynamic range of the standard deviation, R = 128). Rewriting the formula as $T = m - k \cdot (1 - s/R) \cdot m$ shows that a hypothesis on the gray values of text and non-text is used to remove the noise produced by background textures.
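The effect on a lightly textured background can be checked numerically. The sketch below compares the two thresholds for one window, using the commonly cited forms of both formulas; the sample window statistics (m = 200, s = 5) and the pixel value are made up for illustration.

```python
def niblack(m, s, k=-0.2):
    return m + k * s


def sauvola(m, s, k=0.5, R=128.0):
    return m * (1 + k * (s / R - 1))


# A bright background window with a faint texture: high mean, small deviation.
m, s = 200.0, 5.0
pixel = 195.0                       # a slightly darker background pixel

t_niblack = niblack(m, s)           # just below the mean
t_sauvola = sauvola(m, s)           # pulled far down because s is much smaller than R

is_text_niblack = pixel < t_niblack
is_text_sauvola = pixel < t_sauvola

# The same Sauvola threshold written as an offset from the mean.
t_reformulated = m - 0.5 * (1 - s / 128.0) * m
```

With these values the faint texture pixel falls below Niblack's threshold but stays above Sauvola's, which is the behaviour the slide describes.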
Binarisation by Sauvola: examples. [Figure panels: original image; binarised using Niblack's method; binarised using Sauvola et al.'s method]
Improvement: adaptive dynamic range. Fixing the dynamic range at R = 128 may be adequate for document images, but not for text boxes taken from videos: the binarisation will not be correct if the contrast of the image is smaller. We therefore set the parameter R to the maximum standard deviation over all windows: $R = \max(s)$. To avoid two passes of the windowing algorithm, the mean and standard deviation can be stored in a table during the first pass and the threshold surface calculated from this data. [Figure panels: Niblack; Sauvola, R = 128; R adaptive]
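A sketch of the table-based variant, assuming the per-window means and standard deviations have already been stored in two tables of equal size during a first pass. The function name `sauvola_adaptive` and the guard for a completely flat image (all deviations zero) are assumptions.

```python
def sauvola_adaptive(means, stds, k=0.5):
    """Sauvola-style threshold surface with R set to the largest stored deviation."""
    R = max(max(row) for row in stds)
    surface = []
    for mrow, srow in zip(means, stds):
        row = []
        for m, s in zip(mrow, srow):
            ratio = s / R if R > 0 else 0.0   # flat image: avoid division by zero
            row.append(m * (1 + k * (ratio - 1)))
        surface.append(row)
    return surface


# A low-contrast text box: the largest deviation is only 20, far below 128.
means = [[120.0, 130.0]]
stds = [[20.0, 5.0]]
adaptive = sauvola_adaptive(means, stds)
fixed_first = 120.0 * (1 + 0.5 * (20.0 / 128.0 - 1))   # same window with R = 128
```

In the highest-contrast window the adaptive threshold equals the local mean, whereas the fixed R = 128 pushes it far below, so darker text pixels of a low-contrast box would be missed.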
Improvement: shift of the image range. The strong hypothesis on the gray values (text pixels must be near zero) is not justified for some video text boxes. [Figure panels: Niblack; Sauvola, R = 128; R adaptive; gray value histogram]
Improvement: shift of the image range. A correction of the image's histogram resolves this problem. [Figure panels: original image; corrected image binarised, R adaptive] The same effect can also be achieved by changing the threshold formula: $T = m - k \cdot (1 - s/R) \cdot (m - M)$, where m = mean, s = standard deviation, k = parameter (= 0.5), R = maximum of the standard deviations of all windows, M = minimum gray value of the text box.
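The equivalence between shifting the histogram and changing the formula can be checked with a few lines. The sketch assumes the shifted threshold has the form T = m - k*(1 - s/R)*(m - M), which is what a Sauvola-style threshold gives when the minimum gray value M is subtracted from the image first and added back afterwards; the sample values are made up.

```python
def sauvola(m, s, R, k=0.5):
    return m * (1 + k * (s / R - 1))


def shifted_threshold(m, s, R, M, k=0.5):
    """Threshold with the text gray level assumed near M instead of near zero."""
    return m - k * (1 - s / R) * (m - M)


m, s, R, M = 150.0, 10.0, 40.0, 100.0

# Route 1: shift the gray values so that the minimum becomes 0, threshold, shift back.
via_histogram = sauvola(m - M, s, R) + M
# Route 2: apply the modified formula to the unshifted statistics.
via_formula = shifted_threshold(m, s, R, M)
```

Both routes give the same threshold because the standard deviation is unchanged by a constant shift of the gray values.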
Fast incremental calculation. Mean and variance can be calculated in one pass from two running sums, a and b. At the beginning of each line, the full window is calculated and the variables a and b are kept. After each shift, a and b are updated incrementally by subtracting the column of pixels which left the window (L) and adding the column which entered the window (R). Mean and standard deviation are stored in 2D tables; the maximum R = max(s) is then computed before calculating the threshold surface.
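The incremental scheme can be sketched as below. The slide's formula is lost, so it is an assumption here that `a` is the sum of the gray values in the window and `b` the sum of their squares, giving mean = a/n and variance = b/n - mean^2. Only window positions fully inside the image are computed, and the function name is hypothetical.

```python
import math


def window_stats_tables(image, win_h, win_w):
    """Tables of window means and standard deviations, plus R = max(s)."""
    H, W = len(image), len(image[0])
    n = win_h * win_w
    means, stds = [], []
    for top in range(H - win_h + 1):
        rows = image[top:top + win_h]
        # Sums of each pixel column over the window height (values and squares).
        col_a = [sum(r[x] for r in rows) for x in range(W)]
        col_b = [sum(r[x] * r[x] for r in rows) for x in range(W)]
        # Full window at the beginning of the line.
        a = sum(col_a[:win_w])
        b = sum(col_b[:win_w])
        mean_row, std_row = [], []
        for left in range(W - win_w + 1):
            if left > 0:
                # Add the column that entered, subtract the column that left.
                a += col_a[left + win_w - 1] - col_a[left - 1]
                b += col_b[left + win_w - 1] - col_b[left - 1]
            m = a / n
            var = max(b / n - m * m, 0.0)   # clamp tiny negative rounding errors
            mean_row.append(m)
            std_row.append(math.sqrt(var))
        means.append(mean_row)
        stds.append(std_row)
    R = max(max(row) for row in stds)
    return means, stds, R


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
means, stds, R = window_stats_tables(image, 2, 2)
```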
The experiments. Description of the experiments: • The videos used in the experiments. • Description of the evaluation process (OCR evaluation). Results for: • Text detection • Binarisation • OCR
Test videos. We performed experiments on 5 different MPEG-1 videos of resolution 384 x 288:
AIM 2: commercials; AIM 3: news; AIM 4: cartoon, news; AIM 5: news
Video example - France Télécom: ~22 minutes of video, ~33000 frames
The interface to the OCR software. Ideal situation: pass individual (binarised) text boxes to OCR software which recognises the contents box after box. In reality: we used standard commercial OCR software for our tests. This software has been designed to recognise scanned A4 or US letter pages and cannot directly process text boxes. [Figure: A4 page]
OCR page - manual. [Figure: an input image, ready for the OCR]
OCR Output 051 Q 07Ô 7 N*Verf 05 JQ 0707 PUBLICITE IPUBIIÏITE IPUBLICITE prenez boyard ^française FRANCE FRANCE c'est plus musclé iï 'J fort fort cot. Hf. Uet blé c. Q#tf. Uet blé uutàfruuk On va beaucoup {&*$ loin avec Itineris. Partout Partout Partout I 22 h 35 I 22 h 35 PUBLICITE PUBLICITE >3 h 55 l 23 h 55 l 23 h 55 20 h. 50120 h 50 |20 h 50120 h 50 , f ort boyard 2, 4 Kg J 2, 4 Kg g 2, 4 Kg J II II II gà dents IIH r Lessive classique lljir Lessive classique I[HT Lessive classique le temps le temps ^PUBLICITE I Par Amour du Goût. Il Par Amour du Goût. I en en en révolution
Post-processing of OCR output. Post-processed OCR output: 23 h 55 051 Q 07Ô 7 PUBLICITE prenez boyard ^française FRANCE c'est plus musclé fort blé cot. Hf. Uet uutàfruuk On va beaucoup {&*$ loin avec Itineris. Partout I 22 h 35 PUBLICITE >3 h 55 l 20 h. 50 , f ort boyard || Ground truth: dimanche 23 h 55 N Vert 05100707 Berlingo PUBLICITE prenez diffusion simultanée en stéréo sur boyard française FRANCE c'est plus musclé PUBLICITE fort Coral blé complet fruits On va beaucoup Plus loin avec Itineris. Bohême Partout 22 h 35 PUBLICITE 23 h 55 20 h 50 fort boyard
Automatic evaluation using markers. The manual processing of the OCR output (separation of the output strings and search for the corresponding input box) is time-consuming and error-prone, especially in cases where the quality of the OCR output is very poor. Automatic processing of the OCR output can be achieved by placing marker images between the text boxes. The marker boxes contain text which is easily recognised by the OCR software. In the results section we will present results for both types of evaluation.
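How markers make the output separable can be sketched with a regular expression. The marker text pattern (`Tkenchar` followed by a number, as it appears in the OCR evaluation example of this deck), the assumption that each marker precedes its text box, and the function name are all guesses made for illustration; the actual tool may work differently.

```python
import re

# Hypothetical marker format: the word "Tkenchar" followed by a box number.
MARKER = re.compile(r"Tkenchar\s+(\d+)")


def split_by_markers(ocr_text):
    """Map each marker number to the OCR text that follows it."""
    # With one capture group, re.split returns
    # [text before first marker, id1, text1, id2, text2, ...].
    parts = MARKER.split(ocr_text)
    return {int(parts[i]): parts[i + 1].strip()
            for i in range(1, len(parts) - 1, 2)}


raw = "Tkenchar 037 'gfrançaise Tkenchar 038 Mpe pire de| Tkenchar 039 @S"
boxes = split_by_markers(raw)
```

Each recovered string can then be compared with the ground truth of the text box carrying the same number.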
An input image with markers, ready for the OCR
OCR evaluation. [Diagram] Inputs: OCR output (example: Tkenchar 037 'gfrançaise Tkenchar 038 Mpe pire de| fj^e pire de| Tkenchar 039 @S @S), raw ground truth (example: Par Amour du Goût. en révolution la française le pire de 20 H 45) and structure log (# P T M T; Page 1: 1 1 2 2 3 2). Search output for individual text boxes: gives a list of strings, each corresponding to the output for a text box, possibly multiple times. Prepare ground truth: gives a list of strings, each corresponding to the ground truth for a text box; each string is repeated the same number of times as the corresponding text image in the OCR input image. Evaluation: gives transformation cost, recall and precision.
OCR evaluation: Wagner & Fischer. A measure for the resemblance of two character strings: the cost to transform string A into string B is calculated. Basic transformation operations (substitution, insertion, deletion) are used, each with a certain cost, and the total cost is minimised. [Figure example: "Airbag. Gtroônn" and "Airbag Citroën"]
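The Wagner-Fischer dynamic programme can be sketched as follows. The per-operation costs shown on the slide are lost, so unit costs for substitution, insertion and deletion are assumed here; the parameter names are illustrative.

```python
def edit_cost(a, b, sub=1, ins=1, dele=1):
    """Minimum total cost of transforming string a into string b."""
    # prev[j] = cost of transforming the first i-1 characters of a into b[:j].
    prev = [j * ins for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * dele]
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            cur.append(min(prev[j] + dele,                       # delete a[i-1]
                           cur[j - 1] + ins,                     # insert b[j-1]
                           prev[j - 1] + (0 if same else sub)))  # keep or substitute
        prev = cur
    return prev[len(b)]


cost = edit_cost("Airbag. Gtroônn", "Airbag Citroën")
```

Only two rows of the cost table are kept at a time, which is enough when just the final cost is needed and not the sequence of operations.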
Detection results - INA videos. No suppression of false alarms.
Binarisation methods: examples (two slides). [Figure panels: original image; Fisher (windowed); Yanowitz-Bruckstein + PP; Niblack; Sauvola et al.; our method]
OCR results - classification by binarisation method (robust bi-cubic interpolation). Results obtained using the manual evaluation method (no markers in the input page), 44 pages.
OCR results: interpolation methods (robust bi-linear interpolation, robust bi-cubic interpolation). Results obtained using the automatic evaluation method (including markers in the input page), 97 pages.
Conclusion • We developed a system for detection, tracking, enhancement and binarisation of text. • A detection performance of 93.5% is obtained. • We derived a new binarisation method adapted to the type of text found in videos. • The total recognition rate is surprisingly high, given the quality of the text, but not yet good enough for indexing purposes. • OCR integration problem: no software development kits for direct access to the recognition functions are available. A collaboration with an OCR company seems to be inevitable.
Outlook. Future work lies in extending the existing algorithms to text with more difficult properties, and in enhancing and studying the existing techniques in more depth. Scene text: the binarisation techniques developed in the last 30 years are aimed either at document images or at images from computer vision. The method we introduced in the framework of this project improves on the work already presented, but the quality of the text is not yet satisfactory. The binarisation of scene text in particular will demand the development of new methods. Detection recall: we are convinced that the recall of the detection system can still be increased by further research, e.g. on the binarisation technique applied to the map of accumulated gradients.