Detection and Extraction of Artificial Text from Videos
Christian Wolf and Jean-Michel Jolion
10th July 2001
PROJECT France Télécom Research & Development 001 B 575
Laboratoire de Reconnaissance de Formes et Vision, Bât. Jules Verne, INSA, 69621 Villeurbanne CEDEX
http://rfv.insa-lyon.fr/~{wolf,jolion}

Plan of the presentation
  • Introduction (6 slides)
  • Detection (8 slides)
  • Image enhancement - multiple frame integration (3 slides)
  • Binarisation of the text boxes (10 slides)
  • Setup of the experiments (11 slides)
  • Results: detection, binarisation, OCR (6 slides)
  • Conclusion and outlook (2 slides)
Total: 46 slides

Content-based image retrieval
[Figure: an example image is compared with the images of the indexing phase through a similarity function, giving the result]

Similarity measures
[Figure: image pairs labelled "similar" and "not similar"]

Indexing using text
[Figure: keyword-based search. Indexing phase: text found in the frames, e.g. "Patrick Mayhew", "Min. chargé de l´irlande de Nord", "ISRAEL Jerusalem", "montage T. Nouel". Key word: "Patrick Mayhew". Result: the matching frames.]

Video properties
[Figure: video frame examples annotated with sizes of 80 px, 12 px and 8 px]

Text extraction: general scheme
Video → detection of the text in single frames → tracking → image enhancement (multiple frame integration) → segmentation/binarisation → OCR
Example output: "EVENEMENT", "ACTU", "SPELEOS", "Gouffre Berger (Isére)", "aujourd'hui", "France 3 Alpes", "un spéléologue sauveteur"

Detection in single frames
Video → calculation of the gradient → accumulation → binarisation → mathematical morphology → connected components analysis → verification of geometric constraints → verification of special cases → combination of the rectangles → list of rectangles

Detection in single frames: examples

A filter for text detection
Accumulation of horizontal gradients. Justification: text forms a regular texture containing vertical edges which are aligned horizontally.
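The accumulation filter can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the forward-difference gradient, the window half-width `half_width` and the function name are assumptions made for the example.

```python
def accumulated_horizontal_gradient(image, half_width=6):
    """Sum the absolute horizontal gradient over a horizontal window.

    image: list of rows of gray values. Returns a response map of the same size.
    The gradient operator and the window size are illustrative assumptions.
    """
    height, width = len(image), len(image[0])
    # Horizontal gradient magnitude (forward difference, zero in the last column).
    grad = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width - 1):
            grad[y][x] = float(abs(image[y][x + 1] - image[y][x]))
    # Accumulate the magnitudes over a window centred on each pixel.
    response = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            lo, hi = max(0, x - half_width), min(width, x + half_width + 1)
            response[y][x] = sum(grad[y][lo:hi])
    return response

# A row of dense vertical strokes (text-like) against a flat row (background).
text_like = [0, 255] * 8
flat = [128] * 16
resp = accumulated_horizontal_gradient([text_like, flat], half_width=3)
```

The text-like row gives a high response along its whole length, while the flat row gives zero, which is the property the slide's justification relies on.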

Mathematical morphology
  • Close
  • Deletion of small bridges between the components
  • Dilate (special) to connect characters
  • Erode horizontally
  • Dilate horizontally

Detection in video sequences
  • Detection per single frame: list of rectangles per frame
  • Tracking: keeping track of text occurrences
  • Suppression of false alarms
  • Image enhancement: multiple frame integration
[Figure: text occurrences plotted against frame number (time)]

Integration of the rectangles into text occurrences
At every new frame, the detected rectangles must be matched with the stored text occurrences:
  • List of rectangles detected for the current frame
  • Text occurrences: list containing the most recent rectangle of each text occurrence
The integration is done using overlap information (overlap matrix).
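A possible reading of the matching step is sketched below. The slide does not define the overlap measure or the decision rule, so the measure (intersection area divided by the area of the smaller rectangle), the threshold of 0.5 and the greedy best-match rule are all assumptions, as are the function names.

```python
def area(r):
    """Area of a rectangle given as (x0, y0, x1, y1)."""
    return (r[2] - r[0]) * (r[3] - r[1])

def intersection_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)

def overlap_matrix(detected, stored):
    """Rows: rectangles of the current frame. Columns: stored occurrences."""
    return [[intersection_area(d, s) / min(area(d), area(s)) for s in stored]
            for d in detected]

def integrate(detected, occurrences, threshold=0.5):
    """Append each detected rectangle to its best-overlapping occurrence,
    or open a new occurrence if nothing overlaps enough (assumed rule)."""
    stored = [occ[-1] for occ in occurrences]  # most recent rectangle of each
    matrix = overlap_matrix(detected, stored)
    for rect, row in zip(detected, matrix):
        best = max(range(len(row)), key=row.__getitem__, default=None)
        if best is not None and row[best] >= threshold:
            occurrences[best].append(rect)
        else:
            occurrences.append([rect])
    return occurrences

occurrences = [[(10, 10, 110, 30)]]                   # one known text occurrence
detected = [(12, 11, 112, 31), (200, 200, 260, 220)]  # rectangles of the new frame
integrate(detected, occurrences)
```

In this example the first rectangle continues the existing occurrence and the second one starts a new occurrence. Rectangles are assumed to have non-zero area.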

Suppression of false alarms: Examples
[Figure: all detections; after suppression of false alarms]

Image enhancement
Integration of multiple frames to create a single image of higher quality.
  • Multiple frame integration: averaging
  • Super-resolution (interpolation): robust bi-linear, robust bi-cubic
An additional weight is included in the interpolation scheme; it is based on the following images:
  • F_i: ith image
  • M: mean image
  • V: standard deviation image
[Figure labels: M1, M2, M4, M3]
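The averaging step and the role of the images M and V can be illustrated with a small sketch. The weight formula of the slide is not available in the text, so the weight used in `robust_average` is a hypothetical stand-in that merely down-weights pixels far from the mean. The sketch also leaves out the spatial interpolation and only integrates over time.

```python
import math

def mean_and_std_images(frames):
    """Pixel-wise mean image M and standard deviation image V of the frames F_i."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[y][x] for f in frames) / n for x in range(width)]
            for y in range(height)]
    std = [[math.sqrt(sum((f[y][x] - mean[y][x]) ** 2 for f in frames) / n)
            for x in range(width)] for y in range(height)]
    return mean, std

def robust_average(frames):
    """Weighted temporal average; the weight is an assumed form, not the slide's."""
    mean, std = mean_and_std_images(frames)
    height, width = len(mean), len(mean[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Hypothetical weight: small when F_i deviates from M relative to V.
            weights = [1.0 / (1.0 + abs(f[y][x] - mean[y][x]) / (1.0 + std[y][x]))
                       for f in frames]
            total = sum(w * f[y][x] for w, f in zip(weights, frames))
            result[y][x] = total / sum(weights)
    return result

# Four 1x1 "frames": three agree, one is an outlier (e.g. moving background).
frames = [[[100]], [[100]], [[100]], [[200]]]
M, V = mean_and_std_images(frames)
robust = robust_average(frames)
```

With one outlier frame, the plain mean is pulled to 125, whereas the weighted average stays closer to the value on which the other frames agree.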

Interpolation: Examples
[Figure: bi-linear interpolation; robust bi-cubic interpolation]

Interpolation: thresholded examples
[Figure: bi-linear interpolation; robust bi-cubic interpolation]

Binarisation
Different binarisation algorithms have been implemented and evaluated:
  • Fisher/Otsu and windowed Fisher/Otsu algorithm
  • Yanowitz-Bruckstein
  • Niblack, Sauvola
  • Our adaptive version of Niblack/Sauvola´s method

Binarisation methods
  • Yanowitz-Bruckstein: the threshold surface is calculated from the edge information.
  • Windowed Fisher, Niblack, Sauvola: the threshold surface is calculated from the statistics collected in a window which is shifted across the image.
[Figures: threshold surfaces]

Binarisation by Niblack
Niblack proposed a method which calculates a threshold surface by gliding a rectangular window over the image and calculating statistics on this window:
T = m + k·s
  • m: mean
  • s: standard deviation
  • k: parameter, k = -0.2
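A minimal sketch of this windowed thresholding, assuming the usual form of Niblack's threshold, T = m + k·s, and dark text on a lighter background. The window half-size `half`, the border handling (windows clipped at the image border) and the function names are illustrative choices.

```python
import math

def window_stats(image, half):
    """Mean and standard deviation of the (2*half+1)^2 window around each pixel.
    Windows are clipped at the image border (an assumption of this sketch)."""
    height, width = len(image), len(image[0])
    means = [[0.0] * width for _ in range(height)]
    stds = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            vals = [image[j][i]
                    for j in range(max(0, y - half), min(height, y + half + 1))
                    for i in range(max(0, x - half), min(width, x + half + 1))]
            m = sum(vals) / len(vals)
            means[y][x] = m
            stds[y][x] = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
    return means, stds

def niblack_binarise(image, half=1, k=-0.2):
    """True where the pixel lies below the local threshold T = m + k*s."""
    means, stds = window_stats(image, half)
    return [[image[y][x] < means[y][x] + k * stds[y][x]
             for x in range(len(image[0]))] for y in range(len(image))]

# A light 5x5 patch with a single dark pixel in the middle.
image = [[200] * 5 for _ in range(5)]
image[2][2] = 50
binary = niblack_binarise(image)
```

Only the dark centre pixel is marked as text here. This brute-force version recomputes every window from scratch and is meant for clarity, not speed.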

Binarisation by Niblack: Problems
Light textures in the background are treated as text with small contrast.

Binarisation: Improvement by Sauvola
To overcome these problems, Sauvola et al. proposed an improved formula to calculate the threshold:
T = m·(1 + k·(s/R - 1))
  • m: mean
  • s: standard deviation
  • k: parameter, k = 0.5
  • R: parameter (dynamic range of the standard deviation), R = 128
A reformulation shows that a hypothesis on the gray values of text and non-text (text pixels are near zero) is used to remove the noise produced by background textures:
T = m - k·(1 - s/R)·m
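The effect of Sauvola's formula can be checked numerically. The sketch assumes the commonly cited form T = m·(1 + k·(s/R - 1)) and its algebraic rewriting T = m - k·(1 - s/R)·m; the function names are chosen for the example.

```python
def sauvola_threshold(m, s, k=0.5, R=128.0):
    """Sauvola et al.: T = m * (1 + k * (s / R - 1))."""
    return m * (1.0 + k * (s / R - 1.0))

def sauvola_reformulated(m, s, k=0.5, R=128.0):
    """Same threshold written as a shift of the mean towards gray value 0."""
    return m - k * (1.0 - s / R) * m

# Flat light background (no contrast): the threshold falls to half the mean,
# so light background texture is no longer classified as text.
flat_background = sauvola_threshold(200.0, 0.0)
# Window with maximal contrast (s = R): the threshold equals the mean.
high_contrast = sauvola_threshold(200.0, 128.0)
```

The second form makes the hidden hypothesis visible: the mean is pulled towards zero, which only makes sense if text pixels are close to zero.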

Binarisation by Sauvola, examples
[Figure: original image; binarised using Niblack´s method; binarised using Sauvola et al.´s method]

Improvement: Adaptive dynamic range
Fixing the dynamic range at R = 128 may be adequate for document images, but not for text boxes taken from videos: the binarisation will not be correct if the contrast of the image is smaller. We therefore set the parameter R to the maximum standard deviation over all windows:
R = max(s)
To avoid two passes of the windowing algorithm, the mean and standard deviation can be stored in a table during the first pass, and the threshold surface is then calculated from this data.
[Figure: Niblack; Sauvola with R = 128; adaptive R]

Improvement: Shift of the image range
The strong hypothesis on the gray values (text pixels must be near zero) is not justified for some video text boxes.
[Figure: gray value histogram; Niblack; Sauvola with R = 128; adaptive R]

Improvement: Shift of the image range
A correction of the image´s histogram resolves this problem.
[Figure: original image; corrected image binarised with adaptive R]
The same effect can also be achieved by changing the threshold formula:
T = m - k·(1 - s/R)·(m - M)
  • m: mean
  • s: standard deviation
  • k: parameter, k = 0.5
  • R: maximum of the standard deviation of all windows
  • M: minimum gray value of the text box
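A small numerical sketch of the modified threshold, assuming the form T = m - k·(1 - s/R)·(m - M). The sample values (a low-contrast box whose gray values start at 150) are invented for the illustration.

```python
def adaptive_threshold(m, s, R, M, k=0.5):
    """Threshold shifted towards the minimum gray value M instead of towards 0.

    m, s: window mean and standard deviation
    R:    maximum standard deviation over all windows of the text box
    M:    minimum gray value of the text box
    """
    return m - k * (1.0 - s / R) * (m - M)

# Invented low-contrast text box: gray values start at 150.
m, s, R, M = 180.0, 10.0, 20.0, 150.0
adaptive = adaptive_threshold(m, s, R, M)

# Sauvola's original threshold with the fixed R = 128, for comparison.
sauvola = m * (1.0 + 0.5 * (s / 128.0 - 1.0))
```

In this example Sauvola's threshold falls below the darkest pixel of the box, so nothing would be classified as text, while the adaptive threshold stays inside the gray value range of the box. With M = 0 and R = 128 the adaptive formula reduces to Sauvola's.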

Fast incremental calculation
Mean and variance can be calculated in one pass. At the beginning of each line, the full window is calculated and the variables a and b are kept. After each shift, a and b are updated incrementally by subtracting the column of pixels which left the window and adding the column which entered the window.
Mean and standard deviation are stored in 2D tables; the maximum R = max(s) is then computed before calculating the threshold surface.
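The incremental update can be sketched as follows. The slide does not say what a and b hold; the sketch assumes a is the sum of the gray values in the window and b the sum of their squares, so that mean = a/n and variance = b/n - mean². Only windows lying fully inside the image are computed, which is a simplification.

```python
import math

def incremental_stats(image, win_h, win_w):
    """Tables of window mean and standard deviation, plus R = max(s).

    Assumes a = sum of gray values and b = sum of squared gray values.
    Entry [top][left] describes the window whose top-left corner is there.
    """
    height, width = len(image), len(image[0])
    n = win_h * win_w
    means, stds = [], []
    for top in range(height - win_h + 1):
        rows = image[top:top + win_h]
        # Beginning of the line: compute the full window once.
        a = sum(sum(r[:win_w]) for r in rows)
        b = sum(v * v for r in rows for v in r[:win_w])
        mean_row, std_row = [], []
        for left in range(width - win_w + 1):
            if left > 0:
                out_col, in_col = left - 1, left + win_w - 1
                for r in rows:
                    # Remove the column that left, add the column that entered.
                    a += r[in_col] - r[out_col]
                    b += r[in_col] ** 2 - r[out_col] ** 2
            m = a / n
            mean_row.append(m)
            std_row.append(math.sqrt(max(0.0, b / n - m * m)))
        means.append(mean_row)
        stds.append(std_row)
    R = max(max(row) for row in stds)
    return means, stds, R

image = [[(3 * x + 7 * y * y) % 11 for x in range(6)] for y in range(5)]
means, stds, R = incremental_stats(image, 3, 3)
```

Each shift costs one column of additions and subtractions instead of a full window, and the stored tables allow the threshold surface to be computed afterwards without a second pass over the windows.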

The experiments
Description of the experiments:
  • The videos used in the experiments
  • Description of the evaluation process (OCR evaluation)
Results for:
  • Text detection
  • Binarisation
  • OCR

Test videos
We performed experiments on 5 different MPEG-1 videos of resolution 384 x 288.

  • AIM 2: Commercials
  • AIM 3: News
  • AIM 4: Cartoon, News
  • AIM 5: News

Video example - France Télécom
~22 minutes of video, ~33000 frames

The interface to the OCR software
Ideal situation: pass individual (binarised) text boxes to OCR software which recognises the contents box after box.
In reality: we used standard commercial OCR software for our tests. This software has been designed to recognise scanned A4 or US letter pages and cannot directly process text boxes.
[Figure: A4 page]

OCR Page - Manual
[Figure: an input image, ready for the OCR]

OCR Output
051 Q 07Ô 7 N*Verf 05 JQ 0707 PUBLICITE IPUBIIÏITE IPUBLICITE prenez boyard ^française FRANCE FRANCE c'est plus musclé iï 'J fort fort cot. Hf. Uet blé c. Q#tf. Uet blé uutàfruuk On va beaucoup {&*$ loin avec Itineris. Partout Partout Partout I 22 h 35 I 22 h 35 PUBLICITE PUBLICITE >3 h 55 l 23 h 55 l 23 h 55 20 h. 50120 h 50 |20 h 50120 h 50 , f ort boyard 2, 4 Kg J 2, 4 Kg g 2, 4 Kg J II II II gà dents IIH r Lessive classique lljir Lessive classique I[HT Lessive classique le temps le temps ^PUBLICITE I Par Amour du Goût. Il Par Amour du Goût. I en en en révolution

Post processing of OCR output
Post-processed OCR output:
23 h 55 051 Q 07Ô 7 PUBLICITE prenez boyard ^française FRANCE c'est plus musclé fort blé cot. Hf. Uet uutàfruuk On va beaucoup {&*$ loin avec Itineris. Partout I 22 h 35 PUBLICITE >3 h 55 l 20 h. 50 , f ort boyard
Ground truth:
dimanche 23 h 55 N Vert 05100707 Berlingo PUBLICITE prenez diffusion simultanée en stéréo sur boyard française FRANCE c'est plus musclé PUBLICITE fort Coral blé complet fruits On va beaucoup Plus loin avec Itineris. Bohême Partout 22 h 35 PUBLICITE 23 h 55 20 h 50 fort boyard

Automatic evaluation using markers
The manual processing of the OCR output (separating the output strings and searching for the corresponding input box) is time-consuming and error-prone, especially when the quality of the OCR output is very poor.
Automatic processing of the OCR output can be achieved by placing marker images between the text boxes. The marker boxes contain text which is easily recognised by the OCR software.
In the results section we present results for both types of evaluation.

[Figure: an input image with markers, ready for the OCR]

OCR Evaluation
  • OCR output, e.g.: Tkenchar 037 'gfrançaise Tkenchar 038 Mpe pire de| fj^e pire de| Tkenchar 039 @S @S
  • Raw ground truth, e.g.: "Par Amour du Goût.", "en révolution", "la française", "le pire de", "20 H 45"
  • Structure log, e.g.: # P T M T, Page 1: 1 1 2 2 3 2
  • Search output for individual text boxes: gives a list of strings, each corresponding to the output for a text box, possibly multiple times.
  • Prepare ground truth: gives a list of strings, each corresponding to the ground truth for a text box. Each string is repeated the same number of times as the corresponding text image in the OCR input image.
  • Evaluation: transformation cost, recall, precision.

OCR Evaluation: Wagner & Fischer
A measure for the resemblance of two character strings: the cost to transform string A into string B is calculated. Basic transformation operations (substitution, insertion, deletion) are used, each with a certain cost, and the total cost is minimised.
Example: OCR output "Airbag. Gtroônn", ground truth "Airbag Citroën".
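The Wagner & Fischer measure is computed by dynamic programming. The sketch below is a standard formulation; the cost values of the slide are not available in the text, so unit costs for all three operations are assumed as defaults, and the function name is chosen for the example.

```python
def transformation_cost(a, b, sub_cost=1, ins_cost=1, del_cost=1):
    """Minimal cost to transform string a into string b (Wagner & Fischer).

    The unit costs are an assumption; pass other values to weight the operations.
    """
    # prev[j]: cost of transforming the first i-1 characters of a into b[:j].
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * del_cost]  # delete all i characters to reach the empty string
        for j in range(1, len(b) + 1):
            substitution = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else sub_cost)
            deletion = prev[j] + del_cost
            insertion = cur[j - 1] + ins_cost
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1]

# Cost between an OCR string and its ground truth from the slide's example.
cost = transformation_cost("Airbag. Gtroônn", "Airbag Citroën")
```

Only two rows of the cost table are kept in memory, which is sufficient when the cost alone is needed and the sequence of operations does not have to be recovered.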

Detection results - INA Videos
No suppression of false alarms

Binarisation methods: Examples
[Figure: original image; Fisher (windowed); Yanowitz-Bruckstein + PP; Niblack; Sauvola et al.; our method]

OCR Results - Classification by binarisation method
Robust bi-cubic interpolation. Results obtained using the manual evaluation method (no markers in the input page), 44 pages.

OCR Results: Interpolation methods
Robust bi-linear interpolation and robust bi-cubic interpolation. Results obtained using the automatic evaluation method (including markers in the input page), 97 pages.

Conclusion
  • We developed a system for detection, tracking, enhancement and binarisation of text.
  • A detection performance of 93.5% is obtained.
  • We derived a new binarisation method adapted to the type of text found in videos.
  • The total recognition rate is surprisingly high, given the quality of the text, but not yet good enough for indexing purposes.
  • OCR integration problem: no software development kits are available for direct access to the recognition functions. A collaboration with an OCR company seems to be inevitable.

Outlook
Future work lies in extending the existing algorithms to text with more difficult properties, and in improving and studying the existing techniques in more depth:
  • Scene text: the binarisation techniques developed in the last 30 years are aimed either at document images or at images from computer vision. The method we introduced in the framework of this project is an improvement on the work already presented, but the quality of the text is not yet satisfactory. The binarisation of scene text in particular will demand the development of new methods.
  • Detection recall: we are convinced that the recall of the detection system can still be increased by further research, e.g. on the binarisation technique applied to the map of accumulated gradients.