  • Number of slides: 86

Image Processing and Computer Vision
Lecture 4, Multimedia E-Commerce Course
November 5, 2002
Mike Christel (significant input by Henry Schneiderman, http://www.cs.cmu.edu/~hws)
© Copyright 2002 Michael G. Christel and Alexander G. Hauptmann, Carnegie Mellon

Outline
• Defining Image Processing and Computer Vision
• Emerging Technology
  • Digitization of documents
  • Digitization of images/photographs
  • Biometrics
  • Management of images on computers
  • Other: manufacturing, military, games, …
• Research in Image Processing and Computer Vision
  • Automatically Finding Faces and Cars
  • Content-based Image Retrieval

Image Processing vs. Computer Vision
• Image Processing
  • Research area within electrical engineering/signal processing
  • Focus on syntax, low-level features
• Computer Vision
  • Research area within computer science/artificial intelligence
  • Focus on semantics, symbolic or geometric descriptions
[Figure labels: image; Faces, People, Chairs, etc.]

Optical Character Recognition (OCR)
• First patent in OCR in 19th century
• First applications in post offices and banks
• Documents easier to distribute, search, organize, and edit in digital form
  • Typewriter has been replaced by word processor
  • Lots of legacy materials (the world’s libraries of books) available only in print
• State of the art not perfect, but 99% accurate on cleanly printed pages
• Examples of errors...

Heavy Print
Output from 3 commercial OCR systems

Light Print

Stray Marks

Typography

Processing Overlaid Text in Video
The Video OCR (VOCR) process used by the Informedia research group at Carnegie Mellon:
Video → Text Area Detection → Text Area Preprocessing → Commercial OCR → ASCII Text

Text Area Detection

[Figure panels: Video Frames; Filtered Frames; AND-ed Frames (1/2 s intervals)]
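
The AND-ing step named on this slide can be illustrated with a minimal sketch. It assumes the "filter" is a simple horizontal-gradient threshold and that overlaid text stays fixed while the scene behind it moves; the function names, the threshold, and the filter itself are hypothetical and are not taken from the Informedia system.

```python
import numpy as np

def edge_mask(frame, threshold=40):
    """Binary mask of strong horizontal intensity changes (assumed filter)."""
    grad = np.abs(np.diff(frame.astype(int), axis=1))
    mask = np.zeros(frame.shape, dtype=bool)
    mask[:, 1:] = grad > threshold
    return mask

def persistent_text_mask(frames, threshold=40):
    """AND the per-frame masks so only edges present in every frame survive."""
    masks = [edge_mask(f, threshold) for f in frames]
    return np.logical_and.reduce(masks)

# Two toy grayscale frames: a static "caption" pixel and a moving bright object.
frame_a = np.zeros((6, 8), dtype=np.uint8)
frame_b = np.zeros((6, 8), dtype=np.uint8)
frame_a[4, 3] = frame_b[4, 3] = 255   # static overlay, same place in both frames
frame_a[1, 1] = 255                   # moving object, frame A position
frame_b[1, 5] = 255                   # moving object, frame B position

text_mask = persistent_text_mask([frame_a, frame_b])
```

Edges from the moving object appear in only one mask each, so the AND removes them and leaves the static overlay region.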

VOCR Preprocessing Problems

Augmenting VOCR with Dictionary Look-up
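
One way dictionary look-up could clean up OCR output is to snap each recognized token to its closest vocabulary entry. The sketch below uses Python's standard `difflib` for the fuzzy match; the vocabulary, the cutoff, and the `correct_token` helper are illustrative assumptions, not the method used in the VOCR system.

```python
import difflib

def correct_token(token, vocabulary, cutoff=0.7):
    """Return the closest vocabulary word, or the token unchanged if none is close."""
    matches = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else token

# Hypothetical vocabulary and a typical OCR confusion ("1" read for "L").
vocabulary = ["clinton", "president", "washington"]
fixed = correct_token("C1INTON", vocabulary)
unknown = correct_token("xqzv", vocabulary)
```

A cutoff that is too low would "correct" genuine out-of-vocabulary words such as names, so the threshold is a trade-off rather than a fixed value.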

Handwriting Recognition
• Natural progression from OCR work for print
• Works if constraints are placed on the writer, e.g. Palm Pilot, where the user is asked to conform to a specific style or convention

Other Document Processing
• Not just for text...
• Examples:
  • Engineering document to CAD file
  • Maps to GIS format
  • Music score to MIDI representation

Digital Cameras = Convenience
• Easy to capture photos
• Easy to store and organize photos
• Easy to duplicate photos
• Easy to edit photos
• Rough Multimedia eCommerce class survey:
  • 1999: 10% own digital cameras
  • 2000: 25%
  • 2001: 50%
  • 2002: ??

Digital Camera Cautions
Via “Photo Industry Reporter” e-Magazine at: http://www.photoreporter.com/2002/1021/photokina_report_look_at_35mm.html
• Film cameras still outsell digital cameras by almost three to one
• The household penetration of digital is at about 15%
• “But let’s face it: film’s days are numbered. Anyone staying solely with film these days will have a glorious buggy whip in a market that will be clamoring for cars.”

Digital Camera Growth
• Photo Marketing Association on US digital camera sales:
  • 4.5 million in 2000
  • 6.9 million in 2001
  • Projected 9.3 million for 2002
• http://www.visioneer.com/About/press/june2402.html
  • InfoTrends Research Group estimates that the U.S. photo-enabled TV set-top installed base will grow from less than 1 million units in 2002 to over 114 million units in 2006. Household penetration will climb from under 1% to around 85%.
  • InfoTrends projects digital camera sales to grow at a rate of 38% through 2003

State of the Art: Digital Cameras
• Film is currently better in resolution and color
• Professional photographers
  • Digital for low-quality newspaper advertisements
  • Film for portrait photos
• Computer storage limitations: 1 high-resolution digital image = 20-25 Megabytes
• http://pic.templetons.com/brad/photo/pixels.html
  • 3500 line pairs/35 mm or about 5000 dots/inch, but grainy
  • At 3:2 frame size, ~20 million pixels
  • Conclusion: “a 5300 x 4000 digital camera would produce a shot equivalent to a scan from a quality 35 mm camera - provided you can get more than 8 bits per pixel. …A 3000 x 2000 digital camera would match the 35 mm for a good percentage of shots.”
• Printing: home printers not comparable to commercial printers
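
The pixel counts quoted on this slide are easy to check with a few lines of arithmetic. The sketch below also estimates uncompressed storage; the bytes-per-pixel values are assumptions for illustration, and whether the slide's storage figure was derived this way is not stated in the source.

```python
def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height

def uncompressed_megabytes(width, height, bytes_per_pixel):
    """Raw storage in megabytes (10**6 bytes), ignoring headers and compression."""
    return pixel_count(width, height) * bytes_per_pixel / 1_000_000

film_equivalent = pixel_count(5300, 4000)           # the "5300 x 4000" camera
good_enough = pixel_count(3000, 2000)               # the "3000 x 2000" camera
one_byte = uncompressed_megabytes(5300, 4000, 1)    # assumed 8 bits per pixel
three_bytes = uncompressed_megabytes(5300, 4000, 3) # assumed 24-bit RGB
```

Under these assumptions the 5300 x 4000 image has about 21 million pixels, in line with the "~20 million pixels" estimate, while the 3000 x 2000 image has 6 million.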

Future of Digital Cameras
• Improved resolution and color
• “Smart” cameras
  • More programmable features
  • Auto-focus on object of interest
  • “Everything in focus” photo
  • Capture photo when event X occurs

Biometrics
• Technology for identification
  • Finger/palm print
  • Iris
  • Face

Fingerprints
• Minutiae – splits and merges of ridges

Face Identification
• Not quite reliable yet.
• Performance degrades rapidly with uncontrolled lighting, facial expression, and size of database
• Several companies exist:
  • Visionics (Rockefeller University spin-off)
  • Viisage (MIT spin-off)
  • EyeMatic (USC spin-off)
  • Miros (MIT spin-off)
  • Banque-Tec Intl (Australia)
  • C-VIS Computer Vision (Germany)
  • LAU Technologies
• Commercial systems installed in London and Brazil to catch criminals

Automatic Age Progression
[Figure panels: Original Image (1962); Computer-Aged (1997); Actual Photo (1997)]

Management of images on computers
• Compression – reducing storage size needed for images
• Watermarking – Protecting copyright
  • Microsoft, Bell Labs, NEC, etc.
[Figure: Visible watermark]

Photo Manipulation
• Adobe Photoshop, Corel PhotoPaint, Pixami, PhotoIQ, etc.
• Image editing: crop an image, adjust the color, paint over part of any image, airbrush part of an image, combine images, etc.
• Future: Applications of computer vision, e.g., discriminating foreground from background.

Online Digital Image Collections
• Stock photos of use to graphic designers, artists, etc.
• Large collections of images exist
  • Corbis: 67 million images
  • Getty: 70 million stock photography images
  • AP collects 1000s of digitized images per day

Inspection for Manufacturing
• Occum – inspection of printed circuit boards ($100M / year)
• Cognex – Do-it-yourself toolkits for inspection (400 employees)

Automatic Target Recognition (ATR)
• Finding mines, tanks, etc.
• Billion dollar a year industry
• Lockheed Martin, TSR, Northrop Grumman, other aerospace contractors.
• Various types of imagery:
  • Synthetic Aperture Radar (SAR), Sonar, hyper-spectral imagery (more than 3 colors)

Aerial Photo Interpretation
• Also referred to as “automated cartography”
• Classification of land-use: forest, vegetation, water
• Identification of man-made objects: buildings, roads, etc.

Better Security Cameras
• Cameras that are responsive to the environment
  • Track and zoom on moving objects
  • Automatic adjustment of contrast

Medical imagery
• Medical image libraries for study and diagnosis
• Image overlay to guide surgeons

History
• 1980’s: ~100 companies – manufacturing applications mostly
• Early 1990’s: fewer than 10 companies
• Late 1990’s: ~100 companies – face recognition, intelligent teleconferencing, inspection, digital libraries, medical imaging

Image Processing: Filtering
Enhancing an image’s quality for human viewing, e.g., in medical imaging or in telescopic views of space

Image Processing: Compression
• Lossless – No loss in quality: gif, tiff
• Lossy – Original image cannot be reconstructed exactly: jpeg
• New work on advancing lossy compression strategies with fewer visual artifacts: JPEG 2000 and wavelet transformations
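
The lossless/lossy distinction can be shown on a small byte array standing in for grayscale pixels. This sketch uses `zlib` (DEFLATE) as a stand-in for a lossless codec and plain quantization as a stand-in for a lossy one; neither is the algorithm actually used by GIF, TIFF, or JPEG, and the synthetic image is an assumption.

```python
import zlib

# Synthetic 32x32 grayscale "image" as raw bytes (values 0..255).
pixels = bytes((x * 7 + y * 13) % 256 for y in range(32) for x in range(32))

# Lossless: decompression returns exactly the original bytes.
restored = zlib.decompress(zlib.compress(pixels))

def quantize(data, step=16):
    """Lossy step: round every value down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)

# Lossy: information below the quantization step is discarded for good.
lossy = quantize(pixels)
max_error = max(abs(a - b) for a, b in zip(pixels, lossy))
```

After quantization no decoder can recover the discarded low-order detail, which is what the slide means by the original not being reconstructible.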

Image Processing: Watermarking
• Information hiding
• Protecting copyright

Image Processing: Transformation
• Transforming an image can make it easier to analyze
[Figure: Wavelet transform of image]

Wavelet Coefficients
[Figure panels: Horizontal LP, Vertical LP; Horizontal HP, Vertical LP; Horizontal LP, Vertical HP; Horizontal HP, Vertical HP]

5/3 Linear Phase Wavelets
Linear phase 5/3:
c[n] = {-1, 2, 6, 2, -1}, d[n] = {1, -2, 1}
g[n] = {1, 2, -6, 2, 1}, f[n] = {1, 2, 1}
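
A one-level separable decomposition into the four subbands named on the previous slide might look like the sketch below. It assumes c[n] is the analysis low-pass filter and d[n] the analysis high-pass filter, scales them by 1/8 and 1/2 (the slide gives integer taps only), and uses symmetric border extension. It covers analysis only and is not a complete JPEG 2000 transform.

```python
import numpy as np

# Analysis taps from the slide; the scale factors are assumptions chosen so
# that the low-pass filter has unit gain on constant regions.
C_LOWPASS = np.array([-1, 2, 6, 2, -1]) / 8.0
D_HIGHPASS = np.array([1, -2, 1]) / 2.0

def filter_and_downsample(image, taps, axis):
    """Filter along one axis with reflected borders, then keep every 2nd sample."""
    pad = len(taps) // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    padded = np.pad(image, pad_width, mode="reflect")
    filtered = np.apply_along_axis(
        lambda v: np.convolve(v, taps, mode="valid"), axis, padded)
    index = [slice(None), slice(None)]
    index[axis] = slice(0, None, 2)
    return filtered[tuple(index)]

def wavelet_level(image):
    """Return the four subbands of one decomposition level."""
    image = np.asarray(image, dtype=float)
    h_lp = filter_and_downsample(image, C_LOWPASS, axis=1)   # horizontal LP
    h_hp = filter_and_downsample(image, D_HIGHPASS, axis=1)  # horizontal HP
    return {
        "hLP_vLP": filter_and_downsample(h_lp, C_LOWPASS, axis=0),
        "hHP_vLP": filter_and_downsample(h_hp, C_LOWPASS, axis=0),
        "hLP_vHP": filter_and_downsample(h_lp, D_HIGHPASS, axis=0),
        "hHP_vHP": filter_and_downsample(h_hp, D_HIGHPASS, axis=0),
    }

# A flat image has no detail, so only the low-pass/low-pass band is non-zero.
bands = wavelet_level(np.full((8, 8), 5.0))
```

Applying `wavelet_level` again to the `hLP_vLP` band would give the next, coarser level of the decomposition.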

Computer Vision: 3D Shape Reconstruction
• Use images to build 3D model of object or site
[Figure: 3D site model built from laser range scans collected by CMU autonomous helicopter]

Computer Vision: Guiding Motion
• Visually guided manipulation
  • Hand-eye coordination
• Visually guided locomotion
  • Robotic vehicles
[Figure: CMU NavLab II]

Computer Vision: Recognition & Classification

Challenges in Object Recognition
[Figure: an image shown as a grid of raw pixel intensity values]

Object Recognition Research
[Diagram labels: Object Detection (Face, Lips, Hand Gesture, Text, Clock, License Plate, Building, Vehicle); Object Detection Issues; Large Quantity of Data; Quality/Quantity Issues; Intraclass Object Variation; Large Number of Object Classes; Low Image Quality; Segmentation and Hierarchical Analysis; Robust Algorithms; Automated Advanced Learning; Image Enhancement]

Intra-Class Variation

Lighting Variation

Geometric Variation

Simpler Problem: Classification
• Fixed size input
• Fixed object size, orientation, and alignment
Decision: “Object is present” (at fixed size and alignment) or “Object is NOT present” (at fixed size and alignment)

Detection: Apply Classifier Exhaustively
[Figure panels: Search in position; Search in scale]
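
Exhaustive search over position and scale can be sketched as a generator of candidate windows, each handed to a fixed-size classifier. The window size, step, and scale factor below are illustrative assumptions; real detectors often rescale the image rather than the window, and the source does not say which variant was used.

```python
def sliding_windows(width, height, window=20, step=4, scale_factor=1.2):
    """Yield (left, top, size) for every candidate window position and scale."""
    scale = 1.0
    while window * scale <= min(width, height):
        size = int(round(window * scale))
        for top in range(0, height - size + 1, step):
            for left in range(0, width - size + 1, step):
                yield left, top, size
        scale *= scale_factor

def detect(width, height, classifier, **kwargs):
    """Apply the classifier to every window and keep the ones it accepts."""
    return [w for w in sliding_windows(width, height, **kwargs) if classifier(*w)]

# Toy run on a 40x40 image: 9 windows of size 20 plus 1 window of size 40.
all_windows = list(sliding_windows(40, 40, window=20, step=10, scale_factor=2.0))

# Hypothetical classifier that only "fires" on the full-image window.
hits = detect(40, 40, lambda left, top, size: size == 40,
              window=20, step=10, scale_factor=2.0)
```

The number of windows grows quickly with image size and finer steps, which is why the per-window classifier needs to be cheap.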

View-based Classifiers
[Figure labels: Face Classifier #1; Face Classifier #2; Face Classifier #3]

1) Apply Local Operators
$f_1(0, 0) = \#5710$
$f_1(0, 1) = \#3214$
$f_k(n, m) = \#723$

2) Look Up Probabilities
$f_1(0, 0) = \#5710$: $P_1(\#5710, 0, 0 \mid \text{obj}) = 0.53$, $P_1(\#5710, 0, 0 \mid \text{non-obj}) = 0.56$
$f_1(0, 1) = \#3214$: $P_1(\#3214, 0, 1 \mid \text{obj}) = 0.57$, $P_1(\#3214, 0, 1 \mid \text{non-obj}) = 0.48$
$f_k(n, m) = \#723$: $P_k(\#723, n, m \mid \text{obj}) = 0.83$, $P_k(\#723, n, m \mid \text{non-obj}) = 0.19$

3) Make Decision
$P_1(\#5710, 0, 0 \mid \text{obj}) = 0.53$, $P_1(\#5710, 0, 0 \mid \text{non-obj}) = 0.56$
$P_1(\#3214, 0, 1 \mid \text{obj}) = 0.57$, $P_1(\#3214, 0, 1 \mid \text{non-obj}) = 0.48$
$P_k(\#723, n, m \mid \text{obj}) = 0.83$, $P_k(\#723, n, m \mid \text{non-obj}) = 0.19$
$$\frac{0.53 \times 0.57 \times \cdots \times 0.83}{0.56 \times 0.48 \times \cdots \times 0.19} > \lambda$$
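
The decision rule compares a product of object probabilities against a product of non-object probabilities and a threshold λ. A minimal sketch follows, working in log space to avoid underflow when many terms are multiplied. It uses only the three probability pairs shown on the slide (the elided terms are not filled in), and the function names and example thresholds are assumptions.

```python
import math

def log_likelihood_ratio(p_obj, p_non_obj):
    """Sum of log(P(.|obj) / P(.|non-obj)) over all local operator responses."""
    return sum(math.log(a) - math.log(b) for a, b in zip(p_obj, p_non_obj))

def is_object(p_obj, p_non_obj, lam=1.0):
    """Decide 'object present' when the likelihood ratio exceeds lam."""
    return log_likelihood_ratio(p_obj, p_non_obj) > math.log(lam)

# Only the values visible on the slide; the real product has more terms.
p_obj = [0.53, 0.57, 0.83]
p_non_obj = [0.56, 0.48, 0.19]

ratio = math.exp(log_likelihood_ratio(p_obj, p_non_obj))
accepted = is_object(p_obj, p_non_obj, lam=1.0)
rejected = is_object(p_obj, p_non_obj, lam=10.0)
```

Raising λ makes the detector stricter, trading missed objects for fewer false alarms.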

Two Classifiers Trained for Faces

Eight Classifiers Trained for Cars

Probabilities Estimated Off-Line
$f_1(0, 0) = \#567$: $H_1(\#567, 0, 0) = H_1(\#567, 0, 0) + 1$, $P_1(\#567, 0, 0) = \frac{H_1(\#567, 0, 0)}{\sum_i H_1(\#i, 0, 0)}$
$f_k(n, m) = \#350$: $H_k(\#350, 0, 0) = H_k(\#350, 0, 0) + 1$, $P_k(\#350, 0, 0) = \frac{H_k(\#350, 0, 0)}{\sum_i H_k(\#i, 0, 0)}$
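
The off-line estimation amounts to counting how often each quantized operator response occurs at each position, then normalizing the counts. The sketch below handles a single operator and a single class; the data layout (a dict from position to response code per training image) is an assumption made for illustration.

```python
from collections import Counter, defaultdict

def estimate_tables(training_samples):
    """Build P(code | position) from histograms of observed codes.

    Each sample maps a position (n, m) to the quantized response code
    observed there in one training image.
    """
    histograms = defaultdict(Counter)
    for sample in training_samples:
        for position, code in sample.items():
            histograms[position][code] += 1          # H(code, n, m) += 1
    tables = {}
    for position, hist in histograms.items():
        total = sum(hist.values())                   # sum over all codes i
        tables[position] = {code: n / total for code, n in hist.items()}
    return tables

# Four toy training images, all reporting a code at position (0, 0).
samples = [{(0, 0): 567}, {(0, 0): 567}, {(0, 0): 350}, {(0, 0): 567}]
tables = estimate_tables(samples)
```

In use, one such set of tables would be estimated from object images and another from non-object images, giving the two probabilities compared in the decision step.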

Training Classifiers
• Cars: 300-500 images per viewpoint
• Faces: 2,000 images per viewpoint
• ~1,000 synthetic variations of each original image
  • background scenery, orientation, position, frequency
• 2000 non-object images
  • Samples selected by bootstrapping
• Minimization of classification error on training set
  • AdaBoost algorithm (Freund & Schapire ‘97, Schapire & Singer ‘99)
  • Iterative method
  • Determines weights for samples
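
The "determines weights for samples" bullet can be illustrated with one round of the textbook discrete AdaBoost update: misclassified samples gain weight so the next classifier concentrates on them. This is a generic sketch with labels in {-1, +1}, not necessarily the exact variant used to train the face and car detectors.

```python
import math

def adaboost_round(weights, labels, predictions):
    """One discrete AdaBoost round: return (alpha, new normalized weights).

    Assumes labels and predictions are +1 or -1 and that the weighted
    error is strictly between 0 and 1.
    """
    error = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [w * math.exp(-alpha * y * h)
               for w, y, h in zip(weights, labels, predictions)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Four equally weighted samples; the weak classifier gets the last one wrong.
weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 1, -1, -1]
predictions = [1, 1, -1, 1]

alpha, new_weights = adaboost_round(weights, labels, predictions)
```

After the update the single misclassified sample carries half of the total weight, so the same weak classifier would now have a weighted error of 0.5 and a different one must be chosen next.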

Web-based Demo of Face Detector
http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi

CMU Face Detector in Commercial Product
[Figure label: CMU Face Detector]

Applications of Face Detection
• Automatic red-eye removal from photographs
• Automatic color balancing in photo-finishing
• Intelligent teleconferencing
• Component in face identification system

Difficulty Increases with Complexity of Object
• 2D vs. 3D
• Specific objects – e.g. my coffee mug
• A category of objects – e.g. all coffee mugs
• Amount of intra-category variation
  • Rigid or semi-rigid structure, e.g. face
  • Articulated objects, e.g. human body
  • Functionally defined objects, e.g. chairs

Find Images With Similar Colors

Find Images with Similar Shape

Goal: Find Images with Similar Content

Spectrum of Content-Based Image Retrieval
(ordered by degree of difficulty)
• Similar color distribution: histogram matching
• Similar texture pattern: texture analysis
• Similar shape/pattern: image segmentation, pattern recognition
• Similar real content: life-time goal :-)
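
Histogram matching, the easiest end of this spectrum, can be sketched as follows: quantize each RGB pixel into a coarse bin, normalize the counts, and score two images by histogram intersection. The bin count and the choice of intersection as the similarity measure are assumptions; commercial systems of the period used a variety of color metrics.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over coarse RGB bins for a list of (r, g, b) pixels."""
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[index] += 1.0
    return [count / len(pixels) for count in hist]

def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1]: 1 for identical color distributions, 0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))

# Three toy "images" given as flat pixel lists.
red = [(255, 0, 0)] * 10
blue = [(0, 0, 255)] * 10
half_and_half = [(255, 0, 0)] * 5 + [(0, 0, 255)] * 5

same = histogram_intersection(color_histogram(red), color_histogram(red))
different = histogram_intersection(color_histogram(red), color_histogram(blue))
partial = histogram_intersection(color_histogram(red), color_histogram(half_and_half))
```

Because the histogram discards where colors occur, two images with the same palette but different content score as identical, which is why the harder levels of the spectrum need texture, shape, and segmentation.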

Status of Image Search
• Typical Search Features
  • Color
  • Texture
  • Shape
  • Spatial attributes (local color regions, less common than global color, texture, shape metrics)
• Commercial Activity
  • eVision (notes that “visual search engine market segment is projected to reach $1.4 billion by 2005 according to the McKenna Group”), http://www.evisionglobal.com/about/index.html
  • Virage (www.virage.com)
  • IBM (QBIC part of database toolset)

Reference: “A Review of CBIR”
Recommended reading: A Review of Content-Based Image Retrieval Systems, Colin C. Venters and Dr. Matthew Cooper, University of Manchester
Available at http://www.jisc.ac.uk/jtap/htm/jtap-054.html
This review lists features from a number of image retrieval systems, along with heuristic evaluations on the interfaces for a subset of these systems.

Search Engines Used by 2001 Multimedia Class
• Search engines used for 2001 multimedia retrieval homework (15 others answered a single query each):
[Figure: chart of search engines used]

Search Engines Used in This 2002 Class
[Figure: chart of search engines used]
Also answering 1 query each were: Excite+, Rexfeature, Webseek+, search.netscape.com+, animalplanet.com+, ask.com, naver.com+

For Further Reading on Texture Search
• Texture Search: “Texture features for browsing and retrieval of image data”, B. S. Manjunath and W. Y. Ma, IEEE Trans. on Pattern Analysis and Machine Intelligence 18(8), Aug. 1996, pp. 837-842.
• Texture search via http://www.engin.umd.umich.edu/ceep/tech_day/2000/reports/ECEreport2.htm (texture features include coarseness, average gray scale value, and number of horizontal and vertical extrema of a specific image region)
• For QBIC, texture search works on global coarseness, contrast and directionality features

For Further Exploration of Image Segmentation
• BlobWorld work at UC Berkeley
• Papers, description, sample system available at http://elib.cs.berkeley.edu/photos/blobworld/

Further Reading on Wavelet Compression and JPEG 2000
• http://www.gvsu.edu/math/wavelets/student_work/EF/howworks.html
• http://www-ise.stanford.edu/class/psych221/00/shuoyen/
• Henry Schneiderman Ph.D. Thesis “A Statistical Approach to 3D Object Detection Applied to Faces and Cars”, http://www.ri.cmu.edu/pub_files/pub2/schneiderman_henry_2000_2/schneiderman_henry_2000_2.pdf
• http://www.jpeg.org/JPEG2000.html

Summary: Image Processing & Computer Vision
• Not as mature as speech recognition
  • Technology not as reliable
  • Fewer companies, fewer products
• Success on limited problems, e.g., documents
• More applicable to fault-tolerant problems
• Technology will grow
  • Emergence of digital camera
  • Improved methods

Decomposition in Resolution/Frequency
[Figure panels: coarse; intermediate; fine]

Wavelet Decomposition
[Figure: Vertical subbands (LH)]

Wavelet Decomposition
[Figure: Horizontal subbands (HL)]