bc509b497bf0272e559e4d3471a7b60e.ppt
- Number of slides: 10
From: Perrizo, William
Sent: Thursday, February 02, 2012 10:45 AM
To: 'Mark Silverman'

The Satlog (Landsat Satellite) data set from the UCI Machine Learning Repository is a good one for image classification trials. I have created level-1 pTrees for the impure predicate "> 50% 1's" using 64-bit strides. It is surprising how accurately one seems to be able to classify using only level-1 impure pTrees (the speedup is a given).

An aside: the longer-term project of changing the horizontal mindset to a vertical mindset is ongoing. I'm pretty sure CCD cameras produce vertical data, not horizontal. An embedded device might strip off bit slices as the image bands come out of these cameras and produce stride=64, level-1 "> 50% ones" pTrees on the fly (using FPGAs?).

Suppose the task is to identify tanks from photos (many large photos) and we get only one quick (software) look at them. We want to have an Oblique FAUST classifier trained from previous images containing identified tanks. That is, we want an accurate cut-hyperplane (if there are n bands, the cut-hyperplane is (n-1)-dimensional and divides the space in half).

From training data, we calculate the vector $D \equiv \overrightarrow{m_T\, m_{NT}}$ from the Tank mean $m_T$ to the NotTank mean $m_{NT}$, the unit vector $d \equiv D/|D|$, and the constant

$$a \equiv \tfrac{1}{2}\, d\circ m_T + \tfrac{1}{2}\, d\circ m_{NT}$$

Then all points in the pTree $P_{X\circ d<a}$ are classified as tanks. Note that

$$a \equiv d\circ m_T\,\frac{std_T}{std_T+std_{NT}} + d\circ m_{NT}\,\frac{std_{NT}}{std_T+std_{NT}}$$

is better ($std_T$ = standard deviation of the projection of T on the d-line).

With multilevel pTrees, the row ordering is important (for speed, accuracy, and zoomability of images). If pixels are raster ordered, it is unlikely that tank-pixel runs will cover any level-1 stride (if Z or Hilbert ordered, maybe?). But we can sort the training set on class (training is a one-time process). Once a good cut-plane is identified, it can be used to classify all incoming images using Mohammad's formula for $P_{X\circ d<a}$ (one AND program across the pTrees).
Since training is a one-time process, it should be done for maximum accuracy, not speed (so level-1 training may be dumb?). If the flood of images to be classified comes in as level-1 50% pTrees, Mo's formula can be used for very fast classification (using FPGAs?). The accuracy of this classification depends only on the quality of the training step (the cut-planes). So: train slowly (obtain optimal d and a) using level-0 pTrees, then classify using level-1 50% pTrees and Mo's formula? The classification will be fast even using level-0 pTrees (using Mo's formula implemented on FPGAs?).
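To make the "> 50% 1's" level-1 predicate concrete, here is a minimal Python sketch under stated assumptions: pixel values are 8-bit integers, a bit slice is a plain list of 0/1 values, and a trailing partial stride is judged against its own length. The function names `bit_slices` and `level1_majority` are illustrative, not taken from the slides.

```python
def bit_slices(values, nbits=8):
    """Vertical bit slices of a column of ints, most significant bit first."""
    return [[(v >> b) & 1 for v in values] for b in range(nbits - 1, -1, -1)]


def level1_majority(bits, stride=64):
    """One level-1 bit per stride: 1 when more than 50% of the stride's bits are 1."""
    out = []
    for start in range(0, len(bits), stride):
        chunk = bits[start:start + stride]
        # Assumption: a short final stride is compared against its own length.
        out.append(1 if 2 * sum(chunk) > len(chunk) else 0)
    return out


# Hypothetical band: two strides of 64 pixels each.
band = [200] * 40 + [10] * 24 + [10] * 64
slices = bit_slices(band)                       # 8 level-0 bit slices
level1 = [level1_majority(s) for s in slices]   # 8 level-1 slices, 2 bits each
```

Each level-1 slice here is 64 times shorter than its level-0 slice, which is where the speedup mentioned in the email would come from.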
FAUST Oblique formula: $P_{X\circ d<a}$, where X is any set of vectors (e.g., a training class).

$D \equiv \overrightarrow{m_r m_v}$. Let $d = D/|D|$.

To separate the r's from the v's using the midpoint of the means as the cut-point, calculate a as follows. Viewing $m_r$, $m_v$ as vectors (e.g., $m_r \equiv \overrightarrow{\text{origin}\; pt_{m_r}}$),

$$a = \left(m_r + \frac{m_v - m_r}{2}\right)\circ d = \frac{m_r + m_v}{2}\circ d$$

What if d points away from the intersection of the cut-hyperplane (cut-line in this 2-D case) and the d-line, as it does for class = V, where $d = \overrightarrow{m_v m_r}/|\overrightarrow{m_v m_r}|$? Then a is the negative of the distance shown (the angle is obtuse, so its cosine is negative). But each $v\circ d$ is a more negative number than $a = \frac{m_r+m_v}{2}\circ d$, so we still want $v\circ d < \tfrac{1}{2}(m_v+m_r)\circ d$.

[Figure: 2-D scatter of r and v points with the class means $m_r$ and $m_v$, the d-line, and the cut-point a.]
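The midpoint-of-means cut can be sketched in a few lines of Python. This is a horizontal (row-at-a-time) illustration of the arithmetic only, not the pTree implementation; the sample points and the helper names `mean`, `dot`, and `midpoint_cut` are made up for the example.

```python
import math


def mean(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def midpoint_cut(r_points, v_points):
    """Return (d, a): unit vector from m_r toward m_v and the cut value a."""
    m_r, m_v = mean(r_points), mean(v_points)
    D = [b - a for a, b in zip(m_r, m_v)]       # D = m_v - m_r
    norm = math.sqrt(dot(D, D))
    d = [x / norm for x in D]
    a = dot([(p + q) / 2 for p, q in zip(m_r, m_v)], d)   # ((m_r + m_v)/2) o d
    return d, a


# Hypothetical 2-D training classes.
r = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
v = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
d, a = midpoint_cut(r, v)
pred_r = [dot(x, d) < a for x in r + v]   # True means "classified as r"
```

With these points, every r falls on the `x o d < a` side and every v on the other side.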
FAUST Oblique, vector of stds: $P_{X\circ d<a} = P_{\sum_i d_i X_i < a}$

$D \equiv \overrightarrow{m_r m_v}$, $d = D/|D|$.

To separate r from v using the vector-of-stds cut-point, calculate a as follows. Viewing $m_r$, $m_v$ as vectors,

$$a = \left(\frac{std_r}{std_r+std_v}\, m_r + \frac{std_v}{std_r+std_v}\, m_v\right)\circ d$$

What are the stds (shown in purple on the slide)? Approach 1: for each coordinate (or dimension), calculate the std of the coordinate values, and form the vector of those stds.

Let's remind ourselves that the formula $P_{X\circ d<a} = P_{\sum_i d_i X_i < a}$, given Md's formula, does not require looping through the X-values but requires only one AND program across the pTrees.

[Figure: 2-D scatter of r and v points with the class means $m_r$ and $m_v$ and the d-line.]
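The slides do not spell out Md's formula, so the following is not that formula. It is a sketch of a standard bit-sliced comparison for a single non-negative integer attribute against a constant, included only to illustrate how a predicate mask can be produced with whole-slice AND/OR/NOT operations instead of a loop over rows. Each slice is held as one Python integer with one bit per row; `to_slices` and `less_than` are illustrative names.

```python
def to_slices(values, nbits):
    """Vertical bit slices, most significant first; bit i of a slice is row i."""
    slices = []
    for b in range(nbits - 1, -1, -1):
        mask = 0
        for i, v in enumerate(values):
            mask |= ((v >> b) & 1) << i
        slices.append(mask)
    return slices


def less_than(slices, c, nrows):
    """Mask of rows with value < c, using only logical operations on slices."""
    nbits = len(slices)
    all_rows = (1 << nrows) - 1
    if c <= 0:
        return 0                  # no non-negative value is below c
    if c >= 1 << nbits:
        return all_rows           # every nbits-wide value is below c
    lt, eq = 0, all_rows          # eq: rows still equal to c on the bits seen so far
    for k, s in enumerate(slices):
        if (c >> (nbits - 1 - k)) & 1:
            lt |= eq & ~s & all_rows   # row has 0 where c has 1: row < c
            eq &= s
        else:
            eq &= ~s & all_rows        # row has 1 where c has 0: row > c, drop it
    return lt


values = [3, 7, 12, 5, 9, 0]      # hypothetical 4-bit attribute
slices = to_slices(values, 4)
mask = less_than(slices, 8, len(values))
rows = [i for i in range(len(values)) if (mask >> i) & 1]
```

Only the one-time slice construction in `to_slices` touches individual rows; the comparison itself runs once per bit position regardless of the number of rows.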
FAUST Oblique, Approach 2: $P_{X\circ d<a} = P_{\sum_i d_i X_i < a}$

$D \equiv \overrightarrow{m_r m_v}$, $d = D/|D|$.

To separate r from v using the stds of the projections, calculate a as follows:

$$a = pm_r + \frac{pstd_r}{pstd_r+pstd_v}\,(pm_v - pm_r) = \frac{pm_r\,pstd_r + pm_r\,pstd_v + pm_v\,pstd_r - pm_r\,pstd_r}{pstd_r+pstd_v}$$

Next?

$$a = pm_r + \frac{2\,pstd_r}{2\,pstd_r+pstd_v}\,(pm_v - pm_r) = \frac{pm_r\,2pstd_r + pm_r\,pstd_v + pm_v\,2pstd_r - pm_r\,2pstd_r}{2\,pstd_r+pstd_v}$$

In this case the predicted classes will overlap (i.e., a given sample point may be assigned multiple classes); therefore we will have to order the class predictions.

By $pm_r$ we mean this distance, $m_r\circ d$, which is also $\text{mean}\{r\circ d \mid r\in R\}$. By $pstd_r$ we mean $\text{std}\{r\circ d \mid r\in R\}$.

[Figure: 2-D scatter of r and v points with the class means $m_r$ and $m_v$, the d-line, and the projected means $pm_r$ and $pm_v$ marked on it.]
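A minimal Python sketch of the Approach 2 cut-point follows, with the doubling written as a multiplier `k` (k = 1 gives the first formula, k = 2 the doubled one). Two things are assumptions not stated on the slide: the population standard deviation is used, and at least one of the two projected stds is nonzero. The function name and sample points are illustrative.

```python
import math
import statistics


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def projection_cut(r_points, v_points, k=1.0):
    """Return (d, a) with a = pm_r + k*pstd_r/(k*pstd_r + pstd_v) * (pm_v - pm_r)."""
    dim = len(r_points[0])
    m_r = [statistics.fmean([p[i] for p in r_points]) for i in range(dim)]
    m_v = [statistics.fmean([p[i] for p in v_points]) for i in range(dim)]
    D = [b - a for a, b in zip(m_r, m_v)]
    norm = math.sqrt(dot(D, D))
    d = [x / norm for x in D]
    proj_r = [dot(p, d) for p in r_points]      # projections onto the d-line
    proj_v = [dot(p, d) for p in v_points]
    pm_r, pm_v = statistics.fmean(proj_r), statistics.fmean(proj_v)
    # Assumption: population std; the slides do not say which estimator is used.
    pstd_r, pstd_v = statistics.pstdev(proj_r), statistics.pstdev(proj_v)
    a = pm_r + (k * pstd_r) / (k * pstd_r + pstd_v) * (pm_v - pm_r)
    return d, a


# Hypothetical classes lying along the x-axis: pm_r = 1, pstd_r = 1, pm_v = 10, pstd_v = 3.
r = [(0.0, 0.0), (2.0, 0.0)]
v = [(7.0, 0.0), (13.0, 0.0)]
_, a_plain = projection_cut(r, v, k=1.0)     # 1 + (1/4) * 9
_, a_doubled = projection_cut(r, v, k=2.0)   # 1 + (2/5) * 9
```

Raising `k` moves the cut away from $pm_r$, so more points are claimed for class r; that is the trade-off between true and false positives examined in the result tables.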
FAUST for Satlog (Landsat)

Class means:
Class        1        2        3        4        5        7
R          62.83    48.84    87.48    77.41    59.59    69.01
G          95.29    39.91   105.50    90.94    62.27    77.42
ir1       108.12   113.89   110.60    95.61    83.02    81.59
ir2        89.50   118.31    87.46    75.35    69.95    64.13

Class stds:
Class        1        2        3        4        5        7
R            8        8        5        6        6        5
G           15       13        7        8       12        8
ir1         13       13        7        8       13        9
ir2          9       19        6        7       13        7

NonOblique lev-0:
Class              1      2      3      4      5      7
True Positives:    99    193    325    130    151    257
Class Totals:     461    224    397    211    237    470

NonOblique lev-1 50%:
Class              1      2      3      4      5      7
True Positives:   212    183    314    103    157    330
False Positives:   14      1     42      -     36    189

Oblique level-0 using midpoint of means:
Class              1      2      3      4      5      7
True Positives:   322    199    344    145    174    353
False Positives:   28      3     80    171    107     74

Oblique level-0 using means and stds of projections (w/o class elimination):
Class              1      2      3      4      5      7
True Positives:   359    205    332    144    175    324
False Positives:   29     18     47    156    131     58

Oblique lev-0, means, stds of projections (with class elimination in 2, 3, 4, 5, 6, 7, 1 order). Note that no elimination occurs!
Class              1      2      3      4      5      7
True Positives:   359    205    332    144    175    324
False Positives:   29     18     47    156    131     58
FAUST Oblique: $P_{X\circ d<a} = P_{\sum_i d_i X_i < a}$, X any set of vectors.

$D \equiv \overrightarrow{m_r m_v}$, $d = D/|D|$.

To separate r from v using the cut-point from the means and stds of the projections, with $pstd_r$ doubled:

$$a = pm_r + \frac{2\,pstd_r}{2\,pstd_r+pstd_v}\,(pm_v - pm_r) = \frac{pm_r\,2pstd_r + pm_r\,pstd_v + pm_v\,2pstd_r - pm_r\,2pstd_r}{2\,pstd_r+pstd_v}$$

Oblique level-0 using means and stds of projections:
Class              1      2      3      4      5      7
True Positives:   359    205    332    144    175    324
False Positives:   29     18     47    156    131     58
Class Totals:     461    224    397    211    237    470

Oblique level-0 using means and stds of projections, doubling pstd_r as above:
Class              1      2      3      4      5      7
True Positives:   410    212    277    179    199    324
False Positives:  114     40    113    259    235     58

Oblique lev-0, means, stds of projections, doubling pstd_r, classify, eliminate in 2, 3, 4, 5, 7, 1 order:
Class              1      2      3      4      5      7
True Positives:   309    212    277    154    163    248
False Positives:   22     40     65    211    196     27

So the number of FPs is drastically reduced and the number of TPs somewhat reduced. Is that better? If we parameterize the 2 (the doubling) and adjust it to maximize TPs and minimize FPs, what is the optimal multiplier value? The next slide shows the low-to-high std elimination ordering.

[Figure: 2-D scatter of r and v points with the class means $m_r$ and $m_v$, the d-line, and the projected means $pm_r$ and $pm_v$.]
FAUST Oblique: $P_{X\circ d<a} = P_{\sum_i d_i X_i < a}$, X any set of vectors.

$D \equiv \overrightarrow{m_r m_v}$, $d = D/|D|$.

To separate r from v using the cut-point from the means and stds of the projections, with $pstd_r$ doubled:

$$a = pm_r + \frac{2\,pstd_r}{2\,pstd_r+pstd_v}\,(pm_v - pm_r) = \frac{pm_r\,2pstd_r + pm_r\,pstd_v + pm_v\,2pstd_r - pm_r\,2pstd_r}{2\,pstd_r+pstd_v}$$

Oblique level-0 using means and stds of projections:
Class              1      2      3      4      5      7
True Positives:   359    205    332    144    175    324
False Positives:   29     18     47    156    131     58
Class Totals:     461    224    397    211    237    470

Oblique level-0 using means and stds of projections, doubling pstd_r as above:
Class              1      2      3      4      5      7
True Positives:   410    212    277    179    199    324
False Positives:  114     40    113    259    235     58

Oblique lev-0, means, stds of projections, doubling pstd_r, classify, eliminate in 2, 3, 4, 5, 7, 1 order:
Class              1      2      3      4      5      7
True Positives:   309    212    277    154    163    248
False Positives:   22     40     65    211    196     27

Low-to-high std elimination ordering: Oblique lev-0, means, stds of projections, doubling pstd_r, classify, eliminate in 3, 4, 7, 5, 1, 2 order:
Class              1      2      3      4      5      7
True Positives:   329    189    277    154    164    307
False Positives:   25      1    113    211    121     33
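The slides do not define "classify, eliminate in order" precisely. One plausible reading, sketched below as an assumption, is that each class has a predicate mask of the samples it claims, the classes are visited in the chosen order, and a sample keeps the first class that claims it and is then removed from consideration. The function name and the masks are hypothetical.

```python
def classify_with_elimination(n_samples, class_masks, order):
    """Assign each sample the first class (in `order`) whose mask claims it.

    class_masks maps a class label to a list of booleans, one per sample.
    Samples claimed by no class stay None.
    """
    labels = [None] * n_samples
    for cls in order:
        for i, claimed in enumerate(class_masks[cls]):
            if claimed and labels[i] is None:   # sample not yet eliminated
                labels[i] = cls
    return labels


# Hypothetical overlapping predictions: sample 1 is claimed by both classes.
masks = {
    1: [True, True, False, False],
    2: [False, True, True, False],
}
labels_2_first = classify_with_elimination(4, masks, order=[2, 1])
labels_1_first = classify_with_elimination(4, masks, order=[1, 2])
```

Under this reading the order only matters for samples claimed by more than one class, which would explain why the per-class TP and FP counts shift between the elimination orderings in the tables.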
FAUST Oblique

lev-0, std1/(std1+std2):
Class              1      2      3      4      5      7
True Positives:   359    205    332    144    175    324
False Positives:   29     18     47    156    131     58

lev-0, 2std1/(2std1+std2):
Class              1      2      3      4      5      7
TP:               410    212    277    179    199    324
FP:               114     40    113    259    235     58

2s1/(2s1+s2), elim ord: 234571
Class              1      2      3      4      5      7
TP:               309    212    277    154    163    248
FP:                22     40     65    211    196     27

2s1/(2s1+s2), elim ord: 347512
Class              1      2      3      4      5      7
TP:               329    189    277    154    164    307
FP:                25      1    113    211    121     33

2s1/(2s1+s2), elim ord: 425713
Class              1      2      3      4      5      7
TP:               355    205    224    179    172    307
FP:                37     18     14    259    172 
FAUST Oblique: $D \equiv \overrightarrow{m_r m_v}$, $d = D/|D|$

lev-0, std1/(std1+std2):
Class              1      2      3      4      5      7
True Positives:   359    205    332    144    175    324
False Positives:   29     18     47    156    131     58

lev-0, 2std1/(2std1+std2):
Class              1      2      3      4      5      7
TP:               410    212    277    179    199    324
FP:               114     40    113    259    235     58

2s1/(2s1+s2), elim ord: 234571
Class              1      2      3      4      5      7
TP:               309    212    277    154    163    248
FP:                22     40     65    211    196     27

2s1/(2s1+s2), elim ord: 347512
Class              1      2      3      4      5      7
TP:               329    189    277    154    164    307
FP:                25      1    113    211    121     33

2s1/(2s1+s2), elim ord: 425713
Class              1      2      3      4      5      7
TP:               355    205    224    179    172    307
FP:                37     18     14    259    121     33

Summary:
         s1/(s1+s2)     2s1/(2s1+s2), 425713
Class     TP    FP        TP    FP
1        359    29       355    37
2        205    18       189    18
3        332    47       277    14
4        144   156       154   259
5        175   131       164   121
7        324    58       307    33
total   1539   439      1446   482

NonOblique lev-1 50%:
Class              1      2      3      4      5      7
True Positives:   212    183    314    103    157    330
False Positives:   14      1     42      -     36    189
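The totals quoted for each variant can be re-derived from the per-class counts, which also makes it easy to compare the variants on one scale. The Python sketch below uses the TP/FP counts and class totals from the slides; the two ratios it reports (TP over all 2000 samples, and TP over TP+FP) are a choice made here for comparison, not measures used in the slides.

```python
# Per-class counts for classes 1, 2, 3, 4, 5, 7, copied from the slides.
class_totals = [461, 224, 397, 211, 237, 470]
results = {
    "s1/(s1+s2)":                {"TP": [359, 205, 332, 144, 175, 324],
                                  "FP": [29, 18, 47, 156, 131, 58]},
    "2s1/(2s1+s2)":              {"TP": [410, 212, 277, 179, 199, 324],
                                  "FP": [114, 40, 113, 259, 235, 58]},
    "2s1/(2s1+s2), elim 234571": {"TP": [309, 212, 277, 154, 163, 248],
                                  "FP": [22, 40, 65, 211, 196, 27]},
    "2s1/(2s1+s2), elim 347512": {"TP": [329, 189, 277, 154, 164, 307],
                                  "FP": [25, 1, 113, 211, 121, 33]},
}

n_samples = sum(class_totals)
summary = {}
for name, counts in results.items():
    tp, fp = sum(counts["TP"]), sum(counts["FP"])
    summary[name] = {
        "TP": tp,
        "FP": fp,
        "tp_rate": tp / n_samples,      # share of all samples correctly claimed
        "precision": tp / (tp + fp),    # share of claims that are correct
    }

for name, row in summary.items():
    print(f"{name:28s} TP={row['TP']:5d} FP={row['FP']:4d} "
          f"tp_rate={row['tp_rate']:.3f} precision={row['precision']:.3f}")
```

A single number such as precision makes the "Is that better?" question from the earlier slide easier to answer when TPs and FPs move in opposite directions.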
We are looking for real-life examples, e.g., "friends of those who buy X, buy Y".

Market agency targeting advertising to friends of customers.

Entities: 1. advertisements 2. markets 3. merchants 4. products 5. reviews 6. customers 7. friends

6-hop myrrh example (from Damian). The hop logic would be:
1. Advertisements target specific markets.
2. Markets have particular merchants.
3. Merchants sell individual products.
4. Products have reviews.
5. Reviews are provided by customers.
6. Customers have friends.

[Figure: chain of binary relationship matrices: targets = Q(D, E) with D = Ads, E = Markets; IsUsedBy = R(E, F) with F = merchants; Sell = S(F, G) with G = products; RatedAt(G, H) with H = rating; AreGivenBy = U(H, I) with I = Customers; IsAFriend?(I, J) with J = Person.]

Another example (from Ya). In the 779 slide deck "03 vertical_audio 1.ppt", 2 techniques to address the curse of cardinality and dimensionality are: parallelizing the processing engine, and parallelizing the greyware engine on clusters of people (i.e., enable visualization and use the web...). There is a good example of "visualization and use the web": Crystal structure of a monomeric retroviral protease solved by protein folding game players, http://www.nature.com/nsmb/journal/v18/n10/full/nsmb.2119.html. This is a protein structure problem, though, not a curse of cardinality or curse of dimensionality problem.
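The hop logic above amounts to composing binary relationships: an advertisement reaches a product if some market and some merchant link them. Here is a minimal Python sketch using small 0/1 matrices; the matrix contents and the names `compose` and `chain` are hypothetical, and only the first three hops are shown to keep the example short.

```python
def compose(rel_ab, rel_bc):
    """Boolean composition: (a, c) is 1 when some b has rel_ab[a][b] and rel_bc[b][c]."""
    n_b, n_c = len(rel_bc), len(rel_bc[0])
    return [
        [int(any(row[b] and rel_bc[b][c] for b in range(n_b))) for c in range(n_c)]
        for row in rel_ab
    ]


def chain(*relations):
    """Compose any number of relationships left to right."""
    result = relations[0]
    for rel in relations[1:]:
        result = compose(result, rel)
    return result


# Hypothetical data: 2 ads, 2 markets, 2 merchants, 3 products.
targets = [[1, 0],        # ad 0 targets market 0
           [0, 1]]        # ad 1 targets market 1
is_used_by = [[1, 1],     # market 0 has merchants 0 and 1
              [0, 1]]     # market 1 has merchant 1
sell = [[1, 0, 0],        # merchant 0 sells product 0
        [0, 1, 1]]        # merchant 1 sells products 1 and 2

ad_to_product = chain(targets, is_used_by, sell)
```

Extending the chain with rating, customer, and friend matrices in the same way would give the full 6-hop reachability from advertisements to friends.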