
Information Market based Decision Fusion in Multi-Classifier Combination
Johan Perols, Kaushal Chari and Manish Agrawal
Information Systems/Decision Sciences, College of Business Administration, University of South Florida
jperols@coba.usf.edu, kchari@coba.usf.edu, magrawal@coba.usf.edu
Overview
- Multi-Classifier Combination
- Information Market based Fusion
- Experiments and Results
- Contributions and Future Research Opportunities
Multi-Classifier Combination
Research Objectives
To design a combiner method that:
1) is relatively effective;
2) can adapt to changes in ensemble composition and to changes in relative base-classifier accuracy; and
3) does not assume that base-classifiers are cooperative.
Information Markets – Definition
- Information markets are markets designed specifically for the purpose of information aggregation.
- Equilibrium prices provide information about a specific situation, future event or object of interest.
Information Markets - Parimutuel Betting
- Empirical field research (Weitzman 1965; Ali 1977 and 1979; Asch, et al. 1982; and Thaler 1992) and experiments (Plott, et al. 2003) support the efficient market hypothesis in these betting markets.
- Parimutuel betting originated in horse betting:
  - odds for horse i = (total amount bet on all horses) / (total amount bet on horse i)
  - the market's likelihood assessment that horse i will win the race is given by 1 / (odds for horse i)
  - recursive relation between odds and amount bet
  - winners divide the total amount bet in proportion to the sums they have wagered individually on the winning horse
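A tiny worked example of the parimutuel odds and implied probabilities just described; the wager amounts are made up for illustration.

```python
def parimutuel_odds(bets):
    """Parimutuel odds: odds_i = (total amount bet on all horses) / (amount bet on horse i).
    The market's implied win probability for horse i is then 1 / odds_i."""
    pool = sum(bets)
    odds = [pool / b for b in bets]
    probs = [1.0 / o for o in odds]  # equivalently bets[i] / pool
    return odds, probs

# Hypothetical pool: 20, 30 and 50 wagered on three horses.
odds, probs = parimutuel_odds([20.0, 30.0, 50.0])
print(odds)   # [5.0, 3.33..., 2.0]
print(probs)  # [0.2, 0.3, 0.5] -- the implied probabilities sum to 1
```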
IMF Algorithm
1. Binary search iterations with odds setting and agent betting
2. Optimization
Assumption: only objects classified as positive are investigated.
Agents maximize their individual utility.
Winnings are distributed according to the parimutuel betting mechanism.
IMF – Binary Search
Pl – lower probability boundary
Pu – upper probability boundary
Qtj – total bets on event j
Qt – total bets on all events
Otj – market odds for outcome j
ε – binary search stopping parameter
IMF – Optimization
Pl – lower probability boundary
Pu – upper probability boundary
Qtj – total bets on event j
Qt – total bets on all events
Otj – market odds for outcome j
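The two preceding slides only define the symbols; as a rough, hypothetical sketch (not the paper's exact procedure) of how a binary search over the positive-class probability could set parimutuel odds from agent bets and stop once the interval [Pl, Pu] is narrower than ε:

```python
def settle_odds(agents, p_l=0.0, p_u=1.0, eps=1e-3):
    """Hypothetical odds-setting loop: post odds implied by the midpoint of
    [p_l, p_u], collect agent bets at those odds, and narrow the interval
    toward the probability implied by the parimutuel pool (Q_tj / Q_t)."""
    while p_u - p_l > eps:
        p_mid = (p_l + p_u) / 2.0
        posted_odds = (1.0 / p_mid, 1.0 / (1.0 - p_mid))      # odds for j=1, j=2

        bets = [agent.bet(posted_odds) for agent in agents]    # each returns (q1, q2)
        q_1 = sum(b[0] for b in bets)                          # Q_t1: bets on outcome 1
        q_t = sum(b[0] + b[1] for b in bets)                   # Q_t: bets on both outcomes
        if q_t == 0:
            break                                              # no bets placed; keep current bounds

        if q_1 / q_t > p_mid:    # pool implies outcome 1 is more likely than posted
            p_l = p_mid
        else:
            p_u = p_mid

    p_1 = (p_l + p_u) / 2.0
    return 1.0 / p_1, 1.0 / (1.0 - p_1)    # final market odds O_t1, O_t2
```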
IMF – Get Cutoff
Independent cycle that, given the prior cutoff Cj, determines the future cutoff:
1) Use Cj for the next n transactions.
2) Use Cj + k for transactions n to 2n.
3) Use Cj − k for transactions 2n to 3n.
4) Set Cj to the cutoff that generated the highest net benefit.
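A minimal sketch of this probe-and-select cycle; run_block is a hypothetical callback that classifies the next n transactions with a given cutoff and returns the net benefit achieved.

```python
def next_cutoff(c_j, k, n, run_block):
    """Try the prior cutoff c_j, then c_j + k, then c_j - k on three consecutive
    blocks of n transactions, and keep whichever generated the highest net benefit."""
    candidates = [c_j, c_j + k, c_j - k]
    benefits = [run_block(cutoff, n) for cutoff in candidates]
    return candidates[benefits.index(max(benefits))]
```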
IMF – Classify Object
Based on the final odds Oft1 and cutoff C1 determined previously, classify the object:
if (1 / Oft1 ≥ C1) then classify t as j = 1
else classify t as j = 2
end if
IMF – Take Agent Bets
Agents determine their bets qit1 and qit2 by solving P1:

max Z1 = pit1 Ui(wit + m − qit1 − qit2 + qit1 Ot1) + pit2 Ui(wit + m − qit1 − qit2 + qit2 Ot2)
s.t. qitj, sit ≥ 0

Ui – agent i's utility function
Otj – current market odds for outcome j
pitj – agent i's probability estimate for outcome j
wit – agent i's wealth
m – periodic endowment
k – multiplier that determines the house-enforced maximum bet km
IMF – Take Agent Bets
Assuming a log utility function and forcing the agents to bet everything allowed (they can still hedge their bets), P1 is transformed to P2:

max over qitj: Z2 = pit1 ln(wit + m − qit1 − qit2 + qit1 Ot1) + pit2 ln(wit + m − qit1 − qit2 + qit2 Ot2)
s.t. qitj ≥ 0
IMF – Take Agent Bets
Lemma 1: If wit + m ≤ km, then the optimal bets of agent i while classifying t are:
q*itj = pitj (wit + m) for all j ∈ J.
Lemma 2: If wit + m > km, then the optimal bets of agent i while classifying t are:
q*it1 = pit1 km + ait, and q*it2 = pit2 km + ait
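A small sketch of the optimal bets in Lemmas 1 and 2, assuming a log-utility agent; the hedging term a_it in Lemma 2 is passed in unchanged because the slide does not define it.

```python
def optimal_bets(p_1, p_2, w_it, m, k, a_it=0.0):
    """Optimal bets (q_it1, q_it2) of agent i on transaction t.
    p_1, p_2: the agent's probability estimates for outcomes j=1 and j=2;
    w_it + m: funds available this period; k * m: house-enforced maximum bet."""
    if w_it + m <= k * m:
        # Lemma 1: bet all available funds, split in proportion to the probabilities.
        return p_1 * (w_it + m), p_2 * (w_it + m)
    # Lemma 2: bet the house maximum k*m in proportion to the probabilities,
    # plus the hedging amount a_it on each outcome.
    return p_1 * k * m + a_it, p_2 * k * m + a_it
```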
IMF – Distribute Payout
On investigating transaction t′ that occurred v transactions before the now-current transaction t, if t′ is found to be a member of the positive class j = 1, then agent i's wealth is updated as follows:
wit = wit + Qt−v (qi,t−v,1 / Qt−v,1)
If t′ is found to be a member of the negative class j = 2, then agent i's wealth is updated as follows:
wit = wit + Qt−v (qi,t−v,2 / Qt−v,2)
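A minimal sketch of this payout step, assuming the per-agent bets on the investigated transaction were recorded; variable names are illustrative.

```python
def distribute_payout(wealth, bets, true_class):
    """Parimutuel payout for one investigated transaction.
    wealth: dict agent_id -> current wealth w_it
    bets:   dict agent_id -> (bet on j=1, bet on j=2) placed on that transaction
    true_class: 1 (positive) or 2 (negative), the investigated outcome."""
    pool = sum(q1 + q2 for q1, q2 in bets.values())                # Q_{t-v}
    winning_pool = sum(b[true_class - 1] for b in bets.values())   # Q_{t-v, j}
    if winning_pool == 0:
        return wealth   # nobody backed the realized outcome; nothing to redistribute
    for agent_id, b in bets.items():
        # Each winner receives a share of the whole pool proportional to its stake.
        wealth[agent_id] += pool * (b[true_class - 1] / winning_pool)
    return wealth
```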
Research Objectives & Design Solutions
Research Objective | Design Solution
Changes in Ensemble Composition | The market mechanism works independently of which specific agents are participating in the market.
Changes in Base-Classifier Accuracy | Agents improving their performance do better in the market and have an increasingly larger influence on the ensemble's decisions, and vice versa.
Agent Cooperativeness | The market mechanism uses the amount bet on each outcome, not probabilities, to determine odds.
Experiment – Main Objectives
- Does IMF outperform AVG, MAJ and WAVG? (a sketch of these benchmark combiners follows)
- Are these results sensitive to the:
  - number of agents in the ensemble;
  - cost-benefit ratio;
  - dataset size;
  - dataset positive ratio; or
  - dataset average agent accuracy?
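For reference, a minimal sketch of the three benchmark combiners, assuming each base classifier outputs a posterior probability for the positive class; the accuracy-based weighting in WAVG is an assumption, not taken from the slides.

```python
def avg(probs, cutoff=0.5):
    """AVG: average the base-classifier probabilities, then apply the cutoff."""
    return 1 if sum(probs) / len(probs) >= cutoff else 2

def maj(probs, cutoff=0.5):
    """MAJ: each base classifier casts a vote; the majority class wins."""
    positive_votes = sum(1 for p in probs if p >= cutoff)
    return 1 if positive_votes >= len(probs) / 2.0 else 2

def wavg(probs, weights, cutoff=0.5):
    """WAVG: weighted average of the probabilities (e.g. weighted by accuracy)."""
    score = sum(w * p for w, p in zip(weights, probs)) / sum(weights)
    return 1 if score >= cutoff else 2
```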
Experiment – Variables
Variable | Role | Description
Net Benefit Function | DV | benefit of TP × number of TP − investigation cost × number of (TP + FP)
Combiner Method | Main IV | IMF, AVG, WAVG, MAJ
Number of Agents | Manipulated Moderator | 2, 4, 6, …, and 22 agents in the ensemble
Cost-Benefit Ratio | Manipulated Moderator | 1:100, 1:50, 1:25, 1:10, 1:7.5, 1:4, 1:3, 1:2, 1:1.5, 1:1, 1.5:1, and 2:1
Dataset Size | Measured Moderator | Number of dataset records
Dataset Agent Accuracy | Measured Moderator | Average dataset hit-rate of the base-classifiers
Dataset Positive Ratio | Measured Moderator | Number of positive instances / total number of records
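A one-function sketch of the net-benefit dependent variable defined in the table above; argument names are illustrative.

```python
def net_benefit(num_tp, num_fp, benefit_per_tp, investigation_cost):
    """Net benefit = benefit of a TP * number of TP
                   - investigation cost * number of (TP + FP).
    Every object classified as positive is investigated at a cost,
    but only the true positives produce a benefit."""
    return benefit_per_tp * num_tp - investigation_cost * (num_tp + num_fp)
```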
Experiment – Datasets
Dataset | Instances | Attributes | Classes | Positive Rate | Average Agent Accuracy
Adult | 32,561 | 14 | 2 | 24.1% | 82.00%
Wisconsin Breast Cancer | 699 | 110 | 2 | 34.5% | 94.02%
Contraceptive Method Choice | 1,473 | 10 | 2 | 57.3% | 66.12%
Horse Colic | 368 | 22 | 2 | 37.0% | 81.68%
Covertype (class 1 & 2) | 10,000 | 11 | 2 | 72.9% | 86.60%
Covertype (class 3 & 4) | 10,395 | 11 | 2 | 6.8% | 95.17%
Covertype (class 5 & 6) | 10,009 | 11 | 2 | 66.9% | 96.17%
Australian Credit Approval | 690 | 15 | 2 | 55.5% | 83.65%
German Credit | 1,000 | 20 | 2 | 30.0% | 72.02%
Pima Indians Diabetes | 768 | 8 | 2 | 34.9% | 73.53%
Thyroid Disease | 3,772 | 5 | 2 | 7.7% | 92.66%
Labor | 57 | 16 | 2 | 64.9% | 80.06%
Mushrooms | 8,124 | 5 | 2 | 48.2% | 91.19%
Sick | 3,772 | 12 | 2 | 6.9% | 96.84%
Spambase | 4,601 | 58 | 2 | 39.4% | 87.55%
Splice-junction Gene Sequences | 3,190 | 20 | 2 | 51.9% | 62.31%
Waveform | 3,345 | 40 | 2 | 49.4% | 86.91%
Experiment – Base-Classifiers
- 22 base classifiers from Weka generated decision output for each dataset
- 10-fold cross validation
- Standard parameter settings
Base-Classifiers: ADTree, BayesNet, ConjunctiveRule, DecisionStump, DecisionTable, IBk, J48, JRip, KStar, LMT, LWL, MultilayerPerceptron, NaiveBayes, NBTree, NNge, OneR, PART, RandomForest, RBFNetwork, Ridor, SimpleLogistic, SMO
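The decision outputs were generated in Weka; purely as an illustration of the same 10-fold cross-validation setup, here is a sketch in Python with scikit-learn (a different toolkit, and a much shorter classifier list than the 22 used in the paper).

```python
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-ins for the Weka base classifiers, all with default ("standard") parameters.
base_classifiers = [DecisionTreeClassifier(), GaussianNB(), KNeighborsClassifier()]

def base_decisions(X, y):
    """Out-of-fold positive-class probabilities for every record,
    one vector per base classifier, via 10-fold cross validation."""
    return [cross_val_predict(clf, X, y, cv=10, method="predict_proba")[:, 1]
            for clf in base_classifiers]
```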
Experiment – Implementation Facts
- 2,000,900 base-classifier decisions (5,350 average dataset size × 17 datasets × 22 base-classifiers) generated in Weka.
- Combiner methods created using Visual Basic and LINGO.
- Each of the 17 datasets combined 100 times (4 combiner methods × 11 levels of number of agents + 4 combiner methods × 14 cost-benefit ratios) for a total of 1,700 observations.
- A total of 9,095,000 (100 dataset combinations × 5,350 average dataset size × 17 datasets) aggregated decisions generated.
Results – Main Experiment
Does IMF outperform AVG, MAJ and WAVG? YES!
Net Benefit IMF <> Net Benefit AVG, WAVG, MAJ (moderator: none): main effect p=0.001; IMF > AVG (p=0.0030); IMF > MAJ (p=0.0011); IMF > WAVG (p=0.0001)
Are these results sensitive to the moderators?
- number of agents: p=0.821
- dataset size: p=0.922
- cost-benefit ratio, dataset positive ratio and dataset average agent accuracy: p=0.304, p=0.739 and p<0.001
Combiner Method * Positive Ratio Follow-Up Analysis
[Line plot of lnNetBenefit (y-axis, approx. 3.25–4) against positive_ratio (x-axis, 10–70) for AVG, IMF, MAJ and WAVG]
Combiner Method * Positive Ratio Follow-Up Analysis
[LS-means plot of lnNetBenefit by combiner method (AVG, IMF, MAJ, WAVG) for the high, middle and low positive-ratio groups]
Net Benefit IMF <> Net Benefit AVG, WAVG, MAJ within the high, low and middle positive-ratio groups: p=0.1843, p=0.2849 and p=0.0010
Results – Additional Analyses
Analysis | Test | Result
Dynamic cutoff algorithm performance | Net Benefit MAJ-DYN <> Net Benefit MAJ-OPT, MAJ-RAN | p=0.0006
Impact of investigation time lags on IMF performance | IMF(lag time) | p=0.8908
Impact of binary search stopping parameter on IMF performance | min search space | interactions (p>0.12); main effect (p=0.32)
Impact of house-enforced max bet on IMF performance | max bet | average-agent accuracy interaction (p=0.03); other interactions (p>0.05)
Results Dynamic Cutoff – Follow-Up Analysis
Contributions
IMF is a novel combiner method that:
- outperforms AVG, MAJ and WAVG;
- can adapt to changes in ensemble composition and in relative base-classifier accuracy; and
- does not assume that the base-classifiers are cooperative.
Dynamic Cutoff algorithm that:
- outperforms randomly selected cutoffs; and
- does not perform significantly differently from optimal cutoffs.
Future Research…
- Implement and test IMF in different MCC architectures, e.g., bagging, boosting, etc.
- Evaluate other agent behaviors (human biases, different utility functions, belief updating, etc.).
Questions and ?
Appendix
Experiment – Additional Objectives
- How do investigation time lags impact the performance of IMF?
- What is the impact of selecting different IMF parameter values?
- How good is the dynamic cut-off algorithm?
Results – Additional Analyses
Analysis | Test | Result
Dynamic cut-off algorithm performance | Net Benefit MAJ-DYN <> Net Benefit MAJ-OPT, MAJ-RAN | p=0.0006
Impact of investigation time lags on IMF performance | IMF(lag time) | p=0.8908
Result sensitivity to cost-based retraining | Training Type * Combiner Method | p=0.13
Impact of binary search stopping parameter on IMF performance | min search space | interactions (p>0.12); main effect (p=0.32)
Impact of house-enforced max bet on IMF performance | max bet | average-agent accuracy interaction (p=0.03); other interactions (p>0.05)
Results – Additional Analyses
[Plot: y-axis: z-value of log net benefit; x-axis: treatment level of the k factor; markers: average-agent accuracy (%)]
Results – Cost Based Retraining
Experiment - Overview
- 2,000,900 base-classifier decisions (5,350 average dataset size × 17 datasets × 22 base-classifiers) generated in Weka.
- Combiner methods created in Visual Basic and LINGO.
- Each dataset combined 100 times (4 combiner methods × 11 levels of number of agents + 4 combiner methods × 14 cost-benefit ratios).
*Multi-Classifier Combination
Combiner Method "Weaknesses":
- performance is not "perfect"
- training data requirement
- stable base-classifier performance
- static ensemble composition
- assuming cooperative base classifiers
- no integration with coordination mechanism
Information Market based Fusion
IMF Algorithm – Key Ideas
- Odds are optimized based on agent bets in an iterative fashion until optimal or near-optimal odds are found.
- The odds are used to determine object class membership.
- Winnings are distributed based on the parimutuel system in both implementations.
*Information Market based Fusion
Behavioral Models (Plott, et al. 2003)
- Decision Theory Private Information (DTPI)
  - Bets based on private information only.
  - No learning (private information not updated).
- Competitive Equilibrium Private Information (CEPI)
  - Bets based on private information and market prices.
  - No learning.