Profit Mining From Patterns to Action Ke Wang


Number of slides: 22

Profit Mining: From Patterns to Action
Ke Wang, Senqiang Zhou, Jiawei Han
Simon Fraser University

Why Profit Mining?
• A major obstacle in data mining applications is the gap between:
  – statistics-based pattern extraction, and
  – value-based decision making.
• Profit mining:
  – value-based data mining.

An Example
• Suppose we want to maximize profit. Association rules [AIS93] such as
  – {Perfume} -> Lipstick (more often)
  – {Perfume} -> Diamond (more profit)
  do not suggest which items (and prices) to recommend to a customer who bought Perfume.
• Correlation, classification, etc. have similar problems.

The Problem
• Given: several transactions of the form:
  – {, …, | }, with each sale described by Item, Promotion code, and Quantity.
  – | separates non-target items from target items, as in { | }.
• Recommend target items to customers who buy non-target items, so as to maximize profit.

Not a Prediction Problem
• An example:
  – 100 customers each bought 1 pack at $1/pack. Profit = 100 × (1 − 0.5) = $50.
  – 100 customers each bought 4 packs at $3.20 per 4-pack. Profit = 100 × (3.2 − 2) = $120.
• Prediction repeats history.
• Profit mining gets smarter from history by recommending the “right items” at the “right prices”.
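The arithmetic in the example can be checked with a short sketch. It assumes a cost of $0.50 per pack, which is what the slide's figures imply (1 − 0.5 for one pack, 3.2 − 2 for four); the function and variable names are illustrative only.

```python
def total_profit(customers, price_per_sale, packs_per_sale, cost_per_pack):
    """Profit when every customer makes one sale at the given package price."""
    margin_per_sale = price_per_sale - packs_per_sale * cost_per_pack
    return customers * margin_per_sale

COST_PER_PACK = 0.5  # assumption: implied by the slide's arithmetic

# 100 customers buying a single pack at $1
single_pack = total_profit(100, 1.0, 1, COST_PER_PACK)

# 100 customers buying a 4-pack at $3.20
four_pack = total_profit(100, 3.2, 4, COST_PER_PACK)

print(round(single_pack, 2), round(four_pack, 2))
```

The per-pack price drops from $1.00 to $0.80, yet the profit per customer rises from $0.50 to $1.20, which is why repeating the historical sale is not the best recommendation.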

Challenge I - notion of profit
• A purely statistical approach favors
  – {Perfume} -> Lipstick.
• A purely profit-driven approach favors
  – {Perfume} -> Diamond.
• Profit mining considers:
  – both statistical significance and profit significance.

Challenge II - customer intention
• Mining On Availability (MOA):
  – Paying a higher price implies a willingness to pay a lower price.
• {} -> can be extracted from the transaction { | }.
• Recognizing this behavior brings new sales opportunities (at a lower price).
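One way to read the MOA assumption is as a hit test: a recorded purchase supports any recommendation of the same item at an equal or lower unit price. The sketch below is an illustration under that reading, not the authors' definition; the Offer class, its fields, and the comparison by unit price are all assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Offer:
    item: str
    price: float  # price of the whole package
    packs: int    # number of units in the package

    @property
    def unit_price(self) -> float:
        return self.price / self.packs

def moa_hit(recommended: Offer, purchased: Offer) -> bool:
    """True if the recorded purchase implies acceptance of the recommendation.

    Assumption: a customer who paid a given unit price would also have
    accepted the same item at any unit price that is not higher.
    """
    return (recommended.item == purchased.item
            and recommended.unit_price <= purchased.unit_price)

# Hypothetical offers reusing the prices from the earlier pack example.
one_pack = Offer("pack", 1.0, 1)
four_pack = Offer("pack", 3.2, 4)

print(moa_hit(recommended=four_pack, purchased=one_pack))  # cheaper per unit
print(moa_hit(recommended=one_pack, purchased=four_pack))  # dearer per unit
```

Under this reading a transaction at $1/pack also counts as evidence for the $3.20 4-pack, but not the other way round.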

Challenge III - search space
• Thousands of items, and many more sales. Any combination can trigger a recommendation.
• Searching over alternative concepts (food, meat, etc.) and prices makes it worse.

Step 1: generating rules
• Association rules:
  – {Diaper -> Beer}, supp = 10%, conf = 80%
• Recommendation rules:
  – {g1, …, gk} -> , where gi is , or Item, or Concept.
  – {} ->
  – {Flaked.Chick.} ->
  – {Meat} ->

Handling alternative concepts and prices
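A rule body that mentions a concept such as Meat should match a transaction containing any item under that concept. The sketch below shows one way to test this with a child-to-parent taxonomy. The taxonomy itself (Flaked.Chick. under Meat, Meat under Food) is an assumption based on the item and concept names in the slides, and the function names are illustrative.

```python
# Hypothetical taxonomy, child -> parent (an assumption, not from the slides).
PARENT = {"Flaked.Chick.": "Meat", "Meat": "Food"}

def ancestors_or_self(item):
    """Return the item followed by all of its ancestors in the taxonomy."""
    chain = [item]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def body_matches(body, bought_items):
    """True if every generalized item in the rule body is a bought item
    or an ancestor concept of a bought item."""
    covered = set()
    for item in bought_items:
        covered.update(ancestors_or_self(item))
    return all(g in covered for g in body)

print(body_matches({"Meat"}, ["Flaked.Chick."]))
print(body_matches({"Flaked.Chick."}, ["Flaked.Chick."]))
print(body_matches({"Beer"}, ["Flaked.Chick."]))
```

Prices could be generalized in the same spirit, for example by treating a paid price as covering every lower price, but the slides do not give enough detail to sketch that part faithfully.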

Step 2: building the model
• We rank rules by the “average profit” made by the recommendation of a rule.
  – {} -> matches
    – t1: {| } (a hit)
    – t2: {|} (a miss)
  – If the cost of Sunchip is $0.7, the average profit is $0.15.
• To recommend, we select the matching rule of the highest possible rank.
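The ranking step can be sketched as follows. The recommended price did not survive in the slide text, so the $1.00 used here is an assumption, chosen because one hit and one miss at a $0.70 cost then give the stated $0.15. A hit is also assumed to sell one unit. The rule tuples and item names are hypothetical.

```python
def average_profit(price, cost, outcomes):
    """Average profit per matched transaction.

    outcomes holds one boolean per transaction the rule matches:
    True for a hit, False for a miss.
    """
    if not outcomes:
        return 0.0
    hits = sum(1 for hit in outcomes if hit)
    return hits * (price - cost) / len(outcomes)

def pick_rule(rules, basket):
    """Return the matching rule with the highest average profit, or None.

    rules is a list of (average_profit, body, recommendation) tuples,
    where body is a set of items that must all be in the basket.
    """
    matching = [rule for rule in rules if rule[1] <= basket]
    return max(matching, key=lambda rule: rule[0], default=None)

# One hit (t1) and one miss (t2); the price is an assumed value.
sunchip_avg = average_profit(price=1.0, cost=0.7, outcomes=[True, False])
print(round(sunchip_avg, 2))

# Hypothetical ranked rules.
rules = [(0.15, {"a"}, "x"), (0.40, {"a", "b"}, "y")]
print(pick_rule(rules, {"a", "b"}))
```

A basket containing only "a" matches just the first rule, so the lower-ranked recommendation "x" is made in that case.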

Step 3: pruning the model
• The model favors rules with a “high average profit”.
• Such rules may bring a large profit.
• Such rules may also be random noise.
• They cannot be pruned simply on the basis of statistical frequency.

Pruning the model
• We prune rules to increase the estimated profit on the whole population.
• We organize rules into a specificity tree: the parent of a rule is the highest-ranked rule that is more general than it.
• We cut off the tree to maximize the estimated profit.
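The cut can be pictured as a bottom-up pass over the specificity tree, in the style of decision-tree pruning. The sketch below is an illustration only: the slides do not say how the estimated profit is computed, so both profit figures on each node are taken as given inputs, and the Node class and its field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    own_profit: float      # estimated profit on the transactions this rule
                           # covers while its children are kept
    profit_if_leaf: float  # estimated profit if the subtree is cut and this
                           # rule covers everything below it
    children: list = field(default_factory=list)

def prune(node):
    """Return the best estimated profit for the subtree rooted at node,
    removing the children in place when cutting is at least as good."""
    kept = node.own_profit + sum(prune(child) for child in node.children)
    if node.profit_if_leaf >= kept:
        node.children = []
        return node.profit_if_leaf
    return kept

# Hypothetical tree: the specific rule a1 does not pay for itself.
a1 = Node("a1", own_profit=2, profit_if_leaf=2)
a = Node("a", own_profit=5, profit_if_leaf=8, children=[a1])
b = Node("b", own_profit=12, profit_if_leaf=12)
root = Node("root", own_profit=10, profit_if_leaf=25, children=[a, b])

best = prune(root)
print(best, [child.name for child in root.children], a.children)
```

Here rule a1 is cut because a alone is estimated to earn 8 against 7 with a1 kept, while the root keeps both children because 10 + 8 + 12 exceeds 25.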

Evaluation
• Synthetic datasets: IBM synthetic data generator, modified to include price and cost.
• 1000 items and 1000K transactions.
• For non-target item i:
  – cost(i) = c/i
  – price_j = (1 + j × 10%) × cost(i), j = 1, 2, 3, 4.
• For target items:
  – Dataset I has 2 target items.
  – Dataset II has 10 target items.
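The cost and price schedule for non-target items can be written out directly from the two formulas. The constant c is not given in the slides, so the value 100.0 below is only a placeholder, and the function names are illustrative.

```python
def cost(i, c):
    """Cost of non-target item i (items are numbered from 1)."""
    return c / i

def prices(i, c):
    """The four prices of item i: markups of 10%, 20%, 30%, and 40% on cost."""
    return [(1 + j * 0.10) * cost(i, c) for j in (1, 2, 3, 4)]

C = 100.0  # placeholder: the slides do not specify c

for i in (1, 4):
    print(i, cost(i, C), [round(p, 2) for p in prices(i, C)])
```

Because cost(i) falls as i grows, higher-numbered items are cheaper, and every item is offered at four price points above its cost.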

[Figure: Profit Gain on Dataset I]

[Figure: Hit Ratio on Dataset I]

[Figure: Hit Ratio on Dataset I]

[Figure: Profit Gain on Dataset II]

[Figure: Hit Ratio on Dataset II]

[Figure: Hit Ratio on Dataset II]

Conclusion
• Proposed a new direction for data mining: mining for profit.
• Business goals are factored directly into data mining.
• Related work: the microeconomic view of data mining [KPR98].