af624bc0eded1bc57d190136183d0867.ppt
- Number of slides: 17
Chapter 8: Collaborative Filtering (as of 20.12.00)
Recommended References
• Shardanand, U., and Maes, P. (1995) Social Information Filtering: Algorithms for Automating ‘Word of Mouth’. In: Proceedings of CHI 95, 210-217. http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/us_bdy.htm
• Billsus, D., and Pazzani, M. J. (1998) Learning Collaborative Information Filters. In: The 15th International Conference on Machine Learning, ICML-98. http://www.ics.uci.edu/~dbillsus/papers/icml98.pdf
• Smyth, B., and Cotter, P. (1999) Surfing the Digital Wave: Generating Personalised TV Listings using Collaborative, Case-Based Recommendation. In: Proceedings of the Third International Conference on Case-Based Reasoning, ICCBR 99, Springer.
• Berkeley School of Information Systems, Link Collection on Collaborative Filtering. http://www.sims.berkeley.edu/resources/collab/
(c) 2000 Dr. Ralph Bergmann and Prof. Dr. Michael M. Richter, Universität Kaiserslautern
Content-based vs. Collaborative Filtering/Selection
• Filtering and selection mean basically the same:
– Filtering: removing certain objects from a universe
– Selection: picking certain objects from a universe
• The previously discussed approaches for selecting products are content-based.
• They require a representation of products and a notion of similarity between demands and products (see chapters 4-7).
• Alternative approach discussed in this chapter: collaborative selection
Collaborative Filtering Approach (1): Basic Idea
• Select items based on aggregated user ratings of those items. You buy an item only because many of your friends (who share your interests) bought it and like it, although you don’t really know anything about the product.
• Consider ratings of similar users (customers) only
• Requires stored user profiles of the kind:
– Customer C1 likes (buys) products p1, p4, p8
– Customer C2 likes (buys) products p1, p2, p8
– ...
Collaborative Filtering Approach (2)
• Users 1, 2 and 3 are similar since they all bought products A, B, and C
• D & E can be recommended to User 1 based on this shared interest
• Recommendation based on observations
– no detailed representation of D or E
– users must be identified, i.e., a user profile must be available
[Figure: Users 1, 2, 3 and Products A, ..., F]
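The idea on this slide can be sketched in a few lines of Python. The purchase data below is hypothetical (only the user and product labels of the figure are known), and `recommend` is an illustrative helper, not part of any library. Treating a user as similar when at least `min_shared` purchases are shared is an assumption made for this sketch.

```python
# Hypothetical purchase profiles: user -> set of bought products.
purchases = {
    "User 1": {"A", "B", "C"},
    "User 2": {"A", "B", "C", "D"},
    "User 3": {"A", "B", "C", "E"},
    "User 4": {"F"},
}

def recommend(target, profiles, min_shared=3):
    """Return products bought by similar users but not by the target."""
    own = profiles[target]
    recommended = set()
    for user, items in profiles.items():
        if user == target:
            continue
        # Similarity is judged only from shared purchases (assumed rule);
        # nothing is known about the products themselves.
        if len(own & items) >= min_shared:
            recommended |= items - own
    return sorted(recommended)

print(recommend("User 1", purchases))  # ['D', 'E']
```

User 4 shares no purchases with anyone, so this sketch recommends nothing to that user.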
First Realization (1)
• Customer U gives ratings $U_x$ for certain products $x \in P_U$
• A rating $U_x$ is a value from an ordered set, e.g., an integer value 1..7 (1: don’t like at all ... 4: neutral ... 7: great stuff)
• Note: Not every customer rates every product
• Determine the similarity of customers U and V based on the similarity of their ratings of those products both have rated, i.e., $P_{U \cap V}$.
First Realization (2): Distance/Similarity Measures for Customers
• Given: two customers U and V
• Mean Squared Difference (distance measure)
• Pearson correlation coefficient may be better: $r_{Pearson}(U, V)$
– $r_{UV} > 0$: positively related
– $r_{UV} = 0$: not related
– $r_{UV} < 0$: negatively related
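The slide names both measures without spelling them out. A minimal sketch, assuming the standard textbook definitions restricted to the commonly rated products: the mean squared difference is $\frac{1}{|P_{U \cap V}|} \sum_{x \in P_{U \cap V}} (U_x - V_x)^2$, and the Pearson coefficient is the covariance of the two rating lists divided by the product of their standard deviations. The function names and the sample ratings are hypothetical.

```python
from math import sqrt

def common_items(u, v):
    """Products rated by both customers."""
    return sorted(set(u) & set(v))

def mean_squared_difference(u, v):
    """Distance: average squared rating difference on common products."""
    items = common_items(u, v)
    if not items:
        return None  # no overlap, distance undefined
    return sum((u[x] - v[x]) ** 2 for x in items) / len(items)

def pearson(u, v):
    """Pearson correlation of the ratings on common products, in [-1, 1]."""
    items = common_items(u, v)
    if len(items) < 2:
        return None  # too few common products
    mean_u = sum(u[x] for x in items) / len(items)
    mean_v = sum(v[x] for x in items) / len(items)
    cov = sum((u[x] - mean_u) * (v[x] - mean_v) for x in items)
    var_u = sum((u[x] - mean_u) ** 2 for x in items)
    var_v = sum((v[x] - mean_v) ** 2 for x in items)
    if var_u == 0 or var_v == 0:
        return None  # constant ratings, correlation undefined
    return cov / sqrt(var_u * var_v)

# Hypothetical ratings on the 1..7 scale; p1, p2, p3 are rated by both.
U = {"p1": 7, "p2": 4, "p3": 1, "p4": 6}
V = {"p1": 6, "p2": 4, "p3": 2, "p5": 3}
print(mean_squared_difference(U, V))
print(pearson(U, V))
```

In this example V rates less extremely than U but in the same direction, so the correlation is maximal even though the mean squared difference is not zero. This is one reason a correlation measure may be preferable to a plain distance.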
First Realization (3): Determining Recommendations
• The profile of a new customer W is compared to the profiles of all known users U, and the similarity/distance $r_{WU}$ is determined
• Users whose profile similarity exceeds a certain threshold are selected
• The predicted rating for an item is a weighted average of the ratings of similar users for that item
• Products with the highest predicted rating $W_x$ are recommended to W
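The recommendation step can be sketched as follows. The slide only says "weighted average", so using the similarity values themselves as weights is an assumption, as are the threshold value, the function name `predict_ratings`, and all data. The similarities $r_{WU}$ are taken as given.

```python
def predict_ratings(similarities, profiles, rated_by_w, threshold=0.5):
    """Predict W's ratings as similarity-weighted averages over similar users."""
    # Keep only users whose similarity to W exceeds the threshold.
    neighbours = {u: s for u, s in similarities.items() if s > threshold}
    weighted_sum, weight_total = {}, {}
    for user, sim in neighbours.items():
        for product, rating in profiles[user].items():
            if product in rated_by_w:
                continue  # W has already rated this product
            weighted_sum[product] = weighted_sum.get(product, 0.0) + sim * rating
            weight_total[product] = weight_total.get(product, 0.0) + sim
    return {p: weighted_sum[p] / weight_total[p] for p in weighted_sum}

# Hypothetical similarities r_WU and ratings on the 1..7 scale.
similarities = {"U1": 0.9, "U2": 0.6, "U3": 0.1}
profiles = {
    "U1": {"p1": 7, "p2": 6},
    "U2": {"p1": 5, "p3": 2},
    "U3": {"p3": 7},
}
predicted = predict_ratings(similarities, profiles, rated_by_w={"p2"})
# Recommend the products with the highest predicted rating first.
ranking = sorted(predicted, key=predicted.get, reverse=True)
print(ranking)  # ['p1', 'p3']
```

U3 falls below the threshold, so its high rating for p3 is ignored. A positive threshold also keeps all weights positive, so the division is safe.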
Shortcomings of the First Realization
• Correlation is based only on items which two customers have in common
– When thousands of items are available, there is only little overlap!
– Then: recommendations are based on only a few observations
• The correlation coefficient is not transitive; however, customer similarity is at least to some degree transitive
– If A and B are correlated and B and C are correlated, then A and C should also be correlated
Second Realization (1)
• We view collaborative filtering as a classification task
• For each customer $U_i$, determine a classifier $f_i$ that classifies a product into classes, e.g.
– { like, dislike } or
– ratings from 1...7
• A product is represented by the rating vector of the other customers
• The classifier is a function $f_i$: OthersRatingVector → MyRating, i.e., the predicted rating for product x is determined by $f_i(U_{1x}, \ldots, U_{nx})$.
• This classifier can be learned from examples using machine learning approaches (see also chapter 13).
• Training examples for $f_i$ are the ratings of those products that are also rated by $U_i$
Second Realization (2): Construction of Training Examples
• Current ratings (Ui: customers, Pi: products, +: like, -: dislike, blank: no information)
P1 U1 P2 P3 + U2 P4 P5 + - - U3 + + - U4 + - - +
• Training examples for U4 (columns: U1+, U1-, U2+, U2-, U3+, U3-, Class)
E1 1 0 0 0 1
E2 0 0 0 1 1 0 0
E3 1 0 0 1 0
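The construction can be sketched in Python. The rating matrix below is hypothetical and is not a reconstruction of the slide's table. Each product the target customer has rated becomes one training example, and every other customer contributes two boolean features ("likes" and "dislikes"), so that "no information" is encoded as 0, 0.

```python
# Hypothetical like/dislike matrix: '+' like, '-' dislike,
# a missing key means no information.
ratings = {
    "U1": {"P1": "+", "P3": "-"},
    "U2": {"P2": "-", "P4": "+"},
    "U3": {"P1": "+", "P2": "+", "P3": "-", "P5": "+"},
    "U4": {"P1": "+", "P2": "-", "P3": "-"},
}

def training_examples(ratings, target):
    """One example per product the target rated: (feature vector, class)."""
    others = sorted(u for u in ratings if u != target)
    examples = {}
    for product, label in sorted(ratings[target].items()):
        features = []
        for user in others:
            vote = ratings[user].get(product)
            # Two boolean features per user: "likes" and "dislikes".
            features += [int(vote == "+"), int(vote == "-")]
        examples[product] = (features, label)
    return others, examples

others, examples = training_examples(ratings, "U4")
for product, (features, label) in examples.items():
    print(product, features, label)
```

With three other customers, each example has six features, ordered U1+, U1-, U2+, U2-, U3+, U3-. A classifier trained on these examples could then be applied to the feature vector of a product the target has not rated yet, such as P5.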
Second Realization (3)
• Various machine learning approaches can be applied
• Feed-forward nets with one hidden layer with two units show good results; training with backpropagation
• Problems:
– High dimensionality of training data
– Sparse data (i.e., only few ‘1’ entries, many ‘0’s)
• Methods for reducing the dimensions (compression) must be applied during a pre-processing step
– Choose not all users, but characteristic (reference) users only
– LSI approach (see Billsus & Pazzani, 1998)
Drawbacks of Collaborative Filtering
• No anonymity: user profiles are required and must be stored
• The pump priming problem:
(1) When a new store is launched, no ratings are available → poor recommendations
(2) When a new product emerges, no ratings for this product are available → the new product is never recommended
• Large training effort involved
Application 1: Amazon Book Store
Customers who bought this book also bought:
· Reinforcement Learning: An Introduction; R. S. Sutton, A. G. Barto
· Advances in Knowledge Discovery and Data Mining; U. M. Fayyad
· Probabilistic Reasoning in Intelligent Systems; J. Pearl
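A list of this kind can be produced by simple co-purchase counting. The sketch below illustrates that general idea only; it is not a description of Amazon's actual system, and the function name `also_bought` and the purchase histories are hypothetical.

```python
from collections import Counter

def also_bought(baskets, item, top_n=3):
    """Rank items by how often they share a purchase history with `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            counts.update(basket - {item})
    # Sort by count (descending), then by name, for a deterministic order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ordered[:top_n]]

# Hypothetical purchase histories, one set of titles per customer.
baskets = [
    {"ML", "RL", "KDD"},
    {"ML", "RL"},
    {"ML", "Prob"},
    {"RL", "Prob"},
]
print(also_bought(baskets, "ML"))  # ['RL', 'KDD', 'Prob']
```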
Application 2: Personalized TV Program (www.ptv.ie)
• Generates personalized TV guides
• Uses collaborative & case-based recommendations
– based on descriptions of programs
– based on likes of users with similar tastes
PTV: Recommendations & Feedback
Summary: Collaborative vs. Content Based
• Collaborative Filtering
– requires identification
– “representationless”
– pump priming problem
– scalability
– sparse matrix
• Content Based (CBR)
– can be anonymous
– requires representation
Current trend: combination of both approaches