Recommendation Systems
Prof. Dr. Daning Hu
Department of Informatics, University of Zurich
Nov 13th, 2012

Outline
  • Introduction
  • Recommendation Systems Approaches
      ◦ Collaborative Filtering
      ◦ Content-based
  • Social Contagion
  • Ref Book: Social Network Analysis: Methods and Applications (Structural Analysis in the Social Sciences)
      ◦ http://www.amazon.com/Social-Network-Analysis-Applications-Structural/dp/0521387078

Introduction
  • Recommendation systems are a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item or social element they had not yet considered (Wikipedia), using a model built from
      ◦ the characteristics of an item (content-based approaches), or
      ◦ the user's social environment (collaborative filtering approaches).
  • Studying consumer purchase behavior in an e-commerce setting
      ◦ In particular, the evolution of interactions among consumers and products, as reflected in online sales transactions.

Underlying Technologies: Machine Learning
  • Recommendation systems are instances of personalization software,
      ◦ adapting to the individual needs, interests, and preferences of each user,
      ◦ as part of Customer Relationship Management (CRM).
  • Machine Learning (ML) aims to learn a user model or profile of a particular user based on:
      ◦ Sample interactions
      ◦ Rated examples
  • The learned profile is used to filter information and predict consumer behavior.

Collaborative Filtering
  • Maintain a database of many users' ratings of a variety of items.
  • For a given user, find other similar users whose ratings strongly correlate with the current user's.
  • Recommend items that these similar users rated highly but that the current user has not yet rated.
  • Examples: Amazon, etc.

Collaborative Filtering
[Figure: a user database of rating vectors over items A to Z is matched by correlation against the active user's ratings; recommendations (here, item C) are then extracted from the matching users.]

Collaborative Filtering Method
  • Weight all users with respect to similarity with the active user.
  • Select a subset of the users (neighbors) to use as predictors.
  • Normalize ratings and compute a prediction from a weighted combination of the selected neighbors' ratings.
  • Present items with the highest predicted ratings as recommendations.
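The last three steps can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the lecture: the function name `predict_ratings`, the neighborhood size `k`, and the choice of mean-centering as the normalization are all assumptions, and the similarity weights from the first step are simply passed in.

```python
def predict_ratings(active, ratings, weights, k=2):
    """Rank unseen items for `active` (hypothetical helper).

    ratings: {user: {item: rating}}
    weights: {user: similarity of that user to the active user}
    """
    def mean(user_ratings):
        return sum(user_ratings.values()) / len(user_ratings)

    # Select the k most similar users as neighbors (assumed selection rule).
    others = [u for u in weights if u != active]
    neighbors = sorted(others, key=lambda u: weights[u], reverse=True)[:k]

    active_mean = mean(ratings[active])
    seen = set(ratings[active])
    candidates = {i for u in neighbors for i in ratings[u]} - seen

    predictions = {}
    for item in candidates:
        num = den = 0.0
        for u in neighbors:
            if item in ratings[u]:
                # Normalize by subtracting each neighbor's own mean rating.
                num += weights[u] * (ratings[u][item] - mean(ratings[u]))
                den += abs(weights[u])
        if den > 0:
            predictions[item] = active_mean + num / den

    # Present the highest predicted ratings first.
    return sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)


ratings = {
    "a": {"A": 9, "B": 3},
    "u1": {"A": 10, "B": 4, "C": 8},
    "u2": {"A": 5, "B": 9, "C": 2},
}
weights = {"u1": 1.0, "u2": -1.0}
print(predict_ratings("a", ratings, weights, k=2))
```

Mean-centering lets a negatively correlated neighbor still contribute: a low rating from a user with opposite tastes pushes the prediction up.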

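Similarity Weighting
  • Typically use the Pearson correlation coefficient between the ratings of the active user, a, and another user, u:
      $c_{a,u} = \dfrac{\mathrm{covar}(r_a, r_u)}{\sigma_{r_a}\,\sigma_{r_u}}$
  • $r_a$ and $r_u$ are the ratings vectors for the m items rated by both a and u.
  • Covariance: $\mathrm{covar}(r_a, r_u) = \dfrac{1}{m}\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)$
  • Standard deviation: $\sigma_{r_x} = \sqrt{\dfrac{1}{m}\sum_{i=1}^{m} (r_{x,i} - \bar{r}_x)^2}$, with mean $\bar{r}_x = \dfrac{1}{m}\sum_{i=1}^{m} r_{x,i}$
  • $r_{i,j}$ is user i's rating for item j.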
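As a small worked example, the Pearson weight between two users can be computed over their co-rated items as follows. The function name `pearson` and the dictionary layout are assumptions for illustration; returning 0.0 when fewer than two items are shared or a user's ratings have no variance is a design choice of this sketch, not something the slides specify.

```python
from math import sqrt


def pearson(ra, ru):
    """Pearson correlation over the items rated by both users.

    ra, ru: {item: rating} for the active user and another user.
    """
    common = sorted(set(ra) & set(ru))
    m = len(common)
    if m < 2:
        return 0.0  # not enough overlap to measure correlation

    mean_a = sum(ra[i] for i in common) / m
    mean_u = sum(ru[i] for i in common) / m

    covar = sum((ra[i] - mean_a) * (ru[i] - mean_u) for i in common) / m
    sd_a = sqrt(sum((ra[i] - mean_a) ** 2 for i in common) / m)
    sd_u = sqrt(sum((ru[i] - mean_u) ** 2 for i in common) / m)

    if sd_a == 0 or sd_u == 0:
        return 0.0  # constant ratings carry no correlation signal
    return covar / (sd_a * sd_u)


active = {"A": 9, "B": 3, "Z": 5}
other = {"A": 10, "B": 4, "C": 8, "Z": 6}
print(pearson(active, other))
```

Here the other user rates every shared item exactly one point higher, so the two users are perfectly correlated even though their absolute ratings differ.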

Cons of Collaborative Filtering
  • Cold Start: There must be enough other users already in the system to find a match.
  • Sparsity: The user/ratings matrix is sparse, so it is hard to find users who have rated the same items.
  • First Rater: Cannot recommend an item that has not been previously rated.
  • Popularity Bias: Cannot recommend items to someone with unique tastes.
      ◦ Tends to recommend popular items.

Content-based Approaches
  • Recommendations are based on information about the content of items rather than on other users' opinions.
  • Uses machine learning algorithms to induce a profile of the user's preferences from examples, based on content features.
      ◦ No need for data on other users.
      ◦ No cold-start or sparsity problems.
      ◦ Able to recommend to users with unique tastes.
      ◦ No first-rater problem.
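One simple way to picture a content-based profile is as a weighted average of item feature vectors, scored against unseen items by cosine similarity. The sketch below assumes hypothetical binary genre features and uses this averaging scheme as a stand-in for whatever learning algorithm is actually applied; the names `build_profile` and `recommend` are illustrative only.

```python
from math import sqrt


def build_profile(rated, features):
    """Sum item feature vectors, weighted by rating minus the user's mean."""
    mean = sum(rated.values()) / len(rated)
    size = len(next(iter(features.values())))
    profile = [0.0] * size
    for item, rating in rated.items():
        for d, value in enumerate(features[item]):
            profile[d] += (rating - mean) * value
    return profile


def cosine(x, y):
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return sum(a * b for a, b in zip(x, y)) / norm if norm else 0.0


def recommend(rated, features):
    """Rank items the user has not rated by similarity to the profile."""
    profile = build_profile(rated, features)
    unseen = [i for i in features if i not in rated]
    return sorted(unseen, key=lambda i: cosine(profile, features[i]), reverse=True)


# Hypothetical feature vectors: [is_comedy, is_horror]
features = {"m1": [1, 0], "m2": [0, 1], "m3": [1, 0], "m4": [0, 1]}
rated = {"m1": 5, "m2": 1}  # this user's own ratings only
print(recommend(rated, features))
```

No other user's ratings appear anywhere in the computation, which is why an unrated item such as m3 can still be recommended.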

Combining Content and Collaboration
  • Content-based and collaborative methods have complementary strengths and weaknesses.
  • Combine methods to obtain the best of both:
      ◦ Apply both methods and combine recommendations.
      ◦ Use collaborative data as content.
      ◦ Use content-based predictor as another collaborator.
      ◦ Use content-based predictor to complete collaborative data.

Using Social Contagion for Recommendations
  • Intelligent advertising, product recommendation
  • Who are the most influential people?
  • What are the patterns of information diffusion?

Social Contagion Theory (Le Bon et al., 1895)
  • Le Bon, Park, and Blumer, the three major theorists, assumed that something happens in a crowd situation that can cause people to become irrational.
  • The social pathology and social contagion perspectives: the idea that someone who already has the affliction (behavior) can pass it on to someone else, and that it can rapidly infect others.
      ◦ Gabriel Tarde's work on the 'laws of imitation'
  • Applications: viral marketing, social media marketing

Social Recommendations for Marketing
  • Mass marketing is not the best way to attract people:
      ◦ Expensive
      ◦ Usually not very focused
  • Recommendations by people we know are more effective than input from unknown individuals:
      ◦ Content: Our friends know what we like.
      ◦ Homophily: Our friends and we are more likely to share interests and preferences.
      ◦ Biased: We listen more to what our friends say (usually).
      ◦ Inexpensive

Data
  • The dataset for this study was collected from a large online open source software (OSS) community, Ohloh, which provides information about 11,800 OSS projects involving 94,330 people:
      ◦ Positive evaluation relationships
      ◦ Developers' sociological features: nationality, geographical location, etc.
      ◦ OSS project-related information: primary programming language, development activity, ratings, etc.
      ◦ Data drawn from software revision control repositories: Subversion, CVS, and Git.
  • The Ohloh web site provides a REST-based application programming interface (API) for users to access and query its data.
Figure 1. Sample data from Ohloh developers

Statistical Analysis on Link Formation
  • Dependent variable: whether a developer D participates in an OSS project P at time T, coded as a binary variable ("Kudo" link).
  • Independent variables include three types of possible determinants:
      ◦ Homophily factors
      ◦ Shared affiliation factors
      ◦ Preferential attachment factors

Conditional Logit Analysis
  • Conditional logistic models (CLM) have been widely used to examine the determinants that affect individuals' choices (McFadden 1980; McFadden et al. 1974; Powell et al. 2005).
      ◦ Model human choice behavior: project participation choices.
  • It is specified as follows:
      $P(y_i = 1 \mid X) = \dfrac{\exp(\beta^{\top} X_i)}{\sum_{j \in J} \exp(\beta^{\top} X_j)}$
    where $y_i$ is the observed choice of the new developer to participate in project i, and $X_i$ is a vector of the factors that influence such a choice. J is the alternative set of projects available. The unknown coefficients $\beta$ are typically estimated by maximum likelihood methods.
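For reference, the maximum likelihood step can be written out under the standard conditional logit form. The notation here is an assumption chosen for illustration: $N$ developers, developer $n$ facing alternative set $J_n$ with factor vectors $X_{n,j}$, chosen project $c_n$, and coefficient vector $\beta$.

```latex
% Log-likelihood of the observed choices under a conditional logit model
\ell(\beta) = \sum_{n=1}^{N} \left[ \beta^{\top} X_{n,c_n}
              - \log \sum_{j \in J_n} \exp\!\left(\beta^{\top} X_{n,j}\right) \right],
\qquad
\hat{\beta} = \arg\max_{\beta} \, \ell(\beta)
```

Each term is the log of the probability that developer $n$ picks the project actually observed, so maximizing the sum favors coefficients under which the observed choices look most likely.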

Predicting Future Evaluation Choices
  • Our analysis also provides a prediction mechanism using the conditional logistic model and the discovered determinants.
  • For instance, if developer a and developer b both:
      ◦ Live in New York City (coefficient of homophily in location: 5.190)
      ◦ Use Java as their primary programming language (coefficient: 1.623)
      ◦ etc.
  • Then the probability that a chooses to positively evaluate b from an alternative set can be calculated.
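A rough sketch of that calculation is shown below, assuming the standard conditional logit form and using only the two coefficients quoted above. The binary indicator features, the alternative set (developers b, c, and d), and the function name `choice_probabilities` are hypothetical; the remaining determinants covered by "etc." are left out.

```python
from math import exp

# Coefficients quoted on the slide; all other determinants are omitted here.
COEFFICIENTS = {"same_location": 5.190, "same_language": 1.623}


def choice_probabilities(alternatives):
    """Conditional logit choice probabilities over an alternative set.

    alternatives: {name: {feature: 0 or 1}} describing how each candidate
    relates to the evaluating developer a (assumed encoding).
    """
    utilities = {
        name: sum(COEFFICIENTS[f] * x for f, x in feats.items())
        for name, feats in alternatives.items()
    }
    # Subtract the largest utility before exponentiating, for numerical stability.
    top = max(utilities.values())
    weights = {name: exp(u - top) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical alternative set for developer a.
alternatives = {
    "b": {"same_location": 1, "same_language": 1},  # New York City, Java
    "c": {"same_location": 0, "same_language": 1},  # Java only
    "d": {"same_location": 0, "same_language": 0},  # neither
}
probs = choice_probabilities(alternatives)
for name, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(p, 4))
```

With a location coefficient this large, a candidate in the same city dominates the alternative set, so b receives almost all of the probability mass in this toy example.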