Contextual Recommendation in Multi-User Devices

  • Number of slides: 29

Contextual Recommendation in Multi-User Devices
Raz Nissim, Michal Aharon, Eshcar Hillel, Amit Kagian, Ronny Lempel, Hayim Makabee

Recommendation in Personal Devices and Accounts (3/17/2018, slide 2)

Challenge: Recommendations in Shared Accounts and Devices
• "I am a 34 yo man who enjoys action and sci-fi movies. This is what my children have done to my Netflix account"

Our Focus: Recommendations for Smart TVs
• Smart TVs can track what is being watched on them
• Main problems:
  • Inferring who has consumed each item in the past
  • Knowing who is currently requesting the recommendations
  • "Who" can be a subset of users

Solution: Using Context
• Previous work: time of day

Context in this Work: The Current Item Being Watched

This Work: Contextual Personalized Recommendations
The Watch-It-Next problem:
• It is 8:30 pm and "House of Cards" is on
• What should we recommend to be watched next on this device?
• Implicit assumption: there's a good chance that whoever is in front of the set now will remain there
• Technically, think of an HMM where the hidden state corresponds to who is watching the set, and states don't change too often
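The HMM intuition above can be sketched with a toy transition matrix whose self-transition probability is high (the watcher rarely changes mid-session). The states and probabilities below are illustrative stand-ins, not values from the talk:

```python
# Hidden state = who is currently in front of the set.
# High self-transition probability encodes "states don't change too often".
states = ["parent", "kids"]
P_trans = {
    "parent": {"parent": 0.95, "kids": 0.05},
    "kids":   {"kids": 0.95, "parent": 0.05},
}

def prob_same_watcher(state, n_steps):
    """Chance the same watcher is still there after n recommendation steps."""
    return P_trans[state][state] ** n_steps
```

Under this toy model, the probability that the current watcher is still there one step later stays above 0.9, which is what justifies conditioning the recommendation on the item currently playing.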

Watch-It-Next Inputs and Output
• Input: the available programs, a.k.a. the "line-up"
• Output: ranked recommendations

Recommendation Settings: Exploratory and Habitual
• One typically doesn't buy the same book twice, nor do people typically read the same news story twice
• But people listen to the songs they like over and over again, and watch movies they like multiple times as well
• In the TV setting, people regularly watch series and sports events
• Habitual setting: all line-up items are eligible for recommendation to a device
• Exploratory setting: only items that were not previously watched on the device are eligible for recommendation
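The two eligibility rules can be sketched as a small filter; the function and argument names below are hypothetical:

```python
def eligible_items(lineup, watched_on_device, setting):
    """Items eligible for recommendation under each setting.

    lineup: the currently available programs.
    watched_on_device: the device's viewing history.
    """
    if setting == "habitual":
        return set(lineup)                            # everything in the line-up
    if setting == "exploratory":
        return set(lineup) - set(watched_on_device)   # only previously unseen items
    raise ValueError(f"unknown setting: {setting}")
```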

Contextual Recommendations in a Different Context
How can contextualized and personalized recommendations be served together?
[Diagram: recommendation approaches positioned along two axes, from Popular to Personalized and from Popular to Contextual]

Collaborative Filtering
• A fundamental principle in recommender systems
• Taps similarities in patterns of consumption/enjoyment of items by users
• Recommends to a user what users with detected similar tastes have consumed/enjoyed

Collaborative Filtering – Mathematical Abstraction
• Consider a consumption matrix R (|U| x |I|) of users and items:
  • r_{u,i} = 1 whenever person u consumed item i
  • In other cases, r_{u,i} might be person u's rating of item i
• The matrix R is typically very sparse, and often very large
• Related task on ratings data: matrix completion – predict users' ratings for items they have yet to rate, i.e. "complete" the missing values
• Real-life task: top-k recommendation – predict which yet-to-be-consumed items the user would most enjoy
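As a minimal illustration of this abstraction, the sketch below stores the sparse matrix R as a mapping from users to consumed items and scores unseen items by similarity-weighted votes. Jaccard similarity is a stand-in neighborhood method for illustration, not the technique used in the talk, and all data is made up:

```python
# R as a sparse dict-of-sets: user -> items consumed (r_{u,i} = 1).
R = {
    "u1": {"news", "late_night"},
    "u2": {"news", "late_night", "history"},
    "u3": {"cartoons", "cooking"},
}

def jaccard(a, b):
    """Similarity of two users' consumption rows of R."""
    return len(R[a] & R[b]) / len(R[a] | R[b])

def recommend(u):
    """Top-k idea: rank items u has not consumed by neighbors' weighted votes."""
    scores = {}
    for v in R:
        if v == u:
            continue
        for item in R[v] - R[u]:
            scores[item] = scores.get(item, 0.0) + jaccard(u, v)
    return sorted(scores, key=scores.get, reverse=True)
```

Here u2 is similar to u1, so u2's extra item is ranked first for u1.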

Collaborative Filtering – Matrix Factorization
• Latent factor models (LFM):
  • Map both users and items to some f-dimensional space R^f, i.e. produce f-dimensional vectors v_u and w_i for each user and item
  • Define rating estimates as inner products: q_{u,i} = <v_u, w_i>
  • Main problem: finding a mapping of users and items to the latent factor space that produces "good" estimates
• In matrix form: R (|U| x |I|) ≈ V (|U| x f) x W (f x |I|)
• Closely related to dimensionality-reduction techniques applied to the ratings matrix R (e.g. Singular Value Decomposition)
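A minimal SGD sketch of learning such a factorization on toy implicit data; the number of factors, learning rate, regularization and initialization are illustrative assumptions, not the paper's settings:

```python
# Toy observed entries of R: (user, item, r_{u,i}).
ratings = [("u1", "i1", 1.0), ("u1", "i2", 1.0),
           ("u2", "i2", 1.0), ("u2", "i3", 1.0)]
f, lr, reg = 2, 0.05, 0.01                       # factors, step size, L2 weight
v = {u: [0.1] * f for u in ("u1", "u2")}         # user vectors v_u
w = {i: [0.1] * f for i in ("i1", "i2", "i3")}   # item vectors w_i

def dot(a, b):
    """Rating estimate q_{u,i} = <v_u, w_i>."""
    return sum(x * y for x, y in zip(a, b))

for _ in range(500):                             # SGD epochs
    for u, i, r in ratings:
        err = r - dot(v[u], w[i])                # prediction error on this entry
        for j in range(f):                       # gradient step on both vectors
            vj, wj = v[u][j], w[i][j]
            v[u][j] += lr * (err * wj - reg * vj)
            w[i][j] += lr * (err * vj - reg * wj)
```

After training, the inner products reconstruct the observed entries closely, which is the "good estimates" criterion on this slide.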

LFMs' Rise to Fame: The Netflix Prize (2006–2009)
• Used extensively by the challenge winners, "BellKor's Pragmatic Chaos"

Latent Dirichlet Allocation (LDA) [Blei, Ng, Jordan 2003]
• Originally devised as a generative model of documents in a corpus, where documents are represented as bags-of-words
• Matrix view: the |D| x |W| document-word count matrix is approximated by V x U (with rows scaled by the document lengths L), where:
  • k is a parameter representing the number of "topics" in the corpus
  • V is a stochastic |D| x k matrix: V[d, t] = P(topic_t | document_d), t = 1, …, k
  • U is a stochastic k x |W| matrix: U[t, w] = P(word_w | topic_t), t = 1, …, k
  • L is a vector holding the documents' lengths (#words per document)

Latent Dirichlet Allocation (cont.)
• In our case: given a parameter k and the collection of devices (= documents) and their viewing history (= bags of shows), output:
  • k "profiles", where each profile is a distribution over items
  • An association of each device with a distribution over the profiles
• Profiles will hopefully represent viewing preferences such as:
  • "Kids shows"
  • "Cooking reality and home improvement"
  • "News and Late Night"
  • "History and Science"
  • "Redneck reality: fishing & hunting shows, MMA"
• A-priori probability of an item being watched on a device:
  Score(item | device) = Σ_{profile=1..k} P(item | profile) × P(profile | device)
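The scoring rule above can be sketched directly. The two profiles and all probabilities below are made-up stand-ins for a trained LDA model:

```python
# P(item | profile): each profile is a distribution over items.
p_item_given_profile = {
    "kids": {"cartoon": 0.8, "news_show": 0.2},
    "news": {"cartoon": 0.1, "news_show": 0.9},
}
# P(profile | device): each device is a distribution over profiles.
p_profile_given_device = {"d1": {"kids": 0.75, "news": 0.25}}

def score(item, device):
    """A-priori probability that `item` is watched on `device`:
    sum over profiles of P(item | profile) * P(profile | device)."""
    return sum(p_item_given_profile[t].get(item, 0.0) * p
               for t, p in p_profile_given_device[device].items())
```

A kids-leaning device scores the cartoon above the news show, as expected.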

Contextualizing Recommendations: Three Main Approaches
1. Contextual pre-filtering: use context to restrict the data to be modeled
2. Contextual post-filtering: use context to filter or re-weight the recommendations produced by conventional models
3. Contextual modeling: context information is incorporated in the model itself
   • Typically requires denser data due to many more parameters
   • Computationally intensive
   • E.g. Tensor Factorization [Karatzoglou et al., 2010]

Main Contribution: The "3-Way" Technique
• Learn a standard matrix factorization model (LFM/LDA)
• When recommending to a device d currently watching context item c, score each target item t as follows:
  S(t follows c | d) = Σ_{j=1..k} v_d(j) · w_c(j) · w_t(j)
• With LFM, this requires an additive shift of all vectors to get rid of negative values
• Results in "SequentialLFM/SequentialLDA" – a personalized contextual recommender
• The score is high for targets that agree with both the context and the device
• Again, there is no need to model context or change the learning algorithm; learn as usual, and just apply the change when scoring
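The 3-way score can be sketched in a few lines, assuming non-negative vectors (as after the additive shift mentioned above); the vectors below are illustrative, not learned:

```python
def three_way_score(v_d, w_c, w_t):
    """S(t follows c | d): sum over factors of the element-wise product of
    the device, context-item, and target-item latent vectors."""
    return sum(d * c * t for d, c, t in zip(v_d, w_c, w_t))

device   = [0.9, 0.1]   # device leans heavily toward factor 0 (say, "kids")
context  = [0.8, 0.2]   # a kids show is currently on
kids_tgt = [0.7, 0.3]   # candidate target aligned with the context
news_tgt = [0.2, 0.8]   # candidate target aligned with neither
```

The kids target scores highest because it agrees with both the device and the context on the same factor; a news target is suppressed even if the device occasionally watches news.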

Data: Historical Viewing Logs
• Triplets of the form (device ID, program ID, timestamp)
• We don't know who watched the device at that time
• Actually, we don't know whether anyone watched
[Illustration: a timeline of programs played on a device, asking "Is anyone watching?"]

Data by the Numbers
• Training data: three months' worth of viewership data

    Devices    Unique items*   Triplets
    339,647    17,232          more than 19M

• Test data: derived from one month of viewership data

    Setting      Test Instances   Average Line-up Size
    Habitual     ~3.8M            390
    Exploratory  ~1.7M            349

* Items are {movie, sports event, series} – not at the individual-episode level

Metric: Avg. Rank Percentile (ARP)
Rank Percentile (RP) properties:
• Ranges in (0, 1]
• Higher is better
• A random ranker scores ~0.5 in large line-ups
[Illustration: a line-up of four candidates for the next program; the item actually watched next gets RP = 1.0, 0.75, 0.50, or 0.25 depending on whether it was ranked first, second, third, or fourth]
Note: with large line-ups, ARP is practically equivalent to average AUC
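The metric can be sketched as follows, assuming some recommender has already assigned a score to every line-up item (ties are ignored for simplicity):

```python
def rank_percentile(scores, watched):
    """RP in (0, 1]; 1.0 means the item actually watched was ranked first."""
    ranking = sorted(scores, key=scores.get, reverse=True)  # best-first order
    rank = ranking.index(watched)                           # 0-based position
    return 1.0 - rank / len(ranking)

def average_rank_percentile(test_instances):
    """ARP over (scores, watched_item) test instances."""
    return sum(rank_percentile(s, w) for s, w in test_instances) / len(test_instances)
```

With a line-up of four items, the possible RP values are exactly 1.0, 0.75, 0.50 and 0.25, matching the illustration above.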

Baselines

    Name                     Personalized?   Contextual?
    General popularity       No              No
    Sequential popularity    No              Yes
    Temporal popularity      No              Yes
    Device popularity*       Yes             No
    LFM                      Yes             No
    LDA                      Yes             No

* Only applicable to habitual recommendations

Contextual Personalized Recommenders
• SequentialLDA [SequentialLFM]: 3-way element-wise multiplication of the device vector, the context-item vector, and the target-item vector
• TemporalLDA [TemporalLFM]: regular LDA/LFM score, multiplied by Temporal Popularity
• TempSeqLDA [TempSeqLFM]: 3-way score multiplied by Temporal Popularity
• All LDA/LFM models are 80-dimensional

Results (1): Sequential Context Matters
[Bar chart: ARP of LFM and LDA under three conditions – no context, the currently-watched item as context, and a random item as context; values range from ~0.58 to ~0.75, and for both models the correct context item yields the highest ARP]
The degradation when using a random item as context indicates that the correct context item reflects the current viewing session, and implicitly the current watchers of the device

Results (2): Sequential Context Matters
Device Entropy: the entropy of p(topic | device) as computed by LDA on the training data; high values correspond to diverse distributions

Results (3) – Exploratory Setting
[Bar chart: ARP of General, Sequential, and Temporal Popularity, and of the LFM/LDA models with their Sequential and TempSeq variants; ARP ranges from ~0.54 for General Popularity to ~0.83 for the best contextual personalized models]

Results (4) – Habitual Setting
[Bar chart: ARP of General, Sequential, Temporal, and Device Popularity, and of LDA with its Sequential and TempSeq variants; ARP ranges from ~0.55 for General Popularity to ~0.86 for the best models]

Conclusions
• Multi-user or shared devices pose challenging recommendation problems
• TV recommendations are characterized by two use cases – habitual and exploratory
• Sequential context helps – it "narrows" the topical variety of the program to be watched next on the device
  • Intuitively, context serves to implicitly disambiguate the current user or users of the device
• The 3-Way technique is an effective way of incorporating sequential context, with no impact on learning
• Future work: explore applications of Hidden Topic Markov Models [Gruber, Rosen-Zvi, Weiss 2007]

Thank You – Questions?
rlempel [at] yahoo-inc [dot] com