- Number of slides: 39
A Multimedia Semantic Recommender System for Cultural Heritage Applications
Massimiliano Albanese, Antonio D’Acierno, Vincenzo Moscato, Fabio Persia and Antonio Picariello
- Dipartimento di Informatica e Sistemistica, Multimedia Information System LABoratory, Università di Napoli “Federico II”
- ISA-CNR, Avellino
- UMIACS, University of Maryland, College Park
Introduction (1/2)
- One of the most important challenges in the information access field is information overload
- It is very difficult for an ordinary user to search huge data collections
  - Information filtering systems (Recommender Systems) are being developed
- A Recommender System
  - suggests items to users on the basis of their needs and preferences
Introduction (2/2)
- The main goal of our work is:
  - To present a novel approach to recommendation in multimedia browsing systems, based on an importance ranking algorithm that strongly resembles the well-known PageRank ranking strategy
- We propose a method that computes customized recommendations by combining, in an original way:
  - Intrinsic features of multimedia objects
  - Past behavior of individual users
  - Overall behavior of the entire community of users
Recommender systems (1/3)
- Huge data collections, in the form of digital video and image libraries, news archives, shopping catalogs, virtual museums and so on, are widely available
  - A number of algorithms and tools (Recommender Systems) are being proposed to facilitate the browsing of these large data repositories
- Recommender systems thus help people retrieve information that matches their preferences, by recommending products or services from a large number of candidates, and support people in making decisions in various contexts:
  - what items to buy
  - which movie to watch
Recommender systems (2/3)
- Formally, a recommender system deals with:
  - a set of users $U = \{u_1, u_2, \ldots, u_i, \ldots, u_n\}$
  - a set of objects $O = \{o_1, o_2, \ldots, o_j, \ldots, o_m\}$
- It computes, for each pair $(u_i, o_j)$:
  - A score $r_{i,j}$ that measures the expected interest of user $u_i$ in object $o_j$
- Using:
  - A knowledge base
  - A ranking algorithm
    - which should also take into account that user preferences change with context
Recommender systems (3/3)
- Each element of the user space $U$ can be described by a profile that includes various user characteristics
  - For instance age, gender, income, marital status and so on
- Similarly, each element of the item space $O$ is described by a set of characteristics
- For instance, in a movie recommendation application, $O$ being a collection of movies, each movie can be represented by:
  - Title, genre, director, year of release, leading actors, etc.
- The utility $r$ is usually not defined on the whole of $U \times O$, but only on some subset of it
  - So, the central problem is to extrapolate $r$ to the whole space $U \times O$
A recommendation example (1/3)
- Let us consider a virtual museum that offers web-based access to a multimedia collection of digital reproductions of paintings from the Uffizi in Florence
- In particular, let us consider users visiting the virtual museum and suppose that, at the beginning of their tour, they request some paintings depicting the “Holy Mary” subject
- While observing such paintings, they are attracted, for example, by a painting by Albrecht Durer entitled “Madonna col Bambino” (a)
- It would be helpful if the system could learn the preferences of the users from these first interactions and predict their future needs by suggesting other paintings representing the same or related subjects, painted by the same or related authors, or items that have been requested by users with similar preferences
A recommendation example (2/3)
- a: “Madonna col Bambino”, Albrecht Durer
- b: “Madonna Bambino e San Giovannino”, Jacopo Carducci
- c: “Madonna col Bambino”, Andrea Vanni
A recommendation example (3/3)
- A user who is currently observing (a) might be recommended to see:
  - (b), because it is very similar to the current picture in terms of color and texture
  - (c), which is not similar in terms of low-level features, but is similar in terms of semantic content
- From the user perspective
  - There is the advantage of having a guide that suggests paintings the user might be interested in
- From the system perspective
  - There is the clear advantage of using the suggestions to pre-fetch and cache the objects that are more likely to be requested
State of the art
- Several kinds of recommender systems are well known in the literature:
  - Content-based systems
  - Collaborative filtering systems
  - Systems using a hybrid approach
  - Semantic and social recommender systems
Content-based recommender systems
- In content-based recommender systems, the utility $r_{i,j}$ of item $o_j$ for user $u_i$ is estimated using the utilities $r(u_i, o_k)$ assigned by user $u_i$ to items $o_k \in O$, $k \neq j$, that are in some way similar to item $o_j$
- In a movie recommendation application, for instance, only the movies that have a high degree of similarity to the user's preferences would be recommended
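A minimal Python sketch of this estimate, assuming the unknown utility is approximated by a similarity-weighted average of the utilities the user has already assigned. The weighting scheme, the function names and the toy genre data are illustrative assumptions, not taken from the slides.

```python
def content_based_score(target, user_ratings, similarity):
    """Estimate r(u_i, o_j) from the user's ratings of items similar to o_j.

    user_ratings: dict item -> utility already assigned by the user
    similarity:   function (item_a, item_b) -> value in [0, 1]
    """
    num = 0.0
    den = 0.0
    for item, rating in user_ratings.items():
        if item == target:
            continue  # only items o_k with k != j contribute
        s = similarity(target, item)
        num += s * rating
        den += s
    return num / den if den > 0 else 0.0


# Toy data: movies described by a set of genres (hypothetical values).
genres = {
    "m1": {"drama", "history"},
    "m2": {"drama"},
    "m3": {"comedy"},
    "m4": {"drama", "history", "war"},
}


def jaccard(a, b):
    """Jaccard similarity between the genre sets of two movies."""
    ga, gb = genres[a], genres[b]
    return len(ga & gb) / len(ga | gb)


ratings = {"m1": 5.0, "m2": 4.0, "m3": 1.0}
score_m4 = content_based_score("m4", ratings, jaccard)
```

Here the unrated movie m4 inherits most of its score from m1, the rated movie it shares the most genres with.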
Collaborative filtering recommender systems
- Collaborative filtering is, by contrast, the process of filtering or evaluating items using the opinions of other people
- The main problem behind collaborative filtering is to associate each user with a set of other users having similar profiles
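The association step can be sketched in Python as a nearest-neighbour search over user profiles. Cosine similarity over co-rated items is only one common choice, and the profile data and function names below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two users over the items both have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    nu = math.sqrt(sum(u[i] ** 2 for i in common))
    nv = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (nu * nv) if nu and nv else 0.0


def nearest_users(target, profiles, k=2):
    """Return the k users most similar to `target`, most similar first."""
    others = [name for name in profiles if name != target]
    others.sort(key=lambda name: cosine(profiles[target], profiles[name]),
                reverse=True)
    return others[:k]


# Hypothetical rating profiles: user -> {item: rating}.
profiles = {
    "alice": {"a": 5, "b": 1},
    "bob": {"a": 5, "b": 1, "c": 4},
    "carol": {"a": 1, "b": 5, "c": 2},
}
neighbours = nearest_users("alice", profiles, k=1)
```

Items rated by the selected neighbours but not yet seen by the target user (item "c" here) then become candidates for recommendation.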
Comparison between collaborative filtering and content-based systems
- Collaborative filtering relies on the assumption that people with similar tastes will rate things similarly
- Content-based filtering relies on the assumption that items with similar objective features will be rated similarly
- The two approaches can be viewed as complementary
A hybrid approach
- Content-based filtering and collaborative filtering may be manually combined by the end-user
- Different ways to combine collaborative and content-based methods into a hybrid recommender system can be classified as follows:
  - implementing collaborative and content-based methods separately and combining their predictions
  - incorporating some content-based characteristics into a collaborative approach
  - incorporating some collaborative characteristics into a content-based approach
  - constructing a general unifying model that incorporates both content-based and collaborative characteristics
Semantic and social recommender systems
- These systems take advantage of advances in semantic web technologies and features
  - Such as ontologies, taxonomies and social network tagging
- Semantic recommender systems are classified into three different types:
  - Vocabulary or ontology based systems
  - Trust network based systems
  - Context-adaptable systems
A recommendation strategy based on multimedia semantic analysis (1/12)
- An effective multimedia recommender system for supporting intelligent browsing of multimedia collections must be able to reliably identify the objects that are most likely to satisfy the interests of a user at any given point of her exploration
- We have to address four fundamental questions:
  - How can we select a set of objects from the collection that are good candidates for recommendation?
  - How can we rank the set of candidates?
  - How can we capture, represent and manage semantics related to multimedia objects?
  - How can we take into account such semantics in the recommendation process?
A recommendation strategy based on multimedia semantic analysis (2/12)
- Our idea is to model a browsing system for $O$ as a labeled graph $(G, l)$
- So, given a set $O = \{o_1, \ldots, o_m\}$ of objects:
  - $G = (O, E)$ is a directed graph
  - $l: E \to \{pattern, sim\} \times \mathbb{R}^+$ is a function that associates each edge in $E \subseteq O \times O$ with a pair $(t, w)$
    - $t$ is the type of the edge, which can take one of two enumerated values (pattern and similarity)
    - $w$ is the weight of the edge
- We can distinguish two cases:
  - a pattern label for an edge $(o_j, o_i)$ denotes the fact that object $o_i$ was accessed immediately after object $o_j$: in this case, the weight $w_{ij}$ is the number of times $o_i$ was accessed immediately after $o_j$
  - a similarity label for an edge $(o_j, o_i)$ denotes the fact that object $o_i$ is similar to $o_j$: in this case, the weight $w_{ij}$ is the similarity between $o_j$ and $o_i$
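The labeled graph can be sketched in Python as a dictionary from typed edges to weights. Keying each edge by `(o_j, o_i, type)`, so that a pattern edge and a similarity edge between the same two objects can coexist, is a representation choice made for this sketch; the session lists, the `threshold` parameter and the toy similarity table are hypothetical.

```python
from collections import Counter


def build_labeled_graph(sessions, objects, sim, threshold=0.0):
    """Return the labeling as a dict: (o_j, o_i, type) -> weight."""
    graph = {}
    # Pattern edges: count how often o_i was accessed immediately after o_j.
    counts = Counter()
    for session in sessions:
        for prev, nxt in zip(session, session[1:]):
            counts[(prev, nxt)] += 1
    for (oj, oi), c in counts.items():
        graph[(oj, oi, "pattern")] = float(c)
    # Similarity edges: keep only pairs whose similarity exceeds the threshold.
    for oj in objects:
        for oi in objects:
            if oi != oj:
                s = sim(oj, oi)
                if s > threshold:
                    graph[(oj, oi, "sim")] = s
    return graph


sessions = [["a", "b", "c"], ["a", "b"], ["b", "a"]]
toy_sim = {frozenset(("a", "b")): 0.8, frozenset(("b", "c")): 0.3}
graph = build_labeled_graph(
    sessions,
    ["a", "b", "c"],
    lambda x, y: toy_sim.get(frozenset((x, y)), 0.0),
)
```

The weights stored here are raw counts and raw similarities; they would still have to be normalized before being used in a ranking computation.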
A recommendation strategy based on multimedia semantic analysis (3/12)
- Given a labeled graph $(G, l)$, the recommendation grade of an object can be defined more formally by equation (1) [equation image not preserved]
- $P_G(o_i) = \{o_j \in O \mid (o_j, o_i) \in E\}$ is the set of predecessors of $o_i$ in $G$
- $w_{ij}$ is the normalized weight of the edge from $o_j$ to $o_i$
- For each $o_j \in O$: [formula not preserved]
- So, a link from $o_j$ to $o_i$ indicates that part of the importance of $o_j$ is transferred to $o_i$
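One form of the recommendation grade that is consistent with the definitions on this slide (predecessor set, normalized weights, importance transferred along edges) is the PageRank-style recurrence below. The symbol $R(o_i)$ for the grade and the exact form of the normalization condition are assumptions, not text recovered from the slide.

```latex
R(o_i) = \sum_{o_j \in P_G(o_i)} w_{ij} \, R(o_j),
\qquad
\sum_{i} w_{ij} = 1 \quad \text{for each } o_j \in O
```

Read this way, each object distributes its whole grade among its successors, in proportion to the normalized edge weights.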
A recommendation strategy based on multimedia semantic analysis (4/12)
- Given the iterative nature of definition (1), it is easy to see that the vector $R$ of recommendation grades can be computed as the solution to equation (2) [equation image not preserved]
- $C = \{\omega_{ij}\}$ is an ad-hoc matrix that defines how the importance of each object is transferred to other objects; it can be seen as a linear combination of:
  - A local browsing matrix
  - A global browsing matrix
  - A multimedia similarity matrix
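If equation (2) is read as the fixed-point equation $R = C \cdot R$ (an assumption based on the PageRank analogy, since the equation itself is missing), the vector can be approximated by power iteration. The sketch below uses plain Python lists; the function name, tolerance and toy matrix are illustrative.

```python
def power_iteration(C, tol=1e-12, max_iter=1000):
    """Approximate the fixed point R = C R for a column-stochastic matrix C.

    C is a list of rows; C[i][j] is the share of o_j's importance passed to o_i.
    """
    n = len(C)
    R = [1.0 / n] * n  # start from a uniform vector
    for _ in range(max_iter):
        new_R = [sum(C[i][j] * R[j] for j in range(n)) for i in range(n)]
        total = sum(new_R)
        new_R = [x / total for x in new_R]  # guard against numerical drift
        if sum(abs(a - b) for a, b in zip(new_R, R)) < tol:
            return new_R
        R = new_R
    return R


# Toy 3-object matrix: every column sums to 1 and all entries are positive.
C = [
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.3],
    [0.5, 0.2, 0.4],
]
R = power_iteration(C)
```

For a positive column-stochastic matrix the iteration converges to a unique vector whose entries sum to 1.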
A recommendation strategy based on multimedia semantic analysis (5/12)
- In order to guarantee the existence of a unique solution to equation (2), the matrix $C$ must be a positive and irreducible column-stochastic matrix
- Given a set $U = \{u_1, \ldots, u_n\}$ of users, we define:
  - A local browsing matrix $A_l = \{a^l_{ij}\}$ for each user $u_l \in U$. Its generic element $a^l_{ij}$ is defined as the ratio of the number of times object $o_i$ has been accessed by user $u_l$ immediately after $o_j$ to the number of times any object in $O$ has been accessed by $u_l$ immediately after $o_j$
  - A global browsing matrix $A = \{a_{ij}\}$. Its generic element $a_{ij}$ is defined as the ratio of the number of times object $o_i$ has been accessed by any user immediately after $o_j$ to the number of times any object in $O$ has been accessed immediately after $o_j$
- It is easy to see that: [formula not preserved]
- Anonymous users contribute to $A$ but not to any $A_l$
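Both browsing matrices can be derived from logged sessions with the same column normalization. In this Python sketch the session format, the use of `None` as the key for anonymous visitors, and all names are assumptions made for illustration.

```python
def browsing_matrix(sessions, objects):
    """Column-normalized transition matrix: entry [i][j] is the fraction of
    accesses made right after o_j that went to o_i."""
    index = {o: k for k, o in enumerate(objects)}
    n = len(objects)
    counts = [[0.0] * n for _ in range(n)]
    for session in sessions:
        for prev, nxt in zip(session, session[1:]):
            counts[index[nxt]][index[prev]] += 1.0
    for j in range(n):
        col_total = sum(counts[i][j] for i in range(n))
        if col_total > 0:
            for i in range(n):
                counts[i][j] /= col_total
    return counts


objects = ["a", "b", "c"]
# Sessions keyed by user; None stands for anonymous visitors (hypothetical data).
sessions_by_user = {
    "u1": [["a", "b", "c"], ["a", "b"]],
    "u2": [["a", "c"]],
    None: [["b", "a"]],
}
# Local matrices: one per registered user, built from that user's sessions only.
A_local = {
    user: browsing_matrix(user_sessions, objects)
    for user, user_sessions in sessions_by_user.items()
    if user is not None
}
# Global matrix: built from every session, anonymous ones included.
all_sessions = [s for group in sessions_by_user.values() for s in group]
A_global = browsing_matrix(all_sessions, objects)
```

A column stays all zero when no access was ever recorded after the corresponding object, so these matrices are not column-stochastic on their own; that case has to be handled when they are combined into $C$.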
A recommendation strategy based on multimedia semantic analysis (6/12)
A recommendation strategy based on multimedia semantic analysis (7/12)
- We define the similarity matrix $B$ as in equation (3) [equation image not preserved]
- $f_{sim}$ is any similarity function defined over $O$ which computes, for each pair of objects, their multimedia relatedness in terms of low-level (features) and high-level (semantics) image descriptors
- $\tau$ is a threshold
- $\Gamma$ is a normalization factor which guarantees that [formula not preserved]
- Equation (3) guarantees that the matrix $B$ is sparse when the condition [condition not preserved] holds for many pairs of objects
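A thresholded, column-normalized similarity matrix of this kind can be sketched in Python as follows. The sketch assumes that an entry is $f_{sim}(o_i, o_j) / \Gamma$ when the similarity is at least $\tau$ and $i \neq j$, and 0 otherwise, with $\Gamma$ chosen per column so that the column sums to 1. That reading of equation (3) is an assumption, as are the function names and the toy similarity values.

```python
def similarity_matrix(objects, f_sim, tau):
    """Build B with B[i][j] = f_sim(o_i, o_j) / Gamma_j if the similarity is
    at least tau and i != j, else 0 (assumed form of equation (3))."""
    n = len(objects)
    B = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            if i != j:
                s = f_sim(objects[i], objects[j])
                if s >= tau:
                    B[i][j] = s
        gamma = sum(B[i][j] for i in range(n))  # per-column normalization
        if gamma > 0:
            for i in range(n):
                B[i][j] /= gamma
    return B


# Hypothetical symmetric similarities between three paintings.
toy = {
    frozenset(("a", "b")): 0.8,
    frozenset(("a", "c")): 0.2,
    frozenset(("b", "c")): 0.6,
}
B = similarity_matrix(
    ["a", "b", "c"],
    lambda x, y: toy[frozenset((x, y))],
    tau=0.5,
)
```

Raising `tau` zeroes out more entries, which is what makes the matrix sparse for large collections.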
A recommendation strategy based on multimedia semantic analysis (8/12)
- To compute the matrix $B$, we have decided to adopt 4 sets of multimedia features:
  - Tamura descriptors
  - MPEG-7 color-based descriptors
  - MPEG-7 edge-based descriptors
  - MPEG-7 color layout-based descriptors
- We exploit specific image metadata (artist, genre and subject)
- The semantic similarity has been computed using the most widely used vocabulary-based metrics for semantic relatedness of concepts:
  - Li-Bandar-McLean
  - Wu-Palmer
  - Rada
  - Leacock-Chodorow
A recommendation strategy based on multimedia semantic analysis (9/12)
- The semantic similarity combines similarities among artists, genres and subjects, obtained by using a fixed taxonomy, part of which is shown here: [figure not preserved]
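As an illustration of a taxonomy-based metric, the Wu-Palmer measure scores two concepts by the depth of their lowest common ancestor relative to their own depths. The Python sketch below uses a made-up fragment of a subject taxonomy (not the taxonomy used by the system) and counts the root as depth 1, which is one common convention.

```python
def ancestors(node, parent):
    """Return `node` followed by all its ancestors, up to the root."""
    chain = [node]
    while parent[node] is not None:
        node = parent[node]
        chain.append(node)
    return chain


def depth(node, parent):
    """Number of nodes on the path from the root to `node` (root has depth 1)."""
    return len(ancestors(node, parent))


def wu_palmer(a, b, parent):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    anc_a = set(ancestors(a, parent))
    # The first ancestor of b that is also an ancestor of a is the lowest one.
    lcs = next(n for n in ancestors(b, parent) if n in anc_a)
    return 2.0 * depth(lcs, parent) / (depth(a, parent) + depth(b, parent))


# Hypothetical fragment of a subject taxonomy (child -> parent).
parent = {
    "subject": None,
    "religious": "subject",
    "portrait": "subject",
    "holy mary": "religious",
    "crucifixion": "religious",
}
sim_close = wu_palmer("holy mary", "crucifixion", parent)
sim_far = wu_palmer("holy mary", "portrait", parent)
```

Two subjects under the same parent score higher than two subjects that only share the root, and a concept compared with itself scores 1.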
A recommendation strategy based on multimedia semantic analysis (10/12)
- Our main goal is to compute customized rankings for each individual user
- We can then rewrite equation (2) as equation (4) [equation image not preserved]
- where $R_l$ is the vector of recommendation grades customized for a user $u_l$
- We define each matrix $C_l$ as a linear combination of $A_l$, $A$ and $B$ in order to take into account the following three elements:
  - the browsing behavior of the individual user (matrix $A_l$)
  - the overall behavior of the community of users (matrix $A$)
  - the intrinsic similarity between objects in the collection (matrix $B$)
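One way to write the per-user ranking and the combination is shown below. It assumes that equation (4) is the per-user counterpart of a fixed-point equation, and that the combination uses non-negative weights $\alpha$, $\beta$, $\gamma$ summing to 1. The weight symbols and the convexity constraint are assumptions; the slide only states that $C_l$ is a linear combination of the three matrices.

```latex
R_l = C_l \, R_l,
\qquad
C_l = \alpha \, A_l + \beta \, A + \gamma \, B,
\qquad
\alpha, \beta, \gamma \ge 0, \quad \alpha + \beta + \gamma = 1
```

With this choice $C_l$ remains column-stochastic whenever $A_l$, $A$ and $B$ are, because a convex combination of columns that each sum to 1 also sums to 1.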
A recommendation strategy based on multimedia semantic analysis (11/12)
- Now, we describe a method for generating a set of candidates for recommendation
- Assuming that a user $u_l$ is currently watching object $o_j$, we can define the set of candidate recommendations $C_j$ as in equation (5) [equation image not preserved]
- The set of candidates includes:
  - the objects that have been accessed by at least one user within $k$ steps from $o_j$, with $k$ between 1 and $M$
  - the objects that are most similar to $o_j$
- The ranked list of recommendations is then generated by ranking the objects in $C_j$ using the ranking vector $R_l$
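Candidate generation and ranking can be sketched in Python as below, assuming a global browsing matrix `A` and a similarity matrix `B` stored as lists of rows, with entry `[i][j]` non-zero when $o_i$ follows or resembles $o_j$. Excluding the current object from the result is an extra assumption of this sketch, and the toy matrices and grade vector are hypothetical.

```python
def candidates(A, B, j, M):
    """Indices of objects reachable from o_j in 1..M browsing steps, plus
    objects whose similarity entry with o_j is non-zero."""
    n = len(A)
    frontier = {j}
    reached = set()
    for _ in range(M):
        # Objects accessed immediately after any object in the current frontier.
        frontier = {i for k in frontier for i in range(n) if A[i][k] > 0}
        reached |= frontier
    similar = {i for i in range(n) if B[i][j] > 0}
    return reached | similar


def recommend(A, B, R_l, j, M):
    """Rank the candidate set by the user's grade vector, best first."""
    cand = candidates(A, B, j, M) - {j}  # assumption: never recommend o_j itself
    return sorted(cand, key=lambda i: R_l[i], reverse=True)


# Toy collection of 4 objects with browsing chain 0 -> 1 -> 2 -> 3.
A = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
# Object 3 is similar to object 0.
B = [
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
R_l = [0.1, 0.2, 0.4, 0.3]
ranked = recommend(A, B, R_l, j=0, M=2)
```

Object 3 is more than two browsing steps away from object 0 but still enters the list through the similarity matrix, which is how items with no browsing history can be recommended.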
A recommendation strategy based on multimedia semantic analysis (12/12)
The system (1/4)
- We describe a case study in the cultural heritage domain: a web recommendation system that provides browsing facilities for a multimedia collection of Uffizi Gallery paintings
- We use a memory-based algorithm, so that low-level and high-level similarities are evaluated once
- If we add new paintings
  - Similarity matrices have to be updated accordingly
- To capture the dynamic nature of users' behavior
  - We periodically recompute connection matrices
  - Each connection matrix is updated as soon as the browsing session ends
The system (2/4)
- To solve the cold start problem
  - Our system uses low-level and/or high-level similarities, in addition to the extracted behavior of the whole community
- For new items
  - Recommendation is based on similarities only
- Our “Uffizi Gallery” data collection consists of 474 digital reproductions of paintings:
  - which in turn belong to 144 artists
  - grouped into 16 artistic genres
  - each painting is also linked to a pair of subjects, chosen among the 47 available ones
The system (3/4)
The system (4/4)
- From the end user's perspective, the client application has the following features:
  - a set of forms for user login or registration
  - a gallery to display the images returned by a search by author, subject or artistic genre
  - display of an image with its related information, together with the recommended images
  - storage of the user session with the information related to the browsing patterns
Preliminary experimental results (1/6)
- Recommender systems are complex applications that are based on a combination of several models, algorithms and heuristics
- We decided to give more importance to a user-centric evaluation
- The proposed evaluation strategy aims at measuring the effectiveness of the system in terms of user satisfaction with respect to assigned browsing tasks
Preliminary experimental results (2/6)
- We evaluated the impact of our system on the users and compared its performance with that of an existing system for organizing and browsing large photo collections (Picasa Web Albums), which does not take into account the browsing behavior of users or the intrinsic features of the multimedia objects
- Our goal was to establish how helpful our system was in supporting the exploration of digital reproductions of paintings
- The dataset used in the experiments is the same one used to tune the system: it consists of 474 digital reproductions of Uffizi paintings of various genres, artists and subjects
Preliminary experimental results (3/6)
- We asked a group of 20 people to browse the digital collection of paintings with the assistance of our recommender system, and to complete several browsing tasks of different complexity
- This group consisted of:
  - 10 non-expert users on art
  - 5 medium-expert users on art
  - 5 expert users on art
- After this test, we asked them to browse the same collection of paintings once again using Picasa
Preliminary experimental results (4/6)
- We defined four browsing tasks:
  - Low Complexity task (T1): explore at least 10 paintings of Renaissance style
  - Medium Complexity task (T2): explore at least 20 paintings of Renaissance style that have Holy Mary as their subject
  - High Complexity task (T3): explore at least 20 paintings of Renaissance style with subject Holy Mary and with a predominance of dark blue color
  - Very High Complexity task (T4): explore at least 3 paintings of Renaissance style with subject Holy Mary and with a predominance of dark blue color whose author is Botticelli
- Two strategies were used to evaluate the results of this experiment:
  - empirical measurements of access complexity in terms of access time and mouse clicks
  - TLX (NASA Task Load Index)
Preliminary experimental results (5/6)
- Using the TLX evaluation, we asked the users to express their opinion about the capability of Picasa and of our system, respectively, to provide an effective user experience in completing the assigned browsing tasks
- TLX is a multi-dimensional rating procedure that provides an overall workload score based on a weighted average of ratings on six sub-scales
- Lower TLX scores are better
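The weighted average behind the overall score can be sketched in Python as follows, assuming the standard NASA-TLX procedure: six sub-scales rated from 0 to 100, with weights obtained from 15 pairwise comparisons. The ratings and weights in the example are invented and are not results from this study.

```python
SUBSCALES = ("mental", "physical", "temporal",
             "performance", "effort", "frustration")


def tlx_score(ratings, weights):
    """Weighted NASA-TLX workload: sum(rating * weight) / sum(weights).

    ratings: sub-scale -> rating on a 0-100 scale
    weights: sub-scale -> number of times the sub-scale was chosen in the
             15 pairwise comparisons (so the weights sum to 15)
    """
    if set(ratings) != set(SUBSCALES) or set(weights) != set(SUBSCALES):
        raise ValueError("ratings and weights must cover the six sub-scales")
    total_weight = sum(weights.values())
    if total_weight != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / total_weight


# Invented answers from one participant for one task.
ratings = {"mental": 60, "physical": 10, "temporal": 40,
           "performance": 30, "effort": 50, "frustration": 20}
weights = {"mental": 5, "physical": 0, "temporal": 3,
           "performance": 2, "effort": 4, "frustration": 1}
score = tlx_score(ratings, weights)
```

Computing one such score per participant, task and system allows the two systems to be compared task by task.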
Preliminary experimental results (6/6)
Conclusions
- In this paper we proposed a multimedia semantic approach to recommendation in browsing systems, based on a method that computes customized recommendations by combining in an original way:
  - Intrinsic features (semantic contents and low-level features) of the objects
  - Past behavior of individual users
  - Behavior of the users’ community as a whole
- We built a recommender system which helps users browse digital reproductions of Uffizi paintings
- We investigated the effectiveness of the proposed approach in the considered scenario, based on user satisfaction
- Experimental results showed that our approach is promising and encourage further research in this direction
Future work
- Improvements to the presentation of multimedia contents
- Optimization of the ranking algorithm using ad-hoc data structures to store sparse matrices in Java (CSR, CSC, JSA)
- Testing of the system performance on a larger dataset
- Investigation of new metrics to evaluate the system performance
- The system is currently online and reachable at the following URL: http://143.225.229.92:8080/uffizi.jsp


