3b4d1fe18128737b62c38b1874350075.ppt
- Number of slides: 17
Information Retrieval Lab, DiSCo – University of Milan Bicocca, viale Sarca 336. Head: Prof. Gabriella Pasi
The IR Lab in brief
The Information Retrieval Group (IRG) was established in 2005 at DiSCo, University of Milan Bicocca. The amount of information available on the Web has grown to the point that there is great demand for effective systems that allow easy and flexible access to information relevant to a specific user's needs. Flexibility here means the capability of the system both to manage imperfect (vague and/or uncertain) information and to adapt its behaviour to the user's context. The research activity of the Information Retrieval group aims at defining models and techniques that overcome the limitations of current Information Access systems (mainly Information Retrieval and Information Filtering systems). In particular, the problems of context modeling and personalization are addressed.
IR Lab numbers
Small but active!
- One scientist
- Two external collaborators
- Three workplaces for students and collaborators
- About 50 articles in proceedings of international conferences and in international journals in the last three years
- 4-5 master students per year
IR Lab Activity
Research areas:
- Information Retrieval
- Information Filtering
- XML Retrieval
- Web Intelligence
Application domains:
- Large document repositories
- World Wide Web
The problem of automatic access to information
Two main types of systems locate information "relevant" to users' needs:
- Information Retrieval Systems (search engines): they require an explicit query formulation.
- Information Filtering Systems: they require user profiles, i.e. descriptions of specific users' needs that are dynamically updated, also on the basis of the user's behaviour (no explicit user query; push technology).
Basic structure of an IRS
[Diagram. Off line: archive of documents (usually unstructured or semi-structured text), indexing mechanism, formal representation of documents. On line: user query, query formulation, matching mechanism, items estimated relevant.]
An IRS is based on a mathematical model.
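The off-line/on-line split in this slide can be illustrated with a minimal sketch. It assumes a plain inverted index and TF-IDF scoring as the underlying mathematical model; the slide does not say which model the lab uses, and every function name and sample document here is hypothetical.

```python
from collections import Counter, defaultdict
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Off-line step: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for term, freq in Counter(tokenize(text)).items():
            index[term][doc_id] = freq
    return index


def search(index, n_docs, query):
    """On-line step: rank documents against the query (TF-IDF, an assumption)."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, freq in postings.items():
            scores[doc_id] += freq * idf
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical archive of documents.
documents = {
    "d1": "fuzzy logic for information retrieval",
    "d2": "news filtering with user profiles",
    "d3": "retrieval of XML documents",
}
index = build_index(documents)
ranking = search(index, len(documents), "information retrieval")
```

Here `ranking` lists `d1` before `d3`, since `d1` matches both query terms, and `d2` is absent because it matches none.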
Information Filtering is the process of monitoring large amounts of dynamically generated information and pushing to a user the subset of information likely to be of interest to her/him (based on her/his information needs).
Information Filtering
An IFS needs an information filter that, when applied to an information item, evaluates whether or not the item is of interest to the considered user.
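A minimal sketch of such a filter follows. It assumes a profile made of weighted terms, a fixed relevance threshold, and a simple additive update driven by user behaviour; none of these choices come from the slides, and the class and function names are hypothetical.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric terms in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class UserProfile:
    """Hypothetical user profile: one weight per term of interest."""

    def __init__(self, weights, threshold=0.5, learning_rate=0.2):
        self.weights = dict(weights)
        self.threshold = threshold
        self.learning_rate = learning_rate

    def score(self, item):
        """Fraction of the total profile weight covered by the item's terms."""
        total = sum(self.weights.values())
        if total == 0:
            return 0.0
        terms = tokenize(item)
        return sum(w for t, w in self.weights.items() if t in terms) / total

    def is_relevant(self, item):
        """The information filter: is this item of interest to the user?"""
        return self.score(item) >= self.threshold

    def update(self, item, liked):
        """Adapt the profile to observed behaviour; no explicit query needed."""
        delta = self.learning_rate if liked else -self.learning_rate
        for term in tokenize(item):
            self.weights[term] = max(0.0, self.weights.get(term, 0.0) + delta)


def push(stream, profile):
    """Monitor a stream and keep only the items the filter accepts."""
    return [item for item in stream if profile.is_relevant(item)]


profile = UserProfile({"fuzzy": 1.0, "retrieval": 1.0})
stream = ["Fuzzy retrieval models", "Football results", "Retrieval of XML documents"]
pushed = push(stream, profile)
```

With this profile the first and third items are pushed and the second is dropped. Calling `profile.update("Football results", liked=True)` would start to raise the score of similar items.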
XML Retrieval
- IR systems can be used for the content-based retrieval of documents encoded in XML, SGML, and HTML.
- In these collections it is important to retrieve document content and structure according to the user's needs.
- Search and retrieval can be supported through ad hoc indexing strategies.
- This research area studies and proposes advanced solutions for storing, managing and retrieving structured documents, with particular focus on XML documents.
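Retrieval by both content and structure can be illustrated with Python's standard `xml.etree.ElementTree` module. This is only a sketch: the sample collection, its element names, and the `search_elements` helper are hypothetical, and plain substring matching stands in for the ad hoc indexing strategies mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML collection.
XML = """
<collection>
  <article id="a1">
    <title>Fuzzy XPath</title>
    <abstract>Flexible queries over XML documents.</abstract>
  </article>
  <article id="a2">
    <title>News filtering</title>
    <abstract>Fuzzy clustering of news streams.</abstract>
  </article>
</collection>
"""


def search_elements(root, tag, term):
    """Return (article id, text) for each <tag> element whose text contains term.

    The tag expresses the structural constraint; the term is the content constraint.
    """
    term = term.lower()
    hits = []
    for article in root.iter("article"):
        for element in article.iter(tag):
            text = (element.text or "").strip()
            if term in text.lower():
                hits.append((article.get("id"), text))
    return hits


root = ET.fromstring(XML.strip())
title_hits = search_elements(root, "title", "fuzzy")
abstract_hits = search_elements(root, "abstract", "fuzzy")
```

The same term returns different articles depending on the structural constraint: `a1` when restricted to titles, `a2` when restricted to abstracts.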
Web Intelligence
Web Intelligence (WI) exploits AI and advanced information technology on the Web and Internet. It is the key and the most urgent research field of IT for business intelligence.
A multiple criteria decision model for Information Filtering
www.peng-project.org
- Project Coordinator: Gabriella Pasi
- Partners: ATOS Origin (SP), UJF (FR), USG (UK), USI (SW), RTSI (SW)
- Objective: The PENG Project (2004-2006, IST-2003-004597) had the objective of defining and developing a news content composition and programming environment so as to provide news professionals and general users with an interactive and personalised tool for news gathering and delivery. This tool is conceived as a flexible system for personalised filtering, retrieval and composition of news.
Personalized Filtering Module: pushes to each user the news items or clusters relevant to that user's interests (each user may have multiple overlapping interests). The matching function applied by the filter is personalised to the user and performs a combined evaluation of each news item with respect to five matching criteria.
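A combined evaluation over several criteria could look like the sketch below. The slides name neither the five criteria nor the aggregation operator, so the placeholder names `c1` to `c5`, the user-specific weights, and the weighted average are all assumptions made purely for illustration.

```python
def combined_score(criterion_scores, weights):
    """Weighted average of per-criterion scores, each expected in [0, 1].

    Both arguments are dicts keyed by criterion name. The weights model the
    personalisation: each user may value the criteria differently.
    """
    if set(criterion_scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(criterion_scores[name] * weights[name] for name in weights) / total


# Hypothetical user-specific weights for five placeholder criteria.
user_weights = {"c1": 3, "c2": 2, "c3": 2, "c4": 2, "c5": 1}

# Hypothetical per-criterion evaluations of one news item.
news_scores = {"c1": 1.0, "c2": 0.5, "c3": 0.5, "c4": 0.0, "c5": 1.0}

score = combined_score(news_scores, user_weights)
```

A weighted average lets a strong score on one criterion compensate for a weak score on another. Other aggregation operators would behave differently, which is why the choice is flagged here as an assumption.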
IR Lab people
- Gabriella Pasi, Associate Professor and Head of the Laboratory
- Stefania Marrara, Junior Research Fellow
- Célia Cristina Pereira, Junior Research Fellow
Conferences and Events
Open Conferences (2008)
- "Special Track on Information Access and Retrieval Systems", within the "ACM Symposium on Applied Computing" (Fortaleza, Ceará, Brazil, March 16-20, 2008). IAR 2008
Past Events (since 2005)
- International Workshop on Fuzzy Logic and Applications (WILF 2007), Hotel Portofino Kulm, Portofino Vetta - Ruta di Camogli, Genova (Italy), July 7-10, 2007.
- PhD School on Web Information Retrieval, WebBar 2007, Varenna, Italy, 26th August - 1st September 2007.
- Imprecision, Uncertainty and Fuzziness in Databases area at the 23rd International Conference on Data Engineering (ICDE 07), Istanbul, Turkey, April 17-20, 2007.
- Seventh International Conference on Flexible Query Answering Systems (FQAS 2006), Milano, 2-10 June 2006.
- "Special Track on Information Access and Retrieval Systems", within the "ACM Symposium on Applied Computing" (Fortaleza, Ceará, Brazil, March 16-20, 2008; Dijon, France, March 2006; Santa Fe, New Mexico, 13-17 March 2005; Cyprus, 14-17 March 2004; Melbourne, Florida, 9-12 March 2003; Madrid, 10-14 March 2002). IAR 2008
- "3rd International Summer School on Aggregation Operators", Università della Svizzera Italiana (USI-Lugano), Lugano, 10-15 July 2005.
Recent Publications
Edited Volumes
- E. Herrera-Viedma, F. Crestani and G. Pasi: "Soft Computing for Web Information Retrieval", Physica Verlag, series Studies in Fuzziness, 2006.
- G. Pasi: "Flexible Query Answering Systems", Proceedings of the 7th International Conference FQAS 2006, Milan, Italy, June 2006, Springer Verlag, LNAI 4027.
- F. Masulli, S. Mitra and G. Pasi: "Applications of Fuzzy Sets Theory", Proceedings of the International Workshop on Fuzzy Logic and Applications, Ruta di Camogli, Italy, July 2007, Springer Verlag, LNAI 4578.
Special Issues
- Allel-Adjali, P. Bosc and G. Pasi eds., "Flexible Queries in Information Systems", Journal of Intelligent Information Systems, to appear, 2008.
- E. Herrera-Viedma and G. Pasi eds., "Aggregation Operators for Information Systems", International Journal of Intelligent Systems, to appear, 2008.
- E. Herrera-Viedma and G. Pasi eds., "Soft Approaches to Information Retrieval and Information Access on the Web", Journal of the American Society for Information Science, 2006.
Recent Publications
Papers in International Journals
- A. Campi, E. Damiani, S. Guinea, S. Marrara, G. Pasi, P. Spoletini, "A Fuzzy Extension for the XPath Query Language", International Journal of Intelligent Systems, to appear in 2008.
- G. Bordogna, G. Pasi, "A flexible model for the evaluation of soft Conditional Preferences in fuzzy databases", International Journal of Intelligent Systems, to appear in 2008.
- G. Bordogna, G. Pasi, "A flexible approach to evaluating soft conditions with unequal preferences in fuzzy databases", Special Topic Issue on Advances in Fuzzy Database Technology, International Journal of Intelligent Systems, Vol. 22, Issue 7, pp. 665-689, July 2007.
- M. Baziz, M. Boughanem, G. Pasi, H. Prade, "A fuzzy logic approach to information retrieval using an ontology-based representation of documents", International Journal of Applied Mathematics and Computer Science (AMCS), to appear in 2008.
- E. Herrera-Viedma and G. Pasi, "Soft Approaches to Information Retrieval and Information Access on the Web: an introduction to the special topic section", Journal of the American Society for Information Science and Technology, JASIST 57(4): 511-514, 2006.
- E. Herrera-Viedma, G. Pasi, A. G. Lopez-Herrera, C. Porcel, "Evaluating the Information Quality of Web Sites: A Methodology Based on Fuzzy Computing with Words", Journal of the American Society of Information Science, JASIST 57(4): 538-549, 2006.
- G. Pasi and R. R. Yager, "Modeling the concept of majority opinion in group decision making", Information Sciences, Vol. 176, Issue 4, pp. 390-414, February 22, 2006.
- K. Atanassov, G. Pasi, R. R. Yager, "Intuitionistic fuzzy interpretations of multi-criteria multi-person and multi-measurement tool decision making", International Journal of Systems Science, Vol. 36, n. 14, pp. 859-868, November 2005.
- G. Bordogna and G. Pasi, "Personalized Indexing and Retrieval of Heterogeneous Structured Documents", Information Retrieval, Kluwer, Vol. 8, Issue 2, pp. 301-318, April 2005.
- R. A. Marques Pereira, A. Molinari, G. Pasi, "Contextual weighted representations and indexing models for the retrieval of HTML documents", Soft Computing, Vol. 9, Issue 7, pp. 481-492, July 2005.
Recent Publications
Chapters of International Books
- G. Pasi, "Fuzzy Models", Encyclopedia of Database Systems, Ling Liu and M. Tamer Özsu (Eds.), Springer, to appear, 2008.
- M. Fedrizzi and G. Pasi, "Fuzzy Approaches to Consensus Modelling in Group Decision Making", in "Intelligent Decision and Policy Making Support Systems" (D. Ruan, F. Hardeman, K. van der Meer eds.), Springer, to appear, 2008.
- G. Bordogna, M. Pagani, G. Pasi, "An Incremental Hierarchical Fuzzy Clustering for Category-based News Filtering", in "Uncertainty and Intelligent Information Systems" (B. Bouchon-Meunier, R. R. Yager, C. Marsala, and M. Rifqi eds.), World Scientific, ISBN 978-981-279-234-1, 2008.
- G. Bordogna, D. H. Kraft, G. Pasi, "Soft Approaches to Information Access and Retrieval", in "The Handbook of Granular Computing", Witold Pedrycz, Andrzej Skowron, and Vladik Kreinovich (co-editors), John Wiley & Sons, Ltd., 2008.
- G. Pasi, "Fuzzy Sets in Information Retrieval: State of the Art and Research Trends", in "Fuzzy Sets and Their Extensions: Representation, Aggregation and Models. Intelligent Systems from Decision Making to Data Mining, Web Intelligence and Computer Vision" (H. Bustince, F. Herrera, J. Montero eds.), series Studies in Fuzziness and Soft Computing, Springer Verlag, Vol. 220, 2008.
- G. Bordogna, M. Pagani, G. Pasi, G. Psaila, "Flexible location-based spatial queries", in "Theoretical Advances and Applications of Fuzzy Logic and Soft Computing" (Oscar Castillo, Patricia Melin, Oscar Montiel Ross, Roberto Sepulveda Cruz, Witold Pedrycz, Janusz Kacprzyk eds.), ISBN 3540724338, Springer Verlag, 42, Advances in Soft Computing series, 36-45, 2007.
- G. Bordogna, M. Pagani, G. Pasi, "A Flexible decision support approach to model ill-defined knowledge in GISs", in AAVV, "Geographic Uncertainty in Environmental Security", Book Series NATO Security through Science Series, pp. 133-152, ISBN 978-1-4020-6436-4, doi: 10.1007/978-1-4020-6438-8, (Netherlands), 2007.
- G. Bordogna, M. Pagani, G. Pasi, "A dynamical Hierarchical fuzzy clustering algorithm for document filtering", in "Soft Computing in Web Information Retrieval" (E. Herrera-Viedma, G. Pasi, F. Crestani eds.), series Studies in Fuzziness and Soft Computing, Springer Verlag, Vol. 197, 1-23, 2006.
- M. Baziz, M. Boughanem, G. Pasi, "A fuzzy logic approach to information retrieval using an ontology-based representation of documents", in "Fuzzy Logic and the Semantic Web" (E. Sanchez, Ed.), Elsevier Science, pp. 363-377, March 2006.