362c61a3ae73669ec39239528f3dd22c.ppt
- Number of slides: 42
Intelligent Access to Digital Heritage Conference, 19 Oct. 2007, Tallinn, Estonia
Intelligent Information and Knowledge Infrastructures
Daniel Olmedilla, L3S Research Center & Hannover University
Outline
- L3S Background
- Introduction & Motivation
- Personalized Search & Ranking
- Privacy & Access Control
- EU Projects Summary
Daniel Olmedilla, 19 Oct. 2007
L3S Background: Mission and Focus
L3S research focuses on innovative and cutting-edge methods and technologies for three key enablers for the European Information Society:
- Knowledge
- Information
- Learning
L3S projects focus on
- digital resources and their technological underpinnings:
  - Digital libraries and search
  - Semantic Web and knowledge sharing
  - Distributed systems, networks and grids
- the use of these resources in eLearning and eScience contexts
L3S Background: Area “Semantic Web & Digital Libraries”
- provide personalized access to distributed information resources and advanced search and recommendation functionalities
- provide enhanced search on the desktop, in companies, on the Web
- enhance traditional libraries with digital content and personalized library services
Introduction & Motivation: Conference Theme
Intelligent Access to Digital Heritage
Introduction & Motivation: UNESCO E-Heritage (I)
- Digital heritage consists of resources of human knowledge or expression, whether cultural, educational, scientific and administrative, or embracing technical, legal, medical and other kinds of information
- Digital materials include texts, databases, still and moving images, audio, graphics, software, and web pages, among a wide and growing range of formats
[ http://portal.unesco.org/ci/en/ev.php-URL_ID=1539&URL_DO=DO_TOPIC&URL_SECTION=201.html, http://portal.unesco.org/ci/en/files/13367/10700115911Charter_en.pdf/Charter_en.pdf ]
Introduction & Motivation: UNESCO E-Heritage (II)
- Born-digital heritage available on-line, including electronic journals, World Wide Web pages or on-line databases, is now part of the world’s cultural heritage
- Using computers and related tools, humans are creating and sharing digital resources (information, creative expression, ideas, and knowledge encoded for computer processing) that they value and want to share with others over time as well as across space
Introduction & Motivation: UNESCO E-Heritage (& III)
The purpose of preserving the digital heritage is to ensure that it remains accessible to the public. (…) At the same time, sensitive and personal information should be protected from any form of intrusion.
Introduction & Motivation: Focus of this talk
Intelligent Access to Digital Heritage:
- Personalized search and ranking of media
- Access to sensitive information and resources
Introduction & Motivation: Information growth
In today's society, individuals and organisations are, on the one hand, confronted with an ever-growing load of information and content and, on the other, with increasing demands for knowledge and skills. To cope with this, we need to link content, knowledge and learning, making content and knowledge more accessible, interactive and usable over time by humans and machines alike.
Introduction & Motivation: Not only textual resources
Introduction & Motivation: The 1 TB life (Gordon Bell)
1 TB gives you 65+ years of:
- 100 email messages a day (5 KB each)
- 100 web pages a day (50 KB each)
- 5 scanned pages a day (100 KB each)
- 1 book every 10 days (1 MB each)
- 10 photos per day (400 KB JPEG each)
- 8 hours per day of sound, e.g. telephone, voice annotations, and meeting recordings (8 Kb/s)
- 1 new music CD every 10 days (45 min each at 128 Kb/s)
It will take you 10 years to fill up your 160 GB drive.
Want video? Buy more cheap drives (1 TB/year lets you record 4 hours/day of 1.5 Mb/s video).
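The arithmetic behind these figures can be checked with a short calculation. This is only a sketch under assumptions the slide does not state: decimal units (1 KB = 1000 bytes), "Kb/s" read as kilobits per second, and 365-day years. Under those assumptions the totals land in the same range as the slide's round numbers; other unit conventions shift the result by a few years.

```python
# Daily data volume for the "1 TB life", using the figures from the slide.
# Assumptions (not on the slide): decimal units, Kb/s = kilobits per second,
# 365 days per year.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12


def stream_bytes(hours, kilobits_per_second):
    """Bytes produced by a stream of the given duration and bit rate."""
    return hours * 3600 * kilobits_per_second * 1000 / 8


daily_bytes = {
    "email": 100 * 5 * KB,
    "web pages": 100 * 50 * KB,
    "scanned pages": 5 * 100 * KB,
    "books": 1 * MB / 10,                    # one 1 MB book every 10 days
    "photos": 10 * 400 * KB,
    "sound": stream_bytes(8, 8),             # 8 hours at 8 Kb/s
    "music": stream_bytes(0.75, 128) / 10,   # one 45-minute CD every 10 days
}

total_per_day = sum(daily_bytes.values())
years_per_tb = TB / total_per_day / 365
years_per_160gb = 160 * GB / total_per_day / 365
video_tb_per_year = stream_bytes(4, 1500) * 365 / TB  # 4 h/day at 1.5 Mb/s

print(f"{total_per_day / MB:.1f} MB per day")
print(f"1 TB lasts about {years_per_tb:.0f} years")
print(f"160 GB lasts about {years_per_160gb:.0f} years")
print(f"video needs about {video_tb_per_year:.2f} TB per year")
```

Sound recording dominates the daily total, which is why the estimate is sensitive to how the 8 Kb/s rate is interpreted.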
Introduction & Motivation: Main Objectives
1. Search for textual and audiovisual content
2. Rank results according to relevance
3. Personalize such search and ranking
  - Not all users are the same
  - Find what they are interested in
4. While protecting private information and resources
Personalized Search & Ranking: Representing context by Semantic Web metadata
Metadata for resources can be created by appropriate metadata generators.
Ontologies specify context metadata for, e.g.:
- Emails
- Files
- Web pages
- Publications
Metadata have to be application-independent!
Store metadata as RDF.
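As an illustration of storing resource metadata as RDF, the sketch below turns the metadata of one email into subject-predicate-object triples and prints them in N-Triples syntax. The namespace `http://example.org/desktop#`, the class `Email`, and the property names `sender` and `subject` are hypothetical and are not taken from any ontology named in the talk.

```python
# Hypothetical namespace and vocabulary, for illustration only.
EX = "http://example.org/desktop#"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def uri(name):
    """Wrap a local name in the (hypothetical) namespace as an IRI term."""
    return f"<{EX}{name}>"


def literal(value):
    """Render a plain string literal, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def email_to_triples(email_id, metadata):
    """Describe one email as a list of (subject, predicate, object) triples."""
    subject = uri(email_id)
    triples = [(subject, RDF_TYPE, uri("Email"))]
    for prop, value in metadata.items():
        triples.append((subject, uri(prop), literal(value)))
    return triples


def to_ntriples(triples):
    """Serialize triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = email_to_triples(
    "email42", {"sender": "alice@example.org", "subject": "Meeting notes"}
)
print(to_ntriples(triples))
```

Because the triples refer only to a shared vocabulary and not to any mail client's internal format, any application that understands the vocabulary can consume them, which is the point of application-independent metadata.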
Personalized Search & Ranking: Personalization in the Semantic Web
- gather online information, integrate heterogeneous sources, syndicate according to user’s preferences
- embed resources with a personalized context
- enable users to choose which kind of personalized guidance in what combination they appreciate as support (plug & learn)
Realization:
- semi-automated extraction of information from heterogeneous sources
- re-usable personalization algorithms reason about distributed data sources (user data, course descriptions, ontologies, etc.)
- personalization rules reason about resources, e.g. to make recommendations
[Baumgartner, Henze, Herzog. The Personal Publication Reader: Illustrating Web Data Extraction, Personalization and Reasoning for the Semantic Web. ESWC’05]
Personalized Search & Ranking: User Knowledge and Interests
Competence: “an effective performance within a domain / context at different levels of proficiency”
Can be explicitly defined by the user or inferred automatically
Personalized Search & Ranking: Expanding User Queries with Local Context
1. User-related documents (desktop documents) containing the query
2. Score and extract keywords
3. Top query-dependent, user-biased keywords
4. Extract query expansion or re-ranking terms
[Chirita, Firan, Nejdl. Summarizing local context to personalize global web search. CIKM 2006]
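The pipeline on this slide can be sketched in a few lines. The version below is a deliberately simple stand-in: it scores candidate keywords by raw term frequency in the desktop documents that contain the query, which is an assumption made for illustration and not the scoring used in the cited paper. The stopword list, the function names and the sample documents are likewise hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption, not from the talk).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def expand_query(query, desktop_docs, k=2):
    """Append the k most frequent co-occurring desktop terms to the query."""
    query_terms = set(tokenize(query))
    # Keep only the user's documents that contain every query term.
    matching = [d for d in desktop_docs if query_terms <= set(tokenize(d))]
    # Score candidate keywords by frequency across those documents.
    counts = Counter(
        t for d in matching for t in tokenize(d) if t not in query_terms
    )
    # Take the top-scoring keywords as expansion terms.
    return query.split() + [term for term, _ in counts.most_common(k)]


docs = [
    "Jaguar habitat in the rainforest: the jaguar hunts in the rainforest",
    "Notes on jaguar conservation and rainforest protection",
    "Car dealership offers for the new sedan",
]
print(expand_query("jaguar", docs, k=1))
```

Here the user's own documents disambiguate an ambiguous query toward the animal and away from the car, without any profile leaving the desktop.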
Personalized Search & Ranking: Data heterogeneity
Characteristics
- A lot of text (unstructured information)
- A lot of structures, e.g. title, author, creation-date, …
- Heterogeneity in structure
  - Different holders (applications) use different schemas
  - In nature, the structure of a domain is too complex for us to give it a clear and certain definition
Classical Data Integration
- Transform data into a clear and uniform structure before we use it
- Intensive human intervention: very laborious and not scalable
Malleable Schema (X. Dong & A. Halevy ’05)
- Allow overlapping and vague elements to be defined in a single schema
Personalized Search & Ranking: Malleable Schemas: Example
[Figure: sample desktop data (documents, emails, persons) described with overlapping schema elements; the elements shown include Person, Doc, email, book, paper, first name, sur name, name, title, subject, body, contents, author, writer, sender, attachment and date]
Personalized Search & Ranking: Querying Malleable Schemas
[Figure: Person entities described by first name and sur name in one source and by name in another]
For example, a user issues the query:
  Q1: Select Person Where first_name Contains “Philip”
To obtain the complete results, we should relax the query to:
  Q2: Select Person Where first_name Contains “Philip” Or name Contains “Philip”
A query has to be relaxed to related schema elements.
But how do we discover the correlation between schema elements?
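The step from Q1 to Q2 can be sketched as follows, assuming attribute correlations are already known. The correlation scores, the threshold and the sample records are invented for illustration; only the attribute names `first_name`, `sur_name` and `name` come from the slide.

```python
def relax(attribute, correlations, threshold=0.5):
    """Return the attribute plus every attribute correlated above the threshold."""
    related = [
        other
        for other, score in correlations.get(attribute, {}).items()
        if score >= threshold
    ]
    return [attribute] + sorted(related)


def select(records, entity_type, attribute, needle, correlations):
    """Select <entity_type> Where <attribute> Contains <needle>, with relaxation."""
    attributes = relax(attribute, correlations)
    return [
        record
        for record in records
        if record.get("type") == entity_type
        and any(needle in str(record.get(a, "")) for a in attributes)
    ]


# Hypothetical data: two sources describe persons with different attributes.
records = [
    {"type": "Person", "first_name": "Philip", "sur_name": "Smith"},
    {"type": "Person", "name": "Philip Jones"},
    {"type": "Person", "name": "Anna Lee"},
]
# Hypothetical correlation scores between schema elements.
correlations = {"first_name": {"name": 0.8, "sur_name": 0.2}}

strict = select(records, "Person", "first_name", "Philip", {})             # Q1
relaxed = select(records, "Person", "first_name", "Philip", correlations)  # Q2
print(len(strict), len(relaxed))
```

Passing an empty correlation table reproduces the strict query Q1; with the table, the person stored under `name` is found as well.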
Personalized Search & Ranking: Discover Schema Correlations (I)
Solution: find duplicates which use different attributes.
Observation:
1. more duplicates: better schema correlation discovery
2. more accurate schema correlations: better duplicate detection
Solution: Let schema correlation discovery and duplicate detection reinforce each other to achieve improved results
Personalized Search & Ranking: Discover Schema Correlations (& II)
[Figure: six entities E1 to E6 described with the attributes title, subject, author, writer, pub-date and rec-date; sample values include XML, DB, AI, Logic, Daniel, Ullman and Stuart]
duplicates: {E1, E2}, {E3, E4}, {E5, E6}
attribute matches: {title, subject}, {author, writer}, {pub-date, rec-date}
[Xuan Zhou, Julien Gaugaz, Wolf-Tilo Balke, Wolfgang Nejdl. Query Relaxation Using Malleable Schema. SIGMOD’07]
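One direction of this idea, deriving attribute matches from known duplicates, can be sketched as a simple voting scheme: whenever two duplicate entities carry the same value under differently named attributes, that attribute pair receives a vote. This is a minimal sketch and not the algorithm of the cited paper; in particular it omits the reverse direction, in which discovered matches feed back into duplicate detection. The attribute values assigned to E1 to E6 below are a hypothetical reconstruction for illustration.

```python
from collections import Counter
from itertools import product


def attribute_matches(entities, duplicates):
    """Count, per attribute pair, how often duplicates agree on the value."""
    votes = Counter()
    for left_id, right_id in duplicates:
        left, right = entities[left_id], entities[right_id]
        for (a, value_a), (b, value_b) in product(left.items(), right.items()):
            # Different attribute names holding the same value suggest a match.
            if a != b and value_a == value_b:
                votes[frozenset((a, b))] += 1
    return votes


# Hypothetical values; only the entity ids and attribute names follow the slide.
entities = {
    "E1": {"title": "XML", "author": "Daniel"},
    "E2": {"subject": "XML", "writer": "Daniel"},
    "E3": {"title": "DB", "author": "Ullman", "pub-date": "Jul 1994"},
    "E4": {"subject": "DB", "writer": "Ullman", "rec-date": "Jul 1994"},
    "E5": {"title": "AI", "author": "Stuart", "pub-date": "Nov 2001"},
    "E6": {"subject": "AI", "writer": "Stuart", "rec-date": "Nov 2001"},
}
duplicates = [("E1", "E2"), ("E3", "E4"), ("E5", "E6")]

votes = attribute_matches(entities, duplicates)
for pair, count in votes.most_common():
    print(sorted(pair), count)
```

Pairs with many votes are strong candidates for correlated schema elements and could serve as input to the query relaxation step.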
Privacy & Access Control in Open Systems (I)
Privacy & Access Control in Open Systems (& II)
Assumption: I already know you
- you have a local account!
Not a member?
Privacy & Access Control: Policy Examples
- Give customers younger than 26 a 20% discount
- Up to 15% of network bandwidth can be reserved by paying with an accepted credit card
- Customers can rent a car if they are 18 or older, and exhibit a driving license and a valid credit card
[Bonatti, Olmedilla. Driving and Monitoring Provisional Trust Negotiation with Metapolicies. IEEE Policies for Distributed Systems and Networks, 2005]
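Two of these policies can be written directly as checks over a customer's attributes and credentials. The sketch below uses plain Python predicates rather than a policy language; the `Customer` class and the credential names `driving_license` and `valid_credit_card` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    age: int
    credentials: set = field(default_factory=set)  # credentials the customer exhibits


def discount(customer):
    """Customers younger than 26 get a 20% discount."""
    return 0.20 if customer.age < 26 else 0.0


def may_rent_car(customer):
    """Renting requires age 18 or older, a driving license and a valid credit card."""
    required = {"driving_license", "valid_credit_card"}
    return customer.age >= 18 and required <= customer.credentials


student = Customer(age=22, credentials={"driving_license", "valid_credit_card"})
teenager = Customer(age=17, credentials={"driving_license", "valid_credit_card"})
no_card = Customer(age=40, credentials={"driving_license"})

print(discount(student), may_rent_car(student))
print(may_rent_car(teenager), may_rent_car(no_card))
```

Note that the decision depends only on properties and credentials of the requester, not on a pre-existing local account.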
Privacy & Access Control: Use Credentials
Privacy & Access Control: Negotiations
[Figure labels: Alice, Bob, Service]
Step 1: Alice requests a service from Amazon
Step 2: Amazon discloses its policy for the service
Step 3: Alice discloses her policy for VISA
Step 4: Amazon discloses its BBB credential
Step 5: Alice discloses her VISA card credential
Step 6: Amazon grants access to the service
[Winsborough, Seamons, Jones. Automated trust negotiation. DARPA Information Survivability Conference and Exposition, 2000]
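The exchange above can be simulated with a small loop in which each party releases a credential as soon as the other side has shown everything that credential's release policy demands. This is a simplified sketch, not the protocol from the cited paper: policies are modelled as plain sets of required credentials, the explicit policy-disclosure messages of steps 2 and 3 are left implicit, and all function and variable names are hypothetical.

```python
def negotiate(service_policy, client, server, max_rounds=10):
    """Simulate a negotiation.

    service_policy: credentials the server requires before granting the service.
    client / server: map each credential to the set of counterpart credentials
    that must be received before it may be disclosed.
    Returns (granted, transcript).
    """
    seen_by_server, seen_by_client = set(), set()
    transcript = []
    for _ in range(max_rounds):
        if service_policy <= seen_by_server:
            transcript.append("server grants access")
            return True, transcript
        progress = False
        # The server releases every credential whose policy is now satisfied.
        for credential, needs in server.items():
            if credential not in seen_by_client and needs <= seen_by_server:
                seen_by_client.add(credential)
                transcript.append(f"server discloses {credential}")
                progress = True
        # The client does the same with its own credentials.
        for credential, needs in client.items():
            if credential not in seen_by_server and needs <= seen_by_client:
                seen_by_server.add(credential)
                transcript.append(f"client discloses {credential}")
                progress = True
        if not progress:
            return False, transcript  # deadlock: nobody can disclose more
    return False, transcript


# The scenario from the slide: the service requires VISA, and Alice only
# shows her VISA card to parties that have presented a BBB credential.
granted, transcript = negotiate(
    service_policy={"VISA"},
    client={"VISA": {"BBB"}},
    server={"BBB": set()},
)
print(granted, transcript)
```

If the server held no BBB credential, neither side could make progress and the negotiation would end without access being granted.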
Privacy & Access Control: User awareness and Control
- Explain policies and system decisions
  - Make rules & reasoning intelligible to the common user
- Use natural language?
  - “Academic users can download the files in folder historical_data whenever their creation date precedes 1942”
  - Suitably restricted to avoid ambiguities
  - Fortunately, users spontaneously formulate rules
Privacy & Access Control: Cooperativeness & Verbalization
Suppose Alice's request is rejected. She may want to ask questions like:
- Why didn't you accept my credit card?
Other possible queries
- How-to queries
- What-if queries
  - Would I get the special discount on financial products X if I were locally employed?
[Bonatti, Olmedilla, Peer. Advanced policy explanations on the web. ECAI 2006]
Privacy & Access Control: Sample Screenshot (I)
Privacy & Access Control: Sample Screenshot (& II)
EU Projects Summary: EU IP Nepomuk: Social Semantic Desktop
- Desktop: Help individuals in managing information on their PC
- Semantic: Make content available to automated processing
- Social: Enable exchange across individual boundaries
[Figure: a Personal Semantic Web ("a semantically enlarged intimate supplement to memory") linking Person, Email, Topic, Document, WebSite, Event and Image; social protocols and distributed search connect NEPOMUK-enabled peers (acquaintance, friend, colleague)]
EU Projects Summary: EU IP PHAROS
Key aspects: Vision, Integration, High Impact, Openness & Federation
- PHAROS will move forward audiovisual searching from a point-solution search engine paradigm to an integrated search platform paradigm.
- PHAROS will integrate future user and search requirements in living laboratories for innovation.
- PHAROS partners are from 9 European countries and will integrate their development with their nationally funded projects. SMEs, academia and large industrial players will ensure maximum impact on the business scenario.
- PHAROS will use an open approach in integrating external experiences and contributions and exchange results through the PHAROS Federation.
- PHAROS will use a specifically designed management structure, integrating the different PHAROS “streams”.
EU Projects Summary: EU NoE REWERSE
REasoning on the WEb with Rules and SEmantics
Web reasoning languages & processing
- Define set of reasoning languages
  - Coherent
  - Inter-operable
  - Functionality and application independent
- For advanced Web systems and applications
Advanced applications as testbeds for languages
- Context-adaptive Web systems
- Web-based decision support systems
EU Projects Summary: EU IP TENCompetence
EU Projects Summary: L3S Project Leaders (http://www.L3S.de)
- NEPOMUK - http://nepomuk.semanticdesktop.org/ - Dr. Claudia Niederee
- PHAROS - http://www.pharos-audiovisual-search.eu/ - Dr. Bhaskar Mehta
- REWERSE - http://rewerse.net/ - Prof. Dr. Nicola Henze
- TENCompetence - http://www.tencompentece.org/ - Dr. Daniel Olmedilla
Thanks!
Daniel Olmedilla
olmedilla@L3S.de - http://www.L3S.de/~olmedilla/