
- Number of slides: 79
1 Advanced databases – Defining and combining heterogeneous databases: The Semantic Web
Bettina Berendt, Katholieke Universiteit Leuven, Department of Computer Science
http://www.cs.kuleuven.be/~berendt/teaching/2008w/adb/
Last update: 30 September 2008
Berendt: Advanced databases, first sem. 2008, http://www.cs.kuleuven.be/~berendt/teaching/2008w/adb/
2 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
3 Problems with current search engines
Current search engines = (mostly) keywords:
- low precision (… and recall?)
- sensitive to vocabulary
- insensitive to implicit content
4 Search engines on the Semantic Web
- concept search instead of keyword search
Two classes of approaches
- semantic narrowing/widening of queries
- query-answering over >1 document
- document transformation operators
5 Resolving content problems: Example homonymy
<html>
<head>
<title>A page about jaguars</title>
<meta name="description" content="animals, cats">
</head>
(Solution approach I) OR ...
6 Homonymy: Solution approach II
7 Homonymy: Solution approach III
8 Resolving quality problems
How to find out whether a page is good, important, etc.?
<meta name="isEndorsedBy" content="anImportantPerson">
OR (PageRank)
9 Semantic non-interoperability has real consequences ...
10 The Semantic Web: overview
- The semantic web is an evolving extension of the World Wide Web in which web content can be expressed not only in natural language, but also in a format that can be read and used by software agents, thus permitting them to find, share and integrate information more easily.
- It derives from W3C director Sir Tim Berners-Lee's vision of the Web as a universal medium for data, information, and knowledge exchange.
- At its core, the semantic web comprises a philosophy, a set of design principles, collaborative working groups, and a variety of enabling technologies.
- Some elements of the semantic web are expressed as prospective future possibilities that have yet to be implemented or realized.
- Other elements of the semantic web are expressed in formal specifications.
- Some of these include Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain.
11 The Semantic Web layer cake (T. Berners-Lee talk at XML 2000)
OWL: W3C Rec. 2004
RDF: W3C Rec. 2004
12 The original vision (or: semantics for interoperability)
The entertainment system was belting out the Beatles' "We Can Work It Out" when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His sister, Lucy, was on the line from the doctor's office: "Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I'm going to have my agent set up the appointments." Pete immediately agreed to share the chauffeuring.
At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules. (The emphasized keywords indicate terms whose semantics, or meaning, were defined for the agent through the Semantic Web.)
Tim Berners-Lee, James Hendler and Ora Lassila (2001). The Semantic Web. A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities. Scientific American. http://www.sciam.com/article.cfm?id=the-semantic-web
13 Update 2006: decentralization, bottom-up engineering
Q: Project failure is a big subject in the UK and you've been involved in a massive ongoing IT project - what have you learned from it that could benefit our members?
A: [...] But I think IT projects are about supporting social systems - about communications between people and machines. They tend to fail due to cultural issues. [...] The view we are taking with the Semantic Web is interesting here. In the past scientists have been trained to do things top down. In the business world projects are often the boss's vision made flesh. Even software engineering is about taking an idea and breaking it into smaller pieces to work on - but the software project is itself part of something larger. To make this better we need Web-like approaches - I'm not talking about HTML here but, rather, an interconnected approach. The Semantic Web approach can be visualized as rigid platelets of information loosely sewn together at the edges - rich in local knowledge, but capable of linking to things in the outside world. That approach would benefit the social aspects of projects.
14 Update 2006: The Semantic Web and databases
Q: [...] the application of [ontologies] would clearly see a true Semantic Web, but how can we apply these principles to the billions of existing Web pages?
A: Don't. Web pages are designed for people. For the Semantic Web we need to look at existing databases and the data in them. To make this information useful semantically requires a sequence of events:
1. Do a model of what's in the database - which would give you an ontology you could work out on the back of an envelope. Write it in RDF Schema or OWL (the Web Ontology Language).
2. Find out who else has already got equivalent terms in an ontology. For those things use their terms instead.
3. Write down how your database connects to those things.
Using this information you can set up a Web server that runs resource description framework (RDF). A larger database could support queries.
15 Update 2006: Identifiers, human-machine collaboration
To make all this really useful it's important that all important things - such as customers and products - have URIs (Uniform Resource Identifiers) - for example, http://example.com/products.rdf#hairdryers - so invoices, shipping notes, product specifications and so on can refer to them. These would all be virtual RDF files - the server would generate them on the fly and it would all be available on the Semantic Web. Then an individual could compare products directly by their specifications, weight and delivery charges, price and so on, in a way that HTML won't allow.
(last 3 slides from: Isn't it semantic? Interview with Tim Berners-Lee on BCS. 2006. http://www.bcs.org/server.php?show=ConWebDoc.3337)
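The idea of "virtual RDF files" generated on the fly from a database can be sketched in a few lines. This is only an illustration under stated assumptions: the table contents, the column names `weight_kg` and `price_eur`, and the helper `row_to_triples` are hypothetical; only the base URI comes from the interview's example.

```python
# Base URI taken from the interview's example; everything else is hypothetical.
BASE = "http://example.com/products.rdf#"

# A stand-in for rows read from an existing product database.
rows = [
    {"id": "hairdryers", "weight_kg": "0.5", "price_eur": "19.99"},
]

def row_to_triples(row, base=BASE):
    """Turn one database row into (subject, predicate, object) triples.

    The row's id becomes the subject URI; every other column becomes
    a property URI with the cell content as its value.
    """
    subject = base + row["id"]
    return [(subject, base + column, value)
            for column, value in row.items() if column != "id"]

triples = [t for row in rows for t in row_to_triples(row)]
for s, p, o in triples:
    print(s, p, o)
```

Because every product and every property has a URI, invoices or shipping notes generated elsewhere could refer to the same identifiers without agreeing on a table layout first.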
16 What does this "buy us"? A motivating example: Bridging the Terminology Gap using OWL
A key problem in achieving interoperability is to be able to recognize that two pieces of data are talking about the same thing, even though different terminology is being used. The following slides present an example to show how OWL may be used to bridge the "terminology gap".
17 Interested in Purchasing a Camera
Scenario:
- I am interested in purchasing a camera with a 75-300mm zoom lens size, that has an aperture of 4.5-5.6, and a shutter speed that ranges from 1/500 sec. to 1.0 sec.
- I launch my personal "Web Bot" which crawls the Web looking for Web sites that can fulfill my request.
- Assume that there exists an OWL Camera Ontology, which the Web Bot can "consult" upon its travels across the Web.
18 Is this document relevant?
The Web Bot finds this document at a Web site: Is it relevant?
<PhotographyStore rdf:ID="Hunts" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <store-location>Malden, MA</store-location>
  <phone>617-555-1234</phone>
  <catalog rdf:parseType="Collection">
    <SLR rdf:ID="Olympus-OM-10" xmlns="http://www.camera.org#">
      <lens>
        <Lens>
          <focal-length>75-300mm zoom</focal-length>
          <f-stop>4.5-5.6</f-stop>
        </Lens>
      </lens>
      <body>
        <Body>
          <shutter-speed rdf:parseType="Resource">
            <min>0.002</min>
            <max>1.0</max>
            <units>seconds</units>
          </shutter-speed>
        </Body>
      </body>
      <cost rdf:parseType="Resource">
        <rdf:value>325</rdf:value>
        <currency>USD</currency>
      </cost>
    </SLR>
  </catalog>
</PhotographyStore>
(Note: SLR = Single Lens Reflex)
19 A Match?
Document found: Hunts.xml, the PhotographyStore document shown on slide 18.
Request: I am interested in purchasing a camera with a 75-300mm zoom lens size, that has an aperture of 4.5-5.6, and a shutter speed that ranges from 1/500 sec. to 1.0 sec.
To determine if there is a match, these questions must be answered:
1. What's the relationship between "SLR" and "Camera"?
2. What's the relationship between "focal-length" and "size"?
3. What's the relationship between "f-stop" and "aperture"?
20 Relationship between SLR and Camera?
The Web Bot "consults" the OWL Camera Ontology. This OWL statement tells the Web Bot that an SLR is a type of Camera:
<owl:Class rdf:ID="SLR">
  <rdfs:subClassOf rdf:resource="#Camera"/>
</owl:Class>
Diagram: the Web Bot finds <SLR> … </SLR> inside <PhotographyStore rdf:ID="Hunts"> in Hunts.xml, asks Camera.owl "Relationship between Camera and SLR?", and receives the statement above: "SLR is a type of Camera."
21 Relationship between focal-length and lens size?
This OWL statement tells the Web Bot that focal-length is equivalent to lens size:
<owl:DatatypeProperty rdf:ID="focal-length">
  <owl:equivalentProperty rdf:resource="#size"/>
  <rdfs:domain rdf:resource="#Lens"/>
  <rdfs:range rdf:resource="&xsd;#string"/>
</owl:DatatypeProperty>
"focal-length is synonymous with (lens) size. focal-length is to be used within a Lens. focal-length has a value that is a string."
22 Relationship between f-stop and aperture?
This OWL statement tells the Web Bot that f-stop is equivalent to aperture:
<owl:DatatypeProperty rdf:ID="f-stop">
  <owl:equivalentProperty rdf:resource="#aperture"/>
  <rdfs:domain rdf:resource="#Lens"/>
  <rdfs:range rdf:resource="&xsd;#string"/>
</owl:DatatypeProperty>
The Web Bot now recognizes that the XML document it found at the Web site
- is talking about Cameras, and it
- does show the lens size, and it
- does show the aperture for the camera, and
- the values for lens size, aperture, and shutter speed are met.
Thus, the Web Bot recognizes that the XML document is a match!
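The Web Bot's reasoning over the last few slides can be sketched as follows. This is a minimal illustration, not an OWL reasoner: the two dictionaries are a hand-written stand-in for Camera.owl, the function names are hypothetical, the equivalence is only applied in one direction (owl:equivalentProperty is symmetric), and the shutter-speed comparison is left out for brevity.

```python
# Facts transcribed from the Camera ontology shown on the slides.
SUBCLASS_OF = {"SLR": "Camera"}
EQUIVALENT_PROPERTY = {"focal-length": "size", "f-stop": "aperture"}

def is_a(cls, target):
    """Follow subClassOf links upward until target is found or links run out."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False

def normalise(properties):
    """Rename each property to the request's term where the ontology maps it."""
    return {EQUIVALENT_PROPERTY.get(name, name): value
            for name, value in properties.items()}

def matches(found, request):
    """True if the found item is of the requested class and has the requested values."""
    props = normalise(found["properties"])
    return (is_a(found["class"], request["class"])
            and all(props.get(k) == v for k, v in request["properties"].items()))

# What the Web Bot read from Hunts.xml, and what the user asked for.
found = {"class": "SLR",
         "properties": {"focal-length": "75-300mm zoom", "f-stop": "4.5-5.6"}}
request = {"class": "Camera",
           "properties": {"size": "75-300mm zoom", "aperture": "4.5-5.6"}}

print(matches(found, request))
```

Note that the matching code never mentions "SLR", "focal-length" or "f-stop"; those terms appear only in the ontology data and the document.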
23 Semantic Definitions Separate from Application!
Diagram with three parts:
- Hunts.xml: the <SLR rdf:ID="Olympus-OM-10"> description shown on slide 18.
- Web Bot (application): asks "Relationship between Camera and SLR?", "Relationship between aperture and f-stop?", and "Relationship between size and focal-length?"
- Camera.owl (semantic definitions): holds the owl:Class and owl:DatatypeProperty statements shown on slides 20-22 and answers "SLR is a type of Camera.", "f-stop is synonymous with aperture.", and "focal-length is synonymous with size."
24 Summary: Interoperability despite terminology differences!
The example demonstrated how a Web Bot application was able to dynamically process an XML document from a Web site, even though the XML document used terminology different from that used to express the request. This interoperability was achieved by using the OWL Camera Ontology!
This example also demonstrated the architectural design principle of cleanly separating the application code (e.g., Web Bot) from the semantic definitions (e.g., Camera.owl).
25 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
26 You have data … How should you structure it?
Here's some data about an aircraft:
- medium-altitude, long-endurance unmanned aerial vehicle
- 14.8 meters
- 512 kilograms
- 70 knots
- 400 nautical miles
27 The XML approach is to "wrap" each data item in start/end tags
RQ-1.xml:
<Aircraft>
  <wingspan>14.8 meters</wingspan>
  <weight>512 kilograms</weight>
  <cruise-speed>70 knots</cruise-speed>
  <range>400 nautical miles</range>
  <description>
    medium-altitude, long-endurance unmanned aerial vehicle
  </description>
</Aircraft>
28 XML Terminology
<wingspan>14.8 meters</wingspan>
- Start tag: <wingspan>
- End tag: </wingspan>
- Data: 14.8 meters
- Element: the start tag, the data, and the end tag together
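Because XML fixes the syntax, any standard parser can read RQ-1.xml without aircraft-specific code. A minimal sketch using Python's standard `xml.etree.ElementTree` module (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

# The RQ-1.xml document from the slide, embedded as a string.
RQ1_XML = """<Aircraft>
  <wingspan>14.8 meters</wingspan>
  <weight>512 kilograms</weight>
  <cruise-speed>70 knots</cruise-speed>
  <range>400 nautical miles</range>
  <description>
    medium-altitude, long-endurance unmanned aerial vehicle
  </description>
</Aircraft>"""

root = ET.fromstring(RQ1_XML)

# Map each child element's tag to its data, trimming layout whitespace.
record = {child.tag: child.text.strip() for child in root}

print(root.tag)
for tag, data in record.items():
    print(f"{tag}: {data}")
```

The parser recovers the structure (which data item sits in which element), but nothing in it says what "wingspan" or "Aircraft" means; that gap is the subject of the following slides.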
29 Why use XML?
- It is a universally accepted standard way of structuring data (syntax).
- It is a W3C recommendation (W3C = World Wide Web Consortium).
- The marketplace supports it with a lot of free/inexpensive tools.
- The alternative to using XML is to define your own proprietary data syntax, and then build your own proprietary tools to support the proprietary syntax (not a very appealing idea).
30 BUT … XML: limitations for semantic markup
XML makes no commitment on:
- Domain-specific ontological vocabulary
- Ontological modeling primitives
Requires pre-arranged agreement on both
Only feasible for closed collaboration
- agents in a small & stable community
- pages on a small & stable intranet
Not suited for sharing Web-resources
31 Syntax versus Semantics
Syntax: the structure of your data
- e.g., XML mandates that you structure your data by "wrapping" each data item within a start tag and an end tag pair, with the end tag being preceded by / and both tags in <…> brackets.
- That is, XML specifies the syntax of your data.
Semantics: the meaning of your data
Two conditions necessary for interoperability:
1. Adopt a common syntax: this enables applications to parse the data. XML provides a common syntax, and thus is a critical first step.
2. Adopt a means for understanding the semantics: this enables applications to use the data. OWL provides a standard way of expressing the semantics.
32 What is this XML snippet talking about, i.e., what are the semantics?
<Predator> … </Predator>
What is a Predator?
33 Predator - which one?
- Predator: a medium-altitude, long-endurance unmanned aerial vehicle system.
- Predator: one that victimizes, plunders, or destroys, especially for one's own gain.
- Predator: an organism that lives by preying on other organisms.
- Predator: a company which specializes in camouflage attire.
- Predator: a video game.
- Predator: software for machine networking.
- Predator: a chain of paintball stores.
34 Resolving Semantics
The next few slides present an approach that applications can take for understanding the meaning of data. This approach is often taken today. We will then examine its disadvantages and offer a better approach.
35 Meaning (semantics) applied on a per-application basis
Diagram: <Predator> … </Predator> is fed to an application that contains:
Semantics: A Predator is a type of Aircraft.
Actions: These actions must be performed on the Predator data:
- identify ground control station.
- determine onboard sensors.
- determine ordnance.
36 Meaning (semantics) applied on a per-application basis
Diagram: the same XML is fed to two applications, each with its own copy of both parts:
- app#1: Semantics (code to interpret the data) + Action (code to process the data)
- app#2: Semantics (code to interpret the data) + Action (code to process the data)
37 Problem with attaching semantics on a per-application basis
Diagram: application = Semantics (code to interpret the data) + Action (code to process the data)
Problems with burying semantic definitions within each application:
- Duplicate effort
  - Each application must express the semantics
- Variability of interpretation
  - Each application can take its own interpretation
  - Example: Mars probe disaster (Mars Climate Orbiter, 1999) - one application reported data in imperial units (pound-force seconds), another interpreted the same data in metric units (newton-seconds).
- No ad-hoc discovery and exploitation
  - Applications have the semantics pre-wired. Thus, when new data (e.g., a new type of aircraft) is encountered, an application may not be able to process it effectively. This makes for brittle applications.
What's a better approach?
38 Better approach:
(1) Extricate semantic definitions from applications
(2) Express semantic definitions in a standard vocabulary
Diagram: XML is fed to app#1 and app#2, which now contain only Action (code to process the data); both consult a shared OWL Document that holds the Semantic Definitions.
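The architectural point can be illustrated with a small sketch in which two applications contain only processing code and look up meaning in one shared structure. The dictionary `SEMANTICS` is a hypothetical stand-in for an OWL document, and the function names and the class "Glider" are made up for the example.

```python
# Shared semantic definitions: an illustrative stand-in for an OWL document.
SEMANTICS = {
    "subClassOf": {"Predator": "Aircraft"},
}

def superclass(cls):
    """Look up the declared superclass of cls, or None if nothing is declared."""
    return SEMANTICS["subClassOf"].get(cls)

def app1_describe(tag):
    """app#1: action code only; the meaning comes from SEMANTICS."""
    return f"{tag} is a type of {superclass(tag)}"

def app2_is_aircraft(tag):
    """app#2: a different action that relies on the same shared definitions."""
    return superclass(tag) == "Aircraft"

print(app1_describe("Predator"))
print(app2_is_aircraft("Predator"))

# A new type of aircraft is added to the definitions, not to either application.
SEMANTICS["subClassOf"]["Glider"] = "Aircraft"
print(app2_is_aircraft("Glider"))
```

Neither application had to change to handle the new class, which addresses the brittleness described on the previous slide.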
39 OWL provides an agreed-upon vocabulary for expressing semantics
A Sampling of the OWL Vocabulary:
- subClassOf: this OWL element is used to assert that one class of items is a subset of another class of items. Example: Predator is a subClassOf Aircraft.
- FunctionalProperty: this OWL element is used to assert that a property has a unique value. Example: sensorID is a FunctionalProperty, i.e., sensorID has a unique value.
- equivalentClass: this OWL element is used to assert that one Class is equivalent to another Class. Example: Platform is an equivalentClass to Aircraft.
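To make the FunctionalProperty idea concrete, here is a small sketch that looks for subjects with more than one distinct value for a property declared functional. It is a simplified, closed-world check written for illustration: an actual OWL reasoner would not "reject" such data but would infer that the values denote the same thing, or report an inconsistency for distinct literals. The sensor names and ID values are hypothetical.

```python
def multiple_values(triples, prop):
    """Return subjects that have more than one distinct value for prop."""
    seen = {}
    for s, p, o in triples:
        if p == prop:
            seen.setdefault(s, set()).add(o)
    return sorted(s for s, values in seen.items() if len(values) > 1)

# Hypothetical instance data as (subject, property, value) triples.
triples = [
    ("sensor-a", "sensorID", "S-001"),
    ("sensor-b", "sensorID", "S-002"),
    ("sensor-b", "sensorID", "S-003"),  # second distinct value for sensor-b
]

print(multiple_values(triples, "sensorID"))
```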
40 Why use OWL? Why use RDF?
Benefits to application developers:
- Less code to write (save $$$).
- Less chance of misinterpretation (save $$$).
Benefits to community at large:
- Everyone can understand each other's data's semantics, since they are in a common language.
- OWL uses the XML syntax to express semantics, i.e., it builds on an existing technology.
  - Don't have to learn new syntax.
  - Common XML tools (e.g., parsers) can work on OWL.
- OWL is a W3C recommendation.
- OWL builds on RDF (also a W3C recommendation)
  - Expressive enough for many applications
  - Simpler
  - need to understand this first
41 Ontologies and concepts
- An ontology is a conceptual model.
- An Ontology is the collection of semantic definitions for a domain.
- Example: an Aircraft Ontology is the set of semantic definitions for the Aircraft domain, e.g.,
  - Predator is a subClassOf Aircraft.
  - sensorID is a FunctionalProperty.
  - Platform is an equivalentClass to Aircraft.
- Predator, Aircraft etc. are concepts.
42 Basic idea of conceptual modelling (not only in SW): The semiotic triangle
43 What is an ontology? (A commonly accepted informal definition and one formal definition)
An ontology is "an explicit specification of a shared conceptualisation." (Gruber, 1993)
44 Ontologies, decentralization, and bottom-up engineering
Communities of users (application builders, ...) can
- Re-use existing ontologies
  - Established domain-specific ontologies (e.g., real-estate, medicine, bioinformatics)
  - All kinds: see the Semantic Web search engine http://swoogle.umbc.edu/
  - "The big one": Cyc, see www.cyc.com
- Link to existing ontologies (ontology matching / alignment)
- Extend existing ontologies
45 Ontologies as conceptual models / schemas; or: Database (knowledge base) = Ontology + Instances
Ontology:
<owl:Class rdf:ID="BookCatalogue"/>
<owl:DatatypeProperty rdf:ID="title">
  <rdfs:domain rdf:resource="#BookCatalogue"/>
  <rdfs:range rdf:resource="&xsd;#string"/>
</owl:DatatypeProperty>
<owl:DatatypeProperty rdf:ID="author">
  <rdfs:domain rdf:resource="#BookCatalogue"/>
  <rdfs:range rdf:resource="&xsd;#string"/>
</owl:DatatypeProperty>
<owl:DatatypeProperty rdf:ID="date">
  <rdfs:domain rdf:resource="#BookCatalogue"/>
  <rdfs:range rdf:resource="&xsd;#date"/>
</owl:DatatypeProperty>
Instances (table BookCatalogue):
title                  | author          | date
My Life and Times      | Paul McCartney  | June, 1998
Illusions              | Richard Bach    | 1972
First and Last Freedom | J. Krishnamurti | 1974
The first row as XML:
<?xml version="1.0"?>
<BookCatalogue>
  <title>My Life and Times</title>
  <author>Paul McCartney</author>
  <date>June, 1998</date>
</BookCatalogue>
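The split between ontology (schema) and instances can be sketched as a check that an instance only uses properties declared for its class. This is an illustrative simplification: the dictionary `ONTOLOGY` is a hand transcription of the three DatatypeProperty declarations, the function name is hypothetical, and the check is a schema-style validation rather than OWL inference (it also ignores the declared ranges).

```python
import xml.etree.ElementTree as ET

# Property -> (domain, range), transcribed from the OWL declarations on the slide.
ONTOLOGY = {
    "title": ("BookCatalogue", "string"),
    "author": ("BookCatalogue", "string"),
    "date": ("BookCatalogue", "date"),
}

# The instance document from the slide (XML declaration omitted).
INSTANCE = """<BookCatalogue>
  <title>My Life and Times</title>
  <author>Paul McCartney</author>
  <date>June, 1998</date>
</BookCatalogue>"""

def undeclared_properties(xml_text):
    """List child elements that are not declared with the root class as domain."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root
            if ONTOLOGY.get(child.tag, (None,))[0] != root.tag]

print(undeclared_properties(INSTANCE))
```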
46 OWL vs. Database
Advantages of using OWL to define an Ontology:
- Extensible: much easier to add new properties. Contrast with a database - adding a new column may break a lot of applications.
- Portable: much easier to move an OWL document than to move a database.
Advantages of using a Database to define an Ontology:
- Mature: the database technology has been around a long time and is very mature.
47 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
48 What is RDF?
RDF is a data model
- the model is domain-neutral, application-neutral
- the model can be viewed as directed, labeled graphs or as an object-oriented model (object/attribute/value)
RDF data model is an abstract, conceptual layer independent of XML
- consequently, XML is a transfer syntax for RDF, not a component of RDF
- RDF data might never occur in XML form
49 RDF model
RDF "statements" consist of
- resources (= nodes) = subject
- which have properties = predicate
- which have values (= nodes, strings) = object
Example: "http://www.w3.org/TR/REC-rdf-syntax/ has the author Ora Lassila"
resource: http://www.w3.org/TR/REC-rdf-syntax/
property: author
value: "Ora Lassila"
50 RDF Model Example
http://www.w3.org/TR/REC-rdf-syntax/ --dc:Publisher--> "W3C"
http://www.w3.org/TR/REC-rdf-syntax/ --dc:Creator--> "Ora Lassila"
http://www.w3.org/TR/REC-rdf-syntax/ --dc:Date--> "1999-02-22"
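The RDF data model is independent of any syntax, so the example graph can be held as a plain set of (subject, predicate, object) tuples. A minimal sketch, with a hypothetical helper `objects` for lookups:

```python
DOC = "http://www.w3.org/TR/REC-rdf-syntax/"

# Each RDF statement is one (subject, predicate, object) tuple.
graph = {
    (DOC, "dc:Publisher", "W3C"),
    (DOC, "dc:Creator", "Ora Lassila"),
    (DOC, "dc:Date", "1999-02-22"),
}

def objects(graph, subject, predicate):
    """Return all values of predicate for subject, sorted for stable output."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)

print(objects(graph, DOC, "dc:Creator"))
```

The same three tuples could be serialized as RDF/XML, N3, or any other transfer syntax without changing the model.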
51 Complex values
So far, values of properties have been strings
A graph node (corresponding to a resource) also can be the value of a property
- arbitrarily complex tree and graph structures are possible
- syntactically, values can be embedded (i.e. lexically in-line) or referenced (linked)
Example:
http://www.w3.org/TR/REC-rdf-syntax/ --dc:Creator--> (node)
(node) --p:Name--> "Ora Lassila"
(node) --p:EMail--> "ora.lassila@nokia.com"
52 Complex values (continued)
Corresponding triples
{ "http://www.w3.org/TR/REC-rdf-syntax/", dc:Creator, x }
{ x, p:Name, "Ora Lassila" }
{ x, p:EMail, "ora.lassila@nokia.com" }
Graph:
http://www.w3.org/TR/REC-rdf-syntax/ --dc:Creator--> x
x --p:Name--> "Ora Lassila"
x --p:EMail--> "ora.lassila@nokia.com"
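The intermediate node x can be sketched as a locally generated identifier, with a helper that follows a chain of properties through it. The `_:b1`-style labels and both function names are illustrative choices, not part of the slides.

```python
import itertools

_ids = itertools.count(1)

def blank_node():
    """Create a fresh local identifier for a resource that has no URI."""
    return f"_:b{next(_ids)}"

DOC = "http://www.w3.org/TR/REC-rdf-syntax/"
x = blank_node()
triples = [
    (DOC, "dc:Creator", x),
    (x, "p:Name", "Ora Lassila"),
    (x, "p:EMail", "ora.lassila@nokia.com"),
]

def value(triples, subject, *path):
    """Follow a chain of predicates from subject; return the first match or None."""
    node = subject
    for predicate in path:
        node = next((o for s, p, o in triples if s == node and p == predicate), None)
        if node is None:
            return None
    return node

print(value(triples, DOC, "dc:Creator", "p:Name"))
```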
53 Containers are collections
- they allow grouping of resources (or literal values)
It is possible to make statements about the container (as a whole) or about its members individually
Different types of containers exist
- bag - unordered collection
- seq - ordered collection (= "sequence")
- alt - represents alternatives
It is also possible to create collections based on URI patterns
- for example, all files in a particular web site
Duplicate values are permitted
- there is no mechanism to enforce unique value constraints
54 Containers (continued)
http://www.w3.org/TR/REC-rdf-syntax --dc:Creator--> (container)
(container) --rdf:type--> rdf:Seq
(container) --rdf:_1--> "Ora Lassila"
(container) --rdf:_2--> "Ralph Swick"
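In a sequence container the order lives in the membership properties rdf:_1, rdf:_2, and so on, not in the order in which the triples happen to be stored. A small sketch (the node label `_:authors` and the function name are hypothetical):

```python
seq = "_:authors"  # hypothetical label for the container node

# Triples deliberately listed out of order.
triples = [
    ("http://www.w3.org/TR/REC-rdf-syntax", "dc:Creator", seq),
    (seq, "rdf:type", "rdf:Seq"),
    (seq, "rdf:_2", "Ralph Swick"),
    (seq, "rdf:_1", "Ora Lassila"),
]

def members(triples, container):
    """Return container members ordered by their rdf:_n membership index."""
    prefix = "rdf:_"
    indexed = [(int(p[len(prefix):]), o)
               for s, p, o in triples
               if s == container and p.startswith(prefix)]
    return [o for _, o in sorted(indexed)]

print(members(triples, seq))
```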
55 Higher-order statements
One can make RDF statements about other RDF statements
- example: "Ralph believes that the web contains one billion documents"
Higher-order statements
- allow us to express beliefs (and other modalities)
- are important for trust models, digital signatures, etc.
- also: metadata about metadata
- are represented by modeling RDF in RDF itself
56 Reification
- RDF is not really second-order
- But it does provide a built-in predicate vocabulary for reification
[Figure: graph with nodes http://www.w3.org/TR/REC-rdf-syntax, "Ora Lassila" and "Library of Congress" and two dc:Creator arcs; a dotted box surrounds one statement.]
The dotted box corresponds to the following statements:
{ x, rdf:predicate, "dc:creator" }
{ x, rdf:subject, "http://www.w3.org/TR/REC-rdf-syntax" }
{ x, rdf:object, "Ora Lassila" }
{ x, rdf:type, "rdf:Statement" }
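Reification can be sketched as a function that turns one statement into four triples about a new node, after which further statements can be made about that node. The function name, the node label `_:x`, and the property `ex:claimedBy` are hypothetical and chosen only for illustration.

```python
def reify(statement, node):
    """Describe one (subject, predicate, object) statement with four triples about node."""
    s, p, o = statement
    return [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    ]

stmt = ("http://www.w3.org/TR/REC-rdf-syntax", "dc:Creator", "Ora Lassila")
triples = reify(stmt, "_:x")

# A statement about the statement (hypothetical property and value).
triples.append(("_:x", "ex:claimedBy", "NYT"))

for t in triples:
    print(t)
```

Note that the reified description does not assert the original statement; it only describes it, which is what makes statements about beliefs or claims possible.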
57 Reification
- Any statement can be an object: graphs can be nested (reification)
Graph: NYT --claims--> ( pers05 --Author-of--> ISBN... )
<rdf:Description rdf:about="#NYT">
  <claims>
    <rdf:Description rdf:about="#pers05">
      <authorOf>ISBN...</authorOf>
    </rdf:Description>
  </claims>
</rdf:Description>
58 RDF Schema
- Defines small vocabulary for RDF:
  - Class, subClassOf, type
  - Property, subPropertyOf
  - domain, range
- Vocabulary can be used to define other vocabularies for your application
[Figure: graph with nodes Person, Student, Researcher, Frank and Jeen, and arcs labelled subClassOf, domain, range, type and hasSuperVisor.]
59 RDF Schema syntax in XML
<rdf:Description ID="MotorVehicle">
  <rdf:type resource="http://www.w3.org/...#Class"/>
  <rdfs:subClassOf rdf:resource="http://www.w3.org/...#Resource"/>
</rdf:Description>
<rdf:Description ID="Truck">
  <rdf:type resource="http://www.w3.org/...#Class"/>
  <rdfs:subClassOf rdf:resource="#MotorVehicle"/>
</rdf:Description>
<rdf:Description ID="registeredTo">
  <rdf:type resource="http://www.w3.org/...#Property"/>
  <rdfs:domain rdf:resource="#MotorVehicle"/>
  <rdfs:range rdf:resource="#Person"/>
</rdf:Description>
<rdf:Description ID="ownedBy">
  <rdf:type resource="http://www.w3.org/...#Property"/>
  <rdfs:subPropertyOf rdf:resource="#registeredTo"/>
</rdf:Description>
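What such a schema lets an application conclude can be sketched with the standard RDFS reading of subPropertyOf, domain, and range: using a property implies its super-properties, and a property's domain and range imply the types of its subject and object. The dictionaries below transcribe the schema above; the instance data (`truck42`, `alice`) and the function name are hypothetical, and the sketch covers only these three rules.

```python
# Schema facts transcribed from the RDF Schema example.
SUBPROPERTY_OF = {"ownedBy": "registeredTo"}
DOMAIN = {"registeredTo": "MotorVehicle"}
RANGE = {"registeredTo": "Person"}

def infer_types(facts):
    """Derive (resource, class) pairs from property use, following subPropertyOf upward."""
    types = set()
    for s, p, o in facts:
        while p is not None:
            if p in DOMAIN:
                types.add((s, DOMAIN[p]))
            if p in RANGE:
                types.add((o, RANGE[p]))
            p = SUBPROPERTY_OF.get(p)
    return types

facts = [("truck42", "ownedBy", "alice")]  # hypothetical instance data
print(sorted(infer_types(facts)))
```

Here the single fact about ownedBy is enough to conclude that truck42 is a MotorVehicle and alice is a Person, because ownedBy is a sub-property of registeredTo.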
60 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
61 I will use parts of this excellent tutorial:
Roger L. Costello & David B. Jacobs (2003). OWL Web Ontology Language. http://www.racai.ro/EUROLAN-2003/html/presentations/JamesHendler/owl/OWL.ppt
(Please note: the other tutorials referenced on slide 3 of that slide set are not available.)
62 Agenda
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
63 EDI (Electronic Data Interchange)
- A set of standards for structuring information that is to be electronically exchanged between businesses, organizations, ...
- The structures emulate documents, e.g., purchase orders
- The standards are independent of communication and software technologies
- EDI messages can be transmitted using any methodology agreed to by sender and recipient: Value Added Networks (bisync modem), FTP, email, HTTP, AS2 (MIME-based HTTP EDIINT), ...
- Mappings to XML exist; RosettaNet is sometimes regarded as an EDI standard
- Data format used by the vast majority of e-commerce transactions worldwide
- In use since the 1960s; first UN/EDIFACT standard in 1988
- Different sets of standards for different subdomains
  - UN/EDIFACT: the only international standard, predominant outside North America
  - ANSI ASC X12 (X12): US standard
  - TRADACOMS: UK retail industry
  - ODETTE: European automotive industry
64 EDI: Components needed for an information transfer
- The standard used
- Message Implementation Guidelines (human-readable, agreed upon between the trading partners of a transaction)
- EDI Implementation Guidelines
- Data transformation from/to the company's back-end business systems, e.g. ERP
- Transmission protocols
- Audit: makes sure that any transaction can be tracked, so that it is not lost
65 EDI example: A purchase order message according to UN/EDIFACT, version spring 1996
  UNA:+.? '
  UNB+UNOC:3+SenderID+RecipientID+060620:0931+1++1234567'
  UNH+1+ORDERS:D:96A:UN'
  BGM+220+B10001'
  DTM+4:20060620:102'
  NAD+BY+++CustomerID+Street+City++23436+xx'
  LIN+1++Product Screws:SA'
  QTY+1:1000'
  UNS+S'
  CNT+2:1'
  UNT+9+1'
  UNZ+1+1234567'
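To make the structure of the message on slide 65 visible, here is a minimal Python sketch that splits it into segments, data elements and components. `parse_edifact` is a hypothetical helper written for this example; it reads the separators from the UNA service string but, as a simplifying assumption, ignores the release character (`?`) and all other escaping rules, so it is not a complete EDIFACT parser.

```python
MESSAGE = (
    "UNA:+.? '"
    "UNB+UNOC:3+SenderID+RecipientID+060620:0931+1++1234567'"
    "UNH+1+ORDERS:D:96A:UN'"
    "BGM+220+B10001'"
    "DTM+4:20060620:102'"
    "NAD+BY+++CustomerID+Street+City++23436+xx'"
    "LIN+1++Product Screws:SA'"
    "QTY+1:1000'"
    "UNS+S'"
    "CNT+2:1'"
    "UNT+9+1'"
    "UNZ+1+1234567'"
)


def parse_edifact(message):
    """Return a list of (tag, elements); each element is a list of components."""
    component_sep, element_sep, segment_end = ":", "+", "'"  # common defaults
    if message.startswith("UNA"):
        # UNA is followed by six service characters; positions 3, 4 and 8 are
        # the component separator, element separator and segment terminator.
        component_sep, element_sep, segment_end = message[3], message[4], message[8]
        message = message[9:]
    segments = []
    for raw in message.split(segment_end):
        raw = raw.strip()
        if not raw:
            continue
        elements = [element.split(component_sep) for element in raw.split(element_sep)]
        segments.append((elements[0][0], elements[1:]))
    return segments


segments = parse_edifact(MESSAGE)
for tag, elements in segments:
    print(tag, elements)
```

In this sketch the `QTY` segment comes out as `[['1', '1000']]` and `UNH` as `[['1'], ['ORDERS', 'D', '96A', 'UN']]`. The meaning of each position (for example that `1000` is a quantity of screws) is not in the message itself; it comes from the standard and the implementation guidelines.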
66 EDI: Lessons learned
- Economics
  - Only worthwhile if there are lots of similar transactions (economies of scale)
  - Processes with intangibles (e.g. tenders, auctions with unknown partners) can usually not be represented in EDI alone
  - Significant barrier: the accompanying business process change
- Semantics (and economics)
  - Semantics are dynamic (new EDIFACT versions, often more than once a year!)
  - Often forgotten but essential: background knowledge (e.g., master data EANCOM)
  - Information is often incomplete and not contained in the EDI Implementation Guidelines
    - e.g.: how much are "10 boxes of candy" (assume packaged in big boxes: 5 display boxes; each 24 consumer-packaged boxes)?
    - Shows the need for a comprehensive ontology language?
  - Two-way negotiation between trading partners remains essential
    - Market power decides (e.g., whose IDs?; WalMart requires its trading partners to use the AS2 transmission protocol)
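The "10 boxes of candy" question can be worked through in a few lines of Python. The packaging hierarchy is an assumed reading of the slide (one big box holds 5 display boxes, one display box holds 24 consumer-packaged boxes); the unit names and the `consumer_units` helper are made up for this example.

```python
# Assumed packaging hierarchy, expressed in consumer-packaged boxes per unit.
CONSUMER_BOXES_PER_UNIT = {
    "consumer box": 1,
    "display box": 24,
    "big box": 5 * 24,
}


def consumer_units(quantity, unit):
    """Convert an order quantity into consumer-packaged boxes."""
    return quantity * CONSUMER_BOXES_PER_UNIT[unit]


# The same order line, "10 boxes", under each possible meaning of "box".
interpretations = {unit: consumer_units(10, unit) for unit in CONSUMER_BOXES_PER_UNIT}
for unit, total in interpretations.items():
    print(f"10 x {unit}: {total} consumer-packaged boxes")
```

Under these assumptions the same order line means 10, 240 or 1200 consumer-packaged boxes, depending on background knowledge that the EDI message does not carry.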
67 FOAF (Friend of a Friend)
- A machine-readable ontology describing persons, their activities and their relations to other people and objects.
- Anyone can use FOAF to describe him- or herself.
- FOAF is an extension to RDF and is defined using OWL.
- Computers may use these FOAF profiles to find, for example, all people living in Europe, or to list all people both you and a friend of yours know.
- This is accomplished by defining relationships between people.
- Each profile has a unique identifier (such as the person's e-mail address, a Jabber ID, or a URI of the homepage or weblog of the person), which is used when defining these relationships.
- The FOAF project, which defines and extends the vocabulary of a FOAF profile, was started in 2000 by Libby Miller and Dan Brickley.
  - http://www.foaf-project.org
- "Possibly the single most prevalent use of Semantic Web technologies so far": blog software exporting FOAF + RSS (Paolillo et al., 2005)
68 FOAF example (1)
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:foaf="http://xmlns.com/foaf/0.1/"
           xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <foaf:Person rdf:about="#JW">
      <foaf:name>Jimmy Wales</foaf:name>
      <foaf:mbox rdf:resource="mailto:jwales@bomis.com" />
      <foaf:homepage rdf:resource="http://www.jimmywales.com/" />
      <foaf:nick>Jimbo</foaf:nick>
      <foaf:depiction rdf:resource="http://www.jimmywales.com/aus_img_small.jpg" />
      <foaf:interest>
        <rdf:Description rdf:about="http://www.wikimedia.org" rdfs:label="Wikipedia" />
      </foaf:interest>
69 FOAF example (2)
[Callout on slide: Social-web inferences]
      <foaf:knows>
        <foaf:Person>
          <foaf:name>Angela Beesley</foaf:name> <!-- Wikimedia Board of Trustees -->
        </foaf:Person>
      </foaf:knows>
    </foaf:Person>
  </rdf:RDF>
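A minimal sketch of how a program might read the profile from slides 68 and 69, using only Python's standard `xml.etree.ElementTree`. The profile is shortened here, and `summarize` is a hypothetical helper; a real application would more likely use an RDF parser, since the same RDF graph can be serialized as XML in several different ways.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Shortened copy of the profile from the two example slides.
FOAF_PROFILE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="#JW">
    <foaf:name>Jimmy Wales</foaf:name>
    <foaf:mbox rdf:resource="mailto:jwales@bomis.com"/>
    <foaf:nick>Jimbo</foaf:nick>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Angela Beesley</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
"""


def summarize(profile_xml):
    """Extract name, mailbox and the names of known people from the top-level person."""
    root = ET.fromstring(profile_xml)
    person = root.find("foaf:Person", NS)
    mbox = person.find("foaf:mbox", NS)
    return {
        "name": person.findtext("foaf:name", namespaces=NS),
        "mbox": mbox.get("{%s}resource" % NS["rdf"]),
        "knows": [
            known.findtext("foaf:name", namespaces=NS)
            for known in person.findall("foaf:knows/foaf:Person", NS)
        ],
    }


summary = summarize(FOAF_PROFILE)
print(summary)
```

The `foaf:knows` links are the raw material for the social-web inferences mentioned on the slide: following them from profile to profile yields a social network.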
70 FOAF extensions (1)
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:foaf="http://xmlns.com/foaf/0.1/"
           xmlns:rel="http://www.perceive.net/schemas/relationship/">
    <foaf:Person rdf:ID="spiderman">
      <foaf:name>Spiderman</foaf:name>
      <rel:enemyOf rdf:resource="#green-goblin"/>
    </foaf:Person>
    <foaf:Person rdf:ID="green-goblin">
      <foaf:name>Green Goblin</foaf:name>
      <rel:enemyOf rdf:resource="#spiderman"/>
    </foaf:Person>
71 FOAF extensions (2)
    <foaf:Person rdf:ID="peter">
      <foaf:name>Peter Parker</foaf:name>
      <rel:friendOf rdf:resource="#harry"/>
    </foaf:Person>
    <foaf:Person rdf:ID="harry">
      <foaf:name>Harry Osborn</foaf:name>
      <rel:friendOf rdf:resource="#peter"/>
      <rel:childOf rdf:resource="#norman"/>
    </foaf:Person>
    <foaf:Person rdf:ID="norman">
      <foaf:name>Norman Osborn</foaf:name>
      <rel:parentOf rdf:resource="#harry"/>
    </foaf:Person>
  </rdf:RDF>
72 FOAF multimedia (1)
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:foaf="http://xmlns.com/foaf/0.1/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
    <foaf:Person rdf:ID="peter">
      <foaf:name>Peter Parker</foaf:name>
      <foaf:depiction rdf:resource="http://www.peterparker.com/peter.jpg"/>
    </foaf:Person>
    <foaf:Person rdf:ID="spiderman">
      <foaf:name>Spiderman</foaf:name>
    </foaf:Person>
73 FOAF multimedia (2)
    <foaf:Person rdf:ID="green-goblin">
      <foaf:name>Green Goblin</foaf:name>
    </foaf:Person>
    <!-- codepiction -->
    <foaf:Image rdf:about="http://www.peterparker.com/photos/spiderman/statue.jpg">
      <dc:title>Battle on the Statue Of Liberty</dc:title>
      <foaf:depicts rdf:resource="#spiderman"/>
      <foaf:depicts rdf:resource="#green-goblin"/>
      <foaf:maker rdf:resource="#peter"/>
    </foaf:Image>
  </rdf:RDF>
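Codepiction (two people depicted in the same image) is one of the simplest inferences over such data. The Python sketch below finds codepicted pairs in a shortened copy of the multimedia example; `codepictions` is a hypothetical helper, and reading RDF/XML with `xml.etree.ElementTree` is a simplification that only works for this particular serialization.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Shortened copy of the multimedia example from the slides.
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <foaf:Person rdf:ID="spiderman"><foaf:name>Spiderman</foaf:name></foaf:Person>
  <foaf:Person rdf:ID="green-goblin"><foaf:name>Green Goblin</foaf:name></foaf:Person>
  <foaf:Image rdf:about="http://www.peterparker.com/photos/spiderman/statue.jpg">
    <dc:title>Battle on the Statue Of Liberty</dc:title>
    <foaf:depicts rdf:resource="#spiderman"/>
    <foaf:depicts rdf:resource="#green-goblin"/>
    <foaf:maker rdf:resource="#peter"/>
  </foaf:Image>
</rdf:RDF>
"""


def codepictions(xml_text):
    """Return the set of resource pairs that appear together in at least one image."""
    root = ET.fromstring(xml_text)
    pairs = set()
    for image in root.iter(f"{{{FOAF_NS}}}Image"):
        depicted = sorted(
            depicts.get(f"{{{RDF_NS}}}resource")
            for depicts in image.findall(f"{{{FOAF_NS}}}depicts")
        )
        # Every unordered pair of depicted resources is a codepiction.
        pairs.update(combinations(depicted, 2))
    return pairs


pairs = codepictions(DOC)
print(pairs)
```

For the example data this yields the single pair `('#green-goblin', '#spiderman')`; the maker of the image (`#peter`) is not part of the pair because he is not depicted.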
74 What inferences? Example: A social-network analysis of LiveJournal FOAF entries (Paolillo et al., 2005)
- Interests over time remain similar
- Friends over time remain similar
- But: the manner in which people elect friends and interests in their LiveJournal profiles is sharply different. ... [These differences] represent fundamentally different social behaviors.
- What does this mean for recommender systems?
75 Cf.: Data about individuals available to Google
Google operates the largest Internet search engine in the United States. In March 2007 alone, approximately 3.5 billion search queries were performed on Google websites. Google's services include:
  a. Google search: any search term a user enters into Google;
  b. Google Desktop: an index of the user's computer files, e-mails, music, photos, and chat and web browser history;
  c. Google Talk: instant-message chats between users;
  d. Google Maps: address information requested, often including the user's home address for use in obtaining directions;
  e. Google Mail (Gmail): a user's e-mail history, with default settings set to retain emails "forever";
  f. Google Calendar: a user's schedule as inputted by the user;
  g. Google Orkut: social networking tool storing personal information such as name, location, relationship status, etc.;
  h. Google Reader: which ATOM/RSS feeds a user reads;
  i. Google Video/YouTube: videos watched by user;
From: EPIC (2007). Complaint and Request for Injunction, Request for Investigation and for Other Relief. In the Matter of Google, Inc. and DoubleClick, Inc. Before the Federal Trade Commission, Washington, DC 20580. http://news.findlaw.com/hdocs/google/googdoubleclick42007cmp.pdf
76 ?
77 Next lecture
- The Semantic Web: Motivation and overview
- Very brief recap of XML (& why it's not semantic)
- RDF and RDFS
- OWL
- Examples of standardization: E-commerce, social networks
- Schema integration and Federated databases
78 References
- p. 10: http://en.wikipedia.org/wiki/Semantic_Web
- pp. 16-24: Costello, R. L. (2003). A Five Minute Intro to XML. http://www.daml.org/meetings/2003/05/SWMU/briefings/08_Tutorial_D.ppt
- pp. 26-29, pp. 31-39: Costello, R. L. & Jacobs, D. B. (2003). A Two Minute Intro to XML. www.daml.org/meetings/2003/05/SWMU/briefings/07_1045_Essential_Building_Blocks.ppt
- p. 30, pp. 48-59: Unnamed (no date). RDF and XML tutorial. http://lsdis.cs.uga.edu/SemWebCourse/RDF.ppt
- pp. 40, 41: based on Costello, R. L. & Jacobs, D. B. (2003). A Two Minute Intro to XML. www.daml.org/meetings/2003/05/SWMU/briefings/07_1045_Essential_Building_Blocks.ppt
- pp. 45, 46: based on Costello, R. L. & Jacobs, D. B. (2003). OWL Web Ontology Language. http://www.racai.ro/EUROLAN-2003/html/presentations/JamesHendler/owl/OWL.ppt
- p. 65: based on http://de.wikipedia.org/wiki/EDIFACT (here is the English info: http://en.wikipedia.org/wiki/EDIFACT)
- pp. 67-69: based on http://en.wikipedia.org/wiki/FOAF_(software)
- pp. 70-73: Dodds, L. (2004). An Introduction to FOAF. http://www.xml.com/pub/a/2004/02/04/foaf.html
79 Further references, background reading; acknowledgements
- http://de.wikipedia.org/wiki/Electronic_Data_Interchange
- http://en.wikipedia.org/wiki/Electronic_Data_Interchange
- J. C. Paolillo, S. Mercure, and E. Wright (2005). The social semantics of LiveJournal FOAF: Structure and change from 2004 to 2005. In G. Stumme, B. Hoser, C. Schmitz, and H. Alani, editors, Proceedings of the 1st Workshop on Semantic Network Analysis at the ISWC 2005 Conference, pages 69-80. http://www.blogninja.com/paolillo-mercure-wright.final.pdf
Specifications:
- RDF: http://www.w3.org/TR/rdf-primer
- OWL: http://www.w3.org/TR/owl-features
- FOAF: http://xmlns.com/foaf/spec