Semantic Web: The Story So Far
Ian Horrocks <ian.horrocks@comlab.ox.ac.uk>
Oxford University Computing Laboratory

The Semantic Web

What is it?
• Web “invented” by Tim Berners-Lee (amongst others)
  – (Conceptual) simplicity of web has contributed to success, but is also a limiting factor
• Tim has ambitious goals for future of the web
  – Objective is to overcome existing limitations
    “… a consistent logical web of data …”
    “… information is given well-defined meaning …”
• This vision of the future of the Web has become known as the Semantic Web

Why do we want it?
Many tasks are difficult or impossible using existing web:
Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

Why do we want it?
Many tasks are difficult or impossible using existing web:
• Complex queries involving background knowledge
  – Find information about “animals that use sonar but are neither bats nor dolphins”, e.g., Barn Owl
• Locating information in data repositories
  – Travel enquiries
  – Prices of goods and services
  – Results of human genome experiments
• Finding and using “web services”
  – Given DNA sequence, identify genes, determine proteins they produce, and hence biological processes they control

What is the Problem?
Consider a typical web page:
• Markup consists of:
  – Rendering information (e.g., font size and colour)
  – Hyper-links to related content
• Semantic content is accessible to humans, but not (easily) to computers…

How Will It Work?
• Add semantic annotations to web resources
  – Dr. Alan Rector, Professor of Computer Science, University of Manchester
  – Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

How Will It Work?
“Now... that should clear up a few things around here”

Giving Semantics to Annotations
• Agree on meaning of a set of annotation tags
  – E.g., Dublin Core
  – Limited flexibility and extensibility
  – Limited number of things can be expressed
• Agree on language used to define meanings
  – E.g., an ontology language
  – Flexible and extensible
    • New terms can be formed by combining existing ones
  – Meaning (semantics) of such terms is formally specified

The Web Ontology Language OWL

Web Ontology Language OWL
• Semantic Web led to requirement for a “web ontology language”
• W3C set up Web-Ontology (WebOnt) Working Group
  – WebOnt developed OWL language
  – OWL based on earlier languages RDF, OIL and DAML+OIL
  – OWL now a W3C recommendation (i.e., a standard)
• OWL is a family of 3 languages: OWL Lite, OWL DL and OWL Full
• OIL, DAML+OIL and OWL (DL & Lite) based on Description Logics
  – Has facilitated development of wide range of high quality tools & infrastructure
• OWL now language of choice in many applications

What Are Description Logics?
• A family of logic based Knowledge Representation formalisms
  – Descendants of semantic networks and KL-ONE
  – Describe domain in terms of concepts (classes), roles (properties, relationships) and individuals
  – Operators allow for composition of complex concepts
  – Names can be given to complex concepts, e.g.:
    HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)
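To make the model theoretic reading of the HappyParent definition concrete, here is a minimal Python sketch that evaluates the concept over one small finite interpretation. The individuals, the extensions of Parent, Intelligent and Athletic, and the helper names are all invented for illustration; a DL reasoner works over all possible interpretations rather than a single one.

```python
# Toy finite interpretation: every name below is hypothetical.
domain = {"ann", "bob", "cat", "dan"}
parent = {"ann", "bob"}
intelligent = {"cat"}
athletic = set()
has_child = {("ann", "cat"), ("bob", "cat"), ("bob", "dan")}


def union(c, d):
    """C ⊔ D: elements belonging to either extension."""
    return c | d


def intersection(c, d):
    """C ⊓ D: elements belonging to both extensions."""
    return c & d


def all_values(role, c):
    """∀role.C: elements whose role-successors all lie in C (vacuously true if none)."""
    return {x for x in domain if all(y in c for (s, y) in role if s == x)}


# HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)
happy_parent = intersection(
    parent, all_values(has_child, union(intelligent, athletic))
)
print(sorted(happy_parent))
```

In this interpretation only ann qualifies: bob has a child (dan) who is neither intelligent nor athletic, and cat and dan satisfy the universal restriction vacuously but are not parents.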

Why (Description) Logic?
• OWL exploits results of 15+ years of DL research
  – Well defined (model theoretic) semantics
  – Most DLs are subsets of C2, i.e., decidable fragments of FOL

Why (Description) Logic?
• OWL exploits results of 15+ years of DL research
  – Well defined (model theoretic) semantics
  – Formal properties well understood (complexity, decidability)
“I can’t find an efficient algorithm, but neither can all these famous people.”
[Garey & Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, 1979.]

Why (Description) Logic?
• OWL exploits results of 15+ years of DL research
  – Well defined (model theoretic) semantics
  – Formal properties well understood (complexity, decidability)
  – Known reasoning algorithms
  – Implemented systems (highly optimised): KAON2, Pellet, CEL

Class/Concept Constructors
• Concept can be thought of as a FOL formula with one free variable

Knowledge Base / Ontology Axioms

OWL RDF/XML Exchange Syntax
E.g., Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic):
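The RDF/XML for this slide did not survive extraction. As a hedged reconstruction, the sketch below shows how such a class expression is typically serialised with the standard OWL vocabulary (owl:intersectionOf, owl:Restriction, owl:allValuesFrom, owl:unionOf). The relative IRIs such as #Parent are assumptions, and the Python wrapper only checks that the fragment is well-formed XML and lists the named classes it mentions.

```python
import xml.etree.ElementTree as ET

# Assumed serialisation of Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic);
# the "#..." IRIs are placeholders, not taken from the original slide.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="#Parent"/>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasChild"/>
        <owl:allValuesFrom>
          <owl:Class>
            <owl:unionOf rdf:parseType="Collection">
              <owl:Class rdf:about="#Intelligent"/>
              <owl:Class rdf:about="#Athletic"/>
            </owl:unionOf>
          </owl:Class>
        </owl:allValuesFrom>
      </owl:Restriction>
    </owl:intersectionOf>
  </owl:Class>
</rdf:RDF>"""

OWL = "{http://www.w3.org/2002/07/owl#}"
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# Parsing raises an exception if the fragment is not well-formed XML.
root = ET.fromstring(RDF_XML)

# Collect the named classes (those carrying rdf:about) in document order.
named = [
    el.get(RDF + "about")
    for el in root.iter(OWL + "Class")
    if el.get(RDF + "about")
]
print(named)
```

The verbosity of the serialisation compared with the one-line DL expression is the main point: RDF/XML is an exchange syntax for tools, not something ontology authors are expected to write by hand.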

Ontology based Information Systems
• Similar to relational databases
  – Ontology ≈ schema; instances ≈ data
• Some important (dis)advantages
  + (Relatively) easy to maintain and update schema
    • Both schema and data are “self organising”
  + Query answers reflect both schema and data
  + Able to answer both intensional and extensional queries
  – Semantics may be counter-intuitive or even inappropriate
    • Open -v- closed world; axioms -v- constraints
  – Query answering (logical entailment) much more difficult
    • Can lead to scalability problems
Very useful, but don’t expect miracles!

Ontologies and Reasoning

Support for Ontology Engineering
• Developing and maintaining quality ontologies is very challenging
• Users need tools and services, e.g., to help check if ontology is:
  – Meaningful: all named classes can have instances
  – Correct: captures intuitions of domain experts
  – Minimally redundant: no unintended synonyms (e.g., Banana split, Banana sundae)

Support for Ontology Engineering
• Range of new “non-standard” services supporting, e.g.:
  – Modular design and integration
    • What is the effect of merging O₂ into O₁?
    • In general, check that O₁ ∪ O₂ ⊨ C iff O₁ ⊨ C for any concept C constructed using vocabulary occurring in O₁
  – Module extraction
    • Extract a (small) module from O capturing all “relevant” information about some vocabulary V
    • In general, find O′ ⊆ O s.t. O′ ⊨ C iff O ⊨ C for any concept C constructed using terms from V
  – Bottom-up design
    • Find a (small and specific) concept describing a set of individuals
    • In general, find most specific C s.t. O ⊨ C(i₁) ∧ … ∧ C(iₙ)
      – Where C may be “small” and/or in a sub-language (of O)

Support for Ontology Engineering
• Range of new “non-standard” services supporting, e.g.:
  – Error diagnosis and repair

Support for Query Answering
• In an Ontology based Information System (OIS), query answering ≈ computing logical entailment
  – Reasoner needed in order to answer queries, e.g.:
    • C is a sub-class of D iff O ⊨ ∀x.C(x) → D(x)
    • a is an instance of C iff O ⊨ C(a)
OIS with no reasoner ≈ DBMS with no query engine
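The two entailment checks above can be illustrated with a deliberately tiny Python sketch. It assumes an ontology containing only atomic subclass axioms and class assertions, so entailment reduces to reachability in the subclass graph; the class and individual names are hypothetical, and real DL reasoners handle far richer axioms with tableau or similar algorithms.

```python
def superclasses(cls, subclass_axioms):
    """All D such that cls ⊑ D follows from atomic subclass axioms (reflexive, transitive)."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_axioms:
            if sub == current and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def is_subclass(c, d, subclass_axioms):
    """True iff O ⊨ ∀x.C(x) → D(x) for this restricted kind of ontology."""
    return d in superclasses(c, subclass_axioms)


def is_instance(a, c, subclass_axioms, assertions):
    """True iff O ⊨ C(a): some asserted class of a is subsumed by C."""
    return any(
        ind == a and is_subclass(cls, c, subclass_axioms)
        for ind, cls in assertions
    )


# Hypothetical schema (subclass axioms) and data (class assertions).
tbox = [("BarnOwl", "Owl"), ("Owl", "Bird"), ("Bird", "Animal")]
abox = [("olive", "BarnOwl")]

print(is_subclass("BarnOwl", "Animal", tbox))
print(is_instance("olive", "Bird", tbox, abox))
```

Note that a False result here means "not entailed", not "known to be false": under the open world semantics mentioned earlier, the ontology may simply be silent on the question.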

Example Applications

e-Science
• E.g., for “in silico” investigations and “hypothesis testing”
  – Comparing data (e.g., on proteins) to (model of) biological knowledge
  – Characteristics of proteins captured in an ontology O
    • Goal is to identify protein instances based on characteristics
  – Equivalent to answering queries of form: O ⊨ P(i)? for protein P and instance i
  – Result may be discovery of new kinds of protein
    • And these may be potential drug targets if unique to a pathogen
  – Result may also be discovery of errors in model
    • Which may reflect gaps/errors in existing knowledge

Healthcare
• UK NHS has a £6.2 billion “Connecting for Health” IT programme
• Key component is Care Records Service (CRS)
  – “Live, interactive patient record service accessible 24/7”
  – Patient data distributed across local centres in 5 regional clusters, and a national DB
    • Detailed records held by local service providers
    • Diverse applications support radiology, pharmacy, etc.
    • Applications exchange messages containing “semantically rich clinical information”
    • Summaries sent to national database
  – SNOMED-CT ontology provides common vocabulary for data
    • Clinical data uses terms drawn from ontology

SNOMED
• Over 400,000 concepts
• Schema only: no instances
• Language used is a (well known) fragment of OWL
• NHS version extended with 1,000s of additional classes
  – OWL reasoner (FaCT++) used to classify and check ontology
    • Currently takes ≈ 4 hours
  – 180 missing subClass relationships were found, e.g.:
    • Periocular_dermatitis subClassOf Disease_of_face
    • Fibrin_measurement subClassOf Coagulation_factor_assay

SNOMED
• Vocabulary is extensible at point of use: “post coordination”
  – Users (e.g., clinicians) may add/define new vocabulary
  – Terminology service (reasoner) used to insert in ontology
• Typical new term:
  – almond_allergy ≡ “allergy caused_by almond”
  – OWL reasoner (FaCT++) used to classify new term
    • Takes <10 ms
  – Classified as a kind of “nut allergy”
    • Clearly of crucial importance to recognise patients with allergy caused by almond as kinds of patient with nut allergy
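The classification step for a post-coordinated term can be sketched as a simple structural subsumption test. This is not how FaCT++ works internally (it is a tableau based reasoner); it is a simplified illustration under the assumption that every definition has the shape "base class with one filler per role". The hierarchy, the extra definitions, and all function names below are hypothetical.

```python
# Hypothetical primitive hierarchy: each class maps to its single parent.
SUBCLASS_OF = {"almond": "nut", "nut": "food"}

# Hypothetical existing definitions: name -> (base class, {role: filler}).
DEFINITIONS = {
    "nut_allergy": ("allergy", {"caused_by": "nut"}),
    "food_allergy": ("allergy", {"caused_by": "food"}),
    "pollen_allergy": ("allergy", {"caused_by": "pollen"}),
}


def is_a(c, d):
    """True if primitive class c equals d or has d as an ancestor."""
    while c is not None:
        if c == d:
            return True
        c = SUBCLASS_OF.get(c)
    return False


def subsumed_by(new_def, old_def):
    """new ⊑ old if the base is subsumed and every role filler of old is matched."""
    new_base, new_roles = new_def
    old_base, old_roles = old_def
    return is_a(new_base, old_base) and all(
        role in new_roles and is_a(new_roles[role], filler)
        for role, filler in old_roles.items()
    )


def classify(new_def):
    """Names of all existing defined classes that subsume the new definition."""
    return sorted(
        name for name, d in DEFINITIONS.items() if subsumed_by(new_def, d)
    )


# almond_allergy ≡ "allergy caused_by almond"
almond_allergy = ("allergy", {"caused_by": "almond"})
print(classify(almond_allergy))
```

With this toy hierarchy the new term is placed under both nut_allergy and food_allergy but not pollen_allergy, which mirrors the "kind of nut allergy" inference described on the slide.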

Recent Developments

Improving Scalability
• Optimisation techniques
  – Improve performance of DL reasoners, e.g., [Tsarkov et al, JAR, 2007]
• New reasoning techniques
  – Reduction to disjunctive Datalog [Motik et al, KR-04]
  – Hybrid DL-DB systems [Horrocks et al, CADE-05]
  – Hypertableau based algorithms [Motik et al, CADE-07]
• Polynomial time algorithms for sub-ALC logics
  – Graph based techniques for EL+ [Baader et al, IJCAI-05]
  – Database techniques for DL-Lite [Calvanese et al, AAAI-05]

Extending Tools and Infrastructure
• Editors/environments
  – OilEd, Protégé, Swoop, TopBraid, OntoTrack, …
• Reasoning systems
  – Cerebra, FaCT++, KAON2, Pellet, Racer, CEL, …
• Design methodologies
  – Modularity, foundational ontologies, etc.
[Figure: fragment of a foundational ontology with nodes Entity, Endurant, Quality, Substantial, Perdurant, Event, Achievement, Stative, Accomplishment]

Increasing Expressive Power
• Database style keys [Lutz et al, JAIR 2004]
• Rule language extensions
  – W3C RIF WG (see http://www.w3.org/2005/rules/)
  – First order extensions (e.g., SWRL) [Horrocks et al, JWS, 2005]
  – Hybrid language extensions, e.g., [Eiter et al, KR-04; Motik et al, ISWC-04; Rosati, JoWS, 2005]
  – LP/F-Logic/Common Logic [Chen et al, JLP, 1993; de Bruijn et al, WWW-05]
• Other extensions
  – Temporal, Fuzzy, …
• OWL 1.1 extension to OWL

OWL 1.1
• Is an extension of OWL
  – Addresses deficiencies identified by users and developers (at OWLED workshop)
• Is based on more expressive DL: SROIQ
  – (OWL is based on SHOIN)
• W3C working group now chartered
  – Will develop recommendation based on existing member submission
• Already supported by popular OWL tools
  – Protégé, Swoop, TopBraid, FaCT++, Pellet

What’s New in OWL 1.1?
Four kinds of features:
• More expressive logic
  – Qualified cardinality restrictions, e.g.:
    ObjectMinCardinality(2 friendOf hacker)
  – Property chain inclusion axioms, e.g.:
    SubObjectPropertyOf(SubObjectPropertyChain(parent brother) uncle)
  – Local reflexivity restrictions, e.g.:
    ObjectExistsSelf(likes) [for narcissists]
  – Reflexive, irreflexive, symmetric, and antisymmetric properties, e.g.:
    ReflexiveObjectProperty(knows); IrreflexiveObjectProperty(husbandOf)
  – Disjoint properties, e.g.:
    DisjointObjectProperties(childOf spouseOf)
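As a rough illustration of what three of these constructs mean, the following Python sketch evaluates them over small finite relations. The individuals and the helper names (compose, min_cardinality, exists_self) are invented for this example, and the code shows the intended semantics in a single interpretation rather than how a reasoner processes the axioms.

```python
def compose(r, s):
    """Relational composition: (x, z) whenever r(x, y) and s(y, z) for some y."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


def min_cardinality(n, role, cls):
    """ObjectMinCardinality(n role cls): elements with at least n role-successors in cls."""
    counts = {}
    for x, y in role:
        if y in cls:
            counts[x] = counts.get(x, 0) + 1
    return {x for x, k in counts.items() if k >= n}


def exists_self(role):
    """ObjectExistsSelf(role): elements related to themselves by role."""
    return {x for x, y in role if x == y}


# Property chain: parent followed by brother is included in uncle.
parent = {("alice", "bob")}      # alice's parent is bob
brother = {("bob", "carl")}      # bob's brother is carl
uncle = set()
uncle |= compose(parent, brother)

# Qualified cardinality: at least two friends who are hackers.
friend_of = {("eve", "mal"), ("eve", "trin"), ("neo", "trin")}
hacker = {"mal", "trin"}

# Local reflexivity: individuals who like themselves.
likes = {("nar", "nar"), ("nar", "echo"), ("echo", "nar")}

print(uncle)
print(min_cardinality(2, friend_of, hacker))
print(exists_self(likes))
```

The property chain axiom is an inclusion, so the uncle relation may contain further pairs that are asserted directly; the sketch only adds the pairs the chain forces.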

What’s New in OWL 1.1?
Four kinds of features:
• More expressive datatypes
  – User-defined datatypes using facets from XML Schema Datatypes, e.g.:
    SubClassOf(Adult DataSomeValuesFrom(age DatatypeRestriction(xsd:integer minInclusive "18"^^xsd:integer)))
  – Simple relationships between values of functional data-valued properties, e.g.:
    DataSomeValuesFrom(shoeSize IQ greaterThan)

What’s New in OWL 1.1?
Four kinds of features:
• Metamodelling and annotations
  – Names can be used as any or all of an individual, a class, or a property
  – Allows for a restricted form of metamodelling (“punning”), e.g.:
    SubClassOf(SnowLeopard BigCat)
    ClassAssertion(SnowLeopard EndangeredSpecies)
  – Annotations of axioms as well as entities, e.g.:
    ClassAssertion(Comment("source: WWF") SnowLeopard EndangeredSpecies)

What’s New in OWL 1.1?
Four kinds of features:
• Syntactic sugar (make things easier to say)
  – Disjoint unions, e.g.:
    DisjointUnion(Element Earth Wind Fire Water)
  – Negative assertions, e.g.:
    NegativeObjectPropertyAssertion(Ian hasChild Mary)
    NegativeDataPropertyAssertion(Ian hasAge 21)

Tractable Fragments
• OWL defines only one fragment (OWL Lite)
  – And it isn’t very tractable!
• OWL 1.1 defines several different fragments with useful computational properties
  – E.g., reasoning complexity in range LOGSPACE to PTIME
  – Smaller fragments implementable using RDBs

Summary
• Semantic Web aims to make web content more accessible to automated processes
  – Adds semantic annotations to web resources
• OWL ontologies provide vocabulary for annotations
  – Terms have well defined meaning
• OWL now being used in a wide range of applications
  – e-Science, medicine, geography, geology, …
• Reasoning enabled tools are of crucial importance
  – For both design and deployment of ontologies
• Active research area
  – Expressive power, scalability, methodologies, tools, …

Thank you for listening
FRAZZ: © Jeff Mallett/Dist. by United Feature Syndicate, Inc.
Any questions?