Formalization of documentary knowledge and conceptual knowledge with ontologies: applying to the description of audio-visual documents
Raphaël Troncy
Friday, 23rd of April 2004
23/04/2004 CWI Talk - Raphaël Troncy

Background
• The audio-visual document: some peculiarities
  – structured
  – spatio-temporal
  – composed of images
  – use of a textual description
• The digital audio-visual document:
  – allows new possibilities:
    • "intelligent" search
    • AV library structuring
    • publication and broadcasting
  – needs a hyper-linked description: the content has to be linked with the description

Plan of this talk
1. Problems
2. Document engineering vs. knowledge representation
3. Our proposal: an architecture for reasoning on descriptions of video documents
4. Experiments
5. Conclusion and future work

Description of the AV content
• A three-step process:
  – identification of the content creator and the content provider: Dublin Core metadata, VRA Core categories, …
  – structural decomposition into video segments corresponding to the logical structure of the program: time code, spatial coordinates
  – semantic description of these segments: controlled vocabulary, thesaurus, free-text annotation

Description of the AV content
• Segmentation: describe the logical structure
  – locate and date some events
• Description: describe the semantics of the content
  – characterize each segment with an AV genre
  – characterize each segment with a general theme
  – describe the scene (who, when, where, what, …)

Example
13 [Indoor Set: 6th part] at 18:43:56:00 - 00:09:06:00. – Eurosport
In studio, the second part of the interview, from Nice, of Sandy CASAR by Jean René GODART about the Paris-Nice cycling race and a few sports news with pictures commented by Alexandre BOYON and Laurent PUYAT.
Q: Find all AV sequences of type interview with Sandy Casar and concerning the Paris-Nice
  – noisy answer: there are other sports news in the sequence
  – incomplete answer: the interview was broadcast in two parts and began in a previous sequence
  – the query cannot be extended, e.g. to: find all AV sequences of type dialog sequence with a rider and concerning any cycling race with several stages

Problems
• Weak use of the logical structures
• Descriptions are not made for reasoning
  – make the AV descriptions accessible to automated processes
• Requirements:
  – express models that constrain the logical structure
    • identify an interview inside a report of a sports magazine
  – represent the meaning contained in this structure
    • a cartoon is a fiction with no real characters
  – describe semantically the content of each sequence
    • the Prologue is always an individual time trial numbered stage 0
• Which languages are the most suitable to perform all these tasks?
• What kind of knowledge do we need?

Document engineering
2. Document engineering vs. KR (2.1. Document engineering; 2.2. Knowledge representation)
• Provide models, languages and tools for managing document libraries
• Encode both structured documents and structured data: XML [W3C, 1998] & XML Schema [W3C, 2001]
• Distinguish the content from its presentation
  – Languages for presenting multimedia documents: SMIL
  – Models for describing multimedia documents
    • from HyTime [ISO, 1997] to MPEG-7 [ISO, 2001]

MPEG-7, the new multimedia description language?
• ISO standard since December 2001
• Main components:
  – Descriptors (Ds) and Description Schemes (DSs)
  – DDL (XML Schema + extensions)
• Concerns all types of media
• Part 5 - MDS

Structure and semantics
• Structure
  – Base unit: segment (temporal bounds or mask)
  – Possible decomposition

Structure and semantics
• Semantics
  – entity
  – attribute
  – relation
• Classification Schemes (CS)
  – thesaurus relationships

Other models
• MPEG-7 = a rich set of descriptors, but too restrictive to cover all the possible descriptions
• MPEG-7 extension with XML Schema:
  – Example: TV Anytime, Mdéfi [Tran Thuong, 2003]
  – Problem: adds structure without semantics
• MPEG-7 extension with CS:
  – Example: the COALA system [Fatemi, 2003]
  – Problem: very poor expressivity
• Free annotation, knowledge-oriented
  – Strates-IA [Prié, 1999]: no control of the structure
  – E-SIA [Egyed-Zs, 2003]: knowledge base lost
MPEG-7 + XML Schema are not enough! … but KR brings new solutions

Ontologies in KR
• The formal specification of a conceptual model for a given domain
  – A set of concepts, relations and axioms
  – Knowledge representation languages
• Methodologies of construction:
  – Adaptation of well-known software engineering guidelines: Methontology [Gomez-Perez]
  – Terminological acquisition: [Bachimont], [Aussenac Gilles]
  – Ontology cleaning with formal properties: [Guarino]
• Tools:
  – Protégé, WebODE, OilEd, OntoEdit, Terminae, DOE

KR languages for the Web
• RDF: [W3C, 1999 & W3C, 2004]
  – a data model for annotating Web resources
  – triples: resource → property → value
    (:"Stade 2" rdf:type ina:SportsNews)
    (:"Stade 2" ina:broadChannel "France 2")
    (:"Stade 2" ina:broadDate 17-03-2002)
• RDFS: [W3C, 2004]
  – definition of the vocabulary
  – hierarchy of classes and relations
• OWL: [W3C, 2004]
  – axioms: algebraic properties, concept definitions, set operators, cardinalities
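The triple model on this slide can be sketched in a few lines of plain Python. This is only an illustration, assuming triples are stored as tuples of strings; the `match` helper is a hypothetical name introduced here and is not part of any RDF library. The three statements are the ones given on the slide.

```python
# Each RDF statement is a (resource, property, value) triple.
# Prefixed names are kept as plain strings for illustration only.
triples = [
    ("Stade 2", "rdf:type", "ina:SportsNews"),
    ("Stade 2", "ina:broadChannel", "France 2"),
    ("Stade 2", "ina:broadDate", "17-03-2002"),
]

def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in store
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]

# Which resources are typed as sports news?
sports_news = [s for (s, _, _) in match(triples, p="rdf:type", o="ina:SportsNews")]
print(sports_news)
```

A real triple store such as the ones used later in the talk adds persistence, namespaces and a query language on top of this basic pattern matching.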

Use of OWL+RDF for describing AV documents
• Definition of concepts and relations
    StudioProgram ≡ (and HomogeneousProgram (all hasPart StudioSequence))
• Definition of axioms
    HomogeneousProgram ∩ HeterogeneousProgram = ∅
• Inferences
    if ONPP isA StudioProg then ∀ seq ∈ ONPP, seq isA StudioSeq
Problem: how to control the structure of the descriptions?
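The inference on this slide (a studio program has only studio sequences as parts) can be mimicked by a single hand-coded rule. The sketch below is not how a description logic reasoner works internally; it only shows the expected consequence. The sequence names `seq1` and `seq2` and the function name are hypothetical.

```python
# Hypothetical facts: the program ONPP has two parts.
has_part = {"ONPP": ["seq1", "seq2"]}
# Asserted types: only the program is typed explicitly.
types = {"ONPP": {"StudioProgram"}}

def infer_types(types, has_part):
    """Apply the consequences of the StudioProgram definition.

    A StudioProgram is a HomogeneousProgram, and every filler of
    hasPart on a StudioProgram is a StudioSequence.
    """
    inferred = {name: set(classes) for name, classes in types.items()}
    for program, parts in has_part.items():
        if "StudioProgram" in inferred.get(program, set()):
            inferred[program].add("HomogeneousProgram")
            for seq in parts:
                inferred.setdefault(seq, set()).add("StudioSequence")
    return inferred

result = infer_types(types, has_part)
print(sorted(result["ONPP"]), sorted(result["seq1"]))
```

The input dictionaries are left unchanged, so asserted and inferred knowledge stay separable.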

Our proposal
3. Architecture proposal (3.1. AV ontology; 3.2. Description schemes; 3.3. Valid description; 3.4. KB population)
• Use both approaches jointly for representing the descriptions
  – the markup languages for describing and controlling the structure of each program
  – the ontology and the KR languages for formally describing the semantics of this structure and the content
• Automate as much as possible the translation between these two representations
• Develop an architecture for reasoning on descriptions of video documents

General architecture

The Audio-visual Ontology
• Methodology of construction: ARCHONTE [Bachimont]
  – Conceptualization: differential principles
  – Formalization: formal definitions, axioms
  – Operationalization: export into a KR language
• AV domain:
  – Production objects (program, sequence, AV genre), Properties (theme), Persons, Technical Process (shooting, recording, postproduction), Signal descriptors (audio, video), etc.
• Tools:
  – Conceptualization: DOE [Troncy & Isaac, IC'02]
  – Formalization: OilEd [Bechhofer, KI'01]
  – Languages: OWL
• Ontologies available on the Web: http://opales.ina.fr/public/ontologies/

The DOE ontology editor

OWL Formalization
• Ontology export into the OWL language
• Based on well-established professional practices
• Results:
  – Construction time: 4 weeks
  – Ontology size quite large:
    • 400 concepts

General architecture

Generate XML Schema types
Some concepts (program, sequence) refer to categories of audio-visual segments

Transformation:
OWL                          XML Schema
Class                        Complex type
Sub-class                    Extension
Restriction on properties    Element of the content model
Union of classes             Choice in the content model
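The first three rows of this mapping can be sketched with Python's standard `xml.etree.ElementTree`. This is an illustration under assumptions, not the transformation used in the talk: the ontology fragment, the `hasPresenter` property, the `Type` suffix and the `xs:string` element type are all hypothetical. Unions of classes (mapped to a choice) are left out for brevity.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical ontology fragment: class -> (parent class, restricted properties).
classes = {
    "Sequence": (None, []),
    "StudioSequence": ("Sequence", ["hasPresenter"]),
}

def to_schema(classes):
    """Map classes to complex types, sub-classes to extensions,
    and property restrictions to elements of the content model."""
    schema = ET.Element(f"{{{XS}}}schema")
    for name, (parent, properties) in classes.items():
        ctype = ET.SubElement(schema, f"{{{XS}}}complexType", name=f"{name}Type")
        holder = ctype
        if parent is not None:
            # A sub-class becomes an extension of its parent's type.
            content = ET.SubElement(ctype, f"{{{XS}}}complexContent")
            holder = ET.SubElement(content, f"{{{XS}}}extension", base=f"{parent}Type")
        if properties:
            seq = ET.SubElement(holder, f"{{{XS}}}sequence")
            for prop in properties:
                ET.SubElement(seq, f"{{{XS}}}element", name=prop, type="xs:string")
    return schema

schema = to_schema(classes)
print(ET.tostring(schema, encoding="unicode"))
```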

Generic MPEG-7 extension
• Link these types to the existing MPEG-7 types

Build description schemes
• Let us watch some sports magazines
  – construction of a simple schema based on StudioSequence, Report and Interview
  – a Report contains some Excerpts of Broadcast Live Sports
• The schema provides the description skeleton for several sports magazines:
  – Téléfoot (soccer)
  – VéloClub (cycling)
  – 3 Partout (multisports)

General architecture

SegmenTool [French project CHAPERON]

Instantiate a document content model
[XML description excerpt; the markup was lost in extraction. Surviving values: T00:24:19, PT00H00M07S]
→ KB: RDF triples
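The step from a structured description to RDF triples in the knowledge base can be sketched as follows. The XML fragment is a hypothetical, simplified stand-in (not real MPEG-7 markup, whose original form is not recoverable from this slide); only the two time values come from the slide. The segment identifier `seg13` and the property names `startsAt` and `durationSeconds` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified description instance (not real MPEG-7 markup).
description = """
<Interview id="seg13">
  <MediaTimePoint>T00:24:19</MediaTimePoint>
  <MediaDuration>PT00H00M07S</MediaDuration>
</Interview>
"""

def duration_seconds(value):
    """Convert a duration such as PT00H00M07S into seconds."""
    found = re.fullmatch(r"PT(\d+)H(\d+)M(\d+)S", value)
    hours, minutes, seconds = (int(part) for part in found.groups())
    return hours * 3600 + minutes * 60 + seconds

def to_triples(xml_text):
    """Turn one segment element into (resource, property, value) triples."""
    segment = ET.fromstring(xml_text)
    subject = segment.get("id")
    return [
        (subject, "rdf:type", segment.tag),
        (subject, "startsAt", segment.findtext("MediaTimePoint")),
        (subject, "durationSeconds",
         duration_seconds(segment.findtext("MediaDuration"))),
    ]

kb = to_triples(description)
print(kb)
```

The element name doubles as the class of the segment here, which mirrors the idea that schema types are generated from ontology concepts.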

General architecture

The Cycling Ontology
• Methodology of construction:
  – Terminological acquisition
    • Textual corpus of 550,000 words [Le Roux, 2003]
    • Tool for candidate term extraction: Lexter
  – Conceptualization and formalization
    • DOE + OilEd
• Results:
  – Construction time: 3 weeks
    • conceptualization, upper level, formalization
  – Ontology size: average
    • 97 concepts, 61 relations

The Cycling Ontology

Knowledge Base population
[Figure labels: Cycling domain; Base of facts; text; SEIGO + [Le Roux, 2003]]

General architecture

Experiments
1. First experiment
  – Sesame: architecture for the storage of RDF triples [Broekstra, 2002]
    • Supports different query languages: RQL, RDQL and SeRQL
    • Implements the RDF Schema semantics (RDF-MT engine)
  – BOR: reasoner for the DAML+OIL language [Simov & Jordanov, 2002]
  – SeBOR: integration of the two systems, done in the On-To-Knowledge EU-IST Project
2. Second experiment
  – Racer: OWL DL reasoner [Haarslev & Möller, 2001]
  – Rice: visualization interface [Möller et al., 2003]

Conclusion
• General architecture for reasoning on descriptions of video documents:
  – Control of the structure: creation of document schemes
  – Formal representation of the semantics: AV ontology and domain-specific ontology
  – Based on standard languages (MPEG-7, OWL, RDF) and the use of transformations
• Implementation and experiments
  – Generic extension of MPEG-7
  – Modeling of 2 ontologies with DOE
  – Creation of a Knowledge Base of events related to cycling races and use of an adapted reasoner

Future work
• Development integration
  – Better integration of the tools used
• Planned experiments
  – Populate a database with annotated video documents and test the system with a real panel of users
  – Apply this architecture to a domain other than cycling
  – Benchmark the contribution of the AV ontology in a huge AV library without modifying the descriptions
• Long-term objectives
  – The ideal AV description language is still a research program
  – The description could be linked with:
    • a rhetorical analysis of the documents
    • a semiotic analysis of the documents

Questions?

Advertising
• June 21-25: The Week of Digital Document, La Rochelle, France: http://sdn2004.univ-lr.fr/
• Workshop on "Documentary Model for Audio-visual" (unfortunately in French)
• Web Site: http://liris.cnrs.fr/~yprie/Projets/SDN04/
• Deadline approaching … April 30
