
1 Advanced databases: Introduction and overview
Prof. Dr. Bettina Berendt
(I thank my colleagues Henk Olivié and Erik Duval for holding the first lectures in my place!)
Katholieke Universiteit Leuven, Department of Computer Science
http://www.cs.kuleuven.ac.be/~berendt/teaching/2007w/adb/
Last update: 29 August 2007
Berendt: Advanced databases, winter term 2007/08, http://www.cs.kuleuven.ac.be/~berendt/teaching/2007w/adb/
2 Agenda
- Organisation of the course
- Motivation and overview
- Data, information, and knowledge
- Conceptual modelling, schemas, and ontologies
- Recap: Entity-relationship model for data modelling
3 Organisation of the course
- www.cs.kuleuven.be/~berendt/teaching/2007w/adb
- Master's course for CS students specializing in Databases
- Teaching
  - Lecture: Bettina Berendt, 2 * 1.5 hours per week, in English
  - Exercises and homework: Katrien Verbert, 2.5 hours per week, in Dutch
- Materials: see Web site; available about 1 week before each class
- Exam: January
- Contact us:
  - via the Toledo system (details to be announced); always your first choice!
  - bettina.berendt@cs.kuleuven.be, katrien.verbert@cs.kuleuven.be
5 LOTS of data (often, but not always, in database form)
6 What is this course about? (1): What does it build on?
- The database field profits from a well-understood, well-functioning, commonly used general model: relational databases
- You have learned about this in the Databases course
- Relational databases: a "homogenizing model"
- What else makes databases so powerful today?
7 1. Data are accessible because they are interconnected (often, but not always, over the Internet/Web)
8 2. Heterogeneous data are integrated (often, but not always, "semantically")
9 3. They are analysed to reveal the "knowledge" implicit in them (e.g., using link structure, as in PageRank, to order results by relevance)
10 So: What is this course about? (2): What will it be about?
- Starting point: relational databases, the well-understood "homogenizing model" you know from the Databases course
- What else makes databases so powerful today?
  - Semantic integration of heterogeneous data
  - Integration over the Internet/Web
  - Analysis beyond retrieval: "knowledge discovery (in databases)", aka "data mining"
11 Outline of the course (I)
Modelling data
- Basics
- Conceptual modelling
Heterogeneous data (special forms)
- Temporal databases
- Spatial databases
Defining and combining heterogeneous databases, schemas and ontologies
- Basics / problem definition and approaches
- Combining: Schema integration
- Semantic Web / Semantic technologies
- Combining: Schema integration and ontology alignment
- Dynamics: Schema evolution and ontology evolution
- Interfacing with applications: Web Services and Semantic Web Services
12 Outline of the course (II)
Making implicit knowledge explicit
- Knowledge discovery in databases
- Deductive databases
- Inductive databases
- Multirelational data mining
- Mining on structured, semi-structured, and unstructured data
- Inferences in Semantic Web architectures
Applications and implications
- Applications
- Data and personal data, data protection, and privacy
13 Learning outcomes: After this course, you will ...
- understand and master relevant concepts and techniques of current databases and processing based on databases
- understand the potentials, limitations, and risks inherent in assembling, combining, and processing huge amounts of heterogeneous data in globally interconnected environments
- be able to design such databases and connectivity and relevant methods for combining and enriching data
- have worked with concrete examples of such data collection / processing
15 Data and information
- Datum / Data: a fact or concept from reality, in a form suitable for communicating, interpreting, and processing it
- Information: interpreted data
- Example: "The length of the road is 400 km". Here "400 km" is the data and "the length of the road" is its interpretation.
(based on Henk Olivié: Gegevensbanken, 01. 2006/07)
16 Data, information, and knowledge
- Data represents a fact or statement of event without relation to other things.
  - Ex: It is raining.
- Information embodies the understanding of a relationship of some sort, possibly cause and effect.
  - Ex: The temperature dropped 15 degrees and then it started raining.
- Knowledge represents a pattern that connects and generally provides a high level of predictability as to what is described or what will happen next.
  - Ex: If the humidity is very high and the temperature drops substantially, the atmosphere is often unlikely to be able to hold the moisture, so it rains.
(This is from knowledge-management theory. If you want to know about wisdom, check the Web page: G. Bellinger, D. Castro, & A. Mills: Data, Information, Knowledge, and Wisdom. http://www.systems-thinking.org/dikw.htm)
17 "Knowledge" as used in this course
- Recall: data represents an isolated fact, information embodies the understanding of a relationship, and knowledge represents a pattern that connects and generally provides a high level of predictability as to what is described or what will happen next.
- This definition of "knowledge" corresponds to that used in
  - data mining (aka "knowledge discovery (in databases)")
  - AI, in particular symbolic AI (e.g., "knowledge-based systems")
- It is not the only definition; e.g., cognitive psychology generally assumes that only people can have knowledge, such that computers can only possess (different types of) information.
18 Computerizing data, information, and knowledge: Databases and knowledge bases
- Databases
  - = data + interpretation (metadata)
  - focus on data and information = focus on the retrieval of data and information
- Knowledge bases
  - a special kind of database
  - provide the means for the computerized collection, organization, and retrieval of knowledge
  - focus on knowledge = focus on the inferences that can be made from data + information
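The contrast between retrieval and inference can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the original slides: the record layout, the function names `retrieve` and `infer_rain`, and the numeric thresholds are all assumptions, loosely modelled on the rain example used earlier in the lecture.

```python
# Stored facts: this is what a database holds and returns on retrieval.
# The field names and values are hypothetical.
observations = [
    {"day": "Mon", "humidity": 0.95, "temp_drop": 15},
    {"day": "Tue", "humidity": 0.40, "temp_drop": 2},
]


def retrieve(day):
    """Database-style access: return the stored record unchanged."""
    return next(o for o in observations if o["day"] == day)


def infer_rain(record, humidity_min=0.9, drop_min=10):
    """Knowledge-base-style access: apply a rule to derive a new statement.

    Rule (thresholds are assumed for illustration): very high humidity
    plus a substantial temperature drop means rain is expected.
    """
    return record["humidity"] >= humidity_min and record["temp_drop"] >= drop_min


# Retrieval gives back only what was stored ...
stored = retrieve("Mon")

# ... while inference produces a statement that was never stored.
rain_expected = {o["day"]: infer_rain(o) for o in observations}

print(stored)
print(rain_expected)
```

The point of the sketch is that `rain_expected` is nowhere in `observations`; it exists only because a rule was applied to the stored data.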
19 How/where are knowledge bases used?
Knowledge-based systems
- computer programs that contain subject-specific knowledge, including the knowledge and analytical skills of one or more human experts
- perform retrieval + inferencing / reasoning on a knowledge base to reach "intelligent" conclusions (e.g., medical diagnosis)
- most popular from the 1960s to the 1980s, aka "expert systems"
- Terminology changed:
  - "Semantic Web"
  - "Semantic technologies"
- Observation: There is no corresponding term like "data-based systems"; having data + retrieval in the background appears to be the default.
- Hypothesis: Having data + retrieval + inference in the background is becoming the default (cf. Web search engines)
20 Combining data and knowledge from different sources: The importance of conceptual models
- To combine data from different databases: know + integrate their conceptual models
- To combine data from databases and knowledge bases:
  1. understand the commonalities and differences of their conceptual meta-models. Simplified:
     - database conceptual models = entities + relations
     - knowledge base conceptual models = entities + relations + rules for inferencing
  2. integrate these conceptual models (as for databases)
22 Conceptual modelling as a part of database design
23 Conceptual database schemas and conceptual models in general
- Conceptual schema:
  - a concise description of the data requirements of the users
  - includes detailed descriptions of the entity types, relationships, and constraints
  - does not include implementation details
  - can be used to communicate with non-technical users
  (Elmasri, R. & Navathe, S. B. (2007). Fundamentals of Database Systems. Boston: Addison Wesley. 5th Edition. p. 60)
- Conceptual model:
  - a theoretical construct that represents something, with a set of variables and a set of logical and quantitative relationships between them
  - describes the semantics of the modelled domain
  - models in this sense are constructed to enable reasoning within an idealized logical framework
  - often in the form of an ontology, or having an ontology as a part
24 From database schema to ontology (1)
[Figure: a database schema and its records. Schema level: OBJECT, PERSON, PROJECT, RESEARCHER, with the labels TITLE and NAME and the relationships WORKS-IN and COOPERATES-WITH. Record level: "Semantic Web Mining", DAMLPROJ, URI-SWMining, "Andreas Hotho", URI-AHO, URI-GST, linked by WORKS-IN and COOPERATES-WITH.]
25 From database schema to ontology (2)
[Figure: the same structure, now read as an ontology and its instances. Ontology level: OBJECT, PERSON, PROJECT, RESEARCHER, with the labels TITLE and NAME, the relations WORKS-IN and COOPERATES-WITH, and the added rule cooperateswith(X, Y) ⇒ cooperateswith(Y, X). Instance level: "Semantic Web Mining", DAMLPROJ, URI-SWMining, "Andreas Hotho", URI-AHO, URI-GST, linked by WORKS-IN and COOPERATES-WITH.]
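What the rule cooperateswith(X, Y) ⇒ cooperateswith(Y, X) adds over a plain schema can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the helper names `apply_symmetry` and `cooperates` are hypothetical, and the direction in which the single fact is stored (URI-AHO to URI-GST) is assumed for illustration, not read off the figure.

```python
def apply_symmetry(facts):
    """Return the facts plus every pair the symmetry rule lets us infer."""
    return facts | {(y, x) for (x, y) in facts}


def cooperates(x, y, facts):
    """Check whether cooperateswith(x, y) holds in the given set of facts."""
    return (x, y) in facts


# One stored instance-level fact (storage direction is an assumption).
cooperates_with = {("URI-AHO", "URI-GST")}

# Applying the rule makes the implicit reverse fact explicit.
inferred = apply_symmetry(cooperates_with)

print(cooperates("URI-GST", "URI-AHO", cooperates_with))  # False: stored facts only
print(cooperates("URI-GST", "URI-AHO", inferred))         # True: after inference
```

With records alone, the reverse direction would have to be stored explicitly; with the rule in the ontology, it can be derived on demand.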
26 What is an ontology?
(A commonly accepted informal definition and one formal definition)
An ontology is "an explicit specification of a shared conceptualisation." (Gruber, 1993)
27 Key terms in DB/KB conceptual modelling
                                        Database             Knowledge base
Conceptual model (modelling language)   ER diagram           e.g. OWL model
Conceptual model                        schema               ontology
Constituents                            entity               concept
                                        attribute
                                        relationship         relation
                                        (no correspondence)  axiom
                                        record               instance
28 Semantics
- "A conceptual model describes the semantics of a domain"
- Semantics
  - the aspects of meaning that are expressed in a language, code, or other form of representation of information
  - Semantics is contrasted with two other aspects of meaningful expression: syntax, the construction of complex signs from simpler signs, and pragmatics, the practical use of signs by agents or communities of interpretation in particular circumstances and contexts.
  - also: the theoretical study of meaning in systems of signs
  - Meaning (denotational): the relation of a sign to objects and objective situations, actual or possible
29 Knowledge representation
- What constituents a conceptual model should have, and how they should interact, is the question of knowledge representation (KR).
- KR is an issue that arises in both cognitive science and artificial intelligence.
- In cognitive science it is concerned with how people store and process information.
- In artificial intelligence (AI) the primary aim is to store knowledge so that programs can process it and achieve the verisimilitude of human intelligence.
- AI researchers have borrowed representation theories from cognitive science. Thus there are representation techniques such as frames, rules and semantic networks which originated from theories of human information processing.
- Since knowledge is used to achieve intelligent behavior, the fundamental goal of knowledge representation is to represent knowledge in a manner that facilitates inferencing, i.e., drawing conclusions from knowledge.
30 Conceptual modelling: languages, automated code generation, integration
- Typically, the conceptual model(s) that are developed are captured in a software tool, using a particular conceptual modeling language.
  - Entity-relationship models (ERM)
  - Unified Modeling Language (UML)
  - But also: Resource Description Framework (RDF), Web Ontology Language (OWL)
- Conceptual modeling is one of the key activities in developing computerized systems, for two important reasons.
- Firstly, it is increasingly possible to use computerized tools that generate part (or sometimes all) of a computer application from conceptual models encoded in standardized modeling languages [such as UML].
- Secondly, computerization of enterprises continues with a focus on integrating systems.
  - Integration of systems requires an understanding of the semantics of each of the systems to be integrated.
  - The availability of conceptual models for the participant systems can facilitate the integration process and will require the involved staff to be fluent with the basics of the models employed and to have some modeling capabilities of their own.
32 Recap: Conceptual modelling in the Entity-Relationship Model
- Insert here: Jeff Ullman, The Entity-Relationship (E/R) Model. 2004 slide set. http://infolab.stanford.edu/~ullman/dscb/pslides/er.ppt (in particular pp. 1-39)
- A lot of detail also in Henk Olivié, 2006. Gegevensbanken: 3: gegevensmodellering met het entiteit-relatie model, 4: het uitgebreide entiteit relatie model en UML
- Or (instructor slides of the Elmasri/Navathe book, in English): ch03.ppt, ch04.ppt in the directory "Lecture/OtherSlides" of this course's Web site
33 Next lecture
- Organisation of the course
- Motivation and overview
- Data, information, and knowledge
- Conceptual modelling, schemas, and ontologies
- Recap: Entity-relationship model for data modelling
- Data modelling: UML, logics, Semantic Web
34 References / background reading; acknowledgements
- p. 22, p. 32: Elmasri, R. & Navathe, S. B. (2007). Fundamentals of Database Systems. Boston: Addison Wesley. 5th Edition. p. 410
- p. 23: http://en.wikipedia.org/wiki/Conceptual_modeling, .../Conceptual_schema, .../Model_(abstract)
- p. 26: Gruber, T. R. (1993). Towards principles for the design of ontologies used for knowledge sharing. In N. Guarino & R. Poli (Eds.), Formal Ontologies in Conceptual Analysis and Knowledge Representation. Deventer, NL: Kluwer.
  Bozsak, Ehrig, Handschuh, Hotho, Maedche, Motik, Oberle, Schmitz, Staab, Stojanovic, Studer, Stumme, Sure, Tane, Volz, & Zacharias (2002). KAON Towards a Large Scale Semantic Web. In Kurt Bauknecht, A. Min Tjoa, & Gerald Quirchmayr (Eds.), E-Commerce and Web Technologies, Third International Conference, EC-Web 2002, Aix-en-Provence, France, September 2-6, 2002, Proceedings (pp. 304-313). Springer: LNCS 2455
- p. 28: http://en.wikipedia.org/wiki/Semantics
- p. 29: Based on http://en.wikipedia.org/wiki/Knowledge_representation
- p. 30: Based on: Dagstuhl seminar April 2008: The Evolution of Conceptual Modeling. http://www.dagstuhl.de/de/programm/kalender/semhp/?semnr=2008181
- p. 32: the referenced Ullman slides refer to Hector Garcia-Molina, Jeff Ullman, & Jennifer Widom (2002). Database Systems: The Complete Book. Upper Saddle River, NJ: Prentice-Hall.