Скачать презентацию Comp 3311 Database Management Systems 1 Introduction Acknowledgement Скачать презентацию Comp 3311 Database Management Systems 1 Introduction Acknowledgement

0ada84659a482506f3561d63a2e60c3b.ppt

  • Количество слайдов: 27

Comp 3311 Database Management Systems 1. Introduction Acknowledgement: Slides modified by Dr. Lei Chen Comp 3311 Database Management Systems 1. Introduction Acknowledgement: Slides modified by Dr. Lei Chen using previous versions created by Prof Dimitris Papadias, Prof. Dik Lee and Dr. Wilfred Ng. COMP 3311 Fall 2017 CSE, HKUST Slide 1

Course info • L 1: Schedule: Monday 1: 30 pm-2: 50 pm, Friday 9: Course info • L 1: Schedule: Monday 1: 30 pm-2: 50 pm, Friday 9: 00 am-10: 20 am, Room 1103 • L 2: Schedule: Wednesday, Friday 4: 30 pm-5: 50 pm, Room 2504 Instructor: Lei Chen • • WWW page: https: //www. cse. ust. hk/~leichen/comp 3311/index. html Midterm exam: Wed Sept 29 thth (in class exam). Exams will be with open books and notes. Textbook Database System Concepts, A. Silberschatz, H. Korth, and S. Sudarshan. Reference Database Management Systems, Raghu Ramakrishnan and Johannes Gehrke. Grading Policy: 40% final, 26% midterm, 24% assignment +Bonus (in class question answers) The first week of classes (Sept 4 th) there will not be any tutorials or labs. COMP 3311 Fall 2017 CSE, HKUST Slide 2

Course Outline • • • • E/R Model Relational Model, Algebra SQL Functional Dependencies Course Outline • • • • E/R Model Relational Model, Algebra SQL Functional Dependencies and Relational Database Design Storage and File Systems Tree and Hash Indexes Query Processing and Implementation of Relational Operators Query Optimization Physical Database Design Transactions Concurrency Control Protocols Database Recovery Advanced Topics COMP 3311 Fall 2017 CSE, HKUST Slide 3

What is a Database Management System (DBMS) • Collection of interrelated data + Set What is a Database Management System (DBMS) • Collection of interrelated data + Set of programs to access the data • DBMS contains information about a particular enterprise • DBMS provides an environment that is both convenient and efficient to use. • Database Applications: – – – Banking: all transactions Airlines: reservations, schedules Universities: registration, grades Sales: customers, products, purchases Manufacturing: production, inventory, orders, supply chain Human resources: employee records, salaries, tax deductions • Databases touch all aspects of our lives COMP 3311 Fall 2017 CSE, HKUST Slide 4

Commercial Database Systems Company Product Remarks Oracle 8 i, 9 i, etc. World’s 2 Commercial Database Systems Company Product Remarks Oracle 8 i, 9 i, etc. World’s 2 nd largest software company CEO, Larry Ellison, world’s 2 nd richest IBM Microsoft Sybase DB 2, Universal Server Access, SQL Server Adaptive Server World’s 2 nd largest after Informix acquisition Informix Dynamic Server Acquired by IBM in 2001 COMP 3311 Fall 2017 Access comes with MS Office CEO John Chen, grown up in HK, bought by SAP, 2010 CSE, HKUST Slide 5

DBMS vs File Systems • In the early days, database applications were built on DBMS vs File Systems • In the early days, database applications were built on top of file systems • Drawbacks of using file systems to store data: – Data redundancy and inconsistency • Multiple file formats, duplication of information in different files – Difficulty in accessing data • Need to write a new program to carry out each new task – Integrity problems • Integrity constraints (e. g. account balance > 0) become part of program code • Hard to add new constraints or change existing ones COMP 3311 Fall 2017 CSE, HKUST Slide 6

DBMS vs File Systems (cont) • Drawbacks of using file systems (cont. ) – DBMS vs File Systems (cont) • Drawbacks of using file systems (cont. ) – Atomicity of updates • Failures may leave database in an inconsistent state with partial updates carried out • E. g. transfer of funds from one account to another should either complete or not happen at all – Concurrent access by multiple users • Concurrent accesses needed for performance • Uncontrolled concurrent accesses can lead to inconsistencies – E. g. two people reading a balance and updating it at the same time – Security problems • DBMS offer automated solutions to all the above problems; they solve problems caused by different people writing different applications independently. COMP 3311 Fall 2017 CSE, HKUST Slide 7

Data Independence • One big problem in application development is the separation of applications Data Independence • One big problem in application development is the separation of applications from data • Do I have change my program when I … – replace my hard drive? – store the data in a b-tree instead of a hash file? – partition the data into two physical files (or merge two physical files into one)? – store salary as floating point number instead of integer? – develop other applications that use the same set of data? – add more data fields to support other applications? • Solution: introduce levels of abstraction. COMP 3311 Fall 2017 CSE, HKUST Slide 8

Three Levels of Abstraction ARR view 1 CSE Dept view 2 . . ……. Three Levels of Abstraction ARR view 1 CSE Dept view 2 . . ……. . . Financial Office view n Logical view HKUST database Physical view Files on disks COMP 3311 Fall 2017 CSE, HKUST Slide 9

Three Levels of Abstraction (cont. ) • Physical level: describe how a record is Three Levels of Abstraction (cont. ) • Physical level: describe how a record is stored on disks. • e. g. , “Divide the customer records into 3 partitions and store them on disks 1, 2 and 3. ” • Logical level: describes data stored in database, and the relationships among the data. Similar to defining a record type in Pascal or C: Type customer = record name: string; street: string; city: integer; end; • View level: Define a subset of the database for a particular application. Views can also hide information (e. g. salary) for security purposes. COMP 3311 Fall 2017 CSE, HKUST Slide 10

Instances and Schemas • Each level is defined by a schema, which describes the Instances and Schemas • Each level is defined by a schema, which describes the data at the corresponding level – A logical schema defines the logical structure of the database (e. g. , set of customers and accounts and the relationship between them) – A physical schema defines the file formats and locations • A database instance refers to the actual content of the database at a particular point in time. A database instance must conform to the corresponding schema COMP 3311 Fall 2017 CSE, HKUST Slide 11

Data Independence • Ability to modify a schema definition in one level without affecting Data Independence • Ability to modify a schema definition in one level without affecting a schema definition in the next higher level. • The interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others. • Two levels of data independence: - Physical data independence (users are shielded from changes in the physical structure of the data) - Logical data independence (users are shielded from changes in the logical structure of the data) COMP 3311 Fall 2017 CSE, HKUST Slide 12

Application view View definitions Logical Data Independence logical Logical schema Physical Data Independence physical Application view View definitions Logical Data Independence logical Logical schema Physical Data Independence physical COMP 3311 Fall 2017 Physical schema CSE, HKUST Slide 13

An Example of Data Independence Data on disk 1129 John Law program …… Program An Example of Data Independence Data on disk 1129 John Law program …… Program accessing data directly has to know: • first 4 bytes is an ID number • next 10 bytes is an employee name Schema Data on disk 1129 John Law …… Student: ID: integer Name char(10) DBMS program COMP 3311 Fall 2017 CSE, HKUST Slide 14

Data Models • A collection of tools for describing: – data relationships – data Data Models • A collection of tools for describing: – data relationships – data semantics – data constraints COMP 3311 Fall 2017 CSE, HKUST Slide 15

Entity-Relationship Model • Example of entity-relationship model social-security customer-street account-number balance customer-city customer-name CUSTOMER Entity-Relationship Model • Example of entity-relationship model social-security customer-street account-number balance customer-city customer-name CUSTOMER COMP 3311 Fall 2017 DEPOSITOR CSE, HKUST Slide 16 ACCOUNT

Relational Model Example of tabular data in the relational model: customername Johnson Smith Johnson Relational Model Example of tabular data in the relational model: customername Johnson Smith Johnson Jones Smith socialsecurity 192 -83 -7465 019 -28 -3746 192 -83 -7465 321 -12 -3123 019 -28 -3746 customerstreet Alma North Alma Main North account-number A-101 A-215 A-217 COMP 3311 Fall 2017 customercity Palo Alto Rye Palo Alto Harrison Rye accountnumber A-101 A-215 A-201 A-217 A-201 balance 500 900 750 CSE, HKUST Slide 17

Data Definition Language (DDL) • Specification notation for defining the database schema – Express Data Definition Language (DDL) • Specification notation for defining the database schema – Express what were in the previous two slides to the DBMS in a formal language • Data storage and definition language - special type of DDL in which the storage structure and access methods used by the database system are specified COMP 3311 Fall 2017 CSE, HKUST Slide 18

Data Manipulation Language (DML) • Language for accessing and manipulation the data organized by Data Manipulation Language (DML) • Language for accessing and manipulation the data organized by the appropriate data model • Two types of formal languages – Algebra (Procedural) - user specifies what data is required and how to get those data. – Calculus (Nonprocedural) - user specifies what data is required without specifying how to get those data • Commercial languages – SQL COMP 3311 Fall 2017 CSE, HKUST Slide 19

SQL • Most common language – used in all commercial DBMS • In addition SQL • Most common language – used in all commercial DBMS • In addition to DML, also DDL and more. • Example SELECT Name FROM Students WHERE Dept = CSE COMP 3311 Fall 2017 CSE, HKUST Slide 20

Transaction Management • A transaction is a collection of operations that performs a single Transaction Management • A transaction is a collection of operations that performs a single logical function in the database Example: ATM withdrawal Read account record Modify balance Write back modified record Give money to customer • Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e. g. power failures and operating system crashes) and transaction failures. COMP 3311 Fall 2017 CSE, HKUST Slide 21

Concurrency-control Management • Concurrency-control manager controls the interaction among the concurrent transactions, to ensure Concurrency-control Management • Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database. Transaction 1 Transaction 2 Conflicting read/write COMP 3311 Fall 2017 CSE, HKUST Slide 22

Storage/Buffer Management • The storage manager is a module that provides the interface between Storage/Buffer Management • The storage manager is a module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. • The buffer manager is responsible for fetching data from disk storage into main memory and deciding what data to cache in main memory. COMP 3311 Fall 2017 CSE, HKUST Slide 23

Database Administrator (DBA) • • Coordinates all the activities of the database system; the Database Administrator (DBA) • • Coordinates all the activities of the database system; the database administrator has good understanding of the enterprise’s information resources and needs. Database administrator’s duties include: Primary job of a database – Schema definition designer – Specifying integrity constraints – Storage structure and access method definition – Schema and physical organization modification More system – Granting user authority to access the database oriented – Acting as liaison with users – Monitoring performance and responding to changes in requirements COMP 3311 Fall 2017 CSE, HKUST Slide 24

Database Users • • Users are differentiated by the way they expected to interact Database Users • • Users are differentiated by the way they expected to interact with the system End users – invoke one of the existing application programs (e. g. , print monthly sales report) – Interact with applications through GUI • Application programmers – Develop applications that interact with DBMS through DML calls • Sophisticated users – form requests in a database query language – mostly one-time ad hoc queries COMP 3311 Fall 2017 CSE, HKUST Slide 25

Overall System Architecture COMP 3311 Fall 2017 CSE, HKUST Slide 26 Overall System Architecture COMP 3311 Fall 2017 CSE, HKUST Slide 26

Application Architectures §Two-tier architecture: E. g. client programs using ODBC/JDBC to communicate with a Application Architectures §Two-tier architecture: E. g. client programs using ODBC/JDBC to communicate with a database §Three-tier architecture: E. g. web-based applications, and applications built using “middleware” COMP 3311 Fall 2017 CSE, HKUST Slide 27