Скачать презентацию Introduction to Database CHAPTER 1 INTRODUCTION n n Скачать презентацию Introduction to Database CHAPTER 1 INTRODUCTION n n

182dd8def3172700f6b9f92f36362ae1.ppt

  • Количество слайдов: 56

Introduction to Database CHAPTER 1 INTRODUCTION n n n n n Edited: Wei-Pang Yang, Introduction to Database CHAPTER 1 INTRODUCTION n n n n n Edited: Wei-Pang Yang, IM. NDHU Database-System Applications Purpose of Database Systems View of Database Languages Relational Databases Database Design Data Storage and Querying Transaction Management Database Architecture Database Users and Administrators Source: Database System Concepts, Silberschatz etc. 2006 1 -1

Database System: Introduction n Database Management System (DBMS) n Contains a large bodies of Database System: Introduction n Database Management System (DBMS) n Contains a large bodies of information n Collection of interrelated data (database) n Set of programs to access the data Goal of a DBMS: n provides a way to store and retrieve database information that is both • convenient and • efficient. Functions of DBMS: Management of Data (MOD) n Defining structure for storage data n Providing mechanisms for manipulation of data n Ensure safety of data (system crashes, unauthorized access, misused, …) n Concurrent control in multi-user environment § Computer Scientists: developed a lot of concepts and technique for MOD § concepts and technique form the focus of this book, and this course Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -2

1. 1 Database-System Applications n Database Applications: n n Airlines: reservations, schedules n Universities: 1. 1 Database-System Applications n Database Applications: n n Airlines: reservations, schedules n Universities: registration, grades, student profile, . . n Sales: customers, products, purchases n Manufacturing: production, inventory, orders, supply chain n n Banking: all transactions Human resources: employee records, salaries, tax deductions Databases touch all aspects of our lives Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -3

1. 2 Purpose of Database Systems n n In the early days, database applications 1. 2 Purpose of Database Systems n n In the early days, database applications were built on top of file systems Drawbacks of using file systems to store data: n n Data redundancy and inconsistency • Multiple file formats, duplication of information in different files Difficulty in accessing data • Need to write a new program to carry out each new task Data isolation — multiple files and formats Integrity problems • Integrity constraints (e. g. account balance > 0) become part of program code • Hard to add new constraints or change existing ones Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -4

Drawbacks of using file systems (cont. ) n Drawbacks of using file systems to Drawbacks of using file systems (cont. ) n Drawbacks of using file systems to store data: (cont. ) n n Atomicity of updates 原子性, 單一性 • Failures may leave database in an inconsistent state with partial updates carried out • E. g. transfer of funds from one account to another should either complete or not happen at all Concurrent access by multiple users • Concurrent accessed needed for performance • Uncontrolled concurrent accesses can lead to inconsistencies § E. g. two people reading a balance and updating it at the same time n Security problems Solution Database systems offer solutions to all the above problems Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -5

1. 3 View of Data and Data Abstraction n Physical level: describes how a 1. 3 View of Data and Data Abstraction n Physical level: describes how a record (e. g. , customer information) is stored in disk. n n By sequential file, pointer, or hash structure, … Logical level: describes data stored in database, and the relationships among the data. type customer = record name : string; street : string; city : string; income : integer; end; n View level: application programs hide details of data types. Views can also hide information (e. g. , income) for security purposes. Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -6

View of Data -1: Three Levels An architecture for a database system Edited: Wei-Pang View of Data -1: Three Levels An architecture for a database system Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -7

View of Data -2: Three Levels User A 1 Host Language + DSL 1 View of Data -2: Three Levels User A 1 Host Language + DSL 1 User A 2 Host Language + DSL 2 External View @ # & External schema A External/conceptual mapping A < External schema B External View B External/conceptual mapping B Conceptual View Database management system Dictionary (DBMS) e. g. system catalog Conceptual/internal mapping DBA (Build and maintain schemas and mappings) User B 2 User B 3 Host C, C++ Language + DSL DSL (Data Sub. Language) e. g. SQL 3 Conceptual schema User B 1 Host Language + DSL Storage structure definition (Internal schema) Edited: Wei-Pang Yang, IM. NDHU Stored database (Internal View) 1 2 Source: Database System Concepts, Silberschatz etc. 2006 3 # . . . 100 & @ 1 -8

1. 3. 2 Instances and Schemas n Schema – the logical structure of the 1. 3. 2 Instances and Schemas n Schema – the logical structure of the database n e. g. , the database consists of information about a set of customers and accounts and the relationship between them n Analogous to type information of a variable in a program n Physical schema: database design at the physical level n Logical schema: database design at the logical level account create table account (account-number char(10), balance integer) type customer = record name : string; street : string; city : integer; end; Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 customer 1 -9

Instances and Schemas (cont. ) n Instance – the actual content of the database Instances and Schemas (cont. ) n Instance – the actual content of the database at a particular point in time n Analogous to the value of a variable Instance Schema create table account (account-number char(10), balance integer) n Physical Data Independence – the ability to modify the physical schema without changing the logical schema n Applications depend on the logical schema n In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others. Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -10

View of Data: Three Levels An architecture for a database system Physical Data Independence View of Data: Three Levels An architecture for a database system Physical Data Independence Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -11

1. 3. 3 Data Models n A collection of conceptual tools for describing n 1. 3. 3 Data Models n A collection of conceptual tools for describing n n data relationships n data semantics n n data (entities, objects) data consistency constraints Data Models Provide: n Edited: Wei-Pang Yang, IM. NDHU A way to describe the design of a database at 3 levels • Physical level • Logical level • View level Source: Database System Concepts, Silberschatz etc. 2006 1 -12

Category of Data Models n Category of Data Models: n Entity-Relationship model n Relational Category of Data Models n Category of Data Models: n Entity-Relationship model n Relational model n Object-oriented model n n Edited: Wei-Pang Yang, IM. NDHU Semi-structured data models • Extensible Markup Language (XML) Older models: • Network model and • Hierarchical model Source: Database System Concepts, Silberschatz etc. 2006 1 -13

1. 4 Database Languages n Data Definition Language (DDL): n n n Specification notation 1. 4 Database Languages n Data Definition Language (DDL): n n n Specification notation for defining the database schema E. g. create table account (account-number char(10), balance integer) Data Manipulation Language (DML) n To express database queries or updates n E. g. Select account-number from account where balance >1000 n SQL (Structured Query Language): a single language for both Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -14

1. 4. 1 Data-Manipulation Language (DML) n Language for accessing and manipulating the data 1. 4. 1 Data-Manipulation Language (DML) n Language for accessing and manipulating the data organized by the appropriate data model n DML also known as query language For retrieval, insertion, deletion, modification (update) Two classes of languages n n n Procedural DMLs – user specifies what data is required and how to get those data • E. g. … in C Declarative DMLs (Nonprocedural DMLs) – user specifies what data is required without specifying how to get those data • E. g. In SQL: Select account-number from account where balance > 700 SQL is the most widely used query language n n Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -15

1. 4. 2 Data-Definition Language (DDL) n Specification notation for defining the database schema 1. 4. 2 Data-Definition Language (DDL) n Specification notation for defining the database schema n E. g. create table account (account-number char(10), balance integer) n Define: • Attributes name • Data type • Consistency constraints (integrity constraints) § Domain constraints: e. g. assets are integer type § Assertions: e. g. assets >= 0 § Authorization: for different users § …. Edited: Wei-Pang Yang, IM. NDHU create table branch (branch-name char(15), branch-city char(30), assets integer, primary key (branch-name), check (assets >= 0)) Source: Database System Concepts, Silberschatz etc. 2006 1 -16

Data Dictionary and Storage Definition n Data Dictionary: n n DDL compiler generates a Data Dictionary and Storage Definition n Data Dictionary: n n DDL compiler generates a set of tables stored in a data dictionary contains metadata (i. e. , data about data) • Database schema • System tables • Users • … Database system consults the Data dictionary before reading or modifying actual dada. Data storage and definition language • To specify the storage structure and access methods • Usually an extension of the data definition language Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 (ch. 11, 12) 1 -17

1. 5 Relational Databases n Definition 1: A Relational Database is a database that 1. 5 Relational Databases n Definition 1: A Relational Database is a database that is perceived by the users as a collection of time-varying, normalized relations (tables). • • • n Perceived by the users: the relational model apply at the view level and logical levels. Time-varying: the set of tuples changes with time. Normalized: contains no repeating group (only contains atomic value). The relational model represents a database system at a level of abstraction that removed from the details of the underlying machine, like high-level language. C, PASCAL , PL/1 assembler machine Edited: Wei-Pang Yang, IM. NDHU DBMS environments Relational DBMS Relational Data Model Source: Database System Concepts, Silberschatz etc. 2006 1 -18

1. 5. 1 Tables n Definition 2: A Relational Database is a database that 1. 5. 1 Tables n Definition 2: A Relational Database is a database that is perceived by its users as a collection of tables (and nothing but tables). Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -19

1. 5. 2 Data-Manipulation Language n SQL (Structured Query Language) : widely used n 1. 5. 2 Data-Manipulation Language n SQL (Structured Query Language) : widely used n E. g. find the name of the customer with customer-id 192 -83 -7465 select customer-name from customer where customer-id = ‘ 192 -83 -7465’ customer Output: customer-name Johnson Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -20

SQL (Structured Query Language) n E. g. find the balances of all accounts held SQL (Structured Query Language) n E. g. find the balances of all accounts held by the customer with customer-id 192 -83 -7465 select account. balance from depositor, account where depositor. customer-id = ‘ 192 -83 -7465’ and depositor. account-number = account-number Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -21

1. 5. 3 Data-Definition Language n SQL provides DDL to define database schema: n 1. 5. 3 Data-Definition Language n SQL provides DDL to define database schema: n n Tables • E. g. create table account (account-number char(10), balance integer) Assertions (ref. p. 132) • n E. g. create assertion balance-constraint check account. balance >= 1000 integrity Constraints (ref. p. 129) Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -22

Referential Integrity Constraint create table account (account-number char(10), branch-name char(15), balance integer, primary key Referential Integrity Constraint create table account (account-number char(10), branch-name char(15), balance integer, primary key (account-number), 3. account 存款帳 references create table depositor (customer-name char(20), account-number char(10), primary key (customer-name, account-number), foreign key (account-number) references account, Edited: Wei-Pang Yang, IM. NDHU 4. depositor 存款戶 Source: Database System Concepts, Silberschatz etc. 2006 1 -23

1. 5. 4 Data Access from Application Programs n Application programs generally access databases 1. 5. 4 Data Access from Application Programs n Application programs generally access databases through one of n n Language extensions to allow embedded SQL Application program interface (e. g. ODBC/JDBC) which allow SQL queries to be sent to a database n ODBC: Open Database Connectivity for C n JDBC: Java Database Connectivity for Java language ODBC/JDBC Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -24

1. 6 Database Design n n Database Design - The process of designing the 1. 6 Database Design n n Database Design - The process of designing the general structure of the database: n Logical Design n Physical Design Logical Design – Deciding on the database schema. n n n To find a “good” collection of relation schemas. Business decision – What attributes should we record in the database? Computer Science decision – What relation schemas should we have and how should the attributes be distributed among the various relation schemas? Physical Design – Deciding on the physical layout of the database n Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -25

1. 6. 1 Design Process n n Phase I n Specification of user requirement 1. 6. 1 Design Process n n Phase I n Specification of user requirement (with domain experts) Phase II n Conceptual design (ch. 6) n Choose a data model n Design tables n Normalization (ch. 7) Phase III n Specification of functional requirements Phase IV n Implementation n Logical-design n Physical-design (ch. 11, 12) Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -26

1. 6. 2 Database Design for Banking n Banking Database: consists 6 relations: 1. 1. 6. 2 Database Design for Banking n Banking Database: consists 6 relations: 1. 2. branch (branch-name, branch-city, assets) customer (customer-name, customer-street, customeronly) 3. account (account-number, branch-name, balance) 4. loan (loan-number, branch-name, amount) 5. depositor (customer-name, account-number) 6. borrower (customer-name, loan-number) Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -27

Example: Banking Database 1. branch 分公司 4. borrower 貸款戶 Edited: Wei-Pang Yang, IM. NDHU Example: Banking Database 1. branch 分公司 4. borrower 貸款戶 Edited: Wei-Pang Yang, IM. NDHU 2. customer 客戶(存款戶, 貸款戶) 3. depositor 5. account 存款帳 6. loan Source: Database System Concepts, Silberschatz etc. 2006 存款戶 貸款帳 1 -28

1. 6. 3 Entity-Relationship Model (ch. 6) § Example: Schema in the Entity-Relationship model 1. 6. 3 Entity-Relationship Model (ch. 6) § Example: Schema in the Entity-Relationship model 客戶 存款帳 客戶(存款戶, 貸款戶, 信用卡戶) Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 存款戶 1 -29

E-R Diagram for a Banking Enterprise, p. 240 Edited: Wei-Pang Yang, IM. NDHU Source: E-R Diagram for a Banking Enterprise, p. 240 Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -30

Entity Relationship Model (cont. ) n E-R model of real world n n n Entity Relationship Model (cont. ) n E-R model of real world n n n Entities (objects) • E. g. customers, accounts, bank branch Relationships between entities • E. g. Account A-101 is held by customer Johnson • E. g. Relationship set depositor associates customers with accounts Widely used for database design n Database design in E-R model usually converted to design in the Relational model (coming up next) which is used for storage and processing Relational Model (ch. 2) E-R model (ch. 6) Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -31

1. 6. 4 Normalization n Definition: A Relational Database is a database that is 1. 6. 4 Normalization n Definition: A Relational Database is a database that is perceived by its users as a collection of tables (and nothing but tables). Supplier-and-Parts Database S P S# S 1 S 2 S 3 S 4 S 5 P# P 1 P 2 P 3 P 4 P 5 P 6 SNAME Smith Jones Blake Clark Adams PNAME Nut Bolt Screw Cam Cog Edited: Wei-Pang Yang, IM. NDHU STATUS 20 10 30 20 30 COLOR Red Green Blue Red SP S# CITY London Paris London Athens WEIGHT 12 17 17 14 12 19 CITY London Paris Rome London Paris London S 1 S 1 S 1 S 2 S 3 S 4 S 4 P# QTY P 1 300 P 2 200 P 3 400 P 4 200 P 5 100 P 6 100 P 1 300 P 2 400 P 2 200 P 4 300 P 5 400 Source: Database System Concepts, Silberschatz etc. 2006 1 -32

Problem of Normalization <e. g. > S 1, Smith, 20, London, P 1, Nut, Problem of Normalization S 1, Smith, 20, London, P 1, Nut, Red, 12, London, 300 S 1, Smith, 20, London, P 2, Bolt, Green, 17, Paris, 200 . S 4, Clark, 20, London, P 5, Cam, Blue, 12, Paris, 400 S P Normalization SP S# SNAME STATUS CITY s 1 . London S# P# QTY P# . . . SP' S' or P S# SNAME STATUS P# . . . S 1 Smith . S 2. . . S 1 London P 1 300 . . Redundancy Edited: Wei-Pang Yang, IM. NDHU S# CITY P# QTY S 1 London P 2 200 . . . Update Anomalies! (異常) Source: Database System Concepts, Silberschatz etc. 2006 1 -33

1. 7 Object-Based and Semistructured Databases n Extend the relational data model n by 1. 7 Object-Based and Semistructured Databases n Extend the relational data model n by including object orientation and n constructs to deal with added data types. (video, image, …) Allow attributes of tuples to have complex types, including n non-atomic values such as nested relations. (repeated data, …) Preserve relational foundations, n n n in particular the declarative access to data, while extending modeling power. 6. borrower Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -34

1. 7. 2 Semistructured Data Models n XML (Extensible Markup Language) n n n 1. 7. 2 Semistructured Data Models n XML (Extensible Markup Language) n n n Defined by the WWW Consortium (W 3 C) 聯合 Originally intended as a document markup language not a database language The ability to specify new tags, and to create nested tag structures made XML a great way to exchange data, not just documents XML has become the basis for all new generation data interchange formats. A wide variety of tools is available for parsing, browsing and querying XML documents/data Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -35

1. 8 Data Storage and Querying n Components of Database System n Query Processor 1. 8 Data Storage and Querying n Components of Database System n Query Processor • • • n Query Helps to simplify to access data High-level view Users are not be burdened unnecessarily with the physical details DBMS Language Processor Optimizer Query Processor Operation Processor Storage Manager • • Require a large amount of space Can not store in main memory Disk speed is slower Minimize the need to move data between disk and main memory Access Method Storage Manager File Manager Goal of a DBMS: provides a way to store and retrieve data that is both convenient and efficient. Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 Database 1 -36

Overall System Structure low-level data stored Source: Database System Concepts, Silberschatz etc. 2006 database Overall System Structure low-level data stored Source: Database System Concepts, Silberschatz etc. 2006 database Edited: Wei-Pang Yang, IM. NDHU 1 -37

1. 8. 1 Storage Management n n n Storage Manager n is a program 1. 8. 1 Storage Management n n n Storage Manager n is a program module n that provides the interface between the low-level data stored and the application programs and queries submitted to the system. Tasks of the Storage Manager: n interaction with the file manager (part of Operating System) n Translates DML into low-level file-system commands, n i. e. responsible for storing, retrieving and updating of data in database Data Structures of the Storage Manager n Data files: store database itself n Data Dictionary: store metadata n Indices: provide fast access to data items that hold particular values Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -38

Storage Management (cont. ) n Components of Storage manager: n Authorization and Integrity Manager Storage Management (cont. ) n Components of Storage manager: n Authorization and Integrity Manager • Tests for the satisfaction of integrity constraints • Checks the authority of users to access data n Transaction Manager • Ensure the database in a consistent state (correct) after failures • Ensure that concurrent transaction executions proceed without conflicting n File Manager • Manages the allocation of space on disk • Manages the data structures used to representation data stored n Buffer manager • Fetches data from disk into main memory • Decides what data to cache in main memory Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -39

1. 8. 2 The Query Processor n DDL Interpreter n n n Interprets DDL 1. 8. 2 The Query Processor n DDL Interpreter n n n Interprets DDL statements write the definitions (schema, view, . . ) into the data dictionary DML Compiler n n n Translates DML statements into an evaluation plan (or some evaluation plans) which consists low-level instructions Query Optimization: picks the lowest cost evaluation plan Query Evaluation Engine: n execute low-level instructions generated by the DML Compiler Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -40

Flow of Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation Edited: Wei-Pang Flow of Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -41

Query Optimizer n Alternative ways of evaluating a given query n n n Equivalent Query Optimizer n Alternative ways of evaluating a given query n n n Equivalent expressions Different algorithms for each operation Cost difference between a good and a bad way of evaluating a query can be enormous n Need to estimate the cost of operations n n Depends critically on statistical information about relations which the database must maintain Need to estimate statistics for intermediate results to compute cost of complex expressions Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -42

Example: A Simple Query Processing (補) Query in SQL: DBMS SELECT CUSTOMER. NAME FROM Example: A Simple Query Processing (補) Query in SQL: DBMS SELECT CUSTOMER. NAME FROM CUSTOMER, INVOICE WHERE REGION = 'N. Y. ' AND AMOUNT > 10000 AND CUTOMER. C#=INVOICE. C Language Processor Internal Form : ( (S Optimizer SP) Operator : SCAN C using region index, create C SCAN I using amount index, create I SORT C? and I? on C# JOIN C? and I? on C# EXTRACT name field Language Processor Query Processor Access Method Storage Manager Operator Processor Calls to Access Method: OPEN SCAN on C with region index GET next tuple. . . Calls to file system: GET 10 th to 25 th bytes from block #6 of file #5 Access Method e. g. B-tree; Index; Hashing File System Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 database 1 -43

1. 9 Transaction Management n Transaction: n n n Atomicity: all or nothing Failure 1. 9 Transaction Management n Transaction: n n n Atomicity: all or nothing Failure recovery manager n n n A transaction is a collection of operations that performs a single logical function in a database application ensures that the database remains in a consistent (correct) state, Failure: • system failures (e. g. , power failures and operating system crashes) • transaction failures. Concurrency-control manager n controls the interaction among the concurrent transactions, to ensure the consistency of the database. Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -44

1. 10 Data Mining and Analysis n Data Analysis and Mining n Decision Support 1. 10 Data Mining and Analysis n Data Analysis and Mining n Decision Support Systems n Data Analysis and OLAP (Online analytical processing), n Data Warehousing n Data Mining Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -45

Decision Support Systems n Decision-support systems n are used to make business decisions, often Decision Support Systems n Decision-support systems n are used to make business decisions, often based on data collected by on-line transaction systems. Examples of business decisions: n n n What items to stock? n What insurance premium to change? To whom to send advertisements? Examples of data used for making decisions n n n Retail sales transaction details n Customer profiles (income, age, gender, etc. ) Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -46

Data Mining (ch. 18) n Data mining: n seeks to discover knowledge automatically in Data Mining (ch. 18) n Data mining: n seeks to discover knowledge automatically in the form of statistical rules and patterns from large databases. E. g. p. 23: Young women buy cars. is the process of semi-automatically analyzing large databases to find useful patterns Prediction based on past history n n n Predict if a credit card applicant poses a good credit risk, based on some attributes (income, job type, age, . . ) and past history Predict if a pattern of phone calling card usage is likely to be fraudulent 欺騙的 Descriptive Patterns n n Associations • Find books that are often bought by “similar” customers. If a new such customer buys one such book, suggest the others too. (library) Associations may be used as a first step in detecting causation Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 引起; 因果關係 1 -47

1. 11 Database Architecture n n System Structure of a Database System n Fig. 1. 11 Database Architecture n n System Structure of a Database System n Fig. 1. 6 (p. 25) Application Structure n User uses database at the site n Users uses database through a network • Client: remote database users work • Sever: database system runs here n Partition of Database Application n Two-tier architecture n Three-tier architecture Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -48

Application Architectures ODBC/JDBC n n Two-tier Architecture: e. g. client programs using ODBC/JDBC to Application Architectures ODBC/JDBC n n Two-tier Architecture: e. g. client programs using ODBC/JDBC to communicate with a database Three-tier Architecture: e. g. web-based applications, and applications built using “middleware” Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -49

1. 12 Database Users and Administrators User A 1 Host Language + DSL 1 1. 12 Database Users and Administrators User A 1 Host Language + DSL 1 User A 2 Host Language + DSL 2 External View @ # & External schema A External/conceptual mapping A < External schema B External View B External/conceptual mapping B Conceptual View Database management system Dictionary (DBMS) e. g. system catalog Conceptual/internal mapping DBA (Build and maintain schemas and mappings) User B 2 User B 3 Host C, C++ Language + DSL DSL (Data Sub. Language) e. g. SQL 3 Conceptual schema User B 1 Host Language + DSL Storage structure definition (Internal schema) Edited: Wei-Pang Yang, IM. NDHU Stored database (Internal View) 1 2 Source: Database System Concepts, Silberschatz etc. 2006 3 # . . . 100 & @ 1 -50

1. 12. 1 Database Users and User Interfaces n n n 單純的 n Application 1. 12. 1 Database Users and User Interfaces n n n 單純的 n Application programmers n interact with system through DML calls Sophisticated users 複雜, 多用途 n Submit query without write program n E. g. OLAP (Online analytical processing), data mining tools Specialized users n write specialized database applications that do not fit into the traditional data processing framework n E. g. CAD, expert system, complex data type (graphics, audio) Naive users (end user) n invoke one of the permanent application programs that have been written previously n E. g. people accessing database over the web, bank tellers, clerical staff 辦事員 Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -51

1. 12. 2 Database Administrator n n Database Administrator: n Coordinates all the activities 1. 12. 2 Database Administrator n n Database Administrator: n Coordinates all the activities of the database system; n has a good understanding of the enterprise’s information resources and needs. Database Administrator's Duties: n Schema definition n Storage structure and access method definition n Schema and physical organization modification n Granting of authorization for data access n Routine maintenance • Periodically backup database • Upgrade system e. g. disk • Monitoring performance n … Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -52

1. 13 History of Database Systems n n n 1950 s – early 1960: 1. 13 History of Database Systems n n n 1950 s – early 1960: n Tapes: sequentially n Application: Payroll, n Input: punched decks, Output: printer Late 1960 s -- 1970 s: n Disk: direct access n Codd proposed Relational Model, … Turing Award 1980 s: n System R: IBM Res. Lab. IBM DB 2, Oracle, Ingress, DEC Rdb n Replaced Network/Hierarchical model n Research: parallel database, distributed database, object-oriented, … Early 1990 s: n Parallel database n Object-Relational Late 1990 s: n World Wide Web was explosive growth n Database were used much more than ever before n Database had to support Web interfaces to data Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -53

History of Database Systems (補) 1950 -1965 Data Model Network Hierarchical 1980 -1989 1990 History of Database Systems (補) 1950 -1965 Data Model Network Hierarchical 1980 -1989 1990 -1995 Semantic Object-oriented Logic Relation proposed Relation Merging data models, knowledge-base Relation 1965 -1979 Network Hierarchical Database Hardware Mainframes User Interface None Forms DL/I COBOL+DL/I Program Interface Procedural Presentation and display processing Reports Processing data Edited: Wei-Pang Yang, IM. NDHU Procedural Reports Processing data Mainframes Minis PCs Graphics, Menus SQL, QUEL Query-by-forms Embedded Query non-Procedural Report generators Information and transaction processing 1995 -present Object-Oriented OO-relation XML Relation Faster PCs Workstations Parallel Database machines Optical memories Natural language Speech input WWW Web interface Integrated database 4 GL and programming Logic programming language Business graphics Image output Knowledge processing Source: Database System Concepts, Silberschatz etc. 2006 Multimedia 1 -54

計算機科學的諾貝爾獎 – 杜林獎 (趙坤茂) n n 象徵最崇高學術桂冠的諾貝爾獎,從 1901年開始頒發,根據瑞典發 明家諾貝爾的遺囑,設有物理、化學、生理醫學、文學及和平等五 個獎項;自 1969年起,增設了經濟學諾貝爾獎。 疑問: 為什麼諾貝爾獎沒有數學獎項呢?坊間流傳的說法是,當初 諾貝爾的夫人,曾經和瑞典一位很有成就的數學家米塔雷符勒有過 計算機科學的諾貝爾獎 – 杜林獎 (趙坤茂) n n 象徵最崇高學術桂冠的諾貝爾獎,從 1901年開始頒發,根據瑞典發 明家諾貝爾的遺囑,設有物理、化學、生理醫學、文學及和平等五 個獎項;自 1969年起,增設了經濟學諾貝爾獎。 疑問: 為什麼諾貝爾獎沒有數學獎項呢?坊間流傳的說法是,當初 諾貝爾的夫人,曾經和瑞典一位很有成就的數學家米塔雷符勒有過 一段婚外情,所以諾貝爾決定不設數學獎項。 英國數學家亞蘭杜林(Alan Turing,1912 -1954),雖然無緣在有生之 年得到諾貝爾獎,但後人為了紀念他在數位計算理論貢獻而設立的 杜林獎(Turing Award),已被公認是計算機科學領域最崇高的獎項。 杜林獎從 1966年開始頒發,受獎人都是對計算機科學有深遠影響的 大師級學者。例如,在計算複雜度理論上有卓越貢獻的庫克(Cook)、 C程式語言的創始人理奇(Ritchie)、Unix作業系統製作人湯普生 (Thompson)及資料庫管理系統的先驅卡德(Codd)等。 Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -55

計算機科學的諾貝爾獎 – 杜林獎 (cont. ) n n n 1936年時,杜林提出了一個假想性的計算 具,稱為杜林機器 (Turing machine),這個機器有一個長條型、無窮多格的儲存磁帶, 每一格位置是空白或一個符號;附帶在磁帶上的是一個可讀寫的磁 頭,它可以在磁帶的格子往左或往右,並在每次移動時讀、寫或擦 計算機科學的諾貝爾獎 – 杜林獎 (cont. ) n n n 1936年時,杜林提出了一個假想性的計算 具,稱為杜林機器 (Turing machine),這個機器有一個長條型、無窮多格的儲存磁帶, 每一格位置是空白或一個符號;附帶在磁帶上的是一個可讀寫的磁 頭,它可以在磁帶的格子往左或往右,並在每次移動時讀、寫或擦 拭該格子;還有一個有限狀態控制機,可運用狀態的改變,配合目 前磁頭所在的位置,來決定這些移動讀寫的動作。 這樣一個簡單的機器,它的運算功力竟然相當於今天的數位計算機, 換句話說,目前數位計算機可以運算的方法,我們都可以在杜林機 器上實現! 杜林也提出了如何決定電腦是否會“思考”的方法,也被視為人 智 慧研究領域的基石。 在二次世界大戰時,杜林曾發展一個可以破解德軍密碼的機器,不 過世人在戰爭結束二十五年後才知曉。 他也是馬拉松運動的好手,真是多才多藝的科學家。可惜他在 1954 年時就過世,只享年 42歲。 Edited: Wei-Pang Yang, IM. NDHU Source: Database System Concepts, Silberschatz etc. 2006 1 -56