Distributed DBMS Architecture
Tahir Rashid

Architecture
Defines the structure of the system
– components identified
– functions of each component defined
– interrelationships and interactions between components defined

Architecture
• Goal:
– present the issues that need to be addressed at design
– present a framework within which the design and implementation issues can be discussed
• Example of such a framework: the ISO/OSI 7-layer reference model for computer networks

Standardization
Reference model – a conceptual framework whose purpose is to divide standardization work into manageable pieces and to show at a general level how these pieces are related to one another.
A reference model can be described according to three different approaches:
• Based on components
• Based on functions
• Based on data

DBMS STANDARDIZATION
• Based on components. The components of the system are defined together with the interrelationships between components. A DBMS consists of a number of components, each of which provides some functionality.
• Based on functions. The different classes of users are identified and the functions that the system will perform for each class are defined. The system specifications within this category typically specify a hierarchical structure for the user classes. The ISO/OSI architecture falls into this category.

DBMS STANDARDIZATION
• Based on data. The different types of data are identified, and an architectural framework is specified which defines the functional units that will realize or use data according to these different views. This approach (also referred to as the datalogical approach) is claimed to be the preferable choice for standardization activities.

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
The ANSI / SPARC architecture is claimed to be based on the data organization. It recognizes three views of data: the external view, which is that of the user, who might be a programmer; the internal view, that of the system or machine; and the conceptual view, that of the enterprise. For each of these views, an appropriate schema definition is required.

[Figure: DBMS standardization, ANSI / SPARC architecture]

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
• At the lowest level of the architecture is the internal view, which deals with the physical definition and organization of data.
• At the other extreme is the external view, which is concerned with how users view the database.
• Between these two ends is the conceptual schema, which is an abstract definition of the database. It is the "real world" view of the enterprise being modeled in the database.

Conceptual Schema Definition
    RELATION EMP [
        KEY = {ENO}
        ATTRIBUTES = {
            ENO   : CHARACTER(9)
            ENAME : CHARACTER(15)
            TITLE : CHARACTER(10)
        }
    ]
    RELATION PAY [
        KEY = {TITLE}
        ATTRIBUTES = {
            TITLE : CHARACTER(10)
            SAL   : NUMERIC(6)
        }
    ]

Conceptual Schema Definition
    RELATION PROJ [
        KEY = {PNO}
        ATTRIBUTES = {
            PNO    : CHARACTER(7)
            PNAME  : CHARACTER(20)
            BUDGET : NUMERIC(7)
        }
    ]
    RELATION ASG [
        KEY = {ENO, PNO}
        ATTRIBUTES = {
            ENO  : CHARACTER(9)
            PNO  : CHARACTER(7)
            RESP : CHARACTER(10)
            DUR  : NUMERIC(3)
        }
    ]

Internal Schema Definition
    RELATION EMP [
        KEY = {ENO}
        ATTRIBUTES = {
            ENO   : CHARACTER(9)
            ENAME : CHARACTER(15)
            TITLE : CHARACTER(10)
        }
    ]
    INTERNAL_REL EMPL [
        INDEX ON E# CALL EMINX
        FIELD = {
            HEADER : BYTE(1)
            E#     : BYTE(9)
            ENAME  : BYTE(15)
            TIT    : BYTE(10)
        }
    ]
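
The mapping from the conceptual EMP relation to the internal EMPL record can be sketched as a fixed-width byte layout. This is only an illustration: the field widths are taken from the slide, while the space padding, the ASCII encoding, the zero header byte and the function names `pack_emp` and `unpack_emp` are assumptions made for the example.

```python
import struct

# Field widths copied from INTERNAL_REL EMPL:
# HEADER BYTE(1), E# BYTE(9), ENAME BYTE(15), TIT BYTE(10).
EMPL_LAYOUT = struct.Struct("1s9s15s10s")


def pack_emp(eno: str, ename: str, title: str, header: bytes = b"\x00") -> bytes:
    """Map one conceptual EMP tuple onto the fixed-width internal record.

    Space padding and the zero header byte are assumptions of this sketch.
    """
    return EMPL_LAYOUT.pack(
        header,
        eno.encode("ascii").ljust(9),
        ename.encode("ascii").ljust(15),
        title.encode("ascii").ljust(10),
    )


def unpack_emp(record: bytes) -> tuple[str, str, str]:
    """Recover (ENO, ENAME, TITLE) from an internal record."""
    _header, e_no, ename, tit = EMPL_LAYOUT.unpack(record)
    return tuple(f.decode("ascii").rstrip() for f in (e_no, ename, tit))


# Made-up sample tuple, not data from the slides.
record = pack_emp("E1", "J. Doe", "Engineer")
```

The record is always 1 + 9 + 15 + 10 = 35 bytes long, which is the kind of physical detail the internal schema fixes and the conceptual schema hides.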

External View Definition – Example 1
Create a BUDGET view from the PROJ relation
    CREATE VIEW BUDGET(PNAME, BUD)
    AS SELECT PNAME, BUDGET
       FROM PROJ

External View Definition – Example 2
Create a PAYROLL view from relations EMP and PAY
    CREATE VIEW PAYROLL (ENO, ENAME, SAL)
    AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
       FROM EMP, PAY
       WHERE EMP.TITLE = PAY.TITLE
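
To see the external view at work, the PAYROLL definition can be tried against an in-memory SQLite database using Python's standard `sqlite3` module. This is a sketch under assumptions: SQLite stands in for whatever DBMS the slides have in mind, the table definitions approximate the conceptual schema, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: base relations EMP and PAY (types approximated for SQLite).
conn.executescript("""
    CREATE TABLE EMP (ENO CHAR(9) PRIMARY KEY, ENAME CHAR(15), TITLE CHAR(10));
    CREATE TABLE PAY (TITLE CHAR(10) PRIMARY KEY, SAL NUMERIC(6));

    -- External level: the PAYROLL view from the slide.
    CREATE VIEW PAYROLL (ENO, ENAME, SAL) AS
        SELECT EMP.ENO, EMP.ENAME, PAY.SAL
        FROM EMP, PAY
        WHERE EMP.TITLE = PAY.TITLE;
""")

# Made-up sample rows, not data from the slides.
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [("E1", "J. Doe", "Engineer"), ("E2", "M. Smith", "Analyst")],
)
conn.executemany(
    "INSERT INTO PAY VALUES (?, ?)",
    [("Engineer", 40000), ("Analyst", 34000)],
)

# A user of the external schema queries the view without knowing about the join.
payroll = conn.execute("SELECT ENO, ENAME, SAL FROM PAYROLL ORDER BY ENO").fetchall()
```

The user of PAYROLL never sees the TITLE attribute or the PAY relation, which is the point of an external schema.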

[Figure: DBMS standardization, ANSI / SPARC architecture]

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
• The square boxes represent processing functions, whereas the hexagons are administrative roles.
• The arrows indicate data, command, program, and description flow, whereas the "I"-shaped bars on them represent interfaces.
• The major component that permits mapping between different data organizational views is the data dictionary / directory (depicted as a triangle), which is a meta-database.
• The database administrator is responsible for defining the internal schema.
• The enterprise administrator's role is to prepare the conceptual schema definition.
• The application administrator is responsible for preparing the external schema for applications.

DBMS STANDARDIZATION - ANSI / SPARC ARCHITECTURE
• Two more users:
– Application programmer
– System programmer
• Two user classes:
– Casual user
  • Retrieves data from the database and possibly updates it
  • Handled through the external schema
– Novice user
  • Typically has no knowledge of the database
  • Example: a banking machine

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs
The systems are characterized with respect to:
(1) the autonomy of the local systems,
(2) their distribution,
(3) their heterogeneity.

[Figure: Architectural models for distributed DBMSs]

Autonomy
• Distribution of control (and not data) - the degree of independence
– The local operations of the individual DBMSs are not affected by their participation in the multidatabase system
– The manner in which individual DBMSs process queries and optimize them should not be affected by the execution of global queries
– System consistency should not be compromised when individual DBMSs join or leave the multidatabase system

Autonomy
• Alternatively, the dimensions of autonomy can be specified as:
• Design autonomy: Ability of a component DBMS to decide on issues related to its own design.
• Communication autonomy: Ability of a component DBMS to decide whether and how to communicate with other DBMSs.
• Execution autonomy: Ability of a component DBMS to execute local operations in any manner it wants to.

Autonomy
• Possibilities:
– Tight integration – a single image of the entire database is available to any user who wants to share the information, which may reside in multiple databases.
– Semiautonomous systems – consist of DBMSs that can operate independently, but have decided to participate in a federation to make their local data sharable.
– Total isolation – the individual systems are standalone DBMSs, which know neither of the existence of other DBMSs nor how to communicate with them.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - DISTRIBUTION
Distribution refers to the distribution of data. We are considering the physical distribution of data over multiple sites; the user sees the data as one logical pool.
Two alternatives:
– client / server distribution
– peer-to-peer distribution (full distribution)

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - DISTRIBUTION
Client / server distribution. The client / server distribution concentrates data management duties at servers while the clients focus on providing the application environment including the user interface. The communication duties are shared between the client machines and servers. Client / server DBMSs represent the first attempt at distributing functionality.
Peer-to-peer distribution. There is no distinction of client machines versus servers. Each machine has full DBMS functionality and can communicate with other machines to execute queries and transactions.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - HETEROGENEITY
Heterogeneity may occur in various forms in distributed systems, ranging from hardware heterogeneity and differences in networking protocols to variations in data managers. Representing data with different modeling tools creates heterogeneity because of the inherent expressive powers and limitations of individual data models. Heterogeneity in query languages not only involves the use of completely different data access paradigms in different data models, but also covers differences in languages even when the individual systems use the same data model.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - HETEROGENEITY
• Various levels (hardware, communications, operating system)
• The DBMS level is the important one – data model, query language, transaction management algorithms
• Representing data with different modeling tools creates heterogeneity because of the inherent expressive power and limitations of individual data models.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
The dimensions are identified as: A (autonomy), D (distribution) and H (heterogeneity).
The alternatives along each dimension are identified by numbers as: 0, 1 or 2.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
A0 - tight integration
A1 - semiautonomous systems
A2 - total isolation
H0 - homogeneous systems
H1 - heterogeneous systems
D0 - no distribution
D1 - client / server systems
D2 - peer-to-peer systems

[Figure: Architectural models for distributed DBMSs]

Alternatives in Distributed Database Systems
[Figure: design space with the axes Distribution, Autonomy and Heterogeneity; labeled points: client/server, peer-to-peer distributed DBMS, federated DBMS, multi-DBMS, distributed multi-DBMS]

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
• Figure 4.3 identifies the two alternative architectures that are the focus of this book:
• (A0, D2, H0)
• (A2, D2, H1)
• Not all the architectures that are identified by this design space are meaningful.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
(A0, D0, H0) If there is no distribution or heterogeneity, the system is a set of multiple DBMSs that are logically integrated. Such systems can be given the generic name composite systems. No such examples are known, but they may be suitable for shared-everything multiprocessor systems.
(A0, D0, H1) If heterogeneity is introduced, one has multiple data managers that are heterogeneous but provide an integrated view to the user.
(A0, D1, H0) The more interesting case is where the database is distributed even though an integrated view of the data is provided to users (client / server distribution). This case was mentioned earlier and is discussed further below.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
(A0, D2, H0) The same type of transparency is provided to the user in a fully distributed environment. There is no distinction among clients and servers, each site providing identical functionality.
(A1, D0, H0) These are semiautonomous systems, which are commonly termed federated DBMSs. The component systems in a federated environment have significant autonomy in their execution, but their participation in the federation indicates that they are willing to cooperate with others in executing user requests that access multiple databases. An example may be multiple installations of a DBMS.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
(A1, D0, H1) These are systems that introduce heterogeneity as well as autonomy, what we might call a heterogeneous federated DBMS.
(A1, D1, H1) Systems of this type introduce distribution by placing component systems on different machines. They may be referred to as distributed, heterogeneous federated DBMSs.
(A2, D0, H0) Now we have full autonomy. These are multidatabase systems (MDBS). The components have no concept of cooperation. Without heterogeneity and distribution, an MDBS is an interconnected collection of autonomous databases.

ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
(A2, D0, H1) This case is realistic, maybe even more so than (A1, D0, H1), in that we always want to build applications which access data from multiple storage systems with different characteristics.
(A2, D1, H1) and (A2, D2, H1) These two cases are considered together because of the similarity of the problem. They both represent the case where the component databases that make up the MDBS are distributed over a number of sites - we call this the distributed MDBS.
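
The (A, D, H) design space walked through above can be captured as a small lookup table. The sketch below only encodes the names that appear on the slides; the function name `describe` and the wording used for points the slides leave unnamed are choices made for this example.

```python
# Alternatives along each dimension, numbered as on the slides.
AUTONOMY = {0: "tight integration", 1: "semiautonomous", 2: "total isolation"}
DISTRIBUTION = {0: "no distribution", 1: "client/server", 2: "peer-to-peer"}
HETEROGENEITY = {0: "homogeneous", 1: "heterogeneous"}

# Points of the design space that the slides name explicitly.
NAMED_ARCHITECTURES = {
    (0, 0, 0): "composite system",
    (0, 1, 0): "client/server distribution with an integrated view",
    (0, 2, 0): "peer-to-peer distributed DBMS",
    (1, 0, 0): "federated DBMS",
    (1, 0, 1): "heterogeneous federated DBMS",
    (1, 1, 1): "distributed, heterogeneous federated DBMS",
    (2, 0, 0): "multidatabase system (MDBS)",
    (2, 1, 1): "distributed MDBS",
    (2, 2, 1): "distributed MDBS",
}


def describe(a: int, d: int, h: int) -> str:
    """Return a readable label for the point (Aa, Dd, Hh) of the design space."""
    if a not in AUTONOMY or d not in DISTRIBUTION or h not in HETEROGENEITY:
        raise ValueError(f"not a point of the design space: ({a}, {d}, {h})")
    label = f"(A{a}, D{d}, H{h})"
    name = NAMED_ARCHITECTURES.get((a, d, h))
    if name is None:
        # Fall back to spelling out the three coordinates.
        return (
            f"{label}: {AUTONOMY[a]}, {DISTRIBUTION[d]}, "
            f"{HETEROGENEITY[h]} (not named on the slides)"
        )
    return f"{label}: {name}"
```

For instance, `describe(1, 0, 0)` gives the federated DBMS label, while a point such as (A1, D2, H0) falls back to its three coordinates.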

Datalogical Distributed DBMS Architecture
    ES1    ES2    ...    ESn
              GCS
    LCS1   LCS2   ...    LCSn
    LIS1   LIS2   ...    LISn
ES: External Schema
GCS: Global Conceptual Schema
LCS: Local Conceptual Schema
LIS: Local Internal Schema

Datalogical Multi-DBMS Architecture
[Figure: global external schemas GES1, GES2 ... GESn; the GCS; local external schemas LES11 ... LES1n and LESn1 ... LESnm; local conceptual schemas LCS1, LCS2 ... LCSn; local internal schemas LIS1, LIS2 ... LISn]
• GES: Global External Schema
• LES: Local External Schema
• LCS: Local Conceptual Schema
• LIS: Local Internal Schema

Distributed DBMS
• Distributed database requires distributed DBMS
• Functions of a distributed DBMS:
– Locate data with a distributed data dictionary
– Determine location from which to retrieve data and process query components
– DBMS translation between nodes with different local DBMSs (using middleware)
– Data consistency (via multiphase commit protocols)
– Global primary key control
– Scalability
– Security, concurrency, query optimization, failure recovery
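
As an illustration of the "multiphase commit" bullet, here is a minimal sketch of two-phase commit, one well-known protocol of that family. The slides do not say which protocol is meant, so the choice of two-phase commit, the `Participant` class and its `can_commit` flag are assumptions of this example; timeouts, logging and recovery are left out.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """A site taking part in a distributed transaction (hypothetical model)."""

    name: str
    can_commit: bool = True  # stands in for the site's local vote
    state: str = "INIT"

    def prepare(self) -> bool:
        # Phase 1: the site votes and, if it votes yes, promises to commit.
        self.state = "READY" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self) -> None:
        self.state = "COMMITTED"

    def abort(self) -> None:
        self.state = "ABORTED"


def two_phase_commit(participants: list[Participant]) -> str:
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1 (voting): collect a vote from every site.
    votes = [p.prepare() for p in participants]

    # Phase 2 (decision): broadcast the global decision to every site.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.abort()
    return "ABORT"
```

A single "no" vote forces every site to abort, which is how the protocol keeps copies of the data at different sites consistent.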

[Figure 13-10: Distributed DBMS architecture]

Local Transaction Steps
1. Application makes request to distributed DBMS
2. Distributed DBMS checks distributed data repository for location of data. Finds that it is local
3. Distributed DBMS sends request to local DBMS
4. Local DBMS processes request
5. Local DBMS sends results to application

[Figure 13-10 (cont.): Distributed DBMS architecture, showing local transaction steps 1 to 5]
Local transaction – all data stored locally

Global Transaction Steps
1. Application makes request to distributed DBMS
2. Distributed DBMS checks distributed data repository for location of data. Finds that it is remote
3. Distributed DBMS routes request to remote site
4. Distributed DBMS at remote site translates request for its local DBMS if necessary, and sends request to local DBMS
5. Local DBMS at remote site processes request
6. Local DBMS at remote site sends results to distributed DBMS at remote site
7. Remote distributed DBMS sends results back to originating site
8. Distributed DBMS at originating site sends results to application
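
The local and global transaction steps can be sketched as a small routing routine. Everything here is illustrative: the `Site` class, the dictionary used as the distributed data repository and the sample rows are assumptions, and the translation step between different local DBMSs is omitted.

```python
class Site:
    """One node: a distributed-DBMS layer in front of a local DBMS."""

    def __init__(self, name, tables, repository):
        self.name = name
        self.tables = tables          # data held by the local DBMS
        self.repository = repository  # distributed data repository: table -> site
        self.peers = {}

    def connect(self, other):
        """Make two sites known to each other."""
        self.peers[other.name] = other
        other.peers[self.name] = self

    def local_dbms(self, table):
        """The local DBMS processes the request."""
        return list(self.tables[table])

    def request(self, table):
        """Entry point used by the application (step 1 in both lists)."""
        trace = [f"{self.name}: look up {table} in the repository"]  # step 2
        location = self.repository[table]
        if location == self.name:
            # Local transaction: steps 3 to 5.
            trace.append(f"{self.name}: local DBMS processes request")
            return self.local_dbms(table), trace
        # Global transaction: steps 3 to 8.
        trace.append(f"{self.name}: route request to {location}")
        rows = self.peers[location].local_dbms(table)
        trace.append(f"{location}: local DBMS processes request")
        trace.append(f"{self.name}: return results to application")
        return rows, trace


# Made-up placement of two relations on two sites.
repository = {"EMP": "A", "PROJ": "B"}
site_a = Site("A", {"EMP": [("E1", "J. Doe")]}, repository)
site_b = Site("B", {"PROJ": [("P1", "Instrumentation")]}, repository)
site_a.connect(site_b)

local_rows, local_trace = site_a.request("EMP")
remote_rows, remote_trace = site_a.request("PROJ")
```

The application calls `request` the same way in both cases; only the trace shows whether the data was local or remote.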

[Figure 13-10 (cont.): Distributed DBMS architecture, showing global transaction steps 1 to 8]
Global transaction – some data is at remote site(s)

DISTRIBUTED DBMS ARCHITECTURE
• Client / server systems - (Ax, D1, Hy)
• Distributed databases - (A0, D2, H0)
• Multidatabase systems - (A2, Dx, Hy)

The Client/Server Database Environment

Client/Server Systems
• Networked computing model
• Processes distributed between clients and servers
• Client – Workstation (usually a PC) that requests and uses a service
• Server – Computer (PC/mini/mainframe) that provides a service
• For DBMS, server is a database server

Application Logic in C/S Systems
• Presentation Logic (GUI interface)
– Input – keyboard/mouse
– Output – monitor/printer
• Processing Logic (procedures, functions, programs)
– I/O processing
– Business rules
– Data management
• Storage Logic (DBMS activities)
– Data storage/retrieval

Client/Server Architectures
• File Server Architecture
• Database Server Architecture
• Three-tier Architecture

File Server Architecture
• All processing is done at the PC that requested the data
• Entire files are transferred from the server to the client for processing.
• Problems:
– Huge amount of data transfer on the network
– Each client must contain full DBMS
  • Heavy resource demand on clients
  • Client DBMSs must recognize shared locks, integrity checks, etc.

[Figure: File server architecture]

Database Server Architectures
• Two-tiered approach
• Client is responsible for
– I/O processing logic
– Some business rules logic
• Server performs all data storage and access processing
– DBMS is only on server
• Advantages
– Clients do not have to be as powerful
– Greatly reduces data traffic on the network
– Improved data integrity since it is all processed centrally
– Stored procedures: some business rules done on server
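
The difference in network traffic between the file server and database server approaches can be illustrated with a toy comparison. The sketch counts rows "sent over the network" instead of measuring real traffic; the 100 sample rows and both function names are made up for the example.

```python
# Made-up server-side file of 100 project rows.
ROWS = [{"pno": f"P{i}", "budget": 1000 * i} for i in range(1, 101)]


def file_server_fetch(predicate):
    """File server: the whole file is shipped, the client filters it."""
    transferred = list(ROWS)  # every row crosses the network
    result = [row for row in transferred if predicate(row)]
    return result, len(transferred)


def database_server_fetch(predicate):
    """Database server: the server filters, only matching rows are shipped."""
    result = [row for row in ROWS if predicate(row)]
    return result, len(result)


def wants(row):
    """Example query: projects with a budget above 90000."""
    return row["budget"] > 90000


fs_result, fs_sent = file_server_fetch(wants)
db_result, db_sent = database_server_fetch(wants)
```

Both calls return the same ten projects, but the file server ships all 100 rows to get them while the database server ships only the ten that match.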

Advantages of Stored Procedures
• Compiled SQL statements
• Reduced network traffic
• Improved security
• Improved data integrity

[Figure: Database server architecture; DBMS only on server]

Three-Tier Architectures
• Three layers:
– Client: GUI interface (I/O processing); browser
– Application server: business rules; web server
– Database server: data storage; DBMS
• Thin client: PC just for user interface and a little application processing. Limited or no data storage (sometimes no hard drive)

[Figure: Three-tier architecture; thinnest clients, business rules on separate server, DBMS only on DB server]

DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS
• This provides a two-level architecture, which makes it easier to manage the complexity of modern DBMSs and the complexity of distribution.
• The server does most of the data management work (query processing and optimization, transaction management, storage management).
• The client is the application and the user interface (management of the data that is cached at the client, management of the transaction locks).

DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS
• This architecture is quite common in relational systems where the communication between the clients and the server(s) is at the level of SQL statements.

[Figure: Distributed DBMS architecture, client / server systems]

DISTRIBUTED DBMS ARCHITECTURE - CLIENT / SERVER SYSTEMS
Multiple client - single server. From a data management perspective, this is not much different from centralized databases since the database is stored on only one machine (the server) which also hosts the software to manage it. However, there are some differences from centralized systems in the way transactions are executed and caches are managed.
Multiple client - multiple server. In this case, two alternative management strategies are possible: either each client manages its own connection to the appropriate server or each client knows of only its "home server" which then communicates with other servers as required.

Multiple Clients/Single Server
[Figure: client side (applications, client services, communications) sends high-level requests over a LAN; server side (communications, DBMS services, database) returns filtered data only]

Task Distribution
[Figure: client side (application, QL interface ... programmatic interface, communications manager) sends an SQL query and receives a result table; server side (communications manager, query optimizer, lock manager, storage manager, page & cache manager, database)]

Advantages of Client-Server Architectures
• More efficient division of labor
• Better price/performance on client machines
• Ability to use familiar tools on client machines
• Client access to remote data (via standards)
• Full DBMS functionality provided to client workstations
• Overall better system price/performance

Problems With Multiple Client/Single Server
• Server forms bottleneck
• Server forms single point of failure
• Database scaling difficult

Multiple Clients/Multiple Servers
[Figure: client (applications, client services, communications) connected over a LAN to several servers, each with communications, DBMS services and a database]
• directory
• caching
• query decomposition
• commit protocols

Server-to-Server
[Figure: client (applications, client services, communications) connected over a LAN to several servers, each with communications, DBMS services and a database]
• SQL interface
• programmatic interface
• other application support environments

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
• The physical data organization on each machine may be, and probably is, different.
• There therefore needs to be an individual internal schema at each site, called the local internal schema (LIS), and, for the enterprise view of the data, a global conceptual schema (GCS).

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
• Data independence is supported because the model is an extension of ANSI/SPARC.
• Location and replication transparency are supported by the local and global conceptual schemas (LCS and GCS).
• Network transparency is supported by the GCS.
• The DDBMS translates global queries into a group of local queries, which are executed by DDBMS components at different sites that communicate with one another.

[Figure: Distributed DBMS architecture, peer-to-peer distributed systems]

What is a Schema?
• In a relational database, the schema defines the tables, the fields in each table, and the relationships between fields and tables. Schemas are generally stored in a data dictionary. Although a schema is defined in a textual database language, the term is often used to refer to a graphical depiction of the database structure.

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
• The physical data organization on each machine may be different.
• Local internal schema (LIS) - is an individual internal schema definition at each site.
• Global conceptual schema (GCS) - describes the enterprise view of the data.
• Local conceptual schema (LCS) - describes the logical organization of data at each site.
• External schemas (ESs) - support user applications and user access to the database.

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
In this case, the ANSI/SPARC model is extended by the addition of a global directory / dictionary (GD/D) to permit the required global mappings. The local mappings are still performed by a local directory / dictionary (LD/D). The local database management components are integrated by means of global DBMS functions. Local conceptual schemas are mappings of the global schema onto each site.

[Figure: Distributed DBMS architecture, peer-to-peer distributed systems]

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
The detailed components of a distributed DBMS. Two major components:
– user processor
– data processor

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
User processor
• user interface handler - is responsible for interpreting user commands as they come in, and formatting the result data as it is sent to the user,
• semantic data controller - uses the integrity constraints and authorizations that are defined as part of the global conceptual schema to check if the user query can be processed,
• global query optimizer and decomposer - determines an execution strategy to minimize a cost function, and translates the global queries into local ones using the global and local conceptual schemas as well as the global directory,
• distributed execution monitor - coordinates the distributed execution of the user request.
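
The decomposer's job of turning a global query into local ones can be sketched as follows. The example assumes, purely for illustration, that the global relation EMP is split into two fragments stored at two sites; the fragment names, the dictionary standing in for the global directory and the sample rows are all hypothetical, and no cost-based optimization is attempted.

```python
# Hypothetical global directory: global relation -> (site, local relation) fragments.
GLOBAL_DIRECTORY = {"EMP": [("site1", "EMP1"), ("site2", "EMP2")]}

# Made-up local databases holding the fragments.
LOCAL_DATABASES = {
    "site1": {"EMP1": [{"ENO": "E1", "TITLE": "Engineer"},
                       {"ENO": "E2", "TITLE": "Analyst"}]},
    "site2": {"EMP2": [{"ENO": "E3", "TITLE": "Engineer"}]},
}


def decompose(relation):
    """Translate a query on a global relation into one local query per fragment."""
    return [
        {"site": site, "relation": local}
        for site, local in GLOBAL_DIRECTORY[relation]
    ]


def run_local(query, predicate):
    """Data processor at one site: evaluate the local query on the local fragment."""
    rows = LOCAL_DATABASES[query["site"]][query["relation"]]
    return [row for row in rows if predicate(row)]


def run_global(relation, predicate):
    """Execution monitor: run every local query and merge the partial results."""
    results = []
    for query in decompose(relation):
        results.extend(run_local(query, predicate))
    return results


engineers = run_global("EMP", lambda row: row["TITLE"] == "Engineer")
```

The caller names only the global relation EMP; which sites hold its fragments is resolved through the directory, which is what location transparency amounts to here.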

DISTRIBUTED DBMS ARCHITECTURE - PEER-TO-PEER DISTRIBUTED SYSTEMS
Data processor
• local query optimizer - is responsible for choosing the best access path to access any data item,
• local recovery manager - is responsible for making sure that the local database remains consistent even when failures occur,
• run-time support processor - physically accesses the database according to the physical commands in the schedule generated by the query optimizer. This is the interface to the operating system and contains the database buffer (or cache) manager, which is responsible for maintaining the main memory buffers and managing the data accesses.

Peer-to-Peer Component Architecture (figure)
USER PROCESSOR: User Interface Handler, Semantic Data Controller, Global Query Optimizer, Global Execution Monitor; it uses the External Schema, the Global Conceptual Schema, and the GD/D, receives user requests, and returns system responses to the user.
DATA PROCESSOR: Local Query Processor, Local Recovery Manager, Runtime Support Processor; it uses the Local Conceptual Schema, the System Log, and the Local Internal Schema to access the database.

DISTRIBUTED DBMS ARCHITECTURE MDBS ARCHITECTURE
Models using a Global Conceptual Schema (GCS)
The GCS is defined by integrating either the external schemas of local autonomous databases or parts of their local conceptual schemas. If heterogeneity exists in the system, then two implementation alternatives exist: unilingual and multilingual.
Models without a Global Conceptual Schema (GCS)
The existence of a global conceptual schema in a multidatabase system is a controversial issue. Some researchers even define a multidatabase management system as one that manages “several databases without the global schema”.

DISTRIBUTED DBMS ARCHITECTURE MDBS ARCHITECTURE - models using a GCS (figure)

DISTRIBUTED DBMS ARCHITECTURE MDBS ARCHITECTURE - models using a GCS
• A unilingual multi-DBMS requires users to work with possibly different data models and languages when they access both a local database and the global database.
• Any application that accesses data from multiple databases must do so by means of an external view that is defined on the global conceptual schema.
• One application may have a local external schema (LES) defined on the local conceptual schema as well as a global external schema (GES) defined on the global conceptual schema.

DISTRIBUTED DBMS ARCHITECTURE MDBS ARCHITECTURE - models using a GCS
• An alternative is the multilingual architecture, where the basic philosophy is to permit each user to access the global database by means of an external schema defined using the language of the user’s local DBMS.
• The multilingual approach makes querying the databases easier from the user’s perspective. However, it is more complicated to implement because queries must be translated at run time.

DISTRIBUTED DBMS ARCHITECTURE MDBS ARCHITECTURE - models without a GCS (figure)

DISTRIBUTED DBMS ARCHITECTURE MDBS ARCHITECTURE - models without a GCS
• The architecture identifies two layers: the local system layer and the multidatabase layer on top of it.
• The local system layer consists of a number of DBMSs, which present to the multidatabase layer the part of their local database they are willing to share with users of the other databases. This shared data is presented either as the actual local conceptual schema or as a local external schema definition.
• The multidatabase layer consists of a number of external views, each of which may be defined on one local conceptual schema or on multiple conceptual schemas. Thus the responsibility of providing access to multiple databases is delegated to the mapping between the external schemas and the local conceptual schemas.
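The two layers above can be sketched as follows. This is an illustrative toy, assuming in-memory tables: the DBMS names (`dbms_a`, `dbms_b`), relation names, attribute names, and view names are all hypothetical. The point it tries to show is that, without a GCS, each external view carries its own mapping to the local schemas it is defined on.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Local system layer: each hypothetical DBMS exports only the data it shares.
LOCAL_EXPORTS: Dict[str, Dict[str, List[Row]]] = {
    "dbms_a": {"EMP": [{"name": "Ann", "dept": "R&D"}]},
    "dbms_b": {"STAFF": [{"full_name": "Bob", "unit": "Sales"}]},
}


def view_a_employees() -> List[Row]:
    """External view defined on a single local conceptual schema."""
    return [dict(row) for row in LOCAL_EXPORTS["dbms_a"]["EMP"]]


def view_all_people() -> List[Row]:
    """External view defined on two local schemas.

    The mapping itself resolves the naming differences between the schemas,
    since there is no global conceptual schema to do so.
    """
    people = [{"name": r["name"], "dept": r["dept"]}
              for r in LOCAL_EXPORTS["dbms_a"]["EMP"]]
    people += [{"name": r["full_name"], "dept": r["unit"]}
               for r in LOCAL_EXPORTS["dbms_b"]["STAFF"]]
    return people


# Multidatabase layer: a collection of external views and nothing else.
MULTIDATABASE_LAYER: Dict[str, Callable[[], List[Row]]] = {
    "a_employees": view_a_employees,
    "all_people": view_all_people,
}


def query_view(name: str) -> List[Row]:
    """Evaluate an external view by name."""
    return MULTIDATABASE_LAYER[name]()
```

Adding a third local DBMS would require editing every view that should see its data, which is the cost of delegating integration to the view mappings.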

DISTRIBUTED DBMS ARCHITECTURE MDBS ARCHITECTURE - models without a GCS
The MDBS provides a layer of software that runs on top of the individual DBMSs and gives users the facilities for accessing the various databases. The figure represents a nondistributed multi-DBMS. If the system is distributed, the multidatabase layer must be replicated at each site where a local DBMS participates in the system.