2269c67a82543024df4b1ee088045dcf.ppt
- Number of slides: 62
Distributed DBMS Architecture
Architecture Defines the structure of the system – components identified – functions of each component defined – interrelationships and interactions between components defined DDBMS Architecture 2
Architecture
• Goal:
  – present the issues that need to be addressed at design
  – present a framework within which the design and implementation issues can be discussed
• The ISO/OSI 7-layer reference model for computer networks
Standardization
Reference Model – A conceptual framework whose purpose is to divide standardization work into manageable pieces and to show at a general level how these pieces are related to one another.
A reference model can be described according to three different approaches:
• Based on components
• Based on functions
• Based on data
DBMS STANDARDIZATION
• Based on components. The components of the system are defined together with the interrelationships between components. A DBMS consists of a number of components, each of which provides some functionality.
• Based on functions. The different classes of users are identified and the functions that the system will perform for each class are defined. The system specifications within this category typically specify a hierarchical structure for the user classes. The ISO/OSI architecture falls into this category.
DBMS STANDARDIZATION
• Based on data. The different types of data are identified, and an architectural framework is specified which defines the functional units that will realize or use data according to these different views. This approach (also referred to as the datalogical approach) is claimed to be the preferable choice for standardization activities.
DBMS STANDARDIZATION ANSI / SPARC ARCHITECTURE
The ANSI / SPARC architecture is claimed to be based on the data organization. It recognizes three views of data: the external view, which is that of the user, who might be a programmer; the internal view, that of the system or machine; and the conceptual view, that of the enterprise. For each of these views, an appropriate schema definition is required.
DBMS STANDARDIZATION ANSI / SPARC ARCHITECTURE
DBMS STANDARDIZATION ANSI / SPARC ARCHITECTURE
• At the lowest level of the architecture is the internal view, which deals with the physical definition and organization of data.
• At the other extreme is the external view, which is concerned with how users view the database.
• Between these two ends is the conceptual schema, which is an abstract definition of the database. It is the "real world" view of the enterprise being modeled in the database.
Conceptual Schema Definition
RELATION EMP [
    KEY = {ENO}
    ATTRIBUTES = {
        ENO   : CHARACTER(9)
        ENAME : CHARACTER(15)
        TITLE : CHARACTER(10)
    }
]
RELATION PAY [
    KEY = {TITLE}
    ATTRIBUTES = {
        TITLE : CHARACTER(10)
        SAL   : NUMERIC(6)
    }
]
Conceptual Schema Definition
RELATION PROJ [
    KEY = {PNO}
    ATTRIBUTES = {
        PNO    : CHARACTER(7)
        PNAME  : CHARACTER(20)
        BUDGET : NUMERIC(7)
    }
]
RELATION ASG [
    KEY = {ENO, PNO}
    ATTRIBUTES = {
        ENO  : CHARACTER(9)
        PNO  : CHARACTER(7)
        RESP : CHARACTER(10)
        DUR  : NUMERIC(3)
    }
]
Internal Schema Definition
RELATION EMP [
    KEY = {ENO}
    ATTRIBUTES = {
        ENO   : CHARACTER(9)
        ENAME : CHARACTER(15)
        TITLE : CHARACTER(10)
    }
]
INTERNAL_REL EMPL [
    INDEX ON E# CALL EMINX
    FIELD = {
        HEADER : BYTE(1)
        E#     : BYTE(9)
        ENAME  : BYTE(15)
        TIT    : BYTE(10)
    }
]
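As a rough illustration of what the internal schema describes, the fixed-length record of INTERNAL_REL EMPL (a 1-byte header followed by 9-, 15- and 10-byte fields) can be sketched with Python's struct module. The function names, the space padding and the header value are assumptions made for this example; the slide does not specify them.

```python
import struct

# Layout taken from the FIELD list: HEADER BYTE(1), E# BYTE(9),
# ENAME BYTE(15), TIT BYTE(10). Padding and header value are assumptions.
EMPL_RECORD = struct.Struct("=c9s15s10s")


def pack_emp(eno: str, ename: str, title: str, header: bytes = b"\x00") -> bytes:
    """Encode one conceptual EMP tuple as a fixed-length internal record."""
    return EMPL_RECORD.pack(
        header,
        eno.encode("ascii").ljust(9),     # pad with spaces to the field width
        ename.encode("ascii").ljust(15),
        title.encode("ascii").ljust(10),
    )


def unpack_emp(record: bytes) -> tuple:
    """Decode an internal record back into (ENO, ENAME, TITLE)."""
    _header, eno, ename, title = EMPL_RECORD.unpack(record)
    return tuple(f.decode("ascii").rstrip() for f in (eno, ename, title))
```

The mapping between the two functions plays the role of the conceptual-to-internal mapping: applications see ENO, ENAME and TITLE, while storage sees 35 bytes per record.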
External View Definition – Example 1
Create a BUDGET view from the PROJ relation
CREATE VIEW BUDGET(PNAME, BUD)
AS SELECT PNAME, BUDGET
   FROM   PROJ
External View Definition – Example 2
Create a PAYROLL view from relations EMP and PAY
CREATE VIEW PAYROLL (ENO, ENAME, SAL)
AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
   FROM   EMP, PAY
   WHERE  EMP.TITLE = PAY.TITLE
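The two view definitions can be tried out with a minimal sketch using Python's built-in sqlite3 module. The column types are simplified to SQLite's TEXT and NUMERIC, and the sample rows are made up for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP  (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    CREATE TABLE PAY  (TITLE TEXT PRIMARY KEY, SAL NUMERIC);
    CREATE TABLE PROJ (PNO TEXT PRIMARY KEY, PNAME TEXT, BUDGET NUMERIC);

    -- The two external views from Examples 1 and 2.
    CREATE VIEW BUDGET (PNAME, BUD) AS
        SELECT PNAME, BUDGET FROM PROJ;
    CREATE VIEW PAYROLL (ENO, ENAME, SAL) AS
        SELECT EMP.ENO, EMP.ENAME, PAY.SAL
        FROM EMP, PAY
        WHERE EMP.TITLE = PAY.TITLE;

    -- Made-up sample rows, for illustration only.
    INSERT INTO EMP  VALUES ('E1', 'J. Doe', 'Engineer');
    INSERT INTO PAY  VALUES ('Engineer', 40000);
    INSERT INTO PROJ VALUES ('P1', 'Instrumentation', 150000);
""")

# Users of the external schema query the views, not the base relations.
payroll_rows = conn.execute("SELECT ENO, ENAME, SAL FROM PAYROLL").fetchall()
budget_rows = conn.execute("SELECT PNAME, BUD FROM BUDGET").fetchall()
```

A payroll application written against PAYROLL never needs to know that salaries live in a separate PAY relation.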
DBMS STANDARDIZATION ANSI / SPARC ARCHITECTURE
DBMS STANDARDIZATION ANSI / SPARC ARCHITECTURE
• The square boxes represent processing functions, whereas the hexagons are administrative roles.
• The arrows indicate data, command, program, and description flow, whereas the "I"-shaped bars on them represent interfaces.
• The major component that permits mapping between different data organizational views is the data dictionary / directory (depicted as a triangle), which is a meta-database.
• The database administrator is responsible for defining the internal schema.
• The enterprise administrator's role is to prepare the conceptual schema definition.
• The application administrator is responsible for preparing the external schema for applications.
DBMS STANDARDIZATION ANSI / SPARC ARCHITECTURE
• Two more users:
  – Application programmer
  – System programmer
• Two user classes:
  – Casual user
    • Retrieves data and possibly updates it
    • Added in the external schema
  – Novice user
    • Typically has no knowledge of the database
    • Example: a banking machine
ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs
The systems are characterized with respect to:
(1) the autonomy of the local systems,
(2) their distribution,
(3) their heterogeneity.
Architectural models for Distributed DBMSs
Autonomy
• Distribution of control (and not data): the degree of independence
  – The local operations of the individual DBMSs are not affected by their participation in the multidatabase system
  – The manner in which individual DBMSs process queries and optimize them should not be affected by the execution of global queries
  – System consistency should not be compromised when individual DBMSs join or leave the multidatabase system
Autonomy
• Alternatively, the dimensions of autonomy can be specified as:
  – Design autonomy: Ability of a component DBMS to decide on issues related to its own design.
  – Communication autonomy: Ability of a component DBMS to decide whether and how to communicate with other DBMSs.
  – Execution autonomy: Ability of a component DBMS to execute local operations in any manner it wants to.
ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - DISTRIBUTION
Distribution refers to the distribution of data. We are considering the physical distribution of data over multiple sites; the user sees the data as one logical pool.
Two alternatives:
  – client / server distribution
  – peer-to-peer distribution (full distribution)
ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - DISTRIBUTION
Client / server distribution. The client / server distribution concentrates data management duties at servers while the clients focus on providing the application environment including the user interface. The communication duties are shared between the client machines and servers. Client / server DBMSs represent the first attempt at distributing functionality.
Peer-to-peer distribution. There is no distinction of client machines versus servers. Each machine has full DBMS functionality and can communicate with other machines to execute queries and transactions.
ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - HETEROGENEITY
Heterogeneity may occur in various forms in distributed systems, ranging from hardware heterogeneity and differences in networking protocols to variations in data managers. Representing data with different modeling tools creates heterogeneity because of the inherent expressive powers and limitations of individual data models. Heterogeneity in query languages not only involves the use of completely different data access paradigms in different data models, but also covers differences in languages even when the individual systems use the same data model.
ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - HETEROGENEITY
• Occurs at various levels (hardware, communications, operating system)
• The DBMS level is the important one: data model, query language, transaction management algorithms
• Representing data with different modeling tools creates heterogeneity because of the inherent expressive power and limitations of individual data models.
ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
The dimensions are identified as: A (autonomy), D (distribution) and H (heterogeneity). The alternatives along each dimension are identified by numbers as: 0, 1 or 2.
ARCHITECTURAL MODELS FOR DISTRIBUTED DBMSs - ALTERNATIVES
A0 - tight integration
A1 - semiautonomous systems
A2 - total isolation
H0 - homogeneous systems
H1 - heterogeneous systems
D0 - no distribution
D1 - client / server systems
D2 - peer-to-peer systems
Alternatives in Distributed Database Systems
[Figure: design space with three axes (Distribution, Autonomy, Heterogeneity). Labeled points: client/server, peer-to-peer distributed DBMS, distributed multi-DBMS, multi-DBMS, federated DBMS.]
Datalogical Multi-DBMS Architecture
[Figure: global external schemas GES1 … GESn and the GCS at the top; local external schemas LES11 … LESnm, local conceptual schemas LCS1 … LCSn and local internal schemas LIS1 … LISn below.]
• GES: Global External Schema
• LES: Local External Schema
• LCS: Local Conceptual Schema
• LIS: Local Internal Schema
Distributed DBMS
• A distributed database requires a distributed DBMS
• Functions of a distributed DBMS:
  – Locate data with a distributed data dictionary
  – Determine location from which to retrieve data and process query components
  – DBMS translation between nodes with different local DBMSs (using middleware)
  – Data consistency (via multiphase commit protocols)
  – Global primary key control
  – Scalability
  – Security, concurrency, query optimization, failure recovery
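The "multiphase commit protocols" item can be illustrated with a minimal two-phase commit sketch. The class and function names are hypothetical, and the sketch deliberately leaves out everything a real protocol needs beyond the voting logic: logging, timeouts, and recovery after coordinator or site failure.

```python
class Participant:
    """A hypothetical site taking part in a distributed transaction."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list) -> bool:
    """Return True if the transaction committed at every site."""
    # Phase 1 (voting): ask every site whether it can commit.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only on a unanimous yes, otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

The point of the two phases is that no site commits until every site has promised it can, so the copies of the data never end up half-updated.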
Figure 13-10 – Distributed DBMS architecture
Local Transaction Steps
1. Application makes request to distributed DBMS
2. Distributed DBMS checks distributed data repository for location of data. Finds that it is local
3. Distributed DBMS sends request to local DBMS
4. Local DBMS processes request
5. Local DBMS sends results to application
Figure 13-10: Distributed DBMS architecture (cont.), showing local transaction steps 1 to 5
Local transaction – all data stored locally
Global Transaction Steps
1. Application makes request to distributed DBMS
2. Distributed DBMS checks distributed data repository for location of data. Finds that it is remote
3. Distributed DBMS routes request to remote site
4. Distributed DBMS at remote site translates request for its local DBMS if necessary, and sends request to local DBMS
5. Local DBMS at remote site processes request
6. Local DBMS at remote site sends results to distributed DBMS at remote site
7. Remote distributed DBMS sends results back to originating site
8. Distributed DBMS at originating site sends results to application
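The local and global transaction steps can be sketched as a small routing routine. All class and attribute names here are hypothetical, the data repository is reduced to a table-to-site dictionary, and the translation between different local DBMSs (step 4) is omitted.

```python
class LocalDBMS:
    """Stand-in for the local DBMS at one site (hypothetical)."""

    def __init__(self, tables: dict):
        self.tables = tables

    def process(self, table: str) -> list:
        return self.tables[table]


class DistributedDBMS:
    """One site's distributed DBMS layer with a copy of the data repository."""

    def __init__(self, site: str, local: LocalDBMS, repository: dict):
        self.site = site
        self.local = local
        self.repository = repository  # table name -> site that stores it
        self.peers = {}               # site name -> DistributedDBMS
        self.trace = []               # routing decisions, for inspection

    def request(self, table: str) -> list:
        # Step 2: check the distributed data repository for the location.
        location = self.repository[table]
        if location == self.site:
            # Local transaction: hand the request to the local DBMS.
            self.trace.append("local")
            return self.local.process(table)
        # Global transaction: route the request to the remote site.
        self.trace.append(f"remote:{location}")
        return self.peers[location].handle_remote(table)

    def handle_remote(self, table: str) -> list:
        # The remote distributed DBMS passes the request to its local DBMS
        # and sends the results back to the originating site.
        return self.local.process(table)
```

In both cases the application makes the same call; only the repository lookup decides whether the request stays at the site or travels to another one.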
Figure 13-10: Distributed DBMS architecture (cont.), showing global transaction steps 1 to 8
Global transaction – some data is at remote site(s)
DISTRIBUTED DBMS ARCHITECTURE
• Client / server systems - (Ax, D1, Hy)
• Distributed databases - (A0, D2, H0)
• Multidatabase systems - (A2, Dx, Hy)
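The three patterns can be read as a lookup over points (A, D, H) in the design space. The sketch below encodes only these three patterns and treats x and y as wildcards, which is an interpretation made for this example; the function name is hypothetical.

```python
def classify(autonomy: int, distribution: int, heterogeneity: int) -> list:
    """Return the architecture classes that match a point (A, D, H)."""
    if (autonomy not in (0, 1, 2)
            or distribution not in (0, 1, 2)
            or heterogeneity not in (0, 1)):
        raise ValueError("expected A in 0..2, D in 0..2, H in 0..1")
    matches = []
    if distribution == 1:
        matches.append("client/server system")   # (Ax, D1, Hy)
    if (autonomy, distribution, heterogeneity) == (0, 2, 0):
        matches.append("distributed database")   # (A0, D2, H0)
    if autonomy == 2:
        matches.append("multidatabase system")   # (A2, Dx, Hy)
    return matches
```

Because two of the patterns contain wildcards, a single point can match more than one class, and many points in the space match none of the three.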
The Client/Server Database Environment
Client/Server Systems
• Networked computing model
• Processes distributed between clients and servers
• Client – Workstation (usually a PC) that requests and uses a service
• Server – Computer (PC/mini/mainframe) that provides a service
• For DBMS, server is a database server
Application Logic in C/S Systems
• Presentation Logic (GUI interface)
  – Input – keyboard/mouse
  – Output – monitor/printer
• Processing Logic (procedures, functions, programs)
  – I/O processing
  – Business rules
  – Data management
• Storage Logic (DBMS activities)
  – Data storage/retrieval
Client/Server Architectures
• File Server Architecture
• Database Server Architecture
• Three-tier Architecture
File Server Architecture
• All processing is done at the PC that requested the data
• Entire files are transferred from the server to the client for processing
• Problems:
  – Huge amount of data transfer on the network
  – Each client must contain full DBMS
    • Heavy resource demand on clients
    • Client DBMSs must recognize shared locks, integrity checks, etc.
File Server Architecture
Database Server Architectures
• 2-tiered approach
• Client is responsible for
  – I/O processing logic
  – Some business rules logic
• Server performs all data storage and access processing; the DBMS is only on the server
• Advantages
  – Clients do not have to be as powerful
  – Greatly reduces data traffic on the network
  – Improved data integrity since it is all processed centrally
  – Stored procedures: some business rules are executed on the server
Advantages of Stored Procedures
• Compiled SQL statements
• Reduced network traffic
• Improved security
• Improved data integrity
Database server architecture: DBMS only on server
Three-Tier Architectures
• Three layers:
  – Client: GUI interface (I/O processing), e.g. a browser
  – Application server: business rules, e.g. a web server
  – Database server: data storage, e.g. a DBMS
• Thin client: PC just for user interface and a little application processing. Limited or no data storage (sometimes no hard drive)
Three-tier architecture: thinnest clients, business rules on a separate server, DBMS only on the DB server
DISTRIBUTED DBMS ARCHITECTURE CLIENT / SERVER SYSTEMS
• This provides a two-level architecture, which makes it easier to manage the complexity of modern DBMSs and the complexity of distribution.
• The server does most of the data management work (query processing and optimization, transaction management, storage management).
• The client hosts the application and the user interface, and also manages the data cached at the client and the transaction locks.
DISTRIBUTED DBMS ARCHITECTURE CLIENT / SERVER SYSTEMS
• This architecture is quite common in relational systems, where the communication between the clients and the server(s) is at the level of SQL statements.
DISTRIBUTED DBMS ARCHITECTURE CLIENT / SERVER SYSTEMS
Multiple client - single server. From a data management perspective, this is not much different from centralized databases since the database is stored on only one machine (the server) which also hosts the software to manage it. However, there are some differences from centralized systems in the way transactions are executed and caches are managed.
Multiple client - multiple server. In this case, two alternative management strategies are possible: either each client manages its own connection to the appropriate server or each client knows of only its "home server" which then communicates with other servers as required.
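The two multiple client - multiple server strategies can be contrasted in a minimal sketch. The class names are hypothetical, each server is reduced to a dictionary of tables, and networking is replaced by direct method calls.

```python
class Server:
    """A database server that stores some tables and knows its peer servers."""

    def __init__(self, name: str, tables: dict):
        self.name = name
        self.tables = tables
        self.peers = []

    def query(self, table: str, forward: bool = True) -> list:
        if table in self.tables:
            return self.tables[table]
        if forward:
            # Home-server strategy: ask other servers on the client's behalf.
            for peer in self.peers:
                if table in peer.tables:
                    return peer.query(table, forward=False)
        raise KeyError(table)


class DirectClient:
    """Strategy 1: the client manages its own connection to each server."""

    def __init__(self, directory: dict):
        self.directory = directory  # table name -> Server holding it

    def query(self, table: str) -> list:
        return self.directory[table].query(table, forward=False)


class HomeServerClient:
    """Strategy 2: the client knows only its home server."""

    def __init__(self, home: Server):
        self.home = home

    def query(self, table: str) -> list:
        return self.home.query(table)
```

The first strategy puts the directory on every client (a heavier client), while the second keeps clients light and concentrates the location knowledge in the servers.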
Multiple Clients/Single Server
[Figure: client with applications, client services and communications; server with communications, DBMS services and the database; the two are connected by a LAN. High-level requests go to the server and filtered data only comes back.]
Task Distribution
[Figure: client with application, QL interface … programmatic interface and communications manager; SQL queries go to the server and result tables come back; server with communications manager, query optimizer, lock manager, storage manager, page & cache manager and the database.]
Advantages of Client-Server Architectures
• More efficient division of labor
• Better price/performance on client machines
• Ability to use familiar tools on client machines
• Client access to remote data (via standards)
• Full DBMS functionality provided to client workstations
• Overall better system price/performance
Problems With Multiple Client/Single Server
• Server forms bottleneck
• Server forms single point of failure
• Database scaling difficult
Multiple Clients/Multiple Servers
• directory
• caching
• query decomposition
• commit protocols
[Figure: client with applications, client services and communications, connected over a LAN to several servers, each with communications, DBMS services and a database.]
Server-to-Server
• SQL interface
• programmatic interface
• other application support environments
[Figure: client with applications, client services and communications, connected over a LAN to a server with communications, DBMS services and a database.]
DISTRIBUTED DBMS ARCHITECTURE PEER-TO-PEER DISTRIBUTED SYSTEMS
DISTRIBUTED DBMS ARCHITECTURE PEER-TO-PEER DISTRIBUTED SYSTEMS
• The physical data organization on each machine may be, and probably is, different.
• Each site therefore needs its own internal schema, called the local internal schema (LIS), and the enterprise view of the data is described by the global conceptual schema (GCS).
DISTRIBUTED DBMS ARCHITECTURE PEER-TO-PEER DISTRIBUTED SYSTEMS
• Data independence follows from the ANSI/SPARC model.
• Location and replication transparency are supported by the LCSs and the GCS.
• Network transparency is supported by the GCS.
• The DDBMS translates global queries into a group of local queries, which are executed by DDBMS components at different sites that communicate with one another.
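The translation of a global query into local queries can be sketched with one in-memory SQLite database per site. This assumes, purely for illustration, that EMP is split horizontally so that each site stores some of its rows; the site names, sample rows and function names are made up, and no optimization or replication is modeled.

```python
import sqlite3


def make_site(rows: list) -> sqlite3.Connection:
    """Create one site's local database holding a fragment of EMP."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EMP (ENO TEXT, ENAME TEXT, TITLE TEXT)")
    conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", rows)
    return conn


def global_select_by_title(sites: dict, title: str) -> list:
    """Run one global query as a group of local queries and merge the results."""
    merged = []
    for name in sorted(sites):
        # The same local query is executed at every site holding a fragment.
        local_rows = sites[name].execute(
            "SELECT ENO, ENAME FROM EMP WHERE TITLE = ? ORDER BY ENO", (title,)
        ).fetchall()
        merged.extend(local_rows)
    return merged
```

The caller asks about EMP as if it were a single relation and never names a site, which is the location transparency the global conceptual schema is meant to provide.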
DISTRIBUTED DBMS ARCHITECTURE PEER-TO-PEER DISTRIBUTED SYSTEMS
What is a Schema?
• In a relational database, the schema defines the tables, the fields in each table, and the relationships between fields and tables. Schemas are generally stored in a data dictionary. Although a schema is defined in a textual database language, the term is often used to refer to a graphical depiction of the database structure.
Tahir Rashid
DISTRIBUTED DBMS ARCHITECTURE PEER-TO-PEER DISTRIBUTED SYSTEMS
• The physical data organization on each machine may be different.
• Local internal schema (LIS) - the individual internal schema definition at each site.
• Global conceptual schema (GCS) - describes the enterprise view of the data.
• Local conceptual schema (LCS) - describes the logical organization of data at each site.
• External schemas (ESs) - support user applications and user access to the database.