
e048903541691f2d6a8165d613727ff2.ppt
- Number of slides: 96
Fundamentals of Distributed Systems
Jim Gray, Researcher, Microsoft Corp., Gray@Microsoft.com
Prof. Andreas Reuter, Professor, U. Stuttgart, Reuter@Informatik.uni-stuttgart.de
Outline
Concepts and Terminology
- Why Distributed
- Distributed data & objects
- Distributed execution
- Three tier architectures
- Transaction concepts
Goal: What you need to know to understand Microsoft Transaction Server (or CORBA or …)
What’s a Distributed System?
- Centralized:
  - everything in one place
  - stand-alone PC or Mainframe
- Distributed:
  - some parts remote
  - distributed users
  - distributed execution
  - distributed data
Why Distribute?
- No best organization
- Companies constantly swing between
  - Centralized: focus, control, economy
  - Decentralized: adaptive, responsive, competitive
- Why distribute?
  - reflect organization or application structure
  - empower users / producers
  - improve service (response / availability)
  - distributed load
  - use PC technology (economics)
What Should Be Distributed?
- Users and User Interface
  - Thin client
- Processing
  - Trim client
- Data
  - Fat client
- Will discuss tradeoffs later
[Diagram layers: Presentation, workflow, Business Objects, Database]
Transparency in Distributed Systems
- Make distributed system as easy to use and manage as a centralized system
- Give a Single-System Image
- Location transparency:
  - hide fact that object is remote
  - hide fact that object has moved
  - hide fact that object is partitioned or replicated
- Name doesn’t change if object is replicated, partitioned or moved.
Naming: The Basics
- Objects have
  - Globally Unique Identifiers (GUIDs)
  - location(s) = address(es)
  - name(s)
  - addresses can change
  - objects can have many names
- Names are context dependent:
  - (Jim @ KGB not the same as Jim @ CIA)
- Many naming systems
  - UNC: \\node\device\dir\dir\object
  - Internet: http://node.domain.root/dir/dir/object
  - LDAP: ldap://ldap.domain.root/o=org,c=US,cn=dir
[Diagram labels: Address, guid, Jim, James]
Name Servers in Distributed Systems
- Name servers translate names + context to address (+ GUID)
- Name servers are partitioned (subtrees of name space)
- Name servers replicate root of name tree
- Name servers form a hierarchy
- Distributed data from hell:
  - high read traffic
  - high reliability & availability
  - autonomy
[Diagram: North root with Northern names; South root with Southern names]
Autonomy in Distributed Systems
- Owner of site (or node, or application, or database) wants to control it
- If my part is working, I must be able to access & manage it (reorganize, upgrade, add user, …)
- Autonomy is
  - Essential
  - Difficult to implement
  - In conflict with global consistency
- Examples: naming, authentication, admin…
Security: The Basics
- Authentication server
  - subject + Authenticator => (Yes + token) | No
- Security matrix:
  - who can do what to whom
  - Access control list is column of matrix
  - “who” is authenticated ID
- In a distributed system, “who” and “what” and “whom” are distributed objects
[Diagram: security matrix with axes subject and Object; cells hold Permissions]
Security in Distributed Systems
- Security domain: nodes with a shared security server.
- Security domains can have trust relationships:
  - A trusts B: A “believes” B when it says this is Jim@B
- Security domains form a hierarchy.
- Delegation: passing authority to a server. When A asks B to do something (e.g. print a file, read a database), B may need A’s authority
- Autonomy requires:
  - each node is an authenticator
  - each node does own security checks
- Internet today:
  - no trust among domains (firewalls, many passwords)
  - trust based on digital signatures
Clusters: The Ideal Distributed System
- Cluster is distributed system BUT single
  - location
  - manager
  - security policy
- relatively homogeneous
- communications is
  - high bandwidth
  - low latency
  - low error rate
- Clusters use distributed system techniques for
  - load distribution
    - storage
    - execution
  - growth
  - fault tolerance
Cluster: Shared What?
- Shared Memory Multiprocessor
  - Multiple processors, one memory
  - all devices are local
  - DEC or SGI or Sequent 16x nodes
- Shared Disk Cluster
  - an array of nodes
  - all shared common disks
  - VAXcluster + Oracle
- Shared Nothing Cluster
  - each device local to a node
  - ownership may change
  - Tandem, SP2, Wolfpack
Outline
Concepts and Terminology
- Why Distribute
- Distributed data & objects
  - Partitioned
  - Replicated
- Distributed execution
- Three tier architectures
- Transaction concepts
Partitioned Data
Break file into disjoint groups
- Exploit data access locality
  - Put data near consumer
  - Less network traffic
  - Better response time
  - Better availability
  - Owner controls data autonomy
- Spread Load
  - data or traffic may exceed single store
[Diagram: Orders file partitioned into N.A., S.A., Europe, Asia]
How to Partition Data?
- How to Partition
  - by attribute or
  - random or
  - by source or
  - by use
- Problem: to find it must have
  - Directory (replicated) or
  - Algorithm
- Encourages attribute-based partitioning
[Diagram: partitions N.A., S.A., Europe, Asia]
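The two ways of finding a partition named on this slide, a directory or an algorithm, can be sketched roughly as follows. This is a minimal illustration, not anything from the deck: the country-to-region table, the function names, and the choice of CRC32 as the hash are all assumptions.

```python
import zlib

PARTITIONS = ["N.A.", "S.A.", "Europe", "Asia"]

# Directory approach: a (replicated) lookup table from an attribute value
# to the partition that owns it. The entries are hypothetical examples.
DIRECTORY = {"US": "N.A.", "Brazil": "S.A.", "Germany": "Europe", "Japan": "Asia"}


def partition_by_directory(order):
    """Attribute-based partitioning: look the owning partition up by country."""
    return DIRECTORY[order["country"]]


def partition_by_algorithm(order_id):
    """Algorithmic partitioning: hash the key and take it modulo the partition count.

    zlib.crc32 is used because it gives the same value on every run,
    unlike Python's built-in hash() on strings.
    """
    return PARTITIONS[zlib.crc32(str(order_id).encode()) % len(PARTITIONS)]
```

The directory keeps related data together (all Japanese orders land in Asia) at the cost of maintaining and replicating the table; the hash needs no table but scatters related records.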
Replicated Data
Place fragment at many sites
- Pros:
  + Improves availability
  + Disconnected (mobile) operation
  + Distributes load
  + Reads are cheaper
- Cons:
  - N times more updates
  - N times more storage
- Placement strategies:
  - Dynamic: cache on demand
  - Static: place specific
[Diagram label: Catalog]
Updating Replicated Data
- When a replica is updated, how do changes propagate?
- Master copy, many slave copies (SQL Server)
  - always know the correct value (master)
  - change propagation can be
    - transactional
    - as soon as possible
    - periodic
    - on demand
- Symmetric, and anytime (Access)
  - allows mobile (disconnected) updates
  - updates propagated ASAP, periodic, on demand
  - non-serializable
  - colliding updates must be reconciled
  - hard to know “real” value
Replication and Partitioning Compared
[Diagram: starting from a base case of a 1 TPS server with 100 users, three ways to scale up from a 1 TPS system:
- Central Scaleup (2x more work): to a 2 TPS centralized system with 200 users
- Partition Scaleup (2x more work): two 1 TPS systems with 100 users each, 0 tps flowing between them
- Replication Scaleup (4x more work): two 2 TPS systems with 100 users each, 1 tps flowing in each direction between them]
Outline
Concepts and Terminology
- Why Distribute
- Distributed data & objects
  - Partitioned
  - Replicated
- Distributed execution
  - remote procedure call
  - queues
- Three tier architectures
- Transaction concepts
Distributed Execution: Threads and Messages
- Thread is execution unit (software analog of cpu+memory)
- Threads execute at a node
- Threads communicate via
  - Shared memory (local)
  - Messages (local and remote)
[Diagram labels: threads, shared memory, messages]
Peer-to-Peer or Client-Server
- Peer-to-Peer is symmetric:
  - Either side can send
- Client-server
  - client sends requests
  - server sends responses
  - simple subset of peer-to-peer
[Diagram: request and response arrows between client and server]
Connection-less or Connected
- Connection-less
  - request contains
    - client id
    - client context
    - work request
  - client authenticated on each message
  - only a single response message
  - e.g. HTTP, NFS v1
- Connected (sessions)
  - open - request/reply - close
  - client authenticated once
  - Messages arrive in order
  - Can send many replies (e.g. FTP)
  - Server has client context (context sensitive)
  - e.g. Winsock and ODBC
  - HTTP adding connections
Remote Procedure Call: The key to transparency
- Object may be local or remote
- Methods on object work wherever it is: y = pObj->f(x);
- Local invocation
[Diagram: the caller passes x to f(); f() executes "return val;" and the caller receives y = val]
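Remote Procedure Call: The key to transparency
- Remote invocation: y = pObj->f(x);
[Diagram: the client-side proxy checks whether Obj is local. If not, it marshals x and sends it to the server-side stub, which unmarshals x and calls pObj->f(x). f() executes "return val;", val is marshaled and sent back, and the proxy unmarshals it so that y = val.]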
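The proxy / marshal / stub / unmarshal path in the remote-invocation picture can be sketched in a few lines. This is an illustrative toy, assuming JSON as the wire format and a plain function call standing in for the network; the class names Obj, Stub, and Proxy are hypothetical and are not the DCOM or CORBA machinery the deck refers to.

```python
import json


class Obj:
    """The real object, living on the server."""

    def f(self, x):
        return x * 2


class Stub:
    """Server side: unmarshal the request, invoke the method, marshal the result."""

    def __init__(self, target):
        self.target = target

    def handle(self, message: bytes) -> bytes:
        call = json.loads(message)                      # unmarshal x
        val = getattr(self.target, call["method"])(*call["args"])
        return json.dumps({"val": val}).encode()        # marshal val


class Proxy:
    """Client side: looks like the object, but forwards every method call."""

    def __init__(self, send):
        self.send = send  # callable taking request bytes, returning reply bytes

    def __getattr__(self, method):
        # Only reached for names not found on the proxy itself, i.e. remote methods.
        def remote_call(*args):
            message = json.dumps({"method": method, "args": args}).encode()
            reply = self.send(message)                  # "network" round trip
            return json.loads(reply)["val"]             # unmarshal val
        return remote_call
```

With `pObj = Proxy(Stub(Obj()).handle)`, the call `pObj.f(21)` reads exactly like the local call `Obj().f(21)`, which is the transparency the slide is about. A production stub would not dispatch arbitrary method names from untrusted input.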
Object Request Broker (ORB)
Orchestrates RPC
- Registers Servers
- Manages pools of servers
- Connects clients to servers
- Does Naming, request-level authorization
- Provides transaction coordination (new feature)
- Old names:
  - Transaction Processing Monitor
  - Web server
  - NetWare
[Diagram labels: Transaction, Object-Request Broker]
History and Alphabet Soup
- Microsoft DCOM is based on OSF-DCE technology
- DCOM and ActiveX extend it
[Timeline diagram spanning 1985, 1990, and 1995. Legible labels: X/Open, UNIX International, Open Software Foundation (OSF), Object Management Group (OMG), Open Group, OSF DCE, CORBA, Solaris, COM, NT, XA / TX. The remaining labels are garbled in the source.]
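Using RPC for Transparency
- Partition Transparency: send updates to correct partition
  y = pfile->write(x);
[Diagram: the proxy checks whether the partition is local. If not, it marshals x and sends it to the correct partition, where x is unmarshaled and pObj->write(x) is called; write() returns val, which is marshaled and sent back.]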
Using RPC for Transparency
- Replication Transparency: send updates to EACH node
  y = pfile->write(x);
[Diagram: the proxy sends x to each replica and returns val]
Client/Server Interactions
All can be done with RPC
- Request-Response: response may be many messages
- Conversational: server keeps client context
- Dispatcher: three-tier, complex operation at server
- Queued: de-couples client from server, allows disconnected operation
[Diagram: client (C) and server (S) boxes for each interaction style]
Queued Request/Response
- Time-decouples client and server
  - Three Transactions
- Almost real time, ASAP processing
- Communicate at each other’s convenience
- Disk queues survive client & server failures
- Allows mobile (disconnected) operation
[Diagram: Client, Submit, Perform (Server), Response]
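The "three transactions" of queued processing (Submit, Perform, Response) might look roughly like this when the queues are tables in a transactional store. This is a sketch under stated assumptions: SQLite stands in for the disk queue, the table and function names are made up for illustration, and the in-memory database used by default would have to be a file to actually survive failures.

```python
import sqlite3


def open_queues(path=":memory:"):
    """Create the request and response queues (hypothetical schema)."""
    conn = sqlite3.connect(path)
    with conn:
        conn.execute("CREATE TABLE IF NOT EXISTS requests (id INTEGER PRIMARY KEY, body TEXT)")
        conn.execute("CREATE TABLE IF NOT EXISTS responses (id INTEGER PRIMARY KEY, body TEXT)")
    return conn


def submit(conn, body):
    """Transaction 1 (client): enqueue the request and return its id."""
    with conn:
        cur = conn.execute("INSERT INTO requests (body) VALUES (?)", (body,))
        return cur.lastrowid


def perform(conn, handler):
    """Transaction 2 (server): dequeue, do the work, enqueue the response.

    All three steps commit together; if handler raises, the dequeue is
    rolled back and the request stays in the queue.
    """
    with conn:
        row = conn.execute("SELECT id, body FROM requests ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return False
        req_id, body = row
        conn.execute("DELETE FROM requests WHERE id = ?", (req_id,))
        conn.execute("INSERT INTO responses (id, body) VALUES (?, ?)", (req_id, handler(body)))
    return True


def receive(conn, req_id):
    """Transaction 3 (client): dequeue the response, or None if not ready yet."""
    with conn:
        row = conn.execute("SELECT body FROM responses WHERE id = ?", (req_id,)).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM responses WHERE id = ?", (req_id,))
        return row[0]
```

Client and server never have to be up at the same time: the client can submit and disconnect, and poll with `receive` later.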
Why Queued Processing?
- Prioritize requests: ambulance dispatcher favors high-priority calls
- Manage Workflows: Order, Build, Ship, Invoice, Pay
- Deferred processing in mobile apps
- Interface heterogeneous systems: EDI
- MOM: Message-Oriented-Middleware
- DAD: Direct Access to Data
Outline
Concepts and Terminology
- Why Distributed
- Distributed data & objects
- Distributed execution
  - remote procedure call
  - queues
- Three tier architectures
  - what
  - why
- Transaction concepts
Work Distribution Spectrum
- Presentation and plug-ins
- Workflow manages session & invokes objects
- Business objects
- Database
[Diagram: layers Presentation, workflow, Business Objects, Database, with the client ranging from Thin to Fat]
Transaction Processing Evolution to Three Tier
Intelligence migrated to clients
- Mainframe batch processing (centralized)
- Dumb terminals & Remote Job Entry
- Intelligent terminals, database backends
- Workflow Systems, Object Request Brokers, Application Generators
[Diagram labels: Mainframe, cards, green screen 3270, TP Monitor, Server, ORB, Active]
Web Evolution to Three Tier
Intelligence migrated to clients (like TP)
- Character-mode clients, smart servers
- GUI Browsers - Web file servers
- GUI Plugins - Web dispatchers - CGI
- Smart clients - Web dispatcher (ORB), pools of app servers (ISAPI, Viper), workflow scripts at client & server
[Diagram labels: Web Server, WAIS, archie, gopher, green screen, Mosaic, NS & IE, Active]
PC Evolution to Three Tier
Intelligence migrated to server
- Stand-alone PC (centralized)
- PC + File & print server: message per I/O
- PC + Database server: message per SQL statement
- PC + App server: message per transaction
- ActiveX Client, ORB, ActiveX server, Xscript
[Diagram labels: IO request, reply, disk I/O, SQL Statement, Transaction]
The Pattern: Three Tier Computing
- Clients do presentation, gather input
- Clients do some workflow (Xscript)
- Clients send high-level requests to ORB (Object Request Broker)
- ORB dispatches workflows and business objects: proxies for client, orchestrate flows & queues
- Server-side workflow scripts call on distributed business objects to execute task
[Diagram layers: Presentation, workflow, Business Objects, Database]
The Three Tiers
[Diagram: a Web Client (HTML, VB, Java plug-ins, VBScript, JavaScript, VB or Java Script Engine, VB or Java Virtual Machine) connects over the Internet via HTTP + DCOM to the Middleware tier (ORB, TP Monitor, Web Server, ..., with an Object server Pool). The Middleware connects via DCOM (OLE DB, ODBC, ...) to the Object & Data server and, via LU 6.2, to IBM Legacy Gateways.]
Why Did Everyone Go To Three-Tier?
- Manageability
  - Business rules must be with data
  - Middleware operations tools
- Performance (scalability)
  - Server resources are precious
  - ORB dispatches requests to server pools
- Technology & Physics
  - Put UI processing near user
  - Put shared data processing near shared data
[Diagram layers: Presentation, workflow, Business Objects, Database]
Why Put Business Objects at Server?
- DAD’s Raw Data
  - Customer comes to store
  - Takes what he wants
  - Fills out invoice
  - Leaves money for goods
  - Easy to build
  - No clerks
- MOM’s Business Objects
  - Customer comes to store with list
  - Gives list to clerk
  - Clerk gets goods, makes invoice
  - Customer pays clerk, gets goods
  - Easy to manage
  - Clerk controls access
  - Encapsulation
What Middleware Does
ORB, TP Monitor, Workflow Mgr, Web Server
- Registers transaction programs: workflow and business objects (DLLs)
- Pre-allocates server pools
- Provides server execution environment
- Dynamically checks authority (request-level security)
- Does parameter binding
- Dispatches requests to servers
  - parameter binding
  - load balancing
- Provides Queues
- Operator interface
Server Side Objects
Easy Server-Side Execution
- ORB gives simple execution environment
- Object gets
  - start
  - invoke
  - shutdown
- Everything else is automatic
- Drag & Drop Business Objects
[Diagram: a Server wraps the Service logic with Network, Receiver, Queue, Connections, Context, Security, Thread Pool, Configuration, Management, Synchronization, and Shared Data]
Why Server Pools?
- Server resources are precious. Clients have 100x more power than server.
- Pre-allocate everything on server
  - preallocate memory
  - pre-open files
  - pre-allocate threads
  - pre-open and authenticate clients
- Keep high duty-cycle on objects (re-use them)
  - Pool threads, not one per client
- N clients x N Servers x F files = N x F file opens!!!
- Classic example: TPC-C benchmark
  - 2 processes
  - everything pre-allocated
[Diagram: 7,000 IE clients connect over HTTP to IIS, which uses a pool of DBC links to SQL]
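"Pool threads, not one per client" can be illustrated with a fixed-size, pre-allocated worker pool that many clients share. The sketch below uses Python's standard ThreadPoolExecutor; the function names and the trivial request handler are assumptions made for the example, and the pool sizes have nothing to do with the TPC-C configuration on the slide.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(client_id):
    """Stand-in for real work; reports which pooled thread served the client."""
    return client_id, threading.current_thread().name


def serve(num_clients, pool_size):
    """Serve num_clients requests with at most pool_size threads.

    Returns (requests handled, distinct threads actually used).
    """
    with ThreadPoolExecutor(max_workers=pool_size, thread_name_prefix="server") as pool:
        results = list(pool.map(handle_request, range(num_clients)))
    threads_used = {name for _, name in results}
    return len(results), len(threads_used)
```

Calling `serve(1000, 4)` handles all 1,000 requests while never creating more than four threads, so thread count stays bounded no matter how many clients arrive.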
Classic Three-Tier Example: TPC-C
- Transaction Processing Performance Council (TPC): standard performance benchmarks
- 5 transaction types
  - order entry, payment, status (oltp)
  - delivery (mini-batch)
  - restock (mini-DSS)
- Metrics: Throughput, Price/Performance
- Shows best practices:
  - everyone three tier
  - 2 processes at server
  - everything pre-allocated
[Diagram: 7,000 Web clients connect over HTTP to IIS (Web), which uses a pool of DBC links (ODBC) to SQL]
Classic Mistakes
- Thread per terminal
  - fix: DB server thread pools
  - fix: server pools
- Process per request (CGI)
  - fix: ISAPI & NSAPI DLLs
  - fix: connection pools
- Many messages per operation
  - fix: stored procedures
  - fix: server-side objects
- File open per request
  - fix: cache hot files
Outline
- Why Distributed
- Distributed data & objects
- Distributed execution
- Three tier architectures
  - why: manageability & performance
  - what: server side workflows & objects
- Transaction concepts
  - Why transactions?
  - Using transactions
  - Two Phase Commit
  - How transactions?
Thesis
- Transactions are key to structuring distributed applications
- ACID properties ease exception handling
  - Atomic: all or nothing
  - Consistent: state transformation
  - Isolated: no concurrency anomalies
  - Durable: committed transaction effects persist
What Is A Transaction?
- Programmer’s view:
  - Bracket a collection of actions
- A simple failure model
  - Only two outcomes: Success! or Failure!
[Diagram: a sequence Begin(), action, action, action ends either in Commit() (Success!) or in Rollback() or a failure (Failure!)]
Why Bother: Atomicity?
- RPC semantics:
  - At most once: try one time
  - At least once: keep trying ’till acknowledged
  - Exactly once: keep trying ’till acknowledged and server discards duplicate requests
Why Bother: Atomicity?
- Example: insert record in file
  - At most once: time-out means “maybe”
  - At least once: retry may get “duplicate” error or retry may do second insert
  - Exactly once: you do not have to worry
- What if operation involves
  - Insert several records?
  - Send several messages?
- Want ALL or NOTHING for group of actions
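Exactly-once, as described on these slides, is at-least-once retrying on the client plus duplicate discarding on the server. A minimal sketch of that combination follows, assuming the client tags each request with a unique id; the Server class, the request-id scheme, and the simulated lost acknowledgement are all illustrative inventions.

```python
class Server:
    """Inserts records, remembering request ids so duplicates are discarded."""

    def __init__(self):
        self.records = []
        self.replies = {}  # request id -> reply already sent for it

    def insert(self, request_id, record):
        if request_id in self.replies:
            # Duplicate request: repeat the old reply, do NOT insert again.
            return self.replies[request_id]
        self.records.append(record)
        self.replies[request_id] = "ok"
        return "ok"


def call_with_retry(server, request_id, record, lose_first_ack=True):
    """Client side: keep trying until acknowledged (at-least-once delivery).

    lose_first_ack simulates a time-out: the server did the work but the
    first acknowledgement never reached the client.
    """
    attempts = 0
    while True:
        attempts += 1
        reply = server.insert(request_id, record)
        if lose_first_ack and attempts == 1:
            continue  # pretend the reply was lost, so retry
        return reply, attempts
```

The retry reaches the server twice, yet the record is inserted once. Without the `replies` table the same retry would "do second insert", which is the at-least-once hazard named above.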
Why Bother: Consistency
- Begin-Commit brackets a set of operations
- You can violate consistency inside brackets
  - Debit but not credit (destroys money)
  - Delete old file before create new file in a copy
  - Print document before delete from spool queue
- Begin and commit are points of consistency
[Diagram: state transformations; the new state is under construction between Begin and Commit]
Why Bother: Isolation
- Running programs concurrently on same data can create concurrency anomalies
  - The shared checking account example
- Programming is hard enough without having to worry about concurrency
[Diagram: starting from Bal = 100, one transaction runs Begin(), read BAL, add 10, write BAL, Commit() while another runs Begin(), read BAL, subtract 30, write BAL, Commit(). The balance goes 100, 110, 70: the deposit is lost.]
Isolation
- It is as though programs run one at a time
  - No concurrency anomalies
- System automatically protects applications
  - Locking (DB2, Informix, Microsoft® SQL Server™, Sybase…)
  - Versioned databases (Oracle, Interbase…)
[Diagram: starting from Bal = 100, the first transaction (read BAL, add 10, write BAL) commits with Bal = 110 before the second (read BAL, subtract 30, write BAL) reads, giving Bal = 80]
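The checking-account numbers from these two slides (100, 110, 70 without isolation; 100, 110, 80 with it) can be reproduced in a small sketch. The anomaly is hand-interleaved so that it happens deterministically, and a plain mutex stands in for the database's automatic locking; the Account class and function names are assumptions for illustration.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()


def lost_update(account):
    """Two transactions interleaved by hand with no isolation."""
    t1_read = account.balance        # T1: read BAL  (100)
    t2_read = account.balance        # T2: read BAL  (100), before T1 writes
    account.balance = t1_read + 10   # T1: write BAL (110)
    account.balance = t2_read - 30   # T2: write BAL (70), T1's deposit is lost
    return account.balance


def isolated_update(account, delta):
    """Hold the lock from the read through the write."""
    with account.lock:
        current = account.balance
        account.balance = current + delta


def run_isolated(account):
    """Run the +10 and -30 transactions concurrently, each under the lock."""
    threads = [threading.Thread(target=isolated_update, args=(account, d)) for d in (10, -30)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance
```

Whichever thread wins the lock first, the isolated run ends at 80, the same answer as running the two programs one at a time.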
Why Bother: Durability
- Once a transaction commits, want effects to survive failures
- Fault tolerance: old master-new master won’t work:
  - Can’t do daily dumps: would lose recent work
  - Want “continuous” dumps
- Redo “lost” transactions in case of failure
- Resend unacknowledged messages
Why ACID For Client/Server And Distributed
- ACID is important for centralized systems
- Failures in centralized systems are simpler
- In distributed systems:
  - More and more-independent failures
  - ACID is harder to implement
- That makes it even MORE IMPORTANT
  - Simple failure model
  - Simple repair model
ACID Generalizations
- Taxonomy of actions
  - Unprotected: not undone or redone
    - Temp files
  - Transactional: can be undone before commit
    - Database and message operations
  - Real: cannot be undone
    - Drill a hole in a piece of metal, print a check
- Nested transactions: subtransactions
- Work flow: long-lived transactions
Outline
- Why Distributed
- Distributed data & objects
- Distributed execution
- Three tier architectures
- Transaction concepts
  - Why transactions?
    - ACID: atomic, consistent, isolated, durable
  - Using transactions
    - programming
    - save points
    - nested, chained
    - workflow
  - Two Phase Commit
  - How transactions?
Programming & Transactions: The Application View
- You Start (e.g. in Transact-SQL):
  - Begin [Distributed] Transaction <name>
  - Perform actions
  - Optional Save Transaction <name>
  - Commit or Rollback
- You Inherit a XID
  - Caller passes you a transaction XID
  - You return or Rollback.
  - You can Begin / Commit sub-trans.
  - You can use save points
[Diagram labels: Begin, Commit, Begin, RollBack for the first case; Return, RollBack, Return for the second]
Transaction Save Points
Backtracking within a transaction
- Allows app to cancel parts of a transaction prior to commit
- This is in most SQL products (save transaction in MS SQL Server)
[Diagram: BEGIN WORK:1 is followed by actions separated by SAVE WORK:2 through SAVE WORK:8; ROLLBACK WORK(2) and ROLLBACK WORK(7) backtrack to earlier save points; the transaction ends with COMMIT WORK]
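Backtracking to a save point can be tried out directly in SQLite, which supports SAVEPOINT and ROLLBACK TO. The sketch below is an illustration only: it uses SQLite syntax rather than the SAVE WORK notation of the slide or SQL Server's SAVE TRANSACTION, and the table and item names are made up.

```python
import sqlite3


def demo_savepoints():
    """Insert four items, cancel the middle two by rolling back to a save point."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # BEGIN / SAVEPOINT / COMMIT statements below are issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE orders (item TEXT)")

    conn.execute("BEGIN")
    conn.execute("INSERT INTO orders VALUES ('widget')")
    conn.execute("SAVEPOINT sp2")
    conn.execute("INSERT INTO orders VALUES ('gadget')")
    conn.execute("SAVEPOINT sp3")
    conn.execute("INSERT INTO orders VALUES ('gizmo')")
    conn.execute("ROLLBACK TO SAVEPOINT sp2")  # cancels gadget and gizmo, keeps widget
    conn.execute("INSERT INTO orders VALUES ('sprocket')")
    conn.execute("COMMIT")

    return [row[0] for row in conn.execute("SELECT item FROM orders ORDER BY rowid")]
```

Only the work done after `sp2` is cancelled; the transaction itself stays open and commits the rest.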
Chained Transactions
- Commit of T1 implicitly begins T2.
- Carries context forward to next transaction
  - cursors
  - locks
  - other state
[Diagram: Transaction #1 establishes the processing context; Commit and Begin happen together; Transaction #2 uses the processing context]
Nested Transactions
Going Beyond Flat Transactions
- Need transactions within transactions
- Sub-transactions commit only if root does
- Only root commit is durable.
- Subtransactions may rollback; if so, all its subtransactions rollback
- Parallel version of nested transactions
[Diagram: transaction tree with nodes T12, T13, T112, T113, T114, T121, T122, T123, T131, T132, T133]
Workflow: A Sequence of Transactions
- Application transactions are multi-step
  - order, build, ship & invoice, reconcile
- Each step is an ACID unit
- Workflow is a script describing steps
- Workflow systems
  - Instantiate the scripts
  - Drive the scripts
  - Allow query against scripts
- Examples
  - Manufacturing Work In Process (WIP)
  - Queued processing
  - Loan application & approval, Hospital admissions…
[Diagram layers: Presentation, workflow, Business Objects, Database]
Workflow Scripts
- Workflow scripts are programs (could use VBScript or JavaScript)
- If step fails, compensation action handles error
- Events, messages, time, other steps cause step.
- Workflow controller drives flows
[Diagram: a flow graph starting at Source with fork, join, branch, case, and loop constructs; a Step has a Compensation Action]
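One way a workflow controller might use compensation actions is sketched below. The slide only says that a compensation action handles a failed step; running the compensations of the already-completed steps in reverse order is an assumption of this example (it is one common policy), and run_workflow and fail are hypothetical names.

```python
def fail():
    """Hypothetical step action that always fails."""
    raise RuntimeError("step failed")


def run_workflow(steps):
    """Drive a list of (name, action, compensation) steps in order.

    Each action stands for one committed step. If a step raises, the
    compensations of the steps already done are run newest-first.
    Returns (succeeded, trail), where trail lists what happened.
    """
    done = []
    trail = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            trail.append(f"failed {name}")
            for prev_name, prev_compensate in reversed(done):
                prev_compensate()
                trail.append(f"compensated {prev_name}")
            return False, trail
        trail.append(f"did {name}")
        done.append((name, compensate))
    return True, trail
```

Because each completed step has already committed and is visible to others, it cannot be rolled back; the compensation is a new action that semantically reverses it (for example, cancelling an order that was placed).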
Workflow and ACID
- Workflow is not Atomic or Isolated
- Results of a step visible to all
- Workflow is Consistent and Durable
- Each flow may take hours, weeks, months
- Workflow controller
  - keeps flows moving
  - maintains context (state) for each flow
  - provides a query and operator interface, e.g.: “what is the status of Job # 72149?”
ACID Objects Using ACID DBs
The easy way to build transactional objects
- Application uses transactional objects (objects have ACID properties)
- If object built on top of ACID objects, then object is ACID.
  - Example: New, EnQueue, DeQueue on top of SQL
- SQL provides ACID
- Persistent Programming languages automate this.
[Diagram: Business Object: Customer; Business Object Mgr: CustomerMgr; SQL]
  dim c as Customer
  dim CM as CustomerMgr
  ...
  set C = CM.get(CustID)
  ...
  C.credit_limit = 1000
  ...
  CM.update(C, CustID)
  ...
ACID Objects From Bare Metal
The Hard Way to Build Transactional Objects
- Object Class is a Resource Manager (RM)
  - Provides ACID objects from persistent storage
  - Provides Undo (on rollback)
  - Provides Redo (on restart or media failure)
  - Provides Isolation for concurrent ops
- Microsoft SQL Server, IBM DB2, Oracle, … are Resource managers.
- Many more coming.
- RM implementation techniques described later
Outline
- Why Distributed
- Distributed data & objects
- Distributed execution
- Three tier architectures
- Transaction concepts
  - Why transactions?
  - Using transactions
    - programming
    - save points
    - nested, chained
    - workflow
  - Two Phase Commit
    - Prepare and commit phases
    - Transaction & Resource Managers
  - How transactions?
Transaction Manager
- Transaction Manager (TM): manages transaction objects.
  - XID factory
  - tracks them
  - coordinates them
- App gets XID from TM
- Transactional RPC
  - passes XID on all calls
  - manages XID inheritance
- TM manages commit & rollback
[Diagram: App sends begin to TM and receives an XID; App calls the RM with call(..XID); the RM enlists with the TM]
TM Two-Phase Commit
Dealing with multiple RMs
- If all use one RM, then all or none commit
- If multiple RMs, then need coordination
- Standard technique:
  - Marriage: Do you? I do. I pronounce… Kiss
  - Theater: Ready on the set? Ready! Action! Act
  - Sailing: Ready about? Ready! Helm’s a-lee! Tack
  - Contract law: Escrow agent
- Two-phase commit:
  - 1. Voting phase: can you do it?
  - 2. If all vote yes, then commit phase: do it!
Two-Phase Commit In Pictures
- Transactions managed by TM
- App gets unique ID (XID) from TM at Begin()
- XID passed on Transactional RPC
- RMs Enlist when first do work on XID
[Diagram: App sends Begin to TM and receives an XID; App issues Call(..XID..) to RM1 and to RM2; RM1 and RM2 each Enlist with the TM]
When App Requests Commit
Two Phase Commit in Pictures
- TM tracks all RMs enlisted on an XID
- TM calls enlisted RM’s Prepared() callback
- If all vote yes, TM calls RM’s Commit()
- If any vote no, TM calls RM’s Rollback()
[Diagram steps among App, TM, RM1, and RM2:
  1. Application requests Commit
  2. TM broadcasts prepared?
  3. RMs all vote Yes
  4. TM decides Yes, broadcasts Commit
  5. RMs acknowledge
  6. TM says yes]
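The begin / enlist / prepare / commit-or-rollback flow described on the last few slides can be condensed into a toy coordinator. This is a sketch under simplifying assumptions: everything runs in one process, nothing is logged durably, and failures and time-outs are ignored; the class and method names are illustrative and are not the X/Open or OLE Transactions interfaces.

```python
class ResourceManager:
    """A participant that votes in phase 1 and obeys the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self, xid):
        """Phase 1 vote: True means 'yes, I can commit'."""
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit

    def commit(self, xid):
        self.state = "committed"

    def rollback(self, xid):
        self.state = "rolled back"


class TransactionManager:
    """XID factory and commit coordinator."""

    def __init__(self):
        self.next_xid = 1
        self.enlisted = {}  # xid -> list of enlisted RMs

    def begin(self):
        xid = self.next_xid
        self.next_xid += 1
        self.enlisted[xid] = []
        return xid

    def enlist(self, xid, rm):
        """Called by an RM the first time it does work on xid."""
        if rm not in self.enlisted[xid]:
            self.enlisted[xid].append(rm)

    def commit(self, xid):
        rms = self.enlisted.pop(xid)
        votes = [rm.prepare(xid) for rm in rms]  # phase 1: ask every RM
        if all(votes):
            for rm in rms:                       # phase 2: everyone commits
                rm.commit(xid)
            return True
        for rm in rms:                           # any 'no' vote: everyone rolls back
            rm.rollback(xid)
        return False
```

A real transaction manager forces its decision to a durable log before phase 2 so that the outcome survives a crash; that step is omitted here.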
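X/Open Standardizes Two-Phase Commit
- Standardized APIs for apps and to RMs
- Points to OSI/TP for interoperation
[Diagram: the Client talks to the TM through TX (begin, commit, rollback) and to the RM through SQL or MTS or ...; the TM talks to the RM through XA (enlist, Prepare, Commit); the TM talks to the Comm mgr through XA+ (outgoing, incoming); a matching Comm mgr, TM, and RM sit on the Server side]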
How Does This Relate To Microsoft?
- SQL Server is transactional (so is Oracle, DB2, Informix, Sybase)
- MS Distributed Transaction Coordinator (DTC) packaged with SQL Server, MTS, and other RMs
- Connects to CICS, Encina, Topend, Tuxedo
- Any RM (SNA LU 6.2, DB2, Oracle, Sybase, Informix, …) can participate in transactions
OLE Transactions: the Movie
Two styles:
- (1) Bind an RM connection to the transaction. All work on that connection is now part of that transaction.
- (2) Pass transaction object on every RM call.
Not shown: client can get async notification of transaction outcome.
[Diagram: the Client (begin, commit, rollback) calls DtcGetTransactionManager() to reach the TM, uses ITransactionDispenser (BeginTransaction), and uses ITransaction (GetTransactionInfo, Commit, Abort); the Resource Manager (aka sql, Comm Mgr, viper, …) receives Commit / Abort]
OLE Transactions: RM Enlist
- An RM registers with the TM
- RM enlists in transaction (provides callbacks)
[Diagram: the Resource Manager (aka sql, viper, …) calls DtcGetTransactionManager() to reach the TM, uses IResourceManagerFactory (Create) to register, and uses IResourceManager (Enlist, ReEnlistmentComplete) to ENLIST in the Transaction. The RM side exposes ITransactionResourceAsync (PrepareRequest, CommitRequest, AbortRequest, TMDown); the Enlistment exposes ITransactionEnlistmentAsync (PrepareReqDone, CommitReqDone, AbortReqDone).]
OLE Transactions: RM Commit
- Two phase commit
- Enlisted RMs get prepare & commit callbacks
- Abort callbacks are similar
[Diagram: on COMMIT, the TM drives the Transaction through the Resource Manager (aka sql, viper, …) using ITransactionResourceAsync (PrepareRequest, CommitRequest, AbortRequest, TMDown); the RM answers through the Enlistment using ITransactionEnlistmentAsync (PrepareReqDone, CommitReqDone, AbortReqDone)]
Outline
- Why Distributed
- Distributed data & objects
- Distributed execution
- Three tier architectures
- Transaction concepts
  - Why transactions?
  - Using transactions
  - Two Phase Commit
    - Prepare and commit phases
    - Transaction and Resource Managers
  - How transactions?
    - logging
    - locking or versioning
Implementing Transactions
- Atomicity
  - The DO/UNDO/REDO protocol
  - Idempotence
  - Two-phase commit
- Durability
  - Durable logs
  - Force at commit
- Isolation
  - Locking or versioning
DO/UNDO/REDO
- Each action generates a log record
  - DO: Old state -> New state + Log
- Has an UNDO action
  - UNDO: New state + Log -> Old state
- Has a REDO action
  - REDO: Old state + Log -> New state
What Does A Log Record Look Like?
- Log record has
  - Header (transaction ID, timestamp…)
  - Item ID
  - Old value
  - New value
- For messages: just message text and sequence #
- For records: old and new value on update
- Keep records small
Transaction Is A Sequence Of Actions
- Each action changes state
  - Changes database
  - Sends messages
  - Operates a display/printer/drill press
- Leaves a log trail
[Diagram: a chain of DO steps, each turning an Old state into a New state and producing a Log record]
Transaction UNDO Is Easy
- Read log backwards
- UNDO one step at a time
- Can go half-way back to get nested transactions
[Diagram: a chain of UNDO steps, each using a Log record to turn a New state back into the Old state]
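A log record holding the transaction ID, item ID, old value, and new value is enough to implement DO, UNDO, and REDO over a simple key-value state. The sketch below is a minimal in-memory illustration; the names LogRecord, do, undo, and redo are made up here, and using None to mean "the item did not exist" is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    xid: int      # transaction ID (header)
    item: str     # item ID
    old: object   # old value; None means the item did not exist
    new: object   # new value


def do(data, log, xid, item, new):
    """DO: apply the change and leave a log record behind."""
    log.append(LogRecord(xid, item, data.get(item), new))
    data[item] = new


def undo(data, log, xid, back_to=0):
    """UNDO: read the log backwards, restoring old values one step at a time.

    back_to is a log position; passing a position recorded mid-transaction
    goes only 'half-way back', like a save point or nested transaction.
    """
    for rec in reversed(log[back_to:]):
        if rec.xid != xid:
            continue
        if rec.old is None:
            data.pop(rec.item, None)
        else:
            data[rec.item] = rec.old


def redo(log, data=None):
    """REDO: replay new values forwards onto a (possibly empty) state."""
    data = {} if data is None else data
    for rec in log:
        data[rec.item] = rec.new
    return data
```

Note that both undo and redo here install values rather than apply deltas, so running either of them twice leaves the same state as running it once.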
Durability: Protecting The Log
- When transaction commits
  - Put its log in a durable place (duplexed disk)
  - Need log to redo transaction in case of failure
    - System failure: lost in-memory updates
    - Media failure (lost disk)
- This makes transaction durable
- Log is sequential file
  - Converts random IO to single sequential IO
  - See NTFS or newer UNIX file systems
[Diagram: Log records are written (Write) to duplexed Log disks]
Recovery After A System Failure
- During normal processing, write checkpoints on non-volatile storage
- When recovering from a system failure…
  - return to the checkpoint state
  - Reapply log of all committed transactions
  - Force-at-commit ensures log will survive restart
- Then UNDO all uncommitted transactions
[Diagram: a chain of REDO steps, each using a Log record to turn an Old state into the New state]
Idempotence: Dealing with failure
- What if fail during restart?
  - REDO many times
- What if new state not around at restart?
  - UNDO something not done
[Diagram: REDO applied to a state that is already the New state; UNDO applied to a state that is already the Old state]
Idempotence: Dealing with failure
- Solution: make F(F(x))=F(x) (idempotence)
  - Discard duplicates
    - Message sequence numbers to discard duplicates
    - Use sequence numbers on pages to detect state
  - (Or) make operations idempotent
    - Move to position x, write value V to byte B…
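"Use sequence numbers on pages to detect state" can be sketched as follows: each page remembers the sequence number of the last log record applied to it, and REDO skips any record that is not newer. The Page and Increment types are hypothetical, and an increment is chosen deliberately because it is not idempotent on its own.

```python
from dataclasses import dataclass


@dataclass
class Page:
    value: int = 0
    seq: int = 0  # sequence number of the last log record applied to this page


@dataclass
class Increment:
    seq: int      # log sequence number of this record
    amount: int


def redo(page, record):
    """Apply the record only if the page has not seen it yet."""
    if record.seq <= page.seq:
        return  # duplicate: the page already reflects this record
    page.value += record.amount
    page.seq = record.seq


def recover(page, log):
    """Replay the whole log; safe to call again if restart itself fails."""
    for record in log:
        redo(page, record)
    return page.value
```

Running `recover` once or many times gives the same page value, which is the F(F(x)) = F(x) property the slide asks for. The alternative on the slide is to log operations that are idempotent by construction, such as "write value V to byte B".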
Recap
- ACID makes it easy to program distributed applications
- DO/UNDO/REDO + log allows atomicity
- Multiple logs need two-phase commit
- Persistent log gives durability
  - Recover from system failure
  - Recover from media failure
Outline
- Why Distributed
- Distributed data & objects
- Distributed execution
- Three tier architectures
- Transaction concepts
  - Why transactions?
  - Using transactions
  - Two Phase Commit
  - How transactions?
    - logging
    - locking or versioning
Concurrency Control: Locking
- How to automatically prevent concurrency bugs?
- Serialization theorem:
  - If you lock all you touch and hold to commit: no bugs
  - If you do not follow these rules, you may see bugs
- Automatic Locking:
  - Set automatically (well-formed)
  - Released at commit/rollback (two-phase locking)
- Greater concurrency for locks:
  - Granularity: objects or containers or server
  - Mode: shared or exclusive or…
Reduced Isolation Levels
- It is possible to lock less and risk fuzzy data
- Example: want statistical summary of DB
  - But do not want to lock whole database
- Reduced levels:
  - Repeatable Read: may see fuzzy inserts/delete
    - But will serialize all updates
  - Read Committed: see only committed data
  - Read Uncommitted: may see uncommitted updates
Multiversion Concurrency Control
- Run transaction at some timestamp in the past
- No locking needed, reconstruct “old” state from log
- Add in your transaction’s updates
- At commit assure updates do not collide with other committed transactions
- Almost as good as serializable (only obscure bugs)
Summary
- ACID eases error handling
  - Atomic: all or nothing
  - Consistent: correct transformation
  - Isolated: no concurrency bugs
  - Durable: survives failures
- Allows you to build robust distributed applications
- ACID becoming standard part of systems
  - It’s real
Outline
- Why Distributed
- Distributed data & objects
- Distributed execution
- Three tier architectures
- Transaction concepts
Terms: 2-Tier, 3-Tier, ACID, Atomic, Autonomy, Commit, Consistent, Delegation, Durable, Fat Client, Idempotent, Isolated, Lock, Log, ORB, Partitioned Data, Queued or Direct, Replicated Data, Resource Manager, Rollback (Abort), RPC, Serializable, Server Pool, Thin Client, Transaction Manager, Two Phase Commit, Undo/Redo, Update Anywhere, Workflow, XID
References
- Essential Client/Server Survival Guide, 2nd ed.
  - Orfali, Harkey & Edwards, J. Wiley, 1996
- Principles of Transaction Processing
  - Bernstein & Newcomer, Morgan Kaufmann, 1997
- Transaction Processing Concepts and Techniques
  - Gray & Reuter, Morgan Kaufmann, 1993