
- Number of slides: 71

STABILIZATION & LOCALITY Shay Kutten Technion, Israel 1

Recall: Traditional methods are global * Dijkstra: 1 fault, O(n) time; f faults, o(fn) time * [Katz, Perry]: first general method, 1 fault, O(n) time: (1) self-stab bcast freezes all nodes (2) global self-stab snapshot to a leader (3) leader checks the global state (4) leader initializes every node (if faulty) (5) bcast unfreezes nodes 2

Recall: Traditional methods are global (continued) * Global reset (general) methods, 1 fault, O(n) time: [Afek, Kutten, Yung], [Awerbuch, Patt-Shamir, Varghese], [Awerbuch, Kutten, Mansour, Patt-Shamir, Varghese] (1) bcast freezes all nodes (2) reset to a specific initial state * [Dolev, Herman] superstabilizing global reset: (2’) reset to a state “nearest” to the current state (still 1 fault, O(n) time) 3

Another look at Dijkstra’s algorithm 4

Global effect example 8 8 8 8 leader 7=8 8 8 7 7 7 token 7 8 8 7 8 5

Global effect example 8 8 8 8 leader 7=8 8 8 7 7 token 7 8 8 6

Global effect example 8 8 8 8 leader 7=8 8 8 7 7 token 8 8 7 7

Global effect example 7 8 8 8 8 leader 8 8 7 7 7 token 7 8 8 8

Global effect example 7 8 8 8 8 leader 8 8 7 7 7 token 7 8 8 9

Global effect example 7 8 7 7 8 8 8 leader 8 8 7 7 7 token 7 8 8 10

Global effect example 7 8 7 8 leader 8 8 7 7 7 token 7 8 8 11

Global effect example: the effect of one fault circles the whole ring 7 8 7 8 leader 7 8 8 8 7 7 7 token 7 8 8 12
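The global effect can be reproduced with a small simulation. This is a minimal sketch, not the exact ring drawn on the slides: it assumes a unidirectional ring, synchronous rounds in which every privileged node moves, K = n + 1 states, and a legal starting state in which every node holds 8. The function names (`tokens`, `step`, `rounds_to_recover`) are illustrative.

```python
def tokens(x):
    """Nodes currently privileged (holding a token) in Dijkstra's K-state ring."""
    n = len(x)
    held = [0] if x[0] == x[n - 1] else []                 # leader: equals its predecessor
    held += [i for i in range(1, n) if x[i] != x[i - 1]]   # others: differ from predecessor
    return held


def step(x, k):
    """One synchronous round (an assumption): every privileged node moves."""
    n = len(x)
    new = list(x)
    if x[0] == x[n - 1]:
        new[0] = (x[0] + 1) % k        # leader increments modulo K
    for i in range(1, n):
        if x[i] != x[i - 1]:
            new[i] = x[i - 1]          # non-leader copies its predecessor
    return new


def rounds_to_recover(n, fault_pos):
    """Corrupt one node of a legal all-8 ring to 7; count rounds until one token remains.

    Assumes n >= 8 so that the value 8 is a valid state when K = n + 1.
    """
    k = n + 1
    x = [8] * n
    x[fault_pos] = 7
    rounds = 0
    while len(tokens(x)) != 1:
        x = step(x, k)
        rounds += 1
    return rounds
```

In this model a single corrupted value near the leader has to travel around the rest of the ring before it disappears, so the recovery time grows linearly with the ring size even though only one node was faulty.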

Another example of self stab in industry A General purpose computer B Fast stupid switch E C D Fast route from A to C, passing only B’s stupid switch, not B’s general purpose computer. Possible since route is preset. 13

Another example of self stab in industry A B E C D In bcast, how does A’s stupid switch detect that it already received the bcast? 14

Another example of self stab in industry A B E C D (non-self-stab) solution: the stupid switch forwards only over ports of links marked “tree”. 15

Another example of self stab in industry A E B fault C D (non-self-stab) solution: the stupid switch forwards only over ports of links marked “tree”. Vulnerable to a state fault: suppose the “tree” is really a cycle. 16

Industrial solutions to the self stab problem (1) Digital’s LAN bridges solution root A pulse E B D C 17

Industrial Solutions to the self stab problem (1) Digital’s LAN Bridges Solution root A pulse E pulse D B pulse C 18

Industrial Solutions to the self stab problem (1) Digital’s LAN Bridges Solution root A pulse E pulse B pulse C D pulse 19

Industrial Solutions to the self stab problem (2) IBM’s ATM Solution root A 6 6 E B D C Hop counter decreased. When hop counter reaches 0, discard message. 20

Industrial Solutions to the self stab problem (2) IBM’s ATM Solution 6 root A 6 E 5 D B 5 C Hop counter decreased. When hop counter reaches 0, discard message. 21

Industrial Solutions to the self stab problem (2) IBM’s ATM Solution root A E 5 B 5 C D 4 4 4 Hop counter decreased. When hop counter reaches 0, discard message. 22

Industrial Solutions to the self stab problem (2) IBM’s ATM Solution root A E B 3 3 4 D 4 3 3 C 4 Hop counter decreased. When hop counter reaches 0, discard message. 23

Industrial Solutions to the self stab problem (2) IBM’s ATM Solution root A 2 2 E 3 2 3 D B 2 C 3 Hop counter decreased. When hop counter reaches 0, discard message. 24
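The hop-counter rule can be sketched as follows. This is an illustrative model only, not IBM's implementation: the 5-node cycle stands in for a "tree" whose marks were corrupted into a cycle, and the names `flood` and `ring` are hypothetical.

```python
from collections import deque


def flood(links, source, hops):
    """Flood from `source`; every copy carries a hop counter.

    Returns the number of link transmissions. A switch never remembers
    what it has already seen; only the counter stops the flood.
    """
    sent = 0
    queue = deque((source, nbr, hops) for nbr in links[source])
    while queue:
        sender, node, h = queue.popleft()
        sent += 1
        h -= 1                      # hop counter decreased at each hop
        if h == 0:
            continue                # counter reached 0: discard the message
        for nbr in links[node]:
            if nbr != sender:       # forward over every other marked port
                queue.append((node, nbr, h))
    return sent


# Assumed topology: the links marked "tree" actually form a cycle.
ring = {
    "A": ["B", "E"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D", "A"],
}
```

On this cycle `flood(ring, "A", 6)` terminates after a bounded number of transmissions instead of looping forever. The price is that the counter must be at least as large as the network diameter, which is why the slides call this a global solution.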

Scalability • Industrial solutions are global. They do not scale well. • Conventional self stab protocols are global. They do not scale well. 25

Local detection [Afek, Kutten, Yung] (Local “checking” [Awerbuch, Patt-Shamir, Varghese]) A cycle will be detected by a node seeing only its parent’s state and its own state. [Figure: node labels root 0 1 14 2 14 13 12 3 4 11 10 7 6 5] 26

Local checking of a spanning tree, with a root 2(A) 1(A) root D C B root A 0(A) 0(C) G H 1(C) 2(A) F E 1(A) 0(A) 27
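A local check in the spirit of the distance(root) labels can be sketched as below. The state layout `(root, parent, dist)` and the function names are assumptions made for illustration; each node reads only its own state and its parent's state.

```python
def locally_consistent(node, state, neighbors):
    """Check one node using only its own state and its parent's state."""
    root, parent, dist = state[node]
    if parent is None:                       # node claims to be the root
        return root == node and dist == 0
    if parent not in neighbors[node]:        # parent must be a real neighbor
        return False
    p_root, _, p_dist = state[parent]
    return p_root == root and dist == p_dist + 1


def detecting_nodes(state, neighbors):
    """Nodes whose local check fails; empty means every node is satisfied."""
    return [v for v in state if not locally_consistent(v, state, neighbors)]


# A legal tree on the path A - B - C, rooted at A.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
good = {"A": ("A", None, 0), "B": ("A", "A", 1), "C": ("A", "B", 2)}

# Parent pointers forming a cycle on a triangle: distances cannot keep
# increasing all the way around, so at least one node must detect.
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
cycle = {"A": ("R", "C", 0), "B": ("R", "A", 1), "C": ("R", "B", 2)}
```

Here `detecting_nodes(good, path)` is empty, while in `cycle` node A sees that its distance is not one more than its parent's and raises the alarm.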

Local checking of other functions • Any graph marking function is locally checkable • Any algorithm’s global state is locally checkable • For any bit complexity C, there is a function with local checking bit complexity O(C) • Some complexities for interesting marking functions are known • Many other problems are open 28

From local checking to local correction Goal: for (unknown) f faults, O(f) time for correction. Idea: the diameter of the faults is f. Prevent the faulty area from expanding; shrink the faulty area. [Figure: node labels 0 1 14 2 14 13 3 12 4 11 25 10 22 23 24 7 6 5; annotations “fault” and “Possibly consistent faults”] 29

From local checking to local correction Goal: for (unknown) f faults, O(f) time for correction. Idea: the diameter of the faults is f. Prevent the faulty area from expanding; shrink the faulty area (“impossible” if faulty majority). [Figure: node labels 0 1 14 2 14 13 3 12 4 11 25 10 22 23 24 7 6 5; annotations “fault” and “Possibly consistent faults”] 30

For example: need to prevent expansion of the faulty area: recall Dijkstra’s algorithm 31

f faults create “gap” or “bump” of length f. 7 8 7 8 leader 8 8 7 7 7 token 7 8 8 32

f faults create “gap” or “bump” of length f. T T T 33

Another example of expanding faulty area: Stable Value Problem. Adversary spoils (once) a minority of the replicas of val 1 and of the states. Alg. recovers val 1 everywhere. [Figure: nodes A to E with values val 1 to val 5] 34

Stable Value Problem. Adversary spoils (once) a minority of the replicas of val 1 and of the states. Alg. recovers val 1 everywhere. [Figure: val 1 A, val 1 B, val 5 E, D, C, val 1, val’ 1] 35

Reducing a general problem to the Stable Value Problem. B can compute any func(val 1, val 2, val 3, val 4, val 5) if it receives every correct val i. [Figure: val 1 A, val 2 B, val 5 E, D, C, val 4, val 3] 36

“Simple” but global solution: consensus voting val 1 A B “Simple” since E’s vote can be spoiled in D on the way to B. val 1 E D C val 1 37

“Simple” but global solution: consensus voting. Time = O(diameter) even for one fault. Desired: few faults, short time. [Figure: val 1 A, val 1 B, val 1 E, D, C, val 1] 38

Local voting does not solve: a minority of faulty nodes, but a local majority everywhere 39

Idea 1: For (unknown) f faults, get votes (values) from radius 2f. In O(f) time get >2f votes, so the majority is non-faulty. [Figure: A, radius 2f] 40

Idea 1: For (unknown) f faults, get votes (values) from radius 2f. In O(f) time get >2f authentic votes, so the majority is non-faulty. C’s vote at A is authentic (though faulty). B’s vote at A is NOT authentic. [Figure: A, C, B, radius 2f] 41
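The counting argument behind Idea 1 can be illustrated with a small sketch. It assumes that every collected vote is authentic (the vote-relaying problem is a separate issue) and that the radius is handed in as a parameter, whereas in the talk f is unknown. The path graph and the names `votes_within` and `majority` are hypothetical.

```python
from collections import Counter, deque


def votes_within(neighbors, values, start, radius):
    """Collect the values of all nodes at distance <= radius from start (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue                      # do not expand beyond the radius
        for u in neighbors[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return [values[v] for v in dist]


def majority(votes):
    """Return the strict-majority value, or None if there is none."""
    value, count = Counter(votes).most_common(1)[0]
    return value if 2 * count > len(votes) else None


# Path 0 - 1 - ... - 8. The correct value is 1; f = 2 faults hit nodes 0 and 1.
n, f = 9, 2
line = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
vals = {i: (0 if i < f else 1) for i in range(n)}
```

With radius 1, node 0 sees only faulty values and `majority` returns the wrong value 0. With radius 2f = 4 it collects 2f + 1 = 5 votes, at most f of them faulty, so the majority is the correct value 1.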

Bcast (vote sending) spreading faults problem under state faults 1 A B C D 1 E 42

Bcast (vote sending) spreading faults problem under state faults 1 A said: “1” A B C D 1 E 43

Bcast (vote sending) spreading faults problem under state faults 1 A said: “1” A B 1 C D A said: “0” E 44

Bcast (vote sending) spreading faults problem under state faults 1 A said: “1” A B 1 C A said: “0” D A said: “0” E 45

Bcast (vote sending) spreading faults problem under state faults 1 A said: “1” A B 1 C A said: “0” D A said: “0” E 46

Bcast (vote sending) spreading faults problem under state faults 1 A said: “1” A B 1 C A said: “0” D A said: “0” E 47

Bcast (vote sending) spreading faults problem under state faults 1 A said: “1” A B 1 C A said: “0” D A said: “0” E 48

Bcast (vote sending) spreading faults problem under state faults 1 A said: “1” A B 1 C A said: “0” D A said: “0” E 49

Bcast (vote sending) spreading faults problem under state faults 1 Actually, A said: “1” A B D C 1 A said: “0” E 50

Bcast (vote sending) spreading faults problem under state faults 1 Actually, A said: “1” A B D C 1 Actually, A said: “1” E 51

Bcast (vote sending) spreading faults problem under state faults 1 Actually, A said: “1” A B D C 1 Actually, A said: “1” E Actually, A said: “1” 52

Bcast (vote sending) spreading faults problem under state faults 1 Actually, A said: “1” A B D C 1 Actually, A said: “1” E Actually, A said: “1” 53

Bcast (vote sending) spreading faults problem under state faults 1 Actually, A said: “1” A B D C 1 Actually, A said: “1” E Z Actually, A said: Actually… Actually, A said: “1” Here, the vote of A at Z is not authentic even after a long time, even with 1 fault. 54

Bcast (vote sending) spreading faults problem under state faults 1 Actually, A said: “1” A B D C 1 Actually, A said: “1” E Z Actually, A said: Actually… Actually, A said: “1” Here, the vote of A at Z is not authentic even after a long time, even with 1 fault. Global effect for one fault. 55

Another bcast spreading fault problem F A B C D E B, C, D, D, F all said: “0” One faulty node can make the votes of a majority non-authentic. 56

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] 1 A B C D … 57

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] 1 A B C D … A 11111111111111111111111111111 58

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 11111111111111111111111111111000001111111111111111111 f = five faults hit 59

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 1 1111 0000 111111111111 After one time unit, the first 0 notices the inconsistency and resets A’s bcast value to bottom. The first “1” after the zeros does the same. 60

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 1 11111 000 0 1111111111 After two time units: the bottom value spreads twice as fast as the bcast (“0”s and “1”s). 61

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 1 11111 00 0 111111111 After three time units: the bottom value spreads twice as fast as the bcast (“0”s and “1”s). 62

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 1 11111 0 0 0 11111111 After four time units: the bottom value spreads twice as fast as the bcast (“0”s and “1”s). 63

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 1 11111 0 0 1111111 After five time units: the bottom value spreads twice as fast as the bcast (“0”s and “1”s). 64

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 1 111111 0 0 111111 After six time units: the bottom value spreads twice as fast as the bcast (“0”s and “1”s). 65

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 1 11111 11 0 1111 After seven time units: the bottom value spreads twice as fast as the bcast (“0”s and “1”s). 66

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 1 111111 0 1111 After eight time units: the bottom value spreads twice as fast as the bcast (“0”s and “1”s). 67

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 0 111111 1111 After nine time units: the bottom value spreads twice as fast as the bcast (“0”s and “1”s). 68

Sample technique: solving the bcast authenticity in O(f) time Regulated Bcast [Kutten, Patt-Shamir] Power Supply [Afek, Bremler] A 11111 After 10 time units, the bottom value replaces the un-authentic vote, while the authentic vote “1” continues to proceed at half speed. 69
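The reason a 2:1 speed ratio gives O(f) time can be checked with a toy chase. This sketch models only the two speeds; it is not the Regulated Bcast or Power Supply protocol itself. The function `catch_up_time` and the exact step timing (the slow front moves on every second time unit) are assumptions.

```python
def catch_up_time(gap):
    """Time units until a fast front overtakes a slow front that starts `gap` nodes ahead.

    The fast front (the reset value) advances one node per time unit.
    The slow front (an ordinary bcast value) advances one node every
    two time units.
    """
    chaser, runner, t = 0, gap, 0
    while chaser < runner:
        t += 1
        chaser += 1
        if t % 2 == 0:
            runner += 1
    return t
```

Because the gap closes by one node every two time units, a stretch of f non-authentic values is overtaken within 2f time units in this model, which matches the order of the 10 time units shown for f = 5.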

Recall the Stable Value Problem (1) In O(f) time all non-authentic values disappear in A. (2) In another O(f) time A knows the authentic votes of 2f+1 nodes. (3) The majority of these votes are not faulty. [Figure: A, radius 2f; label “Impossible after O(f)”] 70

A lot of recent related work. A very partial bibliography: [Kutten, Peleg], [KP 1], [Ghosh, Gupta, Herman, Pemmaraju], [Afek, Dolev], [Chlamtac, Pinter], [Dolev, Herman], [Naor, Stockmeyer], [Arora, Zhang], Beauquier, Genolini, Cournier, Datta, Petit, Villain, Xin He. Related notions: Error Confinement, Time Adaptive, Fault Local, Mending, Fault Containment, Snap Stabilization, Local Stabilization 71