3f41a368878c598c4162489487fa509a.ppt
- Number of slides: 71
STABILIZATION & LOCALITY Shay Kutten Technion, Israel 1
Recall: Traditional methods are global * Dijkstra: 1 fault, O(n) time; f faults, o(fn) time * [Katz, Perry]: 1st general method, 1 fault, O(n) time: (1) self-stab bcast freezes all nodes (2) global self-stab snapshot to a leader (3) leader checks global state (4) leader initializes every node (if faulty) (5) bcast unfreezes nodes 2
Recall: Traditional methods are global (continued) * Global reset (general) methods, 1 fault, O(n) time: [Afek, Kutten, Yung], [Awerbuch, Patt-Shamir, Varghese], [Awerbuch, Kutten, Mansour, Patt-Shamir, Varghese]: (1) bcast freezes all nodes (2) reset to a specific initial state * [Dolev, Herman] superstabilizing global reset: (2') reset to a state "nearest" to the current state (still 1 fault, O(n) time) 3
Another look at Dijkstra’s algorithm 4
Global effect example: the effect of one fault circles the whole ring. [Figure, animated over slides 5-12: a ring of nodes, each holding the value 7 or 8, with the leader and the token marked.] 5-12
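The ring example can be replayed with a minimal sketch of Dijkstra's K-state token ring. The ring size (8), K = 9, the starting values, and the scheduler that always lets the highest-numbered privileged node move are assumptions chosen for illustration; they are not read off the slide figures.

```python
def privileged(x):
    """Indices of nodes that hold a privilege (token) in state x."""
    priv = [0] if x[0] == x[-1] else []          # leader: equal to its predecessor
    priv += [i for i in range(1, len(x)) if x[i] != x[i - 1]]
    return priv


def move(x, i, k):
    """Let node i make its move, in place."""
    if i == 0:
        x[0] = (x[0] + 1) % k                    # leader increments modulo K
    else:
        x[i] = x[i - 1]                          # others copy their predecessor


def stabilize(x, k, pick=max):
    """Run until one privilege is left; return final state and the movers.

    `pick` is a hypothetical scheduler: it chooses which privileged node moves.
    """
    x = list(x)
    movers = []
    while len(privileged(x)) > 1:
        i = pick(privileged(x))
        move(x, i, k)
        movers.append(i)
    return x, movers


# Legitimate state [8, 8, 8, 7, 7, 7, 7, 7] with one fault at node 5.
faulty = [8, 8, 8, 7, 7, 8, 7, 7]
final, movers = stabilize(faulty, k=9)
print(final, movers)
```

Under this scheduler the single corrupted value is copied onward until it reaches the end of the ring before the extra privileges disappear, which is the global effect the frames illustrate.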
Another example of self-stab in industry. [Figure: nodes A, B, C, D, E; each node has a general-purpose computer and a fast "stupid" switch.] A fast route from A to C passes only through B's stupid switch, not B's general-purpose computer. This is possible since the route is preset. 13
Another example of self-stab in industry. [Figure: nodes A, B, C, D, E.] In a bcast, how does A's stupid switch detect that it has already received the bcast? 14
Another example of self-stab in industry. [Figure: nodes A, B, C, D, E.] (Non-self-stab) solution: the stupid switch forwards only over ports of links marked "tree". 15
Another example of self-stab in industry. [Figure: nodes A, B, C, D, E, with a fault.] (Non-self-stab) solution: the (stupid) switch forwards only over ports of links marked "tree". This is vulnerable to a state fault: suppose the "tree" is really a cycle. 16
Industrial solutions to the self-stab problem (1): Digital's LAN bridges solution. [Figure, animated over slides 17-19: root A and nodes B, C, D, E; the number of links labeled "pulse" grows from frame to frame.] 17-19
Industrial solutions to the self-stab problem (2): IBM's ATM solution. [Figure, animated over slides 20-24: root A and nodes B, C, D, E; the messages carry the hop counter values 6, 5, 4, 3, 2 in successive frames.] The hop counter is decreased at every hop. When the hop counter reaches 0, the message is discarded. 20-24
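The hop-counter rule can be sketched as follows. The link tables, the function name `flood`, and the choice to decrement on receipt are assumptions for illustration; this is not a description of the actual IBM product. The first table is a ring, modeling a corrupted "tree" marking that is really a cycle.

```python
from collections import deque


def flood(links, root, hops):
    """Flood a message from root over the marked links.

    Returns the total number of transmissions. Every receiver decrements
    the hop counter and discards the message when the counter reaches 0.
    """
    queue = deque((root, nbr, hops) for nbr in links[root])
    sent = 0
    while queue:
        src, dst, h = queue.popleft()
        sent += 1
        h -= 1                       # hop counter decreased
        if h == 0:
            continue                 # counter reached 0: discard message
        for nbr in links[dst]:
            if nbr != src:           # forward over all other marked ports
                queue.append((dst, nbr, h))
    return sent


# Corrupted marking: the "tree" links form the cycle A-B-C-D-E-A.
ring = {"A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D"],
        "D": ["C", "E"], "E": ["D", "A"]}
# Correct marking: a path A-B-C.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

print(flood(ring, "A", hops=6), flood(path, "A", hops=6))
```

On the cycle the message would circulate forever without the counter; with it, the number of transmissions is bounded by the initial counter value, at the cost of every fault still having a network-wide effect.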
Scalability • Industrial solutions are global. They do not scale well • Conventional self-stab protocols are global. They do not scale well 25
Local detection [Afek, Kutten, Yung] (local "checking" [Awerbuch, Patt-Shamir, Varghese]). A cycle will be detected by a node that sees only its parent's state and its own state. [Figure: a root labeled 0 and nodes carrying numeric labels up to 14, with 14 appearing twice.] 26
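A minimal sketch of such a local check, assuming each node stores a parent pointer and a numeric label read as its distance from the root (the reading of the labels as distances, and all names below, are assumptions). Each node compares only its own state with its parent's state; labels cannot increase by one all the way around a cycle, so some node on any cycle must fail the check.

```python
def local_violations(parent, dist, root):
    """Return the nodes whose own state is inconsistent with their parent's."""
    bad = []
    for v in parent:
        if v == root:
            ok = parent[v] is None and dist[v] == 0
        else:
            p = parent[v]
            ok = p is not None and dist[v] == dist[p] + 1
        if not ok:
            bad.append(v)
    return bad


# A legal tree: r <- x <- y.
tree_parent = {"r": None, "x": "r", "y": "x"}
tree_dist = {"r": 0, "x": 1, "y": 2}

# A corrupted state: a, b, c point at each other in a cycle.
cyc_parent = {"r": None, "x": "r", "a": "c", "b": "a", "c": "b"}
cyc_dist = {"r": 0, "x": 1, "a": 12, "b": 13, "c": 14}

print(local_violations(tree_parent, tree_dist, "r"))
print(local_violations(cyc_parent, cyc_dist, "r"))
```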
Local checking of a spanning tree, with a root. [Figure: nodes A to H with labels of the form number(root), here 0(A), 1(A), 2(A), 0(C), 1(C); two nodes are marked "root".] 27
Local checking of other functions • Any graph marking function is locally checkable • Any algorithm's global state is locally checkable • For any bit complexity C, there is a function with local checking bit complexity O(C) • Some complexities for interesting marking functions are known • Many other problems are open 28
From local checking to local correction. Goal: for an (unknown) number f of faults, O(f) time for correction. Idea: the diameter of the faults is f; prevent the faults from expanding, and shrink the faulty area. [Figure: the labeled tree from the local detection slide, with a fault carrying the labels 22, 23, 24, 25: possibly consistent faults.] 29
From local checking to local correction. Goal: for an (unknown) number f of faults, O(f) time for correction. Idea: the diameter of the faults is f; prevent the faults from expanding, and shrink the faulty area. [Figure: as on the previous slide, with the fault annotated: "impossible" if faulty majority.] 30
For example, we need to prevent expansion of the faulty area: recall Dijkstra's algorithm. 31
f faults create a "gap" or "bump" of length f. [Figure: the ring from the global effect example, with the values 7 and 8, the leader, and the token.] 32
f faults create a "gap" or "bump" of length f. [Figure: three positions marked T.] 33
Another example of an expanding faulty area: the Stable Value Problem. The adversary spoils (once) a minority of the replicas of val1 and of the states. The algorithm recovers val1 everywhere. [Figure: nodes A to E, labeled val1 to val5.] 34
Stable Value Problem. The adversary spoils (once) a minority of the replicas of val1 and of the states. The algorithm recovers val1 everywhere. [Figure: nodes A to E; copies of val1 are shown, one of them spoiled (val'1), and E is labeled val5.] 35
Reducing a general problem to the Stable Value Problem: B can compute any func(val1, val2, val3, val4, val5) if it receives every correct val_i. [Figure: nodes A to E, labeled val1 to val5.] 36
"Simple" but global solution: consensus voting. "Simple" is in quotes since E's vote can be spoiled in D on the way to B. [Figure: nodes A to E; copies of val1.] 37
"Simple" but global solution: consensus voting. Time = O(diameter) even for one fault. Desired: few faults, short time. [Figure: nodes A to E; copies of val1.] 38
Local voting does not solve the problem: the faulty nodes can be a minority overall but a local majority everywhere. 39
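A small constructed instance shows how a faulty minority can still be a local majority everywhere. The graph (4 mutually connected faulty "hub" nodes, 5 correct "leaf" nodes with two hub neighbors each) and the rule "adopt the majority of your closed neighborhood" are assumptions made for this sketch, not taken from the slide's figure.

```python
from collections import Counter


def local_majority_round(adj, values):
    """One synchronous round: every node adopts the majority value
    among itself and its direct neighbors."""
    new = {}
    for v in adj:
        votes = Counter(values[u] for u in [v, *adj[v]])
        new[v] = votes.most_common(1)[0][0]
    return new


hubs = [0, 1, 2, 3]                               # faulty nodes, value 0
leaf_links = {4: (0, 1), 5: (1, 2), 6: (2, 3), 7: (3, 0), 8: (0, 2)}

adj = {v: set() for v in range(9)}
for a in hubs:                                    # hubs form a clique
    adj[a].update(b for b in hubs if b != a)
for leaf, pair in leaf_links.items():             # each leaf sees two hubs
    for h in pair:
        adj[leaf].add(h)
        adj[h].add(leaf)

values = {v: 0 if v in hubs else 1 for v in adj}  # 4 faulty, 5 correct
after = local_majority_round(adj, values)
print(values)
print(after)
```

Here the correct value 1 is held by 5 of 9 nodes, yet one round of local voting wipes it out, because every closed neighborhood contains more faulty than correct nodes.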
Idea 1: for (unknown) f faults, get votes (values) from radius 2f. In O(f) time, get more than 2f votes, so the majority is non-faulty. [Figure: node A at the center of a ball of radius 2f.] 40
Idea 1: for (unknown) f faults, get votes (values) from radius 2f. In O(f) time, get more than 2f authentic votes, so the majority is non-faulty. C's vote at A is authentic (though faulty). B's vote at A is NOT authentic. [Figure: node A at the center of a ball of radius 2f, with nodes B and C.] 41
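Idea 1 can be sketched as a majority vote over a ball of radius 2f. For illustration the sketch assumes that f is given, that the graph is a ring of 12 nodes, and that every collected vote is authentic; the names `ball` and `vote` are hypothetical.

```python
from collections import Counter, deque


def ball(adj, center, radius):
    """All nodes within the given hop distance of center (BFS)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return set(dist)


def vote(adj, values, center, f):
    """Majority value among the nodes within radius 2f of center.

    Returns (winning value, number of voters)."""
    voters = ball(adj, center, 2 * f)
    winner, _ = Counter(values[v] for v in voters).most_common(1)[0]
    return winner, len(voters)


n = 12
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
values = {i: 1 for i in range(n)}
values[0] = values[1] = 0            # f = 2 faults, including the voter itself

print(vote(adj, values, center=0, f=2))
```

With f = 2 the ball around node 0 has 9 members, more than 2f, so the 2 faulty values are outvoted even though they form a majority among node 0 and its direct neighbors.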
Bcast (vote sending): the spreading-faults problem under state faults. [Figure, animated over slides 42-55: nodes A, B, C, D, E and, in the last frames, a distant node Z; A's value is 1.] 42-55
Slide 43: the message A said: "1" appears next to A and B.
Slides 44-49: the message A said: "0" appears as well and spreads to further nodes, while A said: "1" stays near A and B.
Slides 50-53: the message Actually, A said: "1" follows A said: "0" from node to node.
Slide 54: here, the vote of A at Z is not authentic even after a long time, even with 1 fault.
Slide 55: global effect for one fault.
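The chase in these frames can be mimicked with a toy relay line. The model is an assumption made for illustration: nodes sit on a line, every node copies its predecessor's stored copy of A's value once per round, and a single state fault corrupts one stored copy.

```python
def relay_round(copies, source_value):
    """One synchronous round: node i adopts what node i-1 held before."""
    return [source_value] + copies[:-1]


def first_time_wrong(n, fault_at, source_value=1, wrong_value=0):
    """Map each node that ever holds the wrong value to the first round
    in which it does."""
    copies = [source_value] * n
    copies[fault_at] = wrong_value           # the single state fault
    wrong_at = {}
    for t in range(n):
        for i, c in enumerate(copies):
            if c != source_value:
                wrong_at.setdefault(i, t)
        copies = relay_round(copies, source_value)
    return wrong_at


print(first_time_wrong(n=10, fault_at=2))
```

The correction travels at the same speed as the wrong value, so it never overtakes it: every node beyond the fault, however distant, holds a non-authentic copy at some point.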
Another bcast spreading-fault problem: one faulty node can make the votes of a majority non-authentic. [Figure: nodes A to F; label: B, C, D, E, F all said: "0".] 56
Sample technique: solving the bcast authenticity in O(f) time. Regulated Bcast [Kutten, Patt-Shamir], Power Supply [Afek, Bremler]. [Figure, animated over slides 57-69: a line of nodes A, B, C, D, ... relaying A's bcast value 1.] 57-69
Slide 58: the copies of A's value along the line: 11111111111111111111111111111.
Slide 59: f = five faults hit: 11111111111111111111111111111000001111111111111111111.
Slide 60: after one time unit, the first "0" notices the inconsistency and resets A's bcast value to bottom. The first "1" after the zeros does the same.
Slides 61-68: after two to nine time units, bottom spreads twice as fast as the bcast (the "0"s and "1"s), and the run of "0"s shrinks from frame to frame.
Slide 69: after 10 time units, bottom replaces the un-authentic vote, while the authentic vote "1" continues to proceed at half speed.
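The timing in these frames can be checked with a toy kinematic model. It is only a sketch of the speed argument, not the Regulated Bcast or Power Supply protocol: it assumes the run of f wrong values moves away from A at half a hop per time unit, while the reset (bottom) starts at the near end of the run and moves one hop per time unit. Positions are kept in half-hops to stay in integers.

```python
def catch_up_time(f):
    """Time units until the reset front has swept the whole run of
    f wrong values (positions measured in half-hops)."""
    reset_front = 0            # starts at the first wrong value
    wrong_front = 2 * f        # far end of the run, f hops further on
    t = 0
    while reset_front < wrong_front:
        t += 1
        reset_front += 2       # bottom: one hop per time unit
        wrong_front += 1       # bcast values: half a hop per time unit
    return t


print(catch_up_time(5))
```

For f = 5 the model gives 10 time units, matching the last frame, and in general 2f, which is the O(f) bound in the slide title.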
Recall the Stable Value Problem: (1) In O(f) time all non-authentic values disappear at A. (2) In another O(f) time A knows the authentic votes of 2f+1 nodes. (3) A majority of these votes are not faulty. [Figure: node A with a ball of radius 2f; label: "Impossible" after O(f).] 70
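Step (3) is a counting argument. Written out, assuming that at most f of the 2f+1 voters are faulty:

```latex
\[
\#\{\text{non-faulty votes}\} \;\ge\; (2f+1) - f \;=\; f+1 \;>\; \frac{2f+1}{2}.
\]
```

So the non-faulty votes form a strict majority of the 2f+1 authentic votes that A holds.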
A lot of recent related work. A very partial bibliography: [Kutten, Peleg], [KP 1], [Ghosh, Gupta, Herman, Pemmaraju], [Afek, Dolev], [Chlamtac, Pinter], [Dolev, Herman], [Naor, Stockmeyer], [Arora, Zhang]; Beauquier, Genolini, Cournier, Datta, Petit, Villain, Xin He. Related terms: Error Confinement, Time Adaptive, Fault Local, Mending, Fault Containment, Snap Stabilization, Local Stabilization. 71