

  • Number of slides: 33

Future Directions in Advanced Storage Services. Danny Dolev, School of Engineering and Computer Science, Hebrew University

Case Study: replication for efficiency and robustness. Storage Area Network (SAN): current technology utilizes standard Ethernet connectivity and clusters of workstations. • Message ordering is used to overcome possible inconsistency in case of failures. • During "stable" periods total ordering is not used because of its high latency. • What about "stress" periods?

Motivation • Message delivery order is a fundamental building block in distributed systems. • The agreed order allows distributed applications to use the state-machine replication model to achieve fault tolerance and data replication. • Replicated systems are often built atop Group Communication Systems (GCS), which provide message ordering, reliable delivery and group membership. • Many GCSs were introduced with a variety of optimization tradeoffs; each has its own bottleneck, preventing it from becoming truly scalable.
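The state-machine replication model mentioned above can be sketched in a few lines: if every replica applies the same totally ordered command stream to a deterministic state machine, all replicas end in the same state. The class and command names below are illustrative, not from the talk.

```python
# Minimal sketch of state-machine replication: each replica applies the
# same totally ordered log of commands to a deterministic state machine.

class Replica:
    def __init__(self):
        self.state = 0

    def apply(self, command):
        # Commands must be deterministic for replication to work.
        op, arg = command
        if op == "add":
            self.state += arg
        elif op == "mul":
            self.state *= arg

# The GCS guarantees every replica sees the commands in the same order.
ordered_log = [("add", 2), ("mul", 3), ("add", 1)]

replicas = [Replica() for _ in range(3)]
for r in replicas:
    for cmd in ordered_log:
        r.apply(cmd)

# All replicas converge: (0 + 2) * 3 + 1 = 7
assert all(r.state == 7 for r in replicas)
```

Because each `apply` is deterministic, agreement on the order alone is enough for consistency; this is exactly why total-order delivery matters.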

Current State • High-performance implementations use a management layer that resides in the critical path and provides: message ordering, membership, state synchronization and consistency. • This layer consumes valuable CPU cycles needed for the actual "body of work". • There is no standard interface (API) and no interoperability. • It dictates a specific programming methodology (e.g., event driven). • Network capacity outperforms any progress in CPU capability.

The challenges • As network speed reaches several tens of Gb/s, even a multi-core server reaches its CPU limits (approximately 1 GHz per 1 Gbps). [Graphs: GHz/Gbps Rx ratio and GHz/Gbps Tx ratio.] *The graphs appear in the paper "TCP performance revisited" (ISPASS'03) by Foong et al. and are used with the authors' permission.

The challenges • New techniques are called for to free the CPU to do productive work. • Extra resources exist: peripheral devices are equipped with programmable processors (GPUs, disk controllers, NICs). • Some devices have dedicated CPUs with unique properties (SIMD, TCAM memory, encryption/decryption logic). • Offloading parts of the application to such devices is the new dimension!

Reasons for Offloading • Memory bottlenecks: reduced memory pressure and cache misses (due to filtering done at the device). • Better timeliness guarantees: GPOS ↔ embedded OS (RTOS); avoiding "OS noise" (interrupts, context switches, timers, etc.).

Reasons for Offloading • Security: another level of isolation; harder to tamper with. • Reduced power consumption: Pentium 4 2.8 GHz: 68 W; Intel XScale 600 MHz: 0.5 W.

Sample Devices: Graphics • AGEIA PhysX: 500 MHz multi-core processor; specialized physics units. • IBM T60 - ATI Mobility™ Radeon® X1300: 6 programmable shader processors; 512 MB. • NVIDIA® GeForce® 6/7800 ($600): 400 MHz core; 512 MB DDR; memory bandwidth: 54.4 GB/sec.

Sample Devices: Graphics • Compared to the CPU, GPU performance has been increasing at a much faster rate (~3 times Moore's law; ~12 Gflops). • SIMD architecture (Single Instruction, Multiple Data).

Sample Devices: Networking • Today's Network Interface Cards (NICs) are equipped with an onboard CPU that executes proprietary code and is inaccessible to the OS. • "Killer NIC" (http://www.killernic.com/KillerNic/): 400 MHz Network Processing Unit; 64 MB DDR; embedded Linux OS; "world's first Network Card designed specifically for Online Gaming".

Replication and Offloading • Offloading reusable components that implement various distributed algorithms will facilitate the development of cluster replication and reliability. • Possible candidates: reliable broadcast; total order (timestamp ordering, token ring, etc.); membership services; failure detectors; atomic commit protocols (2PC, 3PC, E3PC, etc.); a locking service.
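One of the candidate protocols listed above, two-phase commit (2PC), is compact enough to sketch. In this illustrative version the coordinator commits only if every participant votes yes; the `Participant` class simulates remote participants and all names are assumptions for illustration.

```python
# Hedged sketch of two-phase commit (2PC): phase 1 collects votes,
# phase 2 disseminates a single decision to all participants.

class Participant:
    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes/no on whether this participant can commit.
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    # Phase 1: voting.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase 2: completion; every participant learns the same decision.
    for p in participants:
        p.commit() if decision else p.abort()
    return decision

# A single "no" vote aborts the transaction everywhere.
group = [Participant(), Participant(can_commit=False)]
assert two_phase_commit(group) is False
assert all(p.state == "aborted" for p in group)
```

A real 2PC must also log decisions durably and handle coordinator failure, which is what 3PC and E3PC (also listed on the slide) address.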

Example: Offloaded TO-Application • We have offloaded Lamport's timestamp ordering algorithm to the networking device. • Application architecture (on each of PC 1 … PC n): the GUI and TO Service run on the host; the Orderer and Reliable Broadcast run on the device.
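The algorithm the slide names, Lamport-timestamp total ordering, can be sketched as follows: each process stamps its broadcasts with a logical clock, and messages are delivered in (timestamp, sender-id) order, which is a total order. This is a host-side illustration of the idea, not the offloaded NIC implementation; class and method names are assumptions.

```python
# Illustrative Lamport-timestamp total ordering. Ties on timestamp are
# broken by sender id, so (timestamp, sender) yields a total order.
import heapq

class Process:
    def __init__(self, pid):
        self.pid = pid
        self.clock = 0
        self.pending = []   # min-heap ordered by (timestamp, sender)

    def broadcast(self, payload, peers):
        self.clock += 1
        msg = (self.clock, self.pid, payload)
        for p in peers:
            p.receive(msg)

    def receive(self, msg):
        ts, sender, _ = msg
        # Lamport clock update on receive.
        self.clock = max(self.clock, ts) + 1
        heapq.heappush(self.pending, msg)

    def deliver_all(self):
        # A real implementation also waits for acknowledgements before
        # delivering; here we simply drain the heap in total order.
        out = []
        while self.pending:
            out.append(heapq.heappop(self.pending))
        return [payload for _, _, payload in out]

p1, p2, p3 = Process(1), Process(2), Process(3)
p1.broadcast("a", [p2, p3])
p2.broadcast("b", [p1, p3])
assert p3.deliver_all() == ["a", "b"]
```

Every receiver sorts the same message set by the same key, so all processes deliver in the same order, which is exactly the TO service's contract.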

Hydra: An Offloading Framework • Offloading an application is a tedious task: it depends on device capabilities, SDK and toolchain; requires kernel knowledge (device drivers, DMA); and must be repeated for each target device. • We have developed a generic offloading framework that enables a developer to address the offloading aspects of the application at design time. • Joint work with: Yaron Weinsberg (HUJI), Tal Anker (Marvell), Muli Ben-Yehuda (IBM), Pete Wyckoff (OSC).

HYDRA Programming Model • The Hydra programming model enables one to develop Offload-Aware (OA) applications that are "aware" of available computing resources. • The minimal unit for offloading is called an "Offcode" (i.e., "Offloaded Code"). • An offcode exports a well-defined interface (like COM objects); is given as open source or as compiled binaries; and is described by an Offcode Description File (ODF) that exposes the offcode's functionality (interfaces).
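Since the slide compares offcodes to COM objects, the "well-defined interface" idea can be illustrated with an abstract interface and a trivial implementation. The interface name, its methods and the loopback stand-in are hypothetical, not the Hydra API.

```python
# Hypothetical sketch of an offcode exporting a well-defined interface,
# in the spirit of a COM interface. Names are illustrative only.
from abc import ABC, abstractmethod

class IUnicast(ABC):
    """Interface an offcode might export for unicast messaging."""

    @abstractmethod
    def send(self, dest: int, payload: bytes) -> None: ...

    @abstractmethod
    def recv(self) -> bytes: ...

class LoopbackOffcode(IUnicast):
    # A trivial host-side stand-in: echoes sent payloads back to recv().
    def __init__(self):
        self.queue = []

    def send(self, dest: int, payload: bytes) -> None:
        self.queue.append(payload)

    def recv(self) -> bytes:
        return self.queue.pop(0)

oc = LoopbackOffcode()
oc.send(0, b"hello")
assert oc.recv() == b"hello"
```

Because callers program against `IUnicast` rather than a concrete class, the same application code can bind to a host implementation or a device-resident offcode.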

Offcode Libraries [Diagram: an Offcode Library grouped into Networking, Math, Graphics and Security sections; example offcodes include BSD Socket (socket.odf), CRC32 (crc32.odf) and mpegDecoder (mpegDecoder.odf); an OA-App imports offcodes from the library and from a user library.]

Offcode Description File • The ODF is an XML document, e.g.:
<offcode name="BSD socket offcode">
  <interfaces>
    <interface name="unicast" ID="IID_UNICAST">
      <method … />
    </interface>
    …
  </interfaces>
  …
</offcode>
• It also contains a Device section (e.g., ethernet, 1000) and an Import section (e.g., NetBSD Socket, crc32.odf).

Channels (I) • Offcodes are interconnected via channels, which determine various communication properties between offcodes. • An Out-Of-Band channel (OOB-channel) is attached to every OA-application and offcode: it is not performance critical (uses memory copies) and is used for initialization, control and event dissemination. [Diagram: offcodes A, B and C connected by a specialized channel and an OOB-channel.]

Channels (II) • A specialized channel is created for performance-critical communication. • Hydra provides several channel types: unicast / multicast; reliable / unreliable; synchronized / asynchronous; buffered / zero-copy; R/W/both.
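The channel taxonomy above can be illustrated with a small configuration object: an OOB channel for control traffic plus a specialized channel whose properties are chosen at creation time. Everything here (class name, fields, the `log` bookkeeping) is an assumption for illustration, not the Hydra API.

```python
# Illustrative channel abstraction: properties are fixed at creation.
from dataclasses import dataclass, field

@dataclass
class Channel:
    reliable: bool = True
    multicast: bool = False
    zero_copy: bool = False
    log: list = field(default_factory=list)

    def send(self, data: bytes):
        # A zero-copy channel would hand the device a reference to the
        # caller's buffer; a buffered one copies. We only record the mode.
        mode = "zero-copy" if self.zero_copy else "buffered"
        self.log.append((mode, data))

# Every OA-application gets an OOB channel for init/control/events...
oob = Channel(reliable=True, zero_copy=False)
# ...and creates specialized channels for the performance-critical path.
fast = Channel(reliable=False, multicast=True, zero_copy=True)

oob.send(b"init")
fast.send(b"payload")
```

The design point the slide makes is that the cheap, copying OOB path and the tuned specialized path coexist, so control traffic never has to pay for, or interfere with, the fast path's guarantees.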

Design Methodology • We follow the "layout design" methodology first presented in FarGo (1) and later in FarGo-DA (2). • Offload-aware applications are designed in two aspects: 1. Basic logic design: design the application logic and define the components to be offloaded. 2. Offloading layout design: define the communication channels between offcodes and their location constraints. (1) "FarGo-System", ICDCS'99, Ophir Holder and Israel Ben-Shaul. (2) "A programming model and system support for disconnected-aware applications on resource-constrained devices", ICSE'02, Yaron Weinsberg and Israel Ben-Shaul.

1. Logical Design (the example)

Component         | Description
GUI               | Provides the viewing area and user controls (define a message pattern and frequency, and send it)
TO Service        | Provides the TO API: TO_broadcast(), TO_recv()
LamportOrderer    | Implements the specific algorithm instance (timestamp ordering)
ReliableBroadcast | Implements a simple RB algorithm

2. Offloading Layout Design [Diagram. Components legend: 1: GUI, 2: TO Service, 3: Lamport Orderer, 4: Reliable Broadcast; components 3 and 4 sit next to the net device.]

Channel Constraints • Link constraint (default): A.ODF places A on Device 1; B.ODF allows B on Device 1 or Device 2. A Link channel merely connects A and B, so B may be placed on either device. [Animation frames showing B on Device 2 and on Device 1.]

Channel Constraints • Pull constraint: A.ODF places A on Device 1; B.ODF allows B on Device 1 or Device 2. [Animation frames showing the Pull channel drawing B from Device 2 onto Device 1, where A resides.]

Channel Constraints • Gang constraint: A.ODF places A on Device 1; B.ODF places B on Device 2. [Diagram: A on Device 1 and B on Device 2, connected by a Gang channel.]

Finally: Application Deployment • Layout graph → mapping to logical devices → mapping to physical devices → offcode generation → offloading → execution.

Evaluation: OA Total-Order Application

Evaluation: OA Total-Order Application • 5 Intel Pentium 4 2.4 GHz systems; 512 MB of RAM; 32-bit, 33 MHz PCI bus. • Programmable Netgear 620 NICs, 512 kB RAM. • Linux version 2.6.11 with the Hydra module. • Dell PowerConnect 6024 Gigabit Ethernet switch.

Conclusions • We are at the beginning of a journey toward enabling an application developer to fully utilize the available computing resources: peripherals and multi-core systems. • Offloading can improve the performance of distributed applications, advanced storage services, IDS systems, VMMs, etc.