

  • Number of slides: 56

IRAM and ISTORE Projects
Aaron Brown, James Beck, Rich Fromm, Joe Gebis, Paul Harvey, Adam Janin, Dave Judd, Kimberly Keeton, Christoforos Kozyrakis, David Martin, Rich Martin, Thinh Nguyen, David Oppenheimer, Steve Pope, Randi Thomas, Noah Treuhaft, Sam Williams, John Kubiatowicz, Kathy Yelick, and David Patterson
http://iram.cs.berkeley.edu/[istore]
Winter 2000 IRAM/ISTORE Retreat
Slide 1

IRAM Vision: Intelligent PDA
• Pilot PDA + gameboy, cell phone, radio, timer, camera, TV remote, AM/FM radio, garage door opener, ...
• + Wireless data (WWW)
• + Speech, vision, video
• + Voice output for conversations, speech control
• + Vision to see, scan documents, read bar codes, ...
Slide 3

ISTORE Hardware Vision
• System-on-a-chip enables computer + memory without significantly increasing the size of the disk
• 5-7 year target:
• MicroDrive: 1.7" x 1.4" x 0.2" (2006: ?)
  – 1999: 340 MB, 5400 RPM, 5 MB/s, 15 ms seek
  – 2006: 9 GB, 50 MB/s? (1.6X/yr capacity, 1.4X/yr BW) — see the projection sketch below
• Integrated IRAM processor – 2x height
• Connected via crossbar switch – growing like Moore's Law
• 10,000+ nodes in one rack!
Slide 4
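A quick check of the 2006 projection above, using only the growth rates and the 1999 baseline stated on the slide; the 7-year horizon is an assumption (the upper end of the 5-7 year target). A minimal C sketch:

```c
/* Sketch: project the 2006 MicroDrive numbers from the 1999 baseline
 * using the growth rates on the slide (1.6X/yr capacity, 1.4X/yr BW). */
#include <math.h>
#include <stdio.h>

int main(void) {
    double capacity_mb   = 340.0;  /* 1999 MicroDrive capacity */
    double bandwidth_mbs = 5.0;    /* 1999 media bandwidth, MB/s */
    int years = 7;                 /* assumed horizon: 1999 -> 2006 */

    double cap_2006 = capacity_mb * pow(1.6, years);    /* ~9100 MB */
    double bw_2006  = bandwidth_mbs * pow(1.4, years);  /* ~53 MB/s */

    printf("2006 projection: %.1f GB, %.0f MB/s\n",
           cap_2006 / 1024.0, bw_2006);
    return 0;
}
```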

VIRAM: System on a Chip
Prototype scheduled for tape-out 1H 2000
• 0.18 µm EDL process
• 16 MB DRAM, 8 banks
• MIPS scalar core and caches @ 200 MHz
• 4 64-bit vector unit pipelines @ 200 MHz
• 4 100 MB parallel I/O lines
• 17 x 17 mm, 2 Watts
• 25.6 GB/s memory (6.4 GB/s per direction and per Xbar)
• 1.6 GFLOPS (64-bit), 6.4 GOPS (16-bit)
(block diagram: CPU + $, 4 vector pipes/lanes, crossbar switch, two memory arrays of 64 Mbits / 8 MBytes each, I/O)
Slide 5

IRAM Architecture Update
• ISA mostly frozen since 6/99
  – better fixed-point model and instructions
    » gained some experience using them over past year
  – better exception model
  – better support for short vectors
    » auto-increment memory addressing
    » instructions for in-register reductions & butterfly permutations
  – memory consistency model spec refined (poster)
• Suite of simulators actively used and maintained
  – vsim-isa (functional), vsim-p (performance), vsim-db (debugger), vsim-sync (memory synchronization)
Slide 6

IRAM Software Update
• Vectorizing compiler for VIRAM
  – retargeting CRAY vectorizing compiler (talk)
    » Initial backend complete: scalar and vector instructions
    » Extensive testing for correct functionality
    » Instruction scheduling and performance tuning begun
• Applications using compiler underway
  – Speech processing (talk)
  – Small benchmarks; suggestions welcome
• Hand-coded fixed-point applications
  – Video encoder application complete (poster)
  – FFT: floating point done, fixed point started (talk)
Slide 7

IRAM Chip Update
• IBM to supply embedded DRAM/logic (98%)
  – DRAM macro added to 0.18 micron logic process
  – DRAM specs under NDA; final agreement in UCB bureaucracy
• MIPS to supply scalar core (99%)
  – MIPS processor, caches, TLB
• MIT to supply FPU (100%)
  – single precision (32-bit) only
• VIRAM-1 tape-out scheduled for mid-2000
  – Some updates of micro-architecture based on benchmarks (talk)
  – Layout of multiplier (poster), register file nearly complete
  – Test strategy developed (talk)
  – Demo system high-level hardware design complete (talk)
  – Network interface design complete (talk)
Slide 8

VIRAM-1 block diagram
Slide 9

Microarchitecture configuration
• 2 arithmetic units
  – both execute integer operations
  – one executes FP operations
  – 4 64-bit datapaths (lanes) per unit
• 2 flag processing units
  – for conditional execution and speculation support
• 1 load-store unit
  – optimized for strides 1, 2, 3, and 4
  – 4 addresses/cycle for indexed and strided operations
  – decoupled indexed and strided stores
• Memory system
  – 8 DRAM banks
  – 256-bit synchronous interface
  – 1 sub-bank per bank
  – 16 MBytes total capacity
• Peak performance (worked through in the sketch below)
  – 3.2 GOPS64, 12.8 GOPS16 (w. madd)
  – 1.6 GOPS64, 6.4 GOPS16 (wo. madd)
  – 0.8 GFLOPS64, 3.2 GFLOPS32 (w. madd)
  – 6.4 GByte/s memory bandwidth
Slide 10
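The peak numbers above follow from the configuration if a multiply-add is counted as two operations, only one of the two arithmetic units executes FP, and each 64-bit lane packs four 16-bit or two 32-bit elements; those interpretations are assumptions consistent with the slide, not stated on it. A small C sketch of the arithmetic:

```c
/* Sketch: reproduce the peak-performance figures from the configuration. */
#include <stdio.h>

int main(void) {
    const double clock_hz    = 200e6; /* 200 MHz pipelines */
    const int lanes_per_unit = 4;     /* 64-bit datapaths per unit */
    const int int_units      = 2;     /* both units do integer ops */
    const int fp_units       = 1;     /* assumed: one unit does FP */

    double int64_wo_madd = int_units * lanes_per_unit * clock_hz;  /* 1.6 GOPS */
    double int64_w_madd  = 2 * int64_wo_madd;                      /* 3.2 GOPS */
    double int16_w_madd  = 4 * int64_w_madd;                       /* 12.8 GOPS */
    double fp64          = fp_units * lanes_per_unit * clock_hz;   /* 0.8 GFLOPS */
    double fp32_w_madd   = 2 * 2 * fp64;                           /* 3.2 GFLOPS */

    printf("64-bit int: %.1f / %.1f GOPS (wo/w madd)\n",
           int64_wo_madd / 1e9, int64_w_madd / 1e9);
    printf("16-bit int (w madd): %.1f GOPS\n", int16_w_madd / 1e9);
    printf("FP: %.1f GFLOPS (64-bit), %.1f GFLOPS (32-bit, w madd)\n",
           fp64 / 1e9, fp32_w_madd / 1e9);
    return 0;
}
```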

Media Kernel Performance
Slide 11

Base-line system comparison
• All numbers in cycles/pixel
• MMX and VIS results assume all data in L1 cache
Slide 12

Scaling to 10K Processors
• IRAM + micro-disk offer huge scaling opportunities
• Still many hard system problems, SAM AME (talk)
  – Availability
    » 24 x 7 databases without human intervention
    » Discrete vs. continuous model of machine being up
  – Maintainability
    » 42% of system failures are due to administrative errors
    » self-monitoring, tuning, and repair
  – Evolution
    » Dynamic scaling with plug-and-play components
    » Scalable performance, gracefully down as well as up
    » Machines become heterogeneous in performance at scale
Slide 13

ISTORE-1: Hardware for AME
Hardware: plug-and-play intelligent devices with self-monitoring, diagnostics, and fault-injection hardware
  – intelligence used to collect and filter monitoring data
  – diagnostics and fault injection enhance robustness
  – networked to create a scalable shared-nothing cluster
• Intelligent Chassis: 80 nodes, 8 per tray; 2 levels of switches (20 100 Mb/s, 2 1 Gb/s); environment monitoring: UPS, redundant PS, fans, heat and vibration sensors, ...
• Intelligent Disk "Brick": portable PC processor (Pentium II + DRAM), redundant NICs (4 100 Mb/s links), diagnostic processor, disk, half-height canister
Slide 14

ISTORE Brick Block Diagram
(block diagram: Mobile Pentium II module — CPU, North Bridge, South Bridge, 256 MB DRAM, SCSI, 4 x 100 Mb/s Ethernets, Super I/O, BIOS, 18 GB disk, diagnostic net, dual UART, diagnostic processor with monitor & control, PCI, Flash, RTC, RAM)
• Sensors for heat and vibration
• Control over power to individual nodes
Slide 15

ISTORE Software Approach
• Two-pronged approach to providing reliability:
1) reactive self-maintenance: dynamic reaction to exceptional system events
    » self-diagnosing, self-monitoring hardware
    » software monitoring and problem detection
    » automatic reaction to detected problems
2) proactive self-maintenance: continuous online self-testing and self-analysis
    » automatic characterization of system components
    » in situ fault injection, self-testing, and scrubbing to detect flaky hardware components and to exercise rarely-taken application code paths before they're used
Slide 16

ISTORE Applications
• Storage-intensive, reliable services for ISTORE-1
  – infrastructure for "thin clients," e.g., PDAs
  – web services, such as mail and storage
  – large-scale databases (talk)
  – information retrieval (search and on-the-fly indexing)
• Scalable memory-intensive computations for ISTORE in 2006
  – Performance estimates through IRAM simulation + model
    » not major emphasis
  – Large-scale defense and scientific applications enabled by high memory bandwidth and arithmetic performance
Slide 17

Performance Availability
• System performance limited by the weakest link
• NOW Sort experience: performance heterogeneity is the norm
  – disks: inner vs. outer track (50%), fragmentation
  – processors: load (1.5-5x) and heat
• Virtual Streams: dynamically off-load I/O work from slower disks to faster ones
Slide 18

ISTORE Update
• High-level hardware design by UCB complete (talk)
  – Design of ISTORE boards handed off to Anigma
    » First run complete; SCSI problem to be fixed
    » Testing of UCB design (DP) to start asap
    » 10 nodes by end of 1Q 2000, 80 by 2Q 2000
    » Design of BIOS handed off to AMI
  – Most parts donated or discounted
    » Adaptec, Andataco, IBM, Intel, Micron, Motorola, Packet Engines
• Proposal for quantifying AME (talk)
• Beginning work on short-term applications
    » Mail server
    » Web server
    » Large database
    » Decision support primitives
  – will be used to drive principled system design
Slide 19

Conclusions
• IRAM attractive for two Post-PC applications because of low power, small size, high memory bandwidth
  – Mobile consumer electronic devices
  – Scalable infrastructure
• IRAM benchmarking result: faster than DSPs
• ISTORE: hardware/software architecture for large-scale network services
• Scaling systems requires
  – new continuous models of availability
  – performance not limited by the weakest link
  – self-* systems to reduce human interaction
Slide 20

Backup Slides
Slide 21

Introduction and Ground Rules
• Who is here?
  – Mixed IRAM/ISTORE "experience"
• Questions are welcome during talks
• Schedule: lecture from Brewster Kahle during Thursday's Open Mic session
• Feedback is required (Fri am)
  – Be careful, we have been known to listen to you
• Mixed experience: please ask
• Time for skiing and talking tomorrow afternoon
Slide 22

2006 ISTORE
• ISTORE node
  – Add 20% pad to MicroDrive size for packaging, connectors
  – Then double thickness to add IRAM
  – 2.0" x 1.7" x 0.5" (51 mm x 43 mm x 13 mm)
• Crossbar switches growing by Moore's Law
  – 2x/1.5 yrs → 4X transistors/3 yrs
  – Crossbars grow by N^2 → 2X switch/3 yrs
  – 16x16 in 1999 → 64x64 in 2005
• ISTORE rack (19" x 33" x 84"): 1 tray (3" high) = 16 x 32 = 512 ISTORE nodes/tray (see the sketch below)
• 20 trays + switches + UPS → 10,240 ISTORE nodes/rack (!)
Slide 23
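The node counts follow directly from the tray and rack geometry on the slide; a trivial C sketch of the arithmetic:

```c
/* Sketch: the rack arithmetic behind the node counts on this slide. */
#include <stdio.h>

int main(void) {
    int nodes_per_tray = 16 * 32;   /* 512 nodes in one 3"-high tray */
    int trays_per_rack = 20;        /* plus switches and UPS */
    int nodes_per_rack = nodes_per_tray * trays_per_rack;

    printf("%d nodes/tray, %d nodes/rack\n", nodes_per_tray, nodes_per_rack);
    /* prints: 512 nodes/tray, 10240 nodes/rack */
    return 0;
}
```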

IRAM/VSUIF Decryption (IDEA)
(chart: performance as a function of # lanes and virtual processor width)
• IDEA decryption operates on 16-bit ints
• Compiled with IRAM/VSUIF
• Note scalability of both # lanes and data width
• Some hand-optimizations (unrolling) will be automated by the Cray compiler
Slide 24

1D FFT on IRAM
FFT study on IRAM
  – bit-reversal time included; cost hidden using indexed store
  – Faster than DSPs on floating-point (32-bit) FFTs
  – CRI Pathfinder does 24-bit fixed point, 1K points in 28 µsec (2 Watts without SRAM)
Slide 25
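For readers unfamiliar with the bit-reversal step, here is a minimal C sketch of the reordering the slide refers to; on VIRAM the permuted write can be issued as an indexed (scatter) vector store, which is how its cost can be hidden. The function names are illustrative, not the project's code.

```c
/* Sketch: bit-reversal reordering for an N-point FFT (N a power of two). */
#include <stddef.h>

static unsigned bit_reverse(unsigned i, unsigned log2n) {
    unsigned r = 0;
    for (unsigned b = 0; b < log2n; b++) {
        r = (r << 1) | (i & 1);
        i >>= 1;
    }
    return r;
}

/* out[bit_reverse(i)] = in[i]; a vectorizing compiler can map the second
 * loop's store onto an indexed (scatter) store driven by the idx[] vector. */
void bitrev_permute(const float *in, float *out, unsigned *idx,
                    size_t n, unsigned log2n) {
    for (size_t i = 0; i < n; i++)
        idx[i] = bit_reverse((unsigned)i, log2n);
    for (size_t i = 0; i < n; i++)
        out[idx[i]] = in[i];
}
```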

3D FFT on ISTORE 2006
• Performance of large 3D FFTs depends on 2 factors
  – speed of 1D FFT on a single node (next slide)
  – network bandwidth for "transposing" data
  – 1.3 Tflop FFT possible w/ 1K IRAM nodes, if network bisection bandwidth scales (!)
Slide 26

ISTORE-1 System Layout
(diagram: brick shelves)
Slide 27

V-IRAM 1: 0.18 µm, Fast Logic, 200 MHz — 1.6 GFLOPS (64b) / 6.4 GOPS (16b) / 32 MB
(block diagram: 2-way superscalar processor with 16 KB I-cache and 16 KB D-cache, vector instruction queue, vector registers, arithmetic (x, ÷) and load/store units with 4 x 64-bit datapaths (also usable as 8 x 32 or 16 x 16), memory crossbar switch to the DRAM banks, I/O channels at 100 MB each)
Slide 28

Fixed-point multiply-add model
(diagram: multiply halfwords x[n/2] * y[n/2], shift & round to n bits, then add a[n] and saturate to produce w[n])
• Same basic model, different set of instructions
  – fixed-point: multiply & shift & round, shift right & round, shift left & saturate
  – integer saturated arithmetic: add or sub & saturate
  – added multiply-add instruction for improved performance and energy consumption
Slide 29
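A scalar C sketch of the fixed-point multiply-add described above, assuming 16-bit halfword operands and round-to-nearest; the exact shift amount and rounding behavior of the VIRAM instruction are not specified here, so treat this as a model, not the ISA definition.

```c
/* Sketch: fixed-point multiply-add with rounding and saturation. */
#include <stdint.h>

static int16_t saturate16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* w = sat( a + round((x * y) >> shift) ), shift >= 1 */
int16_t fixed_madd(int16_t x, int16_t y, int16_t a, unsigned shift) {
    int32_t prod    = (int32_t)x * (int32_t)y;
    int32_t rounded = (prod + (1 << (shift - 1))) >> shift; /* round to nearest */
    return saturate16((int32_t)a + rounded);
}
```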

Other ISA modifications
• Auto-increment loads/stores
  – a vector load/store can post-increment its base address
  – added base (16), stride (8), and increment (8) registers
  – necessary for applications with short vectors or scaled-up implementations
• Butterfly permutation instructions
  – perform a step of a butterfly permutation within a vector register (sketch below)
  – used for FFT and reduction operations
• Miscellaneous instructions added
  – min and max instructions (integer and FP)
  – FP reciprocal and reciprocal square root
Slide 30
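A behavioral C sketch of one butterfly-permutation step, with the vector register modelled as a plain array: element i is exchanged with element i^r, which is the data movement an FFT stage or a tree reduction needs. This illustrates the effect only, not the VIRAM instruction encoding.

```c
/* Sketch: one butterfly-permutation step over a vector of length vlen. */
#include <stddef.h>

void butterfly_step(float *vreg, size_t vlen, size_t r) {
    /* r is a power of two smaller than vlen */
    for (size_t i = 0; i < vlen; i++) {
        size_t j = i ^ r;        /* butterfly partner */
        if (j > i) {             /* swap each pair once */
            float tmp = vreg[i];
            vreg[i] = vreg[j];
            vreg[j] = tmp;
        }
    }
}
```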

Major architecture updates
• Integer arithmetic units support multiply-add instructions
• 1 load-store unit
  – complexity vs. benefit
• Optimize for strides 2, 3, and 4
  – useful for complex arithmetic and image processing functions
• Decoupled strided and indexed stores
  – memory stalls due to bank conflicts do not stall the arithmetic pipelines
  – allows scheduling of independent arithmetic operations in parallel with stores that experience many stalls
  – implemented with address, not data, buffering
  – currently examining a similar optimization for loads
Slide 31

Micro-kernel results: simulated systems
• Note: simulations performed with 2 load-store units and without decoupled stores or optimizations for strides 2, 3, and 4
Slide 32

Micro-kernels
• Vectorization and scheduling performed manually
Slide 33

Scaled system results
• Near-linear speedup for all applications apart from iDCT
• iDCT bottlenecks:
  – large number of bank conflicts
  – 4 addresses/cycle for strided accesses
Slide 34

iDCT scaling with sub-banks
• Sub-banks reduce bank conflicts and increase performance
• Alternative (but not as effective) ways to reduce conflicts:
  – different memory layout
  – different address interleaving schemes
Slide 35

Compiling for VIRAM
• Long-term success of DIS technology depends on a simple programming model, i.e., a compiler
• Needs to handle a significant class of applications
  – IRAM: multimedia, graphics, speech and image processing
  – ISTORE: databases, signal processing, other DIS benchmarks
• Needs to utilize hardware features for performance
  – IRAM: vectorization
  – ISTORE: scalability of shared-nothing programming model
Slide 36

IRAM Compilers
• IRAM/Cray vectorizing compiler [Judd]
  – Production compiler
    » Used on the T90 and C90, as well as the T3D and T3E
    » Being ported (by SGI/Cray) to the SV2 architecture
  – Has C, C++, and Fortran front-ends (focus on C)
  – Extensive vectorization capability
    » outer-loop vectorization, scatter/gather, short loops, ...
  – VIRAM port is under way
• IRAM/VSUIF vectorizing compiler [Krashinsky]
  – Based on VSUIF from Corinna Lee's group at Toronto, which is based on MachineSUIF from Mike Smith's group at Harvard, which is based on the SUIF compiler from Monica Lam's group at Stanford
  – This is a "research" compiler, not intended for compiling large complex applications
  – It has been working since 5/99
Slide 37

IRAM/Cray Compiler Status
(diagram: C / C++ / Fortran front-ends → vectorizer → PDGCS → code generators for C90 and IRAM)
• MIPS backend developed this year
  – Validated using a commercial test suite for code generation
• Vector backend recently started
  – Testing with simulator under way
• Leveraging from Cray
  – Automatic vectorization
Slide 38

VIRAM/VSUIF Matrix/Vector Multiply
• VIRAM/VSUIF does reasonably well on long loops
• 256 x 256 single-precision matrix
• Compare to 1600 Mflop/s (peak without multadd)
• Note BLAS-2 (little reuse)
• ~350 Mflop/s on Power 3 and EV6
• Problems specific to VSUIF:
  – hand strip-mining results in short loops
  – reductions
  – no multadd support
(chart: mvm and vmm performance)
Slide 39
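For reference, the BLAS-2 kernel being measured is just a dense matrix-vector multiply; below is a plain C version of the kind of loop nest a vectorizing compiler maps onto the vector unit (not the actual benchmark source). Strip-mining the inner loop into vector-length chunks is the compiler's job.

```c
/* Sketch: BLAS-2 matrix-vector multiply, y = A * x. */
void mvm(int n, const float a[n][n], const float x[n], float y[n]) {
    for (int i = 0; i < n; i++) {
        float sum = 0.0f;
        for (int j = 0; j < n; j++)   /* vectorizable reduction over j */
            sum += a[i][j] * x[j];
        y[i] = sum;
    }
}
```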

Reactive Self-Maintenance
• ISTORE defines a layered system model for monitoring and reaction:
(diagram: layered model — self-monitoring hardware, SW monitoring, problem detection, policies, coordination of reaction, and reaction mechanisms, split between what the application provides and what the ISTORE runtime system provides, connected through the ISTORE API)
• ISTORE API defines interface between runtime system and app. reaction mechanisms
• Policies define system's monitoring, detection, and reaction behavior
Slide 40
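A hypothetical C sketch of the monitor/detect/react loop the layered model implies; the types and functions (poll_sensors, policy_evaluate, invoke_reaction) are invented stand-ins, since the ISTORE API itself is not specified on this slide.

```c
/* Sketch: policy-driven monitoring and reaction loop (names are hypothetical). */
#include <stdbool.h>
#include <unistd.h>

typedef struct { double disk_temp_c; double queue_depth; bool nic_ok; } health_t;
typedef enum { ACT_NONE, ACT_SPIN_DOWN, ACT_FAILOVER_NIC, ACT_THROTTLE } action_t;

extern health_t poll_sensors(void);           /* self-monitoring HW + SW monitoring */
extern action_t policy_evaluate(health_t h);  /* configurable detection/reaction policy */
extern void     invoke_reaction(action_t a);  /* application-supplied reaction mechanism */

void runtime_loop(void) {
    for (;;) {
        health_t h = poll_sensors();
        action_t a = policy_evaluate(h);      /* e.g., over-temperature -> spin down */
        if (a != ACT_NONE)
            invoke_reaction(a);
        sleep(1);                             /* monitoring period */
    }
}
```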

Proactive Self-Maintenance
• Continuous online self-testing of HW and SW
  – detects flaky, failing, or buggy components via:
    » fault injection: triggering hardware and software error handling paths to verify their integrity/existence
    » stress testing: pushing HW/SW components past normal operating parameters
    » scrubbing: periodic restoration of potentially "decaying" hardware or software state (see the sketch below)
  – automates preventive maintenance
• Dynamic HW/SW component characterization
  – used to adapt to heterogeneous hardware and behavior of application software components
Slide 41
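As an illustration of the scrubbing bullet above, a minimal C sketch of a disk-scrubbing pass that reads every block, checks it, and rewrites decayed data from a replica; the I/O helpers (read_block, block_checksum, stored_checksum, rewrite_from_replica) are hypothetical.

```c
/* Sketch: periodic scrubbing pass over a disk's blocks. */
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE 4096

extern bool     read_block(uint64_t lba, uint8_t buf[BLOCK_SIZE]);
extern uint32_t block_checksum(const uint8_t buf[BLOCK_SIZE]);
extern uint32_t stored_checksum(uint64_t lba);
extern void     rewrite_from_replica(uint64_t lba);

/* Returns the number of blocks repaired in this pass. */
uint64_t scrub_pass(uint64_t num_blocks) {
    uint64_t repaired = 0;
    uint8_t buf[BLOCK_SIZE];
    for (uint64_t lba = 0; lba < num_blocks; lba++) {
        if (!read_block(lba, buf) ||
            block_checksum(buf) != stored_checksum(lba)) {
            rewrite_from_replica(lba);   /* restore decaying state */
            repaired++;
        }
    }
    return repaired;
}
```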

ISTORE-0 Prototype and Plans
• ISTORE-0: testbed for early experimentation with ISTORE research ideas
• Hardware: cluster of 6 PCs
  – intended to model ISTORE-1 using COTS components
  – nodes interconnected using ISTORE-1 network fabric
  – custom fault-injection hardware on subset of nodes
• Initial research plans
  – runtime system software
  – fault injection
  – scalability, availability, maintainability benchmarking
  – applications: block storage server, database, FFT
Slide 42

Runtime System Software
• Demonstrate simple policy-driven adaptation
  – within context of a single OS and application
  – software monitoring information collected and processed in real time
    » e.g., health & performance parameters of OS, application
  – problem detection and coordination of reaction
    » controlled by a stock set of configurable policies
  – application-level adaptation mechanisms
    » invoked to implement reaction
• Use experience to inform ISTORE API design
• Investigate reinforcement learning as a technique to infer appropriate reactions from goals
Slide 43

Record-breaking performance is not the common case
• NOW-Sort records demonstrate peak performance
• But perturb just 1 of 8 nodes and ...
Slide 44

Virtual Streams: dynamic load balancing for I/O
• Replicas of data serve as second sources
• Maintain a notion of each process's progress
• Arbitrate use of disks to ensure equal progress
• The right behavior, but what mechanism?
(diagram: process → Virtual Streams software → arbiter → disk)
Slide 45
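A C sketch of what a progress-based arbiter could look like: each reader reports its progress, and the arbiter grants the next I/O to the furthest-behind reader that has an idle replica. Data structures and names are illustrative assumptions, not the Berkeley implementation.

```c
/* Sketch: progress-based disk arbitration in the spirit of Virtual Streams. */
#include <stdbool.h>
#include <stddef.h>

typedef struct { double progress; int replica_disks[2]; } reader_t;
typedef struct { bool busy; } disk_t;

/* Returns the index of the reader to serve next, or -1 if no replica of any
 * lagging reader's data is currently idle. */
int arbitrate(reader_t *readers, size_t nreaders, disk_t *disks) {
    int pick = -1;
    double worst = 1e300;
    for (size_t i = 0; i < nreaders; i++) {
        if (readers[i].progress >= worst)
            continue;                       /* not the furthest behind so far */
        for (int r = 0; r < 2; r++) {       /* either replica will do */
            int d = readers[i].replica_disks[r];
            if (!disks[d].busy) {
                pick = (int)i;
                worst = readers[i].progress;
                break;
            }
        }
    }
    return pick;
}
```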

Graduated Declustering: a Virtual Streams implementation
• Clients send progress, servers schedule in response
(diagram: 4 clients reading from 4 servers, each server holding two clients' data; before a slowdown every server delivers B and every client receives B, half from each of its two servers; after server 1 slows to B/2, the other servers shift their shares so every client still receives 7B/8)
Slide 46

Read Performance: Multiple Slow Disks
Slide 47

Storage Priorities: Research v. Users
• Traditional research priorities: 1) Performance, 1') Cost, 3) Scalability, 4) Availability, 5) Maintainability
• ISTORE priorities: 1) Maintainability, 2) Availability, 3) Scalability, 4) Performance, 5) Cost
• (the slide's braces mark the top of the traditional list as "easy to measure" and the top of the ISTORE list as "hard to measure")
Slide 48

Intelligent Storage Project Goals
• ISTORE: a hardware/software architecture for building scalable, self-maintaining storage
  – An introspective system: it monitors itself and acts on its observations
• Self-maintenance: does not rely on administrators to configure, monitor, or tune the system
Slide 49

Self-maintenance
• Failure management
  – devices must fail fast without interrupting service
  – predict failures and initiate replacement
  – failures should not require immediate human intervention
• System upgrades and scaling
  – new hardware automatically incorporated without interruption
  – new devices immediately improve performance or repair failures
• Performance management
  – system must adapt to changes in workload or access patterns
Slide 50

ISTORE-I: 2H 99
• Intelligent disk
  – Portable PC hardware: Pentium II, DRAM
  – Low-profile SCSI disk (9 to 18 GB)
  – 4 100-Mbit/s Ethernet links per node
  – Placed inside half-height canister
  – Monitor processor/path to power off components?
• Intelligent chassis
  – 64 nodes: 8 enclosures, 8 nodes/enclosure
    » 64 x 4 or 256 Ethernet ports
  – 2 levels of Ethernet switches: 14 small, 2 large
    » Small: 20 100-Mbit/s + 2 1-Gbit; Large: 25 1-Gbit
    » Just for prototype; crossbar chips for real system
  – Enclosure sensing, UPS, redundant PS, fans, ...
Slide 51

Disk Limit
• Continued advance in capacity (60%/yr) and bandwidth (40%/yr)
• Slow improvement in seek, rotation (8%/yr)
• Time to read whole disk (see the sketch below):
  Year   Sequentially   Randomly (1 sector/seek)
  1990   4 minutes      6 hours
  1999   35 minutes     1 week (!)
• Does the 3.5" form factor make sense in 5-7 years?
Slide 52
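The 1999 row can be reproduced to the right order of magnitude with assumed 1999-class parameters (the slide does not list them): roughly 25 GB capacity, 12 MB/s sustained transfer, 12 ms per random seek plus rotation, 512-byte sectors. A C sketch:

```c
/* Sketch: whole-disk read time, sequential vs. one sector per random seek. */
#include <stdio.h>

int main(void) {
    double capacity_bytes = 25e9;   /* assumed 1999 capacity */
    double seq_bw         = 12e6;   /* assumed sustained bytes/s */
    double random_io_s    = 0.012;  /* assumed seek + rotation per sector */
    double sector_bytes   = 512.0;

    double seq_minutes = capacity_bytes / seq_bw / 60.0;            /* ~35 min */
    double rand_days   = capacity_bytes / sector_bytes * random_io_s
                         / 86400.0;                                 /* ~1 week */

    printf("sequential: %.0f minutes, random: %.1f days\n",
           seq_minutes, rand_days);
    return 0;
}
```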

Related Work
• ISTORE adds to several recent research efforts
• Active Disks, NASD (UCSB, CMU)
• Network service appliances (NetApp, Snap!, Qube, ...)
• High-availability systems (Compaq/Tandem, ...)
• Adaptive systems (HP AutoRAID, M/S AutoAdmin, M/S Millennium)
• Plug-and-play system construction (Jini, PC Plug&Play, ...)
Slide 53

Other (Potential) Benefits of ISTORE
• Scalability: add processing power, memory, network bandwidth as you add disks
• Smaller footprint vs. traditional server/disk
• Less power
  – embedded processors vs. servers
  – spin down idle disks?
• For decision-support or web-service applications, potentially better performance than traditional servers
Slide 54

Disk Limit: I/O Buses
• Multiple copies of data, SW layers
(diagram: CPU → memory bus → memory, internal I/O bus, external (PCI) I/O bus, SCSI controllers, 15 disks per controller)
• Cannot use 100% of bus
  – Queuing theory (< 70%)
  – Command overhead (effective size = size x 1.2)
• Bus rate vs. disk rate (see the sketch below)
  – SCSI: Ultra 2 (40 MHz), Wide (16-bit): 80 MByte/s
  – FC-AL: 1 Gbit/s = 125 MByte/s (single disk in 2002)
Slide 55
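Putting the slide's two derating factors together, a C sketch of how much of the bus is actually usable and how few streaming disks it takes to saturate it; the 21 MB/s per-disk user-data rate is borrowed from the Cheetah slide that follows, the rest is from this slide.

```c
/* Sketch: usable shared-bus bandwidth after queuing and command overhead. */
#include <stdio.h>

int main(void) {
    double bus_rate    = 80.0;  /* MB/s, Ultra2 Wide SCSI */
    double utilization = 0.70;  /* queuing-theory cap */
    double overhead    = 1.2;   /* effective size = size x 1.2 */
    double disk_rate   = 21.0;  /* MB/s user data per streaming disk */

    double usable = bus_rate * utilization / overhead;   /* ~47 MB/s */
    printf("usable bus bandwidth: %.0f MB/s -> ~%d streaming disks\n",
           usable, (int)(usable / disk_rate));
    return 0;
}
```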

State of the Art: Seagate Cheetah 36
  – 36.4 GB, 3.5-inch disk
  – 12 platters, 24 surfaces
  – 10,000 RPM
  – 18.3 to 28 MB/s internal media transfer rate (14 to 21 MB/s user data)
  – 9772 cylinders (tracks), 71,132,960 sectors total
  – Avg. seek: read 5.2 ms, write 6.0 ms (max. seek: 12/13 ms, 1 track: 0.6/0.9 ms)
  – $2100 or 17 MB/$ (6¢/MB) (list price)
  – 0.15 ms controller time
source: www.seagate.com
Slide 56

User Decision Support Demand vs. Processor Speed
(chart: database demand grows 2X every 9-12 months ("Greg's Law") while CPU speed grows 2X every 18 months ("Moore's Law"); the widening gap between the curves is the database-processor performance gap)
Slide 57