Harnessing Moore’s Law (with Selected Implications)
Mark D. Hill
Computer Sciences Department
University of Wisconsin-Madison
http://www.cs.wisc.edu/~markhill
This talk is based, in part, on an essay I wrote as part of a National Academy of Sciences study panel.
© 2003 Mark D. Hill CS & ECE, University of Wisconsin-Madison

Motivation
• What do the following intervals have in common?
  – Prehistory-2003
  – 2004-2005
• Answer: Equal progress in absolute computer speed
• Furthermore, more doublings in 2006-07, 2008-09, …
• Questions
  – Why do computers get better and cheaper?
  – How do computer architects contribute (my bias)?
  – How to learn to project future trends and implications?

Outline
• Computer Primer
  – Software
  – Hardware
• Technology Primer
• Harnessing Moore’s Law
• Future Trends

Computer Primer: Software
Application programmers write software:

    #include <stdio.h>

    int main (int argc, char *argv[])
    {
        int i;
        int sum = 0;

        /* accumulate i * i for i = 0 .. 100 */
        for (i = 0; i <= 100; i++)
            sum = sum + i * i;
        printf ("The sum from 0 .. 100 is %d\n", sum);
        return 0;
    }

[Example due to Jim Larus]

Computer Primer: Software, cont.
System software translates for hardware:

    .main:
            ...
    loop:
            lw    $14, 28($sp)
            mul   $15, $14, $14      <--- multiply i * i
            lw    $24, 24($sp)
            addu  $25, $24, $15      <--- add to sum
            sw    $25, 24($sp)
            addu  $8, $14, 1
            sw    $8, 28($sp)
            ble   $8, 100, loop
            la    $4, str
            lw    $5, 24($sp)
            jal   printf
            move  $2, $0
            lw    $31, 20($sp)
            addu  $sp, 32
            j     $31

Computer Primer: Software, cont.
What the hardware really sees:

    …
    10001111101011100000011100
    1000111110000000011000
    000000011100000011001      <--- multiply i * i
    00100101110010000000001
    001010010000000001100101
    1010111110101000000011100
    000000000111100000010010
    00000011111100100001      <--- add to sum
    000101000001111110111
    10101111100100000011000
    00111100000001000000
    1000111110100000001100000000000011101100
    0010010010000000110000
    100011111100000010100
    00100111101000001000000111110000000001000
    000000000010000001
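
To make the step from assembly to bits concrete, here is a minimal sketch of how one instruction from the listing, `lw $14, 28($sp)`, could be packed into a 32-bit word. It assumes the standard MIPS I-type layout (6-bit opcode, 5-bit base register, 5-bit target register, 16-bit offset); the function name `encode_i_type` and the constants are illustrative and not part of the talk.

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack a MIPS I-type instruction into a 32-character binary string.

    Layout (assumed): opcode[31:26] rs[25:21] rt[20:16] immediate[15:0].
    """
    word = (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)
    return format(word, "032b")


LW_OPCODE = 0b100011  # opcode for lw (load word)
SP = 29               # $sp is register 29 in the MIPS register convention

# lw $14, 28($sp): base register $sp, target register $14, offset 28
print(encode_i_type(LW_OPCODE, SP, 14, 28))
```

The same bit fields, read back in the other direction, are what the decode step of the processor has to pull apart.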

Computer Primer: Hardware Components
• Processor
  – Rapidly executes instructions
  – Commonly: Processor implemented as microprocessor chip (Intel Pentium 4)
  – Larger computers have multiple processors
• Memory
  – Stores vast quantities of instructions and data
  – Commonly: DRAM chips backed by magnetic disks
• Input/Output
  – Connect computer to outside world
  – E.g., keyboards, displays, & network interfaces

Apple Mac 7200 (from Hennessy & Patterson)
(C) Copyright 1998 Morgan Kaufmann Publishers. Reproduced with permission from Computer Organization and Design: The Hardware/Software Interface, 2E.

Computer Primer: Hardware Operation
Fetch-Execute Loop {
  S1: read “current” instruction from memory
  S2: decode instruction to see what is to be done
  S3: read instruction input(s)
  S4: perform instruction operation
  S5: write instruction output(s)
      Also determine “next” instruction and make it “current”
} Repeat
E.g., do mul temp, i, i & go on to next instruction
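
The five steps of the fetch-execute loop can be sketched as a toy interpreter. This is only an illustration under simplifying assumptions: instructions are Python tuples rather than encoded words, there are just two operations, and the names `run`, `mul`, and `add` are hypothetical.

```python
def run(program, registers):
    """Execute (op, dest, src1, src2) tuples in order and return the registers."""
    pc = 0  # index of the "current" instruction
    while pc < len(program):
        instruction = program[pc]            # S1: read current instruction
        op, dest, src1, src2 = instruction   # S2: decode it
        a = registers[src1]                  # S3: read instruction inputs
        b = registers[src2]
        if op == "mul":                      # S4: perform the operation
            result = a * b
        elif op == "add":
            result = a + b
        else:
            raise ValueError("unknown operation: " + op)
        registers[dest] = result             # S5: write instruction output
        pc += 1                              # make the "next" instruction current
    return registers


# One loop iteration of the sum-of-squares program, with i = 3 and sum = 10.
regs = run(
    [("mul", "temp", "i", "i"), ("add", "sum", "sum", "temp")],
    {"i": 3, "sum": 10, "temp": 0},
)
print(regs)
```

A real processor does the same work in hardware, and a branch instruction would set `pc` to something other than `pc + 1`.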

Computer Big Picture
• Separate Software & Hardware (divide & conquer)
• Software
  – Worry about applications only (hardware can already exist)
  – Translate from one form to another (instructions & data interchangeable!)
• Hardware
  – Expose set of instructions (most functionally equivalent)
  – Execute instructions rapidly (without regard for software)

Outline
• Computer Primer
• Technology Primer
  – Exponential Growth
  – Technology Background
  – Moore’s Law
• Harnessing Moore’s Law
• Future Trends

Exponential Growth
• Occurs when growth is proportional to current size
• Mathematically: dy/dt = k * y
• Solution: y = y(0) * e^(k*t)
• E.g., a bond with $100 principal yielding 10% interest
  – 1 year: $110 = $100 * (1 + 0.10)
  – 2 years: $121 = $100 * (1 + 0.10)^2
  – …
  – 8 years: $214 = $100 * (1 + 0.10)^8
• Other examples
  – Unconstrained population growth
  – Moore’s Law
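
The bond arithmetic is easy to check numerically. The sketch below assumes annual compounding for the bond and shows that the continuous solution gives the same values when k = ln(1.10); the function names `compound` and `continuous` are illustrative.

```python
import math


def compound(principal, rate, years):
    """Value after `years` of annual compounding at `rate` (0.10 means 10%)."""
    return principal * (1 + rate) ** years


def continuous(y0, k, t):
    """Solution of dy/dt = k * y with starting value y0."""
    return y0 * math.exp(k * t)


for years in (1, 2, 8):
    print(years, round(compound(100, 0.10, years), 2))

# With k = ln(1.10), the continuous form passes through the same yearly values.
print(round(continuous(100, math.log(1.10), 8), 2))
```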

Absurd Exponential Example
• Parameters
  – $16 base
  – 59% growth/year
  – 36 years
• 1st year’s $16 buy book
• 3rd year’s $64 buy computer game
• 15th year’s $16,000 buy car
• 24th year’s $1,000,000 buy house
• 36th year’s $300,000,000 buy a lot
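
The parameters above ($16 base, 59% growth per year) can be evaluated directly. The sketch below is a rough check, not part of the talk; it also computes the doubling time implied by 59% annual growth, which comes out just under a year and a half. The helper name `value_after` is hypothetical.

```python
import math


def value_after(base, rate, years):
    """Value after `years` of growth at `rate` per year (0.59 means 59%)."""
    return base * (1 + rate) ** years


# Years needed to double at 59% growth per year.
doubling_time = math.log(2) / math.log(1.59)
print("doubling time in years:", round(doubling_time, 2))

for years in (0, 3, 15, 24, 36):
    print(years, round(value_after(16, 0.59, years)))
```

Counting the first year as year 0, the printed values land close to the round figures quoted on the slide for years 3 and 15, and reach roughly a million and a few hundred million by years 24 and 36.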

Technology Background
• Computer logic implemented with switches
  – Like light switches, except that a switch can control others
  – Yields a network (called circuit) of switches
  – Want circuits to be fast, reliable, & cheap
• Logic Technologies
  – Mechanical switch & vacuum tube
  – Transistor (1947)
  – Integrated circuit (chip): circuit of many transistors made at once (1958)
• (Also memory & communication technologies)

(Technologist’s) Moore’s Law
• Parameters
  – 16 transistors/chip circa 1964
  – 59% growth/year
  – 36 years (2000) and counting
• 1st year’s 16 ???
• 3rd year’s 64 ???
• 15th year’s 16,000 ???
• 24th year’s 1,000,000 ???
• 36th year’s 300,000,000 ???
• Was useful & then got more than 1,000 times better!

(Technologist’s) Moore’s Law Data

Other “Moore’s Laws”
• Other technologies improving rapidly
  – Magnetic disk capacity
  – DRAM capacity
  – Fiber-optic network bandwidth
• Other aspects improving slowly
  – Delay to memory
  – Delay to disk
  – Delay across networks
• Computer Implementor’s Challenge
  – Design with dissimilarly expanding resources
  – To double computer performance every two years
  – A.k.a., (Popular) Moore’s Law

Outline
• Computer Primer
• Technology Primer
• Harnessing Moore’s Law
  – Microprocessor
  – Bit-Level Parallelism
  – Instruction-Level Parallelism
  – Caching & Memory Hierarchies
  – Cost & Implications
• Future Trends

Microprocessor
• Computers of the 1960s were expensive, using 100s if not 1000s of chips
• First Microprocessor in 1971
  – Processor on one chip
  – Intel 4004
  – 2300 transistors
  – Barely a processor
  – Could access 300 bytes of memory (0.0003 megabytes)
• Use more and faster transistors in parallel

Transistor Parallelism
• To use more transistors quickly,
  – use them side-by-side (or in parallel)
  – Approach depends on scale
• Consider organizing people
  – 1000 people
  – 1,000,000 people
• Transistors
  – Bit-level parallelism
  – Instruction-level parallelism
  – (Thread-level parallelism)

Bit-Level Parallelism
• Less (e.g., 8 * 15 = 120):
    00001000 * 00001111 = 00001111000
• More:
    0101010101010101 * 0000111100001111 = 101000001001111101011111011
• More bits manipulated faster!
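
One way to see why wider hardware helps is to count the partial products in a multiplication. The sketch below uses plain shift-and-add multiplication and reports how many shifted additions a one-at-a-time design would perform; hardware with more transistors can form and sum these partial products side by side. The function name `shift_add_multiply` is illustrative, and the 16-bit first operand is assumed from the product shown on the slide.

```python
def shift_add_multiply(a, b):
    """Multiply a by b using one shifted addition per set bit of b.

    Returns (product, number_of_additions).
    """
    product = 0
    additions = 0
    shift = 0
    while b:
        if b & 1:
            product += a << shift  # add the partial product for this bit
            additions += 1
        b >>= 1
        shift += 1
    return product, additions


small, small_steps = shift_add_multiply(0b00001000, 0b00001111)
large, large_steps = shift_add_multiply(0b0101010101010101, 0b0000111100001111)

print(small, format(small, "b"), small_steps)
print(large, format(large, "b"), large_steps)
```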

Instruction-Level Parallelism
• Limits to bit-level parallelism
  – Numbers are big enough
  – Operations are fast
• Seek parallelism by executing many instructions at once
• Recall Fetch-Execute Loop {
    S1: read “current” instruction from memory
    S2: decode instruction to see what is to be done
    S3: read instruction input(s)
    S4: perform instruction operation
    S5: write instruction output(s)
        Also determine “next” instruction and make it “current”
  }

Instruction-Level Parallelism, cont.
• One-at-a-time: instructions per cycle = 1/5

    Time  01  02  03  04  05  06  07  08  09  10
    ADD   S1  S2  S3  S4  S5
    SUB   .   .   .   .   .   S1  S2  S3  S4  S5

• Pipelining: instructions per cycle = 1 (or less)

    Time  01  02  03  04  05  06  07  08  09  10
    ADD   S1  S2  S3  S4  S5
    SUB   .   S1  S2  S3  S4  S5
    ORI   .   .   S1  S2  S3  S4  S5
    AND   .   .   .   S1  S2  S3  S4  S5
    MUL   .   .   .   .   S1  S2  S3  S4  S5

Instruction-Level Parallelism, cont.
• 4-way Superscalar: instructions per cycle = 4 (or less)

    Time  01  02  03  04  05  06  07  08  09  10
    ADD   S1  S2  S3  S4  S5
    SUB   S1  S2  S3  S4  S5
    ORI   S1  S2  S3  S4  S5
    AND   S1  S2  S3  S4  S5
    MUL   .   S1  S2  S3  S4  S5
    SRL   .   S1  S2  S3  S4  S5
    XOR   .   S1  S2  S3  S4  S5
    LDW   .   S1  S2  S3  S4  S5
    STW   .   .   S1  S2  S3  S4  S5
    DIV   .   .   S1  S2  S3  S4  S5
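
The timing of one-at-a-time, pipelined, and superscalar execution can be compared with a small idealized model. The sketch assumes five stages of one cycle each and no stalls of any kind (no mispredicted branches, no slow memory accesses), which is why real machines achieve "or less". The function names are hypothetical.

```python
def start_cycles(num_instructions, stages=5, width=1, pipelined=True):
    """Return the 1-based cycle in which each instruction enters stage S1."""
    starts = []
    for n in range(num_instructions):
        if pipelined:
            starts.append(n // width + 1)  # `width` instructions enter per cycle
        else:
            starts.append(n * stages + 1)  # wait until the previous one finishes
    return starts


def total_cycles(num_instructions, stages=5, width=1, pipelined=True):
    """Cycle in which the last instruction completes its final stage."""
    starts = start_cycles(num_instructions, stages, width, pipelined)
    return starts[-1] + stages - 1


print("one-at-a-time, 2 instructions:", total_cycles(2, pipelined=False))
print("pipelined, 5 instructions:", total_cycles(5))
print("4-way superscalar, 10 instructions:", total_cycles(10, width=4))
```

For long instruction streams the start-up cost of filling the pipeline becomes negligible, and the instructions-per-cycle figure approaches 1/5, 1, and 4 respectively.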

Instruction-Level Parallelism, cont.
• Current processors have dozens of instructions executing
• Must predict which instructions are next
• Limits to control prediction?
• Look elsewhere? (thread-level parallelism later)
• Memory a serious problem
  – 1980: memory access time = one instruction time
  – 2000: memory access time = 100 instruction times

Caching & Memory Hierarchies
• Memory can be
  – Fast
  – Vast
  – But not both
• Use two memories
  – Cache: small, fast (e.g., 64,000 bytes in 1 ns)
  – Memory: large, slow (e.g., 64,000,000 bytes in 100 ns)
• Use prediction to fill cache
  – Likely to re-reference information
  – Likely to reference nearby information
  – E.g., address book cache of phone directory

Caching & Memory Hierarchies, cont.
• Cache + Memory makes memory look fast & vast
  – If cache has information on 99% of accesses
  – 1 ns + 1% * 100 ns = 2 ns
  – E.g., P3 (w/o L2 cache)
• Caching Applied Recursively
  – Registers
  – Level-one cache
  – Level-two cache
  – Memory
  – Disk
  – (File Server)
  – (Proxy Cache)
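
The "1 ns + 1% * 100 ns = 2 ns" calculation generalizes to any number of levels, which is what applying caching recursively amounts to. The sketch below assumes every access pays the first level's time and only misses pay for the next level; the function name `hierarchy_access_time` and the three-level numbers in the second call are made up for illustration.

```python
def hierarchy_access_time(levels):
    """Average access time for a hierarchy of memories.

    `levels` is a list of (access_time_ns, miss_rate) pairs ordered from
    fastest to slowest; the last level should have a miss rate of 0.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for access_time_ns, miss_rate in levels:
        total += reach * access_time_ns
        reach *= miss_rate
    return total


# The slide's example: 1 ns cache hit on 99% of accesses, 100 ns memory.
print(hierarchy_access_time([(1, 0.01), (100, 0.0)]))

# A hypothetical three-level hierarchy: L1 cache, L2 cache, memory.
print(hierarchy_access_time([(1, 0.05), (10, 0.2), (100, 0.0)]))
```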

Cost Side of Moore’s Law
• About every two years: same computing at half cost
• Long-term effect:
  – 1940s: Prototypes for calculating ballistic trajectories
  – 1950s: Early mainframes for large banks
  – 1960s: Mainframes flourish in many large businesses
  – 1970s: Minicomputers for business, science, & engineering
  – Early 1980s: PCs for word processing & spreadsheets
  – Late 1980s: PCs for desktop publishing
  – 1990s: PCs for games, multimedia, e-mail, & web
• Jim Gray: In ten years you can buy a computer for the cost of its sales tax today (assuming 3% or more)
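
Jim Gray's remark follows from the halving rule: ten years is five halvings, and (1/2)^5 = 1/32, or about 3.1% of today's price. A short check, with the function name `relative_cost` chosen only for this sketch:

```python
def relative_cost(years, halving_period_years=2.0):
    """Fraction of today's cost for the same computing after `years`."""
    return 0.5 ** (years / halving_period_years)


fraction = relative_cost(10)
print("cost after ten years: {:.2%} of today's price".format(fraction))
```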

Outline
• Computer Primer
• Technology Primer
• Harnessing Moore’s Law
• Future Trends
  – Moore’s Law
  – Harnessing Moore’s Law
  – Computer uses
  – Some Non-Technical Implications

Revolutions
• Industrial Revolution enabled by machines
  – Interchangeable parts
  – Mass production
  – Lower costs expanded application
• Information Revolution enabled by machines
  – Interchangeable purpose (software)
  – Mass production (chips = integrated circuits)
  – Lower costs expanded application

Future of Moore’s Law
• Short-Term (1-5 years)
  – Will operate (due to prototypes in lab)
  – Fabrication cost will go up rapidly
• Medium-Term (5-15 years)
  – Exponential growth rate will likely slow
  – Trillion-dollar industry is motivated
• Long-Term (>15 years)
  – May need new technology (chemical or quantum)
  – We can do better (e.g., human brain)
  – I would not close the patent office

Future of Harnessing Moore’s Law
• Thread-Level Parallelism
  – Multiple processors cooperating (exists today)
  – More common in future with multiple processors per chip
  – Parallelism in Internet? The Grid.
• System on a Chip
  – Processor, memory, and I/O on one chip
  – Cost-performance leap like microprocessor?
  – (e.g., accelerometer at right)
• Communication
  – World-wide web & wireless cell phone fuse!
• Other properties: robust & easy to design & use

Future Computer Uses
• Computer cost-effectiveness determines application viability
  – Spreadsheets on a US$2M mainframe do not make sense
  – A 10x cost-performance change enables new possibilities [Joy]
• Most computers will NOT be computers
  – How many electric motors do you have in your home?
  – How many did you buy as electric motors?
  – I control several computers, but most computers I control are embedded in cars, remote controls, refrigerators, etc.
• Two Stories
  – Danny Hillis’s doorknobs
  – William Wulf’s “powerful” computer

Future Computer Uses, cont.
• Technologists have always been poor predictors of future use
  – Edison invented the motion picture machine
  – Hollywood invented movies
• To Predict:
  – What would you want if it was 10 times cheaper?
  – What can be 10 times cheaper if you make more?
  – Better yet, ask a ten year old!
• What do you think?

Some Non-Technical Thoughts
• We make over a billion transistors/second
  – One transistor per man/woman/child in < 10 seconds (humankind has made many more transistors than bricks!)
  – But those transistors are not being distributed equally
• Computers can be incredibly effective tools
  – Knowledge workers in medicine, law, & engineering
  – But not unskilled laborers!
• Computer use will exacerbate the social gradient
• As citizens, we should ask
  – Can/should we ameliorate this trend?
  – If so how?

Summary
• Computers are machines for purposes “to be determined”
• Vast cost reductions have enabled new uses
  – Software flexibility
  – Moore’s Law and its harnessing
• Technology should be our tool, not our master
  – Many benefits
  – Some costs