- Number of slides: 27
Introduction Computer Architecture
Modern Computers • Advancing at a rapid pace – Integral part of daily life • Almost every aspect of human civilization now depends on them! – Started out performing mathematical calculations • Reduced human errors • Increased speed – Evolved into complex, versatile machines • Store, retrieve, & process voluminous data – At very high speeds • Intuitive human-machine interface – Used for communication & entertainment as well
Managing Complexity • Increased complexity poses many challenges – Complicates design, manufacturing, usage, and maintenance • Solution: Multi-faceted Abstraction – Abstraction of data • Store & process digital rather than analog data – Better control over the electronics/physics of the machine – Abstraction of layers • Multi-tiered design • Each tier provides a more simplified view of the underlying physics – Abstraction of components • System built as a collection of sub-systems with well-defined behavior and functionality • Special interconnections between sub-systems – Abstraction of control • In the form of software • [Figure: abstraction layers: PC, Cards & Devices, IC Chips, Transistors]
A peek under the hood of a PC • Major components of a PC – Main board, aka motherboard • Boards like this are Printed Circuit Boards (PCBs) – A PCB is a multi-layered, rigid plastic sheet with copper wires etched directly on it, onto which ICs and electronic components are mounted and soldered. – Microprocessor/CPU • The brain and heart of the computer. It processes instructions to manipulate data – RAM or main memory • RAM: Random Access Memory • High speed read/write memory for storing instructions and data – Buses: Collections of wires that interconnect components • FSB: The Front-Side Bus connects the CPU to the North Bridge and, through it, to main memory
Anatomy of a PC • [Block diagram with the following labeled components: RAM, CPU, Video Card, North Bridge, Hard Disk Drive (HDD), South Bridge, Audio, Network, USB, Keyboard & Mouse, PCI Slots, Connectivity, Super I/O, Floppy, Parallel & Serial Ports]
Common Units • It is important to know and understand the common units used for – Time (base unit: second) • Derived units: – Millisecond (msec): 10^-3 seconds – Microsecond (usec): 10^-6 seconds – Nanosecond (nsec): 10^-9 seconds – Picosecond (psec): 10^-12 seconds – Frequency (base unit: Hertz or Hz) • Note that time and frequency are inversely related • Derived units: – Kilohertz (kHz): 10^3 Hz – Megahertz (MHz): 10^6 Hz – Gigahertz (GHz): 10^9 Hz
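The inverse relation between time and frequency can be made concrete with a small sketch. The function name `period_seconds` and the sample clock rates are illustrative assumptions, not values from the slides.

```python
def period_seconds(frequency_hz: float) -> float:
    """Return the duration of one cycle (in seconds) at the given frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


# Illustrative clock rates (assumed for this example).
for label, hz in [("1 kHz", 1e3), ("1 MHz", 1e6), ("1 GHz", 1e9)]:
    print(f"{label}: one cycle lasts {period_seconds(hz):.0e} s")
```

A 1 GHz clock therefore ticks once per nanosecond, which is why CPU-level timings are usually quoted in nanoseconds.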
Common Units (Contd.) • It is important to know and understand the common units used for – Memory size (base unit: byte) • Derived units: – Kilobyte (KB): 10^3 bytes – Megabyte (MB): 10^6 bytes – Gigabyte (GB): 10^9 bytes – Terabyte (TB): 10^12 bytes – Petabyte (PB): 10^15 bytes • Do not confuse the above units with KiB/MiB/GiB – Units with an “i” in the middle are powers of 1024 rather than 1000. – For example, 1 MiB = 1024^2 bytes. – FLOPS (base unit: FLOPS) • Derived units: – Megaflops (MFLOPS): 10^6 FLOPS – Gigaflops (GFLOPS): 10^9 FLOPS – Teraflops (TFLOPS): 10^12 FLOPS – Petaflops (PFLOPS): 10^15 FLOPS
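The difference between decimal and binary memory units is easy to check numerically. The following is a minimal sketch; the table and function names (`DECIMAL`, `BINARY`, `to_bytes`, `gap_percent`) are made up for this example.

```python
# Decimal (powers of 1000) and binary (powers of 1024) size units.
DECIMAL = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}
BINARY = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4, "PiB": 1024**5}


def to_bytes(value: float, unit: str) -> float:
    """Convert a size expressed in a decimal or binary unit to bytes."""
    table = {**DECIMAL, **BINARY}
    return value * table[unit]


def gap_percent(decimal_unit: str, binary_unit: str) -> float:
    """Return by how many percent the binary unit exceeds the decimal one."""
    return (BINARY[binary_unit] / DECIMAL[decimal_unit] - 1) * 100


print(to_bytes(1, "MB"), to_bytes(1, "MiB"))
for dec, bin_ in [("KB", "KiB"), ("MB", "MiB"), ("GB", "GiB"), ("TB", "TiB")]:
    print(f"1 {bin_} is {gap_percent(dec, bin_):.2f}% larger than 1 {dec}")
```

The gap grows with each prefix, so mixing the two conventions matters more for large capacities than for small ones.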
RAM • Random Access Memory (RAM) – Volatile memory used to store data • Data is lost when power is lost – Most frequently accessed by the CPU • Therefore it must operate fast • Typical access times are 20-70 nanoseconds (1 nanosecond = 10^-9 seconds) – A PC will not boot unless there is some RAM on the main board – Is available in a variety of sizes, technologies, and speeds • The latest technology (as of these slides) is DDR3
North & South Bridge • The North & South Bridges are special integrated circuits (ICs, aka chips) that interconnect various components on a computer – Together they are often called the chipset • Some computers (namely those from AMD) have the functionality of the North Bridge already fabricated on the CPU itself and don’t need a special chip for it. – They are specific to the microprocessor and the types of devices present on a computer
CPU • The Central Processing Unit (CPU), aka microprocessor (µP) – The brain of the computer – Its task is to execute instructions • It keeps on executing instructions from the moment it is powered on to the moment it is powered off. • Execution of various instructions causes the computer to perform various tasks. – There is a wide range of CPUs • We will study CPUs in detail in this course • The dominant CPUs on the market today are from – Intel: Pentium (Desktop), Xeon (Server), Pentium-M (Mobile), XScale (Embedded) – AMD: Athlon (Desktop), Opteron (Server), Turion (Mobile), Geode (Embedded)
Key Characteristics of a CPU • Several metrics are used to describe the characteristics of a CPU – Native “word” size • 32-bit or 64-bit – Clock speed • The clock speed of a CPU is the rate at which its internal clock operates – The clock sequences instruction execution and so sets the pace at which the CPU executes instructions • Modern CPUs operate in the gigahertz range (1 GHz = 10^9 Hz) • However, clock speed alone does not determine the overall ability of a CPU – Instruction set • More versatile CPUs support richer and more efficient instruction sets • Modern CPUs have instruction set extensions such as SSE, SSE2, and 3DNow! that can be used to boost the performance of many common applications – These instructions perform the work of several instructions but take less time – FLOPS: Floating Point Operations per Second • FLOPS is a better metric for measuring a CPU’s computational capabilities • Modern CPUs deliver 2-3 gigaflops (1 GFLOPS = 10^9 FLOPS) • FLOPS is also used as a metric for describing the computational capabilities of computer systems – Modern supercomputers deliver 1 teraflop or more (1 TFLOPS = 10^12 FLOPS).
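To see why clock speed alone does not determine FLOPS, a rough upper bound can be computed as clock rate times floating-point operations per cycle times number of cores. This is a common back-of-the-envelope estimate rather than anything stated on the slide, and all numbers below (2.5 GHz, 1 or 4 operations per cycle, 4 cores) are hypothetical.

```python
def peak_flops(clock_hz: float, flops_per_cycle: float, cores: int = 1) -> float:
    """Theoretical peak FLOPS; real programs usually achieve less."""
    return clock_hz * flops_per_cycle * cores


# Hypothetical single core completing one floating-point operation per cycle.
scalar = peak_flops(2.5e9, flops_per_cycle=1)

# Same clock, but assuming SIMD instructions that complete 4 operations
# per cycle, on a hypothetical 4-core package.
simd_quad = peak_flops(2.5e9, flops_per_cycle=4, cores=4)

print(f"scalar single core: {scalar / 1e9:.1f} GFLOPS")
print(f"SIMD, four cores:   {simd_quad / 1e9:.1f} GFLOPS")
```

Both configurations run at the same clock speed, yet their peak throughput differs by a factor of 16, which is the point of treating FLOPS as a separate metric.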
Key Characteristics of a CPU (Contd.) • Several metrics are used to describe the characteristics of a CPU – Power consumption • Power is an important factor in today’s computing • Power is the product of the voltage applied to the CPU (V) and the current drawn by the CPU (I) » The unit for voltage is the volt » The unit for current is the ampere (aka amp) – Power = V x I – Power is measured in watts – Modern desktop processors consume anywhere from 35 to 100 watts • Lower power is better because the power consumed is dissipated as heat – More power implies the processor generates more heat, and heating is a big problem – Number of computational units (cores) per CPU • Some CPUs have multiple cores • Each core is an independent computational unit, so cores can execute instructions in parallel • The more cores, the better the CPU is for multi-threaded applications – Single-threaded applications may slow down on multi-core processors because per-core clock speeds are often lower
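The relation Power = V x I can be worked through with a short sketch. The supply voltage and current below are hypothetical values chosen to land inside the 35 to 100 watt range quoted above; the function names are also assumptions for this example.

```python
def power_watts(voltage_volts: float, current_amps: float) -> float:
    """Power = V x I, in watts."""
    return voltage_volts * current_amps


def energy_joules(power_w: float, seconds: float) -> float:
    """Energy drawn over a time span; nearly all of it ends up as heat."""
    return power_w * seconds


# Hypothetical CPU: 1.25 V core voltage drawing 52 A.
p = power_watts(1.25, 52)
print(f"power: {p} W")
print(f"energy over one hour: {energy_joules(p, 3600)} J")
```

Note how a low core voltage still implies a very large current at these power levels, which is one reason power delivery and cooling are design concerns.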
Key Characteristics of a CPU (Contd.) • Several metrics are used to describe the characteristics of a CPU – Cache size and configuration • Cache is a small but high speed memory that is fabricated along with the CPU – Larger caches are generally slower, so size is traded against speed – Cost of the CPU increases as the size of the cache increases • It is much faster than RAM • It is used to minimize the overall latency of accessing RAM • Microprocessors have a hierarchy of caches – L1 cache: Fastest and closest to the core components – L3 cache: Relatively slower and further away from the CPU core – Example cache configurations (see comparative die images): • Quad core AMD Opteron (Shanghai) – 32 KB (Data) + 32 KB (Instr.) L1 cache per core – Unified 512 KB L2 cache per core – Unified 6 MB shared L3 cache (for 4 cores) • Quad core Intel Xeon (Nehalem) – 32 KB (Data) + 32 KB (Instr.) L1 cache per core – Unified 256 KB L2 cache per core – Unified 8 MB shared L3 cache (for 4 cores)
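How a cache reduces the overall latency of accessing RAM can be estimated with the standard average memory access time formula (hit time plus miss rate times miss penalty). The formula is not on the slide, and the hit time and miss rates below are hypothetical; the 60 ns penalty is simply picked from the 20-70 ns RAM range mentioned earlier in the deck.

```python
def average_access_time_ns(hit_time_ns: float, miss_rate: float,
                           miss_penalty_ns: float) -> float:
    """Average memory access time = hit time + miss rate * miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time_ns + miss_rate * miss_penalty_ns


# Hypothetical single-level cache: 1 ns hit time, 60 ns to fetch from RAM.
for miss_rate in (1.0, 0.20, 0.05):
    t = average_access_time_ns(1.0, miss_rate, 60.0)
    print(f"miss rate {miss_rate:4.0%}: {t:.1f} ns on average")
```

Even a modest hit rate pulls the average far below the raw RAM latency, which is why cache size and configuration are treated as key CPU characteristics.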
Trends in computing • Until recently, hardware performance improvements came primarily from advances in microprocessor fabrication technologies: – Steady improvement in processor clock speeds • Faster clocks (within the same family of processors) provide higher FLOPS – Increase in the number of transistors on-chip • More complex and sophisticated hardware to improve performance • Larger caches to provide rapid access to instructions and data
Moore’s Law • The steady advancement in microprocessor technology was predicted in 1965 by Gordon Moore, co-founder of Intel – Moore’s law states that the number of transistors on a microprocessor doubles approximately every two years. • Many advancements in digital technologies can be linked to Moore’s law. This includes: – Processing speed – Memory capacity – Speed and bandwidth of data communication networks – Resolution of monitors and digital cameras • Thus far, Moore’s law has held steadily for about 40 years (from 1965 to about 2005) • Breakthroughs in the miniaturization of transistors have been the key enabling technology – See comparative technical video (Courtesy Intel) – See trends from a non-technical perspective (Courtesy Intel)
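Doubling every two years is exponential growth, N(t) = N0 * 2^((t - t0) / 2). The sketch below projects transistor counts under that assumption. The starting point (roughly 2,300 transistors on the Intel 4004 in 1971) is a widely cited figure that does not come from the slides, and the function name is made up for this example; the output is a projection, not measured data.

```python
def projected_transistors(start_count: float, start_year: int, year: int,
                          doubling_years: float = 2.0) -> float:
    """Project a transistor count assuming one doubling every `doubling_years`."""
    doublings = (year - start_year) / doubling_years
    return start_count * 2 ** doublings


# Assumed baseline: about 2,300 transistors in 1971 (Intel 4004).
for year in (1971, 1981, 1991, 2001, 2005):
    count = projected_transistors(2300, 1971, year)
    print(f"{year}: about {count:,.0f} transistors")
```

Seventeen doublings between 1971 and 2005 multiply the starting count by 2^17, which is how a few thousand transistors turn into hundreds of millions.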
Moore’s Law vs. Intel’s roadmap • Here is a graph illustrating the progress of Moore’s law based on Intel’s technology roadmap (obtained from Wikipedia)
Stagnation of Moore’s Law • In the past few years we have reached the fundamental limits of IC fabrication technology (particularly lithography and interconnect) – It is no longer feasible to further miniaturize the transistors on the IC • Their features are already only a few atoms across, a scale at which quantum effects become significant and make further shrinking extremely challenging – Heat dissipation has reached the breakdown threshold • Heat is generated as a part of regular transistor operation • Higher heat dissipation will cause the transistors to fail • With the current state of the art, a single processor cannot yield more than 4 to 5 GFLOPS – How do we move beyond this barrier?
Multi-core and Multi-processors • The solution to increasing the effective compute power is the use of multi-core, multiprocessor computer systems along with suitable software – This is a paradigm shift in hardware and software technologies – Multiple cores and multiple processors are interconnected using a variety of high speed data communication networks – Software plays a central role in harnessing the power of multi-core/multi-processor systems • Most industry leaders believe this is the near future of computing!
Multi-core Trends • Multi-core processors are most definitely the future of computing – Both Intel and AMD are pushing for a larger number of cores per CPU package – The Cell Broadband Engine (aka Cell) has 8 synergistic processing elements (SPEs) – The Sun Microsystems Niagara 2 (UltraSPARC T2) has 8 cores, with each core capable of running 8 threads • Here is a short video from Intel demonstrating their proof-of-concept, next-generation Tera-chip, designed to deliver a teraflop of compute power from a single CPU package.
Manufacturing ICs • Manufacturing of Integrated Circuits (ICs) is a complex process – Almost all the phases are completely automated • Use Computer Aided Manufacturing (CAM) • Manufacturing plants where the ICs are fabricated are called “Fabs” – Everything is computerized and software driven • Extensive Computer Aided Design (CAD) • Testing of designs is performed using simulations
CPU Manufacturing Process • [Flow diagram; labeled stages: Computer Aided Design (CAD), Simulation-based Testing, Mask Design for Photolithography, Silicon Manufacturing, Silicon Wafer, Multiple phases of Photolithography, Wafer with multiple CPUs, Testing & Dicer, Tested Dies, Packaging, Testing & Shipping to OEMs & Retail]
Computer Architecture • Science to enable effective engineering of digital computers. – This is a complex field that spans various levels of hierarchy • Component level • Device interconnection • CPU design – This is where most of the R&D work lies • Development of firmware and BIOS
Why do we need Comp. Arch.? • Effective use of modern computers – Optimal software development • Crucial for embedded computing • Important for system programming – Development of operating systems – Design of device drivers and custom software • Tap into high performance features • Avoid hidden bottlenecks – Know hidden pitfalls – Prudent economic investment • Know what you need & buy what you need – Don’t waste your money • Know what you are buying – Design & development of hardware
Goals of this class • To understand: – Design of a PC – Working and functionality of various components and subsystems in a PC – Develop an initial set of skills to program a computer at a low level via assembly • It is an important part of developing a hardware-software interface using a hierarchy of language translators
Hardware-Software Interface • The interface between hardware and software is achieved using a hierarchy of translators – Hierarchy helps to balance: • Portability & interoperability • Development overhead vs. performance • Design & manufacturing • Typical hierarchy: Program/software in a high-level language (C/C++) → Compiler → Translated code in assembly → Assembler → Machine language
[Figure: the same fragment at each level of the hierarchy]
High-level language: if (a > b) { c = a; } else { c = b; }
Assembly: cmp a, b ; jg elsePart ; mov a, c ; jmp endif ; elsePart: mov b, c ; endif:
Machine language: 00010111 101000100 1010 10001110101010101
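The idea of a translator lowering high-level code into simpler instructions can be observed directly in Python, used here only as an analogy. Python compiles source to bytecode for its own virtual machine, not to CPU machine language, so this is not the C to assembly to machine code path from the slide. The function `pick_larger` is a hypothetical stand-in for the slide's fragment, and the exact instruction names printed depend on the Python version.

```python
import dis


def pick_larger(a, b):
    # Same logic as the slide's fragment: c = a if a > b, otherwise c = b.
    if a > b:
        c = a
    else:
        c = b
    return c


# Print the lower-level instructions the Python compiler produced.
dis.dis(pick_larger)

# Collect just the instruction names for a quick overview.
opnames = [ins.opname for ins in dis.get_instructions(pick_larger)]
print(opnames)
```

The listing shows the same shape as the assembly on the slide: a comparison, a conditional jump, and separate paths that each store a value into `c`.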
Semantic Gap • Semantic gap is a term that is used to describe – The disconnect between a high level programming language (such as Java, C++, or C) and the underlying hardware architecture – It is used to refer to the distinction between the conceptual view (from a programmer’s perspective) versus the actual operations of a CPU • Small semantic gap is desirable – High level programming languages have large semantic gaps • Prevent effective use of underlying hardware – Assembly has a small semantic gap • Higher performance comes at the price of higher software development overheads • Hierarchy of translators essentially attempts to bridge the semantic gap – Bridging the gap often requires significant human intervention • [Figure: Java vs. C, network communication: total time (seconds) versus number of messages sent/received (512 bytes each)]
Plan of Action • Study in a bottom-up fashion – Digital logic / Boolean algebra • Logic gates • Logic circuits (interconnection of gates) – Number representation using digital logic • Memory circuits • Arithmetic circuits – Arithmetic & logic circuits – Processor (or CPU) • Programming the CPU in assembly language – We will spend a good deal of time here • Learn about performance enhancement strategies – Most computer architecture work lies in this area