
ECE 545 Digital System Design with VHDL. Course web page: ECE web page → Courses → Course web pages → ECE 545, http://ece.gmu.edu/coursewebpages/ECE545/F11/
Kris Gaj. Research and teaching interests: • reconfigurable computing • computer arithmetic • cryptography • network security. Contact: The Engineering Building, room 3225, kgaj@gmu.edu. Office hours: Thursday, 7:30-8:30 PM, Tuesday, 7:30-8:30 PM, and by appointment
ECE 545 Part of: MS in Computer Engineering • One of five core courses (must be passed with B or better) • Fundamental course for the specialization area: Digital Systems Design • Elective course in the remaining specialization areas. MS in Electrical Engineering • Elective
ECE 545 Part of: Ph.D. in Electrical and Computer Engineering • Knowledge tested at the Technical Qualifying Exam (TQE), Topic 2: Digital Design and Computer Organization
I am interested in… / I want to specialize primarily in… / Recommended program & specialization. [Diagram mapping interests to specializations and programs.] Topics shown: CAD tools & design automation; VLSI; hardware description languages; FPGAs & reconfigurable computing; ASICs & FPGAs; computer arithmetic; VHDL/Verilog; front-end ASIC design (algorithmic down to gate level); CAD tools; reconfigurable computing; back-end ASIC design (circuit and mask layout levels); analog & digital circuit design; microelectronics; VLSI fabrication; nanoelectronics; semiconductor devices. Recommended programs: MS CpE, Digital Systems Design; MS EE, Microelectronics/Nanoelectronics
Courses by design level (algorithmic, register-transfer, gate, transistor, layout, devices): ECE 545 Digital System Design with VHDL; ECE 645 Computer Arithmetic; ECE 681 VLSI Design for ASICs; ECE 682 VLSI Test Concepts; ECE 586 Digital Integrated Circuits; ECE 680 Physical VLSI Design; ECE 584 Semiconductor Device Fundamentals; ECE 684 MOS Device Electronics
CpE Digital Systems Design. Pre-approved electives: ECE 545 Digital System Design with VHDL; ECE 586 Digital Integrated Circuits; ECE 645 Computer Arithmetic; ECE 681 VLSI Design for ASICs; ECE 682 VLSI Test Concepts. Professors: K. Gaj, K. Hintz, H. Homayoun, J. Kaps, T. Storey
CpE Microprocessors and Embedded Systems. Pre-approved electives: ECE 510 Real-Time Concepts; ECE 511 Microprocessors; ECE 611 Advanced Microprocessors; ECE 612 Real-Time Embedded Systems; ECE 641 Computer System Architecture. Professors: H. Homayoun, J. Kaps, P. Pachowicz, C. Sabzevari
Suggested electives (both specializations): CS 540, 583 (languages, algorithms); CS 635 (parallel machines); ECE 584, 684, … (technology); ECE 511, 611, … (microprocessors); ECE 542, 642, 742 (networks); ECE 645, 681 (digital design); ECE 646, 746, … (applications); ECE 548 (sequential machine theory)
DIGITAL SYSTEMS DESIGN. Concentration advisors: Kris Gaj, Jens-Peter Kaps, Ken Hintz
1. ECE 545 Digital System Design with VHDL – K. Gaj, project, FPGA design with VHDL, Aldec/Mentor Graphics, Xilinx/Altera
2. ECE 645 Computer Arithmetic – K. Gaj, project, FPGA design with VHDL, Aldec/Mentor Graphics, Xilinx/Altera
3. ECE 681 VLSI Design for ASICs – H. Homayoun, project/lab, front-end and back-end ASIC design with Synopsys tools
4. ECE 586 Digital Integrated Circuits – D. Ioannou, R. Mulpuri
5. ECE 682 VLSI Test Concepts – T. Storey
Grading Scheme • Homework - 10% • Project - 40% • Midterm Exam - 20% • Final Exam - 30%
Midterm exam 1 • 2 hours 30 minutes • in class • design-oriented • open-book, open-notes • practice exams available on the web. Tentative date: Last week of October
Final exam • 2 hours 45 minutes • in class • design-oriented • open-book, open-notes • practice exams available on the web. Date: Thursday, December 13, 4:30-7:15 pm
Textbooks
Required Textbook Pong P. Chu, RTL Hardware Design Using VHDL, Wiley-Interscience, 2006.
Supplementary Textbook – Basics Refresher: Stephen Brown and Zvonko Vranesic, Fundamentals of Digital Logic with VHDL Design, McGraw-Hill, 3rd or 2nd Edition
Supplementary Textbook – Advanced: Hubert Kaeslin, Digital Integrated Circuit Design: From VLSI Architectures to CMOS Fabrication, Cambridge University Press, 1st Edition, 2008. Used in ECE 681 “VLSI Design for ASICs”
Technology & Tools
What is an FPGA? [Figure: FPGA structure showing Configurable Logic Blocks, I/O Blocks, and Block RAMs]
FPGA Design process (1)
Example specification: Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on an 8031 microcontroller. Unlike in experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds…
Flow: Specification / Pseudocode → On-paper hardware design (block diagram & ASM chart) → VHDL description (your source files) → Functional simulation → Synthesis → Post-synthesis simulation
VHDL description (your source files):
    library IEEE;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;

    entity RC5_core is
      port(
        clock, reset, encr_decr : in  std_logic;
        data_input  : in  std_logic_vector(31 downto 0);
        data_output : out std_logic_vector(31 downto 0);
        out_full    : in  std_logic;
        key_input   : in  std_logic_vector(31 downto 0);
        key_read    : out std_logic  -- last port: no trailing semicolon
      );
    end RC5_core;  -- name must match the entity declaration
FPGA Design process (2): Implementation → Timing simulation → Configuration → On-chip testing
Simulation Tools
FPGA Synthesis Tools
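Logic Synthesis: VHDL description → Circuit netlist
VHDL description:
    architecture MLU_DATAFLOW of MLU is
      signal A1 : STD_LOGIC;
      signal B1 : STD_LOGIC;
      signal Y1 : STD_LOGIC;
      signal MUX_0, MUX_1, MUX_2, MUX_3 : STD_LOGIC;
      -- Named selector: avoids an ambiguous concatenation type in the select expression
      signal SEL : STD_LOGIC_VECTOR(1 downto 0);
    begin
      -- Optional inversion of both inputs and of the output
      A1 <= A when (NEG_A = '0') else not A;
      B1 <= B when (NEG_B = '0') else not B;
      Y  <= Y1 when (NEG_Y = '0') else not Y1;
      -- Four candidate logic functions
      MUX_0 <= A1 and B1;
      MUX_1 <= A1 or B1;
      MUX_2 <= A1 xor B1;
      MUX_3 <= A1 xnor B1;
      -- 4-to-1 multiplexer controlled by L1, L0
      SEL <= L1 & L0;
      with SEL select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
    end MLU_DATAFLOW;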
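A software reference model can help when checking the MLU in functional and post-synthesis simulation. The sketch below is a minimal Python model of the same dataflow, assuming every port is a single bit encoded as 0 or 1; the function names `mlu` and `truth_table` are illustrative and not part of the course materials.

```python
from itertools import product


def mlu(a, b, neg_a, neg_b, neg_y, l1, l0):
    """Bit-level model of the MLU dataflow; every argument is 0 or 1."""
    a1 = a ^ neg_a  # A1 <= A when NEG_A = '0' else not A
    b1 = b ^ neg_b  # B1 <= B when NEG_B = '0' else not B
    mux = {
        (0, 0): a1 & b1,        # MUX_0: and
        (0, 1): a1 | b1,        # MUX_1: or
        (1, 0): a1 ^ b1,        # MUX_2: xor
        (1, 1): 1 - (a1 ^ b1),  # MUX_3: xnor
    }
    y1 = mux[(l1, l0)]
    return y1 ^ neg_y  # Y <= Y1 when NEG_Y = '0' else not Y1


def truth_table():
    """All 2**7 input combinations paired with the expected output Y."""
    return [(bits, mlu(*bits)) for bits in product((0, 1), repeat=7)]
```

The 128 rows returned by `truth_table()` could be written to a file and used as expected values in a VHDL testbench.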
FPGA Implementation • After synthesis the entire implementation process is performed by FPGA vendor tools
Design Process control from Active-HDL
Xilinx FPGA Tools, ECE Labs. Flows: Aldec Active-HDL Design Flow; Xilinx ISE Design Flow. Simulation: Aldec Active-HDL (IDE), Mentor Graphics ModelSim SE. Synthesis: Xilinx XST & Synopsys Synplify Premier. Implementation: Xilinx ISE Design Suite (IDE)
Xilinx FPGA Tools, Home. Flow: Xilinx ISE Design Flow. Simulation: Aldec Active-HDL Student Edition (IDE), Mentor Graphics ModelSim PE Student Edition. Synthesis: Xilinx XST (restricted). Implementation: Xilinx ISE WebPACK (IDE) (restricted)
Altera FPGA Tools, ECE Labs. Flow: Altera Design Flow. Simulation: Mentor Graphics ModelSim-Altera. Synthesis & implementation: Quartus II Subscription Edition
Altera FPGA Tools, Home. Flow: Altera Design Flow. Simulation: Mentor Graphics ModelSim-Altera Starter (restricted). Synthesis & implementation: Altera Quartus II Web Edition (restricted)
Project
Project • semester-long • related to the research project conducted by the Cryptographic Engineering Research Group (CERG) at GMU • supporting NIST (National Institute of Standards and Technology) in the evaluation of candidates for a new cryptographic standard
Background
Crypto 101
Cryptography is Everywhere • Buying a book on-line • Teleconferencing over Intranets • Withdrawing cash from an ATM • Backing up files on a remote server
Cryptographic Standards Before 1997. [Timeline, 1970 to 2010.] Secret-Key Block Ciphers (IBM & NSA): DES – Data Encryption Standard, 1977; Triple DES, 1999; 2005 is also marked. Hash Functions (NSA): SHA, 1993; SHA-1 – Secure Hash Algorithm, 1995; SHA-2, 2003
Why a Contest for a Cryptographic Standard? • Avoid back-door theories • Speed up the acceptance of the standard • Stimulate non-classified research on methods of designing a specific cryptographic transformation • Focus the effort of a relatively small cryptographic community
Cryptographic Standard Contests [timeline, 1996 to 2013] • AES: IX.1997 to X.2000, 15 block ciphers, 1 winner • NESSIE: I.2000 to XII.2002 • CRYPTREC • eSTREAM: XI.2004 to V.2008, 34 stream ciphers, 4 HW winners + 4 SW winners • SHA-3: X.2007 to XII.2012, 51 hash functions, 1 winner
Cryptographic Contests - Evaluation Criteria • Security • Software Efficiency (μProcessors, μControllers) • Hardware Efficiency (FPGAs, ASICs) • Flexibility • Simplicity • Licensing
Specific Challenges of Evaluations in Cryptographic Contests • Very wide range of possible applications and, as a result, of performance and cost targets (throughput: single Mbits/s to hundreds of Gbits/s; cost: single cents to thousands of dollars) • Winner in use for the next 20-30 years, implemented using technologies not in existence today • Large number of candidates • Limited time for evaluation • Only one winner and the results are final
Mitigating Circumstances • Security is a primary criterion • Performance of competing algorithms tends to vary significantly (sometimes as much as 500 times) • Only relatively large differences in performance matter (typically at least 20%) • Multiple groups independently implement the same algorithms (catching mistakes, comparing best results, etc.) • Second best may be good enough
AES Contest 1997-2000
Rules of the Contest. Each team submits: • Detailed cipher specification • Justification of design decisions • Source code in C • Source code in Java • Tentative results of cryptanalysis • Test vectors
AES: Candidate Algorithms [map with candidate counts per region] • Americas (8): Canada: CAST-256, Deal; USA: Mars, RC6, Twofish, Safer+, HPC; Costa Rica: Frog • Europe (4): Germany: Magenta; Belgium: Rijndael; France: DFC; Israel, UK, Norway: Serpent • Asia (2): Korea: Crypton; Japan: E2 • Australia (1): LOKI97
AES Contest Timeline • June 1998: 15 candidates (CAST-256, Crypton, Deal, DFC, E2, Frog, HPC, LOKI97, Magenta, Mars, RC6, Rijndael, Safer+, Serpent, Twofish) • Round 1 criteria: security, software efficiency • August 1999: 5 final candidates: Mars, RC6, Twofish (USA); Rijndael, Serpent (Europe) • Round 2 criteria: security, software efficiency, hardware efficiency • October 2000: 1 winner: Rijndael (Belgium)
NIST Report: Security & Simplicity. [Figure: MARS, Twofish, Serpent, Rijndael, and RC6 placed on two axes, Security (Adequate to High) and Simplicity (Complex to Simple)]
Efficiency in software: NIST-specified platform, 200 MHz Pentium Pro, Borland C++. [Figure: bar chart of throughput [Mbits/s], axis 0 to 30, for 128-bit, 192-bit, and 256-bit keys; ciphers: Rijndael, RC6, Twofish, Mars, Serpent]
NIST Report: Software Efficiency. Encryption and Decryption Speed. [Table: speed class (high, medium, low) of RC6, Rijndael, Twofish, Mars, and Serpent on 32-bit processors, 64-bit processors, and DSPs]
Efficiency in FPGAs: Speed. Xilinx Virtex XCV-1000. [Figure: bar chart of throughput [Mbit/s], axis 0 to 500, for Serpent x8, Rijndael, Twofish, Serpent x1, RC6, Mars, with results from George Mason University, University of Southern California, and Worcester Polytechnic Institute. Data labels: 431, 444, 414, 353, 294, 177, 173, 149, 143, 104, 62, 112, 88, 102, 61]
Efficiency in ASICs: Speed. MOSIS 0.5 μm, NSA Group. [Figure: bar chart of throughput [Mbit/s], axis 0 to 700, for Rijndael, Serpent x1, Twofish, RC6, Mars; series: 128-bit key scheduling and 3-in-1 (128, 192, 256 bit) key scheduling. Data labels: 606, 443, 202, 105, 103, 104, 57, 57]
Lessons Learned. Results for ASICs matched the results for FPGAs very well, and both were very different from the software results. [Figure: FPGA (GMU+USC, Xilinx Virtex XCV-1000; x8 and x1 architectures) versus ASIC (NSA Team, 0.5 μm MOSIS; x1 architecture).] Serpent fastest in hardware, slowest in software
Lessons Learned. Hardware results matter! Final round of the AES Contest, 2000. [Figure: speed in FPGAs (GMU results) compared with votes at the AES3 conference]
Limitations of the AES Evaluation • Optimization for maximum throughput • Single high-speed architecture per candidate • No use of embedded resources of FPGAs (Block RAMs, dedicated multipliers) • Single FPGA family from a single vendor: Xilinx Virtex
FPGA Evaluations (columns: AES, eSTREAM, SHA-3) • Multiple FPGA families: No, No, Yes • Multiple architectures: No, Yes • Use of embedded resources: No, No, Yes • Primary optimization target: Throughput, Area, Throughput/Area • Experimental results: No, No • Availability of source codes: No, No, Yes • Specialized tools: No, No, Yes
ASIC Evaluations (columns: AES, eSTREAM, SHA-3) • Multiple processes/libraries: No, No, Yes • Multiple architectures: No, Yes • Primary optimization target: Throughput, Power x Area x Time, Throughput/Area • Post-layout results: No, Yes • Experimental results: No, Yes • Availability of source codes: No, No, Yes • Specialized tools: No, No, No
Benchmarking Tools
Tools for Benchmarking Implementations of Cryptography • Software: eBACS, D. Bernstein (UIC), T. Lange (TUE), 2006-present • FPGAs: ATHENa, K. Gaj, J. Kaps, et al. (GMU), 2009-present • ASICs: ?
Benchmarking in Software: eBACS
eBACS: ECRYPT Benchmarking of Cryptographic Systems, http://bench.cr.yp.to/. SUPERCOP - toolkit developed by D. Bernstein and T. Lange for measuring performance of cryptographic software • measurements on multiple machines (currently over 90) • each implementation is recompiled multiple times (currently over 1600 times) with various compiler options • time measured in clock cycles/byte for multiple input/output sizes • median, lower quartile (25th percentile), and upper quartile (75th percentile) reported • standardized function arguments (common API)
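The reported statistics can be illustrated with a short Python sketch. It assumes each measurement is a hypothetical (clock cycles, message bytes) pair for one input size and uses the "inclusive" quartile method from Python's `statistics` module (Python 3.8 or later); SUPERCOP's own computation may differ in detail, and the function name is illustrative.

```python
import statistics


def cycles_per_byte_summary(measurements):
    """Summarize repeated timings of one implementation.

    measurements: list of (clock_cycles, message_bytes) pairs (hypothetical format).
    Returns the lower quartile, median, and upper quartile in cycles/byte.
    """
    cpb = sorted(cycles / nbytes for cycles, nbytes in measurements)
    # n=4 splits the data into quartiles and yields the three cut points
    q1, median, q3 = statistics.quantiles(cpb, n=4, method="inclusive")
    return {"lower_quartile": q1, "median": median, "upper_quartile": q3}
```

Reporting the median and quartiles rather than the mean keeps a few slow outlier runs (for example, runs disturbed by interrupts) from dominating the result.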
SUPERCOP Extension for Microcontrollers – XBX: 2009-present. Allows on-board timing measurements. Developers: • Christian Wenzel-Benner, ITK Engineering AG, Germany • Jens Gräf, LiNetCo GmbH, Heiger, Germany. Supports at least the following microcontrollers: 8-bit: Atmel ATmega1284P (AVR); 32-bit: TI AR7 (MIPS), Atmel AT91RM9200 (ARM920T), Intel XScale IXP420 (ARMv5TE), Cortex-M3 (ARM)
Benchmarking in FPGAs: ATHENa
ATHENa – Automated Tool for Hardware EvaluatioN, http://cryptography.gmu.edu/athena. Open-source benchmarking environment, written in Perl, aimed at AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms. The most recent version, 0.6.2, was released in June 2011. Full features in ATHENa 1.0, to be released in 2012.
Why Athena? “The Greek goddess Athena was frequently called upon to settle disputes between the gods or various mortals. Athena Goddess known for her superb logic and intellect. Her decisions were usually well-considered, highly ethical, and seldom motivated by self-interest.” From “Athena, Greek Goddess of Wisdom and Craftsmanship”
Basic Dataflow of ATHENa. [Diagram with numbered steps linking the Designer, the User, and the ATHENa Server: interfaces + testbenches; HDL + scripts + configuration files; download of scripts and configuration files; FPGA synthesis and implementation using HDL + FPGA tools; result summary + database entries; database query; ranking of designs]
Three Components of the ATHENa Environment • ATHENa Tool • ATHENa Database of Results • ATHENa Website
ATHENa – Database of Results
ATHENa Database http://cryptography.gmu.edu/athenadb
ATHENa Database – Result View • Algorithm parameters • Design parameters: optimization target, architecture type, datapath width, I/O bus widths, availability of source code • Platform: vendor, family, device • Timing: maximum clock frequency, maximum throughput • Resource utilization: logic blocks (Slices/LEs/ALUTs), multipliers/DSP units • Tools: names & versions, detailed options • Credits: designers & contact information
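To make the structure of a result entry concrete, the following Python sketch models a few of the fields listed above and ranks designs by the throughput-to-area ratio. The class, field names, and units are assumptions chosen for illustration; they are not the ATHENa database schema.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One hypothetical database entry with a subset of the Result View fields."""
    algorithm: str
    family: str             # FPGA family
    max_clock_mhz: float    # maximum clock frequency
    throughput_mbps: float  # maximum throughput
    logic_blocks: int       # Slices, LEs, or ALUTs, depending on the family

    @property
    def throughput_per_area(self):
        """Throughput divided by logic block count (Mbit/s per block)."""
        return self.throughput_mbps / self.logic_blocks


def rank(results):
    """Order designs from the best to the worst throughput/area ratio."""
    return sorted(results, key=lambda r: r.throughput_per_area, reverse=True)
```

Because Slices, LEs, and ALUTs are different units, a ranking like this is only meaningful among results for the same FPGA family.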
ATHENa Database – Compare Feature: matching fields in grey, non-matching fields in red and blue
ATHENa - Website
ATHENa Website http://cryptography.gmu.edu/athena/ • Download of ATHENa Tool • Links to related tools. SHA-3 Competition in FPGAs & ASICs • Specifications of candidates • Interface proposals • RTL source codes • Testbenches • ATHENa database of results • Related papers & presentations
ATHENa Result Replication Files • Scripts and configuration files sufficient to easily reproduce all results (without repeating optimizations) • Automatically created by ATHENa for all results generated using ATHENa • Stored in the ATHENa Database. In the same spirit of Reproducible Research as: • J. Claerbout (Stanford University), “Electronic documents give reproducible research a new meaning,” in Proc. 62nd Ann. Int. Meeting of the Soc. of Exploration Geophysics, 1992, http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92... • Patrick Vandewalle (EPFL), Jelena Kovacevic (CMU), and Martin Vetterli (EPFL), “Reproducible research in signal processing - what, why, and how,” IEEE Signal Processing Magazine, May 2009, http://rr.epfl.ch/17/
Benchmarking Goals Facilitated by ATHENa. Comparing multiple: 1. cryptographic algorithms 2. hardware architectures or implementations of the same cryptographic algorithm 3. hardware platforms from the point of view of their suitability for the implementation of a given algorithm (e.g., choice of an FPGA device or FPGA board) 4. tools and languages in terms of the quality of results they generate (e.g., Verilog vs. VHDL, Synplicity Synplify Premier vs. Xilinx XST, ISE v. 13.1 vs. ISE v. 12.3)
Your Project: Implementation and Benchmarking of Authenticated Ciphers
Features of Authenticated Ciphers: 1. Confidentiality 2. Message integrity 3. Message authentication [each illustrated with a diagram of Alice, Bob, and Charlie]
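The three features can be demonstrated with a toy encrypt-then-MAC construction built only from the Python standard library. This is an illustrative sketch, not one of the authenticated ciphers studied in the project and not suitable for real use: the hash-based keystream, the function names, and the separate encryption and MAC keys are all assumptions made for the example.

```python
import hashlib
import hmac


def _keystream(key, nonce, length):
    """Toy keystream: SHA-256 of key, nonce, and a block counter (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(enc_key, mac_key, nonce, plaintext):
    """Encrypt (confidentiality), then tag nonce and ciphertext (integrity, authentication)."""
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag


def open_sealed(enc_key, mac_key, nonce, ciphertext, tag):
    """Verify the tag first; decrypt only if the message is authentic and unmodified."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))
```

An eavesdropper without `enc_key` sees only the ciphertext, and anyone without `mac_key` who modifies or forges a message causes `open_sealed` to raise `ValueError` instead of returning data.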
All Projects - Organization • Projects divided into phases • Deliverables for each phase submitted through Blackboard at selected checkpoints and evaluated by the instructor and/or TA • Feedback provided to students on a best-effort basis • Final report and code submitted through Blackboard at the end of the semester
Honor Code Rules • All students are expected to write and debug their codes individually • Students are encouraged to help and support each other in all problems related to the - operation of the CAD tools - understanding of an investigated algorithm and existing implementations - understanding of the project tasks
Additional Skills Learned in the Project • Reading & understanding the specification of a complex algorithm • Design of new hardware architectures based on existing architectures (datapath & controller) • Reading, understanding, and modifying existing VHDL code • Using embedded resources of modern FPGAs • Characterizing performance of your codes for multiple FPGA families