- Number of slides: 15

Building a Synthesizable x86
Eriko Nurvitadhi, James C. Hoe, Babak Falsafi
{enurvita, jhoe, babak}@ece.cmu.edu
SIMFLEX/PROTOFLEX Computer Architecture Lab at http://www.ece.cmu.edu/~simflex/

Motivation
• Build a synthesizable x86 functional model for prototyping
  ◦ most widely-used ISA
  ◦ Intel won't give out theirs
• Problem: a very complicated ISA
  ◦ many instructions
    ▪ 482 instructions total (ADD has 14 variations)
  ◦ many individually complicated instructions
    ▪ PUSHAD: push all GP registers to stack
  ◦ many under-specified instructions
    ▪ LOADALL inst; BCD operation flag updates
• Also must be maintainable & extensible for return on investment
June 22, 2006

Overcoming Complexity
• 4 key ingredients in our approach
  ◦ working SW simulator as design spec
  ◦ simplified multi-cycle datapath
  ◦ high-level HDL
  ◦ HW-SW co-simulation validation & evaluation
• What we have today...
  ◦ an x86 functional model in Bluespec
  ◦ all real-mode general-purpose insts
    ▪ includes I/O instructions!
  ◦ boots FreeDOS OS in co-simulation testbench
  ◦ synthesizes to 85% of a Virtex II Pro 70 FPGA
  ◦ max 10 MIPS (based on synthesis + simulation)

Outline
• Introduction
• Our Approach
• Status and Results
• Discussions and Future work

Functional View of an ISA
[Figure: functional model. Instructions Inst_1 ... Inst_n map to behaviors beh_1 ... beh_m; each behavior is a sequence of actions (ACT).]
• ISA = architectural states + instructions
• instruction = set of alternate behaviors
  ◦ e.g., due to different addressing modes
  ◦ x86 has 482 insts but ~1000 behaviors
• behavior = sequence of actions that read & alter states

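The decomposition on this slide (instruction → alternate behaviors → actions on state) can be sketched as a small data model. This is a minimal illustration in Python, not the authors' Bluespec code: the names `Action`, `Behavior`, `Instruction`, the toy `INC` entry, and the three-register state are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]              # architectural state: register name -> value
Action = Callable[[State], None]    # an action reads and alters state


@dataclass
class Behavior:
    name: str
    actions: List[Action]           # executed in order

    def run(self, state: State) -> None:
        for act in self.actions:
            act(state)


@dataclass
class Instruction:
    mnemonic: str
    behaviors: Dict[str, Behavior]  # alternate behaviors, keyed by variant


def inc_reg(reg: str) -> Action:
    def act(state: State) -> None:
        state[reg] = (state[reg] + 1) & 0xFFFF   # 16-bit wraparound
    return act


def advance_ip(n: int) -> Action:
    def act(state: State) -> None:
        state["ip"] = (state["ip"] + n) & 0xFFFF
    return act


# One instruction, two alternate behaviors (hypothetical toy example).
INC = Instruction("INC", {
    "reg_ax": Behavior("inc_ax", [inc_reg("ax"), advance_ip(1)]),
    "reg_bx": Behavior("inc_bx", [inc_reg("bx"), advance_ip(1)]),
})

state = {"ax": 0xFFFF, "bx": 7, "ip": 0x100}
INC.behaviors["reg_ax"].run(state)   # ax wraps to 0, ip advances by 1
```

Keeping behaviors as separate named entries mirrors the slide's point that the behavior count (~1000), not the instruction count (482), is what a model has to implement.
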
SW x86 Sim as ISA Spec
• Simulator source code = precise and executable design spec
• We use Bochs (http://bochs.sourceforge.net/)
  ◦ open-source
  ◦ code structure fits our high-level ISA view, i.e.,
    ▪ explicit architecture state declaration
    ▪ one instruction behavior ↔ one C++ function
  ◦ (essentially) complete x86 functionalities
    ▪ simulates a complete PC system
    ▪ runs various OSs (e.g., Linux, Win XP)
    ▪ supports 386 through Pentium Pro

Multi-cycle Implementation
• Sequential, multi-cycle execution
  [Figure: Start → Fetch → Decode → Execute → Commit → Finish]
• Top-level view
  [Figure: x86 functional model containing arch and aux states, a decoder, and functional units (FUs), with memory accesses and I/O operations]

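The sequential stage order on this slide can be sketched as a tiny state machine. This is an illustrative Python model only: treating Start and Finish as one cycle each, and the `execute_cycles` parameter for a multi-cycle Execute stage, are assumptions not stated on the slide.

```python
from enum import Enum, auto


class Stage(Enum):
    START = auto()
    FETCH = auto()
    DECODE = auto()
    EXECUTE = auto()
    COMMIT = auto()
    FINISH = auto()


ORDER = list(Stage)  # Enum iteration preserves definition order


def run_instruction(execute_cycles: int = 1):
    """Step one instruction through the stages, one stage visit per cycle.

    execute_cycles models a behavior whose Execute stage takes several
    cycles (hypothetical parameter). Returns the stages visited, one
    entry per cycle.
    """
    trace = []
    stage = Stage.START
    remaining = execute_cycles
    while True:
        trace.append(stage)
        if stage is Stage.FINISH:
            return trace
        if stage is Stage.EXECUTE and remaining > 1:
            remaining -= 1          # stay in Execute for another cycle
            continue
        stage = ORDER[ORDER.index(stage) + 1]


simple = run_instruction()                   # one visit per stage
complex_ = run_instruction(execute_cycles=3)  # Execute repeated three times
```

Because only one instruction is in flight at a time, the cycle count per instruction is simply the length of the returned trace.
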
Bluespec Design Capture
• Explicit state declaration
  ◦ x86 architectural states
  ◦ auxiliary simulation states used by Bochs
• Predicated atomic rules
  ◦ one rule ↔ one action in our ISA view
• Maintainability & extensibility
  ◦ new behavior: add rules
  ◦ changing behavior: add/modify rules
• Optimizations (low-level)
  ◦ reduce logic: reuse + combine rules
  ◦ reduce critical path delay: split rules

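The idea of predicated atomic rules can be illustrated with a toy rule engine. This Python sketch is not Bluespec and does not reproduce its scheduler: it fires only the first enabled rule per step, which is a simplification. The `Rule` class, the `stage` field, and the three example rules are hypothetical; opcode 0x40 is used as the one-byte `INC AX` encoding.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass
class Rule:
    name: str
    guard: Callable[[State], bool]    # predicate: may this rule fire?
    body: Callable[[State], State]    # returns the complete updated state


def step(rules: List[Rule], state: State) -> State:
    """Fire the first enabled rule atomically; leave state unchanged if none fires."""
    for rule in rules:
        if rule.guard(state):
            return rule.body(dict(state))   # body sees a private copy
    return state


rules = [
    Rule("fetch",
         lambda s: s["stage"] == 0,
         lambda s: {**s, "stage": 1}),
    Rule("exec_inc_ax",
         lambda s: s["stage"] == 1 and s["opcode"] == 0x40,
         lambda s: {**s, "ax": (s["ax"] + 1) & 0xFFFF, "stage": 2}),
    Rule("commit",
         lambda s: s["stage"] == 2,
         lambda s: {**s, "ip": s["ip"] + 1, "stage": 0}),
]

state = {"stage": 0, "opcode": 0x40, "ax": 1, "ip": 0}
for _ in range(3):          # fetch, execute, commit
    state = step(rules, state)
```

In this style, supporting a new behavior means appending another guarded rule, which is the extensibility argument the slide makes.
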
HW-SW co-simulation for Validation and Evaluation
• Virtually “plug-in” our model into a PC
  ◦ execute Bochs to provide reference behavior
  ◦ simulate RTL alongside the simulated Bochs PC
• For validation and performance (CPI) eval
[Figure, two panels. Validation: Bochs CPU == RTL CPU, with MEM and I/Os. Performance Evaluation: RTL CPU with Bochs MEM and I/Os.]

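The validation setup (reference CPU compared against the RTL CPU) amounts to a lock-step state comparison. The following Python sketch shows that pattern under stated assumptions: `cosim_validate`, the per-instruction step functions, and the deliberately buggy toy model are all hypothetical stand-ins, not Bochs or Verilator interfaces.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, int]
StepFn = Callable[[State], State]   # executes one instruction, returns new state


def cosim_validate(ref_step: StepFn, dut_step: StepFn, init: State,
                   max_insts: int) -> Tuple[int, Optional[Dict[str, Tuple[int, int]]]]:
    """Run reference and DUT in lock step, comparing state after each instruction.

    Returns (index of first mismatching instruction, {field: (ref, dut)}),
    or (max_insts, None) if no mismatch was seen.
    """
    ref, dut = dict(init), dict(init)
    for n in range(max_insts):
        ref, dut = ref_step(ref), dut_step(dut)
        if ref != dut:
            diff = {k: (ref.get(k), dut.get(k))
                    for k in sorted(set(ref) | set(dut))
                    if ref.get(k) != dut.get(k)}
            return n, diff
    return max_insts, None


def ref_inc(s: State) -> State:      # reference model: 16-bit wraparound
    return {**s, "ax": (s["ax"] + 1) & 0xFFFF}


def buggy_inc(s: State) -> State:    # toy DUT that forgets to wrap
    return {**s, "ax": s["ax"] + 1}


ok = cosim_validate(ref_inc, ref_inc, {"ax": 0xFFFE}, 10)
bad = cosim_validate(ref_inc, buggy_inc, {"ax": 0xFFFE}, 10)
```

Reporting the index of the first divergent instruction, rather than only a pass/fail result, is what makes long traces practical to debug.
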
Co-Simulation Testbench
[Figure: testbench flow]
• Bochs src code → (manual coding) → Bluespec x86
• Bluespec x86 → (Bluespec compilation, automated) → Verilog x86
• Verilog x86 → (C++ conversion with Verilator, automated) → C++ x86
• Workloads on Bochs → (Bochs simulation) → traces
• C++ x86 + traces → (co-simulation) → validation and performance evaluation results

Outline
• Introduction
• Our Approach
• Status and Results
• Discussions and Future work

Implementation Progress
• Implemented ISA subset
  ◦ all real-mode general-purpose instructions
    ▪ 166 insts, 369 inst behaviors
  ◦ compared to complete x86
    ▪ 482 insts, ~1000 inst behaviors
• Synthesis
  ◦ convert Bluespec to synthesizable Verilog
  ◦ Xilinx ISE 7.1, Virtex II Pro 70 (FPGA on BEE2)
  ◦ results: 98 MHz, 28K slices (85% util)

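A few back-of-the-envelope figures follow from the numbers on this slide and the "Max 10 MIPS" claim in the earlier summary. The calculation below is a reader's sanity check, not a result from the talk: it assumes the 10 MIPS figure was obtained at the 98 MHz synthesis clock, and it treats "~1000" and "28K" as exact.

```python
# Figures quoted in the slides.
insts_done, insts_total = 166, 482
behaviors_done, behaviors_total = 369, 1000   # "~1000" in the slides, so approximate
clock_mhz = 98
peak_mips = 10                                # "Max 10 MIPS"
slices_used, utilization = 28_000, 0.85       # "28K slices (85% util)"

inst_coverage = insts_done / insts_total              # about 0.344
behavior_coverage = behaviors_done / behaviors_total  # about 0.369

# Assumption: 10 MIPS was measured at the 98 MHz clock.
implied_cpi = clock_mhz / peak_mips                   # about 9.8 cycles per instruction

# Device capacity implied by the utilization figure.
implied_total_slices = slices_used / utilization      # roughly 33K slices

print(f"instruction coverage: {inst_coverage:.1%}")
print(f"behavior coverage:    {behavior_coverage:.1%}")
print(f"implied CPI:          {implied_cpi:.1f}")
print(f"implied device size:  {implied_total_slices:,.0f} slices")
```

Under these assumptions, roughly a third of the ISA already fills most of the device, which is the resource concern raised later in the talk.
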
Co-simulation Results
• Validation
  ◦ validated our model w/ FreeDOS bootup traces
    ▪ tested first 140M dynamic instructions
    ▪ exercised 183 inst behaviors
• Performance Evaluation
  ◦ also with FreeDOS bootup traces

A Complete x86?
• To finish the x86 model
  ◦ can be done, but takes effort
  ◦ consumes a lot of FPGA resources
• Do we really need all of it?
  ◦ a workload uses only a subset of the ISA
  ◦ some insts are used more often than others
  ◦ parts of the ISA are never or rarely used
• PROTOFLEX migration
  ◦ combine FPGA & simulation model
  ◦ necessary subset in HW, the rest in SW

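The "necessary subset in HW, the rest in SW" idea can be illustrated by partitioning behaviors according to how often a workload executes them. This Python sketch is an assumption-laden illustration: selecting by raw frequency, the `budget` parameter, and the toy trace are all hypothetical, and the slides do not say how PROTOFLEX chooses its subset.

```python
from collections import Counter
from typing import Iterable, List, Set


def pick_hw_subset(trace: Iterable[str], budget: int) -> Set[str]:
    """Choose the `budget` most frequently executed behaviors for hardware."""
    counts = Counter(trace)
    return {name for name, _ in counts.most_common(budget)}


def hw_fraction(trace: List[str], hw: Set[str]) -> float:
    """Fraction of dynamic instructions handled in hardware; the rest fall back to SW."""
    return sum(1 for b in trace if b in hw) / len(trace)


# Toy dynamic trace of behavior names (hypothetical workload).
trace = ["mov"] * 6 + ["add"] * 3 + ["pushad"] * 1

hw = pick_hw_subset(trace, budget=2)
covered = hw_fraction(trace, hw)
```

Here two of three behaviors cover 90% of the toy trace, which is the kind of skew the slide's question "Do we really need all of it?" relies on.
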
Future Work
• Short-term (Fall '06)
  ◦ implement protected-mode support
  ◦ validate/evaluate w/ more workloads
    ▪ Linux, SPEC-CPU, commercial apps (DB2)
  ◦ deployment on the BEE2 board
• Long-term
  ◦ full-system prototype execution
  ◦ architectural exploration
SIMFLEX/PROTOFLEX Computer Architecture Lab at http://www.ece.cmu.edu/~simflex/