

FlexCC2: An Optimizing Retargetable C Compiler for DSP Applications
V. Bertin, J-M. Daveau, P. Guillaume, D. Pilat, C. Robine, M. Santana, T. Théry
FlexWare Embedded System Technology, STMicroelectronics

Plan
• Context
• Goals
• FlexCC2
  - architecture
  - optimizations
• Results
• Conclusion
COMPANY CONFIDENTIAL Crolles

Context: Industrial Compiler
Enabling technology for embedded processors
• Specific instructions/features for certain classes of applications.
• Loop intensive.
• Time-to-market
• Retargetability
• Performance located in small portions of critical code.
• Productivity
[Diagram labels: ASIP / AS-DSP; Embedded System (digital imaging, MP3, hard disk, mobile); Embedded software]

Goals
• High-quality generated code
  - best in class for DSP compilers
  - remove any incentive to hand-code assembly
• Irregular target architectures
  - encoding constraints
  - irregular instruction-level parallelism
  - register-set constraints
• Specific instructions and features
  - hardware loops, multiply-accumulate, addressing modes, post-operations, …
• Short retargeting time
  - shorten time-to-market for new processors

FlexCC2 Overall Design
• Flexible compilation framework:
  - Easily add/remove generic/custom optimizations.
  - Re-order optimizations.
  - Retargetable compilation system.
• Multi-level framework
  - Machine level optimizations.
  - Multi-level optimizations.
• DSP oriented
  - Support for DSP datatypes and operations
  - High added-value DSP optimizations

FlexCC2 Architecture
[Diagram: compilation flow from .c input to .asm output across two intermediate representations.]
• High Level IR: anc0, cse, arT, HW loops, lower, code generation
• Low Level IR: software pipeliner, local scheduler, global scheduler, register allocator, post operation, HW loops
• Description files: EDL, TDF, CGD, EDF, SDF

High-level framework: CoSy®
[Diagram: CoSy flow from .c input to .asm output. Labels: Front End, Back End, BEG, CCMIR; engines strength, anc0, chainflow, cse, match, gra, sched, lowering engine, emit.]
• EDL: Engine Description List
• TDF: Target Description File
• CGD: Code Generator Description

Specific DSP optimizations
• Array to pointer transformation (loops + arrays → loops + pointers)
  - Loop analysis: induction expressions (IEs)
  - Partitioning into connivance sets: parallel evolution of references, 1 set = 1 pointer, sets manipulation
  - Pointers generation from the addressing resources & operations:
    ADDRESS { Ax[1..6]; … } OPERATIONS { Ax: ++; Ax[1]+=2; … }
  - Group access operations into families
  - Optimize address modes
  - Use index registers
  - Handle loop nesting
• Support for hardware-do loops
• Intrinsic functions recognition and replacement
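
The array to pointer transformation named on this slide can be illustrated at source level. The sketch below is only an assumption about the general shape of such a rewrite, not FlexCC2's actual output: the function names and the reversed-dot-product kernel are hypothetical. Each array reference whose address evolves by a constant step per iteration gets its own pointer, so the address update can map onto a post-increment or post-decrement addressing mode.

```c
#include <assert.h>
#include <stddef.h>

/* Array form: every access recomputes an address from the loop index. */
long dot_indexed(const short *x, const short *y, size_t n)
{
    assert(x != NULL && y != NULL);
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (long)x[i] * y[n - 1 - i];
    return acc;
}

/* Pointer form: one pointer per reference, each stepped by a constant.
 * px walks forward and py walks backward, which is the pattern that
 * post-modify addressing modes are built for. */
long dot_pointers(const short *x, const short *y, size_t n)
{
    assert(x != NULL && y != NULL);
    const short *px = x;
    const short *py = y + n;   /* one past the end; decremented before use */
    long acc = 0;
    for (size_t k = n; k != 0; k--)
        acc += (long)*px++ * *--py;
    return acc;
}
```

Both functions compute the same sum; the second simply makes the two address streams explicit, and its down-counting loop is also the form a hardware loop counter would use.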

EliXir Back-end Infrastructure
[Diagram: µ-engines operating on the Low Level IR: software pipeliner, local scheduler, global scheduler, register allocator, post operation, HW loops, liveness, dwarf.]
• SDF: machine description
• EDF: µ-engine chaining

µ-engines Flow
[Diagram: Code Generator → pre-allocation optimizations → Register Allocation → post-allocation optimizations → ASMdump → output assembly file.]
• µ-engines shown: Coalesce, Hwloop, Post-Op, Software Pipelining, Scheduler, Super Blocker, Dataflow Peephole, …
• Analyses shown: Liveness, Dominator, Paths

Register Allocation Framework
[Diagram: layered register allocation framework.]
• Layers: Targeting API, microengines, allocation API, low level API, C++ classes
• Components: Conservative Coalescer, SSA, Briggs Allocator, Callahan Allocator, Briggs API, Spill Manager / Optimizer, Shuffle code Manager, RegsetGroup, Interference Graph, StackInfo, RegId, Dependencies, LoopTree
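
The slide names a conservative coalescer alongside a Briggs allocator. As a minimal sketch, assuming the textbook Briggs criterion (the slide does not say which test FlexCC2 applies), two non-interfering live ranges may be merged when the merged node would have fewer than k neighbors of significant degree, where k is the number of available registers. The `igraph` type and all function names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16

struct igraph {
    int n;                           /* number of live ranges */
    bool adj[MAX_NODES][MAX_NODES];  /* symmetric interference matrix */
};

void add_edge(struct igraph *g, int u, int v)
{
    assert(u != v && u < g->n && v < g->n);
    g->adj[u][v] = g->adj[v][u] = true;
}

static int degree(const struct igraph *g, int v)
{
    int d = 0;
    for (int u = 0; u < g->n; u++)
        if (g->adj[v][u])
            d++;
    return d;
}

/* Briggs test: merging a and b is safe if the merged node has fewer
 * than k neighbors whose degree is at least k. */
bool briggs_can_coalesce(const struct igraph *g, int a, int b, int k)
{
    assert(a != b && !g->adj[a][b]);  /* interfering ranges cannot merge */
    int significant = 0;
    for (int u = 0; u < g->n; u++) {
        if (u == a || u == b)
            continue;
        if (!g->adj[a][u] && !g->adj[b][u])
            continue;                 /* not a neighbor of the merged node */
        int d = degree(g, u);
        if (g->adj[a][u] && g->adj[b][u])
            d--;                      /* two edges collapse into one */
        if (d >= k)
            significant++;
    }
    return significant < k;
}
```

The test is conservative in the sense that it never turns a k-colorable graph into an uncolorable one, at the cost of rejecting some merges that would have been harmless.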

Processor Specific Instr./Features
• Managed as target specific or generic
  - Intrinsics recognition and replacement.
  - Post operation, post increment.
• Mainly handled by a specific engine or µ-engine.
• Some optimizations require retargeting.
• Make use of various EliXir APIs (dataflow graph, scheduling, …).

Intrinsics Recognition & Replacement
C source:
    if (a > b) max = a; else max = b;
• Control flow graph and expression trees (then: max = a; else: max = b)
• Graph pattern matching against C instruction patterns
  - complex expressions
  - multi-statements
Recognized as:
    max = L_max(a, b);
Unoptimized ASM:
    cmp r1, r2
    move r2, r3
    move if(ge), r1, r3
Optimized ASM:
    max r1, r2, r3
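
The source-level effect of the replacement can be sketched as follows. `L_max` is the intrinsic named on the slide; the portable definition given here is a hypothetical stand-in so the example compiles on a host machine, and the 32-bit operand type is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for the L_max intrinsic. On the
 * target, the compiler would emit a single max instruction instead. */
static int32_t L_max(int32_t a, int32_t b)
{
    return a > b ? a : b;
}

/* Multi-statement form that the recognizer has to match. */
int32_t max_if_else(int32_t a, int32_t b)
{
    int32_t max;
    if (a > b)
        max = a;
    else
        max = b;
    return max;
}

/* Form after replacement. */
int32_t max_intrinsic(int32_t a, int32_t b)
{
    return L_max(a, b);
}

/* The replacement is only valid if both forms agree on every input. */
void check_pair(int32_t a, int32_t b)
{
    assert(max_if_else(a, b) == max_intrinsic(a, b));
}
```

Because the if/else spans several statements and two branches of the control flow graph, a recognizer that only looks at single expression trees would miss it, which is presumably why the slide lists multi-statement matching separately.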

Dataflow Peephole
• Analyses: dataflow graph, def-use graph, liveness
• Pattern matching against dataflow instruction patterns
[Figure: assembly example, instructions in the order they appear on the slide.]
         rep L14, r5h
         ldx_f ax1, r4h
    L12: ldx_f ax1, r4h
         ldx_f ax2, r1h
         ldx_f axx1, r0h
         L_fmul r0h, r1h, r3
         dmv r4h, r1h
         fmul r0h, r1h, r0h
         X_deplsp r0h, r0
         L_addsat r3, r0, r3
         dmv r4h, r1h
         mea ax1, ++#-1
         …
         mea axx1, ++#1
         mea ax2, ++#-1
         mea ax1, ++#-1
    L14: ldx_f ax1--, r4h
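
The last instruction on the slide, a load with a post-decrement on ax1, suggests that one of the peephole patterns folds a separate address update into the load that precedes it. The sketch below illustrates that idea on a toy instruction list; the `insn` structure, the opcodes, and the function name are all hypothetical, and it assumes a single basic block in which `addr_reg` records every read or write of an address register.

```c
#include <assert.h>
#include <stddef.h>

enum opcode { OP_LOAD, OP_ADDR_ADD, OP_LOAD_POST, OP_OTHER, OP_NOP };

struct insn {
    enum opcode op;
    int addr_reg;  /* address register touched, or -1 if none */
    int dst_reg;   /* data register written by a load, or -1 */
    int step;      /* constant added to addr_reg, 0 if unused */
};

/* Fold "load via A ... A += step" into one post-modify load, provided
 * no instruction in between touches A. Returns the number of folds. */
size_t fold_post_modify(struct insn *code, size_t n)
{
    assert(code != NULL || n == 0);
    size_t folds = 0;
    for (size_t i = 0; i < n; i++) {
        if (code[i].op != OP_LOAD)
            continue;
        for (size_t j = i + 1; j < n; j++) {
            if (code[j].addr_reg != code[i].addr_reg)
                continue;             /* unrelated instruction: keep going */
            if (code[j].op == OP_ADDR_ADD) {
                code[i].op = OP_LOAD_POST;
                code[i].step = code[j].step;
                code[j].op = OP_NOP;  /* the update now lives in the load */
                code[j].addr_reg = -1;
                folds++;
            }
            break;  /* the first instruction touching A decides */
        }
    }
    return folds;
}
```

Following the def-use chain of the address register rather than looking only at adjacent instructions is what distinguishes this from a classic sliding-window peephole.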

Retargeting FlexCC2
[Diagram: retargeting inputs and the compilation stages they drive.]
• SDF: machine description
• EDL: engine flow
• CGD: code generation rules (BEG)
• EDF: µ-engine flow
• C++ API: (µ-)engines
• Intrinsic patterns
• IR levels: High level IR, Lowered IR, Low level IR, connected by engines, code generation, and µ-engines

Results
• MMDSP+: single-MAC DSP core.
• Retargeting time: 4 months.
• ETSI Enhanced Full Rate (EFR) benchmark.

                          FlexCC1   low level C   FlexCC2   + 5 pragmas
MIPS                      52.8      25.5          18.58     16.96
Instructions (64 bits)    6366      6322          6376      6318
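
The figures can be compared with a simple ratio. This is only a sketch of the arithmetic: it assumes the first and last MIPS values (52.8 and 16.96) belong to FlexCC1 and to FlexCC2 with 5 pragmas respectively, and that a lower MIPS figure means fewer cycles are needed to run the codec. The function name is hypothetical.

```c
#include <assert.h>

/* Ratio of a baseline measurement to an improved one. Values above 1
 * mean the improved figure is smaller than the baseline. */
double ratio(double baseline, double improved)
{
    assert(improved > 0.0);
    return baseline / improved;
}
```

Under those assumptions, `ratio(52.8, 16.96)` is about 3.11, while the instruction counts barely move: `ratio(6318.0, 6366.0)` is about 0.99. On that reading the gain is in cycle count, not in code size.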

Results
[Chart: results by compiler configuration. Axis labels: CoSy, EliXir, software pipelining, register allocation, + HW loops, + arT, + post-ops.]

Original research work
FlexCC2 includes advanced in-house research work
  - arT / GarT.
  - flexible back-end infrastructure
  - retargetable register allocation for irregular architecture.
  - retargetable dataflow peephole optimizer.
  - automatic intrinsic functions recognition.
  - MMX optimization using pattern matching

Future work
• Interprocedural optimizations.
• Aliasing.
• Memory placement.
• MMX optimization using pattern matching.
• Interaction between scheduling and register allocation.
• Improved retargetability.

Conclusion
• Keystone for embedded software development
  - Synthesizing application code into the processor instruction set (I/S)
  - Exploiting processor features
  - Optimizing code and resource usage
  - Driving processor architecture evolution
• Modular and extensible compiler framework
  - At high and low level.
  - State of the art optimizations.
  - Advanced DSP optimizations.
  - Target specific optimizations.
  - Short retargeting time.
• Perspectives: compiler as a CAD tool for SoCs