
- Number of slides: 59
Telescoping Languages: A Framework for Generating High-Performance Problem-Solving Systems
Kennedy
Center for High Performance Software, Rice University
http://www.cs.rice.edu/~ken/Presentations/Telescope.pdf
Center for High Performance Software Research
Collaborators
Bradley Broom, Arun Chauhan, Keith Cooper, Jack Dongarra, Rob Fowler, Lennart Johnsson, Chuck Koelbel, Cheryl McCosh, John Mellor-Crummey, Linda Torczon
Philosophy
• Compiler Technology = Off-Line Processing
  — Goals: improved performance and language usability
    – Making it practical to use the full power of the language
  — Trade-off: preprocessing time versus execution time
  — Rule: performance of both compiler and application must be acceptable to the end user
• Examples
  — Macro expansion
    – PL/I interpretive macro facility
    – Fixed macros can be compiled: 10x improvement with compilation
  — Transmeta “Code Morphing”
    – Dynamic compilation of machine code
Making Languages Usable
It was our belief that if FORTRAN, during its first months, were to translate any reasonable “scientific” source program into an object program only half as fast as its hand-coded counterpart, then acceptance of our system would be in serious danger... I believe that had we failed to produce efficient programs, the widespread use of languages like FORTRAN would have been seriously delayed.
— John Backus
A Java Experiment
• Scientific Programming in Java
  — Goal: make it possible to use the full object-oriented power for scientific applications
    – Many scientific implementations mimic Fortran style
• OwlPack Benchmark Suite
  — Three versions of LinPACK in Java
    – Fortran style
    – Lite object-oriented style
    – Full polymorphism (no differences for type)
• Experiment
  — Compare running times for different styles on same Java VM
  — Evaluate potential for compiler optimization
Performance Results Using JDK 1.2 JIT on SUN Ultra 5
Programming Productivity
• Challenges
  — programming is hard
  — professional programmers are in short supply
  — high performance will continue to be important
• One Strategy: Make the End User a Programmer
  — professional programmers develop components
  — users integrate components using:
    – problem-solving environments (PSEs) based on scripting languages (possibly graphical)
      examples: Visual Basic, Tcl/Tk, AVS, Khoros
• Compilation for High Performance
  — translate scripts and components to common intermediate language
  — optimize the resulting program using interprocedural methods
Script-Based Programming
[Diagram elements: Script, Component Library, User Library, Translator, Intermediate Code, Global Optimizer, Code Generator]
Problem: long compilation times, even for short scripts!
Problem: expert knowledge on specialization lost
Telescoping Languages
[Diagram elements, library side: L1 Class Library, Compiler Generator (could run for hours), L1 Compiler (understands library calls as primitives)]
[Diagram elements, script side: Script, Script Translator, L1 Compiler, Vendor Compiler, Optimized Application]
Telescoping Languages: Advantages
• Compile times can be reasonable
  — More compilation time can be spent on libraries
  — Script compilations can be fast
    – Components reused from scripts may be included in libraries
• High-level optimizations can be included
  — Based on specifications of the library designer
    – Properties often cannot be determined by compilers
    – Properties may be hidden after low-level code generation
• User retains substantive control over language performance
  — Mature code can be built into a library and incorporated into language
• Reliability can be improved
  — Specialization by compilation framework, not user
Applications
• Matlab Compiler
  — Automatically generated from LAPACK or ScaLAPACK
    – With help via annotations from the designer
• Generator for ARPACK
  — Library developer maintains code in Matlab
  — Currently recodes in Fortran by hand; could be automated
• Flexible Data Distributions
  — Failing of HPF: inflexible distributions
  — Data distribution == collection of interfaces that meet specs
  — Compiler applies standard transformations
• Generator for Grid Computations
  — GrADS: automatic generation of NetSolve
Application: Matlab for Signal Processing
• Automatically generated from LAPACK or ScaLAPACK
  — With help via annotations from the designer
• Special project: Signal Processing Applications written in Matlab
  — Users want simplicity and performance
  — Matlab currently gives them the first but not the second
    – Codes rewritten in C for communications devices
  — Run signal processing procedures through the generator
    – Many code modules reused
Application: POOMA
• Procedure library for computational hydrodynamics
  — Distributed data structures
    – vectors, arrays, tensors
  — Coded in C++
  — Context optimizations coded into template expansion mechanism
    – 20-line program compiles for over an hour on 32 processors
• Telescoping languages
  — Enhanced reliability
  — Generate POOMA from simpler libraries for Fortran and Java
Requirements of Script Compilation
• Scripts must generate efficient programs
  — Comparable to those generated from standard interprocedural methods
  — Avoid need to recode in standard language
• Script compile times should be proportional to length of script
  — Not a function of the complexity of the library
  — Principle of “least astonishment”
Telescoping Languages
[Diagram elements: Script Translator, L1 Compiler (understands library calls as primitives), Vendor Compiler, Optimized Application]
Script Compilation Algorithm
• Propagate variable property information throughout the program
  — Use jump functions to propagate through calls to library
• Apply high-level transformations
  — Driven by information about properties
  — Ensure that process applies to expanded code
• Select and substitute specialized variants for library calls
  — At each call site, determine the best approximation to parameter properties that is reflected by a specialized fragment in the code database
    – Use a method similar to “unification”
  — Substitute fragment from database for call
    – This could contain a call to a lower-level library routine.
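The variant-selection step can be pictured with a small sketch. This is only an illustration under stated assumptions: the routine name `solve`, the property names, the variant names, and the "most specific match wins" rule are all hypothetical, and the slides do not specify how the unification-like matching is actually implemented.

```python
# Hypothetical code database: each library routine maps to a list of
# (required_properties, variant_name) pairs. The empty requirement set
# stands for the generic, unspecialized routine.
CODE_DATABASE = {
    "solve": [
        ({"shape": "triangular", "dtype": "real"}, "solve_tri_real"),
        ({"shape": "triangular"}, "solve_tri"),
        ({}, "solve_generic"),
    ],
}


def select_variant(routine, known_properties):
    """Return the most specific variant whose requirements all hold.

    known_properties holds what property propagation established about
    the parameters at one call site (an assumption of this sketch).
    """
    best_name, best_score = None, -1
    for required, name in CODE_DATABASE[routine]:
        matches = all(known_properties.get(k) == v for k, v in required.items())
        if matches and len(required) > best_score:
            best_name, best_score = name, len(required)
    return best_name
```

A call site where only the matrix shape is known would select `solve_tri`; a call site with no known properties falls back to `solve_generic`.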
Telescoping Languages
[Diagram elements: L1 Class Library, Compiler Generator (could run for hours), L1 Compiler]
Library Analysis and Preparation
• Discovery of Critical Properties and Propagator Construction
• Analysis of Transformation Specifications
  — Construction of a specification-driven translator for use in compiling scripts
• Code Specialization for Different Sets of Parameter Properties
Library Analysis and Preparation
• Discovery of Critical Properties and Propagator Construction
  — Which properties of parameters affect optimization
    – Examples: value, type, rank and size of matrix
Discovery of Critical Properties
• From specifications by the library designer
  — If the matrix is triangular, then…
• From examining the code itself
  — Look at a promising optimization point
  — Determine conditions under which we can make significant optimizations
  — See if any of these conditions can be mapped back to parameter properties
• From sample calling programs provided by the designer
  — call average(shift(A, -1), shift(A, +1))
    – Can save on memory accesses
Examining the Code
• Example from LAPACK

    subroutine VMP(C, A, B, m, n, s)
      integer m, n, s
      real A(n), B(n), C(m)
      i = 1
      do j = 1, n
        C(i) = C(i) + A(j)*B(j)    ! vectorizable if s != 0
        i = i + s
      enddo
    end subroutine VMP
Library Analysis and Preparation
• Discovery of Critical Properties and Propagator Construction
  — Which properties of parameters affect optimization
    – Examples: value, type, rank and size of matrix
  — Construction of jump functions for the library calls
    – With respect to critical properties
Library Analysis and Preparation
• Discovery of Critical Properties and Propagator Construction
  — Which properties of parameters affect optimization
    – Examples: value, type, rank and size of matrix
  — Construction of jump functions for the library calls
    – With respect to critical properties
• Analysis of Transformation Specifications
  — Construction of a specification-driven translator for use in compiling scripts
High-level Identities
• Often library developer knows high-level identities
  — Difficult for the compiler to discern
  — Optimization should be performed on sequences of calls rather than code remaining after expansion
• Example: Push and Pop
  — Designer: Push(x) followed by y = Pop() becomes y = x
    – Ignore possibility of overflow in Push
• Example: Trigonometric Functions
  — Sin and Cos used in same loop; both computed using expensive calls to the trig library
  — Recognize that cos(x) and sin(x) can be computed by a single call to sincos(x, s, c) in a little more than the time required for sin(x).
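The Push/Pop identity can be sketched as a rewrite over a sequence of calls, applied before any expansion of the library code. The tuple encoding of calls and the function name `apply_push_pop_identity` are assumptions made for this illustration. The rewrite is a single left-to-right pass and, like the slide, it ignores overflow in Push.

```python
def apply_push_pop_identity(calls):
    """Rewrite ('push', x) directly followed by ('pop', y) into ('assign', y, x).

    calls is a list of tuples: ('push', source_name) or ('pop', target_name);
    any other tuple is passed through unchanged.
    """
    rewritten = []
    i = 0
    while i < len(calls):
        current = calls[i]
        following = calls[i + 1] if i + 1 < len(calls) else None
        if current[0] == "push" and following is not None and following[0] == "pop":
            # Push(x); y = Pop()  becomes  y = x
            rewritten.append(("assign", following[1], current[1]))
            i += 2
        else:
            rewritten.append(current)
            i += 1
    return rewritten
```

Because the pass works on call sequences, it needs no knowledge of how the stack is implemented, which is the point of applying the identity before expansion.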
Contextual Expansions
• Out of Core Arrays
  — Operations Get(I, J) and GetRow(I, Lo, N)
• Get in a loop

    Do I
      Do J
        … Get(I, J)
      Enddo
    Enddo

  — Turn into GetRow
• Vector versions of library routines can often be constructed
• When can we vectorize?
  — Answer: if Get is not involved in a recurrence.
    – How can we know?
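A toy sketch of the Get to GetRow replacement follows. The class `OutOfCoreArray`, its request counter, and the row-sum loop are hypothetical stand-ins chosen for illustration; indices are 0-based here, unlike the slide. The loop body only reads each element once and carries nothing between iterations, so Get is not involved in a recurrence and the row-wise form is equivalent.

```python
class OutOfCoreArray:
    """Toy stand-in for an out-of-core array; counts simulated I/O requests."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]
        self.requests = 0

    def get(self, i, j):
        """Fetch one element (one request per element)."""
        self.requests += 1
        return self._rows[i][j]

    def get_row(self, i, lo, n):
        """Fetch n consecutive elements of row i starting at column lo."""
        self.requests += 1
        return self._rows[i][lo:lo + n]


def row_sums_scalar(a, n_rows, n_cols):
    """Original form: Get(I, J) inside the inner loop."""
    return [sum(a.get(i, j) for j in range(n_cols)) for i in range(n_rows)]


def row_sums_vector(a, n_rows, n_cols):
    """Transformed form: the inner loop is replaced by one GetRow per row."""
    return [sum(a.get_row(i, 0, n_cols)) for i in range(n_rows)]
```

On a 2 by 3 array the scalar form issues six requests and the row-wise form two, with identical results.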
Library Analysis and Preparation
• Discovery of Critical Properties and Propagator Construction
  — Which properties of parameters affect optimization
    – Examples: value, type, rank and size of matrix
  — Construction of jump functions for the library calls
    – With respect to critical properties
• Analysis of Transformation Specifications
  — Construction of a specification-driven translator for use in compiling scripts
• Code Specialization for Different Sets of Parameter Properties
  — For each set, assume and optimize to produce specialized code
Code Selection Example
• Library compiler develops inlining tables

    subroutine VMP(C, A, B, m, n, s)
      integer m, n, s
      real A(n), B(n), C(m)
      i = 1
      do j = 1, n
        C(i) = C(i) + A(j)*B(j)
        i = i + s
      enddo
    end subroutine VMP

  Inlining Table:

    case on s:
      ==0:     C(1) = C(1) + sum(A(1:n)*B(1:n))
      !=0:     C(1:1+(n-1)*s:s) = C(1:1+(n-1)*s:s) + A(1:n)*B(1:n)    (vector)
      default: call VMP(C, A, B, m, n, s)
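The inlining table can be mimicked in a few lines of Python to check that the specialized variants agree with the generic loop. This is a sketch under assumptions: indices are 0-based, the strided variant is written for s > 0 only, and the names `vmp_generic`, `vmp_s_zero`, `vmp_s_nonzero`, and `inline_vmp` are invented here. In the talk the choice is made at script compile time from what is known about s; the `s_property` argument below stands in for that knowledge.

```python
def vmp_generic(C, A, B, n, s):
    """Direct transcription of the VMP loop (0-based indices)."""
    i = 0
    for j in range(n):
        C[i] += A[j] * B[j]
        i += s


def vmp_s_zero(C, A, B, n, s=0):
    """Variant for s == 0: a sum reduction into the first element of C."""
    C[0] += sum(A[j] * B[j] for j in range(n))


def vmp_s_nonzero(C, A, B, n, s):
    """Variant for s != 0 (s > 0 assumed): independent strided updates."""
    if n == 0:
        return
    stop = (n - 1) * s + 1  # one past the last index touched by the loop
    products = [A[j] * B[j] for j in range(n)]
    C[0:stop:s] = [c + p for c, p in zip(C[0:stop:s], products)]


def inline_vmp(s_property):
    """Pick a variant from what is known about s: 'zero', 'nonzero', or None."""
    table = {"zero": vmp_s_zero, "nonzero": vmp_s_nonzero}
    return table.get(s_property, vmp_generic)
```

When nothing is known about s, `inline_vmp(None)` returns the generic routine, which corresponds to the `default` row of the table.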
Application: Matlab for Signal Processing
• Signal processing users want simplicity, programming power, and performance
  — Currently over 500,000 Matlab licenses
• Matlab gives them simplicity and power but not performance
  — Codes prototyped in Matlab
  — Codes rewritten in C for communications devices
    – Users would rather not do this
• Telescoping Languages:
  — Many signal processing code modules reused over and over
  — Run these procedures through the language generator
    – Produce Matlab SP, a high-level domain-specific environment
Matlab SP: Preliminary Findings
• Optimizations That Pay Off
  — Vectorization
    – Wins because of hand-coded vector/matrix primitives
  — Elimination of common array subexpressions
  — Optimization of array allocation and reshape operations
• New Optimizations
  — Procedure vectorization
    – Interchange call and loop after distribution
  — Procedure strength reduction
    – Subdivide procedure into variant and invariant components
    – Use invariant component only once
Procedure Strength Reduction
• Procedure called in loop

    for i = 1:N
      x = f(c1, c2, i, c3)
    end

• Becomes

    fm(c1, c2, c3)
    for i = 1:N
      x = f (i)
    end

• Further improvements possible
  — Use code differentiation to compute differences
    – ADIFOR
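A minimal sketch of the split into invariant and variant components is shown below. The body of `f` is invented purely for illustration, and the names `f_invariant` and `f_variant` are hypothetical. Unlike the slide, where the invariant call appears to set up state that the in-loop call reuses, this sketch passes the precomputed value explicitly.

```python
import math


def f(c1, c2, i, c3):
    """Original procedure: recomputes the loop-invariant part on every call."""
    scale = math.sqrt(c1 * c1 + c2 * c2) + c3  # depends only on c1, c2, c3
    return scale * i                           # depends on the loop index


def f_invariant(c1, c2, c3):
    """Invariant component: computed once, before the loop."""
    return math.sqrt(c1 * c1 + c2 * c2) + c3


def f_variant(scale, i):
    """Variant component: the only work left inside the loop."""
    return scale * i


def loop_original(c1, c2, c3, n):
    return [f(c1, c2, i, c3) for i in range(1, n + 1)]


def loop_reduced(c1, c2, c3, n):
    scale = f_invariant(c1, c2, c3)
    return [f_variant(scale, i) for i in range(1, n + 1)]
```

Both loops perform the same arithmetic in the same order per iteration, so their results are identical; the reduced form evaluates the square root once instead of n times.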
Procedure Strength Reduction Performance
Summary
• Optimization enables language power
  — Principle: encourage rather than discourage use of powerful features
    – Good programming practice should be rewarded
• Programming support is challenging
  — Particularly with application and platform complexity on the rise
    – Compounded by the shortage of IT professionals
• Strategy: make end users into application developers
  — Telescoping languages: Framework for generating high-level problem-solving systems
  — Must produce high-quality code
    – Avoid the need to recode by hand
http://www.cs.rice.edu/~ken/Presentations/Telescope.pdf
Summary
• PITAC: Focus on long-term, high-risk research
• The scalable infrastructure should be a scalable problem-solver
  — Access to information is not enough
  — Linked computation is not enough
• Programming support is still relatively primitive
  — Application and platform complexity increasing
  — Compounded by the shortage of IT professionals
• Strategy: make end users into application developers
  — Professional programmers focus on components
  — End users build applications in scripting systems
• Telescoping languages:
  — Framework for generation of high-level problem-solving systems
Software Support for High-Performance Problem Solving (With Application to Grid Programming)
Kennedy
Center for High Performance Software, Rice University
http://www.cs.rice.edu/~ken/Presentations/GridTelescope.pdf
Collaborators
Bradley Broom, Arun Chauhan, Keith Cooper, Jack Dongarra, Rob Fowler, Dennis Gannon, Lennart Johnsson, John Mellor-Crummey, John Reynders, Linda Torczon
Lessons from PITAC
• Findings
  — Research funding increasingly focused on short term
  — Universities weakened
    – Impact on workforce
  — Industry cannot fill the gap
    – Return on investment: 24 percent versus 66 percent
• Refocus Research on Long-Term, High-Risk Problems
  — Requires an expansion of the base
• Invest in Key Areas
  — Software
  — Scalable Information Infrastructure
  — High Performance Computing
  — Social, Economic, and Workforce Issues (Education)
Two IT Grand Challenges
• The Internet as Problem-Solving Engine
  — Challenge: How do we develop applications and manage their execution?
    – Reliable performance under varying load
    – Accessibility to ordinary scientists and engineers
  — GrADS Project
• Software Productivity
  — Challenge: How do we increase the nation’s productivity in software development?
    – Too much software to be written, too few developers
    – Application and platform complexity increasing
  — Idea: make it possible for end users to be application developers
Grids are “Hot”
[Figure labels: Computational, Data, Information, Access, Knowledge; DISCOM, SinRG, APGrid, TeraGrid]
National Distributed Problem Solving
[Diagram elements: Supercomputer, Database, Database]
Today: Globus
• Developed by Ian Foster and Carl Kesselman
  — Grew from the I-Way (SC-95)
• Basic Services for distributed computing
  — Resource discovery and information services
  — User authentication and access control
  — Job initiation
  — Communication services (Nexus and MPI)
• Applications are programmed by hand
  — Many applications
  — User responsible for resource mapping and all communication
    – Existing users acknowledge how hard this is
GrADSoft Architecture
• Goal: reliable performance on dynamically changing resources
[Diagram regions highlighted in turn: Execution Environment, Program Preparation System, Problem-Solving Environments]
[Diagram elements: Source Application, Software Components, Libraries, Whole-Program Compiler, Configurable Object Program, Performance Problem, Performance Feedback, Resource Negotiator, Negotiation, Scheduler, Binder, Grid Runtime System, Real-time Performance Monitor]