Multiversioning
Satheesh and Lauren
4-8-2013

Topics
• Overview
• What is Multiversioning? Why Use It?
• General Approaches and Algorithms
• Dynamic Compiler/Optimizer Case Studies
• Advantages/Disadvantages
• Questions

Background
• Dynamic Compilation
  » Becoming a Source for Continued Performance Boosts
  » Intelligent Optimization Opportunities
• Takes Advantage of Runtime Knowledge
• Maintains Low Compile Time and Space Overheads

Motivation for Multi Versioning
• Making Dinner For Your Spouse
• What Do They Want Tonight?

Option | Time Until Eating | Meals To Make
Wait Until They Get Home And Ask, Then Make It | High | 1
Make Every Meal You Can, Pick One When They Get Home | Lowest | Many
Make Their Favorite; If They Don't Want It, Make Something New | Low(est) | 1 or 2

Motivation for Multi Versioning
• Code Snippet Can Be Optimized Many Different Ways
• What Version Will Be Optimal "This" Time?

Option | Time To Execute | Versions To Make
Static - No Multiversioning | High | 1
Static - Multiversioning | Lowest | Many
Dynamic - Multiversioning | Low(est) | 1 or 2

Motivation for Multi Versioning
JavaScript Example:
    function f(a, b) {
        var c = a + b;
    }
• Version 1: a, b as integers/floats
• Version 2: a, b as strings
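The two versions on this slide can be sketched as a guarded dispatch. This is only an illustrative sketch in Python, not how any particular JavaScript engine does it; the names f_numeric, f_string and f_generic are hypothetical.

```python
def f_numeric(a, b):
    # Version 1 (hypothetical name): specialized for ints/floats.
    return a + b

def f_string(a, b):
    # Version 2 (hypothetical name): specialized for strings.
    return "".join((a, b))

def f_generic(a, b):
    # Fallback: the original, unspecialized operation.
    return a + b

def f(a, b):
    # Guard: inspect the run-time types once, then pick a matching version.
    ta, tb = type(a), type(b)
    if ta in (int, float) and tb in (int, float):
        return f_numeric(a, b)
    if ta is str and tb is str:
        return f_string(a, b)
    return f_generic(a, b)
```

The guard is the run-time cost of multiversioning here; it pays off only when the specialized versions are noticeably cheaper than the generic one.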

Motivation for Multi Versioning
• Dynamic Compilation Must Maximize Speedup
  » Only Perform "Beneficial" Optimizations
  » Minimize Code Expansion
  » Dependent on Application, Input, Program Phase, etc.
• Single Code Snippet Often Exhibits Varied Characteristics Across Executions
  » How to Handle This With Low Overhead?
• Opportunities for Multi Versioning:
  » Dynamic Class Loading
  » Exceptions
  » Type Information
  » Loops

What is Multi Versioning?
• Maintain Multiple Versions of a Code Section During Run Time
• Opens Door for Specialized Optimizations
• Which Version to Run?
  » Decided During Run Time

Static Multi Versioning
• Static Compiler Optimizations Are Limited Due To Precise Exceptions, Null Checks, Etc.
• Locate Safe Regions
• Exception-Causing or Aliasing Code Snippets
  » One Version Assumes Common Case, Second Handles Special Case
  » Perform Single "Safe" Check
  » Allows for More Aggressive Optimizations
  » Ex. Array Bounds
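A minimal sketch of the array-bounds case, assuming a hypothetical function sum_window that sums a slice of a list. One up-front "safe" check selects between a version with no per-access checks and a conservative version that keeps precise exceptions.

```python
def sum_window_checked(data, start, count):
    # Conservative version: bounds check on every access, so the exception
    # is raised at exactly the offending iteration.
    total = 0
    for i in range(start, start + count):
        if i < 0 or i >= len(data):
            raise IndexError(f"index {i} out of range")
        total += data[i]
    return total

def sum_window_unchecked(data, start, count):
    # Aggressive version: valid only inside the safe region.
    return sum(data[start:start + count])

def sum_window(data, start, count):
    # Single "safe" check covering every access in the loop.
    if count >= 0 and 0 <= start and start + count <= len(data):
        return sum_window_unchecked(data, start, count)
    return sum_window_checked(data, start, count)
```

Both versions exist in the compiled output regardless of which one runs, which is where the code growth of the static approach comes from.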

Static Multi Versioning - Class Problem
• How Can We "Multi Version" This?

Static Multi Versioning

Static Multi Versioning - Tradeoffs
• Performance Boosts
  » Eliminated Exception Checks
  » Optimizations In Each Version
• Limited to These Code Snippets
• Can Cause Excessive Code Explosion!
  » No Runtime Execution Knowledge
  » Must be Conservative

Comparative Study
Approach | Static vs. Dynamic | # Versions | Runtime Overhead | Runtime Feedback
Safe Regions | Static | 2 | Added Checks | No

Thin Guards
• Useful for Java Dynamic Dispatch
• Designed to be Leaner, Lower Overhead
• Creates Only Two Versions
  » A Class-Loaded Version and a Non-Class-Loaded Version
• Handles Different Methods/Receiver Objects
  » Ex. A.getX() vs. B.getX()
• Only Requires Single Condition Bit Load To Represent All Checks
• When A Condition Becomes Unsatisfied (i.e., A Class Is Loaded): Fix All Associated Condition Bits

Thin Guards - Simple Example - Class Problem

Thin Guards - Simple Example
NOTE: If/When Class Loading Occurs, Condition Bit Will Be Changed to 0 f
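The condition-bit idea can be sketched as follows. This is an assumption-laden illustration, not the Jikes RVM implementation: the ThinGuards class, the bit layout, and the load_class_b hook are all hypothetical. The guard is a single bit test, and a class-load event clears every bit whose assumption it breaks.

```python
class A:
    def get_x(self):
        return 1

class ThinGuards:
    """Hypothetical condition-bit table: bit i set => guard i's assumptions hold."""

    def __init__(self):
        self.bits = 0
        self.watchers = {}  # class name -> mask of bits to clear when it loads

    def register(self, bit, depends_on):
        self.bits |= 1 << bit
        self.watchers[depends_on] = self.watchers.get(depends_on, 0) | (1 << bit)

    def holds(self, bit):
        return bool(self.bits & (1 << bit))

    def class_loaded(self, name):
        # Fix all condition bits associated with the newly loaded class.
        self.bits &= ~self.watchers.pop(name, 0)

guards = ThinGuards()
GET_X_BIT = 0
guards.register(GET_X_BIT, depends_on="B")  # inlining assumes B is not loaded

def call_get_x(obj):
    if guards.holds(GET_X_BIT):
        return 1            # non-class-loaded version: A.get_x inlined
    return obj.get_x()      # class-loaded version: ordinary dynamic dispatch

def load_class_b():
    # Stand-in for dynamic class loading of an overriding subclass.
    class B(A):
        def get_x(self):
            return 2
    guards.class_loaded("B")
    return B
```

Before load_class_b() runs, every call takes the inlined path; afterwards the bit is cleared and all calls fall back to dispatch, including calls on plain A objects, which is the conservative behavior the limitations slide refers to.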

Thin Guards
• Can Achieve Significant Speedup
• Run On Jikes RVM for the SPECjvm98 Suite

Thin Guards - Limitations
• Drawbacks
  » Single Check Leads to Conservative Optimizations
  » Bit Vector Size Limits Number of Distinct Checks
  » Hashing
  » Only Applicable for Object-Oriented Languages
  » Conditions Must be Able to Be Represented by a Bit
• Thin Guards Limited In Their Applicability
• Limited Performance Boost - One Less Load
• How Can We Do Better?

Comparative Study
Approach | Static vs. Dynamic | # Versions | Runtime Overhead | Runtime Feedback
Safe Regions | Static | 2 | Added Checks | No
Thin Guards | Dynamic | 2 | Condition Bit Maintenance | Yes

Devirtualization
• A Technique for Replacing Dynamic Method Calls
  » Usually Resolved During Runtime
  » Java, C++ and Other OOP Languages
• Challenges in Static Analysis for Devirtualization
  » Dynamic Method Calls
  » No Complete Knowledge of the Classes

Devirtualization - Interface Calls
• Java Allows Multiple Inheritance via Interfaces
    Interface1 interface1 = new CheckCall();
    interface1.method();
• Dynamically Resolves the Call to the Method
• interface1 Can be Changed During the Execution of the Program

Direct Devirtualization
• Checks if the Dynamic Call Site is Devirtualizable
• Direct Inlining if There is a Single Reference (Class or Method Call)
• Maintains the Original Code (Backup)
• Optimistically Executes Inlined Code
• Dynamic Class Loading?
  » If Dynamic Class is Loaded, Inlining Becomes Invalid
  » Causes "Backup Path" to Execute
  » Contains the Original Dynamic Call

Direct Devirtualization - Code Patching

Direct Devirtualization - Code Patching
• Implements Backup Path
  » Invoked When A Method Override is Detected
  » Rewrites Inlined Code to Jump To Backup Code
• Challenges in Removing Direct Devirtualization
  » Patching Must be Thread-Safe
  » Split Instruction and Data Caches - Flush Cache
  » Instruction Prefetch Should be Flushed
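Code patching can be approximated in a high-level sketch by a call site whose target is swapped once an override is detected. Everything here is hypothetical (Shape, Square, CallSite, on_class_loaded); a real JIT rewrites machine instructions and must handle the cache and prefetch issues listed above, which this sketch does not model.

```python
import threading

class Shape:
    def area(self):
        return 0

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

class CallSite:
    """Hypothetical call site for shape.area() with an inlined fast path."""

    def __init__(self):
        self._lock = threading.Lock()
        self._target = self._inlined

    @staticmethod
    def _inlined(shape):
        return 0                  # body of Shape.area, inlined optimistically

    @staticmethod
    def _backup(shape):
        return shape.area()       # backup path: the original dynamic call

    def patch_to_backup(self):
        with self._lock:          # patching must not race with other patchers
            self._target = self._backup

    def __call__(self, shape):
        return self._target(shape)

site = CallSite()

def on_class_loaded(cls):
    # Assumed class-load hook: detect a method override and patch the site.
    if cls is not Shape and issubclass(cls, Shape) and "area" in vars(cls):
        site.patch_to_backup()
```

Calling on_class_loaded(Square) models the moment the overriding class is loaded; from then on site(Square(3)) goes through the backup path and returns 9.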

Direct Devirtualization - Other Analysis
• Enables Code Movement - Removes Merging Point
• Flow-Sensitive Type Analysis
  » Static Analysis During Compile Time
  » Determines the Set of All Classes Reachable at Each Object Reference Point
  » If Same Definitions, Then No Overriding
  » No Backup Code
• Pre-existence Analysis
  » Object Allocated Before Invocation of the Caller Method, in Straight-Line Code
  » No Backup Code
• Class Tests/Method Tests
  » Need Backup Code

Direct Devirtualization - Class Problem - Flow-Sensitive Analysis
    class A { public: virtual void f() {} };
    class B : public A { public: void f() {} };
    class C : public A { public: void f() {} };

    int main() {
        B b;
        A *p = new C();
        p->f();
        p = new B();
        p->f();
    }
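One way to work the class problem is a tiny flow-sensitive analysis over the straight-line body of main. The sketch below is a simplification under stated assumptions: the statement encoding, the DEFINES_F table, and the analyze function are hypothetical, and only allocation and call statements are modeled.

```python
# Class -> class whose f() actually runs for it. B and C both override A::f.
DEFINES_F = {"A": "A", "B": "B", "C": "C"}

def analyze(statements):
    """Statements are ("new", var, cls) or ("call", var), in program order.

    Returns, for each call, the unique target method if the call site can be
    devirtualized, or None if more than one definition is reachable.
    """
    types = {}      # var -> set of classes it may point to at this point
    results = []
    for stmt in statements:
        if stmt[0] == "new":
            _, var, cls = stmt
            types[var] = {cls}          # assignment replaces earlier facts
        elif stmt[0] == "call":
            _, var = stmt
            targets = {DEFINES_F[c] for c in types.get(var, set())}
            results.append(next(iter(targets)) + "::f" if len(targets) == 1 else None)
    return results

# The body of main(): p = new C(); p->f(); p = new B(); p->f();
program = [("new", "p", "C"), ("call", "p"), ("new", "p", "B"), ("call", "p")]
```

Under these assumptions analyze(program) yields ["C::f", "B::f"]: each call sees exactly one reachable class, so both can be bound directly with no backup code, whereas a flow-insensitive analysis would see {B, C} for p and give up.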

Direct Devirtualization - Results

Comparative Study
Approach | Static vs. Dynamic | # Versions | Runtime Overhead | Runtime Feedback
Safe Regions | Static | 2 | Added Checks | No
Thin Guards | Dynamic | 2 | Condition Bit Maintenance | Yes
Devirtualization | Dynamic | 2 | Inlining Code, Flushing Cache, Maintain Backup |

Specialization
• Transforms "Selected" Code Portions
  » Runtime Regions
  » Ex. Runtime Constants (Dynamic Constant Propagation)
• Manual Specialization - Programmer Directives
• Declarative Annotations
  » Which Code Portions Should be Considered?
  » What are the Runtime Constants for the Function Params?
  » What Should be Optimized Before/During Runtime?
• Automated Specialization
  » DyC Compiler for C Programs
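Specialization on a run-time constant can be illustrated with a small code generator. This is a sketch only and is unrelated to how DyC works internally; dot and specialize_dot are hypothetical names. Once the weights are known at run time, a version is generated with the loop unrolled, zero terms removed, and multiplications by one dropped.

```python
def dot(weights, xs):
    # General version: works for any weights.
    return sum(w * x for w, x in zip(weights, xs))

def specialize_dot(weights):
    # weights is treated as a run-time constant for the generated version.
    terms = []
    for i, w in enumerate(weights):
        if w == 0:
            continue                              # term folded away entirely
        terms.append(f"xs[{i}]" if w == 1 else f"{w!r} * xs[{i}]")
    body = " + ".join(terms) if terms else "0"
    source = f"def dot_specialized(xs):\n    return {body}\n"
    namespace = {}
    exec(source, namespace)                       # run-time code generation
    return namespace["dot_specialized"]
```

For weights [2, 0, 1, 0] the generated body is `2 * xs[0] + xs[2]`. The specialized version is only valid while the weights stay unchanged, so a caller would keep the general dot as the fallback.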

Dynamic Feedback - Adaptive Computing
    void body::one_interaction(body *b) {
        double val = interact(this->pos, b->pos);
        mutex.acquire();
        sum = sum + val;
        mutex.release();
    }

    void body::interactions(body b[], int n) {
        for (int i = 0; i < n; i++) {
            this->one_interaction(&b[i]);
        }
    }

Dynamic Feedback - Adaptive Computing
    void body::one_interaction(body *b) {
        double val = interact(this->pos, b->pos);
        sum = sum + val;
    }

    void body::interactions(body b[], int n) {
        mutex.acquire();
        for (int i = 0; i < n; i++) {
            this->one_interaction(&b[i]);
        }
        mutex.release();
    }
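The two locking versions above lend themselves to a sampling-then-production scheme. The sketch below is a simplified, single-threaded illustration with hypothetical names (Body, run_with_feedback, sample_size); a real dynamic feedback system would alternate sampling and production phases repeatedly and measure under actual contention.

```python
import threading
import time

class Body:
    def __init__(self):
        self.sum = 0.0
        self.mutex = threading.Lock()

    def interactions_fine(self, values):
        # Version 1: lock acquired around every update.
        for v in values:
            with self.mutex:
                self.sum += v

    def interactions_coarse(self, values):
        # Version 2: lock hoisted out of the loop.
        with self.mutex:
            for v in values:
                self.sum += v

def run_with_feedback(body, values, sample_size):
    versions = {"fine": body.interactions_fine, "coarse": body.interactions_coarse}
    timings = {}
    pos = 0
    # Sampling phase: each version does real work on its own chunk and is timed.
    for name, run in versions.items():
        chunk = values[pos:pos + sample_size]
        start = time.perf_counter()
        run(chunk)
        timings[name] = time.perf_counter() - start
        pos += sample_size
    # Production phase: the fastest sampled version handles the remaining work.
    best = min(timings, key=timings.get)
    versions[best](values[pos:])
    return best
```

Which version wins depends on the machine and the load, so the function returns the choice rather than assuming one; the accumulated sum is the same either way.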

Comparative Study
Approach | Static vs. Dynamic | # Versions | Runtime Overhead | Runtime Feedback
Safe Regions | Static | 2 | Added Checks | No
Thin Guards | Dynamic | 2 | Condition Bit Maintenance | Yes
Devirtualization | Dynamic | 2 | Inlining Code, Flushing Cache, Maintain Backup |
Specialization/Empirical | Dynamic | Many | "Training Samples", Dispatch Logic for Choosing the Right Version | Yes

SELF Compiler
- Dynamic Compiler for Pure OO Languages
- Operations via Message Passing
- Dynamically Typed Language
- Deferred Compilation of Uncommon Code
- Multiple Versions of Loops
- Multiple Versions of Instructions

SELF 89, 90, 91 - Performance

Type Analysis
SELF Messages:

Type Prediction

Deferred Compilation
- Uncommon Cases Are Not Compiled Until Required
- Saves Compilation Time
- Compiler Will be Biased
- Biased in Favor of Saving Compile Time
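Deferred compilation of an uncommon case can be sketched with a stub that builds the slow path only on first use. The names DeferredStub, compile_uncommon_add and the compiled log are hypothetical, and "compiling" here is just constructing a closure, not SELF's actual mechanism.

```python
compiled = []   # log of paths that have been "compiled" (illustration only)

def compile_uncommon_add():
    # Stand-in for compiling the uncommon path on demand.
    compiled.append("uncommon")

    def slow_add(a, b):
        return a + b        # general path for non-integer operands
    return slow_add

class DeferredStub:
    """Placeholder for code that is not compiled until it is first needed."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn
        self._code = None

    def __call__(self, *args):
        if self._code is None:
            self._code = self._compile_fn()   # compile on first use only
        return self._code(*args)

uncommon_add = DeferredStub(compile_uncommon_add)

def add(a, b):
    if type(a) is int and type(b) is int:
        return a + b                # common case, compiled eagerly
    return uncommon_add(a, b)       # uncommon case, behind the stub
```

A program that only ever adds integers never pays for the uncommon path, which is the compile-time saving the slide describes.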

Message Splitting and Merging
- 2 Paths Have Different Type Information
- Opens Room for Optimization
- Split Paths Consume More Compile Time
- Do Deferred Compilation
- Types
  - Eager Splitting
  - Reluctant Splitting
- Path Objects
  - Through the CFG
  - Represents Unique Path

Value Swap - Bubble Sort

Complete Path Split of Bubble Sort

Optimizing Loops - Iterative Type Analysis

Optimizing Loops

Execution Speed

Compilation Speed

Code Density

Comparative Study
Approach | Static vs. Dynamic | # Versions | Runtime Overhead | Runtime Feedback
Safe Regions | Static | 2 | Added Checks | No
Thin Guards | Dynamic | 2 | Condition Bit Maintenance | Yes
Devirtualization | Dynamic | 2 | Inlining Code, Flushing Cache, Maintain Backup |
Specialization/Empirical | Dynamic | Many | "Training Samples", Dispatch Logic for Choosing the Right Version | Yes
SELF | Dynamic | Many | Condition Checks for Types, Loop Unrolling | Yes (Deferred Compilation)

ADAPT - Case Study
• Dynamic Compiler/Optimizer Since 2001
• Uses More Aggressive and Deeper Analysis to Achieve Performance
• Key Features
  » Dedicated Language, AL
  » Dynamic Feedback
  » Parallel Execution and Optimization
  » Iterative Improvement

ADAPT - Dedicated Language, AL
• User Defined Heuristics
• Define Machine Characteristics
• Indicate What Optimizations To Perform
• Dictate How Often To Perform These Optimizations
• Define What Profiling To Perform To Collect Dynamic Feedback

ADAPT - AL - Loop Unroller

ADAPT - Dynamic Feedback
• Runtime Profiling Information Collected
• Collects Separate Profiles for Each "Context"
  » Ex. Different Loop Counts, Different Function Parameters
• Couples This Information With User Defined Heuristics
• Captures Phase Behavior and Adapts to Current Execution Characteristics

ADAPT - Parallel Optimization and Execution
• Local Optimizer
  » Finds "Hot" Code Sections, e.g., Loops Without I/O or Function Calls
  » Sends "Hot" Code Info and Profiling Feedback to Remote
  » Measures Runtime Speedups
• Remote Optimizer
  » Runs Parallel to the Application
  » Combines User Defined Heuristics with Dynamic Feedback
  » Loop Tiling, Loop Unrolling, etc.
  » Returns New Optimized Version to Local
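The local/remote split can be sketched with a background thread standing in for the remote optimizer. This is a loose illustration, not ADAPT's design: LocalOptimizer, HOT_THRESHOLD, and the hand-unrolled sum_squares_unrolled are hypothetical, and a real system would generate the candidate rather than have it prewritten.

```python
import threading
import time

def sum_squares_baseline(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

def sum_squares_unrolled(n):
    # Candidate version: loop unrolled by four, with a cleanup loop.
    total = 0
    i = 0
    limit = n - n % 4
    while i < limit:
        total += i * i + (i + 1) * (i + 1) + (i + 2) * (i + 2) + (i + 3) * (i + 3)
        i += 4
    while i < n:
        total += i * i
        i += 1
    return total

def timed(fn, n):
    start = time.perf_counter()
    fn(n)
    return time.perf_counter() - start

class LocalOptimizer:
    HOT_THRESHOLD = 3   # assumed: calls before a section counts as "hot"

    def __init__(self):
        self.current = sum_squares_baseline
        self.calls = 0
        self.worker = None

    def run(self, n):
        self.calls += 1
        if self.calls == self.HOT_THRESHOLD:
            # Hand the hot section to the "remote" optimizer without blocking.
            self.worker = threading.Thread(target=self._optimize, args=(n,))
            self.worker.start()
        return self.current(n)

    def _optimize(self, sample_n):
        candidate = sum_squares_unrolled      # version returned by the remote
        if timed(candidate, sample_n) < timed(self.current, sample_n):
            self.current = candidate          # swap in only if measured faster
```

The application keeps running the current version while optimization proceeds in parallel, and the swap happens only when the measurement shows a speedup. In an interpreted language the unrolled loop may well lose, in which case the baseline simply stays in place.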

ADAPT - Putting It All Together

ADAPT - Weaknesses
• Relies on Repetitive Executions of Code Section
  » Else Optimization Overhead is Not Amortized
• Requires User Interaction and Knowledge
• Code Size Potentially Doubles
• Does Not Optimize Very Short Sections
  » Applications Must Exhibit "Long" Enough Sections

Comparative Study
Approach | Static vs. Dynamic | # Versions | Runtime Overhead | Runtime Feedback
Safe Regions | Static | 2 | Added Checks | No
Thin Guards | Dynamic | 2 | Condition Bit Maintenance | Yes
Devirtualization | Dynamic | 2 | Inlining Code, Flushing Cache, Maintain Backup |
Specialization/Empirical | Dynamic | Many | "Training Samples", Dispatch Logic for Choosing the Right Version | Yes
SELF | Dynamic | Many | Condition Checks for Types, Loop Unrolling | Yes (Deferred Compilation)
ADAPT | Dynamic | 2, but changing | Profiling Info, Optimizer Comm., Loop Optimizations | Yes

Multi Versioning - Advantages
• Multiversioning Consistently Produces Execution and Compile Time Speedup
• Dynamic Multiversioning Significantly Outperforms Static Multiversioning
• ADAPT and SELF Both Achieve Significant Performance Boosts
  » ADAPT - up to 35%
  » SELF - up to 57%

Multi Versioning - Disadvantages
• However, Code Explosion is Still an Issue
  » Non-Trivial Amount - Sometimes Exponential
  » But Less Than Static Multi Versioning
• More Compilation Time

Discussion
• There is Not a Single Correct Algorithm for Multiversioning
  » Program Characteristics and Compiler Tradeoffs Determine Optimal Implementation
• The Main Consequence is the Code Explosion Inherently Caused By Multiple Versions of a Single Original Code Section

Discussion - Future Directions
• Continue to Leverage All Runtime Knowledge Available
• Minimize Concurrent Versions of Code
• Maximize Parallelism of Optimization and Execution
• Expand to Other Fields
  » Ex. Server and Embedded Worlds

Conclusions
• Many Approaches to Dynamic Multiversioning
  » Thin Guards, Direct Devirtualization, Specialization, Empirical, Type Prediction and Loop Optimization
• Each Algorithm Makes Tradeoffs
  » User Involvement
  » Code Explosion
  » Optimization Speed
• Produces Significant Speedup On All Real-World Dynamic Compilers/Optimizers
• Key is To Find Balance Between Compilation Overhead and Execution Speedup

Questions
• How Practical is Multiversioning For Fields Such as the Embedded System Field?
  » Consider the Issue of Code Explosion
• Could Multiversioning Information be Persisted Across Application Executions?
  » Similar to Persistent Code Caches
  » Lessen Dependency on Profiling, User
  » Lessen Number of Suboptimal Code Versions
• How Long to Maintain the Multiple Versions?
  » Management Policies - Heuristics?
• Questions?