- Number of slides: 47
Pablo and Autopilot: Performance Tuning in Distributed Computing Environments
Ruth Aydt
Pablo Research Group
Department of Computer Science
University of Illinois at Urbana-Champaign
http://www-pablo.cs.uiuc.edu
Pablo Research Group - Department of Computer Science - UIUC
Presentation Outline
• Requirements for successful performance tuning
• Pablo toolkit components - how we got here
• Autopilot
  – Basic concepts
  – Component interactions
  – Fuzzy logic decision infrastructure
• Pablo-provided monitor/control programs
  – Autodriver
  – Virtue
• Case study of parallel rocket simulation code
• Current work
Requirements for Successful Performance Tuning in a Distributed Environment:
• top-to-bottom and end-to-end real-time performance data capture
• “appropriate” performance data detail and granularity: just enough but not too much!
• tools to help correlate and interpret captured data
• dynamic policy selection in response to current resource availability and application demands
Pablo Toolkit Components: a Decade of Performance Monitoring and Analysis Tools
Pablo Trace Library and Extensions
• Libraries linked with the application to trace “generic events” as well as loops, message passing, procedure calls, Unix I/O, MPI I/O, and HDF routines
• For C codes, a preprocessor replaces standard function names (e.g. read) with tracing versions (e.g. traceREAD). For Fortran, calls are bracketed manually by traceReadBegin / traceReadEnd
• Timestamped event data is written to a buffer and flushed periodically to per-processor files
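The wrap-and-buffer idea on this slide can be sketched in a few lines of Python. This is an illustration only, not the Pablo trace library: the names `TraceBuffer`, `traced`, and `trace_read` are hypothetical, and a plain list stands in for the per-processor trace file.

```python
import time


class TraceBuffer:
    """Collects timestamped events and flushes them once the buffer fills."""

    def __init__(self, sink, capacity=4):
        self.sink = sink          # list standing in for a per-processor trace file
        self.capacity = capacity  # number of events held before a flush
        self.events = []

    def record(self, name, phase):
        # Each event carries a timestamp, the traced call name, and begin/end.
        self.events.append((time.perf_counter(), name, phase))
        if len(self.events) >= self.capacity:
            self.flush()

    def flush(self):
        self.sink.extend(self.events)
        self.events.clear()


def traced(buffer, name):
    """Wrap a function so every call is bracketed by begin/end events."""
    def decorate(func):
        def wrapper(*args, **kwargs):
            buffer.record(name, "begin")
            try:
                return func(*args, **kwargs)
            finally:
                buffer.record(name, "end")
        return wrapper
    return decorate


trace_file = []
buf = TraceBuffer(trace_file, capacity=4)


@traced(buf, "read")
def trace_read(data, nbytes):
    # Stand-in for the real read call that the tracing version wraps.
    return data[:nbytes]


result = trace_read(b"abcdef", 3)
trace_read(b"abcdef", 2)
```

Two traced calls produce four events, which fills the buffer and triggers one flush to `trace_file`.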
Pablo I/O, MPI I/O, HDF Analysis
• Produce reports from I/O event data
• Sample MPI-IO summary report shown
Pablo Self-Defining Data Format
• A performance data metaformat that specifies both data record structures and data record instances
• Unlimited set of event types supported, depending on the “interesting” performance data
• SDDF library provides classes to read and write files in SDDF format
• General-purpose tools can be written using the library and the Record/Field names in the SDDF files
Sample SDDF File Showing Data Structure and Data Instance
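The self-describing idea (a stream carries record structures as well as record instances, so a generic tool can work from record and field names alone) can be sketched as follows. This is not SDDF syntax or the SDDF library API: the packet layout, the `read_packet` and `field_mean` helpers, and the sample values are all assumptions made for illustration.

```python
descriptors = {}   # tag -> (record name, list of field names)
records = []       # decoded data records as (record name, {field: value})


def read_packet(packet):
    """Handle one packet: either a structure descriptor or a data instance."""
    kind, tag, payload = packet
    if kind == "descriptor":
        descriptors[tag] = (payload["name"], payload["fields"])
    elif kind == "data":
        # Data packets only carry values; names come from the descriptor.
        name, fields = descriptors[tag]
        records.append((name, dict(zip(fields, payload))))


# Hypothetical stream: one descriptor followed by two instances of it.
stream = [
    ("descriptor", 1, {"name": "Read", "fields": ["timestamp", "bytes", "duration"]}),
    ("data", 1, [0.5, 4096, 0.002]),
    ("data", 1, [0.9, 8192, 0.004]),
]
for packet in stream:
    read_packet(packet)


def field_mean(record_name, field):
    """Generic statistic that needs only the record and field names."""
    values = [rec[field] for name, rec in records if name == record_name]
    return sum(values) / len(values)


mean_bytes = field_mean("Read", "bytes")
```

Because `field_mean` looks fields up by name, it works unchanged for any record type the stream happens to define.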
SDDFStatistics: Analysis Program for SDDF Files
SvPablo
• A graphical source code browser and performance capture/correlation tool
• Allows user to select loops and procedures to instrument in C, F77, and F90 code. Automatic instrumentation for HPF via PGI performance interface.
• Collects performance data and later displays it relative to source code line
• Option for real-time data transmission via Autopilot tagged sensors (more later)
SvPablo GUI
Virtue
• A collaborative virtual environment for direct software manipulation
  – Hierarchical graph representations that show software structure, dynamics, and performance
  – Manipulation tools for augmented interactions with the virtual environment
  – Annotation tools for distributed, collaborative exploration and recording
• Uses OpenGL and the EVL CAVE library for 3-D effects in CAVE, ImmersaDesk, and desktop environments
Autopilot: Performance Tuning in Distributed Computing Environments
Autopilot Toolkit
• Provides a framework for the capture and analysis of real-time application and infrastructure data in a multi-threaded distributed environment
• Offers the ability to control the volume of performance data through
  – selective registration and property matching
  – analysis and data reduction at point of collection
  – constant, periodic, or on-demand transmission of data
  – ability to dynamically enable/disable data collection
• Includes a control interface to allow steering of infrastructure policies and applications, either interactively or via automated decision procedures
Basic Autopilot Concepts
• Sensors: provide data to remote processes, allowing real-time monitoring
  – intrinsic (procedural - push)
  – extrinsic (threaded - push)
  – transfer data when requested by remote process (pull)
• Sensor Attached Functions: transform sensed data via user-defined functions before it is recorded by the sensor, providing an important data-reduction technique
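A minimal sketch of a sensor with an attached function, assuming a simple in-process model. The `Sensor` class and the `mean_every` reducer are hypothetical names and do not reproduce the Autopilot class interface; they only illustrate push and pull delivery plus data reduction at the point of collection.

```python
class Sensor:
    """Toy sensor: pushes values to connected clients and answers pull requests."""

    def __init__(self, properties, attached=None):
        self.properties = properties  # key-value pairs identifying the sensor
        self.attached = attached      # optional user-defined reduction function
        self.clients = []
        self.last = None

    def connect(self, callback):
        self.clients.append(callback)

    def record(self, value):
        if self.attached is not None:
            value = self.attached(value)
            if value is None:
                return                # attached function held this sample back
        self.last = value
        for callback in self.clients:
            callback(value)           # push to every connected client

    def request(self):
        return self.last              # pull: most recently recorded value


def mean_every(n):
    """Attached function that emits one average per n raw samples."""
    window = []

    def reduce(value):
        window.append(value)
        if len(window) < n:
            return None
        average = sum(window) / len(window)
        window.clear()
        return average

    return reduce


received = []
sensor = Sensor({"name": "cpu_load", "host": "node01"}, attached=mean_every(3))
sensor.connect(received.append)
for sample in [1, 2, 3, 4, 5, 6]:
    sensor.record(sample)
```

Six raw samples reach the client as two averages, which is the data-reduction effect the slide describes.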
Basic Autopilot Concepts
• Actuators: provide remote processes the ability to invoke local functions or update data, allowing remote steering
  – synchronous (application controls when updates are made; requests may be held in pending buffer)
  – asynchronous (updates are made when request received from external agent)
• Properties: key-value pairs that are associated with and used to identify a sensor or actuator, allowing remote processes to be selective about the sensors and actuators they connect to
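The difference between synchronous and asynchronous actuators can be sketched as below. The `Actuator` class, its method names, and the `cache_policy` variable are hypothetical and only model the behavior described on the slide.

```python
class Actuator:
    """Toy actuator that updates application state on behalf of a remote client."""

    def __init__(self, properties, apply, synchronous=False):
        self.properties = properties
        self.apply = apply              # local function invoked with the new value
        self.synchronous = synchronous
        self.pending = []               # requests held until the application is ready

    def request(self, value):
        """Called when a request arrives from an actuator client."""
        if self.synchronous:
            self.pending.append(value)  # hold until the application services it
        else:
            self.apply(value)           # asynchronous: update immediately

    def service(self):
        """Called by the application at a point where updates are safe."""
        for value in self.pending:
            self.apply(value)
        self.pending.clear()


state = {"cache_policy": "LRU", "iterations": 10}

policy = Actuator({"name": "cache_policy"},
                  lambda v: state.update(cache_policy=v))
iterations = Actuator({"name": "iterations"},
                      lambda v: state.update(iterations=v),
                      synchronous=True)

policy.request("MRU")        # applied as soon as it is received
iterations.request(5)        # held in the pending buffer
before_service = state["iterations"]
iterations.service()         # application decides to accept the update now
```

A synchronous actuator suits values that must not change in the middle of a computation step, since the application chooses when `service` runs.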
Basic Autopilot Concepts
• Sensor Client: a process that connects to one or more sensors with matching properties and receives data from those sensors
• Actuator Client: a process that connects to one or more actuators with matching properties and sends data to those actuators, causing application variables controlled by the actuators to be updated or functions to be invoked
Basic Autopilot Concepts
• Autopilot Manager: a daemon process that is responsible for handling registration requests from sensors and actuators, and matching sensor client and actuator client requests to registered sensors and actuators.
* AutopilotManager daemons may be run on multiple hosts throughout the computational grid, allowing sensors, actuators, and clients to tailor data transfer volumes to appropriate levels for local and distant tasks.
Tagged Sensors, Actuators, Clients
• Information about the structure of the data is forwarded when a client first connects to a matching sensor or actuator, allowing the client to perform verification checks and ignore unwanted data.
• Tagged data sets map naturally into what we normally think of as event trace records.
• Sometimes called “SDDF-enabled” because the buffer contents can easily be translated to SDDF
Autopilot and Nexus/Globus
• Autopilot uses the Nexus component of the Globus toolkit (http://www-globus.org) to provide
  – communication substrate and multithreading capabilities
• Nexus creates a global address space that encompasses all processes executing on a distributed network
• Nexus Remote Service Requests (RSRs) are used by Autopilot classes to transmit messages, ensuring use of the optimal underlying transfer protocol
• Nexus multi-threaded handlers are used by Autopilot classes to process RSRs
• Most Nexus details are hidden by Autopilot classes
Autopilot Component Interactions
Diagram components: Instrumented Task, Autopilot Manager, Monitor/Control Task
1. Sensors and actuators register with their properties (Instrumented Task to Autopilot Manager)
2. Clients request matching sensors and actuators (Monitor/Control Task to Autopilot Manager)
3. Global pointers returned for matches (Autopilot Manager to Monitor/Control Task)
4. Sensor and actuator controls and actuator data (Monitor/Control Task to Instrumented Task)
5. Sensor data (Instrumented Task to Monitor/Control Task)
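Steps 1 to 3 of this interaction (registration, property-based request, returned handles) can be sketched with a small registry. The `Manager` class and the property names are hypothetical; plain strings stand in for the Nexus global pointers that the real Autopilot Manager returns.

```python
class Manager:
    """Toy registry that matches client requests against registered properties."""

    def __init__(self):
        self.registry = []   # list of (properties, handle) pairs

    def register(self, properties, handle):
        # Step 1: a sensor or actuator registers with its key-value properties.
        self.registry.append((properties, handle))

    def find(self, wanted):
        # Steps 2 and 3: return a handle for every entry whose properties
        # contain all requested key-value pairs.
        return [handle for properties, handle in self.registry
                if all(properties.get(k) == v for k, v in wanted.items())]


manager = Manager()
manager.register({"type": "sensor", "name": "cpu_load", "host": "node01"}, "ptr-1")
manager.register({"type": "sensor", "name": "cpu_load", "host": "node02"}, "ptr-2")
manager.register({"type": "actuator", "name": "send_period", "host": "node01"}, "ptr-3")

all_cpu_sensors = manager.find({"type": "sensor", "name": "cpu_load"})
node01_only = manager.find({"host": "node01"})
no_match = manager.find({"name": "disk_io"})
```

After the lookup, the client talks to the matched sensors and actuators directly (steps 4 and 5), so the manager is not on the data path.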
Instrumented Tasks
• May contain multiple sensors and/or actuators
• Many instrumented tasks may be active at any given time
• May register sensors and actuators with multiple Autopilot Managers running on different hosts
• May be application code or infrastructure resource monitor (lmon)
Monitor/Control Tasks
• May contain multiple sensor clients and/or actuator clients
• Many monitor/control tasks may be active at any given time
• May query multiple Autopilot Managers running on different hosts
• May implement “human in the loop” (Autodriver, Virtue) or automated fuzzy logic decision server (PPFS II)
• May be monitor only, writing collected data to a file or displaying it
Fuzzy Logic Decision Infrastructure
Diagram components: Instrumented Task(s), System, Sensors, Inputs, Fuzzifier, Fuzzy Logic Decision Process, Fuzzy Logic Rule Base, Knowledge Repository, Defuzzifier, Outputs, Actuators, Monitor/Control Task(s)
Sample Fuzzy Logic Rule Base for Temperature Control
rulebase FurnaceRules;
// decide what to do based on roomtemp which falls into 3 ranges
var roomtemp(0, 100) {
    set trapez cold  ( 0, 50, 0, 20 );
    set trapez medium( 50, 70, 10 );
    set trapez hot   ( 80, 100, 20, 0 );
};
Figure: roomtemp truth values
Sample Fuzzy Logic Rule Base for Temperature Control (continued)
// control the furnace value in a range of 0-1, with 0 = off
var furnace(0, 1) {
    set triangle off ( 0, 0, 0.1 );
    set triangle half( 0.5, 0.1 );
    set triangle full( 1, 0 );
};
// the rules
if ( roomtemp == cold ) { furnace = full; }
if ( roomtemp == medium ) { furnace = half; }
if ( roomtemp == hot ) { furnace = off; }
Fuzzy Logic Decision Infrastructure
• Autopilot sensors provide a stream of room temperature readings. After fuzzification, this stream defines the value of the roomtemp fuzzy variable.
• Rules whose conditions are non-zero all contribute to determining the value of the output fuzzy variable furnace. After defuzzification, the value of furnace defines the action taken by the Autopilot actuator.
• Fuzzy logic handles noisy data and conflicting goals.
• Fuzzy logic separates data sets (definition of fuzzy variables) and rules (assertions and consequents), allowing each to be independently adjusted for a particular computing environment without re-coding the decision procedure.
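The fuzzify, apply rules, defuzzify cycle for the furnace example can be sketched in Python. Several things here are assumptions: the trapezoid breakpoints are illustrative and are not a decoding of the `set trapez` parameters on the slide, and defuzzification is simplified to a weighted average of one representative value per furnace set, which may differ from the method Autopilot uses.

```python
def trapezoid(x, a, b, c, d):
    """Truth value that rises from a to b, is 1 from b to c, and falls from c to d."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Assumed breakpoints for the three roomtemp sets (illustrative values only).
ROOMTEMP_SETS = {
    "cold":   (0, 0, 50, 70),
    "medium": (40, 50, 70, 80),
    "hot":    (60, 80, 100, 100),
}

# Assumed representative output value for each furnace set.
FURNACE_CENTERS = {"off": 0.0, "half": 0.5, "full": 1.0}

# The three rules from the slide: roomtemp set -> furnace set.
RULES = {"cold": "full", "medium": "half", "hot": "off"}


def fuzzify(temp):
    """Map a crisp temperature reading to a truth value for each roomtemp set."""
    return {name: trapezoid(temp, *params) for name, params in ROOMTEMP_SETS.items()}


def furnace_setting(temp):
    """Combine every rule with non-zero truth into one crisp furnace value."""
    truth = fuzzify(temp)
    total = sum(truth.values())
    if total == 0:
        return 0.0
    weighted = sum(truth[s] * FURNACE_CENTERS[RULES[s]] for s in truth)
    return weighted / total


setting_cold = furnace_setting(20)   # only the "cold" rule fires
setting_mixed = furnace_setting(45)  # "cold" and "medium" both contribute
setting_hot = furnace_setting(90)    # only the "hot" rule fires
```

At 45 degrees two rules fire at once, so the result lands between half and full. Changing the breakpoints in `ROOMTEMP_SETS` retunes the controller without touching `RULES` or the decision code, which is the separation the last bullet describes.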
Autodriver Monitor and Control Architecture
Diagram components: Autodriver Java GUI, Java Remote Method Invocation, Autodriver Autopilot Adapter, Unix, Instrumented Task, Autopilot Manager
Autodriver Startup
• User specifies hosts for Autopilot Manager and, if remote, Adapter
• Main window displays currently registered sensors and actuators
• User selects sensors and/or actuators they are interested in
Autodriver Field Selection
• When a tagged sensor is selected, a new window showing the list of fields in that sensor is displayed
• The user selects the field(s) they want to view
Autodriver Numeric Display
• Data can be displayed as numeric values
• The user can choose to save the data values to a file for later analysis
Autodriver Plot Display
• Using the ptplot package from Berkeley, values can be plotted as connected or unconnected points
• Multiple fields can be plotted in a single window
• User can control the number of points to display in the window and zoom in on an area of the graph
Autodriver Actuator Interaction
• User may enter a value for the selected actuator and transmit it to the remote process
• Interface may be customized for non-numeric data entry, such as a pull-down menu choice of LRU or MRU for an actuator controlling cache replacement policy
Virtue Monitor and Control Architecture
Diagram components: Virtue, Instrumented Task, Autopilot Manager
Diagram flows: tagged sensor data (Instrumented Task to Virtue); actuator controls (Virtue to Instrumented Task)
Virtue Display and Control
• Each sphere in the ring represents a workstation
• lmon collects processor utilization data and makes it available via sensors
• Virtue maps the data to the display
• Data transmission frequency can be adjusted via a slider connected to an lmon actuator
Case Study: Rocket Simulation Code
• Code developed by DOE ASCI Center for Simulation of Advanced Rockets (CSAR) at UIUC
• 40,000 lines of Fortran, MPI for communication between processes, runs on SGI Origin
• 200+ hours on 128 PEs to simulate 1/2 second of burn
• Ultimately want to model 2 minutes for complete booster burn-off
Flowchart steps: Init; Fluids Code (10 fluid iterations); Interpolation; Solids Code; Convergence Test (Y/n); Output
Flowchart annotations:
* Could modify iterations with actuator
* Do 3:1 multigrid solution for each of the meshes (3 for coarse grain mesh; 1 for fine grain)
* Check against a residual
* Best case, converge on first try
* Saves data
* Advances time step
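The gap between the current run time and the 2-minute goal can be made concrete with a short calculation. It assumes, purely for illustration, that cost scales linearly with simulated burn time on the same 128 PEs, and it uses 200 hours as a lower bound for the slide's "200+ hours".

```python
HOURS_FOR_HALF_SECOND = 200    # lower bound taken from "200+ hours"
SIMULATED_SECONDS = 0.5        # burn time covered by that run
TARGET_SECONDS = 2 * 60        # complete booster burn-off


def projected_hours(target_seconds,
                    hours=HOURS_FOR_HALF_SECOND,
                    simulated=SIMULATED_SECONDS):
    """Linear extrapolation of wall-clock hours (an assumption, not a measurement)."""
    return hours * (target_seconds / simulated)


hours_needed = projected_hours(TARGET_SECONDS)
years_needed = hours_needed / (24 * 365)
print(f"{hours_needed:.0f} hours, about {years_needed:.1f} years on 128 PEs")
```

Under these assumptions the full burn would take at least 48,000 hours, roughly five and a half years, which is why tuning and steering the code matter for this case study.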
Execution Environment
• Virtue: running on SGI Octane and ImmersaDesk in Pablo group
• lmons gathering network data: running on systems across the country
• CSAR code instrumented via SvPablo: running on SGI Origin at NCSA
• Autopilot Manager: running on SPARC in Pablo group
Wide-area Network Performance Data
• Network latency statistics gathered via modified traceroute and made available via Autopilot sensors
• Edge color represents latency: warm colors for high latency
• Cutting plane shows max value of intersected edges
Time Tunnel in Display Hierarchy
• Time tunnel is the second level in the Virtue display hierarchy, showing application behavior on a single parallel system
• Notice long delays for some MPI allreduce calls (shown in white)
Application Phases and Communication Patterns
View from “inside” Time Tunnel
• User can fly around within the virtual environment to get different views
• MPI profiling wrappers provide MPI call information via Autopilot sensors
• SvPablo provides code region information via Autopilot sensors
Call Graph in Display Hierarchy
• For each processor in the time tunnel, you can “drill down” to the procedure call graph
• SvPablo provides call graph layout and dynamic updates via Autopilot sensors
Call Graph Close-Up View
• Color mapped to inclusive procedure execution time
• Size mapped to number of times procedure called
• Magic lens exposes the procedure names
Source Code Text Billboard
• The user can select a procedure in the call graph display and “drill down” to the final level, which is the source code for the procedure
Current Efforts
• SvPablo: version with output via Autopilot sensors generally available
• Virtue: new displays and controls for interacting with Autopilot sensors and actuators
• Autodriver: integrated event definition, recognition, adaptation, and notification
• Trace Library and Extensions: rework to use Autopilot as infrastructure, providing “automatic” instrumentation of I/O, MPI I/O, and HDF calls with corresponding well-defined sensor data structures
Current Efforts (continued)
• Integrate sensors and actuators into Globus infrastructure
• Provide translators from
  – (appropriate) tagged sensor data to NetLogger format
  – NetLogger format to SDDF
  – SDDF to XML
  – XML to SDDF
• Continue to explore analysis, visualization, and control techniques in dynamic, distributed environments
Pablo Group Participants
• Professor Dan Reed, Pablo Project Director
• Randy Ribler*
• Huseyin Simitci
• Jim Oly
• Nancy Tran
• Guoyi Wang
• Don Schmidt
• Jeff Vetter*
• Luiz DeRose*
• Ying Zhang
• Mario Pantano*
• Eric Shaffer
• Shannon Whitmore
• Ben Schaeffer
• Dan Wells
• Deb Israel
and lots of others who have been part of the Pablo group over the years
* postdocs previously with the Pablo group