
Enabling Grids for E-sciencE
Supporting MPI Applications on EGEE Grids
Zoltán Farkas, MTA SZTAKI
www.eu-egee.org
EGEE-II INFSO-RI-031688
Budapest, 5 July 2006
Contents
• MPI
  – Standards
  – Implementations
• EGEE and MPI
  – History
  – Current status
  – Working/research groups in EGEE
  – Future work
• P-GRADE Grid Portal
  – Workflow execution, file handling
  – Direct job submission
  – Brokered job submission
MPI
• MPI stands for Message Passing Interface
  – Standards 1.1 and 2.0
• MPI Standard features:
  – Collective communication (1.1+)
  – Point-to-Point communication (1.1+)
  – Group management (1.1+)
  – Dynamic processes (2.0)
  – Programming language APIs
  – …
MPI Implementations
• MPICH
  – Freely available implementation of MPI
  – Runs on many architectures (even on Windows)
  – Implements Standard 1.1 (MPICH) and Standard 2.0 (MPICH2)
  – Supports Globus (MPICH-G2)
  – Nodes are allocated upon application execution
• LAM/MPI
  – Open-source implementation of MPI
  – Implements Standard 1.1 and parts of 2.0
  – Many interesting features (e.g. checkpointing)
  – Nodes are allocated before application execution
• Open MPI
  – Implements Standard 2.0
  – Builds on technologies of other MPI projects
MPICH execution on x86 clusters
• The application is started …
  – … using 'mpirun'
  – … specifying: the number of requested nodes (-np <nodenumber>), optionally a file listing the nodes to be allocated (-machinefile <file>), the executable, and the executable's arguments
  – $ mpirun -np 7 ./cummu -N -M -p 32
• Processes are spawned using 'rsh' or 'ssh', depending on the configuration (see the invocation sketch below)
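A minimal sketch of the invocation pattern described above; the host names, the machine file and the 'cummu' binary with its arguments are illustrative placeholders taken from the slide's example.

# Hosts that mpirun may place processes on (illustrative names):
printf '%s\n' node01 node02 node03 > machines.txt

# Start 3 MPI processes on the listed hosts; everything after the
# executable name is passed through to the application itself.
mpirun -np 3 -machinefile machines.txt ./cummu -N -M -p 32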
MPICH x86 execution – requirements
• The executable (and the input files) must be present on the worker nodes:
  – via a shared filesystem, or
  – the user distributes the files before invoking 'mpirun' (see the sketch below)
• The worker nodes must be accessible from the host running 'mpirun':
  – using 'rsh' or 'ssh'
  – without user interaction (host-based authentication)
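A hedged sketch of the manual staging described above for the case without a shared filesystem; the host and file names are illustrative, and host-based ssh authentication is assumed, so no password prompts appear.

#!/bin/bash
# Copy the executable and the input file to every worker node, then start MPICH.
HOSTS="node01 node02 node03"
RUNDIR=/tmp/mpich-demo

for host in $HOSTS; do
    ssh "$host" mkdir -p "$RUNDIR"           # no prompt: host-based authentication
    scp ./cummu input.dat "$host:$RUNDIR/"   # distribute executable and input
done

printf '%s\n' $HOSTS > machines.txt
mpirun -np 3 -machinefile machines.txt "$RUNDIR/cummu"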
EGEE and MPI
• MPI became important at the end of 2005 / beginning of 2006:
  – instructions about CE/jobmanager/WN configuration
  – the user has to start a wrapper script, because the input sandbox is not distributed to the worker nodes
  – a sample wrapper script was provided, which works for PBS and LSF and assumes 'ssh' (a sketch of such a wrapper follows below)
• Current status (according to experiments):
  – no need to use wrapper scripts
  – MPI jobs fail when there is no shared filesystem
  – remote file handling is not supported, so the user has to take care of it
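A hedged sketch of the kind of wrapper script these early instructions required, not the official EGEE sample; it assumes a PBS jobmanager (so the allocated hosts are listed in PBS_NODEFILE) and passwordless 'ssh' between the worker nodes.

#!/bin/bash
# wrapper.sh <executable> [args...] - distribute the sandbox, then run MPICH.
EXE=$1; shift

# Allocated nodes and their number, as provided by PBS.
NODES=$(sort -u "$PBS_NODEFILE")
NP=$(wc -l < "$PBS_NODEFILE")

# The input sandbox arrives only on the first node, so copy it to the others.
for host in $NODES; do
    [ "$host" = "$(hostname)" ] && continue
    ssh "$host" mkdir -p "$PWD"
    scp -r "$PWD"/* "$host:$PWD/"
done

chmod +x "$EXE"
mpirun -np "$NP" -machinefile "$PBS_NODEFILE" "$PWD/$EXE" "$@"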
EGEE and MPI - II
• Research/working groups have been formed:
  – MPI TCG WG:
    User requirements:
    • "shared" filesystem: distribute the executable and the input files
    • Storage Element handling
    Site requirements:
    • the solution must be compatible with a large number of jobmanagers
    • information system extensions (e.g. the maximum number of concurrent CPUs used by a job, …)
  – MSc research group (1-month project, 2 students):
    • created wrapper scripts for MPICH, LAM/MPI and Open MPI
    • the application source is compiled before execution
    • the executable and input files are distributed to the allocated worker nodes, 'ssh' is assumed
    • no remote file support
EGEE and MPI – Future work
• Add support for:
  – all possible jobmanagers
  – all possible MPI implementations
  – Storage Element handling in the case of legacy applications
  – input sandbox distribution before application execution, when there is no shared filesystem
  – output file collection after the application has finished, when there is no shared filesystem
P-GRADE Grid Portal
• Workflow execution:
  – DAGMan is used as the workflow scheduler
  – pre and post scripts perform tasks around job execution (a DAGMan sketch follows below)
  – Direct job execution using GT-2 (GridFTP, GRAM):
    • pre: create a temporary storage directory, copy the input files
    • job: Condor-G executes a wrapper script
    • post: download the results
  – Job execution using the EGEE broker (both LCG and gLite):
    • pre: create the application context as the input sandbox
    • job: a Scheduler universe Condor job executes a script, which does the job submission, status polling and output downloading; a wrapper script is submitted to the broker
    • post: error checking
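A hedged illustration of how a two-node portal workflow might map onto DAGMan; the file names are invented for the example, and the submit files stand for the ones the portal generates (Condor-G for direct execution, Scheduler universe for brokered execution).

#!/bin/bash
# Generate a two-node DAG the way the portal's workflow layer would,
# then hand the whole workflow to DAGMan (file names are illustrative).
cat > workflow.dag <<'EOF'
JOB jobA jobA.submit
SCRIPT PRE  jobA pre_jobA.sh
SCRIPT POST jobA post_jobA.sh
JOB jobB jobB.submit
SCRIPT PRE  jobB pre_jobB.sh
SCRIPT POST jobB post_jobB.sh
PARENT jobA CHILD jobB
EOF

condor_submit_dag workflow.dag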
Portal: File handling
• "Local" files:
  – the user has access to these files through the Portal
  – local input files are uploaded from the user's machine
  – local output files are downloaded to the user's machine
• "Remote" files:
  – files reside on EGEE Storage Elements or are accessible using GridFTP
  – EGEE SE files: lfn:/… or guid:…
  – GridFTP files: gsiftp://…
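A hedged sketch of how such remote files can be fetched with the command-line tools of the period (lcg_utils for Storage Element files, globus-url-copy for GridFTP URLs); the VO name, the LFN and the host names are illustrative.

#!/bin/bash
# Fetch an EGEE SE file identified by its logical file name (LFN):
lcg-cp --vo myvo lfn:/grid/myvo/inputs/input.dat "file://$PWD/input.dat"

# Fetch a file that is exposed directly over GridFTP:
globus-url-copy gsiftp://storage.example.org/data/input2.dat "file://$PWD/input2.dat"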
Portal: Direct job execution
• The resource to be used is known before job execution
• The user must have a valid, accepted certificate
• Local files are supported
• Remote GridFTP files are supported, even for grid-unaware applications
• Jobs may be sequential or MPI applications
Direct exec: step-by-step I.
1. Pre script:
  • creates a storage directory on the selected site's front-end node, using the 'fork' jobmanager
  • local input files are copied into this directory from the Portal machine using GridFTP
  • remote input files are copied using GridFTP (in case of errors, a two-phase copy via the Portal machine is tried)
2. Condor-G job:
  • a wrapper script (wrapperp) is specified as the real executable
  • a single job is submitted to the requested jobmanager; for MPI jobs the 'hostCount' RSL attribute is used to specify the number of requested nodes (a submit-file sketch follows below)
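A hedged sketch of the kind of Condor-G submit description the portal might generate for this step; the CE name, file names and node count are illustrative, not the portal's actual output.

#!/bin/bash
# Generate and submit the Condor-G description of one workflow node.
cat > jobA.submit <<'EOF'
universe        = globus
globusscheduler = ce.example.org/jobmanager-pbs
executable      = wrapperp.sh
arguments       = cummu -N -M -p 32
# hostCount asks GRAM (and PBS behind it) for 3 nodes, while wrapperp
# itself is started only once (jobType=single)
globusrsl       = (jobType=single)(hostCount=3)
output          = jobA.out
error           = jobA.err
log             = jobA.log
queue
EOF

condor_submit jobA.submit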
Direct exec: step-by-step II.
3. LRMS:
  • allocates the requested number of nodes (if needed)
  • starts wrapperp on one of the allocated nodes (the master worker node)
4. Wrapperp (running on the master worker node):
  • copies the executable and the input files from the front-end node ('scp' or 'rcp')
  • in the case of PBS jobmanagers, the executable and the input files are copied to the allocated nodes (listed in PBS_NODEFILE); for non-PBS jobmanagers a shared filesystem is required, as the host names of the allocated nodes cannot be determined
  • wrapperp searches for 'mpirun'
  • the real executable is started using the 'mpirun' that was found
  • in the case of PBS jobmanagers, output files are copied from the allocated worker nodes to the master worker node
  • output files are copied to the front-end node (a sketch of this logic follows below)
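A hedged sketch of the wrapperp logic summarised above; the argument convention, paths and file patterns are invented for the example, so this is not the actual portal script.

#!/bin/bash
# wrapperp.sh <frontend-host> <remote-dir> <executable> [args...]
FRONTEND=$1; REMOTE_DIR=$2; EXE=$3; shift 3

# 1. fetch the executable and the input files from the front-end node
scp "$FRONTEND:$REMOTE_DIR/*" .
chmod +x "$EXE"

# 2. PBS case: distribute them to the other allocated worker nodes
if [ -n "$PBS_NODEFILE" ]; then
    NP=$(wc -l < "$PBS_NODEFILE")
    for host in $(sort -u "$PBS_NODEFILE"); do
        [ "$host" = "$(hostname)" ] && continue
        ssh "$host" mkdir -p "$PWD"
        scp "$EXE" *.in "$host:$PWD/"
    done
else
    NP=1    # non-PBS: node list unknown, a shared filesystem is assumed
fi

# 3. locate mpirun and start the real executable
MPIRUN=$(command -v mpirun)
if [ -n "$PBS_NODEFILE" ]; then
    "$MPIRUN" -np "$NP" -machinefile "$PBS_NODEFILE" "./$EXE" "$@"
else
    "$MPIRUN" -np "$NP" "./$EXE" "$@"
fi

# 4. PBS case: collect output files from the slave nodes
if [ -n "$PBS_NODEFILE" ]; then
    for host in $(sort -u "$PBS_NODEFILE"); do
        [ "$host" = "$(hostname)" ] && continue
        scp "$host:$PWD/*.out" . 2>/dev/null
    done
fi

# 5. copy the output files back to the front-end node
scp *.out "$FRONTEND:$REMOTE_DIR/"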
Direct exec: step-by-step III.
5. Post script:
  • local output files are copied from the temporary working directory created by the pre script to the Portal machine using GridFTP
  • remote output files are copied using GridFTP (in case of errors, a two-phase copy via the Portal machine is tried, see the sketch below)
6. DAGMan: schedules the next jobs…
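A hedged sketch of the two-phase fallback mentioned above; the GridFTP URLs and the temporary path are illustrative.

#!/bin/bash
SRC=gsiftp://ce.example.org/tmp/job123/result.dat     # file in the temporary directory on the front-end node
DST=gsiftp://storage.example.org/data/result.dat      # remote destination

# Try a direct third-party GridFTP transfer first; if that fails, route the
# file through the Portal machine in two steps.
if ! globus-url-copy "$SRC" "$DST"; then
    globus-url-copy "$SRC" file:///tmp/result.dat
    globus-url-copy file:///tmp/result.dat "$DST"
fi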
Direct execution: animated
[Diagram: the Portal machine, the front-end node (fork jobmanager, temporary storage), the master WN running wrapperp and mpirun, slave WN1 … WNn-1, and a remote file storage; numbered arrows (1-5) show GridFTP staging of the input/executable, PBS allocation, and the collection of the output files.]
Direct Submission Summary
• Pros:
  – users can add remote file support to legacy applications
  – works for both sequential and MPI(CH) applications
  – for PBS jobmanagers there is no need for a shared filesystem (support for other jobmanagers can be added, depending on the information the jobmanagers provide)
  – works with jobmanagers that do not support MPI
  – faster than submitting through the broker
• Cons:
  – does not integrate into the EGEE middleware
  – the user needs to specify the execution resource
  – currently does not work with non-PBS jobmanagers without a shared filesystem
Portal: Brokered job submission
• The resource to be used is unknown before job execution
• The user must have a valid, accepted certificate
• Local files are supported
• Remote files residing on Storage Elements are supported, even for grid-unaware applications
• Jobs may be sequential or MPI applications
Broker exec: step-by-step I.
1. Pre script:
  • creates the Scheduler universe Condor submit file
2. Scheduler universe Condor job:
  • the job is a shell script
  • the script is responsible for:
    – job submission: a wrapper script (wrapperrb) is specified as the real executable in the JDL file (a JDL sketch follows below)
    – job status polling
    – job output downloading
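A hedged sketch of the JDL this shell script might generate and of the LCG-2-style commands it would call; the attribute values, file names and VO are illustrative.

#!/bin/bash
# JDL describing the MPI job; wrapperrb is declared as the executable.
cat > mpi_job.jdl <<'EOF'
Type          = "Job";
JobType       = "MPICH";
NodeNumber    = 3;
Executable    = "wrapperrb.sh";
Arguments     = "cummu -N -M -p 32";
StdOutput     = "std.out";
StdError      = "std.err";
InputSandbox  = { "wrapperrb.sh", "cummu" };
OutputSandbox = { "std.out", "std.err" };
EOF

# submit, poll the status, and finally fetch the output sandbox
edg-job-submit --vo myvo -o job.id mpi_job.jdl
edg-job-status -i job.id
edg-job-get-output -i job.id --dir ./results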
Broker exec: step-by-step II.
3. Resource Broker:
  • handles the requests of the Scheduler universe Condor job
  • sends the job to a CE
  • watches its execution
  • reports errors
  • …
4. LRMS on the CE:
  • allocates the requested number of nodes
  • starts wrapperrb on the master worker node using 'mpirun'
Broker exec: step-by-step III.
5. Wrapperrb:
  • the script is started by 'mpirun', so it runs on every allocated worker node like an MPICH process
  • checks whether the remote input files are already present; if not, they are downloaded from the Storage Element
  • if the user specified any remote output files, they are removed from the storage first
  • the real executable is started with the arguments passed to the script; these arguments already contain the MPICH-specific ones
  • after the executable has finished, the remote output files are uploaded to the Storage Element (only in the case of gLite; a sketch of this logic follows below)
6. Post script:
  • nothing special…
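A hedged sketch of the wrapperrb logic summarised above, using lcg_utils for the Storage Element traffic; the VO name, the LFNs, the file names and the way the remote files are declared are all invented for the example.

#!/bin/bash
# wrapperrb.sh - started by 'mpirun' on every allocated worker node.
VO=myvo
IN_LFN=lfn:/grid/myvo/inputs/input.dat    ; IN_FILE=input.dat
OUT_LFN=lfn:/grid/myvo/results/output.dat ; OUT_FILE=output.dat

# 1. download the remote input file if it is not present yet
[ -f "$IN_FILE" ] || lcg-cp --vo "$VO" "$IN_LFN" "file://$PWD/$IN_FILE"

# 2. remove a remote output file left over from an earlier run
lcg-del -a --vo "$VO" "$OUT_LFN" 2>/dev/null

# 3. start the real executable with the (MPICH-aware) arguments passed in
./cummu "$@"
RET=$?

# 4. upload the remote output file (gLite case), then return the exit code
[ -f "$OUT_FILE" ] && lcg-cr --vo "$VO" -l "$OUT_LFN" "file://$PWD/$OUT_FILE"
exit $RET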
Broker execution: animated
[Diagram: the Portal machine, the Resource Broker, a Storage Element, the CE front-end node (Globus, PBS), the master WN running mpirun and wrapperrb with the real executable, and slave WN1 … WNn-1; numbered arrows (1-5) show the submission through the broker, the allocation on the CE and the Storage Element transfers.]
Broker Submission Summary
• Pros:
  – adds remote file handling support for legacy applications
  – extends the functionality of the EGEE broker
  – one solution supports both sequential and MPI applications
• Cons:
  – slow application execution
  – status polling generates a high load with 500+ jobs
Experimental results
• Tested selected SEEGRID CEs, using the broker from the command line and the direct job submission from the P-GRADE Portal, with a job requesting 3 nodes:

  CE Name                    Broker Result            Portal Direct Result
  ce.phy.bg.ac.yu            Failed (exe not found)   OK
  ce.ulakbim.gov.tr          Scheduled                OK
  ce01.isabella.grnet.gr     OK                       Failed (job held)
  ce02.grid.acad.bg          OK                       OK
  cluster1.csk.kg.ac.yu      Failed                   OK
  grid01.rcub.bg.ac.yu       Failed                   OK
  grid2.cs.bilkent.edu.tr    Failed (exe not found)   OK
?
Thank you for your attention
Budapest, 5 July 2006