- Number of slides: 15
Enabling Grids for E-sciencE
ICE-CREAM
Luigi Zangrando, on behalf of the JRA1 IT-CZ Padova group
www.eu-egee.org
INFSO-RI-508833
EGEE JRA1-ITCZ cluster meeting, Torino, November 2005

Slide shown last time
• Last time we showed these 2 slides:
  – Test: LSFConnector vs BLAHConnector
    § Submitted 100 jobs sequentially to a CREAM-based CE
    § No other load (e.g. no other jobs) on CREAM
    § Measured LRMSSubmissionTime - SubmissionTime for all the jobs, in the two scenarios (LSFConnector and BLAHConnector)
      • SubmissionTime: when the job is received by CREAM (i.e. when CREAM inserts the job in its journal manager)
      • LRMSSubmissionTime: when the job is submitted to LSF (as reported by the LSF log)
    § For the purpose of this test, jobs in the JournalManager are managed sequentially (i.e. a job is submitted to the LRMS only after the previous job has been submitted)
      o I.e. BLAH was used in sync mode
      o Both connectors could do better than this
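The measured quantity can be illustrated with a minimal sketch. It assumes the two timestamps of each job are already available as `datetime` values; the function name and the sample timestamps are hypothetical and are not measurements from the test.

```python
from datetime import datetime

def submission_delays(jobs):
    """Return LRMSSubmissionTime - SubmissionTime, in seconds, for each job.

    `jobs` is a list of (SubmissionTime, LRMSSubmissionTime) pairs.
    """
    return [(lrms - received).total_seconds() for received, lrms in jobs]

# Made-up timestamps for three jobs, only to exercise the function.
sample_jobs = [
    (datetime(2005, 11, 1, 10, 0, 0), datetime(2005, 11, 1, 10, 0, 2)),
    (datetime(2005, 11, 1, 10, 0, 1), datetime(2005, 11, 1, 10, 0, 5)),
    (datetime(2005, 11, 1, 10, 0, 2), datetime(2005, 11, 1, 10, 0, 8)),
]
delays = submission_delays(sample_jobs)
```

With sequential handling, a job that arrives while an earlier one is still being submitted has to wait, so the delay can grow with the position of the job in the queue, as in the made-up data above.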
Slide shown last time
CREAM - BLAH
• This triggered a discussion with the BLAH developers
• Decided to revise the CREAM architecture
  – Decided to drop the LRMS-specific connectors and use BLAH instead for every “interaction” with the underlying resource management system
• CREAM journal manager modified to allow parallel BLAH submissions
  – Since BLAH submission is I/O bound
  – Number of threads is configurable
  – Test repeated with 10 threads
    § 9-10 s (constant) as LRMSSubmissionTime - SubmissionTime
    § 4 s measured on another CREAM installation
      • Not investigated further
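The idea of draining the journal with a configurable number of threads can be sketched as follows. This is only an illustration in Python, not the CREAM code: `blah_submit` is a hypothetical stand-in that sleeps to mimic an I/O-bound BLAH call, and `drain_journal` and `num_threads` are names invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blah_submit(job_id):
    """Hypothetical stand-in for a blocking BLAH submission (sleeps to mimic I/O wait)."""
    time.sleep(0.05)
    return f"lrms-{job_id}"

def drain_journal(job_ids, num_threads):
    """Submit all queued jobs using a pool of `num_threads` worker threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map preserves the order of job_ids in its results.
        return dict(zip(job_ids, pool.map(blah_submit, job_ids)))

start = time.monotonic()
result = drain_journal(list(range(20)), num_threads=10)
elapsed = time.monotonic() - start
```

Because each submission mostly waits on I/O, the threads overlap their waiting time: 20 simulated submissions with 10 threads take roughly two rounds of waiting instead of twenty.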
CREAM - BLAH
• Changes negotiated with the BLAH developers so that the BLAH log parser sends notifications about job status changes
  – See: http://savannah.cern.ch/bugs/?func=detailitem&item_id=12225
  – Just provided by the BLAH developers
  – Starting integration with CREAM
• Changes negotiated with the BLAH developers to have BLAH commands working on multiple jobs
  – Waiting to get these modifications
Credential mapping
• glexec (formerly known as su-exec) is not in gLite 1.5 and has not been released yet
  – Needed for credential mapping
• Talked in Pisa with the JRA3 developers
  – Discussed the dirty details
  – Agreed on some needed modifications
  – They reported that about 10 days after Pisa they should be able to release something working
    § It should be available by now
• Started discussing with the BLAH developers where to apply this integration
  – In CREAM calling BLAH, or in BLAH calling the LRMS commands?
  – The decision also depends on the overhead introduced by glexec
    § To be measured when glexec is usable
• In the meantime started applying some other needed changes
  – Deployment and integration of an LCMAPS-enabled GridFTP server
  – Proper ownerships and protections of directories
CREAM: other accomplishments
• Porting to Axis 1.2.1
• Porting to gSOAP 2.7.6b
  – Several problems managing faults
• Applied modifications needed because of changes in the delegation stuff
• Support for a configuration file in the CREAM CLI
• Several bug fixes (in both client and server)
• User documentation updated
• First draft of a “high level” document describing CREAM architecture and functionality available
• First unit tests committed
CREAM: other issues and other next steps
• Integration of VOMS-based authorization
  – VomsPDP just released
• Integration with CEMon
  – To provide asynchronous notifications about CREAM jobs
• Support of DAGs and bulk jobs
  – We plan to implement parametric and collection jobs as DAG jobs, as done in the WMS
• CREAM CLI in the build system
  – The circular dependency problem still has to be addressed
Safe interactive access to jobs
• Tobia Conforto joined us for his “BSC” university internship
• Resumed the work on interactive access to jobs
• General idea
  – One-way interactivity: job → user
  – Let a user monitor her job’s stdout, stderr and output files in real time
• In detail
  – Interactive read-only access to a running job’s environment
  – The CREAM JobId is the only parameter needed
  – Remote ps, top, ls, cat and tail-like functionality on the Worker Node
  – Intelligent browsing of remote files: the client-side hex viewer and view-like functionality transfer only the chunks of the remote file that are needed
  – GUI clients are possible, although not currently scheduled
• Why
  – Inspection of long-running jobs: the user is not blind to the job’s progress and can make an informed decision on whether to stop it or let it run
  – Early sampling of a batch of jobs to check correct operation can save considerable amounts of resources that would otherwise be wasted
  – Faster turnaround of debug sessions, trial runs and other kinds of tests
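The “transfer only the needed chunks” idea behind the hex viewer can be sketched as below. This is an assumption-laden illustration: a local temporary file stands in for the remote file on the Worker Node, and `read_chunk` and `hex_view` are hypothetical helper names, not part of the actual client.

```python
import os
import tempfile

def read_chunk(path, offset, length):
    """Read only `length` bytes starting at `offset`, instead of the whole file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def hex_view(data, offset=0, width=16):
    """Format bytes as hex rows, each prefixed with its offset in the file."""
    rows = []
    for i in range(0, len(data), width):
        row = data[i:i + width]
        rows.append(f"{offset + i:08x}  " + " ".join(f"{b:02x}" for b in row))
    return "\n".join(rows)

# A local 4 KiB file stands in for a job output file on the Worker Node.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 16)
    demo_path = tmp.name

chunk = read_chunk(demo_path, offset=256, length=16)
view = hex_view(chunk, offset=256)
os.remove(demo_path)
```

Only the 16 bytes being displayed are read; scrolling in a viewer built this way would request further chunks on demand.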
Safe interactive access to jobs
• How
  [Diagram: C++ Client → glite-authenticated SOAP messages (Internet) → Webservice → ssh as the specific CE local user (CE LAN) → Worker Node]
• Security considerations
  – Access to the service is subject to the same authentication as CREAM
  – The user only has access to worker nodes where one of her jobs is running
  – She may only issue a fixed set of commands, none of which can alter files
  – User-supplied arguments are strictly parsed to prevent shell escaping
• Privacy considerations
  – SOAP messages, including all traffic payload, are encrypted with SSL
  – The set of files / directories / devices the user has read access to on the worker node is restricted by the same OS file permissions as her job’s
  – Additional filters can restrict the commands to the job’s working directories
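A minimal sketch of the “fixed set of commands” and “strictly parsed arguments” points follows. The command set comes from the previous slide; the argument pattern, the rejection of option-like arguments and the name `build_argv` are assumptions made for the example, not the service’s actual rules.

```python
import re

# Read-only commands named on the previous slide.
ALLOWED_COMMANDS = {"ps", "top", "ls", "cat", "tail"}

# Assumed policy: plain path-like arguments only, so no shell metacharacters.
SAFE_ARG = re.compile(r"[A-Za-z0-9_./-]+")

def build_argv(command, args):
    """Validate a request and return an argument list, or raise ValueError."""
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {command!r}")
    for arg in args:
        # Reject anything outside the pattern, and option-like arguments.
        if not SAFE_ARG.fullmatch(arg) or arg.startswith("-"):
            raise ValueError(f"argument rejected: {arg!r}")
    return [command, *args]

ok = build_argv("tail", ["output/stdout.log"])
```

Returning an argument list rather than a command string means the request can be executed without going through a shell, so characters such as `;` or `|` never get interpreted even if the filter is later relaxed.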
CREAM: “external links”
• GRIDCC
  – GRIDCC is integrating CREAM submission in their portal
    § Based on Java clients that we provided at their request
    § This was shown in the recent GridCC EU review
  – We maintain a CREAM installation, deployed on a small LSF farm in Legnaro
  – Support to Laura Del Cano (Elettra, Trieste), who is doing the work
• AVANADE
  – Software company with whom we had a meeting some time ago
  – Interested in evaluating our stuff and possibly collaborating with us
  – Trying to deploy CREAM in their .NET environment
    § They need a document/literal version of the services
      • Provided for CREAM
      • The problem is with the delegation stuff
        o Pinged the security group, but it looks like they are not going to do it in the short term
XYZ ICE
• Found a better name (than XYZ) for the WMS component dealing with submissions to CREAM CEs: ICE
• ICE: Interface to Cream Environment
• Isn’t “ICE-CREAM integration” nice?
• First contacted the GridICE team
  – They saw no problems, even if the name ICE can make people think of GridICE (and vice versa)
    § But that would not be a bad thing
• ICE is in CVS (org.glite.wms.ice) but not yet linked to the build system
ICE (Interface to Cream Environment)
• ICE is the software component acting as an interface between the WMS and CREAM CEs
• Operations initially handled by ICE
  – Job submissions
  – Job removals
• ICE is being developed as a stand-alone process
  – Written in C++
  – Whether it can become a WM thread in the future will be investigated
• At the moment it is under heavy development; many features are missing
  – Right now jobs are polled to get status changes
  – In the future, an additional ICE thread will receive notifications from the CEMon coupled with the CREAM CEs
WMS-CREAM integration / 1
• ICE takes the job management requests from its filelist
• ICE manages the submission to CREAM (see next slides)
• ICE keeps the mapping between the GridjobId and the CREAMjobId
  – This mapping is critical: it is essential that ICE remembers which jobs it controls
  – For the moment the mapping is kept on disk, using a journal to record updates
    § To be investigated whether LBproxy can be used
• Failed submissions are reinserted into the WM’s filelist, as in the current implementation (JC+LM)
• ICE features:
  – Multithreaded (the submitter and the status poller are two separate threads)
  – Uses log4cpp for logging debug messages
  – Tries to be fault tolerant
[Diagram labels: NS, WMProxy, FileList, WM, Helpers, MM, JA, FileList, ICE, Submitter, Poller, JC+LM, Condor, CREAM]
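Keeping the job-id mapping on disk with a journal of updates can be sketched as follows. ICE itself is written in C++ and its journal format is not described on the slide; the `JobMap` class, the JSON-lines record layout and the sample identifiers are all assumptions made for this illustration.

```python
import json
import os
import tempfile

class JobMap:
    """GridjobId -> CREAMjobId mapping, rebuilt from an append-only journal."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.mapping = {}
        self._replay()

    def _replay(self):
        # Re-apply every recorded update, in order, to recover the mapping.
        if not os.path.exists(self.journal_path):
            return
        with open(self.journal_path) as f:
            for line in f:
                rec = json.loads(line)
                if rec["op"] == "put":
                    self.mapping[rec["grid"]] = rec["cream"]
                elif rec["op"] == "del":
                    self.mapping.pop(rec["grid"], None)

    def _append(self, rec):
        # Write the update to disk before changing the in-memory state.
        with open(self.journal_path, "a") as f:
            f.write(json.dumps(rec) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def put(self, grid_id, cream_id):
        self._append({"op": "put", "grid": grid_id, "cream": cream_id})
        self.mapping[grid_id] = cream_id

    def remove(self, grid_id):
        self._append({"op": "del", "grid": grid_id})
        self.mapping.pop(grid_id, None)

journal = os.path.join(tempfile.mkdtemp(), "ice.journal")
first = JobMap(journal)
first.put("grid-1", "CREAM-101")
first.put("grid-2", "CREAM-102")
first.remove("grid-1")

# A fresh instance, as after a restart, recovers the state from the journal.
recovered = JobMap(journal).mapping
```

Appending each update before applying it in memory is what lets a restarted process remember which jobs it controls; a real implementation would also need to compact the journal from time to time.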
ICE: what we have so far
• Implemented
  – Submission to a CREAM CE is working
    § Support for multiple CREAM CEs is done sequentially right now
  – The job status poller is working fine
  – Removal (cancel) of a job is coming soon
• To do
  – Job status change listener via CEMon
  – Extending ICE to handle submission to multiple CREAM CEs
    § In parallel (being implemented)
  – LB logging
  – “Lease” submission protocol
  – Proxy renewal
• All tests are being done with a stand-alone ICE to easily identify where problems are located
  – Requests are inserted into the “WM” filelist via a testing tool which simulates the WM inserting requests in the filelist
  – “True” WM integration will be done when everything has been tested enough