b575feae95d53fa0d25cc222aa2f246c.ppt
- Number of slides: 46
Grid interoperability using JSAGA
Sylvain Reynaud, Pascal Calvat
CC-IN2P3
Plan
■ demo of JJS
■ overview of JSAGA
■ demo of JUX
■ summary and perspectives
JSAGA is an API for uniform access to grids. JJS and JUX are tools using JSAGA.
JJS – Overview
■ JJS was developed by Pascal Calvat (CC-IN2P3) in 2003 to submit jobs to the DATAGRID infrastructure
  – it has since evolved to submit jobs to the EGEE infrastructure
■ JJS is designed to ease job submission from web servers hosted in laboratories
  – it is an alternative to User Interface + Resource Broker (or to gLite-UI + gLite-WMS)
■ JJS is optimized for submitting short-life jobs
  – based on the observed QoS of sites: JJS gives a score to selected sites and uses it for subsequent match-making
  – but it can also be used with long-life jobs
3/18/2018 JSAGA
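The slide does not say how JJS computes site scores. As a minimal sketch of QoS-based ranking, assuming an exponential moving average over job outcomes, one could write the following. The class name, the 0.5 weight and the neutral starting score are all hypothetical and are not taken from JJS.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical QoS tracker: an illustration only, not the JJS implementation. */
class SiteScoreBoard {
    // Weight of the newest observation (assumed value).
    private static final double ALPHA = 0.5;
    // Score given to a site that has never been observed (assumed value).
    private static final double NEUTRAL = 0.5;

    private final Map<String, Double> scores = new HashMap<>();

    /** Record one finished job: 1.0 if it completed before its timeout, else 0.0. */
    void record(String site, boolean succeededInTime) {
        double observation = succeededInTime ? 1.0 : 0.0;
        double previous = scores.getOrDefault(site, NEUTRAL);
        scores.put(site, (1 - ALPHA) * previous + ALPHA * observation);
    }

    /** Current score of a site, between 0.0 and 1.0. */
    double score(String site) {
        return scores.getOrDefault(site, NEUTRAL);
    }

    /** Candidate sites ordered from best to worst score; ties keep their input order. */
    List<String> rank(List<String> candidates) {
        List<String> ranked = new ArrayList<>(candidates);
        ranked.sort(Comparator.comparingDouble((String s) -> -score(s)));
        return ranked;
    }
}
```

A match-maker built this way would call `record` after each job ends and `rank` before each new submission, so that sites which recently failed or timed out drift to the end of the list.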
JJS – Demo
Overall performance for short-life jobs (install povray on-the-fly, then generate part of the image)
[Diagram label: 1 job]
Execution time on local host:                  8000 s
Execution time on EGEE grid infrastructure:    600 s with 100 jobs
Ratio:                                         13
JJS – Overview
■ JJS was initially developed on top of the cog-jglobus API
■ cog-jglobus is being replaced with JSAGA for…
  – security
  – data management
  – execution management
  – job collection management
  [status labels beside this list: (done), (in a near future)]
■ Using JSAGA enables JJS to become independent of gLite middleware evolutions
  – from Globus proxy to VOMS proxy (done)
  – from GSIFTP to SRM (work in progress…)
  – from LCG-CE to gLite-CREAM (in a near future)
JSAGA – targeted use cases
[Diagram label: cluster]
Motivations for using several grid infrastructures:
• increasing the number of computing resources available to the user
• need for resources with specific constraints
  • super-computer
  • confidentiality
  • small overhead (e.g. consolidation)
  • interactivity
• availability, on a given grid, of:
  • the data
  • the software
[Layered diagram; layer labels: SAGA, JSAGA]
■ Ready-to-use software, adapted to targeted scientific field
■ Hide heterogeneity between grid infrastructures
■ Hide heterogeneity between middlewares
■ As many interfaces as ways to implement each functionality
■ As many interfaces as used technologies
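SAGA: code example
    import java.util.List;
    import org.ogf.saga.namespace.NSDirectory;
    import org.ogf.saga.namespace.NSFactory;
    import org.ogf.saga.session.Session;
    import org.ogf.saga.session.SessionFactory;
    import org.ogf.saga.url.URL;
    import org.ogf.saga.url.URLFactory;

    // use factories to create SAGA objects
    Session session = SessionFactory.createSession();
    URL url = URLFactory.createURL("gsiftp://cclcgseli01.in2p3.fr/tmp/");
    NSDirectory dir = NSFactory.createNSDirectory(session, url);

    // use SAGA objects
    List<URL> result = dir.list();
    for (URL r : result) System.out.println(r);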
[Layered diagram; layer labels: SAGA, JSAGA core engine + plug-ins, JSAGA plug-ins; role labels: end user, application developer, plug-ins developer]
■ Ready-to-use software, adapted to targeted scientific field
■ Hide heterogeneity between grid infrastructures
■ Hide heterogeneity between middlewares
■ As many interfaces as ways to implement each functionality
■ As many interfaces as used technologies
Plug-ins interfaces
[Label: JSAGA = core engine + plug-ins]
SAGA
■ close to application developer needs
  – object-oriented
  – high-level
  – uniform interface to all the supported technologies
■ design objectives
  – easy to use
  – … but «certainly not simple to implement» (T. Kielmann)
JSAGA plug-ins interfaces
■ close to existing middleware APIs
  – service-oriented
  – low-level
  – as many interfaces as ways to implement each functionality
  – optional interfaces
■ design objectives
  – easy to implement
  – enable efficient usage of middleware APIs
    • engine code = 2 x plug-ins code
Plug-ins: execution management
[Label: JSAGA = core engine + plug-ins]
Streaming
  – Plug-in interfaces: direct / buffered / redirected streams, used before / during / after execution (set stream, get stream; interactive and non-interactive jobs)
  – SAGA user interface: getInput / getOutput / getError
Monitoring
  – Plug-in interfaces: querying / listening, for an individual job / a list of jobs / filtered jobs
  – SAGA user interface: getState / waitFor
[Diagram; legend: done / construction / planned]
  – Job monitoring plug-ins: …, cream, fork, ssh, unicore6, wsgram, gLite-LB, gatekeeper
  – Job control plug-ins: cream, fork, ssh, unicore6, wsgram, gLite-WMS, gatekeeper
  – also shown: naregi, remote PBS
Plug-ins provided
[Label: JSAGA = core engine + plug-ins]
[Diagram; legend: done / construction / planned. Category labels: Security, Data (Physical, Logical, catalog), Exec. (control), Exec. (monitor), Language, Expression. The grouping below follows the reading order of the labels and may not match the original layout.]
  – Security: JKS, Login / pwd, SSH, MyProxy, X509, VOMS, Globus, G.RFC820, G.Legacy, InMemCred
  – Data: srm, ftp, mail, cache, gsiftp, tar, zip, sftp, rbyteio, file, https, http, lfn, srb / irods, rns, …
  – Exec. (monitor): cream, fork, ssh, unicore6, wsgram, gLite-LB, gatekeeper
  – Exec. (control): fork, gatekeeper, gLite-WMS, wsgram, unicore6, ssh, cream, PBS, remote, naregi
  – Language: RSL-4, RSL-2, JDL, SAGA, JSDL+ext.
  – Expression: BeanShell, basic, default, JEP
This is still not enough…
[Label: JSAGA = core engine + plug-ins]
[Diagram: job desc. → JSAGA, which hides middleware heterogeneity (e.g. gLite, Globus, Unicore) → gLite plug-ins (JDL) and Globus plug-ins (RSL)]
This is still not enough…
[Diagram: job desc. passes through two layers. One hides infrastructure heterogeneity (e.g. EGEE, OSG, DEISA); the other, JSAGA, hides middleware heterogeneity (e.g. gLite, Globus, Unicore). Branch labels: gLite plug-ins, JDL, WMS, LCG-CE, firewall, job; Globus plug-ins, RSL, WS-GRAM, GridFTP, SRM, input data, job. Other labels: delegate selection & files staging, staging graph, EGEE, OPlast.]
[Layered diagram; layer labels: JSAGA jobs collection, SAGA, JSAGA core engine + plug-ins, JSAGA plug-ins; role labels: end user, application developer, plug-ins developer]
■ Ready-to-use software, adapted to targeted scientific field
■ Hide heterogeneity between grid infrastructures
■ Hide heterogeneity between middlewares
■ As many interfaces as ways to implement each functionality
■ As many interfaces as used technologies
Description of infrastructures
■ Infrastructures heterogeneity
  – Grid/site policy
    • e.g. network filtering, shared FS
  – Environment variables
    • e.g. $VO_?_SW_DIR, /usr/local
  – Configuration attributes (client)
    • e.g. monitor service URL, shell path on cygwin, default SE URL
  – Command line interfaces (worker)
    • e.g. globus-url-copy, srmcp, scp, wget, tar
■ Middleware heterogeneity
  – e.g. CREAM, WMS, SSH, GK
[Diagram, example: execution management. A JSAGA jobs collection sits above a tree World → Grid → EGEE, OpenPlast, CC-IN2P3, Globus, localhost, annotated with VOMS, WMS, gatekeeper, wsgram and the URL schemes srb://, srm://, gsiftp://, lfn://, http://, tar://]
Transfer path depends on…
■ When using a single grid infrastructure
  – all files can be transported to/from the worker nodes through a single storage node
■ When using several grid infrastructures
  – need to dynamically build a more complex transfer graph, according to…
[Diagram: JSAGA jobs collection and job desc. above JSAGA plug-ins and the tree World → Grid → EGEE, OpenPlast, CC-IN2P3, Globus, localhost, annotated with VOMS, WMS, gatekeeper, wsgram and the URL schemes srb://, srm://, gsiftp://, lfn://, http://, tar://, url://]
Transfer path depends on…
■ grid or site
  – network filtering policy
  – commands available on workers
  – services available from workers (close Storage Element, shared FS)
■ execution service
  – protocols supported for staging
■ transfer protocol
  – access mode (RO, WO, RW)
  – third-party transfer
  – supported context instances
  – supported data protection level
■ data to stage
  – shared by several jobs
  – installed on some worker nodes
  – file size
  – required data protection level
[Diagram: JSAGA jobs collection and job desc. above JSAGA plug-ins and the tree World → Grid → EGEE, OpenPlast, CC-IN2P3, Globus, localhost, annotated with VOMS, WMS, gatekeeper, wsgram and the URL schemes srb://, srm://, gsiftp://, lfn://, http://, tar://, url://]
Transfer path depends on…
■ grid or site
■ execution service
■ transfer protocol
■ data to stage
[Diagram, build step: files C, C' (common), R1 (result) and E1 (std-error) move between jobs on OPlast and EGEE over SMTP, SRB, GSIFTP and HTTP; other labels: CA, JSAGA jobs collection]
Transfer path depends on…
■ grid or site
■ execution service
■ transfer protocol
■ data to stage
[Diagram, build step: files C, C', C'' (common), E (executable), E src, D1 (input data), R1 (result) and E1 (std-error) move between jobs on OPlast and EGEE over SMTP, SRB, GSIFTP, HTTP, TAR and iGet; other labels: CA, JSAGA jobs collection]
Example of generated graph
[Diagram legend: C, C', C'' common; E executable; E src; D1 input data; R1 result; E1 std-error; labels: JSAGA jobs collection, OPlast]
Data flow example with several protocols used, but only 3 jobs submitted on 1 grid…
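The slides do not describe how the transfer graph is built. As a rough sketch of the idea, one can model storage locations as nodes and permitted direct transfers as edges (already filtered by the factors listed above), then search for the shortest chain of hops. The class, its method names and the breadth-first strategy are assumptions made for illustration and are not JSAGA's algorithm.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

/** Hypothetical transfer graph: nodes are locations, edges are permitted direct transfers. */
class TransferGraph {
    private final Map<String, List<String>> edges = new HashMap<>();

    /** Declare that a file can be copied directly from one location to another. */
    void allow(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /** Breadth-first search: shortest chain of hops, or an empty list if the target is unreachable. */
    List<String> path(String source, String target) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(source, null); // the source has no predecessor
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(target)) {
                // Walk back through the predecessors to rebuild the path.
                LinkedList<String> hops = new LinkedList<>();
                for (String n = node; n != null; n = parent.get(n)) {
                    hops.addFirst(n);
                }
                return hops;
            }
            for (String next : edges.getOrDefault(node, Collections.emptyList())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, node);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

In this model a single-infrastructure setup reduces to a two-hop path through one storage node, while several infrastructures yield longer paths that differ per file and per job.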
[Layered diagram; layer labels: Applications, JSAGA jobs collection, SAGA, JSAGA core engine + plug-ins, JSAGA plug-ins; role labels: end user, application developer, plug-ins developer]
■ Ready-to-use software, adapted to targeted scientific field
■ Hide heterogeneity between grid infrastructures
■ Hide heterogeneity between middlewares
■ As many interfaces as ways to implement each functionality
■ As many interfaces as used technologies
Command line interfaces
[Layer label: Applications]
■ JSAGA provides command line interfaces for…
  – security
    • jsaga-context-init
    • jsaga-context-info
    • jsaga-context-destroy
  – execution management
    • jsaga-job-run
    • jsaga-job-status
    • jsaga-job-cancel
  – data management
    • jsaga-cat
    • jsaga-cp
    • jsaga-ls
    • jsaga-mkdir
    • jsaga-mv
    • jsaga-rmdir
    • jsaga-stat
    • jsaga-test
    • jsaga-logical
Related projects
[Layer label: Applications]
■ JSAGA is used by…
  – Elis@ /
    • a web portal for submitting jobs to industrial and research grid infrastructures
  – JJS (Java Job Submission)
    • a tool for submitting jobs to EGEE
    • optimized for short-life jobs (resource selection based on QoS observed while submitting jobs)
  – JUX (Java Universal eXplorer)
    • a multi-protocol file browser
JUX – Overview
■ JUX is a file explorer designed to be independent of
  – Operating System [callout: full java code]
    • tested on Windows, Scientific Linux, Ubuntu, Mac
  – Data management protocol [callout: JSAGA]
    • tested with gsiftp, srb, irods, https, sftp, zip, (srm)
  – Security mechanism
    • tested with GSI, VOMS, Login/Password, X509, SSH
  – File content viewer
    • provided viewers: text file viewer, image viewer (png, gif, jpg, bmp, tiff, dicom), audio player (mp3, wav)
    • can use local applications (only for protocol "file://" on OS "Windows")
JUX – Overview
■ Data management and security
  – JUX does not only use the SAGA API
  – it also uses the JSAGA introspection API to discover…
    • the list of available protocols
    • the list of configured security contexts
    • the list of supported security context types, for each protocol
  – this allows JUX to be completely independent of the technologies used
    • just copy your own JSAGA plug-in into the JUX "lib/" directory to add support for a new technology!
Demo of JUX
… and then conclusion about JSAGA
Software quality
■ Build process fully automated, including…
  – build tools installation
  – code generation
  – testing
    • unit tests
    • integration tests
  – project web site generation
    • http://grid.in2p3.fr/jsaga/
  – installer GUI generation (see next slide…)
■ Plug-ins
  – external dependencies reduced
    • e.g. gLite-UI not needed
    • most plug-ins support [icons not recovered]
  – a maven 'archetype' generates the skeleton of a new plug-in project
  – plug-ins are automatically validated with a reusable SAGA test suite

    # SAGA protocols test-suite configuration
    gsiftp.base=gsiftp://ccrugceli01.in2p3.fr/tmp/
    gsiftp.base2=gsiftp://agena.c-s.fr/grid/tmp/
    gsiftp.context=OpenPlast_proxy
    https.base=http://grid.in2p3.fr/html/Private/
    https.context=Web_X509
    file.base=file:///c:/tmp/
    file.base2=file:///c:/
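The configuration shown on this slide follows the `key=value` syntax of Java properties files. As a small sketch, assuming the keys are of the form `<protocol>.<name>` as they appear above, such a file can be read with the standard `java.util.Properties` class. The helper class `TestSuiteConfig` is hypothetical; the slides do not show how the JSAGA test suite actually loads its configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical reader for a "<protocol>.<name>=value" test-suite configuration. */
class TestSuiteConfig {

    /** Parse the configuration text; lines starting with '#' are comments. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from a String should not fail, but load() declares IOException.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Value of "<protocol>.<name>", or null when the entry is absent. */
    static String get(Properties props, String protocol, String name) {
        return props.getProperty(protocol + "." + name);
    }
}
```

With the slide's content as input, `get(props, "gsiftp", "context")` would return `OpenPlast_proxy`; the colons inside URL values are harmless because only the first separator on a line ends the key.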
Installer GUI
License(s)
■ LGPL license
  – for the core engine and most plug-ins
■ Optional licenses
  – for plug-ins with external dependencies whose license is not compatible with LGPL
  – then, the end user must…
    • either accept the terms of the license agreement
    • or uncheck these plug-ins (see previous slide)
Summary
Main assets of JSAGA
■ Implement standard specifications from OGF
  – SAGA
  – JSDL
■ Provide a high-level abstraction layer with no sacrifice on efficiency or scalability
  – thanks to design (definition of plug-ins interface)
  – thanks to cache mechanisms
■ Use grid infrastructures as they are (i.e. no pre-requisite)
  – thanks to JSAGA
■ Hide heterogeneity
  – of middlewares
  – of grid infrastructures
[Diagram: tree World → Grid → EGEE, OpenPlast, CC-IN2P3, Globus, localhost, annotated with VOMS, WMS, gatekeeper, wsgram and the URL schemes srb://, srm://, gsiftp://, lfn://, http://, tar://]
Perspectives
■ Support new technologies
  – develop plug-ins
    • gLite-CREAM
    • French research grid middleware?
    • …
  – integrate plug-ins developed by partners
■ Implement new specifications
  – SAGA Extension: Service Discovery API
    • discussions on the candidate spec. have just finished; the final spec. should be available soon
    • JSAGA
      – has no equivalent for this
      – plug-in based implementation
  – JSDL Extension: Parameter Sweep Job
    • proposed for public comments
    • JSAGA does this in a nonstandard way
Backup slides
Plan
■ JJS
  – overview
  – summary and perspectives
■ JUX
  – overview
  – summary and perspectives
JJS – Performance
For short-life jobs, grid overhead is not negligible, so each step of job submission needs to be optimized:
→ job submission: multi-threaded
→ data staging: input/output files are grouped in tarballs
→ monitoring: get all job statuses with a single request
→ job life-time: waiting and running jobs have a timeout limit
…and last but not least: select the execution sites that are the most efficient for short-life jobs (based on observed QoS)
JJS – Performance (submission)
Time elapsed before entering state WAITING (i.e. time for transferring the input sandboxes + submitting the jobs)
Average time before entering state WAITING:     12 seconds
95% of jobs enter state WAITING before…:        15 seconds
JJS – Performance (monitoring)
Use a naming convention on the GSIFTP server instead of Globus monitoring (detecting job failure is not needed because all the jobs time out shortly…)
Average time for getting the status of all jobs: 3 seconds

Step                      File name extension   Job status
Input sandbox uploaded    .tar                  UPLOADED
Job submitted to CE       (.tar)                WAITING
Job started               .run                  RUNNING
Job completed             .res.tar              DONE
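The table above can be read as a mapping from file names to job states. The following sketch illustrates that mapping under stated assumptions: the class and method names are hypothetical, the job identifier is assumed to be the part of the file name before the first dot, and since `.tar` alone cannot distinguish UPLOADED from WAITING, the caller is assumed to know which jobs it has already submitted. None of this is taken from the JJS source code.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper: infers job states from file names listed on the GSIFTP server. */
class NamingConventionMonitor {
    // Declared in life-cycle order, so compareTo() tells which state is more advanced.
    enum Status { UPLOADED, WAITING, RUNNING, DONE, UNKNOWN }

    /** Status implied by one file name; 'submitted' tells whether the job was sent to a CE. */
    static Status statusOf(String fileName, boolean submitted) {
        // Test ".res.tar" before ".tar" because both names end with ".tar".
        if (fileName.endsWith(".res.tar")) return Status.DONE;
        if (fileName.endsWith(".run")) return Status.RUNNING;
        if (fileName.endsWith(".tar")) return submitted ? Status.WAITING : Status.UPLOADED;
        return Status.UNKNOWN;
    }

    /** One directory listing gives the state of every job: keep the most advanced state per job. */
    static Map<String, Status> statusOfAll(List<String> listing, Set<String> submittedJobIds) {
        Map<String, Status> result = new TreeMap<>();
        for (String name : listing) {
            int dot = name.indexOf('.');
            if (dot < 0) continue; // no extension: not a marker file
            String jobId = name.substring(0, dot);
            Status status = statusOf(name, submittedJobIds.contains(jobId));
            if (status == Status.UNKNOWN) continue;
            result.merge(jobId, status, (a, b) -> a.compareTo(b) >= 0 ? a : b);
        }
        return result;
    }
}
```

The point of the convention is visible in `statusOfAll`: a single directory listing replaces one monitoring request per job.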
JJS – Summary
■ Optimized for short-life jobs
  – QoS-based selection of execution sites
  – pragmatic usage of deployed grid technologies
■ Easy to install, configure and use
■ Robust
  – designed not to be sensitive to grid middleware failures
  – because it was developed when the grid was not mature (DATAGRID)
http://cc.in2p3.fr/docenligne/269
JJS – Perspectives
■ Finish integration of JSAGA
  – for job submission (SAGA)
  – for job collection management (JSDL Parameter Sweep Job Extension)
    • job description: independent of language
    • data staging: independent of protocols and infrastructure constraints
■ JJS is also waiting…
  – for the SRM data management JSAGA plug-in
  – for Service Discovery API (SAGA Extension) support in JSAGA
    • in order to enable efficient usage of SRM with short-life jobs (by discovering GSIFTP servers through the SRM web service)
Plan
■ JJS
  – overview
  – summary and perspectives
■ JUX
  – overview
  – summary and perspectives
JUX – Screenshots
The connection manager lets the user create connection profiles with a URL and a security context.
Only the security contexts compatible with the selected protocol appear in the popup list.
JUX – Screenshots
The connection is kept open until the nodes are collapsed (left side).
Several files can be copied with a single drag-and-drop.
JUX – Related work
■ Similar tools exist
  – HERMES (Australia)
  – VBrowser (Holland)
  [callout on the slide: based on Apache Commons VFS]
■ Using JSAGA for JUX makes it possible
  – to share development efforts with JJS (for data staging)
  – to manage logical files through a common interface (SAGA)
  – to apply protocol-specific optimizations
    • e.g. third-party transfer, filtered file list
  – to automatically recover from some errors
    • e.g. create parent directory if missing, retry if error is IncorrectState
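The "create parent directory if missing" recovery can be illustrated on the local file system with the standard `java.nio.file` API. This is only an analogy for the behaviour described above: the class `RecoveringWriter` is hypothetical, and the sketch uses local files rather than JSAGA or any grid protocol.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

/** Hypothetical local-file analogy of "create parent directory if missing, then retry". */
class RecoveringWriter {

    /** Write the content; if the parent directory is missing, create it and retry once. */
    static void write(Path target, byte[] content) {
        try {
            try {
                Files.write(target, content);
            } catch (NoSuchFileException missingParent) {
                Path parent = target.getParent();
                if (parent == null) {
                    throw missingParent; // nothing to create, report the original error
                }
                Files.createDirectories(parent);
                Files.write(target, content); // single retry after recovery
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Attempting the operation first and recovering only on failure keeps the common case to a single request, which matters more on a remote protocol than on a local disk.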
JUX – Summary
■ JUX can work with potentially any
  – protocol
  – security mechanism
  – file content
  [callout: you can develop the plug-ins missing for your use-case]
■ JUX is easy to use
  – targeted users are scientists
■ JUX is lightweight
  – currently 11 MB with all plug-ins
http://cc.in2p3.fr/docenligne/821
JUX – Perspectives (meta-data)

Name                    Value
DICOM Study Date        18/11/2008
DICOM Patient's Name    John Smith
DICOM Patient's Sex     M
DICOM Patient's Age     28
size                    2493827
JUX – Perspectives (meta-data)
[Screenshot: SEARCH form with the fields entry name (*.txt), Study Date, Patient's Name (John S*), Patient's Sex (M), Patient's Age and size, combined with "and", plus a "Recursive" option and a "Search" button]