Consorzio COMETA - Progetto PI2S2 FESR
Long term job submission and monitoring using grid services
Riccardo Bruno, INFN, Sez. CT
23/07/2007
Meeting sull'uso di applicazioni parallele in PI2S2
www.consorzio-cometa.it

Outline
• Long term job submission
  – MyProxyServer
  – Renewal
  – The renewal process and JDL tag
  – Long term job submission
• Long term job monitoring
  – Middleware tools
  – How to do monitoring efficiently
  – The Watchdog
  – Watchdog use example
  – The main script
  – The watchdog flow
  – The main script code
  – Some outputs
  – The future …
• References
Catania, Meeting sull'uso di applicazioni parallele in PI2S2, 23.07.2007

Long term job submission

MyProxyServer
– A proxy has a limited lifetime (default is 12 h)
  • Using a longer-lived proxy is a bad idea
– myproxy server:
  • myproxy-init --voms <vo> -s <myproxy-server>
    – Creates and stores a long term proxy certificate; -s specifies the hostname of the MyProxy server
  • myproxy-info
    – Gets information about the stored long-lived proxy
  • myproxy-get-delegation
    – Gets a new proxy from the MyProxy server
  • myproxy-destroy
    – Removes the stored proxy from the server
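
The four commands above form a simple credential lifecycle: store, inspect, retrieve, destroy. The sketch below strings them together, assuming the MyProxy client tools are installed on the UI. The `run` helper, the `DRY_RUN` switch and the `MYPROXY_HOST` and `VO` variables are illustrative names, not part of the tools; with `DRY_RUN=1` (the default here) the commands are only printed, so the sequence can be reviewed before anything touches the server.

```shell
#!/bin/bash
# Sketch only: DRY_RUN, run(), MYPROXY_HOST and VO are illustrative names.
MYPROXY_HOST=${MYPROXY_HOST:-grid001.ct.infn.it}
VO=${VO:-cometa}
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Walk through the lifecycle of a stored long term proxy.
# -s selects the MyProxy server, -d uses the certificate DN as user name.
myproxy_lifecycle() {
  run myproxy-init --voms "$VO" -s "$MYPROXY_HOST" -d    # store a long term proxy
  run myproxy-info -s "$MYPROXY_HOST" -d                 # inspect the stored proxy
  run myproxy-get-delegation -s "$MYPROXY_HOST" -d       # retrieve a fresh proxy
  run myproxy-destroy -s "$MYPROXY_HOST" -d              # remove the stored proxy
}
```

Calling `myproxy_lifecycle` prints the four commands; setting `DRY_RUN=0` first would execute them for real.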

Renewal
• A dedicated service on the RB can renew the proxy automatically: [edg-wl-renewd] - /etc/init.d/edg-wl-proxyrenewal
• Some dedicated flags are required when creating the long term proxy credential with myproxy-init:
  – -d : use the proxy certificate subject (DN) as the default username, instead of the LOGNAME environment variable
  – -n : don't prompt for passphrase

  bash-2.05b$ myproxy-init --voms cometa -d -n
  Your identity: /C=IT/O=GILDA/L=INFN Catania/CN=Riccardo Bruno/Email=riccardo.[email protected]infn.it
  Enter GRID pass phrase for this identity:
  Creating proxy ..... Done
  Proxy Verify OK
  Your proxy is valid until: Fri Jul 23 09:30:33 2007
  A proxy valid for 168 hours (7.0 days) for user /C=IT/O=GILDA/L=INFN Catania/CN=Riccardo Bruno/Email=riccardo.[email protected]infn.it now exists on grid001.ct.infn.it.

The renewal process and JDL tag
• 5 or 10 minutes before the proxy expires, the RB proxy renewal daemon performs the following steps:
  – contacts the MyProxyServer indicated in the JDL and asks for a new delegation
  – contacts the VOMS server to add the ACs
  – transfers the new VOMS-enabled proxy to the WNs running the job
• An additional attribute has to be added to the JDL:
  – MyProxyServer = "grid001.ct.infn.it";
    § This attribute tells the RB which MyProxy server has to be contacted to renew the credentials. Otherwise a default one is taken from the UI VO configuration settings: glite_wmsui.conf
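
As an illustration of where the attribute goes, the sketch below writes a minimal JDL file that includes `MyProxyServer`. The `write_jdl` helper and its arguments are hypothetical; the other attributes are ordinary job attributes for a bash-driven job and would need adapting to a real one.

```shell
#!/bin/bash
# write_jdl JDL_FILE SCRIPT MYPROXY_HOST
# Hypothetical helper: writes a minimal JDL whose MyProxyServer attribute
# names the server that the RB should contact for proxy renewal.
write_jdl() {
  local jdl=$1 script=$2 server=$3
  cat > "$jdl" <<EOF
Type = "Job";
JobType = "Normal";
Executable = "/bin/bash";
Arguments = "$script";
StdOutput = "file.out";
StdError = "file.err";
InputSandbox = {"$script"};
OutputSandbox = {"file.out", "file.err"};
MyProxyServer = "$server";
EOF
}
```

For example, `write_jdl testmyproxy.jdl script.sh grid001.ct.infn.it` produces a file ready to be passed to the submission command.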

Long term job submission
• Create the long term proxy on the MyProxy server
  – myproxy-init --voms cometa -d -n
• Create a new proxy or get the delegation from the MyProxy server
  – voms-proxy-init --voms cometa
  – myproxy-get-delegation -d -a $X509_USER_PROXY
    (Please notice that you must already have a valid proxy on the UI)
• Submit the job normally
  – edg-job-submit -o jid testmyproxy.jdl

  bash-2.05b$ myproxy-init --voms cometa -d -n
  Your identity: /C=IT/O=GILDA/L=INFN Catania/CN=Riccardo Bruno/Email=riccardo.[email protected]infn.it
  Enter GRID pass phrase for this identity:
  Creating proxy ..... Done
  Proxy Verify OK
  Your proxy is valid until: Fri Jul 23 09:30:33 2007
  A proxy valid for 168 hours (7.0 days) for user /C=IT/O=GILDA/L=INFN Catania/CN=Riccardo Bruno/Email=riccardo.[email protected]infn.it now exists on grid001.ct.infn.it.
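
The note about needing a valid proxy on the UI is easy to overlook, so a small guard can help. The sketch below is one possible wrapper around the steps above: `have_local_proxy` and `submit_long_job` are hypothetical names, and the check only verifies that `X509_USER_PROXY` points to a non-empty file, not that the proxy is still valid.

```shell
#!/bin/bash
# have_local_proxy: succeed only if X509_USER_PROXY names a non-empty file.
# (It does not verify that the proxy has not expired.)
have_local_proxy() {
  [ -n "${X509_USER_PROXY:-}" ] && [ -s "$X509_USER_PROXY" ]
}

# submit_long_job JDL_FILE
# Hypothetical wrapper: refuses to continue without a local proxy, then
# runs the commands from the slide in order.
submit_long_job() {
  local jdl=$1
  if ! have_local_proxy; then
    echo "no local proxy found: run 'voms-proxy-init --voms cometa' first" >&2
    return 1
  fi
  myproxy-init --voms cometa -d -n || return 1
  myproxy-get-delegation -d -a "$X509_USER_PROXY" || return 1
  edg-job-submit -o jid "$jdl"
}
```

Failing early keeps the user from typing the GRID pass phrase for `myproxy-init` only to have the delegation step fail afterwards.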

Renewal feedback
• This job has been executed with a delegated proxy 1 hr long (myproxy-get-delegation -d -t 1:00 -a $X509_USER_PROXY)
• The 1st call to voms-proxy-info returns 0:56:58 as time left
• After the job core execution, the 2nd call to voms-proxy-info gives 8:45:18 as time left
• Please notice also the different subjects:
    /C=IT/O=INFN/…/CN=proxy/CN=proxy/CN=limited proxy
    /C=IT/O=INFN/…/CN=proxy/CN=proxy/CN=limited proxy

  Starting at: 20070720124320
  subject : /C=IT/O=INFN/…/CN=proxy/CN=proxy/CN=limited proxy …
  type : limited proxy
  strength : 512 bits
  path : /tmp/globus-tmp.unime-wn-03.27834.0
  timeleft : 0:56:58
  === VO cometa extension information ===
  VO : cometa
  subject : /C=IT/O=INFN/OU=Personal Certificate/L=Catania/CN=Riccardo Bruno
  issuer : /C=IT/O=INFN/OU=Host/L=Catania/CN=voms.ct.infn.it
  attribute : /cometa/Role=NULL/Capability=NULL
  timeleft : 11:56:01
  … (other output from the job's core execution, just a sleep)
  subject : /C=IT/O=INFN/…/CN=proxy/CN=proxy/CN=limited proxy …
  type : limited proxy
  strength : 512 bits
  path : /tmp/globus-tmp.unime-wn-03.27834.0
  timeleft : 8:45:18
  === VO cometa extension information ===
  VO : cometa
  subject : /C=IT/O=INFN/OU=Personal Certificate/L=Catania/CN=Riccardo Bruno
  issuer : /C=IT/O=INFN/OU=Host/L=Catania/CN=voms.ct.infn.it
  attribute : /cometa/Role=NULL/Capability=NULL
  timeleft : 10:26:00
  Ending at: 20070720141321
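
The job above ran from 12:43:20 to 14:13:21, about 90 minutes, which is longer than the one-hour delegated proxy, so the growth of the time left is the evidence that a renewal took place. A job script could make the same comparison itself. The sketch below is a minimal helper for that, assuming time left is available in the H:MM:SS form shown in the output; `hms_to_seconds` and `proxy_was_renewed` are hypothetical names.

```shell
#!/bin/bash
# hms_to_seconds H:MM:SS
# Convert a "timeleft" value to seconds. The 10# prefix forces base 10 so
# that fields such as "08" or "09" are not parsed as invalid octal numbers.
hms_to_seconds() {
  local h m s
  IFS=: read -r h m s <<< "$1"
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# proxy_was_renewed BEFORE AFTER
# Succeed when the time left grew between two readings, which can only
# happen if the proxy was replaced in the meantime.
proxy_was_renewed() {
  [ "$(hms_to_seconds "$2")" -gt "$(hms_to_seconds "$1")" ]
}
```

With the two readings from the slide, `proxy_was_renewed 0:56:58 8:45:18` succeeds, since 3418 seconds became 31518 seconds.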

Long term jobs monitoring

Middleware tools
• Currently gLite offers the following services for monitoring job execution:
  – Interactive jobs, or direct use of X server communication via SSH tunneling
    § The user is forced to use an interactive JDL
    § The X client must be kept open for the whole job duration
  – Use of RGMA
    § Using dedicated producers requires code changes, which are not always possible
    § Code changes are error prone and need to be tested
  – Use of AMGA
    § Using the AMGA APIs requires code changes as well

How to do monitoring efficiently
• IDEA: perform the job monitoring still using grid services, in the least invasive way possible.
  – Observations:
    § Almost all jobs submitted on the grid are driven by shell scripts
      • Shell scripting makes it possible to get precious information in case of faults
      • Shell scripting can drive more complex batch processing
    § Both the SE and the file catalog can be used as the simplest IS on the grid
      • lfc-* and lcg-* tools are already available for file creation and retrieval
      • The latency of the CLI tools for the storage is very low compared to the duration of long term jobs
  – Requirements:
    § It would be useful to configure the monitoring tool according to the user's needs
      • A few shell environment variables can be used to configure the monitoring tool

The Watchdog
• The Watchdog is a shell script to be included in the main script.
  – Some watchdog features:
    § It starts in background before the long term job runs
    § The watchdog runs as long as the main job
    § The main script can stop it and wait until the watchdog has finished
    § Easily and highly configurable
    § The watchdog does not compromise the CPU power of the WN
    § The watchdog is really simple and its behavior can be extended by the user
• The best way to explain the watchdog is through a usage example …

Watchdog use example
• The simplest use case involves the following:
  – The JDL: script.jdl
  – The main script file: script.sh
  – The watchdog script file: watchdog.sh

[Diagram: InputSandbox (script.sh, watchdog.sh) → script.jdl → OutputSandbox (file.out, file.err, watchdog.out)]

  Type = "Job";
  JobType = "Normal";
  Executable = "/bin/bash";
  StdOutput = "file.out";
  StdError = "file.err";
  InputSandbox = {"watchdog.sh", "script.sh"};
  OutputSandbox = {"file.out", "file.err", "watchdog.out"};
  Arguments = "script.sh";

The main script
• It is good practice to give the main script a structure like the following:
  – Get information about the WN
  – Start the watchdog
  – Execute and control the main job
  – Stop the watchdog
  – Collect information about the job execution

The watchdog flow
[Flow diagram]
  – Initialization (File Catalog/SE: USERPATH/JobId)
  – Enter the loop
    § For each file in the list: take a snapshot (only increments are copied)
    § Repeat while the CTRL file exists
  – Create the notification file
[Configuration inputs shown in the diagram]
  – VO
  – USERPATH
  – File Catalog
  – SE
  – DELAY
  – List of files
  – CTRL file
  – NTFY file
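
The flow can be sketched in a few lines of shell. This is not the author's watchdog.sh: it is a local stand-in, under stated assumptions, that copies increments into a directory instead of registering them on the SE and file catalog with the lcg-* tools. All `WD_*` variable names and the `snapshot` and `watchdog` functions are hypothetical; only the control and notification file names (watchdog.ctrl, watchdog.done) follow the main script.

```shell
#!/bin/bash
# Sketch of the watchdog loop. A real watchdog would publish each snapshot
# to the SE and file catalog; here snapshots are written to a local
# directory. All WD_* names are illustrative.
WD_DELAY=${WD_DELAY:-60}                     # seconds between snapshots
WD_FILES=${WD_FILES:-"file.out file.err"}    # files to watch
WD_SNAPDIR=${WD_SNAPDIR:-./snapshots}        # stand-in for USERPATH/JobId
WD_CTRL=${WD_CTRL:-watchdog.ctrl}            # loop runs while this exists
WD_DONE=${WD_DONE:-watchdog.done}            # notification file

# snapshot FILE: copy only the bytes added since the previous snapshot.
snapshot() {
  local f=$1 name offfile last size stamp
  [ -f "$f" ] || return 0
  name=$(basename "$f")
  offfile="$WD_SNAPDIR/.$name.offset"
  last=0
  [ -f "$offfile" ] && last=$(cat "$offfile")
  size=$(( $(wc -c < "$f") ))
  if [ "$size" -gt "$last" ]; then
    stamp=$(date +%y%m%d%H%M%S)
    tail -c "+$((last + 1))" "$f" >> "$WD_SNAPDIR/${stamp}_$name"
    echo "$size" > "$offfile"
  fi
}

watchdog() {
  local f
  echo "Starting watchdog at: $(date +%y%m%d%H%M%S)"
  mkdir -p "$WD_SNAPDIR"
  rm -f "$WD_DONE"
  touch "$WD_CTRL"
  while [ -e "$WD_CTRL" ]; do
    for f in $WD_FILES; do snapshot "$f"; done
    sleep "$WD_DELAY"
  done
  for f in $WD_FILES; do snapshot "$f"; done   # pick up the last increment
  touch "$WD_DONE"
  echo "Ending watchdog at: $(date +%y%m%d%H%M%S)"
}
```

A main script would start it with `watchdog > watchdog.out &`, remove the control file when the job ends, and then wait for the notification file. Sleeping between snapshots is what keeps the CPU cost on the WN negligible.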

The main script code

  #
  # watchdog - Riccardo Bruno 200707
  #
  echo "Starting at: $(date +%y%m%d%H%M%S)"

  # get information about the WN
  HOSTNAME=$(hostname -f)
  USER=$(whoami)
  ARG1=$1
  LOCALDIR=$(pwd)
  echo "***************"
  echo "HOST: $HOSTNAME"
  echo "USER: $USER"
  echo "ARGS: $ARG1"
  echo "LOCALDIR is: $LOCALDIR"
  echo "HOMEDIR is: $HOME"
  echo "Content of home: "
  ls -l "$HOME"
  echo "Content of current dir: "
  ls -l .
  echo "***************"

  # start the watchdog in background
  chmod +x watchdog.sh
  ./watchdog.sh > watchdog.out &

  # the job itself: perform 8 iterations, 15 seconds each
  # (2 minutes)
  for i in $(seq 1 8)
  do
    echo "This is my output at: $(date +%y%m%d%H%M%S)"
    echo "This is my error at: $(date +%y%m%d%H%M%S)" 1>&2
    sleep 15
  done

  # stop the watchdog and wait for it
  rm -f watchdog.ctrl
  while [ ! -e watchdog.done ]
  do
    sleep 1
    echo "Waiting for watchdog: $(date +%y%m%d%H%M%S)"
  done
  echo "Watchdog closed"
  echo "done" 1>&2
  echo "Ending at: $(date +%y%m%d%H%M%S)"
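
One caveat with the final loop is that it waits forever if the watchdog dies before creating watchdog.done. A possible hardening, offered as a suggestion rather than as part of the original script, is to bound the wait. The `wait_for_file` helper below is a hypothetical name.

```shell
#!/bin/bash
# wait_for_file FILE TIMEOUT_SECONDS
# Poll once per second until FILE exists.
# Returns 0 when the file appears, 1 if the timeout elapses first.
wait_for_file() {
  local file=$1 timeout=$2 waited=0
  while [ ! -e "$file" ]; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

In the main script this would replace the while loop, for example `rm -f watchdog.ctrl; wait_for_file watchdog.done 120 || echo "watchdog did not finish" 1>&2`, with the timeout chosen to be comfortably larger than the watchdog delay.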

Some outputs

  [brunor@glite-tutor tmp]$ lfc-ls -l /grid/gilda/brunor/2DFfQYycd5guISZSU3ZdOQ
  -rw-rw-r-- 1 1023 102 2211 Jul 18 16:13 070718161318_testmyproxy.out
  -rw-rw-r-- 1 1023 102 85 Jul 18 16:14 070718161347_testmyproxy.err
  …

  [brunor@glite-tutor brunor_2DFfQYycd5guISZSU3ZdOQ]$ cat file.out
  Starting at: 070713155443
  ********************
  ********************
  This is my output at: 070713155443
  …
  This is my output at: 070713155633
  done
  Ending at: 070713155643

  [brunor@glite-tutor brunor_2DFfQYycd5guISZSU3ZdOQ]$ cat file.err
  This is my error at: 070713155443
  …

  [brunor@glite-tutor brunor_2DFfQYycd5guISZSU3ZdOQ]$ cat watchdog.out
  Starting watchdog at: 070713155443
  guid: 205a2902-89e0-4c68-b963-2facf30efb6f
  guid: a21f30b4-46cf-4e63-919b-ceb911bfe710
  …
  Ending watchdog at: 070713155443

The future …
• The watchdog can be easily improved
  – Use a special folder in the catalog as a virtual UI on the WN, allowing the user to issue shell commands:
      WD_USER_PATH//
        _file_1
        _file_2
        …
        _file_n
        UI/
          commands
          _cmdresult_1
          …
  – Use AMGA/RGMA CLI tools instead of the catalog

References
• The watchdog wiki
  – https://grid.ct.infn.it/twiki/bin/view/PI2S2/WatchdogUtility

Questions…