2b802319f800460fb67f92bf9265bc63.ppt
- Number of slides: 18
INFN - Ferrara
BaBar Meeting
SPGrid: status in Ferrara
Enrica Antonioli - Paolo Veronesi
Ferrara, 12/02/2003
Ferrara - 12/02/03

Topics
- The DataGrid project
- Ferrara Farm Configuration
- First SP submissions through the Grid
- Work in Progress
- Future Plans

European DataGrid and INFN-GRID
EDG
- 2001 - 2003
- Funded by European Union
- Computing Grids permit:
  - High Throughput Computing
  - Analysis of large dimension data
  - Sharing resources and data
- Applications involved:
  - Biomedical Sciences
  - Earth Observation
  - High Energy Physics
INFN-GRID
- Special project of INFN
- 2001 - 2003
- To manage and use computing resources distributed across sites
- Deployment of Testbed sites, in order to validate EDG software releases and to adapt them to High Energy Physics requests
- Current prototype of INFN DataGrid testbed connected to EDG testbed - US and ASIA
[Figure: network map. Labels: RAL, Manchester, CERN, PD, MI, FE, BO, TO, ROMA, CA, CT, GARR-B; links "To USA" and "To Russia/Japan"]

EDG Architecture and Services
[Figure: DataGRID architecture, layers from top to bottom]
- APPLICATION Layer: ALICE, ATLAS, CMS, LHCb, BaBar
- DataGRID Architecture: High level GRID middleware
- GLOBUS toolkit: Basic Services
- OS & Net services

Grid Elements in Ferrara
- The DataGrid Testbed consists of different types of machines (Grid Elements).
- In Ferrara the farm is composed of one Computing Element (CE), three Worker Nodes (WN), one User Interface (UI) and one Storage Element (SE).
- All these machines are managed by an LCFGng (Local ConFiGuration system, new generation) server and are configured automatically.
[Figure: LCFGng Server managing CE/WN, UI and SE]

User Interface
- UI (User Interface): the component for accessing the workload management system.
- Users can submit a job and retrieve the output; they should have an account and a personal certificate installed in their home directory.
Certificate Authorities
- To access the GRID you have to request a certificate from a certification authority.
- INFN-GRID users can obtain a certificate from the INFN CA (http://security.fi.infn.it/).
- To use the BaBar Grid, you must register that certificate with the BaBar Virtual Organisation (BaBar VO): http://www.slac.stanford.edu/BFROOT/www/Computing/Offline/BaBarGrid/registration.html

Job Submission
[Figure: job submission flow. Components: UI (with JDL), Resource Broker (RB), Replica Catalogue, Information Service (IS), Job Submission Service (JSS), Logging & Book-keeping (LB), Computing Element, Storage Element. Arrow labels: Job Submit Event, Input Sandbox, Output Sandbox, Job Status]
Job Status values shown: submitted, waiting, ready, scheduled, running, done, outputready, cleared

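The job status values on the slide can be read as a lifecycle. The following is a minimal sketch of that reading. The status names come from the slide, but the strictly linear ordering and the helper names `next_state` and `has_reached` are assumptions made for illustration, not part of the EDG software.

```python
# Status names as listed on the slide; treating them as a strictly
# linear sequence is an assumption made for this sketch.
JOB_STATES = [
    "submitted", "waiting", "ready", "scheduled",
    "running", "done", "outputready", "cleared",
]


def next_state(state):
    """Return the status that follows `state`, or None after the last one."""
    index = JOB_STATES.index(state.lower())
    if index + 1 < len(JOB_STATES):
        return JOB_STATES[index + 1]
    return None


def has_reached(state, milestone):
    """True if `state` is at or past `milestone` in the assumed sequence."""
    return JOB_STATES.index(state.lower()) >= JOB_STATES.index(milestone.lower())


if __name__ == "__main__":
    print(next_state("running"))
    print(has_reached("Scheduled", "done"))
```

A helper such as `has_reached(status, "outputready")` could be used to decide when it makes sense to try fetching the job output.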
SPGrid Farm in Ferrara
[Figure: farm layout. Ferrara (EDG 1.4.3): LCFGng Server, UI, CE-WN, SE with 250 GB SCSI RAID 0, Data server, Management. Other labels: RB (CNAF - Bologna), CERN, Lock server]

Configuration
- INFN Grid Testbed Status: EDG 1.4.3 (RedHat 6.2).
- A special BaBar software release (12.3.2y) has been built and installed to:
  - Write Kanga files
  - Run Moose on RH 6.2
- A special tag of ProdTools has been installed to perform tests.
- A pool of BaBar accounts (babar000, babar001, …) has been created in the EDG farm of Ferrara.
- Each member of the BaBar VO is able to submit jobs to the Ferrara farm through the RB located at CNAF (grid009g.cnaf.infn.it).

Current Status
- Created a JDL file to run Moose on Grid resources.
- Created scripts containing EDG commands to submit jobs, check their status and retrieve output files.
- A user can submit a range of runs.
- For each run a job is created and submitted to the Resource Broker, which then sends it to the Ferrara CE (grid0.fe.infn.it).
- The output file is then transferred to the closest SE (grid2.fe.infn.it).

Moose.jdl
    grid1> more Moose.jdl
    Executable    = "Moose.csh";
    InputSandbox  = {"Moose.csh", ".cshrc", "config.csh"};
    StdOutput     = "Moose.txt";
    StdError      = "Moose.log";
    OutputSandbox = {"Moose.txt", "Moose.log"};
Annotations on the slide:
- Moose.csh: similar to SP standard scripts (JobXsh)
- .cshrc: general environment configurations
- config.csh: config file for BaBar
Excerpt of Moose.csh (similar to SP standard scripts), ending with a Globus command to copy output files from WN to SE:
    […]
    tar -czvf run${RUNNUM}.tar.gz *.root
    globus-url-copy -vb file://`pwd`/run${RUNNUM}.tar.gz gsiftp://grid2.fe.infn.it/flatfiles/SE00/paolo/run${RUNNUM}.tar.gz

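The JDL file above is a flat list of `Attribute = value;` statements, so it can also be generated from a small data structure. The sketch below shows one way to do that. The attribute names and values are those on the slide; the function name `render_jdl` is hypothetical, and only string and list-of-string values are handled.

```python
def render_jdl(attrs):
    """Render a dict as JDL text: strings are quoted, lists become {...} sets."""
    lines = []
    for key, value in attrs.items():
        if isinstance(value, (list, tuple)):
            rendered = "{" + ", ".join('"%s"' % item for item in value) + "}"
        else:
            rendered = '"%s"' % value
        lines.append("%s = %s;" % (key, rendered))
    return "\n".join(lines) + "\n"


# Values copied from the Moose.jdl listing on the slide.
moose_jdl = render_jdl({
    "Executable": "Moose.csh",
    "InputSandbox": ["Moose.csh", ".cshrc", "config.csh"],
    "StdOutput": "Moose.txt",
    "StdError": "Moose.log",
    "OutputSandbox": ["Moose.txt", "Moose.log"],
})

if __name__ == "__main__":
    print(moose_jdl)
```

Generating the file this way keeps the sandbox lists in one place if more input files are added later.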
The launch script
    grid1> more launch
    #!/bin/tcsh -v
    @ num_f = $1
    @ fin = $2
    while ( $num_f <= $fin )
      ####build the run […]
      ####build a config.csh with the appropriate environment variables
      echo "#!/bin/tcsh -v" > config.csh
      […]
      #### now run the job
      dg-job-submit -o run$num_f.jobid -r grid0.fe.infn.it:2119/jobmanager-pbs-long Moose.jdl
      cd ..
      @ num_f++
    end
Annotations on the slide:
- $1 and $2: range of runs to submit
- For each run a job is created
- Runtime directories
- A config file is created for each run
- dg-job-submit: EDG job submission command

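The loop in the launch script amounts to one `dg-job-submit` call per run number. The sketch below is a dry run that only builds the argument lists and does not execute anything. The `-o` and `-r` options, the CE contact string and the JDL file name are taken from the slide; the function name `submit_commands` is hypothetical, and the elided directory and config steps are not reproduced.

```python
def submit_commands(first_run, last_run,
                    ce="grid0.fe.infn.it:2119/jobmanager-pbs-long",
                    jdl="Moose.jdl"):
    """Build one dg-job-submit argument list per run in [first_run, last_run]."""
    commands = []
    for run in range(first_run, last_run + 1):
        commands.append(
            ["dg-job-submit", "-o", "run%d.jobid" % run, "-r", ce, jdl]
        )
    return commands


if __name__ == "__main__":
    # Same range as the example invocation of ./launch in the talk.
    for argv in submit_commands(1962016, 1962017):
        print(" ".join(argv))
```

Keeping the commands as lists makes it straightforward to pass them to `subprocess.run` on a machine where the EDG tools are installed.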
Job Submission
    grid1> ./launch 1962016 1962017
    […]
    dg-job-submit -o run$num_f.jobid -r grid0.fe.infn.it:2119/jobmanager-pbs-long Moose.jdl
    Connecting to host grid009g.cnaf.infn.it, port 7771
    Logging to host grid009g.cnaf.infn.it, port 15830
    ========= dg-job-submit Success ========
    The job has been successfully submitted to the Resource Broker.
    Use dg-job-status command to check job current status.
    Your job identifier (dg_jobId) is:
    https://grid009g.cnaf.infn.it:7846/193.206.188.102/104224188091275?grid009g.cnaf.infn.it:7771
    The dg_jobId has been saved in the following file:
    /home/enrica/stress/1962016/run1962016.jobid
    […]
    grid1> ls
    1962016 1962017 Moose.csh Moose.jdl config.csh launch monitor retrieve
    grid1> ls 1962016/
    Moose.csh Moose.jdl config.csh run1962016.jobid
    grid1> ls 1962017/
    Moose.csh Moose.jdl config.csh run1962017.jobid
Annotations on the slide:
- 1962016 1962017: range of runs to submit
- grid009g.cnaf.infn.it: CNAF RB
- https://grid009g.cnaf.infn.it:7846/…: Job ID

The monitor script
    grid1> more monitor
    #!/bin/tcsh
    @ num_f = $1
    @ fin = $2
    while ( $num_f <= $fin )
      echo Run $num_f is `dg-job-status -i $num_f/run$num_f.jobid | grep Status`
      @ num_f++
    end
    grid1> ./monitor 1962016 1962017
    Run 1962016 is Status = Scheduled Status Reason = initial
    Run 1962017 is Status = Scheduled Status Reason = initial
Annotations on the slide:
- dg-job-status: EDG command
- Successive overlays of the output show the Status values Scheduled, Ready, Running and OutputReady, with the Status Reason values initial, accepted and job terminated.

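The monitor script relies on `grep Status`, which also matches the `Status Reason` line. A minimal sketch of a stricter parse follows. It assumes, based only on the output shown on the slide, that the status report contains a line of the form `Status = <value>`; the sample text and the function name `parse_status` are illustrative and not actual EDG output.

```python
import re

# Matches a line such as "Status = Scheduled" but not "Status Reason = initial",
# because "=" must follow "Status" with only whitespace in between.
STATUS_RE = re.compile(r"^\s*Status\s*=\s*(\S+)", re.MULTILINE)


def parse_status(report):
    """Return the job status value found in `report`, or None if absent."""
    match = STATUS_RE.search(report)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical report text, laid out one field per line (an assumption).
    sample = "Status = Scheduled\nStatus Reason = initial\n"
    print("Run 1962016 is", parse_status(sample))
```

Returning only the status value makes it easy to compare against a known state name before deciding to retrieve the output.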
The retrieve script
    grid1> more retrieve
    #!/bin/tcsh -v
    @ num_f = $1
    @ fin = $2
    while ( $num_f <= $fin )
      cd $num_f
      #### get logfiles
      dg-job-get-output -i run$num_f.jobid --dir $PWD
      #### get rootfiles
      globus-url-copy gsiftp://grid2.fe.infn.it/flatfiles/SE00/paolo/run$num_f.tar.gz file://`pwd`/run$num_f.tar.gz
      tar -xzvf run$num_f.tar.gz
      rm -f run$num_f.tar.gz
      #### delete rootfiles from SE
      globus-job-run grid2.fe.infn.it /bin/rm /flatfiles/SE00/paolo/run$num_f.tar.gz
      cd ..
      @ num_f++
    end
Annotations on the slide:
- dg-job-get-output: EDG command
- globus-url-copy: Globus command, direct copy of the file from SE to UI
- globus-job-run: Globus command, deletes the file from the SE

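For a single run, the retrieve script performs five steps in a fixed order. The sketch below lists them as argument lists without executing anything. The commands, options, host name and SE path are copied from the slide; the function names `se_path` and `retrieve_plan` and the `workdir` parameter are hypothetical.

```python
def se_path(run, base="/flatfiles/SE00/paolo"):
    """Path of the run's tarball on the Storage Element (path from the slide)."""
    return "%s/run%d.tar.gz" % (base, run)


def retrieve_plan(run, workdir, se_host="grid2.fe.infn.it"):
    """Return the retrieval steps for one run as a list of argument lists."""
    tarball = "run%d.tar.gz" % run
    remote = se_path(run)
    return [
        # 1. fetch the log files through the EDG workload management system
        ["dg-job-get-output", "-i", "run%d.jobid" % run, "--dir", workdir],
        # 2. copy the ROOT-file tarball directly from the SE to the UI
        ["globus-url-copy",
         "gsiftp://%s%s" % (se_host, remote),
         "file://%s/%s" % (workdir, tarball)],
        # 3. unpack, 4. remove the local tarball
        ["tar", "-xzvf", tarball],
        ["rm", "-f", tarball],
        # 5. delete the tarball from the SE
        ["globus-job-run", se_host, "/bin/rm", remote],
    ]


if __name__ == "__main__":
    for argv in retrieve_plan(1962016, "/home/enrica/stress/1962016"):
        print(" ".join(argv))
```

Printing the plan first gives a chance to check the paths before anything is deleted from the SE.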
Retrieving Output
    grid1> ls
    1962016 1962017 Moose.csh Moose.jdl config.csh launch monitor retrieve
    grid1> ls 1962016/
    150546318633191 Moose.csh Moose.jdl config.csh run1962016.jobid
    rootdef-aod.root rootdef-tag.root rootdef-tru.root
    grid1> ls 1962017/
    150551318931039 Moose.csh Moose.jdl config.csh run1962017.jobid
    rootdef-aod.root rootdef-tag.root rootdef-tru.root
    grid1> ls 1962016/150546318633191/
    Moose.log Moose.txt
    grid1> ls 1962017/150551318931039/
    Moose.log Moose.txt

Future Plans
Integration of the Moose application with EDG software releases:
1) Use of IC RB and others
2) MOOSE in RPM format
3) Install Objy DB on the SE
[Figure: SPGrid Farm, Ferrara. Labels: LCFGng Server, UI, CE-WN, SE, Data server, Management, Lock server, RB (UK), MOOSE RPM, Objectivity DB]

Documentation
- The DataGrid Project: http://eu-datagrid.web.cern.ch/eu-datagrid/default.htm
- EDG tutorials Archive Web Site: http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/loginex.html
- INFN-Grid Testbed: http://server11.infn.it/testbed-grid/
- BaBar-Grid: http://www.slac.stanford.edu/BFROOT/www/Computing/Offline/BaBarGrid/
- Status of the Farm in Ferrara: http://print.fe.infn.it/status/