GRID (European Scientific Computing Infrastructure)
CERN currently comprises 20 European Member States: Austria, Belgium, Bulgaria, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Italy, the Netherlands, Norway, Poland, Portugal, the Slovak Republic, Spain, Sweden, Switzerland and the United Kingdom.
Observer and Non-member States: India, Israel, Japan, the Russian Federation, the United States of America, Turkey, Algeria, Argentina, Armenia, Australia, Azerbaijan, Belarus, Brazil, Canada, China, Croatia, Cyprus, Estonia, Georgia, Iceland, Iran, Ireland, Mexico, Morocco, Pakistan, Peru, Romania, Serbia, Slovenia, South Africa, South Korea, Taiwan and Ukraine.
Latvia, unlike Estonia, is not a CERN member state; students from Latvia cannot do summer internships there.
CERN: Tim Berners-Lee invented HTTP (WWW), 1990, 1993
CERN LHC: An image of one of the first lead-ion collisions seen by the ALICE experiment on 7 November 2010.
Data visualisation of a particle collision inside the Large Hadron Collider, recorded by the ALICE experiment.
LHC ATLAS: search for the Higgs boson
Greatest Mysteries: What Causes Gravity?
- E = mc²; v = g·t
- "Much of today's research in elementary particle physics focuses on the search for a particle called the Higgs boson. This particle is the one missing piece of our present understanding of the laws of nature, known as the Standard Model. This model describes three types of forces: electromagnetic interactions; strong interactions, which bind atomic nuclei; and the weak nuclear force, which governs beta decay. (The Standard Model does not describe the fourth force, gravity.)"
- "Over the next 15 years, we should begin to find a real understanding of the origin of mass."
- "Newton thought that gravity's force was instantaneous. Einstein assumed that it moved at the speed of light, but until now, no one had measured it."
- General relativity (GR) suggests that gravitation (unlike electromagnetic forces) is a pure geometric effect of curved spacetime, not a force of nature that propagates.
- Most scientists assume that gravity travels at the speed of light and that gravitational waves and gravitational radiation exist (a single positive experiment in 2003).
- I put my last dollar, the Higgs shall not be found, just as
The 'Standard Model' of Particle Physics
- Proposed by Abdus Salam, Glashow and Weinberg
- Tested by experiments at CERN and elsewhere
- Perfect agreement between theory and experiments in all laboratories
Open Questions beyond the Standard Model
- What is the origin of particle masses? Due to a Higgs boson? (LHC)
- Why so many flavours of matter particles? (LHC)
- What is the dark matter in the Universe? (LHC)
- Unification of fundamental forces? (LHC)
- Quantum theory of gravity? (LHC)
Why do Things Weigh?
- Newton: weight is proportional to mass
- Einstein: energy is related to mass
- Neither explained the origin of mass
- Where do the masses come from? Are masses due to the Higgs boson? (the physicists' Holy Grail)
Has the Higgs Boson been Discovered?
- Interesting hints around M_h = 125 GeV?
- CMS sees a broad enhancement
- ATLAS prefers 125 GeV
CERN, 100 m underground, ATLAS (2007)
CERN: in the pipe I am touching, protons (ionized hydrogen atoms) travel at about 99.999999% of the speed of light (only 3 m/s slower than the speed of light, 300000 km/s) (2017)
A global, federated e-Infrastructure
Related infrastructures: BalticGrid, SEE-GRID, OSG, EUMedGrid, NAREGI, EUChinaGrid, EUIndiaGrid, EELA
EGEE infrastructure:
- ~ 200 sites in 39 countries
- ~ 20 000 CPUs
- > 5 PB storage
- > 10 000 concurrent jobs per day
- > 60 Virtual Organisations
Scale of EGEE Production Service
[Chart: number of jobs per month (LHC, non-LHC, OPS, all), April 2006 to March 2007, vertical axis 0 to 3 000 000; annotation: 98k jobs/day]
Obtaining a Grid certificate
- Getting an approved BalticGrid certificate is the first step towards using the Grid
- Information: http://grid.lumii.lv/section/show/12
  - Domain of the Institution (domain.zz): lumii.lv
  - Common Name (John Smith): Janis Berzins
Certification Procedure
Creating a Certification Request
BalticGridCA-user.cnf
#
# OpenSSL configuration file for generating certificate requests for Baltic Grid CA.
#
# This definition stops the following lines choking if HOME isn't
# defined.
HOME = .
###RANDFILE = $ENV::HOME/.rnd

[ req ]
default_bits = 1024
default_keyfile = userkey.pem
default_md = sha1 # which md to use.
distinguished_name = req_distinguished_name
string_mask = nombstr

[ req_distinguished_name ]
0.domainComponent = Domain Component (org)
0.domainComponent_default = org
1.domainComponent = Domain Component (BalticGrid)
1.domainComponent_default = balticgrid
organizationalUnitName = Domain of the Institution (domain.zz)
commonName = Common Name (John Smith)
commonName_max = 64
Result
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,C280CE744C634255
[Base64-encoded encrypted private key]
-----END RSA PRIVATE KEY-----
-----BEGIN CERTIFICATE REQUEST-----
[Base64-encoded certificate request]
-----END CERTIFICATE REQUEST-----
Certificate
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 13 (0xd)
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: O=BalticGrid, CN=Baltic Grid Certification Authority
        Validity
            Not Before: Mar 24 12:30:32 2005 GMT
            Not After : Mar 24 12:30:32 2006 GMT
        Subject: O=BalticGrid, OU=latnet.lv, CN=Guntis Barzdins
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit): [hex bytes]
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Key Usage: critical
                Digital Signature, Non Repudiation, Key Encipherment, Data Encipherment
            X509v3 Subject Key Identifier:
                B3:0B:DD:96:09:86:37:1F:CF:5D:D5:78:5B:6D:AB:6F:D0:BC:5A:24
            X509v3 Authority Key Identifier:
                keyid:24:4E:75:31:6A:6C:DF:AA:4D:AD:C6:34:39:23:5F:18:DB:17:47:86
                DirName:/O=BalticGrid/CN=Baltic Grid Certification Authority
                serial:00
            X509v3 Certificate Policies:
                Policy: 1.3.6.1.4.1.19974.11.1.0.1
            X509v3 Issuer Alternative Name:
                URI:http://grid.eenet.ee/BalticGridCA/
    Signature Algorithm: sha1WithRSAEncryption
        [signature, hex bytes]
-----BEGIN CERTIFICATE-----
[Base64-encoded certificate]
-----END CERTIFICATE-----
Getting Started
1. Get a digital certificate (authentication: who you are)
   http://grid.lumii.lv/section/show/12
2. Join a Virtual Organisation (VO) (authorisation: what you are allowed to do). For LHC, join LCG and choose a VO
   https://voms.balticgrid.org:8443/voms/balticgrid/
3. Get access to a local User Interface (UI) machine and copy your files and certificate there
You will need this
- ssh zars.latnet.lv
  - User Interface machine (certified users will get accounts)
- openssl pkcs12 -export -in usercert.pem -inkey userkey.pem -out certificate.p12
  - Creates a copy of the certificate in a format usable by MS Explorer
- http://winscp.net/download/winscp380.exe
  - For copying files between zars.latnet.lv and the PC
Job Preparation
Prepare a file in Job Description Language (JDL):
####### athena.jdl #########
Executable = "athena.sh";
StdOutput = "athena.out";
StdError = "athena.err";
InputSandbox = {"athena.sh", "MyJobOptions.py", "MyAlg.cxx", "MyAlg.h", "MyAlg_entries.cxx", "MyAlg_load.cxx", "login_requirements", "Makefile"};
OutputSandbox = {"athena.out", "athena.err", "ntuple.root", "histo.root", "CLIDDBout.txt"};
Requirements = Member("VO-atlas-release-9.0.4", other.GlueHostApplicationSoftwareRunTimeEnvironment);
########################
Slide callouts: Executable is the script to run; InputSandbox lists the input files (job options, my C++ code); OutputSandbox lists the output files; Requirements chooses the ATLAS version (satisfied by ~32 sites).
Job Submission
Make a copy of your certificate to send out (~ once a day):
[lloyd@lcgui ~/atlas]$ grid-proxy-init
Your identity: /C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=steve lloyd
Enter GRID pass phrase for this identity:
Creating proxy .... Done
Your proxy is valid until: Thu Mar 17 03:25:06 2005
[lloyd@lcgui ~/atlas]$
Submit the job (--vo names the VO, -o names the file to hold job IDs, the last argument is the JDL file):
[lloyd@lcgui ~/atlas]$ edg-job-submit --vo atlas -o jobIDfile athena.jdl
Selected Virtual Organisation name (from --vo option): atlas
Connecting to host lxn1188.cern.ch, port 7772
Logging to host lxn1188.cern.ch, port 9002
================ edg-job-submit Success ==================
The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status.
Your job identifier (edg_jobId) is:
- https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ
The edg_jobId has been saved in the following file:
/home/lloyd/atlas/jobIDfile
==============================================
[lloyd@lcgui ~/atlas]$
Job Status
Find out its status:
[lloyd@lcgui ~/atlas]$ edg-job-status -i jobIDfile
---------------------------------
1 : https://lxn1188.cern.ch:9000/tKlZHxqEhuroJUhuhEBtSA
2 : https://lxn1188.cern.ch:9000/IJhkSObaAN5XDKBHPQLQyA
3 : https://lxn1188.cern.ch:9000/BMEOq90zqALvkriHdVeN7A
4 : https://lxn1188.cern.ch:9000/l6wist7SMq6jVePwQjHofg
5 : https://lxn1188.cern.ch:9000/wHl9Yl_puz9hZDMe1OYRyQ
6 : https://lxn1188.cern.ch:9000/PciXGNuAu7vZfcuWiGS3zQ
7 : https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ
a : all
q : quit
---------------------------------
Choose one or more edg_jobId(s) in the list - [1-7]all: 7
*******************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-short
reached on: Wed Mar 16 17:45:41 2005
*******************************
[lloyd@lcgui ~/atlas]$
Slide callouts: the listed jobs ran at Valencia, RAL, CERN and Taiwan; the selected job ran in Taiwan.
Job Retrieval
Retrieve the output:
[lloyd@lcgui ~/atlas]$ edg-job-get-output -dir . -i jobIDfile
Retrieving files from host: lxn1188.cern.ch ( for https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ )
*****************************************
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
- https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ
have been successfully retrieved and stored in the directory:
/home/lloyd/atlas/lloyd_0uDjtwbBbj8DTRetxYxoqQ
*****************************************
[lloyd@lcgui ~/atlas]$ ls -lt /home/lloyd/atlas/lloyd_0uDjtwbBbj8DTRetxYxoqQ
total 11024
-rw-r--r-- 1 lloyd hep      224 Mar 17 10:47 CLIDDBout.txt
-rw-r--r-- 1 lloyd hep    69536 Mar 17 10:47 ntuple.root
-rw-r--r-- 1 lloyd hep     5372 Mar 17 10:47 athena.err
-rw-r--r-- 1 lloyd hep 11185282 Mar 17 10:47 athena.out
A real, complete example
[guntisb@zars guntisb]$ tar -xvf tutor1.tar
[guntisb@zars guntisb]$ cd job1
[guntisb@zars job1]$ voms-proxy-init
Your identity: /DC=org/DC=balticgrid/OU=lumii.lv/CN=Guntis Barzdins
Enter GRID pass phrase:
Creating proxy .... Done
Your proxy is valid until Wed May 3 23:51:41 2006
[guntisb@zars job1]$ edg-job-submit --vo balticgrid job1.jdl
Selected Virtual Organisation name (from --vo option): balticgrid
Connecting to host grid3.mif.vu.lt, port 7772
Logging to host grid3.mif.vu.lt, port 9002
***********************************************
JOB SUBMIT OUTCOME
The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status.
Your job identifier (edg_jobId) is:
- https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
***********************************************
[guntisb@zars job1]$ edg-job-status https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
*******************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
Current Status: Ready
Status Reason: unavailable
Destination: grid2.mif.vu.lt:2119/jobmanager-lcgpbs-balticgrid
reached on: Wed May 3 13:16:56 2006
*******************************
[guntisb@zars job1]$ edg-job-status https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
*******************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: grid2.mif.vu.lt:2119/jobmanager-lcgpbs-balticgrid
reached on: Wed May 3 13:22:58 2006
*******************************
[guntisb@zars job1]$ edg-job-get-output https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
Retrieving files from host: grid3.mif.vu.lt ( for https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng )
*****************************************
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
- https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
have been successfully retrieved and stored in the directory:
/tmp/jobOutput/guntisb_NdAqueQcc5aARqLN0vwWng
*****************************************
[guntisb@zars job1]$ mkdir abc
[guntisb@zars job1]$ cp /tmp/jobOutput/guntisb_NdAqueQcc5aARqLN0vwWng/* abc/.
[guntisb@zars job1]$ ls -al abc
total 12
[Listing partly lost in extraction: entries ., .., stderr.log (0 bytes) and stdout.log (593 bytes), owner guntisb, dated May 3 15:29]
[guntisb@zars job1]$ more abc/stdout.log
*********** Job Start ***********
job1 started on Wed May 3 16:18:39 EEST 2006
Executing on:
Linux stud128.mif 2.4.32-grsec-i686-r1 #1 Št Sau 14 15:54:22 EET 2006 i686 GNU/Linux
Current working directory and contents:
/home/bg035/globus-tmp.stud128.918.0/WMS_stud128_010893_https_3a_2f_2fgrid3.mif.vu.lt_3a9000_2fNdAqueQcc5aARqLN0vwWng
total 8
-rw-r--r-- 1 bg035 balticgrid 694 May 3 16:18 job1.sh
-rw-r--r-- 1 bg035 balticgrid   0 May 3 16:18 stderr.log
-rw-r--r-- 1 bg035 balticgrid 357 May 3 16:18 stdout.log
*********** Job End ***********
[guntisb@zars job1]$
Complete Grid job samples
- http://www.ltn.lv/~guntis/unix/tutor1.tar
- http://www.ltn.lv/~guntis/unix/mpi.tgz
- Additional information
  - http://gridit.cnaf.infn.it/index.php?jobsubmit&type=1
  - Other EGEE project documents
e-Infrastructures: network layer
e-Infrastructures: grid layer
e-Infrastructures: data layer Standards OGF
DEISA
Microsoft Cluster Server SW
Typical Ways to Use MPI on Multicore Systems
- One MPI process per core
  - Each MPI process is a single thread
- One MPI process per node
  - MPI processes are multithreaded
  - One thread per core
  - aka Hybrid model
- Some combination of the two
[Figure labels: node, dual-core processor, process]
Scalability
- MPI can scale to very large systems
  - E.g. MPICH2 on 65K dual-core BlueGene/L nodes
- But will an existing application scale?
  - If it runs on 4K single-core nodes, it will run fine on 1K four-core nodes
  - But will it run on 4K four-core nodes?
- Hybrid programming model can help improve scalability
  - Shared-memory/threads programming on nodes
  - MPI across nodes
Single-Threaded MPI Programming
- Pros
  - Same paradigm developers are used to
  - Existing codes can make use of multicore
  - Easier to debug
- Cons
  - There may be a better shared-memory algorithm
  - Possible duplication of large arrays
MPICH2 Features to Support Multicore Architectures
- Nemesis: new "multimethod" communication subsystem for MPICH2
  - Optimized intranode and internode communication
[Figure labels: Shared memory, Network, Process 0, Process 1, Process 2, Node 0, Node 1]
Multithreaded MPI Programming
- Pros
  - Hybrid programming model
  - Use shared-memory algorithms where appropriate
  - One copy of a large array shared by all threads
- Cons
  - In general, threaded programs can be difficult to write, debug and verify (e.g., using pthreads)
- OpenMP and UPC make threaded programming easier
  - Language constructs to parallelize loops, etc.
MPI Supported Thread Levels
- MPI_THREAD_SINGLE
  - Only one user thread is allowed
- MPI_THREAD_FUNNELED
  - May have one or more threads, but only the "main" thread may make MPI calls
- MPI_THREAD_SERIALIZED
  - May have one or more threads, but only one thread can make MPI calls at a time. It is the application developer's responsibility to guarantee this.
- MPI_THREAD_MULTIPLE
  - May have one or more threads. Any thread can make MPI calls at any time (with certain conditions).
- MPICH2 supports MPI_THREAD_MULTIPLE
Using Multiple Threads in MPI
- The main thread must call MPI_Init_thread()
  - The application requests a thread level
  - MPI returns the thread level actually provided
  - These values need not be the same on every process
  - Hint: request only the level you need, to avoid unnecessary overhead for higher thread levels.
- MPI_Init_thread()
  - Called in place of MPI_Init()
  - Only the main thread should call this
  - The main thread (and only the main thread) must call MPI_Finalize
    - there is no MPI_Finalize_thread()
- MPI does not provide routines to create threads
  - That is left to the user
    - E.g., use pthreads, OpenMP, etc.
Allocating Processes/Threads to Cores
- Typically: one process/thread per core
- It may be beneficial to oversubscribe in some cases
  - If a process/thread is blocking most of the time
  - Examples:
    - Communication thread with compute threads
    - Master/slave: the master process distributes work, then waits for results
- Caution: some MPI implementations actively poll while waiting for a message
  - The communication thread would compete for cycles with compute threads
Binding Processes/Threads to Specific Cores
- Processes/threads can be bound to one or more cores
- Which core to choose?
- For best communication, locate threads that do a lot of communication close together
  - On the same node, the same processor, or cores sharing an L2 cache
- But cores on the same processor share cache and pins to the frontside bus
  - Might do better splitting up threads that do a lot of memory accesses
- Bind the communication thread to one of the cores for the compute threads of the same process
- I/O interrupt handling is statically mapped to one core
What is a thread?
- process:
  - an address space with 1 or more threads executing within that address space, and the required system resources for those threads
  - a program that is running
- thread:
  - a sequence of control within a process
  - shares the resources in that process
Advantages and Drawbacks of Threads
- Advantages:
  - the overhead for creating a thread is significantly less than that for creating a process (~ 2 milliseconds for threads)
  - multitasking, i.e., one process serves multiple clients
  - switching between threads requires the OS to do much less work than switching between processes
- Drawbacks:
  - not as widely available as longer established features
  - writing multithreaded programs requires more careful thought
  - more difficult to debug than single-threaded programs
  - on single-processor machines, creating several threads in a program may not necessarily produce an increase in performance (only so many CPU cycles to be had)
POSIX Threads (pthreads)
- IEEE's POSIX Threads Model:
  - programming model for threads on a UNIX platform
  - pthreads are included in the international standard ISO/IEC 9945-1
- pthreads programming model:
  - creation of threads
  - managing thread execution
  - managing the shared resources of the process
- main thread:
  - initial thread created when main() (in C) or PROGRAM (in Fortran) is invoked by the process loader
  - once in main(), the application has the ability to create daughter threads
  - if the main thread returns, the process terminates even if there are running threads in that process, unless special precautions are taken
  - to explicitly avoid terminating the entire process, use pthread_exit()
- thread termination methods:
  - implicit termination:
    - thread function execution is completed
  - explicit termination:
    - calling pthread_exit() within the thread
    - calling pthread_cancel() to terminate other threads
- for numerically intensive routines, it is suggested that the application create p threads if there are p available processors
Sample Pthreads Program in C++ and Fortran 90/95
- The program in C++ includes the pthread.h header file. Pthreads-related statements are preceded by the pthread_ prefix (except for semaphores). Knowing how to manipulate pointers is important.
- The program in Fortran 90/95 uses the f_pthread module. Pthreads-related statements are preceded by the f_pthread_ prefix (again, except for semaphores).
- Pthreads in Fortran is still not an industry-wide standard.
//********************************
// This is a sample threaded program in C++. The main thread creates
// 4 daughter threads. Each daughter thread simply prints out a message
// before exiting. Notice that I've set the thread attributes to joinable and
// of system scope.
//********************************
#include

// creating threads
for ( i=0; i

//******************************
// This is the function each thread is going to run. It simply asks
// the thread to print out a message. Notice the pointer acrobatics.
//******************************
void *thread_function( void *arg )
{
  int id;

  id = *((int *)arg);
  printf( "Hello from thread %d!\n", id );
  pthread_exit( NULL );
}
- How to compile:
  - in Linux use: > {C++ comp} -D_REENTRANT hello.cc -lpthread -o hello
  - it might also be necessary for some systems to define _POSIX_C_SOURCE (to 199506L)
- Creating a thread:
  int pthread_create( pthread_t *thread, const pthread_attr_t *attr, void *(*thread_function)(void *), void *arg );
  - first argument: pointer to the identifier of the created thread
  - second argument: thread attributes
  - third argument: pointer to the function the thread will execute
  - fourth argument: the argument of the executed function (usually a struct)
  - returns 0 for success
- Waiting for the threads to finish:
  int pthread_join( pthread_t thread, void **thread_return );
  - main thread will wait for daughter thread to finish
  - first argument: the thread to wait for
  - second argument: pointer to a pointer to the return value from the thread
  - returns 0 for success
  - threads should always be joined; otherwise, a thread might keep on running even when the main thread has already terminated
!********************************
! This is a sample threaded program in Fortran 90/95. The main thread
! creates 4 daughter threads. Each daughter thread simply prints out
! a message before exiting. Notice that I've set the thread attributes to
! be joinable and of system scope.
!********************************
PROGRAM hello
  USE f_pthread
  IMPLICIT NONE

  INTEGER, PARAMETER :: num_threads = 4
  INTEGER :: i, tmp, flag
  INTEGER, DIMENSION(num_threads) :: arg
  TYPE(f_pthread_t), DIMENSION(num_threads) :: thread
  TYPE(f_pthread_attr_t) :: attr
  EXTERNAL :: thread_function

  DO i = 1, num_threads
    arg(i) = i - 1
  END DO

  ! initialize and set the thread attributes
  tmp = f_pthread_attr_init( attr )
  tmp = f_pthread_attr_setdetachstate( attr, PTHREAD_CREATE_JOINABLE )
  tmp = f_pthread_attr_setscope( attr, PTHREAD_SCOPE_SYSTEM )

  ! this is an extra variable needed in fortran (not needed in C)
  flag = FLAG_DEFAULT

  ! creating threads
  DO i = 1, num_threads
    tmp = f_pthread_create( thread(i), attr, flag, thread_function, arg(i) )
    IF ( tmp /= 0 ) THEN
      WRITE (*,*) "Creating thread", i, "failed!"
      STOP
    END IF
  END DO

  ! joining threads
  DO i = 1, num_threads
    tmp = f_pthread_join( thread(i) )
    IF ( tmp /= 0 ) THEN
      WRITE (*,*) "Joining thread", i, "failed!"
      STOP
    END IF
  END DO
END PROGRAM hello

!*******************************
! This is the subroutine each thread is going to run. It simply asks
! the thread to print out a message. Notice that f_pthread_exit() is
! a subroutine call.
!*******************************
SUBROUTINE thread_function( id )
  IMPLICIT NONE

  INTEGER :: id, tmp

  WRITE (*,*) "Hello from thread", id
  CALL f_pthread_exit()
END SUBROUTINE thread_function
- How to compile:
  - only available in AIX 4.3 and above: > xlf95_r -lpthread hello.f -o hello
  - the compiler should be thread safe
- The concepts of creating and joining threads are the same as in C/C++, except that pointers are not directly involved in Fortran.
- Note that in Fortran some pthread calls are function calls while some are subroutine calls.
Thread Attributes
- detach state attribute:
  int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
  - detached: main thread continues working without waiting for the daughter threads to terminate
  - joinable: main thread waits for the daughter threads to terminate before continuing further
- contention scope attribute:
  int pthread_attr_setscope(pthread_attr_t *attr, int scope);
  - system scope: threads are mapped one-to-one on the OS's kernel threads (kernel threads are entities that are scheduled onto processors by the OS)
  - process scope: threads share a kernel thread with other process-scoped threads
Thread Synchronization Mechanisms
- Mutual exclusion (mutex):
  - guards against multiple threads modifying the same shared data simultaneously
  - provides locking/unlocking of critical code sections where shared data is modified
  - each thread waits for the mutex to be unlocked (by the thread that locked it) before performing the code section
- Basic Mutex Functions:
  int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *mutexattr);
  int pthread_mutex_lock(pthread_mutex_t *mutex);
  int pthread_mutex_unlock(pthread_mutex_t *mutex);
  int pthread_mutex_destroy(pthread_mutex_t *mutex);
  - a new data type named pthread_mutex_t is designated for mutexes
  - a mutex is like a key (to access the code section) that is handed to only one thread at a time
  - the attributes of a mutex can be controlled by using the pthread_mutex_init() function
  - the lock/unlock functions work in tandem
#include
- Counting Semaphores:
  - permit a limited number of threads to execute a section of the code
  - similar to mutexes
  - should include the semaphore.h header file
  - semaphore functions do not have pthread_ prefixes; instead, they have sem_ prefixes
- Basic Semaphore Functions:
  - creating a semaphore:
    int sem_init(sem_t *sem, int pshared, unsigned int value);
    - initializes a semaphore object pointed to by sem
    - pshared is a sharing option; a value of 0 means the semaphore is local to the calling process
    - gives an initial value to the semaphore
  - terminating a semaphore:
    int sem_destroy(sem_t *sem);
    - frees the resources allocated to the semaphore sem
    - usually called after pthread_join()
    - an error will occur if a semaphore is destroyed while a thread is waiting on it
  - semaphore control:
    int sem_post(sem_t *sem);
    int sem_wait(sem_t *sem);
    - sem_post atomically increases the value of a semaphore by 1, i.e., when 2 threads call sem_post simultaneously, the semaphore's value is increased by 2 (each call is applied atomically)
    - sem_wait atomically decreases the value of a semaphore by 1, but always waits until the semaphore has a non-zero value first
#include
void *thread_function( void *arg )
{
  sem_wait( &semaphore );
  perform_task_when_sem_open();
  ...
  pthread_exit( NULL );
}
- the main thread increments the semaphore's count value in the while loop
- the threads wait until the semaphore's count value is non-zero before performing perform_task_when_sem_open() and further
- daughter thread activities stop only when pthread_join() is called
- When mixing with MPI, the simplest way is to let only 1 thread handle the communications.
- So why threads? In some cases, it is the only viable approach.
MPI Programming with Six Routines
- Some programs can be written with only six routines
  - MPI_Init
  - MPI_Finalize
  - MPI_Comm_size
  - MPI_Comm_rank
  - MPI_Send
  - MPI_Recv
Non-blocking Communication
- Communication split into two parts
  - MPI_Isend or MPI_Irecv starts communication and returns a request data structure.
  - MPI_Wait (also MPI_Waitall, MPI_Waitany) uses the request as an argument and blocks until communication is complete.
  - MPI_Test uses the request as an argument and checks for completion.
- Advantages
  - No deadlocks
  - Overlap communication with computation
  - Exploit bi-directional communication
Identifying Processes
- MPI Communicator
  - Defines a group (set of ordered processes) and a context (a virtual network)
- Rank
  - Process number within the group
  - MPI_ANY_SOURCE will receive from any process
- Default communicator
  - MPI_COMM_WORLD: the whole group
Non-blocking send/recv buffers
- May not modify or read the message buffer between MPI_Irecv and MPI_Wait calls.
- May not modify or read the message buffer between MPI_Isend and MPI_Wait calls.
- May not have two MPI_Irecv pending on the same buffer.
- May not have two MPI_Isend pending on the same buffer.


