- Number of slides: 38
Global Lambdas and Grids for Particle Physics in the LHC Era
Harvey B. Newman, California Institute of Technology
SC2005, Seattle, November 14-18, 2005
Beyond the SM: Great Questions of Particle Physics and Cosmology
1. Where does the pattern of particle families and masses come from?
2. Where are the Higgs particles; what is the mysterious Higgs field?
3. Why do neutrinos and quarks oscillate?
4. Is Nature Supersymmetric?
5. Why is any matter left in the universe?
6. Why is gravity so weak?
7. Are there extra space-time dimensions?
You Are Here. We do not know what makes up 95% of the universe.
Large Hadron Collider, CERN, Geneva: 2007 Start
pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1; 27 km tunnel in Switzerland & France
Experiments: ATLAS and CMS (pp, general purpose; heavy ions), TOTEM, ALICE (heavy ions), LHCb (B-physics)
5000+ physicists, 250+ institutes, 60+ countries
Physics: Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, … and the Unexpected
Challenges: analyze petabytes of complex data cooperatively; harness global computing, data & network resources
LHC Data Grid Hierarchy
CERN/Outside Resource Ratio ~1:2; Tier 0/(Σ Tier 1)/(Σ Tier 2) ~1:1:1
- Online System → Tier 0+1 (CERN Center: PBs of disk; tape robot) at ~PByte/sec from the experiment; ~150-1500 MBytes/sec recorded
- Tier 0+1 → Tier 1 centers (IN2P3, INFN, RAL, FNAL) at 10-40 Gbps
- Tier 1 → Tier 2 centers at ~10 Gbps
- Tier 2 → Tier 3 (institute physics data caches) at ~1-10 Gbps
- Tier 3 → Tier 4 (workstations) at 1 to 10 Gbps
Tens of Petabytes by 2007-8; an Exabyte ~5-7 years later
Emerging Vision: A Richly Structured, Global Dynamic System
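To make these link capacities concrete, a small back-of-the-envelope sketch; the link rates are those quoted above, while the 10 TB dataset size is a hypothetical example:

    # Time to move a hypothetical 10 TB dataset over each tier link (decimal units)
    dataset_bytes = 10e12
    for label, gbps in [("Tier0 -> Tier1", 40), ("Tier1 -> Tier2", 10), ("Tier2 -> Tier3", 1)]:
        hours = dataset_bytes * 8 / (gbps * 1e9) / 3600
        print(f"{label}: {hours:.1f} h at {gbps} Gbps")   # ~0.6 h, ~2.2 h, ~22 h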
Long Term Trends in Network Traffic Volumes: 300-1000X/10 Yrs
- ESnet Accepted Traffic 1990-2005 (W. Johnston): exponential growth, +82%/year for the last 15 years; 400X per decade [plot: Terabytes per month]
- SLAC traffic (R. Cottrell): 10 Gbit/s progress in steps; growth in steps of ~10X/4 years; projected ~2 Terabits/s by ~2014
- Summer '05: 2 x 10 Gbps links: one for production, one for R&D
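A quick consistency check of the quoted growth rates (a sketch; only the rates above come from the slide):

    # +82%/year compounded over a decade
    esnet_decade = 1.82 ** 10        # ~400, matching "400X per decade"
    # ~10X every 4 years, expressed per decade
    slac_decade = 10 ** (10 / 4)     # ~316, within the quoted 300-1000X/10 yrs
    print(round(esnet_decade), round(slac_decade))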
Internet2 Land Speed Record (LSR): 7.2 Gbps X 20.7 kkm [chart: Throughput (Petabit-m/sec); Internet2 LSRs, Blue = HEP]
- IPv4 multi-stream record with FAST TCP: 6.86 Gbps X 27 kkm (Nov. 2004)
- IPv6 record: 5.11 Gbps between Geneva and StarLight (Jan. 2005)
- Disk-to-disk marks: 536 MBytes/sec (Windows); 500 MBytes/sec (Linux)
- End-system issues: PCI-X bus, Linux kernel, NIC drivers, CPU
NB: Manufacturers' roadmaps for 2006: one server pair to one 10 G link [map: Nov. 2004 record network]
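The Land Speed Record metric is throughput multiplied by terrestrial distance; a minimal check of the IPv4 mark quoted above (the unit conversion is the only thing added here):

    # LSR metric = throughput x distance, in petabit-meters per second
    gbps, kkm = 6.86, 27.0                      # IPv4 multi-stream FAST TCP mark
    metric = (gbps * 1e9) * (kkm * 1e6)         # bit-m/s
    print(metric / 1e15, "Petabit-m/s")         # ~185 Petabit-m/s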
HENP Bandwidth Roadmap for Major Links (in Gbps) Continuing Trend: ~1000 Times Bandwidth Growth Per Decade; HEP: Co-Developer as well as Application Driver of Global Nets
LHCNet, ESnet Plan 2006-2009: 20-80 Gbps US-CERN, ESnet MANs, IRNC
[Map: production ESnet IP core (≥10 Gbps) and 2nd core / Science Data Network (30-50 Gbps, 40-60 Gbps circuit transport) with hubs at SEA, SNV, SDG, ALB, ELP, DEN, CHI, NYC, DC, ATL; metropolitan area rings to FNAL and BNL; links to AsiaPac, Australia, Japan and Europe (GEANT2, SURFNet, IN2P3, CERN); high-speed cross connects with Internet2/Abilene; major DOE Office of Science sites; NSF/IRNC circuits]
- LHCNet US-CERN wavelength triangle: 10/05: 10 G CHI + 10 G NY; 2007: 20 G + 20 G; 2009: ~40 G + 40 G
- LHCNet Data Network: 2 to 8 x 10 Gbps US-CERN
- GVA-AMS connection via SURFNet or GEANT2 (NSF/IRNC circuit)
- ESnet MANs to FNAL & BNL; dark fiber (60 Gbps) to FNAL
Global Lambdas for Particle Physics: Caltech/CACR and FNAL/SLAC Booths
- Preview global-scale data analysis of the LHC Era (2007-2020+), using next-generation networks and intelligent Grid systems
  - Using state-of-the-art WAN infrastructure and Grid-based Web service frameworks, based on the LHC Tiered Data Grid Architecture
  - Using a realistic mixture of streams: organized transfer of multi-TB event datasets, plus numerous smaller flows of physics data that absorb the remaining capacity
- The analysis software suites are based on the Grid-enabled Analysis Environment (GAE) developed at Caltech and U. Florida, as well as Xrootd from SLAC and dCache from FNAL
- Monitored by Caltech's MonALISA global monitoring and control system
Global Lambdas for Particle Physics: Caltech/CACR and FNAL/SLAC Booths
- We used twenty-two [*] 10 Gbps waves to carry bidirectional traffic between Fermilab, Caltech, SLAC, BNL, CERN and other partner Grid service sites including Michigan, Florida, Manchester, Rio de Janeiro (UERJ) and Sao Paulo (UNESP) in Brazil, Korea (KNU), and Japan (KEK)
- Results
  - 151 Gbps peak, 100+ Gbps of throughput sustained for hours; 475 Terabytes of physics data transported in < 24 hours
  - 131 Gbps measured by the SCInet BWC team on 17 of our waves
  - Using real physics applications and production as well as test systems for data access, transport and analysis: bbcp, Xrootd, dCache, and GridFTP; and Grid analysis tool suites
  - Linux kernel for TCP-based protocols, including Caltech's FAST
  - Far surpassing our previous SC2004 BWC record of 101 Gbps
[*] 15 at the Caltech/CACR and 7 at the FNAL/SLAC booth
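A quick sanity check of what these figures imply for average rates (a sketch using only the totals quoted above; decimal units assumed):

    TB = 1e12  # bytes
    # 475 TB moved in just under 24 hours -> average rate in Gbps
    avg_gbps = 475 * TB * 8 / (24 * 3600) / 1e9     # ~44 Gbps averaged over the day
    # Conversely, 100 Gbps sustained for a full day moves about a petabyte
    pb_per_day = 100e9 * 24 * 3600 / 8 / 1e15       # ~1.08 PB/day
    print(round(avg_gbps), round(pb_per_day, 2))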
Monitoring NLR, Abilene/HOPI, LHCNet, USNet, TeraGrid, PWave, SCInet, GLORIAD, JGN2, WHREN, other Int'l R&E Nets, and 14,000+ Grid Nodes Simultaneously (I. Legrand)
Switch and Server Interconnections at the Caltech Booth (#428)
- 15 10 G waves
- 72 nodes with 280+ cores
- 64 10 G switch ports: 2 fully populated Cisco 6509Es
- 45 Neterion 10 GbE NICs
- 200 SATA disks
- 40 Gbps (20 HBAs) to StorCloud
- Thursday-Sunday setup
http://monalisa-ul.caltech.edu:8080/stats?page=nodeinfo_sys
Fermilab
- Our BWC data sources are the production storage systems and file servers used by: CDF, DØ, US CMS Tier 1, and the Sloan Digital Sky Survey
- Each of these produces, stores and moves multi-TB to PB-scale data: tens of TB per day
- ~600 GridFTP servers (of 1000s) directly involved
Xrootd Server Performance (A. Hanushevsky)
- Scientific results
  - Ad hoc analysis of multi-TByte archives
  - Immediate exploration
  - Spurs novel discovery approaches
- Linear scaling
  - Hardware performance
  - Deterministic sizing
- High capacity
  - Thousands of clients
  - Hundreds of parallel streams
- Very low latency; excellent across WANs
  - 12 µs + transfer cost
  - Device + NIC limited
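A minimal sketch of the "fixed overhead + transfer cost" latency model the slide implies; the 12 µs figure is from the slide, while the function name, link rate, and request size are illustrative assumptions:

    def xrootd_read_time(bytes_requested, link_gbps, server_overhead_s=12e-6):
        """Rough per-request time: fixed server latency plus wire transfer cost."""
        return server_overhead_s + bytes_requested * 8 / (link_gbps * 1e9)

    # A 1 MB read over a 10 Gbps path is dominated by the transfer, not the 12 us
    print(xrootd_read_time(1_000_000, 10))   # ~0.0008 s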
Xrootd Clustering
[Diagram: a client asks the Redirector (head node) to "open file X"; the redirector asks its data servers, directly or via a Supervisor (sub-redirector), "who has file X?"; a server answers "I have it", and the client is told "go to C". The client sees all servers as xrootd data servers.]
- Unbounded clustering; self-organizing
- Total fault tolerance; automatic real-time reorganization
- Result: minimum admin overhead, better client CPU utilization; more results in less time at less cost
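A toy sketch of the redirect flow described above, not the actual xrootd protocol or API; the class and method names and file path are illustrative only:

    class DataServer:
        def __init__(self, name, files):
            self.name, self.files = name, set(files)

    class Redirector:
        """Head node: answers 'open file X' by locating a server that holds X."""
        def __init__(self, servers):
            self.servers = servers
        def open(self, path):
            for s in self.servers:              # "who has file X?"
                if path in s.files:             # "I have it"
                    return f"go to {s.name}"    # client is redirected to that server
            return "file not found"

    cluster = Redirector([DataServer("A", []), DataServer("C", ["/store/X.root"])])
    print(cluster.open("/store/X.root"))        # -> go to C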
Remote Sites (Caltech, UFL, Brazil, ...): ROOT Analysis with GAE Services
- Authenticated users automatically discover, and initiate multiple transfers of, physics datasets (ROOT files) through secure Clarens-based GAE services
- Transfers are monitored through MonALISA
- Once data arrives at the target (remote) sites, authenticated users can start analysis using the ROOT analysis framework
- Using the Clarens ROOT viewer or the COJAC event viewer, remote data can be presented transparently to the user
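As an illustration of the discover-and-transfer step above, here is what a client call to an XML-RPC based service of this kind might look like; the endpoint URL and method names are hypothetical, not the actual Clarens/GAE interface:

    import xmlrpc.client

    # Hypothetical Clarens-style endpoint (real deployments used X.509-authenticated
    # HTTPS); running this requires a live service at the URL below.
    server = xmlrpc.client.ServerProxy("https://gae.example.org:8443/clarens")

    # 1. Discover datasets matching a pattern (hypothetical method name).
    datasets = server.catalog.find("dataset:/store/relval/*.root")

    # 2. Request a monitored transfer of each dataset to the local site (hypothetical).
    for ds in datasets:
        job_id = server.transfer.start(ds, "site:Caltech_Tier2")
        print("transfer queued:", ds, job_id)

    # 3. Once the ROOT files arrive, analysis proceeds locally in the ROOT framework.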
SC|05 Abilene and HOPI Waves
GLORIAD: 10 Gbps Optical Ring Around the Globe by March 2007
China, Russia, Korea, Japan, US, Netherlands partnership; US: NSF IRNC Program
GLORIAD circuits today:
- 10 Gbps Hong Kong-Daejeon-Seattle
- 10 Gbps Seattle-Chicago-NYC (CANARIE contribution to GLORIAD)
- 2.5 Gbps Beijing-Hong Kong
- 622 Mbps Moscow-AMS-NYC
- 2.5 Gbps Moscow-AMS
- 155 Mbps Beijing-Khabarovsk-Moscow
- 1 GbE NYC-Chicago (CANARIE)
ESLEA/UKLight SC|05 Network Diagram [links: 6 x 1 GbE; OC-192]
KNU (Korea) Main Goals
- Uses the 10 Gbps GLORIAD link from Korea to the US, called BIG-GLORIAD, also part of UltraLight
- Try to saturate this BIG-GLORIAD link with servers and cluster storage connected at 10 Gbps
- Korea is planning to be a Tier-1 site for the LHC experiments
[Diagram: Korea — BIG-GLORIAD — U.S.]
KEK (Japan) at SC|05: 10 GbE Switches on the KEK-JGN2-StarLight Path
JGN2: 10 G network research testbed
- Operational since 4/04
- 10 Gbps L2 between Tsukuba and Tokyo (Otemachi)
- 10 Gbps IP to StarLight since August 2004
- 10 Gbps L2 to StarLight since September 2005
- Otemachi-Chicago OC-192 link replaced by 10 GbE WAN PHY in September 2005
Brazil HEPGrid: Rio de Janeiro (UERJ) and Sao Paulo (UNESP)
“Global Lambdas for Particle Physics”: A Worldwide Network & Grid Experiment
- We have previewed the IT challenges of next-generation science at the high energy frontier (for the LHC and other major programs)
  - Petabyte-scale datasets
  - Tens of national and transoceanic links at 10 Gbps (and up)
  - 100+ Gbps aggregate data transport sustained for hours; we reached a Petabyte/day transport rate for real physics data
- We set the scale and learned to gauge the difficulty of the global networks and transport systems required for the LHC mission
  - But we set up, shook down and successfully ran the system in < 1 week
- We have substantive take-aways from this marathon exercise
  - An optimized Linux (2.6.12 + FAST + NFSv4) kernel for data transport, after 7 full kernel-build cycles in 4 days
  - A newly optimized application-level copy program, bbcp, that matches the performance of iperf under some conditions
  - Extension of Xrootd, an optimized low-latency file access application for clusters, across the wide area
  - Understanding of the limits of 10 Gbps-capable systems under stress
“Global Lambdas for Particle Physics”: A Worldwide Network & Grid Experiment
- We are grateful to our many network partners: SCInet, LHCNet, StarLight, NLR, Internet2's Abilene and HOPI, ESnet, UltraScience Net, MiLR, FLR, CENIC, Pacific Wave, UKLight, TeraGrid, GLORIAD, AMPATH, RNP, ANSP, CANARIE and JGN2
- And to our partner projects: US CMS, US ATLAS, DØ, CDF, BaBar, US LHCNet, UltraLight, LambdaStation, TeraPaths, PPDG, GriPhyN/iVDGL, LHCNet, StorCloud, SLAC IEPM, ICFA/SCIC and Open Science Grid
- Our supporting agencies: DOE and NSF
- And for the generosity of our vendor supporters, especially Cisco Systems, Neterion, HP, IBM, and many others, who have made this possible
- And the Hudson Bay Fan Company…
Extra Slides Follow
Global Lambdas for Particle Physics Analysis: SC|05 Bandwidth Challenge Entry
Caltech, CERN, Fermilab, Florida, Manchester, Michigan, SLAC, Vanderbilt, Brazil, Korea, Japan, et al.
CERN's Large Hadron Collider experiments: data/compute/network intensive
Discovering the Higgs, Supersymmetry, or Extra Space-Dimensions with a Global Grid
Worldwide collaborations of physicists working together, while developing next-generation global network and Grid systems
[Diagram: a Clarens client connects over http/https to a Clarens web server (ACL, X.509, Discovery) using XML-RPC, SOAP, Java RMI or JSON-RPC; behind the server sit the analysis sandbox, 3rd-party applications, catalog, storage (datasets) and other services; the user starts a (remote) analysis by selecting a dataset]
Clarens provides:
- Authentication
- Access control on Web Services
- Remote file access (and access control on files)
- Discovery of Web Services and software
- Shell service: shell-like access to remote machines (managed by access control lists)
- Proxy certificate functionality
- Virtual Organization management and role management
As a portal, it is the user's point of access to a Grid system, providing an environment where the user can:
- Access Grid resources and services
- Execute and monitor Grid applications
- Collaborate with other users
- Use it as a one-stop shop for Grid needs
Portals can lower the barrier for users to access Web Services and use Grid-enabled applications
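Since the slide lists several RPC bindings, here is a minimal illustration of what one of them (JSON-RPC over HTTPS) looks like on the wire; the endpoint and method name are hypothetical, not the actual Clarens interface:

    import json, urllib.request

    # Hypothetical Clarens-style JSON-RPC call; real deployments authenticated
    # clients with X.509 (proxy) certificates over https.
    payload = json.dumps({
        "jsonrpc": "2.0",
        "method": "file.ls",            # hypothetical remote-file-access method
        "params": ["/grid/home/user"],
        "id": 1,
    }).encode()

    req = urllib.request.Request(
        "https://clarens.example.org/rpc",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    # response = urllib.request.urlopen(req)  # requires a reachable server and certs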