
US LHC NWG
Dynamic Circuit Services in US LHCNet
Artur Barczyk, Caltech
Joint Techs Workshop, Honolulu, 01/23/2008
US LHCNet Overview
- Mission-oriented network: provide the trans-Atlantic network infrastructure to support the US LHC program
- Four PoPs: CERN, SARA, Starlight (→ Fermilab), Manlan (→ Brookhaven)
- 2008: 30 (40) Gbps trans-Atlantic bandwidth (roadmap: 80 Gbps by 2010)
Large Hadron Collider @ CERN
- Start in 2008: pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1
- 27 km tunnel in Switzerland & France
- 6000+ physicists & engineers, 250+ institutes, 60+ countries
- Experiments: ATLAS, CMS, ALICE, LHCb
- Physics: Higgs, SUSY, extra dimensions, CP violation, QG plasma, … and the Unexpected
- Challenges: analyze petabytes of complex data cooperatively; harness global computing, data & network resources
The LHC Data Grid Hierarchy
- CERN/Outside resource ratio ~1:4; T0/(ΣT1)/(ΣT2) ~1:2:2; ~40% of resources in Tier 2s
- US T1s and T2s connect to US LHCNet PoPs
- (Diagram: CERN Tier 0/online system; BNL T1 reached via USLHCNet + ESnet, Germany T1 via GEANT2 + NRENs; 10 and 10–40 Gbps links)
- Outside/CERN ratio larger; expanded role of Tier 1s & Tier 2s: greater reliance on networks
- Emerging vision: a richly structured, global dynamic system
The Roles of Tier Centers
- 11 Tier 1s, over 100 Tier 2s → LHC computing will be more dynamic & network-oriented
- Tier 0 (CERN): prompt calibration and alignment, reconstruction, store the complete set of RAW data
- Tier 1: reprocessing, store part of the processed data, physics analysis
- Tier 2: Monte Carlo production, physics analysis
- Tier 3: physics analysis
- The dynamism of the data transfers defines the requirements for Dynamic Circuit Services in US LHCNet
CMS Data Transfer Volume (May – Aug. 2007)
- 10 PetaBytes transferred over 4 months = 8.0 Gbps average (15 Gbps peak)
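As a quick back-of-the-envelope check of the quoted average (assuming 1 PB = 10^15 bytes and months of ~30 days):

```python
# Back-of-the-envelope check of the quoted average rate.
volume_bits = 10 * 1e15 * 8          # 10 PB in bits (1 PB taken as 1e15 bytes)
seconds = 4 * 30 * 24 * 3600         # ~4 months
avg_gbps = volume_bits / seconds / 1e9
print(f"{avg_gbps:.1f} Gbps")        # ~7.7 Gbps, i.e. roughly the 8.0 Gbps quoted
```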
End-System Capabilities Growing
- 88 Gbps peak (40 G in, 40 G out); 80+ Gbps sustainable for hours, storage-to-storage
Managed Data Transfers
- The scale of the problem and the capabilities of the end-systems require a managed approach with scheduled data transfer requests
- The dynamism of the data transfers defines the requirements for scheduling:
  - Tier 0 → Tier 1, linked to the duty cycle of the LHC
  - Tier 1 → Tier 1, whenever data sets are reprocessed
  - Tier 1 → Tier 2, distribute data sets for analysis
  - Tier 2 → Tier 1, distribute MC-produced data
- Transfer classes: fixed allocation, preemptible transfers, best effort (sketched below)
- Priorities and preemption: use LCAS to squeeze low(er)-priority circuits
- Interact with end-systems: verify and monitor capabilities
- All of this will happen "on demand" from the experiments' Data Management systems
- Needs to work end-to-end: collaboration in GLIF, DICE
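A minimal sketch of how the three transfer classes and LCAS-style squeezing of lower-priority circuits could be modelled; the class and function names here are illustrative assumptions, not the VINCI implementation:

```python
from dataclasses import dataclass
from enum import IntEnum

class TransferClass(IntEnum):
    # Higher value = higher priority when circuits compete for bandwidth.
    BEST_EFFORT = 0
    PREEMPTIBLE = 1
    FIXED_ALLOCATION = 2     # guaranteed; never squeezed

@dataclass
class Circuit:
    name: str
    cls: TransferClass
    rate_gbps: float         # currently provisioned rate (LCAS-adjustable)

def make_room(circuits, needed_gbps, link_capacity_gbps):
    """Squeeze lower-priority circuits (LCAS-style) until `needed_gbps` fits."""
    free = link_capacity_gbps - sum(c.rate_gbps for c in circuits)
    for c in sorted(circuits, key=lambda c: c.cls):    # lowest priority first
        if free >= needed_gbps:
            break
        if c.cls is TransferClass.FIXED_ALLOCATION:
            continue                                   # guaranteed circuits are untouched
        reclaim = min(c.rate_gbps, needed_gbps - free)
        c.rate_gbps -= reclaim
        free += reclaim
    return free >= needed_gbps
```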
Managed Network Services: Operations Scenario
- Receive request, check capabilities, schedule network resources:
  - "Transfer N Gigabytes from A to B with target throughput R1"
  - Authenticate/authorize/prioritize
  - Verify end-host rate capabilities: R2 (achievable rate)
  - Schedule bandwidth B > R2; estimate time to complete T(0)
  - Schedule path with priorities P(i) on segments S(i)
- Check progress periodically:
  - Compare rate R(t) to R2, update time to complete T(i-1) → T(i)
  - Trigger on behaviours requiring further action:
    - Error (e.g. segment failure)
    - Performance issues (e.g. poor progress, channel underutilized, long waits)
    - State change (e.g. new high-priority transfer submitted)
- Respond dynamically to match policies and optimize throughput (see the control-loop sketch below):
  - Change channel size(s)
  - Build alternative path(s)
  - Create new channel(s) and squeeze others in class
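The scenario above can be read as a control loop. The sketch below is schematic and assumes placeholder interfaces (`network`, `monitor` and their methods are not an actual US LHCNet API):

```python
import time

def run_transfer(request, network, monitor, poll_s=60):
    """Illustrative control loop for one managed transfer request."""
    # 1. Admission: authenticate/authorize and verify what the end-hosts can do.
    request = network.authorize(request)
    r2 = monitor.end_host_rate(request.src, request.dst)     # achievable rate R2 (Gbps)

    # 2. Scheduling: reserve bandwidth B > R2 and estimate time to complete T(0).
    circuit = network.schedule_path(request, bandwidth_gbps=r2 * 1.1)
    eta = request.size_gb * 8 / r2                            # seconds

    # 3. Periodic progress check, with dynamic response.
    while not monitor.done(request):
        time.sleep(poll_s)
        rate = monitor.current_rate(request)                  # R(t)
        eta = monitor.remaining_gb(request) * 8 / max(rate, 1e-3)
        if monitor.segment_failed(circuit):
            circuit = network.reroute(circuit)                # build an alternative path
        elif rate < 0.5 * r2:
            network.resize(circuit, bandwidth_gbps=r2 * 1.1)  # adjust channel size
    return eta
```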
Managed Network Services: End-System Integration
- Integration of network services and end-systems is required for a robust end-to-end production system
- Requires an end-to-end view of the network and the end-systems, with real-time monitoring:
  - Robust, real-time and scalable messaging infrastructure
  - Information extraction and correlation, e.g. network state, end-host state, transfer-queue state
  - Obtained via interactions between the network services and the end-host agents (EHA)
  - Provide sufficient information for decision support (see the sketch below)
- Cooperation of EHAs and network services:
  - Automate some operational decisions using accumulated experience
  - Increase the level of automation to respond to increases in usage, the number of users, and competition for scarce network resources
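For illustration, the kind of correlated state an end-host agent might report to the network services; the field names are assumptions, not the MonALISA/EHA message format:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EndHostReport:
    """Illustrative snapshot an end-host agent could publish to the network services."""
    host: str
    timestamp: float
    nic_speed_gbps: float          # capability advertised by the host
    disk_read_gbps: float          # measured storage throughput
    transfer_queue: List[str] = field(default_factory=list)   # pending transfer IDs

    def achievable_rate(self) -> float:
        # The end-to-end rate is limited by the slower of NIC and storage.
        return min(self.nic_speed_gbps, self.disk_read_gbps)
```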
Lightpaths in the US LHCNet Domain
- Dynamic setup and reservation of lightpaths has been successfully demonstrated by the VINCI project (Virtual Intelligent Networks for Computing Infrastructures in Physics) controlling the optical switches
- (Figure: VINCI control plane and data plane)
Planned Interfaces
- Most, if not all, LHC data transfers will cross more than one domain; e.g. to transfer data from CERN to Fermilab: CERN → US LHCNet → ESnet → Fermilab
- VINCI Control Plane for intra-domain provisioning, DCN (DICE/GLIF) IDC for inter-domain provisioning (a generic request sketch follows below)
- I-NNI: VINCI (custom) protocols
- E-NNI: Web Services (DCN IDC)
- UNI: DCN IDC? LambdaStation? TeraPaths?
- UNI: VINCI custom protocol, client = EHA
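For orientation only, a generic sketch of the parameters an inter-domain reservation request would carry; this is not the DCN IDC schema, and `idc_client` is a hypothetical client object:

```python
from dataclasses import dataclass

@dataclass
class CircuitReservation:
    """Generic inter-domain circuit request parameters (illustrative only)."""
    src_endpoint: str        # e.g. an edge port/VLAN in the CERN PoP
    dst_endpoint: str        # e.g. an edge port/VLAN facing ESnet
    bandwidth_mbps: int
    start_time: float        # epoch seconds
    end_time: float

def submit_reservation(idc_client, res: CircuitReservation):
    """Hand the request to whatever IDC client is in use (interface assumed here)."""
    return idc_client.create_reservation(
        source=res.src_endpoint,
        destination=res.dst_endpoint,
        bandwidth=res.bandwidth_mbps,
        start=res.start_time,
        end=res.end_time,
    )
```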
Protection Schemes
- Mesh protection at Layer 1
- US LHCNet links are assigned to primary users: CERN – Starlight for CMS, CERN – Manlan for ATLAS
- In case of a link failure we cannot blindly use bandwidth belonging to the other collaboration
- Carefully choose protection links, e.g. use the indirect path (CERN – SARA – Manlan) via Designated Transit Lists (DTLs) and DTL sets (toy example below)
- High-level protection features implemented in VINCI: re-provision lower-priority circuits; preemption, LCAS
- Needs to work end-to-end: collaboration in GLIF, DICE
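A toy example of the protection-link choice described above, preferring the indirect CERN – SARA – Manlan detour over the other collaboration's primary link (the link ownership is taken from the slide; everything else is illustrative):

```python
# Illustrative protection-path choice: avoid links owned by the other collaboration
# rather than blindly taking their bandwidth.
PRIMARY_OWNER = {
    ("CERN", "Starlight"): "CMS",
    ("CERN", "Manlan"): "ATLAS",
    ("CERN", "SARA"): None,          # not a primary assignment in this example
    ("SARA", "Manlan"): None,
}

def protection_path(candidates, affected_collaboration):
    """Return the first candidate path whose links are not owned by the other experiment."""
    for path in candidates:
        links = list(zip(path, path[1:]))
        owners = {PRIMARY_OWNER.get(l) or PRIMARY_OWNER.get((l[1], l[0])) for l in links}
        if all(o in (None, affected_collaboration) for o in owners):
            return path
    return None

# If CERN–Manlan fails for ATLAS, the indirect CERN–SARA–Manlan path is preferred
# over re-using the CMS-owned CERN–Starlight link.
print(protection_path([["CERN", "Starlight", "Manlan"],
                       ["CERN", "SARA", "Manlan"]], "ATLAS"))
```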
Basic Functionality To Date
- Semi-automatic intra-domain circuit provisioning
- Bandwidth adjustment (LCAS)
- End-host tuning by the End-Host Agent
- End-to-end monitoring
- Pre-production (R&D) setup with UltraLight routers, Ciena CoreDirectors and high-performance servers:
  - Local domain: routing of private IP subnets onto tagged VLANs (illustrated below)
  - Core network (TDM): VLAN-based virtual circuits
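A minimal illustration of the local-domain idea of mapping private IP subnets onto tagged VLANs; the subnets and VLAN IDs are invented for the example, not the actual configuration:

```python
import ipaddress

# Hypothetical mapping: private subnets routed onto tagged VLANs that the
# TDM core carries as virtual circuits.
SUBNET_TO_VLAN = {
    ipaddress.ip_network("10.1.1.0/24"): 3001,   # e.g. CERN-side transfer servers
    ipaddress.ip_network("10.1.2.0/24"): 3002,   # e.g. Starlight-side transfer servers
}

def vlan_for(addr: str):
    """Return the VLAN tag whose subnet contains the given address, if any."""
    ip = ipaddress.ip_address(addr)
    for subnet, vlan in SUBNET_TO_VLAN.items():
        if ip in subnet:
            return vlan
    return None

print(vlan_for("10.1.1.42"))   # -> 3001
```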
MonALISA: Monitoring the US LHCNet Ciena CDCI Network
- (Screenshot: the SARA, Starlight, Manlan and CERN Geneva nodes of the USLHCnet topology)
Roadmap Ahead
- Current capabilities: end-to-end monitoring, intra-domain circuit provisioning, end-host tuning by the End-Host Agent
- Towards a production system (intra-domain):
  - Integrate the existing end-host agent, monitoring and measurement services
  - Provide a uniform user/application interface
  - Integration with the experiments' Data Management Systems
  - Automated fault handling
  - Priority-based transfer scheduling
  - Include Authentication, Authorisation and Accounting (AAA)
- Towards a production system (inter-domain):
  - Interface to the DCN IDC
  - Work with DICE, GLIF on the IDC protocol specification
  - Topology exchange, routing, end-to-end path calculation
  - Extend the AAA infrastructure to multi-domain
Summary and Conclusions
- Movement of LHC data will be highly dynamic:
  - Follows the LHC data grid hierarchy
  - Different data sets (size, transfer speed and duration), different priorities
- Data Management requires network-awareness:
  - Guaranteed bandwidth end-to-end (storage-system to storage-system)
  - End-to-end monitoring including the end-systems
- We are developing the intra-domain control plane for US LHCNet:
  - VINCI project, based on the MonALISA framework
  - Many services and agents are already developed or in an advanced state
- Use Internet2's IDC protocol for inter-domain provisioning:
  - Collaboration with Internet2, ESnet, LambdaStation, TeraPaths on end-to-end circuit provisioning