Eddy Caron, Lego Team from GRAAL
LEGO kick-off meeting, 10/02/06

Lego Team from GRAAL
• Anne Benoît (McF)
• Eddy Caron (McF)
• Frédéric Desprez (DR)
• Yves Caniou (McF)
• Raphaël Bolze (PhD)
• Pushpinder Kaur Chouhan (PhD)
• Jean-Sébastien Gay (PhD)
• Cedric Tedeschi (PhD)

DIET Architecture
[Architecture diagram: a client contacts the Master Agent (MA); MAs are interconnected through JXTA; requests flow down through Local Agents (LA) to the server front ends; the FAST library provides application modeling (LDAP) and system availabilities (NWS)]
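To make the request path through this hierarchy concrete, here is a minimal Python sketch (illustrative only; the class and method names are assumptions and do not correspond to DIET's actual CORBA interfaces) in which a request entering the Master Agent is forwarded down through Local Agents to the servers, and the best server estimate is propagated back up:

    # Toy model of a DIET-like agent hierarchy (hypothetical names).

    class Server:
        def __init__(self, name, services, load):
            self.name, self.services, self.load = name, services, load

        def estimate(self, service):
            # A server answers only for the services it actually offers.
            return (self.load, self.name) if service in self.services else None

    class Agent:
        """Master Agent or Local Agent: forwards the request to its children."""
        def __init__(self, name, children):
            self.name, self.children = name, children

        def estimate(self, service):
            # Gather the children's estimates and keep the best (lowest load).
            candidates = [e for c in self.children if (e := c.estimate(service))]
            return min(candidates) if candidates else None

    la1 = Agent("LA1", [Server("sed1", {"dgemm"}, 0.7), Server("sed2", {"dgemm"}, 0.2)])
    la2 = Agent("LA2", [Server("sed3", {"fft"}, 0.1)])
    ma = Agent("MA", [la1, la2])
    print(ma.estimate("dgemm"))   # (0.2, 'sed2'): the least loaded server offering dgemm

In the real platform, the FAST/NWS data shown in the diagram would replace the hard-coded load values.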

Data Management
Joint work with G. Antoniu, E. Caron, B. Del Fabbro, M. Jan

Data/replica management
• Two needs
  - Keep the data in place to reduce the overhead of communications between clients and servers
  - Replicate data whenever possible
• Two approaches for DIET
  - DTM (LIFC, Besançon)
    · Hierarchy similar to DIET's
    · Distributed data manager
    · Redistribution between servers
  - JuxMem (PARIS project, Rennes)
    · P2P data cache
• NetSolve
  - IBP (Internet Backplane Protocol): data cache
  - Request Sequencing to find data dependences
• Work done within the GridRPC Working Group (GGF)
  - Relations with workflow management

Data management with DTM within DIET
• Persistence at the server level
• To avoid useless data transfers
  - Between clients and servers
  - Between servers
  - Intermediate results (C, D)
  - "Transparent" for the client
• Data Manager / Loc Manager
  - Hierarchy mapped on the DIET one
  - Modularity
• Proposition to the Grid-RPC WG (GGF)
  - Data handles
  - Persistence flag
  - Data management functions
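A minimal sketch of the data-handle and persistence-flag idea (hypothetical names; the actual DIET/GridRPC calls differ): data sent once with a persistence flag stays on the server side, and later requests refer to it by handle instead of re-transferring it:

    import uuid

    class DataManager:
        """Server-side manager keeping persistent data addressed by handle."""
        def __init__(self):
            self.store = {}

        def put(self, data, persistent=False):
            handle = str(uuid.uuid4())
            if persistent:
                self.store[handle] = data      # kept for later requests
            return handle

        def get(self, handle):
            return self.store[handle]          # no new client/server transfer

    dm = DataManager()
    h_b = dm.put([[1, 2], [3, 4]], persistent=True)   # matrix B stays on the server
    # A first call computes C = A * B; a later call (e.g. D = E + C) can reuse
    # the intermediate result by shipping only its handle.
    print(dm.get(h_b))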

Performance (A = C * B)
[Performance chart]

Performance (C = A * B; D = E + C; A = tA)
[Performance chart]

JuxMem (PARIS project, IRISA, France)
• A peer-to-peer architecture for an in-memory data-sharing service
• Persistence and data coherency mechanisms
• Transparent data localization
[Diagram: JXTA, a toolbox for the development of P2P applications; a set of protocols; each peer has a unique ID and can communicate over several protocols (TCP/IP, HTTP, …), including across firewalls]

Visualization
Work with Raphaël Bolze

VizDIET: A visualization tool
• Current view of the DIET platform
• A postmortem analysis from log files is available
• Good scalability
• We can show:
  - Communication between agents
  - State of SeDs
  - Available services
  - Persistent data name information
  - CPU, memory and network load

LogService
• CORBA communications
• Message ordering and scheduling
• Message filtering
• System state

LogService & DIET
• LogService components:
  - LogManager (LM)
  - LogCentral
• Each LogManager receives information from its agent and sends it to LogCentral, outside the DIET structure
• VizDIET graphically shows all messages from LogService
• Message transfer from the agents goes through the LogManager
  - No disk storage
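As an illustration of that message path, here is a much simplified model (the real LogService is CORBA-based; the names below are assumptions): each agent's LogManager stamps its messages and forwards them to a central LogCentral, which keeps an ordered, filterable trace that a viewer such as VizDIET could consume:

    import time

    class LogCentral:
        """Central collector: orders and filters messages, no disk storage."""
        def __init__(self):
            self.trace = []

        def receive(self, timestamp, source, message):
            self.trace.append((timestamp, source, message))
            self.trace.sort()                      # keep messages ordered in time

        def filtered(self, keyword):
            return [m for m in self.trace if keyword in m[2]]

    class LogManager:
        """Attached to one agent; sends its messages out of the DIET structure."""
        def __init__(self, agent_name, central):
            self.agent_name, self.central = agent_name, central

        def log(self, message):
            self.central.receive(time.time(), self.agent_name, message)

    central = LogCentral()
    LogManager("MA", central).log("IN request dgemm")
    LogManager("LA1", central).log("FORWARD request dgemm")
    print(central.filtered("dgemm"))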

VizDIET v1.0
[Diagram: an XML description (DIET agents, DIET servers, physical machines, physical storage) is read by GoDIET, which performs the distributed DIET deployment; LogService collects the traces and feeds VizDIET]

Screenshot: Platform visualization

Screenshots: Statistics module

Platform Deployment
Work by E. Caron, P.-K. Chouhan and A. Legrand

GoDIET: A tool for automated DIET deployment
• Automates configuration, staging, execution and management of a distributed DIET platform
  - Supports experiments at large scale
  - Faster and easier bulk testing
  - Reduces errors & debugging time for users
• Constraints:
  - Simple XML file
  - Console & batch modes
  - Integrates with visualization tools and CORBA tools
[Written in Java]

DIET usage with contrib services
[Diagram: an XML description feeds GoDIET, which handles the distributed deployment and administration of DIET; LogService collects the traces and passes subsets of traces on to VizDIET]

Launch process
• GoDIET follows the DIET hierarchy in launch order
• For each element to be launched (see the sketch below):
  - Configuration file written to local disk [including parent agent, naming service location, hostname and/or port endpoint…]
  - Configuration file staged to remote disk (scp)
  - Remote command launched (ssh) [PID retrieved, stdout & stderr saved on request]
• Feedback from LogCentral is used to time the launch of the next element
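A rough sketch of that launch loop, assuming password-less ssh/scp and using made-up host names, paths and a made-up remote command (this is not GoDIET itself, which is written in Java, only the same sequence of steps):

    import os, subprocess, tempfile

    # Hypothetical hierarchy: each element knows its parent and its host.
    ELEMENTS = [
        {"name": "MA1",  "parent": None,  "host": "node-a.example.org"},
        {"name": "LA1",  "parent": "MA1", "host": "node-b.example.org"},
        {"name": "SeD1", "parent": "LA1", "host": "node-c.example.org"},
    ]

    def launch(element, remote_dir="/tmp"):
        # 1. Write the configuration file on the local disk.
        cfg = f"name = {element['name']}\nparent = {element['parent']}\n"
        with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as f:
            f.write(cfg)
            local_cfg = f.name
        remote_cfg = f"{remote_dir}/{element['name']}.cfg"
        # 2. Stage the file on the remote disk, then 3. launch the remote command.
        subprocess.run(["scp", local_cfg, f"{element['host']}:{remote_cfg}"], check=True)
        subprocess.run(["ssh", element["host"], f"startElement {remote_cfg}"], check=True)
        os.unlink(local_cfg)

    for e in ELEMENTS:      # parents first, so that children can register with them
        launch(e)
        # GoDIET waits here for feedback from LogCentral before launching the next element.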

GoDIET Console
• java -jar GoDIET.jar vthd4site.xml

GoDIET: before launch

GoDIET: after launch
• 27-second launch, including waiting for feedback

Grid'5000 DIET deployment
• 7 sites / 8 clusters
  - Bordeaux, Lille, Lyon, Orsay, Rennes, Sophia, Toulouse
• 1 MA
• 8 LAs
• 574 SeDs

Scheduling
Work with Alan Su, Peter Frauenkron, Eric Boix

The scheduling
• Plug-in scheduler
• Round-robin as the default scheduling (see the sketch below)
• Advanced scheduling is only possible with more information; existing schedulers in DIET use data from FAST and/or NWS
• Limitations:
  - Deployment of appropriate hierarchies for a given grid platform is non-obvious
  - Limited consideration of inter-task factors
  - Non-standard application- and platform-specific performance measures
  - FAST, NWS: low availability, SeD idles; for NWS, no default weighting, which is difficult (possible?)
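The default policy mentioned above can be shown in a few lines (illustrative, not DIET code): plain round-robin rotation over the servers that offer a service, which is all that remains possible when no FAST/NWS performance data is available:

    from itertools import cycle

    class RoundRobinScheduler:
        """Default policy: rotate through the candidate servers of each service."""
        def __init__(self, servers_by_service):
            self._cycles = {svc: cycle(srvs) for svc, srvs in servers_by_service.items()}

        def select(self, service):
            return next(self._cycles[service])

    sched = RoundRobinScheduler({"dgemm": ["sed1", "sed2", "sed3"]})
    print([sched.select("dgemm") for _ in range(5)])   # sed1, sed2, sed3, sed1, sed2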

Plug-in scheduling
• Plug-in scheduling facilities to enable application-specific definitions of appropriate performance metrics
• An extensible measurement system
• Tunable comparison/aggregation routines for scheduling composite requirements
• Enables various selection methods:
  - Basic resource availability: processor speed, memory
  - Database contention
  - Future requests
• Before → after, per component (a sketch of the aggregation idea follows below):
  - SeD: automatic performance estimate (FAST/NWS) → chosen/defined by the application programmer
  - Agents: exec. time sorting → "menu" of aggregation methods
  - Client: CLIENT CODE UNCHANGED
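The sketch below illustrates the aggregation idea with hypothetical metric names (the real DIET estimation vectors and aggregation methods differ): each SeD fills in application-chosen metrics, and the agent ranks the candidates with a tunable routine instead of a hard-wired execution-time sort:

    # Each SeD reports an application-defined estimation vector.
    estimations = [
        {"sed": "sed1", "free_cpu": 0.9, "free_mem_gb": 4,  "db_contention": 0.7},
        {"sed": "sed2", "free_cpu": 0.4, "free_mem_gb": 16, "db_contention": 0.1},
    ]

    def make_aggregator(weights):
        """Tunable comparison routine chosen by the application programmer."""
        def score(est):
            return sum(w * est[metric] for metric, w in weights.items())
        return score

    # One application cares mostly about CPU, another about database contention.
    cpu_heavy = make_aggregator({"free_cpu": 1.0, "free_mem_gb": 0.01})
    db_heavy  = make_aggregator({"free_cpu": 0.1, "db_contention": -1.0})

    print(max(estimations, key=cpu_heavy)["sed"])   # sed1
    print(max(estimations, key=db_heavy)["sed"])    # sed2

The client code stays unchanged in both cases: only the SeD-side metric definition and the agent-side aggregation are swapped.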

CoRI
• Collector: an easy interface for gathering performance and load information about a specific SeD
• Two modules (currently): CoRI-Easy and FAST
• Possible to extend with new modules: Ganglia, Nagios, R-GMA, Hawkeye, INCA, MDS, …
• CoRI-Easy:
  - Uses fast and basic functions or simple performance tests
  - Keeps the independence of DIET
  - Able to run on "all" operating systems, allowing a default scheduling with basic information
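A sketch of the collector idea under assumed names: a single gather() entry point hides which module produced the measurements (a CoRI-Easy-style basic probe here; FAST, Ganglia or others could be registered the same way), so a default scheduling remains possible even with only basic information:

    import os, shutil

    def cori_easy():
        """Basic, portable probes only: enough for a default scheduling."""
        return {
            "cpu_count": os.cpu_count(),
            "free_disk_gb": shutil.disk_usage("/").free / 1e9,
            "load_avg": os.getloadavg()[0] if hasattr(os, "getloadavg") else None,
        }

    MODULES = {"CoRI-Easy": cori_easy}      # FAST, Ganglia, ... would be added here

    def gather(module="CoRI-Easy"):
        """Uniform interface: callers never see which collector actually ran."""
        return MODULES[module]()

    print(gather())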

Batch and parallel submissions
Work with Yves Caniou

Difficulties of the problem
• Several SeD types: SeD_seq, SeD_parallel, SeD_batch (all behind an agent)
• Parallel or sequential jobs
• Submit a parallel job (pdgemm, …)
• Transparent for the user
• General API

SeD_parallel
• SeD_parallel runs on the cluster front-end node
• Submitting a parallel job is system dependent:
  - NFS: copy the code?
  - MPI: LAM, MPICH?
  - Reservation?
  - Monitoring & performance prediction

SeD_batch
• SeD_batch runs on the cluster front-end node
• Submitting a parallel job is even more system dependent:
  - The previously mentioned problems
  - Numerous batch systems (OAR, SGE, LSF, PBS, Condor, LoadLeveler, …) → homogenization (GLUE layer)? A small sketch follows below
  - Batch scheduler behavior → queues, scripts, etc.
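The homogenization question can be made concrete with a tiny wrapper (a sketch only: the submission commands named below are the usual front-ends of these batch systems, but their resource options vary widely in practice, and the script path is hypothetical): one generic job description is translated into a system-specific submission command:

    def submit_command(system, script, nodes, walltime):
        """Map one generic job description onto a batch-system-specific command."""
        if system == "OAR":
            return ["oarsub", "-l", f"nodes={nodes},walltime={walltime}", script]
        if system == "PBS":
            return ["qsub", "-l", f"nodes={nodes},walltime={walltime}", script]
        if system == "Condor":
            # Resources are described inside the submit description file itself.
            return ["condor_submit", script]
        raise ValueError(f"unsupported batch system: {system}")

    for bs in ("OAR", "PBS", "Condor"):
        print(" ".join(submit_command(bs, "job.sh", 4, "01:00:00")))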

Batch & parallel submissions
• Asynchronous, long-term production jobs
• Still more problems:
  - System dependent: numerous batch systems and their behaviors
  - Performance prediction!
    · Application makespan as a function of the number of processors?
    · If reservation is available, how to compute the deadline?
  - Scheduling problems
    · Do we reserve when probing? How long do we hold the reservation?
    · How to manage data transfers while waiting in the queue?
  - Co-scheduling?
  - Data & job migration?

Future Work

Future work
• LEGO applications with DIET
  - CRAL (RAMSES)
  - CERFACS
  - TLSE (update)
• Components and DIET
  - Which architecture?
• Deployment
  - Link between ADAGE and the theoretical solution on a cluster [IJHPCA 06]?
  - Anne Benoît's approach
• …

Questions?
http://graal.ens-lyon.fr/DIET