- Number of slides: 15
The 24 h cache in the cloud
TT-GISC – Brasilia – 13-16 October 2015
Rémy Giraud (Météo-France)
A short history
– For GISC-to-GISC communication, and in particular for the exchange of « Global Exchange » data, the current architecture is based on an any-to-any unicast solution
– This solution is:
  • Politically very challenging (some bilateral links are virtually impossible to set up)
  • Technically difficult to establish, monitor and maintain
  • Financially unattractive, as the same data is sent multiple times over an expensive network
– In 2014, at TT-GISC and ET-CTS, a cloud-based solution was presented
– It was then agreed to run a pilot in 2015 to assess whether this option was viable
– The suitability of this solution will be established based on:
  • The technical outcome of the pilot
  • The result of a questionnaire (it appears that some organisations forbid the storage of their data in the cloud)
  • The financial and contractual aspects
  • The agreement of the GISCs to proceed with such a solution
The should and the may
What we should have now…
What we may have tomorrow…
The pilot
§ The pilot is managed by ET-CTS (R. Giraud) with the support of H. Kiehl (DWD)
§ From a technical point of view, we are using AFD (Automatic File Distributor). The dataflow is presented on the next slide.
§ As it is a test, it uses only the Internet.
§ The pilot is limited to data exchange; metadata are not included
§ The timeline of the pilot:
  o 2015
    Ø May:
      » ET-CTS conference call to present and discuss the plan
      » Based on the ET-CTS outcome, call to TT-GISC for volunteers
      » ECMWF, as part of its own evaluation of a cloud-based approach for its dissemination, has agreed to « loan » two VMs for up to a year starting in May 2015
    Ø June:
      » Installation of the system and configuration of AFD
    Ø July:
      » Interested GISCs will be invited to join the pilot from July
    Ø September:
      » Each GISC will be able to join in a piecemeal fashion
      » All GISCs in the pilot will be on board
      » Issue the questionnaire
    Ø We will then run a three-month evaluation
  o 2016
    Ø January: we will prepare the report for the ET-CTS 2016 session
The potential dataflow
[Diagram: the file system of the VM in the cloud. Under /data there is one directory per GISC (/GISC A, /GISC B, … /GISC I) with /incoming and /outgoing, plus the /24 h cache.]
• GISC A uploads files to its /incoming directory; GISC B uploads files to its /incoming directory.
• AFD copies the files to the other GISCs' /outgoing directories and to the 24 h cache.
• AFD pushes the files to the other GISCs.
• /24 h cache: CRON clears files older than 24 h.
• The files in /incoming are deleted after processing.
• The files in /outgoing remain available if a GISC wants to download data again.
• The files in the cache are kept as a reference copy of the 24 h cache.
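The dataflow on this slide can be sketched in a few lines of Python. This is only an illustration of the copy, delete and purge logic; in the pilot this work is done by AFD and a cron job, and the directory layout (`<root>/<GISC>/incoming`, `<root>/<GISC>/outgoing`, `<root>/cache`) and the function names `fan_out` and `purge_older_than` are assumptions made for the example.

```python
import os
import shutil
import time
from pathlib import Path
from typing import List, Optional


def fan_out(root: Path, sender: str, giscs: List[str]) -> int:
    """Copy each file from <root>/<sender>/incoming to every other GISC's
    outgoing directory and to the shared cache, then delete the original.
    Returns the number of files processed. (Hypothetical layout.)"""
    cache = root / "cache"
    cache.mkdir(parents=True, exist_ok=True)
    processed = 0
    for src in sorted((root / sender / "incoming").iterdir()):
        if not src.is_file():
            continue
        for gisc in giscs:
            if gisc == sender:
                continue  # a file is never sent back to its origin
            out = root / gisc / "outgoing"
            out.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, out / src.name)
        shutil.copy2(src, cache / src.name)
        src.unlink()  # files in incoming are deleted after processing
        processed += 1
    return processed


def purge_older_than(directory: Path, max_age_s: float = 24 * 3600,
                     now: Optional[float] = None) -> List[str]:
    """Delete files whose modification time is older than max_age_s,
    mimicking the cron job on the cache. Returns the removed names."""
    now = time.time() if now is None else now
    removed = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and now - path.stat().st_mtime > max_age_s:
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling `fan_out(root, "A", ["A", "B", "I"])` after GISC A has uploaded a file leaves one copy in the outgoing directories of B and I and one in the cache; running `purge_older_than(root / "cache")` periodically keeps the cache at roughly 24 hours of data.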
The status on 12 October 2015
§ Thirteen GISCs are pushing data to the cloud server (Moscow, Brasilia, Offenbach, Beijing, Toulouse, Exeter, Melbourne, Tokyo, Seoul, Jeddah*, Tehran*, Washington*, Pretoria*)
§ One GISC has confirmed its willingness to join the pilot (Delhi)
§ One GISC hasn’t made its decision clear (Casablanca)
§ The protocol used is either FTP or SFTP
§ Some GISCs are sending GFNC files (names like A_*.txt) and others are sending CCCC files.
§ In all cases, bulletins are stored as individual files in the cache
§ We have started experimenting with grouping files in .tar.bz2 archives (tests have shown that this should halve the overall size to be transferred and divide the number of files by at least three)
§ We plan to include a trial of AMQP later in the process
(*) These GISCs are sending their data via a 3rd party (Offenbach and Exeter)
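The grouping experiment mentioned on this slide can be illustrated with Python's standard `tarfile` module. This is a minimal sketch, not the tooling used in the pilot; the function names `bundle_bulletins` and `unbundle` and the in-memory dictionary of bulletins are assumptions made for the example.

```python
import io
import tarfile
from typing import Dict


def bundle_bulletins(bulletins: Dict[str, bytes]) -> bytes:
    """Pack individual bulletin files (name -> content) into a single
    bzip2-compressed tar archive held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name in sorted(bulletins):
            data = bulletins[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unbundle(archive: bytes) -> Dict[str, bytes]:
    """Restore the individual bulletin files from a .tar.bz2 archive."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:bz2") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out
```

The sender transfers one archive instead of many small files, and the receiver (or the cloud server, if bulletins must stay individual files in the cache) unpacks it again. The actual size reduction depends on the data and is not something this sketch measures.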
Some questions and answers so far
§ Questions from the participants:
  o What file names should we use?
    Ø Att. II-15 describes two methods:
      » CCCCNNNN.[a, b, ua, ub]
      » Global File Naming Convention
    Ø Answer: both are possible
    Ø Question 1: is it the job of the distribution software to be “WMO aware”?
    Ø Question 2: should it just receive and push files without knowledge of their content?
  o What happens with duplicates?
    Ø Answer: nothing. Everything received is resent.
  o What should I send?
    Ø All GTS data? A subset?
    Ø Remark 1: The “24 h cache” definition is apparently still unclear
    Ø Remark 2: MSS are used to push and receive data; MSS don’t know WIS and don’t look at metadata to decide what to do with the files. So defining and handling the content of the cache is a challenge
    Ø Remark 3: MSS and WIS software should have an interface to facilitate this definition
      » E.g. WIS software could report once a day which TTAAii are for Global Exchange
§ Should the « distribution » software be “WMO aware”?
  o If a CCCCNNNN.a file is received, what should happen?
    Ø Answer 1: Nothing; it is just a file and the cloud doesn’t care about its content
    Ø Answer 2: If I want to keep a clean “24 h cache” in the cloud, the cloud needs to unpack it
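A minimal level of “WMO awareness” could consist of recognising which of the two naming methods an incoming file uses, so that CCCC files can be unpacked while other files are passed through untouched. The sketch below is an assumption-laden illustration: the regular expressions only follow the patterns as abbreviated on the slides (`CCCCNNNN.[a, b, ua, ub]` and `A_*.txt`), the number of digits is not validated, and the names `classify`, `CCCC_FILE` and `GFNC_FILE` are hypothetical.

```python
import re

# Assumed patterns, taken literally from the slides rather than from the
# full specification: four uppercase letters, digits, then one of the
# listed extensions; or a name starting with "A_" and ending in ".txt".
CCCC_FILE = re.compile(r"^[A-Z]{4}\d+\.(a|b|ua|ub)$")
GFNC_FILE = re.compile(r"^A_.+\.txt$")


def classify(name: str) -> str:
    """Return 'cccc', 'gfnc' or 'unknown' for a received file name."""
    if CCCC_FILE.match(name):
        return "cccc"
    if GFNC_FILE.match(name):
        return "gfnc"
    return "unknown"


def needs_unpacking(name: str) -> bool:
    """Under 'Answer 2', only CCCC files would be unpacked so that the
    cache holds individual bulletins; everything else is stored as is."""
    return classify(name) == "cccc"
```

With “Answer 1” the server would never call `classify` at all; with “Answer 2” it would call `needs_unpacking` on each arrival and extract the bulletins from CCCC files before storing them in the cache.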
The VDC by Interoute
§ ECMWF is paying for the cloud servers for one year. Two virtual servers are available (one in Paris, one in Berlin). So far, only the server in Paris is used.
§ The web interface to configure the VMs and the storage is basic, but it does the job
§ As part of the configuration:
  o Firewall: very limited. No support for “established” TCP connections, so a large range of ports must be opened to allow FTP
  o NAT
  o Load balancing (not used)
§ Very good online support
§ Network access performance is very good. In theory, unlimited 10 Gb/s access to the Internet (and, if needed, to the RMDCN) for free!
§ For the time being, this solution is very cost-effective (the cost per VM is approximately 4 k€ per year) and proves to be an easy way to “multicast” from any GISC to all others (while using a “unicast” protocol)
Still to do… as part of the test
§ Complete the connection with all GISCs, either directly or using a relay
  o Delhi and Casablanca (either directly or via a 3rd party)
  o Aim at having, for the first time (?), the full 24 h cache available!
§ Improve the configuration
  o Handle urgent messages (e.g. by creating a separate directory for that purpose)
  o Test new features: AMQP
§ Run the test over a few months to assess the reliability of the solution
§ Assess performance and gather statistics
Still to do… after the test
§ Prepare questionnaire for all WIS centers (to be issued before the end of 2015)
§ Gather and consider responses to the questionnaire
§ Prepare reports for ET-CTS, TT-GISC, CBS in 2016
Some ideas for the questionnaire (1)
§ For the GISCs:
  o Assess support for this solution
  o Agree on a method to procure the VMs
    Ø Do we need two independent providers for redundancy purposes (one being Interoute for the RMDCN access and another cloud provider on the Internet)?
    Ø Use of the RMDCN contract? Do we need another contract? Which one?
  o Agree on the features required of the software on the cache server:
    Ø How much “WMO awareness” do we need?
      » None: the server in the cloud will just receive and send files, unaware of their content
      » Some: extract CCCC files and push bulletins using GFNC? Create CCCC files? Create .tar.bz2 files?
      » Full: handle deduplication? Filter out data not intended for “Global Exchange”?
    Ø Depending on the requirements, select the appropriate software
  o How to manage the VMs and the software used?
    Ø For the “network” (RMDCN), ECMWF is the technical and administrative interface. Interoute is in charge of the configuration and monitoring
    Ø For the “cloud”, we need a similar design. Interoute may provide the VMs but won’t take care of the software. Is ECMWF ready to extend its role to the “cloud” function? Who else could do that? Any volunteer among the GISCs (or DCPCs/NCs)?
    Ø Interest in managing the servers/software on behalf of the community. For free? For a fee? For a limited period of time?
Some ideas for the questionnaire (2)
§ For the DCPCs/NCs:
  o Assess support for this solution
    Ø In particular: is storing your data in the cloud compliant with your policy?
  o Interest in managing the servers/software on behalf of the community. For free? For a fee? For a limited period of time?
The thank you slide
§ All the GISCs taking part in the pilot
  o Email sent to the GISC list on 11 August
  o Within two weeks, answers from 12 of them
  o Two months later:
    Ø 13 are up and running (at least partially)
    Ø Exchange of emails with the 2 remaining
  o Are we going to see the “real” 24 h cache?
§ ECMWF for supporting this pilot by offering the VMs for one year
§ Holger Kiehl (DWD) for his tremendous support:
  o Contact for GISC Offenbach
  o Configuration and hardening of the VMs
  o Configuration of AFD
  o Improving AFD
Recommended text
§ TT-GISC:
  o Recognizes that the “cache in the cloud” is a very promising solution to obtain a shared and uniform cache between all the GISCs
  o Welcomes the participation of 13 GISCs and urges the two remaining GISCs to join the pilot, either directly or via a third party
  o Tasks ET-CTS to prepare the questionnaire to assess the possibility of using this solution operationally