- Number of slides: 25
UK E-Science Initiative and its Application to SDO
J. L. Culhane, MSSL
SUMMARY
• The UK Astrogrid
• Dealing with SDO Data Volumes
• The PPARC E-Science AO
• HMI Data Products and Pipeline
What is the Grid?
Ian Foster, Argonne National Lab & University of Chicago
“A Grid is a system that:
• Coordinates resources that are not subject to centralized control.
• Uses standard, open, general-purpose protocols and interfaces.
• Delivers nontrivial qualities of service.”
- Ian Foster, “What is the Grid? A Three Point Checklist”
[Figure: GRID network diagram with Laptop, Phone / PDA, Printer, PC, Mainframe and Space Missions]
UK Astrogrid
• Astrogrid is one of three major world-wide projects (along with the European AVO and US-VO projects) which aim to create an astronomical Virtual Observatory
• Astrogrid has a significant Solar Physics component
• The Virtual Observatory will be a set of co-operating and interoperable software systems that:
– allow users to interrogate multiple data centres in a seamless and transparent way;
– provide powerful new analysis and visualisation tools;
– give data centres a standard framework for publishing and delivering services using their data.
How does Astrogrid work?
Web Service: “A web service is any piece of software that makes itself available over the Internet and uses a standardized XML messaging system.” - Ethan Cerami, “Top Ten FAQs for Web Services”, The O’Reilly Network
[Figure: User, Web Interface, Web Services, Distributed Network of Registries, and RESOURCES: Data Archive, Data Storage, Data Transformation & Processing]
Astrogrid Registry: “Dynamic database of metadata describing a set of Internet-available resources. A registry is used to identify and locate resources satisfying user-specified criteria, and to direct more detailed information requests to the relevant services.” - Robert Hanisch, STScI
METADATA:
• Basic: ID, title, service type
• Curation: Location, contact, publisher, creator, etc.
• Metadata: Allowed methods, input / output variables, etc.
• Metadata Format: Wavelength, coordinates, instrument coverage…
Registries contain information about resources
[Figure: Registry Database, Distributed Network of Registries, Data Archive, Data Storage, Data Transformation]
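The registry idea above (metadata records searched against user-specified criteria) can be sketched in a few lines of Python. This is only an illustration: the record fields, the `find_resources` helper and the three entries are hypothetical and do not reflect the actual Astrogrid registry schema or interface.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceRecord:
    """One registry entry. Field names are illustrative assumptions."""
    resource_id: str            # basic metadata
    title: str
    service_type: str
    publisher: str              # curation metadata
    methods: list = field(default_factory=list)  # allowed methods
    wavelength: str = ""        # coverage metadata
    instrument: str = ""


def find_resources(registry, **criteria):
    """Return the records whose attributes equal every given criterion."""
    return [
        record for record in registry
        if all(getattr(record, name, None) == wanted
               for name, wanted in criteria.items())
    ]


# Hypothetical entries, for illustration only.
REGISTRY = [
    ResourceRecord("res-1", "Example soft X-ray archive", "archive",
                   "Data Centre A", ["query", "download"], "x-ray", "imager-a"),
    ResourceRecord("res-2", "Example Doppler velocity archive", "archive",
                   "Data Centre B", ["query"], "visible", "imager-b"),
    ResourceRecord("res-3", "Example calibration service", "processing",
                   "Data Centre A", ["calibrate"], "x-ray", "imager-a"),
]

matches = find_resources(REGISTRY, wavelength="x-ray", service_type="archive")
print([record.resource_id for record in matches])
```

A real registry would then direct the more detailed request to the service named in the matching record; here the lookup simply returns the records.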
Solar Interior to Outer Atmosphere
Science goal: Connect observations of the interior to fluctuations in the solar atmosphere
Data Required: Helioseismology observations connected with solar atmosphere observations
Current difficulties: No efficient way to search for solar atmospheric events that may be responding to an excitation source in the interior
Grid future: Ability to:
- Search easily for events, e.g. flux emergence, AR evolution, flares, coronal mass ejections, over specific time periods
- Extract parameters over the cycle from the atmosphere and interior in order to compare their evolution
Crucial for SDO to relate convection zone observations to magnetic field data for the photosphere and above
SDO HMI Archiving and Processing
• SDO instruments generate raw data (~2 Tbyte/day) along with derived products
• Derived products result from pipeline processing that must keep up with the flow of incoming data
• A GRID or Virtual Observatory approach could allow:
– Distributed data holding
– Distributed processing capability
• Network bandwidths and processing power at single sites set limits:
– Available network bandwidths for users could limit data transfer from/between multiple archives
– Holding all data at one site implies considerable processing power at that site, accessible by many distributed users
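To put the ~2 Tbyte/day figure in network terms, the short calculation below converts a daily volume into the average rate a pipeline or link must sustain to keep up. It assumes 1 Tbyte = 10^12 bytes, which the slide does not specify, and it covers raw data only (derived products add to the load).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mbps(tbytes_per_day):
    """Average rate in Mbit/s needed to move a daily volume within one day.

    Assumes decimal units: 1 Tbyte = 10**12 bytes, 1 Mbit = 10**6 bits.
    """
    bits_per_day = tbytes_per_day * 1e12 * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


print(f"{sustained_rate_mbps(2):.0f} Mbit/s sustained for 2 Tbyte/day")
```

Under these assumptions the raw stream averages roughly 185 Mbit/s, before any allowance for protocol overhead, retransmission or catch-up after outages.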
Distributed Archive Approach
• Multiple copies of the data desirable
• Needs a minimum of two geographically separated sources, with the advantages:
– Greater resilience in ability to supply users
– Load sharing between different providers (network and processing)
– Avoids need for single site to provide excessive processing power
Single Archive Approach
• Solar data are normally stored in raw form and need to be processed before use
• Processing involves extraction and calibration of selected observations
• For data involving extended time intervals (e.g. helioseismology data), processing data at source is desirable
• Advantages that result:
– Reduced amount of information to be returned to user
– Affords the instrument teams more control over the processing and quality of their data products
but
• Heavy loading of processors at the single archive site unless requests are for high-level, lower-volume data products
Network Issues
• UK has the “SuperJANET” backbone, currently at 10 Gbps
• Local access points operate at 2.5 Gbps (e.g. UCL interconnect rate to backbone)
• Europe has the “GÉANT” backbone at 10 Gbps covering UK, France, Germany, Sweden, Switzerland, with 2.5 Gbps local interconnects
• Transatlantic connection to GÉANT currently 2.5 Gbps, with upgrade to 10 Gbps planned for 2004
• Discussion of a “Global” 1 Tbps network by 2006?
• GÉANT driven in part by needs of the HEP community for LHC, hence SDO may not have a problem in moving data between sites
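As a rough check on the closing point, the sketch below estimates how long one day of SDO raw data (~2 Tbyte, from the archiving slide) would take to cross the 2.5 Gbps and 10 Gbps links quoted above. It is idealised: decimal units are assumed, and the `efficiency` parameter is a hypothetical knob for the usable fraction of a shared link, not a measured value.

```python
def transfer_hours(volume_tbytes, link_gbps, efficiency=1.0):
    """Hours needed to move a volume over a link at a given usable fraction.

    Assumes 1 Tbyte = 10**12 bytes and 1 Gbps = 10**9 bit/s.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    bits = volume_tbytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


for link_gbps in (2.5, 10):
    hours = transfer_hours(2, link_gbps)
    print(f"{link_gbps} Gbps link: {hours:.2f} h per day of data")
```

Even on a dedicated 2.5 Gbps link a day of data moves in under two hours in this idealised case; the real margin depends on how much of the shared capacity is actually available.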
PPARC E-Science AO
• Proposals due by 31st May 2003
• Existence of first-level Astrogrid infrastructure assumed
• Proposals should:
– Be for the application of infrastructure and related techniques to “real” data sets
– Underpin science, with a close connection between projects and the science programme being essential
– Demonstrate an enabling role for eventual science exploitation
– Ensure development of standards and deployment of Grid infrastructure
• SDO bid is now anticipated by PPARC
HMI Data Analysis Pipeline
[Figure: pipeline flow diagram. Column labels: Enabling Code/Algorithms; Processing; HMI Data; Data Product; Net Access/Mirror]
Labels shown: Filtergrams; Doppler Velocity; Heliographic Doppler velocity maps; Tracked Tiles of Dopplergrams; Spherical Harmonic Time series to l=1000; Mode frequencies and splitting; Ring diagrams; Local wave frequency shifts; Time-distance Cross-covariance function; Wave travel times; Egression and Ingression maps; Wave phase shift maps; Internal rotation Ω(r, Θ) (0
HMI Science Data Analysis Plan
[Figure: Science Exploitation; from the HMI SRR/SCR Presentation, April 8-10]
HMI Data Volumes
[Figure: Net Access]
END OF TALK
What is Astrogrid?
Astrogrid is a £5M data grid project that will link data archives, resources, and disciplines from UK space institutions into a virtual observatory.
Resources
• Datasets
• Processors
• Storage
• Other virtual observatories
Data Archives
• Mullard Space Science Laboratory
• Rutherford Appleton Laboratory
• University of Cambridge
• University of Leicester
• Royal Observatory Edinburgh
• Queen's University Belfast
• Jodrell Bank Observatory
Disciplines
• Astrophysics
• Solar Physics
• Solar Terrestrial Physics
GRID/Virtual Observatory
Within a virtual observatory:
• Datasets need not all be stored at a single site
• Metadata and registries allow the system to handle a distributed archive
• Different organisations or countries could host different datasets or different parts of a dataset (e.g. split by time)
• Complete catalogues relating to particular datasets should be held wherever the data are held
• Distributed data holding reduces the pressure on:
– Network connection to an archive
– Processing capabilities needed at the archive site
• Most-accessed data could be selectively copied to distributed archives, e.g. EGSO, Astrogrid
• Derived data products should be held at distributed sites
• Material needed for more detailed searches should be described by metadata in appropriate registries
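The "split by time" case can be illustrated with a small routing sketch: given which site holds which date range, work out which sites a request must be sent to and for which sub-interval. The site names, date ranges and the `sites_for_interval` helper are hypothetical; in practice this information would come from registry metadata.

```python
from datetime import date

# Hypothetical holdings: (site, first day held, last day held).
HOLDINGS = [
    ("archive-a", date(2001, 1, 1), date(2001, 6, 30)),
    ("archive-b", date(2001, 7, 1), date(2001, 12, 31)),
]


def sites_for_interval(holdings, start, end):
    """Return (site, overlap_start, overlap_end) for each site holding
    part of the inclusive interval [start, end]."""
    if start > end:
        raise ValueError("start must not be after end")
    plan = []
    for site, held_from, held_to in holdings:
        overlap_start = max(start, held_from)
        overlap_end = min(end, held_to)
        if overlap_start <= overlap_end:
            plan.append((site, overlap_start, overlap_end))
    return plan


for site, first, last in sites_for_interval(
        HOLDINGS, date(2001, 6, 15), date(2001, 7, 10)):
    print(f"{site}: {first} to {last}")
```

A request that straddles the boundary is split into one sub-request per site, so each archive only serves and processes the portion it holds.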
Example: Solar / Stellar Flares
Science Problem: A solar physicist studying the flare mechanism would like to gather data on both solar and stellar flares.
Data Required: X-ray datasets: lightcurves, spectra, and redshift / blueshift information from SOHO, Yohkoh, EXOSAT, ROSAT, XMM, Chandra, etc.
Current Issues: No stellar flare catalogue (at the time the science problem was written); datasets provided by several different archives with no common interface.
[Figure: Solar Flare Catalogue #1 and #2, Yohkoh Archive, Solar-B Archive, XMM Archive, Chandra Archive, Merged Solar Flare List, NEW: Stellar Flare Catalogue, User Web Interface]
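The "Merged Solar Flare List" step can be sketched as combining two catalogues and treating entries with nearly coincident start times as the same event. This is an assumption about how such a merge might work, not the method used by Astrogrid; the five-minute tolerance, the catalogue names and the event times are all made up for illustration.

```python
from datetime import datetime, timedelta


def merge_flare_lists(catalogue_1, catalogue_2, tolerance=timedelta(minutes=5)):
    """Merge two lists of (start_time, source) entries, sorted by start time.

    An entry whose start time lies within `tolerance` of the first entry of
    the current group is treated as the same flare, and its source is added
    to that group.
    """
    entries = sorted(catalogue_1 + catalogue_2, key=lambda entry: entry[0])
    merged = []
    for start, source in entries:
        if merged and start - merged[-1]["start"] <= tolerance:
            merged[-1]["sources"].add(source)
        else:
            merged.append({"start": start, "sources": {source}})
    return merged


# Hypothetical catalogue entries, for illustration only.
CATALOGUE_1 = [
    (datetime(2003, 1, 1, 10, 0), "catalogue-1"),
    (datetime(2003, 1, 1, 14, 30), "catalogue-1"),
]
CATALOGUE_2 = [
    (datetime(2003, 1, 1, 10, 3), "catalogue-2"),
    (datetime(2003, 1, 2, 8, 0), "catalogue-2"),
]

for flare in merge_flare_lists(CATALOGUE_1, CATALOGUE_2):
    print(flare["start"], sorted(flare["sources"]))
```

Matching on start time alone is the simplest possible criterion; a real merge would likely also compare position and peak time before declaring two entries the same flare.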
HMI Data Archive
HMI Data Flow
HMI Dataflow Concept
HMI Standard Data Products
UK Astrogrid Scientific Aims
• Improve the quality, efficiency, ease, speed, and cost-effectiveness of on-line astronomical research
• Make comparison and integration of data from diverse sources seamless and transparent
• Remove data analysis barriers to interdisciplinary research
• Make science involving manipulation of large datasets as easy and as powerful as possible
UK Astrogrid Practical Goals
• Develop, with our IVOA partners (including European Grid of Solar Observations/EGSO), internationally agreed standards for data, metadata, data exchange and provenance
• Develop a software infrastructure for data services
• Establish a physical grid of resources shared by AstroGrid and key data centres
• Construct and maintain an AstroGrid Service and Resource Registry
• Implement a working Virtual Observatory system based around key UK databases and of real scientific use to astronomers
• Provide a user interface to that VO system
• Provide, either by construction or by adaptation, a set of science user tools to work with that VO system
• Establish a leading position for the UK in VO work