  • Number of slides: 42

Overview: Ocean Data Processing System (ODPS)
Introduction
Missions
Evolution
Philosophy
Software
Components and Subsystems:
> RDBMS
> Ingest
> VDC/Scheduler
> Distribution
Scientific Support
Browser

Introduction
The ODPS is an automated data system that provides ingest, processing, archive, and distribution functions for legacy, operational, and future remote-sensing satellite missions.
Legacy Missions:
> CZCS Oct 1978 – Jun 1986
> OCTS Nov 1996 – Jun 1997
Operational Missions:
> Aqua-MODIS Jul 2002
> MERIS Mar 2002
> SeaWiFS Sep 1997
> Terra-MODIS Feb 2000
Future Missions:
> Aquarius
> Glory
> NPP VIIRS

Evolution
Originally developed between 1991 and 1996 to support SeaWiFS
Support for OCTS added in 1996
Delivered to MODIS project to serve as the MODIS Emergency Backup System (MEBS) in 1997
Complete system redesign and rewrite 2003-2004
Delivered to GISS in 2008 to support Glory mission
Multiple evolutionary cycles in response to changes in hardware infrastructure and support-function requirements
> Began on early multi-processor SGI IRIX systems
> Ported to Linux in 2000
> Processing concurrency increased from 30 to over 500
> Distribution functions added in 2004
> Storage evolution
> Validation targets

Philosophy
Adaptive framework that allows any standalone program to be incorporated as a system job
Loosely coupled, modular subsystems
> Ease of maintenance
> Development and testing alongside production
> Subsystem swapping
Standardized coding practices minimize impact of operating-system upgrades
> SGI IRIX to Linux
> 32-bit to 64-bit
> Strict GSFC IT requirements necessitate more-frequent OS updates
Software lifecycle of requirements analysis, rapid-prototype development, and refinement allows new concepts to be quickly developed and adopted for operational use
> Data subscriptions and orders

Ingest and Distribution Statistics
ODPS currently manages over 20 million files in its archive, about 1.06 petabytes
Daily ingests:
> 576 MODIS-L0 granules, 120 GB (60 GB each for Aqua and Terra)
> 2 SeaWiFS recorder dumps, 200 MB each
> 2-3 SeaWiFS HRPT (direct broadcast) passes, 50 MB each
> 5-6 MERIS-L1 granules, 1 GB each
Distribution (Oct 2010):
> 978 orders; 650,786 files; 5.2 TB
> 473 active subscriptions; 576,346 files staged
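
As a rough cross-check, the daily ingest figures on this slide can be summed per source. The sketch below is only arithmetic on the stated numbers; it assumes decimal units (1 GB = 1000 MB), which the slide does not specify, and the function name is illustrative.

```python
# Rough daily ingest volume from the figures on this slide.
# Assumption: decimal units (1 GB = 1000 MB); the slide does not say which.
MB_PER_GB = 1000


def daily_ingest_gb(hrpt_passes, meris_granules):
    modis_gb = 120.0                          # 576 L0 granules, Aqua + Terra combined
    recorder_gb = 2 * 200 / MB_PER_GB         # 2 SeaWiFS recorder dumps, 200 MB each
    hrpt_gb = hrpt_passes * 50 / MB_PER_GB    # SeaWiFS HRPT passes, 50 MB each
    meris_gb = meris_granules * 1.0           # MERIS L1 granules, 1 GB each
    return modis_gb + recorder_gb + hrpt_gb + meris_gb


low = daily_ingest_gb(hrpt_passes=2, meris_granules=5)
high = daily_ingest_gb(hrpt_passes=3, meris_granules=6)
print(f"Daily ingest: {low:.2f} to {high:.2f} GB")
```

Under these assumptions the total is roughly 125.5 to 126.6 GB per day, with MODIS L0 data accounting for about 95% of it.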

Software
RDBMS (proprietary):
> Sybase Adaptive Server Enterprise 15.0.3
> Sybase Open Client CT Library
> Sybase Transact-SQL
Framework (open source):
> GCC 4.x
> Perl 5
> Perl DBI module with Sybase driver
> OpenMotif 2.x
> Bash
Processing (proprietary):
> IDL (limited use)
Image Generation (open source):
> GMT
> ImageMagick
> NetPbm
> Octave
Version Control (open source):
> Subversion

Subsystems
RDBMS
Data Acquisition and Ingest
Archive Device Manager
Level-3 Scheduler
VDC/Scheduler
Data Distribution
File Management and Migration

Components and Subsystems: RDBMS
Primary element that manages all system activity
Core databases support generic system framework, data ingest, processing, file management, and distribution functions
Mission databases house mission-specific data and procedures
High level of reuse possible for similar missions, e.g. MODIS Aqua/Terra, SeaWiFS, and OCTS are ocean-color missions and have similar product suites, data flows, and processing requirements
Database and transaction-log dumps performed regularly and stored in three different locations
Clone of database-server hardware and OS maintained as a warm backup

Components and Subsystems: RDBMS
Generic Core Databases:
> Admin
> Catalog
> Dataflow
> Processing
Mission-Specific Databases:
> MODIS Aqua
> MODIS Terra
> OCTS
> SeaWiFS
> CZCS
> Aquarius
> VIIRS
> New

Components and Subsystems: RDBMS
Goal: Isolate RDBMS from system software
To use a different RDBMS vendor, swap in a new Database Services Layer
[Diagram labels: RDBMS; Vendor Library Module; Vendor Client Library; Database Services Layer; Perl DBI Module; C Interface Functions; Perl Scripts; C Programs]
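
The isolation goal on this slide can be illustrated with a minimal sketch. It is written in Python with SQLite as a stand-in vendor, whereas the slides name Perl DBI, C interface functions, and Sybase; every class, function, and table name here is hypothetical and not taken from ODPS.

```python
import sqlite3
from abc import ABC, abstractmethod


class DatabaseServices(ABC):
    """Hypothetical services layer: the only database API that system code sees."""

    @abstractmethod
    def execute(self, sql, params=()):
        """Run a statement that returns no rows."""

    @abstractmethod
    def query(self, sql, params=()):
        """Run a query and return all rows as a list of tuples."""


class SqliteServices(DatabaseServices):
    """Stand-in vendor binding (SQLite here, purely for illustration)."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)

    def execute(self, sql, params=()):
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(sql, params)

    def query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()


def register_file(db, name, size_bytes):
    """System code: talks to the services layer, never to the vendor library."""
    db.execute("INSERT INTO files (name, size_bytes) VALUES (?, ?)", (name, size_bytes))


def archive_size(db):
    rows = db.query("SELECT COALESCE(SUM(size_bytes), 0) FROM files")
    return rows[0][0]


db = SqliteServices()
db.execute("CREATE TABLE files (name TEXT PRIMARY KEY, size_bytes INTEGER)")
register_file(db, "granule_a.L0", 200)
register_file(db, "granule_b.L0", 300)
print(archive_size(db))
```

Swapping vendors then means writing another `DatabaseServices` subclass while `register_file` and `archive_size` stay unchanged. The sketch still leaks the `?` placeholder style and SQL dialect into system code, which a fuller layer would presumably also hide.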

Subsystems: Ingest
Data types and sources are described in the database
Active, passive, and periodic notification methods
> Active method scans remote systems for new files
> Passive method handles messages for new files
> Periodic method schedules transfers of files at specified intervals
File transfers performed by ingest daemons and scheduler tasks
FTP, RCP, SFTP, and HTTP transfer protocols supported
Generic file transfer process hands off to data-specific post-transfer scripts
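
The three notification methods and the hand-off to post-transfer scripts might be modelled roughly as follows. This is a sketch under stated assumptions: the in-memory queue, the handler table, the message format, and all names are hypothetical, and no real transfer takes place.

```python
from dataclasses import dataclass, field


@dataclass
class IngestState:
    seen: set = field(default_factory=set)      # files already queued
    queue: list = field(default_factory=list)   # files awaiting transfer


def _enqueue(state, name):
    if name not in state.seen:
        state.seen.add(name)
        state.queue.append(name)


def active_scan(state, remote_listing):
    """Active: we inspect a remote listing and queue anything new."""
    for name in sorted(remote_listing):
        _enqueue(state, name)


def passive_notify(state, message):
    """Passive: the source sends a message announcing a new file."""
    _enqueue(state, message["file"])


def periodic_due(last_run, now, interval):
    """Periodic: a transfer is due once the interval has elapsed."""
    return now - last_run >= interval


# Data-specific post-transfer handlers, keyed by a hypothetical data type.
POST_TRANSFER = {
    "modis_l0": lambda name: f"catalogued MODIS L0 granule {name}",
    "seawifs_hrpt": lambda name: f"catalogued SeaWiFS HRPT pass {name}",
}


def transfer(state, data_type):
    """Generic transfer step, then hand-off to the data-specific handler."""
    results = []
    while state.queue:
        name = state.queue.pop(0)
        # A real daemon would fetch the file here (FTP, RCP, SFTP, or HTTP).
        results.append(POST_TRANSFER[data_type](name))
    return results


state = IngestState()
active_scan(state, ["g2.L0", "g1.L0"])
passive_notify(state, {"file": "g2.L0"})   # already queued, so ignored
passive_notify(state, {"file": "g3.L0"})
log = transfer(state, "modis_l0")
print(log)
```

Keeping the transfer step generic and pushing data-specific work into a handler table mirrors the slide's split between the generic transfer process and post-transfer scripts.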

Ingest: Flowchart [figure]

Subsystems: VDC/Scheduler
Visual Database Cookbook (VDC)
> Prototype developed in 1991
> Four separate programs
> Originally a distributed model
Runs in a daemon-like state on each server on which processing or supporting jobs need to run
Two main functions:
> Task Scheduler – Runs high-level jobs (tasks) that support a variety of system functions
> Processing Engine – Runs processing streams, typically scientific programs, sequenced into steps such as L0 -> L1, L1 -> L2, etc.
Greedy client model adopted in 2004
Unification of task scheduler and processing engine in 2007

VDC Function: Scheduler
Primary system element responsible for coordinating most of the system activity
Monitors task records in a to-do list database table and runs tasks according to defined attributes
> Manual
> Periodic
> Timed
> Triggered
Standard job-shell interface allows new programs to be quickly adapted for Scheduler control
Tasks may be bound to specific hosts or claimed by any available host in the processing group
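
One way to picture the to-do list and its four task attributes is the sketch below. The field names, the in-memory list standing in for the database table, and the rule that one-shot tasks leave the list once claimed are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    mode: str                      # "manual", "periodic", "timed", or "triggered"
    host: Optional[str] = None     # None: any host in the processing group may claim it
    run_at: float = 0.0            # timed: earliest start; periodic: next start
    interval: float = 0.0          # periodic only, in seconds
    released: bool = False         # manual: set by an operator
    trigger_done: bool = False     # triggered: set when the upstream event occurs


def is_runnable(task, now):
    if task.mode == "manual":
        return task.released
    if task.mode in ("periodic", "timed"):
        return now >= task.run_at
    if task.mode == "triggered":
        return task.trigger_done
    raise ValueError(f"unknown mode: {task.mode}")


def claim_tasks(todo, host, now):
    """Claim the tasks this host may run now; periodic tasks are rescheduled."""
    claimed, remaining = [], []
    for task in todo:
        if task.host in (None, host) and is_runnable(task, now):
            claimed.append(task.name)
            if task.mode != "periodic":
                continue           # one-shot task: drop it from the to-do list
            task.run_at = now + task.interval
        remaining.append(task)
    todo[:] = remaining
    return claimed


todo = [
    Task("rotate_logs", "periodic", run_at=0, interval=3600),
    Task("nightly_dump", "timed", host="db1", run_at=7200),
    Task("reprocess", "manual"),
    Task("l3_binning", "triggered", trigger_done=True),
]
first = claim_tasks(todo, host="proc1", now=100)
second = claim_tasks(todo, host="db1", now=7200)
print(first, second)
```

Here `proc1` claims the periodic and triggered tasks, `db1` later claims the rescheduled periodic task plus the task bound to it, and the manual task waits for an operator.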

VDC Function: Scheduler
[Diagram labels: Daily Tasks; To-do List; User input via SCHEDMON GUI; Tasks for the current day; Daily Task Scheduler; VDC/Scheduler; Task Shell]

SCHEDMON GUI [figure]

VDC Function: Processing Engine
Scalable infrastructure for concurrent processing of serial streams (e.g. L0 -> L1A -> L1B -> L2)
Each instance of the VDC Engine actively competes for jobs that it is allowed to run based on priority, length of time in the queue, and processing weight
Uses recipes to encapsulate data-specific processing schemes, parameters, and preprocessing rules
Virtual Processing Units (VPUs) serve as distinct processing resources and are allocated based on available time, current OS load, and processing weight
Comprehensive processing priorities allow high-priority real-time data to be handled ahead of lower-priority processing
Standard job-shell interface allows new scientific programs to be quickly adapted as recipe steps
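
The competition for jobs can be sketched as a greedy selection loop. The slides do not give the actual ranking formula, so the ordering used here (priority first, then time in queue) and the treatment of processing weight as a simple capacity budget are assumptions; OS load and available time are left out, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    recipe: str
    priority: int      # assumption: lower number means more urgent
    queued_at: float   # seconds; earlier means longer in the queue
    weight: int        # assumption: processing cost in abstract VPU units


def pick_jobs(queue, allowed_recipes, free_vpu_units):
    """Greedily claim jobs for one engine until its VPU capacity is used up."""
    candidates = [j for j in queue if j.recipe in allowed_recipes]
    # Most urgent first; ties go to the job that has waited longest.
    candidates.sort(key=lambda j: (j.priority, j.queued_at))
    claimed = []
    for job in candidates:
        if job.weight <= free_vpu_units:
            claimed.append(job)
            free_vpu_units -= job.weight
    return claimed


queue = [
    Job(1, "modis_l0_l1a", priority=5, queued_at=10.0, weight=2),
    Job(2, "seawifs_hrpt", priority=1, queued_at=50.0, weight=1),
    Job(3, "modis_l0_l1a", priority=5, queued_at=5.0, weight=2),
    Job(4, "meris_l2", priority=1, queued_at=1.0, weight=1),
]
claimed = pick_jobs(queue, allowed_recipes={"modis_l0_l1a", "seawifs_hrpt"}, free_vpu_units=3)
claimed_ids = [j.job_id for j in claimed]
print(claimed_ids)
```

In this run the engine takes the high-priority HRPT job first, then the longest-waiting MODIS job, and skips the remaining MODIS job because no capacity is left. Job 4 is never considered because its recipe is not one this engine is allowed to run.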

VDC Function: Processing Engine
Captures system boot time and monitors OS load
Invokes recipe steps and monitors step-execution time
Handles operator-requested stream actions
Performs flushing operations on completed tasks and streams

VDC: Rule Manager
Runs in a daemon-like state
Polls jobs in the processing queue and runs the pre-processing rule procedures
Promotes job status when all rule procedures complete successfully
Governed by currently configured processing priorities
Primarily used for matching proper ancillary data with granules in the processing queue
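
The promote-when-all-rules-pass behaviour might look like the following sketch. The status names, the ancillary kinds ("ozone", "met"), the file names, and the lookup keyed by date are all hypothetical; the slides only say that rules match ancillary data to granules.

```python
def has_ancillary(kind):
    """Build a rule that looks for an ancillary file of `kind` for the granule's date."""
    def rule(job, ancillary):
        match = ancillary.get((kind, job["date"]))
        if match is None:
            return False
        job.setdefault("inputs", {})[kind] = match
        return True
    return rule


def run_rules(job, rules, ancillary):
    """Run every pre-processing rule; promote the job only if all succeed."""
    for rule in rules:
        if not rule(job, ancillary):
            return job["status"]      # leave the job as it is; retry on the next poll
    job["status"] = "ready"
    return job["status"]


rules = [has_ancillary("ozone"), has_ancillary("met")]
ancillary = {("ozone", "2010-10-01"): "ozone_20101001.hdf"}
job = {"granule": "A2010274.L1B", "date": "2010-10-01", "status": "waiting"}

before = run_rules(job, rules, ancillary)           # "met" is still missing
ancillary[("met", "2010-10-01")] = "met_20101001.hdf"
after = run_rules(job, rules, ancillary)            # all rules pass
print(before, after)
```

Because a failed rule simply leaves the status untouched, the job is re-examined on each poll until its ancillary inputs arrive.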

VDC: MakeVDC
Polls processing queue for jobs that have met pre-processing requirements
Generates VDC job files from recipe templates according to configured priorities and populates the VDC queue
Runs as a Scheduler task, so it can easily be configured to run as often as needed to keep the VDC queue full
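
Template-driven job generation of this kind can be sketched with Python's `string.Template`. The recipe text, step names, job-file layout, status values, and queue-length limit below are invented for illustration; the slides do not describe the real job-file format.

```python
from string import Template

# Hypothetical recipe templates: one line per step, with $-placeholders.
RECIPES = {
    "modis_l0_to_l2": Template(
        "step1 l0_to_l1a in=$granule out=$stem.L1A\n"
        "step2 l1a_to_l1b in=$stem.L1A out=$stem.L1B\n"
        "step3 l1b_to_l2 in=$stem.L1B out=$stem.L2\n"
    ),
}


def make_vdc(processing_queue, vdc_queue, max_queue_len):
    """Turn ready jobs into job-file text, most urgent first, until the VDC queue is full."""
    ready = [j for j in processing_queue if j["status"] == "ready"]
    ready.sort(key=lambda j: j["priority"])    # assumption: lower number is more urgent
    for job in ready:
        if len(vdc_queue) >= max_queue_len:
            break
        stem = job["granule"].rsplit(".", 1)[0]
        text = RECIPES[job["recipe"]].substitute(granule=job["granule"], stem=stem)
        vdc_queue.append((job["granule"], text))
        job["status"] = "queued"
    return vdc_queue


processing_queue = [
    {"granule": "T2010274.L0", "recipe": "modis_l0_to_l2", "priority": 5, "status": "ready"},
    {"granule": "A2010274.L0", "recipe": "modis_l0_to_l2", "priority": 1, "status": "ready"},
    {"granule": "A2010275.L0", "recipe": "modis_l0_to_l2", "priority": 1, "status": "waiting"},
]
vdc_queue = make_vdc(processing_queue, [], max_queue_len=1)
print(vdc_queue[0][1])
```

With room for only one entry, the higher-priority ready granule is queued and the other ready job stays behind for the next run, which is why running the task frequently keeps the queue topped up.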

VDC: Flowchart [figure]

VDCMON GUI [figure]

Subsystems: Distribution
Interactive, web-based Data Ordering System, currently supporting Aqua and Terra MODIS, CZCS, OCTS, and SeaWiFS
Data Subscription System, currently supporting Aqua and Terra MODIS and SeaWiFS, allows users to define region and products of interest
Order and Subscription Manager daemons poll the order and subscription queues and stage files on FTP servers (stage rate ~12 GB/hr)
Near-real-time data extraction and image support
Web-CGI applications that allow users to view and update their orders and subscriptions
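
Matching new granules against subscriptions by region and product could be sketched as below. The bounding-box representation, the product names, and the granule record layout are assumptions, and the box test ignores regions that cross the dateline.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    user: str
    products: set      # product names of interest
    west: float
    south: float
    east: float
    north: float       # region of interest in degrees; assumes no dateline crossing


def overlaps(sub, granule):
    """True if the granule's bounding box intersects the subscription's region."""
    return not (granule["east"] < sub.west or granule["west"] > sub.east or
                granule["north"] < sub.south or granule["south"] > sub.north)


def files_to_stage(subscriptions, granule):
    """Return (user, file) pairs for the subscriptions this granule satisfies."""
    return [(s.user, granule["file"]) for s in subscriptions
            if granule["product"] in s.products and overlaps(s, granule)]


subs = [
    Subscription("alice", {"chlor_a"}, west=140, south=-25, east=155, north=-10),
    Subscription("bob", {"sst"}, west=140, south=-25, east=155, north=-10),
    Subscription("carol", {"chlor_a"}, west=-80, south=20, east=-60, north=40),
]
granule = {"file": "A2010274.L2", "product": "chlor_a",
           "west": 145, "south": -20, "east": 160, "north": -5}
staged = files_to_stage(subs, granule)
print(staged)
```

Only the first subscription matches: the second wants a different product and the third covers a region the granule does not touch. A manager daemon would then copy the matched file to the user's staging area.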

Distribution: Flowchart
[Diagram labels: Data Orders; Users; Order Manager; Data Subscriptions; Subscription Manager; Regional Extraction and Map Requests; Local Distribution Servers; Extraction and Mapping Recipe; Data and images optionally pushed to users]

Scientific Support
24/7 operational support forward-stream processing
> 9-to-5 staffing
> Extended lights-out periods
> No unscheduled down time in past year due to system-software faults
Support algorithm/calibration testing alongside production
> Product suites
> Test recipes
> Alternate tags in science-software repository
> Processing priorities
Non-standard processing requests
> Regional L3 processing
> Great Barrier Reef research
> Mozambique Whale Shark research
> GMT Intermediate Coastline
> Aquarius Simulation

OceanColor Web
oceancolor.gsfc.nasa.gov
Consolidated data access, information, services and community feedback
