
  • Number of slides: 24

Data Integration: Assessing the Value and Significance of New Observations and Products
John Williams, NCAR
Haig Iskenderian, MIT LL
NASA Applied Sciences Weather Program Review
Boulder, CO
November 19, 2008

Data Integration
• Goals
  – Integrate NASA-funded research into the NextGen 4-D data cube for SAS products and decision support
  – Evaluate the potential of new data to contribute to NextGen product skill, in the context of other data sources
  – Provide feedback on temporal/spatial scales and operationally significant scenarios where new data may contribute
• Approaches
  – Perform physically-informed transformations and forecast system integration, e.g., into a fuzzy logic algorithm
  – Use nonlinear statistical analysis to evaluate new data importance in conjunction with other predictor fields
  – Implement, evaluate and tune the system

Example of Forecast System Integration: SATCAST integration into CoSPA

CoSPA 0-2 hour Forecasts
[Diagram labels: LLWAS; ASOS; TDWR; NEXRAD; Canadian Weather Radar; Surface Weather; Lightning; Satellite; Numerical Forecast Models; CoSPA Weather Product Generator; CoSPA Situation Display; Air Traffic Managers; Airline Dispatch; Decision Support Tools]

Overview of Heuristic Forecast

Generation of Interest Images
• Interest Images:
  – Are VIL-like (0-255) images that have a high impact upon evolution and pattern of future VIL
  – Result from combining individual predictor fields using expert meteorological knowledge and image processing for feature extraction

Creating Interest Images
[Diagram labels: Convective Initiation Predictor Fields; Number CI Indicators & Visible Image Processing; Lower Tropospheric Winds/Speed; Feature Extraction; Cumulus; Orientation and elongation of elliptical kernel prescribed by winds; Favorable for CI; Forecast Engine; Unfavorable for CI; Stability Mask; Locations prescribed by CI Scores; Regional CI Weights; CI Interest]

Feature Extraction: Weather Classification
[Figure labels: Embedded; Stratiform; Large Airmass; Line; Small Airmass]

Overview of Heuristic Forecast

Forecast Engine: Combine Interest Images
[Diagram labels: VIL; Long-term Trend; Short-term Trend; Satellite Interest; RADAR Boundary; ...; Weather Type Image; Combined Forecast Image]

$$P(t, \text{pixel}, \text{wxtype}) = \frac{\sum (\text{weight} \times \text{Pixel Value})}{\sum \text{weight}}$$

The sums run over the interest images.
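The weighted average on this slide can be sketched for a single pixel as follows. This is a minimal illustration, not the CoSPA implementation: the function name `combine_interest`, the `WEIGHTS` table, its keys and all numeric values are hypothetical, chosen only to show weights that depend on lead time and weather type.

```python
def combine_interest(pixel_values, weights):
    """Weighted average of interest-image values at one pixel."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * v for w, v in zip(weights, pixel_values)) / total_weight


# Hypothetical weights keyed by (lead time in minutes, weather type).
# The real system's weights are set heuristically and are not given in the slides.
WEIGHTS = {
    (60, "line"): {"vil": 2.0, "long_term_trend": 1.0, "satellite": 1.0},
}

# Hypothetical 0-255 interest values at one pixel.
values = {"vil": 200, "long_term_trend": 100, "satellite": 60}

w = WEIGHTS[(60, "line")]
names = sorted(w)  # fixed order so values and weights line up
forecast = combine_interest([values[n] for n in names], [w[n] for n in names])
print(forecast)
```

Because the result is normalized by the sum of the weights, it stays in the same 0-255 range as the inputs whenever the weights are non-negative.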

Example of VIL Interest Evolution

Summary of Heuristic Approach and Limitations
• Individual interest images are each 0-255 VIL-like images resulting from a combination of predictor fields and feature extraction
• Forecast is a weighted average of all interest images dependent on lead time and WxType, with weights determined heuristically
  – Combines static set of interest images into 0-2 hour forecasts
  – Storm evolution is embedded in the weights, dependent on WxType
• Limitations:
  – Integrating a candidate predictor is a manual, time-intensive process
  – The utility of the predictor or an interest image to the forecast is known only qualitatively
  – There may be other helpful predictor fields and interest images that are not currently being used
  – Interest image weights and evolution functions may not be optimal
  – An objective method could help address these issues

Automated Data Importance Evaluation: Random Forests

Random Forest (RF)
• A non-linear statistical analysis technique
• Produces a collection of decision trees using a “training set” of predictor variables (e.g., observation and model data features) and associated “truth” (e.g., future storm intensity) values
  – each decision tree’s forecast logic is based on a random subset of data and predictor variables, making it largely independent of the others
  – during training, random forests produce estimates of predictor importance
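The idea of importance estimates falling out of training can be illustrated with a deliberately simplified toy. The sketch below is not a full random forest and not the software used in this work: each "tree" is a single-split stump, trained on a bootstrap sample with a random subset of predictors, and importance is the accumulated Gini impurity decrease credited to the chosen predictor. All function names and the synthetic data are assumptions made for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels, features):
    """Return (impurity decrease, feature, threshold) of the best single split."""
    parent, n = gini(labels), len(labels)
    best = (0.0, None, None)
    for f in features:
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > best[0]:
                best = (gain, f, thr)
    return best


def stump_forest_importance(rows, labels, n_trees=30, n_sub=2, seed=0):
    """Normalized importance per predictor from a forest of one-split trees."""
    rng = random.Random(seed)
    n, n_feat = len(rows), len(rows[0])
    importance = [0.0] * n_feat
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample of rows
        feats = rng.sample(range(n_feat), n_sub)     # random predictor subset
        gain, f, _ = best_split([rows[i] for i in idx],
                                [labels[i] for i in idx], feats)
        if f is not None:
            importance[f] += gain
    total = sum(importance)
    return [v / total for v in importance] if total else importance


# Synthetic data: predictor 0 determines the label, predictors 1 and 2 are noise.
data_rng = random.Random(1)
rows = [[data_rng.random() for _ in range(3)] for _ in range(120)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]

importance = stump_forest_importance(rows, labels)
order = sorted(range(3), key=lambda f: -importance[f])  # order[0] is rank 1
print(importance, order)
```

A production analysis would use full-depth trees and typically a permutation-based or impurity-based importance from an established library; the toy only shows why an informative predictor accumulates more credit than noise.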

Example: CoSPA combiner development (focus on 1 hour VIP level prediction)
• Analyzed data collected in summer 2007
  – Radar, satellite, RUC model, METAR, MIT-LL feature fields, storm climatology and satellite-based land use fields
  – Transformations
    • distances to VIP thresholds; channel differences
    • disc min, max, mean, coverage over 5, 10, 20, 40 and 80-km radii
  – Used motion vectors to “pull back” +1 hr VIP truth data to align with analysis time data fields
• For each problem, randomly selected balanced sets of “true” and “false” pixels from dataset and trained RF
  – VIP 3 (operationally significant convection)
  – initiation at varying distances from existing convection
• Plotted ranks of each predictor (low rank is good) for various scenarios
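The disc transformations listed above (min, max, mean, coverage within a radius) might look roughly like the sketch below. Several details are assumptions not stated in the slides: the radius is given in pixels rather than km, "coverage" is taken to mean the fraction of in-disc pixels at or above a threshold, and the function name `disc_stats` and the sample grid are hypothetical.

```python
import math


def disc_stats(grid, row, col, radius_px, threshold):
    """Min, max, mean and fractional coverage of grid values within a disc.

    Coverage is assumed here to be the fraction of in-disc pixels whose
    value is at or above `threshold`.
    """
    values = []
    r = int(math.ceil(radius_px))
    for i in range(max(0, row - r), min(len(grid), row + r + 1)):
        for j in range(max(0, col - r), min(len(grid[0]), col + r + 1)):
            if (i - row) ** 2 + (j - col) ** 2 <= radius_px ** 2:
                values.append(grid[i][j])
    return {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
        "coverage": sum(v >= threshold for v in values) / len(values),
    }


# Hypothetical 5x5 field with two non-zero pixels.
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 100
grid[1][2] = 50

stats = disc_stats(grid, row=2, col=2, radius_px=1, threshold=40)
print(stats)
```

Evaluating the same statistic at several radii for every pixel yields one derived predictor field per (statistic, radius) pair, which is how a single input field can expand into many candidate predictors.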

VIL8bit 06/19/2007 23:30

VIL8bit_40kmMax 06/19/2007 23:00

VIL8bit_40kmPctCov 06/19/2007 23:30
Example fields

VIL8bit_distVIPLevel6+ 06/19/2007 23:30

Importance summary for VIP 3 (var. WxType)
[Plot labels: Importance Rank (more important to less important); MITLL WxType]

Importance summary for init 20 km from existing storm
[Plot labels: Importance Rank (more important to less important); MITLL WxType]

Importance summary for init 80 km from existing storm
[Plot labels: Importance Rank (more important to less important); MITLL WxType]

RF Empirical Model Performance: VIP 3
[Plot labels: Fract. Instances with VIP >= 3; Calibration; ROC Curve (blue); Random Forest votes for VIP >= 3]
RF empirical model provides a probabilistic forecast performance benchmark
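A calibration relationship of the kind labeled on this slide, the observed fraction of instances with VIP >= 3 as a function of the forest's vote fraction, can be tabulated as in the sketch below. The binning scheme, the function name `calibration_table` and the sample votes and outcomes are illustrative assumptions, not data from the study.

```python
def calibration_table(vote_fractions, outcomes, n_bins=5):
    """Observed event frequency in each bin of forecast vote fraction.

    Returns a list of (bin_low, bin_high, observed_frequency) tuples;
    the frequency is None for bins with no instances.
    """
    counts = [0] * n_bins
    hits = [0] * n_bins
    for v, y in zip(vote_fractions, outcomes):
        b = min(int(v * n_bins), n_bins - 1)  # a vote fraction of 1.0 goes in the last bin
        counts[b] += 1
        hits[b] += y
    return [
        (b / n_bins, (b + 1) / n_bins, hits[b] / counts[b] if counts[b] else None)
        for b in range(n_bins)
    ]


# Hypothetical vote fractions and 0/1 outcomes (1 means VIP >= 3 was observed).
votes = [0.1, 0.1, 0.5, 0.5, 0.9, 0.9]
observed = [0, 0, 0, 1, 1, 1]

table = calibration_table(votes, observed)
for low, high, freq in table:
    print(f"{low:.1f}-{high:.1f}: {freq}")
```

If the observed frequency in each bin tracks the bin's vote fraction, the votes can be read directly as probabilities; otherwise the table itself can serve as a mapping from votes to calibrated probabilities.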

Summary and Conclusions
• Developing satellite-based weather products may be only the first step of their integration into an operational forecast system
• Integration into an existing forecast system may require physically-informed transformations and heuristics
• An RF statistical analysis can help evaluate new candidate predictors in the context of others
  – Relative importance
  – Feedback on scales of contribution
  – Also supplies an empirical model benchmark
• Successful operational implementation may require additional funding beyond initial R&D