e6abdabfa9816072ab67d628657d5645.ppt
- Number of slides: 22
Building Statistical Forecast Models Wes Wilson MIT Lincoln Laboratory April, 2001 Stat. Fcst. Models Wes Wilson 3/15/2018 MIT Lincoln Laboratory
Experiential Forecasting
• Idea: Base the forecast on observed outcomes in previous similar situations (training data)
• Possible ways to evaluate and condense the training data
  – Categorization: seek comparable cases, usually expert-based
  – Statistical: correlation and significance analysis
  – Fuzzy Logic: combines expert and statistical analysis
• Belief: Incremental changes in predictors relate to incremental changes in the predictand
• Issues
  – Requirements on the training data
  – Development methodology
  – Automation
Outline
• Regression-based Models
• Predictor Selection
• Data Quality and Clustering
• Measuring Success
• An Example
Statistical Forecast Models
• Multi-Linear Regression: $F = w_0 + \sum_i w_i P_i$
  – $w_i$ = predictor weighting
  – $w_0$ = conditional climatology less the weighted mean predictor values
• GAM (Generalized Additive Models): $F = w_0 + \sum_i w_i f_i(P_i)$
  – $f_i$ = structure function, determined during regression
• PGAM (Pre-scaled Generalized Additive Models): $F = w_0 + \sum_i w_i f_i(P_i)$
  – $f_i$ = structure function, determined prior to regression
• The constant term $w_0$ is conditional climatology less the weighted mean bias of the scaled predictors
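A minimal numerical sketch of the multi-linear form, assuming synthetic data and NumPy (neither appears in the slides). It illustrates that when the regression includes a constant term, the fitted constant equals the mean training outcome less the weighted predictor means. The variable names and data sizes are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set (an assumption for illustration): 200 cases, 3 predictors.
P = rng.normal(loc=[2.0, -1.0, 5.0], scale=1.0, size=(200, 3))
E = 4.0 + P @ np.array([0.5, -1.5, 0.25]) + rng.normal(scale=0.3, size=200)

# Multi-linear regression F = w0 + sum_i w_i * P_i, with a column of ones for w0.
A = np.column_stack([np.ones(len(E)), P])
coef, *_ = np.linalg.lstsq(A, E, rcond=None)
w0, w = coef[0], coef[1:]

# The constant term equals the mean outcome less the weighted predictor means.
w0_from_means = E.mean() - w @ P.mean(axis=0)
print(w0, w0_from_means)
```

The two printed values should agree to rounding error, because a least-squares fit with a constant term leaves residuals with zero mean.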
Models Based on Regression
• Training data for one predictor
  – $P$: vector of predictor values
  – $E$: vector of observed events
• Residual: $R^2 = \| F(P) - E \|^2$
• Regression solutions are obtained by adjusting the parametric description of the forecast model (parameters $w$) until the objective function $J(w) = R^2$ is minimized
• Multi-Linear Regression (MLR): $J(w) = \| A w - E \|^2$
• MLR is solved by matrix algebra; the most stable solution is provided by the singular value decomposition (SVD) of $A$
Regression and Correlation
• Training data for one predictor
  – $P$: vector of predictor values
  – $E$: vector of observed events
  – Error residual: $R^2 = \| F(P) - E \|^2$
• Correlation coefficient: $r(P, E) = \dfrac{\Delta P \cdot \Delta E}{\sigma_{\Delta P}\, \sigma_{\Delta E}}$, where $\Delta P$ and $\Delta E$ denote deviations from the mean
• Fundamental Relationship. Let $F_0$ be a forecast equation with error residuals $E_0$ ($\|E_0\| = R_0$). Let $\hat{E}_0 = W_0 + W_1 P$ be a BLUE (best linear unbiased estimate) correction for $E_0$, and let $F = F_0 + \hat{E}_0$. The error residual $R_F$ of $F$ satisfies
  $R_F^2 = R_0^2 \left[ 1 - r(P, E_0)^2 \right]$
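The fundamental relationship can be checked numerically. The sketch below uses synthetic data and assumes that the residuals of the prior forecast have zero mean, as they would after a fit with a constant term. It fits the linear correction by least squares and compares the new squared residual with $R_0^2 (1 - r^2)$.

```python
import math
import random

random.seed(1)

# Synthetic data (assumed): one predictor P and the residuals E0 of a prior forecast F0.
n = 500
P = [random.gauss(0.0, 1.0) for _ in range(n)]
E0 = [0.8 * p + random.gauss(0.0, 1.0) for p in P]

# Centre E0 so that ||E0|| is measured about its mean, as after a fit with a constant term.
mean_E0 = sum(E0) / n
E0 = [e - mean_E0 for e in E0]

mean_P = sum(P) / n
dP = [p - mean_P for p in P]

# Correlation coefficient r(P, E0) = dP . dE0 / (|dP| |dE0|)
dot = sum(a * b for a, b in zip(dP, E0))
r = dot / math.sqrt(sum(a * a for a in dP) * sum(b * b for b in E0))

# Least-squares linear correction W0 + W1 * P for E0.
W1 = dot / sum(a * a for a in dP)
W0 = -W1 * mean_P  # the mean of E0 is zero after centring

R0_sq = sum(e * e for e in E0)
RF_sq = sum((e - (W0 + W1 * p)) ** 2 for e, p in zip(E0, P))

print(RF_sq, R0_sq * (1.0 - r * r))
```

The two printed numbers should match to floating-point precision; the stronger the correlation between the predictor and the existing residual, the larger the reduction.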
Model Training Considerations
• Assumption: The training data are representative of what is expected during the implementation period
• Simple models are less likely to capture undesirable (nonstationary) short-term fluctuations in the training data
• The climatology of the training period should match that expected in the intended implementation period (decade scale)
• It is irrational to expect that short training periods can lead to models with long-term skill
  – Plan for repeated model tuning
  – Design self-tuning into the system
• It is desirable to have many more training cases than model parameters
"The only way to prepare for the future is to prepare to be surprised; that doesn’t mean we have to be flabbergasted." (Kenneth Boulding)
GAM
• An established statistical technique, which uses the training data to define nonlinear scaling of the predictors
• The standard implementation represents the structure functions as B-splines with many knots, which requires a large set of training data
• The forecast equations are determined by linear regression, including the nonlinear scaling of the predictors: $F = w_0 + \sum_i w_i f_i(P_i)$
• The objective is to minimize the error residual
• If a GAM model has $p$ predictors and $k$ knots per structure function, then the regression model has $kp + 1$ (linear) regression parameters
• The structure functions are influenced by all of the predictors, and may change if the predictor mix is altered
PGAM: Pre-scaled GAM
• A new statistical technique, which permits the use of training sets that are decidedly smaller than those for GAM
• Once the structure functions are selected, the forecast equations are determined by linear regression of the pre-scaled predictors: $F = w_0 + \sum_i w_i f_i(P_i)$
• Determination of the structure functions is based on enhancing the correlation of the (scaled) predictor with the error residual of conditional climatology: maximize $r( f_i(P_i), \Delta E )$
• The structure function is determined for each predictor separately
• Composite predictors should be scaled as composites
• The structure functions often have interpretations in terms of scientific principles and forecasting techniques
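One way to picture the pre-scaling step is as a search over candidate structure functions for a single predictor, keeping the one whose scaled values correlate best with the residual from conditional climatology. The slides do not say which family of functions is searched or how, so the candidate set, the data, and the helper `correlation` below are all hypothetical.

```python
import math
import random

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    return num / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical candidate structure functions; the slides do not specify a family.
candidates = {
    "identity": lambda p: p,
    "sqrt": math.sqrt,
    "log1p": math.log1p,
    "square": lambda p: p * p,
}

random.seed(2)
# Synthetic non-negative predictor and residual from conditional climatology (assumed).
P = [random.uniform(0.0, 50.0) for _ in range(400)]
dE = [math.log1p(p) + random.gauss(0.0, 0.2) for p in P]

# Score each candidate by r(f(P), dE) and keep the one with the largest magnitude.
scores = {name: correlation([f(p) for p in P], dE) for name, f in candidates.items()}
best_name = max(scores, key=lambda name: abs(scores[name]))
print(best_name, round(scores[best_name], 3))
```

Because each predictor is scaled on its own, the search is repeated per predictor and the chosen functions do not change when other predictors are added or removed.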
Predictors
• Every method involves a choice of predictors
• The Great Predictor Set: everything relevant and available
• Possible reduction based on correlation analysis
• Predictor selection strategies
  – Sequential Addition
  – Sequential Deletion
  – Ensemble Decision (SVD)
• Changing the predictor list changes the model weights; for GAM, it also changes the structure functions
Computing Solutions for the Basic Regression Problem
• Setting: predictor list $\{P_i\}_{i=1}^{n}$ and observed outcomes $b$ over the $m$ trials of the training set
• Basic Linear Regression Problem: $A w = b$, where the columns of the $m \times n$ matrix $A$ are the lists of observed predictor values over the trials
• Normal Equations: $A^T A\, w = A^T b$
• Linear Algebra: $w = (A^T A)^{-1} A^T b$
• Optimization: find $w$ to minimize $R^2 = \| A w - b \|^2$
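The normal-equation and optimization views give the same weights when $A$ has full column rank. The sketch below compares them with NumPy on synthetic data; the matrix sizes and values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic training set (assumed): m = 100 trials, n = 4 predictors.
m, n = 100, 4
A = rng.normal(size=(m, n))
b = A @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=m)

# Normal equations: solve (A^T A) w = A^T b without forming the inverse explicitly.
w_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Optimization view: minimize |A w - b|^2 with an SVD-based least-squares solver.
w_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

residual_sq = np.sum((A @ w_lstsq - b) ** 2)
print(w_normal, w_lstsq, residual_sq)
```

Forming $A^T A$ squares the condition number of the problem, which is one reason an SVD-based solver is usually preferred when predictors are strongly correlated.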
SVD – Singular Value Decomposition
• $A = U S V^T$, where $U$ and $V$ are orthogonal matrices and $S = [\, \Sigma \mid 0 \,]^T$, where $\Sigma$ is diagonal with positive diagonal entries
• $U^T A w = S V^T w = U^T b$
• Set $\omega = V^T w$, $\beta = U^T b$
• Restatement of the Basic Problem:
  – $S V^T w = \beta$ (original problem space), or
  – $S \omega = \beta$ ($V^T$-transformed problem space)
• Since $U$ is orthogonal, the error residual is not altered by this restatement of the problem
• CAUTION: Analysis of residuals can be misleading unless the dynamic ranges of the predictor values have been standardized
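A short NumPy sketch of the SVD restatement, on synthetic data (an assumption). It solves the diagonal system in the transformed space, maps the solution back to the original weights, and confirms that the residual is the same in both spaces.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 50, 3
A = rng.normal(size=(m, n))      # synthetic predictor matrix (assumed)
b = rng.normal(size=m)           # synthetic observed outcomes (assumed)

# Full SVD: U is m x m, Vt is n x n, s holds the n singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

beta = U.T @ b                   # transformed right-hand side, beta = U^T b
omega = beta[:n] / s             # solve the diagonal system s_i * omega_i = beta_i
w = Vt.T @ omega                 # map back to the original space, w = V omega

# The residual in the original space equals the unresolved part of beta.
res_original = np.sum((A @ w - b) ** 2)
res_transformed = np.sum(beta[n:] ** 2)
print(res_original, res_transformed)
```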
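Structure of the Error Residual Vector
• With $\omega = V^T w$ and $\beta = U^T b$, the system $S \omega = \beta$ has the form
  $$\begin{bmatrix} \mathrm{diag}(s_1, s_2, \dots, s_n) \\ 0 \end{bmatrix} \begin{bmatrix} \omega_1 \\ \omega_2 \\ \vdots \\ \omega_n \end{bmatrix} = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_n \\ \beta_{n+1} \\ \vdots \\ \beta_m \end{bmatrix}$$
• The $s_i$ are usually decreasing; require $s_n > 0$, or reduce the predictor list
• For $i \le n$: $\omega_i = \beta_i / s_i$
• For $i > n$, there is no solution. This is the portion of the problem that is not resolved by these predictors
• Magnitude of the unresolved portion of the problem: $R_*^2 = \sum_{i=n+1}^{m} \beta_i^2$
• Truncated Problem: for $i > k$, set $\omega_i = 0$. This increases the error residual to
  $R_k^2 = \sum_{i=k+1}^{m} \beta_i^2 = R_*^2 + \sum_{i=k+1}^{n} \beta_i^2$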
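The truncation formula can be verified directly. In this sketch (synthetic data and an arbitrary truncation level `k`, both assumptions), the transformed weights beyond `k` are set to zero and the resulting residual is compared with the unresolved part plus the discarded components.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, k = 60, 5, 3               # k is an arbitrary truncation level for illustration
A = rng.normal(size=(m, n))      # synthetic predictor matrix (assumed)
b = rng.normal(size=m)           # synthetic observed outcomes (assumed)

U, s, Vt = np.linalg.svd(A, full_matrices=True)
beta = U.T @ b

# Unresolved portion: components of beta beyond the n-th cannot be matched.
R_star_sq = np.sum(beta[n:] ** 2)

# Truncated problem: keep the first k transformed weights, zero the rest.
omega = beta[:n] / s
omega[k:] = 0.0
w_k = Vt.T @ omega

# Residual of the truncated solution versus the closed-form expression.
R_k_sq = np.sum((A @ w_k - b) ** 2)
predicted = R_star_sq + np.sum(beta[k:n] ** 2)
print(R_k_sq, predicted)
```

Each discarded direction adds exactly the square of its transformed right-hand-side component to the residual, so the cost of truncating at any level can be read off before refitting.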
Controlling Predictor Selection
• SVD / PC analysis provides guidance
• Truncation in $\omega$ space (where $\omega = V^T w$) reduces the degrees of freedom
• Truncation does not provide nulling of predictors, since zero components of $\omega$ do not lead to zero components of $w = V \omega$
• Seek a linear forecast model of the form $F(a) = a^T w = \sum_i w_i a_i$, where $a$ is a vector of predictor values
• Predictor Nulling:
  – The $i$th predictor is eliminated from the problem if $w_i = 0$
• Benefits of predictor nulling
  – Provides simple models
  – Eliminates designated predictors (missing data problem)
  – Quantifies the incremental benefit provided by essential predictors (sensor benefit problem)
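The difference between truncation and nulling can be seen in a small example. Truncating the transformed weights ($\omega = V^T w$) still leaves every original weight nonzero, whereas nulling forces a chosen weight to zero. The slides do not spell out the nulling procedure; here it is approximated by dropping the predictor's column and refitting, which is an assumption, as are the synthetic data.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 80, 4
A = rng.normal(size=(m, n))      # synthetic predictor matrix (assumed)
b = A @ np.array([2.0, 0.0, -1.0, 0.5]) + rng.normal(scale=0.2, size=m)

# Truncation: zero the transformed weight for the smallest singular value.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
omega = (U.T @ b) / s
omega[-1] = 0.0
w_truncated = Vt.T @ omega       # generally has no exactly-zero components

# Predictor nulling (assumed procedure): drop column j and refit, so that w_j = 0.
j = 1
keep = [i for i in range(n) if i != j]
w_sub, *_ = np.linalg.lstsq(A[:, keep], b, rcond=None)
w_nulled = np.zeros(n)
w_nulled[keep] = w_sub

print(w_truncated)
print(w_nulled)
```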
Predictor Selection Process
• Gross Predictor Selection (availability & correlation)
• SVD for problem sizing and gross error estimation
• Truncation and Predictor Nulling: maximal model(s) (there may be more than one good solution)
• Successive Elimination in the Original Problem Space: minimal model (until SD starts to grow rapidly)
• Successive Augmentation in the Original Problem Space
• At this point, the good solutions are bracketed between the maximal and the minimal models; exhaustive searches are probably feasible, and cross-validation is wise.
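The successive-elimination step can be sketched as a greedy loop that removes one predictor at a time and stops when the residual spread starts to grow quickly. The 5% growth threshold, the use of in-sample RMS error in place of a cross-validated SD, and the function names are all assumptions made for this illustration; the slides do not give these details.

```python
import numpy as np

def residual_sd(A, b, cols):
    """Root-mean-square least-squares residual using only the given columns."""
    w, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
    return float(np.sqrt(np.mean((A[:, cols] @ w - b) ** 2)))

def successive_elimination(A, b, max_growth=0.05):
    """Drop predictors one at a time while the residual grows by less than max_growth."""
    cols = list(range(A.shape[1]))
    sd = residual_sd(A, b, cols)
    while len(cols) > 1:
        # Try removing each remaining predictor; keep the removal that hurts least.
        trials = [(residual_sd(A, b, [c for c in cols if c != j]), j) for j in cols]
        new_sd, drop = min(trials)
        if new_sd > sd * (1.0 + max_growth):
            break                # the residual starts to grow rapidly: stop here
        cols.remove(drop)
        sd = new_sd
    return cols, sd

rng = np.random.default_rng(7)
A = rng.normal(size=(150, 6))
# Only predictors 0 and 3 carry signal in this synthetic example (assumed).
b = 2.0 * A[:, 0] - 1.5 * A[:, 3] + rng.normal(scale=0.5, size=150)

minimal_cols, minimal_sd = successive_elimination(A, b)
print(minimal_cols, round(minimal_sd, 3))
```

In practice the stopping test would be evaluated on held-out cases, in line with the cross-validation advice on the slide, since in-sample error never improves when a predictor is removed.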
Creating 15Z Satellite Forecast Models (1)
• 149 marine stratus days from 1996 to 2000
• 51 sectors and 3 potential predictors per sector (153)
• Compute the correlation of each predictor with the residual from conditional climatology
• Retain only predictors with correlation greater than 0.25; this reduces the predictor list to 45 predictors
• Separate analysis for two data sets, Raw and PGAM
• Truncate each when the SD reduction drops below 1.5%
[Figure panels: RAW, PGAM]
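The correlation screen can be sketched as follows. The array sizes match the slide (149 days, 153 candidate predictors) and the threshold is the stated 0.25, but the data are synthetic, the function name is hypothetical, and comparing the magnitude of the correlation (rather than its signed value) is an assumption.

```python
import numpy as np

def screen_predictors(X, residual, threshold=0.25):
    """Return indices of columns whose |correlation| with the residual exceeds threshold."""
    Xc = X - X.mean(axis=0)
    rc = residual - residual.mean()
    r = (Xc.T @ rc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(rc))
    return [i for i, ri in enumerate(r) if abs(ri) > threshold], r

rng = np.random.default_rng(8)
# Sizes follow the slide (149 days, 153 candidate predictors); the values are synthetic.
X = rng.normal(size=(149, 153))
# In this synthetic example only the first three predictors carry signal (assumed).
residual = X[:, :3] @ np.array([1.0, 1.0, 1.0]) + rng.normal(scale=0.5, size=149)

kept, r = screen_predictors(X, residual)
print(len(kept), "predictors retained")
```

With only 149 cases, a pure-noise predictor can occasionally clear a 0.25 threshold by chance, which is one reason the later truncation and cross-validation steps matter.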
Creating 15Z Satellite Forecast Models (2)
• SVD: Truncate 6
• Pred. Nulling in the truncation space: null to 7 predictors with acceptable error growth
• Maximal Problems (R-8, P-7)
• Minimal Problems (R-5, P-4)
• Neither problem would accept augmentation according to the strict cross-validation test
• Different predictors were selected
[Figure panels: Raw Data (SVD Raw 6) and PGAM Data (SVD PGAM 6); values shown, in slide order: Sigma PC 6 = 1.134, Sigma PC 6 = 0.999, Sigma = 1.148, Sigma = 0.999]
Data Quality and Clustering
• DQA is similar to NWP
  – needs to be done for the training set
  – probably need to work to tighter standards
• Data Clustering
  – During training: manual ++
  – For implementation: fully automated
• Conditional Climatology based on Clustering
Satellite Statistical Model (MIT/LL)
• 1-km visible channel (brightness)
• Data pre-processing
  – re-mapping to 2 km grid
  – 3 x 3 median smoother
  – normalized for sun angle
  – calibrated for lens graying
• Grid points grouped into sectors (sectorization)
  – topography
  – physical forcing
  – operational areas
• Sector statistics
  – Brightness
  – Coverage
  – Texture
• 4 year data archive, 153 predictors
• PGAM Regression Analysis
Consensus Forecast
[Diagram elements]
• Day Characterization
  – Wind direction
  – Inversion height
  – Forcing influences
• COBEL
• Local SFM
• Regional SFM
• Satellite SFM
• Forecast Weighting Function
• Consensus Forecast
Measuring Success
Conclusions
• PGAM, SVD/PC, and Predictor Nulling provide a systematic way to approach the development of linear forecast models via regression
• This methodology provides a way to investigate the elimination of specific predictors, which could be useful in the development of contingency models
• We are investigating full automation