  • Number of slides: 48

TRIM Workshop
Arco van Strien, Wildlife statistics, Statistics Netherlands (CBS)

What is TRIM?
• TRends and Indices for Monitoring data
• Computer program for the analysis of time series of count data with missing observations
• Loglinear, Poisson regression (GLM)
• Made for the production of wildlife statistics by Statistics Netherlands (Jeroen Pannekoek / freeware / version 3.0)
(Section: Introduction)

Why TRIM?
• To get better indices? No, GLM in statistical packages (Splus, Genstat, ...) may produce similar results
• But statistical packages are often impractical for large datasets
• TRIM is easier to use

The program of this workshop
Aim: a basic understanding of TRIM
• basic theory of imputation
• how to use TRIM to impute missing counts and to assess indices etc.
• basic theory of the weighting procedure to cope with unequal sampling of areas & how to use TRIM to weight particular sites

INDEX: the total (= sum of all sites) for a year divided by the total of the base year
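As a minimal sketch of the index definition above, the following Python computes the yearly totals over all sites and divides them by the base-year total. The counts are hypothetical and the function name `indices` is an illustrative choice, not part of TRIM.

```python
def indices(counts, base_year=0):
    """Return the index per year: yearly total divided by the base-year total.

    counts: one list per site, holding one count per year (no missing values).
    """
    n_years = len(counts[0])
    # Total per year = sum over all sites.
    totals = [sum(site[year] for site in counts) for year in range(n_years)]
    return [total / totals[base_year] for total in totals]


# Hypothetical complete counts: 2 sites x 3 years.
counts = [
    [2, 4, 3],  # site 1
    [1, 2, 3],  # site 2
]
print(indices(counts))  # [1.0, 2.0, 2.0]
```

Multiplying the result by 100 gives the index on the familiar "base year = 100" scale.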

Missing values affect indices
(Section: Theory imputation)

How to impute missing values?
[Slide table values: 2, 6, 200]
Estimate of site 2 in year 2? Site 1 suggests: twice the number of year 1 (site & year effect taken into account)

Another example...
[Slide table values: 6, 8, 200]
Estimate of site 2 in year 2? Site 1 suggests: twice the number of year 1

And another example...
[Slide table values: 9, 12, 300]
Estimate of site 2 in year 2? Site 1 suggests: three times as many as in year 1
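The reasoning in these examples can be sketched as follows: when one site is complete and the other lacks its year 2 count, the year-to-year change seen at site 1 is applied to the year 1 count of site 2. The counts below are hypothetical, chosen only to illustrate "twice" and "three times"; they are not taken from the slide tables, and `impute_year2` is an illustrative name.

```python
def impute_year2(site1_y1, site1_y2, site2_y1):
    """Estimate the missing count of site 2 in year 2 from the change at site 1."""
    year_effect = site1_y2 / site1_y1  # e.g. 2.0 means "twice the number of year 1"
    return site2_y1 * year_effect


print(impute_year2(1, 2, 3))  # 6.0  (site 1 doubled, so site 2 is assumed to double)
print(impute_year2(1, 3, 3))  # 9.0  (site 1 tripled, so site 2 is assumed to triple)
```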

Try this one...
There is no unique solution (TRIM will report an error)

Difficult to guess the missing values here...

Estimating missing values by an iterative procedure (required in case of more than a few missing values)

First estimate of site 2, year 2: 1 × 4/7 ≈ 0.6
With this estimate filled in, the totals 1, 4 and 7 become 1.6, 4.6 and 7.6
Recalculate the margin totals and repeat the estimation of the missing value

2nd estimate of site 2, year 2: 1.6 × 4.6/7.6 ≈ 0.96
Repeat again: missing value = 1.22, 1.40, 1.54, etc., converging to 2
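The iterative procedure can be sketched in a few lines of Python. This is a didactic version of the margin-total iteration described on the slides, not TRIM's own (much faster) algorithm. The table `[[2, 4], [1, None]]` is inferred from the totals 1, 4 and 7 used in the first estimate, and the function name and the `None` convention for missing cells are assumptions made for illustration.

```python
def impute_iteratively(table, tol=1e-10, max_iter=10_000):
    """Fill missing cells (None) with row total * column total / grand total,
    recomputing the margin totals after every round until the estimates settle."""
    n_rows, n_cols = len(table), len(table[0])
    missing = [(i, j) for i in range(n_rows) for j in range(n_cols)
               if table[i][j] is None]
    # Start with 0 for every missing cell, as in the slide's first step.
    filled = [[0.0 if v is None else float(v) for v in row] for row in table]
    if not missing:
        return filled
    for _ in range(max_iter):
        row_tot = [sum(row) for row in filled]
        col_tot = [sum(filled[i][j] for i in range(n_rows)) for j in range(n_cols)]
        grand = sum(row_tot)
        new = {(i, j): row_tot[i] * col_tot[j] / grand for i, j in missing}
        change = max(abs(new[(i, j)] - filled[i][j]) for i, j in missing)
        for (i, j), value in new.items():
            filled[i][j] = value
        if change < tol:
            break
    return filled


# Rows are sites, columns are years; site 2 was not counted in year 2.
table = [[2, 4], [1, None]]
result = impute_iteratively(table)
print(round(result[1][1], 4))  # 2.0
```

Because the sketch does not round intermediate values, its first few estimates differ slightly from the rounded figures on the slides; both sequences converge to 2.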

• To get proper indices, it is necessary to estimate (impute) missing values
• Missing values may be estimated from the margin totals using an iterative procedure (taking into account both the site effect and the year effect) (Note: TRIM uses a much faster algorithm to impute missing values)
• Assumption: year-to-year changes are similar for all sites (this assumption will be relaxed later!)
• Test this assumption using a goodness-of-fit test (X² test)

X²: compare expected counts with real counts per cell
Expected counts (in parentheses on the slide): 1.8, 1.2, 4.2, 2.8
X² is the sum over all cells of (counted − expected value)² / expected value:
(2 − 1.8)²/1.8 + (4 − 4.2)²/4.2 + etc. gives X² = 0.08 with a p-value of 0.78
Model not rejected (it fits, but note: cell values in this example are too small for a proper X² test)
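The X² calculation can be reproduced with a short Python sketch. The expected counts and the observed counts 2 and 4 come from the slide. The other two observed counts (1 and 3) are an assumption, inferred from the fact that observed and expected margin totals must agree. One degree of freedom is also assumed (a 2 × 2 table), which matches the reported p-value.

```python
import math


def pearson_chi_square(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(x2):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x2 / 2.0))


observed = [2, 1, 4, 3]          # 2 and 4 are on the slide; 1 and 3 are inferred
expected = [1.8, 1.2, 4.2, 2.8]  # from the slide
x2 = pearson_chi_square(observed, expected)
p = p_value_1df(x2)
print(round(x2, 2), round(p, 2))  # 0.08 0.78
```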

Imputation without covariate (X² = 18 and p-value = 0.18)

Using a covariate: better imputations & indices, X² = 1.7, p = 0.99

What is the best model?
[Slide table not recoverable; its annotations mark models as "rejected" or "not rejected"]
Both model 2 and model 3 are valid

Summary imputation theory
• To get proper indices, it is necessary to impute missing values
• Assumption: year-to-year changes are similar for all sites of the same covariate category
• Test the assumption using a GOF test; if the p-value < 0.05, try better covariates
• If these cannot be found, the resulting indices may be of low quality (and standard errors high). See also the FAQs!

The program of this workshop
Aim: a basic understanding of TRIM
• basic theory of imputation
• how to use TRIM to impute missing counts and to assess indices etc.
• basic theory of the weighting procedure to cope with unequal sampling of areas & how to use TRIM to weight particular sites
(Section: Using TRIM)

Using TRIM
• several statistical models (time effects, linear model)
• statistical complications (overdispersion, serial correlation) taken into account
• Wald tests to test significances
• model versus imputed indices
• interpretation of slope

Time effects model (skylark data) without covariate

Time effects model with covariate (0 = total, 1 = dunes, 2 = heathland)

Linear trend model (uses trend estimate to impute missing values)

Linear trend model with a changepoint at year 2

Linear trend model with changepoints at year 2 and 3

Linear trend model with all changepoints = time effects model
Use the linear trend model when:
• data are too sparse for the time effects model
• one is interested in testing trends, e.g. trends before and after a particular year (or let TRIM search stepwise for relevant changepoints)
But be careful with simple linear models!
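For reference, the model families named on these slides can be written as log-linear models for the expected count at site i in year j. The notation below (the symbols for site effects, slopes, year effects, and the changepoint year k) is assumed here for illustration and does not appear on the slides; the TRIM manual gives the authoritative parameterization.

```latex
\begin{align*}
\text{Linear trend:} \quad & \ln \mu_{ij} = \alpha_i + \beta\,(j - 1) \\
\text{Time effects:} \quad & \ln \mu_{ij} = \alpha_i + \gamma_j, \qquad \gamma_1 = 0 \\
\text{One changepoint at year } k\text{:} \quad & \ln \mu_{ij} =
  \begin{cases}
    \alpha_i + \beta_1\,(j - 1) & \text{if } j \le k \\
    \alpha_i + \beta_1\,(k - 1) + \beta_2\,(j - k) & \text{if } j > k
  \end{cases}
\end{align*}
```

With a changepoint at every year, each year-to-year step gets its own slope, so the piecewise-linear model can reproduce any set of year effects. That is why the linear trend model with all changepoints equals the time effects model.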

Statistical complications:
• Serial correlation: dependence of counts on the counts of earlier years (0 = no correlation)
• Overdispersion: deviation from the Poisson distribution (1 = Poisson)
Run TRIM with overdispersion = on and serial correlation = on; otherwise standard errors and statistical tests are usually invalid
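To make the overdispersion scale (1 = Poisson) concrete, here is a sketch of one common estimator for Poisson-type models: the Pearson X² divided by the residual degrees of freedom. The slides do not state that TRIM computes it exactly this way, so treat this as an assumption. The counts, fitted values, and parameter count below are hypothetical.

```python
def overdispersion(observed, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom.

    A value near 1 is consistent with Poisson variation; values well above 1
    indicate overdispersion.
    """
    x2 = sum((o - f) ** 2 / f for o, f in zip(observed, fitted))
    df = len(observed) - n_params
    return x2 / df


# Hypothetical counts and model predictions for six site-by-year cells.
observed = [12, 3, 9, 20, 1, 15]
fitted = [8.0, 6.0, 10.0, 14.0, 4.0, 18.0]
sigma2 = overdispersion(observed, fitted, n_params=2)
print(sigma2 > 1)  # True: more variation than a Poisson model expects
```

In quasi-Poisson analyses, standard errors are typically inflated by the square root of this factor, which is why ignoring overdispersion tends to make tests too optimistic.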

Running TRIM: features
• TRIM command file
• output: GOF (as X²) test and Wald tests
• output (fitted values, indices)
• indices, time totals
• overall trend slope
• Frequently Asked Questions
• different models (linear trend model, changepoints, covariate)

What is the best model?
Both 2 and 3 are valid. Model 3 is the most parsimonious model.

Model choice
• The indices depend on the statistical model!
• TRIM lets you search for the best model using the GOF test, Akaike's Information Criterion and Wald tests
• In case of substantial overdispersion, one has to rely on the Wald tests

Wald tests
Different Wald tests to test for the significance of:
• the trend slope parameters
• changes in the slope
• deviations from a linear trend
• the effect of each covariate

TRIM generates both model indices and imputed indices

Imputed vs model indices
• Imputed indices: sum of the real counts plus, for missing counts, the model predictions. Closer to the real counts (more realistic course in time)
• Model indices: sum of the model predictions for all sites. Often more stable
Usually model and imputed indices hardly differ!
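The difference between the two kinds of totals can be sketched as follows. The observed counts (with `None` for missing) and the model predictions are hypothetical, and the function names are illustrative, not TRIM output fields.

```python
def time_totals(observed, predicted):
    """Return (imputed totals, model totals) per year.

    Imputed total: real count where available, model prediction where missing.
    Model total: model prediction for every site.
    """
    n_years = len(observed[0])
    imputed, model = [], []
    for year in range(n_years):
        imputed.append(sum(
            pred[year] if obs[year] is None else obs[year]
            for obs, pred in zip(observed, predicted)))
        model.append(sum(pred[year] for pred in predicted))
    return imputed, model


def to_index(totals, base_year=0):
    """Divide each yearly total by the base-year total."""
    return [t / totals[base_year] for t in totals]


# Hypothetical data: 2 sites x 3 years, None marks a missing count.
observed = [[10, 14, None], [5, None, 9]]
predicted = [[10.5, 13.0, 16.0], [5.5, 7.0, 8.0]]
imputed, model = time_totals(observed, predicted)
print(imputed, model)  # [15, 21.0, 25.0] [16.0, 20.0, 24.0]
print(to_index(imputed), to_index(model))
```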

TRIM computes both additive and multiplicative slopes
• Additive slope: 0.0485 (s.e. 0.0124)
• Multiplicative slope: 1.0497 (s.e. 0.0130)
Relation: ln(1.0497) = 0.0485
Multiplicative parameters are easier to understand
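The relation between the two slopes can be checked numerically with the values from the slide. Converting the standard error by multiplying with the multiplicative slope is a delta-method approximation; that TRIM converts it this way is an assumption, although the result agrees with the slide.

```python
import math

additive, additive_se = 0.0485, 0.0124  # values from the slide

# The multiplicative slope is the exponential of the additive slope.
multiplicative = math.exp(additive)
# Delta-method approximation for the standard error (assumed, see lead-in).
multiplicative_se = multiplicative * additive_se

print(round(multiplicative, 4))     # 1.0497
print(round(multiplicative_se, 4))  # 0.013
```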

Interpretation of the multiplicative slope
• A slope of 1.05 means a 5% increase per year
• A standard error of 0.013 means a confidence interval of ± 2 × 0.013 = ± 0.026
• Thus, the slope lies between 1.024 and 1.076
• Or, a 2% to 8% increase per year = significantly different from 1
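The interval arithmetic on this slide can be written out as a small sketch. The multiplier of 2 standard errors follows the slide (roughly a 95% interval); the function name is illustrative.

```python
def approx_confidence_interval(slope, se, z=2.0):
    """Slope plus or minus z standard errors (z = 2 as on the slide)."""
    return slope - z * se, slope + z * se


lower, upper = approx_confidence_interval(1.05, 0.013)
print(round(lower, 3), round(upper, 3))  # 1.024 1.076

# The trend is significant when the interval excludes 1 (= no change).
significant = not (lower <= 1.0 <= upper)
print(significant)  # True
```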

Summary use of TRIM:
• choice between time effects and linear trend model
• include overdispersion & serial correlation in models
• use GOF and Wald tests for better models and indices & to test hypotheses
• choice between model and imputed indices
• use multiplicative slope

The program of this workshop
Aim: a basic understanding of TRIM
• basic theory of imputation
• how to use TRIM to impute missing counts and to assess indices etc.
• basic theory of the weighting procedure to cope with unequal sampling of areas & how to use TRIM to weight particular sites
(Section: Weighting)

Unequal sampling due to
• stratified random site selection, with oversampling of particular strata. Weighting results in unbiased national indices
• site selection by the free choice of observers, with oversampling of particular regions & attractive habitat types. Weighting reduces the bias of indices

To cope with unequal sampling:
• stratify the data, e.g. into regions and habitat types
• strata are expected to have different indices & trends
• weight strata according to (1) the number of sample sites in the stratum and (2) the surface area of the stratum
• or weight by population size per stratum

Weighting factor for each stratum
[Figure annotations: "or 10", "or 5"]
Weighting factor for stratum i = total area of i / area of i sampled

Another example...
• 100/5 = 20 (or 4)
• 50/10 = 5 (or 1)
Weighting factor for stratum i = total area of i / area of i sampled
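The weighting factors in this example can be reproduced with a short sketch. The areas come from the slide; the stratum names, the site counts, and the function names are hypothetical.

```python
def weight_factor(total_area, sampled_area):
    """Weighting factor for a stratum: total area / area sampled."""
    return total_area / sampled_area


weights = {
    "stratum A": weight_factor(100, 5),  # 20.0
    "stratum B": weight_factor(50, 10),  # 5.0
}

# Only the ratio between weights matters for an index, so they can be
# rescaled, e.g. by the smallest weight: 20 and 5 become 4 and 1.
smallest = min(weights.values())
relative = {name: w / smallest for name, w in weights.items()}
print(relative)  # {'stratum A': 4.0, 'stratum B': 1.0}


def weighted_total(stratum_totals, weights):
    """Sum of the stratum totals, each multiplied by its weighting factor."""
    return sum(weights[name] * total for name, total in stratum_totals.items())


# Hypothetical summed counts of the sampled sites per stratum in one year.
print(weighted_total({"stratum A": 12, "stratum B": 30}, weights))  # 390.0
```

Rescaling all weights by the same constant multiplies every yearly total by that constant, so the index (a ratio of totals) is unchanged. That is why the slide can offer 4 and 1 as alternatives to 20 and 5.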

Weighting in TRIM
• include a weight factor (different per stratum) in the data file for each site and year record
• weight strata and combine the results to produce a weighted total (= run TRIM with weighting = on and covariate = on)

Indices for Skylark, unweighted (0 = total index, 1 = dunes, 2 = heathland)

Indices for Skylark with weight factor for each dune site = 10 (0 = total index, 1 = dunes, 2 = heathland)

Final remarks
To facilitate the calculation of many indices on a routine basis:
• TRIM in batch mode, using TRIM Command Language (see manual)
• Option to incorporate TRIM in your own automation system (e.g. Access or Delphi) (not in manual)

That's all, but:
• if you have any questions about TRIM, see the manual, the FAQs in TRIM, or mail Arco van Strien (asin@cbs.nl)
Good luck!