Department for Environment, Food & Rural Affairs

Interim Report on the Effects of the Management of
Field Scale Releases of GM Herbicide Tolerant Crops on the
Abundance and Diversity of Farmland Wildlife (October 2000)


3 PROGRESS ON DATA ANALYSIS

3.1 Testing the null hypothesis

A standard program has been written in the statistical package GENSTAT. It performs a basic statistical analysis on a single variate of interest, for example the count of all seeds in the seed bank of the beet crop during 2000. The goal is to link the program with the developing database, so that when data are updated within the database, a new analysis is generated automatically and the results are stored for later inspection. The output of this preliminary analysis will allow the crop co-ordinators to screen the data quickly for those variates worthy of further study. This further study will take the form of more detailed analyses, done in collaboration with the project statisticians, which may well involve further covariates of interest and allow deeper interpretation.

The model underlying the basic analysis is tailored to integer count data on abundance; the unit of analysis is the total count for a given half-field. Two main approaches are used.

In the first, termed the 'Normal model', the variance of each count is assumed to be proportional to the square of its mean. This leads naturally to a logarithmic transformation for efficient analysis; an analysis of variance, with the farms as blocks, is used to provide a test of the null hypothesis. This analysis implicitly assumes a log-normal distribution for the counts. In the second, termed the 'Log-linear model', the variance is assumed to be proportional to the mean. The null hypothesis is tested for this model using a generalised linear model with Poisson errors and logarithmic link, with the scale parameter estimated. Both models are widely used in the scientific literature to analyse weed and insect counts; both provide an F-statistic to test the null hypothesis.
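The two models described above can be illustrated with a minimal Python sketch. It is not the GENSTAT program itself. The function names, the offset of 1 added before taking logarithms (to cope with zero counts), the use of the residual mean deviance as the scale estimate, and the half-field totals are all assumptions made for illustration. The sketch relies on two standard results for a design with two treatments per farm: the blocked ANOVA F-statistic equals the square of the paired t-statistic, and the Poisson log-linear fit with farm and treatment terms has fitted values equal to farm total times treatment total divided by grand total.

```python
import math
from statistics import mean, stdev


def normal_model(counts_a, counts_b, offset=1.0):
    """'Normal model': blocked ANOVA on log counts, farms as blocks.

    Returns (d, F), where d is the mean within-farm log difference and
    F (on 1 and n-1 d.f.) is the square of the paired t-statistic.
    The offset is an assumption, not taken from the report.
    """
    diffs = [math.log(a + offset) - math.log(b + offset)
             for a, b in zip(counts_a, counts_b)]
    n = len(diffs)
    d = mean(diffs)
    t = d / (stdev(diffs) / math.sqrt(n))
    return d, t * t


def poisson_deviance(observed, fitted):
    """Poisson deviance; a zero count contributes 2 * mu."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        if y > 0:
            total += y * math.log(y / mu)
        total -= y - mu
    return 2.0 * total


def log_linear_model(counts_a, counts_b):
    """'Log-linear model': Poisson errors, log link, scale estimated.

    Returns (r, F), where r is the log ratio of treatment totals and
    F (on 1 and n-1 d.f.) is the deviance change on dropping the
    treatment term, divided by the residual mean deviance (an assumed
    choice of scale estimate). Assumes both treatment totals are positive.
    """
    farm_totals = [a + b for a, b in zip(counts_a, counts_b)]
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand = total_a + total_b
    observed = list(counts_a) + list(counts_b)
    # Closed-form fitted values for the farms + treatment model.
    fitted_full = ([t * total_a / grand for t in farm_totals]
                   + [t * total_b / grand for t in farm_totals])
    # Farms-only model: each half-field gets half of its farm total.
    fitted_null = [t / 2 for t in farm_totals] * 2
    dev_full = poisson_deviance(observed, fitted_full)
    dev_null = poisson_deviance(observed, fitted_null)
    scale = dev_full / (len(farm_totals) - 1)
    r = math.log(total_a / total_b)
    return r, (dev_null - dev_full) / scale


# Made-up half-field totals for six farms (illustration only).
conventional = [120, 45, 300, 80, 15, 210]
gm_ht = [90, 50, 220, 60, 10, 150]

d, f_normal = normal_model(conventional, gm_ht)
r, f_loglin = log_linear_model(conventional, gm_ht)
print(f"Normal model:     d = {d:.3f}, F = {f_normal:.2f}")
print(f"Log-linear model: r = {r:.3f}, F = {f_loglin:.2f}")
```

Each F-statistic would be referred to an F distribution on 1 and n-1 degrees of freedom, where n is the number of farms. Statistical packages differ in how they estimate the scale parameter (mean deviance or Pearson chi-square), so the log-linear F here may not match a given package exactly.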

Diagnostic plots of residuals are given in the program to check the distributional assumptions made by both models. Either or both of the F-tests might lack sufficient robustness to deviations from the log-normal or Poisson distributions, respectively. Hence, three Monte Carlo paired randomisation tests were also done. These randomisation tests are especially useful when there are many small and/or zero counts in a data set. In the first, the test statistic was the same as that tested in the ANOVA, i.e. d, the mean difference between the treatments on a logarithmic scale in the Normal model. The second randomisation statistic, denoted r, was based directly on the Log-linear model, and was calculated as the logarithm of the ratio of the overall arithmetic means of the two treatments. The third test was introduced to try to accommodate the intermediate case, in which the exponent in the variance-mean relationship is 1.5. The third statistic, denoted dw, is a weighted version of d, with weights based on a theoretical expectation of the variance that may be derived from the data in practice. The randomisation tests use 999 simulations within each run to estimate p-values.
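The paired randomisation procedure for the statistics d and r can be sketched in Python as follows. This is an illustration under stated assumptions rather than the program used in the study: the function names, the offset of 1 inside the logarithm, the two-sided comparison, the convention of counting the observed value as one of the 1000 outcomes, and the example counts are all hypothetical. The statistic dw is not sketched because its weights are not specified here.

```python
import math
import random


def stat_d(a, b, offset=1.0):
    """Mean within-farm difference on the log scale (offset is assumed)."""
    return sum(math.log(x + offset) - math.log(y + offset)
               for x, y in zip(a, b)) / len(a)


def stat_r(a, b):
    """Log ratio of the overall arithmetic means of the two treatments.

    With equal numbers of half-fields per treatment this equals the log
    ratio of totals. Assumes both totals are positive.
    """
    return math.log(sum(a) / sum(b))


def paired_randomisation_p(a, b, statistic, n_sims=999, seed=None):
    """Monte Carlo p-value from randomly swapping labels within farms.

    Under the null hypothesis the two half-fields of a farm are
    exchangeable, so each pair is swapped with probability 1/2.
    """
    rng = random.Random(seed)
    observed = abs(statistic(a, b))
    extreme = 0
    for _ in range(n_sims):
        sim_a, sim_b = [], []
        for x, y in zip(a, b):
            if rng.random() < 0.5:
                x, y = y, x
            sim_a.append(x)
            sim_b.append(y)
        # Small tolerance guards against floating-point ties.
        if abs(statistic(sim_a, sim_b)) >= observed - 1e-12:
            extreme += 1
    # The observed arrangement counts as one of n_sims + 1 outcomes.
    return (extreme + 1) / (n_sims + 1)


# Made-up half-field totals for six farms (illustration only).
conventional = [120, 45, 300, 80, 15, 210]
gm_ht = [90, 50, 220, 60, 10, 150]

p_d = paired_randomisation_p(conventional, gm_ht, stat_d, seed=1)
p_r = paired_randomisation_p(conventional, gm_ht, stat_r, seed=1)
print(f"p-value for d: {p_d:.3f}")
print(f"p-value for r: {p_r:.3f}")
```

With 999 simulations the smallest attainable p-value under this convention is 0.001. Because labels are swapped only within a farm, the farm-to-farm differences are held fixed, which mirrors the blocking used in the ANOVA.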

In addition, the program outputs summary statistics and simple scatter graphs for checking. An estimate of the coefficient of variation (CV) is given as an approximate check that the earlier power calculations reported to the Steering Group are supported by field data.

Provisional analyses of seed bank data suggest that the range of variation among sites is acceptably wide, in terms of both species number and total seed abundance. In addition, the variability between half-fields within a site is sufficiently small for tests of the null hypothesis to have the levels of power assumed previously for this experimental design.

3.2 Evaluating crop management

As the field season has progressed, data have been collated on how the farmers have been managing the crops. These crop management records are being finalised with each farmer, who signs a crop management diary. Formal analyses of the data can be undertaken only when these signed records have been entered and validated.



Published 2 January 2001