Statistical Methods


Statistical Methods for the June Survey

Historically, the June Agricultural and Horticultural Survey was a 'Census' and, as this word implies, involved obtaining data from every single agricultural holding in the country. However, in recent years it has been conducted as a sample survey, in which data are sought from only a proportion of holdings each year. The exception is every tenth year, when EU regulations require a complete census; 2000 was the most recent occasion. The survey is conducted by post (or via the web) and a specimen form can be viewed in the 'forms' section of the Defra website.

This document describes how we decide which holdings should be sampled each year and how we estimate national and local figures based on the data received. The aim is to include sufficient detail to allow statisticians to understand our methods, but we hope the document will also be of use to interested non-statisticians. The details refer to the 2004 survey, but it is likely that similar principles will apply in the future.

SAMPLING STRATEGY

There are currently around 190,000 agricultural holdings in England, many of which are very small. As a result, a completely random sample would be highly inefficient, with many forms going to small holdings and some large, economically important holdings not being sent a form. We therefore adopt a stratified random sampling approach in which holdings are divided into groups (strata) on the basis of their economic size, with higher sampling rates being used in the strata containing the economically larger holdings. For a general discussion of the merits of stratified random sampling see sampling textbooks, such as the ones by Sampford and Barnett (see the Further Reading section).

The strata and sampling rates used in 2004 are shown in Table 1 below. Simply stratifying on economic size would lead to unacceptably low precision for some important crops, particularly horticultural crops. We therefore have separate strata with higher sampling rates for horticultural holdings.

Also, as administrative data on crops are available from the IACS system for a large number of cereal holdings, less information is required from these holdings on the survey forms, so they are placed in separate strata and sampled at a lower rate. Unfortunately, this level of detail will not be available from the IACS system in 2005, and so we will have to revert to collecting this information via paper forms.

Due to the impending CAP reform, more information than usual was required on agricultural activity on holdings in Less Favoured Areas (LFAs), so these holdings were placed in separate strata and sampled at a higher than usual rate. This will enable us to monitor the environmental effects of the reforms.

Although the number of strata used for sampling appears high, for the analysis of results the strata were collapsed into 5 bands according to their sampling rates; e.g. all strata sampled at approximately 20% were combined for analysis, thus avoiding the risk of bias.
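As an illustration only, the collapsing of strata into bands can be sketched in Python using the rounded sampling rates from Table 1. The band centres (10%, 20%, 30%, 50% and 100%) and the nearest-centre rule are assumptions made for this sketch; the text states only that there were 5 bands defined by sampling rate.

```python
# Rounded sampling rates (%) by stratum, copied from Table 1.
RATES = {
    1: 11, 2: 19, 3: 31, 4: 48, 5: 93,
    6: 21, 7: 31, 8: 51, 9: 100, 10: 100,
    11: 10, 12: 11, 13: 22, 14: 33, 15: 52,
    16: 22, 17: 32, 18: 50, 19: 100, 20: 100,
    98: 99, 99: 99,
}

# Assumed nominal band centres (%); not stated in the source document.
BAND_CENTRES = (10, 20, 30, 50, 100)


def band_for(rate):
    """Return the assumed band centre closest to a stratum's sampling rate."""
    return min(BAND_CENTRES, key=lambda centre: abs(centre - rate))


def collapse(rates):
    """Group stratum numbers by their nearest assumed band centre."""
    bands = {centre: [] for centre in BAND_CENTRES}
    for stratum, rate in sorted(rates.items()):
        bands[band_for(rate)].append(stratum)
    return bands


if __name__ == "__main__":
    for centre, strata in collapse(RATES).items():
        print(f"approx. {centre}%: strata {strata}")
```

Under these assumptions the strata sampled at approximately 20% are numbers 2, 6, 13 and 16. Whether stratum 5 (93%) was actually analysed with the fully sampled strata is not stated, so its placement here is a guess.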

The strata of 'new' holdings (either those which are genuinely new since June 2003, or those which have failed to return a form in the past) were sampled almost in full (99%, see Table 1), as it is important for us to gain some information on these, even though in the majority of cases they will prove to be relatively small. Past experience tells us that a few of these new holdings are of enormous economic importance, particularly in the pig and poultry sectors, where large new businesses can start up on small areas of land.

Table 1: Sampling rates used in the June 2004 survey.

Stratum  Description                                             Holdings  Sampled  Sampling Rate
1        0-9,600                                                   103303    11127            11%
2        9,600-48,000                                               24466     4734            19%
3        48,000-120,000                                             15186     4697            31%
4        120,000-240,000                                             7372     3511            48%
5        >=240,000                                                   3003     2804            93%
6        Horticulture holdings, 0-9,600                               865      182            21%
7        Horticulture holdings, 9,600-48,000                         1945      612            31%
8        Horticulture holdings, 48,000-120,000                        748      378            51%
9        Horticulture holdings, 120,000-240,000                       257      256           100%
10       Horticulture holdings, >=240,000                             231      230           100%
11       Arable holdings, 0-9,600                                     549       55            10%
12       Arable holdings, 9,600-48,000                               1729      188            11%
13       Arable holdings, 48,000-120,000                             3400      744            22%
14       Arable holdings, 120,000-240,000                            2364      771            33%
15       Arable holdings, >=240,000                                  1661      869            52%
16       Holdings in Less Favoured Areas (LFAs), 0-9,600             4471      986            22%
17       Holdings in Less Favoured Areas (LFAs), 9,600-48,000        5258     1708            32%
18       Holdings in Less Favoured Areas (LFAs), 48,000-120,000      2997     1487            50%
19       Holdings in Less Favoured Areas (LFAs), 120,000-240,000      712      712           100%
20       Holdings in Less Favoured Areas (LFAs), >=240,000            119      119           100%
98       New no base                                                 6297     6261            99%
99       New since June 2003                                         5954     5903            99%
         Total                                                     192887    48334            25%

Within each stratum holdings are sampled at random. The only exception to this is amongst the smallest holdings where a partially systematic approach is adopted to ensure that individual holdings are not usually sampled in successive years. This departure from randomness does create some risk of bias, but this risk is small, particularly since this stratum makes a relatively small contribution to the national totals for most crops.
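A minimal Python sketch of drawing a random sample within each stratum is given below for illustration. The function name, the data layout and the toy sampling frame are hypothetical, and the partially systematic treatment of the smallest holdings is not modelled; this is not the procedure actually used for the survey.

```python
import random


def stratified_sample(holdings_by_stratum, rates, seed=None):
    """Draw a simple random sample without replacement within each stratum.

    holdings_by_stratum: dict mapping stratum -> list of holding identifiers
    rates: dict mapping stratum -> sampling fraction between 0 and 1
    """
    rng = random.Random(seed)
    sample = {}
    for stratum, holdings in holdings_by_stratum.items():
        # Target sample size for this stratum, rounded to a whole holding.
        n = round(len(holdings) * rates[stratum])
        sample[stratum] = rng.sample(holdings, n)
    return sample


if __name__ == "__main__":
    # Toy frame (made-up identifiers): many small holdings, few large ones.
    frame = {
        "small": [f"S{i}" for i in range(1000)],
        "large": [f"L{i}" for i in range(100)],
    }
    rates = {"small": 0.11, "large": 0.93}
    drawn = stratified_sample(frame, rates, seed=1)
    for stratum, ids in drawn.items():
        print(stratum, len(ids))
```

Sampling each stratum separately is what allows the large holdings to be covered almost completely while only a modest fraction of the small holdings receives a form.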

STATISTICAL ANALYSIS

Statistical analysis of the June survey uses the separate ratio approach (see e.g. the book by Sampford for details). In brief, for the holdings which responded to this year's survey, the method first finds the ratio between the current number of e.g. sheep and the previous year's number of sheep. This ratio is then applied to last year's total number of sheep to give a new estimate of the current total. These calculations are performed on each stratum separately, and the estimated totals are then summed to give a national figure for England. This ensures that the final estimate is approximately unbiased, despite the differing sampling rates in the different strata. For the new holdings strata, there is no previous number on which to base a ratio, and so the usual estimates of a total are used instead.
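The calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the production method: the ratio is taken to be the sum of current values divided by the sum of previous values over responding holdings, the "usual" estimate for new holdings is assumed to be the sample mean scaled up to the number of holdings in the stratum, and all names and figures are made up.

```python
def ratio_estimate(current, previous, previous_total):
    """Ratio estimate of one stratum's current total.

    current, previous: paired values for the responding holdings
    previous_total: last year's total over every holding in the stratum
    """
    ratio = sum(current) / sum(previous)
    return ratio * previous_total


def expansion_estimate(current, n_holdings):
    """Assumed fallback for strata with no previous-year figures:
    the sample mean scaled up to all holdings in the stratum."""
    return n_holdings * sum(current) / len(current)


def national_estimate(strata):
    """Estimate each stratum separately, then sum the stratum totals."""
    total = 0.0
    for s in strata:
        if s.get("previous") is None:
            total += expansion_estimate(s["current"], s["n_holdings"])
        else:
            total += ratio_estimate(
                s["current"], s["previous"], s["previous_total"]
            )
    return total


if __name__ == "__main__":
    # Made-up sheep counts for two strata.
    strata = [
        {"current": [150, 300], "previous": [100, 200], "previous_total": 3000},
        {"current": [10, 30], "previous": None, "n_holdings": 4},
    ]
    print(national_estimate(strata))
```

In this toy example the first stratum has a ratio of 450/300 = 1.5, giving 1.5 x 3000 = 4500, and the new-holdings stratum contributes 4 x 20 = 80, for a national estimate of 4580.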

We have also investigated the use of more complex analysis approaches (particularly the model-based approach of Karlberg), but so far simulations have indicated that these give no increase in accuracy compared to the relatively simple ratio approach.

Where serious inconsistencies are found in the form received from a holding (for example, if the area of crops reported greatly exceeds the size of the holding), we telephone the farmer or grower to seek clarification. Where we are unable to contact them prior to the publication of results, the holding is excluded from the analysis for the relevant sections, and treated as though no form had been received. This happens mainly prior to the publication of provisional results in August; most of these inconsistencies are resolved prior to the publication of final results.
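The example check mentioned above (reported crop area greatly exceeding the size of the holding) might be expressed as in the Python sketch below. The 10% tolerance, the field names and the function names are hypothetical; the rule actually applied is not given here.

```python
def crop_area_inconsistent(crop_areas, holding_area, tolerance=1.10):
    """Flag a return whose total crop area greatly exceeds the holding area.

    tolerance is a hypothetical threshold: 1.10 flags returns whose crops
    total more than 110% of the holding area.
    """
    return sum(crop_areas) > holding_area * tolerance


def split_returns(returns):
    """Separate returns kept for analysis from those needing a query."""
    kept, queried = [], []
    for r in returns:
        if crop_area_inconsistent(r["crop_areas"], r["holding_area"]):
            queried.append(r)
        else:
            kept.append(r)
    return kept, queried


if __name__ == "__main__":
    # Made-up returns: areas in hectares.
    returns = [
        {"id": "A", "crop_areas": [40.0, 55.0], "holding_area": 100.0},
        {"id": "B", "crop_areas": [90.0, 60.0], "holding_area": 100.0},
    ]
    kept, queried = split_returns(returns)
    print([r["id"] for r in kept], [r["id"] for r in queried])
```

Returns in the queried list that cannot be resolved before publication would then be treated as non-response for the affected sections, as described above.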

FURTHER READING

The following books provide general information on sample surveys, including ratio analysis:

Barnett, V (1991) Sample Survey Principles and Methods. Arnold. ISBN: 0340545534. (A new edition is to be published in late 2002).

Cochran, W.G. (1977) Sampling Techniques (3rd Edition). Wiley. ISBN 0 471 16240

Sampford, M.R. (1962) An Introduction to Sampling Theory With Applications to Agriculture. Edinburgh: Oliver & Boyd. (This book is difficult to get hold of now, but provides an excellent introduction to sampling in an agricultural context).

This is the article on model-based estimation:

Karlberg, F. (2000) Survey Estimation for Highly Skewed Populations in the Presence of Zeroes. Journal of Official Statistics, 16, 229-241. (Full text available from www.jos.nu).
