Historically, the June Agricultural and Horticultural Survey
was a 'Census' and, as this word implies, involved obtaining
data from every single agricultural holding in the country.
However, in recent years it has been conducted as
a sample survey, in which data is sought from only a proportion
of holdings each year. The exception is every
tenth year, when EU regulations demand a complete census; 2000
was the most recent occasion. The survey is conducted by post
(or via the web) and a specimen form can be viewed in the
'forms' section of the Defra website.
This document describes how we decide which holdings should
be sampled each year and how we estimate national and local
figures based on the data received. The aim is to include
sufficient detail to allow statisticians to understand our
methods, but we hope the document will also be of use to interested
non-statisticians. The details refer to the 2004 survey, but
it is likely that similar principles will apply in the future.
SAMPLING STRATEGY
There are currently around 190,000 agricultural holdings
in England, many of which are very small. As a result, a completely
random sample would be highly inefficient, with many forms
going to small holdings and some large, economically important
holdings not being sent a form. We therefore adopt a stratified
random sampling approach in which holdings are divided into
groups (strata) on the basis of their economic size, with
higher sampling rates being used in the larger strata. For
a general discussion of the merits of stratified random sampling
see sampling textbooks, such as those by Sampford and Barnett
(see the Further Reading section).
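As an illustration only, stratified random sampling by economic size can be sketched as follows. The band edges and rates are those shown for strata 1 to 5 in Table 1; the function names, the handling of values that fall exactly on a band boundary, and the rounding of sample sizes are assumptions made for the sketch, not a description of the production system.

```python
import random
from bisect import bisect_right

# Upper edges of the economic-size bands and the rates for strata 1-5 (Table 1).
BAND_EDGES = [9_600, 48_000, 120_000, 240_000]
RATES = [0.11, 0.19, 0.31, 0.48, 0.93]


def stratum_index(economic_size):
    """Return 0..4 for the size band containing economic_size.

    Assumption: a value exactly on a boundary goes to the higher band,
    in line with the '>=240,000' label of the top band.
    """
    return bisect_right(BAND_EDGES, economic_size)


def stratified_sample(holdings, rng):
    """holdings: list of (holding_id, economic_size) pairs.

    Draws a simple random sample within each stratum at that stratum's
    rate and returns the sampled holding ids.
    """
    strata = {}
    for holding_id, size in holdings:
        strata.setdefault(stratum_index(size), []).append(holding_id)

    sample = []
    for idx, members in sorted(strata.items()):
        n = round(RATES[idx] * len(members))  # assumed rounding rule
        sample.extend(rng.sample(members, n))
    return sample
```

Because the rate rises with economic size, nearly every large holding receives a form while only about one in nine of the smallest does.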
The strata and sampling rates used in 2004 are shown in Table
1 below. Simply stratifying on economic size would lead to
unacceptably low precision for some important crops, particularly
horticultural crops. We therefore have separate strata with
higher sampling rates for horticultural holdings.
Also, as administrative data on crops is available from the
IACS system for a large number of cereal holdings, less information
is required on the survey forms so these holdings are stratified
out and sampled at a lower rate. Unfortunately, this level
of detail will not be available from the IACS system in 2005,
and so we will have to revert to collecting this information
via paper forms.
Due to the impending CAP reform, more information than usual
was required on the agricultural activity on Less Favoured Area (LFA) holdings,
so these holdings were stratified out and sampled at a higher
than usual rate. This will enable us to monitor the environmental
effects of the reforms.
Although the number of strata for sampling appears high,
for analysis of results the strata were collapsed down to
5 bands according to their sampling rates; e.g. all strata
sampled at approx. 20% were combined for analysis.
Combining only strata with similar sampling rates avoids the
risk of bias that would arise from pooling holdings sampled at
different rates.
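The collapsing step might look something like the sketch below. The stratum rates are those in Table 1, but the five nominal band rates are an assumption: only the 'approx. 20%' band is named in the text, and the others are inferred from how the rates in Table 1 cluster. Assigning each stratum to its nearest nominal rate is likewise just one plausible rule.

```python
# Sampling rates (%) by stratum, taken from Table 1.
STRATUM_RATES = {
    1: 11, 2: 19, 3: 31, 4: 48, 5: 93,
    6: 21, 7: 31, 8: 51, 9: 100, 10: 100,
    11: 10, 12: 11, 13: 22, 14: 33, 15: 52,
    16: 22, 17: 32, 18: 50, 19: 100, 20: 100,
    98: 99, 99: 99,
}

# Assumed nominal band rates (%); only the 20% band is named in the text.
NOMINAL_RATES = [10, 20, 30, 50, 100]


def analysis_band(rate):
    """Return the nominal band rate closest to a stratum's sampling rate."""
    return min(NOMINAL_RATES, key=lambda nominal: abs(nominal - rate))


def collapse(stratum_rates):
    """Group stratum numbers by analysis band."""
    bands = {}
    for stratum, rate in sorted(stratum_rates.items()):
        bands.setdefault(analysis_band(rate), []).append(stratum)
    return bands
```

Under these assumptions the 22 sampling strata fall into exactly 5 bands, with strata 2, 6, 13 and 16 forming the 'approx. 20%' band.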
The strata of 'new' holdings (either those which are genuinely
new since June 2003, or those which have failed to return
a form in the past) were sampled at close to 100% (see Table 1),
as it is important for us to gain some information on these, even
though in the majority of cases they will prove to be relatively
small. Past experience
tells us that a few of these new holdings are of enormous
economic importance, particularly in the pig and poultry sectors,
where large new businesses can start up on small areas of
land.
Table 1: Sampling rates used in the June 2004 survey.
| Stratum | Holding type | Description | Holdings | Sampled | Sampling Rate |
|---|---|---|---|---|---|
| 1 | | 0-9,600 | 103303 | 11127 | 11% |
| 2 | | 9,600-48,000 | 24466 | 4734 | 19% |
| 3 | | 48,000-120,000 | 15186 | 4697 | 31% |
| 4 | | 120,000-240,000 | 7372 | 3511 | 48% |
| 5 | | >=240,000 | 3003 | 2804 | 93% |
| 6 | Horticulture holdings | 0-9,600 | 865 | 182 | 21% |
| 7 | Horticulture holdings | 9,600-48,000 | 1945 | 612 | 31% |
| 8 | Horticulture holdings | 48,000-120,000 | 748 | 378 | 51% |
| 9 | Horticulture holdings | 120,000-240,000 | 257 | 256 | 100% |
| 10 | Horticulture holdings | >=240,000 | 231 | 230 | 100% |
| 11 | Arable holdings | 0-9,600 | 549 | 55 | 10% |
| 12 | Arable holdings | 9,600-48,000 | 1729 | 188 | 11% |
| 13 | Arable holdings | 48,000-120,000 | 3400 | 744 | 22% |
| 14 | Arable holdings | 120,000-240,000 | 2364 | 771 | 33% |
| 15 | Arable holdings | >=240,000 | 1661 | 869 | 52% |
| 16 | Holdings in Less Favoured Areas (LFAs) | 0-9,600 | 4471 | 986 | 22% |
| 17 | Holdings in Less Favoured Areas (LFAs) | 9,600-48,000 | 5258 | 1708 | 32% |
| 18 | Holdings in Less Favoured Areas (LFAs) | 48,000-120,000 | 2997 | 1487 | 50% |
| 19 | Holdings in Less Favoured Areas (LFAs) | 120,000-240,000 | 712 | 712 | 100% |
| 20 | Holdings in Less Favoured Areas (LFAs) | >=240,000 | 119 | 119 | 100% |
| 98 | | New no base | 6297 | 6261 | 99% |
| 99 | | New since June 2003 | 5954 | 5903 | 99% |
| | | Total | 192887 | 48334 | 25% |
Within each stratum holdings are sampled at random. The only
exception to this is amongst the smallest holdings, where a
partially systematic approach is adopted to ensure that individual
holdings are not usually sampled in successive years. This
departure from randomness does create some risk of bias, but
this risk is small, particularly since this stratum makes
a relatively small contribution to the national totals for
most crops.
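The exact mechanism of the partially systematic approach is not set out here. One way such a scheme could work, shown purely as an assumed sketch, is to draw the sample at random from holdings that were not sampled in the previous year and to fall back on repeats only when too few such holdings remain. The function and argument names are hypothetical.

```python
import random


def sample_avoiding_repeats(holding_ids, sampled_last_year, rate, rng):
    """Sample a stratum at the given rate, preferring holdings that were
    not sampled last year (an assumed rule, not the documented one).

    holding_ids: list of ids in the stratum
    sampled_last_year: set of ids sampled in the previous survey
    """
    target = round(rate * len(holding_ids))
    fresh = [h for h in holding_ids if h not in sampled_last_year]
    if len(fresh) >= target:
        return rng.sample(fresh, target)

    # Too few fresh holdings: take them all and top up at random
    # from those that were also sampled last year.
    repeats = [h for h in holding_ids if h in sampled_last_year]
    return fresh + rng.sample(repeats, target - len(fresh))
```

At a sampling rate of around 11% the fallback branch would rarely if ever be needed, which fits the statement that holdings are not usually sampled in successive years.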
STATISTICAL ANALYSIS
Statistical analysis of the June survey uses the separate
ratio approach (see e.g. the book by Sampford for details).
In brief, for the holdings which responded to this year's
survey, the method first calculates the ratio of the current
number of e.g. sheep to the previous year's number of sheep
on the same holdings. This ratio is then multiplied by last
year's total number of e.g. sheep across all holdings in the
stratum to give a new estimate for that stratum. These
calculations are performed on each stratum separately, and the
estimated totals are then summed to give a national figure for
England. This ensures
that the final estimate is approximately unbiased, despite
the differing sampling rates in the different strata. For
the new holdings strata, there is no previous number on which
to base a ratio, and so the usual estimates of a total are
used instead.
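A minimal sketch of the separate ratio calculation is given below. The data layout and function names are assumptions made for illustration. For the new holdings strata the sketch assumes that 'the usual estimates of a total' means the number of holdings in the stratum multiplied by the mean per responding holding.

```python
def separate_ratio_estimate(strata):
    """Sum of per-stratum ratio estimates.

    strata: list of dicts with keys
      'current'        - this year's values for the responding holdings
      'previous'       - last year's values for the same holdings
      'previous_total' - last year's total over all holdings in the stratum
    Assumes the previous-year values in each stratum do not sum to zero.
    """
    total = 0.0
    for s in strata:
        ratio = sum(s["current"]) / sum(s["previous"])
        total += ratio * s["previous_total"]
    return total


def expansion_estimate(values, n_holdings):
    """Estimate a stratum total with no previous-year figure:
    holdings in the stratum times the mean per responding holding."""
    return n_holdings * sum(values) / len(values)
```

For example, if responding holdings in a stratum report 330 sheep this year against 300 last year, the ratio is 1.1, and a previous stratum total of 1,000 sheep gives an estimate of 1,100.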
We have also investigated the use of more complex analysis
approaches (particularly the model-based approach of Karlberg),
but so far simulations have indicated that these give no increase
in accuracy compared to the relatively simple ratio approach.
Where serious inconsistencies are found in the form received
from a holding (for example, if the area of crops reported
greatly exceeds the size of the holding), we telephone the
farmer or grower to seek clarification. Where we are unable
to contact them prior to the publication of results, the holding
is excluded from the analysis for the relevant sections, and
treated as though no form had been received. This happens
mainly prior to the publication of provisional results in
August; most of these inconsistencies are resolved prior to
the publication of final results.
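The crop area check mentioned above could be sketched as follows. The field names and the tolerance of 1.5 are hypothetical; the text does not say how large an excess counts as 'greatly exceeds'. For simplicity the sketch drops the whole form, whereas in practice only the relevant sections are excluded.

```python
TOLERANCE = 1.5  # hypothetical threshold for 'greatly exceeds'


def is_inconsistent(form, tolerance=TOLERANCE):
    """True if the reported crop areas sum to more than
    tolerance times the size of the holding."""
    return sum(form["crop_areas"]) > tolerance * form["holding_area"]


def forms_for_analysis(forms, clarified_ids):
    """Keep forms that pass the check, plus flagged forms that have been
    clarified by telephone; unresolved forms are treated as though no
    form had been received."""
    return [
        f for f in forms
        if not is_inconsistent(f) or f["id"] in clarified_ids
    ]
```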
FURTHER READING
The following books provide general information on sample
surveys, including ratio analysis:
Barnett, V (1991) Sample Survey Principles and Methods. Arnold.
ISBN: 0340545534. (A new edition is to be published in late
2002).
Cochran, W.G. (1977) Sampling Techniques (3rd Edition). Wiley.
ISBN 0 471 16240
Sampford, M.R. (1962) An Introduction to Sampling Theory
With Applications to Agriculture. Edinburgh: Oliver &
Boyd. (This book is difficult to get hold of now, but provides
an excellent introduction to sampling in an agricultural context).
The following article describes the model-based estimation approach:
Karlberg, F. (2000). Survey Estimation for Highly Skewed
Populations in the Presence of Zeroes. Journal of Official
Statistics, 16, 229-241. (Full text available from www.jos.nu).