This is a new series on Sampling and Survey. It is essentially post-graduate material, so you will need a strong university-level statistics background. I know many JC students will think, "I thought it's just a simple 2-4 mark question in A-levels." You are correct, but here I'll concentrate on the statistical aspects of taking and analysing a sample, i.e., we learn to tell when a sample is valid or not, and how to design and analyse many different forms of sample survey.
Let us first look at some problems with surveys today.
- Self-selected sample
- Special groups of people (not representative)
- Long survey
- Vague questions
- Leading questions
A good sample should be representative, in the sense that characteristics of interest in the population can be estimated from the sample with a known degree of accuracy. Common ways of collecting survey data include personal interviews, telephone surveys, etc.
Next, let us look at some definitions (terminologies) for a sample.
- Target Population: The complete collection of observations we want to study. This might sound easy, but defining it is one of the hardest and most important parts of a survey. Do you know the exact population size of Singapore? To simplify things, we always assume a FINITE population.
- Observation Unit: An object on which a measurement is taken. Think of it as a basic unit of observation, an element. We will let N denote the total number of elements in the population.
- Sample: A subset of population.
- Sampled Population: The collection of all possible observation units that might have been chosen in a sample, a.k.a. the population from which the sample was taken.
- Sampling Unit: The unit we can actually sample. This is different from the observation unit! For example, households may serve as the sampling units while the observation units are the individuals living in those households. Here, we want to study individuals but do not have a list of all individuals in the target population, so we sample households instead. Do take note of the distinction here.
- Sampling Frame: The list of sampling units.
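To make the distinction between sampling units and observation units concrete, here is a minimal Python sketch. The household IDs, the names, and the variable `population_size` are all made up for illustration; the only point is that we draw from the frame of households, but we end up observing individuals.

```python
import random

# Hypothetical data: each household (sampling unit) contains
# one or more individuals (observation units).
households = {
    "H1": ["Ali", "Bea"],
    "H2": ["Chen"],
    "H3": ["Dev", "Eva", "Fay"],
    "H4": ["Gus", "Hana"],
}

# The sampling frame is the list of sampling units we can draw from.
sampling_frame = sorted(households)

# Total number of observation units (individuals) in this toy population.
population_size = sum(len(members) for members in households.values())

# Draw 2 households at random; the seed is fixed only for repeatability.
rng = random.Random(0)
sampled_households = rng.sample(sampling_frame, k=2)

# The sample of observation units is everyone living in the drawn households.
sample = [person for h in sampled_households for person in households[h]]

print("Sampling frame:", sampling_frame)
print("Population size:", population_size)
print("Sampled households:", sampled_households)
print("Individuals observed:", sample)
```

Notice that we never needed a list of individuals to draw the sample, only the list of households.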
So you might ask, why take a sample and not a census? A sample survey gives us the following:
- Speed: We want to be able to publish estimates in a timely fashion; a survey allows data to be collected more quickly.
- Accuracy: Much to your surprise, estimates based on sample surveys are often more accurate than those based on a census because investigators can be more careful when collecting data.
- Cost and destructive nature of the measurements: Sampling can produce reliable information at far less cost than a census. In some instances, an observation unit must be destroyed to be measured, for example when testing the lifespan of light bulbs made in a factory, so a census is not even an option.
So in an ideal survey, we want (hope & pray) the sampled population to be identical to the target population. Of course, this is rarely the case. A simple reason is that not all persons in the target population (the people you are interested in) are in your sampling frame (the people you can survey). For example, you conduct a telephone survey of likely voters, but not all households have telephones.
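The telephone example can be sketched in a few lines of Python. All the numbers below are invented purely for illustration: we assume 100 likely voters, of whom 80 have a telephone, and we assume that support for a candidate differs between the two groups. Under those assumptions, even a perfect census of the sampled population misses the target.

```python
# Hypothetical target population of 100 likely voters.
# Each record is (has_telephone, supports_candidate); counts are made up.
target_population = (
    [(True, True)] * 30      # has phone, supports
    + [(True, False)] * 50   # has phone, does not support
    + [(False, True)] * 15   # no phone, supports
    + [(False, False)] * 5   # no phone, does not support
)

# A telephone survey can only ever reach people with a telephone,
# so this is the sampled population.
sampled_population = [p for p in target_population if p[0]]


def support_rate(people):
    """Proportion of people who support the candidate."""
    return sum(1 for _, supports in people if supports) / len(people)


coverage = len(sampled_population) / len(target_population)
target_rate = support_rate(target_population)
sampled_rate = support_rate(sampled_population)

print("Coverage of the frame:", coverage)           # 0.8
print("Support in target population:", target_rate)   # 0.45
print("Support in sampled population:", sampled_rate)  # 0.375
```

The gap between 0.45 and 0.375 has nothing to do with how many people we call; it comes entirely from who is missing from the frame.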
Lastly, all surveys will have errors.
- Total survey error: Difference between a population parameter and the estimate of the parameter based on the sample survey. Total survey error can be partitioned into two types of component.
- Sampling error: Due to selecting a sample instead of the entire population.
- Non-sampling error: Due to mistakes or system deficiencies (non-response, measurement error, data manipulation). Essentially, all the kind of errors made during data collection, data processing, and estimation except sampling error.
We will end off with a simple equation here.
Total Survey error in an estimate = Sampling error + Non-sampling error
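Here is a rough simulation of that equation in Python. It is only a sketch under made-up assumptions: a population of 100 units with known true values, and a hypothetical measurement problem where every tenth unit under-reports its value by 5. In a real survey we would not know the true values, so we could not separate the two components this neatly.

```python
import random

# Hypothetical finite population: the true value of each of 100 units.
true_values = list(range(1, 101))
population_mean = sum(true_values) / len(true_values)  # population parameter

# Assumed measurement error (invented for illustration):
# every tenth unit under-reports its value by 5 when surveyed.
reported_values = [v - 5 if v % 10 == 0 else v for v in true_values]

# Draw 20 units without replacement; the seed is fixed for repeatability.
rng = random.Random(1)
sample_ids = rng.sample(range(len(true_values)), k=20)


def mean(xs):
    return sum(xs) / len(xs)


# What the survey actually produces (based on reported values).
estimate = mean([reported_values[i] for i in sample_ids])
# What the same sample would give if every answer were measured perfectly.
error_free_estimate = mean([true_values[i] for i in sample_ids])

total_error = estimate - population_mean
sampling_error = error_free_estimate - population_mean  # from sampling only
non_sampling_error = estimate - error_free_estimate      # from measurement only

print("Total survey error:     ", total_error)
print("Sampling + non-sampling:", sampling_error + non_sampling_error)
```

The two printed numbers agree (up to floating-point rounding), because the total error is split into the part caused by looking at only 20 units and the part caused by wrong answers.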
We will look at sampling methods next, and gear up for a lot of Math.
Sampling & Survey #1 – Introduction
Sampling & Survey #2 – Simple Probability Samples
Sampling & Survey #3 – Simple Random Sampling
Sampling & Survey #4 – Qualities of estimator in SRS
Sampling & Survey #5 – Sampling weight, Confidence Interval and sample size in SRS
Sampling & Survey #6 – Systematic Sampling
Sampling & Survey #7 – Stratified Sampling
Sampling & Survey #8 – Ratio Estimation
Sampling & Survey #9 – Regression Estimation
Sampling & Survey #10 – Cluster Sampling
Sampling & Survey #11 – Two-Stage Cluster Sampling
Sampling & Survey #12 – Sampling with unequal probabilities (Part 1)
Sampling & Survey #13 – Sampling with unequal probabilities (Part 2)
Sampling & Survey #14 – Nonresponse