Recall that we are interested in the statistical aspects of taking and analysing a sample. A good sample is representative in the sense that characteristics of interest in the population can be estimated from the sample with a known degree of accuracy.

Here, we will use probability sampling to conduct surveys. Probability sampling means each unit in the population has a known, non-zero probability of being included in the sample. We will also make the following assumptions:

- Sampled population = target population
- Sampling frame is complete, no non-response or missing data
- No measurement error

Clearly, with these assumptions, we have removed non-sampling error and only observe sampling error.

**Simple Random Sample**

- Simplest form of probability sample
- Each unit has an equal probability of being in the sample
- Each possible sample of size *n* has the same chance of being the sample selected

*Systematic Sample*

- Units are equally spaced in the list

*Stratified sample*

- Elements in the same stratum often tend to be more similar.
- A simple random sample is selected from each stratum, and the simple random samples in the different strata are selected independently

*Cluster Sample*

- Elements are aggregated into larger sampling units (clusters)
- The clusters are sampled:
  - One-stage (every element in a selected cluster is sampled)
  - Two-stage (probability sampling within each selected cluster)

Here is an example of sampling 20 integers from the population {1, 2, …, 100} using each of the above methods:

- Simple random sample: Use a computer to randomly generate 20 integers from 1 to 100.
- Systematic sample: Use a computer to randomly generate an integer from 1 to 5, then take every 5th element starting from that integer. Suppose it was 2; then the sample contains units 2, 7, 12, 17, …, 97.
- Stratified sample: Divide the population into 10 strata, {1, 2, …, 10}, {11, 12, …, 20}, …, {91, 92, …, 100}, and a simple random sample of 2 numbers will be drawn from each of the 10 strata.
- Cluster sample: Divide the population into 20 clusters {1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}, …, {96, 97, 98, 99, 100}. A simple random sample of 4 of these clusters is selected, and every integer in the selected clusters is included in the sample.
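The four schemes in this example can be sketched in Python as follows. This is only a minimal illustration using the standard library `random` module; the function names, the fixed seed and the one-stage treatment of the cluster sample are choices made for this sketch, not part of the original example.

```python
import random


def simple_random_sample(population, n, rng):
    # Every subset of size n is equally likely to be drawn.
    return sorted(rng.sample(population, n))


def systematic_sample(population, n, rng):
    # Sampling interval k = N / n (assumed to divide evenly in this sketch).
    k = len(population) // n
    start = rng.randint(1, k)  # random start between 1 and k, inclusive
    # Take every k-th unit from the random start (list index is start - 1).
    return population[start - 1::k]


def stratified_sample(population, n_strata, n_per_stratum, rng):
    # Strata are consecutive blocks of equal size; an independent simple
    # random sample is drawn inside each one.
    size = len(population) // n_strata
    sample = []
    for h in range(n_strata):
        stratum = population[h * size:(h + 1) * size]
        sample.extend(sorted(rng.sample(stratum, n_per_stratum)))
    return sample


def cluster_sample(population, cluster_size, n_clusters, rng):
    # One-stage cluster sample: pick whole clusters by simple random
    # sampling and keep every unit inside the chosen clusters.
    clusters = [population[i:i + cluster_size]
                for i in range(0, len(population), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return sorted(unit for cluster in chosen for unit in cluster)


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
population = list(range(1, 101))

srs = simple_random_sample(population, 20, rng)
sys_sample = systematic_sample(population, 20, rng)
strat = stratified_sample(population, 10, 2, rng)
clus = cluster_sample(population, 5, 4, rng)

for name, s in [("SRS", srs), ("Systematic", sys_sample),
                ("Stratified", strat), ("Cluster", clus)]:
    print(f"{name:<11}", s)
```

Each call returns 20 integers, but the samples look quite different: the systematic sample is evenly spaced, the stratified sample has exactly two values in every block of ten, and the cluster sample consists of four runs of five consecutive integers.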

Now we move on to developing some concepts and tools to analyse our sample.

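For most samples, we want to study a characteristic of interest, *y*. Let $y_i$ be the value of the characteristic of interest for unit *i*, in a population of *N* units.

- Population mean, $\bar{y}_U$

$$\bar{y}_U = \frac{1}{N}\sum_{i=1}^{N} y_i$$

- Population proportion, *p*

This is a special case of the population mean.

Let $y_i$ be a binary variable, taking the value 1 if unit *i* has the characteristic and 0 if unit *i* does not have the characteristic. Then

$$p = \frac{1}{N}\sum_{i=1}^{N} y_i = \frac{\text{number of units with the characteristic}}{N}$$

- Population total, *t*

$$t = \sum_{i=1}^{N} y_i = N\bar{y}_U$$

- Population variance, $S^2$

$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N} \left(y_i - \bar{y}_U\right)^2$$

*S* is the standard deviation of *y*.

- Coefficient of variation, *CV*(*y*)

$$CV(y) = \frac{S}{\bar{y}_U}$$

The coefficient of variation is a measure of relative variability; it is the ratio of the standard deviation of *y* to the population mean $\bar{y}_U$ (defined when $\bar{y}_U \neq 0$).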
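As a quick check of these quantities, here is a small Python sketch that computes them for the population {1, 2, …, 100} used earlier. It assumes the usual survey-sampling definitions (in particular, a divisor of *N* − 1 in the population variance); the function name `population_summaries` and the "is even" indicator used for the proportion are made up for this illustration.

```python
import math


def population_summaries(y):
    """Return mean, total, variance, standard deviation and CV of a population."""
    N = len(y)
    total = sum(y)                                        # t = sum of y_i
    mean = total / N                                      # population mean
    var = sum((yi - mean) ** 2 for yi in y) / (N - 1)     # S^2, divisor N - 1 (assumed)
    sd = math.sqrt(var)                                   # S
    cv = sd / mean                                        # CV(y), requires mean != 0
    return {"mean": mean, "total": total, "variance": var, "sd": sd, "cv": cv}


population = list(range(1, 101))
summary = population_summaries(population)

# A proportion is the mean of a 0/1 indicator variable.
# Hypothetical characteristic for illustration: "the unit is an even number".
is_even = [1 if u % 2 == 0 else 0 for u in population]
p = population_summaries(is_even)["mean"]

print(summary)
print("proportion of even units:", p)
```

Because the proportion is just the mean of a 0/1 variable, the same function handles both cases without any special code.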

Next, we will delve deep into each of the sampling methods above.

Sampling & Survey #1 – Introduction

Sampling & Survey #2 – Simple Probability Samples

Sampling & Survey #3 – Simple Random Sampling

Sampling & Survey #4 – Qualities of estimator in SRS

Sampling & Survey #5 – Sampling weight, Confidence Interval and sample size in SRS

Sampling & Survey #6 – Systematic Sampling

Sampling & Survey #7 – Stratified Sampling

Sampling & Survey #8 – Ratio Estimation

Sampling & Survey #9 – Regression Estimation

Sampling & Survey #10 – Cluster Sampling

Sampling & Survey #11 – Two – Stage Cluster Sampling

Sampling & Survey #12 – Sampling with unequal probabilities (Part 1)

Sampling & Survey #13 – Sampling with unequal probabilities (Part 2)

Sampling & Survey #14 – Nonresponse
