In the previous post we formally introduced the estimators used in simple random sampling (SRS). Now we want to know how good those estimators are. If you recall, "good" is judged by accuracy (MSE), precision (variance) and bias (the expectation of the estimator minus the true value). Let us look at these qualities of our estimators now.

1. Estimator of population mean $\bar{y_u}$
$\mathbb{E}(\bar{y}) = \bar{y_u}$ – Unbiased; this implies that $Bias(\bar{y}) = 0$
$Var(\bar{y}) = \frac{S^2}{n}(1-\frac{n}{N})$, where $S^2$ is our population variance. $Var(\bar{y})$ measures the variability among estimates of $\bar{y_u}$ from different samples.
$MSE(\bar{y}) = Var(\bar{y}) + (Bias(\bar{y}))^2 = Var(\bar{y})$ since $\mathbb{E}(\bar{y}) = \bar{y_u} \Rightarrow Bias(\bar{y}) = 0$.
2. Estimator of population variance $S^2$
$s^2 = \frac{1}{n-1} \sum_{i \in D} (y_i - \bar{y})^2$ is an unbiased estimator of $S^2 = \frac{1}{N-1} \sum_{i=1}^N (y_i - \bar{y_u})^2$
$\Rightarrow \mathbb{E}(s^2) = S^2$ – Unbiased. You should be able to prove this.
Since $\mathbb{E}(s^2)=S^2 \Rightarrow \hat{Var}(\bar{y}) = \frac{s^2}{n}(1-\frac{n}{N})$ is an unbiased estimator for $Var(\bar{y})$
Standard error, $SE(\bar{y}) = \sqrt{\frac{s^2}{n}(1-\frac{n}{N})}$
Coefficient of variation, $CV(\bar{y}) = \frac{\sqrt{Var(\bar{y})}}{\mathbb{E}(\bar{y})} = \sqrt{1 - \frac{n}{N}} \frac{S}{\sqrt{n} \bar{y_u}}$
Estimator of Coefficient of variation, $\hat{CV}(\bar{y}) = \frac{SE(\bar{y})}{\bar{y}} = \sqrt{1-\frac{n}{N}} \frac{s}{\sqrt{n} \bar{y}}$
3. Estimator of population total t and proportion p
Intuitively, they are both related to the population mean, so estimators should be related to the sample mean.

1. Let's look at the population total first.
$t = \sum_{i=1}^N y_i = N \bar{y_u}$
Estimator, $\hat{t} = N \bar{y}$
$\mathbb{E}(\hat{t}) = t$ – Unbiased
$Var(\hat{t}) = Var(N \bar{y}) = N^2 Var(\bar{y}) = N^2 (1-\frac{n}{N}) \frac{S^2}{n}$
$\hat{Var}(\hat{t}) = N^2 (1 - \frac{n}{N}) \frac{s^2}{n}$
$CV(\hat{t}) = \frac{\sqrt{Var(\hat{t})}}{\mathbb{E}(\hat{t})} = \frac{N \sqrt{Var(\bar{y})}}{N \bar{y_u}} = \sqrt{1-\frac{n}{N}} \frac{S}{\sqrt{n} \bar{y_u}} = CV(\bar{y})$
2. Next, we look at the population proportion
$p = \frac{1}{N} \sum_{i=1}^N y_i = \bar{y_u}$, where $y_i = 1$ if unit $i$ has the characteristic of interest and $y_i = 0$ otherwise
Estimator, $\hat{p} = \bar{y}$
$\mathbb{E}(\hat{p})=p$ – Unbiased
$Var(\hat{p}) = Var(\bar{y}) = \frac{S^2}{n}(1-\frac{n}{N})$
$S^2 = \frac{N}{N-1}p(1-p)$
$s^2 = \frac{n}{n-1} \hat{p}(1 - \hat{p})$
$Var(\hat{p}) = \frac{S^2}{n}(1 - \frac{n}{N}) = (\frac{N-n}{N-1})\frac{p(1-p)}{n}$
$\hat{Var}(\hat{p}) = \frac{s^2}{n}(1 - \frac{n}{N}) = (1 - \frac{n}{N}) \frac{\hat{p} (1 - \hat{p})}{n-1}$
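
To see the formulas above in action, here is a minimal Python sketch that computes the estimates and their estimated variances from a single sample. The function names `srs_estimates` and `srs_proportion`, the data and the population size `N = 50` are all made up for illustration; only the formulas come from this post.

```python
import math
import statistics


def srs_estimates(sample, N):
    """Estimates under SRS without replacement (hypothetical helper)."""
    n = len(sample)
    fpc = 1 - n / N                    # the recurring factor 1 - n/N
    y_bar = statistics.mean(sample)    # estimator of the population mean
    s2 = statistics.variance(sample)   # sample variance, divisor n - 1
    var_y_bar = fpc * s2 / n           # estimated Var(y_bar)
    se_y_bar = math.sqrt(var_y_bar)    # SE(y_bar)
    return {
        "mean": y_bar,
        "s2": s2,
        "var_mean": var_y_bar,
        "se_mean": se_y_bar,
        "cv_mean": se_y_bar / y_bar,   # estimated CV(y_bar)
        "total": N * y_bar,            # t_hat = N * y_bar
        "var_total": N**2 * var_y_bar, # estimated Var(t_hat)
    }


def srs_proportion(indicators, N):
    """p_hat and its estimated variance for 0/1 data (hypothetical helper)."""
    n = len(indicators)
    p_hat = sum(indicators) / n
    var_p_hat = (1 - n / N) * p_hat * (1 - p_hat) / (n - 1)
    return p_hat, var_p_hat


# Made-up sample of n = 5 from a population of N = 50 units.
sample = [4, 7, 5, 9, 5]
N = 50
est = srs_estimates(sample, N)

# Made-up 0/1 indicators for the same five units.
indicators = [1, 0, 0, 1, 0]
p_hat, var_p_hat = srs_proportion(indicators, N)

print(est)
print(p_hat, var_p_hat)
```

Note that calling `srs_estimates` on the 0/1 indicators gives the same estimated variance as `srs_proportion`, which is just the identity $s^2 = \frac{n}{n-1} \hat{p}(1 - \hat{p})$ at work.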

By now you should have noticed that the term $1 - \frac{n}{N}$ keeps appearing. This is the finite population correction factor (FPC). $\frac{n}{N}$ is the sampling fraction: the larger the sample size, the larger the sampling fraction, and vice versa. Since we assumed a finite population at the beginning, the largest sample we can take has size $n = N$, in which case FPC = 0. This implies that $Var(\bar{y})=0$, since we have taken the one and only possible sample, that is, D = U and $\bar{y} = \bar{y_u}$, so there is no variability. On the other hand, if we take samples from extremely large populations, FPC $\approx 1$. For example, a sample of size 100 from a population of 100,000 or of 100,000,000 units gives an FPC of 0.999 against 0.999999. In both cases $Var(\bar{y}) = \frac{S^2}{n} \cdot FPC \approx \frac{S^2}{n}$. From here, we observe that when dealing with a large population, it is the size of the sample, not the percentage of the population sampled, that determines the precision of the estimator. 🙂
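
The effect of the FPC is easy to tabulate. The sketch below fixes the sample size at 100 and varies the population size; the helper names `fpc` and `var_y_bar` and the population variance `S2 = 25.0` are assumptions chosen purely for illustration.

```python
def fpc(n, N):
    """Finite population correction 1 - n/N."""
    return 1 - n / N


def var_y_bar(S2, n, N):
    """Var(y_bar) under SRS: (S^2 / n) * FPC."""
    return S2 / n * fpc(n, N)


S2 = 25.0   # hypothetical population variance
n = 100     # fixed sample size

# Same sample size, increasingly large populations.
for N in (100, 1_000, 100_000, 100_000_000):
    print(f"N={N:>11,}  FPC={fpc(n, N):.6f}  Var(y_bar)={var_y_bar(S2, n, N):.6f}")
```

Once the population is large, the variance barely moves away from $\frac{S^2}{n}$ (0.25 here), no matter how much larger the population gets.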

To sum things up, we have seen how the FPC works, and we will definitely see more of it. We can now find unbiased estimators for the population mean, total and proportion, and we can measure the quality of these estimators. The last take-home message is that since these estimators are unbiased, their quality, as measured by MSE, is essentially just the variance. 🙂
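
If you want to convince yourself numerically, one option is to take a tiny population and enumerate every possible SRS sample. The sketch below does this for a made-up population of six values with n = 3 (both are assumptions for illustration) and compares the exact expectation, variance and MSE of $\bar{y}$ with the formulas in this post.

```python
from itertools import combinations
from statistics import mean, variance

population = [2, 3, 5, 8, 12, 15]   # hypothetical population U
N, n = len(population), 3

y_bar_U = mean(population)          # population mean
S2 = variance(population)           # population variance, divisor N - 1

# Under SRS without replacement every subset of size n is equally likely,
# so plain averages over all subsets are exact expectations.
samples = list(combinations(population, n))
y_bars = [mean(s) for s in samples]
s2s = [variance(s) for s in samples]

exp_y_bar = mean(y_bars)                                    # E(y_bar)
var_y_bar = mean([(yb - exp_y_bar) ** 2 for yb in y_bars])  # Var(y_bar)
mse_y_bar = mean([(yb - y_bar_U) ** 2 for yb in y_bars])    # MSE(y_bar)
exp_s2 = mean(s2s)                                          # E(s^2)
formula_var = S2 / n * (1 - n / N)                          # (S^2/n)(1 - n/N)

print(f"E(y_bar)   = {exp_y_bar:.6f}   population mean = {y_bar_U:.6f}")
print(f"Var(y_bar) = {var_y_bar:.6f}   formula         = {formula_var:.6f}")
print(f"MSE(y_bar) = {mse_y_bar:.6f}")
print(f"E(s^2)     = {exp_s2:.6f}   S^2             = {S2:.6f}")
```

Each pair of numbers should agree up to floating-point rounding, and the MSE should coincide with the variance because the bias is zero.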
