Population and Sample

Table of contents
  1. Selecting the goal and target of analysis
  2. Population
    1. Finite population
    2. Infinite population
  3. Finding the characteristics of the population
    1. Complete enumeration
    2. Sample survey
      1. Sample size

Selecting the goal and target of analysis

Before you start analyzing any data, it is important to clearly define the goal and target of analysis.

As described in the introduction, the goal may be to understand the data or to predict the data.

Once the goal is set, you need to define the target of analysis.

For example, if the goal is to confirm the effect of a certain treatment, then the target of analysis is everyone who has the disease that the treatment is meant for.


Population

The target of the analysis is called the population.

The number of elements in the population is called the population size.

Populations are classified into two types according to their size:

Finite population

No matter how large the population is, if its elements can be fully counted, then it is a finite population.

Infinite population

If the cardinality of the population is infinite, then it is an infinite population.

A population that changes over time is also treated as an infinite population.


Finding the characteristics of the population

If we know the characteristics of the population, then it becomes easier for us to predict the data.

So how do we find out the characteristics of the population?

Complete enumeration

Complete enumeration is a survey that examines every element of the population, so it can be performed only on a finite population.

If we can enumerate the entire population, then we can know the characteristics of the population.

Because the number of data points you analyze equals the population size, descriptive statistics are enough to understand the data.
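As a rough illustration of complete enumeration, the sketch below computes descriptive statistics over an entire finite population. The data (exam scores for every student in one class) and all variable names are hypothetical and chosen only for this example; it uses Python's standard `statistics` module.

```python
import statistics

# Hypothetical finite population: exam scores of EVERY student in one class.
population = [72, 85, 90, 66, 78, 95, 81, 70]

# The population size is simply the number of elements.
population_size = len(population)

# Every element is observed, so these are the true population values,
# not estimates.
population_mean = statistics.mean(population)
# pvariance treats the data as the whole population (divides by N).
population_variance = statistics.pvariance(population)

print("population size:", population_size)
print("population mean:", population_mean)
print("population variance:", population_variance)
```

Note the use of `pvariance` rather than `variance`: the latter divides by N - 1 and is intended for data that is only a sample of a larger population.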

However, in most cases, complete enumeration is not realistic due to the cost and time required.

Also, if the population is infinite, then complete enumeration is obviously impossible.

Sample survey

If complete enumeration is not possible, then we need to sample the population.

Sampling is the process of selecting a subset of the population.

Inferential statistics is then used to infer the characteristics of the population from the sample.
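The idea can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the population is simulated (10,000 made-up measurements), the sampling method is simple random sampling without replacement, and the names `population`, `sample_size`, and `sample` are hypothetical. In a real sample survey the population mean would be unknown; it is computed here only to show how close the estimate is.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated measurements.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Select a subset of the population by simple random sampling
# (without replacement).
sample_size = 100
sample = random.sample(population, sample_size)

# The sample mean serves as an estimate of the population mean.
sample_mean = statistics.mean(sample)

# Normally unknown; computed here only for comparison.
population_mean = statistics.mean(population)

print("sample mean:", round(sample_mean, 2))
print("population mean:", round(population_mean, 2))
```

The two means will generally differ slightly. Quantifying that difference is the job of inferential statistics.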

Sample size

The number of elements in the sample is called the sample size.

The sample size is usually denoted by:

$$ n $$

It is important to distinguish the number of samples from the sample size. The number of samples is the number of times you perform the sampling process, while the sample size is the number of elements in each sample.
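The distinction can be made concrete with a short sketch. The population (1,000 hypothetical ID numbers) and the variable names are assumptions made for illustration only.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical finite population of 1,000 ID numbers.
population = list(range(1, 1001))

sample_size = 20        # n: number of elements in each sample
number_of_samples = 5   # how many times the sampling process is repeated

# Repeat the sampling process; each repetition yields one sample.
samples = [
    random.sample(population, sample_size)
    for _ in range(number_of_samples)
]

print(len(samples))     # number of samples: 5
print(len(samples[0]))  # sample size: 20
```

Here the sampling process is performed 5 times, and each resulting sample contains 20 elements.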