Beginner

Scatter Graphs and Sampling

Aicademy
Edexcel GCSE Mathematics · Pearson Edexcel 1MA1 · 6 min
S1 · S6

Sampling — Methods and Limitations (S1)

A population is the entire group being studied. A sample is a subset used when it is impractical to study the whole population.

Good sample requirements:

  • Representative: reflects the full variety of the population
  • Random: every member has an equal chance of selection
  • Adequate size: larger samples are more reliable

Sampling methods:

Method | Description | Limitation
Simple random | Every member is equally likely to be chosen | Costly if the population is large
Systematic | Every $n$th member (e.g. every 10th person) | May give a biased sample if the list has a repeating pattern with period $n$
Stratified | Sample from each subgroup in proportion to its size | Requires knowledge of subgroup sizes
Opportunity/convenience | Those who are available | Likely biased, so not representative
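
As a rough illustration of the first two methods in the table, the sketch below draws samples from a numbered list of students. The function names, the fixed seed, and the list of 600 ID numbers are assumptions made for this example only.

```python
import random

def simple_random_sample(population, k, rng):
    # Every member has the same chance of being picked.
    return rng.sample(population, k)

def systematic_sample(population, k, rng):
    # Sampling interval n: take every n-th member after a random start.
    n = len(population) // k
    start = rng.randrange(n)
    return population[start::n][:k]

rng = random.Random(1)                 # fixed seed so the run is repeatable
students = list(range(1, 601))         # hypothetical ID numbers 1..600
random_sample = simple_random_sample(students, 30, rng)
systematic = systematic_sample(students, 30, rng)
```

With 600 students and a sample of 30, the systematic sample takes every 20th student, so a list that repeats with period 20 (for example, registers ordered the same way in every class of 20) would always pick students from the same position.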

Worked example — a school of 600 students: 200 in Year 10, 400 in Year 11. A stratified sample of 30 is required.

Year 10: $\frac{200}{600} \times 30 = 10$ students. Year 11: $\frac{400}{600} \times 30 = 20$ students. Total: $10 + 20 = 30$ ✓
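
The same proportional calculation can be sketched in a few lines of Python. The function name `stratified_allocation` is an assumption for this example; the year-group sizes and the sample size come from the worked example above.

```python
def stratified_allocation(strata_sizes, sample_size):
    # Each subgroup contributes (subgroup size / population size) * sample size.
    total = sum(strata_sizes.values())
    return {name: sample_size * size / total for name, size in strata_sizes.items()}

allocation = stratified_allocation({"Year 10": 200, "Year 11": 400}, 30)
print(allocation)  # {'Year 10': 10.0, 'Year 11': 20.0}
```

When the calculation does not give whole numbers, the results have to be rounded, and it is worth checking that the rounded values still add up to the required sample size.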

Limitations: sample results may not perfectly represent the population due to random variation, sampling bias, or non-response.

Inferring from Samples (S1)

Data collected from a sample is used to infer properties of the population. The reliability of inference depends on:

  1. Sample size: larger samples → less random variation → more reliable estimates
  2. Sample representativeness: a biased sample produces biased inferences regardless of size
  3. Variability in the population: high variability requires a larger sample to estimate reliably

Worked example — a random sample of 50 light bulbs from a batch shows 3 defective. Estimate the number of defective bulbs in a batch of 2000.

$\frac{3}{50} \times 2000 = 120$ defective bulbs (expected). ✓

Caution: this is an estimate — the actual number may differ. A larger sample would give a more reliable estimate.
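
A minimal sketch of this scaling-up step, assuming the batch has the same proportion of defective bulbs as the sample. The function name is made up for illustration.

```python
def estimate_in_population(found_in_sample, sample_size, population_size):
    # Assume the population has the same proportion as the sample.
    return found_in_sample * population_size / sample_size

estimate = estimate_in_population(3, 50, 2000)
print(estimate)  # 120.0
```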

Scatter Graphs and Correlation (S6)

A scatter graph (scatter diagram) plots two variables for each data item — one on each axis.

Correlation describes the relationship between the variables:

Type | Description | Example
Strong positive | As $x$ increases, $y$ increases; points close to a line | Height and weight
Weak positive | General upward trend but scattered | Shoe size and income
No correlation | No clear pattern | Hair length and intelligence
Negative correlation | As $x$ increases, $y$ decreases | Speed and journey time

Correlation does not imply causation. Two variables may be correlated because of a hidden third factor, or simply by coincidence. Stating that one causes the other requires further evidence beyond the scatter graph.

Worked example: a scatter graph shows a strong positive correlation between hours of revision and exam score. This means the two are associated. It does not prove that more revision causes higher scores, although it is consistent with that explanation.
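
At this level correlation is judged by eye, but the direction and strength can also be put into a single number. The sketch below computes Pearson's correlation coefficient $r$, a standard measure that the lesson itself does not use, for a small set of made-up revision data. The data values and function names are assumptions for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    # Measures how closely the points follow a straight line, from -1 to +1.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def direction(r):
    # The sign of r gives the direction of the correlation.
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

hours = [1, 2, 3, 4, 5, 6]             # hypothetical revision hours
scores = [42, 50, 55, 61, 70, 74]      # hypothetical exam scores
r = pearson_r(hours, scores)
print(direction(r))
```

A value of $r$ close to $+1$ or $-1$ corresponds to points lying close to a line; a value near 0 corresponds to no clear linear pattern. Like the scatter graph, it says nothing about causation.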

Line of Best Fit (S6)

A line of best fit is a straight line drawn through the scatter plot to represent the trend. It does not have to pass through any data point.

Rules for drawing:

  • Roughly equal numbers of points above and below the line
  • The line should pass through (or very near) the mean point $(\bar{x}, \bar{y})$, which is the balancing point of the data
  • Do not extend beyond the range of the data unless asked
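
The first two rules can be checked numerically. The sketch below finds the mean point of a small made-up data set and counts how many points lie above and below a candidate line. The data, the candidate line $y = 2x + 1$, and the function names are all assumptions for this example.

```python
def mean_point(xs, ys):
    # The mean point is (mean of x-values, mean of y-values).
    return sum(xs) / len(xs), sum(ys) / len(ys)

def points_above_and_below(xs, ys, gradient, intercept):
    # Compare each point with the line's y-value at the same x.
    above = sum(1 for x, y in zip(xs, ys) if y > gradient * x + intercept)
    below = sum(1 for x, y in zip(xs, ys) if y < gradient * x + intercept)
    return above, below

xs = [1, 2, 3, 4, 5]                   # hypothetical data
ys = [3, 4, 8, 9, 11]
centre = mean_point(xs, ys)            # (3.0, 7.0)
balance = points_above_and_below(xs, ys, gradient=2, intercept=1)  # (1, 1)
```

Here the candidate line passes through the mean point, since $2 \times 3 + 1 = 7$, and has one point above and one below, with the other three lying on it.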

Making predictions (interpolation): read off the $y$-value for a given $x$-value from the line, where the $x$-value is within the range of the data. Interpolation is generally reliable because the prediction is supported by nearby data.

Extrapolation: predicting beyond the range of the data using the trend line. Extrapolation is unreliable because the trend may not continue.

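Worked example: a line of best fit passes through two points $(x_1, y_1)$ and $(x_2, y_2)$ read from the graph. Estimate the $y$-value at a given $x$-value.

Gradient: $m = \frac{y_2 - y_1}{x_2 - x_1}$. Equation: $y - y_1 = m(x - x_1)$.

At an $x$-value inside the data range: substitute into the equation to obtain the estimate ✓ (interpolation: $x$ is within the data range)

At an $x$-value outside the data range: the same substitution gives a value, but this is extrapolation and may not be reliable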
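
A minimal sketch of this method with illustrative numbers. The two points $(2, 10)$ and $(8, 28)$, the data range of $2 \le x \le 8$, and the function names are assumptions chosen for this example, not values taken from the lesson.

```python
def line_through(p1, p2):
    # Gradient = change in y / change in x; intercept from y = mx + c.
    (x1, y1), (x2, y2) = p1, p2
    gradient = (y2 - y1) / (x2 - x1)
    intercept = y1 - gradient * x1
    return gradient, intercept

def predict(x, gradient, intercept, x_min, x_max):
    # Label the prediction by whether x lies inside the observed data range.
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return gradient * x + intercept, kind

m, c = line_through((2, 10), (8, 28))          # hypothetical points on the line
inside = predict(5, m, c, x_min=2, x_max=8)    # (19.0, 'interpolation')
outside = predict(12, m, c, x_min=2, x_max=8)  # (40.0, 'extrapolation')
```

The line here is $y = 3x + 4$. Both predictions come from the same equation; only the first is backed by data on either side of the chosen $x$-value.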

Describing Scatter Graphs in Context

Exam questions often ask you to describe and interpret scatter graphs. Use this structure:

  1. Type of correlation: positive/negative/no correlation, strong/weak
  2. Context: state what this means in the real-world context of the variables
  3. Causation caveat: note that correlation does not prove causation if relevant

Worked example — a scatter graph shows the age of a car (years) against its value (£).

"There is a strong negative correlation between the age of a car and its value. As the car gets older, its value decreases. This suggests that older cars tend to be worth less, though other factors such as mileage and condition also affect value."

Common Exam Mistakes

1. Describing correlation — confusing strength with direction

"Strong" and "positive/negative" are independent descriptions. A weak negative correlation means a general downward trend but with a lot of scatter. State both.

2. Causation claim from correlation

Stating "more revision causes higher exam scores" based solely on a scatter graph oversteps the data. Say "there is a positive correlation" or "there is an association" — not "causes."

3. Line of best fit — forcing it through the origin

The line of best fit should fit the data, not pass through the origin unless the context genuinely requires it (e.g. a conversion graph). Draw it where it fits best.

4. Extrapolation — treating it as equally reliable as interpolation

The further beyond the data range a prediction extends, the less reliable it is. Identify when a prediction is extrapolation and state the limitation.

Mistake | Correction
"Strong correlation proves causation" | Correlation shows association; proving causation requires controlled experiments
"Line of best fit must pass through as many points as possible" | It should balance above and below equally; most points will not lie on it
"Predicting outside the data range is fine" | Extrapolation may be unreliable; trends may not continue
