Beginner

Data Representation and Averages

Aicademy
·Edexcel GCSE Mathematics·Pearson Edexcel 1MA1·6 min
S2·S3·S4·S5

Charts and Graphs for Categorical and Discrete Data (S2)

Choosing the right chart:

| Chart type | Best for |
| --- | --- |
| Bar chart | Comparing discrete or categorical data frequencies |
| Pictogram | Simple comparative data (each symbol = fixed quantity) |
| Pie chart | Showing proportions of a whole (one dataset) |
| Vertical line chart | Ungrouped discrete numerical data (e.g. number of siblings) |
| Time series (line graph) | Data recorded at regular time intervals |
| Frequency table | Organising raw data before drawing charts |

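Calculating pie chart sector angles:

$\text{sector angle} = \frac{\text{frequency}}{\text{total frequency}} \times 360^\circ$

Worked example: 40 students chose a sport: 15 football, 10 basketball, 8 tennis, 7 swimming.

Football: $\frac{15}{40} \times 360^\circ = 135^\circ$; Basketball: $\frac{10}{40} \times 360^\circ = 90^\circ$; Tennis: $\frac{8}{40} \times 360^\circ = 72^\circ$; Swimming: $\frac{7}{40} \times 360^\circ = 63^\circ$. Sum: $135^\circ + 90^\circ + 72^\circ + 63^\circ = 360^\circ$ ✓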
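As a quick check on the arithmetic, the sector-angle rule (frequency over total, times 360) can be sketched in Python. The function name `sector_angles` is only an illustration, not part of the lesson; exact fractions are used so the angles come out as whole numbers.

```python
from fractions import Fraction

def sector_angles(frequencies):
    """Return the pie-chart angle in degrees for each category."""
    total = sum(frequencies.values())
    # angle = (frequency / total frequency) x 360
    return {name: Fraction(freq, total) * 360 for name, freq in frequencies.items()}

# Data from the worked example: 40 students and their chosen sport.
sports = {"football": 15, "basketball": 10, "tennis": 8, "swimming": 7}
angles = sector_angles(sports)

for name, angle in angles.items():
    print(f"{name}: {float(angle):g} degrees")
print("sum:", float(sum(angles.values())), "degrees")
```

The angles should always add up to 360, which is a useful self-check in an exam as well.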

Interpreting time series: the overall trend (rising, falling, stable) and any seasonal patterns are the key things to comment on.

Mean, Median, Mode and Range (S4)

Mode: the most frequent value (for grouped data: the modal class).

Median: the middle value when data is ordered. For $n$ values, the median is the $\frac{n+1}{2}$th value.

Mean: $\bar{x} = \frac{\sum fx}{\sum f}$ (where $f$ is frequency and $x$ is the value/midpoint).

Range: largest value − smallest value (a measure of spread).

Worked example — from a frequency table:

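| Goals ($x$) | Frequency ($f$) | $fx$ |
| --- | --- | --- |
| 0 | 3 | 0 |
| 1 | 8 | 8 |
| 2 | 5 | 10 |
| 3 | 4 | 12 |

$\sum f = 20$; $\sum fx = 30$; Mean $= \frac{30}{20} = 1.5$ goals ✓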

Median: $\frac{20 + 1}{2} = 10.5$th value → between 10th and 11th. Count: 0s (3), 1s (8) → 11 values up to 1. Both 10th and 11th are 1. Median $= 1$
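The frequency-table calculations can be checked with a short Python sketch. The helper names (`expand`, `mean_from_table`, `median_from_table`) are illustrative assumptions; the data is the goals table from the worked example, written as value: frequency pairs.

```python
def expand(table):
    """Turn {value: frequency} into the full ordered list of values."""
    values = []
    for value in sorted(table):
        values.extend([value] * table[value])
    return values

def mean_from_table(table):
    """Mean = sum of (value x frequency) divided by sum of frequencies."""
    total_f = sum(table.values())
    total_fx = sum(value * freq for value, freq in table.items())
    return total_fx / total_f

def median_from_table(table):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    values = expand(table)
    n = len(values)
    middle = n // 2
    if n % 2 == 1:
        return values[middle]
    return (values[middle - 1] + values[middle]) / 2

# Goals table from the worked example: goals scored -> number of matches.
goals = {0: 3, 1: 8, 2: 5, 3: 4}

mean_goals = mean_from_table(goals)
median_goals = median_from_table(goals)
mode_goals = max(goals, key=goals.get)        # value with the highest frequency
range_goals = max(goals) - min(goals)         # largest value minus smallest value

print(mean_goals, median_goals, mode_goals, range_goals)
```

With 20 values, the median is the mean of the 10th and 11th values, which is the same "between 10th and 11th" reasoning used by hand.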

Outliers: values much larger or smaller than the rest. The mean is sensitive to outliers; the median is resistant.

Grouped Data — Estimates from Class Intervals (S4)

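When data is grouped, the exact values are unknown, so use the midpoint of each class interval to stand for every value in that class. The result is an estimate of the mean, not the exact mean.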

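Worked example: estimate the mean from the table below. (The class intervals are inferred from the midpoints.)

| Height (cm) | Frequency ($f$) | Midpoint | $f \times$ midpoint |
| --- | --- | --- | --- |
| 140 to 150 | 5 | 145 | 725 |
| 150 to 160 | 12 | 155 | 1860 |
| 160 to 170 | 8 | 165 | 1320 |

$\sum f = 25$; $\sum (f \times \text{midpoint}) = 3905$; Estimated mean $= \frac{3905}{25} = 156.2$ cm ✓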

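Modal class: the class with the highest frequency → the class with midpoint 155 (frequency 12).

Median class: the class containing the $\frac{25 + 1}{2} = 13$th value. Cumulative: 5, then 17. The 13th value is in the second class (midpoint 155).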
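The grouped-data steps (estimated mean, modal class, median class) can be sketched in Python as below. The class boundaries 140, 150, 160 and 170 are an assumption inferred from the midpoints 145, 155 and 165; the function names are illustrative only.

```python
# Each entry is ((lower boundary, upper boundary), frequency).
# Boundaries are assumed from the midpoints 145, 155, 165 in the worked example.
classes = [((140, 150), 5), ((150, 160), 12), ((160, 170), 8)]

def estimated_mean(classes):
    """Estimate the mean using the midpoint of each class."""
    total_f = sum(freq for _, freq in classes)
    total_fm = sum((low + high) / 2 * freq for (low, high), freq in classes)
    return total_fm / total_f

def modal_class(classes):
    """Class with the highest frequency."""
    return max(classes, key=lambda item: item[1])[0]

def median_class(classes):
    """Class containing the (n + 1) / 2 th value, found by a running total."""
    n = sum(freq for _, freq in classes)
    position = (n + 1) / 2
    running = 0
    for bounds, freq in classes:
        running += freq
        if running >= position:
            return bounds

print(estimated_mean(classes))
print(modal_class(classes))
print(median_class(classes))
```

The running total in `median_class` mirrors the cumulative count 5, then 17 used to locate the 13th value.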

Histograms and Cumulative Frequency (S3 Higher)

Histograms are used for continuous grouped data. The vertical axis shows frequency density, not frequency.

Equal class widths: bars are equally wide; areas are proportional to frequencies.

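Unequal class widths: you must use frequency density so that area still represents frequency correctly.

$\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$

Worked example: a class with frequency 15 and width 10. Frequency density $= \frac{15}{10} = 1.5$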
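A minimal Python sketch of the frequency density calculation follows. The first bar uses the lesson's numbers (frequency 15, width 10); the second bar is a hypothetical class, added only to show how the same frequency spread over a wider class gives a shorter bar.

```python
def frequency_density(frequency, class_width):
    """Height of a histogram bar: frequency divided by class width."""
    if class_width <= 0:
        raise ValueError("class width must be positive")
    return frequency / class_width

lesson_bar = frequency_density(15, 10)  # class from the worked example
wider_bar = frequency_density(15, 30)   # hypothetical class: same frequency, three times as wide

print(lesson_bar)
print(wider_bar)

# Area (density x width) gives back the frequency for both bars.
print(lesson_bar * 10, wider_bar * 30)
```

Both bars have the same area because they represent the same frequency, even though their heights differ.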

Cumulative frequency graphs: plot cumulative frequency against the upper class boundary. Use the graph to read off the median (at $\frac{n}{2}$), quartiles (at $\frac{n}{4}$ and $\frac{3n}{4}$), and percentiles.

Median from cumulative frequency graph: for $n = 60$ values, find cumulative frequency 30 on the vertical axis, read across to the curve, then read down to the value on the horizontal axis.
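The points to plot and the heights to read from can be sketched in Python. The upper class boundaries 150, 160 and 170 are assumed for the height data used earlier in the lesson, and both function names are illustrative.

```python
from itertools import accumulate

def cumulative_points(upper_boundaries, frequencies):
    """Pair each upper class boundary with the running total of frequencies."""
    return list(zip(upper_boundaries, accumulate(frequencies)))

def reading_positions(n):
    """Cumulative frequencies at which to read the quartiles and median."""
    return {"lower quartile": n / 4, "median": n / 2, "upper quartile": 3 * n / 4}

# Height data from earlier in the lesson (boundaries are an assumption).
points = cumulative_points([150, 160, 170], [5, 12, 8])
print(points)

# For n = 60, the median is read at cumulative frequency 30.
positions = reading_positions(60)
print(positions)
```

The last plotted point always has a cumulative frequency equal to the total number of values, which is a quick way to check the table.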

Box Plots and Interquartile Range (S4 Higher)

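A box plot (box-and-whisker diagram) displays the five-number summary: minimum, lower quartile ($Q_1$), median ($Q_2$), upper quartile ($Q_3$), and maximum.

Interquartile range (IQR): $\text{IQR} = Q_3 - Q_1$, the spread of the middle 50% of data.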

Reading quartiles from ordered data or cumulative frequency:

  • $Q_1$: value at position $\frac{n}{4}$ (use $\frac{n+1}{4}$ for small discrete datasets; both methods are accepted)
  • $Q_3$: value at position $\frac{3n}{4}$ (use $\frac{3(n+1)}{4}$ for small discrete datasets)

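Worked example: ordered data 3, 5, 7, 9, 11, 14, 16, 20 ($n = 8$).

$Q_1$: position $\frac{8}{4} = 2$, so the 2nd value, $Q_1 = 5$. $Q_3$: position $\frac{3 \times 8}{4} = 6$, so the 6th value, $Q_3 = 14$.

Median: mean of 4th and 5th values $= \frac{9 + 11}{2} = 10$. $\text{IQR} = 14 - 5 = 9$ ✓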
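The quartile positions in this example can be checked with a short Python sketch. It uses the $\frac{n}{4}$ and $\frac{3n}{4}$ position rule from the worked example and, as a simplifying assumption, only accepts datasets whose size is a multiple of 4 so that both positions are whole numbers. The function name `quartile_summary` is illustrative.

```python
from statistics import median

def quartile_summary(data):
    """Five-number summary and IQR using the n/4 and 3n/4 position rule."""
    values = sorted(data)          # quartiles need ordered data
    n = len(values)
    if n == 0 or n % 4 != 0:
        raise ValueError("this sketch assumes n is a positive multiple of 4")
    q1 = values[n // 4 - 1]        # position n/4, converted to a 0-based index
    q3 = values[3 * n // 4 - 1]    # position 3n/4, converted to a 0-based index
    return {
        "min": values[0],
        "Q1": q1,
        "median": median(values),
        "Q3": q3,
        "max": values[-1],
        "IQR": q3 - q1,
    }

summary = quartile_summary([3, 5, 7, 9, 11, 14, 16, 20])
print(summary)
```

Library routines for quartiles often interpolate between values, so they may give slightly different answers from this hand method.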

Comparing distributions: use the median (centre) and IQR (spread) to compare two datasets. A higher median means a higher typical value; a higher IQR means more variability.

Applying Statistics to Describe Populations (S5)

Statistics drawn from a sample are used to infer properties of the population. Key principles:

  • A larger sample gives more reliable estimates.
  • A representative (unbiased) sample is essential — convenience samples may be skewed.
  • Statistical measures describe the sample; they are used to make inferences about the population.

Using statistics to describe: state the measure (mean, median, IQR) and interpret it in context.

"The mean journey time is 34 minutes, suggesting the typical commuter spends just over half an hour travelling."

Misleading statistics: the mean can be pulled by outliers; presenting only the mean without the spread can misrepresent a dataset. A median with IQR gives a more complete picture.
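The effect of an outlier on the mean can be seen in a small Python sketch. The journey times below are made-up illustrative data, not taken from the lesson; only the idea (the mean moves, the median does not) comes from the text.

```python
from statistics import mean, median

# Hypothetical journey times in minutes (illustrative data only).
typical = [30, 32, 34, 34, 36, 38]
with_outlier = typical + [120]     # one unusually long journey

print("mean without outlier:", mean(typical))
print("median without outlier:", median(typical))
print("mean with outlier:", round(mean(with_outlier), 1))
print("median with outlier:", median(with_outlier))
```

One extreme journey pulls the mean up by more than 12 minutes, while the median stays at 34.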

Common Exam Mistakes

1. Histogram — plotting frequency instead of frequency density

If class widths differ, plotting frequency makes wider bars appear more frequent than they are. Calculate frequency density $= \frac{\text{frequency}}{\text{class width}}$ and plot that.

2. Mean from grouped data — using class boundaries instead of midpoints

Use the midpoint of each class interval (e.g. 145 for a class running from 140 to 150), not the boundary values.

3. Box plot — confusing IQR with the full range

The IQR is $Q_3 - Q_1$ (the box width). The range is maximum − minimum (the full whisker span).

4. Median — not ordering data first

The median is the middle value of the ordered dataset. Finding the median of unordered data gives a wrong answer.

| Mistake | Correction |
| --- | --- |
| "Modal class is the class with the largest upper boundary" | Modal class has the highest frequency (or frequency density in a histogram) |
| "Cumulative frequency: plot against midpoint" | Plot against the upper class boundary |
| "IQR = Q3 × Q1" | IQR = Q3 − Q1 (subtract, not multiply) |
