Advanced

Data Compression: RLE and Huffman Coding

AicademyAicademy
·GCSE Computer Science·AQA 8525·7 slides
3.3.8 Data compression

Why Compress Data?

Uncompressed files consume significant storage and take longer to transmit across networks. A single minute of uncompressed CD-quality audio is approximately 10 MB; an uncompressed full HD video frame is around 6 MB. Without compression, streaming, email attachments, and cloud storage as they exist today would be impractical.

Data compression reduces file size by encoding the same information more efficiently. There are two broad categories:

TypeDescriptionData loss?Examples
LosslessOriginal data can be perfectly reconstructedNoneRLE, Huffman coding
LossySome data is permanently discardedYes — irreversibleJPEG images, MP3 audio

AQA 8525 requires knowledge of run-length encoding (RLE) and Huffman coding, both of which are lossless methods. The existence of lossy compression is worth knowing for comparison, but RLE and Huffman are the assessed techniques.

Lossless compression is essential wherever the original data must be preserved exactly — executable programs, text documents, medical scans.

Run-Length Encoding (RLE)

Run-length encoding replaces consecutive repeated values with a single frequency/data pair: the number of times the value appears followed by the value itself.

RLE achieves good compression when data contains long runs of the same value — solid-colour regions in images, silence in audio, or repeated characters in text.

Format: each pair is written as (frequency, data) or frequency data depending on the system.

Original dataRLE encodedPairs
AAAAAABBBCCA(6,A)(3,B)(2,C)(1,A)4
000000001111(8,0)(4,1)2
ABCABC(1,A)(1,B)(1,C)(1,A)(1,B)(1,C)6

The third example shows a case where RLE increases the data size — it only saves space when runs are longer than the overhead of storing the frequency/data pair.

RLE: A Worked Example

Encoding — compress the pixel row WWWWWWWWBBBBWWWW (16 pixels: 8 white, 4 black, 4 white):

RLE pairs: (8, W)(4, B)(4, W)

Original: 16 values
Compressed: 3 pairs (6 values total) — 62.5% fewer values to store

Decoding — expand (5,A)(3,B)(2,C)(1,A):

PairExpansion
(5,A)AAAAA
(3,B)BBB
(2,C)CC
(1,A)A

Result: AAAAABBBCCA — 11 characters ✓

Verification: count the characters — 5 + 3 + 2 + 1 = 11. Always verify your decode length matches what the pairs predict.

RLE is effective for bitmap images with large uniform areas (logos, icons, screenshots of text editors) and ineffective for photographs, where almost every pixel has a different colour.

4 more slides

Continue this lesson

Create a free account to unlock all 7 slides, track your progress, and ask the AI tutor for help.

Related lessons

7 Slides

Lesson

Image Representation: Pixels and Colour Depth

GCSE Computer Science · AQA 8525

7 hours ago

7 Slides

Lesson

Representing Sound: Sampling Rate and Bit Depth

GCSE Computer Science · AQA 8525

7 hours ago

7 Slides

Lesson

ASCII and Unicode: Character Encoding

GCSE Computer Science · AQA 8525

1 day ago