Intermediate

Sets and Formal Languages

AicademyAicademy
·A-Level Computer Science·AQA 7517·5 min
4.4.2.2 Maths for regular expressions

Sets and Set Notation

A set is an unordered collection of distinct elements. Sets are fundamental to the mathematics of formal languages and computation.

Defining a set:

A = {1, 2, 3, 4, 5}          // List notation (roster notation)
B = {x | x is an even integer, 0 ≤ x ≤ 10}  // Set-builder notation
  = {0, 2, 4, 6, 8, 10}

Key set notation:

| Symbol | Meaning | Example | | --------------------- | ---------------------------------------------- | ------------------------------ | -------------------------------------------- | --- | ----------- | ---- | | | is a member of set | is True | | | is not a member of | is True | | or | Empty set — contains no elements | | | | Natural numbers | | | | Integers | | | | Cardinality (size/number of elements) of | |

Sets do not have duplicate elements: .

Set Operations

Three fundamental operations on sets:

Union () — elements in , , or both:

Intersection () — elements in both and :

Complement ( or ) — all elements NOT in (relative to the universal set ):

If and , then .

Universal set () — the set of all elements under consideration in a given context. Every other set in that context is a subset of .

Cartesian Product

The Cartesian product of two sets and is the set of all ordered pairs where and :

Example:

.

Why it matters in computing: the Cartesian product defines transition tables for finite state machines. A transition function maps (state, input symbol) pairs to new states — is a Cartesian product.

Formal Languages: Alphabets and Strings

A formal language is a set of strings defined over an alphabet.

Alphabet () — a finite, non-empty set of symbols. Examples:

  • Binary alphabet:
  • Lowercase letters:
  • DNA bases:

String — a finite sequence of symbols from . Examples over :

  • a, b, aa, ab, ba, bb, aab, …

Empty string () — the string with zero symbols. .

— the set of all possible strings over , including . This is an infinite set even when is finite.

Language () — any subset of . Examples:

  • — a finite language
  • — strings of a's followed by b's (infinite language)
  • The set of all valid Python programs over the ASCII alphabet

How much of this have you taken in?

Quiz yourself on this section — free, no card needed.

Test myself

Countable and Uncountable Sets

AQA requires awareness of the following distinction:

Countable set — a set whose elements can be listed in a sequence (put into one-to-one correspondence with ).

  • Examples: , , (rationals), the set of all strings over any finite alphabet ()

Uncountable set — a set too large to be listed in any sequence; cannot be put into one-to-one correspondence with .

  • Examples: (real numbers), the set of all possible languages over an alphabet

Significance for computing:

  • is countable — there are countably many possible programs
  • The set of all languages over is uncountable — there are uncountably many possible problems
  • Therefore, most problems cannot be solved by any program — there are more problems than programs

(Extra context — not required beyond AQA 7517 awareness level) This is formalized by Cantor's diagonal argument — the same technique that proves the set of real numbers is uncountable and underpins the proof that the halting problem is unsolvable.

Common Exam Mistakes

1. Including duplicates in a set

Sets cannot contain duplicate elements. is the same as . Listing duplicates in a set answer shows a misunderstanding of what a set is.

2. Confusing and

is the empty set — it contains no elements; . is a set containing one element (the empty set); . These are different.

3. Confusing union and intersection symbols

(union) looks like a "U" — Union. (intersection) looks like an inverted U — the elements in common. Getting these backwards is a common notation error.

4. Treating ε as a character in the alphabet

is not a symbol in — it represents the absence of any symbol. The string has length 0. Adding it to a string leaves the string unchanged: .

5. Forgetting ε in

includes the empty string . If the question asks for all strings of length 0 or more over , the empty string must be included.

Generate revision on any topic you study

Type any topic you're studying and Aicademy generates a complete lesson, quiz, and flashcard set — personalised to your level.

Lessons on anything

Structured, level-matched lessons on any topic you study

Practice quizzes

Find out what you actually know before the exam does

Flashcard sets

Lock in key concepts with instant revision cards

Ask Aica

Stuck on something? Get a clear explanation, any time

Prev

Finite State Machines

Next

Regular Expressions

Related lessons

7 Slides

Lesson

Finite State Machines

A-Level Computer Science · AQA 7517

7 days ago

7 Slides

Lesson

Regular Expressions

A-Level Computer Science · AQA 7517

7 days ago

6 Slides

Lesson

BNF and Syntax Diagrams

A-Level Computer Science · AQA 7517

10 hours ago