A Beginner’s Guide to Mathematics for Computer Science
Comets arrive, ablaze
with borrowed fire,
heralds of the everlasting.
Morning brushes them away.

In bedrock’s deep,
a river flows unseen,
wearing stone, where all that may
and may not be whispers.

Pearls wash ashore,
gleaming in our palms,
but seeds that slip and fall—
they become trees,
and trees become forests,
and we become dust.

After brilliance fades,
a distant silent thunder
echoes
from before the dawn of time.
—Do you promise it’s true?
—Yes.
—All of it?
—All of it—but only if you trust me.
—And if I don’t?
A proposition is a statement that is either true or false.
Connectives: $\lnot$ (not), $\land$ (and), $\lor$ (or), $\Rightarrow$ (implies), $\Leftrightarrow$ (if and only if).
De Morgan’s Laws:
$$\lnot(P \land Q) \equiv \lnot P \lor \lnot Q \qquad \lnot(P \lor Q) \equiv \lnot P \land \lnot Q$$
Predicate logic extends propositional logic with quantifiers over objects.
Quantifiers: $\forall$ (for all) and $\exists$ (there exists).
Variables: a variable is bound if some quantifier binds it, and free otherwise.
Direct proof: to prove $P \Rightarrow Q$, assume $P$ is true and derive $Q$.
Proof by contradiction: to prove $P$, assume $\lnot P$ and derive a contradiction (a worked example follows this list).
Proof by contrapositive: to prove $P \Rightarrow Q$, prove $\lnot Q \Rightarrow \lnot P$.
Proof by cases: to prove $P$, split into exhaustive cases and prove each.
Counterexample: to disprove $\forall x,\ P(x)$, find one $x$ where $P(x)$ is false.
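A classic worked example of proof by contradiction: to show $\sqrt{2}$ is irrational, assume $\sqrt{2} = p/q$ in lowest terms. Then $p^2 = 2q^2$, so $p$ is even; writing $p = 2r$ gives $q^2 = 2r^2$, so $q$ is even too, contradicting “lowest terms.” $\square$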
—Who are they?
—Outsiders.
—But why do there have to be walls?
—Walls define us. Make us who we are. And make us strong.
—But my friend is on that side. I want to see him.
A set is a collection of distinct objects.
Notation: $x \in A$ (membership), $\{x \mid P(x)\}$ (set-builder), $\emptyset$ (empty set), $A \subseteq B$ (subset).
Operations: union $A \cup B$, intersection $A \cap B$, difference $A \setminus B$, and power set $\mathcal{P}(A)$.
The Cartesian product of sets $A$ and $B$ is the set of all ordered pairs.
$$A \times B = \{(a, b) \mid a \in A \land b \in B\}$$
Product type $A \times B$: combines two types into ordered pairs.
Sum type $A + B$:
$$(\{\text{left}\} \times A) \cup (\{\text{right}\} \times B)$$
An element from $A$ or $B$, tagged with its origin.
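A minimal sketch of these two constructions in Python; the names `Left`, `Right`, and `Either` are illustrative, not a standard API:

```python
from dataclasses import dataclass
from typing import Union

# Product type A × B: an ordered pair holding one value of each component type.
pair: tuple[int, str] = (3, "three")

# Sum type A + B: a value from A or from B, tagged with its origin.
@dataclass
class Left:
    value: int

@dataclass
class Right:
    value: str

Either = Union[Left, Right]

def describe(e: Either) -> str:
    # The tag tells us which side the value came from.
    return f"left: {e.value}" if isinstance(e, Left) else f"right: {e.value}"
```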
A relation $R$ from set $A$ to set $B$ is a subset of $A \times B$.
$(a, b) \in R$, equivalently written $a \mathrel{R} b$.
Properties of a relation $R$ on a set $A$: reflexive ($\forall a,\ a \mathrel{R} a$), symmetric ($a \mathrel{R} b \Rightarrow b \mathrel{R} a$), antisymmetric ($a \mathrel{R} b \land b \mathrel{R} a \Rightarrow a = b$), and transitive ($a \mathrel{R} b \land b \mathrel{R} c \Rightarrow a \mathrel{R} c$).
Given a relation $E$ on a set $V$, we can visualize it as a graph.
A graph $G = (V, E)$ consists of $V$, a set of vertices, and $E \subseteq V \times V$, a set of edges. If $(u, v) \in E$, there is an edge from $u$ to $v$.
Types: directed (edges are ordered pairs) and undirected (edges are unordered); edges may also carry weights.
Terminology: a path is a sequence of vertices joined by edges; a cycle is a path returning to its start; the degree of a vertex is its number of incident edges; a graph is connected if a path joins every pair of vertices.
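A small sketch of a graph as a relation $E \subseteq V \times V$ in Python; the vertex names are arbitrary:

```python
# Vertices and edges of a small directed graph.
V = {1, 2, 3, 4}
E = {(1, 2), (2, 3), (3, 1), (3, 4)}

def successors(v):
    """All vertices w with an edge from v to w."""
    return {w for (u, w) in E if u == v}

print(successors(3))  # {1, 4}
```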
An equivalence relation is reflexive, symmetric, and transitive.
The equivalence class of $x$:
$$[x] = \{y \mid x \sim y\}$$
An equivalence relation partitions a set into disjoint equivalence classes.
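A sketch of how an equivalence relation induces a partition, using congruence mod 3 as an assumed example relation:

```python
def equivalence_classes(elements, related):
    """Group elements into classes; `related` must be an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, next(iter(cls))):  # compare with any representative
                cls.add(x)
                break
        else:
            classes.append({x})
    return classes

print(equivalence_classes(range(9), lambda a, b: a % 3 == b % 3))
# three classes: {0, 3, 6}, {1, 4, 7}, {2, 5, 8}
```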
A partial order is a relation that is reflexive, antisymmetric, and transitive. A partially ordered set (poset) $(S, \preceq)$ has special elements: minimal and maximal elements, least and greatest elements, and upper and lower bounds of subsets.
A function $f: A \to B$ is a relation where each input maps to exactly one output:
$$\forall a \in A,\ \exists!\, b \in B,\ (a, b) \in f$$
Properties: injective (distinct inputs map to distinct outputs), surjective (every element of $B$ is an output), bijective (both).
Composition:
$$(g \circ f)(x) = g(f(x))$$
The function type $A \to B$ contains all functions mapping $A$ to $B$. Higher-order functions take or return functions; e.g., $(A \to B) \to (B \to C) \to (A \to C)$ is the type of function composition.
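Composition as a higher-order function, a minimal Python sketch:

```python
def compose(g, f):
    """Return g ∘ f, the function x ↦ g(f(x))."""
    return lambda x: g(f(x))

increment = lambda n: n + 1
double = lambda n: n * 2
print(compose(double, increment)(5))  # double(increment(5)) = 12
```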
—The flowers are dying. They come back next year, right?
—No, not these.
—So we’ll never see them again?
—We could collect the seeds though. Plant them in the field.
—And we’ll see them again?
The natural numbers $\mathbb{N} = \{0, 1, 2, 3, \ldots\}$ represent counting. They are defined by the Peano axioms: zero is a natural number; every natural number has a unique successor; zero is not a successor; distinct numbers have distinct successors.
A sequence is a function $s: \mathbb{N} \to S$, written $s_0, s_1, s_2, \ldots$
A finite sequence of length $n$ is a function from $\{0, 1, \ldots, n-1\}$ to $S$.
To prove $\forall n \in \mathbb{N},\ P(n)$: prove the base case $P(0)$, then the inductive step, that $P(k)$ implies $P(k+1)$ for every $k$.
Strong induction: the inductive step uses $P(0), P(1), \ldots, P(k)$ to prove $P(k+1)$.
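For instance, to prove $\sum_{i=0}^{n} i = \frac{n(n+1)}{2}$: the base case $n = 0$ gives $0 = 0$; for the inductive step, assume the identity for $k$, then
$$\sum_{i=0}^{k+1} i = \frac{k(k+1)}{2} + (k+1) = \frac{(k+1)(k+2)}{2}.$$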
Recursive definitions build objects from base cases and inductive rules.
Factorial:
$$0! = 1 \qquad n! = n \cdot (n-1)!$$
Sum:
$$\text{sum}(0) = 0 \qquad \text{sum}(n) = n + \text{sum}(n-1)$$
List type:
List(A) = Empty | Cons(A, List(A))
Tree type:
Tree(A) = Leaf | Node(Tree(A), A, Tree(A))
A recursive function calls itself, with a base case and a recursive case that reduces toward the base. Example — Fibonacci:
$$\text{fib}(0) = 0 \qquad \text{fib}(1) = 1 \qquad \text{fib}(n) = \text{fib}(n-1) + \text{fib}(n-2)$$
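These definitions transcribe directly into Python; the naive Fibonacci recomputes subproblems in exponential time, which memoization avoids:

```python
from functools import lru_cache

def factorial(n):
    return 1 if n == 0 else n * factorial(n - 1)

@lru_cache(maxsize=None)        # memoize to avoid exponential blowup
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(factorial(5), fib(10))    # 120 55
```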
Permutations of $k$ from $n$:
$$P(n, k) = n \cdot P(n-1, k-1) = \frac{n!}{(n-k)!}$$
Combinations:
$$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k} = \frac{n!}{k!\,(n-k)!}$$
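The Pascal recurrence for $\binom{n}{k}$ as a recursive Python sketch, valid for $0 \leq k \leq n$:

```python
def choose(n, k):
    if k == 0 or k == n:
        return 1                      # one way to pick none or all
    return choose(n - 1, k - 1) + choose(n - 1, k)

print(choose(5, 2))  # 10
```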
—Careful! These rocks won’t hold!
—What if I fall? It’s a long way down!
—Don’t look down! You won’t fall.
—But how do you know for sure?
—I don’t! You just have to trust me.
Number systems are constructed from simpler ones using equivalence relations.
An algebraic structure $(S, *)$ pairs a set $S$ with a binary operation $*: S \times S \to S$ satisfying specified axioms. Different axioms define different structures: monoids, groups, rings, fields.
A monoid $(M, *, e)$ satisfies associativity and has an identity element:
Examples: $(\mathbb{N}, +, 0)$, $(\mathbb{N}, \cdot, 1)$, strings under concatenation with identity $\varepsilon$, and lists under append with the empty list.
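Because the operation is associative with an identity, folding over a monoid is well defined even for empty sequences. A Python sketch; the function name `mconcat` is illustrative:

```python
from functools import reduce

def mconcat(op, identity, values):
    """Combine values with an associative op; identity covers the empty case."""
    return reduce(op, values, identity)

print(mconcat(lambda a, b: a + b, 0, [1, 2, 3]))     # 6   — (ℕ, +, 0)
print(mconcat(lambda a, b: a * b, 1, [2, 3, 4]))     # 24  — (ℕ, ·, 1)
print(mconcat(lambda a, b: a + b, "", ["ab", "c"]))  # 'abc' — strings with ε
```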
A sequence $(s_n)$ has limit $L$ if:
$$\forall \varepsilon > 0,\ \exists N,\ \forall n \geq N,\ d(s_n, L) < \varepsilon$$
$$\lim_{n \to \infty} s_n = L$$
A series $\sum a_n$ converges if its partial sums converge.
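A numeric illustration (not a proof): partial sums of the geometric series $\sum_n (1/2)^n$ approach the limit $2$:

```python
partial_sum = 0.0
for n in range(20):
    partial_sum += 0.5 ** n
print(partial_sum)  # 1.9999980926513672, approaching 2
```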
A probability space consists of a sample space $\Omega$ (all possible outcomes) and events, which are subsets of $\Omega$.
A probability function $P$ assigns probabilities to events, satisfying: $P(\Omega) = 1$; $P(A) \geq 0$ for every event $A$; and $P(A \cup B) = P(A) + P(B)$ for disjoint events $A$ and $B$.
Events $A$ and $B$ are independent if $P(A \cap B) = P(A) \cdot P(B)$.
A random variable is a function $X: \Omega \to \mathbb{R}$. Its probability mass function is $p(x) = P(X = x)$, satisfying $\sum p(x) = 1$ and $p(x) \geq 0$.
Law of Large Numbers:
$$\lim_{n \to \infty} \frac{X_1 + X_2 + \cdots + X_n}{n} = \mu$$
Central Limit Theorem: the standardized sum of independent random variables converges to a normal distribution.
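A quick simulation of the Law of Large Numbers stated above, for fair coin flips with $\mu = 0.5$; exact outputs vary by run:

```python
import random

for n in [10, 1_000, 100_000]:
    mean = sum(random.random() < 0.5 for _ in range(n)) / n
    print(n, mean)   # sample means drift toward 0.5 as n grows
```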
Entropy measures uncertainty in a random variable $X$ with probabilities $p_1, \ldots, p_n$:
$$H(X) = -\sum_i p_i \log_2 p_i$$
Measured in bits; $H(X)$ is the average number of bits needed to encode $X$.
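A direct Python transcription of the entropy formula:

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; terms with p = 0 contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0  — a fair coin needs one bit
print(entropy([0.9, 0.1]))  # ≈ 0.47 — a biased coin is more predictable
```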
Mutual information $I(X; Y) = H(X) - H(X \mid Y)$ represents how much knowing $Y$ reduces uncertainty about $X$.
—Did you get my letter?
—I did.
—Then why didn’t you come? They are coming for us.
—Can’t we go another day?
—There is no other day. The last train left this morning.
An alphabet $\Sigma$ is a finite set of symbols, e.g., $\Sigma = \{0, 1\}$ or $\Sigma = \{a, b, \ldots, z\}$.
A string over $\Sigma$ is a finite sequence over $\Sigma$. The empty string is $\varepsilon$; the length of string $s$ is $|s|$; concatenation juxtaposes two strings.
$\Sigma^*$ is the set of all strings over $\Sigma$:
$$\Sigma^* = \Sigma^0 \cup \Sigma^1 \cup \Sigma^2 \cup \cdots$$
where $\Sigma^0 = \{\varepsilon\}$, $\Sigma^1 = \Sigma$, and $\Sigma^n = \{s_1 s_2 \mid s_1 \in \Sigma^{n-1},\ s_2 \in \Sigma\}$.
$\Sigma^+ = \Sigma^* \setminus \{\varepsilon\}$ denotes all non-empty strings.
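Enumerating $\Sigma^n$ for a small alphabet, a minimal sketch with the standard library:

```python
from itertools import product

sigma = ['0', '1']
sigma_3 = [''.join(t) for t in product(sigma, repeat=3)]
print(sigma_3)  # ['000', '001', '010', '011', '100', '101', '110', '111']
```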
A regular language over $\Sigma$ is built from basic languages using regular operations.
Basic languages: $\emptyset$, $\{\varepsilon\}$, and $\{a\}$ for each $a \in \Sigma$.
Regular operations: union $L_1 \cup L_2$, concatenation $L_1 L_2 = \{s_1 s_2 \mid s_1 \in L_1,\ s_2 \in L_2\}$, and Kleene star $L^*$.
Regular languages are closed under union, intersection, complement, concatenation, and Kleene star. Regular expressions provide notation for describing them.
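Python’s `re` module implements this notation; the pattern below combines union and Kleene star:

```python
import re

pattern = re.compile(r'(?:ab|ba)*')      # star of the union {ab, ba}
print(bool(pattern.fullmatch('abba')))   # True:  ab · ba
print(bool(pattern.fullmatch('aab')))    # False: no decomposition works
print(bool(pattern.fullmatch('')))       # True:  star includes ε
```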
A context-free grammar $G = (V, T, P, S)$ consists of variables $V$, terminals $T$, production rules $P$, and a start variable $S$.
Example — balanced parentheses:
S ::= ε | (S) | SS
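The language this grammar generates can be recognized with a single counter; a minimal sketch, assuming the input contains only parentheses:

```python
def balanced(s):
    depth = 0
    for ch in s:
        depth += 1 if ch == '(' else -1
        if depth < 0:
            return False   # a ')' arrived before its matching '('
    return depth == 0      # every '(' must be closed

print(balanced('(()())'), balanced('())('))  # True False
```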
Example — arithmetic expressions:
<expr> ::= <number>
| <expr> + <expr>
| <expr> * <expr>
| (<expr>)
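The grammar above is ambiguous: it does not fix the precedence of + and *. A common fix splits it into expr/term/factor levels; here is a recursive-descent evaluator sketch over that restructured grammar, with illustrative names and no error handling:

```python
def evaluate(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr():                 # <expr> ::= <term> { "+" <term> }
        nonlocal pos
        value = term()
        while peek() == '+':
            pos += 1
            value += term()
        return value

    def term():                 # <term> ::= <factor> { "*" <factor> }
        nonlocal pos
        value = factor()
        while peek() == '*':
            pos += 1
            value *= factor()
        return value

    def factor():               # <factor> ::= <number> | "(" <expr> ")"
        nonlocal pos
        if peek() == '(':
            pos += 1
            value = expr()
            pos += 1            # skip ')', assuming well-formed input
            return value
        value = int(tokens[pos])
        pos += 1
        return value

    return expr()

print(evaluate(['2', '+', '3', '*', '4']))  # 14: '*' binds tighter than '+'
```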
Example — statements:
<stmt> ::= <var> := <expr>
| <stmt> ; <stmt>
| if <expr> then <stmt> else <stmt>
| while <expr> do <stmt>
—What has it been? Twelve years?
—Twelve years. Never thought I’d see you again.
—Remember those poppies we planted?
—Come on, let’s go see them.
—Would you look at that… It’s become an ocean.
A state is a function $s: \text{Var} \to \text{Val}$. The update $s[x \mapsto v]$ is the state identical to $s$ except $x$ maps to $v$:
$$s[x \mapsto v](y) = \begin{cases} v & \text{if } y = x \\ s(y) & \text{if } y \neq x \end{cases}$$
Denotational semantics interprets programs as mathematical objects: expressions denote values; statements denote state transformations $\text{State} \to \text{State}$.
Expressions:
$$\begin{aligned} \llbracket n \rrbracket(s) &= n \\ \llbracket x \rrbracket(s) &= s(x) \\ \llbracket e_1 + e_2 \rrbracket(s) &= \llbracket e_1 \rrbracket(s) + \llbracket e_2 \rrbracket(s) \\ \llbracket e_1 \times e_2 \rrbracket(s) &= \llbracket e_1 \rrbracket(s) \times \llbracket e_2 \rrbracket(s) \end{aligned}$$
Statements:
$$\begin{aligned} \llbracket x := e \rrbracket(s) &= s[x \mapsto \llbracket e \rrbracket(s)] \\ \llbracket c_1;\, c_2 \rrbracket(s) &= \llbracket c_2 \rrbracket(\llbracket c_1 \rrbracket(s)) \end{aligned}$$
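A sketch of these equations in Python, under an assumed encoding: numbers and variable-name strings for atomic expressions, nested tuples for compound forms, dicts for states (requires Python 3.10+ for `match`):

```python
def eval_expr(e, s):
    """⟦e⟧(s): expressions denote values."""
    match e:
        case int(n):         return n
        case str(x):         return s[x]
        case ('+', e1, e2):  return eval_expr(e1, s) + eval_expr(e2, s)
        case ('*', e1, e2):  return eval_expr(e1, s) * eval_expr(e2, s)

def exec_stmt(c, s):
    """⟦c⟧(s): statements denote state transformations."""
    match c:
        case (':=', x, e):   return {**s, x: eval_expr(e, s)}  # s[x ↦ ⟦e⟧(s)]
        case (';', c1, c2):  return exec_stmt(c2, exec_stmt(c1, s))

program = (';', (':=', 'x', 2), (':=', 'y', ('+', 'x', 3)))
print(exec_stmt(program, {}))  # {'x': 2, 'y': 5}
```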
A rewrite system $(T, R)$ pairs a set of terms with rewrite rules $L \to R$. A term $t$ rewrites to $t'$ (written $t \to t'$) if $t$ contains $L$ and $t'$ replaces it with $R$.
Desirable properties: termination (every rewrite sequence is finite) and confluence (the final result does not depend on the order in which rules are applied).
Lambda calculus is a formal system for expressing computation via functions.
Syntax:
$$M\ ::=\ x\ \mid\ \lambda x.\ M\ \mid\ M_1\, M_2$$
$\beta$-reduction:
$$(\lambda x.\ M)\, N \longrightarrow_\beta M[x := N]$$
where $M[x := N]$ substitutes $N$ for all free occurrences of $x$ in $M$.
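A naive sketch of substitution and a single β-step in Python, assuming no variable capture occurs (a real implementation must rename bound variables). Terms are encoded as: strings for variables, `('lam', x, body)` for abstractions, 2-tuples for applications:

```python
def substitute(m, x, n):
    """M[x := N], naive: assumes no free variable of N gets captured."""
    match m:
        case str(y):
            return n if y == x else y
        case ('lam', y, body):
            return m if y == x else ('lam', y, substitute(body, x, n))
        case (f, a):
            return (substitute(f, x, n), substitute(a, x, n))

def beta_step(term):
    """(λx. M) N →β M[x := N], when the root of the term is a redex."""
    match term:
        case (('lam', x, body), arg):
            return substitute(body, x, arg)
    return term

print(beta_step((('lam', 'x', ('x', 'x')), 'y')))  # ('y', 'y')
```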
A type system assigns types to terms and enforces constraints on their use. A typing judgment $\Gamma \vdash t : T$ means “in context $\Gamma$, term $t$ has type $T$”.
Simple types: base types $\text{Bool}$, $\text{Nat}$, $\text{Int}$; function type $T_1 \to T_2$; product type $T_1 \times T_2$; sum type $T_1 + T_2$.
Type safety requires: progress (a well-typed term is a value or can take an evaluation step) and preservation (taking a step does not change a term’s type).
Syntax:
$$t\ ::=\ x\ \mid\ \lambda x : T.\ t\ \mid\ t_1\, t_2$$
Typing rules:
$$\dfrac{\Gamma(x) = T}{\Gamma \vdash x : T} \quad \text{Var}$$ $$\dfrac{\Gamma,\, x : T_1 \vdash t : T_2}{\Gamma \vdash \lambda x : T_1.\ t\ :\ T_1 \to T_2} \quad \text{Abs}$$ $$\dfrac{\Gamma \vdash t_1 : T_1 \to T_2 \qquad \Gamma \vdash t_2 : T_1}{\Gamma \vdash t_1\, t_2 : T_2} \quad \text{App}$$
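These three rules translate one case each into a checker. A Python sketch under an assumed encoding: types are base-type strings or `('->', T1, T2)`; annotated abstractions are 4-tuples:

```python
def type_of(term, ctx):
    match term:
        case str(x):                      # Var: look x up in Γ
            return ctx[x]
        case ('lam', x, t1, body):        # Abs: Γ, x:T1 ⊢ body : T2
            t2 = type_of(body, {**ctx, x: t1})
            return ('->', t1, t2)
        case (f, arg):                    # App: f needs an arrow type
            tf = type_of(f, ctx)
            if not (isinstance(tf, tuple) and tf[0] == '->'):
                raise TypeError('applying a non-function')
            if type_of(arg, ctx) != tf[1]:
                raise TypeError('argument type mismatch')
            return tf[2]

print(type_of(('lam', 'x', 'Bool', 'x'), {}))  # ('->', 'Bool', 'Bool')
```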
Parametric polymorphism allows types with type variables (type schemes $\forall \alpha.\ T$). Examples:
$$\forall \alpha.\ \alpha \to \alpha \qquad \forall \alpha.\ \text{List}(\alpha)$$
Types correspond to logical propositions; programs correspond to proofs:
| Type | Proposition |
|---|---|
| $T_1 \to T_2$ | $T_1 \Rightarrow T_2$ |
| $T_1 \times T_2$ | $T_1 \land T_2$ |
| $T_1 + T_2$ | $T_1 \lor T_2$ |
A term of type $T$ is a proof of proposition $T$.
Small-step semantics defines individual computation steps via a transition relation $\langle e, s \rangle \to \langle e', s' \rangle$.
Expression rules:
$$\dfrac{}{\langle x, s \rangle \to \langle s(x), s \rangle}$$ $$\dfrac{\langle e_1, s \rangle \to \langle e_1', s' \rangle}{\langle e_1 + e_2, s \rangle \to \langle e_1' + e_2, s' \rangle}$$
Statement rules:
$$\dfrac{\langle e, s \rangle \to \langle e', s' \rangle}{\langle x := e, s \rangle \to \langle x := e', s' \rangle}$$ $$\dfrac{}{\langle x := v, s \rangle \to \langle \mathtt{skip}, s[x \mapsto v] \rangle}$$ $$\dfrac{\langle c_1, s \rangle \to \langle c_1', s' \rangle}{\langle c_1;\, c_2, s \rangle \to \langle c_1';\, c_2, s' \rangle}$$ $$\dfrac{}{\langle \mathtt{skip};\, c_2, s \rangle \to \langle c_2, s \rangle}$$
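A sketch of small-step execution for statements, reusing `eval_expr` from the denotational sketch above; for brevity, expressions are collapsed to a value in a single step:

```python
def step(c, s):
    """One transition ⟨c, s⟩ → ⟨c', s'⟩."""
    match c:
        case (':=', x, e):
            return 'skip', {**s, x: eval_expr(e, s)}
        case (';', 'skip', c2):
            return c2, s
        case (';', c1, c2):
            c1_next, s_next = step(c1, s)
            return (';', c1_next, c2), s_next

def run(c, s):
    while c != 'skip':
        c, s = step(c, s)
    return s

print(run((';', (':=', 'x', 1), (':=', 'y', ('+', 'x', 2))), {}))
# {'x': 1, 'y': 3}
```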
Big-step semantics defines complete evaluation: $\langle e, s \rangle \Downarrow v$ (expression evaluates to value) and $\langle c, s \rangle \Downarrow s'$ (command transforms state).
Expression rules:
$$\dfrac{}{\langle n, s \rangle \Downarrow n}$$ $$\dfrac{\langle e_1, s \rangle \Downarrow n_1 \qquad \langle e_2, s \rangle \Downarrow n_2}{\langle e_1 + e_2, s \rangle \Downarrow n_1 + n_2}$$
Statement rules:
$$\dfrac{\langle e, s \rangle \Downarrow v}{\langle x := e, s \rangle \Downarrow s[x \mapsto v]}$$ $$\dfrac{\langle c_1, s \rangle \Downarrow s' \qquad \langle c_2, s' \rangle \Downarrow s''}{\langle c_1;\, c_2, s \rangle \Downarrow s''}$$
Hoare logic provides formal reasoning about program correctness. A Hoare triple $\{P\}\, c\, \{Q\}$ states: if $P$ holds before executing $c$, then $Q$ holds afterward.
Inference rules:
$$\dfrac{}{\{P[x \mapsto e]\}\; x := e\; \{P\}} \quad \text{Assign}$$ $$\dfrac{\{P\}\, c_1\, \{Q\} \qquad \{Q\}\, c_2\, \{R\}}{\{P\}\, c_1;\, c_2\, \{R\}} \quad \text{Seq}$$ $$\dfrac{\{P \land b\}\, c_1\, \{Q\} \qquad \{P \land \lnot b\}\, c_2\, \{Q\}}{\{P\}\;\mathtt{if}\; b\;\mathtt{then}\; c_1\;\mathtt{else}\; c_2\;\{Q\}} \quad \text{If}$$ $$\dfrac{\{I \land b\}\, c\, \{I\}}{\{I\}\;\mathtt{while}\; b\;\mathtt{do}\; c\;\{I \land \lnot b\}} \quad \text{While}$$ $$\dfrac{P' \Rightarrow P \qquad \{P\}\, c\, \{Q\} \qquad Q \Rightarrow Q'}{\{P'\}\, c\, \{Q'\}} \quad \text{Cons}$$
Examples:
$$\{x = 5\}\; y := x + 1\; \{y = 6\}$$ $$\{x > 0\}\;\mathtt{while}\; x > 0\;\mathtt{do}\; x := x - 1\;\{x = 0\}$$
Temporal logic formalizes reasoning about execution sequences. Key operators over an infinite state sequence: $\Box\,\varphi$ ($\varphi$ holds in every state, “always”) and $\Diamond\,\varphi$ ($\varphi$ holds in some future state, “eventually”).
Example:
$$\Box\,(\text{request} \Rightarrow \Diamond\,\text{response})$$
—Do you believe in fate?
—I don’t know. Why?
—It can’t all be chance, right? That we both came back here.
—I mean, it’s our hometown, after all.
—But what if everything was… prewritten?
A transition system $(S, s_0, T)$ consists of a set of states $S$, initial state $s_0$, and transition relation $T \subseteq S \times S$. An execution is a sequence $s_0, s_1, s_2, \ldots$ where $(s_i, s_{i+1}) \in T$.
A semiautomaton $(S, \Sigma, \delta)$ adds an input alphabet $\Sigma$ with transition function $\delta: S \times \Sigma \to S$. Reading input string $a_1 a_2 \cdots a_n$ produces a state sequence starting from $s_0$.
A finite state automaton $(Q, \Sigma, \delta, q_0, F)$ recognizes languages by accepting or rejecting strings.
Deterministic FSA (DFA): $\delta: Q \times \Sigma \to Q$ gives exactly one next state.
Nondeterministic FSA (NFA): $\delta: Q \times \Sigma \to \mathcal{P}(Q)$ gives a set of possible next states.
Equivalence: for every NFA there exists an equivalent DFA, via the subset construction (states of DFA = subsets of NFA states).
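A sketch of running an NFA via the subset construction, computing DFA states (sets of NFA states) on the fly; the example NFA, with assumed names, accepts binary strings ending in 1:

```python
nfa_delta = {                       # nondeterministic: sets of next states
    ('q0', '0'): {'q0'},
    ('q0', '1'): {'q0', 'q1'},
}
nfa_accept = {'q1'}

def accepts(string):
    states = frozenset({'q0'})      # a DFA state is a set of NFA states
    for ch in string:
        states = frozenset().union(
            *(nfa_delta.get((q, ch), set()) for q in states))
    return bool(states & nfa_accept)

print(accepts('0101'), accepts('10'))  # True False
```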
A pushdown automaton $(Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$ extends a FSA with a stack.
Deterministic and nondeterministic PDAs are not equivalent. Nondeterministic PDAs recognize exactly the context-free languages.
A Turing machine $(Q, \Sigma, \Gamma, \delta, q_0, q_\text{accept}, q_\text{reject})$ has an infinite tape.
At each step, the machine reads the symbol under the head, writes a symbol, moves left ($L$) or right ($R$), and changes state, according to $\delta: Q \times \Gamma \to Q \times \Gamma \times \{L, R\}$.
Nondeterministic Turing Machine (NTM):
$$\delta: Q \times \Gamma \to \mathcal{P}(Q \times \Gamma \times \{L, R\})$$
DTMs and NTMs recognize the same languages.
A universal Turing machine $U$ takes $\langle M, w \rangle$ (an encoding of machine $M$ and input $w$) and simulates $M$ on $w$. $U$ accepts iff $M$ accepts $w$; $U$ rejects iff $M$ rejects $w$.
The Church–Turing thesis: all reasonable models of computation are equivalent in power. Turing machines, lambda calculus, and every other proposed model compute exactly the same class of functions (the computable functions). This is supported by evidence but is not a mathematical theorem.
—Will we meet again… in the afterlife?
—Maybe—if there is one.
—But how do you know we’d end up in the same place?
—I don’t. I just… have a feeling.
—It doesn’t matter now. I trust you.
Big-$O$ (upper bound): $f(n) = O(g(n))$ if
$$\exists c \in \mathbb{R}^+,\ \exists n_0 \in \mathbb{N},\ \forall n \geq n_0:\ f(n) \leq c \cdot g(n)$$
Big-$\Omega$ (lower bound): $f(n) = \Omega(g(n))$ if
$$\exists c \in \mathbb{R}^+,\ \exists n_0 \in \mathbb{N},\ \forall n \geq n_0:\ f(n) \geq c \cdot g(n)$$
Big-$\Theta$ (tight bound): $f(n) = \Theta(g(n))$ if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.
Common classes: $O(1)$, $O(\log n)$, $O(n)$, $O(n \log n)$, $O(n^2)$, $O(2^n)$.
The class $\mathbf{P}$ consists of problems decidable in polynomial time, with typical bounds $O(1)$, $O(\log n)$, $O(n)$, $O(n^k)$. The class $\mathbf{NP}$ consists of problems decidable in nondeterministic polynomial time.
Equivalently: $L \in \mathbf{NP}$ iff there exists a polynomial-time verifier $V$ such that:
$$x \in L \Leftrightarrow \exists c,\ V(x, c) \text{ accepts}$$
where the certificate $c$ has polynomial size.
$A \leq_p B$ (a Karp reduction) if there exists a polynomial-time computable $f$ such that:
$$x \in A \Leftrightarrow f(x) \in B$$
$L$ is $\mathbf{NP}$-complete if $L \in \mathbf{NP}$ and:
$$\forall A \in \mathbf{NP},\ A \leq_p L$$
The $\mathbf{P}$ vs. $\mathbf{NP}$ question: can every problem whose solution is verifiable in polynomial time also be solved in polynomial time? Status: open.
$A \leq_T B$ (a Turing reduction) if there exists a Turing machine with an oracle for $B$ that decides $A$.
The halting problem: given $\langle M, w \rangle$, does $M$ halt on $w$?
Theorem: the halting problem is undecidable.
Proof. Assume $H$ decides the halting problem. Construct:
D(⟨M⟩):
if H(⟨M⟩, ⟨M⟩) accepts: loop forever
if H(⟨M⟩, ⟨M⟩) rejects: halt
Then $D(\langle D \rangle)$: if $H(\langle D \rangle, \langle D \rangle)$ accepts, $D$ loops (contradiction); if it rejects, $D$ halts (contradiction). Therefore $H$ cannot exist. $\square$
Rice’s theorem: all non-trivial semantic properties of Turing machine languages are undecidable. That is, if $\mathcal{P}$ is a non-trivial property of languages, then “does $M$ recognize a language with property $\mathcal{P}$?” is undecidable.
Gödel’s first incompleteness theorem: in any consistent formal system $F$ sufficiently powerful to express arithmetic, there exists a true statement $G$ such that neither $G$ nor $\lnot G$ is provable in $F$. The system is incomplete.
The proof uses self-reference analogous to the halting problem: “this statement is not provable.”
Gödel’s second incompleteness theorem: a consistent formal system $F$ of this kind cannot prove its own consistency:
$$F \not\vdash \text{Con}(F)$$
Incompleteness in logic and undecidability in computation reveal the same fundamental limit. There exist truths that no formal system can prove, questions that no algorithm can answer. The boundary we have reached is not a deficiency of our methods but a feature of reality itself.
Beyond this boundary lies not error or confusion but a different kind of truth—one that transcends all mechanical procedure, all formal proof, all algorithmic process.
What cannot be computed must be contemplated.
What cannot be proven must be lived.