Tutorial : A Verified Interpreter with Side Effects

17 minute read

Published: June 16, 2025

1. Introduction

In this final section we will do a mini tutorial to show the power of the free monad by building an interpreter for an expression language with side effects. The key idea here is that the free monad lets us separate what we want to do (a syntactic description of effectful computation) from how we want to do it (interpreting and executing the effects semantically).

Introduction
Language and Effects
Interpreter
Verification
Conclusion
Exercises
References

2. Language and Effects

We begin by defining a tiny expression language, with integers, variables, addition, and division. We use the Env type to represent environments, which are just mappings of variables to values:

inductive Expr where
  | val : Int → Expr
  | var : String → Expr
  | add : Expr → Expr → Expr
  | div : Expr → Expr → Expr

abbrev Env := List (String × Int)

We also define three effect types: mutable state (for the environment), errors (for failed variable lookups or division by zero), and a trace log (for inspection or debugging):

inductive StateEff : Type → Type where
  | Get : StateEff Env
  | Put : Env → StateEff Unit

inductive ErrorEff : Type → Type where
  | Fail : String → ErrorEff Unit

inductive TraceEff : Type → Type where
  | Log : String → TraceEff Unit

We can then define a sum/coproduct of type constructors as follows:

inductive FSum (F G : Type → Type) (α : Type) where
  | inl : F α → FSum F G α
  | inr : G α → FSum F G α

infixl:50 "⊕" => FSum

And we define our overall effect signature as the nested sum:

abbrev Eff := StateEff ⊕ (ErrorEff ⊕ TraceEff)

Notice how free monads are extensible in their effects. Adding a new effect is simply constructing a new datatype and adding it to the Eff definition.

This type Eff is a pure description of the available commands in our language. Not what they do, just what kinds of actions exist. Our computations will now live in the type FreeM Eff α, which means they are pure syntax trees of abstract effects that eventually return a value of type α.

Lifting Effects into the Syntax Tree

To construct nodes in our effect AST, we define some helper functions that wrap each command in the FreeM monad. We first define a general way to lift an effect signature from F into its free monad FreeM F. This is the morphism lift : F -> FreeM F in the universal property diagram:

def lift (op : F ι) : FreeM F ι :=
  .liftBind op pure

def getEnv : FreeM Eff Env :=
  FreeM.lift (FSum.inl StateEff.Get)

def putEnv (e : Env) : FreeM Eff Unit :=
  FreeM.lift (FSum.inl (StateEff.Put e))

def fail (msg : String) : FreeM Eff Unit :=
  FreeM.lift (FSum.inr (FSum.inl (ErrorEff.Fail msg)))

def log (msg : String) : FreeM Eff Unit :=
  FreeM.lift (FSum.inr (FSum.inr (TraceEff.Log msg)))

Writing a Program

We can now write a little program. It logs a message, updates the environment, reads back a variable, and returns its increment:

def ex : FreeM Eff Int := do
  log "Starting"
  putEnv [("x", 10)]
  let env ← getEnv
  match env.find? (⋅.fst = "x") with
  | some (_, x) => pure (x + 1)
  | none => do fail "x not found"; pure 0

This “program” is constructing a tree of abstract effects independently of any execution or semantics. The calls to log, putEnv, getEnv, and fail are not doing anything yet, they are just nodes in a tree. When programs are represented as data structures you can do much more with them than just their operational interpretation, and you gain immense leverage and control over how you’d like to interpret them.

This separation between syntax and semantics is the core idea. We build up a value of type FreeM Eff Int that describes a program in terms of its desired behavior. This value is like an AST of effects. The tree is built using the constructors pure and bind, and the functorial action of the coproduct ⊕ lets us represent multiple kinds of effects simultaneously.

3. Interpreter

To run a program written in FreeM Eff α, we must interpret its abstract syntax tree into a concrete computation. This involves defining a catamorphism — a recursive fold over the FreeM structure — into a semantic domain of effectful computations:

Env → Trace → Except String (α × Env × Trace)

Before we can fold the entire syntax tree, we need to define how to interpret each individual effect. This is done via a handler, which is a function that gives meaning to each primitive operation in the effect functor Eff.

Eff α → Env → Trace → Except String (α × Env × Trace)

This function interprets each effect label into our semantic domain of exceptions, states, traces:

abbrev Trace := List String

def effInterp : {α : Type} → Eff α → Env → Trace → Except String (α × Env × Trace)
  | _, .inl StateEff.Get => fun env trace => .ok (env, env, trace)
  | _, .inl (StateEff.Put newEnv) => fun _ trace => .ok ((), newEnv, trace)
  | _, .inr (.inl (ErrorEff.Fail msg)) => fun _ _ => .error msg
  | _, .inr (.inr (TraceEff.Log msg)) => fun env trace => .ok ((), env, trace ++ [msg])

This gives us the semantics for a single Eff node. But interpreting a full program requires recursively folding over the FreeM tree. Think of effInterp as the function which we will fold over the tree.

We define this fold using our catamorphism:

def cataFreeM {F : Type u → Type v} {α β : Type w}
  (pureCase : α → β)
  (bindCase : {ι : Type u} → F ι → (ι → β) → β)
  : FreeM F α → β
| .pure a => pureCase a
| .liftBind op k => bindCase op (fun x => cataFreeM pureCase bindCase (k x))

This is saying, given a type β with a pureCase : α → β and a bindCase : {ι : Type u} → F ι → (ι → β) → β (making it an algebra over the free monad functor), we define a function cataFreeM : FreeM F α -> β. This is indeed the catamorphism guaranteed by initiality of FreeM F α.

We define the carrier type of our effect algebra as:

abbrev EffAction (α : Type) := Env → Trace → Except String (α × Env × Trace)

Then we define the two parts of the algebra:

-- Handle pure values
def effPure {α} (a : α) : EffAction α :=
  fun env tr => .ok (a, env, tr)

-- Handle effectful operations and sequencing
def effStep {α} :
    {ι : Type} → Eff ι → (ι → EffAction α) → EffAction α
  | _, .inl StateEff.Get,        k => fun env tr => k env env tr
  | _, .inl (StateEff.Put σ),    k => fun _   tr => k () σ tr
  | _, .inr (.inl (ErrorEff.Fail msg)), _ =>
        fun _ _  => .error msg
  | _, .inr (.inr (TraceEff.Log msg)), k =>
        fun env tr => k () env (tr ++ [msg])

Finally, we combine the two cases into a full interpreter via our catamorphism cataFreeM:

def run {α} : FreeM Eff α → EffAction α :=
  cataFreeM effPure effStep

This catamorphism run is the unique morphism from FreeM Eff α to our effect algebra EffAction α which interprets computation trees of type FreeM Eff α by evaluating them and executing their effects concretely.

4. Verification

Now that we have an interpreter, we can verify its correctness. What does correctness mean here?

In order to check that our interpreter is correct, we need some kind of semantics for our language, i.e., an assignment of meaning to our programs. In programming language theory, this is given as a formal relation that specifies when evaluation succeeds and what result it produces.

We’ll define a big-step operational semantics as an inductive relation, and then prove that the interpreter agrees with the semantics.

Semantics

We define a relation EvalRel e env trace res that says: under environment env and trace trace, expression e evaluates to result res. This result is either an error or a triple of the resulting value, environment, and trace. We also define a function eval which maps an expression to the effectful AST. Our correctness claim will then be that if EvalRel e env trace res holds (i.e., e evaluates to res), then our interpreter also returns res when run on the output of eval e.

inductive EvalRel : Expr → Env → Trace → Except String (Int × Env × Trace) → Prop where
| val :
    ∀ n env trace,
    EvalRel (.val n) env trace (.ok (n, env, trace))
| var_found :
    ∀ x env trace v,
    env.find? (·.fst = x) = some (x, v) →
    EvalRel (.var x) env trace (.ok (v, env, trace))
| var_missing :
    ∀ x env trace,
    env.find? (·.fst = x) = none →
    EvalRel (.var x) env trace (.error s!"unbound variable {x}")
| add :
    ∀ e1 e2 env trace₁ trace₂ trace₃ v1 v2 env₂ env₃,
    EvalRel e1 env trace₁ (.ok (v1, env₂, trace₂)) →
    EvalRel e2 env₂ trace₂ (.ok (v2, env₃, trace₃)) →
    EvalRel (.add e1 e2) env trace₁ (.ok (v1 + v2, env₃, trace₃))
| div_ok :
    ∀ e1 e2 env trace₁ trace₂ trace₃ v1 v2 env₂ env₃,
    v2 ≠ 0 →
    EvalRel e1 env trace₁ (.ok (v1, env₂, trace₂)) →
    EvalRel e2 env₂ trace₂ (.ok (v2, env₃, trace₃)) →
    EvalRel (.div e1 e2) env trace₁ (.ok (v1 / v2, env₃, trace₃))
| div_zero :
    ∀ e1 e2 env trace₁ trace₂ trace₃ v1 v2 env₂ env₃,
    v2 = 0 →
    EvalRel e1 env trace₁ (.ok (v1, env₂, trace₂)) →
    EvalRel e2 env₂ trace₂ (.ok (v2, env₃, trace₃)) →
    EvalRel (.div e1 e2) env trace₁ (.error "divide by zero")

The function eval : Expr → FreeM Eff Int constructs a tree of effects representing what should happen during evaluation from an expression. This is the object our interpreter consumes.

def eval : Expr → FreeM Eff Int
  | .val n => pure n
  | .var x => do
      let env ← getEnv
      match env.find? (·.fst = x) with
      | some (_, v) => pure v
      | none => do
          fail s!"unbound variable {x}"
          pure 0
  | .add e1 e2 => do
      let v1 ← eval e1
      let v2 ← eval e2
      pure (v1 + v2)
  | .div e1 e2 => do
      let v1 ← eval e1
      let v2 ← eval e2
      if v2 = 0 then
        fail "divide by zero"
        pure 0
      else
        pure (v1 / v2)

What do we want to prove?

We want to prove that eval followed by run gives the same result as the semantics. That is:

theorem eval_correct (e : Expr) (env : Env) (trace : Trace)
    (res : Except String (Int × Env × Trace))
    (h : EvalRel e env trace res) :
    run (eval e) env trace = res

Proof

We proceed by induction on the derivation of EvalRel e env trace res. In each case, we:

Unfold the definition of eval for the given expression
Use helper lemmas to simplify run (p >>= k)
Match the result with the expected output

These two helper lemmas simplify run (p >>= k):

theorem run_bind_ok {α β}
    {p : FreeM Eff α} {k : α → FreeM Eff β}
    {env env' : Env} {tr tr' : Trace} {v : α} (h : runEff p env tr = .ok (v, env', tr')) :
    runEff (p >>= k) env tr = runEff (k v) env' tr' := by ...

The above says if p succeeds with v, then p >>= k runs k v next.

theorem run_bind_err {α β}
    {p : FreeM Eff α} {k : α → FreeM Eff β}
    {env : Env} {tr : Trace} {msg : String} :
  runEff p env tr = .error msg →
  runEff (p >>= k) env tr = .error msg ...

This one says if p errors, then p >>= k errors with the same message.

Now we can prove the theorem.

theorem eval_correct (e : Expr) (env : Env) (trace : Trace)
    (res : Except String (Int × Env × Trace))
    (h : EvalRel e env trace res) :
    runEff (eval e) env trace = res := by
    induction' h
    · case val z env trace =>
      simp [eval, pure_eq_purePure, runEff, cataFreeM, effPure]
    · case var_found x env trace v h =>
      simp [runEff, eval, getEnv, bind_pure_comp, lift_def, cataFreeM, effStep, h, effPure]
    · case var_missing x env trace h =>
      simp [runEff, eval, bind, getEnv, fail, lift_def, cataFreeM, effStep, h]
    · case add e₁ e₂ env trace₁ trace₂ trace₃ v1 v2 env₂ env₃ h₁ h₂ ih₁ ih₂ =>
      simp [eval, bind]
      have step₁ := runEff_bind_ok (p := eval e₁ ) (k := fun v1 => do
        let v2 ← eval e₂
        pure (v1 + v2)) ih₁
      simp [bind] at step₁; simp [step₁]
      have step₂ := runEff_bind_ok (p := eval e₂) (k := fun v2 => pure (v1 + v2)) ih₂
      simp [bind] at step₂; simp [step₂]; congr
    · case div_ok e₁ e₂ env trace₁ trace₂ trace₃ v₁ v₂ env₂ env₃ v₂_ne_0 h₁ h₂ ih₁ ih₂  =>
      simp [eval, bind]
      have step₁ := runEff_bind_ok (p := eval e₁) (k := fun v1 => do
        let v2 ← eval e₂
        if v2 = 0 then do fail "divide by zero"; pure 0 else pure (v1 / v2)) ih₁
      simp [bind] at step₁; simp [step₁]
      have step₂ := runEff_bind_ok (p := eval e₂) (k := fun v₂ =>
        if v₂ = 0 then do fail "divide by zero"; pure 0 else pure (v₁ / v₂)) ih₂
      simp [bind] at step₂; simp [step₂, v₂_ne_0]
      congr
    · case div_zero e₁ e₂ env' trace₁ trace₂ trace₃ v₁ v₂ env₂ env₃ v₂_ne_0 h₁ h₂ ih₁ ih₂ =>
      simp [eval, bind]
      have step₁ := runEff_bind_ok (p := eval e₁) (k := fun v₁ => do let v₂ ← eval e₂; if v₂ = 0 then fail "divide by zero"; pure 0 else pure (v₁ / v₂)) ih₁
      simp [bind] at step₁; simp [step₁]
      have step₂ := runEff_bind_ok (p := eval e₂) (k := fun v₂ => if v₂ = 0 then (do fail "divide by zero"; pure 0) else pure (v₁ / v₂)) ih₂
      simp [bind] at step₂; simp [step₂, v₂_ne_0]
      simp [pure, fail, lift, runEff]
      congr

Now we have formally verified our interpreter agrees with our language semantics.

5. Conclusion

Hopefully this article was informative and helpful in understanding free monads mathematically and gave you a glimpse of their usefulness in programming. This is the first blog post I’ve written so I’m hoping it was enjoyable. To review what we did:

We introduced the concept of free objects in mathematics, starting with vector spaces, monoids, and groups.
We defined the free monad categorically as the initial algebra of a particular functor, drawing analogy to the List type.
In Haskell, we implemented the standard FreeM f a type and gave it a Monad instance.
We learned about strict positivity in dependently typed proof assistants and why the classic Free monad fails in Lean
We introduced the Freer monad as a strictly positive solution, and showed it forms a monad for any F : Type -> Type
We talked about initial algebras and how catamorphisms are canonical interpreters for effect functors, and how universal morphisms interpret free monadic computations into other monads.
We defined a small expression language with three effects: state, errors, and tracing, showed how effects can be represented as data structures using the FreeM monad, and wrote an interpreter for it as a catamorphism
We showed how this separation between syntax and semantics enables flexibility in evaluating and interpreting effectful languages.
We showed how to define an operational semantics and prove that the interpreter agrees with it.

All this code can be found here

7. Exercises

Define the standard monads such as Cont, Writer, and Reader as Free monads.
Define the dual notion - the cofree comonad and explore its properties.

8. References

nLab: Free Monad
CIS 5520 Lecture Notes on Freer Monads
The Dao of Functional Programming, Bartosz Milewski (2025)
Serokell: Introduction to Free Monads
Okmij : FreeM and Freer Monads: Putting Monads Back into Closet

Share on

Twitter Facebook LinkedIn

Tanner Duve

Tutorial : A Verified Interpreter with Side Effects

1. Introduction

Table of Contents

2. Language and Effects

3. Interpreter

4. Verification

5. Conclusion

7. Exercises

8. References

Share on

You May Also Enjoy

Future Blog Post

Initial Algebras, Catamorphisms, and Interpreters

Part 1: Defining the Free Monad in Lean

1. Introduction

ZkFlow: A Formally Verified Compiler for Zero-Knowledge Circuits