Original post

Algebraic data types (also known as variant types, sum types or
discriminated unions) is a neat feature of some programming languages that
lets us specify that a value might take one of several related types, and
includes convenient syntax for pattern matching on these types at run-time.
Here’s a canonical binary tree example from Wikipedia, written in :

data Tree = Empty
          | Leaf Int
          | Node Tree Tree

A Tree can be either empty, or a leaf with one integer field, or a node with
two Tree fields – its left and right children. Here’s a function that finds
the depth of such trees:

depth :: Tree -> Int
depth Empty = 0
depth (Leaf n) = 1
depth (Node l r) = 1 + max (depth l) (depth r)

It takes a Tree and returns an integer, and its workings are laid out by
cases, depending on the run-time type of the parameter [1]. If we pass anything
that’s not a Tree into depth, we’ll get an error. Very concise and
elegant!

“Why doesn’t have algebraic data types” is a commonly asked question, and
there’s a FAQ entry about it. Even
more details can be found in a Reddit AMA session
the Go developers did in 2016 and in this proposal issue. The short version of the answer
is that it isn’t clear how this feature would interact with interfaces; how
would one variant type discriminate between multiple interfaces that may have
the same methods, etc. Indeed, it’s a tricky issue and no one has found a
satisfactory solution yet.

Another common answer is that Go already supports very similar functionality via
a combination of interfaces and run-time type switches. In this post I want to
show how it can be done, with some examples from synthetic to “real life”. The
full code for the samples in this post is available here.

Here’s the same binary tree type, written in Go [2]:

type Tree interface {
}

type Empty struct {
}

type Leaf struct {
    v int
}

type Node struct {
    left, right Tree
}

Tree is an interface; Empty, Leaf and Node implement the
interface. Note that the Tree interface is empty, so any type implements it,
even the built-in string. We can easily restrict this by providing a method
that would only be implemented by the interesting types:

type Tree interface {
    isTree()
}

And we’ll have to add empty imlementations for our types:

func (_ Leaf) isTree()  {}
func (_ Node) isTree()  {}
func (_ Empty) isTree() {}

Now only types with a isTree method will be allowed where the Tree
interface is expected.

The depth function employs a type switch to do run-time type dispatching:

func depth(t Tree) int {
    switch nt := t.(type) {
    case Empty:
        return 0
    case Leaf:
        return 1
    case Node:
        return 1 + max(depth(nt.left), depth(nt.right))
    default:
        log.Fatalf("unexpected type %T", nt)
    }
    return 0
}

It’s much more verbose than the Haskell pattern matching, but not less readable,
and just as efficient. One obvious deficiency is having to manually reject
other types in the default clause; Haskell’s pattern matching does this
automatically and also makes sure we’ve covered all cases.

A more realistic example

Let’s turn to a more realistic example that you could legitimatley encounter in
a Go program. Since I’m a compiler nerd this will be about trees again –
abstract syntax trees. These data structures are a key layer in most compilers,
produced by parsers and consumed by whatever comes after (type
checking/inference, lowering, optimization, evaluation, etc).

For this sample I wrote a complete evaluator for a simple calculator language
with arithmetic operations, variables you can set and access, and if
conditionals. The full code is in parser-evaluator.go;
here I’ll just focus on the AST nodes and how to “pattern match” them. This is
the Node interface:

type Node interface {
    isNode()
    String() string
}

It has the isNode guard method that all concrete node types will implement,
along with a String method to make sure all nodes implement the
fmt.Stringer interface for debugging. Here are a few sample concrete node
types:

type AssignStmt struct {
    Pos  scanner.Position
    Name string
    Expr Node
}

type IfStmt struct {
    Pos  scanner.Position
    Cond Node
    Then Node
    Else Node
}

type IntConstant struct {
    Pos   scanner.Position
    Value int
}

If you look at the full code, all of these have a dummy isNode method, along
with a String method.

This is how pattern matching happens in the Eval function that evaluates an
expression:

func (eval *Evaluator) Eval(n Node) (int, error) {
    switch nt := n.(type) {
    case IntConstant:
        return nt.Value, nil
    // ...
    // ... more cases
    // ...
    case AssignStmt:
        val, err := eval.Eval(nt.Expr)
        if err != nil {
            return 0, err
        }
        eval.symbolTable[nt.Name] = val
        return 0, nil
    case IfStmt:
        condVal, err := eval.Eval(nt.Cond)
        if err != nil {
            return 0, err
        }
        // Lazily evaluate Then or Else based on the result of Cond.
        if condVal == 0 {
            elseVal, err := eval.Eval(nt.Else)
            if err != nil {
                return 0, err
            }
            return elseVal, nil
        } else {
            thenVal, err := eval.Eval(nt.Then)
            if err != nil {
                return 0, err
            }
            return thenVal, nil
        }
    }
    return 0, fmt.Errorf("unmatched node %s", n)
}

Again, a type switch to cleanly discriminate between all the run-time types
n could take.

An even more realistic example

The evaluator shown in the previous section is something you could run into
in real programs, and in fact you do. Go’s own go/ast package uses the
same idiom for its AST nodes [3].

Looking in src/go/ast/ast.go in Go’s source code, we see:

// All statement nodes implement the Stmt interface.
type Stmt interface {
    Node
    stmtNode()
}

Node is an embedded interface for some position-related methods, and
stmtNode is a dummy method only Stmt types implement. Then, looking
in src/go/types/stmt.go we find many examples of the by-now familiar
type switch:

// stmt typechecks statement s.
func (check *Checker) stmt(ctxt stmtContext, s ast.Stmt) {
  // ...
    switch s := s.(type) {
    case *ast.BadStmt, *ast.EmptyStmt:
        // ignore
    case *ast.IfStmt:
        check.openScope(s, "if")
        defer check.closeScope()

        check.simpleStmt(s.Init)
        var x operand
        check.expr(&x, s.Cond)
        // ...
    // ...
}

Conclusion

It seems that Go can do just fine without adding variant types, due to the
challenges mentioned in the beginning of the post and the ease with which the
major use-cases can be implemented using existing language features. While
Go’s interfaces and type switches are certainly more verbose than Haskell’s
ADT declarations with pattern matching, and provide a bit less static safety,
they are nevertheless fully usable for the task and just as efficient.

Language design is a careful balance, and a lot can be said for keeping the
language simple with a minimal number of features, even if this leads to small
losses in expressivity. It’s not worth adding a significant language feature
just to cover for a small weakness in the existing ones that can be easily
worked around.


[1] It’s important to note that this is run-time type dispatch; the value
has to have a run-time tag saying what its type is. This will be
important in comparing with the Go implementation later on.
[2] This is not the idiomatic way of writing binary trees in Go, it’s
just here to provide a syntactically close comparison to the Haskell
code.
[3] go/ast is a library package useful for constructing tools to process
Go code. The Go compiler is now itself written in Go and has similar code
in it (though AFAICT it’s incompatible with the public-facing
go/ast).