Original post

Many languages support subtyping, a kind of polymorphism that lets
us define hierarchical relations on types, with specific types being subtypes of
more generic types. For example, a Cat could be a subtype of Mammal, which
itself is a subtype of Vertebrate.

Intuitively, functions that accept any Mammal would accept a Cat too. More
formally, this is known as the Liskov substitution principle:

Let phi (x) be a property
provable about objects x of type T. Then phi (y)
should be true for objects y of type S where S is a subtype
of T.

A shorter way to say S is a subtype of T is S <: T. The relation <:
is also sometimes expressed as le, and can be thought of as “is less
general than”. So Cat <: Mammal and Mammal <: Vertebrate. Naturally,
<: is transitive, so Cat <: Vertebrate; it’s also reflexive, as T
<: T
for any type T [1].

Kinds of variance in subtyping

Variance refers to how subtyping between composite types (e.g. list of Cats
versus list of Mammals) relates to subtyping between their components (e.g. Cats
and Mammals). Let’s use the general Composite<T> to refer to some composite
type with components of type T.

Given types S and T with the relation S <: T, variance is a way
to describe the relation between the composite types:

  • Covariant means the ordering of component types is preserved:
    Composite<S> <: Composite<T>.
  • Contravariant means the ordering is reversed: Composite<T> <:
    Composite<S>
    [2].
  • Bivariant means both covariant and contravariant.
  • Invariant means neither covariant nor contravariant.

That’s a lot of theory and rules right in the beginning; the
following examples should help clarify all of this.

Covariance in return types of overriding methods in C++

In C++, when a subclass method overrides a similarly named method in a
superclass, their signatures have to match. There is an important exception to
this rule, however. When the original return type is B* or B&, the
return type of the overriding function is allowed to be D* or D&
respectively, provided that D is a public subclass of B. This rule is
important to implement methods like Clone:

struct Mammal {
  virtual ~Mammal() = 0;
  virtual Mammal* Clone() = 0;
};

struct Cat : public Mammal {
  virtual ~Cat() {}

  Cat* Clone() override {
    return new Cat(*this);
  }
};

struct Dog : public Mammal {
  virtual ~Dog() {}

  Dog* Clone() override {
    return new Dog(*this);
  }
};

And we can write functions like the following:

Mammal* DoSomething(Mammal* m) {
  Mammal* cloned = m->Clone();
  // Do something with cloned
  return cloned;
}

No matter what the concrete run-time class of m is, m->Clone() will
return the right kind of object.

Armed with our new terminology, we can say that the return type rule for
overriding methods is covariant for pointer and reference types. In other
words, given Cat <: Mammal we have Cat* <: Mammal*.

Being able to replace Mammal* by Cat* seems like a natural thing to
do in C++, but not all typing rules are covariant. Consider this code:

struct MammalClinic {
  virtual void Accept(Mammal* m);
};

struct CatClinic : public MammalClinic {
  virtual void Accept(Cat* c);
};

Looks legit? We have general MammalClinics that accept all mammals, and
more specialized CatClinics that only accept cats. Given a
MammalClinic*, we should be able to call Accept and the right one will
be invoked at run-time, right? Wrong. CatClinic::Accept does not actually
override MammalClinic::Accept; it simply overloads it. If we try to add
the override keyword (as we should always do starting with C++11):

struct CatClinic : public MammalClinic {
  virtual void Accept(Cat* c) override;
};

We’ll get:

error: ‘virtual void CatClinic::Accept(Cat*)’ marked ‘override’, but does not override
   virtual void Accept(Cat* c) override;
                ^

This is precisely what the override keyword was created for – help us find
erroneous assumptions about methods overriding other methods. The reality is
that function overrides are not covariant for pointer types. They are
invariant. In fact, the vast majority of typing rules in C++ are invariant;
std::vector<Cat> is not a subclass of std::vector<Mammal>, even though
Cat <: Mammal. As the next section demonstrates, there’s a good reason for
that.

Covariant arrays in Java

Suppose we have PersianCat <: Cat, and some class representing a list of
cats. Does it make sense for lists to be covariant? On initial thought, yes. Say
we have this (pseudocode) function:

MakeThemMeow(List<Cat> lst) {
    for each cat in lst {
        cat->Meow()
    }
}

Why shouldn’t we be able to pass a List<PersianCat> into it? After all,
all persian cats are cats, so they can all meow! As long as lists are immutable,
this is actually safe. The problem appears when lists can be modified. The
best example of this problem can be demonstrated with actual Java code, since
in Java array constructors are covariant:

class Main {
  public static void main(String[] args) {
    String strings[] = {"house", "daisy"};
    Object objects[] = strings; // covariant

    objects[1] = "cauliflower"; // works fine
    objects[0] = 5;             // throws exception
  }
}

In Java, String <: Object, and since arrays are covariant, it means that
String[] <: Object[], which makes the assignment on the line marked with
“covariant” type-check successfully. From that point on, objects is an
array of Object as far as the compiler is concerned, so assigning anything
that’s a subclass of Object to its elements is kosher, including integers
[3]. Therefore the last line in main throws an exception at run-time:

Exception in thread "main" java.lang.ArrayStoreException: java.lang.Integer
    at Main.main(Main.java:7)

Assigning an integer fails because at run-time it’s known that objects is
actually an array of strings. Thus, covariance together with mutability makes
array types unsound. Note, however, that this is not just a mistake – it’s a
deliberate historical decision made when Java didn’t have generics and
polymorphism was still desired; the same problem exists in C# – read this for
more details
.

Other languages have immutable containers, which can then be made covariant
without jeopardizing the soundness of the type system. For example in OCaml
lists are immutable and covariant.

Contravariance for function types

Covariance seems like a pretty intuitive concept, but what about contravariance?
When does it make sense to reverse the subtyping relation for composite types
to get Composite<T> <: Composite<S> for S <: T?

An important use case is function types. Consider a function that takes a
Mammal and returns a Mammal; in functional programming the type of this
function is commonly referred to as Mammal -> Mammal. Which function types
are valid subtypes of this type?

Here’s a pseudo-code definition that makes it easier to discuss:

func user(f : Mammal -> Mammal) {
  // do stuff with 'f'
}

Can we call user providing it a function of type Mammal -> Cat as f?
Inside its body, user may invoke f and expect its return value to be
a Mammal. Since Mammal -> Cat returns cats, that’s fine, so this usage
is safe. It aligns with our earlier intuition that covariance makes sense for
function return types.

Note that passing a Mammal -> Vertebrate function as f doesn’t work as
well, because user expects f to return Mammals, but our function
may return a Vertebrate that’s not a Mammal (maybe a Bird).
Therefore, function return types are not contravariant.

But what about function parameters? So far we’ve been looking at function types
that take Mammal – an exact match for the expected signature of f. Can
we call user with a function of type Cat -> Mammal? No, because user
expects to be able to pass any kind of Mammal into f, not just
Cats. So function parameters are not covariant. On the other hand, it
should be safe to pass a function of type Vertebrate -> Mammal as f,
because it can take any Mammal, and that’s what user is going to pass to
it. So contravariance makes sense for function parameters.

Most generally, we can say that Vertebrate -> Cat is a subtype of Mammal
-> Mammal
, because parameters types are contravariant and return types are
covariant. A nice quote that can help remember these rules is: be liberal in
what you accept and conservative in what you produce
.

This is not just theory; if we back to C++, this is exactly how function
types with std::function behave:

#include <functional>

struct Vertebrate {};
struct Mammal : public Vertebrate {};
struct Cat : public Mammal {};

Cat* f1(Vertebrate* v) {
  return nullptr;
}

Vertebrate* f2(Vertebrate* v) {
  return nullptr;
}

Cat* f3(Cat* v) {
  return nullptr;
}

void User(std::function<Mammal*(Mammal*)> f) {
  // do stuff with 'f'
}

int main() {
  User(f1);       // works

  return 0;
}

The invocation User(f1) compiles, because f1 is convertible to the type
std::function<Mammal*(Mammal*)> [4]. Had we tried to invoke User(f2) or
User(f3), they would fail because neither f2 nor f3 are proper
subtypes of std::function<Mammal*(Mammal*)>.

Bivariance

So far we’ve seen examples of invariance, covariance and contravariance. What
about bivariance? Recall, bivariance means that given S <: T, both
Composite<S> <: Composite<T> and Composite<T> <: Composite<S> are true.
When is this useful? Not often at all, it turns out.

In TypeScript, function parameters are bivariant. The following code compiles
correctly but fails at run-time:

function trainDog(d: Dog) { ... }
function cloneAnimal(source: Animal, done: (result: Animal) => void): void { ... }
let c = new Cat();

// Runtime error here occurs because we end up invoking 'trainDog' with a 'Cat'
cloneAnimal(c, trainDog);

Once again, this is not because the TypeScript designers are incompetent. The
reason is fairly intricate and explained on this page;
the summary is that it’s needed to help the type-checker treat functions that
don’t mutate their arguments as covariant for arrays.

That said, in TypeScript 2.6 this is being changed
with a new strictness flag that treats parameters only contravariantly.

Explicit variance specification in type-checking

If you had to guess which of the mainstream languages has the most advanced
support for variance in their type system, Python probably wouldn’t be your
first guess, right? I admit it wasn’t mine either, because Python is dynamically
(duck) typed. But the new type hinting support (described in PEP 484 with more details in PEP 483) is actually fairly advanced.

Here’s an example:

class Mammal:
    pass

class Cat(Mammal):
    pass

def count_mammals_list(seq : List[Mammal]) -> int:
    return len(seq)

mlst = [Mammal(), Mammal()]
print(count_mammals_list(mlst))

If we run mypy type-checking on this code, it will succeed.
count_mammals_list takes a list of Mammals, and this is what we passed
in; so far, so good. However, the following will fail:

clst = [Cat(), Cat()]
print(count_mammals_list(clst))

Because List is not covariant. Python doesn’t know whether
count_mammals_list will modify the list, so allowing calls with a list of
Cats is potentially unsafe.

It turns out that the typing module lets us express the variance of types
explicitly. Here’s a very minimal “immutable list” implementation that only
supports counting elements:

T_co = TypeVar('T_co', covariant=True)

class ImmutableList(Generic[T_co]):
    def __init__(self, items: Iterable[T_co]) -> None:
        self.lst = list(items)

    def __len__(self) -> int:
        return len(self.lst)

And now if we define:

def count_mammals_ilist(seq : ImmutableList[Mammal]) -> int:
    return len(seq)

We can actually invoke it with a ImmutableList of Cats, and this will
pass type checking:

cimmlst = ImmutableList([Cat(), Cat()])
print(count_mammals_ilist(cimmlst))

Similarly, we can support contravariant types, etc. The typing module also
provides a number of useful built-ins; for example, it’s not really necessary
to create an ImmutableList type, as there’s already a Sequence type that
is covariant.


[1] In most cases <: is also antisymmetric, making it a
partial order,
but in some cases it isn’t; for example, structs with permuted fields can
be considered subtypes of each other (in most languages they aren’t!) but
such subtyping is not antisymmetric.
[2] These terms come from math, and a good
rule of thumb to remember how they apply is: co means together, while
contra means against. As long as the composite types vary together (in
the same direction) as their component types, they are co-variant. When
they vary against their component types (in the reverse direction), they
are contra-variant.
[3] Strictly speaking, integer literals like 5 are primitives in Java
and not objects at all. However, due to autoboxing,
this is equivalent to wrapping the 5 in Integer prior to the
assignment.
[4] Note that we’re using pointer types here. The same example would work
with std::function<Mammal(Mammal)> and corresponding f1 taking
and returning value types. It’s just that in C++ value types are not
very useful for polymorphism, so pointer (or reference) values are much
more commonly used.