Generic classes
This article is a draft
Generic classes are classes that can be parameterized by one or more types.
The most notable examples of generic classes you've seen are probably
list and dict: you can write list[int] to denote a list with just integers,
and you can write dict[str, int] to denote a dictionary where the keys are
strings and the values are integers.
This parameterization is not reserved for built-in types: you and the libraries you use can define their own classes with parameters. This chapter teaches you how to do that.
You might already have an intuition as to why configuring a class with
another type is useful.
Imagine for a moment that the list class does not accept parameters and
can only be used as is in type annotations:
numbers = [1, 2, 3]
reveal_type(numbers) # list
first = numbers[0]
reveal_type(first) # what is this? `object`? `Any`?
first + 42 # is this allowed? why?
len(first) # is this allowed? why?
A lot of programming is concerned with processing collections.
Either accessing the elements of a collection would not really be type checked
(since elements would be inferred as Any or something similar);
or accessing the elements of a collection would require extra conditionals,
leading to very silly code that checks if e.g. first is an int here.
This would be such a bad experience that adding type annotations to code would
rarely be worth the hassle.
Basic syntax
To make a class generic, add square brackets after the class name and list one or more names that will act as type parameters.
Naming type parameters
For collections and iterators (like list, dict, set,
collections.ChainMap, classes from itertools), and in other cases where
the subject matter is very abstract, it is common to use one-letter names for
type variables, like T; T, U, V; A, B; K, V (for "key" and "value").
However, if you can think of a better name, use that instead. It will help with understanding their purpose when reading the code.
Just like in generic functions (see Chapter 3), you can then use the type parameters in type annotations.
Here, we created a Box class that accepts one type parameter, representing
the type of the value it's storing.
Then, on line 11, we created an instance of Box[int].
You can think of Box[int] as a copy-pasted version of Box where
every occurrence of T is replaced by int:
class IntBox:
    def __init__(self, value: int) -> None:
        self._value = value

    def put(self, new_value: int) -> None:
        self._value = new_value

    def get(self) -> int:
        return self._value
Instantiating a generic class
You don't have to call the parameterized class as shown above; you can also use a variable annotation:
However, very often you can omit the annotation entirely, if T can be inferred from
the arguments to __init__:
Use this where possible.
Do not add annotations when they are unnecessary.
You've already seen a common case where this is not an option, though, namely when creating an empty collection:
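With an empty list or dict there is nothing to infer the element type from, so the annotation has to carry it. A small sketch (the variable names are illustrative):

```python
# No elements to infer from: the annotation tells the checker the element type.
names: list[str] = []
counts: dict[str, int] = {}

names.append("apple")
counts["apple"] = 1

# The call-style spelling also works, but it does extra work at runtime.
also_names = list[str]()

print(names, counts, also_names)
```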
One reason to avoid the former style is that it has a runtime cost [1].
More examples
Pair
An example with more than one type variable:
class Pair[L, R]:
    def __init__(self, left: L, right: R, /) -> None:
        self._left = left
        self._right = right

    @property
    def left(self) -> L:
        return self._left

    @property
    def right(self) -> R:
        return self._right

pair = Pair("red", 42)
reveal_type(pair) # Pair[str, int]
reveal_type(pair.left) # str
reveal_type(pair.right) # int
You can also add an explicit annotation where the type can be inferred, but it's not the type you wanted:
from typing import Literal
type Color = Literal["red", "blue"]
pair2: Pair[Color, int] = Pair("red", 42)
pair3: Pair[str, list[int]] = Pair("red", [True, True, False])
Stack
from collections.abc import Iterable, Iterator
from typing import reveal_type
class Stack[T]:
    def __init__(self, items: Iterable[T] = (), /) -> None:
        self._items = list(items)

    def __len__(self) -> int:
        return len(self._items)

    def __iter__(self) -> Iterator[T]:
        return iter(self._items)

    def push(self, item: T, /) -> None:
        self._items.append(item)

    def pop(self, /) -> T:
        return self._items.pop()

int_stack = Stack([1, 2, 3])
reveal_type(int_stack) # `Stack[int]`
stack_idk = Stack()
reveal_type(stack_idk) # mypy, ty, pyrefly say: `Stack[Any]`
                       # pyright says: `Stack[Never]` (see explanation)
stack_explicit: Stack[int] = Stack()
reveal_type(stack_explicit) # `Stack[int]`
str_stack = Stack(["a", "b", "c"])
reveal_type(str_stack) # `Stack[str]`
str_stack.push("something") # ok
reveal_type(str_stack.pop()) # `str`
str_stack.push(42) # type checking error
Type checkers don't agree on what to do with stack_idk.
mypy, pyrefly and ty all say that it's Stack[Any] (or something close),
because they don't have any clues as to what T in Iterable[T] stands for.
However, pyright infers it as Stack[Never].
We haven't seen typing.Never before, but it will probably appear in a later chapter.
It is an unusual special type that contains no values at all.
For example, if x: list[Never] then x is always empty.
pyright sees that () is empty, which means that it's a tuple[Never, ...]
and therefore an Iterable[Never], so it solves that T = Never.
Some type checkers have configuration options to catch situations where you
accidentally make a value with a partially unknown type (meaning that it has
Any as a parameter without you explicitly writing it).
- mypy complains that stack_idk needs an annotation, without any extra settings
- In pyrefly, enable the implicit-any configuration option
- In pyright or basedpyright, enable reportUnknownMemberType, reportUnknownVariableType and all the other ^reportUnknown.* options. Alternatively, set typeCheckingMode to strict, which changes the defaults of these options.
If you don't want to accept an initial iterable in the __init__, you can explicitly annotate
the attribute in the __init__ or the class body:
class Stack[T]:
    def __init__(self) -> None:
        self._items: list[T] = []

# or:
class Stack[T]:
    _items: list[T]

    def __init__(self) -> None:
        self._items = []
map
from collections.abc import Callable, Iterator, Iterable
class MapIter[Src, Dest]:
    def __init__(self,
                 source: Iterable[Src],
                 fn: Callable[[Src], Dest],
                 /) -> None:
        self._it = iter(source) # self._it inferred as: Iterator[Src]
        self._fn = fn

    def __iter__(self) -> Iterator[Dest]: # (1)
        return self

    def __next__(self) -> Dest:
        return self._fn(next(self._it))
(1) typing.Self might be more appropriate here, but we haven't introduced it yet; Iterator[Dest] or MapIter[Src, Dest] are perfectly fine.
What's interesting here is that once you instantiate MapIter, none of the
public interface of the class mentions Src.
It only matters inside of the class body: inside of __next__,
self._it is of type Iterator[Src], next(self._it) is of type Src,
and self._fn is of type Callable[[Src], Dest].
TODO: more examples
Difference between method-level and class-level type variables
In Chapter 3's "Generic methods" section we showed an example of a generic method with its own type variable. That's not the same as the whole class being generic. It's crucial to understand the difference.
Let's look at the example from Chapter 3:
from collections.abc import Callable, Iterator
class Times:
    def __init__(self, value: int) -> None:
        self._value = value

    def repeat[T](self, item: T, /) -> list[T]:
        return [item] * self._value

    def do[T](self, fn: Callable[[int], T], /) -> Iterator[T]:
        for i in range(self._value):
            yield fn(i)

times = Times(10)
reveal_type(times) # Times
#                    ^^^^^ no parameter!
strings = times.repeat("banana")
reveal_type(strings) # list[str]
numbers = times.repeat(42)
reveal_type(numbers) # list[int]
hundreds = times.do(lambda x: x * 1.5)
reveal_type(hundreds) # Iterator[float]
The Times class is not generic: you cannot parameterize it.
All of its attributes are also of a non-generic type.
The repeat method is a self-contained generic function,
and the do method is also a self-contained generic function.
They do not have some shared T type when you call those methods on the same instance.
To further illustrate the difference, here's a generic class that also has a generic method:
from collections.abc import Iterable, Iterator, Callable
class Stack[T]:
    def __init__(self, items: Iterable[T] = (), /) -> None:
        self._items = list(items)

    def __len__(self) -> int:
        return len(self._items)

    def __iter__(self) -> Iterator[T]:
        return iter(self._items)

    def push(self, item: T, /) -> None:
        self._items.append(item)

    def pop(self, /) -> T:
        return self._items.pop()

    # Python 3.14 evaluates annotations lazily; on 3.12-3.13, quote the
    # return annotation ("Stack[U]") so that it does not fail at runtime.
    def map[U](self, func: Callable[[T], U], /) -> Stack[U]:
        # Inside the method body, `map` still refers to the built-in function.
        return Stack(map(func, self._items))

strings = Stack(["apple", "banana", "cherry"])
reveal_type(strings) # Stack[str]
lengths = strings.map(len)
reveal_type(lengths) # Stack[int]
encoded = strings.map(lambda s: s.encode())
reveal_type(encoded) # Stack[bytes]
first_letters = strings.map(lambda s: s[:1])
reveal_type(first_letters) # Stack[str]
Here, the T type variable is scoped at the class level, and the U
type variable is scoped to the map method.
T describes variation that applies to a particular Stack instance, while
U describes variation that applies to a particular call of map.
Inheriting from a generic class
Scenario 1: continuing the same generic vibe
TODO (example: class ReversibleStack[T](Stack[T]))
Scenario 2: inheriting from a parameterized class
TODO (example: class LowerCaseStack(Stack[str]))
Scenario 3: inheriting with a transformed parameter
TODO (example: class MultiDict[K, V](Mapping[K, list[V]])?)
Variance of generic classes
In Chapter 4 we introduced the concepts of assignability and variance, and mentioned some rough rules for figuring out the variance of a class. Now that you know how to define your own generic classes, we'll make it more rigorous.
Recap
As a reminder, variance describes how assignability of a type parameter impacts
the assignability of the generic class as a whole.
Assuming that Banana is assignable to Fruit (perhaps a subclass of Fruit),
Class[T] is:
- covariant if Class[Banana] is assignable to Class[Fruit]
- contravariant if Class[Fruit] is assignable to Class[Banana]
- invariant if neither is true (Class[Fruit] and Class[Banana] are not assignable to each other)
A covariant class parameterized by T is a pure "producer" of Ts.
It's safe to assign a: Producer[Banana] to b: Producer[Fruit] because
you can only interact with b by getting some Fruit out of it.
Since every Banana is a Fruit, getting a Banana out of it is fine.
Immutable collections are typically covariant.
![Conveyor belt delivering fruits to a funnel labeled 'Belt[Fruit]',
and a conveyor belt delivering bananas labeled 'Belt[Bananas]' shown as assignable to it,
but not the other way around](covariance_belt_illustration.png)
Belt[T] is covariant in T
A contravariant class parameterized by T is a pure "consumer" of Ts.
It's safe to assign a: Consumer[Fruit] to b: Consumer[Banana] because
the only way to interact with a is to give it Fruits, and Banana is
a kind of fruit, so a must be capable of handling them too.
"Handlers" of things (events, requests, etc.) are typically contravariant.
![Funnel accepting any fruits from a conveyor belt labeled 'Funnel[Fruit]',
and a funnel with a banana-shaped hole and a warning saying 'Banana only!' with type 'Funnel[Banana]'.
The fruit funnel is assignable to the banana funnel, but not the other way around.](contravariance_belt_illustration.png)
Funnel[T] is contravariant in T
An invariant class parameterized by T has ways to interact with it that
get a T from it as well as give it Ts.
Flexing the parameter either way will land you in trouble: if you assign a: Both[Banana]
to b: Both[Fruit], you risk someone handing b an Apple, which it cannot handle;
and if you assign a: Both[Fruit] to b: Both[Banana], you risk b producing something
which is not a banana.
Mutable collections are typically invariant.
![Tank that accepts bananas and produces banana juice on the left labeled 'Tank[Banana]'.
Tank that accepts any fruits and produces multifruit juice on the right labeled 'Tank[Fruit]'.
They are not interchangeable](invariance_belt_illustration.png)
Tank[T] is invariant in T
The algorithm
This is the series of steps you need to take to figure out the variance of a class. It might be a bit complex, but we'll break down the steps in more detail.
1. If there are any public (not _underscored or __mangled) attributes in Class that depend on T, then Class is invariant in T [2].

2. Imagine that Banana is assignable to Fruit, but not the other way around. Construct two versions of the class, Class[Banana] and Class[Fruit].

3. Check the compatibility of private attributes. For each attribute _attr:

   - If Class[Fruit]._attr is not assignable to Class[Banana]._attr, then Class cannot be contravariant in T.
   - If Class[Banana]._attr is not assignable to Class[Fruit]._attr, then Class cannot be covariant in T.

   In other words: if all the attributes are covariant in T, the class is allowed to be covariant in T; if all the attributes are contravariant in T, the class is allowed to be contravariant in T.

4. Check the compatibility of methods:

   - If Class[Fruit].method wouldn't be compatible with Class[Banana].method, then Class cannot be contravariant in T.
   - If Class[Banana].method wouldn't be compatible with Class[Fruit].method, then Class cannot be covariant in T.

   child_func is compatible with parent_func if, given:

       def parent_func(self, arg: OriginalArg) -> OriginalReturn: ...
       def child_func(self, arg: NewArg) -> NewReturn: ...

   OriginalArg is assignable to NewArg and NewReturn is assignable to OriginalReturn.

5. By this point, if both covariance and contravariance are possible, then the class is considered covariant. If neither of them is possible, it is invariant.
This algorithm is phrased in terms of adding restrictions. We start with the assumption that the class can "flex" in both directions (co- and contra-) and we try to "cross out" one of the directions on each step.
If this is confusing, think of it this way: if all the elements are pointing in one direction (e.g.: prohibit contravariance, allowing covariance), then the entire class is "that direction" (e.g.: covariant). If the directions are mixed (sometimes we cross out covariance, sometimes cross out contravariance), then the class is invariant.
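The "crossing out" bookkeeping can be sketched in a few lines. This only illustrates the final combination step and is not how any type checker implements it: the Allowed class and the combine function are hypothetical names, the public-attribute rule from step 1 is left out, and each member's verdict is assumed to have been worked out by hand using steps 3 and 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allowed:
    """Which directions a single attribute or method still permits."""

    covariant: bool
    contravariant: bool


def combine(members: list[Allowed]) -> str:
    # Start by assuming the class can flex in both directions...
    co, contra = True, True
    # ...and let every member cross out the directions it forbids.
    for member in members:
        co = co and member.covariant
        contra = contra and member.contravariant
    if co:
        return "covariant"  # also the answer when both directions survive
    if contra:
        return "contravariant"
    return "invariant"


returns_t = Allowed(covariant=True, contravariant=False)  # e.g. def get(self) -> T
accepts_t = Allowed(covariant=False, contravariant=True)  # e.g. def put(self, x: T) -> None
ignores_t = Allowed(covariant=True, contravariant=True)   # e.g. def __len__(self) -> int

print(combine([returns_t, ignores_t]))  # covariant
print(combine([accepts_t, ignores_t]))  # contravariant
print(combine([returns_t, accepts_t]))  # invariant
```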
First, let's look at methods (step 4).
In Chapter 4, we introduced a shortcut: covariant type variables occur in "output positions" (return types), and contravariant type variables occur in "input positions" (argument types).
Step 4 in the algorithm phrases it more accurately: we check the assignability of argument and return types.
This correctly takes into account cases where argument and return types themselves are generics parameterized by the class's type parameter. In a sense, this plays back recursively into assignability. Let's look at a practical example:
from collections.abc import Callable

class Belt[T]:
    def get_next(self) -> T: ...
    def on_next(self, callback: Callable[[T], None]) -> None: ...
    def get_many_tuple(self, count: int) -> tuple[T, ...]: ...
    def get_many_list(self, count: int) -> list[T]: ...
- get_next: Banana is assignable to Fruit, and there are no arguments, so Belt[Banana].get_next is compatible with Belt[Fruit].get_next. But not the other way around: Fruit is not assignable to Banana, so Belt[Fruit].get_next is not compatible with Belt[Banana].get_next. This means Belt can still be covariant, but cannot be contravariant.

- on_next:

      Belt[Fruit].on_next: (self, callback: Callable[[Fruit], None]) -> None
      Belt[Banana].on_next: (self, callback: Callable[[Banana], None]) -> None

  Remember, Callable is contravariant in its parameters, so Callable[[Fruit], None] is assignable to Callable[[Banana], None]. Therefore, Belt[Banana].on_next is compatible with Belt[Fruit].on_next.

  As in get_next: Callable[[Banana], None] is not assignable to Callable[[Fruit], None], so this method would prevent Belt from being contravariant.

  This illustrates why "input position" and "output position" are a bit misleading. T is used as an input to a contravariant type, so it's used in the "output position" here.

- get_many_tuple:

      Belt[Fruit].get_many_tuple: (self, count: int) -> tuple[Fruit, ...]
      Belt[Banana].get_many_tuple: (self, count: int) -> tuple[Banana, ...]

  This is almost the same as get_next: tuple[Banana, ...] is assignable to tuple[Fruit, ...] (because tuple is covariant), and the count argument does not involve T, so Belt[Banana].get_many_tuple is compatible with Belt[Fruit].get_many_tuple. Not the other way around, so Belt cannot be contravariant.

- get_many_list: list[Banana] is not assignable to list[Fruit], and list[Fruit] is not assignable to list[Banana]. Therefore, this method requires Belt to be invariant.

  This might be surprising, since you can imagine that the method probably constructs a new list on the fly, since storing a list[T] as an attribute would make the class invariant.

  Here's one way you could violate type expectations if Belt could be covariant with get_many_list:

      bananas: list[Banana] = [Banana(), Banana()]

      class BananaBelt(Belt[Banana]):
          def get_many_list(self, count: int) -> list[Banana]:
              return bananas

      banana_belt = BananaBelt()
      generic_belt: Belt[Banana] = banana_belt
      #^ allowed because of direct subclassing
      fruit_belt: Belt[Fruit] = generic_belt
      #^ this would be allowed because of covariance
      fruits: list[Fruit] = fruit_belt.get_many_list(2)
      fruits.append(Apple())
      # now `bananas` contains an `Apple()`

  Another example

  Here's another example that doesn't involve subclassing Belt:

      from collections.abc import Iterable, Callable

      class CallbackList[T](list[T]):
          def __init__(
              self,
              iterable: Iterable[T] = (),
              callback: Callable[[T], None] = lambda _: None,
              /,
          ) -> None:
              self._callback = callback
              super().__init__(iterable)

          def append(self, o: T, /) -> None:
              self._callback(o)
              super().append(o)

      class Belt[T]:
          def __init__(self, item: T) -> None:
              self._items = (item,) * 100

          def get_first(self) -> T:
              return self._items[0]

          def get_batch(self) -> list[T]:
              def callback(item: T) -> None:
                  self._items = (item,) + self._items
              return CallbackList(self._items, callback)

      belt1: Belt[int] = Belt(42)
      belt2: Belt[object] = belt1
      #^ this would be allowed with covariance
      belt2.get_batch().append("a")
      first: int = belt1.get_first()
      #^ ...but first is actually a string!
TODO: expand on how attributes impact variance, and the role of private attributes
Examples of inferring variance
TODO
Debugging variance
TODO (show examples for pyright and mypy)
The Dataclass Disaster
TODO (and make a less dramatic title)
Generic classes before Python 3.12
TODO
[1] The amount of extra time and the reason for it can be quite surprising. Turns out that both creating Class[Param1, Param2] and instantiating it are slow:

    Python 3.14.5 (IPython 9.14.0)
    In [2]: %timeit defaultdict()
    109 ns ± 0.374 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
    In [3]: %timeit defaultdict[set, set[int]]()
    653 ns ± 4.85 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
    In [4]: %timeit defaultdict[set, set[int]]
    224 ns ± 0.312 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
    In [5]: %timeit defaultdict["set", "set[int]"]
    87.8 ns ± 0.497 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
    In [6]: d = defaultdict[set, set[int]]
    In [7]: %timeit d()
    403 ns ± 4.06 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

    Python 3.14.5 (IPython 9.14.0)
    In [1]: class Custom[T]: pass
    In [2]: %timeit Custom()
    65.9 ns ± 0.0822 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
    In [3]: %timeit Custom[int]()
    830 ns ± 2.59 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
    In [4]: %timeit Custom[int]
    620 ns ± 5.33 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
    In [5]: d = Custom[int]
    In [6]: %timeit d()
    202 ns ± 0.714 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

One potential reason instantiating a parameterized class is slow is that it attaches (or attempts to attach) some extra metadata to the object:
Of course, this is Python, so there's a limit to how paranoid it makes sense to be about achieving optimal performance. But I think it's good to be aware of how expensive common operations are.
[2] This is not quite true. The full truth is something like this:

- For private attributes, _attr: X or __attr: X is evaluated as if it was a method with a (self) -> X signature.
- For public attributes, attr: X is evaluated as if it was a pair of methods (or a property) with the signatures (self, arg: X) -> None and (self) -> X.

For most cases, this would work the same. Here's an example of an edge case where this is different:

    from typing import Any

    class FooAttr[T]:
        def __init__(self) -> None:
            self.things: list[T | Any] = []

    class FooMethod[T]:
        def get_things(self) -> list[T | Any]: ...
        def set_things(self, arg: list[T | Any]) -> None: ...

In both FooAttr and FooMethod, list[Fruit | Any] and list[Banana | Any] are assignable to each other, so the class is allowed to be both covariant and contravariant (as written, it is covariant, since it has no other items).