Last time we talked about the benefits of integrating mypy along with type hints into your development workflow. If you have never used mypy and type hinting in Python, do take a look at that article to see how to get started.
In that article, I mentioned that Python has a very sophisticated type system, much better than that of some languages like Java. So why don't we explore the type system some more?
Today we are going to discuss an important concept in typing called Generics. This is a topic that is very basic and complex at the same time. You will see it everywhere if you have ever worked in a modern statically typed language, yet few developers really understand it.
So, let's get started.
Generics
Normally, a Python list can contain anything. The list below contains an integer, a string and a user-defined object. Python's dynamic nature is what allows such a list.
class Person:
    def __init__(self, name):
        self.name = name

l: list = [10, "hello", Person("Aparna")]
When we have a list like this, there are no guarantees about any of the items. If we access l[0], l[1] or l[2], they could be of any data type. In practice, we don't create lists like this. We may want a list of int, a list of str, or a list of Person.
While the Python runtime itself allows us to put anything in a list, we can use type hints to constrain the type of the items that may be put in the list. We do this by putting the type of the item in [ ] as shown below. When we do this, mypy can infer the type when we access an item from the list, and it will validate that we only put the right type in the list.
l1: list[int] = [10, 20, 30]
l2: list[str] = ["hello", "bye"]
l3: list[Person] = [Person('Aparna'), Person('Anjali')]
a = l1[0] # a is an int
b = l2[0] # b is a str
c = l3[0] # c is a Person
print(c.name) # this is fine, we know c has that field
l1.append(40) # ok
l2.append(50) # NOT OK, mypy will complain here
list[int], list[str], list[Person]... you can see a pattern here, which we can generalise as list[T], where T is a variable representing the type of the item in the list. Unlike a normal variable that refers to some data, T refers to another type. While list[int] is a specific type, list[T] is a general type that represents all the specific types. Hence list[T] is called a Generic type and T is called a type variable (that's why we used the letter T for the variable, but we could have used any name for it).
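To make the idea of a type variable concrete, here is a minimal sketch of a function written in terms of T. The helper name `first` is hypothetical and only for illustration; `TypeVar` comes from the standard `typing` module.

```python
from typing import TypeVar

# T is a type variable: a placeholder for whatever item type the list holds
T = TypeVar("T")


def first(items: list[T]) -> T:
    """Return the first item; the return type follows the list's item type."""
    return items[0]


n = first([10, 20, 30])      # mypy infers int
s = first(["hello", "bye"])  # mypy infers str
print(n + 1)
print(s.upper())
```

Because the same T appears in both the parameter and the return type, mypy can tie the two together without us writing a separate function for every item type.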
Here is another generic type, the dict. It contains two type variables, one for the type of the keys, another for the type of the values. Once specified, mypy will validate the types.
items: dict[str, int] = {
    'jackfruit': 10,
    'banana': 15,
    'mango': 20
}
people: dict[str, Person] = {
    'Aparna': Person('Aparna'),
    'Anjali': Person('Anjali')
}
items['mango'] # int
rate = input('Enter the rate for fig: ')
items['fig'] = rate # mypy error ❌ rate is a str not an int
A generic type is not limited to one or two type variables. Here is a tuple type with four, since each element of a tuple can have a different type.
person: tuple[str, int, str, str] = ('Anand', 32, 'Mumbai', 'India')
Generic types are a powerful concept. They enable mypy to validate that the data doesn't get messed up somewhere due to a bug. Often the bug isn't obvious and is hidden between many operations. Can you spot the bug in the code below?
all_items: dict[str, int] = {
    'jackfruit': 10,
    'banana': 15,
    'mango': 20
}
cart: list[str] = []
print('Available Items')
print(','.join(all_items.keys()))

item = None
while item != '':
    item = input('Type item, or press enter to stop: ')
    cart.append(item)

rates: list[int] = [all_items.get(item) for item in cart]
total: int = sum(rates)
print(f'Total = {total}')
This is what mypy reports
> mypy test.py
test.py:15: error: List comprehension has incompatible type List[Optional[int]]; expected List[int] [misc]
Found 1 error in 1 file (checked 1 source file)
The problem here is that all_items.get() might return None if the item is not in the dictionary. mypy knows that from the type declaration of all_items and it complains that None is not a valid type for list[int]. In this program it is bound to happen, because the empty string typed to stop the loop is itself appended to cart. mypy has pointed out a valid bug here and we should write code to handle that error scenario.
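One possible way to handle that scenario is sketched below. The function name `cart_total` and the choice to skip unknown items (rather than, say, raise an error) are assumptions made for illustration, not the only reasonable fix.

```python
def cart_total(all_items: dict[str, int], cart: list[str]) -> int:
    """Sum the rates of the items in the cart, skipping unknown names."""
    rates: list[int] = []
    for item in cart:
        rate = all_items.get(item)  # the type here is Optional[int]
        if rate is None:
            # Unknown item, including the empty string typed to stop the loop
            print(f"Skipping unknown item: {item!r}")
            continue
        rates.append(rate)  # mypy has narrowed rate to int at this point
    return sum(rates)


all_items: dict[str, int] = {"jackfruit": 10, "banana": 15, "mango": 20}
print(f"Total = {cart_total(all_items, ['mango', 'banana', ''])}")
```

The explicit `is None` check is what lets mypy narrow `Optional[int]` down to `int`, so the list really does contain only integers by the time it is summed.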
That's the basics of generics. Generic types like list, dict, tuple and many others contain type variables that we can specify to create specialised types. mypy will use that information to validate that we aren't doing something wrong.
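Generic types are not limited to the built-in containers. As a rough sketch, here is how a generic type of our own might be declared with `Generic` and `TypeVar` from the standard `typing` module. The `Stack` class is a hypothetical example and does not appear elsewhere in this article.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """A hypothetical container whose item type is the type variable T."""

    def __init__(self) -> None:
        self._items: list[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


numbers: Stack[int] = Stack()
numbers.push(10)
numbers.push(20)
print(numbers.pop())
# numbers.push("thirty")  # mypy would reject this line: str is not int
```

Just like list[int], Stack[int] is a specialised type created by filling in the type variable.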
Inheritance
Now that we have an idea of generics, let's look at inheritance. Suppose we have a Person class and an Employee class that derives from it, as shown below.
class Person:
    def __init__(self, name: str) -> None:
        self.name = name

    def greet(self) -> str:
        return f'Hello {self.name}'

class Employee(Person):
    def __init__(self, name: str, employee_id: int) -> None:
        super().__init__(name)
        self.employee_id = employee_id

    def login(self) -> None:
        print(f'logging in with id {self.employee_id}')
Now let's try to assign different objects to each of these types.
p1: Person = Person('Anjali') # ✅ Valid
p2: Employee = Employee('Varsha', 1) # ✅ Valid
p3: Person = Employee('Aditi', 2) # ✅ Valid
The first two are obviously valid. The third one is a bit surprising. p3 is declared as a Person type, but we are assigning an Employee object to it. mypy says this is valid.
The Liskov Substitution Principle of Object Oriented design states that wherever the parent object is expected, you should be able to use the child object instead, since the child retains the capabilities of the parent, plus has some more of its own.
Let's take an example.
def hello(p: Person):
    print(p.greet())

p4: Employee = Employee('Aparna', 3)
hello(p4)
The hello function above takes a Person object as its argument and calls the greet method on it. We are calling the function, but passing in an Employee object instead of a Person. Since Employee inherits from Person, it also has the greet method and the code works.
So the first rule is:
- Any variable declared with a parent class type will also accept any child object instead
What about the reverse? Can I assign a Person object to a variable declared as Employee? This is not valid and mypy will complain.
p6: Employee = Person('Anand') # ❌ Not Valid

def hello_employee(e: Employee):
    e.login()

p7: Person = Person('Aishwarya')
hello_employee(p7) # ❌ Not Valid
You can see that the hello_employee function calls the login() method, which is not present in a Person object.
Here then, are the two rules for inheritance:
- You can substitute a child class object in the places where the parent type is declared
- You cannot substitute a parent class object in the places where the child type is needed
mypy will check the code and enforce both the rules.
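To see why the second rule matters, the sketch below runs both cases without mypy in the way. It repeats the article's classes so that it is self-contained, with one adaptation that is an assumption for the demo: the functions return strings instead of printing, so the results are easy to inspect.

```python
class Person:
    def __init__(self, name: str) -> None:
        self.name = name

    def greet(self) -> str:
        return f"Hello {self.name}"


class Employee(Person):
    def __init__(self, name: str, employee_id: int) -> None:
        super().__init__(name)
        self.employee_id = employee_id

    def login(self) -> str:
        return f"logging in with id {self.employee_id}"


def hello(p: Person) -> str:
    return p.greet()


def hello_employee(e: Employee) -> str:
    return e.login()


# Rule 1: a child object works wherever the parent type is expected
print(hello(Employee("Aparna", 3)))

# Rule 2: a parent object fails where the child type is needed.
# mypy flags this call; the ignore comment only silences it for the demo.
try:
    hello_employee(Person("Aishwarya"))  # type: ignore[arg-type]
except AttributeError as exc:
    print(f"Runtime failure: {exc}")
```

Without mypy, the second mistake would only surface as an AttributeError when that line actually runs. With mypy, it is caught before the program starts.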
Summary
In this article, we saw how generics work and how the type system interacts with inheritance. We also got a sneak peek at the concept of covariance.
In the next article, we will take a deeper look at covariance and its siblings, contravariance and invariance, and how all three interact with generics.