Last time we talked about the benefits of integrating mypy along with type hints into your development workflow. If you have never used mypy and type hinting in Python, do take a look at that article to see how to get started.

Type Hinting: The Basics
To celebrate the release of mypy 1.0, we are going to be doing a series of articles on the various facets of Python’s type hinting feature. In this article, we see how to get started with the feature.

In that article, I mentioned that Python has a very sophisticated type system, in some respects more expressive than that of languages like Java. So why don't we explore the type system some more?

Today we are going to discuss an important concept in typing called Generics. This topic is basic and complex at the same time. You will see it everywhere if you have ever worked in a modern statically typed language, yet few developers really understand it.

So, let's get started.

Generics

Normally, a Python list can contain anything in it. The list below contains an integer, a string and a user-defined object. Allowing such a list is part of the dynamic nature of Python.

class Person:
    def __init__(self, name):
        self.name = name
        
l: list = [10, "hello", Person("Aparna")]

When we have a list like this, there are no guarantees about any of the items. If we access l[0], l[1] or l[2] they could be of any data type. In practice, we don't create lists like this. We may want a list of int or a list of str or a list of Person.

While the Python runtime itself allows us to put anything in a list, we can use type hints to constrain the type that is allowed to be put in the list. We do this by putting the type of the item in [ ] as shown below. When we do this, mypy can infer the type when we access an item from the list. And mypy will validate that we only put the right type in the list.

l1: list[int] = [10, 20, 30]
l2: list[str] = ["hello", "bye"]
l3: list[Person] = [Person('Aparna'), Person('Anjali')]

a = l1[0] # a is an int
b = l2[0] # b is a str
c = l3[0] # c is a Person
print(c.name) # this is fine, we know c has that field

l1.append(40) # ok
l2.append(50) # NOT OK, mypy will complain here

list[int], list[str], list[Person] ... you can see a pattern here, which we can generalise as list[T] where T is a variable representing the type of the item in the list. Unlike a normal variable that refers to some data, T refers to another type. While list[int] is a specific type, list[T] is a general type that represents all the specific types. Hence list[T] is called a Generic type and T is called a type variable (that's why we used the letter T for the variable, but we could have used any name for it).
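The same idea can be used in our own functions. Here is a minimal sketch, assuming the standard typing.TypeVar; the function name first is made up for this illustration. It says "give me a list of some type T and I will return a T", so mypy can work out the return type from the argument.

```python
from typing import TypeVar

# T is a type variable: a placeholder that stands for "some type"
T = TypeVar('T')

def first(items: list[T]) -> T:
    """Return the first item; the return type follows the list's item type."""
    return items[0]

n = first([10, 20, 30])      # mypy infers int
s = first(["hello", "bye"])  # mypy infers str

print(n + 1)      # 11
print(s.upper())  # HELLO
```

Because T is filled in separately at each call, the same function is checked as list[int] -> int in the first call and list[str] -> str in the second.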

Here is another generic type, the dict. It contains two type variables, one for the type of the keys, another for the type of the values. Once specified, mypy will validate the types.

items: dict[str, int] = {
    'jackfruit': 10,
    'banana': 15,
    'mango': 20
}

people: dict[str, Person] = {
    'Aparna': Person('Aparna'),
    'Anjali': Person('Anjali')
}

items['mango'] # int

rate = input('Enter the rate for fig: ')
items['fig'] = rate # mypy error ❌ rate is a str not an int
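One way to fix this error is to convert the text to an int before storing it. This is a minimal sketch: the helper name parse_rate is hypothetical, and it simply lets int() raise ValueError when the text is not a number.

```python
items: dict[str, int] = {
    'jackfruit': 10,
    'banana': 15,
    'mango': 20
}

def parse_rate(raw: str) -> int:
    """Convert user-entered text to an int; raises ValueError if it is not a number."""
    return int(raw.strip())

# In the real program the text would come from input(); a fixed string is used here
rate = parse_rate(' 25 ')
items['fig'] = rate  # ok, rate is an int now
print(items['fig'])  # 25
```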

There is no limit on the number of type parameters. Here is a tuple with four of them, since each element in the tuple could be a different type.

person: tuple[str, int, str, str] = ('Anand', 32, 'Mumbai', 'India')
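As a small sketch of what this buys us, unpacking the tuple keeps the type of each position. The variable names below are only for illustration.

```python
person: tuple[str, int, str, str] = ('Anand', 32, 'Mumbai', 'India')

# Unpacking keeps the per-position types
name, age, city, country = person

next_age = age + 1          # ok, age is an int
label = name + ', ' + city  # ok, both are str
# age + city                # mypy would flag this: int + str

print(next_age)  # 33
print(label)     # Anand, Mumbai
```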

Generic types are a powerful concept. They enable mypy to validate that the data doesn't get messed up somewhere due to a bug. Often the bug isn't obvious and is hidden between many operations. Can you spot the bug in the code below?

all_items: dict[str, int] = {
    'jackfruit': 10,
    'banana': 15,
    'mango': 20
}

cart: list[str] = []
print('Available Items')
print(','.join(all_items.keys()))

item = None
while item != '':
    item = input('Type item, or press enter to stop: ')
    cart.append(item)
    
rates: list[int] = [all_items.get(item) for item in cart]
total: int = sum(rates)
print(f'Total = {total}')

This is what mypy reports:

> mypy test.py

test.py:15: error: List comprehension has incompatible type List[Optional[int]]; expected List[int] [misc]
Found 1 error in 1 file (checked 1 source file)

The problem here is that all_items.get() might return None if the item is not in the dictionary. mypy knows that from the type declaration of all_items and it complains that None is not a valid type for list[int]. mypy has pointed out a valid bug here and we should write code to handle that error scenario.
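Here is one possible way to handle it. The function name cart_total and the choice to skip unknown items are assumptions made for this sketch; a real program might prefer to ask the user again instead.

```python
def cart_total(all_items: dict[str, int], cart: list[str]) -> int:
    """Add up the rates of the items in the cart, ignoring unknown names."""
    rates: list[int] = []
    for item in cart:
        rate = all_items.get(item)  # Optional[int]: None when the item is missing
        if rate is None:
            print(f'Skipping unknown item: {item!r}')
            continue
        rates.append(rate)  # mypy has narrowed rate to int here
    return sum(rates)

all_items: dict[str, int] = {
    'jackfruit': 10,
    'banana': 15,
    'mango': 20
}

# '' is what the input loop appends when the user presses enter to stop
total = cart_total(all_items, ['mango', 'fig', 'banana', ''])
print(f'Total = {total}')  # Total = 35
```

After the "is None" check, mypy knows that rate can only be an int, so appending it to list[int] passes the check.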

That's the basics of generics. Generic types like list, dict, tuple and many others contain type variables that we can specify to create specialised types. mypy will use that information to validate that we aren't doing something wrong.

Inheritance

Now that we have an idea of generics, let's look at inheritance. Suppose we have a Person class and an Employee class that derives from it, as shown below.

class Person:
    def __init__(self, name: str) -> None:
        self.name = name
        
    def greet(self) -> str:
        return f'Hello {self.name}'
    
class Employee(Person):
    def __init__(self, name: str, employee_id: int) -> None:
        super().__init__(name)
        self.employee_id = employee_id
        
    def login(self) -> None:
        print(f'logging in with id {self.employee_id}')

Now let's try to assign different objects to variables of each of these types.

p1: Person = Person('Anjali') # ✅ Valid
p2: Employee = Employee('Varsha', 1) # ✅ Valid
p3: Person = Employee('Aditi', 2) # ✅ Valid

The first two are obviously valid. The third one is a bit surprising. p3 is declared as a Person type, but we are assigning an Employee object to it. mypy says this is valid.

The Liskov Substitution Principle of Object Oriented design states that wherever the parent object is expected, you should be able to use the child object instead, since the child retains the capabilities of the parent, plus has some more of its own.

Let's take an example.

def hello(p: Person):
    print(p.greet())
    
p4: Employee = Employee('Aparna', 3)
hello(p4)

The hello function above takes a Person object as its argument and calls the greet method on it. We are calling the function, but passing in an Employee object instead of a Person. Since Employee inherits from Person, it also has the greet method and the code works.

So the first rule is:

  • Any variable declared with a parent class type will also accept any child object instead
😄 If you want to impress your friends, you can use computer science jargon and say that in assignment, Employee is covariant to Person. We will discuss more about covariance in the next article.

What about the reverse? Can I assign a Person object to a variable declared as Employee? This is not valid and mypy will complain.

p6: Employee = Person('Anand') # ❌ Not Valid

def hello_employee(e: Employee):
    e.login()

p7: Person = Person('Aishwarya')
hello_employee(p7) # ❌ Not Valid

You can see that the hello_employee function calls the login() method which is not present in a Person object.

Here then, are the two rules for inheritance:

  • You can substitute a child class object in the places where the parent type is declared
  • You cannot substitute a parent class object in the places where the child type is needed

mypy will check the code and enforce both rules.
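Sometimes a function receives a Person but wants to use the Employee features when they are available. An isinstance check is one common way to do that while keeping mypy happy. This is a minimal sketch: the classes are trimmed-down copies of the ones above, and the describe function is hypothetical.

```python
class Person:
    def __init__(self, name: str) -> None:
        self.name = name

class Employee(Person):
    def __init__(self, name: str, employee_id: int) -> None:
        super().__init__(name)
        self.employee_id = employee_id

def describe(p: Person) -> str:
    # Inside this branch mypy narrows p from Person to Employee
    if isinstance(p, Employee):
        return f'{p.name} (employee #{p.employee_id})'
    # Here p is only known to be a Person, so employee_id is off limits
    return p.name

print(describe(Person('Aishwarya')))    # Aishwarya
print(describe(Employee('Aparna', 3)))  # Aparna (employee #3)
```

The function accepts both kinds of object because of the first rule, and the isinstance check makes sure the child-only attribute is used only when the object really is an Employee.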

Summary

In this article, we saw how generics work and how the type system interacts with inheritance. We also got a sneak peek at the concept of covariance.

In the next article, we will take a deeper look at covariance and its siblings, contravariance and invariance, and how all three interact with generics.
