4.6. Dataclass Postinit

  • Dataclasses generate __init__() automatically

  • Defining __init__() manually replaces the generated one

  • For init-time validation there is __post_init__()

  • It runs after all fields have been assigned on the instance

  • Hence you only need to handle the negative cases (errors)

4.6.1. Initial Validation in Classes

  • __init__() serves not only to initialize fields

  • It can also be used to validate values

>>> from typing import ClassVar
>>>
>>>
>>> class Astronaut:
...     firstname: str
...     lastname: str
...     age: int
...     AGE_MIN: ClassVar[int] = 30
...     AGE_MAX: ClassVar[int] = 50
...
...     def __init__(self, firstname, lastname, age):
...         self.firstname = firstname
...         self.lastname = lastname
...         if not self.AGE_MIN <= age < self.AGE_MAX:
...             raise ValueError('Age is out of range')
...         else:
...             self.age = age
>>>
>>>
>>> astro = Astronaut('Mark', 'Watney', age=44)
>>> vars(astro)
{'firstname': 'Mark', 'lastname': 'Watney', 'age': 44}
>>>
>>> Astronaut('Mark', 'Watney', age=60)
Traceback (most recent call last):
ValueError: Age is out of range

4.6.2. Initial Validation in Dataclasses

  • Defining your own __init__() replaces the one generated by the dataclass

  • Therefore dataclasses provide the __post_init__() method

  • It runs after __init__() (as the name suggests)

  • It works on fields that are already assigned (this was done in __init__())

  • There is no need to assign them again

  • You can focus only on bailing out (checking only the negative path - errors)

>>> from dataclasses import dataclass
>>> from typing import ClassVar
>>>
>>>
>>> @dataclass
... class Astronaut:
...     firstname: str
...     lastname: str
...     age: int
...     AGE_MIN: ClassVar[int] = 30
...     AGE_MAX: ClassVar[int] = 50
...
...     def __post_init__(self):
...         if not self.AGE_MIN <= self.age < self.AGE_MAX:
...             raise ValueError('Age is out of range')
>>>
>>>
>>> Astronaut('Mark', 'Watney', age=44)
Astronaut(firstname='Mark', lastname='Watney', age=44)
>>>
>>> Astronaut('Mark', 'Watney', age=60)
Traceback (most recent call last):
ValueError: Age is out of range
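
The first bullet of this section can be checked directly. The sketch below is only an illustration (the class names `Generated` and `Manual` are hypothetical and do not come from the examples above): when a dataclass defines its own __init__(), the decorator keeps it, so nothing calls __post_init__() automatically and the validation is silently skipped.

```python
from dataclasses import dataclass


@dataclass
class Generated:
    age: int

    def __post_init__(self):
        # Called automatically by the generated __init__()
        if self.age < 0:
            raise ValueError('Age cannot be negative')


@dataclass
class Manual:
    age: int

    def __init__(self, age):
        # A hand-written __init__() is kept by the decorator;
        # nothing in it calls __post_init__()
        self.age = age

    def __post_init__(self):
        if self.age < 0:
            raise ValueError('Age cannot be negative')


try:
    Generated(age=-1)
    generated_validated = False
except ValueError:
    generated_validated = True

manual = Manual(age=-1)     # no error: validation never ran

print(generated_validated)  # True
print(manual.age)           # -1
```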

4.6.3. Date and Time Conversion

  • __post_init__() can also be used to convert data

  • Example: convert str '1969-07-21' to date object date(1969, 7, 21)

>>> from dataclasses import dataclass
>>> from datetime import date
>>>
>>>
>>> @dataclass
... class Astronaut:
...     firstname: str
...     lastname: str
...     born: date
...
...     def __post_init__(self):
...         self.born = date.fromisoformat(self.born)
>>>
>>>
>>> Astronaut('Mark', 'Watney', '1961-04-12')  # doctest: +NORMALIZE_WHITESPACE
Astronaut(firstname='Mark', lastname='Watney',
          born=datetime.date(1961, 4, 12))

>>> from dataclasses import dataclass
>>> from datetime import datetime
>>>
>>>
>>> @dataclass
... class Astronaut:
...     firstname: str
...     lastname: str
...     launch: datetime | None = None
...
...     def __post_init__(self):
...         if self.launch is not None:
...             self.launch = datetime.fromisoformat(self.launch)
>>>
>>>
>>> Astronaut('Mark', 'Watney')
Astronaut(firstname='Mark', lastname='Watney', launch=None)
>>>
>>> Astronaut('Mark', 'Watney', '1969-07-21T02:56:15+00:00')  # doctest: +NORMALIZE_WHITESPACE
Astronaut(firstname='Mark', lastname='Watney',
          launch=datetime.datetime(1969, 7, 21, 2, 56, 15, tzinfo=datetime.timezone.utc))

4.6.4. InitVar

  • Init-only fields

  • Added as parameters to the generated __init__

  • Passed to the optional __post_init__ method

  • They are not otherwise used by Data Classes

>>> import datetime
>>> from dataclasses import dataclass, InitVar
>>>
>>>
>>> @dataclass
... class DateTime:
...     string: InitVar[str]
...     date: datetime.date | None = None
...     time: datetime.time | None = None
...
...     def __post_init__(self, string: str):
...         dt = datetime.datetime.fromisoformat(string)
...         self.date = dt.date()
...         self.time = dt.time()
...
...
>>> apollo11 = DateTime('1969-07-21 02:56:15')
>>>
>>> apollo11
DateTime(date=datetime.date(1969, 7, 21), time=datetime.time(2, 56, 15))
>>>
>>> apollo11.date
datetime.date(1969, 7, 21)
>>>
>>> apollo11.time
datetime.time(2, 56, 15)

4.6.5. Use Case - 0x01

>>> from datetime import date, time, datetime
>>> from dataclasses import dataclass, InitVar
>>> from zoneinfo import ZoneInfo
>>>
>>>
>>> @dataclass
... class CurrentTime:
...     tzname: InitVar[str]
...     d: date | None = None
...     t: time | None = None
...     tz: str | None = None
...
...     def __post_init__(self, tzname):
...         current = datetime.now(ZoneInfo('UTC'))
...         localized = current.astimezone(ZoneInfo(tzname))
...         self.d = localized.date()
...         self.t = localized.time()
...         self.tz = localized.tzname()
>>>
>>>
>>> now = CurrentTime('Europe/Warsaw')
>>>
>>> print(now)  # doctest: +SKIP
CurrentTime(d=datetime.date(1969, 7, 21),
            t=datetime.time(2, 56, 15),
            tz='CEST')

4.6.6. Use Case - 0x02

>>> from dataclasses import dataclass, InitVar
>>>
>>>
>>> @dataclass
... class Astronaut:
...     fullname: InitVar[str | None] = None
...     firstname: str | None = None
...     lastname: str | None = None
...
...     def __post_init__(self, fullname):
...         if fullname:
...             self.firstname, self.lastname = fullname.split()
>>>
>>>
>>> Astronaut('Mark Watney')
Astronaut(firstname='Mark', lastname='Watney')
>>>
>>> Astronaut(firstname='Mark', lastname='Watney')
Astronaut(firstname='Mark', lastname='Watney')

4.6.7. Use Case - 0x03

>>> from dataclasses import dataclass, InitVar
>>>
>>>
>>> @dataclass
... class Email:
...     address: InitVar[str]
...     username: str | None = None
...     domain: str | None = None
...
...     def __post_init__(self, address):
...         self.username, self.domain = address.split('@')
...
...     def get_address(self):
...         return f'{self.username}@{self.domain}'
>>>
>>>
>>> myemail = Email('mwatney@nasa.gov')
>>>
>>> print(myemail)
Email(username='mwatney', domain='nasa.gov')
>>>
>>> print(myemail.username)
mwatney
>>>
>>> print(myemail.domain)
nasa.gov
>>>
>>> print(myemail.get_address())
mwatney@nasa.gov
>>>
>>> print(myemail.address)
Traceback (most recent call last):
AttributeError: 'Email' object has no attribute 'address'

4.6.8. Use Case - 0x04

>>> from typing import ClassVar
>>> from dataclasses import dataclass
>>>
>>>
>>> @dataclass
... class Astronaut:
...     firstname: str
...     lastname: str
...     age: int
...     AGE_MIN: ClassVar[int] = 30
...     AGE_MAX: ClassVar[int] = 50
...
...     def __post_init__(self):
...         age_min = self.AGE_MIN
...         age_max = self.AGE_MAX
...         if self.age not in range(age_min, age_max):
...             raise ValueError(f'Age {self.age} not in range {age_min} to {age_max}')
>>>
>>>
>>> Astronaut('Mark', 'Watney', 60)
Traceback (most recent call last):
ValueError: Age 60 not in range 30 to 50
>>>
>>> Astronaut('Mark', 'Watney', 60, AGE_MAX=70)
Traceback (most recent call last):
TypeError: Astronaut.__init__() got an unexpected keyword argument 'AGE_MAX'
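
The range() membership test used in this use case and the chained comparison from section 4.6.2 agree for integers, but not for fractional values, because a range contains integers only. A short sketch with a made-up age of 44.5:

```python
AGE_MIN = 30
AGE_MAX = 50

age = 44.5   # hypothetical fractional age

in_range = age in range(AGE_MIN, AGE_MAX)   # range holds integers only
in_bounds = AGE_MIN <= age < AGE_MAX        # works for any comparable number

print(in_range)                       # False
print(in_bounds)                      # True
print(44 in range(AGE_MIN, AGE_MAX))  # True
```

If ages may be floats, the chained comparison is therefore the safer check.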

4.6.9. Use Case - 0x05

  • Boundary check

>>> class Point:
...     def __init__(self, x, y):
...         if x < 0:
...             raise ValueError('Coordinate cannot be negative')
...         else:
...             self.x = x
...
...         if y < 0:
...             raise ValueError('Coordinate cannot be negative')
...         else:
...             self.y = y

>>> from dataclasses import dataclass
>>>
>>>
>>> @dataclass
... class Point:
...     x: int = 0
...     y: int = 0
...
...     def __post_init__(self):
...         if self.x < 0 or self.y < 0:
...             raise ValueError('Coordinate cannot be negative')
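
One limitation is worth keeping in mind: __post_init__() runs only while the object is being constructed. The sketch below repeats the dataclass from this use case and shows that a later assignment is not validated.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int = 0
    y: int = 0

    def __post_init__(self):
        if self.x < 0 or self.y < 0:
            raise ValueError('Coordinate cannot be negative')


try:
    Point(x=-1, y=0)
    rejected = False
except ValueError:
    rejected = True

point = Point(x=1, y=2)
point.x = -1   # accepted: __post_init__() is not re-run on assignment

print(rejected)  # True
print(point)     # Point(x=-1, y=2)
```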

4.6.10. Use Case - 0x06

  • Var Range

>>> from dataclasses import dataclass
>>> from typing import Final
>>>
>>>
>>> @dataclass
... class Point:
...     x: int = 0
...     y: int = 0
...     X_MIN: Final[int] = 0
...     X_MAX: Final[int] = 1024
...     Y_MIN: Final[int] = 0
...     Y_MAX: Final[int] = 768
...
...     def __post_init__(self):
...         if not self.X_MIN <= self.x < self.X_MAX:
...             raise ValueError(f'x value ({self.x}) is not between {self.X_MIN} and {self.X_MAX}')
...         if not self.Y_MIN <= self.y < self.Y_MAX:
...             raise ValueError(f'y value ({self.y}) is not between {self.Y_MIN} and {self.Y_MAX}')
>>>
>>>
>>> Point(0, 0)
Point(x=0, y=0, X_MIN=0, X_MAX=1024, Y_MIN=0, Y_MAX=768)
>>>
>>> Point(-1, 0)
Traceback (most recent call last):
ValueError: x value (-1) is not between 0 and 1024
>>>
>>> Point(0, 2000)
Traceback (most recent call last):
ValueError: y value (2000) is not between 0 and 768
>>>
>>> Point(0, 0, X_MIN=10, X_MAX=100)
Traceback (most recent call last):
ValueError: x value (0) is not between 10 and 100

4.6.11. Use Case - 0x07

  • Const Range

>>> from dataclasses import dataclass, field
>>> from typing import Final
>>>
>>>
>>> @dataclass
... class Point:
...     x: int = 0
...     y: int = 0
...     X_MIN: Final[int] = field(init=False, default=0)
...     X_MAX: Final[int] = field(init=False, default=1024)
...     Y_MIN: Final[int] = field(init=False, default=0)
...     Y_MAX: Final[int] = field(init=False, default=768)
...
...     def __post_init__(self):
...         if not self.X_MIN <= self.x < self.X_MAX:
...             raise ValueError(f'x value ({self.x}) is not between {self.X_MIN} and {self.X_MAX}')
...         if not self.Y_MIN <= self.y < self.Y_MAX:
...             raise ValueError(f'y value ({self.y}) is not between {self.Y_MIN} and {self.Y_MAX}')
>>>
>>>
>>> Point(0, 0)
Point(x=0, y=0, X_MIN=0, X_MAX=1024, Y_MIN=0, Y_MAX=768)
>>>
>>> Point(0, 0, X_MIN=10, X_MAX=100)
Traceback (most recent call last):
TypeError: Point.__init__() got an unexpected keyword argument 'X_MIN'

4.6.12. Use Case - 0x08

  • Init, Repr

>>> from dataclasses import dataclass, field
>>> from typing import Final
>>>
>>>
>>> @dataclass
... class Point:
...     x: int = 0
...     y: int = 0
...     X_MIN: Final[int] = field(init=False, repr=False, default=0)
...     X_MAX: Final[int] = field(init=False, repr=False, default=1024)
...     Y_MIN: Final[int] = field(init=False, repr=False, default=0)
...     Y_MAX: Final[int] = field(init=False, repr=False, default=768)
...
...     def __post_init__(self):
...         if not self.X_MIN <= self.x < self.X_MAX:
...             raise ValueError(f'x value ({self.x}) is not between {self.X_MIN} and {self.X_MAX}')
...         if not self.Y_MIN <= self.y < self.Y_MAX:
...             raise ValueError(f'y value ({self.y}) is not between {self.Y_MIN} and {self.Y_MAX}')
>>>
>>>
>>> Point(0, 0)
Point(x=0, y=0)
>>>
>>> Point(-1, 0)
Traceback (most recent call last):
ValueError: x value (-1) is not between 0 and 1024
>>>
>>> Point(0, -1)
Traceback (most recent call last):
ValueError: y value (-1) is not between 0 and 768
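
The Point variants above keep the limits as instance fields, which is why field(init=False, repr=False) is needed to hide them. As a comparison, and not as a recommendation, the same limits can be declared with ClassVar, as the Astronaut examples earlier in this chapter do. The sketch below assumes the limits are shared by all instances.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Point:
    x: int = 0
    y: int = 0
    X_MIN: ClassVar[int] = 0
    X_MAX: ClassVar[int] = 1024
    Y_MIN: ClassVar[int] = 0
    Y_MAX: ClassVar[int] = 768

    def __post_init__(self):
        if not self.X_MIN <= self.x < self.X_MAX:
            raise ValueError(f'x value ({self.x}) is not between {self.X_MIN} and {self.X_MAX}')
        if not self.Y_MIN <= self.y < self.Y_MAX:
            raise ValueError(f'y value ({self.y}) is not between {self.Y_MIN} and {self.Y_MAX}')


point = Point(0, 0)
field_names = [f.name for f in fields(Point)]

print(point)        # Point(x=0, y=0)
print(field_names)  # ['x', 'y']
```

ClassVar attributes are excluded from __init__(), from the repr and from fields(), so no extra field() options are required.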

4.6.13. Use Case - 0x09

>>> from dataclasses import dataclass, InitVar
>>>
>>>
>>> @dataclass
... class Phone:
...     full_number: InitVar[str]
...
...     country_code: str | None = None
...     number: str | None = None
...
...     def __post_init__(self, full_number: str):
...         self.country_code, self.number = full_number.split(' ', maxsplit=1)
>>>
>>>
>>> phone = Phone('+48 123 456 789')
>>>
>>> phone
Phone(country_code='+48', number='123 456 789')

4.6.14. Use Case - 0x0A

>>> from dataclasses import dataclass, InitVar
>>> from datetime import date, datetime
>>>
>>>
>>> @dataclass
... class Pesel:
...     number: InitVar[str]
...
...     pesel: str | None = None
...     birthday: date | None = None
...     gender: str | None = None
...     valid: bool | None = None
...
...     def calc_check_digit(self):
...         weights = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)
...         check = sum(w * int(n) for w, n in zip(weights, self.pesel))
...         return str((10 - check) % 10)
...
...     def __post_init__(self, number: str):
...         self.pesel = number
...         self.birthday = datetime.strptime(number[:6], '%y%m%d').date()
...         self.gender = 'male' if int(number[-2]) % 2 else 'female'
...         self.valid = number[-1] == self.calc_check_digit()
>>>
>>>
>>> pesel = Pesel('69072101234')
>>>
>>> print(pesel)  # doctest: +NORMALIZE_WHITESPACE
Pesel(pesel='69072101234',
      birthday=datetime.date(1969, 7, 21),
      gender='male',
      valid=False)
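
Why valid is False for this number can be traced by hand. The sketch below pulls the checksum out into a standalone function (the function name and the module-level constant are introduced only for this illustration) and applies the same arithmetic. The second number is made up so that only its last digit differs.

```python
WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)


def check_digit(pesel: str) -> str:
    # zip() stops after the ten weights, so the 11th digit is ignored
    total = sum(w * int(n) for w, n in zip(WEIGHTS, pesel))
    return str((10 - total) % 10)


# 6*1 + 9*3 + 0*7 + 7*9 + 2*1 + 1*3 + 0*7 + 1*9 + 2*1 + 3*3 = 121
total = sum(w * int(n) for w, n in zip(WEIGHTS, '69072101234'))

print(total)                       # 121
print(check_digit('69072101234'))  # 9
print(check_digit('69072101239'))  # 9
```

The computed digit is 9, while the number ends with 4, hence valid=False. A number ending with 9 would pass the same check.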

4.6.15. Assignments

Code 4.31. Solution
"""
* Assignment: Dataclass PostInit Syntax
* Complexity: easy
* Lines of code: 3 lines
* Time: 5 min

English:
    1. Use Dataclass to define class `Point` with attributes:
        a. `x: int` with default value `0`
        b. `y: int` with default value `0`
    2. When `x` or `y` has a negative value raise an exception
       `ValueError('Coordinate cannot be negative')`
    3. Use `dataclass` and validation in `__post_init__()`
    4. Run doctests - all must succeed

Polish:
    1. Użyj Dataclass do zdefiniowania klasy `Point` z atrybutami:
        a. `x: int` z domyślną wartością `0`
        b. `y: int` z domyślną wartością `0`
    2. Gdy `x` lub `y` mają wartość ujemną podnieś wyjątek
       `ValueError('Coordinate cannot be negative')`
    3. Użyj `dataclass` i walidacji w `__post_init__()`
    4. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from inspect import isclass
    >>> from dataclasses import is_dataclass

    >>> assert isclass(Point)
    >>> assert is_dataclass(Point)
    >>> assert hasattr(Point, 'x')
    >>> assert hasattr(Point, 'y')

    >>> Point()
    Point(x=0, y=0)

    >>> Point(x=0, y=0)
    Point(x=0, y=0)

    >>> Point(x=1, y=2)
    Point(x=1, y=2)

    >>> Point(x=-1, y=0)
    Traceback (most recent call last):
    ValueError: Coordinate cannot be negative

    >>> Point(x=0, y=-1)
    Traceback (most recent call last):
    ValueError: Coordinate cannot be negative
"""

from dataclasses import dataclass


# Use Dataclass to define class `Point` with attributes: `x` and `y`
# type: Type
@dataclass
class Point:
    x: int = 0
    y: int = 0


Code 4.32. Solution
"""
* Assignment: Dataclass PostInit DatabaseDump
* Complexity: medium
* Lines of code: 5 lines
* Time: 5 min

English:
    1. You received input data in JSON format from the API
        a. `str` fields: firstname, lastname, role, username, password, email,
        b. `datetime` fields: born, last_login,
        c. `bool` fields: is_active, is_staff, is_superuser,
        d. `list[dict]` field: user_permissions
    2. Using `dataclass` model data as class `User`
        a. Note that the order of fields matters for the tests to pass
    3. Parse fields with dates and store as `date` or `datetime` objects
    4. Run doctests - all must succeed

Polish:
    1. Otrzymałeś z API dane wejściowe w formacie JSON
        a. pola `str`: firstname, lastname, role, username, password, email,
        b. pola `datetime`: born, last_login,
        c. pola `bool`: is_active, is_staff, is_superuser,
        d. pola `list[dict]`: user_permissions
    2. Wykorzystując `dataclass` zamodeluj dane za pomocą klasy `User`
        a. Zwróć uwagę, że kolejność pól ma znaczenie aby testy przechodziły
    3. Sparsuj pola z datami i zapisz je jako obiekty `date` lub `datetime`
    4. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from inspect import isclass
    >>> from dataclasses import is_dataclass

    >>> assert isclass(User)
    >>> assert is_dataclass(User)

    >>> attributes = User.__dataclass_fields__.keys()
    >>> list(attributes)  # doctest: +NORMALIZE_WHITESPACE
    ['firstname', 'lastname', 'role', 'username', 'password', 'email', 'born',
     'last_login', 'is_active', 'is_staff', 'is_superuser', 'user_permissions']

    >>> data = json.loads(DATA)
    >>> result = [User(**user['fields']) for user in data]

    >>> last_login = [user['fields']['last_login'] for user in data]
    >>> last_login # doctest: +NORMALIZE_WHITESPACE
    ['1970-01-01T00:00:00.000+00:00',
     None,
     None,
     '1970-01-01T00:00:00.000+00:00',
     None,
     None]

    >>> last_login = [user.last_login for user in result]
    >>> last_login  # doctest: +NORMALIZE_WHITESPACE
    [datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc),
     None,
     None,
     datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc),
     None,
     None]


    >>> born = {user['fields']['born'] for user in data}
    >>> sorted(born)  # doctest: +NORMALIZE_WHITESPACE
    ['1994-10-12',
     '1994-11-15',
     '1995-07-15',
     '1996-01-21',
     '1999-08-02',
     '2006-05-09']

    >>> born = {user.born for user in result}
    >>> sorted(born)  # doctest: +NORMALIZE_WHITESPACE
    [datetime.date(1994, 10, 12),
     datetime.date(1994, 11, 15),
     datetime.date(1995, 7, 15),
     datetime.date(1996, 1, 21),
     datetime.date(1999, 8, 2),
     datetime.date(2006, 5, 9)]

    >>> result[0]  # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
    User(firstname='Melissa',
         lastname='Lewis',
         role='commander',
         username='mlewis',
         password='pbkdf2_sha256$120000$gvEBNiCeTrYa0$5C+NiCeTrYsha1PHog...=',
         email='melissa.lewis@nasa.gov',
         born=datetime.date(1995, 7, 15),
         last_login=datetime.datetime(1970, 1, 1, 0, 0,
                                      tzinfo=datetime.timezone.utc),
         is_active=True,
         is_staff=True,
         is_superuser=False,
         user_permissions=[{'eclss': ['add', 'modify', 'view']},
                           {'communication': ['add', 'modify', 'view']},
                           {'medical': ['add', 'modify', 'view']},
                           {'science': ['add', 'modify', 'view']}])

    >>> result[1]  # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
    User(firstname='Rick',
         lastname='Martinez',
         role='pilot',
         username='rmartinez',
         password='pbkdf2_sha256$120000$aXNiCeTrY$UfCJrBh/qhXohNiCeTrYH8...=',
         email='rick.martinez@ansa.gov',
         born=datetime.date(1996, 1, 21),
         last_login=None,
         is_active=True,
         is_staff=True,
         is_superuser=False,
         user_permissions=[{'communication': ['add', 'view']},
                           {'eclss': ['add', 'modify', 'view']},
                           {'science': ['add', 'modify', 'view']}])
"""

import json
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


DATA = ('[{"model":"authorization.user","pk":1,"fields":{"firstname":"Melissa"'
        ',"lastname":"Lewis","role":"commander","username":"mlewis","password"'
        ':"pbkdf2_sha256$120000$gvEBNiCeTrYa0$5C+NiCeTrYsha1PHogqvXNiCeTrY0CRS'
        'LYYAA90=","email":"melissa.lewis@nasa.gov","born":"1995-07-15","last_'
        'login":"1970-01-01T00:00:00.000+00:00","is_active":true,"is_staff":tr'
        'ue,"is_superuser":false,"user_permissions":[{"eclss":["add","modify",'
        '"view"]},{"communication":["add","modify","view"]},{"medical":["add",'
        '"modify","view"]},{"science":["add","modify","view"]}]}},{"model":"au'
        'thorization.user","pk":2,"fields":{"firstname":"Rick","lastname":"Mar'
        'tinez","role":"pilot","username":"rmartinez","password":"pbkdf2_sha25'
        '6$120000$aXNiCeTrY$UfCJrBh/qhXohNiCeTrYH8nsdANiCeTrYnShs9M/c=","born"'
        ':"1996-01-21","last_login":null,"email":"rick.martinez@ansa.gov","is_'
        'active":true,"is_staff":true,"is_superuser":false,"user_permissions":'
        '[{"communication":["add","view"]},{"eclss":["add","modify","view"]},{'
        '"science":["add","modify","view"]}]}},{"model":"authorization.user","'
        'pk":3,"fields":{"firstname":"Alex","lastname":"Vogel","role":"chemist'
        '","username":"avogel","password":"pbkdf2_sha256$120000$eUNiCeTrYHoh$X'
        '32NiCeTrYZOWFdBcVT1l3NiCeTrY4WJVhr+cKg=","email":"alex.vogel@esa.int"'
        ',"born":"1994-11-15","last_login":null,"is_active":true,"is_staff":tr'
        'ue,"is_superuser":false,"user_permissions":[{"eclss":["add","modify",'
        '"view"]},{"communication":["add","modify","view"]},{"medical":["add",'
        '"modify","view"]},{"science":["add","modify","view"]}]}},{"model":"au'
        'thorization.user","pk":4,"fields":{"firstname":"Chris","lastname":"Be'
        'ck","role":"crew-medical-officer","username":"cbeck","password":"pbkd'
        'f2_sha256$120000$3G0RNiCeTrYlaV1$mVb62WNiCeTrYQ9aYzTsSh74NiCeTrY2+c9/'
        'M=","email":"chris.beck@nasa.gov","born":"1999-08-02","last_login":"1'
        '970-01-01T00:00:00.000+00:00","is_active":true,"is_staff":true,"is_su'
        'peruser":false,"user_permissions":[{"communication":["add","view"]},{'
        '"medical":["add","modify","view"]},{"science":["add","modify","view"]'
        '}]}},{"model":"authorization.user","pk":5,"fields":{"firstname":"Beth'
        '","lastname":"Johanssen","role":"sysop","username":"bjohanssen","pass'
        'word":"pbkdf2_sha256$120000$QmSNiCeTrYBv$Nt1jhVyacNiCeTrYSuKzJ//Wdyjl'
        'NiCeTrYYZ3sB1r0g=","email":"","born":"2006-05-09","last_login":null,"'
        'is_active":true,"is_staff":true,"is_superuser":false,"user_permission'
        's":[{"communication":["add","view"]},{"science":["add","modify","view'
        '"]}]}},{"model":"authorization.user","pk":6,"fields":{"firstname":"Ma'
        'rk","lastname":"Watney","role":"botanist","username":"mwatney","passw'
        'ord":"pbkdf2_sha256$120000$bxS4dNiCeTrY1n$Y8NiCeTrYRMa5bNJhTFjNiCeTrY'
        'p5swZni2RQbs=","email":"","born":"1994-10-12","last_login":null,"is_a'
        'ctive":true,"is_staff":true,"is_superuser":false,"user_permissions":['
        '{"communication":["add","modify","view"]},{"science":["add","modify",'
        '"view"]}]}}]')

# Using `dataclass` model data as class `User`
# type: Type
@dataclass
class User:
    firstname: str
    lastname: str
    role: str
    username: str
    password: str
    email: str
    born: date
    last_login: Optional[datetime]
    is_active: bool
    is_staff: bool
    is_superuser: bool
    user_permissions: list[dict]
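
The class above declares the fields but does not yet convert the dates. One possible way to complete it, following the pattern from section 4.6.3, is sketched below. It is not necessarily the intended solution; the `record` dictionary is an abbreviated copy of the first entry of DATA, with the password replaced by a placeholder.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import Optional


@dataclass
class User:
    firstname: str
    lastname: str
    role: str
    username: str
    password: str
    email: str
    born: date
    last_login: Optional[datetime]
    is_active: bool
    is_staff: bool
    is_superuser: bool
    user_permissions: list[dict]

    def __post_init__(self):
        # JSON delivers dates as ISO 8601 strings (or null for last_login)
        self.born = date.fromisoformat(self.born)
        if self.last_login is not None:
            self.last_login = datetime.fromisoformat(self.last_login)


# Abbreviated record modelled on the first entry of DATA
record = {
    'firstname': 'Melissa',
    'lastname': 'Lewis',
    'role': 'commander',
    'username': 'mlewis',
    'password': 'placeholder',   # shortened on purpose
    'email': 'melissa.lewis@nasa.gov',
    'born': '1995-07-15',
    'last_login': '1970-01-01T00:00:00.000+00:00',
    'is_active': True,
    'is_staff': True,
    'is_superuser': False,
    'user_permissions': [{'eclss': ['add', 'modify', 'view']}],
}

user = User(**record)
expected_login = datetime(1970, 1, 1, tzinfo=timezone.utc)

print(user.born)                          # 1995-07-15
print(user.last_login == expected_login)  # True
```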