Advanced Python Dataclasses
debt(d3/e2/b3/t6)
Closest to 'default linter catches the common case' (d3); ruff/pylint flag mutable default arguments and mypy catches type mismatches, though semantic mistakes like missing frozen=True go undetected.
Closest to 'one-line patch' (e1), slightly above; quick_fix is a swap (field(default_factory=list), add frozen=True, add __post_init__) — small localised changes per class.
Closest to 'localised tax' (b3); dataclass choices affect the class itself and its consumers but don't define system shape — though wide use across web/cli adds some reach.
Closest to 'serious trap' (t7); misconception conflates dataclasses with Pydantic (no validation/coercion), and mutable default behaviour contradicts intuition — devs assume validation happens when it doesn't, leading to silent data corruption.
Also Known As
TL;DR
Explanation
Basic dataclasses auto-generate boilerplate such as __init__, __repr__ and __eq__. The advanced features are: frozen=True makes instances immutable (like PHP readonly); slots=True generates __slots__ for memory efficiency (Python 3.10+); field(default_factory=list) supplies a fresh mutable default for each instance; __post_init__ runs after the generated __init__ and is the place for validation logic; KW_ONLY marks the fields after it as keyword-only (Python 3.10+); @dataclass(order=True) generates the ordering methods __lt__, __le__, __gt__ and __ge__. For data validation with coercion, use Pydantic; for pure data containers, dataclasses are simpler.
Common Misconception
Why It Matters
Common Mistakes
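- Mutable default argument: @dataclass class Foo: items: list = [] raises ValueError at class definition for list, dict and set defaults (and, since Python 3.11, for any unhashable default); other mutable objects used as defaults are silently shared by every instance. Use field(default_factory=list).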
- Not using frozen=True for value objects — mutable 'value objects' are not true value objects.
- Comparing dataclasses without order=True: only __eq__ is generated by default, so <, <=, > and >= raise TypeError.
- Using dataclasses where Pydantic is needed — dataclasses don't validate or coerce types at runtime.
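The last two mistakes can be seen in a short sketch. Point is a hypothetical class used only for illustration; the behaviour shown is that of a plain @dataclass with default arguments.

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int


# Annotations are not enforced at runtime: the string is stored unchanged.
p = Point("1", 2)
print(type(p.x).__name__)  # str

# __eq__ is generated by default, so equality works.
print(Point(1, 2) == Point(1, 2))  # True

# Ordering methods are not generated without order=True.
try:
    Point(1, 2) < Point(3, 4)
except TypeError as exc:
    print(f"TypeError: {exc}")
```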
Code Examples
# Mutable default: dataclasses refuse list, dict and set defaults at class definition
from dataclasses import dataclass

try:
    @dataclass
    class Order:
        items: list = []  # BUG: raises ValueError, so instances never get to share this list
except ValueError as exc:
    print(exc)  # the message recommends default_factory
from dataclasses import dataclass, field
from typing import ClassVar

@dataclass(frozen=True, slots=True)  # Immutable + memory-efficient (slots=True needs Python 3.10+)
class Money:
    amount: int  # Cents
    currency: str
    CURRENCIES: ClassVar[set] = {'USD', 'EUR', 'GBP'}  # ClassVar: not a field

    def __post_init__(self):
        if self.currency not in self.CURRENCIES:
            raise ValueError(f'Unknown currency: {self.currency}')
        if self.amount < 0:
            raise ValueError('Amount cannot be negative')

@dataclass
class Order:
    items: list = field(default_factory=list)  # Correct mutable default
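To show how such classes behave in use, here is a small sketch. It redefines a trimmed-down Money (without the validation) and Order so that it runs on its own; the variable names are illustrative only.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass(frozen=True)
class Money:  # trimmed-down stand-in, validation omitted
    amount: int  # cents
    currency: str


@dataclass
class Order:
    items: list = field(default_factory=list)


price = Money(500, "USD")
try:
    price.amount = 600  # frozen instances reject attribute assignment
except FrozenInstanceError:
    print("cannot mutate a frozen instance")

# replace() builds a new instance instead of changing the old one.
discounted = replace(price, amount=400)

# frozen=True together with the default eq=True makes instances hashable.
totals = {price: 1, discounted: 2}

# default_factory gives each Order its own list.
o1, o2 = Order(), Order()
o1.items.append("widget")
print(o2.items)  # []
```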