Python Dataclasses & Pydantic
debt(d5/e3/b3/t5)
Closest to 'specialist tool catches' (d5), pylint/ruff flag mutable default arguments and ruff has rules (RUF008/009) for dataclass mutable defaults, but subtler misuses like missing frozen=True or behavior-heavy dataclasses pass silently.
Closest to 'simple parameterised fix' (e3), per quick_fix adding @dataclass(frozen=True) or replacing field(default=[]) with field(default_factory=list) is a small pattern-replacement within the class definition.
Closest to 'localised tax' (b3), dataclasses are scoped to individual data-holder classes per applies_to (web/cli); the choice between dataclass/NamedTuple/Pydantic affects local model code but doesn't gravitationally shape the whole system.
Closest to 'notable trap' (t5), per misconception devs conflate dataclasses with NamedTuples and miss mutability defaults; the field(default_factory=...) requirement for mutable defaults is a documented gotcha most Python devs eventually learn.
Also Known As
TL;DR
Explanation
@dataclass (Python 3.7+) generates boilerplate from type-annotated fields: __init__ with defaults, __repr__ and __eq__, plus the ordering methods (__lt__, __le__, __gt__, __ge__) when order=True is set and __hash__ when the class is frozen (or unsafe_hash=True is set). @dataclass(frozen=True) makes instances immutable by blocking attribute assignment, roughly Python's equivalent of PHP 8.2 readonly classes; the freeze is shallow, so a list held in a frozen field can still be mutated in place. field() allows customisation: default_factory for mutable defaults, compare=False, repr=False. Pydantic v2 goes further: it validates types at runtime (coercing '42' to int in its default lax mode, or rejecting invalid data), generates JSON Schema, and serialises with model.model_dump() (dict) and model.model_dump_json() (JSON). It is widely used in FastAPI for request/response models. Compared to PHP DTOs backed by readonly classes + PHPStan: Python dataclasses give the same structural benefits, and Pydantic adds the runtime validation that PHP's type declarations provide natively.
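The generated methods and field() options described above can be sketched with the standard library alone. The Version class and its field names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass(frozen=True, order=True)
class Version:  # hypothetical example class
    major: int
    minor: int = 0
    # Excluded from ==, ordering and hashing, but still shown in repr.
    label: str = field(default="", compare=False)
    # Excluded from repr and comparison; each instance gets its own list.
    notes: list[str] = field(default_factory=list, repr=False, compare=False)


v1 = Version(1, 2, label="stable")
v2 = Version(1, 2, label="nightly")

print(v1)                             # Version(major=1, minor=2, label='stable')
print(v1 == v2)                       # True: label is not compared
print(Version(1, 0) < Version(1, 2))  # True: compared as (major, minor)
print(hash(v1) == hash(v2))           # True: frozen + eq generates __hash__

try:
    v1.major = 2
except FrozenInstanceError:
    print("assignment blocked")       # frozen=True rejects attribute assignment
```

Marking label and notes with compare=False keeps equality, ordering and hashing tied to the numeric fields only, which is usually what a value object wants.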
Common Misconception
Why It Matters
Common Mistakes
- Mutable default arguments: field(default_factory=list) is required. @dataclass rejects a bare list, dict or set default with a ValueError, and other mutable defaults would be shared across instances.
- Not using frozen=True for value objects that should be immutable.
- Using dataclasses for classes with significant behaviour — they are for data containers, not services.
- Not using __post_init__ for validation — the generated __init__ provides no validation hook otherwise.
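The mutable-default mistake from the list above can be reproduced in a few lines. This is a minimal sketch; BadCart and Cart are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field

# A bare list default is rejected at class-creation time with a ValueError.
bad_default_error = None
try:
    @dataclass
    class BadCart:  # hypothetical class that never gets created
        items: list[str] = []
except ValueError as exc:
    bad_default_error = str(exc)

print(bad_default_error)


@dataclass
class Cart:  # hypothetical example class
    # default_factory calls list() once per instance.
    items: list[str] = field(default_factory=list)


a, b = Cart(), Cart()
a.items.append("apple")
print(a.items)  # ['apple']
print(b.items)  # []: b has its own list
```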
Code Examples
# Without dataclass: repetitive boilerplate
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x}, {self.y})'

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

# With dataclass: __init__, __repr__ and __eq__ are generated
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: float
    y: float
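
# Value object with a safe mutable default and validation in __post_init__:
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Money:
    amount: int
    currency: str
    tags: list[str] = field(default_factory=list)

    def __post_init__(self):
        # Runs after the generated __init__ has assigned the fields.
        if self.amount < 0:
            raise ValueError('Amount cannot be negative')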
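
For comparison, the runtime validation, coercion and serialisation attributed to Pydantic v2 in the Explanation might look like the sketch below. It assumes the third-party pydantic package (version 2) is installed; MoneyModel is a hypothetical counterpart to the Money dataclass, not part of any library.

```python
from pydantic import BaseModel, Field, ValidationError


class MoneyModel(BaseModel):  # hypothetical model mirroring Money
    amount: int = Field(ge=0)  # declarative constraint instead of __post_init__
    currency: str
    tags: list[str] = Field(default_factory=list)


# In the default lax mode the numeric string '42' is coerced to int.
money = MoneyModel.model_validate({"amount": "42", "currency": "EUR"})
print(money.amount)        # 42
print(money.model_dump())  # {'amount': 42, 'currency': 'EUR', 'tags': []}

# Invalid data raises ValidationError instead of producing a bad object.
negative_rejected = False
try:
    MoneyModel(amount=-1, currency="EUR")
except ValidationError as exc:
    negative_rejected = True
    print(len(exc.errors()), "validation error(s)")

# JSON Schema is generated from the same field declarations.
schema = MoneyModel.model_json_schema()
print(sorted(schema["properties"]))  # ['amount', 'currency', 'tags']
```

The constraint lives in the field declaration, so the same rule drives validation and appears in the generated schema.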