Python Descriptor Protocol
debt(d8/e5/b5/t7)
Closest to 'silent in production until users hit it' (d9), pulled to d8. detection_hints.automated is 'no' and the only signal is a regex code_pattern for self-assignment in __set__/__delete__; storing per-instance state on the shared descriptor produces silent cross-instance corruption that surfaces only at runtime with multiple instances, not caught by standard linters/type checkers.
Closest to 'touches multiple files / significant refactor in one component' (e5). The quick_fix (move per-instance state into instance __dict__ via __set_name__ or a WeakKeyDictionary) is more than a one-line swap — it changes the descriptor's storage model and every read/write path within the descriptor class, though it stays localised to that component.
Closest to 'persistent productivity tax' (b5). applies_to library/web/cli and tags metaprogramming/data-model: descriptors are load-bearing for ORM fields, typed-config, and reusable validation across many attributes/classes, so a flawed descriptor reaches many work streams that depend on the field abstraction.
Closest to 'serious trap' (t7). The misconception (descriptors treated as a fancy instance-wide __getattr__) plus data vs non-data precedence and shared-state-on-descriptor mistakes contradict how attribute access appears to work elsewhere; the 'obvious' instance-level mental model is wrong and produces silent shared state.
Also Known As
TL;DR
Explanation
A descriptor is any object defining at least one of __get__(self, obj, objtype), __set__(self, obj, value), or __delete__(self, obj). When such an object is assigned as a class attribute, Python routes attribute access through these methods instead of returning the descriptor itself. This is fundamentally different from __getattr__/__getattribute__, which live on the instance's class and act as fallbacks for the whole object: descriptors are per-attribute and resolved during the normal attribute lookup chain. There are two kinds. A data descriptor defines __set__ or __delete__ (plus usually __get__) and takes priority over the instance __dict__ — this is why property assignments are intercepted even when an instance has a same-named entry. A non-data descriptor defines only __get__ and is shadowed by instance __dict__ entries, which is how functions become bound methods yet can be overridden per instance. The lookup order for instance.attr is: data descriptors on the type, then instance __dict__, then non-data descriptors and class attributes. Built-ins property, classmethod, staticmethod, and functions are all descriptors. Since Python 3.6, __set_name__(self, owner, name) is called automatically when the owning class is created, letting a descriptor learn the attribute name it was bound to without repeating it. Descriptors shine when you need reusable, validated, computed, or lazily-loaded attributes across many fields — ORMs (Django, SQLAlchemy), typed-field libraries, and caching decorators rely on them. The common error is storing per-instance state on the descriptor itself; because one descriptor instance is shared by every instance of the owning class, state must live in the instance __dict__ or a WeakKeyDictionary keyed by instance, not on self. For single computed attributes, property is the simpler, idiomatic choice; reach for a custom descriptor only when the same access logic must be reused across multiple attributes or classes.
Common Misconception
Why It Matters
Common Mistakes
- Storing per-instance values on the descriptor object (self.value), so every instance shares and overwrites the same state.
- Confusing data and non-data descriptors and being surprised that instance __dict__ shadows a __get__-only descriptor.
- Hardcoding the attribute name instead of using __set_name__ to learn it automatically in Python 3.6+.
- Reaching for a custom descriptor when a simple property would express a single computed attribute more clearly.
- Forgetting to handle obj is None in __get__, which breaks class-level access like MyClass.field.
Avoid When
- A single computed or validated attribute is needed — a property is simpler and more readable.
- The logic is not reused across multiple attributes or classes, so the indirection adds no value.
- Team members are unfamiliar with the data model and the implicit lookup order would obscure behavior.
When To Use
- The same access, validation, or caching logic must be reused across many attributes or classes.
- Building field abstractions for ORMs, typed-config, or lazy-loaded computed properties.
- You need data-descriptor precedence to guarantee interception even when instance __dict__ holds a same-named key.
Code Examples
class Positive:
def __get__(self, obj, objtype=None):
return self.value # shared across ALL instances!
def __set__(self, obj, value):
if value < 0:
raise ValueError('must be positive')
self.value = value # bug: stored on descriptor, not obj
class Account:
balance = Positive()
a = Account(); a.balance = 100
b = Account(); b.balance = 5
print(a.balance) # 5 — a and b share descriptor state
class Positive:
def __set_name__(self, owner, name):
self.name = name # learn the attribute name
def __get__(self, obj, objtype=None):
if obj is None:
return self # class-level access
return obj.__dict__[self.name]
def __set__(self, obj, value):
if value < 0:
raise ValueError('must be positive')
obj.__dict__[self.name] = value # per-instance storage
class Account:
balance = Positive()
a = Account(); a.balance = 100
b = Account(); b.balance = 5
print(a.balance) # 100 — state isolated per instance