UPSERT
debt(d5/e3/b3/t7)
Closest to 'specialist tool catches' (d5). The detection_hints specify semgrep and phpstan as tools that can catch the SELECT-then-INSERT race condition pattern, but this requires configuring specific rules — standard linters won't flag this by default, and the race condition itself only manifests under concurrent load.
Closest to 'simple parameterised fix' (e3). The quick_fix indicates replacing SELECT-then-INSERT with a single UPSERT statement (INSERT ... ON DUPLICATE KEY UPDATE or INSERT ... ON CONFLICT DO UPDATE). This is typically a localized change within a single repository method or query, but may require adjusting multiple call sites if the pattern is widespread.
Closest to 'localised tax' (b3). UPSERT is a query-level pattern choice that affects individual database operations. While applies_to shows it spans web/cli/queue contexts, each usage is independent — fixing one doesn't require fixing others, and the choice doesn't impose ongoing maintenance burden beyond that specific query.
Closest to 'serious trap' (t7). The misconception explicitly states developers believe REPLACE INTO is equivalent to INSERT ON CONFLICT, but REPLACE deletes and reinserts, resetting auto-increment IDs and losing columns. This contradicts intuition from other databases where 'replace' means 'update in place'. Additionally, common_mistakes show developers believe wrapping SELECT-then-INSERT in a transaction prevents races, which is false without serializable isolation.
Also Known As
TL;DR
Explanation
The naive SELECT-then-INSERT pattern has a race condition: two concurrent processes both see the row is missing and both try to insert, causing a unique constraint violation. UPSERT handles this atomically. PostgreSQL uses INSERT ... ON CONFLICT DO UPDATE. MySQL uses INSERT ... ON DUPLICATE KEY UPDATE or REPLACE INTO (which deletes then inserts — different semantics). Upserts are essential for idempotent event processing, counter increments, and sync operations.
Common Misconception
Why It Matters
Common Mistakes
- Using REPLACE INTO when ON DUPLICATE KEY UPDATE is needed — REPLACE deletes the row, triggering cascades and resetting IDs.
- Not specifying which columns to update in ON CONFLICT — accidentally overwriting columns that should be preserved.
- Using SELECT-then-INSERT in a transaction thinking it prevents races — without serialisable isolation, the race window still exists.
- Not using EXCLUDED in PostgreSQL to reference the incoming row values in the UPDATE clause.
Code Examples
// Race condition — SELECT then INSERT:
$exists = $db->query('SELECT id FROM page_views WHERE page = ?', [$page])->fetch();
if ($exists) {
$db->query('UPDATE page_views SET count = count + 1 WHERE page = ?', [$page]);
} else {
$db->query('INSERT INTO page_views (page, count) VALUES (?, 1)', [$page]);
// Two concurrent requests both see missing → both INSERT → duplicate key error
}
// PostgreSQL atomic upsert:
$db->query(
'INSERT INTO page_views (page, count) VALUES (?, 1)
ON CONFLICT (page) DO UPDATE SET count = page_views.count + 1',
[$page]
);
// MySQL:
$db->query(
'INSERT INTO page_views (page, count) VALUES (?, 1)
ON DUPLICATE KEY UPDATE count = count + 1',
[$page]
);