Kqr Row Cache: Contention Check Gets

KQR had a job: cache frequently accessed rows so the main disk could rest. For years, this worked beautifully. Until .

At 9:00:00 AM, a surge of traffic hit. Every user, in every time zone, suddenly demanded the same piece of data: the flash sale metadata for item ID #42. kqr row cache contention check gets

— KQR’s row cache for item:42 expired. 9:00:02 — 10,000 concurrent GET requests arrived simultaneously. KQR had a job: cache frequently accessed rows

From that day on, KQR’s monitoring dashboard had a new rule: If row cache contention check gets > 1000 per second — flip on single-flight mode. And the team learned a valuable lesson: sometimes, the most dangerous lock isn’t in your database — it’s in your cache’s eagerness to help . At 9:00:00 AM, a surge of traffic hit

She hot-patched KQR’s logic to use :

def get(key): if key in cache: return cache[key] else: value = db.query("SELECT * FROM items WHERE id = ?", key) // slow cache[key] = value return value Because the cache was empty, all 10,000 threads saw a at the exact same moment. They all rushed to the database.