EXH-043 · 2026 · Fictional reconstruction

Copy Fail

Late April 2026. A tenant reports that `su` misbehaves on a shared node after a rival team's CI job ran. Press and vendor posts about CVE-2026-31431 (Copy Fail) land during the same shift, although NVD had already listed the CVE on 22 April.

Type: Defensive / IR
Difficulty: Advanced
Era: 2020s
Time: 12 min

Briefing

You are paged because an internal tenant saw `su` crash with impossible offsets right after another team's unprivileged job finished on the same bare-metal node. Thirty minutes later the Copy Fail disclosure lands. The shape matches what they saw: a logic flaw in the kernel's `algif_aead` module lets an unprivileged process write a few bytes into the page cache of readable files, including setuid binaries. The page cache is host-wide, so every pod on the node inherits the poisoned mapping. The public PoC is tiny and reliable. You cannot reboot until the maintenance window. Prove the node is in the vulnerable configuration, then lock it down.
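One way to start proving the configuration is to check whether the module named in the briefing is currently loaded. The sketch below is a minimal illustration, not part of the exhibit: the helper names `loaded_modules` and `module_is_loaded` are hypothetical, and it only parses text in the `/proc/modules` format, where the first field of each line is the module name.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Return the module names found in text formatted like /proc/modules."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def module_is_loaded(name: str, proc_modules_text: str) -> bool:
    """True if `name` appears as a loaded module in the given text."""
    return name in loaded_modules(proc_modules_text)


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        text = proc_modules.read_text()
        print("algif_aead loaded:", module_is_loaded("algif_aead", text))
    else:
        print("/proc/modules is not available on this system")
```

Treat a negative result with caution: a module that is compiled into the kernel does not appear in `/proc/modules`, and a loadable module may be pulled in on demand, so an absent entry alone does not show the node is safe.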

Your role

Platform engineer on a Kubernetes cluster that hosts CI runners and AI sandboxes for multiple internal teams. Same host kernel underneath all of them.

Objective

Walk from the first suspicious tenant report to a containment call: prove the shared-kernel exposure path, confirm blast radius, and ship a same-day mitigation before the reboot window tonight.
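For the blast-radius step, one possible check is to hash setuid binaries as the node currently serves them and compare against trusted digests. This is a sketch under stated assumptions: the `baseline` mapping is hypothetical and would have to come from a source you trust (for example package metadata or a known-clean node), and the function names are illustrative. Ordinary reads go through the page cache, so a digest computed this way reflects the cached contents rather than only what is on disk.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents as returned by ordinary (page-cache-backed) reads."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(baseline: dict[str, str]) -> list[str]:
    """Return paths that are missing or whose digest differs from the baseline.

    `baseline` maps a file path to its expected SHA-256 hex digest. It is a
    hypothetical input for this sketch and must come from a trusted source.
    """
    mismatched = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected.lower():
            mismatched.append(name)
    return sorted(mismatched)
```

A mismatch is a lead, not proof of tampering, since version skew between the baseline and the node gives the same result. A match is also not a clean bill of health, because a modified page may already have been evicted from the cache.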

Terminal environment

user: responder
host: k8s-node-04
cwd: /home/responder
steps: 11

Safety note. This is a safe reconstruction. All systems, files, hosts, credentials, and outputs are simulated. Do not use these techniques on systems you do not own or have explicit permission to test.