{"id":"CVE-2021-47094","details":"In the Linux kernel, the following vulnerability has been resolved:\n\nKVM: x86/mmu: Don't advance iterator after restart due to yielding\n\nAfter dropping mmu_lock in the TDP MMU, restart the iterator during\ntdp_iter_next() and do not advance the iterator.  Advancing the iterator\nresults in skipping the top-level SPTE and all its children, which is\nfatal if any of the skipped SPTEs were not visited before yielding.\n\nWhen zapping all SPTEs, i.e. when min_level == root_level, restarting the\niter and then invoking tdp_iter_next() is always fatal if the current gfn\nhas a valid SPTE, as advancing the iterator results in try_step_side()\nskipping the current gfn, which wasn't visited before yielding.\n\nSprinkle WARNs on iter-\u003eyielded being true in various helpers that are\noften used in conjunction with yielding, and tag the helper with\n__must_check to reduce the probability of improper usage.\n\nFailing to zap a top-level SPTE manifests in one of two ways.  If a valid\nSPTE is skipped by both kvm_tdp_mmu_zap_all() and kvm_tdp_mmu_put_root(),\nthe shadow page will be leaked and KVM will WARN accordingly.\n\n  WARNING: CPU: 1 PID: 3509 at arch/x86/kvm/mmu/tdp_mmu.c:46 [kvm]\n  RIP: 0010:kvm_mmu_uninit_tdp_mmu+0x3e/0x50 [kvm]\n  Call Trace:\n   \u003cTASK\u003e\n   kvm_arch_destroy_vm+0x130/0x1b0 [kvm]\n   kvm_destroy_vm+0x162/0x2a0 [kvm]\n   kvm_vcpu_release+0x34/0x60 [kvm]\n   __fput+0x82/0x240\n   task_work_run+0x5c/0x90\n   do_exit+0x364/0xa10\n   ? 
futex_unqueue+0x38/0x60\n   do_group_exit+0x33/0xa0\n   get_signal+0x155/0x850\n   arch_do_signal_or_restart+0xed/0x750\n   exit_to_user_mode_prepare+0xc5/0x120\n   syscall_exit_to_user_mode+0x1d/0x40\n   do_syscall_64+0x48/0xc0\n   entry_SYSCALL_64_after_hwframe+0x44/0xae\n\nIf kvm_tdp_mmu_zap_all() skips a gfn/SPTE but that SPTE is then zapped by\nkvm_tdp_mmu_put_root(), KVM triggers a use-after-free in the form of\nmarking a struct page as dirty/accessed after it has been put back on the\nfree list.  This directly triggers a WARN due to encountering a page with\npage_count() == 0, but it can also lead to data corruption and additional\nerrors in the kernel.\n\n  WARNING: CPU: 7 PID: 1995658 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:171\n  RIP: 0010:kvm_is_zone_device_pfn.part.0+0x9e/0xd0 [kvm]\n  Call Trace:\n   \u003cTASK\u003e\n   kvm_set_pfn_dirty+0x120/0x1d0 [kvm]\n   __handle_changed_spte+0x92e/0xca0 [kvm]\n   __handle_changed_spte+0x63c/0xca0 [kvm]\n   __handle_changed_spte+0x63c/0xca0 [kvm]\n   __handle_changed_spte+0x63c/0xca0 [kvm]\n   zap_gfn_range+0x549/0x620 [kvm]\n   kvm_tdp_mmu_put_root+0x1b6/0x270 [kvm]\n   mmu_free_root_page+0x219/0x2c0 [kvm]\n   kvm_mmu_free_roots+0x1b4/0x4e0 [kvm]\n   kvm_mmu_unload+0x1c/0xa0 [kvm]\n   kvm_arch_destroy_vm+0x1f2/0x5c0 [kvm]\n   kvm_put_kvm+0x3b1/0x8b0 [kvm]\n   kvm_vcpu_release+0x4e/0x70 [kvm]\n   __fput+0x1f7/0x8c0\n   task_work_run+0xf8/0x1a0\n   do_exit+0x97b/0x2230\n   do_group_exit+0xda/0x2a0\n   get_signal+0x3be/0x1e50\n   arch_do_signal_or_restart+0x244/0x17f0\n   exit_to_user_mode_prepare+0xcb/0x120\n   syscall_exit_to_user_mode+0x1d/0x40\n   do_syscall_64+0x4d/0x90\n   entry_SYSCALL_64_after_hwframe+0x44/0xae\n\nNote, the underlying bug existed even before commit 1af4a96025b3 (\"KVM:\nx86/mmu: Yield in TDU MMU iter even if no SPTES changed\") moved calls to\ntdp_mmu_iter_cond_resched() to the beginning of loops, as KVM could still\nincorrectly advance past a top-level entry when yielding on a 
lower-level\nentry.  But with respect to leaking shadow pages, the bug was introduced\nby yielding before processing the current gfn.\n\nAlternatively, tdp_mmu_iter_cond_resched() could simply fall through, or\ncallers could jump to their \"retry\" label.  The downside of that approach\nis that tdp_mmu_iter_cond_resched() _must_ be called before anything else\nin the loop, and there's no easy way to enforce that requirement.\n\nIdeally, KVM would handle the cond_resched() fully within the iterator\nmacro (the code is actually quite clean) and avoid this entire class of\nbugs, but that is extremely difficult to do wh\n---truncated---","modified":"2026-03-14T11:18:57.148641Z","published":"2024-03-04T18:15:07.837Z","related":["SUSE-SU-2024:1320-1","SUSE-SU-2024:1321-1","SUSE-SU-2024:1466-1","SUSE-SU-2024:1480-1","SUSE-SU-2024:1490-1"],"references":[{"type":"FIX","url":"https://git.kernel.org/stable/c/3a0f64de479cae75effb630a2e0a237ca0d0623c"},{"type":"FIX","url":"https://git.kernel.org/stable/c/d884eefd75cc54887bc2e9e724207443525dfb2c"}],"affected":[{"database_specific":{"source":"https://storage.googleapis.com/cve-osv-conversion/osv-output/CVE-2021-47094.json","unresolved_ranges":[{"events":[{"introduced":"5.10"},{"fixed":"5.15.12"}]},{"events":[{"introduced":"0"},{"last_affected":"5.16-rc1"}]},{"events":[{"introduced":"0"},{"last_affected":"5.16-rc2"}]},{"events":[{"introduced":"0"},{"last_affected":"5.16-rc3"}]},{"events":[{"introduced":"0"},{"last_affected":"5.16-rc4"}]},{"events":[{"introduced":"0"},{"last_affected":"5.16-rc5"}]},{"events":[{"introduced":"0"},{"last_affected":"5.16-rc6"}]}]}}],"schema_version":"1.7.5","severity":[{"type":"CVSS_V3","score":"CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H"}]}