linux-stable/virt/kvm
Michal Luczaj 154de42fe3 KVM: Fix vcpu_array[0] races
commit afb2acb2e3 upstream.

In kvm_vm_ioctl_create_vcpu(), add vcpu to vcpu_array iff it's safe to
access vcpu via kvm_get_vcpu() and kvm_for_each_vcpu(), i.e. when there's
no failure path requiring vcpu removal and destruction. Such order is
important because vcpu_array accessors may end up referencing vcpu at
vcpu_array[0] even before online_vcpus is set to 1.

When online_vcpus=0, any call to kvm_get_vcpu() goes through
array_index_nospec() and ends with an attempt to xa_load(vcpu_array, 0):

	int num_vcpus = atomic_read(&kvm->online_vcpus);
	i = array_index_nospec(i, num_vcpus);
	return xa_load(&kvm->vcpu_array, i);

Similarly, when online_vcpus=0, a kvm_for_each_vcpu() does not iterate over
an "empty" range, but actually [0, ULONG_MAX]:

	xa_for_each_range(&kvm->vcpu_array, idx, vcpup, 0, \
			  (atomic_read(&kvm->online_vcpus) - 1))

In both cases, such online_vcpus=0 edge case, even if leading to
unnecessary calls to XArray API, should not be an issue; requesting
unpopulated indexes/ranges is handled by xa_load() and xa_for_each_range().

However, this means that when the first vCPU is created and inserted in
vcpu_array *and* before online_vcpus is incremented, code calling
kvm_get_vcpu()/kvm_for_each_vcpu() already has access to that first vCPU.

This should not pose a problem assuming that once a vcpu is stored in
vcpu_array, it will remain there, but that's not the case:
kvm_vm_ioctl_create_vcpu() first inserts to vcpu_array, then requests a
file descriptor. If create_vcpu_fd() fails, newly inserted vcpu is removed
from the vcpu_array, then destroyed:

	vcpu->vcpu_idx = atomic_read(&kvm->online_vcpus);
	r = xa_insert(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, GFP_KERNEL_ACCOUNT);
	kvm_get_kvm(kvm);
	r = create_vcpu_fd(vcpu);
	if (r < 0) {
		xa_erase(&kvm->vcpu_array, vcpu->vcpu_idx);
		kvm_put_kvm_no_destroy(kvm);
		goto unlock_vcpu_destroy;
	}
	atomic_inc(&kvm->online_vcpus);

This results in a possible race condition when a reference to a vcpu is
acquired (via kvm_get_vcpu() or kvm_for_each_vcpu()) moments before said
vcpu is destroyed.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Message-Id: <20230510140410.1093987-2-mhal@rbox.co>
Cc: stable@vger.kernel.org
Fixes: c5b0775491 ("KVM: Convert the kvm->vcpus array to a xarray", 2021-12-08)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-24 17:32:50 +01:00
..
async_pf.c KVM: Add helpers to wake/query blocking vCPU 2021-12-08 04:24:54 -05:00
async_pf.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 504 2019-06-19 17:09:56 +02:00
binary_stats.c KVM: stats: remove dead stores 2021-08-13 03:35:15 -04:00
coalesced_mmio.c KVM: Destroy target device if coalesced MMIO unregistration fails 2023-03-10 09:34:11 +01:00
coalesced_mmio.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dirty_ring.c KVM: Use acquire/release semantics when accessing dirty ring GFN state 2022-09-29 10:23:08 +01:00
eventfd.c KVM: eventfd: Fix false positive RCU usage warning 2022-05-20 09:10:33 -04:00
irqchip.c KVM/arm updates for 5.3 2019-07-11 15:14:16 +02:00
Kconfig KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option 2022-09-29 10:23:08 +01:00
kvm_main.c KVM: Fix vcpu_array[0] races 2023-05-24 17:32:50 +01:00
kvm_mm.h KVM: SPDX style and spelling fixes 2022-04-21 13:16:13 -04:00
Makefile.kvm KVM: Reinstate gfn_to_pfn_cache with invalidation support 2022-01-07 10:44:44 -05:00
pfncache.c KVM: Update gfn_to_pfn_cache khva when it moves within the same page 2022-11-23 18:58:46 -05:00
vfio.c kvm/vfio: Fix potential deadlock on vfio group_lock 2023-02-01 08:34:36 +01:00
vfio.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00