Now that the file is empty, fixup all references with the proper includes
and delete the former kitchen sink.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011540.001197214@linutronix.de
Include the header which only provides the XCR accessors. That's all what
is needed here.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011539.896573039@linutronix.de
In order to remove internal.h make signal.h independent of it.
Include asm/fpu/xstate.h to fix a missing update_regset_xstate_info()
prototype, which is
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011539.844565975@linutronix.de
Move function declarations which need to be globally available to api.h
where they belong.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011539.792363754@linutronix.de
Only used internally in the FPU core code.
While at it, convert to the percpu accessors which verify preemption is
disabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011539.686806639@linutronix.de
Further disintegration of internal.h:
Move the CPU feature tests to a core header and remove the unused one.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011539.401510559@linutronix.de
internal.h is a kitchen sink which needs to get out of the way to prepare
for the upcoming changes.
Move the context switch and exit to user inlines into a separate header,
which is all that code needs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011539.349132461@linutronix.de
Prepare for replacing the KVM copy xstate to user function by extending
copy_xstate_to_uabi_buf() with a pkru argument which allows the caller to
hand in the pkru value, which is required for KVM because the guest PKRU is
not accessible via current. Fixup all callsites accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011539.191902137@linutronix.de
Copying a user space buffer to the memory buffer is already available in
the FPU core. The copy mechanism in KVM lacks sanity checks and needs to
use cpuid() to lookup the offset of each component, while the FPU core has
this information cached.
Make the FPU core variant accessible for KVM and replace the home brewed
mechanism.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: kvm@vger.kernel.org
Link: https://lkml.kernel.org/r/20211015011539.134065207@linutronix.de
Swapping the host/guest FPU is directly fiddling with FPU internals which
requires 5 exports. The upcoming support of dynamically enabled states
would even need more.
Implement a swap function in the FPU core code and export that instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Link: https://lkml.kernel.org/r/20211015011539.076072399@linutronix.de
These loops evaluating xfeature bits are really hard to read. Create an
iterator and use for_each_set_bit_from() inside which already does the right
thing.
No functional changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011538.958107505@linutronix.de
No point in having this duplicated all over the place with needlessly
different defines.
Provide a proper initialization function which initializes user buffers
properly and make KVM use it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011538.897664678@linutronix.de
There is no reason why kernel and IO worker threads need a full clone of
the parent's FPU state. Both are kernel threads which are not supposed to
use FPU. So copying a large state or doing XSAVE() is pointless. Just clean
out the minimally required state for those tasks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011538.839822981@linutronix.de
There is no reason to clone FPU in arch_dup_task_struct(). Quite the
contrary - it prevents optimizations. Move it to copy_thread().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011538.780714235@linutronix.de
Zeroing the forked task's FPU registers buffer to avoid leaking init
optimized stale data into the clone is a pointless exercise for the case
where the current task has TIF_NEED_FPU_LOAD set. In that case, the FPU
registers state is copied from current's FPU register buffer which can
contain stale init optimized data as well.
The alledged information leak is non-existant because this stale init
optimized data is used nowhere and cannot leak anywhere.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011538.722854569@linutronix.de
These interfaces are really only valid for features which are independently
managed and not part of the task context state for various reasons.
Tighten the checks and adjust the misleading comments.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011538.608492174@linutronix.de
copy_fpstate_to_sigframe() does not have a slow path anymore. Neither does
the !ia32 restore in __fpu_restore_sig().
Update the comments accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211015011538.493570236@linutronix.de
Resolve the conflict between these commits:
x86/fpu: 1193f408cd ("x86/fpu/signal: Change return type of __fpu_restore_sig() to boolean")
x86/urgent: d298b03506 ("x86/fpu: Restore the masking out of reserved MXCSR bits")
b2381acd3f ("x86/fpu: Mask out the invalid MXCSR bits properly")
Conflicts:
arch/x86/kernel/fpu/signal.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This is a fix for the fix (yeah, /facepalm).
The correct mask to use is not the negation of the MXCSR_MASK but the
actual mask which contains the supported bits in the MXCSR register.
Reported and debugged by Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: d298b03506 ("x86/fpu: Restore the masking out of reserved MXCSR bits")
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ser Olmy <ser.olmy@protonmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/YWgYIYXLriayyezv@intel.com
out due to histerical reasons and 64-bit kernels reject them
- A fix to clear X86_FEATURE_SMAP when support for is not config-enabled
- Three fixes correcting misspelled Kconfig symbols used in code
- Two resctrl object cleanup fixes
- Yet another attempt at fixing the neverending saga of botched x86
timers, this time because some incredibly smart hardware decides to turn
off the HPET timer in a low power state - who cares if the OS is relying
on it...
- Check the full return value range of an SEV VMGEXIT call to determine
whether it returned an error
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmFiqboACgkQEsHwGGHe
VUpIFA//cLWfa1vvamCcLjW0lruQVzHrZesO4Cbti3Fyp2at/Dtwt9w/uZPu9NAa
+sreJBdrkZfo9lmKW6/E1MmLT/YlLg8YHsylKn9d+XSdcy0qWXLYdVVm7bb4teJf
XxRQfYNQrwfpjFNnt+7NUcaqte2zUo7K16CctJF5+E6SGUn+hlu6zK15tf6MMAM1
TFHsQWEuRW5Mgc7eD734cNGDOJgzvb4IACn5BNfKR1+jD1ANfutytXjGqcveJ/sg
lBoWMCU47vo5/uoW516oBK6PfQ/+s1OvYAx2G4DMQSC7WpEWpxnJUoszj9umu+jE
VndS8jQ4WGXcVmfkkwUHbVxcJzsPEzZ/5m+nER9hrGOykKWTajzi2MirBHju5EKv
xfYLqEJHNG9YulxKy2wIW0VRmXDE3wFZfaPAmQbLXud1KfzlC/EpEaloZSJSgqyG
L4uOKk8CBumYJzgVCfTFAqqr1HhmeylYSxHmOUEzTm0sEJX2HuodGcl+sPI/LDPW
MkjVYXq2sOUEVLmk5xyJIkbAUcK2X/Fzt3rKS4CVsjfzWRW67o1oopMy6ZrQ0o/h
Dt/fHub/+Pke5sbB2+RiRsvq3aDftRkvaZK05pTiqlE9gFlKaCVwxDQqvmTnY0oa
PkPzauXRC4qjNsdDMGHaiclm/fk/nlLM9vxXGJ+oTXP6snC4OhQ=
=kKOw
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.15_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- A FPU fix to properly handle invalid MXCSR values: 32-bit masks them
out due to historical reasons and 64-bit kernels reject them
- A fix to clear X86_FEATURE_SMAP when support for is not
config-enabled
- Three fixes correcting misspelled Kconfig symbols used in code
- Two resctrl object cleanup fixes
- Yet another attempt at fixing the neverending saga of botched x86
timers, this time because some incredibly smart hardware decides to
turn off the HPET timer in a low power state - who cares if the OS is
relying on it...
- Check the full return value range of an SEV VMGEXIT call to determine
whether it returned an error
* tag 'x86_urgent_for_v5.15_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Restore the masking out of reserved MXCSR bits
x86/Kconfig: Correct reference to MWINCHIP3D
x86/platform/olpc: Correct ifdef symbol to intended CONFIG_OLPC_XO15_SCI
x86/entry: Clear X86_FEATURE_SMAP when CONFIG_X86_SMAP=n
x86/entry: Correct reference to intended CONFIG_64_BIT
x86/resctrl: Fix kfree() of the wrong type in domain_add_cpu()
x86/resctrl: Free the ctrlval arrays when domain_setup_mon_state() fails
x86/hpet: Use another crystalball to evaluate HPET usability
x86/sev: Return an error on a returned non-zero SW_EXITINFO1[31:0]
Ser Olmy reported a boot failure:
init[1] bad frame in sigreturn frame:(ptrval) ip:b7c9fbe6 sp:bf933310 orax:ffffffff \
in libc-2.33.so[b7bed000+156000]
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
CPU: 0 PID: 1 Comm: init Tainted: G W 5.14.9 #1
Hardware name: Hewlett-Packard HP PC/HP Board, BIOS JD.00.06 12/06/2001
Call Trace:
dump_stack_lvl
dump_stack
panic
do_exit.cold
do_group_exit
get_signal
arch_do_signal_or_restart
? force_sig_info_to_task
? force_sig
exit_to_user_mode_prepare
syscall_exit_to_user_mode
do_int80_syscall_32
entry_INT80_32
on an old 32-bit Intel CPU:
vendor_id : GenuineIntel
cpu family : 6
model : 6
model name : Celeron (Mendocino)
stepping : 5
microcode : 0x3
Ser bisected the problem to the commit in Fixes.
tglx suggested reverting the rejection of invalid MXCSR values which
this commit introduced and replacing it with what the old code did -
simply masking them out to zero.
Further debugging confirmed his suggestion:
fpu->state.fxsave.mxcsr: 0xb7be13b4, mxcsr_feature_mask: 0xffbf
WARNING: CPU: 0 PID: 1 at arch/x86/kernel/fpu/signal.c:384 __fpu_restore_sig+0x51f/0x540
so restore the original behavior only for 32-bit kernels where you have
ancient machines with buggy hardware. For 32-bit programs on 64-bit
kernels, user space which supplies wrong MXCSR values is considered
malicious so fail the sigframe restoration there.
Fixes: 6f9866a166 ("x86/fpu/signal: Let xrstor handle the features to init")
Reported-by: Ser Olmy <ser.olmy@protonmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Ser Olmy <ser.olmy@protonmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YVtA67jImg3KlBTw@zn.tnic
Commit
3c73b81a91 ("x86/entry, selftests: Further improve user entry sanity checks")
added a warning if AC is set when in the kernel.
Commit
662a022189 ("x86/entry: Fix AC assertion")
changed the warning to only fire if the CPU supports SMAP.
However, the warning can still trigger on a machine that supports SMAP
but where it's disabled in the kernel config and when running the
syscall_nt selftest, for example:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 49 at irqentry_enter_from_user_mode
CPU: 0 PID: 49 Comm: init Tainted: G T 5.15.0-rc4+ #98 e6202628ee053b4f310759978284bd8bb0ce6905
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:irqentry_enter_from_user_mode
...
Call Trace:
? irqentry_enter
? exc_general_protection
? asm_exc_general_protection
? asm_exc_general_protectio
IS_ENABLED(CONFIG_X86_SMAP) could be added to the warning condition, but
even this would not be enough in case SMAP is disabled at boot time with
the "nosmap" parameter.
To be consistent with "nosmap" behaviour, clear X86_FEATURE_SMAP when
!CONFIG_X86_SMAP.
Found using entry-fuzz + satrandconfig.
[ bp: Massage commit message. ]
Fixes: 3c73b81a91 ("x86/entry, selftests: Further improve user entry sanity checks")
Fixes: 662a022189 ("x86/entry: Fix AC assertion")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20211003223423.8666-1-vegard.nossum@oracle.com
Commit in Fixes separated the architecture specific and filesystem parts
of the resctrl domain structures.
This left the error paths in domain_add_cpu() kfree()ing the memory with
the wrong type.
This will cause a problem if someone adds a new member to struct
rdt_hw_domain meaning d_resctrl is no longer the first member.
Fixes: 792e0f6f78 ("x86/resctrl: Split struct rdt_domain")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20210917165924.28254-1-james.morse@arm.com
domain_add_cpu() is called whenever a CPU is brought online. The
earlier call to domain_setup_ctrlval() allocates the control value
arrays.
If domain_setup_mon_state() fails, the control value arrays are not
freed.
Add the missing kfree() calls.
Fixes: 1bd2a63b4f ("x86/intel_rdt/mba_sc: Add initialization support")
Fixes: edf6fa1c4a ("x86/intel_rdt/cqm: Add RMID (Resource monitoring ID) management")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20210917165958.28313-1-james.morse@arm.com
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFXQUoUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMglgf/egh3zb9/+BUQWe0xWfhcINNzpsVk
PJtiBmJc3nQLbZbTSLp63rouy1lNgR0s2DiMwP7G1u39OwW8W3LHMrBUSqF1F01+
gntb4GGiRTiTPJI64K4z6ytORd3tuRarHq8TUIa2zvki9ZW5Obgkm1i1RsNMOo+s
AOA7whhpS8e/a5fBbtbS9bTZb30PKTZmbW4oMjvO9Sw4Eb76IauqPSEtRPSuCAc7
r7z62RTlm10Qk0JR3tW1iXMxTJHZk+tYPJ8pclUAWVX5bZqWa/9k8R0Z5i/miFiZ
glW/y3R4+aUwIQV2v7V3Jx9MOKDhZxniMtnqZG/Hp9NVDtWIz37V/U37vw==
=zQQ1
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more kvm fixes from Paolo Bonzini:
"Small x86 fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: selftests: Ensure all migrations are performed when test is affined
KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks
ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm
x86/kvmclock: Move this_cpu_pvti into kvmclock.h
selftests: KVM: Don't clobber XMM register when read
KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue
On recent Intel systems the HPET stops working when the system reaches PC10
idle state.
The approach of adding PCI ids to the early quirks to disable HPET on
these systems is a whack a mole game which makes no sense.
Check for PC10 instead and force disable HPET if supported. The check is
overbroad as it does not take ACPI, intel_idle enablement and command
line parameters into account. That's fine as long as there is at least
PMTIMER available to calibrate the TSC frequency. The decision can be
overruled by adding "hpet=force" on the kernel command line.
Remove the related early PCI quirks for affected Ice Cake and Coffin Lake
systems as they are not longer required. That should also cover all
other systems, i.e. Tiger Rag and newer generations, which are most
likely affected by this as well.
Fixes: Yet another hardware trainwreck
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: stable@vger.kernel.org
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
After returning from a VMGEXIT NAE event, SW_EXITINFO1[31:0] is checked
for a value of 1, which indicates an error and that SW_EXITINFO2
contains exception information. However, future versions of the GHCB
specification may define new values for SW_EXITINFO1[31:0], so really
any non-zero value should be treated as an error.
Fixes: 597cfe4821 ("x86/boot/compressed/64: Setup a GHCB-based VC Exception handler")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 5.10+
Link: https://lkml.kernel.org/r/efc772af831e9e7f517f0439b13b41f56bad8784.1633063321.git.thomas.lendacky@amd.com
There're other modules might use hv_clock_per_cpu variable like ptp_kvm,
so move it into kvmclock.h and export the symbol to make it visiable to
other modules.
Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Message-Id: <1632892429-101194-2-git-send-email-zelin.deng@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the missing return code polarity in save_xstate_epilog().
[ bp: Massage, use the right commit in the Fixes: tag ]
Fixes: 2af07f3a6e ("x86/fpu/signal: Change return type of copy_fpregs_to_sigframe() helpers to boolean")
Reported-by: Remi Duraffort <remi.duraffort@linaro.org>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1461
Link: https://lkml.kernel.org/r/20210922200901.1823741-1-anders.roxell@linaro.org
Commit in Fixes introduced early_reserve_memory() to do all needed
initial memblock_reserve() calls in one function. Unfortunately, the call
of early_reserve_memory() is done too late for Xen dom0, as in some
cases a Xen hook called by e820__memory_setup() will need those memory
reservations to have happened already.
Move the call of early_reserve_memory() before the call of
e820__memory_setup() in order to avoid such problems.
Fixes: a799c2bd29 ("x86/setup: Consolidate early memory reservations")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210920120421.29276-1-jgross@suse.com
- Prevent a infinite loop in the MCE recovery on return to user space,
which was caused by a second MCE queueing work for the same page and
thereby creating a circular work list.
- Make kern_addr_valid() handle existing PMD entries, which are marked not
present in the higher level page table, correctly instead of blindly
dereferencing them.
- Pass a valid address to sanitize_phys(). This was caused by the mixture
of inclusive and exclusive ranges. memtype_reserve() expect 'end' being
exclusive, but sanitize_phys() wants it inclusive. This worked so far,
but with end being the end of the physical address space the fail is
exposed.
- Increase the maximum supported GPIO numbers for 64bit. Newer SoCs exceed
the previous maximum.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmFHhPIACgkQEsHwGGHe
VUqqQA/+MHQ2HxVOPxnJ0i/D1nK8ccNqTEkSN08z23RGnjqKQun/VaNIIceJY25f
Abeb2tI+0qRrdWVPVd5YqcTHuBLmnPs6Je3MfOrG47eQNW4/SmkXYuOexK80Bew3
YDgEV73d40rHcolXZCaonVajx+FmjoNvkDt5LpLvLcCxIyv0GClFBcZrFAm70AxI
Feax30koh3/MIFxHoXyADN8D+MJu1GxA6QWuoTK40s3G/gTTAwimkDgnNU1JXbcj
VvVVZaNnnAxjxrCa81blr9nDpHJCDinG9bdvDT3UDLous52hGMZTsHoHogxwfogT
EhIgPvL8hf+wm1WXA4NyvSNKZxsGfdkvIXaUq9XYHpLRD6Ao6x7jQDL039imucqb
9YtaH52GhG0SgJlYjkm/zrKezIjKLDen0ZYr/2iNTDM1p2GqQEFo07wC/ME8TkQ6
/BvtbkIvOuUz3nJeV4/AO+O4kaNvto9O2eHq9oodIN9nrwmlO5fMg8XO9nrhWB11
ChXEz6kPqta1nyZXy0mwOrlXlqzcusiroG4G9F7IBBz+t/gNwlu3uZuIgkQCXyYw
DgKz9cnQ3RdgCFknbmEwV5oCjewm7UdcgwaDAaelHIDuWMcshZFvMf1uSjnyg4Z/
39WI8W7W2aZnIoKWpvu8s7Gr8f1krE7C3xrkvl2WmbKPkxNAin8=
=7cq3
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.15_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Prevent a infinite loop in the MCE recovery on return to user space,
which was caused by a second MCE queueing work for the same page and
thereby creating a circular work list.
- Make kern_addr_valid() handle existing PMD entries, which are marked
not present in the higher level page table, correctly instead of
blindly dereferencing them.
- Pass a valid address to sanitize_phys(). This was caused by the
mixture of inclusive and exclusive ranges. memtype_reserve() expect
'end' being exclusive, but sanitize_phys() wants it inclusive. This
worked so far, but with end being the end of the physical address
space the fail is exposed.
- Increase the maximum supported GPIO numbers for 64bit. Newer SoCs
exceed the previous maximum.
* tag 'x86_urgent_for_v5.15_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Avoid infinite loop for copy from user recovery
x86/mm: Fix kern_addr_valid() to cope with existing but not present entries
x86/platform: Increase maximum GPIO number for X86_64
x86/pat: Pass valid address to sanitize_phys()
The boot-time allocation interface for memblock is a mess, with
'memblock_alloc()' returning a virtual pointer, but then you are
supposed to free it with 'memblock_free()' that takes a _physical_
address.
Not only is that all kinds of strange and illogical, but it actually
causes bugs, when people then use it like a normal allocation function,
and it fails spectacularly on a NULL pointer:
https://lore.kernel.org/all/20210912140820.GD25450@xsang-OptiPlex-9020/
or just random memory corruption if the debug checks don't catch it:
https://lore.kernel.org/all/61ab2d0c-3313-aaab-514c-e15b7aa054a0@suse.cz/
I really don't want to apply patches that treat the symptoms, when the
fundamental cause is this horribly confusing interface.
I started out looking at just automating a sane replacement sequence,
but because of this mix or virtual and physical addresses, and because
people have used the "__pa()" macro that can take either a regular
kernel pointer, or just the raw "unsigned long" address, it's all quite
messy.
So this just introduces a new saner interface for freeing a virtual
address that was allocated using 'memblock_alloc()', and that was kept
as a regular kernel pointer. And then it converts a couple of users
that are obvious and easy to test, including the 'xbc_nodes' case in
lib/bootconfig.c that caused problems.
Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 40caa127f3 ("init: bootconfig: Remove all bootconfig data when the init memory is removed")
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__fpu_sig_restore() only needs information about success or fail and no
real error code.
This cleans up the confusing conversion of the trap number, which is
returned by the *RSTOR() exception fixups, to an error code.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132526.084109938@linutronix.de
__fpu_sig_restore() only needs success/fail information and no detailed
error code.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132526.024024598@linutronix.de
Now that fpu__restore_sig() returns a boolean get rid of the individual
error codes in __fpu_restore_sig() as well.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.966197097@linutronix.de
None of the call sites cares about the error code. All they need to know is
whether the function succeeded or not.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.909065931@linutronix.de
None of the call sites cares about the return code. All they are interested
in is success or fail.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.851280949@linutronix.de
Now that copy_fpregs_to_sigframe() returns boolean the individual return
codes in the related helper functions do not make sense anymore. Change
them to return boolean success/fail.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.794334915@linutronix.de
None of the call sites cares about the actual return code. Change the
return type to boolean and return 'true' on success.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.736773588@linutronix.de
When the direct saving of the FPU registers to the user space sigframe
fails, copy_fpregs_to_sigframe() attempts to clear the user buffer.
The most likely reason for such a fail is a page fault. As
copy_fpregs_to_sigframe() is invoked with pagefaults disabled the chance
that __clear_user() succeeds is minuscule.
Move the clearing out into the caller which replaces the
fault_in_pages_writeable() in that error handling path.
The return value confusion will be cleaned up separately.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210908132525.679356300@linutronix.de