linux-stable/arch
Nick Desaulniers 4899d8b256 ARM: 9256/1: NWFPE: avoid compiler-generated __aeabi_uldivmod
commit 3220022038 upstream.

clang-15's ability to elide loops completely became more aggressive when
it can deduce how a variable is being updated in a loop. Counting down
one variable by an increment of another can be replaced by a modulo
operation.

For 64b variables on 32b ARM EABI targets, this can result in the
compiler generating calls to __aeabi_uldivmod, which it does for a do
while loop in float64_rem().
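
As a rough userspace illustration of the transformation described above (not
the actual float64_rem() code), the sketch below pairs a subtract-in-a-loop
form with the divide/modulo form a compiler may substitute for it. The names
rem_by_loop, rem_by_div and check_equivalent are hypothetical, and b is
assumed to be nonzero.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: reduce a by repeatedly subtracting b, counting the
 * iterations in *q. This is the "count down one variable by another" shape.
 * Assumes b != 0, otherwise the loop never terminates.
 */
static uint64_t rem_by_loop(uint64_t a, uint64_t b, uint64_t *q)
{
	uint64_t count = 0;

	while (a >= b) {
		a -= b;
		count++;
	}
	*q = count;
	return a;
}

/*
 * Hypothetical helper: the closed form of the loop above. On a 32-bit ARM
 * EABI target, the 64-bit / and % here are typically lowered to a call to
 * __aeabi_uldivmod, which is the reference the kernel cannot resolve.
 */
static uint64_t rem_by_div(uint64_t a, uint64_t b, uint64_t *q)
{
	*q = a / b;
	return a % b;
}

/* Sanity check that both forms agree for one pair of inputs (b != 0). */
static void check_equivalent(uint64_t a, uint64_t b)
{
	uint64_t q_loop, q_div;
	uint64_t r_loop = rem_by_loop(a, b, &q_loop);
	uint64_t r_div = rem_by_div(a, b, &q_div);

	assert(r_loop == r_div);
	assert(q_loop == q_div);
}
```

Calling check_equivalent(17, 5) exercises both forms. Because they return the
same values, an optimizer that recognizes the loop is free to emit the
division instead.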

For the kernel, we'd generally prefer that developers not open code 64b
division via binary / operators and instead use the more explicit
helpers from div64.h. On arm-linux-gnuabi targets, failure to do so can
result in linkage failures due to undefined references to
__aeabi_uldivmod().

While developers can avoid open coding divisions on 64b variables, the
compiler doesn't know that the Linux kernel has a partial implementation
of a compiler runtime (--rtlib) to enforce this convention.

It's also undecidable for the compiler whether the code in question
would be faster to execute the loop vs elide it and do the 64b division.

While I actively avoid using the internal -mllvm command line flags, I
think we get better code than using barrier() here, which will force
reloads+spills in the loop for all toolchains.

Link: https://github.com/ClangBuiltLinux/linux/issues/1666

Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-07 11:15:53 +01:00
alpha alpha: fix syscall entry in !AUDUT_SYSCALL case 2022-12-31 13:25:42 +01:00
arc arc: iounmap() arg is volatile 2022-11-04 00:00:27 +09:00
arm ARM: 9256/1: NWFPE: avoid compiler-generated __aeabi_uldivmod 2023-01-07 11:15:53 +01:00
arm64 arm64: dts: mediatek: mt8195-demo: fix the memory size of node secmon 2023-01-07 11:15:53 +01:00
csky Merge 'irq/loongarch', 'pci/ctrl/loongson' and 'pci/header-cleanup-immutable' 2022-08-11 21:06:14 +08:00
hexagon provide arch_test_bit_acquire for architectures that define test_bit 2022-08-27 09:49:54 -07:00
ia64 ia64: export memory_add_physaddr_to_nid to fix cxl build error 2022-10-21 12:38:34 +02:00
loongarch LoongArch: Fix unsigned comparison with less than zero 2022-12-14 11:40:47 +01:00
m68k m68k: Rework BI_VIRT_RNG_SEED as BI_RNG_SEED 2022-11-16 10:03:48 +01:00
microblaze - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
mips MIPS: ralink: mt7621: avoid to init common ralink reset controller 2022-12-31 13:26:50 +01:00
nios2 nios2: add FORCE for vmlinuz.gz 2022-12-02 17:43:11 +01:00
openrisc Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
parisc parisc: Avoid printing the hardware path twice 2022-11-10 18:17:36 +01:00
powerpc powerpc/ftrace: fix syscall tracing on PPC64_ELF_ABI_V1 2023-01-07 11:15:51 +01:00
riscv RISC-V: KVM: Fix reg_val check in kvm_riscv_vcpu_set_reg_config() 2022-12-31 13:26:35 +01:00
s390 KVM: s390: vsie: Fix the initialization of the epoch extension (epdx) field 2022-12-14 11:40:53 +01:00
sh sh: machvec: Use char[] for section boundaries 2022-10-21 12:37:59 +02:00
sparc sparc: Unbreak the build 2022-10-12 09:39:04 +02:00
um UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK 2022-10-21 12:37:40 +02:00
x86 x86/kprobes: Fix optprobe optimization check with CONFIG_RETHUNK 2023-01-07 11:15:51 +01:00
xtensa xtensa: add __umulsidi3 helper 2023-01-07 11:15:50 +01:00
.gitignore
Kconfig ftrace: Allow WITH_ARGS flavour of graph tracer with shadow call stack 2022-12-31 13:26:31 +01:00