Commit Graph

12 Commits

Author SHA1 Message Date
Linus Torvalds 4356e9f841 work around gcc bugs with 'asm goto' with outputs
We've had issues with gcc and 'asm goto' before, and we created a
'asm_volatile_goto()' macro for that in the past: see commits
3f0116c323 ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
bug") and a9f180345f ("compiler/gcc4: Make quirk for
asm_volatile_goto() unconditional").

Then, much later, we ended up removing the workaround in commit
43c249ea0b ("compiler-gcc.h: remove ancient workaround for gcc PR
58670") because we no longer supported building the kernel with the
affected gcc versions, but we left the macro uses around.

Now, Sean Christopherson reports a new version of a very similar
problem, which is fixed by re-applying that ancient workaround.  But the
problem in question is limited to only the 'asm goto with outputs'
cases, so instead of re-introducing the old workaround as-is, let's
rename and limit the workaround to just that much less common case.

It looks like there are at least two separate issues that all hit in
this area:

 (a) some versions of gcc don't mark the asm goto as 'volatile' when it
     has outputs:

        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420

     which is easy to work around by just adding the 'volatile' by hand.

 (b) Internal compiler errors:

        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422

     which are worked around by adding the extra empty 'asm' as a
     barrier, as in the original workaround.

but the problem Sean sees may be a third thing since it involves bad
code generation (not an ICE) even with the manually added 'volatile'.

but the same old workaround works for this case, even if this feels a
bit like voodoo programming and may only be hiding the issue.

Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-09 15:57:48 -08:00
Xiao Wang 55ca8d7aa2
riscv: Optimize hweight API with Zbb extension
The Hamming Weight of a number is the total number of bits set in it, so
the cpop/cpopw instruction from Zbb extension can be used to accelerate
hweight() API.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231112095244.4015351-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-17 18:18:40 -08:00
Linus Torvalds 56d428ae1c RISC-V Patches for the 6.7 Merge Window, Part 2
* Support for handling misaligned accesses in S-mode.
 * Probing for misaligned access support is now properly cached and
   handled in parallel.
 * PTDUMP now reflects the SW reserved bits, as well as the PBMT and
   NAPOT extensions.
 * Performance improvements for TLB flushing.
 * Support for many new relocations in the module loader.
 * Various bug fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmVOUCcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYicJ2D/9S+9dnHYHVGTeJfr9Zf2T4r+qHBPyx
 LXbTAbgHN6139MgcRLMRlcUaQ04RVxuBCWhxewJ6mQiHiYNlullgKmJO8oYMS4uZ
 2yQGHKhzKEVluXxe+qT6VW+zsP0cY6pDQ+e59AqZgyWzvATxMU4VtFfCDdjFG03I
 k/8Y3MUKSHAKzIHUsGHiMW5J2YRiM/iVehv2gZfanreulWlK6lyiV4AZ4KChu8Sa
 gix9QkFJw+9+7RHnouHvczt4xTqLPJQcdecLJsbisEI4VaaPtTVzkvXx/kwbMwX0
 qkQnZ7I60fPHrCb9ccuedjDMa1Z0lrfwRldBGz9f9QaW37Eppirn6LA5JiZ1cA47
 wKTwba6gZJCTRXELFTJLcv+Cwdy003E0y3iL5UK2rkbLqcxfvLdq1WAJU2t05Lmh
 aRQN10BtM2DZG+SNPlLoBpXPDw0Q3KOc20zGtuhmk010+X4yOK7WXlu8zNGLLE0+
 yHamiZqAbpIUIEzwDdGbb95jywR1sUhNTbScuhj4Rc79ZqLtPxty1PUhnfqFat1R
 i3ngQtCbeUUYFS2YV9tKkXjLf/xkQNRbt7kQBowuvFuvfksl9UwMdRAWcE/h0M9P
 7uz7cBFhuG0v/XblB7bUhYLkKITvP+ltSMyxaGlfpGqCLAH2KIztdZ2PLWLRdKeU
 +9dtZSQR6oBLqQ==
 =NhdR
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.7-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for handling misaligned accesses in S-mode

 - Probing for misaligned access support is now properly cached and
   handled in parallel

 - PTDUMP now reflects the SW reserved bits, as well as the PBMT and
   NAPOT extensions

 - Performance improvements for TLB flushing

 - Support for many new relocations in the module loader

 - Various bug fixes and cleanups

* tag 'riscv-for-linus-6.7-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  riscv: Optimize bitops with Zbb extension
  riscv: Rearrange hwcap.h and cpufeature.h
  drivers: perf: Do not broadcast to other cpus when starting a counter
  drivers: perf: Check find_first_bit() return value
  of: property: Add fw_devlink support for msi-parent
  RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs
  riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings
  riscv: Don't use PGD entries for the linear mapping
  RISC-V: Probe misaligned access speed in parallel
  RISC-V: Remove __init on unaligned_emulation_finish()
  RISC-V: Show accurate per-hart isa in /proc/cpuinfo
  RISC-V: Don't rely on positional structure initialization
  riscv: Add tests for riscv module loading
  riscv: Add remaining module relocations
  riscv: Avoid unaligned access when relocating modules
  riscv: split cache ops out of dma-noncoherent.c
  riscv: Improve flush_tlb_kernel_range()
  riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
  riscv: Improve flush_tlb_range() for hugetlb pages
  riscv: Improve tlb_flush()
  ...
2023-11-10 09:23:17 -08:00
Xiao Wang 457926b253
riscv: Optimize bitops with Zbb extension
This patch leverages the alternative mechanism to dynamically optimize
bitops (including __ffs, __fls, ffs, fls) with Zbb instructions. When
Zbb ext is not supported by the runtime CPU, legacy implementation is
used. If Zbb is supported, then the optimized variants will be selected
via alternative patching.

The legacy bitops support is taken from the generic C implementation as
fallback.

If the parameter is a build-time constant, we leverage compiler builtin to
calculate the result directly, this approach is inspired by x86 bitops
implementation.

EFI stub runs before the kernel, so alternative mechanism should not be
used there, this patch introduces a macro NO_ALTERNATIVE for this purpose.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231031064553.2319688-3-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-11-09 10:15:52 -08:00
Matthew Wilcox (Oracle) f12fb73b74 mm: delete checks for xor_unlock_is_negative_byte()
Architectures which don't define their own use the one in
asm-generic/bitops/lock.h.  Get rid of all the ifdefs around "maybe we
don't have it".

Link: https://lkml.kernel.org/r/20231004165317.1061855-15-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:17 -07:00
Matthew Wilcox (Oracle) 2a667285b5 riscv: implement xor_unlock_is_negative_byte
Inspired by the riscv clear_bit_unlock(), this will surely be
more efficient than the generic one defined in filemap.c.

Link: https://lkml.kernel.org/r/20231004165317.1061855-13-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18 14:34:17 -07:00
Yury Norov 47d8c15615 include: move find.h from asm_generic to linux
find_bit API and bitmap API are closely related, but inclusion paths
are different - include/asm-generic and include/linux, correspondingly.
In the past it made a lot of troubles due to circular dependencies
and/or undefined symbols. Fix this by moving find.h under include/linux.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2022-01-15 08:47:31 -08:00
Linus Torvalds eb7c825bf7 RISC-V patches for v5.2-rc6
This tag contains fixes, defconfig, and DT data changes for the v5.2-rc
 series.  The fixes are relatively straightforward:
 
 - Addition of a TLB fence in the vmalloc_fault path, so the CPU doesn't
   enter an infinite page fault loop;
 - Readdition of the pm_power_off export, so device drivers that
   reassign it can now be built as modules;
 - A udelay() fix for RV32, fixing a miscomputation of the delay time;
 - Removal of deprecated smp_mb__*() barriers.
 
 The tag also adds initial DT data infrastructure for arch/riscv, along
 with initial data for the SiFive FU540-C000 SoC and the corresponding
 HiFive Unleashed board.
 
 We also update the RV64 defconfig to include some core drivers for the
 FU540 in the build.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl0HtEkACgkQx4+xDQu9
 KkuRIw//f2vSrUyMh44sevr6euVD0K++hQ0AbteQ94cGHqYWWaNxfwMHFD91Gxbj
 wowTwgssq7H9nePsKANjiiLULnZNIkWXAlIncjzv3aXkH6JG3f9nEGR49yzvCbIZ
 yN8wgElJ8rcVWLd096E53Su84CzxuJJ2o3wOI1nQi8aI4h3LwkM2b/O4GxZFpnWb
 vIhWXqjvbUb8XL7Y+VPewtxnZItOUDHkuIkup4kP2bTgl2iDW93hzWwxNKbt6v+m
 9wTzAChjcepCAXSmEGeeZ/h2HNqw2crs+NWOe0drcKxL2vKPZ6gS8ZRX/NuIoDr4
 JgMILzYSO28z8N6w1cJJUdN4eGhCTvdxVTQXvkk/yZoT08X6M0xb5A1MbtizgOJ6
 mZK/vM9gtuoUSZG0SRNeNoqHbWu1tIm29z435Be8hWAtzXlEfewJm8ntgFO4dGmb
 E8TRSgjLzdHY0Nvwx/KVtvYmE/TMybVVRsxJJ525dqJlHT7f3VuRstvw7VQJQpz2
 +JfsZbYk1KjbUc25QpAqF1LUxrRQFn2JL0Cqw+L49J8eshY77rsTcAKP6ZZWiSFZ
 qodU0oPF4BkS1t0bnFuNwlqsAr/q9EiAnQO7+SvqQY/ZUnMNk9gCNn5k/rHMCfyD
 2Dyo6iAbj+Yyb1rrQxX6QnlbHgpFxsG3N4s9E5jOPgKyEQM4JQ4=
 =aotJ
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:
 "This contains fixes, defconfig, and DT data changes for the v5.2-rc
  series.

  The fixes are relatively straightforward:

   - Addition of a TLB fence in the vmalloc_fault path, so the CPU
     doesn't enter an infinite page fault loop

   - Readdition of the pm_power_off export, so device drivers that
     reassign it can now be built as modules

   - A udelay() fix for RV32, fixing a miscomputation of the delay time

   - Removal of deprecated smp_mb__*() barriers

  This also adds initial DT data infrastructure for arch/riscv, along
  with initial data for the SiFive FU540-C000 SoC and the corresponding
  HiFive Unleashed board.

  We also update the RV64 defconfig to include some core drivers for the
  FU540 in the build"

* tag 'riscv-for-v5.2/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: remove unused barrier defines
  riscv: mm: synchronize MMU after pte change
  riscv: dts: add initial board data for the SiFive HiFive Unleashed
  riscv: dts: add initial support for the SiFive FU540-C000 SoC
  dt-bindings: riscv: convert cpu binding to json-schema
  dt-bindings: riscv: sifive: add YAML documentation for the SiFive FU540
  arch: riscv: add support for building DTB files from DT source data
  riscv: Fix udelay in RV32.
  riscv: export pm_power_off again
  RISC-V: defconfig: enable clocks, serial console
2019-06-17 10:34:03 -07:00
Rolf Eike Beer 259931fd3b riscv: remove unused barrier defines
They were introduced in commit fab957c11e ("RISC-V: Atomic and
Locking Code") long after commit 2e39465abc ("locking: Remove
deprecated smp_mb__() barriers") removed the remnants of all previous
instances from the tree.

Signed-off-by: Rolf Eike Beer <eb@emlix.com>
[paul.walmsley@sifive.com: stripped spurious mbox header from patch
 description; fixed commit references in patch header]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-06-17 07:09:43 -07:00
Thomas Gleixner 50acfb2b76 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 286
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 this program is distributed
  in the hope that it will be useful but without any warranty without
  even the implied warranty of merchantability or fitness for a
  particular purpose see the gnu general public license for more
  details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 97 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.025053186@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Palmer Dabbelt 9347ce54cd RISC-V: __test_and_op_bit_ord should be strongly ordered
I mis-read the documentation.  After looking at it again the
documentation is actually as clear as it can be, it's just that I didn't
actually read it in order and therefor did the wrong thing.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-11-28 14:04:05 -08:00
Palmer Dabbelt fab957c11e RISC-V: Atomic and Locking Code
This contains all the code that directly interfaces with the RISC-V
memory model.  While this code corforms to the current RISC-V ISA
specifications (user 2.2 and priv 1.10), the memory model is somewhat
underspecified in those documents.  There is a working group that hopes
to produce a formal memory model by the end of the year, but my
understanding is that the basic definitions we're relying on here won't
change significantly.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
2017-09-26 15:26:45 -07:00