linux-stable/scripts/atomic/gen-atomic-fallback.sh
Mark Rutland 6d2779ecae locking/atomic: scripts: fix fallback ifdeffery
Since commit:

  9257959a6e ("locking/atomic: scripts: restructure fallback ifdeffery")

The ordering fallbacks for atomic*_read_acquire() and
atomic*_set_release() erroneously fall back to the implictly relaxed
atomic*_read() and atomic*_set() variants respectively, without any
additional barriers. This loses the ACQUIRE and RELEASE ordering
semantics, which can result in a wide variety of problems, even on
strongly-ordered architectures where the implementation of
atomic*_read() and/or atomic*_set() allows the compiler to reorder those
relative to other accesses.

In practice this has been observed to break bit spinlocks on arm64,
resulting in dentry cache corruption.

The fallback logic was intended to allow ACQUIRE/RELEASE/RELAXED ops to
be defined in terms of FULL ops, but where an op had RELAXED ordering by
default, this unintentionally permitted the ACQUIRE/RELEASE ops to be
defined in terms of the implicitly RELAXED default.

This patch corrects the logic to avoid falling back to implicitly
RELAXED ops, resulting in the same behaviour as prior to commit
9257959a6e.

I've verified the resulting assembly on arm64 by generating outlined
wrappers of the atomics. Prior to this patch the compiler generates
sequences using relaxed load (LDR) and store (STR) instructions, e.g.

| <outlined_atomic64_read_acquire>:
|         ldr     x0, [x0]
|         ret
|
| <outlined_atomic64_set_release>:
|         str     x1, [x0]
|         ret

With this patch applied the compiler generates sequences using the
intended load-acquire (LDAR) and store-release (STLR) instructions, e.g.

| <outlined_atomic64_read_acquire>:
|         ldar    x0, [x0]
|         ret
|
| <outlined_atomic64_set_release>:
|         stlr    x1, [x0]
|         ret

To make sure that there were no other victims of the ifdeffery rewrite,
I generated outlined copies of all of the {atomic,atomic64,atomic_long}
atomic operations before and after commit 9257959a6e. A diff of
the generated assembly on arm64 shows that only the read_acquire() and
set_release() operations were changed, and only lost their intended
ordering:

| [mark@lakrids:~/src/linux]% diff -u \
| 	<(aarch64-linux-gnu-objdump -d before-9257959a6e5b4fca.o)
| 	<(aarch64-linux-gnu-objdump -d after-9257959a6e5b4fca.o)
| --- /proc/self/fd/11    2023-09-19 16:51:51.114779415 +0100
| +++ /proc/self/fd/16    2023-09-19 16:51:51.114779415 +0100
| @@ -1,5 +1,5 @@
|
| -before-9257959a6e5b4fca.o:     file format elf64-littleaarch64
| +after-9257959a6e5b4fca.o:     file format elf64-littleaarch64
|
|
|  Disassembly of section .text:
| @@ -9,7 +9,7 @@
|         4:      d65f03c0        ret
|
|  0000000000000008 <outlined_atomic_read_acquire>:
| -       8:      88dffc00        ldar    w0, [x0]
| +       8:      b9400000        ldr     w0, [x0]
|         c:      d65f03c0        ret
|
|  0000000000000010 <outlined_atomic_set>:
| @@ -17,7 +17,7 @@
|        14:      d65f03c0        ret
|
|  0000000000000018 <outlined_atomic_set_release>:
| -      18:      889ffc01        stlr    w1, [x0]
| +      18:      b9000001        str     w1, [x0]
|        1c:      d65f03c0        ret
|
|  0000000000000020 <outlined_atomic_add>:
| @@ -1230,7 +1230,7 @@
|      1070:      d65f03c0        ret
|
|  0000000000001074 <outlined_atomic64_read_acquire>:
| -    1074:      c8dffc00        ldar    x0, [x0]
| +    1074:      f9400000        ldr     x0, [x0]
|      1078:      d65f03c0        ret
|
|  000000000000107c <outlined_atomic64_set>:
| @@ -1238,7 +1238,7 @@
|      1080:      d65f03c0        ret
|
|  0000000000001084 <outlined_atomic64_set_release>:
| -    1084:      c89ffc01        stlr    x1, [x0]
| +    1084:      f9000001        str     x1, [x0]
|      1088:      d65f03c0        ret
|
|  000000000000108c <outlined_atomic64_add>:
| @@ -2427,7 +2427,7 @@
|      207c:      d65f03c0        ret
|
|  0000000000002080 <outlined_atomic_long_read_acquire>:
| -    2080:      c8dffc00        ldar    x0, [x0]
| +    2080:      f9400000        ldr     x0, [x0]
|      2084:      d65f03c0        ret
|
|  0000000000002088 <outlined_atomic_long_set>:
| @@ -2435,7 +2435,7 @@
|      208c:      d65f03c0        ret
|
|  0000000000002090 <outlined_atomic_long_set_release>:
| -    2090:      c89ffc01        stlr    x1, [x0]
| +    2090:      f9000001        str     x1, [x0]
|      2094:      d65f03c0        ret
|
|  0000000000002098 <outlined_atomic_long_add>:

I've build tested this with a variety of configs for alpha, arm, arm64,
csky, i386, m68k, microblaze, mips, nios2, openrisc, powerpc, riscv,
s390, sh, sparc, x86_64, and xtensa, for which I've seen no issues. I
was unable to build test for ia64 and parisc due to existing build
breakage in v6.6-rc2.

Fixes: 9257959a6e ("locking/atomic: scripts: restructure fallback ifdeffery")
Reported-by: Ming Lei <ming.lei@redhat.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Baokun Li <libaokun1@huawei.com>
Link: https://lkml.kernel.org/r/20230919171430.2697727-1-mark.rutland@arm.com
2023-09-20 09:39:03 +02:00

333 lines
8.3 KiB
Bash
Executable file

#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
ATOMICDIR=$(dirname $0)
. ${ATOMICDIR}/atomic-tbl.sh
#gen_template_fallback(template, meta, pfx, name, sfx, order, atomic, int, args...)
gen_template_fallback()
{
local template="$1"; shift
local meta="$1"; shift
local pfx="$1"; shift
local name="$1"; shift
local sfx="$1"; shift
local order="$1"; shift
local atomic="$1"; shift
local int="$1"; shift
local ret="$(gen_ret_type "${meta}" "${int}")"
local retstmt="$(gen_ret_stmt "${meta}")"
local params="$(gen_params "${int}" "${atomic}" "$@")"
local args="$(gen_args "$@")"
. ${template}
}
#gen_order_fallback(meta, pfx, name, sfx, order, atomic, int, args...)
gen_order_fallback()
{
local meta="$1"; shift
local pfx="$1"; shift
local name="$1"; shift
local sfx="$1"; shift
local order="$1"; shift
local tmpl_order=${order#_}
local tmpl="${ATOMICDIR}/fallbacks/${tmpl_order:-fence}"
gen_template_fallback "${tmpl}" "${meta}" "${pfx}" "${name}" "${sfx}" "${order}" "$@"
}
#gen_proto_fallback(meta, pfx, name, sfx, order, atomic, int, args...)
gen_proto_fallback()
{
local meta="$1"; shift
local pfx="$1"; shift
local name="$1"; shift
local sfx="$1"; shift
local order="$1"; shift
local tmpl="$(find_fallback_template "${pfx}" "${name}" "${sfx}" "${order}")"
gen_template_fallback "${tmpl}" "${meta}" "${pfx}" "${name}" "${sfx}" "${order}" "$@"
}
#gen_proto_order_variant(meta, pfx, name, sfx, order, atomic, int, args...)
gen_proto_order_variant()
{
local meta="$1"; shift
local pfx="$1"; shift
local name="$1"; shift
local sfx="$1"; shift
local order="$1"; shift
local atomic="$1"; shift
local int="$1"; shift
local atomicname="${atomic}_${pfx}${name}${sfx}${order}"
local basename="${atomic}_${pfx}${name}${sfx}"
local template="$(find_fallback_template "${pfx}" "${name}" "${sfx}" "${order}")"
local ret="$(gen_ret_type "${meta}" "${int}")"
local retstmt="$(gen_ret_stmt "${meta}")"
local params="$(gen_params "${int}" "${atomic}" "$@")"
local args="$(gen_args "$@")"
gen_kerneldoc "raw_" "${meta}" "${pfx}" "${name}" "${sfx}" "${order}" "${atomic}" "${int}" "$@"
printf "static __always_inline ${ret}\n"
printf "raw_${atomicname}(${params})\n"
printf "{\n"
# Where there is no possible fallback, this order variant is mandatory
# and must be provided by arch code. Add a comment to the header to
# make this obvious.
#
# Ideally we'd error on a missing definition, but arch code might
# define this order variant as a C function without a preprocessor
# symbol.
if [ -z ${template} ] && [ -z "${order}" ] && ! meta_has_relaxed "${meta}"; then
printf "\t${retstmt}arch_${atomicname}(${args});\n"
printf "}\n\n"
return
fi
printf "#if defined(arch_${atomicname})\n"
printf "\t${retstmt}arch_${atomicname}(${args});\n"
# Allow FULL/ACQUIRE/RELEASE ops to be defined in terms of RELAXED ops
if [ "${order}" != "_relaxed" ] && meta_has_relaxed "${meta}"; then
printf "#elif defined(arch_${basename}_relaxed)\n"
gen_order_fallback "${meta}" "${pfx}" "${name}" "${sfx}" "${order}" "${atomic}" "${int}" "$@"
fi
# Allow ACQUIRE/RELEASE/RELAXED ops to be defined in terms of FULL ops
if [ ! -z "${order}" ] && ! meta_is_implicitly_relaxed "${meta}"; then
printf "#elif defined(arch_${basename})\n"
printf "\t${retstmt}arch_${basename}(${args});\n"
fi
printf "#else\n"
if [ ! -z "${template}" ]; then
gen_proto_fallback "${meta}" "${pfx}" "${name}" "${sfx}" "${order}" "${atomic}" "${int}" "$@"
else
printf "#error \"Unable to define raw_${atomicname}\"\n"
fi
printf "#endif\n"
printf "}\n\n"
}
#gen_proto_order_variants(meta, pfx, name, sfx, atomic, int, args...)
gen_proto_order_variants()
{
local meta="$1"; shift
local pfx="$1"; shift
local name="$1"; shift
local sfx="$1"; shift
local atomic="$1"
gen_proto_order_variant "${meta}" "${pfx}" "${name}" "${sfx}" "" "$@"
if meta_has_acquire "${meta}"; then
gen_proto_order_variant "${meta}" "${pfx}" "${name}" "${sfx}" "_acquire" "$@"
fi
if meta_has_release "${meta}"; then
gen_proto_order_variant "${meta}" "${pfx}" "${name}" "${sfx}" "_release" "$@"
fi
if meta_has_relaxed "${meta}"; then
gen_proto_order_variant "${meta}" "${pfx}" "${name}" "${sfx}" "_relaxed" "$@"
fi
}
#gen_basic_fallbacks(basename)
gen_basic_fallbacks()
{
local basename="$1"; shift
cat << EOF
#define raw_${basename}_acquire arch_${basename}
#define raw_${basename}_release arch_${basename}
#define raw_${basename}_relaxed arch_${basename}
EOF
}
gen_order_fallbacks()
{
local xchg="$1"; shift
cat <<EOF
#define raw_${xchg}_relaxed arch_${xchg}_relaxed
#ifdef arch_${xchg}_acquire
#define raw_${xchg}_acquire arch_${xchg}_acquire
#else
#define raw_${xchg}_acquire(...) \\
__atomic_op_acquire(arch_${xchg}, __VA_ARGS__)
#endif
#ifdef arch_${xchg}_release
#define raw_${xchg}_release arch_${xchg}_release
#else
#define raw_${xchg}_release(...) \\
__atomic_op_release(arch_${xchg}, __VA_ARGS__)
#endif
#ifdef arch_${xchg}
#define raw_${xchg} arch_${xchg}
#else
#define raw_${xchg}(...) \\
__atomic_op_fence(arch_${xchg}, __VA_ARGS__)
#endif
EOF
}
gen_xchg_order_fallback()
{
local xchg="$1"; shift
local order="$1"; shift
local forder="${order:-_fence}"
printf "#if defined(arch_${xchg}${order})\n"
printf "#define raw_${xchg}${order} arch_${xchg}${order}\n"
if [ "${order}" != "_relaxed" ]; then
printf "#elif defined(arch_${xchg}_relaxed)\n"
printf "#define raw_${xchg}${order}(...) \\\\\n"
printf " __atomic_op${forder}(arch_${xchg}, __VA_ARGS__)\n"
fi
if [ ! -z "${order}" ]; then
printf "#elif defined(arch_${xchg})\n"
printf "#define raw_${xchg}${order} arch_${xchg}\n"
fi
printf "#else\n"
printf "extern void raw_${xchg}${order}_not_implemented(void);\n"
printf "#define raw_${xchg}${order}(...) raw_${xchg}${order}_not_implemented()\n"
printf "#endif\n\n"
}
gen_xchg_fallbacks()
{
local xchg="$1"; shift
for order in "" "_acquire" "_release" "_relaxed"; do
gen_xchg_order_fallback "${xchg}" "${order}"
done
}
gen_try_cmpxchg_fallback()
{
local cmpxchg="$1"; shift;
local order="$1"; shift;
cat <<EOF
#define raw_try_${cmpxchg}${order}(_ptr, _oldp, _new) \\
({ \\
typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \\
___r = raw_${cmpxchg}${order}((_ptr), ___o, (_new)); \\
if (unlikely(___r != ___o)) \\
*___op = ___r; \\
likely(___r == ___o); \\
})
EOF
}
gen_try_cmpxchg_order_fallback()
{
local cmpxchg="$1"; shift
local order="$1"; shift
local forder="${order:-_fence}"
printf "#if defined(arch_try_${cmpxchg}${order})\n"
printf "#define raw_try_${cmpxchg}${order} arch_try_${cmpxchg}${order}\n"
if [ "${order}" != "_relaxed" ]; then
printf "#elif defined(arch_try_${cmpxchg}_relaxed)\n"
printf "#define raw_try_${cmpxchg}${order}(...) \\\\\n"
printf " __atomic_op${forder}(arch_try_${cmpxchg}, __VA_ARGS__)\n"
fi
if [ ! -z "${order}" ]; then
printf "#elif defined(arch_try_${cmpxchg})\n"
printf "#define raw_try_${cmpxchg}${order} arch_try_${cmpxchg}\n"
fi
printf "#else\n"
gen_try_cmpxchg_fallback "${cmpxchg}" "${order}"
printf "#endif\n\n"
}
gen_try_cmpxchg_fallbacks()
{
local cmpxchg="$1"; shift;
for order in "" "_acquire" "_release" "_relaxed"; do
gen_try_cmpxchg_order_fallback "${cmpxchg}" "${order}"
done
}
gen_cmpxchg_local_fallbacks()
{
local cmpxchg="$1"; shift
printf "#define raw_${cmpxchg} arch_${cmpxchg}\n\n"
printf "#ifdef arch_try_${cmpxchg}\n"
printf "#define raw_try_${cmpxchg} arch_try_${cmpxchg}\n"
printf "#else\n"
gen_try_cmpxchg_fallback "${cmpxchg}" ""
printf "#endif\n\n"
}
cat << EOF
// SPDX-License-Identifier: GPL-2.0
// Generated by $0
// DO NOT MODIFY THIS FILE DIRECTLY
#ifndef _LINUX_ATOMIC_FALLBACK_H
#define _LINUX_ATOMIC_FALLBACK_H
#include <linux/compiler.h>
EOF
for xchg in "xchg" "cmpxchg" "cmpxchg64" "cmpxchg128"; do
gen_xchg_fallbacks "${xchg}"
done
for cmpxchg in "cmpxchg" "cmpxchg64" "cmpxchg128"; do
gen_try_cmpxchg_fallbacks "${cmpxchg}"
done
for cmpxchg in "cmpxchg_local" "cmpxchg64_local" "cmpxchg128_local"; do
gen_cmpxchg_local_fallbacks "${cmpxchg}" ""
done
for cmpxchg in "sync_cmpxchg"; do
printf "#define raw_${cmpxchg} arch_${cmpxchg}\n\n"
done
grep '^[a-z]' "$1" | while read name meta args; do
gen_proto "${meta}" "${name}" "atomic" "int" ${args}
done
cat <<EOF
#ifdef CONFIG_GENERIC_ATOMIC64
#include <asm-generic/atomic64.h>
#endif
EOF
grep '^[a-z]' "$1" | while read name meta args; do
gen_proto "${meta}" "${name}" "atomic64" "s64" ${args}
done
cat <<EOF
#endif /* _LINUX_ATOMIC_FALLBACK_H */
EOF