/* SPDX-License-Identifier: GPL-2.0-only */
/*
 * Copyright (C) 2014 Linaro Ltd. <ard.biesheuvel@linaro.org>
 */

#ifndef __ASM_CPUFEATURE_H
#define __ASM_CPUFEATURE_H
#include <asm/alternative-macros.h>
#include <asm/cpucaps.h>
#include <asm/cputype.h>
#include <asm/hwcap.h>
#include <asm/sysreg.h>

#define MAX_CPU_FEATURES	128
#define cpu_feature(x)		KERNEL_HWCAP_ ## x

#define ARM64_SW_FEATURE_OVERRIDE_NOKASLR	0
#define ARM64_SW_FEATURE_OVERRIDE_HVHE		4
#define ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF	8

#ifndef __ASSEMBLY__

#include <linux/bug.h>
#include <linux/jump_label.h>
#include <linux/kernel.h>
#include <linux/cpumask.h>

/*
 * CPU feature register tracking
 *
 * The safe value of a CPUID feature field is dependent on the implications
 * of the values assigned to it by the architecture. Based on the relationship
 * between the values, the features are classified into 3 types - LOWER_SAFE,
 * HIGHER_SAFE and EXACT.
 *
 * The lowest value of all the CPUs is chosen for LOWER_SAFE and highest
 * for HIGHER_SAFE. It is expected that all CPUs have the same value for
 * a field when EXACT is specified, failing which, the safe value specified
 * in the table is chosen.
 */

enum ftr_type {
	FTR_EXACT,			/* Use a predefined safe value */
	FTR_LOWER_SAFE,			/* Smaller value is safe */
	FTR_HIGHER_SAFE,		/* Bigger value is safe */
	FTR_HIGHER_OR_ZERO_SAFE,	/* Bigger value is safe, but 0 is biggest */
};

#define FTR_STRICT	true	/* SANITY check strict matching required */
#define FTR_NONSTRICT	false	/* SANITY check ignored */

#define FTR_SIGNED	true	/* Value should be treated as signed */
#define FTR_UNSIGNED	false	/* Value should be treated as unsigned */

#define FTR_VISIBLE	true	/* Feature visible to the user space */
#define FTR_HIDDEN	false	/* Feature is hidden from the user */

#define FTR_VISIBLE_IF_IS_ENABLED(config)		\
	(IS_ENABLED(config) ? FTR_VISIBLE : FTR_HIDDEN)

struct arm64_ftr_bits {
	bool		sign;	/* Value is signed ? */
	bool		visible;
	bool		strict;	/* CPU Sanity check: strict matching required ? */
	enum ftr_type	type;
	u8		shift;
	u8		width;
	s64		safe_val; /* safe value for FTR_EXACT features */
};
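
/*
 * Illustrative sketch only (not part of this header, the descriptor name is
 * hypothetical): a 4-bit, signed, strictly-checked field at bit position 8,
 * sanitised by taking the lowest value seen across CPUs, could be described
 * with the fields above as:
 *
 *	static const struct arm64_ftr_bits example_ftr_field = {
 *		.sign		= FTR_SIGNED,
 *		.visible	= FTR_HIDDEN,
 *		.strict		= FTR_STRICT,
 *		.type		= FTR_LOWER_SAFE,
 *		.shift		= 8,
 *		.width		= 4,
 *		.safe_val	= 0,
 *	};
 */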

/*
 * Describe the early feature override to the core override code:
 *
 * @val			Values that are to be merged into the final
 *			sanitised value of the register. Only the bitfields
 *			set to 1 in @mask are valid
 * @mask		Mask of the features that are overridden by @val
 *
 * A @mask field set to full-1 indicates that the corresponding field
 * in @val is a valid override.
 *
 * A @mask field set to full-0 with the corresponding @val field set
 * to full-0 denotes that this field has no override
 *
 * A @mask field set to full-0 with the corresponding @val field set
 * to full-1 denotes that this field has an invalid override.
 */
struct arm64_ftr_override {
	u64		val;
	u64		mask;
};
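
/*
 * Illustrative sketch only: overriding a single 4-bit field at bit 8 to the
 * value 1, while leaving every other field untouched, would (per the rules
 * above) set all four @mask bits of that field and place the value in @val:
 *
 *	struct arm64_ftr_override example_override = {
 *		.val	= 0x1UL << 8,
 *		.mask	= 0xfUL << 8,
 *	};
 */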

/*
 * @arm64_ftr_reg - Feature register
 * @strict_mask		Bits which should match across all CPUs for sanity.
 * @sys_val		Safe value across the CPUs (system view)
 */
struct arm64_ftr_reg {
	const char			*name;
	u64				strict_mask;
	u64				user_mask;
	u64				sys_val;
	u64				user_val;
	struct arm64_ftr_override	*override;
	const struct arm64_ftr_bits	*ftr_bits;
};

extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
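
/*
 * Illustrative sketch only (all names below are hypothetical): a feature
 * register description ties an array of arm64_ftr_bits field descriptors and
 * an optional boot-time override to a named register:
 *
 *	static const struct arm64_ftr_bits example_reg_fields[] = {
 *		{ .type = FTR_LOWER_SAFE, .shift = 8, .width = 4 },
 *		{},
 *	};
 *	static struct arm64_ftr_override example_reg_override;
 *	static struct arm64_ftr_reg example_reg = {
 *		.name		= "EXAMPLE_EL1",
 *		.override	= &example_reg_override,
 *		.ftr_bits	= example_reg_fields,
 *	};
 *
 * The system-wide view (@sys_val, @strict_mask, @user_mask, ...) is expected
 * to be computed by the cpufeature code as CPUs are sanitised during boot.
 */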

/*
 * CPU capabilities:
 *
 * We use arm64_cpu_capabilities to represent system features, errata work
 * arounds (both used internally by kernel and tracked in system_cpucaps) and
 * ELF HWCAPs (which are exposed to user).
 *
 * To support systems with heterogeneous CPUs, we need to make sure that we
 * detect the capabilities correctly on the system and take appropriate
 * measures to ensure there are no incompatibilities.
 *
 * This comment tries to explain how we treat the capabilities.
 * Each capability has the following list of attributes :
 *
 * 1) Scope of Detection : The system detects a given capability by
 *    performing some checks at runtime. This could be, e.g, checking the
 *    value of a field in CPU ID feature register or checking the cpu
 *    model. The capability provides a call back ( @matches() ) to
 *    perform the check. Scope defines how the checks should be performed.
 *    There are three cases:
 *
 *     a) SCOPE_LOCAL_CPU: check all the CPUs and "detect" if at least one
 *        matches. This implies, we have to run the check on all the
 *        booting CPUs, until the system decides that state of the
 *        capability is finalised. (See section 2 below)
 *		Or
 *     b) SCOPE_SYSTEM: check all the CPUs and "detect" if all the CPUs
 *        match. This implies, we run the check only once, when the
 *        system decides to finalise the state of the capability. If the
 *        capability relies on a field in one of the CPU ID feature
 *        registers, we use the sanitised value of the register from the
 *        CPU feature infrastructure to make the decision.
 *		Or
 *     c) SCOPE_BOOT_CPU: Check only on the primary boot CPU to detect the
 *        feature. This category is for features that are "finalised"
 *        (or used) by the kernel very early even before the SMP cpus
 *        are brought up.
 *
 *    The process of detection is usually denoted by "update" capability
 *    state in the code.
 *
 * 2) Finalise the state : The kernel should finalise the state of a
 *    capability at some point during its execution and take necessary
 *    actions if any. Usually, this is done, after all the boot-time
 *    enabled CPUs are brought up by the kernel, so that it can make
 *    a better decision based on the available set of CPUs. However, there
 *    are some special cases, where the action is taken during the early
 *    boot by the primary boot CPU. (e.g, running the kernel at EL2 with
 *    Virtualisation Host Extensions). The kernel usually disallows any
 *    changes to the state of a capability once it finalises the capability
 *    and takes any action, as it may be impossible to execute the actions
 *    safely. A CPU brought up after a capability is "finalised" is
 *    referred to as "Late CPU" w.r.t the capability. e.g, all secondary
 *    CPUs are treated "late CPUs" for capabilities determined by the boot
 *    CPU.
 *
 *    At the moment there are two passes of finalising the capabilities.
 *      a) Boot CPU scope capabilities - Finalised by primary boot CPU via
 *         setup_boot_cpu_capabilities().
 *      b) Everything except (a) - Run via setup_system_capabilities().
 *
 * 3) Verification: When a CPU is brought online (e.g, by user or by the
 *    kernel), the kernel should make sure that it is safe to use the CPU,
 *    by verifying that the CPU is compliant with the state of the
 *    capabilities finalised already. This happens via :
 *
 *	secondary_start_kernel()-> check_local_cpu_capabilities()
 *
 *    As explained in (2) above, capabilities could be finalised at
 *    different points in the execution. Each newly booted CPU is verified
 *    against the capabilities that have been finalised by the time it
 *    boots.
 *
 *	a) SCOPE_BOOT_CPU : All CPUs are verified against the capability
 *	   except for the primary boot CPU.
 *
 *	b) SCOPE_LOCAL_CPU, SCOPE_SYSTEM: All CPUs hotplugged on by the
 *	   user after the kernel boot are verified against the capability.
 *
 *    If there is a conflict, the kernel takes an action, based on the
 *    severity (e.g, a CPU could be prevented from booting or cause a
 *    kernel panic). The CPU is allowed to "affect" the state of the
 *    capability, if it has not been finalised already. See section 5
 *    for more details on conflicts.
 *
 * 4) Action: As mentioned in (2), the kernel can take an action for each
 *    detected capability, on all CPUs on the system. Appropriate actions
 *    include, turning on an architectural feature, modifying the control
 *    registers (e.g, SCTLR, TCR etc.) or patching the kernel via
 *    alternatives. The kernel patching is batched and performed at a later
 *    point. The actions are always initiated only after the capability
 *    is finalised. This is usually denoted by "enabling" the capability.
 *    The actions are initiated as follows :
 *	a) Action is triggered on all online CPUs, after the capability is
 *	   finalised, invoked within the stop_machine() context from
 *	   enable_cpu_capabilities().
 *
 *	b) Any late CPU, brought up after (1), the action is triggered via:
 *
 *	  check_local_cpu_capabilities() -> verify_local_cpu_capabilities()
 *
 * 5) Conflicts: Based on the state of the capability on a late CPU vs.
 *    the system state, we could have the following combinations :
 *
 *		x-----------------------------x
 *		| Type  | System   | Late CPU |
 *		|-----------------------------|
 *		|   a   |   y      |    n     |
 *		|-----------------------------|
 *		|   b   |   n      |    y     |
 *		x-----------------------------x
 *
 *     Two separate flag bits are defined to indicate whether each kind of
 *     conflict can be allowed:
 *		ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU - Case(a) is allowed
 *		ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU - Case(b) is allowed
 *
 *     Case (a) is not permitted for a capability that the system requires
 *     all CPUs to have in order for the capability to be enabled. This is
 *     typical for capabilities that represent enhanced functionality.
 *
 *     Case (b) is not permitted for a capability that must be enabled
 *     during boot if any CPU in the system requires it in order to run
 *     safely. This is typical for erratum work arounds that cannot be
 *     enabled after the corresponding capability is finalised.
 *
 *     In some non-typical cases either both (a) and (b), or neither,
 *     should be permitted. This can be described by including neither
 *     or both flags in the capability's type field.
 *
 *     In case of a conflict, the CPU is prevented from booting. If the
 *     ARM64_CPUCAP_PANIC_ON_CONFLICT flag is specified for the capability,
 *     then a kernel panic is triggered.
 */

/*
 * Decide how the capability is detected.
 * On any local CPU vs System wide vs the primary boot CPU
 */
#define ARM64_CPUCAP_SCOPE_LOCAL_CPU		((u16)BIT(0))
#define ARM64_CPUCAP_SCOPE_SYSTEM		((u16)BIT(1))
/*
 * The capability is detected on the Boot CPU and is used by kernel
 * during early boot. i.e, the capability should be "detected" and
 * "enabled" as early as possible on all booting CPUs.
 */
#define ARM64_CPUCAP_SCOPE_BOOT_CPU		((u16)BIT(2))
#define ARM64_CPUCAP_SCOPE_MASK			\
	(ARM64_CPUCAP_SCOPE_SYSTEM	|	\
	 ARM64_CPUCAP_SCOPE_LOCAL_CPU	|	\
	 ARM64_CPUCAP_SCOPE_BOOT_CPU)

#define SCOPE_SYSTEM	ARM64_CPUCAP_SCOPE_SYSTEM
#define SCOPE_LOCAL_CPU	ARM64_CPUCAP_SCOPE_LOCAL_CPU
#define SCOPE_BOOT_CPU	ARM64_CPUCAP_SCOPE_BOOT_CPU
#define SCOPE_ALL	ARM64_CPUCAP_SCOPE_MASK

/*
 * Is it permitted for a late CPU to have this capability when system
 * hasn't already enabled it ?
 */
#define ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU	((u16)BIT(4))
/* Is it safe for a late CPU to miss this capability when system has it */
#define ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU	((u16)BIT(5))
/* Panic when a conflict is detected */
#define ARM64_CPUCAP_PANIC_ON_CONFLICT		((u16)BIT(6))

/*
 * CPU errata workarounds that need to be enabled at boot time if one or
 * more CPUs in the system require it. When one of these capabilities
 * has been enabled, it is safe to allow any CPU to boot that doesn't
 * require the workaround. However, it is not safe if a "late" CPU
 * requires a workaround and the system hasn't enabled it already.
 */
#define ARM64_CPUCAP_LOCAL_CPU_ERRATUM		\
	(ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU)
/*
 * CPU feature detected at boot time based on system-wide value of a
 * feature. It is safe for a late CPU to have this feature even though
 * the system hasn't enabled it, although the feature will not be used
 * by Linux in this case. If the system has enabled this feature already,
 * then every late CPU must have it.
 */
#define ARM64_CPUCAP_SYSTEM_FEATURE	\
	(ARM64_CPUCAP_SCOPE_SYSTEM | ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU)
/*
 * CPU feature detected at boot time based on feature of one or more CPUs.
 * All possible conflicts for a late CPU are ignored.
 * NOTE: this means that a late CPU with the feature will *not* cause the
 * capability to be advertised by cpus_have_*cap()!
 */
#define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
	 ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU	|	\
	 ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU)

/*
 * CPU feature detected at boot time, on one or more CPUs. A late CPU
 * is not allowed to have the capability when the system doesn't have it.
 * It is Ok for a late CPU to miss the feature.
 */
#define ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE	\
	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
	 ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU)

/*
 * CPU feature used early in the boot based on the boot CPU. All secondary
 * CPUs must match the state of the capability as detected by the boot CPU. In
 * case of a conflict, a kernel panic is triggered.
 */
#define ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE	\
	(ARM64_CPUCAP_SCOPE_BOOT_CPU | ARM64_CPUCAP_PANIC_ON_CONFLICT)

/*
 * CPU feature used early in the boot based on the boot CPU. It is safe for a
 * late CPU to have this feature even though the boot CPU hasn't enabled it,
 * although the feature will not be used by Linux in this case. If the boot CPU
 * has enabled this feature already, then every late CPU must have it.
 */
#define ARM64_CPUCAP_BOOT_CPU_FEATURE	\
	(ARM64_CPUCAP_SCOPE_BOOT_CPU | ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU)

struct arm64_cpu_capabilities {
	const char *desc;
	u16 capability;
	u16 type;
	bool (*matches)(const struct arm64_cpu_capabilities *caps, int scope);
	/*
	 * Take the appropriate actions to configure this capability
	 * for this CPU. If the capability is detected by the kernel
	 * this will be called on all the CPUs in the system,
	 * including the hotplugged CPUs, regardless of whether the
	 * capability is available on that specific CPU. This is
	 * useful for some capabilities (e.g, working around CPU
	 * errata), where all the CPUs must take some action (e.g,
	 * changing system control/configuration). Thus, if an action
	 * is required only if the CPU has the capability, then the
	 * routine must check it before taking any action.
	 */
	void (*cpu_enable)(const struct arm64_cpu_capabilities *cap);
	union {
		struct {	/* To be used for erratum handling only */
			struct midr_range midr_range;
			const struct arm64_midr_revidr {
				u32 midr_rv;		/* revision/variant */
				u32 revidr_mask;
			} * const fixed_revs;
		};

		const struct midr_range *midr_range_list;
		struct {	/* Feature register checking */
			u32 sys_reg;
			u8 field_pos;
			u8 field_width;
			u8 min_field_value;
			u8 hwcap_type;
			bool sign;
			unsigned long hwcap;
		};
	};

	/*
	 * An optional list of "matches/cpu_enable" pair for the same
	 * "capability" of the same "type" as described by the parent.
	 * Only matches(), cpu_enable() and fields relevant to these
	 * methods are significant in the list. The cpu_enable is
	 * invoked only if the corresponding entry "matches()".
	 * However, if a cpu_enable() method is associated
	 * with multiple matches(), care should be taken that either
	 * the match criteria are mutually exclusive, or that the
	 * method is robust against being called multiple times.
	 */
	const struct arm64_cpu_capabilities *match_list;
	const struct cpumask *cpus;
};
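
/*
 * Illustrative sketch only: a feature-register based, system-wide capability
 * entry could combine the fields above with one of the ARM64_CPUCAP_* types.
 * The capability name, field position/width and minimum value below are
 * hypothetical, and the matcher is assumed to be something like the
 * cpufeature code's has_cpuid_feature():
 *
 *	static const struct arm64_cpu_capabilities example_cap = {
 *		.desc			= "Example feature",
 *		.capability		= ARM64_EXAMPLE_CAP,
 *		.type			= ARM64_CPUCAP_SYSTEM_FEATURE,
 *		.matches		= has_cpuid_feature,
 *		.sys_reg		= SYS_ID_AA64ISAR0_EL1,
 *		.field_pos		= 4,
 *		.field_width		= 4,
 *		.sign			= FTR_UNSIGNED,
 *		.min_field_value	= 1,
 *	};
 */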

static inline int cpucap_default_scope(const struct arm64_cpu_capabilities *cap)
{
	return cap->type & ARM64_CPUCAP_SCOPE_MASK;
}
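
/*
 * The scope encoded in a capability's type determines whether its matches()
 * callback inspects the local CPU's registers (SCOPE_LOCAL_CPU) or the
 * system-wide sanitised view (SCOPE_SYSTEM); see e.g. supports_csv2p3()
 * further down for a helper that follows this convention.
 */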

/*
 * Generic helper for handling capabilities with multiple (match, enable)
 * pairs of callbacks sharing the same capability bit.
 * Iterates over each entry to see if at least one matches.
 */
static inline bool
cpucap_multi_entry_cap_matches(const struct arm64_cpu_capabilities *entry,
			       int scope)
{
	const struct arm64_cpu_capabilities *caps;

	for (caps = entry->match_list; caps->matches; caps++)
		if (caps->matches(caps, scope))
			return true;

	return false;
}
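
/*
 * Illustrative sketch of a capability with multiple (match, enable) pairs
 * sharing one capability bit; all "foo" names are hypothetical. The parent
 * entry's matches() points at the generic helper above, and the match_list
 * is terminated by an empty entry (NULL matches()):
 *
 *	static const struct arm64_cpu_capabilities foo_list[] = {
 *		{ .matches = has_foo_variant_a, .cpu_enable = enable_foo_variant_a },
 *		{ .matches = has_foo_variant_b, .cpu_enable = enable_foo_variant_b },
 *		{},
 *	};
 *
 *	static const struct arm64_cpu_capabilities foo_cap = {
 *		.capability = ARM64_WORKAROUND_FOO,
 *		.type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
 *		.matches = cpucap_multi_entry_cap_matches,
 *		.match_list = foo_list,
 *	};
 */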

static __always_inline bool is_vhe_hyp_code(void)
{
	/* Only defined for code run in VHE hyp context */
	return __is_defined(__KVM_VHE_HYPERVISOR__);
}

static __always_inline bool is_nvhe_hyp_code(void)
{
	/* Only defined for code run in NVHE hyp context */
	return __is_defined(__KVM_NVHE_HYPERVISOR__);
}

static __always_inline bool is_hyp_code(void)
{
	return is_vhe_hyp_code() || is_nvhe_hyp_code();
}

extern DECLARE_BITMAP(system_cpucaps, ARM64_NCAPS);

extern DECLARE_BITMAP(boot_cpucaps, ARM64_NCAPS);

#define for_each_available_cap(cap)		\
	for_each_set_bit(cap, system_cpucaps, ARM64_NCAPS)

bool this_cpu_has_cap(unsigned int cap);
void cpu_set_feature(unsigned int num);
bool cpu_have_feature(unsigned int num);
unsigned long cpu_get_elf_hwcap(void);
unsigned long cpu_get_elf_hwcap2(void);

#define cpu_set_named_feature(name) cpu_set_feature(cpu_feature(name))
#define cpu_have_named_feature(name) cpu_have_feature(cpu_feature(name))
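
/*
 * A short usage sketch (illustrative only), assuming a caller that runs after
 * capabilities have been detected: walk the detected capabilities and query
 * one of them on the current CPU:
 *
 *	unsigned int cap;
 *
 *	for_each_available_cap(cap)
 *		pr_debug("cpucap %u set\n", cap);
 *
 *	if (this_cpu_has_cap(ARM64_HAS_FPSIMD))
 *		...
 */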

static __always_inline bool boot_capabilities_finalized(void)
{
	return alternative_has_cap_likely(ARM64_ALWAYS_BOOT);
}

static __always_inline bool system_capabilities_finalized(void)
{
	return alternative_has_cap_likely(ARM64_ALWAYS_SYSTEM);
}

/*
 * Test for a capability with a runtime check.
 *
 * Before the capability is detected, this returns false.
 */
static __always_inline bool cpus_have_cap(unsigned int num)
{
	if (__builtin_constant_p(num) && !cpucap_is_possible(num))
		return false;
	if (num >= ARM64_NCAPS)
		return false;
	return arch_test_bit(num, system_cpucaps);
}

/*
 * Test for a capability without a runtime check.
 *
 * Before boot capabilities are finalized, this will BUG().
 * After boot capabilities are finalized, this is patched to avoid a runtime
 * check.
 *
 * @num must be a compile-time constant.
 */
static __always_inline bool cpus_have_final_boot_cap(int num)
{
	if (boot_capabilities_finalized())
		return alternative_has_cap_unlikely(num);
	else
		BUG();
}

/*
 * Test for a capability without a runtime check.
 *
 * Before system capabilities are finalized, this will BUG().
 * After system capabilities are finalized, this is patched to avoid a runtime
 * check.
 *
 * @num must be a compile-time constant.
 */
static __always_inline bool cpus_have_final_cap(int num)
{
	if (system_capabilities_finalized())
		return alternative_has_cap_unlikely(num);
	else
		BUG();
}
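
/*
 * Sketch of how the variants above are typically chosen: cpus_have_cap() is
 * safe at any time but always tests the system_cpucaps bitmap, whereas
 * cpus_have_final_cap() is patched into a plain branch and must only run once
 * system_capabilities_finalized() is true:
 *
 *	if (cpus_have_cap(ARM64_HAS_FPSIMD))		// safe during early bring-up
 *		...
 *
 *	if (cpus_have_final_cap(ARM64_HAS_FPSIMD))	// hot path, after patching
 *		...
 */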

static inline int __attribute_const__
cpuid_feature_extract_signed_field_width(u64 features, int field, int width)
{
	return (s64)(features << (64 - width - field)) >> (64 - width);
}

static inline int __attribute_const__
cpuid_feature_extract_signed_field(u64 features, int field)
{
	return cpuid_feature_extract_signed_field_width(features, field, 4);
}

static __always_inline unsigned int __attribute_const__
cpuid_feature_extract_unsigned_field_width(u64 features, int field, int width)
{
	return (u64)(features << (64 - width - field)) >> (64 - width);
}

static __always_inline unsigned int __attribute_const__
cpuid_feature_extract_unsigned_field(u64 features, int field)
{
	return cpuid_feature_extract_unsigned_field_width(features, field, 4);
}
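
/*
 * Worked example of the shift arithmetic above: extracting the 4-bit field at
 * bit position 8 from features = 0x0000000000000a00 gives
 * (features << (64 - 4 - 8)) >> (64 - 4) = 0xa, i.e. the value of bits [11:8].
 * The signed variant differs only in that it sign-extends the top bit of the
 * extracted field.
 */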

/*
 * Fields that identify the version of the Performance Monitors Extension do
 * not follow the standard ID scheme. See ARM DDI 0487E.a page D13-2825,
 * "Alternative ID scheme used for the Performance Monitors Extension version".
 */
static inline u64 __attribute_const__
cpuid_feature_cap_perfmon_field(u64 features, int field, u64 cap)
{
	u64 val = cpuid_feature_extract_unsigned_field(features, field);
	u64 mask = GENMASK_ULL(field + 3, field);

	/* Treat IMPLEMENTATION DEFINED functionality as unimplemented */
	if (val == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
		val = 0;

	if (val > cap) {
		features &= ~mask;
		features |= (cap << field) & mask;
	}

	return features;
}
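
/*
 * For instance, with cap set to the PMUv3p4 version encoding, a PMUVer field
 * that reads as a later version is clamped down to that value, while an
 * IMPLEMENTATION DEFINED PMU (field value 0xf) is reported as not implemented.
 */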

static inline u64 arm64_ftr_mask(const struct arm64_ftr_bits *ftrp)
{
	return (u64)GENMASK(ftrp->shift + ftrp->width - 1, ftrp->shift);
}

static inline u64 arm64_ftr_reg_user_value(const struct arm64_ftr_reg *reg)
{
	return (reg->user_val | (reg->sys_val & reg->user_mask));
}
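
/*
 * E.g. for a field descriptor with .shift = 4 and .width = 4,
 * arm64_ftr_mask() yields GENMASK(7, 4) == 0xf0.
 */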

static inline int __attribute_const__
cpuid_feature_extract_field_width(u64 features, int field, int width, bool sign)
{
	if (WARN_ON_ONCE(!width))
		width = 4;
	return (sign) ?
		cpuid_feature_extract_signed_field_width(features, field, width) :
		cpuid_feature_extract_unsigned_field_width(features, field, width);
}

static inline int __attribute_const__
cpuid_feature_extract_field(u64 features, int field, bool sign)
{
	return cpuid_feature_extract_field_width(features, field, 4, sign);
}

static inline s64 arm64_ftr_value(const struct arm64_ftr_bits *ftrp, u64 val)
{
	return (s64)cpuid_feature_extract_field_width(val, ftrp->shift, ftrp->width, ftrp->sign);
}

static inline bool id_aa64mmfr0_mixed_endian_el0(u64 mmfr0)
{
	return cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_EL1_BIGEND_SHIFT) == 0x1 ||
		cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_EL1_BIGENDEL0_SHIFT) == 0x1;
}

static inline bool id_aa64pfr0_32bit_el1(u64 pfr0)
{
	u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_EL1_EL1_SHIFT);

	return val == ID_AA64PFR0_EL1_ELx_32BIT_64BIT;
}

static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)
{
	u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_EL1_EL0_SHIFT);

	return val == ID_AA64PFR0_EL1_ELx_32BIT_64BIT;
}

static inline bool id_aa64pfr0_sve(u64 pfr0)
{
	u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_EL1_SVE_SHIFT);

	return val > 0;
}

static inline bool id_aa64pfr1_sme(u64 pfr1)
{
	u32 val = cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_EL1_SME_SHIFT);

	return val > 0;
}

static inline bool id_aa64pfr1_mte(u64 pfr1)
{
	u32 val = cpuid_feature_extract_unsigned_field(pfr1, ID_AA64PFR1_EL1_MTE_SHIFT);

	return val >= ID_AA64PFR1_EL1_MTE_MTE2;
}

void __init setup_boot_cpu_features(void);
void __init setup_system_features(void);
void __init setup_user_features(void);

void check_local_cpu_capabilities(void);

u64 read_sanitised_ftr_reg(u32 id);
u64 __read_sysreg_by_encoding(u32 sys_id);

static inline bool cpu_supports_mixed_endian_el0(void)
{
	return id_aa64mmfr0_mixed_endian_el0(read_cpuid(ID_AA64MMFR0_EL1));
}
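
/*
 * A typical caller pattern (sketch): read the system-wide sanitised copy of an
 * ID register and extract one field from it, as the system_supports_*()
 * helpers below do:
 *
 *	u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
 *	u32 parange = cpuid_feature_extract_unsigned_field(mmfr0,
 *					ID_AA64MMFR0_EL1_PARANGE_SHIFT);
 */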

static inline bool supports_csv2p3(int scope)
{
	u64 pfr0;
	u8 csv2_val;

	if (scope == SCOPE_LOCAL_CPU)
		pfr0 = read_sysreg_s(SYS_ID_AA64PFR0_EL1);
	else
		pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);

	csv2_val = cpuid_feature_extract_unsigned_field(pfr0,
							ID_AA64PFR0_EL1_CSV2_SHIFT);
	return csv2_val == 3;
}

static inline bool supports_clearbhb(int scope)
{
	u64 isar2;

	if (scope == SCOPE_LOCAL_CPU)
		isar2 = read_sysreg_s(SYS_ID_AA64ISAR2_EL1);
	else
		isar2 = read_sanitised_ftr_reg(SYS_ID_AA64ISAR2_EL1);

	return cpuid_feature_extract_unsigned_field(isar2,
						    ID_AA64ISAR2_EL1_CLRBHB_SHIFT);
}

const struct cpumask *system_32bit_el0_cpumask(void);

DECLARE_STATIC_KEY_FALSE(arm64_mismatched_32bit_el0);

static inline bool system_supports_32bit_el0(void)
{
	u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);

	return static_branch_unlikely(&arm64_mismatched_32bit_el0) ||
	       id_aa64pfr0_32bit_el0(pfr0);
}

static inline bool system_supports_4kb_granule(void)
{
	u64 mmfr0;
	u32 val;

	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
	val = cpuid_feature_extract_unsigned_field(mmfr0,
						ID_AA64MMFR0_EL1_TGRAN4_SHIFT);

	return (val >= ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MIN) &&
	       (val <= ID_AA64MMFR0_EL1_TGRAN4_SUPPORTED_MAX);
}

static inline bool system_supports_64kb_granule(void)
{
	u64 mmfr0;
	u32 val;

	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
	val = cpuid_feature_extract_unsigned_field(mmfr0,
						ID_AA64MMFR0_EL1_TGRAN64_SHIFT);

	return (val >= ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MIN) &&
	       (val <= ID_AA64MMFR0_EL1_TGRAN64_SUPPORTED_MAX);
}

static inline bool system_supports_16kb_granule(void)
{
	u64 mmfr0;
	u32 val;

	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
	val = cpuid_feature_extract_unsigned_field(mmfr0,
						ID_AA64MMFR0_EL1_TGRAN16_SHIFT);

	return (val >= ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MIN) &&
	       (val <= ID_AA64MMFR0_EL1_TGRAN16_SUPPORTED_MAX);
}

static inline bool system_supports_mixed_endian_el0(void)
{
	return id_aa64mmfr0_mixed_endian_el0(read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1));
}

static inline bool system_supports_mixed_endian(void)
{
	u64 mmfr0;
	u32 val;

	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
	val = cpuid_feature_extract_unsigned_field(mmfr0,
						ID_AA64MMFR0_EL1_BIGEND_SHIFT);

	return val == 0x1;
}

static __always_inline bool system_supports_fpsimd(void)
{
	return alternative_has_cap_likely(ARM64_HAS_FPSIMD);
}
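
/*
 * One plausible caller pattern (sketch only): use this as a guard before
 * touching FP/SIMD state from kernel context:
 *
 *	if (system_supports_fpsimd()) {
 *		kernel_neon_begin();
 *		...
 *		kernel_neon_end();
 *	}
 */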
|
|
|
|
|
arm64: sdei: explicitly simulate PAN/UAO entry
In preparation for removing addr_limit and set_fs() we must decouple the
SDEI PAN/UAO manipulation from the uaccess code, and explicitly
reinitialize these as required.
SDEI enters the kernel with a non-architectural exception, and prior to
the most recent revision of the specification (ARM DEN 0054B), PSTATE
bits (e.g. PAN, UAO) are not manipulated in the same way as for
architectural exceptions. Notably, older versions of the spec can be
read ambiguously as to whether PSTATE bits are inherited unchanged from
the interrupted context or whether they are generated from scratch, with
TF-A doing the latter.
We have three cases to consider:
1) The existing TF-A implementation of SDEI will clear PAN and clear UAO
(along with other bits in PSTATE) when delivering an SDEI exception.
2) In theory, implementations of SDEI prior to revision B could inherit
PAN and UAO (along with other bits in PSTATE) unchanged from the
interrupted context. However, in practice such implementations do not
exist.
3) Going forward, new implementations of SDEI must clear UAO, and
depending on SCTLR_ELx.SPAN must either inherit or set PAN.
As we can ignore (2) we can assume that upon SDEI entry, UAO is always
clear, though PAN may be clear, inherited, or set per SCTLR_ELx.SPAN.
Therefore, we must explicitly initialize PAN, but do not need to do
anything for UAO.
Considering what we need to do:
* When set_fs() is removed, force_uaccess_begin() will have no HW
side-effects. As this only clears UAO, which we can assume has already
been cleared upon entry, this is not a problem. We do not need to add
code to manipulate UAO explicitly.
* PAN may be cleared upon entry (in case 1 above), so where a kernel is
built to use PAN and this is supported by all CPUs, the kernel must
set PAN upon entry to ensure expected behaviour.
* PAN may be inherited from the interrupted context (in case 3 above),
and so where a kernel is not built to use PAN or where PAN support is
not uniform across CPUs, the kernel must clear PAN to ensure expected
behaviour.
This patch reworks the SDEI code accordingly, explicitly setting PAN to
the expected state in all cases. To cater for the cases where the kernel
does not use PAN or this is not uniformly supported by hardware we add a
new cpu_has_pan() helper which can be used regardless of whether the
kernel is built to use PAN.
The existing system_uses_ttbr0_pan() is redefined in terms of
system_uses_hw_pan() both for clarity and as a minor optimization when
HW PAN is not selected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-12-02 13:15:48 +00:00
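A minimal sketch of the rework described above, assuming the cpu_has_pan() helper introduced by this patch and the existing set_pstate_pan() accessor; example_sdei_fixup_pan() is a hypothetical name and the actual handler code may differ in detail:

static void example_sdei_fixup_pan(void)
{
	if (system_uses_hw_pan())
		set_pstate_pan(1);	/* kernel uses PAN on all CPUs: force it set */
	else if (cpu_has_pan())
		set_pstate_pan(0);	/* PAN present but unused by the kernel: force it clear */
}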
|
|
|
static inline bool system_uses_hw_pan(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_PAN
In system_uses_hw_pan() we use cpus_have_const_cap() to check for
ARM64_HAS_PAN, but this is only necessary so that the
system_uses_ttbr0_pan() check in setup_cpu_features() can run prior to
alternatives being patched, and otherwise this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_HAS_PAN cpucap is used by system_uses_hw_pan() and
system_uses_ttbr0_pan() depending on whether CONFIG_ARM64_SW_TTBR0_PAN
is selected, and:
* We only use system_uses_hw_pan() directly in __sdei_handler(), which
isn't reachable until after alternatives have been patched, and for
this it is safe to use alternative_has_cap_*().
* We use system_uses_ttbr0_pan() in a few places:
- In check_and_switch_context() and cpu_uninstall_idmap(), which will
defer installing a translation table into TTBR0 when the
ARM64_HAS_PAN cpucap is not detected.
Prior to patching alternatives, all CPUs will be using init_mm with
the reserved ttbr0 translation tables installed in TTBR0, so these can
safely use alternative_has_cap_*().
- In update_saved_ttbr0(), which will only save the active TTBR0 into
a per-thread variable when the ARM64_HAS_PAN cpucap is not detected.
Prior to patching alternatives, all CPUs will be using init_mm with
the reserved ttbr0 translation tables installed in TTBR0, so these can
safely use alternative_has_cap_*().
- In efi_set_pgd(), which will handle check_and_switch_context()
deferring the installation of TTBR0 when TTBR0 PAN is detected.
The EFI runtime services are not initialized until after
alternatives have been patched, and so this can safely use
alternative_has_cap_*() or cpus_have_final_cap().
- In uaccess_ttbr0_disable() and uaccess_ttbr0_enable(), where we'll
avoid installing/uninstalling a translation table in TTBR0 when
ARM64_HAS_PAN is detected.
Prior to patching alternatives we will not perform any uaccess and
will not call uaccess_ttbr0_disable() or uaccess_ttbr0_enable(), and
so these can safely use alternative_has_cap_*() or
cpus_have_final_cap().
- In is_el1_permission_fault() where we will consider a translation
fault on a TTBR0 address to be a permission fault when ARM64_HAS_PAN
is not detected *and* we have set the PAN bit in the SPSR (which
tells us that in the interrupted context, TTBR0 pointed at the
reserved zero ttbr).
In the window between detecting system cpucaps and patching
alternatives we should not perform any accesses to TTBR0 addresses,
and no userspace translation tables exist until after patching
alternatives. Thus it is safe for this to use alternative_has_cap_*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
So that the check for TTBR0 PAN in setup_cpu_features() can run prior to
alternatives being patched, the call to system_uses_ttbr0_pan() is
replaced with an explicit check of the ARM64_HAS_PAN bit in the
system_cpucaps bitmap.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:44 +00:00
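A minimal sketch of the explicit bitmap check mentioned above (example_report_sw_pan() is a hypothetical wrapper, not the exact hunk), so that the TTBR0 PAN check in setup_cpu_features() can run before alternatives are patched:

static void __init example_report_sw_pan(void)
{
	if (IS_ENABLED(CONFIG_ARM64_SW_TTBR0_PAN) &&
	    !cpus_have_cap(ARM64_HAS_PAN)) {
		/* ... report that TTBR0-switching PAN emulation will be used ... */
	}
}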
|
|
|
return alternative_has_cap_unlikely(ARM64_HAS_PAN);
|
arm64: sdei: explicitly simulate PAN/UAO entry
In preparation for removing addr_limit and set_fs() we must decouple the
SDEI PAN/UAO manipulation from the uaccess code, and explicitly
reinitialize these as required.
SDEI enters the kernel with a non-architectural exception, and prior to
the most recent revision of the specification (ARM DEN 0054B), PSTATE
bits (e.g. PAN, UAO) are not manipulated in the same way as for
architectural exceptions. Notably, older versions of the spec can be
read ambiguously as to whether PSTATE bits are inherited unchanged from
the interrupted context or whether they are generated from scratch, with
TF-A doing the latter.
We have three cases to consider:
1) The existing TF-A implementation of SDEI will clear PAN and clear UAO
(along with other bits in PSTATE) when delivering an SDEI exception.
2) In theory, implementations of SDEI prior to revision B could inherit
PAN and UAO (along with other bits in PSTATE) unchanged from the
interrupted context. However, in practice such implementations do not
exist.
3) Going forward, new implementations of SDEI must clear UAO, and
depending on SCTLR_ELx.SPAN must either inherit or set PAN.
As we can ignore (2) we can assume that upon SDEI entry, UAO is always
clear, though PAN may be clear, inherited, or set per SCTLR_ELx.SPAN.
Therefore, we must explicitly initialize PAN, but do not need to do
anything for UAO.
Considering what we need to do:
* When set_fs() is removed, force_uaccess_begin() will have no HW
side-effects. As this only clears UAO, which we can assume has already
been cleared upon entry, this is not a problem. We do not need to add
code to manipulate UAO explicitly.
* PAN may be cleared upon entry (in case 1 above), so where a kernel is
built to use PAN and this is supported by all CPUs, the kernel must
set PAN upon entry to ensure expected behaviour.
* PAN may be inherited from the interrupted context (in case 3 above),
and so where a kernel is not built to use PAN or where PAN support is
not uniform across CPUs, the kernel must clear PAN to ensure expected
behaviour.
This patch reworks the SDEI code accordingly, explicitly setting PAN to
the expected state in all cases. To cater for the cases where the kernel
does not use PAN or this is not uniformly supported by hardware we add a
new cpu_has_pan() helper which can be used regardless of whether the
kernel is built to use PAN.
The existing system_uses_ttbr0_pan() is redefined in terms of
system_uses_hw_pan() both for clarity and as a minor optimization when
HW PAN is not selected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-12-02 13:15:48 +00:00
|
|
|
}
|
|
|
|
|
2016-07-01 15:53:00 +00:00
|
|
|
static inline bool system_uses_ttbr0_pan(void)
|
|
|
|
{
|
|
|
|
return IS_ENABLED(CONFIG_ARM64_SW_TTBR0_PAN) &&
|
arm64: sdei: explicitly simulate PAN/UAO entry
In preparation for removing addr_limit and set_fs() we must decouple the
SDEI PAN/UAO manipulation from the uaccess code, and explicitly
reinitialize these as required.
SDEI enters the kernel with a non-architectural exception, and prior to
the most recent revision of the specification (ARM DEN 0054B), PSTATE
bits (e.g. PAN, UAO) are not manipulated in the same way as for
architectural exceptions. Notably, older versions of the spec can be
read ambiguously as to whether PSTATE bits are inherited unchanged from
the interrupted context or whether they are generated from scratch, with
TF-A doing the latter.
We have three cases to consider:
1) The existing TF-A implementation of SDEI will clear PAN and clear UAO
(along with other bits in PSTATE) when delivering an SDEI exception.
2) In theory, implementations of SDEI prior to revision B could inherit
PAN and UAO (along with other bits in PSTATE) unchanged from the
interrupted context. However, in practice such implementations do not
exist.
3) Going forward, new implementations of SDEI must clear UAO, and
depending on SCTLR_ELx.SPAN must either inherit or set PAN.
As we can ignore (2) we can assume that upon SDEI entry, UAO is always
clear, though PAN may be clear, inherited, or set per SCTLR_ELx.SPAN.
Therefore, we must explicitly initialize PAN, but do not need to do
anything for UAO.
Considering what we need to do:
* When set_fs() is removed, force_uaccess_begin() will have no HW
side-effects. As this only clears UAO, which we can assume has already
been cleared upon entry, this is not a problem. We do not need to add
code to manipulate UAO explicitly.
* PAN may be cleared upon entry (in case 1 above), so where a kernel is
built to use PAN and this is supported by all CPUs, the kernel must
set PAN upon entry to ensure expected behaviour.
* PAN may be inherited from the interrupted context (in case 3 above),
and so where a kernel is not built to use PAN or where PAN support is
not uniform across CPUs, the kernel must clear PAN to ensure expected
behaviour.
This patch reworks the SDEI code accordingly, explicitly setting PAN to
the expected state in all cases. To cater for the cases where the kernel
does not use PAN or this is not uniformly supported by hardware we add a
new cpu_has_pan() helper which can be used regardless of whether the
kernel is built to use PAN.
The existing system_uses_ttbr0_pan() is redefined in terms of
system_uses_hw_pan() both for clarity and as a minor optimization when
HW PAN is not selected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-12-02 13:15:48 +00:00
|
|
|
!system_uses_hw_pan();
|
2016-07-01 15:53:00 +00:00
|
|
|
}
|
|
|
|
|
2020-02-20 16:58:39 +00:00
|
|
|
static __always_inline bool system_supports_sve(void)
|
2017-10-31 15:51:02 +00:00
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64}
In system_supports_{sve,sme,sme2,fa64}() we use cpus_have_const_cap() to
check for the relevant cpucaps, but this is only necessary so that
sve_setup() and sme_setup() can run prior to alternatives being patched,
and otherwise alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
All of system_supports_{sve,sme,sme2,fa64}() will return false prior to
system cpucaps being detected. In the window between system cpucaps being
detected and patching alternatives, we need system_supports_sve() and
system_supports_sme() to run to initialize SVE and SME properties, but
all other users of system_supports_{sve,sme,sme2,fa64}() don't depend on
the relevant cpucap becoming true until alternatives are patched:
* No KVM code runs until after alternatives are patched, and so this can
safely use cpus_have_final_cap() or alternative_has_cap_*().
* The cpuid_cpu_online() callback in arch/arm64/kernel/cpuinfo.c is
registered later from cpuinfo_regs_init() as a device_initcall, and so
this can safely use cpus_have_final_cap() or alternative_has_cap_*().
* The entry, signal, and ptrace code isn't reachable until userspace has
run, and so this can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* Currently perf_reg_validate() will un-reserve the PERF_REG_ARM64_VG
pseudo-register before alternatives are patched, and before
sve_setup() has run. If a sampling event is created early enough, this
would allow perf_ext_reg_value() to sample (the as-yet uninitialized)
thread_struct::vl[] prior to alternatives being patched.
It would be preferable to defer this until alternatives are patched,
and this can safely use alternative_has_cap_*().
* The context-switch code will run during this window as part of
stop_machine() used during alternatives_patch_all(), and potentially
for other work if other kernel threads are created early. No threads
require the use of SVE/SME/SME2/FA64 prior to alternatives being
patched, and it would be preferable for the related context-switch
logic to take effect after alternatives are patched so that this is
guaranteed to see a consistent system-wide state (e.g. anything
initialized by sve_setup() and sme_setup()).
This can safely use alternative_has_cap_*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. The sve_setup() and sme_setup() functions are modified to
use cpus_have_cap() directly so that they can observe the cpucaps being
set prior to alternatives being patched.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:52 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_SVE);
|
2017-10-31 15:51:02 +00:00
|
|
|
}
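A minimal sketch of the constraint described in the commit message above: sve_setup() runs after system cpucaps are detected but before alternatives are patched, so it must read the bitmap directly rather than rely on the patched helper (the body here is only a placeholder):

void __init sve_setup(void)
{
	if (!cpus_have_cap(ARM64_SVE))
		return;

	/* ... probe and record the system-wide supported vector lengths ... */
}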
|
|
|
|
|
2022-04-19 11:22:16 +00:00
|
|
|
static __always_inline bool system_supports_sme(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64}
In system_supports_{sve,sme,sme2,fa64}() we use cpus_have_const_cap() to
check for the relevant cpucaps, but this is only necessary so that
sve_setup() and sme_setup() can run prior to alternatives being patched,
and otherwise alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
All of system_supports_{sve,sme,sme2,fa64}() will return false prior to
system cpucaps being detected. In the window between system cpucaps being
detected and patching alternatives, we need system_supports_sve() and
system_supports_sme() to run to initialize SVE and SME properties, but
all other users of system_supports_{sve,sme,sme2,fa64}() don't depend on
the relevant cpucap becoming true until alternatives are patched:
* No KVM code runs until after alternatives are patched, and so this can
safely use cpus_have_final_cap() or alternative_has_cap_*().
* The cpuid_cpu_online() callback in arch/arm64/kernel/cpuinfo.c is
registered later from cpuinfo_regs_init() as a device_initcall, and so
this can safely use cpus_have_final_cap() or alternative_has_cap_*().
* The entry, signal, and ptrace code isn't reachable until userspace has
run, and so this can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* Currently perf_reg_validate() will un-reserve the PERF_REG_ARM64_VG
pseudo-register before alternatives are patched, and before
sve_setup() has run. If a sampling event is created early enough, this
would allow perf_ext_reg_value() to sample (the as-yet uninitialized)
thread_struct::vl[] prior to alternatives being patched.
It would be preferable to defer this until alternatives are patched,
and this can safely use alternative_has_cap_*().
* The context-switch code will run during this window as part of
stop_machine() used during alternatives_patch_all(), and potentially
for other work if other kernel threads are created early. No threads
require the use of SVE/SME/SME2/FA64 prior to alternatives being
patched, and it would be preferable for the related context-switch
logic to take effect after alternatives are patched so that this is
guaranteed to see a consistent system-wide state (e.g. anything
initialized by sve_setup() and sme_setup()).
This can safely use alternative_has_cap_*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. The sve_setup() and sme_setup() functions are modified to
use cpus_have_cap() directly so that they can observe the cpucaps being
set prior to alternatives being patched.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:52 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_SME);
|
2022-04-19 11:22:16 +00:00
|
|
|
}
|
|
|
|
|
2023-01-16 16:04:43 +00:00
|
|
|
static __always_inline bool system_supports_sme2(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64}
In system_supports_{sve,sme,sme2,fa64}() we use cpus_have_const_cap() to
check for the relevant cpucaps, but this is only necessary so that
sve_setup() and sme_setup() can run prior to alternatives being patched,
and otherwise alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
All of system_supports_{sve,sme,sme2,fa64}() will return false prior to
system cpucaps being detected. In the window between system cpucaps being
detected and patching alternatives, we need system_supports_sve() and
system_supports_sme() to run to initialize SVE and SME properties, but
all other users of system_supports_{sve,sme,sme2,fa64}() don't depend on
the relevant cpucap becoming true until alternatives are patched:
* No KVM code runs until after alternatives are patched, and so this can
safely use cpus_have_final_cap() or alternative_has_cap_*().
* The cpuid_cpu_online() callback in arch/arm64/kernel/cpuinfo.c is
registered later from cpuinfo_regs_init() as a device_initcall, and so
this can safely use cpus_have_final_cap() or alternative_has_cap_*().
* The entry, signal, and ptrace code isn't reachable until userspace has
run, and so this can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* Currently perf_reg_validate() will un-reserve the PERF_REG_ARM64_VG
pseudo-register before alternatives are patched, and before
sve_setup() has run. If a sampling event is created early enough, this
would allow perf_ext_reg_value() to sample (the as-yet uninitialized)
thread_struct::vl[] prior to alternatives being patched.
It would be preferable to defer this until alternatives are patched,
and this can safely use alternative_has_cap_*().
* The context-switch code will run during this window as part of
stop_machine() used during alternatives_patch_all(), and potentially
for other work if other kernel threads are created early. No threads
require the use of SVE/SME/SME2/FA64 prior to alternatives being
patched, and it would be preferable for the related context-switch
logic to take effect after alternatives are patched so that this is
guaranteed to see a consistent system-wide state (e.g. anything
initialized by sve_setup() and sme_setup()).
This can safely use alternative_has_cap_*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. The sve_setup() and sme_setup() functions are modified to
use cpus_have_cap() directly so that they can observe the cpucaps being
set prior to alternatives being patched.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:52 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_SME2);
|
2023-01-16 16:04:43 +00:00
|
|
|
}
|
|
|
|
|
2022-04-19 11:22:16 +00:00
|
|
|
static __always_inline bool system_supports_fa64(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64}
In system_supports_{sve,sme,sme2,fa64}() we use cpus_have_const_cap() to
check for the relevant cpucaps, but this is only necessary so that
sve_setup() and sme_setup() can run prior to alternatives being patched,
and otherwise alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
All of system_supports_{sve,sme,sme2,fa64}() will return false prior to
system cpucaps being detected. In the window between system cpucaps being
detected and patching alternatives, we need system_supports_sve() and
system_supports_sme() to run to initialize SVE and SME properties, but
all other users of system_supports_{sve,sme,sme2,fa64}() don't depend on
the relevant cpucap becoming true until alternatives are patched:
* No KVM code runs until after alternatives are patched, and so this can
safely use cpus_have_final_cap() or alternative_has_cap_*().
* The cpuid_cpu_online() callback in arch/arm64/kernel/cpuinfo.c is
registered later from cpuinfo_regs_init() as a device_initcall, and so
this can safely use cpus_have_final_cap() or alternative_has_cap_*().
* The entry, signal, and ptrace code isn't reachable until userspace has
run, and so this can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* Currently perf_reg_validate() will un-reserve the PERF_REG_ARM64_VG
pseudo-register before alternatives are patched, and before
sve_setup() has run. If a sampling event is created early enough, this
would allow perf_ext_reg_value() to sample (the as-yet uninitialized)
thread_struct::vl[] prior to alternatives being patched.
It would be preferable to defer this until alternatives are patched,
and this can safely use alternative_has_cap_*().
* The context-switch code will run during this window as part of
stop_machine() used during alternatives_patch_all(), and potentially
for other work if other kernel threads are created early. No threads
require the use of SVE/SME/SME2/FA64 prior to alternatives being
patched, and it would be preferable for the related context-switch
logic to take effect after alternatives are patched so that this is
guaranteed to see a consistent system-wide state (e.g. anything
initialized by sve_setup() and sme_setup()).
This can safely use alternative_has_cap_*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. The sve_setup() and sme_setup() functions are modified to
use cpus_have_cap() directly so that they can observe the cpucaps being
set prior to alternatives being patched.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:52 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_SME_FA64);
|
2022-04-19 11:22:16 +00:00
|
|
|
}
|
|
|
|
|
2022-04-19 11:22:20 +00:00
|
|
|
static __always_inline bool system_supports_tpidr2(void)
|
|
|
|
{
|
|
|
|
return system_supports_sme();
|
|
|
|
}
|
|
|
|
|
2024-03-06 23:14:48 +00:00
|
|
|
static __always_inline bool system_supports_fpmr(void)
|
|
|
|
{
|
|
|
|
return alternative_has_cap_unlikely(ARM64_HAS_FPMR);
|
|
|
|
}
|
|
|
|
|
2020-02-20 16:58:37 +00:00
|
|
|
static __always_inline bool system_supports_cnp(void)
|
2018-07-31 13:08:56 +00:00
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_CNP
In system_supports_cnp() we use cpus_have_const_cap() to check for
ARM64_HAS_CNP, but this is only necessary so that the cpu_enable_cnp()
callback can run prior to alternatives being patched, and otherwise this
is not necessary and alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The cpu_enable_cnp() callback is run immediately after the ARM64_HAS_CNP
cpucap is detected system-wide under setup_system_capabilities(), prior
to alternatives being patched. During this window cpu_enable_cnp() uses
cpu_replace_ttbr1() to set the CNP bit for the swapper_pg_dir in TTBR1.
No other users of the ARM64_HAS_CNP cpucap need the up-to-date value
during this window:
* As KVM isn't initialized yet, kvm_get_vttbr() isn't reachable.
* As cpuidle isn't initialized yet, __cpu_suspend_exit() isn't
reachable.
* At this point all CPUs are using the swapper_pg_dir with a reserved
ASID in TTBR1, and the idmap_pg_dir in TTBR0, so neither
check_and_switch_context() nor cpu_do_switch_mm() need to do anything
special.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime. To allow cpu_enable_cnp() to function prior to alternatives
being patched, cpu_replace_ttbr1() is split into cpu_replace_ttbr1() and
cpu_enable_swapper_cnp(), with the former only used for early TTBR1
replacement, and the latter used by both cpu_enable_cnp() and
__cpu_suspend_exit().
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:41 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_HAS_CNP);
|
2018-07-31 13:08:56 +00:00
|
|
|
}
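A minimal sketch using only the helpers named in the commit message above; the actual callback may differ in detail:

static void cpu_enable_cnp(const struct arm64_cpu_capabilities *cap)
{
	/* Re-install swapper_pg_dir with the CNP bit set; safe before alternatives are patched. */
	cpu_enable_swapper_cnp();
}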
|
|
|
|
|
2018-12-07 18:39:24 +00:00
|
|
|
static inline bool system_supports_address_auth(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_{ADDRESS,GENERIC}_AUTH
In system_supports_address_auth() and system_supports_generic_auth() we
use cpus_have_const_cap to check for ARM64_HAS_ADDRESS_AUTH and
ARM64_HAS_GENERIC_AUTH respectively, but this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_HAS_ADDRESS_AUTH cpucap is a boot cpu feature which is
detected and patched early on the boot CPU before any pointer
authentication keys are enabled via their respective SCTLR_ELx.EN* bits.
Nothing which uses system_supports_address_auth() is called before the
boot alternatives are patched. Thus it is safe for
system_supports_address_auth() to use cpus_have_final_boot_cap() to
check for ARM64_HAS_ADDRESS_AUTH.
The ARM64_HAS_GENERIC_AUTH cpucap is a system feature which is detected
on all CPUs, then finalized and patched under
setup_system_capabilities(). We use system_supports_generic_auth() in a
few places:
* The pac_generic_keys_get() and pac_generic_keys_set() functions are
only reachable from system calls once userspace is up and running. As
cpucaps are finalized long before userspace runs, these can safely use
alternative_has_cap_*() or cpus_have_final_cap().
* The ptrauth_prctl_reset_keys() function is only reachable from system
calls once userspace is up and running. As cpucaps are finalized long
before userspace runs, this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
* The ptrauth_keys_install_user() function is used during
context-switch. This is called prior to alternatives being applied,
and so cannot use cpus_have_final_cap(), but as this only needs to
switch the APGA key for userspace tasks, it's safe to use
alternative_has_cap_*().
* The ptrauth_keys_init_user() function is used to initialize userspace
keys, and is only reachable after system cpucaps have been finalized
and patched. Thus this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
* The system_has_full_ptr_auth() helper function is only used by KVM
code, which is only reachable after system cpucaps have been finalized
and patched. Thus this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
This patch modifies system_supports_address_auth() to use
cpus_have_final_boot_cap() to check ARM64_HAS_ADDRESS_AUTH, and modifies
system_supports_generic_auth() to use alternative_has_cap_unlikely() to
check ARM64_HAS_GENERIC_AUTH. In either case this will avoid generating
code to test the system_cpucaps bitmap and should be better for all
subsequent calls at runtime. The use of cpus_have_final_boot_cap() will
make it easier to spot if code is changed such that these run before
the relevant cpucap is guaranteed to have been finalized.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:37 +00:00
|
|
|
return cpus_have_final_boot_cap(ARM64_HAS_ADDRESS_AUTH);
|
2018-12-07 18:39:24 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool system_supports_generic_auth(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_{ADDRESS,GENERIC}_AUTH
In system_supports_address_auth() and system_supports_generic_auth() we
use cpus_have_const_cap to check for ARM64_HAS_ADDRESS_AUTH and
ARM64_HAS_GENERIC_AUTH respectively, but this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_HAS_ADDRESS_AUTH cpucap is a boot cpu feature which is
detected and patched early on the boot CPU before any pointer
authentication keys are enabled via their respective SCTLR_ELx.EN* bits.
Nothing which uses system_supports_address_auth() is called before the
boot alternatives are patched. Thus it is safe for
system_supports_address_auth() to use cpus_have_final_boot_cap() to
check for ARM64_HAS_ADDRESS_AUTH.
The ARM64_HAS_GENERIC_AUTH cpucap is a system feature which is detected
on all CPUs, then finalized and patched under
setup_system_capabilities(). We use system_supports_generic_auth() in a
few places:
* The pac_generic_keys_get() and pac_generic_keys_set() functions are
only reachable from system calls once userspace is up and running. As
cpucaps are finalized long before userspace runs, these can safely use
alternative_has_cap_*() or cpus_have_final_cap().
* The ptrauth_prctl_reset_keys() function is only reachable from system
calls once userspace is up and running. As cpucaps are finalized long
before userspace runs, this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
* The ptrauth_keys_install_user() function is used during
context-switch. This is called prior to alternatives being applied,
and so cannot use cpus_have_final_cap(), but as this only needs to
switch the APGA key for userspace tasks, it's safe to use
alternative_has_cap_*().
* The ptrauth_keys_init_user() function is used to initialize userspace
keys, and is only reachable after system cpucaps have been finalized
and patched. Thus this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
* The system_has_full_ptr_auth() helper function is only used by KVM
code, which is only reachable after system cpucaps have been finalized
and patched. Thus this can safely use alternative_has_cap_*() or
cpus_have_final_cap().
This patch modifies system_supports_address_auth() to use
cpus_have_final_boot_cap() to check ARM64_HAS_ADDRESS_AUTH, and modifies
system_supports_generic_auth() to use alternative_has_cap_unlikely() to
check ARM64_HAS_GENERIC_AUTH. In either case this will avoid generating
code to test the system_cpucaps bitmap and should be better for all
subsequent calls at runtime. The use of cpus_have_final_boot_cap() will
make it easier to spot if code is changed such that these run before
the relevant cpucap is guaranteed to have been finalized.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:37 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_HAS_GENERIC_AUTH);
|
2018-12-07 18:39:24 +00:00
|
|
|
}
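As a rough illustration of the context-switch reasoning above (example_install_user_keys() is a hypothetical helper with a placeholder body), only the APGA key install depends on generic authentication, so an alternative-patched check is sufficient on that path:

static inline void example_install_user_keys(struct ptrauth_keys_user *keys)
{
	if (system_supports_generic_auth()) {
		/* ... write the task's APGA key registers ... */
	}
}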
|
|
|
|
|
2020-11-18 19:44:01 +00:00
|
|
|
static inline bool system_has_full_ptr_auth(void)
|
|
|
|
{
|
|
|
|
return system_supports_address_auth() && system_supports_generic_auth();
|
|
|
|
}
|
|
|
|
|
2020-06-18 17:12:54 +00:00
|
|
|
static __always_inline bool system_uses_irq_prio_masking(void)
|
2019-01-31 14:58:42 +00:00
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING
In system_uses_irq_prio_masking() we use cpus_have_const_cap() to check
for ARM64_HAS_GIC_PRIO_MASKING, but this is not necessary and
alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
When CONFIG_ARM64_PSEUDO_NMI=y the ARM64_HAS_GIC_PRIO_MASKING cpucap is
a strict boot cpu feature which is detected and patched early on the
boot cpu, which both happen in smp_prepare_boot_cpu(). In the window
between the ARM64_HAS_GIC_PRIO_MASKING cpucap is detected and
alternatives are patched we don't run any code that depends upon the
ARM64_HAS_GIC_PRIO_MASKING cpucap:
* We leave DAIF.IF set until after boot alternatives are patched, and
interrupts are unmasked later in init_IRQ(), so we cannot reach
IRQ/FIQ entry code and will not use irqs_priority_unmasked().
* We don't call any code which uses arm_cpuidle_save_irq_context() and
arm_cpuidle_restore_irq_context() during this window.
* We don't call start_thread_common() during this window.
* The local_irq_*() code in <asm/irqflags.h> depends solely on an
alternative branch since commit:
a5f61cc636f48bdf ("arm64: irqflags: use alternative branches for pseudo-NMI logic")
... and hence will use the default (DAIF-only) masking behaviour until
alternatives are patched.
* Secondary CPUs are brought up later after alternatives are patched,
and alternatives are patched on the boot CPU immediately prior to
calling init_gic_priority_masking(), so we'll correctly initialize
interrupt masking regardless.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which avoids generating code to test the
system_cpucaps bitmap and should be better for all subsequent calls at
runtime. As this makes system_uses_irq_prio_masking() equivalent to
__irqflags_uses_pmr(), the latter is removed and replaced with the
former for consistency.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:43 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_HAS_GIC_PRIO_MASKING);
|
2019-01-31 14:58:42 +00:00
|
|
|
}
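An illustrative guard only, assuming the standard gic_write_pmr() accessor (example_unmask_pmr() is a hypothetical wrapper): any code that touches ICC_PMR_EL1 for pseudo-NMI support is expected to be conditional on this capability:

static void example_unmask_pmr(void)
{
	if (system_uses_irq_prio_masking())
		gic_write_pmr(GIC_PRIO_IRQON);	/* allow IRQs at the PMR level */
}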
|
|
|
|
|
2019-09-06 09:58:01 +00:00
|
|
|
static inline bool system_supports_mte(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_MTE
In system_supports_mte() we use cpus_have_const_cap() to check for
ARM64_MTE, but this is not necessary and cpus_have_final_boot_cap()
would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
The ARM64_MTE cpucap is a boot cpu feature which is detected and patched
early on the boot CPU under smp_prepare_boot_cpu(). In the window
between detecting the ARM64_MTE cpucap and patching alternatives,
nothing depends on the ARM64_MTE cpucap:
* The kasan_hw_tags_enabled() helper depends upon the kasan_flag_enabled
static key, which is initialized later in kasan_init_hw_tags() after
alternatives have been applied.
* No KVM code is called during this window, and KVM is not initialized
until after system cpucaps have been detected and patched. KVM code
can safely use cpus_have_final_cap() or alternative_has_cap_*().
* We don't context-switch prior to patching boot alternatives, and thus
mte_thread_switch() is not reachable during this window. Thus, we can
safely use cpus_have_final_boot_cap() or alternative_has_cap_*() in
the context-switch code.
* IRQ and FIQ are masked during this window, and we can only take SError
and Debug exceptions. SError exceptions are fatal at this point in
time, and we do not expect to take Debug exceptions, thus:
- It's fine to leave TCO set for exceptions taken during this window,
and mte_disable_tco_entry() doesn't need to do anything.
- We don't need to detect and report asynchronous tag check faults
during this window, and neither mte_check_tfsr_entry() nor
mte_check_tfsr_exit() need to do anything.
Since we want to report any SErrors taken during this window, these
cannot safely use cpus_have_final_boot_cap() or cpus_have_final_cap(),
but these can safely use alternative_has_cap_*().
* The __set_pte_at() function is not used during this window. It is
possible for this to be used on kernel mappings prior to boot cpucaps
being finalized, so this cannot safely use cpus_have_final_boot_cap()
or cpus_have_final_cap(), but this can safely use
alternative_has_cap_*().
* No userspace translation tables have been created yet, and swap has
not been initialized yet. Thus swapping is not possible and none of
the following are called:
- arch_thp_swp_supported()
- arch_prepare_to_swap()
- arch_swap_invalidate_page()
- arch_swap_invalidate_area()
- arch_swap_restore()
These can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* The elfcore functions are only reachable after userspace is brought
up, which happens after system cpucaps have been detected and patched.
Thus the elfcore code can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* Hibernation is only possible after userspace is brought up, which
happens after system cpucaps have been detected and patched. Thus the
hibernate code can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* The set_tagged_addr_ctrl() function is only reachable after userspace
is brought up, which happens after system cpucaps have been detected
and patched. Thus this can safely use cpus_have_final_cap() or
alternative_has_cap_*().
* The copy_user_highpage() and copy_highpage() functions are not used
during this window, and can safely use alternative_has_cap_*().
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which avoids generating code to test the
system_cpucaps bitmap and should be better for all subsequent calls at
runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:49 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_MTE);
|
2019-09-06 09:58:01 +00:00
|
|
|
}
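A minimal sketch of the context-switch gating discussed above (example_mte_thread_switch() is a placeholder name with a placeholder body); the cheap early return is the part that relies on the patched capability check:

static void example_mte_thread_switch(struct task_struct *next)
{
	if (!system_supports_mte())
		return;

	/* ... save/restore tag-check-fault state for 'next' ... */
}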
|
|
|
|
|
2019-06-11 09:38:11 +00:00
|
|
|
static inline bool system_has_prio_mask_debugging(void)
|
|
|
|
{
|
|
|
|
return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
|
|
|
|
system_uses_irq_prio_masking();
|
|
|
|
}
|
|
|
|
|
2020-03-16 16:50:45 +00:00
|
|
|
static inline bool system_supports_bti(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_BTI
In system_supports_bti() we use cpus_have_const_cap() to check for
ARM64_HAS_BTI, but this is not necessary and alternative_has_cap_*() or
cpus_have_final_*cap() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
When CONFIG_ARM64_BTI_KERNEL=y, the ARM64_HAS_BTI cpucap is a strict
boot cpu feature which is detected and patched early on the boot cpu.
All uses guarded by CONFIG_ARM64_BTI_KERNEL happen after the boot CPU
has detected ARM64_HAS_BTI and patched boot alternatives, and hence can
safely use alternative_has_cap_*() or cpus_have_final_boot_cap().
Regardless of CONFIG_ARM64_BTI_KERNEL, all other uses of ARM64_HAS_BTI
happen after system capabilities have been finalized and alternatives
have been patched. Hence these can safely use alternative_has_cap_*() or
cpus_have_final_cap().
This patch splits system_supports_bti() into system_supports_bti() and
system_supports_bti_kernel(), with the former handling where the cpucap
affects userspace functionality, and ther latter handling where the
cpucap affects kernel functionality. The use of cpus_have_const_cap() is
replaced by cpus_have_final_cap() in cpus_have_const_cap, and
cpus_have_final_boot_cap() in system_supports_bti_kernel(). This will
avoid generating code to test the system_cpucaps bitmap and should be
better for all subsequent calls at runtime. The use of
cpus_have_final_cap() and cpus_have_final_boot_cap() will make it easier
to spot if code is changed such that these run before the ARM64_HAS_BTI
cpucap is guaranteed to have been finalized.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:39 +00:00
|
|
|
return cpus_have_final_cap(ARM64_BTI);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool system_supports_bti_kernel(void)
|
|
|
|
{
|
|
|
|
return IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) &&
|
|
|
|
cpus_have_final_boot_cap(ARM64_BTI);
|
2020-01-13 23:30:17 +00:00
|
|
|
}
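As a rough illustration of the split (example_kernel_text_prot() is hypothetical), kernel-internal users test system_supports_bti_kernel() while userspace-facing code tests system_supports_bti():

static pgprot_t example_kernel_text_prot(pgprot_t prot)
{
	if (system_supports_bti_kernel())
		prot = __pgprot(pgprot_val(prot) | PTE_GP);	/* guarded kernel pages */

	return prot;
}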
|
|
|
|
|
arm64: tlb: Use the TLBI RANGE feature in arm64
Add __TLBI_VADDR_RANGE macro and rewrite __flush_tlb_range().
When the CPU supports the TLBI range feature, the minimum range
granularity is decided by 'scale', so we cannot flush all pages with
one instruction in some cases.
For example, when pages = 0xe81a, let's start 'scale' from the
maximum and find the right 'num' for each 'scale':
1. scale = 3, we can flush no pages because the minimum range is
2^(5*3 + 1) = 0x10000.
2. scale = 2, the minimum range is 2^(5*2 + 1) = 0x800, we can
flush 0xe800 pages this time, the num = 0xe800/0x800 - 1 = 0x1c.
The remaining number of pages is 0x1a;
3. scale = 1, the minimum range is 2^(5*1 + 1) = 0x40, no page
can be flushed.
4. scale = 0, we flush the remaining 0x1a pages, the num =
0x1a/0x2 - 1 = 0xc.
However, in most scenarios pages = 1 when flush_tlb_range() is
called. Starting from scale = 3, or from another value such as
scale = ilog2(pages), would incur extra overhead.
So 'scale' is increased from 0 to the maximum, and the flush order is
exactly the opposite of the example.
Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Link: https://lore.kernel.org/r/20200715071945.897-4-yezhenyu2@huawei.com
[catalin.marinas@arm.com: removed unnecessary masks in __TLBI_VADDR_RANGE]
[catalin.marinas@arm.com: __TLB_RANGE_NUM subtracts 1]
[catalin.marinas@arm.com: minor adjustments to the comments]
[catalin.marinas@arm.com: introduce system_supports_tlb_range()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-15 07:19:45 +00:00
|
|
|
static inline bool system_supports_tlb_range(void)
|
|
|
|
{
|
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_TLB_RANGE
We use cpus_have_const_cap() to check for ARM64_HAS_TLB_RANGE, but this
is not necessary and alternative_has_cap_*() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than
it needs to be. Before cpucaps are finalized, it will perform a bitmap
test of the system_cpucaps bitmap, and once cpucaps are finalized it
will use an alternative branch. This used to be necessary to handle some
race conditions in the window between cpucap detection and the
subsequent patching of alternatives and static branches, where different
branches could be out-of-sync with one another (or w.r.t. alternative
sequences). Now that we use alternative branches instead of static
branches, these are all patched atomically w.r.t. one another, and there
are only a handful of cases that need special care in the window between
cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and
migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(),
or cpus_have_cap() depending on their requirements. This will
remove redundant instructions and improve code generation, and will make
it easier to determine how each callsite will behave before, during, and
after alternative patching.
In the window between detecting the ARM64_HAS_TLB_RANGE cpucap and
patching alternative branches, we do not perform any TLB invalidation,
and even if we were to perform TLB invalidation here it would not be
functionally necessary to optimize this by using range invalidation.
Hence there's no need to use cpus_have_const_cap(), and
alternative_has_cap_unlikely() is sufficient.
This patch replaces the use of cpus_have_const_cap() with
alternative_has_cap_unlikely(), which will avoid generating code to test
the system_cpucaps bitmap and should be better for all subsequent calls
at runtime.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-10-16 10:24:48 +00:00
|
|
|
return alternative_has_cap_unlikely(ARM64_HAS_TLB_RANGE);
|
arm64: tlb: Use the TLBI RANGE feature in arm64
Add __TLBI_VADDR_RANGE macro and rewrite __flush_tlb_range().
When the CPU supports the TLBI range feature, the minimum range
granularity is decided by 'scale', so we cannot flush all pages with
one instruction in some cases.
For example, when pages = 0xe81a, let's start 'scale' from the
maximum and find the right 'num' for each 'scale':
1. scale = 3, we can flush no pages because the minimum range is
2^(5*3 + 1) = 0x10000.
2. scale = 2, the minimum range is 2^(5*2 + 1) = 0x800, we can
flush 0xe800 pages this time, the num = 0xe800/0x800 - 1 = 0x1c.
Remaining pages is 0x1a;
3. scale = 1, the minimum range is 2^(5*1 + 1) = 0x40, no page
can be flushed.
4. scale = 0, we flush the remaining 0x1a pages, the num =
0x1a/0x2 - 1 = 0xd.
However, in most scenarios, the pages = 1 when flush_tlb_range() is
called. Start from scale = 3 or other proper value (such as scale =
ilog2(pages)), will incur extra overhead.
So increase 'scale' from 0 to maximum, the flush order is exactly
opposite to the example.
Signed-off-by: Zhenyu Ye <yezhenyu2@huawei.com>
Link: https://lore.kernel.org/r/20200715071945.897-4-yezhenyu2@huawei.com
[catalin.marinas@arm.com: removed unnecessary masks in __TLBI_VADDR_RANGE]
[catalin.marinas@arm.com: __TLB_RANGE_NUM subtracts 1]
[catalin.marinas@arm.com: minor adjustments to the comments]
[catalin.marinas@arm.com: introduce system_supports_tlb_range()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-15 07:19:45 +00:00
|
|
|
}
|
|
|
|
|
2023-11-27 11:17:30 +00:00
|
|
|
static inline bool system_supports_lpa2(void)
|
|
|
|
{
|
|
|
|
return cpus_have_final_cap(ARM64_HAS_LPA2);
|
|
|
|
}
|
|
|
|
|
arm64: rework EL0 MRS emulation
On CPUs without FEAT_IDST, ID register emulation is slower than it needs
to be, as all threads contend for the same lock to perform the
emulation. This patch reworks the emulation to avoid this unnecessary
contention.
On CPUs with FEAT_IDST (which is mandatory from ARMv8.4 onwards), EL0
accesses to ID registers result in a SYS trap, and emulation of these is
handled with a sys64_hook. These hooks are statically allocated, and no
locking is required to iterate through the hooks and perform the
emulation, allowing emulation to occur in parallel with no contention.
On CPUs without FEAT_IDST, EL0 accesses to ID registers result in an
UNDEFINED exception, and emulation of these accesses is handled with an
undef_hook. When an EL0 MRS instruction is trapped to EL1, the kernel
finds the relevant handler by iterating through all of the undef_hooks,
requiring undef_lock to be held during this lookup.
This locking is only required to safely traverse the list of undef_hooks
(as it can be concurrently modified), and the actual emulation of the
MRS does not require any mutual exclusion. This locking is an
unfortunate bottleneck, especially given that MRS emulation is enabled
unconditionally and is never disabled.
This patch reworks the non-FEAT_IDST MRS emulation logic so that it can
be invoked directly from do_el0_undef(). This removes the bottleneck,
allowing MRS traps to be handled entirely in parallel, and is a stepping
stone to making all of the undef_hooks lock-free.
I've tested this in a 64-vCPU VM on a 64-CPU ThunderX2 host, with a
benchmark which spawns a number of threads which each try to read
ID_AA64ISAR0_EL1 1000000 times. This is vastly more contention than will
ever be seen in realistic usage, but clearly demonstrates the removal of
the bottleneck:
| Threads || Time (seconds)                        |
|         || Before            || After            |
|         || Real   | System   || Real   | System  |
|---------++--------+----------++--------+---------|
|       1 ||   0.29 |     0.20 ||   0.24 |    0.12 |
|       2 ||   0.35 |     0.51 ||   0.23 |    0.27 |
|       4 ||   1.08 |     3.87 ||   0.24 |    0.56 |
|       8 ||   4.31 |    33.60 ||   0.24 |    1.11 |
|      16 ||   9.47 |   149.39 ||   0.23 |    2.15 |
|      32 ||  19.07 |   605.27 ||   0.24 |    4.38 |
|      64 ||  65.40 |  3609.09 ||   0.33 |   11.27 |
Aside from the speedup, there should be no functional change as a result
of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221019144123.612388-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-10-19 14:41:19 +00:00
|
|
|
int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
|
|
|
|
bool try_emulate_mrs(struct pt_regs *regs, u32 isn);
|
2018-10-26 00:57:35 +00:00
|
|
|
|
2018-09-26 16:32:40 +00:00
|
|
|
static inline u32 id_aa64mmfr0_parange_to_phys_shift(int parange)
|
|
|
|
{
|
|
|
|
switch (parange) {
|
2022-09-05 22:54:01 +00:00
|
|
|
case ID_AA64MMFR0_EL1_PARANGE_32: return 32;
|
|
|
|
case ID_AA64MMFR0_EL1_PARANGE_36: return 36;
|
|
|
|
case ID_AA64MMFR0_EL1_PARANGE_40: return 40;
|
|
|
|
case ID_AA64MMFR0_EL1_PARANGE_42: return 42;
|
|
|
|
case ID_AA64MMFR0_EL1_PARANGE_44: return 44;
|
|
|
|
case ID_AA64MMFR0_EL1_PARANGE_48: return 48;
|
|
|
|
case ID_AA64MMFR0_EL1_PARANGE_52: return 52;
|
2018-09-26 16:32:40 +00:00
|
|
|
/*
|
|
|
|
* A future PE could use a value unknown to the kernel.
|
|
|
|
* However, by the "D10.1.4 Principles of the ID scheme
|
|
|
|
* for fields in ID registers", ARM DDI 0487C.a, any new
|
|
|
|
* value is guaranteed to be higher than what we know already.
|
|
|
|
* As a safe limit, we return the limit supported by the kernel.
|
|
|
|
*/
|
|
|
|
default: return CONFIG_ARM64_PA_BITS;
|
|
|
|
}
|
|
|
|
}
|
2019-10-11 14:09:36 +00:00
|
|
|
|
|
|
|
/* Check whether hardware update of the Access flag is supported */
|
|
|
|
static inline bool cpu_has_hw_af(void)
|
|
|
|
{
|
|
|
|
u64 mmfr1;
|
|
|
|
|
|
|
|
if (!IS_ENABLED(CONFIG_ARM64_HW_AFDBM))
|
|
|
|
return false;
|
|
|
|
|
2023-01-09 15:19:55 +00:00
|
|
|
/*
|
|
|
|
* Use cached version to avoid emulated msr operation on KVM
|
|
|
|
* guests.
|
|
|
|
*/
|
|
|
|
mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
|
2019-10-11 14:09:36 +00:00
|
|
|
return cpuid_feature_extract_unsigned_field(mmfr1,
|
2022-09-05 22:54:07 +00:00
|
|
|
ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
|
2019-10-11 14:09:36 +00:00
|
|
|
}
|
|
|
|
|
arm64: sdei: explicitly simulate PAN/UAO entry
In preparation for removing addr_limit and set_fs() we must decouple the
SDEI PAN/UAO manipulation from the uaccess code, and explicitly
reinitialize these as required.
SDEI enters the kernel with a non-architectural exception, and prior to
the most recent revision of the specification (ARM DEN 0054B), PSTATE
bits (e.g. PAN, UAO) are not manipulated in the same way as for
architectural exceptions. Notably, older versions of the spec can be
read ambiguously as to whether PSTATE bits are inherited unchanged from
the interrupted context or whether they are generated from scratch, with
TF-A doing the latter.
We have three cases to consider:
1) The existing TF-A implementation of SDEI will clear PAN and clear UAO
(along with other bits in PSTATE) when delivering an SDEI exception.
2) In theory, implementations of SDEI prior to revision B could inherit
PAN and UAO (along with other bits in PSTATE) unchanged from the
interrupted context. However, in practice such implementations do not
exist.
3) Going forward, new implementations of SDEI must clear UAO, and
depending on SCTLR_ELx.SPAN must either inherit or set PAN.
As we can ignore (2) we can assume that upon SDEI entry, UAO is always
clear, though PAN may be clear, inherited, or set per SCTLR_ELx.SPAN.
Therefore, we must explicitly initialize PAN, but do not need to do
anything for UAO.
Considering what we need to do:
* When set_fs() is removed, force_uaccess_begin() will have no HW
side-effects. As this only clears UAO, which we can assume has already
been cleared upon entry, this is not a problem. We do not need to add
code to manipulate UAO explicitly.
* PAN may be cleared upon entry (in case 1 above), so where a kernel is
built to use PAN and this is supported by all CPUs, the kernel must
set PAN upon entry to ensure expected behaviour.
* PAN may be inherited from the interrupted context (in case 3 above),
and so where a kernel is not built to use PAN or where PAN support is
not uniform across CPUs, the kernel must clear PAN to ensure expected
behaviour.
This patch reworks the SDEI code accordingly, explicitly setting PAN to
the expected state in all cases. To cater for the cases where the kernel
does not use PAN or this is not uniformly supported by hardware we add a
new cpu_has_pan() helper which can be used regardless of whether the
kernel is built to use PAN.
The existing system_uses_ttbr0_pan() is redefined in terms of
system_uses_hw_pan() both for clarity and as a minor optimization when
HW PAN is not selected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201202131558.39270-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-12-02 13:15:48 +00:00
|
|
|
static inline bool cpu_has_pan(void)
|
|
|
|
{
|
|
|
|
u64 mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
|
|
|
|
return cpuid_feature_extract_unsigned_field(mmfr1,
|
2022-09-05 22:54:07 +00:00
|
|
|
ID_AA64MMFR1_EL1_PAN_SHIFT);
|
2020-12-02 13:15:48 +00:00
|
|
|
}
|
|
|
|
|
2020-03-05 09:06:21 +00:00
|
|
|
#ifdef CONFIG_ARM64_AMU_EXTN
|
|
|
|
/* Check whether the cpu supports the Activity Monitors Unit (AMU) */
|
|
|
|
extern bool cpu_has_amu_feat(int cpu);
|
2020-11-06 12:53:32 +00:00
|
|
|
#else
|
|
|
|
static inline bool cpu_has_amu_feat(int cpu)
|
|
|
|
{
|
|
|
|
return false;
|
|
|
|
}
|
2020-03-05 09:06:21 +00:00
|
|
|
#endif
|
|
|
|
|
2020-11-06 12:53:34 +00:00
|
|
|
/* Get a cpu that supports the Activity Monitors Unit (AMU) */
|
|
|
|
extern int get_cpu_with_amu_feat(void);
|
|
|
|
|
2020-05-12 01:57:27 +00:00
|
|
|
static inline unsigned int get_vmid_bits(u64 mmfr1)
|
|
|
|
{
|
|
|
|
int vmid_bits;
|
|
|
|
|
|
|
|
vmid_bits = cpuid_feature_extract_unsigned_field(mmfr1,
|
2022-09-05 22:54:07 +00:00
|
|
|
ID_AA64MMFR1_EL1_VMIDBits_SHIFT);
|
|
|
|
if (vmid_bits == ID_AA64MMFR1_EL1_VMIDBits_16)
|
2020-05-12 01:57:27 +00:00
|
|
|
return 16;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Return the default here even if any reserved
|
|
|
|
* value is fetched from the system register.
|
|
|
|
*/
|
|
|
|
return 8;
|
|
|
|
}
|
|
|
|
|
2023-06-09 19:00:50 +00:00
|
|
|
s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new, s64 cur);
|
2022-09-09 16:59:37 +00:00
|
|
|
struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id);
|
|
|
|
|
2024-02-14 12:29:12 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64mmfr0_override;
|
2021-02-08 09:57:23 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64mmfr1_override;
|
2024-02-14 12:29:12 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64mmfr2_override;
|
2022-06-30 16:04:59 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64pfr0_override;
|
2021-02-08 09:57:29 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64pfr1_override;
|
2022-06-30 16:04:59 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64zfr0_override;
|
2022-06-30 16:04:58 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64smfr0_override;
|
2021-02-08 09:57:31 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64isar1_override;
|
2022-02-24 12:49:52 +00:00
|
|
|
extern struct arm64_ftr_override id_aa64isar2_override;
|
2021-02-08 09:57:23 +00:00
|
|
|
|
2023-06-09 16:21:46 +00:00
|
|
|
extern struct arm64_ftr_override arm64_sw_feature_override;
|
|
|
|
|
2024-02-14 12:28:56 +00:00
|
|
|
static inline
|
|
|
|
u64 arm64_apply_feature_override(u64 val, int feat, int width,
|
|
|
|
const struct arm64_ftr_override *override)
|
|
|
|
{
|
|
|
|
u64 oval = override->val;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* When it encounters an invalid override (e.g., an override that
|
|
|
|
* cannot be honoured due to a missing CPU feature), the early idreg
|
|
|
|
* override code will set the mask to 0x0 and the value to non-zero for
|
|
|
|
* the field in question. In order to determine whether the override is
|
|
|
|
* valid or not for the field we are interested in, we first need to
|
|
|
|
* disregard bits belonging to other fields.
|
|
|
|
*/
|
|
|
|
oval &= GENMASK_ULL(feat + width - 1, feat);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The override is valid if all value bits are accounted for in the
|
|
|
|
* mask. If so, replace the masked bits with the override value.
|
|
|
|
*/
|
|
|
|
if (oval == (oval & override->mask)) {
|
|
|
|
val &= ~override->mask;
|
|
|
|
val |= oval;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Extract the field from the updated value */
|
|
|
|
return cpuid_feature_extract_unsigned_field(val, feat);
|
|
|
|
}
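As a hedged, standalone illustration of the override semantics above (the
struct name, field layout and values here are invented for the example;
this is not kernel code):

#include <stdio.h>
#include <stdint.h>

struct demo_override { uint64_t val, mask; };

static uint64_t demo_apply_override(uint64_t reg, int feat, int width,
				    const struct demo_override *o)
{
	uint64_t field = ((1ULL << width) - 1) << feat;
	uint64_t oval = o->val & field;

	/* Valid only if every set value bit is also covered by the mask. */
	if (oval == (oval & o->mask)) {
		reg &= ~o->mask;
		reg |= oval;
	}
	return (reg & field) >> feat;
}

int main(void)
{
	/* Force the 4-bit field at bit 4 to 0x2, whatever it held before. */
	struct demo_override o = { .val = 0x2ULL << 4, .mask = 0xfULL << 4 };

	printf("%llx\n", (unsigned long long)demo_apply_override(0xf0, 4, 4, &o));
	return 0;
}

An override that was rejected early (mask cleared to zero but a non-zero
value left in the field) fails the validity check and the field is
returned unmodified.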
|
|
|
|
|
|
|
|
static inline bool arm64_test_sw_feature_override(int feat)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Software features are pseudo CPU features that have no underlying
|
|
|
|
* CPUID system register value to apply the override to.
|
|
|
|
*/
|
|
|
|
return arm64_apply_feature_override(0, feat, 4,
|
|
|
|
&arm64_sw_feature_override);
|
|
|
|
}
|
|
|
|
|
2024-02-14 12:28:57 +00:00
|
|
|
static inline bool kaslr_disabled_cmdline(void)
|
|
|
|
{
|
|
|
|
return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOKASLR);
|
|
|
|
}
|
|
|
|
|
2020-05-12 01:57:27 +00:00
|
|
|
u32 get_kvm_ipa_limit(void);
|
2020-06-29 04:38:31 +00:00
|
|
|
void dump_cpu_features(void);
|
2020-05-12 01:57:27 +00:00
|
|
|
|
2024-02-14 12:28:59 +00:00
|
|
|
static inline bool cpu_has_bti(void)
|
|
|
|
{
|
|
|
|
if (!IS_ENABLED(CONFIG_ARM64_BTI))
|
|
|
|
return false;
|
|
|
|
|
|
|
|
return arm64_apply_feature_override(read_cpuid(ID_AA64PFR1_EL1),
|
|
|
|
ID_AA64PFR1_EL1_BT_SHIFT, 4,
|
|
|
|
&id_aa64pfr1_override);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool cpu_has_pac(void)
|
|
|
|
{
|
|
|
|
u64 isar1, isar2;
|
|
|
|
|
|
|
|
if (!IS_ENABLED(CONFIG_ARM64_PTR_AUTH))
|
|
|
|
return false;
|
|
|
|
|
|
|
|
isar1 = read_cpuid(ID_AA64ISAR1_EL1);
|
|
|
|
isar2 = read_cpuid(ID_AA64ISAR2_EL1);
|
|
|
|
|
|
|
|
if (arm64_apply_feature_override(isar1, ID_AA64ISAR1_EL1_APA_SHIFT, 4,
|
|
|
|
&id_aa64isar1_override))
|
|
|
|
return true;
|
|
|
|
|
|
|
|
if (arm64_apply_feature_override(isar1, ID_AA64ISAR1_EL1_API_SHIFT, 4,
|
|
|
|
&id_aa64isar1_override))
|
|
|
|
return true;
|
|
|
|
|
|
|
|
return arm64_apply_feature_override(isar2, ID_AA64ISAR2_EL1_APA3_SHIFT, 4,
|
|
|
|
&id_aa64isar2_override);
|
|
|
|
}
|
|
|
|
|
arm64: mm: Handle LVA support as a CPU feature
Currently, we detect CPU support for 52-bit virtual addressing (LVA)
extremely early, before creating the kernel page tables or enabling the
MMU. We cannot override the feature this early, and so large virtual
addressing is always enabled on CPUs that implement support for it if
the software support for it was enabled at build time. It also means we
rely on non-trivial code in asm to deal with this feature.
Given that both the ID map and the TTBR1 mapping of the kernel image are
guaranteed to be 48-bit addressable, it is not actually necessary to
enable support this early, and instead, we can model it as a CPU
feature. That way, we can rely on code patching to get the correct
TCR.T1SZ values programmed on secondary boot and resume from suspend.
On the primary boot path, we simply enable the MMU with 48-bit virtual
addressing initially, and update TCR.T1SZ if LVA is supported from C
code, right before creating the kernel mapping. Given that TTBR1 still
points to reserved_pg_dir at this point, updating TCR.T1SZ should be
safe without the need for explicit TLB maintenance.
Since this gets rid of all accesses to the vabits_actual variable from
asm code that occurred before TCR.T1SZ had been programmed, we no longer
have a need for this variable, and we can replace it with a C expression
that produces the correct value directly, based on the value of TCR.T1SZ.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-70-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-14 12:29:11 +00:00
|
|
|
static inline bool cpu_has_lva(void)
|
|
|
|
{
|
|
|
|
u64 mmfr2;
|
|
|
|
|
|
|
|
mmfr2 = read_sysreg_s(SYS_ID_AA64MMFR2_EL1);
|
2024-02-14 12:29:12 +00:00
|
|
|
mmfr2 &= ~id_aa64mmfr2_override.mask;
|
|
|
|
mmfr2 |= id_aa64mmfr2_override.val;
|
2024-02-14 12:29:11 +00:00
|
|
|
return cpuid_feature_extract_unsigned_field(mmfr2,
|
|
|
|
ID_AA64MMFR2_EL1_VARange_SHIFT);
|
|
|
|
}
|
|
|
|
|
arm64: Enable LPA2 at boot if supported by the system
Update the early kernel mapping code to take 52-bit virtual addressing
into account based on the LPA2 feature. This is a bit more involved than
LVA (which is supported with 64k pages only), given that some page table
descriptor bits change meaning in this case.
To keep the handling in asm to a minimum, the initial ID map is still
created with 48-bit virtual addressing, which implies that the kernel
image must be loaded into 48-bit addressable physical memory. This is
currently required by the boot protocol, even though we happen to
support placement outside of that for LVA/64k based configurations.
Enabling LPA2 involves more than setting TCR.T1SZ to a lower value:
there is also a DS bit in TCR that needs to be set, which changes
the meaning of bits [9:8] in all page table descriptors. Since we cannot
enable DS and update every live page table descriptor at the same time, let's
pivot through another temporary mapping. This avoids the need to
reintroduce manipulations of the page tables with the MMU and caches
disabled.
To permit the LPA2 feature to be overridden on the kernel command line,
which may be necessary to work around silicon errata, or to deal with
mismatched features on heterogeneous SoC designs, test for CPU feature
overrides first, and only then enable LPA2.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-78-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-14 12:29:19 +00:00
|
|
|
static inline bool cpu_has_lpa2(void)
|
|
|
|
{
|
|
|
|
#ifdef CONFIG_ARM64_LPA2
|
|
|
|
u64 mmfr0;
|
|
|
|
int feat;
|
|
|
|
|
|
|
|
mmfr0 = read_sysreg(id_aa64mmfr0_el1);
|
|
|
|
mmfr0 &= ~id_aa64mmfr0_override.mask;
|
|
|
|
mmfr0 |= id_aa64mmfr0_override.val;
|
|
|
|
feat = cpuid_feature_extract_signed_field(mmfr0,
|
|
|
|
ID_AA64MMFR0_EL1_TGRAN_SHIFT);
|
|
|
|
|
|
|
|
return feat >= ID_AA64MMFR0_EL1_TGRAN_LPA2;
|
|
|
|
#else
|
|
|
|
return false;
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
2014-11-14 15:54:10 +00:00
|
|
|
#endif /* __ASSEMBLY__ */
|
|
|
|
|
2014-03-04 01:10:04 +00:00
|
|
|
#endif
|