arm64 updates for 5.10

- Userspace support for the Memory Tagging Extension introduced by Armv8.5.
   Kernel support (via KASAN) is likely to follow in 5.11.
 
 - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
   switching.
 
 - Fix and subsequent rewrite of our Spectre mitigations, including the
   addition of support for PR_SPEC_DISABLE_NOEXEC.
 
 - Support for the Armv8.3 Pointer Authentication enhancements.
 
 - Support for ASID pinning, which is required when sharing page-tables with
   the SMMU.
 
 - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op.
 
 - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and
   also support to handle CPU PMU IRQs as NMIs.
 
 - Allow prefetchable PCI BARs to be exposed to userspace using normal
   non-cacheable mappings.
 
 - Implementation of ARCH_STACKWALK for unwinding.
 
 - Improve reporting of unexpected kernel traps due to BPF JIT failure.
 
 - Improve robustness of user-visible HWCAP strings and their corresponding
   numerical constants.
 
 - Removal of TEXT_OFFSET.
 
 - Removal of some unused functions, parameters and prototypes.
 
 - Removal of MPIDR-based topology detection in favour of firmware
   description.
 
 - Cleanups to handling of SVE and FPSIMD register state in preparation
   for potential future optimisation of handling across syscalls.
 
 - Cleanups to the SDEI driver in preparation for support in KVM.
 
 - Miscellaneous cleanups and refactoring work.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+AUXMQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNFc1B/4q2Kabe+pPu7s1f58Q+OTaEfqcr3F1qh27
 F1YpFZUYxg0GPfPsFrnbJpo5WKo7wdR9ceI9yF/GHjs7A/MSoQJis3pG6SlAd9c0
 nMU5tCwhg9wfq6asJtl0/IPWem6cqqhdzC6m808DjeHuyi2CCJTt0vFWH3OeHEhG
 cfmLfaSNXOXa/MjEkT8y1AXJ/8IpIpzkJeCRA1G5s18PXV9Kl5bafIo9iqyfKPLP
 0rJljBmoWbzuCSMc81HmGUQI4+8KRp6HHhyZC/k0WEVgj3LiumT7am02bdjZlTnK
 BeNDKQsv2Jk8pXP2SlrI3hIUTz0bM6I567FzJEokepvTUzZ+CVBi
 =9J8H
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's quite a lot of code here, but much of it is due to the
  addition of a new PMU driver as well as some arm64-specific selftests
  which is an area where we've traditionally been lagging a bit.

  In terms of exciting features, this includes support for the Memory
  Tagging Extension which narrowly missed 5.9, hopefully allowing
  userspace to run with use-after-free detection in production on CPUs
  that support it. Work is ongoing to integrate the feature with KASAN
  for 5.11.

  Another change that I'm excited about (assuming they get the hardware
  right) is preparing the ASID allocator for sharing the CPU page-table
  with the SMMU. Those changes will also come in via Joerg with the
  IOMMU pull.

  We do stray outside of our usual directories in a few places, mostly
  due to core changes required by MTE. Although much of this has been
  Acked, there were a couple of places where we unfortunately didn't get
  any review feedback.

  Other than that, we ran into a handful of minor conflicts in -next,
  but nothing that should post any issues.

  Summary:

   - Userspace support for the Memory Tagging Extension introduced by
     Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11.

   - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
     switching.

   - Fix and subsequent rewrite of our Spectre mitigations, including
     the addition of support for PR_SPEC_DISABLE_NOEXEC.

   - Support for the Armv8.3 Pointer Authentication enhancements.

   - Support for ASID pinning, which is required when sharing
     page-tables with the SMMU.

   - MM updates, including treating flush_tlb_fix_spurious_fault() as a
     no-op.

   - Perf/PMU driver updates, including addition of the ARM CMN PMU
     driver and also support to handle CPU PMU IRQs as NMIs.

   - Allow prefetchable PCI BARs to be exposed to userspace using normal
     non-cacheable mappings.

   - Implementation of ARCH_STACKWALK for unwinding.

   - Improve reporting of unexpected kernel traps due to BPF JIT
     failure.

   - Improve robustness of user-visible HWCAP strings and their
     corresponding numerical constants.

   - Removal of TEXT_OFFSET.

   - Removal of some unused functions, parameters and prototypes.

   - Removal of MPIDR-based topology detection in favour of firmware
     description.

   - Cleanups to handling of SVE and FPSIMD register state in
     preparation for potential future optimisation of handling across
     syscalls.

   - Cleanups to the SDEI driver in preparation for support in KVM.

   - Miscellaneous cleanups and refactoring work"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
  Revert "arm64: initialize per-cpu offsets earlier"
  arm64: random: Remove no longer needed prototypes
  arm64: initialize per-cpu offsets earlier
  kselftest/arm64: Check mte tagged user address in kernel
  kselftest/arm64: Verify KSM page merge for MTE pages
  kselftest/arm64: Verify all different mmap MTE options
  kselftest/arm64: Check forked child mte memory accessibility
  kselftest/arm64: Verify mte tag inclusion via prctl
  kselftest/arm64: Add utilities and a test to validate mte memory
  perf: arm-cmn: Fix conversion specifiers for node type
  perf: arm-cmn: Fix unsigned comparison to less than zero
  arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
  arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
  arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
  arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
  KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
  arm64: Get rid of arm64_ssbd_state
  KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
  KVM: arm64: Get rid of kvm_arm_have_ssbd()
  KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
  ...
This commit is contained in:
Linus Torvalds 2020-10-12 10:00:51 -07:00
commit 6734e20e39
184 changed files with 10389 additions and 1907 deletions

View file

@ -0,0 +1,65 @@
=============================
Arm Coherent Mesh Network PMU
=============================
CMN-600 is a configurable mesh interconnect consisting of a rectangular
grid of crosspoints (XPs), with each crosspoint supporting up to two
device ports to which various AMBA CHI agents are attached.
CMN implements a distributed PMU design as part of its debug and trace
functionality. This consists of a local monitor (DTM) at every XP, which
counts up to 4 event signals from the connected device nodes and/or the
XP itself. Overflow from these local counters is accumulated in up to 8
global counters implemented by the main controller (DTC), which provides
overall PMU control and interrupts for global counter overflow.
PMU events
----------
The PMU driver registers a single PMU device for the whole interconnect,
see /sys/bus/event_source/devices/arm_cmn. Multi-chip systems may link
more than one CMN together via external CCIX links - in this situation,
each mesh counts its own events entirely independently, and additional
PMU devices will be named arm_cmn_{1..n}.
Most events are specified in a format based directly on the TRM
definitions - "type" selects the respective node type, and "eventid" the
event number. Some events require an additional occupancy ID, which is
specified by "occupid".
* Since RN-D nodes do not have any distinct events from RN-I nodes, they
are treated as the same type (0xa), and the common event templates are
named "rnid_*".
* The cycle counter is treated as a synthetic event belonging to the DTC
node ("type" == 0x3, "eventid" is ignored).
* XP events also encode the port and channel in the "eventid" field, to
match the underlying pmu_event0_id encoding for the pmu_event_sel
register. The event templates are named with prefixes to cover all
permutations.
By default each event provides an aggregate count over all nodes of the
given type. To target a specific node, "bynodeid" must be set to 1 and
"nodeid" to the appropriate value derived from the CMN configuration
(as defined in the "Node ID Mapping" section of the TRM).
Watchpoints
-----------
The PMU can also count watchpoint events to monitor specific flit
traffic. Watchpoints are treated as a synthetic event type, and like PMU
events can be global or targeted with a particular XP's "nodeid" value.
Since the watchpoint direction is otherwise implicit in the underlying
register selection, separate events are provided for flit uploads and
downloads.
The flit match value and mask are passed in config1 and config2 ("val"
and "mask" respectively). "wp_dev_sel", "wp_chn_sel", "wp_grp" and
"wp_exclusive" are specified per the TRM definitions for dtm_wp_config0.
Where a watchpoint needs to match fields from both match groups on the
REQ or SNP channel, it can be specified as two events - one for each
group - with the same nonzero "combine" value. The count for such a
pair of combined events will be attributed to the primary match.
Watchpoint events with a "combine" value of 0 are considered independent
and will count individually.

View file

@ -12,6 +12,7 @@ Performance monitor support
qcom_l2_pmu
qcom_l3_pmu
arm-ccn
arm-cmn
xgene-pmu
arm_dsu_pmu
thunderx2-pmu

View file

@ -175,6 +175,8 @@ infrastructure:
+------------------------------+---------+---------+
| Name | bits | visible |
+------------------------------+---------+---------+
| MTE | [11-8] | y |
+------------------------------+---------+---------+
| SSBS | [7-4] | y |
+------------------------------+---------+---------+
| BT | [3-0] | y |

View file

@ -240,6 +240,10 @@ HWCAP2_BTI
Functionality implied by ID_AA64PFR0_EL1.BT == 0b0001.
HWCAP2_MTE
Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010, as described
by Documentation/arm64/memory-tagging-extension.rst.
4. Unused AT_HWCAP bits
-----------------------

View file

@ -14,6 +14,7 @@ ARM64 Architecture
hugetlbpage
legacy_instructions
memory
memory-tagging-extension
perf
pointer-authentication
silicon-errata

View file

@ -0,0 +1,305 @@
===============================================
Memory Tagging Extension (MTE) in AArch64 Linux
===============================================
Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
Catalin Marinas <catalin.marinas@arm.com>
Date: 2020-02-25
This document describes the provision of the Memory Tagging Extension
functionality in AArch64 Linux.
Introduction
============
ARMv8.5 based processors introduce the Memory Tagging Extension (MTE)
feature. MTE is built on top of the ARMv8.0 virtual address tagging TBI
(Top Byte Ignore) feature and allows software to access a 4-bit
allocation tag for each 16-byte granule in the physical address space.
Such memory range must be mapped with the Normal-Tagged memory
attribute. A logical tag is derived from bits 59-56 of the virtual
address used for the memory access. A CPU with MTE enabled will compare
the logical tag against the allocation tag and potentially raise an
exception on mismatch, subject to system registers configuration.
Userspace Support
=================
When ``CONFIG_ARM64_MTE`` is selected and Memory Tagging Extension is
supported by the hardware, the kernel advertises the feature to
userspace via ``HWCAP2_MTE``.
PROT_MTE
--------
To access the allocation tags, a user process must enable the Tagged
memory attribute on an address range using a new ``prot`` flag for
``mmap()`` and ``mprotect()``:
``PROT_MTE`` - Pages allow access to the MTE allocation tags.
The allocation tag is set to 0 when such pages are first mapped in the
user address space and preserved on copy-on-write. ``MAP_SHARED`` is
supported and the allocation tags can be shared between processes.
**Note**: ``PROT_MTE`` is only supported on ``MAP_ANONYMOUS`` and
RAM-based file mappings (``tmpfs``, ``memfd``). Passing it to other
types of mapping will result in ``-EINVAL`` returned by these system
calls.
**Note**: The ``PROT_MTE`` flag (and corresponding memory type) cannot
be cleared by ``mprotect()``.
**Note**: ``madvise()`` memory ranges with ``MADV_DONTNEED`` and
``MADV_FREE`` may have the allocation tags cleared (set to 0) at any
point after the system call.
Tag Check Faults
----------------
When ``PROT_MTE`` is enabled on an address range and a mismatch between
the logical and allocation tags occurs on access, there are three
configurable behaviours:
- *Ignore* - This is the default mode. The CPU (and kernel) ignores the
tag check fault.
- *Synchronous* - The kernel raises a ``SIGSEGV`` synchronously, with
``.si_code = SEGV_MTESERR`` and ``.si_addr = <fault-address>``. The
memory access is not performed. If ``SIGSEGV`` is ignored or blocked
by the offending thread, the containing process is terminated with a
``coredump``.
- *Asynchronous* - The kernel raises a ``SIGSEGV``, in the offending
thread, asynchronously following one or multiple tag check faults,
with ``.si_code = SEGV_MTEAERR`` and ``.si_addr = 0`` (the faulting
address is unknown).
The user can select the above modes, per thread, using the
``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
bit-field:
- ``PR_MTE_TCF_NONE`` - *Ignore* tag check faults
- ``PR_MTE_TCF_SYNC`` - *Synchronous* tag check fault mode
- ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode
The current tag check fault mode can be read using the
``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)`` system call.
Tag checking can also be disabled for a user thread by setting the
``PSTATE.TCO`` bit with ``MSR TCO, #1``.
**Note**: Signal handlers are always invoked with ``PSTATE.TCO = 0``,
irrespective of the interrupted context. ``PSTATE.TCO`` is restored on
``sigreturn()``.
**Note**: There are no *match-all* logical tags available for user
applications.
**Note**: Kernel accesses to the user address space (e.g. ``read()``
system call) are not checked if the user thread tag checking mode is
``PR_MTE_TCF_NONE`` or ``PR_MTE_TCF_ASYNC``. If the tag checking mode is
``PR_MTE_TCF_SYNC``, the kernel makes a best effort to check its user
address accesses, however it cannot always guarantee it.
Excluding Tags in the ``IRG``, ``ADDG`` and ``SUBG`` instructions
-----------------------------------------------------------------
The architecture allows excluding certain tags to be randomly generated
via the ``GCR_EL1.Exclude`` register bit-field. By default, Linux
excludes all tags other than 0. A user thread can enable specific tags
in the randomly generated set using the ``prctl(PR_SET_TAGGED_ADDR_CTRL,
flags, 0, 0, 0)`` system call where ``flags`` contains the tags bitmap
in the ``PR_MTE_TAG_MASK`` bit-field.
**Note**: The hardware uses an exclude mask but the ``prctl()``
interface provides an include mask. An include mask of ``0`` (exclusion
mask ``0xffff``) results in the CPU always generating tag ``0``.
Initial process state
---------------------
On ``execve()``, the new process has the following configuration:
- ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
- Tag checking mode set to ``PR_MTE_TCF_NONE``
- ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
- ``PSTATE.TCO`` set to 0
- ``PROT_MTE`` not set on any of the initial memory maps
On ``fork()``, the new process inherits the parent's configuration and
memory map attributes with the exception of the ``madvise()`` ranges
with ``MADV_WIPEONFORK`` which will have the data and tags cleared (set
to 0).
The ``ptrace()`` interface
--------------------------
``PTRACE_PEEKMTETAGS`` and ``PTRACE_POKEMTETAGS`` allow a tracer to read
the tags from or set the tags to a tracee's address space. The
``ptrace()`` system call is invoked as ``ptrace(request, pid, addr,
data)`` where:
- ``request`` - one of ``PTRACE_PEEKMTETAGS`` or ``PTRACE_POKEMTETAGS``.
- ``pid`` - the tracee's PID.
- ``addr`` - address in the tracee's address space.
- ``data`` - pointer to a ``struct iovec`` where ``iov_base`` points to
a buffer of ``iov_len`` length in the tracer's address space.
The tags in the tracer's ``iov_base`` buffer are represented as one
4-bit tag per byte and correspond to a 16-byte MTE tag granule in the
tracee's address space.
**Note**: If ``addr`` is not aligned to a 16-byte granule, the kernel
will use the corresponding aligned address.
``ptrace()`` return value:
- 0 - tags were copied, the tracer's ``iov_len`` was updated to the
number of tags transferred. This may be smaller than the requested
``iov_len`` if the requested address range in the tracee's or the
tracer's space cannot be accessed or does not have valid tags.
- ``-EPERM`` - the specified process cannot be traced.
- ``-EIO`` - the tracee's address range cannot be accessed (e.g. invalid
address) and no tags copied. ``iov_len`` not updated.
- ``-EFAULT`` - fault on accessing the tracer's memory (``struct iovec``
or ``iov_base`` buffer) and no tags copied. ``iov_len`` not updated.
- ``-EOPNOTSUPP`` - the tracee's address does not have valid tags (never
mapped with the ``PROT_MTE`` flag). ``iov_len`` not updated.
**Note**: There are no transient errors for the requests above, so user
programs should not retry in case of a non-zero system call return.
``PTRACE_GETREGSET`` and ``PTRACE_SETREGSET`` with ``addr ==
``NT_ARM_TAGGED_ADDR_CTRL`` allow ``ptrace()`` access to the tagged
address ABI control and MTE configuration of a process as per the
``prctl()`` options described in
Documentation/arm64/tagged-address-abi.rst and above. The corresponding
``regset`` is 1 element of 8 bytes (``sizeof(long))``).
Example of correct usage
========================
*MTE Example code*
.. code-block:: c
/*
* To be compiled with -march=armv8.5-a+memtag
*/
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/auxv.h>
#include <sys/mman.h>
#include <sys/prctl.h>
/*
* From arch/arm64/include/uapi/asm/hwcap.h
*/
#define HWCAP2_MTE (1 << 18)
/*
* From arch/arm64/include/uapi/asm/mman.h
*/
#define PROT_MTE 0x20
/*
* From include/uapi/linux/prctl.h
*/
#define PR_SET_TAGGED_ADDR_CTRL 55
#define PR_GET_TAGGED_ADDR_CTRL 56
# define PR_TAGGED_ADDR_ENABLE (1UL << 0)
# define PR_MTE_TCF_SHIFT 1
# define PR_MTE_TCF_NONE (0UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_SYNC (1UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_ASYNC (2UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TCF_MASK (3UL << PR_MTE_TCF_SHIFT)
# define PR_MTE_TAG_SHIFT 3
# define PR_MTE_TAG_MASK (0xffffUL << PR_MTE_TAG_SHIFT)
/*
* Insert a random logical tag into the given pointer.
*/
#define insert_random_tag(ptr) ({ \
uint64_t __val; \
asm("irg %0, %1" : "=r" (__val) : "r" (ptr)); \
__val; \
})
/*
* Set the allocation tag on the destination address.
*/
#define set_tag(tagged_addr) do { \
asm volatile("stg %0, [%0]" : : "r" (tagged_addr) : "memory"); \
} while (0)
int main()
{
unsigned char *a;
unsigned long page_sz = sysconf(_SC_PAGESIZE);
unsigned long hwcap2 = getauxval(AT_HWCAP2);
/* check if MTE is present */
if (!(hwcap2 & HWCAP2_MTE))
return EXIT_FAILURE;
/*
* Enable the tagged address ABI, synchronous MTE tag check faults and
* allow all non-zero tags in the randomly generated set.
*/
if (prctl(PR_SET_TAGGED_ADDR_CTRL,
PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (0xfffe << PR_MTE_TAG_SHIFT),
0, 0, 0)) {
perror("prctl() failed");
return EXIT_FAILURE;
}
a = mmap(0, page_sz, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (a == MAP_FAILED) {
perror("mmap() failed");
return EXIT_FAILURE;
}
/*
* Enable MTE on the above anonymous mmap. The flag could be passed
* directly to mmap() and skip this step.
*/
if (mprotect(a, page_sz, PROT_READ | PROT_WRITE | PROT_MTE)) {
perror("mprotect() failed");
return EXIT_FAILURE;
}
/* access with the default tag (0) */
a[0] = 1;
a[1] = 2;
printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);
/* set the logical and allocation tags */
a = (unsigned char *)insert_random_tag(a);
set_tag(a);
printf("%p\n", a);
/* non-zero tag access */
a[0] = 3;
printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);
/*
* If MTE is enabled correctly the next instruction will generate an
* exception.
*/
printf("Expecting SIGSEGV...\n");
a[16] = 0xdd;
/* this should not be printed in the PR_MTE_TCF_SYNC mode */
printf("...haven't got one\n");
return EXIT_FAILURE;
}

View file

@ -0,0 +1,57 @@
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
# Copyright 2020 Arm Ltd.
%YAML 1.2
---
$id: http://devicetree.org/schemas/perf/arm,cmn.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Arm CMN (Coherent Mesh Network) Performance Monitors
maintainers:
- Robin Murphy <robin.murphy@arm.com>
properties:
compatible:
const: arm,cmn-600
reg:
items:
- description: Physical address of the base (PERIPHBASE) and
size (up to 64MB) of the configuration address space.
interrupts:
minItems: 1
maxItems: 4
items:
- description: Overflow interrupt for DTC0
- description: Overflow interrupt for DTC1
- description: Overflow interrupt for DTC2
- description: Overflow interrupt for DTC3
description: One interrupt for each DTC domain implemented must
be specified, in order. DTC0 is always present.
arm,root-node:
$ref: /schemas/types.yaml#/definitions/uint32
description: Offset from PERIPHBASE of the configuration
discovery node (see TRM definition of ROOTNODEBASE).
required:
- compatible
- reg
- interrupts
- arm,root-node
additionalProperties: false
examples:
- |
#include <dt-bindings/interrupt-controller/arm-gic.h>
#include <dt-bindings/interrupt-controller/irq.h>
pmu@50000000 {
compatible = "arm,cmn-600";
reg = <0x50000000 0x4000000>;
/* 4x2 mesh with one DTC, and CFG node at 0,1,1,0 */
interrupts = <GIC_SPI 46 IRQ_TYPE_LEVEL_HIGH>;
arm,root-node = <0x104000>;
};
...

View file

@ -54,9 +54,9 @@ these functions (see arch/arm{,64}/include/asm/virt.h):
x3 = x1's value when entering the next payload (arm64)
x4 = x2's value when entering the next payload (arm64)
Mask all exceptions, disable the MMU, move the arguments into place
(arm64 only), and jump to the restart address while at HYP/EL2. This
hypercall is not expected to return to its caller.
Mask all exceptions, disable the MMU, clear I+D bits, move the arguments
into place (arm64 only), and jump to the restart address while at HYP/EL2.
This hypercall is not expected to return to its caller.
Any other value of r0/x0 triggers a hypervisor-specific handling,
which is not documented here.

View file

@ -29,6 +29,7 @@ config ARM64
select ARCH_HAS_SETUP_DMA_OPS
select ARCH_HAS_SET_DIRECT_MAP
select ARCH_HAS_SET_MEMORY
select ARCH_STACKWALK
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
@ -211,12 +212,18 @@ config ARM64_PAGE_SHIFT
default 14 if ARM64_16K_PAGES
default 12
config ARM64_CONT_SHIFT
config ARM64_CONT_PTE_SHIFT
int
default 5 if ARM64_64K_PAGES
default 7 if ARM64_16K_PAGES
default 4
config ARM64_CONT_PMD_SHIFT
int
default 5 if ARM64_64K_PAGES
default 5 if ARM64_16K_PAGES
default 4
config ARCH_MMAP_RND_BITS_MIN
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
@ -1165,32 +1172,6 @@ config UNMAP_KERNEL_AT_EL0
If unsure, say Y.
config HARDEN_BRANCH_PREDICTOR
bool "Harden the branch predictor against aliasing attacks" if EXPERT
default y
help
Speculation attacks against some high-performance processors rely on
being able to manipulate the branch predictor for a victim context by
executing aliasing branches in the attacker context. Such attacks
can be partially mitigated against by clearing internal branch
predictor state and limiting the prediction logic in some situations.
This config option will take CPU-specific actions to harden the
branch predictor against aliasing attacks and may rely on specific
instruction sequences or control bits being set by the system
firmware.
If unsure, say Y.
config ARM64_SSBD
bool "Speculative Store Bypass Disable" if EXPERT
default y
help
This enables mitigation of the bypassing of previous stores
by speculative loads.
If unsure, say Y.
config RODATA_FULL_DEFAULT_ENABLED
bool "Apply r/o permissions of VM areas also to their linear aliases"
default y
@ -1664,6 +1645,39 @@ config ARCH_RANDOM
provides a high bandwidth, cryptographically secure
hardware random number generator.
config ARM64_AS_HAS_MTE
# Initial support for MTE went in binutils 2.32.0, checked with
# ".arch armv8.5-a+memtag" below. However, this was incomplete
# as a late addition to the final architecture spec (LDGM/STGM)
# is only supported in the newer 2.32.x and 2.33 binutils
# versions, hence the extra "stgm" instruction check below.
def_bool $(as-instr,.arch armv8.5-a+memtag\nstgm xzr$(comma)[x0])
config ARM64_MTE
bool "Memory Tagging Extension support"
default y
depends on ARM64_AS_HAS_MTE && ARM64_TAGGED_ADDR_ABI
select ARCH_USES_HIGH_VMA_FLAGS
help
Memory Tagging (part of the ARMv8.5 Extensions) provides
architectural support for run-time, always-on detection of
various classes of memory error to aid with software debugging
to eliminate vulnerabilities arising from memory-unsafe
languages.
This option enables the support for the Memory Tagging
Extension at EL0 (i.e. for userspace).
Selecting this option allows the feature to be detected at
runtime. Any secondary CPU not implementing this feature will
not be allowed a late bring-up.
Userspace binaries that want to use this feature must
explicitly opt in. The mechanism for the userspace is
described in:
Documentation/arm64/memory-tagging-extension.rst.
endmenu
config ARM64_SVE
@ -1876,6 +1890,10 @@ config ARCH_ENABLE_HUGEPAGE_MIGRATION
def_bool y
depends on HUGETLB_PAGE && MIGRATION
config ARCH_ENABLE_THP_MIGRATION
def_bool y
depends on TRANSPARENT_HUGEPAGE
menu "Power management options"
source "kernel/power/Kconfig"

View file

@ -11,7 +11,6 @@
# Copyright (C) 1995-2001 by Russell King
LDFLAGS_vmlinux :=--no-undefined -X
CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
ifeq ($(CONFIG_RELOCATABLE), y)
# Pass --no-apply-dynamic-relocs to restore pre-binutils-2.27 behaviour
@ -132,9 +131,6 @@ endif
# Default value
head-y := arch/arm64/kernel/head.o
# The byte offset of the kernel image in RAM from the start of RAM.
TEXT_OFFSET := 0x0
ifeq ($(CONFIG_KASAN_SW_TAGS), y)
KASAN_SHADOW_SCALE_SHIFT := 4
else
@ -145,8 +141,6 @@ KBUILD_CFLAGS += -DKASAN_SHADOW_SCALE_SHIFT=$(KASAN_SHADOW_SCALE_SHIFT)
KBUILD_CPPFLAGS += -DKASAN_SHADOW_SCALE_SHIFT=$(KASAN_SHADOW_SCALE_SHIFT)
KBUILD_AFLAGS += -DKASAN_SHADOW_SCALE_SHIFT=$(KASAN_SHADOW_SCALE_SHIFT)
export TEXT_OFFSET
core-y += arch/arm64/
libs-y := arch/arm64/lib/ $(libs-y)
libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a

View file

@ -79,10 +79,5 @@ arch_get_random_seed_long_early(unsigned long *v)
}
#define arch_get_random_seed_long_early arch_get_random_seed_long_early
#else
static inline bool __arm64_rndr(unsigned long *v) { return false; }
static inline bool __init __early_cpu_has_rndr(void) { return false; }
#endif /* CONFIG_ARCH_RANDOM */
#endif /* _ASM_ARCHRANDOM_H */

View file

@ -13,8 +13,7 @@
#define MAX_FDT_SIZE SZ_2M
/*
* arm64 requires the kernel image to placed
* TEXT_OFFSET bytes beyond a 2 MB aligned base
* arm64 requires the kernel image to placed at a 2 MB aligned base address
*/
#define MIN_KIMG_ALIGN SZ_2M

View file

@ -21,7 +21,7 @@
* mechanism for doing so, tests whether it is possible to boot
* the given CPU.
* @cpu_boot: Boots a cpu into the kernel.
* @cpu_postboot: Optionally, perform any post-boot cleanup or necesary
* @cpu_postboot: Optionally, perform any post-boot cleanup or necessary
* synchronisation. Called from the cpu being booted.
* @cpu_can_disable: Determines whether a CPU can be disabled based on
* mechanism-specific information.

View file

@ -31,13 +31,13 @@
#define ARM64_HAS_DCPOP 21
#define ARM64_SVE 22
#define ARM64_UNMAP_KERNEL_AT_EL0 23
#define ARM64_HARDEN_BRANCH_PREDICTOR 24
#define ARM64_SPECTRE_V2 24
#define ARM64_HAS_RAS_EXTN 25
#define ARM64_WORKAROUND_843419 26
#define ARM64_HAS_CACHE_IDC 27
#define ARM64_HAS_CACHE_DIC 28
#define ARM64_HW_DBM 29
#define ARM64_SSBD 30
#define ARM64_SPECTRE_V4 30
#define ARM64_MISMATCHED_CACHE_TYPE 31
#define ARM64_HAS_STAGE2_FWB 32
#define ARM64_HAS_CRC32 33
@ -64,7 +64,8 @@
#define ARM64_BTI 54
#define ARM64_HAS_ARMv8_4_TTL 55
#define ARM64_HAS_TLB_RANGE 56
#define ARM64_MTE 57
#define ARM64_NCAPS 57
#define ARM64_NCAPS 58
#endif /* __ASM_CPUCAPS_H */

View file

@ -358,7 +358,7 @@ static inline int cpucap_default_scope(const struct arm64_cpu_capabilities *cap)
}
/*
* Generic helper for handling capabilties with multiple (match,enable) pairs
* Generic helper for handling capabilities with multiple (match,enable) pairs
* of call backs, sharing the same capability bit.
* Iterate over each entry to see if at least one matches.
*/
@ -681,6 +681,12 @@ static __always_inline bool system_uses_irq_prio_masking(void)
cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
}
static inline bool system_supports_mte(void)
{
return IS_ENABLED(CONFIG_ARM64_MTE) &&
cpus_have_const_cap(ARM64_MTE);
}
static inline bool system_has_prio_mask_debugging(void)
{
return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
@ -698,30 +704,6 @@ static inline bool system_supports_tlb_range(void)
cpus_have_const_cap(ARM64_HAS_TLB_RANGE);
}
#define ARM64_BP_HARDEN_UNKNOWN -1
#define ARM64_BP_HARDEN_WA_NEEDED 0
#define ARM64_BP_HARDEN_NOT_REQUIRED 1
int get_spectre_v2_workaround_state(void);
#define ARM64_SSBD_UNKNOWN -1
#define ARM64_SSBD_FORCE_DISABLE 0
#define ARM64_SSBD_KERNEL 1
#define ARM64_SSBD_FORCE_ENABLE 2
#define ARM64_SSBD_MITIGATED 3
static inline int arm64_get_ssbd_state(void)
{
#ifdef CONFIG_ARM64_SSBD
extern int ssbd_state;
return ssbd_state;
#else
return ARM64_SSBD_UNKNOWN;
#endif
}
void arm64_set_ssbd_mitigation(bool state);
extern int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
static inline u32 id_aa64mmfr0_parange_to_phys_shift(int parange)

View file

@ -35,7 +35,9 @@
#define ESR_ELx_EC_SYS64 (0x18)
#define ESR_ELx_EC_SVE (0x19)
#define ESR_ELx_EC_ERET (0x1a) /* EL2 only */
/* Unallocated EC: 0x1b - 0x1E */
/* Unallocated EC: 0x1B */
#define ESR_ELx_EC_FPAC (0x1C) /* EL1 and above */
/* Unallocated EC: 0x1D - 0x1E */
#define ESR_ELx_EC_IMP_DEF (0x1f) /* EL3 only */
#define ESR_ELx_EC_IABT_LOW (0x20)
#define ESR_ELx_EC_IABT_CUR (0x21)

View file

@ -47,4 +47,5 @@ void bad_el0_sync(struct pt_regs *regs, int reason, unsigned int esr);
void do_cp15instr(unsigned int esr, struct pt_regs *regs);
void do_el0_svc(struct pt_regs *regs);
void do_el0_svc_compat(struct pt_regs *regs);
void do_ptrauth_fault(struct pt_regs *regs, unsigned int esr);
#endif /* __ASM_EXCEPTION_H */

View file

@ -22,6 +22,15 @@ struct exception_table_entry
#define ARCH_HAS_RELATIVE_EXTABLE
static inline bool in_bpf_jit(struct pt_regs *regs)
{
if (!IS_ENABLED(CONFIG_BPF_JIT))
return false;
return regs->pc >= BPF_JIT_REGION_START &&
regs->pc < BPF_JIT_REGION_END;
}
#ifdef CONFIG_BPF_JIT
int arm64_bpf_fixup_exception(const struct exception_table_entry *ex,
struct pt_regs *regs);

View file

@ -69,6 +69,9 @@ static inline void *sve_pffr(struct thread_struct *thread)
extern void sve_save_state(void *state, u32 *pfpsr);
extern void sve_load_state(void const *state, u32 const *pfpsr,
unsigned long vq_minus_1);
extern void sve_flush_live(void);
extern void sve_load_from_fpsimd_state(struct user_fpsimd_state const *state,
unsigned long vq_minus_1);
extern unsigned int sve_get_vl(void);
struct arm64_cpu_capabilities;

View file

@ -164,25 +164,59 @@
| ((\np) << 5)
.endm
/* PFALSE P\np.B */
.macro _sve_pfalse np
_sve_check_preg \np
.inst 0x2518e400 \
| (\np)
.endm
.macro __for from:req, to:req
.if (\from) == (\to)
_for__body \from
_for__body %\from
.else
__for \from, (\from) + ((\to) - (\from)) / 2
__for (\from) + ((\to) - (\from)) / 2 + 1, \to
__for %\from, %((\from) + ((\to) - (\from)) / 2)
__for %((\from) + ((\to) - (\from)) / 2 + 1), %\to
.endif
.endm
.macro _for var:req, from:req, to:req, insn:vararg
.macro _for__body \var:req
.noaltmacro
\insn
.altmacro
.endm
.altmacro
__for \from, \to
.noaltmacro
.purgem _for__body
.endm
/* Update ZCR_EL1.LEN with the new VQ */
.macro sve_load_vq xvqminus1, xtmp, xtmp2
mrs_s \xtmp, SYS_ZCR_EL1
bic \xtmp2, \xtmp, ZCR_ELx_LEN_MASK
orr \xtmp2, \xtmp2, \xvqminus1
cmp \xtmp2, \xtmp
b.eq 921f
msr_s SYS_ZCR_EL1, \xtmp2 //self-synchronising
921:
.endm
/* Preserve the first 128-bits of Znz and zero the rest. */
.macro _sve_flush_z nz
_sve_check_zreg \nz
mov v\nz\().16b, v\nz\().16b
.endm
.macro sve_flush
_for n, 0, 31, _sve_flush_z \n
_for n, 0, 15, _sve_pfalse \n
_sve_wrffr 0
.endm
.macro sve_save nxbase, xpfpsr, nxtmp
_for n, 0, 31, _sve_str_v \n, \nxbase, \n - 34
_for n, 0, 15, _sve_str_p \n, \nxbase, \n - 16
@ -197,13 +231,7 @@
.endm
.macro sve_load nxbase, xpfpsr, xvqminus1, nxtmp, xtmp2
mrs_s x\nxtmp, SYS_ZCR_EL1
bic \xtmp2, x\nxtmp, ZCR_ELx_LEN_MASK
orr \xtmp2, \xtmp2, \xvqminus1
cmp \xtmp2, x\nxtmp
b.eq 921f
msr_s SYS_ZCR_EL1, \xtmp2 // self-synchronising
921:
sve_load_vq \xvqminus1, x\nxtmp, \xtmp2
_for n, 0, 31, _sve_ldr_v \n, \nxbase, \n - 34
_sve_ldr_p 0, \nxbase
_sve_wrffr 0

View file

@ -8,18 +8,27 @@
#include <uapi/asm/hwcap.h>
#include <asm/cpufeature.h>
#define COMPAT_HWCAP_SWP (1 << 0)
#define COMPAT_HWCAP_HALF (1 << 1)
#define COMPAT_HWCAP_THUMB (1 << 2)
#define COMPAT_HWCAP_26BIT (1 << 3)
#define COMPAT_HWCAP_FAST_MULT (1 << 4)
#define COMPAT_HWCAP_FPA (1 << 5)
#define COMPAT_HWCAP_VFP (1 << 6)
#define COMPAT_HWCAP_EDSP (1 << 7)
#define COMPAT_HWCAP_JAVA (1 << 8)
#define COMPAT_HWCAP_IWMMXT (1 << 9)
#define COMPAT_HWCAP_CRUNCH (1 << 10)
#define COMPAT_HWCAP_THUMBEE (1 << 11)
#define COMPAT_HWCAP_NEON (1 << 12)
#define COMPAT_HWCAP_VFPv3 (1 << 13)
#define COMPAT_HWCAP_VFPV3D16 (1 << 14)
#define COMPAT_HWCAP_TLS (1 << 15)
#define COMPAT_HWCAP_VFPv4 (1 << 16)
#define COMPAT_HWCAP_IDIVA (1 << 17)
#define COMPAT_HWCAP_IDIVT (1 << 18)
#define COMPAT_HWCAP_IDIV (COMPAT_HWCAP_IDIVA|COMPAT_HWCAP_IDIVT)
#define COMPAT_HWCAP_VFPD32 (1 << 19)
#define COMPAT_HWCAP_LPAE (1 << 20)
#define COMPAT_HWCAP_EVTSTRM (1 << 21)
@ -95,7 +104,7 @@
#define KERNEL_HWCAP_DGH __khwcap2_feature(DGH)
#define KERNEL_HWCAP_RNG __khwcap2_feature(RNG)
#define KERNEL_HWCAP_BTI __khwcap2_feature(BTI)
/* reserved for KERNEL_HWCAP_MTE __khwcap2_feature(MTE) */
#define KERNEL_HWCAP_MTE __khwcap2_feature(MTE)
/*
* This yields a mask that user programs can use to figure out what

View file

@ -359,9 +359,13 @@ __AARCH64_INSN_FUNCS(brk, 0xFFE0001F, 0xD4200000)
__AARCH64_INSN_FUNCS(exception, 0xFF000000, 0xD4000000)
__AARCH64_INSN_FUNCS(hint, 0xFFFFF01F, 0xD503201F)
__AARCH64_INSN_FUNCS(br, 0xFFFFFC1F, 0xD61F0000)
__AARCH64_INSN_FUNCS(br_auth, 0xFEFFF800, 0xD61F0800)
__AARCH64_INSN_FUNCS(blr, 0xFFFFFC1F, 0xD63F0000)
__AARCH64_INSN_FUNCS(blr_auth, 0xFEFFF800, 0xD63F0800)
__AARCH64_INSN_FUNCS(ret, 0xFFFFFC1F, 0xD65F0000)
__AARCH64_INSN_FUNCS(ret_auth, 0xFFFFFBFF, 0xD65F0BFF)
__AARCH64_INSN_FUNCS(eret, 0xFFFFFFFF, 0xD69F03E0)
__AARCH64_INSN_FUNCS(eret_auth, 0xFFFFFBFF, 0xD69F0BFF)
__AARCH64_INSN_FUNCS(mrs, 0xFFF00000, 0xD5300000)
__AARCH64_INSN_FUNCS(msr_imm, 0xFFF8F01F, 0xD500401F)
__AARCH64_INSN_FUNCS(msr_reg, 0xFFF00000, 0xD5100000)

View file

@ -86,7 +86,7 @@
+ EARLY_PGDS((vstart), (vend)) /* each PGDIR needs a next level page table */ \
+ EARLY_PUDS((vstart), (vend)) /* each PUD needs a next level page table */ \
+ EARLY_PMDS((vstart), (vend))) /* each PMD needs a next level page table */
#define INIT_DIR_SIZE (PAGE_SIZE * EARLY_PAGES(KIMAGE_VADDR + TEXT_OFFSET, _end))
#define INIT_DIR_SIZE (PAGE_SIZE * EARLY_PAGES(KIMAGE_VADDR, _end))
#define IDMAP_DIR_SIZE (IDMAP_PGTABLE_LEVELS * PAGE_SIZE)
#ifdef CONFIG_ARM64_SW_TTBR0_PAN

View file

@ -12,6 +12,7 @@
#include <asm/types.h>
/* Hyp Configuration Register (HCR) bits */
#define HCR_ATA (UL(1) << 56)
#define HCR_FWB (UL(1) << 46)
#define HCR_API (UL(1) << 41)
#define HCR_APK (UL(1) << 40)
@ -66,7 +67,7 @@
* TWI: Trap WFI
* TIDCP: Trap L2CTLR/L2ECTLR
* BSU_IS: Upgrade barriers to the inner shareable domain
* FB: Force broadcast of all maintainance operations
* FB: Force broadcast of all maintenance operations
* AMO: Override CPSR.A and enable signaling with VA
* IMO: Override CPSR.I and enable signaling with VI
* FMO: Override CPSR.F and enable signaling with VF
@ -78,7 +79,7 @@
HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
HCR_FMO | HCR_IMO | HCR_PTW )
#define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK)
#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA)
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
/* TCR_EL2 Registers bits */

View file

@ -9,9 +9,6 @@
#include <asm/virt.h>
#define VCPU_WORKAROUND_2_FLAG_SHIFT 0
#define VCPU_WORKAROUND_2_FLAG (_AC(1, UL) << VCPU_WORKAROUND_2_FLAG_SHIFT)
#define ARM_EXIT_WITH_SERROR_BIT 31
#define ARM_EXCEPTION_CODE(x) ((x) & ~(1U << ARM_EXIT_WITH_SERROR_BIT))
#define ARM_EXCEPTION_IS_TRAP(x) (ARM_EXCEPTION_CODE((x)) == ARM_EXCEPTION_TRAP)
@ -102,11 +99,9 @@ DECLARE_KVM_HYP_SYM(__kvm_hyp_vector);
#define __kvm_hyp_init CHOOSE_NVHE_SYM(__kvm_hyp_init)
#define __kvm_hyp_vector CHOOSE_HYP_SYM(__kvm_hyp_vector)
#ifdef CONFIG_KVM_INDIRECT_VECTORS
extern atomic_t arm64_el2_vector_last_slot;
DECLARE_KVM_HYP_SYM(__bp_harden_hyp_vecs);
#define __bp_harden_hyp_vecs CHOOSE_HYP_SYM(__bp_harden_hyp_vecs)
#endif
extern void __kvm_flush_vm_context(void);
extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,

View file

@ -391,20 +391,6 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
}
static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
{
return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
}
static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
bool flag)
{
if (flag)
vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
else
vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
}
static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
{
if (vcpu_mode_is_32bit(vcpu)) {

View file

@ -631,46 +631,6 @@ static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
static inline void kvm_clr_pmu_events(u32 clr) {}
#endif
#define KVM_BP_HARDEN_UNKNOWN -1
#define KVM_BP_HARDEN_WA_NEEDED 0
#define KVM_BP_HARDEN_NOT_REQUIRED 1
static inline int kvm_arm_harden_branch_predictor(void)
{
switch (get_spectre_v2_workaround_state()) {
case ARM64_BP_HARDEN_WA_NEEDED:
return KVM_BP_HARDEN_WA_NEEDED;
case ARM64_BP_HARDEN_NOT_REQUIRED:
return KVM_BP_HARDEN_NOT_REQUIRED;
case ARM64_BP_HARDEN_UNKNOWN:
default:
return KVM_BP_HARDEN_UNKNOWN;
}
}
#define KVM_SSBD_UNKNOWN -1
#define KVM_SSBD_FORCE_DISABLE 0
#define KVM_SSBD_KERNEL 1
#define KVM_SSBD_FORCE_ENABLE 2
#define KVM_SSBD_MITIGATED 3
static inline int kvm_arm_have_ssbd(void)
{
switch (arm64_get_ssbd_state()) {
case ARM64_SSBD_FORCE_DISABLE:
return KVM_SSBD_FORCE_DISABLE;
case ARM64_SSBD_KERNEL:
return KVM_SSBD_KERNEL;
case ARM64_SSBD_FORCE_ENABLE:
return KVM_SSBD_FORCE_ENABLE;
case ARM64_SSBD_MITIGATED:
return KVM_SSBD_MITIGATED;
case ARM64_SSBD_UNKNOWN:
default:
return KVM_SSBD_UNKNOWN;
}
}
void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
void kvm_vcpu_put_sysregs_vhe(struct kvm_vcpu *vcpu);

View file

@ -9,6 +9,7 @@
#include <asm/page.h>
#include <asm/memory.h>
#include <asm/mmu.h>
#include <asm/cpufeature.h>
/*
@ -430,19 +431,17 @@ static inline int kvm_write_guest_lock(struct kvm *kvm, gpa_t gpa,
return ret;
}
#ifdef CONFIG_KVM_INDIRECT_VECTORS
/*
* EL2 vectors can be mapped and rerouted in a number of ways,
* depending on the kernel configuration and CPU present:
*
* - If the CPU has the ARM64_HARDEN_BRANCH_PREDICTOR cap, the
* hardening sequence is placed in one of the vector slots, which is
* executed before jumping to the real vectors.
* - If the CPU is affected by Spectre-v2, the hardening sequence is
* placed in one of the vector slots, which is executed before jumping
* to the real vectors.
*
* - If the CPU has both the ARM64_HARDEN_EL2_VECTORS cap and the
* ARM64_HARDEN_BRANCH_PREDICTOR cap, the slot containing the
* hardening sequence is mapped next to the idmap page, and executed
* before jumping to the real vectors.
* - If the CPU also has the ARM64_HARDEN_EL2_VECTORS cap, the slot
* containing the hardening sequence is mapped next to the idmap page,
* and executed before jumping to the real vectors.
*
* - If the CPU only has the ARM64_HARDEN_EL2_VECTORS cap, then an
* empty slot is selected, mapped next to the idmap page, and
@ -452,19 +451,16 @@ static inline int kvm_write_guest_lock(struct kvm *kvm, gpa_t gpa,
* VHE, as we don't have hypervisor-specific mappings. If the system
* is VHE and yet selects this capability, it will be ignored.
*/
#include <asm/mmu.h>
extern void *__kvm_bp_vect_base;
extern int __kvm_harden_el2_vector_slot;
/* This is called on both VHE and !VHE systems */
static inline void *kvm_get_hyp_vector(void)
{
struct bp_hardening_data *data = arm64_get_bp_hardening_data();
void *vect = kern_hyp_va(kvm_ksym_ref(__kvm_hyp_vector));
int slot = -1;
if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) && data->fn) {
if (cpus_have_const_cap(ARM64_SPECTRE_V2) && data->fn) {
vect = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs));
slot = data->hyp_vectors_slot;
}
@ -481,76 +477,6 @@ static inline void *kvm_get_hyp_vector(void)
return vect;
}
/* This is only called on a !VHE system */
static inline int kvm_map_vectors(void)
{
/*
* HBP = ARM64_HARDEN_BRANCH_PREDICTOR
* HEL2 = ARM64_HARDEN_EL2_VECTORS
*
* !HBP + !HEL2 -> use direct vectors
* HBP + !HEL2 -> use hardened vectors in place
* !HBP + HEL2 -> allocate one vector slot and use exec mapping
* HBP + HEL2 -> use hardened vertors and use exec mapping
*/
if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR)) {
__kvm_bp_vect_base = kvm_ksym_ref(__bp_harden_hyp_vecs);
__kvm_bp_vect_base = kern_hyp_va(__kvm_bp_vect_base);
}
if (cpus_have_const_cap(ARM64_HARDEN_EL2_VECTORS)) {
phys_addr_t vect_pa = __pa_symbol(__bp_harden_hyp_vecs);
unsigned long size = __BP_HARDEN_HYP_VECS_SZ;
/*
* Always allocate a spare vector slot, as we don't
* know yet which CPUs have a BP hardening slot that
* we can reuse.
*/
__kvm_harden_el2_vector_slot = atomic_inc_return(&arm64_el2_vector_last_slot);
BUG_ON(__kvm_harden_el2_vector_slot >= BP_HARDEN_EL2_SLOTS);
return create_hyp_exec_mappings(vect_pa, size,
&__kvm_bp_vect_base);
}
return 0;
}
#else
static inline void *kvm_get_hyp_vector(void)
{
return kern_hyp_va(kvm_ksym_ref(__kvm_hyp_vector));
}
static inline int kvm_map_vectors(void)
{
return 0;
}
#endif
#ifdef CONFIG_ARM64_SSBD
DECLARE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
static inline int hyp_map_aux_data(void)
{
int cpu, err;
for_each_possible_cpu(cpu) {
u64 *ptr;
ptr = per_cpu_ptr(&arm64_ssbd_callback_required, cpu);
err = create_hyp_mappings(ptr, ptr + 1, PAGE_HYP);
if (err)
return err;
}
return 0;
}
#else
static inline int hyp_map_aux_data(void)
{
return 0;
}
#endif
#define kvm_phys_to_vttbr(addr) phys_to_ttbr(addr)
/*

View file

@ -126,13 +126,18 @@
/*
* Memory types available.
*
* IMPORTANT: MT_NORMAL must be index 0 since vm_get_page_prot() may 'or' in
* the MT_NORMAL_TAGGED memory type for PROT_MTE mappings. Note
* that protection_map[] only contains MT_NORMAL attributes.
*/
#define MT_DEVICE_nGnRnE 0
#define MT_DEVICE_nGnRE 1
#define MT_DEVICE_GRE 2
#define MT_NORMAL_NC 3
#define MT_NORMAL 4
#define MT_NORMAL_WT 5
#define MT_NORMAL 0
#define MT_NORMAL_TAGGED 1
#define MT_NORMAL_NC 2
#define MT_NORMAL_WT 3
#define MT_DEVICE_nGnRnE 4
#define MT_DEVICE_nGnRE 5
#define MT_DEVICE_GRE 6
/*
* Memory types for Stage-2 translation
@ -169,7 +174,7 @@ extern s64 memstart_addr;
/* PHYS_OFFSET - the physical address of the start of memory. */
#define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
/* the virtual base of the kernel image (minus TEXT_OFFSET) */
/* the virtual base of the kernel image */
extern u64 kimage_vaddr;
/* the offset between the kernel virtual and physical mappings */

View file

@ -9,16 +9,53 @@
static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
unsigned long pkey __always_unused)
{
if (system_supports_bti() && (prot & PROT_BTI))
return VM_ARM64_BTI;
unsigned long ret = 0;
return 0;
if (system_supports_bti() && (prot & PROT_BTI))
ret |= VM_ARM64_BTI;
if (system_supports_mte() && (prot & PROT_MTE))
ret |= VM_MTE;
return ret;
}
#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
{
/*
* Only allow MTE on anonymous mappings as these are guaranteed to be
* backed by tags-capable memory. The vm_flags may be overridden by a
* filesystem supporting MTE (RAM-based).
*/
if (system_supports_mte() && (flags & MAP_ANONYMOUS))
return VM_MTE_ALLOWED;
return 0;
}
#define arch_calc_vm_flag_bits(flags) arch_calc_vm_flag_bits(flags)
static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
{
return (vm_flags & VM_ARM64_BTI) ? __pgprot(PTE_GP) : __pgprot(0);
pteval_t prot = 0;
if (vm_flags & VM_ARM64_BTI)
prot |= PTE_GP;
/*
* There are two conditions required for returning a Normal Tagged
* memory type: (1) the user requested it via PROT_MTE passed to
* mmap() or mprotect() and (2) the corresponding vma supports MTE. We
* register (1) as VM_MTE in the vma->vm_flags and (2) as
* VM_MTE_ALLOWED. Note that the latter can only be set during the
* mmap() call since mprotect() does not accept MAP_* flags.
* Checking for VM_MTE only is sufficient since arch_validate_flags()
* does not permit (VM_MTE & !VM_MTE_ALLOWED).
*/
if (vm_flags & VM_MTE)
prot |= PTE_ATTRINDX(MT_NORMAL_TAGGED);
return __pgprot(prot);
}
#define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
@ -30,8 +67,21 @@ static inline bool arch_validate_prot(unsigned long prot,
if (system_supports_bti())
supported |= PROT_BTI;
if (system_supports_mte())
supported |= PROT_MTE;
return (prot & ~supported) == 0;
}
#define arch_validate_prot(prot, addr) arch_validate_prot(prot, addr)
static inline bool arch_validate_flags(unsigned long vm_flags)
{
if (!system_supports_mte())
return true;
/* only allow VM_MTE if VM_MTE_ALLOWED has been set previously */
return !(vm_flags & VM_MTE) || (vm_flags & VM_MTE_ALLOWED);
}
#define arch_validate_flags(vm_flags) arch_validate_flags(vm_flags)
#endif /* ! __ASM_MMAN_H__ */

View file

@ -17,11 +17,14 @@
#ifndef __ASSEMBLY__
#include <linux/refcount.h>
typedef struct {
atomic64_t id;
#ifdef CONFIG_COMPAT
void *sigpage;
#endif
refcount_t pinned;
void *vdso;
unsigned long flags;
} mm_context_t;
@ -45,7 +48,6 @@ struct bp_hardening_data {
bp_hardening_cb_t fn;
};
#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
@ -57,21 +59,13 @@ static inline void arm64_apply_bp_hardening(void)
{
struct bp_hardening_data *d;
if (!cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR))
if (!cpus_have_const_cap(ARM64_SPECTRE_V2))
return;
d = arm64_get_bp_hardening_data();
if (d->fn)
d->fn();
}
#else
static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
{
return NULL;
}
static inline void arm64_apply_bp_hardening(void) { }
#endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
extern void arm64_memblock_init(void);
extern void paging_init(void);

View file

@ -177,7 +177,13 @@ static inline void cpu_replace_ttbr1(pgd_t *pgdp)
#define destroy_context(mm) do { } while(0)
void check_and_switch_context(struct mm_struct *mm);
#define init_new_context(tsk,mm) ({ atomic64_set(&(mm)->context.id, 0); 0; })
static inline int
init_new_context(struct task_struct *tsk, struct mm_struct *mm)
{
atomic64_set(&mm->context.id, 0);
refcount_set(&mm->context.pinned, 0);
return 0;
}
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
static inline void update_saved_ttbr0(struct task_struct *tsk,
@ -248,6 +254,9 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
void verify_cpu_asid_bits(void);
void post_ttbr_update_workaround(void);
unsigned long arm64_mm_context_get(struct mm_struct *mm);
void arm64_mm_context_put(struct mm_struct *mm);
#endif /* !__ASSEMBLY__ */
#endif /* !__ASM_MMU_CONTEXT_H */

View file

@ -0,0 +1,86 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (C) 2020 ARM Ltd.
*/
#ifndef __ASM_MTE_H
#define __ASM_MTE_H
#define MTE_GRANULE_SIZE UL(16)
#define MTE_GRANULE_MASK (~(MTE_GRANULE_SIZE - 1))
#define MTE_TAG_SHIFT 56
#define MTE_TAG_SIZE 4
#ifndef __ASSEMBLY__
#include <linux/page-flags.h>
#include <asm/pgtable-types.h>
void mte_clear_page_tags(void *addr);
unsigned long mte_copy_tags_from_user(void *to, const void __user *from,
unsigned long n);
unsigned long mte_copy_tags_to_user(void __user *to, void *from,
unsigned long n);
int mte_save_tags(struct page *page);
void mte_save_page_tags(const void *page_addr, void *tag_storage);
bool mte_restore_tags(swp_entry_t entry, struct page *page);
void mte_restore_page_tags(void *page_addr, const void *tag_storage);
void mte_invalidate_tags(int type, pgoff_t offset);
void mte_invalidate_tags_area(int type);
void *mte_allocate_tag_storage(void);
void mte_free_tag_storage(char *storage);
#ifdef CONFIG_ARM64_MTE
/* track which pages have valid allocation tags */
#define PG_mte_tagged PG_arch_2
void mte_sync_tags(pte_t *ptep, pte_t pte);
void mte_copy_page_tags(void *kto, const void *kfrom);
void flush_mte_state(void);
void mte_thread_switch(struct task_struct *next);
void mte_suspend_exit(void);
long set_mte_ctrl(struct task_struct *task, unsigned long arg);
long get_mte_ctrl(struct task_struct *task);
int mte_ptrace_copy_tags(struct task_struct *child, long request,
unsigned long addr, unsigned long data);
#else
/* unused if !CONFIG_ARM64_MTE, silence the compiler */
#define PG_mte_tagged 0
static inline void mte_sync_tags(pte_t *ptep, pte_t pte)
{
}
static inline void mte_copy_page_tags(void *kto, const void *kfrom)
{
}
static inline void flush_mte_state(void)
{
}
static inline void mte_thread_switch(struct task_struct *next)
{
}
static inline void mte_suspend_exit(void)
{
}
static inline long set_mte_ctrl(struct task_struct *task, unsigned long arg)
{
return 0;
}
static inline long get_mte_ctrl(struct task_struct *task)
{
return 0;
}
static inline int mte_ptrace_copy_tags(struct task_struct *child,
long request, unsigned long addr,
unsigned long data)
{
return -EIO;
}
#endif
#endif /* __ASSEMBLY__ */
#endif /* __ASM_MTE_H */

View file

@ -25,6 +25,9 @@ const struct cpumask *cpumask_of_node(int node);
/* Returns a pointer to the cpumask of CPUs on Node 'node'. */
static inline const struct cpumask *cpumask_of_node(int node)
{
if (node == NUMA_NO_NODE)
return cpu_all_mask;
return node_to_cpumask_map[node];
}
#endif

View file

@ -11,13 +11,8 @@
#include <linux/const.h>
/* PAGE_SHIFT determines the page size */
/* CONT_SHIFT determines the number of pages which can be tracked together */
#define PAGE_SHIFT CONFIG_ARM64_PAGE_SHIFT
#define CONT_SHIFT CONFIG_ARM64_CONT_SHIFT
#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
#define CONT_SIZE (_AC(1, UL) << (CONT_SHIFT + PAGE_SHIFT))
#define CONT_MASK (~(CONT_SIZE-1))
#endif /* __ASM_PAGE_DEF_H */

View file

@ -15,18 +15,25 @@
#include <linux/personality.h> /* for READ_IMPLIES_EXEC */
#include <asm/pgtable-types.h>
extern void __cpu_clear_user_page(void *p, unsigned long user);
extern void __cpu_copy_user_page(void *to, const void *from,
unsigned long user);
struct page;
struct vm_area_struct;
extern void copy_page(void *to, const void *from);
extern void clear_page(void *to);
void copy_user_highpage(struct page *to, struct page *from,
unsigned long vaddr, struct vm_area_struct *vma);
#define __HAVE_ARCH_COPY_USER_HIGHPAGE
void copy_highpage(struct page *to, struct page *from);
#define __HAVE_ARCH_COPY_HIGHPAGE
#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
#define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
#define clear_user_page(addr,vaddr,pg) __cpu_clear_user_page(addr, vaddr)
#define copy_user_page(to,from,vaddr,pg) __cpu_copy_user_page(to, from, vaddr)
#define clear_user_page(page, vaddr, pg) clear_page(page)
#define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
typedef struct page *pgtable_t;
@ -36,7 +43,7 @@ extern int pfn_valid(unsigned long);
#endif /* !__ASSEMBLY__ */
#define VM_DATA_DEFAULT_FLAGS VM_DATA_FLAGS_TSK_EXEC
#define VM_DATA_DEFAULT_FLAGS (VM_DATA_FLAGS_TSK_EXEC | VM_MTE_ALLOWED)
#include <asm-generic/getorder.h>

View file

@ -17,6 +17,7 @@
#define pcibios_assign_all_busses() \
(pci_has_flag(PCI_REASSIGN_ALL_BUS))
#define arch_can_pci_mmap_wc() 1
#define ARCH_GENERIC_PCI_MMAP_RESOURCE 1
extern int isa_dma_bridge_buggy;

View file

@ -236,6 +236,9 @@
#define ARMV8_PMU_USERENR_CR (1 << 2) /* Cycle counter can be read at EL0 */
#define ARMV8_PMU_USERENR_ER (1 << 3) /* Event counter can be read at EL0 */
/* PMMIR_EL1.SLOTS mask */
#define ARMV8_PMU_SLOTS_MASK 0xff
#ifdef CONFIG_PERF_EVENTS
struct pt_regs;
extern unsigned long perf_instruction_pointer(struct pt_regs *regs);

View file

@ -81,25 +81,15 @@
/*
* Contiguous page definitions.
*/
#ifdef CONFIG_ARM64_64K_PAGES
#define CONT_PTE_SHIFT (5 + PAGE_SHIFT)
#define CONT_PMD_SHIFT (5 + PMD_SHIFT)
#elif defined(CONFIG_ARM64_16K_PAGES)
#define CONT_PTE_SHIFT (7 + PAGE_SHIFT)
#define CONT_PMD_SHIFT (5 + PMD_SHIFT)
#else
#define CONT_PTE_SHIFT (4 + PAGE_SHIFT)
#define CONT_PMD_SHIFT (4 + PMD_SHIFT)
#endif
#define CONT_PTE_SHIFT (CONFIG_ARM64_CONT_PTE_SHIFT + PAGE_SHIFT)
#define CONT_PTES (1 << (CONT_PTE_SHIFT - PAGE_SHIFT))
#define CONT_PTE_SIZE (CONT_PTES * PAGE_SIZE)
#define CONT_PTE_MASK (~(CONT_PTE_SIZE - 1))
#define CONT_PMD_SHIFT (CONFIG_ARM64_CONT_PMD_SHIFT + PMD_SHIFT)
#define CONT_PMDS (1 << (CONT_PMD_SHIFT - PMD_SHIFT))
#define CONT_PMD_SIZE (CONT_PMDS * PMD_SIZE)
#define CONT_PMD_MASK (~(CONT_PMD_SIZE - 1))
/* the numerical offset of the PTE within a range of CONT_PTES */
#define CONT_RANGE_OFFSET(addr) (((addr)>>PAGE_SHIFT)&(CONT_PTES-1))
/*
* Hardware page table definitions.

View file

@ -19,6 +19,13 @@
#define PTE_DEVMAP (_AT(pteval_t, 1) << 57)
#define PTE_PROT_NONE (_AT(pteval_t, 1) << 58) /* only when !PTE_VALID */
/*
* This bit indicates that the entry is present i.e. pmd_page()
* still points to a valid huge page in memory even if the pmd
* has been invalidated.
*/
#define PMD_PRESENT_INVALID (_AT(pteval_t, 1) << 59) /* only when !PMD_SECT_VALID */
#ifndef __ASSEMBLY__
#include <asm/cpufeature.h>
@ -50,6 +57,7 @@ extern bool arm64_use_ng_mappings;
#define PROT_NORMAL_NC (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_NC))
#define PROT_NORMAL_WT (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_WT))
#define PROT_NORMAL (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL))
#define PROT_NORMAL_TAGGED (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_TAGGED))
#define PROT_SECT_DEVICE_nGnRE (PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_DEVICE_nGnRE))
#define PROT_SECT_NORMAL (PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL))
@ -59,6 +67,7 @@ extern bool arm64_use_ng_mappings;
#define _HYP_PAGE_DEFAULT _PAGE_DEFAULT
#define PAGE_KERNEL __pgprot(PROT_NORMAL)
#define PAGE_KERNEL_TAGGED __pgprot(PROT_NORMAL_TAGGED)
#define PAGE_KERNEL_RO __pgprot((PROT_NORMAL & ~PTE_WRITE) | PTE_RDONLY)
#define PAGE_KERNEL_ROX __pgprot((PROT_NORMAL & ~(PTE_WRITE | PTE_PXN)) | PTE_RDONLY)
#define PAGE_KERNEL_EXEC __pgprot(PROT_NORMAL & ~PTE_PXN)

View file

@ -9,6 +9,7 @@
#include <asm/proc-fns.h>
#include <asm/memory.h>
#include <asm/mte.h>
#include <asm/pgtable-hwdef.h>
#include <asm/pgtable-prot.h>
#include <asm/tlbflush.h>
@ -35,11 +36,6 @@
extern struct page *vmemmap;
extern void __pte_error(const char *file, int line, unsigned long val);
extern void __pmd_error(const char *file, int line, unsigned long val);
extern void __pud_error(const char *file, int line, unsigned long val);
extern void __pgd_error(const char *file, int line, unsigned long val);
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
@ -50,6 +46,14 @@ extern void __pgd_error(const char *file, int line, unsigned long val);
__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
/*
* Outside of a few very special situations (e.g. hibernation), we always
* use broadcast TLB invalidation instructions, therefore a spurious page
* fault on one CPU which has been handled concurrently by another CPU
* does not need to perform additional invalidation.
*/
#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
/*
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
@ -57,7 +61,8 @@ extern void __pgd_error(const char *file, int line, unsigned long val);
extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
#define ZERO_PAGE(vaddr) phys_to_page(__pa_symbol(empty_zero_page))
#define pte_ERROR(pte) __pte_error(__FILE__, __LINE__, pte_val(pte))
#define pte_ERROR(e) \
pr_err("%s:%d: bad pte %016llx.\n", __FILE__, __LINE__, pte_val(e))
/*
* Macros to convert between a physical address and its placement in a
@ -90,6 +95,8 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
#define pte_user_exec(pte) (!(pte_val(pte) & PTE_UXN))
#define pte_cont(pte) (!!(pte_val(pte) & PTE_CONT))
#define pte_devmap(pte) (!!(pte_val(pte) & PTE_DEVMAP))
#define pte_tagged(pte) ((pte_val(pte) & PTE_ATTRINDX_MASK) == \
PTE_ATTRINDX(MT_NORMAL_TAGGED))
#define pte_cont_addr_end(addr, end) \
({ unsigned long __boundary = ((addr) + CONT_PTE_SIZE) & CONT_PTE_MASK; \
@ -145,6 +152,18 @@ static inline pte_t set_pte_bit(pte_t pte, pgprot_t prot)
return pte;
}
static inline pmd_t clear_pmd_bit(pmd_t pmd, pgprot_t prot)
{
pmd_val(pmd) &= ~pgprot_val(prot);
return pmd;
}
static inline pmd_t set_pmd_bit(pmd_t pmd, pgprot_t prot)
{
pmd_val(pmd) |= pgprot_val(prot);
return pmd;
}
static inline pte_t pte_wrprotect(pte_t pte)
{
pte = clear_pte_bit(pte, __pgprot(PTE_WRITE));
@ -284,6 +303,10 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
if (pte_present(pte) && pte_user_exec(pte) && !pte_special(pte))
__sync_icache_dcache(pte);
if (system_supports_mte() &&
pte_present(pte) && pte_tagged(pte) && !pte_special(pte))
mte_sync_tags(ptep, pte);
__check_racy_pte_update(mm, ptep, pte);
set_pte(ptep, pte);
@ -363,15 +386,24 @@ static inline int pmd_protnone(pmd_t pmd)
}
#endif
#define pmd_present_invalid(pmd) (!!(pmd_val(pmd) & PMD_PRESENT_INVALID))
static inline int pmd_present(pmd_t pmd)
{
return pte_present(pmd_pte(pmd)) || pmd_present_invalid(pmd);
}
/*
* THP definitions.
*/
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define pmd_trans_huge(pmd) (pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT))
static inline int pmd_trans_huge(pmd_t pmd)
{
return pmd_val(pmd) && pmd_present(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT);
}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
#define pmd_present(pmd) pte_present(pmd_pte(pmd))
#define pmd_dirty(pmd) pte_dirty(pmd_pte(pmd))
#define pmd_young(pmd) pte_young(pmd_pte(pmd))
#define pmd_valid(pmd) pte_valid(pmd_pte(pmd))
@ -381,7 +413,14 @@ static inline int pmd_protnone(pmd_t pmd)
#define pmd_mkclean(pmd) pte_pmd(pte_mkclean(pmd_pte(pmd)))
#define pmd_mkdirty(pmd) pte_pmd(pte_mkdirty(pmd_pte(pmd)))
#define pmd_mkyoung(pmd) pte_pmd(pte_mkyoung(pmd_pte(pmd)))
#define pmd_mkinvalid(pmd) (__pmd(pmd_val(pmd) & ~PMD_SECT_VALID))
static inline pmd_t pmd_mkinvalid(pmd_t pmd)
{
pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
return pmd;
}
#define pmd_thp_or_huge(pmd) (pmd_huge(pmd) || pmd_trans_huge(pmd))
@ -541,7 +580,8 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
#if CONFIG_PGTABLE_LEVELS > 2
#define pmd_ERROR(pmd) __pmd_error(__FILE__, __LINE__, pmd_val(pmd))
#define pmd_ERROR(e) \
pr_err("%s:%d: bad pmd %016llx.\n", __FILE__, __LINE__, pmd_val(e))
#define pud_none(pud) (!pud_val(pud))
#define pud_bad(pud) (!(pud_val(pud) & PUD_TABLE_BIT))
@ -608,7 +648,8 @@ static inline unsigned long pud_page_vaddr(pud_t pud)
#if CONFIG_PGTABLE_LEVELS > 3
#define pud_ERROR(pud) __pud_error(__FILE__, __LINE__, pud_val(pud))
#define pud_ERROR(e) \
pr_err("%s:%d: bad pud %016llx.\n", __FILE__, __LINE__, pud_val(e))
#define p4d_none(p4d) (!p4d_val(p4d))
#define p4d_bad(p4d) (!(p4d_val(p4d) & 2))
@ -667,15 +708,21 @@ static inline unsigned long p4d_page_vaddr(p4d_t p4d)
#endif /* CONFIG_PGTABLE_LEVELS > 3 */
#define pgd_ERROR(pgd) __pgd_error(__FILE__, __LINE__, pgd_val(pgd))
#define pgd_ERROR(e) \
pr_err("%s:%d: bad pgd %016llx.\n", __FILE__, __LINE__, pgd_val(e))
#define pgd_set_fixmap(addr) ((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
#define pgd_clear_fixmap() clear_fixmap(FIX_PGD)
static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
{
/*
* Normal and Normal-Tagged are two different memory types and indices
* in MAIR_EL1. The mask below has to include PTE_ATTRINDX_MASK.
*/
const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
PTE_PROT_NONE | PTE_VALID | PTE_WRITE | PTE_GP;
PTE_PROT_NONE | PTE_VALID | PTE_WRITE | PTE_GP |
PTE_ATTRINDX_MASK;
/* preserve the hardware dirty information */
if (pte_hw_dirty(pte))
pte = pte_mkdirty(pte);
@ -847,6 +894,11 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
#define __swp_entry_to_pte(swp) ((pte_t) { (swp).val })
#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
#define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) })
#define __swp_entry_to_pmd(swp) __pmd((swp).val)
#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */
/*
* Ensure that there are not more swap files than can be encoded in the kernel
* PTEs.
@ -855,6 +907,38 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
extern int kern_addr_valid(unsigned long addr);
#ifdef CONFIG_ARM64_MTE
#define __HAVE_ARCH_PREPARE_TO_SWAP
static inline int arch_prepare_to_swap(struct page *page)
{
if (system_supports_mte())
return mte_save_tags(page);
return 0;
}
#define __HAVE_ARCH_SWAP_INVALIDATE
static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
{
if (system_supports_mte())
mte_invalidate_tags(type, offset);
}
static inline void arch_swap_invalidate_area(int type)
{
if (system_supports_mte())
mte_invalidate_tags_area(type);
}
#define __HAVE_ARCH_SWAP_RESTORE
static inline void arch_swap_restore(swp_entry_t entry, struct page *page)
{
if (system_supports_mte() && mte_restore_tags(entry, page))
set_bit(PG_mte_tagged, &page->flags);
}
#endif /* CONFIG_ARM64_MTE */
/*
* On AArch64, the cache coherency is handled via the set_pte_at() function.
*/

View file

@ -38,6 +38,7 @@
#include <asm/pgtable-hwdef.h>
#include <asm/pointer_auth.h>
#include <asm/ptrace.h>
#include <asm/spectre.h>
#include <asm/types.h>
/*
@ -151,6 +152,10 @@ struct thread_struct {
struct ptrauth_keys_user keys_user;
struct ptrauth_keys_kernel keys_kernel;
#endif
#ifdef CONFIG_ARM64_MTE
u64 sctlr_tcf0;
u64 gcr_user_incl;
#endif
};
static inline void arch_thread_struct_whitelist(unsigned long *offset,
@ -197,40 +202,15 @@ static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
regs->pmr_save = GIC_PRIO_IRQON;
}
static inline void set_ssbs_bit(struct pt_regs *regs)
{
regs->pstate |= PSR_SSBS_BIT;
}
static inline void set_compat_ssbs_bit(struct pt_regs *regs)
{
regs->pstate |= PSR_AA32_SSBS_BIT;
}
static inline void start_thread(struct pt_regs *regs, unsigned long pc,
unsigned long sp)
{
start_thread_common(regs, pc);
regs->pstate = PSR_MODE_EL0t;
if (arm64_get_ssbd_state() != ARM64_SSBD_FORCE_ENABLE)
set_ssbs_bit(regs);
spectre_v4_enable_task_mitigation(current);
regs->sp = sp;
}
static inline bool is_ttbr0_addr(unsigned long addr)
{
/* entry assembly clears tags for TTBR0 addrs */
return addr < TASK_SIZE;
}
static inline bool is_ttbr1_addr(unsigned long addr)
{
/* TTBR1 addresses may have a tag if KASAN_SW_TAGS is in use */
return arch_kasan_reset_tag(addr) >= PAGE_OFFSET;
}
#ifdef CONFIG_COMPAT
static inline void compat_start_thread(struct pt_regs *regs, unsigned long pc,
unsigned long sp)
@ -244,13 +224,23 @@ static inline void compat_start_thread(struct pt_regs *regs, unsigned long pc,
regs->pstate |= PSR_AA32_E_BIT;
#endif
if (arm64_get_ssbd_state() != ARM64_SSBD_FORCE_ENABLE)
set_compat_ssbs_bit(regs);
spectre_v4_enable_task_mitigation(current);
regs->compat_sp = sp;
}
#endif
static inline bool is_ttbr0_addr(unsigned long addr)
{
/* entry assembly clears tags for TTBR0 addrs */
return addr < TASK_SIZE;
}
static inline bool is_ttbr1_addr(unsigned long addr)
{
/* TTBR1 addresses may have a tag if KASAN_SW_TAGS is in use */
return arch_kasan_reset_tag(addr) >= PAGE_OFFSET;
}
/* Forward declaration, a strange C thing */
struct task_struct;
@ -315,10 +305,10 @@ extern void __init minsigstksz_setup(void);
#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
/* PR_{SET,GET}_TAGGED_ADDR_CTRL prctl */
long set_tagged_addr_ctrl(unsigned long arg);
long get_tagged_addr_ctrl(void);
#define SET_TAGGED_ADDR_CTRL(arg) set_tagged_addr_ctrl(arg)
#define GET_TAGGED_ADDR_CTRL() get_tagged_addr_ctrl()
long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg);
long get_tagged_addr_ctrl(struct task_struct *task);
#define SET_TAGGED_ADDR_CTRL(arg) set_tagged_addr_ctrl(current, arg)
#define GET_TAGGED_ADDR_CTRL() get_tagged_addr_ctrl(current)
#endif
/*

View file

@ -0,0 +1,32 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Interface for managing mitigations for Spectre vulnerabilities.
*
* Copyright (C) 2020 Google LLC
* Author: Will Deacon <will@kernel.org>
*/
#ifndef __ASM_SPECTRE_H
#define __ASM_SPECTRE_H
#include <asm/cpufeature.h>
/* Watch out, ordering is important here. */
enum mitigation_state {
SPECTRE_UNAFFECTED,
SPECTRE_MITIGATED,
SPECTRE_VULNERABLE,
};
struct task_struct;
enum mitigation_state arm64_get_spectre_v2_state(void);
bool has_spectre_v2(const struct arm64_cpu_capabilities *cap, int scope);
void spectre_v2_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
enum mitigation_state arm64_get_spectre_v4_state(void);
bool has_spectre_v4(const struct arm64_cpu_capabilities *cap, int scope);
void spectre_v4_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
void spectre_v4_enable_task_mitigation(struct task_struct *tsk);
#endif /* __ASM_SPECTRE_H */

View file

@ -63,7 +63,7 @@ struct stackframe {
extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
int (*fn)(struct stackframe *, void *), void *data);
bool (*fn)(void *, unsigned long), void *data);
extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
const char *loglvl);

View file

@ -91,10 +91,12 @@
#define PSTATE_PAN pstate_field(0, 4)
#define PSTATE_UAO pstate_field(0, 3)
#define PSTATE_SSBS pstate_field(3, 1)
#define PSTATE_TCO pstate_field(3, 4)
#define SET_PSTATE_PAN(x) __emit_inst(0xd500401f | PSTATE_PAN | ((!!x) << PSTATE_Imm_shift))
#define SET_PSTATE_UAO(x) __emit_inst(0xd500401f | PSTATE_UAO | ((!!x) << PSTATE_Imm_shift))
#define SET_PSTATE_SSBS(x) __emit_inst(0xd500401f | PSTATE_SSBS | ((!!x) << PSTATE_Imm_shift))
#define SET_PSTATE_TCO(x) __emit_inst(0xd500401f | PSTATE_TCO | ((!!x) << PSTATE_Imm_shift))
#define __SYS_BARRIER_INSN(CRm, op2, Rt) \
__emit_inst(0xd5000000 | sys_insn(0, 3, 3, (CRm), (op2)) | ((Rt) & 0x1f))
@ -181,6 +183,8 @@
#define SYS_SCTLR_EL1 sys_reg(3, 0, 1, 0, 0)
#define SYS_ACTLR_EL1 sys_reg(3, 0, 1, 0, 1)
#define SYS_CPACR_EL1 sys_reg(3, 0, 1, 0, 2)
#define SYS_RGSR_EL1 sys_reg(3, 0, 1, 0, 5)
#define SYS_GCR_EL1 sys_reg(3, 0, 1, 0, 6)
#define SYS_ZCR_EL1 sys_reg(3, 0, 1, 2, 0)
@ -218,6 +222,8 @@
#define SYS_ERXADDR_EL1 sys_reg(3, 0, 5, 4, 3)
#define SYS_ERXMISC0_EL1 sys_reg(3, 0, 5, 5, 0)
#define SYS_ERXMISC1_EL1 sys_reg(3, 0, 5, 5, 1)
#define SYS_TFSR_EL1 sys_reg(3, 0, 5, 6, 0)
#define SYS_TFSRE0_EL1 sys_reg(3, 0, 5, 6, 1)
#define SYS_FAR_EL1 sys_reg(3, 0, 6, 0, 0)
#define SYS_PAR_EL1 sys_reg(3, 0, 7, 4, 0)
@ -321,6 +327,8 @@
#define SYS_PMINTENSET_EL1 sys_reg(3, 0, 9, 14, 1)
#define SYS_PMINTENCLR_EL1 sys_reg(3, 0, 9, 14, 2)
#define SYS_PMMIR_EL1 sys_reg(3, 0, 9, 14, 6)
#define SYS_MAIR_EL1 sys_reg(3, 0, 10, 2, 0)
#define SYS_AMAIR_EL1 sys_reg(3, 0, 10, 3, 0)
@ -368,6 +376,7 @@
#define SYS_CCSIDR_EL1 sys_reg(3, 1, 0, 0, 0)
#define SYS_CLIDR_EL1 sys_reg(3, 1, 0, 0, 1)
#define SYS_GMID_EL1 sys_reg(3, 1, 0, 0, 4)
#define SYS_AIDR_EL1 sys_reg(3, 1, 0, 0, 7)
#define SYS_CSSELR_EL1 sys_reg(3, 2, 0, 0, 0)
@ -460,6 +469,7 @@
#define SYS_ESR_EL2 sys_reg(3, 4, 5, 2, 0)
#define SYS_VSESR_EL2 sys_reg(3, 4, 5, 2, 3)
#define SYS_FPEXC32_EL2 sys_reg(3, 4, 5, 3, 0)
#define SYS_TFSR_EL2 sys_reg(3, 4, 5, 6, 0)
#define SYS_FAR_EL2 sys_reg(3, 4, 6, 0, 0)
#define SYS_VDISR_EL2 sys_reg(3, 4, 12, 1, 1)
@ -516,6 +526,7 @@
#define SYS_AFSR0_EL12 sys_reg(3, 5, 5, 1, 0)
#define SYS_AFSR1_EL12 sys_reg(3, 5, 5, 1, 1)
#define SYS_ESR_EL12 sys_reg(3, 5, 5, 2, 0)
#define SYS_TFSR_EL12 sys_reg(3, 5, 5, 6, 0)
#define SYS_FAR_EL12 sys_reg(3, 5, 6, 0, 0)
#define SYS_MAIR_EL12 sys_reg(3, 5, 10, 2, 0)
#define SYS_AMAIR_EL12 sys_reg(3, 5, 10, 3, 0)
@ -531,6 +542,15 @@
/* Common SCTLR_ELx flags. */
#define SCTLR_ELx_DSSBS (BIT(44))
#define SCTLR_ELx_ATA (BIT(43))
#define SCTLR_ELx_TCF_SHIFT 40
#define SCTLR_ELx_TCF_NONE (UL(0x0) << SCTLR_ELx_TCF_SHIFT)
#define SCTLR_ELx_TCF_SYNC (UL(0x1) << SCTLR_ELx_TCF_SHIFT)
#define SCTLR_ELx_TCF_ASYNC (UL(0x2) << SCTLR_ELx_TCF_SHIFT)
#define SCTLR_ELx_TCF_MASK (UL(0x3) << SCTLR_ELx_TCF_SHIFT)
#define SCTLR_ELx_ITFSB (BIT(37))
#define SCTLR_ELx_ENIA (BIT(31))
#define SCTLR_ELx_ENIB (BIT(30))
#define SCTLR_ELx_ENDA (BIT(27))
@ -559,6 +579,14 @@
#endif
/* SCTLR_EL1 specific flags. */
#define SCTLR_EL1_ATA0 (BIT(42))
#define SCTLR_EL1_TCF0_SHIFT 38
#define SCTLR_EL1_TCF0_NONE (UL(0x0) << SCTLR_EL1_TCF0_SHIFT)
#define SCTLR_EL1_TCF0_SYNC (UL(0x1) << SCTLR_EL1_TCF0_SHIFT)
#define SCTLR_EL1_TCF0_ASYNC (UL(0x2) << SCTLR_EL1_TCF0_SHIFT)
#define SCTLR_EL1_TCF0_MASK (UL(0x3) << SCTLR_EL1_TCF0_SHIFT)
#define SCTLR_EL1_BT1 (BIT(36))
#define SCTLR_EL1_BT0 (BIT(35))
#define SCTLR_EL1_UCI (BIT(26))
@ -587,6 +615,7 @@
SCTLR_EL1_SA0 | SCTLR_EL1_SED | SCTLR_ELx_I |\
SCTLR_EL1_DZE | SCTLR_EL1_UCT |\
SCTLR_EL1_NTWE | SCTLR_ELx_IESB | SCTLR_EL1_SPAN |\
SCTLR_ELx_ITFSB| SCTLR_ELx_ATA | SCTLR_EL1_ATA0 |\
ENDIAN_SET_EL1 | SCTLR_EL1_UCI | SCTLR_EL1_RES1)
/* MAIR_ELx memory attributes (used by Linux) */
@ -595,6 +624,7 @@
#define MAIR_ATTR_DEVICE_GRE UL(0x0c)
#define MAIR_ATTR_NORMAL_NC UL(0x44)
#define MAIR_ATTR_NORMAL_WT UL(0xbb)
#define MAIR_ATTR_NORMAL_TAGGED UL(0xf0)
#define MAIR_ATTR_NORMAL UL(0xff)
#define MAIR_ATTR_MASK UL(0xff)
@ -636,14 +666,22 @@
#define ID_AA64ISAR1_APA_SHIFT 4
#define ID_AA64ISAR1_DPB_SHIFT 0
#define ID_AA64ISAR1_APA_NI 0x0
#define ID_AA64ISAR1_APA_ARCHITECTED 0x1
#define ID_AA64ISAR1_API_NI 0x0
#define ID_AA64ISAR1_API_IMP_DEF 0x1
#define ID_AA64ISAR1_GPA_NI 0x0
#define ID_AA64ISAR1_GPA_ARCHITECTED 0x1
#define ID_AA64ISAR1_GPI_NI 0x0
#define ID_AA64ISAR1_GPI_IMP_DEF 0x1
#define ID_AA64ISAR1_APA_NI 0x0
#define ID_AA64ISAR1_APA_ARCHITECTED 0x1
#define ID_AA64ISAR1_APA_ARCH_EPAC 0x2
#define ID_AA64ISAR1_APA_ARCH_EPAC2 0x3
#define ID_AA64ISAR1_APA_ARCH_EPAC2_FPAC 0x4
#define ID_AA64ISAR1_APA_ARCH_EPAC2_FPAC_CMB 0x5
#define ID_AA64ISAR1_API_NI 0x0
#define ID_AA64ISAR1_API_IMP_DEF 0x1
#define ID_AA64ISAR1_API_IMP_DEF_EPAC 0x2
#define ID_AA64ISAR1_API_IMP_DEF_EPAC2 0x3
#define ID_AA64ISAR1_API_IMP_DEF_EPAC2_FPAC 0x4
#define ID_AA64ISAR1_API_IMP_DEF_EPAC2_FPAC_CMB 0x5
#define ID_AA64ISAR1_GPA_NI 0x0
#define ID_AA64ISAR1_GPA_ARCHITECTED 0x1
#define ID_AA64ISAR1_GPI_NI 0x0
#define ID_AA64ISAR1_GPI_IMP_DEF 0x1
/* id_aa64pfr0 */
#define ID_AA64PFR0_CSV3_SHIFT 60
@ -686,6 +724,10 @@
#define ID_AA64PFR1_SSBS_PSTATE_INSNS 2
#define ID_AA64PFR1_BT_BTI 0x1
#define ID_AA64PFR1_MTE_NI 0x0
#define ID_AA64PFR1_MTE_EL0 0x1
#define ID_AA64PFR1_MTE 0x2
/* id_aa64zfr0 */
#define ID_AA64ZFR0_F64MM_SHIFT 56
#define ID_AA64ZFR0_F32MM_SHIFT 52
@ -920,6 +962,28 @@
#define CPACR_EL1_ZEN_EL0EN (BIT(17)) /* enable EL0 access, if EL1EN set */
#define CPACR_EL1_ZEN (CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN)
/* TCR EL1 Bit Definitions */
#define SYS_TCR_EL1_TCMA1 (BIT(58))
#define SYS_TCR_EL1_TCMA0 (BIT(57))
/* GCR_EL1 Definitions */
#define SYS_GCR_EL1_RRND (BIT(16))
#define SYS_GCR_EL1_EXCL_MASK 0xffffUL
/* RGSR_EL1 Definitions */
#define SYS_RGSR_EL1_TAG_MASK 0xfUL
#define SYS_RGSR_EL1_SEED_SHIFT 8
#define SYS_RGSR_EL1_SEED_MASK 0xffffUL
/* GMID_EL1 field definitions */
#define SYS_GMID_EL1_BS_SHIFT 0
#define SYS_GMID_EL1_BS_SIZE 4
/* TFSR{,E0}_EL1 bit definitions */
#define SYS_TFSR_EL1_TF0_SHIFT 0
#define SYS_TFSR_EL1_TF1_SHIFT 1
#define SYS_TFSR_EL1_TF0 (UL(1) << SYS_TFSR_EL1_TF0_SHIFT)
#define SYS_TFSR_EL1_TF1 (UK(2) << SYS_TFSR_EL1_TF1_SHIFT)
/* Safe value for MPIDR_EL1: Bit31:RES1, Bit30:U:0, Bit24:MT:0 */
#define SYS_MPIDR_SAFE_VAL (BIT(31))
@ -1024,6 +1088,13 @@
write_sysreg(__scs_new, sysreg); \
} while (0)
#define sysreg_clear_set_s(sysreg, clear, set) do { \
u64 __scs_val = read_sysreg_s(sysreg); \
u64 __scs_new = (__scs_val & ~(u64)(clear)) | (set); \
if (__scs_new != __scs_val) \
write_sysreg_s(__scs_new, sysreg); \
} while (0)
#endif
#endif /* __ASM_SYSREG_H */

View file

@ -67,6 +67,7 @@ void arch_release_task_struct(struct task_struct *tsk);
#define TIF_FOREIGN_FPSTATE 3 /* CPU's FP state is not current's */
#define TIF_UPROBE 4 /* uprobe breakpoint or singlestep */
#define TIF_FSCHECK 5 /* Check FS is USER_DS on return */
#define TIF_MTE_ASYNC_FAULT 6 /* MTE Asynchronous Tag Check Fault */
#define TIF_SYSCALL_TRACE 8 /* syscall trace active */
#define TIF_SYSCALL_AUDIT 9 /* syscall auditing */
#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
@ -96,10 +97,11 @@ void arch_release_task_struct(struct task_struct *tsk);
#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
#define _TIF_MTE_ASYNC_FAULT (1 << TIF_MTE_ASYNC_FAULT)
#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
_TIF_UPROBE | _TIF_FSCHECK)
_TIF_UPROBE | _TIF_FSCHECK | _TIF_MTE_ASYNC_FAULT)
#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
_TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \

View file

@ -24,7 +24,7 @@ struct undef_hook {
void register_undef_hook(struct undef_hook *hook);
void unregister_undef_hook(struct undef_hook *hook);
void force_signal_inject(int signal, int code, unsigned long address);
void force_signal_inject(int signal, int code, unsigned long address, unsigned int err);
void arm64_notify_segfault(unsigned long addr);
void arm64_force_sig_fault(int signo, int code, void __user *addr, const char *str);
void arm64_force_sig_mceerr(int code, void __user *addr, short lsb, const char *str);

View file

@ -74,6 +74,6 @@
#define HWCAP2_DGH (1 << 15)
#define HWCAP2_RNG (1 << 16)
#define HWCAP2_BTI (1 << 17)
/* reserved for HWCAP2_MTE (1 << 18) */
#define HWCAP2_MTE (1 << 18)
#endif /* _UAPI__ASM_HWCAP_H */

View file

@ -242,6 +242,15 @@ struct kvm_vcpu_events {
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL 0
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL 1
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED 2
/*
* Only two states can be presented by the host kernel:
* - NOT_REQUIRED: the guest doesn't need to do anything
* - NOT_AVAIL: the guest isn't mitigated (it can still use SSBS if available)
*
* All the other values are deprecated. The host still accepts all
* values (they are ABI), but will narrow them to the above two.
*/
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2 KVM_REG_ARM_FW_REG(2)
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL 0
#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN 1

View file

@ -5,5 +5,6 @@
#include <asm-generic/mman.h>
#define PROT_BTI 0x10 /* BTI guarded page */
#define PROT_MTE 0x20 /* Normal Tagged mapping */
#endif /* ! _UAPI__ASM_MMAN_H */

View file

@ -51,6 +51,7 @@
#define PSR_PAN_BIT 0x00400000
#define PSR_UAO_BIT 0x00800000
#define PSR_DIT_BIT 0x01000000
#define PSR_TCO_BIT 0x02000000
#define PSR_V_BIT 0x10000000
#define PSR_C_BIT 0x20000000
#define PSR_Z_BIT 0x40000000
@ -75,6 +76,9 @@
/* syscall emulation path in ptrace */
#define PTRACE_SYSEMU 31
#define PTRACE_SYSEMU_SINGLESTEP 32
/* MTE allocation tag access */
#define PTRACE_PEEKMTETAGS 33
#define PTRACE_POKEMTETAGS 34
#ifndef __ASSEMBLY__

View file

@ -3,8 +3,6 @@
# Makefile for the linux kernel.
#
CPPFLAGS_vmlinux.lds := -DTEXT_OFFSET=$(TEXT_OFFSET)
AFLAGS_head.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
CFLAGS_armv8_deprecated.o := -I$(src)
CFLAGS_REMOVE_ftrace.o = $(CC_FLAGS_FTRACE)
@ -19,7 +17,7 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
return_address.o cpuinfo.o cpu_errata.o \
cpufeature.o alternative.o cacheinfo.o \
smp.o smp_spin_table.o topology.o smccc-call.o \
syscall.o
syscall.o proton-pack.o
targets += efi-entry.o
@ -59,9 +57,9 @@ arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_CRASH_CORE) += crash_core.o
obj-$(CONFIG_ARM_SDE_INTERFACE) += sdei.o
obj-$(CONFIG_ARM64_SSBD) += ssbd.o
obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
obj-$(CONFIG_SHADOW_CALL_STACK) += scs.o
obj-$(CONFIG_ARM64_MTE) += mte.o
obj-y += vdso/ probes/
obj-$(CONFIG_COMPAT_VDSO) += vdso32/

View file

@ -35,6 +35,10 @@ SYM_CODE_START(__cpu_soft_restart)
mov_q x13, SCTLR_ELx_FLAGS
bic x12, x12, x13
pre_disable_mmu_workaround
/*
* either disable EL1&0 translation regime or disable EL2&0 translation
* regime if HCR_EL2.E2H == 1
*/
msr sctlr_el1, x12
isb

View file

@ -106,365 +106,6 @@ cpu_enable_trap_ctr_access(const struct arm64_cpu_capabilities *cap)
sysreg_clear_set(sctlr_el1, SCTLR_EL1_UCT, 0);
}
atomic_t arm64_el2_vector_last_slot = ATOMIC_INIT(-1);
#include <asm/mmu_context.h>
#include <asm/cacheflush.h>
DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
#ifdef CONFIG_KVM_INDIRECT_VECTORS
static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
{
void *dst = lm_alias(__bp_harden_hyp_vecs + slot * SZ_2K);
int i;
for (i = 0; i < SZ_2K; i += 0x80)
memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
__flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
}
static void install_bp_hardening_cb(bp_hardening_cb_t fn,
const char *hyp_vecs_start,
const char *hyp_vecs_end)
{
static DEFINE_RAW_SPINLOCK(bp_lock);
int cpu, slot = -1;
/*
* detect_harden_bp_fw() passes NULL for the hyp_vecs start/end if
* we're a guest. Skip the hyp-vectors work.
*/
if (!hyp_vecs_start) {
__this_cpu_write(bp_hardening_data.fn, fn);
return;
}
raw_spin_lock(&bp_lock);
for_each_possible_cpu(cpu) {
if (per_cpu(bp_hardening_data.fn, cpu) == fn) {
slot = per_cpu(bp_hardening_data.hyp_vectors_slot, cpu);
break;
}
}
if (slot == -1) {
slot = atomic_inc_return(&arm64_el2_vector_last_slot);
BUG_ON(slot >= BP_HARDEN_EL2_SLOTS);
__copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end);
}
__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
__this_cpu_write(bp_hardening_data.fn, fn);
raw_spin_unlock(&bp_lock);
}
#else
static void install_bp_hardening_cb(bp_hardening_cb_t fn,
const char *hyp_vecs_start,
const char *hyp_vecs_end)
{
__this_cpu_write(bp_hardening_data.fn, fn);
}
#endif /* CONFIG_KVM_INDIRECT_VECTORS */
#include <linux/arm-smccc.h>
static void __maybe_unused call_smc_arch_workaround_1(void)
{
arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
}
static void call_hvc_arch_workaround_1(void)
{
arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
}
static void qcom_link_stack_sanitization(void)
{
u64 tmp;
asm volatile("mov %0, x30 \n"
".rept 16 \n"
"bl . + 4 \n"
".endr \n"
"mov x30, %0 \n"
: "=&r" (tmp));
}
static bool __nospectre_v2;
static int __init parse_nospectre_v2(char *str)
{
__nospectre_v2 = true;
return 0;
}
early_param("nospectre_v2", parse_nospectre_v2);
/*
* -1: No workaround
* 0: No workaround required
* 1: Workaround installed
*/
static int detect_harden_bp_fw(void)
{
bp_hardening_cb_t cb;
void *smccc_start, *smccc_end;
struct arm_smccc_res res;
u32 midr = read_cpuid_id();
arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_1, &res);
switch ((int)res.a0) {
case 1:
/* Firmware says we're just fine */
return 0;
case 0:
break;
default:
return -1;
}
switch (arm_smccc_1_1_get_conduit()) {
case SMCCC_CONDUIT_HVC:
cb = call_hvc_arch_workaround_1;
/* This is a guest, no need to patch KVM vectors */
smccc_start = NULL;
smccc_end = NULL;
break;
#if IS_ENABLED(CONFIG_KVM)
case SMCCC_CONDUIT_SMC:
cb = call_smc_arch_workaround_1;
smccc_start = __smccc_workaround_1_smc;
smccc_end = __smccc_workaround_1_smc +
__SMCCC_WORKAROUND_1_SMC_SZ;
break;
#endif
default:
return -1;
}
if (((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR) ||
((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR_V1))
cb = qcom_link_stack_sanitization;
if (IS_ENABLED(CONFIG_HARDEN_BRANCH_PREDICTOR))
install_bp_hardening_cb(cb, smccc_start, smccc_end);
return 1;
}
DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
int ssbd_state __read_mostly = ARM64_SSBD_KERNEL;
static bool __ssb_safe = true;
static const struct ssbd_options {
const char *str;
int state;
} ssbd_options[] = {
{ "force-on", ARM64_SSBD_FORCE_ENABLE, },
{ "force-off", ARM64_SSBD_FORCE_DISABLE, },
{ "kernel", ARM64_SSBD_KERNEL, },
};
static int __init ssbd_cfg(char *buf)
{
int i;
if (!buf || !buf[0])
return -EINVAL;
for (i = 0; i < ARRAY_SIZE(ssbd_options); i++) {
int len = strlen(ssbd_options[i].str);
if (strncmp(buf, ssbd_options[i].str, len))
continue;
ssbd_state = ssbd_options[i].state;
return 0;
}
return -EINVAL;
}
early_param("ssbd", ssbd_cfg);
void __init arm64_update_smccc_conduit(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr,
int nr_inst)
{
u32 insn;
BUG_ON(nr_inst != 1);
switch (arm_smccc_1_1_get_conduit()) {
case SMCCC_CONDUIT_HVC:
insn = aarch64_insn_get_hvc_value();
break;
case SMCCC_CONDUIT_SMC:
insn = aarch64_insn_get_smc_value();
break;
default:
return;
}
*updptr = cpu_to_le32(insn);
}
void __init arm64_enable_wa2_handling(struct alt_instr *alt,
__le32 *origptr, __le32 *updptr,
int nr_inst)
{
BUG_ON(nr_inst != 1);
/*
* Only allow mitigation on EL1 entry/exit and guest
* ARCH_WORKAROUND_2 handling if the SSBD state allows it to
* be flipped.
*/
if (arm64_get_ssbd_state() == ARM64_SSBD_KERNEL)
*updptr = cpu_to_le32(aarch64_insn_gen_nop());
}
void arm64_set_ssbd_mitigation(bool state)
{
int conduit;
if (!IS_ENABLED(CONFIG_ARM64_SSBD)) {
pr_info_once("SSBD disabled by kernel configuration\n");
return;
}
if (this_cpu_has_cap(ARM64_SSBS)) {
if (state)
asm volatile(SET_PSTATE_SSBS(0));
else
asm volatile(SET_PSTATE_SSBS(1));
return;
}
conduit = arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_WORKAROUND_2, state,
NULL);
WARN_ON_ONCE(conduit == SMCCC_CONDUIT_NONE);
}
static bool has_ssbd_mitigation(const struct arm64_cpu_capabilities *entry,
int scope)
{
struct arm_smccc_res res;
bool required = true;
s32 val;
bool this_cpu_safe = false;
int conduit;
WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
if (cpu_mitigations_off())
ssbd_state = ARM64_SSBD_FORCE_DISABLE;
/* delay setting __ssb_safe until we get a firmware response */
if (is_midr_in_range_list(read_cpuid_id(), entry->midr_range_list))
this_cpu_safe = true;
if (this_cpu_has_cap(ARM64_SSBS)) {
if (!this_cpu_safe)
__ssb_safe = false;
required = false;
goto out_printmsg;
}
conduit = arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_2, &res);
if (conduit == SMCCC_CONDUIT_NONE) {
ssbd_state = ARM64_SSBD_UNKNOWN;
if (!this_cpu_safe)
__ssb_safe = false;
return false;
}
val = (s32)res.a0;
switch (val) {
case SMCCC_RET_NOT_SUPPORTED:
ssbd_state = ARM64_SSBD_UNKNOWN;
if (!this_cpu_safe)
__ssb_safe = false;
return false;
/* machines with mixed mitigation requirements must not return this */
case SMCCC_RET_NOT_REQUIRED:
pr_info_once("%s mitigation not required\n", entry->desc);
ssbd_state = ARM64_SSBD_MITIGATED;
return false;
case SMCCC_RET_SUCCESS:
__ssb_safe = false;
required = true;
break;
case 1: /* Mitigation not required on this CPU */
required = false;
break;
default:
WARN_ON(1);
if (!this_cpu_safe)
__ssb_safe = false;
return false;
}
switch (ssbd_state) {
case ARM64_SSBD_FORCE_DISABLE:
arm64_set_ssbd_mitigation(false);
required = false;
break;
case ARM64_SSBD_KERNEL:
if (required) {
__this_cpu_write(arm64_ssbd_callback_required, 1);
arm64_set_ssbd_mitigation(true);
}
break;
case ARM64_SSBD_FORCE_ENABLE:
arm64_set_ssbd_mitigation(true);
required = true;
break;
default:
WARN_ON(1);
break;
}
out_printmsg:
switch (ssbd_state) {
case ARM64_SSBD_FORCE_DISABLE:
pr_info_once("%s disabled from command-line\n", entry->desc);
break;
case ARM64_SSBD_FORCE_ENABLE:
pr_info_once("%s forced from command-line\n", entry->desc);
break;
}
return required;
}
/* known invulnerable cores */
static const struct midr_range arm64_ssb_cpus[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_3XX_SILVER),
MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_SILVER),
{},
};
#ifdef CONFIG_ARM64_ERRATUM_1463225
DEFINE_PER_CPU(int, __in_cortex_a76_erratum_1463225_wa);
@ -519,83 +160,6 @@ cpu_enable_cache_maint_trap(const struct arm64_cpu_capabilities *__unused)
.type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, \
CAP_MIDR_RANGE_LIST(midr_list)
/* Track overall mitigation state. We are only mitigated if all cores are ok */
static bool __hardenbp_enab = true;
static bool __spectrev2_safe = true;
int get_spectre_v2_workaround_state(void)
{
if (__spectrev2_safe)
return ARM64_BP_HARDEN_NOT_REQUIRED;
if (!__hardenbp_enab)
return ARM64_BP_HARDEN_UNKNOWN;
return ARM64_BP_HARDEN_WA_NEEDED;
}
/*
* List of CPUs that do not need any Spectre-v2 mitigation at all.
*/
static const struct midr_range spectre_v2_safe_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
MIDR_ALL_VERSIONS(MIDR_HISI_TSV110),
MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_3XX_SILVER),
MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_SILVER),
{ /* sentinel */ }
};
/*
* Track overall bp hardening for all heterogeneous cores in the machine.
* We are only considered "safe" if all booted cores are known safe.
*/
static bool __maybe_unused
check_branch_predictor(const struct arm64_cpu_capabilities *entry, int scope)
{
int need_wa;
WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
/* If the CPU has CSV2 set, we're safe */
if (cpuid_feature_extract_unsigned_field(read_cpuid(ID_AA64PFR0_EL1),
ID_AA64PFR0_CSV2_SHIFT))
return false;
/* Alternatively, we have a list of unaffected CPUs */
if (is_midr_in_range_list(read_cpuid_id(), spectre_v2_safe_list))
return false;
/* Fallback to firmware detection */
need_wa = detect_harden_bp_fw();
if (!need_wa)
return false;
__spectrev2_safe = false;
if (!IS_ENABLED(CONFIG_HARDEN_BRANCH_PREDICTOR)) {
pr_warn_once("spectrev2 mitigation disabled by kernel configuration\n");
__hardenbp_enab = false;
return false;
}
/* forced off */
if (__nospectre_v2 || cpu_mitigations_off()) {
pr_info_once("spectrev2 mitigation disabled by command line option\n");
__hardenbp_enab = false;
return false;
}
if (need_wa < 0) {
pr_warn_once("ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware\n");
__hardenbp_enab = false;
}
return (need_wa > 0);
}
static const __maybe_unused struct midr_range tx2_family_cpus[] = {
MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
@ -887,9 +451,11 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
},
#endif
{
.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
.desc = "Spectre-v2",
.capability = ARM64_SPECTRE_V2,
.type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
.matches = check_branch_predictor,
.matches = has_spectre_v2,
.cpu_enable = spectre_v2_enable_mitigation,
},
#ifdef CONFIG_RANDOMIZE_BASE
{
@ -899,11 +465,11 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
},
#endif
{
.desc = "Speculative Store Bypass Disable",
.capability = ARM64_SSBD,
.desc = "Spectre-v4",
.capability = ARM64_SPECTRE_V4,
.type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM,
.matches = has_ssbd_mitigation,
.midr_range_list = arm64_ssb_cpus,
.matches = has_spectre_v4,
.cpu_enable = spectre_v4_enable_mitigation,
},
#ifdef CONFIG_ARM64_ERRATUM_1418040
{
@ -960,40 +526,3 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
{
}
};
ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr,
char *buf)
{
return sprintf(buf, "Mitigation: __user pointer sanitization\n");
}
ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr,
char *buf)
{
switch (get_spectre_v2_workaround_state()) {
case ARM64_BP_HARDEN_NOT_REQUIRED:
return sprintf(buf, "Not affected\n");
case ARM64_BP_HARDEN_WA_NEEDED:
return sprintf(buf, "Mitigation: Branch predictor hardening\n");
case ARM64_BP_HARDEN_UNKNOWN:
default:
return sprintf(buf, "Vulnerable\n");
}
}
ssize_t cpu_show_spec_store_bypass(struct device *dev,
struct device_attribute *attr, char *buf)
{
if (__ssb_safe)
return sprintf(buf, "Not affected\n");
switch (ssbd_state) {
case ARM64_SSBD_KERNEL:
case ARM64_SSBD_FORCE_ENABLE:
if (IS_ENABLED(CONFIG_ARM64_SSBD))
return sprintf(buf,
"Mitigation: Speculative Store Bypass disabled via prctl\n");
}
return sprintf(buf, "Vulnerable\n");
}

View file

@ -75,6 +75,7 @@
#include <asm/cpu_ops.h>
#include <asm/fpsimd.h>
#include <asm/mmu_context.h>
#include <asm/mte.h>
#include <asm/processor.h>
#include <asm/sysreg.h>
#include <asm/traps.h>
@ -197,9 +198,9 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_FCMA_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_JSCVT_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_API_SHIFT, 4, 0),
FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_API_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_APA_SHIFT, 4, 0),
FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_APA_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_DPB_SHIFT, 4, 0),
ARM64_FTR_END,
};
@ -227,7 +228,9 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MPAMFRAC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_RASFRAC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SSBS_SHIFT, 4, ID_AA64PFR1_SSBS_PSTATE_NI),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MTE_SHIFT, 4, ID_AA64PFR1_MTE_NI),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SSBS_SHIFT, 4, ID_AA64PFR1_SSBS_PSTATE_NI),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_BTI),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_BT_SHIFT, 4, 0),
ARM64_FTR_END,
@ -487,7 +490,7 @@ static const struct arm64_ftr_bits ftr_id_pfr1[] = {
};
static const struct arm64_ftr_bits ftr_id_pfr2[] = {
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_PFR2_SSBS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_PFR2_SSBS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_PFR2_CSV3_SHIFT, 4, 0),
ARM64_FTR_END,
};
@ -1111,6 +1114,7 @@ u64 read_sanitised_ftr_reg(u32 id)
return 0;
return regp->sys_val;
}
EXPORT_SYMBOL_GPL(read_sanitised_ftr_reg);
#define read_sysreg_case(r) \
case r: return read_sysreg_s(r)
@ -1443,6 +1447,7 @@ static inline void __cpu_enable_hw_dbm(void)
write_sysreg(tcr, tcr_el1);
isb();
local_flush_tlb_all();
}
static bool cpu_has_broken_dbm(void)
@ -1583,48 +1588,6 @@ static void cpu_has_fwb(const struct arm64_cpu_capabilities *__unused)
WARN_ON(val & (7 << 27 | 7 << 21));
}
#ifdef CONFIG_ARM64_SSBD
static int ssbs_emulation_handler(struct pt_regs *regs, u32 instr)
{
if (user_mode(regs))
return 1;
if (instr & BIT(PSTATE_Imm_shift))
regs->pstate |= PSR_SSBS_BIT;
else
regs->pstate &= ~PSR_SSBS_BIT;
arm64_skip_faulting_instruction(regs, 4);
return 0;
}
static struct undef_hook ssbs_emulation_hook = {
.instr_mask = ~(1U << PSTATE_Imm_shift),
.instr_val = 0xd500401f | PSTATE_SSBS,
.fn = ssbs_emulation_handler,
};
static void cpu_enable_ssbs(const struct arm64_cpu_capabilities *__unused)
{
static bool undef_hook_registered = false;
static DEFINE_RAW_SPINLOCK(hook_lock);
raw_spin_lock(&hook_lock);
if (!undef_hook_registered) {
register_undef_hook(&ssbs_emulation_hook);
undef_hook_registered = true;
}
raw_spin_unlock(&hook_lock);
if (arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE) {
sysreg_clear_set(sctlr_el1, 0, SCTLR_ELx_DSSBS);
arm64_set_ssbd_mitigation(false);
} else {
arm64_set_ssbd_mitigation(true);
}
}
#endif /* CONFIG_ARM64_SSBD */
#ifdef CONFIG_ARM64_PAN
static void cpu_enable_pan(const struct arm64_cpu_capabilities *__unused)
{
@ -1648,11 +1611,37 @@ static void cpu_clear_disr(const struct arm64_cpu_capabilities *__unused)
#endif /* CONFIG_ARM64_RAS_EXTN */
#ifdef CONFIG_ARM64_PTR_AUTH
static bool has_address_auth(const struct arm64_cpu_capabilities *entry,
int __unused)
static bool has_address_auth_cpucap(const struct arm64_cpu_capabilities *entry, int scope)
{
return __system_matches_cap(ARM64_HAS_ADDRESS_AUTH_ARCH) ||
__system_matches_cap(ARM64_HAS_ADDRESS_AUTH_IMP_DEF);
int boot_val, sec_val;
/* We don't expect to be called with SCOPE_SYSTEM */
WARN_ON(scope == SCOPE_SYSTEM);
/*
* The ptr-auth feature levels are not intercompatible with lower
* levels. Hence we must match ptr-auth feature level of the secondary
* CPUs with that of the boot CPU. The level of boot cpu is fetched
* from the sanitised register whereas direct register read is done for
* the secondary CPUs.
* The sanitised feature state is guaranteed to match that of the
* boot CPU as a mismatched secondary CPU is parked before it gets
* a chance to update the state, with the capability.
*/
boot_val = cpuid_feature_extract_field(read_sanitised_ftr_reg(entry->sys_reg),
entry->field_pos, entry->sign);
if (scope & SCOPE_BOOT_CPU)
return boot_val >= entry->min_field_value;
/* Now check for the secondary CPUs with SCOPE_LOCAL_CPU scope */
sec_val = cpuid_feature_extract_field(__read_sysreg_by_encoding(entry->sys_reg),
entry->field_pos, entry->sign);
return sec_val == boot_val;
}
static bool has_address_auth_metacap(const struct arm64_cpu_capabilities *entry,
int scope)
{
return has_address_auth_cpucap(cpu_hwcaps_ptrs[ARM64_HAS_ADDRESS_AUTH_ARCH], scope) ||
has_address_auth_cpucap(cpu_hwcaps_ptrs[ARM64_HAS_ADDRESS_AUTH_IMP_DEF], scope);
}
static bool has_generic_auth(const struct arm64_cpu_capabilities *entry,
@ -1702,6 +1691,22 @@ static void bti_enable(const struct arm64_cpu_capabilities *__unused)
}
#endif /* CONFIG_ARM64_BTI */
#ifdef CONFIG_ARM64_MTE
static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
{
static bool cleared_zero_page = false;
/*
* Clear the tags in the zero page. This needs to be done via the
* linear map which has the Tagged attribute.
*/
if (!cleared_zero_page) {
cleared_zero_page = true;
mte_clear_page_tags(lm_alias(empty_zero_page));
}
}
#endif /* CONFIG_ARM64_MTE */
/* Internal helper functions to match cpu capability type */
static bool
cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@ -1976,19 +1981,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.field_pos = ID_AA64ISAR0_CRC32_SHIFT,
.min_field_value = 1,
},
#ifdef CONFIG_ARM64_SSBD
{
.desc = "Speculative Store Bypassing Safe (SSBS)",
.capability = ARM64_SSBS,
.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.matches = has_cpuid_feature,
.sys_reg = SYS_ID_AA64PFR1_EL1,
.field_pos = ID_AA64PFR1_SSBS_SHIFT,
.sign = FTR_UNSIGNED,
.min_field_value = ID_AA64PFR1_SSBS_PSTATE_ONLY,
.cpu_enable = cpu_enable_ssbs,
},
#endif
#ifdef CONFIG_ARM64_CNP
{
.desc = "Common not Private translations",
@ -2021,7 +2023,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.sign = FTR_UNSIGNED,
.field_pos = ID_AA64ISAR1_APA_SHIFT,
.min_field_value = ID_AA64ISAR1_APA_ARCHITECTED,
.matches = has_cpuid_feature,
.matches = has_address_auth_cpucap,
},
{
.desc = "Address authentication (IMP DEF algorithm)",
@ -2031,12 +2033,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.sign = FTR_UNSIGNED,
.field_pos = ID_AA64ISAR1_API_SHIFT,
.min_field_value = ID_AA64ISAR1_API_IMP_DEF,
.matches = has_cpuid_feature,
.matches = has_address_auth_cpucap,
},
{
.capability = ARM64_HAS_ADDRESS_AUTH,
.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
.matches = has_address_auth,
.matches = has_address_auth_metacap,
},
{
.desc = "Generic authentication (architected algorithm)",
@ -2121,6 +2123,19 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.sign = FTR_UNSIGNED,
},
#endif
#ifdef CONFIG_ARM64_MTE
{
.desc = "Memory Tagging Extension",
.capability = ARM64_MTE,
.type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
.matches = has_cpuid_feature,
.sys_reg = SYS_ID_AA64PFR1_EL1,
.field_pos = ID_AA64PFR1_MTE_SHIFT,
.min_field_value = ID_AA64PFR1_MTE,
.sign = FTR_UNSIGNED,
.cpu_enable = cpu_enable_mte,
},
#endif /* CONFIG_ARM64_MTE */
{},
};
@ -2237,6 +2252,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_MULTI_CAP(ptr_auth_hwcap_addr_matches, CAP_HWCAP, KERNEL_HWCAP_PACA),
HWCAP_MULTI_CAP(ptr_auth_hwcap_gen_matches, CAP_HWCAP, KERNEL_HWCAP_PACG),
#endif
#ifdef CONFIG_ARM64_MTE
HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, FTR_UNSIGNED, ID_AA64PFR1_MTE, CAP_HWCAP, KERNEL_HWCAP_MTE),
#endif /* CONFIG_ARM64_MTE */
{},
};

View file

@ -43,94 +43,93 @@ static const char *icache_policy_str[] = {
unsigned long __icache_flags;
static const char *const hwcap_str[] = {
"fp",
"asimd",
"evtstrm",
"aes",
"pmull",
"sha1",
"sha2",
"crc32",
"atomics",
"fphp",
"asimdhp",
"cpuid",
"asimdrdm",
"jscvt",
"fcma",
"lrcpc",
"dcpop",
"sha3",
"sm3",
"sm4",
"asimddp",
"sha512",
"sve",
"asimdfhm",
"dit",
"uscat",
"ilrcpc",
"flagm",
"ssbs",
"sb",
"paca",
"pacg",
"dcpodp",
"sve2",
"sveaes",
"svepmull",
"svebitperm",
"svesha3",
"svesm4",
"flagm2",
"frint",
"svei8mm",
"svef32mm",
"svef64mm",
"svebf16",
"i8mm",
"bf16",
"dgh",
"rng",
"bti",
/* reserved for "mte" */
NULL
[KERNEL_HWCAP_FP] = "fp",
[KERNEL_HWCAP_ASIMD] = "asimd",
[KERNEL_HWCAP_EVTSTRM] = "evtstrm",
[KERNEL_HWCAP_AES] = "aes",
[KERNEL_HWCAP_PMULL] = "pmull",
[KERNEL_HWCAP_SHA1] = "sha1",
[KERNEL_HWCAP_SHA2] = "sha2",
[KERNEL_HWCAP_CRC32] = "crc32",
[KERNEL_HWCAP_ATOMICS] = "atomics",
[KERNEL_HWCAP_FPHP] = "fphp",
[KERNEL_HWCAP_ASIMDHP] = "asimdhp",
[KERNEL_HWCAP_CPUID] = "cpuid",
[KERNEL_HWCAP_ASIMDRDM] = "asimdrdm",
[KERNEL_HWCAP_JSCVT] = "jscvt",
[KERNEL_HWCAP_FCMA] = "fcma",
[KERNEL_HWCAP_LRCPC] = "lrcpc",
[KERNEL_HWCAP_DCPOP] = "dcpop",
[KERNEL_HWCAP_SHA3] = "sha3",
[KERNEL_HWCAP_SM3] = "sm3",
[KERNEL_HWCAP_SM4] = "sm4",
[KERNEL_HWCAP_ASIMDDP] = "asimddp",
[KERNEL_HWCAP_SHA512] = "sha512",
[KERNEL_HWCAP_SVE] = "sve",
[KERNEL_HWCAP_ASIMDFHM] = "asimdfhm",
[KERNEL_HWCAP_DIT] = "dit",
[KERNEL_HWCAP_USCAT] = "uscat",
[KERNEL_HWCAP_ILRCPC] = "ilrcpc",
[KERNEL_HWCAP_FLAGM] = "flagm",
[KERNEL_HWCAP_SSBS] = "ssbs",
[KERNEL_HWCAP_SB] = "sb",
[KERNEL_HWCAP_PACA] = "paca",
[KERNEL_HWCAP_PACG] = "pacg",
[KERNEL_HWCAP_DCPODP] = "dcpodp",
[KERNEL_HWCAP_SVE2] = "sve2",
[KERNEL_HWCAP_SVEAES] = "sveaes",
[KERNEL_HWCAP_SVEPMULL] = "svepmull",
[KERNEL_HWCAP_SVEBITPERM] = "svebitperm",
[KERNEL_HWCAP_SVESHA3] = "svesha3",
[KERNEL_HWCAP_SVESM4] = "svesm4",
[KERNEL_HWCAP_FLAGM2] = "flagm2",
[KERNEL_HWCAP_FRINT] = "frint",
[KERNEL_HWCAP_SVEI8MM] = "svei8mm",
[KERNEL_HWCAP_SVEF32MM] = "svef32mm",
[KERNEL_HWCAP_SVEF64MM] = "svef64mm",
[KERNEL_HWCAP_SVEBF16] = "svebf16",
[KERNEL_HWCAP_I8MM] = "i8mm",
[KERNEL_HWCAP_BF16] = "bf16",
[KERNEL_HWCAP_DGH] = "dgh",
[KERNEL_HWCAP_RNG] = "rng",
[KERNEL_HWCAP_BTI] = "bti",
[KERNEL_HWCAP_MTE] = "mte",
};
#ifdef CONFIG_COMPAT
#define COMPAT_KERNEL_HWCAP(x) const_ilog2(COMPAT_HWCAP_ ## x)
static const char *const compat_hwcap_str[] = {
"swp",
"half",
"thumb",
"26bit",
"fastmult",
"fpa",
"vfp",
"edsp",
"java",
"iwmmxt",
"crunch",
"thumbee",
"neon",
"vfpv3",
"vfpv3d16",
"tls",
"vfpv4",
"idiva",
"idivt",
"vfpd32",
"lpae",
"evtstrm",
NULL
[COMPAT_KERNEL_HWCAP(SWP)] = "swp",
[COMPAT_KERNEL_HWCAP(HALF)] = "half",
[COMPAT_KERNEL_HWCAP(THUMB)] = "thumb",
[COMPAT_KERNEL_HWCAP(26BIT)] = NULL, /* Not possible on arm64 */
[COMPAT_KERNEL_HWCAP(FAST_MULT)] = "fastmult",
[COMPAT_KERNEL_HWCAP(FPA)] = NULL, /* Not possible on arm64 */
[COMPAT_KERNEL_HWCAP(VFP)] = "vfp",
[COMPAT_KERNEL_HWCAP(EDSP)] = "edsp",
[COMPAT_KERNEL_HWCAP(JAVA)] = NULL, /* Not possible on arm64 */
[COMPAT_KERNEL_HWCAP(IWMMXT)] = NULL, /* Not possible on arm64 */
[COMPAT_KERNEL_HWCAP(CRUNCH)] = NULL, /* Not possible on arm64 */
[COMPAT_KERNEL_HWCAP(THUMBEE)] = NULL, /* Not possible on arm64 */
[COMPAT_KERNEL_HWCAP(NEON)] = "neon",
[COMPAT_KERNEL_HWCAP(VFPv3)] = "vfpv3",
[COMPAT_KERNEL_HWCAP(VFPV3D16)] = NULL, /* Not possible on arm64 */
[COMPAT_KERNEL_HWCAP(TLS)] = "tls",
[COMPAT_KERNEL_HWCAP(VFPv4)] = "vfpv4",
[COMPAT_KERNEL_HWCAP(IDIVA)] = "idiva",
[COMPAT_KERNEL_HWCAP(IDIVT)] = "idivt",
[COMPAT_KERNEL_HWCAP(VFPD32)] = NULL, /* Not possible on arm64 */
[COMPAT_KERNEL_HWCAP(LPAE)] = "lpae",
[COMPAT_KERNEL_HWCAP(EVTSTRM)] = "evtstrm",
};
#define COMPAT_KERNEL_HWCAP2(x) const_ilog2(COMPAT_HWCAP2_ ## x)
static const char *const compat_hwcap2_str[] = {
"aes",
"pmull",
"sha1",
"sha2",
"crc32",
NULL
[COMPAT_KERNEL_HWCAP2(AES)] = "aes",
[COMPAT_KERNEL_HWCAP2(PMULL)] = "pmull",
[COMPAT_KERNEL_HWCAP2(SHA1)] = "sha1",
[COMPAT_KERNEL_HWCAP2(SHA2)] = "sha2",
[COMPAT_KERNEL_HWCAP2(CRC32)] = "crc32",
};
#endif /* CONFIG_COMPAT */
@ -166,16 +165,25 @@ static int c_show(struct seq_file *m, void *v)
seq_puts(m, "Features\t:");
if (compat) {
#ifdef CONFIG_COMPAT
for (j = 0; compat_hwcap_str[j]; j++)
if (compat_elf_hwcap & (1 << j))
seq_printf(m, " %s", compat_hwcap_str[j]);
for (j = 0; j < ARRAY_SIZE(compat_hwcap_str); j++) {
if (compat_elf_hwcap & (1 << j)) {
/*
* Warn once if any feature should not
* have been present on arm64 platform.
*/
if (WARN_ON_ONCE(!compat_hwcap_str[j]))
continue;
for (j = 0; compat_hwcap2_str[j]; j++)
seq_printf(m, " %s", compat_hwcap_str[j]);
}
}
for (j = 0; j < ARRAY_SIZE(compat_hwcap2_str); j++)
if (compat_elf_hwcap2 & (1 << j))
seq_printf(m, " %s", compat_hwcap2_str[j]);
#endif /* CONFIG_COMPAT */
} else {
for (j = 0; hwcap_str[j]; j++)
for (j = 0; j < ARRAY_SIZE(hwcap_str); j++)
if (cpu_have_feature(j))
seq_printf(m, " %s", hwcap_str[j]);
}

View file

@ -384,7 +384,7 @@ void __init debug_traps_init(void)
hook_debug_fault_code(DBG_ESR_EVT_HWSS, single_step_handler, SIGTRAP,
TRAP_TRACE, "single-step handler");
hook_debug_fault_code(DBG_ESR_EVT_BRK, brk_handler, SIGTRAP,
TRAP_BRKPT, "ptrace BRK handler");
TRAP_BRKPT, "BRK handler");
}
/* Re-enable single step for syscall restarting. */

View file

@ -66,6 +66,13 @@ static void notrace el1_dbg(struct pt_regs *regs, unsigned long esr)
}
NOKPROBE_SYMBOL(el1_dbg);
static void notrace el1_fpac(struct pt_regs *regs, unsigned long esr)
{
local_daif_inherit(regs);
do_ptrauth_fault(regs, esr);
}
NOKPROBE_SYMBOL(el1_fpac);
asmlinkage void notrace el1_sync_handler(struct pt_regs *regs)
{
unsigned long esr = read_sysreg(esr_el1);
@ -92,6 +99,9 @@ asmlinkage void notrace el1_sync_handler(struct pt_regs *regs)
case ESR_ELx_EC_BRK64:
el1_dbg(regs, esr);
break;
case ESR_ELx_EC_FPAC:
el1_fpac(regs, esr);
break;
default:
el1_inv(regs, esr);
}
@ -227,6 +237,14 @@ static void notrace el0_svc(struct pt_regs *regs)
}
NOKPROBE_SYMBOL(el0_svc);
static void notrace el0_fpac(struct pt_regs *regs, unsigned long esr)
{
user_exit_irqoff();
local_daif_restore(DAIF_PROCCTX);
do_ptrauth_fault(regs, esr);
}
NOKPROBE_SYMBOL(el0_fpac);
asmlinkage void notrace el0_sync_handler(struct pt_regs *regs)
{
unsigned long esr = read_sysreg(esr_el1);
@ -272,6 +290,9 @@ asmlinkage void notrace el0_sync_handler(struct pt_regs *regs)
case ESR_ELx_EC_BRK64:
el0_dbg(regs, esr);
break;
case ESR_ELx_EC_FPAC:
el0_fpac(regs, esr);
break;
default:
el0_inv(regs, esr);
}

View file

@ -32,6 +32,7 @@ SYM_FUNC_START(fpsimd_load_state)
SYM_FUNC_END(fpsimd_load_state)
#ifdef CONFIG_ARM64_SVE
SYM_FUNC_START(sve_save_state)
sve_save 0, x1, 2
ret
@ -46,4 +47,28 @@ SYM_FUNC_START(sve_get_vl)
_sve_rdvl 0, 1
ret
SYM_FUNC_END(sve_get_vl)
/*
* Load SVE state from FPSIMD state.
*
* x0 = pointer to struct fpsimd_state
* x1 = VQ - 1
*
* Each SVE vector will be loaded with the first 128-bits taken from FPSIMD
* and the rest zeroed. All the other SVE registers will be zeroed.
*/
SYM_FUNC_START(sve_load_from_fpsimd_state)
sve_load_vq x1, x2, x3
fpsimd_restore x0, 8
_for n, 0, 15, _sve_pfalse \n
_sve_wrffr 0
ret
SYM_FUNC_END(sve_load_from_fpsimd_state)
/* Zero all SVE registers but the first 128-bits of each vector */
SYM_FUNC_START(sve_flush_live)
sve_flush
ret
SYM_FUNC_END(sve_flush_live)
#endif /* CONFIG_ARM64_SVE */

View file

@ -132,9 +132,8 @@ alternative_else_nop_endif
* them if required.
*/
.macro apply_ssbd, state, tmp1, tmp2
#ifdef CONFIG_ARM64_SSBD
alternative_cb arm64_enable_wa2_handling
b .L__asm_ssbd_skip\@
alternative_cb spectre_v4_patch_fw_mitigation_enable
b .L__asm_ssbd_skip\@ // Patched to NOP
alternative_cb_end
ldr_this_cpu \tmp2, arm64_ssbd_callback_required, \tmp1
cbz \tmp2, .L__asm_ssbd_skip\@
@ -142,10 +141,35 @@ alternative_cb_end
tbnz \tmp2, #TIF_SSBD, .L__asm_ssbd_skip\@
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
mov w1, #\state
alternative_cb arm64_update_smccc_conduit
alternative_cb spectre_v4_patch_fw_mitigation_conduit
nop // Patched to SMC/HVC #0
alternative_cb_end
.L__asm_ssbd_skip\@:
.endm
/* Check for MTE asynchronous tag check faults */
.macro check_mte_async_tcf, flgs, tmp
#ifdef CONFIG_ARM64_MTE
alternative_if_not ARM64_MTE
b 1f
alternative_else_nop_endif
mrs_s \tmp, SYS_TFSRE0_EL1
tbz \tmp, #SYS_TFSR_EL1_TF0_SHIFT, 1f
/* Asynchronous TCF occurred for TTBR0 access, set the TI flag */
orr \flgs, \flgs, #_TIF_MTE_ASYNC_FAULT
str \flgs, [tsk, #TSK_TI_FLAGS]
msr_s SYS_TFSRE0_EL1, xzr
1:
#endif
.endm
/* Clear the MTE asynchronous tag check faults */
.macro clear_mte_async_tcf
#ifdef CONFIG_ARM64_MTE
alternative_if ARM64_MTE
dsb ish
msr_s SYS_TFSRE0_EL1, xzr
alternative_else_nop_endif
#endif
.endm
@ -182,6 +206,8 @@ alternative_cb_end
ldr x19, [tsk, #TSK_TI_FLAGS]
disable_step_tsk x19, x20
/* Check for asynchronous tag check faults in user space */
check_mte_async_tcf x19, x22
apply_ssbd 1, x22, x23
ptrauth_keys_install_kernel tsk, x20, x22, x23
@ -233,6 +259,13 @@ alternative_if ARM64_HAS_IRQ_PRIO_MASKING
str x20, [sp, #S_PMR_SAVE]
alternative_else_nop_endif
/* Re-enable tag checking (TCO set on exception entry) */
#ifdef CONFIG_ARM64_MTE
alternative_if ARM64_MTE
SET_PSTATE_TCO(0)
alternative_else_nop_endif
#endif
/*
* Registers that may be useful after this macro is invoked:
*
@ -697,11 +730,9 @@ el0_irq_naked:
bl trace_hardirqs_off
#endif
#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
tbz x22, #55, 1f
bl do_el0_irq_bp_hardening
1:
#endif
irq_handler
#ifdef CONFIG_TRACE_IRQFLAGS
@ -744,6 +775,8 @@ SYM_CODE_START_LOCAL(ret_to_user)
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
finish_ret_to_user:
/* Ignore asynchronous tag check faults in the uaccess routines */
clear_mte_async_tcf
enable_step_tsk x1, x2
#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
bl stackleak_erase

View file

@ -32,9 +32,11 @@
#include <linux/swab.h>
#include <asm/esr.h>
#include <asm/exception.h>
#include <asm/fpsimd.h>
#include <asm/cpufeature.h>
#include <asm/cputype.h>
#include <asm/neon.h>
#include <asm/processor.h>
#include <asm/simd.h>
#include <asm/sigcontext.h>
@ -312,7 +314,7 @@ static void fpsimd_save(void)
* re-enter user with corrupt state.
* There's no way to recover, so kill it:
*/
force_signal_inject(SIGKILL, SI_KERNEL, 0);
force_signal_inject(SIGKILL, SI_KERNEL, 0, 0);
return;
}
@ -928,7 +930,7 @@ void fpsimd_release_task(struct task_struct *dead_task)
* the SVE access trap will be disabled the next time this task
* reaches ret_to_user.
*
* TIF_SVE should be clear on entry: otherwise, task_fpsimd_load()
* TIF_SVE should be clear on entry: otherwise, fpsimd_restore_current_state()
* would have disabled the SVE access trap for userspace during
* ret_to_user, making an SVE access trap impossible in that case.
*/
@ -936,7 +938,7 @@ void do_sve_acc(unsigned int esr, struct pt_regs *regs)
{
/* Even if we chose not to use SVE, the hardware could still trap: */
if (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc);
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
return;
}

View file

@ -36,14 +36,10 @@
#include "efi-header.S"
#define __PHYS_OFFSET (KERNEL_START - TEXT_OFFSET)
#define __PHYS_OFFSET KERNEL_START
#if (TEXT_OFFSET & 0xfff) != 0
#error TEXT_OFFSET must be at least 4KB aligned
#elif (PAGE_OFFSET & 0x1fffff) != 0
#if (PAGE_OFFSET & 0x1fffff) != 0
#error PAGE_OFFSET must be at least 2MB aligned
#elif TEXT_OFFSET > 0x1fffff
#error TEXT_OFFSET must be less than 2MB
#endif
/*
@ -55,7 +51,7 @@
* x0 = physical address to the FDT blob.
*
* This code is mostly position independent so you call this at
* __pa(PAGE_OFFSET + TEXT_OFFSET).
* __pa(PAGE_OFFSET).
*
* Note that the callee-saved registers are used for storing variables
* that are useful before the MMU is enabled. The allocations are described
@ -77,7 +73,7 @@ _head:
b primary_entry // branch to kernel start, magic
.long 0 // reserved
#endif
le64sym _kernel_offset_le // Image load offset from start of RAM, little-endian
.quad 0 // Image load offset from start of RAM, little-endian
le64sym _kernel_size_le // Effective size of kernel image, little-endian
le64sym _kernel_flags_le // Informative flags, little-endian
.quad 0 // reserved
@ -382,7 +378,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
* Map the kernel image (starting with PHYS_OFFSET).
*/
adrp x0, init_pg_dir
mov_q x5, KIMAGE_VADDR + TEXT_OFFSET // compile time __va(_text)
mov_q x5, KIMAGE_VADDR // compile time __va(_text)
add x5, x5, x23 // add KASLR displacement
mov x4, PTRS_PER_PGD
adrp x6, _end // runtime __pa(_end)
@ -474,7 +470,7 @@ SYM_FUNC_END(__primary_switched)
.pushsection ".rodata", "a"
SYM_DATA_START(kimage_vaddr)
.quad _text - TEXT_OFFSET
.quad _text
SYM_DATA_END(kimage_vaddr)
EXPORT_SYMBOL(kimage_vaddr)
.popsection

View file

@ -21,7 +21,6 @@
#include <linux/sched.h>
#include <linux/suspend.h>
#include <linux/utsname.h>
#include <linux/version.h>
#include <asm/barrier.h>
#include <asm/cacheflush.h>
@ -31,6 +30,7 @@
#include <asm/kexec.h>
#include <asm/memory.h>
#include <asm/mmu_context.h>
#include <asm/mte.h>
#include <asm/pgalloc.h>
#include <asm/pgtable-hwdef.h>
#include <asm/sections.h>
@ -285,6 +285,117 @@ static int create_safe_exec_page(void *src_start, size_t length,
#define dcache_clean_range(start, end) __flush_dcache_area(start, (end - start))
#ifdef CONFIG_ARM64_MTE
static DEFINE_XARRAY(mte_pages);
static int save_tags(struct page *page, unsigned long pfn)
{
void *tag_storage, *ret;
tag_storage = mte_allocate_tag_storage();
if (!tag_storage)
return -ENOMEM;
mte_save_page_tags(page_address(page), tag_storage);
ret = xa_store(&mte_pages, pfn, tag_storage, GFP_KERNEL);
if (WARN(xa_is_err(ret), "Failed to store MTE tags")) {
mte_free_tag_storage(tag_storage);
return xa_err(ret);
} else if (WARN(ret, "swsusp: %s: Duplicate entry", __func__)) {
mte_free_tag_storage(ret);
}
return 0;
}
static void swsusp_mte_free_storage(void)
{
XA_STATE(xa_state, &mte_pages, 0);
void *tags;
xa_lock(&mte_pages);
xas_for_each(&xa_state, tags, ULONG_MAX) {
mte_free_tag_storage(tags);
}
xa_unlock(&mte_pages);
xa_destroy(&mte_pages);
}
static int swsusp_mte_save_tags(void)
{
struct zone *zone;
unsigned long pfn, max_zone_pfn;
int ret = 0;
int n = 0;
if (!system_supports_mte())
return 0;
for_each_populated_zone(zone) {
max_zone_pfn = zone_end_pfn(zone);
for (pfn = zone->zone_start_pfn; pfn < max_zone_pfn; pfn++) {
struct page *page = pfn_to_online_page(pfn);
if (!page)
continue;
if (!test_bit(PG_mte_tagged, &page->flags))
continue;
ret = save_tags(page, pfn);
if (ret) {
swsusp_mte_free_storage();
goto out;
}
n++;
}
}
pr_info("Saved %d MTE pages\n", n);
out:
return ret;
}
static void swsusp_mte_restore_tags(void)
{
XA_STATE(xa_state, &mte_pages, 0);
int n = 0;
void *tags;
xa_lock(&mte_pages);
xas_for_each(&xa_state, tags, ULONG_MAX) {
unsigned long pfn = xa_state.xa_index;
struct page *page = pfn_to_online_page(pfn);
mte_restore_page_tags(page_address(page), tags);
mte_free_tag_storage(tags);
n++;
}
xa_unlock(&mte_pages);
pr_info("Restored %d MTE pages\n", n);
xa_destroy(&mte_pages);
}
#else /* CONFIG_ARM64_MTE */
static int swsusp_mte_save_tags(void)
{
return 0;
}
static void swsusp_mte_restore_tags(void)
{
}
#endif /* CONFIG_ARM64_MTE */
int swsusp_arch_suspend(void)
{
int ret = 0;
@ -302,6 +413,10 @@ int swsusp_arch_suspend(void)
/* make the crash dump kernel image visible/saveable */
crash_prepare_suspend();
ret = swsusp_mte_save_tags();
if (ret)
return ret;
sleep_cpu = smp_processor_id();
ret = swsusp_save();
} else {
@ -315,6 +430,8 @@ int swsusp_arch_suspend(void)
dcache_clean_range(__hyp_text_start, __hyp_text_end);
}
swsusp_mte_restore_tags();
/* make the crash dump kernel image protected again */
crash_post_resume();
@ -332,11 +449,7 @@ int swsusp_arch_suspend(void)
* mitigation off behind our back, let's set the state
* to what we expect it to be.
*/
switch (arm64_get_ssbd_state()) {
case ARM64_SSBD_FORCE_ENABLE:
case ARM64_SSBD_KERNEL:
arm64_set_ssbd_mitigation(true);
}
spectre_v4_enable_mitigation(NULL);
}
local_daif_restore(flags);

View file

@ -64,12 +64,10 @@ __efistub__ctype = _ctype;
#define KVM_NVHE_ALIAS(sym) __kvm_nvhe_##sym = sym;
/* Alternative callbacks for init-time patching of nVHE hyp code. */
KVM_NVHE_ALIAS(arm64_enable_wa2_handling);
KVM_NVHE_ALIAS(kvm_patch_vector_branch);
KVM_NVHE_ALIAS(kvm_update_va_mask);
/* Global kernel state accessed by nVHE hyp code. */
KVM_NVHE_ALIAS(arm64_ssbd_callback_required);
KVM_NVHE_ALIAS(kvm_host_data);
KVM_NVHE_ALIAS(kvm_vgic_global_state);

View file

@ -62,7 +62,6 @@
*/
#define HEAD_SYMBOLS \
DEFINE_IMAGE_LE64(_kernel_size_le, _end - _text); \
DEFINE_IMAGE_LE64(_kernel_offset_le, TEXT_OFFSET); \
DEFINE_IMAGE_LE64(_kernel_flags_le, __HEAD_FLAGS);
#endif /* __ARM64_KERNEL_IMAGE_H */

View file

@ -60,16 +60,10 @@ bool __kprobes aarch64_insn_is_steppable_hint(u32 insn)
case AARCH64_INSN_HINT_XPACLRI:
case AARCH64_INSN_HINT_PACIA_1716:
case AARCH64_INSN_HINT_PACIB_1716:
case AARCH64_INSN_HINT_AUTIA_1716:
case AARCH64_INSN_HINT_AUTIB_1716:
case AARCH64_INSN_HINT_PACIAZ:
case AARCH64_INSN_HINT_PACIASP:
case AARCH64_INSN_HINT_PACIBZ:
case AARCH64_INSN_HINT_PACIBSP:
case AARCH64_INSN_HINT_AUTIAZ:
case AARCH64_INSN_HINT_AUTIASP:
case AARCH64_INSN_HINT_AUTIBZ:
case AARCH64_INSN_HINT_AUTIBSP:
case AARCH64_INSN_HINT_BTI:
case AARCH64_INSN_HINT_BTIC:
case AARCH64_INSN_HINT_BTIJ:
@ -176,7 +170,7 @@ bool __kprobes aarch64_insn_uses_literal(u32 insn)
bool __kprobes aarch64_insn_is_branch(u32 insn)
{
/* b, bl, cb*, tb*, b.cond, br, blr */
/* b, bl, cb*, tb*, ret*, b.cond, br*, blr* */
return aarch64_insn_is_b(insn) ||
aarch64_insn_is_bl(insn) ||
@ -185,8 +179,11 @@ bool __kprobes aarch64_insn_is_branch(u32 insn)
aarch64_insn_is_tbz(insn) ||
aarch64_insn_is_tbnz(insn) ||
aarch64_insn_is_ret(insn) ||
aarch64_insn_is_ret_auth(insn) ||
aarch64_insn_is_br(insn) ||
aarch64_insn_is_br_auth(insn) ||
aarch64_insn_is_blr(insn) ||
aarch64_insn_is_blr_auth(insn) ||
aarch64_insn_is_bcond(insn);
}

336
arch/arm64/kernel/mte.c Normal file
View file

@ -0,0 +1,336 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (C) 2020 ARM Ltd.
*/
#include <linux/bitops.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/prctl.h>
#include <linux/sched.h>
#include <linux/sched/mm.h>
#include <linux/string.h>
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/thread_info.h>
#include <linux/uio.h>
#include <asm/cpufeature.h>
#include <asm/mte.h>
#include <asm/ptrace.h>
#include <asm/sysreg.h>
static void mte_sync_page_tags(struct page *page, pte_t *ptep, bool check_swap)
{
pte_t old_pte = READ_ONCE(*ptep);
if (check_swap && is_swap_pte(old_pte)) {
swp_entry_t entry = pte_to_swp_entry(old_pte);
if (!non_swap_entry(entry) && mte_restore_tags(entry, page))
return;
}
mte_clear_page_tags(page_address(page));
}
void mte_sync_tags(pte_t *ptep, pte_t pte)
{
struct page *page = pte_page(pte);
long i, nr_pages = compound_nr(page);
bool check_swap = nr_pages == 1;
/* if PG_mte_tagged is set, tags have already been initialised */
for (i = 0; i < nr_pages; i++, page++) {
if (!test_and_set_bit(PG_mte_tagged, &page->flags))
mte_sync_page_tags(page, ptep, check_swap);
}
}
int memcmp_pages(struct page *page1, struct page *page2)
{
char *addr1, *addr2;
int ret;
addr1 = page_address(page1);
addr2 = page_address(page2);
ret = memcmp(addr1, addr2, PAGE_SIZE);
if (!system_supports_mte() || ret)
return ret;
/*
* If the page content is identical but at least one of the pages is
* tagged, return non-zero to avoid KSM merging. If only one of the
* pages is tagged, set_pte_at() may zero or change the tags of the
* other page via mte_sync_tags().
*/
if (test_bit(PG_mte_tagged, &page1->flags) ||
test_bit(PG_mte_tagged, &page2->flags))
return addr1 != addr2;
return ret;
}
static void update_sctlr_el1_tcf0(u64 tcf0)
{
/* ISB required for the kernel uaccess routines */
sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
isb();
}
static void set_sctlr_el1_tcf0(u64 tcf0)
{
/*
* mte_thread_switch() checks current->thread.sctlr_tcf0 as an
* optimisation. Disable preemption so that it does not see
* the variable update before the SCTLR_EL1.TCF0 one.
*/
preempt_disable();
current->thread.sctlr_tcf0 = tcf0;
update_sctlr_el1_tcf0(tcf0);
preempt_enable();
}
static void update_gcr_el1_excl(u64 incl)
{
u64 excl = ~incl & SYS_GCR_EL1_EXCL_MASK;
/*
* Note that 'incl' is an include mask (controlled by the user via
* prctl()) while GCR_EL1 accepts an exclude mask.
* No need for ISB since this only affects EL0 currently, implicit
* with ERET.
*/
sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, excl);
}
static void set_gcr_el1_excl(u64 incl)
{
current->thread.gcr_user_incl = incl;
update_gcr_el1_excl(incl);
}
void flush_mte_state(void)
{
if (!system_supports_mte())
return;
/* clear any pending asynchronous tag fault */
dsb(ish);
write_sysreg_s(0, SYS_TFSRE0_EL1);
clear_thread_flag(TIF_MTE_ASYNC_FAULT);
/* disable tag checking */
set_sctlr_el1_tcf0(SCTLR_EL1_TCF0_NONE);
/* reset tag generation mask */
set_gcr_el1_excl(0);
}
void mte_thread_switch(struct task_struct *next)
{
if (!system_supports_mte())
return;
/* avoid expensive SCTLR_EL1 accesses if no change */
if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
update_gcr_el1_excl(next->thread.gcr_user_incl);
}
void mte_suspend_exit(void)
{
if (!system_supports_mte())
return;
update_gcr_el1_excl(current->thread.gcr_user_incl);
}
long set_mte_ctrl(struct task_struct *task, unsigned long arg)
{
u64 tcf0;
u64 gcr_incl = (arg & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT;
if (!system_supports_mte())
return 0;
switch (arg & PR_MTE_TCF_MASK) {
case PR_MTE_TCF_NONE:
tcf0 = SCTLR_EL1_TCF0_NONE;
break;
case PR_MTE_TCF_SYNC:
tcf0 = SCTLR_EL1_TCF0_SYNC;
break;
case PR_MTE_TCF_ASYNC:
tcf0 = SCTLR_EL1_TCF0_ASYNC;
break;
default:
return -EINVAL;
}
if (task != current) {
task->thread.sctlr_tcf0 = tcf0;
task->thread.gcr_user_incl = gcr_incl;
} else {
set_sctlr_el1_tcf0(tcf0);
set_gcr_el1_excl(gcr_incl);
}
return 0;
}
long get_mte_ctrl(struct task_struct *task)
{
unsigned long ret;
if (!system_supports_mte())
return 0;
ret = task->thread.gcr_user_incl << PR_MTE_TAG_SHIFT;
switch (task->thread.sctlr_tcf0) {
case SCTLR_EL1_TCF0_NONE:
return PR_MTE_TCF_NONE;
case SCTLR_EL1_TCF0_SYNC:
ret |= PR_MTE_TCF_SYNC;
break;
case SCTLR_EL1_TCF0_ASYNC:
ret |= PR_MTE_TCF_ASYNC;
break;
}
return ret;
}
/*
* Access MTE tags in another process' address space as given in mm. Update
* the number of tags copied. Return 0 if any tags copied, error otherwise.
* Inspired by __access_remote_vm().
*/
static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
struct iovec *kiov, unsigned int gup_flags)
{
struct vm_area_struct *vma;
void __user *buf = kiov->iov_base;
size_t len = kiov->iov_len;
int ret;
int write = gup_flags & FOLL_WRITE;
if (!access_ok(buf, len))
return -EFAULT;
if (mmap_read_lock_killable(mm))
return -EIO;
while (len) {
unsigned long tags, offset;
void *maddr;
struct page *page = NULL;
ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
&vma, NULL);
if (ret <= 0)
break;
/*
* Only copy tags if the page has been mapped as PROT_MTE
* (PG_mte_tagged set). Otherwise the tags are not valid and
* not accessible to user. Moreover, an mprotect(PROT_MTE)
* would cause the existing tags to be cleared if the page
* was never mapped with PROT_MTE.
*/
if (!test_bit(PG_mte_tagged, &page->flags)) {
ret = -EOPNOTSUPP;
put_page(page);
break;
}
/* limit access to the end of the page */
offset = offset_in_page(addr);
tags = min(len, (PAGE_SIZE - offset) / MTE_GRANULE_SIZE);
maddr = page_address(page);
if (write) {
tags = mte_copy_tags_from_user(maddr + offset, buf, tags);
set_page_dirty_lock(page);
} else {
tags = mte_copy_tags_to_user(buf, maddr + offset, tags);
}
put_page(page);
/* error accessing the tracer's buffer */
if (!tags)
break;
len -= tags;
buf += tags;
addr += tags * MTE_GRANULE_SIZE;
}
mmap_read_unlock(mm);
/* return an error if no tags copied */
kiov->iov_len = buf - kiov->iov_base;
if (!kiov->iov_len) {
/* check for error accessing the tracee's address space */
if (ret <= 0)
return -EIO;
else
return -EFAULT;
}
return 0;
}
/*
* Copy MTE tags in another process' address space at 'addr' to/from tracer's
* iovec buffer. Return 0 on success. Inspired by ptrace_access_vm().
*/
static int access_remote_tags(struct task_struct *tsk, unsigned long addr,
struct iovec *kiov, unsigned int gup_flags)
{
struct mm_struct *mm;
int ret;
mm = get_task_mm(tsk);
if (!mm)
return -EPERM;
if (!tsk->ptrace || (current != tsk->parent) ||
((get_dumpable(mm) != SUID_DUMP_USER) &&
!ptracer_capable(tsk, mm->user_ns))) {
mmput(mm);
return -EPERM;
}
ret = __access_remote_tags(mm, addr, kiov, gup_flags);
mmput(mm);
return ret;
}
int mte_ptrace_copy_tags(struct task_struct *child, long request,
unsigned long addr, unsigned long data)
{
int ret;
struct iovec kiov;
struct iovec __user *uiov = (void __user *)data;
unsigned int gup_flags = FOLL_FORCE;
if (!system_supports_mte())
return -EIO;
if (get_user(kiov.iov_base, &uiov->iov_base) ||
get_user(kiov.iov_len, &uiov->iov_len))
return -EFAULT;
if (request == PTRACE_POKEMTETAGS)
gup_flags |= FOLL_WRITE;
/* align addr to the MTE tag granule */
addr &= MTE_GRANULE_MASK;
ret = access_remote_tags(child, addr, &kiov, gup_flags);
if (!ret)
ret = put_user(kiov.iov_len, &uiov->iov_len);
return ret;
}

View file

@ -137,11 +137,11 @@ void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
* whist unwinding the stackframe and is like a subroutine return so we use
* the PC.
*/
static int callchain_trace(struct stackframe *frame, void *data)
static bool callchain_trace(void *data, unsigned long pc)
{
struct perf_callchain_entry_ctx *entry = data;
perf_callchain_store(entry, frame->pc);
return 0;
perf_callchain_store(entry, pc);
return true;
}
void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,

View file

@ -69,6 +69,9 @@ static const unsigned armv8_pmuv3_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
[C(ITLB)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_L1I_TLB_REFILL,
[C(ITLB)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_L1I_TLB,
[C(LL)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD,
[C(LL)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_LL_CACHE_RD,
[C(BPU)][C(OP_READ)][C(RESULT_ACCESS)] = ARMV8_PMUV3_PERFCTR_BR_PRED,
[C(BPU)][C(OP_READ)][C(RESULT_MISS)] = ARMV8_PMUV3_PERFCTR_BR_MIS_PRED,
};
@ -302,13 +305,33 @@ static struct attribute_group armv8_pmuv3_format_attr_group = {
.attrs = armv8_pmuv3_format_attrs,
};
static ssize_t slots_show(struct device *dev, struct device_attribute *attr,
char *page)
{
struct pmu *pmu = dev_get_drvdata(dev);
struct arm_pmu *cpu_pmu = container_of(pmu, struct arm_pmu, pmu);
u32 slots = cpu_pmu->reg_pmmir & ARMV8_PMU_SLOTS_MASK;
return snprintf(page, PAGE_SIZE, "0x%08x\n", slots);
}
static DEVICE_ATTR_RO(slots);
static struct attribute *armv8_pmuv3_caps_attrs[] = {
&dev_attr_slots.attr,
NULL,
};
static struct attribute_group armv8_pmuv3_caps_attr_group = {
.name = "caps",
.attrs = armv8_pmuv3_caps_attrs,
};
/*
* Perf Events' indices
*/
#define ARMV8_IDX_CYCLE_COUNTER 0
#define ARMV8_IDX_COUNTER0 1
#define ARMV8_IDX_COUNTER_LAST(cpu_pmu) \
(ARMV8_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1)
/*
@ -348,6 +371,73 @@ static inline bool armv8pmu_event_is_chained(struct perf_event *event)
#define ARMV8_IDX_TO_COUNTER(x) \
(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)
/*
* This code is really good
*/
#define PMEVN_CASE(n, case_macro) \
case n: case_macro(n); break
#define PMEVN_SWITCH(x, case_macro) \
do { \
switch (x) { \
PMEVN_CASE(0, case_macro); \
PMEVN_CASE(1, case_macro); \
PMEVN_CASE(2, case_macro); \
PMEVN_CASE(3, case_macro); \
PMEVN_CASE(4, case_macro); \
PMEVN_CASE(5, case_macro); \
PMEVN_CASE(6, case_macro); \
PMEVN_CASE(7, case_macro); \
PMEVN_CASE(8, case_macro); \
PMEVN_CASE(9, case_macro); \
PMEVN_CASE(10, case_macro); \
PMEVN_CASE(11, case_macro); \
PMEVN_CASE(12, case_macro); \
PMEVN_CASE(13, case_macro); \
PMEVN_CASE(14, case_macro); \
PMEVN_CASE(15, case_macro); \
PMEVN_CASE(16, case_macro); \
PMEVN_CASE(17, case_macro); \
PMEVN_CASE(18, case_macro); \
PMEVN_CASE(19, case_macro); \
PMEVN_CASE(20, case_macro); \
PMEVN_CASE(21, case_macro); \
PMEVN_CASE(22, case_macro); \
PMEVN_CASE(23, case_macro); \
PMEVN_CASE(24, case_macro); \
PMEVN_CASE(25, case_macro); \
PMEVN_CASE(26, case_macro); \
PMEVN_CASE(27, case_macro); \
PMEVN_CASE(28, case_macro); \
PMEVN_CASE(29, case_macro); \
PMEVN_CASE(30, case_macro); \
default: WARN(1, "Invalid PMEV* index\n"); \
} \
} while (0)
#define RETURN_READ_PMEVCNTRN(n) \
return read_sysreg(pmevcntr##n##_el0)
static unsigned long read_pmevcntrn(int n)
{
PMEVN_SWITCH(n, RETURN_READ_PMEVCNTRN);
return 0;
}
#define WRITE_PMEVCNTRN(n) \
write_sysreg(val, pmevcntr##n##_el0)
static void write_pmevcntrn(int n, unsigned long val)
{
PMEVN_SWITCH(n, WRITE_PMEVCNTRN);
}
#define WRITE_PMEVTYPERN(n) \
write_sysreg(val, pmevtyper##n##_el0)
static void write_pmevtypern(int n, unsigned long val)
{
PMEVN_SWITCH(n, WRITE_PMEVTYPERN);
}
static inline u32 armv8pmu_pmcr_read(void)
{
return read_sysreg(pmcr_el0);
@ -365,28 +455,16 @@ static inline int armv8pmu_has_overflowed(u32 pmovsr)
return pmovsr & ARMV8_PMU_OVERFLOWED_MASK;
}
static inline int armv8pmu_counter_valid(struct arm_pmu *cpu_pmu, int idx)
{
return idx >= ARMV8_IDX_CYCLE_COUNTER &&
idx <= ARMV8_IDX_COUNTER_LAST(cpu_pmu);
}
static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx)
{
return pmnc & BIT(ARMV8_IDX_TO_COUNTER(idx));
}
static inline void armv8pmu_select_counter(int idx)
static inline u32 armv8pmu_read_evcntr(int idx)
{
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
write_sysreg(counter, pmselr_el0);
isb();
}
static inline u64 armv8pmu_read_evcntr(int idx)
{
armv8pmu_select_counter(idx);
return read_sysreg(pmxevcntr_el0);
return read_pmevcntrn(counter);
}
static inline u64 armv8pmu_read_hw_counter(struct perf_event *event)
@ -440,15 +518,11 @@ static u64 armv8pmu_unbias_long_counter(struct perf_event *event, u64 value)
static u64 armv8pmu_read_counter(struct perf_event *event)
{
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
u64 value = 0;
if (!armv8pmu_counter_valid(cpu_pmu, idx))
pr_err("CPU%u reading wrong counter %d\n",
smp_processor_id(), idx);
else if (idx == ARMV8_IDX_CYCLE_COUNTER)
if (idx == ARMV8_IDX_CYCLE_COUNTER)
value = read_sysreg(pmccntr_el0);
else
value = armv8pmu_read_hw_counter(event);
@ -458,8 +532,9 @@ static u64 armv8pmu_read_counter(struct perf_event *event)
static inline void armv8pmu_write_evcntr(int idx, u64 value)
{
armv8pmu_select_counter(idx);
write_sysreg(value, pmxevcntr_el0);
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
write_pmevcntrn(counter, value);
}
static inline void armv8pmu_write_hw_counter(struct perf_event *event,
@ -477,16 +552,12 @@ static inline void armv8pmu_write_hw_counter(struct perf_event *event,
static void armv8pmu_write_counter(struct perf_event *event, u64 value)
{
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
value = armv8pmu_bias_long_counter(event, value);
if (!armv8pmu_counter_valid(cpu_pmu, idx))
pr_err("CPU%u writing wrong counter %d\n",
smp_processor_id(), idx);
else if (idx == ARMV8_IDX_CYCLE_COUNTER)
if (idx == ARMV8_IDX_CYCLE_COUNTER)
write_sysreg(value, pmccntr_el0);
else
armv8pmu_write_hw_counter(event, value);
@ -494,9 +565,10 @@ static void armv8pmu_write_counter(struct perf_event *event, u64 value)
static inline void armv8pmu_write_evtype(int idx, u32 val)
{
armv8pmu_select_counter(idx);
u32 counter = ARMV8_IDX_TO_COUNTER(idx);
val &= ARMV8_PMU_EVTYPE_MASK;
write_sysreg(val, pmxevtyper_el0);
write_pmevtypern(counter, val);
}
static inline void armv8pmu_write_event_type(struct perf_event *event)
@ -516,7 +588,10 @@ static inline void armv8pmu_write_event_type(struct perf_event *event)
armv8pmu_write_evtype(idx - 1, hwc->config_base);
armv8pmu_write_evtype(idx, chain_evt);
} else {
armv8pmu_write_evtype(idx, hwc->config_base);
if (idx == ARMV8_IDX_CYCLE_COUNTER)
write_sysreg(hwc->config_base, pmccfiltr_el0);
else
armv8pmu_write_evtype(idx, hwc->config_base);
}
}
@ -532,6 +607,11 @@ static u32 armv8pmu_event_cnten_mask(struct perf_event *event)
static inline void armv8pmu_enable_counter(u32 mask)
{
/*
* Make sure event configuration register writes are visible before we
* enable the counter.
* */
isb();
write_sysreg(mask, pmcntenset_el0);
}
@ -550,6 +630,11 @@ static inline void armv8pmu_enable_event_counter(struct perf_event *event)
static inline void armv8pmu_disable_counter(u32 mask)
{
write_sysreg(mask, pmcntenclr_el0);
/*
* Make sure the effects of disabling the counter are visible before we
* start configuring the event.
*/
isb();
}
static inline void armv8pmu_disable_event_counter(struct perf_event *event)
@ -606,15 +691,10 @@ static inline u32 armv8pmu_getreset_flags(void)
static void armv8pmu_enable_event(struct perf_event *event)
{
unsigned long flags;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
/*
* Enable counter and interrupt, and set the counter to count
* the event that we're interested in.
*/
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/*
* Disable counter
@ -622,7 +702,7 @@ static void armv8pmu_enable_event(struct perf_event *event)
armv8pmu_disable_event_counter(event);
/*
* Set event (if destined for PMNx counters).
* Set event.
*/
armv8pmu_write_event_type(event);
@ -635,21 +715,10 @@ static void armv8pmu_enable_event(struct perf_event *event)
* Enable counter
*/
armv8pmu_enable_event_counter(event);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
}
static void armv8pmu_disable_event(struct perf_event *event)
{
unsigned long flags;
struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
/*
* Disable counter and interrupt
*/
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/*
* Disable counter
*/
@ -659,30 +728,18 @@ static void armv8pmu_disable_event(struct perf_event *event)
* Disable interrupt for this counter
*/
armv8pmu_disable_event_irq(event);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
}
static void armv8pmu_start(struct arm_pmu *cpu_pmu)
{
unsigned long flags;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* Enable all counters */
armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMU_PMCR_E);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
}
static void armv8pmu_stop(struct arm_pmu *cpu_pmu)
{
unsigned long flags;
struct pmu_hw_events *events = this_cpu_ptr(cpu_pmu->hw_events);
raw_spin_lock_irqsave(&events->pmu_lock, flags);
/* Disable all counters */
armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMU_PMCR_E);
raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
}
static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
@ -735,20 +792,16 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
if (!armpmu_event_set_period(event))
continue;
/*
* Perf event overflow will queue the processing of the event as
* an irq_work which will be taken care of in the handling of
* IPI_IRQ_WORK.
*/
if (perf_event_overflow(event, &data, regs))
cpu_pmu->disable(event);
}
armv8pmu_start(cpu_pmu);
/*
* Handle the pending perf events.
*
* Note: this call *must* be run with interrupts disabled. For
* platforms that can have the PMU interrupts raised as an NMI, this
* will not work.
*/
irq_work_run();
return IRQ_HANDLED;
}
@ -997,6 +1050,12 @@ static void __armv8pmu_probe_pmu(void *info)
bitmap_from_arr32(cpu_pmu->pmceid_ext_bitmap,
pmceid, ARMV8_PMUV3_MAX_COMMON_EVENTS);
/* store PMMIR_EL1 register for sysfs */
if (pmuver >= ID_AA64DFR0_PMUVER_8_4 && (pmceid_raw[1] & BIT(31)))
cpu_pmu->reg_pmmir = read_cpuid(PMMIR_EL1);
else
cpu_pmu->reg_pmmir = 0;
}
static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
@ -1019,7 +1078,8 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
int (*map_event)(struct perf_event *event),
const struct attribute_group *events,
const struct attribute_group *format)
const struct attribute_group *format,
const struct attribute_group *caps)
{
int ret = armv8pmu_probe_pmu(cpu_pmu);
if (ret)
@ -1044,104 +1104,112 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
events : &armv8_pmuv3_events_attr_group;
cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_FORMATS] = format ?
format : &armv8_pmuv3_format_attr_group;
cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_CAPS] = caps ?
caps : &armv8_pmuv3_caps_attr_group;
return 0;
}
static int armv8_pmu_init_nogroups(struct arm_pmu *cpu_pmu, char *name,
int (*map_event)(struct perf_event *event))
{
return armv8_pmu_init(cpu_pmu, name, map_event, NULL, NULL, NULL);
}
static int armv8_pmuv3_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_pmuv3",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_pmuv3",
armv8_pmuv3_map_event);
}
static int armv8_a34_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a34",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a34",
armv8_pmuv3_map_event);
}
static int armv8_a35_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a35",
armv8_a53_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a35",
armv8_a53_map_event);
}
static int armv8_a53_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a53",
armv8_a53_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a53",
armv8_a53_map_event);
}
static int armv8_a55_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a55",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a55",
armv8_pmuv3_map_event);
}
static int armv8_a57_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a57",
armv8_a57_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a57",
armv8_a57_map_event);
}
static int armv8_a65_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a65",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a65",
armv8_pmuv3_map_event);
}
static int armv8_a72_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a72",
armv8_a57_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a72",
armv8_a57_map_event);
}
static int armv8_a73_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a73",
armv8_a73_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a73",
armv8_a73_map_event);
}
static int armv8_a75_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a75",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a75",
armv8_pmuv3_map_event);
}
static int armv8_a76_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a76",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a76",
armv8_pmuv3_map_event);
}
static int armv8_a77_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cortex_a77",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cortex_a77",
armv8_pmuv3_map_event);
}
static int armv8_e1_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_neoverse_e1",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_neoverse_e1",
armv8_pmuv3_map_event);
}
static int armv8_n1_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_neoverse_n1",
armv8_pmuv3_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_neoverse_n1",
armv8_pmuv3_map_event);
}
static int armv8_thunder_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_cavium_thunder",
armv8_thunder_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_cavium_thunder",
armv8_thunder_map_event);
}
static int armv8_vulcan_pmu_init(struct arm_pmu *cpu_pmu)
{
return armv8_pmu_init(cpu_pmu, "armv8_brcm_vulcan",
armv8_vulcan_map_event, NULL, NULL);
return armv8_pmu_init_nogroups(cpu_pmu, "armv8_brcm_vulcan",
armv8_vulcan_map_event);
}
static const struct of_device_id armv8_pmu_of_device_ids[] = {

View file

@ -16,7 +16,7 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
/*
* Our handling of compat tasks (PERF_SAMPLE_REGS_ABI_32) is weird, but
* we're stuck with it for ABI compatability reasons.
* we're stuck with it for ABI compatibility reasons.
*
* For a 32-bit consumer inspecting a 32-bit task, then it will look at
* the first 16 registers (see arch/arm/include/uapi/asm/perf_regs.h).

View file

@ -29,7 +29,8 @@ static bool __kprobes aarch64_insn_is_steppable(u32 insn)
aarch64_insn_is_msr_imm(insn) ||
aarch64_insn_is_msr_reg(insn) ||
aarch64_insn_is_exception(insn) ||
aarch64_insn_is_eret(insn))
aarch64_insn_is_eret(insn) ||
aarch64_insn_is_eret_auth(insn))
return false;
/*
@ -42,8 +43,10 @@ static bool __kprobes aarch64_insn_is_steppable(u32 insn)
!= AARCH64_INSN_SPCLREG_DAIF;
/*
* The HINT instruction is is problematic when single-stepping,
* except for the NOP case.
* The HINT instruction is steppable only if it is in whitelist
* and the rest of other such instructions are blocked for
* single stepping as they may cause exception or other
* unintended behaviour.
*/
if (aarch64_insn_is_hint(insn))
return aarch64_insn_is_steppable_hint(insn);

View file

@ -21,6 +21,7 @@
#include <linux/lockdep.h>
#include <linux/mman.h>
#include <linux/mm.h>
#include <linux/nospec.h>
#include <linux/stddef.h>
#include <linux/sysctl.h>
#include <linux/unistd.h>
@ -52,6 +53,7 @@
#include <asm/exec.h>
#include <asm/fpsimd.h>
#include <asm/mmu_context.h>
#include <asm/mte.h>
#include <asm/processor.h>
#include <asm/pointer_auth.h>
#include <asm/stacktrace.h>
@ -239,7 +241,7 @@ static void print_pstate(struct pt_regs *regs)
const char *btype_str = btypes[(pstate & PSR_BTYPE_MASK) >>
PSR_BTYPE_SHIFT];
printk("pstate: %08llx (%c%c%c%c %c%c%c%c %cPAN %cUAO BTYPE=%s)\n",
printk("pstate: %08llx (%c%c%c%c %c%c%c%c %cPAN %cUAO %cTCO BTYPE=%s)\n",
pstate,
pstate & PSR_N_BIT ? 'N' : 'n',
pstate & PSR_Z_BIT ? 'Z' : 'z',
@ -251,6 +253,7 @@ static void print_pstate(struct pt_regs *regs)
pstate & PSR_F_BIT ? 'F' : 'f',
pstate & PSR_PAN_BIT ? '+' : '-',
pstate & PSR_UAO_BIT ? '+' : '-',
pstate & PSR_TCO_BIT ? '+' : '-',
btype_str);
}
}
@ -336,6 +339,7 @@ void flush_thread(void)
tls_thread_flush();
flush_ptrace_hw_breakpoint(current);
flush_tagged_addr_state();
flush_mte_state();
}
void release_thread(struct task_struct *dead_task)
@ -368,6 +372,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
dst->thread.sve_state = NULL;
clear_tsk_thread_flag(dst, TIF_SVE);
/* clear any pending asynchronous tag fault raised by the parent */
clear_tsk_thread_flag(dst, TIF_MTE_ASYNC_FAULT);
return 0;
}
@ -421,8 +428,7 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
cpus_have_const_cap(ARM64_HAS_UAO))
childregs->pstate |= PSR_UAO_BIT;
if (arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE)
set_ssbs_bit(childregs);
spectre_v4_enable_task_mitigation(p);
if (system_uses_irq_prio_masking())
childregs->pmr_save = GIC_PRIO_IRQON;
@ -472,8 +478,6 @@ void uao_thread_switch(struct task_struct *next)
*/
static void ssbs_thread_switch(struct task_struct *next)
{
struct pt_regs *regs = task_pt_regs(next);
/*
* Nothing to do for kernel threads, but 'regs' may be junk
* (e.g. idle task) so check the flags and bail early.
@ -485,18 +489,10 @@ static void ssbs_thread_switch(struct task_struct *next)
* If all CPUs implement the SSBS extension, then we just need to
* context-switch the PSTATE field.
*/
if (cpu_have_feature(cpu_feature(SSBS)))
if (cpus_have_const_cap(ARM64_SSBS))
return;
/* If the mitigation is enabled, then we leave SSBS clear. */
if ((arm64_get_ssbd_state() == ARM64_SSBD_FORCE_ENABLE) ||
test_tsk_thread_flag(next, TIF_SSBD))
return;
if (compat_user_mode(regs))
set_compat_ssbs_bit(regs);
else if (user_mode(regs))
set_ssbs_bit(regs);
spectre_v4_enable_task_mitigation(next);
}
/*
@ -571,6 +567,13 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
*/
dsb(ish);
/*
* MTE thread switching must happen after the DSB above to ensure that
* any asynchronous tag check faults have been logged in the TFSR*_EL1
* registers.
*/
mte_thread_switch(next);
/* the actual thread switch */
last = cpu_switch_to(prev, next);
@ -620,6 +623,11 @@ void arch_setup_new_exec(void)
current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0;
ptrauth_thread_init_user(current);
if (task_spec_ssb_noexec(current)) {
arch_prctl_spec_ctrl_set(current, PR_SPEC_STORE_BYPASS,
PR_SPEC_ENABLE);
}
}
#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
@ -628,11 +636,18 @@ void arch_setup_new_exec(void)
*/
static unsigned int tagged_addr_disabled;
long set_tagged_addr_ctrl(unsigned long arg)
long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg)
{
if (is_compat_task())
unsigned long valid_mask = PR_TAGGED_ADDR_ENABLE;
struct thread_info *ti = task_thread_info(task);
if (is_compat_thread(ti))
return -EINVAL;
if (arg & ~PR_TAGGED_ADDR_ENABLE)
if (system_supports_mte())
valid_mask |= PR_MTE_TCF_MASK | PR_MTE_TAG_MASK;
if (arg & ~valid_mask)
return -EINVAL;
/*
@ -642,20 +657,28 @@ long set_tagged_addr_ctrl(unsigned long arg)
if (arg & PR_TAGGED_ADDR_ENABLE && tagged_addr_disabled)
return -EINVAL;
update_thread_flag(TIF_TAGGED_ADDR, arg & PR_TAGGED_ADDR_ENABLE);
if (set_mte_ctrl(task, arg) != 0)
return -EINVAL;
update_ti_thread_flag(ti, TIF_TAGGED_ADDR, arg & PR_TAGGED_ADDR_ENABLE);
return 0;
}
long get_tagged_addr_ctrl(void)
long get_tagged_addr_ctrl(struct task_struct *task)
{
if (is_compat_task())
long ret = 0;
struct thread_info *ti = task_thread_info(task);
if (is_compat_thread(ti))
return -EINVAL;
if (test_thread_flag(TIF_TAGGED_ADDR))
return PR_TAGGED_ADDR_ENABLE;
if (test_ti_thread_flag(ti, TIF_TAGGED_ADDR))
ret = PR_TAGGED_ADDR_ENABLE;
return 0;
ret |= get_mte_ctrl(task);
return ret;
}
/*

View file

@ -0,0 +1,792 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Handle detection, reporting and mitigation of Spectre v1, v2 and v4, as
* detailed at:
*
* https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability
*
* This code was originally written hastily under an awful lot of stress and so
* aspects of it are somewhat hacky. Unfortunately, changing anything in here
* instantly makes me feel ill. Thanks, Jann. Thann.
*
* Copyright (C) 2018 ARM Ltd, All Rights Reserved.
* Copyright (C) 2020 Google LLC
*
* "If there's something strange in your neighbourhood, who you gonna call?"
*
* Authors: Will Deacon <will@kernel.org> and Marc Zyngier <maz@kernel.org>
*/
#include <linux/arm-smccc.h>
#include <linux/cpu.h>
#include <linux/device.h>
#include <linux/nospec.h>
#include <linux/prctl.h>
#include <linux/sched/task_stack.h>
#include <asm/spectre.h>
#include <asm/traps.h>
/*
* We try to ensure that the mitigation state can never change as the result of
* onlining a late CPU.
*/
static void update_mitigation_state(enum mitigation_state *oldp,
enum mitigation_state new)
{
enum mitigation_state state;
do {
state = READ_ONCE(*oldp);
if (new <= state)
break;
/* Userspace almost certainly can't deal with this. */
if (WARN_ON(system_capabilities_finalized()))
break;
} while (cmpxchg_relaxed(oldp, state, new) != state);
}
/*
* Spectre v1.
*
* The kernel can't protect userspace for this one: it's each person for
* themselves. Advertise what we're doing and be done with it.
*/
ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr,
char *buf)
{
return sprintf(buf, "Mitigation: __user pointer sanitization\n");
}
/*
* Spectre v2.
*
* This one sucks. A CPU is either:
*
* - Mitigated in hardware and advertised by ID_AA64PFR0_EL1.CSV2.
* - Mitigated in hardware and listed in our "safe list".
* - Mitigated in software by firmware.
* - Mitigated in software by a CPU-specific dance in the kernel.
* - Vulnerable.
*
* It's not unlikely for different CPUs in a big.LITTLE system to fall into
* different camps.
*/
static enum mitigation_state spectre_v2_state;
static bool __read_mostly __nospectre_v2;
static int __init parse_spectre_v2_param(char *str)
{
__nospectre_v2 = true;
return 0;
}
early_param("nospectre_v2", parse_spectre_v2_param);
static bool spectre_v2_mitigations_off(void)
{
bool ret = __nospectre_v2 || cpu_mitigations_off();
if (ret)
pr_info_once("spectre-v2 mitigation disabled by command line option\n");
return ret;
}
ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr,
char *buf)
{
switch (spectre_v2_state) {
case SPECTRE_UNAFFECTED:
return sprintf(buf, "Not affected\n");
case SPECTRE_MITIGATED:
return sprintf(buf, "Mitigation: Branch predictor hardening\n");
case SPECTRE_VULNERABLE:
fallthrough;
default:
return sprintf(buf, "Vulnerable\n");
}
}
static enum mitigation_state spectre_v2_get_cpu_hw_mitigation_state(void)
{
u64 pfr0;
static const struct midr_range spectre_v2_safe_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
MIDR_ALL_VERSIONS(MIDR_HISI_TSV110),
MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_3XX_SILVER),
MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_SILVER),
{ /* sentinel */ }
};
/* If the CPU has CSV2 set, we're safe */
pfr0 = read_cpuid(ID_AA64PFR0_EL1);
if (cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_CSV2_SHIFT))
return SPECTRE_UNAFFECTED;
/* Alternatively, we have a list of unaffected CPUs */
if (is_midr_in_range_list(read_cpuid_id(), spectre_v2_safe_list))
return SPECTRE_UNAFFECTED;
return SPECTRE_VULNERABLE;
}
#define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED (1)
static enum mitigation_state spectre_v2_get_cpu_fw_mitigation_state(void)
{
int ret;
struct arm_smccc_res res;
arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_1, &res);
ret = res.a0;
switch (ret) {
case SMCCC_RET_SUCCESS:
return SPECTRE_MITIGATED;
case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
return SPECTRE_UNAFFECTED;
default:
fallthrough;
case SMCCC_RET_NOT_SUPPORTED:
return SPECTRE_VULNERABLE;
}
}
bool has_spectre_v2(const struct arm64_cpu_capabilities *entry, int scope)
{
WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
if (spectre_v2_get_cpu_hw_mitigation_state() == SPECTRE_UNAFFECTED)
return false;
if (spectre_v2_get_cpu_fw_mitigation_state() == SPECTRE_UNAFFECTED)
return false;
return true;
}
DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
enum mitigation_state arm64_get_spectre_v2_state(void)
{
return spectre_v2_state;
}
#ifdef CONFIG_KVM
#include <asm/cacheflush.h>
#include <asm/kvm_asm.h>
atomic_t arm64_el2_vector_last_slot = ATOMIC_INIT(-1);
static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
const char *hyp_vecs_end)
{
void *dst = lm_alias(__bp_harden_hyp_vecs + slot * SZ_2K);
int i;
for (i = 0; i < SZ_2K; i += 0x80)
memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
__flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
}
static void install_bp_hardening_cb(bp_hardening_cb_t fn)
{
static DEFINE_RAW_SPINLOCK(bp_lock);
int cpu, slot = -1;
const char *hyp_vecs_start = __smccc_workaround_1_smc;
const char *hyp_vecs_end = __smccc_workaround_1_smc +
__SMCCC_WORKAROUND_1_SMC_SZ;
/*
* detect_harden_bp_fw() passes NULL for the hyp_vecs start/end if
* we're a guest. Skip the hyp-vectors work.
*/
if (!is_hyp_mode_available()) {
__this_cpu_write(bp_hardening_data.fn, fn);
return;
}
raw_spin_lock(&bp_lock);
for_each_possible_cpu(cpu) {
if (per_cpu(bp_hardening_data.fn, cpu) == fn) {
slot = per_cpu(bp_hardening_data.hyp_vectors_slot, cpu);
break;
}
}
if (slot == -1) {
slot = atomic_inc_return(&arm64_el2_vector_last_slot);
BUG_ON(slot >= BP_HARDEN_EL2_SLOTS);
__copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end);
}
__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
__this_cpu_write(bp_hardening_data.fn, fn);
raw_spin_unlock(&bp_lock);
}
#else
static void install_bp_hardening_cb(bp_hardening_cb_t fn)
{
__this_cpu_write(bp_hardening_data.fn, fn);
}
#endif /* CONFIG_KVM */
static void call_smc_arch_workaround_1(void)
{
arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
}
static void call_hvc_arch_workaround_1(void)
{
arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
}
static void qcom_link_stack_sanitisation(void)
{
u64 tmp;
asm volatile("mov %0, x30 \n"
".rept 16 \n"
"bl . + 4 \n"
".endr \n"
"mov x30, %0 \n"
: "=&r" (tmp));
}
static enum mitigation_state spectre_v2_enable_fw_mitigation(void)
{
bp_hardening_cb_t cb;
enum mitigation_state state;
state = spectre_v2_get_cpu_fw_mitigation_state();
if (state != SPECTRE_MITIGATED)
return state;
if (spectre_v2_mitigations_off())
return SPECTRE_VULNERABLE;
switch (arm_smccc_1_1_get_conduit()) {
case SMCCC_CONDUIT_HVC:
cb = call_hvc_arch_workaround_1;
break;
case SMCCC_CONDUIT_SMC:
cb = call_smc_arch_workaround_1;
break;
default:
return SPECTRE_VULNERABLE;
}
install_bp_hardening_cb(cb);
return SPECTRE_MITIGATED;
}
static enum mitigation_state spectre_v2_enable_sw_mitigation(void)
{
u32 midr;
if (spectre_v2_mitigations_off())
return SPECTRE_VULNERABLE;
midr = read_cpuid_id();
if (((midr & MIDR_CPU_MODEL_MASK) != MIDR_QCOM_FALKOR) &&
((midr & MIDR_CPU_MODEL_MASK) != MIDR_QCOM_FALKOR_V1))
return SPECTRE_VULNERABLE;
install_bp_hardening_cb(qcom_link_stack_sanitisation);
return SPECTRE_MITIGATED;
}
void spectre_v2_enable_mitigation(const struct arm64_cpu_capabilities *__unused)
{
enum mitigation_state state;
WARN_ON(preemptible());
state = spectre_v2_get_cpu_hw_mitigation_state();
if (state == SPECTRE_VULNERABLE)
state = spectre_v2_enable_fw_mitigation();
if (state == SPECTRE_VULNERABLE)
state = spectre_v2_enable_sw_mitigation();
update_mitigation_state(&spectre_v2_state, state);
}
/*
* Spectre v4.
*
* If you thought Spectre v2 was nasty, wait until you see this mess. A CPU is
* either:
*
* - Mitigated in hardware and listed in our "safe list".
* - Mitigated in hardware via PSTATE.SSBS.
* - Mitigated in software by firmware (sometimes referred to as SSBD).
*
* Wait, that doesn't sound so bad, does it? Keep reading...
*
* A major source of headaches is that the software mitigation is enabled both
* on a per-task basis, but can also be forced on for the kernel, necessitating
* both context-switch *and* entry/exit hooks. To make it even worse, some CPUs
* allow EL0 to toggle SSBS directly, which can end up with the prctl() state
* being stale when re-entering the kernel. The usual big.LITTLE caveats apply,
* so you can have systems that have both firmware and SSBS mitigations. This
* means we actually have to reject late onlining of CPUs with mitigations if
* all of the currently onlined CPUs are safelisted, as the mitigation tends to
* be opt-in for userspace. Yes, really, the cure is worse than the disease.
*
* The only good part is that if the firmware mitigation is present, then it is
* present for all CPUs, meaning we don't have to worry about late onlining of a
* vulnerable CPU if one of the boot CPUs is using the firmware mitigation.
*
* Give me a VAX-11/780 any day of the week...
*/
static enum mitigation_state spectre_v4_state;
/* This is the per-cpu state tracking whether we need to talk to firmware */
DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
enum spectre_v4_policy {
SPECTRE_V4_POLICY_MITIGATION_DYNAMIC,
SPECTRE_V4_POLICY_MITIGATION_ENABLED,
SPECTRE_V4_POLICY_MITIGATION_DISABLED,
};
static enum spectre_v4_policy __read_mostly __spectre_v4_policy;
static const struct spectre_v4_param {
const char *str;
enum spectre_v4_policy policy;
} spectre_v4_params[] = {
{ "force-on", SPECTRE_V4_POLICY_MITIGATION_ENABLED, },
{ "force-off", SPECTRE_V4_POLICY_MITIGATION_DISABLED, },
{ "kernel", SPECTRE_V4_POLICY_MITIGATION_DYNAMIC, },
};
static int __init parse_spectre_v4_param(char *str)
{
int i;
if (!str || !str[0])
return -EINVAL;
for (i = 0; i < ARRAY_SIZE(spectre_v4_params); i++) {
const struct spectre_v4_param *param = &spectre_v4_params[i];
if (strncmp(str, param->str, strlen(param->str)))
continue;
__spectre_v4_policy = param->policy;
return 0;
}
return -EINVAL;
}
early_param("ssbd", parse_spectre_v4_param);
/*
* Because this was all written in a rush by people working in different silos,
* we've ended up with multiple command line options to control the same thing.
* Wrap these up in some helpers, which prefer disabling the mitigation if faced
* with contradictory parameters. The mitigation is always either "off",
* "dynamic" or "on".
*/
static bool spectre_v4_mitigations_off(void)
{
bool ret = cpu_mitigations_off() ||
__spectre_v4_policy == SPECTRE_V4_POLICY_MITIGATION_DISABLED;
if (ret)
pr_info_once("spectre-v4 mitigation disabled by command-line option\n");
return ret;
}
/* Do we need to toggle the mitigation state on entry to/exit from the kernel? */
static bool spectre_v4_mitigations_dynamic(void)
{
return !spectre_v4_mitigations_off() &&
__spectre_v4_policy == SPECTRE_V4_POLICY_MITIGATION_DYNAMIC;
}
static bool spectre_v4_mitigations_on(void)
{
return !spectre_v4_mitigations_off() &&
__spectre_v4_policy == SPECTRE_V4_POLICY_MITIGATION_ENABLED;
}
ssize_t cpu_show_spec_store_bypass(struct device *dev,
struct device_attribute *attr, char *buf)
{
switch (spectre_v4_state) {
case SPECTRE_UNAFFECTED:
return sprintf(buf, "Not affected\n");
case SPECTRE_MITIGATED:
return sprintf(buf, "Mitigation: Speculative Store Bypass disabled via prctl\n");
case SPECTRE_VULNERABLE:
fallthrough;
default:
return sprintf(buf, "Vulnerable\n");
}
}
enum mitigation_state arm64_get_spectre_v4_state(void)
{
return spectre_v4_state;
}
static enum mitigation_state spectre_v4_get_cpu_hw_mitigation_state(void)
{
static const struct midr_range spectre_v4_safe_list[] = {
MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_3XX_SILVER),
MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_SILVER),
{ /* sentinel */ },
};
if (is_midr_in_range_list(read_cpuid_id(), spectre_v4_safe_list))
return SPECTRE_UNAFFECTED;
/* CPU features are detected first */
if (this_cpu_has_cap(ARM64_SSBS))
return SPECTRE_MITIGATED;
return SPECTRE_VULNERABLE;
}
static enum mitigation_state spectre_v4_get_cpu_fw_mitigation_state(void)
{
int ret;
struct arm_smccc_res res;
arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
ARM_SMCCC_ARCH_WORKAROUND_2, &res);
ret = res.a0;
switch (ret) {
case SMCCC_RET_SUCCESS:
return SPECTRE_MITIGATED;
case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
fallthrough;
case SMCCC_RET_NOT_REQUIRED:
return SPECTRE_UNAFFECTED;
default:
fallthrough;
case SMCCC_RET_NOT_SUPPORTED:
return SPECTRE_VULNERABLE;
}
}
bool has_spectre_v4(const struct arm64_cpu_capabilities *cap, int scope)
{
enum mitigation_state state;
WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
state = spectre_v4_get_cpu_hw_mitigation_state();
if (state == SPECTRE_VULNERABLE)
state = spectre_v4_get_cpu_fw_mitigation_state();
return state != SPECTRE_UNAFFECTED;
}
static int ssbs_emulation_handler(struct pt_regs *regs, u32 instr)
{
if (user_mode(regs))
return 1;
if (instr & BIT(PSTATE_Imm_shift))
regs->pstate |= PSR_SSBS_BIT;
else
regs->pstate &= ~PSR_SSBS_BIT;
arm64_skip_faulting_instruction(regs, 4);
return 0;
}
static struct undef_hook ssbs_emulation_hook = {
.instr_mask = ~(1U << PSTATE_Imm_shift),
.instr_val = 0xd500401f | PSTATE_SSBS,
.fn = ssbs_emulation_handler,
};
static enum mitigation_state spectre_v4_enable_hw_mitigation(void)
{
static bool undef_hook_registered = false;
static DEFINE_RAW_SPINLOCK(hook_lock);
enum mitigation_state state;
/*
* If the system is mitigated but this CPU doesn't have SSBS, then
* we must be on the safelist and there's nothing more to do.
*/
state = spectre_v4_get_cpu_hw_mitigation_state();
if (state != SPECTRE_MITIGATED || !this_cpu_has_cap(ARM64_SSBS))
return state;
raw_spin_lock(&hook_lock);
if (!undef_hook_registered) {
register_undef_hook(&ssbs_emulation_hook);
undef_hook_registered = true;
}
raw_spin_unlock(&hook_lock);
if (spectre_v4_mitigations_off()) {
sysreg_clear_set(sctlr_el1, 0, SCTLR_ELx_DSSBS);
asm volatile(SET_PSTATE_SSBS(1));
return SPECTRE_VULNERABLE;
}
/* SCTLR_EL1.DSSBS was initialised to 0 during boot */
asm volatile(SET_PSTATE_SSBS(0));
return SPECTRE_MITIGATED;
}
/*
* Patch a branch over the Spectre-v4 mitigation code with a NOP so that
* we fallthrough and check whether firmware needs to be called on this CPU.
*/
void __init spectre_v4_patch_fw_mitigation_enable(struct alt_instr *alt,
__le32 *origptr,
__le32 *updptr, int nr_inst)
{
BUG_ON(nr_inst != 1); /* Branch -> NOP */
if (spectre_v4_mitigations_off())
return;
if (cpus_have_final_cap(ARM64_SSBS))
return;
if (spectre_v4_mitigations_dynamic())
*updptr = cpu_to_le32(aarch64_insn_gen_nop());
}
/*
* Patch a NOP in the Spectre-v4 mitigation code with an SMC/HVC instruction
* to call into firmware to adjust the mitigation state.
*/
void __init spectre_v4_patch_fw_mitigation_conduit(struct alt_instr *alt,
__le32 *origptr,
__le32 *updptr, int nr_inst)
{
u32 insn;
BUG_ON(nr_inst != 1); /* NOP -> HVC/SMC */
switch (arm_smccc_1_1_get_conduit()) {
case SMCCC_CONDUIT_HVC:
insn = aarch64_insn_get_hvc_value();
break;
case SMCCC_CONDUIT_SMC:
insn = aarch64_insn_get_smc_value();
break;
default:
return;
}
*updptr = cpu_to_le32(insn);
}
static enum mitigation_state spectre_v4_enable_fw_mitigation(void)
{
enum mitigation_state state;
state = spectre_v4_get_cpu_fw_mitigation_state();
if (state != SPECTRE_MITIGATED)
return state;
if (spectre_v4_mitigations_off()) {
arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_WORKAROUND_2, false, NULL);
return SPECTRE_VULNERABLE;
}
arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_WORKAROUND_2, true, NULL);
if (spectre_v4_mitigations_dynamic())
__this_cpu_write(arm64_ssbd_callback_required, 1);
return SPECTRE_MITIGATED;
}
void spectre_v4_enable_mitigation(const struct arm64_cpu_capabilities *__unused)
{
enum mitigation_state state;
WARN_ON(preemptible());
state = spectre_v4_enable_hw_mitigation();
if (state == SPECTRE_VULNERABLE)
state = spectre_v4_enable_fw_mitigation();
update_mitigation_state(&spectre_v4_state, state);
}
static void __update_pstate_ssbs(struct pt_regs *regs, bool state)
{
u64 bit = compat_user_mode(regs) ? PSR_AA32_SSBS_BIT : PSR_SSBS_BIT;
if (state)
regs->pstate |= bit;
else
regs->pstate &= ~bit;
}
void spectre_v4_enable_task_mitigation(struct task_struct *tsk)
{
struct pt_regs *regs = task_pt_regs(tsk);
bool ssbs = false, kthread = tsk->flags & PF_KTHREAD;
if (spectre_v4_mitigations_off())
ssbs = true;
else if (spectre_v4_mitigations_dynamic() && !kthread)
ssbs = !test_tsk_thread_flag(tsk, TIF_SSBD);
__update_pstate_ssbs(regs, ssbs);
}
/*
* The Spectre-v4 mitigation can be controlled via a prctl() from userspace.
* This is interesting because the "speculation disabled" behaviour can be
* configured so that it is preserved across exec(), which means that the
* prctl() may be necessary even when PSTATE.SSBS can be toggled directly
* from userspace.
*/
static void ssbd_prctl_enable_mitigation(struct task_struct *task)
{
task_clear_spec_ssb_noexec(task);
task_set_spec_ssb_disable(task);
set_tsk_thread_flag(task, TIF_SSBD);
}
static void ssbd_prctl_disable_mitigation(struct task_struct *task)
{
task_clear_spec_ssb_noexec(task);
task_clear_spec_ssb_disable(task);
clear_tsk_thread_flag(task, TIF_SSBD);
}
static int ssbd_prctl_set(struct task_struct *task, unsigned long ctrl)
{
switch (ctrl) {
case PR_SPEC_ENABLE:
/* Enable speculation: disable mitigation */
/*
* Force disabled speculation prevents it from being
* re-enabled.
*/
if (task_spec_ssb_force_disable(task))
return -EPERM;
/*
* If the mitigation is forced on, then speculation is forced
* off and we again prevent it from being re-enabled.
*/
if (spectre_v4_mitigations_on())
return -EPERM;
ssbd_prctl_disable_mitigation(task);
break;
case PR_SPEC_FORCE_DISABLE:
/* Force disable speculation: force enable mitigation */
/*
* If the mitigation is forced off, then speculation is forced
* on and we prevent it from being disabled.
*/
if (spectre_v4_mitigations_off())
return -EPERM;
task_set_spec_ssb_force_disable(task);
fallthrough;
case PR_SPEC_DISABLE:
/* Disable speculation: enable mitigation */
/* Same as PR_SPEC_FORCE_DISABLE */
if (spectre_v4_mitigations_off())
return -EPERM;
ssbd_prctl_enable_mitigation(task);
break;
case PR_SPEC_DISABLE_NOEXEC:
/* Disable speculation until execve(): enable mitigation */
/*
* If the mitigation state is forced one way or the other, then
* we must fail now before we try to toggle it on execve().
*/
if (task_spec_ssb_force_disable(task) ||
spectre_v4_mitigations_off() ||
spectre_v4_mitigations_on()) {
return -EPERM;
}
ssbd_prctl_enable_mitigation(task);
task_set_spec_ssb_noexec(task);
break;
default:
return -ERANGE;
}
spectre_v4_enable_task_mitigation(task);
return 0;
}
int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which,
unsigned long ctrl)
{
switch (which) {
case PR_SPEC_STORE_BYPASS:
return ssbd_prctl_set(task, ctrl);
default:
return -ENODEV;
}
}
static int ssbd_prctl_get(struct task_struct *task)
{
switch (spectre_v4_state) {
case SPECTRE_UNAFFECTED:
return PR_SPEC_NOT_AFFECTED;
case SPECTRE_MITIGATED:
if (spectre_v4_mitigations_on())
return PR_SPEC_NOT_AFFECTED;
if (spectre_v4_mitigations_dynamic())
break;
/* Mitigations are disabled, so we're vulnerable. */
fallthrough;
case SPECTRE_VULNERABLE:
fallthrough;
default:
return PR_SPEC_ENABLE;
}
/* Check the mitigation state for this task */
if (task_spec_ssb_force_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
if (task_spec_ssb_noexec(task))
return PR_SPEC_PRCTL | PR_SPEC_DISABLE_NOEXEC;
if (task_spec_ssb_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
}
int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
{
switch (which) {
case PR_SPEC_STORE_BYPASS:
return ssbd_prctl_get(task);
default:
return -ENODEV;
}
}

View file

@ -34,6 +34,7 @@
#include <asm/cpufeature.h>
#include <asm/debug-monitors.h>
#include <asm/fpsimd.h>
#include <asm/mte.h>
#include <asm/pointer_auth.h>
#include <asm/stacktrace.h>
#include <asm/syscall.h>
@ -1032,6 +1033,35 @@ static int pac_generic_keys_set(struct task_struct *target,
#endif /* CONFIG_CHECKPOINT_RESTORE */
#endif /* CONFIG_ARM64_PTR_AUTH */
#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
static int tagged_addr_ctrl_get(struct task_struct *target,
const struct user_regset *regset,
struct membuf to)
{
long ctrl = get_tagged_addr_ctrl(target);
if (IS_ERR_VALUE(ctrl))
return ctrl;
return membuf_write(&to, &ctrl, sizeof(ctrl));
}
static int tagged_addr_ctrl_set(struct task_struct *target, const struct
user_regset *regset, unsigned int pos,
unsigned int count, const void *kbuf, const
void __user *ubuf)
{
int ret;
long ctrl;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &ctrl, 0, -1);
if (ret)
return ret;
return set_tagged_addr_ctrl(target, ctrl);
}
#endif
enum aarch64_regset {
REGSET_GPR,
REGSET_FPR,
@ -1051,6 +1081,9 @@ enum aarch64_regset {
REGSET_PACG_KEYS,
#endif
#endif
#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
REGSET_TAGGED_ADDR_CTRL,
#endif
};
static const struct user_regset aarch64_regsets[] = {
@ -1148,6 +1181,16 @@ static const struct user_regset aarch64_regsets[] = {
},
#endif
#endif
#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
[REGSET_TAGGED_ADDR_CTRL] = {
.core_note_type = NT_ARM_TAGGED_ADDR_CTRL,
.n = 1,
.size = sizeof(long),
.align = sizeof(long),
.regset_get = tagged_addr_ctrl_get,
.set = tagged_addr_ctrl_set,
},
#endif
};
static const struct user_regset_view user_aarch64_view = {
@ -1691,6 +1734,12 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task)
long arch_ptrace(struct task_struct *child, long request,
unsigned long addr, unsigned long data)
{
switch (request) {
case PTRACE_PEEKMTETAGS:
case PTRACE_POKEMTETAGS:
return mte_ptrace_copy_tags(child, request, addr, data);
}
return ptrace_request(child, request, addr, data);
}
@ -1793,7 +1842,7 @@ void syscall_trace_exit(struct pt_regs *regs)
* We also reserve IL for the kernel; SS is handled dynamically.
*/
#define SPSR_EL1_AARCH64_RES0_BITS \
(GENMASK_ULL(63, 32) | GENMASK_ULL(27, 25) | GENMASK_ULL(23, 22) | \
(GENMASK_ULL(63, 32) | GENMASK_ULL(27, 26) | GENMASK_ULL(23, 22) | \
GENMASK_ULL(20, 13) | GENMASK_ULL(5, 5))
#define SPSR_EL1_AARCH32_RES0_BITS \
(GENMASK_ULL(63, 32) | GENMASK_ULL(22, 22) | GENMASK_ULL(20, 20))

View file

@ -36,18 +36,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
mov x14, xzr /* x14 = entry ptr */
mov x13, xzr /* x13 = copy dest */
/* Clear the sctlr_el2 flags. */
mrs x0, CurrentEL
cmp x0, #CurrentEL_EL2
b.ne 1f
mrs x0, sctlr_el2
mov_q x1, SCTLR_ELx_FLAGS
bic x0, x0, x1
pre_disable_mmu_workaround
msr sctlr_el2, x0
isb
1:
/* Check if the new image needs relocation. */
tbnz x16, IND_DONE_BIT, .Ldone

View file

@ -18,16 +18,16 @@ struct return_address_data {
void *addr;
};
static int save_return_addr(struct stackframe *frame, void *d)
static bool save_return_addr(void *d, unsigned long pc)
{
struct return_address_data *data = d;
if (!data->level) {
data->addr = (void *)frame->pc;
return 1;
data->addr = (void *)pc;
return false;
} else {
--data->level;
return 0;
return true;
}
}
NOKPROBE_SYMBOL(save_return_addr);

View file

@ -244,7 +244,8 @@ static int preserve_sve_context(struct sve_context __user *ctx)
if (vq) {
/*
* This assumes that the SVE state has already been saved to
* the task struct by calling preserve_fpsimd_context().
* the task struct by calling the function
* fpsimd_signal_preserve_current_state().
*/
err |= __copy_to_user((char __user *)ctx + SVE_SIG_REGS_OFFSET,
current->thread.sve_state,
@ -748,6 +749,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
regs->pstate |= PSR_BTYPE_C;
}
/* TCO (Tag Check Override) always cleared for signal handlers */
regs->pstate &= ~PSR_TCO_BIT;
if (ka->sa.sa_flags & SA_RESTORER)
sigtramp = ka->sa.sa_restorer;
else
@ -932,6 +936,12 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
if (thread_flags & _TIF_UPROBE)
uprobe_notify_resume(regs);
if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
clear_thread_flag(TIF_MTE_ASYNC_FAULT);
send_sig_fault(SIGSEGV, SEGV_MTEAERR,
(void __user *)NULL, current);
}
if (thread_flags & _TIF_SIGPENDING)
do_signal(regs);

View file

@ -83,9 +83,9 @@ static int smp_spin_table_cpu_prepare(unsigned int cpu)
/*
* We write the release address as LE regardless of the native
* endianess of the kernel. Therefore, any boot-loaders that
* endianness of the kernel. Therefore, any boot-loaders that
* read this address need to convert this address to the
* boot-loader's endianess before jumping. This is mandated by
* boot-loader's endianness before jumping. This is mandated by
* the boot protocol.
*/
writeq_relaxed(__pa_symbol(secondary_holding_pen), release_addr);

View file

@ -1,129 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (C) 2018 ARM Ltd, All Rights Reserved.
*/
#include <linux/compat.h>
#include <linux/errno.h>
#include <linux/prctl.h>
#include <linux/sched.h>
#include <linux/sched/task_stack.h>
#include <linux/thread_info.h>
#include <asm/cpufeature.h>
static void ssbd_ssbs_enable(struct task_struct *task)
{
u64 val = is_compat_thread(task_thread_info(task)) ?
PSR_AA32_SSBS_BIT : PSR_SSBS_BIT;
task_pt_regs(task)->pstate |= val;
}
static void ssbd_ssbs_disable(struct task_struct *task)
{
u64 val = is_compat_thread(task_thread_info(task)) ?
PSR_AA32_SSBS_BIT : PSR_SSBS_BIT;
task_pt_regs(task)->pstate &= ~val;
}
/*
* prctl interface for SSBD
*/
static int ssbd_prctl_set(struct task_struct *task, unsigned long ctrl)
{
int state = arm64_get_ssbd_state();
/* Unsupported */
if (state == ARM64_SSBD_UNKNOWN)
return -ENODEV;
/* Treat the unaffected/mitigated state separately */
if (state == ARM64_SSBD_MITIGATED) {
switch (ctrl) {
case PR_SPEC_ENABLE:
return -EPERM;
case PR_SPEC_DISABLE:
case PR_SPEC_FORCE_DISABLE:
return 0;
}
}
/*
* Things are a bit backward here: the arm64 internal API
* *enables the mitigation* when the userspace API *disables
* speculation*. So much fun.
*/
switch (ctrl) {
case PR_SPEC_ENABLE:
/* If speculation is force disabled, enable is not allowed */
if (state == ARM64_SSBD_FORCE_ENABLE ||
task_spec_ssb_force_disable(task))
return -EPERM;
task_clear_spec_ssb_disable(task);
clear_tsk_thread_flag(task, TIF_SSBD);
ssbd_ssbs_enable(task);
break;
case PR_SPEC_DISABLE:
if (state == ARM64_SSBD_FORCE_DISABLE)
return -EPERM;
task_set_spec_ssb_disable(task);
set_tsk_thread_flag(task, TIF_SSBD);
ssbd_ssbs_disable(task);
break;
case PR_SPEC_FORCE_DISABLE:
if (state == ARM64_SSBD_FORCE_DISABLE)
return -EPERM;
task_set_spec_ssb_disable(task);
task_set_spec_ssb_force_disable(task);
set_tsk_thread_flag(task, TIF_SSBD);
ssbd_ssbs_disable(task);
break;
default:
return -ERANGE;
}
return 0;
}
int arch_prctl_spec_ctrl_set(struct task_struct *task, unsigned long which,
unsigned long ctrl)
{
switch (which) {
case PR_SPEC_STORE_BYPASS:
return ssbd_prctl_set(task, ctrl);
default:
return -ENODEV;
}
}
static int ssbd_prctl_get(struct task_struct *task)
{
switch (arm64_get_ssbd_state()) {
case ARM64_SSBD_UNKNOWN:
return -ENODEV;
case ARM64_SSBD_FORCE_ENABLE:
return PR_SPEC_DISABLE;
case ARM64_SSBD_KERNEL:
if (task_spec_ssb_force_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
if (task_spec_ssb_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
case ARM64_SSBD_FORCE_DISABLE:
return PR_SPEC_ENABLE;
default:
return PR_SPEC_NOT_AFFECTED;
}
}
int arch_prctl_spec_ctrl_get(struct task_struct *task, unsigned long which)
{
switch (which) {
case PR_SPEC_STORE_BYPASS:
return ssbd_prctl_get(task);
default:
return -ENODEV;
}
}

View file

@ -118,12 +118,12 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
NOKPROBE_SYMBOL(unwind_frame);
void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
int (*fn)(struct stackframe *, void *), void *data)
bool (*fn)(void *, unsigned long), void *data)
{
while (1) {
int ret;
if (fn(frame, data))
if (!fn(data, frame->pc))
break;
ret = unwind_frame(tsk, frame);
if (ret < 0)
@ -132,84 +132,89 @@ void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
}
NOKPROBE_SYMBOL(walk_stackframe);
#ifdef CONFIG_STACKTRACE
struct stack_trace_data {
struct stack_trace *trace;
unsigned int no_sched_functions;
unsigned int skip;
};
static int save_trace(struct stackframe *frame, void *d)
static void dump_backtrace_entry(unsigned long where, const char *loglvl)
{
struct stack_trace_data *data = d;
struct stack_trace *trace = data->trace;
unsigned long addr = frame->pc;
printk("%s %pS\n", loglvl, (void *)where);
}
if (data->no_sched_functions && in_sched_functions(addr))
return 0;
if (data->skip) {
data->skip--;
return 0;
void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
const char *loglvl)
{
struct stackframe frame;
int skip = 0;
pr_debug("%s(regs = %p tsk = %p)\n", __func__, regs, tsk);
if (regs) {
if (user_mode(regs))
return;
skip = 1;
}
trace->entries[trace->nr_entries++] = addr;
return trace->nr_entries >= trace->max_entries;
}
void save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
{
struct stack_trace_data data;
struct stackframe frame;
data.trace = trace;
data.skip = trace->skip;
data.no_sched_functions = 0;
start_backtrace(&frame, regs->regs[29], regs->pc);
walk_stackframe(current, &frame, save_trace, &data);
}
EXPORT_SYMBOL_GPL(save_stack_trace_regs);
static noinline void __save_stack_trace(struct task_struct *tsk,
struct stack_trace *trace, unsigned int nosched)
{
struct stack_trace_data data;
struct stackframe frame;
if (!tsk)
tsk = current;
if (!try_get_task_stack(tsk))
return;
data.trace = trace;
data.skip = trace->skip;
data.no_sched_functions = nosched;
if (tsk != current) {
start_backtrace(&frame, thread_saved_fp(tsk),
thread_saved_pc(tsk));
} else {
/* We don't want this function nor the caller */
data.skip += 2;
if (tsk == current) {
start_backtrace(&frame,
(unsigned long)__builtin_frame_address(0),
(unsigned long)__save_stack_trace);
(unsigned long)dump_backtrace);
} else {
/*
* task blocked in __switch_to
*/
start_backtrace(&frame,
thread_saved_fp(tsk),
thread_saved_pc(tsk));
}
walk_stackframe(tsk, &frame, save_trace, &data);
printk("%sCall trace:\n", loglvl);
do {
/* skip until specified stack frame */
if (!skip) {
dump_backtrace_entry(frame.pc, loglvl);
} else if (frame.fp == regs->regs[29]) {
skip = 0;
/*
* Mostly, this is the case where this function is
* called in panic/abort. As exception handler's
* stack frame does not contain the corresponding pc
* at which an exception has taken place, use regs->pc
* instead.
*/
dump_backtrace_entry(regs->pc, loglvl);
}
} while (!unwind_frame(tsk, &frame));
put_task_stack(tsk);
}
void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
void show_stack(struct task_struct *tsk, unsigned long *sp, const char *loglvl)
{
__save_stack_trace(tsk, trace, 1);
}
EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
void save_stack_trace(struct stack_trace *trace)
{
__save_stack_trace(current, trace, 0);
dump_backtrace(NULL, tsk, loglvl);
barrier();
}
#ifdef CONFIG_STACKTRACE
void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
struct task_struct *task, struct pt_regs *regs)
{
struct stackframe frame;
if (regs)
start_backtrace(&frame, regs->regs[29], regs->pc);
else if (task == current)
start_backtrace(&frame,
(unsigned long)__builtin_frame_address(0),
(unsigned long)arch_stack_walk);
else
start_backtrace(&frame, thread_saved_fp(task),
thread_saved_pc(task));
walk_stackframe(task, &frame, consume_entry, cookie);
}
EXPORT_SYMBOL_GPL(save_stack_trace);
#endif

View file

@ -10,6 +10,7 @@
#include <asm/daifflags.h>
#include <asm/debug-monitors.h>
#include <asm/exec.h>
#include <asm/mte.h>
#include <asm/memory.h>
#include <asm/mmu_context.h>
#include <asm/smp_plat.h>
@ -72,8 +73,10 @@ void notrace __cpu_suspend_exit(void)
* have turned the mitigation on. If the user has forcefully
* disabled it, make sure their wishes are obeyed.
*/
if (arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE)
arm64_set_ssbd_mitigation(false);
spectre_v4_enable_mitigation(NULL);
/* Restore additional MTE-specific configuration */
mte_suspend_exit();
}
/*

View file

@ -123,6 +123,16 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
local_daif_restore(DAIF_PROCCTX);
user_exit();
if (system_supports_mte() && (flags & _TIF_MTE_ASYNC_FAULT)) {
/*
* Process the asynchronous tag check fault before the actual
* syscall. do_notify_resume() will send a signal to userspace
* before the syscall is restarted.
*/
regs->regs[0] = -ERESTARTNOINTR;
return;
}
if (has_syscall_work(flags)) {
/*
* The de-facto standard way to skip a system call using ptrace

View file

@ -36,21 +36,23 @@ void store_cpu_topology(unsigned int cpuid)
if (mpidr & MPIDR_UP_BITMASK)
return;
/* Create cpu topology mapping based on MPIDR. */
if (mpidr & MPIDR_MT_BITMASK) {
/* Multiprocessor system : Multi-threads per core */
cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2) |
MPIDR_AFFINITY_LEVEL(mpidr, 3) << 8;
} else {
/* Multiprocessor system : Single-thread per core */
cpuid_topo->thread_id = -1;
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1) |
MPIDR_AFFINITY_LEVEL(mpidr, 2) << 8 |
MPIDR_AFFINITY_LEVEL(mpidr, 3) << 16;
}
/*
* This would be the place to create cpu topology based on MPIDR.
*
* However, it cannot be trusted to depict the actual topology; some
* pieces of the architecture enforce an artificial cap on Aff0 values
* (e.g. GICv3's ICC_SGI1R_EL1 limits it to 15), leading to an
* artificial cycling of Aff1, Aff2 and Aff3 values. IOW, these end up
* having absolutely no relationship to the actual underlying system
* topology, and cannot be reasonably used as core / package ID.
*
* If the MT bit is set, Aff0 *could* be used to define a thread ID, but
* we still wouldn't be able to obtain a sane core ID. This means we
* need to entirely ignore MPIDR for any topology deduction.
*/
cpuid_topo->thread_id = -1;
cpuid_topo->core_id = cpuid;
cpuid_topo->package_id = cpu_to_node(cpuid);
pr_debug("CPU%u: cluster %d core %d thread %d mpidr %#016llx\n",
cpuid, cpuid_topo->package_id, cpuid_topo->core_id,

View file

@ -34,6 +34,7 @@
#include <asm/daifflags.h>
#include <asm/debug-monitors.h>
#include <asm/esr.h>
#include <asm/extable.h>
#include <asm/insn.h>
#include <asm/kprobes.h>
#include <asm/traps.h>
@ -53,11 +54,6 @@ static const char *handler[]= {
int show_unhandled_signals = 0;
static void dump_backtrace_entry(unsigned long where, const char *loglvl)
{
printk("%s %pS\n", loglvl, (void *)where);
}
static void dump_kernel_instr(const char *lvl, struct pt_regs *regs)
{
unsigned long addr = instruction_pointer(regs);
@ -83,66 +79,6 @@ static void dump_kernel_instr(const char *lvl, struct pt_regs *regs)
printk("%sCode: %s\n", lvl, str);
}
void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
const char *loglvl)
{
struct stackframe frame;
int skip = 0;
pr_debug("%s(regs = %p tsk = %p)\n", __func__, regs, tsk);
if (regs) {
if (user_mode(regs))
return;
skip = 1;
}
if (!tsk)
tsk = current;
if (!try_get_task_stack(tsk))
return;
if (tsk == current) {
start_backtrace(&frame,
(unsigned long)__builtin_frame_address(0),
(unsigned long)dump_backtrace);
} else {
/*
* task blocked in __switch_to
*/
start_backtrace(&frame,
thread_saved_fp(tsk),
thread_saved_pc(tsk));
}
printk("%sCall trace:\n", loglvl);
do {
/* skip until specified stack frame */
if (!skip) {
dump_backtrace_entry(frame.pc, loglvl);
} else if (frame.fp == regs->regs[29]) {
skip = 0;
/*
* Mostly, this is the case where this function is
* called in panic/abort. As exception handler's
* stack frame does not contain the corresponding pc
* at which an exception has taken place, use regs->pc
* instead.
*/
dump_backtrace_entry(regs->pc, loglvl);
}
} while (!unwind_frame(tsk, &frame));
put_task_stack(tsk);
}
void show_stack(struct task_struct *tsk, unsigned long *sp, const char *loglvl)
{
dump_backtrace(NULL, tsk, loglvl);
barrier();
}
#ifdef CONFIG_PREEMPT
#define S_PREEMPT " PREEMPT"
#elif defined(CONFIG_PREEMPT_RT)
@ -200,9 +136,9 @@ void die(const char *str, struct pt_regs *regs, int err)
oops_exit();
if (in_interrupt())
panic("Fatal exception in interrupt");
panic("%s: Fatal exception in interrupt", str);
if (panic_on_oops)
panic("Fatal exception");
panic("%s: Fatal exception", str);
raw_spin_unlock_irqrestore(&die_lock, flags);
@ -412,7 +348,7 @@ static int call_undef_hook(struct pt_regs *regs)
return fn ? fn(regs, instr) : 1;
}
void force_signal_inject(int signal, int code, unsigned long address)
void force_signal_inject(int signal, int code, unsigned long address, unsigned int err)
{
const char *desc;
struct pt_regs *regs = current_pt_regs();
@ -438,7 +374,7 @@ void force_signal_inject(int signal, int code, unsigned long address)
signal = SIGKILL;
}
arm64_notify_die(desc, regs, signal, code, (void __user *)address, 0);
arm64_notify_die(desc, regs, signal, code, (void __user *)address, err);
}
/*
@ -455,7 +391,7 @@ void arm64_notify_segfault(unsigned long addr)
code = SEGV_ACCERR;
mmap_read_unlock(current->mm);
force_signal_inject(SIGSEGV, code, addr);
force_signal_inject(SIGSEGV, code, addr, 0);
}
void do_undefinstr(struct pt_regs *regs)
@ -468,17 +404,28 @@ void do_undefinstr(struct pt_regs *regs)
return;
BUG_ON(!user_mode(regs));
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc);
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
}
NOKPROBE_SYMBOL(do_undefinstr);
void do_bti(struct pt_regs *regs)
{
BUG_ON(!user_mode(regs));
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc);
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
}
NOKPROBE_SYMBOL(do_bti);
void do_ptrauth_fault(struct pt_regs *regs, unsigned int esr)
{
/*
* Unexpected FPAC exception or pointer authentication failure in
* the kernel: kill the task before it does any more harm.
*/
BUG_ON(!user_mode(regs));
force_signal_inject(SIGILL, ILL_ILLOPN, regs->pc, esr);
}
NOKPROBE_SYMBOL(do_ptrauth_fault);
#define __user_cache_maint(insn, address, res) \
if (address >= user_addr_max()) { \
res = -EFAULT; \
@ -528,7 +475,7 @@ static void user_cache_maint_handler(unsigned int esr, struct pt_regs *regs)
__user_cache_maint("ic ivau", address, ret);
break;
default:
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc);
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
return;
}
@ -581,7 +528,7 @@ static void mrs_handler(unsigned int esr, struct pt_regs *regs)
sysreg = esr_sys64_to_sysreg(esr);
if (do_emulate_mrs(regs, sysreg, rt) != 0)
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc);
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
}
static void wfi_handler(unsigned int esr, struct pt_regs *regs)
@ -775,6 +722,7 @@ static const char *esr_class_str[] = {
[ESR_ELx_EC_SYS64] = "MSR/MRS (AArch64)",
[ESR_ELx_EC_SVE] = "SVE",
[ESR_ELx_EC_ERET] = "ERET/ERETAA/ERETAB",
[ESR_ELx_EC_FPAC] = "FPAC",
[ESR_ELx_EC_IMP_DEF] = "EL3 IMP DEF",
[ESR_ELx_EC_IABT_LOW] = "IABT (lower EL)",
[ESR_ELx_EC_IABT_CUR] = "IABT (current EL)",
@ -935,26 +883,6 @@ asmlinkage void enter_from_user_mode(void)
}
NOKPROBE_SYMBOL(enter_from_user_mode);
void __pte_error(const char *file, int line, unsigned long val)
{
pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
}
void __pmd_error(const char *file, int line, unsigned long val)
{
pr_err("%s:%d: bad pmd %016lx.\n", file, line, val);
}
void __pud_error(const char *file, int line, unsigned long val)
{
pr_err("%s:%d: bad pud %016lx.\n", file, line, val);
}
void __pgd_error(const char *file, int line, unsigned long val)
{
pr_err("%s:%d: bad pgd %016lx.\n", file, line, val);
}
/* GENERIC_BUG traps */
int is_valid_bugaddr(unsigned long addr)
@ -994,6 +922,21 @@ static struct break_hook bug_break_hook = {
.imm = BUG_BRK_IMM,
};
static int reserved_fault_handler(struct pt_regs *regs, unsigned int esr)
{
pr_err("%s generated an invalid instruction at %pS!\n",
in_bpf_jit(regs) ? "BPF JIT" : "Kernel text patching",
(void *)instruction_pointer(regs));
/* We cannot handle this */
return DBG_HOOK_ERROR;
}
static struct break_hook fault_break_hook = {
.fn = reserved_fault_handler,
.imm = FAULT_BRK_IMM,
};
#ifdef CONFIG_KASAN_SW_TAGS
#define KASAN_ESR_RECOVER 0x20
@ -1059,6 +1002,7 @@ int __init early_brk64(unsigned long addr, unsigned int esr,
void __init trap_init(void)
{
register_kernel_break_hook(&bug_break_hook);
register_kernel_break_hook(&fault_break_hook);
#ifdef CONFIG_KASAN_SW_TAGS
register_kernel_break_hook(&kasan_break_hook);
#endif

View file

@ -30,15 +30,11 @@
#include <asm/vdso.h>
extern char vdso_start[], vdso_end[];
#ifdef CONFIG_COMPAT_VDSO
extern char vdso32_start[], vdso32_end[];
#endif /* CONFIG_COMPAT_VDSO */
enum vdso_abi {
VDSO_ABI_AA64,
#ifdef CONFIG_COMPAT_VDSO
VDSO_ABI_AA32,
#endif /* CONFIG_COMPAT_VDSO */
};
enum vvar_pages {
@ -284,21 +280,17 @@ static int __setup_additional_pages(enum vdso_abi abi,
/*
* Create and map the vectors page for AArch32 tasks.
*/
#ifdef CONFIG_COMPAT_VDSO
static int aarch32_vdso_mremap(const struct vm_special_mapping *sm,
struct vm_area_struct *new_vma)
{
return __vdso_remap(VDSO_ABI_AA32, sm, new_vma);
}
#endif /* CONFIG_COMPAT_VDSO */
enum aarch32_map {
AA32_MAP_VECTORS, /* kuser helpers */
#ifdef CONFIG_COMPAT_VDSO
AA32_MAP_SIGPAGE,
AA32_MAP_VVAR,
AA32_MAP_VDSO,
#endif
AA32_MAP_SIGPAGE
};
static struct page *aarch32_vectors_page __ro_after_init;
@ -309,7 +301,10 @@ static struct vm_special_mapping aarch32_vdso_maps[] = {
.name = "[vectors]", /* ABI */
.pages = &aarch32_vectors_page,
},
#ifdef CONFIG_COMPAT_VDSO
[AA32_MAP_SIGPAGE] = {
.name = "[sigpage]", /* ABI */
.pages = &aarch32_sig_page,
},
[AA32_MAP_VVAR] = {
.name = "[vvar]",
.fault = vvar_fault,
@ -319,11 +314,6 @@ static struct vm_special_mapping aarch32_vdso_maps[] = {
.name = "[vdso]",
.mremap = aarch32_vdso_mremap,
},
#endif /* CONFIG_COMPAT_VDSO */
[AA32_MAP_SIGPAGE] = {
.name = "[sigpage]", /* ABI */
.pages = &aarch32_sig_page,
},
};
static int aarch32_alloc_kuser_vdso_page(void)
@ -362,25 +352,25 @@ static int aarch32_alloc_sigpage(void)
return 0;
}
#ifdef CONFIG_COMPAT_VDSO
static int __aarch32_alloc_vdso_pages(void)
{
if (!IS_ENABLED(CONFIG_COMPAT_VDSO))
return 0;
vdso_info[VDSO_ABI_AA32].dm = &aarch32_vdso_maps[AA32_MAP_VVAR];
vdso_info[VDSO_ABI_AA32].cm = &aarch32_vdso_maps[AA32_MAP_VDSO];
return __vdso_init(VDSO_ABI_AA32);
}
#endif /* CONFIG_COMPAT_VDSO */
static int __init aarch32_alloc_vdso_pages(void)
{
int ret;
#ifdef CONFIG_COMPAT_VDSO
ret = __aarch32_alloc_vdso_pages();
if (ret)
return ret;
#endif
ret = aarch32_alloc_sigpage();
if (ret)
@ -449,14 +439,12 @@ int aarch32_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
if (ret)
goto out;
#ifdef CONFIG_COMPAT_VDSO
ret = __setup_additional_pages(VDSO_ABI_AA32,
mm,
bprm,
uses_interp);
if (ret)
goto out;
#endif /* CONFIG_COMPAT_VDSO */
if (IS_ENABLED(CONFIG_COMPAT_VDSO)) {
ret = __setup_additional_pages(VDSO_ABI_AA32, mm, bprm,
uses_interp);
if (ret)
goto out;
}
ret = aarch32_sigreturn_setup(mm);
out:
@ -497,8 +485,7 @@ static int __init vdso_init(void)
}
arch_initcall(vdso_init);
int arch_setup_additional_pages(struct linux_binprm *bprm,
int uses_interp)
int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
{
struct mm_struct *mm = current->mm;
int ret;
@ -506,11 +493,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
if (mmap_write_lock_killable(mm))
return -EINTR;
ret = __setup_additional_pages(VDSO_ABI_AA64,
mm,
bprm,
uses_interp);
ret = __setup_additional_pages(VDSO_ABI_AA64, mm, bprm, uses_interp);
mmap_write_unlock(mm);
return ret;

View file

@ -105,7 +105,7 @@ SECTIONS
*(.eh_frame)
}
. = KIMAGE_VADDR + TEXT_OFFSET;
. = KIMAGE_VADDR;
.head.text : {
_text = .;
@ -274,4 +274,4 @@ ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
/*
* If padding is applied before .head.text, virt<->phys conversions will fail.
*/
ASSERT(_text == (KIMAGE_VADDR + TEXT_OFFSET), "HEAD is misaligned")
ASSERT(_text == KIMAGE_VADDR, "HEAD is misaligned")

View file

@ -57,9 +57,6 @@ config KVM_ARM_PMU
Adds support for a virtual Performance Monitoring Unit (PMU) in
virtual machines.
config KVM_INDIRECT_VECTORS
def_bool HARDEN_BRANCH_PREDICTOR || RANDOMIZE_BASE
endif # KVM
endif # VIRTUALIZATION

View file

@ -1259,6 +1259,40 @@ long kvm_arch_vm_ioctl(struct file *filp,
}
}
static int kvm_map_vectors(void)
{
/*
* SV2 = ARM64_SPECTRE_V2
* HEL2 = ARM64_HARDEN_EL2_VECTORS
*
* !SV2 + !HEL2 -> use direct vectors
* SV2 + !HEL2 -> use hardened vectors in place
* !SV2 + HEL2 -> allocate one vector slot and use exec mapping
* SV2 + HEL2 -> use hardened vectors and use exec mapping
*/
if (cpus_have_const_cap(ARM64_SPECTRE_V2)) {
__kvm_bp_vect_base = kvm_ksym_ref(__bp_harden_hyp_vecs);
__kvm_bp_vect_base = kern_hyp_va(__kvm_bp_vect_base);
}
if (cpus_have_const_cap(ARM64_HARDEN_EL2_VECTORS)) {
phys_addr_t vect_pa = __pa_symbol(__bp_harden_hyp_vecs);
unsigned long size = __BP_HARDEN_HYP_VECS_SZ;
/*
* Always allocate a spare vector slot, as we don't
* know yet which CPUs have a BP hardening slot that
* we can reuse.
*/
__kvm_harden_el2_vector_slot = atomic_inc_return(&arm64_el2_vector_last_slot);
BUG_ON(__kvm_harden_el2_vector_slot >= BP_HARDEN_EL2_SLOTS);
return create_hyp_exec_mappings(vect_pa, size,
&__kvm_bp_vect_base);
}
return 0;
}
static void cpu_init_hyp_mode(void)
{
phys_addr_t pgd_ptr;
@ -1295,7 +1329,7 @@ static void cpu_init_hyp_mode(void)
* at EL2.
*/
if (this_cpu_has_cap(ARM64_SSBS) &&
arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE) {
arm64_get_spectre_v4_state() == SPECTRE_VULNERABLE) {
kvm_call_hyp_nvhe(__kvm_enable_ssbs);
}
}
@ -1552,10 +1586,6 @@ static int init_hyp_mode(void)
}
}
err = hyp_map_aux_data();
if (err)
kvm_err("Cannot map host auxiliary data: %d\n", err);
return 0;
out_err:

View file

@ -10,5 +10,4 @@ subdir-ccflags-y := -I$(incdir) \
-DDISABLE_BRANCH_PROFILING \
$(DISABLE_STACKLEAK_PLUGIN)
obj-$(CONFIG_KVM) += vhe/ nvhe/
obj-$(CONFIG_KVM_INDIRECT_VECTORS) += smccc_wa.o
obj-$(CONFIG_KVM) += vhe/ nvhe/ smccc_wa.o

View file

@ -116,35 +116,6 @@ el1_hvc_guest:
ARM_SMCCC_ARCH_WORKAROUND_2)
cbnz w1, el1_trap
#ifdef CONFIG_ARM64_SSBD
alternative_cb arm64_enable_wa2_handling
b wa2_end
alternative_cb_end
get_vcpu_ptr x2, x0
ldr x0, [x2, #VCPU_WORKAROUND_FLAGS]
// Sanitize the argument and update the guest flags
ldr x1, [sp, #8] // Guest's x1
clz w1, w1 // Murphy's device:
lsr w1, w1, #5 // w1 = !!w1 without using
eor w1, w1, #1 // the flags...
bfi x0, x1, #VCPU_WORKAROUND_2_FLAG_SHIFT, #1
str x0, [x2, #VCPU_WORKAROUND_FLAGS]
/* Check that we actually need to perform the call */
hyp_ldr_this_cpu x0, arm64_ssbd_callback_required, x2
cbz x0, wa2_end
mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
smc #0
/* Don't leak data from the SMC call */
mov x3, xzr
wa2_end:
mov x2, xzr
mov x1, xzr
#endif
wa_epilogue:
mov x0, xzr
add sp, sp, #16
@ -288,7 +259,6 @@ SYM_CODE_START(__kvm_hyp_vector)
valid_vect el1_error // Error 32-bit EL1
SYM_CODE_END(__kvm_hyp_vector)
#ifdef CONFIG_KVM_INDIRECT_VECTORS
.macro hyp_ventry
.align 7
1: esb
@ -338,4 +308,3 @@ SYM_CODE_START(__bp_harden_hyp_vecs)
1: .org __bp_harden_hyp_vecs + __BP_HARDEN_HYP_VECS_SZ
.org 1b
SYM_CODE_END(__bp_harden_hyp_vecs)
#endif

View file

@ -479,39 +479,6 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
return false;
}
static inline bool __needs_ssbd_off(struct kvm_vcpu *vcpu)
{
if (!cpus_have_final_cap(ARM64_SSBD))
return false;
return !(vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG);
}
static inline void __set_guest_arch_workaround_state(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_ARM64_SSBD
/*
* The host runs with the workaround always present. If the
* guest wants it disabled, so be it...
*/
if (__needs_ssbd_off(vcpu) &&
__hyp_this_cpu_read(arm64_ssbd_callback_required))
arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, 0, NULL);
#endif
}
static inline void __set_host_arch_workaround_state(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_ARM64_SSBD
/*
* If the guest has disabled the workaround, bring it back on.
*/
if (__needs_ssbd_off(vcpu) &&
__hyp_this_cpu_read(arm64_ssbd_callback_required))
arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, 1, NULL);
#endif
}
static inline void __kvm_unexpected_el2_exception(void)
{
unsigned long addr, fixup;

View file

@ -202,8 +202,6 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
__debug_switch_to_guest(vcpu);
__set_guest_arch_workaround_state(vcpu);
do {
/* Jump in the fire! */
exit_code = __guest_enter(vcpu, host_ctxt);
@ -211,8 +209,6 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
/* And we're baaack! */
} while (fixup_guest_exit(vcpu, &exit_code));
__set_host_arch_workaround_state(vcpu);
__sysreg_save_state_nvhe(guest_ctxt);
__sysreg32_save_state(vcpu);
__timer_disable_traps(vcpu);

View file

@ -131,8 +131,6 @@ static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
sysreg_restore_guest_state_vhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
__set_guest_arch_workaround_state(vcpu);
do {
/* Jump in the fire! */
exit_code = __guest_enter(vcpu, host_ctxt);
@ -140,8 +138,6 @@ static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
/* And we're baaack! */
} while (fixup_guest_exit(vcpu, &exit_code));
__set_host_arch_workaround_state(vcpu);
sysreg_save_guest_state_vhe(guest_ctxt);
__deactivate_traps(vcpu);

View file

@ -24,27 +24,36 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
feature = smccc_get_arg1(vcpu);
switch (feature) {
case ARM_SMCCC_ARCH_WORKAROUND_1:
switch (kvm_arm_harden_branch_predictor()) {
case KVM_BP_HARDEN_UNKNOWN:
switch (arm64_get_spectre_v2_state()) {
case SPECTRE_VULNERABLE:
break;
case KVM_BP_HARDEN_WA_NEEDED:
case SPECTRE_MITIGATED:
val = SMCCC_RET_SUCCESS;
break;
case KVM_BP_HARDEN_NOT_REQUIRED:
case SPECTRE_UNAFFECTED:
val = SMCCC_RET_NOT_REQUIRED;
break;
}
break;
case ARM_SMCCC_ARCH_WORKAROUND_2:
switch (kvm_arm_have_ssbd()) {
case KVM_SSBD_FORCE_DISABLE:
case KVM_SSBD_UNKNOWN:
switch (arm64_get_spectre_v4_state()) {
case SPECTRE_VULNERABLE:
break;
case KVM_SSBD_KERNEL:
val = SMCCC_RET_SUCCESS;
break;
case KVM_SSBD_FORCE_ENABLE:
case KVM_SSBD_MITIGATED:
case SPECTRE_MITIGATED:
/*
* SSBS everywhere: Indicate no firmware
* support, as the SSBS support will be
* indicated to the guest and the default is
* safe.
*
* Otherwise, expose a permanent mitigation
* to the guest, and hide SSBS so that the
* guest stays protected.
*/
if (cpus_have_final_cap(ARM64_SSBS))
break;
fallthrough;
case SPECTRE_UNAFFECTED:
val = SMCCC_RET_NOT_REQUIRED;
break;
}

View file

@ -269,6 +269,7 @@ void kvm_pmu_vcpu_destroy(struct kvm_vcpu *vcpu)
for (i = 0; i < ARMV8_PMU_MAX_COUNTERS; i++)
kvm_pmu_release_perf_event(&pmu->pmc[i]);
irq_work_sync(&vcpu->arch.pmu.overflow_work);
}
u64 kvm_pmu_valid_counter_mask(struct kvm_vcpu *vcpu)
@ -433,6 +434,22 @@ void kvm_pmu_sync_hwstate(struct kvm_vcpu *vcpu)
kvm_pmu_update_state(vcpu);
}
/**
* When perf interrupt is an NMI, we cannot safely notify the vcpu corresponding
* to the event.
* This is why we need a callback to do it once outside of the NMI context.
*/
static void kvm_pmu_perf_overflow_notify_vcpu(struct irq_work *work)
{
struct kvm_vcpu *vcpu;
struct kvm_pmu *pmu;
pmu = container_of(work, struct kvm_pmu, overflow_work);
vcpu = kvm_pmc_to_vcpu(pmu->pmc);
kvm_vcpu_kick(vcpu);
}
/**
* When the perf event overflows, set the overflow status and inform the vcpu.
*/
@ -465,7 +482,11 @@ static void kvm_pmu_perf_overflow(struct perf_event *perf_event,
if (kvm_pmu_overflow_status(vcpu)) {
kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
kvm_vcpu_kick(vcpu);
if (!in_nmi())
kvm_vcpu_kick(vcpu);
else
irq_work_queue(&vcpu->arch.pmu.overflow_work);
}
cpu_pmu->pmu.start(perf_event, PERF_EF_RELOAD);
@ -764,6 +785,9 @@ static int kvm_arm_pmu_v3_init(struct kvm_vcpu *vcpu)
return ret;
}
init_irq_work(&vcpu->arch.pmu.overflow_work,
kvm_pmu_perf_overflow_notify_vcpu);
vcpu->arch.pmu.created = true;
return 0;
}

View file

@ -425,27 +425,30 @@ static int get_kernel_wa_level(u64 regid)
{
switch (regid) {
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
switch (kvm_arm_harden_branch_predictor()) {
case KVM_BP_HARDEN_UNKNOWN:
switch (arm64_get_spectre_v2_state()) {
case SPECTRE_VULNERABLE:
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
case KVM_BP_HARDEN_WA_NEEDED:
case SPECTRE_MITIGATED:
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
case KVM_BP_HARDEN_NOT_REQUIRED:
case SPECTRE_UNAFFECTED:
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED;
}
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL;
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
switch (kvm_arm_have_ssbd()) {
case KVM_SSBD_FORCE_DISABLE:
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
case KVM_SSBD_KERNEL:
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
case KVM_SSBD_FORCE_ENABLE:
case KVM_SSBD_MITIGATED:
switch (arm64_get_spectre_v4_state()) {
case SPECTRE_MITIGATED:
/*
* As for the hypercall discovery, we pretend we
* don't have any FW mitigation if SSBS is there at
* all times.
*/
if (cpus_have_final_cap(ARM64_SSBS))
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
fallthrough;
case SPECTRE_UNAFFECTED:
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
case KVM_SSBD_UNKNOWN:
default:
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN;
case SPECTRE_VULNERABLE:
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
}
}
@ -462,14 +465,8 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
val = kvm_psci_version(vcpu, vcpu->kvm);
break;
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
break;
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
if (val == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL &&
kvm_arm_get_vcpu_workaround_2_flag(vcpu))
val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED;
break;
default:
return -ENOENT;
@ -527,34 +524,35 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
return -EINVAL;
wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
if (get_kernel_wa_level(reg->id) < wa_level)
return -EINVAL;
/* The enabled bit must not be set unless the level is AVAIL. */
if (wa_level != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL &&
wa_level != val)
if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED) &&
(val & KVM_REG_FEATURE_LEVEL_MASK) != KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL)
return -EINVAL;
/* Are we finished or do we need to check the enable bit ? */
if (kvm_arm_have_ssbd() != KVM_SSBD_KERNEL)
return 0;
/*
* If this kernel supports the workaround to be switched on
* or off, make sure it matches the requested setting.
* Map all the possible incoming states to the only two we
* really want to deal with.
*/
switch (wa_level) {
switch (val & KVM_REG_FEATURE_LEVEL_MASK) {
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
break;
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
kvm_arm_set_vcpu_workaround_2_flag(vcpu,
val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED);
break;
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
kvm_arm_set_vcpu_workaround_2_flag(vcpu, true);
wa_level = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED;
break;
default:
return -EINVAL;
}
/*
* We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
* other way around.
*/
if (get_kernel_wa_level(reg->id) < wa_level)
return -EINVAL;
return 0;
default:
return -ENOENT;

View file

@ -319,10 +319,6 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.reset_state.reset = false;
}
/* Default workaround setup is enabled (if supported) */
if (kvm_arm_have_ssbd() == KVM_SSBD_KERNEL)
vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
/* Reset timer */
ret = kvm_timer_vcpu_reset(vcpu);
out:

View file

@ -1131,6 +1131,11 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
if (!vcpu_has_sve(vcpu))
val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
val &= ~(0xfUL << ID_AA64PFR0_AMU_SHIFT);
if (!(val & (0xfUL << ID_AA64PFR0_CSV2_SHIFT)) &&
arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED)
val |= (1UL << ID_AA64PFR0_CSV2_SHIFT);
} else if (id == SYS_ID_AA64PFR1_EL1) {
val &= ~(0xfUL << ID_AA64PFR1_MTE_SHIFT);
} else if (id == SYS_ID_AA64ISAR1_EL1 && !vcpu_has_ptrauth(vcpu)) {
val &= ~((0xfUL << ID_AA64ISAR1_APA_SHIFT) |
(0xfUL << ID_AA64ISAR1_API_SHIFT) |
@ -1382,6 +1387,13 @@ static bool access_ccsidr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
return true;
}
static bool access_mte_regs(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
kvm_inject_undefined(vcpu);
return false;
}
/* sys_reg_desc initialiser for known cpufeature ID registers */
#define ID_SANITISED(name) { \
SYS_DESC(SYS_##name), \
@ -1547,6 +1559,10 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
{ SYS_DESC(SYS_ACTLR_EL1), access_actlr, reset_actlr, ACTLR_EL1 },
{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
{ SYS_DESC(SYS_RGSR_EL1), access_mte_regs },
{ SYS_DESC(SYS_GCR_EL1), access_mte_regs },
{ SYS_DESC(SYS_ZCR_EL1), NULL, reset_val, ZCR_EL1, 0, .visibility = sve_visibility },
{ SYS_DESC(SYS_TTBR0_EL1), access_vm_reg, reset_unknown, TTBR0_EL1 },
{ SYS_DESC(SYS_TTBR1_EL1), access_vm_reg, reset_unknown, TTBR1_EL1 },
@ -1571,6 +1587,9 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_ERXMISC0_EL1), trap_raz_wi },
{ SYS_DESC(SYS_ERXMISC1_EL1), trap_raz_wi },
{ SYS_DESC(SYS_TFSR_EL1), access_mte_regs },
{ SYS_DESC(SYS_TFSRE0_EL1), access_mte_regs },
{ SYS_DESC(SYS_FAR_EL1), access_vm_reg, reset_unknown, FAR_EL1 },
{ SYS_DESC(SYS_PAR_EL1), NULL, reset_unknown, PAR_EL1 },

View file

@ -1001,8 +1001,8 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg, bool allow_group1)
raw_spin_lock_irqsave(&irq->irq_lock, flags);
/*
* An access targetting Group0 SGIs can only generate
* those, while an access targetting Group1 SGIs can
* An access targeting Group0 SGIs can only generate
* those, while an access targeting Group1 SGIs can
* generate interrupts of either group.
*/
if (!irq->group || allow_group1) {

View file

@ -16,3 +16,5 @@ lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
obj-$(CONFIG_CRC32) += crc32.o
obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
obj-$(CONFIG_ARM64_MTE) += mte.o

Some files were not shown because too many files have changed in this diff Show more