Commit graph

96 commits

Author SHA1 Message Date
Puranjay Mohan
c060f93253 arm64: stacktrace: fix the usage of ftrace_graph_ret_addr()
ftrace_graph_ret_addr() takes an 'idx' integer pointer that is used to
optimize the stack unwinding process. arm64 currently passes `NULL` for
this parameter which stops it from utilizing these optimizations.

Further, the current code for ftrace_graph_ret_addr() will just return
the passed in return address if it is NULL which will break this usage.

Pass a valid integer pointer to ftrace_graph_ret_addr() similar to
x86_64's stack unwinder.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Fixes: 29c1c24a27 ("function_graph: Fix up ftrace_graph_ret_addr()")
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240618162342.28275-1-puranjay@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-09-05 15:03:35 +01:00
chenqiwu
410e471f87 arm64: Add USER_STACKTRACE support
Currently, userstacktrace is unsupported for ftrace and uprobe
tracers on arm64. This patch uses the perf_callchain_user() code
as blueprint to implement the arch_stack_walk_user() which add
userstacktrace support on arm64.
Meanwhile, we can use arch_stack_walk_user() to simplify the
implementation of perf_callchain_user().
This patch is tested pass with ftrace, uprobe and perf tracers
profiling userstacktrace cases.

Tested-by: chenqiwu <qiwu.chen@transsion.com>
Signed-off-by: chenqiwu <qiwu.chen@transsion.com>
Link: https://lore.kernel.org/r/20231219022229.10230-1-qiwu.chen@transsion.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-05-03 14:12:45 +01:00
Linus Torvalds
9187210eee Networking changes for 6.9.
Core & protocols
 ----------------
 
  - Large effort by Eric to lower rtnl_lock pressure and remove locks:
 
    - Make commonly used parts of rtnetlink (address, route dumps etc.)
      lockless, protected by RCU instead of rtnl_lock.
 
    - Add a netns exit callback which already holds rtnl_lock,
      allowing netns exit to take rtnl_lock once in the core
      instead of once for each driver / callback.
 
    - Remove locks / serialization in the socket diag interface.
 
    - Remove 6 calls to synchronize_rcu() while holding rtnl_lock.
 
    - Remove the dev_base_lock, depend on RCU where necessary.
 
  - Support busy polling on a per-epoll context basis. Poll length
    and budget parameters can be set independently of system defaults.
 
  - Introduce struct net_hotdata, to make sure read-mostly global config
    variables fit in as few cache lines as possible.
 
  - Add optional per-nexthop statistics to ease monitoring / debug
    of ECMP imbalance problems.
 
  - Support TCP_NOTSENT_LOWAT in MPTCP.
 
  - Ensure that IPv6 temporary addresses' preferred lifetimes are long
    enough, compared to other configured lifetimes, and at least 2 sec.
 
  - Support forwarding of ICMP Error messages in IPSec, per RFC 4301.
 
  - Add support for the independent control state machine for bonding
    per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
    control state machine.
 
  - Add "network ID" to MCTP socket APIs to support hosts with multiple
    disjoint MCTP networks.
 
  - Re-use the mono_delivery_time skbuff bit for packets which user
    space wants to be sent at a specified time. Maintain the timing
    information while traversing veth links, bridge etc.
 
  - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.
 
  - Simplify many places iterating over netdevs by using an xarray
    instead of a hash table walk (hash table remains in place, for
    use on fastpaths).
 
  - Speed up scanning for expired routes by keeping a dedicated list.
 
  - Speed up "generic" XDP by trying harder to avoid large allocations.
 
  - Support attaching arbitrary metadata to netconsole messages.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce
    VM_SPARSE kind and vm_area_[un]map_pages (used by bpf_arena).
 
  - Rework selftest harness to enable the use of the full range of
    ksft exit code (pass, fail, skip, xfail, xpass).
 
 Netfilter
 ---------
 
  - Allow userspace to define a table that is exclusively owned by a daemon
    (via netlink socket aliveness) without auto-removing this table when
    the userspace program exits. Such table gets marked as orphaned and
    a restarting management daemon can re-attach/regain ownership.
 
  - Speed up element insertions to nftables' concatenated-ranges set type.
    Compact a few related data structures.
 
 BPF
 ---
 
  - Add BPF token support for delegating a subset of BPF subsystem
    functionality from privileged system-wide daemons such as systemd
    through special mount options for userns-bound BPF fs to a trusted
    & unprivileged application.
 
  - Introduce bpf_arena which is sparse shared memory region between BPF
    program and user space where structures inside the arena can have
    pointers to other areas of the arena, and pointers work seamlessly
    for both user-space programs and BPF programs.
 
  - Introduce may_goto instruction that is a contract between the verifier
    and the program. The verifier allows the program to loop assuming it's
    behaving well, but reserves the right to terminate it.
 
  - Extend the BPF verifier to enable static subprog calls in spin lock
    critical sections.
 
  - Support registration of struct_ops types from modules which helps
    projects like fuse-bpf that seeks to implement a new struct_ops type.
 
  - Add support for retrieval of cookies for perf/kprobe multi links.
 
  - Support arbitrary TCP SYN cookie generation / validation in the TC
    layer with BPF to allow creating SYN flood handling in BPF firewalls.
 
  - Add code generation to inline the bpf_kptr_xchg() helper which
    improves performance when stashing/popping the allocated BPF objects.
 
 Wireless
 --------
 
  - Add SPP (signaling and payload protected) AMSDU support.
 
  - Support wider bandwidth OFDMA, as required for EHT operation.
 
 Driver API
 ----------
 
  - Major overhaul of the Energy Efficient Ethernet internals to support
    new link modes (2.5GE, 5GE), share more code between drivers
    (especially those using phylib), and encourage more uniform behavior.
    Convert and clean up drivers.
 
  - Define an API for querying per netdev queue statistics from drivers.
 
  - IPSec: account in global stats for fully offloaded sessions.
 
  - Create a concept of Ethernet PHY Packages at the Device Tree level,
    to allow parameterizing the existing PHY package code.
 
  - Enable Rx hashing (RSS) on GTP protocol fields.
 
 Misc
 ----
 
  - Improvements and refactoring all over networking selftests.
 
  - Create uniform module aliases for TC classifiers, actions,
    and packet schedulers to simplify creating modprobe policies.
 
  - Address all missing MODULE_DESCRIPTION() warnings in networking.
 
  - Extend the Netlink descriptions in YAML to cover message encapsulation
    or "Netlink polymorphism", where interpretation of nested attributes
    depends on link type, classifier type or some other "class type".
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
    - Intel (100G, ice, idpf):
      - support E825-C devices
    - nVidia/Mellanox:
      - support devices with one port and multiple PCIe links
    - Broadcom (bnxt):
      - support n-tuple filters
      - support configuring the RSS key
    - Wangxun (ngbe/txgbe):
      - implement irq_domain for TXGBE's sub-interrupts
    - Pensando/AMD:
      - support XDP
      - optimize queue submission and wakeup handling (+17% bps)
      - optimize struct layout, saving 28% of memory on queues
 
  - Ethernet NICs embedded and virtual:
    - Google cloud vNIC:
      - refactor driver to perform memory allocations for new queue
        config before stopping and freeing the old queue memory
    - Synopsys (stmmac):
      - obey queueMaxSDU and implement counters required by 802.1Qbv
    - Renesas (ravb):
      - support packet checksum offload
      - suspend to RAM and runtime PM support
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - support for nexthop group statistics
    - Microchip:
      - ksz8: implement PHY loopback
      - add support for KSZ8567, a 7-port 10/100Mbps switch
 
  - PTP:
    - New driver for RENESAS FemtoClock3 Wireless clock generator.
    - Support OCP PTP cards designed and built by Adva.
 
  - CAN:
    - Support recvmsg() flags for own, local and remote traffic
      on CAN BCM sockets.
    - Support for esd GmbH PCIe/402 CAN device family.
    - m_can:
      - Rx/Tx submission coalescing
      - wake on frame Rx
 
  - WiFi:
    - Intel (iwlwifi):
      - enable signaling and payload protected A-MSDUs
      - support wider-bandwidth OFDMA
      - support for new devices
      - bump FW API to 89 for AX devices; 90 for BZ/SC devices
    - MediaTek (mt76):
      - mt7915: newer ADIE version support
      - mt7925: radio temperature sensor support
    - Qualcomm (ath11k):
      - support 6 GHz station power modes: Low Power Indoor (LPI),
        Standard Power) SP and Very Low Power (VLP)
      - QCA6390 & WCN6855: support 2 concurrent station interfaces
      - QCA2066 support
    - Qualcomm (ath12k):
      - refactoring in preparation for Multi-Link Operation (MLO) support
      - 1024 Block Ack window size support
      - firmware-2.bin support
      - support having multiple identical PCI devices (firmware needs to
        have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
      - QCN9274: support split-PHY devices
      - WCN7850: enable Power Save Mode in station mode
      - WCN7850: P2P support
    - RealTek:
      - rtw88: support for more rtw8811cu and rtw8821cu devices
      - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
      - rtlwifi: speed up USB firmware initialization
      - rtwl8xxxu:
        - RTL8188F: concurrent interface support
        - Channel Switch Announcement (CSA) support in AP mode
    - Broadcom (brcmfmac):
      - per-vendor feature support
      - per-vendor SAE password setup
      - DMI nvram filename quirk for ACEPC W5 Pro
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmXv0mgACgkQMUZtbf5S
 IrtgMxAAuRd+WJW++SENr4KxIWhYO1q6Xcxnai43wrNkan9swD24icG8TYALt4f3
 yoT6idQvWReAb5JNlh9rUQz8R7E0nJXlvEFn5MtJwcthx2C6wFo/XkJlddlRrT+j
 c2xGILwLjRhW65LaC0MZ2ECbEERkFz8xcGfK2SWzUgh6KYvPjcRfKFxugpM7xOQK
 P/Wnqhs4fVRS/Mj/bCcXcO+yhwC121Q3qVeQVjGS0AzEC65hAW87a/kc2BfgcegD
 EyI9R7mf6criQwX+0awubjfoIdr4oW/8oDVNvUDczkJkbaEVaLMQk9P5x/0XnnVS
 UHUchWXyI80Q8Rj12uN1/I0h3WtwNQnCRBuLSmtm6GLfCAwbLvp2nGWDnaXiqryW
 DVKUIHGvqPKjkOOMOVfSvfB3LvkS3xsFVVYiQBQCn0YSs/gtu4CoF2Nty9CiLPbK
 tTuxUnLdPDZDxU//l0VArZmP8p2JM7XQGJ+JH8GFH4SBTyBR23e0iyPSoyaxjnYn
 RReDnHMVsrS1i7GPhbqDJWn+uqMSs7N149i0XmmyeqwQHUVSJN3J2BApP2nCaDfy
 H2lTuYly5FfEezt61NvCE4qr/VsWeEjm1fYlFQ9dFn4pGn+HghyCpw+xD1ZN56DN
 lujemau5B3kk1UTtAT4ypPqvuqjkRFqpNV2LzsJSk/Js+hApw8Y=
 =oY52
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Large effort by Eric to lower rtnl_lock pressure and remove locks:

      - Make commonly used parts of rtnetlink (address, route dumps
        etc) lockless, protected by RCU instead of rtnl_lock.

      - Add a netns exit callback which already holds rtnl_lock,
        allowing netns exit to take rtnl_lock once in the core instead
        of once for each driver / callback.

      - Remove locks / serialization in the socket diag interface.

      - Remove 6 calls to synchronize_rcu() while holding rtnl_lock.

      - Remove the dev_base_lock, depend on RCU where necessary.

   - Support busy polling on a per-epoll context basis. Poll length and
     budget parameters can be set independently of system defaults.

   - Introduce struct net_hotdata, to make sure read-mostly global
     config variables fit in as few cache lines as possible.

   - Add optional per-nexthop statistics to ease monitoring / debug of
     ECMP imbalance problems.

   - Support TCP_NOTSENT_LOWAT in MPTCP.

   - Ensure that IPv6 temporary addresses' preferred lifetimes are long
     enough, compared to other configured lifetimes, and at least 2 sec.

   - Support forwarding of ICMP Error messages in IPSec, per RFC 4301.

   - Add support for the independent control state machine for bonding
     per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
     control state machine.

   - Add "network ID" to MCTP socket APIs to support hosts with multiple
     disjoint MCTP networks.

   - Re-use the mono_delivery_time skbuff bit for packets which user
     space wants to be sent at a specified time. Maintain the timing
     information while traversing veth links, bridge etc.

   - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.

   - Simplify many places iterating over netdevs by using an xarray
     instead of a hash table walk (hash table remains in place, for use
     on fastpaths).

   - Speed up scanning for expired routes by keeping a dedicated list.

   - Speed up "generic" XDP by trying harder to avoid large allocations.

   - Support attaching arbitrary metadata to netconsole messages.

  Things we sprinkled into general kernel code:

   - Enforce VM_IOREMAP flag and range in ioremap_page_range and
     introduce VM_SPARSE kind and vm_area_[un]map_pages (used by
     bpf_arena).

   - Rework selftest harness to enable the use of the full range of ksft
     exit code (pass, fail, skip, xfail, xpass).

  Netfilter:

   - Allow userspace to define a table that is exclusively owned by a
     daemon (via netlink socket aliveness) without auto-removing this
     table when the userspace program exits. Such table gets marked as
     orphaned and a restarting management daemon can re-attach/regain
     ownership.

   - Speed up element insertions to nftables' concatenated-ranges set
     type. Compact a few related data structures.

  BPF:

   - Add BPF token support for delegating a subset of BPF subsystem
     functionality from privileged system-wide daemons such as systemd
     through special mount options for userns-bound BPF fs to a trusted
     & unprivileged application.

   - Introduce bpf_arena which is sparse shared memory region between
     BPF program and user space where structures inside the arena can
     have pointers to other areas of the arena, and pointers work
     seamlessly for both user-space programs and BPF programs.

   - Introduce may_goto instruction that is a contract between the
     verifier and the program. The verifier allows the program to loop
     assuming it's behaving well, but reserves the right to terminate
     it.

   - Extend the BPF verifier to enable static subprog calls in spin lock
     critical sections.

   - Support registration of struct_ops types from modules which helps
     projects like fuse-bpf that seeks to implement a new struct_ops
     type.

   - Add support for retrieval of cookies for perf/kprobe multi links.

   - Support arbitrary TCP SYN cookie generation / validation in the TC
     layer with BPF to allow creating SYN flood handling in BPF
     firewalls.

   - Add code generation to inline the bpf_kptr_xchg() helper which
     improves performance when stashing/popping the allocated BPF
     objects.

  Wireless:

   - Add SPP (signaling and payload protected) AMSDU support.

   - Support wider bandwidth OFDMA, as required for EHT operation.

  Driver API:

   - Major overhaul of the Energy Efficient Ethernet internals to
     support new link modes (2.5GE, 5GE), share more code between
     drivers (especially those using phylib), and encourage more
     uniform behavior. Convert and clean up drivers.

   - Define an API for querying per netdev queue statistics from
     drivers.

   - IPSec: account in global stats for fully offloaded sessions.

   - Create a concept of Ethernet PHY Packages at the Device Tree level,
     to allow parameterizing the existing PHY package code.

   - Enable Rx hashing (RSS) on GTP protocol fields.

  Misc:

   - Improvements and refactoring all over networking selftests.

   - Create uniform module aliases for TC classifiers, actions, and
     packet schedulers to simplify creating modprobe policies.

   - Address all missing MODULE_DESCRIPTION() warnings in networking.

   - Extend the Netlink descriptions in YAML to cover message
     encapsulation or "Netlink polymorphism", where interpretation of
     nested attributes depends on link type, classifier type or some
     other "class type".

  Drivers:

   - Ethernet high-speed NICs:
      - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
      - Intel (100G, ice, idpf):
         - support E825-C devices
      - nVidia/Mellanox:
         - support devices with one port and multiple PCIe links
      - Broadcom (bnxt):
         - support n-tuple filters
         - support configuring the RSS key
      - Wangxun (ngbe/txgbe):
         - implement irq_domain for TXGBE's sub-interrupts
      - Pensando/AMD:
         - support XDP
         - optimize queue submission and wakeup handling (+17% bps)
         - optimize struct layout, saving 28% of memory on queues

   - Ethernet NICs embedded and virtual:
      - Google cloud vNIC:
         - refactor driver to perform memory allocations for new queue
           config before stopping and freeing the old queue memory
      - Synopsys (stmmac):
         - obey queueMaxSDU and implement counters required by 802.1Qbv
      - Renesas (ravb):
         - support packet checksum offload
         - suspend to RAM and runtime PM support

   - Ethernet switches:
      - nVidia/Mellanox:
         - support for nexthop group statistics
      - Microchip:
         - ksz8: implement PHY loopback
         - add support for KSZ8567, a 7-port 10/100Mbps switch

   - PTP:
      - New driver for RENESAS FemtoClock3 Wireless clock generator.
      - Support OCP PTP cards designed and built by Adva.

   - CAN:
      - Support recvmsg() flags for own, local and remote traffic on CAN
        BCM sockets.
      - Support for esd GmbH PCIe/402 CAN device family.
      - m_can:
         - Rx/Tx submission coalescing
         - wake on frame Rx

   - WiFi:
      - Intel (iwlwifi):
         - enable signaling and payload protected A-MSDUs
         - support wider-bandwidth OFDMA
         - support for new devices
         - bump FW API to 89 for AX devices; 90 for BZ/SC devices
      - MediaTek (mt76):
         - mt7915: newer ADIE version support
         - mt7925: radio temperature sensor support
      - Qualcomm (ath11k):
         - support 6 GHz station power modes: Low Power Indoor (LPI),
           Standard Power) SP and Very Low Power (VLP)
         - QCA6390 & WCN6855: support 2 concurrent station interfaces
         - QCA2066 support
      - Qualcomm (ath12k):
         - refactoring in preparation for Multi-Link Operation (MLO)
           support
         - 1024 Block Ack window size support
         - firmware-2.bin support
         - support having multiple identical PCI devices (firmware needs
           to have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
         - QCN9274: support split-PHY devices
         - WCN7850: enable Power Save Mode in station mode
         - WCN7850: P2P support
      - RealTek:
         - rtw88: support for more rtw8811cu and rtw8821cu devices
         - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
         - rtlwifi: speed up USB firmware initialization
         - rtwl8xxxu:
             - RTL8188F: concurrent interface support
             - Channel Switch Announcement (CSA) support in AP mode
      - Broadcom (brcmfmac):
         - per-vendor feature support
         - per-vendor SAE password setup
         - DMI nvram filename quirk for ACEPC W5 Pro"

* tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits)
  nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y
  nexthop: Fix out-of-bounds access during attribute validation
  nexthop: Only parse NHA_OP_FLAGS for dump messages that require it
  nexthop: Only parse NHA_OP_FLAGS for get messages that require it
  bpf: move sleepable flag from bpf_prog_aux to bpf_prog
  bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes()
  selftests/bpf: Add kprobe multi triggering benchmarks
  ptp: Move from simple ida to xarray
  vxlan: Remove generic .ndo_get_stats64
  vxlan: Do not alloc tstats manually
  devlink: Add comments to use netlink gen tool
  nfp: flower: handle acti_netdevs allocation failure
  net/packet: Add getsockopt support for PACKET_COPY_THRESH
  net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID
  selftests/bpf: Add bpf_arena_htab test.
  selftests/bpf: Add bpf_arena_list test.
  selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
  bpf: Add helper macro bpf_addr_space_cast()
  libbpf: Recognize __arena global variables.
  bpftool: Recognize arena map type
  ...
2024-03-12 17:44:08 -07:00
Puranjay Mohan
2c79bd34af arm64: prohibit probing on arch_kunwind_consume_entry()
Make arch_kunwind_consume_entry() as __always_inline otherwise the
compiler might not inline it and allow attaching probes to it.

Without this, just probing arch_kunwind_consume_entry() via
<tracefs>/kprobe_events will crash the kernel on arm64.

The crash can be reproduced using the following compiler and kernel
combination:
clang version 19.0.0git (https://github.com/llvm/llvm-project.git d68d29516102252f6bf6dc23fb22cef144ca1cb3)
commit 87adedeba5 ("Merge tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")

 [root@localhost ~]# echo 'p arch_kunwind_consume_entry' > /sys/kernel/debug/tracing/kprobe_events
 [root@localhost ~]# echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable

 Modules linked in: aes_ce_blk aes_ce_cipher ghash_ce sha2_ce virtio_net sha256_arm64 sha1_ce arm_smccc_trng net_failover failover virtio_mmio uio_pdrv_genirq uio sch_fq_codel dm_mod dax configfs
 CPU: 3 PID: 1405 Comm: bash Not tainted 6.8.0-rc6+ #14
 Hardware name: linux,dummy-virt (DT)
 pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : kprobe_breakpoint_handler+0x17c/0x258
 lr : kprobe_breakpoint_handler+0x17c/0x258
 sp : ffff800085d6ab60
 x29: ffff800085d6ab60 x28: ffff0000066f0040 x27: ffff0000066f0b20
 x26: ffff800081fa7b0c x25: 0000000000000002 x24: ffff00000b29bd18
 x23: ffff00007904c590 x22: ffff800081fa6590 x21: ffff800081fa6588
 x20: ffff00000b29bd18 x19: ffff800085d6ac40 x18: 0000000000000079
 x17: 0000000000000001 x16: ffffffffffffffff x15: 0000000000000004
 x14: ffff80008277a940 x13: 0000000000000003 x12: 0000000000000003
 x11: 00000000fffeffff x10: c0000000fffeffff x9 : aa95616fdf80cc00
 x8 : aa95616fdf80cc00 x7 : 205d343137373231 x6 : ffff800080fb48ec
 x5 : 0000000000000000 x4 : 0000000000000001 x3 : 0000000000000000
 x2 : 0000000000000000 x1 : ffff800085d6a910 x0 : 0000000000000079
 Call trace:
 kprobes: Failed to recover from reentered kprobes.
 kprobes: Dump kprobe:
 .symbol_name = arch_kunwind_consume_entry, .offset = 0, .addr = arch_kunwind_consume_entry+0x0/0x40
 ------------[ cut here ]------------
 kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
 kprobes: Failed to recover from reentered kprobes.
 kprobes: Dump kprobe:
 .symbol_name = arch_kunwind_consume_entry, .offset = 0, .addr = arch_kunwind_consume_entry+0x0/0x40

Fixes: 1aba06e7b2 ("arm64: stacktrace: factor out kunwind_stack_walk()")
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240229231620.24846-1-puranjay12@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2024-03-04 13:00:00 +00:00
Puranjay Mohan
e74cb1b422 arm64: stacktrace: Implement arch_bpf_stack_walk() for the BPF JIT
This will be used by bpf_throw() to unwind till the program marked as
exception boundary and run the callback with the stack of the main
program.

This is required for supporting BPF exceptions on ARM64.

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20240201125225.72796-2-puranjay12@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-27 13:54:17 -08:00
Mark Rutland
1aba06e7b2 arm64: stacktrace: factor out kunwind_stack_walk()
Currently arm64 uses the generic arch_stack_walk() interface for all
stack walking code. This only passes a PC value and cookie to the unwind
callback, whereas we'd like to pass some additional information in some
cases. For example, the BPF exception unwinder wants the FP, for
reliable stacktrace we'll want to perform additional checks on other
portions of unwind state, and we'd like to expand the information
printed by dump_backtrace() to include provenance and reliability
information.

As preparation for all of the above, this patch factors the core unwind
logic out of arch_stack_walk() and into a new kunwind_stack_walk()
function which provides all of the unwind state to a callback function.
The existing arch_stack_walk() interface is implemented atop this.

The kunwind_stack_walk() function is intended to be a private
implementation detail of unwinders in stacktrace.c, and not something to
be exported generally to kernel code. It is __always_inline'd into its
caller so that neither it or its caller appear in stactraces (which is
the existing/required behavior for arch_stack_walk() and friends) and so
that the compiler can optimize away some of the indirection.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20231124110511.2795958-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-11 11:42:55 +00:00
Mark Rutland
1beef60e7d arm64: stacktrace: factor out kernel unwind state
On arm64 we share some unwinding code between the regular kernel
unwinder and the KVM hyp unwinder. Some of this common code only matters
to the regular unwinder, e.g. the `kr_cur` and `task` fields in the
common struct unwind_state.

We're likely to add more state which only matters for regular kernel
unwinding (or only for hyp unwinding). In preparation for such changes,
this patch factors out the kernel-specific state into a new struct
kunwind_state, and updates the kernel unwind code accordingly.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Puranjay Mohan <puranjay12@gmail.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20231124110511.2795958-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-12-11 11:42:55 +00:00
Mark Rutland
ca708599ca arm64: use XPACLRI to strip PAC
Currently we strip the PAC from pointers using C code, which requires
generating bitmasks, and conditionally clearing/setting bits depending
on bit 55. We can do better by using XPACLRI directly.

When the logic was originally written to strip PACs from user pointers,
contemporary toolchains used for the kernel had assemblers which were
unaware of the PAC instructions. As stripping the PAC from userspace
pointers required unconditional clearing of a fixed set of bits (which
could be performed with a single instruction), it was simpler to
implement the masking in C than it was to make use of XPACI or XPACLRI.

When support for in-kernel pointer authentication was added, the
stripping logic was extended to cover TTBR1 pointers, requiring several
instructions to handle whether to clear/set bits dependent on bit 55 of
the pointer.

This patch simplifies the stripping of PACs by using XPACLRI directly,
as contemporary toolchains do within __builtin_return_address(). This
saves a number of instructions, especially where
__builtin_return_address() does not implicitly strip the PAC but is
heavily used (e.g. with tracepoints). As the kernel might be compiled
with an assembler without knowledge of XPACLRI, it is assembled using
the 'HINT #7' alias, which results in an identical opcode.

At the same time, I've split ptrauth_strip_insn_pac() into
ptrauth_strip_user_insn_pac() and ptrauth_strip_kernel_insn_pac()
helpers so that we can avoid unnecessary PAC stripping when pointer
authentication is not in use in userspace or kernel respectively.

The underlying xpaclri() macro uses inline assembly which clobbers x30.
The clobber causes the compiler to save/restore the original x30 value
in a frame record (protected with PACIASP and AUTIASP when in-kernel
authentication is enabled), so this does not provide a gadget to alter
the return address. Similarly this does not adversely affect unwinding
due to the presence of the frame record.

The ptrauth_user_pac_mask() and ptrauth_kernel_pac_mask() are exported
from the kernel in ptrace and core dumps, so these are retained. A
subsequent patch will move them out of <asm/compiler.h>.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230412160134.306148-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-04-13 12:27:11 +01:00
Mark Rutland
b5ecc19e68 arm64: stacktrace: always inline core stacktrace functions
The arm64 stacktrace code can be used in kprobe context, and so cannot
be safely probed. Some (but not all) of the unwind functions are
annotated with `NOKPROBE_SYMBOL()` to ensure this, with others markes as
`__always_inline`, relying on the top-level unwind function being marked
as `noinstr`.

This patch has stacktrace.c consistently mark the internal stacktrace
functions as `__always_inline`, removing the need for NOKPROBE_SYMBOL()
as the top-level unwind function (arch_stack_walk()) is marked as
`noinstr`. This is more consistent and is a simpler pattern to follow
for future additions to stacktrace.c.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230411162943.203199-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-04-11 18:34:59 +01:00
Mark Rutland
ead6122c28 arm64: stacktrace: move dump functions to end of file
For historical reasons, the backtrace dumping functions are placed in
the middle of stacktrace.c, despite using functions defined later. For
clarity, and to make subsequent refactoring easier, move the dumping
functions to the end of stacktrace.c

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230411162943.203199-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-04-11 18:34:59 +01:00
Mark Rutland
9e09d445f1 arm64: stacktrace: recover return address for first entry
The function which calls the top-level backtracing function may have
been instrumented with ftrace and/or kprobes, and hence the first return
address may have been rewritten.

Factor out the existing fgraph / kretprobes address recovery, and use
this for the first address. As the comment for the fgraph case isn't all
that helpful, I've also dropped that.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230411162943.203199-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-04-11 18:34:59 +01:00
Ard Biesheuvel
7ea55715c4 arm64: efi: Account for the EFI runtime stack in stack unwinder
The EFI runtime services run from a dedicated stack now, and so the
stack unwinder needs to be informed about this.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-01-16 15:27:31 +01:00
Masami Hiramatsu (Google)
0fbcd8abf3 arm64: Prohibit instrumentation on arch_stack_walk()
Mark arch_stack_walk() as noinstr instead of notrace and inline functions
called from arch_stack_walk() as __always_inline so that user does not
put any instrumentations on it, because this function can be used from
return_address() which is used by lockdep.

Without this, if the kernel built with CONFIG_LOCKDEP=y, just probing
arch_stack_walk() via <tracefs>/kprobe_events will crash the kernel on
arm64.

 # echo p arch_stack_walk >> ${TRACEFS}/kprobe_events
 # echo 1 > ${TRACEFS}/events/kprobes/enable
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
  PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 17 Comm: migration/0 Tainted: G                 N 6.1.0-rc5+ #6
  Hardware name: linux,dummy-virt (DT)
  Stopper: 0x0 <- 0x0
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : kprobe_breakpoint_handler+0x178/0x17c
  lr : kprobe_breakpoint_handler+0x178/0x17c
  sp : ffff8000080d3090
  x29: ffff8000080d3090 x28: ffff0df5845798c0 x27: ffffc4f59057a774
  x26: ffff0df5ffbba770 x25: ffff0df58f420f18 x24: ffff49006f641000
  x23: ffffc4f590579768 x22: ffff0df58f420f18 x21: ffff8000080d31c0
  x20: ffffc4f590579768 x19: ffffc4f590579770 x18: 0000000000000006
  x17: 5f6b636174735f68 x16: 637261203d207264 x15: 64612e202c30203d
  x14: 2074657366666f2e x13: 30633178302f3078 x12: 302b6b6c61775f6b
  x11: 636174735f686372 x10: ffffc4f590dc5bd8 x9 : ffffc4f58eb31958
  x8 : 00000000ffffefff x7 : ffffc4f590dc5bd8 x6 : 80000000fffff000
  x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
  x2 : 0000000000000000 x1 : ffff0df5845798c0 x0 : 0000000000000064
  Call trace:
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!

Fixes: 39ef362d2d ("arm64: Make return_address() use arch_stack_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/166994751368.439920.3236636557520824664.stgit@devnote3
Signed-off-by: Will Deacon <will@kernel.org>
2022-12-05 14:20:08 +00:00
Mark Rutland
4b5e694e25 arm64: stacktrace: track hyp stacks in unwinder's address space
Currently unwind_next_frame_record() has an optional callback to convert
the address space of the FP. This is necessary for the NVHE unwinder,
which tracks the stacks in the hyp VA space, but accesses the frame
records in the kernel VA space.

This is a bit unfortunate since it clutters unwind_next_frame_record(),
which will get in the way of future rework.

Instead, this patch changes the NVHE unwinder to track the stacks in the
kernel's VA space and translate to FP prior to calling
unwind_next_frame_record(). This removes the need for the translate_fp()
callback, as all unwinders consistently track stacks in the native
address space of the unwinder.

At the same time, this patch consolidates the generation of the stack
addresses behind the stackinfo_get_*() helpers.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-10-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 12:30:08 +01:00
Mark Rutland
8df137300d arm64: stacktrace: track all stack boundaries explicitly
Currently we call an on_accessible_stack() callback for each step of the
unwinder, requiring redundant work to be performed in the core of the
unwind loop (e.g. disabling preemption around accesses to per-cpu
variables containing stack boundaries). To prevent unwind loops which go
through a stack multiple times, we have to track the set of unwound
stacks, requiring a stack_type enum which needs to cater for all the
stacks of all possible callees. To prevent loops within a stack, we must
track the prior FP values.

This patch reworks the unwinder to minimize the work in the core of the
unwinder, and to remove the need for the stack_type enum. The set of
accessible stacks (and their boundaries) are determined at the start of
the unwind, and the current stack is tracked during the unwind, with
completed stacks removed from the set of accessible stacks. This makes
the boundary checks more accurate (e.g. detecting overlapped frame
records), and removes the need for separate tracking of the prior FP and
visited stacks.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 12:30:08 +01:00
Mark Rutland
d1f684e46b arm64: stacktrace: rework stack boundary discovery
In subsequent patches we'll want to acquire the stack boundaries
ahead-of-time, and we'll need to be able to acquire the relevant
stack_info regardless of whether we have an object the happens to be on
the stack.

This patch replaces the on_XXX_stack() helpers with stackinfo_get_XXX()
helpers, with the caller being responsible for the checking whether an
object is on a relevant stack. For the moment this is moved into the
on_accessible_stack() functions, making these slightly larger;
subsequent patches will remove the on_accessible_stack() functions and
simplify the logic.

The on_irq_stack() and on_task_stack() helpers are kept as these are
used by IRQ entry sequences and stackleak respectively. As they're only
used as predicates, the stack_info pointer parameter is removed in both
cases.

As the on_accessible_stack() functions are always passed a non-NULL info
pointer, these now update info unconditionally. When updating the type
to STACK_TYPE_UNKNOWN, the low/high bounds are also modified, but as
these will not be consumed this should have no adverse affect.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 12:30:07 +01:00
Mark Rutland
75758d5114 arm64: stacktrace: move SDEI stack helpers to stacktrace code
For clarity and ease of maintenance, it would be helpful for all the
stack helpers to be in the same place.

Move the SDEI stack helpers into the stacktrace code where all the other
stack helpers live.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 12:30:07 +01:00
Mark Rutland
b532ab5f23 arm64: stacktrace: rename unwind_next_common() -> unwind_next_frame_record()
The unwind_next_common() function unwinds a single frame record. There
are other unwind steps (e.g. unwinding through trampolines) which are
handled in the regular kernel unwinder, and in future there may be other
common unwind helpers.

Clarify the purpose of unwind_next_common() by renaming it to
unwind_next_frame_record(). At the same time, add commentary, and delete
the redundant comment at the top of asm/stacktrace/common.h.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 12:30:07 +01:00
Mark Rutland
bc8d75212d arm64: stacktrace: simplify unwind_next_common()
Currently unwind_next_common() takes a pointer to a stack_info which is
only ever used within unwind_next_common().

Make it a local variable and simplify callers.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09 12:30:07 +01:00
Marc Zyngier
4e00532f37 KVM: arm64: Make unwind()/on_accessible_stack() per-unwinder functions
Having multiple versions of on_accessible_stack() (one per unwinder)
makes it very hard to reason about what is used where due to the
complexity of the various includes, the forward declarations, and
the reliance on everything being 'inline'.

Instead, move the code back where it should be. Each unwinder
implements:

- on_accessible_stack() as well as the helpers it depends on,

- unwind()/unwind_next(), as they pass on_accessible_stack as
  a parameter to unwind_next_common() (which is the only common
  code here)

This hardly results in any duplication, and makes it much
easier to reason about the code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20220727142906.1856759-4-maz@kernel.org
2022-07-27 18:18:03 +01:00
Kalesh Singh
f51e714674 arm64: stacktrace: Factor out common unwind()
Move unwind() to stacktrace/common.h, and as a result
the kernel unwind_next() to asm/stacktrace.h. This allow
reusing unwind() in the implementation of the nVHE HYP
stack unwinder, later in the series.

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-6-kaleshsingh@google.com
2022-07-26 10:48:43 +01:00
Kalesh Singh
5b1b08619f arm64: stacktrace: Handle frame pointer from different address spaces
The unwinder code is made reusable so that it can be used to
unwind various types of stacks. One usecase is unwinding the
nVHE hyp stack from the host (EL1) in non-protected mode. This
means that the unwinder must be able to translate HYP stack
addresses to kernel addresses.

Add a callback (stack_trace_translate_fp_fn) to allow specifying
the translation function.

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-5-kaleshsingh@google.com
2022-07-26 10:48:32 +01:00
Kalesh Singh
be63c647fd arm64: stacktrace: Factor out unwind_next_common()
Move common unwind_next logic to stacktrace/common.h. This allows
reusing the code in the implementation the nVHE hypervisor stack
unwinder, later in this series.

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-4-kaleshsingh@google.com
2022-07-26 10:48:20 +01:00
Kalesh Singh
6bf212c89c arm64: stacktrace: Add shared header for common stack unwinding code
In order to reuse the arm64 stack unwinding logic for the nVHE
hypervisor stack, move the common code to a shared header
(arch/arm64/include/asm/stacktrace/common.h).

The nVHE hypervisor cannot safely link against kernel code, so we
make use of the shared header to avoid duplicated logic later in
this series.

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-2-kaleshsingh@google.com
2022-07-26 10:47:14 +01:00
Madhavan T. Venkataraman
82a592c13b arm64: Copy the task argument to unwind_state
Copy the task argument passed to arch_stack_walk() to unwind_state so that
it can be passed to unwind functions via unwind_state rather than as a
separate argument. The task is a fundamental part of the unwind state.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220617180219.20352-3-madvenka@linux.microsoft.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 10:51:34 +01:00
Madhavan T. Venkataraman
a019d8a2cc arm64: Split unwind_init()
unwind_init() is currently a single function that initializes all of the
unwind state. Split it into the following functions and call them
appropriately:

	- unwind_init_from_regs() - initialize from regs passed by caller.

	- unwind_init_from_caller() - initialize for the current task
	  from the caller of arch_stack_walk().

	- unwind_init_from_task() - initialize from the saved state of a
	  task other than the current task. In this case, the other
	  task must not be running.

This is done for two reasons:

	- the different ways of initializing are clear

	- specialized code can be added to each initializer in the future.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220617180219.20352-2-madvenka@linux.microsoft.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 10:51:34 +01:00
Andrey Konovalov
446297b28a arm64: stacktrace: use non-atomic __set_bit
Use the non-atomic version of set_bit() in arch/arm64/kernel/stacktrace.c,
as there is no concurrent accesses to frame->prev_type.

This speeds up stack trace collection and improves the boot time of
Generic KASAN by 2-5%.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Link: https://lore.kernel.org/r/23dfa36d1cc91e4a1059945b7834eac22fb9854d.1653317461.git.andreyknvl@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 15:57:29 +01:00
Andrey Konovalov
802b91118d arm64: kasan: do not instrument stacktrace.c
Disable KASAN instrumentation of arch/arm64/kernel/stacktrace.c.

This speeds up Generic KASAN by 5-20%.

As a side-effect, KASAN is now unable to detect bugs in the stack trace
collection code. This is taken as an acceptable downside.

Also replace READ_ONCE_NOCHECK() with READ_ONCE() in stacktrace.c.
As the file is now not instrumented, there is no need to use the
NOCHECK version of READ_ONCE().

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Link: https://lore.kernel.org/r/c4c944a2a905e949760fbeb29258185087171708.1653317461.git.andreyknvl@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-23 15:57:29 +01:00
Madhavan T. Venkataraman
bd5552bc48 arm64: stacktrace: align with common naming
For historical reasons, the naming of parameters and their types in the
arm64 stacktrace code differs from that used in generic code and other
architectures, even though the types are equivalent.

For consistency and clarity, use the generic names.

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 15:33:14 +01:00
Madhavan T. Venkataraman
e9d75a0ba8 arm64: stacktrace: rename stackframe to unwind_state
Rename "struct stackframe" to "struct unwind_state" for consistency and
better naming. Accordingly, rename variable/argument "frame" to "state".

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 15:33:14 +01:00
Madhavan T. Venkataraman
c797bd4548 arm64: stacktrace: rename unwinder functions
Rename unwinder functions for consistency and better naming.

	- Rename start_backtrace() to unwind_init().
	- Rename unwind_frame() to unwind_next().
	- Rename walk_stackframe() to unwind().

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 15:33:14 +01:00
Mark Rutland
96bb1530c4 arm64: stacktrace: make struct stackframe private to stacktrace.c
Now that arm64 uses arch_stack_walk() consistently, struct stackframe is
only used within stacktrace.c. To make it easier to read and maintain
this code, it would be nicer if the definition were there too.

Move the definition into stacktrace.c.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviwed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 15:33:13 +01:00
Mark Rutland
cb86a41b35 arm64: stacktrace: delete PCS comment
The comment at the top of stacktrace.c isn't all that helpful, as it's
not associated with the code which inspects the frame record, and the
code example isn't representative of common code generation today.

Delete it.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 15:33:13 +01:00
Madhavan T. Venkataraman
4f6277e8ac arm64: stacktrace: remove NULL task check from unwind_frame()
Currently, there is a check for a NULL task in unwind_frame(). It is not
needed since all current callers pass a non-NULL task.

There should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com> for the series.
Link: https://lore.kernel.org/r/20220413145910.3060139-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-22 15:33:13 +01:00
Masami Hiramatsu
1e0924bd09 arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOL
Mark the start_backtrace() as notrace and NOKPROBE_SYMBOL
because this function is called from ftrace and lockdep to
get the caller address via return_address(). The lockdep
is used in kprobes, it should also be NOKPROBE_SYMBOL.

Fixes: b07f349966 ("arm64: stacktrace: Move start_backtrace() out of the header")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/164301227374.1433152.12808232644267107415.stgit@devnote2
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-24 14:25:20 +00:00
Mark Rutland
d2d1d2645c arm64: Make some stacktrace functions private
Now that open-coded stack unwinds have been converted to
arch_stack_walk(), we no longer need to expose any of unwind_frame(),
walk_stackframe(), or start_backtrace() outside of stacktrace.c.

Make those functions private to stacktrace.c, removing their prototypes
from <asm/stacktrace.h> and marking them static.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Link: https://lore.kernel.org/r/20211129142849.3056714-10-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-10 14:06:04 +00:00
Madhavan T. Venkataraman
2dad6dc17b arm64: Make dump_backtrace() use arch_stack_walk()
To enable RELIABLE_STACKTRACE and LIVEPATCH on arm64, we need to
substantially rework arm64's unwinding code. As part of this, we want to
minimize the set of unwind interfaces we expose, and avoid open-coding
of unwind logic.

Currently, dump_backtrace() walks the stack of the current task or a
blocked task by calling stact_backtrace() and iterating unwind steps
using unwind_frame(). This can be written more simply in terms of
arch_stack_walk(), considering three distinct cases:

1) When unwinding a blocked task, start_backtrace() is called with the
   blocked task's saved PC and FP, and the unwind proceeds immediately
   from this point without skipping any entries. This is functionally
   equivalent to calling arch_stack_walk() with the blocked task, which
   will start with the task's saved PC and FP.

   There is no functional change to this case.

2) When unwinding the current task without regs, start_backtrace() is
   called with dump_backtrace() as the PC and __builtin_frame_address(0)
   as the next frame, and the unwind proceeds immediately without
   skipping. This is *almost* functionally equivalent to calling
   arch_stack_walk() for the current task, which will start with its
   caller (i.e. an offset into dump_backtrace()) as the PC, and the
   callers frame record as the next frame.

   The only difference being that dump_backtrace() will be reported with
   an offset (which is strictly more correct than currently). Otherwise
   there is no functional cahnge to this case.

3) When unwinding the current task with regs, start_backtrace() is
   called with dump_backtrace() as the PC and __builtin_frame_address(0)
   as the next frame, and the unwind is performed silently until the
   next frame is the frame pointed to by regs->fp. Reporting starts
   from regs->pc and continues from the frame in regs->fp.

   Historically, this pre-unwind was necessary to correctly record
   return addresses rewritten by the ftrace graph calller, but this is
   no longer necessary as these are now recovered using the FP since
   commit:

   c6d3cd32fd ("arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR")

   This pre-unwind is not necessary to recover return addresses
   rewritten by kretprobes, which historically were not recovered, and
   are now recovered using the FP since commit:

   cd9bc2c925 ("arm64: Recover kretprobe modified return address in stacktrace")

   Thus, this is functionally equivalent to calling arch_stack_walk()
   with the current task and regs, which will start with regs->pc as the
   PC and regs->fp as the next frame, without a pre-unwind.

This patch makes dump_backtrace() use arch_stack_walk(). This simplifies
dump_backtrace() and will permit subsequent changes to the unwind code.

Aside from the improved reporting when unwinding current without regs,
there should be no functional change as a result of this patch.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
[Mark: elaborate commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211129142849.3056714-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-10 14:06:04 +00:00
Peter Zijlstra
1614b2b11f arch: Make ARCH_STACKWALK independent of STACKTRACE
Make arch_stack_walk() available for ARCH_STACKWALK architectures
without it being entangled in STACKTRACE.

Link: https://lore.kernel.org/lkml/20211022152104.356586621@infradead.org/
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[Mark: rebase, drop unnecessary arm change]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20211129142849.3056714-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-12-10 14:06:03 +00:00
Mark Rutland
c6d3cd32fd arm64: ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
When CONFIG_FUNCTION_GRAPH_TRACER is selected and the function graph
tracer is in use, unwind_frame() may erroneously associate a traced
function with an incorrect return address. This can happen when starting
an unwind from a pt_regs, or when unwinding across an exception
boundary.

This can be seen when recording with perf while the function graph
tracer is in use. For example:

| # echo function_graph > /sys/kernel/debug/tracing/current_tracer
| # perf record -g -e raw_syscalls:sys_enter:k /bin/true
| # perf report

... reports the callchain erroneously as:

| el0t_64_sync
| el0t_64_sync_handler
| el0_svc_common.constprop.0
| perf_callchain
| get_perf_callchain
| syscall_trace_enter
| syscall_trace_enter

... whereas when the function graph tracer is not in use, it reports:

| el0t_64_sync
| el0t_64_sync_handler
| el0_svc
| do_el0_svc
| el0_svc_common.constprop.0
| syscall_trace_enter
| syscall_trace_enter

The underlying problem is that ftrace_graph_get_ret_stack() takes an
index offset from the most recent entry added to the fgraph return
stack. We start an unwind at offset 0, and increment the offset each
time we encounter a rewritten return address (i.e. when we see
`return_to_handler`). This is broken in two cases:

1) Between creating a pt_regs and starting the unwind, function calls
   may place entries on the stack, leaving an arbitrary offset which we
   can only determine by performing a full unwind from the caller of the
   unwind code (and relying on none of the unwind code being
   instrumented).

   This can result in erroneous entries being reported in a backtrace
   recorded by perf or kfence when the function graph tracer is in use.
   Currently show_regs() is unaffected as dump_backtrace() performs an
   initial unwind.

2) When unwinding across an exception boundary (whether continuing an
   unwind or starting a new unwind from regs), we currently always skip
   the LR of the interrupted context. Where this was live and contained
   a rewritten address, we won't consume the corresponding fgraph ret
   stack entry, leaving subsequent entries off-by-one.

   This can result in erroneous entries being reported in a backtrace
   performed by any in-kernel unwinder when that backtrace crosses an
   exception boundary, with entries after the boundary being reported
   incorrectly. This includes perf, kfence, show_regs(), panic(), etc.

To fix this, we need to be able to uniquely identify each rewritten
return address such that we can map this back to the original return
address. We can use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR to associate
each rewritten return address with a unique location on the stack. As
the return address is passed in the LR (and so is not guaranteed a
unique location in memory), we use the FP upon entry to the function
(i.e. the address of the caller's frame record) as the return address
pointer. Any nested call will have a different FP value as the caller
must create its own frame record and update FP to point to this.

Since ftrace_graph_ret_addr() requires the return address with the PAC
stripped, the stripping of the PAC is moved before the fixup of the
rewritten address. As we would unconditionally strip the PAC, moving
this earlier is not harmful, and we can avoid a redundant strip in the
return address fixup code.

I've tested this with the perf case above, the ftrace selftests, and
a number of ad-hoc unwinder tests. The tests all pass, and I have seen
no unexpected behaviour as a result of this change. I've tested with
pointer authentication under QEMU TCG where magic-sysrq+l correctly
recovers the original return addresses.

Note that this doesn't fix the issue of skipping a live LR at an
exception boundary, which is a more general problem and requires more
substantial rework. Were we to consume the LR in all cases this would
result in warnings where the interrupted context's LR contains
`return_to_handler`, but the FP has been altered, e.g.

| func:
|	<--- ftrace entry ---> 	// logs FP & LR, rewrites LR
| 	STP	FP, LR, [SP, #-16]!
| 	MOV	FP, SP
| 	<--- INTERRUPT --->

... as ftrace_graph_get_ret_stack() fill not find a matching entry,
triggering the WARN_ON_ONCE() in unwind_frame().

Link: https://lore.kernel.org/r/20211025164925.GB2001@C02TD0UTHF1T.local
Link: https://lore.kernel.org/r/20211027132529.30027-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211029162245.39761-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-11-16 09:47:54 +00:00
Masami Hiramatsu
cd9bc2c925 arm64: Recover kretprobe modified return address in stacktrace
Since the kretprobe replaces the function return address with
the kretprobe_trampoline on the stack, stack unwinder shows it
instead of the correct return address.

This checks whether the next return address is the
__kretprobe_trampoline(), and if so, try to find the correct
return address from the kretprobe instance list. For this purpose
this adds 'kr_cur' loop cursor to memorize the current kretprobe
instance.

With this fix, now arm64 can enable
CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE, and pass the
kprobe self tests.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-22 12:16:53 -04:00
Mark Rutland
0c32706dac arm64: stacktrace: avoid tracing arch_stack_walk()
When the function_graph tracer is in use, arch_stack_walk() may unwind
the stack incorrectly, erroneously reporting itself, missing the final
entry which is being traced, and reporting all traced entries between
these off-by-one from where they should be.

When ftrace hooks a function return, the original return address is
saved to the fgraph ret_stack, and the return address  in the LR (or the
function's frame record) is replaced with `return_to_handler`.

When arm64's unwinder encounter frames returning to `return_to_handler`,
it finds the associated original return address from the fgraph ret
stack, assuming the most recent `ret_to_hander` entry on the stack
corresponds to the most recent entry in the fgraph ret stack, and so on.

When arch_stack_walk() is used to dump the current task's stack, it
starts from the caller of arch_stack_walk(). However, arch_stack_walk()
can be traced, and so may push an entry on to the fgraph ret stack,
leaving the fgraph ret stack offset by one from the expected position.

This can be seen when dumping the stack via /proc/self/stack, where
enabling the graph tracer results in an unexpected
`stack_trace_save_tsk` entry at the start of the trace, and `el0_svc`
missing form the end of the trace.

This patch fixes this by marking arch_stack_walk() as notrace, as we do
for all other functions on the path to ftrace_graph_get_ret_stack().
While a few helper functions are not marked notrace, their calls/returns
are balanced, and will have no observable effect when examining the
fgraph ret stack.

It is possible for an exeption boundary to cause a similar offset if the
return address of the interrupted context was in the LR. Fixing those
cases will require some more substantial rework, and is left for
subsequent patches.

Before:

| # cat /proc/self/stack
| [<0>] proc_pid_stack+0xc4/0x140
| [<0>] proc_single_show+0x6c/0x120
| [<0>] seq_read_iter+0x240/0x4e0
| [<0>] seq_read+0xe8/0x140
| [<0>] vfs_read+0xb8/0x1e4
| [<0>] ksys_read+0x74/0x100
| [<0>] __arm64_sys_read+0x28/0x3c
| [<0>] invoke_syscall+0x50/0x120
| [<0>] el0_svc_common.constprop.0+0xc4/0xd4
| [<0>] do_el0_svc+0x30/0x9c
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0t_64_sync_handler+0x1a8/0x1b0
| [<0>] el0t_64_sync+0x198/0x19c
| # echo function_graph > /sys/kernel/tracing/current_tracer
| # cat /proc/self/stack
| [<0>] stack_trace_save_tsk+0xa4/0x110
| [<0>] proc_pid_stack+0xc4/0x140
| [<0>] proc_single_show+0x6c/0x120
| [<0>] seq_read_iter+0x240/0x4e0
| [<0>] seq_read+0xe8/0x140
| [<0>] vfs_read+0xb8/0x1e4
| [<0>] ksys_read+0x74/0x100
| [<0>] __arm64_sys_read+0x28/0x3c
| [<0>] invoke_syscall+0x50/0x120
| [<0>] el0_svc_common.constprop.0+0xc4/0xd4
| [<0>] do_el0_svc+0x30/0x9c
| [<0>] el0t_64_sync_handler+0x1a8/0x1b0
| [<0>] el0t_64_sync+0x198/0x19c

After:

| # cat /proc/self/stack
| [<0>] proc_pid_stack+0xc4/0x140
| [<0>] proc_single_show+0x6c/0x120
| [<0>] seq_read_iter+0x240/0x4e0
| [<0>] seq_read+0xe8/0x140
| [<0>] vfs_read+0xb8/0x1e4
| [<0>] ksys_read+0x74/0x100
| [<0>] __arm64_sys_read+0x28/0x3c
| [<0>] invoke_syscall+0x50/0x120
| [<0>] el0_svc_common.constprop.0+0xc4/0xd4
| [<0>] do_el0_svc+0x30/0x9c
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0t_64_sync_handler+0x1a8/0x1b0
| [<0>] el0t_64_sync+0x198/0x19c
| # echo function_graph > /sys/kernel/tracing/current_tracer
| # cat /proc/self/stack
| [<0>] proc_pid_stack+0xc4/0x140
| [<0>] proc_single_show+0x6c/0x120
| [<0>] seq_read_iter+0x240/0x4e0
| [<0>] seq_read+0xe8/0x140
| [<0>] vfs_read+0xb8/0x1e4
| [<0>] ksys_read+0x74/0x100
| [<0>] __arm64_sys_read+0x28/0x3c
| [<0>] invoke_syscall+0x50/0x120
| [<0>] el0_svc_common.constprop.0+0xc4/0xd4
| [<0>] do_el0_svc+0x30/0x9c
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0t_64_sync_handler+0x1a8/0x1b0
| [<0>] el0t_64_sync+0x198/0x19c

Cc: <stable@vger.kernel.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviwed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210802164845.45506-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-08-03 10:39:35 +01:00
Stephen Boyd
f61b870607 arm64: stacktrace: use %pSb for backtrace printing
Let's use the new printk format to print the stacktrace entry when
printing a backtrace to the kernel logs. This will include any module's
build ID[1] in it so that offline/crash debugging can easily locate the
debuginfo for a module via something like debuginfod[2].

Link: https://lkml.kernel.org/r/20210511003845.2429846-7-swboyd@chromium.org
Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1]
Link: https://sourceware.org/elfutils/Debuginfod.html [2]
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Evan Green <evgreen@chromium.org>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-07-08 11:48:22 -07:00
Peter Collingbourne
33c222aeda arm64: stacktrace: Relax frame record alignment requirement to 8 bytes
The AAPCS places no requirements on the alignment of the frame
record. In theory it could be placed anywhere, although it seems
sensible to require it to be aligned to 8 bytes. With an upcoming
enhancement to tag-based KASAN Clang will begin creating frame records
located at an address that is only aligned to 8 bytes. Accommodate
such frame records in the stack unwinding code.

As pointed out by Mark Rutland, the userspace stack unwinding code
has the same problem, so fix it there as well.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/Ia22c375230e67ca055e9e4bb639383567f7ad268
Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210526174927.2477847-2-pcc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26 20:01:17 +01:00
Peter Collingbourne
76734d26b5 arm64: Change the on_*stack functions to take a size argument
unwind_frame() was previously implicitly checking that the frame
record is in bounds of the stack by enforcing that FP is both aligned
to 16 and in bounds of the stack. Once the FP alignment requirement
is relaxed to 8 this will not be sufficient because it does not
account for the case where FP points to 8 bytes before the end of the
stack.

Make the check explicit by changing the on_*stack functions to take a
size argument and adjusting the callers to pass the appropriate sizes.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/Ib7a3eb3eea41b0687ffaba045ceb2012d077d8b4
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210526174927.2477847-1-pcc@google.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26 20:01:17 +01:00
Madhavan T. Venkataraman
7d7b720a4b arm64: Implement stack trace termination record
Reliable stacktracing requires that we identify when a stacktrace is
terminated early. We can do this by ensuring all tasks have a final
frame record at a known location on their task stack, and checking
that this is the final frame record in the chain.

We'd like to use task_pt_regs(task)->stackframe as the final frame
record, as this is already setup upon exception entry from EL0. For
kernel tasks we need to consistently reserve the pt_regs and point x29
at this, which we can do with small changes to __primary_switched,
__secondary_switched, and copy_process().

Since the final frame record must be at a specific location, we must
create the final frame record in __primary_switched and
__secondary_switched rather than leaving this to start_kernel and
secondary_start_kernel. Thus, __primary_switched and
__secondary_switched will now show up in stacktraces for the idle tasks.

Since the final frame record is now identified by its location rather
than by its contents, we identify it at the start of unwind_frame(),
before we read any values from it.

External debuggers may terminate the stack trace when FP == 0. In the
pt_regs->stackframe, the PC is 0 as well. So, stack traces taken in the
debugger may print an extra record 0x0 at the end. While this is not
pretty, this does not do any harm. This is a small price to pay for
having reliable stack trace termination in the kernel. That said, gdb
does not show the extra record probably because it uses DWARF and not
frame pointers for stack traces.

Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
[Mark: rebase, use ASM_BUG(), update comments, update commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210510110026.18061-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25 18:53:29 +01:00
Linus Torvalds
51595e3b49 Assorted arm64 fixes and clean-ups, the most important:
- Restore terminal stack frame records. Their previous removal caused
   traces which cross secondary_start_kernel to terminate one entry too
   late, with a spurious "0" entry.
 
 - Fix boot warning with pseudo-NMI due to the way we manipulate the PMR
   register.
 
 - ACPI fixes: avoid corruption of interrupt mappings on watchdog probe
   failure (GTDT), prevent unregistering of GIC SGIs.
 
 - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with having
   to test all the other combinations.
 
 - Documentation fixes and updates: tagged address ABI exceptions on
   brk/mmap/mremap(), event stream frequency, update booting requirements
   on the configuration of traps.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmCVba0ACgkQa9axLQDI
 XvEClxAAsqigp+Mnotdr8YUOuXLjHWU41EMShV6WbFcmlViEyZxxtZ5qavw19T3L
 rPxb8hq9QqI8kCd+j4MAU7cdc0ry+047njJmQ3Va0WeiDsbgEfPvLWPguDbeDFXW
 EjKKib+F/u58IffDkn6rVA7ZVPgYHRH+8yw6EdApp0BN4JuxEFzGBzG4EWKXnNHH
 IOu4IIXlbLX+U1kTtUFR4u6i4uBs2pZdEYzo1NF/Joacg14F01CBRuh8U04eeWFD
 HF4pWd4eCl/bLYPurF1rOi1dIUyrPuaPgNInGEdSaocD0hIvQH0r0wyIt+aMmqvK
 9Jm+dDEGeLxQn2nDrXfyldYG5EbFa3OplkUt2MVDDMWwN2Gpsjlnf/ucff/SBT/N
 7D6AL2OH6KDDCsNgU1JH9H6rAlh4nWJcsMBrWmP7aQtBMRyccQLywrt4HXB8cy7E
 +MyhTit05P3lpsrK2uZSFujK35Ts8hxywA7lAlU7YP4ADKu3Noc6qXSaxZRe+1Gb
 O5k3Qdcih0VLE843PjJj8f8fW1ysJW5J60cK9BaZxpB77gNufKkh/hS6YAiA8qkt
 PT3J0jk/cgGvwKK54rW52dG7qvDImgUMGkXGKQnEimgb62DatCZ4ZOPC+UoiDiqO
 SEd1DSW0Lt1VxVIulAjatVgzIJGM0jGCm9L7/vBguR0+Lahakbg=
 =vYok
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Catalin Marinas:
 "A mix of fixes and clean-ups that turned up too late for the first
  pull request:

   - Restore terminal stack frame records. Their previous removal caused
     traces which cross secondary_start_kernel to terminate one entry
     too late, with a spurious "0" entry.

   - Fix boot warning with pseudo-NMI due to the way we manipulate the
     PMR register.

   - ACPI fixes: avoid corruption of interrupt mappings on watchdog
     probe failure (GTDT), prevent unregistering of GIC SGIs.

   - Force SPARSEMEM_VMEMMAP as the only memory model, it saves with
     having to test all the other combinations.

   - Documentation fixes and updates: tagged address ABI exceptions on
     brk/mmap/mremap(), event stream frequency, update booting
     requirements on the configuration of traps"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kernel: Update the stale comment
  arm64: Fix the documented event stream frequency
  arm64: entry: always set GIC_PRIO_PSR_I_SET during entry
  arm64: Explicitly document boot requirements for SVE
  arm64: Explicitly require that FPSIMD instructions do not trap
  arm64: Relax booting requirements for configuration of traps
  arm64: cpufeatures: use min and max
  arm64: stacktrace: restore terminal records
  arm64/vdso: Discard .note.gnu.property sections in vDSO
  arm64: doc: Add brk/mmap/mremap() to the Tagged Address ABI Exceptions
  psci: Remove unneeded semicolon
  ACPI: irq: Prevent unregistering of GIC SGIs
  ACPI: GTDT: Don't corrupt interrupt mappings on watchdow probe failure
  arm64: Show three registers per line
  arm64: remove HAVE_DEBUG_BUGVERBOSE
  arm64: alternative: simplify passing alt_region
  arm64: Force SPARSEMEM_VMEMMAP as the only memory management model
  arm64: vdso32: drop -no-integrated-as flag
2021-05-07 12:11:05 -07:00
Mark Rutland
8533d5bfad arm64: stacktrace: restore terminal records
We removed the terminal frame records in commit:

   6106e1112c ("arm64: remove EL0 exception frame record")

... on the assumption that as we no longer used them to find the pt_regs
at exception boundaries, they were no longer necessary.

However, Leo reports that as an unintended side-effect, this causes
traces which cross secondary_start_kernel to terminate one entry too
late, with a spurious "0" entry.

There are a few ways we could sovle this, but as we're planning to use
terminal records for RELIABLE_STACKTRACE, let's revert the logic change
for now, keeping the update comments and accounting for the changes in
commit:

  3c02600144 ("arm64: stacktrace: Report when we reach the end of the stack")

This is effectively a partial revert of commit:

  6106e1112c ("arm64: remove EL0 exception frame record")

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 6106e1112c ("arm64: remove EL0 exception frame record")
Reported-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210429104813.GA33550@C02TD0UTHF1T.local
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-30 18:30:45 +01:00
Linus Torvalds
31a24ae89c arm64 updates for 5.13:
- MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not allow
   precise identification of the illegal access.
 
 - Run kernel mode SIMD with softirqs disabled. This allows using NEON in
   softirq context for crypto performance improvements. The conditional
   yield support is modified to take softirqs into account and reduce the
   latency.
 
 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.
 
 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers, new
   functions for the HiSilicon HHA and L3C PMU, cleanups.
 
 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.
 
 - Disable fine-grained traps at boot and improve the documented boot
   requirements.
 
 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).
 
 - Add hierarchical eXecute Never permissions for all page tables.
 
 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.
 
 - arm64 kselftests for BTI and some improvements to the MTE tests.
 
 - Minor improvements to the compat vdso and sigpage.
 
 - Miscellaneous cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmB5xkkACgkQa9axLQDI
 XvEBgRAAsr6r8gsBQJP3FDHmbtbVf2ej5QJTCOAQAGHbTt0JH7Pk03pWSBr7h5nF
 vsddRDxxeDgB6xd7jWP7EvDaPxHeB0CdSj5gG8EP/ZdOm8sFAwB1ZIHWikgUgSwW
 nu6R28yXTMSj+EkyFtahMhTMJ1EMF4sCPuIgAo59ST5w/UMMqLCJByOu4ej6RPKZ
 aeSJJWaDLBmbgnTKWxRvCc/MgIx4J/LAHWGkdpGjuMK6SLp38Kdf86XcrklXtzwf
 K30ZYeoKq8zZ+nFOsK9gBVlOlocZcbS1jEbN842jD6imb6vKLQtBWrKk9A6o4v5E
 XulORWcSBhkZb3ItIU9+6SmelUExf0VeVlSp657QXYPgquoIIGvFl6rCwhrdGMGO
 bi6NZKCfJvcFZJoIN1oyhuHejgZSBnzGEcvhvzNdg7ItvOCed7q3uXcGHz/OI6tL
 2TZKddzHSEMVfTo0D+RUsYfasZHI1qAiQ0mWVC31c+YHuRuW/K/jlc3a5TXlSBUa
 Dwu0/zzMLiqx65ISx9i7XNMrngk55uzrS6MnwSByPoz4M4xsElZxt3cbUxQ8YAQz
 jhxTHs1Pwes8i7f4n61ay/nHCFbmVvN/LlsPRpZdwd8JumThLrDolF3tc6aaY0xO
 hOssKtnGY4Xvh/WitfJ5uvDb1vMObJKTXQEoZEJh4hlNQDxdeUE=
 =6NGI
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not
   allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON
   in softirq context for crypto performance improvements. The
   conditional yield support is modified to take softirqs into account
   and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers,
   new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.

 - Disable fine-grained traps at boot and improve the documented boot
   requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
2021-04-26 10:25:03 -07:00
Mark Brown
b07f349966 arm64: stacktrace: Move start_backtrace() out of the header
Currently start_backtrace() is a static inline function in the header.
Since it really shouldn't be sufficiently performance critical that we
actually need to have it inlined move it into a C file, this will save
anyone else scratching their head about why it is defined in the header.
As far as I can see it's only there because it was factored out of the
various callers.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20210319174022.33051-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-03-28 18:09:47 +01:00
Mark Rutland
c607ab4f91 arm64: stacktrace: don't trace arch_stack_walk()
We recently converted arm64 to use arch_stack_walk() in commit:

  5fc57df2f6 ("arm64: stacktrace: Convert to ARCH_STACKWALK")

The core stacktrace code expects that (when tracing the current task)
arch_stack_walk() starts a trace at its caller, and does not include
itself in the trace. However, arm64's arch_stack_walk() includes itself,
and so traces include one more entry than callers expect. The core
stacktrace code which calls arch_stack_walk() tries to skip a number of
entries to prevent itself appearing in a trace, and the additional entry
prevents skipping one of the core stacktrace functions, leaving this in
the trace unexpectedly.

We can fix this by having arm64's arch_stack_walk() begin the trace with
its caller. The first value returned by the trace will be
__builtin_return_address(0), i.e. the caller of arch_stack_walk(). The
first frame record to be unwound will be __builtin_frame_address(1),
i.e. the caller's frame record. To prevent surprises, arch_stack_walk()
is also marked noinline.

While __builtin_frame_address(1) is not safe in portable code, local GCC
developers have confirmed that it is safe on arm64. To find the caller's
frame record, the builtin can safely dereference the current function's
frame record or (in theory) could stash the original FP into another GPR
at function entry time, neither of which are problematic.

Prior to this patch, the tracing code would unexpectedly show up in
traces of the current task, e.g.

| # cat /proc/self/stack
| [<0>] stack_trace_save_tsk+0x98/0x100
| [<0>] proc_pid_stack+0xb4/0x130
| [<0>] proc_single_show+0x60/0x110
| [<0>] seq_read_iter+0x230/0x4d0
| [<0>] seq_read+0xdc/0x130
| [<0>] vfs_read+0xac/0x1e0
| [<0>] ksys_read+0x6c/0xfc
| [<0>] __arm64_sys_read+0x20/0x30
| [<0>] el0_svc_common.constprop.0+0x60/0x120
| [<0>] do_el0_svc+0x24/0x90
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0_sync_handler+0x1a4/0x1b0
| [<0>] el0_sync+0x170/0x180

After this patch, the tracing code will not show up in such traces:

| # cat /proc/self/stack
| [<0>] proc_pid_stack+0xb4/0x130
| [<0>] proc_single_show+0x60/0x110
| [<0>] seq_read_iter+0x230/0x4d0
| [<0>] seq_read+0xdc/0x130
| [<0>] vfs_read+0xac/0x1e0
| [<0>] ksys_read+0x6c/0xfc
| [<0>] __arm64_sys_read+0x20/0x30
| [<0>] el0_svc_common.constprop.0+0x60/0x120
| [<0>] do_el0_svc+0x24/0x90
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0_sync_handler+0x1a4/0x1b0
| [<0>] el0_sync+0x170/0x180

Erring on the side of caution, I've given this a spin with a bunch of
toolchains, verifying the output of /proc/self/stack and checking that
the assembly looked sound. For GCC (where we require version 5.1.0 or
later) I tested with the kernel.org crosstool binares for versions
5.5.0, 6.4.0, 6.5.0, 7.3.0, 7.5.0, 8.1.0, 8.3.0, 8.4.0, 9.2.0, and
10.1.0. For clang (where we require version 10.0.1 or later) I tested
with the llvm.org binary releases of 11.0.0, and 11.0.1.

Fixes: 5fc57df2f6 ("arm64: stacktrace: Convert to ARCH_STACKWALK")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org> # 5.10.x
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210319184106.5688-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-22 12:42:49 +00:00