linux-stable/arch/arm/mach-mvebu/coherency_ll.S

160 lines
4.0 KiB
ArmAsm
Raw Normal View History

/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Coherency fabric: low level functions
*
* Copyright (C) 2012 Marvell
*
* Gregory CLEMENT <gregory.clement@free-electrons.com>
*
* This file implements the assembly function to add a CPU to the
* coherency fabric. This function is called by each of the secondary
* CPUs during their early boot in an SMP kernel, this why this
* function have to callable from assembly. It can also be called by a
* primary CPU from C code during its boot.
*/
#include <linux/linkage.h>
#define ARMADA_XP_CFB_CTL_REG_OFFSET 0x0
#define ARMADA_XP_CFB_CFG_REG_OFFSET 0x4
#include <asm/assembler.h>
#include <asm/cp15.h>
ARM: 9263/1: use .arch directives instead of assembler command line flags Similar to commit a6c30873ee4a ("ARM: 8989/1: use .fpu assembler directives instead of assembler arguments"). GCC and GNU binutils support setting the "sub arch" via -march=, -Wa,-march, target function attribute, and .arch assembler directive. Clang was missing support for -Wa,-march=, but this was implemented in clang-13. The behavior of both GCC and Clang is to prefer -Wa,-march= over -march= for assembler and assembler-with-cpp sources, but Clang will warn about the -march= being unused. clang: warning: argument unused during compilation: '-march=armv6k' [-Wunused-command-line-argument] Since most assembler is non-conditionally assembled with one sub arch (modulo arch/arm/delay-loop.S which conditionally is assembled as armv4 based on CONFIG_ARCH_RPC, and arch/arm/mach-at91/pm-suspend.S which is conditionally assembled as armv7-a based on CONFIG_CPU_V7), prefer the .arch assembler directive. Add a few more instances found in compile testing as found by Arnd and Nathan. Link: https://github.com/llvm/llvm-project/commit/1d51c699b9e2ebc5bcfdbe85c74cc871426333d4 Link: https://bugs.llvm.org/show_bug.cgi?id=48894 Link: https://github.com/ClangBuiltLinux/linux/issues/1195 Link: https://github.com/ClangBuiltLinux/linux/issues/1315 Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-24 19:44:41 +00:00
.arch armv7-a
.text
/*
* Returns the coherency base address in r1 (r0 is untouched), or 0 if
* the coherency fabric is not enabled.
*/
ENTRY(ll_get_coherency_base)
mrc p15, 0, r1, c1, c0, 0
tst r1, #CR_M @ Check MMU bit enabled
bne 1f
/*
* MMU is disabled, use the physical address of the coherency
* base address, (or 0x0 if the coherency fabric is not mapped)
*/
adr r1, 3f
ldr r3, [r1]
ldr r1, [r1, r3]
b 2f
1:
/*
* MMU is enabled, use the virtual address of the coherency
* base address.
*/
ldr r1, =coherency_base
ldr r1, [r1]
2:
ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-06-30 15:29:12 +00:00
ret lr
ENDPROC(ll_get_coherency_base)
/*
* Returns the coherency CPU mask in r3 (r0 is untouched). This
* coherency CPU mask can be used with the coherency fabric
* configuration and control registers. Note that the mask is already
* endian-swapped as appropriate so that the calling functions do not
* have to care about endianness issues while accessing the coherency
* fabric registers
*/
ENTRY(ll_get_coherency_cpumask)
mrc p15, 0, r3, cr0, cr0, 5
and r3, r3, #15
mov r2, #(1 << 24)
lsl r3, r2, r3
ARM_BE8(rev r3, r3)
ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-06-30 15:29:12 +00:00
ret lr
ENDPROC(ll_get_coherency_cpumask)
/*
* ll_add_cpu_to_smp_group(), ll_enable_coherency() and
* ll_disable_coherency() use the strex/ldrex instructions while the
* MMU can be disabled. The Armada XP SoC has an exclusive monitor
* that tracks transactions to Device and/or SO memory and thanks to
* that, exclusive transactions are functional even when the MMU is
* disabled.
*/
ENTRY(ll_add_cpu_to_smp_group)
/*
* As r0 is not modified by ll_get_coherency_base() and
* ll_get_coherency_cpumask(), we use it to temporarly save lr
* and avoid it being modified by the branch and link
* calls. This function is used very early in the secondary
* CPU boot, and no stack is available at this point.
*/
mov r0, lr
bl ll_get_coherency_base
/* Bail out if the coherency is not enabled */
cmp r1, #0
reteq r0
bl ll_get_coherency_cpumask
mov lr, r0
add r0, r1, #ARMADA_XP_CFB_CFG_REG_OFFSET
1:
ldrex r2, [r0]
orr r2, r2, r3
strex r1, r2, [r0]
cmp r1, #0
bne 1b
ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-06-30 15:29:12 +00:00
ret lr
ENDPROC(ll_add_cpu_to_smp_group)
ENTRY(ll_enable_coherency)
/*
* As r0 is not modified by ll_get_coherency_base() and
* ll_get_coherency_cpumask(), we use it to temporarly save lr
* and avoid it being modified by the branch and link
* calls. This function is used very early in the secondary
* CPU boot, and no stack is available at this point.
*/
mov r0, lr
bl ll_get_coherency_base
/* Bail out if the coherency is not enabled */
cmp r1, #0
reteq r0
bl ll_get_coherency_cpumask
mov lr, r0
add r0, r1, #ARMADA_XP_CFB_CTL_REG_OFFSET
1:
ldrex r2, [r0]
orr r2, r2, r3
strex r1, r2, [r0]
cmp r1, #0
bne 1b
dsb
mov r0, #0
ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-06-30 15:29:12 +00:00
ret lr
ENDPROC(ll_enable_coherency)
ENTRY(ll_disable_coherency)
/*
* As r0 is not modified by ll_get_coherency_base() and
* ll_get_coherency_cpumask(), we use it to temporarly save lr
* and avoid it being modified by the branch and link
* calls. This function is used very early in the secondary
* CPU boot, and no stack is available at this point.
*/
mov r0, lr
bl ll_get_coherency_base
/* Bail out if the coherency is not enabled */
cmp r1, #0
reteq r0
bl ll_get_coherency_cpumask
mov lr, r0
add r0, r1, #ARMADA_XP_CFB_CTL_REG_OFFSET
1:
ldrex r2, [r0]
bic r2, r2, r3
strex r1, r2, [r0]
cmp r1, #0
bne 1b
dsb
ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ ARMv6 and greater introduced a new instruction ("bx") which can be used to return from function calls. Recent CPUs perform better when the "bx lr" instruction is used rather than the "mov pc, lr" instruction, and this sequence is strongly recommended to be used by the ARM architecture manual (section A.4.1.1). We provide a new macro "ret" with all its variants for the condition code which will resolve to the appropriate instruction. Rather than doing this piecemeal, and miss some instances, change all the "mov pc" instances to use the new macro, with the exception of the "movs" instruction and the kprobes code. This allows us to detect the "mov pc, lr" case and fix it up - and also gives us the possibility of deploying this for other registers depending on the CPU selection. Reported-by: Will Deacon <will.deacon@arm.com> Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1 Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood Tested-by: Shawn Guo <shawn.guo@freescale.com> Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385 Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-06-30 15:29:12 +00:00
ret lr
ENDPROC(ll_disable_coherency)
.align 2
3:
.long coherency_phys_base - .