Commit Graph

49 Commits

Author SHA1 Message Date
Ard Biesheuvel 1f640552d9 ARM: cacheflush: avoid clobbering the frame pointer
Thumb2 uses R7 rather than R11 as the frame pointer, and even if we
rarely use a frame pointer to begin with when building in Thumb2 mode,
there are cases where it is required by the compiler (Clang when
inserting profiling hooks via -pg)

However, preserving and restoring the frame pointer is risky, as any
unhandled exceptions raised in the mean time will produce a bogus
backtrace, and it would be better not to touch the frame pointer at all.
This is the case even when CONFIG_FRAME_POINTER is not set, as the
unwind directive used by the unwinder may also use R7 or R11 as the
unwind anchor, even if the frame pointer is not managed strictly
according to the frame pointer ABI.

So let's tweak the cacheflush asm code not to clobber R7 or R11 at all,
so that we can drop R7 from the clobber lists of the inline asm blocks
that call these routines, and remove the code that preserves/restores
R11.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-02-09 10:13:10 +01:00
Ard Biesheuvel 95731b8ee6 ARM: 9059/1: cache-v7: get rid of mini-stack
Now that we have reduced the number of registers that we need to
preserve when calling v7_invalidate_l1 from the boot code, we can use
scratch registers to preserve the remaining ones, and get rid of the
mini stack entirely. This works around any issues regarding cache
behavior in relation to the uncached accesses to this memory, which is
hard to get right in the general case (i.e., both bare metal and under
virtualization)

While at it, switch v7_invalidate_l1 to using ip as a scratch register
instead of r4. This makes the function AAPCS compliant, and removes the
need to stash r4 in ip across the call.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-03-09 10:25:18 +00:00
Ard Biesheuvel f9e7a99fb6 ARM: 9058/1: cache-v7: refactor v7_invalidate_l1 to avoid clobbering r5/r6
The cache invalidation code in v7_invalidate_l1 can be tweaked to
re-read the associativity from CCSIDR, and keep the way identifier
component in a single register that is assigned in the outer loop. This
way, we need 2 registers less.

Given that the number of sets is typically much larger than the
associativity, rearrange the code so that the outer loop has the fewer
number of iterations, ensuring that the re-read of CCSIDR only occurs a
handful of times in practice.

Fix the whitespace while at it, and update the comment to indicate that
this code is no longer a clone of anything else.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-03-09 10:25:18 +00:00
Ard Biesheuvel c0e50736e8 ARM: 9057/1: cache-v7: add missing ISB after cache level selection
A write to CSSELR needs to complete before its results can be observed
via CCSIDR. So add a ISB to ensure that this is the case.

Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-03-09 10:25:17 +00:00
Thomas Gleixner e7289c6de8 sched/rt, ARM: Use CONFIG_PREEMPTION
CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT.
Both PREEMPT and PREEMPT_RT require the same functionality which today
depends on CONFIG_PREEMPT.

Switch the entry code, cache over to use CONFIG_PREEMPTION and add output
in show_stack() for PREEMPT_RT.

[bigeasy: +traps.c]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20191015191821.11479-2-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-12-08 14:37:32 +01:00
Linus Torvalds 2b49350b16 ARM updates:
- Add a "cut here" to make it clearer where oops dumps should be cut
   from - we already have a marker for the end of the dumps.
 - Add logging severity to show_pte()
 - Drop unnecessary common-page-size linker flag
 - Errata workarounds for Cortex A12 857271, Cortex A17 857272 and
   Cortex A7 814220.
 - Remove some unused variables that had started to provoke a compiler
   warning.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAXSMct/TnkBvkraxkAQIpKBAAoE+5ZLeBanLZ6GMUpIMW+wVfVYaI4p8g
 kZd0Dm2EB5Y9x9kfYwCLmdUp5A/Lh3dmAkAGHSP0TVdDj34qK9KXxCA3t+k85ph5
 +VDzdpAtUrbuDWaF/6KIaGMLgA4ss+VXpoqMOvZuwDQReXbXaT325WVzcTFlNLe7
 sUBS8k2DxTTjiF0HM/jnxtbvVmc3sfkPij+MTEz9XEcS9K0Hilz9suoJEKZCXY6H
 ULX3n+8wJAxbnbnXdGLyQ095BqtaQC88mEcMUvBci7NBGiRne19vvRqdLwe1mJPc
 PTz1v/Pjvbfj1apSZDGVJvIu3BVEkwJNBF1NujRGUVrYnKQd/YJSKJkqjXat5Q5M
 H/G2l+y0vmzn4VSbn1LUEvleuou6Hq/1nucksS1DHruEB7tH/aQRtf0EduzGRsop
 VmYkTiIiaF28N+eHDolDtUzSg5RNzEMCQnni4y6ajGhbWiu1Q63GBR7k9zwB27Bj
 3t6RdrqIaE52uKUdgwc9APcpZhdJR/ncqrI68B+wIwbuObhUSY5sa1Ec8uiELOiK
 g3zCp4AWRBbYQFNeQy87A811JSFU+L/lCbABs32Bj8UEKGn/qdmVwEaH9rZ8/PHk
 TAlcu8PtwBVQQIEgs+ur+OyfZcLt2U4DUDiL/26GCyG+pzzL7fYap5GSoBuqR02p
 MmoHFr87YZM=
 =Ru/4
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM updates from Russell King:

 - Add a "cut here" to make it clearer where oops dumps should be cut
   from - we already have a marker for the end of the dumps.

 - Add logging severity to show_pte()

 - Drop unnecessary common-page-size linker flag

 - Errata workarounds for Cortex A12 857271, Cortex A17 857272 and
   Cortex A7 814220.

 - Remove some unused variables that had started to provoke a compiler
   warning.

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8863/1: stm32: select ARM errata 814220
  ARM: 8862/1: errata: 814220-B-Cache maintenance by set/way operations can execute out of order
  ARM: 8865/1: mm: remove unused variables
  ARM: 8864/1: Add workaround for I-Cache line size mismatch between CPU cores
  ARM: 8861/1: errata: Workaround errata A12 857271 / A17 857272
  ARM: 8860/1: VDSO: Drop implicit common-page-size linker flag
  ARM: arrange show_pte() to issue severity-based messages
  ARM: add "8<--- cut here ---" to kernel dumps
2019-07-08 21:08:34 -07:00
Benjamin Gaignard 779eb41ccb ARM: 8862/1: errata: 814220-B-Cache maintenance by set/way operations can execute out of order
The v7 ARM states that all cache and branch predictor maintenance operations
that do not specify an address execute, relative to each other, in program
order. However, because of this erratum, an L2 set/way cache maintenance
operation can overtake an L1 set/way cache maintenance operation, this would
cause the data corruption.

This ERRATA affected the Cortex-A7 and present in r0p2, r0p3, r0p4, r0p5.

This patch is the SW workaround by adding a DSB before changing cache levels as
the ARM ERRATA: ARM/MP: 814220 told in the ARM ERRATA documentation.

Signed-off-by: Jason Liu <r64343@freescale.com>
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-06-21 09:06:06 +01:00
Marek Szyprowski 5f41f9198f ARM: 8864/1: Add workaround for I-Cache line size mismatch between CPU cores
Some big.LITTLE systems have I-Cache line size mismatch between
LITTLE and big cores. This patch adds a workaround for proper I-Cache
support on such systems. Without it, some class of the userspace code
(typically self-modifying) might suffer from random SIGILL failures.

Similar workaround already exists for ARM64 architecture. I has been
added by commit 116c81f427 ("arm64: Work around systems with mismatched
cache line sizes").

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2019-06-20 22:29:58 +01:00
Thomas Gleixner d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Chris Cole a1208f6a82 ARM: 8814/1: mm: improve/fix ARM v7_dma_inv_range() unaligned address handling
This patch addresses possible memory corruption when
v7_dma_inv_range(start_address, end_address) address parameters are not
aligned to whole cache lines. This function issues "invalidate" cache
management operations to all cache lines from start_address (inclusive)
to end_address (exclusive). When start_address and/or end_address are
not aligned, the start and/or end cache lines are first issued "clean &
invalidate" operation. The assumption is this is done to ensure that any
dirty data addresses outside the address range (but part of the first or
last cache lines) are cleaned/flushed so that data is not lost, which
could happen if just an invalidate is issued.

The problem is that these first/last partial cache lines are issued
"clean & invalidate" and then "invalidate". This second "invalidate" is
not required and worse can cause "lost" writes to addresses outside the
address range but part of the cache line. If another component writes to
its part of the cache line between the "clean & invalidate" and
"invalidate" operations, the write can get lost. This fix is to remove
the extra "invalidate" operation when unaligned addressed are used.

A kernel module is available that has a stress test to reproduce the
issue and a unit test of the updated v7_dma_inv_range(). It can be
downloaded from
http://ftp.sageembedded.com/outgoing/linux/cache-test-20181107.tgz.

v7_dma_inv_range() is call by dmac_[un]map_area(addr, len, direction)
when the direction is DMA_FROM_DEVICE. One can (I believe) successfully
argue that DMA from a device to main memory should use buffers aligned
to cache line size, because the "clean & invalidate" might overwrite
data that the device just wrote using DMA. But if a driver does use
unaligned buffers, at least this fix will prevent memory corruption
outside the buffer.

Signed-off-by: Chris Cole <chris@sageembedded.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-12-04 22:38:32 +00:00
Florian Fainelli 1238c4fd48 ARM: 8729/1: Hook B15 readahead cache functions based on processor
If we detect that we are running on a Broadcom Brahma-B15 CPU, and
CONFIG_CACHE_B15_RAC is enabled, make sure that we pick-up the
b15_cache_fns function operations.

If CONFIG_CACHE_B15_RAC is enabled, but we are not running on a Broadcom
Brahma-B15 CPU, we will fallback to calling into the regular
v7_cache_fns with no cost. If CONFIG_CACHE_B15_RAC is disabled, there is
no cost and we just use the regular v7_cache_fns.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-12-17 22:15:36 +00:00
Masahiro Yamada 08a7e621ff scripts/spelling.txt: add "swith" pattern and fix typo instances
Fix typos and add the following to the scripts/spelling.txt:

  swith||switch
  swithable||switchable
  swithed||switched
  swithing||switching

While we are here, fix the "update" to "updates" in the touched hunk in
drivers/net/wireless/marvell/mwifiex/wmm.c.

Link: http://lkml.kernel.org/r/1481573103-11329-2-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:46 -08:00
Russell King aaf4b5d92c ARM: cache-v7: optimise test for Cortex A9 r0pX devices
Eliminate one unnecessary instruction from this test by pre-shifting
the Cortex A9 ID - we can shift the actual ID in the teq instruction
thereby losing the pX bit of the ID at no cost.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14 22:26:52 +01:00
Russell King d3cd451dfb ARM: cache-v7: optimise branches in v7_flush_cache_louis
Optimise the branches such that for the majority of unaffected devices,
we avoid needing to execute the errata work-around code path by
branching to start_flush_levels early.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14 22:26:52 +01:00
Russell King cd8b24d9e8 ARM: cache-v7: consolidate initialisation of cache level index
Both v7_flush_cache_louis and v7_flush_dcache_all both begin the
flush_levels loop with r10 initialised to zero.  In each case, this
is done immediately prior to entering the loop.  Branch to this
instruction in v7_flush_dcache_all from v7_flush_cache_louis and
eliminate the unnecessary initialisation in v7_flush_cache_louis.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14 22:26:52 +01:00
Russell King 47b8484ea6 ARM: cache-v7: shift CLIDR to extract appropriate field before masking
Rather than have code which masks and then shifts, such as:

	mrc     p15, 1, r0, c0, c0, 1
ALT_SMP(ands	r3, r0, #7 << 21)
ALT_UP( ands	r3, r0, #7 << 27)
ALT_SMP(mov	r3, r3, lsr #20)
ALT_UP(	mov	r3, r3, lsr #26)

re-arrange this as a shift and then mask.  The masking is the same for
each field which we want to extract, so this allows the mask to be
shared amongst code paths:

	mrc     p15, 1, r0, c0, c0, 1
ALT_SMP(mov	r3, r0, lsr #20)
ALT_UP(	mov	r3, r0, lsr #26)
	ands	r3, r3, #7 << 1

Use this method for the LoUIS, LoUU and LoC fields.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14 22:26:51 +01:00
Russell King 5aca370826 ARM: cache-v7: use movw/movt instructions
We always build cache-v7.S for ARMv7, so we can use the ARMv7 16-bit
move instructions to load large constants, rather than using constants
in a literal pool.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-04-14 22:26:51 +01:00
Russell King 6ebbf2ce43 ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+
ARMv6 and greater introduced a new instruction ("bx") which can be used
to return from function calls.  Recent CPUs perform better when the
"bx lr" instruction is used rather than the "mov pc, lr" instruction,
and this sequence is strongly recommended to be used by the ARM
architecture manual (section A.4.1.1).

We provide a new macro "ret" with all its variants for the condition
code which will resolve to the appropriate instruction.

Rather than doing this piecemeal, and miss some instances, change all
the "mov pc" instances to use the new macro, with the exception of
the "movs" instruction and the kprobes code.  This allows us to detect
the "mov pc, lr" case and fix it up - and also gives us the possibility
of deploying this for other registers depending on the CPU selection.

Reported-by: Will Deacon <will.deacon@arm.com>
Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S
Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood
Tested-by: Shawn Guo <shawn.guo@freescale.com>
Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385
Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci
Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp
Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M
Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-07-18 12:29:04 +01:00
Will Deacon 9581960a40 ARM: 8055/1: cacheflush: use -st dsb option for ensuring completion
dsb st can be used to ensure completion of pending cache maintenance
operations, so use it for the v7 cache maintenance operations.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-05-25 23:47:46 +01:00
Lorenzo Pieralisi 70f665fe77 ARM: 7919/1: mm: refactor v7 cache cleaning ops to use way/index sequence
Set-associative caches on all v7 implementations map the index bits
to physical addresses LSBs and tag bits to MSBs. As the last level
of cache on current and upcoming ARM systems grows in size,
this means that under normal DRAM controller configurations, the
current v7 cache flush routine using set/way operations triggers a
DRAM memory controller precharge/activate for every cache line
writeback since the cache routine cleans lines by first fixing the
index and then looping through ways (index bits are mapped to lower
physical addresses on all v7 cache implementations; this means that,
with last level cache sizes in the order of MBytes, lines belonging
to the same set but different ways map to different DRAM pages).

Given the random content of cache tags, swapping the order between
indexes and ways loops do not prevent DRAM pages precharge and
activate cycles but at least, on average, improves the chances that
either multiple lines hit the same page or multiple lines belong to
different DRAM banks, improving throughput significantly.

This patch swaps the inner loops in the v7 cache flushing routine
to carry out the clean operations first on all sets belonging to
a given way (looping through sets) and then decrementing the way.

Benchmarks showed that by swapping the ordering in which sets and
ways are decremented in the v7 cache flushing routine, that uses
set/way operations, time required to flush caches is reduced
significantly, owing to improved writebacks throughput to the DRAM
controller.

Benchmarks results vary and depend heavily on the last level of
cache tag RAM content when cache is cleaned and invalidated, ranging
from 2x throughput when all tag RAM entries contain dirty lines
mapping to sequential pages of RAM to 1x (ie no improvement) when
all tag RAM accesses trigger a DRAM precharge/activate cycle, as the
current code implies on most DRAM controller configurations.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-12-29 12:32:40 +00:00
Will Deacon 6abdd49169 ARM: mm: use inner-shareable barriers for TLB and user cache operations
System-wide barriers aren't required for situations where we only need
to make visibility and ordering guarantees in the inner-shareable domain
(i.e. we are not dealing with devices or potentially incoherent CPUs).

This patch changes the v7 TLB operations, coherent_user_range and
dcache_clean_area functions to user inner-shareable barriers. For cache
maintenance, only the store access type is required to ensure completion.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2013-08-12 12:25:45 +01:00
Jon Medhurst 691557941a ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrect
On Cortex-A9 before version r1p0, the LoUIS bit field of the CLIDR
register returns zero when it should return one. This leads to cache
maintenance operations which rely on this value to not function as
intended, causing data corruption.

The workaround for this errata is to detect affected CPUs and correct
the LoUIS value read.

Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-06-17 10:30:49 +01:00
Dinh Nguyen c08e20d246 arm: Add v7_invalidate_l1 to cache-v7.S
mach-socfpga is another platform that needs to use
v7_invalidate_l1 to bringup additional cores. There was a comment that
the ideal place for v7_invalidate_l1 should be in arm/mm/cache-v7.S

Signed-off-by: Dinh Nguyen <dinguyen@altera.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Pavel Machek <pavel@denx.de>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Pavel Machek <pavel@denx.de>
Tested-by: Stephen Warren <swarren@nvidia.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Olof Johansson <olof@lixom.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Magnus Damm <magnus.damm@gmail.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2013-02-11 19:37:24 -08:00
Will Deacon d056a699dd ARM: 7606/1: cache: flush to LoUU instead of LoUIS on uniprocessor CPUs
flush_cache_louis flushes the D-side caches to the point of unification
inner-shareable. On uniprocessor CPUs, this is defined as zero and
therefore no flushing will take place. Rather than invent a new interface
for UP systems, instead use our SMP_ON_UP patching code to read the
LoUU from the CLIDR instead.

Cc: <stable@vger.kernel.org>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Tested-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-12-20 10:41:56 +00:00
Russell King a0f0dd57f4 Merge branch 'fixes' into for-linus
Conflicts:
	arch/arm/kernel/smp.c
2012-10-11 10:55:04 +01:00
Simon Horman 7253b85cc6 ARM: 7541/1: Add ARM ERRATA 775420 workaround
arm: Add ARM ERRATA 775420 workaround

Workaround for the 775420 Cortex-A9 (r2p2, r2p6,r2p8,r2p10,r3p0) erratum.
In case a date cache maintenance operation aborts with MMU exception, it
might cause the processor to deadlock. This workaround puts DSB before
executing ISB if an abort may occur on cache maintenance.

Based on work by Kouei Abe and feedback from Catalin Marinas.

Signed-off-by: Kouei Abe <kouei.abe.cp@rms.renesas.com>
[ horms@verge.net.au: Changed to implementation
  suggested by catalin.marinas@arm.com ]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-09-28 21:11:49 +01:00
Lorenzo Pieralisi 3287be8c4e ARM: mm: rename jump labels in v7_flush_dcache_all function
This patch renames jump labels in v7_flush_dcache_all in order to define
a specific flush cache levels entry point.

Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
2012-09-25 11:20:25 +01:00
Lorenzo Pieralisi 031bd879f7 ARM: mm: implement LoUIS API for cache maintenance ops
ARM v7 architecture introduced the concept of cache levels and related
control registers. New processors like A7 and A15 embed an L2 unified cache
controller that becomes part of the cache level hierarchy. Some operations in
the kernel like cpu_suspend and __cpu_disable do not require a flush of the
entire cache hierarchy to DRAM but just the cache levels belonging to the
Level of Unification Inner Shareable (LoUIS), which in most of ARM v7 systems
correspond to L1.

The current cache flushing API used in cpu_suspend and __cpu_disable,
flush_cache_all(), ends up flushing the whole cache hierarchy since for
v7 it cleans and invalidates all cache levels up to Level of Coherency
(LoC) which cripples system performance when used in hot paths like hotplug
and cpuidle.

Therefore a new kernel cache maintenance API must be added to cope with
latest ARM system requirements.

This patch adds flush_cache_louis() to the ARM kernel cache maintenance API.

This function cleans and invalidates all data cache levels up to the
Level of Unification Inner Shareable (LoUIS) and invalidates the instruction
cache for processors that support it (> v7).

This patch also creates an alias of the cache LoUIS function to flush_kern_all
for all processor versions prior to v7, so that the current cache flushing
behaviour is unchanged for those processors.

v7 cache maintenance code implements a cache LoUIS function that cleans and
invalidates the D-cache up to LoUIS and invalidates the I-cache, according
to the new API.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
2012-09-25 11:20:25 +01:00
Will Deacon c5102f5935 ARM: 7408/1: cacheflush: return error to userspace when flushing syscall fails
The cacheflush syscall can fail for two reasons:

(1) The arguments are invalid (nonsensical address range or no VMA)

(2) The region generates a translation fault on a VIPT or PIPT cache

This patch allows do_cache_op to return an error code to userspace in
the case of the above. The various coherent_user_range implementations
are modified to return 0 in the case of VIVT caches or -EFAULT in the
case of an abort on v6/v7 cores.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-05-02 11:12:49 +01:00
Rabin Vincent 8e43a905dd ARM: 7325/1: fix v7 boot with lockdep enabled
Bootup with lockdep enabled has been broken on v7 since b46c0f7465
("ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR").

This is because v7_setup (which is called very early during boot) calls
v7_flush_dcache_all, and the save_and_disable_irqs added by that patch
ends up attempting to call into lockdep C code (trace_hardirqs_off())
when we are in no position to execute it (no stack, MMU off).

Fix this by using a notrace variant of save_and_disable_irqs.  The code
already uses the notrace variant of restore_irqs.

Reviewed-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-15 21:09:52 +00:00
Stephen Boyd b46c0f7465 ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR
armv7's flush_cache_all() flushes caches via set/way. To
determine the cache attributes (line size, number of sets,
etc.) the assembly first writes the CSSELR register to select a
cache level and then reads the CCSIDR register. The CSSELR register
is banked per-cpu and is used to determine which cache level CCSIDR
reads. If the task is migrated between when the CSSELR is written and
the CCSIDR is read the CCSIDR value may be for an unexpected cache
level (for example L1 instead of L2) and incorrect cache flushing
could occur.

Disable interrupts across the write and read so that the correct
cache attributes are read and used for the cache flushing
routine. We disable interrupts instead of disabling preemption
because the critical section is only 3 instructions and we want
to call v7_dcache_flush_all from __v7_setup which doesn't have a
full kernel stack with a struct thread_info.

This fixes a problem we see in scm_call() when flush_cache_all()
is called from preemptible context and sometimes the L2 cache is
not properly flushed out.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-02-09 16:25:37 +00:00
Will Deacon f630c1bdfb ARM: 7091/1: errata: D-cache line maintenance operation by MVA may not succeed
This patch implements a workaround for erratum 764369 affecting
Cortex-A9 MPCore with two or more processors (all current revisions).
Under certain timing circumstances, a data cache line maintenance
operation by MVA targeting an Inner Shareable memory region may fail to
proceed up to either the Point of Coherency or to the Point of
Unification of the system. This workaround adds a DSB instruction before
the relevant cache maintenance functions and sets a specific bit in the
diagnostic control register of the SCU.

Cc: <stable@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-09-17 12:47:17 +01:00
Dave Martin 455a01ec30 ARM: mm: cache-v7: Use the new processor struct macros
Signed-off-by: Dave Martin <dave.martin@linaro.org>
2011-07-07 15:31:06 +01:00
Will Deacon a248b13b21 ARM: 6941/1: cache: ensure MVA is cacheline aligned in flush_kern_dcache_area
The v6 and v7 implementations of flush_kern_dcache_area do not align
the passed MVA to the size of a cacheline in the data cache. If a
misaligned address is used, only a subset of the requested area will
be flushed. This has been observed to cause failures in SMP boot where
the secondary_data initialised by the primary CPU is not cacheline
aligned, causing the secondary CPUs to read incorrect values for their
pgd and stack pointers.

This patch ensures that the base address is cacheline aligned before
flushing the d-cache.

Cc: <stable@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-05-26 12:14:32 +01:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Catalin Marinas da30e0ac0f ARM: 6528/1: Use CTR for the I-cache line size on ARMv7
The current implementation of the v7_coherent_*_range function assumes
that the D and I cache lines have the same size, which is incorrect
architecturally. This patch adds the icache_line_size macro which reads
the CTR register. The main loop in v7_coherent_*_range is split in two
independent loops or the D and I caches. This also has the performance
advantage that the DSB is moved outside the main loop.

Reported-by: Kevin Sapp <ksapp@quicinc.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-12-12 23:25:58 +00:00
Tony Lindgren 81d11955bf ARM: 6405/1: Handle __flush_icache_all for CONFIG_SMP_ON_UP
Do this by adding flush_icache_all to cache_fns for ARMv6 and 7.
As flush_icache_all may neeed to be called from flush_kern_cache_all,
add it as the first entry in the cache_fns.

Note that now we can remove the ARM_ERRATA_411920 dependency
to !SMP so it can be selected on UP ARMv6 processors, such
as omap2.

Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-04 20:23:36 +01:00
Russell King f00ec48fad ARM: Allow SMP kernels to boot on UP systems
UP systems do not implement all the instructions that SMP systems have,
so in order to boot a SMP kernel on a UP system, we need to rewrite
parts of the kernel.

Do this using an 'alternatives' scheme, where the kernel code and data
is modified prior to initialization to replace the SMP instructions,
thereby rendering the problematical code ineffectual.  We use the linker
to generate a list of 32-bit word locations and their replacement values,
and run through these replacements when we detect a UP system.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-04 20:23:36 +01:00
Santosh Shilimkar a901ff715d ARM: 6139/1: ARMv7: Use the Inner Shareable I-cache on MP
This patch fixes the flush_cache_all for ARMv7 SMP.It was
missing from commit b8349b569a

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-20 23:52:38 +01:00
Catalin Marinas b8349b569a ARM: 6112/1: Use the Inner Shareable I-cache and BTB ops on ARMv7 SMP
The standard I-cache Invalidate All (ICIALLU) and Branch Predication
Invalidate All (BPIALL) operations are not automatically broadcast to
the other CPUs in an ARMv7 MP system. The patch adds the Inner Shareable
variants, ICIALLUIS and BPIALLIS, if ARMv7 and SMP.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-05-08 10:44:30 +01:00
Russell King 2ffe2da3e7 ARM: dma-mapping: fix for speculative prefetching
ARMv6 and ARMv7 CPUs can perform speculative prefetching, which makes
DMA cache coherency handling slightly more interesting.  Rather than
being able to rely upon the CPU not accessing the DMA buffer until DMA
has completed, we now must expect that the cache could be loaded with
possibly stale data from the DMA buffer.

Where DMA involves data being transferred to the device, we clean the
cache before handing it over for DMA, otherwise we invalidate the buffer
to get rid of potential writebacks.  On DMA Completion, if data was
transferred from the device, we invalidate the buffer to get rid of
any stale speculative prefetches.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15 15:22:25 +00:00
Russell King 702b94bff3 ARM: dma-mapping: remove dmac_clean_range and dmac_inv_range
These are now unused, and so can be removed.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15 15:22:23 +00:00
Russell King a9c9147eb9 ARM: dma-mapping: provide per-cpu type map/unmap functions
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-By: Santosh Shilimkar <santosh.shilimkar@ti.com>
2010-02-15 15:22:20 +00:00
Russell King 2c9b9c8490 ARM: add size argument to __cpuc_flush_dcache_page
... and rename the function since it no longer operates on just
pages.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-14 14:53:22 +00:00
Catalin Marinas 32cfb1b16f ARM: 5746/1: Handle possible translation errors in ARMv6/v7 coherent_user_range
This is needed because applications using the sys_cacheflush system call
can pass a memory range which isn't mapped yet even though the
corresponding vma is valid. The patch also adds unwinding annotations
for correct backtraces from the coherent_user_range() functions.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-10-07 13:12:59 +01:00
Catalin Marinas 347c8b70b1 Thumb-2: Implement the unified arch/arm/mm support
This patch adds the ARM/Thumb-2 unified support to the arch/arm/mm/*
files.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2009-07-24 12:32:56 +01:00
Catalin Marinas c30c2f99e1 ARMv7: Add extra barriers for flush_cache_all compressed/head.S
The flush_cache_all function on ARMv7 is implemented as a series of
cache operations by set/way. These are not guaranteed to be ordered with
previous memory accesses, requiring a DMB. This patch also adds barriers
for the TLB operations in compressed/head.S

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2008-11-06 13:23:07 +00:00
Catalin Marinas 93ed397011 [ARM] 5227/1: Add the ENDPROC declarations to the .S files
This declaration specifies the "function" type and size for various
assembly functions, mainly needed for generating the correct branch
instructions in Thumb-2.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-01 12:06:34 +01:00
Catalin Marinas bbe888864e [ARM] armv7: add support for ARMv7 cores.
This patch adds support for the ARMv7 cores.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-05-08 22:55:53 +01:00