Commit graph

224 commits

Author SHA1 Message Date
Corentin Labbe
d282eeeb11 crypto: arm64/sha-ce - implement export/import
When an ahash algorithm falls back to another ahash and that fallback is
shaXXX-CE, doing export/import leads to errors like this:
alg: ahash: sha1-sun8i-ce export() overran state buffer on test vector 0, cfg="import/export"

This is because the descsize of shaxxx-ce is larger than struct
shaxxx_state by one u32 (the CE-only 'finalize' word).
To fix this, implement export/import functions that omit that extra
finalize word, instead of relying on the generic export/import.
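
A minimal sketch of the SHA-1 case (the SHA-2 variants follow the same
pattern), assuming the arm64 sha1_ce_state layout of a plain sha1_state
followed by the CE-only 'finalize' word:

    static int sha1_ce_export(struct shash_desc *desc, void *out)
    {
            struct sha1_ce_state *sctx = shash_desc_ctx(desc);

            /* export only the generic part, not the trailing 'finalize' word */
            memcpy(out, &sctx->sst, sizeof(struct sha1_state));
            return 0;
    }

    static int sha1_ce_import(struct shash_desc *desc, const void *in)
    {
            struct sha1_ce_state *sctx = shash_desc_ctx(desc);

            memcpy(&sctx->sst, in, sizeof(struct sha1_state));
            sctx->finalize = 0;
            return 0;
    }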

Fixes: 6ba6c74dfc ("arm64/crypto: SHA-224/SHA-256 using ARMv8 Crypto Extensions")
Fixes: 2c98833a42 ("arm64/crypto: SHA-1 using ARMv8 Crypto Extensions")

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-03-06 12:28:21 +11:00
Matteo Croce
4fb3d8ba28 crypto: arm64/poly1305 - ignore build files
Add arch/arm64/crypto/poly1305-core.S to .gitignore
as it's built from poly1305-core.S_shipped

Fixes: f569ca1647 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-02-13 17:05:25 +08:00
Linus Torvalds
a78208e243 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Removed CRYPTO_TFM_RES flags
   - Extended spawn grabbing to all algorithm types
   - Moved hash descsize verification into API code

  Algorithms:
   - Fixed recursive pcrypt dead-lock
   - Added new 32 and 64-bit generic versions of poly1305
   - Added cryptogams implementation of x86/poly1305

  Drivers:
   - Added support for i.MX8M Mini in caam
   - Added support for i.MX8M Nano in caam
   - Added support for i.MX8M Plus in caam
   - Added support for A33 variant of SS in sun4i-ss
   - Added TEE support for Raven Ridge in ccp
   - Added in-kernel API to submit TEE commands in ccp
   - Added AMD-TEE driver
   - Added support for BCM2711 in iproc-rng200
   - Added support for AES256-GCM based ciphers for chtls
   - Added aead support on SEC2 in hisilicon"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (244 commits)
  crypto: arm/chacha - fix build failured when kernel mode NEON is disabled
  crypto: caam - add support for i.MX8M Plus
  crypto: x86/poly1305 - emit does base conversion itself
  crypto: hisilicon - fix spelling mistake "disgest" -> "digest"
  crypto: chacha20poly1305 - add back missing test vectors and test chunking
  crypto: x86/poly1305 - fix .gitignore typo
  tee: fix memory allocation failure checks on drv_data and amdtee
  crypto: ccree - erase unneeded inline funcs
  crypto: ccree - make cc_pm_put_suspend() void
  crypto: ccree - split overloaded usage of irq field
  crypto: ccree - fix PM race condition
  crypto: ccree - fix FDE descriptor sequence
  crypto: ccree - cc_do_send_request() is void func
  crypto: ccree - fix pm wrongful error reporting
  crypto: ccree - turn errors to debug msgs
  crypto: ccree - fix AEAD decrypt auth fail
  crypto: ccree - fix typo in comment
  crypto: ccree - fix typos in error msgs
  crypto: atmel-{aes,sha,tdes} - Retire crypto_platform_data
  crypto: x86/sha - Eliminate casts on asm implementations
  ...
2020-01-28 15:38:56 -08:00
Jason A. Donenfeld
31899908a0 crypto: {arm,arm64,mips}/poly1305 - remove redundant non-reduction from emit
This appears to be some kind of copy and paste error, and is actually
dead code.

Pre: f = 0 ⇒ (f >> 32) = 0
    f = (f >> 32) + le32_to_cpu(digest[0]);
Post: 0 ≤ f < 2³²
    put_unaligned_le32(f, dst);

Pre: 0 ≤ f < 2³² ⇒ (f >> 32) = 0
    f = (f >> 32) + le32_to_cpu(digest[1]);
Post: 0 ≤ f < 2³²
    put_unaligned_le32(f, dst + 4);

Pre: 0 ≤ f < 2³² ⇒ (f >> 32) = 0
    f = (f >> 32) + le32_to_cpu(digest[2]);
Post: 0 ≤ f < 2³²
    put_unaligned_le32(f, dst + 8);

Pre: 0 ≤ f < 2³² ⇒ (f >> 32) = 0
    f = (f >> 32) + le32_to_cpu(digest[3]);
Post: 0 ≤ f < 2³²
    put_unaligned_le32(f, dst + 12);

Therefore this sequence is redundant. And Andy's code appears to handle
misalignment acceptably.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-16 15:18:12 +08:00
Eric Biggers
674f368a95 crypto: remove CRYPTO_TFM_RES_BAD_KEY_LEN
The CRYPTO_TFM_RES_BAD_KEY_LEN flag was apparently meant as a way to
make the ->setkey() functions provide more information about errors.

However, no one actually checks for this flag, which makes it pointless.

Also, many algorithms fail to set this flag when given a bad length key.
Reviewing just the generic implementations, this is the case for
aes-fixed-time, cbcmac, echainiv, nhpoly1305, pcrypt, rfc3686, rfc4309,
rfc7539, rfc7539esp, salsa20, seqiv, and xcbc.  But there are probably
many more in arch/*/crypto/ and drivers/crypto/.

Some algorithms can even set this flag when the key is the correct
length.  For example, authenc and authencesn set it when the key payload
is malformed in any way (not just a bad length), the atmel-sha and ccree
drivers can set it if a memory allocation fails, and the chelsio driver
sets it for bad auth tag lengths, not just bad key lengths.

So even if someone actually wanted to start checking this flag (which
seems unlikely, since it's been unused for a long time), there would be
a lot of work needed to get it working correctly.  But it would probably
be much better to go back to the drawing board and just define different
return values, like -EINVAL if the key is invalid for the algorithm vs.
-EKEYREJECTED if the key was rejected by a policy like "no weak keys".
That would be much simpler, less error-prone, and easier to test.

So just remove this flag.
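
With the flag gone, a ->setkey() implementation simply reports a bad key
length through its return value; a hypothetical sketch (example_ctx and
the 32-byte key size are made up for illustration):

    static int example_setkey(struct crypto_skcipher *tfm, const u8 *key,
                              unsigned int keylen)
    {
            struct example_ctx *ctx = crypto_skcipher_ctx(tfm);

            if (keylen != 32)
                    return -EINVAL; /* no CRYPTO_TFM_RES_BAD_KEY_LEN to set */

            memcpy(ctx->key, key, keylen);
            return 0;
    }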

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-01-09 11:30:53 +08:00
Mark Brown
0e89640b64 crypto: arm64 - Use modern annotations for assembly functions
In an effort to clarify and simplify the annotation of assembly functions
in the kernel, new macros have been introduced. These replace ENTRY and
ENDPROC and also add a new annotation for static functions which previously
had no ENTRY equivalent. Update the annotations in the crypto code to the
new macros.

There are a small number of files imported from OpenSSL where the assembly
is generated using perl programs; these are not currently annotated at all
and have not been modified.
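
For reference, a typical conversion looks like this (the function name is
hypothetical; static functions use SYM_FUNC_START_LOCAL instead):

    /* before */
    ENTRY(example_aes_encrypt)
            ret
    ENDPROC(example_aes_encrypt)

    /* after */
    SYM_FUNC_START(example_aes_encrypt)
            ret
    SYM_FUNC_END(example_aes_encrypt)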

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-20 14:58:35 +08:00
Ard Biesheuvel
5441c6507b crypto: arm64/ghash-neon - bump priority to 150
The SIMD based GHASH implementation for arm64 is typically much faster
than the generic one, and doesn't use any lookup tables, so it is
clearly preferred when available. So bump the priority to reflect that.

Fixes: 5a22b198cd ("crypto: arm64/ghash - register PMULL variants ...")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:55 +08:00
Sami Tolvanen
6320a15e98 crypto: arm64/sha - fix function types
Instead of casting pointers to callback functions, add C wrappers
to avoid type mismatch failures with Control-Flow Integrity (CFI)
checking.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-12-11 16:36:55 +08:00
Thomas Gleixner
7ef858dad9 sched/rt, arm64: Use CONFIG_PREEMPTION
CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT.
Both PREEMPT and PREEMPT_RT require the same functionality which today
depends on CONFIG_PREEMPT.

Switch the Kconfig dependency, entry code and preemption handling over
to use CONFIG_PREEMPTION. Add PREEMPT_RT output in show_stack().
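
In code terms the switch is mostly mechanical; a hypothetical guard before
and after:

    /* before */
    #ifdef CONFIG_PREEMPT
            /* preemption-specific handling */
    #endif

    /* after: CONFIG_PREEMPTION covers both PREEMPT and PREEMPT_RT */
    #ifdef CONFIG_PREEMPTION
            /* preemption-specific handling */
    #endif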

[bigeasy: +traps.c, Kconfig]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20191015191821.11479-3-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-12-08 14:37:32 +01:00
Jason A. Donenfeld
8394bfec51 crypto: arch - conditionalize crypto api in arch glue for lib code
For glue code that's used by Zinc, the actual Crypto API functions might
not necessarily exist, and don't need to exist either. Before this
patch, there are valid build configurations that lead to a unbuildable
kernel. This fixes it to conditionalize those symbols on the existence
of the proper config entry.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-27 13:08:49 +08:00
Ard Biesheuvel
f569ca1647 crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation
This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305 implementation
for NEON authored by Andy Polyakov, and contributed by him to the OpenSSL
project. The file 'poly1305-armv8.pl' is taken straight from this upstream
GitHub repository [0] at commit ec55a08dc0244ce570c4fc7cade330c60798952f,
and already contains all the changes required to build it as part of a
Linux kernel module.

[0] https://github.com/dot-asm/cryptogams

Co-developed-by: Andy Polyakov <appro@cryptogams.org>
Signed-off-by: Andy Polyakov <appro@cryptogams.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:41 +08:00
Ard Biesheuvel
b36d8c09e7 crypto: arm/chacha - remove dependency on generic ChaCha driver
Instead of falling back to the generic ChaCha skcipher driver for
non-SIMD cases, use a fast scalar implementation for ARM authored
by Eric Biggers. This removes the module dependency on chacha-generic
altogether, which also simplifies things when we expose the ChaCha
library interface from this module.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:40 +08:00
Ard Biesheuvel
b3aad5bad2 crypto: arm64/chacha - expose arm64 ChaCha routine as library function
Expose the accelerated NEON ChaCha routine directly as a symbol
export so that users of the ChaCha library API can use it directly.

Given that calls into the library API will always go through the
routines in this module if it is enabled, switch to static keys
to select the optimal implementation available (which may be none
at all, in which case we defer to the generic implementation for
all invocations).
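
A rough sketch of that static-key dispatch, with names modelled on (but not
guaranteed to match) the arm64 glue code:

    static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon);

    void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src,
                           unsigned int bytes, int nrounds)
    {
            /* defer to the generic implementation if NEON is absent or unusable */
            if (!static_branch_likely(&have_neon) || !crypto_simd_usable())
                    return chacha_crypt_generic(state, dst, src, bytes, nrounds);

            kernel_neon_begin();
            chacha_doneon(state, dst, src, bytes, nrounds);
            kernel_neon_end();
    }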

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Ard Biesheuvel
c77da4867c crypto: arm64/chacha - depend on generic chacha library instead of crypto driver
Depend on the generic ChaCha library routines instead of pulling in the
generic ChaCha skcipher driver, which is more than we need, and makes
managing the dependencies between the generic library, generic driver,
accelerated library and driver more complicated.

While at it, drop the logic to prefer the scalar code on short inputs.
Turning the NEON on and off is cheap these days, and one major use case
for ChaCha20 is ChaCha20-Poly1305, which is guaranteed to hit the scalar
path upon every invocation (when doing the Poly1305 nonce generation).

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Ard Biesheuvel
5fb8ef2580 crypto: chacha - move existing library code into lib/crypto
Currently, our generic ChaCha implementation consists of a permute
function in lib/chacha.c that operates on the 64-byte ChaCha state
directly [and which is always included into the core kernel since it
is used by the /dev/random driver], and the crypto API plumbing to
expose it as a skcipher.

In order to support in-kernel users that need the ChaCha streamcipher
but have no need [or tolerance] for going through the abstractions of
the crypto API, let's expose the streamcipher bits via a library API
as well, in a way that permits the implementation to be superseded by
an architecture specific one if provided.

So move the streamcipher code into a separate module in lib/crypto,
and expose the init() and crypt() routines to users of the library.
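
Library users then call the routines directly, roughly along these lines
(the wrapper function is hypothetical):

    #include <crypto/chacha.h>

    static void example_chacha20(const u32 key[8], const u8 iv[16],
                                 u8 *dst, const u8 *src, unsigned int len)
    {
            u32 state[16];

            chacha_init(state, key, iv);            /* the init() routine */
            chacha_crypt(state, dst, src, len, 20); /* the crypt() routine, 20 rounds */
    }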

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-17 09:02:39 +08:00
Eric Biggers
b95bba5d01 crypto: skcipher - rename the crypto_blkcipher module and kconfig option
Now that the blkcipher algorithm type has been removed in favor of
skcipher, rename the crypto_blkcipher kernel module to crypto_skcipher,
and rename the config options accordingly:

	CONFIG_CRYPTO_BLKCIPHER => CONFIG_CRYPTO_SKCIPHER
	CONFIG_CRYPTO_BLKCIPHER2 => CONFIG_CRYPTO_SKCIPHER2

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-01 13:42:47 +08:00
Yunfeng Ye
9b537997b6 crypto: arm64/aes-neonbs - add return value of skcipher_walk_done() in __xts_crypt()
A warning was found by a static code analysis tool:
  "Identical condition 'err', second condition is always false"

Fix this by assigning the return value of skcipher_walk_done() to err.
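
Roughly, the call inside the walk loop becomes:

    /* previously the return value was discarded, leaving 'err' stale */
    err = skcipher_walk_done(&walk, walk.nbytes - blocks * AES_BLOCK_SIZE);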

Fixes: 67cfa5d3b7 ("crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS")
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-11-01 13:33:42 +08:00
Ard Biesheuvel
11031c0d7d crypto: arm64/gcm-ce - implement 4 way interleave
To improve performance on cores with deep pipelines such as ThunderX2,
reimplement gcm(aes) using a 4-way interleave rather than the 2-way
interleave we use currently.

This comes down to a complete rewrite of the GCM part of the combined
GCM/GHASH driver, and instead of interleaving two invocations of AES
with the GHASH handling at the instruction level, the new version
uses a more coarse grained approach where each chunk of 64 bytes is
encrypted first and then ghashed (or ghashed and then decrypted in
the converse case).

The core NEON routine is now able to consume inputs of any size,
and tail blocks of less than 64 bytes are handled using overlapping
loads and stores, and processed by the same 4-way encryption and
hashing routines. This gets rid of most of the branches, and avoids
having to return to the C code to handle the tail block using a
stack buffer.

The table below compares the performance of the old driver and the new
one on various micro-architectures and running in various modes.

        |     AES-128      |     AES-192      |     AES-256      |
 #bytes | 512 | 1500 |  4k | 512 | 1500 |  4k | 512 | 1500 |  4k |
 -------+-----+------+-----+-----+------+-----+-----+------+-----+
    TX2 | 35% |  23% | 11% | 34% |  20% |  9% | 38% |  25% | 16% |
   EMAG | 11% |   6% |  3% | 12% |   4% |  2% | 11% |   4% |  2% |
    A72 |  8% |   5% | -4% |  9% |   4% | -5% |  7% |   4% | -5% |
    A53 | 11% |   6% | -1% | 10% |   8% | -1% | 10% |   8% | -2% |

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-10-05 01:04:31 +10:00
Ard Biesheuvel
67cfa5d3b7 crypto: arm64/aes-neonbs - implement ciphertext stealing for XTS
Update the AES-XTS implementation based on NEON instructions so that it
can deal with inputs whose size is not a multiple of the cipher block
size. This is part of the original XTS specification, but was never
implemented before in the Linux kernel.

Since the bit slicing driver is only faster if it can operate on at
least 7 blocks of input at the same time, let's reuse the alternate
path we are adding for CTS to process any data tail whose size is
not a multiple of 128 bytes.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-09-09 17:35:39 +10:00
Ard Biesheuvel
7cceca8b25 crypto: arm64/aes - implement support for XTS ciphertext stealing
Add the missing support for ciphertext stealing in the implementation
of AES-XTS, which is part of the XTS specification but was omitted up
until now due to lack of a need for it.

The asm helpers are updated so they can deal with any input size, as
long as the last full block and the final partial block are presented
at the same time. The glue code is updated so that the common case of
operating on a sector or page is mostly as before. When CTS is needed,
the walk is split up into two pieces, unless the entire input is covered
by a single step.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-09-09 17:35:39 +10:00
Ard Biesheuvel
7c9d65c40a crypto: arm64/aes-cts-cbc - move request context data to the stack
Since the CTS-CBC code completes synchronously, there is no point in
keeping part of the scratch data it uses in the request context, so
move it to the stack instead.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-09-09 17:35:39 +10:00
Ard Biesheuvel
0cfd507c83 crypto: arm64/aes-cts-cbc-ce - performance tweak
Optimize away one of the tbl instructions in the decryption path,
which turns out to be unnecessary.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-09-09 17:35:38 +10:00
Ard Biesheuvel
69b6f2e817 crypto: arm64/aes-neon - limit exposed routines if faster driver is enabled
The pure NEON AES implementation predates the bit-slicing one, and is
generally slower, unless the algorithm in question can only execute
sequentially.

So advertising the skciphers that the bit-slicing driver implements as
well serves no real purpose, and we can just disable them. Note that the
bit-slicing driver also has a link time dependency on the pure NEON
driver, for CBC encryption and for XTS tweak calculation, so we still
need both drivers on systems that do not implement the Crypto Extensions.

At the same time, expose those modaliases for the AES instruction based
driver. This is necessary since otherwise, we may end up loading the
wrong driver when any of the skciphers are instantiated before the CPU
capability based module loading has completed.

Finally, add the missing modalias for cts(cbc(aes)) so requests for
this algorithm will autoload the correct module.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-09-09 17:35:38 +10:00
Ard Biesheuvel
7a3b1c6ee7 crypto: arm64/aes-neonbs - replace tweak mask literal with composition
Replace the vector load from memory sequence with a simple instruction
sequence to compose the tweak vector directly.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-09-09 17:35:28 +10:00
zhong jiang
7b865ec15e crypto: arm64/aes - Use PTR_ERR_OR_ZERO rather than its implementation.
PTR_ERR_OR_ZERO combines the if (IS_ERR(...)) check with PTR_ERR. It is
better to use it directly, so just replace the open-coded sequence.
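
In other words, the open-coded pattern collapses into a single call
(the pointer name is illustrative):

    /* before */
    if (IS_ERR(simd))
            return PTR_ERR(simd);
    return 0;

    /* after */
    return PTR_ERR_OR_ZERO(simd);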

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-09-09 17:35:27 +10:00
Hans de Goede
8f373bf493 crypto: arm64 - Rename functions to avoid conflict with crypto/sha256.h
Rename static / file-local functions so that they do not conflict with
the functions declared in crypto/sha256.h.

This is a preparation patch for folding crypto/sha256.h into crypto/sha.h.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-09-05 14:37:30 +10:00
Ard Biesheuvel
735177ca14 crypto: arm64/aes - implement accelerated ESSIV/CBC mode
Add an accelerated version of the 'essiv(cbc(aes),sha256)' skcipher,
which is used by fscrypt or dm-crypt on systems where CBC mode is
significantly more performant than XTS mode (e.g., when using a h/w
accelerator which supports the former but not the latter). This avoids
a separate call into the AES cipher for every invocation.
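
For reference, the accelerated mode is instantiated through the regular
skcipher API under the algorithm name quoted above; a minimal sketch
(helper name hypothetical):

    static struct crypto_skcipher *example_alloc_essiv(void)
    {
            return crypto_alloc_skcipher("essiv(cbc(aes),sha256)", 0, 0);
    }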

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-08-30 18:05:27 +10:00
Ard Biesheuvel
65d0042b52 crypto: arm64/aes-cts-cbc - factor out CBC en/decryption of a walk
The plain CBC driver and the CTS one share some code that iterates over
a scatterwalk and invokes the CBC asm code to do the processing. The
upcoming ESSIV/CBC mode will clone that pattern for the third time, so
let's factor it out first.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-08-30 18:05:27 +10:00
Ard Biesheuvel
642a88fbe9 crypto: arm64/aes-cipher - switch to shared AES inverse Sbox
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:37 +10:00
Ard Biesheuvel
58144b8d03 crypto: arm64/aes-neon - switch to shared AES Sboxes
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:36 +10:00
Ard Biesheuvel
4d3f9d89c7 crypto: arm64/aes-ce-cipher - use AES library as fallback
Instead of calling into the table based scalar AES code in situations
where the SIMD unit may not be used, use the generic AES code, which
is more appropriate since it is less likely to be susceptible to
timing attacks.
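
A sketch of what such a fallback path looks like when built on the shared
AES library (the wrapper function is hypothetical):

    #include <crypto/aes.h>

    static int example_fallback_encrypt(const u8 *key, unsigned int keylen,
                                        u8 *dst, const u8 *src)
    {
            struct crypto_aes_ctx ctx;
            int err;

            err = aes_expandkey(&ctx, key, keylen);
            if (err)
                    return err;

            /* generic library cipher: usable without touching the SIMD unit */
            aes_encrypt(&ctx, dst, src);
            return 0;
    }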

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:09 +10:00
Ard Biesheuvel
ff6f4115cb crypto: aes - move sync ctr(aes) to AES library and generic helper
In preparation of duplicating the sync ctr(aes) functionality to modules
under arch/arm, move the helper function from an inline .h file to the
AES library, which is already depended upon by the drivers that use this
fallback.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:58:08 +10:00
Ard Biesheuvel
c184472902 crypto: arm64/aes-ce - switch to library version of key expansion routine
Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

While at it, remove some references to the table based arm64 version
of AES and replace them with AES library calls as well.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:56:06 +10:00
Ard Biesheuvel
f68df54307 crypto: arm64/aes-neonbs - switch to library version of key expansion routine
Switch to the new AES library that also provides an implementation of
the AES key expansion routine. This removes the dependency on the
generic AES cipher, allowing it to be omitted entirely in the future.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:56:05 +10:00
Ard Biesheuvel
c59a6dffa3 crypto: arm64/aes-ccm - switch to AES library
The CCM code calls directly into the scalar table based AES cipher for
arm64 from the fallback path, and since this implementation is known to
be non-time invariant, doing so from a time invariant SIMD cipher is a
bit nasty.

So let's switch to the AES library - this makes the code more robust,
and drops the dependency on the generic AES cipher, allowing us to
omit it entirely in the future.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:56:05 +10:00
Ard Biesheuvel
fe3b99b649 crypto: arm64/ghash - switch to AES library
The GHASH code uses the generic AES key expansion routines, and calls
directly into the scalar table based AES cipher for arm64 from the
fallback path, and since this implementation is known to be non-time
invariant, doing so from a time invariant SIMD cipher is a bit nasty.

So let's switch to the AES library - this makes the code more robust,
and drops the dependency on the generic AES cipher, allowing us to
omit it entirely in the future.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:56:04 +10:00
Ard Biesheuvel
724ecd3c0e crypto: aes - rename local routines to prevent future clashes
Rename some local AES encrypt/decrypt routines so they don't clash with
the names we are about to introduce for the routines exposed by the
generic AES library.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-26 14:52:03 +10:00
Linus Torvalds
4d2fa8b44b Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 5.3:

  API:
   - Test shash interface directly in testmgr
   - cra_driver_name is now mandatory

  Algorithms:
   - Replace arc4 crypto_cipher with library helper
   - Implement 5 way interleave for ECB, CBC and CTR on arm64
   - Add xxhash
   - Add continuous self-test on noise source to drbg
   - Update jitter RNG

  Drivers:
   - Add support for SHA204A random number generator
   - Add support for 7211 in iproc-rng200
   - Fix fuzz test failures in inside-secure
   - Fix fuzz test failures in talitos
   - Fix fuzz test failures in qat"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (143 commits)
  crypto: stm32/hash - remove interruptible condition for dma
  crypto: stm32/hash - Fix hmac issue more than 256 bytes
  crypto: stm32/crc32 - rename driver file
  crypto: amcc - remove memset after dma_alloc_coherent
  crypto: ccp - Switch to SPDX license identifiers
  crypto: ccp - Validate the the error value used to index error messages
  crypto: doc - Fix formatting of new crypto engine content
  crypto: doc - Add parameter documentation
  crypto: arm64/aes-ce - implement 5 way interleave for ECB, CBC and CTR
  crypto: arm64/aes-ce - add 5 way interleave routines
  crypto: talitos - drop icv_ool
  crypto: talitos - fix hash on SEC1.
  crypto: talitos - move struct talitos_edesc into talitos.h
  lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
  crypto/NX: Set receive window credits to max number of CRBs in RxFIFO
  crypto: asymmetric_keys - select CRYPTO_HASH where needed
  crypto: serpent - mark __serpent_setkey_sbox noinline
  crypto: testmgr - dynamically allocate crypto_shash
  crypto: testmgr - dynamically allocate testvec_config
  crypto: talitos - eliminate unneeded 'done' functions at build time
  ...
2019-07-08 20:57:08 -07:00
Ard Biesheuvel
7367bfeb2c crypto: arm64/aes-ce - implement 5 way interleave for ECB, CBC and CTR
This implements 5-way interleaving for ECB, CBC decryption and CTR,
resulting in a speedup of ~11% on Marvell ThunderX2, which has a
very deep pipeline and therefore a high issue latency for NEON
instructions operating on the same registers.

Note that XTS is left alone: implementing 5-way interleave there
would either involve spilling of the calculated tweaks to the
stack, or recalculating them after the encryption operation, and
doing either of those would most likely penalize low end cores.

For ECB, this is not a concern at all, given that we have plenty
of spare registers. For CTR and CBC decryption, we take advantage
of the fact that v16 is not used by the CE version of the code
(which is the only one targeted by the optimization), and so we
can reshuffle the code a bit and avoid having to spill to memory
(with the exception of one extra reload in the CBC routine).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-03 22:13:12 +08:00
Ard Biesheuvel
e217413964 crypto: arm64/aes-ce - add 5 way interleave routines
In preparation of tweaking the accelerated AES chaining mode routines
to be able to use a 5-way stride, implement the core routines to
support processing 5 blocks of input at a time. While at it, drop
the 2 way versions, which have been unused for a while now.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-07-03 22:13:12 +08:00
Thomas Gleixner
d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Eric Biggers
860ab2e502 crypto: chacha - constify ctx and iv arguments
Constify the ctx and iv arguments to crypto_chacha_init() and the
various chacha*_stream_xor() functions.  This makes it clear that they
are not modified.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-13 14:31:40 +08:00
Elena Petrova
6bd934de1e crypto: arm64/sha2-ce - correct digest for empty data in finup
The sha256-ce finup implementation for ARM64 produces wrong digest
for empty input (len=0). Expected: the actual digest, result: initial
value of SHA internal state. The error is in sha256_ce_finup:
for empty data `finalize` will be 1, so the code is relying on
sha2_ce_transform to make the final round. However, in
sha256_base_do_update, the block function will not be called when
len == 0.

Fix it by setting finalize to 0 if data is empty.
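
The fix is essentially a one-liner in sha256_ce_finup, roughly:

    /* before: finalize could be true even when len == 0 */
    bool finalize = !sctx->sst.count && !(len % SHA256_BLOCK_SIZE);

    /* after: never finalize via the block function when there is no data */
    bool finalize = !sctx->sst.count && !(len % SHA256_BLOCK_SIZE) && len;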

Fixes: 03802f6a80 ("crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer")
Cc: stable@vger.kernel.org
Signed-off-by: Elena Petrova <lenaptr@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-06 14:38:57 +08:00
Elena Petrova
1d4aaf16de crypto: arm64/sha1-ce - correct digest for empty data in finup
The sha1-ce finup implementation for ARM64 produces wrong digest
for empty input (len=0). Expected: da39a3ee..., result: 67452301...
(initial value of SHA internal state). The error is in sha1_ce_finup:
for empty data `finalize` will be 1, so the code is relying on
sha1_ce_transform to make the final round. However, in
sha1_base_do_update, the block function will not be called when
len == 0.

Fix it by setting finalize to 0 if data is empty.

Fixes: 07eb54d306 ("crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer")
Cc: stable@vger.kernel.org
Signed-off-by: Elena Petrova <lenaptr@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-06 14:38:57 +08:00
Thomas Gleixner
2874c5fd28 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:32 -07:00
Linus Torvalds
81ff5d2cba Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "API:
   - Add support for AEAD in simd
   - Add fuzz testing to testmgr
   - Add panic_on_fail module parameter to testmgr
   - Use per-CPU struct instead multiple variables in scompress
   - Change verify API for akcipher

  Algorithms:
   - Convert x86 AEAD algorithms over to simd
   - Forbid 2-key 3DES in FIPS mode
   - Add EC-RDSA (GOST 34.10) algorithm

  Drivers:
   - Set output IV with ctr-aes in crypto4xx
   - Set output IV in rockchip
   - Fix potential length overflow with hashing in sun4i-ss
   - Fix computation error with ctr in vmx
   - Add SM4 protected keys support in ccree
   - Remove long-broken mxc-scc driver
   - Add rfc4106(gcm(aes)) cipher support in cavium/nitrox"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (179 commits)
  crypto: ccree - use a proper le32 type for le32 val
  crypto: ccree - remove set but not used variable 'du_size'
  crypto: ccree - Make cc_sec_disable static
  crypto: ccree - fix spelling mistake "protedcted" -> "protected"
  crypto: caam/qi2 - generate hash keys in-place
  crypto: caam/qi2 - fix DMA mapping of stack memory
  crypto: caam/qi2 - fix zero-length buffer DMA mapping
  crypto: stm32/cryp - update to return iv_out
  crypto: stm32/cryp - remove request mutex protection
  crypto: stm32/cryp - add weak key check for DES
  crypto: atmel - remove set but not used variable 'alg_name'
  crypto: picoxcell - Use dev_get_drvdata()
  crypto: crypto4xx - get rid of redundant using_sd variable
  crypto: crypto4xx - use sync skcipher for fallback
  crypto: crypto4xx - fix cfb and ofb "overran dst buffer" issues
  crypto: crypto4xx - fix ctr-aes missing output IV
  crypto: ecrdsa - select ASN1 and OID_REGISTRY for EC-RDSA
  crypto: ux500 - use ccflags-y instead of CFLAGS_<basename>.o
  crypto: ccree - handle tee fips error during power management resume
  crypto: ccree - add function to handle cryptocell tee fips error
  ...
2019-05-06 20:15:06 -07:00
Eric Biggers
4a8108b705 crypto: arm64/aes-neonbs - don't access already-freed walk.iv
If the user-provided IV needs to be aligned to the algorithm's
alignmask, then skcipher_walk_virt() copies the IV into a new aligned
buffer walk.iv.  But skcipher_walk_virt() can fail afterwards, and then
if the caller unconditionally accesses walk.iv, it's a use-after-free.

xts-aes-neonbs doesn't set an alignmask, so currently it isn't affected
by this despite unconditionally accessing walk.iv.  However this is more
subtle than desired, and unconditionally accessing walk.iv has caused a
real problem in other algorithms.  Thus, update xts-aes-neonbs to start
checking the return value of skcipher_walk_virt().
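
Concretely, the entry points now bail out before ever touching walk.iv,
along these lines:

    err = skcipher_walk_virt(&walk, req, false);
    if (err)
            return err;
    /* walk.iv is only guaranteed to be valid from here on */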

Fixes: 1abee99eaf ("crypto: arm64/aes - reimplement bit-sliced ARM/NEON implementation for arm64")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-18 22:14:58 +08:00
Andrew Murray
aaba098fe6 arm64: HWCAP: add support for AT_HWCAP2
As we will exhaust the first 32 bits of AT_HWCAP, let's start
exposing AT_HWCAP2 to userspace to give us up to 64 caps.

Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we
prefer to expand into AT_HWCAP2 in order to provide a consistent
view to userspace between ILP32 and LP64. However internal to the
kernel we prefer to continue to use the full space of elf_hwcap.

To reduce complexity and allow for future expansion, we now
represent hwcaps in the kernel as ordinals and use a
KERNEL_HWCAP_ prefix. This allows us to support automatic feature
based module loading for all our hwcaps.

We introduce cpu_set_feature to set hwcaps which complements the
existing cpu_have_feature helper. These helpers allow us to clean
up existing direct uses of elf_hwcap and reduce any future effort
required to move beyond 64 caps.

For convenience we also introduce cpu_{have,set}_named_feature which
makes use of the cpu_feature macro to allow providing a hwcap name
without a {KERNEL_}HWCAP_ prefix.
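
For example, an arm64 crypto module init can now probe its capability by
name; a sketch modelled on the glue code (function name hypothetical):

    static int __init example_mod_init(void)
    {
            if (!cpu_have_named_feature(ASIMD))
                    return -ENODEV;

            /* register the accelerated algorithms here */
            return 0;
    }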

Signed-off-by: Andrew Murray <andrew.murray@arm.com>
[will: use const_ilog2() and tweak documentation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-16 16:27:12 +01:00
Eric Biggers
f6e9af8766 crypto: arm64/cbcmac - handle empty messages in same way as template
My patches to make testmgr fuzz algorithms against their generic
implementation detected that the arm64 implementations of "cbcmac(aes)"
handle empty messages differently from the cbcmac template.  Namely, the
arm64 implementations return the encrypted initial value, but the cbcmac
template returns the initial value directly.

This isn't actually a meaningful case because any user of cbcmac needs
to prepend the message length, as CCM does; otherwise it's insecure.
However, we should keep the behavior consistent; at the very least this
makes testing easier.

Do it the easy way, which is to change the arm64 implementations to have
the same behavior as the cbcmac template.

For what it's worth, ghash does things essentially the same way: it
returns its initial value when given an empty message, even though in
practice ghash is never passed an empty message.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-08 14:42:55 +08:00
Eric Biggers
e52b7023cd crypto: arm64 - convert to use crypto_simd_usable()
Replace all calls to may_use_simd() in the arm64 crypto code with
crypto_simd_usable(), in order to allow testing the no-SIMD code paths.
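
The conversion itself is mechanical; the resulting glue-code pattern looks
roughly like this (the example_* helpers are hypothetical):

    #include <crypto/internal/simd.h>
    #include <asm/neon.h>

    static void example_transform(struct example_state *st, const u8 *src,
                                  int blocks)
    {
            if (!crypto_simd_usable()) {    /* was: may_use_simd() */
                    example_transform_scalar(st, src, blocks);
                    return;
            }

            kernel_neon_begin();
            example_transform_neon(st, src, blocks);
            kernel_neon_end();
    }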

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-03-22 20:57:27 +08:00