Function definitions in headers are usually marked as 'static inline'.
Since 'inline' is missing for crypto_reportstat(), any .c file that
includes this header without referencing the function would get an
unused-function warning.
Also, 'struct crypto_user_alg' is not declared in this header.
I included <linux/cryptouser.h> instead of adding the forward declaration
as suggested [1].
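For illustration, the resulting pattern in the header looks roughly
like this (the stub body below is only a sketch, not the exact code):

#include <linux/cryptouser.h>

static inline int crypto_reportstat(struct sk_buff *in_skb,
				    struct nlmsghdr *in_nlh,
				    struct nlattr **attrs)
{
	/* placeholder stub body, for illustration only */
	return -ENOTSUPP;
}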
Detected by compile-testing this header as a standalone unit:
./include/crypto/internal/cryptouser.h:6:44: warning: ‘struct crypto_user_alg’ declared inside parameter list will not be visible outside of this definition or declaration
struct crypto_alg *crypto_alg_match(struct crypto_user_alg *p, int exact);
^~~~~~~~~~~~~~~
./include/crypto/internal/cryptouser.h:11:12: warning: ‘crypto_reportstat’ defined but not used [-Wunused-function]
static int crypto_reportstat(struct sk_buff *in_skb, struct nlmsghdr *in_nlh, struct nlattr **attrs)
^~~~~~~~~~~~~~~~~
[1] https://lkml.org/lkml/2019/6/13/1121
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add header include guards in case they are included multiple times.
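For reference, the standard guard pattern is as follows (the guard name
below is just a placeholder, not the actual one used):

#ifndef _CRYPTO_INTERNAL_EXAMPLE_H
#define _CRYPTO_INTERNAL_EXAMPLE_H

/* header contents */

#endif	/* _CRYPTO_INTERNAL_EXAMPLE_H */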
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support of printing the dpseci frame queue statistics using debugfs.
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It seems that smp_processor_id() is only used for best-effort load
balancing; refer to qat_crypto_get_instance_node(). It's not feasible
to disable preemption for the duration of the crypto requests. Therefore,
just silence the warning. This commit is similar to e7a9b05ca4
("crypto: cavium - Fix smp_processor_id() warnings").
Silences the following splat:
BUG: using smp_processor_id() in preemptible [00000000] code: cryptomgr_test/2904
caller is qat_alg_ablkcipher_setkey+0x300/0x4a0 [intel_qat]
CPU: 1 PID: 2904 Comm: cryptomgr_test Tainted: P O 4.14.69 #1
...
Call Trace:
dump_stack+0x5f/0x86
check_preemption_disabled+0xd3/0xe0
qat_alg_ablkcipher_setkey+0x300/0x4a0 [intel_qat]
skcipher_setkey_ablkcipher+0x2b/0x40
__test_skcipher+0x1f3/0xb20
? cpumask_next_and+0x26/0x40
? find_busiest_group+0x10e/0x9d0
? preempt_count_add+0x49/0xa0
? try_module_get+0x61/0xf0
? crypto_mod_get+0x15/0x30
? __kmalloc+0x1df/0x1f0
? __crypto_alloc_tfm+0x116/0x180
? crypto_skcipher_init_tfm+0xa6/0x180
? crypto_create_tfm+0x4b/0xf0
test_skcipher+0x21/0xa0
alg_test_skcipher+0x3f/0xa0
alg_test.part.6+0x126/0x2a0
? finish_task_switch+0x21b/0x260
? __schedule+0x1e9/0x800
? __wake_up_common+0x8d/0x140
cryptomgr_test+0x40/0x50
kthread+0xff/0x130
? cryptomgr_notify+0x540/0x540
? kthread_create_on_node+0x70/0x70
ret_from_fork+0x24/0x50
Fixes: ed8ccaef52 ("crypto: qat - Add support for SRIOV")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use devm_hwrng_register() to get rid of the manual unregistration.
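Roughly, the change has this shape (the rng field name is made up for
illustration):

/* before: needs hwrng_unregister() in remove() and error paths */
ret = hwrng_register(&priv->rng);

/* after: unregistration happens automatically on driver detach */
ret = devm_hwrng_register(&pdev->dev, &priv->rng);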
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This reverts commit ecc8bc81f2
("crypto: aegis128 - provide a SIMD implementation based on NEON
intrinsics") and commit 7cdc0ddbf7
("crypto: aegis128 - add support for SIMD acceleration").
They cause compile errors on platforms other than ARM because
the mechanism to selectively compile the SIMD code is broken.
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The generic aegis128 software crypto driver recently gained support
for using SIMD intrinsics to increase performance, for which it
unconditionally #includes the <asm/simd.h> header. Unfortunately,
this header does not exist on many architectures, resulting in
build failures.
Since asm-generic already has a version of simd.h, let's make it
a mandatory header so that it gets instantiated on all architectures
that don't provide their own version.
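Concretely, making the header mandatory boils down to listing it in the
asm-generic Kbuild file, roughly:

# include/asm-generic/Kbuild
mandatory-y += simd.h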
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The newly introduced AES library exposes aes_encrypt/aes_decrypt
routines, so rename existing occurrences of those identifiers in
the s390 driver.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
To help avoid confusion, add a comment to ghash-generic.c which explains
the convention that the kernel's implementation of GHASH uses.
Also update the Kconfig help text and module descriptions to call GHASH
a "hash function" rather than a "message digest", since the latter
normally means a real cryptographic hash function, which GHASH is not.
Cc: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With the removal of the padata timer, padata_do_serial no longer
needs special CPU handling, so remove it.
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Validate assoclen for RFC4543, which expects an assoclen of 16 or 20,
the same as RFC4106.
Based on seqiv, IPsec ESP and RFC4543/RFC4106, the assoclen is the size
of the IP header (spi, seq_no, extended seq_no) plus the IV length.
This can be 16 or 20 bytes.
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Check assoclen to satisfy the extra tests that expect -EINVAL to be
returned when the associated data size is not valid.
Validate assoclen for RFC4543, which expects an assoclen of 16 or 20,
the same as RFC4106.
Based on seqiv, IPsec ESP and RFC4543/RFC4106, the assoclen is the size
of the IP header (spi, seq_no, extended seq_no) plus the IV length.
This can be 16 or 20 bytes.
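Conceptually the added check is just the following (illustrative
sketch, not the exact driver code):

if (req->assoclen != 16 && req->assoclen != 20)
	return -EINVAL;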
Signed-off-by: Iuliana Prodan <iuliana.prodan@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The function padata_reorder will use a timer when it cannot progress
while completed jobs are outstanding (pd->reorder_objects > 0). This
is suboptimal: if we do end up using the timer, it introduces a
gratuitous delay of one second.
In fact we can easily distinguish between whether completed jobs
are outstanding and whether we can make progress. All we have to
do is look at the next pqueue list.
This patch does that by replacing pd->processed with pd->cpu so
that the next pqueue is more accessible.
A work queue is used instead of the original try_again to avoid
hogging the CPU.
Note that we don't bother removing the work queue in
padata_flush_queues because the whole premise is broken. You
cannot flush async crypto requests so it makes no sense to even
try. A subsequent patch will fix it by replacing it with a ref
counting scheme.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Clang sometimes makes very different inlining decisions from gcc.
In the case of the aegis crypto algorithms, it decides to turn the innermost
primitives (and, xor, ...) into separate functions but inline most of
the rest.
This results in a huge amount of variables spilled on the stack, leading
to rather slow execution as well as kernel stack usage beyond the 32-bit
warning limit when CONFIG_KASAN is enabled:
crypto/aegis256.c:123:13: warning: stack frame size of 648 bytes in function 'crypto_aegis256_encrypt_chunk' [-Wframe-larger-than=]
crypto/aegis256.c:366:13: warning: stack frame size of 1264 bytes in function 'crypto_aegis256_crypt' [-Wframe-larger-than=]
crypto/aegis256.c:187:13: warning: stack frame size of 656 bytes in function 'crypto_aegis256_decrypt_chunk' [-Wframe-larger-than=]
crypto/aegis128l.c:135:13: warning: stack frame size of 832 bytes in function 'crypto_aegis128l_encrypt_chunk' [-Wframe-larger-than=]
crypto/aegis128l.c:415:13: warning: stack frame size of 1480 bytes in function 'crypto_aegis128l_crypt' [-Wframe-larger-than=]
crypto/aegis128l.c:218:13: warning: stack frame size of 848 bytes in function 'crypto_aegis128l_decrypt_chunk' [-Wframe-larger-than=]
crypto/aegis128.c:116:13: warning: stack frame size of 584 bytes in function 'crypto_aegis128_encrypt_chunk' [-Wframe-larger-than=]
crypto/aegis128.c:351:13: warning: stack frame size of 1064 bytes in function 'crypto_aegis128_crypt' [-Wframe-larger-than=]
crypto/aegis128.c:177:13: warning: stack frame size of 592 bytes in function 'crypto_aegis128_decrypt_chunk' [-Wframe-larger-than=]
Forcing the primitives to all get inlined avoids the issue and the
resulting code is similar to what gcc produces.
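The fix amounts to force-inlining the primitive helpers, along these
lines (a sketch following the in-tree helper names):

/* force-inline the innermost primitive so clang does not outline it */
static __always_inline void crypto_aegis_block_xor(union aegis_block *dst,
						   const union aegis_block *src)
{
	dst->words64[0] ^= src->words64[0];
	dst->words64[1] ^= src->words64[1];
}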
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use dma_pool_zalloc instead of using dma_pool_alloc to allocate
memory and then zeroing it with memset 0.
This simplifies the code.
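The transformation is mechanical, e.g. (variable names are
illustrative):

/* before */
buf = dma_pool_alloc(pool, GFP_KERNEL, &dma);
memset(buf, 0, sizeof(*buf));

/* after */
buf = dma_pool_zalloc(pool, GFP_KERNEL, &dma);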
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
While running ipsec processing for traffic through multiple network
interfaces, it is observed that the caam driver gets less time to poll
responses from the caam block compared to the ethernet driver. This is
because the ethernet driver has as many napi instances per cpu as the
number of ethernet interfaces in the system. Therefore, the caam
driver's napi gets executed less often than the ethernet driver's napi
instances. This results in a situation where we end up submitting more
requests to caam (which it is able to finish off quite fast), but we
don't dequeue the responses at the same rate. This makes the caam
response FQs bloat with a large number of frames. In some situations it
can even crash the kernel due to out-of-memory. To prevent this,
increase the napi budget of the dpseci driver to a large value so that
the caam driver is able to drain its response queues at a sufficient
rate.
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the new helper devm_platform_ioremap_resource() which wraps the
platform_get_resource() and devm_ioremap_resource() together, to
simplify the code.
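The helper collapses the usual two-step pattern, roughly (variable
names are illustrative):

/* before */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
base = devm_ioremap_resource(&pdev->dev, res);

/* after */
base = devm_platform_ioremap_resource(pdev, 0);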
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the new helper devm_platform_ioremap_resource() which wraps the
platform_get_resource() and devm_ioremap_resource() together, to
simplify the code.
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Each of the operations in ccp_run_cmd() needs several hundred
bytes of kernel stack. Depending on the inlining, these may
need separate stack slots that add up to more than the warning
limit, as shown in this clang based build:
drivers/crypto/ccp/ccp-ops.c:871:12: error: stack frame size of 1164 bytes in function 'ccp_run_aes_cmd' [-Werror,-Wframe-larger-than=]
static int ccp_run_aes_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd)
The problem may also happen when there is no warning, e.g. in the
ccp_run_cmd()->ccp_run_aes_cmd()->ccp_run_aes_gcm_cmd() call chain with
over 2000 bytes.
Mark each individual function as 'noinline_for_stack' to prevent
this from happening, and move the calls to the two special cases for aes
into the top-level function. This will keep the actual combined stack
usage to the minimum: 828 bytes for ccp_run_aes_gcm_cmd() and
at most 524 bytes for each of the other cases.
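The annotation itself is the existing noinline_for_stack marker, e.g.
for the function quoted above (the line splitting is cosmetic):

-static int ccp_run_aes_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd)
+static noinline_for_stack int
+ccp_run_aes_cmd(struct ccp_cmd_queue *cmd_q, struct ccp_cmd *cmd)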
Fixes: 63b945091a ("crypto: ccp - CCP device driver and interface support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Redefine pr_fmt so that the module name is prefixed to every
log message produced by the ccp-crypto module.
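The usual idiom for this, placed before the first #include in the
file, is:

#define pr_fmt(fmt)	KBUILD_MODNAME ": " fmt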
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The directory tools/crypto and the only file under it never get
built anywhere. This program should instead be incorporated into
one of the existing user-space projects, crconf or libkcapi.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support to load Asymmetric crypto firmware on
AE cores of CNN55XX device. Firmware is stored on UCD block 2
and all available AE cores are tagged to group 0.
Signed-off-by: Phani Kiran Hemadri <phemadri@marvell.com>
Reviewed-by: Srikanth Jampala <jsrikanth@marvell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The CCP driver is able to act as a DMA engine. Add a module parameter that
allows this feature to be enabled/disabled.
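A module parameter of this kind typically looks like the sketch below
(the parameter name and default are assumptions, not necessarily what
the driver uses):

static bool dmaengine = true;
module_param(dmaengine, bool, 0444);
MODULE_PARM_DESC(dmaengine,
		 "Register services with the DMA subsystem (default: enabled)");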
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Provide the ability to constrain the total number of enabled devices in
the system. Once max_devs devices have been configured, subsequently
probed devices are ignored.
The max_devs parameter may be zero, in which case all CCPs are disabled.
PSPs are always enabled and active.
Disabling the CCPs also disables DMA and RNG registration.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a module parameter to limit the number of queues per CCP. The default
value (nqueues=0) is to set up every available queue on each device.
The count of queues starts from the first one found on the device (which
varies based on the device ID).
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add a config option to exclude DebugFS support in the CCP driver.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently, NETLINK_CRYPTO works only in the init network namespace. It
doesn't make much sense to cut it out of the other network namespaces,
so do the minor plumbing work necessary to make it work in any network
namespace. Code inspired by net/core/sock_diag.c.
Tested using kcapi-dgst from libkcapi [1]:
Before:
# unshare -n kcapi-dgst -c sha256 </dev/null | wc -c
libkcapi - Error: Netlink error: sendmsg failed
libkcapi - Error: Netlink error: sendmsg failed
libkcapi - Error: NETLINK_CRYPTO: cannot obtain cipher information for hmac(sha512) (is required crypto_user.c patch missing? see documentation)
0
After:
# unshare -n kcapi-dgst -c sha256 </dev/null | wc -c
32
[1] https://github.com/smuellerDD/libkcapi
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch recognises the fact that the hardware cannot ever process more
than 2,199,023,386,111 bytes of hash or HMAC payload, so there is no point
in maintaining 128-bit-wide byte counters; 64 bits is more than sufficient.
Signed-off-by: Pascal van Leeuwen <pvanleeuwen@verimatrix.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds support for the following AEAD ciphersuites:
- authenc(hmac(sha1),rfc3686(ctr(aes)))
- authenc(hmac(sha224),rfc3686(ctr(aes)))
- authenc(hmac(sha256),rfc3686(ctr(aes)))
- authenc(hmac(sha384),rfc3686(ctr(aes)))
- authenc(hmac(sha512),rfc3686(ctr(aes)))
Signed-off-by: Pascal van Leeuwen <pvanleeuwen@verimatrix.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For spinlocks, the type spinlock_t should be used instead of
"struct spinlock".
Use spinlock_t for the spinlock definitions.
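The change is a one-liner per lock, e.g. (the field name is
illustrative):

-	struct spinlock	lock;
+	spinlock_t	lock;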
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
kmemdup is introduced to duplicate a region of memory in a neat way.
Rather than kmalloc/kzalloc + memcpy, which requires the programmer to
write the size twice (sometimes leading to mistakes), kmemdup improves
readability, leads to smaller code and also reduces the chance of
mistakes. Switch to kmemdup rather than using kmalloc/kzalloc + memcpy.
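A representative before/after (names are illustrative):

/* before */
p = kzalloc(len, GFP_KERNEL);
if (!p)
	return -ENOMEM;
memcpy(p, src, len);

/* after */
p = kmemdup(src, len, GFP_KERNEL);
if (!p)
	return -ENOMEM;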
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Provide an accelerated implementation of aegis128 by wiring up the
SIMD hooks in the generic driver to an implementation based on NEON
intrinsics, which can be compiled to both ARM and arm64 code.
This results in a performance of 2.2 cycles per byte on Cortex-A53,
which is a performance increase of ~11x compared to the generic
code.
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add some plumbing to allow the AEGIS128 code to be built with SIMD
routines for acceleration.
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The generic AES code provides four sets of lookup tables, where each
set consists of four tables containing the same 32-bit values, but
rotated by 0, 8, 16 and 24 bits, respectively. This makes sense for
CISC architectures such as x86 which support memory operands, but
for other architectures, the rotates are quite cheap, and using all
four tables needlessly thrashes the D-cache, and actually hurts rather
than helps performance.
Since x86 already has its own implementation of AEGIS based on AES-NI
instructions, let's tweak the generic implementation towards other
architectures by avoiding the prerotated tables and performing the
rotations inline. On ARM Cortex-A53, this results in a ~8% speedup.
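Schematically, a column lookup then uses one table plus inline
rotations instead of four prerotated tables (names below are
illustrative, not the exact library code):

static u32 mix_column(const u32 t[256], u32 x)
{
	return	      t[(x >>  0) & 0xff]	^
		rol32(t[(x >>  8) & 0xff],  8)	^
		rol32(t[(x >> 16) & 0xff], 16)	^
		rol32(t[(x >> 24) & 0xff], 24);
}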
Acked-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
TFM init/exit routines are optional, so no need to provide empty ones.
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Three variants of AEGIS were proposed for the CAESAR competition, and
only one was selected for the final portfolio: AEGIS128.
The other variants, AEGIS128L and AEGIS256, are not likely to ever turn
up in networking protocols or other places where interoperability
between Linux and other systems is a concern, nor are they likely to
be subjected to further cryptanalysis. However, uninformed users may
think that AEGIS128L (which is faster) is equally fit for use.
So let's remove them now, before anyone starts using them and we are
forced to support them forever.
Note that there are no known flaws in the algorithms or in any of these
implementations, but they have simply outlived their usefulness.
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
MORUS was not selected as a winner in the CAESAR competition, which
is not surprising since it is considered to be cryptographically
broken [0]. (Note that this is not an implementation defect, but a
flaw in the underlying algorithm). Since it is unlikely to be in use
currently, let's remove it before we're stuck with it.
[0] https://eprint.iacr.org/2019/172.pdf
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The scalar table based AES routines are not used by other drivers, so
let's keep it that way and unexport the symbols.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
There are a few copies of the AES S-boxes floating around, so export
the ones from the AES library so that we can reuse them in other
modules.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The versions of the AES lookup tables that are only used during the last
round are never used outside of the driver, so there is no need to
export their symbols.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Replace a couple of occurrences where the "aes-generic" cipher is
instantiated explicitly and only used for encryption of a single block.
Use AES library calls instead.
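In practice, that means replacing a crypto_cipher round trip with
direct library calls, roughly (ctx is a struct crypto_aes_ctx; error
handling omitted for brevity):

/* before: allocate a full cipher TFM just to encrypt one block */
tfm = crypto_alloc_cipher("aes", 0, 0);
crypto_cipher_setkey(tfm, key, keylen);
crypto_cipher_encrypt_one(tfm, dst, src);
crypto_free_cipher(tfm);

/* after: use the AES library directly */
aes_expandkey(&ctx, key, keylen);
aes_encrypt(&ctx, dst, src);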
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use the AES library instead of the cipher interface to perform
the single block of AES processing involved in updating the key
of the cmac(aes) hash.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The AMCC code for GCM key derivation allocates an AES cipher to
perform a single block encryption. So let's switch to the new
and more lightweight AES library instead.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>