linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-08-24 09:50:04 +00:00

Author	SHA1	Message	Date
Yangtao Li	88d905e20b	crypto: cavium/nitrox - convert to DEFINE_SHOW_ATTRIBUTE Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:01 +08:00
Atul Gupta	8362ea16f6	crypto: chcr - ESN for Inline IPSec Tx Send SPI, 64b seq nos and 64b IV with aadiv drop for inline crypto. This information is added in outgoing packet after the CPL TX PKT XT and removed by hardware. The aad, auth and cipher offsets are then adjusted for ESN enabled tunnel. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:01 +08:00
Atul Gupta	c35828ea90	crypto: chcr - small packet Tx stalls the queue Immediate packets sent to hardware should include the work request length in calculating the flits. WR occupy one flit and if not accounted result in invalid request which stalls the HW queue. Cc: stable@vger.kernel.org Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:01 +08:00
Corentin Labbe	1f6669b971	crypto: user - Add crypto_stats_init This patch add the crypto_stats_init() function. This will permit to remove some ifdef from __crypto_register_alg(). Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	44f13133cb	crypto: user - rename err_cnt parameter Since now all crypto stats are on their own structures, it is now useless to have the algorithm name in the err_cnt member. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	17c18f9e33	crypto: user - Split stats in multiple structures Like for userspace, this patch splits stats into multiple structures, one for each algorithm class. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	5fff81729f	crypto: user - remove intermediate variable The use of the v64 intermediate variable is useless, and removing it bring to much readable code. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	b0af91c141	crypto: user - Fix invalid stat reporting Some error count use the wrong name for getting this data. But this had not caused any reporting problem, since all error count are shared in the same union. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	f7d76e05d0	crypto: user - fix use_after_free of struct xxx_request All crypto_stats functions use the struct xxx_request for feeding stats, but in some case this structure could already be freed. For fixing this, the needed parameters (len and alg) will be stored before the request being executed. Fixes: `cac5818c25` ("crypto: user - Implement a generic crypto statistics") Reported-by: syzbot <syzbot+6939a606a5305e9e9799@syzkaller.appspotmail.com> Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	76d09ea7c2	crypto: tool: getstat: convert user space example to the new crypto_user_stat uapi This patch converts the getstat example tool to the recent changes done in crypto_user_stat - changed all stats to u64 - separated struct stats for each crypto alg Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	7f0a9d5c9d	crypto: user - split user space crypto stat structures It is cleaner to have each stat in their own structures. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	6e8e72cd20	crypto: user - convert all stats from u32 to u64 All the 32-bit fields need to be 64-bit. In some cases, UINT32_MAX crypto operations can be done in seconds. Reported-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	a6a3138536	crypto: user - CRYPTO_STATS should depend on CRYPTO_USER CRYPTO_STATS is using CRYPTO_USER stuff, so it should depends on it. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Corentin Labbe	2ced26078f	crypto: user - made crypto_user_stat optional Even if CRYPTO_STATS is set to n, some part of CRYPTO_STATS are compiled. This patch made all part of crypto_user_stat uncompiled in that case. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:15:00 +08:00
Paulo Flabiano Smorigo	c97e4df573	MAINTAINERS: change NX/VMX maintainers Add Breno and Nayna as NX/VMX crypto driver maintainers. Also change my email address to my personal account and remove Leonidas since he's not working with the driver anymore. Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:14:59 +08:00
Gilad Ben-Yossef	18596781e0	MAINTAINERS: ccree: add co-maintainer Add Yael Chemla as co-maintainer of Arm TrustZone CryptoCell REE driver. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:14:59 +08:00
Gilad Ben-Yossef	fefbc0b4bc	dt-bindings: crypto: ccree: add dt bindings for ccree 703 Add device tree bindings associating Arm TrustZone CryptoCell 703 with the ccree driver. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:14:59 +08:00
Gilad Ben-Yossef	1c876a90e2	crypto: ccree - add support for CryptoCell 703 Add support for Arm TrustZone CryptoCell 703. The 703 is a variant of the CryptoCell 713 that supports only algorithms certified by the Chinesse Office of the State Commercial Cryptography Administration (OSCCA). Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 14:14:59 +08:00
Herbert Xu	946dca8fe4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Merge crypto tree to pick up crypto stats API revert.	2018-12-07 13:59:10 +08:00
Herbert Xu	e61efff4ae	crypto: user - Disable statistics interface Since this user-space API is still undergoing significant changes, this patch disables it for the current merge window. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-12-07 13:56:08 +08:00
Srikanth, Jampala	7a027b57f9	crypto: cavium/nitrox - Enable interrups for PF in SR-IOV mode. Enable the available interrupt vectors for PF in SR-IOV Mode. Only single vector entry 192 is valid of PF. This is used to notify any hardware errors and mailbox messages from VF(s). Signed-off-by: Srikanth Jampala <Jampala.Srikanth@cavium.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-29 16:27:04 +08:00
Nagadheeraj, Rottela	4bede34c1a	crypto: cavium/nitrox - crypto request format changes nitrox_skcipher_crypt() will do the necessary formatting/ordering of input and output sglists based on the algorithm requirements. It will also accommodate the mandatory output buffers required for NITROX hardware like Output request headers (ORH) and Completion headers. Signed-off-by: Nagadheeraj Rottela <rottela.nagadheeraj@cavium.com> Reviewed-by: Srikanth Jampala <Jampala.Srikanth@cavium.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-29 16:27:04 +08:00
Martin Willi	180def6c4a	crypto: x86/chacha20 - Add a 4-block AVX-512VL variant This version uses the same principle as the AVX2 version by scheduling the operations for two block pairs in parallel. It benefits from the AVX-512VL rotate instructions and the more efficient partial block handling using "vmovdqu8", resulting in a speedup of the raw block function of ~20%. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-29 16:27:04 +08:00
Martin Willi	29a47b54e0	crypto: x86/chacha20 - Add a 2-block AVX-512VL variant This version uses the same principle as the AVX2 version. It benefits from the AVX-512VL rotate instructions and the more efficient partial block handling using "vmovdqu8", resulting in a speedup of ~20%. Unlike the AVX2 version, it is faster than the single block SSSE3 version to process a single block. Hence we engage that function for (partial) single block lengths as well. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-29 16:27:04 +08:00
Martin Willi	cee7a36ecb	crypto: x86/chacha20 - Add a 8-block AVX-512VL variant This variant is similar to the AVX2 version, but benefits from the AVX-512 rotate instructions and the additional registers, so it can operate without any data on the stack. It uses ymm registers only to avoid the massive core throttling on Skylake-X platforms. Nontheless does it bring a ~30% speed improvement compared to the AVX2 variant for random encryption lengths. The AVX2 version uses "rep movsb" for partial block XORing via the stack. With AVX-512, the new "vmovdqu8" can do this much more efficiently. The associated "kmov" instructions to work with dynamic masks is not part of the AVX-512VL instruction set, hence we depend on AVX-512BW as well. Given that the major AVX-512VL architectures provide AVX-512BW and this extension does not affect core clocking, this seems to be no problem at least for now. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-29 16:27:04 +08:00
Pan Bian	e5bde04ccc	crypto: do not free algorithm before using In multiple functions, the algorithm fields are read after its reference is dropped through crypto_mod_put. In this case, the algorithm memory may be freed, resulting in use-after-free bugs. This patch delays the put operation until the algorithm is never used. Fixes: `79c65d179a` ("crypto: cbc - Convert to skcipher") Fixes: `a7d85e06ed` ("crypto: cfb - add support for Cipher FeedBack mode") Fixes: `043a44001b` ("crypto: pcbc - Convert to skcipher") Cc: <stable@vger.kernel.org> Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-29 14:53:59 +08:00
Eric Biggers	059c2a4d8e	crypto: adiantum - add Adiantum support Add support for the Adiantum encryption mode. Adiantum was designed by Paul Crowley and is specified by our paper: Adiantum: length-preserving encryption for entry-level processors (https://eprint.iacr.org/2018/720.pdf) See our paper for full details; this patch only provides an overview. Adiantum is a tweakable, length-preserving encryption mode designed for fast and secure disk encryption, especially on CPUs without dedicated crypto instructions. Adiantum encrypts each sector using the XChaCha12 stream cipher, two passes of an ε-almost-∆-universal (εA∆U) hash function, and an invocation of the AES-256 block cipher on a single 16-byte block. On CPUs without AES instructions, Adiantum is much faster than AES-XTS; for example, on ARM Cortex-A7, on 4096-byte sectors Adiantum encryption is about 4 times faster than AES-256-XTS encryption, and decryption about 5 times faster. Adiantum is a specialization of the more general HBSH construction. Our earlier proposal, HPolyC, was also a HBSH specialization, but it used a different εA∆U hash function, one based on Poly1305 only. Adiantum's εA∆U hash function, which is based primarily on the "NH" hash function like that used in UMAC (RFC4418), is about twice as fast as HPolyC's; consequently, Adiantum is about 20% faster than HPolyC. This speed comes with no loss of security: Adiantum is provably just as secure as HPolyC, in fact slightly more secure. Like HPolyC, Adiantum's security is reducible to that of XChaCha12 and AES-256, subject to a security bound. XChaCha12 itself has a security reduction to ChaCha12. Therefore, one need not "trust" Adiantum; one need only trust ChaCha12 and AES-256. Note that the εA∆U hash function is only used for its proven combinatorical properties so cannot be "broken". Adiantum is also a true wide-block encryption mode, so flipping any plaintext bit in the sector scrambles the entire ciphertext, and vice versa. No other such mode is available in the kernel currently; doing the same with XTS scrambles only 16 bytes. Adiantum also supports arbitrary-length tweaks and naturally supports any length input >= 16 bytes without needing "ciphertext stealing". For the stream cipher, Adiantum uses XChaCha12 rather than XChaCha20 in order to make encryption feasible on the widest range of devices. Although the 20-round variant is quite popular, the best known attacks on ChaCha are on only 7 rounds, so ChaCha12 still has a substantial security margin; in fact, larger than AES-256's. 12-round Salsa20 is also the eSTREAM recommendation. For the block cipher, Adiantum uses AES-256, despite it having a lower security margin than XChaCha12 and needing table lookups, due to AES's extensive adoption and analysis making it the obvious first choice. Nevertheless, for flexibility this patch also permits the "adiantum" template to be instantiated with XChaCha20 and/or with an alternate block cipher. We need Adiantum support in the kernel for use in dm-crypt and fscrypt, where currently the only other suitable options are block cipher modes such as AES-XTS. A big problem with this is that many low-end mobile devices (e.g. Android Go phones sold primarily in developing countries, as well as some smartwatches) still have CPUs that lack AES instructions, e.g. ARM Cortex-A7. Sadly, AES-XTS encryption is much too slow to be viable on these devices. We did find that some "lightweight" block ciphers are fast enough, but these suffer from problems such as not having much cryptanalysis or being too controversial. The ChaCha stream cipher has excellent performance but is insecure to use directly for disk encryption, since each sector's IV is reused each time it is overwritten. Even restricting the threat model to offline attacks only isn't enough, since modern flash storage devices don't guarantee that "overwrites" are really overwrites, due to wear-leveling. Adiantum avoids this problem by constructing a "tweakable super-pseudorandom permutation"; this is the strongest possible security model for length-preserving encryption. Of course, storing random nonces along with the ciphertext would be the ideal solution. But doing that with existing hardware and filesystems runs into major practical problems; in most cases it would require data journaling (like dm-integrity) which severely degrades performance. Thus, for now length-preserving encryption is still needed. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	16aae3595a	crypto: arm/nhpoly1305 - add NEON-accelerated NHPoly1305 Add an ARM NEON implementation of NHPoly1305, an ε-almost-∆-universal hash function used in the Adiantum encryption mode. For now, only the NH portion is actually NEON-accelerated; the Poly1305 part is less performance-critical so is just implemented in C. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	26609a21a9	crypto: nhpoly1305 - add NHPoly1305 support Add a generic implementation of NHPoly1305, an ε-almost-∆-universal hash function used in the Adiantum encryption mode. CONFIG_NHPOLY1305 is not selectable by itself since there won't be any real reason to enable it without also enabling Adiantum support. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	1b6fd3d5d1	crypto: poly1305 - add Poly1305 core API Expose a low-level Poly1305 API which implements the ε-almost-∆-universal (εA∆U) hash function underlying the Poly1305 MAC and supports block-aligned inputs only. This is needed for Adiantum hashing, which builds an εA∆U hash function from NH and a polynomial evaluation in GF(2^{130}-5); this polynomial evaluation is identical to the one the Poly1305 MAC does. However, the crypto_shash Poly1305 API isn't very appropriate for this because its calling convention assumes it is used as a MAC, with a 32-byte "one-time key" provided for every digest. But by design, in Adiantum hashing the performance of the polynomial evaluation isn't nearly as critical as NH. So it suffices to just have some C helper functions. Thus, this patch adds such functions. Acked-by: Martin Willi <martin@strongswan.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	878afc35cd	crypto: poly1305 - use structures for key and accumulator In preparation for exposing a low-level Poly1305 API which implements the ε-almost-∆-universal (εA∆U) hash function underlying the Poly1305 MAC and supports block-aligned inputs only, create structures poly1305_key and poly1305_state which hold the limbs of the Poly1305 "r" key and accumulator, respectively. These structures could actually have the same type (e.g. poly1305_val), but different types are preferable, to prevent misuse. Acked-by: Martin Willi <martin@strongswan.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	bdb063a79f	crypto: arm/chacha - add XChaCha12 support Now that the 32-bit ARM NEON implementation of ChaCha20 and XChaCha20 has been refactored to support varying the number of rounds, add support for XChaCha12. This is identical to XChaCha20 except for the number of rounds, which is 12 instead of 20. XChaCha12 is faster than XChaCha20 but has a lower security margin, though still greater than AES-256's since the best known attacks make it through only 7 rounds. See the patch "crypto: chacha - add XChaCha12 support" for more details about why we need XChaCha12 support. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	3cc215198e	crypto: arm/chacha20 - refactor to allow varying number of rounds In preparation for adding XChaCha12 support, rename/refactor the NEON implementation of ChaCha20 to support different numbers of rounds. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	d97a94309d	crypto: arm/chacha20 - add XChaCha20 support Add an XChaCha20 implementation that is hooked up to the ARM NEON implementation of ChaCha20. This is needed for use in the Adiantum encryption mode; see the generic code patch, "crypto: chacha20-generic - add XChaCha20 support", for more details. We also update the NEON code to support HChaCha20 on one block, so we can use that in XChaCha20 rather than calling the generic HChaCha20. This required factoring the permutation out into its own macro. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	be2830b15b	crypto: arm/chacha20 - limit the preemption-disabled section To improve responsivesess, disable preemption for each step of the walk (which is at most PAGE_SIZE) rather than for the entire encryption/decryption operation. Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:56 +08:00
Eric Biggers	aa7624093c	crypto: chacha - add XChaCha12 support Now that the generic implementation of ChaCha20 has been refactored to allow varying the number of rounds, add support for XChaCha12, which is the XSalsa construction applied to ChaCha12. ChaCha12 is one of the three ciphers specified by the original ChaCha paper (https://cr.yp.to/chacha/chacha-20080128.pdf: "ChaCha, a variant of Salsa20"), alongside ChaCha8 and ChaCha20. ChaCha12 is faster than ChaCha20 but has a lower, but still large, security margin. We need XChaCha12 support so that it can be used in the Adiantum encryption mode, which enables disk/file encryption on low-end mobile devices where AES-XTS is too slow as the CPUs lack AES instructions. We'd prefer XChaCha20 (the more popular variant), but it's too slow on some of our target devices, so at least in some cases we do need the XChaCha12-based version. In more detail, the problem is that Adiantum is still much slower than we're happy with, and encryption still has a quite noticeable effect on the feel of low-end devices. Users and vendors push back hard against encryption that degrades the user experience, which always risks encryption being disabled entirely. So we need to choose the fastest option that gives us a solid margin of security, and here that's XChaCha12. The best known attack on ChaCha breaks only 7 rounds and has 2^235 time complexity, so ChaCha12's security margin is still better than AES-256's. Much has been learned about cryptanalysis of ARX ciphers since Salsa20 was originally designed in 2005, and it now seems we can be comfortable with a smaller number of rounds. The eSTREAM project also suggests the 12-round version of Salsa20 as providing the best balance among the different variants: combining very good performance with a "comfortable margin of security". Note that it would be trivial to add vanilla ChaCha12 in addition to XChaCha12. However, it's unneeded for now and therefore is omitted. As discussed in the patch that introduced XChaCha20 support, I considered splitting the code into separate chacha-common, chacha20, xchacha20, and xchacha12 modules, so that these algorithms could be enabled/disabled independently. However, since nearly all the code is shared anyway, I ultimately decided there would have been little benefit to the added complexity. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Martin Willi <martin@strongswan.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Eric Biggers	1ca1b91794	crypto: chacha20-generic - refactor to allow varying number of rounds In preparation for adding XChaCha12 support, rename/refactor chacha20-generic to support different numbers of rounds. The justification for needing XChaCha12 support is explained in more detail in the patch "crypto: chacha - add XChaCha12 support". The only difference between ChaCha{8,12,20} are the number of rounds itself; all other parts of the algorithm are the same. Therefore, remove the "20" from all definitions, structures, functions, files, etc. that will be shared by all ChaCha versions. Also make ->setkey() store the round count in the chacha_ctx (previously chacha20_ctx). The generic code then passes the round count through to chacha_block(). There will be a ->setkey() function for each explicitly allowed round count; the encrypt/decrypt functions will be the same. I decided not to do it the opposite way (same ->setkey() function for all round counts, with different encrypt/decrypt functions) because that would have required more boilerplate code in architecture-specific implementations of ChaCha and XChaCha. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Martin Willi <martin@strongswan.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Eric Biggers	de61d7ae5d	crypto: chacha20-generic - add XChaCha20 support Add support for the XChaCha20 stream cipher. XChaCha20 is the application of the XSalsa20 construction (https://cr.yp.to/snuffle/xsalsa-20081128.pdf) to ChaCha20 rather than to Salsa20. XChaCha20 extends ChaCha20's nonce length from 64 bits (or 96 bits, depending on convention) to 192 bits, while provably retaining ChaCha20's security. XChaCha20 uses the ChaCha20 permutation to map the key and first 128 nonce bits to a 256-bit subkey. Then, it does the ChaCha20 stream cipher with the subkey and remaining 64 bits of nonce. We need XChaCha support in order to add support for the Adiantum encryption mode. Note that to meet our performance requirements, we actually plan to primarily use the variant XChaCha12. But we believe it's wise to first add XChaCha20 as a baseline with a higher security margin, in case there are any situations where it can be used. Supporting both variants is straightforward. Since XChaCha20's subkey differs for each request, XChaCha20 can't be a template that wraps ChaCha20; that would require re-keying the underlying ChaCha20 for every request, which wouldn't be thread-safe. Instead, we make XChaCha20 its own top-level algorithm which calls the ChaCha20 streaming implementation internally. Similar to the existing ChaCha20 implementation, we define the IV to be the nonce and stream position concatenated together. This allows users to seek to any position in the stream. I considered splitting the code into separate chacha20-common, chacha20, and xchacha20 modules, so that chacha20 and xchacha20 could be enabled/disabled independently. However, since nearly all the code is shared anyway, I ultimately decided there would have been little benefit to the added complexity of separate modules. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Martin Willi <martin@strongswan.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Eric Biggers	5e04542a0e	crypto: chacha20-generic - don't unnecessarily use atomic walk chacha20-generic doesn't use SIMD instructions or otherwise disable preemption, so passing atomic=true to skcipher_walk_virt() is unnecessary. Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Martin Willi <martin@strongswan.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Eric Biggers	dd333449d0	crypto: chacha20-generic - add HChaCha20 library function Refactor the unkeyed permutation part of chacha20_block() into its own function, then add hchacha20_block() which is the ChaCha equivalent of HSalsa20 and is an intermediate step towards XChaCha20 (see https://cr.yp.to/snuffle/xsalsa-20081128.pdf). HChaCha20 skips the final addition of the initial state, and outputs only certain words of the state. It should not be used for streaming directly. Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Martin Willi <martin@strongswan.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Eric Biggers	3d234b3313	crypto: drop mask=CRYPTO_ALG_ASYNC from 'shash' tfm allocations 'shash' algorithms are always synchronous, so passing CRYPTO_ALG_ASYNC in the mask to crypto_alloc_shash() has no effect. Many users therefore already don't pass it, but some still do. This inconsistency can cause confusion, especially since the way the 'mask' argument works is somewhat counterintuitive. Thus, just remove the unneeded CRYPTO_ALG_ASYNC flags. This patch shouldn't change any actual behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Eric Biggers	1ad0f1603a	crypto: drop mask=CRYPTO_ALG_ASYNC from 'cipher' tfm allocations 'cipher' algorithms (single block ciphers) are always synchronous, so passing CRYPTO_ALG_ASYNC in the mask to crypto_alloc_cipher() has no effect. Many users therefore already don't pass it, but some still do. This inconsistency can cause confusion, especially since the way the 'mask' argument works is somewhat counterintuitive. Thus, just remove the unneeded CRYPTO_ALG_ASYNC flags. This patch shouldn't change any actual behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Eric Biggers	d41655909e	crypto: remove useless initializations of cra_list Some algorithms initialize their .cra_list prior to registration. But this is unnecessary since crypto_register_alg() will overwrite .cra_list when adding the algorithm to the 'crypto_alg_list'. Apparently the useless assignment has just been copy+pasted around. So, remove the useless assignments. Exception: paes_s390.c uses cra_list to check whether the algorithm is registered or not, so I left that as-is for now. This patch shouldn't change any actual behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Eric Biggers	2b78aeb366	crypto: inside-secure - remove useless setting of type flags Remove the unnecessary setting of CRYPTO_ALG_TYPE_SKCIPHER. Commit `2c95e6d978` ("crypto: skcipher - remove useless setting of type flags") took care of this everywhere else, but a few more instances made it into the tree at about the same time. Squash them before they get copy+pasted around again. This patch shouldn't change any actual behavior. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-20 14:26:55 +08:00
Vitaly Chikunov	3da2c1dfdb	crypto: ecc - regularize scalar for scalar multiplication ecc_point_mult is supposed to be used with a regularized scalar, otherwise, it's possible to deduce the position of the top bit of the scalar with timing attack. This is important when the scalar is a private key. ecc_point_mult is already using a regular algorithm (i.e. having an operation flow independent of the input scalar) but regularization step is not implemented. Arrange scalar to always have fixed top bit by adding a multiple of the curve order (n). References: The constant time regularization step is based on micro-ecc by Kenneth MacKay and also referenced in the literature (Bernstein, D. J., & Lange, T. (2017). Montgomery curves and the Montgomery ladder. (Cryptology ePrint Archive; Vol. 2017/293). s.l.: IACR. Chapter 4.6.2.) Signed-off-by: Vitaly Chikunov <vt@altlinux.org> Cc: kernel-hardening@lists.openwall.com Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-16 14:11:04 +08:00
Martin Willi	8a5a79d555	crypto: x86/chacha20 - Add a 4-block AVX2 variant This variant builds upon the idea of the 2-block AVX2 variant that shuffles words after each round. The shuffling has a rather high latency, so the arithmetic units are not optimally used. Given that we have plenty of registers in AVX, this version parallelizes the 2-block variant to do four blocks. While the first two blocks are shuffling, the CPU can do the XORing on the second two blocks and vice-versa, which makes this version much faster than the SSSE3 variant for four blocks. The latter is now mostly for systems that do not have AVX2, but there it is the work-horse, so we keep it in place. The partial XORing function trailer is very similar to the AVX2 2-block variant. While it could be shared, that code segment is rather short; profiling is also easier with the trailer integrated, so we keep it per function. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-16 14:11:04 +08:00
Martin Willi	a5dd97f862	crypto: x86/chacha20 - Add a 2-block AVX2 variant This variant uses the same principle as the single block SSSE3 variant by shuffling the state matrix after each round. With the wider AVX registers, we can do two blocks in parallel, though. This function can increase performance and efficiency significantly for lengths that would otherwise require a 4-block function. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-16 14:11:04 +08:00
Martin Willi	9b17608f15	crypto: x86/chacha20 - Use larger block functions more aggressively Now that all block functions support partial lengths, engage the wider block sizes more aggressively. This prevents using smaller block functions multiple times, where the next larger block function would have been faster. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-16 14:11:04 +08:00
Martin Willi	c3b734dd32	crypto: x86/chacha20 - Support partial lengths in 8-block AVX2 variant Add a length argument to the eight block function for AVX2, so the block function may XOR only a partial length of eight blocks. To avoid unnecessary operations, we integrate XORing of the first four blocks in the final lane interleaving; this also avoids some work in the partial lengths path. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-16 14:11:04 +08:00
Martin Willi	db8e15a249	crypto: x86/chacha20 - Support partial lengths in 4-block SSSE3 variant Add a length argument to the quad block function for SSSE3, so the block function may XOR only a partial length of four blocks. As we already have the stack set up, the partial XORing does not need to. This gives a slightly different function trailer, so we keep that separate from the 1-block function. Signed-off-by: Martin Willi <martin@strongswan.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2018-11-16 14:11:04 +08:00

1 2 3 4 5 ...

796699 commits