linux-stable/arch/x86
Ard Biesheuvel 6c3d86e6ff crypto: x86/aes-ni-xts - use direct calls to and 4-way stride
commit 86ad60a65f upstream.

The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.

Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.
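
The change described above can be pictured with a minimal C sketch. Every name here (enc4_sketch, dec4_sketch, xts_8way_indirect_sketch, xts_encrypt_sketch, xts_decrypt_sketch) is a hypothetical stand-in, not one of the kernel's asm routines, and the toy byte transform only stands in for the real AES-XTS core. It contrasts the old shape (a function pointer picked from a boolean, called twice for 8 blocks) with the new shape (one routine per direction that calls the 4-way core directly and loops over any multiple of 4 blocks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XTS_BLOCK_SIZE    16	/* AES block size in bytes */
#define XTS_STRIDE_BLOCKS 4	/* blocks handled per core call */
#define XTS_STRIDE_BYTES  (XTS_STRIDE_BLOCKS * XTS_BLOCK_SIZE)

/* Hypothetical 4-way core transforms; a toy stand-in for the asm code. */
static void enc4_sketch(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < XTS_STRIDE_BYTES; i++)
		dst[i] = (uint8_t)(src[i] + 1);
}

static void dec4_sketch(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < XTS_STRIDE_BYTES; i++)
		dst[i] = (uint8_t)(src[i] - 1);
}

/* Old shape: 8-way helper calls the 4-way core indirectly, twice. */
typedef void (*core4_fn)(uint8_t *dst, const uint8_t *src);

static void xts_8way_indirect_sketch(uint8_t *dst, const uint8_t *src, bool enc)
{
	core4_fn fn = enc ? enc4_sketch : dec4_sketch;

	fn(dst, src);					/* indirect call */
	fn(dst + XTS_STRIDE_BYTES, src + XTS_STRIDE_BYTES);	/* indirect call */
}

/* New shape: direct calls, any multiple of 4 blocks per invocation. */
static void xts_encrypt_sketch(uint8_t *dst, const uint8_t *src, size_t nblocks)
{
	assert(nblocks % XTS_STRIDE_BLOCKS == 0);
	for (; nblocks > 0; nblocks -= XTS_STRIDE_BLOCKS) {
		enc4_sketch(dst, src);			/* direct call */
		dst += XTS_STRIDE_BYTES;
		src += XTS_STRIDE_BYTES;
	}
}

static void xts_decrypt_sketch(uint8_t *dst, const uint8_t *src, size_t nblocks)
{
	assert(nblocks % XTS_STRIDE_BLOCKS == 0);
	for (; nblocks > 0; nblocks -= XTS_STRIDE_BLOCKS) {
		dec4_sketch(dst, src);			/* direct call */
		dst += XTS_STRIDE_BYTES;
		src += XTS_STRIDE_BYTES;
	}
}
```

With a 512-byte stride, a caller in this sketch would pass nblocks = 32 and reach the core through a single outer call per sector.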

As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on an Intel(R) Core(TM) i7-8650U CPU).
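
The call counts quoted above can be checked with a small C sketch. The helper name indirect_calls_sketch is hypothetical, and the model is deliberately simple: it assumes the buffer is a whole number of chunks and ignores any partial tail.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of indirect calls needed to process total_bytes when
 * calls_per_chunk indirect calls are made for every chunk_bytes of data.
 * Partial trailing chunks are ignored in this simplified model.
 */
static size_t indirect_calls_sketch(size_t total_bytes,
				    size_t calls_per_chunk,
				    size_t chunk_bytes)
{
	assert(chunk_bytes != 0);
	return (total_bytes / chunk_bytes) * calls_per_chunk;
}
```

For a 1 KB block, indirect_calls_sketch(1024, 3, 64) gives 48 indirect calls under the old arrangement, while indirect_calls_sketch(1024, 1, 512) gives 2 under the new one.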

Fixes: 9697fa39ef ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[ardb: rebase onto stable/linux-5.4.y]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-20 10:39:47 +01:00
..
boot x86/asm: Replace __force_order with a memory clobber 2020-10-29 09:58:01 +01:00
configs vgacon: remove software scrollback support 2020-09-17 13:47:54 +02:00
crypto crypto: x86/aes-ni-xts - use direct calls to and 4-way stride 2021-03-20 10:39:47 +01:00
entry x86/asm/32: Add ENDs to some functions and relabel with SYM_CODE_* 2021-01-17 14:05:30 +01:00
events perf/x86/kvm: Add Cascade Lake Xeon steppings to isolation_ucodes[] 2021-03-07 12:20:47 +01:00
hyperv x86/hyperv: check cpu mask after interrupt has been disabled 2021-01-19 18:26:12 +01:00
ia32 syscalls/x86: Use COMPAT_SYSCALL_DEFINE0 for IA32 (rt_)sigreturn 2020-01-17 19:48:30 +01:00
include crypto: x86 - Regularize glue function prototypes 2021-03-20 10:39:47 +01:00
kernel x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2 2021-03-17 17:03:57 +01:00
kvm kvm: x86: replace kvm_spec_ctrl_test_value with runtime test on the host 2021-03-04 10:26:09 +01:00
lib x86/mmx: Use KFPU_387 for MMX string operations 2021-01-27 11:47:49 +01:00
math-emu x86: math-emu: Fix up 'cmp' insn for clang ias 2020-07-29 10:18:40 +02:00
mm x86: fix seq_file iteration for pat/memtype.c 2021-03-04 10:26:48 +01:00
net bpf, x86: Fix encoding for lower 8-bit registers in BPF_STX BPF_B 2020-05-02 08:48:55 +02:00
oprofile
pci x86/PCI: Fix intel_mid_pci.c build error when ACPI is not enabled 2020-11-01 12:01:02 +01:00
platform efi/x86: Free efi_pgd with free_pages() 2020-11-24 13:29:19 +01:00
power x86/asm/32: Add ENDs to some functions and relabel with SYM_CODE_* 2021-01-17 14:05:30 +01:00
purgatory x86/purgatory: Disable various profiling and sanitizing options 2020-06-24 17:50:20 +02:00
ras RAS/CEC: Add CONFIG_RAS_CEC_DEBUG and move CEC debug features there 2019-06-08 17:39:24 +02:00
realmode x86/asm/32: Add ENDs to some functions and relabel with SYM_CODE_* 2021-01-17 14:05:30 +01:00
tools x86/build: Treat R_386_PLT32 relocation as R_386_PC32 2021-03-07 12:20:45 +01:00
um um: Implement copy_thread_tls 2020-01-14 20:08:35 +01:00
video
xen Xen/gnttab: handle p2m update errors on a per-slot basis 2021-03-07 12:20:49 +01:00
.gitignore
Kbuild
Kconfig x86/tsx: Add config options to set tsx=on|off|auto 2019-10-28 09:12:18 +01:00
Kconfig.cpu x86/cpu: Create Zhaoxin processors architecture support file 2019-06-22 11:45:57 +02:00
Kconfig.debug x86, perf: Fix the dependency of the x86 insn decoder selftest 2019-09-02 20:05:58 +02:00
Makefile x86/build: Disable CET instrumentation in the kernel for 32-bit too 2021-02-17 10:35:17 +01:00
Makefile.um
Makefile_32.cpu