Commit graph

1074446 commits

Author SHA1 Message Date
Peter Zijlstra
af22700390 x86/ibt,kexec: Disable CET on kexec
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.641454603@infradead.org
2022-03-15 10:32:39 +01:00
Peter Zijlstra
991625f3dd x86/ibt: Add IBT feature, MSR and #CP handling
The bits required to make the hardware go.. Of note is that, provided
the syscall entry points are covered with ENDBR, #CP doesn't need to
be an IST because we'll never hit the syscall gap.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.582331711@infradead.org
2022-03-15 10:32:39 +01:00
Peter Zijlstra
0aec21cfb5 x86/ibt,ftrace: Add ENDBR to samples/ftrace
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.523421433@infradead.org
2022-03-15 10:32:38 +01:00
Peter Zijlstra
5891271055 x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline
With IBT enabled builds we need ENDBR instructions at indirect jump
target sites, since we start execution of the JIT'ed code through an
indirect jump, the very first instruction needs to be ENDBR.

Similarly, since eBPF tail-calls use indirect branches, their landing
site needs to be an ENDBR too.

The trampolines need similar adjustment.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Fixed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.464998838@infradead.org
2022-03-15 10:32:38 +01:00
Peter Zijlstra
cc66bb9145 x86/ibt,kprobes: Cure sym+0 equals fentry woes
In order to allow kprobes to skip the ENDBR instructions at sym+0 for
X86_KERNEL_IBT builds, change _kprobe_addr() to take an architecture
callback to inspect the function at hand and modify the offset if
needed.

This streamlines the existing interface to cover more cases and
require less hooks. Once PowerPC gets fully converted there will only
be the one arch hook.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.405947704@infradead.org
2022-03-15 10:32:38 +01:00
Peter Zijlstra
e52fc2cf3f x86/ibt,ftrace: Make function-graph play nice
Return trampoline must not use indirect branch to return; while this
preserves the RSB, it is fundamentally incompatible with IBT. Instead
use a retpoline like ROP gadget that defeats IBT while not unbalancing
the RSB.

And since ftrace_stub is no longer a plain RET, don't use it to copy
from. Since RET is a trivial instruction, poke it directly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org
2022-03-15 10:32:37 +01:00
Peter Zijlstra
d15cb3dab1 x86/livepatch: Validate __fentry__ location
Currently livepatch assumes __fentry__ lives at func+0, which is most
likely untrue with IBT on. Instead make it use ftrace_location() by
default which both validates and finds the actual ip if there is any
in the same symbol.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.285971256@infradead.org
2022-03-15 10:32:37 +01:00
Peter Zijlstra
aebfd12521 x86/ibt,ftrace: Search for __fentry__ location
Currently a lot of ftrace code assumes __fentry__ is at sym+0. However
with Intel IBT enabled the first instruction of a function will most
likely be ENDBR.

Change ftrace_location() to not only return the __fentry__ location
when called for the __fentry__ location, but also when called for the
sym+0 location.

Then audit/update all callsites of this function to consistently use
these new semantics.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.227581603@infradead.org
2022-03-15 10:32:37 +01:00
Peter Zijlstra
6649fa876d x86/ibt,kvm: Add ENDBR to fastops
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.168850084@infradead.org
2022-03-15 10:32:37 +01:00
Peter Zijlstra
214b9a83b6 x86/ibt,crypto: Add ENDBR for the jump-table entries
The code does:

	## branch into array
	mov     jump_table(,%rax,8), %bufp
	JMP_NOSPEC bufp

resulting in needing to mark the jump-table entries with ENDBR.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.110500806@infradead.org
2022-03-15 10:32:36 +01:00
Peter Zijlstra
c3b037917c x86/ibt,paravirt: Sprinkle ENDBR
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.051635891@infradead.org
2022-03-15 10:32:36 +01:00
Peter Zijlstra
c4691712b5 x86/linkage: Add ENDBR to SYM_FUNC_START*()
Ensure the ASM functions have ENDBR on for IBT builds, this follows
the ARM64 example. Unlike ARM64, we'll likely end up overwriting them
with poison.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.992708941@infradead.org
2022-03-15 10:32:36 +01:00
Peter Zijlstra
8f93402b92 x86/ibt,entry: Sprinkle ENDBR dust
Kernel entry points should be having ENDBR on for IBT configs.

The SYSCALL entry points are found through taking their respective
address in order to program them in the MSRs, while the exception
entry points are found through UNWIND_HINT_IRET_REGS.

The rule is that any UNWIND_HINT_IRET_REGS at sym+0 should have an
ENDBR, see the later objtool ibt validation patch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.933157479@infradead.org
2022-03-15 10:32:35 +01:00
Peter Zijlstra
5b2fc51576 x86/ibt,xen: Sprinkle the ENDBR
Even though Xen currently doesn't advertise IBT, prepare for when it
will eventually do so and sprinkle the ENDBR dust accordingly.

Even though most of the entry points are IRET like, the CPL0
Hypervisor can set WAIT-FOR-ENDBR and demand ENDBR at these sites.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.873919996@infradead.org
2022-03-15 10:32:35 +01:00
Peter Zijlstra
8b87d8cec1 x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()
By doing an early rewrite of 'jmp native_iret` in
restore_regs_and_return_to_kernel() we can get rid of the last
INTERRUPT_RETURN user and paravirt_iret.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.815039833@infradead.org
2022-03-15 10:32:34 +01:00
Peter Zijlstra
6cf3e4c0d2 x86/entry: Cleanup PARAVIRT
Since commit 5c8f6a2e31 ("x86/xen: Add
xenpv_restore_regs_and_return_to_usermode()") Xen will no longer reach
this code and we can do away with the paravirt
SWAPGS/INTERRUPT_RETURN.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.756014488@infradead.org
2022-03-15 10:32:34 +01:00
Peter Zijlstra
ba27d1a808 x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch()
Less duplication is more better.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.697253958@infradead.org
2022-03-15 10:32:34 +01:00
Peter Zijlstra
bbf92368b0 x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.638561109@infradead.org
2022-03-15 10:32:33 +01:00
Peter Zijlstra
c8c301abea x86/ibt: Add ANNOTATE_NOENDBR
In order to have objtool warn about code references to !ENDBR
instruction, we need an annotation to allow this for non-control-flow
instances -- consider text range checks, text patching, or return
trampolines etc.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.578968224@infradead.org
2022-03-15 10:32:33 +01:00
Peter Zijlstra
156ff4a544 x86/ibt: Base IBT bits
Add Kconfig, Makefile and basic instruction support for x86 IBT.

(Ab)use __DISABLE_EXPORTS to disable IBT since it's already employed
to mark compressed and purgatory. Additionally mark realmode with it
as well to avoid inserting ENDBR instructions there. While ENDBR is
technically a NOP, inserting them was causing some grief due to code
growth. There's also a problem with using __noendbr in code compiled
without -fcf-protection=branch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.519875203@infradead.org
2022-03-15 10:32:33 +01:00
Peter Zijlstra
5cff2086b0 objtool: Have WARN_FUNC fall back to sym+off
Currently WARN_FUNC() either prints func+off and failing that prints
sec+off, add an intermediate sym+off. This is useful when playing
around with entry code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.461283840@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
537da1ed54 objtool,efi: Update __efi64_thunk annotation
The current annotation relies on not running objtool on the file; this
won't work when running objtool on vmlinux.o. Instead explicitly mark
__efi64_thunk() to be ignored.

This preserves the status quo, which is somewhat unfortunate. Luckily
this code is hardly ever used.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.402118218@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
1ffbe4e935 objtool: Default ignore INT3 for unreachable
Ignore all INT3 instructions for unreachable code warnings, similar to NOP.
This allows using INT3 for various paddings instead of NOPs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.343312938@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
f2d3a25089 objtool: Add --dry-run
Add a --dry-run argument to skip writing the modifications. This is
convenient for debugging.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.282720146@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
b44544fe02 static_call: Avoid building empty .static_call_sites
Without CONFIG_HAVE_STATIC_CALL_INLINE there's no point in creating
the .static_call_sites section and it's related symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.223798256@infradead.org
2022-03-15 10:32:31 +01:00
Peter Zijlstra
599d66b847 Merge branch 'arm64/for-next/linkage'
Enjoy the cleanups and avoid conflicts vs linkage

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-03-15 10:32:31 +01:00
Fenghua Yu
227a06553f tools/objtool: Check for use of the ENQCMD instruction in the kernel
The ENQCMD instruction implicitly accesses the PASID_MSR to fill in the
pasid field of the descriptor being submitted to an accelerator. But
there is no precise (and stable across kernel changes) point at which
the PASID_MSR is updated from the value for one task to the next.

Kernel code that uses accelerators must always use the ENQCMDS instruction
which does not access the PASID_MSR.

Check for use of the ENQCMD instruction in the kernel and warn on its
usage.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220207230254.3342514-11-fenghua.yu@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-03-15 10:32:30 +01:00
Linus Torvalds
09688c0166 Linux 5.17-rc8 2022-03-13 13:23:37 -07:00
Linus Torvalds
f0e18b03fc - Free shmem backing storage for SGX enclave pages when those are
swapped back into EPC memory
 
 - Prevent do_int3() from being kprobed, to avoid recursion
 
 - Remap setup_data and setup_indirect structures properly when accessing
 their members
 
 - Correct the alternatives patching order for modules too
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmItzJgACgkQEsHwGGHe
 VUqaow/8C115xuZEBn+iT+adcQxbqrg3S2en/Hq0aJOEhkNkbOhgAW0OWHvj7Gs3
 +2taD35MqzneEOfa0Gv46600V4+SV5K5NAFndr4PA2FVgIw01rEQios2oc4QSQBP
 PVJgvGyIMpN71ODKTiZ8w4ihp3J7MWDkCP1z4hbO/lfM4tOXcYzh2Lv1fE8hHr5b
 qFtPDyYgfEKUVFa+sv2sE1cJw670UFDcFqGAIjxUUm0r78GKmPz08gZm9YiBTJgV
 jrxySdpAh/eaPeHNfFH9RzAD2ZGZppgIkPCp33ZdrMEhnZmwLz7vc76BMbkD2P6w
 1fBmBZ5F8yOMaaLHSGh4Ek5Gs3p9DjmaZdEWwz+yiIe1RFLKyOQu6gsmGbAyuQx4
 KSfPFnfkOfw/7cz6BSp3Sh6zgrGPqloIVcHkWRth/LJZSV/fVgM8bPg3VLJP6WFi
 o4WTcNAq/fNMAmGwtIVpTUW/QJafXvOauKkDGQkMQ87U68QSh6uDrvrvMHPF8W+Y
 SPcYrdsAPagLxq0GCCQ6doSvBjWNTolXfTnfAoATZpae0URmrvu9ddgUbIlgeQWY
 n/rK+cKk+iuLTEZC55+v5OALwEMOM3Tuz4Ghko8re0pkD/kE61m3Az6w5sKN3Inc
 c21tvO/dxHhAnHV+34d2LM27PU4qoFdVO2mPup702x68XT+X0/g=
 =YLph
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Free shmem backing storage for SGX enclave pages when those are
   swapped back into EPC memory

 - Prevent do_int3() from being kprobed, to avoid recursion

 - Remap setup_data and setup_indirect structures properly when
   accessing their members

 - Correct the alternatives patching order for modules too

* tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Free backing memory after faulting the enclave page
  x86/traps: Mark do_int3() NOKPROBE_SYMBOL
  x86/boot: Add setup_indirect support in early_memremap_is_setup_data()
  x86/boot: Fix memremap of setup_indirect structures
  x86/module: Fix the paravirt vs alternative order
2022-03-13 10:36:38 -07:00
Linus Torvalds
aad611a868 perf tools fixes for 5.17: 4th batch
- Fix event parser error for hybrid systems.
 
 - Fix NULL check against wrong variable in 'perf bench' and in the parsing
   code.
 
 - Update arm64 KVM headers from the kernel sources.
 
 - Sync cpufeatures header with the kernel sources
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYiy4oAAKCRCyPKLppCJ+
 J0ZfAQD0PfI+Pe9b8kVb4hyy6jMafulryUnyZTf0Zj6Sc+WWoQD+PquQDlT3f708
 MRW6O+M3c8yf0zmMjJuFvj5dr0A2hws=
 =RKL0
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.17-2022-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix event parser error for hybrid systems

 - Fix NULL check against wrong variable in 'perf bench' and in the
   parsing code

 - Update arm64 KVM headers from the kernel sources

 - Sync cpufeatures header with the kernel sources

* tag 'perf-tools-fixes-for-v5.17-2022-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf parse: Fix event parser error for hybrid systems
  perf bench: Fix NULL check against wrong variable
  perf parse-events: Fix NULL check against wrong variable
  tools headers cpufeatures: Sync with the kernel sources
  tools kvm headers arm64: Update KVM headers from the kernel sources
2022-03-12 10:29:25 -08:00
Linus Torvalds
1518a4f636 drm kconfig fix for 5.17-rc8
- fix regression in Kconfig.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmIsTy4ACgkQDHTzWXnE
 hr57Ww/+KR4+aWU8NRQZbzd+0n5C4eL2t3HsqgYbi3ZLJhKQsg5925Xm3RTYqOqB
 j6Pe9Qtxjo1WIHCTZ3wg9ZW+WvHv9Lfv0t/n+dX1DphbZ0qX4+gnMeX4RW1TCkgz
 S2GAh/NZ12KGzXIvwVNkvq3DCx444Vg04md4V/eD8Iu9FEt3d7HvpiFbfKPbgz1y
 KKkgJneXU1SsylqsaNQ2eco5HBu5KppiLUiCvRkUq6jUB8EO+/vC31IFc1cJ3ssr
 aikzxb1jH1wp4SM1SHEzqVbbnUA58Dqcij1FbdF8iOd7ia6K6q+R27qfKe6aPCKq
 ZQVf1b7HfQl+BOGRjHxsKesZjCBkxBV2FUNhPZRqvTSMlZBj+1UHZosh+Ihf6or5
 7fiIkq9xfHX9Hwe+/vuwDnVrJgvqPsvPc4ITKMAsK0omh018Dor/Ibz3Hr2BvsIj
 2tIPOVqCdqJyyOT633IgVmItnAExXPJVAWyBRVAorjAZPaX6WmhvJmHVby59AgYj
 aDPxSyICzyrEMdyPETsUdesrTvV0dNO17EIDJpZL4XZ42egis3pBsCDzqBZnvC18
 UDWxAIJEgYokPxppRV0UE2ITGXmiizmLI9vz/n7Oujbq7vEGytZKlZvSO3YHHj2+
 XTqJbOpbDP1IH3iZoA7UM8dgm5Hasb9PQSZVsKQZKX2wqRNkXQM=
 =4Kys
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2022-03-12' of git://anongit.freedesktop.org/drm/drm

Pull drm kconfig fix from Dave Airlie:
 "Thorsten pointed out this had fallen down the cracks and was in -next
  only, I've picked it out, fixed up it's Fixes: line.

   - fix regression in Kconfig"

* tag 'drm-fixes-2022-03-12' of git://anongit.freedesktop.org/drm/drm:
  drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP
2022-03-12 10:22:43 -08:00
Zhengjun Xing
91c9923a47 perf parse: Fix event parser error for hybrid systems
This bug happened on hybrid systems when both cpu_core and cpu_atom
have the same event name such as "UOPS_RETIRED.MS" while their event
terms are different, then during perf stat, the event for cpu_atom
will parse fail and then no output for cpu_atom.

UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/

It is because event terms in the "head" of parse_events_multi_pmu_add
will be changed to event terms for cpu_core after parsing UOPS_RETIRED.MS
for cpu_core, then when parsing the same event for cpu_atom, it still
uses the event terms for cpu_core, but event terms for cpu_atom are
different with cpu_core, the event parses for cpu_atom will fail. This
patch fixes it, the event terms should be parsed from the original
event.

This patch can work for the hybrid systems that have the same event
in more than 2 PMUs. It also can work in non-hybrid systems.

Before:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 2737845 16068518485 16068518485

 Performance counter stats for 'system wide':

         2,737,845      cpu_core/UOPS_RETIRED.MS/

       1.002553850 seconds time elapsed

After:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 1977555 16076950711 16076950711
  UOPS_RETIRED.MS: 568684 8038694234 8038694234

 Performance counter stats for 'system wide':

         1,977,555      cpu_core/UOPS_RETIRED.MS/
           568,684      cpu_atom/UOPS_RETIRED.MS/

       1.004758259 seconds time elapsed

Fixes: fb0811535e ("perf parse-events: Allow config on kernel PMU events")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220307151627.30049-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 11:04:55 -03:00
Weiguo Li
073a15c351 perf bench: Fix NULL check against wrong variable
We did a NULL check after "epollfdp = calloc(...)", but we checked
"epollfd" instead of "epollfdp".

Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/tencent_B5D64530EB9C7DBB8D2C88A0C790F1489D0A@qq.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:49:13 -03:00
Weiguo Li
a7a72631f6 perf parse-events: Fix NULL check against wrong variable
We did a null check after "tmp->symbol = strdup(...)", but we checked
"list->symbol" other than "tmp->symbol".

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/tencent_DF39269807EC9425E24787E6DB632441A405@qq.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:47:52 -03:00
Arnaldo Carvalho de Melo
ec9d50ace3 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  d45476d983 ("x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE")

Its just a comment fixup.

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YiyiHatGaJQM7l/Y@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:38:05 -03:00
Arnaldo Carvalho de Melo
3ec94eeaff tools kvm headers arm64: Update KVM headers from the kernel sources
To pick the changes from:

  a5905d6af4 ("KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated")

That don't causes any changes in tooling (when built on x86), only
addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h

Cc: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/lkml/YiyhAK6sVPc83FaI@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:33:13 -03:00
Thomas Zimmermann
3755d35ee1 drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP
As reported in [1], DRM_PANEL_EDP depends on DRM_DP_HELPER. Select
the option to fix the build failure. The error message is shown
below.

  arm-linux-gnueabihf-ld: drivers/gpu/drm/panel/panel-edp.o: in function
    `panel_edp_probe': panel-edp.c:(.text+0xb74): undefined reference to
    `drm_panel_dp_aux_backlight'
  make[1]: *** [/builds/linux/Makefile:1222: vmlinux] Error 1

The issue has been reported before, when DisplayPort helpers were
hidden behind the option CONFIG_DRM_KMS_HELPER. [2]

v2:
	* fix and expand commit description (Arnd)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 9d6366e743 ("drm: fb_helper: improve CONFIG_FB dependency")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://lore.kernel.org/dri-devel/CA+G9fYvN0NyaVkRQmA1O6rX7H8PPaZrUAD7=RDy33QY9rUU-9g@mail.gmail.com/ # [1]
Link: https://lore.kernel.org/all/20211117062704.14671-1-rdunlap@infradead.org/ # [2]
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220203093922.20754-1-tzimmermann@suse.de
Signed-off-by: Dave Airlie <airlied@redhat.com>
2022-03-12 17:41:30 +10:00
Randy Dunlap
6845376713 ARM: Spectre-BHB: provide empty stub for non-config
When CONFIG_GENERIC_CPU_VULNERABILITIES is not set, references
to spectre_v2_update_state() cause a build error, so provide an
empty stub for that function when the Kconfig option is not set.

Fixes this build error:

  arm-linux-gnueabi-ld: arch/arm/mm/proc-v7-bugs.o: in function `cpu_v7_bugs_init':
  proc-v7-bugs.c:(.text+0x52): undefined reference to `spectre_v2_update_state'
  arm-linux-gnueabi-ld: proc-v7-bugs.c:(.text+0x82): undefined reference to `spectre_v2_update_state'

Fixes: b9baf5c8c5 ("ARM: Spectre-BHB workaround")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: patches@armlinux.org.uk
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 12:42:49 -08:00
Linus Torvalds
77fe1ba902 RISC-V Fixes for 5.17-rc8
* A fix to prevent users from enabling the alternatives framework (and
   thus errata handling) on XIP kernels, where runtime code patching does
   not function correctly.
 * A fix to properly detect offset overflow for AUIPC-based relocations
   in modules.  This may manifest as modules calling arbitrary invalid
   addresses, depending on the address allocated when a module is loaded.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmIrfkkTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYib4WD/9C/v+AVqbFQuf4qhXlwlros+3s00/e
 LYlD79ddDYvU0wvjvUDBx3WLC/tk2dI96VDuPcUNC60T7W1x0K9TmzcR1YqV9efL
 sx3c7SnV5J1lnmGM+/1EtJWGbYJkSqmH6bk3BvTVCpFGpGd1UcEHu0zMkOq54AIK
 5GD3IG7dpKYPpzziIG3T7F96RFn+N4N4cORLAGFXa07TAcQfwJvL2rnAy6YMza6Q
 NIhpD22sNrKX/9tImwN9YCwZB0G+mrNDLzU8KQAzz1DcBtoWQoEV1b+vRyuubVP+
 YSCGO3e1PWMAU3UKzGMYjGPik8WHTqUrst25affV001v1xYEy835K3td+PiDcOfy
 eHwhhJJOE9TiQ28i1z4F5HsEbT+Sk61Xyf8cXra5y4utlQowIrO5mu4S2GNOi/3Z
 kbDmUzWHZtQ2yNyz4WlmXxwLWAgx7QeRTax7PvqnMLK/Wxbzy9qgbx1MzjmmaCBI
 aI3D6CMc3h54MoWidsrBu1Hmx+qm7UqR7ML2jcEI3LqBB+lIYHNJh/d/Brg/lyTh
 5V/ncrILYalT0qd2zDSODzYYKFm2EucrneJNiJr720McFc54V2ZzLKOWg4gLdBUl
 KJypYoGw2fDbpVqmlN6/oSCWI1RETq6u5kOiCPBaXz/2sKz6aT3xq5m5wXynQAZi
 NzOHZEqS9GYAQQ==
 =qUfT
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - prevent users from enabling the alternatives framework (and thus
   errata handling) on XIP kernels, where runtime code patching does not
   function correctly.

 - properly detect offset overflow for AUIPC-based relocations in
   modules. This may manifest as modules calling arbitrary invalid
   addresses, depending on the address allocated when a module is
   loaded.

* tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix auipc+jalr relocation range checks
  riscv: alternative only works on !XIP_KERNEL
2022-03-11 12:28:21 -08:00
Linus Torvalds
878409ecde powerpc fixes for 5.17 #6
Fix STACKTRACE=n build, in particular for skiroot_defconfig.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmIqyMoTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgG3tEAChrTGM3zCopx5BUtYdIL5OOUkNPrCn
 kDgrG6fhT8WFoYIwKAca7Q/GHYvkwbjLUNLv8ImPQNiq9RgIXA56xlL9IBQ9V8Gd
 N+9QMPwwdI+/WTp2UjC0ceS7doYz+dlngybVZOkwJrjdh0eh2WX425Gpx5CdlShj
 4n9MPc1STT92SAYFvw/lY4hIHEUAncyPzoCxAFZ4CpCcOUtR6PmIPxSIKruOkPr0
 3+8fyh3kZP5H7rRJR42xxSxQ3Dz4bXDnr9fyw38UXeDzitWwkwXiyFx50+Znl8O8
 oVYgyyX8/Nb9lEtYD/f3QLWst+2bu0joQCMDz3JMZ/9pyPhbHKyjcVoCzkeGMNhK
 0RZb57LDDIHf1BzxCa2IhmvOEcJ7ozPTQcYMfgx8wMP80xGD85dpZlAyKNaE/5WJ
 P9qHlgJw+tp5Qj+BKgdnpzaoz+af7ItMN8ToaiBi71S7QcFBU+13blkQ++se5iQp
 FAmZ2FtwMnEYqj25kzlSUcY0LxJn6fH0arhAg9OHBZ+BJysRZA5GdNkiR0cjYoW7
 myQj/Fnblcrc/WKUpHn+HtU8LHSkcsR2bIRunnZZ8k83Op4YZROO1zfYSiDSBMp6
 uV9cXKdJPxwrEMjZ/aTAugYUCgH166KKGgjqIDmTBh7cD851pgeLjGaQxt9STjN0
 KuA7Hjg8uCpVpw==
 =Iu0S
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 "Fix STACKTRACE=n build, in particular for skiroot_defconfig"

* tag 'powerpc-5.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Fix STACKTRACE=n build
2022-03-11 11:50:36 -08:00
Russell King (Oracle)
6c7cb60bff ARM: fix Thumb2 regression with Spectre BHB
When building for Thumb2, the vectors make use of a local label. Sadly,
the Spectre BHB code also uses a local label with the same number which
results in the Thumb2 reference pointing at the wrong place. Fix this
by changing the number used for the Spectre BHB local label.

Fixes: b9baf5c8c5 ("ARM: Spectre-BHB workaround")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 11:40:08 -08:00
Linus Torvalds
3977a3fb67 MMC core:
- Restore (almost) the busy polling for MMC_SEND_OP_COND
 
 MMC host:
  - meson-gx: Fix DMA usage of meson_mmc_post_req()
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAmIrZXkXHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCllWQ/+JJ2op5lW8Kca350ZNhEIl+Jt
 Kna82QGUiGye6a4zXVmWCmZKeDCbvx+iT0ercjlTOXvsZwm5dAFx0XLGIDXFXAV6
 r0mkM9/CDaVJ2jqz6on/AKDjVmVDw4NPWtbsiRnouZ7It+SZowpLvUzMgnpEVTT5
 nYxjkYICFWXMLyEE2Zgf+SiWtsx/Fhs6Nkj1XitCysLXDsTgPevw9SiLR22zxI1T
 KPBQAYZqcz7UIAVEo1ujZqInGGfaVXLWQQ6uce41h9bLGHOy3D1a+J7piVUt55c0
 nO8MlAPIaqM3agB2smdlzPRd9X+j5V+kntfsTPATw8rp+hBBkfinEqiscz27hr+p
 m3bPiJcbLcszy9FANqbcAXv8NB9z1U14JVSl2xFX5VvBuJvMq4lNczx94tEI3Glm
 bzclYXxBHjrpTv65443ggO9v0mO/d6T/PFUw1sAQlvNynHnqd2QP/IAtLpiMMH9k
 Ddxati2fNwyYWsKFqENPGEV7DV0AFHApe3EkQPIDpbs45+AoHzt46KWfTb1glm3T
 iJWdSWqHCmd0pS+Zxhp09tCE+fmyRycYBdK/peLl1Cy/oABtZLvJRC2+ZFVgiPdA
 hpMdKH4UvHjAcnCQ8bLcdzhtna7kwVE6E5CMObi2fNRzT5zb+nJ1oX2cOOGPHhNz
 eBnpjMn8ZWZzSrNrCb0=
 =XKXe
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Restore (mostly) the busy polling for MMC_SEND_OP_COND

  MMC host:
   - meson-gx: Fix DMA usage of meson_mmc_post_req()"

* tag 'mmc-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Restore (almost) the busy polling for MMC_SEND_OP_COND
  mmc: meson: Fix usage of meson_mmc_post_req()
2022-03-11 11:24:58 -08:00
Jarkko Sakkinen
08999b2489 x86/sgx: Free backing memory after faulting the enclave page
There is a limited amount of SGX memory (EPC) on each system.  When that
memory is used up, SGX has its own swapping mechanism which is similar
in concept but totally separate from the core mm/* code.  Instead of
swapping to disk, SGX swaps from EPC to normal RAM.  That normal RAM
comes from a shared memory pseudo-file and can itself be swapped by the
core mm code.  There is a hierarchy like this:

	EPC <-> shmem <-> disk

After data is swapped back in from shmem to EPC, the shmem backing
storage needs to be freed.  Currently, the backing shmem is not freed.
This effectively wastes the shmem while the enclave is running.  The
memory is recovered when the enclave is destroyed and the backing
storage freed.

Sort this out by freeing memory with shmem_truncate_range(), as soon as
a page is faulted back to the EPC.  In addition, free the memory for
PCMD pages as soon as all PCMD's in a page have been marked as unused
by zeroing its contents.

Cc: stable@vger.kernel.org
Fixes: 1728ab54b4 ("x86/sgx: Add a page reclaimer")
Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220303223859.273187-1-jarkko@kernel.org
2022-03-11 10:31:06 -08:00
Linus Torvalds
93ce93587d Merge branch 'davidh' (fixes from David Howells)
Merge misc fixes from David Howells:
 "A set of patches for watch_queue filter issues noted by Jann. I've
  added in a cleanup patch from Christophe Jaillet to convert to using
  formal bitmap specifiers for the note allocation bitmap.

  Also two filesystem fixes (afs and cachefiles)"

* emailed patches from David Howells <dhowells@redhat.com>:
  cachefiles: Fix volume coherency attribute
  afs: Fix potential thrashing in afs writeback
  watch_queue: Make comment about setting ->defunct more accurate
  watch_queue: Fix lack of barrier/sync/lock between post and read
  watch_queue: Free the alloc bitmap when the watch_queue is torn down
  watch_queue: Fix the alloc bitmap size to reflect notes allocated
  watch_queue: Use the bitmap API when applicable
  watch_queue: Fix to always request a pow-of-2 pipe ring size
  watch_queue: Fix to release page in ->release()
  watch_queue, pipe: Free watchqueue state after clearing pipe ring
  watch_queue: Fix filter limit check
2022-03-11 10:28:32 -08:00
David Howells
413a4a6b0b cachefiles: Fix volume coherency attribute
A network filesystem may set coherency data on a volume cookie, and if
given, cachefiles will store this in an xattr on the directory in the
cache corresponding to the volume.

The function that sets the xattr just stores the contents of the volume
coherency buffer directly into the xattr, with nothing added; the
checking function, on the other hand, has a cut'n'paste error whereby it
tries to interpret the xattr contents as would be the xattr on an
ordinary file (using the cachefiles_xattr struct).  This results in a
failure to match the coherency data because the buffer ends up being
shifted by 18 bytes.

Fix this by defining a structure specifically for the volume xattr and
making both the setting and checking functions use it.

Since the volume coherency doesn't work if used, take the opportunity to
insert a reserved field for future use, set it to 0 and check that it is
0.  Log mismatch through the appropriate tracepoint.

Note that this only affects cifs; 9p, afs, ceph and nfs don't use the
volume coherency data at the moment.

Fixes: 32e150037d ("fscache, cachefiles: Store the volume coherency data")
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: Steve French <smfrench@gmail.com>
cc: linux-cifs@vger.kernel.org
cc: linux-cachefs@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:24:37 -08:00
David Howells
173ce1ca47 afs: Fix potential thrashing in afs writeback
In afs_writepages_region(), if the dirty page we find is undergoing
writeback or write to cache, but the sync_mode is WB_SYNC_NONE, we go
round the loop trying the same page again and again with no pausing or
waiting unless and until another thread manages to clear the writeback
and fscache flags.

Fix this with three measures:

 (1) Advance start to after the page we found.

 (2) Break out of the loop and return if rescheduling is requested.

 (3) Arbitrarily give up after a maximum of 5 skips.

Fixes: 31143d5d51 ("AFS: implement basic file write support")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Acked-by: Marc Dionne <marc.dionne@auristor.com>
Link: https://lore.kernel.org/r/164692725757.2097000.2060513769492301854.stgit@warthog.procyon.org.uk/ # v1
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:24:37 -08:00
Li Huafei
a365a65f9c x86/traps: Mark do_int3() NOKPROBE_SYMBOL
Since kprobe_int3_handler() is called in do_int3(), probing do_int3()
can cause a breakpoint recursion and crash the kernel. Therefore,
do_int3() should be marked as NOKPROBE_SYMBOL.

Fixes: 21e28290b3 ("x86/traps: Split int3 handler up")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220310120915.63349-1-lihuafei1@huawei.com
2022-03-11 19:19:30 +01:00
David Howells
4edc076041 watch_queue: Make comment about setting ->defunct more accurate
watch_queue_clear() has a comment stating that setting ->defunct to true
preventing new additions as well as preventing notifications.  Whilst
the latter is true, the first bit is superfluous since at the time this
function is called, the pipe cannot be accessed to add new event
sources.

Remove the "new additions" bit from the comment.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:13 -08:00
David Howells
2ed147f015 watch_queue: Fix lack of barrier/sync/lock between post and read
There's nothing to synchronise post_one_notification() versus
pipe_read().  Whilst posting is done under pipe->rd_wait.lock, the
reader only takes pipe->mutex which cannot bar notification posting as
that may need to be made from contexts that cannot sleep.

Fix this by setting pipe->head with a barrier in post_one_notification()
and reading pipe->head with a barrier in pipe_read().

If that's not sufficient, the rd_wait.lock will need to be taken,
possibly in a ->confirm() op so that it only applies to notifications.
The lock would, however, have to be dropped before copy_page_to_iter()
is invoked.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:13 -08:00
David Howells
7ea1a0124b watch_queue: Free the alloc bitmap when the watch_queue is torn down
Free the watch_queue note allocation bitmap when the watch_queue is
destroyed.

Fixes: c73be61ced ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-11 10:17:13 -08:00