linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-11-01 17:08:10 +00:00

Author	SHA1	Message	Date
Christoph Hellwig	bce2b68b89	exec: use flush_icache_user_range in read_code read_code operates on user addresses. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/20200515143646.3857579-27-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:58 -07:00
Christoph Hellwig	48304f7994	exec: only build read_code when needed Only build read_code when binary formats that use it are built into the kernel. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/20200515143646.3857579-26-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:58 -07:00
Christoph Hellwig	a1e81f9654	m68k: implement flush_icache_user_range Rename the current flush_icache_range to flush_icache_user_range as per commit `ae92ef8a44` ("PATCH] flush icache in correct context") there seems to be an assumption that it operates on user addresses. Add a flush_icache_range around it that for now is a no-op. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-25-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:58 -07:00
Christoph Hellwig	fca7f8e6fd	arm: rename flush_cache_user_range to flush_icache_user_range flush_icache_user_range will be the name for a generic primitive. Move the arm name so that arm already has an implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Russell King <linux@armlinux.org.uk> Link: http://lkml.kernel.org/r/20200515143646.3857579-24-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:58 -07:00
Christoph Hellwig	70cd3444c1	xtensa: implement flush_icache_user_range The Xtensa implementation of flush_icache_range seems to be able to cope with user addresses. Just define flush_icache_user_range to flush_icache_range. [jcmvbkbc@gmail.com: fix flush_icache_user_range in noMMU configs] Link: http://lkml.kernel.org/r/20200525221556.4270-1-jcmvbkbc@gmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-23-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:58 -07:00
Christoph Hellwig	952ec41c44	sh: implement flush_icache_user_range The SuperH implementation of flush_icache_range seems to be able to cope with user addresses. Just define flush_icache_user_range to flush_icache_range. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-22-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:58 -07:00
Christoph Hellwig	1268c33382	asm-generic: add a flush_icache_user_range stub Define flush_icache_user_range to flush_icache_range unless the architecture provides its own implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-21-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:58 -07:00
Christoph Hellwig	885f7f8e30	mm: rename flush_icache_user_range to flush_icache_user_page The function currently known as flush_icache_user_range only operates on a single page. Rename it to flush_icache_user_page as we'll need the name flush_icache_user_range for something else soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-20-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:58 -07:00
Christoph Hellwig	97f52c1536	arm,sparc,unicore32: remove flush_icache_user_range flush_icache_user_range is only used by <asm-generic/cacheflush.h>, so remove it from the architectures that implement it, but don't use <asm-generic/cacheflush.h>. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Russell King <linux@armlinux.org.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@pku.edu.cn> Link: http://lkml.kernel.org/r/20200515143646.3857579-19-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	396eb69c6e	riscv: use asm-generic/cacheflush.h RISC-V needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Also remove the pointless __KERNEL__ ifdef while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Link: http://lkml.kernel.org/r/20200515143646.3857579-18-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	5019f76019	powerpc: use asm-generic/cacheflush.h Power needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Also remove the pointless __KERNEL__ ifdef while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20200515143646.3857579-17-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	e050945123	openrisc: use asm-generic/cacheflush.h OpenRISC needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-16-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	9e730ffac1	m68knommu: use asm-generic/cacheflush.h m68knommu needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Greg Ungerer <gerg@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-15-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	03518c82b4	microblaze: use asm-generic/cacheflush.h Microblaze needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Simek <monstr@monstr.eu> Link: http://lkml.kernel.org/r/20200515143646.3857579-14-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	57b94ff597	ia64: use asm-generic/cacheflush.h IA64 needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-13-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	af23eea561	hexagon: use asm-generic/cacheflush.h Hexagon needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Brian Cain <bcain@codeaurora.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-12-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	2d49d89c73	c6x: use asm-generic/cacheflush.h C6x needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-11-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	a7ba121215	arm64: use asm-generic/cacheflush.h ARM64 needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200515143646.3857579-10-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	43c74ca337	alpha: use asm-generic/cacheflush.h Alpha needs almost no cache flushing routines of its own. Rely on asm-generic/cacheflush.h for the defaults. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-9-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	76b3b58fac	asm-generic: improve the flush_dcache_page stub There is a magic ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE cpp symbol that guards non-stub availability of flush_dcache_pagge. Use that to check if flush_dcache_pagg is implemented. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-8-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	e0cf615d72	asm-generic: don't include <linux/mm.h> in cacheflush.h This seems to lead to some crazy include loops when using asm-generic/cacheflush.h on more architectures, so leave it to the arch header for now. [hch@lst.de: fix warning] Link: http://lkml.kernel.org/r/20200520173520.GA11199@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Will Deacon <will@kernel.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-7-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	92a73bd29a	asm-generic: fix the inclusion guards for cacheflush.h cacheflush.h uses a somewhat to generic include guard name that clashes with various arch files. Use a more specific one. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200515143646.3857579-6-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	7c95fda549	unicore32: remove flush_cache_user_range flush_cache_user_range is an ARMism not used by any generic or unicore32 specific code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Link: http://lkml.kernel.org/r/20200515143646.3857579-5-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	e292e7403e	powerpc: unexport flush_icache_user_range flush_icache_user_range is only used by copy_to_user_page, which is only used by core VM code. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20200515143646.3857579-4-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	e7c1fa11b0	nds32: unexport flush_icache_page flush_icache_page is only used by mm/memory.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Link: http://lkml.kernel.org/r/20200515143646.3857579-3-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
Christoph Hellwig	ce450ebf61	arm: fix the flush_icache_range arguments in set_fiq_handler Patch series "sort out the flush_icache_range mess", v2. flush_icache_range is mostly used for kernel address, except for the following cases: - the nommu brk and mmap implementations - the read_code helper that is only used for binfmt_flat, binfmt_elf_fdpic, and binfmt_aout including the broken ia32 compat version - binfmt_flat itself none of which really are used by a typical MMU enabled kernel, as a.out can only be build for alpha and m68k to start with. But strangely enough commit `ae92ef8a44` ("PATCH] flush icache in correct context") added a "set_fs(KERNEL_DS)" around the flush_icache_range call in the module loader, because apparently m68k assumed user pointers. This series first cleans up the cacheflush implementations, largely by switching as much as possible to the asm-generic version after a few preparations, then moves the misnamed current flush_icache_user_range to a new name, to finally introduce a real flush_icache_user_range to be used for the above use cases to flush the instruction cache for a userspace address range. The last patch then drops the set_fs in the module code and moves it into the m68k implementation. This patch (of 29): The arguments passed look bogus, try to fix them to something that seems to make sense. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jeff Dike <jdike@addtoit.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Keith Busch <keith.busch@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Song Liu <songliubraving@fb.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200515143646.3857579-1-hch@lst.de Link: http://lkml.kernel.org/r/20200515143646.3857579-2-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
John Hubbard	690623e1b4	vhost: convert get_user_pages() --> pin_user_pages() This code was using get_user_pages(), in approximately a "Case 5" scenario (accessing the data within a page), using the categorization from [1]. That means that it's time to convert the get_user_pages() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200529234309.484480-3-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
John Hubbard	eaf4d22a9e	docs: mm/gup: pin_user_pages.rst: add a "case 5" Patch series "vhost, docs: convert to pin_user_pages(), new "case 5"" It recently became clear to me that there are some get_user_pages() callers that don't fit neatly into any of the four cases that are so far listed in pin_user_pages.rst. vhost.c is one of those. Add a Case 5 to the documentation, and refer to that when converting vhost.c. Thanks to Jan Kara for helping me (again) in understanding the interaction between get_user_pages() and page writeback [1]. This is based on today's mmotm, which has a nearby patch to pin_user_pages.rst that rewords cases 3 and 4. Note that I have only compile-tested the vhost.c patch, although that does also include cross-compiling for a few other arches. Any run-time testing would be greatly appreciated. [1] https://lore.kernel.org/r/20200529070343.GL14550@quack2.suse.cz This patch (of 2): There are four cases listed in pin_user_pages.rst. These are intended to help developers figure out whether to use get_user_pages(), or pin_user_pages(). However, the four cases do not cover all the situations. For example, drivers/vhost/vhost.c has a "pin, write to page, set page dirty, unpin" case. Add a fifth case, to help explain that there is a general pattern that requires pin_user_pages() API calls. [jhubbard@nvidia.com: v2] Link: http://lkml.kernel.org/r/20200601052633.853874-2-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Link: http://lkml.kernel.org/r/20200529234309.484480-1-jhubbard@nvidia.com Link: http://lkml.kernel.org/r/20200529234309.484480-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:57 -07:00
John Hubbard	6a005645ed	mm/gup: documentation fix for pin_user_pages() APIs All of the pin_user_pages() API calls will cause pages to be dma-pinned. As such, they are all suitable for either DMA, RDMA, and/or Direct IO. The documentation should say so, but it was instead saying that three of the API calls were only suitable for Direct IO. This was discovered when a reviewer wondered why an API call that specifically recommended against Case 2 (DMA/RDMA) was being used in a DMA situation [1]. Fix this by simply deleting those claims. The gup.c comments already refer to the more extensive Documentation/core-api/pin_user_pages.rst, which does have the correct guidance. So let's just write it once, there. [1] https://lore.kernel.org/r/20200529074658.GM30374@kadam Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Jan Kara <jack@suse.cz> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200529084515.46259-1-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
John Hubbard	55a650c35f	mm/gup: frame_vector: convert get_user_pages() --> pin_user_pages() This code was using get_user_pages(), and all of the callers so far were in a "Case 2" scenario (DMA/RDMA), using the categorization from [1]. That means that it's time to convert the get_user_pages() + put_page() calls to pin_user_pages*() + unpin_user_pages() calls. There is some helpful background in [2]: basically, this is a small part of fixing a long-standing disconnect between pinning pages, and file systems' use of those pages. [1] Documentation/core-api/pin_user_pages.rst [2] "Explicit pinning of user-space pages": https://lwn.net/Articles/807108/ Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Link: http://lkml.kernel.org/r/20200527223243.884385-3-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
John Hubbard	420c2091b6	mm/gup: introduce pin_user_pages_locked() Patch series "mm/gup: introduce pin_user_pages_locked(), use it in frame_vector.c", v2. This adds yet one more pin_user_pages() variant, and uses that to convert mm/frame_vector.c. With this, along with maybe 20 or 30 other recent patches in various trees, we are close to having the relevant gup call sites converted--with the notable exception of the bio/block layer. This patch (of 2): Introduce pin_user_pages_locked(), which is nearly identical to get_user_pages_locked() except that it sets FOLL_PIN and rejects FOLL_GET. As with other pairs of get_user_pages() and pin_user_pages() API calls, it's prudent to assert that FOLL_PIN is not set in the get_user_pages*() call, so add that as part of this. [jhubbard@nvidia.com: v2] Link: http://lkml.kernel.org/r/20200531234131.770697-2-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Link: http://lkml.kernel.org/r/20200531234131.770697-1-jhubbard@nvidia.com Link: http://lkml.kernel.org/r/20200527223243.884385-1-jhubbard@nvidia.com Link: http://lkml.kernel.org/r/20200527223243.884385-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
John Hubbard	a8f80f53fb	mm/gup: update pin_user_pages.rst for "case 3" (mmu notifiers) Update case 3 so that it covers the use of mmu notifiers, for hardware that does, or does not have replayable page faults. Also, elaborate case 4 slightly, as it was quite cryptic. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Jonathan Corbet <corbet@lwn.net> Link: http://lkml.kernel.org/r/20200527194953.11130-1-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Souptick Joarder	dadbb612f6	mm/gup.c: convert to use get_user_{page\|pages}_fast_only() API __get_user_pages_fast() renamed to get_user_pages_fast_only() to align with pin_user_pages_fast_only(). As part of this we will get rid of write parameter. Instead caller will pass FOLL_WRITE to get_user_pages_fast_only(). This will not change any existing functionality of the API. All the callers are changed to pass FOLL_WRITE. Also introduce get_user_page_fast_only(), and use it in a few places that hard-code nr_pages to 1. Updated the documentation of the API. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> [arch/powerpc/kvm] Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michal Suchanek <msuchanek@suse.de> Link: http://lkml.kernel.org/r/1590396812-31277-1-git-send-email-jrdr.linux@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Rafael Aquini	e77132e758	kernel/sysctl.c: ignore out-of-range taint bits introduced via kernel.tainted Users with SYS_ADMIN capability can add arbitrary taint flags to the running kernel by writing to /proc/sys/kernel/tainted or issuing the command 'sysctl -w kernel.tainted=...'. This interface, however, is open for any integer value and this might cause an invalid set of flags being committed to the tainted_mask bitset. This patch introduces a simple way for proc_taint() to ignore any eventual invalid bit coming from the user input before committing those bits to the kernel tainted_mask. Signed-off-by: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Link: http://lkml.kernel.org/r/20200512223946.888020-1-aquini@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Guilherme G. Piccoli	60c958d8df	panic: add sysctl to dump all CPUs backtraces on oops event Usually when the kernel reaches an oops condition, it's a point of no return; in case not enough debug information is available in the kernel splat, one of the last resorts would be to collect a kernel crash dump and analyze it. The problem with this approach is that in order to collect the dump, a panic is required (to kexec-load the crash kernel). When in an environment of multiple virtual machines, users may prefer to try living with the oops, at least until being able to properly shutdown their VMs / finish their important tasks. This patch implements a way to collect a bit more debug details when an oops event is reached, by printing all the CPUs backtraces through the usage of NMIs (on architectures that support that). The sysctl added (and documented) here was called "oops_all_cpu_backtrace", and when set will (as the name suggests) dump all CPUs backtraces. Far from ideal, this may be the last option though for users that for some reason cannot panic on oops. Most of times oopses are clear enough to indicate the kernel portion that must be investigated, but in virtual environments it's possible to observe hypervisor/KVM issues that could lead to oopses shown in other guests CPUs (like virtual APIC crashes). This patch hence aims to help debug such complex issues without resorting to kdump. Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200327224116.21030-1-gpiccoli@canonical.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Guilherme G. Piccoli	0ec9dc9bcb	kernel/hung_task.c: introduce sysctl to print all traces when a hung task is detected Commit `401c636a0e` ("kernel/hung_task.c: show all hung tasks before panic") introduced a change in that we started to show all CPUs backtraces when a hung task is detected _and_ the sysctl/kernel parameter "hung_task_panic" is set. The idea is good, because usually when observing deadlocks (that may lead to hung tasks), the culprit is another task holding a lock and not necessarily the task detected as hung. The problem with this approach is that dumping backtraces is a slightly expensive task, specially printing that on console (and specially in many CPU machines, as servers commonly found nowadays). So, users that plan to collect a kdump to investigate the hung tasks and narrow down the deadlock definitely don't need the CPUs backtrace on dmesg/console, which will delay the panic and pollute the log (crash tool would easily grab all CPUs traces with 'bt -a' command). Also, there's the reciprocal scenario: some users may be interested in seeing the CPUs backtraces but not have the system panic when a hung task is detected. The current approach hence is almost as embedding a policy in the kernel, by forcing the CPUs backtraces' dump (only) on hung_task_panic. This patch decouples the panic event on hung task from the CPUs backtraces dump, by creating (and documenting) a new sysctl called "hung_task_all_cpu_backtrace", analog to the approach taken on soft/hard lockups, that have both a panic and an "all_cpu_backtrace" sysctl to allow individual control. The new mechanism for dumping the CPUs backtraces on hung task detection respects "hung_task_warnings" by not dumping the traces in case there's no warnings left. Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: http://lkml.kernel.org/r/20200327223646.20779-1-gpiccoli@canonical.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Guilherme G. Piccoli	f117955a22	kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases After a recent change introduced by Vlastimil's series [0], kernel is able now to handle sysctl parameters on kernel command line; also, the series introduced a simple infrastructure to convert legacy boot parameters (that duplicate sysctls) into sysctl aliases. This patch converts the watchdog parameters softlockup_panic and {hard,soft}lockup_all_cpu_backtrace to use the new alias infrastructure. It fixes the documentation too, since the alias only accepts values 0 or 1, not the full range of integers. We also took the opportunity here to improve the documentation of the previously converted hung_task_panic (see the patch series [0]) and put the alias table in alphabetical order. [0] http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Link: http://lkml.kernel.org/r/20200507214624.21911-1-gpiccoli@canonical.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Vlastimil Babka	4f2f682d89	lib/test_sysctl: support testing of sysctl. boot parameter Testing is done by a new parameter debug.test_sysctl.boot_int which defaults to 0 and it's expected that the tester passes a boot parameter that sets it to 1. The test checks if it's set to 1. To distinguish true failure from parameter not being set, the test checks /proc/cmdline for the expected parameter, and whether test_sysctl is built-in and not a module. [vbabka@suse.cz: skip the new test if boot_int sysctl is not present] Link: http://lkml.kernel.org/r/305af605-1e60-cf84-fada-6ce1ca37c102@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: David Rientjes <rientjes@google.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200427180433.7029-6-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Vlastimil Babka	4546cde96f	tools/testing/selftests/sysctl/sysctl.sh: support CONFIG_TEST_SYSCTL=y The testing script recommends CONFIG_TEST_SYSCTL=y, but actually only works with CONFIG_TEST_SYSCTL=m. Testing of sysctl setting via boot param however requires the test to be built-in, so make sure the test script supports it. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: David Rientjes <rientjes@google.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200427180433.7029-5-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Vlastimil Babka	b467f3ef3c	kernel/hung_task convert hung_task_panic boot parameter to sysctl We can now handle sysctl parameters on kernel command line and have infrastructure to convert legacy command line options that duplicate sysctl to become a sysctl alias. This patch converts the hung_task_panic parameter. Note that the sysctl handler is more strict and allows only 0 and 1, while the legacy parameter allowed any non-zero value. But there is little reason anyone would not be using 1. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: David Rientjes <rientjes@google.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200427180433.7029-4-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Vlastimil Babka	0a477e1ae2	kernel/sysctl: support handling command line aliases We can now handle sysctl parameters on kernel command line, but historically some parameters introduced their own command line equivalent, which we don't want to remove for compatibility reasons. We can, however, convert them to the generic infrastructure with a table translating the legacy command line parameters to their sysctl names, and removing the one-off param handlers. This patch adds the support and makes the first conversion to demonstrate it, on the (deprecated) numa_zonelist_order parameter. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: David Rientjes <rientjes@google.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200427180433.7029-3-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Vlastimil Babka	3db978d480	kernel/sysctl: support setting sysctl parameters from kernel command line Patch series "support setting sysctl parameters from kernel command line", v3. This series adds support for something that seems like many people always wanted but nobody added it yet, so here's the ability to set sysctl parameters via kernel command line options in the form of sysctl.vm.something=1 The important part is Patch 1. The second, not so important part is an attempt to clean up legacy one-off parameters that do the same thing as a sysctl. I don't want to remove them completely for compatibility reasons, but with generic sysctl support the idea is to remove the one-off param handlers and treat the parameters as aliases for the sysctl variants. I have identified several parameters that mention sysctl counterparts in Documentation/admin-guide/kernel-parameters.txt but there might be more. The conversion also has varying level of success: - numa_zonelist_order is converted in Patch 2 together with adding the necessary infrastructure. It's easy as it doesn't really do anything but warn on deprecated value these days. - hung_task_panic is converted in Patch 3, but there's a downside that now it only accepts 0 and 1, while previously it was any integer value - nmi_watchdog maps to two sysctls nmi_watchdog and hardlockup_panic, so there's no straighforward conversion possible - traceoff_on_warning is a flag without value and it would be required to handle that somehow in the conversion infractructure, which seems pointless for a single flag This patch (of 5): A recently proposed patch to add vm_swappiness command line parameter in addition to existing sysctl [1] made me wonder why we don't have a general support for passing sysctl parameters via command line. Googling found only somebody else wondering the same [2], but I haven't found any prior discussion with reasons why not to do this. Settings the vm_swappiness issue aside (the underlying issue might be solved in a different way), quick search of kernel-parameters.txt shows there are already some that exist as both sysctl and kernel parameter - hung_task_panic, nmi_watchdog, numa_zonelist_order, traceoff_on_warning. A general mechanism would remove the need to add more of those one-offs and might be handy in situations where configuration by e.g. /etc/sysctl.d/ is impractical. Hence, this patch adds a new parse_args() pass that looks for parameters prefixed by 'sysctl.' and tries to interpret them as writes to the corresponding sys/ files using an temporary in-kernel procfs mount. This mechanism was suggested by Eric W. Biederman [3], as it handles all dynamically registered sysctl tables, even though we don't handle modular sysctls. Errors due to e.g. invalid parameter name or value are reported in the kernel log. The processing is hooked right before the init process is loaded, as some handlers might be more complicated than simple setters and might need some subsystems to be initialized. At the moment the init process can be started and eventually execute a process writing to /proc/sys/ then it should be also fine to do that from the kernel. Sysctls registered later on module load time are not set by this mechanism - it's expected that in such scenarios, setting sysctl values from userspace is practical enough. [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/ [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/ Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Christian Brauner <christian.brauner@ubuntu.com> Link: http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz Link: http://lkml.kernel.org/r/20200427180433.7029-2-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Manfred Spraul	01f39c1c11	xarray.h: correct return code documentation for xa_store_{bh,irq}() __xa_store() and xa_store() document that the functions can fail, and that the return code can be an xa_err() encoded error code. xa_store_bh() and xa_store_irq() do not document that the functions can fail and that they can also return xa_err() encoded error codes. Thus: Update the documentation. Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200430111424.16634-1-manfred@colorfullife.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Rafael Aquini	db38d5c106	kernel: add panic_on_taint Analogously to the introduction of panic_on_warn, this patch introduces a kernel option named panic_on_taint in order to provide a simple and generic way to stop execution and catch a coredump when the kernel gets tainted by any given flag. This is useful for debugging sessions as it avoids having to rebuild the kernel to explicitly add calls to panic() into the code sites that introduce the taint flags of interest. For instance, if one is interested in proceeding with a post-mortem analysis at the point a given code path is hitting a bad page (i.e. unaccount_page_cache_page(), or slab_bug()), a coredump can be collected by rebooting the kernel with 'panic_on_taint=0x20' amended to the command line. Another, perhaps less frequent, use for this option would be as a means for assuring a security policy case where only a subset of taints, or no single taint (in paranoid mode), is allowed for the running system. The optional switch 'nousertaint' is handy in this particular scenario, as it will avoid userspace induced crashes by writes to sysctl interface /proc/sys/kernel/tainted causing false positive hits for such policies. [akpm@linux-foundation.org: tweak kernel-parameters.txt wording] Suggested-by: Qian Cai <cai@lca.pw> Signed-off-by: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Bunk <bunk@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Takashi Iwai <tiwai@suse.de> Link: http://lkml.kernel.org/r/20200515175502.146720-1-aquini@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Orson Zhai	ceabef7dd7	dynamic_debug: add an option to enable dynamic debug for modules only Instead of enabling dynamic debug globally with CONFIG_DYNAMIC_DEBUG, CONFIG_DYNAMIC_DEBUG_CORE will only enable core function of dynamic debug. With the DYNAMIC_DEBUG_MODULE defined for any modules, dynamic debug will be tied to them. This is useful for people who only want to enable dynamic debug for kernel modules without worrying about kernel image size and memory consumption is increasing too much. [orson.zhai@unisoc.com: v2] Link: http://lkml.kernel.org/r/1587408228-10861-1-git-send-email-orson.unisoc@gmail.com Signed-off-by: Orson Zhai <orson.zhai@unisoc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Petr Mladek <pmladek@suse.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Baron <jbaron@akamai.com> Cc: Randy Dunlap <rdunlap@infradead.org> Link: http://lkml.kernel.org/r/1586521984-5890-1-git-send-email-orson.unisoc@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Giuseppe Scrivano	e1eb26fa62	ipc/namespace.c: use a work queue to free_ipc the reason is to avoid a delay caused by the synchronize_rcu() call in kern_umount() when the mqueue mount is freed. the code: #define _GNU_SOURCE #include <sched.h> #include <error.h> #include <errno.h> #include <stdlib.h> int main() { int i; for (i = 0; i < 1000; i++) if (unshare(CLONE_NEWIPC) < 0) error(EXIT_FAILURE, errno, "unshare"); } goes from Command being timed: "./ipc-namespace" User time (seconds): 0.00 System time (seconds): 0.06 Percent of CPU this job got: 0% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:08.05 to Command being timed: "./ipc-namespace" User time (seconds): 0.00 System time (seconds): 0.02 Percent of CPU this job got: 96% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.03 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Waiman Long <longman@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Link: http://lkml.kernel.org/r/20200225145419.527994-1-gscrivan@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:56 -07:00
Jules Irenge	4b78e2013a	ipc/msg: add missing annotation for freeque() Sparse reports a warning at freeque() warning: context imbalance in freeque() - unexpected unlock The root cause is the missing annotation at freeque() Add the missing __releases(RCU) annotation Add the missing __releases(&msq->q_perm) annotation Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Lu Shuaibing <shuaibinglu@126.com> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Link: http://lkml.kernel.org/r/20200403160505.2832-2-jbi.octave@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:55 -07:00
SeongJae Park	92fb1db26e	mm/page_idle.c: skip offline pages 'Idle page tracking' users can pass random pfn that might be mapped to an offline page. To avoid accessing such pages, this commit modifies the 'page_idle_get_page()' to use 'pfn_to_online_page()' instead of 'pfn_valid()' and 'pfn_to_page()' combination, so that the pfn mapped to an offline page can be skipped. Reported-by: David Hildenbrand <david@redhat.com> Signed-off-by: SeongJae Park <sjpark@amazon.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/20200605092502.18018-2-sjpark@amazon.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-06-08 11:05:55 -07:00
Linus Torvalds	9aa900c809	Char/Misc driver patches for 5.8-rc1 Here is the large set of char/misc driver patches for 5.8-rc1 Included in here are: - habanalabs driver updates, loads - mhi bus driver updates - extcon driver updates - clk driver updates (approved by the clock maintainer) - firmware driver updates - fpga driver updates - gnss driver updates - coresight driver updates - interconnect driver updates - parport driver updates (it's still alive!) - nvmem driver updates - soundwire driver updates - visorbus driver updates - w1 driver updates - various misc driver updates In short, loads of different driver subsystem updates along with the drivers as well. All have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXtzkHw8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+yldOwCgus/DgpnI1UL4z+NdBxJrAXtkPmgAn2sgTUea i5RblCmcVMqvHaGtYkY+ =tScN -----END PGP SIGNATURE----- Merge tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the large set of char/misc driver patches for 5.8-rc1 Included in here are: - habanalabs driver updates, loads - mhi bus driver updates - extcon driver updates - clk driver updates (approved by the clock maintainer) - firmware driver updates - fpga driver updates - gnss driver updates - coresight driver updates - interconnect driver updates - parport driver updates (it's still alive!) - nvmem driver updates - soundwire driver updates - visorbus driver updates - w1 driver updates - various misc driver updates In short, loads of different driver subsystem updates along with the drivers as well. All have been in linux-next for a while with no reported issues" * tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (233 commits) habanalabs: correctly cast u64 to void* habanalabs: initialize variable to default value extcon: arizona: Fix runtime PM imbalance on error extcon: max14577: Add proper dt-compatible strings extcon: adc-jack: Fix an error handling path in 'adc_jack_probe()' extcon: remove redundant assignment to variable idx w1: omap-hdq: print dev_err if irq flags are not cleared w1: omap-hdq: fix interrupt handling which did show spurious timeouts w1: omap-hdq: fix return value to be -1 if there is a timeout w1: omap-hdq: cleanup to add missing newline for some dev_dbg /dev/mem: Revoke mappings when a driver claims the region misc: xilinx-sdfec: convert get_user_pages() --> pin_user_pages() misc: xilinx-sdfec: cleanup return value in xsdfec_table_write() misc: xilinx-sdfec: improve get_user_pages_fast() error handling nvmem: qfprom: remove incorrect write support habanalabs: handle MMU cache invalidation timeout habanalabs: don't allow hard reset with open processes habanalabs: GAUDI does not support soft-reset habanalabs: add print for soft reset due to event habanalabs: improve MMU cache invalidation code ...	2020-06-07 10:59:32 -07:00
Linus Torvalds	f558b8364e	Driver core patches for 5.8-rc1 Here is the set of driver core patches for 5.8-rc1. Not all that huge this release, just a number of small fixes and updates: - software node fixes - kobject now sends KOBJ_REMOVE when it is removed from sysfs, not when it is removed from memory (which could come much later) - device link additions and fixes based on testing on more devices - firmware core cleanups - other minor changes, full details in the shortlog All have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXtzmXg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymaAQCfZZ9prH3AMLF7DIkG3vMw0njLXt0An2FxrKYU wetHRG4KL9vTkdz7+TqU =t5LE -----END PGP SIGNATURE----- Merge tag 'driver-core-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core patches for 5.8-rc1. Not all that huge this release, just a number of small fixes and updates: - software node fixes - kobject now sends KOBJ_REMOVE when it is removed from sysfs, not when it is removed from memory (which could come much later) - device link additions and fixes based on testing on more devices - firmware core cleanups - other minor changes, full details in the shortlog All have been in linux-next for a while with no reported issues" * tag 'driver-core-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (23 commits) driver core: Update device link status correctly for SYNC_STATE_ONLY links firmware_loader: change enum fw_opt to u32 software node: implement software_node_unregister() kobject: send KOBJ_REMOVE uevent when the object is removed from sysfs driver core: Remove unnecessary is_fwnode_dev variable in device_add() drivers property: When no children in primary, try secondary driver core: platform: Fix spelling errors in platform.c driver core: Remove check in driver_deferred_probe_force_trigger() of: platform: Batch fwnode parsing when adding all top level devices driver core: fw_devlink: Add support for batching fwnode parsing driver core: Look for waiting consumers only for a fwnode's primary device driver core: Move code to the right part of the file Revert "Revert "driver core: Set fw_devlink to "permissive" behavior by default"" drivers: base: Fix NULL pointer exception in __platform_driver_probe() if a driver developer is foolish firmware_loader: move fw_fallback_config to a private kernel symbol namespace driver core: Add missing '\n' in log messages driver/base/soc: Use kobj_to_dev() API Add documentation on meaning of -EPROBE_DEFER driver core: platform: remove redundant assignment to variable ret debugfs: Use the correct style for SPDX License Identifier ...	2020-06-07 10:53:36 -07:00

1 2 3 4 5 ...

930157 commits