linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-24 19:35:58 +00:00

Author	SHA1	Message	Date
Jing Leng	2e0ffef173	kbuild: Fix include path in scripts/Makefile.modpost commit `23a0cb8e32` upstream. When building an external module, if users don't need to separate the compilation output and source code, they run the following command: "make -C $(LINUX_SRC_DIR) M=$(PWD)". At this point, "$(KBUILD_EXTMOD)" and "$(src)" are the same. If they need to separate them, they run "make -C $(KERNEL_SRC_DIR) O=$(KERNEL_OUT_DIR) M=$(OUT_DIR) src=$(PWD)". Before running the command, they need to copy "Kbuild" or "Makefile" to "$(OUT_DIR)" to prevent compilation failure. So the kernel should change the included path to avoid the copy operation. Signed-off-by: Jing Leng <jleng@ambarella.com> [masahiro: I do not think "M=$(OUT_DIR) src=$(PWD)" is the official way, but this patch is a nice clean up anyway.] Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nicolas Schier <n.schier@avm.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:06 +02:00
Pavel Begunkov	e9d7ca0c46	io_uring: fix UAF due to missing POLLFREE handling [ upstream commmit `791f3465c4` ] Fixes a problem described in `50252e4b5e` ("aio: fix use-after-free due to missing POLLFREE handling") and copies the approach used there. In short, we have to forcibly eject a poll entry when we meet POLLFREE. We can't rely on io_poll_get_ownership() as can't wait for potentially running tw handlers, so we use the fact that wqs are RCU freed. See Eric's patch and comments for more details. Reported-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20211209010455.42744-6-ebiggers@kernel.org Reported-and-tested-by: syzbot+5426c7ed6868c705ca14@syzkaller.appspotmail.com Fixes: `221c5eb233` ("io_uring: add support for IORING_OP_POLL") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4ed56b6f548f7ea337603a82315750449412748a.1642161259.git.asml.silence@gmail.com [axboe: drop non-functional change from patch] Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:06 +02:00
Pavel Begunkov	182dc3aa5a	io_uring: fix wrong arm_poll error handling [ upstream commmit `9d2ad2947a` ] Leaving ip.error set when a request was punted to task_work execution is problematic, don't forget to clear it. Fixes: `aa43477b04` ("io_uring: poll rework") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a6c84ef4182c6962380aebe11b35bdcb25b0ccfb.1655852245.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:06 +02:00
Pavel Begunkov	6c7259c837	io_uring: fail links when poll fails [ upstream commmit `c487a5ad48` ] Don't forget to cancel all linked requests of poll request when __io_arm_poll_handler() failed. Fixes: `aa43477b04` ("io_uring: poll rework") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a78aad962460f9fdfe4aa4c0b62425c88f9415bc.1655852245.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:05 +02:00
Jens Axboe	c41e79a0c4	io_uring: bump poll refs to full 31-bits [ upstream commmit `e2c0cb7c0c` ] The previous commit: 1bc84c40088 ("io_uring: remove poll entry from list when canceling all") removed a potential overflow condition for the poll references. They are currently limited to 20-bits, even if we have 31-bits available. The upper bit is used to mark for cancelation. Bump the poll ref space to 31-bits, making that kind of situation much harder to trigger in general. We'll separately add overflow checking and handling. Fixes: `aa43477b04` ("io_uring: poll rework") Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:05 +02:00
Jens Axboe	7524ec52ca	io_uring: remove poll entry from list when canceling all [ upstream commmit `61bc84c400` ] When the ring is exiting, as part of the shutdown, poll requests are removed. But io_poll_remove_all() does not remove entries when finding them, and since completions are done out-of-band, we can find and remove the same entry multiple times. We do guard the poll execution by poll ownership, but that does not exclude us from reissuing a new one once the previous removal ownership goes away. This can race with poll execution as well, where we then end up seeing req->apoll be NULL because a previous task_work requeue finished the request. Remove the poll entry when we find it and get ownership of it. This prevents multiple invocations from finding it. Fixes: `aa43477b04` ("io_uring: poll rework") Reported-by: Dylan Yudaken <dylany@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:05 +02:00
Jiapeng Chong	95a004a223	io_uring: Remove unused function req_ref_put [ upstream commmit `c84b8a3fef` ] Fix the following clang warnings: fs/io_uring.c:1195:20: warning: unused function 'req_ref_put' [-Wunused-function]. Fixes: `aa43477b04` ("io_uring: poll rework") Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20220113162005.3011-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:05 +02:00
Pavel Begunkov	f770fba096	io_uring: poll rework [ upstream commmit `aa43477b04` ] It's not possible to go forward with the current state of io_uring polling, we need a more straightforward and easier synchronisation. There are a lot of problems with how it is at the moment, including missing events on rewait. The main idea here is to introduce a notion of request ownership while polling, no one but the owner can modify any part but ->poll_refs of struct io_kiocb, that grants us protection against all sorts of races. Main users of such exclusivity are poll task_work handler, so before queueing a tw one should have/acquire ownership, which will be handed off to the tw handler. The other user is __io_arm_poll_handler() do initial poll arming. It starts taking the ownership, so tw handlers won't be run until it's released later in the function after vfs_poll. note: also prevents races in __io_queue_proc(). Poll wake/etc. may not be able to get ownership, then they need to increase the poll refcount and the task_work should notice it and retry if necessary, see io_poll_check_events(). There is also IO_POLL_CANCEL_FLAG flag to notify that we want to kill request. It makes cancellations more reliable, enables double multishot polling, fixes double poll rewait, fixes missing poll events and fixes another bunch of races. Even though it adds some overhead for new refcounting, and there are a couple of nice performance wins: - no req->refs refcounting for poll requests anymore - if the data is already there (once measured for some test to be 1-2% of all apoll requests), it removes it doesn't add atomics and removes spin_lock/unlock pair. - works well with multishots, we don't do remove from queue / add to queue for each new poll event. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6b652927c77ed9580ea4330ac5612f0e0848c946.1639605189.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:05 +02:00
Pavel Begunkov	8dc669632f	io_uring: inline io_poll_complete [ upstream commmit `eb6e6f0690` ] Inline io_poll_complete(), it's simple and doesn't have any particular purpose. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/933d7ee3e4450749a2d892235462c8f18d030293.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:05 +02:00
Pavel Begunkov	20bbcc3163	io_uring: kill poll linking optimisation [ upstream commmit `ab1dab960b` ] With IORING_FEAT_FAST_POLL in place, io_put_req_find_next() for poll requests doesn't make much sense, and in any case re-adding it shouldn't be a problem considering batching in tctx_task_work(). We can remove it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/15699682bf81610ec901d4e79d6da64baa9f70be.1639605189.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:05 +02:00
Pavel Begunkov	a85d7ac14f	io_uring: move common poll bits [ upstream commmit `5641897a5e` ] Move some poll helpers/etc up, we'll need them there shortly Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/6c5c3dba24c86aad5cd389a54a8c7412e6a0621d.1639605189.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:04 +02:00
Pavel Begunkov	040e58f51c	io_uring: refactor poll update [ upstream commmit `2bbb146d96` ] Clean up io_poll_update() and unify cancellation paths for remove and update. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5937138b6265a1285220e2fab1b28132c1d73ce3.1639605189.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:04 +02:00
Pavel Begunkov	b850d6ddc7	io_uring: clean cqe filling functions [ upstream commmit `913a571aff` ] Split io_cqring_fill_event() into a couple of more targeted functions. The first on is io_fill_cqe_aux() for completions that are not associated with request completions and doing the ->cq_extra accounting. Examples are additional CQEs from multishot poll and rsrc notifications. The second is io_fill_cqe_req(), should be called when it's a normal request completion. Nothing more to it at the moment, will be used in later patches. The last one is inlined __io_fill_cqe() for a finer grained control, should be used with caution and in hottest places. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/59a9117a4a44fc9efcf04b3afa51e0d080f5943c.1636559119.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:04 +02:00
Pavel Begunkov	5c0ea4c8e5	io_uring: correct fill events helpers types [ upstream commit `54daa9b2d8` ] CQE result is a 32-bit integer, so the functions generating CQEs are better to accept not long but ints. Convert io_cqring_fill_event() and other helpers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7ca6f15255e9117eae28adcac272744cae29b113.1633373302.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> [pavel: backport] Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:04 +02:00
James Morse	285e77dbb3	arm64: errata: Add Cortex-A510 to the repeat tlbi list commit `39fdb65f52` upstream. Cortex-A510 is affected by an erratum where in rare circumstances the CPUs may not handle a race between a break-before-make sequence on one CPU, and another CPU accessing the same page. This could allow a store to a page that has been unmapped. Work around this by adding the affected CPUs to the list that needs TLB sequences to be done twice. Signed-off-by: James Morse <james.morse@arm.com> Link: https://lore.kernel.org/r/20220704155732.21216-1-james.morse@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Lucas Wei <lucaswei@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:04 +02:00
Miaohe Lin	da60ddd80d	mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte commit `ab74ef708d` upstream. In MCOPY_ATOMIC_CONTINUE case with a non-shared VMA, pages in the page cache are installed in the ptes. But hugepage_add_new_anon_rmap is called for them mistakenly because they're not vm_shared. This will corrupt the page->mapping used by page cache code. Link: https://lkml.kernel.org/r/20220712130542.18836-1-linmiaohe@huawei.com Fixes: `f619147104` ("userfaultfd: add UFFDIO_CONTINUE ioctl") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:04 +02:00
Boqun Feng	e7a792dcd6	Drivers: hv: balloon: Support status report for larger page sizes commit `b3d6dd09ff` upstream. DM_STATUS_REPORT expects the numbers of pages in the unit of 4k pages (HV_HYP_PAGE) instead of guest pages, so to make it work when guest page sizes are larger than 4k, convert the numbers of guest pages into the numbers of HV_HYP_PAGEs. Note that the numbers of guest pages are still used for tracing because tracing is internal to the guest kernel. Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220325023212.1570049-2-boqun.feng@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:04 +02:00
Eric Biggers	2edbdfc89d	crypto: lib - remove unneeded selection of XOR_BLOCKS commit `874b301985` upstream. CRYPTO_LIB_CHACHA_GENERIC doesn't need to select XOR_BLOCKS. It perhaps was thought that it's needed for __crypto_xor, but that's not the case. Enabling XOR_BLOCKS is problematic because the XOR_BLOCKS code runs a benchmark when it is initialized. That causes a boot time regression on systems that didn't have it enabled before. Therefore, remove this unnecessary and problematic selection. Fixes: `e56e189855` ("lib/crypto: add prompts back to crypto libraries") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:03 +02:00
Timo Alho	6db913f5e4	firmware: tegra: bpmp: Do only aligned access to IPC memory area commit `a4740b148a` upstream. Use memcpy_toio and memcpy_fromio variants of memcpy to guarantee no unaligned access to IPC memory area. This is to allow the IPC memory to be mapped as Device memory to further suppress speculative reads from happening within the 64 kB memory area above the IPC memory when 64 kB memory pages are used. Signed-off-by: Timo Alho <talho@nvidia.com> Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Cc: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:03 +02:00
Maxime Ripard	80d46e73e8	drm/vc4: hdmi: Depends on CONFIG_PM commit `72e2329e7c` upstream. We already depend on runtime PM to get the power domains and clocks for most of the devices supported by the vc4 driver, so let's just select it to make sure it's there. Link: https://lore.kernel.org/r/20220629123510.1915022-38-maxime@cerno.tech Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> (cherry picked from commit `f1bc386b31`) Signed-off-by: Maxime Ripard <maxime@cerno.tech> Cc: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:03 +02:00
Maxime Ripard	3d2d12fb78	drm/vc4: hdmi: Rework power up commit `258e483a4d` upstream. The current code tries to handle the case where CONFIG_PM isn't selected by first calling our runtime_resume implementation and then properly report the power state to the runtime_pm core. This allows to have a functionning device even if pm_runtime_get_* functions are nops. However, the device power state if CONFIG_PM is enabled is RPM_SUSPENDED, and thus our vc4_hdmi_write() and vc4_hdmi_read() calls in the runtime_pm hooks will now report a warning since the device might not be properly powered. Even more so, we need CONFIG_PM enabled since the previous RaspberryPi have a power domain that needs to be powered up for the HDMI controller to be usable. The previous patch has created a dependency on CONFIG_PM, now we can just assume it's there and only call pm_runtime_resume_and_get() to make sure our device is powered in bind. Link: https://lore.kernel.org/r/20220629123510.1915022-39-maxime@cerno.tech Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> (cherry picked from commit `53565c28e6`) Signed-off-by: Maxime Ripard <maxime@cerno.tech> Cc: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:03 +02:00
Adam Borowski	8468ccbf4c	ACPI: thermal: drop an always true check commit `e5b5d25444` upstream. Address of a field inside a struct can't possibly be null; gcc-12 warns about this. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:03 +02:00
Maxime Ripard	f8b07c05b6	drm/bridge: Add stubs for devm_drm_of_get_bridge when OF is disabled commit `59050d7838` upstream. If CONFIG_OF is disabled, devm_drm_of_get_bridge won't be compiled in and drivers using that function will fail to build. Add an inline stub so that we can still build-test those cases. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Link: https://patchwork.freedesktop.org/patch/msgid/20210928181333.1176840-1-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:03 +02:00
Jann Horn	3ffb97fce2	mm: Force TLB flush for PFNMAP mappings before unlink_file_vma() commit `b67fbebd4c` upstream. Some drivers rely on having all VMAs through which a PFN might be accessible listed in the rmap for correctness. However, on X86, it was possible for a VMA with stale TLB entries to not be listed in the rmap. This was fixed in mainline with commit `b67fbebd4c` ("mmu_gather: Force tlb-flush VM_PFNMAP vmas"), but that commit relies on preceding refactoring in commit `18ba064e42` ("mmu_gather: Let there be one tlb_{start,end}_vma() implementation") and commit `1e9fdf21a4` ("mmu_gather: Remove per arch tlb_{start,end}_vma()"). This patch provides equivalent protection without needing that refactoring, by forcing a TLB flush between removing PTEs in unmap_vmas() and the call to unlink_file_vma() in free_pgtables(). [This is a stable-specific rewrite of the upstream commit!] Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-05 10:30:03 +02:00
Greg Kroah-Hartman	1ded0ef241	Linux 5.15.64 Link: https://lore.kernel.org/r/20220829105804.609007228@linuxfoundation.org Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:52 +02:00
Daniel Borkmann	4f672112f8	bpf: Don't use tnum_range on array range checking for poke descriptors commit `a657182a5c` upstream. Hsin-Wei reported a KASAN splat triggered by their BPF runtime fuzzer which is based on a customized syzkaller: BUG: KASAN: slab-out-of-bounds in bpf_int_jit_compile+0x1257/0x13f0 Read of size 8 at addr ffff888004e90b58 by task syz-executor.0/1489 CPU: 1 PID: 1489 Comm: syz-executor.0 Not tainted 5.19.0 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x9c/0xc9 print_address_description.constprop.0+0x1f/0x1f0 ? bpf_int_jit_compile+0x1257/0x13f0 kasan_report.cold+0xeb/0x197 ? kvmalloc_node+0x170/0x200 ? bpf_int_jit_compile+0x1257/0x13f0 bpf_int_jit_compile+0x1257/0x13f0 ? arch_prepare_bpf_dispatcher+0xd0/0xd0 ? rcu_read_lock_sched_held+0x43/0x70 bpf_prog_select_runtime+0x3e8/0x640 ? bpf_obj_name_cpy+0x149/0x1b0 bpf_prog_load+0x102f/0x2220 ? __bpf_prog_put.constprop.0+0x220/0x220 ? find_held_lock+0x2c/0x110 ? __might_fault+0xd6/0x180 ? lock_downgrade+0x6e0/0x6e0 ? lock_is_held_type+0xa6/0x120 ? __might_fault+0x147/0x180 __sys_bpf+0x137b/0x6070 ? bpf_perf_link_attach+0x530/0x530 ? new_sync_read+0x600/0x600 ? __fget_files+0x255/0x450 ? lock_downgrade+0x6e0/0x6e0 ? fput+0x30/0x1a0 ? ksys_write+0x1a8/0x260 __x64_sys_bpf+0x7a/0xc0 ? syscall_enter_from_user_mode+0x21/0x70 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f917c4e2c2d The problem here is that a range of tnum_range(0, map->max_entries - 1) has limited ability to represent the concrete tight range with the tnum as the set of resulting states from value + mask can result in a superset of the actual intended range, and as such a tnum_in(range, reg->var_off) check may yield true when it shouldn't, for example tnum_range(0, 2) would result in 00XX -> v = 0000, m = 0011 such that the intended set of {0, 1, 2} is here represented by a less precise superset of {0, 1, 2, 3}. As the register is known const scalar, really just use the concrete reg->var_off.value for the upper index check. Fixes: `d2e4c1e6c2` ("bpf: Constant map key tracking for prog array pokes") Reported-by: Hsin-Wei Hung <hsinweih@uci.edu> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Shung-Hsi Yu <shung-hsi.yu@suse.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/984b37f9fdf7ac36831d2137415a4a915744c1b6.1661462653.git.daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
Saurabh Sengar	cd2a50d0a0	scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq commit `d957e7ffb2` upstream. storvsc_error_wq workqueue should not be marked as WQ_MEM_RECLAIM as it doesn't need to make forward progress under memory pressure. Marking this workqueue as WQ_MEM_RECLAIM may cause deadlock while flushing a non-WQ_MEM_RECLAIM workqueue. In the current state it causes the following warning: [ 14.506347] ------------[ cut here ]------------ [ 14.506354] workqueue: WQ_MEM_RECLAIM storvsc_error_wq_0:storvsc_remove_lun is flushing !WQ_MEM_RECLAIM events_freezable_power_:disk_events_workfn [ 14.506360] WARNING: CPU: 0 PID: 8 at <-snip->kernel/workqueue.c:2623 check_flush_dependency+0xb5/0x130 [ 14.506390] CPU: 0 PID: 8 Comm: kworker/u4:0 Not tainted 5.4.0-1086-azure #91~18.04.1-Ubuntu [ 14.506391] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 05/09/2022 [ 14.506393] Workqueue: storvsc_error_wq_0 storvsc_remove_lun [ 14.506395] RIP: 0010:check_flush_dependency+0xb5/0x130 <-snip-> [ 14.506408] Call Trace: [ 14.506412] __flush_work+0xf1/0x1c0 [ 14.506414] __cancel_work_timer+0x12f/0x1b0 [ 14.506417] ? kernfs_put+0xf0/0x190 [ 14.506418] cancel_delayed_work_sync+0x13/0x20 [ 14.506420] disk_block_events+0x78/0x80 [ 14.506421] del_gendisk+0x3d/0x2f0 [ 14.506423] sr_remove+0x28/0x70 [ 14.506427] device_release_driver_internal+0xef/0x1c0 [ 14.506428] device_release_driver+0x12/0x20 [ 14.506429] bus_remove_device+0xe1/0x150 [ 14.506431] device_del+0x167/0x380 [ 14.506432] __scsi_remove_device+0x11d/0x150 [ 14.506433] scsi_remove_device+0x26/0x40 [ 14.506434] storvsc_remove_lun+0x40/0x60 [ 14.506436] process_one_work+0x209/0x400 [ 14.506437] worker_thread+0x34/0x400 [ 14.506439] kthread+0x121/0x140 [ 14.506440] ? process_one_work+0x400/0x400 [ 14.506441] ? kthread_park+0x90/0x90 [ 14.506443] ret_from_fork+0x35/0x40 [ 14.506445] ---[ end trace 2d9633159fdc6ee7 ]--- Link: https://lore.kernel.org/r/1659628534-17539-1-git-send-email-ssengar@linux.microsoft.com Fixes: `436ad94133` ("scsi: storvsc: Allow only one remove lun work item to be issued per lun") Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
Kiwoong Kim	2c72bead9b	scsi: ufs: core: Enable link lost interrupt commit `6d17a112e9` upstream. Link lost is treated as fatal error with commit `c99b9b2301` ("scsi: ufs: Treat link loss as fatal error"), but the event isn't registered as interrupt source. Enable it. Link: https://lore.kernel.org/r/1659404551-160958-1-git-send-email-kwmad.kim@samsung.com Fixes: `c99b9b2301` ("scsi: ufs: Treat link loss as fatal error") Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
Ian Rogers	da86f80da3	perf stat: Clear evsel->reset_group for each stat run commit `bf515f024e` upstream. If a weak group is broken then the reset_group flag remains set for the next run. Having reset_group set means the counter isn't created and ultimately a segfault. A simple reproduction of this is: # perf stat -r2 -e '{cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles}:W which will be added as a test in the next patch. Fixes: `4804e01116` ("perf stat: Use affinity for opening events") Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20220822213352.75721-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
Stephane Eranian	b5f5fee03d	perf/x86/intel/ds: Fix precise store latency handling commit `d4bdb0bebc` upstream. With the existing code in store_latency_data(), the memory operation (mem_op) returned to the user is always OP_LOAD where in fact, it should be OP_STORE. This comes from the fact that the function is simply grabbing the information from a data source map which covers only load accesses. Intel 12th gen CPU offers precise store sampling that captures both the data source and latency. Therefore it can use the data source mapping table but must override the memory operation to reflect stores instead of loads. Fixes: `61b985e3e7` ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids") Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220818054613.1548130-1-eranian@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
Stephane Eranian	83bd6d1212	perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU commit `11745ecfe8` upstream. Existing code was generating bogus counts for the SNB IMC bandwidth counters: $ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/ 1.000327813 1,024.03 MiB uncore_imc/data_reads/ 1.000327813 20.73 MiB uncore_imc/data_writes/ 2.000580153 261,120.00 MiB uncore_imc/data_reads/ 2.000580153 23.28 MiB uncore_imc/data_writes/ The problem was introduced by commit: `07ce734dd8` ("perf/x86/intel/uncore: Clean up client IMC") Where the read_counter callback was replace to point to the generic uncore_mmio_read_counter() function. The SNB IMC counters are freerunnig 32-bit counters laid out contiguously in MMIO. But uncore_mmio_read_counter() is using a readq() call to read from MMIO therefore reading 64-bit from MMIO. Although this is okay for the uncore_perf_event_update() function because it is shifting the value based on the actual counter width to compute a delta, it is not okay for the uncore_pmu_event_start() which is simply reading the counter and therefore priming the event->prev_count with a bogus value which is responsible for causing bogus deltas in the perf stat command above. The fix is to reintroduce the custom callback for read_counter for the SNB IMC PMU and use readl() instead of readq(). With the change the output of perf stat is back to normal: $ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/ 1.000120987 296.94 MiB uncore_imc/data_reads/ 1.000120987 138.42 MiB uncore_imc/data_writes/ 2.000403144 175.91 MiB uncore_imc/data_reads/ 2.000403144 68.50 MiB uncore_imc/data_writes/ Fixes: `07ce734dd8` ("perf/x86/intel/uncore: Clean up client IMC") Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20220803160031.1379788-1-eranian@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
James Clark	a38e7ab467	perf python: Fix build when PYTHON_CONFIG is user supplied commit `bc9e7fe313` upstream. The previous change to Python autodetection had a small mistake where the auto value was used to determine the Python binary, rather than the user supplied value. The Python binary is only used for one part of the build process, rather than the final linking, so it was producing correct builds in most scenarios, especially when the auto detected value matched what the user wanted, or the system only had a valid set of Pythons. Change it so that the Python binary path is derived from either the PYTHON_CONFIG value or PYTHON value, depending on what is specified by the user. This was the original intention. This error was spotted in a build failure an odd cross compilation environment after commit `4c41cb46a7` ("perf python: Prefer python3") was merged. Fixes: `630af16eee` ("perf tools: Use Python devtools for version autodetection rather than runtime") Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220728093946.1337642-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:51 +02:00
Yu Kuai	77864ed6c6	blk-mq: fix io hung due to missing commit_rqs commit `65fac0d54f` upstream. Currently, in virtio_scsi, if 'bd->last' is not set to true while dispatching request, such io will stay in driver's queue, and driver will wait for block layer to dispatch more rqs. However, if block layer failed to dispatch more rq, it should trigger commit_rqs to inform driver. There is a problem in blk_mq_try_issue_list_directly() that commit_rqs won't be called: // assume that queue_depth is set to 1, list contains two rq blk_mq_try_issue_list_directly blk_mq_request_issue_directly // dispatch first rq // last is false __blk_mq_try_issue_directly blk_mq_get_dispatch_budget // succeed to get first budget __blk_mq_issue_directly scsi_queue_rq cmd->flags \|= SCMD_LAST virtscsi_queuecommand kick = (sc->flags & SCMD_LAST) != 0 // kick is false, first rq won't issue to disk queued++ blk_mq_request_issue_directly // dispatch second rq __blk_mq_try_issue_directly blk_mq_get_dispatch_budget // failed to get second budget ret == BLK_STS_RESOURCE blk_mq_request_bypass_insert // errors is still 0 if (!list_empty(list) \|\| errors && ...) // won't pass, commit_rqs won't be called In this situation, first rq relied on second rq to dispatch, while second rq relied on first rq to complete, thus they will both hung. Fix the problem by also treat 'BLK_STS_RESOURCE' as 'errors' since it means that request is not queued successfully. Same problem exists in blk_mq_dispatch_rq_list(), 'BLK_STS_RESOURCE' can't be treated as 'errors' here, fix the problem by calling commit_rqs if queue_rq return 'BLK_STS_*RESOURCE'. Fixes: `d666ba98f8` ("blk-mq: add mq_ops->commit_rqs()") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220726122224.1790882-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:50 +02:00
Salvatore Bonaccorso	4428d15cdd	Documentation/ABI: Mention retbleed vulnerability info file for sysfs commit `00da0cb385` upstream. While reporting for the AMD retbleed vulnerability was added in `6b80b59b35` ("x86/bugs: Report AMD retbleed vulnerability") the new sysfs file was not mentioned so far in the ABI documentation for sysfs-devices-system-cpu. Fix that. Fixes: `6b80b59b35` ("x86/bugs: Report AMD retbleed vulnerability") Signed-off-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220801091529.325327-1-carnil@debian.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:50 +02:00
Peter Zijlstra	992d2fc2fe	x86/nospec: Fix i386 RSB stuffing commit `3329249737` upstream. Turns out that i386 doesn't unconditionally have LFENCE, as such the loop in __FILL_RETURN_BUFFER isn't actually speculation safe on such chips. Fixes: `ba6e31af2b` ("x86/speculation: Add LFENCE to RSB fill sequence") Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/Yv9tj9vbQ9nNlXoY@worktop.programming.kicks-ass.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:50 +02:00
Liam Howlett	577d9c05cc	binder_alloc: add missing mmap_lock calls when using the VMA commit `44e602b4e5` upstream. Take the mmap_read_lock() when using the VMA in binder_alloc_print_pages() and when checking for a VMA in binder_alloc_new_buf_locked(). It is worth noting binder_alloc_new_buf_locked() drops the VMA read lock after it verifies a VMA exists, but may be taken again deeper in the call stack, if necessary. Link: https://lkml.kernel.org/r/20220810160209.1630707-1-Liam.Howlett@oracle.com Fixes: `a43cfc87ca` (android: binder: stop saving a pointer to the VMA) Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Reported-by: <syzbot+a7b60a176ec13cafb793@syzkaller.appspotmail.com> Acked-by: Carlos Llamas <cmllamas@google.com> Tested-by: Ondrej Mosnacek <omosnace@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Christian Brauner (Microsoft) <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hridya Valsaraju <hridya@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Martijn Coenen <maco@android.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Todd Kjos <tkjos@android.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: "Arve Hjønnevåg" <arve@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:50 +02:00
Zenghui Yu	1ed630bc53	arm64: Fix match_list for erratum 1286807 on Arm Cortex-A76 commit `5e1e087457` upstream. Since commit `51f559d665` ("arm64: Enable repeat tlbi workaround on KRYO4XX gold CPUs"), we failed to detect erratum 1286807 on Cortex-A76 because its entry in arm64_repeat_tlbi_list[] was accidently corrupted by this commit. Fix this issue by creating a separate entry for Kryo4xx Gold. Fixes: `51f559d665` ("arm64: Enable repeat tlbi workaround on KRYO4XX gold CPUs") Cc: Shreyas K K <quic_shrekk@quicinc.com> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220809043848.969-1-yuzenghui@huawei.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:50 +02:00
Yonglong Li	af61a8f760	mptcp: Fix crash due to tcp_tsorted_anchor was initialized before release skb commit `3ef3905aa3` upstream. Got crash when doing pressure test of mptcp: =========================================================================== dst_release: dst:ffffa06ce6e5c058 refcnt:-1 kernel tried to execute NX-protected page - exploit attempt? (uid: 0) BUG: unable to handle kernel paging request at ffffa06ce6e5c058 PGD 190a01067 P4D 190a01067 PUD 43fffb067 PMD 22e403063 PTE 8000000226e5c063 Oops: 0011 [#1] SMP PTI CPU: 7 PID: 7823 Comm: kworker/7:0 Kdump: loaded Tainted: G E Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.2.1 04/01/2014 Call Trace: ? skb_release_head_state+0x68/0x100 ? skb_release_all+0xe/0x30 ? kfree_skb+0x32/0xa0 ? mptcp_sendmsg_frag+0x57e/0x750 ? __mptcp_retrans+0x21b/0x3c0 ? __switch_to_asm+0x35/0x70 ? mptcp_worker+0x25e/0x320 ? process_one_work+0x1a7/0x360 ? worker_thread+0x30/0x390 ? create_worker+0x1a0/0x1a0 ? kthread+0x112/0x130 ? kthread_flush_work_fn+0x10/0x10 ? ret_from_fork+0x35/0x40 =========================================================================== In __mptcp_alloc_tx_skb skb was allocated and skb->tcp_tsorted_anchor will be initialized, in under memory pressure situation sk_wmem_schedule will return false and then kfree_skb. In this case skb->_skb_refdst is not null because_skb_refdst and tcp_tsorted_anchor are stored in the same mem, and kfree_skb will try to release dst and cause crash. Fixes: `f70cad1085` ("mptcp: stop relying on tcp_tx_skb_cache") Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/20220317220953.426024-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:50 +02:00
Guoqing Jiang	661c01b218	md: call __md_stop_writes in md_stop commit `0dd84b3193` upstream. From the link [1], we can see raid1d was running even after the path raid_dtr -> md_stop -> __md_stop. Let's stop write first in destructor to align with normal md-raid to fix the KASAN issue. [1]. https://lore.kernel.org/linux-raid/CAPhsuW5gc4AakdGNdF8ubpezAuDLFOYUO_sfMZcec6hQFm8nhg@mail.gmail.com/T/#m7f12bf90481c02c6d2da68c64aeed4779b7df74a Fixes: `48df498daf` ("md: move bitmap_destroy to the beginning of __md_stop") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:50 +02:00
Guoqing Jiang	ee0c613bfe	Revert "md-raid: destroy the bitmap after destroying the thread" commit `1d258758cf` upstream. This reverts commit `e151db8ecf`. Because it obviously breaks clustered raid as noticed by Neil though it fixed KASAN issue for dm-raid, let's revert it and fix KASAN issue in next commit. [1]. https://lore.kernel.org/linux-raid/a6657e08-b6a7-358b-2d2a-0ac37d49d23a@linux.dev/T/#m95ac225cab7409f66c295772483d091084a6d470 Fixes: `e151db8ecf` ("md-raid: destroy the bitmap after destroying the thread") Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:49 +02:00
David Hildenbrand	0038f85933	mm/hugetlb: fix hugetlb not supporting softdirty tracking commit `f96f7a4087` upstream. Patch series "mm/hugetlb: fix write-fault handling for shared mappings", v2. I observed that hugetlb does not support/expect write-faults in shared mappings that would have to map the R/O-mapped page writable -- and I found two case where we could currently get such faults and would erroneously map an anon page into a shared mapping. Reproducers part of the patches. I propose to backport both fixes to stable trees. The first fix needs a small adjustment. This patch (of 2): Staring at hugetlb_wp(), one might wonder where all the logic for shared mappings is when stumbling over a write-protected page in a shared mapping. In fact, there is none, and so far we thought we could get away with that because e.g., mprotect() should always do the right thing and map all pages directly writable. Looks like we were wrong: -------------------------------------------------------------------------- #include <stdio.h> #include <stdlib.h> #include <string.h> #include <fcntl.h> #include <unistd.h> #include <errno.h> #include <sys/mman.h> #define HUGETLB_SIZE (2 * 1024 * 1024u) static void clear_softdirty(void) { int fd = open("/proc/self/clear_refs", O_WRONLY); const char ctrl = "4"; int ret; if (fd < 0) { fprintf(stderr, "open(clear_refs) failed\n"); exit(1); } ret = write(fd, ctrl, strlen(ctrl)); if (ret != strlen(ctrl)) { fprintf(stderr, "write(clear_refs) failed\n"); exit(1); } close(fd); } int main(int argc, char argv) { char map; int fd; fd = open("/dev/hugepages/tmp", O_RDWR \| O_CREAT); if (!fd) { fprintf(stderr, "open() failed\n"); return -errno; } if (ftruncate(fd, HUGETLB_SIZE)) { fprintf(stderr, "ftruncate() failed\n"); return -errno; } map = mmap(NULL, HUGETLB_SIZE, PROT_READ\|PROT_WRITE, MAP_SHARED, fd, 0); if (map == MAP_FAILED) { fprintf(stderr, "mmap() failed\n"); return -errno; } map = 0; if (mprotect(map, HUGETLB_SIZE, PROT_READ)) { fprintf(stderr, "mmprotect() failed\n"); return -errno; } clear_softdirty(); if (mprotect(map, HUGETLB_SIZE, PROT_READ\|PROT_WRITE)) { fprintf(stderr, "mmprotect() failed\n"); return -errno; } map = 0; return 0; } -------------------------------------------------------------------------- Above test fails with SIGBUS when there is only a single free hugetlb page. # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # ./test Bus error (core dumped) And worse, with sufficient free hugetlb pages it will map an anonymous page into a shared mapping, for example, messing up accounting during unmap and breaking MAP_SHARED semantics: # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # ./test # cat /proc/meminfo \| grep HugePages_ HugePages_Total: 2 HugePages_Free: 1 HugePages_Rsvd: 18446744073709551615 HugePages_Surp: 0 Reason in this particular case is that vma_wants_writenotify() will return "true", removing VM_SHARED in vma_set_page_prot() to map pages write-protected. Let's teach vma_wants_writenotify() that hugetlb does not support softdirty tracking. Link: https://lkml.kernel.org/r/20220811103435.188481-1-david@redhat.com Link: https://lkml.kernel.org/r/20220811103435.188481-2-david@redhat.com Fixes: `64e455079e` ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Peter Feiner <pfeiner@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Jamie Liu <jamieliu@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> [3.18+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:49 +02:00
Greg Kroah-Hartman	6ee82524b0	Revert "usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling" This reverts commit `eaf3a094d8` which is upstream commit `1ce8b37241`. It is reported to cause problems, so drop it from the 5.15.y tree until the root cause can be determined. Reported-by: Lukas Wunner <lukas@wunner.de> Cc: Oleksij Rempel <o.rempel@pengutronix.de> Cc: Ferry Toth <fntoth@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Andre Edich <andre.edich@microchip.com> Cc: David S. Miller <davem@davemloft.net> Cc: Sasha Levin <sashal@kernel.org> Link: https://lore.kernel.org/r/20220826132137.GA24932@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:49 +02:00
Greg Kroah-Hartman	7ae43647f4	Revert "usbnet: smsc95xx: Fix deadlock on runtime resume" This reverts commit `b574d1e3e9` which is commit `7b960c967f` upstream. It is reported to cause problems, so drop it from the 5.15.y tree until the root cause can be determined. Reported-by: Lukas Wunner <lukas@wunner.de> Cc: Oleksij Rempel <o.rempel@pengutronix.de> Cc: Ferry Toth <fntoth@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Andre Edich <andre.edich@microchip.com> Cc: David S. Miller <davem@davemloft.net> Cc: Sasha Levin <sashal@kernel.org> Link: https://lore.kernel.org/r/20220826132137.GA24932@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:49 +02:00
Jens Axboe	295219ab7d	io_uring: fix issue with io_write() not always undoing sb_start_write() commit `e053aaf4da` upstream. This is actually an older issue, but we never used to hit the -EAGAIN path before having done sb_start_write(). Make sure that we always call kiocb_end_write() if we need to retry the write, so that we keep the calls to sb_start_write() etc balanced. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:49 +02:00
Conor Dooley	f8aafb25ec	riscv: traps: add missing prototype commit `d951b20b9d` upstream. Sparse complains: arch/riscv/kernel/traps.c:213:6: warning: symbol 'shadow_stack' was not declared. Should it be static? The variable is used in entry.S, so declare shadow_stack there alongside SHADOW_OVERFLOW_STACK_SIZE. Fixes: `31da94c25a` ("riscv: add VMAP_STACK overflow detection") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220814141237.493457-5-mail@conchuod.ie Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:49 +02:00
Juergen Gross	c2b7bae7c9	xen/privcmd: fix error exit of privcmd_ioctl_dm_op() commit `c5deb27895` upstream. The error exit of privcmd_ioctl_dm_op() is calling unlock_pages() potentially with pages being NULL, leading to a NULL dereference. Additionally lock_pages() doesn't check for pin_user_pages_fast() having been completely successful, resulting in potentially not locking all pages into memory. This could result in sporadic failures when using the related memory in user mode. Fix all of that by calling unlock_pages() always with the real number of pinned pages, which will be zero in case pages being NULL, and by checking the number of pages pinned by pin_user_pages_fast() matching the expected number of pages. Cc: <stable@vger.kernel.org> Fixes: `ab520be8cd` ("xen/privcmd: Add IOCTL_PRIVCMD_DM_OP") Reported-by: Rustam Subkhankulov <subkhankulov@ispras.ru> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Link: https://lore.kernel.org/r/20220825141918.3581-1-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:49 +02:00
David Howells	0351fdbd8c	smb3: missing inode locks in punch hole commit `ba0803050d` upstream. smb3 fallocate punch hole was not grabbing the inode or filemap_invalidate locks so could have race with pagemap reinstantiating the page. Cc: stable@vger.kernel.org Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:49 +02:00
Karol Herbst	3640cdccbe	nouveau: explicitly wait on the fence in nouveau_bo_move_m2mf commit `6b04ce966a` upstream. It is a bit unlcear to us why that's helping, but it does and unbreaks suspend/resume on a lot of GPUs without any known drawbacks. Cc: stable@vger.kernel.org # v5.15+ Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220819200928.401416-1-kherbst@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:48 +02:00
Riwen Lu	b490dfcbb9	ACPI: processor: Remove freq Qos request for all CPUs commit `36527b9d88` upstream. The freq Qos request would be removed repeatedly if the cpufreq policy relates to more than one CPU. Then, it would cause the "called for unknown object" warning. Remove the freq Qos request for each CPU relates to the cpufreq policy, instead of removing repeatedly for the last CPU of it. Fixes: `a1bb46c36c` ("ACPI: processor: Add QoS requests for all CPUs") Reported-by: Jeremy Linton <Jeremy.Linton@arm.com> Tested-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: Riwen Lu <luriwen@kylinos.cn> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:48 +02:00
Shakeel Butt	f1aedd2ffe	Revert "memcg: cleanup racy sum avoidance code" commit `dbb16df644` upstream. This reverts commit `96e51ccf1a`. Recently we started running the kernel with rstat infrastructure on production traffic and begin to see negative memcg stats values. Particularly the 'sock' stat is the one which we observed having negative value. $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 18446744073708724224 Re-run after couple of seconds $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 53248 For now we are only seeing this issue on large machines (256 CPUs) and only with 'sock' stat. I think the networking stack increase the stat on one cpu and decrease it on another cpu much more often. So, this negative sock is due to rstat flusher flushing the stats on the CPU that has seen the decrement of sock but missed the CPU that has increments. A typical race condition. For easy stable backport, revert is the most simple solution. For long term solution, I am thinking of two directions. First is just reduce the race window by optimizing the rstat flusher. Second is if the reader sees a negative stat value, force flush and restart the stat collection. Basically retry but limited. Link: https://lkml.kernel.org/r/20220817172139.3141101-1-shakeelb@google.com Fixes: `96e51ccf1a` ("memcg: cleanup racy sum avoidance code") Signed-off-by: Shakeel Butt <shakeelb@google.com> Cc: "Michal Koutný" <mkoutny@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: David Hildenbrand <david@redhat.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: <stable@vger.kernel.org> [5.15] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-31 17:16:48 +02:00

1 2 3 4 5 ...

1055135 commits