linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-10-29 23:53:32 +00:00

Author	SHA1	Message	Date
Jann Horn	bcf85fcedf	romfs: fix uninitialized memory leak in romfs_dev_read() romfs has a superblock field that limits the size of the filesystem; data beyond that limit is never accessed. romfs_dev_read() fetches a caller-supplied number of bytes from the backing device. It returns 0 on success or an error code on failure; therefore, its API can't represent short reads, it's all-or-nothing. However, when romfs_dev_read() detects that the requested operation would cross the filesystem size limit, it currently silently truncates the requested number of bytes. This e.g. means that when the content of a file with size 0x1000 starts one byte before the filesystem size limit, ->readpage() will only fill a single byte of the supplied page while leaving the rest uninitialized, leaking that uninitialized memory to userspace. Fix it by returning an error code instead of truncating the read when the requested read operation would go beyond the end of the filesystem. Fixes: `da4458bda2` ("NOMMU: Make it possible for RomFS to use MTD devices directly") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200818013202.2246365-1-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-21 09:52:53 -07:00
Leon Romanovsky	86f54bb7e4	mm/rodata_test.c: fix missing function declaration The compilation with CONFIG_DEBUG_RODATA_TEST set produces the following warning due to the missing include. mm/rodata_test.c:15:6: warning: no previous prototype for 'rodata_test' [-Wmissing-prototypes] 15 \| void rodata_test(void) \| ^~~~~~~~~~~ Fixes: `2959a5f726` ("mm: add arch-independent testcases for RODATA") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lkml.kernel.org/r/20200819080026.918134-1-leon@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-21 09:52:53 -07:00
Aneesh Kumar K.V	e47110e905	mm/vunmap: add cond_resched() in vunmap_pmd_range Like zap_pte_range add cond_resched so that we can avoid softlockups as reported below. On non-preemptible kernel with large I/O map region (like the one we get when using persistent memory with sector mode), an unmap of the namespace can report below softlockups. 22724.027334] watchdog: BUG: soft lockup - CPU#49 stuck for 23s! [ndctl:50777] NIP [c0000000000dc224] plpar_hcall+0x38/0x58 LR [c0000000000d8898] pSeries_lpar_hpte_invalidate+0x68/0xb0 Call Trace: flush_hash_page+0x114/0x200 hpte_need_flush+0x2dc/0x540 vunmap_page_range+0x538/0x6f0 free_unmap_vmap_area+0x30/0x70 remove_vm_area+0xfc/0x140 __vunmap+0x68/0x270 __iounmap.part.0+0x34/0x60 memunmap+0x54/0x70 release_nodes+0x28c/0x300 device_release_driver_internal+0x16c/0x280 unbind_store+0x124/0x170 drv_attr_store+0x44/0x60 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 __vfs_write+0x3c/0x70 vfs_write+0xd8/0x260 ksys_write+0xdc/0x130 system_call+0x5c/0x70 Reported-by: Harish Sriram <harish@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200807075933.310240-1-aneesh.kumar@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-21 09:52:53 -07:00
Hugh Dickins	f3f99d63a8	khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter() syzbot crashes on the VM_BUG_ON_MM(khugepaged_test_exit(mm), mm) in __khugepaged_enter(): yes, when one thread is about to dump core, has set core_state, and is waiting for others, another might do something calling __khugepaged_enter(), which now crashes because I lumped the core_state test (known as "mmget_still_valid") into khugepaged_test_exit(). I still think it's best to lump them together, so just in this exceptional case, check mm->mm_users directly instead of khugepaged_test_exit(). Fixes: `bbe98f9cad` ("khugepaged: khugepaged_test_exit() check mmget_still_valid()") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Yang Shi <shy828301@gmail.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Song Liu <songliubraving@fb.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Eric Dumazet <edumazet@google.com> Cc: <stable@vger.kernel.org> [4.8+] Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008141503370.18085@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-21 09:52:53 -07:00
Xu Wang	d5a1695977	hugetlb_cgroup: convert comma to semicolon Replace a comma between expression statements by a semicolon. Fixes: `faced7e080` ("mm: hugetlb controller for cgroups v2") Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Tejun Heo <tj@kernel.org> Cc: Giuseppe Scrivano <gscrivan@redhat.com> Link: http://lkml.kernel.org/r/20200818064333.21759-1-vulab@iscas.ac.cn Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-21 09:52:52 -07:00
Nick Desaulniers	00c54a80cd	mailmap: add Andi Kleen I keep getting bounce back from the suse.de address. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Quentin Perret <qperret@qperret.net> Link: http://lkml.kernel.org/r/20200818203214.659955-1-ndesaulniers@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-21 09:52:52 -07:00
Thomas Gleixner	d88d59b64c	core/entry: Respect syscall number rewrites The transcript of the x86 entry code to the generic version failed to reload the syscall number from ptregs after ptrace and seccomp have run, which both can modify the syscall number in ptregs. It returns the original syscall number instead which is obviously not the right thing to do. Reload the syscall number to fix that. Fixes: `142781e108` ("entry: Provide generic syscall entry functionality") Reported-by: Kyle Huey <me@kylehuey.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kyle Huey <me@kylehuey.com> Tested-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/87blj6ifo8.fsf@nanos.tec.linutronix.de	2020-08-21 16:17:29 +02:00
Sean Christopherson	6a3ea3e68b	x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM KVM has an optmization to avoid expensive MRS read/writes on VMENTER/EXIT. It caches the MSR values and restores them either when leaving the run loop, on preemption or when going out to user space. The affected MSRs are not required for kernel context operations. This changed with the recently introduced mechanism to handle FSGSBASE in the paranoid entry code which has to retrieve the kernel GSBASE value by accessing per CPU memory. The mechanism needs to retrieve the CPU number and uses either LSL or RDPID if the processor supports it. Unfortunately RDPID uses MSR_TSC_AUX which is in the list of cached and lazily restored MSRs, which means between the point where the guest value is written and the point of restore, MSR_TSC_AUX contains a random number. If an NMI or any other exception which uses the paranoid entry path happens in such a context, then RDPID returns the random guest MSR_TSC_AUX value. As a consequence this reads from the wrong memory location to retrieve the kernel GSBASE value. Kernel GS is used to for all regular this_cpu_*() operations. If the GSBASE in the exception handler points to the per CPU memory of a different CPU then this has the obvious consequences of data corruption and crashes. As the paranoid entry path is the only place which accesses MSR_TSX_AUX (via RDPID) and the fallback via LSL is not significantly slower, remove the RDPID alternative from the entry path and always use LSL. The alternative would be to write MSR_TSC_AUX on every VMENTER and VMEXIT which would be inflicting massive overhead on that code path. [ tglx: Rewrote changelog ] Fixes: `eaad981291` ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro") Reported-by: Tom Lendacky <thomas.lendacky@amd.com> Debugged-by: Tom Lendacky <thomas.lendacky@amd.com> Suggested-by: Andy Lutomirski <luto@kernel.org> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200821105229.18938-1-pbonzini@redhat.com	2020-08-21 16:15:27 +02:00
Leon Romanovsky	f6da70d99c	MAINTAINERS: Update Mellanox and Cumulus Network addresses to new domain Mellanox and Cumulus Network were acquired by Nvidia, so change the maintainers emails to new domain name. Link: https://lore.kernel.org/r/20200810091100.243932-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-21 10:48:48 -03:00
Kajol Jain	64ef8f2c47	powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver Commit `792f73f747` ("powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show cpumask") added cpumask file as part of hv-24x7 driver inside the interface folder. The cpumask file is supposed to be in the top folder of the PMU driver in order to make hotplug work. This patch fixes that issue and creates new group 'cpumask_attr_group' to add cpumask file and make sure it added in top folder. command:# cat /sys/devices/hv_24x7/cpumask 0 Fixes: `792f73f747` ("powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show cpumask") Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200821080610.123997-1-kjain@linux.ibm.com	2020-08-21 23:35:27 +10:00
Christophe Leroy	541cebb51f	powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000 In is_module_segment(), when VMALLOC_END is over 0xf0000000, ALIGN(VMALLOC_END, SZ_256M) has value 0. In that case, addr >= ALIGN(VMALLOC_END, SZ_256M) is always true then is_module_segment() always returns false. Use (ALIGN(VMALLOC_END, SZ_256M) - 1) which will have value 0xffffffff and will be suitable for the comparison. Fixes: `c496433197` ("powerpc/32s: Only leave NX unset on segments used for modules") Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/09fc73fe9c7423c6b4cf93f93df9bb0ed8eefab5.1597994047.git.christophe.leroy@csgroup.eu	2020-08-21 23:30:25 +10:00
Rob Herring	abf532ccea	KVM: arm64: Print warning when cpu erratum can cause guests to deadlock If guests don't have certain CPU erratum workarounds implemented, then there is a possibility a guest can deadlock the system. IOW, only trusted guests should be used on systems with the erratum. This is the case for Cortex-A57 erratum 832075. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: kvmarm@lists.cs.columbia.edu Link: https://lore.kernel.org/r/20200803193127.3012242-2-robh@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-08-21 12:23:09 +01:00
Marc Zyngier	bf87bb0881	arm64: Allow booting of late CPUs affected by erratum `1418040` As we can now switch from a system that isn't affected by `1418040` to a system that globally is affected, let's allow affected CPUs to come in at a later time. Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200731173824.107480-3-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-08-21 11:39:56 +01:00
Marc Zyngier	d49f7d7376	arm64: Move handling of erratum `1418040` into C code Instead of dealing with erratum `1418040` on each entry and exit, let's move the handling to __switch_to() instead, which has several advantages: - It can be applied when it matters (switching between 32 and 64 bit tasks). - It is written in C (yay!) - It can rely on static keys rather than alternatives Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200731173824.107480-2-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2020-08-21 11:39:56 +01:00
Masahiro Yamada	510bc3cb1d	kconfig: qconf: replace deprecated QString::sprintf() with QTextStream QString::sprintf() is deprecated in the latest Qt version, and spawns a lot of warnings: HOSTCXX scripts/kconfig/qconf.o scripts/kconfig/qconf.cc: In member function ‘void ConfigInfoView::menuInfo()’: scripts/kconfig/qconf.cc:1090:61: warning: ‘QString& QString::sprintf(const char, ...)’ is deprecated: Use asprintf(), arg() or QTextStream instead [-Wdeprecated-declarations] 1090 \| head += QString().sprintf("<a href=\"s%s\">", sym->name); \| ^ In file included from /usr/include/qt5/QtGui/qkeysequence.h:44, from /usr/include/qt5/QtWidgets/qaction.h:44, from /usr/include/qt5/QtWidgets/QAction:1, from scripts/kconfig/qconf.cc:7: /usr/include/qt5/QtCore/qstring.h:382:14: note: declared here 382 \| QString &sprintf(const char format, ...) Q_ATTRIBUTE_FORMAT_PRINTF(2, 3); \| ^~~~~~~ scripts/kconfig/qconf.cc:1099:60: warning: ‘QString& QString::sprintf(const char, ...)’ is deprecated: Use asprintf(), arg() or QTextStream instead [-Wdeprecated-declarations] 1099 \| head += QString().sprintf("<a href=\"s%s\">", sym->name); \| ^ In file included from /usr/include/qt5/QtGui/qkeysequence.h:44, from /usr/include/qt5/QtWidgets/qaction.h:44, from /usr/include/qt5/QtWidgets/QAction:1, from scripts/kconfig/qconf.cc:7: /usr/include/qt5/QtCore/qstring.h:382:14: note: declared here 382 \| QString &sprintf(const char format, ...) Q_ATTRIBUTE_FORMAT_PRINTF(2, 3); \| ^~~~~~~ scripts/kconfig/qconf.cc:1127:90: warning: ‘QString& QString::sprintf(const char, ...)’ is deprecated: Use asprintf(), arg() or QTextStream instead [-Wdeprecated-declarations] 1127 \| debug += QString().sprintf("defined at %s:%d<br><br>", _menu->file->name, _menu->lineno); \| ^ In file included from /usr/include/qt5/QtGui/qkeysequence.h:44, from /usr/include/qt5/QtWidgets/qaction.h:44, from /usr/include/qt5/QtWidgets/QAction:1, from scripts/kconfig/qconf.cc:7: /usr/include/qt5/QtCore/qstring.h:382:14: note: declared here 382 \| QString &sprintf(const char format, ...) Q_ATTRIBUTE_FORMAT_PRINTF(2, 3); \| ^~~~~~~ scripts/kconfig/qconf.cc: In member function ‘QString ConfigInfoView::debug_info(symbol)’: scripts/kconfig/qconf.cc:1150:68: warning: ‘QString& QString::sprintf(const char, ...)’ is deprecated: Use asprintf(), arg() or QTextStream instead [-Wdeprecated-declarations] 1150 \| debug += QString().sprintf("prompt: <a href=\"m%s\">", sym->name); \| ^ In file included from /usr/include/qt5/QtGui/qkeysequence.h:44, from /usr/include/qt5/QtWidgets/qaction.h:44, from /usr/include/qt5/QtWidgets/QAction:1, from scripts/kconfig/qconf.cc:7: /usr/include/qt5/QtCore/qstring.h:382:14: note: declared here 382 \| QString &sprintf(const char format, ...) Q_ATTRIBUTE_FORMAT_PRINTF(2, 3); \| ^~~~~~~ scripts/kconfig/qconf.cc: In static member function ‘static void ConfigInfoView::expr_print_help(void, symbol, const char)’: scripts/kconfig/qconf.cc:1225:59: warning: ‘QString& QString::sprintf(const char, ...)’ is deprecated: Use asprintf(), arg() or QTextStream instead [-Wdeprecated-declarations] 1225 \| text += QString().sprintf("<a href=\"s%s\">", sym->name); \| ^ In file included from /usr/include/qt5/QtGui/qkeysequence.h:44, from /usr/include/qt5/QtWidgets/qaction.h:44, from /usr/include/qt5/QtWidgets/QAction:1, from scripts/kconfig/qconf.cc:7: /usr/include/qt5/QtCore/qstring.h:382:14: note: declared here 382 \| QString &sprintf(const char *format, ...) Q_ATTRIBUTE_FORMAT_PRINTF(2, 3); \| ^~~~~~~ The documentation also says: "Warning: We do not recommend using QString::asprintf() in new Qt code. Instead, consider using QTextStream or arg(), both of which support Unicode strings seamlessly and are type-safe." Use QTextStream as suggested. Reported-by: Robert Crawford <flacycads@cox.net> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2020-08-21 10:23:38 +09:00
Masahiro Yamada	68fd110b3e	kconfig: qconf: remove redundant help in the info view The same information is repeated in the info view. Remove the second one. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2020-08-21 10:23:37 +09:00
Masahiro Yamada	53efe2e76c	kconfig: qconf: remove qInfo() to get back Qt4 support qconf is supposed to work with Qt4 and Qt5, but since commit `c4f7398bee` ("kconfig: qconf: make debug links work again"), building with Qt4 fails as follows: HOSTCXX scripts/kconfig/qconf.o scripts/kconfig/qconf.cc: In member function ‘void ConfigInfoView::clicked(const QUrl&)’: scripts/kconfig/qconf.cc:1241:3: error: ‘qInfo’ was not declared in this scope; did you mean ‘setInfo’? 1241 \| qInfo() << "Clicked link is empty"; \| ^~~~~ \| setInfo scripts/kconfig/qconf.cc:1254:3: error: ‘qInfo’ was not declared in this scope; did you mean ‘setInfo’? 1254 \| qInfo() << "Clicked symbol is invalid:" << data; \| ^~~~~ \| setInfo make[1]: * [scripts/Makefile.host:129: scripts/kconfig/qconf.o] Error 1 make: * [Makefile:606: xconfig] Error 2 qInfo() does not exist in Qt4. In my understanding, these call-sites should be unreachable. Perhaps, qWarning(), assertion, or something is better, but qInfo() is not the right one to use here, I think. Fixes: `c4f7398bee` ("kconfig: qconf: make debug links work again") Reported-by: Ronald Warsow <rwarsow@gmx.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2020-08-21 10:22:46 +09:00
Dave Airlie	0790e63f58	Merge tag 'drm-intel-fixes-2020-08-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.9-rc2: - GVT fixes - Fix device parameter usage for selftest mock i915 device - Fix LPSP capability debugfs NULL dereference - Fix buddy register pagemask table - Fix intel_atomic_check() non-negative return value - Fix selftests passing a random 0 into ilog2() - Fix TGL power well enable/disable ordering - Switch to PMU module refcounting Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87a6yp7jp3.fsf@intel.com	2020-08-21 11:03:52 +10:00
Dave Airlie	ba9086a6df	Merge tag 'amd-drm-fixes-5.9-2020-08-20' of git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.9-2020-08-20: amdgpu: - Fixes for Navy Flounder - Misc display fixes - RAS fix amdkfd: - SDMA fix for renoir Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200820041938.3928-1-alexander.deucher@amd.com	2020-08-21 10:17:52 +10:00
Xin Long	f6db909641	tipc: call rcu_read_lock() in tipc_aead_encrypt_done() b->media->send_msg() requires rcu_read_lock(), as we can see elsewhere in tipc, tipc_bearer_xmit, tipc_bearer_xmit_skb and tipc_bearer_bc_xmit(). Syzbot has reported this issue as: net/tipc/bearer.c:466 suspicious rcu_dereference_check() usage! Workqueue: cryptd cryptd_queue_worker Call Trace: tipc_l2_send_msg+0x354/0x420 net/tipc/bearer.c:466 tipc_aead_encrypt_done+0x204/0x3a0 net/tipc/crypto.c:761 cryptd_aead_crypt+0xe8/0x1d0 crypto/cryptd.c:739 cryptd_queue_worker+0x118/0x1b0 crypto/cryptd.c:181 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:291 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 So fix it by calling rcu_read_lock() in tipc_aead_encrypt_done() for b->media->send_msg(). Fixes: `fc1b6d6de2` ("tipc: introduce TIPC encryption & authentication") Reported-by: syzbot+47bbc6b678d317cccbe0@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 16:42:08 -07:00
Alaa Hleihel	eda814b97d	net/sched: act_ct: Fix skb double-free in tcf_ct_handle_fragments() error flow tcf_ct_handle_fragments() shouldn't free the skb when ip_defrag() call fails. Otherwise, we will cause a double-free bug. In such cases, just return the error to the caller. Fixes: `b57dc7c13e` ("net/sched: Introduce action ct") Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 16:39:31 -07:00
David Laight	ab921f3cdb	net: sctp: Fix negotiation of the number of data streams. The number of output and input streams was never being reduced, eg when processing received INIT or INIT_ACK chunks. The effect is that DATA chunks can be sent with invalid stream ids and then discarded by the remote system. Fixes: `2075e50caf` ("sctp: convert to genradix") Signed-off-by: David Laight <david.laight@aculab.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 16:37:37 -07:00
Geert Uytterhoeven	41506bff84	dt-bindings: net: renesas, ether: Improve schema validation - Remove pinctrl consumer properties, as they are handled by core dt-schema, - Document missing properties, - Document missing PHY child node, - Add "additionalProperties: false". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 16:30:56 -07:00
Mark Tomlinson	272502fcb7	gre6: Fix reception with IP6_TNL_F_RCV_DSCP_COPY When receiving an IPv4 packet inside an IPv6 GRE packet, and the IP6_TNL_F_RCV_DSCP_COPY flag is set on the tunnel, the IPv4 header would get corrupted. This is due to the common ip6_tnl_rcv() function assuming that the inner header is always IPv6. This patch checks the tunnel protocol for IPv4 inner packets, but still defaults to IPv6. Fixes: `308edfdf15` ("gre6: Cleanup GREv6 receive path, call common GRE functions") Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 16:28:04 -07:00
David S. Miller	e14fd8da84	Merge branch 'hv_netvsc-Some-fixes-for-the-select_queue' Haiyang Zhang says: ==================== hv_netvsc: Some fixes for the select_queue This patch set includes two fixes for the select_queue process. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 16:24:00 -07:00
Haiyang Zhang	c3d897e01a	hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit() netvsc_vf_xmit() / dev_queue_xmit() will call VF NIC’s ndo_select_queue or netdev_pick_tx() again. They will use skb_get_rx_queue() to get the queue number, so the “skb->queue_mapping - 1” will be used. This may cause the last queue of VF not been used. Use skb_record_rx_queue() here, so that the skb_get_rx_queue() called later will get the correct queue number, and VF will be able to use all queues. Fixes: `b3bf5666a5` ("hv_netvsc: defer queue selection to VF") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 16:24:00 -07:00
Haiyang Zhang	4d820543c5	hv_netvsc: Remove "unlikely" from netvsc_select_queue When using vf_ops->ndo_select_queue, the number of queues of VF is usually bigger than the synthetic NIC. This condition may happen often. Remove "unlikely" from the comparison of ndev->real_num_tx_queues. Fixes: `b3bf5666a5` ("hv_netvsc: defer queue selection to VF") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 16:24:00 -07:00
Yauheni Kaliuta	c210773d6c	bpf: selftests: global_funcs: Check err_str before strstr The error path in libbpf.c:load_program() has calls to pr_warn() which ends up for global_funcs tests to test_global_funcs.c:libbpf_debug_print(). For the tests with no struct test_def::err_str initialized with a string, it causes call of strstr() with NULL as the second argument and it segfaults. Fix it by calling strstr() only for non-NULL err_str. Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200820115843.39454-1-yauheni.kaliuta@redhat.com	2020-08-20 14:31:14 -07:00
Andrii Nakryiko	c8a36f1945	bpf: xdp: Fix XDP mode when no mode flags specified `7f0a838254` ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device") inadvertently changed which XDP mode is assumed when no mode flags are specified explicitly. Previously, driver mode was preferred, if driver supported it. If not, generic SKB mode was chosen. That commit changed default to SKB mode always. This patch fixes the issue and restores the original logic. Fixes: `7f0a838254` ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device") Reported-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/bpf/20200820052841.1559757-1-andriin@fb.com	2020-08-20 14:27:12 -07:00
Veronika Kabatova	5597432dde	selftests/bpf: Remove test_align leftovers Calling generic selftests "make install" fails as rsync expects all files from TEST_GEN_PROGS to be present. The binary is not generated anymore (commit `3b09d27cc9`) so we can safely remove it from there and also from gitignore. Fixes: `3b09d27cc9` ("selftests/bpf: Move test_align under test_progs") Signed-off-by: Veronika Kabatova <vkabatov@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/20200819160710.1345956-1-vkabatov@redhat.com	2020-08-20 14:25:25 -07:00
Jiri Olsa	51f6463aac	tools/resolve_btfids: Fix sections with wrong alignment The data of compressed section should be aligned to 4 (for 32bit) or 8 (for 64 bit) bytes. The binutils ld sets sh_addralign to 1, which makes libelf fail with misaligned section error during the update as reported by Jesper: FAILED elf_update(WRITE): invalid section alignment While waiting for ld fix, we can fix compressed sections sh_addralign value manually. Adding warning in -vv mode when the fix is triggered: $ ./tools/bpf/resolve_btfids/resolve_btfids -vv vmlinux ... section(36) .comment, size 44, link 0, flags 30, type=1 section(37) .debug_aranges, size 45684, link 0, flags 800, type=1 - fixing wrong alignment sh_addralign 16, expected 8 section(38) .debug_info, size 129104957, link 0, flags 800, type=1 - fixing wrong alignment sh_addralign 1, expected 8 section(39) .debug_abbrev, size 1152583, link 0, flags 800, type=1 - fixing wrong alignment sh_addralign 1, expected 8 section(40) .debug_line, size 7374522, link 0, flags 800, type=1 - fixing wrong alignment sh_addralign 1, expected 8 section(41) .debug_frame, size 702463, link 0, flags 800, type=1 section(42) .debug_str, size 1017571, link 0, flags 830, type=1 - fixing wrong alignment sh_addralign 1, expected 8 section(43) .debug_loc, size 3019453, link 0, flags 800, type=1 - fixing wrong alignment sh_addralign 1, expected 8 section(44) .debug_ranges, size 1744583, link 0, flags 800, type=1 - fixing wrong alignment sh_addralign 16, expected 8 section(45) .symtab, size 2955888, link 46, flags 0, type=2 section(46) .strtab, size 2613072, link 0, flags 0, type=3 ... update ok for vmlinux Another workaround is to disable compressed debug info data CONFIG_DEBUG_INFO_COMPRESSED kernel option. Fixes: `fbbb68de80` ("bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Cc: Mark Wielaard <mjw@redhat.com> Cc: Nick Clifton <nickc@redhat.com> Link: https://lore.kernel.org/bpf/20200819092342.259004-1-jolsa@kernel.org	2020-08-20 14:17:20 -07:00
Linus Torvalds	da2968ff87	pci-v5.9-fixes-1 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl8+4XcUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vwpNA/6Aqb4Q4g6zZhjdRS8GOewMqn37VOd SATT6nmmVoCndx1kG9W/Ggl+p4Ia3fn3qbt9qNZvuivOw9AAgMX6RkgdfN3TnnB/ Bzj1D06sy8/9dg011nvlR+EA/rReiY7V8VJoDcmn36lkaNPJxQwAyydcPs4pRKso 6q76XMRi3249jHzKxPRaBiHlSKZhmmYJUQ9eL2oHOGD3USAQhfZqkE1iCc3oGS1I Ow5+Ofzitoznr6XgTa/Iryzxh6xOKpGzcpse5E6IK5VMr1I//1iDwxqGyU+07rUd /9PwTaX1Achhj37dFh7Bd7ycxUqBiyQ4+3iQXxVltsXPQP2F14tmot+XghgD6kTl fkbZokzUvE0Cy1/OxwPFShxT0NdmyUxDUp9wFEjEiOHJ1zqk7hWz0bkJmJkBfxnZ wJjUU1rQtaawhjJ/PPpNznY0/BTD3YmNWkOHBEaK5ReP0CM8Y9iChfpA0uTwhvVB sAbYpu5yuJou3HV3+T2HiygMc+0CKTHnr/aouGnyqGn7+KYrwBFLS9VreD6/mVbj MChyaCNGr0+Wi8kir0/EyeaskKsFWta0ZvYIcMrFKKhpb4bGdPZ5rTllhLNcm9Uv 67Hp+BdILCOCnsh1zxF7VSwK9pH+xVbUwpDvdmhGl0eRhYRQJN6Fe69UyJtyIP7N e2PQdvszMa2RFmU= =F1zG -----END PGP SIGNATURE----- Merge tag 'pci-v5.9-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Fix P2PDMA build issue (Christoph Hellwig)" * tag 'pci-v5.9-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/P2PDMA: Fix build without DMA ops	2020-08-20 14:17:03 -07:00
Peilin Ye	ce51f63e63	net/smc: Prevent kernel-infoleak in __smc_diag_dump() __smc_diag_dump() is potentially copying uninitialized kernel stack memory into socket buffers, since the compiler may leave a 4-byte hole near the beginning of `struct smcd_diag_dmbinfo`. Fix it by initializing `dinfo` with memset(). Fixes: `4b1b7d3b30` ("net/smc: add SMC-D diag support") Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 12:07:31 -07:00
Edward Cree	3e659a82c4	sfc: fix build warnings on 32-bit Truncation of DMA_BIT_MASK to 32-bit dma_addr_t is semantically safe, but the compiler was warning because it was happening implicitly. Insert explicit casts to suppress the warnings. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 12:00:16 -07:00
Kaige Li	fb73ed5ef7	net: phy: mscc: Fix a couple of spelling mistakes "spcified" -> "specified" There are a couple of spelling mistakes in comment text. Fix these. Signed-off-by: Kaige Li <likaige@loongson.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-08-20 11:10:12 -07:00
Bin Meng	fc26f5bbf1	riscv: Add SiFive drivers to rv32_defconfig This adds SiFive drivers to rv32_defconfig, to keep in sync with the 64-bit config. This is useful when testing 32-bit kernel with QEMU 'sifive_u' 32-bit machine. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-08-20 11:00:21 -07:00
Anup Patel	a2770b57d0	dt-bindings: timer: Add CLINT bindings We add DT bindings documentation for CLINT device. Signed-off-by: Anup Patel <anup.patel@wdc.com> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Tested-by: Emil Renner Berhing <kernel@esmil.dk> Reviewed-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-08-20 10:58:28 -07:00
Anup Patel	2bc3fc877a	RISC-V: Remove CLINT related code from timer and arch Right now the RISC-V timer driver is convoluted to support: 1. Linux RISC-V S-mode (with MMU) where it will use TIME CSR for clocksource and SBI timer calls for clockevent device. 2. Linux RISC-V M-mode (without MMU) where it will use CLINT MMIO counter register for clocksource and CLINT MMIO compare register for clockevent device. We now have a separate CLINT timer driver which also provide CLINT based IPI operations so let's remove CLINT MMIO related code from arch/riscv directory and RISC-V timer driver. Signed-off-by: Anup Patel <anup.patel@wdc.com> Tested-by: Emil Renner Berhing <kernel@esmil.dk> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-08-20 10:58:13 -07:00
Anup Patel	2ac6795fcc	clocksource/drivers: Add CLINT timer driver We add a separate CLINT timer driver for Linux RISC-V M-mode (i.e. RISC-V NoMMU kernel). The CLINT MMIO device provides three things: 1. 64bit free running counter register 2. 64bit per-CPU time compare registers 3. 32bit per-CPU inter-processor interrupt registers Unlike other timer devices, CLINT provides IPI registers along with timer registers. To use CLINT IPI registers, the CLINT timer driver provides IPI related callbacks to arch/riscv. Signed-off-by: Anup Patel <anup.patel@wdc.com> Tested-by: Emil Renner Berhing <kernel@esmil.dk> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-08-20 10:57:29 -07:00
Anup Patel	cc7f3f72dc	RISC-V: Add mechanism to provide custom IPI operations We add mechanism to set custom IPI operations so that CLINT driver from drivers directory can provide custom IPI operations. Signed-off-by: Anup Patel <anup.patel@wdc.com> Tested-by: Emil Renner Berhing <kernel@esmil.dk> Reviewed-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-08-20 10:55:40 -07:00
Linus Torvalds	d271b51c60	dma-mapping fixes for 5.9 - fix out more fallout from the dma-pool changes (Nicolas Saenz Julienne, me) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl8+pzoLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOtEw/+MPgKyqy/PTxVVNXY8X0dyy79IMQ95I46/jwbbVUg BJUMhJslzSpYH9FS96K8LsPY1ZzuU5Yr24bRxLXhJYLr3tfoa8tW8YAHfbBbbYkx Ycfo8Tf1F55ZKHwoQvyV47acRhfJW+FRlSfpYCBqsNPyz7YwVTAPPt7PTeeyqMsV nZnzSDlZCoJkDjdEtbv57apo8KSlpQ1wf+QNRCbLjveUcKFqKB9iJiCFpXmI9jCH fT5BHcWv6ZzwSHorsFayy9AooSXrvahTnMAOsL90UYAm0R81x/xsE4/+LP2oigRD HuTjy4yHPeLUZcGukwTRkh30SQ009N7b6fhAyDFKUt4/6gKfXH2mKuEQmxz/KT1P cmw0sCpaA+OjpedOm05hbIIIQJewQzFYj0KxuPPXZX9LS826YHntPOvZRltN8fWB 0Gd5SnkCyHseGmFmz8Kx3inYfpynM7EOSJ9CzbfpWjchLEjpzS0EkCunTP0gV8Zw Qq8RegbwTpNMroh9n05UYQH3j1XRNO7dYxtkCwSwByOr3TdsQ76fHaqIAF/YMUH+ Wd6XmtHC3wMtjDMyWTGoBhZtmuUTdMCATDA3avc+cUl2QkQf0kPhXBOuiS8tN/Yl P9jlJDetDJqwz2brUFa+rHMXSjwp2QtK/zZTmviIq+nPPkE5sNQQ9/l7oGLPJPn3 qYs= =RQ4K -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.9-1' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: "Fix more fallout from the dma-pool changes (Nicolas Saenz Julienne, me)" * tag 'dma-mapping-5.9-1' of git://git.infradead.org/users/hch/dma-mapping: dma-pool: Only allocate from CMA when in same memory zone dma-pool: fix coherent pool allocations for IOMMU mappings	2020-08-20 10:48:17 -07:00
David Howells	ba8e42077b	afs: Fix key ref leak in afs_put_operation() The afs_put_operation() function needs to put the reference to the key that's authenticating the operation. Fixes: `e49c7b2f6d` ("afs: Build an abstraction around an "operation" concept") Reported-by: Dave Botsch <botsch@cnf.cornell.edu> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-08-20 10:41:45 -07:00
Rafael J. Wysocki	cc15fd9892	Merge branch 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull operating performance points (OPP) framework fixes for 5.9-rc2 from Viresh Kumar: "This contains the following fixes for 5.9: - Fix re-enabling of resources (Rajendra Nayak). - Put OPP table references (Stephen Boyd)." * 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: opp: Enable resources again if they were disabled earlier opp: Put opp table in dev_pm_opp_set_rate() if _set_opp_bw() fails opp: Put opp table in dev_pm_opp_set_rate() for empty tables	2020-08-20 18:13:45 +02:00
Toke Høiland-Jørgensen	1e891e513e	libbpf: Fix map index used in error message The error message emitted by bpf_object__init_user_btf_maps() was using the wrong section ID. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200819110534.9058-1-toke@redhat.com	2020-08-20 16:21:27 +02:00
Vasant Hegde	90a9b102ed	powerpc/pseries: Do not initiate shutdown when system is running on UPS As per PAPR we have to look for both EPOW sensor value and event modifier to identify the type of event and take appropriate action. In LoPAPR v1.1 section 10.2.2 includes table 136 "EPOW Action Codes": SYSTEM_SHUTDOWN 3 The system must be shut down. An EPOW-aware OS logs the EPOW error log information, then schedules the system to be shut down to begin after an OS defined delay internal (default is 10 minutes.) Then in section 10.3.2.2.8 there is table 146 "Platform Event Log Format, Version 6, EPOW Section", which includes the "EPOW Event Modifier": For EPOW sensor value = 3 0x01 = Normal system shutdown with no additional delay 0x02 = Loss of utility power, system is running on UPS/Battery 0x03 = Loss of system critical functions, system should be shutdown 0x04 = Ambient temperature too high All other values = reserved We have a user space tool (rtas_errd) on LPAR to monitor for EPOW_SHUTDOWN_ON_UPS. Once it gets an event it initiates shutdown after predefined time. It also starts monitoring for any new EPOW events. If it receives "Power restored" event before predefined time it will cancel the shutdown. Otherwise after predefined time it will shutdown the system. Commit `79872e3546` ("powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown") changed our handling of the "on UPS/Battery" case, to immediately shutdown the system. This breaks existing setups that rely on the userspace tool to delay shutdown and let the system run on the UPS. Fixes: `79872e3546` ("powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [mpe: Massage change log and add PAPR references] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200820061844.306460-1-hegdevasant@linux.vnet.ibm.com	2020-08-20 22:55:54 +10:00
Pavel Begunkov	867a23eab5	io_uring: kill extra iovec=NULL in import_iovec() If io_import_iovec() returns an error, return iovec is undefined and must not be used, so don't set it to NULL when failing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-08-20 05:36:19 -06:00
Pavel Begunkov	f261c16861	io_uring: comment on kfree(iovec) checks kfree() handles NULL pointers well, but io_{read,write}() checks it because of performance reasons. Leave a comment there for those who are tempted to patch it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-08-20 05:36:17 -06:00
Pavel Begunkov	bb175342aa	io_uring: fix racy req->flags modification Setting and clearing REQ_F_OVERFLOW in io_uring_cancel_files() and io_cqring_overflow_flush() are racy, because they might be called asynchronously. REQ_F_OVERFLOW flag in only needed for files cancellation, so if it can be guaranteed that requests _currently_ marked inflight can't be overflown, the problem will be solved with removing the flag altogether. That's how the patch works, it removes inflight status of a request in io_cqring_fill_event() whenever it should be thrown into CQ-overflow list. That's Ok to do, because no opcode specific handling can be done after io_cqring_fill_event(), the same assumption as with "struct io_completion" patches. And it already have a good place for such cleanups, which is io_clean_op(). A nice side effect of this is removing this inflight check from the hot path. note on synchronisation: now __io_cqring_fill_event() may be taking two spinlocks simultaneously, completion_lock and inflight_lock. It's fine, because we never do that in reverse order, and CQ-overflow of inflight requests shouldn't happen often. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-08-20 05:36:15 -06:00
Weihang Li	6da06c6291	Revert "RDMA/hns: Reserve one sge in order to avoid local length error" This patch caused some issues on SEND operation, and it should be reverted to make the drivers work correctly. There will be a better solution that has been tested carefully to solve the original problem. This reverts commit `711195e57d`. Fixes: `711195e57d` ("RDMA/hns: Reserve one sge in order to avoid local length error") Link: https://lore.kernel.org/r/1597829984-20223-1-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-20 08:35:19 -03:00
Kaike Wan	b25e8e85e7	RDMA/hfi1: Correct an interlock issue for TID RDMA WRITE request The following message occurs when running an AI application with TID RDMA enabled: hfi1 0000:7f:00.0: hfi1_0: [QP74] hfi1_tid_timeout 4084 hfi1 0000:7f:00.0: hfi1_0: [QP70] hfi1_tid_timeout 4084 The issue happens when TID RDMA WRITE request is followed by an IB_WR_RDMA_WRITE_WITH_IMM request, the latter could be completed first on the responder side. As a result, no ACK packet for the latter could be sent because the TID RDMA WRITE request is still being processed on the responder side. When the TID RDMA WRITE request is eventually completed, the requester will wait for the IB_WR_RDMA_WRITE_WITH_IMM request to be acknowledged. If the next request is another TID RDMA WRITE request, no TID RDMA WRITE DATA packet could be sent because the preceding IB_WR_RDMA_WRITE_WITH_IMM request is not completed yet. Consequently the IB_WR_RDMA_WRITE_WITH_IMM will be retried but it will be ignored on the responder side because the responder thinks it has already been completed. Eventually the retry will be exhausted and the qp will be put into error state on the requester side. On the responder side, the TID resource timer will eventually expire because no TID RDMA WRITE DATA packets will be received for the second TID RDMA WRITE request. There is also risk of a write-after-write memory corruption due to the issue. Fix by adding a requester side interlock to prevent any potential data corruption and TID RDMA protocol error. Fixes: `a0b34f75ec` ("IB/hfi1: Add interlock between a TID RDMA request and other requests") Link: https://lore.kernel.org/r/20200811174931.191210.84093.stgit@awfm-01.aw.intel.com Cc: <stable@vger.kernel.org> # 5.4.x+ Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-20 08:31:41 -03:00

1 2 3 4 5 ...

949318 commits