linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-10-01 22:54:01 +00:00

Author	SHA1	Message	Date
Evan Quan	a71849cdea	drm/amd/pm: fix the deadlock issue observed on SI The adev->pm.mutx is already held at the beginning of amdgpu_dpm_compute_clocks/amdgpu_dpm_enable_uvd/amdgpu_dpm_enable_vce. But on their calling path, amdgpu_display_bandwidth_update will be called and thus its sub functions amdgpu_dpm_get_sclk/mclk. They will then try to acquire the same adev->pm.mutex and deadlock will occur. By placing amdgpu_display_bandwidth_update outside of adev->pm.mutex protection(considering logically they do not need such protection) and restructuring the call flow accordingly, we can eliminate the deadlock issue. This comes with no real logics change. Fixes: `3712e7a494` ("drm/amd/pm: unified lock protections in amdgpu_dpm.c") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Reported-by: Arthur Marsh <arthur.marsh@internode.on.net> Link: https://lore.kernel.org/all/9e689fea-6c69-f4b0-8dee-32c4cf7d8f9c@molgen.mpg.de/ BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1957 Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-27 17:38:02 -04:00
Miaoqian Lin	65e5498750	drm/amd/display: Fix memory leak in dcn21_clock_source_create When dcn20_clk_src_construct() fails, we need to release clk_src. Fixes: `6f4e6361c3` ("drm/amd/display: Add Renoir resource (v2)") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-27 17:19:31 -04:00
Alex Deucher	f95af4a923	drm/amdgpu: don't runtime suspend if there are displays attached (v3) We normally runtime suspend when there are displays attached if they are in the DPMS off state, however, if something wakes the GPU we send a hotplug event on resume (in case any displays were connected while the GPU was in suspend) which can cause userspace to light up the displays again soon after they were turned off. Prior to commit `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."), the driver took a runtime pm reference when the fbdev emulation was enabled because we didn't implement proper shadowing support for vram access when the device was off so the device never runtime suspended when there was a console bound. Once that commit landed, we now utilize the core fb helper implementation which properly handles the emulation, so runtime pm now suspends in cases where it did not before. Ultimately, we need to sort out why runtime suspend in not working in this case for some users, but this should restore similar behavior to before. v2: move check into runtime_suspend v3: wake ups -> wakeups in comment, retain pm_runtime behavior in runtime_idle callback Fixes: `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/ Tested-by: Michele Ballabio <ballabio.m@gmail.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-04-27 17:18:53 -04:00
David Yat Sin	f567656f8a	drm/amdkfd: CRIU add support for GWS queues Add support to checkpoint/restore GWS (Global Wave Sync) queues. Signed-off-by: David Yat Sin <david.yatsin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-27 17:16:20 -04:00
David Yat Sin	7c6b6e18c8	drm/amdkfd: Fix GWS queue count dqm->gws_queue_count and pdd->qpd.mapped_gws_queue need to be updated each time the queue gets evicted. Fixes: `b8020b0304` ("drm/amdkfd: Enable over-subscription with >1 GWS queue") Signed-off-by: David Yat Sin <david.yatsin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-27 17:15:15 -04:00
Alexandru Elisei	8f6379e207	KVM/arm64: Don't emulate a PMU for 32-bit guests if feature not set kvm->arch.arm_pmu is set when userspace attempts to set the first PMU attribute. As certain attributes are mandatory, arm_pmu ends up always being set to a valid arm_pmu, otherwise KVM will refuse to run the VCPU. However, this only happens if the VCPU has the PMU feature. If the VCPU doesn't have the feature bit set, kvm->arch.arm_pmu will be left uninitialized and equal to NULL. KVM doesn't do ID register emulation for 32-bit guests and accesses to the PMU registers aren't gated by the pmu_visibility() function. This is done to prevent injecting unexpected undefined exceptions in guests which have detected the presence of a hardware PMU. But even though the VCPU feature is missing, KVM still attempts to emulate certain aspects of the PMU when PMU registers are accessed. This leads to a NULL pointer dereference like this one, which happens on an odroid-c4 board when running the kvm-unit-tests pmu-cycle-counter test with kvmtool and without the PMU feature being set: [ 454.402699] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000150 [ 454.405865] Mem abort info: [ 454.408596] ESR = 0x96000004 [ 454.411638] EC = 0x25: DABT (current EL), IL = 32 bits [ 454.416901] SET = 0, FnV = 0 [ 454.419909] EA = 0, S1PTW = 0 [ 454.423010] FSC = 0x04: level 0 translation fault [ 454.427841] Data abort info: [ 454.430687] ISV = 0, ISS = 0x00000004 [ 454.434484] CM = 0, WnR = 0 [ 454.437404] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000c924000 [ 454.443800] [0000000000000150] pgd=0000000000000000, p4d=0000000000000000 [ 454.450528] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 454.456036] Modules linked in: [ 454.459053] CPU: 1 PID: 267 Comm: kvm-vcpu-0 Not tainted 5.18.0-rc4 #113 [ 454.465697] Hardware name: Hardkernel ODROID-C4 (DT) [ 454.470612] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 454.477512] pc : kvm_pmu_event_mask.isra.0+0x14/0x74 [ 454.482427] lr : kvm_pmu_set_counter_event_type+0x2c/0x80 [ 454.487775] sp : ffff80000a9839c0 [ 454.491050] x29: ffff80000a9839c0 x28: ffff000000a83a00 x27: 0000000000000000 [ 454.498127] x26: 0000000000000000 x25: 0000000000000000 x24: ffff00000a510000 [ 454.505198] x23: ffff000000a83a00 x22: ffff000003b01000 x21: 0000000000000000 [ 454.512271] x20: 000000000000001f x19: 00000000000003ff x18: 0000000000000000 [ 454.519343] x17: 000000008003fe98 x16: 0000000000000000 x15: 0000000000000000 [ 454.526416] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 [ 454.533489] x11: 000000008003fdbc x10: 0000000000009d20 x9 : 000000000000001b [ 454.540561] x8 : 0000000000000000 x7 : 0000000000000d00 x6 : 0000000000009d00 [ 454.547633] x5 : 0000000000000037 x4 : 0000000000009d00 x3 : 0d09000000000000 [ 454.554705] x2 : 000000000000001f x1 : 0000000000000000 x0 : 0000000000000000 [ 454.561779] Call trace: [ 454.564191] kvm_pmu_event_mask.isra.0+0x14/0x74 [ 454.568764] kvm_pmu_set_counter_event_type+0x2c/0x80 [ 454.573766] access_pmu_evtyper+0x128/0x170 [ 454.577905] perform_access+0x34/0x80 [ 454.581527] kvm_handle_cp_32+0x13c/0x160 [ 454.585495] kvm_handle_cp15_32+0x1c/0x30 [ 454.589462] handle_exit+0x70/0x180 [ 454.592912] kvm_arch_vcpu_ioctl_run+0x1c4/0x5e0 [ 454.597485] kvm_vcpu_ioctl+0x23c/0x940 [ 454.601280] __arm64_sys_ioctl+0xa8/0xf0 [ 454.605160] invoke_syscall+0x48/0x114 [ 454.608869] el0_svc_common.constprop.0+0xd4/0xfc [ 454.613527] do_el0_svc+0x28/0x90 [ 454.616803] el0_svc+0x34/0xb0 [ 454.619822] el0t_64_sync_handler+0xa4/0x130 [ 454.624049] el0t_64_sync+0x18c/0x190 [ 454.627675] Code: a9be7bfd 910003fd f9000bf3 52807ff3 (b9415001) [ 454.633714] ---[ end trace 0000000000000000 ]--- In this particular case, Linux hasn't detected the presence of a hardware PMU because the PMU node is missing from the DTB, so userspace would have been unable to set the VCPU PMU feature even if it attempted it. What happens is that the 32-bit guest reads ID_DFR0, which advertises the presence of the PMU, and when it tries to program a counter, it triggers the NULL pointer dereference because kvm->arch.arm_pmu is NULL. kvm-arch.arm_pmu was introduced by commit `46b1878214` ("KVM: arm64: Keep a per-VM pointer to the default PMU"). Until that commit, this error would be triggered instead: [ 73.388140] ------------[ cut here ]------------ [ 73.388189] Unknown PMU version 0 [ 73.390420] WARNING: CPU: 1 PID: 264 at arch/arm64/kvm/pmu-emul.c:36 kvm_pmu_event_mask.isra.0+0x6c/0x74 [ 73.399821] Modules linked in: [ 73.402835] CPU: 1 PID: 264 Comm: kvm-vcpu-0 Not tainted 5.17.0 #114 [ 73.409132] Hardware name: Hardkernel ODROID-C4 (DT) [ 73.414048] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 73.420948] pc : kvm_pmu_event_mask.isra.0+0x6c/0x74 [ 73.425863] lr : kvm_pmu_event_mask.isra.0+0x6c/0x74 [ 73.430779] sp : ffff80000a8db9b0 [ 73.434055] x29: ffff80000a8db9b0 x28: ffff000000dbaac0 x27: 0000000000000000 [ 73.441131] x26: ffff000000dbaac0 x25: 00000000c600000d x24: 0000000000180720 [ 73.448203] x23: ffff800009ffbe10 x22: ffff00000b612000 x21: 0000000000000000 [ 73.455276] x20: 000000000000001f x19: 0000000000000000 x18: ffffffffffffffff [ 73.462348] x17: 000000008003fe98 x16: 0000000000000000 x15: 0720072007200720 [ 73.469420] x14: 0720072007200720 x13: ffff800009d32488 x12: 00000000000004e6 [ 73.476493] x11: 00000000000001a2 x10: ffff800009d32488 x9 : ffff800009d32488 [ 73.483565] x8 : 00000000ffffefff x7 : ffff800009d8a488 x6 : ffff800009d8a488 [ 73.490638] x5 : ffff0000f461a9d8 x4 : 0000000000000000 x3 : 0000000000000001 [ 73.497710] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000000dbaac0 [ 73.504784] Call trace: [ 73.507195] kvm_pmu_event_mask.isra.0+0x6c/0x74 [ 73.511768] kvm_pmu_set_counter_event_type+0x2c/0x80 [ 73.516770] access_pmu_evtyper+0x128/0x16c [ 73.520910] perform_access+0x34/0x80 [ 73.524532] kvm_handle_cp_32+0x13c/0x160 [ 73.528500] kvm_handle_cp15_32+0x1c/0x30 [ 73.532467] handle_exit+0x70/0x180 [ 73.535917] kvm_arch_vcpu_ioctl_run+0x20c/0x6e0 [ 73.540489] kvm_vcpu_ioctl+0x2b8/0x9e0 [ 73.544283] __arm64_sys_ioctl+0xa8/0xf0 [ 73.548165] invoke_syscall+0x48/0x114 [ 73.551874] el0_svc_common.constprop.0+0xd4/0xfc [ 73.556531] do_el0_svc+0x28/0x90 [ 73.559808] el0_svc+0x28/0x80 [ 73.562826] el0t_64_sync_handler+0xa4/0x130 [ 73.567054] el0t_64_sync+0x1a0/0x1a4 [ 73.570676] ---[ end trace 0000000000000000 ]--- [ 73.575382] kvm: pmu event creation failed -2 The root cause remains the same: kvm->arch.pmuver was never set to something sensible because the VCPU feature itself was never set. The odroid-c4 is somewhat of a special case, because Linux doesn't probe the PMU. But the above errors can easily be reproduced on any hardware, with or without a PMU driver, as long as userspace doesn't set the PMU feature. Work around the fact that KVM advertises a PMU even when the VCPU feature is not set by gating all PMU emulation on the feature. The guest can still access the registers without KVM injecting an undefined exception. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220425145530.723858-1-alexandru.elisei@arm.com	2022-04-27 22:10:29 +01:00
Will Deacon	2a50fc5fd0	KVM: arm64: Handle host stage-2 faults from 32-bit EL0 When pKVM is enabled, host memory accesses are translated by an identity mapping at stage-2, which is populated lazily in response to synchronous exceptions from 64-bit EL1 and EL0. Extend this handling to cover exceptions originating from 32-bit EL0 as well. Although these are very unlikely to occur in practice, as the kernel typically ensures that user pages are initialised before mapping them in, drivers could still map previously untouched device pages into userspace and expect things to work rather than panic the system. Cc: Quentin Perret <qperret@google.com> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220427171332.13635-1-will@kernel.org	2022-04-27 21:54:30 +01:00
Linus Torvalds	8f4dd16603	Merge branch 'akpm' (patches from Andrew) Merge fixes from Andrew Morton: "Two patches. Subsystems affected by this patch series: mm/kasan and mm/debug" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: docs: vm/page_owner: use literal blocks for param description kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time	2022-04-27 13:44:37 -07:00
Akira Yokosawa	5603f9bdea	docs: vm/page_owner: use literal blocks for param description Sphinx generates hard-to-read lists of parameters at the bottom of the page. Fix them by putting literal-block markers of "::" in front of them. Link: https://lkml.kernel.org/r/cfd3bcc0-b51d-0c68-c065-ca1c4c202447@gmail.com Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Fixes: `57f2b54a93` ("Documentation/vm/page_owner.rst: update the documentation") Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn> Cc: Haowen Bai <baihaowen@meizu.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Alex Shi <seakeel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-27 13:28:48 -07:00
Zqiang	31fa985b41	kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time kasan_quarantine_remove_cache() is called in kmem_cache_shrink()/ destroy(). The kasan_quarantine_remove_cache() call is protected by cpuslock in kmem_cache_destroy() to ensure serialization with kasan_cpu_offline(). However the kasan_quarantine_remove_cache() call is not protected by cpuslock in kmem_cache_shrink(). When a CPU is going offline and cache shrink occurs at same time, the cpu_quarantine may be corrupted by interrupt (per_cpu_remove_cache operation). So add a cpu_quarantine offline flags check in per_cpu_remove_cache(). [akpm@linux-foundation.org: add comment, per Zqiang] Link: https://lkml.kernel.org/r/20220414025925.2423818-1-qiang1.zhang@intel.com Signed-off-by: Zqiang <qiang1.zhang@intel.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-27 13:28:48 -07:00
Tejun Heo	4cddeacad6	Revert "block: inherit request start time from bio for BLK_CGROUP" This reverts commit `0006707723`. It has a couple problems: * bio_issue_time() is stored in bio->bi_issue truncated to 51 bits. This overflows in slightly over 26 days. Setting rq->io_start_time_ns with it means that io duration calculation would yield >26days after 26 days of uptime. This, for example, confuses kyber making it cause high IO latencies. * rq->io_start_time_ns should record the time that the IO is issued to the device so that on-device latency can be measured. However, bio_issue_time() is set before the bio goes through the rq-qos controllers (wbt, iolatency, iocost), so when the bio gets throttled in any of the mechanisms, the measured latencies make no sense - on-device latencies end up higher than request-alloc-to-completion latencies. We'll need a smarter way to avoid calling ktime_get_ns() repeatedly back-to-back. For now, let's revert the commit. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # v5.16+ Link: https://lore.kernel.org/r/YmmeOLfo5lzc+8yI@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-04-27 13:51:28 -06:00
Artem Bityutskiy	7eac3bd38d	intel_idle: Fix SPR C6 optimization The Sapphire Rapids (SPR) C6 optimization was added to the end of the 'spr_idle_state_table_update()' function. However, the function has a 'return' which may happen before the optimization has a chance to run. And this may prevent the optimization from happening. This is an unlikely scenario, but possible if user boots with, say, the 'intel_idle.preferred_cstates=6' kernel boot option. This patch fixes the issue by eliminating the problematic 'return' statement. Fixes: `3a9cf77b60` ("intel_idle: add core C6 optimization for SPR") Suggested-by: Jan Beulich <jbeulich@suse.com> Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [ rjw: Minor changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-04-27 20:36:47 +02:00
Artem Bityutskiy	39c184a6a9	intel_idle: Fix the 'preferred_cstates' module parameter Problem description. When user boots kernel up with the 'intel_idle.preferred_cstates=4' option, we enable C1E and disable C1 states on Sapphire Rapids Xeon (SPR). In order for C1E to work on SPR, we have to enable the C1E promotion bit on all CPUs. However, we enable it only on one CPU. Fix description. The 'intel_idle' driver already has the infrastructure for disabling C1E promotion on every CPU. This patch uses the same infrastructure for enabling C1E promotion on every CPU. It changes the boolean 'disable_promotion_to_c1e' variable to a tri-state 'c1e_promotion' variable. Tested on a 2-socket SPR system. I verified the following combinations: * C1E promotion enabled and disabled in BIOS. * Booted with and without the 'intel_idle.preferred_cstates=4' kernel argument. In all 4 cases C1E promotion was correctly set on all CPUs. Also tested on an old Broadwell system, just to make sure it does not cause a regression. C1E promotion was correctly disabled on that system, both C1 and C1E were exposed (as expected). Fixes: `da0e58c038` ("intel_idle: add 'preferred_cstates' module argument") Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [ rjw: Minor changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-04-27 20:36:47 +02:00
Rafael J. Wysocki	0f03610b20	cpufreq arm fixes for 5.18-rc5 - Fix issues with the Qualcomm's cpufreq driver (Dmitry Baryshkov and Vladimir Zapolskiy). - Fix memory leak with the Sun501 driver (Xiaobing Luo). -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEx73Crsp7f6M6scA70rkcPK6BEhwFAmJnlhUACgkQ0rkcPK6B Ehxa7RAAi+RNRFIluLn1oZzsJPa0DlBk5C+mxlyXEnJuLypsuBGHxXYnvdduy+8O Ji3UxjUj1nnNmj4cYwVHLrkbIL3rpP7reCrsoTYcX9iMlwSYY6cOJjJSyrg53XtB tbzhf/23+zPUzfZdMSM9m4myTY052/iqj2paUWeoPAHCB4a4ADAq60NSFN6ZQGZg VuEQHWjY7j+Mgqc61Q/ZUf1YuM4njR5wtHu6qw7orimnSMqsEOpeSZLvQjJj9JWr fYc2LR7Lg2ilWzCjdFJbwsR93XDhm62bT0p9tQdkaGRZQmFLEwqngpC03dgsqtLJ Pbr/hQgvAnM+dyDa8xIAkSLTlTIb16q8+V6lbMZFoMmKJvL0+wgXnWywdh07jq9z KEbrm3ECnqJWJ0iIZEvkeIupn0FIRkVgwCcL5wweF3Q46uyGuJCqLko83YZgFfjK NUlu92vKLKVN3F6ZKm5T4koDwlvPzy+PdkTPOXFxHsQi7o7BiEcoHe4axQ9biPmD HFlDQORfhRcT/blJoPmldIcL26R86CcACo8d5mEK2pQFUxcKVAETxxdE6zJDFd59 9BXhQAMAnVDJ0Lu++o7OsTteeL1Gp9pDTvhVFQfnjapNtcr6SeLI+SqNSes6oiD2 AHjXPv17gnGB1duAqgqbqyGMMWYQ86X0QeFWSdua/4gPlo0TYyw= =HauS -----END PGP SIGNATURE----- Merge tag 'cpufreq-arm-fixes-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq fixes for 5.18-rc5 from Viresh Kumar: "- Fix issues with the Qualcomm's cpufreq driver (Dmitry Baryshkov and Vladimir Zapolskiy). - Fix memory leak with the Sun501 driver (Xiaobing Luo)." * tag 'cpufreq-arm-fixes-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: qcom-cpufreq-hw: Clear dcvs interrupts cpufreq: fix memory leak in sun50i_cpufreq_nvmem_probe cpufreq: qcom-cpufreq-hw: Fix throttle frequency value on EPSS platforms cpufreq: qcom-hw: provide online/offline operations cpufreq: qcom-hw: fix the opp entries refcounting cpufreq: qcom-hw: fix the race between LMH worker and cpuhp cpufreq: qcom-hw: drop affinity hint before freeing the IRQ	2022-04-27 20:20:03 +02:00
Mikulas Patocka	e4d8a29997	hex2bin: fix access beyond string end If we pass too short string to "hex2bin" (and the string size without the terminating NUL character is even), "hex2bin" reads one byte after the terminating NUL character. This patch fixes it. Note that hex_to_bin returns -1 on error and hex2bin return -EINVAL on error - so we can't just return the variable "hi" or "lo" on error. This inconsistency may be fixed in the next merge window, but for the purpose of fixing this bug, we just preserve the existing behavior and return -1 and -EINVAL. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Fixes: `b78049831f` ("lib: add error checking to hex2bin") Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-27 10:57:33 -07:00
Mikulas Patocka	e5be15767e	hex2bin: make the function hex_to_bin constant-time The function hex2bin is used to load cryptographic keys into device mapper targets dm-crypt and dm-integrity. It should take constant time independent on the processed data, so that concurrently running unprivileged code can't infer any information about the keys via microarchitectural convert channels. This patch changes the function hex_to_bin so that it contains no branches and no memory accesses. Note that this shouldn't cause performance degradation because the size of the new function is the same as the size of the old function (on x86-64) - and the new function causes no branch misprediction penalties. I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64 i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32 sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64 powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are no branches in the generated code. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-27 10:57:33 -07:00
Shin'ichiro Kawasaki	c7d2f89fea	bus: fsl-mc-msi: Fix MSI descriptor mutex lock for msi_first_desc() Commit `e8604b1447` introduced a call to the helper function msi_first_desc(), which needs MSI descriptor mutex lock before call. However, the required mutex lock was not added. This results in lockdep assertion: WARNING: CPU: 4 PID: 119 at kernel/irq/msi.c:274 msi_first_desc+0xd0/0x10c msi_first_desc+0xd0/0x10c fsl_mc_msi_domain_alloc_irqs+0x7c/0xc0 fsl_mc_populate_irq_pool+0x80/0x3cc Fix this by adding the mutex lock and unlock around the function call. Fixes: `e8604b1447` ("bus: fsl-mc-msi: Simplify MSI descriptor handling") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412075636.755454-1-shinichiro.kawasaki@wdc.com	2022-04-27 19:42:32 +02:00
Minchan Kim	ad8d869343	kernfs: fix NULL dereferencing in kernfs_remove kernfs_remove supported NULL kernfs_node param to bail out but revent per-fs lock change introduced regression that dereferencing the param without NULL check so kernel goes crash. This patch checks the NULL kernfs_node in kernfs_remove and if so, just return. Quote from bug report by Jirka ``` The bug is triggered by running NAS Parallel benchmark suite on SuperMicro servers with 2x Xeon(R) Gold 6126 CPU. Here is the error log: [ 247.035564] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 247.036009] #PF: supervisor read access in kernel mode [ 247.036009] #PF: error_code(0x0000) - not-present page [ 247.036009] PGD 0 P4D 0 [ 247.036009] Oops: 0000 [#1] PREEMPT SMP PTI [ 247.058060] CPU: 1 PID: 6546 Comm: umount Not tainted 5.16.0393c3714081a53795bbff0e985d24146def6f57f+ #16 [ 247.058060] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 2.0b 03/07/2018 [ 247.058060] RIP: 0010:kernfs_remove+0x8/0x50 [ 247.058060] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 49 c7 c4 f4 ff ff ff eb b2 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 41 54 55 <48> 8b 47 08 48 89 fd 48 85 c0 48 0f 44 c7 4c 8b 60 50 49 83 c4 60 [ 247.058060] RSP: 0018:ffffbbfa48a27e48 EFLAGS: 00010246 [ 247.058060] RAX: 0000000000000001 RBX: ffffffff89e31f98 RCX: 0000000080200018 [ 247.058060] RDX: 0000000080200019 RSI: fffff6760786c900 RDI: 0000000000000000 [ 247.058060] RBP: ffffffff89e31f98 R08: ffff926b61b24d00 R09: 0000000080200018 [ 247.122048] R10: ffff926b61b24d00 R11: ffff926a8040c000 R12: ffff927bd09a2000 [ 247.122048] R13: ffffffff89e31fa0 R14: dead000000000122 R15: dead000000000100 [ 247.122048] FS: 00007f01be0a8c40(0000) GS:ffff926fa8e40000(0000) knlGS:0000000000000000 [ 247.122048] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 247.122048] CR2: 0000000000000008 CR3: 00000001145c6003 CR4: 00000000007706e0 [ 247.122048] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 247.122048] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 247.122048] PKRU: 55555554 [ 247.122048] Call Trace: [ 247.122048] <TASK> [ 247.122048] rdt_kill_sb+0x29d/0x350 [ 247.122048] deactivate_locked_super+0x36/0xa0 [ 247.122048] cleanup_mnt+0x131/0x190 [ 247.122048] task_work_run+0x5c/0x90 [ 247.122048] exit_to_user_mode_prepare+0x229/0x230 [ 247.122048] syscall_exit_to_user_mode+0x18/0x40 [ 247.122048] do_syscall_64+0x48/0x90 [ 247.122048] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 247.122048] RIP: 0033:0x7f01be2d735b ``` Link: https://bugzilla.kernel.org/show_bug.cgi?id=215696 Link: https://lore.kernel.org/lkml/CAE4VaGDZr_4wzRn2___eDYRtmdPaGGJdzu_LCSkJYuY9BEO3cw@mail.gmail.com/ Fixes: `393c371408` (kernfs: switch global kernfs_rwsem lock to per-fs lock) Cc: stable@vger.kernel.org Reported-by: Jirka Hladky <jhladky@redhat.com> Tested-by: Jirka Hladky <jhladky@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Minchan Kim <minchan@kernel.org> Link: https://lore.kernel.org/r/20220427172152.3505364-1-minchan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-04-27 19:32:07 +02:00
Linus Torvalds	211ed5480a	zonefs fixes for 5.18-rc5 Two fixes for rc5: * Fix inode initialization to make sure that the inode flags are all cleared. * Use zone reset operation instead of close to make sure that the zone of an empty sequential file in never in an active state after closing the file. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCYmjSewAKCRDdoc3SxdoY dqd+APkBb/phIqK21E1x1KIXe51WTeL4mhkc0dwkQ6vRGcZOfAEA6cW++S7X1Sqo JqORCus5FZEfs59NhI6TBD6BtsIZ9wc= =hCgn -----END PGP SIGNATURE----- Merge tag 'zonefs-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fixes from Damien Le Moal: "Two fixes for rc5: - Fix inode initialization to make sure that the inode flags are all cleared. - Use zone reset operation instead of close to make sure that the zone of an empty sequential file in never in an active state after closing the file" * tag 'zonefs-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Fix management of open zones zonefs: Clear inode information flags on inode creation	2022-04-27 10:30:29 -07:00
Linus Torvalds	03498b7131	MTD core fix: * Fix a possible data corruption of the 'part' field in mtd_info Rawnand fixes: * Fix the check on the return value of wait_for_completion_timeout * Fix wrong ECC parameters for mt7622 * Fix a possible memory corruption that might panic in the Qcom driver -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEE9HuaYnbmDhq/XIDIJWrqGEe9VoQFAmJpD0cACgkQJWrqGEe9 VoRjRwgAj31KV9OhfKupvj5QWqsB/06kyKElkJJJFibLPT/tPlS5iY6o/UMUlCAy hUw8yYMOrSN2JySYHiDvifJBbRZPrJmAnY8CkC5uvCx7E9yzRpSbmdSSZTjKk2S0 4eml5JSV0iVKIX5sZ6CMMKZMB9b2peoWi9ebT+hBYndof8CT8d8cki0Yj/W/DGVA +PuT6l4zJNII8NsrR+lGTyCTiw2DUuUn0sudBCj+LU7gBl16bD4Oxj9s2c6N6TuC grze+yeq0AqntS2gaiIsmRHShd/gpTfd4OAYkifpvclUYMGlfy2i038eTsuLLMfA /mUa/sEPQkhG5TlHZvNtHf73QBVK3g== =gnxP -----END PGP SIGNATURE----- Merge tag 'mtd/fixes-for-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD fixes from Miquel Raynal: "Core fix: - Fix a possible data corruption of the 'part' field in mtd_info Rawnand fixes: - Fix the check on the return value of wait_for_completion_timeout - Fix wrong ECC parameters for mt7622 - Fix a possible memory corruption that might panic in the Qcom driver" * tag 'mtd/fixes-for-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: qcom: fix memory corruption that causes panic mtd: fix 'part' field data corruption in mtd_info mtd: rawnand: Fix return value check of wait_for_completion_timeout mtd: rawnand: fix ecc parameters for mt7622	2022-04-27 10:14:52 -07:00
Jakub Kicinski	7b5148be4a	Add Eric Dumazet to networking maintainers Welcome Eric! Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Link: https://lore.kernel.org/r/20220426175723.417614-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-27 10:03:05 -07:00
Willy Tarreau	233087ca06	floppy: disable FDRAWCMD by default Minh Yuan reported a concurrency use-after-free issue in the floppy code between raw_cmd_ioctl and seek_interrupt. [ It turns out this has been around, and that others have reported the KASAN splats over the years, but Minh Yuan had a reproducer for it and so gets primary credit for reporting it for this fix - Linus ] The problem is, this driver tends to break very easily and nowadays, nobody is expected to use FDRAWCMD anyway since it was used to manipulate non-standard formats. The risk of breaking the driver is higher than the risk presented by this race, and accessing the device requires privileges anyway. Let's just add a config option to completely disable this ioctl and leave it disabled by default. Distros shouldn't use it, and only those running on antique hardware might need to enable it. Link: https://lore.kernel.org/all/000000000000b71cdd05d703f6bf@google.com/ Link: https://lore.kernel.org/lkml/CAKcFiNC=MfYVW-Jt9A3=FPJpTwCD2PL_ULNCpsCVE5s8ZeBQgQ@mail.gmail.com Link: https://lore.kernel.org/all/CAEAjamu1FRhz6StCe_55XY5s389ZP_xmCF69k987En+1z53=eg@mail.gmail.com Reported-by: Minh Yuan <yuanmingbuaa@gmail.com> Reported-by: syzbot+8e8958586909d62b6840@syzkaller.appspotmail.com Reported-by: cruise k <cruise4k@gmail.com> Reported-by: Kyungtae Kim <kt0755@gmail.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Tested-by: Denis Efremov <efremov@linux.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-04-27 09:41:54 -07:00
Tom Rix	eb2fd9b43f	platform/x86/intel: pmc/core: change pmc_lpm_modes to static Sparse reports this issue core.c: note: in included file: core.h:239:12: warning: symbol 'pmc_lpm_modes' was not declared. Should it be static? Global variables should not be defined in headers. This only works because core.h is only included by core.c. Single file use variables should be static, so change its storage-class specifier to static. Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220423123048.591405-1-trix@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-04-27 16:55:54 +02:00
David E. Box	00dd3ace93	platform/x86/intel/sdsi: Fix bug in multi packet reads Fix bug that added an offset to the mailbox addr during multi-packet reads. Did not affect current ABI since it doesn't support multi-packet transactions. Fixes: `2546c60004` ("platform/x86: Add Intel Software Defined Silicon driver") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220420155622.1763633-4-david.e.box@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-04-27 16:55:54 +02:00
David E. Box	a30393b36c	platform/x86/intel/sdsi: Poll on ready bit for writes Due to change in firmware flow, update mailbox writes to poll on ready bit instead of run_busy bit. This change makes the polling method consistent for both writes and reads, which also uses the ready bit. Fixes: `2546c60004` ("platform/x86: Add Intel Software Defined Silicon driver") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220420155622.1763633-3-david.e.box@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-04-27 16:55:54 +02:00
David E. Box	679c7a3f15	platform/x86/intel/sdsi: Handle leaky bucket To prevent an agent from indefinitely holding the mailbox firmware has implemented a leaky bucket algorithm. Repeated access to the mailbox may now incur a delay of up to 2.1 seconds. Add a retry loop that tries for up to 2.5 seconds to acquire the mailbox. Fixes: `2546c60004` ("platform/x86: Add Intel Software Defined Silicon driver") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220420155622.1763633-2-david.e.box@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-04-27 16:55:54 +02:00
Srinivas Pandruvada	8d75f7b4a3	platform/x86: intel-uncore-freq: Prevent driver loading in guests Loading this driver in guests results in unchecked MSR access error for MSR 0x620. There is no use of reading and modifying package/die scope uncore MSRs in guests. So check for CPU feature X86_FEATURE_HYPERVISOR to prevent loading of this driver in guests. Fixes: `dbce412a77` ("platform/x86/intel-uncore-freq: Split common and enumeration part") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215870 Suggested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://lore.kernel.org/r/20220427100304.2562990-1-srinivas.pandruvada@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-04-27 16:55:54 +02:00
Darryn Anton Jordan	e5483b45f6	platform/x86: gigabyte-wmi: added support for B660 GAMING X DDR4 motherboard This works on my system. Signed-off-by: Darryn Anton Jordan <darrynjordan@icloud.com> Acked-by: Thomas Weißschuh <thomas@weissschuh.net> Link: https://lore.kernel.org/r/Ylguq87YG+9L3foV@hark Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-04-27 16:55:54 +02:00
Gabriele Mazzotta	89a8f23fee	platform/x86: dell-laptop: Add quirk entry for Latitude 7520 The Latitude 7520 supports AC timeouts, but it has no KBD_LED_AC_TOKEN and so changes to stop_timeout appear to have no effect if the laptop is plugged in. Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Acked-by: Pali Rohár <pali@kernel.org> Link: https://lore.kernel.org/r/20220426120827.12363-1-gabriele.mzt@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-04-27 16:55:54 +02:00
Hans de Goede	9fe1bb29ea	platform/x86: asus-wmi: Fix driver not binding when fan curve control probe fails Before this commit fan_curve_check_present() was trying to not cause the probe to fail on devices without fan curve control by testing for known error codes returned by asus_wmi_evaluate_method_buf(). Checking for ENODATA or ENODEV, with the latter being returned by this function when an ACPI integer with a value of ASUS_WMI_UNSUPPORTED_METHOD is returned. But for other ACPI integer returns this function just returns them as is, including the ASUS_WMI_DSTS_UNKNOWN_BIT value of 2. On the Asus U36SD ASUS_WMI_DSTS_UNKNOWN_BIT gets returned, leading to: asus-nb-wmi: probe of asus-nb-wmi failed with error 2 Instead of playing whack a mole with error codes here, simply treat all errors as there not being any fan curves, fixing the driver no longer loading on the Asus U36SD laptop. Fixes: `e3d13da7f7` ("platform/x86: asus-wmi: Fix regression when probing for fan curve control") BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2079125 Cc: Luke D. Jones <luke@ljones.dev> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20220427114956.332919-1-hdegoede@redhat.com	2022-04-27 16:55:54 +02:00
Dan Carpenter	4345ece8f0	platform/x86: asus-wmi: Potential buffer overflow in asus_wmi_evaluate_method_buf() This code tests for if the obj->buffer.length is larger than the buffer but then it just does the memcpy() anyway. Fixes: `0f0ac158d2` ("platform/x86: asus-wmi: Add support for custom fan curves") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20220413073744.GB8812@kili Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-04-27 16:55:54 +02:00
Tejun Heo	8c936f9ea1	iocost: don't reset the inuse weight of under-weighted debtors When an iocg is in debt, its inuse weight is owned by debt handling and should stay at 1. This invariant was broken when determining the amount of surpluses at the beginning of donation calculation - when an iocg's hierarchical weight is too low, the iocg is excluded from donation calculation and its inuse is reset to its active regardless of its indebtedness, triggering warnings like the following: WARNING: CPU: 5 PID: 0 at block/blk-iocost.c:1416 iocg_kick_waitq+0x392/0x3a0 ... RIP: 0010:iocg_kick_waitq+0x392/0x3a0 Code: 00 00 be ff ff ff ff 48 89 4d a8 e8 98 b2 70 00 48 8b 4d a8 85 c0 0f 85 4a fe ff ff 0f 0b e9 43 fe ff ff 0f 0b e9 4d fe ff ff <0f> 0b e9 50 fe ff ff e8 a2 ae 70 00 66 90 0f 1f 44 00 00 55 48 89 RSP: 0018:ffffc90000200d08 EFLAGS: 00010016 ... <IRQ> ioc_timer_fn+0x2e0/0x1470 call_timer_fn+0xa1/0x2c0 ... As this happens only when an iocg's hierarchical weight is negligible, its impact likely is limited to triggering the warnings. Fix it by skipping resetting inuse of under-weighted debtors. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Rik van Riel <riel@surriel.com> Fixes: `c421a3eb2e` ("blk-iocost: revamp debt handling") Cc: stable@vger.kernel.org # v5.10+ Link: https://lore.kernel.org/r/YmjODd4aif9BzFuO@slm.duckdns.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-04-27 08:41:11 -06:00
Volodymyr Mytnyk	626873c446	netfilter: conntrack: fix udp offload timeout sysctl `nf_flowtable_udp_timeout` sysctl option is available only if CONFIG_NFT_FLOW_OFFLOAD enabled. But infra for this flow offload UDP timeout was added under CONFIG_NF_FLOW_TABLE config option. So, if you have CONFIG_NFT_FLOW_OFFLOAD disabled and CONFIG_NF_FLOW_TABLE enabled, the `nf_flowtable_udp_timeout` is not present in sysfs. Please note, that TCP flow offload timeout sysctl option is present even CONFIG_NFT_FLOW_OFFLOAD is disabled. I suppose it was a typo in commit that adds UDP flow offload timeout and CONFIG_NF_FLOW_TABLE should be used instead. Fixes: `975c57504d` ("netfilter: conntrack: Introduce udp offload timeout configuration") Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-04-27 15:48:49 +02:00
Florian Westphal	c7aab4f170	netfilter: nf_conntrack_tcp: re-init for syn packets only Jaco Kroon reported tcp problems that Eric Dumazet and Neal Cardwell pinpointed to nf_conntrack tcp_in_window() bug. tcp trace shows following sequence: I > R Flags [S], seq 3451342529, win 62580, options [.. tfo [\|tcp]> R > I Flags [S.], seq 2699962254, ack 3451342530, win 65535, options [..] R > I Flags [P.], seq 1:89, ack 1, [..] Note 3rd ACK is from responder to initiator so following branch is taken: } else if (((state->state == TCP_CONNTRACK_SYN_SENT && dir == IP_CT_DIR_ORIGINAL) \|\| (state->state == TCP_CONNTRACK_SYN_RECV && dir == IP_CT_DIR_REPLY)) && after(end, sender->td_end)) { ... because state == TCP_CONNTRACK_SYN_RECV and dir is REPLY. This causes the scaling factor to be reset to 0: window scale option is only present in syn(ack) packets. This in turn makes nf_conntrack mark valid packets as out-of-window. This was always broken, it exists even in original commit where window tracking was added to ip_conntrack (nf_conntrack predecessor) in 2.6.9-rc1 kernel. Restrict to 'tcph->syn', just like the 3rd condtional added in commit `82b72cb946` ("netfilter: conntrack: re-init state for retransmitted syn-ack"). Upon closer look, those conditionals/branches can be merged: Because earlier checks prevent syn-ack from showing up in original direction, the 'dir' checks in the conditional quoted above are redundant, remove them. Return early for pure syn retransmitted in reply direction (simultaneous open). Fixes: `9fb9cbb108` ("[NETFILTER]: Add nf_conntrack subsystem.") Reported-by: Jaco Kroon <jaco@uls.co.za> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-04-27 15:48:45 +02:00
David S. Miller	a1bde8c92d	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net -queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-04-26 This series contains updates to ice driver only. Ivan Vecera removes races related to VF message processing by changing mutex_trylock() call to mutex_lock() and moving additional operations to occur under mutex. Petr Oros increases wait time after firmware flash as current time is not sufficient. Jake resolves a use-after-free issue for mailbox snapshot. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-04-27 10:58:39 +01:00
Jens Axboe	5a1e99b61b	io_uring: check reserved fields for recv/recvmsg We should check unused fields for non-zero and -EINVAL if they are set, making it consistent with other opcodes. Fixes: `aa1fa28fc7` ("io_uring: add support for recvmsg()") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-04-26 20:48:37 -06:00
Jens Axboe	588faa1ea5	io_uring: check reserved fields for send/sendmsg We should check unused fields for non-zero and -EINVAL if they are set, making it consistent with other opcodes. Fixes: `0fa03c624d` ("io_uring: add support for sendmsg()") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-04-26 20:48:31 -06:00
Martin Blumenstingl	71cffebf63	net: dsa: lantiq_gswip: Don't set GSWIP_MII_CFG_RMII_CLK Commit `4b5923249b` ("net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits") added all known bits in the GSWIP_MII_CFGp register. It helped bring this register into a well-defined state so the driver has to rely less on the bootloader to do things right. Unfortunately it also sets the GSWIP_MII_CFG_RMII_CLK bit without any possibility to configure it. Upon further testing it turns out that all boards which are supported by the GSWIP driver in OpenWrt which use an RMII PHY have a dedicated oscillator on the board which provides the 50MHz RMII reference clock. Don't set the GSWIP_MII_CFG_RMII_CLK bit (but keep the code which always clears it) to fix support for the Fritz!Box 7362 SL in OpenWrt. This is a board with two Atheros AR8030 RMII PHYs. With the "RMII clock" bit set the MAC also generates the RMII reference clock whose signal then conflicts with the signal from the oscillator on the board. This results in a constant cycle of the PHY detecting link up/down (and as a result of that: the two ports using the AR8030 PHYs are not working). At the time of writing this patch there's no known board where the MAC (GSWIP) has to generate the RMII reference clock. If needed this can be implemented in future by providing a device-tree flag so the GSWIP_MII_CFG_RMII_CLK bit can be toggled per port. Fixes: `4b5923249b` ("net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits") Tested-by: Jan Hoffmann <jan@3e8.eu> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Link: https://lore.kernel.org/r/20220425152027.2220750-1-martin.blumenstingl@googlemail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:32:52 -07:00
Sebastian Andrzej Siewior	6510ea973d	net: Use this_cpu_inc() to increment net->core_stats The macro dev_core_stats_##FIELD##_inc() disables preemption and invokes netdev_core_stats_alloc() to return a per-CPU pointer. netdev_core_stats_alloc() will allocate memory on its first invocation which breaks on PREEMPT_RT because it requires non-atomic context for memory allocation. This can be avoided by enabling preemption in netdev_core_stats_alloc() assuming the caller always disables preemption. It might be better to replace local_inc() with this_cpu_inc() now that dev_core_stats_##FIELD##_inc() gained a preempt-disable section and does not rely on already disabled preemption. This results in less instructions on x86-64: local_inc: \| incl %gs:__preempt_count(%rip) # __preempt_count \| movq 488(%rdi), %rax # _1->core_stats, _22 \| testq %rax, %rax # _22 \| je .L585 #, \| add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__ \| .L586: \| testq %rax, %rax # _27 \| je .L587 #, \| incq (%rax) # _6->a.counter \| .L587: \| decl %gs:__preempt_count(%rip) # __preempt_count this_cpu_inc(), this patch: \| movq 488(%rdi), %rax # _1->core_stats, _5 \| testq %rax, %rax # _5 \| je .L591 #, \| .L585: \| incq %gs:(%rax) # _18->rx_dropped Use unsigned long as type for the counter. Use this_cpu_inc() to increment the counter. Use a plain read of the counter. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/YmbO0pxgtKpCw4SY@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-04-26 17:32:30 -07:00
Linus Torvalds	46cf2c613f	Pin control fixes for v5.18: - Fix some register offsets on Intel Alderlake - Fix the order the UFS and SDC pins on Qualcomm SM6350 - Fix a build error in Mediatek Moore. - Fix a pin function table in the Sunplus SP7021. - Fix some Kconfig and static keywords on the Samsung Tesla FSD SoC. - Fix up the EOI function for edge triggered IRQs and keep the block clock enabled for level IRQs in the STM32 driver. - Fix some bits and order in the Rockchip RK3308 driver. - Handle the errorpath in the Pistachio driver probe() properly. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmJobdYACgkQQRCzN7AZ XXMltw/+JuwmtaJOOARGp7oKy3s0L0eV4lquF/a3cvGjViua0HPWlYd8aFAGnSiq /a0wQT5pmKqdg0HEgVFxNyxe6s7P8RXpFz5kA0jtO85O6/l6GZDnix9kQ7Qdwmin dP0JGBxr05UbejOMrO6CRSjDWPX+ptPO1DtCYWZC2A3qpwXTYdrnq568uH9dePby xpqeJ35upIK7DfLBTCa2FaQWcXM3IQsAuUB84gN4oi1UXdimXAXqExRx77vTFrM2 pXmOgKF9l9inAUJnF/X3aM3GeR44x1uAH2mxpMupAWe81WQTn997O2B/odc4Arw1 LpIbBgQPY8+R2Go2CqJ/UKGYqh0jABiZV3DdGEG+Gvex/qfiT6yW4AJvaIoaI6ZT A9L2aZdw6Q6U7OR6wyQ6CFNxpHeyxjjEbRBYhlQW0Bzb/IZnqtIJagv6M5It0OhI +FJ4poIPKS5jTsdjUCk5AbWuFP8GMhDE+8eRtBg6CEHsp7+0unGlhAvmztyrtW6G lGUQBw5OsvKqyPujitmSVWkKC99PhQhd7BR9feSKLP7+9Is7Wo45hamZXbJF5+bz rqznr4jRMZb7/mUWRLg8wXkmrSaKfzOo9yDyMix5JQw6JyOEw8oyeweI8Mfa+JbG cRa6PYoVNpHZr+BVqLxMnTTbz/5dtV/NT6H2Ny8c+UYKod8eA4w= =xFvm -----END PGP SIGNATURE----- Merge tag 'pinctrl-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Fix some register offsets on Intel Alderlake - Fix the order the UFS and SDC pins on Qualcomm SM6350 - Fix a build error in Mediatek Moore. - Fix a pin function table in the Sunplus SP7021. - Fix some Kconfig and static keywords on the Samsung Tesla FSD SoC. - Fix up the EOI function for edge triggered IRQs and keep the block clock enabled for level IRQs in the STM32 driver. - Fix some bits and order in the Rockchip RK3308 driver. - Handle the errorpath in the Pistachio driver probe() properly. * tag 'pinctrl-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: pistachio: fix use of irq_of_parse_and_map() pinctrl: stm32: Keep pinctrl block clock enabled when LEVEL IRQ requested pinctrl: rockchip: sort the rk3308_mux_recalced_data entries pinctrl: rockchip: fix RK3308 pinmux bits pinctrl: stm32: Do not call stm32_gpio_get() for edge triggered IRQs in EOI pinctrl: Fix an error in pin-function table of SP7021 pinctrl: samsung: fix missing GPIOLIB on ARM64 Exynos config pinctrl: mediatek: moore: Fix build error pinctrl: qcom: sm6350: fix order of UFS & SDC pins pinctrl: alderlake: Fix register offsets for ADL-N variant pinctrl: samsung: staticize fsd_pin_ctrl	2022-04-26 16:34:11 -07:00
Linus Torvalds	cf424ef014	fbdev fixes and updates for kernel v5.18-rc5 A bunch of outstanding fbdev patches - all trivial and small: neofb: Fix the check of 'var->pixclock' kyro, vt8623fb, tridentfb, arkfb, s3fb, i740fb: Error out if 'lineclock' equals zero sis: Fix potential NULL dereference in sisfb_post_sis300() fb.h: Spelling fix: palette/palette/ pm2fb: Fix kernel-doc formatting issue clps711x-fb: Use syscon_regmap_lookup_by_phandle() of: display_timing: Remove a redundant zeroing of memory aty & matrox: Cleanup for powerpc's asm/prom.h sh_mobile_lcdcfb: Remove sh_mobile_lcdc_check_var() declaration mmp: Replace usage of found with dedicated list iterator variable omap: Make it CCF clk API compatible imxfb: Fix missing of_node_put in imxfb_probe i740fb: Use memset_io() to clear screen udlfb: Properly check endpoint type pxafb: Use if else instead -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCYmbo3gAKCRD3ErUQojoP XzuBAQCTKo9GRy2J0kEeSTDUrw+RQ649z5DSqkv07gXU/4eFVwD/at0HVXD7eHCR d550YxqFodM7B9bHBJu4YSSKMg4c0AA= =ANZe -----END PGP SIGNATURE----- Merge tag 'for-5.18/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes and updates from Helge Deller: "A bunch of outstanding fbdev patches - all trivial and small" * tag 'for-5.18/fbdev-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: video: fbdev: clps711x-fb: Use syscon_regmap_lookup_by_phandle video: fbdev: mmp: replace usage of found with dedicated list iterator variable video: fbdev: sh_mobile_lcdcfb: Remove sh_mobile_lcdc_check_var() declaration video: fbdev: i740fb: Error out if 'pixclock' equals zero video: fbdev: i740fb: use memset_io() to clear screen video: fbdev: s3fb: Error out if 'pixclock' equals zero video: fbdev: arkfb: Error out if 'pixclock' equals zero video: fbdev: tridentfb: Error out if 'pixclock' equals zero video: fbdev: vt8623fb: Error out if 'pixclock' equals zero video: fbdev: kyro: Error out if 'lineclock' equals zero video: fbdev: neofb: Fix the check of 'var->pixclock' video: fbdev: imxfb: Fix missing of_node_put in imxfb_probe video: fbdev: omap: Make it CCF clk API compatible video: fbdev: aty/matrox/...: Prepare cleanup of powerpc's asm/prom.h video: fbdev: pm2fb: Fix a kernel-doc formatting issue linux/fb.h: Spelling s/palette/palette/ video: fbdev: sis: fix potential NULL dereference in sisfb_post_sis300() video: fbdev: pxafb: use if else instead video: fbdev: udlfb: properly check endpoint type video: fbdev: of: display_timing: Remove a redundant zeroing of memory	2022-04-26 11:32:01 -07:00
Linus Torvalds	4fad37d595	gfs2 fix - Only re-check for direct I/O writes past the end of the file after re-acquiring the inode glock. -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmJn/TcUHGFncnVlbmJh QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqtQg/+KFuiVv0FW8R/Pq6/zoSolOPmP6eC P2I2hGFTtUkgnyEtdjunIvpMiDVxvNhFPSJgrkfwx6pf55R67K9LJlHUiv8eRJpc GS5YxKEClnTZt7RTS8N/dHFrpIATqjhlPL8k3E7WE/uY72Nkt6yfStljnBU5KYXw NmhNWiCutn/nnvhzlfAc5GugmQ6umpaKeo59ClEd4SU+MdqjFSobl8SoUAOasXaK 4rNAh36R2wwS512Rs/y0+UAu5jNn7kcFvDbOBdJuAeXj26JDMYF2pp/ybp/LTqEr vdPO814mRwtk9SP56FZZVHzG90aVlbuX5Dxl8xPlK8Df3J0WvYooKBWIXv6xmBj0 sYmqQPMGe3ydWUCQRbFBVSw4qzMnxBXZWGDFuCbMaDDZ75+Uidl+H67pOxdqS8X2 rE4cqKkI02hGz8OEo7YXq0esBxrUWxvKwJcR1G0pIqYCkp77J3S4JeACtplQXnJY HUcmpiVOUAvhB7QQgV5DFuwkX5jKPSDd62KHR492+233ONJanGNoReQagiZUsGDp fpPkGQ5Rt66Sj+BTLg7yOUPpMvcsYqeIOmFxn1GnKudq0jVkDX4W9YlfSDfnnLO8 X+1EuYGNGe1jx5luLUKs1XvPYrVSc9yDy3DTbtwrvCuimxNeCPTqOoMU4di3UUgP 145LcO2Q+M+E+t8= =mqGR -----END PGP SIGNATURE----- Merge tag 'gfs2-v5.18-rc4-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Andreas Gruenbacher: - Only re-check for direct I/O writes past the end of the file after re-acquiring the inode glock. * tag 'gfs2-v5.18-rc4-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Don't re-check for write past EOF unnecessarily	2022-04-26 11:17:18 -07:00
Luiz Augusto von Dentz	9b3628d79b	Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted This attempts to cleanup the hci_conn if it cannot be aborted as otherwise it would likely result in having the controller and host stack out of sync with respect to connection handle. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2022-04-26 20:10:51 +02:00
Linus Torvalds	fd574a2f84	for-5.18-rc4-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmJnGGIACgkQxWXV+ddt WDumDw//cE1NcawdnVkEaKr20PetHfzPyFSIIr17nedtnVvWYyOFF/0uJlHNhv8Z CZIfJ7fmH/pO5oWPXN84wKNfumDWNwc36QrvoXC67TrKUSiBN8BzL83HvAjGwYFH G+LfZXGnVbqq8F1iYkIsuH0Oo1x/N/LPM3s6iZy3O4l8s96u+J4GRnc8Tr0AH4MA zgz3fab8Ec378HTG9fvdAQNLxFEe0VatD6WrzILnmM8UgeQK7g73dqH9Ni9gz2DW 2GDlO6aevQ1G6dm2AJ0ItExnbHH7TfOThkG56Gdqrzb/d39GzrVpeob7QiorETus EWS1rXaeikUiD4Bzt/RszUNL80yMN1DjcN3QBkiDf3ShSDFteoHMPw3e6jcQCy1m Dxf5oditQqltuFNLeSiVbZEMw2kXqBP7RoPiirF9rdvrDNLHhAE9wu0kpSGSSvT7 Tyu9JyLw2axU6wGTi1GHAXurlW2ItRRyFAewWWul1lLkuz/6YXI4F/EHm3Mbh6Nh pMIFMNr4Oafdx+3Ful8ZA4PynirXub/xVDefcFBibz/PTGEnHG4ZVzRudmVnowh7 GP2pql1+Y/TFkXdD98V8GWD+E10JAmNCkQSoiggJooNWR28whukmDVX/HY8lGmWg DjxwGkte3SltUBWNOTGnO7546hMwOxOPZHENPh+gffYkeMeIxPI= =xDWz -----END PGP SIGNATURE----- Merge tag 'for-5.18-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - direct IO fixes: - restore passing file offset to correctly calculate checksums when repairing on read and bio split happens - use correct bio when sumitting IO on zoned filesystem - zoned mode fixes: - fix selection of device to correctly calculate device capabilities when allocating a new bio - use a dedicated lock for exclusion during relocation - fix leaked plug after failure syncing log - fix assertion during scrub and relocation * tag 'for-5.18-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zoned: use dedicated lock for data relocation btrfs: fix assertion failure during scrub due to block group reallocation btrfs: fix direct I/O writes for split bios on zoned devices btrfs: fix direct I/O read repair for split bios btrfs: fix and document the zoned device choice in alloc_new_bio btrfs: fix leaked plug after failure syncing log on zoned filesystems	2022-04-26 11:10:42 -07:00
Luiz Augusto von Dentz	aef2aa4fa9	Bluetooth: hci_event: Fix creating hci_conn object on error status It is useless to create a hci_conn object if on error status as the result would be it being freed in the process and anyway it is likely the result of controller and host stack being out of sync. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2022-04-26 20:10:22 +02:00
Luiz Augusto von Dentz	c86cc5a3ec	Bluetooth: hci_event: Fix checking for invalid handle on error status Commit `d5ebaa7c5f` introduces checks for handle range (e.g HCI_CONN_HANDLE_MAX) but controllers like Intel AX200 don't seem to respect the valid range int case of error status: > HCI Event: Connect Complete (0x03) plen 11 Status: Page Timeout (0x04) Handle: 65535 Address: 94:DB:56:XX:XX:XX (Sony Home Entertainment& Sound Products Inc) Link type: ACL (0x01) Encryption: Disabled (0x00) [1644965.827560] Bluetooth: hci0: Ignoring HCI_Connection_Complete for invalid handle Because of it is impossible to cleanup the connections properly since the stack would attempt to cancel the connection which is no longer in progress causing the following trace: < HCI Command: Create Connection Cancel (0x01\|0x0008) plen 6 Address: 94:DB:56:XX:XX:XX (Sony Home Entertainment& Sound Products Inc) = bluetoothd: src/profile.c:record_cb() Unable to get Hands-Free Voice gateway SDP record: Connection timed out > HCI Event: Command Complete (0x0e) plen 10 Create Connection Cancel (0x01\|0x0008) ncmd 1 Status: Unknown Connection Identifier (0x02) Address: 94:DB:56:XX:XX:XX (Sony Home Entertainment& Sound Products Inc) < HCI Command: Create Connection Cancel (0x01\|0x0008) plen 6 Address: 94:DB:56:XX:XX:XX (Sony Home Entertainment& Sound Products Inc) Fixes: `d5ebaa7c5f` ("Bluetooth: hci_event: Ignore multiple conn complete events") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2022-04-26 20:09:07 +02:00
Jacob Keller	b668f4cd71	ice: fix use-after-free when deinitializing mailbox snapshot During ice_sriov_configure, if num_vfs is 0, we are being asked by the kernel to remove all VFs. The driver first de-initializes the snapshot before freeing all the VFs. This results in a use-after-free BUG detected by KASAN. The bug occurs because the snapshot can still be accessed until all VFs are removed. Fix this by freeing all the VFs first before calling ice_mbx_deinit_snapshot. [ +0.032591] ================================================================== [ +0.000021] BUG: KASAN: use-after-free in ice_mbx_vf_state_handler+0x1c3/0x410 [ice] [ +0.000315] Write of size 28 at addr ffff889908eb6f28 by task kworker/55:2/1530996 [ +0.000029] CPU: 55 PID: 1530996 Comm: kworker/55:2 Kdump: loaded Tainted: G S I 5.17.0-dirty #1 [ +0.000022] Hardware name: Dell Inc. PowerEdge R740/0923K0, BIOS 1.6.13 12/17/2018 [ +0.000013] Workqueue: ice ice_service_task [ice] [ +0.000279] Call Trace: [ +0.000012] <TASK> [ +0.000011] dump_stack_lvl+0x33/0x42 [ +0.000030] print_report.cold.13+0xb2/0x6b3 [ +0.000028] ? ice_mbx_vf_state_handler+0x1c3/0x410 [ice] [ +0.000295] kasan_report+0xa5/0x120 [ +0.000026] ? __switch_to_asm+0x21/0x70 [ +0.000024] ? ice_mbx_vf_state_handler+0x1c3/0x410 [ice] [ +0.000298] kasan_check_range+0x183/0x1e0 [ +0.000019] memset+0x1f/0x40 [ +0.000018] ice_mbx_vf_state_handler+0x1c3/0x410 [ice] [ +0.000304] ? ice_conv_link_speed_to_virtchnl+0x160/0x160 [ice] [ +0.000297] ? ice_vsi_dis_spoofchk+0x40/0x40 [ice] [ +0.000305] ice_is_malicious_vf+0x1aa/0x250 [ice] [ +0.000303] ? ice_restore_all_vfs_msi_state+0x160/0x160 [ice] [ +0.000297] ? __mutex_unlock_slowpath.isra.15+0x410/0x410 [ +0.000022] ? ice_debug_cq+0xb7/0x230 [ice] [ +0.000273] ? __kasan_slab_alloc+0x2f/0x90 [ +0.000022] ? memset+0x1f/0x40 [ +0.000017] ? do_raw_spin_lock+0x119/0x1d0 [ +0.000022] ? rwlock_bug.part.2+0x60/0x60 [ +0.000024] __ice_clean_ctrlq+0x3a6/0xd60 [ice] [ +0.000273] ? newidle_balance+0x5b1/0x700 [ +0.000026] ? ice_print_link_msg+0x2f0/0x2f0 [ice] [ +0.000271] ? update_cfs_group+0x1b/0x140 [ +0.000018] ? load_balance+0x1260/0x1260 [ +0.000022] ? ice_process_vflr_event+0x27/0x130 [ice] [ +0.000301] ice_service_task+0x136e/0x1470 [ice] [ +0.000281] process_one_work+0x3b4/0x6c0 [ +0.000030] worker_thread+0x65/0x660 [ +0.000023] ? __kthread_parkme+0xe4/0x100 [ +0.000021] ? process_one_work+0x6c0/0x6c0 [ +0.000020] kthread+0x179/0x1b0 [ +0.000018] ? kthread_complete_and_exit+0x20/0x20 [ +0.000022] ret_from_fork+0x22/0x30 [ +0.000026] </TASK> [ +0.000018] Allocated by task 10742: [ +0.000013] kasan_save_stack+0x1c/0x40 [ +0.000018] __kasan_kmalloc+0x84/0xa0 [ +0.000016] kmem_cache_alloc_trace+0x16c/0x2e0 [ +0.000015] intel_iommu_probe_device+0xeb/0x860 [ +0.000015] __iommu_probe_device+0x9a/0x2f0 [ +0.000016] iommu_probe_device+0x43/0x270 [ +0.000015] iommu_bus_notifier+0xa7/0xd0 [ +0.000015] blocking_notifier_call_chain+0x90/0xc0 [ +0.000017] device_add+0x5f3/0xd70 [ +0.000014] pci_device_add+0x404/0xa40 [ +0.000015] pci_iov_add_virtfn+0x3b0/0x550 [ +0.000016] sriov_enable+0x3bb/0x600 [ +0.000013] ice_ena_vfs+0x113/0xa79 [ice] [ +0.000293] ice_sriov_configure.cold.17+0x21/0xe0 [ice] [ +0.000291] sriov_numvfs_store+0x160/0x200 [ +0.000015] kernfs_fop_write_iter+0x1db/0x270 [ +0.000018] new_sync_write+0x21d/0x330 [ +0.000013] vfs_write+0x376/0x410 [ +0.000013] ksys_write+0xba/0x150 [ +0.000012] do_syscall_64+0x3a/0x80 [ +0.000012] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000028] Freed by task 10742: [ +0.000011] kasan_save_stack+0x1c/0x40 [ +0.000015] kasan_set_track+0x21/0x30 [ +0.000016] kasan_set_free_info+0x20/0x30 [ +0.000012] __kasan_slab_free+0x104/0x170 [ +0.000016] kfree+0x9b/0x470 [ +0.000013] devres_destroy+0x1c/0x20 [ +0.000015] devm_kfree+0x33/0x40 [ +0.000012] ice_mbx_deinit_snapshot+0x39/0x70 [ice] [ +0.000295] ice_sriov_configure+0xb0/0x260 [ice] [ +0.000295] sriov_numvfs_store+0x1bc/0x200 [ +0.000015] kernfs_fop_write_iter+0x1db/0x270 [ +0.000016] new_sync_write+0x21d/0x330 [ +0.000012] vfs_write+0x376/0x410 [ +0.000012] ksys_write+0xba/0x150 [ +0.000012] do_syscall_64+0x3a/0x80 [ +0.000012] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000024] Last potentially related work creation: [ +0.000010] kasan_save_stack+0x1c/0x40 [ +0.000016] __kasan_record_aux_stack+0x98/0xa0 [ +0.000013] insert_work+0x34/0x160 [ +0.000015] __queue_work+0x20e/0x650 [ +0.000016] queue_work_on+0x4c/0x60 [ +0.000015] nf_nat_masq_schedule+0x297/0x2e0 [nf_nat] [ +0.000034] masq_device_event+0x5a/0x60 [nf_nat] [ +0.000031] raw_notifier_call_chain+0x5f/0x80 [ +0.000017] dev_close_many+0x1d6/0x2c0 [ +0.000015] unregister_netdevice_many+0x4e3/0xa30 [ +0.000015] unregister_netdevice_queue+0x192/0x1d0 [ +0.000014] iavf_remove+0x8f9/0x930 [iavf] [ +0.000058] pci_device_remove+0x65/0x110 [ +0.000015] device_release_driver_internal+0xf8/0x190 [ +0.000017] pci_stop_bus_device+0xb5/0xf0 [ +0.000014] pci_stop_and_remove_bus_device+0xe/0x20 [ +0.000016] pci_iov_remove_virtfn+0x19c/0x230 [ +0.000015] sriov_disable+0x4f/0x170 [ +0.000014] ice_free_vfs+0x9a/0x490 [ice] [ +0.000306] ice_sriov_configure+0xb8/0x260 [ice] [ +0.000294] sriov_numvfs_store+0x1bc/0x200 [ +0.000015] kernfs_fop_write_iter+0x1db/0x270 [ +0.000016] new_sync_write+0x21d/0x330 [ +0.000012] vfs_write+0x376/0x410 [ +0.000012] ksys_write+0xba/0x150 [ +0.000012] do_syscall_64+0x3a/0x80 [ +0.000012] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000025] The buggy address belongs to the object at ffff889908eb6f00 which belongs to the cache kmalloc-96 of size 96 [ +0.000016] The buggy address is located 40 bytes inside of 96-byte region [ffff889908eb6f00, ffff889908eb6f60) [ +0.000026] The buggy address belongs to the physical page: [ +0.000010] page:00000000b7e99a2e refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1908eb6 [ +0.000016] flags: 0x57ffffc0000200(slab\|node=1\|zone=2\|lastcpupid=0x1fffff) [ +0.000024] raw: 0057ffffc0000200 ffffea0069d9fd80 dead000000000002 ffff88810004c780 [ +0.000015] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 [ +0.000009] page dumped because: kasan: bad access detected [ +0.000016] Memory state around the buggy address: [ +0.000012] ffff889908eb6e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ +0.000014] ffff889908eb6e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ +0.000014] >ffff889908eb6f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ +0.000011] ^ [ +0.000013] ffff889908eb6f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ +0.000013] ffff889908eb7000: fa fb fb fb fb fb fb fb fc fc fc fc fa fb fb fb [ +0.000012] ================================================================== Fixes: `0891c89674` ("ice: warn about potentially malicious VFs") Reported-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-04-26 09:26:48 -07:00
Petr Oros	b537752e6c	ice: wait 5 s for EMP reset after firmware flash We need to wait 5 s for EMP reset after firmware flash. Code was extracted from OOT driver (ice v1.8.3 downloaded from sourceforge). Without this wait, fw_activate let card in inconsistent state and recoverable only by second flash/activate. Flash was tested on these fw's: From -> To 3.00 -> 3.10/3.20 3.10 -> 3.00/3.20 3.20 -> 3.00/3.10 Reproducer: [root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin Preparing to flash [fw.mgmt] Erasing [fw.mgmt] Erasing done [fw.mgmt] Flashing 100% [fw.mgmt] Flashing done 100% [fw.undi] Erasing [fw.undi] Erasing done [fw.undi] Flashing 100% [fw.undi] Flashing done 100% [fw.netlist] Erasing [fw.netlist] Erasing done [fw.netlist] Flashing 100% [fw.netlist] Flashing done 100% Activate new firmware by devlink reload [root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate reload_actions_performed: fw_activate [root@host ~]# ip link show ens7f0 71: ens7f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000 link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff altname enp202s0f0 dmesg after flash: [ 55.120788] ice: Copyright (c) 2018, Intel Corporation. [ 55.274734] ice 0000:ca:00.0: Get PHY capabilities failed status = -5, continuing anyway [ 55.569797] ice 0000:ca:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.28.0 [ 55.603629] ice 0000:ca:00.0: Get PHY capability failed. [ 55.608951] ice 0000:ca:00.0: ice_init_nvm_phy_type failed: -5 [ 55.647348] ice 0000:ca:00.0: PTP init successful [ 55.675536] ice 0000:ca:00.0: DCB is enabled in the hardware, max number of TCs supported on this port are 8 [ 55.685365] ice 0000:ca:00.0: FW LLDP is disabled, DCBx/LLDP in SW mode. [ 55.692179] ice 0000:ca:00.0: Commit DCB Configuration to the hardware [ 55.701382] ice 0000:ca:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:c9:02.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link) Reboot doesn’t help, only second flash/activate with OOT or patched driver put card back in consistent state. After patch: [root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin Preparing to flash [fw.mgmt] Erasing [fw.mgmt] Erasing done [fw.mgmt] Flashing 100% [fw.mgmt] Flashing done 100% [fw.undi] Erasing [fw.undi] Erasing done [fw.undi] Flashing 100% [fw.undi] Flashing done 100% [fw.netlist] Erasing [fw.netlist] Erasing done [fw.netlist] Flashing 100% [fw.netlist] Flashing done 100% Activate new firmware by devlink reload [root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate reload_actions_performed: fw_activate [root@host ~]# ip link show ens7f0 19: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff altname enp202s0f0 Fixes: `399e27dbbd` ("ice: support immediate firmware activation via devlink reload") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-04-26 09:26:48 -07:00
Ivan Vecera	77d64d285b	ice: Protect vf_state check by cfg_lock in ice_vc_process_vf_msg() Previous patch labelled "ice: Fix incorrect locking in ice_vc_process_vf_msg()" fixed an issue with ignored messages sent by VF driver but a small race window still left. Recently caught trace during 'ip link set ... vf 0 vlan ...' operation: [ 7332.995625] ice 0000:3b:00.0: Clearing port VLAN on VF 0 [ 7333.001023] iavf 0000:3b:01.0: Reset indication received from the PF [ 7333.007391] iavf 0000:3b:01.0: Scheduling reset task [ 7333.059575] iavf 0000:3b:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 3 [ 7333.059626] ice 0000:3b:00.0: Invalid message from VF 0, opcode 3, len 4, error -1 Setting of VLAN for VF causes a reset of the affected VF using ice_reset_vf() function that runs with cfg_lock taken: 1. ice_notify_vf_reset() informs IAVF driver that reset is needed and IAVF schedules its own reset procedure 2. Bit ICE_VF_STATE_DIS is set in vf->vf_state 3. Misc initialization steps 4. ice_sriov_post_vsi_rebuild() -> ice_vf_set_initialized() and that clears ICE_VF_STATE_DIS in vf->vf_state Step 3 is mentioned race window because IAVF reset procedure runs in parallel and one of its step is sending of VIRTCHNL_OP_GET_VF_RESOURCES message (opcode==3). This message is handled in ice_vc_process_vf_msg() and if it is received during the mentioned race window then it's marked as invalid and error is returned to VF driver. Protect vf_state check in ice_vc_process_vf_msg() by cfg_lock to avoid this race condition. Fixes: `e6ba5273d4` ("ice: Fix race conditions between virtchnl handling and VF ndo ops") Tested-by: Fei Liu <feliu@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-04-26 09:26:42 -07:00
Ivan Vecera	aaf461af72	ice: Fix incorrect locking in ice_vc_process_vf_msg() Usage of mutex_trylock() in ice_vc_process_vf_msg() is incorrect because message sent from VF is ignored and never processed. Use mutex_lock() instead to fix the issue. It is safe because this mutex is used to prevent races between VF related NDOs and handlers processing request messages from VF and these handlers are running in ice_service_task() context. Additionally move this mutex lock prior ice_vc_is_opcode_allowed() call to avoid potential races during allowlist access. Fixes: `e6ba5273d4` ("ice: Fix race conditions between virtchnl handling and VF ndo ops") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-04-26 09:26:33 -07:00

1 2 3 4 5 ...

1090113 commits