Commit Graph

1215405 Commits

Author SHA1 Message Date
Jens Axboe a9ce385344 dm: don't attempt to queue IO under RCU protection
dm looks up the table for IO based on the request type, with an
assumption that if the request is marked REQ_NOWAIT, it's fine to
attempt to submit that IO while under RCU read lock protection. This
is not OK, as REQ_NOWAIT just means that we should not be sleeping
waiting on other IO, it does not mean that we can't potentially
schedule.

A simple test case demonstrates this quite nicely:

int main(int argc, char *argv[])
{
        struct iovec iov;
        int fd;

        fd = open("/dev/dm-0", O_RDONLY | O_DIRECT);
        posix_memalign(&iov.iov_base, 4096, 4096);
        iov.iov_len = 4096;
        preadv2(fd, &iov, 1, 0, RWF_NOWAIT);
        return 0;
}

which will instantly spew:

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 5580, name: dm-nowait
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
INFO: lockdep is turned off.
CPU: 7 PID: 5580 Comm: dm-nowait Not tainted 6.6.0-rc1-g39956d2dcd81 #132
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x11d/0x1b0
 __might_resched+0x3c3/0x5e0
 ? preempt_count_sub+0x150/0x150
 mempool_alloc+0x1e2/0x390
 ? mempool_resize+0x7d0/0x7d0
 ? lock_sync+0x190/0x190
 ? lock_release+0x4b7/0x670
 ? internal_get_user_pages_fast+0x868/0x2d40
 bio_alloc_bioset+0x417/0x8c0
 ? bvec_alloc+0x200/0x200
 ? internal_get_user_pages_fast+0xb8c/0x2d40
 bio_alloc_clone+0x53/0x100
 dm_submit_bio+0x27f/0x1a20
 ? lock_release+0x4b7/0x670
 ? blk_try_enter_queue+0x1a0/0x4d0
 ? dm_dax_direct_access+0x260/0x260
 ? rcu_is_watching+0x12/0xb0
 ? blk_try_enter_queue+0x1cc/0x4d0
 __submit_bio+0x239/0x310
 ? __bio_queue_enter+0x700/0x700
 ? kvm_clock_get_cycles+0x40/0x60
 ? ktime_get+0x285/0x470
 submit_bio_noacct_nocheck+0x4d9/0xb80
 ? should_fail_request+0x80/0x80
 ? preempt_count_sub+0x150/0x150
 ? lock_release+0x4b7/0x670
 ? __bio_add_page+0x143/0x2d0
 ? iov_iter_revert+0x27/0x360
 submit_bio_noacct+0x53e/0x1b30
 submit_bio_wait+0x10a/0x230
 ? submit_bio_wait_endio+0x40/0x40
 __blkdev_direct_IO_simple+0x4f8/0x780
 ? blkdev_bio_end_io+0x4c0/0x4c0
 ? stack_trace_save+0x90/0xc0
 ? __bio_clone+0x3c0/0x3c0
 ? lock_release+0x4b7/0x670
 ? lock_sync+0x190/0x190
 ? atime_needs_update+0x3bf/0x7e0
 ? timestamp_truncate+0x21b/0x2d0
 ? inode_owner_or_capable+0x240/0x240
 blkdev_direct_IO.part.0+0x84a/0x1810
 ? rcu_is_watching+0x12/0xb0
 ? lock_release+0x4b7/0x670
 ? blkdev_read_iter+0x40d/0x530
 ? reacquire_held_locks+0x4e0/0x4e0
 ? __blkdev_direct_IO_simple+0x780/0x780
 ? rcu_is_watching+0x12/0xb0
 ? __mark_inode_dirty+0x297/0xd50
 ? preempt_count_add+0x72/0x140
 blkdev_read_iter+0x2a4/0x530
 do_iter_readv_writev+0x2f2/0x3c0
 ? generic_copy_file_range+0x1d0/0x1d0
 ? fsnotify_perm.part.0+0x25d/0x630
 ? security_file_permission+0xd8/0x100
 do_iter_read+0x31b/0x880
 ? import_iovec+0x10b/0x140
 vfs_readv+0x12d/0x1a0
 ? vfs_iter_read+0xb0/0xb0
 ? rcu_is_watching+0x12/0xb0
 ? rcu_is_watching+0x12/0xb0
 ? lock_release+0x4b7/0x670
 do_preadv+0x1b3/0x260
 ? do_readv+0x370/0x370
 __x64_sys_preadv2+0xef/0x150
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f5af41ad806
Code: 41 54 41 89 fc 55 44 89 c5 53 48 89 cb 48 83 ec 18 80 3d e4 dd 0d 00 00 74 7a 45 89 c1 49 89 ca 45 31 c0 b8 47 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 be 00 00 00 48 85 c0 79 4a 48 8b 0d da 55
RSP: 002b:00007ffd3145c7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000147
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5af41ad806
RDX: 0000000000000001 RSI: 00007ffd3145c850 RDI: 0000000000000003
RBP: 0000000000000008 R08: 0000000000000000 R09: 0000000000000008
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007ffd3145c850 R14: 000055f5f0431dd8 R15: 0000000000000001
 </TASK>

where in fact it is dm itself that attempts to allocate a bio clone with
GFP_NOIO under the rcu read lock, regardless of the request type.

Fix this by getting rid of the special casing for REQ_NOWAIT, and just
use the normal SRCU protected table lookup. Get rid of the bio based
table locking helpers at the same time, as they are now unused.

Cc: stable@vger.kernel.org
Fixes: 563a225c9f ("dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-09-15 15:39:59 -04:00
Linus Torvalds 02e768c9fe selinux/stable-6.6 PR 20230914
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmUDRAUUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMwfw//cVR2sZxCjW6IXdYG627mL+IDMQJ3
 IVfN4N1l/bwA4R+ZHLwrOMTZx7lRUoyOAMDMfvDHJnCgvbeHtuKj5mopd1HekaN9
 Jga7mDQ/+moc6x6S85xI0nqzUiKEgxs7um7vLVnm+25QDHpEdGNQyDQgLmP4/OrO
 3rjlpeJjDuOMrspod+9wNK1m0sqpU0I0qMUxqdqBvW1eQ7zeYej5NhV/4+6eMVHT
 Lb/Rbxl7PPln69rhZ8uTdSOK51OcLfUoptpw+fts6KWjaIG9VBgltygSnYh7sxk1
 g+qfFZyRyLEEQu7XCFRGCo5uDPoWLvi0XBhSotW94evSpV4/F5lB/ZTBq/E8bsc3
 v4Na0njg2VGwqC/K7KEa1JJ40+L8QqNolgI+Tvm68d5mgU06HEIKsUdlj+wXwmbu
 tMlqCtOLEfPtnO5MI9LJpyUJfJ/gbT3YUyejNfD0b75w9JnIkf0yXu1CgKDJ4bip
 czZUn/+xxpQoJ+gsc1c6gLgEjm7mL4tHb5dvPL/hYA//BFw/nww7hVY4Wr08Hz3l
 vk2QKJQYUwThXxPXhwfyYO9ItHeVJX3GYuTSEfjaZN/xqWTeTnBfpvq7A5lwdOAl
 SGbescaOvzIRas3x0FWIJVF35Glwx7vOU6OyQsCTcZR0B4/hRkKtvAcJ2WRFvBrf
 QpHfsBBUtaY8Ors=
 =B1mh
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20230914' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
 "A relatively small SELinux patch to fix an issue with a
  vfs/LSM/SELinux patch that went upstream during the recent merge
  window.

  The short version is that the original patch changed how we
  initialized mount options to resolve a NFS issue and we inadvertently
  broke a use case due to the changed behavior.

  The fix restores this behavior for the cases that require it while
  keeping the original NFS fix in place"

* tag 'selinux-pr-20230914' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix handling of empty opts in selinux_fs_context_submount()
2023-09-15 12:38:44 -07:00
Linus Torvalds 82210979f3 RISC-V Fixes for 6.6-rc2
* A fix to align kexec'd kernels to PMD boundries.
 * The T-Head dcache.cva encoding was incorrect, it has been fixed to
   invalidate all caches (as opposed to just the L1).
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmUEdAsTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiRGqD/9D3uDELPxiODfpmEnocQp+l+2oa566
 R3YkIdTxatuVwakfoVNe3eHiyt2Buy4eKMs8Ymg//vQUUsISZFBF3ni7Wii7Bzwb
 MfhMZz/8IHF+Qzwgvu9vz/t5mVRHMgdMQ3nG/sKYxCSVo9soUw6HqjidF4CfK8ty
 6/gbvfn7zE7IUZi2flMrSZ03vfz7GYEoXgiydcQi5umNslzM0Fbi7WLM7Z9DHyeU
 f9WWU2A4U1ZFaiKJVs6B5oiT7BRluVAp+OSMbFd5S4AEURY6j8gWw2vHPs1DIRg0
 lX4lYtImXupor7U9z6lUwMuDJfQ1MVBrrd3DaPEepK4FuzggJ66IlCX/PqHva4dD
 aPMpJfeEIjuJgkxAdKPKQNvk1S1M3TPEyeCTUt8mv8dXAcQjqv5N2m90j28pfJcl
 7HMe6axVI8Tas21c6qMyJmSnKatJRJYEceR7+RS4B8Gu3IWS3qKJHYoRNNGM5bSz
 1XtmFa7SbDREkhR6oD1nL7ufxLu94Sdam/dU28Hq6Eg4T8GQdHi0/HwZePY2kd6+
 QZY8gHx/4348ej7HLKijAGVTdR2hO0bgPBXjG/p+NsWMW9MN207vqjfUxpPNJYOv
 miKSQC5qH3sTbEG4Vf28tEgKc5XBKyYmreFuNjoEqMfKmI5R1Azjs2HCc7cPcvws
 wn3ykSPr6JrnmA==
 =ES2t
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix to align kexec'd kernels to PMD boundries

 - The T-Head dcache.cva encoding was incorrect, it has been fixed to
   invalidate all caches (as opposed to just the L1)

* tag 'riscv-for-linus-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: errata: fix T-Head dcache.cva encoding
  riscv: kexec: Align the kexeced kernel entry
2023-09-15 12:33:01 -07:00
Takashi Sakamoto 3c70de9b58 Revert "firewire: core: obsolete usage of GFP_ATOMIC at building node tree"
This reverts commit 06f45435d9.

John Ogness reports the case that the allocation is in atomic context under
acquired spin-lock.

[   12.555784] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
[   12.555808] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 70, name: kworker/1:2
[   12.555814] preempt_count: 1, expected: 0
[   12.555820] INFO: lockdep is turned off.
[   12.555824] irq event stamp: 208
[   12.555828] hardirqs last  enabled at (207): [<c00000000111e414>] ._raw_spin_unlock_irq+0x44/0x80
[   12.555850] hardirqs last disabled at (208): [<c00000000110ff94>] .__schedule+0x854/0xfe0
[   12.555859] softirqs last  enabled at (188): [<c000000000f73504>] .addrconf_verify_rtnl+0x2c4/0xb70
[   12.555872] softirqs last disabled at (182): [<c000000000f732b0>] .addrconf_verify_rtnl+0x70/0xb70
[   12.555884] CPU: 1 PID: 70 Comm: kworker/1:2 Tainted: G S                 6.6.0-rc1 #1
[   12.555893] Hardware name: PowerMac7,2 PPC970 0x390202 PowerMac
[   12.555898] Workqueue: firewire_ohci .bus_reset_work [firewire_ohci]
[   12.555939] Call Trace:
[   12.555944] [c000000009677830] [c0000000010d83c0] .dump_stack_lvl+0x8c/0xd0 (unreliable)
[   12.555963] [c0000000096778b0] [c000000000140270] .__might_resched+0x320/0x340
[   12.555978] [c000000009677940] [c000000000497600] .__kmem_cache_alloc_node+0x390/0x460
[   12.555993] [c000000009677a10] [c0000000003fe620] .__kmalloc+0x70/0x310
[   12.556007] [c000000009677ac0] [c0003d00004e2268] .fw_core_handle_bus_reset+0x2c8/0xba0 [firewire_core]
[   12.556060] [c000000009677c20] [c0003d0000491190] .bus_reset_work+0x330/0x9b0 [firewire_ohci]
[   12.556079] [c000000009677d10] [c00000000011d0d0] .process_one_work+0x280/0x6f0
[   12.556094] [c000000009677e10] [c00000000011d8a0] .worker_thread+0x360/0x500
[   12.556107] [c000000009677ef0] [c00000000012e3b4] .kthread+0x154/0x160
[   12.556120] [c000000009677f90] [c00000000000bfa8] .start_kernel_thread+0x10/0x14

Cc: stable@kernel.org
Reported-by: John Ogness <john.ogness@linutronix.de>
Link: https://lore.kernel.org/lkml/87jzsuv1xk.fsf@jogness.linutronix.de/raw
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2023-09-15 18:37:52 +09:00
Lukas Wunner cccd328165 panic: Reenable preemption in WARN slowpath
Commit:

  5a5d7e9bad ("cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG")

amended warn_slowpath_fmt() to disable preemption until the WARN splat
has been emitted.

However the commit neglected to reenable preemption in the !fmt codepath,
i.e. when a WARN splat is emitted without additional format string.

One consequence is that users may see more splats than intended.  E.g. a
WARN splat emitted in a work item results in at least two extra splats:

  BUG: workqueue leaked lock or atomic
  (emitted by process_one_work())

  BUG: scheduling while atomic
  (emitted by worker_thread() -> schedule())

Ironically the point of the commit was to *avoid* extra splats. ;)

Fix it.

Fixes: 5a5d7e9bad ("cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/3ec48fde01e4ee6505f77908ba351bad200ae3d1.1694763684.git.lukas@wunner.de
2023-09-15 11:28:08 +02:00
Steve French 2c75426c1f smb3: fix some minor typos and repeated words
Minor cleanup pointed out by checkpatch (repeated words, missing blank
lines) in smb2pdu.c and old header location referred to in transport.c

Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-15 01:37:33 -05:00
Steve French ebc3d4e44a smb3: correct places where ENOTSUPP is used instead of preferred EOPNOTSUPP
checkpatch flagged a few places with:
     WARNING: ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP
Also fixed minor typo

Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-15 01:10:40 -05:00
Damien Le Moal e3da4c401f ata: pata_parport: Fix code style issues
Fix indentation and other code style issues in the comm.c file.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309150646.n3iBvbPj-lkp@intel.com/
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-15 11:37:30 +09:00
Szuying Chen 737dd811a3 ata: libahci: clear pending interrupt status
When a CRC error occurs, the HBA asserts an interrupt to indicate an
interface fatal error (PxIS.IFS). The ISR clears PxIE and PxIS, then
does error recovery. If the adapter receives another SDB FIS
with an error (PxIS.TFES) from the device before the start of the EH
recovery process, the interrupt signaling the new SDB cannot be
serviced as PxIE was cleared already. This in turn results in the HBA
inability to issue any command during the error recovery process after
setting PxCMD.ST to 1 because PxIS.TFES is still set.

According to AHCI 1.3.1 specifications section 6.2.2, fatal errors
notified by setting PxIS.HBFS, PxIS.HBDS, PxIS.IFS or PxIS.TFES will
cause the HBA to enter the ERR:Fatal state. In this state, the HBA
shall not issue any new commands.

To avoid this situation, introduce the function
ahci_port_clear_pending_irq() to clear pending interrupts before
executing a COMRESET. This follows the AHCI 1.3.1 - section 6.2.2.2
specification.

Signed-off-by: Szuying Chen <Chloe_Chen@asmedia.com.tw>
Fixes: e0bfd14997 ("[PATCH] ahci: stop engine during hard reset")
Cc: stable@vger.kernel.org
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-09-15 11:34:31 +09:00
Dave Airlie c3c9acb8b2 Short summary of fixes pull:
* radeon: Uninterruptible fence waiting
  * tests: Fix use-after-free bug
  * vkms: Revert hrtimer fix
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmUC+9wACgkQaA3BHVML
 eiOhFgf6A72+7tKe2XIioVJ/oOtU+6rFnFybexyIgJ6Yx7BB1ZoOn2dM6birfSld
 vhR87x6OaOEhZsi1bThg0lmzhHXRSRedipE26NmJApPmrXYBE1fFQzvUhN/rMC+M
 Yp0EM03iYT5IS1zwQR7P+G96ZaarQDd31ODCfR5+KfTMr8fVGmYsA4wh4UNNNgQg
 YXjpCrZ48N733upca3S/LgmkJAuR0hViS9DiD1bT40KZ40Rc8OIWbIhWy24chbgM
 /OvHFI/qplawVum9x0ltWAsv1OzOGIGPGaQg+4FuHLIgXntLRgB6H3rPHj0TEful
 c6vzBO61BrVRHMyRJSxioNHzp3NKag==
 =DlGq
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2023-09-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Short summary of fixes pull:

 * radeon: Uninterruptible fence waiting
 * tests: Fix use-after-free bug
 * vkms: Revert hrtimer fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230914122649.GA28252@linux-uq9g
2023-09-15 12:13:01 +10:00
Dave Airlie c6fbd2b0ca - Only check eDP HPD when AUX CH is shared.
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmUC/fQACgkQ+mJfZA7r
 E8opGQf9EK0NS4kbREACCnLwZDHhZteTil6DysJzG4bydMBWa04Dbb7bsSW8hMNB
 zb+N2Gne6U3qOdD/lH9ZStSRCT3F9ZN94w/yTfErc76D3mfEQyy40wx+2cnKYEld
 0Rm8W3j/dXkC7uHKkUWwfPSq0mx+oA1WWYCrHo4BQDTbfJpRxFWzuZoDbIREtF4L
 YoQamuaEa8Y3Zt0YF2rNg31qKyQaUR0Wrxo/y0y8IEZ9lK1zwX1vasrb8uLIIzoK
 u+8VfQx56AA+TJJ/1Tg5FxnC8dBEO/R/QRPjbK4s4UxoGpIlC1DwLJK4XoQfrdoC
 EsKbsSkHa/yMaEiersW6Qw3gvoj/nA==
 =aYeP
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-fixes-2023-09-14' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Only check eDP HPD when AUX CH is shared.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZQL+NqtIZH5F/Nxr@intel.com
2023-09-15 10:12:24 +10:00
Dave Airlie 1216d49178 amd-drm-fixes-6.6-2023-09-13:
amdgpu:
 - GC 9.4.3 fixes
 - Fix white screen issues with S/G display on system with >= 64G of ram
 - Replay fixes
 - SMU 13.0.6 fixes
 - AUX backlight fix
 - NBIO 4.3 SR-IOV fixes for HDP
 - RAS fixes
 - DP MST resume fix
 - Fix segfault on systems with no vbios
 - DPIA fixes
 
 amdkfd:
 - CWSR grace period fix
 - Unaligned doorbell fix
 - CRIU fix for GFX11
 - Add missing TLB flush on gfx10 and newer
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZQIRSAAKCRC93/aFa7yZ
 2O/nAP4zB0fdLB46Hhz11aYsE9Zghe91b2rcmF4EYpEAQs7awwEAhSjy0Wiy6EYb
 prEGCdW0O8Tq7fdjr7+JrPmF7dasAQk=
 =SUbg
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-fixes-6.6-2023-09-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.6-2023-09-13:

amdgpu:
- GC 9.4.3 fixes
- Fix white screen issues with S/G display on system with >= 64G of ram
- Replay fixes
- SMU 13.0.6 fixes
- AUX backlight fix
- NBIO 4.3 SR-IOV fixes for HDP
- RAS fixes
- DP MST resume fix
- Fix segfault on systems with no vbios
- DPIA fixes

amdkfd:
- CWSR grace period fix
- Unaligned doorbell fix
- CRIU fix for GFX11
- Add missing TLB flush on gfx10 and newer

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230913195009.7714-1-alexander.deucher@amd.com
2023-09-15 09:50:50 +10:00
Jens Axboe c266ae774e nvme fixes for Linux 6.6
- nvme-tcp iov len fix (Varun)
  - nvme-hwmon const qualifier for safety (Krzysztof)
  - nvme-fc null pointer checks (Nigel)
  - nvme-pci no numa node fix (Pratyush)
  - nvme timeout fix for non-compliant controllers (Keith)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE3Fbyvv+648XNRdHTPe3zGtjzRgkFAmUDakkACgkQPe3zGtjz
 RgnSyA/+MR9dvwaa9ADcmWtzKaFnpxOy24Yr0CQ97oqfryv89ZVHSOT9gzQGdhNO
 m3Z3hJ72nU+mjjy8/Ixh0+AdSNLOD6sY9ohQQ/7jev+yGjOaSZnQWXpH6pionfuW
 DCfQweFq8VggCpT+wVinnzs7C4S/mhuuNHunefDAa+A7R6jeJENhridbzillokfi
 semNz6GSKcIMBsaQ85JDlTyh+nGfva7KwLVdEzA0g00XjxMw159bBjqPFtm/Dwde
 Jt280wVNQHEa+Jo8CVgAbYvFAhgAmwt5cqr8d22PisVVR324s7aMUf++BQcslmzX
 5WyVTvr3nvKH5uAWtvKJiDV8VNhs6WM3CuCfOOqK2fOaQ6PN18In08y9aGPDQ73e
 3xkkgDdRlJFa2jae48msIY2KnysJoqPJgymEtvHuqbQNYifE1jym39Ml5soWCol2
 a0aEVobGTceeEb0D7MWdVFntS6HmdFrZcinmSGcgj34IAYCYaguHK8ZSPNovJnAq
 /27YDkfXnt3+8ZgbxhZjxxToH2xCaa/uC842Ujh3KTROqKbppw8YBv/2VoLNxGNh
 Kqlu2vApIbrNP19HuCp/oDu10Q+6KPJZY1qqT68ndLl47CYAXEG0XbyB13I4hdwL
 YUues45a9f9X+Q0cHoJ9Hr4rTLfF6dm5gNv0X1cUdyLNTcB1FbE=
 =WTY0
 -----END PGP SIGNATURE-----

Merge tag 'nvme-6.6-2023-09-14' of git://git.infradead.org/nvme into block-6.6

Pull NVMe fixes from Keith:

"nvme fixes for Linux 6.6

 - nvme-tcp iov len fix (Varun)
 - nvme-hwmon const qualifier for safety (Krzysztof)
 - nvme-fc null pointer checks (Nigel)
 - nvme-pci no numa node fix (Pratyush)
 - nvme timeout fix for non-compliant controllers (Keith)"

* tag 'nvme-6.6-2023-09-14' of git://git.infradead.org/nvme:
  nvme: avoid bogus CRTO values
  nvme-pci: do not set the NUMA node of device if it has none
  nvme-fc: Prevent null pointer dereference in nvme_fc_io_getuuid()
  nvme: host: hwmon: constify pointers to hwmon_channel_info
  nvmet-tcp: pass iov_len instead of sg->length to bvec_set_page()
2023-09-14 16:20:31 -06:00
Keith Busch 6cc834ba62 nvme: avoid bogus CRTO values
Some devices are reporting controller ready mode support, but return 0
for CRTO. These devices require a much higher time to ready than that,
so they are failing to initialize after the driver starter preferring
that value over CAP.TO.

The spec requires that CAP.TO match the appropritate CRTO value, or be
set to 0xff if CRTO is larger than that. This means that CAP.TO can be
used to validate if CRTO is reliable, and provides an appropriate
fallback for setting the timeout value if not. Use whichever is larger.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217863
Reported-by: Cláudio Sampaio <patola@gmail.com>
Reported-by: Felix Yan <felixonmars@archlinux.org>
Tested-by: Felix Yan <felixonmars@archlinux.org>
Based-on-a-patch-by: Felix Yan <felixonmars@archlinux.org>
Cc: stable@vger.kernel.org
Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-09-14 13:09:52 -07:00
Rafael J. Wysocki fb2c10245f thermal: core: Fix disabled trip point check in handle_thermal_trip()
Commit bc840ea5f9 ("thermal: core: Do not handle trip points with
invalid temperature") added a check for invalid temperature to the
disabled trip point check in handle_thermal_trip(), but that check was
added at a point when the trip structure has not been initialized yet.

This may cause handle_thermal_trip() to skip a valid trip point in some
cases, so fix it by moving the check to a suitable place, after
__thermal_zone_get_trip() has been called to populate the trip
structure.

Fixes: bc840ea5f9 ("thermal: core: Do not handle trip points with invalid temperature")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-14 21:51:49 +02:00
Jens Axboe 29ee7a4a57 Merge tag 'md-fixes-20230914' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-6.6
Pull MD fixes from Song:

"These commits fix a bugzilla report [1] and some recent issues in 6.5
 and 6.6.

 [1] https://bugzilla.kernel.org/show_bug.cgi?id=217798"

* tag 'md-fixes-20230914' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md: Put the right device in md_seq_next
  md/raid1: fix error: ISO C90 forbids mixed declarations
  md: fix warning for holder mismatch from export_rdev()
  md: don't dereference mddev after export_rdev()
2023-09-14 12:30:47 -06:00
Michal Kubecek 552c5013f2 kbuild: avoid long argument lists in make modules_install
Running "make modules_install" may fail with

  make[2]: execvp: /bin/sh: Argument list too long

if many modules are built and INSTALL_MOD_PATH is long. This is because
scripts/Makefile.modinst creates all directories with one mkdir command.
Use $(foreach ...) instead to prevent an excessive argument list.

Fixes: 2dfec887c0 ("kbuild: reduce the number of mkdir calls during modules_install")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-09-15 02:39:24 +09:00
Masahiro Yamada c86e9ae5e3 kbuild: fix kernel-devel RPM package and linux-headers Deb package
Since commit fe66b5d2ae ("kbuild: refactor kernel-devel RPM package
and linux-headers Deb package"), the kernel-devel RPM package and
linux-headers Deb package are broken.

I double-quoted the $(find ... -type d), which resulted in newlines
being included in the argument to the outer find comment.

  find: 'arch/arm64/include\narch/arm64/kvm/hyp/include': No such file or directory

The outer find command is unneeded.

Fixes: fe66b5d2ae ("kbuild: refactor kernel-devel RPM package and linux-headers Deb package")
Reported-by: Karolis M <k4rolis@protonmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
2023-09-15 02:39:24 +09:00
Mariusz Tkaczyk c8870379a2 md: Put the right device in md_seq_next
If there are multiple arrays in system and one mddevice is marked
with MD_DELETED and md_seq_next() is called in the middle of removal
then it _get()s proper device but it may _put() deleted one. As a result,
active counter may never be zeroed for mddevice and it cannot
be removed.

Put the device which has been _get with previous md_seq_next() call.

Cc: stable@vger.kernel.org
Fixes: 12a6caf273 ("md: only delete entries from all_mddevs when the disk is freed")
Reported-by: AceLan Kao <acelan@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217798
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230914152416.10819-1-mariusz.tkaczyk@linux.intel.com
2023-09-14 10:13:11 -07:00
Linus Torvalds 9fdfb15a3d Networking fixes for 6.6-rc2.
Current release - regressions:
 
   - bcmasp: fix possible OOB write in bcmasp_netfilt_get_all_active()
 
 Previous releases - regressions:
 
   - ipv4: fix one memleak in __inet_del_ifa()
 
   - tcp: fix bind() regressions for v4-mapped-v6 addresses.
 
   - tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()
 
   - dsa: fixes for SJA1105 FDB regressions
 
   - veth: update XDP feature set when bringing up device
 
   - igb: fix hangup when enabling SR-IOV
 
 Previous releases - always broken:
 
   - kcm: fix memory leak in error path of kcm_sendmsg()
 
   - smc: fix data corruption in smcr_port_add
 
   - microchip: fix possible memory leak for vcap_dup_rule()
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmUC2MoSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkYbYP/2L0I4IH6kkNxj0iJsK0Qhz0NU2OFY+d
 1X4JXNx0CzyLhFiy/ewodiP5R/NSnCbg3tssvrYKvyu2iu/hpA/WDK19Ny8doQ2T
 azgDs2tQUdYWrZojiSHBp0I4dEPraDpud+6d7uegvIbekk30WKBnqUy1ERQ6PUip
 M/dF6R6AW0j3CWiTrpmK69dNYaYPa8OnyQsyKNLuqSoefQ2wbGAmOOzlmMTTLY35
 jt/Vyc6/Dh9uw9Jf+Fg2Cf7IRhS/Q5Om+96oGv3NleVfe488Ykask/asybSbouGc
 0g4SUdOaeqEAzwqLMcHW9xXOTjzlygNjO7UUEKta0vqKVoEttR+P2srwZt+YYpGe
 GzbJBByNU2VqwaR3lozNVyKLtk/2F8v2WnosjwQAoj6R8seORDphXtjb+nD532zC
 vK4uxv0i2Au4M83lJQnCiJh5dwLkbDOjiArXHHJk3YWCGhl3x81URXOjxPjvE/20
 E9xR1K/4RYfKuKPcVnKfoDhAgs9d+J5/jz99AVI2O/xW0SkVUPjDznRBOk0+TTIW
 z4OfVJSNYjNytyG6ypKvE+JvezH4lw32s1rviOGrxcoSqUzqf+hZeZgOaAvADEVH
 VL+XKZtli/AX3MPUiFh1N3lUm6CGW98fVZxl5KUQk1w8VwyDqhXG6J4YKDNWqsJI
 4B3CliFV4iKF
 =vSvh
 -----END PGP SIGNATURE-----

Merge tag 'net-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Quite unusually, this does not contains any fix coming from subtrees
  (nf, ebpf, wifi, etc).

  Current release - regressions:

   - bcmasp: fix possible OOB write in bcmasp_netfilt_get_all_active()

  Previous releases - regressions:

   - ipv4: fix one memleak in __inet_del_ifa()

   - tcp: fix bind() regressions for v4-mapped-v6 addresses.

   - tls: do not free tls_rec on async operation in
     bpf_exec_tx_verdict()

   - dsa: fixes for SJA1105 FDB regressions

   - veth: update XDP feature set when bringing up device

   - igb: fix hangup when enabling SR-IOV

  Previous releases - always broken:

   - kcm: fix memory leak in error path of kcm_sendmsg()

   - smc: fix data corruption in smcr_port_add

   - microchip: fix possible memory leak for vcap_dup_rule()"

* tag 'net-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().
  net: renesas: rswitch: Add spin lock protection for irq {un}mask
  net: renesas: rswitch: Fix unmasking irq condition
  igb: clean up in all error paths when enabling SR-IOV
  ixgbe: fix timestamp configuration code
  selftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c.
  selftest: tcp: Move expected_errno into each test case in bind_wildcard.c.
  selftest: tcp: Fix address length in bind_wildcard.c.
  tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
  tcp: Fix bind() regression for v4-mapped-v6 wildcard address.
  tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).
  ipv6: fix ip6_sock_set_addr_preferences() typo
  veth: Update XDP feature set when bringing up device
  net: macb: fix sleep inside spinlock
  net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()
  net: ethernet: mtk_eth_soc: fix pse_port configuration for MT7988
  net: ethernet: mtk_eth_soc: fix uninitialized variable
  kcm: Fix memory leak in error path of kcm_sendmsg()
  r8152: check budget for r8152_poll()
  net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset
  ...
2023-09-14 10:03:34 -07:00
Pavel Begunkov c21a8027ad io_uring/net: fix iter retargeting for selected buf
When using selected buffer feature, io_uring delays data iter setup
until later. If io_setup_async_msg() is called before that it might see
not correctly setup iterator. Pre-init nr_segs and judge from its state
whether we repointing.

Cc: stable@vger.kernel.org
Reported-by: syzbot+a4c6e5ef999b68b26ed1@syzkaller.appspotmail.com
Fixes: 0455d4ccec ("io_uring: add POLL_FIRST support for send/sendmsg and recv/recvmsg")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0000000000002770be06053c7757@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-09-14 10:12:55 -06:00
Shida Zhang 7fda67e8c3 ext4: fix rec_len verify error
With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
a problem occurred when more than 13 million files were directly created
under a directory:

EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt?  Run e2fsck -D.
EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum

When enough files are created, the fake_dirent->reclen will be 0xffff.
it doesn't equal to the blocksize 65536, i.e. 0x10000.

But it is not the same condition when blocksize equals to 4k.
when enough files are created, the fake_dirent->reclen will be 0x1000.
it equals to the blocksize 4k, i.e. 0x1000.

The problem seems to be related to the limitation of the 16-bit field
when the blocksize is set to 64k.
To address this, helpers like ext4_rec_len_{from,to}_disk has already
been introduced to complete the conversion between the encoded and the
plain form of rec_len.

So fix this one by using the helper, and all the other in this file too.

Cc: stable@kernel.org
Fixes: dbe8944404 ("ext4: Calculate and verify checksums for htree nodes")
Suggested-by: Andreas Dilger <adilger@dilger.ca>
Suggested-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20230803060938.1929759-1-zhangshida@kylinos.cn
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-09-14 12:07:07 -04:00
Jan Kara 5229a658f6 ext4: do not let fstrim block system suspend
Len Brown has reported that system suspend sometimes fail due to
inability to freeze a task working in ext4_trim_fs() for one minute.
Trimming a large filesystem on a disk that slowly processes discard
requests can indeed take a long time. Since discard is just an advisory
call, it is perfectly fine to interrupt it at any time and the return
number of discarded blocks until that moment. Do that when we detect the
task is being frozen.

Cc: stable@kernel.org
Reported-by: Len Brown <lenb@kernel.org>
Suggested-by: Dave Chinner <david@fromorbit.com>
References: https://bugzilla.kernel.org/show_bug.cgi?id=216322
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230913150504.9054-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-09-14 12:06:59 -04:00
Jan Kara 45e4ab320c ext4: move setting of trimmed bit into ext4_try_to_trim_range()
Currently we set the group's trimmed bit in ext4_trim_all_free() based
on return value of ext4_try_to_trim_range(). However when we will want
to abort trimming because of suspend attempt, we want to return success
from ext4_try_to_trim_range() but not set the trimmed bit. Instead
implementing awkward propagation of this information, just move setting
of trimmed bit into ext4_try_to_trim_range() when the whole group is
trimmed.

Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230913150504.9054-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-09-14 12:06:47 -04:00
Li Zetao 1bb0763f1e jbd2: Fix memory leak in journal_init_common()
There is a memory leak reported by kmemleak:

  unreferenced object 0xff11000105903b80 (size 64):
    comm "mount", pid 3382, jiffies 4295032021 (age 27.826s)
    hex dump (first 32 bytes):
      04 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
      ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<ffffffffae86ac40>] __kmalloc_node+0x50/0x160
      [<ffffffffaf2486d8>] crypto_alloc_tfmmem.isra.0+0x38/0x110
      [<ffffffffaf2498e5>] crypto_create_tfm_node+0x85/0x2f0
      [<ffffffffaf24a92c>] crypto_alloc_tfm_node+0xfc/0x210
      [<ffffffffaedde777>] journal_init_common+0x727/0x1ad0
      [<ffffffffaede1715>] jbd2_journal_init_inode+0x2b5/0x500
      [<ffffffffaed786b5>] ext4_load_and_init_journal+0x255/0x2440
      [<ffffffffaed8b423>] ext4_fill_super+0x8823/0xa330
      ...

The root cause was traced to an error handing path in journal_init_common()
when malloc memory failed in register_shrinker(). The checksum driver is
used to reference to checksum algorithm via cryptoapi and the user should
release the memory when the driver is no longer needed or the journal
initialization failed.

Fix it by calling crypto_free_shash() on the "err_cleanup" error handing
path in journal_init_common().

Fixes: c30713084b ("jbd2: move load_superblock() into journal_init_common()")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20230911025138.983101-1-lizetao1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-09-14 12:06:22 -04:00
Mikulas Patocka f6007dce0c dm: fix a race condition in retrieve_deps
There's a race condition in the multipath target when retrieve_deps
races with multipath_message calling dm_get_device and dm_put_device.
retrieve_deps walks the list of open devices without holding any lock
but multipath may add or remove devices to the list while it is
running. The end result may be memory corruption or use-after-free
memory access.

See this description of a UAF with multipath_message():
https://listman.redhat.com/archives/dm-devel/2022-October/052373.html

Fix this bug by introducing a new rw semaphore "devices_lock". We grab
devices_lock for read in retrieve_deps and we grab it for write in
dm_get_device and dm_put_device.

Reported-by: Luo Meng <luomeng12@huawei.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Tested-by: Li Lingfeng <lilingfeng3@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-09-14 11:18:29 -04:00
Daniel Vetter 15794f9dc3 One doc fix for drm/connector, one fix for amdgpu for an crash when
VRAM usage is high, and one fix in gm12u320 to fix the timeout units in
 the code
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZPl/TAAKCRDj7w1vZxhR
 xWZZAP0b3k5vIuQdbiZBdXy7+guakiJ2DqOMxJJ+sYS5Mun53AEA73Cu1gmBNMoT
 d8H1uBjOfvPcXANNI0t0OgJfrESOdg8=
 =atPC
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2023-09-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

One doc fix for drm/connector, one fix for amdgpu for an crash when
VRAM usage is high, and one fix in gm12u320 to fix the timeout units in
the code

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/w5nlld5ukeh6bgtljsxmkex3e7s7f4qquuqkv5lv4cv3uxzwqr@pgokpejfsyef
2023-09-14 14:00:51 +02:00
Thomas Hellström 139a27854b
drm/tests: helpers: Avoid a driver uaf
when using __drm_kunit_helper_alloc_drm_device() the driver may be
dereferenced by device-managed resources up until the device is
freed, which is typically later than the kunit-managed resource code
frees it. Fix this by simply make the driver device-managed as well.

In short, the sequence leading to the UAF is as follows:

INIT:
Code allocates a struct device as a kunit-managed resource.
Code allocates a drm driver as a kunit-managed resource.
Code allocates a drm device as a device-managed resource.

EXIT:
Kunit resource cleanup frees the drm driver
Kunit resource cleanup puts the struct device, which starts a
      device-managed resource cleanup
device-managed cleanup calls drm_dev_put()
drm_dev_put() dereferences the (now freed) drm driver -> Boom.

Related KASAN message:
[55272.551542] ==================================================================
[55272.551551] BUG: KASAN: slab-use-after-free in drm_dev_put.part.0+0xd4/0xe0 [drm]
[55272.551603] Read of size 8 at addr ffff888127502828 by task kunit_try_catch/10353

[55272.551612] CPU: 4 PID: 10353 Comm: kunit_try_catch Tainted: G     U           N 6.5.0-rc7+ #155
[55272.551620] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
[55272.551626] Call Trace:
[55272.551629]  <TASK>
[55272.551633]  dump_stack_lvl+0x57/0x90
[55272.551639]  print_report+0xcf/0x630
[55272.551645]  ? _raw_spin_lock_irqsave+0x5f/0x70
[55272.551652]  ? drm_dev_put.part.0+0xd4/0xe0 [drm]
[55272.551694]  kasan_report+0xd7/0x110
[55272.551699]  ? drm_dev_put.part.0+0xd4/0xe0 [drm]
[55272.551742]  drm_dev_put.part.0+0xd4/0xe0 [drm]
[55272.551783]  devres_release_all+0x15d/0x1f0
[55272.551790]  ? __pfx_devres_release_all+0x10/0x10
[55272.551797]  device_unbind_cleanup+0x16/0x1a0
[55272.551802]  device_release_driver_internal+0x3e5/0x540
[55272.551808]  ? kobject_put+0x5d/0x4b0
[55272.551814]  bus_remove_device+0x1f1/0x3f0
[55272.551819]  device_del+0x342/0x910
[55272.551826]  ? __pfx_device_del+0x10/0x10
[55272.551830]  ? lock_release+0x339/0x5e0
[55272.551836]  ? kunit_remove_resource+0x128/0x290 [kunit]
[55272.551845]  ? __pfx_lock_release+0x10/0x10
[55272.551851]  platform_device_del.part.0+0x1f/0x1e0
[55272.551856]  ? _raw_spin_unlock_irqrestore+0x30/0x60
[55272.551863]  kunit_remove_resource+0x195/0x290 [kunit]
[55272.551871]  ? _raw_spin_unlock_irqrestore+0x30/0x60
[55272.551877]  kunit_cleanup+0x78/0x120 [kunit]
[55272.551885]  ? __kthread_parkme+0xc1/0x1f0
[55272.551891]  ? __pfx_kunit_try_run_case_cleanup+0x10/0x10 [kunit]
[55272.551900]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [kunit]
[55272.551909]  kunit_generic_run_threadfn_adapter+0x4a/0x90 [kunit]
[55272.551919]  kthread+0x2e7/0x3c0
[55272.551924]  ? __pfx_kthread+0x10/0x10
[55272.551929]  ret_from_fork+0x2d/0x70
[55272.551935]  ? __pfx_kthread+0x10/0x10
[55272.551940]  ret_from_fork_asm+0x1b/0x30
[55272.551948]  </TASK>

[55272.551953] Allocated by task 10351:
[55272.551956]  kasan_save_stack+0x1c/0x40
[55272.551962]  kasan_set_track+0x21/0x30
[55272.551966]  __kasan_kmalloc+0x8b/0x90
[55272.551970]  __kmalloc+0x5e/0x160
[55272.551976]  kunit_kmalloc_array+0x1c/0x50 [kunit]
[55272.551984]  drm_exec_test_init+0xfa/0x2c0 [drm_exec_test]
[55272.551991]  kunit_try_run_case+0xdd/0x250 [kunit]
[55272.551999]  kunit_generic_run_threadfn_adapter+0x4a/0x90 [kunit]
[55272.552008]  kthread+0x2e7/0x3c0
[55272.552012]  ret_from_fork+0x2d/0x70
[55272.552017]  ret_from_fork_asm+0x1b/0x30

[55272.552024] Freed by task 10353:
[55272.552027]  kasan_save_stack+0x1c/0x40
[55272.552032]  kasan_set_track+0x21/0x30
[55272.552036]  kasan_save_free_info+0x27/0x40
[55272.552041]  __kasan_slab_free+0x106/0x180
[55272.552046]  slab_free_freelist_hook+0xb3/0x160
[55272.552051]  __kmem_cache_free+0xb2/0x290
[55272.552056]  kunit_remove_resource+0x195/0x290 [kunit]
[55272.552064]  kunit_cleanup+0x78/0x120 [kunit]
[55272.552072]  kunit_generic_run_threadfn_adapter+0x4a/0x90 [kunit]
[55272.552080]  kthread+0x2e7/0x3c0
[55272.552085]  ret_from_fork+0x2d/0x70
[55272.552089]  ret_from_fork_asm+0x1b/0x30

[55272.552096] The buggy address belongs to the object at ffff888127502800
                which belongs to the cache kmalloc-512 of size 512
[55272.552105] The buggy address is located 40 bytes inside of
                freed 512-byte region [ffff888127502800, ffff888127502a00)

[55272.552115] The buggy address belongs to the physical page:
[55272.552119] page:00000000af6c70ff refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x127500
[55272.552127] head:00000000af6c70ff order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[55272.552133] anon flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
[55272.552141] page_type: 0xffffffff()
[55272.552145] raw: 0017ffffc0010200 ffff888100042c80 0000000000000000 dead000000000001
[55272.552152] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[55272.552157] page dumped because: kasan: bad access detected

[55272.552163] Memory state around the buggy address:
[55272.552167]  ffff888127502700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[55272.552173]  ffff888127502780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[55272.552178] >ffff888127502800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[55272.552184]                                   ^
[55272.552187]  ffff888127502880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[55272.552193]  ffff888127502900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[55272.552198] ==================================================================
[55272.552203] Disabling lock debugging due to kernel taint

v2:
- Update commit message, add Fixes: tag and Cc stable.
v3:
- Further commit message updates (Maxime Ripard).

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: stable@vger.kernel.org # v6.3+
Fixes: d987803107 ("drm/tests: helpers: Allow to pass a custom drm_driver")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20230907135339.7971-2-thomas.hellstrom@linux.intel.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2023-09-14 13:57:58 +02:00
Maíra Canal 7908632f29
Revert "drm/vkms: Fix race-condition between the hrtimer and the atomic commit"
This reverts commit a0e6a017ab.

Unlocking a mutex in the context of a hrtimer callback is violating mutex
locking rules, as mutex_unlock() from interrupt context is not permitted.

Link: https://lore.kernel.org/dri-devel/ZQLAc%2FFwkv%2FGiVoK@phenom.ffwll.local/T/#t
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mairacanal@riseup.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20230914102024.1789154-1-mcanal@igalia.com
2023-09-14 07:48:19 -03:00
Kuniyuki Iwashima a22730b1b4 kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().
syzkaller found a memory leak in kcm_sendmsg(), and commit c821a88bd7
("kcm: Fix memory leak in error path of kcm_sendmsg()") suppressed it by
updating kcm_tx_msg(head)->last_skb if partial data is copied so that the
following sendmsg() will resume from the skb.

However, we cannot know how many bytes were copied when we get the error.
Thus, we could mess up the MSG_MORE queue.

When kcm_sendmsg() fails for SOCK_DGRAM, we should purge the queue as we
do so for UDP by udp_flush_pending_frames().

Even without this change, when the error occurred, the following sendmsg()
resumed from a wrong skb and the queue was messed up.  However, we have
yet to get such a report, and only syzkaller stumbled on it.  So, this
can be changed safely.

Note this does not change SOCK_SEQPACKET behaviour.

Fixes: c821a88bd7 ("kcm: Fix memory leak in error path of kcm_sendmsg()")
Fixes: ab7ac4eb98 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230912022753.33327-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14 10:43:51 +02:00
Paolo Abeni 96f7dc6934 Merge branch 'net-renesas-rswitch-fix-a-lot-of-redundant-irq-issue'
Yoshihiro Shimoda says:

====================
net: renesas: rswitch: Fix a lot of redundant irq issue

After this patch series was applied, a lot of redundant interrupts
no longer occur.

For example: when "iperf3 -c <ipaddr> -R" on R-Car S4-8 Spider
 Before the patches are applied: about 800,000 times happened
 After the patches were applied: about 100,000 times happened
====================

Link: https://lore.kernel.org/r/20230912014936.3175430-1-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14 10:26:43 +02:00
Yoshihiro Shimoda c4f922e86c net: renesas: rswitch: Add spin lock protection for irq {un}mask
Add spin lock protection for irq {un}mask registers' control.

After napi_complete_done() and this protection were applied,
a lot of redundant interrupts no longer occur.

For example: when "iperf3 -c <ipaddr> -R" on R-Car S4-8 Spider
 Before the patches are applied: about 800,000 times happened
 After the patches were applied: about 100,000 times happened

Fixes: 3590918b5d ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14 10:26:41 +02:00
Yoshihiro Shimoda e7b1ef2942 net: renesas: rswitch: Fix unmasking irq condition
Fix unmasking irq condition by using napi_complete_done(). Otherwise,
redundant interrupts happen.

Fixes: 3590918b5d ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-14 10:26:40 +02:00
Justin Tee dae40be7a1 scsi: lpfc: Prevent use-after-free during rmmod with mapped NVMe rports
During rmmod, when dev_loss_tmo callback is called, an ndlp kref count is
decremented twice.  Once for SCSI transport registration and second to
remove the initial node allocation kref.  If there is also an NVMe
transport registration, another reference count decrement is expected in
lpfc_nvme_unregister_port().

Race conditions between the NVMe transport remoteport_delete and
dev_loss_tmo callbacks sometimes results in premature ndlp object release
resulting in use-after-free issues.

Fix by not dropping the ndlp object in dev_loss_tmo callback with an
outstanding NVMe transport registration.  Inversely, mark the final
NLP_DROPPED flag in lpfc_nvme_unregister_port when rmmod flag is set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230908211923.37603-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:51:16 -04:00
Justin Tee 9c3034968e scsi: lpfc: Early return after marking final NLP_DROPPED flag in dev_loss_tmo
When a dev_loss_tmo event occurs, an ndlp lock is taken before checking
nlp_flag for NLP_DROPPED.  There is an attempt to restore the ndlp lock
when exiting the if statement, but the nlp_put kref could be the final
decrement causing a use-after-free memory access on a released ndlp object.

Instead of trying to reacquire the ndlp lock after checking nlp_flag, just
return after calling nlp_put.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230908211852.37576-1-justintee8345@gmail.com
Reviewed-by: "Ewan D. Milne" <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:49:34 -04:00
Jinjie Ruan 7dcc683db3 scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()
Since debugfs_create_file() returns ERR_PTR and never NULL, use IS_ERR() to
check the return value.

Fixes: 2fcbc569b9 ("scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI")
Fixes: 4c47efc140 ("scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures")
Fixes: 6a828b0f61 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues")
Fixes: 95bfc6d8ad ("scsi: lpfc: Make FW logging dynamically configurable")
Fixes: 9f77870870 ("scsi: lpfc: Add debugfs support for cm framework buffers")
Fixes: c490850a09 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20230906030809.2847970-1-ruanjinjie@huawei.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:48:36 -04:00
David Disseldorp d14e3e553e scsi: target: core: Fix target_cmd_counter leak
The target_cmd_counter struct allocated via target_alloc_cmd_counter() is
never freed, resulting in leaks across various transport types, e.g.:

 unreferenced object 0xffff88801f920120 (size 96):
  comm "sh", pid 102, jiffies 4294892535 (age 713.412s)
  hex dump (first 32 bytes):
    07 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 38 01 92 1f 80 88 ff ff  ........8.......
  backtrace:
    [<00000000e58a6252>] kmalloc_trace+0x11/0x20
    [<0000000043af4b2f>] target_alloc_cmd_counter+0x17/0x90 [target_core_mod]
    [<000000007da2dfa7>] target_setup_session+0x2d/0x140 [target_core_mod]
    [<0000000068feef86>] tcm_loop_tpg_nexus_store+0x19b/0x350 [tcm_loop]
    [<000000006a80e021>] configfs_write_iter+0xb1/0x120
    [<00000000e9f4d860>] vfs_write+0x2e4/0x3c0
    [<000000008143433b>] ksys_write+0x80/0xb0
    [<00000000a7df29b2>] do_syscall_64+0x42/0x90
    [<0000000053f45fb8>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Free the structure alongside the corresponding iscsit_conn / se_sess
parent.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Link: https://lore.kernel.org/r/20230831183459.6938-1-ddiss@suse.de
Fixes: becd9be606 ("scsi: target: Move sess cmd counter to new struct")
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:09:56 -04:00
Damien Le Moal c91774818b scsi: pm8001: Setup IRQs on resume
The function pm8001_pci_resume() only calls pm8001_request_irq() without
calling pm8001_setup_irq(). This causes the IRQ allocation to fail, which
leads all drives being removed from the system.

Fix this issue by integrating the code for pm8001_setup_irq() directly
inside pm8001_request_irq() so that MSI-X setup is performed both during
normal initialization and resume operations.

Fixes: dbf9bfe615 ("[SCSI] pm8001: add SAS/SATA HBA driver")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20230911232745.325149-2-dlemoal@kernel.org
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:08:40 -04:00
Michal Grzedzicki c13e733174 scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command
Tags allocated for OPC_INB_SET_CONTROLLER_CONFIG command need to be freed
when we receive the response.

Signed-off-by: Michal Grzedzicki <mge@meta.com>
Link: https://lore.kernel.org/r/20230911170340.699533-2-mge@meta.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:04:23 -04:00
Michal Grzedzicki 71996bb835 scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command
Some cards have more than one SAS address. Using an incorrect address
causes communication issues with some devices like expanders.

Closes: https://lore.kernel.org/linux-kernel/A57AEA84-5CA0-403E-8053-106033C73C70@fb.com/
Signed-off-by: Michal Grzedzicki <mge@meta.com>
Link: https://lore.kernel.org/r/20230913155611.3183612-1-mge@meta.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 20:04:22 -04:00
Martin K. Petersen 4f6cee6045 Merge branch '6.6/scsi-staging' into 6.6/scsi-fixes
Pull in staged fixes for 6.6.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-09-13 19:50:47 -04:00
Linus Torvalds aed8aee111 This pull-request renames the genpd subsystem to pmdomain.
Ideally "pmdomain" should give a better hint of the purpose of the
 subsystem.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAmUBgPQXHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCn/FhAAkyfY0hQUXL6X18Gnj5qadq9/
 h7TmzaHNw3amhAdri+KkDs742Pz0sZpnspOqUcddBjOj41CYbawmAerkvKG4DtkF
 z04YeYY0Ynrdc0p6NxrZhQCX9IiXntubr8Iwb9uoaBJu1rDv8gC1SYg3k0yA9dMx
 eBwcbwrZh6Jt4vRNelU39Vulrpv98ywTrBB/9BSki3EqAWvf8JAx3ddRivQ+6oPK
 +AyJPsdBBiB6Sj/ggY8j7wk4xXNp3x06oiWc6+ufjYPQ7PeuTuVlyaktD/CteVL8
 Kt0ssvQp7ISgE4cIZ8/kTLCRZHPi8fJi9rDYbf9tQkZU+meWqyc17AjV4kMAaqrj
 O3SF/OkIb+97nlq4Iv5WXkX8AtO5ELxPU3IH0Tuei+LNhD8Wvgmfw67/6yobAG70
 qPMNVkx2vqrEGqFDqKUzA9n94wppmgHVCHNYTg3mS/ns75RArebECSqP7cQiWwC0
 MLuRRQ77Q/YR9eri87jP+2eyJnu2w0gIgkR7OhqvqwLuLkWcu0QpyIuW2Xdb8vaa
 e3X0Ap6tmmjI1qAEKH9sMA69RJVaZKPXV/OCAlD4fY0NOGP1OwDYfXFeITuS0WAR
 UGUMS7Z/TWHF9J0YOGz9jLMSXhmS2vGEEzVNFvQXh+LgIVWPhC9ciTSgbINFrJG9
 wdyU5YZ4av8FDntIQfw=
 =VibR
 -----END PGP SIGNATURE-----

Merge tag 'pmdomain-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm

Pull genpm / pmdomain rename from Ulf Hansson:
 "This renames the genpd subsystem to pmdomain.

  As discussed on LKML, using 'genpd' as the name of a subsystem isn't
  very self-explanatory and the acronym itself that means Generic PM
  Domain, is known only by a limited group of people.

  The suggestion to improve the situation is to rename the subsystem to
  'pmdomain', which there seems to be a good consensus around using.

  Ideally it should indicate that its purpose is to manage Power Domains
  or 'PM domains' as we often also use within the Linux Kernel
  terminology"

* tag 'pmdomain-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: Rename the genpd subsystem to pmdomain
2023-09-13 14:18:19 -07:00
Linus Torvalds 23f108dc9e Hi,
This pull request contains a critical fix for my previous pull request.
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iIgEABYIADAWIQRE6pSOnaBC00OEHEIaerohdGur0gUCZQDFZxIcamFya2tvQGtl
 cm5lbC5vcmcACgkQGnq6IXRrq9JtiQD/ewB17tLbjoAzZSRjUTgHwfjLSOkLUE0R
 xyNSsbnOIQEBANLFJ9/23HBHN7hL/mhdcovTX4eWdjvUAzZHoQtasV4E
 =huwc
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fix from Jarkko Sakkinen.

* tag 'tpmdd-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: Fix typo in tpmrm class definition
2023-09-13 11:44:20 -07:00
Linus Torvalds 847165d7c8 parisc architecture fixes and enhancements for kernel v6.6-rc2:
* fix reference to exported symbols for parisc64 [Masahiro Yamada]
 * Block-TLB (BTLB) support on 32-bit CPUs
 * sparse and build-warning fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZQDAAwAKCRD3ErUQojoP
 X5wDAP4jxLxuVnUCpV5hUdFoJC5lkRM2LigbWDSfDQGHaycr0QD+NerBYX8Ejo6n
 x0zHqqtBBe1fgU0QfRdwHeE7hlOiigI=
 =FDci
 -----END PGP SIGNATURE-----

Merge tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:

 - fix reference to exported symbols for parisc64 [Masahiro Yamada]

 - Block-TLB (BTLB) support on 32-bit CPUs

 - sparse and build-warning fixes

* tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  linux/export: fix reference to exported functions for parisc64
  parisc: BTLB: Initialize BTLB tables at CPU startup
  parisc: firmware: Simplify calling non-PA20 functions
  parisc: BTLB: _edata symbol has to be page aligned for BTLB support
  parisc: BTLB: Add BTLB insert and purge firmware function wrappers
  parisc: BTLB: Clear possibly existing BTLB entries
  parisc: Prepare for Block-TLB support on 32-bit kernel
  parisc: shmparam.h: Document aliasing requirements of PA-RISC
  parisc: irq: Make irq_stack_union static to avoid sparse warning
  parisc: drivers: Fix sparse warning
  parisc: iosapic.c: Fix sparse warnings
  parisc: ccio-dma: Fix sparse warnings
  parisc: sba-iommu: Fix sparse warnigs
  parisc: sba: Fix compile warning wrt list of SBA devices
  parisc: sba_iommu: Fix build warning if procfs if disabled
2023-09-13 11:35:53 -07:00
Linus Torvalds 99214f6778 Tracing fixes for 6.6:
- Add missing LOCKDOWN checks for eventfs callers
   When LOCKDOWN is active for tracing, it causes inconsistent state
   when some functions succeed and others fail.
 
 - Use dput() to free the top level eventfs descriptor
   There was a race between accesses and freeing it.
 
 - Fix a long standing bug that eventfs exposed due to changing timings
   by dynamically creating files. That is, If a event file is opened
   for an instance, there's nothing preventing the instance from being
   removed which will make accessing the files cause use-after-free bugs.
 
 - Fix a ring buffer race that happens when iterating over the ring
   buffer while writers are active. Check to make sure not to read
   the event meta data if it's beyond the end of the ring buffer sub buffer.
 
 - Fix the print trigger that disappeared because the test to create it
   was looking for the event dir field being filled, but now it has the
   "ef" field filled for the eventfs structure.
 
 - Remove the unused "dir" field from the event structure.
 
 - Fix the order of the trace_dynamic_info as it had it backwards for the
   offset and len fields for which one was for which endianess.
 
 - Fix NULL pointer dereference with eventfs_remove_rec()
   If an allocation fails in one of the eventfs_add_*() functions,
   the caller of it in event_subsystem_dir() or event_create_dir()
   assigns the result to the structure. But it's assigning the ERR_PTR
   and not NULL. This was passed to eventfs_remove_rec() which expects
   either a good pointer or a NULL, not ERR_PTR. The fix is to not
   assign the ERR_PTR to the structure, but to keep it NULL on error.
 
 - Fix list_for_each_rcu() to use list_for_each_srcu() in
   dcache_dir_open_wrapper(). One iteration of the code used RCU
   but because it had to call sleepable code, it had to be changed
   to use SRCU, but one of the iterations was missed.
 
 - Fix synthetic event print function to use "as_u64" instead of
   passing in a pointer to the union. To fix big/little endian issues,
   the u64 that represented several types was turned into a union to
   define the types properly.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZQCvoBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qtgrAP9MiYiCMU+90oJ+61DFchbs3y7BNidP
 s3lLRDUMJ935NQD/SSAm54PqWb+YXMpD7m9+3781l6xqwfabBMXNaEl+FwA=
 =tlZu
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Add missing LOCKDOWN checks for eventfs callers

   When LOCKDOWN is active for tracing, it causes inconsistent state
   when some functions succeed and others fail.

 - Use dput() to free the top level eventfs descriptor

   There was a race between accesses and freeing it.

 - Fix a long standing bug that eventfs exposed due to changing timings
   by dynamically creating files. That is, If a event file is opened for
   an instance, there's nothing preventing the instance from being
   removed which will make accessing the files cause use-after-free
   bugs.

 - Fix a ring buffer race that happens when iterating over the ring
   buffer while writers are active. Check to make sure not to read the
   event meta data if it's beyond the end of the ring buffer sub buffer.

 - Fix the print trigger that disappeared because the test to create it
   was looking for the event dir field being filled, but now it has the
   "ef" field filled for the eventfs structure.

 - Remove the unused "dir" field from the event structure.

 - Fix the order of the trace_dynamic_info as it had it backwards for
   the offset and len fields for which one was for which endianess.

 - Fix NULL pointer dereference with eventfs_remove_rec()

   If an allocation fails in one of the eventfs_add_*() functions, the
   caller of it in event_subsystem_dir() or event_create_dir() assigns
   the result to the structure. But it's assigning the ERR_PTR and not
   NULL. This was passed to eventfs_remove_rec() which expects either a
   good pointer or a NULL, not ERR_PTR. The fix is to not assign the
   ERR_PTR to the structure, but to keep it NULL on error.

 - Fix list_for_each_rcu() to use list_for_each_srcu() in
   dcache_dir_open_wrapper(). One iteration of the code used RCU but
   because it had to call sleepable code, it had to be changed to use
   SRCU, but one of the iterations was missed.

 - Fix synthetic event print function to use "as_u64" instead of passing
   in a pointer to the union. To fix big/little endian issues, the u64
   that represented several types was turned into a union to define the
   types properly.

* tag 'trace-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  eventfs: Fix the NULL pointer dereference bug in eventfs_remove_rec()
  tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()
  tracing/synthetic: Print out u64 values properly
  tracing/synthetic: Fix order of struct trace_dynamic_info
  selftests/ftrace: Fix dependencies for some of the synthetic event tests
  tracing: Remove unused trace_event_file dir field
  tracing: Use the new eventfs descriptor for print trigger
  ring-buffer: Do not attempt to read past "commit"
  tracefs/eventfs: Free top level files on removal
  ring-buffer: Avoid softlockup in ring_buffer_resize()
  tracing: Have event inject files inc the trace array ref count
  tracing: Have option files inc the trace array ref count
  tracing: Have current_trace inc the trace array ref count
  tracing: Have tracing_max_latency inc the trace array ref count
  tracing: Increase trace array ref count on enable and filter files
  tracefs/eventfs: Use dput to free the toplevel events directory
  tracefs/eventfs: Add missing lockdown checks
  tracefs: Add missing lockdown check to tracefs_create_dir()
2023-09-13 11:30:11 -07:00
Namjae Jeon 59d8d24f46 ksmbd: fix passing freed memory 'aux_payload_buf'
The patch e2b76ab8b5c9: "ksmbd: add support for read compound" leads
to the following Smatch static checker warning:

  fs/smb/server/smb2pdu.c:6329 smb2_read()
        warn: passing freed memory 'aux_payload_buf'

It doesn't matter that we're passing a freed variable because nbytes is
zero. This patch set "aux_payload_buf = NULL" to make smatch silence.

Fixes: e2b76ab8b5 ("ksmbd: add support for read compound")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-13 10:21:05 -05:00
Namjae Jeon e4e14095cc ksmbd: remove unneeded mark_inode_dirty in set_info_sec()
mark_inode_dirty will be called in notify_change().
This patch remove unneeded mark_inode_dirty in set_info_sec().

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-13 10:21:05 -05:00
Ricardo Neri 108af4b4bd x86/sched: Restore the SD_ASYM_PACKING flag in the DIE domain
Commit 8f2d6c41e5 ("x86/sched: Rewrite topology setup") dropped the
SD_ASYM_PACKING flag in the DIE domain added in commit 044f0e27de
("x86/sched: Add the SD_ASYM_PACKING flag to the die domain of hybrid
processors"). Restore it on hybrid processors.

The die-level domain does not depend on any build configuration and now
x86_sched_itmt_flags() is always needed. Remove the build dependency on
CONFIG_SCHED_[SMT|CLUSTER|MC].

Fixes: 8f2d6c41e5 ("x86/sched: Rewrite topology setup")
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Caleb Callaway <caleb.callaway@intel.com>
Link: https://lkml.kernel.org/r/20230815035747.11529-1-ricardo.neri-calderon@linux.intel.com
2023-09-13 15:03:18 +02:00
Tim Chen 450e749707 sched/fair: Fix SMT4 group_smt_balance handling
For SMT4, any group with more than 2 tasks will be marked as
group_smt_balance. Retain the behaviour of group_has_spare by marking
the busiest group as the group which has the least number of idle_cpus.

Also, handle rounding effect of adding (ncores_local + ncores_busy) when
the local is fully idle and busy group imbalance is less than 2 tasks.
Local group should try to pull at least 1 task in this case so imbalance
should be set to 2 instead.

Fixes: fee1759e4f ("sched/fair: Determine active load balance for SMT sched groups")
Acked-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/6cd1633036bb6b651af575c32c2a9608a106702c.camel@linux.intel.com
2023-09-13 15:03:06 +02:00
Corinna Vinschen bc6ed2fa24 igb: clean up in all error paths when enabling SR-IOV
After commit 50f303496d ("igb: Enable SR-IOV after reinit"), removing
the igb module could hang or crash (depending on the machine) when the
module has been loaded with the max_vfs parameter set to some value != 0.

In case of one test machine with a dual port 82580, this hang occurred:

[  232.480687] igb 0000:41:00.1: removed PHC on enp65s0f1
[  233.093257] igb 0000:41:00.1: IOV Disabled
[  233.329969] pcieport 0000:40:01.0: AER: Multiple Uncorrected (Non-Fatal) err0
[  233.340302] igb 0000:41:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fata)
[  233.352248] igb 0000:41:00.0:   device [8086:1516] error status/mask=00100000
[  233.361088] igb 0000:41:00.0:    [20] UnsupReq               (First)
[  233.368183] igb 0000:41:00.0: AER:   TLP Header: 40000001 0000040f cdbfc00c c
[  233.376846] igb 0000:41:00.1: PCIe Bus Error: severity=Uncorrected (Non-Fata)
[  233.388779] igb 0000:41:00.1:   device [8086:1516] error status/mask=00100000
[  233.397629] igb 0000:41:00.1:    [20] UnsupReq               (First)
[  233.404736] igb 0000:41:00.1: AER:   TLP Header: 40000001 0000040f cdbfc00c c
[  233.538214] pci 0000:41:00.1: AER: can't recover (no error_detected callback)
[  233.538401] igb 0000:41:00.0: removed PHC on enp65s0f0
[  233.546197] pcieport 0000:40:01.0: AER: device recovery failed
[  234.157244] igb 0000:41:00.0: IOV Disabled
[  371.619705] INFO: task irq/35-aerdrv:257 blocked for more than 122 seconds.
[  371.627489]       Not tainted 6.4.0-dirty #2
[  371.632257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this.
[  371.641000] task:irq/35-aerdrv   state:D stack:0     pid:257   ppid:2      f0
[  371.650330] Call Trace:
[  371.653061]  <TASK>
[  371.655407]  __schedule+0x20e/0x660
[  371.659313]  schedule+0x5a/0xd0
[  371.662824]  schedule_preempt_disabled+0x11/0x20
[  371.667983]  __mutex_lock.constprop.0+0x372/0x6c0
[  371.673237]  ? __pfx_aer_root_reset+0x10/0x10
[  371.678105]  report_error_detected+0x25/0x1c0
[  371.682974]  ? __pfx_report_normal_detected+0x10/0x10
[  371.688618]  pci_walk_bus+0x72/0x90
[  371.692519]  pcie_do_recovery+0xb2/0x330
[  371.696899]  aer_process_err_devices+0x117/0x170
[  371.702055]  aer_isr+0x1c0/0x1e0
[  371.705661]  ? __set_cpus_allowed_ptr+0x54/0xa0
[  371.710723]  ? __pfx_irq_thread_fn+0x10/0x10
[  371.715496]  irq_thread_fn+0x20/0x60
[  371.719491]  irq_thread+0xe6/0x1b0
[  371.723291]  ? __pfx_irq_thread_dtor+0x10/0x10
[  371.728255]  ? __pfx_irq_thread+0x10/0x10
[  371.732731]  kthread+0xe2/0x110
[  371.736243]  ? __pfx_kthread+0x10/0x10
[  371.740430]  ret_from_fork+0x2c/0x50
[  371.744428]  </TASK>

The reproducer was a simple script:

  #!/bin/sh
  for i in `seq 1 5`; do
    modprobe -rv igb
    modprobe -v igb max_vfs=1
    sleep 1
    modprobe -rv igb
  done

It turned out that this could only be reproduce on 82580 (quad and
dual-port), but not on 82576, i350 and i210.  Further debugging showed
that igb_enable_sriov()'s call to pci_enable_sriov() is failing, because
dev->is_physfn is 0 on 82580.

Prior to commit 50f303496d ("igb: Enable SR-IOV after reinit"),
igb_enable_sriov() jumped into the "err_out" cleanup branch.  After this
commit it only returned the error code.

So the cleanup didn't take place, and the incorrect VF setup in the
igb_adapter structure fooled the igb driver into assuming that VFs have
been set up where no VF actually existed.

Fix this problem by cleaning up again if pci_enable_sriov() fails.

Fixes: 50f303496d ("igb: Enable SR-IOV after reinit")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-13 12:24:29 +01:00