Commit graph

1251184 commits

Author SHA1 Message Date
Linus Torvalds
7187ea0978 drm fixes for 6.8-rc7
buddy:
 - two allocation fixes + unit test
 
 fbcon:
 - font restore syzkaller fix
 
 ttm:
 - kunit test fix
 
 bridge:
 - fix aux-hpd leaks
 - fix aux-hpd registration
 - fix use after free in soc/qcom
 - fix boot on soc/qcom
 
 xe:
 - A couple of tracepoint updates from Priyanka and Lucas.
 - Make sure BINDs are completed before accepting UNBINDs on LR vms.
 - Don't arbitrarily restrict max number of batched binds.
 - Add uapi for dumpable bos (agreed on IRC).
 - Remove unused uapi flags and a leftover comment.
 - A couple of fixes related to the execlist backend.
 
 msm:
 - DP: Revert a change which was causing a HDP regression
 
 amdgpu:
 - Fix potential buffer overflow
 - Fix power min cap
 - Suspend/resume fix
 - SI PM fix
 - eDP fix
 
 nouveau:
 - fix a misreported VRAM sizing
 - fix a regression in suspend/resume due to freeing
 
 tegra:
 - host1x reset fix
 - only remove existing driver if display is possible
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmXhaX0ACgkQDHTzWXnE
 hr4DtA//VzJEzhfnPlss4kgD7Io5vvH+We9eoSi0lkIrL+mHVJYQOveB/yw8KMJZ
 p6qGYi4Q/bTLyoDS2tL7oP5HfU5ZbzLG2DBFJzBo5Y6dQRhhgWsKGQs/zhlSrm9S
 k8F1lR6DVL3rZd084FIY7GlSlGCj9h2GdmpmvbJ9R0rjFEu2E0WE3aatEVTFCj/Y
 Vy8rjwrrST+cj6kefEGoRO+awmQCJrxIp2TmygB8vWZwlkmCGK+MmzOpJa2Ei5g2
 jLFzMoxN7jIVQzdF/Z3hI6j75lWJAK0mpEsstt9hJlgv/nQbuJyG7rcRfy8X7pBF
 qcT6YCMbEvvC8Kh580H6H9/CIDavKqWOPAaRluv9Y1zA1Q4Wmmeh3u74QNnAX8Fy
 y+PKLe7rraQYx5E6z0EiBpfyxckQlDn6W7wDemyCu4IPyFCj5Y+x3oZ5Ldu0yKPR
 meksQm4NySe0zv/Q2uxYmtR3PcWPCHprvjZeqt9RfAs8iOUYD7liDCoamDiqkoDK
 b2wJrBTrBCq/0JnJEDSd6barSvGjw8ufwOAfyrjsknKRaNSz8+bxpZ3CqlkH+bGj
 6v5Wsll8kRg5gfpnFRh7qWME7MeskfCC1fVby+9hZE8vKYDmISLWuq4yvSlyzJ6k
 15c6CgQrulYyxxJmeTV45mxhfNyv2n1gTem2DXEpM/f1HWK1mi0=
 =uYay
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2024-03-01' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Bunch of fixes, xe, amdgpu, nouveau and tegra all have a few. Then
  drm/bridge including some drivers/soc fallout fixes. The biggest thing
  in here is a new unit test for some buddy allocator fixes, otherwise a
  misc fbcon, ttm unit test and one msm revert.

  Seems pretty normal for this stage.

  buddy:
   - two allocation fixes + unit test

  fbcon:
   - font restore syzkaller fix

  ttm:
   - kunit test fix

  bridge:
   - fix aux-hpd leaks
   - fix aux-hpd registration
   - fix use after free in soc/qcom
   - fix boot on soc/qcom

  xe:
   - A couple of tracepoint updates from Priyanka and Lucas
   - Make sure BINDs are completed before accepting UNBINDs on LR vms
   - Don't arbitrarily restrict max number of batched binds
   - Add uapi for dumpable bos (agreed on IRC)
   - Remove unused uapi flags and a leftover comment
   - A couple of fixes related to the execlist backend

  msm:
   - DP: Revert a change which was causing a HDP regression

  amdgpu:
   - Fix potential buffer overflow
   - Fix power min cap
   - Suspend/resume fix
   - SI PM fix
   - eDP fix

  nouveau:
   - fix a misreported VRAM sizing
   - fix a regression in suspend/resume due to freeing

  tegra:
   - host1x reset fix
   - only remove existing driver if display is possible"

* tag 'drm-fixes-2024-03-01' of https://gitlab.freedesktop.org/drm/kernel: (32 commits)
  drm/nouveau: keep DMA buffers required for suspend/resume
  nouveau: report byte usage in VRAM usage
  drm/xe/xe_trace: Add move_lacks_source detail to xe_bo_move trace
  drm/xe: Deny unbinds if uapi ufence pending
  drm/xe: Expose user fence from xe_sync_entry
  drm/xe: Use pointers in trace events
  drm/xe/xe_bo_move: Enhance xe_bo_move trace
  drm/xe/mmio: fix build warning for BAR resize on 32-bit
  drm/xe: get rid of MAX_BINDS
  drm/xe: Use vmalloc for array of bind allocation in bind IOCTL
  drm/xe: Don't support execlists in xe_gt_tlb_invalidation layer
  drm/xe: Fix execlist splat
  drm/xe/uapi: Remove unused flags
  drm/xe/uapi: Remove DRM_XE_VM_BIND_FLAG_ASYNC comment left over
  drm/xe: Add uapi for dumpable bos
  drm/amd/display: Add monitor patch for specific eDP
  Revert "drm/msm/dp: use drm_bridge_hpd_notify() to report HPD status changes"
  drm/tests/drm_buddy: add alloc_range_bias test
  drm/buddy: check range allocation matches alignment
  drm/buddy: fix range bias
  ...
2024-03-01 11:23:08 -08:00
Linus Torvalds
161671a6eb Probes fixes for v6.8-rc5:
- fprobe: Fix to allocate entry_data_size buffer for each rethook
     instance. This fixes a buffer overrun bug (which leads a kernel
     crash) when fprobe user uses its entry_data in the entry_handler.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmXhIPgbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bhwIH/1h5q2ZqNwNplDGVQpWU
 G1uuRHLlt47jwbGR3gfeYqVELtX0gFigBsmVouCKK3u3qerB1pDscYhULzKeHjS4
 1HAsonj+vKY2pbdCaRnxRT7ejlEioN8CwPb1eqY6Bf6XQ2tJqS5gUqdej8JDJuY5
 tpNAhHWqAnRvf1V5muclGAIU+9zavrAjbetpgrPEDIjE5idFvN+6D+4PXiM1cRIW
 KXW1oA7VlShdfY7xprSZ33Lx7C/dLWojM2P/z/BvqyXOf4f1ovqtGFJegW4n7V5b
 ZgamgOcSBwFLTVOTpOzn0peucduLFTfEWyC7fFGkHjBxTl2JypsQLEupdoaWLvBB
 el4=
 =bUgZ
 -----END PGP SIGNATURE-----

Merge tag 'probes-fixes-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull fprobe fix from Masami Hiramatsu:

 - allocate entry_data_size buffer for each rethook instance.

   This fixes a buffer overrun bug (which leads a kernel crash)
   when fprobe user uses its entry_data in the entry_handler.

* tag 'probes-fixes-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  fprobe: Fix to allocate entry_data_size buffer with rethook instances
2024-03-01 11:17:23 -08:00
Tetsuo Handa
2f03fc340c tomoyo: fix UAF write bug in tomoyo_write_control()
Since tomoyo_write_control() updates head->write_buf when write()
of long lines is requested, we need to fetch head->write_buf after
head->io_sem is held.  Otherwise, concurrent write() requests can
cause use-after-free-write and double-free problems.

Reported-by: Sam Sun <samsun1006219@gmail.com>
Closes: https://lkml.kernel.org/r/CAEkJfYNDspuGxYx5kym8Lvp--D36CMDUErg4rxfWFJuPbbji8g@mail.gmail.com
Fixes: bd03a3e4c9 ("TOMOYO: Add policy namespace support.")
Cc:  <stable@vger.kernel.org> # Linux 3.1+
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-03-01 11:14:00 -08:00
Marc Zyngier
9a3bfb27ef KVM: arm64: Fix TRFCR_EL1/PMSCR_EL1 access in hVHE mode
When running in hVHE mode, EL1 accesses are performed with the EL12
accessor, as we run with HCR_EL2.E2H=1.

Unfortunately, both PMSCR_EL1 and TRFCR_EL1 are used with the
EL1 accessor, meaning that we actually affect the EL2 state. Duh.

Switch to using the {read,write}_sysreg_el1() helpers that will do
the right thing in all circumstances.

Note that the 'Fixes:' tag doesn't represent the point where the bug
was introduced (there is no such point), but the first practical point
where the hVHE feature is usable.

Cc: James Clark <james.clark@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Fixes: 38cba55008 ("KVM: arm64: Force HCR_E2H in guest context when ARM64_KVM_HVHE is set")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240229145417.3606279-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2024-03-01 18:53:47 +00:00
Jiri Bohac
7fd817c906 x86/e820: Don't reserve SETUP_RNG_SEED in e820
SETUP_RNG_SEED in setup_data is supplied by kexec and should
not be reserved in the e820 map.

Doing so reserves 16 bytes of RAM when booting with kexec.
(16 bytes because data->len is zeroed by parse_setup_data so only
sizeof(setup_data) is reserved.)

When kexec is used repeatedly, each boot adds two entries in the
kexec-provided e820 map as the 16-byte range splits a larger
range of usable memory. Eventually all of the 128 available entries
get used up. The next split will result in losing usable memory
as the new entries cannot be added to the e820 map.

Fixes: 68b8e9713c ("x86/setup: Use rng seeds from setup_data")
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/ZbmOjKnARGiaYBd5@dwarf.suse.cz
2024-03-01 10:27:20 -08:00
Zhangfei Gao
6384c56c99 iommu/sva: Fix SVA handle sharing in multi device case
iommu_sva_bind_device will directly goto out in multi-device
case when found existing domain, ignoring list_add handle,
which causes the handle to fail to be shared.

Fixes: 65d4418c50 ("iommu/sva: Restore SVA handle sharing")
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240227064821.128-1-zhangfei.gao@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-03-01 13:53:58 +01:00
Michael Ellerman
380cb2f4df selftests/powerpc: Fix fpu_signal failures
My recent commit e5d00aaac6 ("selftests/powerpc: Check all FPRs in
fpu_preempt") inadvertently broke the fpu_signal test.

It needs to take into account that fpu_preempt now loads 32 FPRs, so
enlarge darray.

Also use the newly added randomise_darray() to properly randomise darray.

Finally the checking done in signal_fpu_sig() needs to skip checking
f30/f31, because they are used as scratch registers in check_all_fprs(),
called by preempt_fpu(), and so could hold other values when the signal
is taken.

Fixes: e5d00aaac6 ("selftests/powerpc: Check all FPRs in fpu_preempt")
Reported-by: Spoorthy <spoorthy@linux.ibm.com>
Depends-on: 2ba107f679 ("selftests/powerpc: Generate better bit patterns for FPU tests")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240301101035.1230024-1-mpe@ellerman.id.au
2024-03-01 22:15:30 +11:00
Bartosz Golaszewski
ec5c54a9d3 gpio: fix resource unwinding order in error path
Hogs are added *after* ACPI so should be removed *before* in error path.

Fixes: a411e81e61 ("gpiolib: add hogs support for machine code")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-03-01 09:33:30 +01:00
Andy Shevchenko
e4aec4daa8 gpiolib: Fix the error path order in gpiochip_add_data_with_key()
After shuffling the code, error path wasn't updated correctly.
Fix it here.

Fixes: 2f4133bb5f ("gpiolib: No need to call gpiochip_remove_pin_ranges() twice")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-03-01 08:36:53 +01:00
Arturas Moskvinas
530b1dbd97 gpio: 74x164: Enable output pins after registers are reset
Chip outputs are enabled[1] before actual reset is performed[2] which might
cause pin output value to flip flop if previous pin value was set to 1.
Fix that behavior by making sure chip is fully reset before all outputs are
enabled.

Flip-flop can be noticed when module is removed and inserted again and one of
the pins was changed to 1 before removal. 100 microsecond flipping is
noticeable on oscilloscope (100khz SPI bus).

For a properly reset chip - output is enabled around 100 microseconds (on 100khz
SPI bus) later during probing process hence should be irrelevant behavioral
change.

Fixes: 7ebc194d0f (gpio: 74x164: Introduce 'enable-gpios' property)
Link: https://elixir.bootlin.com/linux/v6.7.4/source/drivers/gpio/gpio-74x164.c#L130 [1]
Link: https://elixir.bootlin.com/linux/v6.7.4/source/drivers/gpio/gpio-74x164.c#L150 [2]
Signed-off-by: Arturas Moskvinas <arturas.moskvinas@gmail.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-03-01 08:23:08 +01:00
Sid Pranjale
f6ecfdad35 drm/nouveau: keep DMA buffers required for suspend/resume
Nouveau deallocates a few buffers post GPU init which are required for GPU suspend/resume to function correctly.
This is likely not as big an issue on systems where the NVGPU is the only GPU, but on multi-GPU set ups it leads to a regression where the kernel module errors and results in a system-wide rendering freeze.

This commit addresses that regression by moving the two buffers required for suspend and resume to be deallocated at driver unload instead of post init.

Fixes: 042b5f8384 ("drm/nouveau: fix several DMA buffer leaks")
Signed-off-by: Sid Pranjale <sidpranjale127@protonmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-01 15:27:04 +10:00
Dave Airlie
f7916c47f6 nouveau: report byte usage in VRAM usage.
Turns out usage is always in bytes not shifted.

Fixes: 72fa02fdf8 ("nouveau: add an ioctl to report vram usage")
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-03-01 15:26:28 +10:00
Dave Airlie
f060e461ea amd-drm-fixes-6.8-2024-02-29:
amdgpu:
 - Fix potential buffer overflow
 - Fix power min cap
 - Suspend/resume fix
 - SI PM fix
 - eDP fix
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZeCgBAAKCRC93/aFa7yZ
 2JyZAPwOnMslVzIcIT9QH1IvROC1EitEAcDhQvL0mGCOPO9nBAEA3iLPrJc4RKvH
 bA0Sa+e5V0F2eC8OLPWMy/1X1p0lZA0=
 =R/MV
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-fixes-6.8-2024-02-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.8-2024-02-29:

amdgpu:
- Fix potential buffer overflow
- Fix power min cap
- Suspend/resume fix
- SI PM fix
- eDP fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229152424.6646-1-alexander.deucher@amd.com
2024-03-01 15:09:47 +10:00
Dave Airlie
bba679c06c Merge tag 'drm-msm-fixes-2024-02-28' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.8-rc7

DP:
- Revert a change which was causing a HDP regression

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvhWvHiPGQ1pRD2XPAQoHEM2M35kjhrsSAEtzh8AMSRvg@mail.gmail.com
2024-03-01 14:24:52 +10:00
Dave Airlie
aa5fe428d5 UAPI Changes:
- A couple of tracepoint updates from Priyanka and Lucas.
 - Make sure BINDs are completed before accepting UNBINDs on LR vms.
 - Don't arbitrarily restrict max number of batched binds.
 - Add uapi for dumpable bos (agreed on IRC).
 - Remove unused uapi flags and a leftover comment.
 
 Driver Changes:
 - A couple of fixes related to the execlist backend.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRskUM7w1oG5rx2IZO4FpNVCsYGvwUCZeB/nAAKCRC4FpNVCsYG
 vwU8AQCD3be2p5O/SQ5btePtB1yFv3KgL1mGir7LTuvWj7VeegEAw0+7iZe+Uscp
 XUAWl6AQQyXWxHXd4BcBAa56exMljwo=
 =Y4LX
 -----END PGP SIGNATURE-----

Merge tag 'drm-xe-fixes-2024-02-29' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

UAPI Changes:
- A couple of tracepoint updates from Priyanka and Lucas.
- Make sure BINDs are completed before accepting UNBINDs on LR vms.
- Don't arbitrarily restrict max number of batched binds.
- Add uapi for dumpable bos (agreed on IRC).
- Remove unused uapi flags and a leftover comment.

Driver Changes:
- A couple of fixes related to the execlist backend.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZeCBg4MA2hd1oggN@fedora
2024-03-01 13:46:59 +10:00
Dave Airlie
45046af3d0 A reset fix for host1x, a resource leak fix and a probe fix for aux-hpd,
a use-after-free fix and a boot fix for a pmic_glink qcom driver in
 drivers/soc, a fix for the simpledrm/tegra transition, a kunit fix for
 the TTM tests, a font handling fix for fbcon, two allocation fixes and a
 kunit test to cover them for drm/buddy
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZeCIDAAKCRDj7w1vZxhR
 xSYQAP9KTXlKqw9p/jFw/MFqBxzvwmi4/M5iUReoajPo1uCeCAD/ZY71qqBKCnrx
 LiLHbQgvzWowxyq2A6fS28Ml7Vb5nQo=
 =FZcr
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2024-02-29' of https://anongit.freedesktop.org/git/drm/drm-misc into drm-fixes

A reset fix for host1x, a resource leak fix and a probe fix for aux-hpd,
a use-after-free fix and a boot fix for a pmic_glink qcom driver in
drivers/soc, a fix for the simpledrm/tegra transition, a kunit fix for
the TTM tests, a font handling fix for fbcon, two allocation fixes and a
kunit test to cover them for drm/buddy

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229-angelic-adorable-teal-fbfabb@houat
2024-03-01 13:13:06 +10:00
Masami Hiramatsu (Google)
6572786006 fprobe: Fix to allocate entry_data_size buffer with rethook instances
Fix to allocate fprobe::entry_data_size buffer with rethook instances.
If fprobe doesn't allocate entry_data_size buffer for each rethook instance,
fprobe entry handler can cause a buffer overrun when storing entry data in
entry handler.

Link: https://lore.kernel.org/all/170920576727.107552.638161246679734051.stgit@devnote2/

Reported-by: Jiri Olsa <olsajiri@gmail.com>
Closes: https://lore.kernel.org/all/Zd9eBn2FTQzYyg7L@krava/
Fixes: 4bbd934556 ("kprobes: kretprobe scalability improvement")
Cc: stable@vger.kernel.org
Tested-by: Jiri Olsa <olsajiri@gmail.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-03-01 09:18:24 +09:00
Filipe Manana
e2b54eaf28 btrfs: fix double free of anonymous device after snapshot creation failure
When creating a snapshot we may do a double free of an anonymous device
in case there's an error committing the transaction. The second free may
result in freeing an anonymous device number that was allocated by some
other subsystem in the kernel or another btrfs filesystem.

The steps that lead to this:

1) At ioctl.c:create_snapshot() we allocate an anonymous device number
   and assign it to pending_snapshot->anon_dev;

2) Then we call btrfs_commit_transaction() and end up at
   transaction.c:create_pending_snapshot();

3) There we call btrfs_get_new_fs_root() and pass it the anonymous device
   number stored in pending_snapshot->anon_dev;

4) btrfs_get_new_fs_root() frees that anonymous device number because
   btrfs_lookup_fs_root() returned a root - someone else did a lookup
   of the new root already, which could some task doing backref walking;

5) After that some error happens in the transaction commit path, and at
   ioctl.c:create_snapshot() we jump to the 'fail' label, and after
   that we free again the same anonymous device number, which in the
   meanwhile may have been reallocated somewhere else, because
   pending_snapshot->anon_dev still has the same value as in step 1.

Recently syzbot ran into this and reported the following trace:

  ------------[ cut here ]------------
  ida_free called for id=51 which is not allocated.
  WARNING: CPU: 1 PID: 31038 at lib/idr.c:525 ida_free+0x370/0x420 lib/idr.c:525
  Modules linked in:
  CPU: 1 PID: 31038 Comm: syz-executor.2 Not tainted 6.8.0-rc4-syzkaller-00410-gc02197fc9076 #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
  RIP: 0010:ida_free+0x370/0x420 lib/idr.c:525
  Code: 10 42 80 3c 28 (...)
  RSP: 0018:ffffc90015a67300 EFLAGS: 00010246
  RAX: be5130472f5dd000 RBX: 0000000000000033 RCX: 0000000000040000
  RDX: ffffc90009a7a000 RSI: 000000000003ffff RDI: 0000000000040000
  RBP: ffffc90015a673f0 R08: ffffffff81577992 R09: 1ffff92002b4cdb4
  R10: dffffc0000000000 R11: fffff52002b4cdb5 R12: 0000000000000246
  R13: dffffc0000000000 R14: ffffffff8e256b80 R15: 0000000000000246
  FS:  00007fca3f4b46c0(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f167a17b978 CR3: 000000001ed26000 CR4: 0000000000350ef0
  Call Trace:
   <TASK>
   btrfs_get_root_ref+0xa48/0xaf0 fs/btrfs/disk-io.c:1346
   create_pending_snapshot+0xff2/0x2bc0 fs/btrfs/transaction.c:1837
   create_pending_snapshots+0x195/0x1d0 fs/btrfs/transaction.c:1931
   btrfs_commit_transaction+0xf1c/0x3740 fs/btrfs/transaction.c:2404
   create_snapshot+0x507/0x880 fs/btrfs/ioctl.c:848
   btrfs_mksubvol+0x5d0/0x750 fs/btrfs/ioctl.c:998
   btrfs_mksnapshot+0xb5/0xf0 fs/btrfs/ioctl.c:1044
   __btrfs_ioctl_snap_create+0x387/0x4b0 fs/btrfs/ioctl.c:1306
   btrfs_ioctl_snap_create_v2+0x1ca/0x400 fs/btrfs/ioctl.c:1393
   btrfs_ioctl+0xa74/0xd40
   vfs_ioctl fs/ioctl.c:51 [inline]
   __do_sys_ioctl fs/ioctl.c:871 [inline]
   __se_sys_ioctl+0xfe/0x170 fs/ioctl.c:857
   do_syscall_64+0xfb/0x240
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7fca3e67dda9
  Code: 28 00 00 00 (...)
  RSP: 002b:00007fca3f4b40c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007fca3e7abf80 RCX: 00007fca3e67dda9
  RDX: 00000000200005c0 RSI: 0000000050009417 RDI: 0000000000000003
  RBP: 00007fca3e6ca47a R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  R13: 000000000000000b R14: 00007fca3e7abf80 R15: 00007fff6bf95658
   </TASK>

Where we get an explicit message where we attempt to free an anonymous
device number that is not currently allocated. It happens in a different
code path from the example below, at btrfs_get_root_ref(), so this change
may not fix the case triggered by syzbot.

To fix at least the code path from the example above, change
btrfs_get_root_ref() and its callers to receive a dev_t pointer argument
for the anonymous device number, so that in case it frees the number, it
also resets it to 0, so that up in the call chain we don't attempt to do
the double free.

CC: stable@vger.kernel.org # 5.10+
Link: https://lore.kernel.org/linux-btrfs/000000000000f673a1061202f630@google.com/
Fixes: e03ee2fe87 ("btrfs: do not ASSERT() if the newly created subvolume already got read")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-02-29 22:34:11 +01:00
Filipe Manana
418b090277 btrfs: ensure fiemap doesn't race with writes when FIEMAP_FLAG_SYNC is given
When FIEMAP_FLAG_SYNC is given to fiemap the expectation is that that
are no concurrent writes and we get a stable view of the inode's extent
layout.

When the flag is given we flush all IO (and wait for ordered extents to
complete) and then lock the inode in shared mode, however that leaves open
the possibility that a write might happen right after the flushing and
before locking the inode. So fix this by flushing again after locking the
inode - we leave the initial flushing before locking the inode to avoid
holding the lock and blocking other RO operations while waiting for IO
and ordered extents to complete. The second flushing while holding the
inode's lock will most of the time do nothing or very little since the
time window for new writes to have happened is small.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-02-29 22:34:07 +01:00
Filipe Manana
a1a4a9ca77 btrfs: fix race between ordered extent completion and fiemap
For fiemap we recently stopped locking the target extent range for the
whole duration of the fiemap call, in order to avoid a deadlock in a
scenario where the fiemap buffer happens to be a memory mapped range of
the same file. This use case is very unlikely to be useful in practice but
it may be triggered by fuzz testing (syzbot, etc).

However by not locking the target extent range for the whole duration of
the fiemap call we can race with an ordered extent. This happens like
this:

1) The fiemap task finishes processing a file extent item that covers
   the file range [512K, 1M[, and that file extent item is the last item
   in the leaf currently being processed;

2) And ordered extent for the file range [768K, 2M[, in COW mode,
   completes (btrfs_finish_one_ordered()) and the file extent item
   covering the range [512K, 1M[ is trimmed to cover the range
   [512K, 768K[ and then a new file extent item for the range [768K, 2M[
   is inserted in the inode's subvolume tree;

3) The fiemap task calls fiemap_next_leaf_item(), which then calls
   btrfs_next_leaf() to find the next leaf / item. This finds that the
   the next key following the one we previously processed (its type is
   BTRFS_EXTENT_DATA_KEY and its offset is 512K), is the key corresponding
   to the new file extent item inserted by the ordered extent, which has
   a type of BTRFS_EXTENT_DATA_KEY and an offset of 768K;

4) Later the fiemap code ends up at emit_fiemap_extent() and triggers
   the warning:

      if (cache->offset + cache->len > offset) {
               WARN_ON(1);
               return -EINVAL;
      }

   Since we get 1M > 768K, because the previously emitted entry for the
   old extent covering the file range [512K, 1M[ ends at an offset that
   is greater than the new extent's start offset (768K). This makes fiemap
   fail with -EINVAL besides triggering the warning that produces a stack
   trace like the following:

     [1621.677651] ------------[ cut here ]------------
     [1621.677656] WARNING: CPU: 1 PID: 204366 at fs/btrfs/extent_io.c:2492 emit_fiemap_extent+0x84/0x90 [btrfs]
     [1621.677899] Modules linked in: btrfs blake2b_generic (...)
     [1621.677951] CPU: 1 PID: 204366 Comm: pool Not tainted 6.8.0-rc5-btrfs-next-151+ #1
     [1621.677954] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
     [1621.677956] RIP: 0010:emit_fiemap_extent+0x84/0x90 [btrfs]
     [1621.678033] Code: 2b 4c 89 63 (...)
     [1621.678035] RSP: 0018:ffffab16089ffd20 EFLAGS: 00010206
     [1621.678037] RAX: 00000000004fa000 RBX: ffffab16089ffe08 RCX: 0000000000009000
     [1621.678039] RDX: 00000000004f9000 RSI: 00000000004f1000 RDI: ffffab16089ffe90
     [1621.678040] RBP: 00000000004f9000 R08: 0000000000001000 R09: 0000000000000000
     [1621.678041] R10: 0000000000000000 R11: 0000000000001000 R12: 0000000041d78000
     [1621.678043] R13: 0000000000001000 R14: 0000000000000000 R15: ffff9434f0b17850
     [1621.678044] FS:  00007fa6e20006c0(0000) GS:ffff943bdfa40000(0000) knlGS:0000000000000000
     [1621.678046] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     [1621.678048] CR2: 00007fa6b0801000 CR3: 000000012d404002 CR4: 0000000000370ef0
     [1621.678053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     [1621.678055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     [1621.678056] Call Trace:
     [1621.678074]  <TASK>
     [1621.678076]  ? __warn+0x80/0x130
     [1621.678082]  ? emit_fiemap_extent+0x84/0x90 [btrfs]
     [1621.678159]  ? report_bug+0x1f4/0x200
     [1621.678164]  ? handle_bug+0x42/0x70
     [1621.678167]  ? exc_invalid_op+0x14/0x70
     [1621.678170]  ? asm_exc_invalid_op+0x16/0x20
     [1621.678178]  ? emit_fiemap_extent+0x84/0x90 [btrfs]
     [1621.678253]  extent_fiemap+0x766/0xa30 [btrfs]
     [1621.678339]  btrfs_fiemap+0x45/0x80 [btrfs]
     [1621.678420]  do_vfs_ioctl+0x1e4/0x870
     [1621.678431]  __x64_sys_ioctl+0x6a/0xc0
     [1621.678434]  do_syscall_64+0x52/0x120
     [1621.678445]  entry_SYSCALL_64_after_hwframe+0x6e/0x76

There's also another case where before calling btrfs_next_leaf() we are
processing a hole or a prealloc extent and we had several delalloc ranges
within that hole or prealloc extent. In that case if the ordered extents
complete before we find the next key, we may end up finding an extent item
with an offset smaller than (or equals to) the offset in cache->offset.

So fix this by changing emit_fiemap_extent() to address these three
scenarios like this:

1) For the first case, steps listed above, adjust the length of the
   previously cached extent so that it does not overlap with the current
   extent, emit the previous one and cache the current file extent item;

2) For the second case where he had a hole or prealloc extent with
   multiple delalloc ranges inside the hole or prealloc extent's range,
   and the current file extent item has an offset that matches the offset
   in the fiemap cache, just discard what we have in the fiemap cache and
   assign the current file extent item to the cache, since it's more up
   to date;

3) For the third case where he had a hole or prealloc extent with
   multiple delalloc ranges inside the hole or prealloc extent's range
   and the offset of the file extent item we just found is smaller than
   what we have in the cache, just skip the current file extent item
   if its range end at or behind the cached extent's end, because we may
   have emitted (to the fiemap user space buffer) delalloc ranges that
   overlap with the current file extent item's range. If the file extent
   item's range goes beyond the end offset of the cached extent, just
   emit the cached extent and cache a subrange of the file extent item,
   that goes from the end offset of the cached extent to the end offset
   of the file extent item.

Dealing with those cases in those ways makes everything consistent by
reflecting the current state of file extent items in the btree and
without emitting extents that have overlapping ranges (which would be
confusing and violating expectations).

This issue could be triggered often with test case generic/561, and was
also hit and reported by Wang Yugui.

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://lore.kernel.org/linux-btrfs/20240223104619.701F.409509F4@e16-tech.com/
Fixes: b0ad381fa7 ("btrfs: fix deadlock with fiemap and extent locking")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-02-29 22:34:04 +01:00
Linus Torvalds
87adedeba5 Including fixes from bluetooth, WiFi and netfilter.
We have one outstanding issue with the stmmac driver, which may
 be a LOCKDEP false positive, not a blocker.
 
 Current release - regressions:
 
  - netfilter: nf_tables: re-allow NFPROTO_INET in
    nft_(match/target)_validate()
 
  - eth: ionic: fix error handling in PCI reset code
 
 Current release - new code bugs:
 
  - eth: stmmac: complete meta data only when enabled, fix null-deref
 
  - kunit: fix again checksum tests on big endian CPUs
 
 Previous releases - regressions:
 
  - veth: try harder when allocating queue memory
 
  - Bluetooth:
    - hci_bcm4377: do not mark valid bd_addr as invalid
    - hci_event: fix handling of HCI_EV_IO_CAPA_REQUEST
 
 Previous releases - always broken:
 
  - info leak in __skb_datagram_iter() on netlink socket
 
  - mptcp:
    - map v4 address to v6 when destroying subflow
    - fix potential wake-up event loss due to sndbuf auto-tuning
    - fix double-free on socket dismantle
 
  - wifi: nl80211: reject iftype change with mesh ID change
 
  - fix small out-of-bound read when validating netlink be16/32 types
 
  - rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
 
  - ipv6: fix potential "struct net" ref-leak in inet6_rtm_getaddr()
 
  - ip_tunnel: prevent perpetual headroom growth with huge number of
    tunnels on top of each other
 
  - mctp: fix skb leaks on error paths of mctp_local_output()
 
  - eth: ice: fixes for DPLL state reporting
 
  - dpll: rely on rcu for netdev_dpll_pin() to prevent UaF
 
  - eth: dpaa: accept phy-interface-type = "10gbase-r" in the device tree
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmXg6ioACgkQMUZtbf5S
 IrupHQ/+Jt9OK8AYiUUBpeE0E0pb4yHS4KuiGWChx2YECJCeeeU6Ko4gaPI6+Nyv
 mMh/3sVsLnX7w4OXp2HddMMiFGbd1ufIptS0T/EMhHbbg1h7Qr1jhpu8aM8pb9jM
 5DwjfTZijDaW84Oe+Kk9BOonxR6A+Df27O3PSEUtLk4JCy5nwEwUr9iCxgCla499
 3aLu5eWRw8PTSsJec4BK6hfCKWiA/6oBHS1pQPwYvWuBWFZe8neYHtvt3LUwo1HR
 DwN9gtMiGBzYSSQmk8V1diGIokn80G5Krdq4gXbhsLxIU0oEJA7ltGpqasxy/OCs
 KGLHcU5wCd3j42gZOzvBzzzj8RQyd2ZekyvCu7B5Rgy3fx6JWI1jLalsQ/tT9yQg
 VJgFM2AZBb1EEAw/P2DkVQ8Km8ZuVlGtzUoldvIY1deP1/LZFWc0PftA6ndT7Ldl
 wQwKPQtJ5DMzqEe3mwSjFkL+AiSmcCHCkpnGBIi4c7Ek2/GgT1HeUMwJPh0mBftz
 smlLch3jMH2YKk7AmH7l9o/Q9ypgvl+8FA+icLaX0IjtSbzz5Q7gNyhgE0w1Hdb2
 79q6SE3ETLG/dn75XMA1C0Wowrr60WKHwagMPUl57u9bchfUT8Ler/4Sd9DWn8Vl
 55YnGPWMLCkxgpk+DHXYOWjOBRszCkXrAA71NclMnbZ5cQ86JYY=
 =T2ty
 -----END PGP SIGNATURE-----

Merge tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, WiFi and netfilter.

  We have one outstanding issue with the stmmac driver, which may be a
  LOCKDEP false positive, not a blocker.

  Current release - regressions:

   - netfilter: nf_tables: re-allow NFPROTO_INET in
     nft_(match/target)_validate()

   - eth: ionic: fix error handling in PCI reset code

  Current release - new code bugs:

   - eth: stmmac: complete meta data only when enabled, fix null-deref

   - kunit: fix again checksum tests on big endian CPUs

  Previous releases - regressions:

   - veth: try harder when allocating queue memory

   - Bluetooth:
      - hci_bcm4377: do not mark valid bd_addr as invalid
      - hci_event: fix handling of HCI_EV_IO_CAPA_REQUEST

  Previous releases - always broken:

   - info leak in __skb_datagram_iter() on netlink socket

   - mptcp:
      - map v4 address to v6 when destroying subflow
      - fix potential wake-up event loss due to sndbuf auto-tuning
      - fix double-free on socket dismantle

   - wifi: nl80211: reject iftype change with mesh ID change

   - fix small out-of-bound read when validating netlink be16/32 types

   - rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back

   - ipv6: fix potential "struct net" ref-leak in inet6_rtm_getaddr()

   - ip_tunnel: prevent perpetual headroom growth with huge number of
     tunnels on top of each other

   - mctp: fix skb leaks on error paths of mctp_local_output()

   - eth: ice: fixes for DPLL state reporting

   - dpll: rely on rcu for netdev_dpll_pin() to prevent UaF

   - eth: dpaa: accept phy-interface-type = '10gbase-r' in the device
     tree"

* tag 'net-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
  dpll: fix build failure due to rcu_dereference_check() on unknown type
  kunit: Fix again checksum tests on big endian CPUs
  tls: fix use-after-free on failed backlog decryption
  tls: separate no-async decryption request handling from async
  tls: fix peeking with sync+async decryption
  tls: decrement decrypt_pending if no async completion will be called
  gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
  net: hsr: Use correct offset for HSR TLV values in supervisory HSR frames
  igb: extend PTP timestamp adjustments to i211
  rtnetlink: fix error logic of IFLA_BRIDGE_FLAGS writing back
  tools: ynl: fix handling of multiple mcast groups
  selftests: netfilter: add bridge conntrack + multicast test case
  netfilter: bridge: confirm multicast packets before passing them up the stack
  netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate()
  Bluetooth: qca: Fix triggering coredump implementation
  Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT
  Bluetooth: qca: Fix wrong event type for patch config command
  Bluetooth: Enforce validation on max value of connection interval
  Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
  Bluetooth: mgmt: Fix limited discoverable off timeout
  ...
2024-02-29 12:40:20 -08:00
Linus Torvalds
d4f76f8065 Landlock fix for v6.8-rc7
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZeDFdxAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSnwUA/0w0hCpUVMxKI66HmzE0tgcgWje3aLfGpoYc
 vLFIL39YAQDRJ/2HDUCLoFwKKn537nnhRAP58brZYnwwT1YQT+/VAA==
 =PNsP
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull Landlock fix from Mickaël Salaün:
 "Fix a potential issue when handling inodes with inconsistent
  properties"

* tag 'landlock-6.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Fix asymmetric private inodes referring
2024-02-29 12:29:23 -08:00
Dimitris Vlachos
a11dd49dcb
riscv: Sparse-Memory/vmemmap out-of-bounds fix
Offset vmemmap so that the first page of vmemmap will be mapped
to the first page of physical memory in order to ensure that
vmemmap’s bounds will be respected during
pfn_to_page()/page_to_pfn() operations.
The conversion macros will produce correct SV39/48/57 addresses
for every possible/valid DRAM_BASE inside the physical memory limits.

v2:Address Alex's comments

Suggested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Dimitris Vlachos <dvlachos@ics.forth.gr>
Reported-by: Dimitris Vlachos <dvlachos@ics.forth.gr>
Closes: https://lore.kernel.org/linux-riscv/20240202135030.42265-1-csd4492@csd.uoc.gr
Fixes: d95f1a542c ("RISC-V: Implement sparsemem")
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240229191723.32779-1-dvlachos@ics.forth.gr
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 12:24:31 -08:00
Eric Dumazet
640f41ed33 dpll: fix build failure due to rcu_dereference_check() on unknown type
Tasmiya reports that their compiler complains that we deref
a pointer to unknown type with rcu_dereference_rtnl():

include/linux/rcupdate.h:439:9: error: dereferencing pointer to incomplete type ‘struct dpll_pin’

Unclear what compiler it is, at the moment, and we can't report
but since DPLL can't be a module - move the code from the header
into the source file.

Fixes: 0d60d8df6f ("dpll: rely on rcu for netdev_dpll_pin()")
Reported-by: Tasmiya Nalatwad <tasmiya@linux.vnet.ibm.com>
Link: https://lore.kernel.org/all/3fcf3a2c-1c1b-42c1-bacb-78fdcd700389@linux.vnet.ibm.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240229190515.2740221-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 12:18:37 -08:00
Palmer Dabbelt
2b8acd7154
Merge patch series "NAPOT Fixes"
Alexandre Ghiti <alexghiti@rivosinc.com> says:

This contains 2 fixes for NAPOT: patch 1 disables the use of NAPOT
mapping for vmalloc/vmap and patch 2 implements pte_leaf_size() to
report NAPOT size.

* b4-shazam-merge:
  riscv: Fix pte_leaf_size() for NAPOT
  Revert "riscv: mm: support Svnapot in huge vmap"

Link: https://lore.kernel.org/r/20240227205016.121901-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:21:25 -08:00
Alexandre Ghiti
e0fe5ab419
riscv: Fix pte_leaf_size() for NAPOT
pte_leaf_size() must be reimplemented to add support for NAPOT mappings.

Fixes: 82a1a1f3bf ("riscv: mm: support Svnapot in hugetlb page")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240227205016.121901-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:21:23 -08:00
Alexandre Ghiti
16ab4646c9
Revert "riscv: mm: support Svnapot in huge vmap"
This reverts commit ce173474cf.

We cannot correctly deal with NAPOT mappings in vmalloc/vmap because if
some part of a NAPOT mapping is unmapped, the remaining mapping is not
updated accordingly. For example:

ptr = vmalloc_huge(64 * 1024, GFP_KERNEL);
vunmap_range((unsigned long)(ptr + PAGE_SIZE),
	     (unsigned long)(ptr + 64 * 1024));

leads to the following kernel page table dump:

0xffff8f8000ef0000-0xffff8f8000ef1000    0x00000001033c0000         4K PTE N   ..     ..   D A G . . W R V

Meaning the first entry which was not unmapped still has the N bit set,
which, if accessed first and cached in the TLB, could allow access to the
unmapped range.

That's because the logic to break the NAPOT mapping does not exist and
likely won't. Indeed, to break a NAPOT mapping, we first have to clear
the whole mapping, flush the TLB and then set the new mapping ("break-
before-make" equivalent). That works fine in userspace since we can handle
any pagefault occurring on the remaining mapping but we can't handle a kernel
pagefault on such mapping.

So fix this by reverting the commit that introduced the vmap/vmalloc
support.

Fixes: ce173474cf ("riscv: mm: support Svnapot in huge vmap")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240227205016.121901-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:21:22 -08:00
Palmer Dabbelt
e2b6bc28ec
Merge patch series "riscv: cbo.zero fixes"
Samuel Holland <samuel.holland@sifive.com> says:

This series fixes a couple of issues related to using the cbo.zero
instruction in userspace. The first patch fixes a bug where the wrong
enable bit gets set if the kernel is running in M-mode. The remaining
patches fix a bug where the enable bit gets reset to its default value
after a nonretentive idle state. I have hardware which reproduces this:

Before this series:
  $ tools/testing/selftests/riscv/hwprobe/cbo
  TAP version 13
  1..3
  ok 1 Zicboz block size
  # Zicboz block size: 64
  Illegal instruction

After applying this series:
  $ tools/testing/selftests/riscv/hwprobe/cbo
  TAP version 13
  1..3
  ok 1 Zicboz block size
  # Zicboz block size: 64
  ok 2 cbo.zero
  ok 3 cbo.zero check
  # Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0

* b4-shazam-merge:
  riscv: Save/restore envcfg CSR during CPU suspend
  riscv: Add a custom ISA extension for the [ms]envcfg CSR
  riscv: Fix enabling cbo.zero when running in M-mode

Link: https://lore.kernel.org/r/20240228065559.3434837-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:20:19 -08:00
Samuel Holland
05ab803d1a
riscv: Save/restore envcfg CSR during CPU suspend
The value of the [ms]envcfg CSR is lost when entering a nonretentive
idle state, so the CSR must be rewritten when resuming the CPU.

Cc: <stable@vger.kernel.org> # v6.7+
Fixes: 43c16d51a1 ("RISC-V: Enable cbo.zero in usermode")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240228065559.3434837-4-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:20:18 -08:00
Samuel Holland
4774848fef
riscv: Add a custom ISA extension for the [ms]envcfg CSR
The [ms]envcfg CSR was added in version 1.12 of the RISC-V privileged
ISA (aka S[ms]1p12). However, bits in this CSR are defined by several
other extensions which may be implemented separately from any particular
version of the privileged ISA (for example, some unrelated errata may
prevent an implementation from claiming conformance with Ss1p12). As a
result, Linux cannot simply use the privileged ISA version to determine
if the CSR is present. It must also check if any of these other
extensions are implemented. It also cannot probe the existence of the
CSR at runtime, because Linux does not require Sstrict, so (in the
absence of additional information) it cannot know if a CSR at that
address is [ms]envcfg or part of some non-conforming vendor extension.

Since there are several standard extensions that imply the existence of
the [ms]envcfg CSR, it becomes unwieldy to check for all of them
wherever the CSR is accessed. Instead, define a custom Xlinuxenvcfg ISA
extension bit that is implied by the other extensions and denotes that
the CSR exists as defined in the privileged ISA, containing at least one
of the fields common between menvcfg and senvcfg.

This extension does not need to be parsed from the devicetree or ISA
string because it can only be implemented as a subset of some other
standard extension.

Cc: <stable@vger.kernel.org> # v6.7+
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240228065559.3434837-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:20:17 -08:00
Samuel Holland
3fb3f7164e
riscv: Fix enabling cbo.zero when running in M-mode
When the kernel is running in M-mode, the CBZE bit must be set in the
menvcfg CSR, not in senvcfg.

Cc: <stable@vger.kernel.org>
Fixes: 43c16d51a1 ("RISC-V: Enable cbo.zero in usermode")
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240228065559.3434837-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:20:16 -08:00
Fei Wu
34b5678687
perf: RISCV: Fix panic on pmu overflow handler
(1 << idx) of int is not desired when setting bits in unsigned long
overflowed_ctrs, use BIT() instead. This panic happens when running
'perf record -e branches' on sophgo sg2042.

[  273.311852] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
[  273.320851] Oops [#1]
[  273.323179] Modules linked in:
[  273.326303] CPU: 0 PID: 1475 Comm: perf Not tainted 6.6.0-rc3+ #9
[  273.332521] Hardware name: Sophgo Mango (DT)
[  273.336878] epc : riscv_pmu_ctr_get_width_mask+0x8/0x62
[  273.342291]  ra : pmu_sbi_ovf_handler+0x2e0/0x34e
[  273.347091] epc : ffffffff80aecd98 ra : ffffffff80aee056 sp : fffffff6e36928b0
[  273.354454]  gp : ffffffff821f82d0 tp : ffffffd90c353200 t0 : 0000002ade4f9978
[  273.361815]  t1 : 0000000000504d55 t2 : ffffffff8016cd8c s0 : fffffff6e3692a70
[  273.369180]  s1 : 0000000000000020 a0 : 0000000000000000 a1 : 00001a8e81800000
[  273.376540]  a2 : 0000003c00070198 a3 : 0000003c00db75a4 a4 : 0000000000000015
[  273.383901]  a5 : ffffffd7ff8804b0 a6 : 0000000000000015 a7 : 000000000000002a
[  273.391327]  s2 : 000000000000ffff s3 : 0000000000000000 s4 : ffffffd7ff8803b0
[  273.398773]  s5 : 0000000000504d55 s6 : ffffffd905069800 s7 : ffffffff821fe210
[  273.406139]  s8 : 000000007fffffff s9 : ffffffd7ff8803b0 s10: ffffffd903f29098
[  273.413660]  s11: 0000000080000000 t3 : 0000000000000003 t4 : ffffffff8017a0ca
[  273.421022]  t5 : ffffffff8023cfc2 t6 : ffffffd9040780e8
[  273.426437] status: 0000000200000100 badaddr: 0000000000000098 cause: 000000000000000d
[  273.434512] [<ffffffff80aecd98>] riscv_pmu_ctr_get_width_mask+0x8/0x62
[  273.441169] [<ffffffff80076bd8>] handle_percpu_devid_irq+0x98/0x1ee
[  273.447562] [<ffffffff80071158>] generic_handle_domain_irq+0x28/0x36
[  273.454151] [<ffffffff8047a99a>] riscv_intc_irq+0x36/0x4e
[  273.459659] [<ffffffff80c944de>] handle_riscv_irq+0x4a/0x74
[  273.465442] [<ffffffff80c94c48>] do_irq+0x62/0x92
[  273.470360] Code: 0420 60a2 6402 5529 0141 8082 0013 0000 0013 0000 (6d5c) b783
[  273.477921] ---[ end trace 0000000000000000 ]---
[  273.482630] Kernel panic - not syncing: Fatal exception in interrupt

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Fei Wu <fei2.wu@intel.com>
Link: https://lore.kernel.org/r/20240228115425.2613856-1-fei2.wu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:20:00 -08:00
Samuel Holland
680945f0aa
MAINTAINERS: Update SiFive driver maintainers
Add myself as a maintainer for the various SiFive drivers, since I have
been performing cleanup activity on these drivers and reviewing patches
to them for a while now. Remove Palmer as a maintainer, as he is focused
on overall RISC-V architecture support.

Collapse some duplicate entries into the main SiFive drivers entry:
 - Conor is already maintainer of standalone cache drivers as a whole,
   and these files are also covered by the "sifive" file name regex.
 - Paul's git tree has not been updated since 2018, and all file names
   matching the "fu540" pattern also match the "sifive" pattern.
 - Green has not been active on the LKML for a couple of years.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Acked-by: Paul Walmsley <paul.walmsley@sifive.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20240215234941.1663791-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-02-29 10:18:30 -08:00
Christophe Leroy
3d6423ef8d kunit: Fix again checksum tests on big endian CPUs
Commit b38460bc46 ("kunit: Fix checksum tests on big endian CPUs")
fixed endianness issues with kunit checksum tests, but then
commit 6f4c45cbcb ("kunit: Add tests for csum_ipv6_magic and
ip_fast_csum") introduced new issues on big endian CPUs. Those issues
are once again reflected by the warnings reported by sparse.

So, fix them with the same approach, perform proper conversion in
order to support both little and big endian CPUs. Once the conversions
are properly done and the right types used, the sparse warnings are
cleared as well.

Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Fixes: 6f4c45cbcb ("kunit: Add tests for csum_ipv6_magic and ip_fast_csum")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/73df3a9e95c2179119398ad1b4c84cdacbd8dfb6.1708684443.git.christophe.leroy@csgroup.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 09:16:02 -08:00
Jakub Kicinski
244b96c231 bluetooth pull request for net:
- mgmt: Fix limited discoverable off timeout
  - hci_qca: Set BDA quirk bit if fwnode exists in DT
  - hci_bcm4377: do not mark valid bd_addr as invalid
  - hci_sync: Check the correct flag before starting a scan
  - Enforce validation on max value of connection interval
  - hci_sync: Fix accept_list when attempting to suspend
  - hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
  - Avoid potential use-after-free in hci_error_reset
  - rfcomm: Fix null-ptr-deref in rfcomm_check_security
  - hci_event: Fix wrongly recorded wakeup BD_ADDR
  - qca: Fix wrong event type for patch config command
  - qca: Fix triggering coredump implementation
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmXfSTQZHGx1aXoudm9u
 LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKd2UD/4jJaoGygVTAlthLkllVEOu
 mC52iXLpuxVfjIsHHMKPh4ysxTid4Zx0zijM7wLuZc7rCnOSw8Sdu72Mmm0q8Hcx
 hxkp731oB0jDKUt9DMXBasNF/Gk10BubD79oy+rZx4D2lKNJoUgktbtyp+3rFo7b
 2GdpxThYGz4UiiCVgZbMTv/kBzQvZPP/Dd/i+9dXq9sxDAi40bJcnT9o1/XLtEpN
 Ctz482PLqANkBvkmqUOx5SkeA1GcvlcNd9jDU2Ou/vwpHTAQk10RxFIr+ywV2PVI
 wieUAlD5OadlpDw1+0a6qz80MOEZg5I5qIpJ71h19KrMkWUlxOIomAKxjam6yUCb
 GSykO9pU18Q247++V3YPdbSLclgsjnxVWDThI7gp5JRXsjPW3g7taV2T6S4Q6s3r
 7f6FANeCYSL4sydnySXindwD3z8qsNqvCMblGrNM71COn9tv70Jyy6jZpqO+TttH
 AnVH5EgAUjpKN+3ePicRioeZ6rpr651hmz/PNPbjMggUMxQebz4Bpri9gO8YeW0A
 ksZVDfEXnuS0ev/uQKUEt/Fl+GiI64+QlipAFMPkgvQ1DtnEAYSzTYADKT6Owtuy
 J8tvehnjc+0c7RIpXAt5/yVWeMdHZvr06udVvglSSmJ/ZzeFU4UVdTNb5mbaL3rV
 9XNB4YyjMXmvEVR4/VhuGw==
 =ZpLZ
 -----END PGP SIGNATURE-----

Merge tag 'for-net-2024-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - mgmt: Fix limited discoverable off timeout
 - hci_qca: Set BDA quirk bit if fwnode exists in DT
 - hci_bcm4377: do not mark valid bd_addr as invalid
 - hci_sync: Check the correct flag before starting a scan
 - Enforce validation on max value of connection interval
 - hci_sync: Fix accept_list when attempting to suspend
 - hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
 - Avoid potential use-after-free in hci_error_reset
 - rfcomm: Fix null-ptr-deref in rfcomm_check_security
 - hci_event: Fix wrongly recorded wakeup BD_ADDR
 - qca: Fix wrong event type for patch config command
 - qca: Fix triggering coredump implementation

* tag 'for-net-2024-02-28' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: qca: Fix triggering coredump implementation
  Bluetooth: hci_qca: Set BDA quirk bit if fwnode exists in DT
  Bluetooth: qca: Fix wrong event type for patch config command
  Bluetooth: Enforce validation on max value of connection interval
  Bluetooth: hci_event: Fix handling of HCI_EV_IO_CAPA_REQUEST
  Bluetooth: mgmt: Fix limited discoverable off timeout
  Bluetooth: hci_event: Fix wrongly recorded wakeup BD_ADDR
  Bluetooth: rfcomm: Fix null-ptr-deref in rfcomm_check_security
  Bluetooth: hci_sync: Fix accept_list when attempting to suspend
  Bluetooth: Avoid potential use-after-free in hci_error_reset
  Bluetooth: hci_sync: Check the correct flag before starting a scan
  Bluetooth: hci_bcm4377: do not mark valid bd_addr as invalid
====================

Link: https://lore.kernel.org/r/20240228145644.2269088-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 09:10:24 -08:00
Jakub Kicinski
8f5afe4114 Merge branch 'tls-a-few-more-fixes-for-async-decrypt'
Sabrina Dubroca says:

====================
tls: a few more fixes for async decrypt

The previous patchset [1] took care of "full async". This adds a few
fixes for cases where only part of the crypto operations go the async
route, found by extending my previous debug patch [2] to do N
synchronous operations followed by M asynchronous ops (with N and M
configurable).

[1] https://patchwork.kernel.org/project/netdevbpf/list/?series=823784&state=*
[2] https://lore.kernel.org/all/9d664093b1bf7f47497b2c40b3a085b45f3274a2.1694021240.git.sd@queasysnail.net/
====================

Link: https://lore.kernel.org/r/cover.1709132643.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 09:07:19 -08:00
Sabrina Dubroca
13114dc554 tls: fix use-after-free on failed backlog decryption
When the decrypt request goes to the backlog and crypto_aead_decrypt
returns -EBUSY, tls_do_decryption will wait until all async
decryptions have completed. If one of them fails, tls_do_decryption
will return -EBADMSG and tls_decrypt_sg jumps to the error path,
releasing all the pages. But the pages have been passed to the async
callback, and have already been released by tls_decrypt_done.

The only true async case is when crypto_aead_decrypt returns
 -EINPROGRESS. With -EBUSY, we already waited so we can tell
tls_sw_recvmsg that the data is available for immediate copy, but we
need to notify tls_decrypt_sg (via the new ->async_done flag) that the
memory has already been released.

Fixes: 8590541473 ("net: tls: handle backlogging of crypto requests")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/4755dd8d9bebdefaa19ce1439b833d6199d4364c.1709132643.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 09:07:16 -08:00
Sabrina Dubroca
41532b785e tls: separate no-async decryption request handling from async
If we're not doing async, the handling is much simpler. There's no
reference counting, we just need to wait for the completion to wake us
up and return its result.

We should preferably also use a separate crypto_wait. I'm not seeing a
UAF as I did in the past, I think aec7961916 ("tls: fix race between
async notify and socket close") took care of it.

This will make the next fix easier.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/47bde5f649707610eaef9f0d679519966fc31061.1709132643.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 09:07:16 -08:00
Sabrina Dubroca
6caaf10442 tls: fix peeking with sync+async decryption
If we peek from 2 records with a currently empty rx_list, and the
first record is decrypted synchronously but the second record is
decrypted async, the following happens:
  1. decrypt record 1 (sync)
  2. copy from record 1 to the userspace's msg
  3. queue the decrypted record to rx_list for future read(!PEEK)
  4. decrypt record 2 (async)
  5. queue record 2 to rx_list
  6. call process_rx_list to copy data from the 2nd record

We currently pass copied=0 as skip offset to process_rx_list, so we
end up copying once again from the first record. We should skip over
the data we've already copied.

Seen with selftest tls.12_aes_gcm.recv_peek_large_buf_mult_recs

Fixes: 692d7b5d1f ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/1b132d2b2b99296bfde54e8a67672d90d6d16e71.1709132643.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 09:07:16 -08:00
Sabrina Dubroca
f7fa16d498 tls: decrement decrypt_pending if no async completion will be called
With mixed sync/async decryption, or failures of crypto_aead_decrypt,
we increment decrypt_pending but we never do the corresponding
decrement since tls_decrypt_done will not be called. In this case, we
should decrement decrypt_pending immediately to avoid getting stuck.

For example, the prequeue prequeue test gets stuck with mixed
modes (one async decrypt + one sync decrypt).

Fixes: 94524d8fc9 ("net/tls: Add support for async decryption of tls records")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/c56d5fc35543891d5319f834f25622360e1bfbec.1709132643.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-29 09:07:16 -08:00
Takashi Sakamoto
d0b06dc48f firewire: core: use long bus reset on gap count error
When resetting the bus after a gap count error, use a long rather than
short bus reset.

IEEE 1394-1995 uses only long bus resets. IEEE 1394a adds the option of
short bus resets. When video or audio transmission is in progress and a
device is hot-plugged elsewhere on the bus, the resulting bus reset can
cause video frame drops or audio dropouts. Short bus resets reduce or
eliminate this problem. Accordingly, short bus resets are almost always
preferred.

However, on a mixed 1394/1394a bus, a short bus reset can trigger an
immediate additional bus reset. This double bus reset can be interpreted
differently by different nodes on the bus, resulting in an inconsistent gap
count after the bus reset. An inconsistent gap count will cause another bus
reset, leading to a neverending bus reset loop. This only happens for some
bus topologies, not for all mixed 1394/1394a buses.

By instead sending a long bus reset after a gap count inconsistency, we
avoid the doubled bus reset, restoring the bus to normal operation.

Signed-off-by: Adam Goldman <adamg@pobox.com>
Link: https://sourceforge.net/p/linux1394/mailman/message/58741624/
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2024-02-29 22:18:14 +09:00
Alexander Ofitserov
616d82c3cf gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
The gtp_link_ops operations structure for the subsystem must be
registered after registering the gtp_net_ops pernet operations structure.

Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug:

[ 1010.702740] gtp: GTP module unloaded
[ 1010.715877] general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] SMP KASAN NOPTI
[ 1010.715888] KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
[ 1010.715895] CPU: 1 PID: 128616 Comm: a.out Not tainted 6.8.0-rc6-std-def-alt1 #1
[ 1010.715899] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-alt1 04/01/2014
[ 1010.715908] RIP: 0010:gtp_newlink+0x4d7/0x9c0 [gtp]
[ 1010.715915] Code: 80 3c 02 00 0f 85 41 04 00 00 48 8b bb d8 05 00 00 e8 ed f6 ff ff 48 89 c2 48 89 c5 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 4f 04 00 00 4c 89 e2 4c 8b 6d 00 48 b8 00 00 00
[ 1010.715920] RSP: 0018:ffff888020fbf180 EFLAGS: 00010203
[ 1010.715929] RAX: dffffc0000000000 RBX: ffff88800399c000 RCX: 0000000000000000
[ 1010.715933] RDX: 0000000000000001 RSI: ffffffff84805280 RDI: 0000000000000282
[ 1010.715938] RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
[ 1010.715942] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88800399cc80
[ 1010.715947] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000400
[ 1010.715953] FS:  00007fd1509ab5c0(0000) GS:ffff88805b300000(0000) knlGS:0000000000000000
[ 1010.715958] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1010.715962] CR2: 0000000000000000 CR3: 000000001c07a000 CR4: 0000000000750ee0
[ 1010.715968] PKRU: 55555554
[ 1010.715972] Call Trace:
[ 1010.715985]  ? __die_body.cold+0x1a/0x1f
[ 1010.715995]  ? die_addr+0x43/0x70
[ 1010.716002]  ? exc_general_protection+0x199/0x2f0
[ 1010.716016]  ? asm_exc_general_protection+0x1e/0x30
[ 1010.716026]  ? gtp_newlink+0x4d7/0x9c0 [gtp]
[ 1010.716034]  ? gtp_net_exit+0x150/0x150 [gtp]
[ 1010.716042]  __rtnl_newlink+0x1063/0x1700
[ 1010.716051]  ? rtnl_setlink+0x3c0/0x3c0
[ 1010.716063]  ? is_bpf_text_address+0xc0/0x1f0
[ 1010.716070]  ? kernel_text_address.part.0+0xbb/0xd0
[ 1010.716076]  ? __kernel_text_address+0x56/0xa0
[ 1010.716084]  ? unwind_get_return_address+0x5a/0xa0
[ 1010.716091]  ? create_prof_cpu_mask+0x30/0x30
[ 1010.716098]  ? arch_stack_walk+0x9e/0xf0
[ 1010.716106]  ? stack_trace_save+0x91/0xd0
[ 1010.716113]  ? stack_trace_consume_entry+0x170/0x170
[ 1010.716121]  ? __lock_acquire+0x15c5/0x5380
[ 1010.716139]  ? mark_held_locks+0x9e/0xe0
[ 1010.716148]  ? kmem_cache_alloc_trace+0x35f/0x3c0
[ 1010.716155]  ? __rtnl_newlink+0x1700/0x1700
[ 1010.716160]  rtnl_newlink+0x69/0xa0
[ 1010.716166]  rtnetlink_rcv_msg+0x43b/0xc50
[ 1010.716172]  ? rtnl_fdb_dump+0x9f0/0x9f0
[ 1010.716179]  ? lock_acquire+0x1fe/0x560
[ 1010.716188]  ? netlink_deliver_tap+0x12f/0xd50
[ 1010.716196]  netlink_rcv_skb+0x14d/0x440
[ 1010.716202]  ? rtnl_fdb_dump+0x9f0/0x9f0
[ 1010.716208]  ? netlink_ack+0xab0/0xab0
[ 1010.716213]  ? netlink_deliver_tap+0x202/0xd50
[ 1010.716220]  ? netlink_deliver_tap+0x218/0xd50
[ 1010.716226]  ? __virt_addr_valid+0x30b/0x590
[ 1010.716233]  netlink_unicast+0x54b/0x800
[ 1010.716240]  ? netlink_attachskb+0x870/0x870
[ 1010.716248]  ? __check_object_size+0x2de/0x3b0
[ 1010.716254]  netlink_sendmsg+0x938/0xe40
[ 1010.716261]  ? netlink_unicast+0x800/0x800
[ 1010.716269]  ? __import_iovec+0x292/0x510
[ 1010.716276]  ? netlink_unicast+0x800/0x800
[ 1010.716284]  __sock_sendmsg+0x159/0x190
[ 1010.716290]  ____sys_sendmsg+0x712/0x880
[ 1010.716297]  ? sock_write_iter+0x3d0/0x3d0
[ 1010.716304]  ? __ia32_sys_recvmmsg+0x270/0x270
[ 1010.716309]  ? lock_acquire+0x1fe/0x560
[ 1010.716315]  ? drain_array_locked+0x90/0x90
[ 1010.716324]  ___sys_sendmsg+0xf8/0x170
[ 1010.716331]  ? sendmsg_copy_msghdr+0x170/0x170
[ 1010.716337]  ? lockdep_init_map_type+0x2c7/0x860
[ 1010.716343]  ? lockdep_hardirqs_on_prepare+0x430/0x430
[ 1010.716350]  ? debug_mutex_init+0x33/0x70
[ 1010.716360]  ? percpu_counter_add_batch+0x8b/0x140
[ 1010.716367]  ? lock_acquire+0x1fe/0x560
[ 1010.716373]  ? find_held_lock+0x2c/0x110
[ 1010.716384]  ? __fd_install+0x1b6/0x6f0
[ 1010.716389]  ? lock_downgrade+0x810/0x810
[ 1010.716396]  ? __fget_light+0x222/0x290
[ 1010.716403]  __sys_sendmsg+0xea/0x1b0
[ 1010.716409]  ? __sys_sendmsg_sock+0x40/0x40
[ 1010.716419]  ? lockdep_hardirqs_on_prepare+0x2b3/0x430
[ 1010.716425]  ? syscall_enter_from_user_mode+0x1d/0x60
[ 1010.716432]  do_syscall_64+0x30/0x40
[ 1010.716438]  entry_SYSCALL_64_after_hwframe+0x62/0xc7
[ 1010.716444] RIP: 0033:0x7fd1508cbd49
[ 1010.716452] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ef 70 0d 00 f7 d8 64 89 01 48
[ 1010.716456] RSP: 002b:00007fff18872348 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
[ 1010.716463] RAX: ffffffffffffffda RBX: 000055f72bf0eac0 RCX: 00007fd1508cbd49
[ 1010.716468] RDX: 0000000000000000 RSI: 0000000020000280 RDI: 0000000000000006
[ 1010.716473] RBP: 00007fff18872360 R08: 00007fff18872360 R09: 00007fff18872360
[ 1010.716478] R10: 00007fff18872360 R11: 0000000000000202 R12: 000055f72bf0e1b0
[ 1010.716482] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1010.716491] Modules linked in: gtp(+) udp_tunnel ib_core uinput af_packet rfkill qrtr joydev hid_generic usbhid hid kvm_intel iTCO_wdt intel_pmc_bxt iTCO_vendor_support kvm snd_hda_codec_generic ledtrig_audio irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel snd_hda_intel nls_utf8 snd_intel_dspcfg nls_cp866 psmouse aesni_intel vfat crypto_simd fat cryptd glue_helper snd_hda_codec pcspkr snd_hda_core i2c_i801 snd_hwdep i2c_smbus xhci_pci snd_pcm lpc_ich xhci_pci_renesas xhci_hcd qemu_fw_cfg tiny_power_button button sch_fq_codel vboxvideo drm_vram_helper drm_ttm_helper ttm vboxsf vboxguest snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi snd_seq_device snd_timer snd soundcore msr fuse efi_pstore dm_mod ip_tables x_tables autofs4 virtio_gpu virtio_dma_buf drm_kms_helper cec rc_core drm virtio_rng virtio_scsi rng_core virtio_balloon virtio_blk virtio_net virtio_console net_failover failover ahci libahci libata evdev scsi_mod input_leds serio_raw virtio_pci intel_agp
[ 1010.716674]  virtio_ring intel_gtt virtio [last unloaded: gtp]
[ 1010.716693] ---[ end trace 04990a4ce61e174b ]---

Cc: stable@vger.kernel.org
Signed-off-by: Alexander Ofitserov <oficerovas@altlinux.org>
Fixes: 459aa660eb ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240228114703.465107-1-oficerovas@altlinux.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-29 14:14:18 +01:00
Priyanka Dandamudi
8188cae3cc drm/xe/xe_trace: Add move_lacks_source detail to xe_bo_move trace
Add move_lacks_source detail to xe_bo_move trace to make it readable
that is to check if it is migrate clear or migrate copy.

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: a09946a9a9 ("drm/xe/xe_bo_move: Enhance xe_bo_move trace")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240221101950.1019312-1-priyanka.dandamudi@intel.com
(cherry picked from commit 8034f6b070)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-29 12:32:15 +01:00
Paolo Abeni
b611b776a9 netfilter pull request 24-02-29
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmXfx4MACgkQ1V2XiooU
 IOSX+g//UHBfqYJASMMQJpWdMwWe7tB2m1LRzLYI+WUdUenK/MEylS7rNp/bwGkW
 42eDeGA0eov7kYNOY0rLB7lQBdUHwCpNZkdetTWFV9eHcEKA8cQ6OqcD1G8i41qg
 sCvObS+K/hq3f7fX9bJ9RvS5RvYoeuS1trw4mezhHwPS+1sj80v4FdqDOFCUqiT3
 65BfeoV65pVVteCRmJQxeeZ4Bepd4LRXW+VVyr3uXli/H87jqQOFxsOTqyXNEXIq
 jMYL0jnbYs0ARbNYXRYySLYQCWmbVXpfnt4JIBRP0S1e6Prby2hqUwJBeyNcXBAu
 CwBTjCEdLIV5G25EWTZWBYQdihct58s0GDRX078Sj/AozQJAWTxBEn0QLhKy2gvH
 2uspA0S2z1PS69hUvHfgGjDiBKw41T2O6D/12NBxI1DOYDLsk7ApE5tKqynUnUIj
 pOLUiolFnJd4JKnGZ/CTATpGi8KX/iSWdX8OElCpGOvKQgZyU8IXrydjcHnJz7b4
 AdsIfpjjZSdz2VU6ZmzLYJrWf6ukAchO5kYL2FIJt/eFEyGqDfwGL36FIO7YGcnu
 NPHtIF23Ldl+GIesc9UT08k+IOsfR9LMbUduJC6Dg63FDrEkFfOv+wXA1eURW3kS
 tq+eWs+QjlCeWG9FgW2NHj3+rGyjQbGOe+v1yTgl1x/BhXNV1cM=
 =2BRo
 -----END PGP SIGNATURE-----

Merge tag 'nf-24-02-29' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

Patch #1 restores NFPROTO_INET with nft_compat, from Ignat Korchagin.

Patch #2 fixes an issue with bridge netfilter and broadcast/multicast
packets.

There is a day 0 bug in br_netfilter when used with connection tracking.

Conntrack assumes that an nf_conn structure that is not yet added to
hash table ("unconfirmed"), is only visible by the current cpu that is
processing the sk_buff.

For bridge this isn't true, sk_buff can get cloned in between, and
clones can be processed in parallel on different cpu.

This patch disables NAT and conntrack helpers for multicast packets.

Patch #3 adds a selftest to cover for the br_netfilter bug.

netfilter pull request 24-02-29

* tag 'nf-24-02-29' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: netfilter: add bridge conntrack + multicast test case
  netfilter: bridge: confirm multicast packets before passing them up the stack
  netfilter: nf_tables: allow NFPROTO_INET in nft_(match/target)_validate()
====================

Link: https://lore.kernel.org/r/20240229000135.8780-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-29 12:16:08 +01:00
Lukasz Majewski
51dd4ee037 net: hsr: Use correct offset for HSR TLV values in supervisory HSR frames
Current HSR implementation uses following supervisory frame (even for
HSRv1 the HSR tag is not is not present):

00000000: 01 15 4e 00 01 2d XX YY ZZ 94 77 10 88 fb 00 01
00000010: 7e 1c 17 06 XX YY ZZ 94 77 10 1e 06 XX YY ZZ 94
00000020: 77 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 00 00 00 00 00 00 00 00

The current code adds extra two bytes (i.e. sizeof(struct hsr_sup_tlv))
when offset for skb_pull() is calculated.
This is wrong, as both 'struct hsrv1_ethhdr_sp' and 'hsrv0_ethhdr_sp'
already have 'struct hsr_sup_tag' defined in them, so there is no need
for adding extra two bytes.

This code was working correctly as with no RedBox support, the check for
HSR_TLV_EOT (0x00) was off by two bytes, which were corresponding to
zeroed padded bytes for minimal packet size.

Fixes: eafaa88b3e ("net: hsr: Add support for redbox supervision frames")
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240228085644.3618044-1-lukma@denx.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-29 10:42:46 +01:00
Mika Kuoppala
785f4cc068 drm/xe: Deny unbinds if uapi ufence pending
If user fence was provided for MAP in vm_bind_ioctl
and it has still not been signalled, deny UNMAP of said
vma with EBUSY as long as unsignalled fence exists.

This guarantees that MAP vs UNMAP sequences won't
escape under the radar if we ever want to track the
client's state wrt to completed and accessible MAPs.
By means of intercepting the ufence release signalling.

v2: find ufence with num_fences > 1 (Matt)
v3: careful on clearing vma ufence (Matt)

Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1159
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215181152.450082-3-mika.kuoppala@linux.intel.com
(cherry picked from commit 158900ade9)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-29 10:39:03 +01:00
Mika Kuoppala
86b3cd6d07 drm/xe: Expose user fence from xe_sync_entry
By allowing getting reference to user fence, we can
control the lifetime outside of sync entries.

This is needed to allow vma to track the associated
user fence that was provided with bind ioctl.

v2: xe_user_fence can be kept opaque (Jani, Matt)
v3: indent fix (Matt)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215181152.450082-2-mika.kuoppala@linux.intel.com
(cherry picked from commit 977e5b82e0)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-29 10:39:02 +01:00
Lucas De Marchi
4ca5c82988 drm/xe: Use pointers in trace events
Commit a0df2cc858 ("drm/xe/xe_bo_move: Enhance xe_bo_move trace")
inadvertently reverted commit 8d038f49c1 ("drm/xe: Fix cast on trace
variable"), breaking the build on 32bits.

As noted by Ville, there's no point in converting the pointers to u64
and add casts everywhere. In fact, it's better to just use %p and let
the address be hashed. Convert all the cases in xe_trace.h to use
pointers.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222144125.2862546-1-lucas.demarchi@intel.com
(cherry picked from commit 7a975748d4)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-29 10:39:02 +01:00
Priyanka Dandamudi
a09946a9a9 drm/xe/xe_bo_move: Enhance xe_bo_move trace
Enhanced xe_bo_move trace to be more readable.
It will help to show the migration details.
Src and dst details.

v2: Modify trace_xe_bo_move(), it takes the integer mem_type
rather than a string.
Make mem_type_to_name() extern, it will be used by trace.(Thomas)

v3: Move mem_type_to_name() to xe_bo.[ch] (Thomas, Matt)

v4: Add device details to reduce ambiquity related to vram0/vram1. (Oak)

v5: Rename mem_type_to_name to xe_mem_type_to_name. (Thomas)

v6: Optimised code to use xe_bo_device(__entry->bo). (Thomas)

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Kempczynski Zbigniew <Zbigniew.Kempczynski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Brian Welty <brian.welty@intel.com>
Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240220044748.948496-1-priyanka.dandamudi@intel.com
(cherry picked from commit a0df2cc858)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-29 10:39:02 +01:00
Arnd Bergmann
12cb2b21c2 drm/xe/mmio: fix build warning for BAR resize on 32-bit
clang complains about a nonsensical test on builds with a 32-bit phys_addr_t,
which means resizing will always fail:

drivers/gpu/drm/xe/xe_mmio.c:109:23: error: result of comparison of constant 4294967296 with expression of type 'resource_size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
  109 |                     root_res->start > 0x100000000ull)
      |                     ~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~

Previously, BAR resize was always disallowed on 32-bit kernels, but
this apparently changed recently. Since 32-bit machines can in theory
support PAE/LPAE for large address spaces, this may end up useful,
so change the driver to shut up the warning but still work when
phys_addr_t/resource_size_t is 64 bit wide.

Fixes: 9a6e6c14bf ("drm/xe/mmio: Use non-atomic writeq/readq variant for 32b")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240226124736.1272949-2-arnd@kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit f5d3983366)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-29 10:39:02 +01:00