Commit Graph

1247940 Commits

Author SHA1 Message Date
Zhengchao Shao 435e202d64 ipv6: init the accept_queue's spinlocks in inet6_create
In commit 198bc90e0e73("tcp: make sure init the accept_queue's spinlocks
once"), the spinlocks of accept_queue are initialized only when socket is
created in the inet4 scenario. The locks are not initialized when socket
is created in the inet6 scenario. The kernel reports the following error:
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<TASK>
	dump_stack_lvl (lib/dump_stack.c:107)
	register_lock_class (kernel/locking/lockdep.c:1289)
	__lock_acquire (kernel/locking/lockdep.c:5015)
	lock_acquire.part.0 (kernel/locking/lockdep.c:5756)
	_raw_spin_lock_bh (kernel/locking/spinlock.c:178)
	inet_csk_listen_stop (net/ipv4/inet_connection_sock.c:1386)
	tcp_disconnect (net/ipv4/tcp.c:2981)
	inet_shutdown (net/ipv4/af_inet.c:935)
	__sys_shutdown (./include/linux/file.h:32 net/socket.c:2438)
	__x64_sys_shutdown (net/socket.c:2445)
	do_syscall_64 (arch/x86/entry/common.c:52)
	entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
RIP: 0033:0x7f52ecd05a3d
Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 73 01 c3 48 8b 0d ab a3 0e 00 f7 d8 64 89 01 48
RSP: 002b:00007f52ecf5dde8 EFLAGS: 00000293 ORIG_RAX: 0000000000000030
RAX: ffffffffffffffda RBX: 00007f52ecf5e640 RCX: 00007f52ecd05a3d
RDX: 00007f52ecc8b188 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00007f52ecf5de20 R08: 00007ffdae45c69f R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 00007f52ecf5e640
R13: 0000000000000000 R14: 00007f52ecc8b060 R15: 00007ffdae45c6e0

Fixes: 198bc90e0e ("tcp: make sure init the accept_queue's spinlocks once")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240122102001.2851701-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-23 13:44:50 +01:00
Zhengchao Shao 234ec0b603 netlink: fix potential sleeping issue in mqueue_flush_file
I analyze the potential sleeping issue of the following processes:
Thread A                                Thread B
...                                     netlink_create  //ref = 1
do_mq_notify                            ...
  sock = netlink_getsockbyfilp          ...     //ref = 2
  info->notify_sock = sock;             ...
...                                     netlink_sendmsg
...                                       skb = netlink_alloc_large_skb  //skb->head is vmalloced
...                                       netlink_unicast
...                                         sk = netlink_getsockbyportid //ref = 3
...                                         netlink_sendskb
...                                           __netlink_sendskb
...                                             skb_queue_tail //put skb to sk_receive_queue
...                                         sock_put //ref = 2
...                                     ...
...                                     netlink_release
...                                       deferred_put_nlk_sk //ref = 1
mqueue_flush_file
  spin_lock
  remove_notification
    netlink_sendskb
      sock_put  //ref = 0
        sk_free
          ...
          __sk_destruct
            netlink_sock_destruct
              skb_queue_purge  //get skb from sk_receive_queue
                ...
                __skb_queue_purge_reason
                  kfree_skb_reason
                    __kfree_skb
                    ...
                    skb_release_all
                      skb_release_head_state
                        netlink_skb_destructor
                          vfree(skb->head)  //sleeping while holding spinlock

In netlink_sendmsg, if the memory pointed to by skb->head is allocated by
vmalloc, and is put to sk_receive_queue queue, also the skb is not freed.
When the mqueue executes flush, the sleeping bug will occur. Use
vfree_atomic instead of vfree in netlink_skb_destructor to solve the issue.

Fixes: c05cdb1b86 ("netlink: allow large data transfers from user-space")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20240122011807.2110357-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-23 11:21:18 +01:00
Kuniyuki Iwashima 97de5a15ed selftest: Don't reuse port for SO_INCOMING_CPU test.
Jakub reported that ASSERT_EQ(cpu, i) in so_incoming_cpu.c seems to
fire somewhat randomly.

  # #  RUN           so_incoming_cpu.before_reuseport.test3 ...
  # # so_incoming_cpu.c:191:test3:Expected cpu (32) == i (0)
  # # test3: Test terminated by assertion
  # #          FAIL  so_incoming_cpu.before_reuseport.test3
  # not ok 3 so_incoming_cpu.before_reuseport.test3

When the test failed, not-yet-accepted CLOSE_WAIT sockets received
SYN with a "challenging" SEQ number, which was sent from an unexpected
CPU that did not create the receiver.

The test basically does:

  1. for each cpu:
    1-1. create a server
    1-2. set SO_INCOMING_CPU

  2. for each cpu:
    2-1. set cpu affinity
    2-2. create some clients
    2-3. let clients connect() to the server on the same cpu
    2-4. close() clients

  3. for each server:
    3-1. accept() all child sockets
    3-2. check if all children have the same SO_INCOMING_CPU with the server

The root cause was the close() in 2-4. and net.ipv4.tcp_tw_reuse.

In a loop of 2., close() changed the client state to FIN_WAIT_2, and
the peer transitioned to CLOSE_WAIT.

In another loop of 2., connect() happened to select the same port of
the FIN_WAIT_2 socket, and it was reused as the default value of
net.ipv4.tcp_tw_reuse is 2.

As a result, the new client sent SYN to the CLOSE_WAIT socket from
a different CPU, and the receiver's sk_incoming_cpu was overwritten
with unexpected CPU ID.

Also, the SYN had a different SEQ number, so the CLOSE_WAIT socket
responded with Challenge ACK.  The new client properly returned RST
and effectively killed the CLOSE_WAIT socket.

This way, all clients were created successfully, but the error was
detected later by 3-2., ASSERT_EQ(cpu, i).

To avoid the failure, let's make sure that (i) the number of clients
is less than the number of available ports and (ii) such reuse never
happens.

Fixes: 6df96146b2 ("selftest: Add test for SO_INCOMING_CPU.")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Tested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20240120031642.67014-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-23 10:48:07 +01:00
Salvatore Dipietro 7267e8dcad tcp: Add memory barrier to tcp_push()
On CPUs with weak memory models, reads and updates performed by tcp_push
to the sk variables can get reordered leaving the socket throttled when
it should not. The tasklet running tcp_wfree() may also not observe the
memory updates in time and will skip flushing any packets throttled by
tcp_push(), delaying the sending. This can pathologically cause 40ms
extra latency due to bad interactions with delayed acks.

Adding a memory barrier in tcp_push removes the bug, similarly to the
previous commit bf06200e73 ("tcp: tsq: fix nonagle handling").
smp_mb__after_atomic() is used to not incur in unnecessary overhead
on x86 since not affected.

Patch has been tested using an AWS c7g.2xlarge instance with Ubuntu
22.04 and Apache Tomcat 9.0.83 running the basic servlet below:

import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HelloWorldServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {
        response.setContentType("text/html;charset=utf-8");
        OutputStreamWriter osw = new OutputStreamWriter(response.getOutputStream(),"UTF-8");
        String s = "a".repeat(3096);
        osw.write(s,0,s.length());
        osw.flush();
    }
}

Load was applied using wrk2 (https://github.com/kinvolk/wrk2) from an AWS
c6i.8xlarge instance. Before the patch an additional 40ms latency from P99.99+
values is observed while, with the patch, the extra latency disappears.

No patch and tcp_autocorking=1
./wrk -t32 -c128 -d40s --latency -R10000  http://172.31.60.173:8080/hello/hello
  ...
 50.000%    0.91ms
 75.000%    1.13ms
 90.000%    1.46ms
 99.000%    1.74ms
 99.900%    1.89ms
 99.990%   41.95ms  <<< 40+ ms extra latency
 99.999%   48.32ms
100.000%   48.96ms

With patch and tcp_autocorking=1
./wrk -t32 -c128 -d40s --latency -R10000  http://172.31.60.173:8080/hello/hello
  ...
 50.000%    0.90ms
 75.000%    1.13ms
 90.000%    1.45ms
 99.000%    1.72ms
 99.900%    1.83ms
 99.990%    2.11ms  <<< no 40+ ms extra latency
 99.999%    2.53ms
100.000%    2.62ms

Patch has been also tested on x86 (m7i.2xlarge instance) which it is not
affected by this issue and the patch doesn't introduce any additional
delay.

Fixes: 7aa5470c2c ("tcp: tsq: move tsq_flags close to sk_wmem_alloc")
Signed-off-by: Salvatore Dipietro <dipiets@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240119190133.43698-1-dipiets@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-23 10:22:31 +01:00
Sharath Srinivasan 13e788deb7 net/rds: Fix UBSAN: array-index-out-of-bounds in rds_cmsg_recv
Syzcaller UBSAN crash occurs in rds_cmsg_recv(),
which reads inc->i_rx_lat_trace[j + 1] with index 4 (3 + 1),
but with array size of 4 (RDS_RX_MAX_TRACES).
Here 'j' is assigned from rs->rs_rx_trace[i] and in-turn from
trace.rx_trace_pos[i] in rds_recv_track_latency(),
with both arrays sized 3 (RDS_MSG_RX_DGRAM_TRACE_MAX). So fix the
off-by-one bounds check in rds_recv_track_latency() to prevent
a potential crash in rds_cmsg_recv().

Found by syzcaller:
=================================================================
UBSAN: array-index-out-of-bounds in net/rds/recv.c:585:39
index 4 is out of range for type 'u64 [4]'
CPU: 1 PID: 8058 Comm: syz-executor228 Not tainted 6.6.0-gd2f51b3516da #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x136/0x150 lib/dump_stack.c:106
 ubsan_epilogue lib/ubsan.c:217 [inline]
 __ubsan_handle_out_of_bounds+0xd5/0x130 lib/ubsan.c:348
 rds_cmsg_recv+0x60d/0x700 net/rds/recv.c:585
 rds_recvmsg+0x3fb/0x1610 net/rds/recv.c:716
 sock_recvmsg_nosec net/socket.c:1044 [inline]
 sock_recvmsg+0xe2/0x160 net/socket.c:1066
 __sys_recvfrom+0x1b6/0x2f0 net/socket.c:2246
 __do_sys_recvfrom net/socket.c:2264 [inline]
 __se_sys_recvfrom net/socket.c:2260 [inline]
 __x64_sys_recvfrom+0xe0/0x1b0 net/socket.c:2260
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x63/0x6b
==================================================================

Fixes: 3289025aed ("RDS: add receive message trace used by application")
Reported-by: Chenyuan Yang <chenyuan0y@gmail.com>
Closes: https://lore.kernel.org/linux-rdma/CALGdzuoVdq-wtQ4Az9iottBqC5cv9ZhcE5q8N7LfYFvkRsOVcw@mail.gmail.com/
Signed-off-by: Sharath Srinivasan <sharath.srinivasan@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 11:24:00 +00:00
Horatiu Vultur aaf632f7ab net: micrel: Fix PTP frame parsing for lan8814
The HW has the capability to check each frame if it is a PTP frame,
which domain it is, which ptp frame type it is, different ip address in
the frame. And if one of these checks fail then the frame is not
timestamp. Most of these checks were disabled except checking the field
minorVersionPTP inside the PTP header. Meaning that once a partner sends
a frame compliant to 8021AS which has minorVersionPTP set to 1, then the
frame was not timestamp because the HW expected by default a value of 0
in minorVersionPTP. This is exactly the same issue as on lan8841.
Fix this issue by removing this check so the userspace can decide on this.

Fixes: ece1950283 ("net: phy: micrel: 1588 support for LAN8814 phy")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 11:02:23 +00:00
David S. Miller 94fa82b095 Merge branch 'dpll-fixes'
Arkadiusz Kubalewski says:

====================
dpll: fix unordered unbind/bind registerer issues

Fix issues when performing unordered unbind/bind of a kernel modules
which are using a dpll device with DPLL_PIN_TYPE_MUX pins.
Currently only serialized bind/unbind of such use case works, fix
the issues and allow for unserialized kernel module bind order.

The issues are observed on the ice driver, i.e.,

$ echo 0000:af:00.0 > /sys/bus/pci/drivers/ice/unbind
$ echo 0000:af:00.1 > /sys/bus/pci/drivers/ice/unbind

results in:

ice 0000:af:00.0: Removed PTP clock
BUG: kernel NULL pointer dereference, address: 0000000000000010
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 7 PID: 71848 Comm: bash Kdump: loaded Not tainted 6.6.0-rc5_next-queue_19th-Oct-2023-01625-g039e5d15e451 #1
Hardware name: Intel Corporation S2600STB/S2600STB, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
RIP: 0010:ice_dpll_rclk_state_on_pin_get+0x2f/0x90 [ice]
Code: 41 57 4d 89 cf 41 56 41 55 4d 89 c5 41 54 55 48 89 f5 53 4c 8b 66 08 48 89 cb 4d 8d b4 24 f0 49 00 00 4c 89 f7 e8 71 ec 1f c5 <0f> b6 5b 10 41 0f b6 84 24 30 4b 00 00 29 c3 41 0f b6 84 24 28 4b
RSP: 0018:ffffc902b179fb60 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8882c1398000 RSI: ffff888c7435cc60 RDI: ffff888c7435cb90
RBP: ffff888c7435cc60 R08: ffffc902b179fbb0 R09: 0000000000000000
R10: ffff888ef1fc8050 R11: fffffffffff82700 R12: ffff888c743581a0
R13: ffffc902b179fbb0 R14: ffff888c7435cb90 R15: 0000000000000000
FS:  00007fdc7dae0740(0000) GS:ffff888c105c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000132c24002 CR4: 00000000007706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 ? __die+0x20/0x70
 ? page_fault_oops+0x76/0x170
 ? exc_page_fault+0x65/0x150
 ? asm_exc_page_fault+0x22/0x30
 ? ice_dpll_rclk_state_on_pin_get+0x2f/0x90 [ice]
 ? __pfx_ice_dpll_rclk_state_on_pin_get+0x10/0x10 [ice]
 dpll_msg_add_pin_parents+0x142/0x1d0
 dpll_pin_event_send+0x7d/0x150
 dpll_pin_on_pin_unregister+0x3f/0x100
 ice_dpll_deinit_pins+0xa1/0x230 [ice]
 ice_dpll_deinit+0x29/0xe0 [ice]
 ice_remove+0xcd/0x200 [ice]
 pci_device_remove+0x33/0xa0
 device_release_driver_internal+0x193/0x200
 unbind_store+0x9d/0xb0
 kernfs_fop_write_iter+0x128/0x1c0
 vfs_write+0x2bb/0x3e0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x59/0x90
 ? filp_close+0x1b/0x30
 ? do_dup2+0x7d/0xd0
 ? syscall_exit_work+0x103/0x130
 ? syscall_exit_to_user_mode+0x22/0x40
 ? do_syscall_64+0x69/0x90
 ? syscall_exit_work+0x103/0x130
 ? syscall_exit_to_user_mode+0x22/0x40
 ? do_syscall_64+0x69/0x90
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7fdc7d93eb97
Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007fff2aa91028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fdc7d93eb97
RDX: 000000000000000d RSI: 00005644814ec9b0 RDI: 0000000000000001
RBP: 00005644814ec9b0 R08: 0000000000000000 R09: 00007fdc7d9b14e0
R10: 00007fdc7d9b13e0 R11: 0000000000000246 R12: 000000000000000d
R13: 00007fdc7d9fb780 R14: 000000000000000d R15: 00007fdc7d9f69e0
 </TASK>
Modules linked in: uinput vfio_pci vfio_pci_core vfio_iommu_type1 vfio irqbypass ixgbevf snd_seq_dummy snd_hrtimer snd_seq snd_timer snd_seq_device snd soundcore overlay qrtr rfkill vfat fat xfs libcrc32c rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common skx_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp irdma rapl intel_cstate ib_uverbs iTCO_wdt iTCO_vendor_support acpi_ipmi intel_uncore mei_me ipmi_si pcspkr i2c_i801 ib_core mei ipmi_devintf intel_pch_thermal ioatdma i2c_smbus ipmi_msghandler lpc_ich joydev acpi_power_meter acpi_pad ext4 mbcache jbd2 sd_mod t10_pi sg ast i2c_algo_bit drm_shmem_helper drm_kms_helper ice crct10dif_pclmul ixgbe crc32_pclmul drm crc32c_intel ahci i40e libahci ghash_clmulni_intel libata mdio dca gnss wmi fuse [last unloaded: iavf]
CR2: 0000000000000010

v6:
- fix memory corruption on error path in patch [v5 2/4]
====================

Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 11:01:11 +00:00
Arkadiusz Kubalewski 7dc5b18ff7 dpll: fix register pin with unregistered parent pin
In case of multiple kernel module instances using the same dpll device:
if only one registers dpll device, then only that one can register
directly connected pins with a dpll device. When unregistered parent is
responsible for determining if the muxed pin can be registered with it
or not, the drivers need to be loaded in serialized order to work
correctly - first the driver instance which registers the direct pins
needs to be loaded, then the other instances could register muxed type
pins.

Allow registration of a pin with a parent even if the parent was not
yet registered, thus allow ability for unserialized driver instance
load order.
Do not WARN_ON notification for unregistered pin, which can be invoked
for described case, instead just return error.

Fixes: 9431063ad3 ("dpll: core: Add DPLL framework base functions")
Fixes: 9d71b54b65 ("dpll: netlink: Add DPLL framework base functions")
Reviewed-by: Jan Glaza <jan.glaza@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 11:01:11 +00:00
Arkadiusz Kubalewski db2ec3c946 dpll: fix userspace availability of pins
If parent pin was unregistered but child pin was not, the userspace
would see the "zombie" pins - the ones that were registered with
a parent pin (dpll_pin_on_pin_register(..)).
Technically those are not available - as there is no dpll device in the
system. Do not dump those pins and prevent userspace from any
interaction with them. Provide a unified function to determine if the
pin is available and use it before acting/responding for user requests.

Fixes: 9d71b54b65 ("dpll: netlink: Add DPLL framework base functions")
Reviewed-by: Jan Glaza <jan.glaza@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 11:01:11 +00:00
Arkadiusz Kubalewski 830ead5fb0 dpll: fix pin dump crash for rebound module
When a kernel module is unbound but the pin resources were not entirely
freed (other kernel module instance of the same PCI device have had kept
the reference to that pin), and kernel module is again bound, the pin
properties would not be updated (the properties are only assigned when
memory for the pin is allocated), prop pointer still points to the
kernel module memory of the kernel module which was deallocated on the
unbind.

If the pin dump is invoked in this state, the result is a kernel crash.
Prevent the crash by storing persistent pin properties in dpll subsystem,
copy the content from the kernel module when pin is allocated, instead of
using memory of the kernel module.

Fixes: 9431063ad3 ("dpll: core: Add DPLL framework base functions")
Fixes: 9d71b54b65 ("dpll: netlink: Add DPLL framework base functions")
Reviewed-by: Jan Glaza <jan.glaza@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 11:01:11 +00:00
Arkadiusz Kubalewski b6a11a7fc4 dpll: fix broken error path in dpll_pin_alloc(..)
If pin type is not expected, or pin properities failed to allocate
memory, the unwind error path shall not destroy pin's xarrays, which
were not yet initialized.
Add new goto label and use it to fix broken error path.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 11:01:11 +00:00
David S. Miller ef48521672 Merge branch 'tun-fixes'
Yunjian Wang says:

====================
fixes for tun

There are few places on the receive path where packet receives and
packet drops were not accounted for. This patchset fixes that issue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 10:58:05 +00:00
Yunjian Wang f1084c427f tun: add missing rx stats accounting in tun_xdp_act
The TUN can be used as vhost-net backend, and it is necessary to
count the packets transmitted from TUN to vhost-net/virtio-net.
However, there are some places in the receive path that were not
taken into account when using XDP. It would be beneficial to also
include new accounting for successfully received bytes using
dev_sw_netstats_rx_add.

Fixes: 761876c857 ("tap: XDP support")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 10:58:04 +00:00
Yunjian Wang 5744ba05e7 tun: fix missing dropped counter in tun_xdp_act
The commit 8ae1aff0b3 ("tuntap: split out XDP logic") includes
dropped counter for XDP_DROP, XDP_ABORTED, and invalid XDP actions.
Unfortunately, that commit missed the dropped counter when error
occurs during XDP_TX and XDP_REDIRECT actions. This patch fixes
this issue.

Fixes: 8ae1aff0b3 ("tuntap: split out XDP logic")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 10:58:04 +00:00
Jakub Kicinski d09486a04f net: fix removing a namespace with conflicting altnames
Mark reports a BUG() when a net namespace is removed.

    kernel BUG at net/core/dev.c:11520!

Physical interfaces moved outside of init_net get "refunded"
to init_net when that namespace disappears. The main interface
name may get overwritten in the process if it would have
conflicted. We need to also discard all conflicting altnames.
Recent fixes addressed ensuring that altnames get moved
with the main interface, which surfaced this problem.

Reported-by: Марк Коренберг <socketpair@gmail.com>
Link: https://lore.kernel.org/all/CAEmTpZFZ4Sv3KwqFOY2WKDHeZYdi0O7N5H1nTvcGp=SAEavtDg@mail.gmail.com/
Fixes: 7663d52209 ("net: check for altname conflicts when changing netdev's netns")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-22 01:09:30 +00:00
Michal Schmidt 359724fa3a idpf: distinguish vports by the dev_port attribute
idpf registers multiple netdevs (virtual ports) for one PCI function,
but it does not provide a way for userspace to distinguish them with
sysfs attributes. Per Documentation/ABI/testing/sysfs-class-net, it is
a bug not to set dev_port for independent ports on the same PCI bus,
device and function.

Without dev_port set, systemd-udevd's default naming policy attempts
to assign the same name ("ens2f0") to all four idpf netdevs on my test
system and obviously fails, leaving three of them with the initial
eth<N> name.

With this patch, systemd-udevd is able to assign unique names to the
netdevs (e.g. "ens2f0", "ens2f0d1", "ens2f0d2", "ens2f0d3").

The Intel-provided out-of-tree idpf driver already sets dev_port. In
this patch I chose to do it in the same place in the idpf_cfg_netdev
function.

Fixes: 0fe45467a1 ("idpf: add create vport and netdev configuration")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-21 18:11:09 +00:00
Eric Dumazet a54d51fb2d udp: fix busy polling
Generic sk_busy_loop_end() only looks at sk->sk_receive_queue
for presence of packets.

Problem is that for UDP sockets after blamed commit, some packets
could be present in another queue: udp_sk(sk)->reader_queue

In some cases, a busy poller could spin until timeout expiration,
even if some packets are available in udp_sk(sk)->reader_queue.

v3: - make sk_busy_loop_end() nicer (Willem)

v2: - add a READ_ONCE(sk->sk_family) in sk_is_inet() to avoid KCSAN splats.
    - add a sk_is_inet() check in sk_is_udp() (Willem feedback)
    - add a sk_is_inet() check in sk_is_tcp().

Fixes: 2276f58ac5 ("udp: use a separate rx queue for packet reception")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-21 18:09:30 +00:00
Kuniyuki Iwashima e3f9bed9be llc: Drop support for ETH_P_TR_802_2.
syzbot reported an uninit-value bug below. [0]

llc supports ETH_P_802_2 (0x0004) and used to support ETH_P_TR_802_2
(0x0011), and syzbot abused the latter to trigger the bug.

  write$tun(r0, &(0x7f0000000040)={@val={0x0, 0x11}, @val, @mpls={[], @llc={@snap={0xaa, 0x1, ')', "90e5dd"}}}}, 0x16)

llc_conn_handler() initialises local variables {saddr,daddr}.mac
based on skb in llc_pdu_decode_sa()/llc_pdu_decode_da() and passes
them to __llc_lookup().

However, the initialisation is done only when skb->protocol is
htons(ETH_P_802_2), otherwise, __llc_lookup_established() and
__llc_lookup_listener() will read garbage.

The missing initialisation existed prior to commit 211ed86510
("net: delete all instances of special processing for token ring").

It removed the part to kick out the token ring stuff but forgot to
close the door allowing ETH_P_TR_802_2 packets to sneak into llc_rcv().

Let's remove llc_tr_packet_type and complete the deprecation.

[0]:
BUG: KMSAN: uninit-value in __llc_lookup_established+0xe9d/0xf90
 __llc_lookup_established+0xe9d/0xf90
 __llc_lookup net/llc/llc_conn.c:611 [inline]
 llc_conn_handler+0x4bd/0x1360 net/llc/llc_conn.c:791
 llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206
 __netif_receive_skb_one_core net/core/dev.c:5527 [inline]
 __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5641
 netif_receive_skb_internal net/core/dev.c:5727 [inline]
 netif_receive_skb+0x58/0x660 net/core/dev.c:5786
 tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1555
 tun_get_user+0x53af/0x66d0 drivers/net/tun.c:2002
 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
 call_write_iter include/linux/fs.h:2020 [inline]
 new_sync_write fs/read_write.c:491 [inline]
 vfs_write+0x8ef/0x1490 fs/read_write.c:584
 ksys_write+0x20f/0x4c0 fs/read_write.c:637
 __do_sys_write fs/read_write.c:649 [inline]
 __se_sys_write fs/read_write.c:646 [inline]
 __x64_sys_write+0x93/0xd0 fs/read_write.c:646
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

Local variable daddr created at:
 llc_conn_handler+0x53/0x1360 net/llc/llc_conn.c:783
 llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206

CPU: 1 PID: 5004 Comm: syz-executor994 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023

Fixes: 211ed86510 ("net: delete all instances of special processing for token ring")
Reported-by: syzbot+b5ad66046b913bc04c6f@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b5ad66046b913bc04c6f
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240119015515.61898-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:30:09 -08:00
Eric Dumazet dad555c816 llc: make llc_ui_sendmsg() more robust against bonding changes
syzbot was able to trick llc_ui_sendmsg(), allocating an skb with no
headroom, but subsequently trying to push 14 bytes of Ethernet header [1]

Like some others, llc_ui_sendmsg() releases the socket lock before
calling sock_alloc_send_skb().
Then it acquires it again, but does not redo all the sanity checks
that were performed.

This fix:

- Uses LL_RESERVED_SPACE() to reserve space.
- Check all conditions again after socket lock is held again.
- Do not account Ethernet header for mtu limitation.

[1]

skbuff: skb_under_panic: text:ffff800088baa334 len:1514 put:14 head:ffff0000c9c37000 data:ffff0000c9c36ff2 tail:0x5dc end:0x6c0 dev:bond0

 kernel BUG at net/core/skbuff.c:193 !
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 6875 Comm: syz-executor.0 Not tainted 6.7.0-rc8-syzkaller-00101-g0802e17d9aca-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : skb_panic net/core/skbuff.c:189 [inline]
 pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
 lr : skb_panic net/core/skbuff.c:189 [inline]
 lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
sp : ffff800096f97000
x29: ffff800096f97010 x28: ffff80008cc8d668 x27: dfff800000000000
x26: ffff0000cb970c90 x25: 00000000000005dc x24: ffff0000c9c36ff2
x23: ffff0000c9c37000 x22: 00000000000005ea x21: 00000000000006c0
x20: 000000000000000e x19: ffff800088baa334 x18: 1fffe000368261ce
x17: ffff80008e4ed000 x16: ffff80008a8310f8 x15: 0000000000000001
x14: 1ffff00012df2d58 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000001 x10: 0000000000ff0100 x9 : e28a51f1087e8400
x8 : e28a51f1087e8400 x7 : ffff80008028f8d0 x6 : 0000000000000000
x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff800082b78714
x2 : 0000000000000001 x1 : 0000000100000000 x0 : 0000000000000089
Call trace:
  skb_panic net/core/skbuff.c:189 [inline]
  skb_under_panic+0x13c/0x140 net/core/skbuff.c:203
  skb_push+0xf0/0x108 net/core/skbuff.c:2451
  eth_header+0x44/0x1f8 net/ethernet/eth.c:83
  dev_hard_header include/linux/netdevice.h:3188 [inline]
  llc_mac_hdr_init+0x110/0x17c net/llc/llc_output.c:33
  llc_sap_action_send_xid_c+0x170/0x344 net/llc/llc_s_ac.c:85
  llc_exec_sap_trans_actions net/llc/llc_sap.c:153 [inline]
  llc_sap_next_state net/llc/llc_sap.c:182 [inline]
  llc_sap_state_process+0x1ec/0x774 net/llc/llc_sap.c:209
  llc_build_and_send_xid_pkt+0x12c/0x1c0 net/llc/llc_sap.c:270
  llc_ui_sendmsg+0x7bc/0xb1c net/llc/af_llc.c:997
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg net/socket.c:745 [inline]
  sock_sendmsg+0x194/0x274 net/socket.c:767
  splice_to_socket+0x7cc/0xd58 fs/splice.c:881
  do_splice_from fs/splice.c:933 [inline]
  direct_splice_actor+0xe4/0x1c0 fs/splice.c:1142
  splice_direct_to_actor+0x2a0/0x7e4 fs/splice.c:1088
  do_splice_direct+0x20c/0x348 fs/splice.c:1194
  do_sendfile+0x4bc/0xc70 fs/read_write.c:1254
  __do_sys_sendfile64 fs/read_write.c:1322 [inline]
  __se_sys_sendfile64 fs/read_write.c:1308 [inline]
  __arm64_sys_sendfile64+0x160/0x3b4 fs/read_write.c:1308
  __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
  invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
  el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
  do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
  el0_svc+0x54/0x158 arch/arm64/kernel/entry-common.c:678
  el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696
  el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:595
Code: aa1803e6 aa1903e7 a90023f5 94792f6a (d4210000)

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-and-tested-by: syzbot+2a7024e9502df538e8ef@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240118183625.4007013-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:28:50 -08:00
Lin Ma 6c21660fe2 vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING
In the vlan_changelink function, a loop is used to parse the nested
attributes IFLA_VLAN_EGRESS_QOS and IFLA_VLAN_INGRESS_QOS in order to
obtain the struct ifla_vlan_qos_mapping. These two nested attributes are
checked in the vlan_validate_qos_map function, which calls
nla_validate_nested_deprecated with the vlan_map_policy.

However, this deprecated validator applies a LIBERAL strictness, allowing
the presence of an attribute with the type IFLA_VLAN_QOS_UNSPEC.
Consequently, the loop in vlan_changelink may parse an attribute of type
IFLA_VLAN_QOS_UNSPEC and believe it carries a payload of
struct ifla_vlan_qos_mapping, which is not necessarily true.

To address this issue and ensure compatibility, this patch introduces two
type checks that skip attributes whose type is not IFLA_VLAN_QOS_MAPPING.

Fixes: 07b5b17e15 ("[VLAN]: Use rtnl_link API")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240118130306.1644001-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:25:06 -08:00
Jakub Kicinski 9b6979563b Merge branch 'bnxt_en-bug-fixes'
Michael Chan says:

====================
bnxt_en: Bug fixes

This series contains 5 miscellaneous fixes.  The fixes include adding
delay for FLR, buffer memory leak, RSS table size calculation,
ethtool self test kernel warning, and mqprio crash.
====================

Link: https://lore.kernel.org/r/20240117234515.226944-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:15:18 -08:00
Michael Chan 467739baf6 bnxt_en: Fix possible crash after creating sw mqprio TCs
The driver relies on netdev_get_num_tc() to get the number of HW
offloaded mqprio TCs to allocate and free TX rings.  This won't
work and can potentially crash the system if software mqprio or
taprio TCs have been setup.  netdev_get_num_tc() will return the
number of software TCs and it may cause the driver to allocate or
free more TX rings that it should.  Fix it by adding a bp->num_tc
field to store the number of HW offload mqprio TCs for the device.
Use bp->num_tc instead of netdev_get_num_tc().

This fixes a crash like this:

BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 42b8404067 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 120 PID: 8661 Comm: ifconfig Kdump: loaded Tainted: G           OE     5.18.16 #1
Hardware name: Lenovo ThinkSystem SR650 V3/SB27A92818, BIOS ESE114N-2.12 04/25/2023
RIP: 0010:bnxt_hwrm_cp_ring_alloc_p5+0x10/0x90 [bnxt_en]
Code: 41 5c 41 5d 41 5e c3 cc cc cc cc 41 8b 44 24 08 66 89 03 eb c6 e8 b0 f1 7d db 0f 1f 44 00 00 41 56 41 55 41 54 55 48 89 fd 53 <48> 8b 06 48 89 f3 48 81 c6 28 01 00 00 0f b6 96 13 ff ff ff 44 8b
RSP: 0018:ff65907660d1fa88 EFLAGS: 00010202
RAX: 0000000000000010 RBX: ff4dde1d907e4980 RCX: f400000000000000
RDX: 0000000000000010 RSI: 0000000000000000 RDI: ff4dde1d907e4980
RBP: ff4dde1d907e4980 R08: 000000000000000f R09: 0000000000000000
R10: ff4dde5f02671800 R11: 0000000000000008 R12: 0000000088888889
R13: 0500000000000000 R14: 00f0000000000000 R15: ff4dde5f02671800
FS:  00007f4b126b5740(0000) GS:ff4dde9bff600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000416f9c6002 CR4: 0000000000771ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 bnxt_hwrm_ring_alloc+0x204/0x770 [bnxt_en]
 bnxt_init_chip+0x4d/0x680 [bnxt_en]
 ? bnxt_poll+0x1a0/0x1a0 [bnxt_en]
 __bnxt_open_nic+0xd2/0x740 [bnxt_en]
 bnxt_open+0x10b/0x220 [bnxt_en]
 ? raw_notifier_call_chain+0x41/0x60
 __dev_open+0xf3/0x1b0
 __dev_change_flags+0x1db/0x250
 dev_change_flags+0x21/0x60
 devinet_ioctl+0x590/0x720
 ? avc_has_extended_perms+0x1b7/0x420
 ? _copy_from_user+0x3a/0x60
 inet_ioctl+0x189/0x1c0
 ? wp_page_copy+0x45a/0x6e0
 sock_do_ioctl+0x42/0xf0
 ? ioctl_has_perm.constprop.0.isra.0+0xbd/0x120
 sock_ioctl+0x1ce/0x2e0
 __x64_sys_ioctl+0x87/0xc0
 do_syscall_64+0x59/0x90
 ? syscall_exit_work+0x103/0x130
 ? syscall_exit_to_user_mode+0x12/0x30
 ? do_syscall_64+0x69/0x90
 ? exc_page_fault+0x62/0x150

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240117234515.226944-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:15:13 -08:00
Michael Chan c20f482129 bnxt_en: Prevent kernel warning when running offline self test
We call bnxt_half_open_nic() to setup the chip partially to run
loopback tests.  The rings and buffers are initialized normally
so that we can transmit and receive packets in loopback mode.
That means page pool buffers are allocated for the aggregation ring
just like the normal case.  NAPI is not needed because we are just
polling for the loopback packets.

When we're done with the loopback tests, we call bnxt_half_close_nic()
to clean up.  When freeing the page pools, we hit a WARN_ON()
in page_pool_unlink_napi() because the NAPI state linked to the
page pool is uninitialized.

The simplest way to avoid this warning is just to initialize the
NAPIs during half open and delete the NAPIs during half close.
Trying to skip the page pool initialization or skip linking of
NAPI during half open will be more complicated.

This fix avoids this warning:

WARNING: CPU: 4 PID: 46967 at net/core/page_pool.c:946 page_pool_unlink_napi+0x1f/0x30
CPU: 4 PID: 46967 Comm: ethtool Tainted: G S      W          6.7.0-rc5+ #22
Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.3.8 08/31/2021
RIP: 0010:page_pool_unlink_napi+0x1f/0x30
Code: 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8b 47 18 48 85 c0 74 1b 48 8b 50 10 83 e2 01 74 08 8b 40 34 83 f8 ff 74 02 <0f> 0b 48 c7 47 18 00 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90
RSP: 0018:ffa000003d0dfbe8 EFLAGS: 00010246
RAX: ff110003607ce640 RBX: ff110010baf5d000 RCX: 0000000000000008
RDX: 0000000000000000 RSI: ff110001e5e522c0 RDI: ff110010baf5d000
RBP: ff11000145539b40 R08: 0000000000000001 R09: ffffffffc063f641
R10: ff110001361eddb8 R11: 000000000040000f R12: 0000000000000001
R13: 000000000000001c R14: ff1100014553a080 R15: 0000000000003fc0
FS:  00007f9301c4f740(0000) GS:ff1100103fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f91344fa8f0 CR3: 00000003527cc005 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 ? __warn+0x81/0x140
 ? page_pool_unlink_napi+0x1f/0x30
 ? report_bug+0x102/0x200
 ? handle_bug+0x44/0x70
 ? exc_invalid_op+0x13/0x60
 ? asm_exc_invalid_op+0x16/0x20
 ? bnxt_free_ring.isra.123+0xb1/0xd0 [bnxt_en]
 ? page_pool_unlink_napi+0x1f/0x30
 page_pool_destroy+0x3e/0x150
 bnxt_free_mem+0x441/0x5e0 [bnxt_en]
 bnxt_half_close_nic+0x2a/0x40 [bnxt_en]
 bnxt_self_test+0x21d/0x450 [bnxt_en]
 __dev_ethtool+0xeda/0x2e30
 ? native_queued_spin_lock_slowpath+0x17f/0x2b0
 ? __link_object+0xa1/0x160
 ? _raw_spin_unlock_irqrestore+0x23/0x40
 ? __create_object+0x5f/0x90
 ? __kmem_cache_alloc_node+0x317/0x3c0
 ? dev_ethtool+0x59/0x170
 dev_ethtool+0xa7/0x170
 dev_ioctl+0xc3/0x530
 sock_do_ioctl+0xa8/0xf0
 sock_ioctl+0x270/0x310
 __x64_sys_ioctl+0x8c/0xc0
 do_syscall_64+0x3e/0xf0
 entry_SYSCALL_64_after_hwframe+0x6e/0x76

Fixes: 294e39e0d0 ("bnxt: hook NAPIs to page pools")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240117234515.226944-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:15:13 -08:00
Michael Chan 523384a6aa bnxt_en: Fix RSS table entries calculation for P5_PLUS chips
The existing formula used in the driver to calculate the number of RSS
table entries is to round up the number of RX rings to the next integer
multiples of 64 (e.g. 64, 128, 192, ..).  This is incorrect.  The valid
values supported by the chip are 64, 128, 256, 512 only (power of 2
starting from 64).  When the number of RX rings is greater than 128, the
entry size will likely be wrong.  Firmware will round down the invalid
value (e.g. 192 rounded down to 128) provided by the driver, causing some
RSS rings to not receive any packets.

We already have an existing function bnxt_calc_nr_ring_pages() to
do this calculation.  Use it in bnxt_get_nr_rss_ctxs() to calculate the
number of RSS contexts correctly for P5_PLUS chips.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Fixes: 7b3af4f75b ("bnxt_en: Add RSS support for 57500 chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240117234515.226944-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:15:13 -08:00
Michael Chan 2ad8e57338 bnxt_en: Fix memory leak in bnxt_hwrm_get_rings()
bnxt_hwrm_get_rings() can abort and return error when there are not
enough ring resources.  It aborts without releasing the HWRM DMA buffer,
causing a dma_pool_destroy warning when the driver is unloaded:

bnxt_en 0000:99:00.0: dma_pool_destroy bnxt_hwrm, 000000005b089ba8 busy

Fixes: f1e50b276d ("bnxt_en: Fix trimming of P5 RX and TX rings")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240117234515.226944-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:15:12 -08:00
Michael Chan 3c1069fa42 bnxt_en: Wait for FLR to complete during probe
The first message to firmware may fail if the device is undergoing FLR.
The driver has some recovery logic for this failure scenario but we must
wait 100 msec for FLR to complete before proceeding.  Otherwise the
recovery will always fail.

Fixes: ba02629ff6 ("bnxt_en: log firmware status on firmware init failure")
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240117234515.226944-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:15:12 -08:00
Zhengchao Shao 198bc90e0e tcp: make sure init the accept_queue's spinlocks once
When I run syz's reproduction C program locally, it causes the following
issue:
pvqspinlock: lock 0xffff9d181cd5c660 has corrupted value 0x0!
WARNING: CPU: 19 PID: 21160 at __pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508)
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508)
Code: 73 56 3a ff 90 c3 cc cc cc cc 8b 05 bb 1f 48 01 85 c0 74 05 c3 cc cc cc cc 8b 17 48 89 fe 48 c7 c7
30 20 ce 8f e8 ad 56 42 ff <0f> 0b c3 cc cc cc cc 0f 0b 0f 1f 40 00 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffa8d200604cb8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9d1ef60e0908
RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9d1ef60e0900
RBP: ffff9d181cd5c280 R08: 0000000000000000 R09: 00000000ffff7fff
R10: ffffa8d200604b68 R11: ffffffff907dcdc8 R12: 0000000000000000
R13: ffff9d181cd5c660 R14: ffff9d1813a3f330 R15: 0000000000001000
FS:  00007fa110184640(0000) GS:ffff9d1ef60c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000011f65e000 CR4: 00000000000006f0
Call Trace:
<IRQ>
  _raw_spin_unlock (kernel/locking/spinlock.c:186)
  inet_csk_reqsk_queue_add (net/ipv4/inet_connection_sock.c:1321)
  inet_csk_complete_hashdance (net/ipv4/inet_connection_sock.c:1358)
  tcp_check_req (net/ipv4/tcp_minisocks.c:868)
  tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2260)
  ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205)
  ip_local_deliver_finish (net/ipv4/ip_input.c:234)
  __netif_receive_skb_one_core (net/core/dev.c:5529)
  process_backlog (./include/linux/rcupdate.h:779)
  __napi_poll (net/core/dev.c:6533)
  net_rx_action (net/core/dev.c:6604)
  __do_softirq (./arch/x86/include/asm/jump_label.h:27)
  do_softirq (kernel/softirq.c:454 kernel/softirq.c:441)
</IRQ>
<TASK>
  __local_bh_enable_ip (kernel/softirq.c:381)
  __dev_queue_xmit (net/core/dev.c:4374)
  ip_finish_output2 (./include/net/neighbour.h:540 net/ipv4/ip_output.c:235)
  __ip_queue_xmit (net/ipv4/ip_output.c:535)
  __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
  tcp_rcv_synsent_state_process (net/ipv4/tcp_input.c:6469)
  tcp_rcv_state_process (net/ipv4/tcp_input.c:6657)
  tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1929)
  __release_sock (./include/net/sock.h:1121 net/core/sock.c:2968)
  release_sock (net/core/sock.c:3536)
  inet_wait_for_connect (net/ipv4/af_inet.c:609)
  __inet_stream_connect (net/ipv4/af_inet.c:702)
  inet_stream_connect (net/ipv4/af_inet.c:748)
  __sys_connect (./include/linux/file.h:45 net/socket.c:2064)
  __x64_sys_connect (net/socket.c:2073 net/socket.c:2070 net/socket.c:2070)
  do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
  RIP: 0033:0x7fa10ff05a3d
  Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89
  c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab a3 0e 00 f7 d8 64 89 01 48
  RSP: 002b:00007fa110183de8 EFLAGS: 00000202 ORIG_RAX: 000000000000002a
  RAX: ffffffffffffffda RBX: 0000000020000054 RCX: 00007fa10ff05a3d
  RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000003
  RBP: 00007fa110183e20 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000202 R12: 00007fa110184640
  R13: 0000000000000000 R14: 00007fa10fe8b060 R15: 00007fff73e23b20
</TASK>

The issue triggering process is analyzed as follows:
Thread A                                       Thread B
tcp_v4_rcv	//receive ack TCP packet       inet_shutdown
  tcp_check_req                                  tcp_disconnect //disconnect sock
  ...                                              tcp_set_state(sk, TCP_CLOSE)
    inet_csk_complete_hashdance                ...
      inet_csk_reqsk_queue_add                 inet_listen  //start listen
        spin_lock(&queue->rskq_lock)             inet_csk_listen_start
        ...                                        reqsk_queue_alloc
        ...                                          spin_lock_init
        spin_unlock(&queue->rskq_lock)	//warning

When the socket receives the ACK packet during the three-way handshake,
it will hold spinlock. And then the user actively shutdowns the socket
and listens to the socket immediately, the spinlock will be initialized.
When the socket is going to release the spinlock, a warning is generated.
Also the same issue to fastopenq.lock.

Move init spinlock to inet_create and inet_accept to make sure init the
accept_queue's spinlocks once.

Fixes: fff1f3001c ("tcp: add a spinlock to protect struct request_sock_queue")
Fixes: 168a8f5805 ("tcp: TCP Fast Open Server - main code path")
Reported-by: Ming Shu <sming56@aliyun.com>
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240118012019.1751966-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:13:25 -08:00
Benjamin Poirier b01f15a757 selftests: bonding: Increase timeout to 1200s
When tests are run by runner.sh, bond_options.sh gets killed before
it can complete:

make -C tools/testing/selftests run_tests TARGETS="drivers/net/bonding"
	[...]
	# timeout set to 120
	# selftests: drivers/net/bonding: bond_options.sh
	# TEST: prio (active-backup miimon primary_reselect 0)                [ OK ]
	# TEST: prio (active-backup miimon primary_reselect 1)                [ OK ]
	# TEST: prio (active-backup miimon primary_reselect 2)                [ OK ]
	# TEST: prio (active-backup arp_ip_target primary_reselect 0)         [ OK ]
	# TEST: prio (active-backup arp_ip_target primary_reselect 1)         [ OK ]
	# TEST: prio (active-backup arp_ip_target primary_reselect 2)         [ OK ]
	#
	not ok 7 selftests: drivers/net/bonding: bond_options.sh # TIMEOUT 120 seconds

This test includes many sleep statements, at least some of which are
related to timers in the operation of the bonding driver itself. Increase
the test timeout to allow the test to complete.

I ran the test in slightly different VMs (including one without HW
virtualization support) and got runtimes of 13m39.760s, 13m31.238s, and
13m2.956s. Use a ~1.5x "safety factor" and set the timeout to 1200s.

Fixes: 42a8d4aaea ("selftests: bonding: add bonding prio option test")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240116104402.1203850a@kernel.org/#t
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240118001233.304759-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-19 21:12:31 -08:00
Wen Gu dbc153fd3c net/smc: fix illegal rmb_desc access in SMC-D connection dump
A crash was found when dumping SMC-D connections. It can be reproduced
by following steps:

- run nginx/wrk test:
  smc_run nginx
  smc_run wrk -t 16 -c 1000 -d <duration> -H 'Connection: Close' <URL>

- continuously dump SMC-D connections in parallel:
  watch -n 1 'smcss -D'

 BUG: kernel NULL pointer dereference, address: 0000000000000030
 CPU: 2 PID: 7204 Comm: smcss Kdump: loaded Tainted: G	E      6.7.0+ #55
 RIP: 0010:__smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag]
 Call Trace:
  <TASK>
  ? __die+0x24/0x70
  ? page_fault_oops+0x66/0x150
  ? exc_page_fault+0x69/0x140
  ? asm_exc_page_fault+0x26/0x30
  ? __smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag]
  ? __kmalloc_node_track_caller+0x35d/0x430
  ? __alloc_skb+0x77/0x170
  smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
  smc_diag_dump+0x26/0x60 [smc_diag]
  netlink_dump+0x19f/0x320
  __netlink_dump_start+0x1dc/0x300
  smc_diag_handler_dump+0x6a/0x80 [smc_diag]
  ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
  sock_diag_rcv_msg+0x121/0x140
  ? __pfx_sock_diag_rcv_msg+0x10/0x10
  netlink_rcv_skb+0x5a/0x110
  sock_diag_rcv+0x28/0x40
  netlink_unicast+0x22a/0x330
  netlink_sendmsg+0x1f8/0x420
  __sock_sendmsg+0xb0/0xc0
  ____sys_sendmsg+0x24e/0x300
  ? copy_msghdr_from_user+0x62/0x80
  ___sys_sendmsg+0x7c/0xd0
  ? __do_fault+0x34/0x160
  ? do_read_fault+0x5f/0x100
  ? do_fault+0xb0/0x110
  ? __handle_mm_fault+0x2b0/0x6c0
  __sys_sendmsg+0x4d/0x80
  do_syscall_64+0x69/0x180
  entry_SYSCALL_64_after_hwframe+0x6e/0x76

It is possible that the connection is in process of being established
when we dump it. Assumed that the connection has been registered in a
link group by smc_conn_create() but the rmb_desc has not yet been
initialized by smc_buf_create(), thus causing the illegal access to
conn->rmb_desc. So fix it by checking before dump.

Fixes: 4b1b7d3b30 ("net/smc: add SMC-D diag support")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-19 12:04:17 +00:00
Linus Torvalds 736b5545d3 Including fixes from bpf and netfilter.
Previous releases - regressions:
 
  - Revert "net: rtnetlink: Enslave device before bringing it up",
    breaks the case inverse to the one it was trying to fix
 
  - net: dsa: fix oob access in DSA's netdevice event handler
    dereference netdev_priv() before check its a DSA port
 
  - sched: track device in tcf_block_get/put_ext() only for clsact
    binder types
 
  - net: tls, fix WARNING in __sk_msg_free when record becomes full
    during splice and MORE hint set
 
  - sfp-bus: fix SFP mode detect from bitrate
 
  - drv: stmmac: prevent DSA tags from breaking COE
 
 Previous releases - always broken:
 
  - bpf: fix no forward progress in in bpf_iter_udp if output
    buffer is too small
 
  - bpf: reject variable offset alu on registers with a type
    of PTR_TO_FLOW_KEYS to prevent oob access
 
  - netfilter: tighten input validation
 
  - net: add more sanity check in virtio_net_hdr_to_skb()
 
  - rxrpc: fix use of Don't Fragment flag on RESPONSE packets,
    avoid infinite loop
 
  - amt: do not use the portion of skb->cb area which may get clobbered
 
  - mptcp: improve validation of the MPTCPOPT_MP_JOIN MCTCP option
 
 Misc:
 
  - spring cleanup of inactive maintainers
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmWpnvoACgkQMUZtbf5S
 Irvskg/+Or5tETxOmpQXxnj6ECZyrSp0Jcyd7+TIcos/7JfPdn3Kebl004SG4h/s
 bwKDOIIP1iSjQ+0NFsPjyYIVd6wFuCElSB7npV5uQAT6ptXx7A4Ym68/rVxodI8T
 6hiYV/mlPuZF8JjRhtp/VJL8sY1qnG7RIUB4oH3y9HQNfwZX0lIWChuUilHuWfbq
 zQ2Iu97tMkoIBjXrkIT3Qaj0aFxYbjCOrg9zy+FZ69a7Rmrswr//7amlCH6saNTx
 Ku7Wl8FXhe7O23OiM6GSl7AechSM1aJ5kOS3orseej0+aSp9eH3ekYGmbsQr6sjz
 ix/eZ7V7SUkJK3bEH5haeymk4TDV3lHE8SziMbosK4wVbHOyPwEmqCxppADYJLZs
 WycHZKcTBluFBOxknAofH7m5Hh0ToXkeTfpptSSGtRB4WncAOMsMapr3yS4WXg/q
 AnOo/tzCBgMrnSJtD/kjqgUiCk8vYoLc8lBR9K74l0zqI1sf13OfuTHvEgqIS6z1
 Ir/ewlAV6fCH8gQbyzjKUVlyjZS4+vFv19xg/2GgLf+LdyzcCOxUZkND3/DE6+OA
 Dgf9gtABYU4hGXMUfTfml3KCBTF65QmY8dIh17zraNylYUHEJ2lI4D+sdiqWUrXb
 mXPBJh4nOPwIV5t2gT80skNwF3aWPr6l4ieY2codSbP04rO74S8=
 =YhQQ
 -----END PGP SIGNATURE-----

Merge tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf and netfilter.

  Previous releases - regressions:

   - Revert "net: rtnetlink: Enslave device before bringing it up",
     breaks the case inverse to the one it was trying to fix

   - net: dsa: fix oob access in DSA's netdevice event handler
     dereference netdev_priv() before check its a DSA port

   - sched: track device in tcf_block_get/put_ext() only for clsact
     binder types

   - net: tls, fix WARNING in __sk_msg_free when record becomes full
     during splice and MORE hint set

   - sfp-bus: fix SFP mode detect from bitrate

   - drv: stmmac: prevent DSA tags from breaking COE

  Previous releases - always broken:

   - bpf: fix no forward progress in in bpf_iter_udp if output buffer is
     too small

   - bpf: reject variable offset alu on registers with a type of
     PTR_TO_FLOW_KEYS to prevent oob access

   - netfilter: tighten input validation

   - net: add more sanity check in virtio_net_hdr_to_skb()

   - rxrpc: fix use of Don't Fragment flag on RESPONSE packets, avoid
     infinite loop

   - amt: do not use the portion of skb->cb area which may get clobbered

   - mptcp: improve validation of the MPTCPOPT_MP_JOIN MCTCP option

  Misc:

   - spring cleanup of inactive maintainers"

* tag 'net-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
  i40e: Include types.h to some headers
  ipv6: mcast: fix data-race in ipv6_mc_down / mld_ifc_work
  selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes
  selftests: mlxsw: qos_pfc: Remove wrong description
  mlxsw: spectrum_router: Register netdevice notifier before nexthop
  mlxsw: spectrum_acl_tcam: Fix stack corruption
  mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path
  mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure
  ethtool: netlink: Add missing ethnl_ops_begin/complete
  selftests: bonding: Add more missing config options
  selftests: netdevsim: add a config file
  libbpf: warn on unexpected __arg_ctx type when rewriting BTF
  selftests/bpf: add tests confirming type logic in kernel for __arg_ctx
  bpf: enforce types for __arg_ctx-tagged arguments in global subprogs
  bpf: extract bpf_ctx_convert_map logic and make it more reusable
  libbpf: feature-detect arg:ctx tag support in kernel
  ipvs: avoid stat macros calls from preemptible context
  netfilter: nf_tables: reject NFT_SET_CONCAT with not field length description
  netfilter: nf_tables: skip dead set elements in netlink dump
  netfilter: nf_tables: do not allow mismatch field size and set key length
  ...
2024-01-18 17:33:50 -08:00
Linus Torvalds ed8d84530a This cycle, I2C removes the currently unused CLASS_DDC support
(controllers set the flag, but there is no client to use it). Also,
 CLASS_SPD support gets simplified to prepare removal in the future.
 Class based instantiation is not recommended these days anyhow.
 Furthermore, I2C core now creates a debugfs directory per I2C adapter.
 Current bus driver users were converted to use it. Then, there are also
 quite some driver updates. Standing out are patches for the wmt-driver
 which is refactored to support more variants. This is the rebased pull
 request where a large series for the designware driver was dropped.
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEOZGx6rniZ1Gk92RdFA3kzBSgKbYFAmWph0UPHHdzYUBrZXJu
 ZWwub3JnAAoJEBQN5MwUoCm2kbIQAJotSmX0mM+nNPReYCMMiloxoxUwgpiErNwY
 WDrYQSezthAJ1LDsGOEeLcE4f4I+UcUHBO1BoERtOZg3cGtE0Ii5N845sp100S9O
 ktyaKS5utoErymThWFFrnZX60/8yKXUMzZmNzy96560gPcxbFyyyVhKfBSPzK9T+
 O8CGu7GRNqgWHlvH3yqGeCbreWYrYVSrluEpBu6807cp3zDxrU+autOnsewm5+md
 ka3DdqrbxJSblYK8fJKESAUgkRmZgYKbgl0iiCuqX+ib6I4OA3Z68ny7dl0fY3Ws
 vwt7d88SaBKDdJmUZyb/sm4aJsW69GN+ECZolxrn4TIw45k4tes2s6Ma5+TV3E9h
 Fd1RuqduFEqQ7cj31UPe2x8rgj5Fo5nbjCWxdZv+/3zF8+cHwi8iwkp2PScsPCsa
 fmCdehUE5DrgobsRNANe6XJzxY5wp2VNpGEWKeaQz2Z0/d9T1YFS7a8aewvhXoPC
 isZboi6GQh2XoE8UgGJa29VUuaIkUW513DwCGw8mz1yKN+kHGcsRXXjkjaZoQn3U
 MMvh/zkI2Hpy/m2R8PWeIq5XhLJvmlZ19JJzUHJIjXh9Fn9EVtXhlUleh6mzMfeM
 n8NOg7Eukep2sBgmaufkUKz2Jtogs59YDSXZEvqJjIkPM2Wi0hA18Qj+pilES1ff
 3ckk3mxY
 =8D3Q
 -----END PGP SIGNATURE-----

Merge tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:
 "This removes the currently unused CLASS_DDC support (controllers set
  the flag, but there is no client to use it).

  Also, CLASS_SPD support gets simplified to prepare removal in the
  future. Class based instantiation is not recommended these days
  anyhow.

  Furthermore, I2C core now creates a debugfs directory per I2C adapter.
  Current bus driver users were converted to use it.

  Finally, quite some driver updates. Standing out are patches for the
  wmt-driver which is refactored to support more variants.

  This is the rebased pull request where a large series for the
  designware driver was dropped"

* tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
  MAINTAINERS: use proper email for my I2C work
  i2c: stm32f7: add support for stm32mp25 soc
  i2c: stm32f7: perform I2C_ISR read once at beginning of event isr
  dt-bindings: i2c: document st,stm32mp25-i2c compatible
  i2c: stm32f7: simplify status messages in case of errors
  i2c: stm32f7: perform most of irq job in threaded handler
  i2c: stm32f7: use dev_err_probe upon calls of devm_request_irq
  i2c: i801: Add lis3lv02d for Dell XPS 15 7590
  i2c: i801: Add lis3lv02d for Dell Precision 3540
  i2c: wmt: Reduce redundant: REG_CR setting
  i2c: wmt: Reduce redundant: function parameter
  i2c: wmt: Reduce redundant: clock mode setting
  i2c: wmt: Reduce redundant: wait event complete
  i2c: wmt: Reduce redundant: bus busy check
  i2c: mux: reg: Remove class-based device auto-detection support
  i2c: make i2c_bus_type const
  dt-bindings: at24: add ROHM BR24G04
  eeprom: at24: use of_match_ptr()
  i2c: cpm: Remove linux,i2c-index conversion from be32
  i2c: imx: Make SDA actually optional for bus recovering
  ...
2024-01-18 17:29:01 -08:00
Linus Torvalds 378de6df19 RTC for 6.8
Subsytem:
 
 New driver:
  - Analog Devices MAX31335
  - Nuvoton ma35d1
  - Texas Instrument TPS6594 PMIC RTC
 
 Drivers:
  - cmos: use ACPI alarm instead of HPET on recent AMD platforms
  - nuvoton: add NCT3015Y-R and NCT3018Y-R support
  - rv8803: proper suspend/resume and wakeup-source support
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEBqsFVZXh8s/0O5JiY6TcMGxwOjIFAmWpkqQACgkQY6TcMGxw
 OjJ2eQ//cDaUPgvOUQ2+qQiZOyOrjnck+QEyXZr2mcnNfRnMPiafJX09v3W4BMPh
 0oOVNF5BbGofTgJRnTC9BCRQnq8XZUwXaMr1B+mLQF4V9Xq3896pILkdRmd0q7EW
 3qwKCNP1vkYx2hGyWB9wVAMESAdUIFHCLxJWeQZ3ESGUacMoON0cdFa96TUx4fKa
 m29ybMTHRHKpnZsIYpegxG42lWp84IPvTbtySbT52dr2ucLToVos/dX23juQ40D0
 nyUa8Q+g6aLoTxjZPcFwK6dHJJIwWz56s40IbMRGr6dVfRis5QZfIQ/cB8ULj48L
 AkCtN6kptVsov/W2R9ZriTf5p53K7Fmwz+dhccW7SxA82cGyswWOD3BNzzOYnzPY
 pKSVeTnR7mD2IC28pGrekSpg3ExuNu/4+WHnz0EwpjgyXmtlxnfK6oV9nf8bIUsn
 JsY3tNOvenv8NQSQ1at3GugeQ4bGMbxay6pL9zm5EZjYGMDX0z7IcPFr9KeYC5tJ
 60dYWCGuB2JF0WocuAXSNrj2l9VFFqhn7OdDVuiB00rBROQcKKV/H30EnmAwRUI1
 8SZhVJ0TIeoG6XsajhbapvOH8EWUnKPn1mgJxSiyVR6Hz93oZldIPK3cUtfjcltx
 xl7LdVeseE1CXGh4g222CI0MCX6x1QPQ5L8jhgc67dyTUx/wQ4Q=
 =7u4i
 -----END PGP SIGNATURE-----

Merge tag 'rtc-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "There are three new drivers this cycle. Also the cmos driver is
  getting fixes for longstanding wakeup issues on AMD.

  New drivers:
   - Analog Devices MAX31335
   - Nuvoton ma35d1
   - Texas Instrument TPS6594 PMIC RTC

  Drivers:
   - cmos: use ACPI alarm instead of HPET on recent AMD platforms
   - nuvoton: add NCT3015Y-R and NCT3018Y-R support
   - rv8803: proper suspend/resume and wakeup-source support"

* tag 'rtc-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (26 commits)
  rtc: nuvoton: Compatible with NCT3015Y-R and NCT3018Y-R
  rtc: da9063: Use dev_err_probe()
  rtc: da9063: Use device_get_match_data()
  rtc: da9063: Make IRQ as optional
  rtc: max31335: Fix comparison in max31335_volatile_reg()
  rtc: max31335: use regmap_update_bits_check
  rtc: max31335: remove unecessary locking
  rtc: max31335: add driver support
  dt-bindings: rtc: max31335: add max31335 bindings
  rtc: rv8803: add wakeup-source support
  rtc: ac100: remove misuses of kernel-doc
  rtc: class: Remove usage of the deprecated ida_simple_xx() API
  rtc: MAINTAINERS: drop Alessandro Zummo
  rtc: ma35d1: remove hardcoded UIE support
  dt-bindings: rtc: qcom-pm8xxx: fix inconsistent example
  rtc: rv8803: Add power management support
  rtc: ds3232: avoid unused-const-variable warning
  rtc: lpc24xx: add missing dependency
  rtc: tps6594: Add driver for TPS6594 RTC
  rtc: Add driver for Nuvoton ma35d1 rtc controller
  ...
2024-01-18 17:25:39 -08:00
Linus Torvalds 0f289bdd41 Input updates for 6.8 merge window:
- a new driver for Adafruit Seesaw gamepad device
 
 - Zforce touchscreen will handle standard device properties for axis
   swap/inversion
 
 - handling of advanced sensitivity settings in Microchip CAP11xx
   capacitive sensor driver
 
 - more drivers have been converted to use newer gpiod API
 
 - support for dedicated wakeup IRQs in gpio-keys dirver
 
 - support for slider gestures and OTP variants in iqs269a driver
 
 - atkbd will report keyboard version as 0xab83 in cases when GET ID
   command was skipped (to deal with problematic firmware on newer
   laptops), restoring the previous behavior
 
 - other assorted cleanups and changes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQST2eWILY88ieB2DOtAj56VGEWXnAUCZalqoQAKCRBAj56VGEWX
 nOqHAP4u4b/r4w2aeULy3kpESgbUQ1vXpFUus/6AHTw1FAtbsgD/XxV3ZWKZ0H8J
 VfZ81yXvT3WstJM7p7YNP0GGXJ/HRQg=
 =l0Z8
 -----END PGP SIGNATURE-----

Merge tag 'input-for-v6.8-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

 - a new driver for Adafruit Seesaw gamepad device

 - Zforce touchscreen will handle standard device properties for axis
   swap/inversion

 - handling of advanced sensitivity settings in Microchip CAP11xx
   capacitive sensor driver

 - more drivers have been converted to use newer gpiod API

 - support for dedicated wakeup IRQs in gpio-keys dirver

 - support for slider gestures and OTP variants in iqs269a driver

 - atkbd will report keyboard version as 0xab83 in cases when GET ID
   command was skipped (to deal with problematic firmware on newer
   laptops), restoring the previous behavior

 - other assorted cleanups and changes

* tag 'input-for-v6.8-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (44 commits)
  Input: atkbd - use ab83 as id when skipping the getid command
  Input: driver for Adafruit Seesaw Gamepad
  dt-bindings: input: bindings for Adafruit Seesaw Gamepad
  Input: da9063_onkey - avoid explicitly setting input's parent
  Input: da9063_onkey - avoid using OF-specific APIs
  Input: iqs269a - add support for OTP variants
  dt-bindings: input: iqs269a: Add bindings for OTP variants
  Input: iqs269a - add support for slider gestures
  dt-bindings: input: iqs269a: Add bindings for slider gestures
  Input: gpio-keys - filter gpio_keys -EPROBE_DEFER error messages
  Input: zforce_ts - accept standard touchscreen properties
  dt-bindings: touchscreen: neonode,zforce: Use standard properties
  dt-bindings: touchscreen: convert neonode,zforce to json-schema
  dt-bindings: input: convert drv266x to json-schema
  Input: da9063 - use dev_err_probe()
  Input: da9063 - drop redundant prints in probe()
  Input: da9063 - simplify obtaining OF match data
  Input: as5011 - convert to GPIO descriptor
  Input: omap-keypad - drop optional GPIO support
  Input: tca6416-keypad - drop unused include
  ...
2024-01-18 17:21:35 -08:00
Linus Torvalds 33a9caa499 phy-for-6.8
- New Support
   - Qualcomm SM8650 UFS, PCIe and USB/DP Combo PHY, eUSB2 PHY, SDX75 USB3,
     X1E80100 USB3 support
   - Mediatek MT8195 support
   - Rockchip RK3128 usb2 support
   - TI SGMII mode for J784S4
 
 - Updates
   - Qualcomm v7 register offsets updates
   - Mediatek tphy support for force phy mode switch
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmWpWgIACgkQfBQHDyUj
 g0fMJBAAofaDTsCdkgrzkMGn5JXZ7Zr99phIN6zmo9bUQdoAIv7VTj7UbNjSpBrf
 Y3inwOjcB0jwjOuZ0NZ3LnLDktDCCaWBPg2Q14YQJkRJIAPyk9+zFyhZRY8B5jEp
 Bq+cre9Qc8nbgZwvyQ14qKf0VZX3LUHifUGuH2nq9ZO0tc6SdV0ej4ayyL2IWpXG
 7ZOQQdFx57YBPFUU5Sn2g/Lg1tXbHsix+zfzCE4SzhWK0ewWLweJ0ZyVvpD0hVsE
 2JYofhop+vXAgoIia4ry7BRe54M+sWtl16rw9ZwVMqyVBTh+FF5EF42sZJGDguFV
 SBG/lw3/8UK0qNc+SlV9Y01CRXMN+aVuW0Q/ilto1qkakDVcS66ADfWxSQuEXp8L
 3Udyijz52gSuETSR/d/STJCoafJ4K5e/2JeGPGzPdC1CcL9lA6izIqCHT1QJBWTI
 OoqstGzYy19T1CcYPGKNp3TuoPqzfi/9xt5frC2ax5uPGsVYz1xXeK8cde2Mr0OI
 KDFt0GAmISfibN+CwoAS75EwDVqxk2OvVO77ndFyVxMSO/Zq4r+7SpBt0Kvoemrq
 I4Qaj85afZagP8UQn62i7CHxki+VlCahwuXGtWxP47pGiJm1zMD0+NgKcpBM9TqB
 tmXhQoGWpNFqwYibyAiKEhbAIZ4b7cF8DHDADpB5eES1FX1Zf5Q=
 =RPh3
 -----END PGP SIGNATURE-----

Merge tag 'phy-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy

Pull phy updates from Vinod Koul:
 "New Support:

   - Qualcomm SM8650 UFS, PCIe and USB/DP Combo PHY, eUSB2 PHY, SDX75
     USB3, X1E80100 USB3 support

   - Mediatek MT8195 support

   - Rockchip RK3128 usb2 support

   - TI SGMII mode for J784S4

  Updates:

   - Qualcomm v7 register offsets updates

   - Mediatek tphy support for force phy mode switch"

* tag 'phy-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (34 commits)
  phy: ti: j721e-wiz: Add SGMII support in WIZ driver for J784S4
  phy: ti: gmii-sel: Enable SGMII mode for J784S4
  phy: qcom-qmp-usb: Add Qualcomm X1E80100 USB3 PHY support
  dt-bindings: phy: qcom,sc8280xp-qmp-usb3-uni: Add X1E80100 USB PHY binding
  phy: qcom-qmp-combo: Add x1e80100 USB/DP combo phys
  dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp-phy: Document X1E80100 compatible
  dt-bindings: phy: qcom: snps-eusb2: Document the X1E80100 compatible
  phy: mediatek: tphy: add support force phy mode switch
  dt-bindings: phy: mediatek: tphy: add a property for force-mode switch
  phy: phy-can-transceiver: insert space after include
  phy: qualcomm: phy-qcom-qmp-ufs: Rectify SM8550 UFS HS-G4 PHY Settings
  dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp-phy: fix path to header
  phy: renesas: phy-rcar-gen2: use select for GENERIC_PHY
  phy: qcom-qmp: qserdes-txrx: Add v7 register offsets
  phy: qcom-qmp: qserdes-txrx: Add V6 N4 register offsets
  phy: qcom-qmp: qserdes-com: Add v7 register offsets
  phy: qcom-qmp: pcs-usb: Add v7 register offsets
  phy: qcom-qmp: pcs: Add v7 register offsets
  phy: qcom-qmp: qserdes-txrx: Add some more v6.20 register offsets
  phy: qcom-qmp: qserdes-com: Add some more v6 register offsets
  ...
2024-01-18 17:11:43 -08:00
Linus Torvalds 4d5d604cc4 soundwire updates for 6.7
- Core: add concept of controller_id to deal with clear Controller/Manager
     hierarchy
  - bunch of qcom driver refactoring for qcom_swrm_stream_alloc_ports(),
    qcom_swrm_stream_alloc_ports() and setting controller id to hw master id
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmWpV0gACgkQfBQHDyUj
 g0ebow/+Ot1EFq8ADVjCnR4noelvlvfwhNHwM5PSN+GMkSmkTOI6lr4mueDFkEfT
 S2yFuKCtO7gIjV+EWt5G0uWAGM33E0aWwcJ+ubw95p+m1lilo2EYgQU3NDTfb+SK
 dUxSbwVDHl2jYTQHSV4UzuA66J9xyK7x5WdUuPvLVSzF5thXvu5g1hcXyd0zqDDT
 3i+WKkvhquOFOjsexSEAjXsbfa8XNM2R/7iiGlxZUdTh2NwSR5dyQ8WuyhEkw0Gb
 23KEOw9wPj1Gcp0JTQ1tSr7C/Ag9jC0QBZLI4ggdivXPUO9NuUHT1owHxLrwXr9R
 uz0zzbBBNvPGOOeW9C8CetcSXVvZAVmgPHZumiuXBZMcuDLyrS0BoQyNzl/yTflB
 ZBlXbzt7C9l6udHxG+4gLy2Xur0wwzBrH8ZqjItkp6MZTBUxcoMgIY6Y24e0DHsn
 cwgsuRblsnL7+JHcq5DL8KYBOnBmexxMCaFtKWGdlU9dWOQtPPe20yIwQACzxUrC
 VpvthWTk4P2KZB2hlP8xGbX9IZmINABN0vwQgf2wfUnRGzmyx7wdLXElXXahim4A
 cyIIjk9AiM2wglUQxKccKqdntg2SAwMeUE0YuMbBzAlzR8fgIklWzFvahOHLED0L
 kvPg1fr/dM5TN9PraptbrML+D5IHXCyCTThZB7ujTTMF4ekwdCo=
 =N74M
 -----END PGP SIGNATURE-----

Merge tag 'soundwire-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:

 - Core: add concept of controller_id to deal with clear Controller /
   Manager hierarchy

 - bunch of qcom driver refactoring for qcom_swrm_stream_alloc_ports(),
   qcom_swrm_stream_alloc_ports() and setting controller id to hw master
   id

* tag 'soundwire-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: amd: drop bus freq calculation and set 'max_clk_freq'
  soundwire: generic_bandwidth_allocation use bus->params.max_dr_freq
  soundwire: qcom: set controller id to hw master id
  soundwire: fix initializing sysfs for same devices on different buses
  soundwire: bus: introduce controller_id
  soundwire: stream: constify sdw_port_config when adding devices
  soundwire: qcom: move sconfig in qcom_swrm_stream_alloc_ports() out of critical section
  soundwire: qcom: drop unneeded qcom_swrm_stream_alloc_ports() cleanup
2024-01-18 17:08:31 -08:00
Linus Torvalds 3455135839 gpio fixes for v6.8-rc1
- revert the changes aiming to use a read-write semaphore to protect the
   list of GPIO devices due to calls to legacy API taking that lock from
   atomic context in old code
 - fix inverted logic in DEFINE_FREE() for GPIO device references
 - check the return value of bgpio_init() in gpio-mlxbf3
 - fix node address in the DT bindings example for gpio-xilinx
 - fix signedness bug in gpio-rtd
 - fix kernel-doc warnings in gpio-en7523
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmWpGfgACgkQEacuoBRx
 13K92A//TonLW0SmSJ5OM0jn9pqUhYLsSxbknu9M9RVmX+GTIq9YK+S5SiCbRS50
 rXn4q3WSqbocM+9R0MUlJJJK8UkGT+JU330vPlJl4gZAy0bK8aoeOqg7vb7nY12e
 0/PvroMpTHoBEIp5XloK4Wjxp3dnc1mGfozSJ7af9FXuY9i1cv8KhnrI8krKbszr
 quuWlZRrtMxKeR3FnsZVn8WNvEWqB5gAN1QJrX0h4sOe0/ZFVfdpZaCW4j1YZtux
 aF6d/9XyEnoR/yJpAE/Rt5Wy727RkilRvSiihnijnJd5JqdMU89CdXdgUF2oFOZ6
 rTAh6XrXiJaaAst4H+5rRe37PR0FOBEAwY1pWUlk9caikfPujNhzPmjvXbdAW7N9
 xe8J+2hTxYfOPYPNtSD/yjcWOudgOb8iiaRREWcRzR96vX/K6R94nF/8hOOEnm+F
 q/xbon1Rt8zhtjZ5lylGsmKoBlKDHlch+c3BXSBf4AxahMNgkebgx8pbkl3jUfg8
 30k+Z5wl08t7oRgeydBKfw7Y42LElBMqAFikjEK8ACyhEGC/e5XD8H0xlWGaKqU5
 sGhd5VS5Y2BQjsrpWkpdQGyOGquz86z5AjwDAAQp3nLgCNrs1UJ8imMPn5c2MPXl
 JAQhMg+zM1NlUhLrXwChYpQ08vypgFUsbGVQ+SDnEbFUo9DbHpg=
 =g2QY
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:
 "Apart from some regular driver fixes there's a relatively big revert
  of the locking changes that were introduced to GPIOLIB in this merge
  window.

  This is because it turned out that some legacy GPIO interfaces - that
  need to translate a number from the global GPIO numberspace to the
  address of the relevant descriptor, thus running a GPIO device lookup
  and taking the GPIO device list lock - are still used in old code from
  atomic context resulting in "scheduling while atomic" errors.

  I'll try to make the read-only part of the list access entirely
  lockless using SRCU but this will take some time so let's go back to
  the old global spinlock for now.

  Summary:

   - revert the changes aiming to use a read-write semaphore to protect
     the list of GPIO devices due to calls to legacy API taking that
     lock from atomic context in old code

   - fix inverted logic in DEFINE_FREE() for GPIO device references

   - check the return value of bgpio_init() in gpio-mlxbf3

   - fix node address in the DT bindings example for gpio-xilinx

   - fix signedness bug in gpio-rtd

   - fix kernel-doc warnings in gpio-en7523"

* tag 'gpio-fixes-for-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: revert the attempt to protect the GPIO device list with an rwsem
  gpio: EN7523: fix kernel-doc warnings
  gpiolib: Fix scope-based gpio_device refcounting
  gpio: mlxbf3: add an error code check in mlxbf3_gpio_probe
  dt-bindings: gpio: xilinx: Fix node address in gpio
  gpio: rtd: Fix signedness bug in probe
2024-01-18 17:03:51 -08:00
Linus Torvalds 5c93506983 pwm changes for 6.8, take 2
The first commit fixes a duplicate cleanup in an error path introduced
 in pwm/for-6.8-rc1~13.
 
 The second cares for an out-of-bounds access. In practise it doesn't
 happen---otherwise someone would have noticed since v5.17-rc1 I
 guess---because the device tree binding for the two drivers using
 of_pwm_single_xlate() only have args->args_count == 1. A device-tree
 that doesn't conform to the respective bindings could trigger that
 easily however.
 
 The third and last one corrects the request callback of the jz4740 pwm
 driver which used dev_err_probe() long after .probe() completed. This is
 conceptually wrong because dev_err_probe() might call
 device_set_deferred_probe_reason() which is nonsensical after the driver
 is bound.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEP4GsaTp6HlmJrf7Tj4D7WH0S/k4FAmWo3dwACgkQj4D7WH0S
 /k5L3Qf8Db1uwUEGFXjjMIS82djW6u8rAEu1bEVf0yqhFKB3zo/PO4b1bfTHvKGo
 SwSflvZ2UB+bcby1yIZTP1yUbDH1O3WFfxgYwdUR715us/iMCV5sYvuNagi0Fp8O
 edR1Ntr/CmaU5VeDyml7ZUIHj7oZ50hd1dvvEugvKQKEVvW8yHBR2OEuqI38cIyK
 ItsQPrN9ld7PXzE0UFWMJpi3lbo+Q4KEYbx0yzBmBGogLXi+nRCIGtMB5KzdUSvf
 K4mQPHDmUyHsviXyjXcywst7sL1ebUcI3FicvCf4by/Sek5jZQGtLoyRS0m4FkRw
 V/TQMiLhU5iKr84wPQUvw6ZYmcXxog==
 =BP0a
 -----END PGP SIGNATURE-----

Merge tag 'pwm/for-6.8-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/ukleinek/linux

Pull pwm fixes from Uwe Kleine-König:

 - fix a duplicate cleanup in an error path introduced in
   this merge window

 - fix an out-of-bounds access

   In practise it doesn't happen - otherwise someone would have noticed
   since v5.17-rc1 I guess - because the device tree binding for the two
   drivers using of_pwm_single_xlate() only have args->args_count == 1.

   A device-tree that doesn't conform to the respective bindings could
   trigger that easily however.

 - correct the request callback of the jz4740 pwm driver which used
   dev_err_probe() long after .probe() completed.

   This is conceptually wrong because dev_err_probe() might call
   device_set_deferred_probe_reason() which is nonsensical after the
   driver is bound.

* tag 'pwm/for-6.8-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/ukleinek/linux:
  pwm: jz4740: Don't use dev_err_probe() in .request()
  pwm: Fix out-of-bounds access in of_pwm_single_xlate()
  pwm: bcm2835: Remove duplicate call to clk_rate_exclusive_put()
2024-01-18 16:58:21 -08:00
Linus Torvalds 21c91bb936 - New Drivers
- Add support for Monolithic Power Systems MP3309C WLED Step-up Converter
 
  - Fix-ups
    - Use/convert to new/better APIs/helpers/MACROs instead of hand-rolling implementations
    - Device Tree Binding updates
    - Demote non-kerneldoc header comments
    - Improve error handling; return proper error values, simplify, avoid duplicates, etc
    - Convert over to the new (kinda) GPIOD API
 
  - Bug Fixes
    - Fix uninitialised local variable
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAmWmsY8ACgkQUa+KL4f8
 d2E+PQ/+LP2TiwChRYvz5jKA0ys4N4yR63dQU1s9dcp49D3RjMLgokevDjSswVEJ
 vGlSc3Ld50cuT6oLObI7z/ezK9/QMJ7yMFraSy+QnTOit8xdDAfJAgC6cM8GGoFU
 XKLT4JkkNyhc2NypHKG/hRkMMA6EoBPr+wZTBRpQvbbFOkalTDVtcQNUgsYNr4W6
 0fPT97oKgAmvzK/2KaihmfxtTE0vXon5XyZVpnTu28bqXQbQvbOgzKr5KWom022Y
 XxoMVUaobZn0v/WUC1KqPY1N0tM+xH2PoOD+4jss26oV0TyxYWH9nDorF46gcJpk
 CQkp5jCCra/WHR50fJx7y01GVswfKCNp2fjx9aRnhFsWfW9cu2Kdws3GfGRazPSW
 AN9MJ9Eyv9Xu689hcq0wYZspfMeJe7fyDdEyeUWHCD8mF4BZHiOjeiksz2eKIuTT
 Q4OykhGoh7svw2cA9EPWRvDFu4xyPA2usrDq/ClBEKVL6jE2gfSZjVmvLGGZ2Xmq
 oHRHiE0qK8kWOf3n6ModbMWTZA7D9Y6tGd+rpFe7GdOfPeH1A9uYXfxoJwaUaEAr
 TvcyyrsePUvHJHADwRx1SzHFuwfuNU1PgflkkdhnUopKLeDfeom0CA5a+YjP1M4Z
 lHongOEdL4xAssFZU2ZXgNZiYx8su2AcUfGgnkHF1frMuoxt5Gk=
 =SXS0
 -----END PGP SIGNATURE-----

Merge tag 'backlight-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 "New Drivers:
   - Add support for Monolithic Power Systems MP3309C WLED Step-up Converter

  Fix-ups:
   - Use/convert to new/better APIs/helpers/MACROs instead of
     hand-rolling implementations
   - Device Tree Binding updates
   - Demote non-kerneldoc header comments
   - Improve error handling; return proper error values, simplify, avoid
     duplicates, etc
   - Convert over to the new (kinda) GPIOD API

  Bug Fixes:
   - Fix uninitialised local variable"

* tag 'backlight-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: hx8357: Convert to agnostic GPIO API
  backlight: ili922x: Add an error code check in ili922x_write()
  backlight: ili922x: Drop kernel-doc for local macros
  backlight: mp3309c: Fix uninitialized local variable
  backlight: pwm_bl: Use dev_err_probe
  backlight: mp3309c: Add support for MPS MP3309C
  dt-bindings: backlight: mp3309c: Remove two required properties
2024-01-18 16:53:35 -08:00
Linus Torvalds 17e232b6d2 dma-mapping fixes for Linux 6.8
- fix kerneldoc warnings (Randy Dunlap)
  - better bounds checking in swiotlb (ZhangPeng)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmWo0rYLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPxwRAAy/TfnQJZQX+VqLmdQ5JsHGoPkYCrdMdcGtM7JxHP
 0QJZSCN/FRIRIB/kXXceCLlls79pQUdrfyiIM4ZyWZU6nWdKlUVw3xJZ8VtbSXla
 W0tKUlAaVwt6HuK+H0GkSs/l1JlU/qWUwrVSXdsAeWT8aQes6OqRbesm/N8vvgJE
 Do/uyGG4WFQaiSB4vv6M6S32KGzNN+BBZGLEk+K4Et4fRhwmMh2ciGMK8Df8yyDU
 iKFVPVii26PySCpkoupxFw59pS7LNEn7NCE0hwuCqQt1VE0+8Hyb8BG4sh8U66FM
 sP9FRqv8HAMofJmn3OGIQXIS3J2/0OPoDAGJUbl0JqMfOUj9B3X2Iu3xFYUVK7MJ
 C1OG3vsKUJ8Dq3SStTNan2PsOXekfcLzzY5iuUoWysl5YiDmJs7zut22oX43z+k5
 lJ91QiEtejat39lAXERZYJ4BVuIoGM46xhJLbNjhdAcoakO4IpdETAEowsmZLxte
 6LMGXKUd/swcy0BCuRtoKheAvLfE+OGPd6RA299nsSxfn6T9zGmnl5pvXqrS17yV
 n3riHqgohcOaUFTaj88aO/y+WunVN5J/bpdbrPeitEnE94DW3Y2ah+5tx5sj+MiW
 p5qMO2AiHqx0Go1mlpYbk78adtBMogdwHjSdYBX0NfZVt9agTIB3FG6UVCkNWYta
 ing=
 =5+TR
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-6.8-2024-01-18' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:

 - fix kerneldoc warnings (Randy Dunlap)

 - better bounds checking in swiotlb (ZhangPeng)

* tag 'dma-mapping-6.8-2024-01-18' of git://git.infradead.org/users/hch/dma-mapping:
  dma-debug: fix kernel-doc warnings
  swiotlb: check alloc_size before the allocation of a new memory pool
2024-01-18 16:49:34 -08:00
Linus Torvalds 77c9622d87 memblock: code readability improvement
Use NUMA_NO_NODE instead of -1 as return value of memblock_search_pfn_nid()
 to improve code readability and consistency with the callers of that
 function.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmWoxkIQHHJwcHRAa2Vy
 bmVsLm9yZwAKCRA5A4Ymyw79kZjnCAC8XcBYOJ+lD3zDOC3fDue3qPBzxSBRvpi5
 LL3MRDRON/0EFZdhEGG0CIZsSRCWuWRHiHPNhuxRBWO9qtqbX+aasJo+mKUo6YKP
 AxId5Y8K708CZPP+rn6zasytRb6+u6EkyoYwk8hqIGixN7pGJg9a1sr/szZE1QWW
 tvkGkoUX13KdCdvxCcE09h/kOfI60SwmkikQvfNFq59fTuwWw2h1qkpb/biT5xIt
 OFB70TNQesmXCJZjIPmv212DHbYTPY7xVD+NdjcfHM4+TcFk4YGHEVmbJUhBTS29
 VkWVZ01Q8veYY01oLJ3K5z8UekJYJyjMVbtWFAxYuxKf33iywqoF
 =G+XX
 -----END PGP SIGNATURE-----

Merge tag 'memblock-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock update from Mike Rapoport:
 "Code readability improvement.

  Use NUMA_NO_NODE instead of -1 as return value of
  memblock_search_pfn_nid() to improve code readability
  and consistency with the callers of that function"

* tag 'memblock-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock: Return NUMA_NO_NODE instead of -1 to improve code readability
2024-01-18 16:46:18 -08:00
Linus Torvalds 0b7359ccdd virtio: features, fixes
vdpa/mlx5: support for resumable vqs
 virtio_scsi: mq_poll support
 3virtio_pmem: support SHMEM_REGION
 virtio_balloon: stay awake while adjusting balloon
 virtio: support for no-reset virtio PCI PM
 
 Fixes, cleanups.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmWmgP8PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpcfgH/0RD2S+NFY0ZEJz8BuI6GjykzYnyRW9iyxcw
 epTLjPUcoEBttlw8TA+3PiPoNIJGfuU8Q4iKXJ8Jzql081tP9G1UxTIbj0v3Hx+q
 0L2DUXfdAMYMLo5WQVl/PADV/10xLgExEh9jMqpU3IJIxVaLE/knD9ghRCDvDbs/
 fOo3sSUGaNsSHYZs5bH73Q7cRKKmTLO+MzvHBbavFfz2fQ1b3vwecmJuQtAtK0JC
 6JxH6Y38VfOl8jA6IHeEpGIHeF661HABkDDUh4UVEGOeyBl4E6ZcG4fjWSMinZ08
 U3TbQLYOq10i8ki2LJKgoZHRv1HkxbM1Ogn0bsIh1hish8dPORM=
 =RWjR
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:

 - vdpa/mlx5: support for resumable vqs

 - virtio_scsi: mq_poll support

 - 3virtio_pmem: support SHMEM_REGION

 - virtio_balloon: stay awake while adjusting balloon

 - virtio: support for no-reset virtio PCI PM

 - Fixes, cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa/mlx5: Add mkey leak detection
  vdpa/mlx5: Introduce reference counting to mrs
  vdpa/mlx5: Use vq suspend/resume during .set_map
  vdpa/mlx5: Mark vq state for modification in hw vq
  vdpa/mlx5: Mark vq addrs for modification in hw vq
  vdpa/mlx5: Introduce per vq and device resume
  vdpa/mlx5: Allow modifying multiple vq fields in one modify command
  vdpa/mlx5: Expose resumable vq capability
  vdpa: Block vq property changes in DRIVER_OK
  vdpa: Track device suspended state
  scsi: virtio_scsi: Add mq_poll support
  virtio_pmem: support feature SHMEM_REGION
  virtio_balloon: stay awake while adjusting balloon
  vdpa: Remove usage of the deprecated ida_simple_xx() API
  virtio: Add support for no-reset virtio PCI PM
  virtio_net: fix missing dma unmap for resize
  vhost-vdpa: account iommu allocations
  vdpa: Fix an error handling path in eni_vdpa_probe()
2024-01-18 16:44:03 -08:00
Linus Torvalds da3c45c721 hwmon updates for v6.8 - part 2
Fix crash seen when instantiating npcm750-pwm-fan
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAmWmq5YACgkQyx8mb86f
 mYH9WRAAoX2Vo1SGMGCcgv5+vkZu3T7OUDswZvTAeODBPNSKlVfAL0/pRe3F+rFH
 ftHB/p2eFATt41WpUkn2APzt8YVT/SBYHGkaZhIJtgzUW0DHQLkf/UrHyHGP8T99
 hSzMubpJhNXZkUdSY4J2IIXdI6tNJJB9ma098An0Y2zmez3v/BWR/011nmZl0HQb
 ohyCAbHPAVwLXdHAHA0NtuDL9a7qpJgtELETbkIrfKD0v5a9vE05zmhqqAlpM2GT
 LBcEQlofUyhCSyzs/K7T14BxJdbQgNpaJS0jpi9PqwUWfKT/U5j/sETmMV8by4I2
 pLn/d4HBJMEWi4P+6HrT93weYl/dRrzD+Po9dJy5pTgj67bi5cNmb9j6+XiocYb7
 GHvERQ6EIUkKTUwxPWBZ/Og35VRGHSXKtlpjNS3R6y4jPaFWMPi7usCwFpcIvbvt
 L3YimKNVGFu2kFqmoN6HCIXy9gGg7WXOpjVabL299nMl0exVONOIzADozPtBKwql
 ylIxBJGj13gZ4+WLHcJa8BNKNNDHII6y3FFCxbFbnRzKJ9YM2+pe7rxY7dYOb6k+
 ib421czTyqeo1D8bpUGpBYAaJDXLJD2jIMn4VhMnCLfQRS05eqEFSVjYedxsinRw
 mbuUb6ySncxaC7aCP7YvHkqySf2L82vEiNjlCgNhtnvz0gozFIg=
 =DwUL
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-v6.8-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmonfix from Guenter Roeck:
 "Fix crash seen when instantiating npcm750-pwm-fan"

* tag 'hwmon-for-v6.8-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (npcm750-pwm-fan) Fix crash observed when instantiating nuvoton,npcm750-pwm-fan
2024-01-18 16:42:18 -08:00
Linus Torvalds db5ccb9eb2 cxl for v6.8
- Add support for parsing the Coherent Device Attribute Table (CDAT)
 
 - Add support for calculating a platform CXL QoS class from CDAT data
 
 - Unify the tracing of EFI CXL Events with native CXL Events.
 
 - Add Get Timestamp support
 
 - Miscellaneous cleanups and fixups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZaHVvAAKCRDfioYZHlFs
 Z3sCAQDPHSsHmj845k4lvKbWjys3eh78MKKEFyTXLQgYhOlsGAEAigQY2ZiSum52
 nwdIgpOOADNt0Iq6yXuLsmn9xvY9bAU=
 =HjCl
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:
 "The bulk of this update is support for enumerating the performance
  capabilities of CXL memory targets and connecting that to a platform
  CXL memory QoS class. Some follow-on work remains to hook up this data
  into core-mm policy, but that is saved for v6.9.

  The next significant update is unifying how CXL event records (things
  like background scrub errors) are processed between so called
  "firmware first" and native error record retrieval. The CXL driver
  handler that processes the record retrieved from the device mailbox is
  now the handler for that same record format coming from an EFI/ACPI
  notification source.

  This also contains miscellaneous feature updates, like Get Timestamp,
  and other fixups.

  Summary:

   - Add support for parsing the Coherent Device Attribute Table (CDAT)

   - Add support for calculating a platform CXL QoS class from CDAT data

   - Unify the tracing of EFI CXL Events with native CXL Events.

   - Add Get Timestamp support

   - Miscellaneous cleanups and fixups"

* tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (41 commits)
  cxl/core: use sysfs_emit() for attr's _show()
  cxl/pci: Register for and process CPER events
  PCI: Introduce cleanup helpers for device reference counts and locks
  acpi/ghes: Process CXL Component Events
  cxl/events: Create a CXL event union
  cxl/events: Separate UUID from event structures
  cxl/events: Remove passing a UUID to known event traces
  cxl/events: Create common event UUID defines
  cxl/events: Promote CXL event structures to a core header
  cxl: Refactor to use __free() for cxl_root allocation in cxl_endpoint_port_probe()
  cxl: Refactor to use __free() for cxl_root allocation in cxl_find_nvdimm_bridge()
  cxl: Fix device reference leak in cxl_port_perf_data_calculate()
  cxl: Convert find_cxl_root() to return a 'struct cxl_root *'
  cxl: Introduce put_cxl_root() helper
  cxl/port: Fix missing target list lock
  cxl/port: Fix decoder initialization when nr_targets > interleave_ways
  cxl/region: fix x9 interleave typo
  cxl/trace: Pass UUID explicitly to event traces
  cxl/region: use %pap format to print resource_size_t
  cxl/region: Add dev_dbg() detail on failure to allocate HPA space
  ...
2024-01-18 16:22:43 -08:00
Linus Torvalds 244aefb1c6 VFIO updates for v6.8-rc1
- Add debugfs support, initially used for reporting device migration
    state. (Longfang Liu)
 
  - Fixes and support for migration dirty tracking across multiple IOVA
    regions in the pds-vfio-pci driver. (Brett Creeley)
 
  - Improved IOMMU allocation accounting visibility. (Pasha Tatashin)
 
  - Virtio infrastructure and a new virtio-vfio-pci variant driver, which
    provides emulation of a legacy virtio interfaces on modern virtio
    hardware for virtio-net VF devices where the PF driver exposes
    support for legacy admin queues, ie. an emulated IO BAR on an SR-IOV
    VF to provide driver ABI compatibility to legacy devices.
    (Yishai Hadas & Feng Liu)
 
  - Migration fixes for the hisi-acc-vfio-pci variant driver.
    (Shameer Kolothum)
 
  - Kconfig dependency fix for new virtio-vfio-pci variant driver.
    (Arnd Bergmann)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmWhkhEbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiCLgQAJv6mzD79dVWKAZH27Lj
 PK0ZSyu3fwgPxTmhRXysKKMs79WI2GlVx6nyW8pVe3w+OGWpdTcbZK2H/T/FryZQ
 QsbKteueG83ni1cIdJFzmIM1jO79jhtsPxpclRS/VmECRhYA6+c7smynHyZNrVAL
 wWkJIkS2uUEx3eUefzH4U2CRen3TILwHAXi27fJ8pHbr6Yor+XvUOgM3eQDjUj+t
 eABL/pJr0qFDQstom6k7GLAsenRHKMLUG88ziSciSJxOg5YiT4py7zeLXuoEhVD1
 kI9KE+Vle5EdZe8MzLLhmzLZoFVfhjyNfj821QjtfP3Gkj6TqnUWBKJAptMuQpdf
 HklOLNmabrZbat+i6QqswrnQ5Z1doPz1uNBsl2lH+2/KIaT8bHZI+QgjK7pg2H2L
 O679My0od4rVLpjnSLDdRoXlcLd6mmvq3663gPogziHBNdNl3oQBI3iIa7ixljkA
 lxJbOZIDBAjzPk+t5NLYwkTsab1AY4zGlfr0M3Sk3q7tyj/MlBcX/fuqyhXjUfqR
 Zhqaw2OaWD8R0EqfSK+wRXr1+z7EWJO/y1iq8RYlD5Mozo+6YMVThjLDUO+8mrtV
 6/PL0woGALw0Tq1u0tw3rLjzCd9qwD9BD2fFUQwUWEe3j3wG2HCLLqyomxcmaKS8
 WgvUXtufWyvonCcIeLKXI9Kt
 =IuK2
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.8-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Add debugfs support, initially used for reporting device migration
   state (Longfang Liu)

 - Fixes and support for migration dirty tracking across multiple IOVA
   regions in the pds-vfio-pci driver (Brett Creeley)

 - Improved IOMMU allocation accounting visibility (Pasha Tatashin)

 - Virtio infrastructure and a new virtio-vfio-pci variant driver, which
   provides emulation of a legacy virtio interfaces on modern virtio
   hardware for virtio-net VF devices where the PF driver exposes
   support for legacy admin queues, ie. an emulated IO BAR on an SR-IOV
   VF to provide driver ABI compatibility to legacy devices (Yishai
   Hadas & Feng Liu)

 - Migration fixes for the hisi-acc-vfio-pci variant driver (Shameer
   Kolothum)

 - Kconfig dependency fix for new virtio-vfio-pci variant driver (Arnd
   Bergmann)

* tag 'vfio-v6.8-rc1' of https://github.com/awilliam/linux-vfio: (22 commits)
  vfio/virtio: fix virtio-pci dependency
  hisi_acc_vfio_pci: Update migration data pointer correctly on saving/resume
  vfio/virtio: Declare virtiovf_pci_aer_reset_done() static
  vfio/virtio: Introduce a vfio driver over virtio devices
  vfio/pci: Expose vfio_pci_core_iowrite/read##size()
  vfio/pci: Expose vfio_pci_core_setup_barmap()
  virtio-pci: Introduce APIs to execute legacy IO admin commands
  virtio-pci: Initialize the supported admin commands
  virtio-pci: Introduce admin commands
  virtio-pci: Introduce admin command sending function
  virtio-pci: Introduce admin virtqueue
  virtio: Define feature bit for administration virtqueue
  vfio/type1: account iommu allocations
  vfio/pds: Add multi-region support
  vfio/pds: Move seq/ack bitmaps into region struct
  vfio/pds: Pass region info to relevant functions
  vfio/pds: Move and rename region specific info
  vfio/pds: Only use a single SGL for both seq and ack
  vfio/pds: Fix calculations in pds_vfio_dirty_sync
  MAINTAINERS: Add vfio debugfs interface doc link
  ...
2024-01-18 15:57:25 -08:00
Linus Torvalds 86c4d58a99 iommufd for 6.8
This brings the first of three planned user IO page table invalidation
 operations:
 
  - IOMMU_HWPT_INVALIDATE allows invalidating the IOTLB integrated into the
    iommu itself. The Intel implementation will also generate an ATC
    invalidation to flush the device IOTLB as it unambiguously knows the
    device, but other HW will not.
 
 It goes along with the prior PR to implement userspace IO page tables (aka
 nested translation for VMs) to allow Intel to have full functionality for
 simple cases. An Intel implementation of the operation is provided.
 
 Fix a small bug in the selftest mock iommu driver probe.
 -----BEGIN PGP SIGNATURE-----
 
 iHQEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZaFiRQAKCRCFwuHvBreF
 YbmgAP9Z0+cAUPKxUKaMRls8YR+gmaOCniSkqBlyrxcib+F/WAD2NPLcBPBRk2o7
 GfXPIrovx96Btf8M40AFdiTEp7LABw==
 =9POe
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
 "This brings the first of three planned user IO page table invalidation
  operations:

   - IOMMU_HWPT_INVALIDATE allows invalidating the IOTLB integrated into
     the iommu itself. The Intel implementation will also generate an
     ATC invalidation to flush the device IOTLB as it unambiguously
     knows the device, but other HW will not.

  It goes along with the prior PR to implement userspace IO page tables
  (aka nested translation for VMs) to allow Intel to have full
  functionality for simple cases. An Intel implementation of the
  operation is provided.

  Also fix a small bug in the selftest mock iommu driver probe"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommufd/selftest: Check the bus type during probe
  iommu/vt-d: Add iotlb flush for nested domain
  iommufd: Add data structure for Intel VT-d stage-1 cache invalidation
  iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
  iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op
  iommufd/selftest: Add mock_domain_cache_invalidate_user support
  iommu: Add iommu_copy_struct_from_user_array helper
  iommufd: Add IOMMU_HWPT_INVALIDATE
  iommu: Add cache_invalidate_user op
2024-01-18 15:28:15 -08:00
Linus Torvalds 0dde2bf67b IOMMU Updates for Linux v6.8
Including:
 
 	- Core changes:
 	  - Fix race conditions in device probe path
 	  - Retire IOMMU bus_ops
 	  - Support for passing custom allocators to page table drivers
 	  - Clean up Kconfig around IOMMU_SVA
 	  - Support for sharing SVA domains with all devices bound to
 	    a mm
 	  - Firmware data parsing cleanup
 	  - Tracing improvements for iommu-dma code
 	  - Some smaller fixes and cleanups
 
 	- ARM-SMMU drivers:
 	  - Device-tree binding updates:
 	     - Add additional compatible strings for Qualcomm SoCs
 	     - Document Adreno clocks for Qualcomm's SM8350 SoC
 	  - SMMUv2:
 	    - Implement support for the ->domain_alloc_paging() callback
 	    - Ensure Secure context is restored following suspend of Qualcomm SMMU
 	      implementation
 	  - SMMUv3:
 	    - Disable stalling mode for the "quiet" context descriptor
 	    - Minor refactoring and driver cleanups
 
 	 - Intel VT-d driver:
 	   - Cleanup and refactoring
 
 	 - AMD IOMMU driver:
 	   - Improve IO TLB invalidation logic
 	   - Small cleanups and improvements
 
 	 - Rockchip IOMMU driver:
 	   - DT binding update to add Rockchip RK3588
 
 	 - Apple DART driver:
 	   - Apple M1 USB4/Thunderbolt DART support
 	   - Cleanups
 
 	 - Virtio IOMMU driver:
 	   - Add support for iotlb_sync_map
 	   - Enable deferred IO TLB flushes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmWecQoACgkQK/BELZcB
 GuN5ZxAAzC5QUKAzANx0puk7QhPpKKlbSvj6Q7iRgCLk00KJO1+VQh9v4ouCmXqF
 kn3Ko8gddjhtrgwN0OQ54F39cLUrp1SBemy71K5YOR+vu8VKtwtmawZGeeRZ+k+B
 Eohw58oaXTiR1maYvoLixLYczLrjklqyJOQ1vZ0GxFGxDqrFByAryHDgG/3OCpJx
 C9e6PsLbbfhfqA8Kv97iKcBqniGbXxAMuodqSUG0buQ3oZgfpIP6Bt3EgUzFGPGk
 3BTlYxowS/gkjUWd3fgjQFIFLTA01u9FhpA2Jb0a4v67pUCR64YxHN7rBQ6ZChtG
 kB9laQfU9re79RsHhqQzr0JT9x/eyq7pzGzjp5TV5TPW6IW+sqjMIPhzd9P08Ef7
 BclkCVobx0jSAHOhnnG4QJiKANr2Y2oM3HfsAJccMMY45RRhUKmVqM7jxMPfGn3A
 i+inlee73xTjZXJse1EWG1fmKKMLvX9LDEp4DyOfn9CqVT+7hpZvzPjfbGr937Rm
 JlwXhF3rQXEpOCagEsbt1vOf+V0e9QiCLf1Y2KpkIkDbE5wwSD/2qLm3tFhJG3oF
 fkW+J14Cid0pj+hY0afGe0kOUOIYlimu0nFmSf0pzMH+UktZdKogSfyb1gSDsy+S
 rsZRGPFhMJ832ExqhlDfxqBebqh+jsfKynlskui6Td5C9ZULaHA=
 =q751
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core changes:
   - Fix race conditions in device probe path
   - Retire IOMMU bus_ops
   - Support for passing custom allocators to page table drivers
   - Clean up Kconfig around IOMMU_SVA
   - Support for sharing SVA domains with all devices bound to a mm
   - Firmware data parsing cleanup
   - Tracing improvements for iommu-dma code
   - Some smaller fixes and cleanups

  ARM-SMMU drivers:
   - Device-tree binding updates:
      - Add additional compatible strings for Qualcomm SoCs
      - Document Adreno clocks for Qualcomm's SM8350 SoC
   - SMMUv2:
      - Implement support for the ->domain_alloc_paging() callback
      - Ensure Secure context is restored following suspend of Qualcomm
        SMMU implementation
   - SMMUv3:
      - Disable stalling mode for the "quiet" context descriptor
      - Minor refactoring and driver cleanups

  Intel VT-d driver:
   - Cleanup and refactoring

  AMD IOMMU driver:
   - Improve IO TLB invalidation logic
   - Small cleanups and improvements

  Rockchip IOMMU driver:
   - DT binding update to add Rockchip RK3588

  Apple DART driver:
   - Apple M1 USB4/Thunderbolt DART support
   - Cleanups

  Virtio IOMMU driver:
   - Add support for iotlb_sync_map
   - Enable deferred IO TLB flushes"

* tag 'iommu-updates-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (66 commits)
  iommu: Don't reserve 0-length IOVA region
  iommu/vt-d: Move inline helpers to header files
  iommu/vt-d: Remove unused vcmd interfaces
  iommu/vt-d: Remove unused parameter of intel_pasid_setup_pass_through()
  iommu/vt-d: Refactor device_to_iommu() to retrieve iommu directly
  iommu/sva: Fix memory leak in iommu_sva_bind_device()
  dt-bindings: iommu: rockchip: Add Rockchip RK3588
  iommu/dma: Trace bounce buffer usage when mapping buffers
  iommu/arm-smmu: Convert to domain_alloc_paging()
  iommu/arm-smmu: Pass arm_smmu_domain to internal functions
  iommu/arm-smmu: Implement IOMMU_DOMAIN_BLOCKED
  iommu/arm-smmu: Convert to a global static identity domain
  iommu/arm-smmu: Reorganize arm_smmu_domain_add_master()
  iommu/arm-smmu-v3: Remove ARM_SMMU_DOMAIN_NESTED
  iommu/arm-smmu-v3: Master cannot be NULL in arm_smmu_write_strtab_ent()
  iommu/arm-smmu-v3: Add a type for the STE
  iommu/arm-smmu-v3: disable stall for quiet_cd
  iommu/qcom: restore IOMMU state if needed
  iommu/arm-smmu-qcom: Add QCM2290 MDSS compatible
  iommu/arm-smmu-qcom: Add missing GMU entry to match table
  ...
2024-01-18 15:16:57 -08:00
Linus Torvalds e7ded27593 percpu:
- Enable percpu page allocator for risc-v. There are risc-v
   configurations with sparse NUMA configurations and small vmalloc
   space causing dynamic percpu allocations to fail as the backing chunk
   stride is too far apart.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE3hZPHJdcVwe+yTTtiDc0yuoFPR0FAmWhz5YACgkQiDc0yuoF
 PR33fQ//TImRNBOvLn0zSL6eKE49pBOg1lhff82GroMkjIw/jHLOp0WfdtCRKBZm
 8234WiQRPk3TNKkvmrikMwmDG249Bc/U+RwaHTkfDao6Fm1Pb4SESaggNXw/VKDe
 zWFvI/zoVQGC3+xuUYo6KDtFE9shnphsT7surRt21wdDeZOojH89FtrrEHnnQpIx
 Zl5miPx0H1V+Hlzk7PZkPYmEwcZHp7Sjcx1/t7QzvtzzkiDKmOLROO2gxRMXaCJz
 zeM5UAQi1294EftLpHTgrtn9NEbwt8xOQnaNtZozYSznmcy6CztyiNH43XCOapFC
 10iVxn4NlioXGzaT/Vo2As3PGjJueg2kl+TJur7lAdENgWyqT0qksgtu+9Q2SSYg
 hzWMk8KKqpLHvjnDpKu0spl7EI7u4J8MdIfHLlw/a2vWUU1bBQeRzIZHGe56/yFu
 asHsTlqWzLPZy8ZvqjhX63HQnWnglHhmY63BcHr5kCeUN8F6cNAS0WWtSrvj5bXM
 OHuq+OaSUms9Ktl/igaaXDLUW+0t04vtH4qh1l2ncEdElYWBzT3d9WBkW8RfQzcu
 aXmu0ItxTHGTgmjafibGoQCkMzJ0NG0b7IW4NMNz5nWgpf5ghBXnSjz17Z4FkMgo
 PY/+uF3Gr7w+OYxsIDSzvMef/J14qgJ9oPMVUJWOJIwVUO7+nMQ=
 =fwxu
 -----END PGP SIGNATURE-----

Merge tag 'percpu-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu

Pull percpu updates from Dennis Zhou:
 "Enable percpu page allocator for RISC-V.

  There are RISC-V configurations with sparse NUMA configurations and
  small vmalloc space causing dynamic percpu allocations to fail as the
  backing chunk stride is too far apart"

* tag 'percpu-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
  riscv: Enable pcpu page first chunk allocator
  mm: Introduce flush_cache_vmap_early()
2024-01-18 15:01:28 -08:00
Linus Torvalds 24f3a63e1f More eventfs fixes and a seq_buf fix for 6.8:
- Hard-code the inodes for eventfs to the same number for files, and
   the same number for directories.
 
 - Have getdent() not create dentries/inodes in iterate_shared() as now
   it has hard-coded inode numbers
 
 - Use kcalloc() instead of kzalloc() on a list of elements
 
 - Fix seq_buf warning and make static work properly.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZak5GxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qvZYAP0ZO4YN9fKnl6Cw1GNCwPtMO13dEg9D
 mIvwftxX8DuaegD/fQFY9gBc+ZMSCbiWVJBAyfO57NPvHk4S3slwPVuL9gA=
 =iKJI
 -----END PGP SIGNATURE-----

Merge tag 'eventfs-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull eventfs updates from Steven Rostedt:

 - Remove "lookup" parameter of create_dir_dentry() and
   create_file_dentry(). These functions were called by lookup and the
   readdir logic, where readdir needed it to up the ref count of the
   dentry but the lookup did not. A "lookup" parameter was passed in to
   tell it what to do, but this complicated the code. It is better to
   just always up the ref count and require the caller to decrement it,
   even for lookup.

 - Modify the .iterate_shared callback to not use the dcache_readdir()
   logic and just handle what gets displayed by that one function. This
   removes the need for eventfs to hijack the file->private_data from
   the dcache_readdir() "cursor" pointer, and makes the code a bit more
   sane

 - Use the root and instance inodes for default ownership. Instead of
   walking the dentry tree and updating each dentry gid, use the
   getattr(), setattr() and permission() callbacks to set the ownership
   and permissions using the root or instance as the default

 - Some other optimizations with the eventfs iterate_shared logic

 - Hard-code the inodes for eventfs to the same number for files, and
   the same number for directories

 - Have getdent() not create dentries/inodes in iterate_shared() as now
   it has hard-coded inode numbers

 - Use kcalloc() instead of kzalloc() on a list of elements

 - Fix seq_buf warning and make static work properly.

* tag 'eventfs-v6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  seq_buf: Make DECLARE_SEQ_BUF() usable
  eventfs: Use kcalloc() instead of kzalloc()
  eventfs: Do not create dentries nor inodes in iterate_shared
  eventfs: Have the inodes all for files and directories all be the same
  eventfs: Shortcut eventfs_iterate() by skipping entries already read
  eventfs: Read ei->entries before ei->children in eventfs_iterate()
  eventfs: Do ctx->pos update for all iterations in eventfs_iterate()
  eventfs: Have eventfs_iterate() stop immediately if ei->is_freed is set
  tracefs/eventfs: Use root and instance inodes as default ownership
  eventfs: Stop using dcache_readdir() for getdents()
  eventfs: Remove "lookup" parameter from create_dir/file_dentry()
2024-01-18 14:45:33 -08:00
Linus Torvalds a2ded784cd tracing updates for 6.8:
- Allow kernel trace instance creation to specify what events are created
   Inside the kernel, a subsystem may create a tracing instance that it can
   use to send events to user space. This sub-system may not care about the
   thousands of events that exist in eventfs. Allow the sub-system to specify
   what sub-systems of events it cares about, and only those events are exposed
   to this instance.
 
 - Allow the ring buffer to be broken up into bigger sub-buffers than just the
   architecture page size. A new tracefs file called "buffer_subbuf_size_kb"
   is created. The user can now specify a minimum size the sub-buffer may be
   in kilobytes. Note, that the implementation currently make the sub-buffer
   size a power of 2 pages (1, 2, 4, 8, 16, ...) but the user only writes in
   kilobyte size, and the sub-buffer will be updated to the next size that
   it will can accommodate it. If the user writes in 10, it will change the
   size to be 4 pages on x86 (16K), as that is the next available size that
   can hold 10K pages.
 
 - Update the debug output when a corrupt time is detected in the ring buffer.
   If the ring buffer detects inconsistent timestamps, there's a debug config
   options that will dump the contents of the meta data of the sub-buffer that
   is used for debugging. Add some more information to this dump that helps
   with debugging.
 
 - Add more timestamp debugging checks (only triggers when the config is enabled)
 
 - Increase the trace_seq iterator to 2 page sizes.
 
 - Allow strings written into tracefs_marker to be larger. Up to just under
   2 page sizes (based on what trace_seq can hold).
 
 - Increase the trace_maker_raw write to be as big as a sub-buffer can hold.
 
 - Remove 32 bit time stamp logic, now that the rb_time_cmpxchg() has been
   removed.
 
 - More selftests were added.
 
 - Some code clean ups as well.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZZ8p3BQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6ql2GAQDZg/zlFEiJHyTfWbCIE8pA3T5xbzKo
 26TNxIZAxJJZpQEAvGFU5Smy14pG6soEoVMp8B6ZOANbqU8VVamhOL+r+Qw=
 =0OYG
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Allow kernel trace instance creation to specify what events are
   created

   Inside the kernel, a subsystem may create a tracing instance that it
   can use to send events to user space. This sub-system may not care
   about the thousands of events that exist in eventfs. Allow the
   sub-system to specify what sub-systems of events it cares about, and
   only those events are exposed to this instance.

 - Allow the ring buffer to be broken up into bigger sub-buffers than
   just the architecture page size.

   A new tracefs file called "buffer_subbuf_size_kb" is created. The
   user can now specify a minimum size the sub-buffer may be in
   kilobytes. Note, that the implementation currently make the
   sub-buffer size a power of 2 pages (1, 2, 4, 8, 16, ...) but the user
   only writes in kilobyte size, and the sub-buffer will be updated to
   the next size that it will can accommodate it. If the user writes in
   10, it will change the size to be 4 pages on x86 (16K), as that is
   the next available size that can hold 10K pages.

 - Update the debug output when a corrupt time is detected in the ring
   buffer. If the ring buffer detects inconsistent timestamps, there's a
   debug config options that will dump the contents of the meta data of
   the sub-buffer that is used for debugging. Add some more information
   to this dump that helps with debugging.

 - Add more timestamp debugging checks (only triggers when the config is
   enabled)

 - Increase the trace_seq iterator to 2 page sizes.

 - Allow strings written into tracefs_marker to be larger. Up to just
   under 2 page sizes (based on what trace_seq can hold).

 - Increase the trace_maker_raw write to be as big as a sub-buffer can
   hold.

 - Remove 32 bit time stamp logic, now that the rb_time_cmpxchg() has
   been removed.

 - More selftests were added.

 - Some code clean ups as well.

* tag 'trace-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (29 commits)
  ring-buffer: Remove stale comment from ring_buffer_size()
  tracing histograms: Simplify parse_actions() function
  tracing/selftests: Remove exec permissions from trace_marker.tc test
  ring-buffer: Use subbuf_order for buffer page masking
  tracing: Update subbuffer with kilobytes not page order
  ringbuffer/selftest: Add basic selftest to test changing subbuf order
  ring-buffer: Add documentation on the buffer_subbuf_order file
  ring-buffer: Just update the subbuffers when changing their allocation order
  ring-buffer: Keep the same size when updating the order
  tracing: Stop the tracing while changing the ring buffer subbuf size
  tracing: Update snapshot order along with main buffer order
  ring-buffer: Make sure the spare sub buffer used for reads has same size
  ring-buffer: Do no swap cpu buffers if order is different
  ring-buffer: Clear pages on error in ring_buffer_subbuf_order_set() failure
  ring-buffer: Read and write to ring buffers with custom sub buffer size
  ring-buffer: Set new size of the ring buffer sub page
  ring-buffer: Add interface for configuring trace sub buffer size
  ring-buffer: Page size per ring buffer
  ring-buffer: Have ring_buffer_print_page_header() be able to access ring_buffer_iter
  ring-buffer: Check if absolute timestamp goes backwards
  ...
2024-01-18 14:35:29 -08:00
Linus Torvalds 5b890ad456 Probes update for v6.8:
- Kprobes trace event to show the actual function name in notrace-symbol
   warning. Instead of using user specified symbol name, use "%ps" printk
   format to show the actual symbol at the probe address. Since kprobe
   event accepts the offset from symbol which is bigger than the symbol
   size, user specified symbol may not be the actual probed symbol.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmWdZB0bHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b9AcH/R8mNbgAbKlxSXUm0NAG
 xrUcN9vyb9yaLgvoIEvW+XF6EMaCM6G2kG+wSaJB6xFiPlJgf9FhILjDjHAtV2x1
 wXL8r3eLyKvkU3HXfS7RphUTPecgblI16FHZ12x2TkQ41KoRzQf2c7cSQs4B8SHP
 W5LPqvxxqjbV84iqZPScez99S0ZS0Of3ubmepVEm2LDshfhUVMIUH1vfvEn3vQI7
 k5PoNiVRem+rjduERM3I7Zd51K7Lz/5hN56q6ok2vY8hVoRdp0j83Ly36h21ClS9
 CtvlzPX0YjaogVd8Gyc3z+vqy61YiNA1q0fRqIhagmfIy/26s1ORaq/0S2gywxXn
 piA=
 =mfU0
 -----END PGP SIGNATURE-----

Merge tag 'probes-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes update from Masami Hiramatsu:

 - Update the Kprobes trace event to show the actual function name in
   notrace-symbol warning.

   Instead of using the user specified symbol name, use "%ps" printk
   format to show the actual symbol at the probe address. Since kprobe
   event accepts the offset from symbol which is bigger than the symbol
   size, the user specified symbol may not be the actual probed symbol.

* tag 'probes-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  trace/kprobe: Display the actual notrace function when rejecting a probe
2024-01-18 14:21:22 -08:00