Commit graph

1217156 commits

Author SHA1 Message Date
Eric Dumazet
5579ee462d net_sched: export pfifo_fast prio2band[]
pfifo_fast prio2band[] is renamed to sch_default_prio2band[]
and exported because we want to share it in FQ.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-05 13:27:31 +02:00
Eric Dumazet
2ae45136a9 net_sched: sch_fq: remove q->ktime_cache
Now that both enqueue() and dequeue() need to use ktime_get_ns(),
there is no point wasting 8 bytes in struct fq_sched_data.

This makes room for future fields. ;)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-05 13:27:20 +02:00
Paolo Abeni
defe4b87d5 Merge branch 'net-mana-fix-some-tx-processing-bugs'
Haiyang Zhang says:

====================
net: mana: Fix some TX processing bugs

Fix TX processing bugs on error handling, tso_bytes calculation,
and sge0 size.
====================

Link: https://lore.kernel.org/r/1696020147-14989-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-05 11:45:09 +02:00
Haiyang Zhang
a43e8e9ffa net: mana: Fix oversized sge0 for GSO packets
Handle the case when GSO SKB linear length is too large.

MANA NIC requires GSO packets to put only the header part to SGE0,
otherwise the TX queue may stop at the HW level.

So, use 2 SGEs for the skb linear part which contains more than the
packet header.

Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-05 11:45:06 +02:00
Haiyang Zhang
7a54de9265 net: mana: Fix the tso_bytes calculation
sizeof(struct hop_jumbo_hdr) is not part of tso_bytes, so remove
the subtraction from header size.

Cc: stable@vger.kernel.org
Fixes: bd7fc6e195 ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-05 11:45:06 +02:00
Haiyang Zhang
b2b000069a net: mana: Fix TX CQE error handling
For an unknown TX CQE error type (probably from a newer hardware),
still free the SKB, update the queue tail, etc., otherwise the
accounting will be wrong.

Also, TX errors can be triggered by injecting corrupted packets, so
replace the WARN_ONCE to ratelimited error logging.

Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-05 11:45:06 +02:00
Justin Stitt
3b9333493b can: peak_pci: replace deprecated strncpy with strscpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

NUL-padding is not required since card is already zero-initialized:
|       card = kzalloc(sizeof(*card), GFP_KERNEL);

A suitable replacement is `strscpy` [2] due to the fact that it
guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/all/20231005-strncpy-drivers-net-can-sja1000-peak_pci-c-v1-1-c36e1702cd56@google.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-10-05 09:51:18 +02:00
Dmitry Antipov
9418edf8ff wifi: rtlwifi: remove unreachable code in rtl92d_dm_check_edca_turbo()
Since '!(0x5ea42b & 0xffff0000)' is always false, remove unreachable
block in 'rtl92d_dm_check_edca_turbo()' and convert EDCA limits to
constant variables. Compile tested only.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003043318.11370-1-dmantipov@yandex.ru
2023-10-05 09:54:58 +03:00
Zong-Zhe Yang
036042e157 wifi: rtw89: debug: txpwr table supports Wi-Fi 7 chips
We add TX power table format for Wi-Fi 7 chips. Since Wi-Fi 7 tables are
larger, in order to reuse some chunks, we extend code to process nested
entries. Now, dbgfs txpwr_table can work with Wi-Fi 7 chips.

An output example of dbgfs txpwr_table on Wi-Fi 7 chips is shown below.
...
[TX power byrate]
	<< BW20 >>
CCK       	-  1M      2M      5.5M   11M  	|  20,  20,  20,  20,	dBm
LEGACY    	-  6M      9M      12M    18M  	|  18,  18,  18,  18,	dBm
LEGACY    	-  24M     36M     48M    54M  	|  18,  18,  17,  16,	dBm
EHT       	-  MCS14   MCS15 		|   0,   0,		dBm
DLRU_EHT  	-  MCS14   MCS15 		|   0,  18,		dBm
MCS_1SS       	-  MCS0    MCS1    MCS2   MCS3 	|  18,  18,  18,  18,	dBm
MCS_1SS       	-  MCS4    MCS5    MCS6   MCS7 	|  18,  17,  16,  15,	dBm
MCS_1SS       	-  MCS8    MCS9    MCS10  MCS11	|  14,  13,  12,  11,	dBm
MCS_1SS       	-  MCS12   MCS13 		|  10,   9,		dBm
HEDCM_1SS     	-  MCS0    MCS1    MCS3   MCS4 	|  18,  18,  18,  18,	dBm
DLRU_MCS_1SS  	-  MCS0    MCS1    MCS2   MCS3 	|  18,  18,  18,  18,	dBm
DLRU_MCS_1SS  	-  MCS4    MCS5    MCS6   MCS7 	|  18,  17,  16,  15,	dBm
DLRU_MCS_1SS  	-  MCS8    MCS9    MCS10  MCS11	|  14,  13,  12,  11,	dBm
DLRU_MCS_1SS  	-  MCS12   MCS13 		|  10,   9,		dBm
DLRU_HEDCM_1SS	-  MCS0    MCS1    MCS3   MCS4 	|  18,  18,  18,  18,	dBm
MCS_2SS       	-  MCS0    MCS1    MCS2   MCS3 	|  18,  18,  18,  18,	dBm
...
[TX power limit]
	<< 1TX >>
CCK_20M    	-  NON_BF  BF	|   0,   0,		dBm
CCK_40M    	-  NON_BF  BF	|   0,   0,		dBm
OFDM       	-  NON_BF  BF	|  18,   0,		dBm
MCS_20M_0  	-  NON_BF  BF	|  18,   0,		dBm
MCS_20M_1  	-  NON_BF  BF	|   0,   0,		dBm
...

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-8-pkshih@realtek.com
2023-10-05 09:54:17 +03:00
Zong-Zhe Yang
f680fc5695 wifi: rtw89: debug: show txpwr table according to chip gen
Since current TX power stuffs are for ax chips, add a suffix `_ax` to
them. Then, when requested to show txpwr table, select table according
to chip generation first.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-7-pkshih@realtek.com
2023-10-05 09:54:16 +03:00
Zong-Zhe Yang
932f85c18a wifi: rtw89: phy: set TX power RU limit according to chip gen
Wi-Fi 6 chips and Wi-Fi 7 chips have different register design for TX
power RU limit. We rename original setting stuffs with a suffix `_ax`,
concentrate related enum declaration in phy.h, and implement setting
flow for Wi-Fi 7 chips. Then, we set TX power RU limit according to
chip generation.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-6-pkshih@realtek.com
2023-10-05 09:54:16 +03:00
Zong-Zhe Yang
70aa04f2d5 wifi: rtw89: phy: set TX power limit according to chip gen
Wi-Fi 6 chips and Wi-Fi 7 chips have different register design for
TX power limit. We rename original setting stuffs with a suffix `_ax`,
concentrate related enum declaration in phy.h, and implement setting
flow for Wi-Fi 7 chips. Then, we set TX power limit according to chip
generation.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-5-pkshih@realtek.com
2023-10-05 09:54:16 +03:00
Zong-Zhe Yang
3b7dc652cc wifi: rtw89: phy: set TX power offset according to chip gen
We have a register to control TX power of each rate section to increase
or decrease an offset. But, Wi-Fi 6 chips and Wi-Fi 7 chips have different
address and format for this control register. We rename original setting
stuffs with a suffix `_ax` and implement setting flow for Wi-Fi 7 chips.
Then, we set TX power offset according to chip generation.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-4-pkshih@realtek.com
2023-10-05 09:54:16 +03:00
Zong-Zhe Yang
d513664215 wifi: rtw89: phy: set TX power by rate according to chip gen
Wi-Fi 6 chips and Wi-Fi 7 chips have different register design for
TX power by rate. We rename original setting stuffs with a suffix
`_ax` and implement setting flow for Wi-Fi 7 chips. Then, we set TX
power by rate according to chip generation.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-3-pkshih@realtek.com
2023-10-05 09:54:15 +03:00
Zong-Zhe Yang
06b26738a7 wifi: rtw89: mac: get TX power control register according to chip gen
There are two difference between Wi-Fi 6 and Wi-Fi 7 chips.
1. Address range of TX power control register
2. Checking code to get a TX power control register

So, separate the implementation of them, access according to
chip generation, and rename original things with a suffix `_ax`.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-2-pkshih@realtek.com
2023-10-05 09:54:15 +03:00
Linus Torvalds
3006adf3be Timerlat auto-analysis:
- Timerlat is reporting thread interference time without thread noise
     events occurrence. It was caused because the thread interference variable
     was not reset after the analysis of a timerlat activation that did not
     hit the threshold.
 
   - The IRQ handler delay is estimated from the delta of the IRQ latency
     reported by timerlat, and the timestamp from IRQ handler start event.
     If the delta is near-zero, the drift from the external clock and the
     trace event and/or the overhead can cause the value to be negative.
     If the value is negative, print a zero-delay.
 
   - IRQ handlers happening after the timerlat thread event but before
     the stop tracing were being reported as IRQ that happened before the
     *current* IRQ occurrence. Ignore Previous IRQ noise in this condition
     because they are valid only for the *next* timerlat activation.
 
 Timerlat user-space:
 
   - Timerlat is stopping all user-space thread if a CPU becomes
     offline. Do not stop the entire tool if a CPU is/become offline,
     but only the thread of the unavailable CPU. Stop the tool only,
     if all threads leave because the CPUs become/are offline.
 
 man-pages:
 
   - Fix command line example in timerlat hist man page.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEElZdCZGILCpueJPrSY3Tw0sBuFwAFAmURVMQTHGJyaXN0b3RA
 a2VybmVsLm9yZwAKCRBjdPDSwG4XAJezD/0fJnrzJFVSUwAXbdu1K679ik5iqwTk
 UE/ZHY3dBbES6DFswXomofe4LkimY1tnLvyPr5tHqCGW8cvnMkOpgDK68LEgyL5a
 1FLR8D+07i2dsEcsXfcAAF8iVEeF/SzOfHwZuY1ZJyicwl3xtya/QDrXpq8LZR1n
 4YEWE3Xx60bo/Q81hTXN3uS+275bfuV/N8DSOXwVVWhK5kxheitc1ESUGLV/g1HQ
 muyv+k+fH1qnOfkPsokhnxMjgzy7Tqv13onoVY+KUSQ1Ui58p+c3zQSkceWxM8c4
 wnbfR0spF1eCoBlO2/PYUZ2p2zEh/NS3eTQchys4J2lbgURW1IIVaxaK1S5xC2CE
 tkYkBOaUJXlD3HzTCkPRNpOI0+8Ydo0MDzzPUqjHemfFE7zzHVoZTfmdInSyddUz
 ViKLi0HS+kjyvZVGa02JuDgPJmjTPgwd1F8p6cujHmSCbifbs4Oml9VaYHQRioZX
 bkIDAX6NMkqDpb0baGjsIzbmiWnsIeo8J1IDqdXnD3VY1J78D+kBNCISxGjXuTSF
 Eg3iyZJHWy2JhGBQ2k4lyCw9FZZ1FZtkURPWvTn5/PbsPqz5bjPWUcwXsyqE6wBL
 OPR3HUcjgaMv7gJrErbsAaAGXxwpgTOe0qMcWI2tR7n6SHzniOn9WlDjegVnwWp1
 r4ognHxasRQUAQ==
 =1BAc
 -----END PGP SIGNATURE-----

Merge tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux

Pull rtla fixes from Daniel Bristot de Oliveira:
 "rtla (Real-Time Linux Analysis) tool fixes.

  Timerlat auto-analysis:

   - Timerlat is reporting thread interference time without thread noise
     events occurrence. It was caused because the thread interference
     variable was not reset after the analysis of a timerlat activation
     that did not hit the threshold.

   - The IRQ handler delay is estimated from the delta of the IRQ
     latency reported by timerlat, and the timestamp from IRQ handler
     start event. If the delta is near-zero, the drift from the external
     clock and the trace event and/or the overhead can cause the value
     to be negative. If the value is negative, print a zero-delay.

   - IRQ handlers happening after the timerlat thread event but before
     the stop tracing were being reported as IRQ that happened before
     the *current* IRQ occurrence. Ignore Previous IRQ noise in this
     condition because they are valid only for the *next* timerlat
     activation.

  Timerlat user-space:

   - Timerlat is stopping all user-space thread if a CPU becomes
     offline. Do not stop the entire tool if a CPU is/become offline,
     but only the thread of the unavailable CPU. Stop the tool only, if
     all threads leave because the CPUs become/are offline.

  man-pages:

   - Fix command line example in timerlat hist man page"

* tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux:
  rtla: fix a example in rtla-timerlat-hist.rst
  rtla/timerlat: Do not stop user-space if a cpu is offline
  rtla/timerlat_aa: Fix previous IRQ delay for IRQs that happens after thread sample
  rtla/timerlat_aa: Fix negative IRQ delay
  rtla/timerlat_aa: Zero thread sum after every sample analysis
2023-10-04 18:19:55 -07:00
Jakub Kicinski
93e7eca853 Merge branch 'ynl-makefile-cleanup'
Jakub Kicinski says:

====================
ynl Makefile cleanup

While catching up on recent changes I noticed unexpected
changes to Makefiles in YNL. Indeed they were not working
as intended but the fixes put in place were not what I had
in mind :)
====================

Link: https://lore.kernel.org/r/20231003153416.2479808-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:33:56 -07:00
Jakub Kicinski
e2ca31cee9 tools: ynl: use uAPI include magic for samples
Makefile.deps provides direct includes in CFLAGS_$(obj).
We just need to rewrite the rules to make use of the extra
flags, no need to hard-include all of tools/include/uapi.

Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:33:54 -07:00
Jakub Kicinski
a50660173c tools: ynl: don't regen on every make
As far as I can tell the normal Makefile dependency tracking
works, generated files get re-generated if the YAML was updated.
Let make do its job, don't force the re-generation.
make hardclean can be used to force regeneration.

Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:33:54 -07:00
Jakub Kicinski
0629f22ec1 ynl: netdev: drop unnecessary enum-as-flags
enum-as-flags can be used when enum declares bit positions but
we want to carry bitmask in an attribute. If the definition
is already provided as flags there's no need to indicate
the flag-iness of the attribute.

Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:33:54 -07:00
Eric Dumazet
d0f95894fd netlink: annotate data-races around sk->sk_err
syzbot caught another data-race in netlink when
setting sk->sk_err.

Annotate all of them for good measure.

BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg

write to 0xffff8881613bb220 of 4 bytes by task 28147 on cpu 0:
netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
sock_recvmsg_nosec net/socket.c:1027 [inline]
sock_recvmsg net/socket.c:1049 [inline]
__sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
__do_sys_recvfrom net/socket.c:2247 [inline]
__se_sys_recvfrom net/socket.c:2243 [inline]
__x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

write to 0xffff8881613bb220 of 4 bytes by task 28146 on cpu 1:
netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
sock_recvmsg_nosec net/socket.c:1027 [inline]
sock_recvmsg net/socket.c:1049 [inline]
__sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
__do_sys_recvfrom net/socket.c:2247 [inline]
__se_sys_recvfrom net/socket.c:2243 [inline]
__x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x00000000 -> 0x00000016

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 28146 Comm: syz-executor.0 Not tainted 6.6.0-rc3-syzkaller-00055-g9ed22ae6be81 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231003183455.3410550-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:32:54 -07:00
Eric Dumazet
d86e5fbd4c net: skb_queue_purge_reason() optimizations
1) Exit early if the list is empty.

2) splice the list into a local list,
   so that we block hard irqs only once.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231003181920.3280453-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:32:24 -07:00
Xin Long
1f4e803cd9 sctp: update hb timer immediately after users change hb_interval
Currently, when hb_interval is changed by users, it won't take effect
until the next expiry of hb timer. As the default value is 30s, users
have to wait up to 30s to wait its hb_interval update to work.

This becomes pretty bad in containers where a much smaller value is
usually set on hb_interval. This patch improves it by resetting the
hb timer immediately once the value of hb_interval is updated by users.

Note that we don't address the already existing 'problem' when sending
a heartbeat 'on demand' if one hb has just been sent(from the timer)
mentioned in:

  https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg590224.html

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/75465785f8ee5df2fb3acdca9b8fafdc18984098.1696172660.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:29:58 -07:00
Xin Long
2222a78075 sctp: update transport state when processing a dupcook packet
During the 4-way handshake, the transport's state is set to ACTIVE in
sctp_process_init() when processing INIT_ACK chunk on client or
COOKIE_ECHO chunk on server.

In the collision scenario below:

  192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
    192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
    192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
    192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
    192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
  192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021]

when processing COOKIE_ECHO on 192.168.1.2, as it's in COOKIE_WAIT state,
sctp_sf_do_dupcook_b() is called by sctp_sf_do_5_2_4_dupcook() where it
creates a new association and sets its transport to ACTIVE then updates
to the old association in sctp_assoc_update().

However, in sctp_assoc_update(), it will skip the transport update if it
finds a transport with the same ipaddr already existing in the old asoc,
and this causes the old asoc's transport state not to move to ACTIVE
after the handshake.

This means if DATA retransmission happens at this moment, it won't be able
to enter PF state because of the check 'transport->state == SCTP_ACTIVE'
in sctp_do_8_2_transport_strike().

This patch fixes it by updating the transport in sctp_assoc_update() with
sctp_assoc_add_peer() where it updates the transport state if there is
already a transport with the same ipaddr exists in the old asoc.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/fd17356abe49713ded425250cc1ae51e9f5846c6.1696172325.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:29:44 -07:00
Jakub Kicinski
397f70e3be Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-10-03 (i40e, iavf)

This series contains updates to i40e and iavf drivers.

Yajun Deng aligns reporting of buffer exhaustion statistics to follow
documentation for i40e.

Jake removes undesired 'inline' from functions in iavf.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  iavf: remove "inline" functions from iavf_txrx.c
  i40e: Add rx_missed_errors for buffer exhaustion
====================

Link: https://lore.kernel.org/r/20231003223610.2004976-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:27:24 -07:00
Jakub Kicinski
b4ac75a3bb Merge branch 'fix-a-couple-recent-instances-of-wincompatible-function-pointer-types-strict-from-mode_get-implementations'
Nathan Chancellor says:

====================
Fix a couple recent instances of -Wincompatible-function-pointer-types-strict from ->mode_get() implementations

This series fixes a couple of instances of
-Wincompatible-function-pointer-types-strict that were introduced by a
recent series that added a new type of ops, struct dpll_device_ops,
along with implementations of the callback ->mode_get() that had a
mismatched mode type.

This warning is not currently enabled for any build but I am planning on
submitting a patch to add it to W=1 to prevent new instances of the
warning from popping up while we try and fix the existing instances in
other drivers.

This series is based on current net-next but if they need to go into
individual maintainer trees, please feel free to take the patches
individually.
====================

Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-0-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:15:06 -07:00
Nathan Chancellor
f4ecb3d44a mlx5: Fix type of mode parameter in mlx5_dpll_device_mode_get()
When building with -Wincompatible-function-pointer-types-strict, a
warning designed to catch potential kCFI failures at build time rather
than run time due to incorrect function pointer types, there is a
warning due to a mismatch between the type of the mode parameter in
mlx5_dpll_device_mode_get() vs. what the function pointer prototype for
->mode_get() in 'struct dpll_device_ops' expects.

  drivers/net/ethernet/mellanox/mlx5/core/dpll.c:141:14: error: incompatible function pointer types initializing 'int (*)(const struct dpll_device *, void *, enum dpll_mode *, struct netlink_ext_ack *)' with an expression of type 'int (const struct dpll_device *, void *, u32 *, struct netlink_ext_ack *)' (aka 'int (const struct dpll_device *, void *, unsigned int *, struct netlink_ext_ack *)') [-Werror,-Wincompatible-function-pointer-types-strict]
    141 |         .mode_get = mlx5_dpll_device_mode_get,
        |                     ^~~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Change the type of the mode parameter in mlx5_dpll_device_mode_get() to
clear up the warning and avoid kCFI failures at run time.

Fixes: 496fd0a26b ("mlx5: Implement SyncE support using DPLL infrastructure")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-2-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:15:04 -07:00
Nathan Chancellor
26cc115d59 ptp: Fix type of mode parameter in ptp_ocp_dpll_mode_get()
When building with -Wincompatible-function-pointer-types-strict, a
warning designed to catch potential kCFI failures at build time rather
than run time due to incorrect function pointer types, there is a
warning due to a mismatch between the type of the mode parameter in
ptp_ocp_dpll_mode_get() vs. what the function pointer prototype for
->mode_get() in 'struct dpll_device_ops' expects.

  drivers/ptp/ptp_ocp.c:4353:14: error: incompatible function pointer types initializing 'int (*)(const struct dpll_device *, void *, enum dpll_mode *, struct netlink_ext_ack *)' with an expression of type 'int (const struct dpll_device *, void *, u32 *, struct netlink_ext_ack *)' (aka 'int (const struct dpll_device *, void *, unsigned int *, struct netlink_ext_ack *)') [-Werror,-Wincompatible-function-pointer-types-strict]
   4353 |         .mode_get = ptp_ocp_dpll_mode_get,
        |                     ^~~~~~~~~~~~~~~~~~~~~
  1 error generated.

Change the type of the mode parameter in ptp_ocp_dpll_mode_get() to
clear up the warning and avoid kCFI failures at run time.

Fixes: 09eeb3aecc ("ptp_ocp: implement DPLL ops")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-1-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 17:15:04 -07:00
Jakub Kicinski
78cac6f171 Merge branch 'r8152-modify-rx_bottom'
Hayes Wang says:

====================
r8152: modify rx_bottom

v3:
For patch #1, this patch is replaced. The new patch only break the loop,
and keep that the driver would queue the rx packets.

For patch #2, modify the code depends on patch #1. For work_down < budget,
napi_get_frags() and napi_gro_frags() would be used. For the others,
nothing is changed.

v2:
For patch #1, add comment, update commit message, and add Fixes tag.

v1:
These patches are used to improve rx_bottom().
====================

Link: https://lore.kernel.org/r/20230926111714.9448-432-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 16:51:34 -07:00
Hayes Wang
788d30daa8 r8152: use napi_gro_frags
Use napi_gro_frags() for the skb of fragments when the work_done is less
than budget.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230926111714.9448-434-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 16:51:32 -07:00
Hayes Wang
2cf51f9317 r8152: break the loop when the budget is exhausted
A bulk transfer of the USB may contain many packets. And, the total
number of the packets in the bulk transfer may be more than budget.

Originally, only budget packets would be handled by napi_gro_receive(),
and the other packets would be queued in the driver for next schedule.

This patch would break the loop about getting next bulk transfer, when
the budget is exhausted. That is, only the current bulk transfer would
be handled, and the other bulk transfers would be queued for next
schedule. Besides, the packets which are more than the budget in the
current bulk trasnfer would be still queued in the driver, as the
original method.

In addition, a bulk transfer wouldn't contain more than 400 packets, so
the check of queue length is unnecessary. Therefore, I replace it with
WARN_ON_ONCE().

Fixes: cf74eb5a5b ("eth: r8152: try to use a normal budget")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230926111714.9448-433-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 16:51:31 -07:00
Jakub Kicinski
f8e5b77862 Merge branch 'chelsio-annotate-structs-with-__counted_by'
Kees Cook says:

====================
chelsio: Annotate structs with __counted_by

This annotates several chelsio structures with the coming __counted_by
attribute for bounds checking of flexible arrays at run-time. For more details,
see commit dd06e72e68 ("Compiler Attributes: Add __counted_by macro").
====================

Link: https://lore.kernel.org/r/20230929181042.work.990-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 15:37:15 -07:00
Kees Cook
1508cb7e07 cxgb4: Annotate struct smt_data with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct smt_data.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-5-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 15:37:13 -07:00
Kees Cook
ceba9725fb cxgb4: Annotate struct sched_table with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct sched_table.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-4-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 15:37:13 -07:00
Kees Cook
157c56a4fe cxgb4: Annotate struct cxgb4_tc_u32_table with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct cxgb4_tc_u32_table.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-3-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 15:37:13 -07:00
Kees Cook
c3db467b08 cxgb4: Annotate struct clip_tbl with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct clip_tbl.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-2-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 15:37:13 -07:00
Kees Cook
3bbae5f1c6 chelsio/l2t: Annotate struct l2t_data with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct l2t_data.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 15:37:13 -07:00
Neal Cardwell
4720852ed9 tcp: fix delayed ACKs for MSS boundary condition
This commit fixes poor delayed ACK behavior that can cause poor TCP
latency in a particular boundary condition: when an application makes
a TCP socket write that is an exact multiple of the MSS size.

The problem is that there is painful boundary discontinuity in the
current delayed ACK behavior. With the current delayed ACK behavior,
we have:

(1) If an app reads data when > 1*MSS is unacknowledged, then
    tcp_cleanup_rbuf() ACKs immediately because of:

     tp->rcv_nxt - tp->rcv_wup > icsk->icsk_ack.rcv_mss ||

(2) If an app reads all received data, and the packets were < 1*MSS,
    and either (a) the app is not ping-pong or (b) we received two
    packets < 1*MSS, then tcp_cleanup_rbuf() ACKs immediately beecause
    of:

     ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) ||
      ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) &&
       !inet_csk_in_pingpong_mode(sk))) &&

(3) *However*: if an app reads exactly 1*MSS of data,
    tcp_cleanup_rbuf() does not send an immediate ACK. This is true
    even if the app is not ping-pong and the 1*MSS of data had the PSH
    bit set, suggesting the sending application completed an
    application write.

Thus if the app is not ping-pong, we have this painful case where
>1*MSS gets an immediate ACK, and <1*MSS gets an immediate ACK, but a
write whose last skb is an exact multiple of 1*MSS can get a 40ms
delayed ACK. This means that any app that transfers data in one
direction and takes care to align write size or packet size with MSS
can suffer this problem. With receive zero copy making 4KB MSS values
more common, it is becoming more common to have application writes
naturally align with MSS, and more applications are likely to
encounter this delayed ACK problem.

The fix in this commit is to refine the delayed ACK heuristics with a
simple check: immediately ACK a received 1*MSS skb with PSH bit set if
the app reads all data. Why? If an skb has a len of exactly 1*MSS and
has the PSH bit set then it is likely the end of an application
write. So more data may not be arriving soon, and yet the data sender
may be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we
set ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send
an ACK immediately if the app reads all of the data and is not
ping-pong. Note that this logic is also executed for the case where
len > MSS, but in that case this logic does not matter (and does not
hurt) because tcp_cleanup_rbuf() will always ACK immediately if the
app reads data and there is more than an MSS of unACKed data.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Guo <guoxin0309@gmail.com>
Link: https://lore.kernel.org/r/20231001151239.1866845-2-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 15:34:18 -07:00
Neal Cardwell
059217c18b tcp: fix quick-ack counting to count actual ACKs of new data
This commit fixes quick-ack counting so that it only considers that a
quick-ack has been provided if we are sending an ACK that newly
acknowledges data.

The code was erroneously using the number of data segments in outgoing
skbs when deciding how many quick-ack credits to remove. This logic
does not make sense, and could cause poor performance in
request-response workloads, like RPC traffic, where requests or
responses can be multi-segment skbs.

When a TCP connection decides to send N quick-acks, that is to
accelerate the cwnd growth of the congestion control module
controlling the remote endpoint of the TCP connection. That quick-ack
decision is purely about the incoming data and outgoing ACKs. It has
nothing to do with the outgoing data or the size of outgoing data.

And in particular, an ACK only serves the intended purpose of allowing
the remote congestion control to grow the congestion window quickly if
the ACK is ACKing or SACKing new data.

The fix is simple: only count packets as serving the goal of the
quickack mechanism if they are ACKing/SACKing new data. We can tell
whether this is the case by checking inet_csk_ack_scheduled(), since
we schedule an ACK exactly when we are ACKing/SACKing new data.

Fixes: fc6415bcb0 ("[TCP]: Fix quick-ack decrementing with TSO.")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231001151239.1866845-1-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 15:34:18 -07:00
Jakub Kicinski
c56e67f3ff netfilter pull request 2023-10-04
-----BEGIN PGP SIGNATURE-----
 
 iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmUdcDQNHGZ3QHN0cmxl
 bi5kZQAKCRBwkajZrV/2AOf7EACipnPx/532GUk1pECg+iWGTfhFOu1jdHjAILzy
 +Ft/kfTLvd8kfZg6DuKIb6KYfj7w7uQ/xcD6wfqV8HBcss0SOyilx2ZUgH8ThwDv
 tSIsUsx1M1gOGkXK713GrD6h/PR5BBv3vVFymvr+MliYH4C2mmsGOGWk5D+s8IqU
 q3XDMMMlsZpfqCA8QGKK7TkFhnvnHdeoHGhZhw9ywXik733Qa4OUbJ5tkxztDKrr
 DKF/FhpYxWPKHURtPXaQpWuni7xbMjg+3lHYlWTRZkQRQOoPWidBuTumqJxvwT3Z
 FYwlS7T7OBMiFByy4spBnBs0uGiA6rR3sZ2/Gn98o9HpYlCllxpZm53Ay0u8sZTL
 RBhMkacOXTWN5n1fbIqHIZc6vs7Tm1crvT2V/CseAuhe9TDiD5cHkaz7hJUQif6h
 dmF48QHCHuSgWGtyPmVbTDSZ02YF++R398zHuBM2TXkFz8B9vI5DRpbXw0yX4ktg
 LZSKnBALOPN5Ye27+W+itNfNaMC3+Elto3Cv9IvpTaXWl8WpF8hnNagLObEXxJ1Q
 3dLRKpSHDKJe7BLQoqm9ESFUE80bZr+S1Xleukz0z7AamCrM/rxQGKBwbTJs2NoE
 1YezWzhw68+aQ7BY8eWigDAQKmtn1Oju3v5u5IekGKQVvXd5x97VGlJQRVxQvr2Z
 jDHNFw==
 =YLDi
 -----END PGP SIGNATURE-----

Merge tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Florian Westphal says:

====================
netfilter patches for net

First patch resolves a regression with vlan header matching, this was
broken since 6.5 release.  From myself.

Second patch fixes an ancient problem with sctp connection tracking in
case INIT_ACK packets are delayed.  This comes with a selftest, both
patches from Xin Long.

Patch 4 extends the existing nftables audit selftest, from
Phil Sutter.

Patch 5, also from Phil, avoids a situation where nftables
would emit an audit record twice. This was broken since 5.13 days.

Patch 6, from myself, avoids spurious insertion failure if we encounter an
overlapping but expired range during element insertion with the
'nft_set_rbtree' backend. This problem exists since 6.2.

* tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
  netfilter: nf_tables: Deduplicate nft_register_obj audit logs
  selftests: netfilter: Extend nft_audit.sh
  selftests: netfilter: test for sctp collision processing in nf_conntrack
  netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
  netfilter: nft_payload: rebuild vlan header on h_proto access
====================

Link: https://lore.kernel.org/r/20231004141405.28749-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 14:53:17 -07:00
Jakub Kicinski
07cf7974a2 netfilter pull request 2023-09-28
-----BEGIN PGP SIGNATURE-----
 
 iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmUVjwUNHGZ3QHN0cmxl
 bi5kZQAKCRBwkajZrV/2AKneEACzrKtIC0j0DyhgVW4Kb57T8Y7cD5wQCv7oz1Cx
 8A3UJ1pSLYhRnz94zY453GIenK+zx/KKIetDhyWnjA9gjk95HkUN+OwuuiKnUAgu
 7KPGbIYat7hERwoZpR88nrbTYXcDZfcZGTqWA++3yL2vn4Lu4lsuowqXYKBf/axk
 5gEwEtwn2mVsdo0qTVJcXkHqnf5CCdqd26ixF4yB1rz/P6kISi4I9q7ul43paFJW
 +/ifacdG+7raQkGlUlYiDNMVd0uO01HHaAcWfYa+FOMK+GSn+89zzTs906CU0g2O
 GRJSWjNTgfDtM2AHN7peUnf/G9XHSK2Y7Re8FzauKzwWSl5N9w5610nbQnT+ME5O
 uOZE1P/lhnidOwCEV8zU4yhs6fBrCMCHz+S5Yh8C8PCUhi12IEEYRHyGCoUVMOwY
 1LINjdn4HddL57QUGumy0VqVBlxQru8VXnlzm0eIyhsbZ3/mVXQWIHX4u1G36UUQ
 zSkm4/qP4kna/tV86mETNX1MUcJsQ1vQ842abcUbxudKei/uT9av6YHlz/aBOcQZ
 NDMrGVO6mjh7/HnYUr7+zbQfhLZdg424SpGEoiuS7dDcTpGlcT3pnWBJDGEHsy+4
 0VnLI8/GPT1/jQCCYTVLu+tn0XmfZF18j2bvGhz1hM9J/HXaRpuqjGF6thLgYl63
 CZf5Yg==
 =ALU2
 -----END PGP SIGNATURE-----

Merge tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
netfilter updates for net-next

First patch, from myself, is a bug fix. The issue (connect timeout) is
ancient, so I think its safe to give this more soak time given the esoteric
conditions needed to trigger this.
Also updates the existing selftest to cover this.

Add netlink extacks when an update references a non-existent
table/chain/set.  This allows userspace to provide much better
errors to the user, from Pablo Neira Ayuso.

Last patch adds more policy checks to nf_tables as a better
alternative to the existing runtime checks, from Phil Sutter.

* tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: Utilize NLA_POLICY_NESTED_ARRAY
  netfilter: nf_tables: missing extended netlink error in lookup functions
  selftests: netfilter: test nat source port clash resolution interaction with tcp early demux
  netfilter: nf_nat: undo erroneous tcp edemux lookup after port clash
====================

Link: https://lore.kernel.org/r/20230928144916.18339-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 14:25:37 -07:00
Randy Dunlap
513dbc10cf page_pool: fix documentation typos
Correct grammar for better readability.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/20231001003846.29541-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 14:22:27 -07:00
Geert Uytterhoeven
2b464cc2fd sctp: Spelling s/preceeding/preceding/g
Fix a misspelling of "preceding".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/663b14d07d6d716ddc34482834d6b65a2f714cfb.1695903447.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 14:04:58 -07:00
Chengfeng Ye
08e50cf071 tipc: fix a potential deadlock on &tx->lock
It seems that tipc_crypto_key_revoke() could be be invoked by
wokequeue tipc_crypto_work_rx() under process context and
timer/rx callback under softirq context, thus the lock acquisition
on &tx->lock seems better use spin_lock_bh() to prevent possible
deadlock.

This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.

tipc_crypto_work_rx() <workqueue>
--> tipc_crypto_key_distr()
--> tipc_bcast_xmit()
--> tipc_bcbase_xmit()
--> tipc_bearer_bc_xmit()
--> tipc_crypto_xmit()
--> tipc_ehdr_build()
--> tipc_crypto_key_revoke()
--> spin_lock(&tx->lock)
<timer interrupt>
   --> tipc_disc_timeout()
   --> tipc_bearer_xmit_skb()
   --> tipc_crypto_xmit()
   --> tipc_ehdr_build()
   --> tipc_crypto_key_revoke()
   --> spin_lock(&tx->lock) <deadlock here>

Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Fixes: fc1b6d6de2 ("tipc: introduce TIPC encryption & authentication")
Link: https://lore.kernel.org/r/20230927181414.59928-1-dg573847474@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 13:24:12 -07:00
Ben Wolsieffer
6f195d6b0d net: stmmac: dwmac-stm32: fix resume on STM32 MCU
The STM32MP1 keeps clk_rx enabled during suspend, and therefore the
driver does not enable the clock in stm32_dwmac_init() if the device was
suspended. The problem is that this same code runs on STM32 MCUs, which
do disable clk_rx during suspend, causing the clock to never be
re-enabled on resume.

This patch adds a variant flag to indicate that clk_rx remains enabled
during suspend, and uses this to decide whether to enable the clock in
stm32_dwmac_init() if the device was suspended.

This approach fixes this specific bug with limited opportunity for
unintended side-effects, but I have a follow up patch that will refactor
the clock configuration and hopefully make it less error prone.

Fixes: 6528e02cc9 ("net: ethernet: stmmac: add adaptation for stm32mp157c.")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230927175749.1419774-1-ben.wolsieffer@hefring.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 13:22:37 -07:00
Dan Carpenter
24a0fbf48c ptp: ocp: fix error code in probe()
There is a copy and paste error so this uses a valid pointer instead of
an error pointer.

Fixes: 09eeb3aecc ("ptp_ocp: implement DPLL ops")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://lore.kernel.org/r/5c581336-0641-48bd-88f7-51984c3b1f79@moroto.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 13:10:41 -07:00
Russell King (Oracle)
d5a590b1b6 net: dsa: mt753x: remove mt753x_phylink_pcs_link_up()
Remove the mt753x_phylink_pcs_link_up() function for two reasons:

1) priv->pcs[i].pcs.neg_mode is set true, meaning it doesn't take a
   MLO_AN_FIXED anymore, but one of PHYLINK_PCS_NEG_*. However, this
   is inconsequential due to...
2) priv->pcs[port].pcs.ops is always initialised to point at
   mt7530_pcs_ops, which does not have a pcs_link_up() member.

So, let's remove mt753x_phylink_pcs_link_up() entirely.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/E1qlTQS-008BWe-Va@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 12:20:50 -07:00
Greg Kroah-Hartman
00f3696f75 net: appletalk: remove cops support
The COPS Appletalk support is very old, never said to actually work
properly, and the firmware code for the devices are under a very suspect
license.  Remove it all to clear up the license issue, if it is still
needed and actually used by anyone, we can add it back later once the
license is cleared up.

Reported-by: Prarit Bhargava <prarit@redhat.com>
Cc: jschlst@samba.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20230927090029.44704-2-gregkh@linuxfoundation.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 11:49:20 -07:00
Benjamin Poirier
0add5c597f ipv4: Set offload_failed flag in fibmatch results
Due to a small omission, the offload_failed flag is missing from ipv4
fibmatch results. Make sure it is set correctly.

The issue can be witnessed using the following commands:
echo "1 1" > /sys/bus/netdevsim/new_device
ip link add dummy1 up type dummy
ip route add 192.0.2.0/24 dev dummy1
echo 1 > /sys/kernel/debug/netdevsim/netdevsim1/fib/fail_route_offload
ip route add 198.51.100.0/24 dev dummy1
ip route
	# 192.168.15.0/24 has rt_trap
	# 198.51.100.0/24 has rt_offload_failed
ip route get 192.168.15.1 fibmatch
	# Result has rt_trap
ip route get 198.51.100.1 fibmatch
	# Result differs from the route shown by `ip route`, it is missing
	# rt_offload_failed
ip link del dev dummy1
echo 1 > /sys/bus/netdevsim/del_device

Fixes: 36c5100e85 ("IPv4: Add "offload failed" indication to routes")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230926182730.231208-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04 11:39:36 -07:00
Linus Torvalds
ba7d997a2a linux-kselftest-fixes-6.6-rc5
This kselftest fixes update for Linux 6.6-rc5 consists of one single
 fix to Makefile to fix the incorrect TARGET name for uevent test.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmUdg68ACgkQCwJExA0N
 QxygqBAAkASI/KbTMEoM+yKLsJM/2xC6M9/2EGJaDLhpwO64Fec73CdLK1sBE/X5
 AkUZ8xT8E8YNykoa9KfOkDJbTaNFW8vxupPL2rDgfFks8zJ+0FBfbv8ZlnStwqk+
 aPaxqRhvu3sUAun53EoHUUY0Z8zD8lYwnYj/F5qejGnyFt6vmaEoq0mmSKyJ7RKo
 fkLIiXY/loiZECpRHxhraRxG3U4DZXasOsDlyloilgHTX2mVuuWx3NDoZd7TdTy7
 myGAKc/+MBYVCKAIAWsgHPQpVIHTWwItg0p7gwROXuCSK0vjGg789wa+K20A0fgM
 PyK77oEPtOfPHMZVAHyC346cvJMsT/3dr48Mq4m9z4nBtBsJ7md9BwXvUNtQAY7A
 FTmDlt6qvMu4ooxFTvjvG3wCUWfAIrAMUij5SazTxMVf6vVreFOmqJgSHxP9o4Ok
 AcCDoqvxn2FtwaOf13D1F2wXvjr6MtD7+SBv8ipLUkBGQ4uSBOmG/rXiXA8Vjbrt
 K8mmH6LW3dg5lUpMlLuGBDBdTJDyIW8b3laiIOhIDZrhl9KzqkPZ1Tll84hUHvOP
 6DSoXJjqxcQj8DCBBqVcJV9VyPuOEpB/ngL8APDHowJoWvULPxbaZYLsnhZnoOBH
 E3pbImyeVlsEwQpm0ZnLj8Kc2quaryjmwYt8AYyFpzSZqLazXzw=
 =W9Tx
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fix from Shuah Khan:
 "One single fix to Makefile to fix the incorrect TARGET name for uevent
  test"

* tag 'linux-kselftest-fixes-6.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: Fix wrong TARGET in kselftest top level Makefile
2023-10-04 11:35:23 -07:00