Commit Graph

1170119 Commits

Author SHA1 Message Date
Brian Gix 52dd5e964a Bluetooth: Remove "Power-on" check from Mesh feature
The Bluetooth mesh experimental feature enable was requiring the
controller to be powered off in order for the Enable to work. Mesh is
supposed to be enablable regardless of the controller state, and created
an unintended requirement that the mesh daemon be started before the
classic bluetoothd daemon.

Fixes: af6bcc1921 ("Bluetooth: Add experimental wrapper for MGMT based mesh")
Signed-off-by: Brian Gix <brian.gix@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-03-22 16:05:55 -07:00
Min Li 1c66bee492 Bluetooth: Fix race condition in hci_cmd_sync_clear
There is a potential race condition in hci_cmd_sync_work and
hci_cmd_sync_clear, and could lead to use-after-free. For instance,
hci_cmd_sync_work is added to the 'req_workqueue' after cancel_work_sync
The entry of 'cmd_sync_work_list' may be freed in hci_cmd_sync_clear, and
causing kernel panic when it is used in 'hci_cmd_sync_work'.

Here's the call trace:

dump_stack_lvl+0x49/0x63
print_report.cold+0x5e/0x5d3
? hci_cmd_sync_work+0x282/0x320
kasan_report+0xaa/0x120
? hci_cmd_sync_work+0x282/0x320
__asan_report_load8_noabort+0x14/0x20
hci_cmd_sync_work+0x282/0x320
process_one_work+0x77b/0x11c0
? _raw_spin_lock_irq+0x8e/0xf0
worker_thread+0x544/0x1180
? poll_idle+0x1e0/0x1e0
kthread+0x285/0x320
? process_one_work+0x11c0/0x11c0
? kthread_complete_and_exit+0x30/0x30
ret_from_fork+0x22/0x30
</TASK>

Allocated by task 266:
kasan_save_stack+0x26/0x50
__kasan_kmalloc+0xae/0xe0
kmem_cache_alloc_trace+0x191/0x350
hci_cmd_sync_queue+0x97/0x2b0
hci_update_passive_scan+0x176/0x1d0
le_conn_complete_evt+0x1b5/0x1a00
hci_le_conn_complete_evt+0x234/0x340
hci_le_meta_evt+0x231/0x4e0
hci_event_packet+0x4c5/0xf00
hci_rx_work+0x37d/0x880
process_one_work+0x77b/0x11c0
worker_thread+0x544/0x1180
kthread+0x285/0x320
ret_from_fork+0x22/0x30

Freed by task 269:
kasan_save_stack+0x26/0x50
kasan_set_track+0x25/0x40
kasan_set_free_info+0x24/0x40
____kasan_slab_free+0x176/0x1c0
__kasan_slab_free+0x12/0x20
slab_free_freelist_hook+0x95/0x1a0
kfree+0xba/0x2f0
hci_cmd_sync_clear+0x14c/0x210
hci_unregister_dev+0xff/0x440
vhci_release+0x7b/0xf0
__fput+0x1f3/0x970
____fput+0xe/0x20
task_work_run+0xd4/0x160
do_exit+0x8b0/0x22a0
do_group_exit+0xba/0x2a0
get_signal+0x1e4a/0x25b0
arch_do_signal_or_restart+0x93/0x1f80
exit_to_user_mode_prepare+0xf5/0x1a0
syscall_exit_to_user_mode+0x26/0x50
ret_from_fork+0x15/0x30

Fixes: 6a98e3836f ("Bluetooth: Add helper for serialized HCI command execution")
Cc: stable@vger.kernel.org
Signed-off-by: Min Li <lm0963hack@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-03-22 16:05:55 -07:00
Kiran K 294d749b5d Bluetooth: btintel: Iterate only bluetooth device ACPI entries
Current flow interates over entire ACPI table entries looking for
Bluetooth Per Platform Antenna Gain(PPAG) entry. This patch iterates
over ACPI entries relvant to Bluetooth device only.

Fixes: c585a92b2f ("Bluetooth: btintel: Set Per Platform Antenna Gain(PPAG)")
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-03-22 16:05:55 -07:00
Pauli Virtanen 2f10e40a94 Bluetooth: ISO: fix timestamped HCI ISO data packet parsing
Use correct HCI ISO data packet header struct when the packet has
timestamp. The timestamp, when present, goes before the other fields
(Core v5.3 4E 5.4.5), so the structs are not compatible.

Fixes: ccf74f2390 ("Bluetooth: Add BTPROTO_ISO socket type")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-03-22 16:05:55 -07:00
Luiz Augusto von Dentz efe375b716 Bluetooth: btusb: Remove detection of ISO packets over bulk
This removes the code introduced by
14202eff21 as hci_recv_frame is now able
to detect ACL packets that are in fact ISO packets.

Fixes: 14202eff21 ("Bluetooth: btusb: Detect if an ACL packet is in fact an ISO packet")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-03-22 16:05:55 -07:00
Luiz Augusto von Dentz 876e78104f Bluetooth: hci_core: Detect if an ACL packet is in fact an ISO packet
Because some transports don't have a dedicated type for ISO packets
(see 14202eff21) they may use ACL type
when in fact they are ISO packets.

In the past this was left for the driver to detect such thing but it
creates a problem when using the likes of btproxy when used by a VM as
the host would not be aware of the connection the guest is doing it
won't be able to detect such behavior, so this make bt_recv_frame
detect when it happens as it is the common interface to all drivers
including guest VMs.

Fixes: 14202eff21 ("Bluetooth: btusb: Detect if an ACL packet is in fact an ISO packet")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-03-22 16:05:55 -07:00
Zhengping Jiang 3c44a431d6 Bluetooth: hci_sync: Resume adv with no RPA when active scan
The address resolution should be disabled during the active scan,
so all the advertisements can reach the host. The advertising
has to be paused before disabling the address resolution,
because the advertising will prevent any changes to the resolving
list and the address resolution status. Skipping this will cause
the hci error and the discovery failure.

According to the bluetooth specification:
"7.8.44 LE Set Address Resolution Enable command

This command shall not be used when:
- Advertising (other than periodic advertising) is enabled,
- Scanning is enabled, or
- an HCI_LE_Create_Connection, HCI_LE_Extended_Create_Connection, or
  HCI_LE_Periodic_Advertising_Create_Sync command is outstanding."

If the host is using RPA, the controller needs to generate RPA for
the advertising, so the advertising must remain paused during the
active scan.

If the host is not using RPA, the advertising can be resumed after
disabling the address resolution.

Fixes: 9afc675ede ("Bluetooth: hci_sync: allow advertise when scan without RPA")
Signed-off-by: Zhengping Jiang <jiangzp@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2023-03-22 16:05:55 -07:00
Linus Torvalds fff5a5e7f5 ARM fix for 6.3-rc
Just one fix for now to eliminate a KASAN false positive.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmQbOREACgkQ9OeQG+St
 rGRjYxAAoMDdypCOx8ULFZEE4g15bROxX8uNUi8RrbMrAN1kd+OgEbSTAq9j1uZh
 2GTueX1oGsBBUM6WHhFJ8zFgdtiEkPYySz5uBnYhyMY7V6MSHbjlHwPjIrfB2M1l
 DD+HoRyeBw3LCHWdwL7Xbo0x3WQeVpThPNhn20Q3AE5RtNGLRD4/du6Ow5psQAKZ
 9SBhSJ3ceG2WPPwF70NDhCw8lwPw7RH4fk9eFsIHKyBpE4I6sDyVXGErMAakhs62
 eg2Eai8Q2bKaUqX2i6/l8D/OT/75bPzgUG/XUsbtGAU6xES0pZawIfCwvLREHXZ8
 1NzxlJt2gsTuqGjrDiD/roUP/m4niZu4zwgDGlTOgtD22m/h2EdzeLR2AObxZpve
 Q/upkek++ZFeqJS4Q0lIKGbU4JZ0y+FE+nnzNSy8Cco/D7rrXC+byIGmx7255ram
 PFfDMvmhJVc/fArO8koLWTLv8o17XZAd4++lzk1qD5MA0uTwndvzmf6ayWJ4913j
 dSO8KN6z6nTsgiorv2/Sh9qUEocLm0s9BI88/FoQ5RGlbnnioKGco+v7hXpwRcyz
 Tg93hAZXjbNEPD7iEglPaZY4d3fMIandtEN9QGlZ5Ze+8TDq3Pb+tUtAXRi2g0Ej
 hXc0hjeOAvEIOgqecEXcpchT0Crkb0szuesiuKersL4hqOJkacA=
 =2Z8M
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fix from Russell King:
 "Just one fix for now to eliminate a KASAN false positive"

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9290/1: uaccess: Fix KASAN false-positives
2023-03-22 14:15:05 -07:00
Linus Torvalds c13e02d335 Bootconfig fixes v6.3
- Fix bootconfig test script to test increased maximum number (8192) node
   correctly.
 
 - Change the console message if there is no bootconfig data and the
   kernel is compiled with CONFIG_BOOT_CONFIG_FORCE=y.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmQbDHsACgkQ2/sHvwUr
 Pxs8Nwf/S8r+Z8+1aRZ3lPLpGgT50CgaeceMcSuxhbTWgKXHTk1Gg/ShIqPMmiQW
 Lx7M9wL1ictPQ0Xk2hzJL5vD15RI6dIv+kLsRxyMHiqusR09/IIxkX3SbihjGQuf
 Tpf5Fq4yfEH3eWLVA6/WN+cv2pH+TmlHv+sYVSf01BcFfmhC4p8YKTMJpM8yJfnh
 Ux9ugxY5JwceDCRBoL88Wx7wKPHDIVs25mnYWSR82zohsJhqW0CFhOPHpziyqyQX
 FEt9CFqfrJ+qild9bkUhgAFze0EaetZaRjdXJhrwFZZ2UkTqT/RTZgWCbwJ10x8j
 rPNNTgpIpPSCZ17bZQd5CwpY6SGMtg==
 =tj7z
 -----END PGP SIGNATURE-----

Merge tag 'bootconfig-fixes-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull bootconfig fixes from Masami Hiramatsu:

 - Fix bootconfig test script to test increased maximum number (8192)
   node correctly

 - Change the console message if there is no bootconfig data and the
   kernel is compiled with CONFIG_BOOT_CONFIG_FORCE=y

* tag 'bootconfig-fixes-v6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  bootconfig: Change message if no bootconfig with CONFIG_BOOT_CONFIG_FORCE=y
  bootconfig: Fix testcase to increase max node
2023-03-22 14:09:20 -07:00
Jesper Dangaard Brouer 915efd8a44 xdp: bpf_xdp_metadata use EOPNOTSUPP for no driver support
When driver doesn't implement a bpf_xdp_metadata kfunc the fallback
implementation returns EOPNOTSUPP, which indicate device driver doesn't
implement this kfunc.

Currently many drivers also return EOPNOTSUPP when the hint isn't
available, which is ambiguous from an API point of view. Instead
change drivers to return ENODATA in these cases.

There can be natural cases why a driver doesn't provide any hardware
info for a specific hint, even on a frame to frame basis (e.g. PTP).
Lets keep these cases as separate return codes.

When describing the return values, adjust the function kernel-doc layout
to get proper rendering for the return values.

Fixes: ab46182d0d ("net/mlx4_en: Support RX XDP metadata")
Fixes: bc8d405b1b ("net/mlx5e: Support RX XDP metadata")
Fixes: 306531f024 ("veth: Support RX XDP metadata")
Fixes: 3d76a4d3d4 ("bpf: XDP metadata RX kfuncs")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/167940675120.2718408.8176058626864184420.stgit@firesoul
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-22 09:11:09 -07:00
Ido Schimmel bb765a7433 mlxsw: spectrum_fid: Fix incorrect local port type
Local port is a 10-bit number, but it was mistakenly stored in a u8,
resulting in firmware errors when using a netdev corresponding to a
local port higher than 255.

Fix by storing the local port in u16, as is done in the rest of the
code.

Fixes: bf73904f5f ("mlxsw: Add support for 802.1Q FID family")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/eace1f9d96545ab8a2775db857cb7e291a9b166b.1679398549.git.petrm@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-03-22 15:50:32 +01:00
Masami Hiramatsu (Google) caa0708a81 bootconfig: Change message if no bootconfig with CONFIG_BOOT_CONFIG_FORCE=y
Change no bootconfig data error message if user do not specify 'bootconfig'
option but CONFIG_BOOT_CONFIG_FORCE=y.
With CONFIG_BOOT_CONFIG_FORCE=y, the kernel proceeds bootconfig check even
if user does not specify 'bootconfig' option. So the current error message
is confusing. Let's show just an information message to notice skipping
the bootconfig in that case.

Link: https://lore.kernel.org/all/167754610254.318944.16848412476667893329.stgit@devnote2/

Fixes: b743852ccc ("Allow forcing unconditional bootconfig processing")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/all/CAMuHMdV9jJvE2y8gY5V_CxidUikCf5515QMZHzTA3rRGEOj6=w@mail.gmail.com/
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Mukesh Ojha <quic_mojha@quicinc.com>
2023-03-22 22:21:43 +09:00
Felix Fietkau f355f70145 wifi: mac80211: fix mesh path discovery based on unicast packets
If a packet has reached its intended destination, it was bumped to the code
that accepts it, without first checking if a mesh_path needs to be created
based on the discovered source.
Fix this by moving the destination address check further down.

Cc: stable@vger.kernel.org
Fixes: 986e43b19a ("wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20230314095956.62085-3-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-22 13:46:46 +01:00
Felix Fietkau 4e348c6c6e wifi: mac80211: fix qos on mesh interfaces
When ieee80211_select_queue is called for mesh, the sta pointer is usually
NULL, since the nexthop is looked up much later in the tx path.
Explicitly check for unicast address in that case in order to make qos work
again.

Cc: stable@vger.kernel.org
Fixes: 50e2ab3929 ("wifi: mac80211: fix queue selection for mesh/OCB interfaces")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20230314095956.62085-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-22 13:46:38 +01:00
Johannes Berg 923bf981eb wifi: iwlwifi: mvm: protect TXQ list manipulation
Some recent upstream debugging uncovered the fact that in
iwlwifi, the TXQ list manipulation is racy.

Introduce a new state bit for when the TXQ is completely
ready and can be used without locking, and if that's not
set yet acquire the lock to check everything correctly.

Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Tested-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-22 13:14:24 +01:00
Johannes Berg b58e3d4311 wifi: iwlwifi: mvm: fix mvmtxq->stopped handling
This could race if the queue is redirected while full, then
the flushing internally would start it while it's not yet
usable again. Fix it by using two state bits instead of just
one.

Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Tested-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-03-22 13:14:18 +01:00
Zhang Changzhong 4107b8746d net/sonic: use dma_mapping_error() for error check
The DMA address returned by dma_map_single() should be checked with
dma_mapping_error(). Fix it accordingly.

Fixes: efcce83936 ("[PATCH] macsonic/jazzsonic network drivers update")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/6645a4b5c1e364312103f48b7b36783b94e197a2.1679370343.git.fthain@linux-m68k.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21 21:29:34 -07:00
Jakub Kicinski 82463d9dbb Merge branch 'fix-trainwreck-with-ocelot-switch-statistics-counters'
Vladimir Oltean says:

====================
Fix trainwreck with Ocelot switch statistics counters

While testing the patch set for preemptible traffic classes with some
controlled traffic and measuring counter deltas:
https://lore.kernel.org/netdev/20230220122343.1156614-1-vladimir.oltean@nxp.com/

I noticed that in the output of "ethtool -S swp0 --groups eth-mac
eth-phy eth-ctrl rmon -- --src emac | grep -v ': 0'", the TX counters
were off. Quickly I realized that their values were permutated by 1
compared to their names, and that for example
tx-rmon-etherStatsPkts64to64Octets was incrementing when
tx-rmon-etherStatsPkts65to127Octets should have.

Initially I suspected something having to do with the bulk reading
logic, and indeed I found a bug there (fixed as 1/3), but that was not
the source of the problems. Instead it revealed other problems.

While dumping the regions created by the driver on my switch, I figured
out that it sees a discontinuity which shouldn't have existed between
reg 0x278 and reg 0x280.

Discontinuity between last reg 0x0 and new reg 0x0, creating new region
Discontinuity between last reg 0x108 and new reg 0x200, creating new region
Discontinuity between last reg 0x278 and new reg 0x280, creating new region
Discontinuity between last reg 0x2b0 and new reg 0x400, creating new region
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 31 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 13 contiguous counters starting with SYS:STAT:CNT[0x0a0]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]

That is where TX_MM_HOLD should have been, and that was the bug, since
it was missing. After adding it, the regions look like this and the
off-by-one issue is resolved:

Discontinuity between last reg 0x000000 and new reg 0x000000, creating new region
Discontinuity between last reg 0x000108 and new reg 0x000200, creating new region
Discontinuity between last reg 0x0002b0 and new reg 0x000400, creating new region
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 45 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]

However, as I am thinking out loud, it should have not reported the
other counters as off by one even when skipping TX_MM_HOLD... after all,
on Ocelot/Seville, there are more counters which need to be skipped.

Which is when I investigated and noticed the bug solved in 2/3.
I've validated that both on native VSC9959 (which uses
ocelot_mm_stats_layout) as well as by faking the other switches by
making VSC9959 use the plain ocelot_stats_layout.

To summarize: on all Ocelot switches, the TX counters and drop counters
are completely broken. The RX counters are mostly fine.

With this occasion, I have collected more cleanup patches in this area,
which I'm going to submit after the net -> net-next merge.
====================

Link: https://lore.kernel.org/r/20230321010325.897817-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21 21:28:08 -07:00
Vladimir Oltean 5291099e0f net: mscc: ocelot: add TX_MM_HOLD to ocelot_mm_stats_layout
The lack of a definition for this counter is what initially prompted me
to investigate a problem which really manifested itself as the previous
change, "net: mscc: ocelot: fix transfer from region->buf to ocelot->stats".

When TX_MM_HOLD is defined in enum ocelot_stat but not in struct
ocelot_stat_layout ocelot_mm_stats_layout, this creates a hole, which
due to the aforementioned bug, makes all counters following TX_MM_HOLD
be recorded off by one compared to their correct position. So for
example, a non-zero TX_PMAC_OCTETS would be reported as TX_MERGE_FRAGMENTS,
TX_PMAC_UNICAST would be reported as TX_PMAC_OCTETS, TX_PMAC_64 would be
reported as TX_PMAC_PAUSE, etc etc. This is because the size of the hole
(1) is much smaller than the size of the region, so the phenomenon where
the stats are off-by-one, rather than lost, prevails.

However, the phenomenon where stats are lost can be seen too, for
example with DROP_LOCAL, which is at the beginning of its own region
(offset 0x000400 vs the previous 0x0002b0 constitutes a discontinuity).
This is also reported as off by one and saved to TX_PMAC_1527_MAX, but
that counter is not reported to the unstructured "ethtool -S", as
opposed to DROP_LOCAL which is (as "drop_local").

Fixes: ab3f97a961 ("net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21 21:28:07 -07:00
Vladimir Oltean 17dfd21045 net: mscc: ocelot: fix transfer from region->buf to ocelot->stats
To understand the problem, we need some definitions.

The driver is aware of multiple counters (enum ocelot_stat), yet not all
switches supported by the driver implement all counters. There are 2
statistics layouts: ocelot_stats_layout and ocelot_mm_stats_layout, the
latter having 36 counters more than the former.

ocelot->stats[] is not a compact array, i.e. there are elements within
it which are not going to be populated for ocelot_stats_layout. On the
other hand, ocelot->stats[] is easily indexable, for example "tx_octets"
for port 3 can be found at ocelot->stats[3 * OCELOT_NUM_STATS +
OCELOT_STAT_TX_OCTETS], and that is why we keep it sparse.

Regions, as created by ocelot_prepare_stats_regions(), are compact
(every element from region->buf will correspond to a counter that is
present in this switch's layout) but are not easily indexable.

Let's define holes as the ranges of values of enum ocelot_stat for which
ocelot_stats_layout doesn't have a "reg" defined. For example, there is
a hole between OCELOT_STAT_RX_GREEN_PRIO_7 and OCELOT_STAT_TX_OCTETS
which is of 23 elements that are only present on ocelot_mm_stats_layout,
and as such, they are also present in enum ocelot_stat. Let's define the
left extremity of the hole - the last enum ocelot_stat still defined -
as A (in this case OCELOT_STAT_RX_GREEN_PRIO_7) and the right extremity -
the first enum ocelot_stat that is defined after a series of undefined
ones - as B (in this case OCELOT_STAT_TX_OCTETS).

There is a bug in the procedure which transfers stats from region->buf[]
to ocelot->stats[].

For each hole in the ocelot_stats_layout, the logic transfers the stats
starting with enum ocelot_stat B to ocelot->stats[] index A + 1. So all
stats after a hole are saved to a position which is off by B - A + 1
elements.

This causes 2 kinds of issues:
(a) counters which shouldn't increment increment
(b) counters which should increment don't

Holes in the ocelot_stat_layout automatically imply the end of a region
and the beginning of a new one; however the reverse is not necessarily
true. For example, for ocelot_mm_stat_layout, there could be multiple
regions (which indicate discontinuities in register addresses) while
there is no hole (which indicates discontinuities in enum ocelot_stat
values).

In the example above, the stats from the second region->buf[] are not
transferred to ocelot->stats starting with index
"port * OCELOT_NUM_STATS + OCELOT_STAT_TX_OCTETS" as they should, but
rather, starting with element
"port * OCELOT_NUM_STATS + OCELOT_STAT_RX_GREEN_PRIO_7 + 1".

That stats[] array element is not reported to user space for switches
that use ocelot_stat_layout, and that is how issue (b) occurs.

However, if the length of the second region is larger than the hole,
then some stats will start to be transferred to the ocelot->stats[]
indices which *are* reported to user space, but those indices contain
wrong values (corresponding to unexpected counters). This is how issue
(a) occurs.

The procedure, as it was introduced in commit d87b1c08f3 ("net: mscc:
ocelot: use bulk reads for stats"), was not buggy, because there were no
holes in the struct ocelot_stat_layout instances at that time. The
problem is that when those holes were introduced, the function was not
updated to take them into consideration.

To update the procedure, we need to know, for each region, which enum
ocelot_stat corresponds to its region->base. We have no way of deducing
that based on the contents of struct ocelot_stats_region, so we need to
add this information.

Fixes: ab3f97a961 ("net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21 21:27:51 -07:00
Vladimir Oltean 6acc72a43e net: mscc: ocelot: fix stats region batching
The blamed commit changed struct ocelot_stat_layout :: "u32 offset" to
"u32 reg".

However, "u32 reg" is not quite a register address, but an enum
ocelot_reg, which in itself encodes an enum ocelot_target target in the
upper bits, and an index into the ocelot->map[target][] array in the
lower bits.

So, whereas the previous code comparison between stats_layout[i].offset
and last + 1 was correct (because those "offsets" at the time were
32-bit relative addresses), the new code, comparing layout[i].reg to
last + 4 is not correct, because the "reg" here is an enum/index, not an
actual register address.

What we want to compare are indeed register addresses, but to do that,
we need to actually go through the same motions as
__ocelot_bulk_read_ix() itself.

With this bug, all statistics counters are deemed by
ocelot_prepare_stats_regions() as constituting their own region.
(Truncated) log on VSC9959 (Felix) below (prints added by me):

Before:

region of 1 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x001]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x002]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x041]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x042]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x081]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x0ac]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x100]
region of 1 contiguous counters starting with SYS:STAT:CNT[0x101]
...
region of 1 contiguous counters starting with SYS:STAT:CNT[0x111]

After:

region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 45 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]

Since commit d87b1c08f3 ("net: mscc: ocelot: use bulk reads for
stats") intended bulking as a performance improvement, and since now,
with trivial-sized regions, performance is even worse than without
bulking at all, this could easily qualify as a performance regression.

Fixes: d4c3676507 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21 21:27:10 -07:00
Eric Dumazet 8e50ed7745 erspan: do not use skb_mac_header() in ndo_start_xmit()
Drivers should not assume skb_mac_header(skb) == skb->data in their
ndo_start_xmit().

Use skb_network_offset() and skb_transport_offset() which
better describe what is needed in erspan_fb_xmit() and
ip6erspan_tunnel_xmit()

syzbot reported:
WARNING: CPU: 0 PID: 5083 at include/linux/skbuff.h:2873 skb_mac_header include/linux/skbuff.h:2873 [inline]
WARNING: CPU: 0 PID: 5083 at include/linux/skbuff.h:2873 ip6erspan_tunnel_xmit+0x1d9c/0x2d90 net/ipv6/ip6_gre.c:962
Modules linked in:
CPU: 0 PID: 5083 Comm: syz-executor406 Not tainted 6.3.0-rc2-syzkaller-00866-gd4671cb96fa3 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
RIP: 0010:skb_mac_header include/linux/skbuff.h:2873 [inline]
RIP: 0010:ip6erspan_tunnel_xmit+0x1d9c/0x2d90 net/ipv6/ip6_gre.c:962
Code: 04 02 41 01 de 84 c0 74 08 3c 03 0f 8e 1c 0a 00 00 45 89 b4 24 c8 00 00 00 c6 85 77 fe ff ff 01 e9 33 e7 ff ff e8 b4 27 a1 f8 <0f> 0b e9 b6 e7 ff ff e8 a8 27 a1 f8 49 8d bf f0 0c 00 00 48 b8 00
RSP: 0018:ffffc90003b2f830 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 000000000000ffff RCX: 0000000000000000
RDX: ffff888021273a80 RSI: ffffffff88e1bd4c RDI: 0000000000000003
RBP: ffffc90003b2f9d8 R08: 0000000000000003 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000000 R12: ffff88802b28da00
R13: 00000000000000d0 R14: ffff88807e25b6d0 R15: ffff888023408000
FS: 0000555556a61300(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055e5b11eb6e8 CR3: 0000000027c1b000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__netdev_start_xmit include/linux/netdevice.h:4900 [inline]
netdev_start_xmit include/linux/netdevice.h:4914 [inline]
__dev_direct_xmit+0x504/0x730 net/core/dev.c:4300
dev_direct_xmit include/linux/netdevice.h:3088 [inline]
packet_xmit+0x20a/0x390 net/packet/af_packet.c:285
packet_snd net/packet/af_packet.c:3075 [inline]
packet_sendmsg+0x31a0/0x5150 net/packet/af_packet.c:3107
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg+0xde/0x190 net/socket.c:747
__sys_sendto+0x23a/0x340 net/socket.c:2142
__do_sys_sendto net/socket.c:2154 [inline]
__se_sys_sendto net/socket.c:2150 [inline]
__x64_sys_sendto+0xe1/0x1b0 net/socket.c:2150
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f123aaa1039
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc15d12058 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f123aaa1039
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000020000040 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f123aa648c0
R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000

Fixes: 1baf5ebf89 ("erspan: auto detect truncated packets.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230320163427.8096-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21 21:16:26 -07:00
Li Zetao 4fe3c88552 atm: idt77252: fix kmemleak when rmmod idt77252
There are memory leaks reported by kmemleak:

  unreferenced object 0xffff888106500800 (size 128):
    comm "modprobe", pid 1017, jiffies 4297787785 (age 67.152s)
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<00000000970ce626>] __kmem_cache_alloc_node+0x20c/0x380
      [<00000000fb5f78d9>] kmalloc_trace+0x2f/0xb0
      [<000000000e947e2a>] idt77252_init_one+0x2847/0x3c90 [idt77252]
      [<000000006efb048e>] local_pci_probe+0xeb/0x1a0
    ...

  unreferenced object 0xffff888106500b00 (size 128):
    comm "modprobe", pid 1017, jiffies 4297787785 (age 67.152s)
    hex dump (first 32 bytes):
      00 20 3d 01 80 88 ff ff 00 20 3d 01 80 88 ff ff  . =...... =.....
      f0 23 3d 01 80 88 ff ff 00 20 3d 01 00 00 00 00  .#=...... =.....
    backtrace:
      [<00000000970ce626>] __kmem_cache_alloc_node+0x20c/0x380
      [<00000000fb5f78d9>] kmalloc_trace+0x2f/0xb0
      [<00000000f451c5be>] alloc_scq.constprop.0+0x4a/0x400 [idt77252]
      [<00000000e6313849>] idt77252_init_one+0x28cf/0x3c90 [idt77252]

The root cause is traced to the vc_maps which alloced in open_card_oam()
are not freed in close_card_oam(). The vc_maps are used to record
open connections, so when close a vc_map in close_card_oam(), the memory
should be freed. Moreover, the ubr0 is not closed when close a idt77252
device, leading to the memory leak of vc_map and scq_info.

Fix them by adding kfree in close_card_oam() and implementing new
close_card_ubr0() to close ubr0.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Francois Romieu <romieu@fr.zoreil.com>
Link: https://lore.kernel.org/r/20230320143318.2644630-1-lizetao1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21 20:19:28 -07:00
Álvaro Fernández Rojas 032a954061 net: dsa: tag_brcm: legacy: fix daisy-chained switches
When BCM63xx internal switches are connected to switches with a 4-byte
Broadcom tag, it does not identify the packet as VLAN tagged, so it adds one
based on its PVID (which is likely 0).
Right now, the packet is received by the BCM63xx internal switch and the 6-byte
tag is properly processed. The next step would to decode the corresponding
4-byte tag. However, the internal switch adds an invalid VLAN tag after the
6-byte tag and the 4-byte tag handling fails.
In order to fix this we need to remove the invalid VLAN tag after the 6-byte
tag before passing it to the 4-byte tag decoding.

Fixes: 964dbf186e ("net: dsa: tag_brcm: add support for legacy tags")
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230319095540.239064-1-noltari@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-21 17:29:13 -07:00
Linus Torvalds a1effab7a3 VFIO fixes for v6.3-rc4
- Fix dirty size reporting for pre-copy in mlx5 variant driver (Yishai Hadas)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmQaLRwbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiQjMP+we+dpw91BTxcLShKrrc
 A1Mzil0gKLeztDFNH04rbi45JshjXZkzbFKH/e0LXumud3Rnywna6Lk63TiwM7qB
 pAWdy+SXffOMCAIeEep6McjblWheYTmydJjHtuPLOqC5ZJEN2aD9adBzqvtSyG19
 COxkZafBZl0m0AfMkEbguv6b14ywQgTkhq8lYpYLsYiZYUOd2M5jy3cATJ9KmcXR
 4XSJmv6P9jDa0cci+g+gphDAz90zI8kdTD4Mq9InE56JE1bBMr4/fes79Rbheiui
 cpZCkFoBSMg0sOqrn2189Hnzc9gnbInPTMfx3Nz6mh5XH3PlAJurfMRd9I9itUQx
 D/gAysnnlLP8MQntJfCf2o0Njb7a3PqO7BFjXmG7ZK2ZOP+6G8ZIeALQXn/lkbYF
 RPJwXPEzlo+Pe8oFg/rTD9PqDVFXK6TjQuStIZNT0u1XpJwNZuYr7qSGXD/OdoUa
 Lkurr7rM/KsOA2fYrrW0jm9tPxEI9ilTSz9ILxkMc5RAzvF2DXkJ2VNLtLY+Lc7d
 c4IWSgUFBCO+vtvYJyMVsTeVcqkbKauYOQ8ItQCv1j9kg6xb5b8qzerj9CJDFTSE
 T1PG9wvsZBW9pWbBzAgXHgUt7ohiF2UynOO4jZUCcEXKTOrzxAK8rwoKGhS6j1CF
 ID36fHwRLnTx8z9yH1WiXoGd
 =VSl2
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.3-rc4' of https://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:

 - Fix dirty size reporting for pre-copy in mlx5 variant driver (Yishai
   Hadas)

* tag 'vfio-v6.3-rc4' of https://github.com/awilliam/linux-vfio:
  vfio/mlx5: Fix the report of dirty_bytes upon pre-copy
2023-03-21 15:46:39 -07:00
Linus Torvalds a2eaf246f5 nfsd-6.3 fixes:
- Fix a crash during NFS READs from certain client implementations
 - Address a minor kbuild regression in v6.3
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmQaEJgACgkQM2qzM29m
 f5ceYBAAiGHz+ISIIirg2+B5DttDL5ssPoqbgfZ16JZfBgaDduOg8oK7pbILv1nf
 4ufZQRcLpJDSnzhqNXxQE/L4Vsei8+tqkbwILSqFsCBZk5ei2AYSsZMu7Ptf8X8X
 Y+KZ55Bko0uTYAsglnPiEaDEGeq0ZYwfUc5/bnw7ZzXdAHhlwgjoAads6Q25A9f1
 6ZuKQ6nIis/ok9q88GdLf7x51IdKrXTrkC1iYAeVTXCgR3qXPezqmPOzW44rZs0t
 TMzUgDuyPgvB5eT/N+pwXoI9Bli9v1PFoXZjm2NchhJ/Iy6Gt1eEEB93J/qRSU/x
 ziJLkk0jnJEPWJ0Ht5li4+60b58t2OoWCTJTr3FzHCcF5Iqr7YjenSi2U8IT84ct
 EeFdqR2nZpJ4Y1wbAhTjQ2w6rw5WpaXDLHCRRdE7OF/D121n3FkSNo8DtSn+fbR8
 sFDpOruZn4tjgpf3AjCR1zbyKw9baC6clhnLrW6AE6T4ht9XwfE6eCUSy48FSMzg
 4IZKLcofQh2UEvlVtCvGOjtcK151eFKocsq+p6/yhYE6BM20Z3dBqhw9iBzqOAPg
 0qZk60AWl9W5h7M+ovkGAZpwXv4djsdWf7LbFuYOQ0kodzXOQ/5sFQm4zC5F3I1Z
 8WPPA9SkYyNiNSBUt65tca4JNw5h6x9A3scPWtuHB7Lz3fuGTnE=
 =frHb
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Fix a crash during NFS READs from certain client implementations

 - Address a minor kbuild regression in v6.3

* tag 'nfsd-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: don't replace page in rq_pages if it's a continuation of last page
  NFS & NFSD: Update GSS dependencies
2023-03-21 14:48:38 -07:00
Dan Carpenter 640fcdbcf2 net/mlx5: E-Switch, Fix an Oops in error handling code
The error handling dereferences "vport".  There is nothing we can do if
it is an error pointer except returning the error code.

Fixes: 133dcfc577 ("net/mlx5: E-Switch, Alloc and free unique metadata for match")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-21 14:06:32 -07:00
Maher Sanalla 44d553188c net/mlx5: Read the TC mapping of all priorities on ETS query
When ETS configurations are queried by the user to get the mapping
assignment between packet priority and traffic class, only priorities up
to maximum TCs are queried from QTCT register in FW to retrieve their
assigned TC, leaving the rest of the priorities mapped to the default
TC #0 which might be misleading.

Fix by querying the TC mapping of all priorities on each ETS query,
regardless of the maximum number of TCs configured in FW.

Fixes: 820c2c5e77 ("net/mlx5e: Read ETS settings directly from firmware")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-21 14:06:32 -07:00
Emeel Hakim 7e3fce82d9 net/mlx5e: Overcome slow response for first macsec ASO WQE
First ASO WQE poll causes a cache miss in hardware hence the resut is
delayed. It causes to the situation where such WQE is polled earlier
than it is needed.

Add logic to retry ASO CQ polling operation.

Fixes: 739cfa3451 ("net/mlx5: Make ASO poll CQ usable in atomic context") 
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-21 14:06:32 -07:00
Roy Novich 6e9d51b1a5 net/mlx5e: Initialize link speed to zero
mlx5e_port_max_linkspeed does not guarantee value assignment for speed.
Avoid cases where link_speed might be used uninitialized. In case
mlx5e_port_max_linkspeed fails, a default link speed of 50000 will be
used for the calculations.

Fixes: 3f6d08d196 ("net/mlx5e: Add RSS support for hairpin")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-21 14:06:31 -07:00
Lama Kayal 922f56e9a7 net/mlx5: Fix steering rules cleanup
vport's mc, uc and multicast rules are not deleted in teardown path when
EEH happens. Since the vport's promisc settings(uc, mc and all) in
firmware are reset after EEH, mlx5 driver will try to delete the above
rules in the initialization path. This cause kernel crash because these
software rules are no longer valid.

Fix by nullifying these rules right after delete to avoid accessing any dangling
pointers.

Call Trace:
__list_del_entry_valid+0xcc/0x100 (unreliable)
tree_put_node+0xf4/0x1b0 [mlx5_core]
tree_remove_node+0x30/0x70 [mlx5_core]
mlx5_del_flow_rules+0x14c/0x1f0 [mlx5_core]
esw_apply_vport_rx_mode+0x10c/0x200 [mlx5_core]
esw_update_vport_rx_mode+0xb4/0x180 [mlx5_core]
esw_vport_change_handle_locked+0x1ec/0x230 [mlx5_core]
esw_enable_vport+0x130/0x260 [mlx5_core]
mlx5_eswitch_enable_sriov+0x2a0/0x2f0 [mlx5_core]
mlx5_device_enable_sriov+0x74/0x440 [mlx5_core]
mlx5_load_one+0x114c/0x1550 [mlx5_core]
mlx5_pci_resume+0x68/0xf0 [mlx5_core]
eeh_report_resume+0x1a4/0x230
eeh_pe_dev_traverse+0x98/0x170
eeh_handle_normal_event+0x3e4/0x640
eeh_handle_event+0x4c/0x370
eeh_event_handler+0x14c/0x210
kthread+0x168/0x1b0
ret_from_kernel_thread+0x5c/0x84

Fixes: a35f71f27a ("net/mlx5: E-Switch, Implement promiscuous rx modes vf request handling")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-21 14:06:31 -07:00
Gavin Li 662404b24a net/mlx5e: Block entering switchdev mode with ns inconsistency
Upon entering switchdev mode, VF/SF representors are spawned in the
devlink instance's net namespace, whereas the PF net device transforms
into the uplink representor, remaining in the net namespace the PF net
device was in. Therefore, if a PF net device's namespace is different from
its parent devlink net namespace, entering switchdev mode can create an
illegal situation where all representors sharing the same core device
are NOT in the same net namespace.

To avoid this issue, block entering switchdev mode for devices whose child
netdev net namespace has diverged from the parent devlink's.

Fixes: 7768d1971d ("net/mlx5: E-Switch, Add control for encapsulation")
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-21 14:06:31 -07:00
Gavin Li c83172b063 net/mlx5e: Set uplink rep as NETNS_LOCAL
Previously, NETNS_LOCAL was not set for uplink representors, inconsistent
with VF representors, and allowed the uplink representor to be moved
between net namespaces and separated from the VF representors it shares
the core device with. Such usage would break the isolation model of
namespaces, as devices in different namespaces would have access to
shared memory.

To solve this issue, set NETNS_LOCAL for uplink representors if eswitch is
in switchdev mode.

Fixes: 7a9fb35e8c ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode")
Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Gavi Teitz <gavi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-03-21 14:06:31 -07:00
Daniel Borkmann 10ec8ca8ec bpf: Adjust insufficient default bpf_jit_limit
We've seen recent AWS EKS (Kubernetes) user reports like the following:

  After upgrading EKS nodes from v20230203 to v20230217 on our 1.24 EKS
  clusters after a few days a number of the nodes have containers stuck
  in ContainerCreating state or liveness/readiness probes reporting the
  following error:

    Readiness probe errored: rpc error: code = Unknown desc = failed to
    exec in container: failed to start exec "4a11039f730203ffc003b7[...]":
    OCI runtime exec failed: exec failed: unable to start container process:
    unable to init seccomp: error loading seccomp filter into kernel:
    error loading seccomp filter: errno 524: unknown

  However, we had not been seeing this issue on previous AMIs and it only
  started to occur on v20230217 (following the upgrade from kernel 5.4 to
  5.10) with no other changes to the underlying cluster or workloads.

  We tried the suggestions from that issue (sysctl net.core.bpf_jit_limit=452534528)
  which helped to immediately allow containers to be created and probes to
  execute but after approximately a day the issue returned and the value
  returned by cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}'
  was steadily increasing.

I tested bpf tree to observe bpf_jit_charge_modmem, bpf_jit_uncharge_modmem
their sizes passed in as well as bpf_jit_current under tcpdump BPF filter,
seccomp BPF and native (e)BPF programs, and the behavior all looks sane
and expected, that is nothing "leaking" from an upstream perspective.

The bpf_jit_limit knob was originally added in order to avoid a situation
where unprivileged applications loading BPF programs (e.g. seccomp BPF
policies) consuming all the module memory space via BPF JIT such that loading
of kernel modules would be prevented. The default limit was defined back in
2018 and while good enough back then, we are generally seeing far more BPF
consumers today.

Adjust the limit for the BPF JIT pool from originally 1/4 to now 1/2 of the
module memory space to better reflect today's needs and avoid more users
running into potentially hard to debug issues.

Fixes: fdadd04931 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Reported-by: Stephen Haynes <sh@synk.net>
Reported-by: Lefteris Alexakis <lefteris.alexakis@kpn.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://github.com/awslabs/amazon-eks-ami/issues/1179
Link: https://github.com/awslabs/amazon-eks-ami/issues/1219
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230320143725.8394-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-21 12:43:05 -07:00
Linus Torvalds 2faac9a98f keyrings fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmQZ234ACgkQ+7dXa6fL
 C2t9Yw/+O5uA0kyRp/3xgJ/6T2zbQEFdzUZVSif2P/Kv5uYx+1qC/E00bYOWNIIX
 dWey23HHHyNm0dWmR39SY9flixnIRFT83W4+chmHkeboXuXzL4I1s2+W2WOKqdoX
 H3X+uoK6eew3S4OEMwtlWoKebIzzzJls72kaSiCW63yhVOpiOI94muNdUVRQy7ll
 iXDiI5pB/cBvOAfLazrTnnhmcsUoGhW9qfJFI7kZM7JsVxvk0Fw6G1T35yEiwR6M
 GO9ZvJPNtan3jzKzJaUc5qtYclqSj9wzxiY3x3uPe2afLsb2KiFKWh0e89DNmr9D
 DyPfPvj7gJdFVLeE/wkdJNkAPgC8oN7F3uUnZaJjwtqItuCOp5HTYQtZlK532Xej
 0OnLcxPj18/SkbIZ2HcE2akwhfChbLghdAioqgsKU/pE1nUP69HBsiLn5PkL/1zK
 f1dUN+L6MggyQGJRKaQdKPNaY26bHg9GkSZdxPokkQ8QF/GSvmw1cqQZrox3qUQp
 9RhfHyh1GRI1MFlVfhVGfhTO8F7y6JwG1xTUJpaCVFgLVTNT9y7L9DAzH9YKGb79
 pWgf9OspfxoDJrKREPK6inC3x2O5I6ZruET44S5VY9iWlRmMY/CTQ/fJu6rIcCr+
 kHejN8rgU8N0Hc9WnZrSe1BS6e18jhZMMQtrF8/6md/OkoX13fU=
 =RrmT
 -----END PGP SIGNATURE-----

Merge tag 'keys-fixes-20230321' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull keyrings fixes from David Howells:

 - Fix request_key() so that it doesn't cache a looked up key on the
   current thread if that thread is a kernel thread.

   The cache is cleared during notify_resume - but that doesn't happen
   in kernel threads. This is causing cifs DNS keys to be
   un-invalidateable.

 - Fix a wrapper check in verify_pefile() to not round up the length.

 - Change asymmetric_keys code to log errors to make it easier for users
   to work out why failures occurred.

* tag 'keys-fixes-20230321' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  asymmetric_keys: log on fatal failures in PE/pkcs7
  verify_pefile: relax wrapper length check
  keys: Do not cache key in task struct if key is requested from kernel thread
2023-03-21 11:38:58 -07:00
Radoslaw Tyl c672297bbc i40e: fix flow director packet filter programming
Initialize to zero structures to build a valid
Tx Packet used for the filter programming.

Fixes: a9219b332f ("i40e: VLAN field for flow director")
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-21 11:11:08 -07:00
Stefan Assmann 4e264be98b iavf: fix hang on reboot with ice
When a system with E810 with existing VFs gets rebooted the following
hang may be observed.

 Pid 1 is hung in iavf_remove(), part of a network driver:
 PID: 1        TASK: ffff965400e5a340  CPU: 24   COMMAND: "systemd-shutdow"
  #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
  #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
  #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
  #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
  #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
  #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
  #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
  #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
  #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
  #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
 #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
 #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
 #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
 #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
 #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
     RIP: 00007f1baa5c13d7  RSP: 00007fffbcc55a98  RFLAGS: 00000202
     RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f1baa5c13d7
     RDX: 0000000001234567  RSI: 0000000028121969  RDI: 00000000fee1dead
     RBP: 00007fffbcc55ca0   R8: 0000000000000000   R9: 00007fffbcc54e90
     R10: 00007fffbcc55050  R11: 0000000000000202  R12: 0000000000000005
     R13: 0000000000000000  R14: 00007fffbcc55af0  R15: 0000000000000000
     ORIG_RAX: 00000000000000a9  CS: 0033  SS: 002b

During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().

Fixes: 974578017f ("iavf: Add waiting so the port is initialized in remove")
Fixes: a8417330f8 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-21 11:11:08 -07:00
Michal Swiatkowski 7d46c0e670 ice: remove filters only if VSI is deleted
Filters shouldn't be removed in VSI rebuild path. Removing them on PF
VSI results in no rule for PF MAC after changing for example queues
amount.

Remove all filters only in the VSI remove flow. As unload should also
cause the filter to be removed introduce, a new function ice_stop_eth().
It will unroll ice_start_eth(), so remove filters and close VSI.

Fixes: 6624e780a5 ("ice: split ice_vsi_setup into smaller functions")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-21 10:02:48 -07:00
Michal Swiatkowski 83b49e7f63 ice: check if VF exists before mode check
Setting trust on VF should return EINVAL when there is no VF. Move
checking for switchdev mode after checking if VF exists.

Fixes: c54d209c78 ("ice: Wait for VF to be reset/ready before configuration")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Kalyan Kodamagula <kalyan.kodamagula@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-21 10:02:48 -07:00
Piotr Raczynski 387d42ae6d ice: fix rx buffers handling for flow director packets
Adding flow director filters stopped working correctly after
commit 2fba7dc515 ("ice: Add support for XDP multi-buffer
on Rx side"). As a result, only first flow director filter
can be added, adding next filter leads to NULL pointer
dereference attached below.

Rx buffer handling and reallocation logic has been optimized,
however flow director specific traffic was not accounted for.
As a result driver handled those packets incorrectly since new
logic was based on ice_rx_ring::first_desc which was not set
in this case.

Fix this by setting struct ice_rx_ring::first_desc to next_to_clean
for flow director received packets.

[  438.544867] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  438.551840] #PF: supervisor read access in kernel mode
[  438.556978] #PF: error_code(0x0000) - not-present page
[  438.562115] PGD 7c953b2067 P4D 0
[  438.565436] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  438.569794] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 6.2.0-net-bug #1
[  438.577531] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
[  438.588470] RIP: 0010:ice_clean_rx_irq+0x2b9/0xf20 [ice]
[  438.593860] Code: 45 89 f7 e9 ac 00 00 00 8b 4d 78 41 31 4e 10 41 09 d5 4d 85 f6 0f 84 82 00 00 00 49 8b 4e 08 41 8b 76
1c 65 8b 3d 47 36 4a 3f <48> 8b 11 48 c1 ea 36 39 d7 0f 85 a6 00 00 00 f6 41 08 02 0f 85 9c
[  438.612605] RSP: 0018:ff8c732640003ec8 EFLAGS: 00010082
[  438.617831] RAX: 0000000000000800 RBX: 00000000000007ff RCX: 0000000000000000
[  438.624957] RDX: 0000000000000800 RSI: 0000000000000000 RDI: 0000000000000000
[  438.632089] RBP: ff4ed275a2158200 R08: 00000000ffffffff R09: 0000000000000020
[  438.639222] R10: 0000000000000000 R11: 0000000000000020 R12: 0000000000001000
[  438.646356] R13: 0000000000000000 R14: ff4ed275d0daffe0 R15: 0000000000000000
[  438.653485] FS:  0000000000000000(0000) GS:ff4ed2738fa00000(0000) knlGS:0000000000000000
[  438.661563] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  438.667310] CR2: 0000000000000000 CR3: 0000007c9f0d6006 CR4: 0000000000771ef0
[  438.674444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  438.681573] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  438.688697] PKRU: 55555554
[  438.691404] Call Trace:
[  438.693857]  <IRQ>
[  438.695877]  ? profile_tick+0x17/0x80
[  438.699542]  ice_msix_clean_ctrl_vsi+0x24/0x50 [ice]
[  438.702571] ice 0000:b1:00.0: VF 1: ctrl_vsi irq timeout
[  438.704542]  __handle_irq_event_percpu+0x43/0x1a0
[  438.704549]  handle_irq_event+0x34/0x70
[  438.704554]  handle_edge_irq+0x9f/0x240
[  438.709901] iavf 0000:b1:01.1: Failed to add Flow Director filter with status: 6
[  438.714571]  __common_interrupt+0x63/0x100
[  438.714580]  common_interrupt+0xb4/0xd0
[  438.718424] iavf 0000:b1:01.1: Rule ID: 127 dst_ip: 0.0.0.0 src_ip 0.0.0.0 UDP: dst_port 4 src_port 0
[  438.722255]  </IRQ>
[  438.722257]  <TASK>
[  438.722257]  asm_common_interrupt+0x22/0x40
[  438.722262] RIP: 0010:cpuidle_enter_state+0xc8/0x430
[  438.722267] Code: 6e e9 25 ff e8 f9 ef ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 d7 f1 24 ff 45
84 ff 0f 85 57 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[  438.722269] RSP: 0018:ffffffff86003e50 EFLAGS: 00000246
[  438.784108] RAX: ff4ed2738fa00000 RBX: ffbe72a64fc01020 RCX: 0000000000000000
[  438.791234] RDX: 0000000000000000 RSI: ffffffff858d84de RDI: ffffffff85893641
[  438.798365] RBP: 0000000000000002 R08: 0000000000000002 R09: 000000003158af9d
[  438.805490] R10: 0000000000000008 R11: 0000000000000354 R12: ffffffff862365a0
[  438.812622] R13: 000000661b472a87 R14: 0000000000000002 R15: 0000000000000000
[  438.819757]  cpuidle_enter+0x29/0x40
[  438.823333]  do_idle+0x1b6/0x230
[  438.826566]  cpu_startup_entry+0x19/0x20
[  438.830492]  rest_init+0xcb/0xd0
[  438.833717]  arch_call_rest_init+0xa/0x30
[  438.837731]  start_kernel+0x776/0xb70
[  438.841396]  secondary_startup_64_no_verify+0xe5/0xeb
[  438.846449]  </TASK>

Fixes: 2fba7dc515 ("ice: Add support for XDP multi-buffer on Rx side")
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-21 10:02:48 -07:00
Robbie Harwood 3584c1dbff asymmetric_keys: log on fatal failures in PE/pkcs7
These particular errors can be encountered while trying to kexec when
secureboot lockdown is in place.  Without this change, even with a
signed debug build, one still needs to reboot the machine to add the
appropriate dyndbg parameters (since lockdown blocks debugfs).

Accordingly, upgrade all pr_debug() before fatal error into pr_warn().

Signed-off-by: Robbie Harwood <rharwood@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jarkko Sakkinen <jarkko@kernel.org>
cc: Eric Biederman <ebiederm@xmission.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: keyrings@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: kexec@lists.infradead.org
Link: https://lore.kernel.org/r/20230220171254.592347-3-rharwood@redhat.com/ # v2
2023-03-21 16:23:56 +00:00
Robbie Harwood 4fc5c74dde verify_pefile: relax wrapper length check
The PE Format Specification (section "The Attribute Certificate Table
(Image Only)") states that `dwLength` is to be rounded up to 8-byte
alignment when used for traversal.  Therefore, the field is not required
to be an 8-byte multiple in the first place.

Accordingly, pesign has not performed this alignment since version
0.110.  This causes kexec failure on pesign'd binaries with "PEFILE:
Signature wrapper len wrong".  Update the comment and relax the check.

Signed-off-by: Robbie Harwood <rharwood@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jarkko Sakkinen <jarkko@kernel.org>
cc: Eric Biederman <ebiederm@xmission.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: keyrings@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: kexec@lists.infradead.org
Link: https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#the-attribute-certificate-table-image-only
Link: https://github.com/rhboot/pesign
Link: https://lore.kernel.org/r/20230220171254.592347-2-rharwood@redhat.com/ # v2
2023-03-21 16:23:17 +00:00
David Howells 47f9e4c924 keys: Do not cache key in task struct if key is requested from kernel thread
The key which gets cached in task structure from a kernel thread does not
get invalidated even after expiry.  Due to which, a new key request from
kernel thread will be served with the cached key if it's present in task
struct irrespective of the key validity.  The change is to not cache key in
task_struct when key requested from kernel thread so that kernel thread
gets a valid key on every key request.

The problem has been seen with the cifs module doing DNS lookups from a
kernel thread and the results getting pinned by being attached to that
kernel thread's cache - and thus not something that can be easily got rid
of.  The cache would ordinarily be cleared by notify-resume, but kernel
threads don't do that.

This isn't seen with AFS because AFS is doing request_key() within the
kernel half of a user thread - which will do notify-resume.

Fixes: 7743c48e54 ("keys: Cache result of request_key*() temporarily in task_struct")
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Steve French <smfrench@gmail.com>
cc: keyrings@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/CAGypqWw951d=zYRbdgNR4snUDvJhWL=q3=WOyh7HhSJupjz2vA@mail.gmail.com/
2023-03-21 16:22:40 +00:00
Masami Hiramatsu (Google) b69245126a bootconfig: Fix testcase to increase max node
Since commit 6c40624930 ("bootconfig: Increase max nodes of bootconfig
from 1024 to 8192 for DCC support") increased the max number of bootconfig
node to 8192, the bootconfig testcase of the max number of nodes fails.
To fix this issue, we can not simply increase the number in the test script
because the test bootconfig file becomes too big (>32KB). To fix that, we
can use a combination of three alphabets (26^3 = 17576). But with that,
we can not express the 8193 (just one exceed from the limitation) because
it also exceeds the max size of bootconfig. So, the first 26 nodes will just
use one alphabet.

With this fix, test-bootconfig.sh passes all tests.

Link: https://lore.kernel.org/all/167888844790.791176.670805252426835131.stgit@devnote2/

Reported-by: Heinz Wiesinger <pprkut@slackware.com>
Link: https://lore.kernel.org/all/2463802.XAFRqVoOGU@amaterasu.liwjatan.org
Fixes: 6c40624930 ("bootconfig: Increase max nodes of bootconfig from 1024 to 8192 for DCC support")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-03-22 01:00:28 +09:00
Jiasheng Jiang f038f3917b octeontx2-vf: Add missing free for alloc_percpu
Add the free_percpu for the allocated "vf->hw.lmt_info" in order to avoid
memory leak, same as the "pf->hw.lmt_info" in
`drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c`.

Fixes: 5c0512072f ("octeontx2-pf: cn10k: Use runtime allocated LMTLINE region")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Acked-by: Geethasowjanya Akula <gakula@marvell.com>
Link: https://lore.kernel.org/r/20230317064337.18198-1-jiasheng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-20 22:00:19 -07:00
Linus Torvalds 17214b70a1 fsverity fixes for v6.3-rc4
Fix two significant performance issues with fsverity.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZBjIvRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK2uuAP9I9h+KjU8cSpF0fS3vHhwmDtqc/vW9
 wfHniylcwK2YnwD/REgCv8yKlJva0+IUHMCGNWUzd0CERfuUFJy1Z7xvCgo=
 =rjIA
 -----END PGP SIGNATURE-----

Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux

Pull fsverity fixes from Eric Biggers:
 "Fix two significant performance issues with fsverity"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
  fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
  fsverity: Remove WQ_UNBOUND from fsverity read workqueue
2023-03-20 15:20:33 -07:00
Linus Torvalds 4f1e308df8 fscrypt fixes for v6.3-rc4
Fix a bug where when a filesystem was being unmounted, the fscrypt
 keyring was destroyed before inodes have been released by the Landlock
 LSM.  This bug was found by syzbot.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZBjHAxQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK1THAP91M4UXEsVTPw1QI03+72pPy0U4iArf
 j8jhQVL6vjXHnQEA4fmwg3tE5g9F5Qe5mjFOxvoa/Z9xqupPOybOR3z1zQI=
 =S80B
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux

Pull fscrypt fix from Eric Biggers:
 "Fix a bug where when a filesystem was being unmounted, the fscrypt
  keyring was destroyed before inodes have been released by the Landlock
  LSM.

  This bug was found by syzbot"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
  fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref()
  fscrypt: improve fscrypt_destroy_keyring() documentation
  fscrypt: destroy keyring after security_sb_delete()
2023-03-20 15:12:55 -07:00
Damien Le Moal 88b170088a zonefs: Fix error message in zonefs_file_dio_append()
Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.

Fixes: a608da3bd7 ("zonefs: Detect append writes at invalid locations")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
2023-03-21 06:36:43 +09:00
Damien Le Moal d7e673c2a9 zonefs: Prevent uninitialized symbol 'size' warning
In zonefs_file_dio_append(), initialize the variable size to 0 to
prevent compilation and static code analizers warning such as:

New smatch warnings:
fs/zonefs/file.c:441 zonefs_file_dio_append() error: uninitialized
symbol 'size'.

The warning is a false positive as size is never actually used
uninitialized.

No functional change.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/r/202303191227.GL8Dprbi-lkp@intel.com/
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
2023-03-21 06:36:34 +09:00
Arnd Bergmann 7d31677bb7 gpu: host1x: fix uninitialized variable use
The error handling for platform_get_irq() failing no longer works after
a recent change, clang now points this out with a warning:

  drivers/gpu/host1x/dev.c:520:6: error: variable 'syncpt_irq' is uninitialized when used here [-Werror,-Wuninitialized]
          if (syncpt_irq < 0)
              ^~~~~~~~~~

Fix this by removing the variable and checking the correct error status.

Fixes: 625d4ffb43 ("gpu: host1x: Rewrite syncpoint interrupt handling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-03-20 11:12:37 -07:00