Commit Graph

1140563 Commits

Author SHA1 Message Date
Tejun Heo e47877c7aa rhashtable: Allow rhashtable to be used from irq-safe contexts
rhashtable currently only does bh-safe synchronization making it impossible
to use from irq-safe contexts. Switch it to use irq-safe synchronization to
remove the restriction.

v2: Update the lock functions to return the ulong flags value and unlock
    functions to take the value directly instead of passing around the
    pointer. Suggested by Linus.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: David Vernet <dvernet@meta.com>
Acked-by: Josh Don <joshdon@google.com>
Acked-by: Hao Luo <haoluo@google.com>
Acked-by: Barret Rhoden <brho@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-09 10:42:56 +00:00
David S. Miller b602d00384 Merge branch 'net-sched-retpoline'
Pedro Tammela says:

====================
net/sched: retpoline wrappers for tc

In tc all qdics, classifiers and actions can be compiled as modules.
This results today in indirect calls in all transitions in the tc hierarchy.
Due to CONFIG_RETPOLINE, CPUs with mitigations=on might pay an extra cost on
indirect calls. For newer Intel cpus with IBRS the extra cost is
nonexistent, but AMD Zen cpus and older x86 cpus still go through the
retpoline thunk.

Known built-in symbols can be optimized into direct calls, thus
avoiding the retpoline thunk. So far, tc has not been leveraging this
build information and leaving out a performance optimization for some
CPUs. In this series we wire up 'tcf_classify()' and 'tcf_action_exec()'
with direct calls when known modules are compiled as built-in as an
opt-in optimization.

We measured these changes in one AMD Zen 4 cpu (Retpoline), one AMD Zen 3 cpu (Retpoline),
one Intel 10th Gen CPU (IBRS), one Intel 3rd Gen cpu (Retpoline) and one
Intel Xeon CPU (IBRS) using pktgen with 64b udp packets. Our test setup is a
dummy device with clsact and matchall in a kernel compiled with every
tc module as built-in.  We observed a 3-8% speed up on the retpoline CPUs,
when going through 1 tc filter, and a 60-100% speed up when going through 100 filters.
For the IBRS cpus we observed a 1-2% degradation in both scenarios, we believe
the extra branches check introduced a small overhead therefore we added
a static key that bypasses the wrapper on kernels not using the retpoline mitigation,
but compiled with CONFIG_RETPOLINE.

1 filter:
CPU        | before (pps) | after (pps) | diff
R9 7950X   | 5914980      | 6380227     | +7.8%
R9 5950X   | 4237838      | 4412241     | +4.1%
R9 5950X   | 4265287      | 4413757     | +3.4%   [*]
i5-3337U   | 1580565      | 1682406     | +6.4%
i5-10210U  | 3006074      | 3006857     | +0.0%
i5-10210U  | 3160245      | 3179945     | +0.6%   [*]
Xeon 6230R | 3196906      | 3197059     | +0.0%
Xeon 6230R | 3190392      | 3196153     | +0.01%  [*]

100 filters:
CPU        | before (pps) | after (pps) | diff
R9 7950X   | 373598       | 820396      | +119.59%
R9 5950X   | 313469       | 633303      | +102.03%
R9 5950X   | 313797       | 633150      | +101.77% [*]
i5-3337U   | 127454       | 211210      | +65.71%
i5-10210U  | 389259       | 381765      | -1.9%
i5-10210U  | 408812       | 412730      | +0.9%    [*]
Xeon 6230R | 415420       | 406612      | -2.1%
Xeon 6230R | 416705       | 405869      | -2.6%    [*]

[*] In these tests we ran pktgen with clone set to 1000.

On the 7950x system we also tested the impact of filters if iteration order
placement varied, first by compiling a kernel with the filter under test being
the first one in the static iteration and then repeating it with being last (of 15 classifiers existing today).
We saw a difference of +0.5-1% in pps between being the first in the iteration vs being the last.
Therefore we order the classifiers and actions according to relevance per our current thinking.

v5->v6:
- Address Eric Dumazet suggestions

v4->v5:
- Rebase

v3->v4:
- Address Eric Dumazet suggestions

v2->v3:
- Address suggestions by Jakub, Paolo and Eric
- Dropped RFC tag (I forgot to add it on v2)

v1->v2:
- Fix build errors found by the bots
- Address Kuniyuki Iwashima suggestions

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-09 09:18:08 +00:00
Pedro Tammela 9f3101dca3 net/sched: avoid indirect classify functions on retpoline kernels
Expose the necessary tc classifier functions and wire up cls_api to use
direct calls in retpoline kernels.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-09 09:18:07 +00:00
Pedro Tammela 871cf386dd net/sched: avoid indirect act functions on retpoline kernels
Expose the necessary tc act functions and wire up act_api to use
direct calls in retpoline kernels.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-09 09:18:07 +00:00
Pedro Tammela 7f0e810220 net/sched: add retpoline wrapper for tc
On kernels using retpoline as a spectrev2 mitigation,
optimize actions and filters that are compiled as built-ins into a direct call.

On subsequent patches we expose the classifiers and actions functions
and wire up the wrapper into tc.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-09 09:18:07 +00:00
Pedro Tammela 2a7d228f1a net/sched: move struct action_ops definition out of ifdef
The type definition should be visible even in configurations not using
CONFIG_NET_CLS_ACT.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-09 09:18:07 +00:00
Randy Dunlap 0bdff1152c net: phy: remove redundant "depends on" lines
Delete a few lines of "depends on PHYLIB" since they are inside
an "if PHYLIB / endif # PHYLIB" block, i.e., they are redundant
and the other 50+ drivers there don't use "depends on PHYLIB"
since it is not needed.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Link: https://lore.kernel.org/r/20221207044257.30036-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 20:01:21 -08:00
Willem de Bruijn b534dc46c8 net_tstamp: add SOF_TIMESTAMPING_OPT_ID_TCP
Add an option to initialize SOF_TIMESTAMPING_OPT_ID for TCP from
write_seq sockets instead of snd_una.

This should have been the behavior from the start. Because processes
may now exist that rely on the established behavior, do not change
behavior of the existing option, but add the right behavior with a new
flag. It is encouraged to always set SOF_TIMESTAMPING_OPT_ID_TCP on
stream sockets along with the existing SOF_TIMESTAMPING_OPT_ID.

Intuitively the contract is that the counter is zero after the
setsockopt, so that the next write N results in a notification for
the last byte N - 1.

On idle sockets snd_una == write_seq and this holds for both. But on
sockets with data in transmission, snd_una records the unacked offset
in the stream. This depends on the ACK response from the peer. A
process cannot learn this in a race free manner (ioctl SIOCOUTQ is one
racy approach).

write_seq records the offset at the last byte written by the process.
This is a better starting point. It matches the intuitive contract in
all circumstances, unaffected by external behavior.

The new timestamp flag necessitates increasing sk_tsflags to 32 bits.
Move the field in struct sock to avoid growing the socket (for some
common CONFIG variants). The UAPI interface so_timestamping.flags is
already int, so 32 bits wide.

Reported-by: Sotirios Delimanolis <sotodel@meta.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20221207143701.29861-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 19:49:21 -08:00
Jakub Kicinski ecd6df3c1b Merge branch 'fix-possible-deadlock-during-wed-attach'
Lorenzo Bianconi says:

====================
fix possible deadlock during WED attach

Fix a possible deadlock in mtk_wed_attach if mtk_wed_wo_init routine fails.
Check wo pointer is properly allocated before running mtk_wed_wo_reset() and
mtk_wed_wo_deinit().
====================

Link: https://lore.kernel.org/r/cover.1670421354.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 19:47:46 -08:00
Lorenzo Bianconi 587585e1bb net: ethernet: mtk_wed: fix possible deadlock if mtk_wed_wo_init fails
Introduce __mtk_wed_detach() in order to avoid a deadlock in
mtk_wed_attach routine if mtk_wed_wo_init fails since both
mtk_wed_attach and mtk_wed_detach run holding hw_lock mutex.

Fixes: 4c5de09eb0 ("net: ethernet: mtk_wed: add configure wed wo support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 19:47:43 -08:00
Lorenzo Bianconi c79e0af5ae net: ethernet: mtk_wed: fix some possible NULL pointer dereferences
Fix possible NULL pointer dereference in mtk_wed_detach routine checking
wo pointer is properly allocated before running mtk_wed_wo_reset() and
mtk_wed_wo_deinit().
Even if it is just a theoretical issue at the moment check wo pointer is
not NULL in mtk_wed_mcu_msg_update.
Moreover, honor mtk_wed_mcu_send_msg return value in mtk_wed_wo_reset()

Fixes: 799684448e ("net: ethernet: mtk_wed: introduce wed wo support")
Fixes: 4c5de09eb0 ("net: ethernet: mtk_wed: add configure wed wo support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 19:47:43 -08:00
Colin Ian King 3df96774a4 nfp: Fix spelling mistake "tha" -> "the"
There is a spelling mistake in a nn_dp_warn message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20221207094312.2281493-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 19:44:06 -08:00
Björn Töpel 17961a37ce selftests: net: Fix O=dir builds
The BPF Makefile in net/bpf did incorrect path substitution for O=dir
builds, e.g.

  make O=/tmp/kselftest headers
  make O=/tmp/kselftest -C tools/testing/selftests

would fail in selftest builds [1] net/ with

  clang-16: error: no such file or directory: 'kselftest/net/bpf/nat6to4.c'
  clang-16: error: no input files

Add a pattern prerequisite and an order-only-prerequisite (for
creating the directory), to resolve the issue.

[1] https://lore.kernel.org/all/202212060009.34CkQmCN-lkp@intel.com/

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 837a3d66d6 ("selftests: net: Add cross-compilation support for BPF programs")
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20221206102838.272584-1-bjorn@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 19:26:18 -08:00
Jakub Kicinski ce87a957f1 Merge branch 'mlxsw-add-spectrum-1-ip6gre-support'
Petr Machata says:

====================
mlxsw: Add Spectrum-1 ip6gre support

Ido Schimmel writes:

Currently, mlxsw only supports ip6gre offload on Spectrum-2 and newer
ASICs. Spectrum-1 can also offload ip6gre tunnels, but it needs double
entry router interfaces (RIFs) for the RIFs representing these tunnels.
In addition, the RIF index needs to be even. This is handled in
patches #1-#3.

The implementation can otherwise be shared between all Spectrum
generations. This is handled in patches #4-#5.

Patch #6 moves a mlxsw ip6gre selftest to a shared directory, as ip6gre
is no longer only supported on Spectrum-2 and newer ASICs.

This work is motivated by users that require multiple GRE tunnels that
all share the same underlay VRF. Currently, mlxsw only supports
decapsulation based on the underlay destination IP (i.e., not taking the
GRE key into account), so users need to configure these tunnels with
different source IPs and IPv6 addresses are easier to spare than IPv4.

Tested using existing ip6gre forwarding selftests.
====================

Link: https://lore.kernel.org/r/cover.1670414573.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:46:34 -08:00
Ido Schimmel db401875f4 selftests: mlxsw: Move IPv6 decap_error test to shared directory
Now that Spectrum-1 gained ip6gre support we can move the test out of
the Spectrum-2 directory.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:46:32 -08:00
Ido Schimmel 7ec5364351 mlxsw: spectrum_ipip: Add Spectrum-1 ip6gre support
As explained in the previous patch, the existing Spectrum-2 ip6gre
implementation can be reused for Spectrum-1. Change the Spectrum-1
ip6gre operations structure to use the common operations.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:46:32 -08:00
Ido Schimmel ab30e4d4b2 mlxsw: spectrum_ipip: Rename Spectrum-2 ip6gre operations
There are two main differences between Spectrum-1 and newer ASICs in
terms of IP-in-IP support:

1. In Spectrum-1, RIFs representing ip6gre tunnels require two entries
   in the RIF table.

2. In Spectrum-2 and newer ASICs, packets ingress the underlay (during
   encapsulation) and egress the underlay (during decapsulation) via a
   special generic loopback RIF.

The first difference was handled in previous patches by adding the
'double_rif_entry' field to the Spectrum-1 operations structure of
ip6gre RIFs. The second difference is handled during RIF creation, by
only creating a generic loopback RIF in Spectrum-2 and newer ASICs.

Therefore, the ip6gre operations can be shared between Spectrum-1 and
newer ASIC in a similar fashion to how the ipgre operations are shared.

Rename the operations to not be Spectrum-2 specific and move them
earlier in the file so that they could later be used for Spectrum-1.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:46:32 -08:00
Ido Schimmel 5ca1b208c5 mlxsw: spectrum_router: Add support for double entry RIFs
In Spectrum-1, loopback router interfaces (RIFs) used for IP-in-IP
encapsulation with an IPv6 underlay require two RIF entries and the RIF
index must be even.

Prepare for this change by extending the RIF parameters structure with a
'double_entry' field that indicates if the RIF being created requires
two RIF entries or not. Only set it for RIFs representing ip6gre tunnels
in Spectrum-1.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:46:32 -08:00
Ido Schimmel 1a2f65b4a2 mlxsw: spectrum_router: Parametrize RIF allocation size
Currently, each router interface (RIF) consumes one entry in the RIFs
table. This is going to change in subsequent patches where some RIFs
will consume two table entries.

Prepare for this change by parametrizing the RIF allocation size. For
now, always pass '1'.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:46:31 -08:00
Ido Schimmel 40ef76de8a mlxsw: spectrum_router: Use gen_pool for RIF index allocation
Currently, each router interface (RIF) consumes one entry in the RIFs
table and there are no alignment constraints. This is going to change in
subsequent patches where some RIFs will consume two table entries and
their indexes will need to be aligned to the allocation size (even).

Prepare for this change by converting the RIF index allocation to use
gen_pool with the 'gen_pool_first_fit_order_align' algorithm.

No Kconfig changes necessary as mlxsw already selects
'GENERIC_ALLOCATOR'.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:46:31 -08:00
Jakub Kicinski 837e8ac871 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:19:59 -08:00
Linus Torvalds 010b6761a9 Including fixes from bluetooth, can and netfilter.
Current release - new code bugs:
 
  - bonding: ipv6: correct address used in Neighbour Advertisement
    parsing (src vs dst typo)
 
  - fec: properly scope IRQ coalesce setup during link up to supported
    chips only
 
 Previous releases - regressions:
 
  - Bluetooth fixes for fake CSR clones (knockoffs):
    - re-add ERR_DATA_REPORTING quirk
    - fix crash when device is replugged
 
  - Bluetooth:
    - silence a user-triggerable dmesg error message
    - L2CAP: fix u8 overflow, oob access
    - correct vendor codec definition
    - fix support for Read Local Supported Codecs V2
 
  - ti: am65-cpsw: fix RGMII configuration at SPEED_10
 
  - mana: fix race on per-CQ variable NAPI work_done
 
 Previous releases - always broken:
 
  - af_unix: diag: fetch user_ns from in_skb in unix_diag_get_exact(),
    avoid null-deref
 
  - af_can: fix NULL pointer dereference in can_rcv_filter
 
  - can: slcan: fix UAF with a freed work
 
  - can: can327: flush TX_work on ldisc .close()
 
  - macsec: add missing attribute validation for offload
 
  - ipv6: avoid use-after-free in ip6_fragment()
 
  - nft_set_pipapo: actually validate intervals in fields
    after the first one
 
  - mvneta: prevent oob access in mvneta_config_rss()
 
  - ipv4: fix incorrect route flushing when table ID 0 is used,
    or when source address is deleted
 
  - phy: mxl-gpy: add workaround for IRQ bug on GPY215B and GPY215C
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmOSS98ACgkQMUZtbf5S
 Irssbg//aTPi8bJgyM/yH83QK1+6t23xiW0YscJG3CXr+lTIvXKRoyPjUwNmy1TG
 xr5+dkW73nEVr3Uxkunn1qo+74yYpwewd1gamOxmjD+TDIb2xn/ErN61X506eXTy
 59thWoAMD0EOQrqDIcD4SEfdNMaASrJSD4t8rXwk/h7LMw+mrIsWNSeFL4HpgHOw
 lYwujLwDkkDJNWDWI8NcJiR3i+l7JGOHkcdYfhBZIodBCQ1y/u3AVpK0qZd6eeFs
 3Waz7a0q9M0glHhXXMdN/v+XjxKRE1evkMjv842zABTSLy4lKKZPn//pXwA+bZ24
 qHw2y1a7ZHiRBQMqNCddxzHhe6kfq8OyrCDAI8qupBTNFBXXKzJyxCrbabPNeern
 YW0oggN9dTkES17OeHraA5O8km1L6tfIbm6BLteXAwM4hDoxBSMQYtEDF26P4K1K
 UYufBjfL/hPhuoYNFZ//0laLow+6k8Sl7Pdk/fwd8InCb59YrYUpjod7kXMByw7m
 ++FO3guxnk32ZEWduG3Gc203Vi4acRrQkGqWbhMZtH+ccxexx2Efl6Xn+sq7r9Dv
 rl/PIQiuGdoFf7OaW64FS2Nj8eVAPcqbYUMEc3EBrFDs/CvGTkyzRUcQoLdEf2zu
 p2ldFTerGtek9F7dUs4GZr/vw29n0Q2dPdQVFIcuXo3Nz3wFNKI=
 =+mCl
 -----END PGP SIGNATURE-----

Merge tag 'net-6.1-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, can and netfilter.

  Current release - new code bugs:

   - bonding: ipv6: correct address used in Neighbour Advertisement
     parsing (src vs dst typo)

   - fec: properly scope IRQ coalesce setup during link up to supported
     chips only

  Previous releases - regressions:

   - Bluetooth fixes for fake CSR clones (knockoffs):
       - re-add ERR_DATA_REPORTING quirk
       - fix crash when device is replugged

   - Bluetooth:
       - silence a user-triggerable dmesg error message
       - L2CAP: fix u8 overflow, oob access
       - correct vendor codec definition
       - fix support for Read Local Supported Codecs V2

   - ti: am65-cpsw: fix RGMII configuration at SPEED_10

   - mana: fix race on per-CQ variable NAPI work_done

  Previous releases - always broken:

   - af_unix: diag: fetch user_ns from in_skb in unix_diag_get_exact(),
     avoid null-deref

   - af_can: fix NULL pointer dereference in can_rcv_filter

   - can: slcan: fix UAF with a freed work

   - can: can327: flush TX_work on ldisc .close()

   - macsec: add missing attribute validation for offload

   - ipv6: avoid use-after-free in ip6_fragment()

   - nft_set_pipapo: actually validate intervals in fields after the
     first one

   - mvneta: prevent oob access in mvneta_config_rss()

   - ipv4: fix incorrect route flushing when table ID 0 is used, or when
     source address is deleted

   - phy: mxl-gpy: add workaround for IRQ bug on GPY215B and GPY215C"

* tag 'net-6.1-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
  net: dsa: sja1105: avoid out of bounds access in sja1105_init_l2_policing()
  s390/qeth: fix use-after-free in hsci
  macsec: add missing attribute validation for offload
  net: mvneta: Fix an out of bounds check
  net: thunderbolt: fix memory leak in tbnet_open()
  ipv6: avoid use-after-free in ip6_fragment()
  net: plip: don't call kfree_skb/dev_kfree_skb() under spin_lock_irq()
  net: phy: mxl-gpy: add MDINT workaround
  net: dsa: mv88e6xxx: accept phy-mode = "internal" for internal PHY ports
  xen/netback: don't call kfree_skb() under spin_lock_irqsave()
  dpaa2-switch: Fix memory leak in dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove()
  ethernet: aeroflex: fix potential skb leak in greth_init_rings()
  tipc: call tipc_lxc_xmit without holding node_read_lock
  can: esd_usb: Allow REC and TEC to return to zero
  can: can327: flush TX_work on ldisc .close()
  can: slcan: fix freed work crash
  can: af_can: fix NULL pointer dereference in can_rcv_filter
  net: dsa: sja1105: fix memory leak in sja1105_setup_devlink_regions()
  ipv4: Fix incorrect route flushing when table ID 0 is used
  ipv4: Fix incorrect route flushing when source address is deleted
  ...
2022-12-08 15:32:13 -08:00
Jakub Kicinski ff36c447e2 Merge branch 'mlx4-better-big-tcp-support'
Eric Dumazet says:

====================
mlx4: better BIG-TCP support

mlx4 uses a bounce buffer in TX whenever the tx descriptors
wrap around the right edge of the ring.

Size of this bounce buffer was hard coded and can be
increased if/when needed.
====================

Link: https://lore.kernel.org/r/20221207141237.2575012-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 14:27:50 -08:00
Eric Dumazet 0e706f7961 net/mlx4: small optimization in mlx4_en_xmit()
Test against MLX4_MAX_DESC_TXBBS only matters if the TX
bounce buffer is going to be used.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 14:27:48 -08:00
Eric Dumazet 26782aad00 net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS
Google production kernel has increased MAX_SKB_FRAGS to 45
for BIG-TCP rollout.

Unfortunately mlx4 TX bounce buffer is not big enough whenever
an skb has up to 45 page fragments.

This can happen often with TCP TX zero copy, as one frag usually
holds 4096 bytes of payload (order-0 page).

Tested:
 Kernel built with MAX_SKB_FRAGS=45
 ip link set dev eth0 gso_max_size 185000
 netperf -t TCP_SENDFILE

I made sure that "ethtool -G eth0 tx 64" was properly working,
ring->full_size being set to 15.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Wei Wang <weiwan@google.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 14:27:48 -08:00
Eric Dumazet 35f31ff0c0 net/mlx4: rename two constants
MAX_DESC_SIZE is really the size of the bounce buffer used
when reaching the right side of TX ring buffer.

MAX_DESC_TXBBS get a MLX4_ prefix.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 14:27:48 -08:00
Jakub Kicinski ddda6326ee Merge branch 'mlx5-Support-tc-police-jump-conform-exceed-attribute'
Saeed Mahameed says:

====================
Support tc police jump conform-exceed attribute

The tc police action conform-exceed option defines how to handle
packets which exceed or conform to the configured bandwidth limit.
One of the possible conform-exceed values is jump, which skips over
a specified number of actions.
This series adds support for conform-exceed jump action.

The series adds platform support for branching actions by providing
true/false flow attributes to the branching action.
This is necessary for supporting police jump, as each branch may
execute a different action list.

The first five patches are preparation patches:
- Patches 1 and 2 add support for actions with no destinations (e.g. drop)
- Patch 3 refactor the code for subsequent function reuse
- Patch 4 defines an abstract way for identifying terminating actions
- Patch 5 updates action list validations logic considering branching actions

The following three patches introduce an interface for abstracting branching
actions:
- Patch 6 introduces an abstract api for defining branching actions
- Patch 7 generically instantiates the branching flow attributes using
  the abstract API

Patch 8 adds the platform support for jump actions, by executing the following
sequence:
  a. Store the jumping flow attr
  b. Identify the jump target action while iterating the actions list.
  c. Instantiate a new flow attribute after the jump target action.
     This is the flow attribute that the branching action should jump to.
  d. Set the target post action id on:
    d.1. The jumping attribute, thus realizing the jump functionality.
    d.2. The attribute preceding the target jump attr, if not terminating.

The next patches apply the platform's branching attributes to the police
action:
- Patch 9 is a refactor patch
- Patch 10 initializes the post meter table with the red/green flow attributes,
           as were initialized by the platform
- Patch 11 enables the offload of meter actions using jump conform-exceed
           value.
====================

Link: https://lore.kernel.org/all/20221203221337.29267-1-saeed@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:07:31 -08:00
Oz Shlomo 3603f26633 net/mlx5e: TC, allow meter jump control action
Separate the matchall police action validation from flower validation.
Isolate the action validation logic in the police action parser.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-12-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo 0d8c38d44f net/mlx5e: TC, init post meter rules with branching attributes
Instantiate the post meter actions with the platform initialized branching
action attributes.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-11-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo 3fcb94e393 net/mlx5e: TC, rename post_meter actions
Currently post meter supports only the pipe/drop conform-exceed policy.
This assumption is reflected in several variable names.
Rename the following variables as a pre-step for using the generalized
branching action platform.

Rename fwd_green_rule/drop_red_rule to green_rule/red_rule respectively.
Repurpose red_counter/green_counter to act_counter/drop_counter to allow
police conform-exceed configurations that do not drop.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-10-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo c84fa1ab94 net/mlx5e: TC, initialize branching action with target attr
Identify the jump target action when iterating the action list.
Initialize the jump target attr with the jumping attribute during the
parsing phase. Initialize the jumping attr post action with the target
during the offload phase.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-9-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo f86488cb46 net/mlx5e: TC, initialize branch flow attributes
Initialize flow attribute for drop, accept, pipe and jump branching actions.

Instantiate a flow attribute instance according to the specified branch
control action. Store the branching attributes on the branching action
flow attribute during the parsing phase. Then, during the offload phase,
allocate the relevant mod header objects to the branching actions.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-8-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo ec5878552b net/mlx5e: TC, set control params for branching actions
Extend the act tc api to set the branch control params aligning with
the police conform/exceed use case.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-7-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo 6442638251 net/mlx5e: TC, validate action list per attribute
Currently the entire flow action list is validate for offload limitations.
For example, flow with both forward and drop actions are declared invalid
due to hardware restrictions.
However, a multi-table hardware model changes the limitations from a flow
scope to a single flow attribute scope.

Apply offload limitations to flow attributes instead of the entire flow.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-6-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo d3f6b0df91 net/mlx5e: TC, add terminating actions
Extend act api to identify actions that terminate action list.
Pre-step for terminating branching actions.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-5-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo 8facc02f22 net/mlx5e: TC, reuse flow attribute post parser processing
After the tc action parsing phase the flow attribute is initialized with
relevant eswitch offload objects such as tunnel, vlan, header modify and
counter attributes. The post processing is done both for fdb and post-action
attributes.

Reuse the flow attribute post parsing logic by both fdb and post-action
offloads.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-4-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo f07d8afb1c net/mlx5: fs, assert null dest pointer when dest_num is 0
Currently create_flow_handle() assumes a null dest pointer when there
are no destinations.
This might not be the case as the caller may pass an allocated dest
array while setting the dest_num parameter to 0.

Assert null dest array for flow rules that have no destinations (e.g. drop
rule).

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Oz Shlomo 5a5624d1ed net/mlx5e: E-Switch, handle flow attribute with no destinations
Rules with drop action are not required to have a destination.
Currently the destination list is allocated with the maximum number of
destinations and passed to the fs_core layer along with the actual number
of destinations.

Remove redundant passing of dest pointer when count of dest is 0.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221203221337.29267-2-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 13:04:26 -08:00
Linus Torvalds ce19275f01 for-linus-2022120801
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIVAwUAY5JD7aZi849r7WBJAQJtHw//WSKVFMKLq5P0PatMFBusJspse2ZsGFpI
 fumeGoNxd2/Gbj2eRrer5/1D5XU2VXC6Ot9/Rz3oIaOGWQkShjntoWhdpOxPCFI6
 bm2s78iPFSCubwi986vgI4/IOWsqWFcDbIWgn8i9q+ZkUDykIiIsbUAo7FyoX53i
 J34H8+NpQy5gPHXcSpWf83XBBidYC1uPtOk4k+W/SMOgS4UaVn3NAI2W/eKharlG
 kfepb6WRPiFDT/9opMb+PFmp62UeZyImQCxh/S0AjZArCg0A6u01Ou5moizprE2q
 qMgGk1MhruxVmSaCkSnYXn/xbW2JucyU7V2IPWTaBG5IGDUsdAibCD7YblPG/Gm5
 KZ4mz7zazc6gwn63/bxmBMbcfEHg1TUuH5EUW+tY9I/yAVlvYndS76WFCpaiqZfT
 uwq4aOVwZ2lPtmww/D0EZsTKtilhGTaSUW+/XdbGWLm4X9dKB+QJNzr8n00FeUDb
 g+pdqK6+oTviKd6WDIgU7PwBTED8aYF9krknl7U61scTAod5pivdk2GEX+7mAiYq
 ir/Hhrl0h2bO3XGNA5ViW2yNMVLFeFi4jAP5Wwk9X1polyMBicaqVe0M+IK2XCoT
 U8yhHYPnxrjfei8cd0gY56lzxKmI+Np6EVTVFA3big/wZ2jmjenFdgMLVP8Cw0H+
 RkfkW7pqZpQ=
 =eEFh
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-2022120801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:
 "A regression fix for handling Logitech HID++ devices and memory
  corruption fixes:

   - regression fix (revert) for catch-all handling of Logitech HID++
     Bluetooth devices; there are devices that turn out not to work with
     this, and the root cause is yet to be properly understood. So we
     are dropping it for now, and it will be revisited for 6.2 or 6.3
     (Benjamin Tissoires)

   - memory corruption fix in HID core (ZhangPeng)

   - memory corruption fix in hid-lg4ff (Anastasia Belova)

   - Kconfig fix for I2C_HID (Benjamin Tissoires)

   - a few device-id specific quirks that piggy-back on top of the
     important fixes above"

* tag 'for-linus-2022120801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  Revert "HID: logitech-hidpp: Enable HID++ for all the Logitech Bluetooth devices"
  Revert "HID: logitech-hidpp: Remove special-casing of Bluetooth devices"
  HID: usbhid: Add ALWAYS_POLL quirk for some mice
  HID: core: fix shift-out-of-bounds in hid_report_raw_event
  HID: uclogic: Add HID_QUIRK_HIDINPUT_FORCE quirk
  HID: fix I2C_HID not selected when I2C_HID_OF_ELAN is
  HID: hid-lg4ff: Add check for empty lbuf
  HID: ite: Enable QUIRK_TOUCHPAD_ON_OFF_REPORT on Acer Aspire Switch V 10
  HID: uclogic: Fix frame templates for big endian architectures
2022-12-08 12:37:42 -08:00
Linus Torvalds f3e8416619 ARM: SoC fixes for 6.1, part 5
One last build fix came in, addressing a link failure when
 building without CONFIG_OUTER_CACHE
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmOR5EQACgkQmmx57+YA
 GNnJXhAAjdEuH4YNKdGSDG7psY/6zkwAdG3/4BtMRzYaAFI0o2JuDmNScJY9CgbK
 BrWmdFArdKPGn1cPh1YHTfKHQR8QS3UbNHr/ykqWE8zhJzpti2DjHqg2bDWXtg8v
 M9dla+Da1Y/7mIRUyj4Mfi6WMp7k2Tu9cQ09tyIKXOmeS79mSARrRq3H1U3YzAfe
 /Ou/i3TzhzHXAg5NOlYaNNXeKsPPhONhPvylLd7pyOl2z508IPqELfZI/nLN7lA7
 M/FtL9rlBnwPjNiMTICUu11tcVSEWpz+wrXUoYZqsurBuz9oFrxZlMLaSpeGAOWi
 XyQF5ibMWAP7l8nWrQRoToTizMuJ0P7N1ji734iLmXd1LsO2O7bUXbu31eBT6vMw
 rsMJjQCg2TCKXGsdr32J+6+MbZZIUs4/lenYV3P+duh/Q+jbnIOJrPlZ5GCjkxGS
 CcH/xve91vNk6sz3z/WrQ88tFwhXeahjHqmQjvAcGnKDczt1BUxl8b3zSLc0T2F5
 B4Y6n3i9ovX0+ugSyGk/eIhGyZ1oKlX+Ev7N7TEwexcevjP7ITDXB/pvPWE89I0W
 3bT+jFzuTbBjEs8WXqr9PJ0MzKLHQueU0karUMOTsnh+6FDDb18OoeWre9ao8HvB
 tQXL+JCr4bDCuKHlgj87gzMXtx2LQ00iseZb5e8dmmc8ZLw7ruU=
 =5lcR
 -----END PGP SIGNATURE-----

Merge tag 'soc-fixes-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fix from Arnd Bergmann:
 "One last build fix came in, addressing a link failure when building
  without CONFIG_OUTER_CACHE"

* tag 'soc-fixes-6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: at91: fix build for SAMA5D3 w/o L2 cache
2022-12-08 11:22:27 -08:00
Benjamin Tissoires a9d9e46c75 Revert "HID: logitech-hidpp: Enable HID++ for all the Logitech Bluetooth devices"
This reverts commit 532223c8ac.

As reported in [0], hid-logitech-hidpp now binds on all bluetooth mice,
but there are corner cases where hid-logitech-hidpp just gives up on
the mouse. This leads the end user with a dead mouse.

Given that we are at -rc8, we are definitively too late to find a proper
fix. We already identified 2 issues less than 24 hours after the bug
report. One in that ->match() was never designed to be used anywhere else
than in hid-generic, and the other that hid-logitech-hidpp has corner
cases where it gives up on devices it is not supposed to.

So we have no choice but postpone this patch to the next kernel release.

[0] https://lore.kernel.org/linux-input/CAJZ5v0g-_o4AqMgNwihCb0jrwrcJZfRrX=jv8aH54WNKO7QB8A@mail.gmail.com/

Reported-by: Rafael J . Wysocki <rjw@rjwysocki.net>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-12-08 20:18:11 +01:00
Benjamin Tissoires 40f2432b53 Revert "HID: logitech-hidpp: Remove special-casing of Bluetooth devices"
This reverts commit 8544c812e4.

We need to revert commit 532223c8ac ("HID: logitech-hidpp: Enable HID++
for all the Logitech Bluetooth devices") because that commit might make
hid-logitech-hidpp bind on mice that are not well enough supported by
hid-logitech-hidpp, and the end result is that the probe of those mice
is now returning -ENODEV, leaving the end user with a dead mouse.

Given that commit 8544c812e4 ("HID: logitech-hidpp: Remove special-casing
of Bluetooth devices") is a direct dependency of 532223c8ac, revert it
too.

Note that this also adapt according to commit 908d325e16 ("HID:
logitech-hidpp: Detect hi-res scrolling support") to re-add support of
the devices that were removed from that commit too.

I have locally an MX Master and I tested this device with that revert,
ensuring we still have high-res scrolling.

Reported-by: Rafael J . Wysocki <rjw@rjwysocki.net>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2022-12-08 20:18:11 +01:00
Linus Torvalds 7f043b7662 LoongArch fixes for v6.1-final
-----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmOR3UcWHGNoZW5odWFj
 YWlAa2VybmVsLm9yZwAKCRAChivD8uImesDED/90fKvraGLbwAzS/nV131JyrJO9
 BQ96cSbD9fzsy9Iq+sO9tutyPPCR3thJpJ2gnaCHlo9pn1mjQVEzX39izSkaOJ0G
 0OWeBXa429RMbNERMgxc62aZd21LNKGrrAaW9+1pnZebQ0ELvx55+PnoQNKbxlhR
 EoyjBtSZ9Hb+FV1f6yroAtAWfBw5X2GYYfSr2yjnxY2py6zLKEzTsEOzzU6rPbso
 e0457hFPB5mudAsdTjB2Icjays9BUaJffhp/eLDleky8P0ERJLfldYhnxnKqTE3q
 hZOeHZH3O9rbNUnktlgj/J8VTrrb7yYHaZMQGOzTaXBrvjj9q0k5kSMNKWdoRq37
 ZfnXwMtEGmgXf/htx/wY9sM7FCqV7QY5ubU5SkfU2ygNOdsWQFHmgl6QNtuvx4ii
 GvkHzYpymNdvlD1ZbqFPq7eABBriGs3tefUHPQ8/FoaiMudlAgZKjSuDYIUcy3nK
 3d5abAjCmxMybLGc9bI3T3BPFPGrpUlme/WdKZvvQJ3azia5F0sn2JMa7STVg3FF
 hLFvbS5RKP7dD7uP4CHbnCe5OD7aFBHmar5MYiJY4WrBIKiMQv+URtGKg9g+PDv1
 rFiiEfMfRWe2tVPuyZ5XnkQJxJp4otgaWdRNdQUDmo5uPE85H0hHw1bHw3eEtAK/
 f133+V+/kxamE1F6jg==
 =2sT6
 -----END PGP SIGNATURE-----

Merge tag 'loongarch-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Export smp_send_reschedule() for modules use, fix a huge page entry
  update issue, and add documents for booting description"

* tag 'loongarch-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  docs/zh_CN: Add LoongArch booting description's translation
  docs/LoongArch: Add booting description
  LoongArch: mm: Fix huge page entry update for virtual machine
  LoongArch: Export symbol for function smp_send_reschedule()
2022-12-08 11:16:15 -08:00
Linus Torvalds a4c3a07e5b xen: branch for v6.1-rc9b
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY5GCdwAKCRCAXGG7T9hj
 vsFlAQD4Nn5w0MRewBvRRPp81Wbjhm6mnckOCRPATud6yyrUVgEApZJp+BRNoW1x
 kTVkkOb1x8XxJg1gzM7p5i2WSoCsRQo=
 =NnbR
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-xsa-6.1-rc9b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "A single fix for the recent security issue XSA-423"

* tag 'for-linus-xsa-6.1-rc9b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/netback: fix build warning
2022-12-08 11:11:06 -08:00
Linus Torvalds 306ba2402d gpio fixes for v6.1
- fix a memory leak in gpiolib core
 - fix reference leaks in gpio-amd8111 and gpio-rockchip
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmOSLrsACgkQEacuoBRx
 13K4Ag/+PE+Oa+EPa5H2+gShdaXgQPHWSly/tWxtbYvaru8UDpM6UpqSQ/gHx2ZT
 ALgSRq0qMDeaJcZzrPEvi1bBSEgi5Jaq4jQAS51w7lLVCElDoNhptwMx03YUTZfs
 sc6ETyVr4MEyblXmcucHu6Mh97yRoUWYndZes+vpAueEIV9VxDJkrLUzh5ETf1Q2
 EJ6NXVOUXiTVHyuynFrMbLzXxpCX6TJC+ez1vLz+PaiLQJ3aWkECPm5ZNY/6fSIu
 NA+ywrYXaq8wlamggbyfyWjIwZNRj7SDmC+YKyLfXiUtiPRCoClGzw2txTv9Umoi
 IzmKFOnddepwPUcHEVLYkAmMeSYbIG6K3LQL5eg5FFqi27mExPxs22Z38gouyA9x
 ok0ozoplzFZBm31oclsWTbkhBCQTpYm63fGFe1KMoEXmRjAJDhg7siktmlrMUQJp
 3ClVTlaLP/AOXlv5Fb0buw6vAqBU4kJxKOIFTxTGDqNkmusLEs4N33i0ZIFMLevd
 FqHCxOXePVuZqBJtsUNOzmlpYpJiFc+qJ09yH1xmHqABNdkoWU8MTehgsLFB5VY2
 if1qB+O090XAjRFZmjsrx0vLToUUTZMPO5mbpae5yJhQ8VBZURL12Jk+fI5hBJVI
 JCScXi6LkPO0eNG82hmmpa7xxHNfqabwF1EMTuRDLSJWMfn7Xic=
 =oBrd
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix a memory leak in gpiolib core

 - fix reference leaks in gpio-amd8111 and gpio-rockchip

* tag 'gpio-fixes-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio/rockchip: fix refcount leak in rockchip_gpiolib_register()
  gpio: amd8111: Fix PCI device reference count leak
  gpiolib: fix memory leak in gpiochip_setup_dev()
2022-12-08 11:00:42 -08:00
Linus Torvalds 57fb3f66a3 ATA fixes for 6.1-rc8
A single fix for this final PR for 6.1-rc:
 
   - Avoid a NULL pointer dereference in the libahci platform code that
     can happen on initialization when a device tree does not specify
     names for the adapter clocks (from Anders).
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCY5FONgAKCRDdoc3SxdoY
 dsBSAP9aOBlpbZxMA1SU7Ig9JZDv22W+0D747wuMDHoLzOdlHQD8CGsarHIFsfU5
 H5xOjTVfkSra5sfMUIpqk31R77ETDwg=
 =jjPB
 -----END PGP SIGNATURE-----

Merge tag 'ata-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ATA fix from Damien Le Moal:

 - Avoid a NULL pointer dereference in the libahci platform code that
   can happen on initialization when a device tree does not specify
   names for the adapter clocks (from Anders)

* tag 'ata-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: libahci_platform: ahci_platform_find_clk: oops, NULL pointer
2022-12-08 10:46:52 -08:00
Tejun Heo fbf8321238 memcg: Fix possible use-after-free in memcg_write_event_control()
memcg_write_event_control() accesses the dentry->d_name of the specified
control fd to route the write call.  As a cgroup interface file can't be
renamed, it's safe to access d_name as long as the specified file is a
regular cgroup file.  Also, as these cgroup interface files can't be
removed before the directory, it's safe to access the parent too.

Prior to 347c4a8747 ("memcg: remove cgroup_event->cft"), there was a
call to __file_cft() which verified that the specified file is a regular
cgroupfs file before further accesses.  The cftype pointer returned from
__file_cft() was no longer necessary and the commit inadvertently
dropped the file type check with it allowing any file to slip through.
With the invarients broken, the d_name and parent accesses can now race
against renames and removals of arbitrary files and cause
use-after-free's.

Fix the bug by resurrecting the file type check in __file_cft().  Now
that cgroupfs is implemented through kernfs, checking the file
operations needs to go through a layer of indirection.  Instead, let's
check the superblock and dentry type.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 347c4a8747 ("memcg: remove cgroup_event->cft")
Cc: stable@kernel.org # v3.14+
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-08 10:40:58 -08:00
Radu Nicolae Pirea (OSS) f8bac7f9fd net: dsa: sja1105: avoid out of bounds access in sja1105_init_l2_policing()
The SJA1105 family has 45 L2 policing table entries
(SJA1105_MAX_L2_POLICING_COUNT) and SJA1110 has 110
(SJA1110_MAX_L2_POLICING_COUNT). Keeping the table structure but
accounting for the difference in port count (5 in SJA1105 vs 10 in
SJA1110) does not fully explain the difference. Rather, the SJA1110 also
has L2 ingress policers for multicast traffic. If a packet is classified
as multicast, it will be processed by the policer index 99 + SRCPORT.

The sja1105_init_l2_policing() function initializes all L2 policers such
that they don't interfere with normal packet reception by default. To have
a common code between SJA1105 and SJA1110, the index of the multicast
policer for the port is calculated because it's an index that is out of
bounds for SJA1105 but in bounds for SJA1110, and a bounds check is
performed.

The code fails to do the proper thing when determining what to do with the
multicast policer of port 0 on SJA1105 (ds->num_ports = 5). The "mcast"
index will be equal to 45, which is also equal to
table->ops->max_entry_count (SJA1105_MAX_L2_POLICING_COUNT). So it passes
through the check. But at the same time, SJA1105 doesn't have multicast
policers. So the code programs the SHARINDX field of an out-of-bounds
element in the L2 Policing table of the static config.

The comparison between index 45 and 45 entries should have determined the
code to not access this policer index on SJA1105, since its memory wasn't
even allocated.

With enough bad luck, the out-of-bounds write could even overwrite other
valid kernel data, but in this case, the issue was detected using KASAN.

Kernel log:

sja1105 spi5.0: Probed switch chip: SJA1105Q
==================================================================
BUG: KASAN: slab-out-of-bounds in sja1105_setup+0x1cbc/0x2340
Write of size 8 at addr ffffff880bd57708 by task kworker/u8:0/8
...
Workqueue: events_unbound deferred_probe_work_func
Call trace:
...
sja1105_setup+0x1cbc/0x2340
dsa_register_switch+0x1284/0x18d0
sja1105_probe+0x748/0x840
...
Allocated by task 8:
...
sja1105_setup+0x1bcc/0x2340
dsa_register_switch+0x1284/0x18d0
sja1105_probe+0x748/0x840
...

Fixes: 38fbe91f22 ("net: dsa: sja1105: configure the multicast policers, if present")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Radu Nicolae Pirea (OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20221207132347.38698-1-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 09:38:31 -08:00
Alexandra Winter ebaaadc332 s390/qeth: fix use-after-free in hsci
KASAN found that addr was dereferenced after br2dev_event_work was freed.

==================================================================
BUG: KASAN: use-after-free in qeth_l2_br2dev_worker+0x5ba/0x6b0
Read of size 1 at addr 00000000fdcea440 by task kworker/u760:4/540
CPU: 17 PID: 540 Comm: kworker/u760:4 Tainted: G            E      6.1.0-20221128.rc7.git1.5aa3bed4ce83.300.fc36.s390x+kasan #1
Hardware name: IBM 8561 T01 703 (LPAR)
Workqueue: 0.0.8000_event qeth_l2_br2dev_worker
Call Trace:
 [<000000016944d4ce>] dump_stack_lvl+0xc6/0xf8
 [<000000016942cd9c>] print_address_description.constprop.0+0x34/0x2a0
 [<000000016942d118>] print_report+0x110/0x1f8
 [<0000000167a7bd04>] kasan_report+0xfc/0x128
 [<000000016938d79a>] qeth_l2_br2dev_worker+0x5ba/0x6b0
 [<00000001673edd1e>] process_one_work+0x76e/0x1128
 [<00000001673ee85c>] worker_thread+0x184/0x1098
 [<000000016740718a>] kthread+0x26a/0x310
 [<00000001672c606a>] __ret_from_fork+0x8a/0xe8
 [<00000001694711da>] ret_from_fork+0xa/0x40
Allocated by task 108338:
 kasan_save_stack+0x40/0x68
 kasan_set_track+0x36/0x48
 __kasan_kmalloc+0xa0/0xc0
 qeth_l2_switchdev_event+0x25a/0x738
 atomic_notifier_call_chain+0x9c/0xf8
 br_switchdev_fdb_notify+0xf4/0x110
 fdb_notify+0x122/0x180
 fdb_add_entry.constprop.0.isra.0+0x312/0x558
 br_fdb_add+0x59e/0x858
 rtnl_fdb_add+0x58a/0x928
 rtnetlink_rcv_msg+0x5f8/0x8d8
 netlink_rcv_skb+0x1f2/0x408
 netlink_unicast+0x570/0x790
 netlink_sendmsg+0x752/0xbe0
 sock_sendmsg+0xca/0x110
 ____sys_sendmsg+0x510/0x6a8
 ___sys_sendmsg+0x12a/0x180
 __sys_sendmsg+0xe6/0x168
 __do_sys_socketcall+0x3c8/0x468
 do_syscall+0x22c/0x328
 __do_syscall+0x94/0xf0
 system_call+0x82/0xb0
Freed by task 540:
 kasan_save_stack+0x40/0x68
 kasan_set_track+0x36/0x48
 kasan_save_free_info+0x4c/0x68
 ____kasan_slab_free+0x14e/0x1a8
 __kasan_slab_free+0x24/0x30
 __kmem_cache_free+0x168/0x338
 qeth_l2_br2dev_worker+0x154/0x6b0
 process_one_work+0x76e/0x1128
 worker_thread+0x184/0x1098
 kthread+0x26a/0x310
 __ret_from_fork+0x8a/0xe8
 ret_from_fork+0xa/0x40
Last potentially related work creation:
 kasan_save_stack+0x40/0x68
 __kasan_record_aux_stack+0xbe/0xd0
 insert_work+0x56/0x2e8
 __queue_work+0x4ce/0xd10
 queue_work_on+0xf4/0x100
 qeth_l2_switchdev_event+0x520/0x738
 atomic_notifier_call_chain+0x9c/0xf8
 br_switchdev_fdb_notify+0xf4/0x110
 fdb_notify+0x122/0x180
 fdb_add_entry.constprop.0.isra.0+0x312/0x558
 br_fdb_add+0x59e/0x858
 rtnl_fdb_add+0x58a/0x928
 rtnetlink_rcv_msg+0x5f8/0x8d8
 netlink_rcv_skb+0x1f2/0x408
 netlink_unicast+0x570/0x790
 netlink_sendmsg+0x752/0xbe0
 sock_sendmsg+0xca/0x110
 ____sys_sendmsg+0x510/0x6a8
 ___sys_sendmsg+0x12a/0x180
 __sys_sendmsg+0xe6/0x168
 __do_sys_socketcall+0x3c8/0x468
 do_syscall+0x22c/0x328
 __do_syscall+0x94/0xf0
 system_call+0x82/0xb0
Second to last potentially related work creation:
 kasan_save_stack+0x40/0x68
 __kasan_record_aux_stack+0xbe/0xd0
 kvfree_call_rcu+0xb2/0x760
 kernfs_unlink_open_file+0x348/0x430
 kernfs_fop_release+0xc2/0x320
 __fput+0x1ae/0x768
 task_work_run+0x1bc/0x298
 exit_to_user_mode_prepare+0x1a0/0x1a8
 __do_syscall+0x94/0xf0
 system_call+0x82/0xb0
The buggy address belongs to the object at 00000000fdcea400
 which belongs to the cache kmalloc-96 of size 96
The buggy address is located 64 bytes inside of
 96-byte region [00000000fdcea400, 00000000fdcea460)
The buggy address belongs to the physical page:
page:000000005a9c26e8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xfdcea
flags: 0x3ffff00000000200(slab|node=0|zone=1|lastcpupid=0x1ffff)
raw: 3ffff00000000200 0000000000000000 0000000100000122 000000008008cc00
raw: 0000000000000000 0020004100000000 ffffffff00000001 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
 00000000fdcea300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 00000000fdcea380: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>00000000fdcea400: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                                           ^
 00000000fdcea480: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 00000000fdcea500: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================

Fixes: f7936b7b26 ("s390/qeth: Update MACs of LEARNING_SYNC device")
Reported-by: Thorsten Winkler <twinkler@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Link: https://lore.kernel.org/r/20221207105304.20494-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 09:12:56 -08:00
Emeel Hakim 38099024e5 macsec: add missing attribute validation for offload
Add missing attribute validation for IFLA_MACSEC_OFFLOAD
to the netlink policy.

Fixes: 791bb3fcaf ("net: macsec: add support for specifying offload upon link creation")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/20221207101618.989-1-ehakim@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 09:12:18 -08:00