Commit Graph

1236754 Commits

Author SHA1 Message Date
Wen Gu b40584d145 net/smc: compatible with 128-bits extended GID of virtual ISM device
According to the virtual ISM support feature defined by SMCv2.1, GIDs of
virtual ISM devices are UUIDs defined by RFC4122, which are 128 bits
long, so some adaptation work is required. Note that the GIDs of
existing platform firmware ISM devices remain 64 bits long.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 20:24:33 +00:00
Wen Gu 8dd512df3c net/smc: define a reserved CHID range for virtual ISM devices
According to virtual ISM support feature defined by SMCv2.1, CHIDs in
the range 0xFF00 to 0xFFFF are reserved for use by virtual ISM devices.

And two helpers are introduced to distinguish virtual ISM devices from
the existing platform firmware ISM devices.
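
A minimal sketch of the range check described above, written in Python rather than the kernel's C. The function and constant names are hypothetical illustrations, not the helpers this patch adds; only the 0xFF00 to 0xFFFF range comes from the commit message.

```python
# Hypothetical illustration only; the real helpers are kernel C code.
VIRT_ISM_CHID_MIN = 0xFF00  # start of the reserved range (from the commit message)
VIRT_ISM_CHID_MAX = 0xFFFF  # end of the reserved range (from the commit message)


def is_virtual_chid(chid: int) -> bool:
    """Return True if a 16-bit CHID lies in the range reserved for virtual ISM devices."""
    if not 0 <= chid <= 0xFFFF:
        raise ValueError("CHID must fit in 16 bits")
    return VIRT_ISM_CHID_MIN <= chid <= VIRT_ISM_CHID_MAX
```

Any CHID below 0xFF00 would then be treated as belonging to a platform firmware ISM device.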

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 20:24:33 +00:00
Wen Gu 00e006a257 net/smc: introduce virtual ISM device support feature
This introduces virtual ISM device support feature to SMCv2.1 as the
first supplemental feature.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 20:24:33 +00:00
Wen Gu ece60db3a4 net/smc: support SMCv2.x supplemental features negotiation
This patch adds SMCv2.x supplemental features negotiation. Supported
SMCv2.x supplemental features are represented by feature_mask in FCE
field. The negotiation process is as follows.

 Server                                        Client
            Proposal(features(c-mask bits))
      <-----------------------------------------
            Accept(features(s-mask bits))
      ----------------------------------------->
           Confirm(features(s&c-mask bits))
      <-----------------------------------------
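
The exchange in the diagram can be sketched as plain bitmask arithmetic. This is a Python illustration under the assumption that each feature is one bit of feature_mask; the function name and the returned dictionary are hypothetical and do not mirror the kernel's CLC message structs.

```python
def negotiate_features(client_mask: int, server_mask: int) -> dict:
    """Model the three CLC messages from the diagram as feature bitmasks."""
    proposal = client_mask               # client -> server: what the client supports
    accept = server_mask                 # server -> client: what the server supports
    confirm = server_mask & client_mask  # client -> server: features both sides support
    return {"proposal": proposal, "accept": accept, "confirm": confirm}
```

A feature ends up enabled only if its bit is set in both masks, so a peer that does not know a newer feature simply leaves that bit clear.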

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-and-tested-by: Wenjia Zhang <wenjia@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 20:24:33 +00:00
Wen Gu 9505450d55 net/smc: unify the structs of accept or confirm message for v1 and v2
The structs of CLC accept and confirm messages for SMCv1 and SMCv2 are
separately defined and often cast to each other in the code, which may
increase the risk of errors caused by future divergence of them. So
unify them into one struct for better maintainability.

Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 20:24:33 +00:00
Wen Gu 5205ac4483 net/smc: introduce sub-functions for smc_clc_send_confirm_accept()
There is a large if-else block in smc_clc_send_confirm_accept() and it
is better to split it into two sub-functions.

Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 20:24:32 +00:00
Wen Gu ac053a169c net/smc: rename some 'fce' to 'fce_v2x' for clarity
Rename some functions or variables with 'fce' in their name but used in
SMCv2.1 as 'fce_v2x' for clarity.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-26 20:24:32 +00:00
Marek Behún e9301af385 net: sfp: fix PHY discovery for FS SFP-10G-T module
Commit 2f3ce7a56c ("net: sfp: rework the RollBall PHY waiting code")
changed the long wait before accessing RollBall / FS modules into
probing for PHY every 1 second, and trying 25 times.

Wei Lei reports that this does not work correctly on FS modules: when
initializing, they may report values different from 0xffff in PHY ID
registers for some MMDs, causing get_phy_c45_ids() to find some bogus
MMD.

Fix this by adding the module_t_wait member back, and setting it to 4
seconds for FS modules.
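
A rough Python model of the timing described above, assuming the once-per-second retry loop stays in place after the initial wait. The function name, parameters and injected `probe`/`sleep` callables are hypothetical; the real logic lives in the kernel SFP state machine.

```python
from typing import Callable


def discover_phy(probe: Callable[[], bool],
                 sleep: Callable[[float], None],
                 initial_wait_s: float = 4.0,    # module_t_wait for FS modules (from the commit)
                 retry_interval_s: float = 1.0,  # probe interval (from the commit)
                 max_tries: int = 25) -> bool:   # number of probes (from the commit)
    """Wait for the module to settle, then probe for the PHY a bounded number of times."""
    if initial_wait_s > 0:
        sleep(initial_wait_s)  # avoid reading bogus PHY IDs while the module initializes
    for attempt in range(max_tries):
        if probe():
            return True
        if attempt < max_tries - 1:
            sleep(retry_interval_s)
    return False
```

Passing `sleep` in as a parameter keeps the sketch testable without real delays.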

Fixes: 2f3ce7a56c ("net: sfp: rework the RollBall PHY waiting code")
Reported-by: Wei Lei <quic_leiwei@quicinc.com>
Signed-off-by: Marek Behún <kabel@kernel.org>
Tested-by: Lei Wei <quic_leiwei@quicinc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-25 06:20:14 +00:00
David S. Miller 3b83fa94cf Merge branch 'dpaa2-switch-small-improvements'
Ioana Ciornei says:

====================
dpaa2-switch: small improvements

This patch set consists of a series of small improvements on the
dpaa2-switch driver ranging from adding some more verbosity when
encountering errors to reorganizing code to be easily extensible.

Changes in v3:
- 4/8: removed the fixes tag and moved it to the commit message
- 5/8: specified that there is no user-visible effect
- 6/8: removed the initialization of the err variable

Changes in v2:
- No changes to the actual diff, only rephrased some commit messages and
  added more information.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
Ioana Ciornei 71150d9447 dpaa2-switch: cleanup the egress flood of an unused FDB
In case a DPAA2 switch interface joins a bridge, the FDB used on the
port will be changed to the one associated with the bridge. What this
means exactly is that any VLAN installed on the port will need to be
removed and then installed back so that it points to the new FDB.

Once this is done, the previous FDB will become unused (no VLAN to
point to it). Even though no traffic will reach this FDB, it's best to
just clean up the state of the FDB by zeroing its egress flood domain.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
Ioana Ciornei 6d46a4f105 dpaa2-switch: move a check to the prechangeupper stage
Two different DPAA2 switch ports from two different DPSW instances
cannot be under the same bridge. Instead of checking for this
unsupported configuration in the CHANGEUPPER event, check it as early as
possible in the PRECHANGEUPPER one.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
Ioana Ciornei a8150c9fb1 dpaa2-switch: reorganize the [pre]changeupper events
Create separate functions, dpaa2_switch_port_prechangeupper and
dpaa2_switch_port_changeupper, to be called directly when a DPSW port
changes its upper device.

This way we are not open-coding everything in the main event callback
and we can easily extend it, for example, with bond offload.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
Ioana Ciornei f6da276479 dpaa2-switch: do not clear any interrupts automatically
The DPSW object has multiple event sources multiplexed over the same
IRQ. The driver has the capability to configure only some of these
events to trigger the IRQ.

The dpsw_get_irq_status() can clear events automatically based on the
value stored in the 'status' variable passed to it. We don't want that
to happen because we could get into a situation where we are clearing
more events than we actually handled.

Just resort to manually clearing the events that we handled. Also, since
status is not used on the out path we remove its initialization to zero.

This change does not have a user-visible effect because the dpaa2-switch
driver enables and handles all the DPSW events which exist at the
moment.
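
The idea of clearing only the events that were actually handled can be sketched as follows. This is a Python illustration with made-up bit values and a hypothetical `service_irq` function; it does not reproduce the dpsw firmware API.

```python
from typing import Callable, Dict, Tuple

# Hypothetical bit assignments, chosen only for this example.
EVENT_LINK_CHANGED = 0x1
EVENT_ENDPOINT_CHANGED = 0x2


def service_irq(pending: int,
                handlers: Dict[int, Callable[[], None]]) -> Tuple[int, int]:
    """Run the handler for each pending event we know and return (to_clear, still_pending)."""
    handled = 0
    for bit, handler in handlers.items():
        if pending & bit:
            handler()
            handled |= bit  # remember exactly what was serviced
    # Only the handled bits are cleared; unknown events stay pending.
    return handled, pending & ~handled
```

With automatic clearing, an event bit without a handler would be lost; here it remains set in the second return value.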

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
Ioana Ciornei 77c42a3b0a dpaa2-switch: add ENDPOINT_CHANGED to the irq_mask
Commit 84cba72956 ("dpaa2-switch: integrate the MAC endpoint support")
added support for MAC endpoints in the dpaa2-switch driver but omitted
to add the ENDPOINT_CHANGED irq to the list of interrupt sources. Fix
this by extending the list of events which can raise an interrupt by
extending the mask passed to the dpsw_set_irq_mask() firmware API.

There is no user visible impact even without this patch since whenever a
switch interface is connected/disconnected from an endpoint both events
are set (LINK_CHANGED and ENDPOINT_CHANGED) and, luckily, the
LINK_CHANGED event could actually raise the interrupt and thus get the
MAC/PHY SW configuration started.

Even with this, it's better to just not rely on undocumented firmware
behavior which can change.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
Ioana Ciornei d50b1a8c30 dpaa2-switch: print an error when the vlan is already configured
Print a netdev error when we hit a case in which a specific VLAN is
already configured on the port. While at it, change the already existing
netdev_warn into an _err for consistency purposes.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
Ioana Ciornei 7218e96319 dpaa2-switch: declare the netdev as IFF_LIVE_ADDR_CHANGE capable
There is no restriction around the change of the MAC address on the
switch ports, thus declare the interface netdevs IFF_LIVE_ADDR_CHANGE
capable.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
Ioana Ciornei 365d0371a9 dpaa2-switch: set interface MAC address only on endpoint change
There is no point in updating the MAC address of a switch interface each
time the link state changes, this only needs to happen in case the
endpoint changes (the switch interface is [dis]connected from/to a MAC).

Just move the call to dpaa2_switch_port_set_mac_addr() under
DPSW_IRQ_EVENT_ENDPOINT_CHANGED.

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:18:59 +00:00
David S. Miller d11db8ad38 Merge branch 'am65-cpsw-preemption-coalescing'
Roger Quadros says:

====================
net: ethernet: am65-cpsw: Add mqprio, frame preemption & coalescing

This series adds mqprio qdisc offload in channel mode,
Frame Preemption MAC merge support and RX/TX coalescing
for AM65 CPSW driver.

In v11 following changes were made
- Fix patch "net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode"
by including units.h

Changelog information in each patch file.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:20 +00:00
Grygorii Strashko e4918f9d48 net: ethernet: ti: am65-cpsw: add sw tx/rx irq coalescing based on hrtimers
Add SW IRQ coalescing based on hrtimers for TX and RX data path which
can be enabled by ethtool commands:

- RX coalescing
  ethtool -C eth1 rx-usecs 50

- TX coalescing can be enabled per TX queue

  - by default enables coalescing for TX0
  ethtool -C eth1 tx-usecs 50
  - configure TX0
  ethtool -Q eth0 queue_mask 1 --coalesce tx-usecs 100
  - configure TX1
  ethtool -Q eth0 queue_mask 2 --coalesce tx-usecs 100
  - configure TX0 and TX1
  ethtool -Q eth0 queue_mask 3 --coalesce tx-usecs 100 --coalesce tx-usecs 100

  show configuration for TX0 and TX1:
  ethtool -Q eth0 queue_mask 3 --show-coalesce

Compared to gro_flush_timeout and napi_defer_hard_irqs, this patch
allows enabling IRQ coalescing for the RX path separately.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Roger Quadros 49a2eb9068 net: ethernet: ti: am65-cpsw-qos: Add Frame Preemption MAC Merge support
Add driver support for viewing / changing the MAC Merge sublayer
parameters and seeing the verification state machine's current state
via ethtool.

As hardware does not support interrupt notification for verification
events we resort to polling on link up. On link up we try a couple of
times for verification success and if unsuccessful then give up.

The Frame Preemption feature is described in the Technical Reference
Manual [1] in section:
	12.3.1.4.6.7 Intersperced Express Traffic (IET – P802.3br/D2.0)

Due to Silicon Errata i2208 [2] we limit the minimum IET fragment size to
124 (excluding 4 bytes mCRC).

[1] AM62x TRM - https://www.ti.com/lit/ug/spruiv7a/spruiv7a.pdf
[2] AM62x Silicon Errata - https://www.ti.com/lit/er/sprz487c/sprz487c.pdf

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Grygorii Strashko bc8d62e16e net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode
This patch adds MQPRIO Qdisc offload in full 'channel' mode which allows
not only setting up pri:tc mapping, but also configuring TX shapers
(rate-limiting) on external port FIFOs.

The MQPRIO Qdisc offload is expected to work with or without VLAN/priority
tagged packets.

The CPSW external Port FIFO has 8 Priority queues. The rate-limit can be
set for each of these priority queues. Which Priority queue a packet is
assigned to depends on PN_REG_TX_PRI_MAP register which maps header
priority to switch priority.

The header priority of a packet is assigned via the RX_PRI_MAP_REG which
maps packet priority to header priority.

The packet priority is either the VLAN priority (for VLAN tagged packets)
or the thread/channel offset.

For simplicity, we assign the same priority queue to all queues of a
Traffic Class so it can be rate-limited correctly.

Configuration example:
 ethtool -L eth1 tx 5
 ethtool --set-priv-flags eth1 p0-rx-ptype-rrobin off

 tc qdisc add dev eth1 parent root handle 100: mqprio num_tc 3 \
 map 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 \
 queues 1@0 1@1 1@2 hw 1 mode channel \
 shaper bw_rlimit min_rate 0 100mbit 200mbit max_rate 0 101mbit 202mbit

 tc qdisc replace dev eth2 handle 100: parent root mqprio num_tc 1 \
 map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 hw 1

 ip link add link eth1 name eth1.100 type vlan id 100
 ip link set eth1.100 type vlan egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7

In the above example two ports share the same TX CPPI queue 0 for low
priority traffic. 3 traffic classes are defined for eth1 and mapped to:
TC0 - low priority, TX CPPI queue 0 -> ext Port 1 fifo0, no rate limit
TC1 - prio 2, TX CPPI queue 1 -> ext Port 1 fifo1, CIR=100Mbit/s, EIR=1Mbit/s
TC2 - prio 3, TX CPPI queue 2 -> ext Port 1 fifo2, CIR=200Mbit/s, EIR=2Mbit/s
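
The CIR/EIR figures in the mapping above appear to follow from the min_rate and max_rate values in the tc command. The small Python sketch below assumes CIR = min_rate and EIR = max_rate - min_rate, which is inferred from the example numbers rather than quoted from the driver; the function name is hypothetical.

```python
from typing import Tuple


def cir_eir_mbit(min_rate_mbit: int, max_rate_mbit: int) -> Tuple[int, int]:
    """Derive (CIR, EIR) in Mbit/s from mqprio min_rate/max_rate (assumed relationship)."""
    if max_rate_mbit < min_rate_mbit:
        raise ValueError("max_rate must not be lower than min_rate")
    committed = min_rate_mbit               # assumed: CIR equals min_rate
    excess = max_rate_mbit - min_rate_mbit  # assumed: EIR is the headroom above min_rate
    return committed, excess
```

Under that assumption, TC1 (100mbit/101mbit) yields CIR=100, EIR=1 and TC2 (200mbit/202mbit) yields CIR=200, EIR=2, matching the list above.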

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Roger Quadros 8f5a756106 net: ethernet: am65-cpsw: Move register definitions to header file
Move register definitions to header file. No functional change.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Roger Quadros 1374841ad4 net: ethernet: ti: am65-cpsw: Move code to avoid forward declaration
Move this code around to avoid forward declaration.
No functional change.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Roger Quadros 5db81bdc48 net: ethernet: am65-cpsw: cleanup TAPRIO handling
Handle offloading commands using switch-case in
am65_cpsw_setup_taprio().

Move checks to am65_cpsw_taprio_replace().

Use NL_SET_ERR_MSG_MOD for error messages.
Change error message from "Failed to set cycle time extension"
to "cycle time extension not supported"

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Roger Quadros d0f9535b31 net: ethernet: am65-cpsw: Rename TI_AM65_CPSW_TAS to TI_AM65_CPSW_QOS
We will use this Kconfig option to not only enable TAS/EST offload
but also other QoS features like Multiqueue priority descriptors
and MAC-Merge/Frame Preemption. TI_AM65_CPSW_QOS seems a more
appropriate Kconfig option name than TI_AM65_CPSW_TAS.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Roger Quadros c92b1321bb net: ethernet: am65-cpsw: Build am65-cpsw-qos only if required
Build am65-cpsw-qos only if CONFIG_TI_AM65_CPSW_TAS is enabled.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Vladimir Oltean c8659bd9d1 selftests: forwarding: ethtool_mm: fall back to aggregate if device does not report pMAC stats
Some devices do not support individual 'pmac' and 'emac' stats.
For such devices, resort to 'aggregate' stats.

Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
Vladimir Oltean 2491d66ae6 selftests: forwarding: ethtool_mm: support devices with higher rx-min-frag-size
Some devices have errata due to which they cannot report ETH_ZLEN (60)
in the rx-min-frag-size. This was foreseen of course, and lldpad has
logic that when we request it to advertise addFragSize 0, it will round
it up to the lowest value that is _actually_ supported by the hardware.

The problem is that the selftest expects lldpad to report back to us the
same value as we requested.

Make the selftest smarter by figuring out on its own what is a
reasonable value to expect.
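
One way to work out a reasonable expected value is sketched below in Python. It assumes the usual 802.3br relationship between addFragSize (0 to 3) and the minimum fragment size, namely 64 * (1 + addFragSize) bytes including a 4-byte mCRC; the function names are hypothetical and the selftest itself is a shell script.

```python
ETH_ZLEN = 60     # minimum Ethernet frame length without FCS
ETH_FCS_LEN = 4   # length of the (m)CRC


def frag_size_from_add(add_frag_size: int) -> int:
    """Minimum fragment size, excluding mCRC, for a given addFragSize value."""
    return (ETH_ZLEN + ETH_FCS_LEN) * (1 + add_frag_size) - ETH_FCS_LEN


def smallest_add_frag_size(hw_min_frag_size: int) -> int:
    """Smallest addFragSize whose fragment size satisfies the hardware minimum."""
    for add in range(4):  # addFragSize is assumed to be a 2-bit field
        if frag_size_from_add(add) >= hw_min_frag_size:
            return add
    raise ValueError("hardware minimum exceeds the largest encodable fragment size")
```

For hardware that cannot go below a 124-byte rx-min-frag-size, a request for addFragSize 0 would be rounded up to 1 under this model.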

Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 01:01:19 +00:00
David S. Miller 2437c0f514 Merge branch 'net-selftests-unique-namespace-last-part'
Hangbin Liu says:

====================
Convert net selftests to run in unique namespace (last part)

Here is the last part of converting net selftests to run in unique namespace.
This part converts all remaining tests. After the conversion, we can run the net
selftests in parallel, e.g.

 # ./run_kselftest.sh -n -t net:reuseport_bpf
 TAP version 13
 1..1
 # selftests: net: reuseport_bpf
 ok 1 selftests: net: reuseport_bpf
  mod 10...
 # Socket 0: 0
 # Socket 1: 1
 ...
 # Socket 4: 19
 # Testing filter add without bind...
 # SUCCESS

 # ./run_kselftest.sh -p -n -t net:cmsg_so_mark.sh -t net:cmsg_time.sh -t net:cmsg_ipv6.sh
 TAP version 13
 1..3
 # selftests: net: cmsg_so_mark.sh
 ok 1 selftests: net: cmsg_so_mark.sh
 # selftests: net: cmsg_time.sh
 ok 2 selftests: net: cmsg_time.sh
 # selftests: net: cmsg_ipv6.sh
 ok 3 selftests: net: cmsg_ipv6.sh

 # ./run_kselftest.sh -p -n -c net
 TAP version 13
 1..95
 # selftests: net: reuseport_bpf_numa
 ok 3 selftests: net: reuseport_bpf_numa
 # selftests: net: reuseport_bpf_cpu
 ok 2 selftests: net: reuseport_bpf_cpu
 # selftests: net: sk_bind_sendto_listen
 ok 9 selftests: net: sk_bind_sendto_listen
 # selftests: net: reuseaddr_conflict
 ok 5 selftests: net: reuseaddr_conflict
 ...

Here is the part 1 link:
https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com
part 2 link:
https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com
part 3 link:
https://lore.kernel.org/netdev/20231213060856.4030084-1-liuhangbin@gmail.com
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Hangbin Liu 9d0b4ad82d kselftest/runner.sh: add netns support
Add a variable RUN_IN_NETNS if the user wants to run all the selected tests
in namespace in parallel. With this, we can save a lot of testing time.

Note that some tests may not be suitable to run in a namespace, e.g.
net/drop_monitor_tests.sh, as dwdump needs to be run in the init ns.

I also added another parameter -p to make all the logs reported separately
instead of mixing them in the stdout or output.log.

Nit: the NUM in run_one is not used, rename it to test_num.

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Hangbin Liu 378f082eaf selftests/net: convert pmtu.sh to run it in unique namespace
The pmtu test uses /bin/sh, so we need to source ./lib.sh instead of lib.sh.
Here is the test result after conversion.

 # ./pmtu.sh
 TEST: ipv4: PMTU exceptions                                         [ OK ]
 TEST: ipv4: PMTU exceptions - nexthop objects                       [ OK ]
 TEST: ipv6: PMTU exceptions                                         [ OK ]
 TEST: ipv6: PMTU exceptions - nexthop objects                       [ OK ]
 ...
 TEST: ipv4: list and flush cached exceptions - nexthop objects      [ OK ]
 TEST: ipv6: list and flush cached exceptions                        [ OK ]
 TEST: ipv6: list and flush cached exceptions - nexthop objects      [ OK ]
 TEST: ipv4: PMTU exception w/route replace                          [ OK ]
 TEST: ipv4: PMTU exception w/route replace - nexthop objects        [ OK ]
 TEST: ipv6: PMTU exception w/route replace                          [ OK ]
 TEST: ipv6: PMTU exception w/route replace - nexthop objects        [ OK ]

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Hangbin Liu 4416c5f53b selftests/net: use unique netns name for setup_loopback.sh setup_veth.sh
The setup_loopback and setup_veth scripts use their own way to create namespaces,
so just redefine server_ns/client_ns to unique names.
At the same time, update the namespace name in gro.sh and toeplitz.sh.
As I don't have an environment to run toeplitz.sh, only the gro test result is shown here.

 # ./gro.sh
 running test ipv4 data
 Expected {200 }, Total 1 packets
 Received {200 }, Total 1 packets.
 ...
 Gro::large test passed.
 All Tests Succeeded!

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Hangbin Liu 976fd1fe4f selftests/net: convert xfrm_policy.sh to run it in unique namespace
Here is the test result after conversion.

 # ./xfrm_policy.sh
 PASS: policy before exception matches
 PASS: ping to .254 bypassed ipsec tunnel (exceptions)
 PASS: direct policy matches (exceptions)
 PASS: policy matches (exceptions)
 PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies)
 PASS: direct policy matches (exceptions and block policies)
 PASS: policy matches (exceptions and block policies)
 PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hresh changes)
 PASS: direct policy matches (exceptions and block policies after hresh changes)
 PASS: policy matches (exceptions and block policies after hresh changes)
 PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hthresh change in ns3)
 PASS: direct policy matches (exceptions and block policies after hthresh change in ns3)
 PASS: policy matches (exceptions and block policies after hthresh change in ns3)
 PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after htresh change to normal)
 PASS: direct policy matches (exceptions and block policies after htresh change to normal)
 PASS: policy matches (exceptions and block policies after htresh change to normal)
 PASS: policies with repeated htresh change

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Hangbin Liu 098f1ce08b selftests/net: convert stress_reuseport_listen.sh to run it in unique namespace
Here is the test result after conversion.

 # ./stress_reuseport_listen.sh
 listen 24000 socks took 0.47714

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Hangbin Liu d3b6b11161 selftests/net: convert rtnetlink.sh to run it in unique namespace
When running the test in a namespace, debugfs may not load automatically.
So add a check to make sure debugfs is loaded. Here is the test result
after conversion.

 # ./rtnetlink.sh
 PASS: policy routing
 PASS: route get
 ...
 PASS: address proto IPv4
 PASS: address proto IPv6

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Hangbin Liu f6476dedf0 selftests/net: convert netns-name.sh to run it in unique namespace
This test will move the device to netns 1. Add a new test_ns to do this.
Here is the test result after conversion.

 # ./netns-name.sh
 netns-name.sh                           [  OK  ]

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Hangbin Liu b84c2faeb9 selftests/net: convert gre_gso.sh to run it in unique namespace
Here is the test result after conversion.

 # ./gre_gso.sh
     TEST: GREv6/v4 - copy file w/ TSO                                   [ OK ]
     TEST: GREv6/v4 - copy file w/ GSO                                   [ OK ]
     TEST: GREv6/v6 - copy file w/ TSO                                   [ OK ]
     TEST: GREv6/v6 - copy file w/ GSO                                   [ OK ]

 Tests passed:   4
 Tests failed:   0

Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:26:32 +00:00
Jiapeng Chong 6530b29f77 selftests/net: remove unneeded semicolon
No functional modification involved.

./tools/testing/selftests/net/tcp_ao/setsockopt-closed.c:121:2-3: Unneeded semicolon.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7771
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-23 00:23:30 +00:00
Dmitry Safonov 826eb9bcc1 selftest/tcp-ao: Rectify out-of-tree build
Trivial fix for out-of-tree build that I wasn't testing previously:

1. Create a directory for library object files, fixes:
> gcc lib/kconfig.c -Wall -O2 -g -D_GNU_SOURCE -fno-strict-aliasing -I ../../../../../usr/include/ -iquote /tmp/kselftest/kselftest/net/tcp_ao/lib -I ../../../../include/  -o /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o -c
> Assembler messages:
> Fatal error: can't create /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o: No such file or directory
> make[1]: *** [Makefile:46: /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o] Error 1

2. Include $(KHDR_INCLUDES) that's exported by selftests/Makefile, fixes:
> In file included from lib/kconfig.c:6:
> lib/aolib.h:320:45: warning: ‘struct tcp_ao_add’ declared inside parameter list will not be visible outside of this definition or declaration
>   320 | extern int test_prepare_key_sockaddr(struct tcp_ao_add *ao, const char *alg,
>       |                                             ^~~~~~~~~~
...

3. While here, clean up $(KSFT_KHDR_INSTALL): it's not needed anymore
   since commit f2745dc0ba ("selftests: stop using KSFT_KHDR_INSTALL")

4. Also, while here, drop the .DEFAULT_GOAL definition: it has a
   self-explanatory comment that was valid when I made these selftests
   compile on a local v4.19 kernel, but it is not needed since
   commit 8ce72dc325 ("selftests: fix headers_install circular dependency")

Fixes: cfbab37b3d ("selftests/net: Add TCP-AO library")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312190645.q76MmHyq-lkp@intel.com/
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 23:26:37 +00:00
Jonathan Corbet 45248f2902 tipc: Remove some excess struct member documentation
Remove documentation for nonexistent struct members, addressing these
warnings:

  ./net/tipc/link.c:228: warning: Excess struct member 'media_addr' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'timer' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'refcnt' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'proto_msg' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'pmsg' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'backlog_limit' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'exp_msg_count' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'reset_rcv_checkpt' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'transmitq' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'snt_nxt' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'deferred_queue' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'unacked_window' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'next_out' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'long_msg_seq_no' description in 'tipc_link'
  ./net/tipc/link.c:228: warning: Excess struct member 'bc_rcvr' description in 'tipc_link'

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 23:14:43 +00:00
Jonathan Corbet dcc3e46472 net: skbuff: Remove some excess struct-member documentation
Remove documentation for nonexistent structure members, addressing these
warnings:

  ./include/linux/skbuff.h:1063: warning: Excess struct member 'sp' description in 'sk_buff'
  ./include/linux/skbuff.h:1063: warning: Excess struct member 'nf_bridge' description in 'sk_buff'

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 23:12:33 +00:00
David S. Miller 5f12303528 Merge branch 'tcp-refactor-bhash2'
Kuniyuki Iwashima says:

====================
tcp: Refactor bhash2 and remove sk_bind2_node.

This series refactors code around bhash2 and removes some bhash2-specific
fields: sock.sk_bind2_node and inet_timewait_sock.tw_bind2_node.

  patch 1      : optimise bind() for non-wildcard v4-mapped-v6 address
  patch 2 -  4 : optimise bind() conflict tests
  patch 5 - 12 : Link bhash2 to bhash and unlink sk from bhash2 to
                 remove sk_bind2_node

The patch 8 will trigger a false-positive error by checkpatch.

v2: resend of https://lore.kernel.org/netdev/20231213082029.35149-1-kuniyu@amazon.com/
  * Rebase on latest net-next
  * Patch 11
    * Add change in inet_diag_dump_icsk() for recent bhash dump patch

v1: https://lore.kernel.org/netdev/20231023190255.39190-1-kuniyu@amazon.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:35 +00:00
Kuniyuki Iwashima 8191792c18 tcp: Remove dead code and fields for bhash2.
Now all sockets including TIME_WAIT are linked to bhash2 using
sock_common.skc_bind_node.

We no longer use inet_bind2_bucket.deathrow, sock.sk_bind2_node,
and inet_timewait_sock.tw_bind2_node.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:35 +00:00
Kuniyuki Iwashima 770041d337 tcp: Link sk and twsk to tb2->owners using skc_bind_node.
Now we can use sk_bind_node/tw_bind_node for bhash2, which means
we need not link TIME_WAIT sockets separately.

The dead code and sk_bind2_node will be removed in the next patch.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:35 +00:00
Kuniyuki Iwashima b2cb9f9ef2 tcp: Unlink sk from bhash.
Now we do not use tb->owners and can unlink sockets from bhash.

sk_bind_node/tw_bind_node are available for bhash2 and will be
used in the following patch.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:35 +00:00
Kuniyuki Iwashima 8002d44fe8 tcp: Check hlist_empty(&tb->bhash2) instead of hlist_empty(&tb->owners).
We use hlist_empty(&tb->owners) to check if the bhash bucket has a socket.
We can check the child bhash2 buckets instead.

For this to work, the bhash2 bucket must be freed before the bhash bucket.
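
The ordering requirement can be sketched with a simplified user-space
model. The tb_model and tb2_model types and both helpers below are
hypothetical stand-ins for illustration, not the kernel's definitions. If
an empty bhash2 bucket stayed linked, the parent bhash bucket would still
look occupied, so the child must be released first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a bhash2 bucket (keyed by port and address). */
struct tb2_model {
    int nr_owners;              /* stand-in for the tb2->owners socket list */
    struct tb2_model *next;     /* next bhash2 bucket under the same parent */
};

/* Hypothetical model of a bhash bucket (keyed by port only). */
struct tb_model {
    struct tb2_model *bhash2;   /* child bhash2 buckets */
};

/* In this model, a bhash bucket is unused when it has no child buckets. */
static bool tb_is_empty(const struct tb_model *tb)
{
    assert(tb != NULL);
    return tb->bhash2 == NULL;
}

/* Unlink tb2 from tb once tb2 holds no sockets; return true if unlinked. */
static bool tb2_release(struct tb_model *tb, struct tb2_model *tb2)
{
    assert(tb != NULL && tb2 != NULL);
    if (tb2->nr_owners != 0)
        return false;
    for (struct tb2_model **pp = &tb->bhash2; *pp; pp = &(*pp)->next) {
        if (*pp == tb2) {
            *pp = tb2->next;
            tb2->next = NULL;
            return true;
        }
    }
    return false;
}
```

In this model, tb_is_empty() only reports true after tb2_release() has
succeeded for every child bucket.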

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:35 +00:00
Kuniyuki Iwashima b82ba728cc tcp: Iterate tb->bhash2 in inet_csk_bind_conflict().
Sockets in bhash are also linked to bhash2, but TIME_WAIT sockets
are linked separately in tb2->deathrow.

Let's replace tb->owners iteration in inet_csk_bind_conflict() with
two iterations over tb2->owners and tb2->deathrow.

This can be done safely under bhash's lock because socket insertion/
deletion in bhash2 happens with bhash's lock held.

Note that twsk_for_each_bound_bhash() will be removed later.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:35 +00:00
Kuniyuki Iwashima 58655bc0ad tcp: Rearrange tests in inet_csk_bind_conflict().
The following patch adds code in the !inet_use_bhash2_on_bind(sk)
case in inet_csk_bind_conflict().

To avoid adding another level of nesting and to keep that change clean,
this patch rearranges the tests in inet_csk_bind_conflict().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:35 +00:00
Kuniyuki Iwashima 822fb91fc7 tcp: Link bhash2 to bhash.
bhash2 added a new member sk_bind2_node in struct sock to link
sockets to bhash2 in addition to bhash.

bhash is still needed to search efficiently, by port, for sockets that
conflict with the wildcard address.  However, bhash itself need not
hold sockets.

If we link each bhash2 bucket to the corresponding bhash bucket,
we can iterate over the same set of sockets in bhash2 via bhash.

This patch links bhash2 to bhash only, and the actual use will be
in the later patches.  Finally, we will remove sk_bind2_node.
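
The linkage described above can be sketched with a simplified user-space
model. All type and function names below are hypothetical and use plain
singly linked lists; the kernel structures differ. The sketch only shows
that a port-keyed bucket can reach every socket through its child
(port, address) buckets without holding a socket list of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket model: linked only into its bhash2 bucket. */
struct sock_model {
    int id;
    struct sock_model *next;            /* next socket in the same bhash2 bucket */
};

/* Hypothetical bhash2 bucket, keyed by (port, address). */
struct bind2_bucket_model {
    struct sock_model *owners;          /* sockets bound to this port and address */
    struct bind2_bucket_model *next;    /* next bhash2 bucket under the same bhash bucket */
};

/* Hypothetical bhash bucket, keyed by port only; it holds no sockets itself. */
struct bind_bucket_model {
    unsigned short port;
    struct bind2_bucket_model *bhash2;  /* child bhash2 buckets */
};

/* Link a bhash2 bucket to its parent bhash bucket. */
static void bind_link_bind2(struct bind_bucket_model *tb,
                            struct bind2_bucket_model *tb2)
{
    assert(tb != NULL && tb2 != NULL);
    tb2->next = tb->bhash2;
    tb->bhash2 = tb2;
}

/* Add a socket to a bhash2 bucket. */
static void bind2_add_sock(struct bind2_bucket_model *tb2, struct sock_model *sk)
{
    assert(tb2 != NULL && sk != NULL);
    sk->next = tb2->owners;
    tb2->owners = sk;
}

/* Visit every socket on a port by walking bhash -> bhash2 -> owners. */
static int bind_count_socks(const struct bind_bucket_model *tb)
{
    int n = 0;

    for (const struct bind2_bucket_model *tb2 = tb->bhash2; tb2; tb2 = tb2->next)
        for (const struct sock_model *sk = tb2->owners; sk; sk = sk->next)
            n++;
    return n;
}
```

Here bind_count_socks() stands in for any per-port walk, such as a
conflict check, that previously needed a socket list on the bhash bucket.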

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:34 +00:00
Kuniyuki Iwashima 4dd7108854 tcp: Rename tb in inet_bind2_bucket_(init|create)().
Later in this series, we will no longer link sockets to bhash.  Instead,
each bhash2 bucket will be linked to the corresponding bhash bucket.

Then, we pass the bhash bucket to bhash2 allocation functions as
tb.  However, tb is already used in inet_bind2_bucket_create() and
inet_bind2_bucket_init() as the bhash2 bucket.

To make the following diff clear, let's use tb2 for the bhash2 bucket
there.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-22 22:15:34 +00:00