Commit graph

970064 commits

Author SHA1 Message Date
Tom Parkin
4cf476ced4 ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls
This new ioctl pair allows two ppp channels to be bridged together:
frames arriving in one channel are transmitted in the other channel
and vice versa.

The practical use for this is primarily to support the L2TP Access
Concentrator use-case.  The end-user session is presented as a ppp
channel (typically PPPoE, although it could be e.g. PPPoA, or even PPP
over a serial link) and is switched into a PPPoL2TP session for
transmission to the LNS.  At the LNS the PPP session is terminated in
the ISP's network.

When a PPP channel is bridged to another it takes a reference on the
other's struct ppp_file.  This reference is dropped when the channels
are unbridged, which can occur either explicitly on userspace calling
the PPPIOCUNBRIDGECHAN ioctl, or implicitly when either channel in the
bridge is unregistered.

In order to implement the channel bridge, struct channel is extended
with a new field, 'bridge', which points to the other struct channel
making up the bridge.

This pointer is RCU protected to avoid adding another lock to the data
path.

To guard against concurrent writes to the pointer, the existing struct
channel lock 'upl' coverage is extended rather than adding a new lock.

The 'upl' lock is used to protect the existing unit pointer.  Since the
bridge effectively replaces the unit (they're mutually exclusive for a
channel) it makes coding easier to use the same lock to cover them
both.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:57:36 -08:00
Jakub Kicinski
51e13685bd rtnetlink: RCU-annotate both dimensions of rtnl_msg_handlers
We use rcu_assign_pointer to assign both the table and the entries,
but the entries are not marked as __rcu. This generates sparse
warnings.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:35:59 -08:00
Willy Tarreau
1d608d2e0d Revert "macb: support the two tx descriptors on at91rm9200"
This reverts commit 0a4e9ce17b.

The code was developed and tested on an MSC313E SoC, which seems to be
half-way between the AT91RM9200 and the AT91SAM9260 in that it supports
both the 2-descriptors mode and a Tx ring.

It turns out that after the code was merged I could notice that the
controller would sometimes lock up, and only when dealing with sustained
bidirectional transfers, in which case it would report a Tx overrun
condition right after having reported being ready, and will stop sending
even after the status is cleared (a down/up cycle fixes it though).

After adding lots of traces I couldn't spot a sequence pattern allowing
to predict that this situation would happen. The chip comes with no
documentation and other bits are often reported with no conclusive
pattern either.

It is possible that my change is wrong just like it is possible that
the controller on the chip is bogus or at least unpredictable based on
existing docs from other chips. I do not have an RM9200 at hand to test
at the moment and a few tests run on a more recent 9G20 indicate that
this code path cannot be used there to test the code on a 3rd platform.

Since the MSC313E works fine in the single-descriptor mode, and that
people using the old RM9200 very likely favor stability over performance,
better revert this patch until we can test it on the original platform
this part of the driver was written for. Note that the reverted patch
was actually tested on MSC313E.

Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Daniel Palmer <daniel@0x0f.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/netdev/20201206092041.GA10646@1wt.eu/
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:32:33 -08:00
Subash Abhinov Kasiviswanathan
b7f5eb6ba2 net: qualcomm: rmnet: Update rmnet device MTU based on real device
Packets sent by rmnet to the real device have variable MAP header
lengths based on the data format configured. This patch adds checks
to ensure that the real device MTU is sufficient to transmit the MAP
packet comprising of the MAP header and the IP packet. This check
is enforced when rmnet devices are created and updated and during
MTU updates of both the rmnet and real device.

Additionally, rmnet devices now have a default MTU configured which
accounts for the real device MTU and the headroom based on the data
format.

Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Tested-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:30:26 -08:00
Xie He
3b0c860f87 net: lapbether: Consider it successful if (dis)connecting when already (dis)connected
When the upper layer instruct us to connect (or disconnect), but we have
already connected (or disconnected), consider this operation successful
rather than failed.

This can help the upper layer to correct its record about whether we are
connected or not here in layer 2.

The upper layer may not have the correct information about whether we are
connected or not. This can happen if this driver has already been running
for some time when the "x25" module gets loaded.

Another X.25 driver (hdlc_x25) is already doing this, so we make this
driver do this, too.

Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:18:28 -08:00
Sasha Neftin
bfa5e98c9d igc: Add new device ID
Add new device ID for the next step of the silicon and
reflect the I226_K part.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:13:24 -08:00
Arjun Roy
e0fecb289a tcp: correctly handle increased zerocopy args struct size
A prior patch increased the size of struct tcp_zerocopy_receive
but did not update do_tcp_getsockopt() handling to properly account
for this.

This patch simply reintroduces content erroneously cut from the
referenced prior patch that handles the new struct size.

Fixes: 18fb76ed53 ("net-zerocopy: Copy straggler unaligned data for TCP Rx. zerocopy.")
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:10:32 -08:00
Zheng Yongjun
a76b6b1fe8 net: mediatek: simplify the return expression of mtk_gmac_sgmii_path_setup()
Simplify the return expression.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:02:16 -08:00
Zheng Yongjun
b18cac546b net/mlx4: simplify the return expression of mlx4_init_srq_table()
Simplify the return expression.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 13:00:51 -08:00
Zheng Yongjun
ec73c31dfb net: stmmac: simplify the return tc_delete_knode()
Simplify the return expression.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 12:58:55 -08:00
David S. Miller
c7dd222053 linux-can-next-for-5.11-20201210
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAl/R7dETHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCpyVqK+u3vqVGdB/4kMAJaguAotadb18Zq6+5bQ3VYxKms
 a9ctqOnFHv/pTPijEtZoVOMTFItMhQZeBZqYqhe4P544z1oMBg+dtGcli++me2TR
 Lh8JSQIkql9EQN65XX4ZEdEE5u32fFzp2nr1WpKSa6INek5Pfv/QsiGOz2Ws0Oxy
 mZNj4lf30p1+tJunKKSWXghHkw12NPX77AeYLw5uYb1Jmq7o1LIdENpBJEULVwkI
 LVRcxiwcFmUkXMCXbdB8HuUiVCm5Ju3/BCaOWdQk23LYCBDGLi6jW3JAk3qtTqRA
 DQO6sMo/KXrzA49qma6o7AL/i3+fVYjsVPpRjiWNioKbHduvSXoyPyKb
 =u5Aa
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-5.11-20201210' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2020-12-10

here's a pull request of 7 patches for net-next/master.

The first patch is by Oliver Hartkopp for the CAN ISOTP, which adds support for
functional addressing.

A patch by Antonio Quartulli removes an unneeded unlikely() annotation from the
rx-offload helper.

The next three patches target the m_can driver. Sean Nyekjaers's patch removes
a double clearing of clock stop request bit, Patrik Flykt's patch moves the
runtime PM enable/disable to m_can_platform and Jarkko Nikula's patch adds a
PCI glue code driver.

Fabio Estevam's patch converts the flexcan driver to DT only.

And Manivannan Sadhasivam's patchd for the mcp251xfd driver adds internal
loopback mode support.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 12:49:45 -08:00
Antonio Quartulli
a10b24b832 vxlan: avoid double unlikely() notation when using IS_ERR()
The definition of IS_ERR() already applies the unlikely() notation
when checking the error status of the passed pointer. For this
reason there is no need to have the same notation outside of
IS_ERR() itself.

Clean up code by removing redundant notation.

Signed-off-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-10 12:43:29 -08:00
Manivannan Sadhasivam
ee42bedc85 can: mcp251xfd: Add support for internal loopback mode
MCP251xFD supports internal loopback mode which can be used to verify CAN
functionality in the absence of a real CAN device.

Link: https://lore.kernel.org/r/20201201054019.11012-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[mkl: mcp251xfd_get_normal_mode(): move CAN_CTRLMODE_LOOPBACK check to front]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10 10:40:10 +01:00
Fabio Estevam
2c0ac92081 can: flexcan: convert the driver to DT-only
The flexcan driver runs only on DT platforms, so simplify the code by using
of_device_get_match_data() to retrieve the driver data and also by removing the
unused id_table.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20201128132855.7724-1-festevam@gmail.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10 10:40:10 +01:00
Jarkko Nikula
cab7ffc032 can: m_can: add PCI glue driver for Intel Elkhart Lake
Add support for M_CAN controller on Intel Elkhart Lake attached to the PCI bus.
It integrates the Bosch M_CAN controller with Message RAM and the wrapper IP
block with additional registers which all of them are within the same MMIO
range.

Currently only interrupt control register from wrapper IP is used and the MRAM
configuration is expected to come from the firmware via "bosch,mram-cfg" device
property and parsed by m_can.c core.

Initial implementation is done by Felipe Balbi while he was working at Intel
with later changes from Raymond Tan and me.

Co-developed-by: Felipe Balbi (Intel) <balbi@kernel.org>
Co-developed-by: Raymond Tan <raymond.tan@intel.com>
Signed-off-by: Felipe Balbi (Intel) <balbi@kernel.org>
Signed-off-by: Raymond Tan <raymond.tan@intel.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Link: https://lore.kernel.org/r/20201117160827.3636264-1-jarkko.nikula@linux.intel.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10 10:40:10 +01:00
Patrik Flykt
227619c3ff can: m_can: move runtime PM enable/disable to m_can_platform
This is a preparatory patch for upcoming PCI based M_CAN devices. The current
PM implementation would cause PCI based drivers to enable PM twice, once when
the PCI device is added and a second time in m_can_class_register(). This will
cause 'Unbalanced pm_runtime_enable!' to be logged, and is a situation that
should be avoided.

Therefore, in anticipation of PCI devices, move PM enabling out from M_CAN
class registration to its only user, the m_can_platform driver.

Signed-off-by: Patrik Flykt <patrik.flykt@linux.intel.com>
Link: https://lore.kernel.org/r/20201023115800.46538-2-patrik.flykt@linux.intel.com
[mkl: m_can_plat_probe(): fix error handling
      m_can_class_register(): simplify error handling]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10 10:39:36 +01:00
Sean Nyekjaer
c9f4cad6cd can: m_can: m_can_config_endisable(): remove double clearing of clock stop request bit
The CSR bit is already cleared when arriving here so remove this section of
duplicate code.

The registers set in m_can_config_endisable() is set to same exact values as
before this patch.

Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Acked-by: Sriram Dash <sriram.dash@samsung.com>
Acked-by: Dan Murphy <dmurphy@ti.com>
Link: https://lore.kernel.org/r/20191211063227.84259-1-sean@geanix.com
Fixes: f524f829b7 ("can: m_can: Create a m_can platform framework")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10 10:10:43 +01:00
Antonio Quartulli
ecbaf5e13f can: rx-offload: can_rx_offload_offload_one(): avoid double unlikely() notation when using IS_ERR()
The definition of IS_ERR() already applies the unlikely() notation when
checking the error status of the passed pointer. For this reason there is no
need to have the same notation outside of IS_ERR() itself.

Clean up code by removing redundant notation.

Signed-off-by: Antonio Quartulli <a@unstable.cc>
Link: https://lore.kernel.org/r/20201210085321.18693-1-a@unstable.cc
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10 10:10:43 +01:00
Oliver Hartkopp
921ca574cd can: isotp: add SF_BROADCAST support for functional addressing
When CAN_ISOTP_SF_BROADCAST is set in the CAN_ISOTP_OPTS flags the CAN_ISOTP
socket is switched into functional addressing mode, where only single frame
(SF) protocol data units can be send on the specified CAN interface and the
given tp.tx_id after bind().

In opposite to normal and extended addressing this socket does not register a
CAN-ID for reception which would be needed for a 1-to-1 ISOTP connection with a
segmented bi-directional data transfer.

Sending SFs on this socket is therefore a TX-only 'broadcast' operation.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Thomas Wagner <thwa1@web.de>
Link: https://lore.kernel.org/r/20201206144731.4609-1-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-12-10 09:31:40 +01:00
David S. Miller
a7105e3472 Merge branch 'hns3-next'
Huazhong Tan says:

====================
net: hns3: updates for -next

This patchset adds support for tc mqprio offload, hw tc
offload of tc flower, and adpation for max rss size changes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 20:33:20 -08:00
Guojia Liao
cdab7c9779 net: hns3: adjust rss tc mode configure command
For the max rss size of PF may be up to 512, the max queue
number of single tc may be up to 512 too. For the total queue
numbers may be up to 1280, so the queue offset of each tc may
be more than 1024. So adjust the rss tc mode configuration
command, including extend tc size field from 10 bits to 11
bits, and extend tc size field from 3 bits to 4 bits.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 20:33:20 -08:00
Guojia Liao
8eeb1f4bce net: hns3: adjust rss indirection table configure command
For the max rss size of PF may be up to 512, so adjust the
command of configuring rss indirection table to support
queue id larger than 255. The width of queue id is extended
from 8 bits to 10 bits. The high 2 bits are stored in filed
rss_qid_h when the queue id is larger than 255.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 20:33:20 -08:00
Guojia Liao
f1c2e66d7f net: hns3: add support for max 512 rss size
Currently, the driver gets the max rss size from configuration
file when initialization. Both the PF and VF share the same
max rss size, and no more than 128.

For DEVICE_VERSION_V3, the max rss size for PF can be up to
512, so there is a new field in configuration file to store it,
the old filed is used for VF.

To be compatible with boards using old configure file, the PF
will use the old filed if the one is zero.

For the rss size may be larger than 256, so the type of
rss_indirection_tbl of struct hclge_vport should be changed to
u16 as well.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 20:33:20 -08:00
Jian Shen
0205ec041e net: hns3: add support for hw tc offload of tc flower
Some new device supports forwarding packet to queues of specified
TC when flow director rule hit. So add support to configure flow
director rule by tc flower. To avoid rule conflict, add a new flow
director mode HCLGE_FD_TC_FLOWER_ACTIVE, and only one mode can be
active at the same time.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 20:33:19 -08:00
Jian Shen
0f993fe2b8 net: hns3: add support for forwarding packet to queues of specified TC when flow director rule hit
For some new device, it supports forwarding packet to queues
of specified TC when flow director rule hit. So extend the
command handle to support it.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 20:33:19 -08:00
Jian Shen
5a5c909174 net: hns3: add support for tc mqprio offload
Currently, the HNS3 driver only supports offload for tc number
and prio_tc. This patch adds support for other qopts, including
queues count and offset for each tc.

When enable tc mqprio offload, it's not allowed to change
queue numbers by ethtool. For hardware limitation, the queue
number of each tc should be power of 2.

For the queues is not assigned to each tc by average, so it's
should return vport->alloc_tqps for hclge_get_max_channels().

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 20:33:19 -08:00
Jian Shen
35244430d6 net: hns3: refine the struct hane3_tc_info
Currently, there are multiple members related to tc information
in struct hnae3_knic_private_info. Merge them into a new struct
hnae3_tc_info.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 20:33:19 -08:00
Jakub Kicinski
c0ead5552c nfp: silence set but not used warning with IPV6=n
Test robot reports:

drivers/net/ethernet/netronome/nfp/crypto/tls.c: In function 'nfp_net_tls_rx_resync_req':
drivers/net/ethernet/netronome/nfp/crypto/tls.c:477:18: warning: variable 'ipv6h' set but not used [-Wunused-but-set-variable]
  477 |  struct ipv6hdr *ipv6h;
      |                  ^~~~~
In file included from include/linux/compiler_types.h:65,
                    from <command-line>:
drivers/net/ethernet/netronome/nfp/crypto/tls.c: In function 'nfp_net_tls_add':
include/linux/compiler_attributes.h:208:41: warning: statement will never be executed [-Wswitch-unreachable]
  208 | # define fallthrough                    __attribute__((__fallthrough__))
      |                                         ^~~~~~~~~~~~~
drivers/net/ethernet/netronome/nfp/crypto/tls.c:299:3: note: in expansion of macro 'fallthrough'
  299 |   fallthrough;
      |   ^~~~~~~~~~~

Use the IPv6 header in the switch, it doesn't matter which header
we use to read the version field.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:42:03 -08:00
Wong Vee Khee
523437d7b5 net: stmmac: allow stmmac to probe for C45 PHY devices
Assign stmmac's mdio_bus probe capabilities to MDIOBUS_C22_C45.
This extended the probing of C45 PHY devices on the MDIO bus.

Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:40:09 -08:00
David S. Miller
5cab30359a Merge branch 'Add-support-for-VSOL-V2801F-CarlitoxxPro-CPGOS03-GPON-mo
dule'

Russell King says:

====================
Add support for VSOL V2801F/CarlitoxxPro CPGOS03 GPON module

This patch set adds support for the V2801F / CarlitoxxPro module. This
requires two changes:

1) the module only supports single byte reads to the ID EEPROM,
   while we need to still permit sequential reads to the diagnostics
   EEPROM for atomicity reasons.

2) we need to relax the encoding check when we have no reported
   capabilities to allow 1000base-X based on the module bitrate.

Thanks to Pali Rohár for responsive testing over the last two days.

(Resending, dropping the utf-8 characters in Pali's name so the patches
 get through vger. Added Andrew's r-b tags.)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:38:10 -08:00
Russell King
7a77233ec6 net: sfp: relax bitrate-derived mode check
Do not check the encoding when deriving 1000BASE-X from the bitrate
when no other modes are discovered. Some GPON modules (VSOL V2801F
and CarlitoxxPro CPGOS03-0490 v2.0) indicate NRZ encoding with a
1200Mbaud bitrate, but should be driven with 1000BASE-X on the host
side.

Tested-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:38:10 -08:00
Russell King
0d035bed2a net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490 v2.0 workaround
Add a workaround for the detection of VSOL V2801F / CarlitoxxPro
CPGOS03-0490 v2.0 GPON module which CarlitoxxPro states needs single
byte I2C reads to the EEPROM.

Pali Rohár reports that he also has a CarlitoxxPro-based V2801F module,
which reports a manufacturer of "OEM". This manufacturer can't be
matched as it appears in many different modules, so also match the part
number too.

Reported-by: Thomas Schreiber <tschreibe@gmail.com>
Reported-by: Pali Rohár <pali@kernel.org>
Tested-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:38:10 -08:00
Xie He
6b21c0bb3a net: x25: Fix handling of Restart Request and Restart Confirmation
1. When the x25 module gets loaded, layer 2 may already be running and
connected. In this case, although we are in X25_LINK_STATE_0, we still
need to handle the Restart Request received, rather than ignore it.

2. When we are in X25_LINK_STATE_2, we have already sent a Restart Request
and is waiting for the Restart Confirmation with t20timer. t20timer will
restart itself repeatedly forever so it will always be there, as long as we
are in State 2. So we don't need to check x25_t20timer_pending again.

Fixes: d023b2b9cc ("net/x25: fix restart request/confirm handling")
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Acked-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:34:25 -08:00
David S. Miller
0f86a5be10 Merge branch 'mptcp-fixes'
Paolo Abeni says:

====================
mptcp: a bunch of fixes

This series includes a few fixes following-up the
recent code refactor for the MPTCP RX and TX paths.

Boundling them together, since the fixes are somewhat
related.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:31:58 -08:00
Paolo Abeni
d7b1bfd083 mptcp: be careful on subflows shutdown
When the workqueue disposes of the msk, the subflows can still
receive some data from the peer after __mptcp_close_ssk()
completes.

The above could trigger a race between the msk receive path and the
msk destruction. Acquiring the mptcp_data_lock() in __mptcp_destroy_sock()
will not save the day: the rx path could be reached even after msk
destruction completes.

Instead use the subflow 'disposable' flag to prevent entering
the msk receive path after __mptcp_close_ssk().

Fixes: e16163b6e2 ("mptcp: refactor shutdown and close")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:31:58 -08:00
Paolo Abeni
0597d0f8e0 mptcp: plug subflow context memory leak
When a MPTCP listener socket is closed with unaccepted
children pending, the ULP release callback will be invoked,
but nobody will call into __mptcp_close_ssk() on the
corresponding subflow.

As a consequence, at ULP release time, the 'disposable' flag
will be cleared and the subflow context memory will be leaked.

This change addresses the issue always freeing the context if
the subflow is still in the accept queue at ULP release time.

Additionally, this fixes an incorrect code reference in the
related comment.

Note: this fix leverages the changes introduced by the previous
commit.

Fixes: e16163b6e2 ("mptcp: refactor shutdown and close")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:31:58 -08:00
Paolo Abeni
5b950ff433 mptcp: link MPC subflow into msk only after accept
Christoph reported the following splat:

WARNING: CPU: 0 PID: 4615 at net/ipv4/inet_connection_sock.c:1031 inet_csk_listen_stop+0x8e8/0xad0 net/ipv4/inet_connection_sock.c:1031
Modules linked in:
CPU: 0 PID: 4615 Comm: syz-executor.4 Not tainted 5.9.0 #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:inet_csk_listen_stop+0x8e8/0xad0 net/ipv4/inet_connection_sock.c:1031
Code: 03 00 00 00 e8 79 b2 3d ff e9 ad f9 ff ff e8 1f 76 ba fe be 02 00 00 00 4c 89 f7 e8 62 b2 3d ff e9 14 f9 ff ff e8 08 76 ba fe <0f> 0b e9 97 f8 ff ff e8 fc 75 ba fe be 03 00 00 00 4c 89 f7 e8 3f
RSP: 0018:ffffc900037f7948 EFLAGS: 00010293
RAX: ffff88810a349c80 RBX: ffff888114ee1b00 RCX: ffffffff827b14cd
RDX: 0000000000000000 RSI: ffffffff827b1c38 RDI: 0000000000000005
RBP: ffff88810a2a8000 R08: ffff88810a349c80 R09: fffff520006fef1f
R10: 0000000000000003 R11: fffff520006fef1e R12: ffff888114ee2d00
R13: dffffc0000000000 R14: 0000000000000001 R15: ffff888114ee1d68
FS:  00007f2ac1945700(0000) GS:ffff88811b400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd44798bc0 CR3: 0000000109810002 CR4: 0000000000170ef0
Call Trace:
 __tcp_close+0xd86/0x1110 net/ipv4/tcp.c:2433
 __mptcp_close_ssk+0x256/0x430 net/mptcp/protocol.c:1761
 __mptcp_destroy_sock+0x49b/0x770 net/mptcp/protocol.c:2127
 mptcp_close+0x62d/0x910 net/mptcp/protocol.c:2184
 inet_release+0xe9/0x1f0 net/ipv4/af_inet.c:434
 __sock_release+0xd2/0x280 net/socket.c:596
 sock_close+0x15/0x20 net/socket.c:1277
 __fput+0x276/0x960 fs/file_table.c:281
 task_work_run+0x109/0x1d0 kernel/task_work.c:151
 get_signal+0xe8f/0x1d40 kernel/signal.c:2561
 arch_do_signal+0x88/0x1b60 arch/x86/kernel/signal.c:811
 exit_to_user_mode_loop kernel/entry/common.c:161 [inline]
 exit_to_user_mode_prepare+0x9b/0xf0 kernel/entry/common.c:191
 syscall_exit_to_user_mode+0x22/0x150 kernel/entry/common.c:266
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f2ac1254469
Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007f2ac1944dc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffbf RBX: 000000000069bf00 RCX: 00007f2ac1254469
RDX: 0000000000000000 RSI: 0000000000008982 RDI: 0000000000000003
RBP: 000000000069bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000069bf0c
R13: 00007ffeb53f178f R14: 00000000004668b0 R15: 0000000000000003

After commit 0397c6d85f ("mptcp: keep unaccepted MPC subflow into
join list"), the msk's workqueue and/or PM can touch the MPC
subflow - and acquire its socket lock - even if it's still unaccepted.

If the above event races with the relevant listener socket close, we
can end-up with the above splat.

This change addresses the issue delaying the MPC socket insertion
in conn_list at accept time - that is, partially reverting the
blamed commit.

We must additionally ensure that mptcp_pm_fully_established()
happens after accept() time, or the PM will not be able to
handle properly such event - conn_list could be empty otherwise.

In the receive path, we check the subflow list node to ensure
it is out of the listener queue. Be sure client subflows do
not match transiently such condition moving them into the join
list earlier at creation time.

Since we now have multiple mptcp_pm_fully_established() call sites
from different code-paths, said helper can now race with itself.
Use an additional PM status bit to avoid multiple notifications.

Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/103
Fixes: 0397c6d85f ("mptcp: keep unaccepted MPC subflow into join list"),
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:31:58 -08:00
Xie He
7bdddc68cd net: hdlc_x25: Remove unnecessary skb_reset_network_header calls
1. In x25_xmit, skb_reset_network_header is not necessary before we call
lapb_data_request. The lapb module doesn't need skb->network_header.
So there is no need to set skb->network_header before calling
lapb_data_request.

2. In x25_data_indication (called by the lapb module after data have
been received), skb_reset_network_header is not necessary before we
call netif_rx. After we call netif_rx, the code in net/core/dev.c will
call skb_reset_network_header before handing the skb to upper layers
(in __netif_receive_skb_core, called by __netif_receive_skb_one_core,
called by __netif_receive_skb, called by process_backlog). So we don't
need to call skb_reset_network_header by ourselves.

Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:22:48 -08:00
Zheng Yongjun
016ade51a7 net/mlx4: simplify the return expression of mlx4_init_cq_table()
Simplify the return expression.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:20:46 -08:00
Dwip N. Banerjee
c2af62256e ibmvnic: fix rx buffer tracking and index management in replenish_rx_pool partial success
We observed that in the error case for batched send_subcrq_indirect() the
driver does not account for the partial success case. This caused Linux to
crash when free_map and pool index are inconsistent.

Driver needs to update the rx pools "available" count when some batched
sends worked but an error was encountered as part of the whole operation.
Also track replenish_add_buff_failure for statistic purposes.

Fixes: 4f0b6812e9 ("ibmvnic: Introduce batched RX buffer descriptor transmission")
Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:06:10 -08:00
David S. Miller
5a40cce208 Merge branch 'mptcp-Add-port-parameter-to-ADD_ADDR-option'
Mat Martineau says:

====================
mptcp: Add port parameter to ADD_ADDR option

The ADD_ADDR MPTCP option is used to announce available IP addresses
that a peer may connect to when adding more TCP subflows to an existing
MPTCP connection. There is an optional port number field in that
ADD_ADDR header, and this patch set adds capability for that port number
to be sent and received.

Patches 1, 2, and 4 refactor existing ADD_ADDR code to simplify implementation
of port number support.

Patches 3 and 5 are the main functional changes, for sending and
receiving the port number in the MPTCP ADD_ADDR option.

Patch 6 sends the ADD_ADDR option with port number on a bare TCP ACK,
since the extra length of the option may run in to cases where
sufficient TCP option space is not available on a data packet.

Patch 7 plumbs in port number support for the in-kernel MPTCP path
manager.

Patches 8-11 add some optional debug output and a little more cleanup
refactoring.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:16 -08:00
Geliang Tang
432d9e74d8 mptcp: use the variable sk instead of open-coding
Since the local variable sk has been defined, use it instead of
open-coding.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:16 -08:00
Geliang Tang
13ad9f01a2 mptcp: rename add_addr_signal and mptcp_add_addr_status
Since the RM_ADDR signal had been reused with add_addr_signal, it's not
suitable to call it add_addr_signal or mptcp_add_addr_status. So this
patch renamed add_addr_signal to addr_signal, and renamed
mptcp_add_addr_status to mptcp_addr_signal_status.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:16 -08:00
Geliang Tang
42842a425a mptcp: drop rm_addr_signal flag
This patch reused add_addr_signal for the RM_ADDR announcing signal, by
defining a new ADD_ADDR status named MPTCP_RM_ADDR_SIGNAL. Then the flag
rm_addr_signal in PM could be dropped.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:15 -08:00
Geliang Tang
90a4aea8b6 mptcp: print out port and ahmac when receiving ADD_ADDR
This patch printed out more debugging information for the ADD_ADDR
suboption parsing on the incoming path.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:15 -08:00
Geliang Tang
0f5c9e3f07 mptcp: add port parameter for mptcp_pm_announce_addr
This patch added a new parameter 'port' for mptcp_pm_announce_addr. If
this parameter is true, we set the MPTCP_ADD_ADDR_PORT bit of the
add_addr_signal. That means the announced address is added with a port
number.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:15 -08:00
Geliang Tang
fbe0f87ac7 mptcp: send out dedicated packet for ADD_ADDR using port
The process is similar to that of the ADD_ADDR IPv6, this patch also sent
out a pure ack for the ADD_ADDR using port.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:15 -08:00
Geliang Tang
4a2777a834 mptcp: add the outgoing ADD_ADDR port support
This patch added a new add_addr_signal type named MPTCP_ADD_ADDR_PORT,
to identify it is an address with port to be added.

It also added a new parameter 'port' for both mptcp_add_addr_len and
mptcp_pm_add_addr_signal.

In mptcp_established_options_add_addr, we check whether the announced
address is added with port. If it is, we put this port number to
mptcp_out_options's port field.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:15 -08:00
Geliang Tang
2ec72faec8 mptcp: use adding up size to get ADD_ADDR length
This patch uses adding up size to get the ADD_ADDR suboption length rather
than returning the ADD_ADDR size constants.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:15 -08:00
Geliang Tang
22fb85ffae mptcp: add port support for ADD_ADDR suboption writing
In rfc8684, the length of ADD_ADDR suboption with IPv4 address and port
is 18 octets, but mptcp_write_options is 32-bit aligned, so we need to
pad it to 20 octets. All the other port related option lengths need to
be added up 2 octets similarly.

This patch added a new field 'port' in mptcp_out_options. When this
field is set with a port number, we need to add up 4 octets for the
ADD_ADDR suboption, and put the port number into the suboption.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09 19:02:15 -08:00