Commit Graph

258 Commits

Author SHA1 Message Date
Daniil Tatianin 201ed315f9 net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers
So that it's easier to follow and make sense of the branching and
various conditions.

Stats retrieval has been split into two separate functions
ethtool_get_phy_stats_phydev & ethtool_get_phy_stats_ethtool.
The former attempts to retrieve the stats using phydev & phy_ops, while
the latter uses ethtool_ops.

Actual n_stats validation & array allocation has been moved into a new
ethtool_vzalloc_stats_array helper.

This also fixes a potential NULL dereference of
ops->get_ethtool_phy_stats where it was getting called in an else branch
unconditionally without making sure it was actually present.

Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-28 11:55:24 +00:00
Daniil Tatianin fd4778581d net/ethtool/ioctl: remove if n_stats checks from ethtool_get_phy_stats
Now that we always early return if we don't have any stats we can remove
these checks as they're no longer necessary.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-28 11:55:24 +00:00
Daniil Tatianin 9deb1e9fb8 net/ethtool/ioctl: return -EOPNOTSUPP if we have no phy stats
It's not very useful to copy back an empty ethtool_stats struct and
return 0 if we didn't actually have any stats. This also allows for
further simplification of this function in the future commits.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-28 11:55:24 +00:00
Willem de Bruijn b534dc46c8 net_tstamp: add SOF_TIMESTAMPING_OPT_ID_TCP
Add an option to initialize SOF_TIMESTAMPING_OPT_ID for TCP from
write_seq sockets instead of snd_una.

This should have been the behavior from the start. Because processes
may now exist that rely on the established behavior, do not change
behavior of the existing option, but add the right behavior with a new
flag. It is encouraged to always set SOF_TIMESTAMPING_OPT_ID_TCP on
stream sockets along with the existing SOF_TIMESTAMPING_OPT_ID.

Intuitively the contract is that the counter is zero after the
setsockopt, so that the next write N results in a notification for
the last byte N - 1.

On idle sockets snd_una == write_seq and this holds for both. But on
sockets with data in transmission, snd_una records the unacked offset
in the stream. This depends on the ACK response from the peer. A
process cannot learn this in a race free manner (ioctl SIOCOUTQ is one
racy approach).

write_seq records the offset at the last byte written by the process.
This is a better starting point. It matches the intuitive contract in
all circumstances, unaffected by external behavior.

The new timestamp flag necessitates increasing sk_tsflags to 32 bits.
Move the field in struct sock to avoid growing the socket (for some
common CONFIG variants). The UAPI interface so_timestamping.flags is
already int, so 32 bits wide.

Reported-by: Sotirios Delimanolis <sotodel@meta.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20221207143701.29861-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 19:49:21 -08:00
Sudheer Mogilappagari 7112a04664 ethtool: add netlink based get rss support
Add netlink based support for "ethtool -x <dev> [context x]"
command by implementing ETHTOOL_MSG_RSS_GET netlink message.
This is equivalent to functionality provided via ETHTOOL_GRSSH
in ioctl path. It sends RSS table, hash key and hash function
of an interface to user space.

This patch implements existing functionality available
in ioctl path and enables addition of new RSS context
based parameters in future.

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Link: https://lore.kernel.org/r/20221202002555.241580-1-sudheer.mogilappagari@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-05 17:25:00 -08:00
Maxim Korotkov 64a8f8f712 ethtool: avoiding integer overflow in ethtool_phys_id()
The value of an arithmetic expression "n * id.data" is subject
to possible overflow due to a failure to cast operands to a larger data
type before performing arithmetic. Used macro for multiplication instead
operator for avoiding overflow.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Maxim Korotkov <korotkov.maxim.s@gmail.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221122122901.22294-1-korotkov.maxim.s@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-23 20:03:51 -08:00
Vincent Mailhol edaf5df22c ethtool: ethtool_get_drvinfo: populate drvinfo fields even if callback exits
If ethtool_ops::get_drvinfo() callback isn't set,
ethtool_get_drvinfo() will fill the ethtool_drvinfo::name and
ethtool_drvinfo::bus_info fields.

However, if the driver provides the callback function, those two
fields are not touched. This means that the driver has to fill these
itself.

Allow the driver to leave those two fields empty and populate them in
such case. This way, the driver can rely on the default values for the
name and the bus_info. If the driver provides values, do nothing.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/r/20221108035754.2143-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-10 09:01:58 -08:00
Gal Pressman 47f3ecf476 ethtool: Fail number of channels change when it conflicts with rxnfc
Similar to what we do with the hash indirection table [1], when network
flow classification rules are forwarding traffic to channels greater
than the requested number of channels, fail the operation.
Without this, traffic could be directed to channels which no longer
exist (dropped) after changing number of channels.

[1] commit d4ab428627 ("ethtool: correctly ensure {GS}CHANNELS doesn't conflict with GS{RXFH}")

Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/r/20221106123127.522985-1-gal@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-08 12:08:44 +01:00
Jakub Kicinski 9a0f830f80 ethtool: linkstate: add a statistic for PHY down events
The previous attempt to augment carrier_down (see Link)
was not met with much enthusiasm so let's do the simple
thing of exposing what some devices already maintain.
Add a common ethtool statistic for link going down.
Currently users have to maintain per-driver mapping
to extract the right stat from the vendor-specific ethtool -S
stats. carrier_down does not fit the bill because it counts
a lot of software related false positives.

Add the statistic to the extended link state API to steer
vendors towards implementing all of it.

Implement for bnxt and all Linux-controlled PHYs. mlx5 and (possibly)
enic also have a counter for this but I leave the implementation
to their maintainers.

Link: https://lore.kernel.org/r/20220520004500.2250674-1-kuba@kernel.org
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20221104190125.684910-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-08 10:36:54 +01:00
Jiri Pirko 8eba37f7e9 net: devlink: use devlink_port pointer instead of ndo_get_devlink_port
Use newly introduced devlink_port pointer instead of getting it calling
to ndo_get_devlink_port op.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03 20:48:36 -07:00
Jakub Kicinski 31f1aa4f74 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
  2871edb32f ("can: kvaser_usb: Fix possible completions during init_completion")
  abb8670938 ("can: kvaser_usb_leaf: Ignore stale bus-off after start")
  8d21f5927a ("can: kvaser_usb_leaf: Fix improved state not being reported")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27 16:56:36 -07:00
Xin Long 9d9effca9d ethtool: eeprom: fix null-deref on genl_info in dump
The similar fix as commit 46cdedf2a0 ("ethtool: pse-pd: fix null-deref on
genl_info in dump") is also needed for ethtool eeprom.

Fixes: c781ff12a2 ("ethtool: Allow network drivers to dump arbitrary EEPROM data")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/5575919a2efc74cd9ad64021880afc3805c54166.1666362167.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24 19:08:07 -07:00
Jakub Kicinski 96917bb3a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/net.h
  a5ef058dc4 ("net: introduce and use custom sockopt socket flag")
  e993ffe3da ("net: flag sockets supporting msghdr originated zerocopy")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24 13:44:11 -07:00
Amit Cohen 404c76783f ethtool: Add support for 800Gbps link modes
Add support for 800Gbps speed, link modes of 100Gbps per lane.
As mentioned in slide 21 in IEEE documentation [1], all adopted 802.3df
copper and optical PMDs baselines using 100G/lane will be supported.

Add the relevant PMDs which are mentioned in slide 5 in IEEE
documentation [1] and were approved on 10-2022 [2]:
BP - KR8
Cu Cable - CR8
MMF 50m - VR8
MMF 100m - SR8
SMF 500m - DR8
SMF 2km - DR8-2

[1]: https://www.ieee802.org/3/df/public/22_10/22_1004/shrikhande_3df_01a_221004.pdf
[2]: https://ieee802.org/3/df/KeyMotions_3df_221005.pdf

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24 10:43:39 +01:00
Jakub Kicinski 46cdedf2a0 ethtool: pse-pd: fix null-deref on genl_info in dump
ethnl_default_dump_one() passes NULL as info.

It's correct not to set extack during dump, as we should just
silently skip interfaces which can't provide the information.

Reported-by: syzbot+81c4b4bbba6eea2cfcae@syzkaller.appspotmail.com
Fixes: 18ff0bcda6 ("ethtool: add interface to interact with Ethernet Power Equipment")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-21 13:18:05 +01:00
Oleksij Rempel 18ff0bcda6 ethtool: add interface to interact with Ethernet Power Equipment
Add interface to support Power Sourcing Equipment. At current step it
provides generic way to address all variants of PSE devices as defined
in IEEE 802.3-2018 but support only objects specified for IEEE 802.3-2018 104.4
PoDL Power Sourcing Equipment (PSE).

Currently supported and mandatory objects are:
IEEE 802.3-2018 30.15.1.1.3 aPoDLPSEPowerDetectionStatus
IEEE 802.3-2018 30.15.1.1.2 aPoDLPSEAdminState
IEEE 802.3-2018 30.15.1.2.1 acPoDLPSEAdminControl

This is minimal interface needed to control PSE on each separate
ethernet port but it provides not all mandatory objects specified in
IEEE 802.3-2018.

Since "PoDL PSE" and "PSE" have similar names, but some different values
I decide to not merge them and keep separate naming schema. This should
allow as to be as close to IEEE 802.3 spec as possible and avoid name
conflicts in the future.

This implementation is connected to PHYs instead of MACs because PSE
auto classification can potentially interfere with PHY auto negotiation.
So, may be some extra PHY related initialization will be needed.

With WIP version of ethtools interaction with PSE capable link looks
as following:

$ ip l
...
5: t1l1@eth0: <BROADCAST,MULTICAST> ..
...

$ ethtool --show-pse t1l1
PSE attributs for t1l1:
PoDL PSE Admin State: disabled
PoDL PSE Power Detection Status: disabled

$ ethtool --set-pse t1l1 podl-pse-admin-control enable
$ ethtool --show-pse t1l1
PSE attributs for t1l1:
PoDL PSE Admin State: enabled
PoDL PSE Power Detection Status: delivering power

Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 17:33:57 -07:00
Sean Anderson 0c3e10cb44 net: phy: Add support for rate matching
This adds support for rate matching (also known as rate adaptation) to
the phy subsystem. The general idea is that the phy interface runs at
one speed, and the MAC throttles the rate at which it sends packets to
the link speed. There's a good overview of several techniques for
achieving this at [1]. This patch adds support for three: pause-frame
based (such as in Aquantia phys), CRS-based (such as in 10PASS-TS and
2BASE-TL), and open-loop-based (such as in 10GBASE-W).

This patch makes a few assumptions and a few non assumptions about the
types of rate matching available. First, it assumes that different phys
may use different forms of rate matching. Second, it assumes that phys
can use rate matching for any of their supported link speeds (e.g. if a
phy supports 10BASE-T and XGMII, then it can adapt XGMII to 10BASE-T).
Third, it does not assume that all interface modes will use the same
form of rate matching. Fourth, it does not assume that all phy devices
will support rate matching (even if some do). Relaxing or strengthening
these (non-)assumptions could result in a different API. For example, if
all interface modes were assumed to use the same form of rate matching,
then a bitmask of interface modes supportting rate matching would
suffice.

For some better visibility into the process, the current rate matching
mode is exposed as part of the ethtool ksettings. For the moment, only
read access is supported. I'm not sure what userspace might want to
configure yet (disable it altogether, disable just one mode, specify the
mode to use, etc.). For the moment, since only pause-based rate
adaptation support is added in the next few commits, rate matching can
be disabled altogether by adjusting the advertisement.

802.3 calls this feature "rate adaptation" in clause 49 (10GBASE-R) and
"rate matching" in clause 61 (10PASS-TL and 2BASE-TS). Aquantia also calls
this feature "rate adaptation". I chose "rate matching" because it is
shorter, and because Russell doesn't think "adaptation" is correct in this
context.

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-23 11:55:35 +01:00
Li Zhong 05cd823863 ethtool: tunnels: check the return value of nla_nest_start()
Check the return value of nla_nest_start(). When starting the entry
level nested attributes, if the tailroom of socket buffer is
insufficient to store the attribute header and payload, the return value
will be NULL.

There is, however, no real bug here since if the skb is full
nla_put_be16() will fail as well and we'll error out.

Signed-off-by: Li Zhong <floridsleeves@gmail.com>
Link: https://lore.kernel.org/r/20220921181716.1629541-1-floridsleeves@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-22 19:28:10 -07:00
Jakub Kicinski 4f5059e629 ethtool: report missing header via ext_ack in the default handler
The actual presence check for the header is in
ethnl_parse_header_dev_get() but it's a few layers in,
and already has a ton of arguments so let's just pick
the low hanging fruit and check for missing header in
the default request handler.

Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-30 12:20:43 +02:00
Jakub Kicinski 08d1d0e784 ethtool: strset: report missing ETHTOOL_A_STRINGSET_ID via ext_ack
Strset needs ETHTOOL_A_STRINGSET_ID, use it as an example of
reporting attrs missing in nests.

Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-30 12:20:43 +02:00
Jakub Kicinski 9c5d03d362 genetlink: start to validate reserved header bytes
We had historically not checked that genlmsghdr.reserved
is 0 on input which prevents us from using those precious
bytes in the future.

One use case would be to extend the cmd field, which is
currently just 8 bits wide and 256 is not a lot of commands
for some core families.

To make sure that new families do the right thing by default
put the onus of opting out of validation on existing families.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-29 12:47:15 +01:00
Wolfram Sang a71af8902b ethtool: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20220818210218.8443-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22 18:06:55 -07:00
William Dean aa246499bb net: delete extra space and tab in blank line
delete extra space and tab in blank line, there is no functional change.

Reported-by: Hacash Robot <hacashRobot@santino.com>
Signed-off-by: William Dean <williamsukatube@gmail.com>
Link: https://lore.kernel.org/r/20220723073222.2961602-1-williamsukatube@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:38:31 -07:00
Jakub Kicinski 93817be8b6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-23 12:33:24 -07:00
Ivan Vecera a3bb7b6381 ethtool: Fix get module eeprom fallback
Function fallback_set_params() checks if the module type returned
by a driver is ETH_MODULE_SFF_8079 and in this case it assumes
that buffer returns a concatenated content of page  A0h and A2h.
The check is wrong because the correct type is ETH_MODULE_SFF_8472.

Fixes: 96d971e307 ("ethtool: Add fallback to get_module_eeprom from netlink command")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220616160856.3623273-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-17 20:22:16 -07:00
Marco Bonelli 19d62f5eea ethtool: Fix and simplify ethtool_convert_link_mode_to_legacy_u32()
Fix the implementation of ethtool_convert_link_mode_to_legacy_u32(), which
is supposed to return false if src has bits higher than 31 set. The current
implementation uses the complement of bitmap_fill(ext, 32) to test high
bits of src, which is wrong as bitmap_fill() fills _with long granularity_,
and sizeof(long) can be > 4. No users of this function currently check the
return value, so the bug was dormant.

Also remove the check for __ETHTOOL_LINK_MODE_MASK_NBITS > 32, as the enum
ethtool_link_mode_bit_indices contains far beyond 32 values. Using
find_next_bit() to test the src bitmask works regardless of this anyway.

Signed-off-by: Marco Bonelli <marco@mebeim.net>
Link: https://lore.kernel.org/r/20220609134900.11201-1-marco@mebeim.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-13 23:11:35 -07:00
Jakub Kicinski d62607c3fe net: rename reference+tracking helpers
Netdev reference helpers have a dev_ prefix for historic
reasons. Renaming the old helpers would be too much churn
but we can rename the tracking ones which are relatively
recent and should be the default for new code.

Rename:
 dev_hold_track()    -> netdev_hold()
 dev_put_track()     -> netdev_put()
 dev_replace_track() -> netdev_ref_replace()

Link: https://lore.kernel.org/r/20220608043955.919359-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 21:52:55 -07:00
Alexandru Tachici 3254e0b9eb ethtool: Add 10base-T1L link mode entry
Add entry for the 10base-T1L full duplex mode.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01 17:45:35 +01:00
Jie Wang bde292c07b net: ethtool: move checks before rtnl_lock() in ethnl_set_rings
Currently these two checks in ethnl_set_rings are added after rtnl_lock()
which will do useless works if the request is invalid.

So this patch moves these checks before the rtnl_lock() to avoid these
costs.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 11:41:45 -07:00
Jie Wang 4dc84c06a3 net: ethtool: extend ringparam set/get APIs for tx_push
Currently tx push is a standard driver feature which controls use of a fast
path descriptor push. So this patch extends the ringparam APIs and data
structures to support set/get tx push by ethtool -G/g.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 11:41:35 -07:00
Subbaraya Sundeep 1241e329ce ethtool: add support to set/get completion queue event size
Add support to set completion queue event size via ethtool -G
parameter and get it via ethtool -g parameter.

~ # ./ethtool -G eth0 cqe-size 512
~ # ./ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX:             1048576
RX Mini:        n/a
RX Jumbo:       n/a
TX:             1048576
Current hardware settings:
RX:             256
RX Mini:        n/a
RX Jumbo:       n/a
TX:             4096
RX Buf Len:             2048
CQE Size:                128

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-23 20:33:05 -08:00
Jakub Kicinski 9690ae6042 ethtool: add header/data split indication
For applications running on a mix of platforms it's useful
to have a clear indication whether host's NIC supports the
geometry requirements of TCP zero-copy. TCP zero-copy Rx
requires data to be neatly placed into memory pages.
Most NICs can't do that.

This patch is adding GET support only, since the NICs
I work with either always have the feature enabled or
enable it whenever MTU is set to jumbo. In other words
I don't need SET. But adding set should be trivial.
(The only note on SET is that we will likely want
the setting to be "sticky" and use 0 / `unknown`
to reset it back to driver default.)

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-28 14:43:47 +00:00
Tom Rix ccd21ec5b8 ethtool: use phydev variable
In ethtool_get_phy_stats(), the phydev varaible is set to
dev->phydev but dev->phydev is still used.  Replace
dev->phydev uses with phydev.

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06 12:33:35 +00:00
Eric Dumazet 2d6ec25539 netlink: do not allocate a device refcount tracker in ethnl_default_notify()
As reported by Johannes, the tracker allocated in
ethnl_default_notify() is not really needed, as this
function is not expected to change a device reference count.

Fixes: e4b8954074 ("netlink: add net device refcount tracker to struct ethnl_req_info")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Tested-by: Johannes Berg <johannes@sipsolutions.net>
Link: https://lore.kernel.org/r/20220105170849.2610470-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05 09:50:06 -08:00
David S. Miller e63a023489 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-12-30

The following pull-request contains BPF updates for your *net-next* tree.

We've added 72 non-merge commits during the last 20 day(s) which contain
a total of 223 files changed, 3510 insertions(+), 1591 deletions(-).

The main changes are:

1) Automatic setrlimit in libbpf when bpf is memcg's in the kernel, from Andrii.

2) Beautify and de-verbose verifier logs, from Christy.

3) Composable verifier types, from Hao.

4) bpf_strncmp helper, from Hou.

5) bpf.h header dependency cleanup, from Jakub.

6) get_func_[arg|ret|arg_cnt] helpers, from Jiri.

7) Sleepable local storage, from KP.

8) Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support, from Kumar.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-31 14:35:40 +00:00
luo penghao c09f103e89 ethtool: Remove redundant ret assignments
The assignment here will be overwritten, so it should be deleted

The clang_analyzer complains as follows:

net/ethtool/netlink.c:

Value stored to 'ret' is never read

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30 13:29:14 +00:00
Jakub Kicinski b6459415b3 net: Don't include filter.h from net/sock.h
sock.h is pretty heavily used (5k objects rebuilt on x86 after
it's touched). We can drop the include of filter.h from it and
add a forward declaration of struct sk_filter instead.
This decreases the number of rebuilt objects when bpf.h
is touched from ~5k to ~1k.

There's a lot of missing includes this was masking. Primarily
in networking tho, this time.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org
2021-12-29 08:48:14 -08:00
Jakub Kicinski 3bc14ea0d1 ethtool: always write dev in ethnl_parse_header_dev_get
Commit 0976b888a1 ("ethtool: fix null-ptr-deref on ref tracker")
made the write to req_info.dev conditional, but as Eric points out
in a different follow up the structure is often allocated on the
stack and not kzalloc()'d so seems safer to always write the dev,
in case it's garbage on input.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 15:09:24 +00:00
Eric Dumazet 34ac17ecbf ethtool: use ethnl_parse_header_dev_put()
It seems I missed that most ethnl_parse_header_dev_get() callers
declare an on-stack struct ethnl_req_info, and that they simply call
dev_put(req_info.dev) when about to return.

Add ethnl_parse_header_dev_put() helper to properly untrack
reference taken by ethnl_parse_header_dev_get().

Fixes: e4b8954074 ("netlink: add net device refcount tracker to struct ethnl_req_info")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15 10:27:47 +00:00
Jakub Kicinski 0976b888a1 ethtool: fix null-ptr-deref on ref tracker
dev can be a NULL here, not all requests set require_dev.

Fixes: e4b8954074 ("netlink: add net device refcount tracker to struct ethnl_req_info")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:35:56 +00:00
Jakub Kicinski 3150a73366 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 13:23:02 -08:00
Eric Dumazet e4b8954074 netlink: add net device refcount tracker to struct ethnl_req_info
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 20:44:59 -08:00
Antoine Tenart dde91ccfa2 ethtool: do not perform operations on net devices being unregistered
There is a short period between a net device starts to be unregistered
and when it is actually gone. In that time frame ethtool operations
could still be performed, which might end up in unwanted or undefined
behaviours[1].

Do not allow ethtool operations after a net device starts its
unregistration. This patch targets the netlink part as the ioctl one
isn't affected: the reference to the net device is taken and the
operation is executed within an rtnl lock section and the net device
won't be found after unregister.

[1] For example adding Tx queues after unregister ends up in NULL
    pointer exceptions and UaFs, such as:

      BUG: KASAN: use-after-free in kobject_get+0x14/0x90
      Read of size 1 at addr ffff88801961248c by task ethtool/755

      CPU: 0 PID: 755 Comm: ethtool Not tainted 5.15.0-rc6+ #778
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/014
      Call Trace:
       dump_stack_lvl+0x57/0x72
       print_address_description.constprop.0+0x1f/0x140
       kasan_report.cold+0x7f/0x11b
       kobject_get+0x14/0x90
       kobject_add_internal+0x3d1/0x450
       kobject_init_and_add+0xba/0xf0
       netdev_queue_update_kobjects+0xcf/0x200
       netif_set_real_num_tx_queues+0xb4/0x310
       veth_set_channels+0x1c3/0x550
       ethnl_set_channels+0x524/0x610

Fixes: 041b1c5d4a ("ethtool: helper functions for netlink interface")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20211203101318.435618-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:53:32 -08:00
Eric Dumazet 5ae2195088 net: add net device refcount tracker to ethtool_phys_id()
This helper might hold a netdev reference for a long time,
lets add reference tracking.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:10 -08:00
Christophe JAILLET 72a2ff567f ethtool: netlink: Slightly simplify 'ethnl_features_to_bitmap()'
The 'dest' bitmap is fully initialized by the 'for' loop, so there is no
need to explicitly reset it.

This also makes this function in line with 'ethnl_features_to_bitmap32()'
which does not clear the destination before writing it.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/17fca158231c6f03689bd891254f0dd1f4e84cb8.1638091829.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 20:16:46 -08:00
Tonghao Zhang bde3b0fd80 net: ethtool: set a default driver name
The netdev (e.g. ifb, bareudp), which not support ethtool ops
(e.g. .get_drvinfo), we can use the rtnl kind as a default name.

ifb netdev may be created by others prefix, not ifbX.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Hao Chen <chenhao288@hisilicon.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Danielle Ratson <danieller@nvidia.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20211125163049.84970-1-xiangxia.m.yue@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 16:44:53 -08:00
Jakub Kicinski 93d5404e89 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ipa/ipa_main.c
  8afc7e471a ("net: ipa: separate disabling setup from modem stop")
  76b5fbcd6b ("net: ipa: kill ipa_modem_init()")

Duplicated include, drop one.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 13:45:19 -08:00
Julian Wiedmann 0276af2176 ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce()
ethtool_set_coalesce() now uses both the .get_coalesce() and
.set_coalesce() callbacks. But the check for their availability is
buggy, so changing the coalesce settings on a device where the driver
provides only _one_ of the callbacks results in a NULL pointer
dereference instead of an -EOPNOTSUPP.

Fix the condition so that the availability of both callbacks is
ensured. This also matches the netlink code.

Note that reproducing this requires some effort - it only affects the
legacy ioctl path, and needs a specific combination of driver options:
- have .get_coalesce() and .coalesce_supported but no
 .set_coalesce(), or
- have .set_coalesce() but no .get_coalesce(). Here eg. ethtool doesn't
  cause the crash as it first attempts to call ethtool_get_coalesce()
  and bails out on error.

Fixes: f3ccfda193 ("ethtool: extend coalesce setting uAPI with CQE mode")
Cc: Yufeng Mo <moyufeng@huawei.com>
Cc: Huazhong Tan <tanhuazhong@huawei.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Link: https://lore.kernel.org/r/20211126175543.28000-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26 11:17:47 -08:00
Hao Chen 7462494408 ethtool: extend ringparam setting/getting API with rx_buf_len
Add two new parameters kernel_ringparam and extack for
.get_ringparam and .set_ringparam to extend more ring params
through netlink.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:31:49 +00:00
Hao Chen 0b70c256eb ethtool: add support to set/get rx buf len via ethtool
Add support to set rx buf len via ethtool -G parameter and get
rx buf len via ethtool -g parameter.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-22 12:31:48 +00:00