linux-stable/include/linux/ethtool.h

1056 lines
42 KiB
C
Raw Normal View History

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
/* SPDX-License-Identifier: GPL-2.0 */
/*
* ethtool.h: Defines for Linux ethtool.
*
* Copyright (C) 1998 David S. Miller (davem@redhat.com)
* Copyright 2001 Jeff Garzik <jgarzik@pobox.com>
* Portions Copyright 2001 Sun Microsystems (thockin@sun.com)
* Portions Copyright 2002 Intel (eli.kupermann@intel.com,
* christopher.leech@intel.com,
* scott.feldman@intel.com)
* Portions Copyright (C) Sun Microsystems 2008
*/
#ifndef _LINUX_ETHTOOL_H
#define _LINUX_ETHTOOL_H
net: ethtool: add new ETHTOOL_xLINKSETTINGS API This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 18:57:59 +00:00
#include <linux/bitmap.h>
#include <linux/compat.h>
#include <linux/if_ether.h>
#include <linux/netlink.h>
#include <uapi/linux/ethtool.h>
struct compat_ethtool_rx_flow_spec {
u32 flow_type;
union ethtool_flow_union h_u;
struct ethtool_flow_ext h_ext;
union ethtool_flow_union m_u;
struct ethtool_flow_ext m_ext;
compat_u64 ring_cookie;
u32 location;
};
struct compat_ethtool_rxnfc {
u32 cmd;
u32 flow_type;
compat_u64 data;
struct compat_ethtool_rx_flow_spec fs;
u32 rule_cnt;
u32 rule_locs[];
};
#include <linux/rculist.h>
/**
* enum ethtool_phys_id_state - indicator state for physical identification
* @ETHTOOL_ID_INACTIVE: Physical ID indicator should be deactivated
* @ETHTOOL_ID_ACTIVE: Physical ID indicator should be activated
* @ETHTOOL_ID_ON: LED should be turned on (used iff %ETHTOOL_ID_ACTIVE
* is not supported)
* @ETHTOOL_ID_OFF: LED should be turned off (used iff %ETHTOOL_ID_ACTIVE
* is not supported)
*/
enum ethtool_phys_id_state {
ETHTOOL_ID_INACTIVE,
ETHTOOL_ID_ACTIVE,
ETHTOOL_ID_ON,
ETHTOOL_ID_OFF
};
ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 16:12:10 +00:00
enum {
ETH_RSS_HASH_TOP_BIT, /* Configurable RSS hash function - Toeplitz */
ETH_RSS_HASH_XOR_BIT, /* Configurable RSS hash function - Xor */
ETH_RSS_HASH_CRC32_BIT, /* Configurable RSS hash function - Crc32 */
ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 16:12:10 +00:00
/*
* Add your fresh new hash function bits above and remember to update
* rss_hash_func_strings[] in ethtool.c
*/
ETH_RSS_HASH_FUNCS_COUNT
};
/**
* struct kernel_ethtool_ringparam - RX/TX ring configuration
* @rx_buf_len: Current length of buffers on the rx ring.
* @tcp_data_split: Scatter packet headers and data to separate buffers
* @tx_push: The flag of tx push mode
* @rx_push: The flag of rx push mode
* @cqe_size: Size of TX/RX completion queue event
* @tx_push_buf_len: Size of TX push buffer
* @tx_push_buf_max_len: Maximum allowed size of TX push buffer
*/
struct kernel_ethtool_ringparam {
u32 rx_buf_len;
u8 tcp_data_split;
u8 tx_push;
u8 rx_push;
u32 cqe_size;
u32 tx_push_buf_len;
u32 tx_push_buf_max_len;
};
/**
* enum ethtool_supported_ring_param - indicator caps for setting ring params
* @ETHTOOL_RING_USE_RX_BUF_LEN: capture for setting rx_buf_len
* @ETHTOOL_RING_USE_CQE_SIZE: capture for setting cqe_size
* @ETHTOOL_RING_USE_TX_PUSH: capture for setting tx_push
* @ETHTOOL_RING_USE_RX_PUSH: capture for setting rx_push
* @ETHTOOL_RING_USE_TX_PUSH_BUF_LEN: capture for setting tx_push_buf_len
*/
enum ethtool_supported_ring_param {
ETHTOOL_RING_USE_RX_BUF_LEN = BIT(0),
ETHTOOL_RING_USE_CQE_SIZE = BIT(1),
ETHTOOL_RING_USE_TX_PUSH = BIT(2),
ETHTOOL_RING_USE_RX_PUSH = BIT(3),
ETHTOOL_RING_USE_TX_PUSH_BUF_LEN = BIT(4),
};
ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 16:12:10 +00:00
#define __ETH_RSS_HASH_BIT(bit) ((u32)1 << (bit))
#define __ETH_RSS_HASH(name) __ETH_RSS_HASH_BIT(ETH_RSS_HASH_##name##_BIT)
#define ETH_RSS_HASH_TOP __ETH_RSS_HASH(TOP)
#define ETH_RSS_HASH_XOR __ETH_RSS_HASH(XOR)
#define ETH_RSS_HASH_CRC32 __ETH_RSS_HASH(CRC32)
ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 16:12:10 +00:00
#define ETH_RSS_HASH_UNKNOWN 0
#define ETH_RSS_HASH_NO_CHANGE 0
struct net_device;
struct netlink_ext_ack;
/* Link extended state and substate. */
struct ethtool_link_ext_state_info {
enum ethtool_link_ext_state link_ext_state;
union {
enum ethtool_link_ext_substate_autoneg autoneg;
enum ethtool_link_ext_substate_link_training link_training;
enum ethtool_link_ext_substate_link_logical_mismatch link_logical_mismatch;
enum ethtool_link_ext_substate_bad_signal_integrity bad_signal_integrity;
enum ethtool_link_ext_substate_cable_issue cable_issue;
enum ethtool_link_ext_substate_module module;
u32 __link_ext_substate;
};
};
struct ethtool_link_ext_stats {
/* Custom Linux statistic for PHY level link down events.
* In a simpler world it should be equal to netdev->carrier_down_count
* unfortunately netdev also counts local reconfigurations which don't
* actually take the physical link down, not to mention NC-SI which,
* if present, keeps the link up regardless of host state.
* This statistic counts when PHY _actually_ went down, or lost link.
*
* Note that we need u64 for ethtool_stats_init() and comparisons
* to ETHTOOL_STAT_NOT_SET, but only u32 is exposed to the user.
*/
u64 link_down_events;
};
/**
* ethtool_rxfh_indir_default - get default value for RX flow hash indirection
* @index: Index in RX flow hash indirection table
* @n_rx_rings: Number of RX rings to use
*
* This function provides the default policy for RX flow hash indirection.
*/
static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
{
return index % n_rx_rings;
}
net: ethtool: add new ETHTOOL_xLINKSETTINGS API This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 18:57:59 +00:00
/* declare a link mode bitmap */
#define __ETHTOOL_DECLARE_LINK_MODE_MASK(name) \
DECLARE_BITMAP(name, __ETHTOOL_LINK_MODE_MASK_NBITS)
/* drivers must ignore base.cmd and base.link_mode_masks_nwords
* fields, but they are allowed to overwrite them (will be ignored).
*/
struct ethtool_link_ksettings {
struct ethtool_link_settings base;
struct {
__ETHTOOL_DECLARE_LINK_MODE_MASK(supported);
__ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);
__ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising);
} link_modes;
ethtool: Extend link modes settings uAPI with lanes Currently, when auto negotiation is on, the user can advertise all the linkmodes which correspond to a specific speed, but does not have a similar selector for the number of lanes. This is significant when a specific speed can be achieved using different number of lanes. For example, 2x50 or 4x25. Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct ethtool_link_settings' with lanes field in order to implement a new lanes-selector that will enable the user to advertise a specific number of lanes as well. When auto negotiation is off, lanes parameter can be forced only if the driver supports it. Add a capability bit in 'struct ethtool_ops' that allows ethtool know if the driver can handle the lanes parameter when auto negotiation is off, so if it does not, an error message will be returned when trying to set lanes. Example: $ ethtool -s swp1 lanes 4 $ ethtool swp1 Settings for swp1: Supported ports: [ FIBRE ] Supported link modes: 1000baseKX/Full 10000baseKR/Full 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 25000baseCR/Full 25000baseSR/Full 50000baseCR2/Full 100000baseSR4/Full 100000baseCR4/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 100000baseSR4/Full 100000baseCR4/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: Not reported Speed: Unknown! Duplex: Unknown! (255) Auto-negotiation: on Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Link detected: no Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02 18:06:06 +00:00
u32 lanes;
net: ethtool: add new ETHTOOL_xLINKSETTINGS API This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 18:57:59 +00:00
};
/**
* ethtool_link_ksettings_zero_link_mode - clear link_ksettings link mode mask
* @ptr : pointer to struct ethtool_link_ksettings
* @name : one of supported/advertising/lp_advertising
*/
#define ethtool_link_ksettings_zero_link_mode(ptr, name) \
bitmap_zero((ptr)->link_modes.name, __ETHTOOL_LINK_MODE_MASK_NBITS)
/**
* ethtool_link_ksettings_add_link_mode - set bit in link_ksettings
* link mode mask
* @ptr : pointer to struct ethtool_link_ksettings
* @name : one of supported/advertising/lp_advertising
* @mode : one of the ETHTOOL_LINK_MODE_*_BIT
* (not atomic, no bound checking)
*/
#define ethtool_link_ksettings_add_link_mode(ptr, name, mode) \
__set_bit(ETHTOOL_LINK_MODE_ ## mode ## _BIT, (ptr)->link_modes.name)
/**
* ethtool_link_ksettings_del_link_mode - clear bit in link_ksettings
* link mode mask
* @ptr : pointer to struct ethtool_link_ksettings
* @name : one of supported/advertising/lp_advertising
* @mode : one of the ETHTOOL_LINK_MODE_*_BIT
* (not atomic, no bound checking)
*/
#define ethtool_link_ksettings_del_link_mode(ptr, name, mode) \
__clear_bit(ETHTOOL_LINK_MODE_ ## mode ## _BIT, (ptr)->link_modes.name)
net: ethtool: add new ETHTOOL_xLINKSETTINGS API This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 18:57:59 +00:00
/**
* ethtool_link_ksettings_test_link_mode - test bit in ksettings link mode mask
* @ptr : pointer to struct ethtool_link_ksettings
* @name : one of supported/advertising/lp_advertising
* @mode : one of the ETHTOOL_LINK_MODE_*_BIT
* (not atomic, no bound checking)
*
* Returns true/false.
*/
#define ethtool_link_ksettings_test_link_mode(ptr, name, mode) \
test_bit(ETHTOOL_LINK_MODE_ ## mode ## _BIT, (ptr)->link_modes.name)
extern int
__ethtool_get_link_ksettings(struct net_device *dev,
struct ethtool_link_ksettings *link_ksettings);
struct kernel_ethtool_coalesce {
u8 use_cqe_mode_tx;
u8 use_cqe_mode_rx;
u32 tx_aggr_max_bytes;
u32 tx_aggr_max_frames;
u32 tx_aggr_time_usecs;
};
/**
* ethtool_intersect_link_masks - Given two link masks, AND them together
* @dst: first mask and where result is stored
* @src: second mask to intersect with
*
* Given two link mode masks, AND them together and save the result in dst.
*/
void ethtool_intersect_link_masks(struct ethtool_link_ksettings *dst,
struct ethtool_link_ksettings *src);
void ethtool_convert_legacy_u32_to_link_mode(unsigned long *dst,
u32 legacy_u32);
/* return false if src had higher bits set. lower bits always updated. */
bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32,
const unsigned long *src);
#define ETHTOOL_COALESCE_RX_USECS BIT(0)
#define ETHTOOL_COALESCE_RX_MAX_FRAMES BIT(1)
#define ETHTOOL_COALESCE_RX_USECS_IRQ BIT(2)
#define ETHTOOL_COALESCE_RX_MAX_FRAMES_IRQ BIT(3)
#define ETHTOOL_COALESCE_TX_USECS BIT(4)
#define ETHTOOL_COALESCE_TX_MAX_FRAMES BIT(5)
#define ETHTOOL_COALESCE_TX_USECS_IRQ BIT(6)
#define ETHTOOL_COALESCE_TX_MAX_FRAMES_IRQ BIT(7)
#define ETHTOOL_COALESCE_STATS_BLOCK_USECS BIT(8)
#define ETHTOOL_COALESCE_USE_ADAPTIVE_RX BIT(9)
#define ETHTOOL_COALESCE_USE_ADAPTIVE_TX BIT(10)
#define ETHTOOL_COALESCE_PKT_RATE_LOW BIT(11)
#define ETHTOOL_COALESCE_RX_USECS_LOW BIT(12)
#define ETHTOOL_COALESCE_RX_MAX_FRAMES_LOW BIT(13)
#define ETHTOOL_COALESCE_TX_USECS_LOW BIT(14)
#define ETHTOOL_COALESCE_TX_MAX_FRAMES_LOW BIT(15)
#define ETHTOOL_COALESCE_PKT_RATE_HIGH BIT(16)
#define ETHTOOL_COALESCE_RX_USECS_HIGH BIT(17)
#define ETHTOOL_COALESCE_RX_MAX_FRAMES_HIGH BIT(18)
#define ETHTOOL_COALESCE_TX_USECS_HIGH BIT(19)
#define ETHTOOL_COALESCE_TX_MAX_FRAMES_HIGH BIT(20)
#define ETHTOOL_COALESCE_RATE_SAMPLE_INTERVAL BIT(21)
#define ETHTOOL_COALESCE_USE_CQE_RX BIT(22)
#define ETHTOOL_COALESCE_USE_CQE_TX BIT(23)
#define ETHTOOL_COALESCE_TX_AGGR_MAX_BYTES BIT(24)
#define ETHTOOL_COALESCE_TX_AGGR_MAX_FRAMES BIT(25)
#define ETHTOOL_COALESCE_TX_AGGR_TIME_USECS BIT(26)
#define ETHTOOL_COALESCE_ALL_PARAMS GENMASK(26, 0)
#define ETHTOOL_COALESCE_USECS \
(ETHTOOL_COALESCE_RX_USECS | ETHTOOL_COALESCE_TX_USECS)
#define ETHTOOL_COALESCE_MAX_FRAMES \
(ETHTOOL_COALESCE_RX_MAX_FRAMES | ETHTOOL_COALESCE_TX_MAX_FRAMES)
#define ETHTOOL_COALESCE_USECS_IRQ \
(ETHTOOL_COALESCE_RX_USECS_IRQ | ETHTOOL_COALESCE_TX_USECS_IRQ)
#define ETHTOOL_COALESCE_MAX_FRAMES_IRQ \
(ETHTOOL_COALESCE_RX_MAX_FRAMES_IRQ | \
ETHTOOL_COALESCE_TX_MAX_FRAMES_IRQ)
#define ETHTOOL_COALESCE_USE_ADAPTIVE \
(ETHTOOL_COALESCE_USE_ADAPTIVE_RX | ETHTOOL_COALESCE_USE_ADAPTIVE_TX)
#define ETHTOOL_COALESCE_USECS_LOW_HIGH \
(ETHTOOL_COALESCE_RX_USECS_LOW | ETHTOOL_COALESCE_TX_USECS_LOW | \
ETHTOOL_COALESCE_RX_USECS_HIGH | ETHTOOL_COALESCE_TX_USECS_HIGH)
#define ETHTOOL_COALESCE_MAX_FRAMES_LOW_HIGH \
(ETHTOOL_COALESCE_RX_MAX_FRAMES_LOW | \
ETHTOOL_COALESCE_TX_MAX_FRAMES_LOW | \
ETHTOOL_COALESCE_RX_MAX_FRAMES_HIGH | \
ETHTOOL_COALESCE_TX_MAX_FRAMES_HIGH)
#define ETHTOOL_COALESCE_PKT_RATE_RX_USECS \
(ETHTOOL_COALESCE_USE_ADAPTIVE_RX | \
ETHTOOL_COALESCE_RX_USECS_LOW | ETHTOOL_COALESCE_RX_USECS_HIGH | \
ETHTOOL_COALESCE_PKT_RATE_LOW | ETHTOOL_COALESCE_PKT_RATE_HIGH | \
ETHTOOL_COALESCE_RATE_SAMPLE_INTERVAL)
#define ETHTOOL_COALESCE_USE_CQE \
(ETHTOOL_COALESCE_USE_CQE_RX | ETHTOOL_COALESCE_USE_CQE_TX)
#define ETHTOOL_COALESCE_TX_AGGR \
(ETHTOOL_COALESCE_TX_AGGR_MAX_BYTES | \
ETHTOOL_COALESCE_TX_AGGR_MAX_FRAMES | \
ETHTOOL_COALESCE_TX_AGGR_TIME_USECS)
#define ETHTOOL_STAT_NOT_SET (~0ULL)
static inline void ethtool_stats_init(u64 *stats, unsigned int n)
{
while (n--)
stats[n] = ETHTOOL_STAT_NOT_SET;
}
/* Basic IEEE 802.3 MAC statistics (30.3.1.1.*), not otherwise exposed
* via a more targeted API.
*/
struct ethtool_eth_mac_stats {
enum ethtool_mac_stats_src src;
struct_group(stats,
u64 FramesTransmittedOK;
u64 SingleCollisionFrames;
u64 MultipleCollisionFrames;
u64 FramesReceivedOK;
u64 FrameCheckSequenceErrors;
u64 AlignmentErrors;
u64 OctetsTransmittedOK;
u64 FramesWithDeferredXmissions;
u64 LateCollisions;
u64 FramesAbortedDueToXSColls;
u64 FramesLostDueToIntMACXmitError;
u64 CarrierSenseErrors;
u64 OctetsReceivedOK;
u64 FramesLostDueToIntMACRcvError;
u64 MulticastFramesXmittedOK;
u64 BroadcastFramesXmittedOK;
u64 FramesWithExcessiveDeferral;
u64 MulticastFramesReceivedOK;
u64 BroadcastFramesReceivedOK;
u64 InRangeLengthErrors;
u64 OutOfRangeLengthField;
u64 FrameTooLongErrors;
);
};
/* Basic IEEE 802.3 PHY statistics (30.3.2.1.*), not otherwise exposed
* via a more targeted API.
*/
struct ethtool_eth_phy_stats {
enum ethtool_mac_stats_src src;
struct_group(stats,
u64 SymbolErrorDuringCarrier;
);
};
/* Basic IEEE 802.3 MAC Ctrl statistics (30.3.3.*), not otherwise exposed
* via a more targeted API.
*/
struct ethtool_eth_ctrl_stats {
enum ethtool_mac_stats_src src;
struct_group(stats,
u64 MACControlFramesTransmitted;
u64 MACControlFramesReceived;
u64 UnsupportedOpcodesReceived;
);
};
/**
* struct ethtool_pause_stats - statistics for IEEE 802.3x pause frames
* @src: input field denoting whether stats should be queried from the eMAC or
* pMAC (if the MM layer is supported). To be ignored otherwise.
* @tx_pause_frames: transmitted pause frame count. Reported to user space
* as %ETHTOOL_A_PAUSE_STAT_TX_FRAMES.
*
* Equivalent to `30.3.4.2 aPAUSEMACCtrlFramesTransmitted`
* from the standard.
*
* @rx_pause_frames: received pause frame count. Reported to user space
* as %ETHTOOL_A_PAUSE_STAT_RX_FRAMES. Equivalent to:
*
* Equivalent to `30.3.4.3 aPAUSEMACCtrlFramesReceived`
* from the standard.
*/
struct ethtool_pause_stats {
enum ethtool_mac_stats_src src;
struct_group(stats,
u64 tx_pause_frames;
u64 rx_pause_frames;
);
};
#define ETHTOOL_MAX_LANES 8
/**
* struct ethtool_fec_stats - statistics for IEEE 802.3 FEC
* @corrected_blocks: number of received blocks corrected by FEC
* Reported to user space as %ETHTOOL_A_FEC_STAT_CORRECTED.
*
* Equivalent to `30.5.1.1.17 aFECCorrectedBlocks` from the standard.
*
* @uncorrectable_blocks: number of received blocks FEC was not able to correct
* Reported to user space as %ETHTOOL_A_FEC_STAT_UNCORR.
*
* Equivalent to `30.5.1.1.18 aFECUncorrectableBlocks` from the standard.
*
* @corrected_bits: number of bits corrected by FEC
* Similar to @corrected_blocks but counts individual bit changes,
* not entire FEC data blocks. This is a non-standard statistic.
* Reported to user space as %ETHTOOL_A_FEC_STAT_CORR_BITS.
*
* @lane: per-lane/PCS-instance counts as defined by the standard
* @total: error counts for the entire port, for drivers incapable of reporting
* per-lane stats
*
* Drivers should fill in either only total or per-lane statistics, core
* will take care of adding lane values up to produce the total.
*/
struct ethtool_fec_stats {
struct ethtool_fec_stat {
u64 total;
u64 lanes[ETHTOOL_MAX_LANES];
} corrected_blocks, uncorrectable_blocks, corrected_bits;
};
/**
* struct ethtool_rmon_hist_range - byte range for histogram statistics
* @low: low bound of the bucket (inclusive)
* @high: high bound of the bucket (inclusive)
*/
struct ethtool_rmon_hist_range {
u16 low;
u16 high;
};
#define ETHTOOL_RMON_HIST_MAX 10
/**
* struct ethtool_rmon_stats - selected RMON (RFC 2819) statistics
* @src: input field denoting whether stats should be queried from the eMAC or
* pMAC (if the MM layer is supported). To be ignored otherwise.
* @undersize_pkts: Equivalent to `etherStatsUndersizePkts` from the RFC.
* @oversize_pkts: Equivalent to `etherStatsOversizePkts` from the RFC.
* @fragments: Equivalent to `etherStatsFragments` from the RFC.
* @jabbers: Equivalent to `etherStatsJabbers` from the RFC.
* @hist: Packet counter for packet length buckets (e.g.
* `etherStatsPkts128to255Octets` from the RFC).
* @hist_tx: Tx counters in similar form to @hist, not defined in the RFC.
*
* Selection of RMON (RFC 2819) statistics which are not exposed via different
* APIs, primarily the packet-length-based counters.
* Unfortunately different designs choose different buckets beyond
* the 1024B mark (jumbo frame teritory), so the definition of the bucket
* ranges is left to the driver.
*/
struct ethtool_rmon_stats {
enum ethtool_mac_stats_src src;
struct_group(stats,
u64 undersize_pkts;
u64 oversize_pkts;
u64 fragments;
u64 jabbers;
u64 hist[ETHTOOL_RMON_HIST_MAX];
u64 hist_tx[ETHTOOL_RMON_HIST_MAX];
);
};
#define ETH_MODULE_EEPROM_PAGE_LEN 128
#define ETH_MODULE_MAX_I2C_ADDRESS 0x7f
/**
* struct ethtool_module_eeprom - EEPROM dump from specified page
* @offset: Offset within the specified EEPROM page to begin read, in bytes.
* @length: Number of bytes to read.
* @page: Page number to read from.
* @bank: Page bank number to read from, if applicable by EEPROM spec.
* @i2c_address: I2C address of a page. Value less than 0x7f expected. Most
* EEPROMs use 0x50 or 0x51.
* @data: Pointer to buffer with EEPROM data of @length size.
*
* This can be used to manage pages during EEPROM dump in ethtool and pass
* required information to the driver.
*/
struct ethtool_module_eeprom {
u32 offset;
u32 length;
u8 page;
u8 bank;
u8 i2c_address;
u8 *data;
};
ethtool: Add ability to control transceiver modules' power mode Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver modules parameters and retrieve their status. The first parameter to control is the power mode of the module. It is only relevant for paged memory modules, as flat memory modules always operate in low power mode. When a paged memory module is in low power mode, its power consumption is reduced to the minimum, the management interface towards the host is available and the data path is deactivated. User space can choose to put modules that are not currently in use in low power mode and transition them to high power mode before putting the associated ports administratively up. This is useful for user space that favors reduced power consumption and lower temperatures over reduced link up times. In QSFP-DD modules the transition from low power mode to high power mode can take a few seconds and this transition is only expected to get longer with future / more complex modules. User space can control the power mode of the module via the power mode policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible values: * high: Module is always in high power mode. * auto: Module is transitioned by the host to high power mode when the first port using it is put administratively up and to low power mode when the last port using it is put administratively down. The operational power mode of the module is available to user space via the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not reported to user space when a module is not plugged-in. The user API is designed to be generic enough so that it could be used for modules with different memory maps (e.g., SFF-8636, CMIS). The only implementation of the device driver API in this series is for a MAC driver (mlxsw) where the module is controlled by the device's firmware, but it is designed to be generic enough so that it could also be used by implementations where the module is controlled by the CPU. CMIS testing ============ # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off The module is not in low power mode, as it is not forced by hardware (LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off). The power mode can be queried from the kernel. In case LowPwrAllowRequestHW was on, the kernel would need to take into account the state of the LowPwrRequestHW signal, which is not visible to user space. $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp11 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp11 up Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp11 down Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On SFF-8636 testing ================ # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7733 mW / -1.12 dBm Transmit avg optical power (Channel 2) : 0.7649 mW / -1.16 dBm Transmit avg optical power (Channel 3) : 0.7790 mW / -1.08 dBm Transmit avg optical power (Channel 4) : 0.7837 mW / -1.06 dBm Rcvr signal avg optical power(Channel 1) : 0.9302 mW / -0.31 dBm Rcvr signal avg optical power(Channel 2) : 0.9079 mW / -0.42 dBm Rcvr signal avg optical power(Channel 3) : 0.8993 mW / -0.46 dBm Rcvr signal avg optical power(Channel 4) : 0.8778 mW / -0.57 dBm The module is not in low power mode, as it is not forced by hardware (Power override is on) or by software (Power set is off). The power mode can be queried from the kernel. In case Power override was off, the kernel would need to take into account the state of the LPMode signal, which is not visible to user space. $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp13 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp13 up Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7934 mW / -1.01 dBm Transmit avg optical power (Channel 2) : 0.7859 mW / -1.05 dBm Transmit avg optical power (Channel 3) : 0.7885 mW / -1.03 dBm Transmit avg optical power (Channel 4) : 0.7985 mW / -0.98 dBm Rcvr signal avg optical power(Channel 1) : 0.9325 mW / -0.30 dBm Rcvr signal avg optical power(Channel 2) : 0.9034 mW / -0.44 dBm Rcvr signal avg optical power(Channel 3) : 0.9086 mW / -0.42 dBm Rcvr signal avg optical power(Channel 4) : 0.8885 mW / -0.51 dBm Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp13 down Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-06 10:46:42 +00:00
/**
* struct ethtool_module_power_mode_params - module power mode parameters
* @policy: The power mode policy enforced by the host for the plug-in module.
* @mode: The operational power mode of the plug-in module. Should be filled by
* device drivers on get operations.
*/
struct ethtool_module_power_mode_params {
enum ethtool_module_power_mode_policy policy;
enum ethtool_module_power_mode mode;
};
net: ethtool: add support for MAC Merge layer The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2 specifications (the other being Frame Preemption; IEEE 802.1Q-2018 clause 6.7.2), which work together to minimize latency caused by frame interference at TX. The overall goal of TSN is for normal traffic and traffic with a bounded deadline to be able to cohabitate on the same L2 network and not bother each other too much. The standards achieve this (partly) by introducing the concept of preemptible traffic, i.e. Ethernet frames that have a custom value for the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented and reassembled at L2 on a link-local basis. The non-preemptible frames are called express traffic, they are transmitted using a normal SFD, and they can preempt preemptible frames, therefore having lower latency, which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo frames around 9K). Preemption is not recursive, i.e. a P frame cannot preempt another P frame. Preemption also does not depend upon priority, or otherwise said, an E frame with prio 0 will still preempt a P frame with prio 7. In terms of implementation, the standards talk about the presence of an express MAC (eMAC) which handles express traffic, and a preemptible MAC (pMAC) which handles preemptible traffic, and these MACs are multiplexed on the same MII by a MAC merge layer. To support frame preemption, the definition of the SFD was generalized to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an Ethernet frame fragment, or a complete frame. Stations unaware of an SMD value different from the standard SFD will treat P frames as error frames. To prevent that from happening, a negotiation process is defined. On RX, packets are dispatched to the eMAC or pMAC after being filtered by their SMD. On TX, the eMAC/pMAC classification decision is taken by the 802.1Q spec, based on packet priority (each of the 8 user priority values may have an admin-status of preemptible or express). The MAC Merge layer and the Frame Preemption parameters have some degree of independence in terms of how software stacks are supposed to deal with them. The activation of the MM layer is supposed to be controlled by an LLDP daemon (after it has been communicated that the link partner also supports it), after which a (hardware-based or not) verification handshake takes place, before actually enabling the feature. So the process is intended to be relatively plug-and-play. Whereas FP settings are supposed to be coordinated across a network using something approximating NETCONF. The support contained here is exclusively for the 802.3 (MAC Merge) portions and not for the 802.1Q (Frame Preemption) parts. This API is sufficient for an LLDP daemon to do its job. The FP adminStatus variable from 802.1Q is outside the scope of an LLDP daemon. I have taken a few creative licenses and augmented the Linux kernel UAPI compared to the standard managed objects recommended by IEEE 802.3. These are: - ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive Processing state diagram, a MAC Merge layer is always supposed to be able to receive P frames. However, this implies keeping the pMAC powered on, which will consume needless power in applications where FP will never be used. If LLDP is used, the reception of an Additional Ethernet Capabilities TLV from the link partner is sufficient indication that the pMAC should be enabled. So my proposal is that in Linux, we keep the pMAC turned off by default and that user space turns it on when needed. - ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in the boolean netlink attributes offered, so this is also positive here. Other than the meaning being reversed, they correspond to the same thing. - ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for a LLDP daemon to maximize the verifyTime variable (delay between SMD-V transmissions), to maximize its chances that the LP replies. IEEE says that the verifyTime can range between 1 and 128 ms, but the NXP ENETC stupidly keeps this variable in a 7 bit register, so the maximum supported value is 127 ms. I could have chosen to hardcode this in the LLDP daemon to a lower value, but why not let the kernel expose its supported range directly. - ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called aMACMergeAddFragSize, and expresses the "additional" fragment size (on top of ETH_ZLEN), whereas this expresses the absolute value of the fragment size. - ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed object mandated by the standard, but user space clearly needs to know what is the minimum supported fragment size of our local receiver, since LLDP must advertise a value no lower than that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-19 12:26:54 +00:00
/**
* struct ethtool_mm_state - 802.3 MAC merge layer state
* @verify_time:
* wait time between verification attempts in ms (according to clause
* 30.14.1.6 aMACMergeVerifyTime)
* @max_verify_time:
* maximum accepted value for the @verify_time variable in set requests
* @verify_status:
* state of the verification state machine of the MM layer (according to
* clause 30.14.1.2 aMACMergeStatusVerify)
* @tx_enabled:
* set if the MM layer is administratively enabled in the TX direction
* (according to clause 30.14.1.3 aMACMergeEnableTx)
* @tx_active:
* set if the MM layer is enabled in the TX direction, which makes FP
* possible (according to 30.14.1.5 aMACMergeStatusTx). This should be
* true if MM is enabled, and the verification status is either verified,
* or disabled.
* @pmac_enabled:
* set if the preemptible MAC is powered on and is able to receive
* preemptible packets and respond to verification frames.
* @verify_enabled:
* set if the Verify function of the MM layer (which sends SMD-V
* verification requests) is administratively enabled (regardless of
* whether it is currently in the ETHTOOL_MM_VERIFY_STATUS_DISABLED state
* or not), according to clause 30.14.1.4 aMACMergeVerifyDisableTx (but
* using positive rather than negative logic). The device should always
* respond to received SMD-V requests as long as @pmac_enabled is set.
* @tx_min_frag_size:
* the minimum size of non-final mPacket fragments that the link partner
* supports receiving, expressed in octets. Compared to the definition
* from clause 30.14.1.7 aMACMergeAddFragSize which is expressed in the
* range 0 to 3 (requiring a translation to the size in octets according
* to the formula 64 * (1 + addFragSize) - 4), a value in a continuous and
* unbounded range can be specified here.
* @rx_min_frag_size:
* the minimum size of non-final mPacket fragments that this device
* supports receiving, expressed in octets.
*/
struct ethtool_mm_state {
u32 verify_time;
u32 max_verify_time;
enum ethtool_mm_verify_status verify_status;
bool tx_enabled;
bool tx_active;
bool pmac_enabled;
bool verify_enabled;
u32 tx_min_frag_size;
u32 rx_min_frag_size;
};
/**
* struct ethtool_mm_cfg - 802.3 MAC merge layer configuration
* @verify_time: see struct ethtool_mm_state
* @verify_enabled: see struct ethtool_mm_state
* @tx_enabled: see struct ethtool_mm_state
* @pmac_enabled: see struct ethtool_mm_state
* @tx_min_frag_size: see struct ethtool_mm_state
*/
struct ethtool_mm_cfg {
u32 verify_time;
bool verify_enabled;
bool tx_enabled;
bool pmac_enabled;
u32 tx_min_frag_size;
};
/**
* struct ethtool_mm_stats - 802.3 MAC merge layer statistics
* @MACMergeFrameAssErrorCount:
* received MAC frames with reassembly errors
* @MACMergeFrameSmdErrorCount:
* received MAC frames/fragments rejected due to unknown or incorrect SMD
* @MACMergeFrameAssOkCount:
* received MAC frames that were successfully reassembled and passed up
* @MACMergeFragCountRx:
* number of additional correct SMD-C mPackets received due to preemption
* @MACMergeFragCountTx:
* number of additional mPackets sent due to preemption
* @MACMergeHoldCount:
* number of times the MM layer entered the HOLD state, which blocks
* transmission of preemptible traffic
*/
struct ethtool_mm_stats {
u64 MACMergeFrameAssErrorCount;
u64 MACMergeFrameSmdErrorCount;
u64 MACMergeFrameAssOkCount;
u64 MACMergeFragCountRx;
u64 MACMergeFragCountTx;
u64 MACMergeHoldCount;
};
/**
* struct ethtool_ops - optional netdev operations
ethtool: Extend link modes settings uAPI with lanes Currently, when auto negotiation is on, the user can advertise all the linkmodes which correspond to a specific speed, but does not have a similar selector for the number of lanes. This is significant when a specific speed can be achieved using different number of lanes. For example, 2x50 or 4x25. Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct ethtool_link_settings' with lanes field in order to implement a new lanes-selector that will enable the user to advertise a specific number of lanes as well. When auto negotiation is off, lanes parameter can be forced only if the driver supports it. Add a capability bit in 'struct ethtool_ops' that allows ethtool know if the driver can handle the lanes parameter when auto negotiation is off, so if it does not, an error message will be returned when trying to set lanes. Example: $ ethtool -s swp1 lanes 4 $ ethtool swp1 Settings for swp1: Supported ports: [ FIBRE ] Supported link modes: 1000baseKX/Full 10000baseKR/Full 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 25000baseCR/Full 25000baseSR/Full 50000baseCR2/Full 100000baseSR4/Full 100000baseCR4/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 100000baseSR4/Full 100000baseCR4/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: Not reported Speed: Unknown! Duplex: Unknown! (255) Auto-negotiation: on Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Link detected: no Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02 18:06:06 +00:00
* @cap_link_lanes_supported: indicates if the driver supports lanes
* parameter.
* @supported_coalesce_params: supported types of interrupt coalescing.
* @supported_ring_params: supported ring params.
ethtool: doc: clarify what drivers can implement in their get_drvinfo() Many of the drivers which implement ethtool_ops::get_drvinfo() will prints the .driver, .version or .bus_info of struct ethtool_drvinfo. To have a glance of current state, do: $ git grep -W "get_drvinfo(struct" Printing in those three fields is useless because: - since [1], the driver version should be the kernel version (at least for upstream drivers). Arguably, out of tree drivers might still want to set a custom version, but out of tree is not our focus. - since [2], the core is able to provide default values for .driver and .bus_info. In summary, drivers may provide .fw_version and .erom_version, the rest is expected to be done by the core. In struct ethtool_ops doc from linux/ethtool: rephrase field get_drvinfo() doc to discourage developers from implementing this callback. In struct ethtool_drvinfo doc from uapi/linux/ethtool.h: remove the paragraph mentioning what drivers should do. Rationale: no need to repeat what is already written in struct ethtool_ops doc. But add a note that .fw_version and .erom_version are driver defined. Also update the dummy driver and simply remove the callback in order not to confuse the newcomers: most of the drivers will not need this callback function any more. [1] commit 6a7e25c7fb48 ("net/core: Replace driver version to be kernel version") Link: https://git.kernel.org/torvalds/linux/c/6a7e25c7fb48 [2] commit edaf5df22cb8 ("ethtool: ethtool_get_drvinfo: populate drvinfo fields even if callback exits") Link: https://git.kernel.org/netdev/net-next/c/edaf5df22cb8 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/r/20221116171828.4093-1-mailhol.vincent@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-16 17:18:28 +00:00
* @get_drvinfo: Report driver/device information. Modern drivers no
* longer have to implement this callback. Most fields are
* correctly filled in by the core using system information, or
* populated using other driver operations.
* @get_regs_len: Get buffer length required for @get_regs
* @get_regs: Get device registers
* @get_wol: Report whether Wake-on-Lan is enabled
* @set_wol: Turn Wake-on-Lan on or off. Returns a negative error code
* or zero.
* @get_msglevel: Report driver message level. This should be the value
* of the @msg_enable field used by netif logging functions.
* @set_msglevel: Set driver message level
* @nway_reset: Restart autonegotiation. Returns a negative error code
* or zero.
* @get_link: Report whether physical link is up. Will only be called if
* the netdev is up. Should usually be set to ethtool_op_get_link(),
* which uses netif_carrier_ok().
* @get_link_ext_state: Report link extended state. Should set link_ext_state and
* link_ext_substate (link_ext_substate of 0 means link_ext_substate is unknown,
* do not attach ext_substate attribute to netlink message). If link_ext_state
* and link_ext_substate are unknown, return -ENODATA. If not implemented,
* link_ext_state and link_ext_substate will not be sent to userspace.
* @get_link_ext_stats: Read extra link-related counters.
* @get_eeprom_len: Read range of EEPROM addresses for validation of
* @get_eeprom and @set_eeprom requests.
* Returns 0 if device does not support EEPROM access.
* @get_eeprom: Read data from the device EEPROM.
* Should fill in the magic field. Don't need to check len for zero
* or wraparound. Fill in the data argument with the eeprom values
* from offset to offset + len. Update len to the amount read.
* Returns an error or zero.
* @set_eeprom: Write data to the device EEPROM.
* Should validate the magic field. Don't need to check len for zero
* or wraparound. Update len to the amount written. Returns an error
* or zero.
* @get_coalesce: Get interrupt coalescing parameters. Returns a negative
* error code or zero.
* @set_coalesce: Set interrupt coalescing parameters. Supported coalescing
* types should be set in @supported_coalesce_params.
* Returns a negative error code or zero.
* @get_ringparam: Report ring sizes
* @set_ringparam: Set ring sizes. Returns a negative error code or zero.
* @get_pause_stats: Report pause frame statistics. Drivers must not zero
* statistics which they don't report. The stats structure is initialized
* to ETHTOOL_STAT_NOT_SET indicating driver does not report statistics.
* @get_pauseparam: Report pause parameters
* @set_pauseparam: Set pause parameters. Returns a negative error code
* or zero.
* @self_test: Run specified self-tests
* @get_strings: Return a set of strings that describe the requested objects
* @set_phys_id: Identify the physical devices, e.g. by flashing an LED
* attached to it. The implementation may update the indicator
* asynchronously or synchronously, but in either case it must return
* quickly. It is initially called with the argument %ETHTOOL_ID_ACTIVE,
ethtool: allow custom interval for physical identification When physical identification of an adapter is done by toggling the mechanism on and off through software utilizing the set_phys_id operation, it is done with a fixed duration for both on and off states. Some drivers may want to set a custom duration for the on/off intervals. This patch changes the API so the return code from the driver's entry point when it is called with ETHTOOL_ID_ACTIVE can specify the frequency at which to cycle the on/off states, and updates the drivers that have already been converted to use the new set_phys_id and use the synchronous method for identifying an adapter. The physical identification frequency set in the updated drivers is based on how it was done prior to the introduction of set_phys_id. Compile tested only. Also fixes a compiler warning in sfc. v2: drivers do not return -EINVAL for ETHOOL_ID_ACTIVE v3: fold patchset into single patch and cleanup per Ben's feedback Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Eilon Greenstein <eilong@broadcom.com> Cc: Divy Le Ray <divy@chelsio.com> Cc: Don Fry <pcnet32@frontier.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Steve Hodgson <shodgson@solarflare.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Matt Carlson <mcarlson@broadcom.com> Acked-by: Jon Mason <jdmason@kudzu.us> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-13 13:09:10 +00:00
* and must either activate asynchronous updates and return zero, return
* a negative error or return a positive frequency for synchronous
* indication (e.g. 1 for one on/off cycle per second). If it returns
* a frequency then it will be called again at intervals with the
* argument %ETHTOOL_ID_ON or %ETHTOOL_ID_OFF and should set the state of
* the indicator accordingly. Finally, it is called with the argument
* %ETHTOOL_ID_INACTIVE and must deactivate the indicator. Returns a
* negative error code or zero.
* @get_ethtool_stats: Return extended statistics about the device.
* This is only useful if the device maintains statistics not
* included in &struct rtnl_link_stats64.
* @begin: Function to be called before any other operation. Returns a
* negative error code or zero.
* @complete: Function to be called after any other operation except
* @begin. Will be called even if the other operation failed.
* @get_priv_flags: Report driver-specific feature flags.
* @set_priv_flags: Set driver-specific feature flags. Returns a negative
* error code or zero.
* @get_sset_count: Get number of strings that @get_strings will write.
* @get_rxnfc: Get RX flow classification rules. Returns a negative
* error code or zero.
* @set_rxnfc: Set RX flow classification rules. Returns a negative
* error code or zero.
* @flash_device: Write a firmware image to device's flash memory.
* Returns a negative error code or zero.
* @reset: Reset (part of) the device, as specified by a bitmask of
* flags from &enum ethtool_reset_flags. Returns a negative
* error code or zero.
* @get_rxfh_key_size: Get the size of the RX flow hash key.
* Returns zero if not supported for this specific device.
* @get_rxfh_indir_size: Get the size of the RX flow hash indirection table.
* Returns zero if not supported for this specific device.
ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 16:12:10 +00:00
* @get_rxfh: Get the contents of the RX flow hash indirection table, hash key
* and/or hash function.
* Returns a negative error code or zero.
ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 16:12:10 +00:00
* @set_rxfh: Set the contents of the RX flow hash indirection table, hash
* key, and/or hash function. Arguments which are set to %NULL or zero
* will remain unchanged.
* Returns a negative error code or zero. An error code must be returned
* if at least one unsupported change was requested.
* @get_rxfh_context: Get the contents of the RX flow hash indirection table,
* hash key, and/or hash function assiciated to the given rss context.
* Returns a negative error code or zero.
* @set_rxfh_context: Create, remove and configure RSS contexts. Allows setting
* the contents of the RX flow hash indirection table, hash key, and/or
* hash function associated to the given context. Arguments which are set
* to %NULL or zero will remain unchanged.
* Returns a negative error code or zero. An error code must be returned
* if at least one unsupported change was requested.
* @get_channels: Get number of channels.
* @set_channels: Set number of channels. Returns a negative error code or
* zero.
* @get_dump_flag: Get dump flag indicating current dump length, version,
* and flag of the device.
* @get_dump_data: Get dump data.
* @set_dump: Set dump specific flags to the device.
* @get_ts_info: Get the time stamping and PTP hardware clock capabilities.
* It may be called with RCU, or rtnl or reference on the device.
* Drivers supporting transmit time stamps in software should set this to
* ethtool_op_get_ts_info().
* @get_module_info: Get the size and type of the eeprom contained within
* a plug-in module.
* @get_module_eeprom: Get the eeprom information from the plug-in module
* @get_eee: Get Energy-Efficient (EEE) supported and status.
* @set_eee: Set EEE status (enable/disable) as well as LPI timers.
* @get_tunable: Read the value of a driver / device tunable.
* @set_tunable: Set the value of a driver / device tunable.
* @get_per_queue_coalesce: Get interrupt coalescing parameters per queue.
* It must check that the given queue number is valid. If neither a RX nor
* a TX queue has this number, return -EINVAL. If only a RX queue or a TX
* queue has this number, set the inapplicable fields to ~0 and return 0.
* Returns a negative error code or zero.
* @set_per_queue_coalesce: Set interrupt coalescing parameters per queue.
* It must check that the given queue number is valid. If neither a RX nor
* a TX queue has this number, return -EINVAL. If only a RX queue or a TX
* queue has this number, ignore the inapplicable fields. Supported
* coalescing types should be set in @supported_coalesce_params.
* Returns a negative error code or zero.
* @get_link_ksettings: Get various device settings including Ethernet link
* settings. The %cmd and %link_mode_masks_nwords fields should be
* ignored (use %__ETHTOOL_LINK_MODE_MASK_NBITS instead of the latter),
* any change to them will be overwritten by kernel. Returns a negative
* error code or zero.
* @set_link_ksettings: Set various device settings including Ethernet link
* settings. The %cmd and %link_mode_masks_nwords fields should be
* ignored (use %__ETHTOOL_LINK_MODE_MASK_NBITS instead of the latter),
* any change to them will be overwritten by kernel. Returns a negative
* error code or zero.
* @get_fec_stats: Report FEC statistics.
* Core will sum up per-lane stats to get the total.
* Drivers must not zero statistics which they don't report. The stats
* structure is initialized to ETHTOOL_STAT_NOT_SET indicating driver does
* not report statistics.
* @get_fecparam: Get the network device Forward Error Correction parameters.
* @set_fecparam: Set the network device Forward Error Correction parameters.
* @get_ethtool_phy_stats: Return extended statistics about the PHY device.
* This is only useful if the device maintains PHY statistics and
* cannot use the standard PHY library helpers.
* @get_phy_tunable: Read the value of a PHY tunable.
* @set_phy_tunable: Set the value of a PHY tunable.
* @get_module_eeprom_by_page: Get a region of plug-in module EEPROM data from
* specified page. Returns a negative error code or the amount of bytes
* read.
* @get_eth_phy_stats: Query some of the IEEE 802.3 PHY statistics.
* @get_eth_mac_stats: Query some of the IEEE 802.3 MAC statistics.
* @get_eth_ctrl_stats: Query some of the IEEE 802.3 MAC Ctrl statistics.
* @get_rmon_stats: Query some of the RMON (RFC 2819) statistics.
* Set %ranges to a pointer to zero-terminated array of byte ranges.
ethtool: Add ability to control transceiver modules' power mode Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver modules parameters and retrieve their status. The first parameter to control is the power mode of the module. It is only relevant for paged memory modules, as flat memory modules always operate in low power mode. When a paged memory module is in low power mode, its power consumption is reduced to the minimum, the management interface towards the host is available and the data path is deactivated. User space can choose to put modules that are not currently in use in low power mode and transition them to high power mode before putting the associated ports administratively up. This is useful for user space that favors reduced power consumption and lower temperatures over reduced link up times. In QSFP-DD modules the transition from low power mode to high power mode can take a few seconds and this transition is only expected to get longer with future / more complex modules. User space can control the power mode of the module via the power mode policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible values: * high: Module is always in high power mode. * auto: Module is transitioned by the host to high power mode when the first port using it is put administratively up and to low power mode when the last port using it is put administratively down. The operational power mode of the module is available to user space via the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not reported to user space when a module is not plugged-in. The user API is designed to be generic enough so that it could be used for modules with different memory maps (e.g., SFF-8636, CMIS). The only implementation of the device driver API in this series is for a MAC driver (mlxsw) where the module is controlled by the device's firmware, but it is designed to be generic enough so that it could also be used by implementations where the module is controlled by the CPU. CMIS testing ============ # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off The module is not in low power mode, as it is not forced by hardware (LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off). The power mode can be queried from the kernel. In case LowPwrAllowRequestHW was on, the kernel would need to take into account the state of the LowPwrRequestHW signal, which is not visible to user space. $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp11 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp11 up Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp11 down Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On SFF-8636 testing ================ # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7733 mW / -1.12 dBm Transmit avg optical power (Channel 2) : 0.7649 mW / -1.16 dBm Transmit avg optical power (Channel 3) : 0.7790 mW / -1.08 dBm Transmit avg optical power (Channel 4) : 0.7837 mW / -1.06 dBm Rcvr signal avg optical power(Channel 1) : 0.9302 mW / -0.31 dBm Rcvr signal avg optical power(Channel 2) : 0.9079 mW / -0.42 dBm Rcvr signal avg optical power(Channel 3) : 0.8993 mW / -0.46 dBm Rcvr signal avg optical power(Channel 4) : 0.8778 mW / -0.57 dBm The module is not in low power mode, as it is not forced by hardware (Power override is on) or by software (Power set is off). The power mode can be queried from the kernel. In case Power override was off, the kernel would need to take into account the state of the LPMode signal, which is not visible to user space. $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp13 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp13 up Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7934 mW / -1.01 dBm Transmit avg optical power (Channel 2) : 0.7859 mW / -1.05 dBm Transmit avg optical power (Channel 3) : 0.7885 mW / -1.03 dBm Transmit avg optical power (Channel 4) : 0.7985 mW / -0.98 dBm Rcvr signal avg optical power(Channel 1) : 0.9325 mW / -0.30 dBm Rcvr signal avg optical power(Channel 2) : 0.9034 mW / -0.44 dBm Rcvr signal avg optical power(Channel 3) : 0.9086 mW / -0.42 dBm Rcvr signal avg optical power(Channel 4) : 0.8885 mW / -0.51 dBm Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp13 down Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-06 10:46:42 +00:00
* @get_module_power_mode: Get the power mode policy for the plug-in module
* used by the network device and its operational power mode, if
* plugged-in.
* @set_module_power_mode: Set the power mode policy for the plug-in module
* used by the network device.
net: ethtool: add support for MAC Merge layer The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2 specifications (the other being Frame Preemption; IEEE 802.1Q-2018 clause 6.7.2), which work together to minimize latency caused by frame interference at TX. The overall goal of TSN is for normal traffic and traffic with a bounded deadline to be able to cohabitate on the same L2 network and not bother each other too much. The standards achieve this (partly) by introducing the concept of preemptible traffic, i.e. Ethernet frames that have a custom value for the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented and reassembled at L2 on a link-local basis. The non-preemptible frames are called express traffic, they are transmitted using a normal SFD, and they can preempt preemptible frames, therefore having lower latency, which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo frames around 9K). Preemption is not recursive, i.e. a P frame cannot preempt another P frame. Preemption also does not depend upon priority, or otherwise said, an E frame with prio 0 will still preempt a P frame with prio 7. In terms of implementation, the standards talk about the presence of an express MAC (eMAC) which handles express traffic, and a preemptible MAC (pMAC) which handles preemptible traffic, and these MACs are multiplexed on the same MII by a MAC merge layer. To support frame preemption, the definition of the SFD was generalized to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an Ethernet frame fragment, or a complete frame. Stations unaware of an SMD value different from the standard SFD will treat P frames as error frames. To prevent that from happening, a negotiation process is defined. On RX, packets are dispatched to the eMAC or pMAC after being filtered by their SMD. On TX, the eMAC/pMAC classification decision is taken by the 802.1Q spec, based on packet priority (each of the 8 user priority values may have an admin-status of preemptible or express). The MAC Merge layer and the Frame Preemption parameters have some degree of independence in terms of how software stacks are supposed to deal with them. The activation of the MM layer is supposed to be controlled by an LLDP daemon (after it has been communicated that the link partner also supports it), after which a (hardware-based or not) verification handshake takes place, before actually enabling the feature. So the process is intended to be relatively plug-and-play. Whereas FP settings are supposed to be coordinated across a network using something approximating NETCONF. The support contained here is exclusively for the 802.3 (MAC Merge) portions and not for the 802.1Q (Frame Preemption) parts. This API is sufficient for an LLDP daemon to do its job. The FP adminStatus variable from 802.1Q is outside the scope of an LLDP daemon. I have taken a few creative licenses and augmented the Linux kernel UAPI compared to the standard managed objects recommended by IEEE 802.3. These are: - ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive Processing state diagram, a MAC Merge layer is always supposed to be able to receive P frames. However, this implies keeping the pMAC powered on, which will consume needless power in applications where FP will never be used. If LLDP is used, the reception of an Additional Ethernet Capabilities TLV from the link partner is sufficient indication that the pMAC should be enabled. So my proposal is that in Linux, we keep the pMAC turned off by default and that user space turns it on when needed. - ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in the boolean netlink attributes offered, so this is also positive here. Other than the meaning being reversed, they correspond to the same thing. - ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for a LLDP daemon to maximize the verifyTime variable (delay between SMD-V transmissions), to maximize its chances that the LP replies. IEEE says that the verifyTime can range between 1 and 128 ms, but the NXP ENETC stupidly keeps this variable in a 7 bit register, so the maximum supported value is 127 ms. I could have chosen to hardcode this in the LLDP daemon to a lower value, but why not let the kernel expose its supported range directly. - ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called aMACMergeAddFragSize, and expresses the "additional" fragment size (on top of ETH_ZLEN), whereas this expresses the absolute value of the fragment size. - ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed object mandated by the standard, but user space clearly needs to know what is the minimum supported fragment size of our local receiver, since LLDP must advertise a value no lower than that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-19 12:26:54 +00:00
* @get_mm: Query the 802.3 MAC Merge layer state.
* @set_mm: Set the 802.3 MAC Merge layer parameters.
* @get_mm_stats: Query the 802.3 MAC Merge layer statistics.
*
* All operations are optional (i.e. the function pointer may be set
* to %NULL) and callers must take this into account. Callers must
* hold the RTNL lock.
*
* See the structures used by these operations for further documentation.
* Note that for all operations using a structure ending with a zero-
* length array, the array is allocated separately in the kernel and
* is passed to the driver as an additional parameter.
*
* See &struct net_device and &struct net_device_ops for documentation
* of the generic netdev features interface.
*/
struct ethtool_ops {
ethtool: Extend link modes settings uAPI with lanes Currently, when auto negotiation is on, the user can advertise all the linkmodes which correspond to a specific speed, but does not have a similar selector for the number of lanes. This is significant when a specific speed can be achieved using different number of lanes. For example, 2x50 or 4x25. Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct ethtool_link_settings' with lanes field in order to implement a new lanes-selector that will enable the user to advertise a specific number of lanes as well. When auto negotiation is off, lanes parameter can be forced only if the driver supports it. Add a capability bit in 'struct ethtool_ops' that allows ethtool know if the driver can handle the lanes parameter when auto negotiation is off, so if it does not, an error message will be returned when trying to set lanes. Example: $ ethtool -s swp1 lanes 4 $ ethtool swp1 Settings for swp1: Supported ports: [ FIBRE ] Supported link modes: 1000baseKX/Full 10000baseKR/Full 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 25000baseCR/Full 25000baseSR/Full 50000baseCR2/Full 100000baseSR4/Full 100000baseCR4/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 100000baseSR4/Full 100000baseCR4/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: Not reported Speed: Unknown! Duplex: Unknown! (255) Auto-negotiation: on Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Link detected: no Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02 18:06:06 +00:00
u32 cap_link_lanes_supported:1;
u32 supported_coalesce_params;
u32 supported_ring_params;
void (*get_drvinfo)(struct net_device *, struct ethtool_drvinfo *);
int (*get_regs_len)(struct net_device *);
void (*get_regs)(struct net_device *, struct ethtool_regs *, void *);
void (*get_wol)(struct net_device *, struct ethtool_wolinfo *);
int (*set_wol)(struct net_device *, struct ethtool_wolinfo *);
u32 (*get_msglevel)(struct net_device *);
void (*set_msglevel)(struct net_device *, u32);
int (*nway_reset)(struct net_device *);
u32 (*get_link)(struct net_device *);
int (*get_link_ext_state)(struct net_device *,
struct ethtool_link_ext_state_info *);
void (*get_link_ext_stats)(struct net_device *dev,
struct ethtool_link_ext_stats *stats);
int (*get_eeprom_len)(struct net_device *);
int (*get_eeprom)(struct net_device *,
struct ethtool_eeprom *, u8 *);
int (*set_eeprom)(struct net_device *,
struct ethtool_eeprom *, u8 *);
int (*get_coalesce)(struct net_device *,
struct ethtool_coalesce *,
struct kernel_ethtool_coalesce *,
struct netlink_ext_ack *);
int (*set_coalesce)(struct net_device *,
struct ethtool_coalesce *,
struct kernel_ethtool_coalesce *,
struct netlink_ext_ack *);
void (*get_ringparam)(struct net_device *,
struct ethtool_ringparam *,
struct kernel_ethtool_ringparam *,
struct netlink_ext_ack *);
int (*set_ringparam)(struct net_device *,
struct ethtool_ringparam *,
struct kernel_ethtool_ringparam *,
struct netlink_ext_ack *);
void (*get_pause_stats)(struct net_device *dev,
struct ethtool_pause_stats *pause_stats);
void (*get_pauseparam)(struct net_device *,
struct ethtool_pauseparam*);
int (*set_pauseparam)(struct net_device *,
struct ethtool_pauseparam*);
void (*self_test)(struct net_device *, struct ethtool_test *, u64 *);
void (*get_strings)(struct net_device *, u32 stringset, u8 *);
int (*set_phys_id)(struct net_device *, enum ethtool_phys_id_state);
void (*get_ethtool_stats)(struct net_device *,
struct ethtool_stats *, u64 *);
int (*begin)(struct net_device *);
void (*complete)(struct net_device *);
u32 (*get_priv_flags)(struct net_device *);
int (*set_priv_flags)(struct net_device *, u32);
int (*get_sset_count)(struct net_device *, int);
int (*get_rxnfc)(struct net_device *,
struct ethtool_rxnfc *, u32 *rule_locs);
int (*set_rxnfc)(struct net_device *, struct ethtool_rxnfc *);
int (*flash_device)(struct net_device *, struct ethtool_flash *);
int (*reset)(struct net_device *, u32 *);
u32 (*get_rxfh_key_size)(struct net_device *);
u32 (*get_rxfh_indir_size)(struct net_device *);
ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 16:12:10 +00:00
int (*get_rxfh)(struct net_device *, u32 *indir, u8 *key,
u8 *hfunc);
int (*set_rxfh)(struct net_device *, const u32 *indir,
ethtool: Support for configurable RSS hash function This patch extends the set/get_rxfh ethtool-options for getting or setting the RSS hash function. It modifies drivers implementation of set/get_rxfh accordingly. This change also delegates the responsibility of checking whether a modification to a certain RX flow hash parameter is supported to the driver implementation of set_rxfh. User-kernel API is done through the new hfunc bitmask field in the ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an index in the new string-set ETH_SS_RSS_HASH_FUNCS. Got approval from most of the relevant driver maintainers that their driver is using Toeplitz, and for the few that didn't answered, also assumed it is Toeplitz. Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Cc: Greg Rose <gregory.v.rose@intel.com> Cc: Matthew Vick <matthew.vick@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Shradha Shah <sshah@solarflare.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02 16:12:10 +00:00
const u8 *key, const u8 hfunc);
int (*get_rxfh_context)(struct net_device *, u32 *indir, u8 *key,
u8 *hfunc, u32 rss_context);
int (*set_rxfh_context)(struct net_device *, const u32 *indir,
const u8 *key, const u8 hfunc,
u32 *rss_context, bool delete);
void (*get_channels)(struct net_device *, struct ethtool_channels *);
int (*set_channels)(struct net_device *, struct ethtool_channels *);
int (*get_dump_flag)(struct net_device *, struct ethtool_dump *);
int (*get_dump_data)(struct net_device *,
struct ethtool_dump *, void *);
int (*set_dump)(struct net_device *, struct ethtool_dump *);
int (*get_ts_info)(struct net_device *, struct ethtool_ts_info *);
int (*get_module_info)(struct net_device *,
struct ethtool_modinfo *);
int (*get_module_eeprom)(struct net_device *,
struct ethtool_eeprom *, u8 *);
int (*get_eee)(struct net_device *, struct ethtool_eee *);
int (*set_eee)(struct net_device *, struct ethtool_eee *);
int (*get_tunable)(struct net_device *,
const struct ethtool_tunable *, void *);
int (*set_tunable)(struct net_device *,
const struct ethtool_tunable *, const void *);
int (*get_per_queue_coalesce)(struct net_device *, u32,
struct ethtool_coalesce *);
int (*set_per_queue_coalesce)(struct net_device *, u32,
struct ethtool_coalesce *);
net: ethtool: add new ETHTOOL_xLINKSETTINGS API This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API, handled by the new get_link_ksettings/set_link_ksettings callbacks. This API provides support for most legacy ethtool_cmd fields, adds support for larger link mode masks (up to 4064 bits, variable length), and removes ethtool_cmd deprecated fields (transceiver/maxrxpkt/maxtxpkt). This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides the following backward compatibility properties: - legacy ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks. - legacy ethtool with new get/set_link_ksettings drivers: the new driver callbacks are used, data internally converted to legacy ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link mode mask. ETHTOOL_SSET will fail if user tries to set the ethtool_cmd deprecated fields to non-0 (transceiver/maxrxpkt/maxtxpkt). A kernel warning is logged if driver sets higher bits. - future ethtool with legacy drivers: no change, still using the get_settings/set_settings callbacks, internally converted to new data structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be ignored and seen as 0 from user space. Note that that "future" ethtool tool will not allow changes to these deprecated fields. - future ethtool with new drivers: direct call to the new callbacks. By "future" ethtool, what is meant is: - query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if fails - set: query first and remember which of ETHTOOL_GLINKSETTINGS or ETHTOOL_GSET was successful + if ETHTOOL_GLINKSETTINGS was successful, then change config with ETHTOOL_SLINKSETTINGS. A failure there is final (do not try ETHTOOL_SSET). + otherwise ETHTOOL_GSET was successful, change config with ETHTOOL_SSET. A failure there is final (do not try ETHTOOL_SLINKSETTINGS). The interaction user/kernel via the new API requires a small ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link mode bitmaps. If kernel doesn't agree with user, it returns the bitmap length it is expecting from user as a negative length (and cmd field is 0). When kernel and user agree, kernel returns valid info in all fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS). Data structure crossing user/kernel boundary is 32/64-bit agnostic. Converted internally to a legal kernel bitmap. The internal __ethtool_get_settings kernel helper will gradually be replaced by __ethtool_get_link_ksettings by the time the first "link_settings" drivers start to appear. So this patch doesn't change it, it will be removed before it needs to be changed. Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 18:57:59 +00:00
int (*get_link_ksettings)(struct net_device *,
struct ethtool_link_ksettings *);
int (*set_link_ksettings)(struct net_device *,
const struct ethtool_link_ksettings *);
void (*get_fec_stats)(struct net_device *dev,
struct ethtool_fec_stats *fec_stats);
net: ethtool: add support for forward error correction modes Forward Error Correction (FEC) modes i.e Base-R and Reed-Solomon modes are introduced in 25G/40G/100G standards for providing good BER at high speeds. Various networking devices which support 25G/40G/100G provides ability to manage supported FEC modes and the lack of FEC encoding control and reporting today is a source for interoperability issues for many vendors. FEC capability as well as specific FEC mode i.e. Base-R or RS modes can be requested or advertised through bits D44:47 of base link codeword. This patch set intends to provide option under ethtool to manage and report FEC encoding settings for networking devices as per IEEE 802.3 bj, bm and by specs. set-fec/show-fec option(s) are designed to provide control and report the FEC encoding on the link. SET FEC option: root@tor: ethtool --set-fec swp1 encoding [off | RS | BaseR | auto] Encoding: Types of encoding Off : Turning off any encoding RS : enforcing RS-FEC encoding on supported speeds BaseR : enforcing Base R encoding on supported speeds Auto : IEEE defaults for the speed/medium combination Here are a few examples of what we would expect if encoding=auto: - if autoneg is on, we are expecting FEC to be negotiated as on or off as long as protocol supports it - if the hardware is capable of detecting the FEC encoding on it's receiver it will reconfigure its encoder to match - in absence of the above, the configuration would be set to IEEE defaults. >From our understanding , this is essentially what most hardware/driver combinations are doing today in the absence of a way for users to control the behavior. SHOW FEC option: root@tor: ethtool --show-fec swp1 FEC parameters for swp1: Active FEC encodings: RS Configured FEC encodings: RS | BaseR ETHTOOL DEVNAME output modification: ethtool devname output: root@tor:~# ethtool swp1 Settings for swp1: root@hpe-7712-03:~# ethtool swp18 Settings for swp18: Supported ports: [ FIBRE ] Supported link modes: 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 100000baseSR4/Full 100000baseCR4/Full 100000baseLR4_ER4/Full Supported pause frame use: No Supports auto-negotiation: Yes Supported FEC modes: [RS | BaseR | None | Not reported] Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Advertised FEC modes: [RS | BaseR | None | Not reported] <<<< One or more FEC modes Speed: 100000Mb/s Duplex: Full Port: FIBRE PHYAD: 106 Transceiver: internal Auto-negotiation: off Link detected: yes This patch includes following changes a) New ETHTOOL_SFECPARAM/SFECPARAM API, handled by the new get_fecparam/set_fecparam callbacks, provides support for configuration of forward error correction modes. b) Link mode bits for FEC modes i.e. None (No FEC mode), RS, BaseR/FC are defined so that users can configure these fec modes for supported and advertising fields as part of link autonegotiation. Signed-off-by: Vidya Sagar Ravipati <vidya.chowdary@gmail.com> Signed-off-by: Dustin Byford <dustin@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-27 23:47:26 +00:00
int (*get_fecparam)(struct net_device *,
struct ethtool_fecparam *);
int (*set_fecparam)(struct net_device *,
struct ethtool_fecparam *);
void (*get_ethtool_phy_stats)(struct net_device *,
struct ethtool_stats *, u64 *);
int (*get_phy_tunable)(struct net_device *,
const struct ethtool_tunable *, void *);
int (*set_phy_tunable)(struct net_device *,
const struct ethtool_tunable *, const void *);
int (*get_module_eeprom_by_page)(struct net_device *dev,
const struct ethtool_module_eeprom *page,
struct netlink_ext_ack *extack);
void (*get_eth_phy_stats)(struct net_device *dev,
struct ethtool_eth_phy_stats *phy_stats);
void (*get_eth_mac_stats)(struct net_device *dev,
struct ethtool_eth_mac_stats *mac_stats);
void (*get_eth_ctrl_stats)(struct net_device *dev,
struct ethtool_eth_ctrl_stats *ctrl_stats);
void (*get_rmon_stats)(struct net_device *dev,
struct ethtool_rmon_stats *rmon_stats,
const struct ethtool_rmon_hist_range **ranges);
ethtool: Add ability to control transceiver modules' power mode Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and 'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver modules parameters and retrieve their status. The first parameter to control is the power mode of the module. It is only relevant for paged memory modules, as flat memory modules always operate in low power mode. When a paged memory module is in low power mode, its power consumption is reduced to the minimum, the management interface towards the host is available and the data path is deactivated. User space can choose to put modules that are not currently in use in low power mode and transition them to high power mode before putting the associated ports administratively up. This is useful for user space that favors reduced power consumption and lower temperatures over reduced link up times. In QSFP-DD modules the transition from low power mode to high power mode can take a few seconds and this transition is only expected to get longer with future / more complex modules. User space can control the power mode of the module via the power mode policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible values: * high: Module is always in high power mode. * auto: Module is transitioned by the host to high power mode when the first port using it is put administratively up and to low power mode when the last port using it is put administratively down. The operational power mode of the module is available to user space via the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not reported to user space when a module is not plugged-in. The user API is designed to be generic enough so that it could be used for modules with different memory maps (e.g., SFF-8636, CMIS). The only implementation of the device driver API in this series is for a MAC driver (mlxsw) where the module is controlled by the device's firmware, but it is designed to be generic enough so that it could also be used by implementations where the module is controlled by the CPU. CMIS testing ============ # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off The module is not in low power mode, as it is not forced by hardware (LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off). The power mode can be queried from the kernel. In case LowPwrAllowRequestHW was on, the kernel would need to take into account the state of the LowPwrRequestHW signal, which is not visible to user space. $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp11 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp11 up Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x03 (ModuleReady) LowPwrAllowRequestHW : Off LowPwrRequestSW : Off Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp11 down Query the power mode again: $ ethtool --show-module swp11 Module parameters for swp11: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp11 Identifier : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628)) ... Module State : 0x01 (ModuleLowPwr) LowPwrAllowRequestHW : Off LowPwrRequestSW : On SFF-8636 testing ================ # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7733 mW / -1.12 dBm Transmit avg optical power (Channel 2) : 0.7649 mW / -1.16 dBm Transmit avg optical power (Channel 3) : 0.7790 mW / -1.08 dBm Transmit avg optical power (Channel 4) : 0.7837 mW / -1.06 dBm Rcvr signal avg optical power(Channel 1) : 0.9302 mW / -0.31 dBm Rcvr signal avg optical power(Channel 2) : 0.9079 mW / -0.42 dBm Rcvr signal avg optical power(Channel 3) : 0.8993 mW / -0.46 dBm Rcvr signal avg optical power(Channel 4) : 0.8778 mW / -0.57 dBm The module is not in low power mode, as it is not forced by hardware (Power override is on) or by software (Power set is off). The power mode can be queried from the kernel. In case Power override was off, the kernel would need to take into account the state of the LPMode signal, which is not visible to user space. $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy high power-mode high Change the power mode policy to 'auto': # ethtool --set-module swp13 power-mode-policy auto Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Put the associated port administratively up which will instruct the host to transition the module to high power mode: # ip link set dev swp13 up Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode high Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) enabled Power set : Off Power override : On ... Transmit avg optical power (Channel 1) : 0.7934 mW / -1.01 dBm Transmit avg optical power (Channel 2) : 0.7859 mW / -1.05 dBm Transmit avg optical power (Channel 3) : 0.7885 mW / -1.03 dBm Transmit avg optical power (Channel 4) : 0.7985 mW / -0.98 dBm Rcvr signal avg optical power(Channel 1) : 0.9325 mW / -0.30 dBm Rcvr signal avg optical power(Channel 2) : 0.9034 mW / -0.44 dBm Rcvr signal avg optical power(Channel 3) : 0.9086 mW / -0.42 dBm Rcvr signal avg optical power(Channel 4) : 0.8885 mW / -0.51 dBm Put the associated port administratively down which will instruct the host to transition the module to low power mode: # ip link set dev swp13 down Query the power mode again: $ ethtool --show-module swp13 Module parameters for swp13: power-mode-policy auto power-mode low Verify with the data read from the EEPROM: # ethtool -m swp13 Identifier : 0x11 (QSFP28) ... Extended identifier description : 5.0W max. Power consumption, High Power Class (> 3.5 W) not enabled Power set : On Power override : On ... Transmit avg optical power (Channel 1) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 2) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 3) : 0.0000 mW / -inf dBm Transmit avg optical power (Channel 4) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 1) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 2) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 3) : 0.0000 mW / -inf dBm Rcvr signal avg optical power(Channel 4) : 0.0000 mW / -inf dBm Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-06 10:46:42 +00:00
int (*get_module_power_mode)(struct net_device *dev,
struct ethtool_module_power_mode_params *params,
struct netlink_ext_ack *extack);
int (*set_module_power_mode)(struct net_device *dev,
const struct ethtool_module_power_mode_params *params,
struct netlink_ext_ack *extack);
net: ethtool: add support for MAC Merge layer The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2 specifications (the other being Frame Preemption; IEEE 802.1Q-2018 clause 6.7.2), which work together to minimize latency caused by frame interference at TX. The overall goal of TSN is for normal traffic and traffic with a bounded deadline to be able to cohabitate on the same L2 network and not bother each other too much. The standards achieve this (partly) by introducing the concept of preemptible traffic, i.e. Ethernet frames that have a custom value for the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented and reassembled at L2 on a link-local basis. The non-preemptible frames are called express traffic, they are transmitted using a normal SFD, and they can preempt preemptible frames, therefore having lower latency, which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo frames around 9K). Preemption is not recursive, i.e. a P frame cannot preempt another P frame. Preemption also does not depend upon priority, or otherwise said, an E frame with prio 0 will still preempt a P frame with prio 7. In terms of implementation, the standards talk about the presence of an express MAC (eMAC) which handles express traffic, and a preemptible MAC (pMAC) which handles preemptible traffic, and these MACs are multiplexed on the same MII by a MAC merge layer. To support frame preemption, the definition of the SFD was generalized to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an Ethernet frame fragment, or a complete frame. Stations unaware of an SMD value different from the standard SFD will treat P frames as error frames. To prevent that from happening, a negotiation process is defined. On RX, packets are dispatched to the eMAC or pMAC after being filtered by their SMD. On TX, the eMAC/pMAC classification decision is taken by the 802.1Q spec, based on packet priority (each of the 8 user priority values may have an admin-status of preemptible or express). The MAC Merge layer and the Frame Preemption parameters have some degree of independence in terms of how software stacks are supposed to deal with them. The activation of the MM layer is supposed to be controlled by an LLDP daemon (after it has been communicated that the link partner also supports it), after which a (hardware-based or not) verification handshake takes place, before actually enabling the feature. So the process is intended to be relatively plug-and-play. Whereas FP settings are supposed to be coordinated across a network using something approximating NETCONF. The support contained here is exclusively for the 802.3 (MAC Merge) portions and not for the 802.1Q (Frame Preemption) parts. This API is sufficient for an LLDP daemon to do its job. The FP adminStatus variable from 802.1Q is outside the scope of an LLDP daemon. I have taken a few creative licenses and augmented the Linux kernel UAPI compared to the standard managed objects recommended by IEEE 802.3. These are: - ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive Processing state diagram, a MAC Merge layer is always supposed to be able to receive P frames. However, this implies keeping the pMAC powered on, which will consume needless power in applications where FP will never be used. If LLDP is used, the reception of an Additional Ethernet Capabilities TLV from the link partner is sufficient indication that the pMAC should be enabled. So my proposal is that in Linux, we keep the pMAC turned off by default and that user space turns it on when needed. - ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in the boolean netlink attributes offered, so this is also positive here. Other than the meaning being reversed, they correspond to the same thing. - ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for a LLDP daemon to maximize the verifyTime variable (delay between SMD-V transmissions), to maximize its chances that the LP replies. IEEE says that the verifyTime can range between 1 and 128 ms, but the NXP ENETC stupidly keeps this variable in a 7 bit register, so the maximum supported value is 127 ms. I could have chosen to hardcode this in the LLDP daemon to a lower value, but why not let the kernel expose its supported range directly. - ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called aMACMergeAddFragSize, and expresses the "additional" fragment size (on top of ETH_ZLEN), whereas this expresses the absolute value of the fragment size. - ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed object mandated by the standard, but user space clearly needs to know what is the minimum supported fragment size of our local receiver, since LLDP must advertise a value no lower than that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-19 12:26:54 +00:00
int (*get_mm)(struct net_device *dev, struct ethtool_mm_state *state);
int (*set_mm)(struct net_device *dev, struct ethtool_mm_cfg *cfg,
struct netlink_ext_ack *extack);
void (*get_mm_stats)(struct net_device *dev, struct ethtool_mm_stats *stats);
};
int ethtool_check_ops(const struct ethtool_ops *ops);
struct ethtool_rx_flow_rule {
struct flow_rule *rule;
unsigned long priv[];
};
struct ethtool_rx_flow_spec_input {
const struct ethtool_rx_flow_spec *fs;
u32 rss_ctx;
};
struct ethtool_rx_flow_rule *
ethtool_rx_flow_rule_create(const struct ethtool_rx_flow_spec_input *input);
void ethtool_rx_flow_rule_destroy(struct ethtool_rx_flow_rule *rule);
bool ethtool_virtdev_validate_cmd(const struct ethtool_link_ksettings *cmd);
int ethtool_virtdev_set_link_ksettings(struct net_device *dev,
const struct ethtool_link_ksettings *cmd,
u32 *dev_speed, u8 *dev_duplex);
struct phy_device;
struct phy_tdr_config;
struct phy_plca_cfg;
struct phy_plca_status;
/**
* struct ethtool_phy_ops - Optional PHY device options
* @get_sset_count: Get number of strings that @get_strings will write.
* @get_strings: Return a set of strings that describe the requested objects
* @get_stats: Return extended statistics about the PHY device.
* @get_plca_cfg: Return PLCA configuration.
* @set_plca_cfg: Set PLCA configuration.
* @get_plca_status: Get PLCA configuration.
* @start_cable_test: Start a cable test
* @start_cable_test_tdr: Start a Time Domain Reflectometry cable test
*
* All operations are optional (i.e. the function pointer may be set to %NULL)
* and callers must take this into account. Callers must hold the RTNL lock.
*/
struct ethtool_phy_ops {
int (*get_sset_count)(struct phy_device *dev);
int (*get_strings)(struct phy_device *dev, u8 *data);
int (*get_stats)(struct phy_device *dev,
struct ethtool_stats *stats, u64 *data);
int (*get_plca_cfg)(struct phy_device *dev,
struct phy_plca_cfg *plca_cfg);
int (*set_plca_cfg)(struct phy_device *dev,
const struct phy_plca_cfg *plca_cfg,
struct netlink_ext_ack *extack);
int (*get_plca_status)(struct phy_device *dev,
struct phy_plca_status *plca_st);
int (*start_cable_test)(struct phy_device *phydev,
struct netlink_ext_ack *extack);
int (*start_cable_test_tdr)(struct phy_device *phydev,
struct netlink_ext_ack *extack,
const struct phy_tdr_config *config);
};
/**
* ethtool_set_ethtool_phy_ops - Set the ethtool_phy_ops singleton
* @ops: Ethtool PHY operations to set
*/
void ethtool_set_ethtool_phy_ops(const struct ethtool_phy_ops *ops);
/**
ethtool: Remove link_mode param and derive link params from driver Some drivers clear the 'ethtool_link_ksettings' struct in their get_link_ksettings() callback, before populating it with actual values. Such drivers will set the new 'link_mode' field to zero, resulting in user space receiving wrong link mode information given that zero is a valid value for the field. Another problem is that some drivers (notably tun) can report random values in the 'link_mode' field. This can result in a general protection fault when the field is used as an index to the 'link_mode_params' array [1]. This happens because such drivers implement their set_link_ksettings() callback by simply overwriting their private copy of 'ethtool_link_ksettings' struct with the one they get from the stack, which is not always properly initialized. Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings' and instead have drivers call ethtool_params_from_link_mode() with the current link mode. The function will derive the link parameters (e.g., speed) from the link mode and fill them in the 'ethtool_link_ksettings' struct. v3: * Remove link_mode parameter and derive the link parameters in the driver instead of passing link_mode parameter to ethtool and derive it there. v2: * Introduce 'cap_link_mode_supported' instead of adding a validity field to 'ethtool_link_ksettings' struct. [1] general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967] CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 +38 d0 7c 08 84 d2 0f 85 b9 RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000 RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960 RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000 R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210 FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07 10:06:51 +00:00
* ethtool_params_from_link_mode - Derive link parameters from a given link mode
* @link_ksettings: Link parameters to be derived from the link mode
* @link_mode: Link mode
*/
void
ethtool_params_from_link_mode(struct ethtool_link_ksettings *link_ksettings,
enum ethtool_link_mode_bit_indices link_mode);
/**
* ethtool_get_phc_vclocks - Derive phc vclocks information, and caller
* is responsible to free memory of vclock_index
* @dev: pointer to net_device structure
* @vclock_index: pointer to pointer of vclock index
*
* Return number of phc vclocks
*/
int ethtool_get_phc_vclocks(struct net_device *dev, int **vclock_index);
/* Some generic methods drivers may use in their ethtool_ops */
u32 ethtool_op_get_link(struct net_device *dev);
int ethtool_op_get_ts_info(struct net_device *dev, struct ethtool_ts_info *eti);
/**
* ethtool_mm_frag_size_add_to_min - Translate (standard) additional fragment
* size expressed as multiplier into (absolute) minimum fragment size
* value expressed in octets
* @val_add: Value of addFragSize multiplier
*/
static inline u32 ethtool_mm_frag_size_add_to_min(u32 val_add)
{
return (ETH_ZLEN + ETH_FCS_LEN) * (1 + val_add) - ETH_FCS_LEN;
}
/**
* ethtool_mm_frag_size_min_to_add - Translate (absolute) minimum fragment size
* expressed in octets into (standard) additional fragment size expressed
* as multiplier
* @val_min: Value of addFragSize variable in octets
* @val_add: Pointer where the standard addFragSize value is to be returned
* @extack: Netlink extended ack
*
* Translate a value in octets to one of 0, 1, 2, 3 according to the reverse
* application of the 802.3 formula 64 * (1 + addFragSize) - 4. To be called
* by drivers which do not support programming the minimum fragment size to a
* continuous range. Returns error on other fragment length values.
*/
static inline int ethtool_mm_frag_size_min_to_add(u32 val_min, u32 *val_add,
struct netlink_ext_ack *extack)
{
u32 add_frag_size;
for (add_frag_size = 0; add_frag_size < 4; add_frag_size++) {
if (ethtool_mm_frag_size_add_to_min(add_frag_size) == val_min) {
*val_add = add_frag_size;
return 0;
}
}
NL_SET_ERR_MSG_MOD(extack,
"minFragSize required to be one of 60, 124, 188 or 252");
return -EINVAL;
}
/**
* ethtool_sprintf - Write formatted string to ethtool string data
* @data: Pointer to start of string to update
* @fmt: Format of string to write
*
* Write formatted string to data. Update data to point at start of
* next string.
*/
extern __printf(2, 3) void ethtool_sprintf(u8 **data, const char *fmt, ...);
#endif /* _LINUX_ETHTOOL_H */