linux-stable/include/linux/dsa/ocelot.h

269 lines
11 KiB
C
Raw Normal View History

/* SPDX-License-Identifier: GPL-2.0
* Copyright 2019-2021 NXP
*/
#ifndef _NET_DSA_TAG_OCELOT_H
#define _NET_DSA_TAG_OCELOT_H
net: dsa: tag_ocelot_8021q: break circular dependency with ocelot switch lib Michael reported that when using the "ocelot-8021q" tagging protocol, the switch driver module must be manually loaded before the tagging protocol can be loaded/is available. This appears to be the same problem described here: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/ where due to the fact that DSA tagging protocols make use of symbols exported by the switch drivers, circular dependencies appear and this breaks module autoloading. The ocelot_8021q driver needs the ocelot_can_inject() and ocelot_port_inject_frame() functions from the switch library. Previously the wrong approach was taken to solve that dependency: shims were provided for the case where the ocelot switch library was compiled out, but that turns out to be insufficient, because the dependency when the switch lib _is_ compiled is problematic too. We cannot declare ocelot_can_inject() and ocelot_port_inject_frame() as static inline functions, because these access I/O functions like __ocelot_write_ix() which is called by ocelot_write_rix(). Making those static inline basically means exposing the whole guts of the ocelot switch library, not ideal... We already have one tagging protocol driver which calls into the switch driver during xmit but not using any exported symbol: sja1105_defer_xmit. We can do the same thing here: create a kthread worker and one work item per skb, and let the switch driver itself do the register accesses to send the skb, and then consume it. Fixes: 0a6f17c6ae21 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping") Reported-by: Michael Walle <michael@walle.cc> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12 11:40:41 +00:00
#include <linux/kthread.h>
#include <linux/packing.h>
#include <linux/skbuff.h>
struct ocelot_skb_cb {
struct sk_buff *clone;
unsigned int ptp_class; /* valid only for clones */
net: dsa: felix: fix broken VLAN-tagged PTP under VLAN-aware bridge Normally it is expected that the dsa_device_ops :: rcv() method finishes parsing the DSA tag and consumes it, then never looks at it again. But commit c0bcf537667c ("net: dsa: ocelot: add hardware timestamping support for Felix") added support for RX timestamping in a very unconventional way. On this switch, a partial timestamp is available in the DSA header, but the driver got away with not parsing that timestamp right away, but instead delayed that parsing for a little longer: dsa_switch_rcv(): nskb = cpu_dp->rcv(skb, dev); <------------- not here -> ocelot_rcv() ... skb = nskb; skb_push(skb, ETH_HLEN); skb->pkt_type = PACKET_HOST; skb->protocol = eth_type_trans(skb, skb->dev); ... if (dsa_skb_defer_rx_timestamp(p, skb)) <--- but here -> felix_rxtstamp() return 0; When in felix_rxtstamp(), this driver accounted for the fact that eth_type_trans() happened in the meanwhile, so it got a hold of the extraction header again by subtracting (ETH_HLEN + OCELOT_TAG_LEN) bytes from the current skb->data. This worked for quite some time but was quite fragile from the very beginning. Not to mention that having DSA tag parsing split in two different files, under different folders (net/dsa/tag_ocelot.c vs drivers/net/dsa/ocelot/felix.c) made it quite non-obvious for patches to come that they might break this. Finally, the blamed commit does the following: at the end of ocelot_rcv(), it checks whether the skb payload contains a VLAN header. If it does, and this port is under a VLAN-aware bridge, that VLAN ID might not be correct in the sense that the packet might have suffered VLAN rewriting due to TCAM rules (VCAP IS1). So we consume the VLAN ID from the skb payload using __skb_vlan_pop(), and take the classified VLAN ID from the DSA tag, and construct a hwaccel VLAN tag with the classified VLAN, and the skb payload is VLAN-untagged. The big problem is that __skb_vlan_pop() does: memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN); __skb_pull(skb, VLAN_HLEN); aka it moves the Ethernet header 4 bytes to the right, and pulls 4 bytes from the skb headroom (effectively also moving skb->data, by definition). So for felix_rxtstamp()'s fragile logic, all bets are off now. Instead of having the "extraction" pointer point to the DSA header, it actually points to 4 bytes _inside_ the extraction header. Corollary, the last 4 bytes of the "extraction" header are in fact 4 stale bytes of the destination MAC address from the Ethernet header, from prior to the __skb_vlan_pop() movement. So of course, RX timestamps are completely bogus when the system is configured in this way. The fix is actually very simple: just don't structure the code like that. For better or worse, the DSA PTP timestamping API does not offer a straightforward way for drivers to present their RX timestamps, but other drivers (sja1105) have established a simple mechanism to carry their RX timestamp from dsa_device_ops :: rcv() all the way to dsa_switch_ops :: port_rxtstamp() and even later. That mechanism is to simply save the partial timestamp to the skb->cb, and complete it later. Question: why don't we simply populate the skb's struct skb_shared_hwtstamps from ocelot_rcv(), and bother with this complication of propagating the timestamp to felix_rxtstamp()? Answer: dsa_switch_ops :: port_rxtstamp() answers the question whether PTP packets need sleepable context to retrieve the full RX timestamp. Currently felix_rxtstamp() answers "no, thanks" to that question, and calls ocelot_ptp_gettime64() from softirq atomic context. This is understandable, since Felix VSC9959 is a PCIe memory-mapped switch, so hardware access does not require sleeping. But the felix driver is preparing for the introduction of other switches where hardware access is over a slow bus like SPI or MDIO: https://lore.kernel.org/lkml/20210814025003.2449143-1-colin.foster@in-advantage.com/ So I would like to keep this code structure, so the rework needed when that driver will need PTP support will be minimal (answer "yes, I need deferred context for this skb's RX timestamp", then the partial timestamp will still be found in the skb->cb. Fixes: ea440cd2d9b2 ("net: dsa: tag_ocelot: use VLAN information from tagging header when available") Reported-by: Po Liu <po.liu@nxp.com> Cc: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-02 19:31:22 +00:00
u32 tstamp_lo;
u8 ptp_cmd;
u8 ts_id;
};
#define OCELOT_SKB_CB(skb) \
((struct ocelot_skb_cb *)((skb)->cb))
#define IFH_TAG_TYPE_C 0
#define IFH_TAG_TYPE_S 1
#define IFH_REW_OP_NOOP 0x0
#define IFH_REW_OP_DSCP 0x1
#define IFH_REW_OP_ONE_STEP_PTP 0x2
#define IFH_REW_OP_TWO_STEP_PTP 0x3
#define IFH_REW_OP_ORIGIN_PTP 0x5
#define OCELOT_TAG_LEN 16
#define OCELOT_SHORT_PREFIX_LEN 4
#define OCELOT_LONG_PREFIX_LEN 16
#define OCELOT_TOTAL_TAG_LEN (OCELOT_SHORT_PREFIX_LEN + OCELOT_TAG_LEN)
/* The CPU injection header and the CPU extraction header can have 3 types of
* prefixes: long, short and no prefix. The format of the header itself is the
* same in all 3 cases.
*
* Extraction with long prefix:
*
* +-------------------+-------------------+------+------+------------+-------+
* | ff:ff:ff:ff:ff:ff | fe:ff:ff:ff:ff:ff | 8880 | 000a | extraction | frame |
* | | | | | header | |
* +-------------------+-------------------+------+------+------------+-------+
* 48 bits 48 bits 16 bits 16 bits 128 bits
*
* Extraction with short prefix:
*
* +------+------+------------+-------+
* | 8880 | 000a | extraction | frame |
* | | | header | |
* +------+------+------------+-------+
* 16 bits 16 bits 128 bits
*
* Extraction with no prefix:
*
* +------------+-------+
* | extraction | frame |
* | header | |
* +------------+-------+
* 128 bits
*
*
* Injection with long prefix:
*
* +-------------------+-------------------+------+------+------------+-------+
* | any dmac | any smac | 8880 | 000a | injection | frame |
* | | | | | header | |
* +-------------------+-------------------+------+------+------------+-------+
* 48 bits 48 bits 16 bits 16 bits 128 bits
*
* Injection with short prefix:
*
* +------+------+------------+-------+
* | 8880 | 000a | injection | frame |
* | | | header | |
* +------+------+------------+-------+
* 16 bits 16 bits 128 bits
*
* Injection with no prefix:
*
* +------------+-------+
* | injection | frame |
* | header | |
* +------------+-------+
* 128 bits
*
* The injection header looks like this (network byte order, bit 127
* is part of lowest address byte in memory, bit 0 is part of highest
* address byte):
*
* +------+------+------+------+------+------+------+------+
* 127:120 |BYPASS| MASQ | MASQ_PORT |REW_OP|REW_OP|
* +------+------+------+------+------+------+------+------+
* 119:112 | REW_OP |
* +------+------+------+------+------+------+------+------+
* 111:104 | REW_VAL |
* +------+------+------+------+------+------+------+------+
* 103: 96 | REW_VAL |
* +------+------+------+------+------+------+------+------+
* 95: 88 | REW_VAL |
* +------+------+------+------+------+------+------+------+
* 87: 80 | REW_VAL |
* +------+------+------+------+------+------+------+------+
* 79: 72 | RSV |
* +------+------+------+------+------+------+------+------+
* 71: 64 | RSV | DEST |
* +------+------+------+------+------+------+------+------+
* 63: 56 | DEST |
* +------+------+------+------+------+------+------+------+
* 55: 48 | RSV |
* +------+------+------+------+------+------+------+------+
* 47: 40 | RSV | SRC_PORT | RSV |TFRM_TIMER|
* +------+------+------+------+------+------+------+------+
* 39: 32 | TFRM_TIMER | RSV |
* +------+------+------+------+------+------+------+------+
* 31: 24 | RSV | DP | POP_CNT | CPUQ |
* +------+------+------+------+------+------+------+------+
* 23: 16 | CPUQ | QOS_CLASS |TAG_TYPE|
* +------+------+------+------+------+------+------+------+
* 15: 8 | PCP | DEI | VID |
* +------+------+------+------+------+------+------+------+
* 7: 0 | VID |
* +------+------+------+------+------+------+------+------+
*
* And the extraction header looks like this:
*
* +------+------+------+------+------+------+------+------+
* 127:120 | RSV | REW_OP |
* +------+------+------+------+------+------+------+------+
* 119:112 | REW_OP | REW_VAL |
* +------+------+------+------+------+------+------+------+
* 111:104 | REW_VAL |
* +------+------+------+------+------+------+------+------+
* 103: 96 | REW_VAL |
* +------+------+------+------+------+------+------+------+
* 95: 88 | REW_VAL |
* +------+------+------+------+------+------+------+------+
* 87: 80 | REW_VAL | LLEN |
* +------+------+------+------+------+------+------+------+
* 79: 72 | LLEN | WLEN |
* +------+------+------+------+------+------+------+------+
* 71: 64 | WLEN | RSV |
* +------+------+------+------+------+------+------+------+
* 63: 56 | RSV |
* +------+------+------+------+------+------+------+------+
* 55: 48 | RSV |
* +------+------+------+------+------+------+------+------+
* 47: 40 | RSV | SRC_PORT | ACL_ID |
* +------+------+------+------+------+------+------+------+
* 39: 32 | ACL_ID | RSV | SFLOW_ID |
* +------+------+------+------+------+------+------+------+
* 31: 24 |ACL_HIT| DP | LRN_FLAGS | CPUQ |
* +------+------+------+------+------+------+------+------+
* 23: 16 | CPUQ | QOS_CLASS |TAG_TYPE|
* +------+------+------+------+------+------+------+------+
* 15: 8 | PCP | DEI | VID |
* +------+------+------+------+------+------+------+------+
* 7: 0 | VID |
* +------+------+------+------+------+------+------+------+
*/
net: dsa: tag_ocelot_8021q: break circular dependency with ocelot switch lib Michael reported that when using the "ocelot-8021q" tagging protocol, the switch driver module must be manually loaded before the tagging protocol can be loaded/is available. This appears to be the same problem described here: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/ where due to the fact that DSA tagging protocols make use of symbols exported by the switch drivers, circular dependencies appear and this breaks module autoloading. The ocelot_8021q driver needs the ocelot_can_inject() and ocelot_port_inject_frame() functions from the switch library. Previously the wrong approach was taken to solve that dependency: shims were provided for the case where the ocelot switch library was compiled out, but that turns out to be insufficient, because the dependency when the switch lib _is_ compiled is problematic too. We cannot declare ocelot_can_inject() and ocelot_port_inject_frame() as static inline functions, because these access I/O functions like __ocelot_write_ix() which is called by ocelot_write_rix(). Making those static inline basically means exposing the whole guts of the ocelot switch library, not ideal... We already have one tagging protocol driver which calls into the switch driver during xmit but not using any exported symbol: sja1105_defer_xmit. We can do the same thing here: create a kthread worker and one work item per skb, and let the switch driver itself do the register accesses to send the skb, and then consume it. Fixes: 0a6f17c6ae21 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping") Reported-by: Michael Walle <michael@walle.cc> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-12 11:40:41 +00:00
struct felix_deferred_xmit_work {
struct dsa_port *dp;
struct sk_buff *skb;
struct kthread_work work;
};
struct felix_port {
void (*xmit_work_fn)(struct kthread_work *work);
struct kthread_worker *xmit_worker;
};
static inline void ocelot_xfh_get_rew_val(void *extraction, u64 *rew_val)
{
packing(extraction, rew_val, 116, 85, OCELOT_TAG_LEN, UNPACK, 0);
}
static inline void ocelot_xfh_get_len(void *extraction, u64 *len)
{
u64 llen, wlen;
packing(extraction, &llen, 84, 79, OCELOT_TAG_LEN, UNPACK, 0);
packing(extraction, &wlen, 78, 71, OCELOT_TAG_LEN, UNPACK, 0);
*len = 60 * wlen + llen - 80;
}
static inline void ocelot_xfh_get_src_port(void *extraction, u64 *src_port)
{
packing(extraction, src_port, 46, 43, OCELOT_TAG_LEN, UNPACK, 0);
}
static inline void ocelot_xfh_get_qos_class(void *extraction, u64 *qos_class)
{
packing(extraction, qos_class, 19, 17, OCELOT_TAG_LEN, UNPACK, 0);
}
static inline void ocelot_xfh_get_tag_type(void *extraction, u64 *tag_type)
{
packing(extraction, tag_type, 16, 16, OCELOT_TAG_LEN, UNPACK, 0);
}
static inline void ocelot_xfh_get_vlan_tci(void *extraction, u64 *vlan_tci)
{
packing(extraction, vlan_tci, 15, 0, OCELOT_TAG_LEN, UNPACK, 0);
}
static inline void ocelot_ifh_set_bypass(void *injection, u64 bypass)
{
packing(injection, &bypass, 127, 127, OCELOT_TAG_LEN, PACK, 0);
}
static inline void ocelot_ifh_set_rew_op(void *injection, u64 rew_op)
{
packing(injection, &rew_op, 125, 117, OCELOT_TAG_LEN, PACK, 0);
}
static inline void ocelot_ifh_set_dest(void *injection, u64 dest)
{
packing(injection, &dest, 67, 56, OCELOT_TAG_LEN, PACK, 0);
}
static inline void ocelot_ifh_set_qos_class(void *injection, u64 qos_class)
{
packing(injection, &qos_class, 19, 17, OCELOT_TAG_LEN, PACK, 0);
}
net: dsa: tag_ocelot: create separate tagger for Seville The ocelot tagger is a hot mess currently, it relies on memory initialized by the attached driver for basic frame transmission. This is against all that DSA tagging protocols stand for, which is that the transmission and reception of a DSA-tagged frame, the data path, should be independent from the switch control path, because the tag protocol is in principle hot-pluggable and reusable across switches (even if in practice it wasn't until very recently). But if another driver like dsa_loop wants to make use of tag_ocelot, it couldn't. This was done to have common code between Felix and Ocelot, which have one bit difference in the frame header format. Quoting from commit 67c2404922c2 ("net: dsa: felix: create a template for the DSA tags on xmit"): Other alternatives have been analyzed, such as: - Create a separate tag_seville.c: too much code duplication for just 1 bit field difference. - Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like tag_brcm.c, which would have a separate .xmit function. Again, too much code duplication for just 1 bit field difference. - Allocate the template from the init function of the tag_ocelot.c module, instead of from the driver: couldn't figure out a method of accessing the correct port template corresponding to the correct tagger in the .xmit function. The really interesting part is that Seville should have had its own tagging protocol defined - it is not compatible on the wire with Ocelot, even for that single bit. In principle, a packet generated by DSA_TAG_PROTO_OCELOT when booted on NXP LS1028A would look in a certain way, but when booted on NXP T1040 it would look differently. The reverse is also true: a packet generated by a Seville switch would be interpreted incorrectly by Wireshark if it was told it was generated by an Ocelot switch. Actually things are a bit more nuanced. If we concentrate only on the DSA tag, what I said above is true, but Ocelot/Seville also support an optional DSA tag prefix, which can be short or long, and it is possible to distinguish the two taggers based on an integer constant put in that prefix. Nonetheless, creating a separate tagger is still justified, since the tag prefix is optional, and without it, there is again no way to distinguish. Claiming backwards binary compatibility is a bit more tough, since I've already changed the format of tag_ocelot once, in commit 5124197ce58b ("net: dsa: tag_ocelot: use a short prefix on both ingress and egress"). Therefore I am not very concerned with treating this as a bugfix and backporting it to stable kernels (which would be another mess due to the fact that there would be lots of conflicts with the other DSA_TAG_PROTO* definitions). It's just simpler to say that the string values of the taggers have ABI value starting with kernel 5.12, which will be when the changing of tag protocol via /sys/class/net/<dsa-master>/dsa/tagging goes live. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-13 22:37:58 +00:00
static inline void seville_ifh_set_dest(void *injection, u64 dest)
{
packing(injection, &dest, 67, 57, OCELOT_TAG_LEN, PACK, 0);
}
static inline void ocelot_ifh_set_src(void *injection, u64 src)
{
packing(injection, &src, 46, 43, OCELOT_TAG_LEN, PACK, 0);
}
static inline void ocelot_ifh_set_tag_type(void *injection, u64 tag_type)
{
packing(injection, &tag_type, 16, 16, OCELOT_TAG_LEN, PACK, 0);
}
static inline void ocelot_ifh_set_vlan_tci(void *injection, u64 vlan_tci)
{
packing(injection, &vlan_tci, 15, 0, OCELOT_TAG_LEN, PACK, 0);
}
/* Determine the PTP REW_OP to use for injecting the given skb */
static inline u32 ocelot_ptp_rew_op(struct sk_buff *skb)
{
struct sk_buff *clone = OCELOT_SKB_CB(skb)->clone;
u8 ptp_cmd = OCELOT_SKB_CB(skb)->ptp_cmd;
u32 rew_op = 0;
if (ptp_cmd == IFH_REW_OP_TWO_STEP_PTP && clone) {
rew_op = ptp_cmd;
rew_op |= OCELOT_SKB_CB(clone)->ts_id << 3;
} else if (ptp_cmd == IFH_REW_OP_ORIGIN_PTP) {
rew_op = ptp_cmd;
}
return rew_op;
}
#endif