linux-stable/net/sched
Tonghao Zhang 38a6f08657 net: sched: support hash selecting tx queue
This patch lets users select a range of tx queues, A to B, for
queue_mapping, so that packets can be load balanced across tx
queues A through B. A and B are unsigned 16-bit values given in
decimal format.

$ tc filter ... action skbedit queue_mapping skbhash A B

"skbedit queue_mapping QUEUE_MAPPING" (from "man 8 tc-skbedit")
is extended with a new flag: SKBEDIT_F_TXQ_SKBHASH
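
The range selection described above can be sketched in a few lines of C. This is a minimal illustration, not the code in act_skbedit.c: it assumes the queue is chosen as A plus the flow hash modulo the number of queues in the inclusive range [A, B]. The name pick_txq is hypothetical, and the sketch does not model any clamping to the number of tx queues the device actually has.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: map a flow hash onto the
 * inclusive tx queue range [a, b], as "skbedit queue_mapping skbhash A B"
 * is described in the commit message. Assumes a <= b.
 */
uint16_t pick_txq(uint32_t hash, uint16_t a, uint16_t b)
{
	uint32_t span;

	assert(a <= b);
	span = (uint32_t)b - a + 1;	/* number of queues in the range */

	/* The same hash always yields the same queue, so a flow stays put. */
	return (uint16_t)(a + hash % span);
}
```

With A = 2 and B = 6, as in the setup commands in this message, every result falls in 2..6, and packets sharing a hash value always map to the same queue.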

  +----+      +----+      +----+
  | P1 |      | P2 |      | Pn |
  +----+      +----+      +----+
    |           |           |
    +-----------+-----------+
                |
                | clsact/skbedit
                |      MQ
                v
    +-----------+-----------+
    | q0        | qn        | qm
    v           v           v
  HTB/FQ       FIFO   ...  FIFO

For example:
If P1 sends packets to different Pods on other hosts and we want
to distribute its flows across queues qn - qm, we can use skb->hash
as the hash.

setup commands:
$ NETDEV=eth0
$ ip netns add n1
$ ip link add ipv1 link $NETDEV type ipvlan mode l2
$ ip link set ipv1 netns n1
$ ip netns exec n1 ifconfig ipv1 2.2.2.100/24 up

$ tc qdisc add dev $NETDEV clsact
$ tc filter add dev $NETDEV egress protocol ip prio 1 \
        flower skip_hw src_ip 2.2.2.100 action skbedit queue_mapping skbhash 2 6
$ tc qdisc add dev $NETDEV handle 1: root mq
$ tc qdisc add dev $NETDEV parent 1:1 handle 2: htb
$ tc class add dev $NETDEV parent 2: classid 2:1 htb rate 100kbit
$ tc class add dev $NETDEV parent 2: classid 2:2 htb rate 200kbit
$ tc qdisc add dev $NETDEV parent 1:2 tbf rate 100mbit burst 100mb latency 1
$ tc qdisc add dev $NETDEV parent 1:3 pfifo
$ tc qdisc add dev $NETDEV parent 1:4 pfifo
$ tc qdisc add dev $NETDEV parent 1:5 pfifo
$ tc qdisc add dev $NETDEV parent 1:6 pfifo
$ tc qdisc add dev $NETDEV parent 1:7 pfifo

$ ip netns exec n1 iperf3 -c 2.2.2.1 -i 1 -t 10 -P 10

tx queues are picked from 2 - 6:
$ ethtool -S $NETDEV | grep -i 'tx_queue_[0-9]_bytes'
     tx_queue_0_bytes: 42
     tx_queue_1_bytes: 0
     tx_queue_2_bytes: 11442586444
     tx_queue_3_bytes: 7383615334
     tx_queue_4_bytes: 3981365579
     tx_queue_5_bytes: 3983235051
     tx_queue_6_bytes: 6706236461
     tx_queue_7_bytes: 42
     tx_queue_8_bytes: 0
     tx_queue_9_bytes: 0

tx queues 2 - 6 are mapped to classids 1:3 - 1:7:
$ tc -s class show dev $NETDEV
...
class mq 1:3 root leaf 8002:
 Sent 11949133672 bytes 7929798 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq 1:4 root leaf 8003:
 Sent 7710449050 bytes 5117279 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq 1:5 root leaf 8004:
 Sent 4157648675 bytes 2758990 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq 1:6 root leaf 8005:
 Sent 4159632195 bytes 2759990 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
class mq 1:7 root leaf 8006:
 Sent 7003169603 bytes 4646912 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
...
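
The tx queue to classid mapping in the output above follows the mq convention that tx queue N appears as class minor N + 1 under the root handle. A minimal sketch of that arithmetic, assuming the usual tc handle layout (major in the upper 16 bits, minor in the lower 16); the name mq_classid_for_txq is hypothetical and is not a kernel or iproute2 function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the 32-bit tc classid under which an mq root
 * qdisc with handle "major:" exposes tx queue txq. Queue 2 under handle 1:
 * becomes 1:3, i.e. 0x00010003.
 */
uint32_t mq_classid_for_txq(uint16_t major, uint16_t txq)
{
	assert(txq < 0xffff);	/* minor txq + 1 must fit in 16 bits */

	return ((uint32_t)major << 16) | (uint32_t)(txq + 1);
}
```

For the setup in this message, queues 2 through 6 under handle 1: give minors 3 through 7, which matches the classes 1:3 - 1:7 that carry the traffic.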

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Lobakin <alobakin@pm.me>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Talal Ahmad <talalahmad@google.com>
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Antoine Tenart <atenart@kernel.org>
Cc: Wei Wang <weiwan@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-19 12:20:45 +02:00
act_api.c net/sched: act_api: Add extack to offload_act_setup() callback 2022-04-08 13:45:43 +01:00
act_bpf.c bpf: Keep the (rcv) timestamp behavior for the existing tc-bpf@ingress 2022-03-03 14:38:48 +00:00
act_connmark.c flow_offload: fill flags to action structure 2021-12-19 14:08:47 +00:00
act_csum.c net/sched: act_api: Add extack to offload_act_setup() callback 2022-04-08 13:45:43 +01:00
act_ct.c net/sched: act_api: Add extack to offload_act_setup() callback 2022-04-08 13:45:43 +01:00
act_ctinfo.c flow_offload: fill flags to action structure 2021-12-19 14:08:47 +00:00
act_gact.c net/sched: act_gact: Add extack messages for offload failure 2022-04-08 13:45:43 +01:00
act_gate.c net/sched: act_api: Add extack to offload_act_setup() callback 2022-04-08 13:45:43 +01:00
act_ife.c flow_offload: fill flags to action structure 2021-12-19 14:08:47 +00:00
act_ipt.c flow_offload: fill flags to action structure 2021-12-19 14:08:47 +00:00
act_meta_mark.c
act_meta_skbprio.c
act_meta_skbtcindex.c
act_mirred.c net/sched: act_mirred: Add extack message for offload failure 2022-04-08 13:45:43 +01:00
act_mpls.c net/sched: act_mpls: Add extack messages for offload failure 2022-04-08 13:45:43 +01:00
act_nat.c flow_offload: fill flags to action structure 2021-12-19 14:08:47 +00:00
act_pedit.c net/sched: act_pedit: Add extack message for offload failure 2022-04-08 13:45:43 +01:00
act_police.c net/sched: act_police: Add extack messages for offload failure 2022-04-08 13:45:43 +01:00
act_sample.c net/sched: act_api: Add extack to offload_act_setup() callback 2022-04-08 13:45:43 +01:00
act_simple.c flow_offload: fill flags to action structure 2021-12-19 14:08:47 +00:00
act_skbedit.c net: sched: support hash selecting tx queue 2022-04-19 12:20:45 +02:00
act_skbmod.c flow_offload: fill flags to action structure 2021-12-19 14:08:47 +00:00
act_tunnel_key.c net/sched: act_tunnel_key: Add extack message for offload failure 2022-04-08 13:45:43 +01:00
act_vlan.c net/sched: act_vlan: Add extack message for offload failure 2022-04-08 13:45:43 +01:00
cls_api.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
cls_basic.c net_sched: refactor TC action init API 2021-08-02 10:24:38 +01:00
cls_bpf.c bpf: Keep the (rcv) timestamp behavior for the existing tc-bpf@ingress 2022-03-03 14:38:48 +00:00
cls_cgroup.c net_sched: refactor TC action init API 2021-08-02 10:24:38 +01:00
cls_flow.c net_sched: refactor TC action init API 2021-08-02 10:24:38 +01:00
cls_flower.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
cls_fw.c net_sched: refactor TC action init API 2021-08-02 10:24:38 +01:00
cls_matchall.c net/sched: matchall: Avoid overwriting error messages 2022-04-08 13:45:43 +01:00
cls_route.c net_sched: refactor TC action init API 2021-08-02 10:24:38 +01:00
cls_rsvp.c
cls_rsvp.h net_sched: refactor TC action init API 2021-08-02 10:24:38 +01:00
cls_rsvp6.c
cls_tcindex.c net_sched: refactor TC action init API 2021-08-02 10:24:38 +01:00
cls_u32.c flow_offload: validate flags of filter and actions 2021-12-19 14:08:48 +00:00
em_canid.c
em_cmp.c net: sched: fix misspellings using misspell-fixer tool 2020-11-10 17:00:28 -08:00
em_ipset.c
em_ipt.c
em_meta.c net: introduce sk_forward_alloc_get() 2021-10-27 18:20:29 -07:00
em_nbyte.c net: sched: Return the correct errno code 2021-02-06 11:15:28 -08:00
em_text.c
em_u32.c
ematch.c net: sched: Fix spelling mistakes 2021-05-31 22:44:56 -07:00
Kconfig net: sched: incorrect Kconfig dependencies on Netfilter modules 2020-12-09 15:49:29 -08:00
Makefile net/sched: sch_frag: add generic packet fragment support. 2020-11-27 14:36:02 -08:00
sch_api.c net_sched: add __rcu annotation to netdev->qdisc 2022-02-14 13:36:36 +00:00
sch_atm.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
sch_blackhole.c
sch_cake.c sch_cake: revise Diffserv docs 2022-01-07 08:41:29 -08:00
sch_cbq.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
sch_cbs.c net: don't include ethtool.h from netdevice.h 2020-11-23 17:27:04 -08:00
sch_choke.c net: sched: validate stab values 2021-03-10 15:47:52 -08:00
sch_codel.c
sch_drr.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
sch_dsmark.c net/sched: store the last executed chain also for clsact egress 2021-07-29 22:17:37 +01:00
sch_etf.c
sch_ets.c net/sched: sch_ets: don't remove idle classes from the round-robin list 2021-12-13 12:30:23 +00:00
sch_fifo.c net_sched: fix NULL deref in fifo_set_limit() 2021-10-01 14:59:10 -07:00
sch_fq.c
sch_fq_codel.c fq_codel: generalise ce_threshold marking for subset of traffic 2021-10-20 15:24:36 -07:00
sch_fq_pie.c net/sched: fq_pie: prevent dismantle issue 2021-12-09 08:01:00 -08:00
sch_frag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-12-31 14:35:40 +00:00
sch_generic.c net_sched: make qdisc_reset() smaller 2022-04-15 14:04:56 -07:00
sch_gred.c net: sched: gred: dynamically allocate tc_gred_qopt_offload 2021-10-27 12:06:52 -07:00
sch_hfsc.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
sch_hhf.c
sch_htb.c sch_htb: Fail on unsupported parameters when offload is requested 2022-01-25 20:00:02 -08:00
sch_ingress.c
sch_mq.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
sch_mqprio.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
sch_multiq.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
sch_netem.c net: sched: sch_netem: Refactor code in 4-state loss generator 2021-11-15 13:23:23 +00:00
sch_pie.c net: sched: fix misspellings using misspell-fixer tool 2020-11-10 17:00:28 -08:00
sch_plug.c
sch_prio.c net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
sch_qfq.c sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc 2022-01-04 12:36:51 +00:00
sch_red.c net: sched: validate stab values 2021-03-10 15:47:52 -08:00
sch_sfb.c net/sched: store the last executed chain also for clsact egress 2021-07-29 22:17:37 +01:00
sch_sfq.c net/sched: store the last executed chain also for clsact egress 2021-07-29 22:17:37 +01:00
sch_skbprio.c
sch_taprio.c net/sched: taprio: Check if socket flags are valid 2022-04-11 10:51:00 +01:00
sch_tbf.c net: sch_tbf: Add a graft command 2021-10-19 12:24:51 +01:00
sch_teql.c net: sched: sch_teql: fix null-pointer dereference 2021-04-08 14:14:42 -07:00