Commit Graph

29 Commits

Author SHA1 Message Date
Pablo Neira Ayuso 7a27f6ab41 netfilter: flowtable: use dev_fill_forward_path() to obtain egress device
The egress device in the tuple is obtained from the route. Use
dev_fill_forward_path() instead to provide the real egress device for
this flow whenever it is available.

The new FLOW_OFFLOAD_XMIT_DIRECT type uses dev_queue_xmit() to transmit
ethernet frames. Cache the source and destination hardware address to
use dev_queue_xmit() to transfer packets.

The FLOW_OFFLOAD_XMIT_DIRECT replaces FLOW_OFFLOAD_XMIT_NEIGH if
dev_fill_forward_path() finds a direct transmit path.
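
A minimal sketch of the replacement rule above, not the kernel code. The
enum and function names are hypothetical, and direct_path_found stands in
for a dev_fill_forward_path() walk that ends in a direct transmit path:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the flowtable xmit path types. */
enum xmit_type_sketch {
	XMIT_NEIGH_SKETCH,	/* next hop resolved through the neighbour layer */
	XMIT_XFRM_SKETCH,	/* packet handed to the xfrm output path */
	XMIT_DIRECT_SKETCH,	/* cached hardware addresses plus dev_queue_xmit() */
};

/*
 * Only a neighbour-type path is upgraded to the direct type, and only
 * when a direct transmit path was found. Any other type is kept as is.
 */
enum xmit_type_sketch upgrade_xmit_type(enum xmit_type_sketch cur,
					bool direct_path_found)
{
	if (cur == XMIT_NEIGH_SKETCH && direct_path_found)
		return XMIT_DIRECT_SKETCH;
	return cur;
}
```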

In case of topology updates, if the peer is moved to a different bridge
port, the connection will time out and a reconnect will result in a new
entry with the correct path. Snooping fdb updates would allow for
cleaning up stale flowtable entries.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso c63a7cc4d7 netfilter: flowtable: use dev_fill_forward_path() to obtain ingress device
The ingress device in the tuple is obtained from the route in the reply
direction. Use dev_fill_forward_path() instead to get the real ingress
device for this flow.

Fall back to use the ingress device that the IP forwarding route
provides if:

- dev_fill_forward_path() finds no real ingress device.
- the ingress device that is obtained is not part of the flowtable
  devices.
- this route has a xfrm policy.
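
The three fallback conditions can be read as a single predicate. This is
only an illustration: the struct and its fields are hypothetical and do
not exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical summary of what the path lookup learned. */
struct ingress_probe_sketch {
	bool real_dev_found;	/* dev_fill_forward_path() found a real ingress device */
	bool dev_in_flowtable;	/* that device is one of the flowtable devices */
	bool route_has_xfrm;	/* the route carries a xfrm policy */
};

/* Returns true when the ingress device from the IP route must be used. */
bool use_route_ingress_dev(const struct ingress_probe_sketch *p)
{
	return !p->real_dev_found || !p->dev_in_flowtable || p->route_has_xfrm;
}
```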

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24 12:48:39 -07:00
Pablo Neira Ayuso 5139c0c007 netfilter: flowtable: add xmit path types
Add the xmit_type field that defines the two supported xmit paths in the
flowtable data plane, which are the neighbour and the xfrm xmit paths.
This patch prepares for new flowtable xmit path types to come.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24 12:48:39 -07:00
Florian Westphal c46172147e netfilter: conntrack: do not auto-delete clash entries on reply
It's possible that we have more than one packet with the same ct tuple
simultaneously, e.g. when an application emits n packets on the same UDP
socket from multiple threads.

NAT rules might be applied to those packets. With the right set of rules,
n packets will be mapped to m destinations, where at least two packets end
up with the same destination.

When this happens, the existing clash resolution may merge the skb that
is processed after the first has been received with the identical tuple
already in hash table.

However, it's possible that this identical tuple is a NAT_CLASH tuple.
In that case the second skb will be sent, but no reply can be received
since the reply that is processed first removes the NAT_CLASH tuple.

Do not auto-delete; this gives a 1 second window for replies to be passed
back to the originator.

Packets that are coming later (udp stream case) will not be affected:
they match the original ct entry, not a NAT_CLASH one.

Also prevent NAT_CLASH entries from getting offloaded.

Fixes: 6a757c07e5 ("netfilter: conntrack: allow insertion of clashing entries")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-08-29 13:03:06 +02:00
Rob Gill 4cacc39516 netfilter: Add MODULE_DESCRIPTION entries to kernel modules
The user tool modinfo is used to get information on kernel modules, including a
description where it is available.

This patch adds a brief MODULE_DESCRIPTION to netfilter kernel modules
(descriptions taken from Kconfig file or code comments)

Signed-off-by: Rob Gill <rrobgill@protonmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-06-25 00:50:31 +02:00
wenxu 8ca79606cd netfilter: nft_flow_offload: fix underflow in flowtable reference counter
The .deactivate and .activate interfaces already deal with the reference
counter. Adjusting it a second time makes the counter underflow, which
results in spurious "Device is busy" errors.

Fixes: a3c90f7a23 ("netfilter: nf_tables: flow offload expression")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-01-05 10:06:22 +01:00
Pablo Neira Ayuso f1363e058b netfilter: nf_flow_table: detach routing information from flow description
This patch adds the infrastructure to support flow entry types.
The initial type is NF_FLOW_OFFLOAD_ROUTE that stores the routing
information into the flow entry to define a fastpath for the classic
forwarding path.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12 19:42:26 -08:00
Laura Garcia Liebana 9b05b6e11d netfilter: nf_tables: bogus EBUSY when deleting flowtable after flush
The deletion of a flowtable after a flush in the same transaction
results in EBUSY. This patch adds an activation and deactivation of
flowtables in order to update the _use_ counter.
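
A toy model of the use counter, under the assumption that each rule
referencing the flowtable accounts for one unit. All names are
hypothetical. Without the deactivate step on flush, the counter would
still be non-zero when the delete in the same transaction is checked:

```c
#include <errno.h>

/* Hypothetical flowtable with a plain use counter. */
struct flowtable_sketch {
	unsigned int use;
};

/* A rule that references the flowtable becomes active. */
void rule_activate_sketch(struct flowtable_sketch *ft)
{
	ft->use++;
}

/* The rule is flushed or deleted within a transaction. */
void rule_deactivate_sketch(struct flowtable_sketch *ft)
{
	ft->use--;
}

/* Deleting a flowtable that is still referenced fails with -EBUSY. */
int flowtable_delete_sketch(const struct flowtable_sketch *ft)
{
	return ft->use > 0 ? -EBUSY : 0;
}
```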

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-25 11:01:19 +02:00
Jeremy Sowden 40d102cde0 netfilter: update include directives.
Include some headers in files which require them, and remove others
which are not required.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-09-13 12:33:06 +02:00
Pablo Neira Ayuso 14c415862c netfilter: nft_flow_offload: missing netlink attribute policy
The netlink attribute policy for NFTA_FLOW_TABLE_NAME is missing.

Fixes: a3c90f7a23 ("netfilter: nf_tables: flow offload expression")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-19 09:34:20 +02:00
Pablo Neira Ayuso dfe42be15f netfilter: nft_flow_offload: skip tcp rst and fin packets
TCP rst and fin packets do not qualify to place a flow into the
flowtable. Most likely there will be no more packets after connection
closure. Without this patch, this flow entry expires and connection
tracking picks up the entry in ESTABLISHED state using the fixup
timeout, which makes this look inconsistent to the user for a connection
that is actually already closed.
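
The check amounts to the following sketch. The flag struct is a
hypothetical stand-in for the TCP header bits, not the kernel's struct
tcphdr:

```c
#include <stdbool.h>

/* Hypothetical view of the two TCP flags that matter here. */
struct tcp_flags_sketch {
	bool fin;
	bool rst;
};

/* A packet that closes the connection must not create a flowtable entry. */
bool tcp_packet_may_offload(const struct tcp_flags_sketch *th)
{
	return !(th->fin || th->rst);
}
```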

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-14 11:09:07 +02:00
David S. Miller 71e15f76f8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree:

1) Fix crash when dumping rules after conversion to RCU,
   from Florian Westphal.

2) Fix incorrect hook reinjection from nf_queue in case NF_REPEAT,
   from Jagdish Motwani.

3) Fix check for route existence in fib extension, from Phil Sutter.

4) Fix use after free in ip_vs_in() hook, from YueHaibing.

5) Check for veth existence from netfilter selftests,
   from Jeffrin Jose T.

6) Checksum corruption in UDP NAT helpers due to typo,
   from Florian Westphal.

7) Pass up packets to classic forwarding path regardless of
   IPv4 DF bit, patch for the flowtable infrastructure from Florian.

8) Set liberal TCP tracking for flows that are placed in the
   flowtable, in case they need to go back to classic forwarding path,
   also from Florian.

9) Don't add flow with sequence adjustment to flowtable, from Florian.

10) Skip IPv4 options from IPv6 datapath in flowtable, from Florian.

11) Add selftest for the flowtable infrastructure, from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-23 14:45:36 -07:00
Florian Westphal 69aeb53858 netfilter: nft_flow_offload: IPCB is only valid for ipv4 family
Guard this with a check vs. ipv4, IPCB isn't valid in ipv6 case.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-22 10:56:06 +02:00
Florian Westphal 91a9048f23 netfilter: nft_flow_offload: don't offload when sequence numbers need adjustment
We can't deal with tcp sequence number rewrite in flow_offload.
While at it, simplify the helper check: we only need to know if the
extension is present, we don't need the helper data.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-22 10:51:49 +02:00
Florian Westphal 8437a6209f netfilter: nft_flow_offload: set liberal tracking mode for tcp
Without it, whenever a packet has to be pushed up the stack (e.g. because
of mtu mismatch), then conntrack will flag packets as invalid, which in
turn breaks NAT.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-05-22 10:51:49 +02:00
Thomas Gleixner 09c434b8a0 treewide: Add SPDX license identifier for more missed files
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have MODULE_LICENSE("GPL*") inside which was used in the initial
   scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:45 +02:00
Taehee Yoo 26a302afbe netfilter: nf_flow_table: fix netdev refcnt leak
flow_offload_alloc() calls nf_route() to get a dst_entry. Internally,
nf_route() calls ip_route_output_key() that allocates a dst_entry and
holds it. So, a dst_entry should be released by dst_release() if
nf_route() is successful.
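
The reference accounting can be pictured with a toy counter. Everything
here is hypothetical, including the assumption that the flow entry takes
a reference of its own when it stores the dst_entry:

```c
#include <stddef.h>

/* Hypothetical dst entry with a bare reference counter. */
struct dst_sketch {
	int refcnt;
};

struct flow_sketch {
	struct dst_sketch *dst;
};

/* Models a successful route lookup: the result is returned with a reference held. */
struct dst_sketch *route_lookup_sketch(struct dst_sketch *d)
{
	d->refcnt++;
	return d;
}

void dst_release_sketch(struct dst_sketch *d)
{
	d->refcnt--;
}

void flow_setup_sketch(struct flow_sketch *flow, struct dst_sketch *route)
{
	struct dst_sketch *dst = route_lookup_sketch(route);

	dst->refcnt++;			/* the flow entry's own reference */
	flow->dst = dst;
	dst_release_sketch(dst);	/* drop the reference taken by the lookup */
}

void flow_teardown_sketch(struct flow_sketch *flow)
{
	dst_release_sketch(flow->dst);
	flow->dst = NULL;
}
```

If the release in flow_setup_sketch() is left out, the counter never
returns to zero after teardown, which is the kind of leak that keeps a
netdevice busy.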

Otherwise, netns exit routine cannot be finished and the following
message is printed:

[  257.490952] unregister_netdevice: waiting for lo to become free. Usage count = 1

Fixes: ac2a66665e ("netfilter: add generic flow table infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-30 13:55:00 +02:00
Pablo Neira Ayuso 270a8a297f netfilter: nft_flow_offload: add entry to flowtable after confirmation
This is fixing flow offload for UDP traffic where packets only follow
one single direction.

The flow_offload_fixup_tcp() mechanism works fine in case that the
offloaded entry remains in SYN_RECV state, given sequence tracking is
reset and that conntrack handles syn+ack packets as a retransmission, ie.

	sES + synack => sIG

for reply traffic.

Fixes: a3c90f7a23 ("netfilter: nf_tables: flow offload expression")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-04-30 13:48:23 +02:00
Henry Yen 2314e87974 netfilter: nft_flow_offload: fix checking method of conntrack helper
This patch uses nfct_help() to detect whether an established connection
needs conntrack helper instead of using test_bit(IPS_HELPER_BIT,
&ct->status).

The reason is that IPS_HELPER_BIT is only set when using explicit CT
target.

However, when a device enables the conntrack helper via the command
"echo 1 > /proc/sys/net/netfilter/nf_conntrack_helper", IPS_HELPER_BIT
does not change, so a check based on that bit fails to detect the
helper.

Signed-off-by: Henry Yen <henry.yen@mediatek.com>
Reviewed-by: Ryder Lee <ryder.lee@mediatek.com>
Tested-by: John Crispin <john@phrozen.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-01-14 12:50:59 +01:00
wenxu 10f4e76587 netfilter: nft_flow_offload: fix interaction with vrf slave device
In the forward chain, the iif is changed from slave device to master vrf
device. Thus, flow offload does not find a match on the lower slave
device.

This patch uses the cached route, ie. dst->dev, to update the iif and
oif fields in the flow entry.

After this patch, the following example works fine:

 # ip addr add dev eth0 1.1.1.1/24
 # ip addr add dev eth1 10.0.0.1/24
 # ip link add user1 type vrf table 1
 # ip l set user1 up
 # ip l set dev eth0 master user1
 # ip l set dev eth1 master user1

 # nft add table f
 # nft add flowtable f fb1 { hook ingress priority 0 \; devices = { eth0, eth1 } \; }
 # nft add chain f ftb-all {type filter hook forward priority 0 \; policy accept \; }
 # nft add rule f ftb-all ct zone 1 ip protocol tcp flow offload @fb1
 # nft add rule f ftb-all ct zone 1 ip protocol udp flow offload @fb1

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-01-11 00:55:37 +01:00
wenxu a799aea098 netfilter: nft_flow_offload: Fix reverse route lookup
Using the following example:

	client 1.1.1.7 ---> 2.2.2.7 which dnat to 10.0.0.7 server

The first reply packet (ie. syn+ack) uses an incorrect destination
address for the reverse route lookup since it uses:

	daddr = ct->tuplehash[!dir].tuple.dst.u3.ip;

which is 2.2.2.7 in the scenario that is described above, while this
should be:

	daddr = ct->tuplehash[dir].tuple.src.u3.ip;

that is 10.0.0.7.
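
The scenario can be replayed with a simplified model of the two
conntrack tuples. The types below are hypothetical and far smaller than
struct nf_conn. The syn+ack travels in the reply direction, so dir is
the reply direction here:

```c
#include <stdint.h>

/* Build a host-order IPv4 address from its four octets. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

enum dir_sketch { DIR_ORIGINAL = 0, DIR_REPLY = 1 };

struct tuple_sketch {
	uint32_t src;
	uint32_t dst;
};

/* Hypothetical conntrack entry: one tuple per direction. */
struct conn_sketch {
	struct tuple_sketch tuple[2];
};

/* Old lookup: destination of the opposite direction (the pre-DNAT address). */
uint32_t reverse_route_daddr_old(const struct conn_sketch *ct, enum dir_sketch dir)
{
	return ct->tuple[!dir].dst;
}

/* Fixed lookup: source of the packet's own direction (the real server). */
uint32_t reverse_route_daddr(const struct conn_sketch *ct, enum dir_sketch dir)
{
	return ct->tuple[dir].src;
}
```

With an original tuple of 1.1.1.7 -> 2.2.2.7 and a reply tuple of
10.0.0.7 -> 1.1.1.7, the old lookup yields 2.2.2.7 and the fixed one
yields 10.0.0.7.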

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-01-09 23:25:02 +01:00
Taehee Yoo 584eab291c netfilter: add missing error handling code for register functions
register_{netdevice/inetaddr/inet6addr}_notifier may return an error
value, this patch adds the code to handle these error paths.
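
A generic sketch of this kind of error handling, not the patch itself:
register in order and, on failure, unregister whatever was already
registered. The step table and the demo callbacks are hypothetical.

```c
#include <errno.h>

/* Hypothetical pair of register/unregister callbacks. */
struct notifier_step_sketch {
	int (*reg)(void);
	void (*unreg)(void);
};

/* Demo state and callbacks so the unwinding can be observed. */
int registered_count;

int reg_ok_sketch(void)
{
	registered_count++;
	return 0;
}

int reg_fail_sketch(void)
{
	return -ENOMEM;
}

void unreg_sketch(void)
{
	registered_count--;
}

/* Register every step, undoing the earlier ones in reverse order on error. */
int register_all_sketch(const struct notifier_step_sketch *steps, int n)
{
	int i, err;

	for (i = 0; i < n; i++) {
		err = steps[i].reg();
		if (err)
			goto unwind;
	}
	return 0;

unwind:
	while (--i >= 0)
		steps[i].unreg();
	return err;
}
```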

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-11-27 00:35:19 +01:00
Taehee Yoo 5f1be84aad netfilter: nf_flow_table: remove unnecessary parameter of nf_flow_table_cleanup()
The net parameter of nf_flow_table_cleanup() is not used, so it can be
removed.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-10-19 13:25:22 +02:00
Pablo Neira Ayuso cac20fcdf1 netfilter: nf_tables: simplify lookup functions
Replace the nf_tables_ prefix by nft_ and merge code into single lookup
function whenever possible. In many cases the long function names push
lines over the 80-character boundary. This saves us ~50 LoC.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-04-24 10:29:09 +02:00
Pablo Neira Ayuso c0ea1bcb39 netfilter: nft_flow_offload: move flowtable cleanup routines to nf_flow_table
Move the flowtable cleanup routines to nf_flow_table and expose the
nf_flow_table_cleanup() helper function.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-07 00:58:57 +01:00
Pablo Neira Ayuso 992cfc7c5d netfilter: nft_flow_offload: no need to flush entries on module removal
nft_flow_offload module removal does not require flushing existing
flowtables; it is valid to remove this module while keeping flowtables
around.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-02 18:26:43 +01:00
Pablo Neira Ayuso c7f0030b5b netfilter: nft_flow_offload: wait for garbage collector to run after cleanup
If a netdevice goes down, then flowtable entries are scheduled to be
removed. Wait for the garbage collector to have a chance to run so it can
delete them from the hashtable.

The flush call might sleep, so hold the nfnl mutex from
nft_flow_table_iterate() instead of rcu read side lock. The use of the
nfnl mutex is also implicitly fixing races between updates via nfnetlink
and netdevice event.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-02-02 18:26:42 +01:00
Pablo Neira Ayuso 36596dadf5 netfilter: nf_tables: add single table list for all families
Place all existing user defined tables in struct net *, instead of
having one list per family. This saves us from one level of indentation
in netlink dump functions.

Place a pointer to struct nft_af_info in struct nft_table temporarily, as
we still need this to put back the module reference counter on table
removal.

This patch comes in preparation for the removal of struct nft_af_info.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-10 15:32:08 +01:00
Pablo Neira Ayuso a3c90f7a23 netfilter: nf_tables: flow offload expression
Add new instruction for the nf_tables VM that allows us to specify what
flows are offloaded into a given flow table via name. This new
instruction creates the flow entry and adds it to the flow table.

Only established flows, ie. we have seen traffic in both directions, are
added to the flow table. You can still decide to offload entries at a
later stage via packet counting or checking the ct status in case you
want to offload assured conntracks.

This new extension depends on the conntrack subsystem.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-01-08 18:11:10 +01:00