Commit graph

685 commits

Author SHA1 Message Date
David S. Miller
d247b6ab3c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/Makefile
	net/ipv6/sysctl_net_ipv6.c

Two ipv6_table_template[] additions overlap, so the index
of the ipv6_table[x] assignments needed to be adjusted.

In the drivers/net/Makefile case, we've gotten rid of the
garbage whereby we had to list every single USB networking
driver in the top-level Makefile, there is just one
"USB_NETWORKING" that guards everything.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 18:46:26 -07:00
Thomas Graf
0dc1362562 netfilter: nf_tables: Avoid duplicate call to nft_data_uninit() for same key
nft_del_setelem() currently calls nft_data_uninit() twice on the same
key. Once to release the key which is guaranteed to be NFT_DATA_VALUE
and a second time in the error path to which it falls through.

The second call has been harmless so far though because the type
passed is always NFT_DATA_VALUE which is currently a no-op.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-08-01 18:14:49 +02:00
Pablo Neira Ayuso
7d5570ca89 netfilter: nf_tables: check for unset NFTA_SET_ELEM_LIST_ELEMENTS attribute
Otherwise, the kernel oopses in nla_for_each_nested when iterating over
the unset attribute NFTA_SET_ELEM_LIST_ELEMENTS in the
nf_tables_{new,del}setelem() path.

netlink: 65524 bytes leftover after parsing attributes in process `nft'.
[...]
Oops: 0000 [#1] SMP
[...]
CPU: 2 PID: 6287 Comm: nft Not tainted 3.16.0-rc2+ #169
RIP: 0010:[<ffffffffa0526e61>]  [<ffffffffa0526e61>] nf_tables_newsetelem+0x82/0xec [nf_tables]
[...]
Call Trace:
 [<ffffffffa05178c4>] nfnetlink_rcv+0x2e7/0x3d7 [nfnetlink]
 [<ffffffffa0517939>] ? nfnetlink_rcv+0x35c/0x3d7 [nfnetlink]
 [<ffffffff8137d300>] netlink_unicast+0xf8/0x17a
 [<ffffffff8137d6a5>] netlink_sendmsg+0x323/0x351
[...]

Fix this by returning -EINVAL if this attribute is not set, which
doesn't make sense at all since those commands are there to add and to
delete elements from the set.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-07-31 21:11:43 +02:00
Pablo Neira Ayuso
5b96af7713 netfilter: nf_tables: simplify set dump through netlink
This patch uses the cb->data pointer that allows us to store the
context when dumping the set list. Thus, we don't need to parse the
original netlink message containing the dump request for each recvmsg()
call when dumping the set list. The different function flavours
depending on the dump criteria has been also merged into one single
generic function. This saves us ~100 lines of code.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-07-22 12:08:54 +02:00
Eric Dumazet
ce355e209f netfilter: nf_tables: 64bit stats need some extra synchronization
Use generic u64_stats_sync infrastructure to get proper 64bit stats,
even on 32bit arches, at no extra cost for 64bit arches.

Without this fix, 32bit arches can have some wrong counters at the time
the carry is propagated into upper word.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-07-14 12:00:17 +02:00
Pablo Neira Ayuso
38e029f14a netfilter: nf_tables: set NLM_F_DUMP_INTR if netlink dumping is stale
An updater may interfer with the dumping of any of the object lists.
Fix this by using a per-net generation counter and use the
nl_dump_check_consistent() interface so the NLM_F_DUMP_INTR flag is set
to notify userspace that it has to restart the dump since an updater
has interfered.

This patch also replaces the existing consistency checking code in the
rule dumping path since it is broken. Basically, the value that the
dump callback returns is not propagated to userspace via
netlink_dump_start().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-07-14 12:00:16 +02:00
Pablo Neira Ayuso
e688a7f8c6 netfilter: nf_tables: safe RCU iteration on list when dumping
The dump operation through netlink is not protected by the nfnl_lock.
Thus, a reader process can be dumping any of the existing object
lists while another process can be updating the list content.

This patch resolves this situation by protecting all the object
lists with RCU in the netlink dump path which is the reader side.
The updater path is already protected via nfnl_lock, so use list
manipulation RCU-safe operations.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-07-14 11:20:45 +02:00
Pablo Neira Ayuso
63283dd21e netfilter: nf_tables: skip transaction if no update flags in tables
Skip transaction handling for table updates with no changes in
the flags. This fixes a crash when passing the table flag with all
bits unset.

Reported-by: Ana Rey <anarey@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-30 11:44:24 +02:00
Pablo Neira Ayuso
6403d96254 netfilter: nf_tables: indicate family when dumping set elements
Set the nfnetlink header that indicates the family of this element.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-16 13:08:09 +02:00
Pablo Neira Ayuso
ac904ac835 netfilter: nf_tables: fix wrong type in transaction when replacing rules
In b380e5c ("netfilter: nf_tables: add message type to transactions"),
I used the wrong message type in the rule replacement case. The rule
that is replaced needs to be handled as a deleted rule.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-16 13:07:58 +02:00
Pablo Neira Ayuso
ac34b86197 netfilter: nf_tables: decrement chain use counter when replacing rules
Thus, the chain use counter remains with the same value after the
rule replacement.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-16 13:07:50 +02:00
Pablo Neira Ayuso
a0a7379e16 netfilter: nf_tables: use u32 for chain use counter
Since 4fefee5 ("netfilter: nf_tables: allow to delete several objects
from a batch"), every new rule bumps the chain use counter. However,
this is limited to 16 bits, which means that it will overrun after
2^16 rules.

Use a u32 chain counter and check for overflows (just like we do for
table objects).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-16 13:07:44 +02:00
Pablo Neira Ayuso
5bc5c30765 netfilter: nf_tables: use RCU-safe list insertion when replacing rules
The patch 5e94846 ("netfilter: nf_tables: add insert operation") did
not include RCU-safe list insertion when replacing rules.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-16 13:07:29 +02:00
Pablo Neira Ayuso
31f8441c32 netfilter: nf_tables: atomic allocation in set notifications from rcu callback
Use GFP_ATOMIC allocations when sending removal notifications of
anonymous sets from rcu callback context. Sleeping in that context
is illegal.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-02 10:54:38 +02:00
Pablo Neira Ayuso
4fefee570d netfilter: nf_tables: allow to delete several objects from a batch
Three changes to allow the deletion of several objects with dependencies
in one transaction, they are:

1) Introduce speculative counter increment/decrement that is undone in
   the abort path if required, thus we avoid hitting -EBUSY when deleting
   the chain. The counter updates are reverted in the abort path.

2) Increment/decrement table/chain use counter for each set/rule. We need
   this to fully rely on the use counters instead of the list content,
   eg. !list_empty(&chain->rules) which evaluate true in the middle of the
   transaction.

3) Decrement table use counter when an anonymous set is bound to the
   rule in the commit path. This avoids hitting -EBUSY when deleting
   the table that contains anonymous sets. The anonymous sets are released
   in the nf_tables_rule_destroy path. This should not be a problem since
   the rule already bumped the use counter of the chain, so the bound
   anonymous set reflects dependencies through the rule object, which
   already increases the chain use counter.

So the general assumption after this patch is that the use counters are
bumped by direct object dependencies.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-02 10:54:35 +02:00
Pablo Neira Ayuso
a1cee076f4 netfilter: nf_tables: release objects in reverse order in the abort path
The patch c7c32e7 ("netfilter: nf_tables: defer all object release via
rcu") indicates that we always release deleted objects in the reverse
order, but that is only needed in the abort path. These are the two
possible scenarios when releasing objects:

1) Deletion scenario in the commit path: no need to release objects in
the reverse order since userspace already ensures that dependencies are
fulfilled), ie. userspace tells us to delete rule -> ... -> rule ->
chain -> table. In this case, we have to release the objects in the
*same order* as userspace provided.

2) Deletion scenario in the abort path: we have to iterate in the reverse
order to undo what it cannot be added, ie. userspace sent us a batch
that includes: table -> chain -> rule -> ... -> rule, and that needs to
be partially undone. In this case, we have to release objects in the
reverse order to ensure that the set and chain objects point to valid
rule and table objects.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-02 10:54:28 +02:00
Pablo Neira Ayuso
46bbafceb2 netfilter: nf_tables: fix wrong transaction ordering in set elements
The transaction needs to be placed at the end of the commit list,
otherwise event notifications are reordered and we may crash when
releasing object via call_rcu.

This problem was introduced in 60319eb ("netfilter: nf_tables: use new
transaction infrastructure to handle elements").

Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-06-02 10:54:25 +02:00
David S. Miller
8af750d739 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables
Pablo Neira Ayuso says:

====================
Netfilter/nftables updates for net-next

The following patchset contains Netfilter/nftables updates for net-next,
most relevantly they are:

1) Add set element update notification via netlink, from Arturo Borrero.

2) Put all object updates in one single message batch that is sent to
   kernel-space. Before this patch only rules where included in the batch.
   This series also introduces the generic transaction infrastructure so
   updates to all objects (tables, chains, rules and sets) are applied in
   an all-or-nothing fashion, these series from me.

3) Defer release of objects via call_rcu to reduce the time required to
   commit changes. The assumption is that all objects are destroyed in
   reverse order to ensure that dependencies betweem them are fulfilled
   (ie. rules and sets are destroyed first, then chains, and finally
   tables).

4) Allow to match by bridge port name, from Tomasz Bursztyka. This series
   include two patches to prepare this new feature.

5) Implement the proper set selection based on the characteristics of the
   data. The new infrastructure also allows you to specify your preferences
   in terms of memory and computational complexity so the underlying set
   type is also selected according to your needs, from Patrick McHardy.

6) Several cleanup patches for nft expressions, including one minor possible
   compilation breakage due to missing mark support, also from Patrick.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-22 12:06:23 -04:00
Pablo Neira Ayuso
c7c32e72cb netfilter: nf_tables: defer all object release via rcu
Now that all objects are released in the reverse order via the
transaction infrastructure, we can enqueue the release via
call_rcu to save one synchronize_rcu. For small rule-sets loaded
via nft -f, it now takes around 50ms less here.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso
128ad3322b netfilter: nf_tables: remove skb and nlh from context structure
Instead of caching the original skbuff that contains the netlink
messages, this stores the netlink message sequence number, the
netlink portID and the report flag. This helps to prepare the
introduction of the object release via call_rcu.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:13 +02:00
Pablo Neira Ayuso
35151d840c netfilter: nf_tables: simplify nf_tables_*_notify
Now that all these function are called from the commit path, we can
pass the context structure to reduce the amount of parameters in all
of the nf_tables_*_notify functions. This patch also removes unneeded
branches to check for skb, nlh and net that should be always set in
the context structure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso
60319eb1ca netfilter: nf_tables: use new transaction infrastructure to handle elements
Leave the set content in consistent state if we fail to load the
batch. Use the new generic transaction infrastructure to achieve
this.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso
55dd6f9307 netfilter: nf_tables: use new transaction infrastructure to handle table
This patch speeds up rule-set updates and it also provides a way
to revert updates and leave things in consistent state in case that
the batch needs to be aborted.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:12 +02:00
Pablo Neira Ayuso
e1aaca93ee netfilter: nf_tables: pass context to nf_tables_updtable()
So nf_tables_uptable() only takes one single parameter.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
f75edf5e9c netfilter: nf_tables: disabling table hooks always succeeds
nf_tables_table_disable() always succeeds, make this function void.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
91c7b38dc9 netfilter: nf_tables: use new transaction infrastructure to handle chain
This patch speeds up rule-set updates and it also introduces a way to
revert chain updates if the batch is aborted. The idea is to store the
changes in the transaction to apply that in the commit step.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
ff3cd7b3c9 netfilter: nf_tables: refactor chain statistic routines
Add new routines to encapsulate chain statistics allocation and
replacement.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:11 +02:00
Pablo Neira Ayuso
958bee14d0 netfilter: nf_tables: use new transaction infrastructure to handle sets
This patch reworks the nf_tables API so set updates are included in
the same batch that contains rule updates. This speeds up rule-set
updates since we skip a dialog of four messages between kernel and
user-space (two on each direction), from:

 1) create the set and send netlink message to the kernel
 2) process the response from the kernel that contains the allocated name.
 3) add the set elements and send netlink message to the kernel.
 4) process the response from the kernel (to check for errors).

To:

 1) add the set to the batch.
 2) add the set elements to the batch.
 3) add the rule that points to the set.
 4) send batch to the kernel.

This also introduces an internal set ID (NFTA_SET_ID) that is unique
in the batch so set elements and rules can refer to new sets.

Backward compatibility has been only retained in userspace, this
means that new nft versions can talk to the kernel both in the new
and the old fashion.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
b380e5c733 netfilter: nf_tables: add message type to transactions
The patch adds message type to the transaction to simplify the
commit the and abort routines. Yet another step forward in the
generalisation of the transaction infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
37082f930b netfilter: nf_tables: relocate commit and abort routines in the source file
Move the commit and abort routines to the bottom of the source code
file. This change is required by the follow up patches that add the
set, chain and table transaction support.

This patch is just a cleanup to access several functions without
having to declare their prototypes.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
1081d11b08 netfilter: nf_tables: generalise transaction infrastructure
This patch generalises the existing rule transaction infrastructure
so it can be used to handle set, table and chain object transactions
as well. The transaction provides a data area that stores private
information depending on the transaction type.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:10 +02:00
Pablo Neira Ayuso
7c95f6d866 netfilter: nf_tables: deconstify table and chain in context structure
The new transaction infrastructure updates the family, table and chain
objects in the context structure, so let's deconstify them. While at it,
move the context structure initialization routine to the top of the
source file as it will be also used from the table and chain routines.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-05-19 12:06:09 +02:00
Pablo Neira
4c1f7818e4 netfilter: nf_tables: relax string validation of NFTA_CHAIN_TYPE
Use NLA_STRING for consistency with other string attributes in
nf_tables.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-28 16:54:15 +02:00
Tomasz Bursztyka
758dbcecf1 netfilter: nf_tables: Stack expression type depending on their family
To ensure family tight expression gets selected in priority to family
agnostic ones.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-23 13:51:05 +02:00
Patrick McHardy
60eb18943b netfilter: nf_tables: handle more than 8 * PAGE_SIZE set name allocations
We currently have a limit of 8 * PAGE_SIZE anonymous sets. Lift that limit
by continuing the scan if the entire page is exhausted.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-14 10:31:20 +02:00
Pablo Neira Ayuso
2fec6bb6f4 netfilter: nf_tables: fix wrong format in request_module()
The intended format in request_module is %.*s instead of %*.s.

Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:50 +02:00
Pablo Neira Ayuso
a9bdd83656 netfilter: nf_tables: set names cannot be larger than 15 bytes
Currently, nf_tables trims off the set name if it exceeeds 15
bytes, so explicitly reject set names that are too large.

Reported-by: Giuseppe Longo <giuseppelng@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-03 23:52:44 +02:00
Arturo Borrero
d60ce62fb5 netfilter: nf_tables: add set_elem notifications
This patch adds set_elems notifications. When a set_elem is
added/deleted, all listening peers in userspace will receive the
corresponding notification.

Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@gnumonks.org>
2014-04-03 12:22:25 +02:00
Patrick McHardy
c50b960ccc netfilter: nf_tables: implement proper set selection
The current set selection simply choses the first set type that provides
the requested features, which always results in the rbtree being chosen
by virtue of being the first set in the list.

What we actually want to do is choose the implementation that can provide
the requested features and is optimal from either a performance or memory
perspective depending on the characteristics of the elements and the
preferences specified by the user.

The elements are not known when creating a set. Even if we would provide
them for anonymous (literal) sets, we'd still have standalone sets where
the elements are not known in advance. We therefore need an abstract
description of the data charcteristics.

The kernel already knows the size of the key, this patch starts by
introducing a nested set description which so far contains only the maximum
amount of elements. Based on this the set implementations are changed to
provide an estimate of the required amount of memory and the lookup
complexity class.

The set ops have a new callback ->estimate() that is invoked during set
selection. It receives a structure containing the attributes known to the
kernel and is supposed to populate a struct nft_set_estimate with the
complexity class and, in case the size is known, the complete amount of
memory required, or the amount of memory required per element otherwise.

Based on the policy specified by the user (performance/memory, defaulting
to performance) the kernel will then select the best suited implementation.

Even if the set implementation would allow to add more than the specified
maximum amount of elements, they are enforced since new implementations
might not be able to add more than maximum based on which they were
selected.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-04-02 21:32:57 +02:00
Patrick McHardy
ab9da5c19f netfilter: nf_tables: restore notifications for anonymous set destruction
Since we have the context available again, we can restore notifications
for destruction of anonymous sets.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:18 +01:00
Patrick McHardy
62472bcefb netfilter: nf_tables: restore context for expression destructors
In order to fix set destruction notifications and get rid of unnecessary
members in private data structures, pass the context to expressions'
destructor functions again.

In order to do so, replace various members in the nft_rule_trans structure
by the full context.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:17 +01:00
Patrick McHardy
a36e901cf6 netfilter: nf_tables: clean up nf_tables_trans_add() argument order
The context argument logically comes first, and this is what every other
function dealing with contexts does.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-03-08 12:35:16 +01:00
Pablo Neira Ayuso
0768b3b3d2 netfilter: nf_tables: add optional user data area to rules
This allows us to store user comment strings, but it could be also
used to store any kind of information that the user application needs
to link to the rule.

Scratch 8 bits for the new ulen field that indicates the length the
user data area. 4 bits from the handle (so it's 42 bits long, according
to Patrick, it would last 139 years with 1000 new rules per second)
and 4 bits from dlen (so the expression data area is 4K, which seems
sufficient by now even considering the compatibility layer).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
2014-02-27 16:56:00 +01:00
Patrick McHardy
e0abdadcc6 netfilter: nf_tables: accept QUEUE/DROP verdict parameters
Allow userspace to specify the queue number or the errno code for QUEUE
and DROP verdicts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-25 11:29:26 +01:00
Patrick McHardy
67a8fc27cc netfilter: nf_tables: add nft_dereference() macro
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-25 11:29:23 +01:00
Pablo Neira Ayuso
62f9c8b40d netfilter: nf_tables: fix loop checking with end interval elements
Fix access to uninitialized data for end interval elements. The
element data part is uninitialized in interval end elements.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-07 17:21:45 +01:00
Pablo Neira Ayuso
bd7fc645da netfilter: nf_tables: do not allow NFT_SET_ELEM_INTERVAL_END flag and data
This combination is not allowed since end interval elements cannot
contain data.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
2014-02-07 14:21:49 +01:00
Pablo Neira Ayuso
0165d9325d netfilter: nf_tables: fix racy rule deletion
We may lost race if we flush the rule-set (which happens asynchronously
via call_rcu) and we try to remove the table (that userspace assumes
to be empty).

Fix this by recovering synchronous rule and chain deletion. This was
introduced time ago before we had no batch support, and synchronous
rule deletion performance was not good. Now that we have the batch
support, we can just postpone the purge of old rule in a second step
in the commit phase. All object deletions are synchronous after this
patch.

As a side effect, we save memory as we don't need rcu_head per rule
anymore.

Cc: Patrick McHardy <kaber@trash.net>
Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 11:46:06 +01:00
Patrick McHardy
64d46806b6 netfilter: nf_tables: add AF specific expression support
For the reject module, we need to add AF-specific implementations to
get rid of incorrect module dependencies. Try to load an AF-specific
module first and fall back to generic modules.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 00:05:36 +01:00
Patrick McHardy
ec2c993568 netfilter: nf_tables: fix potential oops when dumping sets
Commit c9c8e48597 (netfilter: nf_tables: dump sets in all existing families)
changed nft_ctx_init_from_setattr() to only look up the address family if it
is not NFPROTO_UNSPEC. However if it is NFPROTO_UNSPEC and a table attribute
is given, nftables_afinfo_lookup() will dereference the NULL afi pointer.

Fix by checking for non-NULL afi and also move a check added by that commit
to the proper position.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-06 00:04:15 +01:00
Patrick McHardy
53b70287dd netfilter: nf_tables: fix overrun in nf_tables_set_alloc_name()
The map that is used to allocate anonymous sets is indeed
BITS_PER_BYTE * PAGE_SIZE long.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-05 17:46:07 +01:00
Patrick McHardy
3dd7279fb6 netfilter: nf_tables: fix oops when deleting a chain with references
The following commands trigger an oops:

 # nft -i
 nft> add table filter
 nft> add chain filter input { type filter hook input priority 0; }
 nft> add chain filter test
 nft> add rule filter input jump test
 nft> delete chain filter test

We need to check the chain use counter before allowing destruction since
we might have references from sets or jump rules.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=69341
Reported-by: Matthew Ife <deleriux1@gmail.com>
Tested-by: Matthew Ife <deleriux1@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-02-05 13:16:17 +01:00
Pablo Neira Ayuso
8f46df184c netfilter: nf_tables: fix missing byteorder conversion in policy
When fetching the policy attribute, the byteorder conversion was
missing, breaking the chain policy setting.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-10 18:26:13 +01:00
Patrick McHardy
44a6f0df03 netfilter: nf_tables: prohibit deletion of a table with existing sets
We currently leak the set memory when deleting a table that still has
sets in it. Return EBUSY when attempting to delete a table with sets.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:16 +01:00
Patrick McHardy
7047f9d052 netfilter: nf_tables: take AF module reference when creating a table
The table refers to data of the AF module, so we need to make sure the
module isn't unloaded while the table exists.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:16 +01:00
Patrick McHardy
c5c1f975ad netfilter: nf_tables: perform flags validation before table allocation
Simplifies error handling. Additionally use the correct type u32 for the
host byte order flags value.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:15 +01:00
Patrick McHardy
fa2c1de0bb netfilter: nf_tables: minor nf_chain_type cleanups
Minor nf_chain_type cleanups:

- reorder struct to plug a hoe
- rename struct module member to "owner" for consistency
- rename nf_hookfn array to "hooks" for consistency
- reorder initializers for better readability

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:15 +01:00
Patrick McHardy
2a37d755b8 netfilter: nf_tables: constify chain type definitions and pointers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:15 +01:00
Patrick McHardy
93b0806f00 netfilter: nf_tables: replay request after dropping locks to load chain type
To avoid races, we need to replay to request after dropping the nfnl_mutex
to auto-load the chain type module.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:14 +01:00
Patrick McHardy
baae3e62f3 netfilter: nf_tables: fix chain type module reference handling
The chain type module reference handling makes no sense at all: we take
a reference immediately when the module is registered, preventing the
module from ever being unloaded.

Fix by taking a reference when we're actually creating a chain of the
chain type and release the reference when destroying the chain.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:14 +01:00
Patrick McHardy
758206760c netfilter: nf_tables: fix check for table overflow
The table use counter is only increased for new chains, so move the check
to the correct position.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:13 +01:00
Patrick McHardy
4401a86200 netfilter: nf_tables: restore chain change atomicity
Chain counter validation is performed after the chain policy has
potentially been changed. Move counter validation/setting before
changing of the chain policy to fix this.

Additionally fix a memory leak if chain counter allocation fails
for new chains, remove an unnecessary free_percpu() and move
counter allocation for new chains

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:13 +01:00
Patrick McHardy
57de2a0cd9 netfilter: nf_tables: split chain policy validation from actually setting it
Currently nf_tables_newchain() atomicity is broken because of having
validation of some netlink attributes performed after changing attributes
of the chain. The chain policy is (currently) fine, but split it up as
preparation for the following fixes and to avoid future mistakes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-09 20:17:13 +01:00
Patrick McHardy
115a60b173 netfilter: nf_tables: add support for multi family tables
Add support to register chains to multiple hooks for different address
families for mixed IPv4/IPv6 tables.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2014-01-07 23:55:46 +01:00
Patrick McHardy
3b088c4bc0 netfilter: nf_tables: make chain types override the default AF functions
Currently the AF-specific hook functions override the chain-type specific
hook functions. That doesn't make too much sense since the chain types
are a special case of the AF-specific hooks.

Make the AF-specific hook functions the default and make the optional
chain type hooks override them.

As a side effect, the necessary code restructuring reduces the code size,
f.i. in case of nf_tables_ipv4.o:

  nf_tables_ipv4_init_net   |  -24
  nft_do_chain_ipv4         | -113
 2 functions changed, 137 bytes removed, diff: -137

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-07 23:50:43 +01:00
David S. Miller
56a4342dfe Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
	net/ipv6/ip6_tunnel.c
	net/ipv6/ip6_vti.c

ipv6 tunnel statistic bug fixes conflicting with consolidation into
generic sw per-cpu net stats.

qlogic conflict between queue counting bug fix and the addition
of multiple MAC address support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 17:37:45 -05:00
David S. Miller
9aa28f2b71 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftables
Pablo Neira Ayuso says: <pablo@netfilter.org>

====================
nftables updates for net-next

The following patchset contains nftables updates for your net-next tree,
they are:

* Add set operation to the meta expression by means of the select_ops()
  infrastructure, this allows us to set the packet mark among other things.
  From Arturo Borrero Gonzalez.

* Fix wrong format in sscanf in nf_tables_set_alloc_name(), from Daniel
  Borkmann.

* Add new queue expression to nf_tables. These comes with two previous patches
  to prepare this new feature, one to add mask in nf_tables_core to
  evaluate the queue verdict appropriately and another to refactor common
  code with xt_NFQUEUE, from Eric Leblond.

* Do not hide nftables from Kconfig if nfnetlink is not enabled, also from
  Eric Leblond.

* Add the reject expression to nf_tables, this adds the missing TCP RST
  support. It comes with an initial patch to refactor common code with
  xt_NFQUEUE, again from Eric Leblond.

* Remove an unused variable assignment in nf_tables_dump_set(), from Michal
  Nazarewicz.

* Remove the nft_meta_target code, now that Arturo added the set operation
  to the meta expression, from me.

* Add help information for nf_tables to Kconfig, also from me.

* Allow to dump all sets by specifying NFPROTO_UNSPEC, similar feature is
  available to other nf_tables objects, requested by Arturo, from me.

* Expose the table usage counter, so we can know how many chains are using
  this table without dumping the list of chains, from Tomasz Bursztyka.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-06 13:29:30 -05:00
Pablo Neira Ayuso
c9c8e48597 netfilter: nf_tables: dump sets in all existing families
This patch allows you to dump all sets available in all of
the registered families. This allows you to use NFPROTO_UNSPEC
to dump all existing sets, similarly to other existing table,
chain and rule operations.

This patch is based on original patch from Arturo Borrero
González.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-04 00:23:11 +01:00
Michal Nazarewicz
720e0dfa3a netfilter: nf_tables: remove unused variable in nf_tables_dump_set()
The nfmsg variable is not used (except in sizeof operator which does
not care about its value) between the first and second time it is
assigned the value.  Furthermore, nlmsg_data has no side effects, so
the assignment can be safely removed.

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-03 22:51:14 +01:00
Daniel Borkmann
1466291790 netfilter: nf_tables: fix type in parsing in nf_tables_set_alloc_name()
In nf_tables_set_alloc_name(), we are trying to find a new, unused
name for our new set and interate through the list of present sets.
As far as I can see, we're using format string %d to parse already
present names in order to mark their presence in a bitmap, so that
we can later on find the first 0 in that map to assign the new set
name to. We should rather use a temporary variable of type int to
store the result of sscanf() to, and for making sanity checks on.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-01-03 22:50:59 +01:00
Pablo Neira Ayuso
2ee0d3c80f netfilter: nf_tables: fix wrong datatype in nft_validate_data_load()
This patch fixes dictionary mappings, eg.

 add rule ip filter input meta dnat set tcp dport map { 22 => 1.1.1.1, 23 => 2.2.2.2 }

The kernel was returning -EINVAL in nft_validate_data_load() since
the type of the set element data that is passed was the real userspace
datatype instead of NFT_DATA_VALUE.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-28 22:32:28 +01:00
Pablo Neira Ayuso
d201297561 netfilter: nf_tables: fix oops when updating table with user chains
This patch fixes a crash while trying to deactivate a table that
contains user chains. You can reproduce it via:

% nft add table table1
% nft add chain table1 chain1
% nft-table-upd ip table1 dormant

[  253.021026] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[  253.021114] IP: [<ffffffff8134cebd>] nf_register_hook+0x35/0x6f
[  253.021167] PGD 30fa5067 PUD 30fa2067 PMD 0
[  253.021208] Oops: 0000 [#1] SMP
[...]
[  253.023305] Call Trace:
[  253.023331]  [<ffffffffa0885020>] nf_tables_newtable+0x11c/0x258 [nf_tables]
[  253.023385]  [<ffffffffa0878592>] nfnetlink_rcv_msg+0x1f4/0x226 [nfnetlink]
[  253.023438]  [<ffffffffa0878418>] ? nfnetlink_rcv_msg+0x7a/0x226 [nfnetlink]
[  253.023491]  [<ffffffffa087839e>] ? nfnetlink_bind+0x45/0x45 [nfnetlink]
[  253.023542]  [<ffffffff8134b47e>] netlink_rcv_skb+0x3c/0x88
[  253.023586]  [<ffffffffa0878973>] nfnetlink_rcv+0x3af/0x3e4 [nfnetlink]
[  253.023638]  [<ffffffff813fb0d4>] ? _raw_read_unlock+0x22/0x34
[  253.023683]  [<ffffffff8134af17>] netlink_unicast+0xe2/0x161
[  253.023727]  [<ffffffff8134b29a>] netlink_sendmsg+0x304/0x332
[  253.023773]  [<ffffffff8130d250>] __sock_sendmsg_nosec+0x25/0x27
[  253.023820]  [<ffffffff8130fb93>] sock_sendmsg+0x5a/0x7b
[  253.023861]  [<ffffffff8130d5d5>] ? copy_from_user+0x2a/0x2c
[  253.023905]  [<ffffffff8131066f>] ? move_addr_to_kernel+0x35/0x60
[  253.023952]  [<ffffffff813107b3>] SYSC_sendto+0x119/0x15c
[  253.023995]  [<ffffffff81401107>] ? sysret_check+0x1b/0x56
[  253.024039]  [<ffffffff8108dc30>] ? trace_hardirqs_on_caller+0x140/0x1db
[  253.024090]  [<ffffffff8120164e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  253.024141]  [<ffffffff81310caf>] SyS_sendto+0x9/0xb
[  253.026219]  [<ffffffff814010e2>] system_call_fastpath+0x16/0x1b

Reported-by: Alex Wei <alex.kern.mentor@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-28 12:18:16 +01:00
Pablo Neira Ayuso
e38195bf32 netfilter: nf_tables: fix dumping with large number of sets
If not table name is specified, the dumping of the existing sets
may be incomplete with a sufficiently large number of sets and
tables. This patch fixes missing reset of the cursors after
finding the location of the last object that has been included
in the previous multi-part message.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-28 12:14:42 +01:00
Tomasz Bursztyka
d8bcc768c8 netfilter: nf_tables: Expose the table usage counter via netlink
Userspace can therefore know whether a table is in use or not, and
by how many chains. Suggested by Pablo Neira Ayuso.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-17 14:28:05 +01:00
Pablo Neira Ayuso
cf9dc09d09 netfilter: nf_tables: fix missing rules flushing per table
This patch allows you to atomically remove all rules stored in
a table via the NFT_MSG_DELRULE command. You only need to indicate
the specific table and no chain to flush all rules stored in that
table.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-07 22:55:48 +01:00
Pablo Neira Ayuso
b5bc89bfa0 netfilter: nf_tables: add trace support
This patch adds support for tracing the packet travel through
the ruleset, in a similar fashion to x_tables.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 18:01:02 +02:00
Pablo Neira Ayuso
0628b123c9 netfilter: nfnetlink: add batch support and use it from nf_tables
This patch adds a batch support to nfnetlink. Basically, it adds
two new control messages:

* NFNL_MSG_BATCH_BEGIN, that indicates the beginning of a batch,
  the nfgenmsg->res_id indicates the nfnetlink subsystem ID.

* NFNL_MSG_BATCH_END, that results in the invocation of the
  ss->commit callback function. If not specified or an error
  ocurred in the batch, the ss->abort function is invoked
  instead.

The end message represents the commit operation in nftables, the
lack of end message results in an abort. This patch also adds the
.call_batch function that is only called from the batch receival
path.

This patch adds atomic rule updates and dumps based on
bitmask generations. This allows to atomically commit a set of
rule-set updates incrementally without altering the internal
state of existing nf_tables expressions/matches/targets.

The idea consists of using a generation cursor of 1 bit and
a bitmask of 2 bits per rule. Assuming the gencursor is 0,
then the genmask (expressed as a bitmask) can be interpreted
as:

00 active in the present, will be active in the next generation.
01 inactive in the present, will be active in the next generation.
10 active in the present, will be deleted in the next generation.
 ^
 gencursor

Once you invoke the transition to the next generation, the global
gencursor is updated:

00 active in the present, will be active in the next generation.
01 active in the present, needs to zero its future, it becomes 00.
10 inactive in the present, delete now.
^
gencursor

If a dump is in progress and nf_tables enters a new generation,
the dump will stop and return -EBUSY to let userspace know that
it has to retry again. In order to invalidate dumps, a global
genctr counter is increased everytime nf_tables enters a new
generation.

This new operation can be used from the user-space utility
that controls the firewall, eg.

nft -f restore

The rule updates contained in `file' will be applied atomically.

cat file
-----
add filter INPUT ip saddr 1.1.1.1 counter accept #1
del filter INPUT ip daddr 2.2.2.2 counter drop   #2
-EOF-

Note that the rule 1 will be inactive until the transition to the
next generation, the rule 2 will be evicted in the next generation.

There is a penalty during the rule update due to the branch
misprediction in the packet matching framework. But that should be
quickly resolved once the iteration over the commit list that
contain rules that require updates is finished.

Event notification happens once the rule-set update has been
committed. So we skip notifications is case the rule-set update
is aborted, which can happen in case that the rule-set is tested
to apply correctly.

This patch squashed the following patches from Pablo:

* nf_tables: atomic rule updates and dumps
* nf_tables: get rid of per rule list_head for commits
* nf_tables: use per netns commit list
* nfnetlink: add batch support and use it from nf_tables
* nf_tables: all rule updates are transactional
* nf_tables: attach replacement rule after stale one
* nf_tables: do not allow deletion/replacement of stale rules
* nf_tables: remove unused NFTA_RULE_FLAGS

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 18:01:01 +02:00
Eric Leblond
5e94846686 netfilter: nf_tables: add insert operation
This patch adds a new rule attribute NFTA_RULE_POSITION which is
used to store the position of a rule relatively to the others.
By providing the create command and specifying the position, the
rule is inserted after the rule with the handle equal to the
provided position.

Regarding notification, the position attribute specifies the
handle of the previous rule to make sure we don't point to any
stale rule in notifications coming from the commit path.

This patch includes the following fix from Pablo:

* nf_tables: fix rule deletion event reporting

Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 18:01:00 +02:00
Pablo Neira Ayuso
99633ab29b netfilter: nf_tables: complete net namespace support
Register family per netnamespace to ensure that sets are
only visible in its approapriate namespace.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 18:00:59 +02:00
Pablo Neira Ayuso
9ddf632357 netfilter: nf_tables: add support for dormant tables
This patch allows you to temporarily disable an entire table.
You can change the state of a dormant table via NFT_MSG_NEWTABLE
messages. Using this operation you can wake up a table, so their
chains are registered.

This provides atomicity at chain level. Thus, the rule-set of one
chain is applied at once, avoiding any possible intermediate state
in every chain. Still, the chains that belongs to a table are
registered consecutively. This also allows you to have inactive
tables in the kernel.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 18:00:57 +02:00
Pablo Neira Ayuso
0ca743a559 netfilter: nf_tables: add compatibility layer for x_tables
This patch adds the x_tables compatibility layer. This allows you
to use existing x_tables matches and targets from nf_tables.

This compatibility later allows us to use existing matches/targets
for features that are still missing in nf_tables. We can progressively
replace them with native nf_tables extensions. It also provides the
userspace compatibility software that allows you to express the
rule-set using the iptables syntax but using the nf_tables kernel
components.

In order to get this compatibility layer working, I've done the
following things:

* add NFNL_SUBSYS_NFT_COMPAT: this new nfnetlink subsystem is used
to query the x_tables match/target revision, so we don't need to
use the native x_table getsockopt interface.

* emulate xt structures: this required extending the struct nft_pktinfo
to include the fragment offset, which is already obtained from
ip[6]_tables and that is used by some matches/targets.

* add support for default policy to base chains, required to emulate
  x_tables.

* add NFTA_CHAIN_USE attribute to obtain the number of references to
  chains, required by x_tables emulation.

* add chain packet/byte counters using per-cpu.

* support 32-64 bits compat.

For historical reasons, this patch includes the following patches
that were posted in the netfilter-devel mailing list.

From Pablo Neira Ayuso:
* nf_tables: add default policy to base chains
* netfilter: nf_tables: add NFTA_CHAIN_USE attribute
* nf_tables: nft_compat: private data of target and matches in contiguous area
* nf_tables: validate hooks for compat match/target
* nf_tables: nft_compat: release cached matches/targets
* nf_tables: x_tables support as a compile time option
* nf_tables: fix alias for xtables over nftables module
* nf_tables: add packet and byte counters per chain
* nf_tables: fix per-chain counter stats if no counters are passed
* nf_tables: don't bump chain stats
* nf_tables: add protocol and flags for xtables over nf_tables
* nf_tables: add ip[6]t_entry emulation
* nf_tables: move specific layer 3 compat code to nf_tables_ipv[4|6]
* nf_tables: support 32bits-64bits x_tables compat
* nf_tables: fix compilation if CONFIG_COMPAT is disabled

From Patrick McHardy:
* nf_tables: move policy to struct nft_base_chain
* nf_tables: send notifications for base chain policy changes

From Alexander Primak:
* nf_tables: remove the duplicate NF_INET_LOCAL_OUT

From Nicolas Dichtel:
* nf_tables: fix compilation when nf-netlink is a module

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 18:00:04 +02:00
Pablo Neira Ayuso
9370761c56 netfilter: nf_tables: convert built-in tables/chains to chain types
This patch converts built-in tables/chains to chain types that
allows you to deploy customized table and chain configurations from
userspace.

After this patch, you have to specify the chain type when
creating a new chain:

 add chain ip filter output { type filter hook input priority 0; }
                              ^^^^ ------

The existing chain types after this patch are: filter, route and
nat. Note that tables are just containers of chains with no specific
semantics, which is a significant change with regards to iptables.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 17:16:11 +02:00
Patrick McHardy
ef1f7df917 netfilter: nf_tables: expression ops overloading
Split the expression ops into two parts and support overloading of
the runtime expression ops based on the requested function through
a ->select_ops() callback.

This can be used to provide optimized implementations, for instance
for loading small aligned amounts of data from the packet or inlining
frequently used operations into the main evaluation loop.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 17:16:08 +02:00
Patrick McHardy
20a69341f2 netfilter: nf_tables: add netlink set API
This patch adds the new netlink API for maintaining nf_tables sets
independently of the ruleset. The API supports the following operations:

- creation of sets
- deletion of sets
- querying of specific sets
- dumping of all sets

- addition of set elements
- removal of set elements
- dumping of all set elements

Sets are identified by name, each table defines an individual namespace.
The name of a set may be allocated automatically, this is mostly useful
in combination with the NFT_SET_ANONYMOUS flag, which destroys a set
automatically once the last reference has been released.

Sets can be marked constant, meaning they're not allowed to change while
linked to a rule. This allows to perform lockless operation for set
types that would otherwise require locking.

Additionally, if the implementation supports it, sets can (as before) be
used as maps, associating a data value with each key (or range), by
specifying the NFT_SET_MAP flag and can be used for interval queries by
specifying the NFT_SET_INTERVAL flag.

Set elements are added and removed incrementally. All element operations
support batching, reducing netlink message and set lookup overhead.

The old "set" and "hash" expressions are replaced by a generic "lookup"
expression, which binds to the specified set. Userspace is not aware
of the actual set implementation used by the kernel anymore, all
configuration options are generic.

Currently the implementation selection logic is largely missing and the
kernel will simply use the first registered implementation supporting the
requested operation. Eventually, the plan is to have userspace supply a
description of the data characteristics and select the implementation
based on expected performance and memory use.

This patch includes the new 'lookup' expression to look up for element
matching in the set.

This patch includes kernel-doc descriptions for this set API and it
also includes the following fixes.

From Patrick McHardy:
* netfilter: nf_tables: fix set element data type in dumps
* netfilter: nf_tables: fix indentation of struct nft_set_elem comments
* netfilter: nf_tables: fix oops in nft_validate_data_load()
* netfilter: nf_tables: fix oops while listing sets of built-in tables
* netfilter: nf_tables: destroy anonymous sets immediately if binding fails
* netfilter: nf_tables: propagate context to set iter callback
* netfilter: nf_tables: add loop detection

From Pablo Neira Ayuso:
* netfilter: nf_tables: allow to dump all existing sets
* netfilter: nf_tables: fix wrong type for flags variable in newelem

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 17:16:07 +02:00
Patrick McHardy
96518518cc netfilter: add nftables
This patch adds nftables which is the intended successor of iptables.
This packet filtering framework reuses the existing netfilter hooks,
the connection tracking system, the NAT subsystem, the transparent
proxying engine, the logging infrastructure and the userspace packet
queueing facilities.

In a nutshell, nftables provides a pseudo-state machine with 4 general
purpose registers of 128 bits and 1 specific purpose register to store
verdicts. This pseudo-machine comes with an extensible instruction set,
a.k.a. "expressions" in the nftables jargon. The expressions included
in this patch provide the basic functionality, they are:

* bitwise: to perform bitwise operations.
* byteorder: to change from host/network endianess.
* cmp: to compare data with the content of the registers.
* counter: to enable counters on rules.
* ct: to store conntrack keys into register.
* exthdr: to match IPv6 extension headers.
* immediate: to load data into registers.
* limit: to limit matching based on packet rate.
* log: to log packets.
* meta: to match metainformation that usually comes with the skbuff.
* nat: to perform Network Address Translation.
* payload: to fetch data from the packet payload and store it into
  registers.
* reject (IPv4 only): to explicitly close connection, eg. TCP RST.

Using this instruction-set, the userspace utility 'nft' can transform
the rules expressed in human-readable text representation (using a
new syntax, inspired by tcpdump) to nftables bytecode.

nftables also inherits the table, chain and rule objects from
iptables, but in a more configurable way, and it also includes the
original datatype-agnostic set infrastructure with mapping support.
This set infrastructure is enhanced in the follow up patch (netfilter:
nf_tables: add netlink set API).

This patch includes the following components:

* the netlink API: net/netfilter/nf_tables_api.c and
  include/uapi/netfilter/nf_tables.h
* the packet filter core: net/netfilter/nf_tables_core.c
* the expressions (described above): net/netfilter/nft_*.c
* the filter tables: arp, IPv4, IPv6 and bridge:
  net/ipv4/netfilter/nf_tables_ipv4.c
  net/ipv6/netfilter/nf_tables_ipv6.c
  net/ipv4/netfilter/nf_tables_arp.c
  net/bridge/netfilter/nf_tables_bridge.c
* the NAT table (IPv4 only):
  net/ipv4/netfilter/nf_table_nat_ipv4.c
* the route table (similar to mangle):
  net/ipv4/netfilter/nf_table_route_ipv4.c
  net/ipv6/netfilter/nf_table_route_ipv6.c
* internal definitions under:
  include/net/netfilter/nf_tables.h
  include/net/netfilter/nf_tables_core.h
* It also includes an skeleton expression:
  net/netfilter/nft_expr_template.c
  and the preliminary implementation of the meta target
  net/netfilter/nft_meta_target.c

It also includes a change in struct nf_hook_ops to add a new
pointer to store private data to the hook, that is used to store
the rule list per chain.

This patch is based on the patch from Patrick McHardy, plus merged
accumulated cleanups, fixes and small enhancements to the nftables
code that has been done since 2009, which are:

From Patrick McHardy:
* nf_tables: adjust netlink handler function signatures
* nf_tables: only retry table lookup after successful table module load
* nf_tables: fix event notification echo and avoid unnecessary messages
* nft_ct: add l3proto support
* nf_tables: pass expression context to nft_validate_data_load()
* nf_tables: remove redundant definition
* nft_ct: fix maxattr initialization
* nf_tables: fix invalid event type in nf_tables_getrule()
* nf_tables: simplify nft_data_init() usage
* nf_tables: build in more core modules
* nf_tables: fix double lookup expression unregistation
* nf_tables: move expression initialization to nf_tables_core.c
* nf_tables: build in payload module
* nf_tables: use NFPROTO constants
* nf_tables: rename pid variables to portid
* nf_tables: save 48 bits per rule
* nf_tables: introduce chain rename
* nf_tables: check for duplicate names on chain rename
* nf_tables: remove ability to specify handles for new rules
* nf_tables: return error for rule change request
* nf_tables: return error for NLM_F_REPLACE without rule handle
* nf_tables: include NLM_F_APPEND/NLM_F_REPLACE flags in rule notification
* nf_tables: fix NLM_F_MULTI usage in netlink notifications
* nf_tables: include NLM_F_APPEND in rule dumps

From Pablo Neira Ayuso:
* nf_tables: fix stack overflow in nf_tables_newrule
* nf_tables: nft_ct: fix compilation warning
* nf_tables: nft_ct: fix crash with invalid packets
* nft_log: group and qthreshold are 2^16
* nf_tables: nft_meta: fix socket uid,gid handling
* nft_counter: allow to restore counters
* nf_tables: fix module autoload
* nf_tables: allow to remove all rules placed in one chain
* nf_tables: use 64-bits rule handle instead of 16-bits
* nf_tables: fix chain after rule deletion
* nf_tables: improve deletion performance
* nf_tables: add missing code in route chain type
* nf_tables: rise maximum number of expressions from 12 to 128
* nf_tables: don't delete table if in use
* nf_tables: fix basechain release

From Tomasz Bursztyka:
* nf_tables: Add support for changing users chain's name
* nf_tables: Change chain's name to be fixed sized
* nf_tables: Add support for replacing a rule by another one
* nf_tables: Update uapi nftables netlink header documentation

From Florian Westphal:
* nft_log: group is u16, snaplen u32

From Phil Oester:
* nf_tables: operational limit match

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-10-14 17:15:48 +02:00