Commit Graph

147 Commits

Author SHA1 Message Date
Patrick McHardy 7c6c6e95a1 netfilter: nf_tables: add flag to indicate set contains expressions
Add a set flag to indicate that the set is used as a state table and
contains expressions for evaluation. This operation is mutually
exclusive with the mapping operation, so sets specifying both are
rejected. The lookup expression also rejects binding to state tables
since it only deals with loopup and map operations.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 20:12:32 +02:00
Patrick McHardy f25ad2e907 netfilter: nf_tables: prepare for expressions associated to set elements
Preparation to attach expressions to set elements: add a set extension
type to hold an expression and dump the expression information with the
set element.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 20:12:31 +02:00
Patrick McHardy 0b2d8a7b63 netfilter: nf_tables: add helper functions for expression handling
Add helper functions for initializing, cloning, dumping and destroying
a single expression that is not part of a rule.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 20:12:31 +02:00
Patrick McHardy 7d7402642e netfilter: nf_tables: variable sized set element keys / data
This patch changes sets to support variable sized set element keys / data
up to 64 bytes each by using variable sized set extensions. This allows
to use concatenations with bigger data items suchs as IPv6 addresses.

As a side effect, small keys/data now don't require the full 16 bytes
of struct nft_data anymore but just the space they need.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:31 +02:00
Patrick McHardy d0a11fc3dc netfilter: nf_tables: support variable sized data in nft_data_init()
Add a size argument to nft_data_init() and pass in the available space.
This will be used by the following patches to support variable sized
set element data.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:30 +02:00
Patrick McHardy 49499c3e6e netfilter: nf_tables: switch registers to 32 bit addressing
Switch the nf_tables registers from 128 bit addressing to 32 bit
addressing to support so called concatenations, where multiple values
can be concatenated over multiple registers for O(1) exact matches of
multiple dimensions using sets.

The old register values are mapped to areas of 128 bits for compatibility.
When dumping register numbers, values are expressed using the old values
if they refer to the beginning of a 128 bit area for compatibility.

To support concatenations, register loads of less than a full 32 bit
value need to be padded. This mainly affects the payload and exthdr
expressions, which both unconditionally zero the last word before
copying the data.

Userspace fully passes the testsuite using both old and new register
addressing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:29 +02:00
Patrick McHardy b1c96ed37c netfilter: nf_tables: add register parsing/dumping helpers
Add helper functions to parse and dump register values in netlink attributes.
These helpers will later be changed to take care of translation between the
old 128 bit and the new 32 bit register numbers.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:28 +02:00
Patrick McHardy 1ca2e1702c netfilter: nf_tables: use struct nft_verdict within struct nft_data
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 17:17:24 +02:00
Patrick McHardy d07db9884a netfilter: nf_tables: introduce nft_validate_register_load()
Change nft_validate_input_register() to not only validate the input
register number, but also the length of the load, and rename it to
nft_validate_register_load() to reflect that change.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:50 +02:00
Patrick McHardy 27e6d2017a netfilter: nf_tables: kill nft_validate_output_register()
All users of nft_validate_register_store() first invoke
nft_validate_output_register(). There is in fact no use for using it
on its own, so simplify the code by folding the functionality into
nft_validate_register_store() and kill it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:50 +02:00
Patrick McHardy 58f40ab6e2 netfilter: nft_lookup: use nft_validate_register_store() to validate types
In preparation of validating the length of a register store, use
nft_validate_register_store() in nft_lookup instead of open coding the
validation.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:49 +02:00
Patrick McHardy 1ec10212f9 netfilter: nf_tables: rename nft_validate_data_load()
The existing name is ambiguous, data is loaded as well when we read from
a register. Rename to nft_validate_register_store() for clarity and
consistency with the upcoming patch to introduce its counterpart,
nft_validate_register_load().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:49 +02:00
Patrick McHardy 45d9bcda21 netfilter: nf_tables: validate len in nft_validate_data_load()
For values spanning multiple registers, we need to validate that enough
space is available from the destination register onwards. Add a len
argument to nft_validate_data_load() and consolidate the existing length
validations in preparation of that.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-13 16:25:49 +02:00
Patrick McHardy 68e942e88a netfilter: nf_tables: support optional userdata for set elements
Add an userdata set extension and allow the user to attach arbitrary
data to set elements. This is intended to hold TLV encoded data like
comments or DNS annotations that have no meaning to the kernel.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy 22fe54d5fe netfilter: nf_tables: add support for dynamic set updates
Add a new "dynset" expression for dynamic set updates.

A new set op ->update() is added which, for non existant elements,
invokes an initialization callback and inserts the new element.
For both new or existing elements the extenstion pointer is returned
to the caller to optionally perform timer updates or other actions.

Element removal is not supported so far, however that seems to be a
rather exotic need and can be added later on.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy 11113e190b netfilter: nf_tables: support different set binding types
Currently a set binding is assumed to be related to a lookup and, in
case of maps, a data load.

In order to use bindings for set updates, the loop detection checks
must be restricted to map operations only. Add a flags member to the
binding struct to hold the set "action" flags such as NFT_SET_MAP,
and perform loop detection based on these.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy 3dd0673ac3 netfilter: nf_tables: prepare set element accounting for async updates
Use atomic operations for the element count to avoid races with async
updates.

To properly handle the transactional semantics during netlink updates,
deleted but not yet committed elements are accounted for seperately and
are treated as being already removed. This means for the duration of
a netlink transaction, the limit might be exceeded by the amount of
elements deleted. Set implementations must be prepared to handle this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:27 +02:00
Patrick McHardy 4a8678efbe netfilter: nf_tables: fix set selection when timeouts are requested
The NFT_SET_TIMEOUT flag is ignore in nft_select_set_ops, which may
lead to selection of a set implementation that doesn't actually
support timeouts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:58:26 +02:00
Patrick McHardy 6908665826 netfilter: nf_tables: add GC synchronization helpers
GC is expected to happen asynchrously to the netlink interface. In the
netlink path, both insertion and removal of elements consist of two
steps, insertion followed by activation or deactivation followed by
removal, during which the element must not be freed by GC.

The synchronization helpers use an unused bit in the genmask field to
atomically mark an element as "busy", meaning it is either currently
being handled through the netlink API or by GC.

Elements being processed by GC will never survive, netlink will simply
ignore them. Elements being currently processed through netlink will be
skipped by GC and reprocessed during the next run.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-01 11:17:29 +02:00
Patrick McHardy cfed7e1b1f netfilter: nf_tables: add set garbage collection helpers
Add helpers for GC batch destruction: since element destruction needs
a RCU grace period for all set implementations, add some helper functions
for asynchronous batch destruction. Elements are collected in a batch
structure, which is asynchronously released using RCU once its full.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-01 11:17:29 +02:00
Patrick McHardy c3e1b005ed netfilter: nf_tables: add set element timeout support
Add API support for set element timeouts. Elements can have a individual
timeout value specified, overriding the sets' default.

Two new extension types are used for timeouts - the timeout value and
the expiration time. The timeout value only exists if it differs from
the default value.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-01 11:17:28 +02:00
Patrick McHardy 761da2935d netfilter: nf_tables: add set timeout API support
Add set timeout support to the netlink API. Sets with timeout support
enabled can have a default timeout value and garbage collection interval
specified.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-01 11:17:28 +02:00
Patrick McHardy cc02e457bb netfilter: nf_tables: implement set transaction support
Set elements are the last object type not supporting transaction support.
Implement similar to the existing rule transactions:

The global transaction counter keeps track of two generations, current
and next. Each element contains a bitmask specifying in which generations
it is inactive.

New elements start out as inactive in the current generation and active
in the next. On commit, the previous next generation becomes the current
generation and the element becomes active. The bitmask is then cleared
to indicate that the element is active in all future generations. If the
transaction is aborted, the element is removed from the set before it
becomes active.

When removing an element, it gets marked as inactive in the next generation.
On commit the next generation becomes active and the therefor the element
inactive. It is then taken out of then set and released. On abort, the
element is marked as active for the next generation again.

Lookups ignore elements not active in the current generation.

The current set types (hash/rbtree) both use a field in the extension area
to store the generation mask. This (currently) does not require any
additional memory since we have some free space in there.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-26 11:09:35 +01:00
Patrick McHardy ea4bd995b0 netfilter: nf_tables: add transaction helper functions
Add some helper functions for building the genmask as preparation for
set transactions.

Also add a little documentation how this stuff actually works.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-26 11:09:35 +01:00
Patrick McHardy 61edafbb47 netfilter: nf_tables: consolide set element destruction
With the conversion to set extensions, it is now possible to consolidate
the different set element destruction functions.

The set implementations' ->remove() functions are changed to only take
the element out of their internal data structures. Elements will be freed
in a batched fashion after the global transaction's completion RCU grace
period.

This reduces the amount of grace periods required for nft_hash from N
to zero additional ones, additionally this guarantees that the set
elements' extensions of all implementations can be used under RCU
protection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-26 11:09:34 +01:00
Patrick McHardy fe2811ebeb netfilter: nf_tables: convert hash and rbtree to set extensions
The set implementations' private struct will only contain the elements
needed to maintain the search structure, all other elements are moved
to the set extensions.

Element allocation and initialization is performed centrally by
nf_tables_api instead of by the different set implementations'
->insert() functions. A new "elemsize" member in the set ops specifies
the amount of memory to reserve for internal usage. Destruction
will also be moved out of the set implementations by a following patch.

Except for element allocation, the patch is a simple conversion to
using data from the extension area.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-25 17:18:35 +01:00
Patrick McHardy 3ac4c07a24 netfilter: nf_tables: add set extensions
Add simple set extension infrastructure for maintaining variable sized
and optional per element data.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-25 17:18:34 +01:00
Patrick McHardy 5ebb335dcb netfilter: nf_tables: move struct net pointer to base chain
The network namespace is only needed for base chains to get at the
gencursor. Also convert to possible_net_t.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-25 12:09:38 +01:00
David S. Miller d5c1d8c567 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/netfilter/nf_tables_core.c

The nf_tables_core.c conflict was resolved using a conflict resolution
from Stephen Rothwell as a guide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 22:22:43 -04:00
Patrick McHardy 55df35d22f netfilter: nf_tables: reject NFT_SET_ELEM_INTERVAL_END flag for non-interval sets
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-22 19:50:35 +01:00
Pablo Neira Ayuso ffdb210eb4 netfilter: nf_tables: consolidate error path of nf_tables_newtable()
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-18 11:57:31 +01:00
Pablo Neira Ayuso d6b6cb1d3e netfilter: nf_tables: allow to change chain policy without hook if it exists
If there's an existing base chain, we have to allow to change the
default policy without indicating the hook information.

However, if the chain doesn't exists, we have to enforce the presence of
the hook attribute.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-17 13:48:04 +01:00
David S. Miller 3cef5c5b0b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/cadence/macb.c

Overlapping changes in macb driver, mostly fixes and cleanups
in 'net' overlapping with the integration of at91_ether into
macb in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-09 23:38:02 -04:00
Pablo Neira Ayuso 1cae565e8b netfilter: nf_tables: limit maximum table name length to 32 bytes
Set the same as we use for chain names, it should be enough.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-06 01:21:21 +01:00
Pablo Neira Ayuso 59900e0a01 netfilter: nf_tables: fix error handling of rule replacement
In general, if a transaction object is added to the list successfully,
we can rely on the abort path to undo what we've done. This allows us to
simplify the error handling of the rule replacement path in
nf_tables_newrule().

This implicitly fixes an unnecessary removal of the old rule, which
needs to be left in place if we fail to replace.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-04 18:46:08 +01:00
Patrick McHardy 86f1ec3231 netfilter: nf_tables: fix userdata length overflow
The NFT_USERDATA_MAXLEN is defined to 256, however we only have a u8
to store its size. Introduce a struct nft_userdata which contains a
length field and indicate its presence using a single bit in the rule.

The length field of struct nft_userdata is also a u8, however we don't
store zero sized data, so the actual length is udata->len + 1.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-04 18:46:06 +01:00
Patrick McHardy 9889840f59 netfilter: nf_tables: check for overflow of rule dlen field
Check that the space required for the expressions doesn't exceed the
size of the dlen field, which would lead to the iterators crashing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-04 18:46:05 +01:00
Patrick McHardy 8670c3a55e netfilter: nf_tables: fix transaction race condition
A race condition exists in the rule transaction code for rules that
get added and removed within the same transaction.

The new rule starts out as inactive in the current and active in the
next generation and is inserted into the ruleset. When it is deleted,
it is additionally set to inactive in the next generation as well.

On commit the next generation is begun, then the actions are finalized.
For the new rule this would mean clearing out the inactive bit for
the previously current, now next generation.

However nft_rule_clear() clears out the bits for *both* generations,
activating the rule in the current generation, where it should be
deactivated due to being deleted. The rule will thus be active until
the deletion is finalized, removing the rule from the ruleset.

Similarly, when aborting a transaction for the same case, the undo
of insertion will remove it from the RCU protected rule list, the
deletion will clear out all bits. However until the next RCU
synchronization after all operations have been undone, the rule is
active on CPUs which can still see the rule on the list.

Generally, there may never be any modifications of the current
generations' inactive bit since this defeats the entire purpose of
atomicity. Change nft_rule_clear() to only touch the next generations
bit to fix this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-03-04 18:46:04 +01:00
Pablo Neira Ayuso 02263db00b netfilter: nf_tables: fix addition/deletion of elements from commit/abort
We have several problems in this path:

1) There is a use-after-free when removing individual elements from
   the commit path.

2) We have to uninit() the data part of the element from the abort
   path to avoid a chain refcount leak.

3) We have to check for set->flags to see if there's a mapping, instead
   of the element flags.

4) We have to check for !(flags & NFT_SET_ELEM_INTERVAL_END) to skip
   elements that are part of the interval that have no data part, so
   they don't need to be uninit().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-02-22 21:05:08 +01:00
David S. Miller 6e03f896b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/vxlan.c
	drivers/vhost/net.c
	include/linux/if_vlan.h
	net/core/dev.c

The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.

In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.

In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endainness fix for VHOST 1.0 in 'net'.

In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 14:33:28 -08:00
Pablo Neira Ayuso f5553c19ff netfilter: nf_tables: fix leaks in error path of nf_tables_newchain()
Release statistics and module refcount on memory allocation problems.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-01-30 18:42:08 +01:00
Pablo Neira Ayuso e8781f70a5 netfilter: nf_tables: disable preemption when restoring chain counters
With CONFIG_DEBUG_PREEMPT=y

[22144.496057] BUG: using smp_processor_id() in preemptible [00000000] code: iptables-compat/10406
[22144.496061] caller is debug_smp_processor_id+0x17/0x1b
[22144.496065] CPU: 2 PID: 10406 Comm: iptables-compat Not tainted 3.19.0-rc4+ #
[...]
[22144.496092] Call Trace:
[22144.496098]  [<ffffffff8145b9fa>] dump_stack+0x4f/0x7b
[22144.496104]  [<ffffffff81244f52>] check_preemption_disabled+0xd6/0xe8
[22144.496110]  [<ffffffff81244f90>] debug_smp_processor_id+0x17/0x1b
[22144.496120]  [<ffffffffa07c557e>] nft_stats_alloc+0x94/0xc7 [nf_tables]
[22144.496130]  [<ffffffffa07c73d2>] nf_tables_newchain+0x471/0x6d8 [nf_tables]
[22144.496140]  [<ffffffffa07c5ef6>] ? nft_trans_alloc+0x18/0x34 [nf_tables]
[22144.496154]  [<ffffffffa063c8da>] nfnetlink_rcv_batch+0x2b4/0x457 [nfnetlink]

Reported-by: Andreas Schultz <aschultz@tpip.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-01-26 11:50:02 +01:00
Pablo Neira Ayuso 75e8d06d43 netfilter: nf_tables: validate hooks in NAT expressions
The user can crash the kernel if it uses any of the existing NAT
expressions from the wrong hook, so add some code to validate this
when loading the rule.

This patch introduces nft_chain_validate_hooks() which is based on
an existing function in the bridge version of the reject expression.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-01-19 14:52:39 +01:00
Johannes Berg 053c095a82 netlink: make nlmsg_end() and genlmsg_end() void
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.

This makes the very common pattern of

  if (genlmsg_end(...) < 0) { ... }

be a whole bunch of dead code. Many places also simply do

  return nlmsg_end(...);

and the caller is expected to deal with it.

This also commonly (at least for me) causes errors, because it is very
common to write

  if (my_function(...))
    /* error condition */

and if my_function() does "return nlmsg_end()" this is of course wrong.

Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.

Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did

-	return nlmsg_end(...);
+	nlmsg_end(...);
+	return 0;

I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with <= 0 in dump functionality, but that could just
be changed to < 0 with no change in behaviour, so I opted for the more
efficient version.

One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for <0 or <=0 and thus broke out of the loop every single time.
I've preserved this since it will (I think) have caused the messages to
userspace to be formatted differently with just a single message for
every SKB returned to userspace. It's possible that this isn't needed
for the tools that actually use this, but I don't even know what they
are so couldn't test that changing this behaviour would be acceptable.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 01:03:45 -05:00
Pablo Neira Ayuso a2f18db0c6 netfilter: nf_tables: fix flush ruleset chain dependencies
Jumping between chains doesn't mix well with flush ruleset. Rules
from a different chain and set elements may still refer to us.

[  353.373791] ------------[ cut here ]------------
[  353.373845] kernel BUG at net/netfilter/nf_tables_api.c:1159!
[  353.373896] invalid opcode: 0000 [#1] SMP
[  353.373942] Modules linked in: intel_powerclamp uas iwldvm iwlwifi
[  353.374017] CPU: 0 PID: 6445 Comm: 31c3.nft Not tainted 3.18.0 #98
[  353.374069] Hardware name: LENOVO 5129CTO/5129CTO, BIOS 6QET47WW (1.17 ) 07/14/2010
[...]
[  353.375018] Call Trace:
[  353.375046]  [<ffffffff81964c31>] ? nf_tables_commit+0x381/0x540
[  353.375101]  [<ffffffff81949118>] nfnetlink_rcv+0x3d8/0x4b0
[  353.375150]  [<ffffffff81943fc5>] netlink_unicast+0x105/0x1a0
[  353.375200]  [<ffffffff8194438e>] netlink_sendmsg+0x32e/0x790
[  353.375253]  [<ffffffff818f398e>] sock_sendmsg+0x8e/0xc0
[  353.375300]  [<ffffffff818f36b9>] ? move_addr_to_kernel.part.20+0x19/0x70
[  353.375357]  [<ffffffff818f44f9>] ? move_addr_to_kernel+0x19/0x30
[  353.375410]  [<ffffffff819016d2>] ? verify_iovec+0x42/0xd0
[  353.375459]  [<ffffffff818f3e10>] ___sys_sendmsg+0x3f0/0x400
[  353.375510]  [<ffffffff810615fa>] ? native_sched_clock+0x2a/0x90
[  353.375563]  [<ffffffff81176697>] ? acct_account_cputime+0x17/0x20
[  353.375616]  [<ffffffff8110dc78>] ? account_user_time+0x88/0xa0
[  353.375667]  [<ffffffff818f4bbd>] __sys_sendmsg+0x3d/0x80
[  353.375719]  [<ffffffff81b184f4>] ? int_check_syscall_exit_work+0x34/0x3d
[  353.375776]  [<ffffffff818f4c0d>] SyS_sendmsg+0xd/0x20
[  353.375823]  [<ffffffff81b1826d>] system_call_fastpath+0x16/0x1b

Release objects in this order: rules -> sets -> chains -> tables, to
make sure no references to chains are held anymore.

Reported-by: Asbjoern Sloth Toennesen <asbjorn@asbjorn.biz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-01-06 22:27:48 +01:00
David S. Miller 958d03b016 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
netfilter/ipvs updates for net-next

The following patchset contains Netfilter updates for your net-next
tree, this includes the NAT redirection support for nf_tables, the
cgroup support for nft meta and conntrack zone support for the connlimit
match. Coming after those, a bunch of sparse warning fixes, missing
netns bits and cleanups. More specifically, they are:

1) Prepare IPv4 and IPv6 NAT redirect code to use it from nf_tables,
   patches from Arturo Borrero.

2) Introduce the nf_tables redir expression, from Arturo Borrero.

3) Remove an unnecessary assignment in ip_vs_xmit/__ip_vs_get_out_rt().
   Patch from Alex Gartrell.

4) Add nft_log_dereference() macro to the nf_log infrastructure, patch
   from Marcelo Leitner.

5) Add some extra validation when registering logger families, also
   from Marcelo.

6) Some spelling cleanups from stephen hemminger.

7) Fix sparse warning in nf_logger_find_get().

8) Add cgroup support to nf_tables meta, patch from Ana Rey.

9) A Kconfig fix for the new redir expression and fix sparse warnings in
   the new redir expression.

10) Fix several sparse warnings in the netfilter tree, from
    Florian Westphal.

11) Reduce verbosity when OOM in nfnetlink_log. User can basically do
    nothing when this situation occurs.

12) Add conntrack zone support to xt_connlimit, again from Florian.

13) Add netnamespace support to the h323 conntrack helper, contributed
    by Vasily Averin.

14) Remove unnecessary nul-pointer checks before free_percpu() and
    module_put(), from Markus Elfring.

15) Use pr_fmt in nfnetlink_log, again patch from Marcelo Leitner.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:00:58 -05:00
Markus Elfring 982f405136 netfilter: Deletion of unnecessary checks before two function calls
The functions free_percpu() and module_put() test whether their argument
is NULL and then return immediately. Thus the test around the call is
not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-11-20 13:08:43 +01:00
Pablo Neira Ayuso b326dd37b9 netfilter: nf_tables: restore synchronous object release from commit/abort
The existing xtables matches and targets, when used from nft_compat, may
sleep from the destroy path, ie. when removing rules. Since the objects
are released via call_rcu from softirq context, this results in lockdep
splats and possible lockups that may be hard to reproduce.

Patrick also indicated that delayed object release via call_rcu can
cause us problems in the ordering of event notifications when anonymous
sets are in place.

So, this patch restores the synchronous object release from the commit
and abort paths. This includes a call to synchronize_rcu() to make sure
that no packets are walking on the objects that are going to be
released. This is slowier though, but it's simple and it resolves the
aforementioned problems.

This is a partial revert of c7c32e7 ("netfilter: nf_tables: defer all
object release via rcu") that was introduced in 3.16 to speed up
interaction with userspace.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-11-12 12:06:24 +01:00
stephen hemminger 01cfa0a4ed netfilter: fix spelling errors
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-30 17:35:30 +01:00
Sabrina Dubroca c123bb7163 netfilter: nf_tables: check for NULL in nf_tables_newchain pcpu stats allocation
alloc_percpu returns NULL on failure, not a negative error code.

Fixes: ff3cd7b3c9 ("netfilter: nf_tables: refactor chain statistic routines")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-22 14:12:51 +02:00