Commit graph

709 commits

Author SHA1 Message Date
Eric Dumazet
22df13319d bridge: fix a possible use after free
br_multicast_ipv6_rcv() can call pskb_trim_rcsum() and therefore skb
head can be reallocated.

Cache icmp6_type field instead of dereferencing twice the struct
icmp6hdr pointer.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 17:49:24 -07:00
Yan, Zheng
4b275d7efa bridge: Pseudo-header required for the checksum of ICMPv6
Checksum of ICMPv6 is not properly computed because the pseudo header is not used.
Thus, the MLD packet gets dropped by the bridge.

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Reported-by: Ang Way Chuang <wcang@sfc.wide.ad.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-24 17:49:00 -07:00
Eric Dumazet
11f3a6bdc2 bridge: fix a possible net_device leak
Jan Beulich reported a possible net_device leak in bridge code after
commit bb900b27a2 (bridge: allow creating bridge devices with netlink)

Reported-by: Jan Beulich <JBeulich@novell.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-22 16:49:56 -07:00
David S. Miller
823dcd2506 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net 2011-08-20 10:39:12 -07:00
Jiri Pirko
afc4b13df1 net: remove use of ndo_set_multicast_list in drivers
replace it by ndo_set_rx_mode

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-17 20:22:03 -07:00
Julia Lawall
5189054dd7 net/bridge/netfilter/ebtables.c: use available error handling code
Free the locally allocated table and newinfo as done in adjacent error
handling code.

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-11 05:52:57 -07:00
Andrei Warkentin
9be6dd6510 Bridge: Always send NETDEV_CHANGEADDR up on br MAC change.
This ensures the neighbor entries associated with the bridge
dev are flushed, also invalidating the associated cached L2 headers.

This means we br_add_if/br_del_if ports to implement hand-over and
not wind up with bridge packets going out with stale MAC.

This means we can also change MAC of port device and also not wind
up with bridge packets going out with stale MAC.

This builds on Stephen Hemminger's patch, also handling the br_del_if
case and the port MAC change case.

Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-09 21:44:44 -07:00
Stephen Hemminger
a9b3cd7f32 rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER
When assigning a NULL value to an RCU protected pointer, no barrier
is needed. The rcu_assign_pointer, used to handle that but will soon
change to not handle the special case.

Convert all rcu_assign_pointer of NULL value.

//smpl
@@ expression P; @@

- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)

// </smpl>

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-02 04:29:23 -07:00
Bart De Schuymer
9823d9ff48 netfilter: ebtables: fix ebtables build dependency
The configuration of ebtables shouldn't depend on
CONFIG_BRIDGE_NETFILTER, only on CONFIG_NETFILTER.

Reported-by: Sébastien Laveze <slaveze@gmail.com>
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-29 16:40:30 +02:00
Arun Sharma
60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Linus Torvalds
d3ec4844d4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  fs: Merge split strings
  treewide: fix potentially dangerous trailing ';' in #defined values/expressions
  uwb: Fix misspelling of neighbourhood in comment
  net, netfilter: Remove redundant goto in ebt_ulog_packet
  trivial: don't touch files that are removed in the staging tree
  lib/vsprintf: replace link to Draft by final RFC number
  doc: Kconfig: `to be' -> `be'
  doc: Kconfig: Typo: square -> squared
  doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
  drivers/net: static should be at beginning of declaration
  drivers/media: static should be at beginning of declaration
  drivers/i2c: static should be at beginning of declaration
  XTENSA: static should be at beginning of declaration
  SH: static should be at beginning of declaration
  MIPS: static should be at beginning of declaration
  ARM: static should be at beginning of declaration
  rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
  Update my e-mail address
  PCIe ASPM: forcedly -> forcibly
  gma500: push through device driver tree
  ...

Fix up trivial conflicts:
 - arch/arm/mach-ep93xx/dma-m2p.c (deleted)
 - drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
 - drivers/net/r8169.c (just context changes)
2011-07-25 13:56:39 -07:00
stephen hemminger
160d73b845 bridge: minor cleanups
Some minor cleanups that won't impact code:
  1. Remove inline from non-critical functions; compiler will most
     likely inline them anyway.
  2. Make function args const where possible.
  3. Whitespace cleanup

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-22 17:01:13 -07:00
stephen hemminger
4ecb961c8b bridge: add notification over netlink when STP changes state
When STP changes state of interface need to send a new link
message to reflect that change.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-22 17:01:12 -07:00
stephen hemminger
56139fc5bd bridge: notifier called with the wrong device
If a new device is added to a bridge, the ethernet address of the
bridge network device may change. When the address changes, the
appropriate callback is called, but with the wrong device argument.
The address of the bridge device (ie br0) changes not the address
of the device being passed to add_if (ie eth0).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-22 17:01:12 -07:00
stephen hemminger
0652cac22c bridge: ignore bogus STP config packets
If the message_age is already greater than the max_age, then the
BPDU is bogus. Linux won't generate BPDU, but conformance tester
or buggy implementation might.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-22 17:01:12 -07:00
stephen hemminger
0c03150e7e bridge: send proper message_age in config BPDU
A bridge topology with three systems:

      +------+  +------+
      | A(2) |--| B(1) |
      +------+  +------+
           \    /
          +------+
          | C(3) |
          +------+

What is supposed to happen:
 * bridge with the lowest ID is elected root (for example: B)
 * C detects that A->C is higher cost path and puts in blocking state

What happens. Bridge with lowest id (B) is elected correctly as
root and things start out fine initially. But then config BPDU
doesn't get transmitted from A -> C. Because of that
the link from A-C is transistioned to the forwarding state.

The root cause of this is that the configuration messages
is generated with bogus message age, and dropped before
sending.

In the standardmessage_age is supposed to be:
  the time since the generation of the Configuration BPDU by
  the Root that instigated the generation of this Configuration BPDU.

Reimplement this by recording the timestamp (age + jiffies) when
recording config information. The old code incorrectly used the time
elapsed on the ageing timer which was incorrect.

See also:
  https://bugzilla.vyatta.com/show_bug.cgi?id=7164

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-22 17:01:12 -07:00
Jesper Juhl
f0da7ee410 net, netfilter: Remove redundant goto in ebt_ulog_packet
In net/bridge/netfilter/ebt_ulog.c:ebt_ulog_packet() the 'goto unlock'
before the 'alloc_failure' label is completely redundant. This patch
removes it.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-21 14:02:17 +02:00
David S. Miller
d3aaeb38c4 net: Add ->neigh_lookup() operation to dst_ops
In the future dst entries will be neigh-less.  In that environment we
need to have an easy transition point for current users of
dst->neighbour outside of the packet output fast path.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-18 00:40:17 -07:00
David S. Miller
69cce1d140 net: Abstract dst->neighbour accesses behind helpers.
dst_{get,set}_neighbour()

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-17 23:11:35 -07:00
David S. Miller
8f40b161de neigh: Pass neighbour entry to output ops.
This will get us closer to being able to do "neigh stuff"
completely independent of the underlying dst_entry for
protocols (ipv4/ipv6) that wish to do so.

We will also be able to make dst entries neigh-less.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-17 23:11:17 -07:00
David S. Miller
f6b72b6217 net: Embed hh_cache inside of struct neighbour.
Now that there is a one-to-one correspondance between neighbour
and hh_cache entries, we no longer need:

1) dynamic allocation
2) attachment to dst->hh
3) refcounting

Initialization of the hh_cache entry is indicated by hh_len
being non-zero, and such initialization is always done with
the neighbour's lock held as a writer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14 07:53:20 -07:00
David S. Miller
e12fe68ce3 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-07-05 23:23:37 -07:00
Herbert Xu
44661462ee bridge: Always flood broadcast packets
As is_multicast_ether_addr returns true on broadcast packets as
well, we need to explicitly exclude broadcast packets so that
they're always flooded.  This wasn't an issue before as broadcast
packets were considered to be an unregistered multicast group,
which were always flooded.  However, as we now only flood such
packets to router ports, this is no longer acceptable.

Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 18:39:39 -07:00
Herbert Xu
bd4265fe36 bridge: Only flood unregistered groups to routers
The bridge currently floods packets to groups that we have never
seen before to all ports.  This is not required by RFC4541 and
in fact it is not desirable in environment where traffic to
unregistered group is always present.

This patch changes the behaviour so that we only send traffic
to unregistered groups to ports marked as routers.

The user can always force flooding behaviour to any given port
by marking it as a router.

Note that this change does not apply to traffic to 224.0.0.X
as traffic to those groups must always be flooded to all ports.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-24 17:52:51 -07:00
David S. Miller
9f6ec8d697 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
	drivers/net/wireless/rtlwifi/pci.c
	net/netfilter/ipvs/ip_vs_core.c
2011-06-20 22:29:08 -07:00
WANG Cong
cefa9993f1 netpoll: copy dev name of slaves to struct netpoll
Otherwise we will not see the name of the slave dev in error
message:

[  388.469446] (null):  doesn't support polling, aborting.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-19 16:13:01 -07:00
Fernando Luis Vázquez Cao
fc2af6c73f IGMP snooping: set mrouters_only flag for IPv6 traffic properly
Upon reception of a MGM report packet the kernel sets the mrouters_only flag
in a skb that is a clone of the original skb, which means that the bridge
loses track of MGM packets (cb buffers are tied to a specific skb and not
shared) and it ends up forwading join requests to the bridge interface.

This can cause unexpected membership timeouts and intermitent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:

    A snooping switch should forward IGMP Membership Reports only to
    those ports where multicast routers are attached.
    [...]
    Sending membership reports to other hosts can result, for IGMPv1
    and IGMPv2, in unintentionally preventing a host from joining a
    specific multicast group.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-16 23:14:13 -04:00
Fernando Luis Vázquez Cao
62b2bcb49c IGMP snooping: set mrouters_only flag for IPv4 traffic properly
Upon reception of a IGMP/IGMPv2 membership report the kernel sets the
mrouters_only flag in a skb that may be a clone of the original skb, which
means that sometimes the bridge loses track of membership report packets (cb
buffers are tied to a specific skb and not shared) and it ends up forwading
join requests to the bridge interface.

This can cause unexpected membership timeouts and intermitent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:

    A snooping switch should forward IGMP Membership Reports only to
    those ports where multicast routers are attached.
    [...]
    Sending membership reports to other hosts can result, for IGMPv1
    and IGMPv2, in unintentionally preventing a host from joining a
    specific multicast group.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-16 23:14:12 -04:00
Greg Rose
c7ac8679be rtnetlink: Compute and store minimum ifinfo dump size
The message size allocated for rtnl ifinfo dumps was limited to
a single page.  This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-06-09 20:38:07 -07:00
Alexander Holler
6407d74c51 bridge: provide a cow_metrics method for fake_ops
Like in commit 0972ddb237 (provide cow_metrics() methods to blackhole
dst_ops), we must provide a cow_metrics for bridges fake_dst_ops as
well.

This fixes a regression coming from commits 62fa8a846d (net: Implement
read-only protection and COW'ing of metrics.) and 33eb9873a2 (bridge:
initialize fake_rtable metrics)

ip link set mybridge mtu 1234
-->
[  136.546243] Pid: 8415, comm: ip Tainted: P 
2.6.39.1-00006-g40545b7 #103 ASUSTeK Computer Inc.         V1Sn 
        /V1Sn
[  136.546256] EIP: 0060:[<00000000>] EFLAGS: 00010202 CPU: 0
[  136.546268] EIP is at 0x0
[  136.546273] EAX: f14a389c EBX: 000005d4 ECX: f80d32c0 EDX: f80d1da1
[  136.546279] ESI: f14a3000 EDI: f255bf10 EBP: f15c3b54 ESP: f15c3b48
[  136.546285]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  136.546293] Process ip (pid: 8415, ti=f15c2000 task=f4741f80 
task.ti=f15c2000)
[  136.546297] Stack:
[  136.546301]  f80c658f f14a3000 ffffffed f15c3b64 c12cb9c8 f80d1b80 
ffffffa1 f15c3bbc
[  136.546315]  c12da347 c12d9c7d 00000000 f7670b00 00000000 f80d1b80 
ffffffa6 f15c3be4
[  136.546329]  00000004 f14a3000 f255bf20 00000008 f15c3bbc c11d6cae 
00000000 00000000
[  136.546343] Call Trace:
[  136.546359]  [<f80c658f>] ? br_change_mtu+0x5f/0x80 [bridge]
[  136.546372]  [<c12cb9c8>] dev_set_mtu+0x38/0x80
[  136.546381]  [<c12da347>] do_setlink+0x1a7/0x860
[  136.546390]  [<c12d9c7d>] ? rtnl_fill_ifinfo+0x9bd/0xc70
[  136.546400]  [<c11d6cae>] ? nla_parse+0x6e/0xb0
[  136.546409]  [<c12db931>] rtnl_newlink+0x361/0x510
[  136.546420]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[  136.546429]  [<c1362762>] ? error_code+0x5a/0x60
[  136.546438]  [<c12db5d0>] ? rtnl_configure_link+0x80/0x80
[  136.546446]  [<c12db27a>] rtnetlink_rcv_msg+0xfa/0x210
[  136.546454]  [<c12db180>] ? __rtnl_unlock+0x20/0x20
[  136.546463]  [<c12ee0fe>] netlink_rcv_skb+0x8e/0xb0
[  136.546471]  [<c12daf1c>] rtnetlink_rcv+0x1c/0x30
[  136.546479]  [<c12edafa>] netlink_unicast+0x23a/0x280
[  136.546487]  [<c12ede6b>] netlink_sendmsg+0x26b/0x2f0
[  136.546497]  [<c12bb828>] sock_sendmsg+0xc8/0x100
[  136.546508]  [<c10adf61>] ? __alloc_pages_nodemask+0xe1/0x750
[  136.546517]  [<c11d0602>] ? _copy_from_user+0x42/0x60
[  136.546525]  [<c12c5e4c>] ? verify_iovec+0x4c/0xc0
[  136.546534]  [<c12bd805>] sys_sendmsg+0x1c5/0x200
[  136.546542]  [<c10c2150>] ? __do_fault+0x310/0x410
[  136.546549]  [<c10c2c46>] ? do_wp_page+0x1d6/0x6b0
[  136.546557]  [<c10c47d1>] ? handle_pte_fault+0xe1/0x720
[  136.546565]  [<c12bd1af>] ? sys_getsockname+0x7f/0x90
[  136.546574]  [<c10c4ec1>] ? handle_mm_fault+0xb1/0x180
[  136.546582]  [<c1023240>] ? vmalloc_sync_all+0x100/0x100
[  136.546589]  [<c10233b3>] ? do_page_fault+0x173/0x3d0
[  136.546596]  [<c12bd87b>] ? sys_recvmsg+0x3b/0x60
[  136.546605]  [<c12bdd83>] sys_socketcall+0x293/0x2d0
[  136.546614]  [<c13629d0>] sysenter_do_call+0x12/0x26
[  136.546619] Code:  Bad EIP value.
[  136.546627] EIP: [<00000000>] 0x0 SS:ESP 0068:f15c3b48
[  136.546645] CR2: 0000000000000000
[  136.546652] ---[ end trace 6909b560e78934fa ]---

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-07 00:51:35 -07:00
David S. Miller
58bf2dbccc Merge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6 2011-05-27 13:04:40 -04:00
David Miller
97242c85a2 netfilter: Fix several warnings in compat_mtw_from_user().
Kill set but not used 'entry_offset'.

Add a default case to the switch statement so the compiler
can see that we always initialize off and size_kern before
using them.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-05-26 19:09:07 +02:00
Eric Dumazet
33eb9873a2 bridge: initialize fake_rtable metrics
bridge netfilter code uses a fake_rtable, and we must init its _metric
field or risk NULL dereference later.

Ref: https://bugzilla.kernel.org/show_bug.cgi?id=35672

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-24 13:32:18 -04:00
Eric Dumazet
6df427fe8c net: remove synchronize_net() from netdev_set_master()
In the old days, we used to access dev->master in __netif_receive_skb()
in a rcu_read_lock section.

So one synchronize_net() call was needed in netdev_set_master() to make
sure another cpu could not use old master while/after we release it.

We now use netdev_rx_handler infrastructure and added one
synchronize_net() call in bond_release()/bond_release_all()

Remove the obsolete synchronize_net() from netdev_set_master() and add
one in bridge del_nbp() after its netdev_rx_handler_unregister() call.

This makes enslave -d a bit faster.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jiri Pirko <jpirko@redhat.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-22 21:01:20 -04:00
Amerigo Wang
bb8ed6302b bridge: call NETDEV_JOIN notifiers when add a slave
In the previous patch I added NETDEV_JOIN, now
we can notify netconsole when adding a device to a bridge too.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-22 21:01:19 -04:00
David S. Miller
9cbc94eabb Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/vmxnet3/vmxnet3_ethtool.c
	net/core/dev.c
2011-05-17 17:33:11 -04:00
Stephen Hemminger
cb68552858 bridge: fix forwarding of IPv6
The commit 6b1e960fdb
    bridge: Reset IPCB when entering IP stack on NF_FORWARD
broke forwarding of IPV6 packets in bridge because it would
call bp_parse_ip_options with an IPV6 packet.

Reported-by: Noah Meyerhans <noahm@debian.org>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 16:03:24 -04:00
David S. Miller
3c709f8fb4 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-3.6
Conflicts:
	drivers/net/benet/be_main.c
2011-05-11 14:26:58 -04:00
Florian Westphal
103a9778e0 netfilter: ebtables: only call xt_compat_add_offset once per rule
The optimizations in commit 255d0dc340
(netfilter: x_table: speedup compat operations) assume that
xt_compat_add_offset is called once per rule.

ebtables however called it for each match/target found in a rule.

The match/watcher/target parser already returns the needed delta, so it
is sufficient to move the xt_compat_add_offset call to a more reasonable
location.

While at it, also get rid of the unused COMPAT iterator macros.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-05-10 09:52:17 +02:00
Eric Dumazet
5a6351eecf netfilter: fix ebtables compat support
commit 255d0dc340 (netfilter: x_table: speedup compat operations)
made ebtables not working anymore.

1) xt_compat_calc_jump() is not an exact match lookup
2) compat_table_info() has a typo in xt_compat_init_offsets() call
3) compat_do_replace() misses a xt_compat_init_offsets() call

Reported-by: dann frazier <dannf@dannf.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-05-10 09:48:59 +02:00
Eric Dumazet
e67f88dd12 net: dont hold rtnl mutex during netlink dump callbacks
Four years ago, Patrick made a change to hold rtnl mutex during netlink
dump callbacks.

I believe it was a wrong move. This slows down concurrent dumps, making
good old /proc/net/ files faster than rtnetlink in some situations.

This occurred to me because one "ip link show dev ..." was _very_ slow
on a workload adding/removing network devices in background.

All dump callbacks are able to use RCU locking now, so this patch does
roughly a revert of commits :

1c2d670f36 : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
6313c1e099 : [RTNETLINK]: Remove unnecessary locking in dump callbacks

This let writers fight for rtnl mutex and readers going full speed.

It also takes care of phonet : phonet_route_get() is now called from rcu
read section. I renamed it to phonet_route_get_rcu()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-02 15:26:28 -07:00
David Decotigny
25db033881 ethtool: Use full 32 bit speed range in ethtool's set_settings
This makes sure the ethtool's set_settings() callback of network
drivers don't ignore the 16 most significant bits when ethtool calls
their set_settings().

All drivers compiled with make allyesconfig on x86_64 have been
updated.

Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-29 14:03:00 -07:00
Michał Mirosław
c4d27ef957 bridge: convert br_features_recompute() to ndo_fix_features
Note: netdev_update_features() needs only rtnl_lock as br->port_list
is only changed while holding it.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-28 13:33:08 -07:00
David S. Miller
2bd93d7af1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Resolved logic conflicts causing a build failure due to
drivers/net/r8169.c changes using a patch from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-26 12:16:46 -07:00
Eric Dumazet
b71d1d426d inet: constify ip headers and in6_addr
Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers
where possible, to make code intention more obvious.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-22 11:04:14 -07:00
David S. Miller
f01cb5fbea Revert "bridge: Forward reserved group addresses if !STP"
This reverts commit 1e253c3b8a.

It breaks 802.3ad bonding inside of a bridge.

The commit was meant to support transport bridging, and specifically
virtual machines bridged to an ethernet interface connected to a
switch port wiht 802.1x enabled.

But this isn't the way to do it, it breaks too many other things.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-21 21:17:25 -07:00
David S. Miller
e1943424e4 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x_ethtool.c
2011-04-19 00:21:33 -07:00
Stephen Hemminger
28674b97cf bridge: fix accidental creation of sysfs directory
Commit bb900b27a2 ("bridge: allow
creating bridge devices with netlink") introduced a bug in net-next
because of a typo in notifier. Every device would have the sysfs
bridge directory (and files).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-17 17:52:51 -07:00
Eric Dumazet
f8e9881c2a bridge: reset IPCB in br_parse_ip_options
Commit 462fb2af97 (bridge : Sanitize skb before it enters the IP
stack), missed one IPCB init before calling ip_options_compile()

Thanks to Scot Doyle for his tests and bug reports.

Reported-by: Scot Doyle <lkml@scotdoyle.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Bandan Das <bandan.das@stratus.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jan Lübbe <jluebbe@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-12 13:39:14 -07:00
David S. Miller
1c01a80cfe Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/smsc911x.c
2011-04-11 13:44:25 -07:00
Linus Torvalds
42933bac11 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
  Fix common misspellings
2011-04-07 11:14:49 -07:00
stephen hemminger
14f98f258f bridge: range check STP parameters
Apply restrictions on STP parameters based 802.1D 1998 standard.
   * Fixes missing locking in set path cost ioctl
   * Uses common code for both ioctl and sysfs

This is based on an earlier patch Sasikanth V but with overhaul.

Note:
1. It does NOT enforce the restriction on the relationship max_age and
   forward delay or hello time because in existing implementation these are
   set as independant operations.

2. If STP is disabled, there is no restriction on forward delay

3. No restriction on holding time because users use Linux code to act
   as hub or be sticky.

4. Although standard allow 0-255, Linux only allows 0-63 for port priority
   because more bits are reserved for port number.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:29 -07:00
stephen hemminger
bb900b27a2 bridge: allow creating bridge devices with netlink
Add netlink device ops to allow creating bridge device via netlink.
This works in a manner similar to vlan, macvlan and bonding.

Example:
  # ip link add link dev br0 type bridge
  # ip link del dev br0

The change required rearranging initializtion code to deal with
being called by create link. Most of the initialization happens
in br_dev_setup, but allocation of stats is done in ndo_init callback
to deal with allocation failure. Sysfs setup has to wait until
after the network device kobject is registered.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:28 -07:00
stephen hemminger
36fd2b63e3 bridge: allow creating/deleting fdb entries via netlink
Use RTM_NEWNEIGH and RTM_DELNEIGH to allow updating of entries
in bridge forwarding table. This allows manipulating static entries
which is not possible with existing tools.

Example (using bridge extensions to iproute2)
   # br fdb add 00:02:03:04:05:06 dev eth0

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:28 -07:00
stephen hemminger
b078f0df67 bridge: add netlink notification on forward entry changes
This allows applications to query and monitor bridge forwarding
table in the same method used for neighbor table. The forward table
entries are returned in same structure format as used by the ioctl.
If more information is desired in future, the netlink method is
extensible.

Example (using bridge extensions to iproute2)
  # br monitor

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:27 -07:00
stephen hemminger
664de48bb6 bridge: split rcu and no-rcu cases of fdb lookup
In some cases, look up of forward database entry is done with RCU;
and for others no RCU is needed because of locking. Split the two
cases into two differnt loops (and take off inline).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:26 -07:00
stephen hemminger
7cd8861ab0 bridge: track last used time in forwarding table
Adds tracking the last used time in forwarding table.
Rename ageing_timer to updated to better describe it.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:26 -07:00
stephen hemminger
03e9b64b89 bridge: change arguments to fdb_create
Later patch provides ability to create non-local static entry.
To make this easier move the updating of the flag values to
after the code that creates entry.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-04 17:22:25 -07:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Linus Lüssing
ff9a57a62a bridge: mcast snooping, fix length check of snooped MLDv1/2
"len = ntohs(ip6h->payload_len)" does not include the length of the ipv6
header itself, which the rest of this function assumes, though.

This leads to a length check less restrictive as it should be in the
following line for one thing. For another, it very likely leads to an
integer underrun when substracting the offset and therefore to a very
high new value of 'len' due to its unsignedness. This will ultimately
lead to the pskb_trim_rcsum() practically never being called, even in
the cases where it should.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-30 02:28:20 -07:00
Balaji G
1459a3cc51 bridge: Fix compilation warning in function br_stp_recalculate_bridge_id()
net/bridge/br_stp_if.c: In function ‘br_stp_recalculate_bridge_id’:
net/bridge/br_stp_if.c:216:3: warning: ‘return’ with no value, in function returning non-void

Signed-off-by: G.Balaji <balajig81@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-29 23:37:23 -07:00
stephen hemminger
edf947f100 bridge: notify applications if address of bridge device changes
The mac address of the bridge device may be changed when a new interface
is added to the bridge. If this happens, then the bridge needs to call
the network notifiers to tickle any other systems that care. Since bridge
can be a module, this also means exporting the notifier function.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-27 23:35:02 -07:00
Linus Lüssing
a7bff75b08 bridge: Fix possibly wrong MLD queries' ethernet source address
The ipv6_dev_get_saddr() is currently called with an uninitialized
destination address. Although in tests it usually seemed to nevertheless
always fetch the right source address, there seems to be a possible race
condition.

Therefore this commit changes this, first setting the destination
address and only after that fetching the source address.

Reported-by: Jan Beulich <JBeulich@novell.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-22 19:26:42 -07:00
Herbert Xu
6b1e960fdb bridge: Reset IPCB when entering IP stack on NF_FORWARD
Whenever we enter the IP stack proper from bridge netfilter we
need to ensure that the skb is in a form the IP stack expects
it to be in.

The entry point on NF_FORWARD did not meet the requirements of
the IP stack, therefore leading to potential crashes/panics.

This patch fixes the problem.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-18 15:13:12 -07:00
Jiri Pirko
8a4eb5734e net: introduce rx_handler results and logic around that
This patch allows rx_handlers to better signalize what to do next to
it's caller. That makes skb->deliver_no_wcard no longer needed.

kernel-doc for rx_handler_result is taken from Nicolas' patch.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-16 12:53:54 -07:00
David S. Miller
c337ffb68e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-03-15 15:15:17 -07:00
stephen hemminger
a461c0297f bridge: skip forwarding delay if not using STP
If Spanning Tree Protocol is not enabled, there is no good reason for
the bridge code to wait for the forwarding delay period before enabling
the link. The purpose of the forwarding delay is to allow STP to
learn about other bridges before nominating itself.

The only possible impact is that when starting up a new port
the bridge may flood a packet now, where previously it might have
seen traffic from the other host and preseeded the forwarding table.

Includes change for local variable br already available in that func.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 15:06:49 -07:00
stephen hemminger
1faa4356a3 bridge: control carrier based on ports online
This makes the bridge device behave like a physical device.
In earlier releases the bridge always asserted carrier. This
changes the behavior so that bridge device carrier is on only
if one or more ports are in the forwarding state. This
should help IPv6 autoconfiguration, DHCP, and routing daemons.

I did brief testing with Network and Virt manager and they
seem fine, but since this changes behavior of bridge, it should
wait until net-next (2.6.39).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Tested-By: Adam Majer <adamm@zombino.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-14 14:29:02 -07:00
David S. Miller
78fbfd8a65 ipv4: Create and use route lookup helpers.
The idea here is this minimizes the number of places one has to edit
in order to make changes to how flows are defined and used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-12 15:08:42 -08:00
David S. Miller
33175d84ee Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x_cmn.c
2011-03-10 14:26:00 -08:00
Randy Dunlap
dcbcdf22f5 net: bridge builtin vs. ipv6 modular
When configs BRIDGE=y and IPV6=m, this build error occurs:

br_multicast.c:(.text+0xa3341): undefined reference to `ipv6_dev_get_saddr'

BRIDGE_IGMP_SNOOPING is boolean; if it were tristate, then adding
	depends on IPV6 || IPV6=n
to BRIDGE_IGMP_SNOOPING would be a good fix.  As it is currently,
making BRIDGE depend on the IPV6 config works.

Reported-by: Patrick Schaaf <netdev@bof.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-10 13:45:57 -08:00
David S. Miller
0a0e9ae1bd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x.h
2011-03-03 21:27:42 -08:00
David S. Miller
b23dd4fe42 ipv4: Make output route lookup return rtable directly.
Instead of on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-02 14:31:35 -08:00
David S. Miller
3872b28408 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2011-03-02 11:30:24 -08:00
Linus Lüssing
fe29ec41aa bridge: Use IPv6 link-local address for multicast listener queries
Currently the bridge multicast snooping feature periodically issues
IPv6 general multicast listener queries to sense the absence of a
listener.

For this, it uses :: as its source address - however RFC 2710 requires:
"To be valid, the Query message MUST come from a link-local IPv6 Source
Address". Current Linux kernel versions seem to follow this requirement
and ignore our bogus MLD queries.

With this commit a link local address from the bridge interface is being
used to issue the MLD query, resulting in other Linux devices which are
multicast listeners in the network to respond with a MLD response (which
was not the case before).

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:29 -08:00
Linus Lüssing
36cff5a10c bridge: Fix MLD queries' ethernet source address
Map the IPv6 header's destination multicast address to an ethernet
source address instead of the MLD queries multicast address.

For instance for a general MLD query (multicast address in the MLD query
set to ::), this would wrongly be mapped to 33:33:00:00:00:00, although
an MLD queries destination MAC should always be 33:33:00:00:00:01 which
matches the IPv6 header's multicast destination ff02::1.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:28 -08:00
Linus Lüssing
e4de9f9e83 bridge: Allow mcast snooping for transient link local addresses too
Currently the multicast bridge snooping support is not active for
link local multicast. I assume this has been done to leave
important multicast data untouched, like IPv6 Neighborhood Discovery.

In larger, bridged, local networks it could however be desirable to
optimize for instance local multicast audio/video streaming too.

With the transient flag in IPv6 multicast addresses we have an easy
way to optimize such multimedia traffic without tempering with the
high priority multicast data from well-known addresses.

This patch alters the multicast bridge snooping for IPv6, to take
effect for transient multicast addresses instead of non-link-local
addresses.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:28 -08:00
Linus Lüssing
d41db9f3f7 bridge: Add missing ntohs()s for MLDv2 report parsing
The nsrcs number is 2 Byte wide, therefore we need to call ntohs()
before using it.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:27 -08:00
Linus Lüssing
649e984d00 bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report
We actually want a pointer to the grec_nsrcr and not the following
field. Otherwise we can get very high values for *nsrcs as the first two
bytes of the IPv6 multicast address are being used instead, leading to
a failing pskb_may_pull() which results in MLDv2 reports not being
parsed.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:26 -08:00
Linus Lüssing
9cc6e0c4c4 bridge: Fix IPv6 multicast snooping by storing correct protocol type
The protocol type for IPv6 entries in the hash table for multicast
bridge snooping is falsely set to ETH_P_IP, marking it as an IPv4
address, instead of setting it to ETH_P_IPV6, which results in negative
look-ups in the hash table later.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-22 10:07:26 -08:00
David S. Miller
da935c66ba Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	Documentation/feature-removal-schedule.txt
	drivers/net/e1000e/netdev.c
	net/xfrm/xfrm_policy.c
2011-02-19 19:17:35 -08:00
Vasiliy Kulikov
d846f71195 bridge: netfilter: fix information leak
Struct tmp is copied from userspace.  It is not checked whether the "name"
field is NULL terminated.  This may lead to buffer overflow and passing
contents of kernel stack as a module name to try_then_request_module() and,
consequently, to modprobe commandline.  It would be seen by all userspace
processes.

Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-02-14 16:49:23 +01:00
Jiri Pirko
afc6151a78 bridge: implement [add/del]_slave ops
add possibility to addif/delif via rtnetlink

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-13 16:58:40 -08:00
Herbert Xu
8a870178c0 bridge: Replace mp->mglist hlist with a bool
As it turns out we never need to walk through the list of multicast
groups subscribed by the bridge interface itself (the only time we'd
want to do that is when we shut down the bridge, in which case we
simply walk through all multicast groups), we don't really need to
keep an hlist for mp->mglist.

This means that we can replace it with just a single bit to indicate
whether the bridge interface is subscribed to a group.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-12 01:05:42 -08:00
Herbert Xu
24f9cdcbd7 bridge: Fix timer typo that may render snooping less effective
In a couple of spots where we are supposed to modify the port
group timer (p->timer) we instead modify the bridge interface
group timer (mp->timer).

The effect of this is mostly harmless.  However, it can cause
port subscriptions to be longer than they should be, thus making
snooping less effective.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-11 21:59:37 -08:00
Herbert Xu
6b0d6a9b42 bridge: Fix mglist corruption that leads to memory corruption
The list mp->mglist is used to indicate whether a multicast group
is active on the bridge interface itself as opposed to one of the
constituent interfaces in the bridge.

Unfortunately the operation that adds the mp->mglist node to the
list neglected to check whether it has already been added.  This
leads to list corruption in the form of nodes pointing to itself.

Normally this would be quite obvious as it would cause an infinite
loop when walking the list.  However, as this list is never actually
walked (which means that we don't really need it, I'll get rid of
it in a subsequent patch), this instead is hidden until we perform
a delete operation on the affected nodes.

As the same node may now be pointed to by more than one node, the
delete operations can then cause modification of freed memory.

This was observed in practice to cause corruption in 512-byte slabs,
most commonly leading to crashes in jbd2.

Thanks to Josef Bacik for pointing me in the right direction.

Reported-by: Ian Page Hands <ihands@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-11 21:59:37 -08:00
David S. Miller
bd4a6974cc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-02-04 14:28:58 -08:00
Pavel Emelyanov
1158f762e5 bridge: Don't put partly initialized fdb into hash
The fdb_create() puts a new fdb into hash with only addr set. This is
not good, since there are callers, that search the hash w/o the lock
and access all the other its fields.

Applies to current netdev tree.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-02-04 13:02:36 -08:00
Michał Mirosław
acd1130e87 net: reduce and unify printk level in netdev_fix_features()
Reduce printk() levels to KERN_INFO in netdev_fix_features() as this will
be used by ethtool and might spam dmesg unnecessarily.

This converts the function to use netdev_info() instead of plain printk().

As a side effect, bonding and bridge devices will now log dropped features
on every slave device change.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 15:45:15 -08:00
Michał Mirosław
04ed3e741d net: change netdev->features to u32
Quoting Ben Hutchings: we presumably won't be defining features that
can only be enabled on 64-bit architectures.

Occurences found by `grep -r` on net/, drivers/net, include/

[ Move features and vlan_features next to each other in
  struct netdev, as per Eric Dumazet's suggestion -DaveM ]

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-24 15:32:47 -08:00
Florian Westphal
6faee60a4e netfilter: ebt_ip6: allow matching on ipv6-icmp types/codes
To avoid adding a new match revision icmp type/code are stored
in the sport/dport area.

Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Holger Eitzenberger <holger@eitzenberger.org>
Reviewed-by: Bart De Schuymer<bdschuym@pandora.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-13 12:05:12 +01:00
Eric Dumazet
255d0dc340 netfilter: x_table: speedup compat operations
One iptables invocation with 135000 rules takes 35 seconds of cpu time
on a recent server, using a 32bit distro and a 64bit kernel.

We eventually trigger NMI/RCU watchdog.

INFO: rcu_sched_state detected stall on CPU 3 (t=6000 jiffies)

COMPAT mode has quadratic behavior and consume 16 bytes of memory per
rule.

Switch the xt_compat algos to use an array instead of list, and use a
binary search to locate an offset in the sorted array.

This halves memory need (8 bytes per rule), and removes quadratic
behavior [ O(N*N) -> O(N*log2(N)) ]

Time of iptables goes from 35 s to 150 ms.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-01-13 12:05:12 +01:00
Changli Gao
f88de8de5a net: bridge: check the length of skb after nf_bridge_maybe_copy_header()
Since nf_bridge_maybe_copy_header() may change the length of skb,
we should check the length of skb after it to handle the ppoe skbs.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-06 11:33:05 -08:00
David S. Miller
dbbe68bb12 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-01-04 11:57:25 -08:00
Tomas Winkler
1a9180a20f net/bridge: fix trivial sparse errors
net/bridge//br_stp_if.c:148:66: warning: conversion of
net/bridge//br_stp_if.c:148:66:     int to
net/bridge//br_stp_if.c:148:66:     int enum umh_wait

net/bridge//netfilter/ebtables.c:1150:30: warning: Using plain integer as NULL pointer

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-03 13:29:18 -08:00
Florian Westphal
e6f26129eb bridge: stp: ensure mac header is set
commit bf9ae5386b
(llc: use dev_hard_header) removed the
skb_reset_mac_header call from llc_mac_hdr_init.

This seems fine itself, but br_send_bpdu() invokes ebtables LOCAL_OUT.

We oops in ebt_basic_match() because it assumes eth_hdr(skb) returns
a meaningful result.

Cc: acme@ghostprotocols.net
References: https://bugzilla.kernel.org/show_bug.cgi?id=24532
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-03 12:09:33 -08:00
Tomas Winkler
9d89081d69 bridge: fix br_multicast_ipv6_rcv for paged skbs
use pskb_may_pull to access ipv6 header correctly for paged skbs
It was omitted in the bridge code leading to crash in blind
__skb_pull

since the skb is cloned undonditionally we also simplify the
the exit path

this fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=25202

Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 IEEE 802.11: authenticated
Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 IEEE 802.11: associated (aid 2)
Dec 15 14:36:40 User-PC hostapd: wlan0: STA 00:15:00:60:5d:34 RADIUS: starting accounting session 4D0608A3-00000005
Dec 15 14:36:41 User-PC kernel: [175576.120287] ------------[ cut here ]------------
Dec 15 14:36:41 User-PC kernel: [175576.120452] kernel BUG at include/linux/skbuff.h:1178!
Dec 15 14:36:41 User-PC kernel: [175576.120609] invalid opcode: 0000 [#1] SMP
Dec 15 14:36:41 User-PC kernel: [175576.120749] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
Dec 15 14:36:41 User-PC kernel: [175576.121035] Modules linked in: approvals binfmt_misc bridge stp llc parport_pc ppdev arc4 iwlagn snd_hda_codec_realtek iwlcore i915 snd_hda_intel mac80211 joydev snd_hda_codec snd_hwdep snd_pcm snd_seq_midi drm_kms_helper snd_rawmidi drm snd_seq_midi_event snd_seq snd_timer snd_seq_device cfg80211 eeepc_wmi usbhid psmouse intel_agp i2c_algo_bit intel_gtt uvcvideo agpgart videodev sparse_keymap snd shpchp v4l1_compat lp hid video serio_raw soundcore output snd_page_alloc ahci libahci atl1c
Dec 15 14:36:41 User-PC kernel: [175576.122712]
Dec 15 14:36:41 User-PC kernel: [175576.122769] Pid: 0, comm: kworker/0:0 Tainted: G        W   2.6.37-rc5-wl+ #3 1015PE/1016P
Dec 15 14:36:41 User-PC kernel: [175576.123012] EIP: 0060:[<f83edd65>] EFLAGS: 00010283 CPU: 1
Dec 15 14:36:41 User-PC kernel: [175576.123193] EIP is at br_multicast_rcv+0xc95/0xe1c [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.123362] EAX: 0000001c EBX: f5626318 ECX: 00000000 EDX: 00000000
Dec 15 14:36:41 User-PC kernel: [175576.123550] ESI: ec512262 EDI: f5626180 EBP: f60b5ca0 ESP: f60b5bd8
Dec 15 14:36:41 User-PC kernel: [175576.123737]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Dec 15 14:36:41 User-PC kernel: [175576.123902] Process kworker/0:0 (pid: 0, ti=f60b4000 task=f60a8000 task.ti=f60b0000)
Dec 15 14:36:41 User-PC kernel: [175576.124137] Stack:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ec556500 f6d06800 f60b5be8 c01087d8 ec512262 00000030 00000024 f5626180
Dec 15 14:36:41 User-PC kernel: [175576.124181]  f572c200 ef463440 f5626300 3affffff f6d06dd0 e60766a4 000000c4 f6d06860
Dec 15 14:36:41 User-PC kernel: [175576.124181]  ffffffff ec55652c 00000001 f6d06844 f60b5c64 c0138264 c016e451 c013e47d
Dec 15 14:36:41 User-PC kernel: [175576.124181] Call Trace:
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01087d8>] ? sched_clock+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0138264>] ? enqueue_entity+0x174/0x440
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e451>] ? sched_clock_cpu+0x131/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c013e47d>] ? select_task_rq_fair+0x2ad/0x730
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0524fc1>] ? nf_iterate+0x71/0x90
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4914>] ? br_handle_frame_finish+0x184/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ? br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e46e9>] ? br_handle_frame+0x189/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4790>] ? br_handle_frame_finish+0x0/0x220 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f83e4560>] ? br_handle_frame+0x0/0x230 [bridge]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff026>] ? __netif_receive_skb+0x1b6/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04f7a30>] ? skb_copy_bits+0x110/0x210
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0503a7f>] ? netif_receive_skb+0x6f/0x80
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cb74c>] ? ieee80211_deliver_skb+0x8c/0x1a0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cc836>] ? ieee80211_rx_handlers+0xeb6/0x1aa0 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04ff1f0>] ? __netif_receive_skb+0x380/0x5b0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c016e242>] ? sched_clock_local+0xb2/0x190
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c012b688>] ? default_spin_lock_flags+0x8/0x10
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ? _raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82cd621>] ? ieee80211_prepare_and_rx_handle+0x201/0xa90 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f82ce154>] ? ieee80211_rx+0x2a4/0x830 [mac80211]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f815a8d6>] ? iwl_update_stats+0xa6/0x2a0 [iwlcore]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8499212>] ? iwlagn_rx_reply_rx+0x292/0x3b0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d83df>] ? _raw_spin_lock_irqsave+0x2f/0x50
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8483697>] ? iwl_rx_handle+0xe7/0x350 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<f8486ab7>] ? iwl_irq_tasklet+0xf7/0x5c0 [iwlagn]
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01aece1>] ? __rcu_process_callbacks+0x201/0x2d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150d05>] ? tasklet_action+0xc5/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150a07>] ? __do_softirq+0x97/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d910c>] ? nmi_stack_correct+0x2f/0x34
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0150970>] ? __do_softirq+0x0/0x1d0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  <IRQ>
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01508f5>] ? irq_exit+0x65/0x70
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05df062>] ? do_IRQ+0x52/0xc0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c01036b0>] ? common_interrupt+0x30/0x38
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c03a1fc2>] ? intel_idle+0xc2/0x160
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c04daebb>] ? cpuidle_idle_call+0x6b/0x100
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c0101dea>] ? cpu_idle+0x8a/0xf0
Dec 15 14:36:41 User-PC kernel: [175576.124181]  [<c05d2702>] ? start_secondary+0x1e8/0x1ee

Cc: David Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-03 11:26:34 -08:00
David S. Miller
b4aa9e05a6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bnx2x/bnx2x.h
	drivers/net/wireless/iwlwifi/iwl-1000.c
	drivers/net/wireless/iwlwifi/iwl-6000.c
	drivers/net/wireless/iwlwifi/iwl-core.h
	drivers/vhost/vhost.c
2010-12-17 12:27:22 -08:00
David Stevens
76d661586c bridge: fix IPv6 queries for bridge multicast snooping
This patch fixes a missing ntohs() for bridge IPv6 multicast snooping.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-16 14:41:23 -08:00
Herbert Xu
deef4b522b bridge: Use consistent NF_DROP returns in nf_pre_routing
The nf_pre_routing functions in bridging have collected two
distinct ways of returning NF_DROP over the years, inline and
via goto.  There is no reason for preferring either one.

So this patch arbitrarily picks the inline variant and converts
the all the gotos.

Also removes a redundant comment.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-12-10 16:04:53 -08:00