This fixes the use of fwmarks to denote IPv4 virtual services
which was unfortunately broken as a result of the integration
of IPv6 support into IPVS, which was included in 2.6.28.
The problem arises because fwmarks are stored in the 4th octet
of a union nf_inet_addr .all, however in the case of IPv4 only
the first octet, corresponding to .ip, is assigned and compared.
In other words, using .all = { 0, 0, 0, htonl(svc->fwmark) always
results in a value of 0 (32bits) being stored for IPv4. This means
that one fwmark can be used, as it ends up being mapped to 0, but things
break down when multiple fwmarks are used, as they all end up being mapped
to 0.
As fwmarks are 32bits a reasonable fix seems to be to just store the fwmark
in .ip, and comparing and storing .ip when fwmarks are used.
This patch makes the assumption that in calls to ip_vs_ct_in_get()
and ip_vs_sched_persist() if the proto parameter is IPPROTO_IP then
we are dealing with an fwmark. I believe this is valid as ip_vs_in()
does fairly strict filtering on the protocol and IPPROTO_IP should
not be used in these calls unless explicitly passed when making
these calls for fwmarks in ip_vs_sched_persist().
Tested-by: Fabien Duchêne <fabien.duchene@student.uclouvain.be>
Cc: Joseph Mack NA3T <jmack@wm7d.net>
Cc: Julius Volz <julius.volz@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This code is used as a library by several device drivers,
which select INET_LRO.
If some are modules and some are statically built into the
kernel, we get build failures if INET_LRO is modular.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ashish Karkare <akarkare@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ac45f602ee ("net: infrastructure
for hardware time stamping") added two skb initialization actions to
__alloc_skb(), which need to be added to skb_recycle_check() as well.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add barrier() to bnx2_get_hw_{tx|rx}_cons() to fix this issue:
http://bugzilla.kernel.org/show_bug.cgi?id=12698
This issue was reported by multiple i386 users. Without barrier(),
the compiled code looks like the following where %eax contains the
address of the tx_cons or rx_cons in the DMA status block. The
status block contents can change between the cmpb and the movzwl
instruction. The driver would crash if the value was not 0xff during
the cmpb instruction, but changed to 0xff during the movzwl
instruction.
6828: 80 38 ff cmpb $0xff,(%eax)
682b: 0f b7 10 movzwl (%eax),%edx
With the added barrier(), the compiled code now looks correct:
683d: 0f b7 10 movzwl (%eax),%edx
6840: 0f b6 c2 movzbl %dl,%eax
6843: 3d ff 00 00 00 cmp $0xff,%eax
Thanks to Pascal de Bruijn <pmjdebruijn@pcode.nl> for reporting the
problem and Holger Noefer <hnoefer@pironet-ndh.com> for patiently
testing test patches for us.
Also updated version to 2.0.1.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When no limit is given, the bfifo uses a default of tx_queue_len * mtu.
Packets handled by qdiscs include the link layer header, so this should
be taken into account, similar to what other qdiscs do.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The setup_rctl call was making a call into the ring structure after it had
been freed. This was causing a panic on shutdown. This call wasn't
necessary since it is possible to get the needed index from
adapter->vfs_allocated_count.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a new wimax_dev is created, it's state has to be __WIMAX_ST_NULL
until wimax_dev_add() is succesfully called. This allows calls into
the stack that happen before said time to be rejected.
Until now, the state was being set (by mistake) to UNINITIALIZED,
which was allowing calls such as wimax_report_rfkill_hw() to go
through even when a call to wimax_dev_add() had failed; that was
causing an oops when touching uninitialized data.
This situation is normal when the device starts reporting state before
the whole initialization has been completed. It just has to be dealt
with.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
When sending a message to user space using wimax_msg(), if nla_put()
fails, correctly interpret the return code from wimax_msg_alloc() as
an err ptr and return the error code instead of crashing (as it is
assuming than non-NULL means the pointer is ok).
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
This patch fixes the wrong message type that are triggered by
user updates, the following commands:
(term1)# conntrack -I -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state LISTEN
(term1)# conntrack -U -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state SYN_SENT
(term1)# conntrack -U -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state SYN_RECV
only trigger event message of type NEW, when only the first is NEW
while others should be UPDATE.
(term2)# conntrack -E
[NEW] tcp 6 10 LISTEN src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0
[NEW] tcp 6 10 SYN_SENT src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0
[NEW] tcp 6 10 SYN_RECV src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0
This patch also removes IPCT_REFRESH from the bitmask since it is
not of any use.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch fixes a problem when you use 32 nodes in the cluster
match:
% iptables -I PREROUTING -t mangle -i eth0 -m cluster \
--cluster-total-nodes 32 --cluster-local-node 32 \
--cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff
iptables: Invalid argument. Run `dmesg' for more information.
% dmesg | tail -1
xt_cluster: this node mask cannot be higher than the total number of nodes
The problem is related to this checking:
if (info->node_mask >= (1 << info->total_nodes)) {
printk(KERN_ERR "xt_cluster: this node mask cannot be "
"higher than the total number of nodes\n");
return false;
}
(1 << 32) is 1. Thus, the checking fails.
BTW, I said this before but I insist: I have only tested the cluster
match with 2 nodes getting ~45% extra performance in an active-active setup.
The maximum limit of 32 nodes is still completely arbitrary. I'd really
appreciate if people that have more nodes in their setups let me know.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
As packets ending with NEXTHDR_NONE don't have a last extension header,
the check for the length needs to be after the check for NEXTHDR_NONE.
Signed-off-by: Christoph Paasch <christoph.paasch@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Pointed out by Dave Miller:
CHECK include/linux/netfilter (57 files)
/home/davem/src/GIT/net-2.6/usr/include/linux/netfilter/xt_LED.h:6: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Patrick McHardy <kaber@trash.net>
a recent fix to e1000 (commit 15b2bee2) caused KVM/QEMU/VMware based
virtualized e1000 interfaces to begin failing when resetting.
This is because the driver in a virtual environment doesn't
get to run instructions *AT ALL* when an interrupt is asserted.
The interrupt code runs immediately and this recent bug fix
allows an interrupt to be possible when the interrupt handler
will reject it (due to the new code), when being called from
any path in the driver that holds the E1000_RESETTING flag.
the driver should use the __E1000_DOWN flag instead of the
__E1000_RESETTING flag to prevent interrupt execution
while reconfiguring the hardware.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix locking issue in alb MAC address management; removed
incorrect locking and replaced with correct locking. This bug was
introduced in commit 059fe7a578
("bonding: Convert locks to _bh, rework alb locking for new locking")
Bug reported by Paul Smith <paul@mad-scientist.net>, who also
tested the fix.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to a semantic changes in flush_workqueue() the current approach of
synchronizing the sysfs handling for connections doesn't work anymore. The
whole approach is actually fully broken and based on assumptions that are
no longer valid.
With the introduction of Simple Pairing support, the creation of low-level
ACL links got changed. This change invalidates the reason why in the past
two independent work queues have been used for adding/removing sysfs
devices. The adding of the actual sysfs device is now postponed until the
host controller successfully assigns an unique handle to that link. So
the real synchronization happens inside the controller and not the host.
The only left-over problem is that some internals of the sysfs device
handling are not initialized ahead of time. This leaves potential access
to invalid data and can cause various NULL pointer dereferences. To fix
this a new function makes sure that all sysfs details are initialized
when an connection attempt is made. The actual sysfs device is only
registered when the connection has been successfully established. To
avoid a race condition with the registration, the check if a device is
registered has been moved into the removal work.
As an extra protection two flush_work() calls are left in place to
make sure a previous add/del work has been completed first.
Based on a report by Marc Pignat <marc.pignat@hevs.ch>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Roger Quadros <ext-roger.quadros@nokia.com>
Tested-by: Marc Pignat <marc.pignat@hevs.ch>
pid doesn't count with some band having more bitrates than the one
associated the first time.
Fix that by counting the maximal available bitrate count and allocate
big enough space.
Secondly, fix touching uninitialized memory which causes panics.
Index sucked from this random memory points to the hell.
The fix is to sort the rates on each band change.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
minstrel doesn't count max rate count in fact, since it doesn't use
a loop variable `i' and hence allocs space only for bitrates found in
the first band.
Fix it by involving the `i' as an index so that it traverses all the
bands now and finds the real max bitrate count.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We forgot to lock using the cfg80211_mutex in
wiphy_apply_custom_regulatory(). Without the lock
there is possible race between processing a reply from CRDA
and a driver calling wiphy_apply_custom_regulatory(). During
the processing of the reply from CRDA we free last_request and
wiphy_apply_custom_regulatory() eventually accesses an
element from last_request in the through freq_reg_info_regd().
This is very difficult to reproduce (I haven't), it takes us
3 hours and you need to be banging hard, but the race is obvious
by looking at the code.
This should only affect those who use this caller, which currently
is ath5k, ath9k, and ar9170.
EIP: 0060:[<f8ebec50>] EFLAGS: 00210282 CPU: 1
EIP is at freq_reg_info_regd+0x24/0x121 [cfg80211]
EAX: 00000000 EBX: f7ca0060 ECX: f5183d94 EDX: 0024cde0
ESI: f8f56edc EDI: 00000000 EBP: 00000000 ESP: f5183d44
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process modprobe (pid: 14617, ti=f5182000 task=f3934d10 task.ti=f5182000)
Stack: c0505300 f7ca0ab4 f5183d94 0024cde0 f8f403a6 f8f63160 f7ca0060 00000000
00000000 f8ebedf8 f5183d90 f8f56edc 00000000 00000004 00000f40 f8f56edc
f7ca0060 f7ca1234 00000000 00000000 00000000 f7ca14f0 f7ca0ab4 f7ca1289
Call Trace:
[<f8ebedf8>] wiphy_apply_custom_regulatory+0x8f/0x122 [cfg80211]
[<f8f3f798>] ath_attach+0x707/0x9e6 [ath9k]
[<f8f45e46>] ath_pci_probe+0x18d/0x29a [ath9k]
[<c023c7ba>] pci_device_probe+0xa3/0xe4
[<c02a860b>] really_probe+0xd7/0x1de
[<c02a87e7>] __driver_attach+0x37/0x55
[<c02a7eed>] bus_for_each_dev+0x31/0x57
[<c02a83bd>] driver_attach+0x16/0x18
[<c02a78e6>] bus_add_driver+0xec/0x21b
[<c02a8959>] driver_register+0x85/0xe2
[<c023c9bb>] __pci_register_driver+0x3c/0x69
[<f8e93043>] ath9k_init+0x43/0x68 [ath9k]
[<c010112b>] _stext+0x3b/0x116
[<c014a872>] sys_init_module+0x8a/0x19e
[<c01049ad>] sysenter_do_call+0x12/0x21
[<ffffe430>] 0xffffe430
=======================
Code: 0f 94 c0 c3 31 c0 c3 55 57 56 53 89 c3 83 ec 14 8b 74 24 2c 89 54 24 0c 89 4c 24 08 85 f6 75
06 8b 35 c8 bb ec f8 a1 cc bb ec f8 <8b> 40 04 83 f8 03 74 3a 48 74 37 8b 43 28 85 c0 74 30 89 c6
8b
EIP: [<f8ebec50>] freq_reg_info_regd+0x24/0x121 [cfg80211] SS:ESP 0068:f5183d44
Cc: stable@kernel.org
Reported-by: Nataraj Sadasivam <Nataraj.Sadasivam@Atheros.com>
Reported-by: Vivek Natarajan <Vivek.Natarajan@Atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We need to be symmetrical in what is done when key is set and cleared.
This is important wrt the key flags as they are used during key
clearing and if they are not set when the key is set the key cannot be
cleared completely.
This addresses the many occurences of the WARN found in
iwl_set_tkip_dynamic_key_info() and tracked in
http://www.kerneloops.org/searchweek.php?search=iwl_set_dynamic_key
If calling iwl_set_tkip_dynamic_key_info()/iwl_remove_dynamic_key()
pair a few times in a row will cause that we run out of key space.
This is because the index stored in the key flags is used by
iwl_remove_dynamic_key() to decide if it should remove the key.
Unfortunately the key flags, and hence the key index is currently only
set at the time the key is written to the device (in
iwl_update_tkip_key()) and _not_ in iwl_set_tkip_dynamic_key_info().
Fix this by setting flags in iwl_set_tkip_dynamic_key_info().
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Another bug in the "cfg80211: do not replace BSS structs" patch,
a forgotten length update leads to bogus data being stored and
passed to userspace, often truncated.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The fragmentation threshold is defined to be including the
FCS, and the code that sets the TX_FRAGMENTED flag correctly
accounts for those four bytes. The code that verifies this
doesn't though, which could lead to spurious warnings and
frames being dropped although everything is ok. Correct the
code by accounting for the FCS.
(JWL -- The problem is described here:
http://article.gmane.org/gmane.linux.kernel.wireless.general/32205 )
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It does not make sense to apply EXPORT_SYMBOL to a static symbol. Fixes
this build error:
drivers/net/wireless/iwlwifi/iwl3945-base.c:1697: error: __ksymtab_iwl3945_rx_queue_reset causes a section type conflict
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This introduces a CDC Ethernet Emulation Model (EEM) host side
driver to support USB EEM devices.
EEM is different from the Ethernet Control Model (ECM) currently
supported by the "CDC Ethernet" driver. One key difference is
that it doesn't require of USB interface alternate settings to
manage interface state; some maldesigned hardware can't handle
that part of USB. It also avoids a separate USB interface for
control and status updates.
[ dbrownell@users.sourceforge.net: fix skb leaks, add rx packet
checks, improve fault handling, EEM conformance updates, cleanup ]
Signed-off-by: Omar Laazimani <omar.oberthur@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_prequeue() refers to the constant value (TCP_RTO_MIN) regardless of
the actual value might be tuned. The following patches fix this and make
tcp_prequeue get the actual value returns from tcp_rto_min().
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes an invalid pointer access in case the receive queue
holds no pointer to the next skb when the queue is empty.
Signed-off-by: Hannes Hering <hering2@de.ibm.com>
Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing it in reverse order causes uevent to be sent before
we have a MAC address, which confuses udev.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0ba25ff4c6 ("br2684: convert to
net_device_ops") inadvertently deleted the initialization of the net_dev
pointer in the br2684_dev structure, leading to crashes. This patch
adds it back.
Reported-by: Mikko Vinni <mmvinni@yahoo.com>
Tested-by: Mikko Vinni <mmvinni@yahoo.com>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel should only be using the high 16 bits of a kernel
generated priority. Filter priorities in all other cases only
use the upper 16 bits of the u32 'prio' field of 'struct tcf_proto',
but when the kernel generates the priority of a filter is saves all
32 bits which can result in incorrect lookup failures when a filter
needs to be deleted or modified.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We were avoiding calling sg_init* on scatterlists passed
into virtnet_send_command to prevent extraneous end markers.
This caused build warnings for uninitialized variables.
Cleanup the code to create proper scatterlists.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes the cleanup in bond_create nicer :) Also now the forgotten
free_netdev is called.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
virtio_net.h uses the macro ETH_ALEN which is defined in linux/if_ether.h.
Discovered when hacking on virtio-over-pci patches.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
LAN9512 and LAN9514 are USB hubs with an integrated 10/100 ethernet
controller. Logically this looks like an ethernet controller (similar
to LAN9500) permanently attached to one of the hub's downstream ports.
This patch adds the usb device id of the new ethernet controller to the
smsc95xx driver. This id is the same in both new devices.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SMSC LAN9500 has dual purpose GPIO/LED pins, and by default at power-on
these are configured as GPIOs. This means that if LEDs are fitted they
won't ever light.
This patch sets them to be LED outputs for speed, duplex and
link/activity.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When netconsole is loaded and a network interface fades away (e.g. on
rmmod $interface_driver_module) the rmmod remains stuck and some locks
are taken that prevent any additional module loading/unloading as well
as interface up/down changes.
In addition kernel logs (and console) get flooded at 10s interval with
[ 122.464065] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
[ 132.704059] unregister_netdevice: waiting for eth0 to become free. Usage count = 1
This patch lets netconsole take NETDEV_UNREGISTER event into account
and release the affected interface if it was in use.
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
xt_socket can use connection tracking, and checks whether it is a module.
Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
bond_slave_info_query() should keep a read lock while accessing slave info,
or risk accessing stale data and corruption.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial: fixing gcc 4.4 compiler warning:
drivers/net/cxgb3/t3_hw.c: In function ‘t3_prep_adapter’:
drivers/net/cxgb3/t3_hw.c:3782: warning: suggest parentheses around operand of ‘!’ or change ‘|’ to ‘||’ or ‘!’ to ‘~’
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>
Signed-off-by: David S. Miller <davem@davemloft.net>
When skb_rx_queue_recorded() is true, we dont want to use jash distribution
as the device driver exactly told us which queue was selected at RX time.
jhash makes a statistical shuffle, but this wont work with 8 static inputs.
Later improvements would be to compute reciprocal value of real_num_tx_queues
to avoid a divide here. But this computation should be done once,
when real_num_tx_queues is set. This needs a separate patch, and a new
field in struct net_device.
Reported-by: Andrew Dickinson <andrew@whydna.net>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>