Commit graph

8175 commits

Author SHA1 Message Date
Daniel Lezcano
6cc118bd50 [NETNS][IPV6] rt6_stats - dynamically allocate the routes statistics
This patch allocates the rt6_stats struct dynamically when the fib6 is
initialized. That provides the ability to create several instances of
this structure for the network namespaces.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:33:43 -08:00
Daniel Lezcano
dcabb819a6 [NETNS][IPV6] fib6_rules - handle several network namespaces
The fib6_rules_ops is moved to the network namespace structure.  All
references are changed to have it relatively to it.

Each time a network namespace is created a new fib6_rules_ops is
allocated, initialized and stored into the network namespace
structure.

The common part of the fib rules is namespace aware, so it is quite
easy to retrieve the network namespace from the rules and use it in
the different callbacks.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:33:08 -08:00
Daniel Lezcano
eb5564b853 [NETNS][IPV6] fib6 rule - dynamic allocation of the rules struct ops
The fib6_rules_ops structure is dynamically allocated, so that allows
to make several instances of it per network namespace.

The global static fib6_rules_ops structure is renamed to
fib6_rules_ops_template in order to quickly memcopy it for the
structure initialization.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:32:30 -08:00
Benjamin Thery
ec7d43c291 [NETNS][IPV6] ip6_fib - clean node use namespace
The fib6_clean_node function should have the network namespace it is
working on. The fib6_cleaner_t structure is extended with the network
namespace field to be passed to the fib6_clean_node function.

The different functions calling the fib6_clean_node function are
extended with the netns parameter when needed to propagate the netns
pointer.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:31:57 -08:00
Daniel Lezcano
63152fc0de [NETNS][IPV6] ip6_fib - gc timer per namespace
Move the timer initialization at the network namespace creation and
store the network namespace in the timer argument.

That enables multiple timers (one per network namespace) to do garbage
collecting.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:31:11 -08:00
Daniel Lezcano
450d19f8ab [NETNS][IPV6] ip6_fib - dynamically allocate gc-timer
The ip6_fib_timer gc timer is dynamically allocated and initialized in
the ip6 fib init function. There are no more references to a static
global variable. That will allow to make multiple instance of the
garbage collecting timer and make them per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:29:33 -08:00
Daniel Lezcano
5b7c931dff [NETNS][IPV6] ip6_fib - add net to gc timer parameter
The fib tables are now relative to the network namespace. When the
garbage collector timer expires, we must have a network namespace
parameter in order to retrieve the tables. For now this is the
init_net, but we should be able to have a timer per namespace and use
the timer callback parameter to pass the network namespace from the
expired timer.

The timer callback, fib6_run_gc, is actually used to be called
synchronously by some functions and asynchronously when the timer
expires.

When the timer expires, the delay specified for fib6_run_gc parameter
is always zero. So, I changed fib6_run_gc to not be a timer callback
but a function called by the timer callback and I added a timer
callback where its work is just to retrieve from the data arg of the
timer the network namespace and call fib6_run_gc with zero expiring
time and the network namespace parameters. That makes the code cleaner
for the fib6_run_gc callers.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:28:58 -08:00
Daniel Lezcano
f3db48517f [NETNS][IPV6] ip6_fib - fib6_clean_all handle several network namespaces
The function fib6_clean_all takes the network namespace as
parameter. That allows to flush the routes related to a specific
network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:27:06 -08:00
Daniel Lezcano
58f09b78b7 [NETNS][IPV6] ip6_fib - make it per network namespace
The fib table for ipv6 are moved to the network namespace structure.
All references to them are made relatively to the network namespace.

All external calls to the ip6_fib functions taking the network
namespace parameter are made using the init_net variable, so the
ip6_fib engine is ready for the namespaces but the callers not yet.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:25:27 -08:00
Daniel Lezcano
e0b85590bc [NETNS][IPV6] ip6_fib - dynamically allocate the fib tables
This patch changes the fib6 tables to be dynamically allocated.  That
provides the ability to make several instances of them when a new
network namespace is created.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 23:24:31 -08:00
YOSHIFUJI Hideaki
4192717807 [IPV6] MCAST: Use standard path for sending MLD/MLDv2 messages.
This is changing the paths for sending MLD/MLDv2 messages
from dev_queue_xmit() to standard dst_output().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki
3b00944c5c [IPV6]: Make ndisc_dst_alloc() common for later use.
For later use, this patch is renaming ndisc_dst_alloc()
(and related function/structures) to icmp6_dst_alloc()
(and so on).  This patch also removing unused function-
pointer argument for it.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki
95e41e93e1 [IPV6]: Make ndisc_flow_init() common for later use.
For later use, this patch is renaming ndisc_flow_init() to
icmpv6_flow_init() and putting it in common place.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:24 +09:00
YOSHIFUJI Hideaki
5e5f3f0f80 [IPV6] ADDRCONF: Convert ipv6_get_saddr() to ipv6_dev_get_saddr().
Since most users of ipv6_get_saddr() pass non-NULL as
dst argument, use ipv6_dev_get_saddr() directly.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki
5ee0910509 [IPV6] SYSCTL: complete initialization for sysctl table in subsystem code.
Move initialization bits for subsystem sysctl tables to
appropriate functions.
 - route
 - icmp

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki
662397fd7a [IPV6]: Move packet_type{} related bits to af_inet6.c.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
YOSHIFUJI Hideaki
db1ed684f6 [IPV6] UDP: Rename IPv6 UDP files.
Rename net/ipv6/udp.c to net/ipv6/udp_ipv6.c
Rename net/ipv6/udplite.c to net/ipv6/udplite_ipv6.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki
8be8af8fa4 [IPV4] UDP: Move IPv4-specific bits to other file.
Move IPv4-specific UDP bits from net/ipv4/udp.c into (new) net/ipv4/udp_ipv4.c.
Rename net/ipv4/udplite.c to net/ipv4/udplite_ipv4.c.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki
cf80efc27d [IPV4]: Fix size description of CONFIG_INET.
CONFIG_INET now enlarges about 400KB, not 140KB.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
YOSHIFUJI Hideaki
e898d4db27 [UDP]: Allow users to configure UDP-Lite.
Let's give users an option for disabling UDP-Lite (~4K).

old:
|    text	   data	    bss	    dec	    hex	filename
|  286498	  12432	   6072	 305002	  4a76a	net/ipv4/built-in.o
|  193830	   8192	   3204	 205226	  321aa	net/ipv6/ipv6.o

new (without UDP-Lite):
|    text	   data	    bss	    dec	    hex	filename
|  284086	  12136	   5432	 301654	  49a56	net/ipv4/built-in.o
|  191835	   7832	   3076	 202743	  317f7	net/ipv6/ipv6.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:22 +09:00
Glenn Griffin
c6aefafb7e [TCP]: Add IPv6 support to TCP SYN cookies
Updated to incorporate Eric's suggestion of using a per cpu buffer
rather than allocating on the stack.  Just a two line change, but will
resend in it's entirety.

Signed-off-by: Glenn Griffin <ggriffin.kernel@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:21 +09:00
Eric Dumazet
11baab7ac3 [TCP]: lower stack usage in cookie_hash() function
400 bytes allocated on stack might be a litle bit too much. Using a
per_cpu var is more friendly.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:21 +09:00
Pavel Emelyanov
988b705077 [ARP]: Introduce the arp_hdr_len helper.
There are some place, that calculate the ARP header length. These
calculations are correct, but 
 a) some operate with "magic" constants,
 b) enlarge the code length (sometimes at the cost of coding style),
 c) are not informative from the first glance.

The proposal is to introduce a helper, that includes all the good
sides of these calculations.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:20:57 -08:00
Dave Young
8e8440f535 [BLUETOOTH]: l2cap info_timer delete fix in hci_conn_del
When the l2cap info_timer is active the info_state will be set to
L2CAP_INFO_FEAT_MASK_REQ_SENT, and it will be unset after the timer is
deleted or timeout triggered.

Here in l2cap_conn_del only call del_timer_sync when the info_state is
set to L2CAP_INFO_FEAT_MASK_REQ_SENT.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:18:55 -08:00
Frank Blaschka
7e36763b2c [NET]: Fix race in generic address resolution.
neigh_update sends skb from neigh->arp_queue while neigh_timer_handler
has increased skbs refcount and calls solicit with the
skb. neigh_timer_handler should not increase skbs refcount but make a
copy of the skb and do solicit with the copy.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:16:04 -08:00
Heiko Carstens
c3d84a4dd2 iucv: fix build error on !SMP
Since a5fbb6d106
"KVM: fix !SMP build error" smp_call_function isn't a define anymore
that folds into nothing but a define that calls up_smp_call_function
with all parameters. Hence we cannot #ifdef out the unused code
anymore...
This seems to be the preferred method, so do this for s390 as well.

net/iucv/iucv.c: In function 'iucv_cleanup_queue':
net/iucv/iucv.c:657: error: '__iucv_cleanup_queue' undeclared

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:12:33 -08:00
Ilpo Järvinen
d152a7d88a [TCP]: Must count fack_count also when skipping
It makes fackets_out to grow too slowly compared with the
real write queue.

This shouldn't cause those BUG_TRAP(packets <= tp->packets_out)
to trigger but how knows how such inconsistent fackets_out
affects here and there around TCP when everything is nowadays
assuming accurate fackets_out. So lets see if this silences
them all.

Reported by Guillaume Chazarain <guichaz@gmail.com>.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:10:16 -08:00
Alexey Dobriyan
8ed7edce82 ipv6: fix inet6_init/icmpv6_cleanup sections mismatch
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 12:02:54 -08:00
Denis V. Lunev
7cd04fa7e3 [TCP]: Merge exit paths in tcp_v4_conn_request.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:59:32 -08:00
Denis V. Lunev
f0fd56ed40 [SCTP]: seq_printf format warning. (fixed)
sctp_association->hbinterval is unsigned long. Replace %8d with %8lu.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:55:54 -08:00
Denis V. Lunev
da7ef338a2 [IPV4]: skb->dst can't be NULL in ip_options_echo.
ip_options_echo is called on the packet input path after the initial
routing. The dst entry on the packet is cleared only in the several
very specific places and immidiately assigned back (may be new).

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-03 11:50:10 -08:00
Denis V. Lunev
1d1c8d13c4 [ICMP]: Section conflict between icmp_sk_init/icmp_sk_exit.
Functions from __exit section should not be called from ones in __init
section. Fix this conflict.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 14:15:19 -08:00
David S. Miller
4a80f27889 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6.26 2008-02-29 13:41:25 -08:00
Johannes Berg
e486182907 mac80211: fix key replacing, hw accel
Even though I thought about it a lot and had also tested it, some
of my recent changes in the key code broke replacing keys, making
the kernel oops because a key is removed from a list while not on
it.

This patch fixes that using the list as an indication whether or
not the key is on it (an empty list means it's not on any list.)

Also, this patch fixes hw accel enabling, the check for not doing
hw accel when the interface is down was lost and is restored by
this.

Additionally, move adding the key to the list into the function
__ieee80211_key_replace() for more consistency.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:07 -05:00
Johannes Berg
db4d1169d0 mac80211: split ieee80211_key_alloc/free
In order to RCU-ify sta_info, we need to be able to allocate
a key without linking it to an sdata/sta structure (because
allocation cannot be done in an rcu critical section). This
patch splits up ieee80211_key_alloc() and updates all users
appropriately.

While at it, this patch fixes a number of race conditions
such as finally making key replacement atomic, unfortunately
at the expense of more complex code.

Note that this patch documents /existing/ bugs with sta info
and key interaction, there is currently a race condition
when a sta info is freed without holding the RTNL. This will
finally be fixed by a followup patch.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:04 -05:00
Johannes Berg
6f48422a29 mac80211: remove STA infos last_ack stuff
These things aren't used and the only possible use is within
rate control algorithms, however those can, if they need it,
keep track of it in their private data. last_ack_ms isn't
even updated so completely useless.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:02 -05:00
Johannes Berg
e6a5ddf208 mac80211: safely free beacon in ieee80211_if_reinit
If ieee80211_if_reinit() is called from ieee80211_unregister_hw()
then it is possible that the driver will still request a beacon
(it is allowed to until ieee80211_unregister_hw() has returned.)
This means we need to use an RCU-protected write to the beacon
information even in this function.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:42:00 -05:00
Johannes Berg
fba4a1e637 mac80211: fix IBSS code
This patch fixes two errors introduced by

    commit 19d35612f3cd7f60dd9174c0100584e21f5a1025
    Author: Bruno Randolf <bruno@thinktube.com>
    Date:   Mon Feb 18 11:21:36 2008 +0900

        mac80211: enable IBSS merging

The first error is an endianness problem that sparse found and
the second is a build failure when CONFIG_MAC80211_IBSS_DEBUG
is not set.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Bruno Randolf <bruno@thinktube.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:41 -05:00
Johannes Berg
f3af89d1aa mac80211: fix debugfs_sta print_mac() warning
When print_mac() was marked as __pure to avoid emitting a function
call in pr_debug() scenarios, a warning in this code surfaced since
it relies on the fact that the buffer is modified and doesn't use
the return value. This patch makes it use the return value instead.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:38 -05:00
Johannes Berg
665e8aafb4 mac80211: Disallow concurrent IBSS/STA mode interfaces
Disallow having more than one IBSS interface up at any time
because of beacon distribution issues, and for now also disallow
having more than one IBSS/STA interface up at the same time
because we use the master interface's BSS struct which would
be completely corrupted when we have more than one up.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:36 -05:00
Johannes Berg
43ba7e958f mac80211: atomically check whether STA exists already
When a STA structure is added, it is often checked whether it
already exists before adding it. This, however, isn't done
atomically so there is a race condition that could lead to two
STA structures being added with the same MAC address. This
patch changes sta_info_add() to return an ERR_PTR in case
of failure and adds the failure mode -EEXIST when the STA
already exists.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Luis Carlos Cobo <luisca@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:34 -05:00
Johannes Berg
d46e144b65 mac80211: rework TX filtered frame code
This reworks the code for TX filtered frames, splitting it out to
a new function to handle those cases, making the clear instruction
a flag and renaming a few things to be easier to understand and
less Atheros hardware specific. Finally, it also makes the comments
explain more.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:41:32 -05:00
Pavel Roskin
d97cf01576 mac80211: fix incorrect use of CONFIG_MAC80211_IBSS_DEBUG
Configuration variables are only available to the preprocessor

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:27 -05:00
Johannes Berg
004c872e78 mac80211: consolidate TIM handling code
This consolidates all TIM handling code to avoid re-introducing
errors with the bitmap/set_tim order and to reduce code. While
reading the code I noticed a possible problem so I also added
a comment about that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Johannes Berg
836341a704 mac80211: remove sta TIM flag, fix expiry TIM handling
The TIM flag that is kept in each station's info is completely
useless, there's no code (aside from the debugfs display code)
checking it, hence it can be removed. While doing that, I noticed
that the TIM handling is broken when buffered frames expire, so
fix that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Johannes Berg
d2259243a1 mac80211: invoke set_tim() callback after setting own TIM info
Drivers should be allowed to simply get a complete new beacon when
set_tim() is invoked (and set_tim() is required for drivers that
just want a beacon template!), so we need to update our own TIM
bitmap before calling set_tim() so that getting the beacon will
now get an already updated beacon.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:26 -05:00
Tomas Winkler
b46b4ee034 wireless: update US regulatory domain
This patch adds channels to US regulatory domain

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:25 -05:00
Johannes Berg
4a9a66e9a8 mac80211: convert sta_info.pspoll to a flag
This doesn't really need to be a full int variable since it's
just a flag to indicate a PS-poll is in progress.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:13 -05:00
Bruno Randolf
9d9bf77d16 mac80211: enable IBSS merging
enable IBSS cell merging. if an IBSS beacon with the same channel, same ESSID
and a TSF higher than the local TSF (mactime) is received, we have to join its
BSSID. while this might not be immediately apparent from reading the 802.11
standard it is compliant and necessary to make IBSS mode functional in many
cases. most drivers have a similar behaviour.

* move the relevant code section (previously only containing debug code) down
to the end of the function, so we can reuse the bss structure.

* we have to compare the mactime (TSF at the time of packet receive) rather
than the current TSF. since mactime is defined as the time the first data
symbol arrived we add the time until byte 24 where the timestamp resides, since
this is how the beacon timestamp is defined. as some some drivers are not able
to give a reliable mactime we fall back to use the current TSF, which will be
enough to catch most (but not all) cases where an IBSS merge is necessary.

* in IBSS mode we want to allow beacons to override probe response info so we
can correctly do merges.

* we don't only configure beacons based on scan results, so change that
message.

* to enable this we have to let all beacons thru in IBSS mode, even if they
have a different BSSID.

Signed-off-by: Bruno Randolf <bruno@thinktube.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:12 -05:00
Bruno Randolf
a607268a0d mac80211: move function ieee80211_sta_join_ibss()
this moves ieee80211_sta_join_ibss() up for the next patch (ibss merge).

Signed-off-by: Bruno Randolf <bruno@thinktube.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:12 -05:00
S.Çağlar Onur
ab46623ec1 net/mac80211/: Use time_* macros
The functions time_before, time_before_eq, time_after, and time_after_eq are more robust for comparing jiffies against other values.

So following patch implements usage of the time_after() macro, defined at linux/jiffies.h, which deals with wrapping correctly

Cc: linux-wireless@vger.kernel.org
Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:10 -05:00
Johannes Berg
ac2bf3242e mac80211: fix ecw2cw brain-damage
This brain-damaged code just bothers me, fix it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:10 -05:00
Johannes Berg
3330d7be70 mac80211: give burst time in txop rather than 0.1msec units
This changes mac80211 to pass the burst time to conf_tx in txop
units rather than 0.1msec units. 0.1msec units are only required
by atheros hardware (according to current driver support), all
other drivers do other calculations or require the txop value.
Therefore, it results in fewer calculations and more precision
if we just pass the txop value through to the driver.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:07 -05:00
Johannes Berg
96d510566e mac80211: defer master netdev allocation to ieee80211_register_hw
When we want to go multiqueue, we will need to know the number of
queues the hardware has for registering the master netdev. This
number is only available in ieee80211_register_hw() rather than
ieee80211_alloc_hw(), so defer allocation of the master device to
ieee80211_register_hw().

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:06 -05:00
Ron Rindjunsky
a9af2013ca mac80211: adjustable number of bits for qdisc pool
This fix allows to control the number of bits that qdiscs book keeping
can be done for with respect to the qdisc pool

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:05 -05:00
Michael Wu
3d30d949cf mac80211: Add cooked monitor mode support
This adds "cooked" monitor mode to mac80211. A monitor interface
in "cooked" mode will see all frames that mac80211 has not used
internally.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00
Johannes Berg
8944b79fe9 mac80211: move some code into ieee80211_invoke_rx_handlers
There is some duplicated code that sits in front of each function
call to ieee80211_invoke_rx_handlers() that can very well be part
of that function if it gets slightly different arguments.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00
Johannes Berg
589052904a mac80211: remove "dynamic" RX/TX handlers
It doesn't really make sense to have extra pointers to the RX/TX
handler arrays instead of just using the arrays directly, that
also allows us to make them static.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00
Johannes Berg
2c9745e568 mac80211: clean up some things in the RX path
Uninline ieee80211_invoke_rx_handlers to save .text space,
make the code more readable in some places and remove the
"optimisation" that is hit only very few times and unclear
to start with.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:03 -05:00
Michael Wu
8cc9a73914 mac80211: Use monitor configuration flags
Take advantage of the monitor configuration flags now provided by cfg80211.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:02 -05:00
Michael Wu
66f7ac50ed nl80211: Add monitor interface configuration flags
This allows precise control over what a monitor interface shows.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:02 -05:00
Johannes Berg
e4c26add88 mac80211: split RX_DROP
Some instances of RX_DROP mean that the frame was useless,
others mean that the frame should be visible in userspace
on "cooked" monitor interfaces. This patch splits up RX_DROP
and changes each instance appropriately.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:02 -05:00
Johannes Berg
9ae54c8463 mac80211: split ieee80211_txrx_result
The _DROP result will need to be split in the RX path but not
in the TX path, so for preparation split up the type into two
types, one for RX and one for TX. Also make sure (via sparse)
that they cannot be confused.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:37:01 -05:00
Ivo van Doorn
406f2388cc wireless: Fix WARN_ON() with ieee802.11b
When the driver registers a IEEE80211_BAND_2GHZ band,
it can either be 802.11b or 802.11g. But when 802.11b rates
are registered "want" will be 3 (since 4 rates are being registered,
and each of those 4 rates will decrease "want").
Since this is a correct situation, there is no need to trigger
a WARN_ON() for this.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:33 -05:00
Johannes Berg
aac09fbf82 wireless: fix ERP rate flags
In the rate API patch I accidentally reverted the test for
ERP rates, this fixes it. All rates except 1, 2, 5.5 and 11
MBit are ERP rates, not those.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:33 -05:00
Stefano Brivio
b7c50de92e rc80211-pid: fix rate adjustment
Merge rate_control_pid_shift_adjust() to rate_control_pid_adjust_rate()
in order to make the learning algorithm aware of constraints on rates. Also
add some comments and rename variables.

This fixes a bug which prevented 802.11b/g non-AP STAs from working with
802.11b only AP STAs.

Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:33 -05:00
Johannes Berg
238814fd9a mac80211: remove port control enable switch, clean up sta flags
This patch removes the 802.1X port acess control enable flag
since it is not required. Instead, set the authorized flag for
each station that we normally communicate with (WDS peers, IBSS
peers and APs we're associated to) and require hostapd to set
the authorized flag for all stations when port control is not
enabled.

Also, since I was working in that area, this documents station
flags and removes the unused "permanent" one.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:33 -05:00
Johannes Berg
69d464d593 mac80211: fix scan band off-by-one error
When checking for the next band to advance to, there
was an off-by-one error that could lead to an access
to an invalid array index. Additionally, the later
check for scan_band >= IEEE80211_NUM_BANDS is not
required since that will never be true.

This also improves the comments related to that code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:32 -05:00
Johannes Berg
ee688b000d nl80211: export hardware bitrate/channel capabilities
This makes nl80211 export the hardware bitrate/channel capabilities
as registered in a wiphy.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:32 -05:00
Johannes Berg
8318d78a44 cfg80211 API for channels/bitrates, mac80211 and driver conversion
This patch creates new cfg80211 wiphy API for channel and bitrate
registration and converts mac80211 and drivers to the new API. The
old mac80211 API is completely ripped out. All drivers (except ath5k)
are updated to the new API, in many cases I expect that optimisations
can be done.

Along with the regulatory code I've also ripped out the
IEEE80211_HW_DEFAULT_REG_DOMAIN_CONFIGURED flag, I believe it to be
unnecessary if the hardware simply gives us whatever channels it wants
to support and we then enable/disable them as required, which is pretty
much required for travelling.

Additionally, the patch adds proper "basic" rate handling for STA
mode interface, AP mode interface will have to have new API added
to allow userspace to set the basic rate set, currently it'll be
empty... However, the basic rate handling will need to be moved to
the BSS conf stuff.

I do expect there to be bugs in this, especially wrt. transmit
power handling where I'm basically clueless about how it should work.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:32 -05:00
Johannes Berg
38f3714d66 mac80211: dissolve pre-rx handlers
These handlers do not really return a status and the compiler
can do a much better job when they're simply static functions
that it can inline if appropriate. Also makes the code shorter.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:21 -05:00
Ron Rindjunsky
d92684e660 mac80211: A-MPDU Tx add delBA from recipient support
This patch adds the ability to handle delBA from recipient to initiator
during an A-MPDU session

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:18 -05:00
Ron Rindjunsky
eb2ba62ee5 mac80211: A-MPDU add debugfs support
This patch adds A-MPDU status report per STA to the debugfs.
The option to de/activate A-MPDU through debugfs is also present.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:17 -05:00
Ron Rindjunsky
fe3bf0f59e mac80211: A-MPDU Tx MLME data initialization
This patch initialize A-MPDU MLME data for Tx sessions.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:17 -05:00
Ron Rindjunsky
9e72349237 mac80211: A-MPDU Tx adding qdisc support
This patch allows qdisc support in A-MPDU Tx. a method to
handle QoS <-> TID switches is present in this patch.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:17 -05:00
Ron Rindjunsky
eadc8d9e90 mac80211: A-MPDU Tx adding basic functionality
This patch adds the following abilities to mac80211:
 - start A-MPDU Tx session
 - stop A-MPDU Tx session
 - call backs to start/stop A-MPDU Tx session
 - sending addBA request
 - processing addBA response

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:14 -05:00
Ron Rindjunsky
80656c2031 mac80211: A-MPDU Tx add MLME structures
This patch adds the needed structures to describe the Tx aggregation MLME
per STA
new:
 - struct tid_ampdu_tx: TID aggregation information (Tx)
changed:
 - struct sta_ampdu_mlme: Tx aggregation information per TID and
			  dialog token creator were added
 - struct sta_info: tid_to_tx_q added for tid<->tx queue mapping

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:13 -05:00
Ron Rindjunsky
0df3ef45a3 mac80211: A-MPDU Tx add session's and low level driver's API
This patch adds the API for 3 stages in A-MPDU Tx session flow:
- request mac80211 to start/stop A-MPDU Tx session for specific TID. such a
  request should be issued by a load aware element, either mac80211 itself
  or external element.
- requests by mac80211 to low-level driver to start/stop Tx aggregation.
  notice that low level driver responds now with Starting Sequence Number.
- async feedback by low-level to mac80211 to inform that HW is ready for
  next A-MPDU Tx state.
Changes in API to Rx A-MPDU were also made, reflected in iwlwifi changes as
well.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:13 -05:00
Johannes Berg
7d185b8bb1 mac80211: allow sending multicast frames through virtual ports
When reworking the port access control code, I forgot multicast frames
and those are now always rejected because the destination station is
not known. This changes the code to allow through multicast frames and
also avoid the sta hash lookup (which is bound to fail) for them.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:09 -05:00
Tomas Winkler
b220525696 mac80211: set assoc flag to bss_conf
Only BSS_CHANGED_ASSOC was set in the 'changed' bitmask. Assignment  to
bss_conf.assoc was absent.
This patch assign value to  bss_conf.assoc according the association state.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-29 15:19:08 -05:00
Pavel Emelyanov
95a363582b [NET]: Use existing device list walker for /proc/dev_mcast.
The seq_file_operations' dev_mc_seq_xxx callbacks do the same thing as
the dev_seq_xxx ones do, but skip the SEQ_START_TOKEN.

So use the existing exported dev_seq_xxx calls and handle the
SEQ_START_TOKEN in the dev_mc_seq_show().

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:44:14 -08:00
Denis V. Lunev
fd80eb942a [INET]: Remove struct dst_entry *dst from request_sock_ops.rtx_syn_ack.
It looks like dst parameter is used in this API due to historical
reasons.  Actually, it is really used in the direct call to
tcp_v4_send_synack only.  So, create a wrapper for tcp_v4_send_synack
and remove dst from rtx_syn_ack.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:43:03 -08:00
Neil Horman
58fbbed4fb [SCTP]: extend exported data in /proc/net/sctp/assoc
RFC 3873 specifies several MIB objects that can't be obtained by the
current data set exported by /proc/sys/net/sctp/assoc.  This patch
adds the missing pieces of data that allow us to compute all the
objects in the sctpAssocTable object.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:40:56 -08:00
Pavel Emelyanov
665bba1087 [NETFILTER/RXRPC]: Don't use seq_release_private where inappropriate.
Some netfilter code and rxrpc one use seq_open() to open
a proc file, but seq_release_private to release one.

This is harmless, but ambiguous.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:39:17 -08:00
Pavel Emelyanov
c20932d2c9 [ATALK/DECNET]: Use seq_open_private in appletalk and decnet.
These two also perform manual seq_open_private, so patch them both at
once. But unlike ATM code, these already use the seq_release_private,
so I splitted this patch from the previous one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:38:24 -08:00
Pavel Emelyanov
9a8c09e73b [ATM]: Use seq_open/release_privade instead of manual manipulations.
lec_seq_open/lec_seq_release and __vcc_seq_open/vcc_seq_release
do seq_open/release_private's job.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:37:02 -08:00
David S. Miller
45af1754bc [NET]: sk_release_kernel needs to be exported to modules
Fixes:

ERROR: "sk_release_kernel" [net/ipv6/ipv6.ko] undefined!

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:33:19 -08:00
Pavel Emelyanov
459eea7410 [SCTP]: Use proc_create to setup de->proc_fops.
In addition to commit 160f17 ("[SCTP]: Use proc_create() to setup
->proc_fops first") use proc_create in two more places.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:24:45 -08:00
Denis V. Lunev
98c6d1b261 [NETNS]: Make icmpv6_sk per namespace.
All preparations are done. Now just add a hook to perform an
initialization on namespace startup and replace icmpv6_sk macro with
proper inline call.  Actual namespace the packet belongs too will be
passed later along with the one for the routing.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:21:22 -08:00
Denis V. Lunev
4a6ad7a141 [NETNS]: Make icmp_sk per namespace.
All preparations are done. Now just add a hook to perform an
initialization on namespace startup and replace icmp_sk macro with
proper inline call.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:19:58 -08:00
Denis V. Lunev
5c8cafd65e [NETNS]: icmp(v6)_sk should not pin a namespace.
So, change icmp(v6)_sk creation/disposal to the scheme used in the
netlink for rtnl, i.e. create a socket in the context of the init_net
and assign the namespace without getting a referrence later.

Also use sk_release_kernel instead of sock_release to properly destroy
such sockets.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:19:22 -08:00
Denis V. Lunev
edf0208702 [NET]: Make netlink_kernel_release publically available as sk_release_kernel.
This staff will be needed for non-netlink kernel sockets, which should
also not pin a namespace like tcp_socket and icmp_socket.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:18:32 -08:00
Denis V. Lunev
9dfbec1fb2 [NETLINK]: No need for a separate __netlink_release call.
Merge it to netlink_kernel_release.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:17:56 -08:00
Denis V. Lunev
79c9115953 [ICMP]: Allocate data for __icmp(v6)_sk dynamically.
Own __icmp(v6)_sk should be present in each namespace. So, it should be
allocated dynamically. Though, alloc_percpu does not fit the case as it
implies additional dereferrence for no bonus.

Allocate data for pointers just like __percpu_alloc_mask does and place
pointers to struct sock into this array.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:17:11 -08:00
Denis V. Lunev
405666db84 [ICMP]: Pass proper ICMP socket into icmp(v6)_xmit_(un)lock.
We have to get socket lock inside icmp(v6)_xmit_lock/unlock. The socket
is get from global variable now. When this code became namespaces, one
should pass a namespace and get socket from it.

Though, above is useless. Socket is available in the caller, just pass
it inside. This saves a bit of code now and saves more later.

add/remove: 0/0 grow/shrink: 1/3 up/down: 1/-169 (-168)
function                                     old     new   delta
icmp_rcv                                     718     719      +1
icmpv6_rcv                                  2343    2303     -40
icmp_send                                   1566    1518     -48
icmp_reply                                   549     468     -81

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:16:46 -08:00
Denis V. Lunev
b7e729c4b4 [ICMP]: Store sock rather than socket for ICMP flow control.
Basically, there is no difference, what to store: socket or sock. Though,
sock looks better as there will be 1 less dereferrence on the fast path.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:16:08 -08:00
Denis V. Lunev
1e3cf6834e [ICMP]: Optimize icmp_socket usage.
Use this macro only once in a function to save a bit of space.

add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-98 (-98)
function                                     old     new   delta
icmp_reply                                   562     561      -1
icmp_push_reply                              305     258     -47
icmp_init                                    273     223     -50

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:15:42 -08:00
Denis V. Lunev
a5710d6582 [ICMP]: Add return code to icmp_init.
icmp_init could fail and this is normal for namespace other than initial.
So, the panic should be triggered only on init_net initialization path.

Additionally create rollback path for icmp_init as a separate function.
It will also be used later during namespace destruction.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:14:50 -08:00
Denis V. Lunev
9b0f976f27 [INET]: Remove struct net_proto_family* from _init calls.
struct net_proto_family* is not used in icmp[v6]_init, ndisc_init,
igmp_init and tcp_v4_init. Remove it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 11:13:15 -08:00
Wang Chen
5e47879f49 [IRDA]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29 10:34:45 -08:00
Sangtae Ha
0bc8c7bf9e [TCP]: BIC web page link is corrected.
Signed-off-by: Sangtae Ha <sha2@ncsu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 22:14:32 -08:00
Timo Teras
4c563f7669 [XFRM]: Speed up xfrm_policy and xfrm_state walking
Change xfrm_policy and xfrm_state walking algorithm from O(n^2) to O(n).
This is achieved adding the entries to one more list which is used
solely for walking the entries.

This also fixes some races where the dump can have duplicate or missing
entries when the SPD/SADB is modified during an ongoing dump.

Dumping SADB with 20000 entries using "time ip xfrm state" the sys
time dropped from 1.012s to 0.080s.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 21:31:08 -08:00
Adrian Bunk
1e04d53070 [IPV6]: Unexport ip6_find_1stfragopt
This patch removes the no longer used 
EXPORT_SYMBOL_GPL(ip6_find_1stfragopt).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 21:27:35 -08:00
Juha-Matti Tapio
99cd07a537 [IPV6]: Fix source address selection for ORCHID addresses
Skip the prefix length matching in source address selection for
orchid -> non-orchid addresses.

Overlay Routable Cryptographic Hash IDentifiers (RFC 4843,
2001:10::/28) are currenty not globally reachable. Without this
check a host with an ORCHID address can end up preferring those over
regular addresses when talking to other regular hosts in the 2001::/16
range thus breaking non-orchid connections.

Signed-off-by: Juha-Matti Tapio <jmtapio@verkkotelakka.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:55:46 -08:00
Juha-Matti Tapio
5fe47b8a65 [IPV6]: Add ORCHID prefix to address label table
Add a new label for Overlay Routable Cryptographic Hash Identifiers
(RFC 4843) prefix 2001:10::/28 to help proper source address
selection.

ORCHID addresses are used by for example Host Identity Protocol. They are 
global and routable, but they currently need support from both endpoints 
and therefore mixing regular and ORCHID addresses for source and 
destination is a bad idea in general case.

Signed-off-by: Juha-Matti Tapio <jmtapio@verkkotelakka.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:55:02 -08:00
Denis V. Lunev
c4544c7243 [NETNS]: Process inet_select_addr inside a namespace.
The context is available from a network device passed in.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:52:54 -08:00
Denis V. Lunev
3776c8891a [NETNS]: Enable IPv4 address manipulations inside namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:52:25 -08:00
Denis V. Lunev
1937504dd1 [NETNS]: Enable all routing manipulation via netlink inside namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:52:04 -08:00
Denis V. Lunev
e5b13cb10d [NETNS]: Process devinet ioctl in the correct namespace.
Add namespace parameter to devinet_ioctl and locate device inside it for
state changes.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:51:43 -08:00
Denis V. Lunev
73b3871165 [NETNS]: Register /proc/net/rt_cache for each namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:51:18 -08:00
Denis V. Lunev
a75e936f2f [NETNS]: Process /proc/net/rt_cache inside a namespace.
Show routing cache for a particular namespace only.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:50:55 -08:00
Denis V. Lunev
642d631811 [IPV4]: rt_cache_get_next should take rt_genid into account.
In the other case /proc/net/rt_cache will look inconsistent in respect to
genid.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:50:33 -08:00
Denis V. Lunev
317805b8f8 [NETNS]: Process ip_rt_redirect in the correct namespace.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:50:06 -08:00
Denis V. Lunev
9de8f76d20 [NETNS]: DST cleanup routines should be called inside namespace.
Device inside the namespace can be started and downed. So, active routing
cache should be cleaned up on device stop.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:49:44 -08:00
Denis V. Lunev
be162d6288 [NETNS]: Enable inetdev_event notifier.
After all these preparations it is time to enable main IPv4 device
initialization routine inside namespace. It is safe do this now.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:49:13 -08:00
Denis V. Lunev
2430aa85de [NETNS]: Disable multicaststing configuration inside non-initial namespace.
Do not calls hooks from device notifiers and disallow configuration from
ioctl/netlink layer.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:48:49 -08:00
Denis V. Lunev
0c65babd6c [NETNS]: Default arp parameters lookup.
Default ARP parameters should be findable regardless of the context.
Required to make inetdev_event working.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:48:25 -08:00
Denis V. Lunev
4ab438fcd7 [NETNS]: Register neighbour table parameters in the correct namespace.
neigh_sysctl_register should register sysctl entries inside correct namespace
to avoid naming conflict. Typical example is a loopback. Entries for it
present in all namespaces.

Required to make inetdev_event working.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:48:01 -08:00
Denis V. Lunev
6133fb1aa1 [NETNS]: Disable inetaddr notifiers in namespaces other than initial.
ip_fib_init is kept enabled. It is already namespace-aware.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:46:17 -08:00
Denis V. Lunev
6fc68624e5 [NETFILTER]: Consolidate masq_inet_event and masq_device_event.
They do exactly the same job.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:45:41 -08:00
Denis V. Lunev
5811769c78 [IPV4]: Remove check for ifa->ifa_dev != NULL.
This is a callback registered to inet address notifier chain.
The check is useless as:
- ifa->ifa_dev is always != NULL
- similar checks are abscent in all other notifiers.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 20:45:00 -08:00
Wang Chen
2335f8ec27 [X25]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:16:33 -08:00
Wang Chen
6d37a15816 [WANROUTER]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:15:56 -08:00
Wang Chen
d45fba3625 [8021Q]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:14:58 -08:00
Wang Chen
770207208e [IPV4]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:14:25 -08:00
Wang Chen
4436f4cbfa [IPV6]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:13:46 -08:00
Wang Chen
160f17e345 [SCTP]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:13:16 -08:00
Wang Chen
25296d599c [PKTGEN]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:11:49 -08:00
Wang Chen
46ecf0b994 [NEIGHBOUR]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:10:51 -08:00
Wang Chen
7e02180998 [LLC]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:08:54 -08:00
Wang Chen
1e15dc981d [IPX]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:06:14 -08:00
Wang Chen
2ce8f047d5 [SUNRPC]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 14:00:59 -08:00
David S. Miller
64758bd792 Merge branch 'pending' of master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev 2008-02-28 13:56:37 -08:00
Wang Chen
16e297b358 [ATM]: Use proc_create() to setup ->proc_fops first
Use proc_create() to make sure that ->proc_fops be setup before gluing
PDE to main tree.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 13:55:45 -08:00
Vlad Yasevich
7e8616d8e7 [SCTP]: Update AUTH structures to match declarations in draft-16.
The new SCTP socket api (draft 16) updates the AUTH API structures.
We never exported these since we knew they would change.
Update the rest to match the draft.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-28 16:45:04 -05:00
Vlad Yasevich
b40db68468 [SCTP]: Incorrect length was used in SCTP_*_AUTH_CHUNKS socket option
The chunks are stored inside a parameter structure in the kernel
and when we copy them to the user, we need to account for
the parameter header.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-28 16:45:01 -05:00
Neil Horman
15efbe7639 [SCTP]: Clean up naming conventions of sctp protocol/address family registration
I noticed while looking into some odd behavior in sctp, that the variable
name sctp_pf_inet6_specific was used twice to represent two different
pieces of data (its both a structure name and a pointer to that type of
structure), which is confusing to say the least, and potentially dangerous
depending on the variable scope.  This patch cleans that up, and makes the
protocol and address family registration names in SCTP more regular,
increasing readability.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

 ipv6.c     |   12 ++++++------
 protocol.c |   12 ++++++------
 2 files changed, 12 insertions(+), 12 deletions(-)
2008-02-28 16:41:05 -05:00
Wang Chen
ed2b5b474e [APPLETALK]: Use proc_create() to setup ->proc_fops first
As Davem mentioned in his recently patch
(d9595a7b9c)
that the procfs visibility should occur after
the ->proc_fops are setup.

And also, Alexey provide proc_create() to make
sure that ->proc_fops is setup before gluing PDE
to main tree.

We use proc_create().

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 12:53:32 -08:00
Herbert Xu
21e43188f2 [IPCOMP]: Disable BH on output when using shared tfm
Because we use shared tfm objects in order to conserve memory,
(each tfm requires 128K of vmalloc memory), BH needs to be turned
off on output as that can occur in process context.

Previously this was done implicitly by the xfrm output code.
That was lost when it became lockless.  So we need to add the
BH disabling to IPComp directly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-28 11:23:17 -08:00
David S. Miller
60717f7e76 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-02-28 11:03:29 -08:00
Johannes Berg
03147dfc8a mac80211: fix kmalloc vs. net_ratelimit
The "goto end;" part definitely must not be rate limited.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-28 09:13:10 -05:00
Vlad Yasevich
b90a137d30 [SCTP]: Correctly set the length of sctp_assoc_change notification
sctp_assoc_change notification may contain the data from a received
ABORT chunk.  Set the length correctly to account for that.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-27 16:40:55 -05:00
Jan Engelhardt
6556874dc3 [NETFILTER]: xt_conntrack: fix IPv4 address comparison
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-27 12:20:41 -08:00
Jan Engelhardt
d61f89e941 [NETFILTER]: xt_conntrack: fix missing boolean clamping
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-27 12:09:05 -08:00
Patrick McHardy
4e29e9ec7e [NETFILTER]: nf_conntrack: fix smp_processor_id() in preemptible code warning
Since we're using RCU for the conntrack hash now, we need to avoid
getting preempted or interrupted by BHs while changing the stats.

Fixes warning reported by Tilman Schmidt <tilman@imap.cc> when using
preemptible RCU:

[   48.180297] BUG: using smp_processor_id() in preemptible [00000000] code: ntpdate/3562
[   48.180297] caller is __nf_conntrack_find+0x9b/0xeb [nf_conntrack]
[   48.180297] Pid: 3562, comm: ntpdate Not tainted 2.6.25-rc2-mm1-testing #1
[   48.180297]  [<c02015b9>] debug_smp_processor_id+0x99/0xb0
[   48.180297]  [<fac643a7>] __nf_conntrack_find+0x9b/0xeb [nf_conntrack]

Tested-by: Tilman Schmidt <tilman@imap.cc>
Tested-by: Christian Casteyde <casteyde.christian@free.fr> [Bugzilla #10097]

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-27 12:07:47 -08:00
YOSHIFUJI Hideaki
3bdfe7ec08 [IPV6] SYSCTL: Fix possible memory leakage in error path.
In error path, we do need to free memory just allocated.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-27 12:06:38 -08:00
Pavel Emelyanov
b37d428b24 [INET]: Don't create tunnels with '%' in name.
Four tunnel drivers (ip_gre, ipip, ip6_tunnel and sit) can receive a
pre-defined name for a device from the userspace.  Since these drivers
call the register_netdevice() (rtnl_lock, is held), which does _not_
generate the device's name, this name may contain a '%' character.

Not sure how bad is this to have a device with a '%' in its name, but
all the other places either use the register_netdev(), which call the
dev_alloc_name(), or explicitly call the dev_alloc_name() before
registering, i.e. do not allow for such names.

This had to be prior to the commit 34cc7b, but I forgot to number the
patches and this one got lost, sorry.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 23:51:04 -08:00
David S. Miller
d9595a7b9c [AF_KEY]: Fix oops by converting to proc_net_*().
To make sure the procfs visibility occurs after the ->proc_fs ops are
setup, use proc_net_fops_create() and proc_net_remove().

This also fixes an OOPS after module unload in that the name string
for remove was wrong, so it wouldn't actually be removed.  That bug
was introduced by commit 61145aa1a1
("[KEY]: Clean up proc files creation a bit.")

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 22:23:31 -08:00
Bjorn Mork
148f97292e [IPV4]: Reset scope when changing address
This bug did bite at least one user, who did have to resort to rebooting
the system after an "ifconfig eth0 127.0.0.1" typo.

Deleting the address and adding a new is a less intrusive workaround.
But I still beleive this is a bug that should be fixed.  Some way or
another.

Another possibility would be to remove the scope mangling based on
address.  This will always be incomplete (are 127/8 the only address
space with host scope requirements?)

We set the scope to RT_SCOPE_HOST if an IPv4 interface is configured
with a loopback address (127/8).  The scope is never reset, and will
remain set to RT_SCOPE_HOST after changing the address. This patch
resets the scope if the address is changed again, to restore normal
functionality.

Signed-off-by: Bjorn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 18:42:41 -08:00
Benjamin Thery
f1243c2db6 [IPV6]: Add missing initializations of the new nl_info.nl_net field
Add some more missing initializations of the new nl_info.nl_net field
in IPv6 stack. This field will be used when network namespaces are
fully supported.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 18:42:37 -08:00
Thomas Gleixner
3ab2273175 bluetooth: delete timer in l2cap_conn_del()
Delete a possibly armed timer before kfree'ing the connection object.

Solves: http://lkml.org/lkml/2008/2/15/514

Reported-by:Quel Qun <kelk1@comcast.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-26 17:42:56 -08:00
Harvey Harrison
5f2f40a92e tipc: fix integer as NULL pointer sparse warnings in tipc
net/tipc/cluster.c:145:2: warning: Using plain integer as NULL pointer
net/tipc/link.c:3254:36: warning: Using plain integer as NULL pointer
net/tipc/ref.c:151:15: warning: Using plain integer as NULL pointer
net/tipc/zone.c:85:2: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-24 18:38:31 -08:00
Linus Torvalds
bdc0894289 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
  [NETFILTER]: fix ebtable targets return
  [IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly.
  [NET]: Restore sanity wrt. print_mac().
  [NEIGH]: Fix race between neighbor lookup and table's hash_rnd update.
  [RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK
  tg3: ethtool phys_id default
  [BNX2]: Update version to 1.7.4.
  [BNX2]: Disable parallel detect on an HP blade.
  [BNX2]: More 5706S link down workaround.
  ssb: Fix support for PCI devices behind a SSB->PCI bridge
  zd1211rw: fix sparse warnings
  rtl818x: fix sparse warnings
  ssb: Fix pcicore cardbus mode
  ssb: Make the GPIO API reentrancy safe
  ssb: Fix the GPIO API
  ssb: Fix watchdog access for devices without a chipcommon
  ssb: Fix serial console on new bcm47xx devices
  ath5k: Fix build warnings on some 64-bit platforms.
  WDEV, ath5k, don't return int from bool function
  WDEV: ath5k, fix lock imbalance
  ...
2008-02-23 21:07:10 -08:00
Joonwoo Park
1b04ab4597 [NETFILTER]: fix ebtable targets return
The function ebt_do_table doesn't take NF_DROP as a verdict from the targets.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 20:22:27 -08:00
Pavel Emelyanov
34cc7ba639 [IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly.
Use the added dev_alloc_name() call to create tunnel device name,
rather than iterate in a hand-made loop with an artificial limit.

Thanks Patrick for noticing this.

[ The way this works is, when the device is actually registered,
  the generic code noticed the '%' in the name and invokes
  dev_alloc_name() to fully resolve the name.  -DaveM ]

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 20:19:20 -08:00
David S. Miller
55b01e8681 [NET]: Restore sanity wrt. print_mac().
MAC_FMT had only one user and we tried to get rid of
that, but this created more problems than it solved.

As a result, this reverts three commits:

235365f3aa ("net/8021q/vlan_dev.c: Use
print_mac."), fea5fa875e ("[NET]: Remove
MAC_FMT"), and 8f789c4844 ("[NET]:
Elminate spurious print_mac() calls.")

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 20:09:11 -08:00
Pavel Emelyanov
bc4bf5f38c [NEIGH]: Fix race between neighbor lookup and table's hash_rnd update.
The neigh_hash_grow() may update the tbl->hash_rnd value, which 
is used in all tbl->hash callbacks to calculate the hashval.

Two lookup routines may race with this, since they call the 
->hash callback without the tbl->lock held. Since the hash_rnd
is changed with this lock write-locked moving the calls to ->hash
under this lock read-locked closes this gap.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 19:57:02 -08:00
Thomas Graf
1840bb13c2 [RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK
RTM_NEWLINK allows for already existing links to be modified. For this
purpose do_setlink() is called which expects address attributes with a
payload length of at least dev->addr_len. This patch adds the necessary
validation for the RTM_NEWLINK case.

The address length for links to be created is not checked for now as the
actual attribute length is used when copying the address to the netdevice
structure. It might make sense to report an error if less than addr_len
bytes are provided but enforcing this might break drivers trying to be
smart with not transmitting all zero addresses.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23 19:54:36 -08:00
Rafael J. Wysocki
3a2d5b7001 PM: Introduce PM_EVENT_HIBERNATE callback state
During the last step of hibernation in the "platform" mode (with the
help of ACPI) we use the suspend code, including the devices'
->suspend() methods, to prepare the system for entering the ACPI S4
system sleep state.

But at least for some devices the operations performed by the
->suspend() callback in that case must be different from its operations
during regular suspend.

For this reason, introduce the new PM event type PM_EVENT_HIBERNATE and
pass it to the device drivers' ->suspend() methods during the last phase
of hibernation, so that they can distinguish this case and handle it as
appropriate.  Modify the drivers that handle PM_EVENT_SUSPEND in a
special way and need to handle PM_EVENT_HIBERNATE in the same way.

These changes are necessary to fix a hibernation regression related
to the i915 driver (ref. http://lkml.org/lkml/2008/2/22/488).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23 10:40:04 -08:00
Pavel Emelyanov
5216a8e70e Wrap buffers used for rpc debug printks into RPC_IFDEBUG
Sorry for the noise, but here's the v3 of this compilation fix :)

There are some places, which declare the char buf[...] on the stack
to push it later into dprintk(). Since the dprintk sometimes (if the
CONFIG_SYSCTL=n) becomes an empty do { } while (0) stub, these buffers
cause gcc to produce appropriate warnings.

Wrap these buffers with RPC_IFDEBUG macro, as Trond proposed, to
compile them out when not needed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-21 18:42:29 -05:00
Denis V. Lunev
da12f7356d [NETNS]: Namespace leak in pneigh_lookup.
release_net is missed on the error path in pneigh_lookup.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-20 00:26:16 -08:00
Pavel Emelyanov
5f31886ff0 [SCTP]: Pick up an orphaned sctp_sockets_allocated counter.
This counter is currently write-only.

Drawing an analogy with the similar tcp counter, I think
that this one should be pointed by the sockets_allocated
members of sctp_prot and sctpv6_prot.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-20 00:23:01 -08:00
Jan Engelhardt
27ecb1ff0a [NETFILTER]: xt_iprange: fix subtraction-based comparison
The host address parts need to be converted to host-endian first
before arithmetic makes any sense on them.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:20:06 -08:00
Jan Engelhardt
7d9904c260 [NETFILTER]: xt_hashlimit: remove unneeded struct member
By allocating ->hinfo, we already have the needed indirection to cope
with the per-cpu xtables struct match_entry.

[Patrick: do this now before the revision 1 struct is used by userspace]

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:19:44 -08:00
Joonwoo Park
eb1197bc0e [NETFILTER]: Fix incorrect use of skb_make_writable
http://bugzilla.kernel.org/show_bug.cgi?id=9920
The function skb_make_writable returns true or false.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:18:47 -08:00
Pavel Emelyanov
f449b3b54d [NETFILTER]: xt_u32: drop the actually unused variable from u32_match_it
The int ret variable is used only to trigger the BUG_ON() after
the skb_copy_bits() call, so check the call failure directly
and drop the variable.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:18:20 -08:00
Patrick McHardy
e2b58a67b9 [NETFILTER]: {ip,ip6,nfnetlink}_queue: fix SKB_LINEAR_ASSERT when mangling packet data
As reported by Tomas Simonaitis <tomas.simonaitis@gmail.com>,
inserting new data in skbs queued over {ip,ip6,nfnetlink}_queue
triggers a SKB_LINEAR_ASSERT in skb_put().

Going back through the git history, it seems this bug is present since
at least 2.6.12-rc2, probably even since the removal of
skb_linearize() for netfilter.

Linearize non-linear skbs through skb_copy_expand() when enlarging
them.  Tested by Thomas, fixes bugzilla #9933.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 17:17:52 -08:00
Adrian Bunk
94cb1503c7 ipv4/fib_hash.c: fix NULL dereference
Unless I miss a guaranteed relation between between "f" and
"new_fa->fa_info" this patch is required for fixing a NULL dereference
introduced by commit a6501e080c ("[IPV4]
FIB_HASH: Reduce memory needs and speedup lookups") and spotted by the
Coverity checker.

Eric Dumazet says:

	Hum, you are right, kmem_cache_free() doesnt allow a NULL
	object, like kfree() does.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 16:28:54 -08:00
Adrian Bunk
15e29b8b05 net/9p/trans_virtio.c: kmalloc() enough memory
The Coverity checker spotted that less memory than required was
allocated.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 16:25:30 -08:00
Thomas Graf
76e87306c2 [RTNL]: Add missing link netlink attribute policy definitions
IFLA_LINK is no longer a write-only attribute on the kernel side and
must thus be validated. Same goes for the newly introduced
IFLA_LINKINFO.

Fixes undefined behaviour if either of the attributes are not well
formed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 16:12:08 -08:00
Jorge Boncompte [DTI2]
12aa343add [NET]: Messed multicast lists after dev_mc_sync/unsync
Commit a0a400d79e ("[NET]: dev_mcast:
add multicast list synchronization helpers") from you introduced a new
field "da_synced" to struct dev_addr_list that is not properly
initialized to 0. So when any of the current users (8021q, macvlan,
mac80211) calls dev_mc_sync/unsync they mess the address list for both
devices.

The attached patch fixed it for me and avoid future problems.

Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-19 14:17:04 -08:00
Linus Torvalds
07ce198a1e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (60 commits)
  [NIU]: Bump driver version and release date.
  [NIU]: Fix BMAC alternate MAC address indexing.
  net: fix kernel-doc warnings in header files
  [IPV6]: Use BUG_ON instead of if + BUG in fib6_del_route.
  [IPV6]: dst_entry leak in ip4ip6_err. (resend)
  bluetooth: do not move child device other than rfcomm
  bluetooth: put hci dev after del conn
  [NET]: Elminate spurious print_mac() calls.
  [BLUETOOTH] hci_sysfs.c: Kill build warning.
  [NET]: Remove MAC_FMT
  net/8021q/vlan_dev.c: Use print_mac.
  [XFRM]: Fix ordering issue in xfrm_dst_hash_transfer().
  [BLUETOOTH] net/bluetooth/hci_core.c: Use time_* macros
  [IPV6]: Fix hardcoded removing of old module code
  [NETLABEL]: Move some initialization code into __init section.
  [NETLABEL]: Shrink the genl-ops registration code.
  [AX25] ax25_out: check skb for NULL in ax25_kick()
  [TCP]: Fix tcp_v4_send_synack() comment
  [IPV4]: fix alignment of IP-Config output
  Documentation: fix tcp.txt
  ...
2008-02-19 07:52:45 -08:00
Pavel Emelyanov
2df96af03d [IPV6]: Use BUG_ON instead of if + BUG in fib6_del_route.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 20:50:42 -08:00
Denis V. Lunev
9937ded8e4 [IPV6]: dst_entry leak in ip4ip6_err. (resend)
The result of the ip_route_output is not assigned to skb. This means that
- it is leaked
- possible OOPS below dereferrencing skb->dst
- no ICMP message for this case

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 20:49:36 -08:00
Dave Young
8ac62dc773 bluetooth: do not move child device other than rfcomm
hci conn child devices other than rfcomm tty should not be moved here.
This is my lost, thanks for Barnaby's reporting and testing.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com> 
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 20:45:41 -08:00
Dave Young
0cd63c8089 bluetooth: put hci dev after del conn
Move hci_dev_put to del_conn to avoid hci dev going away before hci conn.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 20:44:01 -08:00
David S. Miller
988d0093f9 [BLUETOOTH] hci_sysfs.c: Kill build warning.
net/bluetooth/hci_sysfs.c: In function ‘del_conn’:
net/bluetooth/hci_sysfs.c:339: warning: suggest parentheses around assignment used as truth value

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-18 00:20:50 -08:00
Joe Perches
235365f3aa net/8021q/vlan_dev.c: Use print_mac.
Remove direct use of MAC_FMT

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 23:34:54 -08:00
YOSHIFUJI Hideaki
b791160b5a [XFRM]: Fix ordering issue in xfrm_dst_hash_transfer().
Keep ordering of policy entries with same selector in
xfrm_dst_hash_transfer().

Issue should not appear in usual cases because multiple policy entries
with same selector are basically not allowed so far.  Bug was pointed
out by Sebastien Decugis <sdecugis@hongo.wide.ad.jp>.

We could convert bydst from hlist to list and use list_add_tail()
instead.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Sebastien Decugis <sdecugis@hongo.wide.ad.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 23:29:30 -08:00
S.Çağlar Onur
82453021b8 [BLUETOOTH] net/bluetooth/hci_core.c: Use time_* macros
The functions time_before, time_before_eq, time_after, and
time_after_eq are more robust for comparing jiffies against other
values.

So following patch implements usage of the time_after() macro, defined
at linux/jiffies.h, which deals with wrapping correctly

Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 23:25:57 -08:00
Wang Chen
5ee46e562c [IPV6]: Fix hardcoded removing of old module code
Rusty hardcoded the old module code.
We can remove it now.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:34:53 -08:00
Pavel Emelyanov
05705e4e11 [NETLABEL]: Move some initialization code into __init section.
Everything that is called from netlbl_init() can be marked with
__init. This moves 620 bytes from .text section to .text.init one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:33:57 -08:00
Pavel Emelyanov
227c43c3bc [NETLABEL]: Shrink the genl-ops registration code.
Turning them to array and registration in a loop saves
80 lines of code and ~300 bytes from text section.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:33:16 -08:00
Jarek Poplawski
f47b7257c7 [AX25] ax25_out: check skb for NULL in ax25_kick()
According to some OOPS reports ax25_kick tries to clone NULL skbs
sometimes. It looks like a race with ax25_clear_queues(). Probably
there is no need to add more than a simple check for this yet.
Another report suggested there are probably also cases where ax25
->paclen == 0 can happen in ax25_output(); this wasn't confirmed
during testing but let's leave this debugging check for some time.

Reported-and-tested-by: Jann Traschewski <jann@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:31:19 -08:00
Kris Katterjohn
9bf1d83e7e [TCP]: Fix tcp_v4_send_synack() comment
Signed-off-by: Kris Katterjohn <katterjohn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:29:19 -08:00
Uwe Kleine-Koenig
9c00409a2a [IPV4]: fix alignment of IP-Config output
Make the indented lines aligned in the output (not in the code).

Signed-off-by: Uwe Kleine-Koenig <Uwe.Kleine-Koenig@digi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 22:28:32 -08:00
Julia Lawall
d6584f3a08 net/9p/trans_virtio.c: Use BUG_ON
if (...) BUG(); should be replaced with BUG_ON(...) when the test has no
side-effects to allow a definition of BUG_ON that drops the code completely.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@ disable unlikely @ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (unlikely(E)) { BUG(); }
+ BUG_ON(E);
)

@@ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (E) { BUG(); }
+ BUG_ON(E);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 18:42:53 -08:00
Julia Lawall
163e3cb7da net/rxrpc: Use BUG_ON
if (...) BUG(); should be replaced with BUG_ON(...) when the test has no
side-effects to allow a definition of BUG_ON that drops the code completely.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@ disable unlikely @ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (unlikely(E)) { BUG(); }
+ BUG_ON(E);
)

@@ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (E) { BUG(); }
+ BUG_ON(E);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 18:42:03 -08:00
David S. Miller
9ff5660746 Revert "[NDISC]: Fix race in generic address resolution"
This reverts commit 69cc64d8d9.

It causes recursive locking in IPV6 because unlike other
neighbour layer clients, it even needs neighbour cache
entries to send neighbour soliciation messages :-(

We'll have to find another way to fix this race.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 18:39:54 -08:00
David S. Miller
93b2d4a208 Revert "[RTNETLINK]: Send a single notification on device state changes."
This reverts commit 45b5035482.

It break locking around dev->link_mode as well as cause
other bootup problems.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-17 18:35:07 -08:00
David S. Miller
42fe95cae5 Merge branch 'fixes' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-02-15 15:56:47 -08:00
Michael Buesch
ceffefd15a mac80211: Fix initial hardware configuration
On the initial device-open we need to defer the hardware reconfiguration
after we incremented the open_count, because the hw_config checks this flag
and won't call the lowlevel driver in case it is zero.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-15 13:44:19 -05:00
Linus Torvalds
f6866fecd6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (82 commits)
  [NET]: Make sure sockets implement splice_read
  netconsole: avoid null pointer dereference at show_local_mac()
  [IPV6]: Fix reversed local_df test in ip6_fragment
  [XFRM]: Avoid bogus BUG() when throwing new policy away.
  [AF_KEY]: Fix bug in spdadd
  [NETFILTER] nf_conntrack_proto_tcp.c: Mistyped state corrected.
  net: xfrm statistics depend on INET
  [NETFILTER]: make secmark_tg_destroy() static
  [INET]: Unexport inet_listen_wlock
  [INET]: Unexport __inet_hash_connect
  [NET]: Improve cache line coherency of ingress qdisc
  [NET]: Fix race in dev_close(). (Bug 9750)
  [IPSEC]: Fix bogus usage of u64 on input sequence number
  [RTNETLINK]: Send a single notification on device state changes.
  [NETLABLE]: Hide netlbl_unlabel_audit_addr6 under ifdef CONFIG_IPV6.
  [NETLABEL]: Don't produce unused variables when IPv6 is off.
  [NETLABEL]: Compilation for CONFIG_AUDIT=n case.
  [GENETLINK]: Relax dances with genl_lock.
  [NETLABEL]: Fix lookup logic of netlbl_domhsh_search_def.
  [IPV6]: remove unused method declaration (net/ndisc.h).
  ...
2008-02-15 07:33:07 -08:00
Rémi Denis-Courmont
997b37da15 [NET]: Make sure sockets implement splice_read
Fixes a segmentation fault when trying to splice from a non-TCP socket.

Signed-off-by: Rémi Denis-Courmont <rdenis@simphalempin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-15 02:35:45 -08:00
Herbert Xu
b5c15fc004 [IPV6]: Fix reversed local_df test in ip6_fragment
I managed to reverse the local_df test when forward-porting this
patch so it actually makes things worse by never fragmenting at
all.

Thanks to David Stevens for testing and reporting this bug.

Bill Fink pointed out that the local_df setting is also the wrong
way around.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 23:49:37 -08:00
Jan Blunck
1d957f9bf8 Introduce path_put()
* Add path_put() functions for releasing a reference to the dentry and
  vfsmount of a struct path in the right order

* Switch from path_release(nd) to path_put(&nd->path)

* Rename dput_path() to path_put_conditional()

[akpm@linux-foundation.org: fix cifs]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Jan Blunck
4ac9137858 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
  <dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
  struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
YOSHIFUJI Hideaki
073a371987 [XFRM]: Avoid bogus BUG() when throwing new policy away.
From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

When we destory a new policy entry, we need to tell
xfrm_policy_destroy() explicitly that the entry is not
alive yet.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:52:38 -08:00
Kazunori MIYAZAWA
a4d6b8af1e [AF_KEY]: Fix bug in spdadd
This patch fix a BUG when adding spds which have same selector.

Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:51:38 -08:00
Jozsef Kadlecsik
d0c1fd7a8f [NETFILTER] nf_conntrack_proto_tcp.c: Mistyped state corrected.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:50:21 -08:00
Paul Mundt
0f4bda005f net: xfrm statistics depend on INET
net/built-in.o: In function `xfrm_policy_init':
/home/pmundt/devel/git/sh-2.6.25/net/xfrm/xfrm_policy.c:2338: undefined reference to `snmp_mib_init'

snmp_mib_init() is only built in if CONFIG_INET is set.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-14 14:48:45 -08:00
Adrian Bunk
f51f5ec690 [NETFILTER]: make secmark_tg_destroy() static
This patch makes the needlessly global secmark_tg_destroy() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-13 17:41:39 -08:00
Adrian Bunk
324b57619b [INET]: Unexport inet_listen_wlock
This patch removes the no longer used EXPORT_SYMBOL(inet_listen_wlock).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-13 17:40:25 -08:00
Adrian Bunk
74da4d34e4 [INET]: Unexport __inet_hash_connect
This patch removes the unused EXPORT_SYMBOL_GPL(__inet_hash_connect).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-13 17:39:34 -08:00
Randy Dunlap
bc2cda1ebd docbook: make a networking book and fix a few errors
Move networking (core and drivers) docbook to its own networking book.
Fix a few kernel-doc errors in header and source files.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13 16:21:19 -08:00
Randy Dunlap
65b6e42cdc docbook: sunrpc filenames and notation fixes
Use updated file list for docbook files and
fix kernel-doc warnings in sunrpc:
Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:689): No description found for parameter 'rpc_client'
Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:765): No description found for parameter 'flags'
Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:584): No description found for parameter 'tk_ops'
Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:618): No description found for parameter 'bufsize'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13 16:21:19 -08:00
Harvey Harrison
b5606c2d44 remove final fastcall users
fastcall always expands to empty, remove it.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13 16:21:18 -08:00
Matti Linnanvuori
d8b2a4d21e [NET]: Fix race in dev_close(). (Bug 9750)
There is a race in Linux kernel file net/core/dev.c, function dev_close.
The function calls function dev_deactivate, which calls function
dev_watchdog_down that deletes the watchdog timer. However, after that, a
driver can call netif_carrier_ok, which calls function
__netdev_watchdog_up that can add the watchdog timer again. Function
unregister_netdevice calls function dev_shutdown that traps the bug
!timer_pending(&dev->watchdog_timer). Moving dev_deactivate after
netif_running() has been cleared prevents function netif_carrier_on
from calling __netdev_watchdog_up and adding the watchdog timer again.

Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 23:11:16 -08:00
Herbert Xu
b318e0e4ef [IPSEC]: Fix bogus usage of u64 on input sequence number
Al Viro spotted a bogus use of u64 on the input sequence number which
is big-endian.  This patch fixes it by giving the input sequence number
its own member in the xfrm_skb_cb structure.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:50:35 -08:00
Laszlo Attila Toth
45b5035482 [RTNETLINK]: Send a single notification on device state changes.
In do_setlink() a single notification is sent at the end of the
function if any modification occured. If the address has been changed,
another notification is sent.

Both of them is required because originally only the NETDEV_CHANGEADDR
notification was sent and although device state change implies address
change, some programs may expect the original notification. It remains
for compatibity.

If set_operstate() is called from do_setlink(), it doesn't send a
notification, only if it is called from rtnl_create_link() as earlier.

Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:42:09 -08:00
Pavel Emelyanov
370125f0a4 [NETLABLE]: Hide netlbl_unlabel_audit_addr6 under ifdef CONFIG_IPV6.
This one is called from under this config only, so move
it in the same place.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:38:06 -08:00
Pavel Emelyanov
56628b1d89 [NETLABEL]: Don't produce unused variables when IPv6 is off.
Some code declares variables on the stack, but uses them
under #ifdef CONFIG_IPV6, so thay become unused when ipv6
is off. Fortunately, they are used in a switch's case
branches, so the fix is rather simple.

Is it OK from coding style POV to add braces inside "cases",
or should I better avoid such style and rework the patch?

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:37:19 -08:00
Pavel Emelyanov
94de7feb2d [NETLABEL]: Compilation for CONFIG_AUDIT=n case.
The audit_log_start() will expand into an empty do { } while (0)
construction and the audit_ctx becomes unused.

The solution: push current->audit_context into audit_log_start()
directly, since it is not required in any other place in the 
calling function.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:35:37 -08:00
Pavel Emelyanov
910d6c320c [GENETLINK]: Relax dances with genl_lock.
The genl_unregister_family() calls the genl_unregister_mc_groups(), 
which takes and releases the genl_lock and then locks and releases
this lock itself.

Relax this behavior, all the more so the genl_unregister_mc_groups() 
is called from genl_unregister_family() only.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:16:33 -08:00
Pavel Emelyanov
4c3a0a254e [NETLABEL]: Fix lookup logic of netlbl_domhsh_search_def.
Currently, if the call to netlbl_domhsh_search succeeds the
return result will still be NULL.

Fix that, by returning the found entry (if any).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:15:14 -08:00
Urs Thuermann
fee54fa517 [NET]: Fix comment for skb_pull_rcsum
Fix comment for skb_pull_rcsum

Signed-off-by: Urs Thuermann <urs@isnogud.escape.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 22:03:25 -08:00
Herbert Xu
28a89453b1 [IPV6]: Fix IPsec datagram fragmentation
This is a long-standing bug in the IPsec IPv6 code that breaks
when we emit a IPsec tunnel-mode datagram packet.  The problem
is that the code the emits the packet assumes the IPv6 stack
will fragment it later, but the IPv6 stack assumes that whoever
is emitting the packet is going to pre-fragment the packet.

In the long term we need to fix both sides, e.g., to get the
datagram code to pre-fragment as well as to get the IPv6 stack
to fragment locally generated tunnel-mode packet.

For now this patch does the second part which should make it
work for the IPsec host case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 18:07:27 -08:00
David S. Miller
69cc64d8d9 [NDISC]: Fix race in generic address resolution
Frank Blaschka provided the bug report and the initial suggested fix
for this bug.  He also validated this version of this fix.

The problem is that the access to neigh->arp_queue is inconsistent, we
grab references when dropping the lock lock to call
neigh->ops->solicit() but this does not prevent other threads of
control from trying to send out that packet at the same time causing
corruptions because both code paths believe they have exclusive access
to the skb.

The best option seems to be to hold the write lock on neigh->lock
during the ->solicit() call.  I looked at all of the ndisc_ops
implementations and this seems workable.  The only case that needs
special care is the IPV4 ARP implementation of arp_solicit().  It
wants to take neigh->lock as a reader to protect the header entry in
neigh->ha during the emission of the soliciation.  We can simply
remove the read lock calls to take care of that since holding the lock
as a writer at the caller providers a superset of the protection
afforded by the existing read locking.

The rest of the ->solicit() implementations don't care whether the
neigh is locked or not.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:54:17 -08:00
Jarek Poplawski
e848b583e0 [AX25] ax25_ds_timer: use mod_timer instead of add_timer
This patch changes current use of: init_timer(), add_timer()
and del_timer() to setup_timer() with mod_timer(), which
should be safer anyway.

Reported-by: Jann Traschewski <jann@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:34 -08:00
Jarek Poplawski
21fab4a86a [AX25] ax25_timer: use mod_timer instead of add_timer
According to one of Jann's OOPS reports it looks like
BUG_ON(timer_pending(timer)) triggers during add_timer()
in ax25_start_t1timer(). This patch changes current use
of: init_timer(), add_timer() and del_timer() to
setup_timer() with mod_timer(), which should be safer
anyway.

Reported-by: Jann Traschewski <jann@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:33 -08:00
Jarek Poplawski
4de211f1a2 [AX25] ax25_route: make ax25_route_lock BH safe
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.24-dg8ngn-p02 #1
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-R} usage.
> linuxnet/3046 [HC0[0]:SC1[2]:HE1:SE0] takes:
>  (ax25_route_lock){--.+}, at: [<f8a0cfb7>] ax25_get_route+0x18/0xb7 [ax25]
> {softirq-on-W} state was registered at:
...

This lockdep report shows that ax25_route_lock is taken for reading in
softirq context, and for writing in process context with BHs enabled.
So, to make this safe, all write_locks in ax25_route.c are changed to
_bh versions.

Reported-by: Jann Traschewski <jann@gmx.de>,
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:32 -08:00
Jarek Poplawski
1105b5d1d4 [AX25] af_ax25: remove sock lock in ax25_info_show()
This lockdep warning:

> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.24 #3
> -------------------------------------------------------
> swapper/0 is trying to acquire lock:
>  (ax25_list_lock){-+..}, at: [<f91dd3b1>] ax25_destroy_socket+0x171/0x1f0 [ax25]
>
> but task is already holding lock:
>  (slock-AF_AX25){-+..}, at: [<f91dbabc>] ax25_std_heartbeat_expiry+0x1c/0xe0 [ax25]
>
> which lock already depends on the new lock.
...

shows that ax25_list_lock and slock-AF_AX25 are taken in different
order: ax25_info_show() takes slock (bh_lock_sock(ax25->sk)) while
ax25_list_lock is held, so reversely to other functions. To fix this
the sock lock should be moved to ax25_info_start(), and there would
be still problem with breaking ax25_list_lock (it seems this "proper"
order isn't optimal yet). But, since it's only for reading proc info
it seems this is not necessary (e.g.  ax25_send_to_raw() does similar
reading without this lock too).

So, this patch removes sock lock to avoid deadlock possibility; there
is also used sock_i_ino() function, which reads sk_socket under proper
read lock. Additionally printf format of this i_ino is changed to %lu.

Reported-by: Bernard Pidoux F6BVP <f6bvp@free.fr>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:31 -08:00
Stephen Hemminger
8315f5d80a fib_trie: /proc/net/route performance improvement
Use key/offset caching to change /proc/net/route (use by iputils route)
from O(n^2) to O(n). This improves performance from 30sec with 160,000
routes to 1sec.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:31 -08:00
Stephen Hemminger
ec28cf738d fib_trie: handle empty tree
This fixes possible problems when trie_firstleaf() returns NULL
to trie_leafindex().

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:30 -08:00
David S. Miller
e4f8b5d4ed [IPV4]: Remove IP_TOS setting privilege checks.
Various RFCs have all sorts of things to say about the CS field of the
DSCP value.  In particular they try to make the distinction between
values that should be used by "user applications" and things like
routing daemons.

This seems to have influenced the CAP_NET_ADMIN check which exists for
IP_TOS socket option settings, but in fact it has an off-by-one error
so it wasn't allowing CS5 which is meant for "user applications" as
well.

Further adding to the inconsistency and brokenness here, IPV6 does not
validate the DSCP values specified for the IPV6_TCLASS socket option.

The real actual uses of these TOS values are system specific in the
final analysis, and these RFC recommendations are just that, "a
recommendation".  In fact the standards very purposefully use
"SHOULD" and "SHOULD NOT" when describing how these values can be
used.

In the final analysis the only clean way to provide consistency here
is to remove the CAP_NET_ADMIN check.  The alternatives just don't
work out:

1) If we add the CAP_NET_ADMIN check to ipv6, this can break existing
   setups.

2) If we just fix the off-by-one error in the class comparison in
   IPV4, certain DSCP values can be used in IPV6 but not IPV4 by
   default.  So people will just ask for a sysctl asking to
   override that.

I checked several other freely available kernel trees and they
do not make any privilege checks in this area like we do.  For
the BSD stacks, this goes back all the way to Stevens Volume 2
and beyond.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-12 17:53:29 -08:00
Linus Torvalds
0c0d61ca93 Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux
* 'for-linus' of git://linux-nfs.org/~bfields/linux:
  SUNPRC: Fix printk format warning
  nfsd: clean up svc_reserve_auth()
  NLM: don't requeue block if it was invalidated while GRANT_MSG was in flight
  NLM: don't reattempt GRANT_MSG when there is already an RPC in flight
  NLM: have server-side RPC clients default to soft RPC tasks
  NLM: set RPC_CLNT_CREATE_NOPING for NLM RPC clients
2008-02-11 09:19:47 -08:00
Roland Dreier
bb50c8012c SUNPRC: Fix printk format warning
net/sunrpc/xprtrdma/svc_rdma_sendto.c:160: warning: format '%llx'
expects type 'long long unsigned int', but argument 3 has type 'u64'

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-10 18:11:22 -05:00
David S. Miller
30ddb159ff [PKT_SCHED] ematch: Fix build warning.
Commit 954415e33e ("[PKT_SCHED] ematch:
tcf_em_destroy robustness") removed a cast on em->data when
passing it to kfree(), but em->data is an integer type that can
hold pointers as well as other values so the cast is necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-10 03:48:15 -08:00
Jarek Poplawski
21347456ab [NET_SCHED] sch_htb: htb_requeue fix
htb_requeue() enqueues skbs for which htb_classify() returns NULL.
This is wrong because such skbs could be handled by NET_CLS_ACT code,
and the decision could be different than earlier in htb_enqueue().
So htb_requeue() is changed to work and look more like htb_enqueue().

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:44:00 -08:00
Rami Rosen
238fc7eac8 [IPV6]: Replace using the magic constant "1024" with IP6_RT_PRIO_USER for fc_metric.
This patch replaces the explicit usage of the magic constant "1024"
with IP6_RT_PRIO_USER in the IPV6 tree.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:43:11 -08:00
Stephen Hemminger
954415e33e [PKT_SCHED] ematch: tcf_em_destroy robustness
Make the code in tcf_em_tree_destroy more robust and cleaner:
 * Don't need to cast pointer to kfree() or avoid passing NULL.
 * After freeing the tree, clear the pointer to avoid possible problems
from repeated free.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:26:53 -08:00
Stephen Hemminger
ed7af3b350 [PKT_SCHED]: deinline functions in meta match
A couple of functions in meta match don't need to be inline.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:26:17 -08:00
Pavel Emelyanov
8ff65b4603 [SCTP]: Convert sctp_dbg_objcnt to seq files.
This makes the code use a good proc API and the text ~50 bytes shorter.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:24:58 -08:00
Pavel Emelyanov
3f5340a67e [SCTP]: Use snmp_fold_field instead of a homebrew analogue.
SCPT already depends in INET, so this doesn't create additional
dependencies.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:23:44 -08:00
Denis V. Lunev
cd557bc1c1 [IGMP]: Optimize kfree_skb in igmp_rcv.
Merge error paths inside igmp_rcv.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:22:26 -08:00
Pavel Emelyanov
bd2f747658 [KEY]: Convert net/pfkey to use seq files.
The seq files API disposes the caller of the difficulty of
checking file position, the length of data to produce and
the size of provided buffer.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:20:06 -08:00
Pavel Emelyanov
61145aa1a1 [KEY]: Clean up proc files creation a bit.
Mainly this removes ifdef-s from inside the ipsec_pfkey_init.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 23:19:14 -08:00
Stephen Hemminger
268bcca1e7 [PKT_SCHED] ematch: oops from uninitialized variable (resend)
Setting up a meta match causes a kernel OOPS because of uninitialized
elements in tree.

[   37.322381] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[   37.322381] IP: [<ffffffff883fc717>] :em_meta:em_meta_destroy+0x17/0x80

[   37.322381] Call Trace:
[   37.322381]  [<ffffffff803ec83d>] tcf_em_tree_destroy+0x2d/0xa0
[   37.322381]  [<ffffffff803ecc8c>] tcf_em_tree_validate+0x2dc/0x4a0
[   37.322381]  [<ffffffff803f06d2>] nla_parse+0x92/0xe0
[   37.322381]  [<ffffffff883f9672>] :cls_basic:basic_change+0x202/0x3c0
[   37.322381]  [<ffffffff802a3917>] kmem_cache_alloc+0x67/0xa0
[   37.322381]  [<ffffffff803ea221>] tc_ctl_tfilter+0x3b1/0x580
[   37.322381]  [<ffffffff803dffd0>] rtnetlink_rcv_msg+0x0/0x260
[   37.322381]  [<ffffffff803ee944>] netlink_rcv_skb+0x74/0xa0
[   37.322381]  [<ffffffff803dffc8>] rtnetlink_rcv+0x18/0x20
[   37.322381]  [<ffffffff803ee6c3>] netlink_unicast+0x263/0x290
[   37.322381]  [<ffffffff803cf276>] __alloc_skb+0x96/0x160
[   37.322381]  [<ffffffff803ef014>] netlink_sendmsg+0x274/0x340
[   37.322381]  [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140
[   37.322381]  [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30
[   37.322381]  [<ffffffff8024de90>] autoremove_wake_function+0x0/0x30
[   37.322381]  [<ffffffff803c7c3b>] sock_sendmsg+0x12b/0x140
[   37.322381]  [<ffffffff80288611>] zone_statistics+0xb1/0xc0
[   37.322381]  [<ffffffff803c7e5e>] sys_sendmsg+0x20e/0x360
[   37.322381]  [<ffffffff803c7411>] sockfd_lookup_light+0x41/0x80
[   37.322381]  [<ffffffff8028d04b>] handle_mm_fault+0x3eb/0x7f0
[   37.322381]  [<ffffffff8020c2fb>] system_call_after_swapgs+0x7b/0x80

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-09 03:47:19 -08:00
David S. Miller
ab1ecbabb1 Merge branch 'pending' of master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev 2008-02-09 03:44:25 -08:00
Linus Torvalds
3668805a54 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [IPSEC] flow: reorder "struct flow_cache_entry" and remove SLAB_HWCACHE_ALIGN
  [DECNET] ROUTE: remove unecessary alignment
  [IPSEC]: Add support for aes-ctr.
  [ISDN]: fix section mismatch warning in enpci_card_msg
  [TIPC]: declare proto_ops structures as 'const'.
  [TIPC]: Kill unused static inline (x5)
  [TC]: oops in em_meta
  [IPV6] Minor cleanup: remove unused definitions in net/ip6_fib.h
  [IPV6] Minor clenup: remove two unused definitions in net/ip6_route.h
  [AF_IUCV]: defensive programming of iucv_callback_txdone
  [AF_IUCV]: broken send_skb_q results in endless loop
  [IUCV]: wrong irq-disabling locking at module load time
  [CAN]: Minor clean-ups
  [CAN]: Move proto_{,un}register() out of spin-locked region
  [CAN]: Clean up module auto loading
  [IPSEC] flow: Remove an unnecessary ____cacheline_aligned
  [IPV4]: route: fix crash ip_route_input
  [NETFILTER]: xt_iprange: add missing #include
  [NETFILTER]: xt_iprange: fix typo in address family
  [NETFILTER]: nf_conntrack: fix ct_extend ->move operation
  ...
2008-02-08 09:27:06 -08:00
Pavel Emelyanov
cbdc738732 namespaces: mark NET_NS with "depends on NAMESPACES"
There's already an option controlling the net namespaces cloning code, so make
it work the same way as all the other namespaces' options.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-08 09:22:23 -08:00
Eric Dumazet
dd5a1843d5 [IPSEC] flow: reorder "struct flow_cache_entry" and remove SLAB_HWCACHE_ALIGN
1) We can shrink sizeof(struct flow_cache_entry) by 8 bytes on 64bit arches.
2) No need to align these structures to hardware cache lines, this only waste 
   ram for very litle gain.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 23:30:42 -08:00
Eric Dumazet
fca09fb732 [DECNET] ROUTE: remove unecessary alignment
Same alignment requirement was removed on IP route cache in the past.

This alignment actually has bad effect on 32 bit arches, uniprocessor,
since sizeof(dn_rt_hash_bucket) is forced to 8 bytes instead of 4.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 23:29:57 -08:00
Joy Latten
405137d16f [IPSEC]: Add support for aes-ctr.
The below patch allows IPsec to use CTR mode with AES encryption
algorithm. Tested this using setkey in ipsec-tools.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 23:11:56 -08:00
Florian Westphal
bca65eae39 [TIPC]: declare proto_ops structures as 'const'.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:18:01 -08:00
Ilpo Järvinen
86121fe5b4 [TIPC]: Kill unused static inline (x5)
All these static inlines are unused:

in_own_zone     1 (net/tipc/addr.h)
msg_dataoctet   1 (net/tipc/msg.h)
msg_direct      1 (include/net/tipc/tipc_msg.h)
msg_options     1 (include/net/tipc/tipc_msg.h)
tipc_nmap_get   1 (net/tipc/bcast.h)

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:17:13 -08:00
Stephen Hemminger
04f217aca4 [TC]: oops in em_meta
If userspace passes a unknown match index into em_meta, then
em_meta_change will return an error and the data for the match will
not be set. This then causes an null pointer dereference when the
cleanup is done in the error path via tcf_em_tree_destroy. Since the
tree structure comes kzalloc, it is initialized to NULL.

Discovered when testing a new version of tc command against an
accidental older kernel.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:13:00 -08:00
Ursula Braun
f2a77991a9 [AF_IUCV]: defensive programming of iucv_callback_txdone
The loop in iucv_callback_txdone presumes existence of an entry
with msg->tag in the send_skb_q list. In error cases this
assumption might be wrong and might cause an endless loop.
Loop is rewritten to guarantee loop end in case of missing
msg->tag entry in send_skb_q.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:07:44 -08:00
Ursula Braun
d44447229e [AF_IUCV]: broken send_skb_q results in endless loop
A race has been detected in iucv_callback_txdone().
skb_unlink has to be done inside the locked area.

In addition checkings for successful allocations are inserted.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:07:19 -08:00
Ursula Braun
435bc9dfc6 [IUCV]: wrong irq-disabling locking at module load time
Linux may hang when running af_iucv socket programs concurrently
with a load of module netiucv. iucv_register() tries to take the
iucv_table_lock with spin_lock_irq. This conflicts with
iucv_connect() which has a need for an smp_call_function while
holding the iucv_table_lock.
Solution: use bh-disabling locking in iucv_register()

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:06:52 -08:00
Urs Thuermann
a219994bf5 [CAN]: Minor clean-ups
Remove unneeded variable.
Rename local variable error to err like in all other places.
Some white-space changes.

Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:05:04 -08:00
Urs Thuermann
a2fea5f19f [CAN]: Move proto_{,un}register() out of spin-locked region
The implementation of proto_register() has changed so that it can now
sleep.  The call to proto_register() must be moved out of the
spin-locked region.

Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:04:45 -08:00
Urs Thuermann
5423dd67bd [CAN]: Clean up module auto loading
Remove local char array to construct module name.
Don't call request_module() when CONFIG_KMOD is not set.

Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:04:21 -08:00
Eric Dumazet
5f58a5c872 [IPSEC] flow: Remove an unnecessary ____cacheline_aligned
We use a percpu variable named flow_hash_info, which holds 12 bytes.

It is currently marked as ____cacheline_aligned, which makes linker
skip space to properly align this variable.

Before :
c065cc90 D per_cpu__softnet_data
c065cd00 d per_cpu__flow_tables
<Here, hole of 124 bytes>
c065cd80 d per_cpu__flow_hash_info
<Here, hole of 116 bytes>
c065ce00 d per_cpu__flow_flush_tasklets
c065ce14 d per_cpu__rt_cache_stat


This alignement is quite unproductive, and removing it reduces the
size of percpu data (by 240 bytes on my x86 machine), and improves
performance (flow_tables & flow_hash_info can share a single cache
line)

After patch :
c065cc04 D per_cpu__softnet_data
c065cc4c d per_cpu__flow_tables
c065cc50 d per_cpu__flow_hash_info
c065cc5c d per_cpu__flow_flush_tasklets
c065cc70 d per_cpu__rt_cache_stat

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 18:03:18 -08:00
Patrick McHardy
4136cd523e [IPV4]: route: fix crash ip_route_input
ip_route_me_harder() may call ip_route_input() with skbs that don't
have skb->dev set for skbs rerouted in LOCAL_OUT and TCP resets
generated by the REJECT target, resulting in a crash when dereferencing
skb->dev->nd_net. Since ip_route_input() has an input device argument,
it seems correct to use that one anyway.

Bug introduced in b5921910a1 (Routing cache virtualization).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:58:20 -08:00
Jan Engelhardt
5da621f1c5 [NETFILTER]: xt_iprange: add missing #include
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:57:11 -08:00
Patrick McHardy
d9d17578d9 [NETFILTER]: xt_iprange: fix typo in address family
The family for iprange_mt4 should be AF_INET, not AF_INET6.
Noticed by Jiri Moravec <jim.lkml@gmail.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:56:49 -08:00
Patrick McHardy
86577c661b [NETFILTER]: nf_conntrack: fix ct_extend ->move operation
The ->move operation has two bugs:

- It is called with the same extension as source and destination,
  so it doesn't update the new extension.

- The address of the old extension is calculated incorrectly,
  instead of (void *)ct->ext + ct->ext->offset[i] it uses
  ct->ext + ct->ext->offset[i].

Fixes a crash on x86_64 reported by Chuck Ebbert <cebbert@redhat.com>
and Thomas Woerner <twoerner@redhat.com>.

Tested-by: Thomas Woerner <twoerner@redhat.com>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:56:34 -08:00
Jozsef Kadlecsik
b2155e7f70 [NETFILTER]: nf_conntrack: TCP conntrack reopening fix
TCP connection tracking in netfilter did not handle TCP reopening
properly: active close was taken into account for one side only and
not for any side, which is fixed now. The patch includes more comments
to explain the logic how the different cases are handled.
The bug was discovered by Jeff Chua.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-07 17:54:56 -08:00
David Howells
e231c2ee64 Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p)
Convert instances of ERR_PTR(PTR_ERR(p)) to ERR_CAST(p) using:

perl -spi -e 's/ERR_PTR[(]PTR_ERR[(](.*)[)][)]/ERR_CAST(\1)/' `grep -rl 'ERR_PTR[(]*PTR_ERR' fs crypto net security`

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:26 -08:00
Vlad Yasevich
5f9646c3d9 [SCTP]: Make sure the chunk is off the transmitted list prior to freeing.
In a few instances, we need to remove the chunk from the transmitted list
prior to freeing it.  This is because the free code doesn't do that any
more and so we need to do it manually.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:27:39 -05:00
Wei Yongjun
a869981423 [SCTP]: Fix kernel panic while received ASCONF chunk with bad serial number
While recevied ASCONF chunk with serial number less then needed, kernel
will treat this chunk as a retransmitted ASCONF chunk and find cached
ASCONF-ACK chunk used sctp_assoc_lookup_asconf_ack(). But this function
will always return NO-NULL. So response with cached ASCONF-ACKs chunk
will cause kernel panic.
In function sctp_assoc_lookup_asconf_ack(), if the cached ASCONF-ACKs
list asconf_ack_list is empty, or if the serial being requested does not
exists, the function as it currectly stands returns the actuall
list_head asoc->asconf_ack_list, this is not a cache ASCONF-ACK chunk
but a bogus pointer.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:27:39 -05:00
Vlad Yasevich
b46ae36de4 [SCTP]: Set ports in every address returned by sctp_getladdrs()
Thomas Dreibholz has reported that port numbers are not filled
in the results of sctp_getladdrs() when the socket was bound
to an ephemeral port.  This is only true, if the address was
not specified either.  So, fill in the port number correctly.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:27:39 -05:00
Vlad Yasevich
c068be5491 [SCTP]: Correctly reap SSNs when processing FORWARD_TSN chunk
When we recieve a FORWARD_TSN chunk, we need to reap
all the queued fast-forwarded chunks from the ordering queue
 However, if we don't have them queued, we need to see if
the next expected one is there as well.  If it is, start
deliver from that point instead of waiting for the next
chunk to arrive.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-06 21:26:26 -05:00
Andrew Morton
7276744354 9p: fix p9_printfcall export
ERROR: "p9_printfcall" [net/9p/9pnet_virtio.ko] undefined!

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:01 -06:00
Eric Van Hensbergen
8a0dc95fd9 9p: transport API reorganization
This merges the mux.c (including the connection interface) with trans_fd
in preparation for transport API changes.  Ultimately, trans_fd will need
to be rewritten to clean it up and simplify the implementation, but this
reorganization is viewed as the first step.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:03 -06:00
Eric Van Hensbergen
f39335453f 9p: add remove function to trans_virtio
Request from rusty:
Just cleaning up patches for 2.6.25 merge, and noticed that
net/9p/trans_virtio.c doesn't have a remove function.  This will crash when
removing the module (console doesn't have one because it can't really be
removed).

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:04 -06:00
Anthony Liguori
dea7bbb603 9p: Convert semaphore to spinlock for p9_idpool
When booting from v9fs, down_interruptible in p9_idpool_get() triggered a BUG
as it was being called with IRQs disabled.  A spinlock seems like the right
thing to be using since the idr functions go out of their way not to sleep.

This patch eliminates the BUG by converting the semaphore to a spinlock.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:04 -06:00
Eric Van Hensbergen
7c7d90f2dd 9p: Fix soft lockup in virtio transport
This fixes a poorly placed spinlock which could result in a
soft lockup condition.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:07 -06:00
Eric Van Hensbergen
e2735b7720 9p: block-based virtio client
This replaces the console-based virto client with a block-based
client using a single request queue.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:58 -06:00
Eric Van Hensbergen
043aba403e 9p: create transport rpc cut-thru
Add a new transport function which allows a cut-thru directly to
the transport instead of processing request through the mux if the
cut-thru exists.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:25:09 -06:00
Martin Stava
afcf0c13ae 9p: fix bug in p9_clone_stat
This patch fixes a bug in the copying of 9P
stat information where string references
weren't being updated properly.

Signed-off-by: Martin Sava <martin.stava@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-02-06 19:20:44 -06:00
Sven Wegener
9c1ca6e68a ipvs: Make wrr "no available servers" error message rate-limited
No available servers is more an error message than something informational. It
should also be rate-limited, else we're going to flood our logs on a busy
director, if all real servers are out of order with a weight of zero.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 20:00:10 -08:00
David S. Miller
a29961b33b Merge branch 'fixes' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-02-05 19:58:05 -08:00
Patrick McHardy
9ec138101f [NET_SCHED]: cls_flow: support classification based on VLAN tag
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 16:21:04 -08:00
Patrick McHardy
4f25049106 [NET_SCHED]: cls_flow: fix key mask validity check
Since we're using fls(), we need to check whether the value is
non-zero first.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 16:19:59 -08:00
Patrick McHardy
0ea9d70df8 [NET_SCHED]: em_meta: fix compile warning
net/sched/em_meta.c: In function 'meta_int_vlan_tag':
net/sched/em_meta.c:179: warning: 'tag' may be used uninitialized in this function

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 16:19:33 -08:00
Michael Buesch
03ac7a8141 mac80211: Is not EXPERIMENTAL anymore
Remove the EXPERIMENTAL dependency, as the existing mac80211
features are stable.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-02-05 14:35:47 -05:00
Linus Torvalds
3d412f60b7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [PKT_SCHED]: vlan tag match
  [NET]: Add if_addrlabel.h to sanitized headers.
  [NET] rtnetlink.c: remove no longer used functions
  [ICMP]: Restore pskb_pull calls in receive function
  [INET]: Fix accidentally broken inet(6)_hash_connect's port offset calculations.
  [NET]: Remove further references to net-modules.txt
  bluetooth rfcomm tty: destroy before tty_close()
  bluetooth: blacklist another Broadcom BCM2035 device
  drivers/bluetooth/btsdio.c: fix double-free
  drivers/bluetooth/bpa10x.c: fix memleak
  bluetooth: uninlining
  bluetooth: hidp_process_hid_control remove unnecessary parameter dealing
  tun: impossible to deassert IFF_ONE_QUEUE or IFF_NO_PI
  hamradio: fix dmascc section mismatch
  [SCTP]: Fix kernel panic while received AUTH chunk with BAD shared key identifier
  [SCTP]: Fix kernel panic while received AUTH chunk while enabled auth
  [IPV4]: Formatting fix for /proc/net/fib_trie.
  [IPV6]: Fix sysctl compilation error.
  [NET_SCHED]: Add #ifdef CONFIG_NET_EMATCH in net/sched/cls_flow.c (latest git broken build)
  [IPV4]: Fix compile error building without CONFIG_FS_PROC
  ...
2008-02-05 10:09:07 -08:00
Paul Moore
eda61d32e8 NetLabel: introduce a new kernel configuration API for NetLabel
Add a new set of configuration functions to the NetLabel/LSM API so that
LSMs can perform their own configuration of the NetLabel subsystem without
relying on assistance from userspace.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:20 -08:00
Vlad Yasevich
01f2d38498 [SCTP]: Kill silly inlines in ulpqueue.c
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-05 10:59:30 -05:00
Vlad Yasevich
0eca8fee0c [SCTP]: Do not increase rwnd when reading partial notification.
When a user reads a partial notification message, do not
update rwnd since notifications must not be counted towards
receive window.

Tested-by: Oliver Roll <mail@oliroll.de>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-05 10:59:30 -05:00
Vlad Yasevich
60c778b259 [SCTP]: Stop claiming that this is a "reference implementation"
I was notified by Randy Stewart that lksctp claims to be
"the reference implementation".  First of all, "the
refrence implementation" was the original implementation
of SCTP in usersapce written ty Randy and a few others.
Second, after looking at the definiton of 'reference implementation',
we don't really meet the requirements.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
2008-02-05 10:59:07 -05:00
Stephen Hemminger
3113e88c3c [PKT_SCHED]: vlan tag match
Provide a way to use tc filters on vlan tag even if tag is buried in
skb due to hardware acceleration.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:20:13 -08:00
Adrian Bunk
03245ce2f0 [NET] rtnetlink.c: remove no longer used functions
This patch removes the following no longer used functions:
- rtattr_parse()
- rtattr_strlcpy()
- __rtattr_parse_nested_compat()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:17:22 -08:00
Herbert Xu
8cf229437f [ICMP]: Restore pskb_pull calls in receive function
Somewhere along the development of my ICMP relookup patch the header
length check went AWOL on the non-IPsec path.  This patch restores the
check.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:15:50 -08:00
Pavel Emelyanov
5d8c0aa943 [INET]: Fix accidentally broken inet(6)_hash_connect's port offset calculations.
The port offset calculations depend on the protocol family, but, as
Adrian noticed, I broke this logic with the commit

	5ee31fc1ec
	[INET]: Consolidate inet(6)_hash_connect.

Return this logic back, by passing the port offset directly into the
consolidated function.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Noticed-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:14:44 -08:00
Dave Young
93d807401c bluetooth rfcomm tty: destroy before tty_close()
rfcomm dev could be deleted in tty_hangup, so we must not call
rfcomm_dev_del again to prevent from destroying rfcomm dev before tty
close.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:12:06 -08:00
Andrew Morton
91f5cca3d1 bluetooth: uninlining
Remove all those inlines which were either a) unneeded or b) increased code
size.

          text    data     bss     dec     hex filename
before:   6997      74       8    7079    1ba7 net/bluetooth/hidp/core.o
after:    6492      74       8    6574    19ae net/bluetooth/hidp/core.o

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:07:58 -08:00
Dave Young
eff001e35a bluetooth: hidp_process_hid_control remove unnecessary parameter dealing
According to the bluetooth HID spec v1.0 chapter 7.4.2

"This code requests a major state change in a BT-HID device.  A HID_CONTROL
request does not generate a HANDSHAKE response."

"A HID_CONTROL packet with a parameter of VIRTUAL_CABLE_UNPLUG is the only
HID_CONTROL packet a device can send to a host.  A host will ignore all other
packets."

So in the hidp_precess_hid_control function, we just need to deal with the
UNLUG packet.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:07:14 -08:00
Wei Yongjun
7cc08b55fc [SCTP]: Fix kernel panic while received AUTH chunk with BAD shared key identifier
If SCTP-AUTH is enabled, received AUTH chunk with BAD shared key 
identifier will cause kernel panic.

Test as following:
step1: enabled /proc/sys/net/sctp/auth_enable
step 2:  connect  to SCTP server with auth capable. Association is 
established between endpoints. Then send a AUTH chunk with a bad 
shareid, SCTP server will kernel panic after received that AUTH chunk.

SCTP client                   SCTP server
  INIT         ---------->  
    (with auth capable)
               <----------    INIT-ACK
                              (with auth capable)
  COOKIE-ECHO  ---------->
               <----------    COOKIE-ACK
  AUTH         ---------->


AUTH chunk is like this:
  AUTH chunk
    Chunk type: AUTH (15)
    Chunk flags: 0x00
    Chunk length: 28
    Shared key identifier: 10
    HMAC identifier: SHA-1 (1)
    HMAC: 0000000000000000000000000000000000000000

The assignment of NULL to key can safely be removed, since key_for_each 
(which is just list_for_each_entry under the covers does an initial 
assignment to key anyway).

If the endpoint_shared_keys list is empty, or if the key_id being 
requested does not exist, the function as it currently stands returns 
the actuall list_head (in this case endpoint_shared_keys.  Since that 
list_head isn't surrounded by an actuall data structure, the last 
iteration through list_for_each_entry will do a container_of on key, and 
we wind up returning a bogus pointer, instead of NULL, as we should.

> Neil Horman wrote:
>> On Tue, Jan 22, 2008 at 05:29:20PM +0900, Wei Yongjun wrote:
>>
>> FWIW, Ack from me.  The assignment of NULL to key can safely be 
>> removed, since
>> key_for_each (which is just list_for_each_entry under the covers does 
>> an initial
>> assignment to key anyway).
>> If the endpoint_shared_keys list is empty, or if the key_id being 
>> requested does
>> not exist, the function as it currently stands returns the actuall 
>> list_head (in
>> this case endpoint_shared_keys.  Since that list_head isn't 
>> surrounded by an
>> actuall data structure, the last iteration through 
>> list_for_each_entry will do a
>> container_of on key, and we wind up returning a bogus pointer, 
>> instead of NULL,
>> as we should.  Wei's patch corrects that.
>>
>> Regards
>> Neil
>>
>> Acked-by: Neil Horman <nhorman@tuxdriver.com>
>>
>
> Yep, the patch is correct.
>
> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>
> -vlad
>

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:03:06 -08:00
Wei Yongjun
d2f19fa13e [SCTP]: Fix kernel panic while received AUTH chunk while enabled auth
If STCP is started while /proc/sys/net/sctp/auth_enable is set 0 and
association is established between endpoints. Then if
/proc/sys/net/sctp/auth_enable is set 1, a received AUTH chunk will
cause kernel panic.

Test as following:
step 1: echo 0> /proc/sys/net/sctp/auth_enable
step 2:

   SCTP client                  SCTP server
      INIT          --------->
                    <---------   INIT-ACK
      COOKIE-ECHO   --------->
                    <---------   COOKIE-ACK
step 3:
    echo 1> /proc/sys/net/sctp/auth_enable
step 4:
   SCTP client                  SCTP server
       AUTH        ----------->  Kernel Panic


This patch fix this probleam to treat AUTH chunk as unknow chunk if peer 
has initialized with no auth capable.

> Sorry for the delay.  Was on vacation without net access.
>
> Wei Yongjun wrote:
>>
>>
>> This patch fix this probleam to treat AUTH chunk as unknow chunk if 
>> peer has initialized with no auth capable.
>>
>> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
>
> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>
>>

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 03:02:26 -08:00
Denis V. Lunev
b9c4d82a85 [IPV4]: Formatting fix for /proc/net/fib_trie.
The line in the /proc/net/fib_trie for route with TOS specified
- has extra \n at the end
- does not have a space after route scope
like below.
           |-- 1.1.1.1
              /32 universe UNICASTtos =1

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 02:58:45 -08:00
Rami Rosen
0aead54347 [NET_SCHED]: Add #ifdef CONFIG_NET_EMATCH in net/sched/cls_flow.c (latest git broken build)
The 2.6 latest git build was broken when using the following
configuration options:
CONFIG_NET_EMATCH=n
CONFIG_NET_CLS_FLOW=y

with the following error:
net/sched/cls_flow.c: In function 'flow_dump':
net/sched/cls_flow.c:598: error: 'struct tcf_ematch_tree' has no
member named 'hdr'
make[2]: *** [net/sched/cls_flow.o] Error 1
make[1]: *** [net/sched] Error 2
make: *** [net] Error 2


see the recent post by Li Zefan:
  http://www.spinics.net/lists/netdev/msg54434.html

The reason for this crash is that struct tcf_ematch_tree
(net/pkt_cls.h) is empty when CONFIG_NET_EMATCH is not defined.

When CONFIG_NET_EMATCH is defined, the tcf_ematch_tree structure
indeed holds a struct tcf_ematch_tree_hdr (hdr) as flow_dump()
expects.

This patch adds #ifdef CONFIG_NET_EMATCH in flow_dump to avoid this.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 02:56:48 -08:00
Adrian Bunk
322c8a3c36 [IPSEC] xfrm4_beet_input(): fix an if()
A bug every C programmer makes at some point in time...

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-05 02:51:39 -08:00
Linus Torvalds
93890b71a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (25 commits)
  virtio: balloon driver
  virtio: Use PCI revision field to indicate virtio PCI ABI version
  virtio: PCI device
  virtio_blk: implement naming for vda-vdz,vdaa-vdzz,vdaaa-vdzzz
  virtio_blk: Dont waste major numbers
  virtio_blk: provide getgeo
  virtio_net: parametrize the napi_weight for virtio receive queue.
  virtio: free transmit skbs when notified, not on next xmit.
  virtio: flush buffers on open
  virtnet: remove double ether_setup
  virtio: Allow virtio to be modular and used by modules
  virtio: Use the sg_phys convenience function.
  virtio: Put the virtio under the virtualization menu
  virtio: handle interrupts after callbacks turned off
  virtio: reset function
  virtio: populate network rings in the probe routine, not open
  virtio: Tweak virtio_net defines
  virtio: Net header needs hdr_len
  virtio: remove unused id field from struct virtio_blk_outhdr
  virtio: clarify NO_NOTIFY flag usage
  ...
2008-02-04 08:00:54 -08:00
Linus Torvalds
f5bb3a5e9d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (79 commits)
  Jesper Juhl is the new trivial patches maintainer
  Documentation: mention email-clients.txt in SubmittingPatches
  fs/binfmt_elf.c: spello fix
  do_invalidatepage() comment typo fix
  Documentation/filesystems/porting fixes
  typo fixes in net/core/net_namespace.c
  typo fix in net/rfkill/rfkill.c
  typo fixes in net/sctp/sm_statefuns.c
  lib/: Spelling fixes
  kernel/: Spelling fixes
  include/scsi/: Spelling fixes
  include/linux/: Spelling fixes
  include/asm-m68knommu/: Spelling fixes
  include/asm-frv/: Spelling fixes
  fs/: Spelling fixes
  drivers/watchdog/: Spelling fixes
  drivers/video/: Spelling fixes
  drivers/ssb/: Spelling fixes
  drivers/serial/: Spelling fixes
  drivers/scsi/: Spelling fixes
  ...
2008-02-04 07:58:52 -08:00
Rusty Russell
18445c4d50 virtio: explicit enable_cb/disable_cb rather than callback return.
It seems that virtio_net wants to disable callbacks (interrupts) before
calling netif_rx_schedule(), so we can't use the return value to do so.

Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback
now returns void, rather than a boolean.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:49:58 +11:00
Rusty Russell
a586d4f601 virtio: simplify config mechanism.
Previously we used a type/len pair within the config space, but this
seems overkill.  We now simply define a structure which represents the
layout in the config space: the config space can now only be extended
at the end.

The main driver-visible changes:
1) We indicate what fields are present with an explicit feature bit.
2) Virtqueues are explicitly numbered, and not in the config space.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04 23:49:57 +11:00
Rusty Russell
f35d9d8aae virtio: Implement skb_partial_csum_set, for setting partial csums on untrusted packets.
Use it in virtio_net (replacing buggy version there), it's also going
to be used by TAP for partial csum support.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: David S. Miller <davem@davemloft.net>
2008-02-04 23:49:56 +11:00
Oliver Pinter
53379e57a7 typo fixes in net/core/net_namespace.c
Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:56:48 +02:00
Oliver Pinter
1dbede8714 typo fix in net/rfkill/rfkill.c
Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:55:45 +02:00
Oliver Pinter
ac461a0330 typo fixes in net/sctp/sm_statefuns.c
Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2008-02-03 17:52:41 +02:00
David S. Miller
a80f509f4a Merge branch 'fixes' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-02-03 04:43:34 -08:00
Arnaldo Carvalho de Melo
ab1e0a13d7 [SOCK] proto: Add hashinfo member to struct proto
This way we can remove TCP and DCCP specific versions of

sk->sk_prot->get_port: both v4 and v6 use inet_csk_get_port
sk->sk_prot->hash:     inet_hash is directly used, only v6 need
                       a specific version to deal with mapped sockets
sk->sk_prot->unhash:   both v4 and v6 use inet_hash directly

struct inet_connection_sock_af_ops also gets a new member, bind_conflict, so
that inet_csk_get_port can find the per family routine.

Now only the lookup routines receive as a parameter a struct inet_hashtable.

With this we further reuse code, reducing the difference among INET transport
protocols.

Eventually work has to be done on UDP and SCTP to make them share this
infrastructure and get as a bonus inet_diag interfaces so that iproute can be
used with these protocols.

net-2.6/net/ipv4/inet_hashtables.c:
  struct proto			     |   +8
  struct inet_connection_sock_af_ops |   +8
 2 structs changed
  __inet_hash_nolisten               |  +18
  __inet_hash                        | -210
  inet_put_port                      |   +8
  inet_bind_bucket_create            |   +1
  __inet_hash_connect                |   -8
 5 functions changed, 27 bytes added, 218 bytes removed, diff: -191

net-2.6/net/core/sock.c:
  proto_seq_show                     |   +3
 1 function changed, 3 bytes added, diff: +3

net-2.6/net/ipv4/inet_connection_sock.c:
  inet_csk_get_port                  |  +15
 1 function changed, 15 bytes added, diff: +15

net-2.6/net/ipv4/tcp.c:
  tcp_set_state                      |   -7
 1 function changed, 7 bytes removed, diff: -7

net-2.6/net/ipv4/tcp_ipv4.c:
  tcp_v4_get_port                    |  -31
  tcp_v4_hash                        |  -48
  tcp_v4_destroy_sock                |   -7
  tcp_v4_syn_recv_sock               |   -2
  tcp_unhash                         | -179
 5 functions changed, 267 bytes removed, diff: -267

net-2.6/net/ipv6/inet6_hashtables.c:
  __inet6_hash |   +8
 1 function changed, 8 bytes added, diff: +8

net-2.6/net/ipv4/inet_hashtables.c:
  inet_unhash                        | +190
  inet_hash                          | +242
 2 functions changed, 432 bytes added, diff: +432

vmlinux:
 16 functions changed, 485 bytes added, 492 bytes removed, diff: -7

/home/acme/git/net-2.6/net/ipv6/tcp_ipv6.c:
  tcp_v6_get_port                    |  -31
  tcp_v6_hash                        |   -7
  tcp_v6_syn_recv_sock               |   -9
 3 functions changed, 47 bytes removed, diff: -47

/home/acme/git/net-2.6/net/dccp/proto.c:
  dccp_destroy_sock                  |   -7
  dccp_unhash                        | -179
  dccp_hash                          |  -49
  dccp_set_state                     |   -7
  dccp_done                          |   +1
 5 functions changed, 1 bytes added, 242 bytes removed, diff: -241

/home/acme/git/net-2.6/net/dccp/ipv4.c:
  dccp_v4_get_port                   |  -31
  dccp_v4_request_recv_sock          |   -2
 2 functions changed, 33 bytes removed, diff: -33

/home/acme/git/net-2.6/net/dccp/ipv6.c:
  dccp_v6_get_port                   |  -31
  dccp_v6_hash                       |   -7
  dccp_v6_request_recv_sock          |   +5
 3 functions changed, 5 bytes added, 38 bytes removed, diff: -33

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-03 04:28:52 -08:00
Linus Torvalds
63e9b66e29 Merge branch 'for-linus' of git://linux-nfs.org/~bfields/linux
* 'for-linus' of git://linux-nfs.org/~bfields/linux: (100 commits)
  SUNRPC: RPC program information is stored in unsigned integers
  SUNRPC: Move exported symbol definitions after function declaration part 2
  NLM: tear down RPC clients in nlm_shutdown_hosts
  SUNRPC: spin svc_rqst initialization to its own function
  nfsd: more careful input validation in nfsctl write methods
  lockd: minor log message fix
  knfsd: don't bother mapping putrootfh enoent to eperm
  rdma: makefile
  rdma: ONCRPC RDMA protocol marshalling
  rdma: SVCRDMA sendto
  rdma: SVCRDMA recvfrom
  rdma: SVCRDMA Core Transport Services
  rdma: SVCRDMA Transport Module
  rdma: SVCRMDA Header File
  svc: Add svc_xprt_names service to replace svc_sock_names
  knfsd: Support adding transports by writing portlist file
  svc: Add svc API that queries for a transport instance
  svc: Add /proc/sys/sunrpc/transport files
  svc: Add transport hdr size for defer/revisit
  svc: Move the xprt independent code to the svc_xprt.c file
  ...
2008-02-02 14:31:28 +11:00
Chuck Lever
ea339d46b9 SUNRPC: RPC program information is stored in unsigned integers
Clean up: When looping over RPC version and procedure numbers, use
unsigned index variables.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 17:01:31 -05:00
Trond Myklebust
d2f7e79e3b SUNRPC: Move exported symbol definitions after function declaration part 2
Do it for the server code...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 17:01:24 -05:00
Jeff Layton
0113ab3464 SUNRPC: spin svc_rqst initialization to its own function
Move the initialzation in __svc_create_thread that happens prior to
thread creation to a new function. Export the function to allow
services to have better control over the svc_rqst structs.

Also rearrange the rqstp initialization to prevent NULL pointer
dereferences in svc_exit_thread in case allocations fail.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:15 -05:00
Tom Tucker
4b8449af75 rdma: makefile
Add the svcrdma module to the xprtrdma makefile.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
ef1eac0a3f rdma: ONCRPC RDMA protocol marshalling
This logic parses the ONCRDMA protocol headers that
precede the actual RPC header. It is placed in a separate
file to keep all protocol aware code in a single place.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
c06b540a54 rdma: SVCRDMA sendto
This file implements the RDMA transport sendto function. A RPC reply
on an RDMA transport consists of some number of RDMA_WRITE requests
followed by an RDMA_SEND request. The sendto function parses the
ONCRPC RDMA reply header to determine how to send the reply back to
the client. The send queue is sized so as to be able to send complete
replies for requests in most cases.  In the event that there are not
enough SQ WR slots to reply, e.g.  big data, the send will block the
NFSD thread. The I/O callback functions in svc_rdma_transport.c that
reap WR completions wake any waiters blocked on the SQ. In general,
the goal is not to block NFSD threads and the has_wspace method
stall requests when the SQ is nearly full.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
d5b31be682 rdma: SVCRDMA recvfrom
This file implements the RDMA transport recvfrom function. The function
dequeues work reqeust completion contexts from an I/O list that it shares
with the I/O tasklet in svc_rdma_transport.c. For ONCRPC RDMA, an RPC may
not be complete when it is received. Instead, the RDMA header that precedes
the RPC message informs the transport where to get the RPC data from on
the client and where to place it in the RPC message before it is delivered
to the server. The svc_rdma_recvfrom function therefore, parses this RDMA
header and issues any necessary RDMA operations to fetch the remainder of
the RPC from the client.

Special handling is required when the request involves an RDMA_READ.
In this case, recvfrom submits the RDMA_READ requests to the underlying
transport driver and then returns 0. When the transport
completes the last RDMA_READ for the request, it enqueues it on a
read completion queue and enqueues the transport. The recvfrom code
favors this queue over the regular DTO queue when satisfying reads.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
377f9b2f45 rdma: SVCRDMA Core Transport Services
This file implements the core transport data management and I/O
path. The I/O path for RDMA involves receiving callbacks on interrupt
context. Since all the svc transport locks are _bh locks we enqueue the
transport on a list, schedule a tasklet to dequeue data indications from
the RDMA completion queue. The tasklet in turn takes _bh locks to
enqueue receive data indications on a list for the transport. The
svc_rdma_recvfrom transport function dequeues data from this list in an
NFSD thread context.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
ef7fbf59e6 rdma: SVCRDMA Transport Module
This file implements the RDMA transport module initialization and
termination logic and registers the transport sysctl variables.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
9571af18fa svc: Add svc_xprt_names service to replace svc_sock_names
Create a transport independent version of the svc_sock_names function.

The toclose capability of the svc_sock_names service can be implemented
using the svc_xprt_find and svc_xprt_close services.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:14 -05:00
Tom Tucker
a217813f90 knfsd: Support adding transports by writing portlist file
Update the write handler for the portlist file to allow creating new
listening endpoints on a transport. The general form of the string is:

<transport_name><space><port number>

For example:

echo "tcp 2049" > /proc/fs/nfsd/portlist

This is intended to support the creation of a listening endpoint for
RDMA transports without adding #ifdef code to the nfssvc.c file.

Transports can also be removed as follows:

'-'<transport_name><space><port number>

For example:

echo "-tcp 2049" > /proc/fs/nfsd/portlist

Attempting to add a listener with an invalid transport string results
in EPROTONOSUPPORT and a perror string of "Protocol not supported".

Attempting to remove an non-existent listener (.e.g. bad proto or port)
results in ENOTCONN and a perror string of
"Transport endpoint is not connected"

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
7fcb98d58c svc: Add svc API that queries for a transport instance
Add a new svc function that allows a service to query whether a
transport instance has already been created. This is used in lockd
to determine whether or not a transport needs to be created when
a lockd instance is brought up.

Specifying 0 for the address family or port is effectively a wild-card,
and will result in matching the first transport in the service's list
that has a matching class name.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
dc9a16e49d svc: Add /proc/sys/sunrpc/transport files
Add a file that when read lists the set of registered svc
transports.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
260c1d1298 svc: Add transport hdr size for defer/revisit
Some transports have a header in front of the RPC header. The current
defer/revisit processing considers only the iov_len and arg_len to
determine how much to back up when saving the original request
to revisit. Add a field to the rqstp structure to save the size
of the transport header so svc_defer can correctly compute
the start of a request.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
0f0257eaa5 svc: Move the xprt independent code to the svc_xprt.c file
This functionally trivial patch moves all of the transport independent
functions from the svcsock.c file to the transport independent svc_xprt.c
file.

In addition the following formatting changes were made:
- White space cleanup
- Function signatures on single line
- The inline directive was removed
- Lines over 80 columns were reformatted
- The term 'socket' was changed to 'transport' in comments
- The SMP comment was moved and updated.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
18d19f949d svc: Make svc_check_conn_limits xprt independent
The svc_check_conn_limits function only manipulates xprt fields. Change references
to svc_sock->sk_xprt to svc_xprt directly.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
57b1d3baba svc: Removing remaining references to rq_sock in rqstp
This functionally empty patch removes rq_sock and unamed union
from rqstp structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
4e5caaa5f2 svc: Move create logic to common code
Move the svc transport list logic into common transport creation code.
Refactor this code path to make the flow of control easier to read.

Move the setting and clearing of the BUSY_BIT during transport creation
to common code.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
9f8bfae693 svc: Make svc_age_temp_sockets svc_age_temp_transports
This function is transport independent. Change it to use svc_xprt directly
and change it's name to reflect this.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:13 -05:00
Tom Tucker
c36adb2a7f svc: Make svc_recv transport neutral
All of the transport field and functions used by svc_recv are now
transport independent. Change the svc_recv function to use the svc_xprt
structure directly instead of the transport specific svc_sock structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
eab996d4ac svc: Make svc_sock_release svc_xprt_release
The svc_sock_release function only touches transport independent fields.
Change the function to manipulate svc_xprt directly instead of the transport
dependent svc_sock structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
9dbc240f19 svc: Move the sockaddr information to svc_xprt
This patch moves the transport sockaddr to the svc_xprt
structure.  Convenience functions are added to set and
get the local and remote addresses of a transport from
the transport provider as well as determine the length
of a sockaddr.

A transport is responsible for setting the xpt_local
and xpt_remote addresses in the svc_xprt structure as
part of transport creation and xpo_accept processing. This
cannot be done in a generic way and in fact varies
between TCP, UDP and RDMA. A set of xpo_ functions
(e.g. getlocalname, getremotename) could have been
added but this would have resulted in additional
caching and copying of the addresses around.  Note that
the xpt_local address should also be set on listening
endpoints; for TCP/RDMA this is done as part of
endpoint creation.

For connected transports like TCP and RDMA, the addresses
never change and can be set once and copied into the
rqstp structure for each request. For UDP, however, the
local and remote addresses may change for each request. In
this case, the address information is obtained from the
UDP recvmsg info and copied into the rqstp structure from
there.

A svc_xprt_local_port function was also added that returns
the local port given a transport. This is used by
svc_create_xprt when returning the port associated with
a newly created transport, and later when creating a
generic find transport service to check if a service is
already listening on a given port.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
8c7b0172a1 svc: Make deferral processing xprt independent
This patch moves the transport independent sk_deferred list to the svc_xprt
structure and updates the svc_deferred_req structure to keep pointers to
svc_xprt's directly. The deferral processing code is also moved out of the
transport dependent recvfrom functions and into the generic svc_recv path.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
def13d7401 svc: Move the authinfo cache to svc_xprt.
Move the authinfo cache to svc_xprt. This allows both the TCP and RDMA
transports to share this logic. A flag bit is used to determine if
auth information is to be cached or not. Previously, this code looked
at the transport protocol.

I've also changed the spin_lock/unlock logic so that a lock is not taken for
transports that are not caching auth info.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
4bc6c497b2 svc: Remove sk_lastrecv
With the implementation of the new mark and sweep algorithm for shutting
down old connections, the sk_lastrecv field is no longer needed.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
6bc5ab1367 svc: Move accept call to svc_xprt_received to common code
Now that the svc_xprt_received function handles transports, the call
to svc_xprt_received in the xpo_tcp_accept function can be moved to
common code.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
a6046f71f2 svc: Change svc_sock_received to svc_xprt_received and export it
All fields touched by svc_sock_received are now transport independent.
Change it to use svc_xprt directly. This function is called from
transport dependent code, so export it.

Update the comment to clearly state the rules for calling this function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:12 -05:00
Tom Tucker
a50fea26b9 svc: Make svc_send transport neutral
Move the sk_mutex field to the transport independent svc_xprt structure.
Now all the fields that svc_send touches are transport neutral. Change the
svc_send function to use the transport independent svc_xprt directly instead
of the transport dependent svc_sock structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
f6150c3cab svc: Make the enqueue service transport neutral and export it.
The svc_sock_enqueue function is now transport independent since all of
the fields it touches have been moved to the transport independent svc_xprt
structure. Change the function to use the svc_xprt structure directly
instead of the transport specific svc_sock structure.

Transport specific data-ready handlers need to call this function, so
export it.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
7a90e8cc21 svc: Move sk_reserved to svc_xprt
This functionally trivial patch moves the sk_reserved field to the
transport independent svc_xprt structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
7a18208383 svc: Make close transport independent
Move sk_list and sk_ready to svc_xprt. This involves close because these
lists are walked by svcs when closing all their transports. So I combined
the moving of these lists to svc_xprt with making close transport independent.

The svc_force_sock_close has been changed to svc_close_all and takes a list
as an argument. This removes some svc internals knowledge from the svcs.

This code races with module removal and transport addition.

Thanks to Simon Holm Thøgersen for a compile fix.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>
2008-02-01 16:42:11 -05:00
Tom Tucker
bb5cf160b2 svc: Move sk_server and sk_pool to svc_xprt
This is another incremental change that moves transport independent
fields from svc_sock to the svc_xprt structure. The changes
should be functionally null.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
02fc6c3618 svc: Move sk_flags to the svc_xprt structure
This functionally trivial change moves the transport independent sk_flags
field to the transport independent svc_xprt structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
e1b3157f97 svc: Change sk_inuse to a kref
Change the atomic_t reference count to a kref and move it to the
transport indepenent svc_xprt structure. Change the reference count
wrapper names to be generic.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:11 -05:00
Tom Tucker
d7c9f1ed97 svc: Change services to use new svc_create_xprt service
Modify the various kernel RPC svcs to use the svc_create_xprt service.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:09 -05:00
Tom Tucker
b700cbb11f svc: Add a generic transport svc_create_xprt function
The svc_create_xprt function is a transport independent version
of the svc_makesock function.

Since transport instance creation contains transport dependent and
independent components, add an xpo_create transport function. The
transport implementation of this function allocates the memory for the
endpoint, implements the transport dependent initialization logic, and
calls svc_xprt_init to initialize the transport independent field (svc_xprt)
in it's data structure.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:09 -05:00
Tom Tucker
f9f3cc4fae svc: Move connection limit checking to its own function
Move the code that poaches connections when the connection limit is hit
to a subroutine to make the accept logic path easier to follow. Since this
is in the new connection path, it should not be a performance issue.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:09 -05:00
Tom Tucker
44a6995b32 svc: Remove unnecessary call to svc_sock_enqueue
The svc_tcp_accept function calls svc_sock_enqueue after setting the
SK_CONN bit. This doesn't actually do anything because the SK_BUSY bit
is still set. The call is unnecessary anyway because the generic code in
svc_recv calls svc_sock_received after calling the accept function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:09 -05:00
Tom Tucker
38a417cc99 svc: Add xpo_accept transport function
Previously, the accept logic looked into the socket state to determine
whether to call accept or recv when data-ready was indicated on an endpoint.
Since some transports don't use sockets, this logic now uses a flag
bit (SK_LISTENER) to identify listening endpoints. A transport function
(xpo_accept) allows each transport to define its own accept processing.
A transport's initialization logic is reponsible for setting the
SK_LISTENER bit. I didn't see any way to do this in transport independent
logic since the passive side of a UDP connection doesn't listen and
always recv's.

In the svc_recv function, if the SK_LISTENER bit is set, the transport
xpo_accept function is called to handle accept processing.

Note that all functions are defined even if they don't make sense
for a given transport. For example, accept doesn't mean anything for
UDP. The function is defined anyway and bug checks if called. The
UDP transport should never set the SK_LISTENER bit.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
d7979ae4a0 svc: Move close processing to a single place
Close handling was duplicated in the UDP and TCP recvfrom
methods. This code has been moved to the transport independent
svc_recv function.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
323bee32e9 svc: Add a transport function that checks for write space
In order to avoid blocking a service thread, the receive side checks
to see if there is sufficient write space to reply to the request.
Each transport has a different mechanism for determining if there is
enough write space to reply.

The code that checked for write space was coupled with code that
checked for CLOSE and CONN. These checks have been broken out into
separate statements to make the code easier to read.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
e831fe65b1 svc: Add xpo_prep_reply_hdr
Some transports add fields to the RPC header for replies, e.g. the TCP
record length. This function is called when preparing the reply header
to allow each transport to add whatever fields it requires.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
755cceaba7 svc: Add per-transport delete functions
Add transport specific xpo_detach and xpo_free functions. The xpo_detach
function causes the transport to stop delivering data-ready events
and enqueing the transport for I/O.

The xpo_free function frees all resources associated with the particular
transport instance.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00
Tom Tucker
5148bf4ebc svc: Add transport specific xpo_release function
The svc_sock_release function releases pages allocated to a thread. For
UDP this frees the receive skb. For RDMA it will post a receive WR
and bump the client credit count.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:08 -05:00