linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-08-21 16:29:59 +00:00

Author	SHA1	Message	Date
Mark McLoughlin	3f2c31d903	virtio_net: VIRTIO_NET_F_MSG_RXBUF (imprive rcv buffer allocation) If segmentation offload is enabled by the host, we currently allocate maximum sized packet buffers and pass them to the host. This uses up 20 ring entries, allowing us to supply only 20 packet buffers to the host with a 256 entry ring. This is a huge overhead when receiving small packets, and is most keenly felt when receiving MTU sized packets from off-host. The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support using receive buffers which are smaller than the maximum packet size. In order to transfer large packets to the guest, the host merges together multiple receive buffers to form a larger logical buffer. The number of merged buffers is returned to the guest via a field in the virtio_net_hdr. Make use of this support by supplying single page receive buffers to the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of the payload to the skb's linear data buffer and adjust the fragment offset to point to the remaining data. This ensures proper alignment and allows us to not use any paged data for small packets. If the payload occupies multiple pages, we simply append those pages as fragments and free the associated skbs. This scheme allows us to be efficient in our use of ring entries while still supporting large packets. Benchmarking using netperf from an external machine to a guest over a 10Gb/s network shows a 100% improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark with GSO disabled on the host side, throughput was seen to increase from 700Mb/s to 1.7Gb/s. Based on a patch from Herbert Xu. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 22:41:34 -08:00
Mark McLoughlin	0276b4972e	virtio_net: hook up the set-tso ethtool op Seems like an oversight that we have set-tx-csum and set-sg hooked up, but not set-tso. Also leads to the strange situation that if you e.g. disable tx-csum, then tso doesn't get disabled. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 22:40:36 -08:00
Mark McLoughlin	0a888fd1f6	virtio_net: Recycle some more rx buffer pages Each time we re-fill the recv queue with buffers, we allocate one too many skbs and free it again when adding fails. We should recycle the pages allocated in this case. A previous version of this patch made trim_pages() trim trailing unused pages from skbs with some paged data, but this actually caused a barely measurable slowdown. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 22:39:18 -08:00
Alexey Dobriyan	908cd2dabb	net: use %pF for /proc/net/ptype Technically, patch changes format for modules, but I think nobody cares. -86dd :ipv6:ipv6_rcv+0x0 +86dd ipv6_rcv+0x0/0x400 [ipv6] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:50:35 -08:00
Eric Dumazet	5635c10d97	net: make sure struct dst_entry refcount is aligned on 64 bytes As found in the past (commit `f1dd9c379c` [NET]: Fix tbench regression in 2.6.25-rc1), it is really important that struct dst_entry refcount is aligned on a cache line. We cannot use __atribute((aligned)), so manually pad the structure for 32 and 64 bit arches. for 32bit : offsetof(truct dst_entry, __refcnt) is 0x80 for 64bit : offsetof(truct dst_entry, __refcnt) is 0xc0 As it is not possible to guess at compile time cache line size, we use a generic value of 64 bytes, that satisfies many current arches. (Using 128 bytes alignment on 64bit arches would waste 64 bytes) Add a BUILD_BUG_ON to catch future updates to "struct dst_entry" dont break this alignment. "tbench 8" is 4.4 % faster on a dual quad core (HP BL460c G1), Intel E5450 @3.00GHz (2350 MB/s instead of 2250 MB/s) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:46:36 -08:00
Eric Dumazet	536533e69e	rcu: documents rculist_nulls Adds Documentation/RCU/rculist_nulls.txt file to describe how 'nulls' end-of-list can help in some RCU algos. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:41:14 -08:00
Eric Dumazet	3ab5aee7fe	net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls RCU was added to UDP lookups, using a fast infrastructure : - sockets kmem_cache use SLAB_DESTROY_BY_RCU and dont pay the price of call_rcu() at freeing time. - hlist_nulls permits to use few memory barriers. This patch uses same infrastructure for TCP/DCCP established and timewait sockets. Thanks to SLAB_DESTROY_BY_RCU, no slowdown for applications using short lived TCP connections. A followup patch, converting rwlocks to spinlocks will even speedup this case. __inet_lookup_established() is pretty fast now we dont have to dirty a contended cache line (read_lock/read_unlock) Only established and timewait hashtable are converted to RCU (bind table and listen table are still using traditional locking) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:40:17 -08:00
Eric Dumazet	88ab1932ea	udp: Use hlist_nulls in UDP RCU code This is a straightforward patch, using hlist_nulls infrastructure. RCUification already done on UDP two weeks ago. Using hlist_nulls permits us to avoid some memory barriers, both at lookup time and delete time. Patch is large because it adds new macros to include/net/sock.h. These macros will be used by TCP & DCCP in next patch. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:39:21 -08:00
Eric Dumazet	bbaffaca48	rcu: Introduce hlist_nulls variant of hlist hlist uses NULL value to finish a chain. hlist_nulls variant use the low order bit set to 1 to signal an end-of-list marker. This allows to store many different end markers, so that some RCU lockless algos (used in TCP/UDP stack for example) can save some memory barriers in fast paths. Two new files are added : include/linux/list_nulls.h - mimics hlist part of include/linux/list.h, derived to hlist_nulls variant include/linux/rculist_nulls.h - mimics hlist part of include/linux/rculist.h, derived to hlist_nulls variant Only four helpers are declared for the moment : hlist_nulls_del_init_rcu(), hlist_nulls_del_rcu(), hlist_nulls_add_head_rcu() and hlist_nulls_for_each_entry_rcu() prefetches() were removed, since an end of list is not anymore NULL value. prefetches() could trigger useless (and possibly dangerous) memory transactions. Example of use (extracted from __udp4_lib_lookup()) struct sock sk, result; struct hlist_nulls_node node; unsigned short hnum = ntohs(dport); unsigned int hash = udp_hashfn(net, hnum); struct udp_hslot hslot = &udptable->hash[hash]; int score, badness; rcu_read_lock(); begin: result = NULL; badness = -1; sk_nulls_for_each_rcu(sk, node, &hslot->head) { score = compute_score(sk, net, saddr, hnum, sport, daddr, dport, dif); if (score > badness) { result = sk; badness = score; } } /* * if the nulls value we got at the end of this lookup is * not the expected one, we must restart lookup. * We probably met an item that was moved to another chain. */ if (get_nulls_value(node) != hash) goto begin; if (result) { if (unlikely(!atomic_inc_not_zero(&result->sk_refcnt))) result = NULL; else if (unlikely(compute_score(result, net, saddr, hnum, sport, daddr, dport, dif) < badness)) { sock_put(result); goto begin; } } rcu_read_unlock(); return result; Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:37:55 -08:00
Balazs Scheidler	e8b2dfe9b4	TPROXY: implemented IP_RECVORIGDSTADDR socket option In case UDP traffic is redirected to a local UDP socket, the originally addressed destination address/port cannot be recovered with the in-kernel tproxy. This patch adds an IP_RECVORIGDSTADDR sockopt that enables a IP_ORIGDSTADDR ancillary message in recvmsg(). This ancillary message contains the original destination address/port of the packet being received. Signed-off-by: Balazs Scheidler <bazsi@balabit.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:32:39 -08:00
Ben Greear	8164f1b797	ipv4: Fix ARP behavior with many mac-vlans Ben Greear wrote: > I have 500 mac-vlans on a system talking to 500 other > mac-vlans. My problem is that the arp-table gets extremely > huge because every time an arp-request comes in on all mac-vlans, > a stale arp entry is added for each mac-vlan. I have filtering > turned on, but that doesn't help because the neigh_event_ns call > below will cause a stale neighbor entry to be created regardless > of whether a replay will be sent or not. > Maybe the neigh_event code should be below the checks for dont_send, > and only create check neigh_event_ns if we are !dont_send? The attached patch makes it work much better for me. The patch will cause the code to NOT create a stale neighbor entry if we are not going to respond to the ARP request. The old code would create a stale entry even if we are not going to respond. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:19:38 -08:00
Alexander Duyck	6ea7ae1d0f	e1000e: enable ECC correction on 82571 silicon This change enables ECC correction for the packet buffer on all 82571 silicon. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:09:43 -08:00
Paulius Zaleckas	f004f3ea34	phylib: make mdio-gpio work without OF (v4) make mdio-gpio work with non OpenFirmware gpio implementation. Aditional changes to mdio-gpio: - use gpio_request() and gpio_free() - place irq[] array in struct mdio_gpio_info - add module description, author and license - add note about compiling this driver as module - rename mdc and mdio function (were ugly names) - change MII to MDIO in bus name - add __init __exit to module (un)loading functions - probe fails if no phys added to the bus - kzalloc bitbang with sizeof(*bitbang) Changes since v3: - keep bus naming "%x" to be compatible with existing drivers. Changes since v2: - more #ifdefs reduction - platform driver will be registered on OF platforms also - unified platform and OF bus_id to phy%i Changes since v1: - removed NO_IRQ - reduced #idefs Laurent, please test this driver under OF. Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 18:59:45 -08:00
Paulius Zaleckas	72af187f21	phylib: rename mdio-ofgpio to mdio-gpio Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 18:59:24 -08:00
David S. Miller	6817ba2cd2	dm9000: Fix build error. Reported by Stephen Rothwell: drivers/net/dm9000.c:1450: error: expected ')' before ';' token drivers/net/dm9000.c:1455: error: expected ';' before '}' token Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 12:41:35 -08:00
David Brownell	cda2836dc6	pegasus: minor resource shrinkage Make pegasus driver not allocate a workqueue until the driver is bound to some device, which will need that workqueue if the device is brought up. This conserves resources when the driver is linked but there's no pegasus device connected. Also shrink the runtime footprint a smidgeon by moving some init-only code into its proper section, and move an obnoxious (frequent and meaningless) message to be debug-only. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 00:36:08 -08:00
PJ Waskiewicz	74ad0a5421	ixgbe: Fix usage of netif_*_all_queues() with netif_carrier_{off\|on}() netif_carrier_off() is sufficient to stop Tx into the driver. Stopping the Tx queues is redundant and unnecessary. By the same token, netif_carrier_on() will be sufficient to re-enable Tx, so waking the queues is unnecessary. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 00:31:35 -08:00
Eric Dumazet	ef711cf1d1	net: speedup dst_release() During tbench/oprofile sessions, I found that dst_release() was in third position. CPU: Core 2, speed 2999.68 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000 samples % symbol name 483726 9.0185 __copy_user_zeroing_intel 191466 3.5697 __copy_user_intel 185475 3.4580 dst_release 175114 3.2648 ip_queue_xmit 153447 2.8608 tcp_sendmsg 108775 2.0280 tcp_recvmsg 102659 1.9140 sysenter_past_esp 101450 1.8914 tcp_current_mss 95067 1.7724 __copy_from_user_ll 86531 1.6133 tcp_transmit_skb Of course, all CPUS fight on the dst_entry associated with 127.0.0.1 Instead of first checking the refcount value, then decrement it, we use atomic_dec_return() to help CPU to make the right memory transaction (ie getting the cache line in exclusive mode) dst_release() is now at the fifth position, and tbench a litle bit faster ;) CPU: Core 2, speed 3000.1 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000 samples % symbol name 647107 8.8072 __copy_user_zeroing_intel 258840 3.5229 ip_queue_xmit 258302 3.5155 __copy_user_intel 209629 2.8531 tcp_sendmsg 165632 2.2543 dst_release 149232 2.0311 tcp_current_mss 147821 2.0119 tcp_recvmsg 137893 1.8767 sysenter_past_esp 127473 1.7349 __copy_from_user_ll 121308 1.6510 ip_finish_output 118510 1.6129 tcp_transmit_skb 109295 1.4875 tcp_v4_rcv Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-14 00:53:54 -08:00
Jarek Poplawski	f30ab418a1	pkt_sched: Remove qdisc->ops->requeue() etc. After implementing qdisc->ops->peek() and changing sch_netem into classless qdisc there are no more qdisc->ops->requeue() users. This patch removes this method with its wrappers (qdisc_requeue()), and also unused qdisc->requeue structure. There are a few minor fixes of warnings (htb_enqueue()) and comments btw. The idea to kill ->requeue() and a similar patch were first developed by David S. Miller. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-13 22:56:30 -08:00
Petr Tesarik	38a7ddffa4	tcp: remove an unnecessary field in struct tcp_skb_cb The urg_ptr field is not used anywhere and is merely confusing. Signed-off-by: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-13 22:44:11 -08:00
Harvey Harrison	00bcd522ea	isdn: use %pI4, remove get_{u8/u16/u32} and put_{u8/u16/u32} inlines They would have been better named as get_be16, put_be16, etc. as they were hiding an endian shift inside. They don't add much over explicitly coding the byteshifting and gcc sometimes has a problem with builtin_constant_p inside inline functions, so it may do a better job of byteswapping at compile time rather than runtime. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-13 22:41:29 -08:00
Wang Chen	524ad0a791	netdevice: safe convert to netdev_priv() #part-4 We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously netdev_priv() is more flexible than netdev->priv. But we cann't kill netdev->priv, because so many drivers reference to it directly. This patch is a safe convert for netdev->priv to netdev_priv(netdev). Since all of the netdev->priv is only for read. But it is too big to be sent in one mail. I split it to 4 parts and make every part smaller than 100,000 bytes, which is max size allowed by vger. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 23:39:10 -08:00
Wang Chen	8f15ea42b6	netdevice: safe convert to netdev_priv() #part-3 We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously netdev_priv() is more flexible than netdev->priv. But we cann't kill netdev->priv, because so many drivers reference to it directly. This patch is a safe convert for netdev->priv to netdev_priv(netdev). Since all of the netdev->priv is only for read. But it is too big to be sent in one mail. I split it to 4 parts and make every part smaller than 100,000 bytes, which is max size allowed by vger. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 23:38:36 -08:00
Wang Chen	4cf1653aa9	netdevice: safe convert to netdev_priv() #part-2 We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously netdev_priv() is more flexible than netdev->priv. But we cann't kill netdev->priv, because so many drivers reference to it directly. This patch is a safe convert for netdev->priv to netdev_priv(netdev). Since all of the netdev->priv is only for read. But it is too big to be sent in one mail. I split it to 4 parts and make every part smaller than 100,000 bytes, which is max size allowed by vger. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 23:38:14 -08:00
Wang Chen	454d7c9b14	netdevice: safe convert to netdev_priv() #part-1 We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously netdev_priv() is more flexible than netdev->priv. But we cann't kill netdev->priv, because so many drivers reference to it directly. This patch is a safe convert for netdev->priv to netdev_priv(netdev). Since all of the netdev->priv is only for read. But it is too big to be sent in one mail. I split it to 4 parts and make every part smaller than 100,000 bytes, which is max size allowed by vger. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 23:37:49 -08:00
Arnaud Ebalard	7a12122c7a	net: Remove unused parameter of xfrm_gen_index() In commit `2518c7c2b3` ("[XFRM]: Hash policies when non-prefixed."), the last use of xfrm_gen_policy() first argument was removed, but the argument was left behind in the prototype. Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 23:28:15 -08:00
Alexey Dobriyan	2378982487	net: ifdef struct sock::sk_async_wait_queue Every user is under CONFIG_NET_DMA already, so ifdef field as well. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 23:25:32 -08:00
Michael Chan	e4412cb8a6	bnx2: Update version to 1.8.2. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 16:03:05 -08:00
Michael Chan	40105c0b07	bnx2: Reorganize timeout constants. Move all related timeout constants to the same location. BNX2 prefix is also added to make them more consistent. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 16:02:45 -08:00
Michael Chan	d8026d9394	bnx2: Set rx buffer water marks based on MTU. The default rx buffer water marks for XOFF/XON are for 1500 MTU. At larger MTUs, these water marks need to be adjusted for effective flow control. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 16:02:20 -08:00
Michael Chan	5ec6d7bf19	bnx2: Restrict WoL support. On some quad-port cards that cannot support WoL on all ports due to excessive power consumption, the driver needs to restrict WoL on some ports by checking VAUX_PRESET bit. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 16:01:41 -08:00
Michael Chan	1caacecb7c	bnx2: Add PCI ID for 5716S. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 16:01:12 -08:00
Eric Dumazet	e42ea986e4	net: Cleanup of neighbour code Using read_pnet() and write_pnet() in neighbour code ease the reading of code. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 00:54:54 -08:00
Eric Dumazet	7a9546ee35	net: ib_net pointer should depends on CONFIG_NET_NS We can shrink size of "struct inet_bind_bucket" by 50%, using read_pnet() and write_pnet() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 00:54:20 -08:00
Eric Dumazet	8f424b5f32	net: Introduce read_pnet() and write_pnet() helpers This patch introduces two helpers that deal with reading and writing struct net pointers in various network structures. Their implementation depends on CONFIG_NET_NS For symmetry, both functions work with "struct net *pnet". Their usage should reduce the number of #ifdef CONFIG_NET_NS, without adding many helpers for each network structure that hold a "struct net pointer" Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 00:53:30 -08:00
Gerrit Renker	9eca0a47de	dccp: Resolve dependencies of features on choice of CCID This provides a missing link in the code chain, as several features implicitly depend and/or rely on the choice of CCID. Most notably, this is the Send Ack Vector feature, but also Ack Ratio and Send Loss Event Rate (also taken care of). For Send Ack Vector, the situation is as follows: * since CCID2 mandates the use of Ack Vectors, there is no point in allowing endpoints which use CCID2 to disable Ack Vector features such a connection; * a peer with a TX CCID of CCID2 will always expect Ack Vectors, and a peer with a RX CCID of CCID2 must always send Ack Vectors (RFC 4341, sec. 4); * for all other CCIDs, the use of (Send) Ack Vector is optional and thus negotiable. However, this implies that the code negotiating the use of Ack Vectors also supports it (i.e. is able to supply and to either parse or ignore received Ack Vectors). Since this is not the case (CCID-3 has no Ack Vector support), the use of Ack Vectors is here disabled, with a comment in the source code. An analogous consideration arises for the Send Loss Event Rate feature, since the CCID-3 implementation does not support the loss interval options of RFC 4342. To make such use explicit, corresponding feature-negotiation options are inserted which signal the use of the loss event rate option, as it is used by the CCID3 code. Lastly, the values of the Ack Ratio feature are matched to the choice of CCID. The patch implements this as a function which is called after the user has made all other registrations for changing default values of features. The table is variable-length, the reserved (and hence for feature-negotiation invalid, confirmed by considering section 19.4 of RFC 4340) feature number `0' is used to mark the end of the table. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 00:48:44 -08:00
Gerrit Renker	d90ebcbfa7	dccp: Query supported CCIDs This provides a data structure to record which CCIDs are locally supported and three accessor functions: - a test function for internal use which is used to validate CCID requests made by the user; - a copy function so that the list can be used for feature-negotiation; - documented getsockopt() support so that the user can query capabilities. The data structure is a table which is filled in at compile-time with the list of available CCIDs (which in turn depends on the Kconfig choices). Using the copy function for cloning the list of supported CCIDs is useful for feature negotiation, since the negotiation is now with the full list of available CCIDs (e.g. {2, 3}) instead of the default value {2}. This means negotiation will not fail if the peer requests to use CCID3 instead of CCID2. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 00:47:26 -08:00
Gerrit Renker	e8ef967a54	dccp: Registration routines for changing feature values Two registration routines, for SP and NN features, are provided by this patch, replacing a previous routine which was used for both feature types. These are internal-only routines and therefore start with `__feat_register'. It further exports the known limits of Sequence Window and Ack Ratio as symbolic constants. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 00:43:40 -08:00
Gerrit Renker	f74e91b6cc	dccp: Limit feature negotiation to connection setup phase This patch limits feature (capability) negotation to the connection setup phase: 1. Although it is theoretically possible to perform feature negotiation at any time (and RFC 4340 supports this), in practice this is prohibitively complex, as it requires to put traffic on hold for each new negotiation. 2. As a byproduct of restricting feature negotiation to connection setup, the feature-negotiation retransmit timer is no longer required. This part is now mapped onto the protocol-level retransmission. Details indicating why timers are no longer needed can be found on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/feature_negotiation/\ implementation_notes.html This patch disables anytime negotiation, subsequent patches work out full feature negotiation support for connection setup. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-12 00:42:58 -08:00
Alexey Dobriyan	6bb3ce25d0	net: remove struct dst_entry::entry_size Unused after kmem_cache_zalloc() conversion. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-11 17:25:22 -08:00
Alexey Dobriyan	9b739ba5e6	net: remove struct neigh_table::pde ->pde isn't actually needed, since name is stashed in ->id. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-11 16:47:44 -08:00
David S. Miller	7e452baf6b	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/message/fusion/mptlan.c drivers/net/sfc/ethtool.c net/mac80211/debugfs_sta.c	2008-11-11 15:43:02 -08:00
David S. Miller	3ac38c3a2e	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6	2008-11-11 14:40:06 -08:00
Linus Torvalds	f21f237cf5	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers: handle HRTIMER_CB_IRQSAFE_UNLOCKED correctly from softirq context nohz: disable tick_nohz_kick_tick() for now irq: call __irq_enter() before calling the tick_idle_check x86: HPET: enter hpet_interrupt_handler with interrupts disabled x86: HPET: read from HPET_Tn_CMP() not HPET_T0_CMP x86: HPET: convert WARN_ON to WARN_ON_ONCE	2008-11-11 10:53:50 -08:00
Linus Torvalds	2f96cb57cd	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: release buddies on yield fix for account_group_exec_runtime(), make sure ->signal can't be freed under rq->lock sched: clean up debug info	2008-11-11 10:52:25 -08:00
Linus Torvalds	09eb3b5b1b	Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: ring-buffer: prevent infinite looping on time stamping ftrace: disable tracing on resize ftrace: fix breakage in bin_fmt results ftrace: ftrace.txt version update ftrace: update txt document	2008-11-11 10:51:50 -08:00
Sujith	9757d55652	ath9k: Fix compilation failure when RFKILL is enabled Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-11-11 13:06:14 -05:00
Linus Torvalds	04ca2c17e3	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: [XFS] XFS: Check for valid transaction headers in recovery [XFS] handle memory allocation failures during log initialisation [XFS] Account for allocated blocks when expanding directories [XFS] Wait for all I/O on truncate to zero file size [XFS] Fix use-after-free with log and quotas	2008-11-11 09:32:58 -08:00
Linus Torvalds	ad1164b79f	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (21 commits) ocfs2: Check search result in ocfs2_xattr_block_get() ocfs2: fix printk related build warnings in xattr.c ocfs2: truncate outstanding block after direct io failure ocfs2/xattr: Proper hash collision handle in bucket division ocfs2: return 0 in page_mkwrite to let VFS retry. ocfs2: Set journal descriptor to NULL after journal shutdown ocfs2: Fix check of return value of ocfs2_start_trans() in xattr.c. ocfs2: Let inode be really deleted when ocfs2_mknod_locked() fails ocfs2: Fix checking of return value of new_inode() ocfs2: Fix check of return value of ocfs2_start_trans() ocfs2: Fix some typos in xattr annotations. ocfs2: Remove unused ocfs2_restore_xattr_block(). ocfs2: Don't repeat ocfs2_xattr_block_find() ocfs2: Specify appropriate journal access for new xattr buckets. ocfs2: Check errors from ocfs2_xattr_update_xattr_search() ocfs2: Don't return -EFAULT from a corrupt xattr entry. ocfs2: Check xattr block signatures properly. ocfs2: add handler_map array bounds checking ocfs2: remove duplicate definition in xattr ocfs2: fix function declaration and definition in xattr ...	2008-11-11 09:31:32 -08:00
Alan Cox	0906dd9df2	telephony: trivial: fix up email address Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-11 09:30:23 -08:00

1 2 3 4 5 ...

119150 commits