Commit graph

8789 commits

Author SHA1 Message Date
Pavel Emelyanov
36f0bebd98 [TR]: Use ctl paths to register net/token-ring/ table
The same thing for token-ring - use ctl paths and get
rid of external references on the tr_table.

Unfortunately, I couldn't split this patch into cleanup and
use-the-paths parts.

As a lame excuse I can say, that the cleanup is just moving
the tr_table from one file to another - closet to a single
variable, that this ctl table tunes. Since the source  file
becomes empty after the move, I remove it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:28 -08:00
Patrick McHardy
02f014d888 [NETFILTER]: nf_queue: move list_head/skb/id to struct nf_info
Move common fields for queue management to struct nf_info and rename it
to struct nf_queue_entry. The avoids one allocation/free per packet and
simplifies the code a bit.

Alternatively we could add some private room at the tail, but since
all current users use identical structs this seems easier.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:14 -08:00
Patrick McHardy
c01cd429fc [NETFILTER]: nf_queue: move queueing related functions/struct to seperate header
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:10 -08:00
Patrick McHardy
f9d8928f83 [NETFILTER]: nf_queue: remove unused data pointer
Remove the data pointer from struct nf_queue_handler. It has never been used
and is useless for the only handler that really matters, nfnetlink_queue,
since the handler is shared between all instances.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:10 -08:00
Patrick McHardy
e3ac529815 [NETFILTER]: nf_queue: make queue_handler const
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:09 -08:00
Patrick McHardy
1841a4c7ae [NETFILTER]: nf_ct_h323: remove ipv6 module dependency
nf_conntrack_h323 needs ip6_route_output for the call forwarding filter.
Add a ->route function to nf_afinfo and use that to avoid pulling in the
ipv6 module.

Fix the #ifdef for the IPv6 code while I'm at it - the IPv6 support is
only needed when IPv6 conntrack is enabled.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:05 -08:00
Patrick McHardy
50c164a81f [NETFILTER]: x_tables: add rateest match
Add rate estimator match. The rate estimator match can match on
estimated rates by the RATEEST target. It supports matching on
absolute bps/pps values, comparing two rate estimators and matching
on the difference between two rate estimators.

This is what I use to route outgoing data connections from a FTP
server over two lines based on the  available bandwidth:

# estimate outgoing rates
iptables -t mangle -A POSTROUTING -o eth0 -j RATEEST --rateest-name eth0 \
                                                     --rateest-interval 250ms \
                                                     --rateest-ewma 0.5s
iptables -t mangle -A POSTROUTING -o ppp0 -j RATEEST --rateest-name ppp0 \
                                                     --rateest-interval 250ms \
                                                     --rateest-ewma 0.5s

# mark based on available bandwidth
iptables -t mangle -A BALANCE -m state --state NEW \
                              -m helper --helper ftp \
                              -m rateest --rateest-delta \
                                         --rateest1 eth0 \
                                         --rateest-bps1 2.5mbit \
                                         --rateest-gt \
                                         --rateest2 ppp0 \
                                         --rateest-bps2 2mbit \
                              -j CONNMARK --set-mark 0x1

iptables -t mangle -A BALANCE -m state --state NEW \
                              -m helper --helper ftp \
                              -m rateest --rateest-delta \
                                         --rateest1 ppp0 \
                                         --rateest-bps1 2mbit \
                                         --rateest-gt \
                                         --rateest2 eth0 \
                                         --rateest-bps2 2.5mbit \
                              -j CONNMARK --set-mark 0x2

iptables -t mangle -A BALANCE -j CONNMARK --restore-mark

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:03 -08:00
Patrick McHardy
5859034d7e [NETFILTER]: x_tables: add RATEEST target
Add new rate estimator target (using gen_estimator). In combination with
the rateest match (next patch) this can be used for load-based multipath
routing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:02 -08:00
Jan Engelhardt
5c350e5a38 [NETFILTER]: IPv6 capable xt_TOS v1 target
Extends the xt_DSCP target by xt_TOS v1 to add support for selectively
setting and flipping any bit in the IPv4 TOS and IPv6 Priority fields.
(ipt_TOS and xt_DSCP only accepted a limited range of possible
values.)

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:00 -08:00
Jan Engelhardt
f1095ab51d [NETFILTER]: IPv6 capable xt_tos v1 match
Extends the xt_dscp match by xt_tos v1 to add support for selectively
matching any bit in the IPv4 TOS and IPv6 Priority fields. (ipt_tos
and xt_dscp only accepted a limited range of possible values.)

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:56:00 -08:00
Laszlo Attila Toth
e2cf5ecbea [NETFILTER]: ipt_addrtype: limit address type checking to an interface
Addrtype match has a new revision (1), which lets address type checking
limited to the interface the current packet belongs to. Either incoming
or outgoing interface can be used depending on the current hook. In the
FORWARD hook two maches should be used if both interfaces have to be checked.
The new structure is ipt_addrtype_info_v1.

Revision 0 lets older userspace programs use the match as earlier.
ipt_addrtype_info is used.

Signed-off-by: Laszlo Attila Toth <panther@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:56 -08:00
Jan Engelhardt
0265ab44ba [NETFILTER]: merge ipt_owner/ip6t_owner in xt_owner
xt_owner merges ipt_owner and ip6t_owner, and adds a flag to match
on socket (non-)existence.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:55 -08:00
Eric Dumazet
259d4e41f3 [NETFILTER]: x_tables: struct xt_table_info diet
Instead of using a big array of NR_CPUS entries, we can compute the size
needed at runtime, using nr_cpu_ids

This should save some ram (especially on David's machines where NR_CPUS=4096 :
32 KB can be saved per table, and 64KB for dynamically allocated ones (because
of slab/slub alignements) )

In particular, the 'bootstrap' tables are not any more static (in data
section) but on stack as their size is now very small.

This also should reduce the size used on stack in compat functions
(get_info() declares an automatic variable, that could be bigger than kernel
stack size for big NR_CPUS)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:54 -08:00
Sven Schnelle
338e8a7926 [NETFILTER]: x_tables: add TCPOPTSTRIP target
Signed-off-by: Sven Schnelle <svens@bitebene.org>
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:51 -08:00
Eric W. Biederman
e51b6ba077 sysctl: Infrastructure for per namespace sysctls
This patch implements the basic infrastructure for per namespace sysctls.

A list of lists of sysctl headers is added, allowing each namespace to have
it's own list of sysctl headers.

Each list of sysctl headers has a lookup function to find the first
sysctl header in the list, allowing the lists to have a per namespace
instance.

register_sysct_root is added to tell sysctl.c about additional
lists of sysctl_headers.  As all of the users are expected to be in
kernel no unregister function is provided.

sysctl_head_next is updated to walk through the list of lists.

__register_sysctl_paths is added to add a new sysctl table on
a non-default sysctl list.

The only intrusive part of this patch is propagating the information
to decided which list of sysctls to use for sysctl_check_table.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:17 -08:00
Eric W. Biederman
23eb06de7d sysctl: Remember the ctl_table we passed to register_sysctl_paths
By doing this we allow users of register_sysctl_paths that build
and dynamically allocate their ctl_table to be simpler.  This allows
them to just remember the ctl_table_header returned from
register_sysctl_paths from which they can now find the
ctl_table array they need to free.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:17 -08:00
Eric W. Biederman
29e796fd4d sysctl: Add register_sysctl_paths function
There are a number of modules that register a sysctl table
somewhere deeply nested in the sysctl hierarchy, such as
fs/nfs, fs/xfs, dev/cdrom, etc.

They all specify several dummy ctl_tables for the path name.
This patch implements register_sysctl_path that takes
an additional path name, and makes up dummy sysctl nodes
for each component.

This patch was originally written by Olaf Kirch and
brought to my attention and reworked some by Olaf Hering.
I have changed a few additional things so the bugs are mine.

After converting all of the easy callers Olaf Hering observed
allyesconfig ARCH=i386, the patch reduces the final binary size by 9369 bytes.

.text +897
.data -7008

   text    data     bss     dec     hex filename
   26959310        4045899 4718592 35723801        2211a19 ../vmlinux-vanilla
   26960207        4038891 4718592 35717690        221023a ../O-allyesconfig/vmlinux

So this change is both a space savings and a code simplification.

CC: Olaf Kirch <okir@suse.de>
CC: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:16 -08:00
Patrick McHardy
be0ea7d5da [NETFILTER]: Convert old checksum helper names
Kill the defines again, convert to the new checksum helper names and
remove the dependency of NET_ACT_NAT on NETFILTER.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:15 -08:00
Patrick McHardy
a99a00cf1a [NET]: Move netfilter checksum helpers to net/core/utils.c
This allows to get rid of the CONFIG_NETFILTER dependency of NET_ACT_NAT.
This patch redefines the old names to keep the noise low, the next patch
converts all users.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:14 -08:00
Gerrit Renker
0c86962076 [DCCP]: Integrate state transitions for passive-close
This adds the necessary state transitions for the two forms of passive-close

 * PASSIVE_CLOSE    - which is entered when a host   receives a Close;
 * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq.

Here is a detailed account of what the patch does in each state.

1) Receiving CloseReq

  The pseudo-code in 8.5 says:

     Step 13: Process CloseReq
          If P.type == CloseReq and S.state < CLOSEREQ,
              Generate Close
              S.state := CLOSING
              Set CLOSING timer.

  This means we need to address what to do in CLOSED, LISTEN, REQUEST, RESPOND, PARTOPEN, and OPEN.

   * CLOSED:         silently ignore - it may be a late or duplicate CloseReq;
   * LISTEN/RESPOND: will not appear, since Step 7 is performed first (we know we are the client);
   * REQUEST:        perform Step 13 directly (no need to enqueue packet);
   * OPEN/PARTOPEN:  enter PASSIVE_CLOSEREQ so that the application has a chance to process unread data.

  When already in PASSIVE_CLOSEREQ, no second CloseReq is enqueued. In any other state, the CloseReq is ignored.
  I think that this offers some robustness against rare and pathological cases: e.g. a simultaneous close where
  the client sends a Close and the server a CloseReq. The client will then be retransmitting its Close until it
  gets the Reset, so ignoring the CloseReq while in state CLOSING is sane.

2) Receiving Close

  The code below from 8.5 is unconditional.

     Step 14: Process Close
          If P.type == Close,
              Generate Reset(Closed)
              Tear down connection
              Drop packet and return

  Thus we need to consider all states:
   * CLOSED:           silently ignore, since this can happen when a retransmitted or late Close arrives;
   * LISTEN:           dccp_rcv_state_process() will generate a Reset ("No Connection");
   * REQUEST:          perform Step 14 directly (no need to enqueue packet);
   * RESPOND:          dccp_check_req() will generate a Reset ("Packet Error") -- left it at that;
   * OPEN/PARTOPEN:    enter PASSIVE_CLOSE so that application has a chance to process unread data;
   * CLOSEREQ:         server performed active-close -- perform Step 14;
   * CLOSING:          simultaneous-close: use a tie-breaker to avoid message ping-pong (see comment);
   * PASSIVE_CLOSEREQ: ignore - the peer has a bug (sending first a CloseReq and now a Close);
   * TIMEWAIT:         packet is ignored.

   Note that the condition of receiving a packet in state CLOSED here is different from the condition "there
   is no socket for such a connection": the socket still exists, but its state indicates it is unusable.

   Last, dccp_finish_passive_close sets either DCCP_CLOSED or DCCP_CLOSING = TCP_CLOSING, so that
   sk_stream_wait_close() will wait for the final Reset (which will trigger CLOSING => CLOSED).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:13 -08:00
Gerrit Renker
f11135a344 [DCCP]: Dedicated auxiliary states to support passive-close
This adds two auxiliary states to deal with passive closes:
  * PASSIVE_CLOSE    (reached from OPEN via reception of Close)    and
  * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq)
as internal intermediate states.

These states are used to allow a receiver to process unread data before
acknowledging the received connection-termination-request (the Close/CloseReq).

Without such support, it will happen that passively-closed sockets enter CLOSED
state while there is still unprocessed data in the queue; leading to unexpected
and erratic API behaviour.

PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code will
seamlessly work with inet_accept() (which tests for this state).

The state names are thanks to Arnaldo, who suggested this naming scheme
following an earlier revision of this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:12 -08:00
Fred L. Templin
c7dc89c0ac [IPV6]: Add RFC4214 support
This patch includes support for the Intra-Site Automatic Tunnel
Addressing Protocol (ISATAP) per RFC4214. It uses the SIT
module, and is configured using extensions to the "iproute2"
utility. The diffs are specific to the Linux 2.6.24-rc2 kernel
distribution.

This version includes the diff for ./include/linux/if.h which was
missing in the v2.4 submission and is needed to make the
patch compile. The patch has been installed, compiled and
tested in a clean 2.6.24-rc2 kernel build area.

Signed-off-by: Fred L. Templin <fred.l.templin@boeing.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:09 -08:00
Pavel Emelyanov
8d8ad9d7c4 [NET]: Name magic constants in sock_wake_async()
The sock_wake_async() performs a bit different actions
depending on "how" argument. Unfortunately this argument
ony has numerical magic values.

I propose to give names to their constants to help people
reading this function callers understand what's going on
without looking into this function all the time.

I suppose this is 2.6.25 material, but if it's not (or the
naming seems poor/bad/awful), I can rework it against the
current net-2.6 tree.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:55:03 -08:00
Ilpo Järvinen
e7d0362dd4 [PCOUNTER] Fix build error without CONFIG_SMP
I keep getting this build error and couldn't find anyone fixing
it in archives. ...Maybe all net developers except me build
just SMP kernels :-).

In file included from include/net/sock.h:50,
                 from ipc/mqueue.c:35:
include/linux/pcounter.h: In function 'pcounter_add':
include/linux/pcounter.h:87: error: 'struct pcounter' has no
member named 'value'
make[1]: *** [ipc/mqueue.o] Error 1
make: *** [ipc] Error 2

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:50 -08:00
Gerrit Renker
9b91ad2747 [DCCP]: Make PARTOPEN an autonomous state
This decouples PARTOPEN from TCP-specific stream-states.

It thus addresses the FIXME.

The code has been checked with regard to dependency on PARTOPEN and FIN_WAIT1
states (to which PARTOPEN previously was mapped): there is no difference, as
PARTOPEN is always referred to directly (i.e. not via the mapping to TCP
state).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:44 -08:00
Arnaldo Carvalho de Melo
de4d1db369 [LIB]: Introduce struct pcounter
This just generalises what was introduced by Eric Dumazet for the struct proto
inuse field in 286ab3d460:

    [NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA way.

Please look at the comment in there to see the rationale.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:39 -08:00
Ron Rindjunsky
6b4e324164 mac80211: adding 802.11n definitions in ieee80211.h
This patch adds several structs and definitions to ieee80211.h
to support 802.11n draft specifications.
As 802.11n depends on and extends the 802.11e standard in several issues,
there are also several definitions that belong to 802.11e.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:38 -08:00
Denis V. Lunev
e372c41401 [NET]: Consolidate net namespace related proc files creation.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:28 -08:00
Denis V. Lunev
97c53cacf0 [NET]: Make rtnetlink infrastructure network namespace aware (v3)
After this patch none of the netlink callback support anything
except the initial network namespace but the rtnetlink infrastructure
now handles multiple network namespaces.

Changes from v2:
- IPv6 addrlabel processing

Changes from v1:
- no need for special rtnl_unlock handling
- fixed IPv6 ndisc

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:25 -08:00
Michael Wu
c237899d1f ieee80211: Add IEEE80211_MAX_FRAME_LEN to linux/ieee80211.h
This patch adds IEEE80211_MAX_FRAME_LEN which is useful for drivers trying
to determine how much to allocate for their RX buffers.

It also updates the comment on IEEE80211_MAX_DATA_LEN based on revisions
in 802.11e.

IEEE80211_MAX_FRAG_THRESHOLD and IEEE80211_MAX_RTS_THRESHOLD are also
revised due to the new maximum frame size.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:20 -08:00
Stephen Hemminger
c7b6ea24b4 [NETPOLL]: Don't need rx_flags.
The rx_flags variable is redundant. Turning rx on/off is done
via setting the rx_np pointer.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:18 -08:00
Stephen Hemminger
0953864160 [NETPOLL]: no need to store local_mac
The local_mac is managed by the network device, no need to keep a
spare copy and all the management problems that could cause.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:17 -08:00
Oliver Hartkopp
1f98eefae8 [CAN]: Add missing Kbuild entries
This patch adds the missing Kbuild entries and the missing Kbuild file
in include/linux/can for the CAN subsystem.

Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:13 -08:00
Oliver Hartkopp
4195e31780 [CAN]: Fix plain integer definitions in userspace header.
This patch fixes the use of plain integers instead of __u32 in a struct
that is visible from kernel space and user space.

Thanks to Sam Ravnborg for pointing out the wrong plain int usage.

Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:12 -08:00
Oliver Hartkopp
ffd980f976 [CAN]: Add broadcast manager (bcm) protocol
This patch adds the CAN broadcast manager (bcm) protocol.

Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:11 -08:00
Oliver Hartkopp
c18ce101f2 [CAN]: Add raw protocol
This patch adds the CAN raw protocol.

Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:10 -08:00
Oliver Hartkopp
0d66548a10 [CAN]: Add PF_CAN core module
This patch adds the CAN core functionality but no protocols or drivers.
No protocol implementations are included here.  They come as separate
patches.  Protocol numbers are already in include/linux/can.h.

Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:10 -08:00
Oliver Hartkopp
cd05acfe65 [CAN]: Allocate protocol numbers for PF_CAN
This patch adds a protocol/address family number, ARP hardware type,
ethernet packet type, and a line discipline number for the SocketCAN
implementation.

Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:09 -08:00
Ilpo Järvinen
68f8353b48 [TCP]: Rewrite SACK block processing & sack_recv_cache use
Key points of this patch are:

  - In case new SACK information is advance only type, no skb
    processing below previously discovered highest point is done
  - Optimize cases below highest point too since there's no need
    to always go up to highest point (which is very likely still
    present in that SACK), this is not entirely true though
    because I'm dropping the fastpath_skb_hint which could
    previously optimize those cases even better. Whether that's
    significant, I'm not too sure.

Currently it will provide skipping by walking. Combined with
RB-tree, all skipping would become fast too regardless of window
size (can be done incrementally later).

Previously a number of cases in TCP SACK processing fails to
take advantage of costly stored information in sack_recv_cache,
most importantly, expected events such as cumulative ACK and new
hole ACKs. Processing on such ACKs result in rather long walks
building up latencies (which easily gets nasty when window is
huge). Those latencies are often completely unnecessary
compared with the amount of _new_ information received, usually
for cumulative ACK there's no new information at all, yet TCP
walks whole queue unnecessary potentially taking a number of
costly cache misses on the way, etc.!

Since the inclusion of highest_sack, there's a lot information
that is very likely redundant (SACK fastpath hint stuff,
fackets_out, highest_sack), though there's no ultimate guarantee
that they'll remain the same whole the time (in all unearthly
scenarios). Take advantage of this knowledge here and drop
fastpath hint and use direct access to highest SACKed skb as
a replacement.

Effectively "special cased" fastpath is dropped. This change
adds some complexity to introduce better coveraged "fastpath",
though the added complexity should make TCP behave more cache
friendly.

The current ACK's SACK blocks are compared against each cached
block individially and only ranges that are new are then scanned
by the high constant walk. For other parts of write queue, even
when in previously known part of the SACK blocks, a faster skip
function is used (if necessary at all). In addition, whenever
possible, TCP fast-forwards to highest_sack skb that was made
available by an earlier patch. In typical case, no other things
but this fast-forward and mandatory markings after that occur
making the access pattern quite similar to the former fastpath
"special case".

DSACKs are special case that must always be walked.

The local to recv_sack_cache copying could be more intelligent
w.r.t DSACKs which are likely to be there only once but that
is left to a separate patch.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:07 -08:00
Ilpo Järvinen
fd6dad616d [TCP]: Earlier SACK block verification & simplify access to them
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:07 -08:00
Ilpo Järvinen
a47e5a988a [TCP]: Convert highest_sack to sk_buff to allow direct access
It is going to replace the sack fastpath hint quite soon... :-)

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:54:03 -08:00
YOSHIFUJI Hideaki
2a8cc6c890 [IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table.
Policy table is implemented as an RCU linear list since we do not expect
large list nor frequent updates.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:58 -08:00
Patrick McHardy
6e23ae2a48 [NETFILTER]: Introduce NF_INET_ hook values
The IPv4 and IPv6 hook values are identical, yet some code tries to figure
out the "correct" value by looking at the address family. Introduce NF_INET_*
values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
section for userspace compatibility.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:55 -08:00
Jens Axboe
9c55e01c0c [TCP]: Splice receive support.
Support for network splice receive.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:31 -08:00
Jens Axboe
bbdfc2f706 [SPLICE]: Don't assume regular pages in splice_to_pipe()
Allow caller to pass in a release function, there might be
other resources that need releasing as well. Needed for
network receive.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 14:53:30 -08:00
Linus Torvalds
f4798748de Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (24 commits)
  HID: ADS/Tech Radio si470x needs blacklist entry
  HID: Logitech Extreme 3D needs NOGET quirk
  HID: Refactor MS Presenter 8K key mapping
  HID: MS Presenter mapping for PID 0x0701
  HID: Support Samsung IR remote
  HID: fix compilation of hidbp drivers without usbhid
  HID: Blacklist the Gretag-Macbeth Huey display colorimeter
  HID: the `bit' in hidinput_mapping_quirks() is an out parameter
  HID: remove redundant WARN_ON()s in order not to scare users
  HID: force hiddev creation for SONY PS3 controller
  HID: Use hid blacklist in usbmouse/usbkbd
  HID: proper handling of MS 4k and 6k devices
  HID: remove unused variable in quirk event handler
  HID: hid-input quirk for BTC 8193
  HID: separate hid-input event quirks from generic code
  HID: refactor mapping to input subsystem for quirky devices
  HID: Microsoft Wireless Optical Desktop 3.0 quirk
  HID: Add support for Logitech Elite keyboards
  HID: add full support for Genius KB-29E
  HID: fix a potential bug in pointer casting
  ...
2008-01-29 08:52:20 +11:00
Linus Torvalds
8d01eddf29 Merge branch 'for-2.6.25' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.25' of git://git.kernel.dk/linux-2.6-block:
  block: implement drain buffers
  __bio_clone: don't calculate hw/phys segment counts
  block: allow queue dma_alignment of zero
  blktrace: Add blktrace ioctls to SCSI generic devices
2008-01-29 08:51:56 +11:00
Linus Torvalds
f0f0052069 Merge branch 'blk-end-request' of git://git.kernel.dk/linux-2.6-block
* 'blk-end-request' of git://git.kernel.dk/linux-2.6-block: (30 commits)
  blk_end_request: changing xsysace (take 4)
  blk_end_request: changing ub (take 4)
  blk_end_request: cleanup of request completion (take 4)
  blk_end_request: cleanup 'uptodate' related code (take 4)
  blk_end_request: remove/unexport end_that_request_* (take 4)
  blk_end_request: changing scsi (take 4)
  blk_end_request: add bidi completion interface (take 4)
  blk_end_request: changing ide-cd (take 4)
  blk_end_request: add callback feature (take 4)
  blk_end_request: changing ide normal caller (take 4)
  blk_end_request: changing cpqarray (take 4)
  blk_end_request: changing cciss (take 4)
  blk_end_request: changing ide-scsi (take 4)
  blk_end_request: changing s390 (take 4)
  blk_end_request: changing mmc (take 4)
  blk_end_request: changing i2o_block (take 4)
  blk_end_request: changing viocd (take 4)
  blk_end_request: changing xen-blkfront (take 4)
  blk_end_request: changing viodasd (take 4)
  blk_end_request: changing sx8 (take 4)
  ...
2008-01-29 08:51:32 +11:00
Linus Torvalds
68fbda7de0 Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block
* 'sg' of git://git.kernel.dk/linux-2.6-block:
  SG: work with the SCSI fixed maximum allocations.
  SG: Convert SCSI to use scatterlist helpers for sg chaining
  SG: Move functions to lib/scatterlist.c and add sg chaining allocator helpers
2008-01-29 08:51:05 +11:00
Linus Torvalds
d4928196fe Merge branch 'cfq-ioc-share' of git://git.kernel.dk/linux-2.6-block
* 'cfq-ioc-share' of git://git.kernel.dk/linux-2.6-block:
  cfq-iosched: kill some big inlines
  cfq-iosched: relax IOPRIO_CLASS_IDLE restrictions
  kernel: add CLONE_IO to specifically request sharing of IO contexts
  io_context sharing - anticipatory changes
  block: cfq: make the io contect sharing lockless
  io_context sharing - cfq changes
  io context sharing: preliminary support
  ioprio: move io priority from task_struct to io_context
2008-01-29 08:50:42 +11:00
Robert Schedel
fe56caa97e HID: Support Samsung IR remote
Samsung USB remotes (0419:0001) are rejected by kernel 2.6.23, because the
report descriptor from the remote contains a 48 bit HID report field. HID 1.11
states: Fields may span at most 4 bytes.

This patch, based on 2.6.23, fixes this by modifying the internal report
descriptor in hid-quirks.c. Additional user space support (e.g. LIRC) is
required to fetch the information from the hiddev interface.

The burden to reconstruct the data is moved into userspace (lirc through hiddev).
There is no need to set HID_QUIRK_HIDDEV quirk, as the device has also output
applications, which trigger the creation of hiddev device automatically.

Signed-off-by: Robert Schedel <r.schedel@yahoo.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:22 +01:00
Fengguang Wu
70d215c4a7 HID: the `bit' in hidinput_mapping_quirks() is an out parameter
Fix a panic, by changing
	hidinput_mapping_quirks(,, unsigned long *bit,)
to
	hidinput_mapping_quirks(,, unsigned long **bit,)

The `bit' in this function is an out parameter.

Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:22 +01:00
Jiri Kosina
628edcde87 HID: proper handling of MS 4k and 6k devices
This removes ugly macros IS_* to distinguish devices that
need special handling in hid-input, and establish proper
quirks for them.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:21 +01:00
Jiri Kosina
36ccaad640 HID: hid-input quirk for BTC 8193
BTC 8193 keyboard handles its scrollwheel in very non-standard way.
It produces two non-standard usages for scrolling up and down, in
both cases with postive value equaling to 1. We handle this by temporary
mapping, which we then catch in quirk event handler, and remap to
negative HWHEEL even in order to introduce correct behavior.

Also the button requires special mapping, as it triggers standard-violating
usage code.

Reported in kernel.org bugzilla #9385

Reported-by: Kir Kolyshkin <kir@sacred.ru>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:21 +01:00
Jiri Kosina
87bc2aa993 HID: separate hid-input event quirks from generic code
This patch separates also the hid-input quirks that have to be
applied at the time the event occurs, so that the generic code
handling HUT-compliant devices is not messed up by them too much.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:20 +01:00
Jiri Kosina
10bd065fac HID: refactor mapping to input subsystem for quirky devices
Currently, the handling of mapping between hid and input for devices
that don't conform to HUT 1.12 specification is very messy -- no per-device
handling, no blacklists, conditions on idVendor and idProduct placed
all over the code.

This patch moves all the device-specific input mapping to a separate
file, and introduces a blacklist-style handling for non-standard
device-specific mappings.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:20 +01:00
Jiri Kosina
af9e0eacdc HID: add full support for Genius KB-29E
Genius KB-29E has broken report descriptor, which causes some of the
Consumer usages to appear incorrectly as Button usages. We fix it by
fixing the report descriptor before it is being parsed.

Also a few of the keys violate the HUT standard, so they need a special
handling. They currently fall into "Reserved" range as per HUT 1.12.

Reported-by: Szekeres Istvan <szekeres@iii.hu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:20 +01:00
Pavel Troller
c80e5ffac0 HID: Implement horizontal wheel handling for A4 Tech X5-005D
This mouse distinguishes horizontal wheel from vertical by a special "pseudo
event" GenericDesktop.00b8, with values of 0 for vertical and 8 for horizontal
wheel. Because this event is supplied by the parser too late, we need to delay
a wheel event, wait for this one and send either REL_WHEEL or REL_HWHEEL to
input depending on the event value.

Signed-off-by: Pavel Troller <patrol@sinus.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:19 +01:00
Michel Daenzer
81e1a87550 HID: Rename some code identifiers from PowerBook specific to Apple generic
Preserve identifiers exposed in build and run time configuration though in
order not to break existing configurations.

This is in preparation for adding support for Apple aluminum USB keyboards.

Signed-off-by: Michel Daenzer <michel@tungstengraphics.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-01-28 14:51:19 +01:00
Russell King
c00d4ffdba Merge branch 'orion' into devel
* orion: (26 commits)
  [ARM] Orion: implement power-off method for QNAP TS-109/209
  [ARM] Orion: add support for QNAP TS-109/TS-209
  [ARM] Orion: I2C support
  [I2C] i2c-mv64xxx: Don't set i2c_adapter.retries
  [I2C] Split mv643xx I2C platform support
  [ARM] Orion: enable CONFIG_RTC_DRV_M41T80 for D-Link DNS-323
  [ARM] Orion defconfig
  [ARM] Orion: add support for Orion/MV88F5181 based D-Link DNS-323
  [ARM] Orion: MV88F5181 support bits
  [ARM] Orion: Buffalo/Revogear Kurobox Pro support
  [ARM] OrionNAS RD board support
  [ARM] Orion: support for Marvell Orion-2 (88F5281) Development Board
  [ARM] Orion: common platform setup for Gigabit Ethernet port
  [ARM] Orion: platform device registration for UART, USB and NAND
  [ARM] Orion: system timer support
  [ARM] Orion edge GPIO IRQ support
  [ARM] Orion: IRQ support
  [ARM] Orion: provide GPIO method for enabling hardware assisted blinking
  [ARM] Orion: GPIO support
  [ARM] Orion: programable address map support
  ...

Conflicts:

	arch/arm/Kconfig
	arch/arm/Makefile

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-01-28 13:21:30 +00:00
James Bottomley
7cedb1f17f SG: work with the SCSI fixed maximum allocations.
SCSI sg table allocation has a maximum size (of SCSI_MAX_SG_SEGMENTS,
currently 128) and this will cause a BUG_ON() in SCSI if something
tries an allocation over it.  This patch adds a size limit to the
chaining allocator to allow the specification of the maximum
allocation size for chaining, so we always chain in units of the
maximum SCSI allocation size.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:54:49 +01:00
James Bottomley
fa0ccd837e block: implement drain buffers
These DMA drain buffer implementations in drivers are pretty horrible
to do in terms of manipulating the scatterlist.  Plus they're being
done at least in drivers/ide and drivers/ata, so we now have code
duplication.

The one use case for this, as I understand it is AHCI controllers doing
PIO mode to mmc devices but translating this to DMA at the controller
level.

So, what about adding a callback to the block layer that permits the
adding of the drain buffer for the problem devices.  The idea is that
you'd do this in slave_configure after you find one of these devices.

The beauty of doing it in the block layer is that it quietly adds the
drain buffer to the end of the sg list, so it automatically gets mapped
(and unmapped) without anything unusual having to be done to the
scatterlist in driver/scsi or drivers/ata and without any alteration to
the transfer length.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:54:11 +01:00
Jens Axboe
fadad878cc kernel: add CLONE_IO to specifically request sharing of IO contexts
syslets (or other threads/processes that want io context sharing) can
set this to enforce sharing of io context.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:50:36 +01:00
Jens Axboe
4ac845a2e9 block: cfq: make the io contect sharing lockless
The io context sharing introduced a per-ioc spinlock, that would protect
the cfq io context lookup. That is a regression from the original, since
we never needed any locking there because the ioc/cic were process private.

The cic lookup is changed from an rbtree construct to a radix tree, which
we can then use RCU to make the reader side lockless. That is the performance
critical path, modifying the radix tree is only done on process creation
(when that process first does IO, actually) and on process exit (if that
process has done IO).

As it so happens, radix trees are also much faster for this type of
lookup where the key is a pointer. It's a very sparse tree.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:50:33 +01:00
Jens Axboe
d38ecf935f io context sharing: preliminary support
Detach task state from ioc, instead keep track of how many processes
are accessing the ioc.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:50:31 +01:00
Jens Axboe
fd0928df98 ioprio: move io priority from task_struct to io_context
This is where it belongs and then it doesn't take up space for a
process that doesn't do IO.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:50:29 +01:00
Kiyoshi Ueda
5450d3e1d6 blk_end_request: cleanup 'uptodate' related code (take 4)
This patch converts 'uptodate' arguments of no longer exported
interfaces, end_that_request_first/last, to 'error', and removes
internal conversions for it in blk_end_request interfaces.

Also, this patch removes no longer needed end_io_error().

Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:37:13 +01:00
Kiyoshi Ueda
3bcddeac1c blk_end_request: remove/unexport end_that_request_* (take 4)
This patch removes the following functions:
  o end_that_request_first()
  o end_that_request_chunk()
and stops exporting the functions below:
  o end_that_request_last()

Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:37:12 +01:00
Kiyoshi Ueda
e3a04fe34a blk_end_request: add bidi completion interface (take 4)
This patch adds a variant of the interface, blk_end_bidi_request(),
which completes a bidi request.

Bidi request must be completed as a whole, both rq and rq->next_rq
at once.  So the interface has 2 arguments for completion size.

As for ->end_io, only rq->end_io is called (rq->next_rq->end_io is not
called).  So if special completion handling is needed, the handler
must be set to rq->end_io.
And the handler must take care of freeing next_rq too, since
the interface doesn't care of it if rq->end_io is not NULL.

Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:37:08 +01:00
Kiyoshi Ueda
e19a3ab058 blk_end_request: add callback feature (take 4)
This patch adds a variant of the interface, blk_end_request_callback(),
which has driver callback feature.

Drivers may need to do special works between end_that_request_first()
and end_that_request_last().
For such drivers, blk_end_request_callback() allows it to pass
a callback function which is called between end_that_request_first()
and end_that_request_last().

This interface is only for fallback of other blk_end_request interfaces.
Drivers should avoid their tricky behaviors and use other interfaces
as much as possible.

Currently, only one driver, ide-cd, needs this interface.
So this interface should/will be removed, after the driver removes
such tricky behaviors.

o ide-cd (cdrom_newpc_intr())
  In PIO mode, cdrom_newpc_intr() needs to defer end_that_request_last()
  until the device clears DRQ_STAT and raises an interrupt after
  end_that_request_first().
  So end_that_request_first() and end_that_request_last() are called
  separately in cdrom_newpc_intr().

  This means blk_end_request_callback() has to return without
  completing request even if no leftover in the request.
  To satisfy the requirement, callback function has return value
  so that drivers can tell blk_end_request_callback() to return
  without completing request.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:37:04 +01:00
Kiyoshi Ueda
3b11313a6c blk_end_request: add/export functions to get request size (take 4)
This patch adds/exports functions to get the size of request in bytes.
They are useful because blk_end_request interfaces take bytes
as a completed I/O size instead of sectors.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:35:56 +01:00
Kiyoshi Ueda
336cdb4003 blk_end_request: add new request completion interface (take 4)
This patch adds 2 new interfaces for request completion:
  o blk_end_request()   : called without queue lock
  o __blk_end_request() : called with queue lock held

blk_end_request takes 'error' as an argument instead of 'uptodate',
which current end_that_request_* take.
The meanings of values are below and the value is used when bio is
completed.
    0 : success
  < 0 : error

Some device drivers call some generic functions below between
end_that_request_{first/chunk} and end_that_request_last().
  o add_disk_randomness()
  o blk_queue_end_tag()
  o blkdev_dequeue_request()
These are called in the blk_end_request interfaces as a part of
generic request completion.
So all device drivers become to call above functions.
To decide whether to call blkdev_dequeue_request(), blk_end_request
uses list_empty(&rq->queuelist) (blk_queued_rq() macro is added for it).
So drivers must re-initialize it using list_init() or so before calling
blk_end_request if drivers use it for its specific purpose.
(Currently, there is no driver which completes request without
 re-initializing the queuelist after used it.  So rq->queuelist
 can be used for the purpose above.)

"Normal" drivers can be converted to use blk_end_request()
in a standard way shown below.

 a) end_that_request_{chunk/first}
    spin_lock_irqsave()
    (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
    end_that_request_last()
    spin_unlock_irqrestore()
    => blk_end_request()

 b) spin_lock_irqsave()
    end_that_request_{chunk/first}
    (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
    end_that_request_last()
    spin_unlock_irqrestore()
    => spin_lock_irqsave()
       __blk_end_request()
       spin_unlock_irqsave()

 c) spin_lock_irqsave()
    (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request())
    end_that_request_last()
    spin_unlock_irqrestore()
    => blk_end_request()   or   spin_lock_irqsave()
                                __blk_end_request()
                                spin_unlock_irqrestore()

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:35:53 +01:00
Jens Axboe
0db9299f48 SG: Move functions to lib/scatterlist.c and add sg chaining allocator helpers
Manually doing chained sg lists is not trivial, so add some helpers
to make sure that drivers get it right.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:05:27 +01:00
Pete Wyckoff
482eb68916 block: allow queue dma_alignment of zero
Let queue_dma_alignment return 0 if it was specifically set to 0.
This permits devices with no particular alignment restrictions to
use arbitrary user space buffers without copying.

Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:04:46 +01:00
Christof Schmitt
6da127ad09 blktrace: Add blktrace ioctls to SCSI generic devices
Since the SCSI layer uses the request queues from the block layer, blktrace can
also be used to trace the requests to all SCSI devices (like SCSI tape drives),
not only disks. The only missing part is the ioctl interface to start and stop
tracing.

This patch adds the SETUP, START, STOP and TEARDOWN ioctls from blktrace to the
sg device files. With this change, blktrace can be used for SCSI devices like
for disks, e.g.: blktrace -d /dev/sg1 -o - | blkparse -i -

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-01-28 10:04:46 +01:00
David Brownell
e9f1373b64 i2c: Add i2c_new_dummy() utility
This adds a i2c_new_dummy() primitive to help work with devices
that consume multiple addresses, which include many I2C eeproms
and at least one RTC.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-01-27 18:14:52 +01:00
David Brownell
9b766b814d i2c: Stop using the redundant client list
The i2c_adapter.clients list of i2c_client nodes duplicates driver
model state.  This patch starts removing that list, letting us remove
most existing users of those i2c-core lists.

 * The core I2C code now iterates over the driver model's list instead
   of the i2c-internal one in some places where it's safe:
      - Passing a command/ioctl to each client, a mechanims
        used almost exclusively by DVB adapters;
      - Device address checking, in both i2c-core and i2c-dev.

 * Provide i2c_verify_client() to use with driver model iterators.

 * Flag the relevant i2c_adapter and i2c_client fields as deprecated,
   to help prevent new users from appearing.

For the moment the list needs to stick around, since some issues show
up when deleting devices created by legacy I2C drivers.  (They don't
follow standard driver model rules.  Removing those devices can cause
self-deadlocks.)

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-01-27 18:14:51 +01:00
Jean Delvare
7bca0871ca i2c: Discard unused driver IDs
Discard all I2C driver IDs that aren't used anywhere. That's not just a
couple of them, but more like 49 or one quarter of all defined IDs! And
this is just a first pass, next will come all IDs that are set but
never used, or used but never set.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-01-27 18:14:50 +01:00
David Brownell
6d16bfb5e8 i2c/tps65010: move header to <linux/i2c/...>
Move the tps65010 header file from the OMAP arch directory to the
more generic <linux/i2c/...> directory, and remove the spurious
dependency of this driver on OMAP.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-01-27 18:14:49 +01:00
Jean Delvare
026526f5af i2c: Drop redundant i2c_driver.list
i2c_driver.list is superfluous, this list duplicates the one
maintained by the driver core. Drop it.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
2008-01-27 18:14:49 +01:00
Jean Delvare
87c6c22945 i2c: Drop redundant i2c_adapter.list
i2c_adapter.list is superfluous, this list duplicates the one
maintained by the driver core. Drop it.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
2008-01-27 18:14:48 +01:00
Jean Delvare
e48d33193d i2c: Change prototypes of refcounting functions
Use more standard prototypes for i2c_use_client() and
i2c_release_client(). The former now returns a pointer to the client,
and the latter no longer returns anything. This matches what all other
subsystems do.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: David Brownell <david-b@pacbell.net>
2008-01-27 18:14:48 +01:00
Jean Delvare
bdc511f438 i2c: Use the driver model reference counting
Don't implement our own reference counting mechanism for i2c clients
when the driver model already has one.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: David Brownell <david-b@pacbell.net>
2008-01-27 18:14:48 +01:00
Mark M. Hoffman
bfb6df24fa i2c: Constify client address data
This patch allows much of the I2C client address data to move from initdata
into text.
    
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-01-27 18:14:46 +01:00
Adrian Bunk
7e8b99251b i2c: some overdue driver removal
This patch contains the overdue removal of three I2C drivers.

[JD: In fact only i2c-ixp4xx can be removed at the moment, the other two
platforms don't implement the generic GPIO layer yet.]

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-01-27 18:14:46 +01:00
Adrian Bunk
eee87d3196 i2c: the scheduled I2C RTC driver removal
This patch contains the scheduled removal of legacy I2C RTC drivers with 
replacement drivers.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-01-27 18:14:45 +01:00
Bartlomiej Zolnierkiewicz
7267c33774 ide: remove REQ_TYPE_ATA_CMD
Based on the earlier work by Tejun Heo.

All users are gone so we can finally remove it.

Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:13 +01:00
Bartlomiej Zolnierkiewicz
34f5d5ae35 ide: switch set_xfer_rate() to use REQ_TYPE_ATA_TASKFILE requests
Based on the earlier work by Tejun Heo.

Switch set_xfer_rate() to use REQ_TYPE_ATA_TASKFILE requests
and make ide_wait_cmd() static.

There should be no functionality changes caused by this patch.

Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:12 +01:00
Bartlomiej Zolnierkiewicz
2624565caa ide: use wait_drive_not_busy() in drive_cmd_intr() (take 2)
Use wait_drive_not_busy() in drive_cmd_intr().

v2:
* Fix wait_drive_not_busy() comment (noticed by Sergei).

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:11 +01:00
Bartlomiej Zolnierkiewicz
4906f3b4cd ide: kill DATA_READY define
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:11 +01:00
Tejun Heo
4d7a984bdc ide: task_end_request() fix
task_end_request() modified to always call ide_end_drive_cmd()
for taskfile requests.  Previously, ide_end_drive_cmd() was
called only when IDE_TFLAG_FLAGGED was set.  Also,
ide_dma_intr() is modified to use task_end_request().

Enables TASKFILE ioctls to get valid register outputs on
successful completion.

Bart:
- ported it over recent IDE changes

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:11 +01:00
Bartlomiej Zolnierkiewicz
657cc1a8f6 ide: set IDE_TFLAG_IN_* flags before queuing/executing command
* Add IDE_TFLAG_{HOB,TF,DEVICE} defines.

* Set IDE_TFLAG_IN_* flags in {do_rw,ide_no_data,ide_raw}_taskfile() users.

* Remove no longer needed ->tf_flags setup from ide_end_drive_cmd().

There should be no functionality changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:10 +01:00
Tejun Heo
35cf2b94d0 ide: fix ->io_32bit race in ide_taskfile_ioctl()
In ide_taskfile_ioctl(), there was a race condition involving
drive->io_32bit.  It was cleared and restored during ioctl
requests but there was no synchronization with other requests.
So, other requests could execute with the altered ->io_32bit
setting or updated drive->io_32bit could be overwritten by
ide_taskfile_ioctl().

This patch adds IDE_TFLAG_IO_16BIT flag to indicate to
ide_pio_datablock() that 16-bit I/O is needed regardless of
drive->io_32bit settting.

Bart:
- ported it over recent IDE changes

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:10 +01:00
Bartlomiej Zolnierkiewicz
9e47be0c97 ide: remove broken disk byte-swapping support
Remove broken disk byte-swapping support:
- it can cause a data corruption on SMP (or if using PREEMPT on UP)
- all data coming from disk are byte-swapped by taskfile_*_data() which
  results in incorrect identify data being reported by /proc/ide/ and IOCTLs
- "hdx=bswap/byteswap" kernel parameter has been broken on m68k host drivers
  (including Atari/Q40 ones) since 2.5.x days (because of 'hwif' zero-ing)
- byte-swapping is limited to PIO transfers (for working with TiVo disks on
  x86 machines using user-space solutions or dm-byteswap should result in
  much better performance because DMA can be used)

For previous discussions please see:

http://www.ussg.iu.edu/hypermail/linux/kernel/0201.0/0768.html
http://lkml.org/lkml/2004/2/28/111

[ I have dm-byteswap device mapper target if somebody is interested
  (patch is for 2.6.4 though but I'll dust it off if needed). ]

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:09 +01:00
Bartlomiej Zolnierkiewicz
81ca691981 ide: add ide_set_irq() inline helper
There should be no functionality changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:08 +01:00
Bartlomiej Zolnierkiewicz
ade2daf9c6 ide: make remaining built-in only IDE host drivers modular (take 2)
* Make remaining built-in only IDE host drivers modular, add ide-scan-pci.c
  file for probing PCI host drivers registered with IDE core (special case
  for built-in IDE and CONFIG_IDEPCI_PCIBUS_ORDER=y) and then take care of
  the ordering in which all IDE host drivers are probed when IDE is built-in
  during link time.

* Move probing of gayle, falconide, macide, q40ide and buddha (m68k arch
  specific) host drivers, before PCI ones (no PCI on m68k), ide-cris (cris
  arch specific), cmd640 (x86 arch specific) and pmac (ppc arch specific).

* Move probing of ide-cris (cris arch specific) host driver before cmd640
  (x86 arch specific).

* Move probing of mpc8xx (ppc specific) host driver before ide-pnp (depends
  on ISA and none of ppc platform that use mpc8xx supports ISA) and ide-h8300
  (h8300 arch specific).

* Add "probe_vlb" kernel parameter to cmd640 host driver and update
  Documentation/ide.txt accordingly.

* Make IDE_ARM config option visible so it can also be disabled if needed.

* Remove bogus comment from ide.c while at it.

v2:
* Fix two issues spotted by Sergei:
  - replace ENOMEM error value by ENOENT in ide-h8300 host driver
  - fix MODULE_PARM_DESC() in cmd640 host driver

Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:07 +01:00
Bartlomiej Zolnierkiewicz
cbb010c180 ide: drop 'initializing' argument from ide_register_hw()
* Rename init_hwif_data() to ide_init_port_data() and export it.

* For all users of ide_register_hw() with 'initializing' argument set
  hwif->present and hwif->hold are always zero so convert these host
  drivers to use ide_find_port()+ide_init_port_data()+ide_init_port_hw()
  instead (also no need for init_hwif_default() call since the setup
  done by it gets over-ridden by ide_init_port_hw() call).

* Drop 'initializing' argument from ide_register_hw().

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:06 +01:00
Bartlomiej Zolnierkiewicz
57c802e84f ide: add ide_init_port_hw() helper
* Add ide_init_port_hw() helper.

* rapide.c: convert rapide_locate_hwif() to rapide_setup_ports()
  and use ide_init_port_hw().

* ide_platform.c: convert plat_ide_locate_hwif() to plat_ide_setup_ports()
  and use ide_init_port_hw().

* sgiioc4.c: use ide_init_port_hw().

* pmac.c: add 'hw_regs_t *hw' argument to pmac_ide_setup_device(),
  setup 'hw' in pmac_ide_{macio,pci}_attach() and use ide_init_port_hw()
  in pmac_ide_setup_device().

This patch is a preparation for the future changes in the IDE probing code.

There should be no functionality changes caused by this patch.

Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Jeremy Higdon <jeremy@sgi.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:05 +01:00
Olof Johansson
b0d5bc27ce ide: Fix build break caused by "ide: remove ideprobe_init()"
Fix build break of powerpc holly_defconfig:

In file included from arch/powerpc/platforms/embedded6xx/holly.c:24:
include/linux/ide.h:1206: error: 'CONFIG_IDE_MAX_HWIFS' undeclared here (not in a function)

There's no need to have a sized array in the prototype, might as well
turn it into a pointer.

It could probably be argued that large parts of the include file can be
covered under #ifdef CONFIG_IDE, but that's a larger undertaking.

Signed-off-by: Olof Johansson <olof@lixom.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:05 +01:00
Bartlomiej Zolnierkiewicz
151575e464 ide: remove ideprobe_init()
* Rename ide_device_add() to ide_device_add_all() and make it accept
  'u8 idx[MAX_HWIFS]' instead of 'u8 idx[4]' as an argument.

* Add ide_device_add() wrapper for ide_device_add_all().

* Convert ide_generic_init() to use ide_device_add_all().

* Remove no longer needed ideprobe_init().

There should be no functionality changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-01-26 20:13:05 +01:00