Commit graph

612 commits

Author SHA1 Message Date
Eric Dumazet
580da35a31 IB: Fix RCU lockdep splats
Commit f2c31e32b3 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France <tsi@ualberta.ca>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-29 13:37:11 -08:00
Mike Marciniszyn
3874397c0b IB/ipoib: Prevent hung task or softlockup processing multicast response
This following can occur with ipoib when processing a multicast reponse:

    BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982]
    Modules linked in: ...
    CPU 0:
    Modules linked in: ...
    Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5
    RIP: 0010:[<ffffffff814ddb27>]  [<ffffffff814ddb27>] _spin_unlock_irqrestore+0x17/0x20
    RSP: 0018:ffff8802119ed860  EFLAGS: 00000246
    0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299
    RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000
    R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860
    R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003
    FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Call Trace:
    [<ffffffffa032c247>] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib]
    [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
    [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
    [<ffffffffa03283d4>] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib]
    [<ffffffffa03286fc>] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib]
    [<ffffffff8141e758>] ? dev_hard_start_xmit+0x2c8/0x3f0
    [<ffffffff81439d0a>] ? sch_direct_xmit+0x15a/0x1c0
    [<ffffffff81423098>] ? dev_queue_xmit+0x388/0x4d0
    [<ffffffffa032d6b7>] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib]
    [<ffffffffa032dab8>] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib]
    [<ffffffffa02a0946>] ? mcast_work_handler+0x1a6/0x710 [ib_sa]
    [<ffffffffa015f01e>] ? ib_send_mad+0xfe/0x3c0 [ib_mad]
    [<ffffffffa00f6c93>] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core]
    [<ffffffffa02a0f9b>] ? join_handler+0xeb/0x200 [ib_sa]
    [<ffffffffa029e4fc>] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa]
    [<ffffffffa029e79c>] ? recv_handler+0x3c/0x70 [ib_sa]
    [<ffffffffa01603a4>] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad]
    [<ffffffffa015fb60>] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad]
    [<ffffffff81088830>] ? worker_thread+0x170/0x2a0
    [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
    [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0
    [<ffffffff8108ddf6>] ? kthread+0x96/0xa0
    [<ffffffff8100c1ca>] ? child_rip+0xa/0x20

Coinciding with stack trace is the following message:

    ib0: ib_address_create failed

The code below in ipoib_mcast_join_finish() will note the above
failure in the address handle but otherwise continue:

                ah = ipoib_create_ah(dev, priv->pd, &av);
                if (!ah) {
                        ipoib_warn(priv, "ib_address_create failed\n");
                } else {

The while loop at the bottom of ipoib_mcast_join_finish() will attempt
to send queued multicast packets in mcast->pkt_queue and eventually
end up in ipoib_mcast_send():

        if (!mcast->ah) {
                if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
                        skb_queue_tail(&mcast->pkt_queue, skb);
                else {
                        ++dev->stats.tx_dropped;
                        dev_kfree_skb_any(skb);
                }

My read is that the code will requeue the packet and return to the
ipoib_mcast_join_finish() while loop and the stage is set for the
"hung" task diagnostic as the while loop never sees a non-NULL ah, and
will do nothing to resolve.

There are GFP_ATOMIC allocates in the provider routines, so this is
possible and should be dealt with.

The test that induced the failure is associated with a host SM on the
same server during a shutdown.

This patch causes ipoib_mcast_join_finish() to exit with an error
which will flush the queued mcast packets.  Nothing is done to unwind
the QP attached state so that subsequent sends from above will retry
the join.

Reviewed-by: Ram Vepa <ram.vepa@qlogic.com>
Reviewed-by: Gary Leshner <gary.leshner@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-29 13:20:02 -08:00
David S. Miller
9ca36f7db2 infiniband: Update net drivers for netdev_features_t changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-16 18:05:50 -05:00
Linus Torvalds
32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Or Gerlitz
52439540ea IB/iser: DMA unmap TX bufs used for iSCSI/iSER headers
The current driver never does DMA unmapping on these buffers.  Fix that
by adding DMA unmapping to the task cleanup callback, and DMA mapping to
the task init function (drop the headers_initialized micro-optimization).

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-04 09:32:44 -07:00
Or Gerlitz
2c4ce60934 IB/iser: Use separate buffers for the login request/response
The driver counted on the transactional nature of iSCSI login/text
flows and used the same buffer for both the request and the response.
We also went further and did DMA mapping only once, with
DMA_FROM_DEVICE, which violates the DMA mapping API.  Fix that by
using different buffers, one for requests and one for responses, and
use the correct DMA mapping direction for each.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-11-04 09:30:52 -07:00
Linus Torvalds
f470f8d4e7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (62 commits)
  mlx4_core: Deprecate log_num_vlan module param
  IB/mlx4: Don't set VLAN in IBoE WQEs' control segment
  IB/mlx4: Enable 4K mtu for IBoE
  RDMA/cxgb4: Mark QP in error before disabling the queue in firmware
  RDMA/cxgb4: Serialize calls to CQ's comp_handler
  RDMA/cxgb3: Serialize calls to CQ's comp_handler
  IB/qib: Fix issue with link states and QSFP cables
  IB/mlx4: Configure extended active speeds
  mlx4_core: Add extended port capabilities support
  IB/qib: Hold links until tuning data is available
  IB/qib: Clean up checkpatch issue
  IB/qib: Remove s_lock around header validation
  IB/qib: Precompute timeout jiffies to optimize latency
  IB/qib: Use RCU for qpn lookup
  IB/qib: Eliminate divide/mod in converting idx to egr buf pointer
  IB/qib: Decode path MTU optimization
  IB/qib: Optimize RC/UC code by IB operation
  IPoIB: Use the right function to do DMA unmap pages
  RDMA/cxgb4: Use correct QID in insert_recv_cqe()
  RDMA/cxgb4: Make sure flush CQ entries are collected on connection close
  ...
2011-11-01 10:51:38 -07:00
Roland Dreier
504255f8d0 Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'fdr', 'ipath', 'ipoib', 'misc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next 2011-11-01 09:37:08 -07:00
Paul Gortmaker
fec14d2fce infiniband: add moduleparam.h to drivers/infiniband as required
These files were getting the moduleparam infrastructure from the
implicit presence of module.h being everywhere, but that is going
away soon.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:36 -04:00
Paul Gortmaker
b108d9764c infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE
These were getting it implicitly via device.h --> module.h but
we are going to stop that when we clean up the headers.

Fix these in advance so the tree remains biscect-clean.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:35 -04:00
Paul Gortmaker
e4dd23d753 infiniband: Fix up module files that need to include module.h
They had been getting it implicitly via device.h but we can't
rely on that for the future, due to a pending cleanup so fix
it now.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:35 -04:00
Linus Torvalds
ec7ae51753 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (204 commits)
  [SCSI] qla4xxx: export address/port of connection (fix udev disk names)
  [SCSI] ipr: Fix BUG on adapter dump timeout
  [SCSI] megaraid_sas: Fix instance access in megasas_reset_timer
  [SCSI] hpsa: change confusing message to be more clear
  [SCSI] iscsi class: fix vlan configuration
  [SCSI] qla4xxx: fix data alignment and use nl helpers
  [SCSI] iscsi class: fix link local mispelling
  [SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA
  [SCSI] aacraid: use lower snprintf() limit
  [SCSI] lpfc 8.3.27: Change driver version to 8.3.27
  [SCSI] lpfc 8.3.27: T10 additions for SLI4
  [SCSI] lpfc 8.3.27: Fix queue allocation failure recovery
  [SCSI] lpfc 8.3.27: Change algorithm for getting physical port name
  [SCSI] lpfc 8.3.27: Changed worst case mailbox timeout
  [SCSI] lpfc 8.3.27: Miscellanous logic and interface fixes
  [SCSI] megaraid_sas: Changelog and version update
  [SCSI] megaraid_sas: Add driver workaround for PERC5/1068 kdump kernel panic
  [SCSI] megaraid_sas: Add multiple MSI-X vector/multiple reply queue support
  [SCSI] megaraid_sas: Add support for MegaRAID 9360/9380 12GB/s controllers
  [SCSI] megaraid_sas: Clear FUSION_IN_RESET before enabling interrupts
  ...
2011-10-28 16:44:18 -07:00
Eric Dumazet
9e903e0852 net: add skb frag size accessors
To ease skb->truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:10:46 -04:00
Dotan Barak
787adb9d6a IPoIB: Use the right function to do DMA unmap pages
Pages that were mapped using ib_dma_map_page() should be unmapped
using ib_dma_unmap_page().

Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-18 10:08:31 -07:00
Sean Hefty
96104eda01 RDMA/core: Add SRQ type field
Currently, there is only a single ("basic") type of SRQ, but with XRC
support we will add a second.  Prepare for this by defining an SRQ type
and setting all current users to IB_SRQT_BASIC.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-13 09:13:26 -07:00
Marcel Apfelbaum
e36fb88a9a IPoIB: Handle extended rates in debugfs
Use new function ib_rate_to_mbps() to handle printing rate in debugfs,
so that we handle extended rates.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-10-11 11:57:08 -07:00
David S. Miller
8decf86879 Merge branch 'master' of github.com:davem330/net
Conflicts:
	MAINTAINERS
	drivers/net/Kconfig
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
	drivers/net/ethernet/broadcom/tg3.c
	drivers/net/wireless/iwlwifi/iwl-pci.c
	drivers/net/wireless/iwlwifi/iwl-trans-tx-pcie.c
	drivers/net/wireless/rt2x00/rt2800usb.c
	drivers/net/wireless/wl12xx/main.c
2011-09-22 03:23:13 -04:00
Mike Christie
f27fb2ef7b [SCSI] iscsi class: sysfs group is_visible callout for iscsi host attrs
The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and driver's host attrs to use the attribute
container sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-08-27 08:36:14 -06:00
Mike Christie
1d063c1729 [SCSI] iscsi class: sysfs group is_visible callout for session attrs
The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and driver's session attrs to use the attribute
container sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-08-27 08:36:06 -06:00
Mike Christie
3128c6c73c [SCSI] iscsi cls: sysfs group is_visible callout for conn attrs
The iscsi class currently does not support writable sysfs
attrs for LLD sysfs settings. This patch converts the
iscsi class and drivers to use the attribute container
sysfs group and the sysfs group's is_visible callout
to be able to support readable or writable sysfs attrs.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-08-27 08:36:03 -06:00
Ian Campbell
5581be3b48 IPoIB: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-26 12:38:42 -04:00
Jiri Pirko
afc4b13df1 net: remove use of ndo_set_multicast_list in drivers
replace it by ndo_set_rx_mode

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-08-17 20:22:03 -07:00
Roland Dreier
80b43de837 Merge branches 'ipoib' and 'iser' into for-next 2011-08-17 10:57:43 -07:00
Or Gerlitz
200ae1a08b IB/iser: Support iSCSI PDU padding
RFC3270 mandates that iSCSI PDUs are padded to the closest integer
number of four byte words.  Fix the iser code to support that on both
the TX/RX flows.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-08-17 09:45:07 -07:00
Or Gerlitz
0ace64b85e IBiser: Fix wrong mask when sizeof (dma_addr_t) > sizeof (unsigned long)
The code that prepares the SG associated with SCSI command for FMR was
buggy for systems with DMA addresses that don't fit in unsigned long,
e.g under the 32-bit based XenServer dom0 sizeof(dma_addr_t) is 8.

Fix that by casting to unsigned long long a masking constant used by
the code. This resolves a crash in iser_sg_to_page_vec on this system.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-08-17 09:40:55 -07:00
Bernd Schubert
22cfb0bf67 IPoIB: Fix possible NULL dereference in ipoib_start_xmit()
Fix a bug introduced in 69cce1d140 ("net: Abstract dst->neighbour
accesses behind helpers.") where we might dereference skb_dst(skb)
even if it is NULL, which causes:

    [  240.944030] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
    [  240.948007] IP: [<ffffffffa0366ce9>] ipoib_start_xmit+0x39/0x280 [ib_ipoib]
    [...]
    [  240.948007] Call Trace:
    [  240.948007]  <IRQ>
    [  240.948007]  [<ffffffff812cd5e0>] dev_hard_start_xmit+0x2a0/0x590
    [  240.948007]  [<ffffffff8131f680>] ? arp_create+0x70/0x200
    [  240.948007]  [<ffffffff812e8e1f>] sch_direct_xmit+0xef/0x1c0

Addresses: https://bugzilla.kernel.org/show_bug.cgi?id=41212
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-08-16 10:19:20 -07:00
Linus Torvalds
91d41fdf31 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Convert to DIV_ROUND_UP_SECTOR_T usage for sectors / dev_max_sectors
  kernel.h: Add DIV_ROUND_UP_ULL and DIV_ROUND_UP_SECTOR_T macro usage
  iscsi-target: Add iSCSI fabric support for target v4.1
  iscsi: Add Serial Number Arithmetic LT and GT into iscsi_proto.h
  iscsi: Use struct scsi_lun in iscsi structs instead of u8[8]
  iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi
2011-07-27 13:21:40 -07:00
Arun Sharma
60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Nicholas Bellinger
123521830c iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi
This patch renames the following iscsi_proto.h structures to avoid
namespace issues with drivers/target/iscsi/iscsi_target_core.h:

*) struct iscsi_cmd -> struct iscsi_scsi_req
*) struct iscsi_cmd_rsp -> struct iscsi_scsi_rsp
*) struct iscsi_login -> struct iscsi_login_req

This patch includes useful ISCSI_FLAG_LOGIN_[CURRENT,NEXT]_STAGE*,
and ISCSI_FLAG_SNACK_TYPE_* definitions used by iscsi_target_mod, and
fixes the incorrect definition of struct iscsi_snack to following
RFC-3720 Section 10.16. SNACK Request.

Also, this patch updates libiscsi, iSER, be2iscsi, and bn2xi to
use the updated structure definitions in a handful of locations.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
2011-07-25 07:18:45 +00:00
Linus Torvalds
ece236ce2f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits)
  IB/qib: Defer HCA error events to tasklet
  mlx4_core: Bump the driver version to 1.0
  RDMA/cxgb4: Use printk_ratelimited() instead of printk_ratelimit()
  IB/mlx4: Support PMA counters for IBoE
  IB/mlx4: Use flow counters on IBoE ports
  IB/pma: Add include file for IBA performance counters definitions
  mlx4_core: Add network flow counters
  mlx4_core: Fix location of counter index in QP context struct
  mlx4_core: Read extended capabilities into the flags field
  mlx4_core: Extend capability flags to 64 bits
  IB/mlx4: Generate GID change events in IBoE code
  IB/core: Add GID change event
  RDMA/cma: Don't allow IPoIB port space for IBoE
  RDMA: Allow for NULL .modify_device() and .modify_port() methods
  IB/qib: Update active link width
  IB/qib: Fix potential deadlock with link down interrupt
  IB/qib: Add sysfs interface to read free contexts
  IB/mthca: Remove unnecessary read of PCI_CAP_ID_EXP
  IB/qib: Remove double define
  IB/qib: Remove unnecessary read of PCI_CAP_ID_EXP
  ...
2011-07-22 14:50:12 -07:00
David S. Miller
69cce1d140 net: Abstract dst->neighbour accesses behind helpers.
dst_{get,set}_neighbour()

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-17 23:11:35 -07:00
Bart Van Assche
fd1b6c4a69 IB/srp: Avoid duplicate devices from LUN scan
SCSI scanning of a channel🆔lun triplet in Linux works as follows
(function scsi_scan_target() in drivers/scsi/scsi_scan.c):

- If lun == SCAN_WILD_CARD, send a REPORT LUNS command to the target
  and process the result.

- If lun != SCAN_WILD_CARD, send an INQUIRY command to the LUN
  corresponding to the specified channel🆔lun triplet to verify
  whether the LUN exists.

So a SCSI driver must either take the channel and target id values in
account in its quecommand() function or it should declare that it only
supports one channel and one target id.

Currently the ib_srp driver does neither.  As a result scanning the
SCSI bus via e.g. rescan-scsi-bus.sh causes many duplicate SCSI
devices to be created. For each 0:0:L device, several duplicates are
created with the same LUN number and with (C:I) != (0:0). Fix this by
declaring that the ib_srp driver only supports one channel and one
target id.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: <stable@kernel.org>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-13 09:19:16 -07:00
Alexey Dobriyan
a6b7a40786 net: remove interrupt.h inclusion from netdevice.h
* remove interrupt.g inclusion from netdevice.h -- not needed
* fixup fallout, add interrupt.h and hardirq.h back where needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 22:55:11 -07:00
Linus Torvalds
4c171acc20 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cma: Save PID of ID's owner
  RDMA/cma: Add support for netlink statistics export
  RDMA/cma: Pass QP type into rdma_create_id()
  RDMA: Update exported headers list
  RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
  RDMA/nes: Add a check for strict_strtoul()
  RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
  RDMA/cxgb4: Use completion objects for event blocking
  IB/srp: Fix integer -> pointer cast warnings
  IB: Add devnode methods to cm_class and umad_class
  IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
  IB/uverbs: Add devnode method to set path/mode
  RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
  RDMA: Add netlink infrastructure
  RDMA: Add error handling to ib_core_init()
2011-05-26 12:13:57 -07:00
Roland Dreier
8dc4abdf4c Merge branches 'cma', 'cxgb3', 'cxgb4', 'misc', 'nes', 'netlink', 'srp' and 'uverbs' into for-next 2011-05-25 13:47:20 -07:00
Sean Hefty
b26f9b9949 RDMA/cma: Pass QP type into rdma_create_id()
The RDMA CM currently infers the QP type from the port space selected
by the user.  In the future (eg with RDMA_PS_IB or XRC), there may not
be a 1-1 correspondence between port space and QP type.  For netlink
export of RDMA CM state, we want to export the QP type to userspace,
so it is cleaner to explicitly associate a QP type to an ID.

Modify rdma_create_id() to allow the user to specify the QP type, and
use it to make our selections of datagram versus connected mode.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-25 13:46:23 -07:00
Roland Dreier
737b94eb41 IB/srp: Fix integer -> pointer cast warnings
Fix

    drivers/infiniband/ulp/srp/ib_srp.c: In function 'srp_handle_recv':
    drivers/infiniband/ulp/srp/ib_srp.c:1150: warning: cast to pointer from integer of different size
    drivers/infiniband/ulp/srp/ib_srp.c: In function 'srp_send_completion':
    drivers/infiniband/ulp/srp/ib_srp.c🔢 warning: cast to pointer from integer of different size

by adding an intermediate cast to uintptr_t.

Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: David Dillow <dillowda@ornl.gov>
2011-05-23 11:30:04 -07:00
Michał Mirosław
3d96c74d89 net: infiniband/ulp/ipoib: convert to hw_features
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-20 01:30:42 -07:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Linus Torvalds
0625bef606 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB: Increase DMA max_segment_size on Mellanox hardware
  IB/mad: Improve an error message so error code is included
  RDMA/nes: Don't print success message at level KERN_ERR
  RDMA/addr: Fix return of uninitialized ret value
  IB/srp: try to use larger FMR sizes to cover our mappings
  IB/srp: add support for indirect tables that don't fit in SRP_CMD
  IB/srp: rework mapping engine to use multiple FMR entries
  IB/srp: allow sg_tablesize to be set for each target
  IB/srp: move IB CM setup completion into its own function
  IB/srp: always avoid non-zero offsets into an FMR
2011-03-24 07:59:46 -07:00
David Dillow
be8b981453 IB/srp: try to use larger FMR sizes to cover our mappings
Now that we can get larger SG lists, we can take advantage of HCAs that
allow us to use larger FMR sizes. In many cases, we can use up to 512
entries, so start there and work our way down.

Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-03-15 19:41:30 -04:00
David Dillow
c07d424d61 IB/srp: add support for indirect tables that don't fit in SRP_CMD
This allows us to guarantee the ability to submit up to 8 MB requests
based on the current value of SCSI_MAX_SG_CHAIN_SEGMENTS. While FMR will
usually condense the requests into 8 SG entries, it is imperative that
the target support external tables in case the FMR mapping fails or is
not supported.

We add a safety valve to allow targets without the needed support to
reap the benefits of the large tables, but fail in a manner that lets
the user know that the data didn't make it to the device. The user must
add "allow_ext_sg=1" to the target parameters to indicate that the
target has the needed support.

If indirect_sg_entries is not specified in the modules options, then
the sg_tablesize for the target will default to cmd_sg_entries unless
overridden by the target options.

Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-03-15 19:37:23 -04:00
David Dillow
8f26c9ff9c IB/srp: rework mapping engine to use multiple FMR entries
Instead of forcing all of the S/G entries to fit in one FMR, and falling
back to indirect descriptors if that fails, allow the use of as many
FMRs as needed to map the request. This lays the groundwork for allowing
indirect descriptor tables that are larger than can fit in the command
IU, but should marginally improve performance now by reducing the number
of indirect descriptors needed.

We increase the minimum page size for the FMR pool to 4K, as larger
pages help increase the coverage of each FMR, and it is rare that the
kernel would send down a request with scattered 512 byte fragments.

This patch also move some of the target initialization code afte the
parsing of options, to keep it together with the new code that needs to
allocate memory based on the options given.

Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-03-15 19:35:16 -04:00
David Dillow
4924864404 IB/srp: allow sg_tablesize to be set for each target
Different configurations of target software allow differing max sizes of
the command IU. Allowing this to be changed per-target allows all
targets on an initiator to get an optimal setting.

We deprecate srp_sg_tablesize and replace it with cmd_sg_entries in
preparation for allowing more indirect descriptors than can fit in the
IU.

Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-03-15 19:35:05 -04:00
David Dillow
961e0be89a IB/srp: move IB CM setup completion into its own function
This is to clean up prior to further changes.

Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-03-15 19:34:48 -04:00
David Dillow
8c4037b501 IB/srp: always avoid non-zero offsets into an FMR
It is unclear exactly how this code works around Mellanox SRP targets,
or if the problem is on the target side or in the HCA itself. In an
abundance of caution, we should always enable the workaround.

Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-03-15 19:34:28 -04:00
Mike Christie
7c53c6f89d [SCSI] iser: export addr and port
This pactch has iser export the address and port
of the endpoint.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2011-02-24 12:41:23 -05:00
David Rientjes
6a108a14fa kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT
The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
is used to configure any non-standard kernel with a much larger scope than
only small devices.

This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
references to the option throughout the kernel.  A new CONFIG_EMBEDDED
option is added that automatically selects CONFIG_EXPERT when enabled and
can be used in the future to isolate options that should only be
considered for embedded systems (RISC architectures, SLOB, etc).

Calling the option "EXPERT" more accurately represents its intention: only
expert users who understand the impact of the configuration changes they
are making should enable it.

Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <david.woodhouse@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Robin Holt <holt@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20 17:02:05 -08:00
Roland Dreier
4790f4dc5f Merge branches 'misc', 'mlx4', 'mthca', 'nes' and 'srp' into for-next 2011-01-16 21:22:41 -08:00
Tejun Heo
f06267104d RDMA: Update workqueue usage
* ib_wq is added, which is used as the common workqueue for infiniband
  instead of the system workqueue.  All system workqueue usages
  including flush_scheduled_work() callers are converted to use and
  flush ib_wq.

* cancel_delayed_work() + flush_scheduled_work() converted to
  cancel_delayed_work_sync().

* qib_wq is removed and ib_wq is used instead.

This is to prepare for deprecation of flush_scheduled_work().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-16 21:16:31 -08:00
Bart Van Assche
695b83495e IB/srp: Test only once whether iu allocation succeeded
Merge the two tests in srp_queuecommand() of whether information unit
allocation succeeded into one.  An intended side effect of this change
is that we fix the warning:

    drivers/infiniband/ulp/srp/ib_srp.c: In function 'srp_queuecommand':
    drivers/infiniband/ulp/srp/ib_srp.c:1116: warning: 'req' may be used uninitialized in this function

(seen with CONFIG_CC_OPTIMIZE_FOR_SIZE=y at least with gcc 4.4.4)

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-13 14:00:43 -08:00
Joe Perches
948579cd8c RDMA: Use vzalloc() to replace vmalloc()+memset(0)
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-12 11:11:58 -08:00
Roland Dreier
2b76c05794 Merge branches 'cxgb4', 'ipath', 'ipoib', 'mlx4', 'mthca', 'nes', 'qib' and 'srp' into for-next 2011-01-10 17:43:30 -08:00
Or Gerlitz
8ae31e5b1f IPoIB: Add GRO support
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:41:55 -08:00
Or Gerlitz
19e364f680 IPoIB: Remove LRO support
As a first step in moving from LRO to GRO, revert commit af40da894e
("IPoIB: add LRO support").  Also eliminate the ethtool set_flags
callback which isn't needed anymore.  Finally, we need to include
<linux/sched.h> directly to get the declaration of restart_syscall()
(which used to be included implicitly through <linux/inet_lro.h>).

Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2011-01-10 17:41:54 -08:00
David Dillow
9af762719e IB/srp: consolidate hot-path variables into cache lines
Put the variables accessed together in the hot-path into common
cachelines, and separate them by RW vs RO to avoid false dirtying.
We keep a local copy of the lkey and rkey in the target to avoid
traversing pointers (and associated cache lines) to find them.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10 15:44:51 -05:00
Bart Van Assche
e968467822 IB/srp: stop sharing the host lock with SCSI
We don't need protection against the SCSI stack, so use our own lock to
allow parallel progress on separate CPUs.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out and small cleanups by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10 15:44:50 -05:00
Bart Van Assche
94a9174c63 IB/srp: reduce lock coverage of command completion
We only need the lock to cover list and credit manipulations, so push
those into srp_remove_req() and update the call chains.

We reorder the request removal and command completion in
srp_process_rsp() to avoid the SCSI mid-layer sending another command
before we've released our request and added any credits returned by the
target. This prevents us from returning HOST_BUSY unneccesarily.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out, small cleanups, and modified to avoid potential extraneous
  HOST_BUSY returns by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10 15:44:50 -05:00
Bart Van Assche
76c75b258f IB/srp: reduce local coverage for command submission and EH
We only need locks to protect our lists and number of credits available.
By pre-consuming the credit for the request, we can reduce our lock
coverage to just those areas. If we don't actually send the request,
we'll need to put the credit back into the pool.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out and small cleanups by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10 15:44:49 -05:00
Bart Van Assche
536ae14e75 IB/srp: don't move active requests to their own list
We use req->scmnd != NULL to indicate an active request, so there's no
need to keep a separate list for them. We can afford the array iteration
during error handling, and dropping it gives us one less item that needs
lock protection.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out and small cleanups by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-10 15:44:42 -05:00
Bart Van Assche
dcb4cb85f4 IB/srp: allow lockless work posting
Only one CPU at a time will own an RX IU, so using the address of the IU
as the work request cookie allows us to avoid taking a lock. We can
similarly prepare the TX path for lockless posting by moving the free TX
IUs to a list. This also removes the requirement that the queue sizes be
a power of 2.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out, small cleanups, and modified to avoid needing an extra field
  in the IU by David Dillow]
Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-05 15:24:25 -05:00
Bart Van Assche
9709f0e05b IB/srp: consolidate state change code
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ broken out and small cleanups by David Dillow ]
Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-05 15:24:25 -05:00
David Dillow
f8b6e31e4e IB/srp: allow task management without a previous request
We can only have one task management comment outstanding, so move the
completion and status to the target port. This allows us to handle
resets of a LUN without a corresponding request having been sent.
Meanwhile, we don't need to play games with host_scribble, just use it
as the pointer it is.

This fixes a crash when we issue a bus reset using sg_reset.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=13893
Reported-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
2011-01-05 15:24:25 -05:00
Jeff Garzik
f281233d3e SCSI host lock push-down
Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.

The patch below presents a simple SCSI host lock push-down as an
equivalent transformation.  No locking or other behavior should change
with this patch.  All existing bugs and locking orders are preserved.

Additionally, add one parameter to queuecommand,
	struct Scsi_Host *
and remove one parameter from queuecommand,
	void (*done)(struct scsi_cmnd *)

Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.

Minimal code disturbance was attempted with this change.  Most drivers
needed only two one-line modifications for their host lock push-down.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-16 13:33:23 -08:00
Linus Torvalds
9e5fca251f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (63 commits)
  IB/qib: clean up properly if pci_set_consistent_dma_mask() fails
  IB/qib: Allow driver to load if PCIe AER fails
  IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not set
  IB/qib: Fix extra log level in qib_early_err()
  RDMA/cxgb4: Remove unnecessary KERN_<level> use
  RDMA/cxgb3: Remove unnecessary KERN_<level> use
  IB/core: Add link layer type information to sysfs
  IB/mlx4: Add VLAN support for IBoE
  IB/core: Add VLAN support for IBoE
  IB/mlx4: Add support for IBoE
  mlx4_en: Change multicast promiscuous mode to support IBoE
  mlx4_core: Update data structures and constants for IBoE
  mlx4_core: Allow protocol drivers to find corresponding interfaces
  IB/uverbs: Return link layer type to userspace for query port operation
  IB/srp: Sync buffer before posting send
  IB/srp: Use list_first_entry()
  IB/srp: Reduce number of BUSY conditions
  IB/srp: Eliminate two forward declarations
  IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144
  IB: Replace EXTRA_CFLAGS with ccflags-y
  ...
2010-10-26 17:54:22 -07:00
Hagen Paul Pfeifer
732eacc054 replace nested max/min macros with {max,min}3 macro
Use the new {max,min}3 macros to save some cycles and bytes on the stack.
This patch substitutes trivial nested macros with their counterpart.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:12 -07:00
Roland Dreier
116e9535fe Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iboe', 'ipoib', 'misc', 'mlx4', 'nes', 'qib' and 'srp' into for-next 2010-10-26 16:09:11 -07:00
David Dillow
19081f31ce IB/srp: Sync buffer before posting send
srp_send_tsk_mgmt() was missing the proper DMA sync calls before posting
the buffer to the device.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-24 22:14:23 -07:00
Bart Van Assche
21c1a90769 IB/srp: Use list_first_entry()
Use the list_first_entry() macro in ib_srp instead of open-coding the equivalent,
which makes the source code slightly more descriptive.  The list_first_entry()
macro itself was introduced in kernel 2.6.22.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-24 22:14:19 -07:00
Bart Van Assche
7ade400aba IB/srp: Reduce number of BUSY conditions
As proposed by the SRP (draft) standard, ib_srp reserves one ring
element for SRP_TSK_MGMT requests. This patch makes sure that the SCSI
mid-layer never tries to queue more than (SRP request limit) - 1 SCSI
commands to ib_srp. This improves performance for targets whose request
limit is less than or equal to SRP_NORMAL_REQ_SQ_SIZE by reducing the
number of BUSY responses reported by ib_srp to the SCSI mid-layer.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-24 22:14:14 -07:00
David Dillow
05a1d7504f IB/srp: Eliminate two forward declarations
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-24 22:14:08 -07:00
Eli Cohen
c3aa9b186b IPoIB: Set dev_id field of net_device
Use the net device's dev_id field to encode the port number of the pci
device.  This can be used to to associate a net device with the pci
device's port. The encoding is: dev_id = port - 1.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-23 13:35:48 -07:00
David Dillow
bb12588a38 IB/srp: Implement SRP_CRED_REQ and SRP_AER_REQ
This patch adds support for SRP_CRED_REQ to avoid a lockup by targets
that use that mechanism to return credits to the initiator. This
prevents a lockup observed in the field where we would never add the
credits from the SRP_CRED_REQ to our current count, and would therefore
never send another command to the target.

Minimal support for SRP_AER_REQ is also added, as these messages can
also be used to convey additional credits to the initiator.

Based upon extensive debugging and code by Bart Van Assche and a bug
report by Chris Worley.

Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-22 22:19:10 -07:00
Bart Van Assche
dd5e6e38b2 IB/srp: Preparation for transmit ring response allocation
The transmit ring in ib_srp (srp_target.tx_ring) is currently only used
for allocating requests sent by the initiator to the target. This patch
prepares using that ring for allocation of both requests and responses.
Also, this patch differentiates the uses of SRP_SQ_SIZE, increases the
size of the IB send completion queue by one element and reserves one
transmit ring slot for SRP_TSK_MGMT requests.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-22 22:19:10 -07:00
Justin P. Mattock
631dd1a885 Update broken web addresses in the kernel.
The patch below updates broken web addresses in the kernel

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Ben Pfaff <blp@cs.stanford.edu>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-10-18 11:03:14 +02:00
Eli Cohen
7b4c876961 IPoIB: Skip IBoE ports
IPoIB is IP-over-Infiniband link layer. In the case of IBoE, the link
layer is Ethernet and IP can work directly over Ethernet, so disable
IPoIB for non-IB_LINK_LAYER_INFINIBAND ports.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-13 09:38:11 -07:00
Christoph Lameter
fed1db33fe IPoIB: Set pkt_type correctly for multicast packets (fix IGMP breakage)
IGMP processing is broken because the IPOIB does not set the
skb->pkt_type the right way for multicast traffic.  All incoming
packets are set to PACKET_HOST which means that igmp_recv() will
ignore the IGMP broadcasts/multicasts.

This in turn means that the IGMP timers are firing and are sending
information about multicast subscriptions unnecessarily.  In a large
private network this can cause traffic spikes.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-09-28 11:09:23 -07:00
Linus Torvalds
3cc08fc35d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (42 commits)
  IB/qib: Add missing <linux/slab.h> include
  IB/ehca: Drop unnecessary NULL test
  RDMA/nes: Fix confusing if statement indentation
  IB/ehca: Init irq tasklet before irq can happen
  RDMA/nes: Fix misindented code
  RDMA/nes: Fix showing wqm_quanta
  RDMA/nes: Get rid of "set but not used" variables
  RDMA/nes: Read firmware version from correct place
  IB/srp: Export req_lim via sysfs
  IB/srp: Make receive buffer handling more robust
  IB/srp: Use print_hex_dump()
  IB: Rename RAW_ETY to RAW_ETHERTYPE
  RDMA/nes: Fix two sparse warnings
  RDMA/cxgb3: Make needlessly global iwch_l2t_send() static
  IB/iser: Make needlessly global iser_alloc_rx_descriptors() static
  RDMA/cxgb4: Add timeouts when waiting for FW responses
  IB/qib: Fix race between qib_error_qp() and receive packet processing
  IB/qib: Limit the number of packets processed per interrupt
  IB/qib: Allow writes to the diag_counters to be able to clear them
  IB/qib: Set cfgctxts to number of CPUs by default
  ...
2010-08-07 17:08:02 -07:00
Roland Dreier
03b37ecdb3 Merge branches 'cxgb3', 'cxgb4', 'ehca', 'ipath', 'misc', 'nes', 'qib' and 'srp' into for-next 2010-08-05 14:27:14 -07:00
Bart Van Assche
89de74866b IB/srp: Export req_lim via sysfs
Export req_lim via sysfs for debugging.

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Acked-by: David Dillow <dave@thedillows.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-04 13:02:26 -07:00
Linus Torvalds
6ba74014c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration
   infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
   and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
   conflict (registration change vs marking it static)
2010-08-04 11:47:58 -07:00
Bart Van Assche
c996bb47bb IB/srp: Make receive buffer handling more robust
The current strategy in ib_srp for posting receive buffers is:

 * Post one buffer after channel establishment.
 * Post one buffer before sending an SRP_CMD or SRP_TSK_MGMT to the target.

As a result, only the first non-SRP_RSP information unit from the
target will be processed.  If that first information unit is an
SRP_T_LOGOUT, it will be processed.  On the other hand, if the
initiator receives an SRP_CRED_REQ or SRP_AER_REQ before it receives a
SRP_T_LOGOUT, the SRP_T_LOGOUT won't be processed.

We can fix this inconsistency by changing the strategy for posting
receive buffers to:

 * Post all receive buffers after channel establishment.
 * After a receive buffer has been consumed and processed, post it again.

A side effect is that the ib_post_recv() call is moved out of the SCSI
command processing path.  Since __srp_post_recv() is not called
directly any more, get rid of it and move the code directly into
srp_post_recv().  Also, move srp_post_recv() up in the file to avoid a
forward declaration.

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Acked-by: David Dillow <dave@thedillows.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-04 11:47:39 -07:00
Bart Van Assche
7a7008110b IB/srp: Use print_hex_dump()
Replace an open-coded dump of the receive buffer with a call to
print_hex_dump().

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-04 11:24:12 -07:00
Or Gerlitz
48d8fcebb7 IB/iser: Make needlessly global iser_alloc_rx_descriptors() static
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-08-04 09:54:55 -07:00
Or Gerlitz
7a52b34b07 IPoIB: Fix world-writable child interface control sysfs attributes
Sumeet Lahorani <sumeet.lahorani@oracle.com> reported that the IPoIB
child entries are world-writable; however we don't want ordinary users
to be able to create and destroy child interfaces, so fix them to be
writable only by root.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Cc: <stable@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-07-06 14:23:22 -07:00
Ben Hutchings
39827be26b IB/{nes, ipoib}: Pass supported flags to ethtool_op_set_flags()
Following commit 1437ce3983 "ethtool:
Change ethtool_op_set_flags to validate flags", ethtool_op_set_flags
takes a third parameter and cannot be used directly as an
implementation of ethtool_ops::set_flags.

Changes nes and ipoib driver to pass in the appropriate value.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-04 11:48:14 -07:00
Linus Torvalds
f8965467f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1674 commits)
  qlcnic: adding co maintainer
  ixgbe: add support for active DA cables
  ixgbe: dcb, do not tag tc_prio_control frames
  ixgbe: fix ixgbe_tx_is_paused logic
  ixgbe: always enable vlan strip/insert when DCB is enabled
  ixgbe: remove some redundant code in setting FCoE FIP filter
  ixgbe: fix wrong offset to fc_frame_header in ixgbe_fcoe_ddp
  ixgbe: fix header len when unsplit packet overflows to data buffer
  ipv6: Never schedule DAD timer on dead address
  ipv6: Use POSTDAD state
  ipv6: Use state_lock to protect ifa state
  ipv6: Replace inet6_ifaddr->dead with state
  cxgb4: notify upper drivers if the device is already up when they load
  cxgb4: keep interrupts available when the ports are brought down
  cxgb4: fix initial addition of MAC address
  cnic: Return SPQ credit to bnx2x after ring setup and shutdown.
  cnic: Convert cnic_local_flags to atomic ops.
  can: Fix SJA1000 command register writes on SMP systems
  bridge: fix build for CONFIG_SYSFS disabled
  ARCNET: Limit com20020 PCI ID matches for SOHARD cards
  ...

Fix up various conflicts with pcmcia tree drivers/net/
{pcmcia/3c589_cs.c, wireless/orinoco/orinoco_cs.c and
wireless/orinoco/spectrum_cs.c} and feature removal
(Documentation/feature-removal-schedule.txt).

Also fix a non-content conflict due to pm_qos_requirement getting
renamed in the PM tree (now pm_qos_request) in net/mac80211/scan.c
2010-05-20 21:04:44 -07:00
Roland Dreier
ffebedb7ab Merge branches 'amso1100', 'bkl', 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'iser', 'masked-atomics', 'misc', 'mthca' and 'nes' into for-next 2010-05-15 20:06:01 -07:00
Dan Carpenter
9fda1ac5fa IB/iser: Fix error flow in iser_create_ib_conn_res()
We shouldn't free things here because we free them later.
The call tree looks like this:
	iser_connect() ==> initiating the connection establishment
and later
	iser_cma_handler() => iser_route_handler() => iser_create_ib_conn_res()
if we fail here, eventually iser_conn_release() is called, resulting
in a double free.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-12 09:30:45 -07:00
Or Gerlitz
39ff05dbbb IB/iser: Enhance disconnection logic for multi-pathing
The iser connection teardown flow isn't over until the underlying
Connection Manager (e.g the IB CM) delivers a disconnected or timeout
event through the RDMA-CM.  When the remote (target) side isn't
reachable, e.g when some HW e.g port/hca/switch isn't functioning or
taken down administratively, the CM timeout flow is used and the event
may be generated only after relatively long time -- on the order of
tens of seconds.

The current iser code exposes this possibly long delay to higher
layers, specifically to the iscsid daemon and iscsi kernel stack. As a
result, the iscsi stack doesn't respond well: this low-level CM delay
is added to the fail-over time under HA schemes such as the one
provided by DM multipath through the multipathd(8) service.

This patch enhances the reference counting scheme on iser's IB
connections so that the disconnect flow initiated by iscsid from user
space (ep_disconnect) doesn't wait for the CM to deliver the
disconnect/timeout event.  (The connection teardown isn't done from
iser's view point until the event is delivered)

The iser ib (rdma) connection object is destroyed when its reference
count reaches zero.  When this happens on the RDMA-CM callback
context, extra care is taken so that the RDMA-CM does the actual
destroying of the associated ID, since doing it in the callback is
prohibited.

The reference count of iser ib connection normally reaches three,
where the <ref, deref> relations are

 1. conn <init, terminate>
 2. conn <bind, stop/destroy>
 3. cma id <create, disconnect/error/timeout callbacks>

With this patch, multipath fail-over time is about 30 seconds, while
without this patch, multipath fail-over time is about 130 seconds.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-12 09:30:44 -07:00
Or Gerlitz
d265b98082 IB/iser: Remove buggy back-pointer setting
The iscsi connection object life cycle includes binding and unbinding
(conn_stop) to/from the iscsi transport connection object.  Since
iscsi connection objects are recycled, at the time the transport
connection (e.g iser's IB connection) is released, it is not valid to
touch the iscsi connection tied to the transport back-pointer since it
may already point to a different transport connection.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-12 09:30:44 -07:00
Or Gerlitz
2110f9bf37 IB/iser: Add asynchronous event handler
Add handler to handle events such as port up and down.  This is useful
when testing high-availability schemes such as multi-pathing.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-05-12 09:30:43 -07:00
Or Gerlitz
d414371795 IPoIB: Allow disabling/enabling TSO on the fly through ethtool
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-04-21 17:17:30 -07:00
David S. Miller
871039f02f Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/stmmac/stmmac_main.c
	drivers/net/wireless/wl12xx/wl1271_cmd.c
	drivers/net/wireless/wl12xx/wl1271_main.c
	drivers/net/wireless/wl12xx/wl1271_spi.c
	net/core/ethtool.c
	net/mac80211/scan.c
2010-04-11 14:53:53 -07:00
David S. Miller
4a35ecf8bf Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/bonding/bond_main.c
	drivers/net/via-velocity.c
	drivers/net/wireless/iwlwifi/iwl-agn.c
2010-04-06 23:53:30 -07:00
Jiri Pirko
22bedad3ce net: convert multicast list to list_head
Converts the list and the core manipulating with it to be the same as uc_list.

+uses two functions for adding/removing mc address (normal and "global"
 variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
 manipulation with lists on a sandbox (used in bonding and 80211 drivers)

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-03 14:22:15 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Jiri Pirko
3e4aa12f8a ipoib: remove addrlen check for mc addresses
Finally this bit can be removed. Currently, after the bonding driver is
changed/fixed (32a806c194 net-next-2.6),
that's not possible for an addr with different length than dev->addr_len
to be present in list. Removing this check as in new mc_list there will be
no addrlen in the record.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-03-22 18:33:11 -07:00
Linus Torvalds
961cde93de Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (69 commits)
  [SCSI] scsi_transport_fc: Fix synchronization issue while deleting vport
  [SCSI] bfa: Update the driver version to 2.1.2.1.
  [SCSI] bfa: Remove unused header files and did some cleanup.
  [SCSI] bfa: Handle SCSI IO underrun case.
  [SCSI] bfa: FCS and include file changes.
  [SCSI] bfa: Modified the portstats get/clear logic
  [SCSI] bfa: Replace bfa_get_attr() with specific APIs
  [SCSI] bfa: New portlog entries for events (FIP/FLOGI/FDISC/LOGO).
  [SCSI] bfa: Rename pport to fcport in BFA FCS.
  [SCSI] bfa: IOC fixes, check for IOC down condition.
  [SCSI] bfa: In MSIX mode, ignore spurious RME interrupts when FCoE ports are in FW mismatch state.
  [SCSI] bfa: Fix Command Queue (CPE) full condition check and ack CPE interrupt.
  [SCSI] bfa: IOC recovery fix in fcmode.
  [SCSI] bfa: AEN and byte alignment fixes.
  [SCSI] bfa: Introduce a link notification state machine.
  [SCSI] bfa: Added firmware save clear feature for BFA driver.
  [SCSI] bfa: FCS authentication related changes.
  [SCSI] bfa: PCI VPD, FIP and include file changes.
  [SCSI] bfa: Fix to copy fpma MAC when requested by user space application.
  [SCSI] bfa: RPORT state machine: direct attach mode fix.
  ...
2010-03-18 16:54:31 -07:00
Or Gerlitz
a48f509b26 IPoIB: Include return code in trace message for ib_post_send() failures
Print the return code of ib_post_send() if it fails to make these
debugging messages more useful.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-03-11 13:43:11 -08:00