516 commits

Author SHA1 Message Date
Roland Dreier
532c3b5817 IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory
Commit b2875d4c ("IB/mthca: Always fill MTTs from CPU") causes a crash
in mthca_write_mtt() with non-memfree HCAs that have their memory
hidden (that is, have only two PCI BARs instead of having a third BAR
that allows access to the RAM attached to the HCA) on 64-bit
architectures.  This is because the commit just before, c20e20ab
("IB/mthca: Merge MR and FMR space on 64-bit systems") makes
dev->mr_table.fmr_mtt_buddy equal to &dev->mr_table.mtt_buddy and
hence mthca_write_mtt() tries to write directly into the HCA's MTT
table.  However, since that table is in the HCA's memory, this is
impossible without the PCI BAR that gives access to that memory.

This causes a crash because mthca_tavor_write_mtt_seg() basically
tries to dereference some offset of a NULL pointer.  Fix this by
adding a test of MTHCA_FLAG_FMR in mthca_write_mtt() so that we always
use the WRITE_MTT firmware command rather than writing directly if
FMRs are not enabled.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-24 16:31:04 -07:00
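The decision described in 532c3b5817 can be sketched as a small standalone model. This is only an illustration under stated assumptions: `sketch_dev`, `SKETCH_FLAG_FMR`, and both `path_*` functions are hypothetical stand-ins, not the driver's actual structures or code.

```c
#include <assert.h>

/* Hypothetical stand-in for MTHCA_FLAG_FMR; the value is illustrative. */
#define SKETCH_FLAG_FMR (1u << 0)

enum mtt_path { MTT_VIA_FW_CMD, MTT_DIRECT_WRITE };

/* Hypothetical, reduced view of the device state that matters here. */
struct sketch_dev {
    unsigned flags;              /* SKETCH_FLAG_FMR set if FMRs are enabled */
    int fmr_buddy_is_mtt_buddy;  /* true once MR and FMR space are merged  */
};

/* Behaviour described as broken: merged space alone selects direct writes. */
static enum mtt_path path_before_fix(const struct sketch_dev *dev)
{
    assert(dev);
    return dev->fmr_buddy_is_mtt_buddy ? MTT_DIRECT_WRITE : MTT_VIA_FW_CMD;
}

/* Behaviour after the fix: without FMR support, always use the firmware
 * command, because the MTT table cannot be reached through a PCI BAR. */
static enum mtt_path path_after_fix(const struct sketch_dev *dev)
{
    assert(dev);
    if (!(dev->flags & SKETCH_FLAG_FMR))
        return MTT_VIA_FW_CMD;
    return dev->fmr_buddy_is_mtt_buddy ? MTT_DIRECT_WRITE : MTT_VIA_FW_CMD;
}
```

With a hidden-memory device (merged space, FMR flag clear), the first function picks the direct write that crashes, while the second falls back to the firmware command.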
Roland Dreier
3f114853d4 IB/mthca: Update HCA firmware revisions
Update the driver's list of current firmware versions with Mellanox's
latest releases.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:02 -07:00
Robert Walsh
40b90430ec IB/ipath: Fix WC format drift between user and kernel space
The kernel ib_wc structure now uses a QP pointer, but the user space
equivalent uses a QP number instead.  This means we can no longer use
a simple structure copy to copy stuff into user space.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:01 -07:00
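The format drift described in 40b90430ec can be pictured with two simplified structures. The `sketch_*` types and their fields are assumptions chosen for illustration; they are much smaller than the real kernel and user-space work completion layouts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced QP object. */
struct sketch_qp {
    uint32_t qp_num;
};

/* Kernel-side completion: refers to the QP by pointer. */
struct sketch_kern_wc {
    uint64_t wr_id;
    uint32_t status;
    struct sketch_qp *qp;
    uint32_t byte_len;
};

/* User-side completion: carries the QP number instead. */
struct sketch_user_wc {
    uint64_t wr_id;
    uint32_t status;
    uint32_t qp_num;
    uint32_t byte_len;
};

/* The layouts differ, so a plain structure copy would be wrong.
 * Copy field by field and translate the pointer into a number. */
static void sketch_wc_to_user(struct sketch_user_wc *dst,
                              const struct sketch_kern_wc *src)
{
    assert(dst && src && src->qp);
    dst->wr_id    = src->wr_id;
    dst->status   = src->status;
    dst->qp_num   = src->qp->qp_num;
    dst->byte_len = src->byte_len;
}
```

The explicit translation step is the point: once one side stores a pointer and the other a number, each field has to be converted individually.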
Robert Walsh
6ce73b07db IB/ipath: Check that a UD work request's address handle is valid
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:00 -07:00
Robert Walsh
0d6172a428 IB/ipath: Remove duplicate stuff from ipath_verbs.h
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:00 -07:00
Robert Walsh
253fb39020 IB/ipath: Check reserved memory keys
Don't let userspace use the direct-physical-map L_key or R_key.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:21:00 -07:00
Bryan O'Sullivan
f0810daf74 IB/ipath: Fix unit selection when all CPU affinity bits set
At some point things changed so that all the affinity bits can be set,
but cpus_full() macro is not true.  This caused problems with the unit
selection logic on multi-unit (board) configurations.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Bryan O'Sullivan
662af5813b IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times
Signed-off-by: Robert Walsh <robert.walsh@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Bryan O'Sullivan
53c1d2c943 IB/ipath: Disable IB link earlier in shutdown sequence
Move the code that shuts down the IB link earlier in the unload
process, to be sure no new packets can arrive while we are unloading.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Bryan O'Sullivan
490462c268 IB/ipath: Prevent random program use of diags interface
To prevent random utility reads and writes of the diag interface to the
chip, we first require a handshake of reading from offset 0 and writing
to offset 0 before any other reads or writes can be done through the
diags device.  Otherwise chip errors can be triggered.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
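The handshake in 490462c268 amounts to a small state machine. The sketch below is an assumption-laden model: the type and function names are hypothetical, and it assumes the read of offset 0 must come before the write of offset 0, which the commit message suggests but does not spell out.

```c
#include <assert.h>
#include <stdint.h>

#define DIAG_OK      0
#define DIAG_DENIED (-1)

enum diag_state { DIAG_UNCHECKED, DIAG_READ_DONE, DIAG_READY };

/* Hypothetical per-open-file state for the diags device. */
struct diag_client {
    enum diag_state state;
};

/* A read is allowed once the handshake is complete; before that,
 * only a read of offset 0 is accepted, and it starts the handshake. */
static int diag_read(struct diag_client *c, uint64_t offset)
{
    assert(c);
    if (c->state == DIAG_READY)
        return DIAG_OK;
    if (offset == 0) {
        c->state = DIAG_READ_DONE;
        return DIAG_OK;
    }
    return DIAG_DENIED;
}

/* A write of offset 0 after the initial read completes the handshake;
 * any other write before that point is refused. */
static int diag_write(struct diag_client *c, uint64_t offset)
{
    assert(c);
    if (c->state == DIAG_READY)
        return DIAG_OK;
    if (c->state == DIAG_READ_DONE && offset == 0) {
        c->state = DIAG_READY;
        return DIAG_OK;
    }
    return DIAG_DENIED;
}
```

A utility that opens the device and immediately pokes an arbitrary offset is refused, which is the protection the commit describes.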
Bryan O'Sullivan
f5408ac7cc IB/ipath: On unrecoverable errors, force link down, LEDs off
If the chip is no longer usable, LEDs should be turned off so the
system can be found easily in the cluster.

Also some minor reorganizing so that both chips print the hardware
error message at the same point, and only if there were unrecovered
errors.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:59 -07:00
Michael Albaugh
27b044a815 IB/ipath: Fix driver crash (in interrupt or during unload) after chip reset
Re-init of the kernel structures after a chip reset was leaving the
portdata structure for port zero in an inconsistent state, and a
pointer to it either stale (in re-init code) or NULL (in devdata).
Fixing the order of operations on this struct, and the condition for
interrupt access, prevents the crashes.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Bryan O'Sullivan
9783ab4058 IB/ipath: Improve handling and reporting of parity errors
Mostly cleanup.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Bryan O'Sullivan
820054b7ca IB/ipath: Print better error messages if kernel is misconfigured
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Arthur Jones
569b87b47f IB/ipath: Force PIOAvail update entry point
Due to a chip bug, the PIOAvail register is not always updated to
memory.  This patch allows userspace to force an update.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
Arthur Jones
7b196e2ff3 IB/ipath: Call free_irq() on chip specific initialization failure
In initialization, if we bailed at chip specific initialization, we
forgot to clean up the irq we had requested.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:58 -07:00
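The cleanup rule behind 7b196e2ff3 is the usual unwind-on-failure pattern. The following is a minimal standalone sketch; `sketch_request_irq()`, `sketch_free_irq()`, and `sketch_init_one()` are hypothetical stand-ins that only count resources, not the driver's functions.

```c
#include <assert.h>

/* Counts interrupts currently held, so a leak is observable. */
static int irqs_held;

static int sketch_request_irq(void)
{
    irqs_held++;
    return 0;
}

static void sketch_free_irq(void)
{
    assert(irqs_held > 0);
    irqs_held--;
}

/* chip_init_result stands in for the outcome of chip-specific setup:
 * 0 for success, a negative value for failure. */
static int sketch_init_one(int chip_init_result)
{
    int ret;

    ret = sketch_request_irq();
    if (ret)
        goto bail;

    ret = chip_init_result;
    if (ret)
        goto bail_irq;   /* the step the commit says was forgotten */

    return 0;

bail_irq:
    sketch_free_irq();
bail:
    return ret;
}
```

Each resource acquired before the failing step gets its own label, so bailing out releases exactly what was taken.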
Bryan O'Sullivan
5a7d4eea91 IB/ipath: Discard multicast packets without a GRH
This patch fixes a bug where multicast packets without a GRH were not
being dropped as per the IB spec.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Bryan O'Sullivan
0ed3c594e3 IB/ipath: Fix calculation for number of kernel PIO buffers
If the module parameter "kpiobufs" is set too high, the calculation to
reset it to a sane value was incorrect.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Bryan O'Sullivan
c8c6f5d496 IB/ipath: Remove unused ipath_read_kreg64_port()
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Ralph Campbell
dd5190b6be IB/ipath: Fix RDMA reads of length zero and error handling
Fix RDMA read response length checking for RDMA_READ_RESPONSE_ONLY to
allow a zero length response.  RDMA read responses which don't match
the expected length or occur in response to some other operation
should generate a completion queue error (see table 56, ch. 9.9.2.3 in
the IB spec).

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Mark Debbage
c7e29ff11f IB/ipath: Allow receive ports mapped into userspace to be shared
Improve port-sharing performance by allowing any process to receive
packets from the shared hardware port under a spin lock for mutual
exclusion. Previously, one process was nominated as the master and
that process was responsible for receiving all packets from the shared
hardware port and either consuming them or forwarding them to their
destination. This led to starvation problems for other processes when
the master process was busy in computation phases.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:57 -07:00
Ralph Campbell
0a5a83cffc IB/ipath: Fix port sharing on powerpc
The port sharing feature mixed kernel virtual addresses as well as
physical addresses for the offset used to describe the mmap address to
map the InfiniPath hardware into user space.  This had a conflict on
powerpc.  The new scheme converts it to a physical address so it
doesn't conflict with chip addresses and yet still fits in 40/44 bits
so it isn't truncated by 32-bit applications calling mmap64().

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:56 -07:00
Bryan O'Sullivan
041eab9136 IB/ipath: Fix CQ flushing when QP is modified to error state
If a receive work request has been removed from the queue but has not
had a CQ entry generated for it and the QP is modified to the error
state, the completion entry generated is incorrect.  This patch fixes
the problem.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:56 -07:00
Bryan O'Sullivan
614d49a21e IB/ipath: Fix bad argument to clear_bit()
Code was converted from a &= ~mask to clear_bit, but the bit was left
shifted instead of being used directly, so we were either trashing
memory several pages away, or sometimes taking a kernel page fault on
an invalid page.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:56 -07:00
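The mistake fixed in 614d49a21e is easy to reproduce outside the kernel. In this sketch `sketch_clear_bit()` is a user-space stand-in that mimics the calling convention of the kernel's `clear_bit()` (a bit index plus a base address); the other helpers are hypothetical and exist only to show the difference.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Stand-in for clear_bit(): nr is a bit INDEX, not a mask. */
static void sketch_clear_bit(size_t nr, unsigned long *addr)
{
    addr[nr / BITS_PER_WORD] &= ~(1UL << (nr % BITS_PER_WORD));
}

/* Original style: the caller builds the mask itself. */
static void clear_with_mask(unsigned long *word, unsigned bit)
{
    *word &= ~(1UL << bit);
}

/* Correct conversion: pass the bit index directly. */
static void clear_with_index(unsigned long *map, size_t nwords, size_t bit)
{
    assert(bit < nwords * BITS_PER_WORD);
    sketch_clear_bit(bit, map);
}

/* Which word an index-based call touches. Passing (1UL << bit) as the
 * index, as the buggy conversion did, selects a word far past the map. */
static size_t word_touched_by(size_t nr)
{
    return nr / BITS_PER_WORD;
}
```

Passing an already shifted value as the index makes the helper compute a word offset from a huge number, which matches the stray writes and page faults the commit describes.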
Bryan O'Sullivan
8ec1077b35 IB/ipath: Change packet problems vs chip errors handling and reporting
Some types of packet errors are moderately common with longer IB
cables and large clusters, and are not reported with prints by other
IB HCA drivers.  This suppresses those messages unless the new
__IPATH_ERRPKTDBG bit is set in ipath_debug.  Reporting of temporarily
disabled frequent error interrupts was also made clearer.

We also distinguish between chip errors, and bad packets sent or
received in the wording of the messages.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Ralph Campbell
6f5c407460 IB/ipath: Fix PSN update for RC retries
This patch fixes a number of bugs with updating the PSN for retries of
RC requests.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Ralph Campbell
0434d271fd IB/ipath: Fix QP error completion queue entries
When switching to the QP error state, the completion queue entries
(error or flush) were not being generated correctly.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Bryan O'Sullivan
39c0d0b919 IB/ipath: Fix up some debug messages
ipath_dbg doesn't need the same prefixes that printk does.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Ralph Campbell
3859e39d75 IB/ipath: Support larger IB_QP_MAX_DEST_RD_ATOMIC and IB_QP_MAX_QP_RD_ATOMIC
This patch adds support for multiple RDMA reads and atomics to be sent
before an ACK is required to be seen by the requester.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:55 -07:00
Ralph Campbell
7b21d26dda IB/ipath: NMI cpu lockup if local loopback used
If a post send is done in loopback and there is no receive queue
entry, the sending QP is put on a timeout list for a while so the
receiver has a chance to post a receive buffer. If another post
send is done, the code incorrectly tried to put the QP on the timeout
list again and corrupted the timeout list. This eventually leads to a
spin lock deadlock NMI due to the timer function looping forever with
the lock held.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
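The list corruption in 7b21d26dda comes from linking the same node twice. One common guard, shown here as a sketch rather than as the driver's actual fix, is to check whether the entry is already linked before adding it. All `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node. */
struct sketch_node {
    struct sketch_node *next, *prev;
};

/* A node that points at itself is not on any list. */
static void sketch_node_init(struct sketch_node *n)
{
    n->next = n;
    n->prev = n;
}

static int sketch_node_unlinked(const struct sketch_node *n)
{
    return n->next == n;
}

static void sketch_add_tail(struct sketch_node *n, struct sketch_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Queue a QP for a timeout at most once. Returns 1 if it was added,
 * 0 if it was already waiting. Re-linking an already linked node
 * would leave its old neighbours with stale pointers. */
static int sketch_queue_timeout(struct sketch_node *qp,
                                struct sketch_node *head)
{
    assert(qp != head);
    if (!sketch_node_unlinked(qp))
        return 0;
    sketch_add_tail(qp, head);
    return 1;
}

static size_t sketch_list_len(const struct sketch_node *head)
{
    size_t n = 0;
    const struct sketch_node *p;

    for (p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

A second post send then finds the QP already queued and leaves the list untouched, so the timer function can still terminate its walk.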
Ralph Campbell
9f9630d5e1 IB/ipath: Fix SRQ limit event causing dropped CQ entry
A silly programming error causes a CQ entry to not be generated if a
SRQ limit event is generated.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Ralph Campbell
947d7617a1 IB/ipath: Don't initialize port memory for subports
A recent change was made to allocate memory for a port after CPU
affinity is set. That change didn't account for subports and was
trying to allocate memory for the port twice.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Bryan O'Sullivan
1908574559 IB/ipath: Definitions of two RXE parity err bits were reversed
The chip documentation on the expected TID vs eager TID parity error
bits was reversed from what was implemented in the RTL, for both
chips.  This corrects the definitions.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Bryan O'Sullivan
165c552c35 IB/ipath: Fix user memory region creation when IOMMU present
The loop which initializes the user memory region from an array of
pages was using the wrong limit for the array.  This worked OK when
dma_map_sg() returned the same number as the number of pages.  This
patch fixes the problem.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:54 -07:00
Bryan O'Sullivan
946db67fbf IB/ipath: Add ability to set and clear IB local loopback
This is a sticky state.  It is useful for diagnosing problems with
boards versus cable/switch problems.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-18 20:20:53 -07:00
Michael S. Tsirkin
608d8268be IB/mthca: Fix data corruption after FMR unmap on Sinai
In mthca_arbel_fmr_unmap(), the high bits of the key are masked off.
This gets rid of the effect of adjust_key(), which makes sure that
bits 3 and 23 of the key are equal when the Sinai throughput
optimization is enabled, and so it may happen that an FMR will end up
with bits 3 and 23 in the key being different.  This causes data
corruption, because when enabling the throughput optimization, the
driver promises the HCA firmware that bits 3 and 23 of all memory keys
will always be equal.

Fix by re-applying adjust_key() after masking the key.

Thanks to Or Gerlitz for reproducing the problem, and Ariel Shahar for
help in debugging.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-16 14:10:55 -07:00
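The invariant in 608d8268be (bits 3 and 23 of every key must be equal) and the effect of masking can be modelled as follows. This is a sketch under assumptions: the direction of the copy (bit 3 into bit 23), the mask value passed in, and all `sketch_*` names are illustrative and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define KEY_BIT_LOW  (UINT32_C(1) << 3)
#define KEY_BIT_HIGH (UINT32_C(1) << 23)

/* Hypothetical adjust step: force bit 23 to mirror bit 3. */
static uint32_t sketch_adjust_key(uint32_t key)
{
    key &= ~KEY_BIT_HIGH;
    if (key & KEY_BIT_LOW)
        key |= KEY_BIT_HIGH;
    return key;
}

/* True when the two bits agree, which is what the firmware is promised. */
static int sketch_key_bits_agree(uint32_t key)
{
    return !(key & KEY_BIT_LOW) == !(key & KEY_BIT_HIGH);
}

/* Unmap path after the fix: masking off the high bits can break the
 * invariant, so the adjust step is applied again afterwards. */
static uint32_t sketch_unmap_key(uint32_t key, uint32_t index_mask)
{
    assert(index_mask != 0);
    key &= index_mask;
    return sketch_adjust_key(key);
}
```

Masking alone turns a valid key with both bits set into one with only bit 3 set; re-applying the adjust step restores the agreement.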
Steve Wise
1ca19770c5 RDMA/cxgb3: Add set_tcb_rpl_handler
As of commit 6cdbd77e ("cxgb3 - missing CPL hanler and register
setting."), the cxgb3 ethernet NIC driver no longer handles SET_TCB
replies, so we need to do it in the iWARP driver.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-04-12 10:37:11 -07:00
Michael S. Tsirkin
0264d88531 IB/mthca: Fix thinko in init_mr_table()
Commit c20e20ab ("IB/mthca: Merge MR and FMR space on 64-bit systems")
swapped the number of MTTs and MPTs when initializing the MR table. As
a result, we get a kernel oops when the number of MTT segments
allocated exceeds 0x20000.

Noted by Troy Benjegerdes <troy@scl.ameslab.gov>, and reproduced by
Dotan Barak <dotanb@mellanox.co.il>.  This fixes
https://bugs.openfabrics.org/show_bug.cgi?id=490

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-26 15:59:32 -07:00
Steve Wise
ed6ee5178e RDMA/cxgb3: Fix resource leak in cxio_hal_init_ctrl_qp()
This was spotted by the Coverity checker (CID 1554).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-26 15:54:40 -07:00
Joachim Fenkes
73b9e9870f IB/ehca: Make scaling code work without CPU hotplug
eHCA scaling code must not depend on register_cpu_notifier() if
CONFIG_HOTPLUG_CPU is not set, so put all related code into #ifdefs.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22 14:40:16 -07:00
Steve Wise
d601347188 RDMA/cxgb3: Handle build_phys_page_list() failure in iwch_reregister_phys_mem()
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22 14:40:16 -07:00
Bryan O'Sullivan
fae8773b73 IB/ipath: Check return value of lookup_one_len
This fixes kernel.org bug 8003.

Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-22 14:40:15 -07:00
Al Viro
62577fa324 [PATCH] fix ipath_dma_free_coherent() prototype
method gets u64, not dma_addr_t

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-14 15:27:49 -07:00
Steve Wise
e64518f373 RDMA/cxgb3: Fix MR permission problems
Fix memory region permission problems:

- remove useless and redundant iwch_mem_perms enum.

- create ib_to_tpt_access_rights() for mapping ib access rights
  to T3 TPT permissions.

- create ib_to_mwbind_access_rights() for mapping ib access rights
  to T3 MWBIND WR permissions.

- fix up the mem reg code to utilize the new functions.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:51:02 -08:00
Steve Wise
1f6a849b7c RDMA/cxgb3: Don't reuse skbs that are non-linear or cloned
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:57 -08:00
Steve Wise
8cfccf02bb RDMA/cxgb3: Squelch logging AE errors
Only print one AE error for a given connection in the kernel log.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:53 -08:00
Steve Wise
adf376b370 RDMA/cxgb3: Stop EP timer when MPA exchange is aborted by peer
Stop the endpoint timer when the MPA exchange is aborted by the peer.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:49 -08:00
Steve Wise
2df50da00e RDMA/cxgb3: Move QP to error on destroy if the state is IDLE
Change iwch_destroy_qp() to always move the QP to ERROR and let
iwch_modify_qp() decide what to do.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:45 -08:00
Steve Wise
42e3175354 RDMA/cxgb3: Fixes for "normal close" failures
Fixes for "normal close" failures:

- Start normal close timer when moving to CLOSING state.
- Handle ABORTING state in close_con_rpl().
- Stop timer correctly on abort during a normal close.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:50:37 -08:00
David Miller
c3bb1092c8 RDMA/cxgb3: Fix build on sparc64
cxgb3 uses dma_alloc_coherent() et al. thus needs linux/dma-mapping.h
include in order to build reliably.

Noticed on sparc64.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-03-06 12:45:57 -08:00