Commit graph

533102 commits

Author SHA1 Message Date
Nicholas Bellinger
417c20a9bd iscsi-target: Fix use-after-free during TPG session shutdown
This patch fixes a use-after-free bug in iscsit_release_sessions_for_tpg()
where se_portal_group->session_lock was incorrectly released/re-acquired
while walking the active se_portal_group->tpg_sess_list.

This can result in a NULL pointer dereference when iscsit_close_session()
shutdown happens in the normal path asynchronously to this code, causing
a bogus dereference of an already freed list entry to occur.

To address this bug, walk the session list checking for the same state
as before, but move entries to a local list to avoid dropping the lock
while walking the active list.

As before, signal using iscsi_session->session_restatement=1 for those
list entries to be released locally by iscsit_free_session() code.
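
The approach described above can be sketched in a few lines. This is only an illustrative userspace model in Python, not the kernel code: the class names, the `shutting_down` flag (standing in for the "same state as before" check), and the `free_session` callback are assumptions made for the example.

```python
import threading


class Session:
    def __init__(self, name, shutting_down=False):
        self.name = name
        # Hypothetical stand-in for the state check done while walking the list.
        self.shutting_down = shutting_down
        self.session_restatement = 0


class PortalGroup:
    def __init__(self, sessions):
        self.session_lock = threading.Lock()
        self.tpg_sess_list = list(sessions)


def release_sessions_for_tpg(tpg, free_session):
    """Move releasable sessions to a local list without dropping the lock."""
    free_list = []
    with tpg.session_lock:
        keep = []
        for sess in tpg.tpg_sess_list:
            if sess.shutting_down:
                # Already being closed by the normal path; leave it alone.
                keep.append(sess)
                continue
            sess.session_restatement = 1
            free_list.append(sess)
        tpg.tpg_sess_list = keep
    # The lock is released here; entries on free_list are private to us,
    # so the slow release work cannot race with a walk of the active list.
    for sess in free_list:
        free_session(sess)
    return len(free_list)
```

The point of the design is that the active list is never walked with the lock dropped; only the private list is processed outside it.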

Reported-by: Sunilkumar Nadumuttlu <sjn@datera.io>
Cc: Sunilkumar Nadumuttlu <sjn@datera.io>
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:43 -07:00
Alexei Potashnik
7359df25a5 qla2xxx: terminate exchange when command is aborted by LIO
The newly introduced aborted_task TFO callback has to terminate
exchange with QLogic driver, since command is being deleted and
no status will be queued to the driver at a later point.

This patch also moves the burden of releasing one cmd refcount to
the aborted_task handler.

Changed iSCSI aborted_task logic to satisfy the above requirement.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:42 -07:00
Alexei Potashnik
e52a8b45b9 qla2xxx: drop cmds/tmrs arrived while session is being deleted
If a new initiator (different WWN) shows up on the same fcport, the old
initiator's session is scheduled for deletion. But there is a small
window between it being marked with QLA_SESS_DELETION_IN_PROGRESS
and qlt_unreg_sess getting called, during which the new session's
commands will keep finding the old session in the fcport map.

This patch drops cmds/tmrs if they find a session in the process of
being deleted.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:42 -07:00
Alexei Potashnik
d20ed91bb6 qla2xxx: disable scsi_transport_fc registration in target mode
There are multiple reasons for disabling this:

1. It provides no functional benefit. We pretty much only get a few more
sysfs entries for each port, but all that information is already
available from /sys/kernel/debug/target/qla-session-X

2. It already only works in private-loop mode. By disabling we'll be
getting more uniform behavior with fabric mode.

3. It creates complications for the new PLOGI handling mechanism:
scsi_transport_fc port deletion timer could race with new session
from initiator and cause logout after successful login.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:41 -07:00
Alexei Potashnik
df673274fa qla2xxx: added sess generations to detect RSCN update races
RSCN processing in qla2xxx driver can run in parallel with ELS/IO
processing. As such the decision to remove disappeared fc port's
session could be stale, because a new login sequence has occurred
since and created a brand new session.

Previous mechanism of dealing with this by delaying deletion request
was prone to erroneous deletions if the event that was supposed to
cancel the deletion never arrived or has been delayed in processing.

New mechanism relies on a time-like generation counter to serialize
RSCN updates relative to ELS/IO updates.
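
A generation counter of this kind can be illustrated with a small Python model. It is a sketch of the general idea only; the class, method names, and the exact comparison are assumptions for the example and are not taken from the driver.

```python
class PortState:
    def __init__(self):
        self.generation = 0          # monotonically increasing, "time-like"
        self.session = None
        self.session_generation = 0  # generation at which the session was created

    def login(self, name):
        """ELS/IO path: a new login creates a session and bumps the generation."""
        self.generation += 1
        self.session = name
        self.session_generation = self.generation

    def snapshot(self):
        """RSCN path: remember the generation seen when the scan started."""
        return self.generation

    def delete_if_unchanged(self, seen_generation):
        """Delete the session only if no login happened after the snapshot."""
        if self.session is None or self.session_generation > seen_generation:
            return False  # decision is stale; a newer session exists
        self.session = None
        return True
```

Because the decision is compared against a counter rather than delayed and later cancelled, a lost or late "cancel" event cannot cause an erroneous deletion in this model.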

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:41 -07:00
Alexei Potashnik
daddf5cf9b qla2xxx: Abort stale cmds on qla_tgt_wq when plogi arrives
Cancel any commands from the initiator's s_id that are still waiting
on qla_tgt_wq when a PLOGI arrives.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:40 -07:00
Alexei Potashnik
a6ca88780d qla2xxx: delay plogi/prli ack until existing sessions are deleted
- keep qla_tgt_sess object on the session list until it's freed

- modify use of sess->deleted flag to differentiate delayed
  session deletion that can be cancelled from irreversible one:
  QLA_SESS_DELETION_PENDING vs QLA_SESS_DELETION_IN_PROGRESS

- during IN_PROGRESS deletion all newly arrived commands and TMRs will
  be rejected, existing commands and TMRs will be terminated when
  given by the core to the fabric or simply dropped if session logout
  has already happened (logout terminates all existing exchanges)

- new PLOGI will initiate deletion of the following sessions
  (unless deletion is already IN_PROGRESS):
  - with the same port_name (with logout)
  - different port_name, different loop_id but the same port_id
    (with logout)
  - different port_name, different port_id, but the same loop_id
    (without logout)

- additionally each new PLOGI will store imm notify iocb in the
  same port_name session being deleted. When deletion process
  completes this iocb will be acked. Only the most recent PLOGI
  iocb is stored. The older ones will be terminated when replaced.

- new PRLI will initiate deletion of the following sessions
  (unless deletion is already IN_PROGRESS):
  - different port_name, different port_id, but the same loop_id
   (without logout)
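
The PLOGI and PRLI matching rules listed above can be written down as a small decision function. This Python sketch only restates the rules from this message; the `Sess` record and function names are hypothetical, and the real driver works on its own session structures.

```python
from dataclasses import dataclass


@dataclass
class Sess:
    port_name: str
    port_id: int
    loop_id: int
    deletion_in_progress: bool = False


def sessions_to_delete_on_plogi(sessions, port_name, port_id, loop_id):
    """Return (session, with_logout) pairs for a newly arrived PLOGI."""
    out = []
    for s in sessions:
        if s.deletion_in_progress:
            continue  # deletion already IN_PROGRESS, nothing to initiate
        if s.port_name == port_name:
            out.append((s, True))
        elif s.port_id == port_id and s.loop_id != loop_id:
            out.append((s, True))
        elif s.loop_id == loop_id and s.port_id != port_id:
            out.append((s, False))
    return out


def sessions_to_delete_on_prli(sessions, port_name, port_id, loop_id):
    """Return (session, with_logout) pairs for a newly arrived PRLI."""
    return [
        (s, False)
        for s in sessions
        if not s.deletion_in_progress
        and s.port_name != port_name
        and s.port_id != port_id
        and s.loop_id == loop_id
    ]
```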

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:39 -07:00
Swapnil Nagle
8b2f5ff3d0 qla2xxx: cleanup cmd in qla workqueue before processing TMR
Since cmds go into qla_tgt_wq and TMRs don't, it's possible that TMR
like TASK_ABORT can be queued over the cmd for which it was meant.
To avoid this race, use a per-port list to keep track of cmds that
are enqueued to qla_tgt_wq but not yet processed. When a TMR arrives,
iterate through this list and remove any cmds that match the TMR.
This patch supports TASK_ABORT and LUN_RESET.
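
As a rough illustration of the per-port list, the following Python model tracks commands that are queued but not yet processed and removes the ones a TMR refers to. The `Cmd` and `Port` types and the matching keys (tag for TASK_ABORT, LUN for LUN_RESET) are assumptions for the sketch, not the driver's data structures.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Cmd:
    tag: int
    lun: int
    aborted: bool = False


class Port:
    def __init__(self):
        # Commands handed to the workqueue but not yet processed.
        self.pending = deque()

    def enqueue(self, cmd):
        self.pending.append(cmd)

    def handle_tmr(self, kind, tag=None, lun=None):
        """Remove and return pending commands that match the TMR."""
        def matches(cmd):
            if kind == "TASK_ABORT":
                return cmd.tag == tag
            if kind == "LUN_RESET":
                return cmd.lun == lun
            return False  # other TMR types are not handled in this sketch

        removed = [c for c in self.pending if matches(c)]
        self.pending = deque(c for c in self.pending if not matches(c))
        for c in removed:
            c.aborted = True
        return removed
```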

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Swapnil Nagle <swapnil.nagle@purestorage.com>
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:39 -07:00
Roland Dreier
b2032fd567 qla2xxx: kill sessions/log out initiator on RSCN and port down events
To fix some issues talking to ESX, this patch modifies the qla2xxx driver
so that it never logs into remote ports.  This has the side effect of
getting rid of the "rports" entirely, which means we never log out of
initiators and never tear down sessions when an initiator goes away.

This is mostly OK, except that we can run into trouble if we have
initiator A assigned FC address X:Y:Z by the fabric talking to us, and
then initiator A goes away.  Some time (could be a long time) later,
initiator B comes along and also gets FC address X:Y:Z (which is
available again, because initiator A is gone).  If initiator B starts
talking to us, then we'll still have the session for initiator A, and
since we look up incoming IO based on the FC address X:Y:Z, initiator B
will end up using ACLs for initiator A.

Fix this by:

 1. Handling RSCN events somewhat differently; instead of completely
    skipping the processing of fcports, we look through the list, and if
    an fcport disappears, we tell the target code to tear down the
    session and tell the HBA FW to release the N_Port handle.

 2. Handling "port down" events by flushing all of our sessions.  The
    firmware was already releasing the N_Port handle but we want the
    target code to drop all the sessions too.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Alexei Potashnik <alexei@purestorage.com>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:38 -07:00
Kanoj Sarcar
9fce12540c qla2xxx: fix command initialization in target mode.
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Kanoj Sarcar <kanoj.sarcar@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:31 -07:00
Himanshu Madhani
6bc85dd595 qla2xxx: Remove msleep in qlt_send_term_exchange
Remove unnecessary msleep from qlt_send_term_exchange as it
adds latency of 250 msec while sending terminate exchange to
an aborted task.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:25 -07:00
Quinn Tran
e5fdee875f qla2xxx: adjust debug flags
Adjust debug flag to match debug comment.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:18 -07:00
Quinn Tran
810e30bc46 qla2xxx: release request queue reservation.
Request IOCB queue elements are reserved during good-path IO.
Under an error condition, such as failing to allocate an
IOCB handle, the IOCB count that was reserved is not released.

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:11 -07:00
Quinn Tran
3761f3e870 qla2xxx: Add flush after updating ATIOQ consumer index.
After updating the consumer index of ATIO Q, a read is
required to flush the write to the adapter register.

Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:19:05 -07:00
Himanshu Madhani
b20f02e141 qla2xxx: Enable target mode for ISP27XX
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:18:55 -07:00
Saurav Kashyap
ba9f6f64a0 qla2xxx: Fix hardware lock/unlock issue causing kernel panic.
[ Upstream commit ef86cb2059 ]

This patch fixes a kernel panic for qla2xxx Target core
Module driver introduced by a fix in the qla2xxx initiator code.

Commit ef86cb2 ("qla2xxx: Mark port lost when we receive an RSCN for it.")
introduced the regression for qla2xxx Target driver.

Stack trace will have following signature

 --- <NMI exception stack> ---
[ffff88081faa3cc8] _raw_spin_lock_irqsave at ffffffff815b1f03
[ffff88081faa3cd0] qlt_fc_port_deleted at ffffffffa096ccd0 [qla2xxx]
[ffff88081faa3d20] qla2x00_schedule_rport_del at ffffffffa0913831 [qla2xxx]
[ffff88081faa3d50] qla2x00_mark_device_lost at ffffffffa09159c5 [qla2xxx]
[ffff88081faa3db0] qla2x00_async_event at ffffffffa0938d59 [qla2xxx]
[ffff88081faa3e30] qla24xx_msix_default at ffffffffa093a326 [qla2xxx]
[ffff88081faa3e90] handle_irq_event_percpu at ffffffff810a7b8d
[ffff88081faa3ee0] handle_irq_event at ffffffff810a7d32
[ffff88081faa3f10] handle_edge_irq at ffffffff810ab6b9
[ffff88081faa3f30] handle_irq at ffffffff8100619c
[ffff88081faa3f70] do_IRQ at ffffffff815b4b1c
 --- <IRQ stack> ---

Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-24 14:18:42 -07:00
David Disseldorp
c20910264c target/configfs: handle match_int() errors
As a follow up to ce31c1b0dc - there are
still a few LIO match_int() calls that don't check the return value.
Propagate errors rather than using the potentially uninitialised result.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-23 23:41:22 -07:00
Andy Grover
9105bfc038 target: Do not return 0 from aptpl and alua configfs store functions
Here are some more instances where we are returning 0 from a configfs
store function, the unintended result of which is likely infinite retries
from userspace.

Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-23 23:40:01 -07:00
Andy Grover
bc1a7d6aff target: Indicate success if writing 0 to pi_prot_type
See https://bugzilla.redhat.com/show_bug.cgi?id=1240687

Returning 0 from a configfs store function results in infinite retries.
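
Why a return value of 0 leads to endless retries can be seen from a typical userspace write loop, modelled here in Python. The loop shape, the function names, and the `max_attempts` cap (added only so the example terminates) are assumptions for illustration.

```python
def write_all(store, data, max_attempts=5):
    """Mimic a write loop; return attempts used, or None if it gave up."""
    offset = 0
    attempts = 0
    while offset < len(data):
        if attempts == max_attempts:
            return None  # a loop without this cap would spin forever
        attempts += 1
        n = store(data[offset:])
        if n < 0:
            raise OSError(-n, "store failed")
        offset += n  # n == 0 means no progress was reported
    return attempts


def buggy_store(buf):
    return 0  # "success" reported as zero bytes consumed


def fixed_store(buf):
    return len(buf)  # report the whole buffer as consumed
```

With `buggy_store` the offset never advances, so the caller keeps resubmitting the same data; with `fixed_store` the loop finishes after one call.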

Reported-by: Yanko Kaneti <yaneti@declera.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-23 23:39:59 -07:00
Nicholas Mc Guire
fd4e1393c4 tcm_qla2xxx: pass timeout as HZ independent value
API compliance scanning with coccinelle flagged:
./drivers/scsi/qla2xxx/tcm_qla2xxx.c:407:2-29:
         WARNING: timeout is HZ dependent

This was introduced in 'commit 75f8c1f693 ("[SCSI] tcm_qla2xxx: Add >=
24xx series fabric module for target-core")'. wait_for_completion_timeout()
expects a timeout in jiffies so the numeric constant makes the effective
timeout HZ dependent. Resolved by converting it to CONST * HZ.
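
The arithmetic behind the warning is simple enough to show directly. In this Python sketch the constant 10 and the HZ values are example numbers, not the values used in tcm_qla2xxx.c.

```python
def jiffies_to_seconds(jiffies, hz):
    """Wall-clock duration of a timeout given in jiffies."""
    return jiffies / hz


def seconds_to_jiffies(seconds, hz):
    """The CONST * HZ form: a timeout that means the same time on any HZ."""
    return seconds * hz


# A bare numeric constant means a different duration on each configuration.
for hz in (100, 250, 1000):
    print(f"HZ={hz}: 10 jiffies = {jiffies_to_seconds(10, hz)} s, "
          f"10 * HZ jiffies = {jiffies_to_seconds(seconds_to_jiffies(10, hz), hz)} s")
```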

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-23 23:39:50 -07:00
Sagi Grimberg
5dacbfc934 target/rd: Set ramdisk as non rotational device
Since a RAM backend device is not really a rotational device,
we set it as is_nonrot=1 which will be forwarded in VPD page 0xb1
(block device characteristics) response.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-06 18:07:19 -07:00
Nicholas Bellinger
3aa3c67b26 target: Add extra TYPE_DISK + protection checks for INQUIRY SPT
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-06 18:07:17 -07:00
Sagi Grimberg
27e6772b0d target/spc: Set SPT correctly in Extended INQUIRY Data VPD page
LIO supports protection types 1,3 so setting a hard-coded SPT=3
is fine for now.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-06 18:07:03 -07:00
Sagi Grimberg
9b353cc8f1 target/pr: Fix possible uninitialized variable usage
Triggered a compilation warning.

Fixes: 2650d71e2 target: move transport ID handling to the core

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-07-06 17:07:30 -07:00
Linus Torvalds
1c4c7159ed Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 bugfixes from Ted Ts'o:
 "Bug fixes (all for stable kernels) for ext4:

   - address corner cases for indirect blocks->extent migration

   - fix reserved block accounting invalidate_page when
     page_size != block_size (i.e., ppc or 1k block size file systems)

   - fix deadlocks when a memcg is under heavy memory pressure

   - fix fencepost error in lazytime optimization"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: replace open coded nofail allocation in ext4_free_blocks()
  ext4: correctly migrate a file with a hole at the beginning
  ext4: be more strict when migrating to non-extent based file
  ext4: fix reservation release on invalidatepage for delalloc fs
  ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp
  bufferhead: Add _gfp version for sb_getblk()
  ext4: fix fencepost error in lazytime optimization
2015-07-05 16:24:54 -07:00
Linus Torvalds
d770e558e2 Linux 4.2-rc1
2015-07-05 11:01:52 -07:00
Linus Torvalds
a585d2b738 Merge tag 'platform-drivers-x86-v4.2-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86

Pull late x86 platform driver updates from Darren Hart:
 "The following came in a bit later and I wanted them to bake in next a
  few more days before submitting, thus the second pull.

  A new intel_pmc_ipc driver, a symmetrical allocation and free fix in
  dell-laptop, a couple minor fixes, and some updated documentation in
  the dell-laptop comments.

  intel_pmc_ipc:
   - Add Intel Apollo Lake PMC IPC driver

  tc1100-wmi:
   - Delete an unnecessary check before the function call "kfree"

  dell-laptop:
   - Fix allocating & freeing SMI buffer page
   - Show info about WiGig and UWB in debugfs
   - Update information about wireless control"

* tag 'platform-drivers-x86-v4.2-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
  intel_pmc_ipc: Add Intel Apollo Lake PMC IPC driver
  tc1100-wmi: Delete an unnecessary check before the function call "kfree"
  dell-laptop: Fix allocating & freeing SMI buffer page
  dell-laptop: Show info about WiGig and UWB in debugfs
  dell-laptop: Update information about wireless control
2015-07-05 10:54:09 -07:00
Michal Hocko
7444a072c3 ext4: replace open coded nofail allocation in ext4_free_blocks()
ext4_free_blocks is looping around the allocation request and mimics
__GFP_NOFAIL behavior without any allocation fallback strategy. Let's
remove the open coded loop and replace it with __GFP_NOFAIL. Without the
flag the allocator has no way to find out never-fail requirement and
cannot help in any way.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2015-07-05 12:33:44 -04:00
Linus Torvalds
1dc51b8288 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
 "Assorted VFS fixes and related cleanups (IMO the most interesting in
  that part are f_path-related things and Eric's descriptor-related
  stuff).  UFS regression fixes (it got broken last cycle).  9P fixes.
  fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups".  The
  file_table locking rule change by Eric Dumazet is a rather big and
  fundamental update even if the patch isn't huge.   - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
  9p: cope with bogus responses from server in p9_client_{read,write}
  p9_client_write(): avoid double p9_free_req()
  9p: forgetting to cancel request on interrupted zero-copy RPC
  dax: bdev_direct_access() may sleep
  block: Add support for DAX reads/writes to block devices
  dax: Use copy_from_iter_nocache
  dax: Add block size note to documentation
  fs/file.c: __fget() and dup2() atomicity rules
  fs/file.c: don't acquire files->file_lock in fd_install()
  fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
  vfs: avoid creation of inode number 0 in get_next_ino
  namei: make set_root_rcu() return void
  make simple_positive() public
  ufs: use dir_pages instead of ufs_dir_pages()
  pagemap.h: move dir_pages() over there
  remove the pointless include of lglock.h
  fs: cleanup slight list_entry abuse
  xfs: Correctly lock inode when removing suid and file capabilities
  fs: Call security_ops->inode_killpriv on truncate
  fs: Provide function telling whether file_remove_privs() will do anything
  ...
2015-07-04 19:36:06 -07:00
Linus Torvalds
9b284cbdb5 bluetooth: fix list handling
Commit 835a6a2f86 ("Bluetooth: Stop sabotaging list poisoning")
thought that the code was sabotaging the list poisoning when NULL'ing
out the list pointers and removed it.

But what was going on was that the bluetooth code was using NULL
pointers for the list as a way to mark it empty, and that commit just
broke it (and replaced the test with NULL with a "list_empty()" test on
an uninitialized list instead, breaking things even further).

So fix it all up to use the regular and real list_empty() handling
(which does not use NULL, but a pointer to itself), also making sure to
initialize the list properly (the previous NULL case was initialized
implicitly by the session being allocated with kzalloc())
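
The difference between the two conventions can be modelled in a few lines of Python. This is a sketch of the general circular-list idea, with helper names borrowed from the kernel list API; it is not the kernel implementation.

```python
class ListHead:
    def __init__(self):
        # Mimics zeroed (kzalloc'd) memory: both pointers start as NULL.
        self.next = None
        self.prev = None


def init_list_head(head):
    head.next = head
    head.prev = head


def list_empty(head):
    # An empty list points to itself; it does not use NULL.
    return head.next is head


def list_add(new, head):
    new.next = head.next
    new.prev = head
    head.next.prev = new
    head.next = new


def list_del_init(entry):
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    init_list_head(entry)
```

A zeroed head that was never initialized is not "empty" by this test, which is why the list has to be initialized explicitly once the NULL convention is dropped.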

This is a combination of patches by Marcel Holtmann and Tedd Ho-Jeong
An.

[ I would normally expect to get this through the bt tree, but I'm going
  to release -rc1, so I'm just committing this directly   - Linus ]

Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Original-by: Tedd Ho-Jeong An <tedd.an@intel.com>
Original-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-04 19:11:33 -07:00
Linus Torvalds
5c755fe142 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "It's been a busy development cycle for target-core in a number of
  different areas.

  The fabric API usage for se_node_acl allocation is now within
  target-core code, dropping the external API callers for all fabric
  drivers tree-wide.

  There is a new conversion to RCU hlists for se_node_acl and
  se_portal_group LUN mappings, that turns fast-path LUN lookup into a
  completely lockless code-path.  It also removes the original
  hard-coded limitation of 256 LUNs per fabric endpoint.

  The configfs attributes for backends can now be shared between core
  and driver code, allowing existing drivers to use common code while
  still allowing flexibility for new backend provided attributes.

  The highlights include:

   - Merge sbc_verify_dif_* into common code (sagi)
   - Remove iscsi-target support for obsolete IFMarker/OFMarker
     (Christophe Vu-Brugier)
   - Add bidi support in target/user backend (ilias + vangelis + agrover)
   - Move se_node_acl allocation into target-core code (hch)
   - Add crc_t10dif_update common helper (akinobu + mkp)
   - Handle target-core odd SGL mapping for data transfer memory
     (akinobu)
   - Move transport ID handling into target-core (hch)
   - Move task tag into struct se_cmd + support 64-bit tags (bart)
   - Convert se_node_acl->device_list[] to RCU hlist (nab + hch +
     paulmck)
   - Convert se_portal_group->tpg_lun_list[] to RCU hlist (nab + hch +
     paulmck)
   - Simplify target backend driver registration (hch)
   - Consolidate + simplify target backend attribute implementations
     (hch + nab)
   - Subsume se_port + t10_alua_tg_pt_gp_member into se_lun (hch)
   - Drop lun_sep_lock for se_lun->lun_se_dev RCU usage (hch + nab)
   - Drop unnecessary core_tpg_register TFO parameter (nab)
   - Use 64-bit LUNs tree-wide (hannes)
   - Drop left-over TARGET_MAX_LUNS_PER_TRANSPORT limit (hannes)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (76 commits)
  target: Bump core version to v5.0
  target: remove target_core_configfs.h
  target: remove unused TARGET_CORE_CONFIG_ROOT define
  target: consolidate version defines
  target: implement WRITE_SAME with UNMAP bit using ->execute_unmap
  target: simplify UNMAP handling
  target: replace se_cmd->execute_rw with a protocol_data field
  target/user: Fix inconsistent kmap_atomic/kunmap_atomic
  target: Send UA when changing LUN inventory
  target: Send UA upon LUN RESET tmr completion
  target: Send UA on ALUA target port group change
  target: Convert se_lun->lun_deve_lock to normal spinlock
  target: use 'se_dev_entry' when allocating UAs
  target: Remove 'ua_nacl' pointer from se_ua structure
  target_core_alua: Correct UA handling when switching states
  xen-scsiback: Fix compile warning for 64-bit LUN
  target: Remove TARGET_MAX_LUNS_PER_TRANSPORT
  target: use 64-bit LUNs
  target: Drop duplicate + unused se_dev_check_wce
  target: Drop unnecessary core_tpg_register TFO parameter
  ...
2015-07-04 14:13:43 -07:00
Linus Torvalds
6d7c8e1b3a Merge tag 'ntb-4.2' of git://github.com/jonmason/ntb

Pull NTB updates from Jon Mason:
 "This includes a pretty significant reworking of the NTB core code, but
  has already produced some significant performance improvements.

  An abstraction layer was added to allow the hardware and clients to be
  easily added.  This required rewriting the NTB transport layer for
  this abstraction layer.  This modification will allow future "high
  performance" NTB clients.

  In addition to this change, a number of performance modifications were
  added.  These changes include NUMA enablement, using CPU memcpy
  instead of asyncdma, and modification of NTB layer MTU size"

* tag 'ntb-4.2' of git://github.com/jonmason/ntb: (22 commits)
  NTB: Add split BAR output for debugfs stats
  NTB: Change WARN_ON_ONCE to pr_warn_once on unsafe
  NTB: Print driver name and version in module init
  NTB: Increase transport MTU to 64k from 16k
  NTB: Rename Intel code names to platform names
  NTB: Default to CPU memcpy for performance
  NTB: Improve performance with write combining
  NTB: Use NUMA memory in Intel driver
  NTB: Use NUMA memory and DMA chan in transport
  NTB: Rate limit ntb_qp_link_work
  NTB: Add tool test client
  NTB: Add ping pong test client
  NTB: Add parameters for Intel SNB B2B addresses
  NTB: Reset transport QP link stats on down
  NTB: Do not advance transport RX on link down
  NTB: Differentiate transport link down messages
  NTB: Check the device ID to set errata flags
  NTB: Enable link for Intel root port mode in probe
  NTB: Read peer info from local SPAD in transport
  NTB: Split ntb_hw_intel and ntb_transport drivers
  ...
2015-07-04 14:07:47 -07:00
Al Viro
0f1db7dee2 9p: cope with bogus responses from server in p9_client_{read,write}
If the server claims to have written/read more than we'd told it to,
warn and cap the claimed byte count to avoid advancing more than
we are ready to.
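
The capping rule amounts to a bounds check on the server's reply. A minimal Python sketch, with a hypothetical helper name and return shape:

```python
def accept_count(requested, claimed):
    """Return (count_to_use, was_bogus) for a read/write reply."""
    if claimed > requested:
        # The server claims more than was asked for: flag it and cap.
        return requested, True
    return claimed, False
```

The caller would emit its warning when the second value is true and advance by the first value only.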
2015-07-04 16:17:39 -04:00
Al Viro
67e808fbb0 p9_client_write(): avoid double p9_free_req()
Braino in "9p: switch p9_client_write() to passing it struct iov_iter *";
if the response is impossible to parse and we discard the request, get
out of the loop right there.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-04 16:11:05 -04:00
Al Viro
a84b69cb6e 9p: forgetting to cancel request on interrupted zero-copy RPC
If we'd already sent a request and decide to abort it, we *must*
issue TFLUSH properly and not just blindly reuse the tag, or
we'll get seriously screwed when response eventually arrives
and we confuse it for response to later request that had reused
the same tag.

Cc: stable@vger.kernel.org # v3.2 and later
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-04 16:04:19 -04:00
Matthew Wilcox
43c3dd08da dax: bdev_direct_access() may sleep
The brd driver is the only in-tree driver that may sleep currently.
After some discussion on linux-fsdevel, we decided that any driver
may choose to sleep in its ->direct_access method.  To ensure that all
callers of bdev_direct_access() are prepared for this, add a call
to might_sleep().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-04 15:56:57 -04:00
Matthew Wilcox
bbab37ddc2 block: Add support for DAX reads/writes to block devices
If a block device supports the ->direct_access methods, bypass the normal
DIO path and use DAX to go straight to memcpy() instead of allocating
a DIO and a BIO.

Includes support for the DIO_SKIP_DIO_COUNT flag in DAX, as is done in
do_blockdev_direct_IO().

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-04 15:56:57 -04:00
Matthew Wilcox
872eb127e3 dax: Use copy_from_iter_nocache
When userspace does a write, there's no need for the written data to
pollute the CPU cache.  This matches the original XIP code.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-04 15:56:56 -04:00
Matthew Wilcox
44f4c054ca dax: Add block size note to documentation
For block devices which are small enough, mkfs will default to creating
a filesystem with block sizes smaller than page size.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-04 15:56:56 -04:00
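
The block-size concern in the entry above can be checked from userspace. The sketch below assumes, following the advice in the kernel's DAX documentation, that the filesystem block size should equal the page size; the helper names are hypothetical and the filesystem block size is taken as a parameter rather than read from a real device.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical check: does the filesystem block size match the page size? */
static bool block_size_matches_page(long fs_block_size, long page_size)
{
    return fs_block_size > 0 && fs_block_size == page_size;
}

/* Page size of the running system, as reported by sysconf(). */
static long system_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}
```

On a small device where mkfs picked a 1024-byte block size, block_size_matches_page(1024, system_page_size()) would return false on a system with 4096-byte pages, signalling that the block size needs to be set explicitly at mkfs time.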
Linus Torvalds
1b3618b60a Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Except for the preempt notifiers fix, these are all small bugfixes
  that could have waited for -rc2.  Sending them now since I was
  taking care of Peter's patch anyway"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: add hyper-v crash msrs values
  KVM: x86: remove data variable from kvm_get_msr_common
  KVM: s390: virtio-ccw: don't overwrite config space values
  KVM: x86: keep track of LVT0 changes under APICv
  KVM: x86: properly restore LVT0
  KVM: x86: make vapics_in_nmi_mode atomic
  sched, preempt_notifier: separate notifier registration from static_key inc/dec
2015-07-04 11:29:59 -07:00
Dave Jiang
bf44fe4671 NTB: Add split BAR output for debugfs stats
When split BAR is enabled, the driver needs to dump out the split BAR
registers rather than the original 64bit BAR registers.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:32 -04:00
Dave Jiang
fd839bf884 NTB: Change WARN_ON_ONCE to pr_warn_once on unsafe
The unsafe doorbell and scratchpad access should display a reason when
the warning is issued.  Otherwise we get a stack dump without any explanation.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:30 -04:00
Dave Jiang
7eb387813d NTB: Print driver name and version in module init
Print the driver name and version to indicate what is being loaded.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:28 -04:00
Dave Jiang
9891417de8 NTB: Increase transport MTU to 64k from 16k
Benchmarking showed a significant performance increase with the MTU size
set to 64k instead of 16k.  Change the driver default to 64k.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:27 -04:00
Dave Jiang
2f887b9a44 NTB: Rename Intel code names to platform names
Instead of using the platform code names, use the correct platform names
to identify the respective Intel NTB hardware.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:25 -04:00
Dave Jiang
a41ef053f7 NTB: Default to CPU memcpy for performance
Disable DMA usage by default, since the CPU provides much better
performance with write combining.  Provide a module parameter to enable
DMA usage when offloading the memcpy is preferred.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:24 -04:00
Dave Jiang
06917f7535 NTB: Improve performance with write combining
Changing the memory window BAR mappings to write combining significantly
boosts performance.  Also use a memcpy that uses non-temporal stores,
which showed a performance improvement when doing non-cached memcpys.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:21 -04:00
Allen Hubbe
0e041fb536 NTB: Use NUMA memory in Intel driver
Allocate memory for the NUMA node of the NTB device.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:09:19 -04:00
Allen Hubbe
1199aa6126 NTB: Use NUMA memory and DMA chan in transport
Allocate memory and request the DMA channel for the same NUMA node as
the NTB device.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:08:33 -04:00
Allen Hubbe
2876228941 NTB: Rate limit ntb_qp_link_work
When the ntb transport is connecting and waiting for the peer, the debug
console receives lots of debug level messages about the remote qp link
status being down.  Rate limit those messages.

Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
2015-07-04 14:08:30 -04:00
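
The rate limiting in the entry above can be sketched as a fixed-window limiter. This is a userspace illustration, not the kernel's ratelimit implementation or the NTB transport code: the struct, its fields, and ratelimit_allow are hypothetical names, and the caller supplies the current time in milliseconds.

```c
#include <stdbool.h>
#include <stdint.h>

struct ratelimit {
    uint64_t interval_ms;   /* length of one window */
    unsigned burst;         /* messages allowed per window */
    uint64_t window_start;  /* start time of the current window */
    unsigned printed;       /* messages already emitted in this window */
};

/*
 * Return true if a message may be emitted at time now_ms.
 * A new window starts once interval_ms has elapsed; within a window
 * at most `burst` messages are allowed and the rest are dropped.
 */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now_ms)
{
    if (now_ms - rl->window_start >= rl->interval_ms) {
        rl->window_start = now_ms;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    return false;
}
```

A polling loop that logs "link down" on every retry would wrap the log call in ratelimit_allow(), so the console sees a bounded number of messages per interval while the peer is still coming up.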