Commit graph

39146 commits

Author SHA1 Message Date
Dave Chinner
179073620d Merge branch 'xfs-ioctl-setattr-cleanup' into for-next 2015-02-02 10:57:30 +11:00
Iustin Pop
9b94fcc398 xfs: fix behaviour of XFS_IOC_FSSETXATTR on directories
Currently, the ioctl handling code for XFS_IOC_FSSETXATTR treats all
targets as regular files: it refuses to change the extent size if
extents are allocated. This is wrong for directories, as there the
extent size is only used as a default for children.

The patch fixes this issue and improves validation of flag
combinations:

- only disallow extent size changes after extents have been allocated
  for regular files
- only allow XFS_XFLAG_EXTSIZE for regular files
- only allow XFS_XFLAG_EXTSZINHERIT for directories
- automatically clear the flags if the extent size is zero

Thanks to Dave Chinner for guidance on the proper fix for this issue.

[dchinner: ported changes onto cleanup series. Makes changes clear
	   and obvious.]
[dchinner: added comments documenting validity checking rules.]

Signed-off-by: Iustin Pop <iustin@k1024.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:26:26 +11:00
Dave Chinner
23bd0735cf xfs: factor projid hint checking out of xfs_ioctl_setattr
The project ID change checking is one of the few remaining open
coded checks in xfs_ioctl_setattr(). Factor it into a helper
function so that the setattr code mostly becomes a flow of check
and action helpers, making it easier to read and follow.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:22:53 +11:00
Dave Chinner
d4388d3c09 xfs: factor extsize hint checking out of xfs_ioctl_setattr
The extent size hint change checking is fairly complex, so isolate
that into it's own function. This simplifies the logic flow of the
setattr code, making it easier to read.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:22:20 +11:00
Dave Chinner
41c145271d xfs: XFS_IOCTL_SETXATTR can run in user namespaces
Currently XFS_IOCTL_SETXATTR will fail if run in a user namespace as
it it not allowed to change project IDs. The current code, however,
also prevents any other change being made as well, so things like
extent size hints cannot be set in user namespaces. This is wrong,
so only disallow access to project IDs and related flags from inside
the init namespace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:17:51 +11:00
Dave Chinner
fd179b9c3b xfs: kill xfs_ioctl_setattr behaviour mask
Now there is only one caller to xfs_ioctl_setattr that uses all the
functionality of the function we can kill the behviour mask and
start cleaning up the code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:16:25 +11:00
Dave Chinner
f96291f6a3 xfs: disaggregate xfs_ioctl_setattr
xfs_ioctl_setxflags doesn't need all of the functionailty in
xfs_ioctl_setattr() and now we have separate helper functions that
share the checks and modifications that xfs_ioctl_setxflags
requires. Hence disaggregate it from xfs_ioctl_setattr() to allow
further work to be done on xfs_ioctl_setattr.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:15:56 +11:00
Dave Chinner
8f3d17ab06 xfs: factor out xfs_ioctl_setattr transaciton preamble
The setup of the transaction is done after a random smattering of
checks and before another bunch of ioperations specific
validity checks. Pull all the preamble out into a helper function
that returns a transaction or error.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:15:35 +11:00
Dave Chinner
29a17c00d4 xfs: separate xflags from xfs_ioctl_setattr
The setting of the extended flags is down through two separate
interfaces, but they are munged together into xfs_ioctl_setattr
and make that function far more complex than it needs to be.
Separate it out into a helper function along with all the other
common inode changes and transaction manipulations in
xfs_ioctl_setattr().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:14:25 +11:00
Dave Chinner
817b6c480e xfs: FSX_NONBLOCK is not used
It is set if the filp is set ot non-blocking, but the flag is not
used anywhere. Hence we can kill it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:14:04 +11:00
Dave Chinner
3fd1b0d158 Merge branch 'xfs-misc-fixes-for-3.20-3' into for-next 2015-02-02 10:03:18 +11:00
Christoph Hellwig
2ba6623702 xfs: don't allocate an ioend for direct I/O completions
Back in the days when the direct I/O ->end_io callback could be called
from interrupt context for AIO we needed a structure to hand off to the
workqueue, and reused the ioend structure for this purpose.  These days
->end_io is always called from user or workqueue context, which allows us
to avoid this memory allocation and simplify the code significantly.

[dchinner: removed now unused xfs_finish_ioend_sync() function after
	   Brian Foster did an initial review. ]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 10:02:09 +11:00
Wang, Yalin
f3d215526e xfs: change kmem_free to use generic kvfree()
Change kmem_free to use kvfree() generic function, remove the
duplicated code.

Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 09:54:18 +11:00
Christoph Hellwig
8add71ca3f xfs: factor out a xfs_update_prealloc_flags() helper
This logic is duplicated in xfs_file_fallocate and xfs_ioc_space, and
we'll need another copy of it for pNFS block support.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-02 09:53:56 +11:00
Linus Torvalds
bc208e0ee0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fix from Chris Mason:
 "We have one more fix for btrfs in my for-linus branch - this was a bug
  in the new raid5/6 scrubbing support"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: fix raid56 scrub failed in xfstests btrfs/072
2015-01-30 14:25:52 -08:00
Linus Torvalds
92ef9ce301 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and UDF fix from Jan Kara:
 "A fix for UDF to properly free preallocated blocks and a fix for quota
  so that Q_GETQUOTA quotactl reports correct numbers for XFS filesystem
  (and similarly Q_XGETQUOTA quotactl works properly for other
  filesystems)"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Switch ->get_dqblk() and ->set_dqblk() to use bytes as space units
  udf: Release preallocation on last writeable close
2015-01-30 13:46:04 -08:00
Jan Kara
b10a08194c quota: Store maximum space limit in bytes
Currently maximum space limit quota format supports is in blocks however
since we store space limits in bytes, this is somewhat confusing. So
store the maximum limit in bytes as well. Also rename the field to match
the new unit and related inode field to match the new naming scheme.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-30 12:51:21 +01:00
Jan Kara
aaa3daed15 quota: Remove quota_on_meta callback
There are no more users for quota_on_meta callback. Just remove it.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-30 12:51:10 +01:00
Jan Kara
664dbd5fe5 ocfs2: Use generic helpers for quotaon and quotaoff
Ocfs2 can just use the generic helpers provided by quota code for
turning quotas on and off when quota files are stored as system inodes.
The only difference is the feature test in ocfs2_quota_on() and that is
covered by dquot_quota_enable() checking whether usage tracking is
enabled (which can happen only if the filesystem has the quota feature
set).

Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-30 12:50:58 +01:00
Jan Kara
1fa5efe362 ext4: Use generic helpers for quotaon and quotaoff
Ext4 can just use the generic helpers provided by quota code for turning
quotas on and off when quota files are stored as system inodes. The only
difference is the feature test in ext4_quota_on_sysfile() but the same
is achieved in dquot_quota_enable() by checking whether usage tracking
for the corresponding quota type is enabled (which can happen only if
quota feature is set).

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-30 12:50:42 +01:00
Jan Kara
3e2af67e66 quota: Add ->quota_{enable,disable} callbacks for VFS quotas
Add functions which translate ->quota_enable / ->quota_disable calls
into appropriate changes in VFS quota. This will enable filesystems
supporting VFS quota files in system inodes to be controlled via
Q_XQUOTA[ON|OFF] quotactls for better userspace compatibility.

Also provide a vector for quotactl using these functions which can be
used by filesystems with quota files stored in hidden system files.

Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-30 12:50:32 +01:00
Jan Kara
d3b8632485 quota: Wire up ->quota_{enable,disable} callbacks into Q_QUOTA{ON,OFF}
Make Q_QUOTAON / Q_QUOTAOFF quotactl call ->quota_enable /
->quota_disable callback when provided. To match current behavior of
ocfs2 & ext4 we make these quotactls turn on / off quota enforcement for
appropriate quota type.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-30 12:50:16 +01:00
Jan Kara
38e478c448 quota: Split ->set_xstate callback into two
Split ->set_xstate callback into two callbacks - one for turning quotas
on (->quota_enable) and one for turning quotas off (->quota_disable). That
way we don't have to pass quotactl command into the callback which seems
cleaner.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-30 12:49:40 +01:00
Jan Kara
1cd6b7be92 Merge branch 'for_linus' into for_next 2015-01-30 10:16:33 +01:00
Linus Torvalds
353a0c6fcc NFS client bugfixes for Linux 3.19
Highlights include:
 - Stable fix for a NFSv4.1 Oops on mount
 - Stable fix for an O_DIRECT deadlock condition
 - Fix an issue with submounted volumes and fake duplicate inode numbers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUyqe0AAoJEGcL54qWCgDyIDcP/RZmf7SasgP+i1c4oWGMrqsn
 69DaPRjakbCEJcX00DIp3x2Ulq4uUgJ5v9WDEPkQBKe2lAg7uaQBTzY6x1erDs+3
 bBZFXd0SByfRD8M6UXQwxtbxksBfozItLNU7vbln/0BiIoabO8QVxNhX6fMfK7yt
 tl4Ik/Dr/o3Oe3pVR0wW+0eqEYcGGXzTpPdYoN470z1/DjV9w5rkm3HsPzoq3QHI
 CN3acBJXFrWdCMSJcyzjmeaWRO93FwLSwKS4KKB7hcFY1+pUgA9qhnA5W4GY/0/7
 1ovakR778CPD2YAaQuY9iJG561QixiTSRQHRWoI6LgawQ3JyBMt/eqtcfFI6Olxp
 Mjs+EMjf6hlrFUpBNZ/8dAROJYSU35sZ6y9C5Ch7ZGBhcgbiQCOVyKjNiBWRG8KX
 +xnufzs4tnSIScMHuPyGYpkXy/xEdRhbtm+NjVbvel7Hr37q0jlV2M0/4i54HJ8U
 UWMyHfGFnODKkFGByJc9w1XWSGcEZmRsCAsevGI9Yzf+FSTxrdkLWwdVFFOonHQw
 OfAlYmO29v4zdTbl1dCgU1pyqdDb47Js/07TQCLb2YcRiUfAuysUTFtBSkRDbwCf
 eYKnRvXahPqTB84TDM62HUCJ7y4D945wt+/6FvuYnDYgOeTARXGzAZ+jhv1i8UbJ
 8xX2xcdW6fe2DIfZM/9h
 =eGYe
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.19-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

   - Stable fix for a NFSv4.1 Oops on mount
   - Stable fix for an O_DIRECT deadlock condition
   - Fix an issue with submounted volumes and fake duplicate inode
     numbers"

* tag 'nfs-for-3.19-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Fix use of nfs_attr_use_mounted_on_fileid()
  NFSv4.1: Fix an Oops in nfs41_walk_client_list
  nfs: fix dio deadlock when O_DIRECT flag is flipped
2015-01-29 15:18:12 -08:00
Ingo Molnar
3c01b74e81 * Move efivarfs from the misc filesystem section to pseudo filesystem,
since that's a more logical and accurate place - Leif Lindholm
 
  * Update efibootmgr URL in Kconfig help - Peter Jones
 
  * Improve accuracy of EFI guid function names - Borislav Petkov
 
  * Expose firmware platform size in sysfs for the benefit of EFI boot
    loader installers and other utilities - Steve McIntyre
 
  * Cleanup __init annotations for arm64/efi code - Ard Biesheuvel
 
  * Mark the UIE as unsupported for rtc-efi - Ard Biesheuvel
 
  * Fix memory leak in error code path of runtime map code - Dan Carpenter
 
  * Improve robustness of get_memory_map() by removing assumptions on the
    size of efi_memory_desc_t (which could change in future spec
    versions) and querying the firmware instead of guessing about the
    memmap size - Ard Biesheuvel
 
  * Remove superfluous guid unparse calls - Ivan Khoronzhuk
 
  * Delete unnecessary chosen@0 DT node FDT code since was duplicated
    from code in drivers/of and is entirely unnecessary - Leif Lindholm
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUv69oAAoJEC84WcCNIz1VEYgP/1b27WRfCXs4q/8FP+UheSDS
 nAFbGe9PjVPnxo5pA9VwPP6eNQ2zYiyNGEK1BlbQlFPZdSD1updIraA78CiF5iys
 iSYyG9xVIcTB23RZI8aJLnBXbosIUKPJZ3FORv1LPhI6Mz1rCpraEaaUlv67rUKr
 FLBG9cR7t9f/f+fJw6LOAAISGIG/4s0wQdA5/noaYkj5R5bICl2UTGtbwa0oNstb
 NUO93aKDgaG/VljpIEeG6XV96Ioz7cHjQsEaX8sTrvT0n7nPNIqSDjFJOqWKJOXl
 RsFrzyl8fFIbMuQatYv1f3efPvyH+iKOfHnHrvcjUNje0xhm7F0Bd86BkOw1a3JQ
 pNb0YUWecI0Z/8GSzN8X0JQ7cowa3wI15Z/Hfs03odTXiM6VqwFAhuz/s5DEUdKS
 U+rOPjU0ezt3G4oBB/VGgF9w5JWKfsMcsHgmLX9P+JYzKFrxggo1SXAtXUeRAqQp
 agKmUB+k6Y1baQO8efkoM7rKL2F0q1SR9QiK+16BHCCkevD23v7IFGrHm2r1xKil
 kvWlY4MkRVa4KGPxEFEDVty0HjXxImwYsxTaYVHTS7SMeoP41f6koHKB19NaB3No
 5fqn/rT1KcJuhQj/I+vAixIX4WMJkX/MQVbtKfqSaKlAiRg3eRY6ONYr0jOglfF6
 gaMuvmDd0HlV6UJvH/9L
 =iPpM
 -----END PGP SIGNATURE-----

Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/efi

Pull EFI updates from Matt Fleming:

" - Move efivarfs from the misc filesystem section to pseudo filesystem,
    since that's a more logical and accurate place - Leif Lindholm

  - Update efibootmgr URL in Kconfig help - Peter Jones

  - Improve accuracy of EFI guid function names - Borislav Petkov

  - Expose firmware platform size in sysfs for the benefit of EFI boot
    loader installers and other utilities - Steve McIntyre

  - Cleanup __init annotations for arm64/efi code - Ard Biesheuvel

  - Mark the UIE as unsupported for rtc-efi - Ard Biesheuvel

  - Fix memory leak in error code path of runtime map code - Dan Carpenter

  - Improve robustness of get_memory_map() by removing assumptions on the
    size of efi_memory_desc_t (which could change in future spec
    versions) and querying the firmware instead of guessing about the
    memmap size - Ard Biesheuvel

  - Remove superfluous guid unparse calls - Ivan Khoronzhuk

  - Delete unnecessary chosen@0 DT node FDT code since was duplicated
    from code in drivers/of and is entirely unnecessary - Leif Lindholm

   There's nothing super scary, mainly cleanups, and a merge from Ricardo who
   kindly picked up some patches from the linux-efi mailing list while I
   was out on annual leave in December.

   Perhaps the biggest risk is the get_memory_map() change from Ard, which
   changes the way that both the arm64 and x86 EFI boot stub build the
   early memory map. It would be good to have it bake in linux-next for a
   while.
"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-29 19:16:40 +01:00
Jan Kara
14bf61ffe6 quota: Switch ->get_dqblk() and ->set_dqblk() to use bytes as space units
Currently ->get_dqblk() and ->set_dqblk() use struct fs_disk_quota which
tracks space limits and usage in 512-byte blocks. However VFS quotas
track usage in bytes (as some filesystems require that) and we need to
somehow pass this information. Upto now it wasn't a problem because we
didn't do any unit conversion (thus VFS quota routines happily stuck
number of bytes into d_bcount field of struct fd_disk_quota). Only if
you tried to use Q_XGETQUOTA or Q_XSETQLIM for VFS quotas (or Q_GETQUOTA
/ Q_SETQUOTA for XFS quotas), you got bogus results. Hardly anyone
tried this but reportedly some Samba users hit the problem in practice.
So when we want interfaces compatible we need to fix this.

We bite the bullet and define another quota structure used for passing
information from/to ->get_dqblk()/->set_dqblk. It's somewhat sad we have
to have more conversion routines in fs/quota/quota.c and another copying
of quota structure slows down getting of quota information by about 2%
but it seems cleaner than overloading e.g. units of d_bcount to bytes.

CC: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-28 09:01:40 +01:00
Jan Kara
b07ef35244 udf: Release preallocation on last writeable close
Commit 6fb1ca92a6 "udf: Fix race between write(2) and close(2)"
changed the condition when preallocation is released. The idea was that
we don't want to release the preallocation for an inode on close when
there are other writeable file descriptors for the inode. However the
condition was written in the opposite way so we released preallocation
only if there were other writeable file descriptors. Fix the problem by
changing the condition properly.

CC: stable@vger.kernel.org
Fixes: 6fb1ca92a6
Reported-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-28 09:00:40 +01:00
David S. Miller
95f873f2ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	arch/arm/boot/dts/imx6sx-sdb.dts
	net/sched/cls_bpf.c

Two simple sets of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-27 16:59:56 -08:00
Gui Hecheng
063c54dccd btrfs: fix raid56 scrub failed in xfstests btrfs/072
The xfstests btrfs/072 reports uncorrectable read errors in dmesg,
because scrub forgets to use commit_root for parity scrub routine
and scrub attempts to scrub those extents items whose contents are
not fully on disk.

To fix it, we just add the @search_commit_root flag back.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-27 15:26:16 -08:00
Niklas Cassel
7a1ceba071 cifs: fix MUST SecurityFlags filtering
If CONFIG_CIFS_WEAK_PW_HASH is not set, CIFSSEC_MUST_LANMAN
and CIFSSEC_MUST_PLNTXT is defined as 0.

When setting new SecurityFlags without any MUST flags,
your flags would be overwritten with CIFSSEC_MUST_LANMAN (0).

Signed-off-by: Niklas Cassel <niklass@axis.com>
Signed-off-by: Steve French <steve.french@primarydata.com>
2015-01-26 19:38:26 -06:00
Bob Peterson
45094a58b1 GFS2: Eliminate a nonsense goto
This patch just removes a goto that did nothing.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2015-01-26 20:58:54 +00:00
Linus Torvalds
80a755545d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "A couple of fixes - deadlock in CIFS and build breakage in cris serial
  driver (resurfaced f_dentry in there)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  VFS: Convert file->f_dentry->d_inode to file_inode()
  fix deadlock in cifs_ioctl_clone()
2015-01-25 17:27:18 -08:00
Linus Torvalds
c4e00f1d31 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "We have a few fixes in my for-linus branch.

  Qu Wenruo's batch fix a regression between some our merge window pull
  and the inode_cache feature.  The rest are smaller bugs"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  btrfs: Fix the bug that fs_info->pending_changes is never cleared.
  btrfs: fix state->private cast on 32 bit machines
  Btrfs: fix race deleting block group from space_info->ro_bgs list
  Btrfs: fix incorrect freeing in scrub_stripe
  btrfs: sync ioctl, handle errors after transaction start
2015-01-24 14:31:27 +12:00
Jeff Layton
8116bf4cb6 locks: update comments that refer to inode->i_flock
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2015-01-21 20:44:01 -05:00
Brian Foster
4d949021aa xfs: remove incorrect error negation in attr_multi ioctl
xfs_compat_attrmulti_by_handle() calls memdup_user() which returns a
negative error code. The error code is negated by the caller and thus
incorrectly converted to a positive error code.

Remove the error negation such that the negative error is passed
correctly back up to userspace.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 10:04:24 +11:00
Dave Chinner
438c3c8d2b Merge branch 'xfs-buf-type-fixes' into for-next 2015-01-22 09:51:30 +11:00
Dave Chinner
3443a3bca5 xfs: set superblock buffer type correctly
When the superblock is modified in a transaction, the commonly
modified fields are not actually copied to the superblock buffer to
avoid the buffer lock becoming a serialisation point. However, there
are some other operations that modify the superblock fields within
the transaction that don't directly log to the superblock but rely
on the changes to be applied during the transaction commit (to
minimise the buffer lock hold time).

When we do this, we fail to mark the buffer log item as being a
superblock buffer and that can lead to the buffer not being marked
with the corect type in the log and hence causing recovery issues.
Fix it by setting the type correctly, similar to xfs_mod_sb()...

cc: <stable@vger.kernel.org> # 3.10 to current
Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:30:23 +11:00
Dave Chinner
fe22d552b8 xfs: set buf types when converting extent formats
Conversion from local to extent format does not set the buffer type
correctly on the new extent buffer when a symlink data is moved out
of line.

Fix the symlink code and leave a comment in the generic bmap code
reminding us that the format-specific data copy needs to set the
destination buffer type appropriately.

cc: <stable@vger.kernel.org> # 3.10 to current
Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:30:06 +11:00
Dave Chinner
f19b872b08 xfs: inode unlink does not set AGI buffer type
This leads to log recovery throwing errors like:

XFS (md0): Mounting V5 Filesystem
XFS (md0): Starting recovery (logdev: internal)
XFS (md0): Unknown buffer type 0!
XFS (md0): _xfs_buf_ioapply: no ops on block 0xaea8802/0x1
ffff8800ffc53800: 58 41 47 49 .....

Which is the AGI buffer magic number.

Ensure that we set the type appropriately in both unlink list
addition and removal.

cc: <stable@vger.kernel.org> # 3.10 to current
Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:29:40 +11:00
Dave Chinner
0d612fb570 xfs: ensure buffer types are set correctly
Jan Kara reported that log recovery was finding buffers with invalid
types in them. This should not happen, and indicates a bug in the
logging of buffers. To catch this, add asserts to the buffer
formatting code to ensure that the buffer type is in range when the
transaction is committed.

We don't set a type on buffers being marked stale - they are not
going to get replayed, the format item exists only for recovery to
be able to prevent replay of the buffer, so the type does not
matter. Hence that needs special casing here.

cc: <stable@vger.kernel.org> # 3.10 to current
Reported-by: Jan Kara <jack@suse.cz>
Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:29:05 +11:00
Dave Chinner
465e2def7c Merge branch 'xfs-sb-logging-rework' into for-next
Conflicts:
	fs/xfs/xfs_mount.c
2015-01-22 09:20:53 +11:00
Anna Schumaker
2ef47eb1ae NFS: Fix use of nfs_attr_use_mounted_on_fileid()
This function call was being optimized out during nfs_fhget(), leading
to situations where we have a valid fileid but still want to use the
mounted_on_fileid.  For example, imagine we have our server configured
like this:

server % df
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1       9.1G  6.5G  1.9G  78% /
/dev/vdb1       487M  2.3M  456M   1% /exports
/dev/vdc1       487M  2.3M  456M   1% /exports/vol1
/dev/vdd1       487M  2.3M  456M   1% /exports/vol2

If our client mounts /exports and tries to do a "chown -R" across the
entire mountpoint, we will get a nasty message warning us about a circular
directory structure.  Running chown with strace tells me that each directory
has the same device and inode number:

newfstatat(AT_FDCWD, "/nfs/", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
newfstatat(4, "vol1", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0
newfstatat(4, "vol2", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0

With this patch the mounted_on_fileid values are used for st_ino, so the
directory loop warning isn't reported.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-01-21 17:15:41 -05:00
Dave Chinner
074e427ba7 xfs: sanitise sb_bad_features2 handling
We currently have to ensure that every time we update sb_features2
that we update sb_bad_features2. Now that we log and format the
superblock in it's entirety we actually don't have to care because
we can simply update the sb_bad_features2 when we format it into the
buffer. This removes the need for anything but the mount and
superblock formatting code to care about sb_bad_features2, and
hence removes the possibility that we forget to update bad_features2
when necessary in the future.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:10:33 +11:00
Dave Chinner
61e63ecb57 xfs: consolidate superblock logging functions
We now have several superblock loggin functions that are identical
except for the transaction reservation and whether it shoul dbe a
synchronous transaction or not. Consolidate these all into a single
function, a single reserveration and a sync flag and call it
xfs_sync_sb().

Also, xfs_mod_sb() is not really a modification function - it's the
operation of logging the superblock buffer. hence change the name of
it to reflect this.

Note that we have to change the mp->m_update_flags that are passed
around at mount time to a boolean simply to indicate a superblock
update is needed.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:10:31 +11:00
Dave Chinner
4d11a40239 xfs: remove bitfield based superblock updates
When we log changes to the superblock, we first have to write them
to the on-disk buffer, and then log that. Right now we have a
complex bitfield based arrangement to only write the modified field
to the buffer before we log it.

This used to be necessary as a performance optimisation because we
logged the superblock buffer in every extent or inode allocation or
freeing, and so performance was extremely important. We haven't done
this for years, however, ever since the lazy superblock counters
pulled the superblock logging out of the transaction commit
fast path.

Hence we have a bunch of complexity that is not necessary that makes
writing the in-core superblock to disk much more complex than it
needs to be. We only need to log the superblock now during
management operations (e.g. during mount, unmount or quota control
operations) so it is not a performance critical path anymore.

As such, remove the complex field based logging mechanism and
replace it with a simple conversion function similar to what we use
for all other on-disk structures.

This means we always log the entirity of the superblock, but again
because we rarely modify the superblock this is not an issue for log
bandwidth or CPU time. Indeed, if we do log the superblock
frequently, delayed logging will minimise the impact of this
overhead.

[Fixed gquota/pquota inode sharing regression noticed by bfoster.]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:10:26 +11:00
Trond Myklebust
3175e1dcec NFSv4.1: Fix an Oops in nfs41_walk_client_list
If we start state recovery on a client that failed to initialise correctly,
then we are very likely to Oops.

Reported-by: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>
Link: http://lkml.kernel.org/r/130621862.279655.1421851650684.JavaMail.zimbra@desy.de
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-01-21 17:10:09 -05:00
Peng Tao
ee8a1a8b16 nfs: fix dio deadlock when O_DIRECT flag is flipped
We only support swap file calling nfs_direct_IO. However, application
might be able to get to nfs_direct_IO if it toggles O_DIRECT flag
during IO and it can deadlock because we grab inode->i_mutex in
nfs_file_direct_write(). So return 0 for such case. Then the generic
layer will fall back to buffer IO.

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-01-21 17:10:08 -05:00
Jan Kara
a39427007e xfs: Remove some pointless quota checks
xfs_fs_get_xstate() and xfs_fs_get_xstatev() check whether there's quota
running before calling xfs_qm_scall_getqstat() or
xfs_qm_scall_getqstatv(). Thus we are certain that superblock supports
quota and xfs_sb_version_hasquota() check is pointless. Similarly we
know that when quota is running, mp->m_quotainfo will be allocated.

Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-21 19:21:33 +01:00
Jan Kara
fbf64b3df3 xfs: Remove some useless flags tests
'flags' have XFS_ALL_QUOTA_ACCT cleared immediately on function entry.
There's no point in checking these bits later in the function. Also
because we check something is going to change, we know some enforcement
bits are being added and thus there's no point in testing that later.

Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-21 19:21:32 +01:00
Jan Kara
8a2fdd4a49 xfs: Remove useless test
Q_XQUOTARM is never passed to xfs_fs_set_xstate() so remove the test.

Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-21 19:21:32 +01:00
Jan Kara
ca6cb0918e quota: Verify flags passed to Q_SETINFO
Currently flags passed via Q_SETINFO were just stored. This makes it
hard to add new flags since in theory userspace could be just setting /
clearing random flags. Since currently there is only one userspace
settable flag and that is somewhat obscure flags only for ancient v1
quota format, I'm reasonably sure noone operates these flags and
hopefully we are fine just adding the check that passed flags are sane.
If we indeed find some userspace program that gets broken by the strict
check, we can always remove it again.

Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-21 19:21:31 +01:00
Jan Kara
9c45101e88 quota: Cleanup flags definitions
Currently all quota flags were defined just in kernel-private headers.
Export flags readable / writeable from userspace to userspace via
include/uapi/linux/quota.h.

Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-21 19:21:30 +01:00
Jan Kara
96827adcc2 ocfs2: Move OLQF_CLEAN flag out of generic quota flags
OLQF_CLEAN flag is used by OCFS2 on disk to recognize whether quota
recovery is needed or not. We also somewhat abuse mem_dqinfo->dqi_flags
to pass this flag around. Use private flags for this to avoid clashes
with other quota flags / not pollute generic quota flag namespace.

Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-21 19:21:30 +01:00
Jan Kara
c119c5b974 quota: Don't store flags for v2 quota format
Currently, v2 quota format blindly stored flags from in-memory dqinfo on
disk, although there are no flags supported. Since it is stupid to store
flags which have no effect, just store 0 unconditionally and don't
bother loading it from disk.

Note that userspace could have stored some flags there via Q_SETINFO
quotactl and then later read them (although flags have no effect) but
I'm pretty sure noone does that (most definitely quota-tools don't and
quota interface doesn't have too much other users).

Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-21 19:21:07 +01:00
Ingo Molnar
f49028292c Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

  - Documentation updates.

  - Miscellaneous fixes.

  - Preemptible-RCU fixes, including fixing an old bug in the
    interaction of RCU priority boosting and CPU hotplug.

  - SRCU updates.

  - RCU CPU stall-warning updates.

  - RCU torture-test updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-21 06:12:21 +01:00
Qu Wenruo
a53f4f8e9c btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
Commit 6b5fe46dfa (btrfs: do commit in sync_fs if there are pending
changes) will call btrfs_start_transaction() in sync_fs(), to handle
some operations needed to be done in next transaction.

However this can cause deadlock if the filesystem is frozen, with the
following sys_r+w output:
[  143.255932] Call Trace:
[  143.255936]  [<ffffffff816c0e09>] schedule+0x29/0x70
[  143.255939]  [<ffffffff811cb7f3>] __sb_start_write+0xb3/0x100
[  143.255971]  [<ffffffffa040ec06>] start_transaction+0x2e6/0x5a0
[btrfs]
[  143.255992]  [<ffffffffa040f1eb>] btrfs_start_transaction+0x1b/0x20
[btrfs]
[  143.256003]  [<ffffffffa03dc0ba>] btrfs_sync_fs+0xca/0xd0 [btrfs]
[  143.256007]  [<ffffffff811f7be0>] sync_fs_one_sb+0x20/0x30
[  143.256011]  [<ffffffff811cbd01>] iterate_supers+0xe1/0xf0
[  143.256014]  [<ffffffff811f7d75>] sys_sync+0x55/0x90
[  143.256017]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
[  143.256111] Call Trace:
[  143.256114]  [<ffffffff816c0e09>] schedule+0x29/0x70
[  143.256119]  [<ffffffff816c3405>] rwsem_down_write_failed+0x1c5/0x2d0
[  143.256123]  [<ffffffff8133f013>] call_rwsem_down_write_failed+0x13/0x20
[  143.256131]  [<ffffffff811caae8>] thaw_super+0x28/0xc0
[  143.256135]  [<ffffffff811db3e5>] do_vfs_ioctl+0x3f5/0x540
[  143.256187]  [<ffffffff811db5c1>] SyS_ioctl+0x91/0xb0
[  143.256213]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17

The reason is like the following:
(Holding s_umount)
VFS sync_fs staff:
|- btrfs_sync_fs()
   |- btrfs_start_transaction()
      |- sb_start_intwrite()
      (Waiting thaw_fs to unfreeze)
					VFS thaw_fs staff:
					thaw_fs()
					(Waiting sync_fs to release
					 s_umount)

So deadlock happens.
This can be easily triggered by fstest/generic/068 with inode_cache
mount option.

The fix is to check if the fs is frozen, if the fs is frozen, just
return and waiting for the next transaction.

Cc: David Sterba <dsterba@suse.cz>
Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
[enhanced comment, changed to SB_FREEZE_WRITE]
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-20 17:20:21 -08:00
Qu Wenruo
6c9fe14f9d btrfs: Fix the bug that fs_info->pending_changes is never cleared.
Fs_info->pending_changes is never cleared since the original code uses
cmpxchg(&fs_info->pending_changes, 0, 0), which will only clear it if
pending_changes is already 0.

This will cause a lot of problem when mount it with inode_cache mount
option.
If the btrfs is mounted as inode_cache, pending_changes will always be
1, even when the fs is frozen.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-20 17:19:40 -08:00
Sachin Prabhu
ca7df8e0bb Complete oplock break jobs before closing file handle
Commit
c11f1df500
requires writers to wait for any pending oplock break handler to
complete before proceeding to write. This is done by waiting on bit
CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
cleared by the oplock break handler job queued on the workqueue once it
has completed handling the oplock break allowing writers to proceed with
writing to the file.

While testing, it was noticed that the filehandle could be closed while
there is a pending oplock break which results in the oplock break
handler on the cifsiod workqueue being cancelled before it has had a
chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
Any subsequent attempt to write to this file hangs waiting for the
CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.

We fix this by ensuring that we also clear the bit
CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
from the workqueue.

The bug was found by Red Hat QA while testing using ltp's fsstress
command.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Jeff Layton <jlayton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <steve.french@primarydata.com>
2015-01-19 20:20:46 -06:00
Giel van Schijndel
f99dbfa4b3 cifs: use memzero_explicit to clear stack buffer
When leaving a function use memzero_explicit instead of memset(0) to
clear stack allocated buffers. memset(0) may be optimized away.

This particular buffer is highly likely to contain sensitive data which
we shouldn't leak (it's named 'passwd' after all).

Signed-off-by: Giel van Schijndel <me@mortis.eu>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Reported-at: http://www.viva64.com/en/b/0299/
Reported-by: Andrey Karpov
Reported-by: Svyatoslav Razmyslov
Signed-off-by: Steve French <steve.french@primarydata.com>
2015-01-19 15:32:13 -06:00
Satoru Takeuchi
6e1103a6e9 btrfs: fix state->private cast on 32 bit machines
Suppress the following warning displayed on building 32bit (i686) kernel.

===============================================================================
...
   CC [M]  fs/btrfs/extent_io.o
fs/btrfs/extent_io.c: In function ‘btrfs_free_io_failure_record’:
fs/btrfs/extent_io.c:2193:13: warning: cast to pointer from integer of
different size [-Wint-to-pointer-cast]
    failrec = (struct io_failure_record *)state->private;
...
===============================================================================

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Reported-by: Chris Murphy <chris@colorremedies.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-19 13:06:06 -08:00
Filipe Manana
75c68e9fbb Btrfs: fix race deleting block group from space_info->ro_bgs list
When removing a block group we were deleting it from its space_info's
ro_bgs list without the correct protection - the space info's spinlock.
Fix this by doing the list delete while holding the spinlock of the
corresponding space info, which is the correct lock for any operation
on that list.

This issue was introduced in the 3.19 kernel by the following change:

    Btrfs: move read only block groups onto their own list V2
    commit 633c0aad4c

I ran into a kernel crash while a task was running statfs, which iterates
the space_info->ro_bgs list while holding the space info's spinlock,
and another task was deleting it from the same list, without holding that
spinlock, as part of the block group remove operation (while running the
function btrfs_remove_block_group). This happened often when running the
stress test xfstests/generic/038 I recently made.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-19 13:05:45 -08:00
Tsutomu Itoh
379d6854a2 Btrfs: fix incorrect freeing in scrub_stripe
The address that should be freed is not 'ppath' but 'path'.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Reviewed-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-19 13:05:44 -08:00
David Sterba
98bd5c547e btrfs: sync ioctl, handle errors after transaction start
The version merged to 3.19 did not handle errors from start_trancaction
and could pass an invalid pointer to commit_transaction.

Fixes: 6b5fe46dfa ("btrfs: do commit in sync_fs if there are pending changes")
Reported-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2015-01-19 13:05:44 -08:00
Al Viro
378ff1a53b fix deadlock in cifs_ioctl_clone()
It really needs to check that src is non-directory *and* use
{un,}lock_two_nodirectories().  As it is, it's trivial to cause
double-lock (ioctl(fd, CIFS_IOC_COPYCHUNK_FILE, fd)) and if the
last argument is an fd of directory, we are asking for trouble
by violating the locking order - all directories go before all
non-directories.  If the last argument is an fd of parent
directory, it has 50% odds of locking child before parent,
which will cause AB-BA deadlock if we race with unlink().

Cc: stable@vger.kernel.org @ 3.13+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-01-18 23:49:26 -05:00
Johannes Berg
053c095a82 netlink: make nlmsg_end() and genlmsg_end() void
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.

This makes the very common pattern of

  if (genlmsg_end(...) < 0) { ... }

be a whole bunch of dead code. Many places also simply do

  return nlmsg_end(...);

and the caller is expected to deal with it.

This also commonly (at least for me) causes errors, because it is very
common to write

  if (my_function(...))
    /* error condition */

and if my_function() does "return nlmsg_end()" this is of course wrong.

Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.

Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did

-	return nlmsg_end(...);
+	nlmsg_end(...);
+	return 0;

I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with <= 0 in dump functionality, but that could just
be changed to < 0 with no change in behaviour, so I opted for the more
efficient version.

One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for <0 or <=0 and thus broke out of the loop every single time.
I've preserved this since it will (I think) have caused the messages to
userspace to be formatted differently with just a single message for
every SKB returned to userspace. It's possible that this isn't needed
for the tools that actually use this, but I don't even know what they
are so couldn't test that changing this behaviour would be acceptable.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 01:03:45 -05:00
Jeff Layton
3d8e560de4 locks: consolidate NULL i_flctx checks in locks_remove_file
We have each of the locks_remove_* variants doing this individually.
Have the caller do it instead, and have locks_remove_flock and
locks_remove_lease just assume that it's a valid pointer.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2015-01-16 16:08:50 -05:00
Jeff Layton
9bd0f45b70 locks: keep a count of locks on the flctx lists
This makes things a bit more efficient in the cifs and ceph lock
pushing code.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 16:08:50 -05:00
Jeff Layton
7448cc37b1 locks: clean up the lm_change prototype
Now that we use standard list_heads for tracking leases, we can have
lm_change take a pointer to the lease to be modified instead of a
double pointer.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 16:08:50 -05:00
Jeff Layton
6109c85037 locks: add a dedicated spinlock to protect i_flctx lists
We can now add a dedicated spinlock without expanding struct inode.
Change to using that to protect the various i_flctx lists.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 16:08:49 -05:00
Jeff Layton
8634b51f6c locks: convert lease handling to file_lock_context
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 16:08:17 -05:00
Jeff Layton
bd61e0a9c8 locks: convert posix locks to file_lock_context
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 16:08:16 -05:00
Jeff Layton
5263e31e45 locks: move flock locks to file_lock_context
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 15:09:25 -05:00
Jeff Layton
c362781cad ceph: move spinlocking into ceph_encode_locks_to_buffer and ceph_count_locks
There is only a single call site for each of these functions, and the
caller takes the i_lock prior to calling them and drops it just
afterward. Move the spinlocking into the functions instead.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 15:09:25 -05:00
Jeff Layton
4a075e39c8 locks: add a new struct file_locking_context pointer to struct inode
The current scheme of using the i_flock list is really difficult to
manage. There is also a legitimate desire for a per-inode spinlock to
manage these lists that isn't the i_lock.

Start conversion to a new scheme to eventually replace the old i_flock
list with a new "file_lock_context" object.

We start by adding a new i_flctx to struct inode. For now, it lives in
parallel with i_flock list, but will eventually replace it. The idea is
to allocate a structure to sit in that pointer and act as a locus for
all things file locking.

We allocate a file_lock_context for an inode when the first lock is
added to it, and it's only freed when the inode is freed. We use the
i_lock to protect the assignment, but afterward it should mostly be
accessed locklessly.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 15:05:54 -05:00
Jeff Layton
dd459bb197 locks: have locks_release_file use flock_lock_file to release generic flock locks
...instead of open-coding it and removing flock locks directly. This
helps consolidate the flock lock removal logic into a single spot.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2015-01-16 15:05:54 -05:00
Jeff Layton
6dee60f69d locks: add new struct list_head to struct file_lock
...that we can use to queue file_locks to per-ctx list_heads. Go ahead
and convert locks_delete_lock and locks_dispose_list to use it instead
of the fl_block list.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16 15:05:54 -05:00
Linus Torvalds
62b1530065 Driver core fixes for 3.19-rc5
Here is one kernfs fix for a reported issue for 3.19-rc5.
 
 It has been in linux-next for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlS5V8QACgkQMUfUDdst+ynQTgCdEOUn6oftKCkErl4WWX9q0+ZT
 4CIAoLuGH9Gdn5tIVlqJ1tVmESnsgn0T
 =P84S
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg KH:
 "Here is one kernfs fix for a reported issue for 3.19-rc5.

  It has been in linux-next for a while"

* tag 'driver-core-3.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  kernfs: Fix kernfs_name_compare
2015-01-17 08:16:52 +13:00
Linus Torvalds
a2a32cd1d7 NFS client bugfixes for Linux 3.19
Highlights include:
 
 - Stable fix for a NFSv3/lockd race
 - Fixes for several NFSv4.1 client id trunking bugs
 - Remove an incorrect test when checking for delegated opens
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUuSAcAAoJEGcL54qWCgDyVGkQAJTTzJ8TcuDaXPz7a+8vYR/F
 0Z0iJ7Q2VzfVwLKReQdSMUhX8E6OsVe/HlmFT1paNdIOnF7gQvTclP7/GZ/EaDtA
 csBKkEcxoCRbF6OhCdeSMJC+iXs0cugCgYrkusPSGJj8dG+qa6Hix8rDfZKbM4yp
 sQphMozeI/FLHlckBTsViBCECcyR1FxjOrpXc1RPTz1u2dw9dDgZABdX00ubve2Y
 5dGIGyriFUiZzqFp+7GLXa2GITuLnda+IPPywyzF2MQy4XYWH63bXoWbyXL3kzQ7
 P7hLsGMOfl+pFOBnwYyXhhfqdsS/lOt8V+18Rrr1NhFx7VtQUae/VDPwLxnnoq0P
 ZtYiuJmIihzHfr4N5V2QR9TcuNO7fg6/1hxHZkOYxaYkH2IezFRWbtj9J/UJS+8q
 XDQzEvXMxtXCHKDQvmW653oQIdCFRR9YvgsN/YaxIB3vWPKL2tupeQWe/ZEdVvX3
 WemIxyc6NBS86otauVXOIKZFgiJ4mynpgH22xb43CJ33b049vJVu/NOQWHgrhHqQ
 7vpNDvpjyErjtoMVdN/x2L21uVTucZDXF59hfUD/Q9kE5qXGDRlqiFsDbkVvd+0C
 SXKBngtslBvgU3dYblAuJ9uMkTYkSE/vNkzVibPXJr4/cysVyMeeRDysVPHRZpw2
 8jH+yyHBxvK8XSDTxIkh
 =hz4r
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.19-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

   - Stable fix for a NFSv3/lockd race
   - Fixes for several NFSv4.1 client id trunking bugs
   - Remove an incorrect test when checking for delegated opens"

* tag 'nfs-for-3.19-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Remove incorrect check in can_open_delegated()
  NFS: Ignore transport protocol when detecting server trunking
  NFSv4/v4.1: Verify the client owner id during trunking detection
  NFSv4: Cache the NFSv4/v4.1 client owner_id in the struct nfs_client
  NFSv4.1: Fix client id trunking on Linux
  LOCKD: Fix a race when initialising nlmsvc_timeout
2015-01-17 07:59:06 +13:00
Linus Torvalds
cb59670870 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
 "This fixes a regression in the latest fuse update plus a fix for a
  rather theoretical memory ordering issue"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: add memory barrier to INIT
  fuse: fix LOOKUP vs INIT compat handling
2015-01-16 14:58:16 +13:00
NeilBrown
52d304eb4e locks: fix NULL-deref in generic_delete_lease
commit 0efaa7e82f
  locks: generic_delete_lease doesn't need a file_lock at all

moves the call to fl->fl_lmops->lm_change() to a place in the
code where fl might be a non-lease lock.
When that happens, fl_lmops is NULL and an Oops ensures.

So add an extra test to restore correct functioning.

Reported-by: Linda Walsh <suse@tlinx.org>
Link: https://bugzilla.suse.com/show_bug.cgi?id=912569
Cc: stable@vger.kernel.org (v3.18)
Fixes: 0efaa7e82f
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2015-01-13 07:00:55 -05:00
alex chen
3566c96476 GFS2: fix sprintf format specifier
Sprintf format specifier "%d" and "%u" are mixed up in
gfs2_recovery_done() and freeze_show(). So correct them.

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2015-01-13 10:48:57 +00:00
Fabian Frederick
bbe48dd811 udf: destroy sbi mutex in put_super
Call mutex_destroy() on superblock mutex in udf_put_super()
otherwise mutex debugging code isn't able to detect that
mutex is used after being freed.
(thanks to Jan Kara for complete definition).

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
2015-01-12 10:56:04 +01:00
Matt Fleming
17473c32ce Updates included:
*Rename the function efi_guid_unparse to reflect better its behavior. - Borislav Petkov
 
  *Update the EFI firmware Kconfig help with the URL of efibootmgr. - Peter Jones
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUrfd/AAoJECYUxVC6df4Sd7kP/RP4uzJQnQTteg2ys2Zk2sEe
 rWdjxtcQurnP5P07SqNV/dZIWQx6hG/l61herFzryq9JKNye70fCd8bGl4sOPiHW
 i64FJtmCF+3vgLhJK2s6Ysub/xPdC0zkRXIIHvulX1nwZRWXiPzkJPzFlkCbXaW1
 O3ykSBIT4EwXAe84dN1Oio476Zyf/rS81rKakYXKwYNrfSKTmyH2FUlZODaSL+0a
 oGpae/2kps6rqdlOq19J5VKcYE3CTzy+Pv3QhqVlgywWF0AIUOHrcWpsWTv4IhKr
 v8blQcy7QZ/QWOEg7Nnus48/Mh0sNdAy497ZV4/tgmBfkGUcHD5qX5/Fy6Yp+/0R
 YJfRv5fTDyxYp6krdEeUnOphaXZT2jkG8uRdLJxNBONWBkwf/i9QXcmwuJc0+sgF
 7NKP8/kbqc3A6blQ/RNQRdeLW0hAAXfrVHj/zW8Fbi0PJTn+ALTTyvjDhMwYOr46
 IVbm1CzGQLXGNDFWvxRlMbzQa5PktiiqOxymTNJ6HZwP8TnS5IE5f/iEVpj5Yd+z
 zbYQcpBiN8VnQsKBmq04865/OrNgQRcUC+OU6Mz1iIIEIF8NLBDoRfE/oX0I+dEC
 jc0wTszmr0uXAu6dP4tgQEaQ0oNdOE806+BZ++3bpaZ3GCScjYtDOnSdsFB5ByYp
 dsAqvfPW8uY+I27SXMat
 =aLDw
 -----END PGP SIGNATURE-----

Merge tag 'rneri-efi-next' of https://github.com/ricardon/efi into next

Pull EFI changes for v3.20 from Ricardo Neri, who kindly picked them up
while I was out on annual leave in December.

Updates included:

 *Rename the function efi_guid_unparse to reflect better its behavior. - Borislav Petkov

 *Update the EFI firmware Kconfig help with the URL of efibootmgr. - Peter Jones

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2015-01-12 09:33:25 +00:00
Linus Torvalds
5ab551d662 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc fixes: group scheduling corner case fix, two deadline scheduler
  fixes, effective_load() overflow fix, nested sleep fix, 6144 CPUs
  system fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix RCU stall upon -ENOMEM in sched_create_group()
  sched/deadline: Avoid double-accounting in case of missed deadlines
  sched/deadline: Fix migration of SCHED_DEADLINE tasks
  sched: Fix odd values in effective_load() calculations
  sched, fanotify: Deal with nested sleeps
  sched: Fix KMALLOC_MAX_SIZE overflow during cpumask allocation
2015-01-11 11:51:49 -08:00
Linus Torvalds
dc9319f5a3 Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linux
Pull two nfsd bugfixes from Bruce Fields.

* 'for-3.19' of git://linux-nfs.org/~bfields/linux:
  rpc: fix xdr_truncate_encode to handle buffer ending on page boundary
  nfsd: fix fi_delegees leak when fi_had_conflict returns true
2015-01-09 18:10:48 -08:00
Linus Torvalds
20ebb34528 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull two Ceph fixes from Sage Weil:
 "These are both pretty trivial: a sparse warning fix and size_t printk
  thing"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  libceph: fix sparse endianness warnings
  ceph: use %zu for len in ceph_fill_inline_data()
2015-01-09 17:55:00 -08:00
Linus Torvalds
03c751a5e1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "None of these are huge, but my commit does fix a regression from 3.18
  that could cause lost files during log replay.

  This also adds Dave Sterba to the list of Btrfs maintainers.  It
  doesn't mean we're doing things differently, but Dave has really been
  helping with the maintainer workload for years"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: don't delay inode ref updates during log replay
  Btrfs: correctly get tree level in tree_backref_for_extent
  Btrfs: call inode_dec_link_count() on mkdir error path
  Btrfs: abort transaction if we don't find the block group
  Btrfs, scrub: uninitialized variable in scrub_extent_for_parity()
  Btrfs: add more maintainers
2015-01-09 17:46:07 -08:00
Rasmus Villemoes
72392ed0eb kernfs: Fix kernfs_name_compare
Returning a difference from a comparison functions is usually wrong
(see acbbe6fbb2 "kcmp: fix standard comparison bug" for the long
story). Here there is the additional twist that if the void pointers
ns and kn->ns happen to differ by a multiple of 2^32,
kernfs_name_compare returns 0, falsely reporting a match to the
caller.

Technically 'hash - kn->hash' is ok since the hashes are restricted to
31 bits, but it's better to avoid that subtlety.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-09 15:51:08 -08:00
Bob Peterson
8f6cb409f0 GFS2: Eliminate __gfs2_glock_remove_from_lru
Since the only caller of function __gfs2_glock_remove_from_lru locks the
same spin_lock as gfs2_glock_remove_from_lru, the functions can be combined.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2015-01-09 11:33:13 +00:00
Peter Zijlstra
536ebe9ca9 sched, fanotify: Deal with nested sleeps
As per e23738a730 ("sched, inotify: Deal with nested sleeps").

fanotify_read is a wait loop with sleeps in. Wait loops rely on
task_struct::state and sleeps do too, since that's the only means of
actually sleeping. Therefore the nested sleeps destroy the wait loop
state and the wait loop breaks the sleep functions that assume
TASK_RUNNING (mutex_lock).

Fix this by using the new woken_wake_function and wait_woken() stuff,
which registers wakeups in wait and thereby allows shrinking the
task_state::state changes to the actual sleep part.

Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Paris <eparis@redhat.com>
Link: http://lkml.kernel.org/r/20141216152838.GZ3337@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-09 11:18:12 +01:00
Dave Chinner
6bcf0939ff Merge branch 'xfs-misc-fixes-for-3.20-2' into for-next 2015-01-09 11:06:17 +11:00
Nicholas Mc Guire
43fd1fce96 xfs: fix implicit bool to int conversion
try_wait_for_completion returns bool so the wrapper function
xfs_dqflock_nowait should probably also return bool and not int.

Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:48:58 +11:00
Christoph Hellwig
d32057fc84 xfs: pass a 64-bit count argument to xfs_iomap_write_unwritten
The code is already ready for it, and the pnfs layout commit code expects
to be able to pass a larger than 32-bit argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:48:12 +11:00
Dave Chinner
64af7a6ea5 xfs: remove deprecated sysctls
xfsbufd_centisecs and age_buffer_centisecs were due for removal in
3.14. We forgot to do that - it's now well past time to remove these
deprecated, unused sysctls.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:47:43 +11:00
Dave Chinner
aa5d95c1b5 xfs: move xfs_bmap_finish prototype
This function is used libxfs code, but is implemented separately in
userspace. Move the function prototype to xfs_bmap.h so that the
prototype is shared even if the implementations aren't.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:47:14 +11:00
Dave Chinner
9799b438ce xfs: move struct xfs_bmalloca to libxfs
It no long is used for stack splits, so strip the kernel workqueue
bits from it and push it back into libxfs/xfs_bmap.h so that
it can be shared with the userspace code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:46:49 +11:00
Dave Chinner
5ebdc213ac xfs: move xfs_types.h to libxfs
The types used by the core XFS code are common between kernel and
userspace. xfs_types.h is duplicated in both kernel and userspace,
so move it to libxfs along with all the other shared code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:46:31 +11:00
Dave Chinner
2155355fda xfs: move xfs_fs.h to libxfs
Ioctl API definitions are shared with userspace, so move the header
file that defines them all to libxfs along with all the other code
shared with userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:45:13 +11:00
David Drysdale
75069f2b5b vfs: renumber FMODE_NONOTIFY and add to uniqueness check
Fix clashing values for O_PATH and FMODE_NONOTIFY on sparc.  The
clashing O_PATH value was added in commit 5229645bdc ("vfs: add
nonconflicting values for O_PATH") but this can't be changed as it is
user-visible.

FMODE_NONOTIFY is only used internally in the kernel, but it is in the
same numbering space as the other O_* flags, as indicated by the comment
at the top of include/uapi/asm-generic/fcntl.h (and its use in
fs/notify/fanotify/fanotify_user.c).  So renumber it to avoid the clash.

All of this has happened before (commit 12ed2e36c9: "fanotify:
FMODE_NONOTIFY and __O_SYNC in sparc conflict"), and all of this will
happen again -- so update the uniqueness check in fcntl_init() to
include __FMODE_NONOTIFY.

Signed-off-by: David Drysdale <drysdale@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-01-08 15:10:52 -08:00