Commit graph

2237 commits

Author SHA1 Message Date
Jeff Layton
8fcd461db7 nfsd: do nfs4_check_fh in nfs4_check_file instead of nfs4_check_olstateid
Currently, preprocess_stateid_op calls nfs4_check_olstateid which
verifies that the open stateid corresponds to the current filehandle in the
call by calling nfs4_check_fh.

If the stateid is a NFS4_DELEG_STID however, then no such check is done.
This could cause incorrect enforcement of permissions, because the
nfsd_permission() call in nfs4_check_file uses current the current
filehandle, but any subsequent IO operation will use the file descriptor
in the stateid.

Move the call to nfs4_check_fh into nfs4_check_file instead so that it
can be done for all stateid types.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org
[bfields: moved fh check to avoid NULL deref in special stateid case]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-31 16:30:26 -04:00
Kinglong Mee
1ca4b88e7d nfsd: Fix a file leak on nfsd4_layout_setlease failure
If nfsd4_layout_setlease fails, nfsd will not put ls->ls_file.

Fix commit c5c707f96f "nfsd: implement pNFS layout recalls".

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20 14:58:22 -04:00
Kinglong Mee
c2227a39a0 nfsd: Drop BUG_ON and ignore SECLABEL on absent filesystem
On an absent filesystem (one served by another server), we need to be
able to handle requests for certain attributest (like fs_locations, so
the client can find out which server does have the filesystem), but
others we can't.

We forgot to take that into account when adding another attribute
bitmask work for the SECURITY_LABEL attribute.

There an export entry with the "refer" option can result in:

[   88.414272] kernel BUG at fs/nfsd/nfs4xdr.c:2249!
[   88.414828] invalid opcode: 0000 [#1] SMP
[   88.415368] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd xfs libcrc32c iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi iosf_mbi ppdev btrfs coretemp crct10dif_pclmul crc32_pclmul crc32c_intel xor ghash_clmulni_intel raid6_pq vmw_balloon parport_pc parport i2c_piix4 shpchp vmw_vmci acpi_cpufreq auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi mptscsih serio_raw mptbase e1000 scsi_transport_spi ata_generic pata_acpi [last unloaded: nfsd]
[   88.417827] CPU: 0 PID: 2116 Comm: nfsd Not tainted 4.0.7-300.fc22.x86_64 #1
[   88.418448] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
[   88.419093] task: ffff880079146d50 ti: ffff8800785d8000 task.ti: ffff8800785d8000
[   88.419729] RIP: 0010:[<ffffffffa04b3c10>]  [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd]
[   88.420376] RSP: 0000:ffff8800785db998  EFLAGS: 00010206
[   88.421027] RAX: 0000000000000001 RBX: 000000000018091a RCX: ffff88006668b980
[   88.421676] RDX: 00000000fffef7fc RSI: 0000000000000000 RDI: ffff880078d05000
[   88.422315] RBP: ffff8800785dbb58 R08: ffff880078d043f8 R09: ffff880078d4a000
[   88.422968] R10: 0000000000010000 R11: 0000000000000002 R12: 0000000000b0a23a
[   88.423612] R13: ffff880078d05000 R14: ffff880078683100 R15: ffff88006668b980
[   88.424295] FS:  0000000000000000(0000) GS:ffff88007c600000(0000) knlGS:0000000000000000
[   88.424944] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   88.425597] CR2: 00007f40bc370f90 CR3: 0000000035af5000 CR4: 00000000001407f0
[   88.426285] Stack:
[   88.426921]  ffff8800785dbaa8 ffffffffa049e4af ffff8800785dba08 ffffffff813298f0
[   88.427585]  ffff880078683300 ffff8800769b0de8 0000089d00000001 0000000087f805e0
[   88.428228]  ffff880000000000 ffff880079434a00 0000000000000000 ffff88006668b980
[   88.428877] Call Trace:
[   88.429527]  [<ffffffffa049e4af>] ? exp_get_by_name+0x7f/0xb0 [nfsd]
[   88.430168]  [<ffffffff813298f0>] ? inode_doinit_with_dentry+0x210/0x6a0
[   88.430807]  [<ffffffff8123833e>] ? d_lookup+0x2e/0x60
[   88.431449]  [<ffffffff81236133>] ? dput+0x33/0x230
[   88.432097]  [<ffffffff8123f214>] ? mntput+0x24/0x40
[   88.432719]  [<ffffffff812272b2>] ? path_put+0x22/0x30
[   88.433340]  [<ffffffffa049ac87>] ? nfsd_cross_mnt+0xb7/0x1c0 [nfsd]
[   88.433954]  [<ffffffffa04b54e0>] nfsd4_encode_dirent+0x1b0/0x3d0 [nfsd]
[   88.434601]  [<ffffffffa04b5330>] ? nfsd4_encode_getattr+0x40/0x40 [nfsd]
[   88.435172]  [<ffffffffa049c991>] nfsd_readdir+0x1c1/0x2a0 [nfsd]
[   88.435710]  [<ffffffffa049a530>] ? nfsd_direct_splice_actor+0x20/0x20 [nfsd]
[   88.436447]  [<ffffffffa04abf30>] nfsd4_encode_readdir+0x120/0x220 [nfsd]
[   88.437011]  [<ffffffffa04b58cd>] nfsd4_encode_operation+0x7d/0x190 [nfsd]
[   88.437566]  [<ffffffffa04aa6dd>] nfsd4_proc_compound+0x24d/0x6f0 [nfsd]
[   88.438157]  [<ffffffffa0496103>] nfsd_dispatch+0xc3/0x220 [nfsd]
[   88.438680]  [<ffffffffa006f0cb>] svc_process_common+0x43b/0x690 [sunrpc]
[   88.439192]  [<ffffffffa0070493>] svc_process+0x103/0x1b0 [sunrpc]
[   88.439694]  [<ffffffffa0495a57>] nfsd+0x117/0x190 [nfsd]
[   88.440194]  [<ffffffffa0495940>] ? nfsd_destroy+0x90/0x90 [nfsd]
[   88.440697]  [<ffffffff810bb728>] kthread+0xd8/0xf0
[   88.441260]  [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180
[   88.441762]  [<ffffffff81789e58>] ret_from_fork+0x58/0x90
[   88.442322]  [<ffffffff810bb650>] ? kthread_worker_fn+0x180/0x180
[   88.442879] Code: 0f 84 93 05 00 00 83 f8 ea c7 85 a0 fe ff ff 00 00 27 30 0f 84 ba fe ff ff 85 c0 0f 85 a5 fe ff ff e9 e3 f9 ff ff 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 be 04 00 00 00 4c 89 ef 4c 89 8d 68 fe
[   88.444052] RIP  [<ffffffffa04b3c10>] nfsd4_encode_fattr+0x820/0x1f00 [nfsd]
[   88.444658]  RSP <ffff8800785db998>
[   88.445232] ---[ end trace 6cb9d0487d94a29f ]---

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20 14:58:22 -04:00
Christoph Hellwig
68e8bb0334 nfsd: wrap too long lines in nfsd4_encode_read
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-22 14:15:05 -04:00
Christoph Hellwig
96bcad5064 nfsd: fput rd_file from XDR encode context
Remove the hack where we fput the read-specific file in generic code.
Instead we can do it in nfsd4_encode_read as that gets called for all
error cases as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-22 14:15:04 -04:00
Christoph Hellwig
af90f707fa nfsd: take struct file setup fully into nfs4_preprocess_stateid_op
This patch changes nfs4_preprocess_stateid_op so it always returns
a valid struct file if it has been asked for that.  For that we
now allocate a temporary struct file for special stateids, and check
permissions if we got the file structure from the stateid.  This
ensures that all callers will get their handling of special stateids
right, and avoids code duplication.

There is a little wart in here because the read code needs to know
if we allocated a file structure so that it can copy around the
read-ahead parameters.  In the long run we should probably aim to
cache full file structures used with special stateids instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-22 14:15:03 -04:00
Christoph Hellwig
a0649b2d3f nfsd: refactor nfs4_preprocess_stateid_op
Split out two self contained helpers to make the function more readable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-19 15:39:52 -04:00
Christoph Hellwig
e749a4621e nfsd: clean up raparams handling
Refactor the raparam hash helpers to just deal with the raparms,
and keep opening/closing files separate from that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-19 15:39:51 -04:00
Fabian Frederick
97b1f9aae9 nfsd: use swap() in sort_pacl_range()
Use kernel.h macro definition.

Thanks to Julia Lawall for Coccinelle scripting support.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-19 15:39:50 -04:00
Kinglong Mee
276f03e3ba nfsd: Update callback sequnce id only CB_SEQUENCE success
When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED.
It is caused by nfs return NFS4ERR_DELAY before validate_seqid(),
don't update the sequnce id, but nfsd updates the sequnce id !!!

According to RFC5661 20.9.3,
" If CB_SEQUENCE returns an error, then the state of the slot
(sequence ID, cached reply) MUST NOT change. "

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-04 16:51:30 -04:00
Kinglong Mee
4399396eec nfsd: Reset cb_status in nfsd4_cb_prepare() at retrying
nfsd enters a infinite loop and prints message every 10 seconds:

May 31 18:33:52 test-server kernel: Error sending entire callback!
May 31 18:34:01 test-server kernel: Error sending entire callback!

This is caused by a cb_layoutreturn getting error -10008
(NFS4ERR_DELAY), the client crashing, and then nfsd entering the
infinite loop:

  bc_sendto --> call_timeout --> nfsd4_cb_done --> nfsd4_cb_layout_done
  with error -10008 --> rpc_delay(task, HZ/100) --> bc_sendto ...

Reproduced using xfstests 074 with nfs client's kdump on,
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT set, and client's blkmapd down:

1. nfs client's write operation will get the layout of file,
   and then send getdeviceinfo,
2. but layout segment is not recorded by client because blkmapd is down,
3. client writes data by sending WRITE to server,
4. nfs server recalls the layout of the file before WRITE,
5. network error causes the client reset the session and return NFS4ERR_DELAY,
6. so client's WRITE operation is waiting the reply.
   If the task hangs 120s, the client will crash.
7. so that, the next bc_sendto will fail with TIMEOUT,
   and cb_status is NFS4ERR_DELAY.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-04 16:43:39 -04:00
Andreas Gruenbacher
2f6b3879c2 nfsd: Remove dead declarations
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-29 11:04:04 -04:00
Arnd Bergmann
6ac75368e1 nfsd: work around a gcc-5.1 warning
gcc-5.0 warns about a potential uninitialized variable use in nfsd:

fs/nfsd/nfs4state.c: In function 'nfsd4_process_open2':
fs/nfsd/nfs4state.c:3781:3: warning: 'old_deny_bmap' may be used uninitialized in this function [-Wmaybe-uninitialized]
   reset_union_bmap_deny(old_deny_bmap, stp);
   ^
fs/nfsd/nfs4state.c:3760:16: note: 'old_deny_bmap' was declared here
  unsigned char old_deny_bmap;
                ^

This is a false positive, the code path that is warned about cannot
actually be reached.

This adds an initialization for the variable to make the warning go
away.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-29 11:04:03 -04:00
Andreas Gruenbacher
0c9d65e76a nfsd: Checking for acl support does not require fetching any acls
Whether or not a file system supports acls can be determined with
IS_POSIXACL(inode) and does not require trying to fetch any acls; the code for
computing the supported_attrs and aclsupport attributes can be simplified.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-29 11:04:02 -04:00
Andreas Gruenbacher
cc265089ce nfsd: Disable NFSv2 timestamp workaround for NFSv3+
NFSv2 can set the atime and/or mtime of a file to specific timestamps but not
to the server's current time.  To implement the equivalent of utimes("file",
NULL), it uses a heuristic.

NFSv3 and later do support setting the atime and/or mtime to the server's
current time directly.  The NFSv2 heuristic is still enabled, and causes
timestamps to be set wrong sometimes.

Fix this by moving the heuristic into the NFSv2 specific code.  We can leave it
out of the create code path: the owner can always set timestamps arbitrarily,
and the workaround would never trigger.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-29 11:04:01 -04:00
NeilBrown
43b0e7ea59 nfsd: stop READDIRPLUS returning inconsistent attributes
The NFSv3 READDIRPLUS gets some of the returned attributes from the
readdir, and some from an inode returned from a new lookup.  The two
objects could be different thanks to intervening renames.

The attributes in READDIRPLUS are optional, so let's just skip them if
we notice this case.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-07 11:47:00 -04:00
Christoph Hellwig
fd89145460 nfsd: remove nfsd_close
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04 12:02:43 -04:00
Christoph Hellwig
4bd9e9b77f nfsd: skip CB_NULL probes for 4.1 or later
With sessions in v4.1 or later we don't need to manually probe the backchannel
connection, so we can declare it up instantly after setting up the RPC client.

Note that we really should split nfsd4_run_cb_work in the long run, this is
just the least intrusive fix for now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04 12:02:42 -04:00
Christoph Hellwig
cba5f62b18 nfsd: fix callback restarts
Checking the rpc_client pointer is not a reliable way to detect
backchannel changes: cl_cb_client is changed only after shutting down
the rpc client, so the condition cl_cb_client = tk_client will always be
true.

Check the RPC_TASK_KILLED flag instead, and rewrite the code to avoid
the buggy cl_callbacks list and fix the lifetime rules due to double
calls of the ->prepare callback operations method for this retry case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04 12:02:41 -04:00
Christoph Hellwig
ef2a1b3e10 nfsd: split transport vs operation errors for callbacks
We must only increment the sequence id if the client has seen and responded
to a request.  If we failed to deliver it to the client we must resend with
the same sequence id.  So just like the client track errors at the transport
level differently from those returned in the XDR.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04 12:02:40 -04:00
Sachin Bhamare
8287f009bd nfsd: fix pNFS return on close semantics
For the sake of forgetful clients, the server should return the layouts
to the file system on 'last close' of a file (assuming that there are no
delegations outstanding to that particular client) or on delegreturn
(assuming that there are no opens on a file from that particular
client).

In theory the information is all there in current data structures, but
it's not efficiently available; nfs4_file->fi_ref includes references on
the file across all clients, but we need a per-(client, file) count.
Walking through lots of stateid's to calculate this on each close or
delegreturn would be painful.

This patch introduces infrastructure to maintain per-client opens and
delegation counters on a per-file basis.

[hch: ported to the mainline pNFS support, merged various fixes from Jeff]
Signed-off-by: Sachin Bhamare <sachin.bhamare@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04 12:02:39 -04:00
Christoph Hellwig
ebe9cb3bb1 nfsd: fix the check for confirmed openowner in nfs4_preprocess_stateid_op
If we find a non-confirmed openowner we jump to exit the function, but do
not set an error value.  Fix this by factoring out a helper to do the
check and properly set the error from nfsd4_validate_stateid.

Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04 12:02:38 -04:00
Christoph Hellwig
40cdc7a530 nfsd/blocklayout: pretend we can send deviceid notifications
Commit df52699e4f ("NFSv4.1: Don't cache deviceids that have no
notifications") causes the Linux NFS client to stop caching deviceid's
unless a server pretends to support deviceid notifications.  While this
behavior is stupid and the language around this area in rfc5661 is a
mess carified by an errata that I submittted, Trond insists on this
behavior.  Not caching deviceids degrades block layout performance
massively as a GETDEVICEINFO is fairly expensive.

So add this hack to make the Linux client happy again.

Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-05-04 12:02:37 -04:00
Linus Torvalds
9ec3a646fe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
 "d_inode() annotations from David Howells (sat in for-next since before
  the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RCU pathwalk breakage when running into a symlink overmounting something
  fix I_DIO_WAKEUP definition
  direct-io: only inc/dec inode->i_dio_count for file systems
  fs/9p: fix readdir()
  VFS: assorted d_backing_inode() annotations
  VFS: fs/inode.c helpers: d_inode() annotations
  VFS: fs/cachefiles: d_backing_inode() annotations
  VFS: fs library helpers: d_inode() annotations
  VFS: assorted weird filesystems: d_inode() annotations
  VFS: normal filesystems (and lustre): d_inode() annotations
  VFS: security/: d_inode() annotations
  VFS: security/: d_backing_inode() annotations
  VFS: net/: d_inode() annotations
  VFS: net/unix: d_backing_inode() annotations
  VFS: kernel/: d_inode() annotations
  VFS: audit: d_backing_inode() annotations
  VFS: Fix up some ->d_inode accesses in the chelsio driver
  VFS: Cachefiles should perform fs modifications on the top layer only
  VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-26 17:22:07 -07:00
Linus Torvalds
860448cf76 Merge branch 'for-4.1' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 "A quiet cycle this time; this is basically entirely bugfixes.

  The few that aren't cc'd to stable are cleanup or seemed unlikely to
  affect anyone much"

* 'for-4.1' of git://linux-nfs.org/~bfields/linux:
  uapi: Remove kernel internal declaration
  nfsd: fix nsfd startup race triggering BUG_ON
  nfsd: eliminate NFSD_DEBUG
  nfsd4: fix READ permission checking
  nfsd4: disallow SEEK with special stateids
  nfsd4: disallow ALLOCATE with special stateids
  nfsd: add NFSEXP_PNFS to the exflags array
  nfsd: Remove duplicate macro define for max sec label length
  nfsd: allow setting acls with unenforceable DENYs
  nfsd: NFSD_FAULT_INJECTION depends on DEBUG_FS
  nfsd: remove unused status arg to nfsd4_cleanup_open_state
  nfsd: remove bogus setting of status in nfsd4_process_open2
  NFSD: Use correct reply size calculating function
  NFSD: Using path_equal() for checking two paths
2015-04-24 07:46:05 -07:00
Giuseppe Cantavenera
bb7ffbf29e nfsd: fix nsfd startup race triggering BUG_ON
nfsd triggered a BUG_ON in net_generic(...) when rpc_pipefs_event(...)
in fs/nfsd/nfs4recover.c was called before assigning ntfsd_net_id.
The following was observed on a MIPS 32-core processor:
kernel: Call Trace:
kernel: [<ffffffffc00bc5e4>] rpc_pipefs_event+0x7c/0x158 [nfsd]
kernel: [<ffffffff8017a2a0>] notifier_call_chain+0x70/0xb8
kernel: [<ffffffff8017a4e4>] __blocking_notifier_call_chain+0x4c/0x70
kernel: [<ffffffff8053aff8>] rpc_fill_super+0xf8/0x1a0
kernel: [<ffffffff8022204c>] mount_ns+0xb4/0xf0
kernel: [<ffffffff80222b48>] mount_fs+0x50/0x1f8
kernel: [<ffffffff8023dc00>] vfs_kern_mount+0x58/0xf0
kernel: [<ffffffff802404ac>] do_mount+0x27c/0xa28
kernel: [<ffffffff80240cf0>] SyS_mount+0x98/0xe8
kernel: [<ffffffff80135d24>] handle_sys64+0x44/0x68
kernel:
kernel:
        Code: 0040f809  00000000  2e020001 <00020336> 3c12c00d
                3c02801a  de100000 6442eb98  0040f809
kernel: ---[ end trace 7471374335809536 ]---

Fixed this behaviour by calling register_pernet_subsys(&nfsd_net_ops) before
registering rpc_pipefs_event(...) with the notifier chain.

Signed-off-by: Giuseppe Cantavenera <giuseppe.cantavenera.ext@nokia.com>
Signed-off-by: Lorenzo Restelli <lorenzo.restelli.ext@nokia.com>
Reviewed-by: Kinlong Mee <kinglongmee@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:03 -04:00
Mark Salter
135dd002c2 nfsd: eliminate NFSD_DEBUG
Commit f895b252d4 ("sunrpc: eliminate RPC_DEBUG") introduced
use of IS_ENABLED() in a uapi header which leads to a build
failure for userspace apps trying to use <linux/nfsd/debug.h>:

   linux/nfsd/debug.h:18:15: error: missing binary operator before token "("
  #if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
                ^

Since this was only used to define NFSD_DEBUG if CONFIG_SUNRPC_DEBUG
is enabled, replace instances of NFSD_DEBUG with CONFIG_SUNRPC_DEBUG.

Cc: stable@vger.kernel.org
Fixes: f895b252d4 "sunrpc: eliminate RPC_DEBUG"
Signed-off-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:02 -04:00
J. Bruce Fields
6e4891dc28 nfsd4: fix READ permission checking
In the case we already have a struct file (derived from a stateid), we
still need to do permission-checking; otherwise an unauthorized user
could gain access to a file by sniffing or guessing somebody else's
stateid.

Cc: stable@vger.kernel.org
Fixes: dc97618ddd "nfsd4: separate splice and readv cases"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:01 -04:00
J. Bruce Fields
980608fb50 nfsd4: disallow SEEK with special stateids
If the client uses a special stateid then we'll pass a NULL file to
vfs_llseek.

Fixes: 24bab49122 " NFSD: Implement SEEK"
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: stable@vger.kernel.org
Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:01 -04:00
J. Bruce Fields
5ba4a25ab7 nfsd4: disallow ALLOCATE with special stateids
vfs_fallocate will hit a NULL dereference if the client tries an
ALLOCATE or DEALLOCATE with a special stateid.  Fix that.  (We also
depend on the open to have broken any conflicting leases or delegations
for us.)

(If it turns out we need to allow special stateid's then we could do a
temporary open here in the special-stateid case, as we do for read and
write.  For now I'm assuming it's not necessary.)

Fixes: 95d871f03c "nfsd: Add ALLOCATE support"
Cc: stable@vger.kernel.org
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 15:44:06 -04:00
Linus Torvalds
eea3a00264 Merge branch 'akpm' (patches from Andrew)
Merge second patchbomb from Andrew Morton:

 - the rest of MM

 - various misc bits

 - add ability to run /sbin/reboot at reboot time

 - printk/vsprintf changes

 - fiddle with seq_printf() return value

* akpm: (114 commits)
  parisc: remove use of seq_printf return value
  lru_cache: remove use of seq_printf return value
  tracing: remove use of seq_printf return value
  cgroup: remove use of seq_printf return value
  proc: remove use of seq_printf return value
  s390: remove use of seq_printf return value
  cris fasttimer: remove use of seq_printf return value
  cris: remove use of seq_printf return value
  openrisc: remove use of seq_printf return value
  ARM: plat-pxa: remove use of seq_printf return value
  nios2: cpuinfo: remove use of seq_printf return value
  microblaze: mb: remove use of seq_printf return value
  ipc: remove use of seq_printf return value
  rtc: remove use of seq_printf return value
  power: wakeup: remove use of seq_printf return value
  x86: mtrr: if: remove use of seq_printf return value
  linux/bitmap.h: improve BITMAP_{LAST,FIRST}_WORD_MASK
  MAINTAINERS: CREDITS: remove Stefano Brivio from B43
  .mailmap: add Ricardo Ribalda
  CREDITS: add Ricardo Ribalda Delgado
  ...
2015-04-15 16:39:15 -07:00
Iulia Manda
2813893f8b kernel: conditionally support non-root users, groups and capabilities
There are a lot of embedded systems that run most or all of their
functionality in init, running as root:root.  For these systems,
supporting multiple users is not necessary.

This patch adds a new symbol, CONFIG_MULTIUSER, that makes support for
non-root users, non-root groups, and capabilities optional.  It is enabled
under CONFIG_EXPERT menu.

When this symbol is not defined, UID and GID are zero in any possible case
and processes always have all capabilities.

The following syscalls are compiled out: setuid, setregid, setgid,
setreuid, setresuid, getresuid, setresgid, getresgid, setgroups,
getgroups, setfsuid, setfsgid, capget, capset.

Also, groups.c is compiled out completely.

In kernel/capability.c, capable function was moved in order to avoid
adding two ifdef blocks.

This change saves about 25 KB on a defconfig build.  The most minimal
kernels have total text sizes in the high hundreds of kB rather than
low MB.  (The 25k goes down a bit with allnoconfig, but not that much.

The kernel was booted in Qemu.  All the common functionalities work.
Adding users/groups is not possible, failing with -ENOSYS.

Bloat-o-meter output:
add/remove: 7/87 grow/shrink: 19/397 up/down: 1675/-26325 (-24650)

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Iulia Manda <iulia.manda21@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-15 16:35:22 -07:00
David Howells
2b0143b5c9 VFS: normal filesystems (and lustre): d_inode() annotations
that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:57 -04:00
Christoph Hellwig
9b3075c59f nfsd: add NFSEXP_PNFS to the exflags array
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-03 10:00:59 -04:00
Jeff Layton
cae80b305e locks: change lm_get_owner and lm_put_owner prototypes
The current prototypes for these operations are somewhat awkward as they
deal with fl_owners but take struct file_lock arguments. In the future,
we'll want to be able to take references without necessarily dealing
with a struct file_lock.

Change them to take fl_owner_t arguments instead and have the callers
deal with assigning the values to the file_lock structs.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
2015-04-03 09:04:04 -04:00
Kinglong Mee
1ec8c0c47f nfsd: Remove duplicate macro define for max sec label length
NFS4_MAXLABELLEN has defined for sec label max length, use it directly.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:39 -04:00
J. Bruce Fields
b14f4f7e61 nfsd: allow setting acls with unenforceable DENYs
We've been refusing ACLs that DENY permissions that we can't effectively
deny.  (For example, we can't deny permission to read attributes.)

Andreas points out that any DENY of Window's "read", "write", or
"modify" permissions would trigger this.  That would be annoying.

So maybe we should be a little less paranoid, and ignore entirely the
permissions that are meaningless to us.

Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:39 -04:00
Chengyu Song
629b8729cc nfsd: NFSD_FAULT_INJECTION depends on DEBUG_FS
NFSD_FAULT_INJECTION depends on DEBUG_FS, otherwise the debugfs_create_*
interface may return unexpected error -ENODEV, and cause system crash.

Signed-off-by: Chengyu Song <csong84@gatech.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:39 -04:00
Jeff Layton
4229789993 nfsd: remove unused status arg to nfsd4_cleanup_open_state
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:39 -04:00
Jeff Layton
fc26c3860a nfsd: remove bogus setting of status in nfsd4_process_open2
status is always reset after this (and it doesn't make much sense there
anyway).

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:39 -04:00
Kinglong Mee
beaca2347f NFSD: Use correct reply size calculating function
ALLOCATE/DEALLOCATE only reply one status value to client,
so, using nfsd4_only_status_rsize for reply size calculating.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:38 -04:00
Kinglong Mee
b77a4b2edb NFSD: Using path_equal() for checking two paths
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:38 -04:00
Christoph Hellwig
f3f03330de nfsd: require an explicit option to enable pNFS
Turns out sending out layouts to any client is a bad idea if they
can't get at the storage device, so require explicit admin action
to enable pNFS.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-30 16:05:26 -04:00
Kinglong Mee
7890203da2 NFSD: Fix bad update of layout in nfsd4_return_file_layout
With return layout as, (seg is return layout, lo is record layout)
seg->offset <= lo->offset and layout_end(seg) < layout_end(lo),
nfsd should update lo's offset to seg's end,
and,
seg->offset > lo->offset and layout_end(seg) >= layout_end(lo),
nfsd should update lo's end to seg's offset.

Fixes: 9cf514ccfa ("nfsd: implement pNFS operations")
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25 21:13:03 -04:00
Kinglong Mee
376675daea NFSD: Take care the return value from nfsd4_encode_stateid
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25 21:13:02 -04:00
Kinglong Mee
853695230e NFSD: Printk blocklayout length and offset as format 0x%llx
When testing pnfs with nfsd_debug on, nfsd print a negative number
of layout length and foff in nfsd4_block_proc_layoutget as,
"GET: -xxxx:-xxx 2"

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25 21:13:02 -04:00
J. Bruce Fields
340f0ba1c6 nfsd: return correct lockowner when there is a race on hash insert
alloc_init_lock_stateowner can return an already freed entry if there is
a race to put openowners in the hashtable.

Noticed by inspection after Jeff Layton fixed the same bug for open
owners.  Depending on client behavior, this one may be trickier to
trigger in practice.

Fixes: c58c6610ec "nfsd: Protect adding/removing lock owners using client_lock"
Cc: <stable@vger.kernel.org>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Acked-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25 21:06:16 -04:00
Jeff Layton
c5952338bf nfsd: return correct openowner when there is a race to put one in the hash
alloc_init_open_stateowner can return an already freed entry if there is
a race to put openowners in the hashtable.

In commit 7ffb588086, we changed it so that we allocate and initialize
an openowner, and then check to see if a matching one got stuffed into
the hashtable in the meantime. If it did, then we free the one we just
allocated and take a reference on the one already there. There is a bug
here though. The code will then return the pointer to the one that was
allocated (and has now been freed).

This wasn't evident before as this race almost never occurred. The Linux
kernel client used to serialize requests for a single openowner.  That
has changed now with v4.0 kernels, and this race can now easily occur.

Fixes: 7ffb588086
Cc: <stable@vger.kernel.org> # v3.17+
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Reported-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-25 21:06:06 -04:00
Kinglong Mee
a1420384e3 NFSD: Put exports after nfsd4_layout_verify fail
Fix commit 9cf514ccfa (nfsd: implement pNFS operations).

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-20 16:15:42 -04:00
Kinglong Mee
a68465c9cb NFSD: Error out when register_shrinker() fail
If register_shrinker() failed, nfsd will cause a NULL pointer access as,

[ 9250.875465] nfsd: last server has exited, flushing export cache
[ 9251.427270] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 9251.427393] IP: [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0
[ 9251.427579] PGD 13e4d067 PUD 13e4c067 PMD 0
[ 9251.427633] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 9251.427706] Modules linked in: ip6t_rpfilter ip6t_REJECT bnep bluetooth xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw btrfs xfs microcode ppdev serio_raw pcspkr xor libcrc32c raid6_pq e1000 parport_pc parport i2c_piix4 i2c_core nfsd(OE-) auth_rpcgss nfs_acl lockd sunrpc(E) ata_generic pata_acpi
[ 9251.428240] CPU: 0 PID: 1557 Comm: rmmod Tainted: G           OE 3.16.0-rc2+ #22
[ 9251.428366] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
[ 9251.428496] task: ffff880000849540 ti: ffff8800136f4000 task.ti: ffff8800136f4000
[ 9251.428593] RIP: 0010:[<ffffffff8136fc29>]  [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0
[ 9251.428696] RSP: 0018:ffff8800136f7ea0  EFLAGS: 00010207
[ 9251.428751] RAX: 0000000000000000 RBX: ffffffffa0116d48 RCX: dead000000200200
[ 9251.428814] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa0116d48
[ 9251.428876] RBP: ffff8800136f7ea0 R08: ffff8800136f4000 R09: 0000000000000001
[ 9251.428939] R10: 8080808080808080 R11: 0000000000000000 R12: ffffffffa011a5a0
[ 9251.429002] R13: 0000000000000800 R14: 0000000000000000 R15: 00000000018ac090
[ 9251.429064] FS:  00007fb9acef0740(0000) GS:ffff88003fa00000(0000) knlGS:0000000000000000
[ 9251.429164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9251.429221] CR2: 0000000000000000 CR3: 0000000031a17000 CR4: 00000000001407f0
[ 9251.429306] Stack:
[ 9251.429410]  ffff8800136f7eb8 ffffffff8136fcdd ffffffffa0116d20 ffff8800136f7ed0
[ 9251.429511]  ffffffff8118a0f2 0000000000000000 ffff8800136f7ee0 ffffffffa00eb765
[ 9251.429610]  ffff8800136f7ef0 ffffffffa010e93c ffff8800136f7f78 ffffffff81104ac2
[ 9251.429709] Call Trace:
[ 9251.429755]  [<ffffffff8136fcdd>] list_del+0xd/0x30
[ 9251.429896]  [<ffffffff8118a0f2>] unregister_shrinker+0x22/0x40
[ 9251.430037]  [<ffffffffa00eb765>] nfsd_reply_cache_shutdown+0x15/0x90 [nfsd]
[ 9251.430106]  [<ffffffffa010e93c>] exit_nfsd+0x9/0x6cd [nfsd]
[ 9251.430192]  [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200
[ 9251.430280]  [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90
[ 9251.430395]  [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b
[ 9251.430457] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
[ 9251.430691] RIP  [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0
[ 9251.430755]  RSP <ffff8800136f7ea0>
[ 9251.430805] CR2: 0000000000000000
[ 9251.431033] ---[ end trace 080f3050d082b4ea ]---

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-20 12:44:00 -04:00