CIFS currently allows you to change the mode of an inode on a share that
doesn't have unix extensions enabled, and isn't using cifsacl. The inode
in this case *only* has its mode changed in memory on the client. This
is problematic since it can change any time the inode is purged from the
cache.
This patch makes cifs_setattr silently ignore most mode changes when
unix extensions and cifsacl support are not enabled, and when the share
is not mounted with the "dynperm" option. The exceptions are:
When a mode change would remove all write access to an inode we turn on
the ATTR_READONLY bit on the server and remove all write bits from the
inode's mode in memory.
When a mode change would add a write bit to an inode that previously had
them all turned off, it turns off the ATTR_READONLY bit on the server,
and resets the mode back to what it would normally be (generally, the
file_mode or dir_mode of the share).
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
CIFS currently allows you to change the ownership of a file, but unless
unix extensions are enabled this change is not passed off to the server.
Have CIFS silently ignore ownership changes that can't be persistently
stored on the server unless the "setuids" option is explicitly
specified.
We could return an error here (-EOPNOTSUPP or something), but this is
how most disk-based windows filesystems on behave on Linux (e.g. VFAT,
NTFS, etc). With cifsacl support and proper Windows to Unix idmapping
support, we may be able to do this more properly in the future.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
When CIFS creates a new inode on a mount without unix extensions, it
temporarily assigns the mode that was passed to it in the create/mkdir
call. Eventually, when the inode is revalidated, it changes to have the
file_mode or dir_mode for the mount. This is confusing to users who
expect that the mode shouldn't change this way. It's also problematic
since only the mode is treated this way, not the uid or gid. Suppose you
have a CIFS mount that's mounted with:
uid=0,gid=0,file_mode=0666,dir_mode=0777
...if an unprivileged user comes along and does this on the mount:
mkdir -m 0700 foo
touch foo/bar
...there is a period of time where the touch will fail, since the dir
will initially be owned by root and have mode 0700. If the user waits
long enough, then "foo" will be revalidated and will get the correct
dir_mode permissions.
This patch changes cifs_mkdir and cifs_create to not overwrite the
mode found by the initial cifs_get_inode_info call after the inode is
created on the server. Legacy behavior can be reenabled with the
new "dynperm" mount option.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
When mounting a share with posix extensions disabled,
cifs_get_inode_info turns off all the write bits in the mode for regular
files if ATTR_READONLY is set. Directories and other inode types,
however, can also have ATTR_READONLY set, but the mode gives no
indication of this.
This patch makes this apply to other inode types besides regular files.
It also cleans up how modes are set in cifs_get_inode_info for both the
"normal" and "dynperm" cases.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
[XFS] Fix memory corruption with small buffer reads
[XFS] Fix inode list allocation size in writeback.
[XFS] Don't allow memory reclaim to wait on the filesystem in inode
[XFS] Fix fsync() b0rkage.
[XFS] Include linux/random.h in all builds, not just debug builds.
When we have multiple buffers in a single page for a blocksize == pagesize
filesystem we might overwrite the page contents if two callers hit it
shortly after each other. To prevent that we need to keep the page locked
until I/O is completed and the page marked uptodate.
Thanks to Eric Sandeen for triaging this bug and finding a reproducible
testcase and Dave Chinner for additional advice.
This should fix kernel.org bz #10421.
Tested-by: Eric Sandeen <sandeen@sandeen.net>
SGI-PV: 981813
SGI-Modid: xfs-linux-melb:xfs-kern:31173a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
We only need to allocate space for the number of inodes in the cluster
when writing back inodes, not every byte in the inode cluster. This
reduces the amount of memory needing to be allocated to 256 bytes instead
of 64k.
SGI-PV: 981949
SGI-Modid: xfs-linux-melb:xfs-kern:31182a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
writeback
If we allow memory reclaim to wait on the pages under writeback in inode
cluster writeback we could deadlock because we are currently holding the
ILOCK on the initial writeback inode which is needed in data I/O
completion to change the file size or do unwritten extent conversion
before the pages are taken out of writeback state.
SGI-PV: 981091
SGI-Modid: xfs-linux-melb:xfs-kern:31015a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
xfs_fsync() fails to wait for data I/O completion before checking if the
inode is dirty or clean to decide whether to log the inode or not. This
misses inode size updates when the data flushed by the fsync() is
extending the file.
Hence, like fdatasync(), we need to wait for I/o completion first, then
check the inode for cleanliness. Doing so makes the behaviour of
xfs_fsync() identical for fsync and fdatasync and we *always* use
synchronous semantics if the inode is dirty. Therefore also kill the
differences and remove the unused flags from the xfs_fsync function and
callers.
SGI-PV: 981296
SGI-Modid: xfs-linux-melb:xfs-kern:31033a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
memcpy() from userland pointer is a Bad Thing(tm)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fallout from commit 46d7b522eb ("uml: move
hppfs_kern.c to hppfs.c")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (21 commits)
[CIFS] Remove debug statement
Fix possible access to undefined memory region.
[CIFS] Enable DFS support for Windows query path info
[CIFS] Enable DFS support for Unix query path info
[CIFS] add missing seq_printf to cifs_show_options for hard mount option
[CIFS] add more complete mount options to cifs_show_options
[CIFS] Add missing defines for DFS
CIFSGetDFSRefer cleanup + dfs_referral_level_3 fixed to conform REFERRAL_V3 the MS-DFSC spec.
Fixed DFS code to work with new 'build_path_from_dentry', that returns full path if share in the dfs, now.
[CIFS] enable parsing for transport encryption mount parm
[CIFS] Finishup DFS code
[CIFS] BKL-removal: convert CIFS over to unlocked_ioctl
[CIFS] suppress duplicate warning
[CIFS] Fix paths when share is in DFS to include proper prefix
add function to convert access flags to legacy open mode
clarify return value of cifs_convert_flags()
[CIFS] don't explicitly do a FindClose on rewind when directory search has ended
[CIFS] cleanup old checkpatch warnings
[CIFS] CIFSSMBPosixLock should return -EINVAL on error
fix memory leak in CIFSFindNext
...
* 'for-2.6.26' of git://linux-nfs.org/~bfields/linux: (25 commits)
svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
svcrdma: Change svc_rdma_send_error return type to void
svcrdma: Copy transport address and arm CQ before calling rdma_accept
svcrdma: Set rqstp transport address in rdma_read_complete function
svcrdma: Use ib verbs version of dma_unmap
svcrdma: Cleanup queued, but unprocessed I/O in svc_rdma_free
svcrdma: Move the QP and cm_id destruction to svc_rdma_free
svcrdma: Add reference for each SQ/RQ WR
svcrdma: Move destroy to kernel thread
svcrdma: Shrink scope of spinlock on RQ CQ
svcrdma: Use standard Linux lists for context cache
svcrdma: Simplify RDMA_READ deferral buffer management
svcrdma: Remove unused READ_DONE context flags bit
svcrdma: Return error from rdma_read_xdr so caller knows to free context
svcrdma: Fix error handling during listening endpoint creation
svcrdma: Free context on post_recv error in send_reply
svcrdma: Free context on ib_post_recv error
svcrdma: Add put of connection ESTABLISHED reference in rdma_cma_handler
svcrdma: Fix return value in svc_rdma_send
svcrdma: Fix race with dto_tasklet in svc_rdma_send
...
Final piece for handling DFS in query_path_info, constructing a
fake inode for the junction directory which the submount will cover.
This handles the non-Unix (Windows etc.) code path.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Final piece for handling DFS in unix_query_path_info, constructing a
fake inode for the junction directory which the submount will cover.
Acked-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
[GFS2] Prefer strlcpy() over snprintf()
[GFS2] Fix cast from unsigned int to s64
[GFS2] filesystem consistency error from do_strip
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
[PATCH] return to old errno choice in mkdir() et.al.
[Patch] fs/binfmt_elf.c: fix wrong return values
[PATCH] get rid of leak in compat_execve()
[Patch] fs/binfmt_elf.c: fix a wrong free
[PATCH] avoid multiplication overflows and signedness issues for max_fds
[PATCH] dup_fd() part 4 - race fix
[PATCH] dup_fd() - part 3
[PATCH] dup_fd() part 2
[PATCH] dup_fd() fixes, part 1
[PATCH] take init_files to fs/file.c
Also Kari Hurtta noticed a missing check in the same function which is now fixed.
CC: Kari Hurtta <hurtta+gmane@siilo.fmi.fi>
Signed-off-by: Steve French <sfrench@us.ibm.com>
The return value on writes to the plock device should be
the number of bytes written. It was returning 0 instead
when an nfs lock callback was involved.
Reported-by: Nathan Straz <nstraz@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Removed the section mismatch message:
WARNING: fs/dlm/dlm.o(.init.text+0x132): Section mismatch in reference from the function init_module() to the function .exit.text:dlm_netlink_exit()
Since dlm_netlink_exit() is called in the init_dlm() error handling,
the __exit annotation has been removed.
Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
Signed-off-by: David Teigland <teigland@redhat.com>
The semaphore connections_lock is used as a mutex. Convert it to the mutex
API.
Signed-off-by: Matthias Kaehlcke <matthias@kaehlcke.net>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Teigland <teigland@redhat.com>
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
SUNRPC: AUTH_SYS "machine creds" shouldn't use negative valued uid/gid
nfs: make nfs4_drop_state_owner() static
nfs: path_{get,put}() cleanups
nfs: replace remaining __FUNCTION__ occurrences
nfs/lsm: make NFSv4 set LSM mount options
NFSv4: Check the return value of decode_compound_hdr_arg()
nfs: fix race in nfs_dirty_request
NFS: Ensure that 'noac' and/or 'actimeo=0' turn off attribute caching
The current permissions on sessionid are a little too restrictive.
Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
adds various options to cifs_show_options
(displayed when you cat /proc/mounts with a cifs mount). I limited
the new ones to values that are associated with the mount with the
exception of "seal" (which is a per tree connection property, but I
thought was important enough to show through).
Eventually cifs's parse_mount_options also needs to
be rewritten to use the match_token API but that would be a big enough
change that I would prefer that changing parse_mount_options wait
until next release.
Signed-off-by: Steve French <sfrench@us.ibm.com>
In case when both EEXIST and EROFS would apply we used to
return the former in mkdir(2) and friends. Lest anyone suspects
us of being consistent, in the same situation knfsd gave clients
nfs_erofs...
ro-bind series had switched the syscall side of things to
returning -EROFS and immediately broke an application - namely,
mkdir -p. Patch restores the original behaviour...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
create_elf_tables() returns 0 on success. But when strnlen_user() "fails",
it returns 0 directly. So this is wrong.
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Even though copy_compat_strings() doesn't cache the pages,
copy_strings_kernel() and stuff indirectly called by e.g.
->load_binary() is doing that, so we need to drop the
cache contents in the end.
[found by WANG Cong <wangcong@zeuux.org>]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
In kmalloc failing path, we shouldn't free pointers in 'info',
because the struct 'info' is uninitilized when kmalloc is called.
And when kmalloc returns NULL, it's needless to kfree it.
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
--
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Limit sysctl_nr_open - we don't want ->max_fds to exceed MAX_INT and
we don't want size calculation for ->fd[] to overflow.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Parent _can_ be a clone task, contrary to the comment. Moreover,
more files could be opened while we allocate a copy, in which case
we end up copying only part into new descriptor table. Since what
we get _is_ affected by all changes in the old range, we can get
rather weird effects - e.g.
dup2(0, 1024); close(0);
in parallel with fork() resulting in child that sees the effect of
close(), but not that of dup2() done just before that close().
What we need is to recalculate the open_count after having reacquired
->file_lock and if external fdtable we'd just allocated is too small for
it, free the sucker and redo allocation.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
use alloc_fdtable() instead of expand_files(), get rid of pointless
grabbing newf->file_lock, kill magic in copy_fdtable() that used to
be there only to skip copying when called from dup_fd().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>