linux-stable

Commit Graph

Author	SHA1	Message	Date
Jens Axboe	4fbef206da	nfsd: fix nfsd_vfs_read() splice actor setup When nfsd was transitioned to use splice instead of sendfile() for data transfers, a line setting the page index was lost. Restore it, so that nfsd is functional when that path is used. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-13 16:45:43 -07:00
Jens Axboe	51a92c0f6c	splice: fix offset mangling with direct splicing (sendfile) If the output actor doesn't transfer the full amount of data, we will increment ppos too much. Two related bugs in there: - We need to break out and return actor() retval if it is shorted than what we spliced into the pipe. - Adjust ppos only according to actor() return. Also fix loop problem in generic_file_splice_read(), it should not keep going when data has already been transferred. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-13 14:14:31 +02:00
James Morris	29ce20586b	security: revalidate rw permissions for sys_splice and sys_vmsplice Revalidate read/write permissions for splice(2) and vmslice(2), in case security policy has changed since the files were opened. Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-13 14:14:29 +02:00
Zhang Rui	91a6902958	sysfs: add parameter "struct bin_attribute " in .read/.write methods for sysfs binary attributes Well, first of all, I don't want to change so many files either. What I do: Adding a new parameter "struct bin_attribute " in the .read/.write methods for the sysfs binary attributes. In fact, only the four lines change in fs/sysfs/bin.c and include/linux/sysfs.h do the real work. But I have to update all the files that use binary attributes to make them compatible with the new .read and .write methods. I'm not sure if I missed any. :( Why I do this: For a sysfs attribute, we can get a pointer pointing to the struct attribute in the .show/.store method, while we can't do this for the binary attributes. I don't know why this is different, but this does make it not so handy to use the binary attributes as the regular ones. So I think this patch is reasonable. :) Who benefits from it: The patch that exposes ACPI tables in sysfs requires such an improvement. All the table binary attributes share the same .read method. Parameter "struct bin_attribute *" is used to get the table signature and instance number which are used to distinguish different ACPI table binary attributes. Without this parameter, we need to offer different .read methods for different ACPI table binary attributes. This is impossible as there are various ACPI tables on different platforms, and we don't know what they are until they are loaded. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:09 -07:00
Tejun Heo	51225039f3	sysfs: make directory dentries and inodes reclaimable This patch makes dentries and inodes for sysfs directories reclaimable. * sysfs_notify() is modified to walk sysfs_dirent tree instead of dentry tree. * sysfs_update_file() and sysfs_chmod_file() use sysfs_get_dentry() to grab the victim dentry. * sysfs_rename_dir() and sysfs_move_dir() grab all dentries using sysfs_get_dentry() on startup. * Dentries for all shadowed directories are pinned in memory to serve as lookup start point. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:09 -07:00
Tejun Heo	53e0ae9269	sysfs: implement sysfs_get_dentry() Some sysfs operations require dentry and inode. sysfs_get_dentry() looks up and gets dentry for the specified sysfs_dirent. It finds the first ancestor with dentry attached and starts looking up dentries from there. Looking up from the nearest ancestor is necessary to support shadowed directories because we can't reliably lookup dentry for one of the shadows. Dentries for each shadow will be pinned in memory such that they can serve as the starting point for dentry lookup. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:09 -07:00
Tejun Heo	a0edd7c848	sysfs: move sysfs_drop_dentry() to dir.c and make it static After add/remove path restructuring, the only user of sysfs_drop_dentry() is sysfs_addrm_finish(). Move sysfs_drop_dentry() to dir.c and make it static. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:09 -07:00
Tejun Heo	fb6896da37	sysfs: restructure add/remove paths and fix inode update The original add/remove code had the following problems. * parent's timestamps are updated on dentry instantiation. this is incorrect with reclaimable files. * updating parent's timestamps isn't synchronized. * parent nlink update assumes the inode is accessible which won't be true once directory dentries are made reclaimable. This patch restructures add/remove paths to resolve the above problems. Add/removal are done in the following steps. 1. sysfs_addrm_start() : acquire locks including sysfs_mutex and other resources. 2-a. sysfs_add_one() : add new sd. linking the new sd into the children list is caller's responsibility. 2-b. sysfs_remove_one() : remove a sd. unlinking the sd from the children list is caller's responsibility. 3. sysfs_addrm_finish() : release all resources and clean up. Steps 2-a and/or 2-b can be repeated multiple times. Parent's inode is looked up during sysfs_addrm_start(). If available (always at the moment), it's pinned and nlink is updated as sd's are added and removed. Timestamps are updated during finish if any sd has been added or removed. If parent's inode is not available during start, sysfs_mutex ensures that parent inode is not created till add/remove is complete. All the complexity is contained inside the helper functions. Especially, dentry/inode handling is properly hidden from the rest of sysfs which now mostly operate on sysfs_dirents. As an added bonus, codes which use these helpers to add and remove sysfs_dirents are now more structured and simpler. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:09 -07:00
Tejun Heo	3007e997de	sysfs: use sysfs_mutex to protect the sysfs_dirent tree As kobj sysfs dentries and inodes are gonna be made reclaimable, i_mutex can't be used to protect sysfs_dirent tree. Use sysfs_mutex globally instead. As the whole tree is protected with sysfs_mutex, there is no reason to keep sysfs_rename_sem. Drop it. While at it, add docbook comments to functions which require sysfs_mutex locking. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:08 -07:00
Tejun Heo	5f9953237f	sysfs: consolidate sysfs spinlocks Replace sysfs_lock and kobj_sysfs_assoc_lock with sysfs_assoc_lock. sysfs_lock was originally to be used to protect sysfs_dirent tree but mutex seems better choice, so there is no reason to keep sysfs_lock separate. Merge the two spinlocks into one. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:08 -07:00
Tejun Heo	608e266a2d	sysfs: make kobj point to sysfs_dirent instead of dentry As kobj sysfs dentries and inodes are gonna be made reclaimable, dentry can't be used as naming token for sysfs file/directory, replace kobj->dentry with kobj->sd. The only external interface change is shadow directory handling. All other changes are contained in kobj and sysfs. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:08 -07:00
Tejun Heo	f0b0af4792	sysfs: implement sysfs_find_dirent() and sysfs_get_dirent() Implement sysfs_find_dirent() and sysfs_get_dirent(). sysfs_dirent_exist() is replaced by sysfs_find_dirent(). These will be used to make directory entries reclamiable. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:08 -07:00
Tejun Heo	380e6fbb72	sysfs: implement SYSFS_FLAG_REMOVED flag Implement SYSFS_FLAG_REMOVED flag which currently is used only to improve sanity check in sysfs_deactivate(). The flag will be used to make directory entries reclamiable. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:08 -07:00
Tejun Heo	b402d72cf7	sysfs: rename sysfs_dirent->s_type to s_flags and make room for flags Rename sysfs_dirent->s_type to s_flags, pack type into lower eight bits and reserve the rest for flags. sysfs_type() can used to access the type. All existing sd->s_type accesses are converted to use sysfs_type(). While at it, type test is changed to equality test instead of bit-and test where appropriate. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:08 -07:00
Tejun Heo	d0bcb5689a	sysfs: make sysfs_drop_dentry() access inodes using ilookup() sysfs_drop_dentry() used to go through sd->s_dentry and sd->s_parent->s_dentry to access the inodes. This is incorrect because inode can be cached without dentry. This patch makes sysfs_drop_dentry() access inodes using ilookup() on sd->s_ino. This is both correct and simpler. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:08 -07:00
Rafael J. Wysocki	9d9307dabb	sysfs: Fix oops in sysfs_drop_dentry on x86_64 Fix oops on x86_64 caused by the dereference of dir in sysfs_drop_dentry() made before checking if dir is not NULL (cf. http://marc.info/?l=linux-kernel&m=118151626704924&w=2). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:07 -07:00
Tejun Heo	0c73f18b7d	sysfs: use singly-linked list for sysfs_dirent tree Make sysfs_dirent use singly linked list for its tree structure. sysfs_link_sibling() and sysfs_unlink_sibling() functions are added to handle simpler cases. It adds some complexity and cpu cycle overhead but reduced memory footprint is worthwhile on big machines. This change reduces the sizeof sysfs_dirent from 104 to 88 on 64bit and from 60 to 52 on 32bit. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:07 -07:00
Tejun Heo	8619f97989	sysfs: slim down sysfs_dirent->s_active Make sysfs_dirent->s_active an atomic_t instead of rwsem. This reduces the size of sysfs_dirent from 136 to 104 on 64bit and from 76 to 60 on 32bit with lock debugging turned off. With lock debugging turned on the reduction is much larger. s_active starts at zero and each active reference increments s_active. Putting a reference decrements s_active. Deactivation subtracts SD_DEACTIVATED_BIAS which is currently INT_MIN and assumed to be small enough to make s_active negative. If s_active is negative, sysfs_get() no longer grants new references. Deactivation succeeds immediately if there is no active user; otherwise, it waits using a completion for the last put. Due to the removal of lockdep tricks, this change makes things less trickier in release_sysfs_dirent(). As all the complexity is contained in three s_active functions, I think it's more readable this way. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:07 -07:00
Tejun Heo	b6b4a4399c	sysfs: move s_active functions to fs/sysfs/dir.c These functions are about to receive more complexity and doesn't really need to be inlined in the first place. Move them from fs/sysfs/sysfs.h to fs/sysfs/dir.c. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:07 -07:00
Tejun Heo	0b8ead82f5	sysfs: fix root sysfs_dirent -> root dentry association The root sysfs_dirent didn't point to the root dentry fix it. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:07 -07:00
Tejun Heo	8312a8d7c1	sysfs: use iget_locked() instead of new_inode() After dentry is reclaimed, sysfs always used to allocate new dentry and inode if the file is accessed again. This causes problem with operations which only pin the inode. For example, if inotify watch is added to a sysfs file and the dentry for the file is reclaimed, the next update event creates new dentry and new inode making the inotify watch miss all the events from there on. This patch fixes it by using iget_locked() instead of new_inode(). sysfs_new_inode() is renamed to sysfs_get_inode() and inode is initialized iff the inode is newly allocated. sysfs_instantiate() is responsible for unlocking new inodes. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:07 -07:00
Tejun Heo	fc9f54b998	sysfs: reorganize sysfs_new_indoe() and sysfs_create() Reorganize/clean up sysfs_new_inode() and sysfs_create(). * sysfs_init_inode() is separated out from sysfs_new_inode() and is responsible for basic initialization. * sysfs_instantiate() replaces the last step of sysfs_create() and is responsible for dentry instantitaion. * type-specific initialization is moved out to the callers. * mode is specified only once when creating a sysfs_dirent. * spurious list_del_init(&sd->s_sibling) dropped from create_dir() This change is to * prepare for inode allocation fix. * separate alloc and init code for synchronization update. * make dentry/inode initialization more flexible for later changes. This patch doesn't introduce visible behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:07 -07:00
Tejun Heo	7f7cfffe60	sysfs: fix parent refcounting during rename and move Parent reference wasn't properly transferred during rename and move. Fix it. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:07 -07:00
Tejun Heo	42b37df6ab	sysfs: make sysfs_alloc_ino() static sysfs_alloc_ino() isn't used out side of fs/sysfs/dir.c. Make it static. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:06 -07:00
Tejun Heo	7b595756ec	sysfs: kill unnecessary attribute->owner sysfs is now completely out of driver/module lifetime game. After deletion, a sysfs node doesn't access anything outside sysfs proper, so there's no reason to hold onto the attribute owners. Note that often the wrong modules were accounted for as owners leading to accessing removed modules. This patch kills now unnecessary attribute->owner. Note that with this change, userland holding a sysfs node does not prevent the backing module from being unloaded. For more info regarding lifetime rule cleanup, please read the following message. http://article.gmane.org/gmane.linux.kernel/510293 (tweaked by Greg to not delete the field just yet, to make it easier to merge things properly.) Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:06 -07:00
Tejun Heo	dbde0fcf9f	sysfs: reimplement sysfs_drop_dentry() This patch reimplements sysfs_drop_dentry() such that remove_dir() can use it to drop dentry instead of using a separate mechanism. With this change, making directories reclaimable is much easier. This patch used to contain fixes for two race conditions around sd->s_dentry but that part has been separated out and included into mainline early as commit `6aa054aadf` and `dd14cbc994`. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:06 -07:00
Tejun Heo	198a2a8470	sysfs: separate out sysfs_attach_dentry() Consolidate sd <-> dentry association into sysfs_attach_dentry() and call it after dentry and inode are properly set up. This is in preparation of sysfs_drop_dentry() updates. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:05 -07:00
Tejun Heo	73107cb3ad	sysfs: kill attribute file orphaning Now that sysfs_dirent can be disconnected from kobject on deletion, there is no need to orphan each attribute files. All [bin_]attribute nodes are automatically orphaned when the parent node is deleted. Kill attribute file orphaning. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:05 -07:00
Tejun Heo	0ab66088c8	sysfs: implement sysfs_dirent active reference and immediate disconnect sysfs: implement sysfs_dirent active reference and immediate disconnect Opening a sysfs node references its associated kobject, so userland can arbitrarily prolong lifetime of a kobject which complicates lifetime rules in drivers. This patch implements active reference and makes the association between kobject and sysfs immediately breakable. Now each sysfs_dirent has two reference counts - s_count and s_active. s_count is a regular reference count which guarantees that the containing sysfs_dirent is accessible. As long as s_count reference is held, all sysfs internal fields in sysfs_dirent are accessible including s_parent and s_name. The newly added s_active is active reference count. This is acquired by invoking sysfs_get_active() and it's the caller's responsibility to ensure sysfs_dirent itself is accessible (should be holding s_count one way or the other). Dereferencing sysfs_dirent to access objects out of sysfs proper requires active reference. This includes access to the associated kobjects, attributes and ops. The active references can be drained and denied by calling sysfs_deactivate(). All active sysfs_dirents must be deactivated after deletion but before the default reference is dropped. This enables immediate disconnect of sysfs nodes. Once a sysfs_dirent is deleted, it won't access any entity external to sysfs proper. Because attr/bin_attr ops access both the node itself and its parent for kobject, they need to hold active references to both. sysfs_get/put_active_two() helpers are provided to help grabbing both references. Parent's is acquired first and released last. Unlike other operations, mmapped area lingers on after mmap() is finished and the module implement implementing it and kobj need to stay referenced till all the mapped pages are gone. This is accomplished by holding one set of active references to the bin_attr and its parent if there have been any mmap during lifetime of an openfile. The references are dropped when the openfile is released. This change makes sysfs lifetime rules independent from both kobject's and module's. It not only fixes several race conditions caused by sysfs not holding onto the proper module when referencing kobject, but also helps fixing and simplifying lifetime management in driver model and drivers by taking sysfs out of the equation. Please read the following message for more info. http://article.gmane.org/gmane.linux.kernel/510293 Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:05 -07:00
Tejun Heo	eb36165353	sysfs: implement bin_buffer Implement bin_buffer which contains a mutex and pointer to PAGE_SIZE buffer to properly synchronize accesses to per-openfile buffer and prepare for immediate-kobj-disconnect. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:04 -07:00
Tejun Heo	2b29ac252a	sysfs: reimplement symlink using sysfs_dirent tree sysfs symlink is implemented by referencing dentry and kobject from sysfs_dirent - symlink entry references kobject, dentry is used to walk the tree. This complicates object lifetimes rules and is dangerous - for example, there is no way to tell to which module the target of a symlink belongs and referencing that kobject can make it linger after the module is gone. This patch reimplements symlink using only sysfs_dirent tree. sd for a symlink points and holds reference to the target sysfs_dirent and all walking is done using sysfs_dirent tree. Simpler and safer. Please read the following message for more info. http://article.gmane.org/gmane.linux.kernel/510293 Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:04 -07:00
Tejun Heo	aecdcedaab	sysfs: implement kobj_sysfs_assoc_lock kobj->dentry can go away anytime unless the user controls when the associated sysfs node is deleted. This patch implements kobj_sysfs_assoc_lock which protects kobj->dentry. This will be used to maintain kobj based API when converting sysfs to use sysfs_dirent tree instead of dentry/kobject. Note that this lock belongs to kobject/driver-model not sysfs. Once sysfs is converted to not use kobject in its interface, this can be removed from sysfs. This is in preparation of object reference simplification. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:04 -07:00
Tejun Heo	3e5190380e	sysfs: make sysfs_dirent->s_element a union Make sd->s_element a union of sysfs_elem_{dir\|symlink\|attr\|bin_attr} and rename it to s_elem. This is to achieve... * some level of type checking : changing symlink to point to sysfs_dirent instead of kobject is much safer and less painful now. * easier / standardized dereferencing * allow sysfs_elem_* to contain more than one entry Where possible, pointer is obtained by directly deferencing from sd instead of going through other entities. This reduces dependencies to dentry, inode and kobject. to_attr() and to_bin_attr() are unused now and removed. This is in preparation of object reference simplification. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:04 -07:00
Tejun Heo	0c096b507f	sysfs: add sysfs_dirent->s_name Add s_name to sysfs_dirent. This is to further reduce dependency to the associated dentry. Name is copied for directories and symlinks but not for attributes. Where possible, name dereferences are converted to use sd->s_name. sysfs_symlink->link_name and sysfs_get_name() are unused now and removed. This change allows symlink to be implemented using sysfs_dirent tree proper, which is the last remaining dentry-dependent sysfs walk. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:04 -07:00
Tejun Heo	13b3086d2e	sysfs: add sysfs_dirent->s_parent Add sysfs_dirent->s_parent. With this patch, each sd points to and holds a reference to its parent. This allows walking sysfs tree without referencing sd->s_dentry which can go away anytime if the user doesn't control when it's deleted. sd->s_parent is initialized and parent is referenced in sysfs_attach_dirent(). Reference to parent is released when the sd is released, so as long as reference to a sd is held, s_parent can be followed. dentry walk in sysfs_readdir() is convereted to s_parent walk. This will be used to reimplement symlink such that it uses only sysfs_dirent tree. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:04 -07:00
Tejun Heo	a26cd7226c	sysfs: consolidate sysfs_dirent creation functions Currently there are four functions to create sysfs_dirent - __sysfs_new_dirent(), sysfs_new_dirent(), __sysfs_make_dirent() and sysfs_make_dirent(). Other than sysfs_make_dirent(), no function has two users if calls to implement other functions are excluded. This patch consolidates sysfs_dirent creation functions into the following two. * sysfs_new_dirent() : allocate and initialize * sysfs_attach_dirent() : attach to sysfs_dirent hierarchy and/or associate with dentry This simplifies interface and gives callers more flexibility. This is in preparation of object reference simplification. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:04 -07:00
Tejun Heo	996b73764e	sysfs: flatten and fix sysfs_rename_dir() error handling Error handling in sysfs_rename_dir() was broken. * When lookup_one_len() fails, 0 is returned. * If parent inode check fails, returns with inode mutex and rename rwsem held. This patch fixes the above bugs and flattens error handling such that it's more readable and easier to modify. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:03 -07:00
Tejun Heo	dfeb9fb034	sysfs: flatten cleanup paths in sysfs_add_link() and create_dir() Flatten cleanup paths in sysfs_add_link() and create_dir() to improve readability and ease further changes to these functions. This is in preparation of object reference simplification. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:03 -07:00
Tejun Heo	93e3cd8270	sysfs: fix error handling in binattr write() Error handling in fs/sysfs/bin.c:write() was wrong because size_t count is used to receive return value from flush_write() which is negative on failure. This patch updates write() such that int variable is used instead. read() is updated the same way for consistency. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:03 -07:00
Tejun Heo	7a23ad4404	sysfs: make sysfs_put() ignore NULL sd Make sysfs_put() ignore NULL sd instead of oopsing. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:03 -07:00
Tejun Heo	2b611bb7ab	sysfs: allocate inode number using ida sysfs used simple incrementing allocator which is not guaranteed to be unique. This patch makes sysfs use ida to give each sd a unique and packed inode number. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:03 -07:00
Tejun Heo	fa7f912ad4	sysfs: move release_sysfs_dirent() to dir.c There is no reason this function should be inlined and soon to follow sysfs object reference simplification will make it heavier. Move it to dir.c. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:03 -07:00
Jan Kara	cfc94cdf8e	debugfs: add rename for debugfs files Implement debugfs_rename() to allow renaming files/directories in debugfs. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:00 -07:00
Frank Filz	137d6acaa6	NFSv4: Make sure unlock is really an unlock when cancelling a lock I ran into a curious issue when a lock is being canceled. The cancellation results in a lock request to the vfs layer instead of an unlock request. This is particularly insidious when the process that owns the lock is exiting. In that case, sometimes the erroneous lock is applied AFTER the process has entered zombie state, preventing the lock from ever being released. Eventually other processes block on the lock causing a slow degredation of the system. In the 2.6.16 kernel this was investigated on, the problem is compounded by the fact that the cl_sem is held while blocking on the vfs lock, which results in most processes accessing the nfs file system in question hanging. In more detail, here is how the situation occurs: first _nfs4_do_setlk(): static int _nfs4_do_setlk(struct nfs4_state state, int cmd, struct file_lock fl, int reclaim) ... ret = nfs4_wait_for_completion_rpc_task(task); if (ret == 0) { ... } else data->cancelled = 1; then nfs4_lock_release(): static void nfs4_lock_release(void calldata) ... if (data->cancelled != 0) { struct rpc_task task; task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp, data->arg.lock_seqid); The problem is the same file_lock that was passed in to _nfs4_do_setlk() gets passed to nfs4_do_unlck() from nfs4_lock_release(). So the type is still F_RDLCK or FWRLCK, not F_UNLCK. At some point, when cancelling the lock, the type needs to be changed to F_UNLCK. It seemed easiest to do that in nfs4_do_unlck(), but it could be done in nfs4_lock_release(). The concern I had with doing it there was if something still needed the original file_lock, though it turns out the original file_lock still needs to be modified by nfs4_do_unlck() because nfs4_do_unlck() uses the original file_lock to pass to the vfs layer, and a copy of the original file_lock for the RPC request. It seems like the simplest solution is to force all situations where nfs4_do_unlck() is being used to result in an unlock, so with that in mind, I made the following change: Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:49 -04:00
Frank van Maarseveen	c98451bdb2	NLM: fix source address of callback to client Use the destination address of the original NLM request as the source address in callbacks to the client. Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:49 -04:00
Trond Myklebust	6f2e64d3e1	NFSv4: Make the NFS state model work with the nosharedcache mount option Consider the case where the user has mounted the remote filesystem server:/foo on the two local directories /bar and /baz using the nosharedcache mount option. The files /bar/file and /baz/file are represented by different inodes in the local namespace, but refer to the same file /foo/file on the server. Consider the case where a process opens both /bar/file and /baz/file, then closes /bar/file: because the nfs4_state is not shared between /bar/file and /baz/file, the kernel will see that the nfs4_state for /bar/file is no longer referenced, so it will send off a CLOSE rpc call. Unless the open_owners differ, then that CLOSE call will invalidate the open state on /baz/file too. Conclusion: we cannot share open state owners between two different non-shared mount instances of the same filesystem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:48 -04:00
Trond Myklebust	275a5d24bf	NFS: Error when mounting the same filesystem with different options Unless the user sets the NFS_MOUNT_NOSHAREDCACHE mount flag, we should return EBUSY if the filesystem is already mounted on a superblock that has set conflicting mount options. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:48 -04:00
Trond Myklebust	75180df2ed	NFS: Add the mount option "nosharecache" Prior to David Howell's mount changes in 2.6.18, users who mounted different directories which happened to be from the same filesystem on the server would get different super blocks, and hence could choose different mount options. As long as there were no hard linked files that crossed from one subtree to another, this was quite safe. Post the changes, if the two directories are on the same filesystem (have the same 'fsid'), they will share the same super block, and hence the same mount options. Add a flag to allow users to elect not to share the NFS super block with another mount point, even if the fsids are the same. This will allow users to set different mount options for the two different super blocks, as was previously possible. It is still up to the user to ensure that there are no cache coherency issues when doing this, however the default behaviour will be to share super blocks whenever two paths result in the same fsid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:48 -04:00
Chuck Lever	8007122520	NFS: Add support for mounting NFSv4 file systems with string options Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:48 -04:00
Chuck Lever	136d558ce7	NFS: Add final pieces to support in-kernel mount option parsing Hook in final components required for supporting in-kernel mount option parsing for NFSv2 and NFSv3 mounts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:48 -04:00
Chuck Lever	0076d7b7ba	NFS: Introduce generic mount client API For NFSv2 and v3 mounts, the first step is to contact the server's MOUNTD and request the file handle for the root of the mounted share. Add a function to the NFS client that handles this operation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:47 -04:00
Chuck Lever	bf0fd7680f	NFS: Add enums and match tables for mount option parsing This generic infrastructure works for both NFS and NFSv4 mounts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:47 -04:00
Chuck Lever	013a8c1ab5	NFS: Improve debugging output in NFS in-kernel mount client Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:47 -04:00
Chuck Lever	19207231c9	NFS: Clean up in-kernel NFS mount Clean up white space and coding conventions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:46 -04:00
Chuck Lever	3ea97309e6	NFS: Remake nfsroot_mount as a permanent part of NFS client In preparation for supporting NFSv2 and NFSv3 mount option handling in the kernel NFS client, convert mount_clnt.c to be a permanent part of the NFS client, instead of built only when CONFIG_ROOT_NFS is enabled. In addition, we also replace the "struct sockaddr_in *" argument with something more generic, to help support IPv6 at some later point. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:46 -04:00
Chuck Lever	43780b87fa	SUNRPC: Add a convenient default for the hostname when calling rpc_create() A couple of callers just use a stringified IP address for the rpc client's hostname. Move the logic for constructing this into rpc_create(), so it can be shared. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:46 -04:00
Chuck Lever	cce63cd637	SUNRPC: Rename rpcb_getport_external routine In preparation for handling NFS mount option parsing in the kernel, rename rpcb_getport_external as rpcb_get_port_sync, and make it available always (instead of only when CONFIG_ROOT_NFS is enabled). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:46 -04:00
Chuck Lever	f0768ebd09	NFS: Introduce nfs4_validate_mount_options Refactor NFSv4 mount processing to break out mount data validation in the same way it's broken out in the NFSv2/v3 mount path. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:45 -04:00
Chuck Lever	5df36e78da	NFS: Clean up nfs_validate_mount_data Move error handling code out of the main code path. The switch statement was also improperly indented, according to Documentation/CodingStyle. This prepares nfs_validate_mount_data for the addition of option string parsing. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:45 -04:00
Chuck Lever	fc50d58fd0	NFS: Clean-up: Refactor IP address sanity checks in NFS client NFS and NFSv4 mounts can now share server address sanity checking. And, it provides an easy mechanism for adding IPv6 address checking at some later point. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:44 -04:00
Chuck Lever	4d81cd1611	NFS: Clean-up: fix a compiler warning in fs/nfs/super.c /home/cel/linux/fs/nfs/super.c: In function 'nfs_pseudoflavour_to_name': /home/cel/linux/fs/nfs/super.c:270: warning: comparison between signed and unsigned Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:44 -04:00
Chuck Lever	0655960f76	NFS: Clean up error handling in nfs_get_sb The error return logic in nfs_get_sb now matches nfs4_get_sb, and is more maintainable. A subsequent patch will take advantage of this simplification. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:44 -04:00
Chuck Lever	29eb981a3b	NFS: Clean-up: Replace nfs_copy_user_string with strndup_user The new string utility function strndup_user can be used instead of nfs_copy_user_string, eliminating an unnecessary duplication of function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:44 -04:00
Chuck Lever	5680d48be8	NFS: Clean-up: Define macros for maximum host and export path name lengths Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:44 -04:00
Chuck Lever	9eaa67c6a5	NFS: Clean-up: use correct type when converting NFS blocks to local blocks inode->i_blocks is a blkcnt_t these days, which can be a u64 or unsigned long, depending on the setting of CONFIG_LSF. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:44 -04:00
Trond Myklebust	8bda4e4c98	NFSv4: Fix up stateid locking... We really don't need to grab both the state->so_owner and the inode->i_lock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:43 -04:00
Trond Myklebust	1ac7e2fd35	NFSv4: Clean up the callers of nfs4_open_recover_helper() Rely on nfs4_try_open_cached() when appropriate. Also fix an RCU violation in _nfs4_do_open_reclaim() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:43 -04:00
Trond Myklebust	6ee4126890	NFSv4: Don't call OPEN if we already have an open stateid for a file If we already have a stateid with the correct open mode for a given file, then we can reuse that stateid instead of re-issuing an OPEN call without violating the close-to-open caching semantics. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:43 -04:00
Trond Myklebust	aac00a8d0a	NFSv4: Check for the existence of a delegation in nfs4_open_prepare() We should not be calling open() on an inode that has a delegation unless we're doing a reclaim. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:43 -04:00
Trond Myklebust	3e309914a1	NFSv4: Clean up _nfs4_proc_open() Use a flag instead of the 'data->rpc_status = -ENOMEM hack. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:42 -04:00
Trond Myklebust	1b370bc28f	NFSv4: Allow nfs4_opendata_to_nfs4_state to return errors. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:42 -04:00
Trond Myklebust	6f43ddccb3	NFSv4: Improve the debugging of bad sequence id errors... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:42 -04:00
Trond Myklebust	003707c722	NFSv4: Always use the delegation if we have one Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:41 -04:00
Trond Myklebust	0f9f95e0ad	NFSv4: Clean up confirmation of sequence ids... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:41 -04:00
Trond Myklebust	412c77cee6	NFSv4: Defer inode revalidation when setting up a delegation Currently we force a synchronous call to __nfs_revalidate_inode() in nfs_inode_set_delegation(). This not only ensures that we cannot call nfs_inode_set_delegation from an asynchronous context, but it also slows down any call to open(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:41 -04:00
Trond Myklebust	8383e4602c	NFSv4: Use RCU to protect delegations Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:41 -04:00
Trond Myklebust	13437e12fb	NFSv4: Support recalling delegations by stateid part 2 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:41 -04:00
Trond Myklebust	9016302784	NFSv4: Support recalling delegations by stateid There appear to be some rogue servers out there that issue multiple delegations with different stateids for the same file. Ensure that when we return delegations, we do so on a per-stateid basis rather than a per-file basis. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:40 -04:00
Trond Myklebust	2ced46c270	NFSv4: Fix up a bug in nfs4_open_recover() Don't clobber the delegation info... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:40 -04:00
Trond Myklebust	549d6ed5e8	NFSv4: set the delegation in nfs4_opendata_to_nfs4_state This ensures that nfs4_open_release() and nfs4_open_confirm_release() can now handle an eventual delegation that was returned with out open. As such, it fixes a delegation "leak" when the user breaks out of an open call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:40 -04:00
Trond Myklebust	1c816efa24	NFSv4: Fix a bug in __nfs4_find_state_byowner The test for state->state == 0 does not tell you that the stateid is in the process of being freed. It really tells you that the stateid is not yet initialised... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:40 -04:00
Trond Myklebust	1b45c46cf7	NFSv4: Fix atomic open for execute... Currently we do not check for the FMODE_EXEC flag as we should. For that particular case, we need to perform an ACCESS call to the server in order to check that the file is executable. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:40 -04:00
Trond Myklebust	9f958ab885	NFSv4: Reduce the chances of an open_owner identifier collision Currently we just use a 32-bit counter. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:39 -04:00
Trond Myklebust	88d9093997	NFSv4: nfs_increment_open_seqid should not return a value It is a void function... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:39 -04:00
Trond Myklebust	e6889620e8	NFSv4: Fix underestimate of NFSv4 lookup request size Also fix up the underestimate of fs_locations Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:39 -04:00
Trond Myklebust	2cebf82883	NFSv4: Fix the underestimate of NFSv4 open request size The maximum size depends on the filename size and a number of other elements which are currently not being counted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:39 -04:00
Trond Myklebust	bd625ba80d	NFSv4: Fix the NFSv4 owner and owner_group size estimates Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:39 -04:00
Trond Myklebust	7af654f8d1	NFSv4: Don't reuse expired nfs4_state_owner structs That just confuses certain NFSv4 servers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:38 -04:00
Trond Myklebust	27b3f949b7	NFSv4: Fix a credential reference leak in nfs4_get_state_owner() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:38 -04:00
Trond Myklebust	587142f85f	NFS: Replace NFS_I(inode)->req_lock with inode->i_lock There is no justification for keeping a special spinlock for the exclusive use of the NFS writeback code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:38 -04:00
Trond Myklebust	4e56e082dd	NFSv4: Clean up _nfs4_proc_lookup() vs _nfs4_proc_lookupfh() They differ only slightly in the arguments they take. Why have they not been merged? Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:38 -04:00
Trond Myklebust	1be27f3660	SUNRPC: Remove the tk_auth macro... We should almost always be deferencing the rpc_auth struct by means of the credential's cr_auth field instead of the rpc_clnt->cl_auth anyway. Fix up that historical mistake, and remove the macro that propagated it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:37 -04:00
Trond Myklebust	f61534dfd3	SUNRPC: Remove redundant calls to rpciod_up()/rpciod_down() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:30 -04:00
Trond Myklebust	90c5755ff5	SUNRPC: Kill rpc_clnt->cl_oneshot Replace it with explicit calls to rpc_shutdown_client() or rpc_destroy_client() (for the case of asynchronous calls). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:29 -04:00
Trond Myklebust	34f52e3591	SUNRPC: Convert rpc_clnt->cl_users to a kref Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:28 -04:00
Trond Myklebust	c6d00e639b	NFSv4: Convert struct nfs4_opendata to use struct kref Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:28 -04:00
Trond Myklebust	3bec63db55	NFS: Convert struct nfs_open_context to use a kref Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:27 -04:00
Trond Myklebust	edc05fc1c2	NFS: reduce latency by using conditional rescheduling in nfs_scan_list Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:27 -04:00
Trond Myklebust	dce34ce298	NFS: Prevent integer overflow in nfs_scan_list() Also ensure that nfs_inode ncommit and npages are large enough to represent all possible values for the number of pages. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:27 -04:00
Trond Myklebust	2aefa10431	NFS: Remove the redundant 'dirty' and 'commit' lists from nfs_inode Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:26 -04:00
Trond Myklebust	5c36968343	NFS cleanup: speed up nfs_scan_commit using radix tree tags Add a tag for requests that are waiting for a COMMIT Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:26 -04:00
Trond Myklebust	9fd367f0f3	NFS cleanup: Rename NFS_PAGE_TAG_WRITEBACK to NFS_PAGE_TAG_LOCKED Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:26 -04:00
Trond Myklebust	c03b402461	NFS: Convert struct nfs_page to use krefs Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:26 -04:00
Trond Myklebust	a50f7951a3	NFS: Fix an Oops in the nfs_access_cache_shrinker() The nfs_access_cache_shrinker may race with nfs_access_zap_cache(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:25 -04:00
Trond Myklebust	e2f032e9ef	NFS: nfs3_proc_create() should use nfs_post_op_update_inode() Also get rid of a redundant call to nfs_setattr_update_inode(). The call to nfs3_proc_setattr() already takes care of that. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:25 -04:00
Jeff Layton	aa53ed541a	NFS4: on a O_EXCL OPEN make sure SETATTR sets the fields holding the verifier The Linux NFS4 client simply skips over the bitmask in an O_EXCL open call and so it doesn't bother to reset any fields that may be holding the verifier. This patch has us save the first two words of the bitmask (which is all the current client has #defines for). The client then later checks this bitmask and turns on the appropriate flags in the sattr->ia_verify field for the following SETATTR call. This patch only currently checks to see if the server used the atime and mtime slots for the verifier (which is what the Linux server uses for this). I'm not sure of what other fields the server could reasonably use, but adding checks for others should be trivial. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:25 -04:00
Trond Myklebust	fc6ae3cf48	NFS: Re-enable forced umounts They disappeared some time around 2.6.18. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:25 -04:00
Jeff Layton	83d93f2229	NFS: Use GFP_HIGHUSER for page allocation in nfs_symlink() nfs_symlink() allocates a GFP_KERNEL page for the pagecache. Most pagecache pages are allocated using GFP_HIGHUSER, and there's no reason not to do that in nfs_symlink() as well. Signed-off-by: Jeff Layton <jlayton@redhat.com>	2007-07-10 23:40:25 -04:00
Trond Myklebust	a0356862bc	NFS: Fix nfs_reval_fsid() We don't need to revalidate the fsid on the root directory. It suffices to revalidate it on the current directory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:24 -04:00
Trond Myklebust	b39e625b6e	NFSv4: Clean up nfs4_call_async() Use rpc_run_task() instead of doing it ourselves. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:24 -04:00
Trond Myklebust	4a35bd41af	NFSv4: Ensure that nfs4_do_close() doesn't race with umount nfs4_do_close() does not currently have any way to ensure that the user won't attempt to unmount the partition while the asynchronous RPC call is completing. This again may cause Oopses in nfs_update_inode(). Add a vfsmount argument to nfs4_close_state to ensure that the partition remains mounted while we're closing the file. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:24 -04:00
Trond Myklebust	ad389da79f	NFSv4: Ensure asynchronous open() calls always pin the mountpoint A number of race conditions may currently ensue if the user presses ^C and then unmounts the partition while an asynchronous open() is in progress. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:24 -04:00
Trond Myklebust	539cd03a57	NFSv4: Cleanup: pass the nfs_open_context to open recovery code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:24 -04:00
Trond Myklebust	88be9f990f	NFS: Replace vfsmount and dentry in nfs_open_context with struct path Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:23 -04:00
Trond Myklebust	de05a0cc2a	NFS: Minor read optimisation... Since PG_uptodate may now end up getting set during the call to nfs_wb_page(), we can avoid putting a read request on the wire in those situations. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:23 -04:00
Trond Myklebust	44dd151d5c	NFS: Don't mark a written page as uptodate until it is on disk The write may fail, so we should not mark the page as uptodate until we are certain that the data has been accepted and written to disk by the server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:23 -04:00
Trond Myklebust	d9df8d6b38	NFS: Don't fail an O_DIRECT read/write if get_user_pages() returns pages There is no need to fail the entire O_DIRECT read/write just because get_user_pages() returned fewer pages than we requested. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:23 -04:00
Chuck Lever	070ea60214	NFS: Clean ups in fs/nfs/direct.c Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-07-10 23:40:23 -04:00
Pavel Emelianov	bcf67e1625	Make common helpers for seq_files that work with list_heads Many places in kernel use seq_file API to iterate over a regular list_head. The code for such iteration is identical in all the places, so it's worth introducing a common helpers. This makes code about 300 lines smaller: The first version of this patch made the helper functions static inline in the seq_file.h header. This patch moves them to the fs/seq_file.c as Andrew proposed. The vmlinux .text section sizes are as follows: 2.6.22-rc1-mm1: 0x001794d5 with the previous version: 0x00179505 with this patch: 0x00179135 The config file used was make allnoconfig with the "y" inclusion of all the possible options to make the files modified by the patch compile plus drivers I have on the test node. This patch: Many places in kernel use seq_file API to iterate over a regular list_head. The code for such iteration is identical in all the places, so it's worth introducing a common helpers. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-10 17:51:13 -07:00
Linus Torvalds	9f9d763216	Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] vmlogrdr function annotation. [S390] s390: rename CPU_IDLE to S390_CPU_IDLE [S390] cio: Remove prototype for non-existing function cmf_reset(). [S390] zcrypt: fix request timeout handling [S390] system call optimization. [S390] dasd: Avoid compile warnings on !CONFIG_DASD_PROFILE [S390] Remove volatile from atomic_t [S390] Program check in diag 210 under 31 bit [S390] Bogomips calculation for 64 bit. [S390] smp: Merge smp_count_cpus() and smp_get_save_areas(). [S390] zcore: Fix __user annotation. [S390] fixed cdl-format detection. [S390] sclp: Test facility list before executing a service call. [S390] sclp: introduce some new interfaces. [S390] Fixed comment typo. [S390] vmcp cleanup	2007-07-10 14:46:09 -07:00
Linus Torvalds	1b21f458dd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (57 commits) [GFS2] Accept old format NFS filehandles [GFS2] Small fixes to logging code [DLM] dump more lock values [GFS2] Remove i_mode passing from NFS File Handle [GFS2] Obtaining no_formal_ino from directory entry [GFS2] git-gfs2-nmw-build-fix [GFS2] System won't suspend with GFS2 file system mounted [GFS2] remounting w/o acl option leaves acls enabled [GFS2] inode size inconsistency [DLM] Telnet to port 21064 can stop all lockspaces [GFS2] Fix gfs2_block_truncate_page err return [GFS2] Addendum to the journaled file/unmount patch [GFS2] Simplify multiple glock aquisition [GFS2] assertion failure after writing to journaled file, umount [GFS2] Use zero_user_page() in stuffed_readpage() [GFS2] Remove bogus '\0' in rgrp.c [GFS2] Journaled file write/unstuff bug [DLM] don't require FS flag on all nodes [GFS2] Fix deallocation issues [GFS2] return conflicts for GETLK ...	2007-07-10 13:56:13 -07:00
Linus Torvalds	01370f0603	Merge branch 'splice-2.6.23' of git://git.kernel.dk/data/git/linux-2.6-block * 'splice-2.6.23' of git://git.kernel.dk/data/git/linux-2.6-block: pipe: add documentation and comments pipe: change the ->pin() operation to ->confirm() Remove remnants of sendfile() xip sendfile removal splice: completely document external interface with kerneldoc sendfile: remove bad_sendfile() from bad_file_ops shmem: convert to using splice instead of sendfile() relay: use splice_to_pipe() instead of open-coding the pipe loop pipe: allow passing around of ops private pointer splice: divorce the splice structure/function definitions from the pipe header splice: relay support sendfile: convert nfsd to splice_direct_to_actor() sendfile: convert nfs to using splice_read() loop: convert to using splice_direct_to_actor() instead of sendfile() splice: add void cookie to the actor data sendfile: kill generic_file_sendfile() sendfile: remove .sendfile from filesystems that use generic_file_sendfile() sys_sendfile: switch to using ->splice_read, if available vmsplice: add vmsplice-to-user support splice: abstract out actor data	2007-07-10 13:51:06 -07:00
Steven Whitehouse	3ebf44902f	[GFS2] Accept old format NFS filehandles On Tue, 2007-07-10 at 10:06 +0100, Christoph Hellwig wrote: > > -#define GFS2_LARGE_FH_SIZE 10 > > - > > -struct gfs2_fh_obj { > > - struct gfs2_inum_host this; > > - u32 imode; > > -}; > > +#define GFS2_LARGE_FH_SIZE 8 > > Because gfs2_decode_fh only accepts file handles with GFS2_LARGE_FH_SIZE > or GFS2_LARGE_FH_SIZE you don't accept filehandles sent out by and older > gfs version anymore. Stale filehandles because of a new kernel version > are a big no-no, so please add back code to handle the old filehandles > on the decode side. > This should fix that problem I think since its only relating to end of the fh we can just ignore that field in order to accept the older format. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Wendy Cheng <wcheng@redhat.com>	2007-07-10 12:28:27 +01:00
Stefan Haberland	bf1a95a225	[S390] fixed cdl-format detection. CDL formated DASDs are now detected correctly even if no VOL1 label is on the disk. This prevents possible loss of data. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-07-10 11:24:44 +02:00
Jens Axboe	0845718daf	pipe: add documentation and comments As per Andrew Mortons request, here's a set of documentation for the generic pipe_buf_operations hooks, the pipe, and pipe_buffer structures. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:16 +02:00
Jens Axboe	cac36bb06e	pipe: change the ->pin() operation to ->confirm() The name 'pin' was badly chosen, it doesn't pin a pipe buffer in the most commonly used sense in the kernel. So change the name to 'confirm', after debating this issue with Hugh Dickins a bit. A good return from ->confirm() means that the buffer is really there, and that the contents are good. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:15 +02:00
Jens Axboe	d96e6e7164	Remove remnants of sendfile() There are now zero users of .sendfile() in the kernel, so kill it from the file_operations structure and in do_sendfile(). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:15 +02:00
Carsten Otte	d054fe3d10	xip sendfile removal This patch removes xip_file_sendfile, the sendfile implementation for xip without replacement. Those customers that use xip on s390 are not using sendfile() as far as we know, and so far s390 is the only platform this could potentially be used on so far. Having sendfile is not a popular feature for execute in place file systems, however we have a working implementation of splice_read() based on fs/splice.c if anyone asks for it. At this point in time, it does not seem preferable to merge splice_read() for xip because it causes extra maintenence effort due to code duplication and it requires struct page behind the xip memory segment. We'd like to get rid of that in favor of supporting flash based embedded platforms (Monta Vista work) soon. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:15 +02:00
Jens Axboe	932cc6d4f7	splice: completely document external interface with kerneldoc Also add fs/splice.c as a kerneldoc target with a smaller blurb that should be expanded to better explain the overview of splice. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:15 +02:00
Jens Axboe	d6f517568f	sendfile: remove bad_sendfile() from bad_file_ops do_sendfile() prefers splice over sendfile, so it should not trigger (directly, at least). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:15 +02:00
Jens Axboe	497f9625c2	pipe: allow passing around of ops private pointer relay needs this for proper consumption handling, and the network receive support needs it as well to lookup the sk_buff on pipe release. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:14 +02:00
Jens Axboe	d6b29d7cee	splice: divorce the splice structure/function definitions from the pipe header We need to move even more stuff into the header so that folks can use the splice_to_pipe() implementation instead of open-coding a lot of pipe knowledge (see relay implementation), so move to our own header file finally. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:14 +02:00
Jens Axboe	cf8208d0ea	sendfile: convert nfsd to splice_direct_to_actor() Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:14 +02:00
Jens Axboe	f0930fffa9	sendfile: convert nfs to using splice_read() Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:14 +02:00
Jens Axboe	5ffc4ef45b	sendfile: remove .sendfile from filesystems that use generic_file_sendfile() They can use generic_file_splice_read() instead. Since sys_sendfile() now prefers that, there should be no change in behaviour. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:13 +02:00
Jens Axboe	534f2aaa6a	sys_sendfile: switch to using ->splice_read, if available This patch makes sendfile prefer to use ->splice_read(), if it's available in the file_operations structure. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:12 +02:00
Jens Axboe	6a14b90bb6	vmsplice: add vmsplice-to-user support A bit of a cheat, it actually just copies the data to userspace. But this makes the interface nice and symmetric and enables people to build on splice, with room for future improvement in performance. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:12 +02:00
Jens Axboe	c66ab6fa70	splice: abstract out actor data For direct splicing (or private splicing), the output may not be a file. So abstract out the handling into a specified actor function and put the data in the splice_desc structure earlier, so we can build on top of that. This is the first step in better splice handling for drivers, and also for implementing vmsplice _to_ user memory. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:04:12 +02:00
Adrian Bunk	72d3a38ee0	unexport bio_{,un}map_user bio_{,un}map_user no longer have any modular users. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:03:34 +02:00
Linus Torvalds	71d441ddb5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6: JFS: Update print_hex_dump() syntax JFS: use print_hex_dump() rather than private dump_mem() function JFS: Whitespace cleanup and remove some dead code	2007-07-09 13:09:16 -07:00
Ingo Molnar	43ae34cb4c	sched: scheduler debugging, core scheduler debugging core: implement /proc/sched_debug and /proc/<PID>/sched files for scheduler debugging. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-07-09 18:52:00 +02:00
Balbir Singh	172ba844a8	sched: update delay-accounting to use CFS's precise stats update delay-accounting to use CFS's precise stats. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-07-09 18:52:00 +02:00
Ingo Molnar	b27f03d4bd	sched: make use of precise accounting for /proc task stats make use of CFS's precise accounting to drive /proc/<pid>/stat statistics. this code was co-authored by: Balbir Singh <balbir@linux.vnet.ibm.com> Dmitry Adamushko <dmitry.adamushko@gmail.com> Ingo Molnar <mingo@elte.hu> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>	2007-07-09 18:51:59 +02:00
Ingo Molnar	62480d13d5	sched: remove the SleepAVG field remove the SleepAVG field from /proc/<pid>/status, as with the removal of the sleep-average code this value no longer makes sense. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-07-09 18:51:59 +02:00
Steven Whitehouse	a0a24741ca	[GFS2] Small fixes to logging code This reverts part of an earlier patch which tried to reclaim gfs2_bufdata structures too early and resulted in a "use after free" case (this bit from me). Also a change to not write out log headers unless we really need to (in the case of flushing nothing we don't need a header) from Bob. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2007-07-09 15:43:07 +01:00
David Teigland	ac90a25525	[DLM] dump more lock values Add two more output fields (lkb_flags and rsb nodeid) to the new debugfs file that dumps one lock per line. Also, dump all locks instead of just mastered locks. Accordingly, use a suffix of _locks instead of _master. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:13 +01:00
Wendy Cheng	35dcc52e3a	[GFS2] Remove i_mode passing from NFS File Handle GFS2 has been passing i_mode within NFS File Handle. Other than the wrong assumption that there is always room for this extra 16 bit value, the current gfs2_get_dentry doesn't really need the i_mode to work correctly. Note that GFS2 NFS code does go thru the same lookup code path as direct file access route (where the mode is obtained from name lookup) but gfs2_get_dentry() is coded for different purpose. It is not used during lookup time. It is part of the file access procedure call. When the call is invoked, if on-disk inode is not in-memory, it has to be read-in. This makes i_mode passing a useless overhead. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:11 +01:00
Wendy Cheng	bb9bcf0616	[GFS2] Obtaining no_formal_ino from directory entry GFS2 lookup code doesn't ask for inode shared glock. This implies during in-memory inode creation for existing file, GFS2 will not disk-read in the inode contents. This leaves no_formal_ino un-initialized during lookup time. The un-initialized no_formal_ino is subsequently encoded into file handle. Clients will get ESTALE error whenever it tries to access these files. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:08 +01:00
akpm@linux-foundation.org	f4fadb23ca	[GFS2] git-gfs2-nmw-build-fix Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:06 +01:00
Abhijith Das	b365762924	[GFS2] System won't suspend with GFS2 file system mounted The kernel threads in gfs2, namely gfs2_scand, gfs2_logd, gfs2_quotad, gfs2_glockd, gfs2_recoverd weren't doing anything when the suspend mechanism was trying to freeze them. I put in calls to refrigerator() in the loops for all the daemons and suspend works as expected. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2007-07-09 08:24:04 +01:00

1 2 3 4 5 ...

5904 Commits