vfs-6.9.misc

-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZem3wQAKCRCRxhvAZXjc
 otRMAQDeo8qsuuIAcS2KUicKqZR5yMVvrY9r4sQzf7YRcJo5HQD+NQXkKwQuv1VO
 OUeScsic/+I+136AgdjWnlEYO5dp0go=
 =4WKU
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Misc features, cleanups, and fixes for vfs and individual filesystems.

  Features:

   - Support idmapped mounts for hugetlbfs (see the mount_setattr()
     sketch after this list).

   - Add an RWF_NOAPPEND flag for pwritev2(). This allows us to fix a bug
     where the passed offset is ignored if the file was opened with
     O_APPEND. The new flag lets a caller enforce that the offset is
     honored, conforming to POSIX even if the file was opened in append
     mode (a usage sketch follows this list).

   - Move i_mmap_rwsem in struct address_space to avoid false sharing
     between i_mmap and i_mmap_rwsem.

   - Convert efs, qnx4, and coda to use the new mount api.

   - Add a generic is_dot_dotdot() helper that's used by various
     filesystems and the VFS code instead of open-coding the check
     multiple times (a sketch of the helper follows this list).

   - Recently we added stable offsets, which allow stable ordering when
     iterating directories exported through NFS on e.g. tmpfs
     filesystems. Originally an xarray was used for the offset map, but
     that caused slab fragmentation issues over time. This switches the
     offset map to the maple tree, which has a dense mode that handles
     this scenario a lot better. Includes tests.

   - Finally merge the case-insensitive improvement series Gabriel has
     been working on for a long time. This cleanly propagates
     case-insensitive operations through ->s_d_op, which in turn allows
     us to remove the quite ugly generic_set_encrypted_ci_d_ops()
     helper. It also improves performance by trying a case-sensitive
     comparison first and falling back to a case-insensitive lookup only
     if that fails. This also fixes a bug where overlayfs could be
     mounted over a case-insensitive directory, which would lead to all
     sorts of odd behaviors.
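
   As an illustration of the hugetlbfs idmapped-mount support, here is a
   minimal userspace sketch. It assumes glibc >= 2.36 (for the
   open_tree()/mount_setattr()/move_mount() wrappers in <sys/mount.h>);
   the paths and the user-namespace fd are hypothetical:

       #define _GNU_SOURCE
       #include <sys/mount.h>
       #include <fcntl.h>
       #include <unistd.h>

       /* Clone the hugetlbfs mount, attach an idmapping, mount the clone. */
       static int make_idmapped_hugetlbfs(int userns_fd)
       {
               struct mount_attr attr = {
                       .attr_set = MOUNT_ATTR_IDMAP,
                       .userns_fd = userns_fd, /* userns that has a mapping */
               };
               int fd;

               fd = open_tree(-EBADF, "/dev/hugepages",
                              OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC);
               if (fd < 0)
                       return -1;
               if (mount_setattr(fd, "", AT_EMPTY_PATH, &attr,
                                 sizeof(attr)) < 0 ||
                   move_mount(fd, "", -EBADF, "/mnt/hugepages-idmapped",
                              MOVE_MOUNT_F_EMPTY_PATH) < 0) {
                       close(fd);
                       return -1;
               }
               close(fd);
               return 0;
       }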
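   A sketch of what the new RWF_NOAPPEND flag enables follows; the
   fallback define matches the 6.9 uapi value, and pwrite_at() is just a
   name made up for this example:

       #define _GNU_SOURCE
       #include <sys/uio.h>

       #ifndef RWF_NOAPPEND
       #define RWF_NOAPPEND 0x00000020 /* per-I/O: ignore O_APPEND */
       #endif

       /* Honor 'offset' even though fd was opened with O_APPEND. */
       static ssize_t pwrite_at(int fd, const void *buf, size_t len,
                                off_t offset)
       {
               struct iovec iov = {
                       .iov_base = (void *)buf,
                       .iov_len = len,
               };

               return pwritev2(fd, &iov, 1, offset, RWF_NOAPPEND);
       }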
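   The generic is_dot_dotdot() helper boils down to roughly the
   following (a sketch of the include/linux/fs.h version):

       static inline bool is_dot_dotdot(const char *name, size_t len)
       {
               return len <= 2 && name[0] == '.' &&
                      (len == 1 || name[1] == '.');
       }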

  Cleanups:

   - Make file_dentry() a simple accessor now that ->d_real() is
     simplified because of the backing file work we did over the last
     two cycles.

   - Use the dedicated file_mnt_idmap helper in ntfs3.

   - Use smp_load_acquire()/smp_store_release() in the i_size_read() and
     i_size_write() helpers and thus remove the hack to handle i_size
     reads in the filemap code (a sketch of the simplified helpers
     follows this list).

   - The SLAB_MEM_SPREAD flag is a no-op now. Remove it from various
     places in fs/.

   - It's no longer necessary to perform a second built-in initramfs
     unpack call because we retain the contents of the previous
     extraction. Remove it.

   - Now that we have removed various custom allocators, kfree_rcu()
     always works with kmem caches and kmalloc(). So simplify the
     various places that only used an RCU callback in order to handle
     the kmem cache case.

   - Convert the pipe code to use a lockdep comparison function instead
     of open-coding the nesting, making lockdep validation easier.

   - Move code into fs-writeback.c that was located in a header but can
     be made static as it's only used in that one file.

   - Rewrite the alignment checking iterators for iovec and bvec to be
     easier to read, and also significantly more compact in terms of
     generated code. This saves 270 bytes of text on x86-64 (with
     clang-18) and 224 bytes on arm64 (with gcc-13). In profiles it also
     saves a bit of time for the same workload.

   - Switch various places to use KMEM_CACHE() instead of
     kmem_cache_create() (see the macro sketch after this list).

   - Use inode_set_ctime_to_ts() in inode_set_ctime_current().

   - Use kzalloc() in name_to_handle_at() to avoid a kernel infoleak.

   - Various smaller cleanups for eventfds.
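
   For illustration, the 64-bit i_size helpers mentioned above now boil
   down to roughly this (a simplified sketch; the 32-bit variants still
   need a seqcount to read loff_t atomically):

       static inline loff_t i_size_read(const struct inode *inode)
       {
               /* Pairs with smp_store_release() in i_size_write(). */
               return smp_load_acquire(&inode->i_size);
       }

       static inline void i_size_write(struct inode *inode, loff_t i_size)
       {
               smp_store_release(&inode->i_size, i_size);
       }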
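   The KMEM_CACHE() conversions are mechanical; the macro derives the
   cache name, object size, and alignment from the struct itself (a
   sketch of the include/linux/slab.h definition):

       #define KMEM_CACHE(__struct, __flags)                           \
               kmem_cache_create(#__struct, sizeof(struct __struct),   \
                                 __alignof__(struct __struct),         \
                                 (__flags), NULL)

   So e.g. KMEM_CACHE(buffer_head, SLAB_RECLAIM_ACCOUNT|SLAB_PANIC)
   replaces an open-coded kmem_cache_create("buffer_head", ...) call,
   as one of the hunks below shows.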

  Fixes:

   - Fix various comments and typos, and remove unneeded initializations.

   - Fix stack allocation hack for clang in the select code.

   - Improve dump_mapping() debug code on a best-effort basis.

   - Fix build errors in various selftests.

   - Avoid wrap-around instrumentation in various places.

   - Don't allow user namespaces without an idmapping to be used for
     idmapped mounts.

   - Fix sysv sb_read() call.

   - Fix fallback implementation of the get_name() export operation"

* tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (70 commits)
  hugetlbfs: support idmapped mounts
  qnx4: convert qnx4 to use the new mount api
  fs: use inode_set_ctime_to_ts to set inode ctime to current time
  libfs: Drop generic_set_encrypted_ci_d_ops
  ubifs: Configure dentry operations at dentry-creation time
  f2fs: Configure dentry operations at dentry-creation time
  ext4: Configure dentry operations at dentry-creation time
  libfs: Add helper to choose dentry operations at mount-time
  libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
  fscrypt: Drop d_revalidate once the key is added
  fscrypt: Drop d_revalidate for valid dentries during lookup
  fscrypt: Factor out a helper to configure the lookup dentry
  ovl: Always reject mounting over case-insensitive directories
  libfs: Attempt exact-match comparison first during casefolded lookup
  efs: remove SLAB_MEM_SPREAD flag usage
  jfs: remove SLAB_MEM_SPREAD flag usage
  minix: remove SLAB_MEM_SPREAD flag usage
  openpromfs: remove SLAB_MEM_SPREAD flag usage
  proc: remove SLAB_MEM_SPREAD flag usage
  qnx6: remove SLAB_MEM_SPREAD flag usage
  ...
Linus Torvalds, 2024-03-11 09:38:17 -07:00, commit 7ea65c89d8
65 changed files with 819 additions and 506 deletions

----

@@ -116,7 +116,7 @@ before and after the reference count increment. This pattern can be seen
in get_file_rcu() and __files_get_rcu().
In addition, it isn't possible to access or check fields in struct file
without first aqcuiring a reference on it under rcu lookup. Not doing
without first acquiring a reference on it under rcu lookup. Not doing
that was always very dodgy and it was only usable for non-pointer data
in struct file. With SLAB_TYPESAFE_BY_RCU it is necessary that callers
either first acquire a reference or they must hold the files_lock of the

----

@@ -29,7 +29,7 @@ prototypes::
char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen);
struct vfsmount *(*d_automount)(struct path *path);
int (*d_manage)(const struct path *, bool);
struct dentry *(*d_real)(struct dentry *, const struct inode *);
struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
locking rules:

----

@@ -1264,7 +1264,7 @@ defined:
char *(*d_dname)(struct dentry *, char *, int);
struct vfsmount *(*d_automount)(struct path *);
int (*d_manage)(const struct path *, bool);
struct dentry *(*d_real)(struct dentry *, const struct inode *);
struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
};
``d_revalidate``
@@ -1419,16 +1419,14 @@ defined:
the dentry being transited from.
``d_real``
overlay/union type filesystems implement this method to return
one of the underlying dentries hidden by the overlay. It is
used in two different modes:
overlay/union type filesystems implement this method to return one
of the underlying dentries of a regular file hidden by the overlay.
Called from file_dentry() it returns the real dentry matching
the inode argument. The real dentry may be from a lower layer
already copied up, but still referenced from the file. This
mode is selected with a non-NULL inode argument.
The 'type' argument takes the values D_REAL_DATA or D_REAL_METADATA
for returning the real underlying dentry that refers to the inode
hosting the file's data or metadata respectively.
With NULL inode the topmost real underlying dentry is returned.
For non-regular files, the 'dentry' argument is returned.
Each dentry has a pointer to its parent dentry, as well as a hash list
of child dentries. Child dentries are basically like files in a

----

@@ -352,7 +352,7 @@ int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
EXPORT_SYMBOL(may_setattr);
/**
* notify_change - modify attributes of a filesytem object
* notify_change - modify attributes of a filesystem object
* @idmap: idmap of the mount the inode was found from
* @dentry: object affected
* @attr: new attributes

----

@@ -325,9 +325,7 @@ EXPORT_SYMBOL_GPL(backing_file_mmap);
static int __init backing_aio_init(void)
{
backing_aio_cachep = kmem_cache_create("backing_aio",
sizeof(struct backing_aio),
0, SLAB_HWCACHE_ALIGN, NULL);
backing_aio_cachep = KMEM_CACHE(backing_aio, SLAB_HWCACHE_ALIGN);
if (!backing_aio_cachep)
return -ENOMEM;

----

@@ -464,7 +464,7 @@ EXPORT_SYMBOL(mark_buffer_async_write);
* a successful fsync(). For example, ext2 indirect blocks need to be
* written back and waited upon before fsync() returns.
*
* The functions mark_buffer_inode_dirty(), fsync_inode_buffers(),
* The functions mark_buffer_dirty_inode(), fsync_inode_buffers(),
* inode_has_buffers() and invalidate_inode_buffers() are provided for the
* management of a list of dependent buffers at ->i_mapping->i_private_list.
*
@@ -3121,12 +3121,8 @@ void __init buffer_init(void)
unsigned long nrpages;
int ret;
bh_cachep = kmem_cache_create("buffer_head",
sizeof(struct buffer_head), 0,
(SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
SLAB_MEM_SPREAD),
NULL);
bh_cachep = KMEM_CACHE(buffer_head,
SLAB_RECLAIM_ACCOUNT|SLAB_PANIC);
/*
* Limit the bh occupancy to 10% of ZONE_NORMAL
*/

----

@@ -24,6 +24,8 @@
#include <linux/pid_namespace.h>
#include <linux/uaccess.h>
#include <linux/fs.h>
#include <linux/fs_context.h>
#include <linux/fs_parser.h>
#include <linux/vmalloc.h>
#include <linux/coda.h>
@@ -87,10 +89,10 @@ void coda_destroy_inodecache(void)
kmem_cache_destroy(coda_inode_cachep);
}
static int coda_remount(struct super_block *sb, int *flags, char *data)
static int coda_reconfigure(struct fs_context *fc)
{
sync_filesystem(sb);
*flags |= SB_NOATIME;
sync_filesystem(fc->root->d_sb);
fc->sb_flags |= SB_NOATIME;
return 0;
}
@@ -102,78 +104,102 @@ static const struct super_operations coda_super_operations =
.evict_inode = coda_evict_inode,
.put_super = coda_put_super,
.statfs = coda_statfs,
.remount_fs = coda_remount,
};
static int get_device_index(struct coda_mount_data *data)
struct coda_fs_context {
int idx;
};
enum {
Opt_fd,
};
static const struct fs_parameter_spec coda_param_specs[] = {
fsparam_fd ("fd", Opt_fd),
{}
};
static int coda_parse_fd(struct fs_context *fc, int fd)
{
struct coda_fs_context *ctx = fc->fs_private;
struct fd f;
struct inode *inode;
int idx;
if (data == NULL) {
pr_warn("%s: Bad mount data\n", __func__);
return -1;
}
if (data->version != CODA_MOUNT_VERSION) {
pr_warn("%s: Bad mount version\n", __func__);
return -1;
}
f = fdget(data->fd);
f = fdget(fd);
if (!f.file)
goto Ebadf;
return -EBADF;
inode = file_inode(f.file);
if (!S_ISCHR(inode->i_mode) || imajor(inode) != CODA_PSDEV_MAJOR) {
fdput(f);
goto Ebadf;
return invalf(fc, "code: Not coda psdev");
}
idx = iminor(inode);
fdput(f);
if (idx < 0 || idx >= MAX_CODADEVS) {
pr_warn("%s: Bad minor number\n", __func__);
return -1;
}
return idx;
Ebadf:
pr_warn("%s: Bad file\n", __func__);
return -1;
if (idx < 0 || idx >= MAX_CODADEVS)
return invalf(fc, "coda: Bad minor number");
ctx->idx = idx;
return 0;
}
static int coda_fill_super(struct super_block *sb, void *data, int silent)
static int coda_parse_param(struct fs_context *fc, struct fs_parameter *param)
{
struct fs_parse_result result;
int opt;
opt = fs_parse(fc, coda_param_specs, param, &result);
if (opt < 0)
return opt;
switch (opt) {
case Opt_fd:
return coda_parse_fd(fc, result.uint_32);
}
return 0;
}
/*
* Parse coda's binary mount data form. We ignore any errors and go with index
* 0 if we get one for backward compatibility.
*/
static int coda_parse_monolithic(struct fs_context *fc, void *_data)
{
struct coda_mount_data *data = _data;
if (!data)
return invalf(fc, "coda: Bad mount data");
if (data->version != CODA_MOUNT_VERSION)
return invalf(fc, "coda: Bad mount version");
coda_parse_fd(fc, data->fd);
return 0;
}
static int coda_fill_super(struct super_block *sb, struct fs_context *fc)
{
struct coda_fs_context *ctx = fc->fs_private;
struct inode *root = NULL;
struct venus_comm *vc;
struct CodaFid fid;
int error;
int idx;
if (task_active_pid_ns(current) != &init_pid_ns)
return -EINVAL;
infof(fc, "coda: device index: %i\n", ctx->idx);
idx = get_device_index((struct coda_mount_data *) data);
/* Ignore errors in data, for backward compatibility */
if(idx == -1)
idx = 0;
pr_info("%s: device index: %i\n", __func__, idx);
vc = &coda_comms[idx];
vc = &coda_comms[ctx->idx];
mutex_lock(&vc->vc_mutex);
if (!vc->vc_inuse) {
pr_warn("%s: No pseudo device\n", __func__);
errorf(fc, "coda: No pseudo device");
error = -EINVAL;
goto unlock_out;
}
if (vc->vc_sb) {
pr_warn("%s: Device already mounted\n", __func__);
errorf(fc, "coda: Device already mounted");
error = -EBUSY;
goto unlock_out;
}
@@ -313,18 +339,45 @@ static int coda_statfs(struct dentry *dentry, struct kstatfs *buf)
return 0;
}
/* init_coda: used by filesystems.c to register coda */
static struct dentry *coda_mount(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data)
static int coda_get_tree(struct fs_context *fc)
{
return mount_nodev(fs_type, flags, data, coda_fill_super);
if (task_active_pid_ns(current) != &init_pid_ns)
return -EINVAL;
return get_tree_nodev(fc, coda_fill_super);
}
static void coda_free_fc(struct fs_context *fc)
{
kfree(fc->fs_private);
}
static const struct fs_context_operations coda_context_ops = {
.free = coda_free_fc,
.parse_param = coda_parse_param,
.parse_monolithic = coda_parse_monolithic,
.get_tree = coda_get_tree,
.reconfigure = coda_reconfigure,
};
static int coda_init_fs_context(struct fs_context *fc)
{
struct coda_fs_context *ctx;
ctx = kzalloc(sizeof(struct coda_fs_context), GFP_KERNEL);
if (!ctx)
return -ENOMEM;
fc->fs_private = ctx;
fc->ops = &coda_context_ops;
return 0;
}
struct file_system_type coda_fs_type = {
.owner = THIS_MODULE,
.name = "coda",
.mount = coda_mount,
.init_fs_context = coda_init_fs_context,
.parameters = coda_param_specs,
.kill_sb = kill_anon_super,
.fs_flags = FS_BINARY_MOUNTDATA,
};

----

@@ -74,13 +74,7 @@ struct fscrypt_nokey_name {
static inline bool fscrypt_is_dot_dotdot(const struct qstr *str)
{
if (str->len == 1 && str->name[0] == '.')
return true;
if (str->len == 2 && str->name[0] == '.' && str->name[1] == '.')
return true;
return false;
return is_dot_dotdot(str->name, str->len);
}
/**

----

@@ -102,11 +102,8 @@ int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
if (err && err != -ENOENT)
return err;
if (fname->is_nokey_name) {
spin_lock(&dentry->d_lock);
dentry->d_flags |= DCACHE_NOKEY_NAME;
spin_unlock(&dentry->d_lock);
}
fscrypt_prepare_dentry(dentry, fname->is_nokey_name);
return err;
}
EXPORT_SYMBOL_GPL(__fscrypt_prepare_lookup);
@@ -131,12 +128,10 @@ EXPORT_SYMBOL_GPL(__fscrypt_prepare_lookup);
int fscrypt_prepare_lookup_partial(struct inode *dir, struct dentry *dentry)
{
int err = fscrypt_get_encryption_info(dir, true);
bool is_nokey_name = (!err && !fscrypt_has_encryption_key(dir));
fscrypt_prepare_dentry(dentry, is_nokey_name);
if (!err && !fscrypt_has_encryption_key(dir)) {
spin_lock(&dentry->d_lock);
dentry->d_flags |= DCACHE_NOKEY_NAME;
spin_unlock(&dentry->d_lock);
}
return err;
}
EXPORT_SYMBOL_GPL(fscrypt_prepare_lookup_partial);

----

@@ -3139,7 +3139,7 @@ static void __init dcache_init(void)
* of the dcache.
*/
dentry_cache = KMEM_CACHE_USERCOPY(dentry,
SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_ACCOUNT,
d_iname);
/* Hash may have been set up in dcache_init_early */

----

@@ -1949,16 +1949,6 @@ out:
return rc;
}
static bool is_dot_dotdot(const char *name, size_t name_size)
{
if (name_size == 1 && name[0] == '.')
return true;
else if (name_size == 2 && name[0] == '.' && name[1] == '.')
return true;
return false;
}
/**
* ecryptfs_decode_and_decrypt_filename - converts the encoded cipher text name to decoded plaintext
* @plaintext_name: The plaintext name

----

@@ -14,19 +14,14 @@
#include <linux/buffer_head.h>
#include <linux/vfs.h>
#include <linux/blkdev.h>
#include <linux/fs_context.h>
#include <linux/fs_parser.h>
#include "efs.h"
#include <linux/efs_vh.h>
#include <linux/efs_fs_sb.h>
static int efs_statfs(struct dentry *dentry, struct kstatfs *buf);
static int efs_fill_super(struct super_block *s, void *d, int silent);
static struct dentry *efs_mount(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data)
{
return mount_bdev(fs_type, flags, dev_name, data, efs_fill_super);
}
static int efs_init_fs_context(struct fs_context *fc);
static void efs_kill_sb(struct super_block *s)
{
@@ -35,15 +30,6 @@ static void efs_kill_sb(struct super_block *s)
kfree(sbi);
}
static struct file_system_type efs_fs_type = {
.owner = THIS_MODULE,
.name = "efs",
.mount = efs_mount,
.kill_sb = efs_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
};
MODULE_ALIAS_FS("efs");
static struct pt_types sgi_pt_types[] = {
{0x00, "SGI vh"},
{0x01, "SGI trkrepl"},
@@ -63,6 +49,27 @@ static struct pt_types sgi_pt_types[] = {
{0, NULL}
};
enum {
Opt_explicit_open,
};
static const struct fs_parameter_spec efs_param_spec[] = {
fsparam_flag ("explicit-open", Opt_explicit_open),
{}
};
/*
* File system definition and registration.
*/
static struct file_system_type efs_fs_type = {
.owner = THIS_MODULE,
.name = "efs",
.kill_sb = efs_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
.init_fs_context = efs_init_fs_context,
.parameters = efs_param_spec,
};
MODULE_ALIAS_FS("efs");
static struct kmem_cache * efs_inode_cachep;
@@ -91,8 +98,8 @@ static int __init init_inodecache(void)
{
efs_inode_cachep = kmem_cache_create("efs_inode_cache",
sizeof(struct efs_inode_info), 0,
SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
SLAB_ACCOUNT, init_once);
SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
init_once);
if (efs_inode_cachep == NULL)
return -ENOMEM;
return 0;
@@ -108,18 +115,10 @@ static void destroy_inodecache(void)
kmem_cache_destroy(efs_inode_cachep);
}
static int efs_remount(struct super_block *sb, int *flags, char *data)
{
sync_filesystem(sb);
*flags |= SB_RDONLY;
return 0;
}
static const struct super_operations efs_superblock_operations = {
.alloc_inode = efs_alloc_inode,
.free_inode = efs_free_inode,
.statfs = efs_statfs,
.remount_fs = efs_remount,
};
static const struct export_operations efs_export_ops = {
@@ -249,26 +248,26 @@ static int efs_validate_super(struct efs_sb_info *sb, struct efs_super *super) {
return 0;
}
static int efs_fill_super(struct super_block *s, void *d, int silent)
static int efs_fill_super(struct super_block *s, struct fs_context *fc)
{
struct efs_sb_info *sb;
struct buffer_head *bh;
struct inode *root;
sb = kzalloc(sizeof(struct efs_sb_info), GFP_KERNEL);
sb = kzalloc(sizeof(struct efs_sb_info), GFP_KERNEL);
if (!sb)
return -ENOMEM;
s->s_fs_info = sb;
s->s_time_min = 0;
s->s_time_max = U32_MAX;
s->s_magic = EFS_SUPER_MAGIC;
if (!sb_set_blocksize(s, EFS_BLOCKSIZE)) {
pr_err("device does not support %d byte blocks\n",
EFS_BLOCKSIZE);
return -EINVAL;
}
/* read the vh (volume header) block */
bh = sb_bread(s, 0);
@@ -294,7 +293,7 @@ static int efs_fill_super(struct super_block *s, void *d, int silent)
pr_err("cannot read superblock\n");
return -EIO;
}
if (efs_validate_super(sb, (struct efs_super *) bh->b_data)) {
#ifdef DEBUG
pr_warn("invalid superblock at block %u\n",
@@ -328,6 +327,61 @@ static int efs_fill_super(struct super_block *s, void *d, int silent)
return 0;
}
static void efs_free_fc(struct fs_context *fc)
{
kfree(fc->fs_private);
}
static int efs_get_tree(struct fs_context *fc)
{
return get_tree_bdev(fc, efs_fill_super);
}
static int efs_parse_param(struct fs_context *fc, struct fs_parameter *param)
{
int token;
struct fs_parse_result result;
token = fs_parse(fc, efs_param_spec, param, &result);
if (token < 0)
return token;
return 0;
}
static int efs_reconfigure(struct fs_context *fc)
{
sync_filesystem(fc->root->d_sb);
return 0;
}
struct efs_context {
unsigned long s_mount_opts;
};
static const struct fs_context_operations efs_context_opts = {
.parse_param = efs_parse_param,
.get_tree = efs_get_tree,
.reconfigure = efs_reconfigure,
.free = efs_free_fc,
};
/*
* Set up the filesystem mount context.
*/
static int efs_init_fs_context(struct fs_context *fc)
{
struct efs_context *ctx;
ctx = kzalloc(sizeof(struct efs_context), GFP_KERNEL);
if (!ctx)
return -ENOMEM;
fc->fs_private = ctx;
fc->ops = &efs_context_opts;
return 0;
}
static int efs_statfs(struct dentry *dentry, struct kstatfs *buf) {
struct super_block *sb = dentry->d_sb;
struct efs_sb_info *sbi = SUPER_INFO(sb);

----

@@ -251,7 +251,7 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
ssize_t res;
__u64 ucnt;
if (count < sizeof(ucnt))
if (count != sizeof(ucnt))
return -EINVAL;
if (copy_from_user(&ucnt, buf, sizeof(ucnt)))
return -EFAULT;
@@ -283,13 +283,18 @@ static ssize_t eventfd_write(struct file *file, const char __user *buf, size_t c
static void eventfd_show_fdinfo(struct seq_file *m, struct file *f)
{
struct eventfd_ctx *ctx = f->private_data;
__u64 cnt;
spin_lock_irq(&ctx->wqh.lock);
seq_printf(m, "eventfd-count: %16llx\n",
(unsigned long long)ctx->count);
cnt = ctx->count;
spin_unlock_irq(&ctx->wqh.lock);
seq_printf(m, "eventfd-id: %d\n", ctx->id);
seq_printf(m, "eventfd-semaphore: %d\n",
seq_printf(m,
"eventfd-count: %16llx\n"
"eventfd-id: %d\n"
"eventfd-semaphore: %d\n",
cnt,
ctx->id,
!!(ctx->flags & EFD_SEMAPHORE));
}
#endif
@@ -383,6 +388,7 @@ static int do_eventfd(unsigned int count, int flags)
/* Check the EFD_* constants for consistency. */
BUILD_BUG_ON(EFD_CLOEXEC != O_CLOEXEC);
BUILD_BUG_ON(EFD_NONBLOCK != O_NONBLOCK);
BUILD_BUG_ON(EFD_SEMAPHORE != (1 << 0));
if (flags & ~EFD_FLAGS_SET)
return -EINVAL;

----

@@ -206,7 +206,7 @@ struct eventpoll {
*/
struct epitem *ovflist;
/* wakeup_source used when ep_scan_ready_list is running */
/* wakeup_source used when ep_send_events or __ep_eventpoll_poll is running */
struct wakeup_source *ws;
/* The user that created the eventpoll descriptor */
@@ -678,12 +678,6 @@ static void ep_done_scan(struct eventpoll *ep,
write_unlock_irq(&ep->lock);
}
static void epi_rcu_free(struct rcu_head *head)
{
struct epitem *epi = container_of(head, struct epitem, rcu);
kmem_cache_free(epi_cache, epi);
}
static void ep_get(struct eventpoll *ep)
{
refcount_inc(&ep->refcount);
@@ -767,7 +761,7 @@ static bool __ep_remove(struct eventpoll *ep, struct epitem *epi, bool force)
* ep->mtx. The rcu read side, reverse_path_check_proc(), does not make
* use of the rbn field.
*/
call_rcu(&epi->rcu, epi_rcu_free);
kfree_rcu(epi, rcu);
percpu_counter_dec(&ep->user->epoll_watches);
return ep_refcount_dec_and_test(ep);
@@ -1153,7 +1147,7 @@ static inline bool chain_epi_lockless(struct epitem *epi)
* This callback takes a read lock in order not to contend with concurrent
* events from another file descriptor, thus all modifications to ->rdllist
* or ->ovflist are lockless. Read lock is paired with the write lock from
* ep_scan_ready_list(), which stops all list modifications and guarantees
* ep_start/done_scan(), which stops all list modifications and guarantees
* that lists state is seen correctly.
*
* Another thing worth to mention is that ep_poll_callback() can be called
@@ -1751,7 +1745,7 @@ static int ep_send_events(struct eventpoll *ep,
* availability. At this point, no one can insert
* into ep->rdllist besides us. The epoll_ctl()
* callers are locked out by
* ep_scan_ready_list() holding "mtx" and the
* ep_send_events() holding "mtx" and the
* poll callback will queue them in ep->ovflist.
*/
list_add_tail(&epi->rdllink, &ep->rdllist);
@@ -1904,7 +1898,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
__set_current_state(TASK_INTERRUPTIBLE);
/*
* Do the final check under the lock. ep_scan_ready_list()
* Do the final check under the lock. ep_start/done_scan()
* plays with two lists (->rdllist and ->ovflist) and there
* is always a race when both lists are empty for short
* period of time although events are pending, so lock is

----

@@ -255,7 +255,7 @@ static bool filldir_one(struct dir_context *ctx, const char *name, int len,
container_of(ctx, struct getdents_callback, ctx);
buf->sequence++;
if (buf->ino == ino && len <= NAME_MAX) {
if (buf->ino == ino && len <= NAME_MAX && !is_dot_dotdot(name, len)) {
memcpy(buf->name, name, len);
buf->name[len] = '\0';
buf->found = 1;

----

@@ -1762,7 +1762,6 @@ static struct buffer_head *ext4_lookup_entry(struct inode *dir,
struct buffer_head *bh;
err = ext4_fname_prepare_lookup(dir, dentry, &fname);
generic_set_encrypted_ci_d_ops(dentry);
if (err == -ENOENT)
return NULL;
if (err)

----

@@ -5484,6 +5484,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
goto failed_mount4;
}
generic_set_sb_d_ops(sb);
sb->s_root = d_make_root(root);
if (!sb->s_root) {
ext4_msg(sb, KERN_ERR, "get root dentry failed");

----

@@ -3364,17 +3364,6 @@ static inline bool f2fs_cp_error(struct f2fs_sb_info *sbi)
return is_set_ckpt_flags(sbi, CP_ERROR_FLAG);
}
static inline bool is_dot_dotdot(const u8 *name, size_t len)
{
if (len == 1 && name[0] == '.')
return true;
if (len == 2 && name[0] == '.' && name[1] == '.')
return true;
return false;
}
static inline void *f2fs_kmalloc(struct f2fs_sb_info *sbi,
size_t size, gfp_t flags)
{

----

@@ -531,7 +531,6 @@ static struct dentry *f2fs_lookup(struct inode *dir, struct dentry *dentry,
}
err = f2fs_prepare_lookup(dir, dentry, &fname);
generic_set_encrypted_ci_d_ops(dentry);
if (err == -ENOENT)
goto out_splice;
if (err)

----

@@ -4660,6 +4660,7 @@ try_onemore:
goto free_node_inode;
}
generic_set_sb_d_ops(sb);
sb->s_root = d_make_root(root); /* allocate root dentry */
if (!sb->s_root) {
err = -ENOMEM;

----

@@ -846,12 +846,6 @@ int send_sigurg(struct fown_struct *fown)
static DEFINE_SPINLOCK(fasync_lock);
static struct kmem_cache *fasync_cache __ro_after_init;
static void fasync_free_rcu(struct rcu_head *head)
{
kmem_cache_free(fasync_cache,
container_of(head, struct fasync_struct, fa_rcu));
}
/*
* Remove a fasync entry. If successfully removed, return
* positive and clear the FASYNC flag. If no entry exists,
@@ -877,7 +871,7 @@ int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
write_unlock_irq(&fa->fa_lock);
*fp = fa->fa_next;
call_rcu(&fa->fa_rcu, fasync_free_rcu);
kfree_rcu(fa, fa_rcu);
filp->f_flags &= ~FASYNC;
result = 1;
break;

----

@@ -36,7 +36,7 @@ static long do_sys_name_to_handle(const struct path *path,
if (f_handle.handle_bytes > MAX_HANDLE_SZ)
return -EINVAL;
handle = kmalloc(sizeof(struct file_handle) + f_handle.handle_bytes,
handle = kzalloc(sizeof(struct file_handle) + f_handle.handle_bytes,
GFP_KERNEL);
if (!handle)
return -ENOMEM;

----

@@ -141,6 +141,31 @@ static void wb_wakeup(struct bdi_writeback *wb)
spin_unlock_irq(&wb->work_lock);
}
/*
* This function is used when the first inode for this wb is marked dirty. It
* wakes-up the corresponding bdi thread which should then take care of the
* periodic background write-out of dirty inodes. Since the write-out would
* starts only 'dirty_writeback_interval' centisecs from now anyway, we just
* set up a timer which wakes the bdi thread up later.
*
* Note, we wouldn't bother setting up the timer, but this function is on the
* fast-path (used by '__mark_inode_dirty()'), so we save few context switches
* by delaying the wake-up.
*
* We have to be careful not to postpone flush work if it is scheduled for
* earlier. Thus we use queue_delayed_work().
*/
static void wb_wakeup_delayed(struct bdi_writeback *wb)
{
unsigned long timeout;
timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
spin_lock_irq(&wb->work_lock);
if (test_bit(WB_registered, &wb->state))
queue_delayed_work(bdi_wq, &wb->dwork, timeout);
spin_unlock_irq(&wb->work_lock);
}
static void finish_writeback_work(struct bdi_writeback *wb,
struct wb_writeback_work *work)
{

----

@@ -83,8 +83,8 @@ static const struct fs_parameter_spec *fs_lookup_key(
}
/*
* fs_parse - Parse a filesystem configuration parameter
* @fc: The filesystem context to log errors through.
* __fs_parse - Parse a filesystem configuration parameter
* @log: The filesystem context to log errors through.
* @desc: The parameter description to use.
* @param: The parameter.
* @result: Where to place the result of the parse

----

@@ -30,7 +30,7 @@ struct hfsplus_wd {
* @sector: block to read or write, for blocks of HFSPLUS_SECTOR_SIZE bytes
* @buf: buffer for I/O
* @data: output pointer for location of requested data
* @opf: request op flags
* @opf: I/O operation type and flags
*
* The unit of I/O is hfsplus_min_io_size(sb), which may be bigger than
* HFSPLUS_SECTOR_SIZE, and @buf must be sized accordingly. On reads

----

@@ -933,7 +933,7 @@ static int hugetlbfs_setattr(struct mnt_idmap *idmap,
unsigned int ia_valid = attr->ia_valid;
struct hugetlbfs_inode_info *info = HUGETLBFS_I(inode);
error = setattr_prepare(&nop_mnt_idmap, dentry, attr);
error = setattr_prepare(idmap, dentry, attr);
if (error)
return error;
@@ -950,7 +950,7 @@ static int hugetlbfs_setattr(struct mnt_idmap *idmap,
hugetlb_vmtruncate(inode, newsize);
}
setattr_copy(&nop_mnt_idmap, inode, attr);
setattr_copy(idmap, inode, attr);
mark_inode_dirty(inode);
return 0;
}
@@ -985,6 +985,7 @@ static struct inode *hugetlbfs_get_root(struct super_block *sb,
static struct lock_class_key hugetlbfs_i_mmap_rwsem_key;
static struct inode *hugetlbfs_get_inode(struct super_block *sb,
struct mnt_idmap *idmap,
struct inode *dir,
umode_t mode, dev_t dev)
{
@@ -1006,7 +1007,7 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb,
struct hugetlbfs_inode_info *info = HUGETLBFS_I(inode);
inode->i_ino = get_next_ino();
inode_init_owner(&nop_mnt_idmap, inode, dir, mode);
inode_init_owner(idmap, inode, dir, mode);
lockdep_set_class(&inode->i_mapping->i_mmap_rwsem,
&hugetlbfs_i_mmap_rwsem_key);
inode->i_mapping->a_ops = &hugetlbfs_aops;
@@ -1050,7 +1051,7 @@ static int hugetlbfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
{
struct inode *inode;
inode = hugetlbfs_get_inode(dir->i_sb, dir, mode, dev);
inode = hugetlbfs_get_inode(dir->i_sb, idmap, dir, mode, dev);
if (!inode)
return -ENOSPC;
inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir));
@@ -1062,7 +1063,7 @@ static int hugetlbfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
static int hugetlbfs_mkdir(struct mnt_idmap *idmap, struct inode *dir,
struct dentry *dentry, umode_t mode)
{
int retval = hugetlbfs_mknod(&nop_mnt_idmap, dir, dentry,
int retval = hugetlbfs_mknod(idmap, dir, dentry,
mode | S_IFDIR, 0);
if (!retval)
inc_nlink(dir);
@@ -1073,7 +1074,7 @@ static int hugetlbfs_create(struct mnt_idmap *idmap,
struct inode *dir, struct dentry *dentry,
umode_t mode, bool excl)
{
return hugetlbfs_mknod(&nop_mnt_idmap, dir, dentry, mode | S_IFREG, 0);
return hugetlbfs_mknod(idmap, dir, dentry, mode | S_IFREG, 0);
}
static int hugetlbfs_tmpfile(struct mnt_idmap *idmap,
@@ -1082,7 +1083,7 @@ static int hugetlbfs_tmpfile(struct mnt_idmap *idmap,
{
struct inode *inode;
inode = hugetlbfs_get_inode(dir->i_sb, dir, mode | S_IFREG, 0);
inode = hugetlbfs_get_inode(dir->i_sb, idmap, dir, mode | S_IFREG, 0);
if (!inode)
return -ENOSPC;
inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir));
@@ -1094,10 +1095,11 @@ static int hugetlbfs_symlink(struct mnt_idmap *idmap,
struct inode *dir, struct dentry *dentry,
const char *symname)
{
const umode_t mode = S_IFLNK|S_IRWXUGO;
struct inode *inode;
int error = -ENOSPC;
inode = hugetlbfs_get_inode(dir->i_sb, dir, S_IFLNK|S_IRWXUGO, 0);
inode = hugetlbfs_get_inode(dir->i_sb, idmap, dir, mode, 0);
if (inode) {
int l = strlen(symname)+1;
error = page_symlink(inode, symname, l);
@@ -1566,6 +1568,7 @@ static struct file_system_type hugetlbfs_fs_type = {
.init_fs_context = hugetlbfs_init_fs_context,
.parameters = hugetlb_fs_parameters,
.kill_sb = kill_litter_super,
.fs_flags = FS_ALLOW_IDMAP,
};
static struct vfsmount *hugetlbfs_vfsmount[HUGE_MAX_HSTATE];
@@ -1619,7 +1622,9 @@ struct file *hugetlb_file_setup(const char *name, size_t size,
}
file = ERR_PTR(-ENOSPC);
inode = hugetlbfs_get_inode(mnt->mnt_sb, NULL, S_IFREG | S_IRWXUGO, 0);
/* hugetlbfs_vfsmount[] mounts do not use idmapped mounts. */
inode = hugetlbfs_get_inode(mnt->mnt_sb, &nop_mnt_idmap, NULL,
S_IFREG | S_IRWXUGO, 0);
if (!inode)
goto out;
if (creat_flags == HUGETLB_SHMFS_INODE)

----

@@ -588,7 +588,8 @@ void dump_mapping(const struct address_space *mapping)
}
dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
if (get_kernel_nofault(dentry, dentry_ptr)) {
if (get_kernel_nofault(dentry, dentry_ptr) ||
!dentry.d_parent || !dentry.d_name.name) {
pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
a_ops, ino, dentry_ptr);
return;
@@ -2285,7 +2286,7 @@ void __init inode_init(void)
sizeof(struct inode),
0,
(SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
SLAB_MEM_SPREAD|SLAB_ACCOUNT),
SLAB_ACCOUNT),
init_once);
/* Hash may have been set up in inode_init_early */
@@ -2509,7 +2510,7 @@ struct timespec64 inode_set_ctime_current(struct inode *inode)
{
struct timespec64 now = current_time(inode);
inode_set_ctime(inode, now.tv_sec, now.tv_nsec);
inode_set_ctime_to_ts(inode, now);
return now;
}
EXPORT_SYMBOL(inode_set_ctime_current);

----

@@ -932,7 +932,7 @@ static int __init init_jfs_fs(void)
jfs_inode_cachep =
kmem_cache_create_usercopy("jfs_ip", sizeof(struct jfs_inode_info),
0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
0, SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
offsetof(struct jfs_inode_info, i_inline_all),
sizeof_field(struct jfs_inode_info, i_inline_all),
init_once);

----

@@ -240,17 +240,22 @@ const struct inode_operations simple_dir_inode_operations = {
};
EXPORT_SYMBOL(simple_dir_inode_operations);
static void offset_set(struct dentry *dentry, u32 offset)
/* 0 is '.', 1 is '..', so always start with offset 2 or more */
enum {
DIR_OFFSET_MIN = 2,
};
static void offset_set(struct dentry *dentry, long offset)
{
dentry->d_fsdata = (void *)((uintptr_t)(offset));
dentry->d_fsdata = (void *)offset;
}
static u32 dentry2offset(struct dentry *dentry)
static long dentry2offset(struct dentry *dentry)
{
return (u32)((uintptr_t)(dentry->d_fsdata));
return (long)dentry->d_fsdata;
}
static struct lock_class_key simple_offset_xa_lock;
static struct lock_class_key simple_offset_lock_class;
/**
* simple_offset_init - initialize an offset_ctx
@@ -259,11 +264,9 @@ static struct lock_class_key simple_offset_xa_lock;
*/
void simple_offset_init(struct offset_ctx *octx)
{
xa_init_flags(&octx->xa, XA_FLAGS_ALLOC1);
lockdep_set_class(&octx->xa.xa_lock, &simple_offset_xa_lock);
/* 0 is '.', 1 is '..', so always start with offset 2 */
octx->next_offset = 2;
mt_init_flags(&octx->mt, MT_FLAGS_ALLOC_RANGE);
lockdep_set_class(&octx->mt.ma_lock, &simple_offset_lock_class);
octx->next_offset = DIR_OFFSET_MIN;
}
/**
@@ -271,20 +274,19 @@ void simple_offset_init(struct offset_ctx *octx)
* @octx: directory offset ctx to be updated
* @dentry: new dentry being added
*
* Returns zero on success. @so_ctx and the dentry offset are updated.
* Returns zero on success. @octx and the dentry's offset are updated.
* Otherwise, a negative errno value is returned.
*/
int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
{
static const struct xa_limit limit = XA_LIMIT(2, U32_MAX);
u32 offset;
unsigned long offset;
int ret;
if (dentry2offset(dentry) != 0)
return -EBUSY;
ret = xa_alloc_cyclic(&octx->xa, &offset, dentry, limit,
&octx->next_offset, GFP_KERNEL);
ret = mtree_alloc_cyclic(&octx->mt, &offset, dentry, DIR_OFFSET_MIN,
LONG_MAX, &octx->next_offset, GFP_KERNEL);
if (ret < 0)
return ret;
@@ -300,16 +302,48 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
*/
void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry)
{
u32 offset;
long offset;
offset = dentry2offset(dentry);
if (offset == 0)
return;
xa_erase(&octx->xa, offset);
mtree_erase(&octx->mt, offset);
offset_set(dentry, 0);
}
/**
* simple_offset_empty - Check if a dentry can be unlinked
* @dentry: dentry to be tested
*
* Returns 0 if @dentry is a non-empty directory; otherwise returns 1.
*/
int simple_offset_empty(struct dentry *dentry)
{
struct inode *inode = d_inode(dentry);
struct offset_ctx *octx;
struct dentry *child;
unsigned long index;
int ret = 1;
if (!inode || !S_ISDIR(inode->i_mode))
return ret;
index = DIR_OFFSET_MIN;
octx = inode->i_op->get_offset_ctx(inode);
mt_for_each(&octx->mt, child, index, LONG_MAX) {
spin_lock(&child->d_lock);
if (simple_positive(child)) {
spin_unlock(&child->d_lock);
ret = 0;
break;
}
spin_unlock(&child->d_lock);
}
return ret;
}
/**
* simple_offset_rename_exchange - exchange rename with directory offsets
* @old_dir: parent of dentry being moved
@@ -327,8 +361,8 @@ int simple_offset_rename_exchange(struct inode *old_dir,
{
struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir);
struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir);
u32 old_index = dentry2offset(old_dentry);
u32 new_index = dentry2offset(new_dentry);
long old_index = dentry2offset(old_dentry);
long new_index = dentry2offset(new_dentry);
int ret;
simple_offset_remove(old_ctx, old_dentry);
@@ -354,9 +388,9 @@ int simple_offset_rename_exchange(struct inode *old_dir,
out_restore:
offset_set(old_dentry, old_index);
xa_store(&old_ctx->xa, old_index, old_dentry, GFP_KERNEL);
mtree_store(&old_ctx->mt, old_index, old_dentry, GFP_KERNEL);
offset_set(new_dentry, new_index);
xa_store(&new_ctx->xa, new_index, new_dentry, GFP_KERNEL);
mtree_store(&new_ctx->mt, new_index, new_dentry, GFP_KERNEL);
return ret;
}
@@ -369,7 +403,7 @@ out_restore:
*/
void simple_offset_destroy(struct offset_ctx *octx)
{
xa_destroy(&octx->xa);
mtree_destroy(&octx->mt);
}
/**
@@ -399,15 +433,16 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
/* In this case, ->private_data is protected by f_pos_lock */
file->private_data = NULL;
return vfs_setpos(file, offset, U32_MAX);
return vfs_setpos(file, offset, LONG_MAX);
}
static struct dentry *offset_find_next(struct xa_state *xas)
static struct dentry *offset_find_next(struct offset_ctx *octx, loff_t offset)
{
MA_STATE(mas, &octx->mt, offset, offset);
struct dentry *child, *found = NULL;
rcu_read_lock();
child = xas_next_entry(xas, U32_MAX);
child = mas_find(&mas, LONG_MAX);
if (!child)
goto out;
spin_lock(&child->d_lock);
@@ -421,8 +456,8 @@ out:
static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
{
u32 offset = dentry2offset(dentry);
struct inode *inode = d_inode(dentry);
long offset = dentry2offset(dentry);
return ctx->actor(ctx, dentry->d_name.name, dentry->d_name.len, offset,
inode->i_ino, fs_umode_to_dtype(inode->i_mode));
@@ -430,12 +465,11 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
{
struct offset_ctx *so_ctx = inode->i_op->get_offset_ctx(inode);
XA_STATE(xas, &so_ctx->xa, ctx->pos);
struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
struct dentry *dentry;
while (true) {
dentry = offset_find_next(&xas);
dentry = offset_find_next(octx, ctx->pos);
if (!dentry)
return ERR_PTR(-ENOENT);
@@ -444,8 +478,8 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
break;
}
ctx->pos = dentry2offset(dentry) + 1;
dput(dentry);
ctx->pos = xas.xa_index + 1;
}
return NULL;
}
@@ -481,7 +515,7 @@ static int offset_readdir(struct file *file, struct dir_context *ctx)
return 0;
/* In this case, ->private_data is protected by f_pos_lock */
if (ctx->pos == 2)
if (ctx->pos == DIR_OFFSET_MIN)
file->private_data = NULL;
else if (file->private_data == ERR_PTR(-ENOENT))
return 0;
@@ -1704,16 +1738,28 @@ bool is_empty_dir_inode(struct inode *inode)
static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
const char *str, const struct qstr *name)
{
const struct dentry *parent = READ_ONCE(dentry->d_parent);
const struct inode *dir = READ_ONCE(parent->d_inode);
const struct super_block *sb = dentry->d_sb;
const struct unicode_map *um = sb->s_encoding;
struct qstr qstr = QSTR_INIT(str, len);
const struct dentry *parent;
const struct inode *dir;
char strbuf[DNAME_INLINE_LEN];
int ret;
struct qstr qstr;
/*
* Attempt a case-sensitive match first. It is cheaper and
* should cover most lookups, including all the sane
* applications that expect a case-sensitive filesystem.
*
* This comparison is safe under RCU because the caller
* guarantees the consistency between str and len. See
* __d_lookup_rcu_op_compare() for details.
*/
if (len == name->len && !memcmp(str, name->name, len))
return 0;
parent = READ_ONCE(dentry->d_parent);
dir = READ_ONCE(parent->d_inode);
if (!dir || !IS_CASEFOLDED(dir))
goto fallback;
return 1;
/*
* If the dentry name is stored in-line, then it may be concurrently
* modified by a rename. If this happens, the VFS will eventually retry
@@ -1724,20 +1770,14 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
if (len <= DNAME_INLINE_LEN - 1) {
memcpy(strbuf, str, len);
strbuf[len] = 0;
qstr.name = strbuf;
str = strbuf;
/* prevent compiler from optimizing out the temporary buffer */
barrier();
}
ret = utf8_strncasecmp(um, name, &qstr);
if (ret >= 0)
return ret;
qstr.len = len;
qstr.name = str;
if (sb_has_strict_encoding(sb))
return -EINVAL;
fallback:
if (len != name->len)
return 1;
return !!memcmp(str, name->name, len);
return utf8_strncasecmp(dentry->d_sb->s_encoding, name, &qstr);
}
/**
@@ -1752,7 +1792,7 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
const struct inode *dir = READ_ONCE(dentry->d_inode);
struct super_block *sb = dentry->d_sb;
const struct unicode_map *um = sb->s_encoding;
int ret = 0;
int ret;
if (!dir || !IS_CASEFOLDED(dir))
return 0;
@@ -1766,6 +1806,9 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
static const struct dentry_operations generic_ci_dentry_ops = {
.d_hash = generic_ci_d_hash,
.d_compare = generic_ci_d_compare,
#ifdef CONFIG_FS_ENCRYPTION
.d_revalidate = fscrypt_d_revalidate,
#endif
};
#endif
@@ -1775,64 +1818,33 @@ static const struct dentry_operations generic_encrypted_dentry_ops = {
};
#endif
#if defined(CONFIG_FS_ENCRYPTION) && IS_ENABLED(CONFIG_UNICODE)
static const struct dentry_operations generic_encrypted_ci_dentry_ops = {
.d_hash = generic_ci_d_hash,
.d_compare = generic_ci_d_compare,
.d_revalidate = fscrypt_d_revalidate,
};
#endif
/**
* generic_set_encrypted_ci_d_ops - helper for setting d_ops for given dentry
* @dentry: dentry to set ops on
* generic_set_sb_d_ops - helper for choosing the set of
* filesystem-wide dentry operations for the enabled features
* @sb: superblock to be configured
*
* Casefolded directories need d_hash and d_compare set, so that the dentries
* contained in them are handled case-insensitively. Note that these operations
* are needed on the parent directory rather than on the dentries in it, and
* while the casefolding flag can be toggled on and off on an empty directory,
* dentry_operations can't be changed later. As a result, if the filesystem has
* casefolding support enabled at all, we have to give all dentries the
* casefolding operations even if their inode doesn't have the casefolding flag
* currently (and thus the casefolding ops would be no-ops for now).
*
* Encryption works differently in that the only dentry operation it needs is
* d_revalidate, which it only needs on dentries that have the no-key name flag.
* The no-key flag can't be set "later", so we don't have to worry about that.
*
* Finally, to maximize compatibility with overlayfs (which isn't compatible
* with certain dentry operations) and to avoid taking an unnecessary
* performance hit, we use custom dentry_operations for each possible
* combination rather than always installing all operations.
* Filesystems supporting casefolding and/or fscrypt can call this
* helper at mount-time to configure sb->s_d_op to best set of dentry
* operations required for the enabled features. The helper must be
* called after these have been configured, but before the root dentry
* is created.
*/
void generic_set_encrypted_ci_d_ops(struct dentry *dentry)
void generic_set_sb_d_ops(struct super_block *sb)
{
#ifdef CONFIG_FS_ENCRYPTION
bool needs_encrypt_ops = dentry->d_flags & DCACHE_NOKEY_NAME;
#endif
#if IS_ENABLED(CONFIG_UNICODE)
bool needs_ci_ops = dentry->d_sb->s_encoding;
#endif
#if defined(CONFIG_FS_ENCRYPTION) && IS_ENABLED(CONFIG_UNICODE)
if (needs_encrypt_ops && needs_ci_ops) {
d_set_d_op(dentry, &generic_encrypted_ci_dentry_ops);
if (sb->s_encoding) {
sb->s_d_op = &generic_ci_dentry_ops;
return;
}
#endif
#ifdef CONFIG_FS_ENCRYPTION
if (needs_encrypt_ops) {
d_set_d_op(dentry, &generic_encrypted_dentry_ops);
return;
}
#endif
#if IS_ENABLED(CONFIG_UNICODE)
if (needs_ci_ops) {
d_set_d_op(dentry, &generic_ci_dentry_ops);
if (sb->s_cop) {
sb->s_d_op = &generic_encrypted_dentry_ops;
return;
}
#endif
}
EXPORT_SYMBOL(generic_set_encrypted_ci_d_ops);
EXPORT_SYMBOL(generic_set_sb_d_ops);
/**
* inode_maybe_inc_iversion - increments i_version

----

@@ -426,9 +426,7 @@ EXPORT_SYMBOL(mb_cache_destroy);
static int __init mbcache_init(void)
{
mb_entry_cache = kmem_cache_create("mbcache",
sizeof(struct mb_cache_entry), 0,
SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD, NULL);
mb_entry_cache = KMEM_CACHE(mb_cache_entry, SLAB_RECLAIM_ACCOUNT);
if (!mb_entry_cache)
return -ENOMEM;
return 0;

----

@@ -87,7 +87,7 @@ static int __init init_inodecache(void)
minix_inode_cachep = kmem_cache_create("minix_inode_cache",
sizeof(struct minix_inode_info),
0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD|SLAB_ACCOUNT),
SLAB_ACCOUNT),
init_once);
if (minix_inode_cachep == NULL)
return -ENOMEM;

----

@@ -214,7 +214,7 @@ static int copy_mnt_idmap(struct uid_gid_map *map_from,
* anything at all.
*/
if (nr_extents == 0)
return 0;
return -EINVAL;
/*
* Here we know that nr_extents is greater than zero which means

----

@@ -2680,10 +2680,8 @@ static int lookup_one_common(struct mnt_idmap *idmap,
if (!len)
return -EACCES;
if (unlikely(name[0] == '.')) {
if (len < 2 || (len == 2 && name[1] == '.'))
return -EACCES;
}
if (is_dot_dotdot(name, len))
return -EACCES;
while (len--) {
unsigned int c = *(const unsigned char *)name++;

----

@@ -431,7 +431,7 @@ static int ntfs_atomic_open(struct inode *dir, struct dentry *dentry,
* fnd contains tree's path to insert to.
* If fnd is not NULL then dir is locked.
*/
inode = ntfs_create_inode(mnt_idmap(file->f_path.mnt), dir, dentry, uni,
inode = ntfs_create_inode(file_mnt_idmap(file), dir, dentry, uni,
mode, 0, NULL, 0, fnd);
err = IS_ERR(inode) ? PTR_ERR(inode) :
finish_open(file, dentry, ntfs_file_open);

----

@@ -446,7 +446,7 @@ static int __init init_openprom_fs(void)
sizeof(struct op_inode_info),
0,
(SLAB_RECLAIM_ACCOUNT |
SLAB_MEM_SPREAD | SLAB_ACCOUNT),
SLAB_ACCOUNT),
op_inode_init_once);
if (!op_inode_cachep)
return -ENOMEM;

----

@@ -280,12 +280,20 @@ static int ovl_mount_dir_check(struct fs_context *fc, const struct path *path,
{
struct ovl_fs_context *ctx = fc->fs_private;
if (ovl_dentry_weird(path->dentry))
return invalfc(fc, "filesystem on %s not supported", name);
if (!d_is_dir(path->dentry))
return invalfc(fc, "%s is not a directory", name);
/*
* Root dentries of case-insensitive capable filesystems might
* not have the dentry operations set, but still be incompatible
* with overlayfs. Check explicitly to prevent post-mount
* failures.
*/
if (sb_has_encoding(path->mnt->mnt_sb))
return invalfc(fc, "case-insensitive capable filesystem on %s not supported", name);
if (ovl_dentry_weird(path->dentry))
return invalfc(fc, "filesystem on %s not supported", name);
/*
* Check whether upper path is read-only here to report failures

----

@@ -28,41 +28,38 @@ MODULE_LICENSE("GPL");
struct ovl_dir_cache;
static struct dentry *ovl_d_real(struct dentry *dentry,
const struct inode *inode)
static struct dentry *ovl_d_real(struct dentry *dentry, enum d_real_type type)
{
struct dentry *real = NULL, *lower;
struct dentry *upper, *lower;
int err;
/*
* vfs is only expected to call d_real() with NULL from d_real_inode()
* and with overlay inode from file_dentry() on an overlay file.
*
* TODO: remove @inode argument from d_real() API, remove code in this
* function that deals with non-NULL @inode and remove d_real() call
* from file_dentry().
*/
if (inode && d_inode(dentry) == inode)
return dentry;
else if (inode)
switch (type) {
case D_REAL_DATA:
case D_REAL_METADATA:
break;
default:
goto bug;
}
if (!d_is_reg(dentry)) {
/* d_real_inode() is only relevant for regular files */
return dentry;
}
real = ovl_dentry_upper(dentry);
if (real && (inode == d_inode(real)))
return real;
upper = ovl_dentry_upper(dentry);
if (upper && (type == D_REAL_METADATA ||
ovl_has_upperdata(d_inode(dentry))))
return upper;
if (real && !inode && ovl_has_upperdata(d_inode(dentry)))
return real;
if (type == D_REAL_METADATA) {
lower = ovl_dentry_lower(dentry);
goto real_lower;
}
/*
* Best effort lazy lookup of lowerdata for !inode case to return
* Best effort lazy lookup of lowerdata for D_REAL_DATA case to return
* the real lowerdata dentry. The only current caller of d_real() with
* NULL inode is d_real_inode() from trace_uprobe and this caller is
* D_REAL_DATA is d_real_inode() from trace_uprobe and this caller is
* likely going to be followed reading from the file, before placing
* uprobes on offset within the file, so lowerdata should be available
* when setting the uprobe.
@@ -73,18 +70,13 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
lower = ovl_dentry_lowerdata(dentry);
if (!lower)
goto bug;
real = lower;
/* Handle recursion */
real = d_real(real, inode);
real_lower:
/* Handle recursion into stacked lower fs */
return d_real(lower, type);
if (!inode || inode == d_inode(real))
return real;
bug:
WARN(1, "%s(%pd4, %s:%lu): real dentry (%p/%lu) not found\n",
__func__, dentry, inode ? inode->i_sb->s_id : "NULL",
inode ? inode->i_ino : 0, real,
real && d_inode(real) ? d_inode(real)->i_ino : 0);
WARN(1, "%s(%pd4, %d): real dentry not found\n", __func__, dentry, type);
return dentry;
}

----

@@ -76,18 +76,20 @@ static unsigned long pipe_user_pages_soft = PIPE_DEF_BUFFERS * INR_OPEN_CUR;
* -- Manfred Spraul <manfred@colorfullife.com> 2002-05-09
*/
static void pipe_lock_nested(struct pipe_inode_info *pipe, int subclass)
#define cmp_int(l, r) ((l > r) - (l < r))
#ifdef CONFIG_PROVE_LOCKING
static int pipe_lock_cmp_fn(const struct lockdep_map *a,
const struct lockdep_map *b)
{
if (pipe->files)
mutex_lock_nested(&pipe->mutex, subclass);
return cmp_int((unsigned long) a, (unsigned long) b);
}
#endif
void pipe_lock(struct pipe_inode_info *pipe)
{
/*
* pipe_lock() nests non-pipe inode locks (for writing to a file)
*/
pipe_lock_nested(pipe, I_MUTEX_PARENT);
if (pipe->files)
mutex_lock(&pipe->mutex);
}
EXPORT_SYMBOL(pipe_lock);
@@ -98,28 +100,16 @@ void pipe_unlock(struct pipe_inode_info *pipe)
}
EXPORT_SYMBOL(pipe_unlock);
static inline void __pipe_lock(struct pipe_inode_info *pipe)
{
mutex_lock_nested(&pipe->mutex, I_MUTEX_PARENT);
}
static inline void __pipe_unlock(struct pipe_inode_info *pipe)
{
mutex_unlock(&pipe->mutex);
}
void pipe_double_lock(struct pipe_inode_info *pipe1,
struct pipe_inode_info *pipe2)
{
BUG_ON(pipe1 == pipe2);
if (pipe1 < pipe2) {
pipe_lock_nested(pipe1, I_MUTEX_PARENT);
pipe_lock_nested(pipe2, I_MUTEX_CHILD);
} else {
pipe_lock_nested(pipe2, I_MUTEX_PARENT);
pipe_lock_nested(pipe1, I_MUTEX_CHILD);
}
if (pipe1 > pipe2)
swap(pipe1, pipe2);
pipe_lock(pipe1);
pipe_lock(pipe2);
}
static void anon_pipe_buf_release(struct pipe_inode_info *pipe,
@@ -271,7 +261,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
return 0;
ret = 0;
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
/*
* We only wake up writers if the pipe was full when we started
@@ -368,7 +358,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
ret = -EAGAIN;
break;
}
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
/*
* We only get here if we didn't actually read anything.
@@ -400,13 +390,13 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
if (wait_event_interruptible_exclusive(pipe->rd_wait, pipe_readable(pipe)) < 0)
return -ERESTARTSYS;
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
was_full = pipe_full(pipe->head, pipe->tail, pipe->max_usage);
wake_next_reader = true;
}
if (pipe_empty(pipe->head, pipe->tail))
wake_next_reader = false;
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
if (was_full)
wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
@@ -462,7 +452,7 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
if (unlikely(total_len == 0))
return 0;
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
if (!pipe->readers) {
send_sig(SIGPIPE, current, 0);
@@ -582,19 +572,19 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
* after waiting we need to re-check whether the pipe
* become empty while we dropped the lock.
*/
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
if (was_empty)
wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
wait_event_interruptible_exclusive(pipe->wr_wait, pipe_writable(pipe));
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
was_empty = pipe_empty(pipe->head, pipe->tail);
wake_next_writer = true;
}
out:
if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
wake_next_writer = false;
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
/*
* If we do do a wakeup event, we do a 'sync' wakeup, because we
@@ -629,7 +619,7 @@ static long pipe_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
switch (cmd) {
case FIONREAD:
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
count = 0;
head = pipe->head;
tail = pipe->tail;
@@ -639,16 +629,16 @@ static long pipe_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
count += pipe->bufs[tail & mask].len;
tail++;
}
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
return put_user(count, (int __user *)arg);
#ifdef CONFIG_WATCH_QUEUE
case IOC_WATCH_QUEUE_SET_SIZE: {
int ret;
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
ret = watch_queue_set_size(pipe, arg);
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
return ret;
}
@@ -734,7 +724,7 @@ pipe_release(struct inode *inode, struct file *file)
{
struct pipe_inode_info *pipe = file->private_data;
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
if (file->f_mode & FMODE_READ)
pipe->readers--;
if (file->f_mode & FMODE_WRITE)
@@ -747,7 +737,7 @@ pipe_release(struct inode *inode, struct file *file)
kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
}
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
put_pipe_info(inode, pipe);
return 0;
@@ -759,7 +749,7 @@ pipe_fasync(int fd, struct file *filp, int on)
struct pipe_inode_info *pipe = filp->private_data;
int retval = 0;
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
if (filp->f_mode & FMODE_READ)
retval = fasync_helper(fd, filp, on, &pipe->fasync_readers);
if ((filp->f_mode & FMODE_WRITE) && retval >= 0) {
@ -768,7 +758,7 @@ pipe_fasync(int fd, struct file *filp, int on)
/* this can happen only if on == T */
fasync_helper(-1, filp, 0, &pipe->fasync_readers);
}
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
return retval;
}
@@ -834,6 +824,7 @@ struct pipe_inode_info *alloc_pipe_info(void)
pipe->nr_accounted = pipe_bufs;
pipe->user = user;
mutex_init(&pipe->mutex);
lock_set_cmp_fn(&pipe->mutex, pipe_lock_cmp_fn, NULL);
return pipe;
}
@@ -1144,7 +1135,7 @@ static int fifo_open(struct inode *inode, struct file *filp)
filp->private_data = pipe;
/* OK, we have a pipe and it's pinned down */
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
/* We can only do regular read/write on fifos */
stream_open(inode, filp);
@@ -1214,7 +1205,7 @@ static int fifo_open(struct inode *inode, struct file *filp)
}
/* Ok! */
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
return 0;
err_rd:
@@ -1230,7 +1221,7 @@ err_wr:
goto err;
err:
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
put_pipe_info(inode, pipe);
return ret;
@@ -1411,7 +1402,7 @@ long pipe_fcntl(struct file *file, unsigned int cmd, unsigned int arg)
if (!pipe)
return -EBADF;
__pipe_lock(pipe);
mutex_lock(&pipe->mutex);
switch (cmd) {
case F_SETPIPE_SZ:
@@ -1425,7 +1416,7 @@ long pipe_fcntl(struct file *file, unsigned int cmd, unsigned int arg)
break;
}
__pipe_unlock(pipe);
mutex_unlock(&pipe->mutex);
return ret;
}

----

@@ -92,7 +92,7 @@ void __init proc_init_kmemcache(void)
proc_inode_cachep = kmem_cache_create("proc_inode_cache",
sizeof(struct proc_inode),
0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD|SLAB_ACCOUNT|
SLAB_ACCOUNT|
SLAB_PANIC),
init_once);
pde_opener_cache =

----

@@ -21,6 +21,7 @@
#include <linux/buffer_head.h>
#include <linux/writeback.h>
#include <linux/statfs.h>
#include <linux/fs_context.h>
#include "qnx4.h"
#define QNX4_VERSION 4
@ -30,28 +31,33 @@ static const struct super_operations qnx4_sops;
static struct inode *qnx4_alloc_inode(struct super_block *sb);
static void qnx4_free_inode(struct inode *inode);
static int qnx4_remount(struct super_block *sb, int *flags, char *data);
static int qnx4_statfs(struct dentry *, struct kstatfs *);
static int qnx4_get_tree(struct fs_context *fc);
static const struct super_operations qnx4_sops =
{
.alloc_inode = qnx4_alloc_inode,
.free_inode = qnx4_free_inode,
.statfs = qnx4_statfs,
.remount_fs = qnx4_remount,
};
static int qnx4_remount(struct super_block *sb, int *flags, char *data)
static int qnx4_reconfigure(struct fs_context *fc)
{
struct super_block *sb = fc->root->d_sb;
struct qnx4_sb_info *qs;
sync_filesystem(sb);
qs = qnx4_sb(sb);
qs->Version = QNX4_VERSION;
*flags |= SB_RDONLY;
fc->sb_flags |= SB_RDONLY;
return 0;
}
static const struct fs_context_operations qnx4_context_opts = {
.get_tree = qnx4_get_tree,
.reconfigure = qnx4_reconfigure,
};
static int qnx4_get_block( struct inode *inode, sector_t iblock, struct buffer_head *bh, int create )
{
unsigned long phys;
@ -183,12 +189,13 @@ static const char *qnx4_checkroot(struct super_block *sb,
return "bitmap file not found.";
}
static int qnx4_fill_super(struct super_block *s, void *data, int silent)
static int qnx4_fill_super(struct super_block *s, struct fs_context *fc)
{
struct buffer_head *bh;
struct inode *root;
const char *errmsg;
struct qnx4_sb_info *qs;
int silent = fc->sb_flags & SB_SILENT;
qs = kzalloc(sizeof(struct qnx4_sb_info), GFP_KERNEL);
if (!qs)
@ -216,7 +223,7 @@ static int qnx4_fill_super(struct super_block *s, void *data, int silent)
errmsg = qnx4_checkroot(s, (struct qnx4_super_block *) bh->b_data);
brelse(bh);
if (errmsg != NULL) {
if (!silent)
printk(KERN_ERR "qnx4: %s\n", errmsg);
return -EINVAL;
}
@ -235,6 +242,18 @@ static int qnx4_fill_super(struct super_block *s, void *data, int silent)
return 0;
}
static int qnx4_get_tree(struct fs_context *fc)
{
return get_tree_bdev(fc, qnx4_fill_super);
}
static int qnx4_init_fs_context(struct fs_context *fc)
{
fc->ops = &qnx4_context_opts;
return 0;
}
static void qnx4_kill_sb(struct super_block *sb)
{
struct qnx4_sb_info *qs = qnx4_sb(sb);
@ -376,18 +395,12 @@ static void destroy_inodecache(void)
kmem_cache_destroy(qnx4_inode_cachep);
}
static struct dentry *qnx4_mount(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data)
{
return mount_bdev(fs_type, flags, dev_name, data, qnx4_fill_super);
}
static struct file_system_type qnx4_fs_type = {
.owner = THIS_MODULE,
.name = "qnx4",
.mount = qnx4_mount,
.kill_sb = qnx4_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
.owner = THIS_MODULE,
.name = "qnx4",
.kill_sb = qnx4_kill_sb,
.fs_flags = FS_REQUIRES_DEV,
.init_fs_context = qnx4_init_fs_context,
};
MODULE_ALIAS_FS("qnx4");


@ -615,7 +615,7 @@ static int init_inodecache(void)
qnx6_inode_cachep = kmem_cache_create("qnx6_inode_cache",
sizeof(struct qnx6_inode_info),
0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD|SLAB_ACCOUNT),
SLAB_ACCOUNT),
init_once);
if (!qnx6_inode_cachep)
return -ENOMEM;


@ -670,7 +670,6 @@ static int __init init_inodecache(void)
sizeof(struct
reiserfs_inode_info),
0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD|
SLAB_ACCOUNT),
init_once);
if (reiserfs_inode_cachep == NULL)


@ -630,8 +630,8 @@ static int __init init_romfs_fs(void)
romfs_inode_cachep =
kmem_cache_create("romfs_i",
sizeof(struct romfs_inode_info), 0,
SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
SLAB_ACCOUNT, romfs_i_init_once);
SLAB_RECLAIM_ACCOUNT | SLAB_ACCOUNT,
romfs_i_init_once);
if (!romfs_inode_cachep) {
pr_err("Failed to initialise inode cache\n");


@ -476,7 +476,7 @@ static inline void wait_key_set(poll_table *wait, unsigned long in,
wait->_key |= POLLOUT_SET;
}
static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
static noinline_for_stack int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
{
ktime_t expire, *to = NULL;
struct poll_wqueues table;
@ -839,7 +839,7 @@ SYSCALL_DEFINE1(old_select, struct sel_arg_struct __user *, arg)
struct poll_list {
struct poll_list *next;
int len;
unsigned int len;
struct pollfd entries[];
};
@ -975,14 +975,15 @@ static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
struct timespec64 *end_time)
{
struct poll_wqueues table;
int err = -EFAULT, fdcount, len;
int err = -EFAULT, fdcount;
/* Allocate small arguments on the stack to save memory and be
faster - use long to make sure the buffer is aligned properly
on 64 bit archs to avoid unaligned access */
long stack_pps[POLL_STACK_ALLOC/sizeof(long)];
struct poll_list *const head = (struct poll_list *)stack_pps;
struct poll_list *walk = head;
unsigned long todo = nfds;
unsigned int todo = nfds;
unsigned int len;
if (nfds > rlimit(RLIMIT_NOFILE))
return -EINVAL;
@ -998,9 +999,9 @@ static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
sizeof(struct pollfd) * walk->len))
goto out_fds;
todo -= walk->len;
if (!todo)
if (walk->len >= todo)
break;
todo -= walk->len;
len = min(todo, POLLFD_PER_PAGE);
walk = walk->next = kmalloc(struct_size(walk, entries, len),
@ -1020,7 +1021,7 @@ static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
for (walk = head; walk; walk = walk->next) {
struct pollfd *fds = walk->entries;
int j;
unsigned int j;
for (j = walk->len; j; fds++, ufds++, j--)
unsafe_put_user(fds->revents, &ufds->revents, Efault);


@ -336,7 +336,7 @@ int __init sysv_init_icache(void)
{
sysv_inode_cachep = kmem_cache_create("sysv_inode_cache",
sizeof(struct sysv_inode_info), 0,
SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
init_once);
if (!sysv_inode_cachep)
return -ENOMEM;


@ -83,9 +83,6 @@ static inline sysv_zone_t *block_end(struct buffer_head *bh)
return (sysv_zone_t*)((char*)bh->b_data + bh->b_size);
}
/*
* Requires read_lock(&pointers_lock) or write_lock(&pointers_lock)
*/
static Indirect *get_branch(struct inode *inode,
int depth,
int offsets[],
@ -105,15 +102,18 @@ static Indirect *get_branch(struct inode *inode,
bh = sb_bread(sb, block);
if (!bh)
goto failure;
read_lock(&pointers_lock);
if (!verify_chain(chain, p))
goto changed;
add_chain(++p, bh, (sysv_zone_t*)bh->b_data + *++offsets);
read_unlock(&pointers_lock);
if (!p->key)
goto no_block;
}
return NULL;
changed:
read_unlock(&pointers_lock);
brelse(bh);
*err = -EAGAIN;
goto no_block;
@ -219,9 +219,7 @@ static int get_block(struct inode *inode, sector_t iblock, struct buffer_head *b
goto out;
reread:
read_lock(&pointers_lock);
partial = get_branch(inode, depth, offsets, chain, &err);
read_unlock(&pointers_lock);
/* Simplest case - block found, no allocation needed */
if (!partial) {
@ -291,9 +289,9 @@ static Indirect *find_shared(struct inode *inode,
*top = 0;
for (k = depth; k > 1 && !offsets[k-1]; k--)
;
partial = get_branch(inode, k, offsets, chain, &err);
write_lock(&pointers_lock);
partial = get_branch(inode, k, offsets, chain, &err);
if (!partial)
partial = chain + k-1;
/*


@ -205,7 +205,6 @@ static struct dentry *ubifs_lookup(struct inode *dir, struct dentry *dentry,
dbg_gen("'%pd' in dir ino %lu", dentry, dir->i_ino);
err = fscrypt_prepare_lookup(dir, dentry, &nm);
generic_set_encrypted_ci_d_ops(dentry);
if (err == -ENOENT)
return d_splice_alias(NULL, dentry);
if (err)


@ -2239,6 +2239,7 @@ static int ubifs_fill_super(struct super_block *sb, void *data, int silent)
goto out_umount;
}
generic_set_sb_d_ops(sb);
sb->s_root = d_make_root(root);
if (!sb->s_root) {
err = -ENOMEM;


@ -193,7 +193,6 @@ do { \
#ifndef smp_store_release
#define smp_store_release(p, v) \
do { \
compiletime_assert_atomic_type(*p); \
barrier(); \
WRITE_ONCE(*p, v); \
} while (0)
@ -203,7 +202,6 @@ do { \
#define smp_load_acquire(p) \
({ \
__unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p); \
compiletime_assert_atomic_type(*p); \
barrier(); \
(typeof(*p))___p1; \
})


@ -38,7 +38,6 @@ struct backing_dev_info *bdi_alloc(int node_id);
void wb_start_background_writeback(struct bdi_writeback *wb);
void wb_workfn(struct work_struct *work);
void wb_wakeup_delayed(struct bdi_writeback *wb);
void wb_wait_for_completion(struct wb_completion *done);


@ -125,6 +125,11 @@ enum dentry_d_lock_class
DENTRY_D_LOCK_NESTED
};
enum d_real_type {
D_REAL_DATA,
D_REAL_METADATA,
};
struct dentry_operations {
int (*d_revalidate)(struct dentry *, unsigned int);
int (*d_weak_revalidate)(struct dentry *, unsigned int);
@ -139,7 +144,7 @@ struct dentry_operations {
char *(*d_dname)(struct dentry *, char *, int);
struct vfsmount *(*d_automount)(struct path *);
int (*d_manage)(const struct path *, bool);
struct dentry *(*d_real)(struct dentry *, const struct inode *);
struct dentry *(*d_real)(struct dentry *, enum d_real_type type);
} ____cacheline_aligned;
/*
@ -547,24 +552,23 @@ static inline struct inode *d_backing_inode(const struct dentry *upper)
/**
* d_real - Return the real dentry
* @dentry: the dentry to query
* @inode: inode to select the dentry from multiple layers (can be NULL)
* @type: the type of real dentry (data or metadata)
*
* If dentry is on a union/overlay, then return the underlying, real dentry.
* Otherwise return the dentry itself.
*
* See also: Documentation/filesystems/vfs.rst
*/
static inline struct dentry *d_real(struct dentry *dentry,
const struct inode *inode)
static inline struct dentry *d_real(struct dentry *dentry, enum d_real_type type)
{
if (unlikely(dentry->d_flags & DCACHE_OP_REAL))
return dentry->d_op->d_real(dentry, inode);
return dentry->d_op->d_real(dentry, type);
else
return dentry;
}
/**
* d_real_inode - Return the real inode
* d_real_inode - Return the real inode hosting the data
* @dentry: The dentry to query
*
* If dentry is on a union/overlay, then return the underlying, real inode.
@ -573,7 +577,7 @@ static inline struct dentry *d_real(struct dentry *dentry,
static inline struct inode *d_real_inode(const struct dentry *dentry)
{
/* This usage of d_real() results in const dentry */
return d_backing_inode(d_real((struct dentry *) dentry, NULL));
return d_inode(d_real((struct dentry *) dentry, D_REAL_DATA));
}
struct name_snapshot {


@ -43,6 +43,7 @@
#include <linux/cred.h>
#include <linux/mnt_idmapping.h>
#include <linux/slab.h>
#include <linux/maple_tree.h>
#include <asm/byteorder.h>
#include <uapi/linux/fs.h>
@ -484,10 +485,10 @@ struct address_space {
pgoff_t writeback_index;
const struct address_space_operations *a_ops;
unsigned long flags;
struct rw_semaphore i_mmap_rwsem;
errseq_t wb_err;
spinlock_t i_private_lock;
struct list_head i_private_list;
struct rw_semaphore i_mmap_rwsem;
void * i_private_data;
} __attribute__((aligned(sizeof(long)))) __randomize_layout;
/*
@ -909,7 +910,8 @@ static inline loff_t i_size_read(const struct inode *inode)
preempt_enable();
return i_size;
#else
return inode->i_size;
/* Pairs with smp_store_release() in i_size_write() */
return smp_load_acquire(&inode->i_size);
#endif
}
@ -931,7 +933,12 @@ static inline void i_size_write(struct inode *inode, loff_t i_size)
inode->i_size = i_size;
preempt_enable();
#else
inode->i_size = i_size;
/*
* Pairs with smp_load_acquire() in i_size_read() to ensure
* changes related to inode size (such as page contents) are
* visible before we see the changed inode size.
*/
smp_store_release(&inode->i_size, i_size);
#endif
}
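
The release/acquire pairing above is what allows the smp_rmb() in filemap_read() to go away (removed further down). A hedged sketch of the guarantee, using hypothetical helpers: once a reader observes the enlarged size, it also observes the data written before the size update.

/* Illustrative writer: publish contents, then publish the size. */
static void publish(struct inode *inode, struct page *page,
		    const void *src, loff_t pos, size_t len)
{
	memcpy(page_address(page) + offset_in_page(pos), src, len);
	i_size_write(inode, pos + len);		/* smp_store_release() */
}

/* Illustrative reader: the acquire load orders the copy after the check. */
static bool consume(struct inode *inode, struct page *page,
		    void *dst, loff_t pos, size_t len)
{
	if (pos + len > i_size_read(inode))	/* smp_load_acquire() */
		return false;
	memcpy(dst, page_address(page) + offset_in_page(pos), len);
	return true;
}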
@ -1080,9 +1087,20 @@ static inline struct inode *file_inode(const struct file *f)
return f->f_inode;
}
/*
* file_dentry() is a relic from the days when overlayfs was using files with a
* "fake" path, meaning, f_path on overlayfs and f_inode on the underlying fs.
* In those days, file_dentry() was needed to get the underlying fs dentry that
* matches f_inode.
* Files with a "fake" path should not exist nowadays, so use an assertion to make
* sure that file_dentry() was not papering over filesystem bugs.
*/
static inline struct dentry *file_dentry(const struct file *file)
{
return d_real(file->f_path.dentry, file_inode(file));
struct dentry *dentry = file->f_path.dentry;
WARN_ON_ONCE(d_inode(dentry) != file_inode(file));
return dentry;
}
struct fasync_struct {
@ -2927,6 +2945,17 @@ extern bool path_is_under(const struct path *, const struct path *);
extern char *file_path(struct file *, char *, int);
/**
* is_dot_dotdot - returns true only if @name is "." or ".."
* @name: file name to check
* @len: length of file name, in bytes
*/
static inline bool is_dot_dotdot(const char *name, size_t len)
{
return len && unlikely(name[0] == '.') &&
(len == 1 || (len == 2 && name[1] == '.'));
}
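
A short example of the intended use, replacing the open-coded comparisons the changelog mentions (hypothetical caller, not part of the patch):

/* Refuse "." and ".." as a name for a new directory entry. */
static int validate_new_name(const char *name, size_t len)
{
	if (is_dot_dotdot(name, len))
		return -EINVAL;
	return 0;
}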
#include <linux/err.h>
/* needed for stackable file system support */
@ -3259,13 +3288,14 @@ extern ssize_t simple_write_to_buffer(void *to, size_t available, loff_t *ppos,
const void __user *from, size_t count);
struct offset_ctx {
struct xarray xa;
u32 next_offset;
struct maple_tree mt;
unsigned long next_offset;
};
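
With the offset map now a maple tree, stable directory offsets can be handed out with the cyclic allocator added later in this series. A sketch of how simple_offset_add() can be built on mtree_alloc_cyclic() (DIR_OFFSET_MIN and the d_fsdata bookkeeping are assumptions for illustration):

#define DIR_OFFSET_MIN	2	/* assumed: 0 and 1 reserved for "." and ".." */

static int offset_add_sketch(struct offset_ctx *octx, struct dentry *dentry)
{
	unsigned long offset;
	int ret;

	ret = mtree_alloc_cyclic(&octx->mt, &offset, dentry, DIR_OFFSET_MIN,
				 LONG_MAX, &octx->next_offset, GFP_KERNEL);
	if (ret < 0)
		return ret;

	dentry->d_fsdata = (void *)offset;	/* remember the stable offset */
	return 0;
}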
void simple_offset_init(struct offset_ctx *octx);
int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry);
void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry);
int simple_offset_empty(struct dentry *dentry);
int simple_offset_rename_exchange(struct inode *old_dir,
struct dentry *old_dentry,
struct inode *new_dir,
@ -3279,7 +3309,16 @@ extern int generic_file_fsync(struct file *, loff_t, loff_t, int);
extern int generic_check_addressable(unsigned, u64);
extern void generic_set_encrypted_ci_d_ops(struct dentry *dentry);
extern void generic_set_sb_d_ops(struct super_block *sb);
static inline bool sb_has_encoding(const struct super_block *sb)
{
#if IS_ENABLED(CONFIG_UNICODE)
return !!sb->s_encoding;
#else
return false;
#endif
}
int may_setattr(struct mnt_idmap *idmap, struct inode *inode,
unsigned int ia_valid);
@ -3334,6 +3373,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
return 0;
if (unlikely(flags & ~RWF_SUPPORTED))
return -EOPNOTSUPP;
if (unlikely((flags & RWF_APPEND) && (flags & RWF_NOAPPEND)))
return -EINVAL;
if (flags & RWF_NOWAIT) {
if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
@ -3344,6 +3385,12 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
if (flags & RWF_SYNC)
kiocb_flags |= IOCB_DSYNC;
if ((flags & RWF_NOAPPEND) && (ki->ki_flags & IOCB_APPEND)) {
if (IS_APPEND(file_inode(ki->ki_filp)))
return -EPERM;
ki->ki_flags &= ~IOCB_APPEND;
}
ki->ki_flags |= kiocb_flags;
return 0;
}


@ -192,6 +192,8 @@ struct fscrypt_operations {
unsigned int *num_devs);
};
int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags);
static inline struct fscrypt_inode_info *
fscrypt_get_inode_info(const struct inode *inode)
{
@ -221,15 +223,29 @@ static inline bool fscrypt_needs_contents_encryption(const struct inode *inode)
}
/*
* When d_splice_alias() moves a directory's no-key alias to its plaintext alias
* as a result of the encryption key being added, DCACHE_NOKEY_NAME must be
* cleared. Note that we don't have to support arbitrary moves of this flag
* because fscrypt doesn't allow no-key names to be the source or target of a
* rename().
* When d_splice_alias() moves a directory's no-key alias to its
* plaintext alias as a result of the encryption key being added,
* DCACHE_NOKEY_NAME must be cleared and there might be an opportunity
* to disable d_revalidate. Note that we don't have to support the
* inverse operation because fscrypt doesn't allow no-key names to be
* the source or target of a rename().
*/
static inline void fscrypt_handle_d_move(struct dentry *dentry)
{
dentry->d_flags &= ~DCACHE_NOKEY_NAME;
/*
* The VFS calls fscrypt_handle_d_move() even for non-fscrypt
* filesystems.
*/
if (dentry->d_flags & DCACHE_NOKEY_NAME) {
dentry->d_flags &= ~DCACHE_NOKEY_NAME;
/*
* Other filesystem features might be handling dentry
* revalidation, in which case it cannot be disabled.
*/
if (dentry->d_op->d_revalidate == fscrypt_d_revalidate)
dentry->d_flags &= ~DCACHE_OP_REVALIDATE;
}
}
/**
@ -261,6 +277,35 @@ static inline bool fscrypt_is_nokey_name(const struct dentry *dentry)
return dentry->d_flags & DCACHE_NOKEY_NAME;
}
static inline void fscrypt_prepare_dentry(struct dentry *dentry,
bool is_nokey_name)
{
/*
* This code tries to only take ->d_lock when necessary to write
* to ->d_flags. We shouldn't be peeking at d_flags for
* DCACHE_OP_REVALIDATE unlocked, but in the unlikely case
* there is a race, the worst that can happen is that we fail to
* unset DCACHE_OP_REVALIDATE and pay the cost of an extra
* d_revalidate.
*/
if (is_nokey_name) {
spin_lock(&dentry->d_lock);
dentry->d_flags |= DCACHE_NOKEY_NAME;
spin_unlock(&dentry->d_lock);
} else if (dentry->d_flags & DCACHE_OP_REVALIDATE &&
dentry->d_op->d_revalidate == fscrypt_d_revalidate) {
/*
* Unencrypted dentries and encrypted dentries where the
* key is available are always valid from fscrypt's
* perspective. Avoid the cost of calling
* fscrypt_d_revalidate unnecessarily.
*/
spin_lock(&dentry->d_lock);
dentry->d_flags &= ~DCACHE_OP_REVALIDATE;
spin_unlock(&dentry->d_lock);
}
}
/* crypto.c */
void fscrypt_enqueue_decrypt_work(struct work_struct *);
@ -368,7 +413,6 @@ int fscrypt_fname_disk_to_usr(const struct inode *inode,
bool fscrypt_match_name(const struct fscrypt_name *fname,
const u8 *de_name, u32 de_name_len);
u64 fscrypt_fname_siphash(const struct inode *dir, const struct qstr *name);
int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags);
/* bio.c */
bool fscrypt_decrypt_bio(struct bio *bio);
@ -425,6 +469,11 @@ static inline bool fscrypt_is_nokey_name(const struct dentry *dentry)
return false;
}
static inline void fscrypt_prepare_dentry(struct dentry *dentry,
bool is_nokey_name)
{
}
/* crypto.c */
static inline void fscrypt_enqueue_decrypt_work(struct work_struct *work)
{
@ -982,6 +1031,9 @@ static inline int fscrypt_prepare_lookup(struct inode *dir,
fname->usr_fname = &dentry->d_name;
fname->disk_name.name = (unsigned char *)dentry->d_name.name;
fname->disk_name.len = dentry->d_name.len;
fscrypt_prepare_dentry(dentry, false);
return 0;
}


@ -171,6 +171,7 @@ enum maple_type {
#define MT_FLAGS_LOCK_IRQ 0x100
#define MT_FLAGS_LOCK_BH 0x200
#define MT_FLAGS_LOCK_EXTERN 0x300
#define MT_FLAGS_ALLOC_WRAPPED 0x0800
#define MAPLE_HEIGHT_MAX 31
@ -319,6 +320,9 @@ int mtree_insert_range(struct maple_tree *mt, unsigned long first,
int mtree_alloc_range(struct maple_tree *mt, unsigned long *startp,
void *entry, unsigned long size, unsigned long min,
unsigned long max, gfp_t gfp);
int mtree_alloc_cyclic(struct maple_tree *mt, unsigned long *startp,
void *entry, unsigned long range_lo, unsigned long range_hi,
unsigned long *next, gfp_t gfp);
int mtree_alloc_rrange(struct maple_tree *mt, unsigned long *startp,
void *entry, unsigned long size, unsigned long min,
unsigned long max, gfp_t gfp);
@ -499,6 +503,9 @@ void *mas_find_range(struct ma_state *mas, unsigned long max);
void *mas_find_rev(struct ma_state *mas, unsigned long min);
void *mas_find_range_rev(struct ma_state *mas, unsigned long max);
int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp);
int mas_alloc_cyclic(struct ma_state *mas, unsigned long *startp,
void *entry, unsigned long range_lo, unsigned long range_hi,
unsigned long *next, gfp_t gfp);
bool mas_nomem(struct ma_state *mas, gfp_t gfp);
void mas_pause(struct ma_state *mas);


@ -14,11 +14,7 @@
/* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
additional memory. */
#ifdef __clang__
#define MAX_STACK_ALLOC 768
#else
#define MAX_STACK_ALLOC 832
#endif
#define FRONTEND_STACK_ALLOC 256
#define SELECT_STACK_ALLOC FRONTEND_STACK_ALLOC
#define POLL_STACK_ALLOC FRONTEND_STACK_ALLOC


@ -301,9 +301,12 @@ typedef int __bitwise __kernel_rwf_t;
/* per-IO O_APPEND */
#define RWF_APPEND ((__force __kernel_rwf_t)0x00000010)
/* per-IO negation of O_APPEND */
#define RWF_NOAPPEND ((__force __kernel_rwf_t)0x00000020)
/* mask of flags supported by the kernel */
#define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
RWF_APPEND)
RWF_APPEND | RWF_NOAPPEND)
/* Pagemap ioctl */
#define PAGEMAP_SCAN _IOWR('f', 16, struct pm_scan_arg)
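
From userspace the new flag is passed per-IO to pwritev2(). A minimal example (assuming a libc and uapi headers that already expose RWF_NOAPPEND) that writes at the requested offset even though the descriptor was opened with O_APPEND:

#define _GNU_SOURCE
#include <sys/uio.h>

/* Returns bytes written at offset 0, or -1 with errno set; fails with
 * EPERM if the inode itself is marked append-only (IS_APPEND). */
ssize_t write_at_start(int fd, const void *buf, size_t len)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };

	return pwritev2(fd, &iov, 1, 0, RWF_NOAPPEND);
}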


@ -679,8 +679,6 @@ static void __init populate_initrd_image(char *err)
struct file *file;
loff_t pos = 0;
unpack_to_rootfs(__initramfs_start, __initramfs_size);
printk(KERN_INFO "rootfs image is not initramfs (%s); looks like an initrd\n",
err);
file = filp_open("/initrd.image", O_WRONLY | O_CREAT, 0700);


@ -691,12 +691,11 @@ EXPORT_SYMBOL(iov_iter_discard);
static bool iov_iter_aligned_iovec(const struct iov_iter *i, unsigned addr_mask,
unsigned len_mask)
{
const struct iovec *iov = iter_iov(i);
size_t size = i->count;
size_t skip = i->iov_offset;
unsigned k;
for (k = 0; k < i->nr_segs; k++, skip = 0) {
const struct iovec *iov = iter_iov(i) + k;
do {
size_t len = iov->iov_len - skip;
if (len > size)
@ -706,34 +705,36 @@ static bool iov_iter_aligned_iovec(const struct iov_iter *i, unsigned addr_mask,
if ((unsigned long)(iov->iov_base + skip) & addr_mask)
return false;
iov++;
size -= len;
if (!size)
break;
}
skip = 0;
} while (size);
return true;
}
static bool iov_iter_aligned_bvec(const struct iov_iter *i, unsigned addr_mask,
unsigned len_mask)
{
size_t size = i->count;
const struct bio_vec *bvec = i->bvec;
unsigned skip = i->iov_offset;
unsigned k;
size_t size = i->count;
for (k = 0; k < i->nr_segs; k++, skip = 0) {
size_t len = i->bvec[k].bv_len - skip;
do {
size_t len = bvec->bv_len;
if (len > size)
len = size;
if (len & len_mask)
return false;
if ((unsigned long)(i->bvec[k].bv_offset + skip) & addr_mask)
if ((unsigned long)(bvec->bv_offset + skip) & addr_mask)
return false;
bvec++;
size -= len;
if (!size)
break;
}
skip = 0;
} while (size);
return true;
}
@ -777,13 +778,12 @@ EXPORT_SYMBOL_GPL(iov_iter_is_aligned);
static unsigned long iov_iter_alignment_iovec(const struct iov_iter *i)
{
const struct iovec *iov = iter_iov(i);
unsigned long res = 0;
size_t size = i->count;
size_t skip = i->iov_offset;
unsigned k;
for (k = 0; k < i->nr_segs; k++, skip = 0) {
const struct iovec *iov = iter_iov(i) + k;
do {
size_t len = iov->iov_len - skip;
if (len) {
res |= (unsigned long)iov->iov_base + skip;
@ -791,30 +791,31 @@ static unsigned long iov_iter_alignment_iovec(const struct iov_iter *i)
len = size;
res |= len;
size -= len;
if (!size)
break;
}
}
iov++;
skip = 0;
} while (size);
return res;
}
static unsigned long iov_iter_alignment_bvec(const struct iov_iter *i)
{
const struct bio_vec *bvec = i->bvec;
unsigned res = 0;
size_t size = i->count;
unsigned skip = i->iov_offset;
unsigned k;
for (k = 0; k < i->nr_segs; k++, skip = 0) {
size_t len = i->bvec[k].bv_len - skip;
res |= (unsigned long)i->bvec[k].bv_offset + skip;
do {
size_t len = bvec->bv_len - skip;
res |= (unsigned long)bvec->bv_offset + skip;
if (len > size)
len = size;
res |= len;
bvec++;
size -= len;
if (!size)
break;
}
skip = 0;
} while (size);
return res;
}
@ -1143,11 +1144,12 @@ const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
EXPORT_SYMBOL(dup_iter);
static __noclone int copy_compat_iovec_from_user(struct iovec *iov,
const struct iovec __user *uvec, unsigned long nr_segs)
const struct iovec __user *uvec, u32 nr_segs)
{
const struct compat_iovec __user *uiov =
(const struct compat_iovec __user *)uvec;
int ret = -EFAULT, i;
int ret = -EFAULT;
u32 i;
if (!user_access_begin(uiov, nr_segs * sizeof(*uiov)))
return -EFAULT;


@ -4290,6 +4290,56 @@ exists:
}
/**
* mas_alloc_cyclic() - Internal call to find somewhere to store an entry
* @mas: The maple state.
* @startp: Pointer to ID.
* @entry: The entry to store.
* @range_lo: Lower bound of range to search.
* @range_hi: Upper bound of range to search.
* @next: Pointer to next ID to allocate.
* @gfp: The GFP_FLAGS to use for allocations.
*
* Return: 0 if the allocation succeeded without wrapping, 1 if the
* allocation succeeded after wrapping, or -EBUSY if there are no
* free entries.
*/
int mas_alloc_cyclic(struct ma_state *mas, unsigned long *startp,
void *entry, unsigned long range_lo, unsigned long range_hi,
unsigned long *next, gfp_t gfp)
{
unsigned long min = range_lo;
int ret = 0;
range_lo = max(min, *next);
ret = mas_empty_area(mas, range_lo, range_hi, 1);
if ((mas->tree->ma_flags & MT_FLAGS_ALLOC_WRAPPED) && ret == 0) {
mas->tree->ma_flags &= ~MT_FLAGS_ALLOC_WRAPPED;
ret = 1;
}
if (ret < 0 && range_lo > min) {
ret = mas_empty_area(mas, min, range_hi, 1);
if (ret == 0)
ret = 1;
}
if (ret < 0)
return ret;
do {
mas_insert(mas, entry);
} while (mas_nomem(mas, gfp));
if (mas_is_err(mas))
return xa_err(mas->node);
*startp = mas->index;
*next = *startp + 1;
if (*next == 0)
mas->tree->ma_flags |= MT_FLAGS_ALLOC_WRAPPED;
return ret;
}
EXPORT_SYMBOL(mas_alloc_cyclic);
static __always_inline void mas_rewalk(struct ma_state *mas, unsigned long index)
{
retry:
@ -6443,6 +6493,49 @@ unlock:
}
EXPORT_SYMBOL(mtree_alloc_range);
/**
* mtree_alloc_cyclic() - Find somewhere to store this entry in the tree.
* @mt: The maple tree.
* @startp: Pointer to ID.
* @entry: The entry to store.
* @range_lo: Lower bound of range to search.
* @range_hi: Upper bound of range to search.
* @next: Pointer to next ID to allocate.
* @gfp: The GFP_FLAGS to use for allocations.
*
* Finds an empty entry in @mt after @next, stores the new index into
* the @startp pointer, stores the entry at that index, then updates @next.
*
* @mt must be initialized with the MT_FLAGS_ALLOC_RANGE flag.
*
* Context: Any context. Takes and releases the mt.lock. May sleep if
* the @gfp flags permit.
*
* Return: 0 if the allocation succeeded without wrapping, 1 if the
* allocation succeeded after wrapping, -ENOMEM if memory could not be
* allocated, -EINVAL if @mt cannot be used, or -EBUSY if there are no
* free entries.
*/
int mtree_alloc_cyclic(struct maple_tree *mt, unsigned long *startp,
void *entry, unsigned long range_lo, unsigned long range_hi,
unsigned long *next, gfp_t gfp)
{
int ret;
MA_STATE(mas, mt, 0, 0);
if (!mt_is_alloc(mt))
return -EINVAL;
if (WARN_ON_ONCE(mt_is_reserved(entry)))
return -EINVAL;
mtree_lock(mt);
ret = mas_alloc_cyclic(&mas, startp, entry, range_lo, range_hi,
next, gfp);
mtree_unlock(mt);
return ret;
}
EXPORT_SYMBOL(mtree_alloc_cyclic);
int mtree_alloc_rrange(struct maple_tree *mt, unsigned long *startp,
void *entry, unsigned long size, unsigned long min,
unsigned long max, gfp_t gfp)


@ -3599,6 +3599,45 @@ static noinline void __init check_state_handling(struct maple_tree *mt)
mas_unlock(&mas);
}
static noinline void __init alloc_cyclic_testing(struct maple_tree *mt)
{
unsigned long location;
unsigned long next;
int ret = 0;
MA_STATE(mas, mt, 0, 0);
next = 0;
mtree_lock(mt);
for (int i = 0; i < 100; i++) {
mas_alloc_cyclic(&mas, &location, mt, 2, ULONG_MAX, &next, GFP_KERNEL);
MAS_BUG_ON(&mas, i != location - 2);
MAS_BUG_ON(&mas, mas.index != location);
MAS_BUG_ON(&mas, mas.last != location);
MAS_BUG_ON(&mas, i != next - 3);
}
mtree_unlock(mt);
mtree_destroy(mt);
next = 0;
mt_init_flags(mt, MT_FLAGS_ALLOC_RANGE);
for (int i = 0; i < 100; i++) {
mtree_alloc_cyclic(mt, &location, mt, 2, ULONG_MAX, &next, GFP_KERNEL);
MT_BUG_ON(mt, i != location - 2);
MT_BUG_ON(mt, i != next - 3);
MT_BUG_ON(mt, mtree_load(mt, location) != mt);
}
mtree_destroy(mt);
/* Overflow test */
next = ULONG_MAX - 1;
ret = mtree_alloc_cyclic(mt, &location, mt, 2, ULONG_MAX, &next, GFP_KERNEL);
MT_BUG_ON(mt, ret != 0);
ret = mtree_alloc_cyclic(mt, &location, mt, 2, ULONG_MAX, &next, GFP_KERNEL);
MT_BUG_ON(mt, ret != 0);
ret = mtree_alloc_cyclic(mt, &location, mt, 2, ULONG_MAX, &next, GFP_KERNEL);
MT_BUG_ON(mt, ret != 1);
}
static DEFINE_MTREE(tree);
static int __init maple_tree_seed(void)
{
@ -3880,6 +3919,11 @@ static int __init maple_tree_seed(void)
check_state_handling(&tree);
mtree_destroy(&tree);
mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE);
alloc_cyclic_testing(&tree);
mtree_destroy(&tree);
#if defined(BENCH)
skip:
#endif


@ -372,31 +372,6 @@ static int __init default_bdi_init(void)
}
subsys_initcall(default_bdi_init);
/*
* This function is used when the first inode for this wb is marked dirty. It
* wakes up the corresponding bdi thread which should then take care of the
* periodic background write-out of dirty inodes. Since the write-out would
* start only 'dirty_writeback_interval' centisecs from now anyway, we just
* set up a timer which wakes the bdi thread up later.
*
* Note, we wouldn't bother setting up the timer, but this function is on the
* fast-path (used by '__mark_inode_dirty()'), so we save a few context switches
* by delaying the wake-up.
*
* We have to be careful not to postpone flush work if it is scheduled for
* earlier. Thus we use queue_delayed_work().
*/
void wb_wakeup_delayed(struct bdi_writeback *wb)
{
unsigned long timeout;
timeout = msecs_to_jiffies(dirty_writeback_interval * 10);
spin_lock_irq(&wb->work_lock);
if (test_bit(WB_registered, &wb->state))
queue_delayed_work(bdi_wq, &wb->dwork, timeout);
spin_unlock_irq(&wb->work_lock);
}
static void wb_update_bandwidth_workfn(struct work_struct *work)
{
struct bdi_writeback *wb = container_of(to_delayed_work(work),


@ -2608,15 +2608,6 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
goto put_folios;
end_offset = min_t(loff_t, isize, iocb->ki_pos + iter->count);
/*
* Pairs with a barrier in
* block_write_end()->mark_buffer_dirty() or other page
* dirtying routines like iomap_write_end() to ensure
* changes to page contents are visible before we see
* increased inode size.
*/
smp_rmb();
/*
* Once we start copying data, we don't want to be touching any
* cachelines that might be contended:


@ -3374,7 +3374,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
static int shmem_rmdir(struct inode *dir, struct dentry *dentry)
{
if (!simple_empty(dentry))
if (!simple_offset_empty(dentry))
return -ENOTEMPTY;
drop_nlink(d_inode(dentry));
@ -3431,7 +3431,7 @@ static int shmem_rename2(struct mnt_idmap *idmap,
return simple_offset_rename_exchange(old_dir, old_dentry,
new_dir, new_dentry);
if (!simple_empty(new_dentry))
if (!simple_offset_empty(new_dentry))
return -ENOTEMPTY;
if (flags & RENAME_WHITEOUT) {


@ -10,7 +10,6 @@
#include <linux/mount.h>
#include <sys/syscall.h>
#include <sys/stat.h>
#include <sys/mount.h>
#include <sys/mman.h>
#include <sched.h>
#include <fcntl.h>
@ -32,7 +31,11 @@ static int sys_fsmount(int fd, unsigned int flags, unsigned int attr_flags)
{
return syscall(__NR_fsmount, fd, flags, attr_flags);
}
static int sys_mount(const char *src, const char *tgt, const char *fst,
unsigned long flags, const void *data)
{
return syscall(__NR_mount, src, tgt, fst, flags, data);
}
static int sys_move_mount(int from_dfd, const char *from_pathname,
int to_dfd, const char *to_pathname,
unsigned int flags)
@ -166,8 +169,7 @@ int main(int argc, char **argv)
ksft_test_result_skip("unable to create a new mount namespace\n");
return 1;
}
if (mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL) == -1) {
if (sys_mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL) == -1) {
pr_perror("mount");
return 1;
}


@ -218,7 +218,7 @@ static bool move_mount_set_group_supported(void)
if (mount(NULL, SET_GROUP_FROM, NULL, MS_SHARED, 0))
return -1;
ret = syscall(SYS_move_mount, AT_FDCWD, SET_GROUP_FROM,
ret = syscall(__NR_move_mount, AT_FDCWD, SET_GROUP_FROM,
AT_FDCWD, SET_GROUP_TO, MOVE_MOUNT_SET_GROUP);
umount2("/tmp", MNT_DETACH);
@ -363,7 +363,7 @@ TEST_F(move_mount_set_group, complex_sharing_copying)
CLONE_VM | CLONE_FILES); ASSERT_GT(pid, 0);
ASSERT_EQ(wait_for_pid(pid), 0);
ASSERT_EQ(syscall(SYS_move_mount, ca_from.mntfd, "",
ASSERT_EQ(syscall(__NR_move_mount, ca_from.mntfd, "",
ca_to.mntfd, "", MOVE_MOUNT_SET_GROUP
| MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_EMPTY_PATH),
0);