overlayfs updates for 6.8

-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE9zuTYTs0RXF+Ke33EVvVyTe/1WoFAmWeozwACgkQEVvVyTe/ 1WqnyA//U2Ka5ZIncs/hA5D03LMyuCh9qlH5qAGce5vrBTxogTlFuTGGKtsUuCB5 Y4GALO+fw8aWAowt5X1XfHD3TETLVbCshT7dYjKsKy/ojANCbgkCipXBudYx+l9m fllwTZyueK0UY14kCU2DAV5PYsI/XVVykk71GSMOMLCUfRJfDI7R0vBD0NaUd7Kz Wp/M6t0MnXX23nGUdgNoroZPPj3Ts/gK2MXID+QHXGaR2+M1B1lLKfSu6TcRDLtn tbe/ivaw4y1jj3jfFwMC7sSSDyIJeZh9tBB4Rvv2vsMiYU8zAC6Eg35eIbPONu42 pUMd0QQa79H3cyYEDtUzyskzur0Jry5azzb8JdQWipgVKFh5g3CHce2XAFlVjw2a 9RyCKg41A9LvdB5l/PvBtsxig2PzaYqE09rXAfUM7eLNFlOLbL99uc1WJbIFfG43 Czh9vPxsuJ5RkdwS7R0m4GYDw8+BKW6WjpaC+Eje4I8X1rAQK0H/BLTCxe2dLRB7 0neAg8e3g6NdisRSLOP74xoEn/dhijNP7ENOFF1EdP/BFPHL7+sRsV6XYwwBeUAc c6YsxeAPylm6gvIq/ESoRiY+e5QWvImHIWP+zB/cySYdT0fQHL9WjO6/uZW0ALuv oZugICSmZ15pYlACIU8iYztRkS19CJZrUV7Gbq4+AurUKP8kCEI= =2Ohx -----END PGP SIGNATURE----- Merge tag 'ovl-update-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs updates from Amir Goldstein: "This is a very small update with no bug fixes and no new features. The larger update of overlayfs for this cycle, the re-factoring of overlayfs code into generic backing_file helpers, was already merged via Christian. Summary: - Simplify/clarify some code No bug fixes here, just some changes following questions from Al about overlayfs code that could be a little more simple to follow. - Overlayfs documentation style fixes Mainly fixes for ReST formatting suggested by documentation developers" * tag 'ovl-update-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: overlayfs.rst: fix ReST formatting overlayfs.rst: use consistent feature names ovl: initialize ovl_copy_up_ctx.destname inside ovl_do_copy_up() ovl: remove redundant ofs->indexdir member
2024-01-10 10:48:22 -08:00 · 2024-01-10 10:48:22 -08:00 · 4d925f6057
parent 0507d2526f d17bb4620f
commit 4d925f6057
9 changed files with 76 additions and 74 deletions
--- a/Documentation/filesystems/overlayfs.rst
+++ b/Documentation/filesystems/overlayfs.rst
@ -39,7 +39,7 @@ objects in the original filesystem.
 On 64bit systems, even if all overlay layers are not on the same
 underlying filesystem, the same compliant behavior could be achieved
 with the "xino" feature.  The "xino" feature composes a unique object
-identifier from the real object st_ino and an underlying fsid index.
+identifier from the real object st_ino and an underlying fsid number.
 The "xino" feature uses the high inode number bits for fsid, because the
 underlying filesystems rarely use the high inode number bits.  In case
 the underlying inode number does overflow into the high xino bits, overlay
@ -118,7 +118,7 @@ Where both upper and lower objects are directories, a merged directory
 is formed.

 At mount time, the two directories given as mount options "lowerdir" and
-"upperdir" are combined into a merged directory:
+"upperdir" are combined into a merged directory::

  mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,\
  workdir=/work /merged
@ -172,12 +172,12 @@ directory is being read.  This is unlikely to be noticed by many
 programs.

 seek offsets are assigned sequentially when the directories are read.
-Thus if
+Thus if:

-  - read part of a directory
-  - remember an offset, and close the directory
-  - re-open the directory some time later
-  - seek to the remembered offset
+ - read part of a directory
+ - remember an offset, and close the directory
+ - re-open the directory some time later
+ - seek to the remembered offset

 there may be little correlation between the old and new locations in
 the list of filenames, particularly if anything has changed in the
@ -290,9 +290,9 @@ Permission checking in the overlay filesystem follows these principles:
 2) task creating the overlay mount MUST NOT gain additional privileges

 3) non-mounting task MAY gain additional privileges through the overlay,
- compared to direct access on underlying lower or upper filesystems
+    compared to direct access on underlying lower or upper filesystems

-This is achieved by performing two permission checks on each access
+This is achieved by performing two permission checks on each access:

 a) check if current task is allowed access based on local DAC (owner,
    group, mode and posix acl), as well as MAC checks
@ -311,11 +311,11 @@ to create setups where the consistency rule (1) does not hold; normally,
 however, the mounting task will have sufficient privileges to perform all
 operations.

-Another way to demonstrate this model is drawing parallels between
+Another way to demonstrate this model is drawing parallels between::

  mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,... /merged

-and
+and::

  cp -a /lower /upper
  mount --bind /upper /merged
@ -328,7 +328,7 @@ Multiple lower layers
 ---------------------

 Multiple lower layers can now be given using the colon (":") as a
-separator character between the directory names.  For example:
+separator character between the directory names.  For example::

  mount -t overlay overlay -olowerdir=/lower1:/lower2:/lower3 /merged

@ -340,13 +340,13 @@ rightmost one and going left.  In the above example lower1 will be the
 top, lower2 the middle and lower3 the bottom layer.

 Note: directory names containing colons can be provided as lower layer by
-escaping the colons with a single backslash.  For example:
+escaping the colons with a single backslash.  For example::

  mount -t overlay overlay -olowerdir=/a\:lower\:\:dir /merged

 Since kernel version v6.8, directory names containing colons can also
 be configured as lower layer using the "lowerdir+" mount options and the
-fsconfig syscall from new mount api.  For example:
+fsconfig syscall from new mount api.  For example::

  fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir+", "/a:lower::dir", 0);

@ -356,7 +356,7 @@ as an octal characters (\072) when displayed in /proc/self/mountinfo.
 Metadata only copy up
 ---------------------

-When metadata only copy up feature is enabled, overlayfs will only copy
+When the "metacopy" feature is enabled, overlayfs will only copy
 up metadata (as opposed to whole file), when a metadata specific operation
 like chown/chmod is performed. Full file will be copied up later when
 file is opened for WRITE operation.
@ -405,7 +405,7 @@ A normal lower layer is not allowed to be below a data-only layer, so single
 colon separators are not allowed to the right of double colon ("::") separators.


-For example:
+For example::

  mount -t overlay overlay -olowerdir=/l1:/l2:/l3::/do1::/do2 /merged

@ -419,7 +419,7 @@ to the absolute path of the "lower data" file in the "data-only" lower layer.

 Since kernel version v6.8, "data-only" lower layers can also be added using
 the "datadir+" mount options and the fsconfig syscall from new mount api.
-For example:
+For example::

  fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir+", "/l1", 0);
  fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir+", "/l2", 0);
@ -429,7 +429,7 @@ For example:


 fs-verity support
----------------------
+-----------------

 During metadata copy up of a lower file, if the source file has
 fs-verity enabled and overlay verity support is enabled, then the
@ -492,27 +492,27 @@ though it will not result in a crash or deadlock.

 Mounting an overlay using an upper layer path, where the upper layer path
 was previously used by another mounted overlay in combination with a
-different lower layer path, is allowed, unless the "inodes index" feature
-or "metadata only copy up" feature is enabled.
+different lower layer path, is allowed, unless the "index" or "metacopy"
+features are enabled.

-With the "inodes index" feature, on the first time mount, an NFS file
+With the "index" feature, on the first time mount, an NFS file
 handle of the lower layer root directory, along with the UUID of the lower
 filesystem, are encoded and stored in the "trusted.overlay.origin" extended
 attribute on the upper layer root directory.  On subsequent mount attempts,
 the lower root directory file handle and lower filesystem UUID are compared
 to the stored origin in upper root directory.  On failure to verify the
 lower root origin, mount will fail with ESTALE.  An overlayfs mount with
-"inodes index" enabled will fail with EOPNOTSUPP if the lower filesystem
+"index" enabled will fail with EOPNOTSUPP if the lower filesystem
 does not support NFS export, lower filesystem does not have a valid UUID or
 if the upper filesystem does not support extended attributes.

-For "metadata only copy up" feature there is no verification mechanism at
+For the "metacopy" feature, there is no verification mechanism at
 mount time. So if same upper is mounted with different set of lower, mount
 probably will succeed but expect the unexpected later on. So don't do it.

 It is quite a common practice to copy overlay layers to a different
 directory tree on the same or different underlying filesystem, and even
-to a different machine.  With the "inodes index" feature, trying to mount
+to a different machine.  With the "index" feature, trying to mount
 the copied layers will fail the verification of the lower root file handle.

 Nesting overlayfs mounts
@ -547,20 +547,21 @@ filesystem.

 This is the list of cases that overlayfs doesn't currently handle:

-a) POSIX mandates updating st_atime for reads.  This is currently not
-done in the case when the file resides on a lower layer.
+ a) POSIX mandates updating st_atime for reads.  This is currently not
+    done in the case when the file resides on a lower layer.

-b) If a file residing on a lower layer is opened for read-only and then
-memory mapped with MAP_SHARED, then subsequent changes to the file are not
-reflected in the memory mapping.
+ b) If a file residing on a lower layer is opened for read-only and then
+    memory mapped with MAP_SHARED, then subsequent changes to the file are not
+    reflected in the memory mapping.

-c) If a file residing on a lower layer is being executed, then opening that
-file for write or truncating the file will not be denied with ETXTBSY.
+ c) If a file residing on a lower layer is being executed, then opening that
+    file for write or truncating the file will not be denied with ETXTBSY.

 The following options allow overlayfs to act more like a standards
 compliant filesystem:

-1) "redirect_dir"
+redirect_dir
+````````````

 Enabled with the mount option or module option: "redirect_dir=on" or with
 the kernel config option CONFIG_OVERLAY_FS_REDIRECT_DIR=y.
@ -568,7 +569,8 @@ the kernel config option CONFIG_OVERLAY_FS_REDIRECT_DIR=y.
 If this feature is disabled, then rename(2) on a lower or merged directory
 will fail with EXDEV ("Invalid cross-device link").

-2) "inode index"
+index
+`````

 Enabled with the mount option or module option "index=on" or with the
 kernel config option CONFIG_OVERLAY_FS_INDEX=y.
@ -577,7 +579,8 @@ If this feature is disabled and a file with multiple hard links is copied
 up, then this will "break" the link.  Changes will not be propagated to
 other names referring to the same inode.

-3) "xino"
+xino
+````

 Enabled with the mount option "xino=auto" or "xino=on", with the module
 option "xino_auto=on" or with the kernel config option
@ -604,7 +607,7 @@ a crash or deadlock.

 Offline changes, when the overlay is not mounted, are allowed to the
 upper tree.  Offline changes to the lower tree are only allowed if the
-"metadata only copy up", "inode index", "xino" and "redirect_dir" features
+"metacopy", "index", "xino" and "redirect_dir" features
 have not been used.  If the lower tree is modified and any of these
 features has been used, the behavior of the overlay is undefined,
 though it will not result in a crash or deadlock.
@ -644,12 +647,13 @@ directory inode.
 When encoding a file handle from an overlay filesystem object, the
 following rules apply:

-1. For a non-upper object, encode a lower file handle from lower inode
-2. For an indexed object, encode a lower file handle from copy_up origin
-3. For a pure-upper object and for an existing non-indexed upper object,
-   encode an upper file handle from upper inode
+ 1. For a non-upper object, encode a lower file handle from lower inode
+ 2. For an indexed object, encode a lower file handle from copy_up origin
+ 3. For a pure-upper object and for an existing non-indexed upper object,
+    encode an upper file handle from upper inode

 The encoded overlay file handle includes:
+
 - Header including path type information (e.g. lower/upper)
 - UUID of the underlying filesystem
 - Underlying filesystem encoding of underlying inode
@ -659,15 +663,15 @@ are stored in extended attribute "trusted.overlay.origin".

 When decoding an overlay file handle, the following steps are followed:

-1. Find underlying layer by UUID and path type information.
-2. Decode the underlying filesystem file handle to underlying dentry.
-3. For a lower file handle, lookup the handle in index directory by name.
-4. If a whiteout is found in index, return ESTALE. This represents an
-   overlay object that was deleted after its file handle was encoded.
-5. For a non-directory, instantiate a disconnected overlay dentry from the
-   decoded underlying dentry, the path type and index inode, if found.
-6. For a directory, use the connected underlying decoded dentry, path type
-   and index, to lookup a connected overlay dentry.
+ 1. Find underlying layer by UUID and path type information.
+ 2. Decode the underlying filesystem file handle to underlying dentry.
+ 3. For a lower file handle, lookup the handle in index directory by name.
+ 4. If a whiteout is found in index, return ESTALE. This represents an
+    overlay object that was deleted after its file handle was encoded.
+ 5. For a non-directory, instantiate a disconnected overlay dentry from the
+    decoded underlying dentry, the path type and index inode, if found.
+ 6. For a directory, use the connected underlying decoded dentry, path type
+    and index, to lookup a connected overlay dentry.

 Decoding a non-directory file handle may return a disconnected dentry.
 copy_up of that disconnected dentry will create an upper index entry with
@ -770,9 +774,9 @@ Testsuite
 There's a testsuite originally developed by David Howells and currently
 maintained by Amir Goldstein at:

-  https://github.com/amir73il/unionmount-testsuite.git
+https://github.com/amir73il/unionmount-testsuite.git

-Run as root:
+Run as root::

  # cd unionmount-testsuite
  # ./run --ov --verify
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@ -952,6 +952,13 @@ static int ovl_do_copy_up(struct ovl_copy_up_ctx *c)
 		err = -EIO;
 		goto out_free_fh;
 	} else {
+		/*
+		 * c->dentry->d_name is stabilzed by ovl_copy_up_start(),
+		 * because if we got here, it means that c->dentry has no upper
+		 * alias and changing ->d_name means going through ovl_rename()
+		 * that will call ovl_copy_up() on source and target dentry.
+		 */
+		c->destname = c->dentry->d_name;
 		/*
 		 * Mark parent "impure" because it may now contain non-pure
 		 * upper
@ -1132,7 +1139,6 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	if (parent) {
 		ovl_path_upper(parent, &parentpath);
 		ctx.destdir = parentpath.dentry;
-		ctx.destname = dentry->d_name;

 		err = vfs_getattr(&parentpath, &ctx.pstat,
 				  STATX_ATIME | STATX_MTIME,
--- a/fs/overlayfs/export.c
+++ b/fs/overlayfs/export.c
@ -460,7 +460,7 @@ static struct dentry *ovl_lookup_real_inode(struct super_block *sb,
 	 * For decoded lower dir file handle, lookup index by origin to check
 	 * if lower dir was copied up and and/or removed.
 	 */
-	if (!this && layer->idx && ofs->indexdir && !WARN_ON(!d_is_dir(real))) {
+	if (!this && layer->idx && ovl_indexdir(sb) && !WARN_ON(!d_is_dir(real))) {
 		index = ovl_lookup_index(ofs, NULL, real, false);
 		if (IS_ERR(index))
 			return index;
@ -733,7 +733,7 @@ static struct dentry *ovl_lower_fh_to_d(struct super_block *sb,
 	}

 	/* Then lookup indexed upper/whiteout by origin fh */
-	if (ofs->indexdir) {
+	if (ovl_indexdir(sb)) {
 		index = ovl_get_index_fh(ofs, fh);
 		err = PTR_ERR(index);
 		if (IS_ERR(index)) {
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@ -754,7 +754,7 @@ struct dentry *ovl_get_index_fh(struct ovl_fs *ofs, struct ovl_fh *fh)
 	if (err)
 		return ERR_PTR(err);

-	index = lookup_positive_unlocked(name.name, ofs->indexdir, name.len);
+	index = lookup_positive_unlocked(name.name, ofs->workdir, name.len);
 	kfree(name.name);
 	if (IS_ERR(index)) {
 		if (PTR_ERR(index) == -ENOENT)
@ -787,7 +787,7 @@ struct dentry *ovl_lookup_index(struct ovl_fs *ofs, struct dentry *upper,
 		return ERR_PTR(err);

 	index = lookup_one_positive_unlocked(ovl_upper_mnt_idmap(ofs), name.name,
-					     ofs->indexdir, name.len);
+					     ofs->workdir, name.len);
 	if (IS_ERR(index)) {
 		err = PTR_ERR(index);
 		if (err == -ENOENT) {
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@ -63,10 +63,8 @@ struct ovl_fs {
 	struct ovl_sb *fs;
 	/* workbasedir is the path at workdir= mount option */
 	struct dentry *workbasedir;
-	/* workdir is the 'work' directory under workbasedir */
+	/* workdir is the 'work' or 'index' directory under workbasedir */
 	struct dentry *workdir;
-	/* index directory listing overlay inodes by origin file handle */
-	struct dentry *indexdir;
 	long namelen;
 	/* pathnames of lower and upper dirs, for show_options */
 	struct ovl_config config;
@ -81,7 +79,6 @@ struct ovl_fs {
 	/* Traps in ovl inode cache */
 	struct inode *workbasedir_trap;
 	struct inode *workdir_trap;
-	struct inode *indexdir_trap;
 	/* -1: disabled, 0: same fs, 1..32: number of unused ino bits */
 	int xino_mode;
 	/* For allocation of non-persistent inode numbers */
--- a/fs/overlayfs/params.c
+++ b/fs/overlayfs/params.c
@ -743,10 +743,8 @@ void ovl_free_fs(struct ovl_fs *ofs)
 	unsigned i;

 	iput(ofs->workbasedir_trap);
-	iput(ofs->indexdir_trap);
 	iput(ofs->workdir_trap);
 	dput(ofs->whiteout);
-	dput(ofs->indexdir);
 	dput(ofs->workdir);
 	if (ofs->workdir_locked)
 		ovl_inuse_unlock(ofs->workbasedir);
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@ -1169,7 +1169,7 @@ int ovl_workdir_cleanup(struct ovl_fs *ofs, struct inode *dir,
 int ovl_indexdir_cleanup(struct ovl_fs *ofs)
 {
 	int err;
-	struct dentry *indexdir = ofs->indexdir;
+	struct dentry *indexdir = ofs->workdir;
 	struct dentry *index = NULL;
 	struct inode *dir = indexdir->d_inode;
 	struct path path = { .mnt = ovl_upper_mnt(ofs), .dentry = indexdir };
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@ -853,10 +853,8 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs,
 	if (IS_ERR(indexdir)) {
 		err = PTR_ERR(indexdir);
 	} else if (indexdir) {
-		ofs->indexdir = indexdir;
-		ofs->workdir = dget(indexdir);
-
-		err = ovl_setup_trap(sb, ofs->indexdir, &ofs->indexdir_trap,
+		ofs->workdir = indexdir;
+		err = ovl_setup_trap(sb, indexdir, &ofs->workdir_trap,
 				     "indexdir");
 		if (err)
 			goto out;
@ -869,16 +867,15 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs,
 		 * ".overlay.upper" to indicate that index may have
 		 * directory entries.
 		 */
-		if (ovl_check_origin_xattr(ofs, ofs->indexdir)) {
-			err = ovl_verify_origin_xattr(ofs, ofs->indexdir,
+		if (ovl_check_origin_xattr(ofs, indexdir)) {
+			err = ovl_verify_origin_xattr(ofs, indexdir,
 						      OVL_XATTR_ORIGIN,
 						      upperpath->dentry, true,
 						      false);
 			if (err)
 				pr_err("failed to verify index dir 'origin' xattr\n");
 		}
-		err = ovl_verify_upper(ofs, ofs->indexdir, upperpath->dentry,
-				       true);
+		err = ovl_verify_upper(ofs, indexdir, upperpath->dentry, true);
 		if (err)
 			pr_err("failed to verify index dir 'upper' xattr\n");

@ -886,7 +883,7 @@ static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs,
 		if (!err)
 			err = ovl_indexdir_cleanup(ofs);
 	}
-	if (err || !ofs->indexdir)
+	if (err || !indexdir)
 		pr_warn("try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");

 out:
@ -1406,7 +1403,7 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
 			goto out_free_oe;

 		/* Force r/o mount with no index dir */
-		if (!ofs->indexdir)
+		if (!ofs->workdir)
 			sb->s_flags |= SB_RDONLY;
 	}

@ -1415,7 +1412,7 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
 		goto out_free_oe;

 	/* Show index=off in /proc/mounts for forced r/o mount */
-	if (!ofs->indexdir) {
+	if (!ofs->workdir) {
 		ofs->config.index = false;
 		if (ovl_upper_mnt(ofs) && ofs->config.nfs_export) {
 			pr_warn("NFS export requires an index dir, falling back to nfs_export=off.\n");
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@ -91,7 +91,7 @@ struct dentry *ovl_indexdir(struct super_block *sb)
 {
 	struct ovl_fs *ofs = OVL_FS(sb);

-	return ofs->indexdir;
+	return ofs->config.index ? ofs->workdir : NULL;
 }

 /* Index all files on copy up. For now only enabled for NFS export */