Commit Graph

958 Commits

Author SHA1 Message Date
Christian Brauner 22650a9982
fs,block: yield devices early
Currently a device is only really released once the umount returns to
userspace due to how file closing works. That ultimately could cause
an old umount assumption to be violated that concurrent umount and mount
don't fail. So an exclusively held device with a temporary holder should
be yielded before the filesystem is gone. Add a helper that allows
callers to do that. This also allows us to remove the two holder ops
that Linus wasn't excited about.

Link: https://lore.kernel.org/r/20240326-vfs-bdev-end_holder-v1-1-20af85202918@kernel.org
Fixes: f3a608827d ("bdev: open block device as files") # mainline only
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-27 13:17:15 +01:00
Linus Torvalds e5e038b7ae \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmXx5kwACgkQnJ2qBz9k
 QNmZowf/UlGJ1rmQFFhoodn3SyK48tQjOZ23Ygx6v9FZiLMuQ3b1k0kWKmwM4lZb
 mtRriCm+lPO9Yp/Sflz+jn8S51b/2bcTXiPV4w2Y4ZIun41wwggV7rWPnTCHhu94
 rGEPu/SNSBdpxWGv43BKHSDl4XolsGbyusQKBbKZtftnrpIf0y2OnyEXSV91Vnlh
 KM/XxzacBD4/3r4KCljyEkORWlIIn2+gdZf58sKtxLKvnfCIxjB+BF1e0gOWgmNQ
 e/pVnzbAHO3wuavRlwnrtA+ekBYQiJq7T61yyYI8zpeSoLHmwvPoKSsZP+q4BTvV
 yrcVCbGp3uZlXHD93U3BOfdqS0xBmg==
 =84Q4
 -----END PGP SIGNATURE-----

Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull ext2, isofs, udf, and quota updates from Jan Kara:
 "A lot of material this time:

   - removal of a lot of GFP_NOFS usage from ext2, udf, quota (either it
     was legacy or replaced with scoped memalloc_nofs_*() API)

   - removal of BUG_ONs in quota code

   - conversion of UDF to the new mount API

   - tightening quota on disk format verification

   - fix some potentially unsafe use of RCU pointers in quota code and
     annotate everything properly to make sparse happy

   - a few other small quota, ext2, udf, and isofs fixes"

* tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (26 commits)
  udf: remove SLAB_MEM_SPREAD flag usage
  quota: remove SLAB_MEM_SPREAD flag usage
  isofs: remove SLAB_MEM_SPREAD flag usage
  ext2: remove SLAB_MEM_SPREAD flag usage
  ext2: mark as deprecated
  udf: convert to new mount API
  udf: convert novrs to an option flag
  MAINTAINERS: add missing git address for ext2 entry
  quota: Detect loops in quota tree
  quota: Properly annotate i_dquot arrays with __rcu
  quota: Fix rcu annotations of inode dquot pointers
  isofs: handle CDs with bad root inode but good Joliet root directory
  udf: Avoid invalid LVID used on mount
  quota: Fix potential NULL pointer dereference
  quota: Drop GFP_NOFS instances under dquot->dq_lock and dqio_sem
  quota: Set nofs allocation context when acquiring dqio_sem
  ext2: Remove GFP_NOFS use in ext2_xattr_cache_insert()
  ext2: Drop GFP_NOFS use in ext2_get_blocks()
  ext2: Drop GFP_NOFS allocation from ext2_init_block_alloc_info()
  udf: Remove GFP_NOFS allocation in udf_expand_file_adinicb()
  ...
2024-03-13 14:30:58 -07:00
Linus Torvalds 910202f00a vfs-6.9.super
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZem4DwAKCRCRxhvAZXjc
 ooTRAQDRI6Qz6wJym5Yblta8BScMGbt/SgrdgkoCvT6y83MtqwD+Nv/AZQzi3A3l
 9NdULtniW1reuCYkc8R7dYM8S+yAwAc=
 =Y1qX
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.9.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull block handle updates from Christian Brauner:
 "Last cycle we changed opening of block devices, and opening a block
  device would return a bdev_handle. This allowed us to implement
  support for restricting and forbidding writes to mounted block
  devices. It was accompanied by converting and adding helpers to
  operate on bdev_handles instead of plain block devices.

  That was already a good step forward but ultimately it isn't necessary
  to have special purpose helpers for opening block devices internally
  that return a bdev_handle.

  Fundamentally, opening a block device internally should just be
  equivalent to opening files. So now all internal opens of block
  devices return files just as a userspace open would. Instead of
  introducing a separate indirection into bdev_open_by_*() via struct
  bdev_handle bdev_file_open_by_*() is made to just return a struct
  file. Opening and closing a block device just becomes equivalent to
  opening and closing a file.

  This all works well because internally we already have a pseudo fs for
  block devices and so opening block devices is simple. There's a few
  places where we needed to be careful such as during boot when the
  kernel is supposed to mount the rootfs directly without init doing it.
  Here we need to take care to ensure that we flush out any asynchronous
  file close. That's what we already do for opening, unpacking, and
  closing the initramfs. So nothing new here.

  The equivalence of opening and closing block devices to regular files
  is a win in and of itself. But it also has various other advantages.
  We can remove struct bdev_handle completely. Various low-level helpers
  are now private to the block layer. Other helpers were simply
  removable completely.

  A follow-up series that is already reviewed build on this and makes it
  possible to remove bdev->bd_inode and allows various clean ups of the
  buffer head code as well. All places where we stashed a bdev_handle
  now just stash a file and use simple accessors to get to the actual
  block device which was already the case for bdev_handle"

* tag 'vfs-6.9.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits)
  block: remove bdev_handle completely
  block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access
  bdev: remove bdev pointer from struct bdev_handle
  bdev: make struct bdev_handle private to the block layer
  bdev: make bdev_{release, open_by_dev}() private to block layer
  bdev: remove bdev_open_by_path()
  reiserfs: port block device access to file
  ocfs2: port block device access to file
  nfs: port block device access to files
  jfs: port block device access to file
  f2fs: port block device access to files
  ext4: port block device access to file
  erofs: port device access to file
  btrfs: port device access to file
  bcachefs: port block device access to file
  target: port block device access to file
  s390: port block device access to file
  nvme: port block device access to file
  block2mtd: port device access to files
  bcache: port block device access to files
  ...
2024-03-11 10:52:34 -07:00
Chengming Zhou 87a83c8c67 reiserfs: remove SLAB_MEM_SPREAD flag usage
The SLAB_MEM_SPREAD flag used to be implemented in SLAB, which was
removed as of v6.8-rc1 (see [1]), so it became a dead flag since the
commit 16a1d96835 ("mm/slab: remove mm/slab.c and slab_def.h"). And
the series[1] went on to mark it obsolete explicitly to avoid confusion
for users. Here we can just remove all its users, which has no any
functional change.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Link: https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz [1]
Link: https://lore.kernel.org/r/20240224135126.830110-1-chengming.zhou@linux.dev
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-27 11:21:32 +01:00
Christian Brauner 9de31ee6d7
reiserfs: port block device access to file
Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-26-adbd023e19cc@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-02-25 12:05:27 +01:00
Jan Kara ccb49011bb quota: Properly annotate i_dquot arrays with __rcu
Dquots pointed to from i_dquot arrays in inodes are protected by
dquot_srcu. Annotate them as such and change .get_dquots callback to
return properly annotated pointer to make sparse happy.

Fixes: b9ba6f94b2 ("quota: remove dqptr_sem")
Signed-off-by: Jan Kara <jack@suse.cz>
2024-02-08 12:04:59 +01:00
Linus Torvalds 488926926a misc cleanups (the part that hadn't been picked by individual fs trees)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZZ/BCAAKCRBZ7Krx/gZQ
 68qqAQD6LtfYLDJGdJM+lNpyiG4BA7coYpPlJtmH7mzL+MbFPgEAnM7XsK6zyvza
 3+rEggLM0UFWjg9Ln7Nlq035TeYtFwo=
 =w1mD
 -----END PGP SIGNATURE-----

Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull misc filesystem updates from Al Viro:
 "Misc cleanups (the part that hadn't been picked by individual fs
  trees)"

* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  apparmorfs: don't duplicate kfree_link()
  orangefs: saner arguments passing in readdir guts
  ocfs2_find_match(): there's no such thing as NULL or negative ->d_parent
  reiserfs_add_entry(): get rid of pointless namelen checks
  __ocfs2_add_entry(), ocfs2_prepare_dir_for_insert(): namelen checks
  ext4_add_entry(): ->d_name.len is never 0
  befs: d_obtain_alias(ERR_PTR(...)) will do the right thing
  affs: d_obtain_alias(ERR_PTR(...)) will do the right thing
  /proc/sys: use d_splice_alias() calling conventions to simplify failure exits
  hostfs: use d_splice_alias() calling conventions to simplify failure exits
  udf_fiiter_add_entry(): check for zero ->d_name.len is bogus...
  udf: d_obtain_alias(ERR_PTR(...)) will do the right thing...
  udf: d_splice_alias() will do the right thing on ERR_PTR() inode
  nfsd: kill stale comment about simple_fill_super() requirements
  bfs_add_entry(): get rid of pointless ->d_name.len checks
  nilfs2: d_obtain_alias(ERR_PTR(...)) will do the right thing...
  zonefs: d_splice_alias() will do the right thing on ERR_PTR() inode
2024-01-11 20:23:50 -08:00
Linus Torvalds bf4e7080ae fix directory locking scheme on rename
broken in 6.5; we really can't lock two unrelated directories
 without holding ->s_vfs_rename_mutex first and in case of
 same-parent rename of a subdirectory 6.5 ends up doing just
 that.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZZ+lyQAKCRBZ7Krx/gZQ
 60MWAP94hTqeMIpjhsUIkrTnylrIFaiw4UCWFJzIRG1QQYKqCgD/XUaWI9np7dL6
 0wR/j4CQSdJjiEFKUFE2pD3QoSuJYAQ=
 =+x0+
 -----END PGP SIGNATURE-----

Merge tag 'pull-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull rename updates from Al Viro:
 "Fix directory locking scheme on rename

  This was broken in 6.5; we really can't lock two unrelated directories
  without holding ->s_vfs_rename_mutex first and in case of same-parent
  rename of a subdirectory 6.5 ends up doing just that"

* tag 'pull-rename' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  rename(): avoid a deadlock in the case of parents having no common ancestor
  kill lock_two_inodes()
  rename(): fix the locking of subdirectories
  f2fs: Avoid reading renamed directory if parent does not change
  ext4: don't access the source subdirectory content on same-directory rename
  ext2: Avoid reading renamed directory if parent does not change
  udf_rename(): only access the child content on cross-directory rename
  ocfs2: Avoid touching renamed directory if parent does not change
  reiserfs: Avoid touching renamed directory if parent does not change
2024-01-11 20:00:22 -08:00
Edward Adam Davis dd8f87f21d
reiserfs: fix uninit-value in comp_keys
The cpu_key was not initialized in reiserfs_delete_solid_item(), which triggered
this issue.

Reported-and-tested-by:  <syzbot+b3b14fb9f8a14c5d0267@syzkaller.appspotmail.com>
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Link: https://lore.kernel.org/r/tencent_9EA7E746DE92DBC66049A62EDF6ED64CA706@qq.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-12-28 11:56:52 +01:00
Al Viro 3a1613672e reiserfs_add_entry(): get rid of pointless namelen checks
In all cases namelen is ->d_name.len of some dentry; moreover, a dentry
that has passed ->lookup() without triggering ENAMETOOLONG check there.
The comment next to these checks is either a rudiment of some other
check that used to be there once upon a time, or an attempt to come up
with the possible reason for that check (well, more like "why does
ext3 do it?")

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-12-21 12:53:25 -05:00
Jan Kara 49db9b1b86 reiserfs: Avoid touching renamed directory if parent does not change
The VFS will not be locking moved directory if its parent does not
change. Change reiserfs rename code to avoid touching renamed directory
if its parent does not change as without locking that can corrupt the
filesystem.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-11-25 02:53:20 -05:00
Linus Torvalds ecae0bd517 Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
 
 - Kemeng Shi has contributed some compation maintenance work in the
   series "Fixes and cleanups to compaction".
 
 - Joel Fernandes has a patchset ("Optimize mremap during mutual
   alignment within PMD") which fixes an obscure issue with mremap()'s
   pagetable handling during a subsequent exec(), based upon an
   implementation which Linus suggested.
 
 - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the
   following patch series:
 
 	mm/damon: misc fixups for documents, comments and its tracepoint
 	mm/damon: add a tracepoint for damos apply target regions
 	mm/damon: provide pseudo-moving sum based access rate
 	mm/damon: implement DAMOS apply intervals
 	mm/damon/core-test: Fix memory leaks in core-test
 	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
 
 - In the series "Do not try to access unaccepted memory" Adrian Hunter
   provides some fixups for the recently-added "unaccepted memory' feature.
   To increase the feature's checking coverage.  "Plug a few gaps where
   RAM is exposed without checking if it is unaccepted memory".
 
 - In the series "cleanups for lockless slab shrink" Qi Zheng has done
   some maintenance work which is preparation for the lockless slab
   shrinking code.
 
 - Qi Zheng has redone the earlier (and reverted) attempt to make slab
   shrinking lockless in the series "use refcount+RCU method to implement
   lockless slab shrink".
 
 - David Hildenbrand contributes some maintenance work for the rmap code
   in the series "Anon rmap cleanups".
 
 - Kefeng Wang does more folio conversions and some maintenance work in
   the migration code.  Series "mm: migrate: more folio conversion and
   unification".
 
 - Matthew Wilcox has fixed an issue in the buffer_head code which was
   causing long stalls under some heavy memory/IO loads.  Some cleanups
   were added on the way.  Series "Add and use bdev_getblk()".
 
 - In the series "Use nth_page() in place of direct struct page
   manipulation" Zi Yan has fixed a potential issue with the direct
   manipulation of hugetlb page frames.
 
 - In the series "mm: hugetlb: Skip initialization of gigantic tail
   struct pages if freed by HVO" has improved our handling of gigantic
   pages in the hugetlb vmmemmep optimizaton code.  This provides
   significant boot time improvements when significant amounts of gigantic
   pages are in use.
 
 - Matthew Wilcox has sent the series "Small hugetlb cleanups" - code
   rationalization and folio conversions in the hugetlb code.
 
 - Yin Fengwei has improved mlock()'s handling of large folios in the
   series "support large folio for mlock"
 
 - In the series "Expose swapcache stat for memcg v1" Liu Shixin has
   added statistics for memcg v1 users which are available (and useful)
   under memcg v2.
 
 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
   prctl so that userspace may direct the kernel to not automatically
   propagate the denial to child processes.  The series is named "MDWE
   without inheritance".
 
 - Kefeng Wang has provided the series "mm: convert numa balancing
   functions to use a folio" which does what it says.
 
 - In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch
   makes is possible for a process to propagate KSM treatment across
   exec().
 
 - Huang Ying has enhanced memory tiering's calculation of memory
   distances.  This is used to permit the dax/kmem driver to use "high
   bandwidth memory" in addition to Optane Data Center Persistent Memory
   Modules (DCPMM).  The series is named "memory tiering: calculate
   abstract distance based on ACPI HMAT"
 
 - In the series "Smart scanning mode for KSM" Stefan Roesch has
   optimized KSM by teaching it to retain and use some historical
   information from previous scans.
 
 - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the
   series "mm: memcg: fix tracking of pending stats updates values".
 
 - In the series "Implement IOCTL to get and optionally clear info about
   PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits
   us to atomically read-then-clear page softdirty state.  This is mainly
   used by CRIU.
 
 - Hugh Dickins contributed the series "shmem,tmpfs: general maintenance"
   - a bunch of relatively minor maintenance tweaks to this code.
 
 - Matthew Wilcox has increased the use of the VMA lock over file-backed
   page faults in the series "Handle more faults under the VMA lock".  Some
   rationalizations of the fault path became possible as a result.
 
 - In the series "mm/rmap: convert page_move_anon_rmap() to
   folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups
   and folio conversions.
 
 - In the series "various improvements to the GUP interface" Lorenzo
   Stoakes has simplified and improved the GUP interface with an eye to
   providing groundwork for future improvements.
 
 - Andrey Konovalov has sent along the series "kasan: assorted fixes and
   improvements" which does those things.
 
 - Some page allocator maintenance work from Kemeng Shi in the series
   "Two minor cleanups to break_down_buddy_pages".
 
 - In thes series "New selftest for mm" Breno Leitao has developed
   another MM self test which tickles a race we had between madvise() and
   page faults.
 
 - In the series "Add folio_end_read" Matthew Wilcox provides cleanups
   and an optimization to the core pagecache code.
 
 - Nhat Pham has added memcg accounting for hugetlb memory in the series
   "hugetlb memcg accounting".
 
 - Cleanups and rationalizations to the pagemap code from Lorenzo
   Stoakes, in the series "Abstract vma_merge() and split_vma()".
 
 - Audra Mitchell has fixed issues in the procfs page_owner code's new
   timestamping feature which was causing some misbehaviours.  In the
   series "Fix page_owner's use of free timestamps".
 
 - Lorenzo Stoakes has fixed the handling of new mappings of sealed files
   in the series "permit write-sealed memfd read-only shared mappings".
 
 - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
   series "Batch hugetlb vmemmap modification operations".
 
 - Some buffer_head folio conversions and cleanups from Matthew Wilcox in
   the series "Finish the create_empty_buffers() transition".
 
 - As a page allocator performance optimization Huang Ying has added
   automatic tuning to the allocator's per-cpu-pages feature, in the series
   "mm: PCP high auto-tuning".
 
 - Roman Gushchin has contributed the patchset "mm: improve performance
   of accounted kernel memory allocations" which improves their performance
   by ~30% as measured by a micro-benchmark.
 
 - folio conversions from Kefeng Wang in the series "mm: convert page
   cpupid functions to folios".
 
 - Some kmemleak fixups in Liu Shixin's series "Some bugfix about
   kmemleak".
 
 - Qi Zheng has improved our handling of memoryless nodes by keeping them
   off the allocation fallback list.  This is done in the series "handle
   memoryless nodes more appropriately".
 
 - khugepaged conversions from Vishal Moola in the series "Some
   khugepaged folio conversions".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA
 jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y
 FgeUPAD1oasg6CP+INZvCj34waNxwAc=
 =E+Y4
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

   - Kemeng Shi has contributed some compation maintenance work in the
     series 'Fixes and cleanups to compaction'

   - Joel Fernandes has a patchset ('Optimize mremap during mutual
     alignment within PMD') which fixes an obscure issue with mremap()'s
     pagetable handling during a subsequent exec(), based upon an
     implementation which Linus suggested

   - More DAMON/DAMOS maintenance and feature work from SeongJae Park i
     the following patch series:

	mm/damon: misc fixups for documents, comments and its tracepoint
	mm/damon: add a tracepoint for damos apply target regions
	mm/damon: provide pseudo-moving sum based access rate
	mm/damon: implement DAMOS apply intervals
	mm/damon/core-test: Fix memory leaks in core-test
	mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval

   - In the series 'Do not try to access unaccepted memory' Adrian
     Hunter provides some fixups for the recently-added 'unaccepted
     memory' feature. To increase the feature's checking coverage. 'Plug
     a few gaps where RAM is exposed without checking if it is
     unaccepted memory'

   - In the series 'cleanups for lockless slab shrink' Qi Zheng has done
     some maintenance work which is preparation for the lockless slab
     shrinking code

   - Qi Zheng has redone the earlier (and reverted) attempt to make slab
     shrinking lockless in the series 'use refcount+RCU method to
     implement lockless slab shrink'

   - David Hildenbrand contributes some maintenance work for the rmap
     code in the series 'Anon rmap cleanups'

   - Kefeng Wang does more folio conversions and some maintenance work
     in the migration code. Series 'mm: migrate: more folio conversion
     and unification'

   - Matthew Wilcox has fixed an issue in the buffer_head code which was
     causing long stalls under some heavy memory/IO loads. Some cleanups
     were added on the way. Series 'Add and use bdev_getblk()'

   - In the series 'Use nth_page() in place of direct struct page
     manipulation' Zi Yan has fixed a potential issue with the direct
     manipulation of hugetlb page frames

   - In the series 'mm: hugetlb: Skip initialization of gigantic tail
     struct pages if freed by HVO' has improved our handling of gigantic
     pages in the hugetlb vmmemmep optimizaton code. This provides
     significant boot time improvements when significant amounts of
     gigantic pages are in use

   - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
     rationalization and folio conversions in the hugetlb code

   - Yin Fengwei has improved mlock()'s handling of large folios in the
     series 'support large folio for mlock'

   - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
     added statistics for memcg v1 users which are available (and
     useful) under memcg v2

   - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
     prctl so that userspace may direct the kernel to not automatically
     propagate the denial to child processes. The series is named 'MDWE
     without inheritance'

   - Kefeng Wang has provided the series 'mm: convert numa balancing
     functions to use a folio' which does what it says

   - In the series 'mm/ksm: add fork-exec support for prctl' Stefan
     Roesch makes is possible for a process to propagate KSM treatment
     across exec()

   - Huang Ying has enhanced memory tiering's calculation of memory
     distances. This is used to permit the dax/kmem driver to use 'high
     bandwidth memory' in addition to Optane Data Center Persistent
     Memory Modules (DCPMM). The series is named 'memory tiering:
     calculate abstract distance based on ACPI HMAT'

   - In the series 'Smart scanning mode for KSM' Stefan Roesch has
     optimized KSM by teaching it to retain and use some historical
     information from previous scans

   - Yosry Ahmed has fixed some inconsistencies in memcg statistics in
     the series 'mm: memcg: fix tracking of pending stats updates
     values'

   - In the series 'Implement IOCTL to get and optionally clear info
     about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
     which permits us to atomically read-then-clear page softdirty
     state. This is mainly used by CRIU

   - Hugh Dickins contributed the series 'shmem,tmpfs: general
     maintenance', a bunch of relatively minor maintenance tweaks to
     this code

   - Matthew Wilcox has increased the use of the VMA lock over
     file-backed page faults in the series 'Handle more faults under the
     VMA lock'. Some rationalizations of the fault path became possible
     as a result

   - In the series 'mm/rmap: convert page_move_anon_rmap() to
     folio_move_anon_rmap()' David Hildenbrand has implemented some
     cleanups and folio conversions

   - In the series 'various improvements to the GUP interface' Lorenzo
     Stoakes has simplified and improved the GUP interface with an eye
     to providing groundwork for future improvements

   - Andrey Konovalov has sent along the series 'kasan: assorted fixes
     and improvements' which does those things

   - Some page allocator maintenance work from Kemeng Shi in the series
     'Two minor cleanups to break_down_buddy_pages'

   - In thes series 'New selftest for mm' Breno Leitao has developed
     another MM self test which tickles a race we had between madvise()
     and page faults

   - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
     and an optimization to the core pagecache code

   - Nhat Pham has added memcg accounting for hugetlb memory in the
     series 'hugetlb memcg accounting'

   - Cleanups and rationalizations to the pagemap code from Lorenzo
     Stoakes, in the series 'Abstract vma_merge() and split_vma()'

   - Audra Mitchell has fixed issues in the procfs page_owner code's new
     timestamping feature which was causing some misbehaviours. In the
     series 'Fix page_owner's use of free timestamps'

   - Lorenzo Stoakes has fixed the handling of new mappings of sealed
     files in the series 'permit write-sealed memfd read-only shared
     mappings'

   - Mike Kravetz has optimized the hugetlb vmemmap optimization in the
     series 'Batch hugetlb vmemmap modification operations'

   - Some buffer_head folio conversions and cleanups from Matthew Wilcox
     in the series 'Finish the create_empty_buffers() transition'

   - As a page allocator performance optimization Huang Ying has added
     automatic tuning to the allocator's per-cpu-pages feature, in the
     series 'mm: PCP high auto-tuning'

   - Roman Gushchin has contributed the patchset 'mm: improve
     performance of accounted kernel memory allocations' which improves
     their performance by ~30% as measured by a micro-benchmark

   - folio conversions from Kefeng Wang in the series 'mm: convert page
     cpupid functions to folios'

   - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
     kmemleak'

   - Qi Zheng has improved our handling of memoryless nodes by keeping
     them off the allocation fallback list. This is done in the series
     'handle memoryless nodes more appropriately'

   - khugepaged conversions from Vishal Moola in the series 'Some
     khugepaged folio conversions'"

[ bcachefs conflicts with the dynamically allocated shrinkers have been
  resolved as per Stephen Rothwell in

     https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/

  with help from Qi Zheng.

  The clone3 test filtering conflict was half-arsed by yours truly ]

* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
  mm/damon/sysfs: update monitoring target regions for online input commit
  mm/damon/sysfs: remove requested targets when online-commit inputs
  selftests: add a sanity check for zswap
  Documentation: maple_tree: fix word spelling error
  mm/vmalloc: fix the unchecked dereference warning in vread_iter()
  zswap: export compression failure stats
  Documentation: ubsan: drop "the" from article title
  mempolicy: migration attempt to match interleave nodes
  mempolicy: mmap_lock is not needed while migrating folios
  mempolicy: alloc_pages_mpol() for NUMA policy without vma
  mm: add page_rmappable_folio() wrapper
  mempolicy: remove confusing MPOL_MF_LAZY dead code
  mempolicy: mpol_shared_policy_init() without pseudo-vma
  mempolicy trivia: use pgoff_t in shared mempolicy tree
  mempolicy trivia: slightly more consistent naming
  mempolicy trivia: delete those ancient pr_debug()s
  mempolicy: fix migrate_pages(2) syscall return nr_failed
  kernfs: drop shared NUMA mempolicy hooks
  hugetlbfs: drop shared NUMA mempolicy pretence
  mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
  ...
2023-11-02 19:38:47 -10:00
Linus Torvalds 14ab6d425e vfs-6.7.ctime
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTppYgAKCRCRxhvAZXjc
 okIHAP9anLz1QDyMLH12ASuHjgBc0Of3jcB6NB97IWGpL4O21gEA46ohaD+vcJuC
 YkBLU3lXqQ87nfu28ExFAzh10hG2jwM=
 =m4pB
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs inode time accessor updates from Christian Brauner:
 "This finishes the conversion of all inode time fields to accessor
  functions as discussed on list. Changing timestamps manually as we
  used to do before is error prone. Using accessors function makes this
  robust.

  It does not contain the switch of the time fields to discrete 64 bit
  integers to replace struct timespec and free up space in struct inode.
  But after this, the switch can be trivially made and the patch should
  only affect the vfs if we decide to do it"

* tag 'vfs-6.7.ctime' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (86 commits)
  fs: rename inode i_atime and i_mtime fields
  security: convert to new timestamp accessors
  selinux: convert to new timestamp accessors
  apparmor: convert to new timestamp accessors
  sunrpc: convert to new timestamp accessors
  mm: convert to new timestamp accessors
  bpf: convert to new timestamp accessors
  ipc: convert to new timestamp accessors
  linux: convert to new timestamp accessors
  zonefs: convert to new timestamp accessors
  xfs: convert to new timestamp accessors
  vboxsf: convert to new timestamp accessors
  ufs: convert to new timestamp accessors
  udf: convert to new timestamp accessors
  ubifs: convert to new timestamp accessors
  tracefs: convert to new timestamp accessors
  sysv: convert to new timestamp accessors
  squashfs: convert to new timestamp accessors
  server: convert to new timestamp accessors
  client: convert to new timestamp accessors
  ...
2023-10-30 09:47:13 -10:00
Linus Torvalds 7352a6765c vfs-6.7.xattr
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZTppWAAKCRCRxhvAZXjc
 okB2AP4jjoRErJBwj245OIDJqzoj4m4UVOVd0MH2AkiSpANczwD/TToChdpusY2y
 qAYg1fQoGMbDVlb7Txaj9qI9ieCf9w0=
 =2PXg
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.7.xattr' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs

Pull vfs xattr updates from Christian Brauner:
 "The 's_xattr' field of 'struct super_block' currently requires a
  mutable table of 'struct xattr_handler' entries (although each handler
  itself is const). However, no code in vfs actually modifies the
  tables.

  This changes the type of 's_xattr' to allow const tables, and modifies
  existing file systems to move their tables to .rodata. This is
  desirable because these tables contain entries with function pointers
  in them; moving them to .rodata makes it considerably less likely to
  be modified accidentally or maliciously at runtime"

* tag 'vfs-6.7.xattr' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (30 commits)
  const_structs.checkpatch: add xattr_handler
  net: move sockfs_xattr_handlers to .rodata
  shmem: move shmem_xattr_handlers to .rodata
  overlayfs: move xattr tables to .rodata
  xfs: move xfs_xattr_handlers to .rodata
  ubifs: move ubifs_xattr_handlers to .rodata
  squashfs: move squashfs_xattr_handlers to .rodata
  smb: move cifs_xattr_handlers to .rodata
  reiserfs: move reiserfs_xattr_handlers to .rodata
  orangefs: move orangefs_xattr_handlers to .rodata
  ocfs2: move ocfs2_xattr_handlers and ocfs2_xattr_handler_map to .rodata
  ntfs3: move ntfs_xattr_handlers to .rodata
  nfs: move nfs4_xattr_handlers to .rodata
  kernfs: move kernfs_xattr_handlers to .rodata
  jfs: move jfs_xattr_handlers to .rodata
  jffs2: move jffs2_xattr_handlers to .rodata
  hfsplus: move hfsplus_xattr_handlers to .rodata
  hfs: move hfs_xattr_handlers to .rodata
  gfs2: move gfs2_xattr_handlers_max to .rodata
  fuse: move fuse_xattr_handlers to .rodata
  ...
2023-10-30 09:29:44 -10:00
Jan Kara ba1787a5ed
reiserfs: Convert to bdev_open_by_dev/path()
Convert reiserfs to use bdev_open_by_dev/path() and pass the handle
around.

CC: reiserfs-devel@vger.kernel.org
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230927093442.25915-27-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-28 13:29:21 +02:00
Matthew Wilcox (Oracle) 0a88810d9b buffer: remove folio_create_empty_buffers()
With all users converted, remove the old create_empty_buffers() and rename
folio_create_empty_buffers() to create_empty_buffers().

Link: https://lkml.kernel.org/r/20231016201114.1928083-28-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-25 16:47:10 -07:00
Matthew Wilcox (Oracle) 44f6857526 reiserfs: convert writepage to use a folio
Convert the incoming page to a folio and then use it throughout the
writeback path.  This definitely isn't enough to support large folios, but
I don't expect reiserfs to gain support for those before it is removed.

Link: https://lkml.kernel.org/r/20231016201114.1928083-23-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Pankaj Raghav <p.raghav@samsung.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-25 16:47:09 -07:00
Jeff Layton 5e8b820be8
reiserfs: convert to new timestamp accessors
Convert to using the new inode timestamp accessor functions.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20231004185347.80880-64-jlayton@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-18 14:08:27 +02:00
Wedson Almeida Filho 29a69055e2
reiserfs: move reiserfs_xattr_handlers to .rodata
This makes it harder for accidental or malicious changes to
reiserfs_xattr_handlers at runtime.

Cc: reiserfs-devel@vger.kernel.org
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Link: https://lore.kernel.org/r/20230930050033.41174-23-wedsonaf@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-10 13:49:19 +02:00
Shigeru Yoshida 2251588143 reiserfs: Replace 1-element array with C99 style flex-array
UBSAN found the following issue:

================================================================================
UBSAN: array-index-out-of-bounds in fs/reiserfs/journal.c:4166:22
index 1 is out of range for type '__le32 [1]'

This is because struct reiserfs_journal_desc uses 1-element array for
dynamically sized array member, j_realblock.

This patch fixes this issue by replacing the 1-element array member with C99
style flex-array.  This patch also fixes the same issue in struct
reiserfs_journal_commit as the same manner.

Fixes: f466c6fdb3 ("move private bits of reiserfs_fs.h to fs/reiserfs/reiserfs.h")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Message-Id: <20230821043312.1444068-1-syoshida@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-09-11 14:07:46 +02:00
Linus Torvalds 3d3dfeb3ae for-6.6/block-2023-08-28
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmTs08EQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqa4EACu/zKE+omGXBV0Q7kEpVsChjp0ElGtSDIJ
 tJfTuvnWqQjrqRv4ksmZvGdx8SkqFuXri4/7oBXlsaqeUVbIQdWJUpLErBye6nxa
 lUb6nXOFWwyG94cMRYs71lN0loosjb7aiVw7oVLAIhntq3p3doFl/cyy3ndMZrUE
 pZbsrWSt4QiOKhcO0TtIjfAwsr31AN51qFiNNITEiZl3UjXfkGRCK81X0yM2N8zZ
 7Y0h1ldPBsZ/olNWeRyaW1uB64nKM0buR7/nDxCV/NI05nndJ34bIgo/JIj4xy0v
 SiBj2+y86+oMJZt17yYENwOQdtX3hbyESGuVm9dCrO0t9/byVQxkUk0OMm65BM/l
 l2d+gmMQZTbHziqfLlgq9i3i9+B4C2hsb7iBpuo7SW/FPbM45POgi3lpiZycaZyu
 krQo1qwL4KSGXzGN9CabEuKDcJcXqLxqMDOyEDA3R5Kz06V9tNuM+Di/mr4vuZHK
 sVHUfHuWBO9ionLlGPdc3fH/CuMqic8SHjumiAm2menBZV6cSzRDxpm6H4CyLt7y
 tWmw7BNU7dfHFGd+Jw0Ld49sAuEybszEXq6qYv5uYBVfJNqDvOvEeVoQp0RN2jJA
 AG30hymcZgxn9n7gkIgkPQDgIGUjnzUR8B2mE2UFU1CYVHXYXAXU55CCI5oeTkbs
 d0Y/zCZf1A==
 =p1bd
 -----END PGP SIGNATURE-----

Merge tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:
 "Pretty quiet round for this release. This contains:

   - Add support for zoned storage to ublk (Andreas, Ming)

   - Series improving performance for drivers that mark themselves as
     needing a blocking context for issue (Bart)

   - Cleanup the flush logic (Chengming)

   - sed opal keyring support (Greg)

   - Fixes and improvements to the integrity support (Jinyoung)

   - Add some exports for bcachefs that we can hopefully delete again in
     the future (Kent)

   - deadline throttling fix (Zhiguo)

   - Series allowing building the kernel without buffer_head support
     (Christoph)

   - Sanitize the bio page adding flow (Christoph)

   - Write back cache fixes (Christoph)

   - MD updates via Song:
      - Fix perf regression for raid0 large sequential writes (Jan)
      - Fix split bio iostat for raid0 (David)
      - Various raid1 fixes (Heinz, Xueshi)
      - raid6test build fixes (WANG)
      - Deprecate bitmap file support (Christoph)
      - Fix deadlock with md sync thread (Yu)
      - Refactor md io accounting (Yu)
      - Various non-urgent fixes (Li, Yu, Jack)

   - Various fixes and cleanups (Arnd, Azeem, Chengming, Damien, Li,
     Ming, Nitesh, Ruan, Tejun, Thomas, Xu)"

* tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux: (113 commits)
  block: use strscpy() to instead of strncpy()
  block: sed-opal: keyring support for SED keys
  block: sed-opal: Implement IOC_OPAL_REVERT_LSP
  block: sed-opal: Implement IOC_OPAL_DISCOVERY
  blk-mq: prealloc tags when increase tagset nr_hw_queues
  blk-mq: delete redundant tagset map update when fallback
  blk-mq: fix tags leak when shrink nr_hw_queues
  ublk: zoned: support REQ_OP_ZONE_RESET_ALL
  md: raid0: account for split bio in iostat accounting
  md/raid0: Fix performance regression for large sequential writes
  md/raid0: Factor out helper for mapping and submitting a bio
  md raid1: allow writebehind to work on any leg device set WriteMostly
  md/raid1: hold the barrier until handle_read_error() finishes
  md/raid1: free the r1bio before waiting for blocked rdev
  md/raid1: call free_r1bio() before allow_barrier() in raid_end_bio_io()
  blk-cgroup: Fix NULL deref caused by blkg_policy_data being installed before init
  drivers/rnbd: restore sysfs interface to rnbd-client
  md/raid5-cache: fix null-ptr-deref for r5l_flush_stripe_to_raid()
  raid6: test: only check for Altivec if building on powerpc hosts
  raid6: test: make sure all intermediate and artifact files are .gitignored
  ...
2023-08-29 20:21:42 -07:00
Linus Torvalds 9d6b14cd1e flexible-array transformations for 6.6-rc1
Hi Linus,
 
 Please, pull the following flexible-array transformations. These patches
 have been baking in linux-next for a while.
 
 Thanks
 --
 Gustavo
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAmTtBEQACgkQRwW0y0cG
 2zHgeBAAm1D5H+4THFOuYVsl93igowMbcPwraoo2xxnNtLqQR4y6p9SiWcjh2vk3
 bmabPtID3jPZytjpV6QUYe6JlcOoWc2LsNTWqwnt2X2fhcovJfGMc/K3+7bRQzE8
 uBqooxkiaZvPiUATeo0zdBSN9aYWXvEpYrBR5J9EtJWyS93LiYb/nn2DY+0d1yrY
 89koxTV9/eOKJ7xi6UB0zWbyayor+JawX5RI2ysMFHO6UB2aVrNjqUCJP51dyywa
 2QKtwkY/8vQIo8Y8c7ReApHR0t+s0jmxRi9ip+4rh/0VNl8ON5SoL+EcRB5FLCTn
 E/VqQcebjIwo0AfeEzUQIc68E1hZJLYhGGJupF1Wsb9IvmbdTTMhhfLJmWStz5+E
 GnNpXx9gTjCoM5+FqJlMkHgxT9JpHYSV5LtnZKmM7wiexHND+5T/ukAxRUhFjLY3
 saMbXnJ85kHsNYLAxIDUfdRcXEDNBJl7XiVI2YRCI9fttD+GVjdFtATdNZFsFAge
 kHZCAFloFHwe0lTcksN5I/NVdDc6Jv7ABjrTKNezfrjfRn24zvR61V54/ebPp5Fs
 2JA5kVJpgGnUdIM2uot2lYLCmLDjhGNfXE95AjK0SdJXrZtzYqzTcKJg6w1gz3ZJ
 VHWK4snsu48KxTx66f+v38ovKXLtlA8Rk8kYOHWXYgB0ZXOOAy4=
 =L9kF
 -----END PGP SIGNATURE-----

Merge tag 'flex-array-transformations-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull flexible-array updates from Gustavo A. R. Silva.

* tag 'flex-array-transformations-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  fs: omfs: Use flexible-array member in struct omfs_extent
  sparc: openpromio: Address -Warray-bounds warning
  reiserfs: Replace one-element array with flexible-array member
2023-08-29 12:48:12 -07:00
Linus Torvalds de16588a77 v6.6-vfs.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXTxQAKCRCRxhvAZXjc
 okaVAP94WAlItvDRt/z2Wtzf0+RqPZeTXEdGTxua8+RxqCyYIQD+OO5nRfKQPHlV
 AqqGJMKItQMSMIYgB5ftqVhNWZfnHgM=
 =pSEW
 -----END PGP SIGNATURE-----

Merge tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains the usual miscellaneous features, cleanups, and fixes
  for vfs and individual filesystems.

  Features:

   - Block mode changes on symlinks and rectify our broken semantics

   - Report file modifications via fsnotify() for splice

   - Allow specifying an explicit timeout for the "rootwait" kernel
     command line option. This allows to timeout and reboot instead of
     always waiting indefinitely for the root device to show up

   - Use synchronous fput for the close system call

  Cleanups:

   - Get rid of open-coded lockdep workarounds for async io submitters
     and replace it all with a single consolidated helper

   - Simplify epoll allocation helper

   - Convert simple_write_begin and simple_write_end to use a folio

   - Convert page_cache_pipe_buf_confirm() to use a folio

   - Simplify __range_close to avoid pointless locking

   - Disable per-cpu buffer head cache for isolated cpus

   - Port ecryptfs to kmap_local_page() api

   - Remove redundant initialization of pointer buf in pipe code

   - Unexport the d_genocide() function which is only used within core
     vfs

   - Replace printk(KERN_ERR) and WARN_ON() with WARN()

  Fixes:

   - Fix various kernel-doc issues

   - Fix refcount underflow for eventfds when used as EFD_SEMAPHORE

   - Fix a mainly theoretical issue in devpts

   - Check the return value of __getblk() in reiserfs

   - Fix a racy assert in i_readcount_dec

   - Fix integer conversion issues in various functions

   - Fix LSM security context handling during automounts that prevented
     NFS superblock sharing"

* tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (39 commits)
  cachefiles: use kiocb_{start,end}_write() helpers
  ovl: use kiocb_{start,end}_write() helpers
  aio: use kiocb_{start,end}_write() helpers
  io_uring: use kiocb_{start,end}_write() helpers
  fs: create kiocb_{start,end}_write() helpers
  fs: add kerneldoc to file_{start,end}_write() helpers
  io_uring: rename kiocb_end_write() local helper
  splice: Convert page_cache_pipe_buf_confirm() to use a folio
  libfs: Convert simple_write_begin and simple_write_end to use a folio
  fs/dcache: Replace printk and WARN_ON by WARN
  fs/pipe: remove redundant initialization of pointer buf
  fs: Fix kernel-doc warnings
  devpts: Fix kernel-doc warnings
  doc: idmappings: fix an error and rephrase a paragraph
  init: Add support for rootwait timeout parameter
  vfs: fix up the assert in i_readcount_dec
  fs: Fix one kernel-doc comment
  docs: filesystems: idmappings: clarify from where idmappings are taken
  fs/buffer.c: disable per-CPU buffer_head cache for isolated CPUs
  vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
  ...
2023-08-28 10:17:14 -07:00
Christoph Hellwig 925c86a19b fs: add CONFIG_BUFFER_HEAD
Add a new config option that controls building the buffer_head code, and
select it from all file systems and stacking drivers that need it.

For the block device nodes and alternative iomap based buffered I/O path
is provided when buffer_head support is not enabled, and iomap needs a
a small tweak to define the IOMAP_F_BUFFER_HEAD flag to 0 to not call
into the buffer_head code when it doesn't exist.

Otherwise this is just Kconfig and ifdef changes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20230801172201.1923299-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-02 09:13:09 -06:00
Jeff Layton ae834901ec reiserfs: convert to ctime accessor functions
In later patches, we're going to change how the inode's ctime field is
used. Switch to using accessor functions instead of raw accesses of
inode->i_ctime.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230705190309.579783-70-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-24 10:30:04 +02:00
Gustavo A. R. Silva 76d2ceda72 reiserfs: Replace one-element array with flexible-array member
One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in direntry_uarea structure, and refactor the rest of the code,
accordingly.

Worth mentioning is that before these changes, the original implementation
was returning two-too many bytes in function direntry_create_vi():

fs/reiserfs/item_ops.c:464:     int size = sizeof(struct direntry_uarea);
...
fs/reiserfs/item_ops.c-490-     size += (dir_u->entry_count * sizeof(short));
...
fs/reiserfs/item_ops.c-517-     return size;

Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/290
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2023-07-12 13:09:46 -06:00
Matthew Wilcox ba38980add reiserfs: Check the return value from __getblk()
__getblk() can return a NULL pointer if we run out of memory or if we
try to access beyond the end of the device; check it and handle it
appropriately.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/lkml/CAFcO6XOacq3hscbXevPQP7sXRoYFz34ZdKPYjmd6k5sZuhGFDw@mail.gmail.com/
Tested-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2") # probably introduced in 2002
Acked-by: Edward Shishkin <edward.shishkin@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-10 14:36:10 +02:00
Jeff Layton 298d35ef76 reiserfs: convert to simple_rename_timestamp
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230705190309.579783-12-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-07-10 10:08:17 +02:00
Linus Torvalds 6e17c6de3d - Yosry Ahmed brought back some cgroup v1 stats in OOM logs.
- Yosry has also eliminated cgroup's atomic rstat flushing.
 
 - Nhat Pham adds the new cachestat() syscall.  It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability.
 
 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning.
 
 - Lorenzo Stoakes has done some maintanance work on the get_user_pages()
   interface.
 
 - Liam Howlett continues with cleanups and maintenance work to the maple
   tree code.  Peng Zhang also does some work on maple tree.
 
 - Johannes Weiner has done some cleanup work on the compaction code.
 
 - David Hildenbrand has contributed additional selftests for
   get_user_pages().
 
 - Thomas Gleixner has contributed some maintenance and optimization work
   for the vmalloc code.
 
 - Baolin Wang has provided some compaction cleanups,
 
 - SeongJae Park continues maintenance work on the DAMON code.
 
 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting.
 
 - Christoph Hellwig has some cleanups for the filemap/directio code.
 
 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the provided
   APIs rather than open-coding accesses.
 
 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings.
 
 - John Hubbard has a series of fixes to the MM selftesting code.
 
 - ZhangPeng continues the folio conversion campaign.
 
 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock.
 
 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from
   128 to 8.
 
 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management.
 
 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code.
 
 - Vishal Moola also has done some folio conversion work.
 
 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJejewAKCRDdBJ7gKXxA
 joggAPwKMfT9lvDBEUnJagY7dbDPky1cSYZdJKxxM2cApGa42gEA6Cl8HRAWqSOh
 J0qXCzqaaN8+BuEyLGDVPaXur9KirwY=
 =B7yQ
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:

 - Yosry Ahmed brought back some cgroup v1 stats in OOM logs

 - Yosry has also eliminated cgroup's atomic rstat flushing

 - Nhat Pham adds the new cachestat() syscall. It provides userspace
   with the ability to query pagecache status - a similar concept to
   mincore() but more powerful and with improved usability

 - Mel Gorman provides more optimizations for compaction, reducing the
   prevalence of page rescanning

 - Lorenzo Stoakes has done some maintanance work on the
   get_user_pages() interface

 - Liam Howlett continues with cleanups and maintenance work to the
   maple tree code. Peng Zhang also does some work on maple tree

 - Johannes Weiner has done some cleanup work on the compaction code

 - David Hildenbrand has contributed additional selftests for
   get_user_pages()

 - Thomas Gleixner has contributed some maintenance and optimization
   work for the vmalloc code

 - Baolin Wang has provided some compaction cleanups,

 - SeongJae Park continues maintenance work on the DAMON code

 - Huang Ying has done some maintenance on the swap code's usage of
   device refcounting

 - Christoph Hellwig has some cleanups for the filemap/directio code

 - Ryan Roberts provides two patch series which yield some
   rationalization of the kernel's access to pte entries - use the
   provided APIs rather than open-coding accesses

 - Lorenzo Stoakes has some fixes to the interaction between pagecache
   and directio access to file mappings

 - John Hubbard has a series of fixes to the MM selftesting code

 - ZhangPeng continues the folio conversion campaign

 - Hugh Dickins has been working on the pagetable handling code, mainly
   with a view to reducing the load on the mmap_lock

 - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
   from 128 to 8

 - Domenico Cerasuolo has improved the zswap reclaim mechanism by
   reorganizing the LRU management

 - Matthew Wilcox provides some fixups to make gfs2 work better with the
   buffer_head code

 - Vishal Moola also has done some folio conversion work

 - Matthew Wilcox has removed the remnants of the pagevec code - their
   functionality is migrated over to struct folio_batch

* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
  mm/hugetlb: remove hugetlb_set_page_subpool()
  mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
  hugetlb: revert use of page_cache_next_miss()
  Revert "page cache: fix page_cache_next/prev_miss off by one"
  mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
  mm: memcg: rename and document global_reclaim()
  mm: kill [add|del]_page_to_lru_list()
  mm: compaction: convert to use a folio in isolate_migratepages_block()
  mm: zswap: fix double invalidate with exclusive loads
  mm: remove unnecessary pagevec includes
  mm: remove references to pagevec
  mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
  mm: remove struct pagevec
  net: convert sunrpc from pagevec to folio_batch
  i915: convert i915_gpu_error to use a folio_batch
  pagevec: rename fbatch_count()
  mm: remove check_move_unevictable_pages()
  drm: convert drm_gem_put_pages() to use a folio_batch
  i915: convert shmem_sg_free_table() to use a folio_batch
  scatterlist: add sg_set_folio()
  ...
2023-06-28 10:28:11 -07:00
Linus Torvalds 21953eb16c lsm/stable-6.5 PR 20230626
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmSZuh0UHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNpuxAAxChGqME9nE7iITx1TaFRrbK49mDF
 1RZh/5cwzde72lLLFkTFKB6ErMSQkrrtA+jFH7vKsrOslBel1+yO80vkXmhYCeZU
 P3m0FeREUpuU4QV0tbQamPeR+SWohmKi2dYWd+VdpLA+1aTK3KNYsi2NFkDIreap
 BqeRq4S0Rqc4u3/5juk6JCGFhTRWaH16YJQrzIKHF/K3DK+gMhAY5sjuAWzFc6ma
 /5bbD55kdVVDfnsxNSe+lzJ7zEf7TYedLG6BN+R9cVrU+El12a38M29kASaAof5w
 vpb92a27hA9Q5EyQ2O9QXnr2L5CShT4bvAZCGkK4cmZerGNTdM0iojhYj1s7FAV/
 USkWgkDmEuSatp0+DdXlfQyUmZZWlw1W0oiEfZwR8w7TY7q9CU7aD8K7+GDSIazB
 g89nYznVjlaC/oA4/owMraoWP3eiDiAcsQdO052Vv63TVyJtTiRiKyBq5EFLrX8L
 iaUCa4cBaYFc94kN1PZeNXZKwqRc2F6oAFT1YuXnFWBGmixN0kUL023C0xjl/J7P
 02jYYSVzLm22aU39GU0DSnaLfAwl3muazOB3XuyGOhUWHFYzjkc9UhmGp0W50DkK
 qigW3ONA8s8CKUS/q7QSGq+Vf+CVZA5f+daDDPGYstPfCTk61eu0wjwfwek3W0o+
 xKzBr2Od3vTOzAs=
 =3nWy
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20230626' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - A SafeSetID patch to correct what appears to be a cut-n-paste typo in
   the code causing a UID to be printed where a GID was desired.

   This is coming via the LSM tree because we haven't been able to get a
   response from the SafeSetID maintainer (Micah Morton) in several
   months. Hopefully we are able to get in touch with Micah, but until
   we do I'm going to pick them up in the LSM tree.

 - A small fix to the reiserfs LSM xattr code.

   We're continuing to work through some issues with the reiserfs code
   as we try to fixup the LSM xattr handling, but in the process we're
   uncovering some ugly problems in reiserfs and we may just end up
   removing the LSM xattr support in reiserfs prior to reiserfs'
   removal.

   For better or worse, this shouldn't impact any of the reiserfs users,
   as we discovered that LSM xattrs on reiserfs were completely broken,
   meaning no one is currently using the combo of reiserfs and a file
   labeling LSM.

 - A tweak to how the cap_user_data_t struct/typedef is declared in the
   header file to appease the Sparse gods.

 - In the process of trying to sort out the SafeSetID lost-maintainer
   problem I realized that I needed to update the labeled networking
   entry to "Supported".

 - Minor comment/documentation and spelling fixes.

* tag 'lsm-pr-20230626' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  device_cgroup: Fix kernel-doc warnings in device_cgroup
  SafeSetID: fix UID printed instead of GID
  MAINTAINERS: move labeled networking to "supported"
  capability: erase checker warnings about struct __user_cap_data_struct
  lsm: fix a number of misspellings
  reiserfs: Initialize sec->length in reiserfs_security_init().
  capability: fix kernel-doc warnings in capability.c
2023-06-27 17:24:26 -07:00
Yu Kuai c576c4bf9e reiserfs: fix blkdev_put() warning from release_journal_dev()
In journal_init_dev(), if super bdev is used as 'j_dev_bd', then
blkdev_get_by_dev() is called with NULL holder, otherwise, holder will
be journal. However, later in release_journal_dev(), blkdev_put() is
called with journal unconditionally, cause following warning:

WARNING: CPU: 1 PID: 5034 at block/bdev.c:617 bd_end_claim block/bdev.c:617 [inline]
WARNING: CPU: 1 PID: 5034 at block/bdev.c:617 blkdev_put+0x562/0x8a0 block/bdev.c:901
RIP: 0010:blkdev_put+0x562/0x8a0 block/bdev.c:901
Call Trace:
 <TASK>
 release_journal_dev fs/reiserfs/journal.c:2592 [inline]
 free_journal_ram+0x421/0x5c0 fs/reiserfs/journal.c:1896
 do_journal_release fs/reiserfs/journal.c:1960 [inline]
 journal_release+0x276/0x630 fs/reiserfs/journal.c:1971
 reiserfs_put_super+0xe4/0x5c0 fs/reiserfs/super.c:616
 generic_shutdown_super+0x158/0x480 fs/super.c:499
 kill_block_super+0x64/0xb0 fs/super.c:1422
 deactivate_locked_super+0x98/0x160 fs/super.c:330
 deactivate_super+0xb1/0xd0 fs/super.c:361
 cleanup_mnt+0x2ae/0x3d0 fs/namespace.c:1247
 task_work_run+0x16f/0x270 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xadc/0x2a30 kernel/exit.c:874
 do_group_exit+0xd4/0x2a0 kernel/exit.c:1024
 __do_sys_exit_group kernel/exit.c:1035 [inline]
 __se_sys_exit_group kernel/exit.c:1033 [inline]
 __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1033
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fix this problem by passing in NULL holder in this case.

Reported-by: syzbot+04625c80899f4555de39@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=04625c80899f4555de39
Fixes: 2736e8eeb0 ("block: use the holder as indication for exclusive opens")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230620111322.1014775-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-20 07:19:15 -06:00
Matthew Wilcox (Oracle) 4a9622f2fd buffer: convert page_zero_new_buffers() to folio_zero_new_buffers()
Most of the callers already have a folio; convert reiserfs_write_end() to
have a folio.  Removes a couple of hidden calls to compound_head().

Link: https://lkml.kernel.org/r/20230612210141.730128-10-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:19:31 -07:00
Matthew Wilcox (Oracle) 53418a18fc buffer: convert __block_write_full_page() to __block_write_full_folio()
Remove nine hidden calls to compound_head() by using a folio instead of a
page.

Link: https://lkml.kernel.org/r/20230612210141.730128-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-19 16:19:30 -07:00
Christoph Hellwig 05bdb99653 block: replace fmode_t with a block-specific type for block open flags
The only overlap between the block open flags mapped into the fmode_t and
other uses of fmode_t are FMODE_READ and FMODE_WRITE.  Define a new
blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and
->ioctl and stop abusing fmode_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jack Wang <jinpu.wang@ionos.com>		[rnbd]
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 08:04:05 -06:00
Christoph Hellwig 2736e8eeb0 block: use the holder as indication for exclusive opens
The current interface for exclusive opens is rather confusing as it
requires both the FMODE_EXCL flag and a holder.  Remove the need to pass
FMODE_EXCL and just key off the exclusive open off a non-NULL holder.

For blkdev_put this requires adding the holder argument, which provides
better debug checking that only the holder actually releases the hold,
but at the same time allows removing the now superfluous mode argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: David Sterba <dsterba@suse.com>		[btrfs]
Acked-by: Jack Wang <jinpu.wang@ionos.com>		[rnbd]
Link: https://lore.kernel.org/r/20230608110258.189493-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 08:04:04 -06:00
Christoph Hellwig 0718afd47f block: introduce holder ops
Add a new blk_holder_ops structure, which is passed to blkdev_get_by_* and
installed in the block_device for exclusive claims.  It will be used to
allow the block layer to call back into the user of the block device for
thing like notification of a removed device or a device resize.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Link: https://lore.kernel.org/r/20230601094459.1350643-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-05 10:53:04 -06:00
Tetsuo Handa d031f4e8b4 reiserfs: Initialize sec->length in reiserfs_security_init().
syzbot is reporting that sec->length is not initialized.

Since security_inode_init_security() returns 0 when initxattrs is provided
but call_int_hook(inode_init_security) returned -EOPNOTSUPP, control will
reach to "if (sec->length && ...) {" without initializing sec->length.

Reported-by: syzbot <syzbot+00a3779539a23cbee38c@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=00a3779539a23cbee38c
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 52ca4b6435 ("reiserfs: Switch to security_inode_init_security()")
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-25 17:44:44 -04:00
David Howells 2cb1e08985 splice: Use filemap_splice_read() instead of generic_file_splice_read()
Replace pointers to generic_file_splice_read() with calls to
filemap_splice_read().

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20230522135018.2742245-29-dhowells@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-24 08:42:17 -06:00
Linus Torvalds 94fc079266 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmRHq24ACgkQnJ2qBz9k
 QNkowggAuyZ313GrAUNDH2Cvu3PE804LpW6k27VxtIxmgtMSZvuOED0fC99Xt+3y
 SnQrJKfw4HVKIDT2sx3ECaiKa+v1UBMG+KmaZ5u5qLjxfjY8YHy6TbHMwY/NV2Aj
 jGgJYN+5E9AJqStzuu4C9fUzgOgvi4IzgNX4hp0WbESVk6jbDqW2KGNAv6IRM9ku
 qDg54aL8W50uDMgUJysqe180SlDgeYGPBg1OfA7xrzMaBpEK3z5+v52ZlpGNguDP
 MZMGwZgFLCP2R5Z9+zqhyK9UuIHUDNE05y20+Hsm1AxlI0iVVt+CahQ9T/6P9OBT
 QQ+vnepQR6ORtFBlPCVgyiTQgsO3Tw==
 =+MkS
 -----END PGP SIGNATURE-----

Merge tag 'fs_for_v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull ext2, reiserfs, udf, and quota updates from Jan Kara:
 "A couple of small fixes and cleanups for ext2, udf, reiserfs, and
  quota.

  The biggest change is making CONFIG_PRINT_QUOTA_WARNING depend on
  BROKEN with an outlook for removing it completely in an year or so"

* tag 'fs_for_v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: mark PRINT_QUOTA_WARNING as BROKEN
  quota: update Kconfig comment
  reiserfs: remove unused iter variable
  quota: Use register_sysctl_init() for registering fs_dqstats_table
  reiserfs: remove unused sched_count variable
  ext2: remove redundant assignment to pointer end
  quota: make dquot_set_dqinfo return errors from ->write_info
  quota: fixup *_write_file_info() to return proper error code
  quota: simplify two-level sysctl registration for fs_dqstats_table
  udf: use wrapper i_blocksize() in udf_discard_prealloc()
  udf: Use folios in udf_adinicb_writepage()
  ext2: Check block size validity during mount
  ext2: Correct maximum ext2 filesystem block size
2023-04-26 09:07:46 -07:00
Linus Torvalds 7bcff5a396 v6.4/vfs.acl
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZEEhwgAKCRCRxhvAZXjc
 otwgAQDXHnKiPm/d76lITXbxdUNCtvZz+ig26EbOrD+vEszzIQEA81dru0QbCNCt
 ctoZdcsmtKbt2VaYQF1CDOhlnNg5VQM=
 =pER1
 -----END PGP SIGNATURE-----

Merge tag 'v6.4/vfs.acl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull acl updates from Christian Brauner:
 "After finishing the introduction of the new posix acl api last cycle
  the generic POSIX ACL xattr handlers are still around in the
  filesystems xattr handlers for two reasons:

   (1) Because a few filesystems rely on the ->list() method of the
       generic POSIX ACL xattr handlers in their ->listxattr() inode
       operation.

   (2) POSIX ACLs are only available if IOP_XATTR is raised. The
       IOP_XATTR flag is raised in inode_init_always() based on whether
       the sb->s_xattr pointer is non-NULL. IOW, the registered xattr
       handlers of the filesystem are used to raise IOP_XATTR. Removing
       the generic POSIX ACL xattr handlers from all filesystems would
       risk regressing filesystems that only implement POSIX ACL support
       and no other xattrs (nfs3 comes to mind).

  This contains the work to decouple POSIX ACLs from the IOP_XATTR flag
  as they don't depend on xattr handlers anymore. So it's now possible
  to remove the generic POSIX ACL xattr handlers from the sb->s_xattr
  list of all filesystems. This is a crucial step as the generic POSIX
  ACL xattr handlers aren't used for POSIX ACLs anymore and POSIX ACLs
  don't depend on the xattr infrastructure anymore.

  Adressing problem (1) will require more long-term work. It would be
  best to get rid of the ->list() method of xattr handlers completely at
  some point.

  For erofs, ext{2,4}, f2fs, jffs2, ocfs2, and reiserfs the nop POSIX
  ACL xattr handler is kept around so they can continue to use
  array-based xattr handler indexing.

  This update does simplify the ->listxattr() implementation of all
  these filesystems however"

* tag 'v6.4/vfs.acl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  acl: don't depend on IOP_XATTR
  ovl: check for ->listxattr() support
  reiserfs: rework priv inode handling
  fs: rename generic posix acl handlers
  reiserfs: rework ->listxattr() implementation
  fs: simplify ->listxattr() implementation
  fs: drop unused posix acl handlers
  xattr: remove unused argument
  xattr: add listxattr helper
  xattr: simplify listxattr helpers
2023-04-24 13:35:23 -07:00
Roberto Sassu d82dcd9e21 reiserfs: Add security prefix to xattr name in reiserfs_security_write()
Reiserfs sets a security xattr at inode creation time in two stages: first,
it calls reiserfs_security_init() to obtain the xattr from active LSMs;
then, it calls reiserfs_security_write() to actually write that xattr.

Unfortunately, it seems there is a wrong expectation that LSMs provide the
full xattr name in the form 'security.<suffix>'. However, LSMs always
provided just the suffix, causing reiserfs to not write the xattr at all
(if the suffix is shorter than the prefix), or to write an xattr with the
wrong name.

Add a temporary buffer in reiserfs_security_write(), and write to it the
full xattr name, before passing it to reiserfs_xattr_set_handle().

Also replace the name length check with a check that the full xattr name is
not larger than XATTR_NAME_MAX.

Cc: stable@vger.kernel.org # v2.6.x
Fixes: 57fe60df62 ("reiserfs: add atomic addition of selinux attributes during inode creation")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-04-04 14:26:26 -04:00
Tom Rix 74b7d42300 reiserfs: remove unused iter variable
clang with W=1 reports
fs/reiserfs/stree.c:1265:6: error: variable
  'iter' set but not used [-Werror,-Wunused-but-set-variable]
        int iter = 0;
            ^
This variable is not used so remove it.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230331120325.1855111-1-trix@redhat.com>
2023-03-31 14:21:28 +02:00
Tom Rix 8ace883815 reiserfs: remove unused sched_count variable
clang with W=1 reports
fs/reiserfs/journal.c:3034:6: error: variable
  'sched_count' set but not used [-Werror,-Wunused-but-set-variable]
        int sched_count = 0;
            ^
This variable is not used so remove it.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230326204459.1358553-1-trix@redhat.com>
2023-03-29 11:43:15 +02:00
Roberto Sassu 52ca4b6435 reiserfs: Switch to security_inode_init_security()
In preparation for removing security_old_inode_init_security(), switch to
security_inode_init_security(). Commit 572302af12 ("reiserfs: Add missing
calls to reiserfs_security_free()") fixed possible memory leaks and another
issue related to adding an xattr at inode creation time.

Define the initxattrs callback reiserfs_initxattrs(), to populate the
name/value/len triple in the reiserfs_security_handle() with the first
xattr provided by LSMs. Make a copy of the xattr value, as
security_inode_init_security() frees it.

After the call to security_inode_init_security(), remove the check for
returning -EOPNOTSUPP, as security_inode_init_security() changes it to
zero.

Multiple xattrs are currently not supported, as the
reiserfs_security_handle structure is exported to user space. As a
consequence, even if EVM is invoked, it will not provide an xattr (if it
is not the first to set it, its xattr will be discarded; if it is the
first, it does not have xattrs to calculate the HMAC on).

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-23 19:32:02 -04:00
Christian Brauner d9f892b9bd
reiserfs: rework priv inode handling
Reiserfs is the only filesystem that removes IOP_XATTR without also
using a set of dedicated inode operations at the same time that nop all
xattr related inode operations. This means we need to have a IOP_XATTR
check in vfs_listxattr() instead of just being able to check for
->listxatt() being implemented.

Introduce a dedicated set of nop inode operations that are used when
IOP_XATTR is removed, allowing us to remove that check from
vfs_listxattr(). This in turn allows us to completely decouple POSIX ACLs from
IOP_XATTR.

Cc: reiserfs-devel@vger.kernel.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2023-03-06 09:57:13 +01:00
Christian Brauner 387b96a589
reiserfs: rework ->listxattr() implementation
Rework reiserfs so it doesn't have to rely on the dummy xattr handlers
in its s_xattr list anymore as this is completely unused for setting and
getting posix acls.

Cc: reiserfs-devel@vger.kernel.org
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2023-03-06 09:57:12 +01:00
Linus Torvalds 3822a7c409 - Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
   memfd creation time, with the option of sealing the state of the X bit.
 
 - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
   thread-safe for pmd unshare") which addresses a rare race condition
   related to PMD unsharing.
 
 - Several folioification patch serieses from Matthew Wilcox, Vishal
   Moola, Sidhartha Kumar and Lorenzo Stoakes
 
 - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which
   does perform some memcg maintenance and cleanup work.
 
 - SeongJae Park has added DAMOS filtering to DAMON, with the series
   "mm/damon/core: implement damos filter".  These filters provide users
   with finer-grained control over DAMOS's actions.  SeongJae has also done
   some DAMON cleanup work.
 
 - Kairui Song adds a series ("Clean up and fixes for swap").
 
 - Vernon Yang contributed the series "Clean up and refinement for maple
   tree".
 
 - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series.  It
   adds to MGLRU an LRU of memcgs, to improve the scalability of global
   reclaim.
 
 - David Hildenbrand has added some userfaultfd cleanup work in the
   series "mm: uffd-wp + change_protection() cleanups".
 
 - Christoph Hellwig has removed the generic_writepages() library
   function in the series "remove generic_writepages".
 
 - Baolin Wang has performed some maintenance on the compaction code in
   his series "Some small improvements for compaction".
 
 - Sidhartha Kumar is doing some maintenance work on struct page in his
   series "Get rid of tail page fields".
 
 - David Hildenbrand contributed some cleanup, bugfixing and
   generalization of pte management and of pte debugging in his series "mm:
   support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap
   PTEs".
 
 - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
   flag in the series "Discard __GFP_ATOMIC".
 
 - Sergey Senozhatsky has improved zsmalloc's memory utilization with his
   series "zsmalloc: make zspage chain size configurable".
 
 - Joey Gouly has added prctl() support for prohibiting the creation of
   writeable+executable mappings.  The previous BPF-based approach had
   shortcomings.  See "mm: In-kernel support for memory-deny-write-execute
   (MDWE)".
 
 - Waiman Long did some kmemleak cleanup and bugfixing in the series
   "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
 
 - T.J.  Alumbaugh has contributed some MGLRU cleanup work in his series
   "mm: multi-gen LRU: improve".
 
 - Jiaqi Yan has provided some enhancements to our memory error
   statistics reporting, mainly by presenting the statistics on a per-node
   basis.  See the series "Introduce per NUMA node memory error
   statistics".
 
 - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
   regression in compaction via his series "Fix excessive CPU usage during
   compaction".
 
 - Christoph Hellwig does some vmalloc maintenance work in the series
   "cleanup vfree and vunmap".
 
 - Christoph Hellwig has removed block_device_operations.rw_page() in ths
   series "remove ->rw_page".
 
 - We get some maple_tree improvements and cleanups in Liam Howlett's
   series "VMA tree type safety and remove __vma_adjust()".
 
 - Suren Baghdasaryan has done some work on the maintainability of our
   vm_flags handling in the series "introduce vm_flags modifier functions".
 
 - Some pagemap cleanup and generalization work in Mike Rapoport's series
   "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and
   "fixups for generic implementation of pfn_valid()"
 
 - Baoquan He has done some work to make /proc/vmallocinfo and
   /proc/kcore better represent the real state of things in his series
   "mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
 
 - Jason Gunthorpe rationalized the GUP system's interface to the rest of
   the kernel in the series "Simplify the external interface for GUP".
 
 - SeongJae Park wishes to migrate people from DAMON's debugfs interface
   over to its sysfs interface.  To support this, we'll temporarily be
   printing warnings when people use the debugfs interface.  See the series
   "mm/damon: deprecate DAMON debugfs interface".
 
 - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
   and clean-ups" series.
 
 - Huang Ying has provided a dramatic reduction in migration's TLB flush
   IPI rates with the series "migrate_pages(): batch TLB flushing".
 
 - Arnd Bergmann has some objtool fixups in "objtool warning fixes".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA
 jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K
 DmxHkn0LAitGgJRS/W9w81yrgig9tAQ=
 =MlGs
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Daniel Verkamp has contributed a memfd series ("mm/memfd: add
   F_SEAL_EXEC") which permits the setting of the memfd execute bit at
   memfd creation time, with the option of sealing the state of the X
   bit.

 - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
   thread-safe for pmd unshare") which addresses a rare race condition
   related to PMD unsharing.

 - Several folioification patch serieses from Matthew Wilcox, Vishal
   Moola, Sidhartha Kumar and Lorenzo Stoakes

 - Johannes Weiner has a series ("mm: push down lock_page_memcg()")
   which does perform some memcg maintenance and cleanup work.

 - SeongJae Park has added DAMOS filtering to DAMON, with the series
   "mm/damon/core: implement damos filter".

   These filters provide users with finer-grained control over DAMOS's
   actions. SeongJae has also done some DAMON cleanup work.

 - Kairui Song adds a series ("Clean up and fixes for swap").

 - Vernon Yang contributed the series "Clean up and refinement for maple
   tree".

 - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
   adds to MGLRU an LRU of memcgs, to improve the scalability of global
   reclaim.

 - David Hildenbrand has added some userfaultfd cleanup work in the
   series "mm: uffd-wp + change_protection() cleanups".

 - Christoph Hellwig has removed the generic_writepages() library
   function in the series "remove generic_writepages".

 - Baolin Wang has performed some maintenance on the compaction code in
   his series "Some small improvements for compaction".

 - Sidhartha Kumar is doing some maintenance work on struct page in his
   series "Get rid of tail page fields".

 - David Hildenbrand contributed some cleanup, bugfixing and
   generalization of pte management and of pte debugging in his series
   "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
   swap PTEs".

 - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
   flag in the series "Discard __GFP_ATOMIC".

 - Sergey Senozhatsky has improved zsmalloc's memory utilization with
   his series "zsmalloc: make zspage chain size configurable".

 - Joey Gouly has added prctl() support for prohibiting the creation of
   writeable+executable mappings.

   The previous BPF-based approach had shortcomings. See "mm: In-kernel
   support for memory-deny-write-execute (MDWE)".

 - Waiman Long did some kmemleak cleanup and bugfixing in the series
   "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".

 - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
   "mm: multi-gen LRU: improve".

 - Jiaqi Yan has provided some enhancements to our memory error
   statistics reporting, mainly by presenting the statistics on a
   per-node basis. See the series "Introduce per NUMA node memory error
   statistics".

 - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
   regression in compaction via his series "Fix excessive CPU usage
   during compaction".

 - Christoph Hellwig does some vmalloc maintenance work in the series
   "cleanup vfree and vunmap".

 - Christoph Hellwig has removed block_device_operations.rw_page() in
   ths series "remove ->rw_page".

 - We get some maple_tree improvements and cleanups in Liam Howlett's
   series "VMA tree type safety and remove __vma_adjust()".

 - Suren Baghdasaryan has done some work on the maintainability of our
   vm_flags handling in the series "introduce vm_flags modifier
   functions".

 - Some pagemap cleanup and generalization work in Mike Rapoport's
   series "mm, arch: add generic implementation of pfn_valid() for
   FLATMEM" and "fixups for generic implementation of pfn_valid()"

 - Baoquan He has done some work to make /proc/vmallocinfo and
   /proc/kcore better represent the real state of things in his series
   "mm/vmalloc.c: allow vread() to read out vm_map_ram areas".

 - Jason Gunthorpe rationalized the GUP system's interface to the rest
   of the kernel in the series "Simplify the external interface for
   GUP".

 - SeongJae Park wishes to migrate people from DAMON's debugfs interface
   over to its sysfs interface. To support this, we'll temporarily be
   printing warnings when people use the debugfs interface. See the
   series "mm/damon: deprecate DAMON debugfs interface".

 - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
   and clean-ups" series.

 - Huang Ying has provided a dramatic reduction in migration's TLB flush
   IPI rates with the series "migrate_pages(): batch TLB flushing".

 - Arnd Bergmann has some objtool fixups in "objtool warning fixes".

* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
  include/linux/migrate.h: remove unneeded externs
  mm/memory_hotplug: cleanup return value handing in do_migrate_range()
  mm/uffd: fix comment in handling pte markers
  mm: change to return bool for isolate_movable_page()
  mm: hugetlb: change to return bool for isolate_hugetlb()
  mm: change to return bool for isolate_lru_page()
  mm: change to return bool for folio_isolate_lru()
  objtool: add UACCESS exceptions for __tsan_volatile_read/write
  kmsan: disable ftrace in kmsan core code
  kasan: mark addr_has_metadata __always_inline
  mm: memcontrol: rename memcg_kmem_enabled()
  sh: initialize max_mapnr
  m68k/nommu: add missing definition of ARCH_PFN_OFFSET
  mm: percpu: fix incorrect size in pcpu_obj_full_size()
  maple_tree: reduce stack usage with gcc-9 and earlier
  mm: page_alloc: call panic() when memoryless node allocation fails
  mm: multi-gen LRU: avoid futile retries
  migrate_pages: move THP/hugetlb migration support check to simplify code
  migrate_pages: batch flushing TLB
  migrate_pages: share more code between _unmap and _move
  ...
2023-02-23 17:09:35 -08:00
Linus Torvalds 553637f73c for-6.3/dio-2023-02-16
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmPueAQQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgplopEACo17a4Z2p2xCedA0NCqX2ggtsSIdYiluPm
 pgdBzIEsgwKo1XVLGRgGiC8VdMRuzO4Zh/NGn4iRF1a68wjgjnwGWY7r052TUoSr
 q1yya739BpffnkXjj15x86cwl+5rHv2RQkm15+2HqBgcruA63/ZgdKBtjj+EtVKs
 zYOlmgyfFbkn8AdULMGiDKP4lixV8gUelv6vWneBwNrj4iSLnuN1+8nJNsl4wxwg
 ImSpx63AzhUoeL6byc+fmiA8fZhDhSvwON2tCyyCmOjlFM/TLrsm5t1juWiDid1O
 UROkQwQtsmjSUq3ow5fRJfjbZ3HLa1uGQr95DYHy0OBRAteAhDY5Upv0DXNL0ZBh
 uNNg8AXtJbyc+pLHWnncyiTzi+3eWs7WiMn04/a5eDhFvcJ0PZjLIgRi5j1ezUS1
 bWqoPaAIxoMD83WoMxjnKvBpGeMzPHvNTijeZjkGOu0vOk8JhXqNmLTjNG9aLtzf
 1Nvp0o8AqtQAW7cgFazZSWtw4bPk/wZ7mW0zHtqLDHIzXkc7A/Uo0ftdv84G08aW
 pvakNz4aNLwSPf7hxgPP9SgS9CeHhxK8PS6uk3V788SI8qGiew11+EcTNGkQNmvw
 /ItCo93UaWD/7SZLObTLslmet7rFHzz6PXaXrMxrSvaeZMkgr7DWEy9XS+ueOtXO
 fS8QhJX11w==
 =IU45
 -----END PGP SIGNATURE-----

Merge tag 'for-6.3/dio-2023-02-16' of git://git.kernel.dk/linux

Pull legacy dio update from Jens Axboe:
 "We only have a few file systems that use the old dio code, make them
  select it rather than build it unconditionally"

* tag 'for-6.3/dio-2023-02-16' of git://git.kernel.dk/linux:
  fs: build the legacy direct I/O code conditionally
  fs: move sb_init_dio_done_wq out of direct-io.c
2023-02-20 14:10:36 -08:00
Christoph Hellwig 9636e650e1 fs: build the legacy direct I/O code conditionally
Add a new LEGACY_DIRECT_IO config symbol that is only selected by the
file systems that still use the legacy blockdev_direct_IO code, so that
kernels without support for those file systems don't need to build the
code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20230125065839.191256-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-26 10:30:56 -07:00
Christian Brauner f861646a65
quota: port to mnt_idmap
Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2023-01-19 09:24:29 +01:00