Commit graph

2133 commits

Author SHA1 Message Date
Linus Torvalds
cc675d22e4 xen: branch for v6.1-rc6
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY3TN0wAKCRCAXGG7T9hj
 voAOAP4i3FjRj/ilXohox3F7iyPsRbFrGnayYcHRPeFF8UPz8QEAzyLP/FBGbmho
 sSuhcmb6r9foGKri7zyTKHIA4bkz4Qo=
 =/KaG
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Two trivial cleanups, and three simple fixes"

* tag 'for-linus-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/platform-pci: use define instead of literal number
  xen/platform-pci: add missing free_irq() in error path
  xen-pciback: Allow setting PCI_MSIX_FLAGS_MASKALL too
  xen/pcpu: fix possible memory leak in register_pcpu()
  x86/xen: Use kstrtobool() instead of strtobool()
2022-11-16 10:49:06 -08:00
Juergen Gross
4abb77fc55 xen/platform-pci: use define instead of literal number
Instead of "0x01" use the HVM_PARAM_CALLBACK_TYPE_PCI_INTX define from
the interface header in get_callback_via().

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-11-15 07:34:13 +01:00
ruanjinjie
c53717e1e3 xen/platform-pci: add missing free_irq() in error path
free_irq() is missing in case of error in platform_pci_probe(), fix that.

Signed-off-by: ruanjinjie <ruanjinjie@huawei.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Link: https://lore.kernel.org/r/20221114112124.1965611-1-ruanjinjie@huawei.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-11-14 13:29:14 +01:00
Marek Marczykowski-Górecki
5e29500eba xen-pciback: Allow setting PCI_MSIX_FLAGS_MASKALL too
When Xen domain configures MSI-X, the usual approach is to enable MSI-X
together with masking all of them via the config space, then fill the
table and only then clear PCI_MSIX_FLAGS_MASKALL. Allow doing this via
QEMU running in a stub domain.

Previously, when changing PCI_MSIX_FLAGS_MASKALL was not allowed, the
whole write was aborted, preventing change to the PCI_MSIX_FLAGS_ENABLE
bit too.

Note the Xen hypervisor intercepts this write anyway, and may keep the
PCI_MSIX_FLAGS_MASKALL bit set if it wishes to. It will store the
guest-requested state and will apply it eventually.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Link: https://lore.kernel.org/r/20221114103110.1519413-1-marmarek@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-11-14 13:29:10 +01:00
Yang Yingliang
da36a2a76b xen/pcpu: fix possible memory leak in register_pcpu()
In device_add(), dev_set_name() is called to allocate name, if it returns
error, the name need be freed. As comment of device_register() says, it
should use put_device() to give up the reference in the error path. So fix
this by calling put_device(), then the name can be freed in kobject_cleanup().

Fixes: f65c9bb3fb ("xen/pcpu: Xen physical cpus online/offline sys interface")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221110152441.401630-1-yangyingliang@huawei.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-11-14 13:29:07 +01:00
Dave Airlie
b837d3db9a drm-misc-next for 6.2:
UAPI Changes:
   - Documentation for page-flip flags
 
 Cross-subsystem Changes:
   - dma-buf: Add unlocked variant of vmapping and attachment-mapping
     functions
 
 Core Changes:
   - atomic-helpers: CRTC primary plane test fixes
   - connector: TV API consistency improvements, cmdline parsing
     improvements
   - crtc-helpers: Introduce drm_crtc_helper_atomic_check() helper
   - edid: Fixes for HFVSDB parsing,
   - fourcc: Addition of the Vivante tiled modifier
   - makefile: Sort and reorganize the objects files
   - mode_config: Remove fb_base from drm_mode_config_funcs
   - sched: Add a module parameter to change the scheduling policy,
     refcounting fix for fences
   - tests: Sort the Kunit tests in the Makefile, improvements to the
     DP-MST tests
   - ttm: Remove unnecessary drm_mm_clean() call
 
 Driver Changes:
   - New driver: ofdrm
   - Move all drivers to a common dma-buf locking convention
   - bridge:
     - adv7533: Remove dynamic lane switching
     - it6505: Runtime PM support
     - ps8640: Handle AUX defer messages
     - tc358775: Drop soft-reset over I2C
   - ast: Atomic Gamma LUT Support, Convert to SHMEM, various
     improvements
   - lcdif: Support for YUV planes
   - mgag200: Fix PLL Setup on some revisions
   - udl: Modesetting improvements, hot-unplug support
   - vc4: Fix support for PAL-M
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCY1D3fQAKCRDj7w1vZxhR
 xZZgAPoCSfyU3+M3ALT0vQ/OpktE5tuwUBMac2Lxkqgx4dnQ0gEAnYeez0Dedod8
 HNdgBH7FTklYa/zT8n17SwHUOzJJ5gc=
 =YvuW
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2022-10-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for 6.2:

UAPI Changes:
  - Documentation for page-flip flags

Cross-subsystem Changes:
  - dma-buf: Add unlocked variant of vmapping and attachment-mapping
    functions

Core Changes:
  - atomic-helpers: CRTC primary plane test fixes
  - connector: TV API consistency improvements, cmdline parsing
    improvements
  - crtc-helpers: Introduce drm_crtc_helper_atomic_check() helper
  - edid: Fixes for HFVSDB parsing,
  - fourcc: Addition of the Vivante tiled modifier
  - makefile: Sort and reorganize the objects files
  - mode_config: Remove fb_base from drm_mode_config_funcs
  - sched: Add a module parameter to change the scheduling policy,
    refcounting fix for fences
  - tests: Sort the Kunit tests in the Makefile, improvements to the
    DP-MST tests
  - ttm: Remove unnecessary drm_mm_clean() call

Driver Changes:
  - New driver: ofdrm
  - Move all drivers to a common dma-buf locking convention
  - bridge:
    - adv7533: Remove dynamic lane switching
    - it6505: Runtime PM support
    - ps8640: Handle AUX defer messages
    - tc358775: Drop soft-reset over I2C
  - ast: Atomic Gamma LUT Support, Convert to SHMEM, various
    improvements
  - lcdif: Support for YUV planes
  - mgag200: Fix PLL Setup on some revisions
  - udl: Modesetting improvements, hot-unplug support
  - vc4: Fix support for PAL-M

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020072405.g3o4hxuk75gmeumw@houat
2022-10-25 11:42:18 +10:00
Linus Torvalds
1d61754caa xen: branch for v6.1-rc2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY1JhRwAKCRCAXGG7T9hj
 vtcFAQDEkdKa5lWFo6Lut8rraHPyjoCVdLV1J2KJo0vO8rgU+AEA1kuPMqQzpVez
 1xTkZrKnPvIo+Dv515oxDCmzTxGoDgA=
 =1fmO
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.1-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Just two fixes for the new 'virtio with grants' feature"

* tag 'for-linus-6.1-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/virtio: Convert PAGE_SIZE/PAGE_SHIFT/PFN_UP to Xen counterparts
  xen/virtio: Handle cases when page offset > PAGE_SIZE properly
2022-10-21 14:43:09 -07:00
Maxime Ripard
a140a6a2d5
Merge drm/drm-next into drm-misc-next
Let's kick-off this release cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2022-10-18 15:00:03 +02:00
Dmitry Osipenko
e841ad86e7 xen/gntdev: Prepare to dynamic dma-buf locking specification
Prepare gntdev driver to the common dynamic dma-buf locking convention
by starting to use the unlocked versions of dma-buf API functions.

Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017172229.42269-13-dmitry.osipenko@collabora.com
2022-10-18 01:21:47 +03:00
Oleksandr Tyshchenko
a383dcb1cc xen/virtio: Convert PAGE_SIZE/PAGE_SHIFT/PFN_UP to Xen counterparts
Currently, a grant ref is always based on the Xen page granularity
(4KB), and guest commonly uses the same page granularity.
But the guest may use a different page granularity (i.e 64KB).
So adopt the code to be able to deal with it.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Xenia Ragiadakou <burzalodowa@gmail.com>
Link: https://lore.kernel.org/r/20221008151013.2537826-3-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-14 14:22:23 +02:00
Oleksandr Tyshchenko
875553e317 xen/virtio: Handle cases when page offset > PAGE_SIZE properly
Passed to xen_grant_dma_map_page() offset in the page
can be > PAGE_SIZE even if the guest uses the same page granularity
as Xen (4KB).

Before current patch, if such case happened we ended up providing
grants for the whole region in xen_grant_dma_map_page() which
was really unnecessary. The more, we ended up not releasing all
grants which represented that region in xen_grant_dma_unmap_page().

Current patch updates the code to be able to deal with such cases.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Xenia Ragiadakou <burzalodowa@gmail.com>
Link: https://lore.kernel.org/r/20221008151013.2537826-2-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-14 14:22:20 +02:00
Linus Torvalds
778ce723e9 xen: branch for v6.1-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY0ZjFAAKCRCAXGG7T9hj
 vjEsAP4rFMnqc6AXy4Mpvv8cxBtEuQZbwEqgBrMJUvK1jZQrBQD/dOJK2GBCVcfD
 2yaVlefFiJGTw5WUlbPeohUlTZ8pJwg=
 =xsHV
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - Some minor typo fixes

 - A fix of the Xen pcifront driver for supporting the device model to
   run in a Linux stub domain

 - A cleanup of the pcifront driver

 - A series to enable grant-based virtio with Xen on x86

 - A cleanup of Xen PV guests to distinguish between safe and faulting
   MSR accesses

 - Two fixes of the Xen gntdev driver

 - Two fixes of the new xen grant DMA driver

* tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Kconfig: Fix spelling mistake "Maxmium" -> "Maximum"
  xen/pv: support selecting safe/unsafe msr accesses
  xen/pv: refactor msr access functions to support safe and unsafe accesses
  xen/pv: fix vendor checks for pmu emulation
  xen/pv: add fault recovery control to pmu msr accesses
  xen/virtio: enable grant based virtio on x86
  xen/virtio: use dom0 as default backend for CONFIG_XEN_VIRTIO_FORCE_GRANT
  xen/virtio: restructure xen grant dma setup
  xen/pcifront: move xenstore config scanning into sub-function
  xen/gntdev: Accommodate VMA splitting
  xen/gntdev: Prevent leaking grants
  xen/virtio: Fix potential deadlock when accessing xen_grant_dma_devices
  xen/virtio: Fix n_pages calculation in xen_grant_dma_map(unmap)_page()
  xen/xenbus: Fix spelling mistake "hardward" -> "hardware"
  xen-pcifront: Handle missed Connected state
2022-10-12 14:39:38 -07:00
Colin Ian King
7880672bdc xen: Kconfig: Fix spelling mistake "Maxmium" -> "Maximum"
There is a spelling mistake in a Kconfig description. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20221007203500.2756787-1-colin.i.king@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-12 08:39:50 +02:00
Linus Torvalds
27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Juergen Gross
61367688f1 xen/virtio: enable grant based virtio on x86
Use an x86-specific virtio_check_mem_acc_cb() for Xen in order to setup
the correct DMA ops.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # common code
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-10 14:31:26 +02:00
Linus Torvalds
62e6e5940c SCSI misc on 20221007
Updates to the usual drivers (qla2xxx, lpfc, ufs, hisi_sas, mpi3mr,
 mpt3sas, target); the biggest change (from my biased viewpoint) being
 that the mpi3mr now attached to the SAS transport class, making it the
 first fusion type device to do so.  Beyond the usual bug fixing and
 security class reworks, there aren't a huge number of core changes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY0B74yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishW2NAP9CPp2R
 7NRmSyxcyVYvtCNUW3WxXh65Gn+KgArmg8XucgEAhUBX1fSjOzpERWEU+UaXitbE
 Rb+FbjxSc5YxR+nJ/Qc=
 =0Wlp
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (qla2xxx, lpfc, ufs, hisi_sas, mpi3mr,
  mpt3sas, target). The biggest change (from my biased viewpoint) being
  that the mpi3mr now attached to the SAS transport class, making it the
  first fusion type device to do so.

  Beyond the usual bug fixing and security class reworks, there aren't a
  huge number of core changes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (141 commits)
  scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()
  scsi: mpi3mr: Remove unnecessary cast
  scsi: stex: Properly zero out the passthrough command structure
  scsi: mpi3mr: Update driver version to 8.2.0.3.0
  scsi: mpi3mr: Fix scheduling while atomic type bug
  scsi: mpi3mr: Scan the devices during resume time
  scsi: mpi3mr: Free enclosure objects during driver unload
  scsi: mpi3mr: Handle 0xF003 Fault Code
  scsi: mpi3mr: Graceful handling of surprise removal of PCIe HBA
  scsi: mpi3mr: Schedule IRQ kthreads only on non-RT kernels
  scsi: mpi3mr: Support new power management framework
  scsi: mpi3mr: Update mpi3 header files
  scsi: mpt3sas: Revert "scsi: mpt3sas: Fix ioc->base_readl() use"
  scsi: mpt3sas: Revert "scsi: mpt3sas: Fix writel() use"
  scsi: wd33c93: Remove dead code related to the long-gone config WD33C93_PIO
  scsi: core: Add I/O timeout count for SCSI device
  scsi: qedf: Populate sysfs attributes for vport
  scsi: pm8001: Replace one-element array with flexible-array member
  scsi: 3w-xxxx: Replace one-element array with flexible-array member
  scsi: hptiop: Replace one-element array with flexible-array member in struct hpt_iop_request_ioctl_command()
  ...
2022-10-07 12:33:18 -07:00
Juergen Gross
7228113d1f xen/virtio: use dom0 as default backend for CONFIG_XEN_VIRTIO_FORCE_GRANT
With CONFIG_XEN_VIRTIO_FORCE_GRANT set the default backend domid to 0,
enabling to use xen_grant_dma_ops for those devices.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-07 16:39:03 +02:00
Juergen Gross
c9133112f3 xen/virtio: restructure xen grant dma setup
In order to prepare supporting other means than device tree for
setting up virtio devices under Xen, restructure the functions
xen_is_grant_dma_device() and xen_grant_setup_dma_ops() a little bit.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-07 16:39:01 +02:00
M. Vefa Bicakci
5c13a4a029 xen/gntdev: Accommodate VMA splitting
Prior to this commit, the gntdev driver code did not handle the
following scenario correctly with paravirtualized (PV) Xen domains:

* User process sets up a gntdev mapping composed of two grant mappings
  (i.e., two pages shared by another Xen domain).
* User process munmap()s one of the pages.
* User process munmap()s the remaining page.
* User process exits.

In the scenario above, the user process would cause the kernel to log
the following messages in dmesg for the first munmap(), and the second
munmap() call would result in similar log messages:

  BUG: Bad page map in process doublemap.test  pte:... pmd:...
  page:0000000057c97bff refcount:1 mapcount:-1 \
    mapping:0000000000000000 index:0x0 pfn:...
  ...
  page dumped because: bad pte
  ...
  file:gntdev fault:0x0 mmap:gntdev_mmap [xen_gntdev] readpage:0x0
  ...
  Call Trace:
   <TASK>
   dump_stack_lvl+0x46/0x5e
   print_bad_pte.cold+0x66/0xb6
   unmap_page_range+0x7e5/0xdc0
   unmap_vmas+0x78/0xf0
   unmap_region+0xa8/0x110
   __do_munmap+0x1ea/0x4e0
   __vm_munmap+0x75/0x120
   __x64_sys_munmap+0x28/0x40
   do_syscall_64+0x38/0x90
   entry_SYSCALL_64_after_hwframe+0x61/0xcb
   ...

For each munmap() call, the Xen hypervisor (if built with CONFIG_DEBUG)
would print out the following and trigger a general protection fault in
the affected Xen PV domain:

  (XEN) d0v... Attempt to implicitly unmap d0's grant PTE ...
  (XEN) d0v... Attempt to implicitly unmap d0's grant PTE ...

As of this writing, gntdev_grant_map structure's vma field (referred to
as map->vma below) is mainly used for checking the start and end
addresses of mappings. However, with split VMAs, these may change, and
there could be more than one VMA associated with a gntdev mapping.
Hence, remove the use of map->vma and rely on map->pages_vm_start for
the original start address and on (map->count << PAGE_SHIFT) for the
original mapping size. Let the invalidate() and find_special_page()
hooks use these.

Also, given that there can be multiple VMAs associated with a gntdev
mapping, move the "mmu_interval_notifier_remove(&map->notifier)" call to
the end of gntdev_put_map, so that the MMU notifier is only removed
after the closing of the last remaining VMA.

Finally, use an atomic to prevent inadvertent gntdev mapping re-use,
instead of using the map->live_grants atomic counter and/or the map->vma
pointer (the latter of which is now removed). This prevents the
userspace from mmap()'ing (with MAP_FIXED) a gntdev mapping over the
same address range as a previously set up gntdev mapping. This scenario
can be summarized with the following call-trace, which was valid prior
to this commit:

  mmap
    gntdev_mmap
  mmap (repeat mmap with MAP_FIXED over the same address range)
    gntdev_invalidate
      unmap_grant_pages (sets 'being_removed' entries to true)
        gnttab_unmap_refs_async
    unmap_single_vma
    gntdev_mmap (maps the shared pages again)
  munmap
    gntdev_invalidate
      unmap_grant_pages
        (no-op because 'being_removed' entries are true)
    unmap_single_vma (For PV domains, Xen reports that a granted page
      is being unmapped and triggers a general protection fault in the
      affected domain, if Xen was built with CONFIG_DEBUG)

The fix for this last scenario could be worth its own commit, but we
opted for a single commit, because removing the gntdev_grant_map
structure's vma field requires guarding the entry to gntdev_mmap(), and
the live_grants atomic counter is not sufficient on its own to prevent
the mmap() over a pre-existing mapping.

Link: https://github.com/QubesOS/qubes-issues/issues/7631
Fixes: ab31523c2f ("xen/gntdev: allow usermode to map granted pages")
Cc: stable@vger.kernel.org
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221002222006.2077-3-m.v.b@runbox.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-06 10:40:21 +02:00
M. Vefa Bicakci
0991028cd4 xen/gntdev: Prevent leaking grants
Prior to this commit, if a grant mapping operation failed partially,
some of the entries in the map_ops array would be invalid, whereas all
of the entries in the kmap_ops array would be valid. This in turn would
cause the following logic in gntdev_map_grant_pages to become invalid:

  for (i = 0; i < map->count; i++) {
    if (map->map_ops[i].status == GNTST_okay) {
      map->unmap_ops[i].handle = map->map_ops[i].handle;
      if (!use_ptemod)
        alloced++;
    }
    if (use_ptemod) {
      if (map->kmap_ops[i].status == GNTST_okay) {
        if (map->map_ops[i].status == GNTST_okay)
          alloced++;
        map->kunmap_ops[i].handle = map->kmap_ops[i].handle;
      }
    }
  }
  ...
  atomic_add(alloced, &map->live_grants);

Assume that use_ptemod is true (i.e., the domain mapping the granted
pages is a paravirtualized domain). In the code excerpt above, note that
the "alloced" variable is only incremented when both kmap_ops[i].status
and map_ops[i].status are set to GNTST_okay (i.e., both mapping
operations are successful).  However, as also noted above, there are
cases where a grant mapping operation fails partially, breaking the
assumption of the code excerpt above.

The aforementioned causes map->live_grants to be incorrectly set. In
some cases, all of the map_ops mappings fail, but all of the kmap_ops
mappings succeed, meaning that live_grants may remain zero. This in turn
makes it impossible to unmap the successfully grant-mapped pages pointed
to by kmap_ops, because unmap_grant_pages has the following snippet of
code at its beginning:

  if (atomic_read(&map->live_grants) == 0)
    return; /* Nothing to do */

In other cases where only some of the map_ops mappings fail but all
kmap_ops mappings succeed, live_grants is made positive, but when the
user requests unmapping the grant-mapped pages, __unmap_grant_pages_done
will then make map->live_grants negative, because the latter function
does not check if all of the pages that were requested to be unmapped
were actually unmapped, and the same function unconditionally subtracts
"data->count" (i.e., a value that can be greater than map->live_grants)
from map->live_grants. The side effects of a negative live_grants value
have not been studied.

The net effect of all of this is that grant references are leaked in one
of the above conditions. In Qubes OS v4.1 (which uses Xen's grant
mechanism extensively for X11 GUI isolation), this issue manifests
itself with warning messages like the following to be printed out by the
Linux kernel in the VM that had granted pages (that contain X11 GUI
window data) to dom0: "g.e. 0x1234 still pending", especially after the
user rapidly resizes GUI VM windows (causing some grant-mapping
operations to partially or completely fail, due to the fact that the VM
unshares some of the pages as part of the window resizing, making the
pages impossible to grant-map from dom0).

The fix for this issue involves counting all successful map_ops and
kmap_ops mappings separately, and then adding the sum to live_grants.
During unmapping, only the number of successfully unmapped grants is
subtracted from live_grants. The code is also modified to check for
negative live_grants values after the subtraction and warn the user.

Link: https://github.com/QubesOS/qubes-issues/issues/7631
Fixes: dbe97cff7d ("xen/gntdev: Avoid blocking in unmap_grant_pages()")
Cc: stable@vger.kernel.org
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Acked-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221002222006.2077-2-m.v.b@runbox.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-06 10:40:16 +02:00
Oleksandr Tyshchenko
77be00f194 xen/virtio: Fix potential deadlock when accessing xen_grant_dma_devices
As find_xen_grant_dma_data() is called from both interrupt and process
contexts, the access to xen_grant_dma_devices XArray must be protected
by xa_lock_irqsave to avoid deadlock scenario.
As XArray API doesn't provide xa_store_irqsave helper, call lockless
__xa_store directly and guard it externally.

Also move the storage of the XArray's entry to a separate helper.

Fixes: d6aca3504c ("xen/grant-dma-ops: Add option to restrict memory access under Xen")
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221005174823.1800761-3-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-06 08:30:13 +02:00
Oleksandr Tyshchenko
e433715b11 xen/virtio: Fix n_pages calculation in xen_grant_dma_map(unmap)_page()
Take page offset into the account when calculating the number of pages
to be granted.

Fixes: d6aca3504c ("xen/grant-dma-ops: Add option to restrict memory access under Xen")
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221005174823.1800761-2-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-06 08:30:10 +02:00
Colin Ian King
06c62f8cbb xen/xenbus: Fix spelling mistake "hardward" -> "hardware"
There is a spelling mistake in the module description. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20221004160639.154421-1-colin.i.king@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-10-06 07:35:14 +02:00
Liam R. Howlett
7ccf089b26 xen: use vma_lookup() in privcmd_ioctl_mmap()
vma_lookup() walks the VMA tree for a specific value, find_vma() will
search the tree after walking to a specific value.  It is more efficient
to only walk to the requested value since privcmd_ioctl_mmap() will exit
the loop if vm_start != msg->va.

Link: https://lkml.kernel.org/r/20220906194824.2110408-20-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:16 -07:00
Linus Torvalds
526e826285 xen: branch for v6.0-rc7
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYy1wIQAKCRCAXGG7T9hj
 vtbNAQCU6rDtPGB4I60tG9BU+8JKYLnUQ9oOY2OJOWWIO3M44AD+I7YHXyP0Y3Pj
 8L+aTXGX1mHht9jFXpChCYwaBceKGws=
 =B4T9
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.0-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fix from Juergen Gross:
 "A single fix for an issue in the xenbus driver (initialization of
  multi-page rings for Xen PV devices)"

* tag 'for-linus-6.0-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: fix xenbus_setup_ring()
2022-09-23 08:31:24 -07:00
Juergen Gross
ce6b8ccdef xen/xenbus: fix xenbus_setup_ring()
Commit 4573240f07 ("xen/xenbus: eliminate xenbus_grant_ring()")
introduced an error for initialization of multi-page rings.

Cc: stable@vger.kernel.org
Fixes: 4573240f07 ("xen/xenbus: eliminate xenbus_grant_ring()")
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-09-16 09:50:47 +02:00
Mike Christie
00903af94d scsi: xen: Drop use of internal host codes
The error codes:

 - DID_TARGET_FAILURE

 - DID_NEXUS_FAILURE

 - DID_ALLOC_FAILURE

 - DID_MEDIUM_ERROR

are internal to the SCSI layer. Drivers must not use them because:

 1. They are not propagated upwards, so SG IO/passthrough users will not
    see an error and think a command was successful.

    xen-scsiback will never see this error and should not try to send it.

 2. There is no handling for them in scsi_decide_disposition() so if
    xen-scsifront were to return the error to the SCSI midlayer then it
    kicks off the error handler which is definitely not what we want.

Remove the use from xen-scsifront/back.

Link: https://lore.kernel.org/r/20220812010027.8251-2-michael.christie@oracle.com
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-09-06 22:05:58 -04:00
Linus Torvalds
65eea2c060 xen: branch for v6.0-rc4
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYxNpQQAKCRCAXGG7T9hj
 vgO9AQChV9kknjgoL9qWJe25nNHyQ4zNfgtUNLefamYOAr93HgEAiFv7SIwMshFb
 jl4oceI76z2u6APkZlySR2JcJa+B+go=
 =BtBW
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - a minor fix for the Xen grant driver

 - a small series fixing a recently introduced problem in the Xen
   blkfront/blkback drivers with negotiation of feature usage

* tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/grants: prevent integer overflow in gnttab_dma_alloc_pages()
  xen-blkfront: Cache feature_persistent value before advertisement
  xen-blkfront: Advertise feature-persistent as user requested
  xen-blkback: Advertise feature-persistent as user requested
2022-09-03 13:23:11 -07:00
Dan Carpenter
e9ea0b30ad xen/grants: prevent integer overflow in gnttab_dma_alloc_pages()
The change from kcalloc() to kvmalloc() means that arg->nr_pages
might now be large enough that the "args->nr_pages << PAGE_SHIFT" can
result in an integer overflow.

Fixes: b3f7931f5c ("xen/gntdev: switch from kcalloc() to kvcalloc()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/YxDROJqu/RPvR0bi@kili
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-09-02 11:05:19 +02:00
Linus Torvalds
05519f2480 xen: branch for v6.0-rc3
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYwnVogAKCRCAXGG7T9hj
 vkvCAP97gXbhilg8GkmhSgwsn8XAnR3PFWonR+PA/NBQwsrx4AD/by18f7ICyOCl
 /r2PHDjYSiyMjwBSjrdm5Uo/kNBaqAM=
 =cIk8
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - two minor cleanups

 - a fix of the xen/privcmd driver avoiding a possible NULL dereference
   in an error case

* tag 'for-linus-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/privcmd: fix error exit of privcmd_ioctl_dm_op()
  xen: move from strlcpy with unused retval to strscpy
  xen: x86: remove setting the obsolete config XEN_MAX_DOMAIN_MEMORY
2022-08-27 15:38:00 -07:00
Juergen Gross
c5deb27895 xen/privcmd: fix error exit of privcmd_ioctl_dm_op()
The error exit of privcmd_ioctl_dm_op() is calling unlock_pages()
potentially with pages being NULL, leading to a NULL dereference.

Additionally lock_pages() doesn't check for pin_user_pages_fast()
having been completely successful, resulting in potentially not
locking all pages into memory. This could result in sporadic failures
when using the related memory in user mode.

Fix all of that by calling unlock_pages() always with the real number
of pinned pages, which will be zero in case pages being NULL, and by
checking the number of pages pinned by pin_user_pages_fast() matching
the expected number of pages.

Cc: <stable@vger.kernel.org>
Fixes: ab520be8cd ("xen/privcmd: Add IOCTL_PRIVCMD_DM_OP")
Reported-by: Rustam Subkhankulov <subkhankulov@ispras.ru>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Link: https://lore.kernel.org/r/20220825141918.3581-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-25 16:40:28 +02:00
Wolfram Sang
6bb79f5b4c xen: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Link: https://lore.kernel.org/r/20220818210122.7613-1-wsa+renesas@sang-engineering.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-25 13:38:07 +02:00
Linus Torvalds
5d6a0f4da9 xen: branch for v6.0-rc1b
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYvi0yQAKCRCAXGG7T9hj
 vmikAQDWSrcWuxDkGnzut0A1tBQRUCWDMyKPqigWAA5tH2sPgAEAtWfBvT1xyl7T
 gZ22I7o21WxxDGyvNUcA65pK7c2cpg8=
 =UMbq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull more xen updates from Juergen Gross:

 - fix the handling of the "persistent grants" feature negotiation
   between Xen blkfront and Xen blkback drivers

 - a cleanup of xen.config and adding xen.config to Xen section in
   MAINTAINERS

 - support HVMOP_set_evtchn_upcall_vector, which is more compliant to
   "normal" interrupt handling than the global callback used up to now

 - further small cleanups

* tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  MAINTAINERS: add xen config fragments to XEN HYPERVISOR sections
  xen: remove XEN_SCRUB_PAGES in xen.config
  xen/pciback: Fix comment typo
  xen/xenbus: fix return type in xenbus_file_read()
  xen-blkfront: Apply 'feature_persistent' parameter when connect
  xen-blkback: Apply 'feature_persistent' parameter when connect
  xen-blkback: fix persistent grants negotiation
  x86/xen: Add support for HVMOP_set_evtchn_upcall_vector
2022-08-14 09:28:54 -07:00
Jason Wang
ced3c74271 xen/pciback: Fix comment typo
The double `the' is duplicated in the comment, remove one.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220811120918.17961-1-wangborong@cdjrlc.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-12 12:19:50 +02:00
Dan Carpenter
32ad11127b xen/xenbus: fix return type in xenbus_file_read()
This code tries to store -EFAULT in an unsigned int.  The
xenbus_file_read() function returns type ssize_t so the negative value
is returned as a positive value to the user.

This change forces another change to the min() macro.  Originally, the
min() macro used "unsigned" type which checkpatch complains about.  Also
unsigned type would break if "len" were not capped at MAX_RW_COUNT.  Use
size_t for the min().  (No effect on runtime for the min_t() change).

Fixes: 2fb3683e7b ("xen: Add xenbus device driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Link: https://lore.kernel.org/r/YutxJUaUYRG/VLVc@kili
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-12 12:16:39 +02:00
Jane Malalane
b1c3497e60 x86/xen: Add support for HVMOP_set_evtchn_upcall_vector
Implement support for the HVMOP_set_evtchn_upcall_vector hypercall in
order to set the per-vCPU event channel vector callback on Linux and
use it in preference of HVM_PARAM_CALLBACK_IRQ.

If the per-VCPU vector setup is successful on BSP, use this method
for the APs. If not, fallback to the global vector-type callback.

Also register callback_irq at per-vCPU event channel setup to trick
toolstack to think the domain is enlightened.

Suggested-by: "Roger Pau Monné" <roger.pau@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220729070416.23306-1-jane.malalane@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-12 11:28:21 +02:00
Linus Torvalds
6614a3c316 - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
 
 - Some kmemleak fixes from Patrick Wang and Waiman Long
 
 - DAMON updates from SeongJae Park
 
 - memcg debug/visibility work from Roman Gushchin
 
 - vmalloc speedup from Uladzislau Rezki
 
 - more folio conversion work from Matthew Wilcox
 
 - enhancements for coherent device memory mapping from Alex Sierra
 
 - addition of shared pages tracking and CoW support for fsdax, from
   Shiyang Ruan
 
 - hugetlb optimizations from Mike Kravetz
 
 - Mel Gorman has contributed some pagealloc changes to improve latency
   and realtime behaviour.
 
 - mprotect soft-dirty checking has been improved by Peter Xu
 
 - Many other singleton patches all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuravgAKCRDdBJ7gKXxA
 jpqSAQDrXSdII+ht9kSHlaCVYjqRFQz/rRvURQrWQV74f6aeiAD+NHHeDPwZn11/
 SPktqEUrF1pxnGQxqLh1kUFUhsVZQgE=
 =w/UH
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Most of the MM queue. A few things are still pending.

  Liam's maple tree rework didn't make it. This has resulted in a few
  other minor patch series being held over for next time.

  Multi-gen LRU still isn't merged as we were waiting for mapletree to
  stabilize. The current plan is to merge MGLRU into -mm soon and to
  later reintroduce mapletree, with a view to hopefully getting both
  into 6.1-rc1.

  Summary:

   - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
     Lin, Yang Shi, Anshuman Khandual and Mike Rapoport

   - Some kmemleak fixes from Patrick Wang and Waiman Long

   - DAMON updates from SeongJae Park

   - memcg debug/visibility work from Roman Gushchin

   - vmalloc speedup from Uladzislau Rezki

   - more folio conversion work from Matthew Wilcox

   - enhancements for coherent device memory mapping from Alex Sierra

   - addition of shared pages tracking and CoW support for fsdax, from
     Shiyang Ruan

   - hugetlb optimizations from Mike Kravetz

   - Mel Gorman has contributed some pagealloc changes to improve
     latency and realtime behaviour.

   - mprotect soft-dirty checking has been improved by Peter Xu

   - Many other singleton patches all over the place"

 [ XFS merge from hell as per Darrick Wong in

   https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]

* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
  tools/testing/selftests/vm/hmm-tests.c: fix build
  mm: Kconfig: fix typo
  mm: memory-failure: convert to pr_fmt()
  mm: use is_zone_movable_page() helper
  hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
  hugetlbfs: cleanup some comments in inode.c
  hugetlbfs: remove unneeded header file
  hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
  hugetlbfs: use helper macro SZ_1{K,M}
  mm: cleanup is_highmem()
  mm/hmm: add a test for cross device private faults
  selftests: add soft-dirty into run_vmtests.sh
  selftests: soft-dirty: add test for mprotect
  mm/mprotect: fix soft-dirty check in can_change_pte_writable()
  mm: memcontrol: fix potential oom_lock recursion deadlock
  mm/gup.c: fix formatting in check_and_migrate_movable_page()
  xfs: fail dax mount if reflink is enabled on a partition
  mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
  userfaultfd: don't fail on unrecognized features
  hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
  ...
2022-08-05 16:32:45 -07:00
Linus Torvalds
7447691ef9 xen: branch for v6.0-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYuooOQAKCRCAXGG7T9hj
 vmmlAPoCfYBh4jKwRnvGvyn+sPQed/r0TH0wnsGK1ccONhyIvAD+IZcSTPsnp4Cj
 m1URGGff2PvAyjOIAzQZbKZomtfICwM=
 =z2e5
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - a series fine tuning virtio support for Xen guests, including removal
   the now again unused "platform_has()" feature.

 - a fix for host admin triggered reboot of Xen guests

 - a simple spelling fix

* tag 'for-linus-6.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: don't require virtio with grants for non-PV guests
  kernel: remove platform_has() infrastructure
  virtio: replace restricted mem access flag with callback
  xen: Fix spelling mistake
  xen/manage: Use orderly_reboot() to reboot
2022-08-04 15:10:55 -07:00
Linus Torvalds
9de1f9c8ca Updates for interrupt core and drivers:
core:
    - Fix a few inconsistencies between UP and SMP vs. interrupt affinities
    - Small updates and cleanups all over the place
 
  drivers:
    - New driver for the LoongArch interrupt controller
    - New driver for the Renesas RZ/G2L interrupt controller
 
    - Hotpath optimization for SiFive PLIC
    - Workaround for broken PLIC edge triggered interrupts
 
    - Simall cleanups and improvements as usual
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmLn5agTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoV2HD/4u0+09Fd8Awt1Knnb4CInmwFihZ/bu
 EiS1Air+MEJ/fyFb5sT/Dn8YdUWYA6a3ifpLMGBwrKCcb5pMwPEtI8uqjSmtgsN/
 2Z4o3N5v6EgM25CtrHNBrXK0E9Rz5Py49gm5p3K7+h4g63z9JwrM4G0Bvr8+krLS
 EV9IZU6kVmGC6gnG/MspkArsLk1rCM0PU0SJ2lEPsWd1fjhVSDfunvy/qnnzXRzz
 wjrcAf+a2Kgb1TMnpL6tx9d2Xx8KrKfODZTdOmPHrdv58F0EbJzapJnAVkYZDPtR
 LE2kQc2Qhdawx0kgNNNhvu9P6oZtpnK9N7KAhDQdw17sgrRygINjAMSEe2RykYL1
 lK+lJOIzfyd2JkEuC/8w1ZezL88S0EaTNawrkxjJ8L3fa7WDbwilCC+1w95QydCv
 sQB137OaLKgWetcRsht9PLWFb4ujkWzxoPf2cPPsm81EzCicNtBuNPLReBTcNrWJ
 X2VPpbaqRK8t8bnkXRqhahbq7f8c86feoICHfA4c7T4eZUp/Oq6T8aNvf6WPgjae
 I2/FO6kxZj3CQqFkhJGhiZRtGZdx6HLCsL84A+2Ktsra+D8+/qecZCnkHYtz0Vo6
 aFuGg+Wj+zuc2QfdaWwG8Dh5dijbxgHGHhzbh9znsWzytN9gfoBxuvBejf65i6sC
 In63mEkv35ttfA==
 =OnhF
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "Updates for interrupt core and drivers:

  Core:

   - Fix a few inconsistencies between UP and SMP vs interrupt
     affinities

   - Small updates and cleanups all over the place

  New drivers:

   - LoongArch interrupt controller

   - Renesas RZ/G2L interrupt controller

  Updates:

   - Hotpath optimization for SiFive PLIC

   - Workaround for broken PLIC edge triggered interrupts

   - Simall cleanups and improvements as usual"

* tag 'irq-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  irqchip/mmp: Declare init functions in common header file
  irqchip/mips-gic: Check the return value of ioremap() in gic_of_init()
  genirq: Use for_each_action_of_desc in actions_show()
  irqchip / ACPI: Introduce ACPI_IRQ_MODEL_LPIC for LoongArch
  irqchip: Add LoongArch CPU interrupt controller support
  irqchip: Add Loongson Extended I/O interrupt controller support
  irqchip/loongson-liointc: Add ACPI init support
  irqchip/loongson-pch-msi: Add ACPI init support
  irqchip/loongson-pch-pic: Add ACPI init support
  irqchip: Add Loongson PCH LPC controller support
  LoongArch: Prepare to support multiple pch-pic and pch-msi irqdomain
  LoongArch: Use ACPI_GENERIC_GSI for gsi handling
  genirq/generic_chip: Export irq_unmap_generic_chip
  ACPI: irq: Allow acpi_gsi_to_irq() to have an arch-specific fallback
  APCI: irq: Add support for multiple GSI domains
  LoongArch: Provisionally add ACPICA data structures
  irqdomain: Use hwirq_max instead of revmap_size for NOMAP domains
  irqdomain: Report irq number for NOMAP domains
  irqchip/gic-v3: Fix comment typo
  dt-bindings: interrupt-controller: renesas,rzg2l-irqc: Document RZ/V2L SoC
  ...
2022-08-01 12:48:15 -07:00
Juergen Gross
251e90e7e3 xen: don't require virtio with grants for non-PV guests
Commit fa1f57421e ("xen/virtio: Enable restricted memory access using
Xen grant mappings") introduced a new requirement for using virtio
devices: the backend now needs to support the VIRTIO_F_ACCESS_PLATFORM
feature.

This is an undue requirement for non-PV guests, as those can be operated
with existing backends without any problem, as long as those backends
are running in dom0.

Per default allow virtio devices without grant support for non-PV
guests.

On Arm require VIRTIO_F_ACCESS_PLATFORM for devices having been listed
in the device tree to use grants.

Add a new config item to always force use of grants for virtio.

Fixes: fa1f57421e ("xen/virtio: Enable restricted memory access using Xen grant mappings")
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 guest using Xen
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20220622063838.8854-4-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-01 07:43:03 +02:00
Zhang Jiaming
8441dac05e xen: Fix spelling mistake
Change 'maped' to 'mapped'.
Change 'unmaped' to 'unmapped'.

Signed-off-by: Zhang Jiaming <jiaming@nfschina.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220630075027.68833-1-jiaming@nfschina.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-01 07:42:42 +02:00
Ross Lagerwall
67f43c9c6a xen/manage: Use orderly_reboot() to reboot
Currently when the toolstack issues a reboot, it gets translated into a
call to ctrl_alt_del(). But tying reboot to ctrl-alt-del means rebooting
may fail if e.g. the user has masked the ctrl-alt-del.target under
systemd.

A previous attempt to fix this issue made a change that sets the
kernel.ctrl-alt-del sysctl to 1 before ctrl_alt_del() is called.
However, this doesn't give userspace the opportunity to block rebooting
or even do any cleanup or syncing.

Instead, call orderly_reboot() which will call the "reboot" command,
giving userspace the opportunity to block it or perform the usual reboot
process while being independent of the ctrl-alt-del behaviour. It also
matches what happens in the shutdown case.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220627142822.3612106-1-ross.lagerwall@citrix.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-01 07:42:22 +02:00
Demi Marie Obenour
166d386323 xen/gntdev: Ignore failure to unmap INVALID_GRANT_HANDLE
The error paths of gntdev_mmap() can call unmap_grant_pages() even
though not all of the pages have been successfully mapped.  This will
trigger the WARN_ON()s in __unmap_grant_pages_done().  The number of
warnings can be very large; I have observed thousands of lines of
warnings in the systemd journal.

Avoid this problem by only warning on unmapping failure if the handle
being unmapped is not INVALID_GRANT_HANDLE.  The handle field of any
page that was not successfully mapped will be INVALID_GRANT_HANDLE, so
this catches all cases where unmapping can legitimately fail.

Fixes: dbe97cff7d ("xen/gntdev: Avoid blocking in unmap_grant_pages()")
Cc: stable@vger.kernel.org
Suggested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220710230522.1563-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-07-11 11:39:45 +02:00
Samuel Holland
073352e951 genirq: Add and use an irq_data_update_affinity helper
Some architectures and irqchip drivers modify the cpumask returned by
irq_data_get_affinity_mask, usually by copying in to it. This is
problematic for uniprocessor configurations, where the affinity mask
should be constant, as it is known at compile time.

Add and use a setter for the affinity mask, following the pattern of
irq_data_update_effective_affinity. This allows the getter function to
return a const cpumask pointer.

Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Xen bits
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220701200056.46555-7-samuel@sholland.org
2022-07-07 09:38:04 +01:00
Roman Gushchin
e33c267ab7 mm: shrinkers: provide shrinkers with names
Currently shrinkers are anonymous objects.  For debugging purposes they
can be identified by count/scan function names, but it's not always
useful: e.g.  for superblock's shrinkers it's nice to have at least an
idea of to which superblock the shrinker belongs.

This commit adds names to shrinkers.  register_shrinker() and
prealloc_shrinker() functions are extended to take a format and arguments
to master a name.

In some cases it's not possible to determine a good name at the time when
a shrinker is allocated.  For such cases shrinker_debugfs_rename() is
provided.

The expected format is:
    <subsystem>-<shrinker_type>[:<instance>]-<id>
For some shrinkers an instance can be encoded as (MAJOR:MINOR) pair.

After this change the shrinker debugfs directory looks like:
  $ cd /sys/kernel/debug/shrinker/
  $ ls
    dquota-cache-16     sb-devpts-28     sb-proc-47       sb-tmpfs-42
    mm-shadow-18        sb-devtmpfs-5    sb-proc-48       sb-tmpfs-43
    mm-zspool:zram0-34  sb-hugetlbfs-17  sb-pstore-31     sb-tmpfs-44
    rcu-kfree-0         sb-hugetlbfs-33  sb-rootfs-2      sb-tmpfs-49
    sb-aio-20           sb-iomem-12      sb-securityfs-6  sb-tracefs-13
    sb-anon_inodefs-15  sb-mqueue-21     sb-selinuxfs-22  sb-xfs:vda1-36
    sb-bdev-3           sb-nsfs-4        sb-sockfs-8      sb-zsmalloc-19
    sb-bpf-32           sb-pipefs-14     sb-sysfs-26      thp-deferred_split-10
    sb-btrfs:vda2-24    sb-proc-25       sb-tmpfs-1       thp-zero-9
    sb-cgroup2-30       sb-proc-39       sb-tmpfs-27      xfs-buf:vda1-37
    sb-configfs-23      sb-proc-41       sb-tmpfs-29      xfs-inodegc:vda1-38
    sb-dax-11           sb-proc-45       sb-tmpfs-35
    sb-debugfs-7        sb-proc-46       sb-tmpfs-40

[roman.gushchin@linux.dev: fix build warnings]
  Link: https://lkml.kernel.org/r/Yr+ZTnLb9lJk6fJO@castle
  Reported-by: kernel test robot <lkp@intel.com>
Link: https://lkml.kernel.org/r/20220601032227.4076670-4-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-03 18:08:40 -07:00
Demi Marie Obenour
dbe97cff7d xen/gntdev: Avoid blocking in unmap_grant_pages()
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages.  I also
believe this is responsible for various deadlocks I have experienced in
the past.

Avoid these problems by making unmap_grant_pages async.  This requires
making it return void, as any errors will not be available when the
function returns.  Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected.  Additionally, a failed call will not prevent further calls
from being made, but this is harmless.

Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle.  Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile.  Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.

It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.

Fixes: 745282256c ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable@vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-23 15:29:18 +02:00
Julien Grall
ecb6237fa3 x86/xen: Remove undefined behavior in setup_features()
1 << 31 is undefined. So switch to 1U << 31.

Fixes: 5ead97c84f ("xen: Core Xen implementation")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220617103037.57828-1-julien@xen.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-21 16:36:11 +02:00
Masahiro Yamada
dbac14a5a0 xen: unexport __init-annotated xen_xlate_map_ballooned_pages()
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

modpost used to detect it, but it has been broken for a decade.

Recently, I fixed modpost so it started to warn it again, then this
showed up in linux-next builds.

There are two ways to fix it:

  - Remove __init
  - Remove EXPORT_SYMBOL

I chose the latter for this case because none of the in-tree call-sites
(arch/arm/xen/enlighten.c, arch/x86/xen/grant-table.c) is compiled as
modular.

Fixes: 243848fc01 ("xen/grant-table: Move xlated_setup_gnttab_pages to common place")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20220606045920.4161881-1-masahiroy@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-07 08:11:35 +02:00
Oleksandr Tyshchenko
625ab90ecd xen/grant-dma-ops: Retrieve the ID of backend's domain for DT devices
Use the presence of "iommus" property pointed to the IOMMU node with
recently introduced "xen,grant-dma" compatible as a clear indicator
of enabling Xen grant mappings scheme for that device and read the ID
of Xen domain where the corresponding backend is running. The domid
(domain ID) is used as an argument to the Xen grant mapping APIs.

To avoid the deferred probe timeout which takes place after reusing
generic IOMMU device tree bindings (because the IOMMU device never
becomes available) enable recently introduced stub IOMMU driver by
selecting XEN_GRANT_DMA_IOMMU.

Also introduce xen_is_grant_dma_device() to check whether xen-grant
DMA ops need to be set for a passed device.

Remove the hardcoded domid 0 in xen_grant_setup_dma_ops().

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1654197833-25362-8-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-06 16:07:30 +02:00
Oleksandr Tyshchenko
1ca55d50e5 xen/grant-dma-iommu: Introduce stub IOMMU driver
In order to reuse generic IOMMU device tree bindings by Xen grant
DMA-mapping layer we need to add this stub driver from a fw_devlink
perspective (grant-dma-ops cannot be converted into the proper
IOMMU driver).

Otherwise, just reusing IOMMU bindings (without having a corresponding
driver) leads to the deferred probe timeout afterwards, because
the IOMMU device never becomes available.

This stub driver does nothing except registering empty iommu_ops,
the upper layer "of_iommu" will treat this as NO_IOMMU condition
and won't return -EPROBE_DEFER.

As this driver is quite different from the most hardware IOMMU
implementations and only needed in Xen guests, place it in drivers/xen
directory. The subsequent commit will make use of it.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1654197833-25362-7-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-06 16:07:30 +02:00
Juergen Gross
fa1f57421e xen/virtio: Enable restricted memory access using Xen grant mappings
In order to support virtio in Xen guests add a config option XEN_VIRTIO
enabling the user to specify whether in all Xen guests virtio should
be able to access memory via Xen grant mappings only on the host side.

Also set PLATFORM_VIRTIO_RESTRICTED_MEM_ACCESS feature from the guest
initialization code on Arm and x86 if CONFIG_XEN_VIRTIO is enabled.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/1654197833-25362-5-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-06 16:07:30 +02:00
Juergen Gross
d6aca3504c xen/grant-dma-ops: Add option to restrict memory access under Xen
Introduce Xen grant DMA-mapping layer which contains special DMA-mapping
routines for providing grant references as DMA addresses to be used by
frontends (e.g. virtio) in Xen guests.

Add the needed functionality by providing a special set of DMA ops
handling the needed grant operations for the I/O pages.

The subsequent commit will introduce the use case for xen-grant DMA ops
layer to enable using virtio devices in Xen guests in a safe manner.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1654197833-25362-4-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-06 08:54:52 +02:00
Juergen Gross
02a9e681a3 xen/grants: support allocating consecutive grants
For support of virtio via grant mappings in rare cases larger mappings
using consecutive grants are needed. Support those by adding a bitmap
of free grants.

As consecutive grants will be needed only in very rare cases (e.g. when
configuring a virtio device with a multi-page ring), optimize for the
normal case of non-consecutive allocations.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/1654197833-25362-3-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-06 08:54:35 +02:00
Linus Torvalds
4ccbe91de9 xen: branch for v5.19-rc1b
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYprzPAAKCRCAXGG7T9hj
 vuTzAQC4GiDXcD/cfLVcEqdyw1diCWZjuOfuznUqy5ZUBAZjvAD/draFHTeO96+k
 qyZyzFggPIziaAOIUZ2DkJ/NqSAmbA8=
 =dl1E
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.19-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull more xen updates from Juergen Gross:
 "Two cleanup patches for Xen related code and (more important) an
  update of MAINTAINERS for Xen, as Boris Ostrovsky decided to step
  down"

* tag 'for-linus-5.19-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: replace xen_remap() with memremap()
  MAINTAINERS: Update Xen maintainership
  xen: switch gnttab_end_foreign_access() to take a struct page pointer
2022-06-04 13:42:53 -07:00
Juergen Gross
41925b105e xen: replace xen_remap() with memremap()
xen_remap() is used to establish mappings for frames not under direct
control of the kernel: for Xenstore and console ring pages, and for
grant pages of non-PV guests.

Today xen_remap() is defined to use ioremap() on x86 (doing uncached
mappings), and ioremap_cache() on Arm (doing cached mappings).

Uncached mappings for those use cases are bad for performance, so they
should be avoided if possible. As all use cases of xen_remap() don't
require uncached mappings (the mapped area is always physical RAM),
a mapping using the standard WB cache mode is fine.

As sparse is flagging some of the xen_remap() use cases to be not
appropriate for iomem(), as the result is not annotated with the
__iomem modifier, eliminate xen_remap() completely and replace all
use cases with memremap() specifying the MEMREMAP_WB caching mode.

xen_unmap() can be replaced with memunmap().

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20220530082634.6339-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-06-01 07:24:17 +02:00
Juergen Gross
49f8b459fc xen: switch gnttab_end_foreign_access() to take a struct page pointer
Instead of a virtual kernel address use a pointer of the associated
struct page as second parameter of gnttab_end_foreign_access().

Most users have that pointer available already and are creating the
virtual address from it, risking problems in case the memory is
located in highmem.

gnttab_end_foreign_access() itself won't need to get the struct page
from the address again.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-27 11:05:29 +02:00
Linus Torvalds
3f306ea2e1 dma-mapping updates for Linux 5.19
- don't over-decrypt memory (Robin Murphy)
  - takes min align mask into account for the swiotlb max mapping size
    (Tianyu Lan)
  - use GFP_ATOMIC in dma-debug (Mikulas Patocka)
  - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)
  - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)
  - cleanup swiotlb initialization and share more code with swiotlb-xen
    (me, Stefano Stabellini)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmKObTQLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYObmA//dIcDB/q4iFGD+WJh4MhM+asx0ZsdF2OJz42WEhgT
 Z9duOrgcneEQundCamqJP9rNTs980LHDA8uWQC5rZEc9vxuRVOdS7bSgYRUwWh6B
 r0ZjOsvQCn+ChoZML8uyk4rfmEINq+EvJuec3G5fgecZOhPuJS2i2uzzv5cHwqgP
 ChC0fwyZlkfdECXgvZXbEoCJLfTgGNlziN6Ai8dirSoqgEQUoCsY89/M7OiEBvV2
 R4XUWD7OvQERfB4t6xLuUHyzf9PAuWB+OiblRVNeAmK3lMjxVrc3k4kIowgklnzD
 8hfmphAa9Zou3zdfi6Gd4fiQRHRVOwKVp1rtqUmJ+lPSiwyMzu64z9ld2+2qac0h
 V4sSr/yJkhxnBT4/0MkTChvhnRobisackpUzNRpiM4ck7cNVb7eAvkISsbH+pWI9
 aEexPhbyskjlV+GOyM4QL4ygG0dpXY0HSyoh6uaSVsaXMycnWIsJCPidXxV1HGV0
 q2/RLHuHwYxia8cYCF01/DQvwOKSjwbU0zModxtRezGD5GYh2C0a+SrA1aX+qiTu
 yGJCs2UHtSQstAt78tTVp499YeDeL/oGSQkPAu8zyRkSczzF+CncGTuXyoJbAWyK
 otcgERWljgZ4scxjfu1uacfoVhKQ7nOu7hiJokL0U80FESAennLC3ZlocvB9h/ff
 HNA=
 =n2rk
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - don't over-decrypt memory (Robin Murphy)

 - takes min align mask into account for the swiotlb max mapping size
   (Tianyu Lan)

 - use GFP_ATOMIC in dma-debug (Mikulas Patocka)

 - fix DMA_ATTR_NO_KERNEL_MAPPING on xen/arm (me)

 - don't fail on highmem CMA pages in dma_direct_alloc_pages (me)

 - cleanup swiotlb initialization and share more code with swiotlb-xen
   (me, Stefano Stabellini)

* tag 'dma-mapping-5.19-2022-05-25' of git://git.infradead.org/users/hch/dma-mapping: (23 commits)
  dma-direct: don't over-decrypt memory
  swiotlb: max mapping size takes min align mask into account
  swiotlb: use the right nslabs-derived sizes in swiotlb_init_late
  swiotlb: use the right nslabs value in swiotlb_init_remap
  swiotlb: don't panic when the swiotlb buffer can't be allocated
  dma-debug: change allocation mode from GFP_NOWAIT to GFP_ATIOMIC
  dma-direct: don't fail on highmem CMA pages in dma_direct_alloc_pages
  swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
  x86: remove cruft from <asm/dma-mapping.h>
  swiotlb: remove swiotlb_init_with_tbl and swiotlb_init_late_with_tbl
  swiotlb: merge swiotlb-xen initialization into swiotlb
  swiotlb: provide swiotlb_init variants that remap the buffer
  swiotlb: pass a gfp_mask argument to swiotlb_init_late
  swiotlb: add a SWIOTLB_ANY flag to lift the low memory restriction
  swiotlb: make the swiotlb_init interface more useful
  x86: centralize setting SWIOTLB_FORCE when guest memory encryption is enabled
  x86: remove the IOMMU table infrastructure
  MIPS/octeon: use swiotlb_init instead of open coding it
  arm/xen: don't check for xen_initial_domain() in xen_create_contiguous_region
  swiotlb: rename swiotlb_late_init_with_default_size
  ...
2022-05-25 19:18:36 -07:00
Luca Miccio
5b3353949e xen: add support for initializing xenstore later as HVM domain
When running as dom0less guest (HVM domain on ARM) the xenstore event
channel is available at domain creation but the shared xenstore
interface page only becomes available later on.

In that case, wait for a notification on the xenstore event channel,
then complete the xenstore initialization later, when the shared page
is actually available.

The xenstore page has few extra field. Add them to the shared struct.
One of the field is "connection", when the connection is ready, it is
zero. If the connection is not-zero, wait for a notification.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220513211938.719341-2-sstabellini@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:44:08 +02:00
Juergen Gross
4573240f07 xen/xenbus: eliminate xenbus_grant_ring()
There is no external user of xenbus_grant_ring() left, so merge it into
the only caller xenbus_setup_ring().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:22:11 +02:00
Juergen Gross
7050096d07 xen/xenbus: add xenbus_setup_ring() service function
Most PV device frontends share very similar code for setting up shared
ring buffers:

- allocate page(s)
- init the ring admin data
- give the backend access to the ring via grants

Tearing down the ring requires similar actions in all frontends again:

- remove grants
- free the page(s)

Provide service functions xenbus_setup_ring() and xenbus_teardown_ring()
for that purpose.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:55 +02:00
Juergen Gross
888fd787f3 xen/shbuf: switch xen-front-pgdir-shbuf to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> # Arm64 only
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:51 +02:00
Juergen Gross
bd506c7812 xen/dmabuf: switch gntdev-dmabuf to use INVALID_GRANT_REF
Instead of using a private macro for an invalid grant reference use
the common one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:48 +02:00
Juergen Gross
8c9eb0e373 xen/grant-table: never put a reserved grant on the free list
Make sure a reserved grant is never put on the free list, as this could
cause hard to debug errors.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:37 +02:00
Juergen Gross
79c22318f8 xen: update grant_table.h
Update include/xen/interface/grant_table.h to its newest version.

This allows to drop some private definitions in grant-table.c and
include/xen/grant_table.h.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:21:34 +02:00
Juergen Gross
54aee68bb6 xen/scsiback: use new command result macros
Instead of using the kernel's values for the result of PV scsi
operations use the values of the interface definition.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220428075323.12853-3-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-05-19 14:02:43 +02:00
Christoph Hellwig
566fb90e05 swiotlb-xen: fix DMA_ATTR_NO_KERNEL_MAPPING on arm
swiotlb-xen uses very different ways to allocate coherent memory on x86
vs arm.  On the former it allocates memory from the page allocator, while
on the later it reuses the dma-direct allocator the handles the
complexities of non-coherent DMA on arm platforms.

Unfortunately the complexities of trying to deal with the two cases in
the swiotlb-xen.c code lead to a bug in the handling of
DMA_ATTR_NO_KERNEL_MAPPING on arm.  With the DMA_ATTR_NO_KERNEL_MAPPING
flag the coherent memory allocator does not actually allocate coherent
memory, but just a DMA handle for some memory that is DMA addressable
by the device, but which does not have to have a kernel mapping.  Thus
dereferencing the return value will lead to kernel crashed and memory
corruption.

Fix this by using the dma-direct allocator directly for arm, which works
perfectly fine because on arm swiotlb-xen is only used when the domain is
1:1 mapped, and then simplifying the remaining code to only cater for the
x86 case with DMA coherent device.

Reported-by: Rahul Singh <Rahul.Singh@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Rahul Singh <rahul.singh@arm.com>
2022-05-11 19:48:32 +02:00
Alaa Mohamed
f31076a6b2 xen: Convert kmap() to kmap_local_page()
kmap() is being deprecated and these usages are all local to the thread
so there is no reason kmap_local_page() can't be used.

Replace kmap() calls with kmap_local_page().

Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220419234328.10346-1-eng.alaamohamedsoliman.am@gmail.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-20 15:22:18 -05:00
Christoph Hellwig
3f70356edf swiotlb: merge swiotlb-xen initialization into swiotlb
Reuse the generic swiotlb initialization for xen-swiotlb.  For ARM/ARM64
this works trivially, while for x86 xen_swiotlb_fixup needs to be passed
as the remap argument to swiotlb_init_remap/swiotlb_init_late.

Note that the lower bound of the swiotlb size is changed to the smaller
IO_TLB_MIN_SLABS based value with this patch, but that is fine as the
2MB value used in Xen before was just an optimization and is not the
hard lower bound.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-18 07:21:13 +02:00
Christoph Hellwig
c6af2aa9ff swiotlb: make the swiotlb_init interface more useful
Pass a boolean flag to indicate if swiotlb needs to be enabled based on
the addressing needs, and replace the verbose argument with a set of
flags, including one to force enable bounce buffering.

Note that this patch removes the possibility to force xen-swiotlb use
with the swiotlb=force parameter on the command line on x86 (arm and
arm64 never supported that), but this interface will be restored shortly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-18 07:21:11 +02:00
Christoph Hellwig
a2daa27c0c swiotlb: simplify swiotlb_max_segment
Remove the bogus Xen override that was usually larger than the actual
size and just calculate the value on demand.  Note that
swiotlb_max_segment still doesn't make sense as an interface and should
eventually be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-18 07:21:09 +02:00
Juergen Gross
262fc47ac1 xen/balloon: don't use PV mode extra memory for zone device allocations
When running as a Xen PV guest use the extra memory (memory which isn't
allocated for the guest at boot time) only for ballooning purposes and
not for zone device allocations. This will remove some code without any
lack of functionality.

While at it move some code to get rid of another #ifdef.

Remove a comment which is stale since some time now.

Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220407093857.1485-1-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-07 15:08:37 -05:00
Jakub Kądziołka
ff32baa1f3 xen: don't hang when resuming PCI device
If a xen domain with at least two VCPUs has a PCI device attached which
enters the D3hot state during suspend, the kernel may hang while
resuming, depending on the core on which an async resume task gets
scheduled.

The bug occurs because xen's do_suspend calls dpm_resume_start while
only the timer of the boot CPU has been resumed (when xen_suspend called
syscore_resume), before calling xen_arch_suspend to resume the timers of
the other CPUs. This breaks pci_dev_d3_sleep.

Thus this patch moves the call to xen_arch_resume before the call to
dpm_resume_start, eliminating the hangs and restoring the stack-like
structure of the suspend/restore procedure.

Signed-off-by: Jakub Kądziołka <niedzejkob@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220323012103.2537-1-niedzejkob@invisiblethingslab.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-03-25 14:22:15 -05:00
Juergen Gross
c94b731da2 xen/grant-table: remove readonly parameter from functions
The gnttab_end_foreign_access() family of functions is taking a
"readonly" parameter, which isn't used. Remove it from the function
parameters.

Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220311103429.12845-3-jgross@suse.com
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-03-15 20:34:40 -05:00
Juergen Gross
b0f212633b xen/grant-table: remove gnttab_*transfer*() functions
All grant table operations related to the "transfer" functionality
are unused currently. There have been users in the old days of the
"Xen-o-Linux" kernel, but those didn't make it upstream.

So remove the "transfer" related functions.

Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220311103429.12845-2-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-03-15 20:34:04 -05:00
zhanglianjie
982e4430be drivers/xen: use helper macro __ATTR_RW
Use helper macro __ATTR_RW to define HYPERVISOR_ATTR_RW to make code more clear.
Minor readability improvement.

Remove extra whitespace [boris: added this comment]

Signed-off-by: zhanglianjie <zhanglianjie@uniontech.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20220305133823.158961-1-zhanglianjie@uniontech.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-03-10 09:27:55 -06:00
Wang Qing
b537bf429a xen: use time_is_before_eq_jiffies() instead of open coding it
Use the helper function time_is_{before,after}_jiffies() to improve
code readability.

Signed-off-by: Wang Qing <wangqing@vivo.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/1646018104-61415-1-git-send-email-wangqing@vivo.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-03-10 09:27:55 -06:00
Juergen Gross
42baefac63 xen/gnttab: fix gnttab_end_foreign_access() without page specified
gnttab_end_foreign_access() is used to free a grant reference and
optionally to free the associated page. In case the grant is still in
use by the other side processing is being deferred. This leads to a
problem in case no page to be freed is specified by the caller: the
caller doesn't know that the page is still mapped by the other side
and thus should not be used for other purposes.

The correct way to handle this situation is to take an additional
reference to the granted page in case handling is being deferred and
to drop that reference when the grant reference could be freed
finally.

This requires that there are no users of gnttab_end_foreign_access()
left directly repurposing the granted page after the call, as this
might result in clobbered data or information leaks via the not yet
freed grant reference.

This is part of CVE-2022-23041 / XSA-396.

Reported-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V4:
- expand comment in header
V5:
- get page ref in case of kmalloc() failure, too
2022-03-07 09:48:55 +01:00
Juergen Gross
b0576cc9c6 xen/pvcalls: use alloc/free_pages_exact()
Instead of __get_free_pages() and free_pages() use alloc_pages_exact()
and free_pages_exact(). This is in preparation of a change of
gnttab_end_foreign_access() which will prohibit use of high-order
pages.

This is part of CVE-2022-23041 / XSA-396.

Reported-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V4:
- new patch
2022-03-07 09:48:55 +01:00
Juergen Gross
1dbd11ca75 xen: remove gnttab_query_foreign_access()
Remove gnttab_query_foreign_access(), as it is unused and unsafe to
use.

All previous use cases assumed a grant would not be in use after
gnttab_query_foreign_access() returned 0. This information is useless
in best case, as it only refers to a situation in the past, which could
have changed already.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2022-03-07 09:48:54 +01:00
Juergen Gross
d3b6372c58 xen/gntalloc: don't use gnttab_query_foreign_access()
Using gnttab_query_foreign_access() is unsafe, as it is racy by design.

The use case in the gntalloc driver is not needed at all. While at it
replace the call of gnttab_end_foreign_access_ref() with a call of
gnttab_end_foreign_access(), which is what is really wanted there. In
case the grant wasn't used due to an allocation failure, just free the
grant via gnttab_free_grant_reference().

This is CVE-2022-23039 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V3:
- fix __del_gref() (Jan Beulich)
2022-03-07 09:48:54 +01:00
Juergen Gross
6b1775f26a xen/grant-table: add gnttab_try_end_foreign_access()
Add a new grant table function gnttab_try_end_foreign_access(), which
will remove and free a grant if it is not in use.

Its main use case is to either free a grant if it is no longer in use,
or to take some other action if it is still in use. This other action
can be an error exit, or (e.g. in the case of blkfront persistent grant
feature) some special handling.

This is CVE-2022-23036, CVE-2022-23038 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- new patch
V4:
- add comments to header (Jan Beulich)
2022-03-07 09:48:54 +01:00
Juergen Gross
3777ea7bac xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
Letting xenbus_grant_ring() tear down grants in the error case is
problematic, as the other side could already have used these grants.
Calling gnttab_end_foreign_access_ref() without checking success is
resulting in an unclear situation for any caller of xenbus_grant_ring()
as in the error case the memory pages of the ring page might be
partially mapped. Freeing them would risk unwanted foreign access to
them, while not freeing them would leak memory.

In order to remove the need to undo any gnttab_grant_foreign_access()
calls, use gnttab_alloc_grant_references() to make sure no further
error can occur in the loop granting access to the ring pages.

It should be noted that this way of handling removes leaking of
grant entries in the error case, too.

This is CVE-2022-23040 / part of XSA-396.

Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2022-03-07 09:48:54 +01:00
Cai Huoqing
f66edf684e xen/pci: Make use of the helper macro LIST_HEAD()
Replace "struct list_head head = LIST_HEAD_INIT(head)" with
"LIST_HEAD(head)" to simplify the code.

Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220209032842.38818-1-cai.huoqing@linux.dev
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-02-10 11:10:23 +01:00
Oleksandr Tyshchenko
b2371587fe arm/xen: Read extended regions from DT and init Xen resource
This patch implements arch_xen_unpopulated_init() on Arm where
the extended regions (if any) are gathered from DT and inserted
into specific Xen resource to be used as unused address space
for Xen scratch pages by unpopulated-alloc code.

The extended region (safe range) is a region of guest physical
address space which is unused and could be safely used to create
grant/foreign mappings instead of wasting real RAM pages from
the domain memory for establishing these mappings.

The extended regions are chosen by the hypervisor at the domain
creation time and advertised to it via "reg" property under
hypervisor node in the guest device-tree. As region 0 is reserved
for grant table space (always present), the indexes for extended
regions are 1...N.

If arch_xen_unpopulated_init() fails for some reason the default
behaviour will be restored (allocate xenballooned pages).

This patch also removes XEN_UNPOPULATED_ALLOC dependency on x86.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1639080336-26573-6-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-01-06 09:53:41 +01:00
Oleksandr Tyshchenko
d1a928eac7 xen/unpopulated-alloc: Add mechanism to use Xen resource
The main reason of this change is that unpopulated-alloc
code cannot be used in its current form on Arm, but there
is a desire to reuse it to avoid wasting real RAM pages
for the grant/foreign mappings.

The problem is that system "iomem_resource" is used for
the address space allocation, but the really unallocated
space can't be figured out precisely by the domain on Arm
without hypervisor involvement. For example, not all device
I/O regions are known by the time domain starts creating
grant/foreign mappings. And following the advise from
"iomem_resource" we might end up reusing these regions by
a mistake. So, the hypervisor which maintains the P2M for
the domain is in the best position to provide unused regions
of guest physical address space which could be safely used
to create grant/foreign mappings.

Introduce new helper arch_xen_unpopulated_init() which purpose
is to create specific Xen resource based on the memory regions
provided by the hypervisor to be used as unused space for Xen
scratch pages. If arch doesn't define arch_xen_unpopulated_init()
the default "iomem_resource" will be used.

Update the arguments list of allocate_resource() in fill_list()
to always allocate a region from the hotpluggable range
(maximum possible addressable physical memory range for which
the linear mapping could be created). If arch doesn't define
arch_get_mappable_range() the default range (0,-1) will be used.

The behaviour on x86 won't be changed by current patch as both
arch_xen_unpopulated_init() and arch_get_mappable_range()
are not implemented for it.

Also fallback to allocate xenballooned pages (balloon out RAM
pages) if we do not have any suitable resource to work with
(target_resource is invalid) and as the result we won't be able
to provide unpopulated pages on a request.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1639080336-26573-5-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-01-06 09:53:38 +01:00
Oleksandr Tyshchenko
9dd060afe2 xen/balloon: Bring alloc(free)_xenballooned_pages helpers back
This patch rolls back some of the changes introduced by commit
121f2faca2 "xen/balloon: rename alloc/free_xenballooned_pages"
in order to make possible to still allocate xenballooned pages
if CONFIG_XEN_UNPOPULATED_ALLOC is enabled.

On Arm the unpopulated pages will be allocated on top of extended
regions provided by Xen via device-tree (the subsequent patches
will add required bits to support unpopulated-alloc feature on Arm).
The problem is that extended regions feature has been introduced
into Xen quite recently (during 4.16 release cycle). So this
effectively means that Linux must only use unpopulated-alloc on Arm
if it is running on "new Xen" which advertises these regions.
But, it will only be known after parsing the "hypervisor" node
at boot time, so before doing that we cannot assume anything.

In order to keep working if CONFIG_XEN_UNPOPULATED_ALLOC is enabled
and the extended regions are not advertised (Linux is running on
"old Xen", etc) we need the fallback to alloc_xenballooned_pages().

This way we wouldn't reduce the amount of memory usable (wasting
RAM pages) for any of the external mappings anymore (and eliminate
XSA-300) with "new Xen", but would be still functional ballooning
out RAM pages with "old Xen".

Also rename alloc(free)_xenballooned_pages to xen_alloc(free)_ballooned_pages
and make xen_alloc(free)_unpopulated_pages static inline in xen.h
if CONFIG_XEN_UNPOPULATED_ALLOC is disabled.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1639080336-26573-4-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-01-06 09:53:35 +01:00
Oleksandr Tyshchenko
fbf3a5c301 xen/unpopulated-alloc: Drop check for virt_addr_valid() in fill_list()
If memremap_pages() succeeds the range is guaranteed to have proper page
table, there is no need for an additional virt_addr_valid() check.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/1639080336-26573-2-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-01-06 09:53:15 +01:00
Oleksandr Andrushchenko
ce2f46f353 xen/gntdev: fix unmap notification order
While working with Xen's libxenvchan library I have faced an issue with
unmap notifications sent in wrong order if both UNMAP_NOTIFY_SEND_EVENT
and UNMAP_NOTIFY_CLEAR_BYTE were requested: first we send an event channel
notification and then clear the notification byte which renders in the below
inconsistency (cli_live is the byte which was requested to be cleared on unmap):

[  444.514243] gntdev_put_map UNMAP_NOTIFY_SEND_EVENT map->notify.event 6
libxenvchan_is_open cli_live 1
[  444.515239] __unmap_grant_pages UNMAP_NOTIFY_CLEAR_BYTE at 14

Thus it is not possible to reliably implement the checks like
- wait for the notification (UNMAP_NOTIFY_SEND_EVENT)
- check the variable (UNMAP_NOTIFY_CLEAR_BYTE)
because it is possible that the variable gets checked before it is cleared
by the kernel.

To fix that we need to re-order the notifications, so the variable is first
gets cleared and then the event channel notification is sent.
With this fix I can see the correct order of execution:

[   54.522611] __unmap_grant_pages UNMAP_NOTIFY_CLEAR_BYTE at 14
[   54.537966] gntdev_put_map UNMAP_NOTIFY_SEND_EVENT map->notify.event 6
libxenvchan_is_open cli_live 0

Cc: stable@vger.kernel.org
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211210092817.580718-1-andr2000@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-01-06 08:52:22 +01:00
Juergen Gross
fe415186b4 xen/console: harden hvc_xen against event channel storms
The Xen console driver is still vulnerable for an attack via excessive
number of events sent by the backend. Fix that by using a lateeoi event
channel.

For the normal domU initial console this requires the introduction of
bind_evtchn_to_irq_lateeoi() as there is no xenbus device available
at the time the event channel is bound to the irq.

As the decision whether an interrupt was spurious or not requires to
test for bytes having been read from the backend, move sending the
event into the if statement, as sending an event without having found
any bytes to be read is making no sense at all.

This is part of XSA-391

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V2:
- slightly adapt spurious irq detection (Jan Beulich)
V3:
- fix spurious irq detection (Jan Beulich)
2021-12-16 08:24:08 +01:00
Stefano Stabellini
36e8f60f08 xen: detect uninitialized xenbus in xenbus_init
If the xenstore page hasn't been allocated properly, reading the value
of the related hvm_param (HVM_PARAM_STORE_PFN) won't actually return
error. Instead, it will succeed and return zero. Instead of attempting
to xen_remap a bad guest physical address, detect this condition and
return early.

Note that although a guest physical address of zero for
HVM_PARAM_STORE_PFN is theoretically possible, it is not a good choice
and zero has never been validly used in that capacity.

Also recognize all bits set as an invalid value.

For 32-bit Linux, any pfn above ULONG_MAX would get truncated. Pfns
above ULONG_MAX should never be passed by the Xen tools to HVM guests
anyway, so check for this condition and return early.

Cc: stable@vger.kernel.org
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Link: https://lore.kernel.org/r/20211123210748.1910236-1-sstabellini@kernel.org
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-24 08:55:15 -06:00
Juergen Gross
03e143b2ac xen: flag pvcalls-front to be not essential for system boot
The Xen pvcalls device is not essential for booting. Set the respective
flag.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211022064800.14978-5-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-23 13:42:20 -06:00
Juergen Gross
37a72b08a3 xen: add "not_essential" flag to struct xenbus_driver
When booting the xenbus driver will wait for PV devices to have
connected to their backends before continuing. The timeout is different
between essential and non-essential devices.

Non-essential devices are identified by their nodenames directly in the
xenbus driver, which requires to update this list in case a new device
type being non-essential is added (this was missed for several types
in the past).

In order to avoid this problem, add a "not_essential" flag to struct
xenbus_driver which can be set to "true" by the respective frontend.

Set this flag for the frontends currently regarded to be not essential
(vkbs and vfb) and use it for testing in the xenbus driver.

Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211022064800.14978-2-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-23 13:41:29 -06:00
Stefano Stabellini
08f6c2b09e xen: don't continue xenstore initialization in case of errors
In case of errors in xenbus_init (e.g. missing xen_store_gfn parameter),
we goto out_error but we forget to reset xen_store_domain_type to
XS_UNKNOWN. As a consequence xenbus_probe_initcall and other initcalls
will still try to initialize xenstore resulting into a crash at boot.

[    2.479830] Call trace:
[    2.482314]  xb_init_comms+0x18/0x150
[    2.486354]  xs_init+0x34/0x138
[    2.489786]  xenbus_probe+0x4c/0x70
[    2.498432]  xenbus_probe_initcall+0x2c/0x7c
[    2.503944]  do_one_initcall+0x54/0x1b8
[    2.507358]  kernel_init_freeable+0x1ac/0x210
[    2.511617]  kernel_init+0x28/0x130
[    2.516112]  ret_from_fork+0x10/0x20

Cc: <Stable@vger.kernel.org>
Cc: jbeulich@suse.com
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Link: https://lore.kernel.org/r/20211115222719.2558207-1-sstabellini@kernel.org
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-16 16:15:42 -06:00
Juergen Gross
897919ad8b xen/privcmd: make option visible in Kconfig
This configuration option provides a misc device as an API to userspace.
Make this API usable without having to select the module as a transitive
dependency.

This also fixes an issue where localyesconfig would select
CONFIG_XEN_PRIVCMD=m because it was not visible and defaulted to
building as module.

[boris: clarified help message per Jan's suggestion]

Based-on-patch-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211116143323.18866-1-jgross@suse.com
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-16 16:13:31 -06:00
Linus Torvalds
bf98ecbbae xen: branch for v5.16-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYYp8HgAKCRCAXGG7T9hj
 vmuVAP4whjbyIi4IxYEOnE6On0aD0AgUMiFa7QXrDZi6NXUQIwEAnggLFe+rEG5C
 Fwi/cEXSHrRgveqrgD4GYEr6l0GTxwM=
 =/fMa
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.16b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - a series to speed up the boot of Xen PV guests

 - some cleanups in Xen related code

 - replacement of license texts with the appropriate SPDX headers and
   fixing of wrong SPDX headers in Xen header files

 - a small series making paravirtualized interrupt masking much simpler
   and at the same time removing complaints of objtool

 - a fix for Xen ballooning hogging workqueues for too long

 - enablement of the Xen pciback driver for Arm

 - some further small fixes/enhancements

* tag 'for-linus-5.16b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (22 commits)
  xen/balloon: fix unused-variable warning
  xen/balloon: rename alloc/free_xenballooned_pages
  xen/balloon: add late_initcall_sync() for initial ballooning done
  x86/xen: remove 32-bit awareness from startup_xen
  xen: remove highmem remnants
  xen: allow pv-only hypercalls only with CONFIG_XEN_PV
  x86/xen: remove 32-bit pv leftovers
  xen-pciback: allow compiling on other archs than x86
  x86/xen: switch initial pvops IRQ functions to dummy ones
  x86/xen: remove xen_have_vcpu_info_placement flag
  x86/pvh: add prototype for xen_pvh_init()
  xen: Fix implicit type conversion
  xen: fix wrong SPDX headers of Xen related headers
  xen/pvcalls-back: Remove redundant 'flush_workqueue()' calls
  x86/xen: Remove redundant irq_enter/exit() invocations
  xen-pciback: Fix return in pm_ctrl_init()
  xen/x86: restrict PV Dom0 identity mapping
  xen/x86: there's no highmem anymore in PV mode
  xen/x86: adjust handling of the L3 user vsyscall special page table
  xen/x86: adjust xen_set_fixmap()
  ...
2021-11-10 11:14:21 -08:00
Arnd Bergmann
501586ea59 xen/balloon: fix unused-variable warning
In configurations with CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=n
and CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y, gcc warns about an
unused variable:

drivers/xen/balloon.c:83:12: error: 'xen_hotplug_unpopulated' defined but not used [-Werror=unused-variable]

Since this is always zero when CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
is disabled, turn it into a preprocessor constant in that case.

Fixes: 121f2faca2 ("xen/balloon: rename alloc/free_xenballooned_pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211108111408.3940366-1-arnd@kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-08 10:40:17 -06:00
Linus Torvalds
512b7931ad Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "257 patches.

  Subsystems affected by this patch series: scripts, ocfs2, vfs, and
  mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
  gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
  pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
  memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
  vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
  cleanups, kfence, and damon)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
  mm/damon: remove return value from before_terminate callback
  mm/damon: fix a few spelling mistakes in comments and a pr_debug message
  mm/damon: simplify stop mechanism
  Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
  Docs/admin-guide/mm/damon/start: simplify the content
  Docs/admin-guide/mm/damon/start: fix a wrong link
  Docs/admin-guide/mm/damon/start: fix wrong example commands
  mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
  mm/damon: remove unnecessary variable initialization
  Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
  mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
  selftests/damon: support watermarks
  mm/damon/dbgfs: support watermarks
  mm/damon/schemes: activate schemes based on a watermarks mechanism
  tools/selftests/damon: update for regions prioritization of schemes
  mm/damon/dbgfs: support prioritization weights
  mm/damon/vaddr,paddr: support pageout prioritization
  mm/damon/schemes: prioritize regions within the quotas
  mm/damon/selftests: support schemes quotas
  mm/damon/dbgfs: support quotas of schemes
  ...
2021-11-06 14:08:17 -07:00
Mike Rapoport
4421cca0a3 memblock: use memblock_free for freeing virtual pointers
Rename memblock_free_ptr() to memblock_free() and use memblock_free()
when freeing a virtual pointer so that memblock_free() will be a
counterpart of memblock_alloc()

The callers are updated with the below semantic patch and manual
addition of (void *) casting to pointers that are represented by
unsigned long variables.

    @@
    identifier vaddr;
    expression size;
    @@
    (
    - memblock_phys_free(__pa(vaddr), size);
    + memblock_free(vaddr, size);
    |
    - memblock_free_ptr(vaddr, size);
    + memblock_free(vaddr, size);
    )

[sfr@canb.auug.org.au: fixup]
  Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au

Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Mike Rapoport
3ecc68349b memblock: rename memblock_free to memblock_phys_free
Since memblock_free() operates on a physical range, make its name
reflect it and rename it to memblock_phys_free(), so it will be a
logical counterpart to memblock_phys_alloc().

The callers are updated with the below semantic patch:

    @@
    expression addr;
    expression size;
    @@
    - memblock_free(addr, size);
    + memblock_phys_free(addr, size);

Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-06 13:30:41 -07:00
Linus Torvalds
7e113d01f5 IOMMU Updates for Linux v5.16:
Including:
 
   - Intel IOMMU Updates fro Lu Baolu:
     - Dump DMAR translation structure when DMA fault occurs
     - An optimization in the page table manipulation code
     - Use second level for GPA->HPA translation
     - Various cleanups
 
   - Arm SMMU Updates from Will
     - Minor optimisations to SMMUv3 command creation and submission
     - Numerous new compatible string for Qualcomm SMMUv2 implementations
 
   - Fixes for the SWIOTLB based implemenation of dma-iommu code for
     untrusted devices
 
   - Add support for r8a779a0 to the Renesas IOMMU driver and DT matching
     code for r8a77980
 
   - A couple of cleanups and fixes for the Apple DART IOMMU driver
 
   - Make use of generic report_iommu_fault() interface in the AMD IOMMU
     driver
 
   - Various smaller fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmGD6NQACgkQK/BELZcB
 GuOSfg/9FKXl5ym86BP3tAS1fREKH7p59JRGZrrIR89NyHAcEUjtNG3YLPao+YxU
 3CDgLkru+vlDpYY54QoyqcY5FgIHT3Cna/Cdk4zekRmSO/14gHp47jtZRheOUzLF
 rvwfaplcbbtT8akpsVFzvw8YpQLGSDiDQSl7xL2+40Z9hiYX/gS9Af+PH98tAXsa
 yZKZj6gU+JXM58VihO3M7umyE06tovyBaYgcsBZtbf66bGc0ySu+fe75UVWbueRt
 Z8jwqa7TUfVXiYC8h+LqtGET6gtzNSsxAU3VllRe7Brf6K8i/yaRs/TO2Hp83d7/
 q/fcK3vNQ5v3aDNci/DjBB8SEySzCmRz/9ocCOCx8ByuRp+5lwVRPPq3WcUMtsZY
 QpYo9Fk7luFz2Gj5LObKAVBvOoeBZ5Km3oPs4HVmQ6epxn/rVckJDnJnVSLJuATq
 tSZC2heRfFlg1dT6WFaynCTP2RI1LlNEdKhHirV6L368rSjmF0ZdQxdTpHULsHr1
 yMjqL21OfcSkLW91rvfb3g68EsIwDbCPGTOlQWZLmAtwOWtHSCLPgwwEG7WefZbH
 yaslpmlUTOurUnFmpxlfLicy5sqsBL2ASzGJkEKrgunw82Ke96zzkRzi+9j9HeS6
 g0AyIWMi1cUAjONVUZtV4yjImXh63HIPiKx730a9teodusoxm+Q=
 =waUR
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Intel IOMMU Updates fro Lu Baolu:
     - Dump DMAR translation structure when DMA fault occurs
     - An optimization in the page table manipulation code
     - Use second level for GPA->HPA translation
     - Various cleanups

 - Arm SMMU Updates from Will
     - Minor optimisations to SMMUv3 command creation and submission
     - Numerous new compatible string for Qualcomm SMMUv2 implementations

 - Fixes for the SWIOTLB based implemenation of dma-iommu code for
   untrusted devices

 - Add support for r8a779a0 to the Renesas IOMMU driver and DT matching
   code for r8a77980

 - A couple of cleanups and fixes for the Apple DART IOMMU driver

 - Make use of generic report_iommu_fault() interface in the AMD IOMMU
   driver

 - Various smaller fixes and cleanups

* tag 'iommu-updates-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (35 commits)
  iommu/dma: Fix incorrect error return on iommu deferred attach
  iommu/dart: Initialize DART_STREAMS_ENABLE
  iommu/dma: Use kvcalloc() instead of kvzalloc()
  iommu/tegra-smmu: Use devm_bitmap_zalloc when applicable
  iommu/dart: Use kmemdup instead of kzalloc and memcpy
  iommu/vt-d: Avoid duplicate removing in __domain_mapping()
  iommu/vt-d: Convert the return type of first_pte_in_page to bool
  iommu/vt-d: Clean up unused PASID updating functions
  iommu/vt-d: Delete dev_has_feat callback
  iommu/vt-d: Use second level for GPA->HPA translation
  iommu/vt-d: Check FL and SL capability sanity in scalable mode
  iommu/vt-d: Remove duplicate identity domain flag
  iommu/vt-d: Dump DMAR translation structure when DMA fault occurs
  iommu/vt-d: Do not falsely log intel_iommu is unsupported kernel option
  iommu/arm-smmu-qcom: Request direct mapping for modem device
  iommu: arm-smmu-qcom: Add compatible for QCM2290
  dt-bindings: arm-smmu: Add compatible for QCM2290 SoC
  iommu/arm-smmu-qcom: Add SM6350 SMMU compatible
  dt-bindings: arm-smmu: Add compatible for SM6350 SoC
  iommu/arm-smmu-v3: Properly handle the return value of arm_smmu_cmdq_build_cmd()
  ...
2021-11-04 11:11:24 -07:00
Juergen Gross
121f2faca2 xen/balloon: rename alloc/free_xenballooned_pages
alloc_xenballooned_pages() and free_xenballooned_pages() are used as
direct replacements of xen_alloc_unpopulated_pages() and
xen_free_unpopulated_pages() in case CONFIG_XEN_UNPOPULATED_ALLOC isn't
defined.

Guard both functions with !CONFIG_XEN_UNPOPULATED_ALLOC and rename them
to the xen_*() variants they are replacing. This allows to remove some
ifdeffery from the xen.h header file. Adapt the prototype of the
functions to match.

Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211102092234.17852-1-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-04 12:59:44 -05:00
Juergen Gross
40fdea0284 xen/balloon: add late_initcall_sync() for initial ballooning done
When running as PVH or HVM guest with actual memory < max memory the
hypervisor is using "populate on demand" in order to allow the guest
to balloon down from its maximum memory size. For this to work
correctly the guest must not touch more memory pages than its target
memory size as otherwise the PoD cache will be exhausted and the guest
is crashed as a result of that.

In extreme cases ballooning down might not be finished today before
the init process is started, which can consume lots of memory.

In order to avoid random boot crashes in such cases, add a late init
call to wait for ballooning down having finished for PVH/HVM guests.

Warn on console if initial ballooning fails, panic() after stalling
for more than 3 minutes per default. Add a module parameter for
changing this timeout.

[boris: replaced pr_info() with pr_notice()]

Cc: <stable@vger.kernel.org>
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211102091944.17487-1-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-04 12:59:17 -05:00
Juergen Gross
d99bb72a30 x86/xen: remove 32-bit pv leftovers
There are some remaining 32-bit pv-guest support leftovers in the Xen
hypercall interface. Remove them.

Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211028081221.2475-2-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-02 08:03:43 -05:00
Oleksandr Andrushchenko
a67efff288 xen-pciback: allow compiling on other archs than x86
Xen-pciback driver was designed to be built for x86 only. But it
can also be used by other architectures, e.g. Arm.

Currently PCI backend implements multiple functionalities at a time,
such as:
1. It is used as a database for assignable PCI devices, e.g. xl
   pci-assignable-{add|remove|list} manipulates that list. So, whenever
   the toolstack needs to know which PCI devices can be passed through
   it reads that from the relevant sysfs entries of the pciback.
2. It is used to hold the unbound PCI devices list, e.g. when passing
   through a PCI device it needs to be unbound from the relevant device
   driver and bound to pciback (strictly speaking it is not required
   that the device is bound to pciback, but pciback is again used as a
   database of the passed through PCI devices, so we can re-bind the
   devices back to their original drivers when guest domain shuts down)
3. Device reset for the devices being passed through
4. Para-virtualised use-cases support

The para-virtualised part of the driver is not always needed as some
architectures, e.g. Arm or x86 PVH Dom0, are not using backend-frontend
model for PCI device passthrough.

For such use-cases make the very first step in splitting the
xen-pciback driver into two parts: Xen PCI stub and PCI PV backend
drivers.

For that add new configuration options CONFIG_XEN_PCI_STUB and
CONFIG_XEN_PCIDEV_STUB, so the driver can be limited in its
functionality, e.g. no support for para-virtualised scenario.
x86 platform will continue using CONFIG_XEN_PCIDEV_BACKEND for the
fully featured backend driver.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Anastasiia Lukianenko <anastasiia_lukianenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211028143620.144936-1-andr2000@gmail.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-02 08:03:43 -05:00
Jiasheng Jiang
cbd5458ef1 xen: Fix implicit type conversion
The variable 'i' is defined as UINT. However, in the for_each_possible_cpu
its value is assigned to -1. That doesn't make sense and in the
cpumask_next() it is implicitly type converted to INT. It is universally
accepted that the implicit type conversion is terrible. Also, having the
good programming custom will set an example for others. Thus, it might be
better to change the definition of 'i' from UINT to INT.

[boris: fixed commit message formatting]

Fixes: 3fac10145b ("xen: Re-upload processor PM data to hypervisor after S3 resume (v2)")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Link: https://lore.kernel.org/r/1635233531-2437704-1-git-send-email-jiasheng@iscas.ac.cn
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jiamei Xie <jiamei.xie@arm.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-02 07:45:44 -05:00
Christophe JAILLET
d8da26671a xen/pvcalls-back: Remove redundant 'flush_workqueue()' calls
'destroy_workqueue()' already drains the queue before destroying it, so
there is no need to flush it explicitly.

Remove the redundant 'flush_workqueue()' calls.

This was generated with coccinelle:

@@
expression E;
@@
- 	flush_workqueue(E);
	destroy_workqueue(E);

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/2d6c2e031e4aa2acf2ac4e0bbbc17cfdcc8dbee2.1634236560.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-02 07:45:44 -05:00
YueHaibing
4745ea2628 xen-pciback: Fix return in pm_ctrl_init()
Return NULL instead of passing to ERR_PTR while err is zero,
this fix smatch warnings:
drivers/xen/xen-pciback/conf_space_capability.c:163
 pm_ctrl_init() warn: passing zero to 'ERR_PTR'

Fixes: a92336a117 ("xen/pciback: Drop two backends, squash and cleanup some code.")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211008074417.8260-1-yuehaibing@huawei.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-11-02 07:45:44 -05:00
Joerg Roedel
52d96919d6 Merge branches 'apple/dart', 'arm/mediatek', 'arm/renesas', 'arm/smmu', 'arm/tegra', 'iommu/fixes', 'x86/amd', 'x86/vt-d' and 'core' into next 2021-10-31 22:26:53 +01:00
Greg Kroah-Hartman
16b0314aa7 dma-buf: move dma-buf symbols into the DMA_BUF module namespace
In order to better track where in the kernel the dma-buf code is used,
put the symbols in the namespace DMA_BUF and modify all users of the
symbols to properly import the namespace to not break the build at the
same time.

Now the output of modinfo shows the use of these symbols, making it
easier to watch for users over time:

$ modinfo drivers/misc/fastrpc.ko | grep import
import_ns:      DMA_BUF

Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20211010124628.17691-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-10-25 14:53:08 +02:00
Linus Torvalds
3946b46cab xen: branch for v5.15-rc5
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYWBSIwAKCRCAXGG7T9hj
 vrXxAP9na1EqRJ+SpWyvxHY1jMaIrbg1bgnOc+GsnWxU5liW5AEA4h1HjHtVtrzL
 3vweIS6u2fanrWlYML/daQ3r6EuLPQc=
 =iXsP
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.15b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - fix two minor issues in the Xen privcmd driver plus a cleanup patch
   for that driver

 - fix multiple issues related to running as PVH guest and some related
   earlyprintk fixes for other Xen guest types

 - fix an issue introduced in 5.15 the Xen balloon driver

* tag 'for-linus-5.15b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/balloon: fix cancelled balloon action
  xen/x86: adjust data placement
  x86/PVH: adjust function/data placement
  xen/x86: hook up xen_banner() also for PVH
  xen/x86: generalize preferred console model from PV to PVH Dom0
  xen/x86: make "earlyprintk=xen" work for HVM/PVH DomU
  xen/x86: allow "earlyprintk=xen" to work for PV Dom0
  xen/x86: make "earlyprintk=xen" work better for PVH Dom0
  xen/x86: allow PVH Dom0 without XEN_PV=y
  xen/x86: prevent PVH type from getting clobbered
  xen/privcmd: drop "pages" parameter from xen_remap_pfn()
  xen/privcmd: fix error handling in mmap-resource processing
  xen/privcmd: replace kcalloc() by kvcalloc() when allocating empty pages
2021-10-08 12:55:23 -07:00
Juergen Gross
319933a80f xen/balloon: fix cancelled balloon action
In case a ballooning action is cancelled the new kernel thread handling
the ballooning might end up in a busy loop.

Fix that by handling the cancelled action gracefully.

While at it introduce a short wait for the BP_WAIT case.

Cc: stable@vger.kernel.org
Fixes: 8480ed9c2b ("xen/balloon: use a kernel thread instead a workqueue")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20211005133433.32008-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-06 07:45:00 +02:00
Jan Beulich
cae7d81a37 xen/x86: allow PVH Dom0 without XEN_PV=y
Decouple XEN_DOM0 from XEN_PV, converting some existing uses of XEN_DOM0
to a new XEN_PV_DOM0. (I'm not convinced all are really / should really
be PV-specific, but for starters I've tried to be conservative.)

For PVH Dom0 the hypervisor populates MADT with only x2APIC entries, so
without x2APIC support enabled in the kernel things aren't going to work
very well. (As opposed, DomU-s would only ever see LAPIC entries in MADT
as of now.) Note that this then requires PVH Dom0 to be 64-bit, as
X86_X2APIC depends on X86_64.

In the course of this xen_running_on_version_or_later() needs to be
available more broadly. Move it from a PV-specific to a generic file,
considering that what it does isn't really PV-specific at all anyway.

Note that xen/interface/version.h cannot be included on its own; in
enlighten.c, which uses SCHEDOP_* anyway, include xen/interface/sched.h
first to resolve the apparently sole missing type (xen_ulong_t).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

Link: https://lore.kernel.org/r/983bb72f-53df-b6af-14bd-5e088bd06a08@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05 08:35:56 +02:00
Jan Beulich
97315723c4 xen/privcmd: drop "pages" parameter from xen_remap_pfn()
The function doesn't use it and all of its callers say in a comment that
their respective arguments are to be non-NULL only in auto-translated
mode. Since xen_remap_domain_mfn_array() isn't supposed to be used by
non-PV, drop the parameter there as well. It was bogusly passed as non-
NULL (PRIV_VMA_LOCKED) by its only caller anyway. For
xen_remap_domain_gfn_range(), otoh, it's not clear at all why this
wouldn't want / might not need to gain auto-translated support down the
road, so the parameter is retained there despite now remaining unused
(and the only caller passing NULL); correct a respective comment as
well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Link: https://lore.kernel.org/r/036ad8a2-46f9-ac3d-6219-bdc93ab9e10b@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05 08:20:27 +02:00
Jan Beulich
e11423d672 xen/privcmd: fix error handling in mmap-resource processing
xen_pfn_t is the same size as int only on 32-bit builds (and not even
on Arm32). Hence pfns[] can't be used directly to read individual error
values returned from xen_remap_domain_mfn_array(); every other error
indicator would be skipped/ignored on 64-bit.

Fixes: 3ad0876554 ("xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Link: https://lore.kernel.org/r/aa6d6a67-6889-338a-a910-51e889f792d5@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05 08:20:22 +02:00
Jan Beulich
0432523f48 xen/privcmd: replace kcalloc() by kvcalloc() when allocating empty pages
Osstest has been suffering test failures for a little while from order-4
allocation failures, resulting from alloc_empty_pages() calling
kcalloc(). As there's no need for physically contiguous space here,
switch to kvcalloc().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/6d698901-98a4-05be-c421-bcd0713f5335@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-10-05 08:20:17 +02:00
David Stevens
e81e99bacc swiotlb: Support aligned swiotlb buffers
Add an argument to swiotlb_tbl_map_single that specifies the desired
alignment of the allocated buffer. This is used by dma-iommu to ensure
the buffer is aligned to the iova granule size when using swiotlb with
untrusted sub-granule mappings. This addresses an issue where adjacent
slots could be exposed to the untrusted device if IO_TLB_SIZE < iova
granule < PAGE_SIZE.

Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210929023300.335969-7-stevensd@google.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-29 12:50:34 +02:00
Linus Torvalds
5739844347 xen: branch for v5.15-rc3
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYU8xegAKCRCAXGG7T9hj
 vvk2APwM85dFSMNHQo5+S35X+h+M8uKuqqvPYxVtKqQEwu3LXAEAg0cVgr1lWegI
 X98f07/+M0rPJl24kgW/EIxAD9fsWAw=
 =JILz
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Some minor cleanups and fixes of some theoretical bugs, as well as a
  fix of a bug introduced in 5.15-rc1"

* tag 'for-linus-5.15b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/x86: fix PV trap handling on secondary processors
  xen/balloon: fix balloon kthread freezing
  swiotlb-xen: this is PV-only on x86
  xen/pci-swiotlb: reduce visibility of symbols
  PCI: only build xen-pcifront in PV-enabled environments
  swiotlb-xen: ensure to issue well-formed XENMEM_exchange requests
  Xen/gntdev: don't ignore kernel unmapping error
  xen/x86: drop redundant zeroing from cpu_initialize_context()
2021-09-25 15:37:31 -07:00
Juergen Gross
96f5bd03e1 xen/balloon: fix balloon kthread freezing
Commit 8480ed9c2b ("xen/balloon: use a kernel thread instead a
workqueue") switched the Xen balloon driver to use a kernel thread.
Unfortunately the patch omitted to call try_to_freeze() or to use
wait_event_freezable_timeout(), causing a system suspend to fail.

Fixes: 8480ed9c2b ("xen/balloon: use a kernel thread instead a workqueue")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210920100345.21939-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-21 09:39:04 +02:00
Jan Beulich
794d5b8a49 swiotlb-xen: this is PV-only on x86
The code is unreachable for HVM or PVH, and it also makes little sense
in auto-translated environments. On Arm, with
xen_{create,destroy}_contiguous_region() both being stubs, I have a hard
time seeing what good the Xen specific variant does - the generic one
ought to be fine for all purposes there. Still Arm code explicitly
references symbols here, so the code will continue to be included there.

Instead of making PCI_XEN's "select" conditional, simply drop it -
SWIOTLB_XEN will be available unconditionally in the PV case anyway, and
is - as explained above - dead code in non-PV environments.

This in turn allows dropping the stubs for
xen_{create,destroy}_contiguous_region(), the former of which was broken
anyway - it failed to set the DMA handle output.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

Link: https://lore.kernel.org/r/5947b8ae-fdc7-225c-4838-84712265fc1e@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-20 17:01:21 +02:00
Jan Beulich
9074c79b62 swiotlb-xen: ensure to issue well-formed XENMEM_exchange requests
While the hypervisor hasn't been enforcing this, we would still better
avoid issuing requests with GFNs not aligned to the requested order.
Instead of altering the value also in the call to panic(), drop it
there for being static and hence easy to determine without being part
of the panic message.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/7b3998e3-1233-4e5a-89ec-d740e77eb166@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-20 17:01:07 +02:00
Jan Beulich
f28347cc66 Xen/gntdev: don't ignore kernel unmapping error
While working on XSA-361 and its follow-ups, I failed to spot another
place where the kernel mapping part of an operation was not treated the
same as the user space part. Detect and propagate errors and add a 2nd
pr_debug().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/c2513395-74dc-aea3-9192-fd265aa44e35@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-20 16:57:08 +02:00
Linus Torvalds
c6460daea2 xen: branch for v5.15-rc2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYUR8DAAKCRCAXGG7T9hj
 vjclAP0ekisZSabOsmzM0iz8rpuYj113/kQozokFeFD5eQlEiAD/b5LytLosic/1
 5XKa5fbOyimAyRBwblWhpLLhrW6uiAg=
 =WoAB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.15b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - The first hunk of a Xen swiotlb fixup series fixing multiple minor
   issues and doing some small cleanups

 - Some further Xen related fixes avoiding WARN() splats when running as
   Xen guests or dom0

 - A Kconfig fix allowing the pvcalls frontend to be built as a module

* tag 'for-linus-5.15b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  swiotlb-xen: drop DEFAULT_NSLABS
  swiotlb-xen: arrange to have buffer info logged
  swiotlb-xen: drop leftover __ref
  swiotlb-xen: limit init retries
  swiotlb-xen: suppress certain init retries
  swiotlb-xen: maintain slab count properly
  swiotlb-xen: fix late init retry
  swiotlb-xen: avoid double free
  xen/pvcalls: backend can be a module
  xen: fix usage of pmd_populate in mremap for pv guests
  xen: reset legacy rtc flag for PV domU
  PM: base: power: don't try to use non-existing RTC for storing data
  xen/balloon: use a kernel thread instead a workqueue
2021-09-17 08:31:49 -07:00
Jan Beulich
d859ed25b2 swiotlb-xen: drop DEFAULT_NSLABS
It was introduced by 4035b43da6 ("xen-swiotlb: remove xen_set_nslabs")
and then not removed by 2d29960af0 ("swiotlb: dynamically allocate
io_tlb_default_mem").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

Link: https://lore.kernel.org/r/15259326-209a-1d11-338c-5018dc38abe8@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Jan Beulich
7fd880a38c swiotlb-xen: arrange to have buffer info logged
I consider it unhelpful that address and size of the buffer aren't put
in the log file; it makes diagnosing issues needlessly harder. The
majority of callers of swiotlb_init() also passes 1 for the "verbose"
parameter.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

Link: https://lore.kernel.org/r/2e3c8e68-36b2-4ae9-b829-bf7f75d39d47@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Jan Beulich
68573c1b5c swiotlb-xen: drop leftover __ref
Commit a98f565462 ("xen-swiotlb: split xen_swiotlb_init") should not
only have added __init to the split off function, but also should have
dropped __ref from the one left.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

Link: https://lore.kernel.org/r/7cd163e1-fe13-270b-384c-2708e8273d34@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Jan Beulich
cabb7f89b2 swiotlb-xen: limit init retries
Due to the use of max(1024, ...) there's no point retrying (and issuing
bogus log messages) when the number of slabs is already no larger than
this minimum value.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

Link: https://lore.kernel.org/r/984fa426-2b7b-4b77-5ce8-766619575b7f@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Jan Beulich
79ca5f778a swiotlb-xen: suppress certain init retries
Only on the 2nd of the paths leading to xen_swiotlb_init()'s "error"
label it is useful to retry the allocation; the first one did already
iterate through all possible order values.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/56477481-87da-4962-9661-5e1b277efde0@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Jan Beulich
d9a688add3 swiotlb-xen: maintain slab count properly
Generic swiotlb code makes sure to keep the slab count a multiple of the
number of slabs per segment. Yet even without checking whether any such
assumption is made elsewhere, it is easy to see that xen_swiotlb_fixup()
might alter unrelated memory when calling xen_create_contiguous_region()
for the last segment, when that's not a full one - the function acts on
full order-N regions, not individual pages.

Align the slab count suitably when halving it for a retry. Add a build
time check and a runtime one. Replace the no longer useful local
variable "slabs" by an "order" one calculated just once, outside of the
loop. Re-use "order" for calculating "dma_bits", and change the type of
the latter as well as the one of "i" while touching this anyway.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

Link: https://lore.kernel.org/r/dc054cb0-bec4-4db0-fc06-c9fc957b6e66@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Jan Beulich
4c092c5901 swiotlb-xen: fix late init retry
The commit referenced below removed the assignment of "bytes" from
xen_swiotlb_init() without - like done for xen_swiotlb_init_early() -
adding an assignment on the retry path, thus leading to excessively
sized allocations upon retries.

Fixes: 2d29960af0 ("swiotlb: dynamically allocate io_tlb_default_mem")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>

Link: https://lore.kernel.org/r/778299d6-9cfd-1c13-026e-25ee5d14ecb3@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Jan Beulich
ce6a80d1b2 swiotlb-xen: avoid double free
Of the two paths leading to the "error" label in xen_swiotlb_init() one
didn't allocate anything, while the other did already free what was
allocated.

Fixes: b827760053 ("xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>

Link: https://lore.kernel.org/r/ce9c2adb-8a52-6293-982a-0d6ece943ac6@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Jan Beulich
45da234467 xen/pvcalls: backend can be a module
It's not clear to me why only the frontend has been tristate. Switch the
backend to be, too.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>

Link: https://lore.kernel.org/r/54a6070c-92bb-36a3-2fc0-de9ccca438c5@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-15 08:42:04 +02:00
Juergen Gross
8480ed9c2b xen/balloon: use a kernel thread instead a workqueue
Today the Xen ballooning is done via delayed work in a workqueue. This
might result in workqueue hangups being reported in case of large
amounts of memory are being ballooned in one go (here 16GB):

BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 stuck for 64s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=2/256 refcnt=3
    in-flight: 229:balloon_process
    pending: cache_reap
workqueue events_freezable_power_: flags=0x84
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
    pending: disk_events_workfn
workqueue mm_percpu_wq: flags=0x8
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
    pending: vmstat_update
pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=64s workers=3 idle: 2222 43

This can easily be avoided by using a dedicated kernel thread for doing
the ballooning work.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210827123206.15429-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-09-14 09:53:48 +02:00
Linus Torvalds
b250e6d141 Kbuild updates for v5.15
- Add -s option (strict mode) to merge_config.sh to make it fail when
    any symbol is redefined.
 
  - Show a warning if a different compiler is used for building external
    modules.
 
  - Infer --target from ARCH for CC=clang to let you cross-compile the
    kernel without CROSS_COMPILE.
 
  - Make the integrated assembler default (LLVM_IAS=1) for CC=clang.
 
  - Add <linux/stdarg.h> to the kernel source instead of borrowing
    <stdarg.h> from the compiler.
 
  - Add Nick Desaulniers as a Kbuild reviewer.
 
  - Drop stale cc-option tests.
 
  - Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
    to handle symbols in inline assembly.
 
  - Show a warning if 'FORCE' is missing for if_changed rules.
 
  - Various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmExXHoVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGAZwP/iHdEZzuQ4cz2uXUaV0fevj9jjPU
 zJ8wrrNabAiT6f5x861DsARQSR4OSt3zN0tyBNgZwUdotbe7ED5GegrgIUBMWlML
 QskhTEIZj7TexAX/20vx671gtzI3JzFg4c9BuriXCFRBvychSevdJPr65gMDOesL
 vOJnXe+SGXG2+fPWi/PxrcOItNRcveqo2GiWHT3g0Cv/DJUulu81gEkz3hrufnMR
 cjMeSkV0nJJcvI755OQBOUnEuigW64k4m2WxHPG24tU8cQOCqV6lqwOfNQBAn4+F
 OoaCMyPQT9gvGYwGExQMCXGg0wbUt1qnxzOVoA2qFCwbo+MFhqjBvPXab6VJm7CE
 mY3RrTtvxSqBdHI6EGcYeLjhycK9b+LLoJ1qc3S9FK8It6NoFFp4XV0R6ItPBls7
 mWi9VSpyI6k0AwLq+bGXEHvaX/bnnf/vfqn8H+w6mRZdXjFV8EB2DiOSRX/OqjVG
 RnvTtXzWWThLyXvWR3Jox4+7X6728oL7akLemoeZI6oTbJDm7dQgwpz5HbSyHXLh
 d+gUF3Y/6lqxT5N9GSVDxpD1bEMh2I7nGQ4M7WGbGas/3yUemF8wbBqGQo4a+YeD
 d9vGAUxDp2PQTtL2sjFo5Gd4PZEM9g7vwWzRvHe0o5NxKEXcBg25b8cD1hxrN9Y4
 Y1AAnc0kLO+My3PC
 =lw3M
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Add -s option (strict mode) to merge_config.sh to make it fail when
   any symbol is redefined.

 - Show a warning if a different compiler is used for building external
   modules.

 - Infer --target from ARCH for CC=clang to let you cross-compile the
   kernel without CROSS_COMPILE.

 - Make the integrated assembler default (LLVM_IAS=1) for CC=clang.

 - Add <linux/stdarg.h> to the kernel source instead of borrowing
   <stdarg.h> from the compiler.

 - Add Nick Desaulniers as a Kbuild reviewer.

 - Drop stale cc-option tests.

 - Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
   to handle symbols in inline assembly.

 - Show a warning if 'FORCE' is missing for if_changed rules.

 - Various cleanups

* tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
  kbuild: redo fake deps at include/ksym/*.h
  kbuild: clean up objtool_args slightly
  modpost: get the *.mod file path more simply
  checkkconfigsymbols.py: Fix the '--ignore' option
  kbuild: merge vmlinux_link() between ARCH=um and other architectures
  kbuild: do not remove 'linux' link in scripts/link-vmlinux.sh
  kbuild: merge vmlinux_link() between the ordinary link and Clang LTO
  kbuild: remove stale *.symversions
  kbuild: remove unused quiet_cmd_update_lto_symversions
  gen_compile_commands: extract compiler command from a series of commands
  x86: remove cc-option-yn test for -mtune=
  arc: replace cc-option-yn uses with cc-option
  s390: replace cc-option-yn uses with cc-option
  ia64: move core-y in arch/ia64/Makefile to arch/ia64/Kbuild
  sparc: move the install rule to arch/sparc/Makefile
  security: remove unneeded subdir-$(CONFIG_...)
  kbuild: sh: remove unused install script
  kbuild: Fix 'no symbols' warning when CONFIG_TRIM_UNUSD_KSYMS=y
  kbuild: Switch to 'f' variants of integrated assembler flag
  kbuild: Shuffle blank line to improve comment meaning
  ...
2021-09-03 15:33:47 -07:00
Linus Torvalds
3de18c865f Merge branch 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb updates from Konrad Rzeszutek Wilk:
 "A new feature called restricted DMA pools. It allows SWIOTLB to
  utilize per-device (or per-platform) allocated memory pools instead of
  using the global one.

  The first big user of this is ARM Confidential Computing where the
  memory for DMA operations can be set per platform"

* 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: (23 commits)
  swiotlb: use depends on for DMA_RESTRICTED_POOL
  of: restricted dma: Don't fail device probe on rmem init failure
  of: Move of_dma_set_restricted_buffer() into device.c
  powerpc/svm: Don't issue ultracalls if !mem_encrypt_active()
  s390/pv: fix the forcing of the swiotlb
  swiotlb: Free tbl memory in swiotlb_exit()
  swiotlb: Emit diagnostic in swiotlb_exit()
  swiotlb: Convert io_default_tlb_mem to static allocation
  of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS
  swiotlb: add overflow checks to swiotlb_bounce
  swiotlb: fix implicit debugfs declarations
  of: Add plumbing for restricted DMA pool
  dt-bindings: of: Add restricted DMA pool
  swiotlb: Add restricted DMA pool initialization
  swiotlb: Add restricted DMA alloc/free support
  swiotlb: Refactor swiotlb_tbl_unmap_single
  swiotlb: Move alloc_size to swiotlb_find_slots
  swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
  swiotlb: Update is_swiotlb_active to add a struct device argument
  swiotlb: Update is_swiotlb_buffer to add a struct device argument
  ...
2021-09-03 10:34:44 -07:00
Linus Torvalds
9ae5fceb9a xen: branch for v5.15-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYTB6pAAKCRCAXGG7T9hj
 vk9xAP9hUQtGdTb5eW9+3X3PHDON2jGbydAMtQ57WJLdqj0oFgD/cGV95PKv0jgg
 zgw6Icf9Vp6APF/Ucd/dOB0B6qviJgw=
 =PNRc
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:

 - some small cleanups

 - a fix for a bug when running as Xen PV guest which could result in
   not all memory being transferred in case of a migration of the guest

 - a small series for getting rid of code for supporting very old Xen
   hypervisor versions nobody should be using since many years now

 - a series for hardening the Xen block frontend driver

 - a fix for Xen PV boot code issuing warning messages due to a stray
   preempt_disable() on the non-boot processors

* tag 'for-linus-5.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: remove stray preempt_disable() from PV AP startup code
  xen/pcifront: Removed unnecessary __ref annotation
  x86: xen: platform-pci-unplug: use pr_err() and pr_warn() instead of raw printk()
  drivers/xen/xenbus/xenbus_client.c: fix bugon.cocci warnings
  xen/blkfront: don't trust the backend response data blindly
  xen/blkfront: don't take local copy of a request from the ring page
  xen/blkfront: read response from backend only once
  xen: assume XENFEAT_gnttab_map_avail_bits being set for pv guests
  xen: assume XENFEAT_mmu_pt_update_preserve_ad being set for pv guests
  xen: check required Xen features
  xen: fix setting of max_pfn in shared_info
2021-09-02 13:20:11 -07:00
Linus Torvalds
4a3bb4200a dma-mapping updates for Linux 5.15
- fix debugfs initialization order (Anthony Iliopoulos)
  - use memory_intersects() directly (Kefeng Wang)
  - allow to return specific errors from ->map_sg
    (Logan Gunthorpe, Martin Oliveira)
  - turn the dma_map_sg return value into an unsigned int (me)
  - provide a common global coherent pool іmplementation (me)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmEvY+8LHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPaehAAsgnBzzzoLHO83pgs0KL92c+0DiMNHYmaMCJOvZXk
 x2Irv+O74WikRJc4S7uQ26p2spjmUxjmiOjld+8+NN0liD4QO9BQ/SZpIp8emuKS
 /yPG6Xh86xSl/OrPL1y7kGeHkRi5sm3mRhcTdILFQFPLcSReupe++GRfnvrpbOPk
 tj3pBGXluD6iJH12BBt00ushUVzZ0F2xaF6xUDAs94RSZ3tlqsfx6c928Y1KxSZh
 f89q/KuaokyogFG7Ujj/nYgIUETaIs2W6UmxBfRzdEMJFSffwomUMbw+M+qGJ7/d
 2UjamFYRX16FReE8WNsndbX1E6k5JBW12E1qwV3dUwatlNLWEaRq3PNiWkF7zcFH
 LDkpDYN6s5bIDPTfDp21XfPygoH8KQhnD9lVf0aB7n04uu8VJrGB9+10PpkCJVXD
 0b2dcuSwCO7hAfTfNGVV8f3EI/1XPflr1hJvMgcVtY53CR96ldp+4QaElzWLXumN
 MyptirmrVITNVyVwGzhGAblXBLWdarXD0EXudyiaF4Xbrj3AkIOSUCghEwKLpjQf
 UwMFFwSE8yGxKTRK4HfU5gMzy6G751fU7TUe5lmxZLovDflQoSXMWgHE8e7r0Qel
 o5v6lmUzoWz2fAISf3xjauo2ncgmfWMwYM6C7OJy5nG73QXLQId9J+ReXbmrgrrN
 DgI=
 =spje
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - fix debugfs initialization order (Anthony Iliopoulos)

 - use memory_intersects() directly (Kefeng Wang)

 - allow to return specific errors from ->map_sg (Logan Gunthorpe,
   Martin Oliveira)

 - turn the dma_map_sg return value into an unsigned int (me)

 - provide a common global coherent pool іmplementation (me)

* tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping: (31 commits)
  hexagon: use the generic global coherent pool
  dma-mapping: make the global coherent pool conditional
  dma-mapping: add a dma_init_global_coherent helper
  dma-mapping: simplify dma_init_coherent_memory
  dma-mapping: allow using the global coherent pool for !ARM
  ARM/nommu: use the generic dma-direct code for non-coherent devices
  dma-direct: add support for dma_coherent_default_memory
  dma-mapping: return an unsigned int from dma_map_sg{,_attrs}
  dma-mapping: disallow .map_sg operations from returning zero on error
  dma-mapping: return error code from dma_dummy_map_sg()
  x86/amd_gart: don't set failed sg dma_address to DMA_MAPPING_ERROR
  x86/amd_gart: return error code from gart_map_sg()
  xen: swiotlb: return error code from xen_swiotlb_map_sg()
  parisc: return error code from .map_sg() ops
  sparc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
  sparc/iommu: return error codes from .map_sg() ops
  s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR
  s390/pci: return error code from s390_dma_map_sg()
  powerpc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
  powerpc/iommu: return error code from .map_sg() ops
  ...
2021-09-02 10:32:06 -07:00
Linus Torvalds
c6c3c5704b Driver core update for 5.15-rc1
Here is the big set of driver core patches for 5.15-rc1.
 
 These do change a number of different things across different
 subsystems, and because of that, there were 2 stable tags created that
 might have already come into your tree from different pulls that did the
 following
 	- changed the bus remove callback to return void
 	- sysfs iomem_get_mapping rework
 
 The latter one will cause a tiny merge issue with your tree, as there
 was a last-minute fix for this in 5.14 in your tree, but the fixup
 should be "obvious".  If you want me to provide a fixed merge for this,
 please let me know.
 
 Other than those two things, there's only a few small things in here:
 	- kernfs performance improvements for huge numbers of sysfs
 	  users at once
 	- tiny api cleanups
 	- other minor changes
 
 All of these have been in linux-next for a while with no reported
 problems, other than the before-mentioned merge issue.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYS+FLQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylXuACfWECnysDtXNe66DdETCFs1a1RToYAoMokWeU5
 s8VFP1NY2BjmxJbkebLL
 =8kVu
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the big set of driver core patches for 5.15-rc1.

  These do change a number of different things across different
  subsystems, and because of that, there were 2 stable tags created that
  might have already come into your tree from different pulls that did
  the following

   - changed the bus remove callback to return void

   - sysfs iomem_get_mapping rework

  Other than those two things, there's only a few small things in here:

   - kernfs performance improvements for huge numbers of sysfs users at
     once

   - tiny api cleanups

   - other minor changes

  All of these have been in linux-next for a while with no reported
  problems, other than the before-mentioned merge issue"

* tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (33 commits)
  MAINTAINERS: Add dri-devel for component.[hc]
  driver core: platform: Remove platform_device_add_properties()
  ARM: tegra: paz00: Handle device properties with software node API
  bitmap: extend comment to bitmap_print_bitmask/list_to_buf
  drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI
  topology: use bin_attribute to break the size limitation of cpumap ABI
  lib: test_bitmap: add bitmap_print_bitmask/list_to_buf test cases
  cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list
  sysfs: Rename struct bin_attribute member to f_mapping
  sysfs: Invoke iomem_get_mapping() from the sysfs open callback
  debugfs: Return error during {full/open}_proxy_open() on rmmod
  zorro: Drop useless (and hardly used) .driver member in struct zorro_dev
  zorro: Simplify remove callback
  sh: superhyway: Simplify check in remove callback
  nubus: Simplify check in remove callback
  nubus: Make struct nubus_driver::remove return void
  kernfs: dont call d_splice_alias() under kernfs node lock
  kernfs: use i_lock to protect concurrent inode updates
  kernfs: switch kernfs to use an rwsem
  kernfs: use VFS negative dentry caching
  ...
2021-09-01 08:44:42 -07:00
Jing Yangyang
bb70913dce drivers/xen/xenbus/xenbus_client.c: fix bugon.cocci warnings
Use BUG_ON instead of a if condition followed by BUG.

Generated by: scripts/coccinelle/misc/bugon.cocci

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Jing Yangyang <jing.yangyang@zte.com.cn>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20210825062451.69998-1-deng.changcheng@zte.com.cn
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30 12:04:53 +02:00
Juergen Gross
30dcc56bba xen: assume XENFEAT_gnttab_map_avail_bits being set for pv guests
XENFEAT_gnttab_map_avail_bits is always set in Xen 4.0 and newer.
Remove coding assuming it might be zero.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210730071804.4302-4-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30 11:57:50 +02:00
Juergen Gross
ac4c403c90 xen: check required Xen features
Linux kernel is not supported to run on Xen versions older than 4.0.

Add tests for required Xen features always being present in Xen 4.0
and newer.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210730071804.4302-2-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2021-08-30 11:57:45 +02:00
Alexey Dobriyan
39f75da7bc isystem: trim/fixup stdarg.h and other headers
Delete/fixup few includes in anticipation of global -isystem compile
option removal.

Note: crypto/aegis128-neon-inner.c keeps <stddef.h> due to redefinition
of uintptr_t error (one definition comes from <stddef.h>, another from
<linux/types.h>).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-08-19 09:02:55 +09:00
Linus Torvalds
ba31f97d43 xen: branch for v5.14-rc6
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYReBzAAKCRCAXGG7T9hj
 vsjcAQD9tV9ThjOP9HXL3lGGoSIBCy+t92ZwpVkQuYITa9fHFAD/ViU+Iye5WHTf
 jWxlveDJMoOegEmpKEQAoX56hDHPkww=
 =lQOv
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "A small cleanup patch and a fix of a rare race in the Xen evtchn
  driver"

* tag 'for-linus-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/events: Fix race in set_evtchn_to_irq
  xen/events: remove redundant initialization of variable irq
2021-08-14 06:31:22 -10:00
Maximilian Heyne
88ca2521bd xen/events: Fix race in set_evtchn_to_irq
There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq
mapping are lazily allocated in this function. The check whether the row
is already present and the row initialization is not synchronized. Two
threads can at the same time allocate a new row for evtchn_to_irq and
add the irq mapping to the their newly allocated row. One thread will
overwrite what the other has set for evtchn_to_irq[row] and therefore
the irq mapping is lost. This will trigger a BUG_ON later in
bind_evtchn_to_cpu:

  INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802
  INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002)
  INFO: nvme nvme77: 1/0/0 default/read/poll queues
  CRIT: kernel BUG at drivers/xen/events/events_base.c:427!
  WARN: invalid opcode: 0000 [#1] SMP NOPTI
  WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme]
  WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0
  WARN: Call Trace:
  WARN:  set_affinity_irq+0x121/0x150
  WARN:  irq_do_set_affinity+0x37/0xe0
  WARN:  irq_setup_affinity+0xf6/0x170
  WARN:  irq_startup+0x64/0xe0
  WARN:  __setup_irq+0x69e/0x740
  WARN:  ? request_threaded_irq+0xad/0x160
  WARN:  request_threaded_irq+0xf5/0x160
  WARN:  ? nvme_timeout+0x2f0/0x2f0 [nvme]
  WARN:  pci_request_irq+0xa9/0xf0
  WARN:  ? pci_alloc_irq_vectors_affinity+0xbb/0x130
  WARN:  queue_request_irq+0x4c/0x70 [nvme]
  WARN:  nvme_reset_work+0x82d/0x1550 [nvme]
  WARN:  ? check_preempt_wakeup+0x14f/0x230
  WARN:  ? check_preempt_curr+0x29/0x80
  WARN:  ? nvme_irq_check+0x30/0x30 [nvme]
  WARN:  process_one_work+0x18e/0x3c0
  WARN:  worker_thread+0x30/0x3a0
  WARN:  ? process_one_work+0x3c0/0x3c0
  WARN:  kthread+0x113/0x130
  WARN:  ? kthread_park+0x90/0x90
  WARN:  ret_from_fork+0x3a/0x50

This patch sets evtchn_to_irq rows via a cmpxchg operation so that they
will be set only once. The row is now cleared before writing it to
evtchn_to_irq in order to not create a race once the row is visible for
other threads.

While at it, do not require the page to be zeroed, because it will be
overwritten with -1's in clear_evtchn_to_irq_row anyway.

Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Fixes: d0b075ffee ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated")
Link: https://lore.kernel.org/r/20210812130930.127134-1-mheyne@amazon.de
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-08-12 10:49:54 -05:00
Martin Oliveira
2c647ebe17 xen: swiotlb: return error code from xen_swiotlb_map_sg()
The .map_sg() op now expects an error code instead of zero on failure.

xen_swiotlb_map_sg() may only fail if xen_swiotlb_map_page() fails, but
xen_swiotlb_map_page() only supports returning errors as
DMA_MAPPING_ERROR. So coalesce all errors into EIO per the documentation
for dma_map_sgtable().

Signed-off-by: Martin Oliveira <martin.oliveira@eideticom.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-08-09 17:13:06 +02:00
Will Deacon
463e862ac6 swiotlb: Convert io_default_tlb_mem to static allocation
Since commit 69031f5008 ("swiotlb: Set dev->dma_io_tlb_mem to the
swiotlb pool used"), 'struct device' may hold a copy of the global
'io_default_tlb_mem' pointer if the device is using swiotlb for DMA. A
subsequent call to swiotlb_exit() will therefore leave dangling pointers
behind in these device structures, resulting in KASAN splats such as:

  |  BUG: KASAN: use-after-free in __iommu_dma_unmap_swiotlb+0x64/0xb0
  |  Read of size 8 at addr ffff8881d7830000 by task swapper/0/0
  |
  |  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc3-debug #1
  |  Hardware name: HP HP Desktop M01-F1xxx/87D6, BIOS F.12 12/17/2020
  |  Call Trace:
  |   <IRQ>
  |   dump_stack+0x9c/0xcf
  |   print_address_description.constprop.0+0x18/0x130
  |   kasan_report.cold+0x7f/0x111
  |   __iommu_dma_unmap_swiotlb+0x64/0xb0
  |   nvme_pci_complete_rq+0x73/0x130
  |   blk_complete_reqs+0x6f/0x80
  |   __do_softirq+0xfc/0x3be

Convert 'io_default_tlb_mem' to a static structure, so that the
per-device pointers remain valid after swiotlb_exit() has been invoked.
All users are updated to reference the static structure directly, using
the 'nslabs' field to determine whether swiotlb has been initialised.
The 'slots' array is still allocated dynamically and referenced via a
pointer rather than a flexible array member.

Cc: Claire Chang <tientzu@chromium.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fixes: 69031f5008 ("swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Claire Chang <tientzu@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
2021-07-23 20:14:43 -04:00
Colin Ian King
83f877a095 xen/events: remove redundant initialization of variable irq
The variable irq is being initialized with a value that is never
read, it is being updated later on. The assignment is redundant and
can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210721114010.108648-1-colin.king@canonical.com
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2021-07-21 11:40:45 -05:00
Uwe Kleine-König
fc7a6209d5 bus: Make remove callback return void
The driver core ignores the return value of this callback because there
is only little it can do when a device disappears.

This is the final bit of a long lasting cleanup quest where several
buses were converted to also return void from their remove callback.
Additionally some resource leaks were fixed that were caused by drivers
returning an error code in the expectation that the driver won't go
away.

With struct bus_type::remove returning void it's prevented that newly
implemented buses return an ignored error code and so don't anticipate
wrong expectations for driver authors.

Reviewed-by: Tom Rix <trix@redhat.com> (For fpga)
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com> (For drivers/s390 and drivers/vfio)
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> (For ARM, Amba and related parts)
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Chen-Yu Tsai <wens@csie.org> (for sunxi-rsb)
Acked-by: Pali Rohár <pali@kernel.org>
Acked-by: Mauro Carvalho Chehab <mchehab@kernel.org> (for media)
Acked-by: Hans de Goede <hdegoede@redhat.com> (For drivers/platform)
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-By: Vinod Koul <vkoul@kernel.org>
Acked-by: Juergen Gross <jgross@suse.com> (For xen)
Acked-by: Lee Jones <lee.jones@linaro.org> (For mfd)
Acked-by: Johannes Thumshirn <jth@kernel.org> (For mcb)
Acked-by: Johan Hovold <johan@kernel.org>
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> (For slimbus)
Acked-by: Kirti Wankhede <kwankhede@nvidia.com> (For vfio)
Acked-by: Maximilian Luz <luzmaximilian@gmail.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> (For ulpi and typec)
Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> (For ipack)
Acked-by: Geoff Levand <geoff@infradead.org> (For ps3)
Acked-by: Yehezkel Bernat <YehezkelShB@gmail.com> (For thunderbolt)
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> (For intel_th)
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> (For pcmcia)
Acked-by: Rafael J. Wysocki <rafael@kernel.org> (For ACPI)
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> (rpmsg and apr)
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> (For intel-ish-hid)
Acked-by: Dan Williams <dan.j.williams@intel.com> (For CXL, DAX, and NVDIMM)
Acked-by: William Breathitt Gray <vilhelm.gray@gmail.com> (For isa)
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (For firewire)
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> (For hid)
Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> (For siox)
Acked-by: Sven Van Asbroeck <TheSven73@gmail.com> (For anybuss)
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> (For MMC)
Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20210713193522.1770306-6-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-21 11:53:42 +02:00
Claire Chang
903cd0f315 swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
use it to determine whether to bounce the data or not. This will be
useful later to allow for different pools.

Signed-off-by: Claire Chang <tientzu@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Will Deacon <will@kernel.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

[v2: Includes Will's fix]
2021-07-13 20:04:43 -04:00
Claire Chang
7fd856aa7f swiotlb: Update is_swiotlb_buffer to add a struct device argument
Update is_swiotlb_buffer to add a struct device argument. This will be
useful later to allow for different pools.

Signed-off-by: Claire Chang <tientzu@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Will Deacon <will@kernel.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2021-07-13 20:04:38 -04:00
Linus Torvalds
4ea9031795 xen: branch for v5.14-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYOVQVQAKCRCAXGG7T9hj
 vnsfAQCLwcqSbno1iiUFDFGDzwnF0RZjIxpvEvdI/Zc0KMAO5gEA9tixIpfBYUO3
 0yAd9g/VRumPS0wRsfwUrnLYs8TZ4gY=
 =pPbL
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "Only two minor patches this time: one cleanup patch and one patch
  refreshing a Xen header"

* tag 'for-linus-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: sync include/xen/interface/io/ring.h with Xen's newest version
  xen: Use DEVICE_ATTR_*() macro
2021-07-07 11:07:13 -07:00