Commit graph

1540 commits

Author SHA1 Message Date
Juergen Gross
9a69878dfa xen/balloon: Support xend-based toolstack take two
commit eda4eabf86 upstream.

Commit 3aa6c19d2f ("xen/balloon: Support xend-based toolstack")
tried to fix a regression with running on rather ancient Xen versions.
Unfortunately the fix was based on the assumption that xend would
just use another Xenstore node, but in reality only some downstream
versions of xend are doing that. The upstream xend does not write
that Xenstore node at all, so the problem must be fixed in another
way.

The easiest way to achieve that is to fall back to the behavior
before commit 96edd61dcf ("xen/balloon: don't online new memory
initially") in case the static memory maximum can't be read.

This is achieved by setting static_max to the current number of
memory pages known by the system resulting in target_diff becoming
zero.

Fixes: 3aa6c19d2f ("xen/balloon: Support xend-based toolstack")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org> # 4.13
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11 04:34:08 -08:00
Eric Dumazet
b0fb910bfd net: add {READ|WRITE}_ONCE() annotations on ->rskq_accept_head
[ Upstream commit 60b173ca3d ]

reqsk_queue_empty() is called from inet_csk_listen_poll() while
other cpus might write ->rskq_accept_head value.

Use {READ|WRITE}_ONCE() to avoid compiler tricks
and potential KCSAN splats.

Fixes: fff1f3001c ("tcp: add a spinlock to protect struct request_sock_queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27 14:51:18 +01:00
Dan Carpenter
a663874605 xen, cpu_hotplug: Prevent an out of bounds access
[ Upstream commit 201676095d ]

The "cpu" variable comes from the sscanf() so Smatch marks it as
untrusted data.  We can't pass a higher value than "nr_cpu_ids" to
cpu_possible() or it results in an out of bounds access.

Fixes: d68d82afd4 ("xen: implement CPU hotplugging")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27 14:50:31 +01:00
Juergen Gross
71d955cdbb xen/balloon: fix ballooned page accounting without hotplug enabled
[ Upstream commit c673ec61ad ]

When CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not defined
reserve_additional_memory() will set balloon_stats.target_pages to a
wrong value in case there are still some ballooned pages allocated via
alloc_xenballooned_pages().

This will result in balloon_process() no longer be triggered when
ballooned pages are freed in batches.

Reported-by: Nicholas Tsirakis <niko.tsirakis@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09 10:18:58 +01:00
Jason Gunthorpe
dd6a2167dd xen/gntdev: Use select for DMA_SHARED_BUFFER
[ Upstream commit fa6614d8ef ]

DMA_SHARED_BUFFER can not be enabled by the user (it represents a library
set in the kernel). The kconfig convention is to use select for such
symbols so they are turned on implicitly when the user enables a kconfig
that needs them.

Otherwise the XEN_GNTDEV_DMABUF kconfig is overly difficult to enable.

Fixes: 932d656217 ("xen/gntdev: Add initial support for dma-buf UAPI")
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel@lists.xenproject.org
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31 16:35:50 +01:00
Stefano Stabellini
36a2e9bf24 pvcalls-front: don't return error when the ring is full
[ Upstream commit d90a1ca60a ]

When the ring is full, size == array_size. It is not an error condition,
so simply return 0 instead of an error.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17 20:35:32 +01:00
Ross Lagerwall
73c1ca2dda xen/pciback: Check dev_data before using it
[ Upstream commit 1669907e3d ]

If pcistub_init_device fails, the release function will be called with
dev_data set to NULL.  Check it before using it to avoid a NULL pointer
dereference.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:20:26 +01:00
David Hildenbrand
02735d5987 mm/memory_hotplug: make add_memory() take the device_hotplug_lock
[ Upstream commit 8df1d0e4a2 ]

add_memory() currently does not take the device_hotplug_lock, however
is aleady called under the lock from
	arch/powerpc/platforms/pseries/hotplug-memory.c
	drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.

In general, we should hold the device_hotplug_lock when adding memory to
synchronize against online/offline request (e.g.  from user space) - which
already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3b ("mm, hotplug: fix concurrent memory
hot-add deadlock").  add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.

Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.

The lock is not held yet in
	drivers/xen/balloon.c
	arch/powerpc/platforms/powernv/memtrace.c
	drivers/s390/char/sclp_cmd.c
	drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.

Don't export add_memory_resource(), as it once was exported to be used by
XEN, which is never built as a module.  If somebody requires it, we also
have to export a locked variant (as device_hotplug_lock is never
exported).

Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:17:10 +01:00
Igor Druzhinin
2bc2a90a08 xen/pci: reserve MCFG areas earlier
[ Upstream commit a4098bc6ee ]

If MCFG area is not reserved in E820, Xen by default will defer its usage
until Dom0 registers it explicitly after ACPI parser recognizes it as
a reserved resource in DSDT. Having it reserved in E820 is not
mandatory according to "PCI Firmware Specification, rev 3.2" (par. 4.1.2)
and firmware is free to keep a hole in E820 in that place. Xen doesn't know
what exactly is inside this hole since it lacks full ACPI view of the
platform therefore it's potentially harmful to access MCFG region
without additional checks as some machines are known to provide
inconsistent information on the size of the region.

Now xen_mcfg_late() runs after acpi_init() which is too late as some basic
PCI enumeration starts exactly there as well. Trying to register a device
prior to MCFG reservation causes multiple problems with PCIe extended
capability initializations in Xen (e.g. SR-IOV VF BAR sizing). There are
no convenient hooks for us to subscribe to so register MCFG areas earlier
upon the first invocation of xen_add_device(). It should be safe to do once
since all the boot time buses must have their MCFG areas in MCFG table
already and we don't support PCI bus hot-plug.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-11 18:21:13 +02:00
Juergen Gross
975859bb69 xen/xenbus: fix self-deadlock after killing user process
commit a8fabb3852 upstream.

In case a user process using xenbus has open transactions and is killed
e.g. via ctrl-C the following cleanup of the allocated resources might
result in a deadlock due to trying to end a transaction in the xenbus
worker thread:

[ 2551.474706] INFO: task xenbus:37 blocked for more than 120 seconds.
[ 2551.492215]       Tainted: P           OE     5.0.0-29-generic #5
[ 2551.510263] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2551.528585] xenbus          D    0    37      2 0x80000080
[ 2551.528590] Call Trace:
[ 2551.528603]  __schedule+0x2c0/0x870
[ 2551.528606]  ? _cond_resched+0x19/0x40
[ 2551.528632]  schedule+0x2c/0x70
[ 2551.528637]  xs_talkv+0x1ec/0x2b0
[ 2551.528642]  ? wait_woken+0x80/0x80
[ 2551.528645]  xs_single+0x53/0x80
[ 2551.528648]  xenbus_transaction_end+0x3b/0x70
[ 2551.528651]  xenbus_file_free+0x5a/0x160
[ 2551.528654]  xenbus_dev_queue_reply+0xc4/0x220
[ 2551.528657]  xenbus_thread+0x7de/0x880
[ 2551.528660]  ? wait_woken+0x80/0x80
[ 2551.528665]  kthread+0x121/0x140
[ 2551.528667]  ? xb_read+0x1d0/0x1d0
[ 2551.528670]  ? kthread_park+0x90/0x90
[ 2551.528673]  ret_from_fork+0x35/0x40

Fix this by doing the cleanup via a workqueue instead.

Reported-by: James Dingwall <james@dingwall.me.uk>
Fixes: fd8aa9095a ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-11 18:21:06 +02:00
YueHaibing
e72e6ba17a xen/pciback: remove set but not used variable 'old_state'
[ Upstream commit 09e088a490 ]

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/xen/xen-pciback/conf_space_capability.c: In function pm_ctrl_write:
drivers/xen/xen-pciback/conf_space_capability.c:119:25: warning:
 variable old_state set but not used [-Wunused-but-set-variable]

It is never used so can be removed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-25 10:47:52 +02:00
Juergen Gross
04fdca1f2f xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
commit 50f6393f96 upstream.

The condition in xen_swiotlb_free_coherent() for deciding whether to
call xen_destroy_contiguous_region() is wrong: in case the region to
be freed is not contiguous calling xen_destroy_contiguous_region() is
the wrong thing to do: it would result in inconsistent mappings of
multiple PFNs to the same MFN. This will lead to various strange
crashes or data corruption.

Instead of calling xen_destroy_contiguous_region() in that case a
warning should be issued as that situation should never occur.

Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-06 19:06:55 +02:00
Juergen Gross
007e5aaf28 xen/events: fix binding user event channels to cpus
commit bce5963bcb upstream.

When binding an interdomain event channel to a vcpu via
IOCTL_EVTCHN_BIND_INTERDOMAIN not only the event channel needs to be
bound, but the affinity of the associated IRQi must be changed, too.
Otherwise the IRQ and the event channel won't be moved to another vcpu
in case the original vcpu they were bound to is going offline.

Cc: <stable@vger.kernel.org> # 4.13
Fixes: c48f64ab47 ("xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26 09:14:25 +02:00
Juergen Gross
e73db09669 xen: let alloc_xenballooned_pages() fail if not enough memory free
commit a1078e821b upstream.

Instead of trying to allocate pages with GFP_USER in
add_ballooned_pages() check the available free memory via
si_mem_available(). GFP_USER is far less limiting memory exhaustion
than the test via si_mem_available().

This will avoid dom0 running out of memory due to excessive foreign
page mappings especially on ARM and on x86 in PVH mode, as those don't
have a pre-ballooned area which can be used for foreign mappings.

As the normal ballooning suffers from the same problem don't balloon
down more than si_mem_available() pages in one iteration. At the same
time limit the default maximum number of retries.

This is part of XSA-300.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-26 09:14:18 +02:00
Ross Lagerwall
4acce74428 xenbus: Avoid deadlock during suspend due to open transactions
[ Upstream commit d10e0cc113 ]

During a suspend/resume, the xenwatch thread waits for all outstanding
xenstore requests and transactions to complete. This does not work
correctly for transactions started by userspace because it waits for
them to complete after freezing userspace threads which means the
transactions have no way of completing, resulting in a deadlock. This is
trivial to reproduce by running this script and then suspending the VM:

    import pyxs, time
    c = pyxs.client.Client(xen_bus_path="/dev/xen/xenbus")
    c.connect()
    c.transaction()
    time.sleep(3600)

Even if this deadlock were resolved, misbehaving userspace should not
prevent a VM from being migrated. So, instead of waiting for these
transactions to complete before suspending, store the current generation
id for each transaction when it is started. The global generation id is
incremented during resume. If the caller commits the transaction and the
generation id does not match the current generation id, return EAGAIN so
that they try again. If the transaction was instead discarded, return OK
since no changes were made anyway.

This only affects users of the xenbus file interface. In-kernel users of
xenbus are assumed to be well-behaved and complete all transactions
before freezing.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:15:19 +02:00
YueHaibing
66f33b2bd2 xen/pvcalls: Remove set but not used variable
[ Upstream commit 41349672e3 ]

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/xen/pvcalls-front.c: In function pvcalls_front_sendmsg:
drivers/xen/pvcalls-front.c:543:25: warning: variable bedata set but not used [-Wunused-but-set-variable]
drivers/xen/pvcalls-front.c: In function pvcalls_front_recvmsg:
drivers/xen/pvcalls-front.c:638:25: warning: variable bedata set but not used [-Wunused-but-set-variable]

They are never used since introduction.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-06-22 08:15:19 +02:00
Konrad Rzeszutek Wilk
99dcf4a4dd xen/pciback: Don't disable PCI_COMMAND on PCI device reset.
commit 7681f31ec9 upstream.

There is no need for this at all. Worst it means that if
the guest tries to write to BARs it could lead (on certain
platforms) to PCI SERR errors.

Please note that with af6fc858a3
"xen-pciback: limit guest control of command register"
a guest is still allowed to enable those control bits (safely), but
is not allowed to disable them and that therefore a well behaved
frontend which enables things before using them will still
function correctly.

This is done via an write to the configuration register 0x4 which
triggers on the backend side:
command_write
  \- pci_enable_device
     \- pci_enable_device_flags
        \- do_pci_enable_device
           \- pcibios_enable_device
              \-pci_enable_resourcess
                [which enables the PCI_COMMAND_MEMORY|PCI_COMMAND_IO]

However guests (and drivers) which don't do this could cause
problems, including the security issues which XSA-120 sought
to address.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-04 08:02:34 +02:00
Kirill Smelkov
04b4d5f75a fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock
[ Upstream commit 10dce8af34 ]

Commit 9c225f2655 ("vfs: atomic f_pos accesses as per POSIX") added
locking for file.f_pos access and in particular made concurrent read and
write not possible - now both those functions take f_pos lock for the
whole run, and so if e.g. a read is blocked waiting for data, write will
deadlock waiting for that read to complete.

This caused regression for stream-like files where previously read and
write could run simultaneously, but after that patch could not do so
anymore. See e.g. commit 581d21a2d0 ("xenbus: fix deadlock on writes
to /proc/xen/xenbus") which fixes such regression for particular case of
/proc/xen/xenbus.

The patch that added f_pos lock in 2014 did so to guarantee POSIX thread
safety for read/write/lseek and added the locking to file descriptors of
all regular files. In 2014 that thread-safety problem was not new as it
was already discussed earlier in 2006.

However even though 2006'th version of Linus's patch was adding f_pos
locking "only for files that are marked seekable with FMODE_LSEEK (thus
avoiding the stream-like objects like pipes and sockets)", the 2014
version - the one that actually made it into the tree as 9c225f2655 -
is doing so irregardless of whether a file is seekable or not.

See

    https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/
    https://lwn.net/Articles/180387
    https://lwn.net/Articles/180396

for historic context.

The reason that it did so is, probably, that there are many files that
are marked non-seekable, but e.g. their read implementation actually
depends on knowing current position to correctly handle the read. Some
examples:

	kernel/power/user.c		snapshot_read
	fs/debugfs/file.c		u32_array_read
	fs/fuse/control.c		fuse_conn_waiting_read + ...
	drivers/hwmon/asus_atk0110.c	atk_debugfs_ggrp_read
	arch/s390/hypfs/inode.c		hypfs_read_iter
	...

Despite that, many nonseekable_open users implement read and write with
pure stream semantics - they don't depend on passed ppos at all. And for
those cases where read could wait for something inside, it creates a
situation similar to xenbus - the write could be never made to go until
read is done, and read is waiting for some, potentially external, event,
for potentially unbounded time -> deadlock.

Besides xenbus, there are 14 such places in the kernel that I've found
with semantic patch (see below):

	drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write()
	drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write()
	drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write()
	drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write()
	net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write()
	drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write()
	drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write()
	drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write()
	net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write()
	drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write()
	drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write()
	drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write()
	drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write()
	drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write()

In addition to the cases above another regression caused by f_pos
locking is that now FUSE filesystems that implement open with
FOPEN_NONSEEKABLE flag, can no longer implement bidirectional
stream-like files - for the same reason as above e.g. read can deadlock
write locking on file.f_pos in the kernel.

FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990f7 ("fuse:
implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp
in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and
write routines not depending on current position at all, and with both
read and write being potentially blocking operations:

See

    https://github.com/libfuse/osspd
    https://lwn.net/Articles/308445

    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406
    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477
    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510

Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as
"somewhat pipe-like files ..." with read handler not using offset.
However that test implements only read without write and cannot exercise
the deadlock scenario:

    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131
    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163
    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216

I've actually hit the read vs write deadlock for real while implementing
my FUSE filesystem where there is /head/watch file, for which open
creates separate bidirectional socket-like stream in between filesystem
and its user with both read and write being later performed
simultaneously. And there it is semantically not easy to split the
stream into two separate read-only and write-only channels:

    https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169

Let's fix this regression. The plan is:

1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS -
   doing so would break many in-kernel nonseekable_open users which
   actually use ppos in read/write handlers.

2. Add stream_open() to kernel to open stream-like non-seekable file
   descriptors. Read and write on such file descriptors would never use
   nor change ppos. And with that property on stream-like files read and
   write will be running without taking f_pos lock - i.e. read and write
   could be running simultaneously.

3. With semantic patch search and convert to stream_open all in-kernel
   nonseekable_open users for which read and write actually do not
   depend on ppos and where there is no other methods in file_operations
   which assume @offset access.

4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via
   steam_open if that bit is present in filesystem open reply.

   It was tempting to change fs/fuse/ open handler to use stream_open
   instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but
   grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
   and in particular GVFS which actually uses offset in its read and
   write handlers

	https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
	https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481

   so if we would do such a change it will break a real user.

5. Add stream_open and FOPEN_STREAM handling to stable kernels starting
   from v3.14+ (the kernel where 9c225f2655 first appeared).

   This will allow to patch OSSPD and other FUSE filesystems that
   provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE
   in their open handler and this way avoid the deadlock on all kernel
   versions. This should work because fs/fuse/ ignores unknown open
   flags returned from a filesystem and so passing FOPEN_STREAM to a
   kernel that is not aware of this flag cannot hurt. In turn the kernel
   that is not aware of FOPEN_STREAM will be < v3.14 where just
   FOPEN_NONSEEKABLE is sufficient to implement streams without read vs
   write deadlock.

This patch adds stream_open, converts /proc/xen/xenbus to it and adds
semantic patch to automatically locate in-kernel places that are either
required to be converted due to read vs write deadlock, or that are just
safe to be converted because read and write do not use ppos and there
are no other funky methods in file_operations.

Regarding semantic patch I've verified each generated change manually -
that it is correct to convert - and each other nonseekable_open instance
left - that it is either not correct to convert there, or that it is not
converted due to current stream_open.cocci limitations.

The script also does not convert files that should be valid to convert,
but that currently have .llseek = noop_llseek or generic_file_llseek for
unknown reason despite file being opened with nonseekable_open (e.g.
drivers/input/mousedev.c)

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yongzhi Pan <panyongzhi@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Nikolaus Rath <Nikolaus@rath.org>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-08 07:21:51 +02:00
Oleksandr Andrushchenko
7273c2b1e7 xen/gntdev: Do not destroy context while dma-bufs are in use
[ Upstream commit fa13e665e0 ]

If there are exported DMA buffers which are still in use and
grant device is closed by either normal user-space close or by
a signal this leads to the grant device context to be destroyed,
thus making it not possible to correctly destroy those exported
buffers when they are returned back to gntdev and makes the module
crash:

[  339.617540] [<ffff00000854c0d8>] dmabuf_exp_ops_release+0x40/0xa8
[  339.617560] [<ffff00000867a6e8>] dma_buf_release+0x60/0x190
[  339.617577] [<ffff0000082211f0>] __fput+0x88/0x1d0
[  339.617589] [<ffff000008221394>] ____fput+0xc/0x18
[  339.617607] [<ffff0000080ed4e4>] task_work_run+0x9c/0xc0
[  339.617622] [<ffff000008089714>] do_notify_resume+0xfc/0x108

Fix this by referencing gntdev on each DMA buffer export and
unreferencing on buffer release.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Boris Ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05 22:33:06 +02:00
Wen Yang
642e26628c pvcalls-front: fix potential null dereference
[ Upstream commit b471109806 ]

 static checker warning:
    drivers/xen/pvcalls-front.c:373 alloc_active_ring()
    error: we previously assumed 'map->active.ring' could be null
           (see line 357)

drivers/xen/pvcalls-front.c
    351 static int alloc_active_ring(struct sock_mapping *map)
    352 {
    353     void *bytes;
    354
    355     map->active.ring = (struct pvcalls_data_intf *)
    356         get_zeroed_page(GFP_KERNEL);
    357     if (!map->active.ring)
                    ^^^^^^^^^^^^^^^^^
Check

    358         goto out;
    359
    360     map->active.ring->ring_order = PVCALLS_RING_ORDER;
    361     bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
    362                     PVCALLS_RING_ORDER);
    363     if (!bytes)
    364         goto out;
    365
    366     map->active.data.in = bytes;
    367     map->active.data.out = bytes +
    368         XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER);
    369
    370     return 0;
    371
    372 out:
--> 373     free_active_ring(map);
                                 ^^^
Add null check on map->active.ring before dereferencing it to avoid
any NULL pointer dereferences.

Fixes: 9f51c05dc4 ("pvcalls-front: Avoid get_free_pages(GFP_KERNEL) under spinlock")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Juergen Gross <jgross@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Dan Carpenter <dan.carpenter@oracle.com>
CC: xen-devel@lists.xenproject.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-27 10:08:56 +01:00
Wen Yang
06b919a517 pvcalls-front: Avoid get_free_pages(GFP_KERNEL) under spinlock
[ Upstream commit 9f51c05dc4 ]

The problem is that we call this with a spin lock held.
The call tree is:
pvcalls_front_accept() holds bedata->socket_lock.
    -> create_active()
        -> __get_free_pages() uses GFP_KERNEL

The create_active() function is only called from pvcalls_front_accept()
with a spin_lock held, The allocation is not allowed to sleep and
GFP_KERNEL is not sufficient.

This issue was detected by using the Coccinelle software.

v2: Add a function doing the allocations which is called
    outside the lock and passing the allocated data to
    create_active().

v3: Use the matching deallocators i.e., free_page()
    and free_pages(), respectively.

v4: It would be better to pre-populate map (struct sock_mapping),
    rather than introducing one more new struct.

v5: Since allocating the data outside of this call it should also
    be freed outside, when create_active() fails.
    Move kzalloc(sizeof(*map2), GFP_ATOMIC) outside spinlock and
    use GFP_KERNEL instead.

v6: Drop the superfluous calls.

Suggested-by: Juergen Gross <jgross@suse.com>
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Suggested-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
CC: Julia Lawall <julia.lawall@lip6.fr>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Juergen Gross <jgross@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: xen-devel@lists.xenproject.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-27 10:08:54 +01:00
YueHaibing
bc4e383da8 xen/pvcalls: remove set but not used variable 'intf'
[ Upstream commit 1f8ce09b36 ]

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/xen/pvcalls-back.c: In function 'pvcalls_sk_state_change':
drivers/xen/pvcalls-back.c:286:28: warning:
 variable 'intf' set but not used [-Wunused-but-set-variable]

It not used since e6587cdbd7 ("pvcalls-back: set -ENOTCONN in
pvcalls_conn_back_read")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-27 10:08:52 +01:00
Stefano Stabellini
1a4c9c4e01 pvcalls-back: set -ENOTCONN in pvcalls_conn_back_read
[ Upstream commit e6587cdbd7 ]

When a connection is closing we receive on pvcalls_sk_state_change
notification. Instead of setting the connection as closed immediately
(-ENOTCONN), let's read one more time from it: pvcalls_conn_back_read
will set the connection as closed when necessary.

That way, we avoid races between pvcalls_sk_state_change and
pvcalls_back_ioworker.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-27 10:08:51 +01:00
Stefano Stabellini
05ac8a6839 pvcalls-front: properly allocate sk
[ Upstream commit beee1fbe8f ]

Don't use kzalloc: it ends up leaving sk->sk_prot not properly
initialized. Use sk_alloc instead and define our own trivial struct
proto.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-27 10:08:50 +01:00
Stefano Stabellini
81b8519de1 pvcalls-front: don't try to free unallocated rings
[ Upstream commit 96283f9a08 ]

inflight_req_id is 0 when initialized. If inflight_req_id is 0, there is
no accept_map to free. Fix the check in pvcalls_front_release.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-27 10:08:50 +01:00
Stefano Stabellini
9699f7a70e pvcalls-front: read all data before closing the connection
[ Upstream commit b79470b64f ]

When a connection is closing in_error is set to ENOTCONN. There could
still be outstanding data on the ring left by the backend. Before
closing the connection on the frontend side, drain the ring.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-02-27 10:08:50 +01:00
Juergen Gross
4432362af3 xen: Fix x86 sched_clock() interface for xen
commit 867cefb4cb upstream.

Commit f94c8d1169 ("sched/clock, x86/tsc: Rework the x86 'unstable'
sched_clock() interface") broke Xen guest time handling across
migration:

[  187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  187.251137] OOM killer disabled.
[  187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  187.252299] suspending xenstore...
[  187.266987] xen:grant_table: Grant tables using version 1 layout
[18446743811.706476] OOM killer enabled.
[18446743811.706478] Restarting tasks ... done.
[18446743811.720505] Setting capacity to 16777216

Fix that by setting xen_sched_clock_offset at resume time to ensure a
monotonic clock value.

[boris: replaced pr_info() with pr_info_once() in xen_callback_vector()
 to avoid printing with incorrect timestamp during resume (as we
 haven't re-adjusted the clock yet)]

Fixes: f94c8d1169 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
Cc: <stable@vger.kernel.org> # 4.11
Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-22 21:40:32 +01:00
Pan Bian
ff5ac9bd16 pvcalls-front: fixes incorrect error handling
[ Upstream commit 975ef94a02 ]

kfree() is incorrectly used to release the pages allocated by
__get_free_page() and __get_free_pages(). Use the matching deallocators
i.e., free_page() and free_pages(), respectively.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17 09:24:39 +01:00
Igor Druzhinin
a9d79a0751 Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"
[ Upstream commit 123664101a ]

This reverts commit b3cf8528bb.

That commit unintentionally broke Xen balloon memory hotplug with
"hotplug_unpopulated" set to 1. As long as "System RAM" resource
got assigned under a new "Unusable memory" resource in IO/Mem tree
any attempt to online this memory would fail due to general kernel
restrictions on having "System RAM" resources as 1st level only.

The original issue that commit has tried to workaround fa564ad963
("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f,
60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move
and shrink AMD 64-bit window to avoid conflict") which made the
original fix to Xen ballooning unnecessary.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17 09:24:39 +01:00
Srikanth Boddepalli
c1a21086bb xen: xlate_mmu: add missing header to fix 'W=1' warning
[ Upstream commit 72791ac854 ]

Add a missing header otherwise compiler warns about missed prototype:

drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes]
  int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
      ^~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Srikanth Boddepalli <boddepalli.srikanth@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17 09:24:39 +01:00
Liam Merwick
dd9392292e xen/grant-table: Fix incorrect gnttab_dma_free_pages() pr_debug message
[ Upstream commit d9cccfa7c4 ]

If a call to xenmem_reservation_increase() in gnttab_dma_free_pages()
fails it triggers a message "Failed to decrease reservation..." which
should be "Failed to increase reservation..."

Fixes: 9bdc7304f5 ('xen/grant-table: Allow allocating buffers suitable for DMA')
Reported-by: Ross Philipson <ross.philipson@oracle.com>
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-27 16:13:04 +01:00
Juergen Gross
1ebdc3d180 xen: remove size limit of privcmd-buf mapping interface
commit 3941552aec upstream.

Currently the size of hypercall buffers allocated via
/dev/xen/hypercall is limited to a default of 64 memory pages. For live
migration of guests this might be too small as the page dirty bitmask
needs to be sized according to the size of the guest. This means
migrating a 8GB sized guest is already exhausting the default buffer
size for the dirty bitmap.

There is no sensible way to set a sane limit, so just remove it
completely. The device node's usage is limited to root anyway, so there
is no additional DOS scenario added by allowing unlimited buffers.

While at it make the error path for the -ENOMEM case a little bit
cleaner by setting n_pages to the number of successfully allocated
pages instead of the target size.

Fixes: c51b3c639e ("xen: add new hypercall buffer mapping device")
Cc: <stable@vger.kernel.org> #4.18
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:08:52 -08:00
Boris Ostrovsky
a3e7530f44 xen/balloon: Support xend-based toolstack
commit 3aa6c19d2f upstream.

Xend-based toolstacks don't have static-max entry in xenstore. The
equivalent node for those toolstacks is memory_static_max.

Fixes: 5266b8e444 (xen: fix booting ballooned down hvm guest)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org> # 4.13
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:08:40 -08:00
Joe Jin
73afce2ebf xen-swiotlb: use actually allocated size on check physical continuous
commit 7250f422da upstream.

xen_swiotlb_{alloc,free}_coherent() allocate/free memory based on the
order of the pages and not size argument (bytes). This is inconsistent with
range_straddles_page_boundary and memset which use the 'size' value,
which may lead to not exchanging memory with Xen (range_straddles_page_boundary()
returned true). And then the call to xen_swiotlb_free_coherent() would
actually try to exchange the memory with Xen, leading to the kernel
hitting an BUG (as the hypercall returned an error).

This patch fixes it by making the 'size' variable be of the same size
as the amount of memory allocated.

CC: stable@vger.kernel.org
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christoph Helwig <hch@lst.de>
Cc: Dongli Zhang <dongli.zhang@oracle.com>
Cc: John Sobecki <john.sobecki@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:08:40 -08:00
Juergen Gross
d59f532480 xen: issue warning message when out of grant maptrack entries
When a driver domain (e.g. dom0) is running out of maptrack entries it
can't map any more foreign domain pages. Instead of silently stalling
the affected domUs issue a rate limited warning in this case in order
to make it easier to detect that situation.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-19 11:27:42 -04:00
Michal Hocko
58a5756990 xen/gntdev: fix up blockable calls to mn_invl_range_start
Patch series "mmu_notifiers follow ups".

Tetsuo has noticed some fallouts from 93065ac753 ("mm, oom: distinguish
blockable mode for mmu notifiers").  One of them has been fixed and picked
up by AMD/DRM maintainer [1].  XEN issue is fixed by patch 1.  I have also
clarified expectations about blockable semantic of invalidate_range_end.
Finally the last patch removes MMU_INVALIDATE_DOES_NOT_BLOCK which is no
longer used nor needed.

[1] http://lkml.kernel.org/r/20180824135257.GU29735@dhcp22.suse.cz

This patch (of 3):

93065ac753 ("mm, oom: distinguish blockable mode for mmu notifiers") has
introduced blockable parameter to all mmu_notifiers and the notifier has
to back off when called in !blockable case and it could block down the
road.

The above commit implemented that for mn_invl_range_start but both
in_range checks are done unconditionally regardless of the blockable mode
and as such they would fail all the time for regular calls.  Fix this by
checking blockable parameter as well.

Once we are there we can remove the stale TODO.  The lock has to be
sleepable because we wait for completion down in gnttab_unmap_refs_sync.

Link: http://lkml.kernel.org/r/20180827112623.8992-2-mhocko@kernel.org
Fixes: 93065ac753 ("mm, oom: distinguish blockable mode for mmu notifiers")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14 08:52:30 -04:00
Josh Abraham
4dca864b59 xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage
This patch removes duplicate macro useage in events_base.c.

It also fixes gcc warning:
variable ‘col’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Joshua Abraham <j.abraham1776@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14 08:51:10 -04:00
Olaf Hering
3366cdb6d3 xen: avoid crash in disable_hotplug_cpu
The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:

BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
RIP: e030:device_offline+0x9/0xb0
Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
Call Trace:
 handle_vcpu_hotplug_event+0xb5/0xc0
 xenwatch_thread+0x80/0x140
 ? wait_woken+0x80/0x80
 kthread+0x112/0x130
 ? kthread_create_worker_on_cpu+0x40/0x40
 ret_from_fork+0x3a/0x50

This happens because handle_vcpu_hotplug_event is called twice. In the
first iteration cpu_present is still true, in the second iteration
cpu_present is false which causes get_cpu_device to return NULL.
In case of cpu#0, cpu_online is apparently always true.

Fix this crash by checking if the cpu can be hotplugged, which is false
for a cpu that was just removed.

Also check if the cpu was actually offlined by device_remove, otherwise
leave the cpu_present state as it is.

Rearrange to code to do all work with device_hotplug_lock held.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14 08:51:10 -04:00
Marek Marczykowski-Górecki
197ecb3802 xen/balloon: add runtime control for scrubbing ballooned out pages
Scrubbing pages on initial balloon down can take some time, especially
in nested virtualization case (nested EPT is slow). When HVM/PVH guest is
started with memory= significantly lower than maxmem=, all the extra
pages will be scrubbed before returning to Xen. But since most of them
weren't used at all at that point, Xen needs to populate them first
(from populate-on-demand pool). In nested virt case (Xen inside KVM)
this slows down the guest boot by 15-30s with just 1.5GB needed to be
returned to Xen.

Add runtime parameter to enable/disable it, to allow initially disabling
scrubbing, then enable it back during boot (for example in initramfs).
Such usage relies on assumption that a) most pages ballooned out during
initial boot weren't used at all, and b) even if they were, very few
secrets are in the guest at that time (before any serious userspace
kicks in).
Convert CONFIG_XEN_SCRUB_PAGES to CONFIG_XEN_SCRUB_PAGES_DEFAULT (also
enabled by default), controlling default value for the new runtime
switch.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14 08:51:10 -04:00
Vitaly Kuznetsov
87dffe86d4 xen/manage: don't complain about an empty value in control/sysrq node
When guest receives a sysrq request from the host it acknowledges it by
writing '\0' to control/sysrq xenstore node. This, however, make xenstore
watch fire again but xenbus_scanf() fails to parse empty value with "%c"
format string:

 sysrq: SysRq : Emergency Sync
 Emergency Sync complete
 xen:manage: Error -34 reading sysrq code in control/sysrq

Ignore -ERANGE the same way we already ignore -ENOENT, empty value in
control/sysrq is totally legal.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14 08:51:10 -04:00
Linus Torvalds
4290d5b9ca xen: fixes for 4.19-rc2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCW4lM6AAKCRCAXGG7T9hj
 vs8AAQDysFccg97UdopW3B7yklIaRqkfEIAsxe65f191MXsH2AEAp5SKxZqRPqBP
 a9WHDj8ShB3BhZ/IxpdO9Y59U3Jo4wA=
 =Gt4c
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - minor cleanup avoiding a warning when building with new gcc

 - a patch to add a new sysfs node for Xen frontend/backend drivers to
   make it easier to obtain the state of a pv device

 - two fixes for 32-bit pv-guests to avoid intermediate L1TF vulnerable
   PTEs

* tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: remove redundant variable save_pud
  xen: export device state to sysfs
  x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
  x86/xen: don't write ptes directly in 32-bit PV guests
2018-08-31 08:45:16 -07:00
Joe Jin
076e2cedd6 xen: export device state to sysfs
Export device state to sysfs to allow for easier get device state.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-08-28 17:37:40 -04:00
Linus Torvalds
d40acad1f1 xen: fixes and cleanups for 4.19-rc1, second round
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCW36rRgAKCRCAXGG7T9hj
 vkrcAQC8F+ljGO5PtYUkKcMy17vqvcq/BdetJuUVfk+G1WmLxQEAiaNiqqJGsOyJ
 Msa0HHDT31uBYGg/iq7yAWk23tcTZwE=
 =Px4D
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.19b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes and cleanups from Juergen Gross:
 "Some cleanups, some minor fixes and a fix for a bug introduced in this
  merge window hitting 32-bit PV guests"

* tag 'for-linus-4.19b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: enable early use of set_fixmap in 32-bit Xen PV guest
  xen: remove unused hypercall functions
  x86/xen: remove unused function xen_auto_xlated_memory_setup()
  xen/ACPI: don't upload Px/Cx data for disabled processors
  x86/Xen: further refine add_preferred_console() invocations
  xen/mcelog: eliminate redundant setting of interface version
  x86/Xen: mark xen_setup_gdt() __init
2018-08-23 14:52:23 -07:00
Michal Hocko
93065ac753 mm, oom: distinguish blockable mode for mmu notifiers
There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start and that is a problem for the
oom_reaper because it needs to guarantee a forward progress so it cannot
depend on any sleepable locks.

Currently we simply back off and mark an oom victim with blockable mmu
notifiers as done after a short sleep.  That can result in selecting a new
oom victim prematurely because the previous one still hasn't torn its
memory down yet.

We can do much better though.  Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held.  Moreover
majority of notifiers only care about a portion of the address space and
there is absolutely zero reason to fail when we are unmapping an unrelated
range.  Many notifiers do really block and wait for HW which is harder to
handle and we have to bail out though.

This patch handles the low hanging fruit.
__mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
are not allowed to sleep if the flag is set to false.  This is achieved by
using trylock instead of the sleepable lock for most callbacks and
continue as long as we do not block down the call chain.

I think we can improve that even further because there is a common pattern
to do a range lookup first and then do something about that.  The first
part can be done without a sleeping lock in most cases AFAICS.

The oom_reaper end then simply retries if there is at least one notifier
which couldn't make any progress in !blockable mode.  A retry loop is
already implemented to wait for the mmap_sem and this is basically the
same thing.

The simplest way for driver developers to test this code path is to wrap
userspace code which uses these notifiers into a memcg and set the hard
limit to hit the oom.  This can be done e.g.  after the test faults in all
the mmu notifier managed memory and set the hard limit to something really
small.  Then we are looking for a proper process tear down.

[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: minor code simplification]
Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
Reported-by: David Rientjes <rientjes@google.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:44 -07:00
Jan Beulich
166deb0f0b xen/ACPI: don't upload Px/Cx data for disabled processors
This is unnecessary and triggers a warning in the hypervisor.

Often systems have more processor entries in their ACPI tables than are
actually installed/active. The ACPI_STA_DEVICE_PRESENT bit cannot be
reliably used, but the ACPI_MADT_ENABLED bit can. In order to not
introduce new functions in the main ACPI processor driver code, simply
use acpi_get_phys_id(), which does more than we need, but which checks
the MADT enabled bit in the process. Any CPU for which we can't
determine the APIC ID is unlikely to work properly anyway, so the extra
checks done by acpi_get_phys_id() should do no harm.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-08-20 14:46:18 -04:00
Jan Beulich
f76c318c77 xen/mcelog: eliminate redundant setting of interface version
This already gets done in HYPERVISOR_mca().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-08-20 14:46:18 -04:00
Linus Torvalds
72f02ba66b SCSI misc on 20180815
This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx,
 hisi_sas, smartpqi, megaraid_sas, arcmsr.  In addition, with the
 continuing absence of Nic we have target updates for tcmu and target
 core (all with reviews and acks).  The biggest observable change is
 going to be that we're (again) trying to switch to mulitqueue as the
 default (a user can still override the setting on the kernel command
 line).  Other major core stuff is the removal of the remaining
 Microchannel drivers, an update of the internal timers and some
 reworks of completion and result handling.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCW3R3niYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishauRAP4yfBKK
 dbxF81c/Bxi/Stk16FWkOOrjs4CizwmnMcpM5wD/UmM9o6ebDzaYpZgA8wIl7X/N
 o/JckEZZpIp+5NySZNc=
 =ggLB
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx,
  hisi_sas, smartpqi, megaraid_sas, arcmsr.

  In addition, with the continuing absence of Nic we have target updates
  for tcmu and target core (all with reviews and acks).

  The biggest observable change is going to be that we're (again) trying
  to switch to mulitqueue as the default (a user can still override the
  setting on the kernel command line).

  Other major core stuff is the removal of the remaining Microchannel
  drivers, an update of the internal timers and some reworks of
  completion and result handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
  scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue
  scsi: ufs: remove unnecessary query(DM) UPIU trace
  scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done()
  scsi: aacraid: Spelling fix in comment
  scsi: mpt3sas: Fix calltrace observed while running IO & reset
  scsi: aic94xx: fix an error code in aic94xx_init()
  scsi: st: remove redundant pointer STbuffer
  scsi: qla2xxx: Update driver version to 10.00.00.08-k
  scsi: qla2xxx: Migrate NVME N2N handling into state machine
  scsi: qla2xxx: Save frame payload size from ICB
  scsi: qla2xxx: Fix stalled relogin
  scsi: qla2xxx: Fix race between switch cmd completion and timeout
  scsi: qla2xxx: Fix Management Server NPort handle reservation logic
  scsi: qla2xxx: Flush mailbox commands on chip reset
  scsi: qla2xxx: Fix unintended Logout
  scsi: qla2xxx: Fix session state stuck in Get Port DB
  scsi: qla2xxx: Fix redundant fc_rport registration
  scsi: qla2xxx: Silent erroneous message
  scsi: qla2xxx: Prevent sysfs access when chip is down
  scsi: qla2xxx: Add longer window for chip reset
  ...
2018-08-15 22:06:26 -07:00
Linus Torvalds
54dbe75bbf drm pull for 4.19-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbc41pAAoJEAx081l5xIa+ZrAP/AzKj4i4pBLVJcvNZ2BwD+UD
 ZNSNj2iqCJ5+Jo/WtIwQ8tLct9UqfVssUwBke6tZksiLdTigGPTUyVIAdK+9kyWD
 D00m3x/pToJrSF2D0FwxQlPUtPkohp9N+E6+TU7gd1oCasZfBzmcEpoVAmZf+NWE
 kN1xXpmGxZWpu0wc7JA2lv9MuUTijCwIqJqa5E0bB3z06G5mw+PJ89kYzMx19OyA
 ZYQK8y3A40ZGl8UbajZ4xg9pqFCRYFFHGqfYlpUWWTh0XMAXu8+Yqzh3dJxmak7r
 4u2pdQBsxPMZO8qKBHpVvI7Zhoe0Ntnolc0XVD+2IbqqnTprVbQs0bWf3YyfUlQi
 1/9bWFK67W0LEuzac6M7a7EQqFNiHF13Btao7aqENTIe/GaCZJoopaiRMAmh6EHD
 4PezeYqrW8cSaPj6OKouL1BhW9Bjixsg0bvjS/uB6m4KekFCt1++BDFGzkqvm6Mo
 SVW7nkJoCFpCASaR7DhUEOPexaHeJ65HCDDUvYdqz9jd2w1TgvvanEZWual1NwEm
 ImA8A4wGZ/3KijpyyKm0gE96RX7+zMMZ3brW6p1vhUUKVYJCrvSr5jrXH5+2k6Aw
 Y455doGL87IRkwyje/YbQF0I8pbUZD9QS5wII13tLGwOH9/uC/Xl6dHNM40gtqyh
 W4gEdY+NAMJmYLvRNawa
 =g9rD
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2018-08-15' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for 4.19.

  Rob has some new hardware support for new qualcomm hw that I'll send
  along separately. This has the display part of it, the remaining pull
  is for the acceleration engine.

  This also contains a wound-wait/wait-die mutex rework, Peter has acked
  it for merging via my tree.

  Otherwise mostly the usual level of activity. Summary:

  core:
   - Wound-wait/wait-die mutex rework
   - Add writeback connector type
   - Add "content type" property for HDMI
   - Move GEM bo to drm_framebuffer
   - Initial gpu scheduler documentation
   - GPU scheduler fixes for dying processes
   - Console deferred fbcon takeover support
   - Displayport support for CEC tunneling over AUX

  panel:
   - otm8009a panel driver fixes
   - Innolux TV123WAM and G070Y2-L01 panel driver
   - Ilitek ILI9881c panel driver
   - Rocktech RK070ER9427 LCD
   - EDT ETM0700G0EDH6 and EDT ETM0700G0BDH6
   - DLC DLC0700YZG-1
   - BOE HV070WSA-100
   - newhaven, nhd-4.3-480272ef-atxl LCD
   - DataImage SCF0700C48GGU18
   - Sharp LQ035Q7DB03
   - p079zca: Refactor to support multiple panels

  tinydrm:
   - ILI9341 display panel

  New driver:
   - vkms - virtual kms driver to testing.

  i915:
   - Icelake:
        Display enablement
        DSI support
        IRQ support
        Powerwell support
   - GPU reset fixes and improvements
   - Full ppgtt support refactoring
   - PSR fixes and improvements
   - Execlist improvments
   - GuC related fixes

  amdgpu:
   - Initial amdgpu documentation
   - JPEG engine support on VCN
   - CIK uses powerplay by default
   - Move to using core PCIE functionality for gens/lanes
   - DC/Powerplay interface rework
   - Stutter mode support for RV
   - Vega12 Powerplay updates
   - GFXOFF fixes
   - GPUVM fault debugging
   - Vega12 GFXOFF
   - DC improvements
   - DC i2c/aux changes
   - UVD 7.2 fixes
   - Powerplay fixes for Polaris12, CZ/ST
   - command submission bo_list fixes

  amdkfd:
   - Raven support
   - Power management fixes

  udl:
   - Cleanups and fixes

  nouveau:
   - misc fixes and cleanups.

  msm:
   - DPU1 support display controller in sdm845
   - GPU coredump support.

  vmwgfx:
   - Atomic modesetting validation fixes
   - Support for multisample surfaces

  armada:
   - Atomic modesetting support completed.

  exynos:
   - IPPv2 fixes
   - Move g2d to component framework
   - Suspend/resume support cleanups
   - Driver cleanups

  imx:
   - CSI configuration improvements
   - Driver cleanups
   - Use atomic suspend/resume helpers
   - ipu-v3 V4L2 XRGB32/XBGR32 support

  pl111:
   - Add Nomadik LCDC variant

  v3d:
   - GPU scheduler jobs management

  sun4i:
   - R40 display engine support
   - TCON TOP driver

  mediatek:
   - MT2712 SoC support

  rockchip:
   - vop fixes

  omapdrm:
   - Workaround for DRA7 errata i932
   - Fix mm_list locking

  mali-dp:
   - Writeback implementation
        PM improvements
   - Internal error reporting debugfs

  tilcdc:
   - Single fix for deferred probing

  hdlcd:
   - Teardown fixes

  tda998x:
   - Converted to a bridge driver.

  etnaviv:
   - Misc fixes"

* tag 'drm-next-2018-08-15' of git://anongit.freedesktop.org/drm/drm: (1506 commits)
  drm/amdgpu/sriov: give 8s for recover vram under RUNTIME
  drm/scheduler: fix param documentation
  drm/i2c: tda998x: correct PLL divider calculation
  drm/i2c: tda998x: get rid of private fill_modes function
  drm/i2c: tda998x: move mode_valid() to bridge
  drm/i2c: tda998x: register bridge outside of component helper
  drm/i2c: tda998x: cleanup from previous changes
  drm/i2c: tda998x: allocate tda998x_priv inside tda998x_create()
  drm/i2c: tda998x: convert to bridge driver
  drm/scheduler: fix timeout worker setup for out of order job completions
  drm/amd/display: display connected to dp-1 does not light up
  drm/amd/display: update clk for various HDMI color depths
  drm/amd/display: program display clock on cache match
  drm/amd/display: Add NULL check for enabling dp ss
  drm/amd/display: add vbios table check for enabling dp ss
  drm/amd/display: Don't share clk source between DP and HDMI
  drm/amd/display: Fix DP HBR2 Eye Diagram Pattern on Carrizo
  drm/amd/display: Use calculated disp_clk_khz value for dce110
  drm/amd/display: Implement custom degamma lut on dcn
  drm/amd/display: Destroy aux_engines only once
  ...
2018-08-15 17:39:07 -07:00
Roger Pau Monne
3596924a23 xen/balloon: fix balloon initialization for PVH Dom0
The current balloon code tries to calculate a delta factor for the
balloon target when running in HVM mode in order to account for memory
used by the firmware.

This workaround for memory accounting doesn't work properly on a PVH
Dom0, that has a static-max value different from the target value even
at startup. Note that this is not a problem for DomUs because guests are
started with a static-max value that matches the amount of RAM in the
memory map.

Fix this by forcefully setting target_diff for Dom0, regardless of
it's mode.

Reported-by: Gabriel Bercarug <bercarug@amazon.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-08-08 11:03:20 -04:00
Gustavo A. R. Silva
bf06bad958 xen/biomerge: Use true and false for boolean values
Return statements in functions returning bool should use true or false
instead of an integer value.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-08-06 10:20:57 -04:00