Commit Graph

301 Commits

Author SHA1 Message Date
Dexuan Cui ca1c4b7457 Drivers: hv: vmbus: fix init_vp_index() for reloading hv_netvsc
This fixes the recent commit 3b71107d73b16074afa7658f3f0fcf837aabfe24:
Drivers: hv: vmbus: Further improve CPU affiliation logic

Without the fix, reloading hv_netvsc hangs the guest.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-20 22:44:51 -07:00
Vitaly Kuznetsov f39c4280a3 Drivers: hv: vmbus: use cpu_hotplug_enable/disable
Commit e513229b4c ("Drivers: hv: vmbus: prevent cpu offlining on newer
hypervisors") was altering smp_ops.cpu_disable to prevent CPU offlining.
We can bo better by using cpu_hotplug_enable/disable functions instead of
such hard-coding.

Reported-by: Radim Kr.má  <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:46:44 -07:00
Dexuan Cui 042ab0313b Drivers: hv: vmbus: add a sysfs attr to show the binding of channel/VP
This is useful to analyze performance issue.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
K. Y. Srinivasan ca9357bd26 Drivers: hv: vmbus: Implement a clocksource based on the TSC page
The current Hyper-V clock source is based on the per-partition reference counter
and this counter is being accessed via s synthetic MSR - HV_X64_MSR_TIME_REF_COUNT.
Hyper-V has a more efficient way of computing the per-partition reference
counter value that does not involve reading a synthetic MSR. We implement
a time source based on this mechanism.

Tested-by: Vivek Yadav <vyadav@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Viresh Kumar bc609cb47f drivers/hv: Migrate to new 'set-state' interface
Migrate hv driver to the new 'set-state' interface provided by
clockevents core, the earlier 'set-mode' interface is marked obsolete
now.

This also enables us to implement callbacks for new states of clockevent
devices, for example: ONESHOT_STOPPED.

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: devel@linuxdriverproject.org
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Christopher Oo a5cca686ce Drivers: hv_vmbus: Fix signal to host condition
Fixes a bug where previously hv_ringbuffer_read would pass in the old
number of bytes available to read instead of the expected old read index
when calculating when to signal to the host that the ringbuffer is empty.
Since the previous write size is already saved, also changes the
hv_need_to_signal_on_read to use the previously read value rather than
recalculating it.

Signed-off-by: Christopher Oo <t-chriso@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:29 -07:00
Dexuan Cui 3b71107d73 Drivers: hv: vmbus: Further improve CPU affiliation logic
Keep track of CPU affiliations of sub-channels within the scope of the primary
channel. This will allow us to better distribute the load amongst available
CPUs.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:44:28 -07:00
K. Y. Srinivasan 9f01ec5345 Drivers: hv: vmbus: Improve the CPU affiliation for channels
The current code tracks the assigned CPUs within a NUMA node in the context of
the primary channel. So, if we have a VM with a single NUMA node with 8 VCPUs, we may
end up unevenly distributing the channel load. Fix the issue by tracking affiliations
globally.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:31 -07:00
Jake Oshins 3546448338 drivers:hv: Move MMIO range picking from hyper_fb to hv_vmbus
This patch deletes the logic from hyperv_fb which picked a range of MMIO space
for the frame buffer and adds new logic to hv_vmbus which picks ranges for
child drivers.  The new logic isn't quite the same as the old, as it considers
more possible ranges.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:31 -07:00
Jake Oshins 7f163a6fd9 drivers:hv: Modify hv_vmbus to search for all MMIO ranges available.
This patch changes the logic in hv_vmbus to record all of the ranges in the
VM's firmware (BIOS or UEFI) that offer regions of memory-mapped I/O space for
use by paravirtual front-end drivers.  The old logic just found one range
above 4GB and called it good.  This logic will find any ranges above 1MB.

It would have been possible with this patch to just use existing resource
allocation functions, rather than keep track of the entire set of Hyper-V
related MMIO regions in VMBus.  This strategy, however, is not sufficient
when the resource allocator needs to be aware of the constraints of a
Hyper-V virtual machine, which is what happens in the next patch in the series.
So this first patch exists to show the first steps in reworking the MMIO
allocation paths for Hyper-V front-end drivers.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 11:41:30 -07:00
K. Y. Srinivasan 379e4f756b Drivers: hv: vmbus: Consider ND NIC in binding channels to CPUs
We cycle through all the "high performance" channels to distribute
load across the available CPUs. Process the NetworkDirect as a
high performance device.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:30:44 -07:00
Denis V. Lunev cc2dd4027a mshyperv: fix recognition of Hyper-V guest crash MSR's
Hypervisor Top Level Functional Specification v3.1/4.0 notes that cpuid
(0x40000003) EDX's 10th bit should be used to check that Hyper-V guest
crash MSR's functionality available.

This patch should fix this recognition. Currently the code checks EAX
register instead of EDX.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:30:44 -07:00
Vitaly Kuznetsov 4a54243fc0 Drivers: hv: vmbus: don't send CHANNELMSG_UNLOAD on pre-Win2012R2 hosts
Pre-Win2012R2 hosts don't properly handle CHANNELMSG_UNLOAD and
wait_for_completion() hangs. Avoid sending such request on old hosts.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
Nik Nyby e26009aad0 Drivers: hv: vmbus: fix typo in hv_port_info struct
This fixes a typo: base_flag_bumber to base_flag_number

Signed-off-by: Nik Nyby <nikolas@gnu.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
Dan Carpenter 9dd6a06430 hv: util: checking the wrong variable
We don't catch this allocation failure because there is a typo and we
check the wrong variable.

Fixes: 14b50f80c3 ('Drivers: hv: util: introduce hv_utils_transport abstraction')

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:29:45 -07:00
K. Y. Srinivasan b81658cf5d Drivers: hv: vmbus: Permit sending of packets without payload
The guest may have to send a completion packet back to the host.
To support this usage, permit sending a packet without a payload -
we would be only sending the descriptor in this case.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:39 -07:00
Alex Ng b6ddeae160 Drivers: hv: balloon: Enable dynamic memory protocol negotiation with Windows 10 hosts
Support Win10 protocol for Dynamic Memory. Thia patch allows guests on Win10 hosts
to hot-add memory even when dynamic memory is not enabled on the guest.

Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:39 -07:00
Vitaly Kuznetsov 25ef06fe27 Drivers: hv: fcopy: dynamically allocate smsg_out in fcopy_send_data()
struct hv_start_fcopy is too big to be on stack on i386, the following
warning is reported:

>> drivers/hv/hv_fcopy.c:159:1: warning: the frame size of 1088 bytes is larger than 1024 bytes [-Wframe-larger-than=]

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov b36fda3397 Drivers: hv: kvp: check kzalloc return value
kzalloc() return value check was accidentally lost in 11bc3a5fa91f:
"Drivers: hv: kvp: convert to hv_utils_transport" commit.

We don't need to reset kvp_transaction.state here as we have the
kvp_timeout_func() timeout function and in case we're in OOM situation
it is preferable to wait.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov 510f7aef65 Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'
current_pt_regs() sometimes returns regs of the userspace process and in
case of a kernel crash this is not what we need to report. E.g. when we
trigger crash with sysrq we see the following:
...
 RIP: 0010:[<ffffffff815b8696>]  [<ffffffff815b8696>] sysrq_handle_crash+0x16/0x20
 RSP: 0018:ffff8800db0a7d88  EFLAGS: 00010246
 RAX: 000000000000000f RBX: ffffffff820a0660 RCX: 0000000000000000
...
at the same time current_pt_regs() give us:
ip=7f899ea7e9e0, ax=ffffffffffffffda, bx=26c81a0, cx=7f899ea7e9e0, ...
These registers come from the userspace process triggered the crash. As we
don't even know which process it was this information is rather useless.

When kernel crash happens through 'die' proper regs are being passed to
all receivers on the die_chain (and panic_notifier_list is being notified
with the string passed to panic() only). If panic() is called manually
(e.g. on BUG()) we won't get 'die' notification so keep the 'panic'
notification reporter as well but guard against double reporting.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov b4370df2b1 Drivers: hv: vmbus: add special crash handler
Full kernel hang is observed when kdump kernel starts after a crash. This
hang happens in vmbus_negotiate_version() function on
wait_for_completion() as Hyper-V host (Win2012R2 in my testing) never
responds to CHANNELMSG_INITIATE_CONTACT as it thinks the connection is
already established. We need to perform some mandatory minimalistic
cleanup before we start new kernel.

Reported-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov d7646eaa76 Drivers: hv: don't do hypercalls when hypercall_page is NULL
At the very late stage of kexec a driver (which are not being unloaded) can
try to post a message or signal an event. This will crash the kernel as we
already did hv_cleanup() and the hypercall page is NULL.

Move all common (between 32 and 64 bit code) declarations to the beginning
of the do_hypercall() function. Unfortunately we have to write the
!hypercall_page check twice to not mix declarations and code.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:29 -07:00
Vitaly Kuznetsov 2517281d63 Drivers: hv: vmbus: add special kexec handler
When general-purpose kexec (not kdump) is being performed in Hyper-V guest
the newly booted kernel fails with an MCE error coming from the host. It
is the same error which was fixed in the "Drivers: hv: vmbus: Implement
the protocol for tearing down vmbus state" commit - monitor pages remain
special and when they're being written to (as the new kernel doesn't know
these pages are special) bad things happen. We need to perform some
minimalistic cleanup before booting a new kernel on kexec. To do so we
need to register a special machine_ops.shutdown handler to be executed
before the native_machine_shutdown(). Registering a shutdown notification
handler via the register_reboot_notifier() call is not sufficient as it
happens to early for our purposes. machine_ops is not being exported to
modules (and I don't think we want to export it) so let's do this in
mshyperv.c

The minimalistic cleanup consists of cleaning up clockevents, synic MSRs,
guest os id MSR, and hypercall MSR.

Kdump doesn't require all this stuff as it lives in a separate memory
space.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:29 -07:00
Vitaly Kuznetsov 06210b42f3 Drivers: hv: vmbus: remove hv_synic_free_cpu() call from hv_synic_cleanup()
We already have hv_synic_free() which frees all per-cpu pages for all
CPUs, let's remove the hv_synic_free_cpu() call from hv_synic_cleanup()
so it will be possible to do separate cleanup (writing to MSRs) and final
freeing. This is going to be used to assist kexec.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-04 22:25:28 -07:00
K. Y. Srinivasan 294409d205 Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion
Allocate ring buffer memory from the NUMA node assigned to the channel.
Since this is a performance and not a correctness issue, if the node specific
allocation were to fail, fall back and allocate without specifying the node.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-12 16:58:33 -07:00
K. Y. Srinivasan 1f656ff3fd Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels
Channels/sub-channels can be affinitized to VCPUs in the guest. Implement
this affinity in a way that is NUMA aware. The current protocol distributed
the primary channels uniformly across all available CPUs. The new protocol
is NUMA aware: primary channels are distributed across the available NUMA
nodes while the sub-channels within a primary channel are distributed amongst
CPUs within the NUMA node assigned to the primary channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 10:56:31 +09:00
K. Y. Srinivasan 9c6e64adf2 Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0
Map target_cpu to target_vcpu using the mapping table.
We should use the mapping table to transform guest CPU ID to VP Index
as is done for the non-performance critical channels.
While the value CPU 0 is special and will
map to VP index 0, it is good to be consistent.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 10:56:31 +09:00
Vitaly Kuznetsov 4e4bd36f97 Drivers: hv: balloon: check if ha_region_mutex was acquired in MEM_CANCEL_ONLINE case
Memory notifiers are being executed in a sequential order and when one of
them fails returning something different from NOTIFY_OK the remainder of
the notification chain is not being executed. When a memory block is being
onlined in online_pages() we do memory_notify(MEM_GOING_ONLINE, ) and if
one of the notifiers in the chain fails we end up doing
memory_notify(MEM_CANCEL_ONLINE, ) so it is possible for a notifier to see
MEM_CANCEL_ONLINE without seeing the corresponding MEM_GOING_ONLINE event.
E.g. when CONFIG_KASAN is enabled the kasan_mem_notifier() is being used
to prevent memory hotplug, it returns NOTIFY_BAD for all MEM_GOING_ONLINE
events. As kasan_mem_notifier() comes before the hv_memory_notifier() in
the notification chain we don't see the MEM_GOING_ONLINE event and we do
not take the ha_region_mutex. We, however, see the MEM_CANCEL_ONLINE event
and unconditionally try to release the lock, the following is observed:

[  110.850927] =====================================
[  110.850927] [ BUG: bad unlock balance detected! ]
[  110.850927] 4.1.0-rc3_bugxxxxxxx_test_xxxx #595 Not tainted
[  110.850927] -------------------------------------
[  110.850927] systemd-udevd/920 is trying to release lock
(&dm_device.ha_region_mutex) at:
[  110.850927] [<ffffffff81acda0e>] mutex_unlock+0xe/0x10
[  110.850927] but there are no more locks to release!

At the same time we can have the ha_region_mutex taken when we get the
MEM_CANCEL_ONLINE event in case one of the memory notifiers after the
hv_memory_notifier() in the notification chain failed so we need to add
the mutex_is_locked() check. In case of MEM_ONLINE we are always supposed
to have the mutex locked.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 06:38:21 +09:00
Keith Mange 6c4e5f9c9f Drivers: hv: vmbus:Update preferred vmbus protocol version to windows 10.
Add support for Windows 10.

Signed-off-by: Keith Mange <keith.mange@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-06-01 06:38:21 +09:00
Vitaly Kuznetsov ce59fec836 Drivers: hv: vmbus: distribute subchannels among all vcpus
Primary channels are distributed evenly across all vcpus we have. When the host
asks us to create subchannels it usually makes us num_cpus-1 offers and we are
supposed to distribute the work evenly among the channel itself and all its
subchannels. Make sure they are all assigned to different vcpus.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov f38e7dd723 Drivers: hv: vmbus: move init_vp_index() call to vmbus_process_offer()
We need to call init_vp_index() after we added the channel to the appropriate
list (global or subchannel) to be able to use this information when assigning
the channel to the particular vcpu. To do so we need to move a couple of
functions around. The only real change is the init_vp_index() call. This is a
small refactoring without a functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov 357e836a60 Drivers: hv: vmbus: decrease num_sc on subchannel removal
It is unlikely that that host will ask us to close only one subchannel for a
device but let's be consistent. Do both num_sc++ and num_sc-- with
channel->lock to be on the safe side.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov 8dfd332674 Drivers: hv: vmbus: unify calls to percpu_channel_enq()
Remove some code duplication, no functional change intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov 1959a28e26 Drivers: hv: vmbus: kill tasklets on module unload
Explicitly kill tasklets we create on module unload.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov ffc151f3c8 Drivers: hv: vmbus: do cleanup on all vmbus_open() failure paths
In case there was an error reported in the response to the CHANNELMSG_OPENCHANNEL
call we need to do the cleanup as a vmbus_open() user won't be doing it after
receiving an error. The cleanup should be done on all failure paths. We also need
to avoid returning open_info->response.open_result.status as the return value as
all other errors we return from vmbus_open() are -EXXX and vmbus_open() callers
are not supposed to analyze host error codes.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:19:00 -07:00
K. Y. Srinivasan 2db84eff12 Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state
Implement the protocol for tearing down the monitor state established with
the host.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:18:24 -07:00
Dexuan Cui 813c5b7958 hv: vmbus_free_channels(): remove the redundant free_channel()
free_channel() has been invoked in
vmbus_remove() -> hv_process_channel_removal(), or vmbus_remove() ->
... -> vmbus_close_internal() -> hv_process_channel_removal().

We also change to use list_for_each_entry_safe(), because the entry
is removed in hv_process_channel_removal().

This patch fixes a bug in the vmbus unload path.

Thank Dan Carpenter for finding the issue!

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov 096c605feb Drivers: hv: vmbus: unregister panic notifier on module unload
Commit 96c1d0581d ("Drivers: hv: vmbus: Add
support for VMBus panic notifier handler") introduced
atomic_notifier_chain_register() call on module load. We also need to call
atomic_notifier_chain_unregister() on module unload as otherwise the following
crash is observed when we bring hv_vmbus back:

[   39.788877] BUG: unable to handle kernel paging request at ffffffffa00078a8
[   39.788877] IP: [<ffffffff8109d63f>] notifier_call_chain+0x3f/0x80
...
[   39.788877] Call Trace:
[   39.788877]  [<ffffffff8109de7d>] __atomic_notifier_call_chain+0x5d/0x90
...
[   39.788877]  [<ffffffff8109d788>] ? atomic_notifier_chain_register+0x38/0x70
[   39.788877]  [<ffffffff8109d767>] ? atomic_notifier_chain_register+0x17/0x70
[   39.788877]  [<ffffffffa002814f>] hv_acpi_init+0x14f/0x1000 [hv_vmbus]
[   39.788877]  [<ffffffff81002144>] do_one_initcall+0xd4/0x210

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov e4ecb41cf9 Drivers: hv: vmbus: introduce vmbus_acpi_remove
In case we do request_resource() in vmbus_acpi_add() we need to tear it down
to be able to load the driver again. Otherwise the following crash in observed
when hv_vmbus unload/load sequence is performed on a Generation2 instance:

[   38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0
[   38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
[   38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0
[   38.166315] Oops: 0000 [#1] SMP
[   38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus]
[   38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-rc5_bug923184+ #486
[   38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
[   38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti: ffff88003f60c000
[   38.166315] RIP: 0010:[<ffffffff8107dc5f>]  [<ffffffff8107dc5f>] __request_resource+0x2f/0x50
[   38.166315] RSP: 0018:ffff88003f60fb58  EFLAGS: 00010286

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov 7c95912758 Drivers: hv: utils: unify driver registration reporting
Unify driver registration reporting and move it to debug level as normally daemons write to syslog themselves
and these kernel messages are useless.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:42 -07:00
Vitaly Kuznetsov a4d1ee5b02 Drivers: hv: fcopy: full handshake support
Introduce FCOPY_VERSION_1 to support kernel replying to the negotiation
message with its own version.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:42 -07:00
Vitaly Kuznetsov cd8dc05485 Drivers: hv: vss: full handshake support
Introduce VSS_OP_REGISTER1 to support kernel replying to the negotiation
message with its own version.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 11bc3a5fa9 Drivers: hv: kvp: convert to hv_utils_transport
Convert to hv_utils_transport to support both netlink and /dev/vmbus/hv_kvp communication methods.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov c7e490fc23 Drivers: hv: fcopy: convert to hv_utils_transport
Unify the code with the recently introduced hv_utils_transport. Netlink
communication is disabled for fcopy.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 6472f80a2e Drivers: hv: vss: convert to hv_utils_transport
Convert to hv_utils_transport to support both netlink and /dev/vmbus/hv_vss communication methods.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 14b50f80c3 Drivers: hv: util: introduce hv_utils_transport abstraction
The intention is to make KVP/VSS drivers work through misc char devices.
Introduce an abstraction for kernel/userspace communication to make the
migration smoother. Transport operational mode (netlink or char device)
is determined by the first received message. To support driver upgrades
the switch from netlink to chardev operational mode is supported.

Every hv_util daemon is supposed to register 2 callbacks:
1) on_msg() to get notified when the userspace daemon sent a message;
2) on_reset() to get notified when the userspace daemon drops the connection.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 3f0dccf86c Drivers: hv: fcopy: set .owner reference for file operations
Get an additional reference otherwise a crash is observed when hv_utils module is being unloaded while
fcopy daemon is still running. .owner gives us an additional reference when
someone holds a descriptor for the device.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 4c93ccccf4 Drivers: hv: fcopy: switch to using the hvutil_device_state state machine
Switch to using the hvutil_device_state state machine from using 3 different state variables:
fcopy_transaction.active, opened, and in_hand_shake.

State transitions are:
-> HVUTIL_DEVICE_INIT when driver loads or on device release
-> HVUTIL_READY if the handshake was successful
-> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host
-> HVUTIL_USERSPACE_REQ after userspace daemon read the message
   -> HVUTIL_USERSPACE_RECV after/if userspace has replied
-> HVUTIL_READY after we respond to the host
-> HVUTIL_DEVICE_DYING on driver unload

In hv_fcopy_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when
the userspace daemon is disconnected, otherwise we can make the host think
we don't support FCOPY and disable the service completely.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 086a6f68d6 Drivers: hv: vss: switch to using the hvutil_device_state state machine
Switch to using the hvutil_device_state state machine from using kvp_transaction.active.

State transitions are:
-> HVUTIL_DEVICE_INIT when driver loads or on device release
-> HVUTIL_READY if the handshake was successful
-> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host
-> HVUTIL_USERSPACE_REQ after we sent the message to the userspace daemon
   -> HVUTIL_USERSPACE_RECV after/if the userspace daemon has replied
-> HVUTIL_READY after we respond to the host
-> HVUTIL_DEVICE_DYING on driver unload

In hv_vss_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when
the userspace daemon is disconnected, otherwise we can make the host think
we don't support VSS and disable the service completely.

Unfortunately there is no good way we can figure out that the userspace daemon
has died (unless we start treating all timeouts as such), add a protection
against processing new VSS_OP_REGISTER messages while being in the middle of a
transaction (HVUTIL_USERSPACE_REQ or HVUTIL_USERSPACE_RECV state).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov 97bf16cd30 Drivers: hv: kvp: switch to using the hvutil_device_state state machine
Switch to using the hvutil_device_state state machine from using 2 different state variables: kvp_transaction.active and
in_hand_shake.

State transitions are:
-> HVUTIL_DEVICE_INIT when driver loads or on device release
-> HVUTIL_READY if the handshake was successful
-> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host
-> HVUTIL_USERSPACE_REQ after we sent the message to the userspace daemon
   -> HVUTIL_USERSPACE_RECV after/if the userspace daemon has replied
-> HVUTIL_READY after we respond to the host
-> HVUTIL_DEVICE_DYING on driver unload

In hv_kvp_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when
the userspace daemon is disconnected, otherwise we can make the host think
we don't support KVP and disable the service completely.

Unfortunately there is no good way we can figure out that the userspace daemon
has died (unless we start treating all timeouts as such). In case the daemon
restarts we skip the negotiation procedure (so the daemon is supposed to has
the same version). This behavior is unchanged from in_handshake approach.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-05-24 12:17:41 -07:00