- Rework copy_oldmem_page() callback to take an iov_iter.
This includes few prerequisite updates and fixes to the
oldmem reading code.
- Rework cpufeature implementation to allow for various CPU feature
indications, which is not only limited to hardware capabilities,
but also allows CPU facilities.
- Use the cpufeature rework to autoload Ultravisor module when CPU
facility 158 is available.
- Add ELF note type for encrypted CPU state of a protected virtual CPU.
The zgetdump tool from s390-tools package will decrypt the CPU state
using a Customer Communication Key and overwrite respective notes to
make the data accessible for crash and other debugging tools.
- Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto test.
- Fix incorrect recovery of kretprobe modified return address in stacktrace.
- Switch the NMI handler to use generic irqentry_nmi_enter() and
irqentry_nmi_exit() helper functions.
- Rework the cryptographic Adjunct Processors (AP) pass-through design
to support dynamic changes to the AP matrix of a running guest as well
as to implement more of the AP architecture.
- Minor boot code cleanups.
- Grammar and typo fixes to hmcdrv and tape drivers.
-----BEGIN PGP SIGNATURE-----
iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCYu4dRBccYWdvcmRlZXZA
bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8DnlAP45Sk4cE35T+Z0vdHE2f0uMXE/p
uHNjS3fDZOQVFJ2jZwEA99xPF5qPCttbR/b1VHsMSb30684IT1A4PC7y05kgfAw=
=jCc3
-----END PGP SIGNATURE-----
Merge tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:
- Rework copy_oldmem_page() callback to take an iov_iter.
This includes a few prerequisite updates and fixes to the oldmem
reading code.
- Rework cpufeature implementation to allow for various CPU feature
indications, which is not only limited to hardware capabilities, but
also allows CPU facilities.
- Use the cpufeature rework to autoload Ultravisor module when CPU
facility 158 is available.
- Add ELF note type for encrypted CPU state of a protected virtual CPU.
The zgetdump tool from s390-tools package will decrypt the CPU state
using a Customer Communication Key and overwrite respective notes to
make the data accessible for crash and other debugging tools.
- Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto
test.
- Fix incorrect recovery of kretprobe modified return address in
stacktrace.
- Switch the NMI handler to use generic irqentry_nmi_enter() and
irqentry_nmi_exit() helper functions.
- Rework the cryptographic Adjunct Processors (AP) pass-through design
to support dynamic changes to the AP matrix of a running guest as
well as to implement more of the AP architecture.
- Minor boot code cleanups.
- Grammar and typo fixes to hmcdrv and tape drivers.
* tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits)
Revert "s390/smp: enforce lowcore protection on CPU restart"
Revert "s390/smp: rework absolute lowcore access"
Revert "s390/smp,ptdump: add absolute lowcore markers"
s390/unwind: fix fgraph return address recovery
s390/nmi: use irqentry_nmi_enter()/irqentry_nmi_exit()
s390: add ELF note type for encrypted CPU state of a PV VCPU
s390/smp,ptdump: add absolute lowcore markers
s390/smp: rework absolute lowcore access
s390/setup: rearrange absolute lowcore initialization
s390/boot: cleanup adjust_to_uv_max() function
s390/smp: enforce lowcore protection on CPU restart
s390/tape: fix comment typo
s390/hmcdrv: fix Kconfig "its" grammar
s390/docs: fix warnings for vfio_ap driver doc
s390/docs: fix warnings for vfio_ap driver lock usage doc
s390/crash: support multi-segment iterators
s390/crash: use static swap buffer for copy_to_user_real()
s390/crash: move copy_to_user_real() to crash_dump.c
s390/zcore: fix race when reading from hardware system area
s390/crash: fix incorrect number of bytes to copy to user space
...
The vfio_ap_ops code maintains both nib address and its PFN, which
is redundant, merely because vfio_pin/unpin_pages API wanted pfn.
Since vfio_pin/unpin_pages() now accept "iova", change "saved_pfn"
to "saved_iova" and remove pfn in the vfio_ap_validate_nib().
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-7-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Instead of having drivers register the notifier with explicit code just
have them provide a dma_unmap callback op in their driver ops and rely on
the core code to wire it up.
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v4-681e038e30fd+78-vfio_unmap_notif_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This patch implements two new AP driver callbacks:
void (*on_config_changed)(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
void (*on_scan_complete)(struct ap_config_info *new_config_info,
struct ap_config_info *old_config_info);
The on_config_changed callback is invoked at the start of the AP bus scan
function when it determines that the host AP configuration information
has changed since the previous scan.
The vfio_ap device driver registers a callback function for this callback
that performs the following operations:
1. Unplugs the adapters, domains and control domains removed from the
host's AP configuration from the guests to which they are
assigned in a single operation.
2. Stores bitmaps identifying the adapters, domains and control domains
added to the host's AP configuration with the structure representing
the mediated device. When the vfio_ap device driver's probe callback is
subsequently invoked, the probe function will recognize that the
queue is being probed due to a change in the host's AP configuration
and the plugging of the queue into the guest will be bypassed.
The on_scan_complete callback is invoked after the ap bus scan is
completed if the host AP configuration data has changed. The vfio_ap
device driver registers a callback function for this callback that hot
plugs each queue and control domain added to the AP configuration for each
guest using them in a single hot plug operation.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Let's implement the callback to indicate when an APQN
is in use by the vfio_ap device driver. The callback is
invoked whenever a change to the apmask or aqmask would
result in one or more queue devices being removed from the driver. The
vfio_ap device driver will indicate a resource is in use
if the APQN of any of the queue devices to be removed are assigned to
any of the matrix mdevs under the driver's control.
There is potential for a deadlock condition between the
matrix_dev->guests_lock used to lock the guest during assignment of
adapters and domains and the ap_perms_mutex locked by the AP bus when
changes are made to the sysfs apmask/aqmask attributes.
The AP Perms lock controls access to the objects that store the adapter
numbers (ap_perms) and domain numbers (aq_perms) for the sysfs
/sys/bus/ap/apmask and /sys/bus/ap/aqmask attributes. These attributes
identify which queues are reserved for the zcrypt default device drivers.
Before allowing a bit to be removed from either mask, the AP bus must check
with the vfio_ap device driver to verify that none of the queues are
assigned to any of its mediated devices.
The apmask/aqmask attributes can be written or read at any time from
userspace, so care must be taken to prevent a deadlock with asynchronous
operations that might be taking place in the vfio_ap device driver. For
example, consider the following:
1. A system administrator assigns an adapter to a mediated device under the
control of the vfio_ap device driver. The driver will need to first take
the matrix_dev->guests_lock to potentially hot plug the adapter into
the KVM guest.
2. At the same time, a system administrator sets a bit in the sysfs
/sys/bus/ap/ap_mask attribute. To complete the operation, the AP bus
must:
a. Take the ap_perms_mutex lock to update the object storing the values
for the /sys/bus/ap/ap_mask attribute.
b. Call the vfio_ap device driver's in-use callback to verify that the
queues now being reserved for the default zcrypt drivers are not
assigned to a mediated device owned by the vfio_ap device driver. To
do the verification, the in-use callback function takes the
matrix_dev->guests_lock, but has to wait because it is already held
by the operation in 1 above.
3. The vfio_ap device driver calls an AP bus function to verify that the
new queues resulting from the assignment of the adapter in step 1 are
not reserved for the default zcrypt device driver. This AP bus function
tries to take the ap_perms_mutex lock but gets stuck waiting for the
waiting for the lock due to step 2a above.
Consequently, we have the following deadlock situation:
matrix_dev->guests_lock locked (1)
ap_perms_mutex lock locked (2a)
Waiting for matrix_dev->gusts_lock (2b) which is currently held (1)
Waiting for ap_perms_mutex lock (3) which is currently held (2a)
To prevent this deadlock scenario, the function called in step 3 will no
longer take the ap_perms_mutex lock and require the caller to take the
lock. The lock will be the first taken by the adapter/domain assignment
functions in the vfio_ap device driver to maintain the proper locking
order.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
When an adapter or domain is unassigned from an mdev attached to a KVM
guest, one or more of the guest's queues may get dynamically removed. Since
the removed queues could get re-assigned to another mdev, they need to be
reset. So, when an adapter or domain is unassigned from the mdev, the
queues that are removed from the guest's AP configuration (APCB) will be
reset.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
The vfio_ap device driver registers for notification when the pointer to
the KVM object for a guest is set. Recall that the KVM lock (kvm->lock)
mutex must be taken outside of the matrix_dev->lock mutex to prevent the
reporting by lockdep of a circular locking dependency (a.k.a., a lockdep
splat):
* see commit 0cc00c8d40 ("Fix circular lockdep when setting/clearing
crypto masks")
* see commit 86956e7076 ("replace open coded locks for
VFIO_GROUP_NOTIFY_SET_KVM notification")
With the introduction of support for hot plugging/unplugging AP devices
passed through to a KVM guest, a new guests_lock mutex is introduced to
ensure the proper locking order is maintained:
struct ap_matrix_dev {
...
struct mutex guests_lock;
...
}
The matrix_dev->guests_lock controls access to the matrix_mdev instances
that hold the state for AP devices that have been passed through to a
KVM guest. This lock must be held to control access to the KVM pointer
(matrix_mdev->kvm) while the vfio_ap device driver is using it to
plug/unplug AP devices passed through to the KVM guest.
Keep in mind, the proper locking order must be maintained whenever
dynamically updating a KVM guest's APCB to plug/unplug adapters, domains
and control domains:
1. matrix_dev->guests_lock: required to use the KVM pointer - stored in
a struct ap_matrix_mdev instance - to update a KVM guest's APCB
2. matrix_mdev->kvm->lock: required to update a guest's APCB
3. matrix_dev->mdevs_lock: required to access data stored in a
struct ap_matrix_mdev instance.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
The matrix_dev->lock mutex is being renamed to matrix_dev->mdevs_lock to
better reflect its purpose, which is to control access to the state of the
mediated devices under the control of the vfio_ap device driver.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
The APCB is a field within the CRYCB that provides the AP configuration
to a KVM guest. Let's introduce a shadow copy of the KVM guest's APCB and
maintain it for the lifespan of the guest.
The shadow APCB serves the following purposes:
1. The shadow APCB can be maintained even when the mediated device is not
currently in use by a KVM guest. Since the mediated device's AP
configuration is filtered to ensure that no AP queues are passed through
to the KVM guest that are not bound to the vfio_ap device driver or
available to the host, the mediated device's AP configuration may differ
from the guest's. Having a shadow of a guest's APCB allows us to provide
a sysfs interface to view the guest's APCB even if the mediated device
is not currently passed through to a KVM guest. This can aid in
problem determination when the guest is unexpectedly missing AP
resources.
2. If filtering was done in-place for the real APCB, the guest could pick
up a transient state. Doing the filtering on a shadow and transferring
the AP configuration to the real APCB after the guest is started or when
AP resources are assigned to or unassigned from the mediated device, or
when the host configuration changes, the guest's AP configuration will
never be in a transient state.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Let's create links between each queue device bound to the vfio_ap device
driver and the matrix mdev to which the queue's APQN is assigned. The idea
is to facilitate efficient retrieval of the objects representing the queue
devices and matrix mdevs as well as to verify that a queue assigned to
a matrix mdev is bound to the driver.
The links will be created as follows:
* When the queue device is probed, if its APQN is assigned to a matrix
mdev, the structures representing the queue device and the matrix mdev
will be linked.
* When an adapter or domain is assigned to a matrix mdev, for each new
APQN assigned that references a queue device bound to the vfio_ap
device driver, the structures representing the queue device and the
matrix mdev will be linked.
The links will be removed as follows:
* When the queue device is removed, if its APQN is assigned to a matrix
mdev, the link from the structure representing the matrix mdev to the
structure representing the queue will be removed. Since the storage
allocated for the vfio_ap_queue will be freed, there is no need to
remove the link to the matrix_mdev to which the queue's APQN is
assigned.
* When an adapter or domain is unassigned from a matrix mdev, for each
APQN unassigned that references a queue device bound to the vfio_ap
device driver, the structures representing the queue device and the
matrix mdev will be unlinked.
* When an mdev is removed, the link from any queues assigned to the mdev
to the mdev will be removed.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Let's move the probe and remove callbacks into the vfio_ap_ops.c
file to keep all code related to managing queues in a single file. This
way, all functions related to queue management can be removed from the
vfio_ap_private.h header file defining the public interfaces for the
vfio_ap device driver.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Rather than relying on a notifier for associating the KVM with
the group, let's assume that the association has already been
made prior to device_open. The first time a device is opened
associate the group KVM with the device.
This fixes a user-triggerable oops in GVT.
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20220519183311.582380-2-mjrosato@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fixes the kernel-doc warnings in the following source files:
* drivers/s390/crypto/vfio_ap_private.h
* drivers/s390/crypto/vfio_ap_drv.c
* drivers/s390/crypto/vfio_ap_ops.c
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
This is straightforward conversion, the ap_matrix_mdev is actually serving
as the vfio_device and we can replace all the mdev_get_drvdata()'s with a
simple container_of() or a dev_get_drvdata() for sysfs paths.
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v4-0203a4ab0596+f7-vfio_ap_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
It was pointed out during an unrelated patch review that locks should not
be open coded - i.e., writing the algorithm of a standard lock in a
function instead of using a lock from the standard library. The setting and
testing of a busy flag and sleeping on a wait_event is the same thing
a lock does. The open coded locks are invisible to lockdep, so potential
locking problems are not detected.
This patch removes the open coded locks used during
VFIO_GROUP_NOTIFY_SET_KVM notification. The busy flag
and wait queue were introduced to resolve a possible circular locking
dependency reported by lockdep when starting a secure execution guest
configured with AP adapters and domains. Reversing the order in which
the kvm->lock mutex and matrix_dev->lock mutex are locked resolves the
issue reported by lockdep, thus enabling the removal of the open coded
locks.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Link: https://lore.kernel.org/r/20210823212047.1476436-3-akrowiak@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The function pointer to the interception handler for the PQAP instruction
can get changed during the interception process. Let's add a
semaphore to struct kvm_s390_crypto to control read/write access to the
function pointer contained therein.
The semaphore must be locked for write access by the vfio_ap device driver
when notified that the KVM pointer has been set or cleared. It must be
locked for read access by the interception framework when the PQAP
instruction is intercepted.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20210823212047.1476436-2-akrowiak@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This patch fixes a lockdep splat introduced by commit f21916ec48
("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated").
The lockdep splat only occurs when starting a Secure Execution guest.
Crypto virtualization (vfio_ap) is not yet supported for SE guests;
however, in order to avoid this problem when support becomes available,
this fix is being provided.
The circular locking dependency was introduced when the setting of the
masks in the guest's APCB was executed while holding the matrix_dev->lock.
While the lock is definitely needed to protect the setting/unsetting of the
matrix_mdev->kvm pointer, it is not necessarily critical for setting the
masks; so, the matrix_dev->lock will be released while the masks are being
set or cleared.
Keep in mind, however, that another process that takes the matrix_dev->lock
can get control while the masks in the guest's APCB are being set or
cleared as a result of the driver being notified that the KVM pointer
has been set or unset. This could result in invalid access to the
matrix_mdev->kvm pointer by the intervening process. To avoid this
scenario, two new fields are being added to the ap_matrix_mdev struct:
struct ap_matrix_mdev {
...
bool kvm_busy;
wait_queue_head_t wait_for_kvm;
...
};
The functions that handle notification that the KVM pointer value has
been set or cleared will set the kvm_busy flag to true until they are done
processing at which time they will set it to false and wake up the tasks on
the matrix_mdev->wait_for_kvm wait queue. Functions that require
access to matrix_mdev->kvm will sleep on the wait queue until they are
awakened at which time they can safely access the matrix_mdev->kvm
field.
Fixes: f21916ec48 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
Cc: stable@vger.kernel.org
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
The queues assigned to a matrix mediated device are currently reset when:
* The VFIO_DEVICE_RESET ioctl is invoked
* The mdev fd is closed by userspace (QEMU)
* The mdev is removed from sysfs.
Immediately after the reset of a queue, a call is made to disable
interrupts for the queue. This is entirely unnecessary because the reset of
a queue disables interrupts, so this will be removed.
Furthermore, vfio_ap_irq_disable() does an unconditional PQAP/AQIC which
can result in a specification exception (when the corresponding facility
is not available), so this is actually a bugfix.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
[pasic@linux.ibm.com: minor rework before merging]
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: ec89b55e3b ("s390: ap: implement PAPQ AQIC interception in kernel")
Cc: <stable@vger.kernel.org>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
We register a AP PQAP instruction hook during the open
of the mediated device. And unregister it on release.
During the probe of the AP device, we allocate a vfio_ap_queue
structure to keep track of the information we need for the
PQAP/AQIC instruction interception.
In the AP PQAP instruction hook, if we receive a demand to
enable IRQs,
- we retrieve the vfio_ap_queue based on the APQN we receive
in REG1,
- we retrieve the page of the guest address, (NIB), from
register REG2
- we retrieve the mediated device to use the VFIO pinning
infrastructure to pin the page of the guest address,
- we retrieve the pointer to KVM to register the guest ISC
and retrieve the host ISC
- finaly we activate GISA
If we receive a demand to disable IRQs,
- we deactivate GISA
- unregister from the GIB
- unpin the NIB
When removing the AP device from the driver the device is
reseted and this process unregisters the GISA from the GIB,
and unpins the NIB address then we free the vfio_ap_queue
structure.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
To be able to use the VFIO interface to facilitate the
mediated device memory pinning/unpinning we need to register
a notifier for IOMMU.
While we will start to pin one guest page for the interrupt indicator
byte, this is still ok with ballooning as this page will never be
used by the guest virtio-balloon driver.
So the pinned page will never be freed. And even a broken guest does
so, that would not impact the host as the original page is still
in control by vfio.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
We prepare the interception of the PQAP/AQIC instruction for
the case the AQIC facility is enabled in the guest.
First of all we do not want to change existing behavior when
intercepting AP instructions without the SIE allowing the guest
to use AP instructions.
In this patch we only handle the AQIC interception allowed by
facility 65 which will be enabled when the complete interception
infrastructure will be present.
We add a callback inside the KVM arch structure for s390 for
a VFIO driver to handle a specific response to the PQAP
instruction with the AQIC command and only this command.
But we want to be able to return a correct answer to the guest
even there is no VFIO AP driver in the kernel.
Therefor, we inject the correct exceptions from inside KVM for the
case the callback is not initialized, which happens when the vfio_ap
driver is not loaded.
We do consider the responsibility of the driver to always initialize
the PQAP callback if it defines queues by initializing the CRYCB for
a guest.
If the callback has been setup we call it.
If not we setup an answer considering that no queue is available
for the guest when no callback has been setup.
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Libudev relies on having a subsystem link for non-root devices. To
avoid libudev (and potentially other userspace tools) choking on the
matrix device let us introduce a matrix bus and with it the matrix
bus subsytem. Also make the matrix device reside within the matrix
bus.
Doing this we remove the forced link from the matrix device to the
vfio_ap driver and the device_type we do not need anymore.
Since the associated matrix driver is not the vfio_ap driver any more,
we have to change the search for the devices on the vfio_ap driver in
the function vfio_ap_verify_queue_reserved.
Fixes: 1fde573413 ("s390: vfio-ap: base implementation of VFIO AP device driver")
Cc: stable@vger.kernel.org
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Implements the open callback on the mediated matrix device.
The function registers a group notifier to receive notification
of the VFIO_GROUP_NOTIFY_SET_KVM event. When notified,
the vfio_ap device driver will get access to the guest's
kvm structure. The open callback must ensure that only one
mediated device shall be opened per guest.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Tested-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-12-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Registers the matrix device created by the VFIO AP device
driver with the VFIO mediated device framework.
Registering the matrix device will create the sysfs
structures needed to create mediated matrix devices
each of which will be used to configure the AP matrix
for a guest and connect it to the VFIO AP device driver.
Registering the matrix device with the VFIO mediated device
framework will create the following sysfs structures:
/sys/devices/vfio_ap/matrix/
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ create
To create a mediated device for the AP matrix device, write a UUID
to the create file:
uuidgen > create
A symbolic link to the mediated device's directory will be created in the
devices subdirectory named after the generated $uuid:
/sys/devices/vfio_ap/matrix/
...... [mdev_supported_types]
......... [vfio_ap-passthrough]
............ [devices]
............... [$uuid]
A symbolic link to the mediated device will also be created
in the vfio_ap matrix's directory:
/sys/devices/vfio_ap/matrix/[$uuid]
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <20180925231641.4954-6-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Introduces a new AP device driver. This device driver
is built on the VFIO mediated device framework. The framework
provides sysfs interfaces that facilitate passthrough
access by guests to devices installed on the linux host.
The VFIO AP device driver will serve two purposes:
1. Provide the interfaces to reserve AP devices for exclusive
use by KVM guests. This is accomplished by unbinding the
devices to be reserved for guest usage from the zcrypt
device driver and binding them to the VFIO AP device driver.
2. Implements the functions, callbacks and sysfs attribute
interfaces required to create one or more VFIO mediated
devices each of which will be used to configure the AP
matrix for a guest and serve as a file descriptor
for facilitating communication between QEMU and the
VFIO AP device driver.
When the VFIO AP device driver is initialized:
* It registers with the AP bus for control of type 10 (CEX4
and newer) AP queue devices. This limitation was imposed
due to:
1. A desire to keep the code as simple as possible;
2. Some older models are no longer supported by the kernel
and others are getting close to end of service.
3. A lack of older systems on which to test older devices.
The probe and remove callbacks will be provided to support
the binding/unbinding of AP queue devices to/from the VFIO
AP device driver.
* Creates a matrix device, /sys/devices/vfio_ap/matrix,
to serve as the parent of the mediated devices created, one
for each guest, and to hold the APQNs of the AP devices bound to
the VFIO AP device driver.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-5-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>