- Support for the H_RPT_INVALIDATE hypercall

- Conversion of Book3S entry/exit to C

- Bug fixes
This commit is contained in:
Paolo Bonzini 2021-06-23 05:43:45 -04:00
commit c3ab0e28a4
331 changed files with 4499 additions and 2899 deletions

View file

@ -160,6 +160,7 @@ Jeff Layton <jlayton@kernel.org> <jlayton@primarydata.com>
Jeff Layton <jlayton@kernel.org> <jlayton@redhat.com>
Jens Axboe <axboe@suse.de>
Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Jernej Skrabec <jernej.skrabec@gmail.com> <jernej.skrabec@siol.net>
Jiri Slaby <jirislaby@kernel.org> <jirislaby@gmail.com>
Jiri Slaby <jirislaby@kernel.org> <jslaby@novell.com>
Jiri Slaby <jirislaby@kernel.org> <jslaby@suse.com>

View file

@ -1,7 +1,7 @@
What: /sys/class/dax/
Date: May, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description: Device DAX is the device-centric analogue of Filesystem
DAX (CONFIG_FS_DAX). It allows memory ranges to be
allocated and mapped without need of an intervening file

View file

@ -1,4 +1,4 @@
This ABI is renamed and moved to a new location /sys/kernel/fadump/registered.¬
This ABI is renamed and moved to a new location /sys/kernel/fadump/registered.
What: /sys/kernel/fadump_registered
Date: Feb 2012

View file

@ -1,4 +1,4 @@
This ABI is renamed and moved to a new location /sys/kernel/fadump/release_mem.¬
This ABI is renamed and moved to a new location /sys/kernel/fadump/release_mem.
What: /sys/kernel/fadump_release_mem
Date: Feb 2012

View file

@ -1,7 +1,7 @@
What: /sys/bus/nd/devices/regionX/nfit/ecc_unit_size
Date: Aug, 2017
KernelVersion: v4.14 (Removed v4.18)
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Size of a write request to a DIMM that will not incur a
read-modify-write cycle at the memory controller.

View file

@ -5,7 +5,7 @@ Interface Table (NFIT)' section in the ACPI specification
What: /sys/bus/nd/devices/nmemX/nfit/serial
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Serial number of the NVDIMM (non-volatile dual in-line
memory module), assigned by the module vendor.
@ -14,7 +14,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/handle
Date: Apr, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The address (given by the _ADR object) of the device on its
parent bus of the NVDIMM device containing the NVDIMM region.
@ -23,7 +23,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/device
Date: Apr, 2015
KernelVersion: v4.1
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Device id for the NVDIMM, assigned by the module vendor.
@ -31,7 +31,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/rev_id
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Revision of the NVDIMM, assigned by the module vendor.
@ -39,7 +39,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/phys_id
Date: Apr, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Handle (i.e., instance number) for the SMBIOS (system
management BIOS) Memory Device structure describing the NVDIMM
@ -49,7 +49,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/flags
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The flags in the NFIT memory device sub-structure indicate
the state of the data on the nvdimm relative to its energy
@ -68,7 +68,7 @@ What: /sys/bus/nd/devices/nmemX/nfit/format1
What: /sys/bus/nd/devices/nmemX/nfit/formats
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The interface codes indicate support for persistent memory
mapped directly into system physical address space and / or a
@ -84,7 +84,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/vendor
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Vendor id of the NVDIMM.
@ -92,7 +92,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/dsm_mask
Date: May, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The bitmask indicates the supported device specific control
functions relative to the NVDIMM command family supported by the
@ -102,7 +102,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/family
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Displays the NVDIMM family command sets. Values
0, 1, 2 and 3 correspond to NVDIMM_FAMILY_INTEL,
@ -118,7 +118,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/id
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) ACPI specification 6.2 section 5.2.25.9, defines an
identifier for an NVDIMM, which refelects the id attribute.
@ -127,7 +127,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/subsystem_vendor
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Sub-system vendor id of the NVDIMM non-volatile memory
subsystem controller.
@ -136,7 +136,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/subsystem_rev_id
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Sub-system revision id of the NVDIMM non-volatile memory subsystem
controller, assigned by the non-volatile memory subsystem
@ -146,7 +146,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/subsystem_device
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Sub-system device id for the NVDIMM non-volatile memory
subsystem controller, assigned by the non-volatile memory
@ -156,7 +156,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/revision
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) ACPI NFIT table revision number.
@ -164,7 +164,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/scrub
Date: Sep, 2016
KernelVersion: v4.9
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RW) This shows the number of full Address Range Scrubs (ARS)
that have been completed since driver load time. Userspace can
@ -177,7 +177,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/hw_error_scrub
Date: Sep, 2016
KernelVersion: v4.9
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RW) Provides a way to toggle the behavior between just adding
the address (cache line) where the MCE happened to the poison
@ -196,7 +196,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/dsm_mask
Date: Jun, 2017
KernelVersion: v4.13
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The bitmask indicates the supported bus specific control
functions. See the section named 'NVDIMM Root Device _DSMs' in
@ -205,7 +205,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/firmware_activate_noidle
Date: Apr, 2020
KernelVersion: v5.8
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RW) The Intel platform implementation of firmware activate
support exposes an option let the platform force idle devices in
@ -225,7 +225,7 @@ Description:
What: /sys/bus/nd/devices/regionX/nfit/range_index
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) A unique number provided by the BIOS to identify an address
range. Used by NVDIMM Region Mapping Structure to uniquely refer

View file

@ -1,7 +1,7 @@
What: /sys/bus/nd/devices/nmemX/papr/flags
Date: Apr, 2020
KernelVersion: v5.8
Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, nvdimm@lists.linux.dev,
Description:
(RO) Report flags indicating various states of a
papr-pmem NVDIMM device. Each flag maps to a one or
@ -36,7 +36,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/papr/perf_stats
Date: May, 2020
KernelVersion: v5.9
Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, nvdimm@lists.linux.dev,
Description:
(RO) Report various performance stats related to papr-scm NVDIMM
device. Each stat is reported on a new line with each line

View file

@ -37,13 +37,13 @@ Description: Maximum time allowed for periodic transfers per microframe (μs)
What: /sys/module/*/{coresize,initsize}
Date: Jan 2012
KernelVersion:»·3.3
KernelVersion: 3.3
Contact: Kay Sievers <kay.sievers@vrfy.org>
Description: Module size in bytes.
What: /sys/module/*/taint
Date: Jan 2012
KernelVersion:»·3.3
KernelVersion: 3.3
Contact: Kay Sievers <kay.sievers@vrfy.org>
Description: Module taint flags:
== =====================

View file

@ -483,10 +483,11 @@ modprobe
========
The full path to the usermode helper for autoloading kernel modules,
by default "/sbin/modprobe". This binary is executed when the kernel
requests a module. For example, if userspace passes an unknown
filesystem type to mount(), then the kernel will automatically request
the corresponding filesystem module by executing this usermode helper.
by default ``CONFIG_MODPROBE_PATH``, which in turn defaults to
"/sbin/modprobe". This binary is executed when the kernel requests a
module. For example, if userspace passes an unknown filesystem type
to mount(), then the kernel will automatically request the
corresponding filesystem module by executing this usermode helper.
This usermode helper should insert the needed module into the kernel.
This sysctl only affects module autoloading. It has no effect on the

View file

@ -1,4 +1,4 @@
==============
==============
Data Integrity
==============

View file

@ -146,18 +146,18 @@ with the kernel as a block device by registering the following general
*struct file_operations*::
struct file_operations cdrom_fops = {
NULL, / lseek /
block _read , / read—general block-dev read /
block _write, / write—general block-dev write /
NULL, / readdir /
NULL, / select /
cdrom_ioctl, / ioctl /
NULL, / mmap /
cdrom_open, / open /
cdrom_release, / release /
NULL, / fsync /
NULL, / fasync /
NULL / revalidate /
NULL, /* lseek */
block _read , /* read--general block-dev read */
block _write, /* write--general block-dev write */
NULL, /* readdir */
NULL, /* select */
cdrom_ioctl, /* ioctl */
NULL, /* mmap */
cdrom_open, /* open */
cdrom_release, /* release */
NULL, /* fsync */
NULL, /* fasync */
NULL /* revalidate */
};
Every active CD-ROM device shares this *struct*. The routines
@ -250,12 +250,12 @@ The drive-specific, minor-like information that is registered with
`cdrom.c`, currently contains the following fields::
struct cdrom_device_info {
const struct cdrom_device_ops * ops; /* device operations for this major */
const struct cdrom_device_ops * ops; /* device operations for this major */
struct list_head list; /* linked list of all device_info */
struct gendisk * disk; /* matching block layer disk */
void * handle; /* driver-dependent data */
int mask; /* mask of capability: disables them */
int mask; /* mask of capability: disables them */
int speed; /* maximum speed for reading data */
int capacity; /* number of discs in a jukebox */
@ -569,7 +569,7 @@ the *CDC_CLOSE_TRAY* bit in *mask*.
In the file `cdrom.c` you will encounter many constructions of the type::
if (cdo->capability & cdi->mask & CDC _⟨capability⟩) ...
if (cdo->capability & ~cdi->mask & CDC _<capability>) ...
There is no *ioctl* to set the mask... The reason is that
I think it is better to control the **behavior** rather than the

View file

@ -4,7 +4,7 @@ LIBNVDIMM: Non-Volatile Devices
libnvdimm - kernel / libndctl - userspace helper library
linux-nvdimm@lists.01.org
nvdimm@lists.linux.dev
Version 13

View file

@ -19,7 +19,6 @@ Serial drivers
moxa-smartio
n_gsm
rocket
serial-iso7816
serial-rs485

View file

@ -109,16 +109,19 @@ well as to make sure they aren't relying on some HCD-specific behavior.
USB-Standard Types
==================
In ``drivers/usb/common/common.c`` and ``drivers/usb/common/debug.c`` you
will find the USB data types defined in chapter 9 of the USB specification.
These data types are used throughout USB, and in APIs including this host
side API, gadget APIs, usb character devices and debugfs interfaces.
In ``include/uapi/linux/usb/ch9.h`` you will find the USB data types defined
in chapter 9 of the USB specification. These data types are used throughout
USB, and in APIs including this host side API, gadget APIs, usb character
devices and debugfs interfaces. That file is itself included by
``include/linux/usb/ch9.h``, which also contains declarations of a few
utility routines for manipulating these data types; the implementations
are in ``drivers/usb/common/common.c``.
.. kernel-doc:: drivers/usb/common/common.c
:export:
.. kernel-doc:: drivers/usb/common/debug.c
:export:
In addition, some functions useful for creating debugging output are
defined in ``drivers/usb/common/debug.c``.
Host-Side Data Types and Macros
===============================

View file

@ -50,8 +50,8 @@ Here is the main features of EROFS:
- Support POSIX.1e ACLs by using xattrs;
- Support transparent file compression as an option:
LZ4 algorithm with 4 KB fixed-sized output compression for high performance.
- Support transparent data compression as an option:
LZ4 algorithm with the fixed-sized output compression for high performance.
The following git tree provides the file system user-space tools under
development (ex, formatting tool mkfs.erofs):
@ -113,31 +113,31 @@ may not. All metadatas can be now observed in two different spaces (views):
::
|-> aligned with 8B
|-> followed closely
+ meta_blkaddr blocks |-> another slot
_____________________________________________________________________
| ... | inode | xattrs | extents | data inline | ... | inode ...
|________|_______|(optional)|(optional)|__(optional)_|_____|__________
|-> aligned with the inode slot size
. .
. .
. .
. .
. .
. .
.____________________________________________________|-> aligned with 4B
| xattr_ibody_header | shared xattrs | inline xattrs |
|____________________|_______________|_______________|
|-> 12 bytes <-|->x * 4 bytes<-| .
. . .
. . .
. . .
._______________________________.______________________.
| id | id | id | id | ... | id | ent | ... | ent| ... |
|____|____|____|____|______|____|_____|_____|____|_____|
|-> aligned with 4B
|-> aligned with 4B
|-> aligned with 8B
|-> followed closely
+ meta_blkaddr blocks |-> another slot
_____________________________________________________________________
| ... | inode | xattrs | extents | data inline | ... | inode ...
|________|_______|(optional)|(optional)|__(optional)_|_____|__________
|-> aligned with the inode slot size
. .
. .
. .
. .
. .
. .
.____________________________________________________|-> aligned with 4B
| xattr_ibody_header | shared xattrs | inline xattrs |
|____________________|_______________|_______________|
|-> 12 bytes <-|->x * 4 bytes<-| .
. . .
. . .
. . .
._______________________________.______________________.
| id | id | id | id | ... | id | ent | ... | ent| ... |
|____|____|____|____|______|____|_____|_____|____|_____|
|-> aligned with 4B
|-> aligned with 4B
Inode could be 32 or 64 bytes, which can be distinguished from a common
field which all inode versions have -- i_format::
@ -175,13 +175,13 @@ may not. All metadatas can be now observed in two different spaces (views):
Each share xattr can also be directly found by the following formula:
xattr offset = xattr_blkaddr * block_size + 4 * xattr_id
::
::
|-> aligned by 4 bytes
+ xattr_blkaddr blocks |-> aligned with 4 bytes
_________________________________________________________________________
| ... | xattr_entry | xattr data | ... | xattr_entry | xattr data ...
|________|_____________|_____________|_____|______________|_______________
|-> aligned by 4 bytes
+ xattr_blkaddr blocks |-> aligned with 4 bytes
_________________________________________________________________________
| ... | xattr_entry | xattr data | ... | xattr_entry | xattr data ...
|________|_____________|_____________|_____|______________|_______________
Directories
-----------
@ -193,48 +193,77 @@ algorithm (could refer to the related source code).
::
___________________________
/ |
/ ______________|________________
/ / | nameoff1 | nameoffN-1
____________.______________._______________v________________v__________
| dirent | dirent | ... | dirent | filename | filename | ... | filename |
|___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
\ ^
\ | * could have
\ | trailing '\0'
\________________________| nameoff0
Directory block
___________________________
/ |
/ ______________|________________
/ / | nameoff1 | nameoffN-1
____________.______________._______________v________________v__________
| dirent | dirent | ... | dirent | filename | filename | ... | filename |
|___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
\ ^
\ | * could have
\ | trailing '\0'
\________________________| nameoff0
Directory block
Note that apart from the offset of the first filename, nameoff0 also indicates
the total number of directory entries in this block since it is no need to
introduce another on-disk field at all.
Compression
-----------
Currently, EROFS supports 4KB fixed-sized output transparent file compression,
as illustrated below::
Data compression
----------------
EROFS implements LZ4 fixed-sized output compression which generates fixed-sized
compressed data blocks from variable-sized input in contrast to other existing
fixed-sized input solutions. Relatively higher compression ratios can be gotten
by using fixed-sized output compression since nowadays popular data compression
algorithms are mostly LZ77-based and such fixed-sized output approach can be
benefited from the historical dictionary (aka. sliding window).
|---- Variant-Length Extent ----|-------- VLE --------|----- VLE -----
clusterofs clusterofs clusterofs
| | | logical data
_________v_______________________________v_____________________v_______________
... | . | | . | | . | ...
____|____.________|_____________|________.____|_____________|__.__________|____
|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|
size size size size size
. . . .
. . . .
. . . .
_______._____________._____________._____________._____________________
... | | | | ... physical data
_______|_____________|_____________|_____________|_____________________
|-> cluster <-|-> cluster <-|-> cluster <-|
size size size
In details, original (uncompressed) data is turned into several variable-sized
extents and in the meanwhile, compressed into physical clusters (pclusters).
In order to record each variable-sized extent, logical clusters (lclusters) are
introduced as the basic unit of compress indexes to indicate whether a new
extent is generated within the range (HEAD) or not (NONHEAD). Lclusters are now
fixed in block size, as illustrated below::
Currently each on-disk physical cluster can contain 4KB (un)compressed data
at most. For each logical cluster, there is a corresponding on-disk index to
describe its cluster type, physical cluster address, etc.
|<- variable-sized extent ->|<- VLE ->|
clusterofs clusterofs clusterofs
| | |
_________v_________________________________v_______________________v________
... | . | | . | | . ...
____|____._________|______________|________.___ _|______________|__.________
|-> lcluster <-|-> lcluster <-|-> lcluster <-|-> lcluster <-|
(HEAD) (NONHEAD) (HEAD) (NONHEAD) .
. CBLKCNT . .
. . .
. . .
_______._____________________________.______________._________________
... | | | | ...
_______|______________|______________|______________|_________________
|-> big pcluster <-|-> pcluster <-|
See "struct z_erofs_vle_decompressed_index" in erofs_fs.h for more details.
A physical cluster can be seen as a container of physical compressed blocks
which contains compressed data. Previously, only lcluster-sized (4KB) pclusters
were supported. After big pcluster feature is introduced (available since
Linux v5.13), pcluster can be a multiple of lcluster size.
For each HEAD lcluster, clusterofs is recorded to indicate where a new extent
starts and blkaddr is used to seek the compressed data. For each NONHEAD
lcluster, delta0 and delta1 are available instead of blkaddr to indicate the
distance to its HEAD lcluster and the next HEAD lcluster. A PLAIN lcluster is
also a HEAD lcluster except that its data is uncompressed. See the comments
around "struct z_erofs_vle_decompressed_index" in erofs_fs.h for more details.
If big pcluster is enabled, pcluster size in lclusters needs to be recorded as
well. Let the delta0 of the first NONHEAD lcluster store the compressed block
count with a special flag as a new called CBLKCNT NONHEAD lcluster. It's easy
to understand its delta0 is constantly 1, as illustrated below::
__________________________________________________________
| HEAD | NONHEAD | NONHEAD | ... | NONHEAD | HEAD | HEAD |
|__:___|_(CBLKCNT)_|_________|_____|_________|__:___|____:_|
|<----- a big pcluster (with CBLKCNT) ------>|<-- -->|
a lcluster-sized pcluster (without CBLKCNT) ^
If another HEAD follows a HEAD lcluster, there is no room to record CBLKCNT,
but it's easy to know the size of such pcluster is 1 lcluster as well.

View file

@ -21,10 +21,10 @@ Description
The TMP103 is a digital output temperature sensor in a four-ball
wafer chip-scale package (WCSP). The TMP103 is capable of reading
temperatures to a resolution of 1°C. The TMP103 is specified for
operation over a temperature range of 40°C to +125°C.
operation over a temperature range of -40°C to +125°C.
Resolution: 8 Bits
Accuracy: ±1°C Typ (10°C to +100°C)
Accuracy: ±1°C Typ (-10°C to +100°C)
The driver provides the common sysfs-interface for temperatures (see
Documentation/hwmon/sysfs-interface.rst under Temperatures).

View file

@ -173,7 +173,7 @@ Director rule is added from ethtool (Sideband filter), ATR is turned off by the
driver. To re-enable ATR, the sideband can be disabled with the ethtool -K
option. For example::
ethtool K [adapter] ntuple [off|on]
ethtool -K [adapter] ntuple [off|on]
If sideband is re-enabled after ATR is re-enabled, ATR remains enabled until a
TCP-IP flow is added. When all TCP-IP sideband rules are deleted, ATR is
@ -688,7 +688,7 @@ shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates.
Totals must be equal or less than port speed.
For example: min_rate 1Gbit 3Gbit: Verify bandwidth limit using network
monitoring tools such as ifstat or sar n DEV [interval] [number of samples]
monitoring tools such as `ifstat` or `sar -n DEV [interval] [number of samples]`
2. Enable HW TC offload on interface::

View file

@ -179,7 +179,7 @@ shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates.
Totals must be equal or less than port speed.
For example: min_rate 1Gbit 3Gbit: Verify bandwidth limit using network
monitoring tools such as ifstat or sar n DEV [interval] [number of samples]
monitoring tools such as ``ifstat`` or ``sar -n DEV [interval] [number of samples]``
NOTE:
Setting up channels via ethtool (ethtool -L) is not supported when the

View file

@ -1,4 +1,4 @@
.. _process_statement_kernel:
.. _process_statement_kernel:
Linux Kernel Enforcement Statement
----------------------------------

View file

@ -1,4 +1,4 @@
=============================
=============================
Virtual TPM interface for Xen
=============================

View file

@ -1,4 +1,4 @@
======================================
======================================
NO_HZ: Reducing Scheduling-Clock Ticks
======================================

View file

@ -1,50 +0,0 @@
Chinese translated version of Documentation/admin-guide/security-bugs.rst
If you have any comment or update to the content, please contact the
original document maintainer directly. However, if you have a problem
communicating in English you can also ask the Chinese maintainer for
help. Contact the Chinese maintainer if this translation is outdated
or if there is a problem with the translation.
Chinese maintainer: Harry Wei <harryxiyou@gmail.com>
---------------------------------------------------------------------
Documentation/admin-guide/security-bugs.rst 的中文翻译
如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文
交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻
译存在问题,请联系中文版维护者。
中文版维护者: 贾威威 Harry Wei <harryxiyou@gmail.com>
中文版翻译者: 贾威威 Harry Wei <harryxiyou@gmail.com>
中文版校译者: 贾威威 Harry Wei <harryxiyou@gmail.com>
以下为正文
---------------------------------------------------------------------
Linux内核开发者认为安全非常重要。因此我们想要知道当一个有关于
安全的漏洞被发现的时候,并且它可能会被尽快的修复或者公开。请把这个安全
漏洞报告给Linux内核安全团队。
1) 联系
linux内核安全团队可以通过email<security@kernel.org>来联系。这是
一组独立的安全工作人员,可以帮助改善漏洞报告并且公布和取消一个修复。安
全团队有可能会从部分的维护者那里引进额外的帮助来了解并且修复安全漏洞。
当遇到任何漏洞,所能提供的信息越多就越能诊断和修复。如果你不清楚什么
是有帮助的信息那就请重温一下admin-guide/reporting-bugs.rst文件中的概述过程。任
何攻击性的代码都是非常有用的,未经报告者的同意不会被取消,除非它已经
被公布于众。
2) 公开
Linux内核安全团队的宗旨就是和漏洞提交者一起处理漏洞的解决方案直
到公开。我们喜欢尽快地完全公开漏洞。当一个漏洞或者修复还没有被完全地理
解,解决方案没有通过测试或者供应商协调,可以合理地延迟公开。然而,我们
期望这些延迟尽可能的短些,是可数的几天,而不是几个星期或者几个月。公开
日期是通过安全团队和漏洞提供者以及供应商洽谈后的结果。公开时间表是从很
短(特殊的,它已经被公众所知道)到几个星期。作为一个基本的默认政策,我
们所期望通知公众的日期是7天的安排。
3) 保密协议
Linux内核安全团队不是一个正式的团体因此不能加入任何的保密协议。

View file

@ -140,7 +140,7 @@ is an arbitrary string allowed in a filesystem, e.g.::
Each function provides its specific set of attributes, with either read-only
or read-write access. Where applicable they need to be written to as
appropriate.
Please refer to Documentation/ABI/*/configfs-usb-gadget* for more information.
Please refer to Documentation/ABI/testing/configfs-usb-gadget for more information.
4. Associating the functions with their configurations
------------------------------------------------------

View file

@ -1,4 +1,4 @@
================
================
mtouchusb driver
================

View file

@ -1,4 +1,4 @@
==========
==========
USB serial
==========

View file

@ -22,7 +22,7 @@ to SEV::
[ecx]:
Bits[31:0] Number of encrypted guests supported simultaneously
If support for SEV is present, MSR 0xc001_0010 (MSR_K8_SYSCFG) and MSR 0xc001_0015
If support for SEV is present, MSR 0xc001_0010 (MSR_AMD64_SYSCFG) and MSR 0xc001_0015
(MSR_K7_HWCR) can be used to determine if it can be enabled::
0xc001_0010:

View file

@ -6410,6 +6410,24 @@ default.
See Documentation/x86/sgx/2.Kernel-internals.rst for more details.
7.26 KVM_CAP_PPC_RPT_INVALIDATE
-------------------------------
:Capability: KVM_CAP_PPC_RPT_INVALIDATE
:Architectures: ppc
:Type: vm
This capability indicates that the kernel is capable of handling
H_RPT_INVALIDATE hcall.
In order to enable the use of H_RPT_INVALIDATE in the guest,
user space might have to advertise it for the guest. For example,
IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is
present in the "ibm,hypertas-functions" device-tree property.
This capability is enabled for hypervisors on platforms like POWER9
that support radix MMU.
8. Other capabilities.
======================

View file

@ -53,7 +53,7 @@ CPUID function 0x8000001f reports information related to SME::
system physical addresses, not guest physical
addresses)
If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be used to
If support for SME is present, MSR 0xc00100010 (MSR_AMD64_SYSCFG) can be used to
determine if SME is enabled and/or to enable memory encryption::
0xc0010010:
@ -79,7 +79,7 @@ The state of SME in the Linux kernel can be documented as follows:
The CPU supports SME (determined through CPUID instruction).
- Enabled:
Supported and bit 23 of MSR_K8_SYSCFG is set.
Supported and bit 23 of MSR_AMD64_SYSCFG is set.
- Active:
Supported, Enabled and the Linux kernel is actively applying
@ -89,7 +89,7 @@ The state of SME in the Linux kernel can be documented as follows:
SME can also be enabled and activated in the BIOS. If SME is enabled and
activated in the BIOS, then all memory accesses will be encrypted and it will
not be necessary to activate the Linux memory encryption support. If the BIOS
merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can activate
merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG), then Linux can activate
memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
by supplying mem_encrypt=on on the kernel command line. However, if BIOS does
not enable SME, then Linux will not be able to activate memory encryption, even

View file

@ -1578,7 +1578,7 @@ F: drivers/clk/sunxi/
ARM/Allwinner sunXi SoC support
M: Maxime Ripard <mripard@kernel.org>
M: Chen-Yu Tsai <wens@csie.org>
R: Jernej Skrabec <jernej.skrabec@siol.net>
R: Jernej Skrabec <jernej.skrabec@gmail.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux.git
@ -5089,7 +5089,7 @@ S: Maintained
F: drivers/net/fddi/defza.*
DEINTERLACE DRIVERS FOR ALLWINNER H3
M: Jernej Skrabec <jernej.skrabec@siol.net>
M: Jernej Skrabec <jernej.skrabec@gmail.com>
L: linux-media@vger.kernel.org
S: Maintained
T: git git://linuxtv.org/media_tree.git
@ -5237,7 +5237,7 @@ DEVICE DIRECT ACCESS (DAX)
M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
F: drivers/dax/
@ -5632,14 +5632,14 @@ F: include/linux/power/smartreflex.h
DRM DRIVER FOR ALLWINNER DE2 AND DE3 ENGINE
M: Maxime Ripard <mripard@kernel.org>
M: Chen-Yu Tsai <wens@csie.org>
R: Jernej Skrabec <jernej.skrabec@siol.net>
R: Jernej Skrabec <jernej.skrabec@gmail.com>
L: dri-devel@lists.freedesktop.org
S: Supported
T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/sun4i/sun8i*
DRM DRIVER FOR ARM PL111 CLCD
M: Eric Anholt <eric@anholt.net>
M: Emma Anholt <emma@anholt.net>
S: Supported
T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/pl111/
@ -5719,7 +5719,7 @@ T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/tiny/gm12u320.c
DRM DRIVER FOR HX8357D PANELS
M: Eric Anholt <eric@anholt.net>
M: Emma Anholt <emma@anholt.net>
S: Maintained
T: git git://anongit.freedesktop.org/drm/drm-misc
F: Documentation/devicetree/bindings/display/himax,hx8357d.txt
@ -6023,7 +6023,7 @@ M: Neil Armstrong <narmstrong@baylibre.com>
M: Robert Foss <robert.foss@linaro.org>
R: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>
R: Jonas Karlman <jonas@kwiboo.se>
R: Jernej Skrabec <jernej.skrabec@siol.net>
R: Jernej Skrabec <jernej.skrabec@gmail.com>
S: Maintained
T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/bridge/
@ -6177,7 +6177,7 @@ F: Documentation/devicetree/bindings/display/ti/
F: drivers/gpu/drm/omapdrm/
DRM DRIVERS FOR V3D
M: Eric Anholt <eric@anholt.net>
M: Emma Anholt <emma@anholt.net>
S: Supported
T: git git://anongit.freedesktop.org/drm/drm-misc
F: Documentation/devicetree/bindings/gpu/brcm,bcm-v3d.yaml
@ -6185,7 +6185,7 @@ F: drivers/gpu/drm/v3d/
F: include/uapi/drm/v3d_drm.h
DRM DRIVERS FOR VC4
M: Eric Anholt <eric@anholt.net>
M: Emma Anholt <emma@anholt.net>
M: Maxime Ripard <mripard@kernel.org>
S: Supported
T: git git://github.com/anholt/linux
@ -7006,7 +7006,7 @@ M: Dan Williams <dan.j.williams@intel.com>
R: Matthew Wilcox <willy@infradead.org>
R: Jan Kara <jack@suse.cz>
L: linux-fsdevel@vger.kernel.org
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
F: fs/dax.c
F: include/linux/dax.h
@ -10378,7 +10378,7 @@ LIBNVDIMM BLK: MMIO-APERTURE DRIVER
M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
P: Documentation/nvdimm/maintainer-entry-profile.rst
@ -10389,7 +10389,7 @@ LIBNVDIMM BTT: BLOCK TRANSLATION TABLE
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dan Williams <dan.j.williams@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
P: Documentation/nvdimm/maintainer-entry-profile.rst
@ -10399,7 +10399,7 @@ LIBNVDIMM PMEM: PERSISTENT MEMORY DRIVER
M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
P: Documentation/nvdimm/maintainer-entry-profile.rst
@ -10407,7 +10407,7 @@ F: drivers/nvdimm/pmem*
LIBNVDIMM: DEVICETREE BINDINGS
M: Oliver O'Halloran <oohall@gmail.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
F: Documentation/devicetree/bindings/pmem/pmem-region.txt
@ -10418,7 +10418,7 @@ M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
M: Ira Weiny <ira.weiny@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
P: Documentation/nvdimm/maintainer-entry-profile.rst
@ -15815,7 +15815,7 @@ F: include/uapi/linux/rose.h
F: net/rose/
ROTATION DRIVER FOR ALLWINNER A83T
M: Jernej Skrabec <jernej.skrabec@siol.net>
M: Jernej Skrabec <jernej.skrabec@gmail.com>
L: linux-media@vger.kernel.org
S: Maintained
T: git git://linuxtv.org/media_tree.git

View file

@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 13
SUBLEVEL = 0
EXTRAVERSION = -rc1
EXTRAVERSION = -rc2
NAME = Frozen Wasteland
# *DOCUMENTATION*

View file

@ -31,7 +31,7 @@ endif
ifdef CONFIG_ARC_CURR_IN_REG
# For a global register defintion, make sure it gets passed to every file
# For a global register definition, make sure it gets passed to every file
# We had a customer reported bug where some code built in kernel was NOT using
# any kernel headers, and missing the r25 global register
# Can't do unconditionally because of recursive include issues

View file

@ -116,7 +116,7 @@ static inline unsigned long __xchg(unsigned long val, volatile void *ptr,
*
* Technically the lock is also needed for UP (boils down to irq save/restore)
* but we can cheat a bit since cmpxchg() atomic_ops_lock() would cause irqs to
* be disabled thus can't possibly be interrpted/preempted/clobbered by xchg()
* be disabled thus can't possibly be interrupted/preempted/clobbered by xchg()
* Other way around, xchg is one instruction anyways, so can't be interrupted
* as such
*/
@ -143,7 +143,7 @@ static inline unsigned long __xchg(unsigned long val, volatile void *ptr,
/*
* "atomic" variant of xchg()
* REQ: It needs to follow the same serialization rules as other atomic_xxx()
* Since xchg() doesn't always do that, it would seem that following defintion
* Since xchg() doesn't always do that, it would seem that following definition
* is incorrect. But here's the rationale:
* SMP : Even xchg() takes the atomic_ops_lock, so OK.
* LLSC: atomic_ops_lock are not relevant at all (even if SMP, since LLSC

View file

@ -7,6 +7,18 @@
#include <uapi/asm/page.h>
#ifdef CONFIG_ARC_HAS_PAE40
#define MAX_POSSIBLE_PHYSMEM_BITS 40
#define PAGE_MASK_PHYS (0xff00000000ull | PAGE_MASK)
#else /* CONFIG_ARC_HAS_PAE40 */
#define MAX_POSSIBLE_PHYSMEM_BITS 32
#define PAGE_MASK_PHYS PAGE_MASK
#endif /* CONFIG_ARC_HAS_PAE40 */
#ifndef __ASSEMBLY__
#define clear_page(paddr) memset((paddr), 0, PAGE_SIZE)

View file

@ -107,8 +107,8 @@
#define ___DEF (_PAGE_PRESENT | _PAGE_CACHEABLE)
/* Set of bits not changed in pte_modify */
#define _PAGE_CHG_MASK (PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_SPECIAL)
#define _PAGE_CHG_MASK (PAGE_MASK_PHYS | _PAGE_ACCESSED | _PAGE_DIRTY | \
_PAGE_SPECIAL)
/* More Abbrevaited helpers */
#define PAGE_U_NONE __pgprot(___DEF)
#define PAGE_U_R __pgprot(___DEF | _PAGE_READ)
@ -132,13 +132,7 @@
#define PTE_BITS_IN_PD0 (_PAGE_GLOBAL | _PAGE_PRESENT | _PAGE_HW_SZ)
#define PTE_BITS_RWX (_PAGE_EXECUTE | _PAGE_WRITE | _PAGE_READ)
#ifdef CONFIG_ARC_HAS_PAE40
#define PTE_BITS_NON_RWX_IN_PD1 (0xff00000000 | PAGE_MASK | _PAGE_CACHEABLE)
#define MAX_POSSIBLE_PHYSMEM_BITS 40
#else
#define PTE_BITS_NON_RWX_IN_PD1 (PAGE_MASK | _PAGE_CACHEABLE)
#define MAX_POSSIBLE_PHYSMEM_BITS 32
#endif
#define PTE_BITS_NON_RWX_IN_PD1 (PAGE_MASK_PHYS | _PAGE_CACHEABLE)
/**************************************************************************
* Mapping of vm_flags (Generic VM) to PTE flags (arch specific)

View file

@ -33,5 +33,4 @@
#define PAGE_MASK (~(PAGE_SIZE-1))
#endif /* _UAPI__ASM_ARC_PAGE_H */

View file

@ -177,7 +177,7 @@ tracesys:
; Do the Sys Call as we normally would.
; Validate the Sys Call number
cmp r8, NR_syscalls
cmp r8, NR_syscalls - 1
mov.hi r0, -ENOSYS
bhi tracesys_exit
@ -255,7 +255,7 @@ ENTRY(EV_Trap)
;============ Normal syscall case
; syscall num shd not exceed the total system calls avail
cmp r8, NR_syscalls
cmp r8, NR_syscalls - 1
mov.hi r0, -ENOSYS
bhi .Lret_from_system_call

View file

@ -140,6 +140,7 @@ int kgdb_arch_handle_exception(int e_vector, int signo, int err_code,
ptr = &remcomInBuffer[1];
if (kgdb_hex2long(&ptr, &addr))
regs->ret = addr;
fallthrough;
case 'D':
case 'k':

View file

@ -50,14 +50,14 @@ SYSCALL_DEFINE3(arc_usr_cmpxchg, int *, uaddr, int, expected, int, new)
int ret;
/*
* This is only for old cores lacking LLOCK/SCOND, which by defintion
* This is only for old cores lacking LLOCK/SCOND, which by definition
* can't possibly be SMP. Thus doesn't need to be SMP safe.
* And this also helps reduce the overhead for serializing in
* the UP case
*/
WARN_ON_ONCE(IS_ENABLED(CONFIG_SMP));
/* Z indicates to userspace if operation succeded */
/* Z indicates to userspace if operation succeeded */
regs->status32 &= ~STATUS_Z_MASK;
ret = access_ok(uaddr, sizeof(*uaddr));
@ -107,7 +107,7 @@ SYSCALL_DEFINE3(arc_usr_cmpxchg, int *, uaddr, int, expected, int, new)
void arch_cpu_idle(void)
{
/* Re-enable interrupts <= default irq priority before commiting SLEEP */
/* Re-enable interrupts <= default irq priority before committing SLEEP */
const unsigned int arg = 0x10 | ARCV2_IRQ_DEF_PRIO;
__asm__ __volatile__(
@ -120,7 +120,7 @@ void arch_cpu_idle(void)
void arch_cpu_idle(void)
{
/* sleep, but enable both set E1/E2 (levels of interrutps) before committing */
/* sleep, but enable both set E1/E2 (levels of interrupts) before committing */
__asm__ __volatile__("sleep 0x3 \n");
}

View file

@ -259,7 +259,7 @@ setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
regs->r2 = (unsigned long)&sf->uc;
/*
* small optim to avoid unconditonally calling do_sigaltstack
* small optim to avoid unconditionally calling do_sigaltstack
* in sigreturn path, now that we only have rt_sigreturn
*/
magic = MAGIC_SIGALTSTK;
@ -391,7 +391,7 @@ void do_signal(struct pt_regs *regs)
void do_notify_resume(struct pt_regs *regs)
{
/*
* ASM glue gaurantees that this is only called when returning to
* ASM glue guarantees that this is only called when returning to
* user mode
*/
if (test_thread_flag(TIF_NOTIFY_RESUME))

View file

@ -157,7 +157,16 @@ void __init setup_arch_memory(void)
min_high_pfn = PFN_DOWN(high_mem_start);
max_high_pfn = PFN_DOWN(high_mem_start + high_mem_sz);
max_zone_pfn[ZONE_HIGHMEM] = min_low_pfn;
/*
* max_high_pfn should be ok here for both HIGHMEM and HIGHMEM+PAE.
* For HIGHMEM without PAE max_high_pfn should be less than
* min_low_pfn to guarantee that these two regions don't overlap.
* For PAE case highmem is greater than lowmem, so it is natural
* to use max_high_pfn.
*
* In both cases, holes should be handled by pfn_valid().
*/
max_zone_pfn[ZONE_HIGHMEM] = max_high_pfn;
high_memory = (void *)(min_high_pfn << PAGE_SHIFT);

View file

@ -53,9 +53,10 @@ EXPORT_SYMBOL(ioremap);
void __iomem *ioremap_prot(phys_addr_t paddr, unsigned long size,
unsigned long flags)
{
unsigned int off;
unsigned long vaddr;
struct vm_struct *area;
phys_addr_t off, end;
phys_addr_t end;
pgprot_t prot = __pgprot(flags);
/* Don't allow wraparound, zero size */
@ -72,7 +73,7 @@ void __iomem *ioremap_prot(phys_addr_t paddr, unsigned long size,
/* Mappings have to be page-aligned */
off = paddr & ~PAGE_MASK;
paddr &= PAGE_MASK;
paddr &= PAGE_MASK_PHYS;
size = PAGE_ALIGN(end + 1) - paddr;
/*

View file

@ -576,7 +576,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddr_unaligned,
pte_t *ptep)
{
unsigned long vaddr = vaddr_unaligned & PAGE_MASK;
phys_addr_t paddr = pte_val(*ptep) & PAGE_MASK;
phys_addr_t paddr = pte_val(*ptep) & PAGE_MASK_PHYS;
struct page *page = pfn_to_page(pte_pfn(*ptep));
create_tlb(vma, vaddr, ptep);

View file

@ -135,24 +135,18 @@ void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
return;
}
int xen_swiotlb_detect(void)
{
if (!xen_domain())
return 0;
if (xen_feature(XENFEAT_direct_mapped))
return 1;
/* legacy case */
if (!xen_feature(XENFEAT_not_direct_mapped) && xen_initial_domain())
return 1;
return 0;
}
static int __init xen_mm_init(void)
{
struct gnttab_cache_flush cflush;
int rc;
if (!xen_swiotlb_detect())
return 0;
xen_swiotlb_init();
rc = xen_swiotlb_init();
/* we can work with the default swiotlb */
if (rc < 0 && rc != -EEXIST)
return rc;
cflush.op = 0;
cflush.a.dev_bus_addr = 0;

View file

@ -175,6 +175,9 @@ vdso_install:
$(if $(CONFIG_COMPAT_VDSO), \
$(Q)$(MAKE) $(build)=arch/arm64/kernel/vdso32 $@)
archprepare:
$(Q)$(MAKE) $(build)=arch/arm64/tools kapi
# We use MRPROPER_FILES and CLEAN_FILES now
archclean:
$(Q)$(MAKE) $(clean)=$(boot)

View file

@ -5,3 +5,5 @@ generic-y += qrwlock.h
generic-y += qspinlock.h
generic-y += set_memory.h
generic-y += user.h
generated-y += cpucaps.h

View file

@ -1,74 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* arch/arm64/include/asm/cpucaps.h
*
* Copyright (C) 2016 ARM Ltd.
*/
#ifndef __ASM_CPUCAPS_H
#define __ASM_CPUCAPS_H
#define ARM64_WORKAROUND_CLEAN_CACHE 0
#define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE 1
#define ARM64_WORKAROUND_845719 2
#define ARM64_HAS_SYSREG_GIC_CPUIF 3
#define ARM64_HAS_PAN 4
#define ARM64_HAS_LSE_ATOMICS 5
#define ARM64_WORKAROUND_CAVIUM_23154 6
#define ARM64_WORKAROUND_834220 7
#define ARM64_HAS_NO_HW_PREFETCH 8
#define ARM64_HAS_VIRT_HOST_EXTN 11
#define ARM64_WORKAROUND_CAVIUM_27456 12
#define ARM64_HAS_32BIT_EL0 13
#define ARM64_SPECTRE_V3A 14
#define ARM64_HAS_CNP 15
#define ARM64_HAS_NO_FPSIMD 16
#define ARM64_WORKAROUND_REPEAT_TLBI 17
#define ARM64_WORKAROUND_QCOM_FALKOR_E1003 18
#define ARM64_WORKAROUND_858921 19
#define ARM64_WORKAROUND_CAVIUM_30115 20
#define ARM64_HAS_DCPOP 21
#define ARM64_SVE 22
#define ARM64_UNMAP_KERNEL_AT_EL0 23
#define ARM64_SPECTRE_V2 24
#define ARM64_HAS_RAS_EXTN 25
#define ARM64_WORKAROUND_843419 26
#define ARM64_HAS_CACHE_IDC 27
#define ARM64_HAS_CACHE_DIC 28
#define ARM64_HW_DBM 29
#define ARM64_SPECTRE_V4 30
#define ARM64_MISMATCHED_CACHE_TYPE 31
#define ARM64_HAS_STAGE2_FWB 32
#define ARM64_HAS_CRC32 33
#define ARM64_SSBS 34
#define ARM64_WORKAROUND_1418040 35
#define ARM64_HAS_SB 36
#define ARM64_WORKAROUND_SPECULATIVE_AT 37
#define ARM64_HAS_ADDRESS_AUTH_ARCH 38
#define ARM64_HAS_ADDRESS_AUTH_IMP_DEF 39
#define ARM64_HAS_GENERIC_AUTH_ARCH 40
#define ARM64_HAS_GENERIC_AUTH_IMP_DEF 41
#define ARM64_HAS_IRQ_PRIO_MASKING 42
#define ARM64_HAS_DCPODP 43
#define ARM64_WORKAROUND_1463225 44
#define ARM64_WORKAROUND_CAVIUM_TX2_219_TVM 45
#define ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM 46
#define ARM64_WORKAROUND_1542419 47
#define ARM64_HAS_E0PD 48
#define ARM64_HAS_RNG 49
#define ARM64_HAS_AMU_EXTN 50
#define ARM64_HAS_ADDRESS_AUTH 51
#define ARM64_HAS_GENERIC_AUTH 52
#define ARM64_HAS_32BIT_EL1 53
#define ARM64_BTI 54
#define ARM64_HAS_ARMv8_4_TTL 55
#define ARM64_HAS_TLB_RANGE 56
#define ARM64_MTE 57
#define ARM64_WORKAROUND_1508412 58
#define ARM64_HAS_LDAPR 59
#define ARM64_KVM_PROTECTED_MODE 60
#define ARM64_WORKAROUND_NVIDIA_CARMEL_CNP 61
#define ARM64_HAS_EPAN 62
#define ARM64_NCAPS 63
#endif /* __ASM_CPUCAPS_H */

View file

@ -55,8 +55,10 @@ void __sync_icache_dcache(pte_t pte)
{
struct page *page = pte_page(pte);
if (!test_and_set_bit(PG_dcache_clean, &page->flags))
if (!test_bit(PG_dcache_clean, &page->flags)) {
sync_icache_aliases(page_address(page), page_size(page));
set_bit(PG_dcache_clean, &page->flags);
}
}
EXPORT_SYMBOL_GPL(__sync_icache_dcache);

View file

@ -43,6 +43,7 @@
#include <linux/sizes.h>
#include <asm/tlb.h>
#include <asm/alternative.h>
#include <asm/xen/swiotlb-xen.h>
/*
* We need to be able to catch inadvertent references to memstart_addr
@ -482,7 +483,7 @@ void __init mem_init(void)
if (swiotlb_force == SWIOTLB_FORCE ||
max_pfn > PFN_DOWN(arm64_dma_phys_limit))
swiotlb_init(1);
else
else if (!xen_swiotlb_detect())
swiotlb_force = SWIOTLB_NO_FORCE;
set_max_mapnr(max_pfn - PHYS_PFN_OFFSET);

View file

@ -447,6 +447,18 @@ SYM_FUNC_START(__cpu_setup)
mov x10, #(SYS_GCR_EL1_RRND | SYS_GCR_EL1_EXCL_MASK)
msr_s SYS_GCR_EL1, x10
/*
* If GCR_EL1.RRND=1 is implemented the same way as RRND=0, then
* RGSR_EL1.SEED must be non-zero for IRG to produce
* pseudorandom numbers. As RGSR_EL1 is UNKNOWN out of reset, we
* must initialize it.
*/
mrs x10, CNTVCT_EL0
ands x10, x10, #SYS_RGSR_EL1_SEED_MASK
csinc x10, x10, xzr, ne
lsl x10, x10, #SYS_RGSR_EL1_SEED_SHIFT
msr_s SYS_RGSR_EL1, x10
/* clear any pending tag check faults in TFSR*_EL1 */
msr_s SYS_TFSR_EL1, xzr
msr_s SYS_TFSRE0_EL1, xzr

22
arch/arm64/tools/Makefile Normal file
View file

@ -0,0 +1,22 @@
# SPDX-License-Identifier: GPL-2.0
gen := arch/$(ARCH)/include/generated
kapi := $(gen)/asm
kapi-hdrs-y := $(kapi)/cpucaps.h
targets += $(addprefix ../../../,$(gen-y) $(kapi-hdrs-y))
PHONY += kapi
kapi: $(kapi-hdrs-y) $(gen-y)
# Create output directory if not already present
_dummy := $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
quiet_cmd_gen_cpucaps = GEN $@
cmd_gen_cpucaps = mkdir -p $(dir $@) && \
$(AWK) -f $(filter-out $(PHONY),$^) > $@
$(kapi)/cpucaps.h: $(src)/gen-cpucaps.awk $(src)/cpucaps FORCE
$(call if_changed,gen_cpucaps)

65
arch/arm64/tools/cpucaps Normal file
View file

@ -0,0 +1,65 @@
# SPDX-License-Identifier: GPL-2.0
#
# Internal CPU capabilities constants, keep this list sorted
BTI
HAS_32BIT_EL0
HAS_32BIT_EL1
HAS_ADDRESS_AUTH
HAS_ADDRESS_AUTH_ARCH
HAS_ADDRESS_AUTH_IMP_DEF
HAS_AMU_EXTN
HAS_ARMv8_4_TTL
HAS_CACHE_DIC
HAS_CACHE_IDC
HAS_CNP
HAS_CRC32
HAS_DCPODP
HAS_DCPOP
HAS_E0PD
HAS_EPAN
HAS_GENERIC_AUTH
HAS_GENERIC_AUTH_ARCH
HAS_GENERIC_AUTH_IMP_DEF
HAS_IRQ_PRIO_MASKING
HAS_LDAPR
HAS_LSE_ATOMICS
HAS_NO_FPSIMD
HAS_NO_HW_PREFETCH
HAS_PAN
HAS_RAS_EXTN
HAS_RNG
HAS_SB
HAS_STAGE2_FWB
HAS_SYSREG_GIC_CPUIF
HAS_TLB_RANGE
HAS_VIRT_HOST_EXTN
HW_DBM
KVM_PROTECTED_MODE
MISMATCHED_CACHE_TYPE
MTE
SPECTRE_V2
SPECTRE_V3A
SPECTRE_V4
SSBS
SVE
UNMAP_KERNEL_AT_EL0
WORKAROUND_834220
WORKAROUND_843419
WORKAROUND_845719
WORKAROUND_858921
WORKAROUND_1418040
WORKAROUND_1463225
WORKAROUND_1508412
WORKAROUND_1542419
WORKAROUND_CAVIUM_23154
WORKAROUND_CAVIUM_27456
WORKAROUND_CAVIUM_30115
WORKAROUND_CAVIUM_TX2_219_PRFM
WORKAROUND_CAVIUM_TX2_219_TVM
WORKAROUND_CLEAN_CACHE
WORKAROUND_DEVICE_LOAD_ACQUIRE
WORKAROUND_NVIDIA_CARMEL_CNP
WORKAROUND_QCOM_FALKOR_E1003
WORKAROUND_REPEAT_TLBI
WORKAROUND_SPECULATIVE_AT

View file

@ -0,0 +1,40 @@
#!/bin/awk -f
# SPDX-License-Identifier: GPL-2.0
# gen-cpucaps.awk: arm64 cpucaps header generator
#
# Usage: awk -f gen-cpucaps.awk cpucaps.txt
# Log an error and terminate
function fatal(msg) {
print "Error at line " NR ": " msg > "/dev/stderr"
exit 1
}
# skip blank lines and comment lines
/^$/ { next }
/^#/ { next }
BEGIN {
print "#ifndef __ASM_CPUCAPS_H"
print "#define __ASM_CPUCAPS_H"
print ""
print "/* Generated file - do not edit */"
cap_num = 0
print ""
}
/^[vA-Z0-9_]+$/ {
printf("#define ARM64_%-30s\t%d\n", $0, cap_num++)
next
}
END {
printf("#define ARM64_NCAPS\t\t\t\t%d\n", cap_num)
print ""
print "#endif /* __ASM_CPUCAPS_H */"
}
# Any lines not handled by previous rules are unexpected
{
fatal("unhandled statement")
}

View file

@ -120,6 +120,7 @@ extern s32 patch__call_flush_branch_caches3;
extern s32 patch__flush_count_cache_return;
extern s32 patch__flush_link_stack_return;
extern s32 patch__call_kvm_flush_link_stack;
extern s32 patch__call_kvm_flush_link_stack_p9;
extern s32 patch__memset_nocache, patch__memcpy_nocache;
extern long flush_branch_caches;
@ -140,7 +141,7 @@ void kvmhv_load_host_pmu(void);
void kvmhv_save_guest_pmu(struct kvm_vcpu *vcpu, bool pmu_in_use);
void kvmhv_load_guest_pmu(struct kvm_vcpu *vcpu);
int __kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu);
void kvmppc_p9_enter_guest(struct kvm_vcpu *vcpu);
long kvmppc_h_set_dabr(struct kvm_vcpu *vcpu, unsigned long dabr);
long kvmppc_h_set_xdabr(struct kvm_vcpu *vcpu, unsigned long dabr,

View file

@ -19,6 +19,7 @@ struct mmu_psize_def {
int penc[MMU_PAGE_COUNT]; /* HPTE encoding */
unsigned int tlbiel; /* tlbiel supported for that page size */
unsigned long avpnm; /* bits to mask out in AVPN in the HPTE */
unsigned long h_rpt_pgsize; /* H_RPT_INVALIDATE page size encoding */
union {
unsigned long sllp; /* SLB L||LP (exact mask to use in slbmte) */
unsigned long ap; /* Ap encoding used by PowerISA 3.0 */

View file

@ -4,6 +4,10 @@
#include <asm/hvcall.h>
#define RIC_FLUSH_TLB 0
#define RIC_FLUSH_PWC 1
#define RIC_FLUSH_ALL 2
struct vm_area_struct;
struct mm_struct;
struct mmu_gather;

View file

@ -98,6 +98,36 @@ static inline int cpu_last_thread_sibling(int cpu)
return cpu | (threads_per_core - 1);
}
/*
* tlb_thread_siblings are siblings which share a TLB. This is not
* architected, is not something a hypervisor could emulate and a future
* CPU may change behaviour even in compat mode, so this should only be
* used on PowerNV, and only with care.
*/
static inline int cpu_first_tlb_thread_sibling(int cpu)
{
if (cpu_has_feature(CPU_FTR_ARCH_300) && (threads_per_core == 8))
return cpu & ~0x6; /* Big Core */
else
return cpu_first_thread_sibling(cpu);
}
static inline int cpu_last_tlb_thread_sibling(int cpu)
{
if (cpu_has_feature(CPU_FTR_ARCH_300) && (threads_per_core == 8))
return cpu | 0x6; /* Big Core */
else
return cpu_last_thread_sibling(cpu);
}
static inline int cpu_tlb_thread_sibling_step(void)
{
if (cpu_has_feature(CPU_FTR_ARCH_300) && (threads_per_core == 8))
return 2; /* Big Core */
else
return 1;
}
static inline u32 get_tensr(void)
{
#ifdef CONFIG_BOOKE

View file

@ -35,6 +35,19 @@
/* PACA save area size in u64 units (exgen, exmc, etc) */
#define EX_SIZE 10
/* PACA save area offsets */
#define EX_R9 0
#define EX_R10 8
#define EX_R11 16
#define EX_R12 24
#define EX_R13 32
#define EX_DAR 40
#define EX_DSISR 48
#define EX_CCR 52
#define EX_CFAR 56
#define EX_PPR 64
#define EX_CTR 72
/*
* maximum recursive depth of MCE exceptions
*/

View file

@ -413,9 +413,9 @@
#define H_RPTI_TYPE_NESTED 0x0001 /* Invalidate nested guest partition-scope */
#define H_RPTI_TYPE_TLB 0x0002 /* Invalidate TLB */
#define H_RPTI_TYPE_PWC 0x0004 /* Invalidate Page Walk Cache */
/* Invalidate Process Table Entries if H_RPTI_TYPE_NESTED is clear */
/* Invalidate caching of Process Table Entries if H_RPTI_TYPE_NESTED is clear */
#define H_RPTI_TYPE_PRT 0x0008
/* Invalidate Partition Table Entries if H_RPTI_TYPE_NESTED is set */
/* Invalidate caching of Partition Table Entries if H_RPTI_TYPE_NESTED is set */
#define H_RPTI_TYPE_PAT 0x0008
#define H_RPTI_TYPE_ALL (H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC | \
H_RPTI_TYPE_PRT)
@ -448,6 +448,9 @@
*/
long plpar_hcall_norets(unsigned long opcode, ...);
/* Variant which does not do hcall tracing */
long plpar_hcall_norets_notrace(unsigned long opcode, ...);
/**
* plpar_hcall: - Make a pseries hypervisor call
* @opcode: The hypervisor call to make.

View file

@ -153,8 +153,6 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
*/
static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt_state *state)
{
if (user_mode(regs))
kuep_unlock();
}
static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
@ -222,6 +220,13 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !(regs->msr & MSR_PR) &&
regs->nip < (unsigned long)__end_interrupts) {
// Kernel code running below __end_interrupts is
// implicitly soft-masked.
regs->softe = IRQS_ALL_DISABLED;
}
/* Don't do any per-CPU operations until interrupt state is fixed */
if (nmi_disables_ftrace(regs)) {

View file

@ -147,6 +147,7 @@
#define KVM_GUEST_MODE_SKIP 2
#define KVM_GUEST_MODE_GUEST_HV 3
#define KVM_GUEST_MODE_HOST_HV 4
#define KVM_GUEST_MODE_HV_P9 5 /* ISA >= v3.0 path */
#define KVM_INST_FETCH_FAILED -1

View file

@ -307,6 +307,9 @@ void kvmhv_set_ptbl_entry(unsigned int lpid, u64 dw0, u64 dw1);
void kvmhv_release_all_nested(struct kvm *kvm);
long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu);
long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu);
long do_h_rpt_invalidate_pat(struct kvm_vcpu *vcpu, unsigned long lpid,
unsigned long type, unsigned long pg_sizes,
unsigned long start, unsigned long end);
int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu,
u64 time_limit, unsigned long lpcr);
void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr);

View file

@ -153,9 +153,17 @@ static inline bool kvmhv_vcpu_is_radix(struct kvm_vcpu *vcpu)
return radix;
}
int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr);
#define KVM_DEFAULT_HPT_ORDER 24 /* 16MB HPT by default */
#endif
/*
* Invalid HDSISR value which is used to indicate when HW has not set the reg.
* Used to work around an errata.
*/
#define HDSISR_CANARY 0x7fff
/*
* We use a lock bit in HPTE dword 0 to synchronize updates and
* accesses to each HPTE, and another bit to indicate non-present

View file

@ -298,7 +298,6 @@ struct kvm_arch {
u8 fwnmi_enabled;
u8 secure_guest;
u8 svm_enabled;
bool threads_indep;
bool nested_enable;
bool dawr1_enabled;
pgd_t *pgtable;
@ -684,7 +683,12 @@ struct kvm_vcpu_arch {
ulong fault_dar;
u32 fault_dsisr;
unsigned long intr_msr;
ulong fault_gpa; /* guest real address of page fault (POWER9) */
/*
* POWER9 and later: fault_gpa contains the guest real address of page
* fault for a radix guest, or segment descriptor (equivalent to result
* from slbmfev of SLB entry that translated the EA) for hash guests.
*/
ulong fault_gpa;
#endif
#ifdef CONFIG_BOOKE

View file

@ -129,6 +129,7 @@ extern void kvmppc_core_vcpu_put(struct kvm_vcpu *vcpu);
extern int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu);
extern int kvmppc_core_pending_dec(struct kvm_vcpu *vcpu);
extern void kvmppc_core_queue_machine_check(struct kvm_vcpu *vcpu, ulong flags);
extern void kvmppc_core_queue_syscall(struct kvm_vcpu *vcpu);
extern void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags);
extern void kvmppc_core_queue_fpunavail(struct kvm_vcpu *vcpu);
extern void kvmppc_core_queue_vec_unavail(struct kvm_vcpu *vcpu);
@ -606,6 +607,7 @@ extern void kvmppc_free_pimap(struct kvm *kvm);
extern int kvmppc_xics_rm_complete(struct kvm_vcpu *vcpu, u32 hcall);
extern void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu);
extern int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd);
extern int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req);
extern u64 kvmppc_xics_get_icp(struct kvm_vcpu *vcpu);
extern int kvmppc_xics_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
extern int kvmppc_xics_connect_vcpu(struct kvm_device *dev,
@ -638,6 +640,8 @@ static inline int kvmppc_xics_enabled(struct kvm_vcpu *vcpu)
static inline void kvmppc_xics_free_icp(struct kvm_vcpu *vcpu) { }
static inline int kvmppc_xics_hcall(struct kvm_vcpu *vcpu, u32 cmd)
{ return 0; }
static inline int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req)
{ return 0; }
#endif
#ifdef CONFIG_KVM_XIVE
@ -655,8 +659,6 @@ extern int kvmppc_xive_get_xive(struct kvm *kvm, u32 irq, u32 *server,
u32 *priority);
extern int kvmppc_xive_int_on(struct kvm *kvm, u32 irq);
extern int kvmppc_xive_int_off(struct kvm *kvm, u32 irq);
extern void kvmppc_xive_init_module(void);
extern void kvmppc_xive_exit_module(void);
extern int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
struct kvm_vcpu *vcpu, u32 cpu);
@ -671,6 +673,8 @@ extern int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval);
extern int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
int level, bool line_status);
extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
extern void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu);
extern void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu);
static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
{
@ -680,8 +684,6 @@ static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
extern int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
struct kvm_vcpu *vcpu, u32 cpu);
extern void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu);
extern void kvmppc_xive_native_init_module(void);
extern void kvmppc_xive_native_exit_module(void);
extern int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu,
union kvmppc_one_reg *val);
extern int kvmppc_xive_native_set_vp(struct kvm_vcpu *vcpu,
@ -695,8 +697,6 @@ static inline int kvmppc_xive_get_xive(struct kvm *kvm, u32 irq, u32 *server,
u32 *priority) { return -1; }
static inline int kvmppc_xive_int_on(struct kvm *kvm, u32 irq) { return -1; }
static inline int kvmppc_xive_int_off(struct kvm *kvm, u32 irq) { return -1; }
static inline void kvmppc_xive_init_module(void) { }
static inline void kvmppc_xive_exit_module(void) { }
static inline int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
struct kvm_vcpu *vcpu, u32 cpu) { return -EBUSY; }
@ -711,14 +711,14 @@ static inline int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, u64 icpval) { retur
static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
int level, bool line_status) { return -ENODEV; }
static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
static inline void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu) { }
static inline void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu) { }
static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
{ return 0; }
static inline int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
struct kvm_vcpu *vcpu, u32 cpu) { return -EBUSY; }
static inline void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) { }
static inline void kvmppc_xive_native_init_module(void) { }
static inline void kvmppc_xive_native_exit_module(void) { }
static inline int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu,
union kvmppc_one_reg *val)
{ return 0; }
@ -754,7 +754,7 @@ long kvmppc_rm_h_stuff_tce(struct kvm_vcpu *vcpu,
unsigned long tce_value, unsigned long npages);
long int kvmppc_rm_h_confer(struct kvm_vcpu *vcpu, int target,
unsigned int yield_count);
long kvmppc_h_random(struct kvm_vcpu *vcpu);
long kvmppc_rm_h_random(struct kvm_vcpu *vcpu);
void kvmhv_commence_exit(int trap);
void kvmppc_realmode_machine_check(struct kvm_vcpu *vcpu);
void kvmppc_subcore_enter_guest(void);

View file

@ -122,12 +122,6 @@ static inline bool need_extra_context(struct mm_struct *mm, unsigned long ea)
}
#endif
#if defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE) && defined(CONFIG_PPC_RADIX_MMU)
extern void radix_kvm_prefetch_workaround(struct mm_struct *mm);
#else
static inline void radix_kvm_prefetch_workaround(struct mm_struct *mm) { }
#endif
extern void switch_cop(struct mm_struct *next);
extern int use_cop(unsigned long acop, struct mm_struct *mm);
extern void drop_cop(unsigned long acop, struct mm_struct *mm);
@ -222,6 +216,18 @@ static inline void mm_context_add_copro(struct mm_struct *mm) { }
static inline void mm_context_remove_copro(struct mm_struct *mm) { }
#endif
#if defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE) && defined(CONFIG_PPC_RADIX_MMU)
void do_h_rpt_invalidate_prt(unsigned long pid, unsigned long lpid,
unsigned long type, unsigned long pg_sizes,
unsigned long start, unsigned long end);
#else
static inline void do_h_rpt_invalidate_prt(unsigned long pid,
unsigned long lpid,
unsigned long type,
unsigned long pg_sizes,
unsigned long start,
unsigned long end) { }
#endif
extern void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk);

View file

@ -28,19 +28,35 @@ static inline u32 yield_count_of(int cpu)
return be32_to_cpu(yield_count);
}
/*
* Spinlock code confers and prods, so don't trace the hcalls because the
* tracing code takes spinlocks which can cause recursion deadlocks.
*
* These calls are made while the lock is not held: the lock slowpath yields if
* it can not acquire the lock, and unlock slow path might prod if a waiter has
* yielded). So this may not be a problem for simple spin locks because the
* tracing does not technically recurse on the lock, but we avoid it anyway.
*
* However the queued spin lock contended path is more strictly ordered: the
* H_CONFER hcall is made after the task has queued itself on the lock, so then
* recursing on that lock will cause the task to then queue up again behind the
* first instance (or worse: queued spinlocks use tricks that assume a context
* never waits on more than one spinlock, so such recursion may cause random
* corruption in the lock code).
*/
static inline void yield_to_preempted(int cpu, u32 yield_count)
{
plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
}
static inline void prod_cpu(int cpu)
{
plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu));
plpar_hcall_norets_notrace(H_PROD, get_hard_smp_processor_id(cpu));
}
static inline void yield_to_any(void)
{
plpar_hcall_norets(H_CONFER, -1, 0);
plpar_hcall_norets_notrace(H_CONFER, -1, 0);
}
#else
static inline bool is_shared_processor(void)

View file

@ -28,7 +28,11 @@ static inline void set_cede_latency_hint(u8 latency_hint)
static inline long cede_processor(void)
{
return plpar_hcall_norets(H_CEDE);
/*
* We cannot call tracepoints inside RCU idle regions which
* means we must not trace H_CEDE.
*/
return plpar_hcall_norets_notrace(H_CEDE);
}
static inline long extended_cede_processor(unsigned long latency_hint)

View file

@ -97,6 +97,18 @@ extern void div128_by_32(u64 dividend_high, u64 dividend_low,
extern void secondary_cpu_time_init(void);
extern void __init time_init(void);
#ifdef CONFIG_PPC64
static inline unsigned long test_irq_work_pending(void)
{
unsigned long x;
asm volatile("lbz %0,%1(13)"
: "=r" (x)
: "i" (offsetof(struct paca_struct, irq_work_pending)));
return x;
}
#endif
DECLARE_PER_CPU(u64, decrementers_next_tb);
/* Convert timebase ticks to nanoseconds */

View file

@ -157,7 +157,7 @@ do { \
"2: lwz%X1 %L0, %L1\n" \
EX_TABLE(1b, %l2) \
EX_TABLE(2b, %l2) \
: "=r" (x) \
: "=&r" (x) \
: "m" (*addr) \
: \
: label)

View file

@ -534,7 +534,6 @@ int main(void)
OFFSET(VCPU_SLB_NR, kvm_vcpu, arch.slb_nr);
OFFSET(VCPU_FAULT_DSISR, kvm_vcpu, arch.fault_dsisr);
OFFSET(VCPU_FAULT_DAR, kvm_vcpu, arch.fault_dar);
OFFSET(VCPU_FAULT_GPA, kvm_vcpu, arch.fault_gpa);
OFFSET(VCPU_INTR_MSR, kvm_vcpu, arch.intr_msr);
OFFSET(VCPU_LAST_INST, kvm_vcpu, arch.last_inst);
OFFSET(VCPU_TRAP, kvm_vcpu, arch.trap);

View file

@ -340,6 +340,12 @@ ret_from_mc_except:
andi. r10,r10,IRQS_DISABLED; /* yes -> go out of line */ \
bne masked_interrupt_book3e_##n
/*
* Additional regs must be re-loaded from paca before EXCEPTION_COMMON* is
* called, because that does SAVE_NVGPRS which must see the original register
* values, otherwise the scratch values might be restored when exiting the
* interrupt.
*/
#define PROLOG_ADDITION_2REGS_GEN(n) \
std r14,PACA_EXGEN+EX_R14(r13); \
std r15,PACA_EXGEN+EX_R15(r13)
@ -535,6 +541,10 @@ __end_interrupts:
PROLOG_ADDITION_2REGS)
mfspr r14,SPRN_DEAR
mfspr r15,SPRN_ESR
std r14,_DAR(r1)
std r15,_DSISR(r1)
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
EXCEPTION_COMMON(0x300)
b storage_fault_common
@ -544,6 +554,10 @@ __end_interrupts:
PROLOG_ADDITION_2REGS)
li r15,0
mr r14,r10
std r14,_DAR(r1)
std r15,_DSISR(r1)
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
EXCEPTION_COMMON(0x400)
b storage_fault_common
@ -557,6 +571,10 @@ __end_interrupts:
PROLOG_ADDITION_2REGS)
mfspr r14,SPRN_DEAR
mfspr r15,SPRN_ESR
std r14,_DAR(r1)
std r15,_DSISR(r1)
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
EXCEPTION_COMMON(0x600)
b alignment_more /* no room, go out of line */
@ -565,10 +583,10 @@ __end_interrupts:
NORMAL_EXCEPTION_PROLOG(0x700, BOOKE_INTERRUPT_PROGRAM,
PROLOG_ADDITION_1REG)
mfspr r14,SPRN_ESR
EXCEPTION_COMMON(0x700)
std r14,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXGEN+EX_R14(r13)
EXCEPTION_COMMON(0x700)
addi r3,r1,STACK_FRAME_OVERHEAD
bl program_check_exception
REST_NVGPRS(r1)
b interrupt_return
@ -725,11 +743,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
* normal exception
*/
mfspr r14,SPRN_DBSR
EXCEPTION_COMMON_CRIT(0xd00)
std r14,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXCRIT+EX_R14(r13)
ld r15,PACA_EXCRIT+EX_R15(r13)
EXCEPTION_COMMON_CRIT(0xd00)
addi r3,r1,STACK_FRAME_OVERHEAD
bl DebugException
REST_NVGPRS(r1)
b interrupt_return
@ -796,11 +814,11 @@ kernel_dbg_exc:
* normal exception
*/
mfspr r14,SPRN_DBSR
EXCEPTION_COMMON_DBG(0xd08)
std r14,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXDBG+EX_R14(r13)
ld r15,PACA_EXDBG+EX_R15(r13)
EXCEPTION_COMMON_DBG(0xd08)
addi r3,r1,STACK_FRAME_OVERHEAD
bl DebugException
REST_NVGPRS(r1)
b interrupt_return
@ -931,11 +949,7 @@ masked_interrupt_book3e_0x2c0:
* original values stashed away in the PACA
*/
storage_fault_common:
std r14,_DAR(r1)
std r15,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
bl do_page_fault
b interrupt_return
@ -944,11 +958,7 @@ storage_fault_common:
* continues here.
*/
alignment_more:
std r14,_DAR(r1)
std r15,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
bl alignment_exception
REST_NVGPRS(r1)
b interrupt_return

View file

@ -21,22 +21,6 @@
#include <asm/feature-fixups.h>
#include <asm/kup.h>
/* PACA save area offsets (exgen, exmc, etc) */
#define EX_R9 0
#define EX_R10 8
#define EX_R11 16
#define EX_R12 24
#define EX_R13 32
#define EX_DAR 40
#define EX_DSISR 48
#define EX_CCR 52
#define EX_CFAR 56
#define EX_PPR 64
#define EX_CTR 72
.if EX_SIZE != 10
.error "EX_SIZE is wrong"
.endif
/*
* Following are fixed section helper macros.
*
@ -133,7 +117,6 @@ name:
#define IBRANCH_TO_COMMON .L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch to common */
#define IREALMODE_COMMON .L_IREALMODE_COMMON_\name\() /* Common runs in realmode */
#define IMASK .L_IMASK_\name\() /* IRQ soft-mask bit */
#define IKVM_SKIP .L_IKVM_SKIP_\name\() /* Generate KVM skip handler */
#define IKVM_REAL .L_IKVM_REAL_\name\() /* Real entry tests KVM */
#define __IKVM_REAL(name) .L_IKVM_REAL_ ## name
#define IKVM_VIRT .L_IKVM_VIRT_\name\() /* Virt entry tests KVM */
@ -190,9 +173,6 @@ do_define_int n
.ifndef IMASK
IMASK=0
.endif
.ifndef IKVM_SKIP
IKVM_SKIP=0
.endif
.ifndef IKVM_REAL
IKVM_REAL=0
.endif
@ -207,8 +187,6 @@ do_define_int n
.endif
.endm
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
/*
* All interrupts which set HSRR registers, as well as SRESET and MCE and
* syscall when invoked with "sc 1" switch to MSR[HV]=1 (HVMODE) to be taken,
@ -238,88 +216,28 @@ do_define_int n
/*
* If an interrupt is taken while a guest is running, it is immediately routed
* to KVM to handle. If both HV and PR KVM arepossible, KVM interrupts go first
* to kvmppc_interrupt_hv, which handles the PR guest case.
* to KVM to handle.
*/
#define kvmppc_interrupt kvmppc_interrupt_hv
#else
#define kvmppc_interrupt kvmppc_interrupt_pr
#endif
.macro KVMTEST name
.macro KVMTEST name handler
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
lbz r10,HSTATE_IN_GUEST(r13)
cmpwi r10,0
bne \name\()_kvm
.endm
.macro GEN_KVM name
.balign IFETCH_ALIGN_BYTES
\name\()_kvm:
.if IKVM_SKIP
cmpwi r10,KVM_GUEST_MODE_SKIP
beq 89f
.else
BEGIN_FTR_SECTION
ld r10,IAREA+EX_CFAR(r13)
std r10,HSTATE_CFAR(r13)
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
.endif
ld r10,IAREA+EX_CTR(r13)
mtctr r10
BEGIN_FTR_SECTION
ld r10,IAREA+EX_PPR(r13)
std r10,HSTATE_PPR(r13)
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld r11,IAREA+EX_R11(r13)
ld r12,IAREA+EX_R12(r13)
std r12,HSTATE_SCRATCH0(r13)
sldi r12,r9,32
ld r9,IAREA+EX_R9(r13)
ld r10,IAREA+EX_R10(r13)
/* HSRR variants have the 0x2 bit added to their trap number */
.if IHSRR_IF_HVMODE
BEGIN_FTR_SECTION
ori r12,r12,(IVEC + 0x2)
li r10,(IVEC + 0x2)
FTR_SECTION_ELSE
ori r12,r12,(IVEC)
li r10,(IVEC)
ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
.elseif IHSRR
ori r12,r12,(IVEC+ 0x2)
li r10,(IVEC + 0x2)
.else
ori r12,r12,(IVEC)
li r10,(IVEC)
.endif
b kvmppc_interrupt
.if IKVM_SKIP
89: mtocrf 0x80,r9
ld r10,IAREA+EX_CTR(r13)
mtctr r10
ld r9,IAREA+EX_R9(r13)
ld r10,IAREA+EX_R10(r13)
ld r11,IAREA+EX_R11(r13)
ld r12,IAREA+EX_R12(r13)
.if IHSRR_IF_HVMODE
BEGIN_FTR_SECTION
b kvmppc_skip_Hinterrupt
FTR_SECTION_ELSE
b kvmppc_skip_interrupt
ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
.elseif IHSRR
b kvmppc_skip_Hinterrupt
.else
b kvmppc_skip_interrupt
.endif
.endif
.endm
#else
.macro KVMTEST name
.endm
.macro GEN_KVM name
.endm
bne \handler
#endif
.endm
/*
* This is the BOOK3S interrupt entry code macro.
@ -461,7 +379,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
DEFINE_FIXED_SYMBOL(\name\()_common_real)
\name\()_common_real:
.if IKVM_REAL
KVMTEST \name
KVMTEST \name kvm_interrupt
.endif
ld r10,PACAKMSR(r13) /* get MSR value for kernel */
@ -484,7 +402,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real)
DEFINE_FIXED_SYMBOL(\name\()_common_virt)
\name\()_common_virt:
.if IKVM_VIRT
KVMTEST \name
KVMTEST \name kvm_interrupt
1:
.endif
.endif /* IVIRT */
@ -498,7 +416,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_virt)
DEFINE_FIXED_SYMBOL(\name\()_common_real)
\name\()_common_real:
.if IKVM_REAL
KVMTEST \name
KVMTEST \name kvm_interrupt
.endif
.endm
@ -1000,8 +918,6 @@ EXC_COMMON_BEGIN(system_reset_common)
EXCEPTION_RESTORE_REGS
RFI_TO_USER_OR_KERNEL
GEN_KVM system_reset
/**
* Interrupt 0x200 - Machine Check Interrupt (MCE).
@ -1070,7 +986,6 @@ INT_DEFINE_BEGIN(machine_check)
ISET_RI=0
IDAR=1
IDSISR=1
IKVM_SKIP=1
IKVM_REAL=1
INT_DEFINE_END(machine_check)
@ -1166,7 +1081,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
/*
* Check if we are coming from guest. If yes, then run the normal
* exception handler which will take the
* machine_check_kvm->kvmppc_interrupt branch to deliver the MC event
* machine_check_kvm->kvm_interrupt branch to deliver the MC event
* to guest.
*/
lbz r11,HSTATE_IN_GUEST(r13)
@ -1236,8 +1151,6 @@ EXC_COMMON_BEGIN(machine_check_common)
bl machine_check_exception
b interrupt_return
GEN_KVM machine_check
#ifdef CONFIG_PPC_P7_NAP
/*
@ -1342,7 +1255,6 @@ INT_DEFINE_BEGIN(data_access)
IVEC=0x300
IDAR=1
IDSISR=1
IKVM_SKIP=1
IKVM_REAL=1
INT_DEFINE_END(data_access)
@ -1373,8 +1285,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
REST_NVGPRS(r1)
b interrupt_return
GEN_KVM data_access
/**
* Interrupt 0x380 - Data Segment Interrupt (DSLB).
@ -1396,7 +1306,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
INT_DEFINE_BEGIN(data_access_slb)
IVEC=0x380
IDAR=1
IKVM_SKIP=1
IKVM_REAL=1
INT_DEFINE_END(data_access_slb)
@ -1425,8 +1334,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
bl do_bad_slb_fault
b interrupt_return
GEN_KVM data_access_slb
/**
* Interrupt 0x400 - Instruction Storage Interrupt (ISI).
@ -1463,8 +1370,6 @@ MMU_FTR_SECTION_ELSE
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
b interrupt_return
GEN_KVM instruction_access
/**
* Interrupt 0x480 - Instruction Segment Interrupt (ISLB).
@ -1509,8 +1414,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
bl do_bad_slb_fault
b interrupt_return
GEN_KVM instruction_access_slb
/**
* Interrupt 0x500 - External Interrupt.
@ -1555,8 +1458,6 @@ EXC_COMMON_BEGIN(hardware_interrupt_common)
bl do_IRQ
b interrupt_return
GEN_KVM hardware_interrupt
/**
* Interrupt 0x600 - Alignment Interrupt
@ -1584,8 +1485,6 @@ EXC_COMMON_BEGIN(alignment_common)
REST_NVGPRS(r1) /* instruction emulation may change GPRs */
b interrupt_return
GEN_KVM alignment
/**
* Interrupt 0x700 - Program Interrupt (program check).
@ -1693,8 +1592,6 @@ EXC_COMMON_BEGIN(program_check_common)
REST_NVGPRS(r1) /* instruction emulation may change GPRs */
b interrupt_return
GEN_KVM program_check
/*
* Interrupt 0x800 - Floating-Point Unavailable Interrupt.
@ -1744,8 +1641,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
b interrupt_return
#endif
GEN_KVM fp_unavailable
/**
* Interrupt 0x900 - Decrementer Interrupt.
@ -1784,8 +1679,6 @@ EXC_COMMON_BEGIN(decrementer_common)
bl timer_interrupt
b interrupt_return
GEN_KVM decrementer
/**
* Interrupt 0x980 - Hypervisor Decrementer Interrupt.
@ -1831,8 +1724,6 @@ EXC_COMMON_BEGIN(hdecrementer_common)
ld r13,PACA_EXGEN+EX_R13(r13)
HRFI_TO_KERNEL
GEN_KVM hdecrementer
/**
* Interrupt 0xa00 - Directed Privileged Doorbell Interrupt.
@ -1872,8 +1763,6 @@ EXC_COMMON_BEGIN(doorbell_super_common)
#endif
b interrupt_return
GEN_KVM doorbell_super
EXC_REAL_NONE(0xb00, 0x100)
EXC_VIRT_NONE(0x4b00, 0x100)
@ -1923,7 +1812,7 @@ INT_DEFINE_END(system_call)
GET_PACA(r13)
std r10,PACA_EXGEN+EX_R10(r13)
INTERRUPT_TO_KERNEL
KVMTEST system_call /* uses r10, branch to system_call_kvm */
KVMTEST system_call kvm_hcall /* uses r10, branch to kvm_hcall */
mfctr r9
#else
mr r9,r13
@ -1979,14 +1868,16 @@ EXC_VIRT_BEGIN(system_call, 0x4c00, 0x100)
EXC_VIRT_END(system_call, 0x4c00, 0x100)
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
TRAMP_REAL_BEGIN(system_call_kvm)
/*
* This is a hcall, so register convention is as above, with these
* differences:
* r13 = PACA
* ctr = orig r13
* orig r10 saved in PACA
*/
TRAMP_REAL_BEGIN(kvm_hcall)
std r9,PACA_EXGEN+EX_R9(r13)
std r11,PACA_EXGEN+EX_R11(r13)
std r12,PACA_EXGEN+EX_R12(r13)
mfcr r9
mfctr r10
std r10,PACA_EXGEN+EX_R13(r13)
li r10,0
std r10,PACA_EXGEN+EX_CFAR(r13)
std r10,PACA_EXGEN+EX_CTR(r13)
/*
* Save the PPR (on systems that support it) before changing to
* HMT_MEDIUM. That allows the KVM code to save that value into the
@ -1994,31 +1885,24 @@ TRAMP_REAL_BEGIN(system_call_kvm)
*/
BEGIN_FTR_SECTION
mfspr r10,SPRN_PPR
std r10,HSTATE_PPR(r13)
std r10,PACA_EXGEN+EX_PPR(r13)
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
HMT_MEDIUM
mfctr r10
SET_SCRATCH0(r10)
mfcr r10
std r12,HSTATE_SCRATCH0(r13)
sldi r12,r10,32
ori r12,r12,0xc00
#ifdef CONFIG_RELOCATABLE
/*
* Requires __LOAD_FAR_HANDLER beause kvmppc_interrupt lives
* Requires __LOAD_FAR_HANDLER beause kvmppc_hcall lives
* outside the head section.
*/
__LOAD_FAR_HANDLER(r10, kvmppc_interrupt)
__LOAD_FAR_HANDLER(r10, kvmppc_hcall)
mtctr r10
ld r10,PACA_EXGEN+EX_R10(r13)
bctr
#else
ld r10,PACA_EXGEN+EX_R10(r13)
b kvmppc_interrupt
b kvmppc_hcall
#endif
#endif
/**
* Interrupt 0xd00 - Trace Interrupt.
* This is a synchronous interrupt in response to instruction step or
@ -2043,8 +1927,6 @@ EXC_COMMON_BEGIN(single_step_common)
bl single_step_exception
b interrupt_return
GEN_KVM single_step
/**
* Interrupt 0xe00 - Hypervisor Data Storage Interrupt (HDSI).
@ -2063,7 +1945,6 @@ INT_DEFINE_BEGIN(h_data_storage)
IHSRR=1
IDAR=1
IDSISR=1
IKVM_SKIP=1
IKVM_REAL=1
IKVM_VIRT=1
INT_DEFINE_END(h_data_storage)
@ -2084,8 +1965,6 @@ MMU_FTR_SECTION_ELSE
ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_TYPE_RADIX)
b interrupt_return
GEN_KVM h_data_storage
/**
* Interrupt 0xe20 - Hypervisor Instruction Storage Interrupt (HISI).
@ -2111,8 +1990,6 @@ EXC_COMMON_BEGIN(h_instr_storage_common)
bl unknown_exception
b interrupt_return
GEN_KVM h_instr_storage
/**
* Interrupt 0xe40 - Hypervisor Emulation Assistance Interrupt.
@ -2137,8 +2014,6 @@ EXC_COMMON_BEGIN(emulation_assist_common)
REST_NVGPRS(r1) /* instruction emulation may change GPRs */
b interrupt_return
GEN_KVM emulation_assist
/**
* Interrupt 0xe60 - Hypervisor Maintenance Interrupt (HMI).
@ -2210,16 +2085,12 @@ EXC_COMMON_BEGIN(hmi_exception_early_common)
EXCEPTION_RESTORE_REGS hsrr=1
GEN_INT_ENTRY hmi_exception, virt=0
GEN_KVM hmi_exception_early
EXC_COMMON_BEGIN(hmi_exception_common)
GEN_COMMON hmi_exception
addi r3,r1,STACK_FRAME_OVERHEAD
bl handle_hmi_exception
b interrupt_return
GEN_KVM hmi_exception
/**
* Interrupt 0xe80 - Directed Hypervisor Doorbell Interrupt.
@ -2250,8 +2121,6 @@ EXC_COMMON_BEGIN(h_doorbell_common)
#endif
b interrupt_return
GEN_KVM h_doorbell
/**
* Interrupt 0xea0 - Hypervisor Virtualization Interrupt.
@ -2278,8 +2147,6 @@ EXC_COMMON_BEGIN(h_virt_irq_common)
bl do_IRQ
b interrupt_return
GEN_KVM h_virt_irq
EXC_REAL_NONE(0xec0, 0x20)
EXC_VIRT_NONE(0x4ec0, 0x20)
@ -2323,8 +2190,6 @@ EXC_COMMON_BEGIN(performance_monitor_common)
bl performance_monitor_exception
b interrupt_return
GEN_KVM performance_monitor
/**
* Interrupt 0xf20 - Vector Unavailable Interrupt.
@ -2374,8 +2239,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
bl altivec_unavailable_exception
b interrupt_return
GEN_KVM altivec_unavailable
/**
* Interrupt 0xf40 - VSX Unavailable Interrupt.
@ -2424,8 +2287,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
bl vsx_unavailable_exception
b interrupt_return
GEN_KVM vsx_unavailable
/**
* Interrupt 0xf60 - Facility Unavailable Interrupt.
@ -2454,8 +2315,6 @@ EXC_COMMON_BEGIN(facility_unavailable_common)
REST_NVGPRS(r1) /* instruction emulation may change GPRs */
b interrupt_return
GEN_KVM facility_unavailable
/**
* Interrupt 0xf60 - Hypervisor Facility Unavailable Interrupt.
@ -2484,8 +2343,6 @@ EXC_COMMON_BEGIN(h_facility_unavailable_common)
REST_NVGPRS(r1) /* XXX Shouldn't be necessary in practice */
b interrupt_return
GEN_KVM h_facility_unavailable
EXC_REAL_NONE(0xfa0, 0x20)
EXC_VIRT_NONE(0x4fa0, 0x20)
@ -2515,8 +2372,6 @@ EXC_COMMON_BEGIN(cbe_system_error_common)
bl cbe_system_error_exception
b interrupt_return
GEN_KVM cbe_system_error
#else /* CONFIG_CBE_RAS */
EXC_REAL_NONE(0x1200, 0x100)
EXC_VIRT_NONE(0x5200, 0x100)
@ -2548,8 +2403,6 @@ EXC_COMMON_BEGIN(instruction_breakpoint_common)
bl instruction_breakpoint_exception
b interrupt_return
GEN_KVM instruction_breakpoint
EXC_REAL_NONE(0x1400, 0x100)
EXC_VIRT_NONE(0x5400, 0x100)
@ -2670,8 +2523,6 @@ EXC_COMMON_BEGIN(denorm_exception_common)
bl unknown_exception
b interrupt_return
GEN_KVM denorm_exception
#ifdef CONFIG_CBE_RAS
INT_DEFINE_BEGIN(cbe_maintenance)
@ -2689,8 +2540,6 @@ EXC_COMMON_BEGIN(cbe_maintenance_common)
bl cbe_maintenance_exception
b interrupt_return
GEN_KVM cbe_maintenance
#else /* CONFIG_CBE_RAS */
EXC_REAL_NONE(0x1600, 0x100)
EXC_VIRT_NONE(0x5600, 0x100)
@ -2721,8 +2570,6 @@ EXC_COMMON_BEGIN(altivec_assist_common)
#endif
b interrupt_return
GEN_KVM altivec_assist
#ifdef CONFIG_CBE_RAS
INT_DEFINE_BEGIN(cbe_thermal)
@ -2740,8 +2587,6 @@ EXC_COMMON_BEGIN(cbe_thermal_common)
bl cbe_thermal_exception
b interrupt_return
GEN_KVM cbe_thermal
#else /* CONFIG_CBE_RAS */
EXC_REAL_NONE(0x1800, 0x100)
EXC_VIRT_NONE(0x5800, 0x100)
@ -2994,6 +2839,15 @@ TRAMP_REAL_BEGIN(rfscv_flush_fallback)
USE_TEXT_SECTION()
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
kvm_interrupt:
/*
* The conditional branch in KVMTEST can't reach all the way,
* make a stub.
*/
b kvmppc_interrupt
#endif
_GLOBAL(do_uaccess_flush)
UACCESS_FLUSH_FIXUP_SECTION
nop
@ -3009,32 +2863,6 @@ EXPORT_SYMBOL(do_uaccess_flush)
MASKED_INTERRUPT
MASKED_INTERRUPT hsrr=1
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
kvmppc_skip_interrupt:
/*
* Here all GPRs are unchanged from when the interrupt happened
* except for r13, which is saved in SPRG_SCRATCH0.
*/
mfspr r13, SPRN_SRR0
addi r13, r13, 4
mtspr SPRN_SRR0, r13
GET_SCRATCH0(r13)
RFI_TO_KERNEL
b .
kvmppc_skip_Hinterrupt:
/*
* Here all GPRs are unchanged from when the interrupt happened
* except for r13, which is saved in SPRG_SCRATCH0.
*/
mfspr r13, SPRN_HSRR0
addi r13, r13, 4
mtspr SPRN_HSRR0, r13
GET_SCRATCH0(r13)
HRFI_TO_KERNEL
b .
#endif
/*
* Relocation-on interrupts: A subset of the interrupts can be delivered
* with IR=1/DR=1, if AIL==2 and MSR.HV won't be changed by delivering

View file

@ -34,9 +34,6 @@ notrace long system_call_exception(long r3, long r4, long r5,
syscall_fn f;
kuep_lock();
#ifdef CONFIG_PPC32
kuap_save_and_lock(regs);
#endif
regs->orig_gpr3 = r3;
@ -427,6 +424,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned
/* Restore user access locks last */
kuap_user_restore(regs);
kuep_unlock();
return ret;
}

View file

@ -356,13 +356,16 @@ static void __init setup_legacy_serial_console(int console)
static int __init ioremap_legacy_serial_console(void)
{
struct legacy_serial_info *info = &legacy_serial_infos[legacy_serial_console];
struct plat_serial8250_port *port = &legacy_serial_ports[legacy_serial_console];
struct plat_serial8250_port *port;
struct legacy_serial_info *info;
void __iomem *vaddr;
if (legacy_serial_console < 0)
return 0;
info = &legacy_serial_infos[legacy_serial_console];
port = &legacy_serial_ports[legacy_serial_console];
if (!info->early_addr)
return 0;

View file

@ -432,16 +432,19 @@ device_initcall(stf_barrier_debugfs_init);
static void update_branch_cache_flush(void)
{
u32 *site;
u32 *site, __maybe_unused *site2;
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
site = &patch__call_kvm_flush_link_stack;
site2 = &patch__call_kvm_flush_link_stack_p9;
// This controls the branch from guest_exit_cont to kvm_flush_link_stack
if (link_stack_flush_type == BRANCH_CACHE_FLUSH_NONE) {
patch_instruction_site(site, ppc_inst(PPC_INST_NOP));
patch_instruction_site(site2, ppc_inst(PPC_INST_NOP));
} else {
// Could use HW flush, but that could also flush count cache
patch_branch_site(site, (u64)&kvm_flush_link_stack, BRANCH_SET_LINK);
patch_branch_site(site2, (u64)&kvm_flush_link_stack, BRANCH_SET_LINK);
}
#endif

View file

@ -166,9 +166,9 @@ copy_ckfpr_from_user(struct task_struct *task, void __user *from)
}
#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
#else
#define unsafe_copy_fpr_to_user(to, task, label) do { } while (0)
#define unsafe_copy_fpr_to_user(to, task, label) do { if (0) goto label;} while (0)
#define unsafe_copy_fpr_from_user(task, from, label) do { } while (0)
#define unsafe_copy_fpr_from_user(task, from, label) do { if (0) goto label;} while (0)
static inline unsigned long
copy_fpr_to_user(void __user *to, struct task_struct *task)

View file

@ -508,16 +508,6 @@ EXPORT_SYMBOL(profile_pc);
* 64-bit uses a byte in the PACA, 32-bit uses a per-cpu variable...
*/
#ifdef CONFIG_PPC64
static inline unsigned long test_irq_work_pending(void)
{
unsigned long x;
asm volatile("lbz %0,%1(13)"
: "=r" (x)
: "i" (offsetof(struct paca_struct, irq_work_pending)));
return x;
}
static inline void set_irq_work_pending_flag(void)
{
asm volatile("stb %0,%1(13)" : :

View file

@ -57,6 +57,7 @@ kvm-pr-y := \
book3s_32_mmu.o
kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
book3s_64_entry.o \
tm.o
ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
@ -86,6 +87,7 @@ kvm-book3s_64-builtin-tm-objs-$(CONFIG_PPC_TRANSACTIONAL_MEM) += \
ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
kvm-book3s_64-builtin-objs-$(CONFIG_KVM_BOOK3S_64_HANDLER) += \
book3s_hv_hmi.o \
book3s_hv_p9_entry.o \
book3s_hv_rmhandlers.o \
book3s_hv_rm_mmu.o \
book3s_hv_ras.o \

View file

@ -171,6 +171,12 @@ void kvmppc_core_queue_machine_check(struct kvm_vcpu *vcpu, ulong flags)
}
EXPORT_SYMBOL_GPL(kvmppc_core_queue_machine_check);
void kvmppc_core_queue_syscall(struct kvm_vcpu *vcpu)
{
kvmppc_inject_interrupt(vcpu, BOOK3S_INTERRUPT_SYSCALL, 0);
}
EXPORT_SYMBOL(kvmppc_core_queue_syscall);
void kvmppc_core_queue_program(struct kvm_vcpu *vcpu, ulong flags)
{
/* might as well deliver this straight away */
@ -1044,13 +1050,10 @@ static int kvmppc_book3s_init(void)
#ifdef CONFIG_KVM_XICS
#ifdef CONFIG_KVM_XIVE
if (xics_on_xive()) {
kvmppc_xive_init_module();
kvm_register_device_ops(&kvm_xive_ops, KVM_DEV_TYPE_XICS);
if (kvmppc_xive_native_supported()) {
kvmppc_xive_native_init_module();
if (kvmppc_xive_native_supported())
kvm_register_device_ops(&kvm_xive_native_ops,
KVM_DEV_TYPE_XIVE);
}
} else
#endif
kvm_register_device_ops(&kvm_xics_ops, KVM_DEV_TYPE_XICS);
@ -1060,12 +1063,6 @@ static int kvmppc_book3s_init(void)
static void kvmppc_book3s_exit(void)
{
#ifdef CONFIG_KVM_XICS
if (xics_on_xive()) {
kvmppc_xive_exit_module();
kvmppc_xive_native_exit_module();
}
#endif
#ifdef CONFIG_KVM_BOOK3S_32_HANDLER
kvmppc_book3s_exit_pr();
#endif

View file

@ -0,0 +1,416 @@
/* SPDX-License-Identifier: GPL-2.0-only */
#include <asm/asm-offsets.h>
#include <asm/cache.h>
#include <asm/code-patching-asm.h>
#include <asm/exception-64s.h>
#include <asm/export.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_book3s_asm.h>
#include <asm/mmu.h>
#include <asm/ppc_asm.h>
#include <asm/ptrace.h>
#include <asm/reg.h>
#include <asm/ultravisor-api.h>
/*
* These are branched to from interrupt handlers in exception-64s.S which set
* IKVM_REAL or IKVM_VIRT, if HSTATE_IN_GUEST was found to be non-zero.
*/
/*
* This is a hcall, so register convention is as
* Documentation/powerpc/papr_hcalls.rst.
*
* This may also be a syscall from PR-KVM userspace that is to be
* reflected to the PR guest kernel, so registers may be set up for
* a system call rather than hcall. We don't currently clobber
* anything here, but the 0xc00 handler has already clobbered CTR
* and CR0, so PR-KVM can not support a guest kernel that preserves
* those registers across its system calls.
*
* The state of registers is as kvmppc_interrupt, except CFAR is not
* saved, R13 is not in SCRATCH0, and R10 does not contain the trap.
*/
.global kvmppc_hcall
.balign IFETCH_ALIGN_BYTES
kvmppc_hcall:
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
lbz r10,HSTATE_IN_GUEST(r13)
cmpwi r10,KVM_GUEST_MODE_HV_P9
beq kvmppc_p9_exit_hcall
#endif
ld r10,PACA_EXGEN+EX_R13(r13)
SET_SCRATCH0(r10)
li r10,0xc00
/* Now we look like kvmppc_interrupt */
li r11,PACA_EXGEN
b .Lgot_save_area
/*
* KVM interrupt entry occurs after GEN_INT_ENTRY runs, and follows that
* call convention:
*
* guest R9-R13, CTR, CFAR, PPR saved in PACA EX_xxx save area
* guest (H)DAR, (H)DSISR are also in the save area for relevant interrupts
* guest R13 also saved in SCRATCH0
* R13 = PACA
* R11 = (H)SRR0
* R12 = (H)SRR1
* R9 = guest CR
* PPR is set to medium
*
* With the addition for KVM:
* R10 = trap vector
*/
.global kvmppc_interrupt
.balign IFETCH_ALIGN_BYTES
kvmppc_interrupt:
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
std r10,HSTATE_SCRATCH0(r13)
lbz r10,HSTATE_IN_GUEST(r13)
cmpwi r10,KVM_GUEST_MODE_HV_P9
beq kvmppc_p9_exit_interrupt
ld r10,HSTATE_SCRATCH0(r13)
#endif
li r11,PACA_EXGEN
cmpdi r10,0x200
bgt+ .Lgot_save_area
li r11,PACA_EXMC
beq .Lgot_save_area
li r11,PACA_EXNMI
.Lgot_save_area:
add r11,r11,r13
BEGIN_FTR_SECTION
ld r12,EX_CFAR(r11)
std r12,HSTATE_CFAR(r13)
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
ld r12,EX_CTR(r11)
mtctr r12
BEGIN_FTR_SECTION
ld r12,EX_PPR(r11)
std r12,HSTATE_PPR(r13)
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld r12,EX_R12(r11)
std r12,HSTATE_SCRATCH0(r13)
sldi r12,r9,32
or r12,r12,r10
ld r9,EX_R9(r11)
ld r10,EX_R10(r11)
ld r11,EX_R11(r11)
/*
* Hcalls and other interrupts come here after normalising register
* contents and save locations:
*
* R12 = (guest CR << 32) | interrupt vector
* R13 = PACA
* guest R12 saved in shadow HSTATE_SCRATCH0
* guest R13 saved in SPRN_SCRATCH0
*/
std r9,HSTATE_SCRATCH2(r13)
lbz r9,HSTATE_IN_GUEST(r13)
cmpwi r9,KVM_GUEST_MODE_SKIP
beq- .Lmaybe_skip
.Lno_skip:
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
cmpwi r9,KVM_GUEST_MODE_GUEST
beq kvmppc_interrupt_pr
#endif
b kvmppc_interrupt_hv
#else
b kvmppc_interrupt_pr
#endif
/*
* "Skip" interrupts are part of a trick KVM uses a with hash guests to load
* the faulting instruction in guest memory from the the hypervisor without
* walking page tables.
*
* When the guest takes a fault that requires the hypervisor to load the
* instruction (e.g., MMIO emulation), KVM is running in real-mode with HV=1
* and the guest MMU context loaded. It sets KVM_GUEST_MODE_SKIP, and sets
* MSR[DR]=1 while leaving MSR[IR]=0, so it continues to fetch HV instructions
* but loads and stores will access the guest context. This is used to load
* the faulting instruction using the faulting guest effective address.
*
* However the guest context may not be able to translate, or it may cause a
* machine check or other issue, which results in a fault in the host
* (even with KVM-HV).
*
* These faults come here because KVM_GUEST_MODE_SKIP was set, so if they
* are (or are likely) caused by that load, the instruction is skipped by
* just returning with the PC advanced +4, where it is noticed the load did
* not execute and it goes to the slow path which walks the page tables to
* read guest memory.
*/
.Lmaybe_skip:
cmpwi r12,BOOK3S_INTERRUPT_MACHINE_CHECK
beq 1f
cmpwi r12,BOOK3S_INTERRUPT_DATA_STORAGE
beq 1f
cmpwi r12,BOOK3S_INTERRUPT_DATA_SEGMENT
beq 1f
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
/* HSRR interrupts get 2 added to interrupt number */
cmpwi r12,BOOK3S_INTERRUPT_H_DATA_STORAGE | 0x2
beq 2f
#endif
b .Lno_skip
1: mfspr r9,SPRN_SRR0
addi r9,r9,4
mtspr SPRN_SRR0,r9
ld r12,HSTATE_SCRATCH0(r13)
ld r9,HSTATE_SCRATCH2(r13)
GET_SCRATCH0(r13)
RFI_TO_KERNEL
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
2: mfspr r9,SPRN_HSRR0
addi r9,r9,4
mtspr SPRN_HSRR0,r9
ld r12,HSTATE_SCRATCH0(r13)
ld r9,HSTATE_SCRATCH2(r13)
GET_SCRATCH0(r13)
HRFI_TO_KERNEL
#endif
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
/* Stack frame offsets for kvmppc_p9_enter_guest */
#define SFS (144 + STACK_FRAME_MIN_SIZE)
#define STACK_SLOT_NVGPRS (SFS - 144) /* 18 gprs */
/*
* void kvmppc_p9_enter_guest(struct vcpu *vcpu);
*
* Enter the guest on a ISAv3.0 or later system.
*/
.balign IFETCH_ALIGN_BYTES
_GLOBAL(kvmppc_p9_enter_guest)
EXPORT_SYMBOL_GPL(kvmppc_p9_enter_guest)
mflr r0
std r0,PPC_LR_STKOFF(r1)
stdu r1,-SFS(r1)
std r1,HSTATE_HOST_R1(r13)
mfcr r4
stw r4,SFS+8(r1)
reg = 14
.rept 18
std reg,STACK_SLOT_NVGPRS + ((reg - 14) * 8)(r1)
reg = reg + 1
.endr
ld r4,VCPU_LR(r3)
mtlr r4
ld r4,VCPU_CTR(r3)
mtctr r4
ld r4,VCPU_XER(r3)
mtspr SPRN_XER,r4
ld r1,VCPU_CR(r3)
BEGIN_FTR_SECTION
ld r4,VCPU_CFAR(r3)
mtspr SPRN_CFAR,r4
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
BEGIN_FTR_SECTION
ld r4,VCPU_PPR(r3)
mtspr SPRN_PPR,r4
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
reg = 4
.rept 28
ld reg,__VCPU_GPR(reg)(r3)
reg = reg + 1
.endr
ld r4,VCPU_KVM(r3)
lbz r4,KVM_SECURE_GUEST(r4)
cmpdi r4,0
ld r4,VCPU_GPR(R4)(r3)
bne .Lret_to_ultra
mtcr r1
ld r0,VCPU_GPR(R0)(r3)
ld r1,VCPU_GPR(R1)(r3)
ld r2,VCPU_GPR(R2)(r3)
ld r3,VCPU_GPR(R3)(r3)
HRFI_TO_GUEST
b .
/*
* Use UV_RETURN ultracall to return control back to the Ultravisor
* after processing an hypercall or interrupt that was forwarded
* (a.k.a. reflected) to the Hypervisor.
*
* All registers have already been reloaded except the ucall requires:
* R0 = hcall result
* R2 = SRR1, so UV can detect a synthesized interrupt (if any)
* R3 = UV_RETURN
*/
.Lret_to_ultra:
mtcr r1
ld r1,VCPU_GPR(R1)(r3)
ld r0,VCPU_GPR(R3)(r3)
mfspr r2,SPRN_SRR1
LOAD_REG_IMMEDIATE(r3, UV_RETURN)
sc 2
/*
* kvmppc_p9_exit_hcall and kvmppc_p9_exit_interrupt are branched to from
* above if the interrupt was taken for a guest that was entered via
* kvmppc_p9_enter_guest().
*
* The exit code recovers the host stack and vcpu pointer, saves all guest GPRs
* and CR, LR, XER as well as guest MSR and NIA into the VCPU, then re-
* establishes the host stack and registers to return from the
* kvmppc_p9_enter_guest() function, which saves CTR and other guest registers
* (SPRs and FP, VEC, etc).
*/
.balign IFETCH_ALIGN_BYTES
kvmppc_p9_exit_hcall:
mfspr r11,SPRN_SRR0
mfspr r12,SPRN_SRR1
li r10,0xc00
std r10,HSTATE_SCRATCH0(r13)
.balign IFETCH_ALIGN_BYTES
kvmppc_p9_exit_interrupt:
/*
* If set to KVM_GUEST_MODE_HV_P9 but we're still in the
* hypervisor, that means we can't return from the entry stack.
*/
rldicl. r10,r12,64-MSR_HV_LG,63
bne- kvmppc_p9_bad_interrupt
std r1,HSTATE_SCRATCH1(r13)
std r3,HSTATE_SCRATCH2(r13)
ld r1,HSTATE_HOST_R1(r13)
ld r3,HSTATE_KVM_VCPU(r13)
std r9,VCPU_CR(r3)
1:
std r11,VCPU_PC(r3)
std r12,VCPU_MSR(r3)
reg = 14
.rept 18
std reg,__VCPU_GPR(reg)(r3)
reg = reg + 1
.endr
/* r1, r3, r9-r13 are saved to vcpu by C code */
std r0,VCPU_GPR(R0)(r3)
std r2,VCPU_GPR(R2)(r3)
reg = 4
.rept 5
std reg,__VCPU_GPR(reg)(r3)
reg = reg + 1
.endr
ld r2,PACATOC(r13)
mflr r4
std r4,VCPU_LR(r3)
mfspr r4,SPRN_XER
std r4,VCPU_XER(r3)
reg = 14
.rept 18
ld reg,STACK_SLOT_NVGPRS + ((reg - 14) * 8)(r1)
reg = reg + 1
.endr
lwz r4,SFS+8(r1)
mtcr r4
/*
* Flush the link stack here, before executing the first blr on the
* way out of the guest.
*
* The link stack won't match coming out of the guest anyway so the
* only cost is the flush itself. The call clobbers r0.
*/
1: nop
patch_site 1b patch__call_kvm_flush_link_stack_p9
addi r1,r1,SFS
ld r0,PPC_LR_STKOFF(r1)
mtlr r0
blr
/*
* Took an interrupt somewhere right before HRFID to guest, so registers are
* in a bad way. Return things hopefully enough to run host virtual code and
* run the Linux interrupt handler (SRESET or MCE) to print something useful.
*
* We could be really clever and save all host registers in known locations
* before setting HSTATE_IN_GUEST, then restoring them all here, and setting
* return address to a fixup that sets them up again. But that's a lot of
* effort for a small bit of code. Lots of other things to do first.
*/
kvmppc_p9_bad_interrupt:
BEGIN_MMU_FTR_SECTION
/*
* Hash host doesn't try to recover MMU (requires host SLB reload)
*/
b .
END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
/*
* Clean up guest registers to give host a chance to run.
*/
li r10,0
mtspr SPRN_AMR,r10
mtspr SPRN_IAMR,r10
mtspr SPRN_CIABR,r10
mtspr SPRN_DAWRX0,r10
BEGIN_FTR_SECTION
mtspr SPRN_DAWRX1,r10
END_FTR_SECTION_IFSET(CPU_FTR_DAWR1)
mtspr SPRN_PID,r10
/*
* Switch to host MMU mode
*/
ld r10, HSTATE_KVM_VCPU(r13)
ld r10, VCPU_KVM(r10)
lwz r10, KVM_HOST_LPID(r10)
mtspr SPRN_LPID,r10
ld r10, HSTATE_KVM_VCPU(r13)
ld r10, VCPU_KVM(r10)
ld r10, KVM_HOST_LPCR(r10)
mtspr SPRN_LPCR,r10
/*
* Set GUEST_MODE_NONE so the handler won't branch to KVM, and clear
* MSR_RI in r12 ([H]SRR1) so the handler won't try to return.
*/
li r10,KVM_GUEST_MODE_NONE
stb r10,HSTATE_IN_GUEST(r13)
li r10,MSR_RI
andc r12,r12,r10
/*
* Go back to interrupt handler. MCE and SRESET have their specific
* PACA save area so they should be used directly. They set up their
* own stack. The other handlers all use EXGEN. They will use the
* guest r1 if it looks like a kernel stack, so just load the
* emergency stack and go to program check for all other interrupts.
*/
ld r10,HSTATE_SCRATCH0(r13)
cmpwi r10,BOOK3S_INTERRUPT_MACHINE_CHECK
beq machine_check_common
cmpwi r10,BOOK3S_INTERRUPT_SYSTEM_RESET
beq system_reset_common
b .
#endif

View file

@ -840,7 +840,7 @@ bool kvm_unmap_gfn_range_hv(struct kvm *kvm, struct kvm_gfn_range *range)
kvm_unmap_radix(kvm, range->slot, gfn);
} else {
for (gfn = range->start; gfn < range->end; gfn++)
kvm_unmap_rmapp(kvm, range->slot, range->start);
kvm_unmap_rmapp(kvm, range->slot, gfn);
}
return false;

View file

@ -21,6 +21,7 @@
#include <asm/pte-walk.h>
#include <asm/ultravisor.h>
#include <asm/kvm_book3s_uvmem.h>
#include <asm/plpar_wrappers.h>
/*
* Supported radix tree geometry.
@ -318,9 +319,19 @@ void kvmppc_radix_tlbie_page(struct kvm *kvm, unsigned long addr,
}
psi = shift_to_mmu_psize(pshift);
rb = addr | (mmu_get_ap(psi) << PPC_BITLSHIFT(58));
rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(0, 0, 1),
lpid, rb);
if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE)) {
rb = addr | (mmu_get_ap(psi) << PPC_BITLSHIFT(58));
rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(0, 0, 1),
lpid, rb);
} else {
rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
H_RPTI_TYPE_NESTED |
H_RPTI_TYPE_TLB,
psize_to_rpti_pgsize(psi),
addr, addr + psize);
}
if (rc)
pr_err("KVM: TLB page invalidation hcall failed, rc=%ld\n", rc);
}
@ -334,8 +345,14 @@ static void kvmppc_radix_flush_pwc(struct kvm *kvm, unsigned int lpid)
return;
}
rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(1, 0, 1),
lpid, TLBIEL_INVAL_SET_LPID);
if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE))
rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(1, 0, 1),
lpid, TLBIEL_INVAL_SET_LPID);
else
rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
H_RPTI_TYPE_NESTED |
H_RPTI_TYPE_PWC, H_RPTI_PAGE_ALL,
0, -1UL);
if (rc)
pr_err("KVM: TLB PWC invalidation hcall failed, rc=%ld\n", rc);
}

View file

@ -391,10 +391,6 @@ long kvmppc_rm_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
/* udbg_printf("H_PUT_TCE(): liobn=0x%lx ioba=0x%lx, tce=0x%lx\n", */
/* liobn, ioba, tce); */
/* For radix, we might be in virtual mode, so punt */
if (kvm_is_radix(vcpu->kvm))
return H_TOO_HARD;
stt = kvmppc_find_table(vcpu->kvm, liobn);
if (!stt)
return H_TOO_HARD;
@ -489,10 +485,6 @@ long kvmppc_rm_h_put_tce_indirect(struct kvm_vcpu *vcpu,
bool prereg = false;
struct kvmppc_spapr_tce_iommu_table *stit;
/* For radix, we might be in virtual mode, so punt */
if (kvm_is_radix(vcpu->kvm))
return H_TOO_HARD;
/*
* used to check for invalidations in progress
*/
@ -602,10 +594,6 @@ long kvmppc_rm_h_stuff_tce(struct kvm_vcpu *vcpu,
long i, ret;
struct kvmppc_spapr_tce_iommu_table *stit;
/* For radix, we might be in virtual mode, so punt */
if (kvm_is_radix(vcpu->kvm))
return H_TOO_HARD;
stt = kvmppc_find_table(vcpu->kvm, liobn);
if (!stt)
return H_TOO_HARD;

File diff suppressed because it is too large Load diff

View file

@ -34,21 +34,6 @@
#include "book3s_xics.h"
#include "book3s_xive.h"
/*
* The XIVE module will populate these when it loads
*/
unsigned long (*__xive_vm_h_xirr)(struct kvm_vcpu *vcpu);
unsigned long (*__xive_vm_h_ipoll)(struct kvm_vcpu *vcpu, unsigned long server);
int (*__xive_vm_h_ipi)(struct kvm_vcpu *vcpu, unsigned long server,
unsigned long mfrr);
int (*__xive_vm_h_cppr)(struct kvm_vcpu *vcpu, unsigned long cppr);
int (*__xive_vm_h_eoi)(struct kvm_vcpu *vcpu, unsigned long xirr);
EXPORT_SYMBOL_GPL(__xive_vm_h_xirr);
EXPORT_SYMBOL_GPL(__xive_vm_h_ipoll);
EXPORT_SYMBOL_GPL(__xive_vm_h_ipi);
EXPORT_SYMBOL_GPL(__xive_vm_h_cppr);
EXPORT_SYMBOL_GPL(__xive_vm_h_eoi);
/*
* Hash page table alignment on newer cpus(CPU_FTR_ARCH_206)
* should be power of 2.
@ -196,16 +181,9 @@ int kvmppc_hwrng_present(void)
}
EXPORT_SYMBOL_GPL(kvmppc_hwrng_present);
long kvmppc_h_random(struct kvm_vcpu *vcpu)
long kvmppc_rm_h_random(struct kvm_vcpu *vcpu)
{
int r;
/* Only need to do the expensive mfmsr() on radix */
if (kvm_is_radix(vcpu->kvm) && (mfmsr() & MSR_IR))
r = powernv_get_random_long(&vcpu->arch.regs.gpr[4]);
else
r = powernv_get_random_real_mode(&vcpu->arch.regs.gpr[4]);
if (r)
if (powernv_get_random_real_mode(&vcpu->arch.regs.gpr[4]))
return H_SUCCESS;
return H_HARDWARE;
@ -221,15 +199,6 @@ void kvmhv_rm_send_ipi(int cpu)
void __iomem *xics_phys;
unsigned long msg = PPC_DBELL_TYPE(PPC_DBELL_SERVER);
/* For a nested hypervisor, use the XICS via hcall */
if (kvmhv_on_pseries()) {
unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
plpar_hcall_raw(H_IPI, retbuf, get_hard_smp_processor_id(cpu),
IPI_PRIORITY);
return;
}
/* On POWER9 we can use msgsnd for any destination cpu. */
if (cpu_has_feature(CPU_FTR_ARCH_300)) {
msg |= get_hard_smp_processor_id(cpu);
@ -442,19 +411,12 @@ static long kvmppc_read_one_intr(bool *again)
return 1;
/* Now read the interrupt from the ICP */
if (kvmhv_on_pseries()) {
unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
rc = plpar_hcall_raw(H_XIRR, retbuf, 0xFF);
xirr = cpu_to_be32(retbuf[0]);
} else {
xics_phys = local_paca->kvm_hstate.xics_phys;
rc = 0;
if (!xics_phys)
rc = opal_int_get_xirr(&xirr, false);
else
xirr = __raw_rm_readl(xics_phys + XICS_XIRR);
}
xics_phys = local_paca->kvm_hstate.xics_phys;
rc = 0;
if (!xics_phys)
rc = opal_int_get_xirr(&xirr, false);
else
xirr = __raw_rm_readl(xics_phys + XICS_XIRR);
if (rc < 0)
return 1;
@ -483,13 +445,7 @@ static long kvmppc_read_one_intr(bool *again)
*/
if (xisr == XICS_IPI) {
rc = 0;
if (kvmhv_on_pseries()) {
unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
plpar_hcall_raw(H_IPI, retbuf,
hard_smp_processor_id(), 0xff);
plpar_hcall_raw(H_EOI, retbuf, h_xirr);
} else if (xics_phys) {
if (xics_phys) {
__raw_rm_writeb(0xff, xics_phys + XICS_MFRR);
__raw_rm_writel(xirr, xics_phys + XICS_XIRR);
} else {
@ -515,13 +471,7 @@ static long kvmppc_read_one_intr(bool *again)
/* We raced with the host,
* we need to resend that IPI, bummer
*/
if (kvmhv_on_pseries()) {
unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
plpar_hcall_raw(H_IPI, retbuf,
hard_smp_processor_id(),
IPI_PRIORITY);
} else if (xics_phys)
if (xics_phys)
__raw_rm_writeb(IPI_PRIORITY,
xics_phys + XICS_MFRR);
else
@ -541,22 +491,13 @@ static long kvmppc_read_one_intr(bool *again)
}
#ifdef CONFIG_KVM_XICS
static inline bool is_rm(void)
{
return !(mfmsr() & MSR_DR);
}
unsigned long kvmppc_rm_h_xirr(struct kvm_vcpu *vcpu)
{
if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD;
if (xics_on_xive()) {
if (is_rm())
return xive_rm_h_xirr(vcpu);
if (unlikely(!__xive_vm_h_xirr))
return H_NOT_AVAILABLE;
return __xive_vm_h_xirr(vcpu);
} else
if (xics_on_xive())
return xive_rm_h_xirr(vcpu);
else
return xics_rm_h_xirr(vcpu);
}
@ -565,13 +506,9 @@ unsigned long kvmppc_rm_h_xirr_x(struct kvm_vcpu *vcpu)
if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD;
vcpu->arch.regs.gpr[5] = get_tb();
if (xics_on_xive()) {
if (is_rm())
return xive_rm_h_xirr(vcpu);
if (unlikely(!__xive_vm_h_xirr))
return H_NOT_AVAILABLE;
return __xive_vm_h_xirr(vcpu);
} else
if (xics_on_xive())
return xive_rm_h_xirr(vcpu);
else
return xics_rm_h_xirr(vcpu);
}
@ -579,13 +516,9 @@ unsigned long kvmppc_rm_h_ipoll(struct kvm_vcpu *vcpu, unsigned long server)
{
if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD;
if (xics_on_xive()) {
if (is_rm())
return xive_rm_h_ipoll(vcpu, server);
if (unlikely(!__xive_vm_h_ipoll))
return H_NOT_AVAILABLE;
return __xive_vm_h_ipoll(vcpu, server);
} else
if (xics_on_xive())
return xive_rm_h_ipoll(vcpu, server);
else
return H_TOO_HARD;
}
@ -594,13 +527,9 @@ int kvmppc_rm_h_ipi(struct kvm_vcpu *vcpu, unsigned long server,
{
if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD;
if (xics_on_xive()) {
if (is_rm())
return xive_rm_h_ipi(vcpu, server, mfrr);
if (unlikely(!__xive_vm_h_ipi))
return H_NOT_AVAILABLE;
return __xive_vm_h_ipi(vcpu, server, mfrr);
} else
if (xics_on_xive())
return xive_rm_h_ipi(vcpu, server, mfrr);
else
return xics_rm_h_ipi(vcpu, server, mfrr);
}
@ -608,13 +537,9 @@ int kvmppc_rm_h_cppr(struct kvm_vcpu *vcpu, unsigned long cppr)
{
if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD;
if (xics_on_xive()) {
if (is_rm())
return xive_rm_h_cppr(vcpu, cppr);
if (unlikely(!__xive_vm_h_cppr))
return H_NOT_AVAILABLE;
return __xive_vm_h_cppr(vcpu, cppr);
} else
if (xics_on_xive())
return xive_rm_h_cppr(vcpu, cppr);
else
return xics_rm_h_cppr(vcpu, cppr);
}
@ -622,13 +547,9 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr)
{
if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD;
if (xics_on_xive()) {
if (is_rm())
return xive_rm_h_eoi(vcpu, xirr);
if (unlikely(!__xive_vm_h_eoi))
return H_NOT_AVAILABLE;
return __xive_vm_h_eoi(vcpu, xirr);
} else
if (xics_on_xive())
return xive_rm_h_eoi(vcpu, xirr);
else
return xics_rm_h_eoi(vcpu, xirr);
}
#endif /* CONFIG_KVM_XICS */
@ -800,7 +721,7 @@ void kvmppc_check_need_tlb_flush(struct kvm *kvm, int pcpu,
* Thus we make all 4 threads use the same bit.
*/
if (cpu_has_feature(CPU_FTR_ARCH_300))
pcpu = cpu_first_thread_sibling(pcpu);
pcpu = cpu_first_tlb_thread_sibling(pcpu);
if (nested)
need_tlb_flush = &nested->need_tlb_flush;

View file

@ -58,7 +58,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
/*
* Put whatever is in the decrementer into the
* hypervisor decrementer.
* Because of a hardware deviation in P8 and P9,
* Because of a hardware deviation in P8,
* we need to set LPCR[HDICE] before writing HDEC.
*/
ld r5, HSTATE_KVM_VCORE(r13)
@ -67,15 +67,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
ori r8, r9, LPCR_HDICE
mtspr SPRN_LPCR, r8
isync
andis. r0, r9, LPCR_LD@h
mfspr r8,SPRN_DEC
mftb r7
BEGIN_FTR_SECTION
/* On POWER9, don't sign-extend if host LPCR[LD] bit is set */
bne 32f
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
extsw r8,r8
32: mtspr SPRN_HDEC,r8
mtspr SPRN_HDEC,r8
add r8,r8,r7
std r8,HSTATE_DECEXP(r13)

View file

@ -19,6 +19,7 @@
#include <asm/pgalloc.h>
#include <asm/pte-walk.h>
#include <asm/reg.h>
#include <asm/plpar_wrappers.h>
static struct patb_entry *pseries_partition_tb;
@ -53,7 +54,8 @@ void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
hr->dawrx1 = vcpu->arch.dawrx1;
}
static void byteswap_pt_regs(struct pt_regs *regs)
/* Use noinline_for_stack due to https://bugs.llvm.org/show_bug.cgi?id=49610 */
static noinline_for_stack void byteswap_pt_regs(struct pt_regs *regs)
{
unsigned long *addr = (unsigned long *) regs;
@ -467,8 +469,15 @@ static void kvmhv_flush_lpid(unsigned int lpid)
return;
}
rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(2, 0, 1),
lpid, TLBIEL_INVAL_SET_LPID);
if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE))
rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(2, 0, 1),
lpid, TLBIEL_INVAL_SET_LPID);
else
rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
H_RPTI_TYPE_NESTED |
H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC |
H_RPTI_TYPE_PAT,
H_RPTI_PAGE_ALL, 0, -1UL);
if (rc)
pr_err("KVM: TLB LPID invalidation hcall failed, rc=%ld\n", rc);
}
@ -1214,6 +1223,113 @@ long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu)
return H_SUCCESS;
}
static long do_tlb_invalidate_nested_all(struct kvm_vcpu *vcpu,
unsigned long lpid, unsigned long ric)
{
struct kvm *kvm = vcpu->kvm;
struct kvm_nested_guest *gp;
gp = kvmhv_get_nested(kvm, lpid, false);
if (gp) {
kvmhv_emulate_tlbie_lpid(vcpu, gp, ric);
kvmhv_put_nested(gp);
}
return H_SUCCESS;
}
/*
* Number of pages above which we invalidate the entire LPID rather than
* flush individual pages.
*/
static unsigned long tlb_range_flush_page_ceiling __read_mostly = 33;
static long do_tlb_invalidate_nested_tlb(struct kvm_vcpu *vcpu,
unsigned long lpid,
unsigned long pg_sizes,
unsigned long start,
unsigned long end)
{
int ret = H_P4;
unsigned long addr, nr_pages;
struct mmu_psize_def *def;
unsigned long psize, ap, page_size;
bool flush_lpid;
for (psize = 0; psize < MMU_PAGE_COUNT; psize++) {
def = &mmu_psize_defs[psize];
if (!(pg_sizes & def->h_rpt_pgsize))
continue;
nr_pages = (end - start) >> def->shift;
flush_lpid = nr_pages > tlb_range_flush_page_ceiling;
if (flush_lpid)
return do_tlb_invalidate_nested_all(vcpu, lpid,
RIC_FLUSH_TLB);
addr = start;
ap = mmu_get_ap(psize);
page_size = 1UL << def->shift;
do {
ret = kvmhv_emulate_tlbie_tlb_addr(vcpu, lpid, ap,
get_epn(addr));
if (ret)
return H_P4;
addr += page_size;
} while (addr < end);
}
return ret;
}
/*
* Performs partition-scoped invalidations for nested guests
* as part of H_RPT_INVALIDATE hcall.
*/
long do_h_rpt_invalidate_pat(struct kvm_vcpu *vcpu, unsigned long lpid,
unsigned long type, unsigned long pg_sizes,
unsigned long start, unsigned long end)
{
/*
* If L2 lpid isn't valid, we need to return H_PARAMETER.
*
* However, nested KVM issues a L2 lpid flush call when creating
* partition table entries for L2. This happens even before the
* corresponding shadow lpid is created in HV which happens in
* H_ENTER_NESTED call. Since we can't differentiate this case from
* the invalid case, we ignore such flush requests and return success.
*/
if (!kvmhv_find_nested(vcpu->kvm, lpid))
return H_SUCCESS;
/*
* A flush all request can be handled by a full lpid flush only.
*/
if ((type & H_RPTI_TYPE_NESTED_ALL) == H_RPTI_TYPE_NESTED_ALL)
return do_tlb_invalidate_nested_all(vcpu, lpid, RIC_FLUSH_ALL);
/*
* We don't need to handle a PWC flush like process table here,
* because intermediate partition scoped table in nested guest doesn't
* really have PWC. Only level we have PWC is in L0 and for nested
* invalidate at L0 we always do kvm_flush_lpid() which does
* radix__flush_all_lpid(). For range invalidate at any level, we
* are not removing the higher level page tables and hence there is
* no PWC invalidate needed.
*
* if (type & H_RPTI_TYPE_PWC) {
* ret = do_tlb_invalidate_nested_all(vcpu, lpid, RIC_FLUSH_PWC);
* if (ret)
* return H_P4;
* }
*/
if (start == 0 && end == -1)
return do_tlb_invalidate_nested_all(vcpu, lpid, RIC_FLUSH_TLB);
if (type & H_RPTI_TYPE_TLB)
return do_tlb_invalidate_nested_tlb(vcpu, lpid, pg_sizes,
start, end);
return H_SUCCESS;
}
/* Used to convert a nested guest real address to a L1 guest real address */
static int kvmhv_translate_addr_nested(struct kvm_vcpu *vcpu,
struct kvm_nested_guest *gp,

View file

@ -0,0 +1,508 @@
// SPDX-License-Identifier: GPL-2.0-only
#include <linux/kernel.h>
#include <linux/kvm_host.h>
#include <asm/asm-prototypes.h>
#include <asm/dbell.h>
#include <asm/kvm_ppc.h>
#include <asm/ppc-opcode.h>
#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
static void __start_timing(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator *next)
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
u64 tb = mftb() - vc->tb_offset_applied;
vcpu->arch.cur_activity = next;
vcpu->arch.cur_tb_start = tb;
}
static void __accumulate_time(struct kvm_vcpu *vcpu, struct kvmhv_tb_accumulator *next)
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
struct kvmhv_tb_accumulator *curr;
u64 tb = mftb() - vc->tb_offset_applied;
u64 prev_tb;
u64 delta;
u64 seq;
curr = vcpu->arch.cur_activity;
vcpu->arch.cur_activity = next;
prev_tb = vcpu->arch.cur_tb_start;
vcpu->arch.cur_tb_start = tb;
if (!curr)
return;
delta = tb - prev_tb;
seq = curr->seqcount;
curr->seqcount = seq + 1;
smp_wmb();
curr->tb_total += delta;
if (seq == 0 || delta < curr->tb_min)
curr->tb_min = delta;
if (delta > curr->tb_max)
curr->tb_max = delta;
smp_wmb();
curr->seqcount = seq + 2;
}
#define start_timing(vcpu, next) __start_timing(vcpu, next)
#define end_timing(vcpu) __start_timing(vcpu, NULL)
#define accumulate_time(vcpu, next) __accumulate_time(vcpu, next)
#else
#define start_timing(vcpu, next) do {} while (0)
#define end_timing(vcpu) do {} while (0)
#define accumulate_time(vcpu, next) do {} while (0)
#endif
static inline void mfslb(unsigned int idx, u64 *slbee, u64 *slbev)
{
asm volatile("slbmfev %0,%1" : "=r" (*slbev) : "r" (idx));
asm volatile("slbmfee %0,%1" : "=r" (*slbee) : "r" (idx));
}
static inline void mtslb(u64 slbee, u64 slbev)
{
asm volatile("slbmte %0,%1" :: "r" (slbev), "r" (slbee));
}
static inline void clear_slb_entry(unsigned int idx)
{
mtslb(idx, 0);
}
static inline void slb_clear_invalidate_partition(void)
{
clear_slb_entry(0);
asm volatile(PPC_SLBIA(6));
}
/*
* Malicious or buggy radix guests may have inserted SLB entries
* (only 0..3 because radix always runs with UPRT=1), so these must
* be cleared here to avoid side-channels. slbmte is used rather
* than slbia, as it won't clear cached translations.
*/
static void radix_clear_slb(void)
{
int i;
for (i = 0; i < 4; i++)
clear_slb_entry(i);
}
static void switch_mmu_to_guest_radix(struct kvm *kvm, struct kvm_vcpu *vcpu, u64 lpcr)
{
struct kvm_nested_guest *nested = vcpu->arch.nested;
u32 lpid;
lpid = nested ? nested->shadow_lpid : kvm->arch.lpid;
/*
* All the isync()s are overkill but trivially follow the ISA
* requirements. Some can likely be replaced with justification
* comment for why they are not needed.
*/
isync();
mtspr(SPRN_LPID, lpid);
isync();
mtspr(SPRN_LPCR, lpcr);
isync();
mtspr(SPRN_PID, vcpu->arch.pid);
isync();
}
static void switch_mmu_to_guest_hpt(struct kvm *kvm, struct kvm_vcpu *vcpu, u64 lpcr)
{
u32 lpid;
int i;
lpid = kvm->arch.lpid;
mtspr(SPRN_LPID, lpid);
mtspr(SPRN_LPCR, lpcr);
mtspr(SPRN_PID, vcpu->arch.pid);
for (i = 0; i < vcpu->arch.slb_max; i++)
mtslb(vcpu->arch.slb[i].orige, vcpu->arch.slb[i].origv);
isync();
}
static void switch_mmu_to_host(struct kvm *kvm, u32 pid)
{
isync();
mtspr(SPRN_PID, pid);
isync();
mtspr(SPRN_LPID, kvm->arch.host_lpid);
isync();
mtspr(SPRN_LPCR, kvm->arch.host_lpcr);
isync();
if (!radix_enabled())
slb_restore_bolted_realmode();
}
static void save_clear_host_mmu(struct kvm *kvm)
{
if (!radix_enabled()) {
/*
* Hash host could save and restore host SLB entries to
* reduce SLB fault overheads of VM exits, but for now the
* existing code clears all entries and restores just the
* bolted ones when switching back to host.
*/
slb_clear_invalidate_partition();
}
}
static void save_clear_guest_mmu(struct kvm *kvm, struct kvm_vcpu *vcpu)
{
if (kvm_is_radix(kvm)) {
radix_clear_slb();
} else {
int i;
int nr = 0;
/*
* This must run before switching to host (radix host can't
* access all SLBs).
*/
for (i = 0; i < vcpu->arch.slb_nr; i++) {
u64 slbee, slbev;
mfslb(i, &slbee, &slbev);
if (slbee & SLB_ESID_V) {
vcpu->arch.slb[nr].orige = slbee | i;
vcpu->arch.slb[nr].origv = slbev;
nr++;
}
}
vcpu->arch.slb_max = nr;
slb_clear_invalidate_partition();
}
}
int kvmhv_vcpu_entry_p9(struct kvm_vcpu *vcpu, u64 time_limit, unsigned long lpcr)
{
struct kvm *kvm = vcpu->kvm;
struct kvm_nested_guest *nested = vcpu->arch.nested;
struct kvmppc_vcore *vc = vcpu->arch.vcore;
s64 hdec;
u64 tb, purr, spurr;
u64 *exsave;
bool ri_set;
int trap;
unsigned long msr;
unsigned long host_hfscr;
unsigned long host_ciabr;
unsigned long host_dawr0;
unsigned long host_dawrx0;
unsigned long host_psscr;
unsigned long host_pidr;
unsigned long host_dawr1;
unsigned long host_dawrx1;
hdec = time_limit - mftb();
if (hdec < 0)
return BOOK3S_INTERRUPT_HV_DECREMENTER;
WARN_ON_ONCE(vcpu->arch.shregs.msr & MSR_HV);
WARN_ON_ONCE(!(vcpu->arch.shregs.msr & MSR_ME));
start_timing(vcpu, &vcpu->arch.rm_entry);
vcpu->arch.ceded = 0;
if (vc->tb_offset) {
u64 new_tb = mftb() + vc->tb_offset;
mtspr(SPRN_TBU40, new_tb);
tb = mftb();
if ((tb & 0xffffff) < (new_tb & 0xffffff))
mtspr(SPRN_TBU40, new_tb + 0x1000000);
vc->tb_offset_applied = vc->tb_offset;
}
msr = mfmsr();
host_hfscr = mfspr(SPRN_HFSCR);
host_ciabr = mfspr(SPRN_CIABR);
host_dawr0 = mfspr(SPRN_DAWR0);
host_dawrx0 = mfspr(SPRN_DAWRX0);
host_psscr = mfspr(SPRN_PSSCR);
host_pidr = mfspr(SPRN_PID);
if (cpu_has_feature(CPU_FTR_DAWR1)) {
host_dawr1 = mfspr(SPRN_DAWR1);
host_dawrx1 = mfspr(SPRN_DAWRX1);
}
if (vc->pcr)
mtspr(SPRN_PCR, vc->pcr | PCR_MASK);
mtspr(SPRN_DPDES, vc->dpdes);
mtspr(SPRN_VTB, vc->vtb);
local_paca->kvm_hstate.host_purr = mfspr(SPRN_PURR);
local_paca->kvm_hstate.host_spurr = mfspr(SPRN_SPURR);
mtspr(SPRN_PURR, vcpu->arch.purr);
mtspr(SPRN_SPURR, vcpu->arch.spurr);
if (dawr_enabled()) {
mtspr(SPRN_DAWR0, vcpu->arch.dawr0);
mtspr(SPRN_DAWRX0, vcpu->arch.dawrx0);
if (cpu_has_feature(CPU_FTR_DAWR1)) {
mtspr(SPRN_DAWR1, vcpu->arch.dawr1);
mtspr(SPRN_DAWRX1, vcpu->arch.dawrx1);
}
}
mtspr(SPRN_CIABR, vcpu->arch.ciabr);
mtspr(SPRN_IC, vcpu->arch.ic);
mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
(local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
mtspr(SPRN_HFSCR, vcpu->arch.hfscr);
mtspr(SPRN_HSRR0, vcpu->arch.regs.nip);
mtspr(SPRN_HSRR1, (vcpu->arch.shregs.msr & ~MSR_HV) | MSR_ME);
/*
* On POWER9 DD2.1 and below, sometimes on a Hypervisor Data Storage
* Interrupt (HDSI) the HDSISR is not be updated at all.
*
* To work around this we put a canary value into the HDSISR before
* returning to a guest and then check for this canary when we take a
* HDSI. If we find the canary on a HDSI, we know the hardware didn't
* update the HDSISR. In this case we return to the guest to retake the
* HDSI which should correctly update the HDSISR the second time HDSI
* entry.
*
* Just do this on all p9 processors for now.
*/
mtspr(SPRN_HDSISR, HDSISR_CANARY);
mtspr(SPRN_SPRG0, vcpu->arch.shregs.sprg0);
mtspr(SPRN_SPRG1, vcpu->arch.shregs.sprg1);
mtspr(SPRN_SPRG2, vcpu->arch.shregs.sprg2);
mtspr(SPRN_SPRG3, vcpu->arch.shregs.sprg3);
mtspr(SPRN_AMOR, ~0UL);
local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_HV_P9;
/*
* Hash host, hash guest, or radix guest with prefetch bug, all have
* to disable the MMU before switching to guest MMU state.
*/
if (!radix_enabled() || !kvm_is_radix(kvm) ||
cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG))
__mtmsrd(msr & ~(MSR_IR|MSR_DR|MSR_RI), 0);
save_clear_host_mmu(kvm);
if (kvm_is_radix(kvm)) {
switch_mmu_to_guest_radix(kvm, vcpu, lpcr);
if (!cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG))
__mtmsrd(0, 1); /* clear RI */
} else {
switch_mmu_to_guest_hpt(kvm, vcpu, lpcr);
}
/* TLBIEL uses LPID=LPIDR, so run this after setting guest LPID */
kvmppc_check_need_tlb_flush(kvm, vc->pcpu, nested);
/*
* P9 suppresses the HDEC exception when LPCR[HDICE] = 0,
* so set guest LPCR (with HDICE) before writing HDEC.
*/
mtspr(SPRN_HDEC, hdec);
mtspr(SPRN_DAR, vcpu->arch.shregs.dar);
mtspr(SPRN_DSISR, vcpu->arch.shregs.dsisr);
mtspr(SPRN_SRR0, vcpu->arch.shregs.srr0);
mtspr(SPRN_SRR1, vcpu->arch.shregs.srr1);
accumulate_time(vcpu, &vcpu->arch.guest_time);
kvmppc_p9_enter_guest(vcpu);
accumulate_time(vcpu, &vcpu->arch.rm_intr);
/* XXX: Could get these from r11/12 and paca exsave instead */
vcpu->arch.shregs.srr0 = mfspr(SPRN_SRR0);
vcpu->arch.shregs.srr1 = mfspr(SPRN_SRR1);
vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
/* 0x2 bit for HSRR is only used by PR and P7/8 HV paths, clear it */
trap = local_paca->kvm_hstate.scratch0 & ~0x2;
/* HSRR interrupts leave MSR[RI] unchanged, SRR interrupts clear it. */
ri_set = false;
if (likely(trap > BOOK3S_INTERRUPT_MACHINE_CHECK)) {
if (trap != BOOK3S_INTERRUPT_SYSCALL &&
(vcpu->arch.shregs.msr & MSR_RI))
ri_set = true;
exsave = local_paca->exgen;
} else if (trap == BOOK3S_INTERRUPT_SYSTEM_RESET) {
exsave = local_paca->exnmi;
} else { /* trap == 0x200 */
exsave = local_paca->exmc;
}
vcpu->arch.regs.gpr[1] = local_paca->kvm_hstate.scratch1;
vcpu->arch.regs.gpr[3] = local_paca->kvm_hstate.scratch2;
/*
* Only set RI after reading machine check regs (DAR, DSISR, SRR0/1)
* and hstate scratch (which we need to move into exsave to make
* re-entrant vs SRESET/MCE)
*/
if (ri_set) {
if (unlikely(!(mfmsr() & MSR_RI))) {
__mtmsrd(MSR_RI, 1);
WARN_ON_ONCE(1);
}
} else {
WARN_ON_ONCE(mfmsr() & MSR_RI);
__mtmsrd(MSR_RI, 1);
}
vcpu->arch.regs.gpr[9] = exsave[EX_R9/sizeof(u64)];
vcpu->arch.regs.gpr[10] = exsave[EX_R10/sizeof(u64)];
vcpu->arch.regs.gpr[11] = exsave[EX_R11/sizeof(u64)];
vcpu->arch.regs.gpr[12] = exsave[EX_R12/sizeof(u64)];
vcpu->arch.regs.gpr[13] = exsave[EX_R13/sizeof(u64)];
vcpu->arch.ppr = exsave[EX_PPR/sizeof(u64)];
vcpu->arch.cfar = exsave[EX_CFAR/sizeof(u64)];
vcpu->arch.regs.ctr = exsave[EX_CTR/sizeof(u64)];
vcpu->arch.last_inst = KVM_INST_FETCH_FAILED;
if (unlikely(trap == BOOK3S_INTERRUPT_MACHINE_CHECK)) {
vcpu->arch.fault_dar = exsave[EX_DAR/sizeof(u64)];
vcpu->arch.fault_dsisr = exsave[EX_DSISR/sizeof(u64)];
kvmppc_realmode_machine_check(vcpu);
} else if (unlikely(trap == BOOK3S_INTERRUPT_HMI)) {
kvmppc_realmode_hmi_handler();
} else if (trap == BOOK3S_INTERRUPT_H_EMUL_ASSIST) {
vcpu->arch.emul_inst = mfspr(SPRN_HEIR);
} else if (trap == BOOK3S_INTERRUPT_H_DATA_STORAGE) {
vcpu->arch.fault_dar = exsave[EX_DAR/sizeof(u64)];
vcpu->arch.fault_dsisr = exsave[EX_DSISR/sizeof(u64)];
vcpu->arch.fault_gpa = mfspr(SPRN_ASDR);
} else if (trap == BOOK3S_INTERRUPT_H_INST_STORAGE) {
vcpu->arch.fault_gpa = mfspr(SPRN_ASDR);
} else if (trap == BOOK3S_INTERRUPT_H_FAC_UNAVAIL) {
vcpu->arch.hfscr = mfspr(SPRN_HFSCR);
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
/*
* Softpatch interrupt for transactional memory emulation cases
* on POWER9 DD2.2. This is early in the guest exit path - we
* haven't saved registers or done a treclaim yet.
*/
} else if (trap == BOOK3S_INTERRUPT_HV_SOFTPATCH) {
vcpu->arch.emul_inst = mfspr(SPRN_HEIR);
/*
* The cases we want to handle here are those where the guest
* is in real suspend mode and is trying to transition to
* transactional mode.
*/
if (local_paca->kvm_hstate.fake_suspend &&
(vcpu->arch.shregs.msr & MSR_TS_S)) {
if (kvmhv_p9_tm_emulation_early(vcpu)) {
/* Prevent it being handled again. */
trap = 0;
}
}
#endif
}
accumulate_time(vcpu, &vcpu->arch.rm_exit);
/* Advance host PURR/SPURR by the amount used by guest */
purr = mfspr(SPRN_PURR);
spurr = mfspr(SPRN_SPURR);
mtspr(SPRN_PURR, local_paca->kvm_hstate.host_purr +
purr - vcpu->arch.purr);
mtspr(SPRN_SPURR, local_paca->kvm_hstate.host_spurr +
spurr - vcpu->arch.spurr);
vcpu->arch.purr = purr;
vcpu->arch.spurr = spurr;
vcpu->arch.ic = mfspr(SPRN_IC);
vcpu->arch.pid = mfspr(SPRN_PID);
vcpu->arch.psscr = mfspr(SPRN_PSSCR) & PSSCR_GUEST_VIS;
vcpu->arch.shregs.sprg0 = mfspr(SPRN_SPRG0);
vcpu->arch.shregs.sprg1 = mfspr(SPRN_SPRG1);
vcpu->arch.shregs.sprg2 = mfspr(SPRN_SPRG2);
vcpu->arch.shregs.sprg3 = mfspr(SPRN_SPRG3);
/* Preserve PSSCR[FAKE_SUSPEND] until we've called kvmppc_save_tm_hv */
mtspr(SPRN_PSSCR, host_psscr |
(local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
mtspr(SPRN_HFSCR, host_hfscr);
mtspr(SPRN_CIABR, host_ciabr);
mtspr(SPRN_DAWR0, host_dawr0);
mtspr(SPRN_DAWRX0, host_dawrx0);
if (cpu_has_feature(CPU_FTR_DAWR1)) {
mtspr(SPRN_DAWR1, host_dawr1);
mtspr(SPRN_DAWRX1, host_dawrx1);
}
if (kvm_is_radix(kvm)) {
/*
* Since this is radix, do a eieio; tlbsync; ptesync sequence
* in case we interrupted the guest between a tlbie and a
* ptesync.
*/
asm volatile("eieio; tlbsync; ptesync");
}
/*
* cp_abort is required if the processor supports local copy-paste
* to clear the copy buffer that was under control of the guest.
*/
if (cpu_has_feature(CPU_FTR_ARCH_31))
asm volatile(PPC_CP_ABORT);
vc->dpdes = mfspr(SPRN_DPDES);
vc->vtb = mfspr(SPRN_VTB);
mtspr(SPRN_DPDES, 0);
if (vc->pcr)
mtspr(SPRN_PCR, PCR_MASK);
if (vc->tb_offset_applied) {
u64 new_tb = mftb() - vc->tb_offset_applied;
mtspr(SPRN_TBU40, new_tb);
tb = mftb();
if ((tb & 0xffffff) < (new_tb & 0xffffff))
mtspr(SPRN_TBU40, new_tb + 0x1000000);
vc->tb_offset_applied = 0;
}
mtspr(SPRN_HDEC, 0x7fffffff);
save_clear_guest_mmu(kvm, vcpu);
switch_mmu_to_host(kvm, host_pidr);
local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
/*
* If we are in real mode, only switch MMU on after the MMU is
* switched to host, to avoid the P9_RADIX_PREFETCH_BUG.
*/
__mtmsrd(msr, 0);
end_timing(vcpu);
return trap;
}
EXPORT_SYMBOL_GPL(kvmhv_vcpu_entry_p9);

View file

@ -57,6 +57,10 @@ static int global_invalidates(struct kvm *kvm)
else
global = 1;
/* LPID has been switched to host if in virt mode so can't do local */
if (!global && (mfmsr() & (MSR_IR|MSR_DR)))
global = 1;
if (!global) {
/* any other core might now have stale TLB entries... */
smp_wmb();
@ -67,7 +71,7 @@ static int global_invalidates(struct kvm *kvm)
* so use the bit for the first thread to represent the core.
*/
if (cpu_has_feature(CPU_FTR_ARCH_300))
cpu = cpu_first_thread_sibling(cpu);
cpu = cpu_first_tlb_thread_sibling(cpu);
cpumask_clear_cpu(cpu, &kvm->arch.need_tlb_flush);
}
@ -409,6 +413,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
vcpu->arch.pgdir, true,
&vcpu->arch.regs.gpr[4]);
}
EXPORT_SYMBOL_GPL(kvmppc_h_enter);
#ifdef __BIG_ENDIAN__
#define LOCK_TOKEN (*(u32 *)(&get_paca()->lock_token))
@ -553,6 +558,7 @@ long kvmppc_h_remove(struct kvm_vcpu *vcpu, unsigned long flags,
return kvmppc_do_h_remove(vcpu->kvm, flags, pte_index, avpn,
&vcpu->arch.regs.gpr[4]);
}
EXPORT_SYMBOL_GPL(kvmppc_h_remove);
long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
{
@ -671,6 +677,7 @@ long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
return ret;
}
EXPORT_SYMBOL_GPL(kvmppc_h_bulk_remove);
long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
unsigned long pte_index, unsigned long avpn)
@ -741,6 +748,7 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long flags,
return H_SUCCESS;
}
EXPORT_SYMBOL_GPL(kvmppc_h_protect);
long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
unsigned long pte_index)
@ -781,6 +789,7 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
}
return H_SUCCESS;
}
EXPORT_SYMBOL_GPL(kvmppc_h_read);
long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
unsigned long pte_index)
@ -829,6 +838,7 @@ long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
return ret;
}
EXPORT_SYMBOL_GPL(kvmppc_h_clear_ref);
long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned long flags,
unsigned long pte_index)
@ -876,6 +886,7 @@ long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned long flags,
unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
return ret;
}
EXPORT_SYMBOL_GPL(kvmppc_h_clear_mod);
static int kvmppc_get_hpa(struct kvm_vcpu *vcpu, unsigned long mmu_seq,
unsigned long gpa, int writing, unsigned long *hpa,
@ -1294,3 +1305,4 @@ long kvmppc_hpte_hv_fault(struct kvm_vcpu *vcpu, unsigned long addr,
return -1; /* send fault up to host kernel mode */
}
EXPORT_SYMBOL_GPL(kvmppc_hpte_hv_fault);

View file

@ -141,13 +141,6 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
return;
}
if (xive_enabled() && kvmhv_on_pseries()) {
/* No XICS access or hypercalls available, too hard */
this_icp->rm_action |= XICS_RM_KICK_VCPU;
this_icp->rm_kick_target = vcpu;
return;
}
/*
* Check if the core is loaded,
* if not, find an available host core to post to wake the VCPU,
@ -771,14 +764,6 @@ static void icp_eoi(struct irq_chip *c, u32 hwirq, __be32 xirr, bool *again)
void __iomem *xics_phys;
int64_t rc;
if (kvmhv_on_pseries()) {
unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
iosync();
plpar_hcall_raw(H_EOI, retbuf, hwirq);
return;
}
rc = pnv_opal_pci_msi_eoi(c, hwirq);
if (rc)

File diff suppressed because it is too large Load diff

View file

@ -164,12 +164,15 @@ kvmppc_interrupt_pr:
/* 64-bit entry. Register usage at this point:
*
* SPRG_SCRATCH0 = guest R13
* R9 = HSTATE_IN_GUEST
* R12 = (guest CR << 32) | exit handler id
* R13 = PACA
* HSTATE.SCRATCH0 = guest R12
* HSTATE.SCRATCH2 = guest R9
*/
#ifdef CONFIG_PPC64
/* Match 32-bit entry */
ld r9,HSTATE_SCRATCH2(r13)
rotldi r12, r12, 32 /* Flip R12 halves for stw */
stw r12, HSTATE_SCRATCH1(r13) /* CR is now in the low half */
srdi r12, r12, 32 /* shift trap into low half */

View file

@ -127,6 +127,71 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu)
}
EXPORT_SYMBOL_GPL(kvmppc_xive_push_vcpu);
/*
* Pull a vcpu's context from the XIVE on guest exit.
* This assumes we are in virtual mode (MMU on)
*/
void kvmppc_xive_pull_vcpu(struct kvm_vcpu *vcpu)
{
void __iomem *tima = local_paca->kvm_hstate.xive_tima_virt;
if (!vcpu->arch.xive_pushed)
return;
/*
* Should not have been pushed if there is no tima
*/
if (WARN_ON(!tima))
return;
eieio();
/* First load to pull the context, we ignore the value */
__raw_readl(tima + TM_SPC_PULL_OS_CTX);
/* Second load to recover the context state (Words 0 and 1) */
vcpu->arch.xive_saved_state.w01 = __raw_readq(tima + TM_QW1_OS);
/* Fixup some of the state for the next load */
vcpu->arch.xive_saved_state.lsmfb = 0;
vcpu->arch.xive_saved_state.ack = 0xff;
vcpu->arch.xive_pushed = 0;
eieio();
}
EXPORT_SYMBOL_GPL(kvmppc_xive_pull_vcpu);
void kvmppc_xive_rearm_escalation(struct kvm_vcpu *vcpu)
{
void __iomem *esc_vaddr = (void __iomem *)vcpu->arch.xive_esc_vaddr;
if (!esc_vaddr)
return;
/* we are using XIVE with single escalation */
if (vcpu->arch.xive_esc_on) {
/*
* If we still have a pending escalation, abort the cede,
* and we must set PQ to 10 rather than 00 so that we don't
* potentially end up with two entries for the escalation
* interrupt in the XIVE interrupt queue. In that case
* we also don't want to set xive_esc_on to 1 here in
* case we race with xive_esc_irq().
*/
vcpu->arch.ceded = 0;
/*
* The escalation interrupts are special as we don't EOI them.
* There is no need to use the load-after-store ordering offset
* to set PQ to 10 as we won't use StoreEOI.
*/
__raw_readq(esc_vaddr + XIVE_ESB_SET_PQ_10);
} else {
vcpu->arch.xive_esc_on = true;
mb();
__raw_readq(esc_vaddr + XIVE_ESB_SET_PQ_00);
}
mb();
}
EXPORT_SYMBOL_GPL(kvmppc_xive_rearm_escalation);
/*
* This is a simple trigger for a generic XIVE IRQ. This must
* only be called for interrupts that support a trigger page
@ -2075,6 +2140,36 @@ static int kvmppc_xive_create(struct kvm_device *dev, u32 type)
return 0;
}
int kvmppc_xive_xics_hcall(struct kvm_vcpu *vcpu, u32 req)
{
struct kvmppc_vcore *vc = vcpu->arch.vcore;
/* The VM should have configured XICS mode before doing XICS hcalls. */
if (!kvmppc_xics_enabled(vcpu))
return H_TOO_HARD;
switch (req) {
case H_XIRR:
return xive_vm_h_xirr(vcpu);
case H_CPPR:
return xive_vm_h_cppr(vcpu, kvmppc_get_gpr(vcpu, 4));
case H_EOI:
return xive_vm_h_eoi(vcpu, kvmppc_get_gpr(vcpu, 4));
case H_IPI:
return xive_vm_h_ipi(vcpu, kvmppc_get_gpr(vcpu, 4),
kvmppc_get_gpr(vcpu, 5));
case H_IPOLL:
return xive_vm_h_ipoll(vcpu, kvmppc_get_gpr(vcpu, 4));
case H_XIRR_X:
xive_vm_h_xirr(vcpu);
kvmppc_set_gpr(vcpu, 5, get_tb() + vc->tb_offset);
return H_SUCCESS;
}
return H_UNSUPPORTED;
}
EXPORT_SYMBOL_GPL(kvmppc_xive_xics_hcall);
int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu)
{
struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
@ -2257,21 +2352,3 @@ struct kvm_device_ops kvm_xive_ops = {
.get_attr = xive_get_attr,
.has_attr = xive_has_attr,
};
void kvmppc_xive_init_module(void)
{
__xive_vm_h_xirr = xive_vm_h_xirr;
__xive_vm_h_ipoll = xive_vm_h_ipoll;
__xive_vm_h_ipi = xive_vm_h_ipi;
__xive_vm_h_cppr = xive_vm_h_cppr;
__xive_vm_h_eoi = xive_vm_h_eoi;
}
void kvmppc_xive_exit_module(void)
{
__xive_vm_h_xirr = NULL;
__xive_vm_h_ipoll = NULL;
__xive_vm_h_ipi = NULL;
__xive_vm_h_cppr = NULL;
__xive_vm_h_eoi = NULL;
}

View file

@ -289,13 +289,6 @@ extern int xive_rm_h_ipi(struct kvm_vcpu *vcpu, unsigned long server,
extern int xive_rm_h_cppr(struct kvm_vcpu *vcpu, unsigned long cppr);
extern int xive_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr);
extern unsigned long (*__xive_vm_h_xirr)(struct kvm_vcpu *vcpu);
extern unsigned long (*__xive_vm_h_ipoll)(struct kvm_vcpu *vcpu, unsigned long server);
extern int (*__xive_vm_h_ipi)(struct kvm_vcpu *vcpu, unsigned long server,
unsigned long mfrr);
extern int (*__xive_vm_h_cppr)(struct kvm_vcpu *vcpu, unsigned long cppr);
extern int (*__xive_vm_h_eoi)(struct kvm_vcpu *vcpu, unsigned long xirr);
/*
* Common Xive routines for XICS-over-XIVE and XIVE native
*/

View file

@ -1281,13 +1281,3 @@ struct kvm_device_ops kvm_xive_native_ops = {
.has_attr = kvmppc_xive_native_has_attr,
.mmap = kvmppc_xive_native_mmap,
};
void kvmppc_xive_native_init_module(void)
{
;
}
void kvmppc_xive_native_exit_module(void)
{
;
}

View file

@ -682,6 +682,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = !!(hv_enabled && kvmppc_hv_ops->enable_dawr1 &&
!kvmppc_hv_ops->enable_dawr1(NULL));
break;
case KVM_CAP_PPC_RPT_INVALIDATE:
r = 1;
break;
#endif
default:
r = 0;

View file

@ -14,6 +14,7 @@
#include <linux/string.h>
#include <linux/init.h>
#include <linux/sched/mm.h>
#include <linux/stop_machine.h>
#include <asm/cputable.h>
#include <asm/code-patching.h>
#include <asm/page.h>
@ -149,17 +150,17 @@ static void do_stf_entry_barrier_fixups(enum stf_barrier_type types)
pr_devel("patching dest %lx\n", (unsigned long)dest);
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
if (types & STF_BARRIER_FALLBACK)
// See comment in do_entry_flush_fixups() RE order of patching
if (types & STF_BARRIER_FALLBACK) {
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_branch((struct ppc_inst *)(dest + 1),
(unsigned long)&stf_barrier_fallback,
BRANCH_SET_LINK);
else
patch_instruction((struct ppc_inst *)(dest + 1),
ppc_inst(instrs[1]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
(unsigned long)&stf_barrier_fallback, BRANCH_SET_LINK);
} else {
patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
}
}
printk(KERN_DEBUG "stf-barrier: patched %d entry locations (%s barrier)\n", i,
@ -227,11 +228,25 @@ static void do_stf_exit_barrier_fixups(enum stf_barrier_type types)
: "unknown");
}
static int __do_stf_barrier_fixups(void *data)
{
enum stf_barrier_type *types = data;
do_stf_entry_barrier_fixups(*types);
do_stf_exit_barrier_fixups(*types);
return 0;
}
void do_stf_barrier_fixups(enum stf_barrier_type types)
{
do_stf_entry_barrier_fixups(types);
do_stf_exit_barrier_fixups(types);
/*
* The call to the fallback entry flush, and the fallback/sync-ori exit
* flush can not be safely patched in/out while other CPUs are executing
* them. So call __do_stf_barrier_fixups() on one CPU while all other CPUs
* spin in the stop machine core with interrupts hard disabled.
*/
stop_machine(__do_stf_barrier_fixups, &types, NULL);
}
void do_uaccess_flush_fixups(enum l1d_flush_type types)
@ -284,8 +299,9 @@ void do_uaccess_flush_fixups(enum l1d_flush_type types)
: "unknown");
}
void do_entry_flush_fixups(enum l1d_flush_type types)
static int __do_entry_flush_fixups(void *data)
{
enum l1d_flush_type types = *(enum l1d_flush_type *)data;
unsigned int instrs[3], *dest;
long *start, *end;
int i;
@ -309,6 +325,31 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
if (types & L1D_FLUSH_MTTRIG)
instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
/*
* If we're patching in or out the fallback flush we need to be careful about the
* order in which we patch instructions. That's because it's possible we could
* take a page fault after patching one instruction, so the sequence of
* instructions must be safe even in a half patched state.
*
* To make that work, when patching in the fallback flush we patch in this order:
* - the mflr (dest)
* - the mtlr (dest + 2)
* - the branch (dest + 1)
*
* That ensures the sequence is safe to execute at any point. In contrast if we
* patch the mtlr last, it's possible we could return from the branch and not
* restore LR, leading to a crash later.
*
* When patching out the fallback flush (either with nops or another flush type),
* we patch in this order:
* - the branch (dest + 1)
* - the mtlr (dest + 2)
* - the mflr (dest)
*
* Note we are protected by stop_machine() from other CPUs executing the code in a
* semi-patched state.
*/
start = PTRRELOC(&__start___entry_flush_fixup);
end = PTRRELOC(&__stop___entry_flush_fixup);
for (i = 0; start < end; start++, i++) {
@ -316,15 +357,16 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
pr_devel("patching dest %lx\n", (unsigned long)dest);
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
if (types == L1D_FLUSH_FALLBACK)
patch_branch((struct ppc_inst *)(dest + 1), (unsigned long)&entry_flush_fallback,
BRANCH_SET_LINK);
else
if (types == L1D_FLUSH_FALLBACK) {
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_branch((struct ppc_inst *)(dest + 1),
(unsigned long)&entry_flush_fallback, BRANCH_SET_LINK);
} else {
patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
}
}
start = PTRRELOC(&__start___scv_entry_flush_fixup);
@ -334,15 +376,16 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
pr_devel("patching dest %lx\n", (unsigned long)dest);
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
if (types == L1D_FLUSH_FALLBACK)
patch_branch((struct ppc_inst *)(dest + 1), (unsigned long)&scv_entry_flush_fallback,
BRANCH_SET_LINK);
else
if (types == L1D_FLUSH_FALLBACK) {
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_branch((struct ppc_inst *)(dest + 1),
(unsigned long)&scv_entry_flush_fallback, BRANCH_SET_LINK);
} else {
patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
}
}
@ -354,6 +397,19 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
: "ori type" :
(types & L1D_FLUSH_MTTRIG) ? "mttrig type"
: "unknown");
return 0;
}
void do_entry_flush_fixups(enum l1d_flush_type types)
{
/*
* The call to the fallback flush can not be safely patched in/out while
* other CPUs are executing it. So call __do_entry_flush_fixups() on one
* CPU while all other CPUs spin in the stop machine core with interrupts
* hard disabled.
*/
stop_machine(__do_entry_flush_fixups, &types, NULL);
}
void do_rfi_flush_fixups(enum l1d_flush_type types)

View file

@ -357,30 +357,19 @@ static void __init radix_init_pgtable(void)
}
/* Find out how many PID bits are supported */
if (!cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG)) {
if (!mmu_pid_bits)
mmu_pid_bits = 20;
mmu_base_pid = 1;
} else if (cpu_has_feature(CPU_FTR_HVMODE)) {
if (!mmu_pid_bits)
mmu_pid_bits = 20;
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
if (!cpu_has_feature(CPU_FTR_HVMODE) &&
cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG)) {
/*
* When KVM is possible, we only use the top half of the
* PID space to avoid collisions between host and guest PIDs
* which can cause problems due to prefetch when exiting the
* guest with AIL=3
* Older versions of KVM on these machines perfer if the
* guest only uses the low 19 PID bits.
*/
mmu_base_pid = 1 << (mmu_pid_bits - 1);
#else
mmu_base_pid = 1;
#endif
} else {
/* The guest uses the bottom half of the PID space */
if (!mmu_pid_bits)
mmu_pid_bits = 19;
mmu_base_pid = 1;
} else {
if (!mmu_pid_bits)
mmu_pid_bits = 20;
}
mmu_base_pid = 1;
/*
* Allocate Partition table and process table for the
@ -486,6 +475,7 @@ static int __init radix_dt_scan_page_sizes(unsigned long node,
def = &mmu_psize_defs[idx];
def->shift = shift;
def->ap = ap;
def->h_rpt_pgsize = psize_to_rpti_pgsize(idx);
}
/* needed ? */
@ -560,9 +550,13 @@ void __init radix__early_init_devtree(void)
*/
mmu_psize_defs[MMU_PAGE_4K].shift = 12;
mmu_psize_defs[MMU_PAGE_4K].ap = 0x0;
mmu_psize_defs[MMU_PAGE_4K].h_rpt_pgsize =
psize_to_rpti_pgsize(MMU_PAGE_4K);
mmu_psize_defs[MMU_PAGE_64K].shift = 16;
mmu_psize_defs[MMU_PAGE_64K].ap = 0x5;
mmu_psize_defs[MMU_PAGE_64K].h_rpt_pgsize =
psize_to_rpti_pgsize(MMU_PAGE_64K);
}
/*

View file

@ -20,10 +20,6 @@
#include "internal.h"
#define RIC_FLUSH_TLB 0
#define RIC_FLUSH_PWC 1
#define RIC_FLUSH_ALL 2
/*
* tlbiel instruction for radix, set invalidation
* i.e., r=1 and is=01 or is=10 or is=11
@ -130,6 +126,21 @@ static __always_inline void __tlbie_pid(unsigned long pid, unsigned long ric)
trace_tlbie(0, 0, rb, rs, ric, prs, r);
}
static __always_inline void __tlbie_pid_lpid(unsigned long pid,
unsigned long lpid,
unsigned long ric)
{
unsigned long rb, rs, prs, r;
rb = PPC_BIT(53); /* IS = 1 */
rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~(PPC_BITMASK(0, 31)));
prs = 1; /* process scoped */
r = 1; /* radix format */
asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
: : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
trace_tlbie(0, 0, rb, rs, ric, prs, r);
}
static __always_inline void __tlbie_lpid(unsigned long lpid, unsigned long ric)
{
unsigned long rb,rs,prs,r;
@ -190,6 +201,23 @@ static __always_inline void __tlbie_va(unsigned long va, unsigned long pid,
trace_tlbie(0, 0, rb, rs, ric, prs, r);
}
static __always_inline void __tlbie_va_lpid(unsigned long va, unsigned long pid,
unsigned long lpid,
unsigned long ap, unsigned long ric)
{
unsigned long rb, rs, prs, r;
rb = va & ~(PPC_BITMASK(52, 63));
rb |= ap << PPC_BITLSHIFT(58);
rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~(PPC_BITMASK(0, 31)));
prs = 1; /* process scoped */
r = 1; /* radix format */
asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
: : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
trace_tlbie(0, 0, rb, rs, ric, prs, r);
}
static __always_inline void __tlbie_lpid_va(unsigned long va, unsigned long lpid,
unsigned long ap, unsigned long ric)
{
@ -235,6 +263,22 @@ static inline void fixup_tlbie_va_range(unsigned long va, unsigned long pid,
}
}
static inline void fixup_tlbie_va_range_lpid(unsigned long va,
unsigned long pid,
unsigned long lpid,
unsigned long ap)
{
if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
asm volatile("ptesync" : : : "memory");
__tlbie_pid_lpid(0, lpid, RIC_FLUSH_TLB);
}
if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
asm volatile("ptesync" : : : "memory");
__tlbie_va_lpid(va, pid, lpid, ap, RIC_FLUSH_TLB);
}
}
static inline void fixup_tlbie_pid(unsigned long pid)
{
/*
@ -254,6 +298,25 @@ static inline void fixup_tlbie_pid(unsigned long pid)
}
}
static inline void fixup_tlbie_pid_lpid(unsigned long pid, unsigned long lpid)
{
/*
* We can use any address for the invalidation, pick one which is
* probably unused as an optimisation.
*/
unsigned long va = ((1UL << 52) - 1);
if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
asm volatile("ptesync" : : : "memory");
__tlbie_pid_lpid(0, lpid, RIC_FLUSH_TLB);
}
if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
asm volatile("ptesync" : : : "memory");
__tlbie_va_lpid(va, pid, lpid, mmu_get_ap(MMU_PAGE_64K),
RIC_FLUSH_TLB);
}
}
static inline void fixup_tlbie_lpid_va(unsigned long va, unsigned long lpid,
unsigned long ap)
@ -344,6 +407,31 @@ static inline void _tlbie_pid(unsigned long pid, unsigned long ric)
asm volatile("eieio; tlbsync; ptesync": : :"memory");
}
static inline void _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
unsigned long ric)
{
asm volatile("ptesync" : : : "memory");
/*
* Workaround the fact that the "ric" argument to __tlbie_pid
* must be a compile-time contraint to match the "i" constraint
* in the asm statement.
*/
switch (ric) {
case RIC_FLUSH_TLB:
__tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
fixup_tlbie_pid_lpid(pid, lpid);
break;
case RIC_FLUSH_PWC:
__tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
break;
case RIC_FLUSH_ALL:
default:
__tlbie_pid_lpid(pid, lpid, RIC_FLUSH_ALL);
fixup_tlbie_pid_lpid(pid, lpid);
}
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
}
struct tlbiel_pid {
unsigned long pid;
unsigned long ric;
@ -469,6 +557,20 @@ static inline void __tlbie_va_range(unsigned long start, unsigned long end,
fixup_tlbie_va_range(addr - page_size, pid, ap);
}
static inline void __tlbie_va_range_lpid(unsigned long start, unsigned long end,
unsigned long pid, unsigned long lpid,
unsigned long page_size,
unsigned long psize)
{
unsigned long addr;
unsigned long ap = mmu_get_ap(psize);
for (addr = start; addr < end; addr += page_size)
__tlbie_va_lpid(addr, pid, lpid, ap, RIC_FLUSH_TLB);
fixup_tlbie_va_range_lpid(addr - page_size, pid, lpid, ap);
}
static __always_inline void _tlbie_va(unsigned long va, unsigned long pid,
unsigned long psize, unsigned long ric)
{
@ -549,6 +651,18 @@ static inline void _tlbie_va_range(unsigned long start, unsigned long end,
asm volatile("eieio; tlbsync; ptesync": : :"memory");
}
static inline void _tlbie_va_range_lpid(unsigned long start, unsigned long end,
unsigned long pid, unsigned long lpid,
unsigned long page_size,
unsigned long psize, bool also_pwc)
{
asm volatile("ptesync" : : : "memory");
if (also_pwc)
__tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
__tlbie_va_range_lpid(start, end, pid, lpid, page_size, psize);
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
}
static inline void _tlbiel_va_range_multicast(struct mm_struct *mm,
unsigned long start, unsigned long end,
unsigned long pid, unsigned long page_size,
@ -1338,47 +1452,57 @@ void radix__flush_tlb_all(void)
}
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
extern void radix_kvm_prefetch_workaround(struct mm_struct *mm)
/*
* Performs process-scoped invalidations for a given LPID
* as part of H_RPT_INVALIDATE hcall.
*/
void do_h_rpt_invalidate_prt(unsigned long pid, unsigned long lpid,
unsigned long type, unsigned long pg_sizes,
unsigned long start, unsigned long end)
{
unsigned long pid = mm->context.id;
if (unlikely(pid == MMU_NO_CONTEXT))
return;
if (!cpu_has_feature(CPU_FTR_P9_RADIX_PREFETCH_BUG))
return;
unsigned long psize, nr_pages;
struct mmu_psize_def *def;
bool flush_pid;
/*
* If this context hasn't run on that CPU before and KVM is
* around, there's a slim chance that the guest on another
* CPU just brought in obsolete translation into the TLB of
* this CPU due to a bad prefetch using the guest PID on
* the way into the hypervisor.
*
* We work around this here. If KVM is possible, we check if
* any sibling thread is in KVM. If it is, the window may exist
* and thus we flush that PID from the core.
*
* A potential future improvement would be to mark which PIDs
* have never been used on the system and avoid it if the PID
* is new and the process has no other cpumask bit set.
* A H_RPTI_TYPE_ALL request implies RIC=3, hence
* do a single IS=1 based flush.
*/
if (cpu_has_feature(CPU_FTR_HVMODE) && radix_enabled()) {
int cpu = smp_processor_id();
int sib = cpu_first_thread_sibling(cpu);
bool flush = false;
if ((type & H_RPTI_TYPE_ALL) == H_RPTI_TYPE_ALL) {
_tlbie_pid_lpid(pid, lpid, RIC_FLUSH_ALL);
return;
}
for (; sib <= cpu_last_thread_sibling(cpu) && !flush; sib++) {
if (sib == cpu)
continue;
if (!cpu_possible(sib))
continue;
if (paca_ptrs[sib]->kvm_hstate.kvm_vcpu)
flush = true;
if (type & H_RPTI_TYPE_PWC)
_tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
/* Full PID flush */
if (start == 0 && end == -1)
return _tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
/* Do range invalidation for all the valid page sizes */
for (psize = 0; psize < MMU_PAGE_COUNT; psize++) {
def = &mmu_psize_defs[psize];
if (!(pg_sizes & def->h_rpt_pgsize))
continue;
nr_pages = (end - start) >> def->shift;
flush_pid = nr_pages > tlb_single_page_flush_ceiling;
/*
* If the number of pages spanning the range is above
* the ceiling, convert the request into a full PID flush.
* And since PID flush takes out all the page sizes, there
* is no need to consider remaining page sizes.
*/
if (flush_pid) {
_tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
return;
}
if (flush)
_tlbiel_pid(pid, RIC_FLUSH_ALL);
_tlbie_va_range_lpid(start, end, pid, lpid,
(1UL << def->shift), psize, false);
}
}
EXPORT_SYMBOL_GPL(radix_kvm_prefetch_workaround);
EXPORT_SYMBOL_GPL(do_h_rpt_invalidate_prt);
#endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */

View file

@ -83,9 +83,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
if (cpu_has_feature(CPU_FTR_ALTIVEC))
asm volatile ("dssall");
if (new_on_cpu)
radix_kvm_prefetch_workaround(next);
else
if (!new_on_cpu)
membarrier_arch_switch_mm(prev, next, tsk);
/*

Some files were not shown because too many files have changed in this diff Show more