This commit is contained in:
Mark Brown 2021-06-01 18:33:33 +01:00
commit 1a435466b0
No known key found for this signature in database
GPG Key ID: 24D68B725D5487D0
325 changed files with 3075 additions and 1746 deletions

View File

@ -160,6 +160,7 @@ Jeff Layton <jlayton@kernel.org> <jlayton@primarydata.com>
Jeff Layton <jlayton@kernel.org> <jlayton@redhat.com>
Jens Axboe <axboe@suse.de>
Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Jernej Skrabec <jernej.skrabec@gmail.com> <jernej.skrabec@siol.net>
Jiri Slaby <jirislaby@kernel.org> <jirislaby@gmail.com>
Jiri Slaby <jirislaby@kernel.org> <jslaby@novell.com>
Jiri Slaby <jirislaby@kernel.org> <jslaby@suse.com>

View File

@ -1,7 +1,7 @@
What: /sys/class/dax/
Date: May, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description: Device DAX is the device-centric analogue of Filesystem
DAX (CONFIG_FS_DAX). It allows memory ranges to be
allocated and mapped without need of an intervening file

View File

@ -1,4 +1,4 @@
This ABI is renamed and moved to a new location /sys/kernel/fadump/registered.¬
This ABI is renamed and moved to a new location /sys/kernel/fadump/registered.
What: /sys/kernel/fadump_registered
Date: Feb 2012

View File

@ -1,4 +1,4 @@
This ABI is renamed and moved to a new location /sys/kernel/fadump/release_mem.¬
This ABI is renamed and moved to a new location /sys/kernel/fadump/release_mem.
What: /sys/kernel/fadump_release_mem
Date: Feb 2012

View File

@ -1,7 +1,7 @@
What: /sys/bus/nd/devices/regionX/nfit/ecc_unit_size
Date: Aug, 2017
KernelVersion: v4.14 (Removed v4.18)
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Size of a write request to a DIMM that will not incur a
read-modify-write cycle at the memory controller.

View File

@ -5,7 +5,7 @@ Interface Table (NFIT)' section in the ACPI specification
What: /sys/bus/nd/devices/nmemX/nfit/serial
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Serial number of the NVDIMM (non-volatile dual in-line
memory module), assigned by the module vendor.
@ -14,7 +14,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/handle
Date: Apr, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The address (given by the _ADR object) of the device on its
parent bus of the NVDIMM device containing the NVDIMM region.
@ -23,7 +23,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/device
Date: Apr, 2015
KernelVersion: v4.1
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Device id for the NVDIMM, assigned by the module vendor.
@ -31,7 +31,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/rev_id
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Revision of the NVDIMM, assigned by the module vendor.
@ -39,7 +39,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/phys_id
Date: Apr, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Handle (i.e., instance number) for the SMBIOS (system
management BIOS) Memory Device structure describing the NVDIMM
@ -49,7 +49,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/flags
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The flags in the NFIT memory device sub-structure indicate
the state of the data on the nvdimm relative to its energy
@ -68,7 +68,7 @@ What: /sys/bus/nd/devices/nmemX/nfit/format1
What: /sys/bus/nd/devices/nmemX/nfit/formats
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The interface codes indicate support for persistent memory
mapped directly into system physical address space and / or a
@ -84,7 +84,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/vendor
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Vendor id of the NVDIMM.
@ -92,7 +92,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/dsm_mask
Date: May, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The bitmask indicates the supported device specific control
functions relative to the NVDIMM command family supported by the
@ -102,7 +102,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/family
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Displays the NVDIMM family command sets. Values
0, 1, 2 and 3 correspond to NVDIMM_FAMILY_INTEL,
@ -118,7 +118,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/id
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) ACPI specification 6.2 section 5.2.25.9, defines an
identifier for an NVDIMM, which refelects the id attribute.
@ -127,7 +127,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/subsystem_vendor
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Sub-system vendor id of the NVDIMM non-volatile memory
subsystem controller.
@ -136,7 +136,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/subsystem_rev_id
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Sub-system revision id of the NVDIMM non-volatile memory subsystem
controller, assigned by the non-volatile memory subsystem
@ -146,7 +146,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/nfit/subsystem_device
Date: Apr, 2016
KernelVersion: v4.7
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) Sub-system device id for the NVDIMM non-volatile memory
subsystem controller, assigned by the non-volatile memory
@ -156,7 +156,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/revision
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) ACPI NFIT table revision number.
@ -164,7 +164,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/scrub
Date: Sep, 2016
KernelVersion: v4.9
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RW) This shows the number of full Address Range Scrubs (ARS)
that have been completed since driver load time. Userspace can
@ -177,7 +177,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/hw_error_scrub
Date: Sep, 2016
KernelVersion: v4.9
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RW) Provides a way to toggle the behavior between just adding
the address (cache line) where the MCE happened to the poison
@ -196,7 +196,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/dsm_mask
Date: Jun, 2017
KernelVersion: v4.13
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) The bitmask indicates the supported bus specific control
functions. See the section named 'NVDIMM Root Device _DSMs' in
@ -205,7 +205,7 @@ Description:
What: /sys/bus/nd/devices/ndbusX/nfit/firmware_activate_noidle
Date: Apr, 2020
KernelVersion: v5.8
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RW) The Intel platform implementation of firmware activate
support exposes an option let the platform force idle devices in
@ -225,7 +225,7 @@ Description:
What: /sys/bus/nd/devices/regionX/nfit/range_index
Date: Jun, 2015
KernelVersion: v4.2
Contact: linux-nvdimm@lists.01.org
Contact: nvdimm@lists.linux.dev
Description:
(RO) A unique number provided by the BIOS to identify an address
range. Used by NVDIMM Region Mapping Structure to uniquely refer

View File

@ -1,7 +1,7 @@
What: /sys/bus/nd/devices/nmemX/papr/flags
Date: Apr, 2020
KernelVersion: v5.8
Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, nvdimm@lists.linux.dev,
Description:
(RO) Report flags indicating various states of a
papr-pmem NVDIMM device. Each flag maps to a one or
@ -36,7 +36,7 @@ Description:
What: /sys/bus/nd/devices/nmemX/papr/perf_stats
Date: May, 2020
KernelVersion: v5.9
Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
Contact: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, nvdimm@lists.linux.dev,
Description:
(RO) Report various performance stats related to papr-scm NVDIMM
device. Each stat is reported on a new line with each line

View File

@ -37,13 +37,13 @@ Description: Maximum time allowed for periodic transfers per microframe (μs)
What: /sys/module/*/{coresize,initsize}
Date: Jan 2012
KernelVersion:»·3.3
KernelVersion: 3.3
Contact: Kay Sievers <kay.sievers@vrfy.org>
Description: Module size in bytes.
What: /sys/module/*/taint
Date: Jan 2012
KernelVersion:»·3.3
KernelVersion: 3.3
Contact: Kay Sievers <kay.sievers@vrfy.org>
Description: Module taint flags:
== =====================

View File

@ -483,10 +483,11 @@ modprobe
========
The full path to the usermode helper for autoloading kernel modules,
by default "/sbin/modprobe". This binary is executed when the kernel
requests a module. For example, if userspace passes an unknown
filesystem type to mount(), then the kernel will automatically request
the corresponding filesystem module by executing this usermode helper.
by default ``CONFIG_MODPROBE_PATH``, which in turn defaults to
"/sbin/modprobe". This binary is executed when the kernel requests a
module. For example, if userspace passes an unknown filesystem type
to mount(), then the kernel will automatically request the
corresponding filesystem module by executing this usermode helper.
This usermode helper should insert the needed module into the kernel.
This sysctl only affects module autoloading. It has no effect on the

View File

@ -1,4 +1,4 @@
==============
==============
Data Integrity
==============

View File

@ -146,18 +146,18 @@ with the kernel as a block device by registering the following general
*struct file_operations*::
struct file_operations cdrom_fops = {
NULL, / lseek /
block _read , / read—general block-dev read /
block _write, / write—general block-dev write /
NULL, / readdir /
NULL, / select /
cdrom_ioctl, / ioctl /
NULL, / mmap /
cdrom_open, / open /
cdrom_release, / release /
NULL, / fsync /
NULL, / fasync /
NULL / revalidate /
NULL, /* lseek */
block _read , /* read--general block-dev read */
block _write, /* write--general block-dev write */
NULL, /* readdir */
NULL, /* select */
cdrom_ioctl, /* ioctl */
NULL, /* mmap */
cdrom_open, /* open */
cdrom_release, /* release */
NULL, /* fsync */
NULL, /* fasync */
NULL /* revalidate */
};
Every active CD-ROM device shares this *struct*. The routines
@ -250,12 +250,12 @@ The drive-specific, minor-like information that is registered with
`cdrom.c`, currently contains the following fields::
struct cdrom_device_info {
const struct cdrom_device_ops * ops; /* device operations for this major */
const struct cdrom_device_ops * ops; /* device operations for this major */
struct list_head list; /* linked list of all device_info */
struct gendisk * disk; /* matching block layer disk */
void * handle; /* driver-dependent data */
int mask; /* mask of capability: disables them */
int mask; /* mask of capability: disables them */
int speed; /* maximum speed for reading data */
int capacity; /* number of discs in a jukebox */
@ -569,7 +569,7 @@ the *CDC_CLOSE_TRAY* bit in *mask*.
In the file `cdrom.c` you will encounter many constructions of the type::
if (cdo->capability & cdi->mask & CDC _⟨capability⟩) ...
if (cdo->capability & ~cdi->mask & CDC _<capability>) ...
There is no *ioctl* to set the mask... The reason is that
I think it is better to control the **behavior** rather than the

View File

@ -72,7 +72,7 @@ examples:
mux-controls = <&mux>;
spi-flash@0 {
flash@0 {
compatible = "jedec,spi-nor";
reg = <0>;
spi-max-frequency = <40000000>;

View File

@ -4,7 +4,7 @@ LIBNVDIMM: Non-Volatile Devices
libnvdimm - kernel / libndctl - userspace helper library
linux-nvdimm@lists.01.org
nvdimm@lists.linux.dev
Version 13

View File

@ -19,7 +19,6 @@ Serial drivers
moxa-smartio
n_gsm
rocket
serial-iso7816
serial-rs485

View File

@ -109,16 +109,19 @@ well as to make sure they aren't relying on some HCD-specific behavior.
USB-Standard Types
==================
In ``drivers/usb/common/common.c`` and ``drivers/usb/common/debug.c`` you
will find the USB data types defined in chapter 9 of the USB specification.
These data types are used throughout USB, and in APIs including this host
side API, gadget APIs, usb character devices and debugfs interfaces.
In ``include/uapi/linux/usb/ch9.h`` you will find the USB data types defined
in chapter 9 of the USB specification. These data types are used throughout
USB, and in APIs including this host side API, gadget APIs, usb character
devices and debugfs interfaces. That file is itself included by
``include/linux/usb/ch9.h``, which also contains declarations of a few
utility routines for manipulating these data types; the implementations
are in ``drivers/usb/common/common.c``.
.. kernel-doc:: drivers/usb/common/common.c
:export:
.. kernel-doc:: drivers/usb/common/debug.c
:export:
In addition, some functions useful for creating debugging output are
defined in ``drivers/usb/common/debug.c``.
Host-Side Data Types and Macros
===============================

View File

@ -50,8 +50,8 @@ Here is the main features of EROFS:
- Support POSIX.1e ACLs by using xattrs;
- Support transparent file compression as an option:
LZ4 algorithm with 4 KB fixed-sized output compression for high performance.
- Support transparent data compression as an option:
LZ4 algorithm with the fixed-sized output compression for high performance.
The following git tree provides the file system user-space tools under
development (ex, formatting tool mkfs.erofs):
@ -113,31 +113,31 @@ may not. All metadatas can be now observed in two different spaces (views):
::
|-> aligned with 8B
|-> followed closely
+ meta_blkaddr blocks |-> another slot
_____________________________________________________________________
| ... | inode | xattrs | extents | data inline | ... | inode ...
|________|_______|(optional)|(optional)|__(optional)_|_____|__________
|-> aligned with the inode slot size
. .
. .
. .
. .
. .
. .
.____________________________________________________|-> aligned with 4B
| xattr_ibody_header | shared xattrs | inline xattrs |
|____________________|_______________|_______________|
|-> 12 bytes <-|->x * 4 bytes<-| .
. . .
. . .
. . .
._______________________________.______________________.
| id | id | id | id | ... | id | ent | ... | ent| ... |
|____|____|____|____|______|____|_____|_____|____|_____|
|-> aligned with 4B
|-> aligned with 4B
|-> aligned with 8B
|-> followed closely
+ meta_blkaddr blocks |-> another slot
_____________________________________________________________________
| ... | inode | xattrs | extents | data inline | ... | inode ...
|________|_______|(optional)|(optional)|__(optional)_|_____|__________
|-> aligned with the inode slot size
. .
. .
. .
. .
. .
. .
.____________________________________________________|-> aligned with 4B
| xattr_ibody_header | shared xattrs | inline xattrs |
|____________________|_______________|_______________|
|-> 12 bytes <-|->x * 4 bytes<-| .
. . .
. . .
. . .
._______________________________.______________________.
| id | id | id | id | ... | id | ent | ... | ent| ... |
|____|____|____|____|______|____|_____|_____|____|_____|
|-> aligned with 4B
|-> aligned with 4B
Inode could be 32 or 64 bytes, which can be distinguished from a common
field which all inode versions have -- i_format::
@ -175,13 +175,13 @@ may not. All metadatas can be now observed in two different spaces (views):
Each share xattr can also be directly found by the following formula:
xattr offset = xattr_blkaddr * block_size + 4 * xattr_id
::
::
|-> aligned by 4 bytes
+ xattr_blkaddr blocks |-> aligned with 4 bytes
_________________________________________________________________________
| ... | xattr_entry | xattr data | ... | xattr_entry | xattr data ...
|________|_____________|_____________|_____|______________|_______________
|-> aligned by 4 bytes
+ xattr_blkaddr blocks |-> aligned with 4 bytes
_________________________________________________________________________
| ... | xattr_entry | xattr data | ... | xattr_entry | xattr data ...
|________|_____________|_____________|_____|______________|_______________
Directories
-----------
@ -193,48 +193,77 @@ algorithm (could refer to the related source code).
::
___________________________
/ |
/ ______________|________________
/ / | nameoff1 | nameoffN-1
____________.______________._______________v________________v__________
| dirent | dirent | ... | dirent | filename | filename | ... | filename |
|___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
\ ^
\ | * could have
\ | trailing '\0'
\________________________| nameoff0
Directory block
___________________________
/ |
/ ______________|________________
/ / | nameoff1 | nameoffN-1
____________.______________._______________v________________v__________
| dirent | dirent | ... | dirent | filename | filename | ... | filename |
|___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
\ ^
\ | * could have
\ | trailing '\0'
\________________________| nameoff0
Directory block
Note that apart from the offset of the first filename, nameoff0 also indicates
the total number of directory entries in this block since it is no need to
introduce another on-disk field at all.
Compression
-----------
Currently, EROFS supports 4KB fixed-sized output transparent file compression,
as illustrated below::
Data compression
----------------
EROFS implements LZ4 fixed-sized output compression which generates fixed-sized
compressed data blocks from variable-sized input in contrast to other existing
fixed-sized input solutions. Relatively higher compression ratios can be gotten
by using fixed-sized output compression since nowadays popular data compression
algorithms are mostly LZ77-based and such fixed-sized output approach can be
benefited from the historical dictionary (aka. sliding window).
|---- Variant-Length Extent ----|-------- VLE --------|----- VLE -----
clusterofs clusterofs clusterofs
| | | logical data
_________v_______________________________v_____________________v_______________
... | . | | . | | . | ...
____|____.________|_____________|________.____|_____________|__.__________|____
|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|
size size size size size
. . . .
. . . .
. . . .
_______._____________._____________._____________._____________________
... | | | | ... physical data
_______|_____________|_____________|_____________|_____________________
|-> cluster <-|-> cluster <-|-> cluster <-|
size size size
In details, original (uncompressed) data is turned into several variable-sized
extents and in the meanwhile, compressed into physical clusters (pclusters).
In order to record each variable-sized extent, logical clusters (lclusters) are
introduced as the basic unit of compress indexes to indicate whether a new
extent is generated within the range (HEAD) or not (NONHEAD). Lclusters are now
fixed in block size, as illustrated below::
Currently each on-disk physical cluster can contain 4KB (un)compressed data
at most. For each logical cluster, there is a corresponding on-disk index to
describe its cluster type, physical cluster address, etc.
|<- variable-sized extent ->|<- VLE ->|
clusterofs clusterofs clusterofs
| | |
_________v_________________________________v_______________________v________
... | . | | . | | . ...
____|____._________|______________|________.___ _|______________|__.________
|-> lcluster <-|-> lcluster <-|-> lcluster <-|-> lcluster <-|
(HEAD) (NONHEAD) (HEAD) (NONHEAD) .
. CBLKCNT . .
. . .
. . .
_______._____________________________.______________._________________
... | | | | ...
_______|______________|______________|______________|_________________
|-> big pcluster <-|-> pcluster <-|
See "struct z_erofs_vle_decompressed_index" in erofs_fs.h for more details.
A physical cluster can be seen as a container of physical compressed blocks
which contains compressed data. Previously, only lcluster-sized (4KB) pclusters
were supported. After big pcluster feature is introduced (available since
Linux v5.13), pcluster can be a multiple of lcluster size.
For each HEAD lcluster, clusterofs is recorded to indicate where a new extent
starts and blkaddr is used to seek the compressed data. For each NONHEAD
lcluster, delta0 and delta1 are available instead of blkaddr to indicate the
distance to its HEAD lcluster and the next HEAD lcluster. A PLAIN lcluster is
also a HEAD lcluster except that its data is uncompressed. See the comments
around "struct z_erofs_vle_decompressed_index" in erofs_fs.h for more details.
If big pcluster is enabled, pcluster size in lclusters needs to be recorded as
well. Let the delta0 of the first NONHEAD lcluster store the compressed block
count with a special flag as a new called CBLKCNT NONHEAD lcluster. It's easy
to understand its delta0 is constantly 1, as illustrated below::
__________________________________________________________
| HEAD | NONHEAD | NONHEAD | ... | NONHEAD | HEAD | HEAD |
|__:___|_(CBLKCNT)_|_________|_____|_________|__:___|____:_|
|<----- a big pcluster (with CBLKCNT) ------>|<-- -->|
a lcluster-sized pcluster (without CBLKCNT) ^
If another HEAD follows a HEAD lcluster, there is no room to record CBLKCNT,
but it's easy to know the size of such pcluster is 1 lcluster as well.

View File

@ -21,10 +21,10 @@ Description
The TMP103 is a digital output temperature sensor in a four-ball
wafer chip-scale package (WCSP). The TMP103 is capable of reading
temperatures to a resolution of 1°C. The TMP103 is specified for
operation over a temperature range of 40°C to +125°C.
operation over a temperature range of -40°C to +125°C.
Resolution: 8 Bits
Accuracy: ±1°C Typ (10°C to +100°C)
Accuracy: ±1°C Typ (-10°C to +100°C)
The driver provides the common sysfs-interface for temperatures (see
Documentation/hwmon/sysfs-interface.rst under Temperatures).

View File

@ -173,7 +173,7 @@ Director rule is added from ethtool (Sideband filter), ATR is turned off by the
driver. To re-enable ATR, the sideband can be disabled with the ethtool -K
option. For example::
ethtool K [adapter] ntuple [off|on]
ethtool -K [adapter] ntuple [off|on]
If sideband is re-enabled after ATR is re-enabled, ATR remains enabled until a
TCP-IP flow is added. When all TCP-IP sideband rules are deleted, ATR is
@ -688,7 +688,7 @@ shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates.
Totals must be equal or less than port speed.
For example: min_rate 1Gbit 3Gbit: Verify bandwidth limit using network
monitoring tools such as ifstat or sar n DEV [interval] [number of samples]
monitoring tools such as `ifstat` or `sar -n DEV [interval] [number of samples]`
2. Enable HW TC offload on interface::

View File

@ -179,7 +179,7 @@ shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates.
Totals must be equal or less than port speed.
For example: min_rate 1Gbit 3Gbit: Verify bandwidth limit using network
monitoring tools such as ifstat or sar n DEV [interval] [number of samples]
monitoring tools such as ``ifstat`` or ``sar -n DEV [interval] [number of samples]``
NOTE:
Setting up channels via ethtool (ethtool -L) is not supported when the

View File

@ -1,4 +1,4 @@
.. _process_statement_kernel:
.. _process_statement_kernel:
Linux Kernel Enforcement Statement
----------------------------------

View File

@ -1,4 +1,4 @@
=============================
=============================
Virtual TPM interface for Xen
=============================

View File

@ -1,4 +1,4 @@
======================================
======================================
NO_HZ: Reducing Scheduling-Clock Ticks
======================================

View File

@ -1,50 +0,0 @@
Chinese translated version of Documentation/admin-guide/security-bugs.rst
If you have any comment or update to the content, please contact the
original document maintainer directly. However, if you have a problem
communicating in English you can also ask the Chinese maintainer for
help. Contact the Chinese maintainer if this translation is outdated
or if there is a problem with the translation.
Chinese maintainer: Harry Wei <harryxiyou@gmail.com>
---------------------------------------------------------------------
Documentation/admin-guide/security-bugs.rst 的中文翻译
如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文
交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻
译存在问题,请联系中文版维护者。
中文版维护者: 贾威威 Harry Wei <harryxiyou@gmail.com>
中文版翻译者: 贾威威 Harry Wei <harryxiyou@gmail.com>
中文版校译者: 贾威威 Harry Wei <harryxiyou@gmail.com>
以下为正文
---------------------------------------------------------------------
Linux内核开发者认为安全非常重要。因此我们想要知道当一个有关于
安全的漏洞被发现的时候,并且它可能会被尽快的修复或者公开。请把这个安全
漏洞报告给Linux内核安全团队。
1) 联系
linux内核安全团队可以通过email<security@kernel.org>来联系。这是
一组独立的安全工作人员,可以帮助改善漏洞报告并且公布和取消一个修复。安
全团队有可能会从部分的维护者那里引进额外的帮助来了解并且修复安全漏洞。
当遇到任何漏洞,所能提供的信息越多就越能诊断和修复。如果你不清楚什么
是有帮助的信息那就请重温一下admin-guide/reporting-bugs.rst文件中的概述过程。任
何攻击性的代码都是非常有用的,未经报告者的同意不会被取消,除非它已经
被公布于众。
2) 公开
Linux内核安全团队的宗旨就是和漏洞提交者一起处理漏洞的解决方案直
到公开。我们喜欢尽快地完全公开漏洞。当一个漏洞或者修复还没有被完全地理
解,解决方案没有通过测试或者供应商协调,可以合理地延迟公开。然而,我们
期望这些延迟尽可能的短些,是可数的几天,而不是几个星期或者几个月。公开
日期是通过安全团队和漏洞提供者以及供应商洽谈后的结果。公开时间表是从很
短(特殊的,它已经被公众所知道)到几个星期。作为一个基本的默认政策,我
们所期望通知公众的日期是7天的安排。
3) 保密协议
Linux内核安全团队不是一个正式的团体因此不能加入任何的保密协议。

View File

@ -140,7 +140,7 @@ is an arbitrary string allowed in a filesystem, e.g.::
Each function provides its specific set of attributes, with either read-only
or read-write access. Where applicable they need to be written to as
appropriate.
Please refer to Documentation/ABI/*/configfs-usb-gadget* for more information.
Please refer to Documentation/ABI/testing/configfs-usb-gadget for more information.
4. Associating the functions with their configurations
------------------------------------------------------

View File

@ -1,4 +1,4 @@
================
================
mtouchusb driver
================

View File

@ -1,4 +1,4 @@
==========
==========
USB serial
==========

View File

@ -22,7 +22,7 @@ to SEV::
[ecx]:
Bits[31:0] Number of encrypted guests supported simultaneously
If support for SEV is present, MSR 0xc001_0010 (MSR_K8_SYSCFG) and MSR 0xc001_0015
If support for SEV is present, MSR 0xc001_0010 (MSR_AMD64_SYSCFG) and MSR 0xc001_0015
(MSR_K7_HWCR) can be used to determine if it can be enabled::
0xc001_0010:

View File

@ -4803,7 +4803,7 @@ KVM_PV_VM_VERIFY
4.126 KVM_X86_SET_MSR_FILTER
----------------------------
:Capability: KVM_X86_SET_MSR_FILTER
:Capability: KVM_CAP_X86_MSR_FILTER
:Architectures: x86
:Type: vm ioctl
:Parameters: struct kvm_msr_filter
@ -6715,7 +6715,7 @@ accesses that would usually trigger a #GP by KVM into the guest will
instead get bounced to user space through the KVM_EXIT_X86_RDMSR and
KVM_EXIT_X86_WRMSR exit notifications.
8.27 KVM_X86_SET_MSR_FILTER
8.27 KVM_CAP_X86_MSR_FILTER
---------------------------
:Architectures: x86

View File

@ -53,7 +53,7 @@ CPUID function 0x8000001f reports information related to SME::
system physical addresses, not guest physical
addresses)
If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be used to
If support for SME is present, MSR 0xc00100010 (MSR_AMD64_SYSCFG) can be used to
determine if SME is enabled and/or to enable memory encryption::
0xc0010010:
@ -79,7 +79,7 @@ The state of SME in the Linux kernel can be documented as follows:
The CPU supports SME (determined through CPUID instruction).
- Enabled:
Supported and bit 23 of MSR_K8_SYSCFG is set.
Supported and bit 23 of MSR_AMD64_SYSCFG is set.
- Active:
Supported, Enabled and the Linux kernel is actively applying
@ -89,7 +89,7 @@ The state of SME in the Linux kernel can be documented as follows:
SME can also be enabled and activated in the BIOS. If SME is enabled and
activated in the BIOS, then all memory accesses will be encrypted and it will
not be necessary to activate the Linux memory encryption support. If the BIOS
merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can activate
merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG), then Linux can activate
memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
by supplying mem_encrypt=on on the kernel command line. However, if BIOS does
not enable SME, then Linux will not be able to activate memory encryption, even

View File

@ -1578,7 +1578,7 @@ F: drivers/clk/sunxi/
ARM/Allwinner sunXi SoC support
M: Maxime Ripard <mripard@kernel.org>
M: Chen-Yu Tsai <wens@csie.org>
R: Jernej Skrabec <jernej.skrabec@siol.net>
R: Jernej Skrabec <jernej.skrabec@gmail.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux.git
@ -5089,7 +5089,7 @@ S: Maintained
F: drivers/net/fddi/defza.*
DEINTERLACE DRIVERS FOR ALLWINNER H3
M: Jernej Skrabec <jernej.skrabec@siol.net>
M: Jernej Skrabec <jernej.skrabec@gmail.com>
L: linux-media@vger.kernel.org
S: Maintained
T: git git://linuxtv.org/media_tree.git
@ -5237,7 +5237,7 @@ DEVICE DIRECT ACCESS (DAX)
M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
F: drivers/dax/
@ -5632,14 +5632,14 @@ F: include/linux/power/smartreflex.h
DRM DRIVER FOR ALLWINNER DE2 AND DE3 ENGINE
M: Maxime Ripard <mripard@kernel.org>
M: Chen-Yu Tsai <wens@csie.org>
R: Jernej Skrabec <jernej.skrabec@siol.net>
R: Jernej Skrabec <jernej.skrabec@gmail.com>
L: dri-devel@lists.freedesktop.org
S: Supported
T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/sun4i/sun8i*
DRM DRIVER FOR ARM PL111 CLCD
M: Eric Anholt <eric@anholt.net>
M: Emma Anholt <emma@anholt.net>
S: Supported
T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/pl111/
@ -5719,7 +5719,7 @@ T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/tiny/gm12u320.c
DRM DRIVER FOR HX8357D PANELS
M: Eric Anholt <eric@anholt.net>
M: Emma Anholt <emma@anholt.net>
S: Maintained
T: git git://anongit.freedesktop.org/drm/drm-misc
F: Documentation/devicetree/bindings/display/himax,hx8357d.txt
@ -6023,7 +6023,7 @@ M: Neil Armstrong <narmstrong@baylibre.com>
M: Robert Foss <robert.foss@linaro.org>
R: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>
R: Jonas Karlman <jonas@kwiboo.se>
R: Jernej Skrabec <jernej.skrabec@siol.net>
R: Jernej Skrabec <jernej.skrabec@gmail.com>
S: Maintained
T: git git://anongit.freedesktop.org/drm/drm-misc
F: drivers/gpu/drm/bridge/
@ -6177,7 +6177,7 @@ F: Documentation/devicetree/bindings/display/ti/
F: drivers/gpu/drm/omapdrm/
DRM DRIVERS FOR V3D
M: Eric Anholt <eric@anholt.net>
M: Emma Anholt <emma@anholt.net>
S: Supported
T: git git://anongit.freedesktop.org/drm/drm-misc
F: Documentation/devicetree/bindings/gpu/brcm,bcm-v3d.yaml
@ -6185,7 +6185,7 @@ F: drivers/gpu/drm/v3d/
F: include/uapi/drm/v3d_drm.h
DRM DRIVERS FOR VC4
M: Eric Anholt <eric@anholt.net>
M: Emma Anholt <emma@anholt.net>
M: Maxime Ripard <mripard@kernel.org>
S: Supported
T: git git://github.com/anholt/linux
@ -7006,7 +7006,7 @@ M: Dan Williams <dan.j.williams@intel.com>
R: Matthew Wilcox <willy@infradead.org>
R: Jan Kara <jack@suse.cz>
L: linux-fsdevel@vger.kernel.org
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
F: fs/dax.c
F: include/linux/dax.h
@ -10378,7 +10378,7 @@ LIBNVDIMM BLK: MMIO-APERTURE DRIVER
M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
P: Documentation/nvdimm/maintainer-entry-profile.rst
@ -10389,7 +10389,7 @@ LIBNVDIMM BTT: BLOCK TRANSLATION TABLE
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dan Williams <dan.j.williams@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
P: Documentation/nvdimm/maintainer-entry-profile.rst
@ -10399,7 +10399,7 @@ LIBNVDIMM PMEM: PERSISTENT MEMORY DRIVER
M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
P: Documentation/nvdimm/maintainer-entry-profile.rst
@ -10407,7 +10407,7 @@ F: drivers/nvdimm/pmem*
LIBNVDIMM: DEVICETREE BINDINGS
M: Oliver O'Halloran <oohall@gmail.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
F: Documentation/devicetree/bindings/pmem/pmem-region.txt
@ -10418,7 +10418,7 @@ M: Dan Williams <dan.j.williams@intel.com>
M: Vishal Verma <vishal.l.verma@intel.com>
M: Dave Jiang <dave.jiang@intel.com>
M: Ira Weiny <ira.weiny@intel.com>
L: linux-nvdimm@lists.01.org
L: nvdimm@lists.linux.dev
S: Supported
Q: https://patchwork.kernel.org/project/linux-nvdimm/list/
P: Documentation/nvdimm/maintainer-entry-profile.rst
@ -15815,7 +15815,7 @@ F: include/uapi/linux/rose.h
F: net/rose/
ROTATION DRIVER FOR ALLWINNER A83T
M: Jernej Skrabec <jernej.skrabec@siol.net>
M: Jernej Skrabec <jernej.skrabec@gmail.com>
L: linux-media@vger.kernel.org
S: Maintained
T: git git://linuxtv.org/media_tree.git
@ -17304,6 +17304,12 @@ L: linux-i2c@vger.kernel.org
S: Maintained
F: drivers/i2c/busses/i2c-stm32*
ST STM32 SPI DRIVER
M: Alain Volmat <alain.volmat@foss.st.com>
L: linux-spi@vger.kernel.org
S: Maintained
F: drivers/spi/spi-stm32.c
ST STPDDC60 DRIVER
M: Daniel Nilsson <daniel.nilsson@flex.com>
L: linux-hwmon@vger.kernel.org

View File

@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 13
SUBLEVEL = 0
EXTRAVERSION = -rc1
EXTRAVERSION = -rc2
NAME = Frozen Wasteland
# *DOCUMENTATION*

View File

@ -31,7 +31,7 @@ endif
ifdef CONFIG_ARC_CURR_IN_REG
# For a global register defintion, make sure it gets passed to every file
# For a global register definition, make sure it gets passed to every file
# We had a customer reported bug where some code built in kernel was NOT using
# any kernel headers, and missing the r25 global register
# Can't do unconditionally because of recursive include issues

View File

@ -116,7 +116,7 @@ static inline unsigned long __xchg(unsigned long val, volatile void *ptr,
*
* Technically the lock is also needed for UP (boils down to irq save/restore)
* but we can cheat a bit since cmpxchg() atomic_ops_lock() would cause irqs to
* be disabled thus can't possibly be interrpted/preempted/clobbered by xchg()
* be disabled thus can't possibly be interrupted/preempted/clobbered by xchg()
* Other way around, xchg is one instruction anyways, so can't be interrupted
* as such
*/
@ -143,7 +143,7 @@ static inline unsigned long __xchg(unsigned long val, volatile void *ptr,
/*
* "atomic" variant of xchg()
* REQ: It needs to follow the same serialization rules as other atomic_xxx()
* Since xchg() doesn't always do that, it would seem that following defintion
* Since xchg() doesn't always do that, it would seem that following definition
* is incorrect. But here's the rationale:
* SMP : Even xchg() takes the atomic_ops_lock, so OK.
* LLSC: atomic_ops_lock are not relevant at all (even if SMP, since LLSC

View File

@ -7,6 +7,18 @@
#include <uapi/asm/page.h>
#ifdef CONFIG_ARC_HAS_PAE40
#define MAX_POSSIBLE_PHYSMEM_BITS 40
#define PAGE_MASK_PHYS (0xff00000000ull | PAGE_MASK)
#else /* CONFIG_ARC_HAS_PAE40 */
#define MAX_POSSIBLE_PHYSMEM_BITS 32
#define PAGE_MASK_PHYS PAGE_MASK
#endif /* CONFIG_ARC_HAS_PAE40 */
#ifndef __ASSEMBLY__
#define clear_page(paddr) memset((paddr), 0, PAGE_SIZE)

View File

@ -107,8 +107,8 @@
#define ___DEF (_PAGE_PRESENT | _PAGE_CACHEABLE)
/* Set of bits not changed in pte_modify */
#define _PAGE_CHG_MASK (PAGE_MASK | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_SPECIAL)
#define _PAGE_CHG_MASK (PAGE_MASK_PHYS | _PAGE_ACCESSED | _PAGE_DIRTY | \
_PAGE_SPECIAL)
/* More Abbrevaited helpers */
#define PAGE_U_NONE __pgprot(___DEF)
#define PAGE_U_R __pgprot(___DEF | _PAGE_READ)
@ -132,13 +132,7 @@
#define PTE_BITS_IN_PD0 (_PAGE_GLOBAL | _PAGE_PRESENT | _PAGE_HW_SZ)
#define PTE_BITS_RWX (_PAGE_EXECUTE | _PAGE_WRITE | _PAGE_READ)
#ifdef CONFIG_ARC_HAS_PAE40
#define PTE_BITS_NON_RWX_IN_PD1 (0xff00000000 | PAGE_MASK | _PAGE_CACHEABLE)
#define MAX_POSSIBLE_PHYSMEM_BITS 40
#else
#define PTE_BITS_NON_RWX_IN_PD1 (PAGE_MASK | _PAGE_CACHEABLE)
#define MAX_POSSIBLE_PHYSMEM_BITS 32
#endif
#define PTE_BITS_NON_RWX_IN_PD1 (PAGE_MASK_PHYS | _PAGE_CACHEABLE)
/**************************************************************************
* Mapping of vm_flags (Generic VM) to PTE flags (arch specific)

View File

@ -33,5 +33,4 @@
#define PAGE_MASK (~(PAGE_SIZE-1))
#endif /* _UAPI__ASM_ARC_PAGE_H */

View File

@ -177,7 +177,7 @@ tracesys:
; Do the Sys Call as we normally would.
; Validate the Sys Call number
cmp r8, NR_syscalls
cmp r8, NR_syscalls - 1
mov.hi r0, -ENOSYS
bhi tracesys_exit
@ -255,7 +255,7 @@ ENTRY(EV_Trap)
;============ Normal syscall case
; syscall num shd not exceed the total system calls avail
cmp r8, NR_syscalls
cmp r8, NR_syscalls - 1
mov.hi r0, -ENOSYS
bhi .Lret_from_system_call

View File

@ -140,6 +140,7 @@ int kgdb_arch_handle_exception(int e_vector, int signo, int err_code,
ptr = &remcomInBuffer[1];
if (kgdb_hex2long(&ptr, &addr))
regs->ret = addr;
fallthrough;
case 'D':
case 'k':

View File

@ -50,14 +50,14 @@ SYSCALL_DEFINE3(arc_usr_cmpxchg, int *, uaddr, int, expected, int, new)
int ret;
/*
* This is only for old cores lacking LLOCK/SCOND, which by defintion
* This is only for old cores lacking LLOCK/SCOND, which by definition
* can't possibly be SMP. Thus doesn't need to be SMP safe.
* And this also helps reduce the overhead for serializing in
* the UP case
*/
WARN_ON_ONCE(IS_ENABLED(CONFIG_SMP));
/* Z indicates to userspace if operation succeded */
/* Z indicates to userspace if operation succeeded */
regs->status32 &= ~STATUS_Z_MASK;
ret = access_ok(uaddr, sizeof(*uaddr));
@ -107,7 +107,7 @@ fail:
void arch_cpu_idle(void)
{
/* Re-enable interrupts <= default irq priority before commiting SLEEP */
/* Re-enable interrupts <= default irq priority before committing SLEEP */
const unsigned int arg = 0x10 | ARCV2_IRQ_DEF_PRIO;
__asm__ __volatile__(
@ -120,7 +120,7 @@ void arch_cpu_idle(void)
void arch_cpu_idle(void)
{
/* sleep, but enable both set E1/E2 (levels of interrutps) before committing */
/* sleep, but enable both set E1/E2 (levels of interrupts) before committing */
__asm__ __volatile__("sleep 0x3 \n");
}

View File

@ -259,7 +259,7 @@ setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
regs->r2 = (unsigned long)&sf->uc;
/*
* small optim to avoid unconditonally calling do_sigaltstack
* small optim to avoid unconditionally calling do_sigaltstack
* in sigreturn path, now that we only have rt_sigreturn
*/
magic = MAGIC_SIGALTSTK;
@ -391,7 +391,7 @@ void do_signal(struct pt_regs *regs)
void do_notify_resume(struct pt_regs *regs)
{
/*
* ASM glue gaurantees that this is only called when returning to
* ASM glue guarantees that this is only called when returning to
* user mode
*/
if (test_thread_flag(TIF_NOTIFY_RESUME))

View File

@ -157,7 +157,16 @@ void __init setup_arch_memory(void)
min_high_pfn = PFN_DOWN(high_mem_start);
max_high_pfn = PFN_DOWN(high_mem_start + high_mem_sz);
max_zone_pfn[ZONE_HIGHMEM] = min_low_pfn;
/*
* max_high_pfn should be ok here for both HIGHMEM and HIGHMEM+PAE.
* For HIGHMEM without PAE max_high_pfn should be less than
* min_low_pfn to guarantee that these two regions don't overlap.
* For PAE case highmem is greater than lowmem, so it is natural
* to use max_high_pfn.
*
* In both cases, holes should be handled by pfn_valid().
*/
max_zone_pfn[ZONE_HIGHMEM] = max_high_pfn;
high_memory = (void *)(min_high_pfn << PAGE_SHIFT);

View File

@ -53,9 +53,10 @@ EXPORT_SYMBOL(ioremap);
void __iomem *ioremap_prot(phys_addr_t paddr, unsigned long size,
unsigned long flags)
{
unsigned int off;
unsigned long vaddr;
struct vm_struct *area;
phys_addr_t off, end;
phys_addr_t end;
pgprot_t prot = __pgprot(flags);
/* Don't allow wraparound, zero size */
@ -72,7 +73,7 @@ void __iomem *ioremap_prot(phys_addr_t paddr, unsigned long size,
/* Mappings have to be page-aligned */
off = paddr & ~PAGE_MASK;
paddr &= PAGE_MASK;
paddr &= PAGE_MASK_PHYS;
size = PAGE_ALIGN(end + 1) - paddr;
/*

View File

@ -576,7 +576,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddr_unaligned,
pte_t *ptep)
{
unsigned long vaddr = vaddr_unaligned & PAGE_MASK;
phys_addr_t paddr = pte_val(*ptep) & PAGE_MASK;
phys_addr_t paddr = pte_val(*ptep) & PAGE_MASK_PHYS;
struct page *page = pfn_to_page(pte_pfn(*ptep));
create_tlb(vma, vaddr, ptep);

View File

@ -135,24 +135,18 @@ void xen_destroy_contiguous_region(phys_addr_t pstart, unsigned int order)
return;
}
int xen_swiotlb_detect(void)
{
if (!xen_domain())
return 0;
if (xen_feature(XENFEAT_direct_mapped))
return 1;
/* legacy case */
if (!xen_feature(XENFEAT_not_direct_mapped) && xen_initial_domain())
return 1;
return 0;
}
static int __init xen_mm_init(void)
{
struct gnttab_cache_flush cflush;
int rc;
if (!xen_swiotlb_detect())
return 0;
xen_swiotlb_init();
rc = xen_swiotlb_init();
/* we can work with the default swiotlb */
if (rc < 0 && rc != -EEXIST)
return rc;
cflush.op = 0;
cflush.a.dev_bus_addr = 0;

View File

@ -175,6 +175,9 @@ vdso_install:
$(if $(CONFIG_COMPAT_VDSO), \
$(Q)$(MAKE) $(build)=arch/arm64/kernel/vdso32 $@)
archprepare:
$(Q)$(MAKE) $(build)=arch/arm64/tools kapi
# We use MRPROPER_FILES and CLEAN_FILES now
archclean:
$(Q)$(MAKE) $(clean)=$(boot)

View File

@ -5,3 +5,5 @@ generic-y += qrwlock.h
generic-y += qspinlock.h
generic-y += set_memory.h
generic-y += user.h
generated-y += cpucaps.h

View File

@ -1,74 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* arch/arm64/include/asm/cpucaps.h
*
* Copyright (C) 2016 ARM Ltd.
*/
#ifndef __ASM_CPUCAPS_H
#define __ASM_CPUCAPS_H
#define ARM64_WORKAROUND_CLEAN_CACHE 0
#define ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE 1
#define ARM64_WORKAROUND_845719 2
#define ARM64_HAS_SYSREG_GIC_CPUIF 3
#define ARM64_HAS_PAN 4
#define ARM64_HAS_LSE_ATOMICS 5
#define ARM64_WORKAROUND_CAVIUM_23154 6
#define ARM64_WORKAROUND_834220 7
#define ARM64_HAS_NO_HW_PREFETCH 8
#define ARM64_HAS_VIRT_HOST_EXTN 11
#define ARM64_WORKAROUND_CAVIUM_27456 12
#define ARM64_HAS_32BIT_EL0 13
#define ARM64_SPECTRE_V3A 14
#define ARM64_HAS_CNP 15
#define ARM64_HAS_NO_FPSIMD 16
#define ARM64_WORKAROUND_REPEAT_TLBI 17
#define ARM64_WORKAROUND_QCOM_FALKOR_E1003 18
#define ARM64_WORKAROUND_858921 19
#define ARM64_WORKAROUND_CAVIUM_30115 20
#define ARM64_HAS_DCPOP 21
#define ARM64_SVE 22
#define ARM64_UNMAP_KERNEL_AT_EL0 23
#define ARM64_SPECTRE_V2 24
#define ARM64_HAS_RAS_EXTN 25
#define ARM64_WORKAROUND_843419 26
#define ARM64_HAS_CACHE_IDC 27
#define ARM64_HAS_CACHE_DIC 28
#define ARM64_HW_DBM 29
#define ARM64_SPECTRE_V4 30
#define ARM64_MISMATCHED_CACHE_TYPE 31
#define ARM64_HAS_STAGE2_FWB 32
#define ARM64_HAS_CRC32 33
#define ARM64_SSBS 34
#define ARM64_WORKAROUND_1418040 35
#define ARM64_HAS_SB 36
#define ARM64_WORKAROUND_SPECULATIVE_AT 37
#define ARM64_HAS_ADDRESS_AUTH_ARCH 38
#define ARM64_HAS_ADDRESS_AUTH_IMP_DEF 39
#define ARM64_HAS_GENERIC_AUTH_ARCH 40
#define ARM64_HAS_GENERIC_AUTH_IMP_DEF 41
#define ARM64_HAS_IRQ_PRIO_MASKING 42
#define ARM64_HAS_DCPODP 43
#define ARM64_WORKAROUND_1463225 44
#define ARM64_WORKAROUND_CAVIUM_TX2_219_TVM 45
#define ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM 46
#define ARM64_WORKAROUND_1542419 47
#define ARM64_HAS_E0PD 48
#define ARM64_HAS_RNG 49
#define ARM64_HAS_AMU_EXTN 50
#define ARM64_HAS_ADDRESS_AUTH 51
#define ARM64_HAS_GENERIC_AUTH 52
#define ARM64_HAS_32BIT_EL1 53
#define ARM64_BTI 54
#define ARM64_HAS_ARMv8_4_TTL 55
#define ARM64_HAS_TLB_RANGE 56
#define ARM64_MTE 57
#define ARM64_WORKAROUND_1508412 58
#define ARM64_HAS_LDAPR 59
#define ARM64_KVM_PROTECTED_MODE 60
#define ARM64_WORKAROUND_NVIDIA_CARMEL_CNP 61
#define ARM64_HAS_EPAN 62
#define ARM64_NCAPS 63
#endif /* __ASM_CPUCAPS_H */

View File

@ -55,8 +55,10 @@ void __sync_icache_dcache(pte_t pte)
{
struct page *page = pte_page(pte);
if (!test_and_set_bit(PG_dcache_clean, &page->flags))
if (!test_bit(PG_dcache_clean, &page->flags)) {
sync_icache_aliases(page_address(page), page_size(page));
set_bit(PG_dcache_clean, &page->flags);
}
}
EXPORT_SYMBOL_GPL(__sync_icache_dcache);

View File

@ -43,6 +43,7 @@
#include <linux/sizes.h>
#include <asm/tlb.h>
#include <asm/alternative.h>
#include <asm/xen/swiotlb-xen.h>
/*
* We need to be able to catch inadvertent references to memstart_addr
@ -482,7 +483,7 @@ void __init mem_init(void)
if (swiotlb_force == SWIOTLB_FORCE ||
max_pfn > PFN_DOWN(arm64_dma_phys_limit))
swiotlb_init(1);
else
else if (!xen_swiotlb_detect())
swiotlb_force = SWIOTLB_NO_FORCE;
set_max_mapnr(max_pfn - PHYS_PFN_OFFSET);

View File

@ -447,6 +447,18 @@ SYM_FUNC_START(__cpu_setup)
mov x10, #(SYS_GCR_EL1_RRND | SYS_GCR_EL1_EXCL_MASK)
msr_s SYS_GCR_EL1, x10
/*
* If GCR_EL1.RRND=1 is implemented the same way as RRND=0, then
* RGSR_EL1.SEED must be non-zero for IRG to produce
* pseudorandom numbers. As RGSR_EL1 is UNKNOWN out of reset, we
* must initialize it.
*/
mrs x10, CNTVCT_EL0
ands x10, x10, #SYS_RGSR_EL1_SEED_MASK
csinc x10, x10, xzr, ne
lsl x10, x10, #SYS_RGSR_EL1_SEED_SHIFT
msr_s SYS_RGSR_EL1, x10
/* clear any pending tag check faults in TFSR*_EL1 */
msr_s SYS_TFSR_EL1, xzr
msr_s SYS_TFSRE0_EL1, xzr

22
arch/arm64/tools/Makefile Normal file
View File

@ -0,0 +1,22 @@
# SPDX-License-Identifier: GPL-2.0
gen := arch/$(ARCH)/include/generated
kapi := $(gen)/asm
kapi-hdrs-y := $(kapi)/cpucaps.h
targets += $(addprefix ../../../,$(gen-y) $(kapi-hdrs-y))
PHONY += kapi
kapi: $(kapi-hdrs-y) $(gen-y)
# Create output directory if not already present
_dummy := $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
quiet_cmd_gen_cpucaps = GEN $@
cmd_gen_cpucaps = mkdir -p $(dir $@) && \
$(AWK) -f $(filter-out $(PHONY),$^) > $@
$(kapi)/cpucaps.h: $(src)/gen-cpucaps.awk $(src)/cpucaps FORCE
$(call if_changed,gen_cpucaps)

65
arch/arm64/tools/cpucaps Normal file
View File

@ -0,0 +1,65 @@
# SPDX-License-Identifier: GPL-2.0
#
# Internal CPU capabilities constants, keep this list sorted
BTI
HAS_32BIT_EL0
HAS_32BIT_EL1
HAS_ADDRESS_AUTH
HAS_ADDRESS_AUTH_ARCH
HAS_ADDRESS_AUTH_IMP_DEF
HAS_AMU_EXTN
HAS_ARMv8_4_TTL
HAS_CACHE_DIC
HAS_CACHE_IDC
HAS_CNP
HAS_CRC32
HAS_DCPODP
HAS_DCPOP
HAS_E0PD
HAS_EPAN
HAS_GENERIC_AUTH
HAS_GENERIC_AUTH_ARCH
HAS_GENERIC_AUTH_IMP_DEF
HAS_IRQ_PRIO_MASKING
HAS_LDAPR
HAS_LSE_ATOMICS
HAS_NO_FPSIMD
HAS_NO_HW_PREFETCH
HAS_PAN
HAS_RAS_EXTN
HAS_RNG
HAS_SB
HAS_STAGE2_FWB
HAS_SYSREG_GIC_CPUIF
HAS_TLB_RANGE
HAS_VIRT_HOST_EXTN
HW_DBM
KVM_PROTECTED_MODE
MISMATCHED_CACHE_TYPE
MTE
SPECTRE_V2
SPECTRE_V3A
SPECTRE_V4
SSBS
SVE
UNMAP_KERNEL_AT_EL0
WORKAROUND_834220
WORKAROUND_843419
WORKAROUND_845719
WORKAROUND_858921
WORKAROUND_1418040
WORKAROUND_1463225
WORKAROUND_1508412
WORKAROUND_1542419
WORKAROUND_CAVIUM_23154
WORKAROUND_CAVIUM_27456
WORKAROUND_CAVIUM_30115
WORKAROUND_CAVIUM_TX2_219_PRFM
WORKAROUND_CAVIUM_TX2_219_TVM
WORKAROUND_CLEAN_CACHE
WORKAROUND_DEVICE_LOAD_ACQUIRE
WORKAROUND_NVIDIA_CARMEL_CNP
WORKAROUND_QCOM_FALKOR_E1003
WORKAROUND_REPEAT_TLBI
WORKAROUND_SPECULATIVE_AT

View File

@ -0,0 +1,40 @@
#!/bin/awk -f
# SPDX-License-Identifier: GPL-2.0
# gen-cpucaps.awk: arm64 cpucaps header generator
#
# Usage: awk -f gen-cpucaps.awk cpucaps.txt
# Log an error and terminate
function fatal(msg) {
print "Error at line " NR ": " msg > "/dev/stderr"
exit 1
}
# skip blank lines and comment lines
/^$/ { next }
/^#/ { next }
BEGIN {
print "#ifndef __ASM_CPUCAPS_H"
print "#define __ASM_CPUCAPS_H"
print ""
print "/* Generated file - do not edit */"
cap_num = 0
print ""
}
/^[vA-Z0-9_]+$/ {
printf("#define ARM64_%-30s\t%d\n", $0, cap_num++)
next
}
END {
printf("#define ARM64_NCAPS\t\t\t\t%d\n", cap_num)
print ""
print "#endif /* __ASM_CPUCAPS_H */"
}
# Any lines not handled by previous rules are unexpected
{
fatal("unhandled statement")
}

View File

@ -448,6 +448,9 @@
*/
long plpar_hcall_norets(unsigned long opcode, ...);
/* Variant which does not do hcall tracing */
long plpar_hcall_norets_notrace(unsigned long opcode, ...);
/**
* plpar_hcall: - Make a pseries hypervisor call
* @opcode: The hypervisor call to make.

View File

@ -153,8 +153,6 @@ static inline void interrupt_enter_prepare(struct pt_regs *regs, struct interrup
*/
static inline void interrupt_exit_prepare(struct pt_regs *regs, struct interrupt_state *state)
{
if (user_mode(regs))
kuep_unlock();
}
static inline void interrupt_async_enter_prepare(struct pt_regs *regs, struct interrupt_state *state)
@ -222,6 +220,13 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
local_paca->irq_soft_mask = IRQS_ALL_DISABLED;
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !(regs->msr & MSR_PR) &&
regs->nip < (unsigned long)__end_interrupts) {
// Kernel code running below __end_interrupts is
// implicitly soft-masked.
regs->softe = IRQS_ALL_DISABLED;
}
/* Don't do any per-CPU operations until interrupt state is fixed */
if (nmi_disables_ftrace(regs)) {

View File

@ -28,19 +28,35 @@ static inline u32 yield_count_of(int cpu)
return be32_to_cpu(yield_count);
}
/*
* Spinlock code confers and prods, so don't trace the hcalls because the
* tracing code takes spinlocks which can cause recursion deadlocks.
*
* These calls are made while the lock is not held: the lock slowpath yields if
* it can not acquire the lock, and unlock slow path might prod if a waiter has
* yielded). So this may not be a problem for simple spin locks because the
* tracing does not technically recurse on the lock, but we avoid it anyway.
*
* However the queued spin lock contended path is more strictly ordered: the
* H_CONFER hcall is made after the task has queued itself on the lock, so then
* recursing on that lock will cause the task to then queue up again behind the
* first instance (or worse: queued spinlocks use tricks that assume a context
* never waits on more than one spinlock, so such recursion may cause random
* corruption in the lock code).
*/
static inline void yield_to_preempted(int cpu, u32 yield_count)
{
plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
}
static inline void prod_cpu(int cpu)
{
plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu));
plpar_hcall_norets_notrace(H_PROD, get_hard_smp_processor_id(cpu));
}
static inline void yield_to_any(void)
{
plpar_hcall_norets(H_CONFER, -1, 0);
plpar_hcall_norets_notrace(H_CONFER, -1, 0);
}
#else
static inline bool is_shared_processor(void)

View File

@ -28,7 +28,11 @@ static inline void set_cede_latency_hint(u8 latency_hint)
static inline long cede_processor(void)
{
return plpar_hcall_norets(H_CEDE);
/*
* We cannot call tracepoints inside RCU idle regions which
* means we must not trace H_CEDE.
*/
return plpar_hcall_norets_notrace(H_CEDE);
}
static inline long extended_cede_processor(unsigned long latency_hint)

View File

@ -157,7 +157,7 @@ do { \
"2: lwz%X1 %L0, %L1\n" \
EX_TABLE(1b, %l2) \
EX_TABLE(2b, %l2) \
: "=r" (x) \
: "=&r" (x) \
: "m" (*addr) \
: \
: label)

View File

@ -340,6 +340,12 @@ ret_from_mc_except:
andi. r10,r10,IRQS_DISABLED; /* yes -> go out of line */ \
bne masked_interrupt_book3e_##n
/*
* Additional regs must be re-loaded from paca before EXCEPTION_COMMON* is
* called, because that does SAVE_NVGPRS which must see the original register
* values, otherwise the scratch values might be restored when exiting the
* interrupt.
*/
#define PROLOG_ADDITION_2REGS_GEN(n) \
std r14,PACA_EXGEN+EX_R14(r13); \
std r15,PACA_EXGEN+EX_R15(r13)
@ -535,6 +541,10 @@ __end_interrupts:
PROLOG_ADDITION_2REGS)
mfspr r14,SPRN_DEAR
mfspr r15,SPRN_ESR
std r14,_DAR(r1)
std r15,_DSISR(r1)
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
EXCEPTION_COMMON(0x300)
b storage_fault_common
@ -544,6 +554,10 @@ __end_interrupts:
PROLOG_ADDITION_2REGS)
li r15,0
mr r14,r10
std r14,_DAR(r1)
std r15,_DSISR(r1)
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
EXCEPTION_COMMON(0x400)
b storage_fault_common
@ -557,6 +571,10 @@ __end_interrupts:
PROLOG_ADDITION_2REGS)
mfspr r14,SPRN_DEAR
mfspr r15,SPRN_ESR
std r14,_DAR(r1)
std r15,_DSISR(r1)
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
EXCEPTION_COMMON(0x600)
b alignment_more /* no room, go out of line */
@ -565,10 +583,10 @@ __end_interrupts:
NORMAL_EXCEPTION_PROLOG(0x700, BOOKE_INTERRUPT_PROGRAM,
PROLOG_ADDITION_1REG)
mfspr r14,SPRN_ESR
EXCEPTION_COMMON(0x700)
std r14,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXGEN+EX_R14(r13)
EXCEPTION_COMMON(0x700)
addi r3,r1,STACK_FRAME_OVERHEAD
bl program_check_exception
REST_NVGPRS(r1)
b interrupt_return
@ -725,11 +743,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
* normal exception
*/
mfspr r14,SPRN_DBSR
EXCEPTION_COMMON_CRIT(0xd00)
std r14,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXCRIT+EX_R14(r13)
ld r15,PACA_EXCRIT+EX_R15(r13)
EXCEPTION_COMMON_CRIT(0xd00)
addi r3,r1,STACK_FRAME_OVERHEAD
bl DebugException
REST_NVGPRS(r1)
b interrupt_return
@ -796,11 +814,11 @@ kernel_dbg_exc:
* normal exception
*/
mfspr r14,SPRN_DBSR
EXCEPTION_COMMON_DBG(0xd08)
std r14,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXDBG+EX_R14(r13)
ld r15,PACA_EXDBG+EX_R15(r13)
EXCEPTION_COMMON_DBG(0xd08)
addi r3,r1,STACK_FRAME_OVERHEAD
bl DebugException
REST_NVGPRS(r1)
b interrupt_return
@ -931,11 +949,7 @@ masked_interrupt_book3e_0x2c0:
* original values stashed away in the PACA
*/
storage_fault_common:
std r14,_DAR(r1)
std r15,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
bl do_page_fault
b interrupt_return
@ -944,11 +958,7 @@ storage_fault_common:
* continues here.
*/
alignment_more:
std r14,_DAR(r1)
std r15,_DSISR(r1)
addi r3,r1,STACK_FRAME_OVERHEAD
ld r14,PACA_EXGEN+EX_R14(r13)
ld r15,PACA_EXGEN+EX_R15(r13)
bl alignment_exception
REST_NVGPRS(r1)
b interrupt_return

View File

@ -34,9 +34,6 @@ notrace long system_call_exception(long r3, long r4, long r5,
syscall_fn f;
kuep_lock();
#ifdef CONFIG_PPC32
kuap_save_and_lock(regs);
#endif
regs->orig_gpr3 = r3;
@ -427,6 +424,7 @@ again:
/* Restore user access locks last */
kuap_user_restore(regs);
kuep_unlock();
return ret;
}

View File

@ -356,13 +356,16 @@ static void __init setup_legacy_serial_console(int console)
static int __init ioremap_legacy_serial_console(void)
{
struct legacy_serial_info *info = &legacy_serial_infos[legacy_serial_console];
struct plat_serial8250_port *port = &legacy_serial_ports[legacy_serial_console];
struct plat_serial8250_port *port;
struct legacy_serial_info *info;
void __iomem *vaddr;
if (legacy_serial_console < 0)
return 0;
info = &legacy_serial_infos[legacy_serial_console];
port = &legacy_serial_ports[legacy_serial_console];
if (!info->early_addr)
return 0;

View File

@ -166,9 +166,9 @@ copy_ckfpr_from_user(struct task_struct *task, void __user *from)
}
#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
#else
#define unsafe_copy_fpr_to_user(to, task, label) do { } while (0)
#define unsafe_copy_fpr_to_user(to, task, label) do { if (0) goto label;} while (0)
#define unsafe_copy_fpr_from_user(task, from, label) do { } while (0)
#define unsafe_copy_fpr_from_user(task, from, label) do { if (0) goto label;} while (0)
static inline unsigned long
copy_fpr_to_user(void __user *to, struct task_struct *task)

View File

@ -840,7 +840,7 @@ bool kvm_unmap_gfn_range_hv(struct kvm *kvm, struct kvm_gfn_range *range)
kvm_unmap_radix(kvm, range->slot, gfn);
} else {
for (gfn = range->start; gfn < range->end; gfn++)
kvm_unmap_rmapp(kvm, range->slot, range->start);
kvm_unmap_rmapp(kvm, range->slot, gfn);
}
return false;

View File

@ -14,6 +14,7 @@
#include <linux/string.h>
#include <linux/init.h>
#include <linux/sched/mm.h>
#include <linux/stop_machine.h>
#include <asm/cputable.h>
#include <asm/code-patching.h>
#include <asm/page.h>
@ -149,17 +150,17 @@ static void do_stf_entry_barrier_fixups(enum stf_barrier_type types)
pr_devel("patching dest %lx\n", (unsigned long)dest);
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
if (types & STF_BARRIER_FALLBACK)
// See comment in do_entry_flush_fixups() RE order of patching
if (types & STF_BARRIER_FALLBACK) {
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_branch((struct ppc_inst *)(dest + 1),
(unsigned long)&stf_barrier_fallback,
BRANCH_SET_LINK);
else
patch_instruction((struct ppc_inst *)(dest + 1),
ppc_inst(instrs[1]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
(unsigned long)&stf_barrier_fallback, BRANCH_SET_LINK);
} else {
patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
}
}
printk(KERN_DEBUG "stf-barrier: patched %d entry locations (%s barrier)\n", i,
@ -227,11 +228,25 @@ static void do_stf_exit_barrier_fixups(enum stf_barrier_type types)
: "unknown");
}
static int __do_stf_barrier_fixups(void *data)
{
enum stf_barrier_type *types = data;
do_stf_entry_barrier_fixups(*types);
do_stf_exit_barrier_fixups(*types);
return 0;
}
void do_stf_barrier_fixups(enum stf_barrier_type types)
{
do_stf_entry_barrier_fixups(types);
do_stf_exit_barrier_fixups(types);
/*
* The call to the fallback entry flush, and the fallback/sync-ori exit
* flush can not be safely patched in/out while other CPUs are executing
* them. So call __do_stf_barrier_fixups() on one CPU while all other CPUs
* spin in the stop machine core with interrupts hard disabled.
*/
stop_machine(__do_stf_barrier_fixups, &types, NULL);
}
void do_uaccess_flush_fixups(enum l1d_flush_type types)
@ -284,8 +299,9 @@ void do_uaccess_flush_fixups(enum l1d_flush_type types)
: "unknown");
}
void do_entry_flush_fixups(enum l1d_flush_type types)
static int __do_entry_flush_fixups(void *data)
{
enum l1d_flush_type types = *(enum l1d_flush_type *)data;
unsigned int instrs[3], *dest;
long *start, *end;
int i;
@ -309,6 +325,31 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
if (types & L1D_FLUSH_MTTRIG)
instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
/*
* If we're patching in or out the fallback flush we need to be careful about the
* order in which we patch instructions. That's because it's possible we could
* take a page fault after patching one instruction, so the sequence of
* instructions must be safe even in a half patched state.
*
* To make that work, when patching in the fallback flush we patch in this order:
* - the mflr (dest)
* - the mtlr (dest + 2)
* - the branch (dest + 1)
*
* That ensures the sequence is safe to execute at any point. In contrast if we
* patch the mtlr last, it's possible we could return from the branch and not
* restore LR, leading to a crash later.
*
* When patching out the fallback flush (either with nops or another flush type),
* we patch in this order:
* - the branch (dest + 1)
* - the mtlr (dest + 2)
* - the mflr (dest)
*
* Note we are protected by stop_machine() from other CPUs executing the code in a
* semi-patched state.
*/
start = PTRRELOC(&__start___entry_flush_fixup);
end = PTRRELOC(&__stop___entry_flush_fixup);
for (i = 0; start < end; start++, i++) {
@ -316,15 +357,16 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
pr_devel("patching dest %lx\n", (unsigned long)dest);
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
if (types == L1D_FLUSH_FALLBACK)
patch_branch((struct ppc_inst *)(dest + 1), (unsigned long)&entry_flush_fallback,
BRANCH_SET_LINK);
else
if (types == L1D_FLUSH_FALLBACK) {
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_branch((struct ppc_inst *)(dest + 1),
(unsigned long)&entry_flush_fallback, BRANCH_SET_LINK);
} else {
patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
}
}
start = PTRRELOC(&__start___scv_entry_flush_fixup);
@ -334,15 +376,16 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
pr_devel("patching dest %lx\n", (unsigned long)dest);
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
if (types == L1D_FLUSH_FALLBACK)
patch_branch((struct ppc_inst *)(dest + 1), (unsigned long)&scv_entry_flush_fallback,
BRANCH_SET_LINK);
else
if (types == L1D_FLUSH_FALLBACK) {
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_branch((struct ppc_inst *)(dest + 1),
(unsigned long)&scv_entry_flush_fallback, BRANCH_SET_LINK);
} else {
patch_instruction((struct ppc_inst *)(dest + 1), ppc_inst(instrs[1]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)(dest + 2), ppc_inst(instrs[2]));
patch_instruction((struct ppc_inst *)dest, ppc_inst(instrs[0]));
}
}
@ -354,6 +397,19 @@ void do_entry_flush_fixups(enum l1d_flush_type types)
: "ori type" :
(types & L1D_FLUSH_MTTRIG) ? "mttrig type"
: "unknown");
return 0;
}
void do_entry_flush_fixups(enum l1d_flush_type types)
{
/*
* The call to the fallback flush can not be safely patched in/out while
* other CPUs are executing it. So call __do_entry_flush_fixups() on one
* CPU while all other CPUs spin in the stop machine core with interrupts
* hard disabled.
*/
stop_machine(__do_entry_flush_fixups, &types, NULL);
}
void do_rfi_flush_fixups(enum l1d_flush_type types)

View File

@ -102,6 +102,16 @@ END_FTR_SECTION(0, 1); \
#define HCALL_BRANCH(LABEL)
#endif
_GLOBAL_TOC(plpar_hcall_norets_notrace)
HMT_MEDIUM
mfcr r0
stw r0,8(r1)
HVSC /* invoke the hypervisor */
lwz r0,8(r1)
mtcrf 0xff,r0
blr /* return r3 = status */
_GLOBAL_TOC(plpar_hcall_norets)
HMT_MEDIUM

View File

@ -1829,30 +1829,28 @@ void hcall_tracepoint_unregfunc(void)
#endif
/*
* Since the tracing code might execute hcalls we need to guard against
* recursion. One example of this are spinlocks calling H_YIELD on
* shared processor partitions.
* Keep track of hcall tracing depth and prevent recursion. Warn if any is
* detected because it may indicate a problem. This will not catch all
* problems with tracing code making hcalls, because the tracing might have
* been invoked from a non-hcall, so the first hcall could recurse into it
* without warning here, but this better than nothing.
*
* Hcalls with specific problems being traced should use the _notrace
* plpar_hcall variants.
*/
static DEFINE_PER_CPU(unsigned int, hcall_trace_depth);
void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
notrace void __trace_hcall_entry(unsigned long opcode, unsigned long *args)
{
unsigned long flags;
unsigned int *depth;
/*
* We cannot call tracepoints inside RCU idle regions which
* means we must not trace H_CEDE.
*/
if (opcode == H_CEDE)
return;
local_irq_save(flags);
depth = this_cpu_ptr(&hcall_trace_depth);
if (*depth)
if (WARN_ON_ONCE(*depth))
goto out;
(*depth)++;
@ -1864,19 +1862,16 @@ out:
local_irq_restore(flags);
}
void __trace_hcall_exit(long opcode, long retval, unsigned long *retbuf)
notrace void __trace_hcall_exit(long opcode, long retval, unsigned long *retbuf)
{
unsigned long flags;
unsigned int *depth;
if (opcode == H_CEDE)
return;
local_irq_save(flags);
depth = this_cpu_ptr(&hcall_trace_depth);
if (*depth)
if (*depth) /* Don't warn again on the way out */
goto out;
(*depth)++;

View File

@ -180,7 +180,6 @@ static inline void arch_ftrace_nmi_exit(void) { }
BUILD_TRAP_HANDLER(nmi)
{
unsigned int cpu = smp_processor_id();
TRAP_HANDLER_DECL;
arch_ftrace_nmi_enter();

View File

@ -30,6 +30,7 @@ targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
KBUILD_CFLAGS := -m$(BITS) -O2
KBUILD_CFLAGS += -fno-strict-aliasing -fPIE
KBUILD_CFLAGS += -Wundef
KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
cflags-$(CONFIG_X86_32) := -march=i386
cflags-$(CONFIG_X86_64) := -mcmodel=small -mno-red-zone
@ -48,10 +49,10 @@ KBUILD_CFLAGS += $(call as-option,-Wa$(comma)-mrelax-relocations=no)
KBUILD_CFLAGS += -include $(srctree)/include/linux/hidden.h
KBUILD_CFLAGS += $(CLANG_FLAGS)
# sev-es.c indirectly inludes inat-table.h which is generated during
# sev.c indirectly inludes inat-table.h which is generated during
# compilation and stored in $(objtree). Add the directory to the includes so
# that the compiler finds it even with out-of-tree builds (make O=/some/path).
CFLAGS_sev-es.o += -I$(objtree)/arch/x86/lib/
CFLAGS_sev.o += -I$(objtree)/arch/x86/lib/
KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__
GCOV_PROFILE := n
@ -93,7 +94,7 @@ ifdef CONFIG_X86_64
vmlinux-objs-y += $(obj)/idt_64.o $(obj)/idt_handlers_64.o
vmlinux-objs-y += $(obj)/mem_encrypt.o
vmlinux-objs-y += $(obj)/pgtable_64.o
vmlinux-objs-$(CONFIG_AMD_MEM_ENCRYPT) += $(obj)/sev-es.o
vmlinux-objs-$(CONFIG_AMD_MEM_ENCRYPT) += $(obj)/sev.o
endif
vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o

View File

@ -172,7 +172,7 @@ void __puthex(unsigned long value)
}
}
#if CONFIG_X86_NEED_RELOCS
#ifdef CONFIG_X86_NEED_RELOCS
static void handle_relocations(void *output, unsigned long output_len,
unsigned long virt_addr)
{

View File

@ -79,7 +79,7 @@ struct mem_vector {
u64 size;
};
#if CONFIG_RANDOMIZE_BASE
#ifdef CONFIG_RANDOMIZE_BASE
/* kaslr.c */
void choose_random_location(unsigned long input,
unsigned long input_size,

View File

@ -13,7 +13,7 @@
#include "misc.h"
#include <asm/pgtable_types.h>
#include <asm/sev-es.h>
#include <asm/sev.h>
#include <asm/trapnr.h>
#include <asm/trap_pf.h>
#include <asm/msr-index.h>
@ -117,7 +117,7 @@ static enum es_result vc_read_mem(struct es_em_ctxt *ctxt,
#include "../../lib/insn.c"
/* Include code for early handlers */
#include "../../kernel/sev-es-shared.c"
#include "../../kernel/sev-shared.c"
static bool early_setup_sev_es(void)
{

View File

@ -113,6 +113,7 @@
#define VALID_PAGE(x) ((x) != INVALID_PAGE)
#define UNMAPPED_GVA (~(gpa_t)0)
#define INVALID_GPA (~(gpa_t)0)
/* KVM Hugepage definitions for x86 */
#define KVM_MAX_HUGEPAGE_LEVEL PG_LEVEL_1G
@ -199,6 +200,7 @@ enum x86_intercept_stage;
#define KVM_NR_DB_REGS 4
#define DR6_BUS_LOCK (1 << 11)
#define DR6_BD (1 << 13)
#define DR6_BS (1 << 14)
#define DR6_BT (1 << 15)
@ -212,7 +214,7 @@ enum x86_intercept_stage;
* DR6_ACTIVE_LOW is also used as the init/reset value for DR6.
*/
#define DR6_ACTIVE_LOW 0xffff0ff0
#define DR6_VOLATILE 0x0001e00f
#define DR6_VOLATILE 0x0001e80f
#define DR6_FIXED_1 (DR6_ACTIVE_LOW & ~DR6_VOLATILE)
#define DR7_BP_EN_MASK 0x000000ff
@ -407,7 +409,7 @@ struct kvm_mmu {
u32 pkru_mask;
u64 *pae_root;
u64 *lm_root;
u64 *pml4_root;
/*
* check zero bits on shadow page table entries, these
@ -1417,6 +1419,7 @@ struct kvm_arch_async_pf {
bool direct_map;
};
extern u32 __read_mostly kvm_nr_uret_msrs;
extern u64 __read_mostly host_efer;
extern bool __read_mostly allow_smaller_maxphyaddr;
extern struct kvm_x86_ops kvm_x86_ops;
@ -1775,9 +1778,15 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
unsigned long ipi_bitmap_high, u32 min,
unsigned long icr, int op_64_bit);
void kvm_define_user_return_msr(unsigned index, u32 msr);
int kvm_add_user_return_msr(u32 msr);
int kvm_find_user_return_msr(u32 msr);
int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
static inline bool kvm_is_supported_user_return_msr(u32 msr)
{
return kvm_find_user_return_msr(msr) >= 0;
}
u64 kvm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc);
u64 kvm_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);

View File

@ -7,8 +7,6 @@
#include <linux/interrupt.h>
#include <uapi/asm/kvm_para.h>
extern void kvmclock_init(void);
#ifdef CONFIG_KVM_GUEST
bool kvm_check_and_clear_guest_paused(void);
#else
@ -86,13 +84,14 @@ static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
}
#ifdef CONFIG_KVM_GUEST
void kvmclock_init(void);
void kvmclock_disable(void);
bool kvm_para_available(void);
unsigned int kvm_arch_para_features(void);
unsigned int kvm_arch_para_hints(void);
void kvm_async_pf_task_wait_schedule(u32 token);
void kvm_async_pf_task_wake(u32 token);
u32 kvm_read_and_reset_apf_flags(void);
void kvm_disable_steal_time(void);
bool __kvm_handle_async_pf(struct pt_regs *regs, u32 token);
DECLARE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
@ -137,11 +136,6 @@ static inline u32 kvm_read_and_reset_apf_flags(void)
return 0;
}
static inline void kvm_disable_steal_time(void)
{
return;
}
static __always_inline bool kvm_handle_async_pf(struct pt_regs *regs, u32 token)
{
return false;

View File

@ -537,9 +537,9 @@
/* K8 MSRs */
#define MSR_K8_TOP_MEM1 0xc001001a
#define MSR_K8_TOP_MEM2 0xc001001d
#define MSR_K8_SYSCFG 0xc0010010
#define MSR_K8_SYSCFG_MEM_ENCRYPT_BIT 23
#define MSR_K8_SYSCFG_MEM_ENCRYPT BIT_ULL(MSR_K8_SYSCFG_MEM_ENCRYPT_BIT)
#define MSR_AMD64_SYSCFG 0xc0010010
#define MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT 23
#define MSR_AMD64_SYSCFG_MEM_ENCRYPT BIT_ULL(MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT)
#define MSR_K8_INT_PENDING_MSG 0xc0010055
/* C1E active bits in int pending message */
#define K8_INTP_C1E_ACTIVE_MASK 0x18000000

View File

@ -787,8 +787,10 @@ DECLARE_PER_CPU(u64, msr_misc_features_shadow);
#ifdef CONFIG_CPU_SUP_AMD
extern u32 amd_get_nodes_per_socket(void);
extern u32 amd_get_highest_perf(void);
#else
static inline u32 amd_get_nodes_per_socket(void) { return 0; }
static inline u32 amd_get_highest_perf(void) { return 0; }
#endif
static inline uint32_t hypervisor_cpuid_base(const char *sig, uint32_t leaves)

View File

@ -0,0 +1,62 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* AMD SEV header common between the guest and the hypervisor.
*
* Author: Brijesh Singh <brijesh.singh@amd.com>
*/
#ifndef __ASM_X86_SEV_COMMON_H
#define __ASM_X86_SEV_COMMON_H
#define GHCB_MSR_INFO_POS 0
#define GHCB_MSR_INFO_MASK (BIT_ULL(12) - 1)
#define GHCB_MSR_SEV_INFO_RESP 0x001
#define GHCB_MSR_SEV_INFO_REQ 0x002
#define GHCB_MSR_VER_MAX_POS 48
#define GHCB_MSR_VER_MAX_MASK 0xffff
#define GHCB_MSR_VER_MIN_POS 32
#define GHCB_MSR_VER_MIN_MASK 0xffff
#define GHCB_MSR_CBIT_POS 24
#define GHCB_MSR_CBIT_MASK 0xff
#define GHCB_MSR_SEV_INFO(_max, _min, _cbit) \
((((_max) & GHCB_MSR_VER_MAX_MASK) << GHCB_MSR_VER_MAX_POS) | \
(((_min) & GHCB_MSR_VER_MIN_MASK) << GHCB_MSR_VER_MIN_POS) | \
(((_cbit) & GHCB_MSR_CBIT_MASK) << GHCB_MSR_CBIT_POS) | \
GHCB_MSR_SEV_INFO_RESP)
#define GHCB_MSR_INFO(v) ((v) & 0xfffUL)
#define GHCB_MSR_PROTO_MAX(v) (((v) >> GHCB_MSR_VER_MAX_POS) & GHCB_MSR_VER_MAX_MASK)
#define GHCB_MSR_PROTO_MIN(v) (((v) >> GHCB_MSR_VER_MIN_POS) & GHCB_MSR_VER_MIN_MASK)
#define GHCB_MSR_CPUID_REQ 0x004
#define GHCB_MSR_CPUID_RESP 0x005
#define GHCB_MSR_CPUID_FUNC_POS 32
#define GHCB_MSR_CPUID_FUNC_MASK 0xffffffff
#define GHCB_MSR_CPUID_VALUE_POS 32
#define GHCB_MSR_CPUID_VALUE_MASK 0xffffffff
#define GHCB_MSR_CPUID_REG_POS 30
#define GHCB_MSR_CPUID_REG_MASK 0x3
#define GHCB_CPUID_REQ_EAX 0
#define GHCB_CPUID_REQ_EBX 1
#define GHCB_CPUID_REQ_ECX 2
#define GHCB_CPUID_REQ_EDX 3
#define GHCB_CPUID_REQ(fn, reg) \
(GHCB_MSR_CPUID_REQ | \
(((unsigned long)reg & GHCB_MSR_CPUID_REG_MASK) << GHCB_MSR_CPUID_REG_POS) | \
(((unsigned long)fn) << GHCB_MSR_CPUID_FUNC_POS))
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
#define GHCB_MSR_TERM_REASON_SET_MASK 0xf
#define GHCB_MSR_TERM_REASON_POS 16
#define GHCB_MSR_TERM_REASON_MASK 0xff
#define GHCB_SEV_TERM_REASON(reason_set, reason_val) \
(((((u64)reason_set) & GHCB_MSR_TERM_REASON_SET_MASK) << GHCB_MSR_TERM_REASON_SET_POS) | \
((((u64)reason_val) & GHCB_MSR_TERM_REASON_MASK) << GHCB_MSR_TERM_REASON_POS))
#define GHCB_SEV_ES_REASON_GENERAL_REQUEST 0
#define GHCB_SEV_ES_REASON_PROTOCOL_UNSUPPORTED 1
#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)
#endif

View File

@ -10,34 +10,12 @@
#include <linux/types.h>
#include <asm/insn.h>
#include <asm/sev-common.h>
#define GHCB_SEV_INFO 0x001UL
#define GHCB_SEV_INFO_REQ 0x002UL
#define GHCB_INFO(v) ((v) & 0xfffUL)
#define GHCB_PROTO_MAX(v) (((v) >> 48) & 0xffffUL)
#define GHCB_PROTO_MIN(v) (((v) >> 32) & 0xffffUL)
#define GHCB_PROTO_OUR 0x0001UL
#define GHCB_SEV_CPUID_REQ 0x004UL
#define GHCB_CPUID_REQ_EAX 0
#define GHCB_CPUID_REQ_EBX 1
#define GHCB_CPUID_REQ_ECX 2
#define GHCB_CPUID_REQ_EDX 3
#define GHCB_CPUID_REQ(fn, reg) (GHCB_SEV_CPUID_REQ | \
(((unsigned long)reg & 3) << 30) | \
(((unsigned long)fn) << 32))
#define GHCB_PROTO_OUR 0x0001UL
#define GHCB_PROTOCOL_MAX 1ULL
#define GHCB_DEFAULT_USAGE 0ULL
#define GHCB_PROTOCOL_MAX 0x0001UL
#define GHCB_DEFAULT_USAGE 0x0000UL
#define GHCB_SEV_CPUID_RESP 0x005UL
#define GHCB_SEV_TERMINATE 0x100UL
#define GHCB_SEV_TERMINATE_REASON(reason_set, reason_val) \
(((((u64)reason_set) & 0x7) << 12) | \
((((u64)reason_val) & 0xff) << 16))
#define GHCB_SEV_ES_REASON_GENERAL_REQUEST 0
#define GHCB_SEV_ES_REASON_PROTOCOL_UNSUPPORTED 1
#define GHCB_SEV_GHCB_RESP_CODE(v) ((v) & 0xfff)
#define VMGEXIT() { asm volatile("rep; vmmcall\n\r"); }
enum es_result {

View File

@ -7,4 +7,6 @@
VDSO_CLOCKMODE_PVCLOCK, \
VDSO_CLOCKMODE_HVCLOCK
#define HAVE_VDSO_CLOCKMODE_HVCLOCK
#endif /* __ASM_VDSO_CLOCKSOURCE_H */

View File

@ -437,6 +437,8 @@ struct kvm_vmx_nested_state_hdr {
__u16 flags;
} smm;
__u16 pad;
__u32 flags;
__u64 preemption_timer_deadline;
};

View File

@ -20,7 +20,7 @@ CFLAGS_REMOVE_kvmclock.o = -pg
CFLAGS_REMOVE_ftrace.o = -pg
CFLAGS_REMOVE_early_printk.o = -pg
CFLAGS_REMOVE_head64.o = -pg
CFLAGS_REMOVE_sev-es.o = -pg
CFLAGS_REMOVE_sev.o = -pg
endif
KASAN_SANITIZE_head$(BITS).o := n
@ -28,7 +28,7 @@ KASAN_SANITIZE_dumpstack.o := n
KASAN_SANITIZE_dumpstack_$(BITS).o := n
KASAN_SANITIZE_stacktrace.o := n
KASAN_SANITIZE_paravirt.o := n
KASAN_SANITIZE_sev-es.o := n
KASAN_SANITIZE_sev.o := n
# With some compiler versions the generated code results in boot hangs, caused
# by several compilation units. To be safe, disable all instrumentation.
@ -148,7 +148,7 @@ obj-$(CONFIG_UNWINDER_ORC) += unwind_orc.o
obj-$(CONFIG_UNWINDER_FRAME_POINTER) += unwind_frame.o
obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev-es.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o
###
# 64 bit specific files
ifeq ($(CONFIG_X86_64),y)

View File

@ -593,8 +593,8 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
*/
if (cpu_has(c, X86_FEATURE_SME) || cpu_has(c, X86_FEATURE_SEV)) {
/* Check if memory encryption is enabled */
rdmsrl(MSR_K8_SYSCFG, msr);
if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT))
rdmsrl(MSR_AMD64_SYSCFG, msr);
if (!(msr & MSR_AMD64_SYSCFG_MEM_ENCRYPT))
goto clear_all;
/*
@ -1165,3 +1165,19 @@ void set_dr_addr_mask(unsigned long mask, int dr)
break;
}
}
u32 amd_get_highest_perf(void)
{
struct cpuinfo_x86 *c = &boot_cpu_data;
if (c->x86 == 0x17 && ((c->x86_model >= 0x30 && c->x86_model < 0x40) ||
(c->x86_model >= 0x70 && c->x86_model < 0x80)))
return 166;
if (c->x86 == 0x19 && ((c->x86_model >= 0x20 && c->x86_model < 0x30) ||
(c->x86_model >= 0x40 && c->x86_model < 0x70)))
return 166;
return 255;
}
EXPORT_SYMBOL_GPL(amd_get_highest_perf);

View File

@ -836,7 +836,7 @@ int __init amd_special_default_mtrr(void)
if (boot_cpu_data.x86 < 0xf)
return 0;
/* In case some hypervisor doesn't pass SYSCFG through: */
if (rdmsr_safe(MSR_K8_SYSCFG, &l, &h) < 0)
if (rdmsr_safe(MSR_AMD64_SYSCFG, &l, &h) < 0)
return 0;
/*
* Memory between 4GB and top of mem is forced WB by this magic bit.

View File

@ -53,13 +53,13 @@ static inline void k8_check_syscfg_dram_mod_en(void)
(boot_cpu_data.x86 >= 0x0f)))
return;
rdmsr(MSR_K8_SYSCFG, lo, hi);
rdmsr(MSR_AMD64_SYSCFG, lo, hi);
if (lo & K8_MTRRFIXRANGE_DRAM_MODIFY) {
pr_err(FW_WARN "MTRR: CPU %u: SYSCFG[MtrrFixDramModEn]"
" not cleared by BIOS, clearing this bit\n",
smp_processor_id());
lo &= ~K8_MTRRFIXRANGE_DRAM_MODIFY;
mtrr_wrmsr(MSR_K8_SYSCFG, lo, hi);
mtrr_wrmsr(MSR_AMD64_SYSCFG, lo, hi);
}
}

View File

@ -39,7 +39,7 @@
#include <asm/realmode.h>
#include <asm/extable.h>
#include <asm/trapnr.h>
#include <asm/sev-es.h>
#include <asm/sev.h>
/*
* Manage page tables very early on.

View File

@ -26,6 +26,7 @@
#include <linux/kprobes.h>
#include <linux/nmi.h>
#include <linux/swait.h>
#include <linux/syscore_ops.h>
#include <asm/timer.h>
#include <asm/cpu.h>
#include <asm/traps.h>
@ -37,6 +38,7 @@
#include <asm/tlb.h>
#include <asm/cpuidle_haltpoll.h>
#include <asm/ptrace.h>
#include <asm/reboot.h>
#include <asm/svm.h>
DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
@ -345,7 +347,7 @@ static void kvm_guest_cpu_init(void)
wrmsrl(MSR_KVM_ASYNC_PF_EN, pa);
__this_cpu_write(apf_reason.enabled, 1);
pr_info("KVM setup async PF for cpu %d\n", smp_processor_id());
pr_info("setup async PF for cpu %d\n", smp_processor_id());
}
if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) {
@ -371,34 +373,17 @@ static void kvm_pv_disable_apf(void)
wrmsrl(MSR_KVM_ASYNC_PF_EN, 0);
__this_cpu_write(apf_reason.enabled, 0);
pr_info("Unregister pv shared memory for cpu %d\n", smp_processor_id());
pr_info("disable async PF for cpu %d\n", smp_processor_id());
}
static void kvm_pv_guest_cpu_reboot(void *unused)
static void kvm_disable_steal_time(void)
{
/*
* We disable PV EOI before we load a new kernel by kexec,
* since MSR_KVM_PV_EOI_EN stores a pointer into old kernel's memory.
* New kernel can re-enable when it boots.
*/
if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
wrmsrl(MSR_KVM_PV_EOI_EN, 0);
kvm_pv_disable_apf();
kvm_disable_steal_time();
}
if (!has_steal_clock)
return;
static int kvm_pv_reboot_notify(struct notifier_block *nb,
unsigned long code, void *unused)
{
if (code == SYS_RESTART)
on_each_cpu(kvm_pv_guest_cpu_reboot, NULL, 1);
return NOTIFY_DONE;
wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
}
static struct notifier_block kvm_pv_reboot_nb = {
.notifier_call = kvm_pv_reboot_notify,
};
static u64 kvm_steal_clock(int cpu)
{
u64 steal;
@ -416,14 +401,6 @@ static u64 kvm_steal_clock(int cpu)
return steal;
}
void kvm_disable_steal_time(void)
{
if (!has_steal_clock)
return;
wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
}
static inline void __set_percpu_decrypted(void *ptr, unsigned long size)
{
early_set_memory_decrypted((unsigned long) ptr, size);
@ -451,6 +428,27 @@ static void __init sev_map_percpu_data(void)
}
}
static void kvm_guest_cpu_offline(bool shutdown)
{
kvm_disable_steal_time();
if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
wrmsrl(MSR_KVM_PV_EOI_EN, 0);
kvm_pv_disable_apf();
if (!shutdown)
apf_task_wake_all();
kvmclock_disable();
}
static int kvm_cpu_online(unsigned int cpu)
{
unsigned long flags;
local_irq_save(flags);
kvm_guest_cpu_init();
local_irq_restore(flags);
return 0;
}
#ifdef CONFIG_SMP
static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask);
@ -635,33 +633,66 @@ static void __init kvm_smp_prepare_boot_cpu(void)
kvm_spinlock_init();
}
static void kvm_guest_cpu_offline(void)
{
kvm_disable_steal_time();
if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
wrmsrl(MSR_KVM_PV_EOI_EN, 0);
kvm_pv_disable_apf();
apf_task_wake_all();
}
static int kvm_cpu_online(unsigned int cpu)
{
local_irq_disable();
kvm_guest_cpu_init();
local_irq_enable();
return 0;
}
static int kvm_cpu_down_prepare(unsigned int cpu)
{
local_irq_disable();
kvm_guest_cpu_offline();
local_irq_enable();
unsigned long flags;
local_irq_save(flags);
kvm_guest_cpu_offline(false);
local_irq_restore(flags);
return 0;
}
#endif
static int kvm_suspend(void)
{
kvm_guest_cpu_offline(false);
return 0;
}
static void kvm_resume(void)
{
kvm_cpu_online(raw_smp_processor_id());
}
static struct syscore_ops kvm_syscore_ops = {
.suspend = kvm_suspend,
.resume = kvm_resume,
};
static void kvm_pv_guest_cpu_reboot(void *unused)
{
kvm_guest_cpu_offline(true);
}
static int kvm_pv_reboot_notify(struct notifier_block *nb,
unsigned long code, void *unused)
{
if (code == SYS_RESTART)
on_each_cpu(kvm_pv_guest_cpu_reboot, NULL, 1);
return NOTIFY_DONE;
}
static struct notifier_block kvm_pv_reboot_nb = {
.notifier_call = kvm_pv_reboot_notify,
};
/*
* After a PV feature is registered, the host will keep writing to the
* registered memory location. If the guest happens to shutdown, this memory
* won't be valid. In cases like kexec, in which you install a new kernel, this
* means a random memory location will be kept being written.
*/
#ifdef CONFIG_KEXEC_CORE
static void kvm_crash_shutdown(struct pt_regs *regs)
{
kvm_guest_cpu_offline(true);
native_machine_crash_shutdown(regs);
}
#endif
static void __init kvm_guest_init(void)
{
int i;
@ -704,6 +735,12 @@ static void __init kvm_guest_init(void)
kvm_guest_cpu_init();
#endif
#ifdef CONFIG_KEXEC_CORE
machine_ops.crash_shutdown = kvm_crash_shutdown;
#endif
register_syscore_ops(&kvm_syscore_ops);
/*
* Hard lockup detection is enabled by default. Disable it, as guests
* can get false positives too easily, for example if the host is

View File

@ -20,7 +20,6 @@
#include <asm/hypervisor.h>
#include <asm/mem_encrypt.h>
#include <asm/x86_init.h>
#include <asm/reboot.h>
#include <asm/kvmclock.h>
static int kvmclock __initdata = 1;
@ -203,28 +202,9 @@ static void kvm_setup_secondary_clock(void)
}
#endif
/*
* After the clock is registered, the host will keep writing to the
* registered memory location. If the guest happens to shutdown, this memory
* won't be valid. In cases like kexec, in which you install a new kernel, this
* means a random memory location will be kept being written. So before any
* kind of shutdown from our side, we unregister the clock by writing anything
* that does not have the 'enable' bit set in the msr
*/
#ifdef CONFIG_KEXEC_CORE
static void kvm_crash_shutdown(struct pt_regs *regs)
void kvmclock_disable(void)
{
native_write_msr(msr_kvm_system_time, 0, 0);
kvm_disable_steal_time();
native_machine_crash_shutdown(regs);
}
#endif
static void kvm_shutdown(void)
{
native_write_msr(msr_kvm_system_time, 0, 0);
kvm_disable_steal_time();
native_machine_shutdown();
}
static void __init kvmclock_init_mem(void)
@ -351,10 +331,6 @@ void __init kvmclock_init(void)
#endif
x86_platform.save_sched_clock_state = kvm_save_sched_clock_state;
x86_platform.restore_sched_clock_state = kvm_restore_sched_clock_state;
machine_ops.shutdown = kvm_shutdown;
#ifdef CONFIG_KEXEC_CORE
machine_ops.crash_shutdown = kvm_crash_shutdown;
#endif
kvm_get_preset_lpj();
/*

View File

@ -95,7 +95,7 @@ static void get_fam10h_pci_mmconf_base(void)
return;
/* SYS_CFG */
address = MSR_K8_SYSCFG;
address = MSR_AMD64_SYSCFG;
rdmsrl(address, val);
/* TOP_MEM2 is not enabled? */

View File

@ -33,7 +33,7 @@
#include <asm/reboot.h>
#include <asm/cache.h>
#include <asm/nospec-branch.h>
#include <asm/sev-es.h>
#include <asm/sev.h>
#define CREATE_TRACE_POINTS
#include <trace/events/nmi.h>

View File

@ -26,13 +26,13 @@ static bool __init sev_es_check_cpu_features(void)
static void __noreturn sev_es_terminate(unsigned int reason)
{
u64 val = GHCB_SEV_TERMINATE;
u64 val = GHCB_MSR_TERM_REQ;
/*
* Tell the hypervisor what went wrong - only reason-set 0 is
* currently supported.
*/
val |= GHCB_SEV_TERMINATE_REASON(0, reason);
val |= GHCB_SEV_TERM_REASON(0, reason);
/* Request Guest Termination from Hypvervisor */
sev_es_wr_ghcb_msr(val);
@ -47,15 +47,15 @@ static bool sev_es_negotiate_protocol(void)
u64 val;
/* Do the GHCB protocol version negotiation */
sev_es_wr_ghcb_msr(GHCB_SEV_INFO_REQ);
sev_es_wr_ghcb_msr(GHCB_MSR_SEV_INFO_REQ);
VMGEXIT();
val = sev_es_rd_ghcb_msr();
if (GHCB_INFO(val) != GHCB_SEV_INFO)
if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
return false;
if (GHCB_PROTO_MAX(val) < GHCB_PROTO_OUR ||
GHCB_PROTO_MIN(val) > GHCB_PROTO_OUR)
if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTO_OUR ||
GHCB_MSR_PROTO_MIN(val) > GHCB_PROTO_OUR)
return false;
return true;
@ -153,28 +153,28 @@ void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX));
VMGEXIT();
val = sev_es_rd_ghcb_msr();
if (GHCB_SEV_GHCB_RESP_CODE(val) != GHCB_SEV_CPUID_RESP)
if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
goto fail;
regs->ax = val >> 32;
sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX));
VMGEXIT();
val = sev_es_rd_ghcb_msr();
if (GHCB_SEV_GHCB_RESP_CODE(val) != GHCB_SEV_CPUID_RESP)
if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
goto fail;
regs->bx = val >> 32;
sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX));
VMGEXIT();
val = sev_es_rd_ghcb_msr();
if (GHCB_SEV_GHCB_RESP_CODE(val) != GHCB_SEV_CPUID_RESP)
if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
goto fail;
regs->cx = val >> 32;
sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX));
VMGEXIT();
val = sev_es_rd_ghcb_msr();
if (GHCB_SEV_GHCB_RESP_CODE(val) != GHCB_SEV_CPUID_RESP)
if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
goto fail;
regs->dx = val >> 32;

View File

@ -22,7 +22,7 @@
#include <asm/cpu_entry_area.h>
#include <asm/stacktrace.h>
#include <asm/sev-es.h>
#include <asm/sev.h>
#include <asm/insn-eval.h>
#include <asm/fpu/internal.h>
#include <asm/processor.h>
@ -459,7 +459,7 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt
}
/* Include code shared with pre-decompression boot stage */
#include "sev-es-shared.c"
#include "sev-shared.c"
void noinstr __sev_es_nmi_complete(void)
{

View File

@ -2043,7 +2043,7 @@ static bool amd_set_max_freq_ratio(void)
return false;
}
highest_perf = perf_caps.highest_perf;
highest_perf = amd_get_highest_perf();
nominal_perf = perf_caps.nominal_perf;
if (!highest_perf || !nominal_perf) {

View File

@ -458,7 +458,7 @@ void kvm_set_cpu_caps(void)
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
F(SGX_LC)
F(SGX_LC) | F(BUS_LOCK_DETECT)
);
/* Set LA57 based on hardware capability. */
if (cpuid_ecx(7) & F(LA57))
@ -567,6 +567,21 @@ void kvm_set_cpu_caps(void)
F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
F(PMM) | F(PMM_EN)
);
/*
* Hide RDTSCP and RDPID if either feature is reported as supported but
* probing MSR_TSC_AUX failed. This is purely a sanity check and
* should never happen, but the guest will likely crash if RDTSCP or
* RDPID is misreported, and KVM has botched MSR_TSC_AUX emulation in
* the past. For example, the sanity check may fire if this instance of
* KVM is running as L1 on top of an older, broken KVM.
*/
if (WARN_ON((kvm_cpu_cap_has(X86_FEATURE_RDTSCP) ||
kvm_cpu_cap_has(X86_FEATURE_RDPID)) &&
!kvm_is_supported_user_return_msr(MSR_TSC_AUX))) {
kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
kvm_cpu_cap_clear(X86_FEATURE_RDPID);
}
}
EXPORT_SYMBOL_GPL(kvm_set_cpu_caps);
@ -637,7 +652,8 @@ static int __do_cpuid_func_emulated(struct kvm_cpuid_array *array, u32 func)
case 7:
entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
entry->eax = 0;
entry->ecx = F(RDPID);
if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP))
entry->ecx = F(RDPID);
++array->nent;
default:
break;

View File

@ -4502,7 +4502,7 @@ static const struct opcode group8[] = {
* from the register case of group9.
*/
static const struct gprefix pfx_0f_c7_7 = {
N, N, N, II(DstMem | ModRM | Op3264 | EmulateOnUD, em_rdpid, rdtscp),
N, N, N, II(DstMem | ModRM | Op3264 | EmulateOnUD, em_rdpid, rdpid),
};

View File

@ -468,6 +468,7 @@ enum x86_intercept {
x86_intercept_clgi,
x86_intercept_skinit,
x86_intercept_rdtscp,
x86_intercept_rdpid,
x86_intercept_icebp,
x86_intercept_wbinvd,
x86_intercept_monitor,

View File

@ -1913,8 +1913,8 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu)
if (!apic->lapic_timer.hv_timer_in_use)
goto out;
WARN_ON(rcuwait_active(&vcpu->wait));
cancel_hv_timer(apic);
apic_timer_expired(apic, false);
cancel_hv_timer(apic);
if (apic_lvtt_period(apic) && apic->lapic_timer.period) {
advance_periodic_target_expiration(apic);

View File

@ -3310,12 +3310,12 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
if (mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
if (WARN_ON_ONCE(!mmu->lm_root)) {
if (WARN_ON_ONCE(!mmu->pml4_root)) {
r = -EIO;
goto out_unlock;
}
mmu->lm_root[0] = __pa(mmu->pae_root) | pm_mask;
mmu->pml4_root[0] = __pa(mmu->pae_root) | pm_mask;
}
for (i = 0; i < 4; ++i) {
@ -3335,7 +3335,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
}
if (mmu->shadow_root_level == PT64_ROOT_4LEVEL)
mmu->root_hpa = __pa(mmu->lm_root);
mmu->root_hpa = __pa(mmu->pml4_root);
else
mmu->root_hpa = __pa(mmu->pae_root);
@ -3350,7 +3350,7 @@ out_unlock:
static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
{
struct kvm_mmu *mmu = vcpu->arch.mmu;
u64 *lm_root, *pae_root;
u64 *pml4_root, *pae_root;
/*
* When shadowing 32-bit or PAE NPT with 64-bit NPT, the PML4 and PDP
@ -3369,14 +3369,14 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
if (WARN_ON_ONCE(mmu->shadow_root_level != PT64_ROOT_4LEVEL))
return -EIO;
if (mmu->pae_root && mmu->lm_root)
if (mmu->pae_root && mmu->pml4_root)
return 0;
/*
* The special roots should always be allocated in concert. Yell and
* bail if KVM ends up in a state where only one of the roots is valid.
*/
if (WARN_ON_ONCE(!tdp_enabled || mmu->pae_root || mmu->lm_root))
if (WARN_ON_ONCE(!tdp_enabled || mmu->pae_root || mmu->pml4_root))
return -EIO;
/*
@ -3387,14 +3387,14 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
if (!pae_root)
return -ENOMEM;
lm_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
if (!lm_root) {
pml4_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
if (!pml4_root) {
free_page((unsigned long)pae_root);
return -ENOMEM;
}
mmu->pae_root = pae_root;
mmu->lm_root = lm_root;
mmu->pml4_root = pml4_root;
return 0;
}
@ -5261,7 +5261,7 @@ static void free_mmu_pages(struct kvm_mmu *mmu)
if (!tdp_enabled && mmu->pae_root)
set_memory_encrypted((unsigned long)mmu->pae_root, 1);
free_page((unsigned long)mmu->pae_root);
free_page((unsigned long)mmu->lm_root);
free_page((unsigned long)mmu->pml4_root);
}
static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)

View File

@ -388,7 +388,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, tdp_ptep_t pt,
}
/**
* handle_changed_spte - handle bookkeeping associated with an SPTE change
* __handle_changed_spte - handle bookkeeping associated with an SPTE change
* @kvm: kvm instance
* @as_id: the address space of the paging structure the SPTE was a part of
* @gfn: the base GFN that was mapped by the SPTE
@ -444,6 +444,13 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
trace_kvm_tdp_mmu_spte_changed(as_id, gfn, level, old_spte, new_spte);
if (is_large_pte(old_spte) != is_large_pte(new_spte)) {
if (is_large_pte(old_spte))
atomic64_sub(1, (atomic64_t*)&kvm->stat.lpages);
else
atomic64_add(1, (atomic64_t*)&kvm->stat.lpages);
}
/*
* The only times a SPTE should be changed from a non-present to
* non-present state is when an MMIO entry is installed/modified/
@ -1009,6 +1016,14 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
}
if (!is_shadow_present_pte(iter.old_spte)) {
/*
* If SPTE has been forzen by another thread, just
* give up and retry, avoiding unnecessary page table
* allocation and free.
*/
if (is_removed_spte(iter.old_spte))
break;
sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level);
child_pt = sp->spt;

View File

@ -764,7 +764,6 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
svm_switch_vmcb(svm, &svm->vmcb01);
WARN_ON_ONCE(svm->vmcb->control.exit_code != SVM_EXIT_VMRUN);
/*
* On vmexit the GIF is set to false and
@ -872,6 +871,15 @@ void svm_free_nested(struct vcpu_svm *svm)
__free_page(virt_to_page(svm->nested.vmcb02.ptr));
svm->nested.vmcb02.ptr = NULL;
/*
* When last_vmcb12_gpa matches the current vmcb12 gpa,
* some vmcb12 fields are not loaded if they are marked clean
* in the vmcb12, since in this case they are up to date already.
*
* When the vmcb02 is freed, this optimization becomes invalid.
*/
svm->nested.last_vmcb12_gpa = INVALID_GPA;
svm->nested.initialized = false;
}
@ -884,9 +892,11 @@ void svm_leave_nested(struct vcpu_svm *svm)
if (is_guest_mode(vcpu)) {
svm->nested.nested_run_pending = 0;
svm->nested.vmcb12_gpa = INVALID_GPA;
leave_guest_mode(vcpu);
svm_switch_vmcb(svm, &svm->nested.vmcb02);
svm_switch_vmcb(svm, &svm->vmcb01);
nested_svm_uninit_mmu_context(vcpu);
vmcb_mark_all_dirty(svm->vmcb);
@ -1298,12 +1308,17 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
* L2 registers if needed are moved from the current VMCB to VMCB02.
*/
if (is_guest_mode(vcpu))
svm_leave_nested(svm);
else
svm->nested.vmcb02.ptr->save = svm->vmcb01.ptr->save;
svm_set_gif(svm, !!(kvm_state->flags & KVM_STATE_NESTED_GIF_SET));
svm->nested.nested_run_pending =
!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
svm->nested.vmcb12_gpa = kvm_state->hdr.svm.vmcb_pa;
if (svm->current_vmcb == &svm->vmcb01)
svm->nested.vmcb02.ptr->save = svm->vmcb01.ptr->save;
svm->vmcb01.ptr->save.es = save->es;
svm->vmcb01.ptr->save.cs = save->cs;

View File

@ -763,7 +763,7 @@ static int __sev_dbg_decrypt(struct kvm *kvm, unsigned long src_paddr,
}
static int __sev_dbg_decrypt_user(struct kvm *kvm, unsigned long paddr,
unsigned long __user dst_uaddr,
void __user *dst_uaddr,
unsigned long dst_paddr,
int size, int *err)
{
@ -787,8 +787,7 @@ static int __sev_dbg_decrypt_user(struct kvm *kvm, unsigned long paddr,
if (tpage) {
offset = paddr & 15;
if (copy_to_user((void __user *)(uintptr_t)dst_uaddr,
page_address(tpage) + offset, size))
if (copy_to_user(dst_uaddr, page_address(tpage) + offset, size))
ret = -EFAULT;
}
@ -800,9 +799,9 @@ e_free:
}
static int __sev_dbg_encrypt_user(struct kvm *kvm, unsigned long paddr,
unsigned long __user vaddr,
void __user *vaddr,
unsigned long dst_paddr,
unsigned long __user dst_vaddr,
void __user *dst_vaddr,
int size, int *error)
{
struct page *src_tpage = NULL;
@ -810,13 +809,12 @@ static int __sev_dbg_encrypt_user(struct kvm *kvm, unsigned long paddr,
int ret, len = size;
/* If source buffer is not aligned then use an intermediate buffer */
if (!IS_ALIGNED(vaddr, 16)) {
if (!IS_ALIGNED((unsigned long)vaddr, 16)) {
src_tpage = alloc_page(GFP_KERNEL);
if (!src_tpage)
return -ENOMEM;
if (copy_from_user(page_address(src_tpage),
(void __user *)(uintptr_t)vaddr, size)) {
if (copy_from_user(page_address(src_tpage), vaddr, size)) {
__free_page(src_tpage);
return -EFAULT;
}
@ -830,7 +828,7 @@ static int __sev_dbg_encrypt_user(struct kvm *kvm, unsigned long paddr,
* - copy the source buffer in an intermediate buffer
* - use the intermediate buffer as source buffer
*/
if (!IS_ALIGNED(dst_vaddr, 16) || !IS_ALIGNED(size, 16)) {
if (!IS_ALIGNED((unsigned long)dst_vaddr, 16) || !IS_ALIGNED(size, 16)) {
int dst_offset;
dst_tpage = alloc_page(GFP_KERNEL);
@ -855,7 +853,7 @@ static int __sev_dbg_encrypt_user(struct kvm *kvm, unsigned long paddr,
page_address(src_tpage), size);
else {
if (copy_from_user(page_address(dst_tpage) + dst_offset,
(void __user *)(uintptr_t)vaddr, size)) {
vaddr, size)) {
ret = -EFAULT;
goto e_free;
}
@ -935,15 +933,15 @@ static int sev_dbg_crypt(struct kvm *kvm, struct kvm_sev_cmd *argp, bool dec)
if (dec)
ret = __sev_dbg_decrypt_user(kvm,
__sme_page_pa(src_p[0]) + s_off,
dst_vaddr,
(void __user *)dst_vaddr,
__sme_page_pa(dst_p[0]) + d_off,
len, &argp->error);
else
ret = __sev_dbg_encrypt_user(kvm,
__sme_page_pa(src_p[0]) + s_off,
vaddr,
(void __user *)vaddr,
__sme_page_pa(dst_p[0]) + d_off,
dst_vaddr,
(void __user *)dst_vaddr,
len, &argp->error);
sev_unpin_memory(kvm, src_p, n);
@ -1764,7 +1762,8 @@ e_mirror_unlock:
e_source_unlock:
mutex_unlock(&source_kvm->lock);
e_source_put:
fput(source_kvm_file);
if (source_kvm_file)
fput(source_kvm_file);
return ret;
}
@ -2198,7 +2197,7 @@ vmgexit_err:
return -EINVAL;
}
static void pre_sev_es_run(struct vcpu_svm *svm)
void sev_es_unmap_ghcb(struct vcpu_svm *svm)
{
if (!svm->ghcb)
return;
@ -2234,9 +2233,6 @@ void pre_sev_run(struct vcpu_svm *svm, int cpu)
struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
int asid = sev_get_asid(svm->vcpu.kvm);
/* Perform any SEV-ES pre-run actions */
pre_sev_es_run(svm);
/* Assign the asid allocated with this SEV guest */
svm->asid = asid;

View File

@ -212,7 +212,7 @@ DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
* RDTSCP and RDPID are not used in the kernel, specifically to allow KVM to
* defer the restoration of TSC_AUX until the CPU returns to userspace.
*/
#define TSC_AUX_URET_SLOT 0
static int tsc_aux_uret_slot __read_mostly = -1;
static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
@ -447,6 +447,11 @@ static int has_svm(void)
return 0;
}
if (pgtable_l5_enabled()) {
pr_info("KVM doesn't yet support 5-level paging on AMD SVM\n");
return 0;
}
return 1;
}
@ -858,8 +863,8 @@ static __init void svm_adjust_mmio_mask(void)
return;
/* If memory encryption is not enabled, use existing mask */
rdmsrl(MSR_K8_SYSCFG, msr);
if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT))
rdmsrl(MSR_AMD64_SYSCFG, msr);
if (!(msr & MSR_AMD64_SYSCFG_MEM_ENCRYPT))
return;
enc_bit = cpuid_ebx(0x8000001f) & 0x3f;
@ -959,8 +964,7 @@ static __init int svm_hardware_setup(void)
kvm_tsc_scaling_ratio_frac_bits = 32;
}
if (boot_cpu_has(X86_FEATURE_RDTSCP))
kvm_define_user_return_msr(TSC_AUX_URET_SLOT, MSR_TSC_AUX);
tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
/* Check for pause filtering support */
if (!boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
@ -1100,7 +1104,9 @@ static u64 svm_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
return svm->vmcb->control.tsc_offset;
}
static void svm_check_invpcid(struct vcpu_svm *svm)
/* Evaluate instruction intercepts that depend on guest CPUID features. */
static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu,
struct vcpu_svm *svm)
{
/*
* Intercept INVPCID if shadow paging is enabled to sync/free shadow
@ -1113,6 +1119,13 @@ static void svm_check_invpcid(struct vcpu_svm *svm)
else
svm_clr_intercept(svm, INTERCEPT_INVPCID);
}
if (kvm_cpu_cap_has(X86_FEATURE_RDTSCP)) {
if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
svm_clr_intercept(svm, INTERCEPT_RDTSCP);
else
svm_set_intercept(svm, INTERCEPT_RDTSCP);
}
}
static void init_vmcb(struct kvm_vcpu *vcpu)
@ -1235,8 +1248,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
svm->current_vmcb->asid_generation = 0;
svm->asid = 0;
svm->nested.vmcb12_gpa = 0;
svm->nested.last_vmcb12_gpa = 0;
svm->nested.vmcb12_gpa = INVALID_GPA;
svm->nested.last_vmcb12_gpa = INVALID_GPA;
vcpu->arch.hflags = 0;
if (!kvm_pause_in_guest(vcpu->kvm)) {
@ -1248,7 +1261,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
svm_clr_intercept(svm, INTERCEPT_PAUSE);
}
svm_check_invpcid(svm);
svm_recalc_instruction_intercepts(vcpu, svm);
/*
* If the host supports V_SPEC_CTRL then disable the interception
@ -1424,6 +1437,9 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
struct svm_cpu_data *sd = per_cpu(svm_data, vcpu->cpu);
if (sev_es_guest(vcpu->kvm))
sev_es_unmap_ghcb(svm);
if (svm->guest_state_loaded)
return;
@ -1445,8 +1461,8 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
}
}
if (static_cpu_has(X86_FEATURE_RDTSCP))
kvm_set_user_return_msr(TSC_AUX_URET_SLOT, svm->tsc_aux, -1ull);
if (likely(tsc_aux_uret_slot >= 0))
kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
svm->guest_state_loaded = true;
}
@ -2655,11 +2671,6 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
break;
case MSR_TSC_AUX:
if (!boot_cpu_has(X86_FEATURE_RDTSCP))
return 1;
if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
return 1;
msr_info->data = svm->tsc_aux;
break;
/*
@ -2876,30 +2887,13 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
svm->sysenter_esp_hi = guest_cpuid_is_intel(vcpu) ? (data >> 32) : 0;
break;
case MSR_TSC_AUX:
if (!boot_cpu_has(X86_FEATURE_RDTSCP))
return 1;
if (!msr->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
return 1;
/*
* Per Intel's SDM, bits 63:32 are reserved, but AMD's APM has
* incomplete and conflicting architectural behavior. Current
* AMD CPUs completely ignore bits 63:32, i.e. they aren't
* reserved and always read as zeros. Emulate AMD CPU behavior
* to avoid explosions if the vCPU is migrated from an AMD host
* to an Intel host.
*/
data = (u32)data;
/*
* TSC_AUX is usually changed only during boot and never read
* directly. Intercept TSC_AUX instead of exposing it to the
* guest via direct_access_msrs, and switch it via user return.
*/
preempt_disable();
r = kvm_set_user_return_msr(TSC_AUX_URET_SLOT, data, -1ull);
r = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
preempt_enable();
if (r)
return 1;
@ -3084,6 +3078,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[SVM_EXIT_STGI] = stgi_interception,
[SVM_EXIT_CLGI] = clgi_interception,
[SVM_EXIT_SKINIT] = skinit_interception,
[SVM_EXIT_RDTSCP] = kvm_handle_invalid_op,
[SVM_EXIT_WBINVD] = kvm_emulate_wbinvd,
[SVM_EXIT_MONITOR] = kvm_emulate_monitor,
[SVM_EXIT_MWAIT] = kvm_emulate_mwait,
@ -3972,8 +3967,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
svm->nrips_enabled = kvm_cpu_cap_has(X86_FEATURE_NRIPS) &&
guest_cpuid_has(vcpu, X86_FEATURE_NRIPS);
/* Check again if INVPCID interception if required */
svm_check_invpcid(svm);
svm_recalc_instruction_intercepts(vcpu, svm);
/* For sev guests, the memory encryption bit is not reserved in CR3. */
if (sev_guest(vcpu->kvm)) {

Some files were not shown because too many files have changed in this diff Show More