powerpc updates for 6.10


Merge tag 'powerpc-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Enable BPF Kernel Functions (kfuncs) in the powerpc BPF JIT.

 - Allow per-process DEXCR (Dynamic Execution Control Register) settings
   via prctl, notably NPHIE which controls hashst/hashchk for ROP
   protection.

 - Install powerpc selftests in sub-directories. Note this changes the
   way run_kselftest.sh needs to be invoked for powerpc selftests.

 - Change fadump (Firmware Assisted Dump) to better handle memory
   add/remove.

 - Add support for passing additional parameters to the fadump kernel.

 - Add support for updating the kdump image on CPU/memory add/remove
   events.

 - Other small features, cleanups and fixes.

Thanks to Andrew Donnellan, Andy Shevchenko, Aneesh Kumar K.V, Arnd
Bergmann, Benjamin Gray, Bjorn Helgaas, Christian Zigotzky, Christophe
Jaillet, Christophe Leroy, Colin Ian King, Cédric Le Goater, Dr. David
Alan Gilbert, Erhard Furtner, Frank Li, GUO Zihua, Ganesh Goudar, Geoff
Levand, Ghanshyam Agrawal, Greg Kurz, Hari Bathini, Joel Stanley, Justin
Stitt, Kunwu Chan, Li Yang, Lidong Zhong, Madhavan Srinivasan, Mahesh
Salgaonkar, Masahiro Yamada, Matthias Schiffer, Naresh Kamboju, Nathan
Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Miehlbradt, Ran Wang,
Randy Dunlap, Ritesh Harjani, Sachin Sant, Shirisha Ganta, Shrikanth
Hegde, Sourabh Jain, Stephen Rothwell, sundar, Thorsten Blum, Vaibhav
Jain, Xiaowei Bao, Yang Li, and Zhao Chenhui.

* tag 'powerpc-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (85 commits)
  powerpc/fadump: Fix section mismatch warning
  powerpc/85xx: fix compile error without CONFIG_CRASH_DUMP
  powerpc/fadump: update documentation about bootargs_append
  powerpc/fadump: pass additional parameters when fadump is active
  powerpc/fadump: setup additional parameters for dump capture kernel
  powerpc/pseries/fadump: add support for multiple boot memory regions
  selftests/powerpc/dexcr: Fix spelling mistake "predicition" -> "prediction"
  KVM: PPC: Book3S HV nestedv2: Fix an error handling path in gs_msg_ops_kvmhv_nestedv2_config_fill_info()
  KVM: PPC: Fix documentation for ppc mmu caps
  KVM: PPC: code cleanup for kvmppc_book3s_irqprio_deliver
  KVM: PPC: Book3S HV nestedv2: Cancel pending DEC exception
  powerpc/xmon: Check cpu id in commands "c#", "dp#" and "dx#"
  powerpc/code-patching: Use dedicated memory routines for patching
  powerpc/code-patching: Test patch_instructions() during boot
  powerpc64/kasan: Pass virtual addresses to kasan_init_phys_region()
  powerpc: rename SPRN_HID2 define to SPRN_HID2_750FX
  powerpc: Fix typos
  powerpc/eeh: Fix spelling of the word "auxillary" and update comment
  macintosh/ams: Fix unused variable warning
  powerpc/Makefile: Remove bits related to the previous use of -mcmodel=large
  ...
Linus Torvalds 2024-05-17 09:05:46 -07:00
commit ff2632d7d0
199 changed files with 3052 additions and 1270 deletions

View File

@ -423,7 +423,7 @@ What: /sys/devices/system/cpu/cpuX/cpufreq/throttle_stats
/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/occ_reset
Date: March 2016
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: POWERNV CPUFreq driver's frequency throttle stats directory and
attributes
@ -473,7 +473,7 @@ What: /sys/devices/system/cpu/cpufreq/policyX/throttle_stats
/sys/devices/system/cpu/cpufreq/policyX/throttle_stats/occ_reset
Date: March 2016
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: POWERNV CPUFreq driver's frequency throttle stats directory and
attributes
@ -608,7 +608,7 @@ Description: Umwait control
What: /sys/devices/system/cpu/svm
Date: August 2019
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: Secure Virtual Machine
If 1, it means the system is using the Protected Execution
@ -617,7 +617,7 @@ Description: Secure Virtual Machine
What: /sys/devices/system/cpu/cpuX/purr
Date: Apr 2005
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: PURR ticks for this CPU since the system boot.
The Processor Utilization Resources Register (PURR) is
@ -628,7 +628,7 @@ Description: PURR ticks for this CPU since the system boot.
What: /sys/devices/system/cpu/cpuX/spurr
Date: Dec 2006
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: SPURR ticks for this CPU since the system boot.
The Scaled Processor Utilization Resources Register
@ -640,7 +640,7 @@ Description: SPURR ticks for this CPU since the system boot.
What: /sys/devices/system/cpu/cpuX/idle_purr
Date: Apr 2020
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: PURR ticks for cpuX when it was idle.
This sysfs interface exposes the number of PURR ticks
@ -648,7 +648,7 @@ Description: PURR ticks for cpuX when it was idle.
What: /sys/devices/system/cpu/cpuX/idle_spurr
Date: Apr 2020
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: SPURR ticks for cpuX when it was idle.
This sysfs interface exposes the number of SPURR ticks

View File

@ -1,6 +1,6 @@
What: /sys/firmware/opal/powercap
Date: August 2017
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: Powercap directory for Powernv (P8, P9) servers
Each folder in this directory contains a
@ -11,7 +11,7 @@ What: /sys/firmware/opal/powercap/system-powercap
/sys/firmware/opal/powercap/system-powercap/powercap-max
/sys/firmware/opal/powercap/system-powercap/powercap-current
Date: August 2017
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: System powercap directory and attributes applicable for
Powernv (P8, P9) servers

View File

@ -1,6 +1,6 @@
What: /sys/firmware/opal/psr
Date: August 2017
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: Power-Shift-Ratio directory for Powernv P9 servers
Power-Shift-Ratio allows to provide hints the firmware
@ -10,7 +10,7 @@ Description: Power-Shift-Ratio directory for Powernv P9 servers
What: /sys/firmware/opal/psr/cpu_to_gpu_X
Date: August 2017
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: PSR sysfs attributes for Powernv P9 servers
Power-Shift-Ratio between CPU and GPU for a given chip

View File

@ -1,6 +1,6 @@
What: /sys/firmware/opal/sensor_groups
Date: August 2017
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: Sensor groups directory for POWER9 powernv servers
Each folder in this directory contains a sensor group
@ -11,7 +11,7 @@ Description: Sensor groups directory for POWER9 powernv servers
What: /sys/firmware/opal/sensor_groups/<sensor_group_name>/clear
Date: August 2017
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: Sysfs file to clear the min-max of all the sensors
belonging to the group.

View File

@ -1,6 +1,6 @@
What: /sys/firmware/papr/energy_scale_info
Date: February 2022
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: Directory hosting a set of platform attributes like
energy/frequency on Linux running as a PAPR guest.
@ -10,20 +10,20 @@ Description: Directory hosting a set of platform attributes like
What: /sys/firmware/papr/energy_scale_info/<id>
Date: February 2022
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: Energy, frequency attributes directory for POWERVM servers
What: /sys/firmware/papr/energy_scale_info/<id>/desc
Date: February 2022
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: String description of the energy attribute of <id>
What: /sys/firmware/papr/energy_scale_info/<id>/value
Date: February 2022
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: Numeric value of the energy attribute of <id>
What: /sys/firmware/papr/energy_scale_info/<id>/value_desc
Date: February 2022
Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
Contact: Linux for PowerPC mailing list <linuxppc-dev@lists.ozlabs.org>
Description: String value of the energy attribute of <id>

View File

@ -38,3 +38,21 @@ Contact: linuxppc-dev@lists.ozlabs.org
Description: read only
Provide information about the amount of memory reserved by
FADump to save the crash dump in bytes.
What: /sys/kernel/fadump/hotplug_ready
Date: Apr 2024
Contact: linuxppc-dev@lists.ozlabs.org
Description: read only
Kdump udev rule re-registers fadump on memory add/remove events,
primarily to update the elfcorehdr. This sysfs indicates the
kdump udev rule that fadump re-registration is not required on
memory add/remove events because elfcorehdr is now prepared in
the second/fadump kernel.
User: kexec-tools
What: /sys/kernel/fadump/bootargs_append
Date: May 2024
Contact: linuxppc-dev@lists.ozlabs.org
Description: read/write
This is a special sysfs file available to setup additional
parameters to be passed to capture kernel.
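
The two entries above can be exercised from userspace with ordinary file I/O. The following is a minimal sketch in C, not code from kexec-tools: the helper names `read_flag_file` and `write_param_file` are hypothetical, and it assumes `hotplug_ready` reads back as a decimal integer, which the ABI text does not spell out.

```c
#include <stdio.h>
#include <string.h>

/* Paths as documented in the ABI entries above. */
#define FADUMP_HOTPLUG_READY   "/sys/kernel/fadump/hotplug_ready"
#define FADUMP_BOOTARGS_APPEND "/sys/kernel/fadump/bootargs_append"

/*
 * Read an integer flag from a sysfs-style file.
 * Returns the parsed value, or -1 if the file is missing or unparsable.
 * ASSUMPTION: the file holds a single decimal integer.
 */
int read_flag_file(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}

/*
 * Write a parameter string to a sysfs-style file.
 * Returns 0 on success, -1 on failure.
 */
int write_param_file(const char *path, const char *params)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(params, f) >= 0;
	/* Buffered write errors only surface at flush/close time. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Under these assumptions, a udev helper could skip fadump re-registration when `read_flag_file(FADUMP_HOTPLUG_READY)` returns 1, and a privileged tool could pass extra capture-kernel parameters with `write_param_file(FADUMP_BOOTARGS_APPEND, ...)`.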

View File

@ -36,8 +36,145 @@ state for a process.
Configuration
=============
The DEXCR is currently unconfigurable. All threads are run with the
NPHIE aspect enabled.
prctl
-----
A process can control its own userspace DEXCR value using the
``PR_PPC_GET_DEXCR`` and ``PR_PPC_SET_DEXCR`` pair of
:manpage:`prctl(2)` commands. These calls have the form::
prctl(PR_PPC_GET_DEXCR, unsigned long which, 0, 0, 0);
prctl(PR_PPC_SET_DEXCR, unsigned long which, unsigned long ctrl, 0, 0);
The possible 'which' and 'ctrl' values are as follows. Note there is no relation
between the 'which' value and the DEXCR aspect's index.
.. flat-table::
:header-rows: 1
:widths: 2 7 1
* - ``prctl()`` which
- Aspect name
- Aspect index
* - ``PR_PPC_DEXCR_SBHE``
- Speculative Branch Hint Enable (SBHE)
- 0
* - ``PR_PPC_DEXCR_IBRTPD``
- Indirect Branch Recurrent Target Prediction Disable (IBRTPD)
- 3
* - ``PR_PPC_DEXCR_SRAPD``
- Subroutine Return Address Prediction Disable (SRAPD)
- 4
* - ``PR_PPC_DEXCR_NPHIE``
- Non-Privileged Hash Instruction Enable (NPHIE)
- 5
.. flat-table::
:header-rows: 1
:widths: 2 8
* - ``prctl()`` ctrl
- Meaning
* - ``PR_PPC_DEXCR_CTRL_EDITABLE``
- This aspect can be configured with PR_PPC_SET_DEXCR (get only)
* - ``PR_PPC_DEXCR_CTRL_SET``
- This aspect is set / set this aspect
* - ``PR_PPC_DEXCR_CTRL_CLEAR``
- This aspect is clear / clear this aspect
* - ``PR_PPC_DEXCR_CTRL_SET_ONEXEC``
- This aspect will be set after exec / set this aspect after exec
* - ``PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC``
- This aspect will be clear after exec / clear this aspect after exec
Note that
* which is a plain value, not a bitmask. Aspects must be worked with individually.
* ctrl is a bitmask. ``PR_PPC_GET_DEXCR`` returns both the current and onexec
configuration. For example, ``PR_PPC_GET_DEXCR`` may return
``PR_PPC_DEXCR_CTRL_EDITABLE | PR_PPC_DEXCR_CTRL_SET |
PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC``. This would indicate the aspect is currently
set, it will be cleared when you run exec, and you can change this with the
``PR_PPC_SET_DEXCR`` prctl.
* The set/clear terminology refers to setting/clearing the bit in the DEXCR.
For example::
prctl(PR_PPC_SET_DEXCR, PR_PPC_DEXCR_IBRTPD, PR_PPC_DEXCR_CTRL_SET, 0, 0);
will set the IBRTPD aspect bit in the DEXCR, causing indirect branch prediction
to be disabled.
* The status returned by ``PR_PPC_GET_DEXCR`` represents what value the process
would like applied. It does not include any alternative overrides, such as if
the hypervisor is enforcing the aspect be set. To see the true DEXCR state
software should read the appropriate SPRs directly.
* The aspect state when starting a process is copied from the parent's state on
:manpage:`fork(2)`. The state is reset to a fixed value on
:manpage:`execve(2)`. The PR_PPC_SET_DEXCR prctl() can control both of these
values.
* The ``*_ONEXEC`` controls do not change the current process's DEXCR.
Use ``PR_PPC_SET_DEXCR`` with one of ``PR_PPC_DEXCR_CTRL_SET`` or
``PR_PPC_DEXCR_CTRL_CLEAR`` to edit a given aspect.
Common error codes for both getting and setting the DEXCR are as follows:
.. flat-table::
:header-rows: 1
:widths: 2 8
* - Error
- Meaning
* - ``EINVAL``
- The DEXCR is not supported by the kernel.
* - ``ENODEV``
- The aspect is not recognised by the kernel or not supported by the
hardware.
``PR_PPC_SET_DEXCR`` may also report the following error codes:
.. flat-table::
:header-rows: 1
:widths: 2 8
* - Error
- Meaning
* - ``EINVAL``
- The ctrl value contains unrecognised flags.
* - ``EINVAL``
- The ctrl value contains mutually conflicting flags (e.g.,
``PR_PPC_DEXCR_CTRL_SET | PR_PPC_DEXCR_CTRL_CLEAR``)
* - ``EPERM``
- This aspect cannot be modified with prctl() (check for the
PR_PPC_DEXCR_CTRL_EDITABLE flag with PR_PPC_GET_DEXCR).
* - ``EPERM``
- The process does not have sufficient privilege to perform the operation.
For example, clearing NPHIE on exec is a privileged operation (a process
can still clear its own NPHIE aspect without privileges).
This interface allows a process to control its own DEXCR aspects, and also set
the initial DEXCR value for any children in its process tree (up to the next
child to use an ``*_ONEXEC`` control). This allows fine-grained control over the
default value of the DEXCR, for example allowing containers to run with different
default values.
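
The prctl interface described above can be sketched in C roughly as follows. This is an illustration, not code from the patch series: the helper names `dexcr_get`, `dexcr_set` and `dexcr_ctrl_to_str` are hypothetical, and the fallback constant values are an assumption about what `include/uapi/linux/prctl.h` defines in 6.10, so prefer the definitions from your installed kernel headers.

```c
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/*
 * Fallback definitions for toolchains whose headers predate this interface.
 * ASSUMPTION: these values mirror include/uapi/linux/prctl.h in 6.10;
 * verify against your kernel headers before relying on them.
 */
#ifndef PR_PPC_GET_DEXCR
#define PR_PPC_GET_DEXCR               72
#define PR_PPC_SET_DEXCR               73
#define PR_PPC_DEXCR_SBHE              0
#define PR_PPC_DEXCR_IBRTPD            1
#define PR_PPC_DEXCR_SRAPD             2
#define PR_PPC_DEXCR_NPHIE             3
#define PR_PPC_DEXCR_CTRL_EDITABLE     0x1
#define PR_PPC_DEXCR_CTRL_SET          0x2
#define PR_PPC_DEXCR_CTRL_CLEAR        0x4
#define PR_PPC_DEXCR_CTRL_SET_ONEXEC   0x8
#define PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC 0x10
#endif

/*
 * Query one aspect. 'which' is a plain value, not a bitmask.
 * Returns the ctrl bitmask (>= 0), or -1 with errno set, for example
 * EINVAL when the kernel has no DEXCR support.
 */
long dexcr_get(unsigned long which)
{
	return prctl(PR_PPC_GET_DEXCR, which, 0UL, 0UL, 0UL);
}

/* Apply a ctrl value to one aspect. Returns 0, or -1 with errno set. */
int dexcr_set(unsigned long which, unsigned long ctrl)
{
	return prctl(PR_PPC_SET_DEXCR, which, ctrl, 0UL, 0UL);
}

/* Render a ctrl bitmask as text, e.g. "editable set clear-onexec". */
void dexcr_ctrl_to_str(unsigned long ctrl, char *buf, size_t len)
{
	static const struct {
		unsigned long bit;
		const char *name;
	} flags[] = {
		{ PR_PPC_DEXCR_CTRL_EDITABLE,     "editable" },
		{ PR_PPC_DEXCR_CTRL_SET,          "set" },
		{ PR_PPC_DEXCR_CTRL_CLEAR,        "clear" },
		{ PR_PPC_DEXCR_CTRL_SET_ONEXEC,   "set-onexec" },
		{ PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC, "clear-onexec" },
	};
	size_t i, used = 0;

	if (len == 0)
		return;
	buf[0] = '\0';
	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		int n;

		if (!(ctrl & flags[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output was truncated: stop here */
		used += (size_t)n;
	}
}
```

With these helpers, `dexcr_get(PR_PPC_DEXCR_NPHIE)` reports the requested NPHIE state for the calling process, and `dexcr_set(PR_PPC_DEXCR_NPHIE, PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC)` asks for NPHIE to be cleared after the next exec, which per the error table above is expected to fail with ``EPERM`` for an unprivileged caller.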
coredump and ptrace

View File

@ -134,12 +134,12 @@ that are run. If there is dump data, then the
memory is held.
If there is no waiting dump data, then only the memory required to
hold CPU state, HPTE region, boot memory dump, FADump header and
elfcore header, is usually reserved at an offset greater than boot
memory size (see Fig. 1). This area is *not* released: this region
will be kept permanently reserved, so that it can act as a receptacle
for a copy of the boot memory content in addition to CPU state and
HPTE region, in the case a crash does occur.
hold CPU state, HPTE region, boot memory dump, and FADump header is
usually reserved at an offset greater than boot memory size (see Fig. 1).
This area is *not* released: this region will be kept permanently
reserved, so that it can act as a receptacle for a copy of the boot
memory content in addition to CPU state and HPTE region, in the case
a crash does occur.
Since this reserved memory area is used only after the system crash,
there is no point in blocking this significant chunk of memory from
@ -153,22 +153,22 @@ that were present in CMA region::
o Memory Reservation during first kernel
Low memory Top of memory
0 boot memory size |<--- Reserved dump area --->| |
| | | Permanent Reservation | |
V V | | V
+-----------+-----/ /---+---+----+-------+-----+-----+----+--+
| | |///|////| DUMP | HDR | ELF |////| |
+-----------+-----/ /---+---+----+-------+-----+-----+----+--+
| ^ ^ ^ ^ ^
| | | | | |
\ CPU HPTE / | |
------------------------------ | |
Boot memory content gets transferred | |
to reserved area by firmware at the | |
time of crash. | |
FADump Header |
(meta area) |
Low memory Top of memory
0 boot memory size |<------ Reserved dump area ----->| |
| | | Permanent Reservation | |
V V | | V
+-----------+-----/ /---+---+----+-----------+-------+----+-----+
| | |///|////| DUMP | HDR |////| |
+-----------+-----/ /---+---+----+-----------+-------+----+-----+
| ^ ^ ^ ^ ^
| | | | | |
\ CPU HPTE / | |
-------------------------------- | |
Boot memory content gets transferred | |
to reserved area by firmware at the | |
time of crash. | |
FADump Header |
(meta area) |
|
|
Metadata: This area holds a metadata structure whose
@ -186,13 +186,20 @@ that were present in CMA region::
0 boot memory size |
| |<------------ Crash preserved area ------------>|
V V |<--- Reserved dump area --->| |
+-----------+-----/ /---+---+----+-------+-----+-----+----+--+
| | |///|////| DUMP | HDR | ELF |////| |
+-----------+-----/ /---+---+----+-------+-----+-----+----+--+
| |
V V
Used by second /proc/vmcore
kernel to boot
+----+---+--+-----/ /---+---+----+-------+-----+-----+-------+
| |ELF| | |///|////| DUMP | HDR |/////| |
+----+---+--+-----/ /---+---+----+-------+-----+-----+-------+
| | | | | |
----- ------------------------------ ---------------
\ | |
\ | |
\ | |
\ | ----------------------------
\ | /
\ | /
\ | /
/proc/vmcore
+---+
|///| -> Regions (CPU, HPTE & Metadata) marked like this in the above
@ -200,6 +207,12 @@ that were present in CMA region::
does not have CPU & HPTE regions while Metadata region is
not supported on pSeries currently.
+---+
|ELF| -> elfcorehdr, it is created in second kernel after crash.
+---+
Note: Memory from 0 to the boot memory size is used by second kernel
Fig. 2
@ -353,26 +366,6 @@ TODO:
- Need to come up with the better approach to find out more
accurate boot memory size that is required for a kernel to
boot successfully when booted with restricted memory.
- The FADump implementation introduces a FADump crash info structure
in the scratch area before the ELF core header. The idea of introducing
this structure is to pass some important crash info data to the second
kernel which will help second kernel to populate ELF core header with
correct data before it gets exported through /proc/vmcore. The current
design implementation does not address a possibility of introducing
additional fields (in future) to this structure without affecting
compatibility. Need to come up with the better approach to address this.
The possible approaches are:
1. Introduce version field for version tracking, bump up the version
whenever a new field is added to the structure in future. The version
field can be used to find out what fields are valid for the current
version of the structure.
2. Reserve the area of predefined size (say PAGE_SIZE) for this
structure and have unused area as reserved (initialized to zero)
for future field additions.
The advantage of approach 1 over 2 is we don't need to reserve extra space.
Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

View File

@ -4300,7 +4300,7 @@ operating system that uses the PIT for timing (e.g. Linux 2.4.x).
4.100 KVM_PPC_CONFIGURE_V3_MMU
------------------------------
:Capability: KVM_CAP_PPC_RADIX_MMU or KVM_CAP_PPC_HASH_MMU_V3
:Capability: KVM_CAP_PPC_MMU_RADIX or KVM_CAP_PPC_MMU_HASH_V3
:Architectures: ppc
:Type: vm ioctl
:Parameters: struct kvm_ppc_mmuv3_cfg (in)
@ -4334,7 +4334,7 @@ the Power ISA V3.00, Book III section 5.7.6.1.
4.101 KVM_PPC_GET_RMMU_INFO
---------------------------
:Capability: KVM_CAP_PPC_RADIX_MMU
:Capability: KVM_CAP_PPC_MMU_RADIX
:Architectures: ppc
:Type: vm ioctl
:Parameters: struct kvm_ppc_rmmu_info (out)
@ -8102,7 +8102,7 @@ capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
will disable the use of APIC hardware virtualization even if supported
by the CPU, as it's incompatible with SynIC auto-EOI behavior.
8.3 KVM_CAP_PPC_RADIX_MMU
8.3 KVM_CAP_PPC_MMU_RADIX
-------------------------
:Architectures: ppc
@ -8112,7 +8112,7 @@ available, means that the kernel can support guests using the
radix MMU defined in Power ISA V3.00 (as implemented in the POWER9
processor).
8.4 KVM_CAP_PPC_HASH_MMU_V3
8.4 KVM_CAP_PPC_MMU_HASH_V3
---------------------------
:Architectures: ppc

View File

@ -12652,7 +12652,6 @@ LINUX FOR POWERPC (32-BIT AND 64-BIT)
M: Michael Ellerman <mpe@ellerman.id.au>
R: Nicholas Piggin <npiggin@gmail.com>
R: Christophe Leroy <christophe.leroy@csgroup.eu>
R: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
R: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
L: linuxppc-dev@lists.ozlabs.org
S: Supported
@ -15086,7 +15085,7 @@ F: drivers/phy/marvell/phy-pxa-usb.c
MMU GATHER AND TLB INVALIDATION
M: Will Deacon <will@kernel.org>
M: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
M: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
M: Andrew Morton <akpm@linux-foundation.org>
M: Nick Piggin <npiggin@gmail.com>
M: Peter Zijlstra <peterz@infradead.org>

View File

@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror -Wa,--fatal-warnings
subdir-asflags-$(CONFIG_PPC_WERROR) := -Wa,--fatal-warnings
obj-y += kernel/
obj-y += mm/

View File

@ -687,6 +687,10 @@ config ARCH_SELECTS_CRASH_DUMP
depends on CRASH_DUMP
select RELOCATABLE if PPC64 || 44x || PPC_85xx
config ARCH_SUPPORTS_CRASH_HOTPLUG
def_bool y
depends on PPC64
config FA_DUMP
bool "Firmware-assisted dump"
depends on CRASH_DUMP && PPC64 && (PPC_RTAS || PPC_POWERNV)

View File

@ -114,7 +114,6 @@ LDFLAGS_vmlinux := $(LDFLAGS_vmlinux-y)
ifdef CONFIG_PPC64
ifndef CONFIG_PPC_KERNEL_PCREL
ifeq ($(call cc-option-yn,-mcmodel=medium),y)
# -mcmodel=medium breaks modules because it uses 32bit offsets from
# the TOC pointer to create pointers where possible. Pointers into the
# percpu data area are created by this method.
@ -124,9 +123,6 @@ ifeq ($(call cc-option-yn,-mcmodel=medium),y)
# kernel percpu data space (starting with 0xc...). We need a full
# 64bit relocation for this to work, hence -mcmodel=large.
KBUILD_CFLAGS_MODULE += -mcmodel=large
else
export NO_MINIMAL_TOC := -mno-minimal-toc
endif
endif
endif
@ -139,7 +135,7 @@ CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv1)
CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcall-aixdesc)
endif
endif
CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mcmodel=medium,$(call cc-option,-mminimal-toc))
CFLAGS-$(CONFIG_PPC64) += -mcmodel=medium
CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mno-pointers-to-nested-functions)
CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mlong-double-128)

View File

@ -108,8 +108,8 @@ DTC_FLAGS ?= -p 1024
# these files into the build dir, fix up any includes and ensure that dependent
# files are copied in the right order.
# these need to be seperate variables because they are copied out of different
# directories in the kernel tree. Sure you COULd merge them, but it's a
# these need to be separate variables because they are copied out of different
# directories in the kernel tree. Sure you COULD merge them, but it's a
# cure-is-worse-than-disease situation.
zlib-decomp-$(CONFIG_KERNEL_GZIP) := decompress_inflate.c
zlib-$(CONFIG_KERNEL_GZIP) := inffast.c inflate.c inftrees.c

View File

@ -101,7 +101,7 @@ static void print_err(char *s)
* @input_size: length of the input buffer
* @outbuf: output buffer
* @output_size: length of the output buffer
* @skip number of output bytes to ignore
* @_skip: number of output bytes to ignore
*
* This function takes compressed data from inbuf, decompresses and write it to
* outbuf. Once output_size bytes are written to the output buffer, or the

View File

@ -172,7 +172,7 @@
reg = <0xef602800 0x60>;
interrupt-parent = <&UIC0>;
interrupts = <0x4 0x4>;
/* This thing is a bit weird. It has it's own UIC
/* This thing is a bit weird. It has its own UIC
* that it uses to generate snapshot triggers. We
* don't really support this device yet, and it needs
* work to figure this out.

View File

@ -50,7 +50,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
interrupts = <25 2 0 0>;
};

View File

@ -15,7 +15,7 @@
device_type = "memory";
};
board_ifc: ifc: ifc@ff71e000 {
board_ifc: ifc: memory-controller@ff71e000 {
/* NAND Flash on board */
ranges = <0x0 0x0 0x0 0xff800000 0x00004000>;
reg = <0x0 0xff71e000 0x0 0x2000>;

View File

@ -35,7 +35,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
interrupts = <16 2 0 0 20 2 0 0>;
};

View File

@ -15,7 +15,7 @@
device_type = "memory";
};
ifc: ifc@ff71e000 {
ifc: memory-controller@ff71e000 {
/* NOR, NAND Flash on board */
ranges = <0x0 0x0 0x0 0x88000000 0x08000000
0x1 0x0 0x0 0xff800000 0x00010000>;

View File

@ -35,7 +35,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
/* FIXME: Test whether interrupts are split */
interrupts = <16 2 0 0 20 2 0 0>;
};

View File

@ -42,7 +42,7 @@
device_type = "memory";
};
ifc: ifc@fffe1e000 {
ifc: memory-controller@fffe1e000 {
reg = <0xf 0xffe1e000 0 0x2000>;
ranges = <0x0 0x0 0xf 0xec000000 0x04000000
0x1 0x0 0xf 0xff800000 0x00010000

View File

@ -35,7 +35,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
interrupts = <19 2 0 0>;
};

View File

@ -199,6 +199,10 @@
/include/ "pq3-dma-0.dtsi"
/include/ "pq3-etsec1-0.dtsi"
enet0: ethernet@24000 {
fsl,wake-on-filer;
fsl,pmc-handle = <&etsec1_clk>;
};
/include/ "pq3-etsec1-timer-0.dtsi"
usb@22000 {
@ -222,9 +226,10 @@
};
/include/ "pq3-etsec1-2.dtsi"
ethernet@26000 {
enet2: ethernet@26000 {
cell-index = <1>;
fsl,wake-on-filer;
fsl,pmc-handle = <&etsec3_clk>;
};
usb@2b000 {
@ -249,4 +254,9 @@
reg = <0xe0000 0x1000>;
fsl,has-rstcr;
};
/include/ "pq3-power.dtsi"
power@e0070 {
compatible = "fsl,mpc8536-pmc", "fsl,mpc8548-pmc";
};
};

View File

@ -188,4 +188,6 @@
reg = <0xe0000 0x1000>;
fsl,has-rstcr;
};
/include/ "pq3-power.dtsi"
};

View File

@ -156,4 +156,6 @@
reg = <0xe0000 0x1000>;
fsl,has-rstcr;
};
/include/ "pq3-power.dtsi"
};

View File

@ -193,4 +193,6 @@
reg = <0xe0000 0x1000>;
fsl,has-rstcr;
};
/include/ "pq3-power.dtsi"
};

View File

@ -29,3 +29,19 @@
};
/include/ "p1010si-post.dtsi"
&pci0 {
pcie@0 {
interrupt-map = <
/* IDSEL 0x0 */
/*
*irq[4:5] are active-high
*irq[6:7] are active-low
*/
0000 0x0 0x0 0x1 &mpic 0x4 0x2 0x0 0x0
0000 0x0 0x0 0x2 &mpic 0x5 0x2 0x0 0x0
0000 0x0 0x0 0x3 &mpic 0x6 0x1 0x0 0x0
0000 0x0 0x0 0x4 &mpic 0x7 0x1 0x0 0x0
>;
};
};

View File

@ -56,3 +56,19 @@
};
/include/ "p1010si-post.dtsi"
&pci0 {
pcie@0 {
interrupt-map = <
/* IDSEL 0x0 */
/*
*irq[4:5] are active-high
*irq[6:7] are active-low
*/
0000 0x0 0x0 0x1 &mpic 0x4 0x2 0x0 0x0
0000 0x0 0x0 0x2 &mpic 0x5 0x2 0x0 0x0
0000 0x0 0x0 0x3 &mpic 0x6 0x1 0x0 0x0
0000 0x0 0x0 0x4 &mpic 0x7 0x1 0x0 0x0
>;
};
};

View File

@ -215,19 +215,3 @@
phy-connection-type = "sgmii";
};
};
&pci0 {
pcie@0 {
interrupt-map = <
/* IDSEL 0x0 */
/*
*irq[4:5] are active-high
*irq[6:7] are active-low
*/
0000 0x0 0x0 0x1 &mpic 0x4 0x2 0x0 0x0
0000 0x0 0x0 0x2 &mpic 0x5 0x2 0x0 0x0
0000 0x0 0x0 0x3 &mpic 0x6 0x1 0x0 0x0
0000 0x0 0x0 0x4 &mpic 0x7 0x1 0x0 0x0
>;
};
};

View File

@ -36,7 +36,7 @@ memory {
device_type = "memory";
};
board_ifc: ifc: ifc@ffe1e000 {
board_ifc: ifc: memory-controller@ffe1e000 {
/* NOR, NAND Flashes and CPLD on board */
ranges = <0x0 0x0 0x0 0xee000000 0x02000000
0x1 0x0 0x0 0xff800000 0x00010000

View File

@ -36,7 +36,7 @@ memory {
device_type = "memory";
};
board_ifc: ifc: ifc@fffe1e000 {
board_ifc: ifc: memory-controller@fffe1e000 {
/* NOR, NAND Flashes and CPLD on board */
ranges = <0x0 0x0 0xf 0xee000000 0x02000000
0x1 0x0 0xf 0xff800000 0x00010000

View File

@ -35,7 +35,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
interrupts = <16 2 0 0 19 2 0 0>;
};
@ -183,9 +183,23 @@
/include/ "pq3-etsec2-1.dtsi"
/include/ "pq3-etsec2-2.dtsi"
enet0: ethernet@b0000 {
fsl,pmc-handle = <&etsec1_clk>;
};
enet1: ethernet@b1000 {
fsl,pmc-handle = <&etsec2_clk>;
};
enet2: ethernet@b2000 {
fsl,pmc-handle = <&etsec3_clk>;
};
global-utilities@e0000 {
compatible = "fsl,p1010-guts";
reg = <0xe0000 0x1000>;
fsl,has-rstcr;
};
/include/ "pq3-power.dtsi"
};

View File

@ -163,14 +163,17 @@
/include/ "pq3-etsec2-0.dtsi"
enet0: enet0_grp2: ethernet@b0000 {
fsl,pmc-handle = <&etsec1_clk>;
};
/include/ "pq3-etsec2-1.dtsi"
enet1: enet1_grp2: ethernet@b1000 {
fsl,pmc-handle = <&etsec2_clk>;
};
/include/ "pq3-etsec2-2.dtsi"
enet2: enet2_grp2: ethernet@b2000 {
fsl,pmc-handle = <&etsec3_clk>;
};
global-utilities@e0000 {
@ -178,6 +181,8 @@
reg = <0xe0000 0x1000>;
fsl,has-rstcr;
};
/include/ "pq3-power.dtsi"
};
/include/ "pq3-etsec2-grp2-0.dtsi"

View File

@ -159,14 +159,17 @@
/include/ "pq3-etsec2-0.dtsi"
enet0: enet0_grp2: ethernet@b0000 {
fsl,pmc-handle = <&etsec1_clk>;
};
/include/ "pq3-etsec2-1.dtsi"
enet1: enet1_grp2: ethernet@b1000 {
fsl,pmc-handle = <&etsec2_clk>;
};
/include/ "pq3-etsec2-2.dtsi"
enet2: enet2_grp2: ethernet@b2000 {
fsl,pmc-handle = <&etsec3_clk>;
};
global-utilities@e0000 {
@ -174,6 +177,8 @@
reg = <0xe0000 0x1000>;
fsl,has-rstcr;
};
/include/ "pq3-power.dtsi"
};
&qe {

View File

@ -225,11 +225,13 @@
/include/ "pq3-etsec2-0.dtsi"
enet0: enet0_grp2: ethernet@b0000 {
fsl,wake-on-filer;
fsl,pmc-handle = <&etsec1_clk>;
};
/include/ "pq3-etsec2-1.dtsi"
enet1: enet1_grp2: ethernet@b1000 {
fsl,wake-on-filer;
fsl,pmc-handle = <&etsec2_clk>;
};
global-utilities@e0000 {
@ -238,9 +240,10 @@
fsl,has-rstcr;
};
/include/ "pq3-power.dtsi"
power@e0070 {
compatible = "fsl,mpc8536-pmc", "fsl,mpc8548-pmc";
reg = <0xe0070 0x20>;
compatible = "fsl,p1022-pmc", "fsl,mpc8536-pmc",
"fsl,mpc8548-pmc";
};
};


@ -178,6 +178,10 @@
compatible = "fsl-usb2-dr-v1.6", "fsl-usb2-dr";
};
/include/ "pq3-etsec1-0.dtsi"
enet0: ethernet@24000 {
fsl,pmc-handle = <&etsec1_clk>;
};
/include/ "pq3-etsec1-timer-0.dtsi"
ptp_clock@24e00 {
@ -186,7 +190,15 @@
/include/ "pq3-etsec1-1.dtsi"
enet1: ethernet@25000 {
fsl,pmc-handle = <&etsec2_clk>;
};
/include/ "pq3-etsec1-2.dtsi"
enet2: ethernet@26000 {
fsl,pmc-handle = <&etsec3_clk>;
};
/include/ "pq3-esdhc-0.dtsi"
sdhc@2e000 {
compatible = "fsl,p2020-esdhc", "fsl,esdhc";
@ -202,8 +214,5 @@
fsl,has-rstcr;
};
pmc: power@e0070 {
compatible = "fsl,mpc8548-pmc";
reg = <0xe0070 0x20>;
};
/include/ "pq3-power.dtsi"
};


@ -0,0 +1,19 @@
// SPDX-License-Identifier: (GPL-2.0+)
/*
* Copyright 2024 NXP
*/
power@e0070 {
compatible = "fsl,mpc8548-pmc";
reg = <0xe0070 0x20>;
etsec1_clk: soc-clk@24 {
fsl,pmcdr-mask = <0x00000080>;
};
etsec2_clk: soc-clk@25 {
fsl,pmcdr-mask = <0x00000040>;
};
etsec3_clk: soc-clk@26 {
fsl,pmcdr-mask = <0x00000020>;
};
};


@ -52,7 +52,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
interrupts = <25 2 0 0>;
};


@ -91,7 +91,7 @@
board-control@2,0 {
#address-cells = <1>;
#size-cells = <1>;
compatible = "fsl,t1024-cpld";
compatible = "fsl,t1024-cpld", "fsl,deepsleep-cpld";
reg = <3 0 0x300>;
ranges = <0 3 0 0x300>;
bank-width = <1>;


@ -104,7 +104,7 @@
ifc: localbus@ffe124000 {
cpld@3,0 {
compatible = "fsl,t1040rdb-cpld";
compatible = "fsl,t104xrdb-cpld", "fsl,deepsleep-cpld";
};
};
};


@ -52,7 +52,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
interrupts = <25 2 0 0>;
};


@ -68,7 +68,7 @@
ifc: localbus@ffe124000 {
cpld@3,0 {
compatible = "fsl,t1042rdb-cpld";
compatible = "fsl,t104xrdb-cpld", "fsl,deepsleep-cpld";
};
};
};


@ -41,7 +41,7 @@
ifc: localbus@ffe124000 {
cpld@3,0 {
compatible = "fsl,t1042rdb_pi-cpld";
compatible = "fsl,t104xrdb-cpld", "fsl,deepsleep-cpld";
};
};


@ -50,7 +50,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
interrupts = <25 2 0 0>;
};


@ -50,7 +50,7 @@
&ifc {
#address-cells = <2>;
#size-cells = <1>;
compatible = "fsl,ifc", "simple-bus";
compatible = "fsl,ifc";
interrupts = <25 2 0 0>;
};


@ -188,7 +188,7 @@ static inline void prep_esm_blob(struct addr_range vmlinux, void *chosen) { }
/* A buffer that may be edited by tools operating on a zImage binary so as to
* edit the command line passed to vmlinux (by setting /chosen/bootargs).
* The buffer is put in it's own section so that tools may locate it easier.
* The buffer is put in its own section so that tools may locate it easier.
*/
static char cmdline[BOOT_COMMAND_LINE_SIZE]
__attribute__((__section__("__builtin_cmdline")));


@ -25,7 +25,7 @@ BSS_STACK(4096);
/* A buffer that may be edited by tools operating on a zImage binary so as to
* edit the command line passed to vmlinux (by setting /chosen/bootargs).
* The buffer is put in it's own section so that tools may locate it easier.
* The buffer is put in its own section so that tools may locate it easier.
*/
static char cmdline[BOOT_COMMAND_LINE_SIZE]


@ -29,7 +29,7 @@ static __always_inline bool cpu_has_feature(unsigned long feature)
#endif
#ifdef CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
if (!static_key_initialized) {
if (!static_key_feature_checks_initialized) {
printk("Warning! cpu_has_feature() used prior to jump label init!\n");
dump_stack();
return early_cpu_has_feature(feature);


@ -82,7 +82,7 @@ struct eeh_pe {
int false_positives; /* Times of reported #ff's */
atomic_t pass_dev_cnt; /* Count of passed through devs */
struct eeh_pe *parent; /* Parent PE */
void *data; /* PE auxillary data */
void *data; /* PE auxiliary data */
struct list_head child_list; /* List of PEs below this PE */
struct list_head child; /* Memb. child_list/eeh_phb_pe */
struct list_head edevs; /* List of eeh_dev in this PE */


@ -42,13 +42,38 @@ static inline u64 fadump_str_to_u64(const char *str)
#define FADUMP_CPU_UNKNOWN (~((u32)0))
#define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPINF")
/*
* The introduction of new fields in the fadump crash info header has
* led to a change in the magic key from `FADMPINF` to `FADMPSIG` for
* identifying a kernel crash from an old kernel.
*
* To prevent the need for further changes to the magic number in the
* event of future modifications to the fadump crash info header, a
* version field has been introduced to track the fadump crash info
* header version.
*
* Consider a few points before adding new members to the fadump crash info
* header structure:
*
* - Append new members; avoid adding them in between.
* - Non-primitive members should have a size member as well.
* - For every change in the fadump header, increment the
* fadump header version. This helps the updated kernel decide how to
* handle kernel dumps from older kernels.
*/
#define FADUMP_CRASH_INFO_MAGIC_OLD fadump_str_to_u64("FADMPINF")
#define FADUMP_CRASH_INFO_MAGIC fadump_str_to_u64("FADMPSIG")
#define FADUMP_HEADER_VERSION 1
/* fadump crash info structure */
struct fadump_crash_info_header {
u64 magic_number;
u64 elfcorehdr_addr;
u32 version;
u32 crashing_cpu;
u64 vmcoreinfo_raddr;
u64 vmcoreinfo_size;
u32 pt_regs_sz;
u32 cpu_mask_sz;
struct pt_regs regs;
struct cpumask cpu_mask;
};
@ -94,9 +119,13 @@ struct fw_dump {
u64 boot_mem_regs_cnt;
unsigned long fadumphdr_addr;
u64 elfcorehdr_addr;
u64 elfcorehdr_size;
unsigned long cpu_notes_buf_vaddr;
unsigned long cpu_notes_buf_size;
unsigned long param_area;
/*
* Maximum size supported by firmware to copy from source to
* destination address per entry.
@ -111,6 +140,7 @@ struct fw_dump {
unsigned long dump_active:1;
unsigned long dump_registered:1;
unsigned long nocma:1;
unsigned long param_area_supported:1;
struct fadump_ops *ops;
};
@ -129,6 +159,7 @@ struct fadump_ops {
struct seq_file *m);
void (*fadump_trigger)(struct fadump_crash_info_header *fdh,
const char *msg);
int (*fadump_max_boot_mem_rgns)(void);
};
/* Helper functions */
@ -136,7 +167,6 @@ s32 __init fadump_setup_cpu_notes_buf(u32 num_cpus);
void fadump_free_cpu_notes_buf(void);
u32 *__init fadump_regs_to_elf_notes(u32 *buf, struct pt_regs *regs);
void __init fadump_update_elfcore_header(char *bufp);
bool is_fadump_boot_mem_contiguous(void);
bool is_fadump_reserved_mem_contiguous(void);
#else /* !CONFIG_PRESERVE_FA_DUMP */
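
The comment in the hunk above describes a two-level scheme: a magic value that distinguishes the old header layout from the new one, plus a version field for later changes. A minimal userspace sketch of how a reader might classify a header under that scheme follows. The names `sketch_hdr`, `pack_magic`, `classify_header` and the `HDR_*` values are hypothetical, the struct keeps only the two fields needed for the check, and the policy for unknown versions is an assumption rather than something the diff shows.

```c
#include <stdint.h>

/* Assumed to mirror FADUMP_HEADER_VERSION from the hunk above. */
#define SKETCH_HEADER_VERSION 1u

enum hdr_kind { HDR_UNKNOWN, HDR_LEGACY, HDR_CURRENT, HDR_NEWER };

/* Hypothetical, trimmed-down header: only the fields used for classification. */
struct sketch_hdr {
	uint64_t magic_number;
	uint32_t version;
};

/* Pack up to eight characters into a u64, first character in the top byte. */
static uint64_t pack_magic(const char *s)
{
	uint64_t val = 0;

	for (int i = 0; i < 8; i++) {
		val <<= 8;
		if (*s)
			val |= (unsigned char)*s++;
	}
	return val;
}

static enum hdr_kind classify_header(const struct sketch_hdr *h)
{
	/* Old magic: the layout predates the version field, so ignore it. */
	if (h->magic_number == pack_magic("FADMPINF"))
		return HDR_LEGACY;
	if (h->magic_number != pack_magic("FADMPSIG"))
		return HDR_UNKNOWN;
	if (h->version == SKETCH_HEADER_VERSION)
		return HDR_CURRENT;
	/* Assumed policy: report versions above ours as newer, anything else as unknown. */
	return h->version > SKETCH_HEADER_VERSION ? HDR_NEWER : HDR_UNKNOWN;
}
```

Because the version is only consulted after the new magic matches, a later layout change would need a version bump but no new magic, which is the point the comment makes.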


@ -19,12 +19,14 @@ extern int is_fadump_active(void);
extern int should_fadump_crash(void);
extern void crash_fadump(struct pt_regs *, const char *);
extern void fadump_cleanup(void);
extern void fadump_append_bootargs(void);
#else /* CONFIG_FA_DUMP */
static inline int is_fadump_active(void) { return 0; }
static inline int should_fadump_crash(void) { return 0; }
static inline void crash_fadump(struct pt_regs *regs, const char *str) { }
static inline void fadump_cleanup(void) { }
static inline void fadump_append_bootargs(void) { }
#endif /* !CONFIG_FA_DUMP */
#if defined(CONFIG_FA_DUMP) || defined(CONFIG_PRESERVE_FA_DUMP)


@ -291,6 +291,8 @@ extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
extern long __start___barrier_nospec_fixup, __stop___barrier_nospec_fixup;
extern long __start__btb_flush_fixup, __stop__btb_flush_fixup;
extern bool static_key_feature_checks_initialized;
void apply_feature_fixups(void);
void update_mmu_feature_fixups(unsigned long mask);
void setup_feature_keys(void);


@ -524,7 +524,7 @@ long plpar_hcall_norets_notrace(unsigned long opcode, ...);
* Used for all but the craziest of phyp interfaces (see plpar_hcall9)
*/
#define PLPAR_HCALL_BUFSIZE 4
long plpar_hcall(unsigned long opcode, unsigned long *retbuf, ...);
long plpar_hcall(unsigned long opcode, unsigned long retbuf[static PLPAR_HCALL_BUFSIZE], ...);
/**
* plpar_hcall_raw: - Make a hypervisor call without calculating hcall stats
@ -538,7 +538,7 @@ long plpar_hcall(unsigned long opcode, unsigned long *retbuf, ...);
* plpar_hcall, but plpar_hcall_raw works in real mode and does not
* calculate hypervisor call statistics.
*/
long plpar_hcall_raw(unsigned long opcode, unsigned long *retbuf, ...);
long plpar_hcall_raw(unsigned long opcode, unsigned long retbuf[static PLPAR_HCALL_BUFSIZE], ...);
/**
* plpar_hcall9: - Make a pseries hypervisor call with up to 9 return arguments
@ -549,8 +549,8 @@ long plpar_hcall_raw(unsigned long opcode, unsigned long *retbuf, ...);
* PLPAR_HCALL9_BUFSIZE to size the return argument buffer.
*/
#define PLPAR_HCALL9_BUFSIZE 9
long plpar_hcall9(unsigned long opcode, unsigned long *retbuf, ...);
long plpar_hcall9_raw(unsigned long opcode, unsigned long *retbuf, ...);
long plpar_hcall9(unsigned long opcode, unsigned long retbuf[static PLPAR_HCALL9_BUFSIZE], ...);
long plpar_hcall9_raw(unsigned long opcode, unsigned long retbuf[static PLPAR_HCALL9_BUFSIZE], ...);
/* pseries hcall tracing */
extern struct static_key hcall_tracepoint_key;
@ -570,7 +570,7 @@ struct hvcall_mpp_data {
unsigned long backing_mem;
};
int h_get_mpp(struct hvcall_mpp_data *);
long h_get_mpp(struct hvcall_mpp_data *mpp_data);
struct hvcall_mpp_x_data {
unsigned long coalesced_bytes;


@ -336,6 +336,14 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
if (IS_ENABLED(CONFIG_KASAN))
return;
/*
* Likewise, do not use it in real mode if percpu first chunk is not
* embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
* are chances where percpu allocation can come from vmalloc area.
*/
if (percpu_first_chunk_is_paged)
return;
/* Otherwise, it should be safe to call it */
nmi_enter();
}
@ -351,6 +359,8 @@ static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct inter
// no nmi_exit for a pseries hash guest taking a real mode exception
} else if (IS_ENABLED(CONFIG_KASAN)) {
// no nmi_exit for KASAN in real mode
} else if (percpu_first_chunk_is_paged) {
// no nmi_exit if percpu first chunk is not embedded
} else {
nmi_exit();
}


@ -37,7 +37,7 @@ extern struct pci_dev *isa_bridge_pcidev;
* define properly based on the platform
*/
#ifndef CONFIG_PCI
#define _IO_BASE 0
#define _IO_BASE POISON_POINTER_DELTA
#define _ISA_MEM_BASE 0
#define PCI_DRAM_OFFSET 0
#elif defined(CONFIG_PPC32)
@ -585,12 +585,12 @@ __do_out_asm(_rec_outl, "stwbrx")
#define __do_inw(port) _rec_inw(port)
#define __do_inl(port) _rec_inl(port)
#else /* CONFIG_PPC32 */
#define __do_outb(val, port) writeb(val,(PCI_IO_ADDR)_IO_BASE+port);
#define __do_outw(val, port) writew(val,(PCI_IO_ADDR)_IO_BASE+port);
#define __do_outl(val, port) writel(val,(PCI_IO_ADDR)_IO_BASE+port);
#define __do_inb(port) readb((PCI_IO_ADDR)_IO_BASE + port);
#define __do_inw(port) readw((PCI_IO_ADDR)_IO_BASE + port);
#define __do_inl(port) readl((PCI_IO_ADDR)_IO_BASE + port);
#define __do_outb(val, port) writeb(val,(PCI_IO_ADDR)(_IO_BASE+port));
#define __do_outw(val, port) writew(val,(PCI_IO_ADDR)(_IO_BASE+port));
#define __do_outl(val, port) writel(val,(PCI_IO_ADDR)(_IO_BASE+port));
#define __do_inb(port) readb((PCI_IO_ADDR)(_IO_BASE + port));
#define __do_inw(port) readw((PCI_IO_ADDR)(_IO_BASE + port));
#define __do_inl(port) readl((PCI_IO_ADDR)(_IO_BASE + port));
#endif /* !CONFIG_PPC32 */
#ifdef CONFIG_EEH
@ -606,12 +606,12 @@ __do_out_asm(_rec_outl, "stwbrx")
#define __do_writesw(a, b, n) _outsw(PCI_FIX_ADDR(a),(b),(n))
#define __do_writesl(a, b, n) _outsl(PCI_FIX_ADDR(a),(b),(n))
#define __do_insb(p, b, n) readsb((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
#define __do_insw(p, b, n) readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
#define __do_insl(p, b, n) readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
#define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
#define __do_outsw(p, b, n) writesw((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
#define __do_outsl(p, b, n) writesl((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
#define __do_insb(p, b, n) readsb((PCI_IO_ADDR)(_IO_BASE+(p)), (b), (n))
#define __do_insw(p, b, n) readsw((PCI_IO_ADDR)(_IO_BASE+(p)), (b), (n))
#define __do_insl(p, b, n) readsl((PCI_IO_ADDR)(_IO_BASE+(p)), (b), (n))
#define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)(_IO_BASE+(p)),(b),(n))
#define __do_outsw(p, b, n) writesw((PCI_IO_ADDR)(_IO_BASE+(p)),(b),(n))
#define __do_outsl(p, b, n) writesl((PCI_IO_ADDR)(_IO_BASE+(p)),(b),(n))
#define __do_memset_io(addr, c, n) \
_memset_io(PCI_FIX_ADDR(addr), c, n)
@ -982,7 +982,7 @@ static inline phys_addr_t page_to_phys(struct page *page)
}
/*
* 32 bits still uses virt_to_bus() for it's implementation of DMA
* 32 bits still uses virt_to_bus() for its implementation of DMA
* mappings so we have to keep it defined here. We also have some old
* drivers (shame shame shame) that use bus_to_virt() and haven't been
* fixed yet so I need to define it here.
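
The `__do_in*`/`__do_out*` hunks earlier in this file move the addition inside the cast. A minimal sketch of why the parentheses matter follows: a cast binds tighter than `+`, so `(T *)base + n` is pointer arithmetic, while `(T *)(base + n)` is integer arithmetic followed by a single cast. The sketch uses `uint32_t *` so the difference shows up as scaling. As far as I can tell `PCI_IO_ADDR` is a `void` pointer type, where GNU C steps by one byte, so in the kernel the resulting address is unchanged and only the kind of arithmetic differs; treat that reading as an assumption. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Cast first, then add: pointer arithmetic, n is scaled by sizeof(uint32_t). */
static uint32_t *cast_then_add(uintptr_t base, size_t n)
{
	return (uint32_t *)base + n;
}

/* Add first, then cast: integer arithmetic, n is a plain byte offset. */
static uint32_t *add_then_cast(uintptr_t base, size_t n)
{
	return (uint32_t *)(base + n);
}
```

With the addition done on integers, no arithmetic is ever performed on a pointer derived from a constant base such as `0` or a poison value.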


@ -135,6 +135,17 @@ static inline void crash_setup_regs(struct pt_regs *newregs,
ppc_save_regs(newregs);
}
#ifdef CONFIG_CRASH_HOTPLUG
void arch_crash_handle_hotplug_event(struct kimage *image, void *arg);
#define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags);
#define arch_crash_hotplug_support arch_crash_hotplug_support
unsigned int arch_crash_get_elfcorehdr_size(void);
#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
#endif /* CONFIG_CRASH_HOTPLUG */
extern int crashing_cpu;
extern void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *));
extern void crash_ipi_callback(struct pt_regs *regs);
@ -185,6 +196,10 @@ static inline void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *))
#endif /* CONFIG_CRASH_DUMP */
#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
int update_cpus_node(void *fdt);
#endif
#ifdef CONFIG_PPC_BOOK3S_64
#include <asm/book3s/64/kexec.h>
#endif


@ -7,19 +7,9 @@
void sort_memory_ranges(struct crash_mem *mrngs, bool merge);
struct crash_mem *realloc_mem_ranges(struct crash_mem **mem_ranges);
int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
int add_tce_mem_ranges(struct crash_mem **mem_ranges);
int add_initrd_mem_range(struct crash_mem **mem_ranges);
#ifdef CONFIG_PPC_64S_HASH_MMU
int add_htab_mem_range(struct crash_mem **mem_ranges);
#else
static inline int add_htab_mem_range(struct crash_mem **mem_ranges)
{
return 0;
}
#endif
int add_kernel_mem_range(struct crash_mem **mem_ranges);
int add_rtas_mem_range(struct crash_mem **mem_ranges);
int add_opal_mem_range(struct crash_mem **mem_ranges);
int add_reserved_mem_ranges(struct crash_mem **mem_ranges);
int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size);
int get_exclude_memory_ranges(struct crash_mem **mem_ranges);
int get_reserved_memory_ranges(struct crash_mem **mem_ranges);
int get_crash_memory_ranges(struct crash_mem **mem_ranges);
int get_usable_memory_ranges(struct crash_mem **mem_ranges);
#endif /* _ASM_POWERPC_KEXEC_RANGES_H */


@ -251,7 +251,7 @@ static __always_inline bool mmu_has_feature(unsigned long feature)
#endif
#ifdef CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
if (!static_key_initialized) {
if (!static_key_feature_checks_initialized) {
printk("Warning! mmu_has_feature() used prior to jump label init!\n");
dump_stack();
return early_mmu_has_feature(feature);


@ -48,11 +48,6 @@ struct mod_arch_specific {
unsigned long tramp;
unsigned long tramp_regs;
#endif
/* List of BUG addresses, source line numbers and filenames */
struct list_head bug_list;
struct bug_entry *bug_table;
unsigned int num_bugs;
};
/*


@ -1027,10 +1027,10 @@ struct opal_i2c_request {
* The host will pass on OPAL, a buffer of length OPAL_SYSEPOW_MAX
* with individual elements being 16 bits wide to fetch the system
* wide EPOW status. Each element in the buffer will contain the
* EPOW status in it's bit representation for a particular EPOW sub
* EPOW status in its bit representation for a particular EPOW sub
* class as defined here. So multiple detailed EPOW status bits
* specific for any sub class can be represented in a single buffer
* element as it's bit representation.
* element as its bit representation.
*/
/* System EPOW type */


@ -15,6 +15,16 @@
#endif /* CONFIG_SMP */
#endif /* __powerpc64__ */
#if defined(CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK) && defined(CONFIG_SMP)
#include <linux/jump_label.h>
DECLARE_STATIC_KEY_FALSE(__percpu_first_chunk_is_paged);
#define percpu_first_chunk_is_paged \
(static_key_enabled(&__percpu_first_chunk_is_paged.key))
#else
#define percpu_first_chunk_is_paged false
#endif /* CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK && CONFIG_SMP */
#include <asm-generic/percpu.h>
#include <asm/paca.h>


@ -192,7 +192,7 @@ static inline long pmac_call_feature(int selector, struct device_node* node,
/* PMAC_FTR_BMAC_ENABLE (struct device_node* node, 0, int value)
* enable/disable the bmac (ethernet) cell of a mac-io ASIC, also drive
* it's reset line
* its reset line
*/
#define PMAC_FTR_BMAC_ENABLE PMAC_FTR_DEF(6)


@ -510,6 +510,7 @@
#define PPC_RAW_STB(r, base, i) (0x98000000 | ___PPC_RS(r) | ___PPC_RA(base) | IMM_L(i))
#define PPC_RAW_LBZ(r, base, i) (0x88000000 | ___PPC_RT(r) | ___PPC_RA(base) | IMM_L(i))
#define PPC_RAW_LDX(r, base, b) (0x7c00002a | ___PPC_RT(r) | ___PPC_RA(base) | ___PPC_RB(b))
#define PPC_RAW_LHA(r, base, i) (0xa8000000 | ___PPC_RT(r) | ___PPC_RA(base) | IMM_L(i))
#define PPC_RAW_LHZ(r, base, i) (0xa0000000 | ___PPC_RT(r) | ___PPC_RA(base) | IMM_L(i))
#define PPC_RAW_LHBRX(r, base, b) (0x7c00062c | ___PPC_RT(r) | ___PPC_RA(base) | ___PPC_RB(b))
#define PPC_RAW_LWBRX(r, base, b) (0x7c00042c | ___PPC_RT(r) | ___PPC_RA(base) | ___PPC_RB(b))
@ -532,6 +533,7 @@
#define PPC_RAW_MULW(d, a, b) (0x7c0001d6 | ___PPC_RT(d) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_MULHWU(d, a, b) (0x7c000016 | ___PPC_RT(d) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_MULI(d, a, i) (0x1c000000 | ___PPC_RT(d) | ___PPC_RA(a) | IMM_L(i))
#define PPC_RAW_DIVW(d, a, b) (0x7c0003d6 | ___PPC_RT(d) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_DIVWU(d, a, b) (0x7c000396 | ___PPC_RT(d) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_DIVDU(d, a, b) (0x7c000392 | ___PPC_RT(d) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_DIVDE(t, a, b) (0x7c000352 | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b))
@ -550,6 +552,8 @@
#define PPC_RAW_XOR(d, a, b) (0x7c000278 | ___PPC_RA(d) | ___PPC_RS(a) | ___PPC_RB(b))
#define PPC_RAW_XORI(d, a, i) (0x68000000 | ___PPC_RA(d) | ___PPC_RS(a) | IMM_L(i))
#define PPC_RAW_XORIS(d, a, i) (0x6c000000 | ___PPC_RA(d) | ___PPC_RS(a) | IMM_L(i))
#define PPC_RAW_EXTSB(d, a) (0x7c000774 | ___PPC_RA(d) | ___PPC_RS(a))
#define PPC_RAW_EXTSH(d, a) (0x7c000734 | ___PPC_RA(d) | ___PPC_RS(a))
#define PPC_RAW_EXTSW(d, a) (0x7c0007b4 | ___PPC_RA(d) | ___PPC_RS(a))
#define PPC_RAW_SLW(d, a, s) (0x7c000030 | ___PPC_RA(d) | ___PPC_RS(a) | ___PPC_RB(s))
#define PPC_RAW_SLD(d, a, s) (0x7c000036 | ___PPC_RA(d) | ___PPC_RS(a) | ___PPC_RB(s))
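
The `PPC_RAW_*` macros added above build a 32-bit instruction word by OR-ing a base opcode with shifted register fields. A minimal sketch of that encoding for two of the new entries follows. The `FIELD_*` and `IMM_LOW16` macros are local stand-ins written for this example, on the assumption that they match the kernel's `___PPC_RS`, `___PPC_RT`, `___PPC_RA` and `IMM_L` helpers (5-bit register fields at bits 21 and 16, and a 16-bit immediate).

```c
#include <stdint.h>

/* Local stand-ins for the kernel's field helpers (assumed layout). */
#define FIELD_RS(r)  (((uint32_t)(r) & 0x1f) << 21)
#define FIELD_RT(r)  FIELD_RS(r)	/* RT and RS share the same bit position */
#define FIELD_RA(r)  (((uint32_t)(r) & 0x1f) << 16)
#define IMM_LOW16(i) ((uint32_t)(i) & 0xffff)

/* extsb d,a: sign-extend the low byte of register a into register d. */
static uint32_t encode_extsb(unsigned int d, unsigned int a)
{
	return 0x7c000774u | FIELD_RA(d) | FIELD_RS(a);
}

/* lha r,i(base): load a halfword and sign-extend it into register r. */
static uint32_t encode_lha(unsigned int r, unsigned int base, int i)
{
	return 0xa8000000u | FIELD_RT(r) | FIELD_RA(base) | IMM_LOW16(i);
}
```

For example, `encode_extsb(3, 4)` yields `0x7c830774`: the destination goes in the RA field and the source in the RS field, matching the argument order of `PPC_RAW_EXTSB(d, a)`.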


@ -260,7 +260,8 @@ struct thread_struct {
unsigned long sier2;
unsigned long sier3;
unsigned long hashkeyr;
unsigned long dexcr;
unsigned long dexcr_onexec; /* Reset value to load on exec */
#endif
};
@ -333,6 +334,16 @@ extern int set_endian(struct task_struct *tsk, unsigned int val);
extern int get_unalign_ctl(struct task_struct *tsk, unsigned long adr);
extern int set_unalign_ctl(struct task_struct *tsk, unsigned int val);
#ifdef CONFIG_PPC_BOOK3S_64
#define PPC_GET_DEXCR_ASPECT(tsk, asp) get_dexcr_prctl((tsk), (asp))
#define PPC_SET_DEXCR_ASPECT(tsk, asp, val) set_dexcr_prctl((tsk), (asp), (val))
int get_dexcr_prctl(struct task_struct *tsk, unsigned long asp);
int set_dexcr_prctl(struct task_struct *tsk, unsigned long asp, unsigned long val);
#endif
extern void load_fp_state(struct thread_fp_state *fp);
extern void store_fp_state(struct thread_fp_state *fp);
extern void load_vr_state(struct thread_vr_state *vr);


@ -615,7 +615,7 @@
#define HID1_ABE (1<<10) /* 7450 Address Broadcast Enable */
#define HID1_PS (1<<16) /* 750FX PLL selection */
#endif
#define SPRN_HID2 0x3F8 /* Hardware Implementation Register 2 */
#define SPRN_HID2_750FX 0x3F8 /* IBM 750FX HID2 Register */
#define SPRN_HID2_GEKKO 0x398 /* Gekko HID2 Register */
#define SPRN_HID2_G2_LE 0x3F3 /* G2_LE HID2 Register */
#define HID2_G2_LE_HBE (1<<18) /* High BAT Enable (G2_LE) */


@ -144,7 +144,7 @@
#define UNI_N_HWINIT_STATE_SLEEPING 0x01
#define UNI_N_HWINIT_STATE_RUNNING 0x02
/* This last bit appears to be used by the bootROM to know the second
* CPU has started and will enter it's sleep loop with IP=0
* CPU has started and will enter its sleep loop with IP=0
*/
#define UNI_N_HWINIT_STATE_CPU1_FLAG 0x10000000


@ -108,7 +108,7 @@ typedef struct boot_infos
/* ALL BELOW NEW (vers. 4) */
/* This defines the physical memory. Valid with BOOT_ARCH_NUBUS flag
(non-PCI) only. On PCI, memory is contiguous and it's size is in the
(non-PCI) only. On PCI, memory is contiguous and its size is in the
device-tree. */
boot_info_map_entry_t
physMemoryMap[MAX_MEM_MAP_SIZE]; /* Where the phys memory is */


@ -3,9 +3,6 @@
# Makefile for the linux kernel.
#
ifdef CONFIG_PPC64
CFLAGS_prom_init.o += $(NO_MINIMAL_TOC)
endif
ifdef CONFIG_PPC32
CFLAGS_prom_init.o += -fPIC
CFLAGS_btext.o += -fPIC
@ -87,6 +84,7 @@ obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
obj-$(CONFIG_PPC_DAWR) += dawr.o
obj-$(CONFIG_PPC_BOOK3S_64) += cpu_setup_ppc970.o cpu_setup_pa6t.o
obj-$(CONFIG_PPC_BOOK3S_64) += cpu_setup_power.o
obj-$(CONFIG_PPC_BOOK3S_64) += dexcr.o
obj-$(CONFIG_PPC_BOOK3S_64) += mce.o mce_power.o
obj-$(CONFIG_PPC_BOOK3E_64) += exceptions-64e.o idle_64e.o
obj-$(CONFIG_PPC_BARRIER_NOSPEC) += security.o
@ -190,9 +188,6 @@ GCOV_PROFILE_kprobes-ftrace.o := n
KCOV_INSTRUMENT_kprobes-ftrace.o := n
KCSAN_SANITIZE_kprobes-ftrace.o := n
UBSAN_SANITIZE_kprobes-ftrace.o := n
GCOV_PROFILE_syscall_64.o := n
KCOV_INSTRUMENT_syscall_64.o := n
UBSAN_SANITIZE_syscall_64.o := n
UBSAN_SANITIZE_vdso.o := n
# Necessary for booting with kcov enabled on book3e machines


@ -401,7 +401,7 @@ _GLOBAL(__save_cpu_setup)
andi. r3,r3,0xff00
cmpwi cr0,r3,0x0200
bne 1f
mfspr r4,SPRN_HID2
mfspr r4,SPRN_HID2_750FX
stw r4,CS_HID2(r5)
1:
mtcr r7
@ -496,7 +496,7 @@ _GLOBAL(__restore_cpu_setup)
bne 4f
lwz r4,CS_HID2(r5)
rlwinm r4,r4,0,19,17
mtspr SPRN_HID2,r4
mtspr SPRN_HID2_750FX,r4
sync
4:
lwz r4,CS_HID1(r5)

arch/powerpc/kernel/dexcr.c

@ -0,0 +1,124 @@
// SPDX-License-Identifier: GPL-2.0-or-later
#include <linux/capability.h>
#include <linux/cpu.h>
#include <linux/init.h>
#include <linux/prctl.h>
#include <linux/sched.h>
#include <asm/cpu_has_feature.h>
#include <asm/cputable.h>
#include <asm/processor.h>
#include <asm/reg.h>
static int __init init_task_dexcr(void)
{
if (!early_cpu_has_feature(CPU_FTR_ARCH_31))
return 0;
current->thread.dexcr_onexec = mfspr(SPRN_DEXCR);
return 0;
}
early_initcall(init_task_dexcr);
/* Allow thread local configuration of these by default */
#define DEXCR_PRCTL_EDITABLE ( \
DEXCR_PR_IBRTPD | \
DEXCR_PR_SRAPD | \
DEXCR_PR_NPHIE)
static int prctl_to_aspect(unsigned long which, unsigned int *aspect)
{
switch (which) {
case PR_PPC_DEXCR_SBHE:
*aspect = DEXCR_PR_SBHE;
break;
case PR_PPC_DEXCR_IBRTPD:
*aspect = DEXCR_PR_IBRTPD;
break;
case PR_PPC_DEXCR_SRAPD:
*aspect = DEXCR_PR_SRAPD;
break;
case PR_PPC_DEXCR_NPHIE:
*aspect = DEXCR_PR_NPHIE;
break;
default:
return -ENODEV;
}
return 0;
}
int get_dexcr_prctl(struct task_struct *task, unsigned long which)
{
unsigned int aspect;
int ret;
ret = prctl_to_aspect(which, &aspect);
if (ret)
return ret;
if (aspect & DEXCR_PRCTL_EDITABLE)
ret |= PR_PPC_DEXCR_CTRL_EDITABLE;
if (aspect & mfspr(SPRN_DEXCR))
ret |= PR_PPC_DEXCR_CTRL_SET;
else
ret |= PR_PPC_DEXCR_CTRL_CLEAR;
if (aspect & task->thread.dexcr_onexec)
ret |= PR_PPC_DEXCR_CTRL_SET_ONEXEC;
else
ret |= PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC;
return ret;
}
int set_dexcr_prctl(struct task_struct *task, unsigned long which, unsigned long ctrl)
{
unsigned long dexcr;
unsigned int aspect;
int err = 0;
err = prctl_to_aspect(which, &aspect);
if (err)
return err;
if (!(aspect & DEXCR_PRCTL_EDITABLE))
return -EPERM;
if (ctrl & ~PR_PPC_DEXCR_CTRL_MASK)
return -EINVAL;
if (ctrl & PR_PPC_DEXCR_CTRL_SET && ctrl & PR_PPC_DEXCR_CTRL_CLEAR)
return -EINVAL;
if (ctrl & PR_PPC_DEXCR_CTRL_SET_ONEXEC && ctrl & PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC)
return -EINVAL;
/*
* We do not want an unprivileged process being able to disable
* a setuid process's hash check instructions
*/
if (aspect == DEXCR_PR_NPHIE &&
ctrl & PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC &&
!capable(CAP_SYS_ADMIN))
return -EPERM;
dexcr = mfspr(SPRN_DEXCR);
if (ctrl & PR_PPC_DEXCR_CTRL_SET)
dexcr |= aspect;
else if (ctrl & PR_PPC_DEXCR_CTRL_CLEAR)
dexcr &= ~aspect;
if (ctrl & PR_PPC_DEXCR_CTRL_SET_ONEXEC)
task->thread.dexcr_onexec |= aspect;
else if (ctrl & PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC)
task->thread.dexcr_onexec &= ~aspect;
mtspr(SPRN_DEXCR, dexcr);
return 0;
}
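
`set_dexcr_prctl()` above rejects control words that contain unknown bits or that ask for both halves of a pair (set and clear, or set-on-exec and clear-on-exec), then applies whichever half is present. A minimal userspace sketch of that validate-then-apply pattern follows. The `CTRL_*` values are illustrative placeholders chosen for this example, not values taken from the diff, and `check_ctrl` and `apply_ctrl` are hypothetical names.

```c
#include <errno.h>

/* Illustrative flag values; the real PR_PPC_DEXCR_CTRL_* values are not shown in the diff. */
#define CTRL_EDITABLE     0x01ul
#define CTRL_SET          0x02ul
#define CTRL_CLEAR        0x04ul
#define CTRL_SET_ONEXEC   0x08ul
#define CTRL_CLEAR_ONEXEC 0x10ul
#define CTRL_MASK         0x1ful

/* Return 0 if ctrl is well formed, -EINVAL otherwise. */
static int check_ctrl(unsigned long ctrl)
{
	if (ctrl & ~CTRL_MASK)
		return -EINVAL;		/* unknown bits */
	if ((ctrl & CTRL_SET) && (ctrl & CTRL_CLEAR))
		return -EINVAL;		/* contradictory request */
	if ((ctrl & CTRL_SET_ONEXEC) && (ctrl & CTRL_CLEAR_ONEXEC))
		return -EINVAL;		/* contradictory on-exec request */
	return 0;
}

/* Apply the immediate set/clear half of ctrl to a register image. */
static unsigned long apply_ctrl(unsigned long reg, unsigned long aspect,
				unsigned long ctrl)
{
	if (ctrl & CTRL_SET)
		reg |= aspect;
	else if (ctrl & CTRL_CLEAR)
		reg &= ~aspect;
	return reg;		/* unchanged when neither bit is present */
}
```

Checking the whole control word before touching any state means a rejected request leaves both the live register and the on-exec value untouched.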


@ -506,9 +506,18 @@ int eeh_dev_check_failure(struct eeh_dev *edev)
* We will punt with the following conditions: Failure to get
* PE's state, EEH not support and Permanently unavailable
* state, PE is in good state.
*
* On the pSeries, after reaching the threshold, get_state might
* return EEH_STATE_NOT_SUPPORT. However, it's possible that the
* device state remains uncleared if the device is not marked
* pci_channel_io_perm_failure. Therefore, consider logging the
* event to let device removal happen.
*
*/
if ((ret < 0) ||
(ret == EEH_STATE_NOT_SUPPORT) || eeh_state_active(ret)) {
(ret == EEH_STATE_NOT_SUPPORT &&
dev->error_state == pci_channel_io_perm_failure) ||
eeh_state_active(ret)) {
eeh_stats.false_positives++;
pe->false_positives++;
rc = 0;


@ -865,9 +865,18 @@ void eeh_handle_normal_event(struct eeh_pe *pe)
devices++;
if (!devices) {
pr_debug("EEH: Frozen PHB#%x-PE#%x is empty!\n",
pr_warn("EEH: Frozen PHB#%x-PE#%x is empty!\n",
pe->phb->global_number, pe->addr);
goto out; /* nothing to recover */
/*
* The device has been removed, so tear down its state. On powernv
* the hotplug driver would take care of this, but not on pseries,
* so permanently disable the card as it has been hot removed.
*
* In the case of powernv, note that device removal is covered by
* the PCI rescan lock, so there is no problem even if the hotplug
* driver attempts to remove the device.
*/
goto recover_failed;
}
/* Log the event */


@ -24,10 +24,10 @@ static int eeh_pe_aux_size = 0;
static LIST_HEAD(eeh_phb_pe);
/**
* eeh_set_pe_aux_size - Set PE auxillary data size
* @size: PE auxillary data size
* eeh_set_pe_aux_size - Set PE auxiliary data size
* @size: PE auxiliary data size in bytes
*
* Set PE auxillary data size
* Set PE auxiliary data size.
*/
void eeh_set_pe_aux_size(int size)
{
@ -527,7 +527,7 @@ EXPORT_SYMBOL_GPL(eeh_pe_state_mark);
* eeh_pe_mark_isolated
* @pe: EEH PE
*
* Record that a PE has been isolated by marking the PE and it's children as
* Record that a PE has been isolated by marking the PE and its children as
* EEH_PE_ISOLATED (and EEH_PE_CFG_BLOCKED, if required) and their PCI devices
* as pci_channel_io_frozen.
*/


@ -53,8 +53,6 @@ static struct kobject *fadump_kobj;
static atomic_t cpus_in_fadump;
static DEFINE_MUTEX(fadump_mutex);
static struct fadump_mrange_info crash_mrange_info = { "crash", NULL, 0, 0, 0, false };
#define RESERVED_RNGS_SZ 16384 /* 16K - 128 entries */
#define RESERVED_RNGS_CNT (RESERVED_RNGS_SZ / \
sizeof(struct fadump_memory_range))
@ -133,6 +131,41 @@ static int __init fadump_cma_init(void)
static int __init fadump_cma_init(void) { return 1; }
#endif /* CONFIG_CMA */
/*
* Additional parameters meant for capture kernel are placed in a dedicated area.
* If this is capture kernel boot, append these parameters to bootargs.
*/
void __init fadump_append_bootargs(void)
{
char *append_args;
size_t len;
if (!fw_dump.dump_active || !fw_dump.param_area_supported || !fw_dump.param_area)
return;
if (fw_dump.param_area >= fw_dump.boot_mem_top) {
if (memblock_reserve(fw_dump.param_area, COMMAND_LINE_SIZE)) {
pr_warn("WARNING: Can't use additional parameters area!\n");
fw_dump.param_area = 0;
return;
}
}
append_args = (char *)fw_dump.param_area;
len = strlen(boot_command_line);
/*
* It is too late to fail even if the cmdline size is exceeded. Truncate the
* additional parameters to fit the cmdline size and proceed anyway.
*/
if (len + strlen(append_args) >= COMMAND_LINE_SIZE - 1)
pr_warn("WARNING: Appending parameters exceeds cmdline size. Truncating!\n");
pr_debug("Cmdline: %s\n", boot_command_line);
snprintf(boot_command_line + len, COMMAND_LINE_SIZE - len, " %s", append_args);
pr_info("Updated cmdline: %s\n", boot_command_line);
}
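
`fadump_append_bootargs()` above appends extra parameters to a fixed-size command line and relies on `snprintf()` to truncate rather than fail. A minimal userspace sketch of that append-with-truncation step follows. `append_params` is a hypothetical helper, and returning a flag instead of logging a warning is a choice made for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Append " <extra>" to the NUL-terminated string in cmdline, which lives in a
 * buffer of `size` bytes. Returns true if the result had to be truncated.
 */
static bool append_params(char *cmdline, size_t size, const char *extra)
{
	size_t len = strlen(cmdline);
	bool truncated;

	if (len >= size)
		return true;	/* no room at all, leave the buffer alone */

	/* The result needs len + 1 (space) + strlen(extra) + 1 (NUL) bytes. */
	truncated = len + 1 + strlen(extra) >= size;

	/* snprintf never writes more than size - len bytes, NUL included. */
	snprintf(cmdline + len, size - len, " %s", extra);
	return truncated;
}
```

Passing `size - len` as the limit is what keeps the write inside the buffer; the truncation check only decides whether to report it.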
/* Scan the Firmware Assisted dump configuration details. */
int __init early_init_dt_scan_fw_dump(unsigned long node, const char *uname,
int depth, void *data)
@ -222,28 +255,6 @@ static bool is_fadump_mem_area_contiguous(u64 d_start, u64 d_end)
return ret;
}
/*
* Returns true, if there are no holes in boot memory area,
* false otherwise.
*/
bool is_fadump_boot_mem_contiguous(void)
{
unsigned long d_start, d_end;
bool ret = false;
int i;
for (i = 0; i < fw_dump.boot_mem_regs_cnt; i++) {
d_start = fw_dump.boot_mem_addr[i];
d_end = d_start + fw_dump.boot_mem_sz[i];
ret = is_fadump_mem_area_contiguous(d_start, d_end);
if (!ret)
break;
}
return ret;
}
/*
* Returns true, if there are no holes in reserved memory area,
* false otherwise.
@ -373,12 +384,6 @@ static unsigned long __init get_fadump_area_size(void)
size = PAGE_ALIGN(size);
size += fw_dump.boot_memory_size;
size += sizeof(struct fadump_crash_info_header);
size += sizeof(struct elfhdr); /* ELF core header.*/
size += sizeof(struct elf_phdr); /* place holder for cpu notes */
/* Program headers for crash memory regions. */
size += sizeof(struct elf_phdr) * (memblock_num_regions(memory) + 2);
size = PAGE_ALIGN(size);
/* This is to hold kernel metadata on platforms that support it */
size += (fw_dump.ops->fadump_get_metadata_size ?
@ -389,10 +394,11 @@ static unsigned long __init get_fadump_area_size(void)
static int __init add_boot_mem_region(unsigned long rstart,
unsigned long rsize)
{
int max_boot_mem_rgns = fw_dump.ops->fadump_max_boot_mem_rgns();
int i = fw_dump.boot_mem_regs_cnt++;
if (fw_dump.boot_mem_regs_cnt > FADUMP_MAX_MEM_REGS) {
fw_dump.boot_mem_regs_cnt = FADUMP_MAX_MEM_REGS;
if (fw_dump.boot_mem_regs_cnt > max_boot_mem_rgns) {
fw_dump.boot_mem_regs_cnt = max_boot_mem_rgns;
return 0;
}
@ -573,22 +579,6 @@ int __init fadump_reserve_mem(void)
}
}
/*
* Calculate the memory boundary.
* If memory_limit is less than actual memory boundary then reserve
* the memory for fadump beyond the memory_limit and adjust the
* memory_limit accordingly, so that the running kernel can run with
* specified memory_limit.
*/
if (memory_limit && memory_limit < memblock_end_of_DRAM()) {
size = get_fadump_area_size();
if ((memory_limit + size) < memblock_end_of_DRAM())
memory_limit += size;
else
memory_limit = memblock_end_of_DRAM();
printk(KERN_INFO "Adjusted memory_limit for firmware-assisted"
" dump, now %#016llx\n", memory_limit);
}
if (memory_limit)
mem_boundary = memory_limit;
else
@ -705,7 +695,7 @@ void crash_fadump(struct pt_regs *regs, const char *str)
* old_cpu == -1 means this is the first CPU which has come here,
* go ahead and trigger fadump.
*
* old_cpu != -1 means some other CPU has already on it's way
* old_cpu != -1 means some other CPU is already on its way
* to trigger fadump, just keep looping here.
*/
this_cpu = smp_processor_id();
@ -931,36 +921,6 @@ static inline int fadump_add_mem_range(struct fadump_mrange_info *mrange_info,
return 0;
}
static int fadump_exclude_reserved_area(u64 start, u64 end)
{
u64 ra_start, ra_end;
int ret = 0;
ra_start = fw_dump.reserve_dump_area_start;
ra_end = ra_start + fw_dump.reserve_dump_area_size;
if ((ra_start < end) && (ra_end > start)) {
if ((start < ra_start) && (end > ra_end)) {
ret = fadump_add_mem_range(&crash_mrange_info,
start, ra_start);
if (ret)
return ret;
ret = fadump_add_mem_range(&crash_mrange_info,
ra_end, end);
} else if (start < ra_start) {
ret = fadump_add_mem_range(&crash_mrange_info,
start, ra_start);
} else if (ra_end < end) {
ret = fadump_add_mem_range(&crash_mrange_info,
ra_end, end);
}
} else
ret = fadump_add_mem_range(&crash_mrange_info, start, end);
return ret;
}
static int fadump_init_elfcore_header(char *bufp)
{
struct elfhdr *elf;
@ -997,52 +957,6 @@ static int fadump_init_elfcore_header(char *bufp)
return 0;
}
/*
* Traverse through memblock structure and setup crash memory ranges. These
* ranges will be used create PT_LOAD program headers in elfcore header.
*/
static int fadump_setup_crash_memory_ranges(void)
{
u64 i, start, end;
int ret;
pr_debug("Setup crash memory ranges.\n");
crash_mrange_info.mem_range_cnt = 0;
/*
* Boot memory region(s) registered with firmware are moved to
* different location at the time of crash. Create separate program
* header(s) for this memory chunk(s) with the correct offset.
*/
for (i = 0; i < fw_dump.boot_mem_regs_cnt; i++) {
start = fw_dump.boot_mem_addr[i];
end = start + fw_dump.boot_mem_sz[i];
ret = fadump_add_mem_range(&crash_mrange_info, start, end);
if (ret)
return ret;
}
for_each_mem_range(i, &start, &end) {
/*
* skip the memory chunk that is already added
* (0 through boot_memory_top).
*/
if (start < fw_dump.boot_mem_top) {
if (end > fw_dump.boot_mem_top)
start = fw_dump.boot_mem_top;
else
continue;
}
/* add this range excluding the reserved dump area. */
ret = fadump_exclude_reserved_area(start, end);
if (ret)
return ret;
}
return 0;
}
/*
* If the given physical address falls within the boot memory region then
* return the relocated address that points to the dump region reserved
@ -1073,36 +987,50 @@ static inline unsigned long fadump_relocate(unsigned long paddr)
return raddr;
}
static int fadump_create_elfcore_headers(char *bufp)
static void __init populate_elf_pt_load(struct elf_phdr *phdr, u64 start,
u64 size, unsigned long long offset)
{
unsigned long long raddr, offset;
struct elf_phdr *phdr;
struct elfhdr *elf;
int i, j;
phdr->p_align = 0;
phdr->p_memsz = size;
phdr->p_filesz = size;
phdr->p_paddr = start;
phdr->p_offset = offset;
phdr->p_type = PT_LOAD;
phdr->p_flags = PF_R|PF_W|PF_X;
phdr->p_vaddr = (unsigned long)__va(start);
}
static void __init fadump_populate_elfcorehdr(struct fadump_crash_info_header *fdh)
{
char *bufp;
struct elfhdr *elf;
struct elf_phdr *phdr;
u64 boot_mem_dest_offset;
unsigned long long i, ra_start, ra_end, ra_size, mstart, mend;
bufp = (char *) fw_dump.elfcorehdr_addr;
fadump_init_elfcore_header(bufp);
elf = (struct elfhdr *)bufp;
bufp += sizeof(struct elfhdr);
/*
* setup ELF PT_NOTE, place holder for cpu notes info. The notes info
* will be populated during second kernel boot after crash. Hence
* this PT_NOTE will always be the first elf note.
* Set up ELF PT_NOTE, a placeholder for CPU notes information.
* The notes info will be populated later by platform-specific code.
* Hence, this PT_NOTE will always be the first ELF note.
*
* NOTE: Any new ELF note addition should be placed after this note.
*/
phdr = (struct elf_phdr *)bufp;
bufp += sizeof(struct elf_phdr);
phdr->p_type = PT_NOTE;
phdr->p_flags = 0;
phdr->p_vaddr = 0;
phdr->p_align = 0;
phdr->p_offset = 0;
phdr->p_paddr = 0;
phdr->p_filesz = 0;
phdr->p_memsz = 0;
phdr->p_flags = 0;
phdr->p_vaddr = 0;
phdr->p_align = 0;
phdr->p_offset = 0;
phdr->p_paddr = 0;
phdr->p_filesz = 0;
phdr->p_memsz = 0;
/* Increment number of program headers. */
(elf->e_phnum)++;
/* setup ELF PT_NOTE for vmcoreinfo */
@ -1112,55 +1040,66 @@ static int fadump_create_elfcore_headers(char *bufp)
phdr->p_flags = 0;
phdr->p_vaddr = 0;
phdr->p_align = 0;
phdr->p_paddr = fadump_relocate(paddr_vmcoreinfo_note());
phdr->p_offset = phdr->p_paddr;
phdr->p_memsz = phdr->p_filesz = VMCOREINFO_NOTE_SIZE;
phdr->p_paddr = phdr->p_offset = fdh->vmcoreinfo_raddr;
phdr->p_memsz = phdr->p_filesz = fdh->vmcoreinfo_size;
/* Increment number of program headers. */
(elf->e_phnum)++;
/* setup PT_LOAD sections. */
j = 0;
offset = 0;
raddr = fw_dump.boot_mem_addr[0];
for (i = 0; i < crash_mrange_info.mem_range_cnt; i++) {
u64 mbase, msize;
mbase = crash_mrange_info.mem_ranges[i].base;
msize = crash_mrange_info.mem_ranges[i].size;
if (!msize)
continue;
/*
* Set up PT_LOAD sections. First include the boot memory regions,
* then add the rest of the memory regions.
*/
boot_mem_dest_offset = fw_dump.boot_mem_dest_addr;
for (i = 0; i < fw_dump.boot_mem_regs_cnt; i++) {
phdr = (struct elf_phdr *)bufp;
bufp += sizeof(struct elf_phdr);
phdr->p_type = PT_LOAD;
phdr->p_flags = PF_R|PF_W|PF_X;
phdr->p_offset = mbase;
populate_elf_pt_load(phdr, fw_dump.boot_mem_addr[i],
fw_dump.boot_mem_sz[i],
boot_mem_dest_offset);
/* Increment number of program headers. */
(elf->e_phnum)++;
boot_mem_dest_offset += fw_dump.boot_mem_sz[i];
}
if (mbase == raddr) {
/*
* The entire real memory region will be moved by
* firmware to the specified destination_address.
* Hence set the correct offset.
*/
phdr->p_offset = fw_dump.boot_mem_dest_addr + offset;
if (j < (fw_dump.boot_mem_regs_cnt - 1)) {
offset += fw_dump.boot_mem_sz[j];
raddr = fw_dump.boot_mem_addr[++j];
}
/* Memory reserved for fadump in first kernel */
ra_start = fw_dump.reserve_dump_area_start;
ra_size = get_fadump_area_size();
ra_end = ra_start + ra_size;
phdr = (struct elf_phdr *)bufp;
for_each_mem_range(i, &mstart, &mend) {
/* Boot memory regions already added, skip them now */
if (mstart < fw_dump.boot_mem_top) {
if (mend > fw_dump.boot_mem_top)
mstart = fw_dump.boot_mem_top;
else
continue;
}
phdr->p_paddr = mbase;
phdr->p_vaddr = (unsigned long)__va(mbase);
phdr->p_filesz = msize;
phdr->p_memsz = msize;
phdr->p_align = 0;
/* Handle memblock regions overlaps with fadump reserved area */
if ((ra_start < mend) && (ra_end > mstart)) {
if ((mstart < ra_start) && (mend > ra_end)) {
populate_elf_pt_load(phdr, mstart, ra_start - mstart, mstart);
/* Increment number of program headers. */
(elf->e_phnum)++;
bufp += sizeof(struct elf_phdr);
phdr = (struct elf_phdr *)bufp;
populate_elf_pt_load(phdr, ra_end, mend - ra_end, ra_end);
} else if (mstart < ra_start) {
populate_elf_pt_load(phdr, mstart, ra_start - mstart, mstart);
} else if (ra_end < mend) {
populate_elf_pt_load(phdr, ra_end, mend - ra_end, ra_end);
}
} else {
/* No overlap with fadump reserved memory region */
populate_elf_pt_load(phdr, mstart, mend - mstart, mstart);
}
/* Increment number of program headers. */
(elf->e_phnum)++;
bufp += sizeof(struct elf_phdr);
phdr = (struct elf_phdr *) bufp;
}
return 0;
}
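The overlap handling in fadump_populate_elfcorehdr() above can be read as "emit whatever part of a memory range lies outside the reserved dump area". Below is a minimal userspace sketch of that idea. The names `mem_range_sketch` and `split_around_reserved` are hypothetical and do not exist in the kernel. The sketch returns zero ranges when the input lies entirely inside the reserved area, which is a choice made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one PT_LOAD entry: physical start and size. */
struct mem_range_sketch {
	uint64_t start;
	uint64_t size;
};

/*
 * Emit the parts of [mstart, mend) that lie outside [ra_start, ra_end).
 * Returns the number of ranges written to out[] (0, 1 or 2).
 */
static size_t split_around_reserved(uint64_t mstart, uint64_t mend,
				    uint64_t ra_start, uint64_t ra_end,
				    struct mem_range_sketch out[2])
{
	size_t n = 0;

	if (ra_start < mend && ra_end > mstart) {
		/* Overlap: keep whatever sticks out on either side. */
		if (mstart < ra_start)
			out[n++] = (struct mem_range_sketch){ mstart, ra_start - mstart };
		if (ra_end < mend)
			out[n++] = (struct mem_range_sketch){ ra_end, mend - ra_end };
	} else {
		/* No overlap with the reserved area: keep the range whole. */
		out[n++] = (struct mem_range_sketch){ mstart, mend - mstart };
	}
	return n;
}
```

A range that straddles the reserved area yields two entries, one on each side, which matches the two populate_elf_pt_load() calls in the straddling branch of the diff.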
static unsigned long init_fadump_header(unsigned long addr)
@ -1175,14 +1114,25 @@ static unsigned long init_fadump_header(unsigned long addr)
memset(fdh, 0, sizeof(struct fadump_crash_info_header));
fdh->magic_number = FADUMP_CRASH_INFO_MAGIC;
fdh->elfcorehdr_addr = addr;
fdh->version = FADUMP_HEADER_VERSION;
/* We will set the crashing cpu id in crash_fadump() during crash. */
fdh->crashing_cpu = FADUMP_CPU_UNKNOWN;
/*
* The physical address and size of vmcoreinfo are required in the
* second kernel to prepare elfcorehdr.
*/
fdh->vmcoreinfo_raddr = fadump_relocate(paddr_vmcoreinfo_note());
fdh->vmcoreinfo_size = VMCOREINFO_NOTE_SIZE;
fdh->pt_regs_sz = sizeof(struct pt_regs);
/*
* When the LPAR is terminated by PHYP, ensure that register data for
* all possible CPUs is processed while exporting the vmcore.
*/
fdh->cpu_mask = *cpu_possible_mask;
fdh->cpu_mask_sz = sizeof(struct cpumask);
return addr;
}
@ -1190,8 +1140,6 @@ static unsigned long init_fadump_header(unsigned long addr)
static int register_fadump(void)
{
unsigned long addr;
void *vaddr;
int ret;
/*
* If no memory is reserved then we can not register for firmware-
@ -1200,18 +1148,10 @@ static int register_fadump(void)
if (!fw_dump.reserve_dump_area_size)
return -ENODEV;
ret = fadump_setup_crash_memory_ranges();
if (ret)
return ret;
addr = fw_dump.fadumphdr_addr;
/* Initialize fadump crash info header. */
addr = init_fadump_header(addr);
vaddr = __va(addr);
pr_debug("Creating ELF core headers at %#016lx\n", addr);
fadump_create_elfcore_headers(vaddr);
/* register the future kernel dump with firmware. */
pr_debug("Registering for firmware-assisted kernel dump...\n");
@ -1230,7 +1170,6 @@ void fadump_cleanup(void)
} else if (fw_dump.dump_registered) {
/* Un-register Firmware-assisted dump if it was registered. */
fw_dump.ops->fadump_unregister(&fw_dump);
fadump_free_mem_ranges(&crash_mrange_info);
}
if (fw_dump.ops->fadump_cleanup)
@ -1416,6 +1355,22 @@ static void fadump_release_memory(u64 begin, u64 end)
fadump_release_reserved_area(tstart, end);
}
static void fadump_free_elfcorehdr_buf(void)
{
if (fw_dump.elfcorehdr_addr == 0 || fw_dump.elfcorehdr_size == 0)
return;
/*
* Before freeing the memory of `elfcorehdr`, reset the global
* `elfcorehdr_addr` to prevent modules like `vmcore` from accessing
* invalid memory.
*/
elfcorehdr_addr = ELFCORE_ADDR_ERR;
fadump_free_buffer(fw_dump.elfcorehdr_addr, fw_dump.elfcorehdr_size);
fw_dump.elfcorehdr_addr = 0;
fw_dump.elfcorehdr_size = 0;
}
static void fadump_invalidate_release_mem(void)
{
mutex_lock(&fadump_mutex);
@ -1427,6 +1382,7 @@ static void fadump_invalidate_release_mem(void)
fadump_cleanup();
mutex_unlock(&fadump_mutex);
fadump_free_elfcorehdr_buf();
fadump_release_memory(fw_dump.boot_mem_top, memblock_end_of_DRAM());
fadump_free_cpu_notes_buf();
@ -1484,6 +1440,18 @@ static ssize_t enabled_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
}
/*
* /sys/kernel/fadump/hotplug_ready sysfs node returns 1, which indicates
* to userspace that fadump re-registration is not required on memory
* hotplug events.
*/
static ssize_t hotplug_ready_show(struct kobject *kobj,
struct kobj_attribute *attr,
char *buf)
{
return sprintf(buf, "%d\n", 1);
}
static ssize_t mem_reserved_show(struct kobject *kobj,
struct kobj_attribute *attr,
char *buf)
@ -1498,6 +1466,43 @@ static ssize_t registered_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.dump_registered);
}
static ssize_t bootargs_append_show(struct kobject *kobj,
struct kobj_attribute *attr,
char *buf)
{
return sprintf(buf, "%s\n", (char *)__va(fw_dump.param_area));
}
static ssize_t bootargs_append_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
{
char *params;
if (!fw_dump.fadump_enabled || fw_dump.dump_active)
return -EPERM;
if (count >= COMMAND_LINE_SIZE)
return -EINVAL;
/*
* Fail here instead of handling this scenario with
* some silly workaround in capture kernel.
*/
if (saved_command_line_len + count >= COMMAND_LINE_SIZE) {
pr_err("Appending parameters exceeds cmdline size!\n");
return -ENOSPC;
}
params = __va(fw_dump.param_area);
strscpy_pad(params, buf, COMMAND_LINE_SIZE);
/* Remove newline character at the end. */
if (params[count-1] == '\n')
params[count-1] = '\0';
return count;
}
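The length checks and newline trimming in bootargs_append_store() above can be exercised outside the kernel. The following is a rough userspace sketch, not the kernel code: `CMDLINE_MAX_SKETCH` stands in for COMMAND_LINE_SIZE, `store_bootargs_sketch` is a hypothetical name, and memset plus memcpy approximates strscpy_pad. The `count > 0` guard is an addition made for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for COMMAND_LINE_SIZE; the real value is arch specific. */
#define CMDLINE_MAX_SKETCH 64

/*
 * Copy user supplied parameters into a fixed size, zero padded area.
 * saved_len plays the role of saved_command_line_len.
 * Returns count on success or a negative errno value.
 */
static long store_bootargs_sketch(char *area, size_t saved_len,
				  const char *buf, size_t count)
{
	if (count >= CMDLINE_MAX_SKETCH)
		return -EINVAL;

	/* Refuse input that would overflow the combined command line. */
	if (saved_len + count >= CMDLINE_MAX_SKETCH)
		return -ENOSPC;

	memset(area, 0, CMDLINE_MAX_SKETCH);
	memcpy(area, buf, count);

	/* Drop the trailing newline that "echo" style writes append. */
	if (count > 0 && area[count - 1] == '\n')
		area[count - 1] = '\0';

	return (long)count;
}
```

Failing early with -ENOSPC mirrors the intent stated in the comment above: reject parameters that cannot fit rather than work around the overflow in the capture kernel.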
static ssize_t registered_store(struct kobject *kobj,
struct kobj_attribute *attr,
const char *buf, size_t count)
@ -1556,11 +1561,14 @@ static struct kobj_attribute release_attr = __ATTR_WO(release_mem);
static struct kobj_attribute enable_attr = __ATTR_RO(enabled);
static struct kobj_attribute register_attr = __ATTR_RW(registered);
static struct kobj_attribute mem_reserved_attr = __ATTR_RO(mem_reserved);
static struct kobj_attribute hotplug_ready_attr = __ATTR_RO(hotplug_ready);
static struct kobj_attribute bootargs_append_attr = __ATTR_RW(bootargs_append);
static struct attribute *fadump_attrs[] = {
&enable_attr.attr,
&register_attr.attr,
&mem_reserved_attr.attr,
&hotplug_ready_attr.attr,
NULL,
};
@ -1632,6 +1640,150 @@ static void __init fadump_init_files(void)
return;
}
static int __init fadump_setup_elfcorehdr_buf(void)
{
int elf_phdr_cnt;
unsigned long elfcorehdr_size;
/*
* Program header for CPU notes comes first, followed by one for
* vmcoreinfo, and the remaining program headers correspond to
* memory regions.
*/
elf_phdr_cnt = 2 + fw_dump.boot_mem_regs_cnt + memblock_num_regions(memory);
elfcorehdr_size = sizeof(struct elfhdr) + (elf_phdr_cnt * sizeof(struct elf_phdr));
elfcorehdr_size = PAGE_ALIGN(elfcorehdr_size);
fw_dump.elfcorehdr_addr = (u64)fadump_alloc_buffer(elfcorehdr_size);
if (!fw_dump.elfcorehdr_addr) {
pr_err("Failed to allocate %lu bytes for elfcorehdr\n",
elfcorehdr_size);
return -ENOMEM;
}
fw_dump.elfcorehdr_size = elfcorehdr_size;
return 0;
}
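The buffer size computed in fadump_setup_elfcorehdr_buf() above is plain arithmetic: one ELF header, two note program headers, and one program header per boot memory region and per memblock region, rounded up to a page. A small sketch of that calculation follows. It assumes the standard ELF64 structure sizes (64-byte header, 56-byte program header), and the helper names and the explicit `page_size` parameter are hypothetical.

```c
#include <stdint.h>

/* Standard ELF64 structure sizes. */
#define ELF64_EHDR_SIZE_SKETCH 64u
#define ELF64_PHDR_SIZE_SKETCH 56u

/* Round size up to the next multiple of page_size (a power of two). */
static uint64_t page_align_sketch(uint64_t size, uint64_t page_size)
{
	return (size + page_size - 1) & ~(page_size - 1);
}

/*
 * Two program headers are always present (CPU notes and vmcoreinfo);
 * the rest describe boot memory regions and memblock regions.
 */
static uint64_t elfcorehdr_buf_size_sketch(unsigned int boot_regions,
					   unsigned int memblock_regions,
					   uint64_t page_size)
{
	uint64_t phdr_cnt = 2 + (uint64_t)boot_regions + memblock_regions;

	return page_align_sketch(ELF64_EHDR_SIZE_SKETCH +
				 phdr_cnt * ELF64_PHDR_SIZE_SKETCH, page_size);
}
```

With one boot region and three memblock regions the raw size is 64 + 6 * 56 = 400 bytes, so the result is a single page for any common page size.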
/*
* Check if the fadump header of crashed kernel is compatible with fadump kernel.
*
* It checks the magic number, endianness, and size of non-primitive type
* members of fadump header to ensure safe dump collection.
*/
static bool __init is_fadump_header_compatible(struct fadump_crash_info_header *fdh)
{
if (fdh->magic_number == FADUMP_CRASH_INFO_MAGIC_OLD) {
pr_err("Old magic number, can't process the dump.\n");
return false;
}
if (fdh->magic_number != FADUMP_CRASH_INFO_MAGIC) {
if (fdh->magic_number == swab64(FADUMP_CRASH_INFO_MAGIC))
pr_err("Endianness mismatch between the crashed and fadump kernels.\n");
else
pr_err("Fadump header is corrupted.\n");
return false;
}
/*
* Dump collection is not safe if the size of non-primitive type members
* of the fadump header do not match between crashed and fadump kernel.
*/
if (fdh->pt_regs_sz != sizeof(struct pt_regs) ||
fdh->cpu_mask_sz != sizeof(struct cpumask)) {
pr_err("Fadump header size mismatch.\n");
return false;
}
return true;
}
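The order of checks in is_fadump_header_compatible() above (old magic, then current magic with an endianness probe, then structure sizes) can be sketched in userspace as follows. The magic values below are placeholders chosen for illustration and are not the kernel's FADUMP_CRASH_INFO_MAGIC constants. The struct, enum and function names are hypothetical as well.

```c
#include <stdint.h>

/* Placeholder magic values, NOT the kernel's constants. */
#define MAGIC_NEW_SKETCH 0x0123456789abcdefULL
#define MAGIC_OLD_SKETCH 0x0f1e2d3c4b5a6978ULL

enum hdr_status_sketch {
	HDR_OK,
	HDR_OLD_MAGIC,
	HDR_ENDIAN_MISMATCH,
	HDR_CORRUPT,
	HDR_SIZE_MISMATCH,
};

struct hdr_sketch {
	uint64_t magic;
	uint32_t pt_regs_sz;
	uint32_t cpu_mask_sz;
};

/* Reverse the byte order of a 64-bit value. */
static uint64_t swab64_sketch(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/* Compare a header written by the crashed kernel with local expectations. */
static enum hdr_status_sketch check_header_sketch(const struct hdr_sketch *h,
						  uint32_t local_pt_regs_sz,
						  uint32_t local_cpu_mask_sz)
{
	if (h->magic == MAGIC_OLD_SKETCH)
		return HDR_OLD_MAGIC;

	if (h->magic != MAGIC_NEW_SKETCH) {
		/* A byte-swapped magic points at an endianness mismatch. */
		if (h->magic == swab64_sketch(MAGIC_NEW_SKETCH))
			return HDR_ENDIAN_MISMATCH;
		return HDR_CORRUPT;
	}

	if (h->pt_regs_sz != local_pt_regs_sz ||
	    h->cpu_mask_sz != local_cpu_mask_sz)
		return HDR_SIZE_MISMATCH;

	return HDR_OK;
}
```

Checking the magic before the sizes matters: with a byte-swapped header the size fields would be misread too, so the endianness diagnosis has to come first.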
static void __init fadump_process(void)
{
struct fadump_crash_info_header *fdh;
fdh = (struct fadump_crash_info_header *) __va(fw_dump.fadumphdr_addr);
if (!fdh) {
pr_err("Crash info header is empty.\n");
goto err_out;
}
/* Avoid processing the dump if fadump header isn't compatible */
if (!is_fadump_header_compatible(fdh))
goto err_out;
/* Allocate buffer for elfcorehdr */
if (fadump_setup_elfcorehdr_buf())
goto err_out;
fadump_populate_elfcorehdr(fdh);
/* Let platform update the CPU notes in elfcorehdr */
if (fw_dump.ops->fadump_process(&fw_dump) < 0)
goto err_out;
/*
* elfcorehdr is now ready to be exported.
*
* set elfcorehdr_addr so that vmcore module will export the
* elfcorehdr through '/proc/vmcore'.
*/
elfcorehdr_addr = virt_to_phys((void *)fw_dump.elfcorehdr_addr);
return;
err_out:
fadump_invalidate_release_mem();
}
/*
* Reserve memory to store additional parameters to be passed
* for fadump/capture kernel.
*/
static void __init fadump_setup_param_area(void)
{
phys_addr_t range_start, range_end;
if (!fw_dump.param_area_supported || fw_dump.dump_active)
return;
/* This memory can't be used by PFW or bootloader as it is shared across kernels */
if (radix_enabled()) {
/*
* Anywhere in the upper half should be good enough as all memory
* is accessible in real mode.
*/
range_start = memblock_end_of_DRAM() / 2;
range_end = memblock_end_of_DRAM();
} else {
/*
* Passing additional parameters is supported for hash MMU only
* if the first memory block size is 768MB or higher.
*/
if (ppc64_rma_size < 0x30000000)
return;
/*
* 640 MB to 768 MB is not used by PFW/bootloader. So, try reserving
* memory for passing additional parameters in this range to avoid
* being stomped on by PFW/bootloader.
*/
range_start = 0x2A000000;
range_end = range_start + 0x4000000;
}
fw_dump.param_area = memblock_phys_alloc_range(COMMAND_LINE_SIZE,
COMMAND_LINE_SIZE,
range_start,
range_end);
if (!fw_dump.param_area || sysfs_create_file(fadump_kobj, &bootargs_append_attr.attr)) {
pr_warn("WARNING: Could not setup area to pass additional parameters!\n");
return;
}
memset(phys_to_virt(fw_dump.param_area), 0, COMMAND_LINE_SIZE);
}
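The range selection in fadump_setup_param_area() above reduces to a small decision: the upper half of memory for radix, or a fixed window for hash when the RMA is at least 768 MB (0x30000000). Working through the hash constants, 0x2A000000 is 672 MB and adding 0x4000000 (64 MB) gives 736 MB, so the requested window sits inside the 640 MB to 768 MB span the comment describes. The sketch below restates this in userspace. `range_sketch` and `pick_param_range_sketch` are hypothetical names, and the inputs stand in for radix_enabled(), memblock_end_of_DRAM() and ppc64_rma_size.

```c
#include <stdbool.h>
#include <stdint.h>

struct range_sketch {
	uint64_t start;
	uint64_t end;
};

/*
 * Pick the physical window in which to allocate the parameter area.
 * Returns false when no suitable window exists (hash MMU, small RMA).
 */
static bool pick_param_range_sketch(bool radix, uint64_t dram_end,
				    uint64_t rma_size, struct range_sketch *r)
{
	if (radix) {
		/* Anywhere in the upper half of memory is acceptable. */
		r->start = dram_end / 2;
		r->end = dram_end;
		return true;
	}

	/* Hash MMU: requires an RMA of at least 768 MB. */
	if (rma_size < 0x30000000ULL)
		return false;

	r->start = 0x2A000000ULL;		/* 672 MB */
	r->end = r->start + 0x4000000ULL;	/* 736 MB */
	return true;
}
```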
/*
* Prepare for firmware-assisted dump.
*/
@ -1651,15 +1803,11 @@ int __init setup_fadump(void)
* saving it to the disk.
*/
if (fw_dump.dump_active) {
/*
* if dump process fails then invalidate the registration
* and release memory before proceeding for re-registration.
*/
if (fw_dump.ops->fadump_process(&fw_dump) < 0)
fadump_invalidate_release_mem();
fadump_process();
}
/* Initialize the kernel dump memory structure and register with f/w */
else if (fw_dump.reserve_dump_area_size) {
fadump_setup_param_area();
fw_dump.ops->fadump_init_mem_struct(&fw_dump);
register_fadump();
}


@ -192,7 +192,7 @@ _GLOBAL(scom970_read)
xori r0,r0,MSR_EE
mtmsrd r0,1
/* rotate 24 bits SCOM address 8 bits left and mask out it's low 8 bits
/* rotate 24 bits SCOM address 8 bits left and mask out its low 8 bits
* (including parity). On current CPUs they must be 0'd,
* and finally or in RW bit
*/
@ -226,7 +226,7 @@ _GLOBAL(scom970_write)
xori r0,r0,MSR_EE
mtmsrd r0,1
/* rotate 24 bits SCOM address 8 bits left and mask out it's low 8 bits
/* rotate 24 bits SCOM address 8 bits left and mask out its low 8 bits
* (including parity). On current CPUs they must be 0'd.
*/


@ -16,8 +16,6 @@
#include <asm/setup.h>
#include <asm/sections.h>
static LIST_HEAD(module_bug_list);
static const Elf_Shdr *find_section(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
const char *name)


@ -1185,6 +1185,9 @@ static inline void save_sprs(struct thread_struct *t)
if (cpu_has_feature(CPU_FTR_DEXCR_NPHIE))
t->hashkeyr = mfspr(SPRN_HASHKEYR);
if (cpu_has_feature(CPU_FTR_ARCH_31))
t->dexcr = mfspr(SPRN_DEXCR);
#endif
}
@ -1267,6 +1270,10 @@ static inline void restore_sprs(struct thread_struct *old_thread,
if (cpu_has_feature(CPU_FTR_DEXCR_NPHIE) &&
old_thread->hashkeyr != new_thread->hashkeyr)
mtspr(SPRN_HASHKEYR, new_thread->hashkeyr);
if (cpu_has_feature(CPU_FTR_ARCH_31) &&
old_thread->dexcr != new_thread->dexcr)
mtspr(SPRN_DEXCR, new_thread->dexcr);
#endif
}
@ -1634,6 +1641,13 @@ void arch_setup_new_exec(void)
current->thread.regs->amr = default_amr;
current->thread.regs->iamr = default_iamr;
#endif
#ifdef CONFIG_PPC_BOOK3S_64
if (cpu_has_feature(CPU_FTR_ARCH_31)) {
current->thread.dexcr = current->thread.dexcr_onexec;
mtspr(SPRN_DEXCR, current->thread.dexcr);
}
#endif /* CONFIG_PPC_BOOK3S_64 */
}
#ifdef CONFIG_PPC64
@ -1647,7 +1661,7 @@ void arch_setup_new_exec(void)
* cases will happen:
*
* 1. The correct thread is running, the wrong thread is not
* In this situation, the correct thread is woken and proceeds to pass it's
* In this situation, the correct thread is woken and proceeds to pass its
* condition check.
*
* 2. Neither threads are running
@ -1657,15 +1671,15 @@ void arch_setup_new_exec(void)
* for the wrong thread, or they will execute the condition check immediately.
*
* 3. The wrong thread is running, the correct thread is not
* The wrong thread will be woken, but will fail it's condition check and
* The wrong thread will be woken, but will fail its condition check and
* re-execute wait. The correct thread, when scheduled, will execute either
* it's condition check (which will pass), or wait, which returns immediately
* when called the first time after the thread is scheduled, followed by it's
* its condition check (which will pass), or wait, which returns immediately
* when called the first time after the thread is scheduled, followed by its
* condition check (which will pass).
*
* 4. Both threads are running
* Both threads will be woken. The wrong thread will fail it's condition check
* and execute another wait, while the correct thread will pass it's condition
* Both threads will be woken. The wrong thread will fail its condition check
* and execute another wait, while the correct thread will pass its condition
* check.
*
* @t: the task to set the thread ID for
@ -1878,6 +1892,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
#ifdef CONFIG_PPC_BOOK3S_64
if (cpu_has_feature(CPU_FTR_DEXCR_NPHIE))
p->thread.hashkeyr = current->thread.hashkeyr;
if (cpu_has_feature(CPU_FTR_ARCH_31))
p->thread.dexcr = mfspr(SPRN_DEXCR);
#endif
return 0;
}
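Taken together, the DEXCR hunks above give each thread its own register value: it is saved on switch-out, written back on switch-in only when it differs, reset to the on-exec value at exec, and inherited from the live register at fork. The following is a simplified userspace model of that lifecycle. The SPR is simulated by a global variable and every name is hypothetical; it only illustrates the bookkeeping, not the kernel implementation.

```c
#include <stdint.h>

struct thread_sketch {
	uint64_t dexcr;		/* value the thread currently runs with */
	uint64_t dexcr_onexec;	/* value to apply at the next exec */
};

static uint64_t hw_dexcr;	/* simulated DEXCR register */
static unsigned int spr_writes;	/* counts simulated mtspr operations */

static void write_dexcr_sketch(uint64_t v)
{
	hw_dexcr = v;
	spr_writes++;
}

/* Context switch: skip the register write when the values match. */
static void switch_to_sketch(const struct thread_sketch *old_t,
			     const struct thread_sketch *new_t)
{
	if (old_t->dexcr != new_t->dexcr)
		write_dexcr_sketch(new_t->dexcr);
}

/* exec: the on-exec value becomes the running value. */
static void on_exec_sketch(struct thread_sketch *t)
{
	t->dexcr = t->dexcr_onexec;
	write_dexcr_sketch(t->dexcr);
}

/* fork: the child starts with the value currently in the register. */
static void on_fork_sketch(struct thread_sketch *child)
{
	child->dexcr = hw_dexcr;
}
```

Comparing old and new values before writing keeps the common case, where both threads share a configuration, free of SPR writes.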


@ -779,7 +779,7 @@ static inline void save_fscr_to_task(void) {}
void __init early_init_devtree(void *params)
{
phys_addr_t limit;
phys_addr_t int_vector_size;
DBG(" -> early_init_devtree(%px)\n", params);
@ -813,6 +813,9 @@ void __init early_init_devtree(void *params)
*/
of_scan_flat_dt(early_init_dt_scan_chosen_ppc, boot_command_line);
/* Append additional parameters passed for fadump capture kernel */
fadump_append_bootargs();
/* Scan memory nodes and rebuild MEMBLOCKs */
early_init_dt_scan_root();
early_init_dt_scan_memory_ppc();
@ -832,9 +835,16 @@ void __init early_init_devtree(void *params)
setup_initial_memory_limit(memstart_addr, first_memblock_size);
/* Reserve MEMBLOCK regions used by kernel, initrd, dt, etc... */
memblock_reserve(PHYSICAL_START, __pa(_end) - PHYSICAL_START);
#ifdef CONFIG_PPC64
/* If relocatable, reserve at least 32k for interrupt vectors etc. */
int_vector_size = __end_interrupts - _stext;
int_vector_size = max_t(phys_addr_t, SZ_32K, int_vector_size);
#else
/* If relocatable, reserve first 32k for interrupt vectors etc. */
int_vector_size = SZ_32K;
#endif
if (PHYSICAL_START > MEMORY_START)
memblock_reserve(MEMORY_START, 0x8000);
memblock_reserve(MEMORY_START, int_vector_size);
reserve_kdump_trampoline();
#if defined(CONFIG_FA_DUMP) || defined(CONFIG_PRESERVE_FA_DUMP)
/*
@ -846,9 +856,12 @@ void __init early_init_devtree(void *params)
reserve_crashkernel();
early_reserve_mem();
/* Ensure that total memory size is page-aligned. */
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);
if (memory_limit > memblock_phys_mem_size())
memory_limit = 0;
/* Align down to 16 MB which is large page size with hash page translation */
memory_limit = ALIGN_DOWN(memory_limit ?: memblock_phys_mem_size(), SZ_16M);
memblock_enforce_memory_limit(memory_limit);
#if defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_PPC_4K_PAGES)
if (!early_radix_enabled())


@ -817,8 +817,8 @@ static void __init early_cmdline_parse(void)
opt += 4;
prom_memory_limit = prom_memparse(opt, (const char **)&opt);
#ifdef CONFIG_PPC64
/* Align to 16 MB == size of ppc64 large page */
prom_memory_limit = ALIGN(prom_memory_limit, 0x1000000);
/* Align down to 16 MB which is large page size with hash page translation */
prom_memory_limit = ALIGN_DOWN(prom_memory_limit, SZ_16M);
#endif
}
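The switch from ALIGN to ALIGN_DOWN above changes the direction of rounding: a memory limit that is not a multiple of 16 MB now shrinks to the previous boundary instead of growing to the next one. A minimal sketch of the two operations follows, using hypothetical helper names in place of the kernel macros.

```c
#include <stdint.h>

#define SZ_16M_SKETCH 0x1000000ULL

/* Round x up to the next multiple of a (a must be a power of two). */
static uint64_t align_up_sketch(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Round x down to the previous multiple of a (a must be a power of two). */
static uint64_t align_down_sketch(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}
```

For a requested limit of 1000 MB, rounding up gives 1008 MB while rounding down gives 992 MB, so the aligned limit can no longer exceed what was asked for.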


@ -12,7 +12,7 @@ void flush_tmregs_to_thread(struct task_struct *tsk)
{
/*
* If task is not current, it will have been flushed already to
* it's thread_struct during __switch_to().
* its thread_struct during __switch_to().
*
* A reclaim flushes ALL the state or if not in TM save TM SPRs
* in the appropriate thread structures from live.


@ -469,12 +469,7 @@ static int dexcr_get(struct task_struct *target, const struct user_regset *regse
if (!cpu_has_feature(CPU_FTR_ARCH_31))
return -ENODEV;
/*
* The DEXCR is currently static across all CPUs, so we don't
* store the target's value anywhere, but the static value
* will also be correct.
*/
membuf_store(&to, (u64)lower_32_bits(DEXCR_INIT));
membuf_store(&to, (u64)lower_32_bits(target->thread.dexcr));
/*
* Technically the HDEXCR is per-cpu, but a hypervisor can't reasonably


@ -405,7 +405,7 @@ static void __init cpu_init_thread_core_maps(int tpc)
cpumask_set_cpu(i, &threads_core_mask);
printk(KERN_INFO "CPU maps initialized for %d thread%s per core\n",
tpc, tpc > 1 ? "s" : "");
tpc, str_plural(tpc));
printk(KERN_DEBUG " (thread shift is %d)\n", threads_shift);
}


@ -834,6 +834,7 @@ static __init int pcpu_cpu_to_node(int cpu)
unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
EXPORT_SYMBOL(__per_cpu_offset);
DEFINE_STATIC_KEY_FALSE(__percpu_first_chunk_is_paged);
void __init setup_per_cpu_areas(void)
{
@ -876,6 +877,7 @@ void __init setup_per_cpu_areas(void)
if (rc < 0)
panic("cannot initialize percpu area (err=%d)", rc);
static_key_enable(&__percpu_first_chunk_is_paged.key);
delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
for_each_possible_cpu(cpu) {
__per_cpu_offset[cpu] = delta + pcpu_unit_offsets[cpu];


@ -1567,7 +1567,7 @@ static void add_cpu_to_masks(int cpu)
/*
* This CPU will not be in the online mask yet so we need to manually
* add it to it's own thread sibling mask.
* add it to its own thread sibling mask.
*/
map_cpu_to_node(cpu, cpu_to_node(cpu));
cpumask_set_cpu(cpu, cpu_sibling_mask(cpu));


@ -139,7 +139,7 @@ static unsigned long dscr_default;
* @val: Returned cpu specific DSCR default value
*
* This function returns the per cpu DSCR default value
* for any cpu which is contained in it's PACA structure.
* for any cpu which is contained in its PACA structure.
*/
static void read_dscr(void *val)
{
@ -152,7 +152,7 @@ static void read_dscr(void *val)
* @val: New cpu specific DSCR default value to update
*
* This function updates the per cpu DSCR default value
* for any cpu which is contained in it's PACA structure.
* for any cpu which is contained in its PACA structure.
*/
static void write_dscr(void *val)
{


@ -3,11 +3,11 @@
# Makefile for the linux kernel.
#
obj-y += core.o core_$(BITS).o
obj-y += core.o core_$(BITS).o ranges.o
obj-$(CONFIG_PPC32) += relocate_32.o
obj-$(CONFIG_KEXEC_FILE) += file_load.o ranges.o file_load_$(BITS).o elf_$(BITS).o
obj-$(CONFIG_KEXEC_FILE) += file_load.o file_load_$(BITS).o elf_$(BITS).o
obj-$(CONFIG_VMCORE_INFO) += vmcore_info.o
obj-$(CONFIG_CRASH_DUMP) += crash.o


@ -17,6 +17,7 @@
#include <linux/cpu.h>
#include <linux/hardirq.h>
#include <linux/of.h>
#include <linux/libfdt.h>
#include <asm/page.h>
#include <asm/current.h>
@ -30,6 +31,7 @@
#include <asm/hw_breakpoint.h>
#include <asm/svm.h>
#include <asm/ultravisor.h>
#include <asm/crashdump-ppc64.h>
int machine_kexec_prepare(struct kimage *image)
{
@ -419,3 +421,92 @@ static int __init export_htab_values(void)
}
late_initcall(export_htab_values);
#endif /* CONFIG_PPC_64S_HASH_MMU */
#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
/**
* add_node_props - Read node properties from a device node structure and add
* them to the fdt.
* @fdt: Flattened device tree of the kernel
* @node_offset: offset of the node to add a property at
* @dn: device node pointer
*
* Returns 0 on success, negative errno on error.
*/
static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
{
int ret = 0;
struct property *pp;
if (!dn)
return -EINVAL;
for_each_property_of_node(dn, pp) {
ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
if (ret < 0) {
pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
return ret;
}
}
return ret;
}
/**
* update_cpus_node - Update cpus node of flattened device tree using of_root
* device node.
* @fdt: Flattened device tree of the kernel.
*
* Returns 0 on success, negative errno on error.
*/
int update_cpus_node(void *fdt)
{
struct device_node *cpus_node, *dn;
int cpus_offset, cpus_subnode_offset, ret = 0;
cpus_offset = fdt_path_offset(fdt, "/cpus");
if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
pr_err("Malformed device tree: error reading /cpus node: %s\n",
fdt_strerror(cpus_offset));
return cpus_offset;
}
if (cpus_offset > 0) {
ret = fdt_del_node(fdt, cpus_offset);
if (ret < 0) {
pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
return -EINVAL;
}
}
/* Add cpus node to fdt */
cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
if (cpus_offset < 0) {
pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
return -EINVAL;
}
/* Add cpus node properties */
cpus_node = of_find_node_by_path("/cpus");
ret = add_node_props(fdt, cpus_offset, cpus_node);
of_node_put(cpus_node);
if (ret < 0)
return ret;
/* Loop through all subnodes of cpus and add them to fdt */
for_each_node_by_type(dn, "cpu") {
cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
if (cpus_subnode_offset < 0) {
pr_err("Unable to add %s subnode: %s\n", dn->full_name,
fdt_strerror(cpus_subnode_offset));
ret = cpus_subnode_offset;
goto out;
}
ret = add_node_props(fdt, cpus_subnode_offset, dn);
if (ret < 0)
goto out;
}
out:
of_node_put(dn);
return ret;
}
#endif /* CONFIG_KEXEC_FILE || CONFIG_CRASH_DUMP */


@ -16,6 +16,8 @@
#include <linux/delay.h>
#include <linux/irq.h>
#include <linux/types.h>
#include <linux/libfdt.h>
#include <linux/memory.h>
#include <asm/processor.h>
#include <asm/machdep.h>
@ -24,6 +26,7 @@
#include <asm/setjmp.h>
#include <asm/debug.h>
#include <asm/interrupt.h>
#include <asm/kexec_ranges.h>
/*
* The primary CPU waits a while for all secondary CPUs to enter. This is to
@ -392,3 +395,195 @@ void default_machine_crash_shutdown(struct pt_regs *regs)
if (ppc_md.kexec_cpu_down)
ppc_md.kexec_cpu_down(1, 0);
}
#ifdef CONFIG_CRASH_HOTPLUG
#undef pr_fmt
#define pr_fmt(fmt) "crash hp: " fmt
/*
* Advertise preferred elfcorehdr size to userspace via
* /sys/kernel/crash_elfcorehdr_size sysfs interface.
*/
unsigned int arch_crash_get_elfcorehdr_size(void)
{
unsigned long phdr_cnt;
/* A program header for possible CPUs + vmcoreinfo */
phdr_cnt = num_possible_cpus() + 1;
if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
phdr_cnt += CONFIG_CRASH_MAX_MEMORY_RANGES;
return sizeof(struct elfhdr) + (phdr_cnt * sizeof(Elf64_Phdr));
}
/**
* update_crash_elfcorehdr() - Recreate the elfcorehdr and replace the old
* elfcorehdr in the kexec segment array with it.
* @image: the active struct kimage
* @mn: struct memory_notify data handler
*/
static void update_crash_elfcorehdr(struct kimage *image, struct memory_notify *mn)
{
int ret;
struct crash_mem *cmem = NULL;
struct kexec_segment *ksegment;
void *ptr, *mem, *elfbuf = NULL;
unsigned long elfsz, memsz, base_addr, size;
ksegment = &image->segment[image->elfcorehdr_index];
mem = (void *) ksegment->mem;
memsz = ksegment->memsz;
ret = get_crash_memory_ranges(&cmem);
if (ret) {
pr_err("Failed to get crash mem range\n");
return;
}
/*
* The hot unplugged memory is part of crash memory ranges,
* remove it here.
*/
if (image->hp_action == KEXEC_CRASH_HP_REMOVE_MEMORY) {
base_addr = PFN_PHYS(mn->start_pfn);
size = mn->nr_pages * PAGE_SIZE;
ret = remove_mem_range(&cmem, base_addr, size);
if (ret) {
pr_err("Failed to remove hot-unplugged memory from crash memory ranges\n");
goto out;
}
}
ret = crash_prepare_elf64_headers(cmem, false, &elfbuf, &elfsz);
if (ret) {
pr_err("Failed to prepare elf header\n");
goto out;
}
/*
* The kernel is unlikely to hit this because the elfcorehdr kexec
* segment (memsz) is built with additional space to accommodate a growing
* number of crash memory ranges while loading the kdump kernel. The
* check only guards against unforeseen cases.
*/
if (elfsz > memsz) {
pr_err("Updated crash elfcorehdr elfsz %lu > memsz %lu", elfsz, memsz);
goto out;
}
ptr = __va(mem);
if (ptr) {
/* Temporarily invalidate the crash image while it is replaced */
xchg(&kexec_crash_image, NULL);
/* Replace the old elfcorehdr with newly prepared elfcorehdr */
memcpy((void *)ptr, elfbuf, elfsz);
/* The crash image is now valid once again */
xchg(&kexec_crash_image, image);
}
out:
kvfree(cmem);
kvfree(elfbuf);
}
/**
* get_fdt_index - Loop through the kexec segment array and find
* the index of the FDT segment.
* @image: a pointer to kexec_crash_image
*
* Returns the index of FDT segment in the kexec segment array
* if found; otherwise -1.
*/
static int get_fdt_index(struct kimage *image)
{
void *ptr;
unsigned long mem;
int i, fdt_index = -1;
/* Find the FDT segment index in kexec segment array. */
for (i = 0; i < image->nr_segments; i++) {
mem = image->segment[i].mem;
ptr = __va(mem);
if (ptr && fdt_magic(ptr) == FDT_MAGIC) {
fdt_index = i;
break;
}
}
return fdt_index;
}
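get_fdt_index() above identifies the device tree segment by its magic number rather than by position. A flattened device tree begins with the 32-bit big-endian value 0xd00dfeed, so the scan amounts to reading the first four bytes of each segment. Below is a userspace sketch of that lookup over plain byte buffers. The function names are hypothetical, and each non-NULL buffer is assumed to hold at least four bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC_SKETCH 0xd00dfeedu

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t read_be32_sketch(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Return the index of the first segment that starts with the FDT magic,
 * or -1 if none does. NULL entries are skipped.
 */
static int find_fdt_index_sketch(const unsigned char *const segs[], size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (segs[i] && read_be32_sketch(segs[i]) == FDT_MAGIC_SKETCH)
			return (int)i;
	}
	return -1;
}
```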
/**
* update_crash_fdt - updates the cpus node of the crash FDT.
*
* @image: a pointer to kexec_crash_image
*/
static void update_crash_fdt(struct kimage *image)
{
void *fdt;
int fdt_index;
fdt_index = get_fdt_index(image);
if (fdt_index < 0) {
pr_err("Unable to locate FDT segment.\n");
return;
}
fdt = __va((void *)image->segment[fdt_index].mem);
/* Temporarily invalidate the crash image while it is replaced */
xchg(&kexec_crash_image, NULL);
/* update FDT to reflect changes in CPU resources */
if (update_cpus_node(fdt))
pr_err("Failed to update crash FDT\n");
/* The crash image is now valid once again */
xchg(&kexec_crash_image, image);
}
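Both update_crash_elfcorehdr() and update_crash_fdt() use the same pattern: swap the global image pointer to NULL, rewrite the segment in place, then swap the pointer back. A rough userspace sketch of that pattern with C11 atomics is shown here. `struct image`, `crash_image` and `replace_header()` are illustrative names only, and `atomic_exchange()` stands in for the kernel's xchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical image with one in-place updatable header segment. */
struct image {
	char hdr[64];
};

/* Stand-in for kexec_crash_image: readers treat NULL as "no image loaded". */
static _Atomic(struct image *) crash_image;

static void replace_header(struct image *img, const char *buf, size_t len)
{
	assert(len <= sizeof(img->hdr));

	/* Temporarily invalidate the image while its header is rewritten. */
	atomic_exchange(&crash_image, (struct image *)NULL);

	memcpy(img->hdr, buf, len);

	/* The image is valid again. */
	atomic_exchange(&crash_image, img);
}
```

The sketch only illustrates the ordering of the three steps. Presumably the point in the kernel is that anything consulting the pointer while the segment is being rewritten finds no image rather than a half-written one.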
int arch_crash_hotplug_support(struct kimage *image, unsigned long kexec_flags)
{
#ifdef CONFIG_KEXEC_FILE
if (image->file_mode)
return 1;
#endif
return kexec_flags & KEXEC_CRASH_HOTPLUG_SUPPORT;
}
/**
* arch_crash_handle_hotplug_event - Handle crash CPU/Memory hotplug events to update the
* necessary kexec segments based on the hotplug event.
* @image: a pointer to kexec_crash_image
* @arg: struct memory_notify handler for memory hotplug case and NULL for CPU hotplug case.
*
* Update the kdump image based on the type of hotplug event, represented by image->hp_action.
* CPU add: Update the FDT segment to include the newly added CPU.
* CPU remove: No action is needed, with the assumption that it's okay to have offline CPUs
* part of the FDT.
* Memory add/remove: Update the elfcorehdr segment to reflect the current crash memory ranges.
*/
void arch_crash_handle_hotplug_event(struct kimage *image, void *arg)
{
struct memory_notify *mn;
switch (image->hp_action) {
case KEXEC_CRASH_HP_REMOVE_CPU:
return;
case KEXEC_CRASH_HP_ADD_CPU:
update_crash_fdt(image);
break;
case KEXEC_CRASH_HP_REMOVE_MEMORY:
case KEXEC_CRASH_HP_ADD_MEMORY:
mn = (struct memory_notify *)arg;
update_crash_elfcorehdr(image, mn);
return;
default:
pr_warn_once("Unknown hotplug action\n");
}
}
#endif /* CONFIG_CRASH_HOTPLUG */

@ -116,7 +116,8 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
if (ret)
goto out_free_fdt;
fdt_pack(fdt);
if (!IS_ENABLED(CONFIG_CRASH_HOTPLUG) || image->type != KEXEC_TYPE_CRASH)
fdt_pack(fdt);
kbuf.buffer = fdt;
kbuf.bufsz = kbuf.memsz = fdt_totalsize(fdt);

@ -30,6 +30,7 @@
#include <asm/iommu.h>
#include <asm/prom.h>
#include <asm/plpks.h>
#include <asm/cputhreads.h>
struct umem_info {
__be64 *buf; /* data buffer for usable-memory property */
@ -47,83 +48,6 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
NULL
};
/**
* get_exclude_memory_ranges - Get exclude memory ranges. This list includes
* regions like opal/rtas, tce-table, initrd,
* kernel, htab which should be avoided while
* setting up kexec load segments.
* @mem_ranges: Range list to add the memory ranges to.
*
* Returns 0 on success, negative errno on error.
*/
static int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
{
int ret;
ret = add_tce_mem_ranges(mem_ranges);
if (ret)
goto out;
ret = add_initrd_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_htab_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_kernel_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_opal_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_reserved_mem_ranges(mem_ranges);
if (ret)
goto out;
/* exclude memory ranges should be sorted for easy lookup */
sort_memory_ranges(*mem_ranges, true);
out:
if (ret)
pr_err("Failed to setup exclude memory ranges\n");
return ret;
}
/**
* get_reserved_memory_ranges - Get reserve memory ranges. This list includes
* memory regions that should be added to the
* memory reserve map to ensure the region is
* protected from any mischief.
* @mem_ranges: Range list to add the memory ranges to.
*
* Returns 0 on success, negative errno on error.
*/
static int get_reserved_memory_ranges(struct crash_mem **mem_ranges)
{
int ret;
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_tce_mem_ranges(mem_ranges);
if (ret)
goto out;
ret = add_reserved_mem_ranges(mem_ranges);
out:
if (ret)
pr_err("Failed to setup reserved memory ranges\n");
return ret;
}
/**
* __locate_mem_hole_top_down - Looks top down for a large enough memory hole
* in the memory regions between buf_min & buf_max
@ -322,119 +246,6 @@ static int locate_mem_hole_bottom_up_ppc64(struct kexec_buf *kbuf,
}
#ifdef CONFIG_CRASH_DUMP
/**
* get_usable_memory_ranges - Get usable memory ranges. This list includes
* regions like crashkernel, opal/rtas & tce-table,
* that kdump kernel could use.
* @mem_ranges: Range list to add the memory ranges to.
*
* Returns 0 on success, negative errno on error.
*/
static int get_usable_memory_ranges(struct crash_mem **mem_ranges)
{
int ret;
/*
* Early boot failure observed on guests when low memory (first memory
* block?) is not added to usable memory. So, add [0, crashk_res.end]
* instead of [crashk_res.start, crashk_res.end] to workaround it.
* Also, crashed kernel's memory must be added to reserve map to
* avoid kdump kernel from using it.
*/
ret = add_mem_range(mem_ranges, 0, crashk_res.end + 1);
if (ret)
goto out;
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_opal_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_tce_mem_ranges(mem_ranges);
out:
if (ret)
pr_err("Failed to setup usable memory ranges\n");
return ret;
}
/**
* get_crash_memory_ranges - Get crash memory ranges. This list includes
* first/crashing kernel's memory regions that
* would be exported via an elfcore.
* @mem_ranges: Range list to add the memory ranges to.
*
* Returns 0 on success, negative errno on error.
*/
static int get_crash_memory_ranges(struct crash_mem **mem_ranges)
{
phys_addr_t base, end;
struct crash_mem *tmem;
u64 i;
int ret;
for_each_mem_range(i, &base, &end) {
u64 size = end - base;
/* Skip backup memory region, which needs a separate entry */
if (base == BACKUP_SRC_START) {
if (size > BACKUP_SRC_SIZE) {
base = BACKUP_SRC_END + 1;
size -= BACKUP_SRC_SIZE;
} else
continue;
}
ret = add_mem_range(mem_ranges, base, size);
if (ret)
goto out;
/* Try merging adjacent ranges before reallocation attempt */
if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
sort_memory_ranges(*mem_ranges, true);
}
/* Reallocate memory ranges if there is no space to split ranges */
tmem = *mem_ranges;
if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
tmem = realloc_mem_ranges(mem_ranges);
if (!tmem)
goto out;
}
/* Exclude crashkernel region */
ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
if (ret)
goto out;
/*
* FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
* regions are exported to save their context at the time of
* crash, they should actually be backed up just like the
* first 64K bytes of memory.
*/
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_opal_mem_range(mem_ranges);
if (ret)
goto out;
/* create a separate program header for the backup region */
ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
if (ret)
goto out;
sort_memory_ranges(*mem_ranges, false);
out:
if (ret)
pr_err("Failed to setup crash memory ranges\n");
return ret;
}
/**
* check_realloc_usable_mem - Reallocate buffer if it can't accommodate entries
* @um_info: Usable memory buffer and ranges info.
@ -784,6 +595,23 @@ static void update_backup_region_phdr(struct kimage *image, Elf64_Ehdr *ehdr)
}
}
static unsigned int kdump_extra_elfcorehdr_size(struct crash_mem *cmem)
{
#if defined(CONFIG_CRASH_HOTPLUG) && defined(CONFIG_MEMORY_HOTPLUG)
unsigned int extra_sz = 0;
if (CONFIG_CRASH_MAX_MEMORY_RANGES > (unsigned int)PN_XNUM)
pr_warn("Number of Phdrs %u exceeds max\n", CONFIG_CRASH_MAX_MEMORY_RANGES);
else if (cmem->nr_ranges >= CONFIG_CRASH_MAX_MEMORY_RANGES)
pr_warn("Configured crash mem ranges may not be enough\n");
else
extra_sz = (CONFIG_CRASH_MAX_MEMORY_RANGES - cmem->nr_ranges) * sizeof(Elf64_Phdr);
return extra_sz;
#endif
return 0;
}
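The padding computed by kdump_extra_elfcorehdr_size() is one program header per memory range that could still be added. A small userspace sketch of that arithmetic follows. `max_ranges` stands in for CONFIG_CRASH_MAX_MEMORY_RANGES, and the local struct mirrors the standard Elf64_Phdr layout so the example does not depend on <elf.h>; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PN_XNUM_SKETCH 0xffffu /* largest program header count e_phnum can express */

/* Local mirror of Elf64_Phdr (56 bytes), used only for sizeof(). */
struct elf64_phdr_sketch {
	uint32_t p_type;
	uint32_t p_flags;
	uint64_t p_offset;
	uint64_t p_vaddr;
	uint64_t p_paddr;
	uint64_t p_filesz;
	uint64_t p_memsz;
	uint64_t p_align;
};

/*
 * Extra bytes to reserve in the elfcorehdr segment so that up to
 * max_ranges program headers fit without reloading the kdump image.
 */
static size_t extra_elfcorehdr_size(unsigned int nr_ranges, unsigned int max_ranges)
{
	if (max_ranges > PN_XNUM_SKETCH)
		return 0; /* configured limit exceeds what the ELF header allows */
	if (nr_ranges >= max_ranges)
		return 0; /* already at or beyond the configured limit */
	return (size_t)(max_ranges - nr_ranges) * sizeof(struct elf64_phdr_sketch);
}
```

As in the kernel function, the two failure cases reserve no extra space; the kernel additionally warns in those cases.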
/**
* load_elfcorehdr_segment - Setup crash memory ranges and initialize elfcorehdr
* segment needed to load kdump kernel.
@ -815,7 +643,8 @@ static int load_elfcorehdr_segment(struct kimage *image, struct kexec_buf *kbuf)
kbuf->buffer = headers;
kbuf->mem = KEXEC_BUF_MEM_UNKNOWN;
kbuf->bufsz = kbuf->memsz = headers_sz;
kbuf->bufsz = headers_sz;
kbuf->memsz = headers_sz + kdump_extra_elfcorehdr_size(cmem);
kbuf->top_down = false;
ret = kexec_add_buffer(kbuf);
@ -979,6 +808,9 @@ static unsigned int kdump_extra_fdt_size_ppc64(struct kimage *image)
unsigned int cpu_nodes, extra_size = 0;
struct device_node *dn;
u64 usm_entries;
#ifdef CONFIG_CRASH_HOTPLUG
unsigned int possible_cpu_nodes;
#endif
if (!IS_ENABLED(CONFIG_CRASH_DUMP) || image->type != KEXEC_TYPE_CRASH)
return 0;
@ -1006,6 +838,19 @@ static unsigned int kdump_extra_fdt_size_ppc64(struct kimage *image)
if (cpu_nodes > boot_cpu_node_count)
extra_size += (cpu_nodes - boot_cpu_node_count) * cpu_node_size();
#ifdef CONFIG_CRASH_HOTPLUG
/*
* Make sure enough space is reserved to accommodate possible CPU nodes
* in the crash FDT. This allows packing possible CPU nodes which are
* not yet present in the system without regenerating the entire FDT.
*/
if (image->type == KEXEC_TYPE_CRASH) {
possible_cpu_nodes = num_possible_cpus() / threads_per_core;
if (possible_cpu_nodes > cpu_nodes)
extra_size += (possible_cpu_nodes - cpu_nodes) * cpu_node_size();
}
#endif
return extra_size;
}
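The extra FDT space reserved for CPU hotplug is a simple product: the number of CPU nodes that may still appear, times the size of one node. The sketch below works through that arithmetic in userspace. `node_size` stands in for cpu_node_size(), and the values in the usage note are made up for illustration.

```c
#include <assert.h>

/*
 * Bytes to reserve for CPU nodes that are possible but not yet present.
 * One device tree CPU node covers threads_per_core logical CPUs.
 */
static unsigned int extra_fdt_for_possible_cpus(unsigned int possible_cpus,
						unsigned int threads_per_core,
						unsigned int present_nodes,
						unsigned int node_size)
{
	unsigned int possible_nodes;

	assert(threads_per_core != 0);
	possible_nodes = possible_cpus / threads_per_core;

	if (possible_nodes <= present_nodes)
		return 0;
	return (possible_nodes - present_nodes) * node_size;
}
```

For example, with 64 possible CPUs, 8 threads per core and 4 CPU nodes present, 4 more nodes could be added, so a hypothetical 1024-byte node size gives 4096 bytes of extra space.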
@ -1028,93 +873,6 @@ unsigned int kexec_extra_fdt_size_ppc64(struct kimage *image)
return extra_size + kdump_extra_fdt_size_ppc64(image);
}
/**
* add_node_props - Reads node properties from device node structure and add
* them to fdt.
* @fdt: Flattened device tree of the kernel
* @node_offset: offset of the node to add a property at
* @dn: device node pointer
*
* Returns 0 on success, negative errno on error.
*/
static int add_node_props(void *fdt, int node_offset, const struct device_node *dn)
{
int ret = 0;
struct property *pp;
if (!dn)
return -EINVAL;
for_each_property_of_node(dn, pp) {
ret = fdt_setprop(fdt, node_offset, pp->name, pp->value, pp->length);
if (ret < 0) {
pr_err("Unable to add %s property: %s\n", pp->name, fdt_strerror(ret));
return ret;
}
}
return ret;
}
/**
* update_cpus_node - Update cpus node of flattened device tree using of_root
* device node.
* @fdt: Flattened device tree of the kernel.
*
* Returns 0 on success, negative errno on error.
*/
static int update_cpus_node(void *fdt)
{
struct device_node *cpus_node, *dn;
int cpus_offset, cpus_subnode_offset, ret = 0;
cpus_offset = fdt_path_offset(fdt, "/cpus");
if (cpus_offset < 0 && cpus_offset != -FDT_ERR_NOTFOUND) {
pr_err("Malformed device tree: error reading /cpus node: %s\n",
fdt_strerror(cpus_offset));
return cpus_offset;
}
if (cpus_offset > 0) {
ret = fdt_del_node(fdt, cpus_offset);
if (ret < 0) {
pr_err("Error deleting /cpus node: %s\n", fdt_strerror(ret));
return -EINVAL;
}
}
/* Add cpus node to fdt */
cpus_offset = fdt_add_subnode(fdt, fdt_path_offset(fdt, "/"), "cpus");
if (cpus_offset < 0) {
pr_err("Error creating /cpus node: %s\n", fdt_strerror(cpus_offset));
return -EINVAL;
}
/* Add cpus node properties */
cpus_node = of_find_node_by_path("/cpus");
ret = add_node_props(fdt, cpus_offset, cpus_node);
of_node_put(cpus_node);
if (ret < 0)
return ret;
/* Loop through all subnodes of cpus and add them to fdt */
for_each_node_by_type(dn, "cpu") {
cpus_subnode_offset = fdt_add_subnode(fdt, cpus_offset, dn->full_name);
if (cpus_subnode_offset < 0) {
pr_err("Unable to add %s subnode: %s\n", dn->full_name,
fdt_strerror(cpus_subnode_offset));
ret = cpus_subnode_offset;
goto out;
}
ret = add_node_props(fdt, cpus_subnode_offset, dn);
if (ret < 0)
goto out;
}
out:
of_node_put(dn);
return ret;
}
static int copy_property(void *fdt, int node_offset, const struct device_node *dn,
const char *propname)
{

@ -20,9 +20,13 @@
#include <linux/kexec.h>
#include <linux/of.h>
#include <linux/slab.h>
#include <linux/memblock.h>
#include <linux/crash_core.h>
#include <asm/sections.h>
#include <asm/kexec_ranges.h>
#include <asm/crashdump-ppc64.h>
#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
/**
* get_max_nr_ranges - Get the max no. of ranges crash_mem structure
* could hold, given the size allocated for it.
@ -234,13 +238,16 @@ int add_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size)
return __add_mem_range(mem_ranges, base, size);
}
#endif /* CONFIG_KEXEC_FILE || CONFIG_CRASH_DUMP */
#ifdef CONFIG_KEXEC_FILE
/**
* add_tce_mem_ranges - Adds tce-table range to the given memory ranges list.
* @mem_ranges: Range list to add the memory range(s) to.
*
* Returns 0 on success, negative errno on error.
*/
int add_tce_mem_ranges(struct crash_mem **mem_ranges)
static int add_tce_mem_ranges(struct crash_mem **mem_ranges)
{
struct device_node *dn = NULL;
int ret = 0;
@ -279,7 +286,7 @@ int add_tce_mem_ranges(struct crash_mem **mem_ranges)
*
* Returns 0 on success, negative errno on error.
*/
int add_initrd_mem_range(struct crash_mem **mem_ranges)
static int add_initrd_mem_range(struct crash_mem **mem_ranges)
{
u64 base, end;
int ret;
@ -296,7 +303,6 @@ int add_initrd_mem_range(struct crash_mem **mem_ranges)
return ret;
}
#ifdef CONFIG_PPC_64S_HASH_MMU
/**
* add_htab_mem_range - Adds htab range to the given memory ranges list,
* if it exists
@ -304,14 +310,18 @@ int add_initrd_mem_range(struct crash_mem **mem_ranges)
*
* Returns 0 on success, negative errno on error.
*/
int add_htab_mem_range(struct crash_mem **mem_ranges)
static int add_htab_mem_range(struct crash_mem **mem_ranges)
{
#ifdef CONFIG_PPC_64S_HASH_MMU
if (!htab_address)
return 0;
return add_mem_range(mem_ranges, __pa(htab_address), htab_size_bytes);
}
#else
return 0;
#endif
}
/**
* add_kernel_mem_range - Adds kernel text region to the given
@ -320,18 +330,20 @@ int add_htab_mem_range(struct crash_mem **mem_ranges)
*
* Returns 0 on success, negative errno on error.
*/
int add_kernel_mem_range(struct crash_mem **mem_ranges)
static int add_kernel_mem_range(struct crash_mem **mem_ranges)
{
return add_mem_range(mem_ranges, 0, __pa(_end));
}
#endif /* CONFIG_KEXEC_FILE */
#if defined(CONFIG_KEXEC_FILE) || defined(CONFIG_CRASH_DUMP)
/**
* add_rtas_mem_range - Adds RTAS region to the given memory ranges list.
* @mem_ranges: Range list to add the memory range to.
*
* Returns 0 on success, negative errno on error.
*/
int add_rtas_mem_range(struct crash_mem **mem_ranges)
static int add_rtas_mem_range(struct crash_mem **mem_ranges)
{
struct device_node *dn;
u32 base, size;
@ -356,7 +368,7 @@ int add_rtas_mem_range(struct crash_mem **mem_ranges)
*
* Returns 0 on success, negative errno on error.
*/
int add_opal_mem_range(struct crash_mem **mem_ranges)
static int add_opal_mem_range(struct crash_mem **mem_ranges)
{
struct device_node *dn;
u64 base, size;
@ -374,7 +386,9 @@ int add_opal_mem_range(struct crash_mem **mem_ranges)
of_node_put(dn);
return ret;
}
#endif /* CONFIG_KEXEC_FILE || CONFIG_CRASH_DUMP */
#ifdef CONFIG_KEXEC_FILE
/**
* add_reserved_mem_ranges - Adds "/reserved-ranges" regions exported by f/w
* to the given memory ranges list.
@ -382,7 +396,7 @@ int add_opal_mem_range(struct crash_mem **mem_ranges)
*
* Returns 0 on success, negative errno on error.
*/
int add_reserved_mem_ranges(struct crash_mem **mem_ranges)
static int add_reserved_mem_ranges(struct crash_mem **mem_ranges)
{
int n_mem_addr_cells, n_mem_size_cells, i, len, cells, ret = 0;
struct device_node *root = of_find_node_by_path("/");
@ -412,3 +426,283 @@ int add_reserved_mem_ranges(struct crash_mem **mem_ranges)
return ret;
}
/**
* get_reserved_memory_ranges - Get reserve memory ranges. This list includes
* memory regions that should be added to the
* memory reserve map to ensure the region is
* protected from any mischief.
* @mem_ranges: Range list to add the memory ranges to.
*
* Returns 0 on success, negative errno on error.
*/
int get_reserved_memory_ranges(struct crash_mem **mem_ranges)
{
int ret;
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_tce_mem_ranges(mem_ranges);
if (ret)
goto out;
ret = add_reserved_mem_ranges(mem_ranges);
out:
if (ret)
pr_err("Failed to setup reserved memory ranges\n");
return ret;
}
/**
* get_exclude_memory_ranges - Get exclude memory ranges. This list includes
* regions like opal/rtas, tce-table, initrd,
* kernel, htab which should be avoided while
* setting up kexec load segments.
* @mem_ranges: Range list to add the memory ranges to.
*
* Returns 0 on success, negative errno on error.
*/
int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
{
int ret;
ret = add_tce_mem_ranges(mem_ranges);
if (ret)
goto out;
ret = add_initrd_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_htab_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_kernel_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_opal_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_reserved_mem_ranges(mem_ranges);
if (ret)
goto out;
/* exclude memory ranges should be sorted for easy lookup */
sort_memory_ranges(*mem_ranges, true);
out:
if (ret)
pr_err("Failed to setup exclude memory ranges\n");
return ret;
}
#ifdef CONFIG_CRASH_DUMP
/**
* get_usable_memory_ranges - Get usable memory ranges. This list includes
* regions like crashkernel, opal/rtas & tce-table,
* that kdump kernel could use.
* @mem_ranges: Range list to add the memory ranges to.
*
* Returns 0 on success, negative errno on error.
*/
int get_usable_memory_ranges(struct crash_mem **mem_ranges)
{
int ret;
/*
* Early boot failure observed on guests when low memory (first memory
* block?) is not added to usable memory. So, add [0, crashk_res.end]
* instead of [crashk_res.start, crashk_res.end] to workaround it.
* Also, crashed kernel's memory must be added to reserve map to
* avoid kdump kernel from using it.
*/
ret = add_mem_range(mem_ranges, 0, crashk_res.end + 1);
if (ret)
goto out;
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_opal_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_tce_mem_ranges(mem_ranges);
out:
if (ret)
pr_err("Failed to setup usable memory ranges\n");
return ret;
}
#endif /* CONFIG_CRASH_DUMP */
#endif /* CONFIG_KEXEC_FILE */
#ifdef CONFIG_CRASH_DUMP
/**
* get_crash_memory_ranges - Get crash memory ranges. This list includes
* first/crashing kernel's memory regions that
* would be exported via an elfcore.
* @mem_ranges: Range list to add the memory ranges to.
*
* Returns 0 on success, negative errno on error.
*/
int get_crash_memory_ranges(struct crash_mem **mem_ranges)
{
phys_addr_t base, end;
struct crash_mem *tmem;
u64 i;
int ret;
for_each_mem_range(i, &base, &end) {
u64 size = end - base;
/* Skip backup memory region, which needs a separate entry */
if (base == BACKUP_SRC_START) {
if (size > BACKUP_SRC_SIZE) {
base = BACKUP_SRC_END + 1;
size -= BACKUP_SRC_SIZE;
} else
continue;
}
ret = add_mem_range(mem_ranges, base, size);
if (ret)
goto out;
/* Try merging adjacent ranges before reallocation attempt */
if ((*mem_ranges)->nr_ranges == (*mem_ranges)->max_nr_ranges)
sort_memory_ranges(*mem_ranges, true);
}
/* Reallocate memory ranges if there is no space to split ranges */
tmem = *mem_ranges;
if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
tmem = realloc_mem_ranges(mem_ranges);
if (!tmem)
goto out;
}
/* Exclude crashkernel region */
ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
if (ret)
goto out;
/*
* FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
* regions are exported to save their context at the time of
* crash, they should actually be backed up just like the
* first 64K bytes of memory.
*/
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
ret = add_opal_mem_range(mem_ranges);
if (ret)
goto out;
/* create a separate program header for the backup region */
ret = add_mem_range(mem_ranges, BACKUP_SRC_START, BACKUP_SRC_SIZE);
if (ret)
goto out;
sort_memory_ranges(*mem_ranges, false);
out:
if (ret)
pr_err("Failed to setup crash memory ranges\n");
return ret;
}
/**
* remove_mem_range - Removes the given memory range from the range list.
* @mem_ranges: Range list to remove the memory range from.
* @base: Base address of the range to remove.
* @size: Size of the memory range to remove.
*
* (Re)allocates memory, if needed.
*
* Returns 0 on success, negative errno on error.
*/
int remove_mem_range(struct crash_mem **mem_ranges, u64 base, u64 size)
{
u64 end;
int ret = 0;
unsigned int i;
u64 mstart, mend;
struct crash_mem *mem_rngs = *mem_ranges;
if (!size)
return 0;
/*
* Memory ranges are stored as start and end addresses; use
* the same format for the remove operation.
*/
end = base + size - 1;
for (i = 0; i < mem_rngs->nr_ranges; i++) {
mstart = mem_rngs->ranges[i].start;
mend = mem_rngs->ranges[i].end;
/*
* Memory range to remove is not part of this range entry
* in the memory range list
*/
if (!(base >= mstart && end <= mend))
continue;
/*
* Memory range to remove is equivalent to this entry in the
* memory range list. Remove the range entry from the list.
*/
if (base == mstart && end == mend) {
for (; i < mem_rngs->nr_ranges - 1; i++) {
mem_rngs->ranges[i].start = mem_rngs->ranges[i+1].start;
mem_rngs->ranges[i].end = mem_rngs->ranges[i+1].end;
}
mem_rngs->nr_ranges--;
goto out;
}
/*
* Start address of the memory range to remove and the
* current memory range entry in the list is same. Just
* move the start address of the current memory range
* entry in the list to end + 1.
*/
else if (base == mstart) {
mem_rngs->ranges[i].start = end + 1;
goto out;
}
/*
* End address of the memory range to remove and the
* current memory range entry in the list is same.
* Just move the end address of the current memory
* range entry in the list to base - 1.
*/
else if (end == mend) {
mem_rngs->ranges[i].end = base - 1;
goto out;
}
/*
* Memory range to remove is not at the edge of current
* memory range entry. Split the current memory entry into
* two halves.
*/
else {
mem_rngs->ranges[i].end = base - 1;
size = mend - end; /* length of the tail that remains after the removed range */
ret = add_mem_range(mem_ranges, end + 1, size);
}
}
out:
return ret;
}
#endif /* CONFIG_CRASH_DUMP */
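The four cases in remove_mem_range() (exact match, match at the start, match at the end, and a split in the middle) can be exercised outside the kernel. The following is a minimal sketch under stated assumptions: `struct range_list` and `range_remove()` are hypothetical, a fixed-capacity array replaces the reallocating struct crash_mem, and the tail of a split is built from the entry's saved original end address.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RANGES 8 /* hypothetical fixed capacity; the kernel reallocates instead */

struct range {
	uint64_t start, end; /* inclusive bounds */
};

struct range_list {
	unsigned int nr;
	struct range r[MAX_RANGES];
};

/* Remove [base, base + size - 1] from the entry that fully contains it. */
static int range_remove(struct range_list *l, uint64_t base, uint64_t size)
{
	uint64_t end;
	unsigned int i, j;

	if (!size)
		return 0;
	end = base + size - 1;

	for (i = 0; i < l->nr; i++) {
		uint64_t mstart = l->r[i].start, mend = l->r[i].end;

		if (!(base >= mstart && end <= mend))
			continue; /* not contained in this entry */

		if (base == mstart && end == mend) {
			/* Exact match: delete the entry. */
			for (j = i; j + 1 < l->nr; j++)
				l->r[j] = l->r[j + 1];
			l->nr--;
		} else if (base == mstart) {
			l->r[i].start = end + 1; /* trim the front */
		} else if (end == mend) {
			l->r[i].end = base - 1; /* trim the back */
		} else {
			/* Split: keep [mstart, base - 1] and [end + 1, mend]. */
			if (l->nr == MAX_RANGES)
				return -1;
			for (j = l->nr; j > i + 1; j--)
				l->r[j] = l->r[j - 1];
			l->r[i].end = base - 1;
			l->r[i + 1].start = end + 1;
			l->r[i + 1].end = mend;
			l->nr++;
		}
		return 0;
	}
	return 0; /* range not present: nothing to do */
}
```

The sketch inserts the tail directly after the split entry so the list stays sorted, whereas the kernel code appends it through add_mem_range().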

@ -360,10 +360,6 @@ static int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu,
break;
}
#if 0
printk(KERN_INFO "Deliver interrupt 0x%x? %x\n", vec, deliver);
#endif
if (deliver)
kvmppc_inject_interrupt(vcpu, vec, 0);

@ -714,7 +714,7 @@ int kvmppc_core_emulate_mtspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong spr_val)
case SPRN_HID1:
to_book3s(vcpu)->hid[1] = spr_val;
break;
case SPRN_HID2:
case SPRN_HID2_750FX:
to_book3s(vcpu)->hid[2] = spr_val;
break;
case SPRN_HID2_GEKKO:
@ -900,7 +900,7 @@ int kvmppc_core_emulate_mfspr_pr(struct kvm_vcpu *vcpu, int sprn, ulong *spr_val
case SPRN_HID1:
*spr_val = to_book3s(vcpu)->hid[1];
break;
case SPRN_HID2:
case SPRN_HID2_750FX:
case SPRN_HID2_GEKKO:
*spr_val = to_book3s(vcpu)->hid[2];
break;

@ -4857,7 +4857,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
* entering a nested guest in which case the decrementer is now owned
* by L2 and the L1 decrementer is provided in hdec_expires
*/
if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
if (kvmppc_core_pending_dec(vcpu) &&
((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
(trap == BOOK3S_INTERRUPT_SYSCALL &&
kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))

@ -71,8 +71,8 @@ gs_msg_ops_kvmhv_nestedv2_config_fill_info(struct kvmppc_gs_buff *gsb,
}
if (kvmppc_gsm_includes(gsm, KVMPPC_GSID_RUN_OUTPUT)) {
kvmppc_gse_put_buff_info(gsb, KVMPPC_GSID_RUN_OUTPUT,
cfg->vcpu_run_output_cfg);
rc = kvmppc_gse_put_buff_info(gsb, KVMPPC_GSID_RUN_OUTPUT,
cfg->vcpu_run_output_cfg);
if (rc < 0)
return rc;
}

@ -531,7 +531,7 @@ static int xive_vm_h_eoi(struct kvm_vcpu *vcpu, unsigned long xirr)
xc->cppr = xive_prio_from_guest(new_cppr);
/*
* IPIs are synthetized from MFRR and thus don't need
* IPIs are synthesized from MFRR and thus don't need
* any special EOI handling. The underlying interrupt
* used to signal MFRR changes is EOId when fetched from
* the queue.

@ -3,8 +3,6 @@
# Makefile for ppc-specific library files..
#
ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
CFLAGS_code-patching.o += -fno-stack-protector
CFLAGS_feature-fixups.o += -fno-stack-protector

@ -372,9 +372,32 @@ int patch_instruction(u32 *addr, ppc_inst_t instr)
}
NOKPROBE_SYMBOL(patch_instruction);
static int patch_memset64(u64 *addr, u64 val, size_t count)
{
for (u64 *end = addr + count; addr < end; addr++)
__put_kernel_nofault(addr, &val, u64, failed);
return 0;
failed:
return -EPERM;
}
static int patch_memset32(u32 *addr, u32 val, size_t count)
{
for (u32 *end = addr + count; addr < end; addr++)
__put_kernel_nofault(addr, &val, u32, failed);
return 0;
failed:
return -EPERM;
}
static int __patch_instructions(u32 *patch_addr, u32 *code, size_t len, bool repeat_instr)
{
unsigned long start = (unsigned long)patch_addr;
int err;
/* Repeat instruction */
if (repeat_instr) {
@ -383,19 +406,19 @@ static int __patch_instructions(u32 *patch_addr, u32 *code, size_t len, bool rep
if (ppc_inst_prefixed(instr)) {
u64 val = ppc_inst_as_ulong(instr);
memset64((u64 *)patch_addr, val, len / 8);
err = patch_memset64((u64 *)patch_addr, val, len / 8);
} else {
u32 val = ppc_inst_val(instr);
memset32(patch_addr, val, len / 4);
err = patch_memset32(patch_addr, val, len / 4);
}
} else {
memcpy(patch_addr, code, len);
err = copy_to_kernel_nofault(patch_addr, code, len);
}
smp_wmb(); /* smp write barrier */
flush_icache_range(start, start + len);
return 0;
return err;
}
/*

@ -25,6 +25,13 @@
#include <asm/firmware.h>
#include <asm/inst.h>
/*
* Used to generate warnings if mmu or cpu feature check functions that
* use static keys are called before they are initialized.
*/
bool static_key_feature_checks_initialized __read_mostly;
EXPORT_SYMBOL_GPL(static_key_feature_checks_initialized);
struct fixup_entry {
unsigned long mask;
unsigned long value;
@ -679,6 +686,7 @@ void __init setup_feature_keys(void)
jump_label_init();
cpu_feature_keys_init();
mmu_feature_keys_init();
static_key_feature_checks_initialized = true;
}
static int __init check_features(void)
