Commit Graph

47 Commits

Author SHA1 Message Date
Heiko Carstens 80ddf5ce1c s390: always build relocatable kernel
Nathan Chancellor reported several link errors on s390 with
CONFIG_RELOCATABLE disabled, after binutils commit 906f69cf65da ("IBM
zSystems: Issue error for *DBL relocs on misaligned symbols"). The binutils
commit reveals potential miscompiles that might have happened already
before with linker script defined symbols at odd addresses.

A similar bug was recently fixed in the kernel with commit c9305b6c1f
("s390: fix nospec table alignments").

See https://github.com/ClangBuiltLinux/linux/issues/1747 for an analysis
from Ulich Weigand.

Therefore always build a relocatable kernel to avoid this problem. There is
hardly any use-case for non-relocatable kernels, so this shouldn't be
controversial.

Link: https://github.com/ClangBuiltLinux/linux/issues/1747
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20221030182202.2062705-1-hca@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-08 19:32:32 +01:00
Alexander Gordeev 2f0e8aae26 s390/mm: rework memcpy_real() to avoid DAT-off mode
Function memcpy_real() is an univeral data mover that does not
require DAT mode to be able reading from a physical address.
Its advantage is an ability to read from any address, even
those for which no kernel virtual mapping exists.

Although memcpy_real() is interrupt-safe, there are no handlers
that make use of this function. The compiler instrumentation
have to be disabled and separate no-DAT stack used to allow
execution of the function once DAT mode is disabled.

Rework memcpy_real() to overcome these shortcomings. As result,
data copying (which is primarily reading out a crashed system
memory by a user process) is executed on a regular stack with
enabled interrupts. Also, use of memcpy_real_buf swap buffer
becomes unnecessary and the swapping is eliminated.

The above is achieved by using a fixed virtual address range
that spans a single page and remaps that page repeatedly when
memcpy_real() is called for a particular physical address.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:01 +02:00
Alexander Gordeev 4df29d2b90 s390/smp: rework absolute lowcore access
Temporary unsetting of the prefix page in memcpy_absolute() routine
poses a risk of executing code path with unexpectedly disabled prefix
page. This rework avoids the prefix page uninstalling and disabling
of normal and machine check interrupts when accessing the absolute
zero memory.

Although memcpy_absolute() routine can access the whole memory, it is
only used to update the absolute zero lowcore. This rework therefore
introduces a new mechanism for the absolute zero lowcore access and
scraps memcpy_absolute() routine for good.

Instead, an area is reserved in the virtual memory that is used for
the absolute lowcore access only. That area holds an array of 8KB
virtual mappings - one per CPU. Whenever a CPU is brought online, the
corresponding item is mapped to the real address of the previously
installed prefix page.

The absolute zero lowcore access works like this: a CPU calls the
new primitive get_abs_lowcore() to obtain its 8KB mapping as a
pointer to the struct lowcore. Virtual address references to that
pointer get translated to the real addresses of the prefix page,
which in turn gets swapped with the absolute zero memory addresses
due to prefixing. Once the pointer is not needed it must be released
with put_abs_lowcore() primitive:

	struct lowcore *abs_lc;
	unsigned long flags;

	abs_lc = get_abs_lowcore(&flags);
	abs_lc->... = ...;
	put_abs_lowcore(abs_lc, flags);

To ensure the described mechanism works large segment- and region-
table entries must be avoided for the 8KB mappings. Failure to do
so results in usage of Region-Frame Absolute Address (RFAA) or
Segment-Frame Absolute Address (SFAA) large page fields. In that
case absolute addresses would be used to address the prefix page
instead of the real ones and the prefixing would get bypassed.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-14 16:46:00 +02:00
Alexander Gordeev 5e441f61f5 Revert "s390/smp: rework absolute lowcore access"
This reverts commit 7d06fed77b.

This introduced vmem_mutex locking from vmem_map_4k_page()
function called from smp_reinit_ipl_cpu() with interrupts
disabled. While it is a pre-SMP early initcall no other CPUs
running in parallel nor other code taking vmem_mutex on this
boot stage - it still needs to be fixed.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-06 09:24:07 +02:00
Alexander Gordeev 7d06fed77b s390/smp: rework absolute lowcore access
Temporary unsetting of the prefix page in memcpy_absolute() routine
poses a risk of executing code path with unexpectedly disabled prefix
page. This rework avoids the prefix page uninstalling and disabling
of normal and machine check interrupts when accessing the absolute
zero memory.

Although memcpy_absolute() routine can access the whole memory, it is
only used to update the absolute zero lowcore. This rework therefore
introduces a new mechanism for the absolute zero lowcore access and
scraps memcpy_absolute() routine for good.

Instead, an area is reserved in the virtual memory that is used for
the absolute lowcore access only. That area holds an array of 8KB
virtual mappings - one per CPU. Whenever a CPU is brought online, the
corresponding item is mapped to the real address of the previously
installed prefix page.

The absolute zero lowcore access works like this: a CPU calls the
new primitive get_abs_lowcore() to obtain its 8KB mapping as a
pointer to the struct lowcore. Virtual address references to that
pointer get translated to the real addresses of the prefix page,
which in turn gets swapped with the absolute zero memory addresses
due to prefixing. Once the pointer is not needed it must be released
with put_abs_lowcore() primitive:

	struct lowcore *abs_lc;
	unsigned long flags;

	abs_lc = get_abs_lowcore(&flags);
	abs_lc->... = ...;
	put_abs_lowcore(abs_lc, flags);

To ensure the described mechanism works large segment- and region-
table entries must be avoided for the 8KB mappings. Failure to do
so results in usage of Region-Frame Absolute Address (RFAA) or
Segment-Frame Absolute Address (SFAA) large page fields. In that
case absolute addresses would be used to address the prefix page
instead of the real ones and the prefixing would get bypassed.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:23 +02:00
Alexander Gordeev 57ad19bcde s390/boot: cleanup adjust_to_uv_max() function
Uncouple input and output arguments by making the latter
the function return value.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-28 18:05:23 +02:00
Heiko Carstens edd4a86673 s390/boot: get rid of startup archive
The final kernel image is created by linking decompressor object files with
a startup archive. The startup archive file however does not contain only
optional code and data which can be discarded if not referenced. It also
contains mandatory object data like head.o which must never be discarded,
even if not referenced.

Move the decompresser code and linker script to the boot directory and get
rid of the startup archive so everything is kept during link time.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06 20:45:14 +02:00
Vasily Gorbik 9a39abb7c9 s390/boot: simplify and fix kernel memory layout setup
Initial KASAN shadow memory range was picked to preserve original kernel
modules area position. With protected execution support, which might
impose addressing limitation on vmalloc area and hence affect modules
area position, current fixed KASAN shadow memory range is only making
kernel memory layout setup more complex. So move it to the very end of
available virtual space and simplify calculations.

At the same time return to previous kernel address space split. In
particular commit 0c4f2623b9 ("s390: setup kernel memory layout
early") introduced precise identity map size calculation and keeping
vmemmap left most starting from a fresh region table entry. This didn't
take into account additional mapping region requirement for potential
DCSS mapping above available physical memory. So go back to virtual
space split between 1:1 mapping & vmemmap array once vmalloc area size
is subtracted.

Cc: stable@vger.kernel.org
Fixes: 0c4f2623b9 ("s390: setup kernel memory layout early")
Reported-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16 12:29:19 +01:00
Alexander Gordeev e3ec8e0f57 s390/boot: allocate amode31 section in decompressor
The memory for amode31 section is allocated from the decompressed
kernel. Instead, allocate that memory from the decompressor. This
is a prerequisite to allow initialization of the virtual memory
before the decompressed kernel takes over.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-04 09:49:37 +02:00
Alexander Gordeev e8f06683d4 s390/boot: factor out offset_vmlinux_info() function
Move offsetting all of vmlinux_info fields to a separate
function for better readability.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 11:03:33 +02:00
Heiko Carstens 6ab023641a s390/boot: get rid of arithmetics on function pointers
sparse warning:
  CHECK   arch/s390/boot/startup.c
arch/s390/boot/startup.c:283:39: error: arithmetics on pointers to functions

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-27 09:39:22 +02:00
Alexander Egorenkov 6bda667037 s390/boot: move dma sections from decompressor to decompressed kernel
This change simplifies the task of making the decompressor relocatable.

The decompressor's image contains special DMA sections between _sdma and
_edma. This DMA segment is loaded at boot as part of the decompressor and
then simply handed over to the decompressed kernel. The decompressor itself
never uses it in any way. The primary reason for this is the need to keep
the aforementioned DMA segment below 2GB which is required by architecture,
and because the decompressor is always loaded at a fixed low physical
address, it is guaranteed that the DMA region will not cross the 2GB
memory limit. If the DMA region had been placed in the decompressed kernel,
then KASLR would make this guarantee impossible to fulfill or it would
be restricted to the first 2GB of memory address space.

This commit moves all DMA sections between _sdma and _edma from
the decompressor's image to the decompressed kernel's image. The complete
DMA region is placed in the init section of the decompressed kernel and
immediately relocated below 2GB at start-up before it is needed by other
parts of the decompressed kernel. The relocation of the DMA region happens
even if the decompressed kernel is already located below 2GB in order
to keep the first implementation simple. The relocation should not have
any noticeable impact on boot time because the DMA segment is only a couple
of pages.

After relocating the DMA sections, the kernel has to fix all references
which point into it. In order to automate this, place all variables
pointing into the DMA sections in a special .dma.refs section. All such
variables must be defined using the new __dma_ref macro. Only variables
containing addresses within the DMA sections must be placed in the new
.dma.refs section.

Furthermore, move the initialization of control registers from
the decompressor to the decompressed kernel because some control registers
reference tables that must be placed in the DMA data section to
guarantee that their addresses are below 2G. Because the decompressed
kernel relocates the DMA sections at startup, the content of control
registers CR2, CR5 and CR15 must be updated with new addresses after
the relocation. The decompressed kernel initializes all control registers
early at boot and then updates the content of CR2, CR5 and CR15
as soon as the DMA relocation has occurred. This practically reverts
the commit a80313ff91 ("s390/kernel: introduce .dma sections").

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-27 09:39:17 +02:00
Alexander Egorenkov e9e7870f90 s390/dump: introduce boot data 'oldmem_data'
The new boot data struct shall replace global variables OLDMEM_BASE and
OLDMEM_SIZE. It is initialized in the decompressor and passed
to the decompressed kernel. In comparison to the old solution, this one
doesn't access data at fixed physical addresses which will become important
when the decompressor becomes relocatable.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-27 09:39:16 +02:00
Alexander Egorenkov 84733284f6 s390/boot: introduce boot data 'initrd_data'
The new boot data struct shall replace global variables INITRD_START and
INITRD_SIZE. It is initialized in the decompressor and passed
to the decompressed kernel. In comparison to the old solution, this one
doesn't access data at fixed physical addresses which will become important
when the decompressor becomes relocatable.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-27 09:39:15 +02:00
Alexander Egorenkov 42c89439b9 s390/boot: disable Secure Execution in dump mode
A dump kernel is neither required nor able to support Secure Execution.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-27 09:39:14 +02:00
Alexander Egorenkov c5cf505446 s390/boot: move uv function declarations to boot/uv.h
The functions adjust_to_uv_max() and uv_query_info() are used only
in the decompressor. Therefore, move the function declarations from
the global arch/s390/include/asm/uv.h to arch/s390/boot/uv.h.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-27 09:39:14 +02:00
Alexander Egorenkov 7fadcc0787 s390/boot: move all linker symbol declarations from c to h files
To prevent multiple incompatible declarations of symbols and to catch
such mistakes at compile time.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-27 09:39:12 +02:00
Alexander Egorenkov 9f744abb46 s390/boot: replace magic string check with a bootdata flag
The magic string "S390EP" at offset 0x10008 indicated to the decompressed
kernel that it was booted by the decompressor. Introduce a new bootdata
flag instead which conveys the same information in an explicit and
a cleaner way. But keep the magic string because it is a kernel ABI.

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-07-05 12:44:23 +02:00
Vasily Gorbik 0c4f2623b9 s390: setup kernel memory layout early
Currently there are two separate places where kernel memory layout has
to be known and adjusted:
1. early kasan setup.
2. paging setup later.

Those 2 places had to be kept in sync and adjusted to reflect peculiar
technical details of one another. With additional factors which influence
kernel memory layout like ultravisor secure storage limit, complexity
of keeping two things in sync grew up even more.

Besides that if we look forward towards creating identity mapping and
enabling DAT before jumping into uncompressed kernel - that would also
require full knowledge of and control over kernel memory layout.

So, de-duplicate and move kernel memory layout setup logic into
the decompressor.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-18 16:41:19 +02:00
Sven Schnelle 17e89e1340 s390/facilities: move stfl information from lowcore to global data
With gcc-11, there are a lot of warnings because the facility functions
are accessing lowcore through a null pointer. Fix this by moving the
facility arrays away from lowcore.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-07 17:06:58 +02:00
Vasily Gorbik 73045a08cf s390: unify identity mapping limits handling
Currently we have to consider too many different values which
in the end only affect identity mapping size. These are:
1. max_physmem_end - end of physical memory online or standby.
   Always <= end of the last online memory block (get_mem_detect_end()).
2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the
   kernel is able to support.
3. "mem=" kernel command line option which limits physical memory usage.
4. OLDMEM_BASE which is a kdump memory limit when the kernel is executed as
   crash kernel.
5. "hsa" size which is a memory limit when the kernel is executed during
   zfcp/nvme dump.

Through out kernel startup and run we juggle all those values at once
but that does not bring any amusement, only confusion and complexity.

Unify all those values to a single one we should really care, that is
our identity mapping size.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:10 +01:00
Vasily Gorbik d7e7fbba67 s390/early: rewrite program parameter setup in C
And move it earlier in the decompressor.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:21:00 +01:00
Heiko Carstens 90178c1900 s390/mm: let vmalloc area size depend on physical memory size
To make sure that the vmalloc area size is for almost all cases large
enough let it depend on the (potential) physical memory size. There is
still the possibility to override this with the vmalloc kernel command
line parameter.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:59 +01:00
Joe Perches 33def8498f treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25 14:51:49 -07:00
Vasily Gorbik 1c7c83e8d2 s390: remove unused _swsusp_reset_dma
Since commit 394216275c ("s390: remove broken hibernate / power
management support") _swsusp_reset_dma is unused and could be safely
removed.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-29 15:00:58 +02:00
Alexander Egorenkov 980d5f9ab3 s390/boot: enable .bss section for compressed kernel
- Support static uninitialized variables in compressed kernel.
- Remove chkbss script
- Get rid of workarounds for not having .bss section

Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-16 14:08:47 +02:00
Gerald Schaefer a9f2f6865d s390/kaslr: store KASLR offset for early dumps
The KASLR offset is added to vmcoreinfo in arch_crash_save_vmcoreinfo(),
so that it can be found by crash when processing kernel dumps.

However, arch_crash_save_vmcoreinfo() is called during a subsys_initcall,
so if the kernel crashes before that, we have no vmcoreinfo and no KASLR
offset.

Fix this by storing the KASLR offset in the lowcore, where the vmcore_info
pointer will be stored, and where it can be found by crash. In order to
make it distinguishable from a real vmcore_info pointer, mark it as uneven
(KASLR offset itself is aligned to THREAD_SIZE).

When arch_crash_save_vmcoreinfo() stores the real vmcore_info pointer in
the lowcore, it overwrites the KASLR offset. At that point, the KASLR
offset is not yet added to vmcoreinfo, so we also need to move the
mem_assign_absolute() behind the vmcoreinfo_append_str().

Fixes: b2d24b97b2 ("s390/kernel: add support for kernel address space layout randomization (KASLR)")
Cc: <stable@vger.kernel.org> # v5.2+
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30 10:52:45 +01:00
Linus Torvalds ea1f56fa16 s390 updates for the 5.5 merge window
- Adjust PMU device drivers registration to avoid WARN_ON and few other
   perf improvements.
 
 - Enhance tracing in vfio-ccw.
 
 - Few stack unwinder fixes and improvements, convert get_wchan custom
   stack unwinding to generic api usage.
 
 - Fixes for mm helpers issues uncovered with tests validating architecture
   page table helpers.
 
 - Fix noexec bit handling when hardware doesn't support it.
 
 - Fix memleak and unsigned value compared with zero bugs in crypto
   code. Minor code simplification.
 
 - Fix crash during kdump with kasan enabled kernel.
 
 - Switch bug and alternatives from asm to asm_inline to improve inlining
   decisions.
 
 - Use 'depends on cc-option' for MARCH and TUNE options in Kconfig,
   add z13s and z14 ZR1 to TUNE descriptions.
 
 - Minor head64.S simplification.
 
 - Fix physical to logical CPU map for SMT.
 
 - Several cleanups in qdio code.
 
 - Other minor cleanups and fixes all over the code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl3ahGYACgkQjYWKoQLX
 FBguwAgAig+FNos8zkd7Sr2wg4DPL2IlYERVP40fOLXfGuVUOnMLg8OTO6yDWDpH
 5+cKAQS1wWgyvlfjWRUJ6anXLBAsgKRD1nyFIZTpn/wArGk/duCbnl/VFriDgrST
 8KTQDJpZ9w9nXtQ7lA2QWaw5U2WG8I2T2JuQJCdLXze7RXi0bDVe8e6131NMaJ42
 LLxqOqm8d8XDnd8oDVP04LT5IfhuI2cILoGBP/GyI2fqQk9Ems6M2gxuISq1COmy
 WORDLfwWyCLeF7gWKKjxf8Vo1HYcyoFvdXnxWiHb0TDZesQZJr/LLELTP03fbCW9
 U4jbXncnnPA7kT4tlC95jT5M69yK5w==
 =+FxG
 -----END PGP SIGNATURE-----

Merge tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Adjust PMU device drivers registration to avoid WARN_ON and few other
   perf improvements.

 - Enhance tracing in vfio-ccw.

 - Few stack unwinder fixes and improvements, convert get_wchan custom
   stack unwinding to generic api usage.

 - Fixes for mm helpers issues uncovered with tests validating
   architecture page table helpers.

 - Fix noexec bit handling when hardware doesn't support it.

 - Fix memleak and unsigned value compared with zero bugs in crypto
   code. Minor code simplification.

 - Fix crash during kdump with kasan enabled kernel.

 - Switch bug and alternatives from asm to asm_inline to improve
   inlining decisions.

 - Use 'depends on cc-option' for MARCH and TUNE options in Kconfig, add
   z13s and z14 ZR1 to TUNE descriptions.

 - Minor head64.S simplification.

 - Fix physical to logical CPU map for SMT.

 - Several cleanups in qdio code.

 - Other minor cleanups and fixes all over the code.

* tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits)
  s390/cpumf: Adjust registration of s390 PMU device drivers
  s390/smp: fix physical to logical CPU map for SMT
  s390/early: move access registers setup in C code
  s390/head64: remove unnecessary vdso_per_cpu_data setup
  s390/early: move control registers setup in C code
  s390/kasan: support memcpy_real with TRACE_IRQFLAGS
  s390/crypto: Fix unsigned variable compared with zero
  s390/pkey: use memdup_user() to simplify code
  s390/pkey: fix memory leak within _copy_apqns_from_user()
  s390/disassembler: don't hide instruction addresses
  s390/cpum_sf: Assign error value to err variable
  s390/cpum_sf: Replace function name in debug statements
  s390/cpum_sf: Use consistant debug print format for sampling
  s390/unwind: drop unnecessary code around calling ftrace_graph_ret_addr()
  s390: add error handling to perf_callchain_kernel
  s390: always inline current_stack_pointer()
  s390/mm: add mm_pxd_folded() checks to pxd_free()
  s390/mm: properly clear _PAGE_NOEXEC bit when it is not supported
  s390/mm: simplify page table helpers for large entries
  s390/mm: make pmd/pud_bad() report large entries as bad
  ...
2019-11-25 17:23:53 -08:00
Nick Desaulniers 4f84b38351 s390/boot: fix section name escaping
GCC unescapes escaped string section names while Clang does not. Because
__section uses the `#` stringification operator for the section name, it
doesn't need to be escaped.

This antipattern was found with:
$ grep -e __section\(\" -e __section__\(\" -r

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Message-Id: <20190812215052.71840-1-ndesaulniers@google.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-31 17:20:52 +01:00
Gerald Schaefer ac49303d9e s390/kaslr: add support for R_390_GLOB_DAT relocation type
Commit "bpf: Process in-kernel BTF" in linux-next introduced an undefined
__weak symbol, which results in an R_390_GLOB_DAT relocation type. That
is not yet handled by the KASLR relocation code, and the kernel stops with
the message "Unknown relocation type".

Add code to detect and handle R_390_GLOB_DAT relocation types and undefined
symbols.

Fixes: 805bc0bc23 ("s390/kernel: build a relocatable kernel")
Cc: <stable@vger.kernel.org> # v5.2+
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-22 17:55:51 +02:00
Vasily Gorbik 2e83e0eb85 s390: clean .bss before running uncompressed kernel
Clean uncompressed kernel .bss section in the startup code before
the uncompressed kernel is executed. At this point of time initrd and
certificates have been already rescued. Uncompressed kernel .bss size
is known from vmlinux_info. It is also taken into consideration during
uncompressed kernel positioning by kaslr (so it is safe to clean it).

With that uncompressed kernel is starting with .bss section zeroed and
no .bss section usage restrictions apply. Which makes chkbss checks for
uncompressed kernel objects obsolete and they can be removed.

early_nobss.c is also not needed anymore. Parts of it which are still
relevant are moved to early.c. Kasan initialization code is now called
directly from head64 (early.c is instrumented and should not be
executed before kasan shadow memory is set up).

Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-21 12:58:52 +02:00
Martin Schwidefsky 98587c2d89 s390: simplify disabled_wait
The disabled_wait() function uses its argument as the PSW address when
it stops the CPU with a wait PSW that is disabled for interrupts.
The different callers sometimes use a specific number like 0xdeadbeef
to indicate a specific failure, the early boot code uses 0 and some
other calls sites use __builtin_return_address(0).

At the time a dump is created the current PSW and the registers of a
CPU are written to lowcore to make them avaiable to the dump analysis
tool. For a CPU stopped with disabled_wait the PSW and the registers
do not really make sense together, the PSW address does not point to
the function the registers belong to.

Simplify disabled_wait() by using _THIS_IP_ for the PSW address and
drop the argument to the function.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-05-02 13:54:11 +02:00
Gerald Schaefer b2d24b97b2 s390/kernel: add support for kernel address space layout randomization (KASLR)
This patch adds support for relocating the kernel to a random address.
The random kernel offset is obtained from cpacf, using either TRNG, PRNO,
or KMC_PRNG, depending on supported MSA level.

KERNELOFFSET is added to vmcoreinfo, for crash --kaslr support.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-29 10:47:10 +02:00
Gerald Schaefer a80313ff91 s390/kernel: introduce .dma sections
With a relocatable kernel that could reside at any place in memory, code
and data that has to stay below 2 GB needs special handling.

This patch introduces .dma sections for such text, data and ex_table.
The sections will be part of the decompressor kernel, so they will not
be relocated and stay below 2 GB. Their location is passed over to the
decompressed / relocated kernel via the .boot.preserved.data section.

The duald and aste for control register setup also need to stay below
2 GB, so move the setup code from arch/s390/kernel/head64.S to
arch/s390/boot/head.S. The duct and linkage_stack could reside above
2 GB, but their content has to be preserved for the decompresed kernel,
so they are also moved into the .dma section.

The start and end address of the .dma sections is added to vmcoreinfo,
for crash support, to help debugging in case the kernel crashed there.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-29 10:47:10 +02:00
Gerald Schaefer 805bc0bc23 s390/kernel: build a relocatable kernel
This patch adds support for building a relocatable kernel with -fPIE.
The kernel will be relocated to 0 early in the boot process.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-29 10:47:10 +02:00
Martin Schwidefsky 9641b8cc73 s390/ipl: read IPL report at early boot
Read the IPL Report block provided by secure-boot, add the entries
of the certificate list to the system key ring and print the list
of components.

PR: Adjust to Vasilys bootdata_preserved patch set. Preserve ipl_cert_list
for later use in kexec_file.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-26 12:34:05 +02:00
Vasily Gorbik 5abb9351df s390/uv: introduce guest side ultravisor code
The Ultravisor Call Facility (stfle bit 158) defines an API to the
Ultravisor (UV calls), a mini hypervisor located at machine
level. With help of the Ultravisor, KVM will be able to run
"protected" VMs, special VMs whose memory and management data are
unavailable to KVM.

The protected VMs can also request services from the Ultravisor.
The guest api consists of UV calls to share and unshare memory with the
kvm hypervisor.

To enable this feature support PROTECTED_VIRTUALIZATION_GUEST kconfig
option has been introduced.

Co-developed-by: Janosch Frank <frankja@de.ibm.com>
Signed-off-by: Janosch Frank <frankja@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10 17:47:21 +02:00
Gerald Schaefer bf9921a9c1 s390: introduce .boot.preserved.data section
Introduce .boot.preserve.data section which is similar to .boot.data and
"shared" between the decompressor code and the decompressed kernel. The
decompressor will store values in it, and copy over to the decompressed
image before starting it. This method allows to avoid using pre-defined
addresses and other hacks to pass values between those boot phases.

Unlike .boot.data section .boot.preserved.data is NOT a part of init data,
and hence will be preserved for the kernel life time.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-04-10 17:47:09 +02:00
Vasily Gorbik b5e804598d s390: allow overriding facilities via command line
Add "facilities=" command line option which allows to override
facility bits returned by stfle. The main purpose of that is debugging
aids which allows to test specific kernel behaviour depending on
specific facilities presence. It also affects CPU alternatives.

"facilities=" command line option format is comma separated list of
integer values to be additionally set or cleared (if value is starting
with "!"). Values ranges are also supported. e.g.:

facilities=!130-160,159,167-169

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-01 08:00:41 +01:00
Vasily Gorbik 49698745e5 s390: move ipl block and cmd line handling to early boot phase
To distinguish zfcpdump case and to be able to parse some of the kernel
command line arguments early (e.g. mem=) ipl block retrieval and command
line construction code is moved to the early boot phase.

"memory_end" is set up correctly respecting "mem=" and hsa_size in case
of the zfcpdump.

arch/s390/boot/string.c is introduced to provide string handling and
command line parsing functions to early boot phase code for the compressed
kernel image case.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:14 +02:00
Vasily Gorbik 6966d604e2 s390/mem_detect: move tprot loop to early boot phase
Move memory detection to early boot phase. To store online memory
regions "struct mem_detect_info" has been introduced together with
for_each_mem_detect_block iterator. mem_detect_info is later converted
to memblock.

Also introduces sclp_early_get_meminfo function to get maximum physical
memory and maximum increment number.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:08 +02:00
Vasily Gorbik 17aacfbfa1 s390/sclp: move sclp_early_read_info to sclp_early_core.c
To enable early online memory detection sclp_early_read_info has
been moved to sclp_early_core.c. sclp_info_sccb has been made a part
of .boot.data, which allows to reuse it later during early kernel
startup and make sclp_early_read_info call just once.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:07 +02:00
Vasily Gorbik d1b52a4388 s390: introduce .boot.data section
Introduce .boot.data section which is "shared" between the decompressor
code and the decompressed kernel. The decompressor will store values in
it, and copy over to the decompressed image before starting it. This
method allows to avoid using pre-defined addresses and other hacks to
pass values between those boot phases.

.boot.data section is a part of init data, and will be freed after kernel
initialization is complete.

For uncompressed kernel image, .boot.data section is basically the same
as .init.data

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:06 +02:00
Vasily Gorbik 7516fc11e4 s390/decompressor: clean up and rename compressed/misc.c
Since compressed/misc.c is conditionally compiled move error reporting
code to boot/main.c. With that being done compressed/misc.c has no
"miscellaneous" functions left and is all about plain decompression
now. Rename it accordingly.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:05 +02:00
Vasily Gorbik 15426ca43d s390: rescue initrd as early as possible
To avoid multi-stage initrd rescue operation and to simplify
assumptions during early memory allocations move initrd at some final
safe destination as early as possible. This would also allow us to
drop .bss usage restrictions for some files.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:05 +02:00
Vasily Gorbik 369f91c374 s390/decompressor: rework uncompressed image info collection
The kernel decompressor has to know several bits of information about
uncompressed image. Currently this info is collected by running "nm" on
uncompressed vmlinux + "sed" and producing sizes.h file. This method
worked well, but it has several disadvantages. Obscure symbols name
pattern matching is fragile. Adding new values makes pattern even
longer. Logic is spread across code and make file. Limited ability to
adjust symbols values (currently magic lma value of 0x100000 is always
subtracted). Apart from that same pieces of information (and more)
would be needed for early memory detection and features like KASLR
outside of boot/compressed/ folder where sizes.h is generated.

To overcome limitations new "struct vmlinux_info" has been introduced
to include values needed for the decompressor and the rest of the
boot code. The only static instance of vmlinux_info is produced during
vmlinux link step by filling in struct fields by the linker (like it is
done with input_data in boot/compressed/vmlinux.scr.lds.S). This way
individual values could be adjusted with all the knowledge linker has
and arithmetic it supports. Later .vmlinux.info section (which contains
struct vmlinux_info) is transplanted into the decompressor image and
dropped from uncompressed image altogether.

While doing that replace "compressed/vmlinux.scr.lds.S" linker
script (whose purpose is to rename .data section in piggy.o to
.rodata.compressed) with plain objcopy command. And simplify
decompressor's linker script.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:02 +02:00
Vasily Gorbik 8f75582a2f s390: remove decompressor's head.S
Decompressor's head.S provided "data mover" sole purpose of which has
been to safely move uncompressed kernel at 0x100000 and jump to it.

With current bzImage layout entire decompressor's code guaranteed to be
in a safe location under 0x100000, and hence could not be overwritten
during kernel move. For that reason head.S could be replaced with simple
memmove function. To do so introduce early boot code phase which is
executed from arch/s390/boot/head.S after "verify_facilities" and takes
care of optional kernel image decompression and transition to it.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:02 +02:00