Commit graph

5075 commits

Author SHA1 Message Date
Maxim Samoylov
53743aa7f1 KVM: s390: Introduce Vector Enhancements facility 1 to the guest
We can directly forward the vector enhancement facility 1 to the guest
if available and VX is requested by user space.

Please note that user space will have to take care of the final state
of the facility bit when migrating to older machines.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Maxim Samoylov <max7255@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-01-30 11:17:29 +01:00
Christian Borntraeger
27f67f8727 KVM: s390: Get rid of ar_t
sparse with __CHECK_ENDIAN__ shows that ar_t was never properly
used across KVM on s390. We can now:
- fix all places
- do not make ar_t special
Since ar_t is just used as a register number (no endianness issues
for u8), and all other register numbers are also just plain int
variables, let's just use u8, which matches the __u8 in the userspace
ABI for the memop ioctl.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-01-30 11:17:29 +01:00
Heiko Carstens
d051ae5313 KVM: s390: get rid of bogus cc initialization
The plo inline assembly has a cc output operand that is always written
to and is also as such an operand declared. Therefore the compiler is
free to omit the rather pointless and misleading initialization.

Get rid of this.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-01-30 11:17:28 +01:00
Janosch Frank
cd1836f583 KVM: s390: instruction-execution-protection support
The new Instruction Execution Protection needs to be enabled before
the guest can use it. Therefore we pass the IEP facility bit to the
guest and enable IEP interpretation.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-01-30 11:17:28 +01:00
Christian Borntraeger
a679c547d1 KVM: s390: gaccess: add ESOP2 handling
When we access guest memory and run into a protection exception, we
need to pass the exception data to the guest. ESOP2 provides detailed
information about all protection exceptions which ESOP1 only partially
provided.

The gaccess changes make sure, that the guest always gets all
available information.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-01-30 11:17:27 +01:00
Bart Van Assche
7844572c63 lib/dma-noop: Only build dma_noop_ops for s390 and m32r
Reduce the kernel size by only building dma_noop_ops for those
architectures that actually use it. This was suggested by
Christoph Hellwig.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:23:35 -05:00
Bart Van Assche
815dd18788 treewide: Consolidate get_dma_ops() implementations
Introduce a new architecture-specific get_arch_dma_ops() function
that takes a struct bus_type * argument. Add get_dma_ops() in
<linux/dma-mapping.h>.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:23:35 -05:00
Bart Van Assche
5657933dbb treewide: Move dma_ops from struct dev_archdata into struct device
Some but not all architectures provide set_dma_ops(). Move dma_ops
from struct dev_archdata into struct device such that it becomes
possible on all architectures to configure dma_ops per device.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:23:35 -05:00
Bart Van Assche
5299709d0a treewide: Constify most dma_map_ops structures
Most dma_map_ops structures are never modified. Constify these
structures such that these can be write-protected. This patch
has been generated as follows:

git grep -l 'struct dma_map_ops' |
  xargs -d\\n sed -i \
    -e 's/struct dma_map_ops/const struct dma_map_ops/g' \
    -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \
    -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \
    -e 's/const const struct dma_map_ops /const struct dma_map_ops /g';
sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \
  $(git grep -l 'struct dma_map_ops intel_dma_ops');
sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \
  $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc);
sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \
       -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \
       -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \
    drivers/pci/host/*.c
sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c
sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c
sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24 12:23:35 -05:00
Christian Borntraeger
0d6da872d3 s390/mm: Fix cmma unused transfer from pgste into pte
The last pgtable rework silently disabled the CMMA unused state by
setting a local pte variable (a parameter) instead of propagating it
back into the caller. Fix it.

Fixes: ebde765c0e ("s390/mm: uninline ptep_xxx functions from pgtable.h")
Cc: stable@vger.kernel.org # v4.6+
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-24 16:03:42 +01:00
Martin Schwidefsky
9dce990d2c s390/ptrace: Preserve previous registers for short regset write
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.

convert_vx_to_fp() is adapted to handle only a specified number of
registers rather than unconditionally handling all of them: other
callers of this function are adapted appropriately.

Based on an initial patch by Dave Martin.

Cc: stable@vger.kernel.org
Reported-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-24 08:33:40 +01:00
Linus Torvalds
4c9eff7af6 KVM fixes for v4.10-rc5
ARM:
  - Fix for timer setup on VHE machines
  - Drop spurious warning when the timer races against the vcpu running
    again
  - Prevent a vgic deadlock when the initialization fails (for stable)
 
 s390:
  - Fix a kernel memory exposure (for stable)
 
 x86:
  - Fix exception injection when hypercall instruction cannot be patched
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJYglwIAAoJEED/6hsPKofoZp0H+gLLEeKP0Mu+olXiOWjB/KFp
 WBDAR1872xIjvEcOl9l6AZgdmp2hk7KW1t+kJj5npgu237v6fHBO9ybqrAfhfU4l
 PH23zOebL15HINcwCK6OcxOTiOtgae5Nui1cnLJBHDQgPTC/VmIE8NgV/qrMyo2r
 Vth+K/cBLKiWG9JhyQvxmrfupNJUknLSH7CTnlO/fC8GEJzDfMpUl7B1Ui0TGK53
 ExVgVLg3F28SErj9bUU8y4VJhMrwDAf2Kx2BNHqDbzXMzTdp0LrGRymFLl2/Gxez
 zLtZDfGYYzEhPp1NuDydlxLb8ymnsQNB7K6Kau0w9JoAvOYwfUYfDt+GaTegwYM=
 =dPtS
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "ARM:
   - Fix for timer setup on VHE machines
   - Drop spurious warning when the timer races against the vcpu running
     again
   - Prevent a vgic deadlock when the initialization fails (for stable)

  s390:
   - Fix a kernel memory exposure (for stable)

  x86:
   - Fix exception injection when hypercall instruction cannot be
     patched"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: do not expose random data via facility bitmap
  KVM: x86: fix fixing of hypercalls
  KVM: arm/arm64: vgic: Fix deadlock on error handling
  KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems
  KVM: arm/arm64: Fix occasional warning from the timer work function
2017-01-20 14:19:34 -08:00
Christian Borntraeger
0447819741 KVM: s390: do not expose random data via facility bitmap
kvm_s390_get_machine() populates the facility bitmap by copying bytes
from the host results that are stored in a 256 byte array in the prefix
page. The KVM code does use the size of the target buffer (2k), thus
copying and exposing unrelated kernel memory (mostly machine check
related logout data).

Let's use the size of the source buffer instead.  This is ok, as the
target buffer will always be greater or equal than the source buffer as
the KVM internal buffers (and thus S390_ARCH_FAC_LIST_SIZE_BYTE) cover
the maximum possible size that is allowed by STFLE, which is 256
doublewords. All structures are zero allocated so we can leave bytes
256-2047 unchanged.

Add a similar fix for kvm_arch_init_vm().

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[found with smatch]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: stable@vger.kernel.org
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-01-20 15:29:34 +01:00
Daniel Borkmann
9437964885 s390/bpf: remove redundant check for non-null image
After we already allocated the jit.prg_buf image via
bpf_jit_binary_alloc() and filled it out with instructions,
jit.prg_buf cannot be NULL anymore. Thus, remove the
unnecessary check. Tested on s390x with test_bpf module.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:55 +01:00
Heiko Carstens
89175cf766 s390: provide sclp based boot console
Use the early sclp code to provide a boot console. This boot console
is available if the kernel parameter "earlyprintk" has been specified,
just like it works for other architectures that also provide an early
boot console.

This makes debugging of early problems much easier, since now we
finally have working console output even before memory detection is
running.

The boot console will be automatically disabled as soon as another
console will be registered.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:55 +01:00
Heiko Carstens
f031974859 s390/sclp: always stay within bounds of the early sccb
Make sure the _sclp_print_lm function stays within bounds of the early
sccb, even if the passed string is very long.  If the string is too
long, the remaining characters will be dropped.

Suggested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:55 +01:00
Heiko Carstens
742dc5773c s390/sclp: make early sclp irq handler more robust
Make the early sclp interrupt handler more robust:

- disable all interrupt sub classes except for the service signal subclass
- extend ctlreg0 union so it is easily possible to set the service signal
  subclass mask bit without using a magic number
- disable lowcore protection before writing to it
- make sure that all write accesses are done before the original content
  of control register 0 is restored, which could enable lowcore protection

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:55 +01:00
Heiko Carstens
68cc795d19 s390/topology: make "topology=off" parameter work
The "topology=off" kernel parameter is supposed to prevent the kernel
to use hardware topology information to generate scheduling domains
etc.
For an unknown reason I implemented this in a very odd way back then:
instead of simply clearing the MACHINE_HAS_TOPOLOGY flag within the
lowcore I added a second variable which indicated that topology
information should not be used. This is more than suboptimal since it
partially doesn't work.  For the fake NUMA case topology information
is still considered and scheduling domains will be created based on
this.
To fix this and to simplify the code get rid of the extra variable and
implement the "topology=off" case like it is done for other features.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:54 +01:00
Heiko Carstens
970ba6ac6a s390: use false/true when using bool
Yet another trivial patch to reduce the noise that coccinelle
generates.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:54 +01:00
Heiko Carstens
0b92515916 s390: remove couple of unneeded semicolons
Remove a couple of unneeded semicolons. This is just to reduce the
noise that the coccinelle static code checker generates.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:54 +01:00
Heiko Carstens
496e59cc48 s390/topology: reduce number of printks
Merge the seven printks within topology_init_early to a single one.
With an early boot console this avoids printing six lines each
containing only a single character.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:54 +01:00
Heiko Carstens
00de54c803 s390/mem_detect: fix memory type of first block
Fix a long-standing but currently irrelevant bug: the memory detection
code performs a tprot instruction on address zero to figure out if the
first memory chunk is readable or writable. Due to low address
protection the result is "read-only". If the memory detection code
would actually care, it would have to ignore the first memory
increment, but it adds the memory increment to writable memory anyway.

If memblock debugging is enabled this leads to an extra rather
surprising call which registers memory. To avoid this get rid of the
first misleading tprot call and simply assume that the first memory
increment is writable. Otherwise we wouldn't have reached the memory
detection code anyway.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:54 +01:00
Heiko Carstens
a2ce2a9568 s390/mem_detect: add debugging output
The s390 specific memory detection code does not call memblock_add,
which would generate debug output if memblock=debug is specified on
the kernel command line. Instead it directly calls memblock_add_range,
which doesn't generate any debug output.
To have a chance to debug early memblock related bugs add an s390
specific memblock_dbg call and a (missing) memblock_dump_all call.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:53 +01:00
Heiko Carstens
7be5e359a7 s390/setup: call memblock_reserve only for size > 0
reserve_initrd currently calls memblock_reserve even if the to be
reserved size is zero. Even though the memblock core code can handle
this correctly, it still yields confusing debug messages if
memblock debugging is enabled.
Therefore make sure to not call memblock_reserve with a size of zero.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:53 +01:00
Sebastian Ott
5064cd3506 s390/pci: use proper endianness annotations
Add proper annotation to the bar definition and use casts within the
bus accessors. Also change the sequence in the accessors to do the
shifts in the native byte order. No functional change.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:53 +01:00
Heiko Carstens
90b3baa232 s390: proper type casts for csum_partial invocations
Keep sparse and other static code checkers from emitting warnings like:

arch/s390/kernel/ipl.c:1549:14: warning: incorrect type in assignment (different base types)
arch/s390/kernel/ipl.c:1549:14:    expected unsigned int [unsigned] csum
arch/s390/kernel/ipl.c:1549:14:    got restricted __wsum

All usages in s390 code are ok. Therefore add proper casts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:53 +01:00
Heiko Carstens
551f413434 s390/lib: improve memmove, memset and memcpy
Improve the memmove implementation to save one instruction and use
better label names. Also use better label names for the memset and
memcpy implementations so everything looks consistent.

Suggested-by: Jens Remus <jremus@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:51 +01:00
Heiko Carstens
e3850ecfc1 s390/cpumf: get rid of variable length array
The stcctm5 inline assembly uses a variable length array to specify
the memory that is written to.  According to the gcc manual this trick
only works if the length is known at compile time. This is not the the
case for the stccm5 inline assembly.

Therefore simply use a full memory clobber. As requested by Martin
also move the output Q constraint operand to the input operands list,
since all we want is that the compiler generates an instruction that
may use the displacement field: in other words we only need the
address of *val. That the inline assembly actually writes to an array
starting at val is taken care of with the memory clobber.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:51 +01:00
Heiko Carstens
1d9995771f s390: update defconfigs
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-01-16 07:27:48 +01:00
Heiko Carstens
e991c24d68 s390/ctl_reg: make __ctl_load a full memory barrier
We have quite a lot of code that depends on the order of the
__ctl_load inline assemby and subsequent memory accesses, like
e.g. disabling lowcore protection and the writing to lowcore.

Since the __ctl_load macro does not have memory barrier semantics, nor
any other dependencies the compiler is, theoretically, free to shuffle
code around. Or in other words: storing to lowcore could happen before
lowcore protection is disabled.

In order to avoid this class of potential bugs simply add a full
memory barrier to the __ctl_load macro.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:48 +01:00
Frederic Weisbecker
c8d7dabf8f sched/cputime: Rename vtime_account_user() to vtime_flush()
CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y used to accumulate user time and
account it on ticks and context switches only through the
vtime_account_user() function.

Now this model has been generalized on the 3 archs for all kind of
cputime (system, irq, ...) and all the cputime flushing happens under
vtime_account_user().

So let's rename this function to better reflect its new role.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-11-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 09:54:13 +01:00
Martin Schwidefsky
b7394a5f4c sched/cputime, s390: Implement delayed accounting of system time
The account_system_time() function is called with a cputime that
occurred while running in the kernel. The function detects which
context the CPU is currently running in and accounts the time to
the correct bucket. This forces the arch code to account the
cputime for hardirq and softirq immediately.

Such accounting function can be costly and perform unwelcome divisions
and multiplications, among others.

The arch code can delay the accounting for system time. For s390
the accounting is done once per timer tick and for each task switch.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
[ Rebase against latest linus tree and move account_system_index_scaled(). ]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-10-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-14 09:54:12 +01:00
Linus Torvalds
74e5c265a4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Two bug fixes for 4.10-rc3"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/kbuild: enable modversions for symbols exported from asm
  s390/vtime: correct system time accounting
2017-01-02 09:08:45 -08:00
Linus Torvalds
3ddc76dfc7 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer type cleanups from Thomas Gleixner:
 "This series does a tree wide cleanup of types related to
  timers/timekeeping.

   - Get rid of cycles_t and use a plain u64. The type is not really
     helpful and caused more confusion than clarity

   - Get rid of the ktime union. The union has become useless as we use
     the scalar nanoseconds storage unconditionally now. The 32bit
     timespec alike storage got removed due to the Y2038 limitations
     some time ago.

     That leaves the odd union access around for no reason. Clean it up.

  Both changes have been done with coccinelle and a small amount of
  manual mopping up"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ktime: Get rid of ktime_equal()
  ktime: Cleanup ktime_set() usage
  ktime: Get rid of the union
  clocksource: Use a plain u64 instead of cycle_t
2016-12-25 14:30:04 -08:00
Linus Torvalds
b272f732f8 Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP hotplug notifier removal from Thomas Gleixner:
 "This is the final cleanup of the hotplug notifier infrastructure. The
  series has been reintgrated in the last two days because there came a
  new driver using the old infrastructure via the SCSI tree.

  Summary:

   - convert the last leftover drivers utilizing notifiers

   - fixup for a completely broken hotplug user

   - prevent setup of already used states

   - removal of the notifiers

   - treewide cleanup of hotplug state names

   - consolidation of state space

  There is a sphinx based documentation pending, but that needs review
  from the documentation folks"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/armada-xp: Consolidate hotplug state space
  irqchip/gic: Consolidate hotplug state space
  coresight/etm3/4x: Consolidate hotplug state space
  cpu/hotplug: Cleanup state names
  cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
  staging/lustre/libcfs: Convert to hotplug state machine
  scsi/bnx2i: Convert to hotplug state machine
  scsi/bnx2fc: Convert to hotplug state machine
  cpu/hotplug: Prevent overwriting of callbacks
  x86/msr: Remove bogus cleanup from the error path
  bus: arm-ccn: Prevent hotplug callback leak
  perf/x86/intel/cstate: Prevent hotplug callback leak
  ARM/imx/mmcd: Fix broken cpu hotplug handling
  scsi: qedi: Convert to hotplug state machine
2016-12-25 14:05:56 -08:00
Thomas Gleixner
8b0e195314 ktime: Cleanup ktime_set() usage
ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
2016-12-25 17:21:22 +01:00
Thomas Gleixner
a5a1d1c291 clocksource: Use a plain u64 instead of cycle_t
There is no point in having an extra type for extra confusion. u64 is
unambiguous.

Conversion was done with the following coccinelle script:

@rem@
@@
-typedef u64 cycle_t;

@fix@
typedef cycle_t;
@@
-cycle_t
+u64

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
2016-12-25 11:04:12 +01:00
Thomas Gleixner
73c1b41e63 cpu/hotplug: Cleanup state names
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.

Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-12-25 10:47:44 +01:00
Linus Torvalds
7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Heiko Carstens
cabab3f9f5 s390/kbuild: enable modversions for symbols exported from asm
s390 version of commit 334bb77387 ("x86/kbuild: enable modversions
for symbols exported from asm") so we get also rid of all these
warnings:

WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "memcpy" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "memmove" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "memset" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "save_fpu_regs" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "sie64a" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "sie_exit" [vmlinux] version generation failed, symbol will not be versioned.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-20 15:22:56 +01:00
Martin Schwidefsky
8f2b468aad s390/vtime: correct system time accounting
There is a slight misaccounting of system time in vtime_account_user.
This function is called once per HZ tick in interrupt context.
The irq_enter function already accounted the system time up to the
point of the irq_enter call. The system time from irq_enter until
vtime_account_user/do_account_vtime is reached is irq time but it
is accounted to the previous context.

Just drop the hardirq offset from arch/s390/kernel/vtime.c.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-20 15:22:56 +01:00
Linus Torvalds
57ca04ab44 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull m ore s390 updates from Martin Schwidefsky:
 "Over 95% of the changes in this pull request are related to the zcrypt
  driver. There are five improvements for zcrypt: the ID for the CEX6
  cards is added, workload balancing and multi-domain support are
  introduced, the debug logs are overhauled and a set of tracepoints is
  added.

  Then there are several patches in regard to inline assemblies. One
  compile fix and several missing memory clobbers. As far as we can tell
  the omitted memory clobbers have not caused any breakage.

  A small change to the PCI arch code, the machine can tells us how big
  the function measurement blocks are. The PCI function measurement will
  be disabled for a device if the queried length is larger than the
  allocated size for these blocks.

  And two more patches to correct five printk messages.

  That is it for s390 in regard to the 4.10 merge window. Happy holidays"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (23 commits)
  s390/pci: query fmb length
  s390/zcrypt: add missing memory clobber to ap_qci inline assembly
  s390/extmem: add missing memory clobber to dcss_set_subcodes
  s390/nmi: fix inline assembly constraints
  s390/lib: add missing memory barriers to string inline assemblies
  s390/cpumf: fix qsi inline assembly
  s390/setup: reword printk messages
  s390/dasd: fix typos in DASD error messages
  s390: fix compile error with memmove_early() inline assembly
  s390/zcrypt: tracepoint definitions for zcrypt device driver.
  s390/zcrypt: Rework debug feature invocations.
  s390/zcrypt: Improved invalid domain response handling.
  s390/zcrypt: Fix ap_max_domain_id for older machine types
  s390/zcrypt: Correct function bits for CEX2x and CEX3x cards.
  s390/zcrypt: Fixed attrition of AP adapters and domains
  s390/zcrypt: Introduce new zcrypt device status API
  s390/zcrypt: add multi domain support
  s390/zcrypt: Introduce workload balancing
  s390/zcrypt: get rid of ap_poll_requests
  s390/zcrypt: header for the AP inline assmblies
  ...
2016-12-16 09:05:25 -08:00
Linus Torvalds
a9042defa2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  NTB: correct ntb_spad_count comment typo
  misc: ibmasm: fix typo in error message
  Remove references to dead make variable LINUX_INCLUDE
  Remove last traces of ikconfig.h
  treewide: Fix printk() message errors
  Documentation/device-mapper: s/getsize/getsz/
2016-12-14 11:12:25 -08:00
Sebastian Ott
0b7589ecca s390/pci: query fmb length
Query the length of the fmb and abort fmb registration if the
size of the associated measurement block is too small.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:41 +01:00
Heiko Carstens
f1c7ea2617 s390/extmem: add missing memory clobber to dcss_set_subcodes
Add the missing memory clobber / barrier to dcss_set_subcodes() to
tell the compiler that the inline assembly accesses memory (name
string).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:41 +01:00
Heiko Carstens
86fa7087d3 s390/nmi: fix inline assembly constraints
Add missing memory clobbers / barriers or use the Q constraint where
possible to tell the compiler that the inline assemblies actually
access memory and not only pointers to memory.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:41 +01:00
Heiko Carstens
7a71fd1c59 s390/lib: add missing memory barriers to string inline assemblies
We have a couple of inline assemblies like memchr() and strlen() that
read from memory, but tell the compiler only they need the addresses
of the strings they access.
This allows the compiler to omit the initialization of such strings
and therefore generate broken code. Add the missing memory barrier to
all string related inline assemblies to fix this potential issue. It
looks like the compiler currently does not generate broken code due to
these bugs.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:41 +01:00
Heiko Carstens
259acc5c25 s390/cpumf: fix qsi inline assembly
The qsi inline assembly takes an initialized "cc" variable as output
operand but specifies it as write-to operand only instead of
read/write operand. This allows the compiler to omit the
initialization, which in fact it also does (gcc 6.1).

Use the "+" constraint modifier to fix this. In addition also use the
Q constraint to specify the hws_qsi_info_block memory location, so the
compiler can generate slightly better code. Also get rid of the cc
clobber since none of the instructions within the inline assembly
modify the condition code.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:40 +01:00
Martin Schwidefsky
6d7b2ee9d5 s390/setup: reword printk messages
Two of the messages introduced by the memblock conversion are reworded.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:40 +01:00
Heiko Carstens
75a357341e s390: fix compile error with memmove_early() inline assembly
Old gcc versions can't handle a bogus early clobber on a Q constraint:

arch/s390/kernel/early.c: In function 'memmove_early.part.1':
arch/s390/kernel/early.c:432:2: error: '&' constraint used with no register class

Simply remove it to fix this.

Reported-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Fixes: d543a106f9 ("s390: fix initrd corruptions with gcov/kcov instrumented kernels")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:40 +01:00
Harald Freudenberger
13b251bdc8 s390/zcrypt: tracepoint definitions for zcrypt device driver.
This patch introduces tracepoint definitions and tracepoint
event invocations for the s390 zcrypt device.

Currently there are just two tracepoint events defined.
An s390_zcrypt_req request event occurs as soon as the
request is recognized by the zcrypt ioctl function. This
event may act as some kind of request-processing-starts-now
indication.
As late as possible within the zcrypt ioctl function there
occurs the s390_zcrypt_rep event which may act as the point
in time where the request has been processed by the kernel
and the result is about to be transferred back to userspace.
The glue which binds together request and reply event is the
ptr parameter, which is the local buffer address where the
request from userspace has been stored by the ioctl function.

The main purpose of this zcrypt tracepoint patch is to get
some data for performance measurements together with
information about the kind of request and on which card and
queue the request has been processed. It is not an ffdc
interface as there is already code in the zcrypt device
driver to serve the s390 debug feature interface.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:40 +01:00
Ingo Tuchscherer
b886a9d156 s390/zcrypt: Introduce new zcrypt device status API
Introduce new ioctl (ZDEVICESTATUS) to provide detailed
information, like hardware type, domains, status and functionality
of available crypto devices.

Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:38 +01:00
Paul Bolle
846221cfb8 Remove references to dead make variable LINUX_INCLUDE
Commit 4fd06960f1 ("Use the new x86 setup code for i386") introduced a
reference to the make variable LINUX_INCLUDE. That reference got moved
around a bit and copied twice and now there are three references to it.

There has never been a definition of that variable. (Presumably that is
because it started out as a mistyped reference to LINUXINCLUDE.) So this
reference has always been an empty string. Let's remove it before it
spreads any further.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-12-14 10:54:28 +01:00
Linus Torvalds
2ec4584eb8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "The main bulk of the s390 patches for the 4.10 merge window:

   - Add support for the contiguous memory allocator.

   - The recovery for I/O errors in the dasd device driver is improved,
     the driver will now remove channel paths that are not working
     properly.

   - Additional fields are added to /proc/sysinfo, the extended
     partition name and the partition UUID.

   - New naming for PCI devices with system defined UIDs.

   - The last few remaining alloc_bootmem calls are converted to
     memblock.

   - The thread_info structure is stripped down and moved to the
     task_struct. The only field left in thread_info is the flags field.

   - Rework of the arch topology code to fix a fake numa issue.

   - Refactoring of the atomic primitives and add a new preempt_count
     implementation.

   - Clocksource steering for the STP sync check offsets.

   - The s390 specific headers are changed to make them usable with
     CLANG.

   - Bug fixes and cleanup"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (70 commits)
  s390/cpumf: Use configuration level indication for sampling data
  s390: provide memmove implementation
  s390: cleanup arch/s390/kernel Makefile
  s390: fix initrd corruptions with gcov/kcov instrumented kernels
  s390: exclude early C code from gcov profiling
  s390/dasd: channel path aware error recovery
  s390/dasd: extend dasd path handling
  s390: remove unused labels from entry.S
  s390/vmlogrdr: fix IUCV buffer allocation
  s390/crypto: unlock on error in prng_tdes_read()
  s390/sysinfo: show partition extended name and UUID if available
  s390/numa: pin all possible cpus to nodes early
  s390/numa: establish cpu to node mapping early
  s390/topology: use cpu_topology array instead of per cpu variable
  s390/smp: initialize cpu_present_mask in setup_arch
  s390/topology: always use s390 specific sched_domain_topology_level
  s390/smp: use smp_get_base_cpu() helper function
  s390/numa: always use logical cpu and core ids
  s390: Remove VLAIS in ptff() and clear_table()
  s390: fix machine check panic stack switch
  ...
2016-12-13 16:33:33 -08:00
Linus Torvalds
93173b5bf2 Small release, the most interesting stuff is x86 nested virt improvements.
x86: userspace can now hide nested VMX features from guests; nested
 VMX can now run Hyper-V in a guest; support for AVX512_4VNNIW and
 AVX512_FMAPS in KVM; infrastructure support for virtual Intel GPUs.
 
 PPC: support for KVM guests on POWER9; improved support for interrupt
 polling; optimizations and cleanups.
 
 s390: two small optimizations, more stuff is in flight and will be
 in 4.11.
 
 ARM: support for the GICv3 ITS on 32bit platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYTkP0FBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 lZIH/iT1n9OQXcuTpYYnQhuCenzI3GZZOIMTbCvK2i5bo0FIJKxVn0EiAAqZSXvO
 nO185FqjOgLuJ1AD1kJuxzye5suuQp4HIPWWgNHcexLuy43WXWKZe0IQlJ4zM2Xf
 u31HakpFmVDD+Cd1qN3yDXtDrRQ79/xQn2kw7CWb8olp+pVqwbceN3IVie9QYU+3
 gCz0qU6As0aQIwq2PyalOe03sO10PZlm4XhsoXgWPG7P18BMRhNLTDqhLhu7A/ry
 qElVMANT7LSNLzlwNdpzdK8rVuKxETwjlc1UP8vSuhrwad4zM2JJ1Exk26nC2NaG
 D0j4tRSyGFIdx6lukZm7HmiSHZ0=
 =mkoB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "Small release, the most interesting stuff is x86 nested virt
  improvements.

  x86:
   - userspace can now hide nested VMX features from guests
   - nested VMX can now run Hyper-V in a guest
   - support for AVX512_4VNNIW and AVX512_FMAPS in KVM
   - infrastructure support for virtual Intel GPUs.

  PPC:
   - support for KVM guests on POWER9
   - improved support for interrupt polling
   - optimizations and cleanups.

  s390:
   - two small optimizations, more stuff is in flight and will be in
     4.11.

  ARM:
   - support for the GICv3 ITS on 32bit platforms"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
  arm64: KVM: pmu: Reset PMSELR_EL0.SEL to a sane value before entering the guest
  KVM: arm/arm64: timer: Check for properly initialized timer on init
  KVM: arm/arm64: vgic-v2: Limit ITARGETSR bits to number of VCPUs
  KVM: x86: Handle the kthread worker using the new API
  KVM: nVMX: invvpid handling improvements
  KVM: nVMX: check host CR3 on vmentry and vmexit
  KVM: nVMX: introduce nested_vmx_load_cr3 and call it on vmentry
  KVM: nVMX: propagate errors from prepare_vmcs02
  KVM: nVMX: fix CR3 load if L2 uses PAE paging and EPT
  KVM: nVMX: load GUEST_EFER after GUEST_CR0 during emulated VM-entry
  KVM: nVMX: generate MSR_IA32_CR{0,4}_FIXED1 from guest CPUID
  KVM: nVMX: fix checks on CR{0,4} during virtual VMX operation
  KVM: nVMX: support restore of VMX capability MSRs
  KVM: nVMX: generate non-true VMX MSRs based on true versions
  KVM: x86: Do not clear RFLAGS.TF when a singlestep trap occurs.
  KVM: x86: Add kvm_skip_emulated_instruction and use it.
  KVM: VMX: Move skip_emulated_instruction out of nested_vmx_check_vmcs12
  KVM: VMX: Reorder some skip_emulated_instruction calls
  KVM: x86: Add a return value to kvm_emulate_cpuid
  KVM: PPC: Book3S: Move prototypes for KVM functions into kvm_ppc.h
  ...
2016-12-13 15:47:02 -08:00
Linus Torvalds
e34bac726d Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - various misc bits

 - most of MM (quite a lot of MM material is awaiting the merge of
   linux-next dependencies)

 - kasan

 - printk updates

 - procfs updates

 - MAINTAINERS

 - /lib updates

 - checkpatch updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (123 commits)
  init: reduce rootwait polling interval time to 5ms
  binfmt_elf: use vmalloc() for allocation of vma_filesz
  checkpatch: don't emit unified-diff error for rename-only patches
  checkpatch: don't check c99 types like uint8_t under tools
  checkpatch: avoid multiple line dereferences
  checkpatch: don't check .pl files, improve absolute path commit log test
  scripts/checkpatch.pl: fix spelling
  checkpatch: don't try to get maintained status when --no-tree is given
  lib/ida: document locking requirements a bit better
  lib/rbtree.c: fix typo in comment of ____rb_erase_color
  lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM
  MAINTAINERS: add drm and drm/i915 irc channels
  MAINTAINERS: add "C:" for URI for chat where developers hang out
  MAINTAINERS: add drm and drm/i915 bug filing info
  MAINTAINERS: add "B:" for URI where to file bugs
  get_maintainer: look for arbitrary letter prefixes in sections
  printk: add Kconfig option to set default console loglevel
  printk/sound: handle more message headers
  printk/btrfs: handle more message headers
  printk/kdb: handle more message headers
  ...
2016-12-12 20:50:02 -08:00
Linus Torvalds
e71c3978d6 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
 "This is the final round of converting the notifier mess to the state
  machine. The removal of the notifiers and the related infrastructure
  will happen around rc1, as there are conversions outstanding in other
  trees.

  The whole exercise removed about 2000 lines of code in total and in
  course of the conversion several dozen bugs got fixed. The new
  mechanism allows to test almost every hotplug step standalone, so
  usage sites can exercise all transitions extensively.

  There is more room for improvement, like integrating all the
  pointlessly different architecture mechanisms of synchronizing,
  setting cpus online etc into the core code"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  tracing/rb: Init the CPU mask on allocation
  soc/fsl/qbman: Convert to hotplug state machine
  soc/fsl/qbman: Convert to hotplug state machine
  zram: Convert to hotplug state machine
  KVM/PPC/Book3S HV: Convert to hotplug state machine
  arm64/cpuinfo: Convert to hotplug state machine
  arm64/cpuinfo: Make hotplug notifier symmetric
  mm/compaction: Convert to hotplug state machine
  iommu/vt-d: Convert to hotplug state machine
  mm/zswap: Convert pool to hotplug state machine
  mm/zswap: Convert dst-mem to hotplug state machine
  mm/zsmalloc: Convert to hotplug state machine
  mm/vmstat: Convert to hotplug state machine
  mm/vmstat: Avoid on each online CPU loops
  mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead()
  tracing/rb: Convert to hotplug state machine
  oprofile/nmi timer: Convert to hotplug state machine
  net/iucv: Use explicit clean up labels in iucv_init()
  x86/pci/amd-bus: Convert to hotplug state machine
  x86/oprofile/nmi: Convert to hotplug state machine
  ...
2016-12-12 19:25:04 -08:00
Johannes Weiner
6d75f366b9 lib: radix-tree: check accounting of existing slot replacement users
The bug in khugepaged fixed earlier in this series shows that radix tree
slot replacement is fragile; and it will become more so when not only
NULL<->!NULL transitions need to be caught but transitions from and to
exceptional entries as well.  We need checks.

Re-implement radix_tree_replace_slot() on top of the sanity-checked
__radix_tree_replace().  This requires existing callers to also pass the
radix tree root, but it'll warn us when somebody replaces slots with
contents that need proper accounting (transitions between NULL entries,
real entries, exceptional entries) and where a replacement through the
slot pointer would corrupt the radix tree node counts.

Link: http://lkml.kernel.org/r/20161117193021.GB23430@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:08 -08:00
Aneesh Kumar K.V
692a68c154 mm: remove the page size change check in tlb_remove_page
Now that we check for page size change early in the loop, we can
partially revert e9d55e1570 ("mm: change the interface for
__tlb_remove_page").

This simplies the code much, by removing the need to track the last
address with which we adjusted the range.  We also go back to the older
way of filling the mmu_gather array, ie, we add an entry and then check
whether the gather batch is full.

Link: http://lkml.kernel.org/r/20161026084839.27299-6-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:07 -08:00
Aneesh Kumar K.V
07e326610e mm: add tlb_remove_check_page_size_change to track page size change
With commit e77b0852b5 ("mm/mmu_gather: track page size with mmu
gather and force flush if page size change") we added the ability to
force a tlb flush when the page size change in a mmu_gather loop.  We
did that by checking for a page size change every time we added a page
to mmu_gather for lazy flush/remove.  We can improve that by moving the
page size change check early and not doing it every time we add a page.

This also helps us to do tlb flush when invalidating a range covering
dax mapping.  Wrt dax mapping we don't have a backing struct page and
hence we don't call tlb_remove_page, which earlier forced the tlb flush
on page size change.  Moving the page size change check earlier means we
will do the same even for dax mapping.

We also avoid doing this check on architecture other than powerpc.

In a later patch we will remove page size check from tlb_remove_page().

Link: http://lkml.kernel.org/r/20161026084839.27299-5-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:07 -08:00
Aneesh Kumar K.V
b528e4b640 mm/hugetlb: add tlb_remove_hugetlb_entry for handling hugetlb pages
This add tlb_remove_hugetlb_entry similar to tlb_remove_pmd_tlb_entry.

Link: http://lkml.kernel.org/r/20161026084839.27299-4-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12 18:55:07 -08:00
Linus Torvalds
92c020d08d Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main scheduler changes in this cycle were:

   - support Intel Turbo Boost Max Technology 3.0 (TBM3) by introducig a
     notion of 'better cores', which the scheduler will prefer to
     schedule single threaded workloads on. (Tim Chen, Srinivas
     Pandruvada)

   - enhance the handling of asymmetric capacity CPUs further (Morten
     Rasmussen)

   - improve/fix load handling when moving tasks between task groups
     (Vincent Guittot)

   - simplify and clean up the cputime code (Stanislaw Gruszka)

   - improve mass fork()ed task spread a.k.a. hackbench speedup (Vincent
     Guittot)

   - make struct kthread kmalloc()ed and related fixes (Oleg Nesterov)

   - add uaccess atomicity debugging (when using access_ok() in the
     wrong context), under CONFIG_DEBUG_ATOMIC_SLEEP=y (Peter Zijlstra)

   - implement various fixes, cleanups and other enhancements (Daniel
     Bristot de Oliveira, Martin Schwidefsky, Rafael J. Wysocki)"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  sched/core: Use load_avg for selecting idlest group
  sched/core: Fix find_idlest_group() for fork
  kthread: Don't abuse kthread_create_on_cpu() in __kthread_create_worker()
  kthread: Don't use to_live_kthread() in kthread_[un]park()
  kthread: Don't use to_live_kthread() in kthread_stop()
  Revert "kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function"
  kthread: Make struct kthread kmalloc'ed
  x86/uaccess, sched/preempt: Verify access_ok() context
  sched/x86: Make CONFIG_SCHED_MC_PRIO=y easier to enable
  sched/x86: Change CONFIG_SCHED_ITMT to CONFIG_SCHED_MC_PRIO
  x86/sched: Use #include <linux/mutex.h> instead of #include <asm/mutex.h>
  cpufreq/intel_pstate: Use CPPC to get max performance
  acpi/bus: Set _OSC for diverse core support
  acpi/bus: Enable HWP CPPC objects
  x86/sched: Add SD_ASYM_PACKING flags to x86 ITMT CPU
  x86/sysctl: Add sysctl for ITMT scheduling feature
  x86: Enable Intel Turbo Boost Max Technology 3.0
  x86/topology: Define x86's arch_update_cpu_topology
  sched: Extend scheduler's asym packing
  sched/fair: Clean up the tunable parameter definitions
  ...
2016-12-12 12:15:10 -08:00
Linus Torvalds
6cdf89b1ca Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The tree got pretty big in this development cycle, but the net effect
  is pretty good:

    115 files changed, 673 insertions(+), 1522 deletions(-)

  The main changes were:

   - Rework and generalize the mutex code to remove per arch mutex
     primitives. (Peter Zijlstra)

   - Add vCPU preemption support: add an interface to query the
     preemption status of vCPUs and use it in locking primitives - this
     optimizes paravirt performance. (Pan Xinhui, Juergen Gross,
     Christian Borntraeger)

   - Introduce cpu_relax_yield() and remov cpu_relax_lowlatency() to
     clean up and improve the s390 lock yielding machinery and its core
     kernel impact. (Christian Borntraeger)

   - Micro-optimize mutexes some more. (Waiman Long)

   - Reluctantly add the to-be-deprecated mutex_trylock_recursive()
     interface on a temporary basis, to give the DRM code more time to
     get rid of its locking hacks. Any other users will be NAK-ed on
     sight. (We turned off the deprecation warning for the time being to
     not pollute the build log.) (Peter Zijlstra)

   - Improve the rtmutex code a bit, in light of recent long lived
     bugs/races. (Thomas Gleixner)

   - Misc fixes, cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/paravirt: Fix bool return type for PVOP_CALL()
  x86/paravirt: Fix native_patch()
  locking/ww_mutex: Use relaxed atomics
  locking/rtmutex: Explain locking rules for rt_mutex_proxy_unlock()/init_proxy_locked()
  locking/rtmutex: Get rid of RT_MUTEX_OWNER_MASKALL
  x86/paravirt: Optimize native pv_lock_ops.vcpu_is_preempted()
  locking/mutex: Break out of expensive busy-loop on {mutex,rwsem}_spin_on_owner() when owner vCPU is preempted
  locking/osq: Break out of spin-wait busy waiting loop for a preempted vCPU in osq_lock()
  Documentation/virtual/kvm: Support the vCPU preemption check
  x86/xen: Support the vCPU preemption check
  x86/kvm: Support the vCPU preemption check
  x86/kvm: Support the vCPU preemption check
  kvm: Introduce kvm_write_guest_offset_cached()
  locking/core, x86/paravirt: Implement vcpu_is_preempted(cpu) for KVM and Xen guests
  locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)
  locking/core, powerpc: Implement vcpu_is_preempted(cpu)
  sched/core: Introduce the vcpu_is_preempted(cpu) interface
  sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q
  locking/core: Provide common cpu_relax_yield() definition
  locking/mutex: Don't mark mutex_trylock_recursive() as deprecated, temporarily
  ...
2016-12-12 10:48:02 -08:00
Christian Borntraeger
c19805f870 s390/cpumf: Use configuration level indication for sampling data
Newer hardware provides the level of virtualization that a particular
sample belongs to. Use that information and fall back to the old
heuristics if the sample does not contain that information.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:12:18 +01:00
Heiko Carstens
b4623d4e5b s390: provide memmove implementation
Provide an s390 specific memmove implementation which is faster than
the generic implementation which copies byte-wise.

For non-destructive (as defined by the mvc instruction) memmove
operations the following table compares the old default implementation
versus the new s390 specific implementation:

size     old   new
   1     1ns   8ns
   2     2ns   8ns
   4     4ns   8ns
   8     7ns   8ns
  16    17ns   8ns
  32    35ns   8ns
  64    65ns   9ns
 128   146ns  10ns
 256   298ns  11ns
 512   537ns  11ns
1024  1193ns  19ns
2048  2405ns  36ns

So only for very small sizes the old implementation is faster. For
overlapping memmoves, where the mvc instruction can't be used, the new
implementation is as slow as the old one.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:32 +01:00
Heiko Carstens
82897ede92 s390: cleanup arch/s390/kernel Makefile
Group all compiler flag modification lines together and sort them
alphabetically. This should hopefully prevent future bugs due to
missing flag modifications.
Also fix indentation at some places.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:26 +01:00
Heiko Carstens
d543a106f9 s390: fix initrd corruptions with gcov/kcov instrumented kernels
The early C code within arch/s390/kernel/early.c saves ipl parameters
before the bss section is cleared. When doing that it jumps to code
that is potentially gcov/kcov instrumented. That code in turn will
corrupt an initrd that potentially may reside in the not yet ready to
be used bss section.

Instead of excluding more and more code from gcov/kcov instrumentation
provide an early memmove function which will be used to save ipl
parameters. The verification if these parameters are actually valid
will be done later.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:20 +01:00
Heiko Carstens
b5cb9bf8dd s390: exclude early C code from gcov profiling
Early C code must be excluded from gcov profiling since it may write
to the bss section before
- a potential initrd that resides there is rescued
- the bss section is initialized (zeroed)

This patch only addresses the problem that early code is instrumented
for profiling, but not the problem that it jumps into other code that
is still instrumented. That problem will be fixed with a follow-on
patch.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:10:15 +01:00
Stefan Haberland
a521b048bc s390/dasd: channel path aware error recovery
With this feature, the DASD device driver more robustly handles DASDs
that are attached via multiple channel paths and are subject to
constant Interface-Control-Checks (IFCCs) and Channel-Control-Checks
(CCCs) or loss of High-Performance-FICON (HPF) functionality on one or
more of these paths.

If a channel path does not work correctly, it is removed from normal
operation as long as other channel paths are available. All extended
error recovery states can be queried and reset via user space
interfaces.

Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:05:03 +01:00
Heiko Carstens
7df1160459 s390: remove unused labels from entry.S
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:04:26 +01:00
Dan Carpenter
9e6e7c7431 s390/crypto: unlock on error in prng_tdes_read()
We added some new locking but forgot to unlock on error.

Fixes: 57127645d7 ("s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 09:01:10 +01:00
Martin KaFai Lau
17bedab272 bpf: xdp: Allow head adjustment in XDP prog
This patch allows XDP prog to extend/remove the packet
data at the head (like adding or removing header).  It is
done by adding a new XDP helper bpf_xdp_adjust_head().

It also renames bpf_helper_changes_skb_data() to
bpf_helper_changes_pkt_data() to better reflect
that XDP prog does not work on skb.

This patch adds one "xdp_adjust_head" bit to bpf_prog for the
XDP-capable driver to check if the XDP prog requires
bpf_xdp_adjust_head() support.  The driver can then decide
to error out during XDP_SETUP_PROG.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-08 14:25:13 -05:00
Viktor Mihajlovski
e32eae10e5 s390/sysinfo: show partition extended name and UUID if available
Extract extended name and UUID from SYSIB 2.2.2 data.
As the code to convert the raw extended name into printable format
can be reused by stsi_2_2_2 we're moving the conversion code into a
separate function convert_ext_name.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 12:29:47 +01:00
Heiko Carstens
e6d4a636ac s390/numa: pin all possible cpus to nodes early
It is required to have an early static cpu to node mapping. This patch
pins all possible cpus to nodes for which no topology information is
present. Since there is no interface available which would allow to
tell where a non-present cpu would appear topology-wise, simply use a
round robin algorithm.
Right now this makes sure that the cpu_to_node() function will return
the same value for a cpu during the life time of the system.

Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:33 +01:00
Heiko Carstens
8c91058022 s390/numa: establish cpu to node mapping early
Initialize the cpu topology and therefore also the cpu to node mapping
much earlier. Fixes this warning and subsequent crashes when using the
fake numa emulation mode on s390:

WARNING: CPU: 0 PID: 1 at include/linux/cpumask.h:121 select_task_rq+0xe6/0x1a8
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00001-ge9d867a67fd0-dirty #28
task: 00000001dd270008 ti: 00000001eccb4000 task.ti: 00000001eccb4000
Krnl PSW : 0404c00180000000 0000000000176c56 (select_task_rq+0xe6/0x1a8)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Call Trace:
([<0000000000176c30>] select_task_rq+0xc0/0x1a8)
([<0000000000177d64>] try_to_wake_up+0x2e4/0x478)
([<000000000015d46c>] create_worker+0x174/0x1c0)
([<0000000000161a98>] alloc_unbound_pwq+0x360/0x438)
([<0000000000162550>] apply_wqattrs_prepare+0x200/0x2a0)
([<000000000016266a>] apply_workqueue_attrs_locked+0x7a/0xb0)
([<0000000000162af0>] apply_workqueue_attrs+0x50/0x78)
([<000000000016441c>] __alloc_workqueue_key+0x304/0x520)
([<0000000000ee3706>] default_bdi_init+0x3e/0x70)
([<0000000000100270>] do_one_initcall+0x140/0x1d8)
([<0000000000ec9da8>] kernel_init_freeable+0x220/0x2d8)
([<0000000000984a7a>] kernel_init+0x2a/0x150)
([<00000000009913fa>] kernel_thread_starter+0x6/0xc)
([<00000000009913f4>] kernel_thread_starter+0x0/0xc)

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:25 +01:00
Heiko Carstens
30fc4ca2a8 s390/topology: use cpu_topology array instead of per cpu variable
CPU topology information like cpu to node mapping must be setup in
setup_arch already. Topology information is currently made available
with a per cpu variable; this however will not work when the
initialization will be moved to setup_arch, since the generic percpu
setup will be done much later.

Therefore convert back to a cpu_topology array.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:16 +01:00
Heiko Carstens
af51160ebd s390/smp: initialize cpu_present_mask in setup_arch
In order to be able to setup the cpu to node mappings early it is a
prerequisite to know which cpus are present. Therefore cpus must be
detected much earlier than before.

For sclp based cpu detection this requires yet another early sclp
call, since the system is not ready to use the regular interrupt and
memory allocations.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:07 +01:00
Heiko Carstens
ebb299a510 s390/topology: always use s390 specific sched_domain_topology_level
The s390 specific sched_domain_topology_level should always be used,
not only if the machine provides topology information. Luckily this
odd behaviour, that was by accident introduced with git commit
d05d15da18 ("s390/topology: delay initialization of topology cpu
masks") has currently no side effect.

Fixes: d05d15da18 ("s390/topology: delay initialization of topology cpumasks")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:59 +01:00
Heiko Carstens
5423145f8c s390/smp: use smp_get_base_cpu() helper function
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:50 +01:00
Heiko Carstens
307b3114ef s390/numa: always use logical cpu and core ids
The toptree algorithm uses the physical core ids to create a mapping
between cores and nodes (to_node_id array within emu_cores structure).
The core ids are used as an index into an array which size depends on
CONFIG_NR_CPUS. If the physical core ids are larger, this will result
in out-of-bounds write accesses.

Generate logical core ids instead to avoid this.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:41 +01:00
Michael Holzheu
11a247e376 s390: Remove VLAIS in ptff() and clear_table()
The ptff() and clear_table() functions use the gcc extension "variable
length arrays in structures" (VLAIS) to define in the inline assembler
constraints the area of the clobbered memory. This extension will most
likely never be supported by LLVM/Clang.

Since currently BPF programs are compiled with LLVM, this leads to the
following compile errors:

 $ cd samples/bpf
 $ make

 In file included from /root/linux-master/samples/bpf/tracex1_kern.c:8:
 In file included from ./include/linux/netdevice.h:44:
 ...
 In file included from ./arch/s390/include/asm/mmu_context.h:10:
  ./arch/s390/include/asm/pgalloc.h:30:24: error: fields must have a
  constant size: 'variable length array in structure' extension will never
  be supported
         typedef struct { char _[n]; } addrtype;

 In file included from /root/linux-master/samples/bpf/tracex1_kern.c:7:
 In file included from ./include/linux/skbuff.h:18:
 ...
 In file included from ./include/linux/jiffies.h:8:
 In file included from ./include/linux/timex.h:65:
  ./arch/s390/include/asm/timex.h:105:24: error: fields must have a
  constant size: 'variable length array in structure' extension will never
  be supported
        typedef struct { char _[len]; } addrtype;

To fix this do the following:

 - Convert ptff() into a macro that then uses a fixed size array
   when expanded.
 - Convert the clear_table() function and use an inline assembly
   with fixed size array in a loop.
   The runtime performance of the new version is even better than
   the old version (tested with EC12/z13 and gcc 4.8.5/6.2.1 with
   "-march=z196 -O2").

Reported-by: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:30 +01:00
Martin Schwidefsky
ce4dda3f02 s390: fix machine check panic stack switch
For system damage machine checks or machine checks due to invalid PSW
fields the system will be stopped. In order to get an oops message out
before killing the system the machine check handler branches to
.Lmcck_panic, switches to the panic stack and then does the usual
machine check handling.

The switch to the panic stack is incomplete, the stack pointer in %r15
is replaced, but the pt_regs pointer in %r11 is not. The result is
a program check which will kill the system in a slightly different way.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:13 +01:00
Heiko Carstens
db7ad63624 s390/setup: fix memblock usage
When converting from bootmem to memblock I missed a subtle difference:
the memblock_alloc() functions return uninitialized memory, while the
memblock_virt_alloc() functions return zeroed memory.

This led to quite random early boot crashes.

Therefore use the correct version everywhere now.
Hopefully.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-02 07:36:24 +01:00
Heiko Carstens
9f88eb4df7 s390/kexec: use node 0 when re-adding crash kernel memory
When re-adding crash kernel memory within setup_resources() the
function memblock_add() is used. That function will add memory by
default to node "MAX_NUMNODES" instead of node 0, like the memory
detection code does. In case of !NUMA this will trigger this warning
when the kernel generates the vmemmap:

Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220
CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16
Call Trace:
 [<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8
 [<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50
 [<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0
 [<0000000000840136>] sparse_mem_map_populate+0x46/0x68
 [<0000000000d0c59c>] sparse_init+0x184/0x238
 [<0000000000cf45f6>] paging_init+0xbe/0xf8
 [<0000000000cf1d4a>] setup_arch+0xa02/0xae0
 [<0000000000ced75a>] start_kernel+0x72/0x450
 [<0000000000100020>] _stext+0x20/0x80

If NUMA is selected numa_setup_memory() will fix the node assignments
before the vmemmap will be populated; so this warning will only appear
if NUMA is not selected.

To fix this simply use memblock_add_node() and re-add crash kernel
memory explicitly to node 0.

Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 4e042af463 ("s390/kexec: fix crash on resize of reserved memory")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-02 07:36:17 +01:00
Francis Yan
1c885808e4 tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING
This patch exports the sender chronograph stats via the socket
SO_TIMESTAMPING channel. Currently we can instrument how long a
particular application unit of data was queued in TCP by tracking
SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
these sender chronograph stats exported simultaneously along with
these timestamps allow further breaking down the various sender
limitation.  For example, a video server can tell if a particular
chunk of video on a connection takes a long time to deliver because
TCP was experiencing small receive window. It is not possible to
tell before this patch without packet traces.

To prepare these stats, the user needs to set
SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
while requesting other SOF_TIMESTAMPING TX timestamps. When the
timestamps are available in the error queue, the stats are returned
in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.

Signed-off-by: Francis Yan <francisyyan@gmail.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-30 10:04:25 -05:00
Heiko Carstens
dd5224986e s390/uapi: sort header export list
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:59 +01:00
Heiko Carstens
aa9725ff9c s390/hypfs: add hypfs header file to uapi header export list
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:58 +01:00
Heiko Carstens
6bc32bf0a0 s390: use generic asm-offsets.h
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:57 +01:00
Heiko Carstens
9e427365af s390: convert remaining bootmem allocations to memblock
Get rid of all remaining alloc_bootmem calls and use memblock_alloc
instead everywhere.  This way we get rid of the inconsistent mixture
of alloc_bootmem and memblock_alloc usages.

Two of the alloc_bootmem_low calls within arch/s390/kernel/setup.c are
replaced with memblock_alloc calls that don't enforce that the
allocated memory is below 2GB. This restriction was never necessary.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:55 +01:00
Martin Schwidefsky
61aaef51cc s390: fix kernel oops for CONFIG_MARCH_Z900=y builds
The LAST_BREAK macro in entry.S uses a different instruction sequence
for CONFIG_MARCH_Z900 builds. The branch target offset to skip the
store of the last breaking event address needs to take the different
length of the code block into account.

Fixes: f8fc82b471 ("s390: move sys_call_table and last_break from thread_info to thread_struct")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-25 10:07:55 +01:00
Sebastian Ott
5c5afd0201 s390/pci: use unique UIDs for domain enumeration
Use UIDs as domain numbers if the UID checking rules apply (in this
case the FW guarantees uniqueness of these values).

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:26 +01:00
Heiko Carstens
80abb39b50 s390: update defconfig
Enable the contiguous memory allocator but set the default size to
zero. If somebody wants to use the cma allocator the "cma=" kernel
parameter has to be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:24 +01:00
Heiko Carstens
e1231b0e48 s390: add cma support
In order to make the cma infrastructure usable we need to add a small
architecture backend which calls dma_contiguous_reserve.
Otherwise we would end up with the cma allocator enabled, but no pool
where memory can be allocated from.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:23 +01:00
Heiko Carstens
1e16b09666 s390/cpumf: simplify psw generation
Use the psw_bits macro and simplify the code. The generated code is
also better since it doesn't contain any conditional branches anymore.

Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:22 +01:00
Heiko Carstens
3a890380e4 s390/thread_info: get rid of THREAD_ORDER define
We have the s390 specific THREAD_ORDER define and the THREAD_SIZE_ORDER
define which is also used in common code. Both have exactly the same
semantics. Therefore get rid of THREAD_ORDER and always use
THREAD_SIZE_ORDER instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:21 +01:00
Heiko Carstens
56e9219a82 s390/uaccess: make setfs macro return void
For an unknown (historic) reason the s390 specific implementation of
set_fs returns whatever the __ctl_load would return. The set_fs macro
however is supposed to return void.
Change the macro to do that.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:19 +01:00
Christian Borntraeger
e1788bb995 KVM: s390: handle floating point registers in the run ioctl not in vcpu_put/load
Right now we switch the host fprs/vrs in kvm_arch_vcpu_load and switch
back in kvm_arch_vcpu_put. This process is already optimized
since commit 9977e886cb ("s390/kernel: lazy restore fpu registers")
avoiding double save/restores on schedule. We still reload the pointers
and test the guest fpc on each context switch, though.

We can minimize the cost of vcpu_load/put by doing the test in the
VCPU_RUN ioctl itself. As most VCPU threads almost never exit to
userspace in the common fast path, this allows to avoid this overhead
for the common case (eventfd driven I/O, all exits including sleep
handled in the kernel) - making kvm_arch_vcpu_load/put basically
disappear in perf top.

Also adapt the fpu get/set ioctls.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-11-22 19:32:35 +01:00
Christian Borntraeger
31d8b8d41a KVM: s390: handle access registers in the run ioctl not in vcpu_put/load
Right now we save the host access registers in kvm_arch_vcpu_load
and load them in kvm_arch_vcpu_put. Vice versa for the guest access
registers. On schedule this means, that we load/save access registers
multiple times.

e.g. VCPU_RUN with just one reschedule and then return does

[from user space via VCPU_RUN]
- save the host registers in kvm_arch_vcpu_load (via ioctl)
- load the guest registers in kvm_arch_vcpu_load (via ioctl)
- do guest stuff
- decide to schedule/sleep
- save the guest registers in kvm_arch_vcpu_put (via sched)
- load the host registers in kvm_arch_vcpu_put (via sched)
- save the host registers in switch_to (via sched)
- schedule
- return
- load the host registers in switch_to (via sched)
- save the host registers in kvm_arch_vcpu_load (via sched)
- load the guest registers in kvm_arch_vcpu_load (via sched)
- do guest stuff
- decide to go to userspace
- save the guest registers in kvm_arch_vcpu_put (via ioctl)
- load the host registers in kvm_arch_vcpu_put (via ioctl)
[back to user space]

As the kernel does not use access registers, we can avoid
this reloading and simply piggy back on switch_to (let it save
the guest values instead of host values in thread.acrs) by
moving the host/guest switch into the VCPU_RUN ioctl function.
We now do

[from user space via VCPU_RUN]
- save the host registers in kvm_arch_vcpu_ioctl_run
- load the guest registers in kvm_arch_vcpu_ioctl_run
- do guest stuff
- decide to schedule/sleep
- save the guest registers in switch_to
- schedule
- return
- load the guest registers in switch_to (via sched)
- do guest stuff
- decide to go to userspace
- save the guest registers in kvm_arch_vcpu_ioctl_run
- load the host registers in kvm_arch_vcpu_ioctl_run

This seems to save about 10% of the vcpu_put/load functions
according to perf.

As vcpu_load no longer switches the acrs, We can also loading
the acrs in kvm_arch_vcpu_ioctl_set_sregs.

Suggested-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-11-22 19:32:35 +01:00
Christian Borntraeger
760928c0da locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)
This implements the s390 version for vcpu_is_preempted(cpu),
by reworking the existing smp_vcpu_scheduled() function into
arch_vcpu_is_preempted().

We can then also get rid of the local cpu_is_preempted()
function by moving the CIF_ENABLED_WAIT test into
arch_vcpu_is_preempted().

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-6-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:48:06 +01:00
Ingo Molnar
02cb689b2c Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:37:38 +01:00
Christian Borntraeger
6d0d287891 locking/core: Provide common cpu_relax_yield() definition
No need to duplicate the same define everywhere. Since
the only user is stop-machine and the only provider is
s390, we can use a default implementation of cpu_relax_yield()
in sched.h.

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 08:17:36 +01:00
Sebastian Ott
a95aeb2ff4 s390/pci_dma: remove memset from dma_alloc
Get rid of a useless memset from dma_alloc. Users of dma_alloc who want
zero initialized memory can get it by specifying __GFP_ZERO or use one
of the zalloc variants.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 07:10:14 +01:00
Sebastian Ott
4f5359e94b s390/pci_dma: make lazy flush independent from the tlb_refresh bit
We have 2 strategies to reduce the number of RPCIT instructions:
* A HW feature indicated via the tlb_refresh bit allows us to omit RPCIT for
  invalid -> valid translation-table entry updates.
* With "lazy flush" we omit RPCIT for valid -> invalid updates until we run
  out of dma addresses. When we have to reuse dma addresses we issue a global
  tlb flush using only one RPCIT instruction.

Currently lazy flushing depends on tlb_refresh. Since there is no technical
reason for this remove this dependency.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 07:10:08 +01:00
Sebastian Ott
6b7df3ce92 s390/pci: fix dma address calculation in map_sg
__s390_dma_map_sg maps a dma-contiguous area. Although we only map
whole pages we have to take into account that the area doesn't start
or stop at a page boundary because we use the dma address to loop
over the individual sg entries. Failing to do that might lead to an
access of the wrong sg entry.

Fixes: ee877b81c6 ("s390/pci_dma: improve map_sg")
Reported-and-tested-by: Christoph Raisch <raisch@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 07:09:53 +01:00
Martin Schwidefsky
191ce9d1fd s390/time: fix clocksource steering for negative clock offsets
The TOD clock offset injected by an STP sync check can be negative.
If the resulting total tod_steering_delta gets negative the kernel
will panic.

Change the type of tod_steering_delta to a signed type.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 75c7b6f3f6 ("s390/time: steer clocksource on STP sync events")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 06:56:38 +01:00
Ingo Molnar
0acbc7aa47 Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:16:28 +01:00
Christian Borntraeger
5bd0b85ba8 locking/core, arch: Remove cpu_relax_lowlatency()
As there are no users left, we can remove cpu_relax_lowlatency()
implementations from every architecture.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Cc: <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:11 +01:00
Christian Borntraeger
22b6430d36 locking/core, s390: Make cpu_relax() a barrier again
stop_machine() seemed to be the only important place for yielding during
cpu_relax(). This was fixed by using cpu_relax_yield().

Therefore, we can now redefine cpu_relax() to be a barrier instead on s390,
making s390 identical to all other architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-4-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:10 +01:00
Christian Borntraeger
79ab11cdb9 locking/core: Introduce cpu_relax_yield()
For spinning loops people do often use barrier() or cpu_relax().
For most architectures cpu_relax and barrier are the same, but on
some architectures cpu_relax can add some latency.
For example on power,sparc64 and arc, cpu_relax can shift the CPU
towards other hardware threads in an SMT environment.
On s390 cpu_relax does even more, it uses an hypercall to the
hypervisor to give up the timeslice.
In contrast to the SMT yielding this can result in larger latencies.
In some places this latency is unwanted, so another variant
"cpu_relax_lowlatency" was introduced. Before this is used in more
and more places, lets revert the logic and provide a cpu_relax_yield
that can be called in places where yielding is more important than
latency. By default this is the same as cpu_relax on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:09 +01:00
Martin Schwidefsky
ef280c859f s390: move sys_call_table and last_break from thread_info to thread_struct
Move the last two architecture specific fields from the thread_info
structure to the thread_struct. All that is left in thread_info is
the flags field.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-15 16:48:20 +01:00
Stanislaw Gruszka
40565b5aed sched/cputime, powerpc, s390: Make scaled cputime arch specific
Only s390 and powerpc have hardware facilities allowing to measure
cputimes scaled by frequency. On all other architectures
utimescaled/stimescaled are equal to utime/stime (however they are
accounted separately).

Remove {u,s}timescaled accounting on all architectures except
powerpc and s390, where those values are explicitly accounted
in the proper places.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161031162143.GB12646@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-15 09:51:05 +01:00
Linus Torvalds
015ed9433b Merge branch 'maybe-uninitialized' (patches from Arnd)
Merge fixes for -Wmaybe-uninitialized from Arnd Bergmann:
 "It took a while for some patches to make it into mainline through
  maintainer trees, but the 28-patch series is now reduced to 10, with
  one tiny patch added at the end.

  Aside from patches that are no longer required, I did these changes
  compared to version 1:

   - Dropped "iio: maxim_thermocouple: detect invalid storage size in
     read()", which is currently in linux-next as commit 32cb7d27e6.
     This is the only remaining warning I see for a couple of corner
     cases (kbuild bot reports it on blackfin, kernelci bot and arm-soc
     bot both report it on arm64)

   - Dropped "brcmfmac: avoid maybe-uninitialized warning in
     brcmf_cfg80211_start_ap", which is currently in net/master merge
     pending.

   - Dropped two x86 patches, "x86: math-emu: possible uninitialized
     variable use" and "x86: mark target address as output in 'insb'
     asm" as they do not seem to trigger for a default build, and I got
     no feedback on them. Both of these are ancient issues and seem
     harmless, I will send them again to the x86 maintainers once the
     rest is merged.

   - Dropped "rbd: false-postive gcc-4.9 -Wmaybe-uninitialized" based on
     feedback from Ilya Dryomov, who already has a different fix queued
     up for v4.10. The kbuild bot reports this as a warning for xtensa.

   - Replaced "crypto: aesni: avoid -Wmaybe-uninitialized warning" with
     a simpler patch, this one always triggers but my first solution
     would not be safe for linux-4.9 any more at this point. I'll follow
     up with the larger patch as a cleanup for 4.10.

   - Replaced "dib0700: fix nec repeat handling" with a better one,
     contributed by Sean Young"

* -Wmaybe-uninitialized fixes:
  Kbuild: enable -Wmaybe-uninitialized warnings by default
  pcmcia: fix return value of soc_pcmcia_regulator_set
  infiniband: shut up a maybe-uninitialized warning
  crypto: aesni: shut up -Wmaybe-uninitialized warning
  rc: print correct variable for z8f0811
  dib0700: fix nec repeat handling
  s390: pci: don't print uninitialized data for debugging
  nios2: fix timer initcall return value
  x86: apm: avoid uninitialized data
  NFSv4.1: work around -Wmaybe-uninitialized warning
  Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"
2016-11-11 10:03:01 -08:00
Arnd Bergmann
92dfffee97 s390: pci: don't print uninitialized data for debugging
gcc correctly warns about an incorrect use of the 'pa' variable in case
we pass an empty scatterlist to __s390_dma_map_sg:

  arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg':
  arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized]

This adds a bogus initialization to the function to sanitize the debug
output.  I would have preferred a solution without the initialization,
but I only got the report from the kbuild bot after turning on the
warning again, and didn't manage to reproduce it myself.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11 08:45:08 -08:00
Jakub Kicinski
d7c19b066d mm: kmemleak: scan .data.ro_after_init
Limit the number of kmemleak false positives by including
.data.ro_after_init in memory scanning.  To achieve this we need to add
symbols for start and end of the section to the linker scripts.

The problem was been uncovered by commit 56989f6d85 ("genetlink: mark
families as __ro_after_init").

Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-11 08:12:37 -08:00
Martin Schwidefsky
90c53e6580 s390: move cputime accounting fields from thread_info to thread_struct
The user_timer and system_timer fields are used for the per-thread
cputime accounting code. The access to these values is simpler if
they are moved to the thread_struct as the task_thread_info(tsk)
indirection is not needed anymore.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:43 +01:00
Martin Schwidefsky
f8fc82b471 s390: move system_call field from thread_info to thread_struct
The system_call field in thread_info structure is used by the signal
code to store the number of the current system call while the debugger
interacts with its inferior. A better location for the system_call
field is with the other debugger related information in the
thread_struct.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:43 +01:00
Heiko Carstens
d5c352cdd0 s390: move thread_info into task_struct
This is the s390 variant of commit 15f4eae70d ("x86: Move
thread_info into task_struct").

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:41 +01:00
Martin Schwidefsky
c360192bf4 s390/preempt: move preempt_count to the lowcore
Convert s390 to use a field in the struct lowcore for the CPU
preemption count. It is a bit cheaper to access a lowcore field
compared to a thread_info variable and it removes the depencency
on a task related structure.

bloat-o-meter on the vmlinux image for the default configuration
(CONFIG_PREEMPT_NONE=y) reports a small reduction in text size:

add/remove: 0/0 grow/shrink: 18/578 up/down: 228/-5448 (-5220)

A larger improvement is achieved with the default configuration
but with CONFIG_PREEMPT=y and CONFIG_DEBUG_PREEMPT=n:

add/remove: 2/6 grow/shrink: 59/4477 up/down: 1618/-228762 (-227144)

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:40 +01:00
Martin Schwidefsky
1993dbc7e0 s390/bitops: use atomic primitives for bitops
Replace the bitops specific atomic update code by the functions
from atomic_ops.h. This saves a few lines of non-trivial code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:38 +01:00
Martin Schwidefsky
126b30c3cb s390/atomic: refactor atomic primitives
Rework atomic.h to make the low level functions avaible for use
in other headers without using atomic_t, e.g. in bitops.h.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:33 +01:00
Ingo Molnar
4c8ee71620 Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-11 08:25:07 +01:00
Sebastian Andrzej Siewior
dfbbd86a0f s390/smp: Convert to hotplug state machine
cpuhp_setup_state() invokes the startup callback on all online cpus with
the proper protection, so we can remove the cpu hotplug protection from the
init function and the creation of the per cpu files for online cpus in
smp_add_present_cpu(). smp_add_present_cpu() is called also called from
__smp_rescan_cpus(), but this callpath never adds an online cpu, it merily
adds newly present cpus, so the creation of the cpu files is not required.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161104144502.7kd4bxz2rxqvtack@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-09 23:45:28 +01:00
Thomas Gleixner
ef65d45cbf s390/smp: Make cpu notifier symetric
There is no reason to remove the sysfs cpu files when the CPU is dead, they
can be removed when the cpu is prepared to go down. Doing it at
DOWN_PREPARE allows us to convert it to a symetric hotplug state in the
next step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161104144140.lcee6kwmwlx37m7g@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-09 23:45:28 +01:00
Linus Torvalds
ae67e87f40 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Two bug fixes

   - a memory alignment fix in the s390 only hypfs code

   - a fix for the generic percpu code that caused ftrace to break on
     s390. This is not relevant for x86 but for all architectures that
     use the generic percpu code"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  percpu: use notrace variant of preempt_disable/preempt_enable
  s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
2016-11-09 11:09:40 -08:00
Masahiro Yamada
847e070012 s390: remove unneeded dependency for gen_facilities
The dependency between the object and the source is handled by
scripts/Makefile.host, so only "hostprogs-y += gen_facilities"
is fine.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-08 10:30:28 +01:00
Masahiro Yamada
d1f7e8f85b s390: squash facilities_src.h into gen_facilities.c
We generally expect headers in arch/$(ARCH)/include/asm directory
are included from kernel sources, but facilities_src.h is not;
it is included from the arch/s390/tools/gen_facilities.c tool.

There is no reason to expose this header to the public include path.
Furthermore, facilities_src.h makes sure to be included only from
gen_facilities.c by the following:

  #ifndef S390_GEN_FACILITIES_C
  #error "This file can only be included by gen_facilities.c"
  #endif

This check can be removed by merging the two files.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-07 11:06:57 +01:00
Masahiro Yamada
72cc9918ef s390: delete unneeded #include <linux/kconfig.h> from facilities_src.h
The header facilities_src.h is only included from gen_facilities.c
and the tool is compiled with the following extra options:

    HOSTCFLAGS_gen_facilities.o += -Wall $(LINUXINCLUDE)

Please note $(LINUXINCLUDE) is expanded into build options including:

    -include $(srctree)/include/linux/kconfig.h

So, the Makefile always forces the tool to include kconfig.h, i.e.,
the #include <linux/kconfig.h> directive in the header is redundant.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-07 11:06:56 +01:00
Linus Torvalds
66cecb6789 One NULL pointer dereference, and two fixes for regressions introduced
during the merge window.  The rest are fixes for MIPS, s390 and nested VMX.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJYG2H5AAoJEL/70l94x66DK/cH/0jEQ3ynuLAd5CKux7JxI/EP
 msSJh1Xqr4+XhXZnuDpGQWrdsBlxoiqA6PsJrUTtyi4nQCDXlT8g+2MDuvqhWIHz
 7vw58j/EMJDCVQzYAbN5VDUfk13uB5aSWTo3M9Rf09v0hU1Ql7z8u4CtKEdLpN5Y
 LY9bT9fxUmXO7REKP7bdW6ZrDX/hUShYHgMqzXGFMyGBG3ym3a9bggXEzTCD6eNQ
 ioogQIWqg+icdhta0iLNAwFClPlcKB2/xo4IUuNgrPwGoHFGJN/8+qxT4+sVbp2B
 v8u1zOXlCFXBcskWE+yRRsGe72+mIzz6QScCyO+5HbhKYVfbE9H7KBlFX9rZZ2c=
 =IbKx
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "One NULL pointer dereference, and two fixes for regressions introduced
  during the merge window.

  The rest are fixes for MIPS, s390 and nested VMX"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: Check memopp before dereference (CVE-2016-8630)
  kvm: nVMX: VMCLEAR an active shadow VMCS after last use
  KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK
  KVM: x86: fix wbinvd_dirty_mask use-after-free
  kvm/x86: Show WRMSR data is in hex
  kvm: nVMX: Fix kernel panics induced by illegal INVEPT/INVVPID types
  KVM: document lock orders
  KVM: fix OOPS on flush_work
  KVM: s390: Fix STHYI buffer alignment for diag224
  KVM: MIPS: Precalculate MMIO load resume PC
  KVM: MIPS: Make ERET handle ERL before EXL
  KVM: MIPS: Fix lazy user ASID regenerate for SMP
2016-11-04 13:08:05 -07:00
Paul Gortmaker
8ba8b05f57 s390: kernel: make lgr explicitly non-modular
The Makefile currently controlling compilation of this code is obj-y
meaning that it currently is not being built as a module by anyone.

Lets remove the couple traces of modular infrastructure use, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We replace module.h with init.h and export.h since the file does
export some symbols.

Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-31 17:55:42 +01:00
Paul Gortmaker
cee672e180 s390: hypfs: make inode explicitly non-modular
The Kconfig currently controlling compilation of this code is:

arch/s390/Kconfig:config S390_HYPFS_FS
arch/s390/Kconfig:      def_bool y

...meaning that it currently is not being built as a module by anyone.

Lets remove the couple traces of modular infrastructure use, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

Also note that MODULE_ALIAS is a no-op for non-modular code.

We also delete the MODULE_LICENSE tag etc. since all that information
was (or is now) contained at the top of the file in the comments.

Build testing indicated the presence of module.h was masking an
implicit include of kobject.h, hence the addition of that.

Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-31 17:55:40 +01:00
Martin Schwidefsky
75c7b6f3f6 s390/time: steer clocksource on STP sync events
On STP sync events the TOD clock will jump in time, either forward or
backward. The TOD clocksource claims to be continuous but in case of
an STP sync with a negative offset it is not.

Subtract the offset injected by the STP sync check from the result of
the TOD clocksource to make it continuous again. Add code to drift the
offset towards zero with a fixed rate, steering 1 second in ~9 hours.

Suggested-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Martin Schwidefsky
2ace06ec0d s390/time: adjust last_update_clock at clock synchronization
The last_update_clock time stamp in the lowcore should be adjusted by
the TOD clock delta that is created by the clock synchronization.
Otherwise the calculation of the steal time will be incorrect.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Martin Schwidefsky
b1c0854d16 s390/time: refactor clock sync
Merge clock_sync_cpu into stp_sync_clock and split out the update
of the global and per-CPU clock fields into clock_sync_global
and clock_sync_local.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Michael Holzheu
237d6e6884 s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
Since commit d86bd1bece ("mm/slub: support left redzone") it is no longer
guaranteed that kmalloc(PAGE_SIZE) returns page aligned memory.

After the above commit we get an error for diag224 because aligned
memory is required. This leads to the following user visible error:

 # mount none -t s390_hypfs /sys/hypervisor/
 mount: unknown filesystem type 's390_hypfs'

 # dmesg | grep hypfs
 hypfs.cccfb8: The hardware system does not provide all functions
               required by hypfs
 hypfs.7a79f0: Initialization of hypfs failed with rc=-61

Fix this problem and use get_free_page() instead of kmalloc() to get
correctly aligned memory.

Cc: stable@vger.kernel.org # v3.6+
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:08:58 +02:00
Linus Torvalds
55bea71ed5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "A few more s390 patches for 4.9:
   - a fix for an overflow in the dasd driver reported by UBSAN
   - fix a regression and add hotplug memory to the zone movable again
   - add ignore defines for the pkey system calls
   - fix the ouput of the merged stack tracer
   - replace printk with pr_cont in arch/s390 where appropriate
   - remove the arch specific return_address function again
   - ignore reserved channel paths at boot time
   - add a missing hugetlb_bad_size call to the arch backend"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: fix zone calculation in arch_add_memory()
  s390/dumpstack: use pr_cont within show_stack and die
  s390/dumpstack: get rid of return_address again
  s390/disassambler: use pr_cont where appropriate
  s390/dumpstack: use pr_cont where appropriate
  s390/dumpstack: restore reliable indicator for call traces
  s390/mm: use hugetlb_bad_size()
  s390/cio: don't register chpids in reserved state
  s390: ignore pkey system calls
  s390/dasd: avoid undefined behaviour
2016-10-27 14:16:30 -07:00
Paolo Bonzini
b5149a5fd1 KVM: s390: Fix wrong memory allocation
With commit d86bd1bece ("mm/slub: support left redzone") or
 with slab debugging the allocation of our diag224 buffer is not
 aligned properly. Let's fix this.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJYELGIAAoJEBF7vIC1phx8eY0P/jGHsq8vvtvJFQxwMLx0MPkS
 HOd/kw9nupyNeJMrvLGgcpyxfLTbCJsdA5P+ZpgZEUygwAkjs/EFbzGfgWL6kY4t
 pT9l4uo0LfdGdNITNVWNgUvpFC4VIIlyQAeTdMG2+hUlaIF0I1zpdQDYa9umacKN
 lGuKRObTH7kVX6yj02kTTallBKDdtl45fSAPCoMSFImi+Hlyn8BBVTlbNPujoOk0
 i50vWzjn711Fm/qnc0p/8XpLTcZLdYPm8vToxD6WR6C8mjepb3+JyHgTx4u9WRVY
 cvVVTrqDvrUaqOss4KTssbgzSKZ74aPPlQm27n2l8scpAcHbr+1ZY59PnkEdoF2Y
 jA5JHeOfqijneOf0F7vwGAG2c4MU5rIVq7bFbxVGVglb3S5TBTRnd63Q0kUsJc1g
 amrdFOjXtkyqpusEJMIsLMkhw2ICjQ7D2oXIRq8GBqxnEoMh99KUYdFu3Bd21/Bv
 uP5vBvvuKx0jtD7WBkIOL9Df0YEhKYwqmonubPyUwr/8ok04g/dKTPgelXzCVFts
 e2uXkQ0uSxT2l04E2iSiq7qMM4CMNVqhkJHHHyBAxisHpM3Wdbyq2y0evOqATwzN
 F26a4JAEhOxRbytnaoRH/XaGgGLfKFegSHZoJ/dtrxuBIBV2t4aOHHhR0N0FTyK0
 bPCH857jWJV2yB1ym4OY
 =YoA+
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fix wrong memory allocation

With commit d86bd1bece ("mm/slub: support left redzone") or
with slab debugging the allocation of our diag224 buffer is not
aligned properly. Let's fix this.
2016-10-27 13:22:54 +02:00
Janosch Frank
45c7ee43a5 KVM: s390: Fix STHYI buffer alignment for diag224
Diag224 requires a page-aligned 4k buffer to store the name table
into. kmalloc does not guarantee page alignment, hence we replace it
with __get_free_page for the buffer allocation.

Cc: stable@vger.kernel.org # v4.8+
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-10-26 13:46:44 +02:00
Peter Zijlstra
890658b7ab locking/mutex: Kill arch specific code
Its all generic atomic_long_t stuff now.

Tested-by: Jason Low <jason.low2@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-10-25 11:31:51 +02:00
Gerald Schaefer
4a65429457 s390/mm: fix zone calculation in arch_add_memory()
Standby (hotplug) memory should be added to ZONE_MOVABLE on s390. After
commit 199071f1 "s390/mm: make arch_add_memory() NUMA aware",
arch_add_memory() used memblock_end_of_DRAM() to find out the end of
ZONE_NORMAL and the beginning of ZONE_MOVABLE. However, commit 7f36e3e5
"memory-hotplug: add hot-added memory ranges to memblock before allocate
node_data for a node." moved the call of memblock_add_node() before
the call of arch_add_memory() in add_memory_resource(), and thus changed
the return value of memblock_end_of_DRAM() when called in
arch_add_memory(). As a result, arch_add_memory() will think that all
memory blocks should be added to ZONE_NORMAL.

Fix this by changing the logic in arch_add_memory() so that it will
manually iterate over all zones of a given node to find out which zone
a memory block should be added to.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-24 10:26:17 +02:00
Heiko Carstens
47ece7fef4 s390/dumpstack: use pr_cont within show_stack and die
Use pr_cont instead of printk calls also within show_stack and
die in order to avoid extra line breaks.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-24 10:26:14 +02:00
Linus Torvalds
a23b27ae12 KVM fixes for v4.9-rc2
ARM:
  - avoid livelock when walking guest page tables
  - fix HYP mode static keys without CC_HAVE_ASM_GOTO
 
 MIPS:
  - fix a build error without TRACEPOINTS_ENABLED
 
 s390:
  - reject a malformed userspace configuration
 
 x86:
  - suppress a warning without CONFIG_CPU_FREQ
  - initialize whole irq_eoi array
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJYCl1iAAoJEED/6hsPKofo7pUH/R/sL417YLTkY6UVhtrCXQq1
 cUPWLLp96/Ijkmb+PoByLn5msKxhUa9A06QfphKCbmvpInubXPTxaWDCpoXxHmCO
 ywHmwuNk7Zgc8MnvcqBKte1jo8/JxQTM1NYZEys7va+J/fC4Nqb9gjZnECSTfUK5
 JE8bPs+yxVSavsh0KOZcTdTHtuZQ6SQijgDkE4pSDBYhCKxIpYAXaKVUOC+VSTDH
 ACUMLvUrFlFbAev0z4oF4CSKotAq6VEkJQhequghKPUHSeWabZB4wAHTkfUbJ+Bb
 Ar57zrz5YCGbojywuHi1954eHWv6AfWyD8bnYSCtD4gsIRws+dH/MIiPgEMjLOQ=
 =9U78
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "ARM:
   - avoid livelock when walking guest page tables
   - fix HYP mode static keys without CC_HAVE_ASM_GOTO

  MIPS:
   - fix a build error without TRACEPOINTS_ENABLED

  s390:
   - reject a malformed userspace configuration

  x86:
   - suppress a warning without CONFIG_CPU_FREQ
   - initialize whole irq_eoi array"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  arm/arm64: KVM: Map the BSS at HYP
  arm64: KVM: Take S1 walks into account when determining S2 write faults
  KVM: s390: reject invalid modes for runtime instrumentation
  kvm: x86: memset whole irq_eoi
  kvm/x86: Fix unused variable warning in kvm_timer_init()
  KVM: MIPS: Add missing uaccess.h include
2016-10-21 19:09:29 -07:00
Christian Borntraeger
a5efb6b6c9 KVM: s390: reject invalid modes for runtime instrumentation
Usually a validity intercept is a programming error of the host
because of invalid entries in the state description.
We can get a validity intercept if the mode of the runtime
instrumentation control block is wrong. As the host does not know
which modes are valid, this can be used by userspace to trigger
a WARN.
Instead of printing a WARN let's return an error to userspace as
this can only happen if userspace provides a malformed initial
value (e.g. on migration). The kernel should never warn on bogus
input. Instead let's log it into the s390 debug feature.

While at it, let's return -EINVAL for all validity intercepts as
this will trigger an error in QEMU like

error: kvm run failed Invalid argument
PSW=mask 0404c00180000000 addr 000000000063c226 cc 00
R00=000000000000004f R01=0000000000000004 R02=0000000000760005 R03=000000007fe0a000
R04=000000000064ba2a R05=000000049db73dd0 R06=000000000082c4b0 R07=0000000000000041
R08=0000000000000002 R09=000003e0804042a8 R10=0000000496152c42 R11=000000007fe0afb0
[...]

This will avoid an endless loop of validity intercepts.

Cc: stable@vger.kernel.org # v4.5+
Fixes: c6e5f16637 ("KVM: s390: implement the RI support of guest")
Acked-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-10-20 20:06:12 +02:00
Linus Torvalds
63ae602cea Merge branch 'gup_flag-cleanups'
Merge the gup_flags cleanups from Lorenzo Stoakes:
 "This patch series adjusts functions in the get_user_pages* family such
  that desired FOLL_* flags are passed as an argument rather than
  implied by flags.

  The purpose of this change is to make the use of FOLL_FORCE explicit
  so it is easier to grep for and clearer to callers that this flag is
  being used.  The use of FOLL_FORCE is an issue as it overrides missing
  VM_READ/VM_WRITE flags for the VMA whose pages we are reading
  from/writing to, which can result in surprising behaviour.

  The patch series came out of the discussion around commit 38e0885465
  ("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"),
  which addressed a BUG_ON() being triggered when a page was faulted in
  with PROT_NONE set but having been overridden by FOLL_FORCE.
  do_numa_page() was run on the assumption the page _must_ be one marked
  for NUMA node migration as an actual PROT_NONE page would have been
  dealt with prior to this code path, however FOLL_FORCE introduced a
  situation where this assumption did not hold.

  See

      https://marc.info/?l=linux-mm&m=147585445805166

  for the patch proposal"

Additionally, there's a fix for an ancient bug related to FOLL_FORCE and
FOLL_WRITE by me.

[ This branch was rebased recently to add a few more acked-by's and
  reviewed-by's ]

* gup_flag-cleanups:
  mm: replace access_process_vm() write parameter with gup_flags
  mm: replace access_remote_vm() write parameter with gup_flags
  mm: replace __access_remote_vm() write parameter with gup_flags
  mm: replace get_user_pages_remote() write/force parameters with gup_flags
  mm: replace get_user_pages() write/force parameters with gup_flags
  mm: replace get_vaddr_frames() write/force parameters with gup_flags
  mm: replace get_user_pages_locked() write/force parameters with gup_flags
  mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
  mm: remove write/force parameters from __get_user_pages_unlocked()
  mm: remove write/force parameters from __get_user_pages_locked()
  mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
2016-10-19 08:39:47 -07:00
Lorenzo Stoakes
c164154f66 mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
This removes the 'write' and 'force' use from get_user_pages_unlocked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-18 14:13:37 -07:00
Heiko Carstens
dcddba96cd s390/dumpstack: get rid of return_address again
With commit ef6000b4c6 ("Disable the __builtin_return_address()
warning globally after all)" the kernel does not warn at all again if
__builtin_return_address(n) is called with n > 0.

Besides the fact that this was a false warning on s390 anyway, due to
the always present backchain, we can now revert commit 5606330627
("s390/dumpstack: implement and use return_address()") again, to
simplify the code again.

After all I shouldn't have had return_address() implememted at all to
workaround this issue. So get rid of this again.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:33 +02:00
Heiko Carstens
4d062487f3 s390/disassambler: use pr_cont where appropriate
Just like for dumpstack use pr_cont instead of simple printk calls to
fix the output when disassembling a piece of code.

Before:
[    0.840627] Krnl Code: 000000000017d1c6: a77400f7            brc     7,17d3b4
[    0.840630]
                          000000000017d1ca: 92015000            mvi     0(%r5),1
[    0.840634]
                         #000000000017d1ce: a7f40001            brc     15,17d1d0

After:
[    0.831792] Krnl Code: 000000000017d13e: a77400f7            brc     7,17d32c
                          000000000017d142: 92015000            mvi     0(%r5),1
                         #000000000017d146: a7f40001            brc     15,17d148

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:32 +02:00
Heiko Carstens
a790634544 s390/dumpstack: use pr_cont where appropriate
Use pr_cont instead of simple printk calls when lines will be
continued. This fixes the kernel output of various lines printed on
e.g. a warning:

Before:
[    0.840604] Krnl PSW : 0404c00180000000 000000000017d1d2
[    0.840606]  (try_to_wake_up+0x382/0x5e0)

[    0.840610]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0
[    0.840611]  RI:0 EA:3

After:
[    0.831772] Krnl PSW : 0404c00180000000 000000000017d14a (try_to_wake_up+0x382/0x5e0)
[    0.831776]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:31 +02:00
Heiko Carstens
d0208639db s390/dumpstack: restore reliable indicator for call traces
Before merging all different stack tracers the call traces printed had
an indicator if an entry can be considered reliable or not.
Unreliable entries were put in braces, reliable not. Currently all
lines contain these extra braces.

This patch restores the old behaviour by adding an extra "reliable"
parameter to the callback functions. Only show_trace makes currently
use of it.

Before:
[    0.804751] Call Trace:
[    0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0)
[    0.804756] ([<0000000000161d64>] create_worker+0x174/0x1c0)

After:
[    0.804751] Call Trace:
[    0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0)
[    0.804756]  [<0000000000161d64>] create_worker+0x174/0x1c0

Fixes: 758d39ebd3 ("s390/dumpstack: merge all four stack tracers")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:30 +02:00
Shyam Saini
b5003b5f0a s390/mm: use hugetlb_bad_size()
Update setup_hugepagesz() to call hugetlb_bad_size() when unsupported
hugepage size is found.

Signed-off-by: Shyam Saini <mayhs11saini@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 11:25:26 +02:00
Heiko Carstens
12e721964e s390: ignore pkey system calls
Ignore the pkey systems calls since they don't make any sense on s390.
In addition any user could trigger a warning if issueing the pkey_free
system call, if it would be wired up on a system without pkey support.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 11:25:25 +02:00