4778 commits

Author SHA1 Message Date
Wei Yang
7fbe7a9374 powerpc/powernv: simplify the calculation of iov resource alignment
The alignment of an IOV BAR on the PowerNV platform is the total size of the
IOV BAR. Whether the IOV BAR is expanded to roundup_pow_of_two(total_vfs) VFs
or to the maximum PE number (256), the total size can be calculated as
(vfs_expanded * VF_BAR_size).

This patch simplifies pnv_pci_iov_resource_alignment() by removing the
first case.
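
As a rough illustration of the arithmetic described above, here is a minimal Python sketch (not the kernel code). The helper names and the 1 MiB VF BAR size in the usage note are assumptions made for the example; only the formula vfs_expanded * VF_BAR_size and the value 256 come from the commit message.

```python
MAX_PE_NUM = 256  # maximum PE number quoted in the commit message


def roundup_pow_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def iov_resource_alignment(vfs_expanded: int, vf_bar_size: int) -> int:
    """Alignment equals the total size of the expanded IOV BAR."""
    return vfs_expanded * vf_bar_size
```

With an assumed 1 MiB VF BAR and 5 VFs, both expansion choices go through the same single formula: iov_resource_alignment(roundup_pow_of_two(5), 1 << 20) gives 8 MiB, and iov_resource_alignment(MAX_PE_NUM, 1 << 20) gives 256 MiB.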

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:55 +11:00
Wei Yang
b033185419 powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR
On PHB3, we enable SRIOV devices by mapping the IOV BAR with M64 BARs. If a
SRIOV device's IOV BAR is not 64bit-prefetchable, it is not assigned from the
64bit-prefetchable window, which means an M64 BAR can't work on it.

The reason is that PCI bridges support only 2 memory windows, and the kernel
code programs bridges so that one window is 32bit-nonprefetchable and
the other one is 64bit-prefetchable. So if a device's IOV BAR is 64bit and
non-prefetchable, it will be mapped into 32bit space and therefore M64
cannot be used for it.

This patch makes this explicit and truncates the IOV resource in this case to
save MMIO space.
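
The check can be pictured with a small Python sketch. This is an illustration under assumptions, not the kernel implementation: the flag bits, the dictionary layout, and the choice to shrink every used IOV BAR to size 0 when one of them fails the check are all hypothetical.

```python
# Hypothetical flag bits for this sketch; not the kernel's IORESOURCE_* values.
MEM_64 = 0x1
PREFETCH = 0x2
M64_REQUIRED = MEM_64 | PREFETCH


def m64_usable(flags: int) -> bool:
    """An M64 BAR can only map a BAR that is both 64-bit and prefetchable."""
    return (flags & M64_REQUIRED) == M64_REQUIRED


def fixup_iov_bars(bars) -> bool:
    """bars: list of {"flags": int, "size": int}; unused BARs have size 0.

    Returns True if SRIOV can be enabled. Otherwise every used IOV BAR is
    truncated to size 0 (an assumed reading of "truncate") and False is
    returned.
    """
    used = [bar for bar in bars if bar["size"]]
    if all(m64_usable(bar["flags"]) for bar in used):
        return True
    for bar in used:
        bar["size"] = 0  # give the MMIO space back
    return False
```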

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:54 +11:00
Gavin Shan
ccc9662da5 powerpc/powernv: Simplify definitions of EEH debugfs handlers
The EEH debugfs handlers all have the same prototype. Introduce a
macro to define them, which simplifies the code. No logical
changes.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-10 12:04:46 +11:00
Andrew Donnellan
9b4fffa149 powerpc/powernv: new function to access OPAL msglog
Currently, the OPAL msglog/console buffer is exposed as a sysfs file, with
the sysfs read handler responsible for retrieving the log from the OPAL
buffer. We'd like to be able to use it in xmon as well.

Refactor the OPAL msglog code to create a new function, opal_msglog_copy(),
that copies to an arbitrary buffer. Separate the initialisation code into
generic memcons init and sysfs file creation.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-09 20:27:04 +11:00
Russell Currey
2de50e9674 powerpc/powernv: Remove support for p5ioc2
"p5ioc2 is used by approximately 2 machines in the world, and has never
ever been a supported configuration."

The code for p5ioc2 is essentially unused and complicates what is already
a very complicated codebase.  Its removal is essentially a "free win" in
the effort to simplify the powernv PCI code.

In addition, support for p5ioc2 has been dropped from skiboot.  There's no
reason to keep it around in the kernel.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-08 22:03:37 +11:00
Al Viro
5955102c99 wrappers for ->i_mutex access
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.
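
The naming rule above, inode_foo(inode) standing for mutex_foo(&inode->i_mutex), can be mimicked in a short Python sketch. The Inode class and the use of threading.Lock are stand-ins chosen for illustration; lock_nested is omitted because it is a lock-debugging annotation with no Python counterpart.

```python
import threading


class Inode:
    """Toy inode holding only the lock that the wrappers hide."""

    def __init__(self):
        self.i_mutex = threading.Lock()


def inode_lock(inode: Inode) -> None:
    inode.i_mutex.acquire()


def inode_unlock(inode: Inode) -> None:
    inode.i_mutex.release()


def inode_trylock(inode: Inode) -> bool:
    # Returns True if the lock was taken, False if it was already held.
    return inode.i_mutex.acquire(blocking=False)


def inode_is_locked(inode: Inode) -> bool:
    return inode.i_mutex.locked()
```

Because callers never touch i_mutex directly, the lock type behind the wrappers can later be swapped (for example to a reader/writer lock) without editing every call site.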

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22 18:04:28 -05:00
Linus Torvalds
f689b742f2 powerpc updates for 4.5
Merge tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Core:
   - Ground work for the new Power9 MMU from Aneesh Kumar K.V
   - Optimise FP/VMX/VSX context switching from Anton Blanchard

  Misc:
   - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica
     Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling,
     Andrew Donnellan
   - Allow wrapper to work on non-english system from Laurent Vivier
   - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
   - Fix module autoload for rackmeter & axonram drivers from Luis de
     Bethencourt
   - Include KVM guest test in all interrupt vectors from Paul Mackerras
   - Fix DSCR inheritance over fork() from Anton Blanchard
   - Make value-returning atomics & {cmp}xchg* & their atomic_ versions
     fully ordered from Boqun Feng
   - Print MSR TM bits in oops messages from Michael Neuling
   - Add TM signal return & invalid stack selftests from Michael Neuling
   - Limit EPOW reset event warnings from Vipin K Parashar
   - Remove the Cell QPACE code from Rashmica Gupta
   - Append linux_banner to exception information in xmon from Rashmica
     Gupta
   - Add selftest to check if VSRs are corrupted from Rashmica Gupta
   - Remove broken GregorianDay() from Daniel Axtens
   - Import Anton's context_switch2 benchmark into selftests from
     Michael Ellerman
   - Add selftest script to test HMI functionality from Daniel Axtens
   - Remove obsolete OPAL v2 support from Stewart Smith
   - Make enter_rtas() private from Michael Ellerman
   - PPR exception cleanups from Michael Ellerman
   - Add page soft dirty tracking from Laurent Dufour
   - Add support for Nvlink NPUs from Alistair Popple
   - Add support for kexec on 476fpe from Alistair Popple
   - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
   - Copy only required pieces of the mm_context_t to the paca from
     Michael Neuling
   - Add a kmsg_dumper that flushes OPAL console output on panic from
     Russell Currey
   - Implement save_stack_trace_regs() to enable kprobe stack tracing
     from Steven Rostedt
   - Add HWCAP bits for Power9 from Michael Ellerman
   - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
   - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
   - scripts/recordmcount.pl: support data in text section on powerpc
     from Ulrich Weigand
   - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand

  cxl:
   - cxl: Fix possible idr warning when contexts are released from
     Vaibhav Jain
   - cxl: use correct operator when writing pcie config space values
     from Andrew Donnellan
   - cxl: Fix DSI misses when the context owning task exits from Vaibhav
     Jain
   - cxl: fix build for GCC 4.6.x from Brian Norris
   - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
   - cxl: Enable PCI device ID for future IBM CXL adapter from Uma
     Krishnan

  Freescale:
   - Freescale updates from Scott: Highlights include moving QE code out
     of arch/powerpc (to be shared with arm), device tree updates, and
     minor fixes"

* tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits)
  powerpc/module: Handle R_PPC64_ENTRY relocations
  scripts/recordmcount.pl: support data in text section on powerpc
  powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
  powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  powerpc/mm: Fix _PAGE_PTE breaking swapoff
  cxl: Enable PCI device ID for future IBM CXL adapter
  cxl: use -Werror only with CONFIG_PPC_WERROR
  cxl: fix build for GCC 4.6.x
  powerpc: Add HWCAP bits for Power9
  powerpc/powernv: Reserve PE#0 on NPU
  powerpc/powernv: Change NPU PE# assignment
  powerpc/powernv: Fix update of NVLink DMA mask
  powerpc/powernv: Remove misleading comment in pci.c
  powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
  powerpc: Fix build break due to paca mm_context_t changes
  cxl: Fix DSI misses when the context owning task exits
  MAINTAINERS: Update Scott Wood's e-mail address
  powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
  powerpc: Fix style of self-test config prompts
  powerpc/powernv: Only delay opal_rtc_read() retry when necessary
  ...
2016-01-15 13:18:47 -08:00
Vladimir Davydov
5d097056c9 kmemcg: account certain kmem allocations to memcg
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Michael Ellerman
be6bfc29bc Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include moving QE code out of arch/powerpc (to be shared with
arm), device tree updates, and minor fixes."
2016-01-14 09:55:01 +11:00
Russell Currey
c88c5d4373 powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
The recently added OPAL API call, OPAL_CONSOLE_FLUSH, originally took no
parameters and returned nothing.  The call was updated to accept the
terminal number to flush, and to return various values depending on the
state of the output buffer.

The prototype has been updated and its usage in the OPAL kmsg dumper has
been modified to support its new behaviour as an incremental flush.
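
An incremental flush of this kind is typically driven by a loop that keeps calling until the buffer reports it is empty. The Python sketch below illustrates that pattern only; the FLUSHED and PARTIAL markers, the FakeConsole class, and the max_calls bound are hypothetical and do not reflect the real OPAL return codes.

```python
FLUSHED = "flushed"  # hypothetical: nothing left in the buffer
PARTIAL = "partial"  # hypothetical: some output written, more remains


class FakeConsole:
    """Stand-in for firmware that writes at most `chunk` bytes per call."""

    def __init__(self, pending: int, chunk: int):
        self.pending = pending
        self.chunk = chunk

    def flush(self, term: int) -> str:
        self.pending = max(0, self.pending - self.chunk)
        return PARTIAL if self.pending else FLUSHED


def flush_console(console, term: int = 0, max_calls: int = 1000) -> int:
    """Call flush() until it stops reporting PARTIAL; return the call count."""
    for calls in range(1, max_calls + 1):
        if console.flush(term) != PARTIAL:
            return calls
    raise RuntimeError("console did not drain")
```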

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-13 12:35:17 +11:00
Alistair Popple
08f48f3234 powerpc/powernv: Reserve PE#0 on NPU
P8+ hardware reports all errors on PE#0. This patch ensures PE#0 is
not assigned to NPU devices so that it can be used for EEH.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:30:52 +11:00
Alistair Popple
b521549a09 powerpc/powernv: Change NPU PE# assignment
The P8+ hardware supports four partitionable endpoints (PEs); however,
the hardware reports all errors as occurring on PE#0. This means we
need to reserve this PE for error handling (EEH) and not assign it to
an NPU device, implying that some devices will need to share PEs.

This patch changes the PE assignment for NPU devices such that NPU
devices which connect to the same GPU are assigned to the same
PE#.
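
The assignment policy can be sketched in a few lines of Python. This is an illustration, not the kernel code: the function name, the dictionary input, and the device labels are made up; only "four PEs, PE#0 reserved, NPU devices on the same GPU share a PE#" comes from the commit message.

```python
def assign_npu_pes(npu_to_gpu: dict, total_pes: int = 4, reserved=(0,)) -> dict:
    """Map each NPU device to a PE#, sharing one PE# per connected GPU."""
    free = [pe for pe in range(total_pes) if pe not in reserved]
    gpu_pe = {}   # GPU -> PE# already chosen for it
    npu_pe = {}   # result: NPU device -> PE#
    for npu, gpu in npu_to_gpu.items():
        if gpu not in gpu_pe:
            if not free:
                raise RuntimeError("no free PE left")
            gpu_pe[gpu] = free.pop(0)
        npu_pe[npu] = gpu_pe[gpu]
    return npu_pe
```

For example, two NPU links to "gpuA" and one to "gpuB" end up on PE#1, PE#1 and PE#2, leaving PE#0 free for error reporting.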

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:30:51 +11:00
Alistair Popple
419dbd5e1f powerpc/powernv: Fix update of NVLink DMA mask
The emulated NVLink PCI devices share the same IODA2 TCE tables but only
support a single TVT (instead of the normal two for PCI devices). This
requires the kernel to manually replace windows with either the bypass
or non-bypass window depending on what the driver has requested.

Unfortunately an incorrect optimisation was made in
pnv_pci_ioda_dma_set_mask() which caused updating of some NPU device PEs
to be skipped in certain configurations due to an incorrect assumption
that a NULL peer PE in the array indicated there were no more peers
present. This patch fixes the problem by ensuring all peer PEs are
updated.
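
The difference between the faulty assumption and the fix boils down to "stop at the first empty slot" versus "skip empty slots". A minimal Python sketch, with hypothetical function names and None standing in for a NULL peer PE:

```python
def update_peers_buggy(peers, update) -> None:
    """Treats the first empty slot as the end of the array."""
    for pe in peers:
        if pe is None:
            break          # wrong: later slots may still hold peers
        update(pe)


def update_peers_fixed(peers, update) -> None:
    """Visits every slot and only skips the empty ones."""
    for pe in peers:
        if pe is None:
            continue
        update(pe)
```

With peers = ["pe1", None, "pe3"], the first version updates only "pe1", while the second updates both "pe1" and "pe3".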

Fixes: 5d2aa710e6 ("powerpc/powernv: Add support for Nvlink NPUs")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:30:48 +11:00
Russell Currey
b0eab5b29a powerpc/powernv: Remove misleading comment in pci.c
PCI in powernv now supports quite a bit more than p5ioc2, so remove the
outdated comment.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 16:56:01 +11:00
Andrew Lunn
e7f4dc3536 mdio: Move allocation of interrupts into core
Have mdio_alloc() create the array of interrupt numbers, and
initialize it to POLLING. This is what most MDIO drivers want, so
this allows code to be removed from the drivers.
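
The idea of moving a common default into the allocator can be shown with a small Python sketch. The class and function names, the sentinel value -1, and the count of 32 addresses are assumptions for illustration, not a statement about the kernel's definitions.

```python
POLLING = -1     # assumed sentinel meaning "no interrupt, poll the PHY"
NUM_ADDRS = 32   # assumed number of addresses on one MDIO bus


class MdioBus:
    def __init__(self):
        # The allocator fills in the default so drivers no longer have to.
        self.irq = [POLLING] * NUM_ADDRS


def mdio_alloc_sketch() -> MdioBus:
    return MdioBus()
```

A driver that does have a real interrupt line only overrides the entry it cares about, for example bus.irq[3] = 17, and leaves the rest at the default.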

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-07 14:31:26 -05:00
Andrew Donnellan
dc3799bb9a powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
Fix off-by-one error in opal_mce_check_early_recovery() when checking
whether the NIP falls within OPAL space.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:41 +11:00
Michael Neuling
57a9039052 powerpc/powernv: Only delay opal_rtc_read() retry when necessary
Only delay in opal_rtc_read() when the call is busy and we are going to retry.
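
A retry loop with this property sleeps only on the path that actually retries. The Python sketch below shows the shape of such a loop; the BUSY and SUCCESS markers, the function name, and the injectable sleep parameter are hypothetical, and the 10 ms default simply echoes the figure mentioned in this commit message.

```python
import time

BUSY = "busy"        # hypothetical status markers for this sketch
SUCCESS = "success"


def read_with_retry(read, delay=0.01, sleep=time.sleep, max_tries=10):
    """Call read() until it is no longer busy, sleeping only before a retry."""
    for attempt in range(max_tries):
        rc, value = read()
        if rc != BUSY:
            return rc, value      # fast path: no delay at all
        if attempt + 1 < max_tries:
            sleep(delay)          # only reached when another try follows
    return BUSY, None
```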

This has the advantage of possibly saving a massive 10ms off booting!

Kudos to Stewart for noticing.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:40 +11:00
Russell Currey
affddff69c powerpc/powernv: Add a kmsg_dumper that flushes console output on panic
On BMC machines, console output is controlled by the OPAL firmware and is
only flushed when its pollers are called.  When the kernel is in a panic
state, it no longer calls these pollers and thus console output does not
completely flush, causing some output from the panic to be lost.

Output is only actually lost when the kernel is configured to not power off
or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
flushes the console buffer as part of its power down routines.  Before this
patch, however, only partial output would be printed during the timeout wait.

This patch adds a new kmsg_dumper which gets called at panic time to ensure
panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
in the OPAL API, and if that is not available, the pollers are called enough
times to (hopefully) completely flush the buffer.

The flushing mechanism will only affect output printed at and before the
kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
message may still be truncated as follows:

>Call Trace:
>[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable)
>[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4
>[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c
>[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254
>[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350
>[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130
>[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac
>---[ end Kernel panic - not

This functionality is implemented as a kmsg_dumper as it seems to be the
most sensible way to introduce platform-specific functionality to the
panic function.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-27 19:12:40 +11:00
Daniel Walker
433c858a61 powerpc/85xx: mpc85xx ADS: remove pci exclude
This code was reworked in commit 905e75c46d.

That change removed the fsl_add_bridge() call, which was originally above
the point where the pci_exclude_device function is added. I think the
assumption was that pci_exclude_device would prevent changes to the bridge
PCI config after it had been added. It seems this wasn't fully tested on
MPC85xx ADS: if you move fsl_add_bridge() and pci_exclude_device is set in
the machine description, then you can never update the PCI config since the
exclude prevents it. This disrupts things like DMA.

This issue was extensively debugged by David Beazley.

Cc: xe-kernel@external.cisco.com
Cc: dbeazley@cisco.com
Cc: dwalker@fifo99.com
Signed-off-by: Daniel Walker <danielwa@cisco.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:23:21 -06:00
Harninder Rai
720d7aebcd powerpc/85xx: Add PCIe controller support for bsc9132qds
1. Use machine_arch_initcall to hook mpc85xx_common_publish_devices
This can ensure before pcibios_init() is called, pci controllers have
been probed and added to the hose_list.
2. Add a workaround for errata A-005434
For the BSC9132, PEX_PEXIWARn[TRGT] for all windows defaults to 0xF,
which is mapped to CCSRBAR. However, for other products, 0xF is
mapped to the local memory. Therefore, for the BSC9132, any default
PCI Express access to the local memory (DDR) will now access the
CCSRBAR. This patch changes the mapping of targets of inbound windows
PEX_PEXIWARn[TRGT] to the Local address space – 0x0 (from 0xF).

Signed-off-by: Harninder Rai <harninder.rai@freescale.com>
Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Hou Zhiqiang <B48286@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 18:17:15 -06:00
Zhao Qiang
7aa1aa6ece QE: Move QE from arch/powerpc to drivers/soc
LS1 has a QE and an ARM CPU.
Move QE from arch/powerpc to drivers/soc/fsl
so that it can be used by both powerpc and arm.

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:12:56 -06:00
Zhao Qiang
302c059f2e QE: use subsys_initcall to init qe
Use subsys_initcall to init qe, to adapt to the ARM architecture.
Remove qe_reset from PowerPC platform file.

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:10:18 -06:00
Zhao Qiang
0e6e01ff69 CPM/QE: use genalloc to manage CPM/QE muram
Use genalloc to manage CPM/QE muram instead of rheap.

Signed-off-by: Zhao Qiang <qiang.zhao@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2015-12-22 17:10:18 -06:00
Alistair Popple
036592fbbe powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"
Commit 25642e1459 ("powerpc/opal-irqchip: Fix double endian
conversion") fixed an endian bug by calling opal_handle_events() in
opal_event_unmask().

However this introduced a deadlock if we find an event is active
during unmasking and call opal_handle_events() again. The bad call
sequence is:

  opal_interrupt()
  -> opal_handle_events()
     -> generic_handle_irq()
        -> handle_level_irq()
           -> raw_spin_lock(&desc->lock)
              handle_irq_event(desc)
              unmask_irq(desc)
              -> opal_event_unmask()
                 -> opal_handle_events()
                    -> generic_handle_irq()
                       -> handle_level_irq()
                          -> raw_spin_lock(&desc->lock)	(BOOM)

When generating multiple opal events in quick succession this would lead
to the following stall warnings:

EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
INFO: rcu_sched detected stalls on CPUs/tasks:

         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
         (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
INFO: rcu_sched detected stalls on CPUs/tasks:
         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
         (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)

This patch corrects the problem by queuing the work if an event is
active during unmasking, which is similar to the pre-endian fix
behaviour.
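
The re-entrancy problem and the queue-based fix can be modelled in Python. This is a sketch under assumptions, not the irqchip code: EventChip and its methods are hypothetical, a non-recursive threading.Lock stands in for desc->lock, and a failed non-blocking acquire is turned into an exception where the kernel would spin forever.

```python
import threading
from collections import deque


class EventChip:
    def __init__(self):
        self.lock = threading.Lock()   # non-recursive, like a raw spinlock
        self.pending = deque()         # events found active during unmask
        self.deferred = deque()        # work queued for later
        self.handled = []

    def handle_one(self, event, unmask) -> None:
        if not self.lock.acquire(blocking=False):
            raise RuntimeError("re-entered with the lock held")
        try:
            self.handled.append(event)
            unmask()                   # runs while the lock is still held
        finally:
            self.lock.release()

    def unmask_buggy(self) -> None:
        if self.pending:               # handles the event right away
            self.handle_one(self.pending.popleft(), self.unmask_buggy)

    def unmask_fixed(self) -> None:
        if self.pending:               # only queues the work
            self.deferred.append(self.pending.popleft())

    def run_deferred(self) -> None:
        while self.deferred:
            self.handle_one(self.deferred.popleft(), self.unmask_fixed)
```

With one event already pending, handle_one("e1", chip.unmask_buggy) raises, while handle_one("e1", chip.unmask_fixed) followed by run_deferred() handles both events in order.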

Fixes: 25642e1459 ("powerpc/opal-irqchip: Fix double endian conversion")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-18 22:24:15 +11:00
Daniel Axtens
1b855e167b powerpc: Add missing calls to va_end()
cppcheck picked up that there were a couple of missing va_end()
calls in functions using va_start().

Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 23:23:22 +11:00
Nathan Fontenot
e9d764f803 powerpc/pseries: Enable kernel CPU dlpar from sysfs
Enable new kernel cpu hotplug functionality by allowing cpu dlpar requests
to be initiated from sysfs.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:03 +11:00
Nathan Fontenot
90edf184b9 powerpc/pseries: Add CPU dlpar add functionality
Add the ability to hotplug add cpus via rtas hotplug events by either
specifying the drc index of the CPU to add, or providing a count of the
number of CPUs to add.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot
ac71380071 powerpc/pseries: Add CPU dlpar remove functionality
Add the ability to dlpar remove CPUs via hotplug rtas events, either by
specifying the drc-index of the CPU to remove or providing a count of cpus
to remove.

To remove multiple cpus in a single request we create a list of possible
DR (Dynamic Reconfiguration) cpus and their drc indexes that can be
removed.  We can then traverse the list, remove each cpu, and easily clean
up in case of failure.
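
The list-then-traverse approach lends itself to simple rollback. The Python sketch below is illustrative only: the function name, the remove/add_back callbacks, and the policy of re-adding already removed cpus on failure are assumptions about what "clean up" means here.

```python
def dlpar_remove_by_count(candidates, count, remove, add_back):
    """Remove `count` cpus from `candidates` (a list of drc indexes).

    Returns the list of removed drc indexes, or [] after rolling back
    if any single removal fails.
    """
    if count > len(candidates):
        raise ValueError("not enough removable cpus")
    removed = []
    for drc_index in candidates[:count]:
        if not remove(drc_index):
            for done in reversed(removed):
                add_back(done)     # undo the removals made so far
            return []
        removed.append(drc_index)
    return removed
```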

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot
e666ae0b10 powerpc/pseries: Update CPU hotplug error recovery
Update the cpu dlpar add/remove paths to do better error recovery when
a failure occurs during the add/remove operation.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:02 +11:00
Nathan Fontenot
d98389f375 powerpc/pseries: Factor out common cpu hotplug code
Re-factor the cpu hotplug code to support doing cpu hotplug completely in
the kernel and using the existing sysfs probe/release interfaces. This
patch pulls out pieces of existing cpu hotplug code into common routines,
dlpar_cpu_add() and dlpar_cpu_remove(), to be used by both interfaces.
There are no functional changes introduced.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Nathan Fontenot
183deeea58 powerpc/pseries: Consolidate CPU hotplug code to hotplug-cpu.c
No functional changes, this patch is simply a move of the cpu hotplug
code from pseries/dlpar.c to pseries/hotplug-cpu.c. This is in an effort
to consolidate all of the cpu hotplug code in a common place.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Nathan Fontenot
1f859adb92 powerpc/pseries: Verify CPU doesn't exist before adding
When DLPAR adding a CPU we should verify that the CPU does not already
exist. Failure to do so can generate a kernel oops:

[    9.465585] kernel BUG at arch/powerpc/platforms/pseries/dlpar.c:382!
[    9.465796] Oops: Exception in kernel mode, sig: 5 [#1]

This oops can be generated by causing a probe to be performed on a cpu
by writing to the sysfs cpu probe file (/sys/devices/system/cpu/probe).
This patch adds a check for the existence of cpu prior to probing the cpu
so userspace doing the wrong thing won't trigger a BUG_ON().
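
The guard amounts to returning an error for a duplicate request instead of hitting an assertion. A minimal Python sketch, where the function name, the set of present cpus, and the choice of -EINVAL as the error value are all assumptions:

```python
import errno


def probe_cpu(present: set, drc_index: int) -> int:
    """Add a cpu unless it already exists; return 0 or a negative errno."""
    if drc_index in present:
        return -errno.EINVAL   # assumed error code; reject, do not crash
    present.add(drc_index)
    return 0
```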

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:01 +11:00
Alistair Popple
5d2aa710e6 powerpc/powernv: Add support for Nvlink NPUs
NVLink is a high speed interconnect that is used in conjunction with a
PCI-E connection to create an interface between CPU and GPU that
provides very high data bandwidth. A PCI-E connection to a GPU is used
as the control path to initiate and report status of large data
transfers sent via the NVLink.

On IBM Power systems the NVLink processing unit (NPU) is similar to
the existing PHB3. This patch adds support for a new NPU PHB type. DMA
operations on the NPU are not supported as this patch sets the TCE
translation tables to be the same as the related GPU PCIe device for
each NVLink. Therefore all DMA operations are setup and controlled via
the PCIe device.

EEH is not presently supported for the NPU devices, although it may be
added in future.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:41:00 +11:00
Alistair Popple
a84bf32140 powerpc: Add __raw_rm_writeq() function
Move __raw_rm_writeq() from platforms/powernv/pci-ioda.c to
include/asm/io.h so that it can be used by other code.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Alistair Popple
94973b24d6 Revert "powerpc/pci: Remove unused struct pci_dn.pcidev field"
This commit removed the pcidev field from struct pci_dn as it was no
longer in use by the kernel. However to support finding the
association of Nvlink devices to GPU devices from the device-tree this
field is required.

This reverts commit 250c7b277c ("powerpc/pci: Remove unused struct
pci_dn.pcidev field").

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Gavin Shan
e80c4e7ca5 powerpc/powernv: Fix M64 resource name in /proc/iomem
The name of PCI root bus's M64 resource isn't initialized properly.
When dumping "/proc/iomem", "<BAD>" is seen for those M64 resources
on PCI root buses.

   ~# cat /proc/iomem | grep -e "BAD"
   3b0000000000-3b0fefffffff : <BAD>
   3b1000000000-3b1fefffffff : <BAD>
   3c0000000000-3c0fefffffff : <BAD>
   3c1000000000-3c1fefffffff : <BAD>
   3c2000000000-3c2fefffffff : <BAD>

This fixes the issue by setting the name of PCI root bus's M64
resource to that of PHB's device node full name. With the patch,
no "<BAD>" is seen from "/proc/iomem".

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:59 +11:00
Michael Ellerman
b2e8590fa1 powerpc/pseries: Use rtas_call_unlocked() in pseries hotplug
Avoid open coding the logic by using rtas_call_unlocked().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:55 +11:00
Stewart Smith
e4d54f71d2 powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL
Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is
just OPALv3, with nobody ever expecting anything on pre-OPALv3 to
be cared about or supported by mainline kernels.

So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL
exclusively.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:54 +11:00
Stewart Smith
7261aafc09 powerpc/powernv: Remove OPALv2 firmware define and references
OPALv2 only ever existed in the lab and didn't escape to the world.
All OPAL systems in the wild are OPALv3.

The probability of there being an OPALv2 system still powered on
anywhere inside IBM is approximately zero, let alone anyone
expecting to run mainline kernels.

So, start to remove references to OPALv2.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:54 +11:00
Stewart Smith
786842b62f powerpc/powernv: panic() on OPAL < V3
The OpenPower Abstraction Layer firmware went through a couple
of iterations in the lab before being released. What we now know
as OPAL advertises itself as OPALv3.

OPALv2 and OPALv1 never made it outside the lab, and the possibility
of anyone at all ever building a mainline kernel today and expecting
it to boot on such hardware is zero.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:53 +11:00
Stewart Smith
98da62b716 powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type
When running on newer OPAL firmware that supports sending extra
OPAL_MSG types, we would print a warning on *every* message received.

This could be a problem for kernels that don't support OPAL_MSG_OCC
on machines that are running real close to thermal limits and the
OCC is throttling the chip. For a kernel that is paying attention to
the message queue, we could get these notifications quite often.

Conceivably, future message types could also come fairly often,
and printing that we didn't understand them 10,000 times provides
no further information than printing them once.
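
A minimal userspace sketch of the warn-once idea described above, not the
kernel code itself. The message type values, handle_msg() and the
warnings_printed counter are hypothetical names used only for illustration;
in the kernel the same effect comes from pr_warn_once().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical message type IDs, for illustration only. */
enum msg_type { MSG_ASYNC_COMP = 0, MSG_TYPE_MAX = 1 };

/* Counts how many warnings were actually printed. */
static int warnings_printed;

/* Returns true if the message type is supported, false otherwise. */
static bool handle_msg(unsigned int type)
{
    static bool warned; /* latches after the first unsupported message */

    if (type >= MSG_TYPE_MAX) {
        if (!warned) {
            warned = true;
            warnings_printed++;
            fprintf(stderr, "unsupported message type %u\n", type);
        }
        return false;
    }
    return true;
}
```

With this shape, a flood of unknown message types costs one log line in
total rather than one per message.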

Cc: stable@vger.kernel.org
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 20:42:13 +11:00
Daniel Axtens
00b912b0c8 powerpc: Remove broken GregorianDay()
GregorianDay() is supposed to calculate the day of the week
(tm->tm_wday) for a given day/month/year. In that calculation it
indexed into an array called MonthOffset using tm->tm_mon-1. However
tm_mon is zero-based, not one-based, so this is off-by-one. It also
means that every January, GregorianDay() will access element -1 of
the MonthOffset array.

It doesn't appear to be a correct algorithm either: see in
contrast kernel/time/timeconv.c's time_to_tm function.

It's been broken forever, which suggests no-one in userland uses
this. It looks like no-one in the kernel uses tm->tm_wday either
(see e.g. drivers/rtc/rtc-ds1305.c:319).

tm->tm_wday is conventionally set to -1 when not available in
hardware so we can simply set it to -1 and drop the function.
(There are over a dozen other drivers in drivers/rtc that do
this.)
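
The off-by-one can be illustrated with a small sketch. This is not the
removed function: days_before_month, buggy_index(), index_in_bounds() and
day_of_year() are hypothetical names, and leap years are ignored to keep
the example short.

```c
/* Cumulative days before each month in a non-leap year, indexed 0..11. */
static const int days_before_month[12] = {
    0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
};

/* Index a caller gets when it wrongly treats tm_mon as one-based. */
static int buggy_index(int tm_mon)
{
    return tm_mon - 1;
}

/* Returns 1 if idx is a valid index into days_before_month. */
static int index_in_bounds(int idx)
{
    return idx >= 0 && idx < 12;
}

/*
 * Zero-based day of the year for a zero-based month and a one-based
 * day of the month. tm_mon is used directly as the table index.
 */
static int day_of_year(int tm_mon, int tm_mday)
{
    return days_before_month[tm_mon] + (tm_mday - 1);
}
```

For January (tm_mon == 0) the buggy index is -1, which is outside the
table; indexing with tm_mon directly stays in bounds for every month.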

Found using UBSAN.

Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org> # as an example of what UBSan finds.
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: rtc-linux@googlegroups.com
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-16 12:54:04 +11:00
Rashmica Gupta
24ad1648ed powerpc/cell: Remove the Cell QPACE code
All users of QPACE have upgraded to QPACE2 so remove the Cell QPACE code.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 20:41:50 +11:00
Vipin K Parashar
b4af279a7c powerpc/pseries: Limit EPOW reset event warnings
The kernel parses EPOW interrupts and prints a warning for each EPOW
event so that the user can be informed or take action. At times the
EPOW reset event warning below is seen flooding the kernel log over a
period of time.

May 25 03:46:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:46:52 alp kernel: Non critical power or cooling issue cleared
May 25 03:53:48 alp kernel: Non critical power or cooling issue cleared
May 25 03:55:46 alp kernel: Non critical power or cooling issue cleared
May 25 03:56:34 alp kernel: Non critical power or cooling issue cleared
May 25 03:59:04 alp kernel: Non critical power or cooling issue cleared
May 25 04:02:01 alp kernel: Non critical power or cooling issue cleared

These EPOW reset events are spurious in nature and are triggered by
firmware without an actual EPOW event being reset. This patch avoids these
multiple EPOW reset warnings by using a counter variable. This variable
is incremented every time an EPOW event is reported. Upon receiving an EPOW
reset event the same variable is checked to filter out spurious events and
decremented accordingly.
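
The counter scheme described above might look roughly like the following
sketch. The names num_epow_events, epow_event_reported() and
epow_reset_received() are assumptions made for illustration, and the exact
check and decrement policy in the patch may differ.

```c
#include <stdio.h>

/* Number of EPOW events reported and not yet matched by a reset. */
static int num_epow_events;

/* Called whenever a real EPOW event is reported. */
static void epow_event_reported(void)
{
    num_epow_events++;
}

/*
 * Called on an EPOW reset event. Returns 1 if the reset matched an
 * outstanding event and was logged, 0 if it was filtered as spurious.
 */
static int epow_reset_received(void)
{
    if (num_epow_events == 0)
        return 0; /* nothing outstanding: spurious reset, stay quiet */

    num_epow_events--;
    printf("Non critical power or cooling issue cleared\n");
    return 1;
}
```

A reset that arrives with no outstanding event is dropped silently, so
repeated spurious resets no longer reach the log.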

This patch also improves the log messages to better describe the EPOW event
being reported, and merges adjacent log messages into a single one to reduce
the number of lines printed per event.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 20:41:49 +11:00
Aneesh Kumar K.V
4ad90c8649 powerpc/mm: Use H_READ with H_READ_4
This will bulk read 4 hash pte slot entries and should reduce the loop

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 15:19:17 +11:00
Aneesh Kumar K.V
e34aa03ca4 powerpc/mm: Move THP headers around
We support THP only with book3s_64 and 64K page size. Move
THP details to hash64-64k.h to clarify the same.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-14 15:19:14 +11:00
Alistair Popple
25642e1459 powerpc/opal-irqchip: Fix double endian conversion
The OPAL event calls return a mask of events that are active in big
endian format. This is checked when unmasking the events in the
irqchip by comparison with a cached value. The cached value was stored
in big endian format but should've been converted to CPU endian
first.
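
A small sketch of the principle, assuming a hypothetical cache and helper
names (be64_bytes_to_cpu(), cache_events(), event_pending()): a big-endian
mask is converted to CPU byte order exactly once, before it is cached or
compared. This is not the irqchip code.

```c
#include <stdint.h>

/* Decode a 64-bit big-endian value from raw bytes into CPU byte order. */
static uint64_t be64_bytes_to_cpu(const unsigned char b[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];
    return v;
}

/* Cached event mask, always kept in CPU byte order. */
static uint64_t cached_events;

/* Convert once at the boundary, then store the CPU-endian value. */
static void cache_events(const unsigned char raw[8])
{
    cached_events = be64_bytes_to_cpu(raw);
}

/* Test a bit of the cached mask; valid because the cache is CPU endian. */
static int event_pending(unsigned int bit)
{
    return (int)((cached_events >> bit) & 1);
}
```

If the raw big-endian bytes were cached unconverted on a little-endian CPU,
the bit tests would look at the wrong positions, which matches the delayed
or dropped events described in this commit.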

This bug leads to OPAL event delivery being delayed or dropped on some
systems. Symptoms may include a non-functional console.

The bug is fixed by calling opal_handle_events(...) instead of
duplicating code in opal_event_unmask(...).

Fixes: 9f0fd0499d ("powerpc/powernv: Add a virtual irqchip for opal events")
Cc: stable@vger.kernel.org # v4.2+
Reported-by: Douglas L Lehr <dllehr@us.ibm.com>
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-08 16:53:31 +11:00
Rashmica Gupta
f43194e458 powerpc: Standardise on NR_syscalls rather than __NR_syscalls.
Most architectures use NR_syscalls as the #define for the number of syscalls.

We use __NR_syscalls, and then define NR_syscalls as __NR_syscalls.

__NR_syscalls is not used outside arch code, whereas NR_syscalls is. Since
NR_syscalls must be defined and __NR_syscalls need not be, replace
__NR_syscalls with NR_syscalls.

Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-26 22:11:17 +11:00
John Ogness
57f889471c powerpc/powermac: set IRQF_NO_THREAD for xmon/cascade handlers
The xmon and cascade irq handlers must not run as threads.
pmac_pic_lock is already a raw_spinlock, but the irq flag
IRQF_NO_THREAD needs to be set as well.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-26 22:11:05 +11:00
Krzysztof Kozlowski
87630eb1d5 powerpc/powernv: Drop owner assignment from platform_driver
platform_driver does not need to set an owner because
platform_driver_register() will set it.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-24 14:21:28 +11:00