linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-10-08 17:49:45 +00:00

Author	SHA1	Message	Date
Thomas Gleixner	4b979e4c61	Merge branch 'linus' into irq/core Pull in upstream fixes before applying conflicting changes	2015-07-30 00:13:24 +02:00
Alexey Kardashevskiy	3ba3a73e9f	powerpc/powernv/ioda2: Fix calculation for memory allocated for TCE table The existing code stores the amount of memory allocated for a TCE table. At the moment it uses @offset which is a virtual offset in the TCE table which is only correct for a one level tables and it does not include memory allocated for intermediate levels. When multilevel TCE table is requested, WARN_ON in tce_iommu_create_table() prints a warning. This adds an additional counter to pnv_pci_ioda2_table_do_alloc_pages() to count actually allocated memory. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-23 19:56:44 +10:00
Paul Mackerras	01c9348c76	powerpc: Use hardware RNG for arch_get_random_seed_* not arch_get_random_* The hardware RNG on POWER8 and POWER7+ can be relatively slow, since it can only supply one 64-bit value per microsecond. Currently we read it in arch_get_random_long(), but that slows down reading from /dev/urandom since the code in random.c calls arch_get_random_long() for every longword read from /dev/urandom. Since the hardware RNG supplies high-quality entropy on every read, it matches the semantics of arch_get_random_seed_long() better than those of arch_get_random_long(). Therefore this commit makes the code use the POWER8/7+ hardware RNG only for arch_get_random_seed_{long,int} and not for arch_get_random_{long,int}. This won't affect any other PowerPC-based platforms because none of them currently support a hardware RNG. To make it clear that the ppc_md function pointer is used for arch_get_random_seed_*, we rename it from get_random_long to get_random_seed. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-23 19:52:03 +10:00
Thomas Huth	1c2cb59444	powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlers The EPOW interrupt handler uses rtas_get_sensor(), which in turn uses rtas_busy_delay() to wait for RTAS becoming ready in case it is necessary. But rtas_busy_delay() is annotated with might_sleep() and thus may not be used by interrupts handlers like the EPOW handler! This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496 in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6 Call Trace: [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable) [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180 [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0 [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0 [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450 [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300 [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0 [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260 [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80 [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200 [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24 [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110 [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180 Fix this issue by introducing a new rtas_get_sensor_fast() function that does not use rtas_busy_delay() - and thus can only be used for sensors that do not cause a BUSY condition - known as "fast" sensors. The EPOW sensor is defined to be "fast" in sPAPR - mpe. Fixes: `587f83e8dd` ("powerpc/pseries: Use rtas_get_sensor in RAS code") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-23 19:43:11 +10:00
Jiang Liu	2921d1790e	powerpc/PCI: Use for_pci_msi_entry() to access MSI device list Use accessor for_each_pci_msi_entry() to access MSI device list, so we could easily move msi_list from struct pci_dev into struct device later. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Stuart Yoder <stuart.yoder@freescale.com> Cc: Yijing Wang <wangyijing@huawei.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Olof Johansson <olof@lixom.net> Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Daniel Axtens <dja@axtens.net> Cc: Wei Yang <weiyang@linux.vnet.ibm.com> Cc: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: Scott Wood <scottwood@freescale.com> Cc: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Cc: Tudor Laurentiu <b10716@freescale.com> Cc: Hongtao Jia <hongtao.jia@freescale.com> Link: http://lkml.kernel.org/r/1436428847-8886-4-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-07-22 18:37:42 +02:00
Gavin Shan	79cd952000	powerpc/eeh: Dump PHB diag-data for non-existing PE When detecting EEH error on non-existing PE, including the reserved one, the PE is simply unfrozen without dumping the PHB diag-data, which is useful for locating the root cause of the EEH error. The patch dumps the PHB diag-data when non-existing PE reports error. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-21 11:38:47 +10:00
Gavin Shan	0f36db7764	powerpc/eeh: Fix wrong printed PE number On LE kernel, the non-existing PE number in BE format derived from skiboot firmware isn't converted to LE format properly as following kernel log indicates: EEH: Clear non-existing PHB#4-PE#200000000000000 Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-21 11:38:46 +10:00
Stephen Boyd	d5fb48a354	powerpc/512x: clk: Include clk.h This clock provider uses the consumer API, so include clk.h explicitly. Cc: Gerhard Sittig <gsi@denx.de> Cc: Scott Wood <scottwood@freescale.com> Acked-by: Anatolij Gustschin <agust@denx.de> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>	2015-07-20 10:52:40 -07:00
Vipin K Parashar	3b476aadbc	powerpc/powernv: Add poweroff (EPOW, DPO) events support for PowerNV platform This patch adds support for OPAL EPOW (Environmental and Power Warnings) and DPO (Delayed Power Off) events for the PowerNV platform. These events are generated on FSP (Flexible Service Processor) based systems. EPOW events are generated due to various critical system conditions that require system shutdown. A few examples of these conditions are high ambient temperature or system running on UPS power with low UPS battery. DPO event is generated in response to admin initiated system shutdown request. Upon receipt of EPOW and DPO events the host kernel invokes orderly_poweroff() for performing graceful system shutdown. Signed-off-by: Vipin K Parashar <vipin@linux.vnet.ibm.com> Acked-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-16 13:34:36 +10:00
Gavin Shan	f951e51003	powerpc/powernv: Unfreeze VF PE on releasing it When releasing PE for SRIOV VF, the PE is forced to be frozen wrongly. When the same PE is picked for another VF, it won't work anyhow. The patch fixes the issue by unfreezing, not freezing the VF PE when releasing it. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-13 16:12:30 +10:00
Gavin Shan	283e2d8a59	powerpc/powernv: Include VF PE in PELTV of PF PE The PELTV of PF PE should include VF PE, which is missed by current code, so that the VF PE is frozen automatically when freezing PF PE. The patch fixes the PELTV of PF PE to include VF PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-13 16:12:22 +10:00
Gavin Shan	26ba248d52	powerpc/powernv: Pick M64 PEs based on BARs On PHB3, PE might be reserved in advance to reflect the M64 segments consumed by the PE according to M64 BARs (exclude VF BARs) of the PCI devices included in the PE. The PE is picked based on M64 BARs instead of the bridge's M64 windows, which might include VF BARs. Otherwise, wrong PE could be picked. The patch calculates the used M64 segments and PE numbers according to the M64 BARs, excluding VF BARs, of PCI devices in one particular PE, instead of the bridge's M64 windows. Then the right PE number is picked. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-13 16:12:01 +10:00
Gavin Shan	d1203852df	powerpc/powernv: Boolean argument for pnv_ioda_setup_bus_PE() The patch changes the type of last argument of pnv_ioda_setup_bus_PE() and phb::pick_m64_pe() to boolean. No functional change. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-13 16:12:00 +10:00
Gavin Shan	96a2f92bf8	powerpc/powernv: Reserve M64 PEs based on BARs On PHB3, some PEs might be reserved in advance to reflect the M64 segments consumed by those PEs. We're reserving PEs based on the M64 window of root port, which might contain VF BAR. The PEs for VFs are allocated dynamically, not reserved based on the consumed M64 segments. So the M64 window of root port isn't reliable for the task. Instead, we go through M64 BARs (VF BARs excluded) of PCI devices under the specified root bus and reserve PEs accordingly, as the patch does. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-13 16:12:00 +10:00
Gavin Shan	e9dc4d7f72	powerpc/powernv: Allow to reserve one PE for multiple times The PE numbers are reserved according to root port's M64 window, which is aligned to M64 segment finely. So one PE shouldn't be reserved for multiple times. We will reserve PE numbers according to the M64 BARs of PCI device in subsequent patches, which aren't aligned to M64 segment size finely. It means one particular PE could be reserved for multiple times. The patch allows one PE to be reserved for multiple times and we print the warning message at debugging level. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-13 16:12:00 +10:00
Benjamin Herrenschmidt	e91c25111a	powerpc/iommu: Cleanup setting of DMA base/offset Now that the table and the offset can co-exist, we no longer need to flip/flop, we can just establish both once at boot time. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-13 10:10:55 +10:00
Alistair Popple	a8956a7b72	powerpc/powernv: Fix opal-elog interrupt handler The conversion of opal events to a proper irqchip means that handlers are called until the relevant opal event has been cleared by processing it. Events that queue work should therefore use a threaded handler to mask the event until processing is complete. Signed-off-by: Alistair Popple <alistair@popple.id.au> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-06 20:24:36 +10:00
Vaidyanathan Srinivasan	d8ea782b56	powerpc/powernv: Fix vma page prot flags in opal-prd driver opal-prd driver will mmap() firmware code/data area as private mapping to prd user space daemon. Write to this page will trigger COW faults. The new COW pages are normal kernel RAM pages accounted by the kernel and are not special. vma->vm_page_prot value will be used at page fault time for the new COW pages, while pgprot_t value passed in remap_pfn_range() is used for the initial page table entry. Hence: * Do not add _PAGE_SPECIAL in vma, but only for remap_pfn_range() * Also remap_pfn_range() will add the _PAGE_SPECIAL flag using pte_mkspecial() call, hence no need to specify in the driver This fix resolves the page accounting warning shown below: BUG: Bad rss-counter state mm:c0000007d34ac600 idx:1 val:19 The above warning is triggered since _PAGE_SPECIAL was incorrectly being set for the normal kernel COW pages. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-07-06 12:06:42 +10:00
Linus Torvalds	1dc51b8288	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: "Assorted VFS fixes and related cleanups (IMO the most interesting in that part are f_path-related things and Eric's descriptor-related stuff). UFS regression fixes (it got broken last cycle). 9P fixes. fs-cache series, DAX patches, Jan's file_remove_suid() work" [ I'd say this is much more than "fixes and related cleanups". The file_table locking rule change by Eric Dumazet is a rather big and fundamental update even if the patch isn't huge. - Linus ] * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits) 9p: cope with bogus responses from server in p9_client_{read,write} p9_client_write(): avoid double p9_free_req() 9p: forgetting to cancel request on interrupted zero-copy RPC dax: bdev_direct_access() may sleep block: Add support for DAX reads/writes to block devices dax: Use copy_from_iter_nocache dax: Add block size note to documentation fs/file.c: __fget() and dup2() atomicity rules fs/file.c: don't acquire files->file_lock in fd_install() fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation vfs: avoid creation of inode number 0 in get_next_ino namei: make set_root_rcu() return void make simple_positive() public ufs: use dir_pages instead of ufs_dir_pages() pagemap.h: move dir_pages() over there remove the pointless include of lglock.h fs: cleanup slight list_entry abuse xfs: Correctly lock inode when removing suid and file capabilities fs: Call security_ops->inode_killpriv on truncate fs: Provide function telling whether file_remove_privs() will do anything ...	2015-07-04 19:36:06 -07:00
Linus Torvalds	2d4407079c	Replace module_init with equivalent device_initcall in non modules. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVkO5XAAoJEOvOhAQsB9HWe4cQAJcsmSXIDN2O6oxvgH8Wilof EIEMvT13uwBdsjQdYUY6A6B3iUV9wzEEgoosg/JRgpz5/b1FTDMIO4arUPD3Lcak 5bmyVO2qAT+yaLAWSgn6I8DMplXrKiEuK+TkH/mW3p9TdvElLdG3Vg6UI407hSWv W0QbVwkNtv8XmzshV9F2YdmflT8j1PgYxIu/tEkVOWn37DNW+Fp2OVBrdTIYp3AJ X6bYZPEcQDCrWWW/s2GmIDrNgryiebasns+CAgGY21262jAYaRcFOJmR47AsTqW7 DSZXIlLc/gJca++hfxqV15RZ4NRHxrebCypTsPtZUV7ZiYHI726eeUZzxsp/9itu mvhmi+aQUTTUP3dDhiv05f4syAKEb4zslT6SLwcna6oi09M97HfCeQsHqhcFq/MG KnS2JJoJQToQtJvMUXMQzp5hyHjNlOclIvCxEiL32EZU54PeJOKasy/mptNGEctk TxACWvoXBQglRaVN+1wIjjr0BaHJSuJa9CUnIfM4WZdSHiMQMx00XLTkZcTnSM6R 12pE54vVolrXswGPJhy4W/Sf1yPSW1tkWSVBbkKLyCIrlAWJtu68rXhvwhG/nz6E 3g6QrDEQGlk6bzUH4CJCEqXLPRN1bNS0XjdkEFh60Lury3Ns5yHKZXPW5vCQ5csr FQNUyBs595CWbJNfbn1n =0BDx -----END PGP SIGNATURE----- Merge tag 'module_init-device_initcall-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Pull module_init replacement part one from Paul Gortmaker: "Replace module_init with equivalent device_initcall in non modules. This series of commits converts non-modular code that is using the module_init() call to hook itself into the system to instead use device_initcall(). The conversion is a runtime no-op, since module_init actually becomes __initcall in the non-modular case, and that in turn gets mapped onto device_initcall. A couple files show a larger negative diffstat, representing ones that had a module_exit function that we remove here vs previously relying on the linker to dispose of it. We make this conversion now, so that we can relocate module_init from init.h into module.h in the future. The files changed here are just limited to those that would otherwise have to add module.h to obviously non-modular code, in order to avoid a compile fail, as testing has shown" * tag 'module_init-device_initcall-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: MIPS: don't use module_init in non-modular cobalt/mtd.c file drivers/leds: don't use module_init in non-modular leds-cobalt-raq.c cris: don't use module_init for non-modular core eeprom.c code tty/metag_da: Avoid module_init/module_exit in non-modular code drivers/clk: don't use module_init in clk-nomadik.c which is non-modular xtensa: don't use module_init for non-modular core network.c code sh: don't use module_init in non-modular psw.c code mn10300: don't use module_init in non-modular flash.c code parisc64: don't use module_init for non-modular core perf code parisc: don't use module_init for non-modular core pdc_cons code cris: don't use module_init for non-modular core intmem.c code ia64: don't use module_init in non-modular sim/simscsi.c code ia64: don't use module_init for non-modular core kernel/mca.c code arm: don't use module_init in non-modular mach-vexpress/spc.c code powerpc: don't use module_init in non-modular 83xx suspend code powerpc: use device_initcall for registering rtc devices x86: don't use module_init in non-modular devicetree.c code x86: don't use module_init in non-modular intel_mid_vrtc.c	2015-07-02 10:30:48 -07:00
Linus Torvalds	08d183e3c1	powerpc updates for 4.2 - Disable the 32-bit vdso when building LE, so we can build with a 64-bit only toolchain. - EEH fixes from Gavin & Richard. - Enable the sys_kcmp syscall from Laurent. - Sysfs control for fastsleep workaround from Shreyas. - Expose OPAL events as an irq chip by Alistair. - MSI ops moved to pci_controller_ops by Daniel. - Fix for kernel to userspace backtraces for perf from Anton. - Merge pseries and pseries_le defconfigs from Cyril. - CXL in-kernel API from Mikey. - OPAL prd driver from Jeremy. - Fix for DSCR handling & tests from Anshuman. - Powernv flash mtd driver from Cyril. - Dynamic DMA Window support on powernv from Alexey. - LLVM clang fixes & workarounds from Anton. - Reworked version of the patch to abort syscalls when transactional. - Fix the swap encoding to support 4TB, from Aneesh. - Various fixes as usual. - Freescale updates from Scott: Highlights include more 8xx optimizations, an e6500 hugetlb optimization, QMan device tree nodes, t1024/t1023 support, and various fixes and cleanup. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJViSZqAAoJEFHr6jzI4aWAA7kQAKq3+pejfo2rY7alpKJyeVao vlaIEaDNOTh+ctcmu3MFF9Jy6fai8gNZziRXU5JRmE5RW4GVBN4KZiqXRbkVjdBK uG9sCX7Y58VRsS2vnGBYLsamfTMgjaXeDvgunQHVLiechJnrDr0RHEK90F3LSi73 Axp6l8XIG63a3zFZmkhzANMCme2lm5+MWmGlSjUUNi5F+viQUgJc5iiO8xrVUgM5 RpNlV2NJSqFiU+gMQWJ226V85UIniouq4j+qtyUcu8/m9BberyolXVU0GPlPFdsx r/Qh9uCJyZaUdSB5hzomQZj50IsSz6J6nEuJTeGRoVZOmeI8Dnc2xU9fxQF5fC8H lUJw10WPoNOggQZTeSUKn7wTXw3i4p3KsWNUczaW68VJdhqZUVaSp0+I6mnDSqzs 9iGC+VffLYNa1OHq7mGRFrgDdLBCHes31aZ3CxlQsmyNpAPCwMzsD4TUfVnvOG6E oJOeaQ4mZM9PvqxEYJfoIL+vgRxmQ8sdIBtNY4in+C7J6eFnZNFO9xmPnJZuVU31 PGtx60kjFCOVMXvqn34WkRNbgqGWI91IK0KcRwFO2LXVio1uY77TWL52kNK2IMsp Az+VDDvqnT3+BoV1yz0P6SrXAkwTpvFk2y+IdmEiUUN7zZFL5ZSA2epej9AzHTAK WID2bc5yVtIL6p6x5ICH =d9Wh -----END PGP SIGNATURE----- Merge tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: - disable the 32-bit vdso when building LE, so we can build with a 64-bit only toolchain. - EEH fixes from Gavin & Richard. - enable the sys_kcmp syscall from Laurent. - sysfs control for fastsleep workaround from Shreyas. - expose OPAL events as an irq chip by Alistair. - MSI ops moved to pci_controller_ops by Daniel. - fix for kernel to userspace backtraces for perf from Anton. - merge pseries and pseries_le defconfigs from Cyril. - CXL in-kernel API from Mikey. - OPAL prd driver from Jeremy. - fix for DSCR handling & tests from Anshuman. - Powernv flash mtd driver from Cyril. - dynamic DMA Window support on powernv from Alexey. - LLVM clang fixes & workarounds from Anton. - reworked version of the patch to abort syscalls when transactional. - fix the swap encoding to support 4TB, from Aneesh. - various fixes as usual. - Freescale updates from Scott: Highlights include more 8xx optimizations, an e6500 hugetlb optimization, QMan device tree nodes, t1024/t1023 support, and various fixes and cleanup. * tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (180 commits) cxl: Fix typo in debug print cxl: Add CXL_KERNEL_API config option powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma() powerpc/mm: Change the swap encoding in pte. powerpc/mm: PTE_RPN_MAX is not used, remove the same powerpc/tm: Abort syscalls in active transactions powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off powerpc/include: Add opal-prd to installed uapi headers powerpc/powernv: fix construction of opal PRD messages powerpc/powernv: Increase opal-irqchip initcall priority powerpc: Make doorbell check preemption safe powerpc/powernv: pnv_init_idle_states() should only run on powernv macintosh/nvram: Remove as unused powerpc: Don't use gcc specific options on clang powerpc: Don't use -mno-strict-align on clang powerpc: Only use -mtraceback=no, -mno-string and -msoft-float if toolchain supports it powerpc: Only use -mabi=altivec if toolchain supports it powerpc: Fix duplicate const clang warning in user access code vfio: powerpc/spapr: Support Dynamic DMA windows vfio: powerpc/spapr: Register memory and define IOMMU v2 ...	2015-06-24 08:46:32 -07:00
Al Viro	dc3f4198ea	make simple_positive() public Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-06-23 18:02:01 -04:00
Michael Ellerman	6096f88451	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Freescale updates from Scott: "Highlights include more 8xx optimizations, an e6500 hugetlb optimization, QMan device tree nodes, t1024/t1023 support, and various fixes and cleanup."	2015-06-19 17:23:48 +10:00
Alexey Kardashevskiy	5c89a87d13	powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma() When pnv_pci_ioda_fixup() is called during PHB fixup time, each PE in the sorted list of PEs (phb::pe_dma_list) is iterated to setup the PE's DMA32 space by pnv_ioda_setup_bus_dma() if the PE's DMA32 weight is bigger than zero. The function also assigns all the subordinate PCI devices of the PE's primary bus with the PE's DMA32 IOMMU table. It causes the PCI devicess in the child PEs, which don't have DMA weight, receives wrong IOMMU table and then IOMMU group. The patch fixes above issue by more check on the PE's coverage and don't assign IOMMU table to those PCI devices, which belong to the child PEs. The problem was found on Firestone platform initially. Suggested-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-19 17:10:29 +10:00
Alexey Kardashevskiy	b5926430df	powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off The pnv_pci_ioda2_unset_window() function is used to do the final cleanup of a DMA window being released: - via VFIO ioctl by the guest request; - via unplugging a virtual PCI function. However the function was under #ifdef CONFIG_IOMMU_API and was missing. This moves the helper outside of IOMMU_API block and enables it for either or both IOMMU_API and PCI_IOV. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-18 07:16:02 +10:00
Jeremy Kerr	7185795a62	powerpc/powernv: fix construction of opal PRD messages We currently have a bug in the PRD code, where the contents of an incoming message (beyond the header) will be overwritten by the list item manipulations when adding to to the prd_msg_queue. This change reorders struct opal_prd_msg_queue_item, so that the message body doesn't overlap the list_head. We also clarify the memcpy of the message, as we're copying unnecessary bytes at the end of the message data. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-18 07:16:01 +10:00
Alistair Popple	02b6505c8f	powerpc/powernv: Increase opal-irqchip initcall priority The eeh subsystem for powernv requires the opal event irqchip to be initialised prior to initialisation or the following errors are produced (and eeh doesn't work as expected): irq: XICS didn't like hwirq-0x9 to VIRQ17 mapping (rc=-22) pnv_eeh_post_init: Can't request OPAL event interrupt (0) On powernv eeh is initialised from a subsys_initcall due to a check for machine_is(powernv) in eeh_init(). This patch increases the initcall priority of opal_event_init() to an arch_initcall to ensure the opal event interface is initialised prior to any users of it. Signed-off-by: Alistair Popple <alistair@popple.id.au> Reported-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-18 07:16:01 +10:00
Paul Gortmaker	a390a2f181	powerpc: don't use module_init in non-modular 83xx suspend code The suspend.o is built for SUSPEND -- which is bool, and hence this code is either present or absent. It will never be modular, so using module_init as an alias for __initcall can be somewhat misleading. Fix this up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be a worse thing. Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of device_initcall directly in this change means that the runtime impact is zero -- it will remain at level 6 in initcall ordering. Cc: Scott Wood <scottwood@freescale.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2015-06-16 14:12:29 -04:00
Paul Gortmaker	8f6b9512ce	powerpc: use device_initcall for registering rtc devices Currently these two RTC devices are in core platform code where it is not possible for them to be modular. It will never be modular, so using module_init as an alias for __initcall can be somewhat misleading. Fix this up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be a worse thing. Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of device_initcall directly in this change means that the runtime impact is zero -- they will remain at level 6 in initcall ordering. Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Geoff Levand <geoff@infradead.org> Acked-by: Geoff Levand <geoff@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2015-06-16 14:12:29 -04:00
Michael Ellerman	4bece972fc	powerpc/powernv: pnv_init_idle_states() should only run on powernv Although this init call checks for device tree properties before doing anything, it should still only run on powernv machines. Reviewed-by: Shreyas B Prabhu <shreyas@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-15 16:45:12 +10:00
Alexey Kardashevskiy	46d3e1e162	vfio: powerpc/spapr: powerpc/powernv/ioda2: Use DMA windows API in ownership control Before the IOMMU user (VFIO) would take control over the IOMMU table belonging to a specific IOMMU group. This approach did not allow sharing tables between IOMMU groups attached to the same container. This introduces a new IOMMU ownership flavour when the user can not just control the existing IOMMU table but remove/create tables on demand. If an IOMMU implements take/release_ownership() callbacks, this lets the user have full control over the IOMMU group. When the ownership is taken, the platform code removes all the windows so the caller must create them. Before returning the ownership back to the platform code, VFIO unprograms and removes all the tables it created. This changes IODA2's onwership handler to remove the existing table rather than manipulating with the existing one. From now on, iommu_take_ownership() and iommu_release_ownership() are only called from the vfio_iommu_spapr_tce driver. Old-style ownership is still supported allowing VFIO to run on older P5IOC2 and IODA IO controllers. No change in userspace-visible behaviour is expected. Since it recreates TCE tables on each ownership change, related kernel traces will appear more often. This adds a pnv_pci_ioda2_setup_default_config() which is called when PE is being configured at boot time and when the ownership is passed from VFIO to the platform code. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:54 +10:00
Alexey Kardashevskiy	0054719386	powerpc/iommu/ioda2: Add get_table_size() to calculate the size of future table This adds a way for the IOMMU user to know how much a new table will use so it can be accounted in the locked_vm limit before allocation happens. This stores the allocated table size in pnv_pci_ioda2_get_table_size() so the locked_vm counter can be updated correctly when a table is being disposed. This defines an iommu_table_group_ops callback to let VFIO know how much memory will be locked if a table is created. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:53 +10:00
Alexey Kardashevskiy	c035e37b58	powerpc/powernv/ioda2: Use new helpers to do proper cleanup on PE release The existing code programmed TVT#0 with some address and then immediately released that memory. This makes use of pnv_pci_ioda2_unset_window() and pnv_pci_ioda2_set_bypass() which do correct resource release and TVT update. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:53 +10:00
Alexey Kardashevskiy	4793d65d1a	vfio: powerpc/spapr: powerpc/powernv/ioda: Define and implement DMA windows API This extends iommu_table_group_ops by a set of callbacks to support dynamic DMA windows management. create_table() creates a TCE table with specific parameters. it receives iommu_table_group to know nodeid in order to allocate TCE table memory closer to the PHB. The exact format of allocated multi-level table might be also specific to the PHB model (not the case now though). This callback calculated the DMA window offset on a PCI bus from @num and stores it in a just created table. set_window() sets the window at specified TVT index + @num on PHB. unset_window() unsets the window from specified TVT. This adds a free() callback to iommu_table_ops to free the memory (potentially a tree of tables) allocated for the TCE table. create_table() and free() are supposed to be called once per VFIO container and set_window()/unset_window() are supposed to be called for every group in a container. This adds IOMMU capabilities to iommu_table_group such as default 32bit window parameters and others. This makes use of new values in vfio_iommu_spapr_tce. IODA1/P5IOC2 do not support DDW so they do not advertise pagemasks to the userspace. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:52 +10:00
Alexey Kardashevskiy	bbb845c4ba	powerpc/powernv: Implement multilevel TCE tables TCE tables might get too big in case of 4K IOMMU pages and DDW enabled on huge guests (hundreds of GB of RAM) so the kernel might be unable to allocate contiguous chunk of physical memory to store the TCE table. To address this, POWER8 CPU (actually, IODA2) supports multi-level TCE tables, up to 5 levels which splits the table into a tree of smaller subtables. This adds multi-level TCE tables support to pnv_pci_ioda2_table_alloc_pages() and pnv_pci_ioda2_table_free_pages() helpers. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:51 +10:00
Alexey Kardashevskiy	43cb60ab7f	powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window This is a part of moving DMA window programming to an iommu_ops callback. pnv_pci_ioda2_set_window() takes an iommu_table_group as a first parameter (not pnv_ioda_pe) as it is going to be used as a callback for VFIO DDW code. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:51 +10:00
Alexey Kardashevskiy	aca6913f55	powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages This is a part of moving TCE table allocation into an iommu_ops callback to support multiple IOMMU groups per one VFIO container. This moves the code which allocates the actual TCE tables to helpers: pnv_pci_ioda2_table_alloc_pages() and pnv_pci_ioda2_table_free_pages(). These do not allocate/free the iommu_table struct. This enforces window size to be a power of two. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:50 +10:00
Alexey Kardashevskiy	e5aad1e678	powerpc/powernv/ioda2: Rework iommu_table creation This moves iommu_table creation to the beginning to make following changes easier to review. This starts using table parameters from the iommu_table struct. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:50 +10:00
Alexey Kardashevskiy	05c6cfb9dc	powerpc/iommu/powernv: Release replaced TCE At the moment writing new TCE value to the IOMMU table fails with EBUSY if there is a valid entry already. However PAPR specification allows the guest to write new TCE value without clearing it first. Another problem this patch is addressing is the use of pool locks for external IOMMU users such as VFIO. The pool locks are to protect DMA page allocator rather than entries and since the host kernel does not control what pages are in use, there is no point in pool locks and exchange()+put_page(oldtce) is sufficient to avoid possible races. This adds an exchange() callback to iommu_table_ops which does the same thing as set() plus it returns replaced TCE and DMA direction so the caller can release the pages afterwards. The exchange() receives a physical address unlike set() which receives linear mapping address; and returns a physical address as the clear() does. This implements exchange() for P5IOC2/IODA/IODA2. This adds a requirement for a platform to have exchange() implemented in order to support VFIO. This replaces iommu_tce_build() and iommu_clear_tce() with a single iommu_tce_xchg(). This makes sure that TCE permission bits are not set in TCE passed to IOMMU API as those are to be calculated by platform code from DMA direction. This moves SetPageDirty() to the IOMMU code to make it work for both VFIO ioctl interface in in-kernel TCE acceleration (when it becomes available later). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:49 +10:00
Alexey Kardashevskiy	c5bb44edee	powerpc/powernv: Implement accessor to TCE entry This replaces direct accesses to TCE table with a helper which returns an TCE entry address. This does not make difference now but will when multi-level TCE tables get introduces. No change in behavior is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:49 +10:00
Alexey Kardashevskiy	e57080f17d	powerpc/powernv/ioda2: Add TCE invalidation for all attached groups The iommu_table struct keeps a list of IOMMU groups it is used for. At the moment there is just a single group attached but further patches will add TCE table sharing. When sharing is enabled, TCE cache in each PE needs to be invalidated so does the patch. This does not change pnv_pci_ioda1_tce_invalidate() as there is no plan to enable TCE table sharing on PHBs older than IODA2. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:48 +10:00
Alexey Kardashevskiy	5780fb0426	powerpc/powernv/ioda2: Move TCE kill register address to PE At the moment the DMA setup code looks for the "ibm,opal-tce-kill" property which contains the TCE kill register address. Writing to this register invalidates TCE cache on IODA/IODA2 hub. This moves the register address from iommu_table to pnv_pnb as this register belongs to PHB and invalidates TCE cache for all tables of all attached PEs. This moves the property reading/remapping code to a helper which is called when DMA is being configured for PE and which does DMA setup for both IODA1 and IODA2. This adds a new pnv_pci_ioda2_tce_invalidate_entire() helper which invalidates cache for the entire table. It should be called after every call to opal_pci_map_pe_dma_window(). It was not required before because there was just a single TCE table and 64bit DMA was handled via bypass window (which has no table so no cache was used) but this is going to change with Dynamic DMA windows (DDW). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:48 +10:00
Alexey Kardashevskiy	f87a88642e	vfio: powerpc/spapr/iommu/powernv/ioda2: Rework IOMMU ownership control This adds tce_iommu_take_ownership() and tce_iommu_release_ownership which call in a loop iommu_take_ownership()/iommu_release_ownership() for every table on the group. As there is just one now, no change in behaviour is expected. At the moment the iommu_table struct has a set_bypass() which enables/ disables DMA bypass on IODA2 PHB. This is exposed to POWERPC IOMMU code which calls this callback when external IOMMU users such as VFIO are about to get over a PHB. The set_bypass() callback is not really an iommu_table function but IOMMU/PE function. This introduces a iommu_table_group_ops struct and adds take_ownership()/release_ownership() callbacks to it which are called when an external user takes/releases control over the IOMMU. This replaces set_bypass() with ownership callbacks as it is not necessarily just bypass enabling, it can be something else/more so let's give it more generic name. The callbacks is implemented for IODA2 only. Other platforms (P5IOC2, IODA1) will use the old iommu_take_ownership/iommu_release_ownership API. The following patches will replace iommu_take_ownership/ iommu_release_ownership calls in IODA2 with full IOMMU table release/ create. As we here and touching bypass control, this removes pnv_pci_ioda2_setup_bypass_pe() as it does not do much more compared to pnv_pci_ioda2_set_bypass. This moves tce_bypass_base initialization to pnv_pci_ioda2_setup_dma_pe. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:47 +10:00
Alexey Kardashevskiy	0eaf4defc7	powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group So far one TCE table could only be used by one IOMMU group. However IODA2 hardware allows programming the same TCE table address to multiple PE allowing sharing tables. This replaces a single pointer to a group in a iommu_table struct with a linked list of groups which provides the way of invalidating TCE cache for every PE when an actual TCE table is updated. This adds pnv_pci_link_table_and_group() and pnv_pci_unlink_table_and_group() helpers to manage the list. However without VFIO, it is still going to be a single IOMMU group per iommu_table. This changes iommu_add_device() to add a device to a first group from the group list of a table as it is only called from the platform init code or PCI bus notifier and at these moments there is only one group per table. This does not change TCE invalidation code to loop through all attached groups in order to simplify this patch and because it is not really needed in most cases. IODA2 is fixed in a later patch. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:16:15 +10:00
Alexey Kardashevskiy	b348aa6529	powerpc/spapr: vfio: Replace iommu_table with iommu_table_group Modern IBM POWERPC systems support multiple (currently two) TCE tables per IOMMU group (a.k.a. PE). This adds a iommu_table_group container for TCE tables. Right now just one table is supported. This defines iommu_table_group struct which stores pointers to iommu_group and iommu_table(s). This replaces iommu_table with iommu_table_group where iommu_table was used to identify a group: - iommu_register_group(); - iommudata of generic iommu_group; This removes @data from iommu_table as it_table_group provides same access to pnv_ioda_pe. For IODA, instead of embedding iommu_table, the new iommu_table_group keeps pointers to those. The iommu_table structs are allocated dynamically. For P5IOC2, both iommu_table_group and iommu_table are embedded into PE struct. As there is no EEH and SRIOV support for P5IOC2, iommu_free_table() should not be called on iommu_table struct pointers so we can keep it embedded in pnv_phb::p5ioc2. For pSeries, this replaces multiple calls of kzalloc_node() with a new iommu_pseries_alloc_group() helper and stores the table group struct pointer into the pci_dn struct. For release, a iommu_table_free_group() helper is added. This moves iommu_table struct allocation from SR-IOV code to the generic DMA initialization code in pnv_pci_ioda_setup_dma_pe and pnv_pci_ioda2_setup_dma_pe as this is where DMA is actually initialized. This change is here because those lines had to be changed anyway. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [aw: for the vfio related changes] Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:14:57 +10:00
Alexey Kardashevskiy	decbda2572	powerpc/powernv/ioda/ioda2: Rework TCE invalidation in tce_build()/tce_free() The pnv_pci_ioda_tce_invalidate() helper invalidates TCE cache. It is supposed to be called on IODA1/2 and not called on p5ioc2. It receives start and end host addresses of TCE table. IODA2 actually needs PCI addresses to invalidate the cache. Those can be calculated from host addresses but since we are going to implement multi-level TCE tables, calculating PCI address from a host address might get either tricky or ugly as TCE table remains flat on PCI bus but not in RAM. This moves pnv_pci_ioda_tce_invalidate() from generic pnv_tce_build/ pnt_tce_free and defines IODA1/2-specific callbacks which call generic ones and do PHB-model-specific TCE cache invalidation. P5IOC2 keeps using generic callbacks as before. This changes pnv_pci_ioda2_tce_invalidate() to receives TCE index and number of pages which are PCI addresses shifted by IOMMU page shift. No change in behaviour is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:14:56 +10:00
Alexey Kardashevskiy	da004c3600	powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table This adds a iommu_table_ops struct and puts pointer to it into the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flush callbacks from ppc_md to the new struct where they really belong to. This adds the requirement for @it_ops to be initialized before calling iommu_init_table() to make sure that we do not leave any IOMMU table with iommu_table_ops uninitialized. This is not a parameter of iommu_init_table() though as there will be cases when iommu_init_table() will not be called on TCE tables, for example - VFIO. This does s/tce_build/set/, s/tce_free/clear/ and removes "tce_" redundant prefixes. This removes tce_xxx_rm handlers from ppc_md but does not add them to iommu_table_ops as this will be done later if we decide to support TCE hypercalls in real mode. This removes _vm callbacks as only virtual mode is supported by now so this also removes @rm parameter. For pSeries, this always uses tce_buildmulti_pSeriesLP/ tce_buildmulti_pSeriesLP. This changes multi callback to fall back to tce_build_pSeriesLP/tce_free_pSeriesLP if FW_FEATURE_MULTITCE is not present. The reason for this is we still have to support "multitce=off" boot parameter in disable_multitce() and we do not want to walk through all IOMMU tables in the system and replace "multi" callbacks with single ones. For powernv, this defines _ops per PHB type which are P5IOC2/IODA1/IODA2. This makes the callbacks for them public. Later patches will extend callbacks for IODA1/2. No change in behaviour is expected. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:14:56 +10:00
Alexey Kardashevskiy	10b35b2b74	powerpc/powernv: Do not set "read" flag if direction==DMA_NONE Normally a bitmap from the iommu_table is used to track what TCE entry is in use. Since we are going to use iommu_table without its locks and do xchg() instead, it becomes essential not to put bits which are not implied in the direction flag as the old TCE value (more precisely - the permission bits) will be used to decide whether to put the page or not. This adds iommu_direction_to_tce_perm() (its counterpart is there already) and uses it for powernv's pnv_tce_build(). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:14:56 +10:00
Alexey Kardashevskiy	ac9a58891a	powerpc/iommu: Put IOMMU group explicitly So far an iommu_table lifetime was the same as PE. Dynamic DMA windows will change this and iommu_free_table() will not always require the group to be released. This moves iommu_group_put() out of iommu_free_table(). This adds a iommu_pseries_free_table() helper which does iommu_group_put() and iommu_free_table(). Later it will be changed to receive a table_group and we will have to change less lines then. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:14:55 +10:00
Alexey Kardashevskiy	c5773822c0	powerpc/powernv/ioda: Clean up IOMMU group registration The existing code has 3 calls to iommu_register_group() and all 3 branches actually cover all possible cases. This replaces 3 calls with one and moves the registration earlier; the latter will make more sense when we add TCE table sharing. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:14:54 +10:00
Alexey Kardashevskiy	4617082ec0	powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_group The set_iommu_table_base_and_group() name suggests that the function sets table base and add a device to an IOMMU group. The actual purpose for table base setting is to put some reference into a device so later iommu_add_device() can get the IOMMU group reference and the device to the group. At the moment a group cannot be explicitly passed to iommu_add_device() as we want it to work from the bus notifier, we can fix it later and remove confusing calls of set_iommu_table_base(). This replaces set_iommu_table_base_and_group() with a couple of set_iommu_table_base() + iommu_add_device() which makes reading the code easier. This adds few comments why set_iommu_table_base() and iommu_add_device() are called where they are called. For IODA1/2, this essentially removes iommu_add_device() call from the pnv_pci_ioda_dma_dev_setup() as it will always fail at this particular place: - for physical PE, the device is already attached by iommu_add_device() in pnv_pci_ioda_setup_dma_pe(); - for virtual PE, the sysfs entries are not ready to create all symlinks so actual adding is happening in tce_iommu_bus_notifier. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:14:54 +10:00
Alexey Kardashevskiy	ea30e99e8e	powerpc/eeh/ioda2: Use device::iommu_group to check IOMMU group This relies on the fact that a PCI device always has an IOMMU table which may not be the case when we get dynamic DMA windows so let's use more reliable check for IOMMU group here. As we do not rely on the table presence here, remove the workaround from pnv_pci_ioda2_set_bypass(); also remove the @add_to_iommu_group parameter from pnv_ioda_setup_bus_dma(). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-11 15:14:54 +10:00
Bjorn Helgaas	cd11433eda	PCI: Include <linux/pci.h>, not <asm/pci.h> We already include <asm/pci.h> from <linux/pci.h>, so just include <linux/pci.h> directly. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: linuxppc-dev@lists.ozlabs.org CC: linux-s390@vger.kernel.org	2015-06-08 07:55:03 -05:00
Jeremy Kerr	0d7cd8550d	powerpc/powernv: Add opal-prd channel This change adds a char device to access the "PRD" (processor runtime diagnostics) channel to OPAL firmware. Includes contributions from Vaidyanathan Srinivasan, Neelesh Gupta & Vishal Kulkarni. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-05 08:32:21 +10:00
Jeremy Kerr	594fcb9ec9	powerpc/powernv: Expose OPAL APIs required by PRD interface The (upcoming) opal-prd driver needs to access the message notifier and xscom code, so add EXPORT_SYMBOL_GPL macros for these. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-05 08:32:20 +10:00
Jeremy Kerr	48c0615495	powerpc/powernv: Merge common platform device initialisation opal_ipmi_init and opal_flash_init are equivalent, except for the compatbile string. Merge these two into a common opal_pdev_init function. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-05 08:32:20 +10:00
Cédric Le Goater	14aae78f08	powerpc/powernv: convert OPAL codes returned by sysparam calls The opal_{get,set}_param calls return internal error codes which need to be translated in errnos in Linux. Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-04 22:27:56 +10:00
Michael Neuling	ec249dd860	cxl: Move include file cxl.h -> cxl-base.h This moves the current include file from cxl.h -> cxl-base.h. This current include file is used only to pass information between the base driver that needs to be built into the kernel and the cxl module. This is to make way for a new include/misc/cxl.h which will contain just the kernel API for other driver to use Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-03 13:27:19 +10:00
Michael Neuling	7a8e6bbf85	powerpc/pci: Add shutdown hook to pci_controller_ops Currently pnv_pci_shutdown() calls the PHB shutdown code for all PHBs in the system. It dereferences the private_data assuming it's a powernv PHB, which won't be the case when we have different PHB in the systems (like when we add vPHBs for CXL). This moves the shutdown hook to the pci_controller_ops and fixes the call site to use that instead. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-03 13:27:16 +10:00
Xie Xiaobo	31ea9d5dfd	powerpc/85xx: p1025twr: add module conditional to fix QE-uart issue A ioport setting was needed when used the QE uart function on TWR-P1025. Added a conditional definition to avoid missing this setting when the QE-uart driver was bulit to a module. Signed-off-by: Xie Xiaobo <X.Xie@freescale.com> Signed-off-by: Li Pengbo <Pengbo.Li@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-06-02 21:37:29 -05:00
Kevin Hao	379caf6063	powerpc: mpc85xx: flush the l1 cache before cpu down in kexec We observe a "Zero PT_NOTE entries found" warning when vmcore_init() is running on the dump-capture kernel. Actually the PT_NOTE segments is not empty, but the entries generated by crash_save_cpu() are not flushed to the memory before we reset these cores. So we should flush the l1 cache as what we do in cpu hotplug. With this change, we can also kill the mpc85xx_smp_flush_dcache_kexec() since that becomes unnecessary. Please note: this only fix the issue on e500 core, we still need to implement the function to flush the l2 cache for the e500mc core. Fortunately we already had proposing patch for this support [1]. Hope we can fix this issue for e500mc after that merged. [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-March/115830.html Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-06-02 21:37:28 -05:00
Shengzhou Liu	65bf2a0570	powerpc/fsl-booke: Add T1023 RDB board support T1023RDB is a Freescale Reference Design Board that hosts T1023 SoC. T1023RDB board Overview ----------------------- - T1023 SoC integrating two 64-bit e5500 cores up to 1.4GHz - CoreNet fabric supporting coherent and noncoherent transactions with prioritization and bandwidth allocation - Memory: 2GB Micron MT40A512M8HX unbuffered 32-bit fixed DDR4 without ECC - Accelerator: DPAA components consist of FMan, BMan, QMan, DCE and SEC - Ethernet interfaces: - one 1G RGMII port on-board(RTL8211F PHY) - one 1G SGMII port on-board(RTL8211F PHY) - one 2.5G SGMII port on-board(AQR105 PHY) - PCIe: Two Mini-PCIe connectors on-board. - SerDes: 4 lanes up to 10.3125GHz - NOR: 128MB S29GL01GS110TFIV10 Spansion NOR Flash - NAND: 512MB S34MS04G200BFI000 Spansion NAND Flash - eSPI: 64MB S25FL512SAGMFI010 Spansion SPI flash - USB: one Type-A USB 2.0 port with internal PHY - eSDHC: support SD/MMC card and eMMC flash on-board - 256Kbit M24256 I2C EEPROM - RTC: Real-time clock DS1339 on I2C bus - UART: one serial port on-board with RJ45 connector - Debugging: JTAG/COP for T1023 debugging Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-06-02 21:37:21 -05:00
Shengzhou Liu	5afe13fd48	powerpc/fsl-booke: Add T1024 RDB board support T1024RDB is a Freescale Reference Design Board that hosts the T1024 SoC. Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> [scottwood: vendor prefix: s/at24/atmel/ and trimmed detailed board description with too-long lines] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-06-02 21:37:21 -05:00
Shengzhou Liu	2b6029e2e0	powerpc/fsl-booke: Add T1024 QDS board support Add support for Freescale T1024/T1023 QorIQ Development System Board. T1024QDS is a high-performance computing evaluation, development and test platform for T1024 QorIQ Power Architecture processor. Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> [scottwood: vendor prefix: s/at24/atmel/ and trimmed detailed board description with too-long lines] Signed-off-by: Scott Wood <scottwood@freescale.com>	2015-06-02 21:37:20 -05:00
Jiang Liu	c1231a784a	powerpc: Use irq_desc_get_xxx() to avoid redundant lookup of irq_desc Use irq_desc_get_xxx() to avoid redundant lookup of irq_desc while we already have a pointer to corresponding irq_desc. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-02 16:54:44 +10:00
Daniel Axtens	763d2d8df1	powerpc/powernv: Move dma_set_mask() from pnv_phb to pci_controller_ops Previously, dma_set_mask() on powernv was convoluted: 0) Call dma_set_mask() (a/p/kernel/dma.c) 1) In dma_set_mask(), ppc_md.dma_set_mask() exists, so call it. 2) On powernv, that function pointer is pnv_dma_set_mask(). In pnv_dma_set_mask(), the device is pci, so call pnv_pci_dma_set_mask(). 3) In pnv_pci_dma_set_mask(), call pnv_phb->set_dma_mask() if it exists. 4) It only exists in the ioda case, where it points to pnv_pci_ioda_dma_set_mask(), which is the final function. So the call chain is: dma_set_mask() -> pnv_dma_set_mask() -> pnv_pci_dma_set_mask() -> pnv_pci_ioda_dma_set_mask() Both ppc_md and pnv_phb function pointers are used. Rip out the ppc_md call, pnv_dma_set_mask() and pnv_pci_dma_set_mask(). Instead: 0) Call dma_set_mask() (a/p/kernel/dma.c) 1) In dma_set_mask(), the device is pci, and pci_controller_ops.dma_set_mask() exists, so call pci_controller_ops.dma_set_mask() 2) In the ioda case, that points to pnv_pci_ioda_dma_set_mask(). The new call chain is dma_set_mask() -> pnv_pci_ioda_dma_set_mask() Now only the pci_controller_ops function pointer is used. The fallback paths for p5ioc2 are the same. Previously, pnv_pci_dma_set_mask() would find no pnv_phb->set_dma_mask() function, to it would call __set_dma_mask(). Now, dma_set_mask() finds no ppc_md call or pci_controller_ops call, so it calls __set_dma_mask(). Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-02 13:18:49 +10:00
Daniel Axtens	92ae035326	powerpc/powernv: Specialise pci_controller_ops for each controller type Remove powernv generic PCI controller operations. Replace it with controller ops for each of the two supported PHBs. As an added bonus, make the two new structs const, which will help guard against bugs such as the one introduced in `65ebf4b63` ("powerpc/powernv: Move controller ops from ppc_md to controller_ops") Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-02 13:18:49 +10:00
Daniel Axtens	8392296697	powerpc/pasemi: Move MSI-related ops to pci_controller_ops Move the PaSemi MPIC msi subsystem to use the pci_controller_ops structure rather than ppc_md for MSI related PCI controller operations. As with fsl_msi, operations are plugged in at the subsys level, after controller creation. Again, we iterate over all controllers and populate them with the MSI ops. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-02 11:47:43 +10:00
Daniel Axtens	1d14b8755f	powerpc/pseries: Move MSI-related ops to pci_controller_ops Move the pseries platform to use the pci_controller_ops structure rather than ppc_md for MSI related PCI controller operations We need to iterate all PHBs because the MSI setup happens later than find_and_init_phbs() - mpe. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-02 11:47:10 +10:00
Daniel Axtens	7e3d6c5a4b	powerpc/cell: Move MSI-related ops to pci_controller_ops Move the Cell platform to use the pci_controller_ops structure rather than ppc_md for MSI related PCI controller operations. We can be confident that the functions will be added to the platform's ops struct before any PCI controller's ops struct is populated because: 1) These ops are added to the struct in a subsys initcall. We populate the ops in axon_msi_probe, which is the probe call for the axon-msi driver. However the driver is registered in axon_msi_init, which is a subsys initcall, so this will happen at the subsys level. 2) The controller recieves the struct later, in a device initcall. Cell populates the controller in cell_setup_phb, which is hooked up to ppc_md.pci_setup_phb. ppc_md.pci_setup_phb is only ever called in of_platform.c, as part of the OpenFirmware PCI driver's probe routine. That driver is registered in a device initcall, so it will occur after the struct is properly populated. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:50:55 +10:00
Daniel Axtens	d6381119a4	powerpc/powernv: Move MSI-related ops to pci_controller_ops Move the PowerNV/BML platform to use the pci_controller_ops structure rather than ppc_md for MSI related PCI controller operations. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:50:55 +10:00
Alistair Popple	81f2f7ce4c	opal: Remove events notifier All users of the old opal events notifier have been converted over to the irq domain so remove the event notifier functions. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:14:38 +10:00
Alistair Popple	8034f715f0	powernv/opal-dump: Convert to irq domain Convert the opal dump driver to the new opal irq domain. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:14:38 +10:00
Alistair Popple	74159a7028	powernv/elog: Convert elog to opal irq domain This patch converts the elog code to use the opal irq domain instead of notifier events. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:14:38 +10:00
Alistair Popple	a295af24d0	powernv/opal: Convert opal message events to opal irq domain This patch converts the opal message event to use the new opal irq domain. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:14:38 +10:00
Alistair Popple	79231448c9	powernv/eeh: Update the EEH code to use the opal irq domain The eeh code currently uses the old notifier method to get eeh events from OPAL. It also contains some logic to filter opal events which has been moved into the virtual irqchip. This patch converts the eeh code to the new event interface which simplifies event handling. Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:14:38 +10:00
Alistair Popple	9f0fd0499d	powerpc/powernv: Add a virtual irqchip for opal events Whenever an interrupt is received for opal the linux kernel gets a bitfield indicating certain events that have occurred and need handling by the various device drivers. Currently this is handled using a notifier interface where we call every device driver that has registered to receive opal events. This approach has several drawbacks. For example each driver has to do its own checking to see if the event is relevant as well as event masking. There is also no easy method of recording the number of times we receive particular events. This patch solves these issues by exposing opal events via the standard interrupt APIs by adding a new interrupt chip and domain. Drivers can then register for the appropriate events using standard kernel calls such as irq_of_parse_and_map(). Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:14:37 +10:00
Alistair Popple	96e023e753	powerpc/powernv: Reorder OPAL subsystem initialisation Most of the OPAL subsystems are always compiled in for PowerNV and many of them need to be initialised before or after other OPAL subsystems. Rather than trying to control this ordering through machine initcalls it is clearer and easier to control initialisation order with explicit calls in opal_init. Signed-off-by: Alistair Popple <alistair@popple.id.au> Cc: Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:14:37 +10:00
Shreyas B. Prabhu	5703d2f4a1	powerpc/powernv: Introduce sysfs control for fastsleep workaround behavior Fastsleep is one of the idle state which cpuidle subsystem currently uses on power8 machines. In this state L2 cache is brought down to a threshold voltage. Therefore when the core is in fastsleep, the communication between L2 and L3 needs to be fenced. But there is a bug in the current power8 chips surrounding this fencing. OPAL provides a workaround which precludes the possibility of hitting this bug. But running with this workaround applied causes checkstop if any correctable error in L2 cache directory is detected. Hence OPAL also provides a way to undo the workaround. In the existing implementation, workaround is applied by the last thread of the core entering fastsleep and undone by the first thread waking up. But this has a performance cost. These OPAL calls account for roughly 4000 cycles everytime the core has to enter or wakeup from fastsleep. This patch introduces a sysfs attribute (fastsleep_workaround_applyonce) to choose the behavior of this workaround. By default, fastsleep_workaround_applyonce = 0. In this case, workaround is applied/undone everytime the core enters/exits fastsleep. fastsleep_workaround_applyonce = 1. In this case the workaround is applied once on all the cores and never undone. This can be triggered by echo 1 > /sys/devices/system/cpu/fastsleep_workaround_applyonce For simplicity this attribute can be modified only once. Implying, once fastsleep_workaround_applyonce is changed to 1, it cannot be reverted to the default state. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:12:30 +10:00
Shreyas B. Prabhu	d405a98c70	powerpc/powernv: Move cpuidle related code from setup.c to new file This is a cleanup patch; doesn't change any functionality. Moves all cpuidle related code from setup.c to a new file. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> [mpe: Fix the SMP=n build by including asm/smp.h in idle.c] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-22 15:12:30 +10:00
Michael Ellerman	d4d4add9ea	powerpc: Little endian should depend on PPC_BOOK3S_64 The only little endian configuration we support is ppc64le, all other configurations are big endian. So we should only offer a choice of endian if we're building for 64-bit Book3S, ie. PPC_BOOK3S_64. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-18 16:48:11 +10:00
Wei Yang	e17866d559	powerpc/eeh: fix powernv_eeh_wait_state delay logic As the comment indicates, powernv_eeh_get_state() will inform EEH core to delay 1 second. This means the delay doesn't happen when powernv_eeh_get_state() returns. This patch moves the delay subtraction just before msleep(), which is the same logic in pseries_eeh_wait_state(). Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-13 14:00:07 +10:00
Wei Yang	2ac3990cc3	powerpc/eeh: fix comment for wait_state() To retrieve the PCI slot state, EEH driver would set a timeout for that. While current comment is not aligned to what the code does. This patch fixes those comments according to the code. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-13 14:00:07 +10:00
Joel Stanley	38c0488770	powerpc/powernv: Silence SYSPARAM warning on boot OpenPower BMC machines do not place any sysparams in the device tree, so at every boot we get a warning: [ 0.437176] SYSPARAM: Opal sysparam node not found Remove the warning, and reorder the init so we don't peform allocations when there is no sysparam node in the device tree. Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-11 20:26:42 +10:00
Michael Ellerman	e0d0059169	powerpc/vdso: Disable building the 32-bit VDSO on little endian The only little endian configuration we support is ppc64le. As such if we're building little endian we don't need a 32-bit VDSO, because there is no 32-bit userspace. This patch is a fairly ugly mess of #ifdefs, but is the minimal logic required to disable the 32-bit VDSO. We can hopefully clean up the result in future with some further refactoring. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-11 20:01:02 +10:00
Michael Ellerman	5af7a6f3e2	powerpc/pasemi: Only the build the pasemi MSI code for PASEMI=y The pasemi MSI code is currently always built when MPIC=y && PCI_MSI=y. It should not have any effect on other platforms, because it immediately checks the MPIC's compatible property for "pasemi,pwrficient-openpic". However it's odd that it's still built even when PASEMI=n. It also needn't be in sysdev, as it's only used by pasemi. So move it into platforms/pasemi, whereby it will only be built for PASEMI=y. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-11 19:55:25 +10:00
Nathan Fontenot	2222ce0fbb	powerpc/pseries: Fix possible leaked device node reference Failure return from dlpar_configure_connector when dlpar adding cpus results in leaking references to the cpus parent device node. Move the call to of_node_put() prior to checking the result of dlpar_configure_connector. Fixes: `8d5ff32076` ("powerpc/pseries: Make dlpar_configure_connector parent node aware") Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-11 16:58:20 +10:00
Nathan Fontenot	f32393c943	powerpc/pseries: Correct cpu affinity for dlpar added cpus The incorrect ordering of operations during cpu dlpar add results in invalid affinity for the cpu being added. The ibm,associativity property in the device tree is populated with all zeroes for the added cpu which results in invalid affinity mappings and all cpus appear to belong to node 0. This occurs because rtas configure-connector is called prior to making the rtas set-indicator calls. Phyp does not assign affinity information for a cpu until the rtas set-indicator calls are made to set the isolation and allocation state. Correct the order of operations to make the rtas set-indicator calls (done in dlpar_acquire_drc) before calling rtas configure-connector. Fixes: `1a8061c46c` ("powerpc/pseries: Add kernel based CPU DLPAR handling") Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-05-01 13:47:24 +10:00
Daniel Axtens	d33047fd7e	powerpc/powernv: Fix early pci_controller_ops loading. Load the PowerNV platform pci controller ops into pci controllers after all the operations are loaded into the platform ops struct, not before. Otherwise we aren't actually setting the ops properly which can break IO for some devices. Fixes: `65ebf4b63` ("powerpc/powernv: Move controller ops from ppc_md to controller_ops") Reported-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-04-29 19:43:58 +10:00
Linus Torvalds	9ec3a646fe	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only	2015-04-26 17:22:07 -07:00
Linus Torvalds	eadf16a912	This mostly includes the PPC changes for 4.1, which this time cover Book3S HV only (debugging aids, minor performance improvements and some cleanups). But there are also bug fixes and small cleanups for ARM, x86 and s390. The task_migration_notifier revert and real fix is still pending review, but I'll send it as soon as possible after -rc1. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJVOONLAAoJEL/70l94x66DbsMIAIpZPsaqgXOC1sDEiZuYay+6 rD4n4id7j8hIAzcf3AlZdyf5XgLlr6I1Zyt62s1WcoRq/CCnL7k9EljzSmw31WFX P2y7/J0iBdkn0et+PpoNThfL2GsgTqNRCLOOQlKgEQwMP9Dlw5fnUbtC1UchOzTg eAMeBIpYwufkWkXhdMw4PAD4lJ9WxUZ1eXHEBRzJb0o0ZxAATJ1tPZGrFJzoUOSM WsVNTuBsNd7upT02kQdvA1TUo/OPjseTOEoksHHwfcORt6bc5qvpctL3jYfcr7sk /L6sIhYGVNkjkuredjlKGLfT2DDJjSEdJb1k2pWrDRsY76dmottQubAE9J9cDTk= =OAi2 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull second batch of KVM changes from Paolo Bonzini: "This mostly includes the PPC changes for 4.1, which this time cover Book3S HV only (debugging aids, minor performance improvements and some cleanups). But there are also bug fixes and small cleanups for ARM, x86 and s390. The task_migration_notifier revert and real fix is still pending review, but I'll send it as soon as possible after -rc1" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits) KVM: arm/arm64: check IRQ number on userland injection KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi KVM: VMX: Preserve host CR4.MCE value while in guest mode. KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8 KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C KVM: PPC: Book3S HV: Streamline guest entry and exit KVM: PPC: Book3S HV: Use bitmap of active threads rather than count KVM: PPC: Book3S HV: Use decrementer to wake napping threads KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu KVM: PPC: Book3S HV: Minor cleanups KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update KVM: PPC: Book3S HV: Accumulate timing information for real-mode code KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT KVM: PPC: Book3S HV: Add ICP real mode counters KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock KVM: PPC: Book3S HV: Add guest->host real mode completion counters KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte ...	2015-04-26 13:06:22 -07:00
Michael Ellerman	e928e9cb36	KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation. Some PowerNV systems include a hardware random-number generator. This HWRNG is present on POWER7+ and POWER8 chips and is capable of generating one 64-bit random number every microsecond. The random numbers are produced by sampling a set of 64 unstable high-frequency oscillators and are almost completely entropic. PAPR defines an H_RANDOM hypercall which guests can use to obtain one 64-bit random sample from the HWRNG. This adds a real-mode implementation of the H_RANDOM hypercall. This hypercall was implemented in real mode because the latency of reading the HWRNG is generally small compared to the latency of a guest exit and entry for all the threads in the same virtual core. Userspace can detect the presence of the HWRNG and the H_RANDOM implementation by querying the KVM_CAP_PPC_HWRNG capability. The H_RANDOM hypercall implementation will only be invoked when the guest does an H_RANDOM hypercall if userspace first enables the in-kernel H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-04-21 15:21:29 +02:00
Linus Torvalds	d19d5efd8c	powerpc updates for 4.1 - Numerous minor fixes, cleanups etc. - More EEH work from Gavin to remove its dependency on device_nodes. - Memory hotplug implemented entirely in the kernel from Nathan Fontenot. - Removal of redundant CONFIG_PPC_OF by Kevin Hao. - Rewrite of VPHN parsing logic & tests from Greg Kurz. - A fix from Nish Aravamudan to reduce memory usage by clamping nodes_possible_map. - Support for pstore on powernv from Hari Bathini. - Removal of old powerpc specific byte swap routines by David Gibson. - Fix from Vasant Hegde to prevent the flash driver telling you it was flashing your firmware when it wasn't. - Patch from Ben Herrenschmidt to add an OPAL heartbeat driver. - Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan Stancek. - Some fixes for migration from Tyrel Datwyler. - A new syscall to switch the cpu endian by Michael Ellerman. - Large series from Wei Yang to implement SRIOV, reviewed and acked by Bjorn. - A fix for the OPAL sensor driver from Cédric Le Goater. - Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman. - Large series from Daniel Axtens to make our PCI hooks per PHB rather than per machine. - Small patch from Sam Bobroff to explicitly abort non-suspended transactions on syscalls, plus a test to exercise it. - Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu. - Small patch to enable the hard lockup detector from Anton Blanchard. - Fix from Dave Olson for missing L2 cache information on some CPUs. - Some fixes from Michael Ellerman to get Cell machines booting again. - Freescale updates from Scott: Highlights include BMan device tree nodes, an MSI erratum workaround, a couple minor performance improvements, config updates, and misc fixes/cleanup. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJVL2cxAAoJEFHr6jzI4aWAR8cP/19VTo/CzCE4ffPSx7qR464n F+WFZcbNjIMXu6+B0YLuJZEsuWtKKrCit/MCg3+mSgE4iqvxmtI+HDD0445Buszj UD4E4HMdPrXQ+KUSUDORvRjv/FFUXIa94LSv/0g2UeMsPz/HeZlhMxEu7AkXw9Nf rTxsmRTsOWME85Y/c9ss7XHuWKXT3DJV7fOoK9roSaN3dJAuWTtG3WaKS0nUu0ok 0M81D6ZczoD6ybwh2DUMPD9K6SGxLdQ4OzQwtW6vWzcQIBDfy5Pdeo0iAFhGPvXf T4LLPkv4cF4AwHsAC4rKDPHQNa+oZBoLlScrHClaebAlDiv+XYKNdMogawUObvSh h7avKmQr0Ygp1OvvZAaXLhuDJI9FJJ8lf6AOIeULgHsDR9SyKMjZWxRzPe11uarO Fyi0qj3oJaQu6LjazZraApu8mo+JBtQuD3z3o5GhLxeFtBBF60JXj6zAXJikufnl kk1/BUF10nKUhtKcDX767AMUCtMH3fp5hx8K/z9T5v+pobJB26Wup1bbdT68pNBT NjdKUppV6QTjZvCsA6U2/ECu6E9KeIaFtFSL2IRRoiI0dWBN5/5eYn3RGkO2ZFoL 1NdwKA2XJcchwTPkpSRrUG70sYH0uM2AldNYyaLfjzrQqza7Y6lF699ilxWmCN/H OplzJAE5cQ8Am078veTW =03Yh -----END PGP SIGNATURE----- Merge tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: - Numerous minor fixes, cleanups etc. - More EEH work from Gavin to remove its dependency on device_nodes. - Memory hotplug implemented entirely in the kernel from Nathan Fontenot. - Removal of redundant CONFIG_PPC_OF by Kevin Hao. - Rewrite of VPHN parsing logic & tests from Greg Kurz. - A fix from Nish Aravamudan to reduce memory usage by clamping nodes_possible_map. - Support for pstore on powernv from Hari Bathini. - Removal of old powerpc specific byte swap routines by David Gibson. - Fix from Vasant Hegde to prevent the flash driver telling you it was flashing your firmware when it wasn't. - Patch from Ben Herrenschmidt to add an OPAL heartbeat driver. - Fix for an oops causing get/put_cpu_var() imbalance in perf by Jan Stancek. - Some fixes for migration from Tyrel Datwyler. - A new syscall to switch the cpu endian by Michael Ellerman. - Large series from Wei Yang to implement SRIOV, reviewed and acked by Bjorn. - A fix for the OPAL sensor driver from Cédric Le Goater. - Fixes to get STRICT_MM_TYPECHECKS building again by Michael Ellerman. - Large series from Daniel Axtens to make our PCI hooks per PHB rather than per machine. - Small patch from Sam Bobroff to explicitly abort non-suspended transactions on syscalls, plus a test to exercise it. - Numerous reworks and fixes for the 24x7 PMU from Sukadev Bhattiprolu. - Small patch to enable the hard lockup detector from Anton Blanchard. - Fix from Dave Olson for missing L2 cache information on some CPUs. - Some fixes from Michael Ellerman to get Cell machines booting again. - Freescale updates from Scott: Highlights include BMan device tree nodes, an MSI erratum workaround, a couple minor performance improvements, config updates, and misc fixes/cleanup. * tag 'powerpc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (196 commits) powerpc/powermac: Fix build error seen with powermac smp builds powerpc/pseries: Fix compile of memory hotplug without CONFIG_MEMORY_HOTREMOVE powerpc: Remove PPC32 code from pseries specific find_and_init_phbs() powerpc/cell: Fix iommu breakage caused by controller_ops change powerpc/eeh: Fix crash in eeh_add_device_early() on Cell powerpc/perf: Cap 64bit userspace backtraces to PERF_MAX_STACK_DEPTH powerpc/perf/hv-24x7: Fail 24x7 initcall if create_events_from_catalog() fails powerpc/pseries: Correct memory hotplug locking powerpc: Fix missing L2 cache size in /sys/devices/system/cpu powerpc: Add ppc64 hard lockup detector support oprofile: Disable oprofile NMI timer on ppc64 powerpc/perf/hv-24x7: Add missing put_cpu_var() powerpc/perf/hv-24x7: Break up single_24x7_request powerpc/perf/hv-24x7: Define update_event_count() powerpc/perf/hv-24x7: Whitespace cleanup powerpc/perf/hv-24x7: Define add_event_to_24x7_request() powerpc/perf/hv-24x7: Rename hv_24x7_event_update powerpc/perf/hv-24x7: Move debug prints to separate function powerpc/perf/hv-24x7: Drop event_24x7_request() powerpc/perf/hv-24x7: Use pr_devel() to log message ... Conflicts: tools/testing/selftests/powerpc/Makefile tools/testing/selftests/powerpc/tm/Makefile	2015-04-16 13:53:32 -05:00
Joel Stanley	e243304d0a	powerpc/powernv: reboot when requested by firmware Use orderly_reboot so userspace will to shut itself down via the reboot path. This is required for graceful reboot initiated by the BMC, such as when a user uses ipmitool to issue a 'chassis power cycle' command. Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Fabian Frederick <fabf@skynet.be> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Kerr <jk@ozlabs.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-04-15 16:35:23 -07:00
David Howells	75c3cfa855	VFS: assorted weird filesystems: d_inode() annotations Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:06:58 -04:00
Guenter Roeck	2fe0753d49	powerpc/powermac: Fix build error seen with powermac smp builds powermac smp builds fail with arch/powerpc/platforms/powermac/smp.c: In function 'smp_psurge_probe': arch/powerpc/platforms/powermac/smp.c:278:3: error: 'return' with a value, in function returning void There are several instances of this error. Fixes: `a7f4ee1fe9` ("powerpc: Drop return value of smp_ops->probe()") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-04-15 15:52:59 +10:00
Alexey Kardashevskiy	16e00f5a5f	powerpc/pseries: Fix compile of memory hotplug without CONFIG_MEMORY_HOTREMOVE `51925fb3c5` "powerpc/pseries: Implement memory hotplug remove in the kernel" broke compile when CONFIG_MEMORY_HOTREMOVE is not defined due to missing symbols. This fixes the issue by adding the missing symbols. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-04-15 11:55:28 +10:00
Linus Torvalds	d0bbe0dd35	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "Usual trivial tree updates. Nothing outstanding -- mostly printk() and comment fixes and unused identifier removals" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: goldfish: goldfish_tty_probe() is not using 'i' any more powerpc: Fix comment in smu.h qla2xxx: Fix printks in ql_log message lib: correct link to the original source for div64_u64 si2168, tda10071, m88ds3103: Fix firmware wording usb: storage: Fix printk in isd200_log_config() qla2xxx: Fix printk in qla25xx_setup_mode init/main: fix reset_device comment ipwireless: missing assignment goldfish: remove unreachable line of code coredump: Fix do_coredump() comment stacktrace.h: remove duplicate declaration task_struct smpboot.h: Remove unused function prototype treewide: Fix typo in printk messages treewide: Fix typo in printk messages mod_devicetable: fix comment for match_flags	2015-04-14 09:50:27 -07:00
Daniel Axtens	ff7a2adac5	powerpc: Remove PPC32 code from pseries specific find_and_init_phbs() In `bdc728a849` ("powerpc: move find_and_init_phbs() to pSeries specific code"), find_and_init_phbs() was moved into a pseries specific file, but PPC32 code wasn't removed. Remove it. See https://lkml.kernel.org/r/552C0AA6.4010403@fau.de Reported-by: Andreas Ruprecht <andreas.ruprecht@fau.de> Fixes: `bdc728a849` Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-04-14 17:14:22 +10:00
Michael Ellerman	4acd09b4bf	powerpc/cell: Fix iommu breakage caused by controller_ops change The recent patch to convert cell to use pci_controller_ops had a small bug which broke machines using an iommu. The set of phb->controller_ops was added after the check for name != "pci", meaning pcix/pcie PHBs weren't getting their ops set correctly. Fixes: `9c1368fc50` ("powerpc/cell: Move controller ops from ppc_md to controller_ops") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-04-14 17:13:31 +10:00

1 2 3 4 5 ...

4701 commits