Commit graph

13405 commits

Author SHA1 Message Date
Paul Mackerras
56548fc0e8 powerpc/powernv: Return to cpu offline loop when finished in KVM guest
When a secondary hardware thread has finished running a KVM guest, we
currently put that thread into nap mode using a nap instruction in
the KVM code.  This changes the code so that instead of doing a nap
instruction directly, we instead cause the call to power7_nap() that
put the thread into nap mode to return.  The reason for doing this is
to avoid having the KVM code having to know what low-power mode to
put the thread into.

In the case of a secondary thread used to run a KVM guest, the thread
will be offline from the point of view of the host kernel, and the
relevant power7_nap() call is the one in pnv_smp_cpu_disable().
In this case we don't want to clear pending IPIs in the offline loop
in that function, since that might cause us to miss the wakeup for
the next time the thread needs to run a guest.  To tell whether or
not to clear the interrupt, we use the SRR1 value returned from
power7_nap(), and check if it indicates an external interrupt.  We
arrange that the return from power7_nap() when we have finished running
a guest returns 0, so pending interrupts don't get flushed in that
case.

Note that it is important a secondary thread that has finished
executing in the guest, or that didn't have a guest to run, should
not return to power7_nap's caller while the kvm_hstate.hwthread_req
flag in the PACA is non-zero, because the return from power7_nap
will reenable the MMU, and the MMU might still be in guest context.
In this situation we spin at low priority in real mode waiting for
hwthread_req to become zero.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-08 13:16:31 +11:00
Alexei Starovoitov
89aa075832 net: sock: allow eBPF programs to be attached to sockets
introduce new setsockopt() command:

setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, &prog_fd, sizeof(prog_fd))

where prog_fd was received from syscall bpf(BPF_PROG_LOAD, attr, ...)
and attr->prog_type == BPF_PROG_TYPE_SOCKET_FILTER

setsockopt() calls bpf_prog_get() which increments refcnt of the program,
so it doesn't get unloaded while socket is using the program.

The same eBPF program can be attached to multiple sockets.

User task exit automatically closes socket which calls sk_filter_uncharge()
which decrements refcnt of eBPF program

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-05 21:47:32 -08:00
Mahesh Salgaonkar
682e77c861 powerpc/book3s: Fix partial invalidation of TLBs in MCE code.
The existing MCE code calls flush_tlb hook with IS=0 (single page) resulting
in partial invalidation of TLBs which is not right. This patch fixes
that by passing IS=0xc00 to invalidate whole TLB for successful recovery
from TLB and ERAT errors.

Cc: stable@vger.kernel.org
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-05 16:26:21 +11:00
Aneesh Kumar K.V
aefa5688c0 powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault
upatepp can get called for a nohpte fault when we find from the linux
page table that the translation was hashed before. In that case
we are sure that there is no existing translation, hence we could
avoid doing tlbie.

We could possibly race with a parallel fault filling the TLB. But
that should be ok because updatepp is only ever relaxing permissions.
We also look at linux pte permission bits when filling hash pte
permission bits. We also hold the linux pte busy bits while
inserting/updating a hashpte entry, hence a paralle update of
linux pte is not possible. On the other hand mprotect involves
ptep_modify_prot_start which cause a hpte invalidate and not updatepp.

Performance number:
We use randbox_access_bench written by Anton.

Kernel with THP disabled and smaller hash page table size.

    86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
     2.10%  random_access_b  random_access_bench              [.] doit
     1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
     1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
     1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
     1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
     0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
     0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
     0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
     0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
     0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm

With Fix:

    27.54%  random_access_b  random_access_bench              [.] doit
    22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
     5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
     5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
     5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
     4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
     3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
     1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-05 16:26:15 +11:00
Julia Lawall
d83480b061 crypto: powerpc - replace memset by memzero_explicit
Memset on a local variable may be removed when it is called just before the
variable goes out of scope.  Using memzero_explicit defeats this
optimization.  A simplified version of the semantic patch that makes this
change is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
identifier x;
type T;
@@

{
... when any
T x[...];
... when any
    when exists
- memset
+ memzero_explicit
  (x,
-0,
  ...)
... when != x
    when strict
}
// </smpl>

This change was suggested by Daniel Borkmann <dborkman@redhat.com>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-12-02 22:55:50 +08:00
Michael Ellerman
abb90ee7bc powerpc/xmon: Cleanup the breakpoint flags
Drop BP_IABR_TE, which though used, does not do anything useful. Rename
BP_IABR to BP_CIABR. Renumber the flags.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-02 14:23:04 +11:00
Anshuman Khandual
1ad7d70562 powerpc/xmon: Enable HW instruction breakpoint on POWER8
This patch enables support for hardware instruction breakpoint in xmon
on POWER8 platform with the help of a new register called the CIABR
(Completed Instruction Address Breakpoint Register). With this patch, a
single hardware instruction breakpoint can be added and cleared during
any active xmon debug session. The hardware based instruction breakpoint
mechanism works correctly with the existing TRAP based instruction
breakpoint available on xmon.

There are no powerpc CPU with CPU_FTR_IABR feature any more. This patch
has re-purposed all the existing IABR related code to work with CIABR
register based HW instruction breakpoint.

This has one odd feature, which is that when we hit a breakpoint xmon
doesn't tell us we have hit the breakpoint. This is because xmon is
expecting bp->address == regs->nip. Because CIABR fires on completition
regs->nip points to the instruction after the breakpoint. We could fix
that, but it would then confuse other parts of the xmon code which think
we need to emulate the instruction. [mpe]

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
2014-12-02 14:23:04 +11:00
Michael Ellerman
b5be75d008 Merge remote-tracking branch 'benh/next' into next
Merge updates collected & acked by Ben. A few EEH patches from Gavin,
some mm updates from Aneesh and a few odds and ends.
2014-12-02 14:19:20 +11:00
Aneesh Kumar K.V
d557b09800 powerpc/mm/thp: Use tlbiel if possible
If we know that user address space has never executed on other cpus
we could use tlbiel.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 14:10:11 +11:00
Aneesh Kumar K.V
f1581bf14b powerpc/mm/thp: Remove code duplication
Rename invalidate_old_hpte to flush_hash_hugepage and use that in
other places.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 14:10:10 +11:00
James Yang
c4f3eb5fc5 powerpc/mm/hugetlb: Sanity check gigantic hugepage count
Limit the number of gigantic hugepages specified by the
hugepages= parameter to MAX_NUMBER_GPAGES.

Signed-off-by: James Yang <James.Yang@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 14:10:09 +11:00
Jiang Lu
0de3b56b13 powerpc/oprofile: Disable pagefaults during user stack read
A page fault occurred during reading user stack in oprofile backtrace
would lead following calltrace:

WARNING: at linux/kernel/smp.c:210
Modules linked in:
CPU: 5 PID: 736 Comm: sh Tainted: G W 3.14.23-WR7.0.0.0_standard #1
task: c0000000f6208bc0 ti: c00000007c72c000 task.ti: c00000007c72c000
NIP: c0000000000ed6e4 LR: c0000000000ed5b8 CTR: 0000000000000000
REGS: c00000007c72f050 TRAP: 0700 Tainted: G W (3.14.23-WR7.0.0
tandard)
MSR: 0000000080021000 <CE,ME> CR: 48222482 XER: 00000000
SOFTE: 0
GPR00: c0000000000ed5b8 c00000007c72f2d0 c0000000010aa048 0000000000000005
GPR04: c000000000fdb820 c00000007c72f410 0000000000000001 0000000000000005
GPR08: c0000000010b5768 c000000000f8a048 0000000000000001 0000000000000000
GPR12: 0000000048222482 c00000000fffe580 0000000022222222 0000000010129664
GPR16: 0000000010143cc0 0000000000000000 0000000044444444 0000000000000000
GPR20: c00000007c7221d8 c0000000f638e3c8 000003f15a20120d 0000000000000001
GPR24: 000000005a20120d c00000007c722000 c00000007cdedda8 00003fffef23b160
GPR28: 0000000000000001 c00000007c72f410 c000000000fdb820 0000000000000006
NIP [c0000000000ed6e4] .smp_call_function_single+0x18c/0x248
LR [c0000000000ed5b8] .smp_call_function_single+0x60/0x248
Call Trace:
[c00000007c72f2d0] [c0000000000ed5b8] .smp_call_function_single+0x60/0x248 (unreliable)
[c00000007c72f3a0] [c000000000030810] .__flush_tlb_page+0x164/0x1b0
[c00000007c72f460] [c00000000002e054] .ptep_set_access_flags+0xb8/0x168
[c00000007c72f500] [c0000000001ad3d8] .handle_mm_fault+0x4a8/0xbac
[c00000007c72f5e0] [c000000000bb3238] .do_page_fault+0x3b8/0x868
[c00000007c72f810] [c00000000001e1d0] storage_fault_common+0x20/0x44
 Exception: 301 at .__copy_tofrom_user_base+0x54/0x5b0
    LR = .op_powerpc_backtrace+0x190/0x20c
[c00000007c72fb00] [c000000000a2ec34] .op_powerpc_backtrace+0x204/0x20c (unreliable)
[c00000007c72fbc0] [c000000000a2b5fc] .oprofile_add_ext_sample+0xe8/0x118
[c00000007c72fc70] [c000000000a2eee0] .fsl_emb_handle_interrupt+0x20c/0x27c
[c00000007c72fd30] [c000000000a2e440] .op_handle_interrupt+0x44/0x58
[c00000007c72fdb0] [c000000000016d68] .performance_monitor_exception+0x74/0x90
[c00000007c72fe30] [c00000000001d8b4] exc_0x260_common+0xfc/0x100

performance_monitor_exception() is executed in a context with interrupt
disabled and preemption enabled. When there is a user space page fault
happened, do_page_fault() invoke in_atomic() to decide whether kernel
should handle such page fault. in_atomic() only check preempt_count.
So need call pagefault_disable() to disable preemption before reading
user stack.

Signed-off-by: Jiang Lu <lu.jiang@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 14:10:08 +11:00
Aneesh Kumar K.V
0ec2698fe1 powerpc/mm: Check for matching hpte without taking hpte lock
With smaller hash page table config, we would end up in situation
where we would be replacing hash page table slot frequently. In
such config, we will find the hpte to be not matching, and we
can do that check without holding the hpte lock. We need to
recheck the hpte again after holding lock.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 11:03:45 +11:00
Greg Kurz
221195fb80 powerpc: Drop useless warning in eeh_init()
This is what we get in dmesg when booting a pseries guest and
the hypervisor doesn't provide EEH support.

[    0.166655] EEH functionality not supported
[    0.166778] eeh_init: Failed to call platform init function (-22)

Since both powernv_eeh_init() and pseries_eeh_init() already complain when
hitting an error, it is not needed to print more (especially such an
uninformative message).

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 11:03:45 +11:00
Mahesh Salgaonkar
6d626c5ea3 powerpc/powernv: Cleanup unused MCE definitions/declarations.
Cleanup OpalMCE_* definitions/declarations and other related code which
is not used anymore.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Benjamin Herrrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 11:03:45 +11:00
Gavin Shan
a450e8f55a powerpc/eeh: Dump PHB diag-data early
On PowerNV platform, PHB diag-data is dumped after stopping device
drivers. In case of recursive EEH errors, the kernel is usually
crashed before dumping PHB diag-data for the second EEH error. It's
hard to locate the root cause of the second EEH error without PHB
diag-data.

The patch adds one more EEH option "eeh=early_log", which helps
dumping PHB diag-data immediately once frozen PE is detected, in
order to get the PHB diag-data for the second EEH error.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 11:03:26 +11:00
Gavin Shan
b1d76a7d57 powerpc/eeh: Recover EEH error on ownership change for BCM5719
In PCI passthrou scenario, we need simulate EEH recovery for Emulex
adapters when their ownership changes, as we did in commit 5cfb20b96
("powerpc/eeh: Emulate EEH recovery for VFIO devices"). Broadcom
BCM5719 adpaters are facing same problem and needs same cure.

Reported-by: Rajeshkumar Subramanian <rajeshkumars@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 11:03:26 +11:00
Gavin Shan
28bf36f92a powerpc/eeh: Set EEH_PE_RESET on PE reset
The patch introduces additional flag EEH_PE_RESET to indicate the
corresponding PE is under reset. In turn, the PE retrieval bakcend
on PowerNV platform can return unfrozen state for the EEH core to
moving forward. Flag EEH_PE_CFG_BLOCKED isn't the correct one for
the purpose.

In PCI passthrou case, the problem is more worse: Guest doesn't
recover 6th EEH error. The PE is left in isolated (frozen) and
config blocked state on Broadcom adapters. We can't retrieve the
PE's state correctly any more, even from the host side via sysfs
/sys/bus/pci/devices/xxx/eeh_pe_state.

Reported-by: Rajeshkumar Subramanian <rajeshkumars@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 11:03:26 +11:00
Gavin Shan
b85743ee95 powerpc/eeh: Refactor eeh_reset_pe()
The patch refactors eeh_reset_pe() in order for:

   * Varied return values for different failure cases.
   * Replace pr_err() with pr_warn() and print function name.
   * Coding style cleanup.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-12-02 11:03:26 +11:00
Neelesh Gupta
8de303bae4 hwmon: (ibmpowernv) Use platform 'id_table' to probe the device
The current driver probe() function assumes the sensor device to be
always present and gets executed every time if the driver is loaded,
but the appropriate hardware could not be present.

So, move the platform device creation as part of platform init code
and use the 'id_table' to check if the device is present or not.

Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2014-11-30 20:13:13 -08:00
Paul Gortmaker
cd4c910e00 netpoll: delete defconfig references to obsolete NETPOLL_TRAP
In commit 9c62a68d13 ("netpoll:
Remove dead packet receive code (CONFIG_NETPOLL_TRAP)") this
Kconfig option was removed.  So remove references to it from
all defconfigs as well.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-29 21:13:48 -08:00
David S. Miller
60b7379dc5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-11-29 20:47:48 -08:00
Linus Torvalds
21f122f472 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
 "Here are five fixes for you to pull please.

  They're all CC'ed to stable except the "Fix PE state format" one which
  went in this release"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc: 32 bit getcpu VDSO function uses 64 bit instructions
  powerpc/powernv: Replace OPAL_DEASSERT_RESET with EEH_RESET_DEACTIVATE
  powerpc/eeh: Fix PE state format
  powerpc/pseries: Fix endiannes issue in RTAS call from xmon
  powerpc/powernv: Fix the hmi event version check.
2014-11-27 18:23:41 -08:00
Anton Blanchard
152d44a853 powerpc: 32 bit getcpu VDSO function uses 64 bit instructions
I used some 64 bit instructions when adding the 32 bit getcpu VDSO
function. Fix it.

Fixes: 18ad51dd34 ("powerpc: Add VDSO version of getcpu")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-27 09:42:12 +11:00
Gavin Shan
360d88a9e3 powerpc/powernv: Replace OPAL_DEASSERT_RESET with EEH_RESET_DEACTIVATE
The flag passed to ioda_eeh_phb_reset() should be EEH_RESET_DEACTIVATE,
which is translated to OPAL_DEASSERT_RESET or something else by the
EEH backend accordingly.

The patch replaces OPAL_DEASSERT_RESET with EEH_RESET_DEACTIVATE for
ioda_eeh_phb_reset().

Cc: stable@vger.kernel.org
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-27 09:40:32 +11:00
Gavin Shan
7531473c30 powerpc/eeh: Fix PE state format
Obviously I had wrong format given to the PE state output from
/sys/bus/pci/devices/xxxx/eeh_pe_state with some typoes, which
was introduced by commit 2013add4ce. The patch fixes it up.

Fixes: 2013add4ce ("powerpc/eeh: Show hex prefix for PE state sysfs")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-27 09:32:58 +11:00
Laurent Dufour
3b8a3c0109 powerpc/pseries: Fix endiannes issue in RTAS call from xmon
On pseries system (LPAR) xmon failed to enter when running in LE mode,
system is hunging. Inititating xmon will lead to such an output on the
console:

SysRq : Entering xmon
cpu 0x15: Vector: 0  at [c0000003f39ffb10]
    pc: c00000000007ed7c: sysrq_handle_xmon+0x5c/0x70
    lr: c00000000007ed7c: sysrq_handle_xmon+0x5c/0x70
    sp: c0000003f39ffc70
   msr: 8000000000009033
  current = 0xc0000003fafa7180
  paca    = 0xc000000007d75e80	 softe: 0	 irq_happened: 0x01
    pid   = 14617, comm = bash
Bad kernel stack pointer fafb4b0 at eca7cc4
cpu 0x15: Vector: 300 (Data Access) at [c000000007f07d40]
    pc: 000000000eca7cc4
    lr: 000000000eca7c44
    sp: fafb4b0
   msr: 8000000000001000
   dar: 10000000
 dsisr: 42000000
  current = 0xc0000003fafa7180
  paca    = 0xc000000007d75e80	 softe: 0	 irq_happened: 0x01
    pid   = 14617, comm = bash
cpu 0x15: Exception 300 (Data Access) in xmon, returning to main loop
xmon: WARNING: bad recursive fault on cpu 0x15

The root cause is that xmon is calling RTAS to turn off the surveillance
when entering xmon, and RTAS is requiring big endian parameters.

This patch is byte swapping the RTAS arguments when running in LE mode.

Cc: stable@vger.kernel.org
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-27 09:32:57 +11:00
Mahesh Salgaonkar
6acbc5a1da powerpc/powernv: Fix the hmi event version check.
The current HMI event structure is an ABI and carries a version field to
accommodate future changes without affecting/rearranging current structure
members that are valid for previous versions.

The current version check "if (hmi_evt->version != OpalHMIEvt_V1)"
doesn't accomodate the fact that the version number may change in
future.

If firmware starts returning an HMI event with version > 1, this check
will fail and no HMI information will be printed on older kernels.

This patch fixes this issue.

Cc: stable@vger.kernel.org # 3.17+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
[mpe: Reword changelog]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-27 09:32:52 +11:00
Grant Likely
f5242e5a88 of/reconfig: Always use the same structure for notifiers
The OF_RECONFIG notifier callback uses a different structure depending
on whether it is a node change or a property change. This is silly, and
not very safe. Rework the code to use the same data structure regardless
of the type of notifier.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Cc: <linuxppc-dev@lists.ozlabs.org>
2014-11-24 22:25:03 +00:00
Kees Cook
5d26a105b5 crypto: prefix module autoloading with "crypto-"
This prefixes all crypto module loading with "crypto-" so we never run
the risk of exposing module auto-loading to userspace via a crypto API,
as demonstrated by Mathias Krause:

https://lkml.org/lkml/2013/3/4/70

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2014-11-24 22:43:57 +08:00
Benjamin Herrenschmidt
31345e1a07 powerpc/pci: Remove unused force_32bit_msi quirk
This is now fully replaced with the generic "no_64bit_msi" one
that is set by the respective drivers directly.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-11-24 14:36:02 +11:00
Benjamin Herrenschmidt
415072a041 powerpc/pseries: Honor the generic "no_64bit_msi" flag
Instead of the arch specific quirk which we are deprecating

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
2014-11-24 14:36:02 +11:00
Benjamin Herrenschmidt
360743814c powerpc/powernv: Honor the generic "no_64bit_msi" flag
Instead of the arch specific quirk which we are deprecating
and that drivers don't understand.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
2014-11-24 14:36:01 +11:00
Thomas Gleixner
280510f106 PCI/MSI: Rename mask/unmask_msi_irq treewide
The PCI/MSI irq chip callbacks mask/unmask_msi_irq have been renamed
to pci_msi_mask/unmask_irq to mark them PCI specific. Rename all usage
sites. The conversion helper functions are kept around to avoid
conflicts in next and will be removed after merging into mainline.

Coccinelle assisted conversion. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: x86@kernel.org
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Mohit Kumar <mohit.kumar@st.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yijing Wang <wangyijing@huawei.com>
2014-11-23 13:01:45 +01:00
Jiang Liu
83a18912b0 PCI/MSI: Rename write_msi_msg() to pci_write_msi_msg()
Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI
specific.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-23 13:01:45 +01:00
Jiang Liu
891d4a48f7 PCI/MSI: Rename __read_msi_msg() to __pci_read_msi_msg()
Rename __read_msi_msg() to __pci_read_msi_msg() and kill unused
read_msi_msg(). It's a preparation to separate generic MSI code from
PCI core.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-11-23 13:01:45 +01:00
David S. Miller
1459143386 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ieee802154/fakehard.c

A bug fix went into 'net' for ieee802154/fakehard.c, which is removed
in 'net-next'.

Add build fix into the merge from Stephen Rothwell in openvswitch, the
logging macros take a new initial 'log' argument, a new call was added
in 'net' so when we merge that in here we have to explicitly add the
new 'log' arg to it else the build fails.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-21 22:28:24 -05:00
Masanari Iida
6774def642 treewide: fix typo in printk and Kconfig
This patch fix spelling typo in printk and Kconfig within
various part of kernel sources.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-11-20 14:56:11 +01:00
Jiri Kosina
a02001086b Merge Linus' tree to be be to apply submitted patches to newer code than
current trivial.git base
2014-11-20 14:42:02 +01:00
Al Viro
a455589f18 assorted conversions to %p[dD]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-19 13:01:20 -05:00
Michael Ellerman
e39f223fc9 powerpc: Remove more traces of bootmem
Although we are now selecting NO_BOOTMEM, we still have some traces of
bootmem lying around. That is because even with NO_BOOTMEM there is
still a shim that converts bootmem calls into memblock calls, but
ultimately we want to remove all traces of bootmem.

Most of the patch is conversions from alloc_bootmem() to
memblock_virt_alloc(). In general a call such as:

  p = (struct foo *)alloc_bootmem(x);

Becomes:

  p = memblock_virt_alloc(x, 0);

We don't need the cast because memblock_virt_alloc() returns a void *.
The alignment value of zero tells memblock to use the default alignment,
which is SMP_CACHE_BYTES, the same value alloc_bootmem() uses.

We remove a number of NULL checks on the result of
memblock_virt_alloc(). That is because memblock_virt_alloc() will panic
if it can't allocate, in exactly the same way as alloc_bootmem(), so the
NULL checks are and always have been redundant.

The memory returned by memblock_virt_alloc() is already zeroed, so we
remove several memsets of the result of memblock_virt_alloc().

Finally we convert a few uses of __alloc_bootmem(x, y, MAX_DMA_ADDRESS)
to just plain memblock_virt_alloc(). We don't use memblock_alloc_base()
because MAX_DMA_ADDRESS is ~0ul on powerpc, so limiting the allocation
to that is pointless, 16XB ought to be enough for anyone.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-19 21:41:51 +11:00
Li Zhong
a49ab6eeeb powerpc/pseries: Initialise nvram_pstore_info's buf_lock
nvram_pstore_info's buf_lock is not initialized before registering,
which is clearly incorrect.

It causes some strange behavior when trying to obtain the lock during
kdump process.

On a UP configuration, the console stopped for a couple of seconds, then
"lockup suspected" warning printed out, but then it continued to run.

So try lock fails, and lockup reported, but then arch_spin_lock()
passes.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
[mpe: Edited changelog]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-19 16:09:13 +11:00
Denis Kirjanov
cadaecd218 PPC: bpf_jit_comp: Unify BPF_MOD | BPF_X and BPF_DIV | BPF_X
Reduce duplicated code by unifying
BPF_ALU | BPF_MOD | BPF_X and BPF_ALU | BPF_DIV | BPF_X

CC: Alexei Starovoitov<alexei.starovoitov@gmail.com>
CC: Daniel Borkmann<dborkman@redhat.com>
CC: Philippe Bergheaud<felix@linux.vnet.ibm.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-18 13:20:09 -05:00
Joerg Roedel
0690cbd2e5 powerpc/iommu: Rename iommu_[un]map_sg functions
The IOMMU-API gained support for a new iommu_map_sg
function. This causes compile failures on powerpc because
the function name is already globally used there.
This patch renames adds a ppc_ prefix to these functions to
solve the compile problem.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
2014-11-18 11:30:01 +01:00
Michael Ellerman
35891d40bf Merge remote-tracking branch 'scottwood/next' into next
Scott says:

"Highlights include a bunch of 8xx optimizations, device tree bindings
for Freescale BMan, QMan, and FMan datapath components, misc device tree
updates, and inbound rio window support."
2014-11-18 17:00:38 +11:00
Kevin Hao
d7ce437749 powerpc/fsl_msi: mark the msi cascade handler IRQF_NO_THREAD
The commit 543c043cba ("powerpc/fsl_msi: change the irq handler from
chained to normal") changes the msi cascade handler from chained to
normal. Since cascade handler must run in hard interrupt context, this
will cause kernel panic if we force threading of all the interrupt
handler via kernel command parameter 'threadirqs'. So mark the irq
handler IRQF_NO_THREAD explicitly.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-17 22:00:30 -06:00
Prabhakar Kushwaha
76f3e2929b powerpc/config: Enable memory driver
As Freescale IFC controller has been moved to driver to driver/memory.

So enable memory driver in powerpc config

Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-17 19:36:42 -06:00
Will Deacon
fb7332a9fe mmu_gather: move minimal range calculations into generic code
On architectures with hardware broadcasting of TLB invalidation messages
, it makes sense to reduce the range of the mmu_gather structure when
unmapping page ranges based on the dirty address information passed to
tlb_remove_tlb_entry.

arm64 already does this by directly manipulating the start/end fields
of the gather structure, but this confuses the generic code which
does not expect these fields to change and can end up calculating
invalid, negative ranges when forcing a flush in zap_pte_range.

This patch moves the minimal range calculation out of the arm64 code
and into the generic implementation, simplifying zap_pte_range in the
process (which no longer needs to care about start/end, since they will
point to the appropriate ranges already). With the range being tracked
by core code, the need_flush flag is dropped in favour of checking that
the end of the range has actually been set.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-11-17 10:12:42 +00:00
Neelesh Gupta
16b1d26e77 rtc/tpo: Driver to support rtc and wakeup on PowerNV platform
The patch implements the OPAL rtc driver that binds with the rtc
driver subsystem. The driver uses the platform device infrastructure
to probe the rtc device and register it to rtc class framework. The
'wakeup' is supported depending upon the property 'has-tpo' present
in the OF node. It provides a way to load the generic rtc driver in
in the absence of an OPAL driver.

The patch also moves the existing OPAL rtc get/set time interfaces to the
new driver and exposes the necessary OPAL calls using EXPORT_SYMBOL_GPL.

Test results:
-------------
Host:
[root@tul169p1 ~]# ls -l /sys/class/rtc/
total 0
lrwxrwxrwx 1 root root 0 Oct 14 03:07 rtc0 -> ../../devices/opal-rtc/rtc/rtc0
[root@tul169p1 ~]# cat /sys/devices/opal-rtc/rtc/rtc0/time
08:10:07
[root@tul169p1 ~]# echo `date '+%s' -d '+ 2 minutes'` > /sys/class/rtc/rtc0/wakealarm
[root@tul169p1 ~]# cat /sys/class/rtc/rtc0/wakealarm
1413274345
[root@tul169p1 ~]#

FSP:
$ smgr mfgState
standby
$ rtim timeofday

System time is valid: 2014/10/14 08:12:04.225115

$ smgr mfgState
ipling
$

CC: devicetree@vger.kernel.org
CC: tglx@linutronix.de
CC: rtc-linux@googlegroups.com
CC: a.zummo@towertech.it
Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-17 18:04:01 +11:00
Vineeth Vijayan
59994fb01a powerpc: Use generic PIE randomization
Back in 2009 we merged 501cb16d3c "Randomise PIEs", which added support for
randomizing PIE (Position Independent Executable) binaries.

That commit added randomize_et_dyn(), which correctly randomized the addresses,
but failed to honor PF_RANDOMIZE. That means it was not possible to disable PIE
randomization via the personality flag, or /proc/sys/kernel/randomize_va_space.

Since then there has been generic support for PIE randomization added to
binfmt_elf.c, selectable via ARCH_BINFMT_ELF_RANDOMIZE_PIE.

Enabling that allows us to drop randomize_et_dyn(), which means we start
honoring PF_RANDOMIZE correctly.

It also causes a fairly major change to how we layout PIE binaries.

Currently we will place the binary at 512MB-520MB for 32 bit binaries, or
512MB-1.5GB for 64 bit binaries, eg:

    $ cat /proc/$$/maps
    4e550000-4e580000 r-xp 00000000 08:02 129813       /bin/dash
    4e580000-4e590000 rw-p 00020000 08:02 129813       /bin/dash
    10014110000-10014140000 rw-p 00000000 00:00 0      [heap]
    3fffaa3f0000-3fffaa5a0000 r-xp 00000000 08:02 921  /lib/powerpc64le-linux-gnu/libc-2.19.so
    3fffaa5a0000-3fffaa5b0000 rw-p 001a0000 08:02 921  /lib/powerpc64le-linux-gnu/libc-2.19.so
    3fffaa5c0000-3fffaa5d0000 rw-p 00000000 00:00 0
    3fffaa5d0000-3fffaa5f0000 r-xp 00000000 00:00 0    [vdso]
    3fffaa5f0000-3fffaa620000 r-xp 00000000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so
    3fffaa620000-3fffaa630000 rw-p 00020000 08:02 1246 /lib/powerpc64le-linux-gnu/ld-2.19.so
    3ffffc340000-3ffffc370000 rw-p 00000000 00:00 0    [stack]

With this commit applied we don't do any special randomisation for the binary,
and instead rely on mmap randomisation. This means the binary ends up at high
addresses, eg:

    $ cat /proc/$$/maps
    3fff99820000-3fff999d0000 r-xp 00000000 08:02 921    /lib/powerpc64le-linux-gnu/libc-2.19.so
    3fff999d0000-3fff999e0000 rw-p 001a0000 08:02 921    /lib/powerpc64le-linux-gnu/libc-2.19.so
    3fff999f0000-3fff99a00000 rw-p 00000000 00:00 0
    3fff99a00000-3fff99a20000 r-xp 00000000 00:00 0      [vdso]
    3fff99a20000-3fff99a50000 r-xp 00000000 08:02 1246   /lib/powerpc64le-linux-gnu/ld-2.19.so
    3fff99a50000-3fff99a60000 rw-p 00020000 08:02 1246   /lib/powerpc64le-linux-gnu/ld-2.19.so
    3fff99a60000-3fff99a90000 r-xp 00000000 08:02 129813 /bin/dash
    3fff99a90000-3fff99aa0000 rw-p 00020000 08:02 129813 /bin/dash
    3fffc3de0000-3fffc3e10000 rw-p 00000000 00:00 0      [stack]
    3fffc55e0000-3fffc5610000 rw-p 00000000 00:00 0      [heap]

Although this should be OK, it's possible it might break badly written
binaries that make assumptions about the address space layout.

Signed-off-by: Vineeth Vijayan <vvijayan@mvista.com>
[mpe: Rewrite changelog]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-17 17:55:11 +11:00
Gavin Shan
cf2b1e0eb0 powerpc/powernv: Fix potential zero devisor
If there're no PHBs under P5IOC2 HUB device tree node, we should
bail early to avoid zero devisor and allocating TCE tables.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:25 +11:00
Gavin Shan
ec8e4e9d3d powerpc/powernv: Bail upon invalid master PE
When freezing compound PEs in pnv_ioda_freeze_pe(), we should bail
upon illegal master PE. We needn't freeze slave PE because it should
have been put into frozen state by hardware.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:25 +11:00
Gavin Shan
4773f76b61 powerpc/powernv: Simplify pnv_ioda_configure_pe()
Nested if statements are always bad and the patch avoids one by
checking PHB type and bail in advance if necessary.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:24 +11:00
Gavin Shan
b131a8425c powerpc/powernv: Set PELTV for compound PEs
Commit 262af55 ("powerpc/powernv: Enable M64 aperatus for PHB3")
introduced compound PEs in order to support M64 aperatus on PHB3.
However, we never configured PELTV for compound PEs. The patch
fixes that by: parent PE can freeze all child compound PEs. Any
compound PE affects the group.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:24 +11:00
Gavin Shan
4b82ab1803 powerpc/powernv: Initialize M64 PE in time
The patch initializes PE instance when reserving PE number to
keep consistent things as we did before. Also, it replaces the
iteration on bridge's windows with the prefered way.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:23 +11:00
Gavin Shan
5ef7356781 powerpc/powernv: Rename alloc_m64_pe() to reserve_m64_pe()
The patch renames alloc_m64_pe() to reserve_m64_pe() to reflect
its real usage: We reserve PE numbers for M64 segments in advance
and then pick up the reserved PE numbers when building the mapping
between PE numbers and M64 segments.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:23 +11:00
Gavin Shan
9e9e893521 powerpc/powernv: Fix condition to remove M64
The M64 resource should be removed if we don't have hook to
initialize it, or (not and) fail to do that.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:22 +11:00
Gavin Shan
1665c4a892 powerpc/powernv: Check PHB type in advance
The patch checks PHB type a bit early to save a bit cycles
for P7 because we don't support M64 for P7IOC no matter what
OPAL firmware we have.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:22 +11:00
Aneesh Kumar K.V
b30e759072 powerpc/mm: Switch to generic RCU get_user_pages_fast
This patch switch the ppc arch to use the generic RCU based
gup implementation.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:21 +11:00
Aneesh Kumar K.V
f30c59e921 mm: Update generic gup implementation to handle hugepage directory
Update generic gup implementation with powerpc specific details.
On powerpc at pmd level we can have hugepte, normal pmd pointer
or a pointer to the hugepage directory.

Tested-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:21 +11:00
Aneesh Kumar K.V
06743521d0 powerpc/mm: Add missing pmd accessors
This patch add documentation and missing accessors.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:21 +11:00
Aneesh Kumar K.V
9e819963b4 powerpc: Disable CPU_FTR_TM if TM is disabled by firmware
Firmware is allowed to communicate to us via the "ibm,pa-features" property
that TM (Transactional Memory) support is disabled.

Currently this doesn't happen on any platform we're aware of, but we should
honor it anyway.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-14 17:24:20 +11:00
David S. Miller
076ce44825 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/chelsio/cxgb4vf/sge.c
	drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c

sge.c was overlapping two changes, one to use the new
__dev_alloc_page() in net-next, and one to use s->fl_pg_order in net.

ixgbe_phy.c was a set of overlapping whitespace changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-14 01:01:12 -05:00
Martijn de Gouw
e6a546fd03 powerpc/fsl-rio: add support for mapping inbound windows
Add support for mapping and unmapping of inbound rapidio windows.  This
allows for drivers to open up a part of local memory on the rapidio
network.  Also applications can use this and tranfer blocks of data
over the network.

Signed-off-by: Martijn de Gouw <martijn.de.gouw@prodrive-technologies.com>
[scottwood@freescale.com: updated commit message based on review]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-12 23:56:16 -06:00
Michael Ellerman
d8ee6f34fd powerpc/xmon: Fix build when 4xx=y and 44x=n
dump_tlb_44x() is only defined when 44x=y, but the ifdef in xmon.c
checks for 4xx, leading to a build failure:

  arch/powerpc/xmon/xmon.c:912:4: error: implicit declaration of function 'dump_tlb_44x'

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-12 16:54:54 +11:00
Michael Ellerman
40cc76e271 Merge branch 'topic/opal-ipmi' into next 2014-11-12 16:33:53 +11:00
Boqun Feng
e7a7a65ed2 powerpc: Fix comment typos in arch/powerpc/include/asm/bitops.h
In arch/powerpc/include/asm/bitops.h, the comments about bit numbers in
large (> 1 word) bitmaps have two typos:
- On ppc64 system, the LSB of the 4th word should be bit 192 rather than
  196, because if it's bit 196, bit 192-195 will be missing in the
  bitmap.
- On ppc32 system, the LSB of the second word should be bit 32 rather
  than 31, because bit 31 is already in the first word.

This patch fixes these typos.

Signed-off-by: Boqun Feng <boqun.feng@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-12 16:31:46 +11:00
Jeremy Kerr
608b286d1d powerpc/powernv: Add OPAL IPMI interface
Recent OPAL firmare adds a couple of functions to send and receive IPMI
messages:

  https://github.com/open-power/skiboot/commit/b2a374da

This change updates the token list and wrappers to suit, and adds the
platform devices for any IPMI interfaces.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-12 16:28:49 +11:00
Paul Mackerras
7048c84694 powerpc: Fix compilation of emulate_step()
Commit be96f63375 ("powerpc: Split out instruction analysis
part of emulate_step()") added some calls to do_fp_load()
and do_fp_store(), which fail to compile on configs with
CONFIG_PPC_FPU=n and CONFIG_PPC_EMULATE_SSTEP=y.  This fixes
the compile by adding #ifdef CONFIG_PPC_FPU around the code
that calls these functions.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-12 15:54:29 +11:00
Suresh E. Warrier
8b91a25546 powerpc: Save/restore PPR for KVM hypercalls
The system call FLIH (first-level interrupt handler) at 0xc00
unconditionally sets hardware priority to medium. For hypercalls, this
means we lose guest OS priority. The front end (do_kvm_0x**) to the
KVM interrupt handler always assumes that PPR priority is saved in
PACA exception save area, so it copies this to the kvm_hstate
structure. For hypercalls, this would be the saved priority from any
previous exception. Eventually, the guest gets resumed with an
incorrect priority.

The fix is to save the PPR priority in PACA exception save area before
switching HMT priorities in the FLIH so that existing code described above
in the KVM interrupt handler can copy it from there into the VCPU's saved
context.

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
[mpe: Dropped HMT_MEDIUM_PPR_DISCARD and reworded comment]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-12 15:53:25 +11:00
Gavin Shan
d1d5304fcb powerpc/mm: Use PAGE_FACTOR
PAGE_FACTOR was defined to reflect the difference between configured
page size and fixed 4KB page size. Replace (PAGE_SHIFT - HW_PAGE_SHIFT)
with PAGE_FACTOR.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-12 13:54:36 +11:00
Anton Blanchard
cd32e2dcc9 powerpc: Fix bad NULL pointer check in udbg_uart_getc_poll()
We have some code in udbg_uart_getc_poll() that tries to protect
against a NULL udbg_uart_in, but gets it all wrong.

Found with the LLVM static analyzer (scan-build).

Fixes: 309257484c ("powerpc: Cleanup udbg_16550 and add support for LPC PIO-only UARTs")
Signed-off-by: Anton Blanchard <anton@samba.org>
[mpe: Add some newlines for readability while we're here]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-12 13:47:20 +11:00
Denis Kirjanov
5b61c4db49 PPC: bpf_jit_comp: add SKF_AD_HATYPE instruction
Add BPF extension SKF_AD_HATYPE to ppc JIT to check
the hw type of the interface

Before:
[   57.723666] test_bpf: #20 LD_HATYPE
[   57.723675] BPF filter opcode 0020 (@0) unsupported
[   57.724168] 48 48 PASS

After:
[  103.053184] test_bpf: #20 LD_HATYPE 7 6 PASS

CC: Alexei Starovoitov<alexei.starovoitov@gmail.com>
CC: Daniel Borkmann<dborkman@redhat.com>
CC: Philippe Bergheaud<felix@linux.vnet.ibm.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>

v2: address Alexei's comments
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-11 13:39:47 -05:00
Eric Dumazet
2c8c56e15d net: introduce SO_INCOMING_CPU
Alternative to RPS/RFS is to use hardware support for multiple
queues.

Then split a set of million of sockets into worker threads, each
one using epoll() to manage events on its own socket pool.

Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
know after accept() or connect() on which queue/cpu a socket is managed.

We normally use one cpu per RX queue (IRQ smp_affinity being properly
set), so remembering on socket structure which cpu delivered last packet
is enough to solve the problem.

After accept(), connect(), or even file descriptor passing around
processes, applications can use :

 int cpu;
 socklen_t len = sizeof(cpu);

 getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);

And use this information to put the socket into the right silo
for optimal performance, as all networking stack should run
on the appropriate cpu, without need to send IPI (RPS/RFS).

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-11 13:00:06 -05:00
Steven Rostedt (Red Hat)
4fd3279b48 ftrace: Add more information to ftrace_bug() output
With the introduction of the dynamic trampolines, it is useful that if
things go wrong that ftrace_bug() produces more information about what
the current state is. This can help debug issues that may arise.

Ftrace has lots of checks to make sure that the state of the system it
touchs is exactly what it expects it to be. When it detects an abnormality
it calls ftrace_bug() and disables itself to prevent any further damage.
It is crucial that ftrace_bug() produces sufficient information that
can be used to debug the situation.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Borislav Petkov <bp@suse.de>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-11-11 12:42:13 -05:00
Julia Lawall
6609ed14de powerpc/gamecube/wii: delete unneeded test before of_node_put
Simplify the error path to avoid calling of_node_put when it is not needed.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:35 +11:00
Julia Lawall
498b651472 powerpc/pseries: delete unneeded test before of_node_put
Of_node_put supports NULL as its argument, so the initial test is not
necessary.

Suggested by Uwe Kleine-König.

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e;
@@

-if (e)
   of_node_put(e);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:35 +11:00
Julia Lawall
756d36790a powerpc/mpc5xxx: delete unneeded test before of_node_put
Of_node_put supports NULL as its argument, so the initial test is not
necessary.

Suggested by Uwe Kleine-König.

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e;
@@

-if (e)
   of_node_put(e);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:34 +11:00
Julia Lawall
4cdd641ddd powerpc/fsl: fsl_soc: delete unneeded test before of_node_put
Of_node_put supports NULL as its argument, so the initial test is not
necessary.

Suggested by Uwe Kleine-König.

The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e;
@@

-if (e)
   of_node_put(e);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:34 +11:00
Julia Lawall
0f9da5cb74 powerpc/4xx/cpm: delete unneeded test before of_node_put
Simplify the error path to avoid calling of_node_put when it is not needed.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:34 +11:00
Anton Blanchard
20f1aae6cb powerpc/pseries: Quieten relocation on exceptions warning
The H_SET_MODE hcall returns H_P2 if a function is not implemented
and all callers should handle this case.

The call to enable relocation on exceptions currently prints an error
message if the feature is not implemented. While H_SET_MODE was
first introduced on POWER8 (which has relocation on exceptions), it
has been now added on some POWER7 configurations (which does not).

Check for H_P2 and print an informational message instead.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:33 +11:00
Anton Blanchard
7aa189c8f5 powerpc/pseries: Quieten ibm,pcie-link-speed-stats warning
The ibm,pcie-link-speed-stats isn't mandatory, so we shouldn't print
a high priority error message when missing. One example where we see
this is QEMU.

Reduce it to pr_debug.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:33 +11:00
Anton Blanchard
ecaf5fa0a5 powerpc: LLVM complains about forward declaration of struct rtas_sensors
Move the declaration up to silence the warning.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:32 +11:00
Anton Blanchard
6f791bef76 powerpc: Remove double braces in alignment code.
Looks like I introduced this when adding LE support.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:32 +11:00
Anton Blanchard
a3e5b356b3 powerpc: Don't use local named register variable in current_thread_info
LLVM doesn't support local named register variables and is unlikely
to. current_thread_info is using one, fix it by moving it out and
calling it __current_r1().

I gave it a bit of an obscure name because we don't want anyone else
using it - they should use current_stack_pointer(). This specific
case is performance critical and we can't afford to call a function
to get it. Furthermore it isn't important to know exactly where in
the stack we are since we mask the lower bits.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:31 +11:00
Michael Ellerman
e88f157de5 powerpc: Remove unused vgacon_remap_base & fix build break
The build is broken with CONFIG_PPC32=y, CONFIG_FB_VGA16=y and
CONFIG_VGA_CONSOLE=n.

The problem is that vgacon_remap_base is not defined. It's used in:

    #define VGA_MAP_MEM(x,s) (x + vgacon_remap_base)

Which is used in the vga16fb.c code.

Digging down it seems vgacon_remap_base is never initialised. It used to
be, back in arch/ppc (pplus.c and prep_setup.c), but none of that code
ever made it to arch/powerpc.

So given it's been unused for >6 years, remove it.

Whether vga16fb.c works on 32-bit is another question, but this patch
shouldn't affect it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:31 +11:00
Anton Blanchard
1bc9e47aa8 powerpc/jump_label: Use HAVE_JUMP_LABEL
Commit d4fe0965e2 ("powerpc/jump_label: use HAVE_JUMP_LABEL?")
missed a few conversions. Change the remaining uses of
CONFIG_JUMP_LABEL to HAVE_JUMP_LABEL.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:30 +11:00
Simon Kagstrom
4b4b13d5fe powerpc/boot: Parse chosen/cmdline-timeout parameter
On some platforms a 5 second timeout during boot might be quite long, so
make it configurable. Run the loop at least once to let the user stop
the boot by holding a key pressed. If the timeout is set to 0, don't
wait for input, which can be used as a workaround if the boot hangs on
random data coming in on the serial port.

Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
[mpe: Changelog wording & whitespace]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:30 +11:00
Michael Ellerman
90029640fd powerpc: Remove unused CPU_FTRS_A2
In commit fb5a515704 "Remove platforms/wsp and associated pieces" we
removed the last user of CPU_FTRS_A2, so we should remove it too.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:29 +11:00
Michael Ellerman
66f3d4fe0b powerpc: Remove CPU_FTR_HVMODE from CPU_FTRS_ALWAYS
We potentially clear CPU_FTR_HVMODE at runtime in __init_hvmode_206(),
so we must make sure it's not set in CPU_FTRS_ALWAYS.

This doesn't hurt us in practice at the moment, because we don't support
compiling only for CPUs that support CPU_FTR_HVMODE.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:29 +11:00
Jiri Slaby
60878dfb11 powerpc/ftrace: Fix obsolete comment
CONFIG_MCOUNT is not defined anymore, the corresponding #ifdef there
is CONFIG_FUNCTION_TRACER.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:29 +11:00
Kyle McMartin
dedd24a12f powerpc: Remove unused devm_ioremap_prot()
Added in 2008, but has never had any in-tree users, and no other
architectures provide it.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:28 +11:00
Anton Blanchard
b3c18725a0 powerpc/ftrace: simplify prepare_ftrace_return
Instead of passing in the stack address of the link register
to be modified, just pass in the old value and return the
new value and rely on ftrace_graph_caller to do the
modification.

This removes the exception handling around the stack update -
it isn't needed and we weren't consistent about it. Later on
we would do an unprotected modification:

       if (!ftrace_graph_entry(&trace)) {
               *parent = old;

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:28 +11:00
Anton Blanchard
7d56c65a6f powerpc/ftrace: Remove mod_return_to_handler
mod_return_to_handler is the same as return_to_handler, except
it handles the change of the TOC (r2). Add this into
return_to_handler and remove mod_return_to_handler.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:27 +11:00
Anton Blanchard
6e4c632cdf powerpc: make __ffs return unsigned long
I'm seeing a build warning in mm/nobootmem.c after removing
bootmem:

mm/nobootmem.c: In function '__free_pages_memory':
include/linux/kernel.h:713:17: warning: comparison of distinct pointer types lacks a cast [enabled by default]
  (void) (&_min1 == &_min2);  \
                 ^
mm/nobootmem.c:90:11: note: in expansion of macro 'min'
   order = min(MAX_ORDER - 1UL, __ffs(start));
           ^

The rest of the worlds seems to define __ffs as returning unsigned long,
so lets do that.

Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:27 +11:00
Anton Blanchard
21098b9e07 powerpc: Move sparse_init() into initmem_init
We did part of sparse initialisation in setup_arch and part in
initmem_init. Put them together.

Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:26 +11:00
Anton Blanchard
68cf0d642f powerpc: Remove superfluous bootmem includes
Lots of places included bootmem.h even when not using bootmem.

Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:26 +11:00
Anton Blanchard
14ed740957 powerpc: Remove some old bootmem related comments
Now bootmem is gone from powerpc we can remove comments mentioning it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:25 +11:00
Anton Blanchard
10239733ee powerpc: Remove bootmem allocator
At the moment we transition from the memblock alloctor to the bootmem
allocator. Gitting rid of the bootmem allocator removes a bunch of
complicated code (most of which I owe the dubious honour of being
responsible for writing).

Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-10 09:59:25 +11:00
Emil Medve
58810cb7f6 powerpc/dts: Add node(s) for the platform PLL
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Change-Id: If76cd705a01813abe53396c1486bc13c4289ee92
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:50 -06:00
Emil Medve
eaffcb0f1b powerpc/dts: Factorize the clock control node
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Change-Id: I25ce24a25862b4ca460164159867abefe00ccdd1
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:49 -06:00
Hongtao Jia
94701fcb2f powerpc: Add INA220 to device tree for supported boards
Including: P3041DS P5020DS P5040DS B4QDS

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:48 -06:00
Hongtao Jia
3b6b17900b powerpc: Add ADT7461 to device tree for supported boards
Including: T104xRDB T208xQDS B4QDS

Signed-off-by: Jia Hongtao <hongtao.jia@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:48 -06:00
Igal Liberman
19bc4808f9 powerpc/fsl: Added rcw registers to global utility registers
The RCW registers are required for the future clock binding implementation.

Signed-off-by: Igal Liberman <Igal.Liberman@freescale.com>
Change-Id: Ic36dd8bc2959aa7f97fb6fd7bbb8420822fef0a9
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:47 -06:00
Ashish Kumar
45c22ed744 powerpc/mpc85xx: Remove SPI and NAND partition from bsc9131rdb.dtsi
* Run "mtdparts default" on u-boot to create dynamic partitions
 * Or use dynamic mtd partition with the help of bootargs in u-boot
   Append bootargs with:
    "mtdparts=ff800000.flash:1m(nand_uboot),512K(nand_dtb),8m(nand_kernel),-(fs);\
     spiff707000.0:1m(spi_uboot),4m(spi_kernel),512k(spi_dtb),-(fs)'"

Signed-off-by: Ashish Kumar <Ashish.Kumar@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:47 -06:00
Paul Bolle
6f2ce34dd7 powerpc/8xx: Remove Kconfig symbol FADS
Commit 39eb56da2b ("pcmcia: Remove m8xx_pcmcia driver") removed the
only driver that used CONFIG_FADS. Setting the Kconfig symbol FADS is
pointless since that commit. Remove it.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:46 -06:00
LEROY Christophe
c51a6821bd powerpc/8xx: Invalidate non present TLB as early as possible
8xx sometimes need to load a invalid/non-present TLBs in
it DTLB asm handler.

These must be invalidated separaly as linux mm doesn't.

Commit 5efab4a02c was invalidating them in
arch/powerpc/mm/fault.c.
This patch does the invalidation earlier in order to free the TLB as soon as
possible. This also has the advantage of removing some 8xx specific code from
fault.c

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:45 -06:00
LEROY Christophe
83c17ba35e powerpc/8xx: Use DAR to save r3 for CPU6 ERRATA
As we are not using anymore DAR to save registers, it is now available for
saving the r3 register used for CPU6 ERRATA handling. Therefore we can
remove the major hack which was to use memory location 0 to save r3.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:45 -06:00
LEROY Christophe
b0168eb97b powerpc/8xx: Don't restore regs to save them again.
There is not need to restore r10, r11 and cr registers at this end of ITLBmiss
handler as they are saved again to the same place in ITLBError handler we are
jumping to.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:44 -06:00
LEROY Christophe
c9a803fb17 powerpc/8xx: _PMD_PRESENT already set in level 1 entries
When a PMD entry is valid, _PMD_PRESENT is set. Therefore, forcing that bit
during TLB loading is useless.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:44 -06:00
LEROY Christophe
4094f28f90 powerpc/8xx: set PTE bit 22 off TLBmiss
No need to re-set this bit at each TLB miss. Let's set it in the PTE.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:43 -06:00
LEROY Christophe
d3e40262e7 powerpc/8xx: Better readibility of ERRATA CPU6 handling
This patch hiddes that SPR address needed for CPU6 ERRATA handling in the macro.
Then we don't have to worry about this address directly in the code.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:42 -06:00
LEROY Christophe
959d6173b5 powerpc/8xx: Implement 16k pages
This patch activates the handling of 16k pages on the MPC8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:42 -06:00
LEROY Christophe
ac21951fa8 powerpc/8xx: Const for TLB RPN forced value
Value 0x00f0 is used to force bits in TLB level 2 entry. This value is linked
to the page size and will vary when we change the page size. Lets define a const
for it in order to have it at only one place.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:41 -06:00
LEROY Christophe
d14068035c powerpc/8xx: Use PAGE size related consts
For PAGE size related operations, use PAGE size consts in order to be able to
use different page size in the futur.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:41 -06:00
LEROY Christophe
33fb845a6f powerpc/8xx: Don't use MD_TWC for walk
MD_TWC can only be used properly with 4k pages.
So lets calculate level 2 table index by ourselves.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:40 -06:00
LEROY Christophe
cbc130f120 powerpc/8xx: Use M_TW instead of M_TWB
Use M_TW instead of M_TWB for storing Level 1 table address as M_TWB requires
4k aligned tables, which is only the case with 4k pages.
Consequently, we have to calculate the level 1 table index by ourselves.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:40 -06:00
LEROY Christophe
6cde2b6f39 powerpc/8xx: No need to restore registers and save them again.
In DTLBError handler there is not need to restore r10, r11 and cr registers
after fixing DAR as they are saved again to the same place just after.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:39 -06:00
LEROY Christophe
749137a251 powerpc/8xx: DataAccess exception not generated by MPC8xx
DataAccess exception is never generated by MPC8xx so do the job directly where
it is used to avoid an unnecessary branching.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:38 -06:00
LEROY Christophe
7439b37e75 powerpc/8xx: exception InstructionAccess does not exist on MPC8xx
Exception InstructionAccess does not exist on MPC8xx. No need to branch there from somewhere else.
Handling can be done directly in InstructionTLBError Exception.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-11-07 18:10:38 -06:00
Anton Blanchard
16d0f5c4af powerpc: Remove ppc_md.remove_memory
We have an extra level of indirection on memory hot remove which is not
matched on memory hot add. Memory hotplug is book3s only, so there is
no need for it.

This also enables means remove_memory() (ie memory hot unplug) works
on powernv.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-05 21:00:46 +11:00
Anton Blanchard
64ff91ff85 powerpc: Remove ppc64_boot_msg
ppc64_boot_msg is meant to be a boot debug aid, but
is only used in one spot. Get rid of it, and save
ourseleves a couple of lines in the kernel log
buffer.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-05 21:00:46 +11:00
Anton Blanchard
adb7cd7322 powerpc/pci: Quieten unset I/O resource warning
Newer POWER designs do not implement PCI I/O space, so we
expect to see a number of these.

Reduce the severity of the warning so it doesn't mask other
real issues.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-05 21:00:46 +11:00
Anton Blanchard
7b051f665c powerpc: Use probe_kernel_address in show_instructions
We really don't want to take a pagefault in show_instructions,
so use probe_kernel_address instead of __get_user.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-05 21:00:45 +11:00
Michael Ellerman
dd521d1eb4 Merge branch 'topic/get-cpu-var' into next 2014-11-05 21:00:31 +11:00
Michael Ellerman
8418804ef2 Merge branch 'topic/pm-power-off' into next 2014-11-05 21:00:19 +11:00
Linus Torvalds
8a97577a59 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
 "Some more powerpc fixes if you please"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc: use device_online/offline() instead of cpu_up/down()
  powerpc/powernv: Properly fix LPC debugfs endianness
  powerpc: do_notify_resume can be called with bad thread_info flags argument
  powerpc/fadump: Fix endianess issues in firmware assisted dump handling
  powerpc: Fix section mismatch warning
2014-11-04 11:18:29 -08:00
Greg Kroah-Hartman
a8a93c6f99 Merge branch 'platform/remove_owner' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into driver-core-next
Remove all .owner fields from platform drivers
2014-11-03 19:53:56 -08:00
Denis Kirjanov
4e23576113 PPC: bpf_jit_comp: add SKF_AD_PKTTYPE instruction
Add BPF extension SKF_AD_PKTTYPE to ppc JIT to load
skb->pkt_type field.

Before:
[   88.262622] test_bpf: #11 LD_IND_NET 86 97 99 PASS
[   88.265740] test_bpf: #12 LD_PKTTYPE 109 107 PASS

After:
[   80.605964] test_bpf: #11 LD_IND_NET 44 40 39 PASS
[   80.607370] test_bpf: #12 LD_PKTTYPE 9 9 PASS

CC: Alexei Starovoitov<alexei.starovoitov@gmail.com>
CC: Michael Ellerman<mpe@ellerman.id.au>
Cc: Matt Evans <matt@ozlabs.org>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>

v2: Added test rusults
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03 15:29:42 -05:00
Al Viro
946e51f2bf move d_rcu from overlapping d_child to overlapping d_alias
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-11-03 15:20:29 -05:00
Alexander Graf
9178ba294b powerpc: Convert power off logic to pm_power_off
The generic Linux framework to power off the machine is a function pointer
called pm_power_off. The trick about this pointer is that device drivers can
potentially implement it rather than board files.

Today on powerpc we set pm_power_off to invoke our generic full machine power
off logic which then calls ppc_md.power_off to invoke machine specific power
off.

However, when we want to add a power off GPIO via the "gpio-poweroff" driver,
this card house falls apart. That driver only registers itself if pm_power_off
is NULL to ensure it doesn't override board specific logic. However, since we
always set pm_power_off to the generic power off logic (which will just not
power off the machine if no ppc_md.power_off call is implemented), we can't
implement power off via the generic GPIO power off driver.

To fix this up, let's get rid of the ppc_md.power_off logic and just always use
pm_power_off as was intended. Then individual drivers such as the GPIO power off
driver can implement power off logic via that function pointer.

With this patch set applied and a few patches on top of QEMU that implement a
power off GPIO on the virt e500 machine, I can successfully turn off my virtual
machine after halt.

Signed-off-by: Alexander Graf <agraf@suse.de>
[mpe: Squash into one patch and update changelog based on cover letter]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-03 12:12:51 +11:00
Christoph Lameter
69111bac42 powerpc: Replace __get_cpu_var uses
This still has not been merged and now powerpc is the only arch that does
not have this change. Sorry about missing linuxppc-dev before.

V2->V2
  - Fix up to work against 3.18-rc1

__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x).  This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.

Other use cases are for storing and retrieving data from the current
processors percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as :

__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.

This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset.  Thereby address calculations are avoided and less registers
are used when code is generated.

At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.

The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e.  using a global
register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processors instance of a per cpu
variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y)

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y)
	__get_cpu_var(y) = x;

   Converts to

	__this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++

   Converts to

	__this_cpu_inc(y)

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
[mpe: Fix build errors caused by set/or_softirq_pending(), and rework
      assignment in __set_breakpoint() to use memcpy().]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-03 12:12:32 +11:00
Dan Streetman
10ccaf178b powerpc: use device_online/offline() instead of cpu_up/down()
In powerpc pseries platform dlpar operations, use device_online() and
device_offline() instead of cpu_up() and cpu_down().

Calling cpu_up/down() directly does not update the cpu device offline
field, which is used to online/offline a cpu from sysfs. Calling
device_online/offline() instead keeps the sysfs cpu online value
correct. The hotplug lock, which is required to be held when calling
device_online/offline(), is already held when dlpar_online/offline_cpu()
are called, since they are called only from cpu_probe|release_store().

This patch fixes errors on phyp (PowerVM) systems that have cpu(s)
added/removed using dlpar operations; without this patch, the
/sys/devices/system/cpu/cpuN/online nodes do not correctly show the
online state of added/removed cpus.

Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Fixes: 0902a9044f ("Driver core: Use generic offline/online for CPU offline/online")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-02 10:55:56 +11:00
Linus Torvalds
5656b408ff Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, plus on the kernel side:

   - a revert for a newly introduced PMU driver which isn't complete yet
     and where we ran out of time with fixes (to be tried again in
     v3.19) - this makes up for a large chunk of the diffstat.

   - compilation warning fixes

   - a printk message fix

   - event_idx usage fixes/cleanups"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf probe: Trivial typo fix for --demangle
  perf tools: Fix report -F dso_from for data without branch info
  perf tools: Fix report -F dso_to for data without branch info
  perf tools: Fix report -F symbol_from for data without branch info
  perf tools: Fix report -F symbol_to for data without branch info
  perf tools: Fix report -F mispredict for data without branch info
  perf tools: Fix report -F in_tx for data without branch info
  perf tools: Fix report -F abort for data without branch info
  perf tools: Make CPUINFO_PROC an array to support different kernel versions
  perf callchain: Use global caching provided by libunwind
  perf/x86/intel: Revert incomplete and undocumented Broadwell client support
  perf/x86: Fix compile warnings for intel_uncore
  perf: Fix typos in sample code in the perf_event.h header
  perf: Fix and clean up initialization of pmu::event_idx
  perf: Fix bogus kernel printk
  perf diff: Add missing hists__init() call at tool start
2014-10-31 14:01:47 -07:00
Benjamin Herrenschmidt
325e411404 powerpc/powernv: Properly fix LPC debugfs endianness
Endian is hard, especially when I designed a stupid FW interface, and
I should know better... oh well, this is attempt #2 at fixing this
properly. This time it seems to work with all access sizes and I
can run my flashing tool (which exercises all sort of access sizes
and types to access the SPI controller in the BMC) just fine.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-31 17:09:04 +11:00
Anton Blanchard
808be31426 powerpc: do_notify_resume can be called with bad thread_info flags argument
Back in 7230c56441 ("powerpc: Rework lazy-interrupt handling") we
added a call out to restore_interrupts() (written in c) before calling
do_notify_resume:

        bl      restore_interrupts
        addi    r3,r1,STACK_FRAME_OVERHEAD
        bl      do_notify_resume

Unfortunately do_notify_resume takes two arguments, the second one
being the thread_info flags:

void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)

We do populate r4 (the second argument) earlier, but
restore_interrupts() is free to muck it up all it wants. My guess is
the gcc compiler gods shone down on us and its register allocator
never used r4. Sometimes, rarely, luck is on our side.

LLVM on the other hand did trample r4.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-31 16:52:46 +11:00
Hari Bathini
408cddd96e powerpc/fadump: Fix endianess issues in firmware assisted dump handling
Firmware-assisted dump (fadump) kernel code is not endian safe. The
below patch fixes this issue. Tested this patch with upstream kernel.
Below output shows crash tool successfully opening LE fadump vmcore.

    # crash vmlinux vmcore
    GNU gdb (GDB) 7.6
    This GDB was configured as "powerpc64le-unknown-linux-gnu"...

          KERNEL: vmlinux
        DUMPFILE: vmcore
    	CPUS: 16
    	DATE: Wed Dec 31 19:00:00 1969
          UPTIME: 00:03:28
    LOAD AVERAGE: 0.46, 0.86, 0.41
           TASKS: 268
        NODENAME: linux-dhr2
         RELEASE: 3.17.0-rc5-7-default
         VERSION: #6 SMP Tue Sep 30 01:06:34 EDT 2014
         MACHINE: ppc64le  (4116 Mhz)
          MEMORY: 40 GB
           PANIC: "Oops: Kernel access of bad area, sig: 11 [#1]" (check log for details)
    	 PID: 6223
         COMMAND: "bash"
    	TASK: c0000009661b2500  [THREAD_INFO: c000000967ac0000]
    	 CPU: 2
           STATE: TASK_RUNNING (PANIC)

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
[mpe: Make the comment in pSeries_lpar_hptab_clear() clearer]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-30 16:52:46 +11:00
Fabian Frederick
94966b712c powerpc: Fix section mismatch warning
Add __init to MMU_setup() which uses __initdata boot_command_line.
Also MMU_setup() is only called from MMU_init(), which is also __init.

Warning appeared since commit 3e47d1474c.

Fixes: 3e47d1474c ("powerpc: Remove powerpc specific cmd_line")
Signed-off-by: Fabian Frederick <fabf@skynet.be>
[mpe: Update changelog]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-30 11:22:33 +11:00
Paul Bolle
b823982cd7 powerpc: Fix comment typo 'CONIFG_8xx'
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-10-29 14:42:08 +01:00
Paul Bolle
c2522dcdcb powerpc: Fix comment typos 'CONFiG_ALTIVEC'
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-10-29 14:41:49 +01:00
Alistair Popple
f09ced92fa IBM Akebono: Remove obsolete config select
The original implementation of MMC support for Akebono introduced a
new configuration symbol (MMC_SDHCI_OF_476GTR). This symbol has been
dropped in favour of using the generic platform driver however the
select for this symbol was mistakenly left in the platform
configuration.

This patch removes the obsolete symbol selection.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-10-29 14:35:32 +01:00
Nishanth Aravamudan
2c0a33f986 powerpc/numa: ensure per-cpu NUMA mappings are correct on topology update
We received a report of warning in kernel/sched/core.c where the sched
group was NULL on an LPAR after a topology update. This seems to occur
because after the topology update has moved the CPUs, cpu_to_node is
returning the old value still, which ends up breaking the consistency of
the NUMA topology in the per-cpu maps. Ensure that we update the per-cpu
fields when we re-map CPUs.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-29 09:41:22 +11:00
Nishanth Aravamudan
49f8d8c043 powerpc/numa: use cached value of update->cpu in update_cpu_topology
There isn't any need to keep referring to update->cpu, as we've already
checked cpu == update->cpu at this point.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-29 09:41:22 +11:00
Peter Zijlstra
c719f56092 perf: Fix and clean up initialization of pmu::event_idx
Andy reported that the current state of event_idx is rather confused.
So remove all but the x86_pmu implementation and change the default to
return 0 (the safe option).

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Himangi Saraogi <himangi774@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: sukadev@linux.vnet.ibm.com <sukadev@linux.vnet.ibm.com>
Cc: Thomas Huth <thuth@linux.vnet.ibm.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux390@de.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28 10:51:01 +01:00
Ian Munsie
03f5439797 powerpc/mm: Use appropriate ESID mask in copro_calculate_slb()
This patch makes copro_calculate_slb() mask the ESID by the correct mask
for 1T vs 256M segments.

This has no effect by itself as the extra bits were ignored, but it
makes debugging the segment table entries easier and means that we can
directly compare the ESID values for duplicates without needing to worry
about masking in the comparison.

This will be used to simplify a comparison in the following patch.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28 19:52:45 +11:00
Michael Ellerman
bf19edd290 Revert "powerpc/powernv: Fix endian bug in LPC bus debugfs accessors"
This reverts commit bf7588a085.

Ben says although the code is not correct "[this] fix was completely
wrong and does more damages than it fixes things."

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28 15:17:48 +11:00
Jeremy Kerr
19d36c2152 powernv: Use _GLOBAL_TOC for opal wrappers
Currently, we can't call opal wrappers from modules when using the LE
ABIv2, which requires a TOC init. If we do we'll try and load the opal
entry point using the wrong toc and probably explode or worse jump to
the wrong address.

Nothing in upstream is making opal calls from a module, but we do export
one of the wrappers so we should fix this anyway.

This change uses the _GLOBAL_TOC() macro (rather than _GLOBAL) for the
opal wrappers, so that we can do non-local calls to them.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-23 13:57:18 +11:00
Pranith Kumar
fcbb539f27 powerpc: Wire up sys_bpf() syscall
This patch wires up the new syscall sys_bpf() on powerpc.

Passes the tests in samples/bpf:

    #0 add+sub+mul OK
    #1 unreachable OK
    #2 unreachable2 OK
    #3 out of range jump OK
    #4 out of range jump2 OK
    #5 test1 ld_imm64 OK
    #6 test2 ld_imm64 OK
    #7 test3 ld_imm64 OK
    #8 test4 ld_imm64 OK
    #9 test5 ld_imm64 OK
    #10 no bpf_exit OK
    #11 loop (back-edge) OK
    #12 loop2 (back-edge) OK
    #13 conditional loop OK
    #14 read uninitialized register OK
    #15 read invalid register OK
    #16 program doesn't init R0 before exit OK
    #17 stack out of bounds OK
    #18 invalid call insn1 OK
    #19 invalid call insn2 OK
    #20 invalid function call OK
    #21 uninitialized stack1 OK
    #22 uninitialized stack2 OK
    #23 check valid spill/fill OK
    #24 check corrupted spill/fill OK
    #25 invalid src register in STX OK
    #26 invalid dst register in STX OK
    #27 invalid dst register in ST OK
    #28 invalid src register in LDX OK
    #29 invalid dst register in LDX OK
    #30 junk insn OK
    #31 junk insn2 OK
    #32 junk insn3 OK
    #33 junk insn4 OK
    #34 junk insn5 OK
    #35 misaligned read from stack OK
    #36 invalid map_fd for function call OK
    #37 don't check return value before access OK
    #38 access memory with incorrect alignment OK
    #39 sometimes access memory with incorrect alignment OK
    #40 jump test 1 OK
    #41 jump test 2 OK
    #42 jump test 3 OK
    #43 jump test 4 OK

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[mpe: test using samples/bpf]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-22 14:03:06 +11:00
Aneesh Kumar K.V
ca5f1d16a5 powerpc/mm: Remove redundant #if case
Remove the check of CONFIG_PPC_SUBPAGE_PROT when deciding if
is_hugepage_only_range() is extern or inline. The extern version is in
slice.c and is built if CONFIG_PPC_MM_SLICES=y.

There was no build break possible because CONFIG_PPC_SUBPAGE_PROT is
only selectable under conditions which also mean CONFIG_PPC_MM_SLICES
will be selected.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-22 14:03:06 +11:00
Aneesh Kumar K.V
6643773ce1 powerpc/mm: Fix build error with hugetlfs disabled
arch/powerpc/mm/slice.c:704:5: error: expected identifier or ‘(’ before numeric constant
 int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
     ^
make[1]: *** [arch/powerpc/mm/slice.o] Error 1
make: *** [arch/powerpc/mm/slice.o] Error 2

This got introduced via 1217d34b53
"powerpc: Ensure global functions include their prototype". We
started including linux/hugetlb.h with that patch and now we have

 #define is_hugepage_only_range(mm, addr, len)	0

with hugetlbfs disabled.

Fixes: 1217d34b53 ("powerpc: Ensure global functions include their prototype")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-22 14:03:06 +11:00
Linus Torvalds
dc303408a7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull more powerpc updates from Michael Ellerman:
 "Here's some more updates for powerpc for 3.18.

  They are a bit late I know, though must are actually bug fixes.  In my
  defence I nearly cut the top of my finger off last weekend in a
  gruesome bike maintenance accident, so I spent a good part of the week
  waiting around for doctors.  True story, I can send photos if you like :)

  Probably the most interesting fix is the sys_call_table one, which
  enables syscall tracing for powerpc.  There's a fix for HMI handling
  for old firmware, more endian fixes for firmware interfaces, more EEH
  fixes, Anton fixed our routine that gets the current stack pointer,
  and a few other misc bits"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (22 commits)
  powerpc: Only do dynamic DMA zone limits on platforms that need it
  powerpc: sync pseries_le_defconfig with pseries_defconfig
  powerpc: Add printk levels to setup_system output
  powerpc/vphn: NUMA node code expects big-endian
  powerpc/msi: Use WARN_ON() in msi bitmap selftests
  powerpc/msi: Fix the msi bitmap alignment tests
  powerpc/eeh: Block CFG upon frozen Shiner adapter
  powerpc/eeh: Don't collect logs on PE with blocked config space
  powerpc/eeh: Block PCI config access upon frozen PE
  powerpc/pseries: Drop config requests in EEH accessors
  powerpc/powernv: Drop config requests in EEH accessors
  powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED
  powerpc/eeh: Fix condition for isolated state
  powerpc/pseries: Make CPU hotplug path endian safe
  powerpc/pseries: Use dump_stack instead of show_stack
  powerpc: Rename __get_SP() to current_stack_pointer()
  powerpc: Reimplement __get_SP() as a function not a define
  powerpc/numa: Add ability to disable and debug topology updates
  powerpc/numa: check error return from proc_create
  powerpc/powernv: Fallback to old HMI handling behavior for old firmware
  ...
2014-10-21 07:48:56 -07:00
Will Deacon
5da590574c powerpc: io: implement dummy relaxed accessor macros for writes
write{b,w,l,q}_relaxed are implemented by some architectures in order to
permit memory-mapped I/O accesses with weaker barrier semantics than the
non-relaxed variants.

This patch adds dummy macros for the write accessors to powerpc, in the
same vein as the dummy definitions for the relaxed read accessors.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-10-20 18:49:18 +01:00
Wolfram Sang
e70db68276 powerpc: sysdev: qe_lib: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:13 +02:00
Wolfram Sang
4ad7fb261d powerpc: sysdev: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:13 +02:00
Wolfram Sang
0b06ab798c powerpc: platforms: pasemi: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:12 +02:00
Wolfram Sang
463fbd6eb1 powerpc: platforms: cell: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:12 +02:00
Wolfram Sang
4a0b0cee81 powerpc: platforms: 85xx: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:11 +02:00
Wolfram Sang
bedb9eb1e8 powerpc: platforms: 83xx: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:11 +02:00
Wolfram Sang
dc31c3b568 powerpc: platforms: 82xx: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:11 +02:00
Wolfram Sang
ef0e201fc1 powerpc: platforms: 52xx: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:10 +02:00
Wolfram Sang
21d9625180 powerpc: kernel: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:10 +02:00
Linus Torvalds
ab074ade9c Merge git://git.infradead.org/users/eparis/audit
Pull audit updates from Eric Paris:
 "So this change across a whole bunch of arches really solves one basic
  problem.  We want to audit when seccomp is killing a process.  seccomp
  hooks in before the audit syscall entry code.  audit_syscall_entry
  took as an argument the arch of the given syscall.  Since the arch is
  part of what makes a syscall number meaningful it's an important part
  of the record, but it isn't available when seccomp shoots the
  syscall...

  For most arch's we have a better way to get the arch (syscall_get_arch)
  So the solution was two fold: Implement syscall_get_arch() everywhere
  there is audit which didn't have it.  Use syscall_get_arch() in the
  seccomp audit code.  Having syscall_get_arch() everywhere meant it was
  a useless flag on the stack and we could get rid of it for the typical
  syscall entry.

  The other changes inside the audit system aren't grand, fixed some
  records that had invalid spaces.  Better locking around the task comm
  field.  Removing some dead functions and structs.  Make some things
  static.  Really minor stuff"

* git://git.infradead.org/users/eparis/audit: (31 commits)
  audit: rename audit_log_remove_rule to disambiguate for trees
  audit: cull redundancy in audit_rule_change
  audit: WARN if audit_rule_change called illegally
  audit: put rule existence check in canonical order
  next: openrisc: Fix build
  audit: get comm using lock to avoid race in string printing
  audit: remove open_arg() function that is never used
  audit: correct AUDIT_GET_FEATURE return message type
  audit: set nlmsg_len for multicast messages.
  audit: use union for audit_field values since they are mutually exclusive
  audit: invalid op= values for rules
  audit: use atomic_t to simplify audit_serial()
  kernel/audit.c: use ARRAY_SIZE instead of sizeof/sizeof[0]
  audit: reduce scope of audit_log_fcaps
  audit: reduce scope of audit_net_id
  audit: arm64: Remove the audit arch argument to audit_syscall_entry
  arm64: audit: Add audit hook in syscall_trace_enter/exit()
  audit: x86: drop arch from __audit_syscall_entry() interface
  sparc: implement is_32bit_task
  sparc: properly conditionalize use of TIF_32BIT
  ...
2014-10-19 16:25:56 -07:00
Michael Ellerman
e89dafb5ca powerpc: Only do dynamic DMA zone limits on platforms that need it
Scott's patch 1c98025c6c "Dynamic DMA zone limits" changed
dma_direct_alloc_coherent() to start using dev->coherent_dma_mask.

That seems fair enough, but it exposes the fact that some of the drivers
we care about on IBM platforms aren't setting the coherent mask.

The proper fix is to have drivers set the coherent mask and also have
the platform code honor it.

For now, just restrict the dynamic DMA zone limits to the platforms that
need it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Scott Wood <scottwood@freescale.com>
2014-10-17 09:21:44 +11:00
Anton Blanchard
86be175a73 powerpc: sync pseries_le_defconfig with pseries_defconfig
Now KVM is working on LE, enable it. Also enable transarent
hugepage which has already been enabled on BE.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-16 17:37:44 +11:00
Anton Blanchard
2c186e05a5 powerpc: Add printk levels to setup_system output
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-16 17:37:27 +11:00
Michael Ellerman
aeba3731b1 powerpc/pci: Fix IO space breakage after of_pci_range_to_resource() change
Commit 0b0b0893d4 "of/pci: Fix the conversion of IO ranges into IO
resources" changed the behaviour of of_pci_range_to_resource().

Previously it simply populated the resource based on the arguments. Now
it calls pci_register_io_range() and pci_address_to_pio(). These both
have two implementations depending on whether PCI_IOBASE is defined,
which it is not for powerpc.

Further complicating matters, both routines are weak, and powerpc
implements it's own version of one - pci_address_to_pio(). However
powerpc's implementation depends on other initialisations which are done
later in boot.

The end result is incorrectly initialised IO space. Often we can get
away with that, because we don't make much use of IO space. However
virtio requires it, so we see eg:

  pci_bus 0000:00: root bus resource [io  0xffff] (bus address [0xffffffffffffffff-0xffffffffffffffff])
  PCI: Cannot allocate resource region 0 of device 0000:00:01.0, will remap
  virtio-pci 0000:00:01.0: can't enable device: BAR 0 [io  size 0x0020] not assigned

The simplest fix for now is to just stop using of_pci_range_to_resource(),
and open-code the original implementation, that's all we want it to do.

Fixes: 0b0b0893d4 ("of/pci: Fix the conversion of IO ranges into IO resources")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-16 14:19:07 +11:00
Greg Kurz
5c9fb18994 powerpc/vphn: NUMA node code expects big-endian
The associativity domain numbers are obtained from the hypervisor through
registers and written into memory by the guest: the packed array passed to
vphn_unpack_associativity() is then native-endian, unlike what was assumed
in the following commit:

commit b08a2a12e4
Author: Alistair Popple <alistair@popple.id.au>
Date:   Wed Aug 7 02:01:44 2013 +1000

    powerpc: Make NUMA device node code endian safe

This issue fills the topology with bogus data and makes it unusable. It may
lead to severe performance breakdowns.

We should ideally patch the vphn_unpack_associativity() function to do the
64-bit loads, but this requires some more brain storming.

In the meantime, let's go for a suboptimal and temporary bug fix: this patch
converts each 64-bit value of the packed array to big endian, as expected by
the current parsing code in vphn_unpack_associativity().

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-16 12:43:21 +11:00
Linus Torvalds
0429fbc0bd Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu consistent-ops changes from Tejun Heo:
 "Way back, before the current percpu allocator was implemented, static
  and dynamic percpu memory areas were allocated and handled separately
  and had their own accessors.  The distinction has been gone for many
  years now; however, the now duplicate two sets of accessors remained
  with the pointer based ones - this_cpu_*() - evolving various other
  operations over time.  During the process, we also accumulated other
  inconsistent operations.

  This pull request contains Christoph's patches to clean up the
  duplicate accessor situation.  __get_cpu_var() uses are replaced with
  with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

  Unfortunately, the former sometimes is tricky thanks to C being a bit
  messy with the distinction between lvalues and pointers, which led to
  a rather ugly solution for cpumask_var_t involving the introduction of
  this_cpu_cpumask_var_ptr().

  This converts most of the uses but not all.  Christoph will follow up
  with the remaining conversions in this merge window and hopefully
  remove the obsolete accessors"

* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
  irqchip: Properly fetch the per cpu offset
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
  ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
  Revert "powerpc: Replace __get_cpu_var uses"
  percpu: Remove __this_cpu_ptr
  clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
  sparc: Replace __get_cpu_var uses
  avr32: Replace __get_cpu_var with __this_cpu_write
  blackfin: Replace __get_cpu_var uses
  tile: Use this_cpu_ptr() for hardware counters
  tile: Replace __get_cpu_var uses
  powerpc: Replace __get_cpu_var uses
  alpha: Replace __get_cpu_var
  ia64: Replace __get_cpu_var uses
  s390: cio driver &__get_cpu_var replacements
  s390: Replace __get_cpu_var uses
  mips: Replace __get_cpu_var uses
  MIPS: Replace __get_cpu_var uses in FPU emulator.
  arm: Replace __this_cpu_ptr with raw_cpu_ptr
  ...
2014-10-15 07:48:18 +02:00
Michael Ellerman
4a77f2bdbd powerpc/msi: Use WARN_ON() in msi bitmap selftests
As demonstrated in the previous commit, the failure message from the msi
bitmap selftests is a bit subtle, it's easy to miss a failure in a busy
boot log.

So drop our check() macro and use WARN_ON() instead. This necessitates
inverting all the conditions as well.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 13:09:33 +11:00
Michael Ellerman
695911fb1f powerpc/msi: Fix the msi bitmap alignment tests
When we added the alignment tests recently we failed to check they were
actually passing - oops.

They weren't passing, because the bitmap was full. We should also be a
bit more careful when checking the return code, a negative error return
could by divisible by our alignment value.

Fixes: b0345bbc6d ("powerpc/msi: Improve IRQ bitmap allocator")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 13:09:32 +11:00
Gavin Shan
179ea48bc7 powerpc/eeh: Block CFG upon frozen Shiner adapter
The Broadcom Shiner 2-ports 10G ethernet adapter has same problem
commit 6f20bda0 ("powerpc/eeh: Block PCI config access upon frozen
PE") fixes. Put it to the black list as well.

   # lspci -s 0004:01:00.0
   0004:01:00.0 Ethernet controller: Broadcom Corporation \
                NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
   # lspci -n -s 0004:01:00.0
   0004:01:00.0 0200: 14e4:168e (rev 10)

Reported-by: John Walthour <jwalthour@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:27:34 +11:00
Gavin Shan
c59004cc83 powerpc/eeh: Don't collect logs on PE with blocked config space
When the PE's config space is marked as blocked, PCI config read
requests always return 0xFF's. It's pointless to collect logs in
this case.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:27:21 +11:00
Gavin Shan
b6541db139 powerpc/eeh: Block PCI config access upon frozen PE
The problem was found when I tried to inject PCI config error by
PHB3 PAPR error injection registers into Broadcom Austin 4-ports
NIC adapter. The frozen PE was reported successfully and EEH core
started to recover it. However, I run into fenced PHB when dumping
PCI config space as EEH logs. I was told that PCI config requests
should not be progagated to the adapter until PE reset is done
successfully. Otherise, we would run out of PHB internal credits
and trigger PCT (PCIE Completion Timeout), which leads to the
fenced PHB.

The patch introduces another PE flag EEH_PE_CFG_RESTRICTED, which
is set during PE initialization time if the PE includes the specific
PCI devices that need block PCI config access until PE reset is done.
When the PE becomes frozen for the first time, EEH_PE_CFG_BLOCKED is
set if the PE has flag EEH_PE_CFG_RESTRICTED. Then the PCI config
access to the PE will be dropped by platform PCI accessors until
PE reset is done successfully. The mechanism is shared by PowerNV
platform owned PE or userland owned ones. It's not used on pSeries
platform yet.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:27:20 +11:00
Gavin Shan
3409eb4e69 powerpc/pseries: Drop config requests in EEH accessors
The pSeires EEH config accessors rely on rtas_{read, write}_config()
and the condition to check if the PE's config space is blocked
should be moved to those 2 functions so that config requests from
kernel, userland, EEH core can be dropped to avoid recursive EEH error
if necessary.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:27:19 +11:00
Gavin Shan
d2cfbcd7c8 powerpc/powernv: Drop config requests in EEH accessors
It's bad idea to access the PCI config registers of the adapters,
which is experiencing reset. It leads to recursive EEH error without
exception. The patch drops PCI config requests in EEH accessors if
the PE has been marked to accept PCI config requests, for example
during PE reseet time.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:27:19 +11:00
Gavin Shan
8a6b3710cc powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED
The flag EEH_PE_RESET indicates blocking config space of the PE
during reset time. We potentially need block PE's config space
other than reset time. So it's reasonable to replace it with
EEH_PE_CFG_BLOCKED to indicate its usage.

There are no substantial code or logic changes in this patch.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:27:18 +11:00
Gavin Shan
8315070c07 powerpc/eeh: Fix condition for isolated state
Function eeh_pe_state_mark() could possibly have combination of
multiple EEH PE state as its argument. The patch fixes the condition
used to check if EEH_PE_ISOLATED is included.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:27:18 +11:00
Bharata B Rao
d6f1e7abdb powerpc/pseries: Make CPU hotplug path endian safe
- ibm,rtas-configure-connector should treat the RTAS data as big endian.
- Treat ibm,ppc-interrupt-server#s as big-endian when setting
  smp_processor_id during hotplug.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:25:59 +11:00
Anton Blanchard
4ff52b4ded powerpc/pseries: Use dump_stack instead of show_stack
We can use the simpler dump_stack() instead of
show_stack(current, __get_SP())

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:23:20 +11:00
Anton Blanchard
acf620ecf5 powerpc: Rename __get_SP() to current_stack_pointer()
Michael points out that __get_SP() is a pretty horrible
function name. Let's give it a better name.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:23:20 +11:00
Anton Blanchard
bfe9a2cfe9 powerpc: Reimplement __get_SP() as a function not a define
Li Zhong points out an issue with our current __get_SP()
implementation. If ftrace function tracing is enabled (ie -pg
profiling using _mcount) we spill a stack frame on 64bit all the
time.

If a function calls __get_SP() and later calls a function that is
tail call optimised, we will pop the stack frame and the value
returned by __get_SP() is no longer valid. An example from Li can
be found in save_stack_trace -> save_context_stack:

c0000000000432c0 <.save_stack_trace>:
c0000000000432c0:       mflr    r0
c0000000000432c4:       std     r0,16(r1)
c0000000000432c8:       stdu    r1,-128(r1) <-- stack frame for _mcount
c0000000000432cc:       std     r3,112(r1)
c0000000000432d0:       bl      <._mcount>
c0000000000432d4:       nop

c0000000000432d8:       mr      r4,r1 <-- __get_SP()

c0000000000432dc:       ld      r5,632(r13)
c0000000000432e0:       ld      r3,112(r1)
c0000000000432e4:       li      r6,1

c0000000000432e8:       addi    r1,r1,128 <-- pop stack frame

c0000000000432ec:       ld      r0,16(r1)
c0000000000432f0:       mtlr    r0
c0000000000432f4:       b       <.save_context_stack> <-- tail call optimized

save_context_stack ends up with a stack pointer below the current
one, and it is likely to be scribbled over.

Fix this by making __get_SP() a function which returns the
callers stack frame. Also replace inline assembly which grabs
the stack pointer in save_stack_trace and show_stack with
__get_SP().

This also fixes an issue with perf_arch_fetch_caller_regs().
It currently unwinds the stack once, which will skip a
valid stack frame on a leaf function. With the __get_SP() fixes
in this patch, we never need to unwind the stack frame to get
to the first interesting frame.

We have to export __get_SP() because perf_arch_fetch_caller_regs()
(which is used in modules) calls it from a header file.

Reported-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:23:19 +11:00
Linus Torvalds
faafcba3b5 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave
     Hansen)

   - Various sched/idle refinements for better idle handling (Nicolas
     Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot)

   - sched/numa updates and optimizations (Rik van Riel)

   - sysbench speedup (Vincent Guittot)

   - capacity calculation cleanups/refactoring (Vincent Guittot)

   - Various cleanups to thread group iteration (Oleg Nesterov)

   - Double-rq-lock removal optimization and various refactorings
     (Kirill Tkhai)

   - various sched/deadline fixes

  ... and lots of other changes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
  sched/dl: Use dl_bw_of() under rcu_read_lock_sched()
  sched/fair: Delete resched_cpu() from idle_balance()
  sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
  sched: Improve sysbench performance by fixing spurious active migration
  sched/x86: Fix up typo in topology detection
  x86, sched: Add new topology for multi-NUMA-node CPUs
  sched/rt: Use resched_curr() in task_tick_rt()
  sched: Use rq->rd in sched_setaffinity() under RCU read lock
  sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'
  sched: Use dl_bw_of() under RCU read lock
  sched/fair: Remove duplicate code from can_migrate_task()
  sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
  sched: print_rq(): Don't use tasklist_lock
  sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock()
  sched: Fix the task-group check in tg_has_rt_tasks()
  sched/fair: Leverage the idle state info when choosing the "idlest" cpu
  sched: Let the scheduler see CPU idle states
  sched/deadline: Fix inter- exclusive cpusets migrations
  sched/deadline: Clear dl_entity params when setscheduling to different class
  sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault()
  ...
2014-10-13 16:23:15 +02:00
Linus Torvalds
dbb885fecc Merge branch 'locking-arch-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull arch atomic cleanups from Ingo Molnar:
 "This is a series kept separate from the main locking tree, which
  cleans up and improves various details in the atomics type handling:

   - Remove the unused atomic_or_long() method

   - Consolidate and compress atomic ops implementations between
     architectures, to reduce linecount and to make it easier to add new
     ops.

   - Rewrite generic atomic support to only require cmpxchg() from an
     architecture - generate all other methods from that"

* 'locking-arch-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  locking,arch: Use ACCESS_ONCE() instead of cast to volatile in atomic_read()
  locking, mips: Fix atomics
  locking, sparc64: Fix atomics
  locking,arch: Rewrite generic atomic support
  locking,arch,xtensa: Fold atomic_ops
  locking,arch,sparc: Fold atomic_ops
  locking,arch,sh: Fold atomic_ops
  locking,arch,powerpc: Fold atomic_ops
  locking,arch,parisc: Fold atomic_ops
  locking,arch,mn10300: Fold atomic_ops
  locking,arch,mips: Fold atomic_ops
  locking,arch,metag: Fold atomic_ops
  locking,arch,m68k: Fold atomic_ops
  locking,arch,m32r: Fold atomic_ops
  locking,arch,ia64: Fold atomic_ops
  locking,arch,hexagon: Fold atomic_ops
  locking,arch,cris: Fold atomic_ops
  locking,arch,avr32: Fold atomic_ops
  locking,arch,arm64: Fold atomic_ops
  locking,arch,arm: Fold atomic_ops
  ...
2014-10-13 15:48:00 +02:00
Nishanth Aravamudan
2d73bae12b powerpc/numa: Add ability to disable and debug topology updates
We have hit a few customer issues with the topology update code (VPHN
and PRRN). It would be nice to be able to debug the notifications coming
from the hypervisor in both cases to the LPAR, as well as to disable
responding to the notifications at boot-time, to narrow down the source
of the problems. Add a basic level of such functionality, similar to the
numa= command-line parameter. We already have a toggle in
/proc/powerpc/topology_updates that allows run-time enabling/disabling,
so the updates can be started at run-time if desired. But the bugs we've
run into have occured during boot or very shortly after coming to login,
and have resulted in a broken NUMA topology.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-13 18:16:17 +11:00
Nishanth Aravamudan
2d15b9b479 powerpc/numa: check error return from proc_create
proc_create can fail, we should check the return value and pass up the
failure.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-13 18:15:53 +11:00
Mahesh Salgaonkar
6507955c97 powerpc/powernv: Fallback to old HMI handling behavior for old firmware
Recently we moved HMI handling into Linux kernel instead of taking
HMI directly in OPAL. This new change is dependent on new OPAL call
for HMI recovery which was introduced in newer firmware. While this new
change works fine with latest OPAL firmware, we broke the HMI handling
if we run newer kernel on old OPAL firmware that results in system hang.

This patch fixes this issue by falling back to old HMI behavior on older
OPAL firmware.

This patch introduces a check for opal token OPAL_HANDLE_HMI to see
if we are running on newer firmware or old firmware. On newer firmware
this check would return OPAL_TOKEN_PRESENT, otherwise we are running on
old firmware and fallback to old HMI behavior.

Old firmware: POWER8 System Firmware Release as of today <= SV810_087
Action: Let OPAL handle HMIs

Newer firmware: in development/yet to be released.
Action: Let Linux host handle HMIs.

This patch depends on opal check token patch posted at ppc-devel
https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-August/120224.html

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
[mpe: Minor comment and printk rewording]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-13 18:09:50 +11:00
Linus Torvalds
fd9879b9bb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc updates from Michael Ellerman:
 "Here's a first pull request for powerpc updates for 3.18.

  The bulk of the additions are for the "cxl" driver, for IBM's Coherent
  Accelerator Processor Interface (CAPI).  Most of it's in drivers/misc,
  which Greg & Arnd maintain, Greg said he was happy for us to take it
  through our tree.

  There's the usual minor cleanups and fixes, including a bit of noise
  in drivers from some of those.  A bunch of updates to our EEH code,
  which has been getting more testing.  Several nice speedups from
  Anton, including 20% in clear_page().

  And a bunch of updates for freescale from Scott"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (130 commits)
  cxl: Fix afu_read() not doing finish_wait() on signal or non-blocking
  cxl: Add documentation for userspace APIs
  cxl: Add driver to Kbuild and Makefiles
  cxl: Add userspace header file
  cxl: Driver code for powernv PCIe based cards for userspace access
  cxl: Add base builtin support
  powerpc/mm: Add hooks for cxl
  powerpc/opal: Add PHB to cxl mode call
  powerpc/mm: Add new hash_page_mm()
  powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts
  cxl: Add new header for call backs and structs
  powerpc/powernv: Split out set MSI IRQ chip code
  powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize
  powerpc/msi: Improve IRQ bitmap allocator
  powerpc/cell: Make spu_flush_all_slbs() generic
  powerpc/cell: Move data segment faulting code out of cell platform
  powerpc/cell: Move spu_handle_mm_fault() out of cell platform
  powerpc/pseries: Use new defines when calling H_SET_MODE
  powerpc: Update contact info in Documentation files
  powerpc/perf/hv-24x7: Simplify catalog_read()
  ...
2014-10-11 20:34:00 -04:00
Mahesh Salgaonkar
c675c7db62 powerpc/book3s: Don't clear MSR_RI in hmi handler.
In HMI interrupt handler we don't touch SRR0/SRR1, instead we touch
HSRR0/HSRR1. Hence we don't need to clear MSR_RI bit.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-10 17:25:25 +11:00
Romeo Cane
1028ccf560 powerpc: Fix sys_call_table declaration to enable syscall tracing
Declaring sys_call_table as a pointer causes the compiler to generate
the wrong lookup code in arch_syscall_addr().

     <arch_syscall_addr>:
        lis     r9,-16384
        rlwinm  r3,r3,2,0,29
  -     lwz     r11,30640(r9)
  -     lwzx    r3,r11,r3
  +     addi    r9,r9,30640
  +     lwzx    r3,r9,r3
        blr

The actual sys_call_table symbol, declared in assembler, is an
array. If we lie about that to the compiler we get the wrong code
generated, as above.

This definition seems only to be used by the syscall tracing code in
kernel/trace/trace_syscalls.c. With this patch I can successfully use
the syscall tracepoints:

  bash-3815  [002] ....   333.239082: sys_write -> 0x2
  bash-3815  [002] ....   333.239087: sys_dup2(oldfd: a, newfd: 1)
  bash-3815  [002] ....   333.239088: sys_dup2 -> 0x1
  bash-3815  [002] ....   333.239092: sys_fcntl(fd: a, cmd: 1, arg: 0)
  bash-3815  [002] ....   333.239093: sys_fcntl -> 0x1
  bash-3815  [002] ....   333.239094: sys_close(fd: a)
  bash-3815  [002] ....   333.239094: sys_close -> 0x0

Signed-off-by: Romeo Cane <romeo.cane.ext@coriant.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-10 17:03:36 +11:00
Linus Torvalds
0cf744bc7a Merge branch 'akpm' (fixes from Andrew Morton)
Merge patch-bomb from Andrew Morton:
 - part of OCFS2 (review is laggy again)
 - procfs
 - slab
 - all of MM
 - zram, zbud
 - various other random things: arch, filesystems.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  nosave: consolidate __nosave_{begin,end} in <asm/sections.h>
  include/linux/screen_info.h: remove unused ORIG_* macros
  kernel/sys.c: compat sysinfo syscall: fix undefined behavior
  kernel/sys.c: whitespace fixes
  acct: eliminate compile warning
  kernel/async.c: switch to pr_foo()
  include/linux/blkdev.h: use NULL instead of zero
  include/linux/kernel.h: deduplicate code implementing clamp* macros
  include/linux/kernel.h: rewrite min3, max3 and clamp using min and max
  alpha: use Kbuild logic to include <asm-generic/sections.h>
  frv: remove deprecated IRQF_DISABLED
  frv: remove unused cpuinfo_frv and friends to fix future build error
  zbud: avoid accessing last unused freelist
  zsmalloc: simplify init_zspage free obj linking
  mm/zsmalloc.c: correct comment for fullness group computation
  zram: use notify_free to account all free notifications
  zram: report maximum used memory
  zram: zram memory size limitation
  zsmalloc: change return value unit of zs_get_total_size_bytes
  zsmalloc: move pages_allocated to zs_pool
  ...
2014-10-09 22:26:14 -04:00
Geert Uytterhoeven
7f8998c7ae nosave: consolidate __nosave_{begin,end} in <asm/sections.h>
The different architectures used their own (and different) declarations:

    extern __visible const void __nosave_begin, __nosave_end;
    extern const void __nosave_begin, __nosave_end;
    extern long __nosave_begin, __nosave_end;

Consolidate them using the first variant in <asm/sections.h>.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:04 -04:00
Mel Gorman
6a33979d5b mm: remove misleading ARCH_USES_NUMA_PROT_NONE
ARCH_USES_NUMA_PROT_NONE was defined for architectures that implemented
_PAGE_NUMA using _PROT_NONE.  This saved using an additional PTE bit and
relied on the fact that PROT_NONE vmas were skipped by the NUMA hinting
fault scanner.  This was found to be conceptually confusing with a lot of
implicit assumptions and it was asked that an alternative be found.

Commit c46a7c81 "x86: define _PAGE_NUMA by reusing software bits on the
PMD and PTE levels" redefined _PAGE_NUMA on x86 to be one of the swap PTE
bits and shrunk the maximum possible swap size but it did not go far
enough.  There are no architectures that reuse _PROT_NONE as _PROT_NUMA
but the relics still exist.

This patch removes ARCH_USES_NUMA_PROT_NONE and removes some unnecessary
duplication in powerpc vs the generic implementation by defining the types
the core NUMA helpers expected to exist from x86 with their ppc64
equivalent.  This necessitated that a PTE bit mask be created that
identified the bits that distinguish present from NUMA pte entries but it
is expected this will only differ between arches based on _PAGE_PROTNONE.
The naming for the generic helpers was taken from x86 originally but ppc64
has types that are equivalent for the purposes of the helper so they are
mapped instead of duplicating code.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:25:52 -04:00
Linus Torvalds
80213c03c4 PCI changes for the v3.18 merge window:
Enumeration
     - Check Vendor ID only for Config Request Retry Status (Rajat Jain)
     - Enable Config Request Retry Status when supported (Rajat Jain)
     - Add generic domain handling (Catalin Marinas)
     - Generate uppercase hex for modalias interface class (Ricardo Ribalda Delgado)
 
   Resource management
     - Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources() (Yinghai Lu)
     - Increase IBM ipr SAS Crocodile BARs to at least system page size (Douglas Lehr)
 
   PCI device hotplug
     - Prevent NULL dereference during pciehp probe (Andreas Noever)
     - Move _HPP & _HPX handling into core (Bjorn Helgaas)
     - Apply _HPP to PCIe devices as well as PCI (Bjorn Helgaas)
     - Apply _HPP/_HPX to display devices (Bjorn Helgaas)
     - Preserve SERR & PARITY settings when applying _HPP/_HPX (Bjorn Helgaas)
     - Preserve MPS and MRRS settings when applying _HPP/_HPX (Bjorn Helgaas)
     - Apply _HPP/_HPX to all devices, not just hot-added ones (Bjorn Helgaas)
     - Fix wait time in pciehp timeout message (Yinghai Lu)
     - Add more pciehp Slot Control debug output (Yinghai Lu)
     - Stop disabling pciehp notifications during init (Yinghai Lu)
 
   MSI
     - Remove arch_msi_check_device() (Alexander Gordeev)
     - Rename pci_msi_check_device() to pci_msi_supported() (Alexander Gordeev)
     - Move D0 check into pci_msi_check_device() (Alexander Gordeev)
     - Remove unused kobject from struct msi_desc (Yijing Wang)
     - Remove "pos" from the struct msi_desc msi_attrib (Yijing Wang)
     - Add "msi_bus" sysfs MSI/MSI-X control for endpoints (Yijing Wang)
     - Use __get_cached_msi_msg() instead of get_cached_msi_msg() (Yijing Wang)
     - Use __read_msi_msg() instead of read_msi_msg() (Yijing Wang)
     - Use __write_msi_msg() instead of write_msi_msg() (Yijing Wang)
 
   Power management
     - Drop unused runtime PM support code for PCIe ports (Rafael J.  Wysocki)
     - Allow PCI devices to be put into D3cold during system suspend (Rafael J. Wysocki)
 
   AER
     - Add additional AER error strings (Gong Chen)
     - Make <linux/aer.h> standalone includable (Thierry Reding)
 
   Virtualization
     - Add ACS quirk for Solarflare SFC9120 & SFC9140 (Alex Williamson)
     - Add ACS quirk for Intel 10G NICs (Alex Williamson)
     - Add ACS quirk for AMD A88X southbridge (Marti Raudsepp)
     - Remove unused pci_find_upstream_pcie_bridge(), pci_get_dma_source() (Alex Williamson)
     - Add device flag helpers (Ethan Zhao)
     - Assume all Mellanox devices have broken INTx masking (Gavin Shan)
 
   Generic host bridge driver
     - Fix ioport_map() for !CONFIG_GENERIC_IOMAP (Liviu Dudau)
     - Add pci_register_io_range() and pci_pio_to_address() (Liviu Dudau)
     - Define PCI_IOBASE as the base of virtual PCI IO space (Liviu Dudau)
     - Fix the conversion of IO ranges into IO resources (Liviu Dudau)
     - Add pci_get_new_domain_nr() and of_get_pci_domain_nr() (Liviu Dudau)
     - Add support for parsing PCI host bridge resources from DT (Liviu Dudau)
     - Add pci_remap_iospace() to map bus I/O resources (Liviu Dudau)
     - Add arm64 architectural support for PCI (Liviu Dudau)
 
   APM X-Gene
     - Add APM X-Gene PCIe driver (Tanmay Inamdar)
     - Add arm64 DT APM X-Gene PCIe device tree nodes (Tanmay Inamdar)
 
   Freescale i.MX6
     - Probe in module_init(), not fs_initcall() (Lucas Stach)
     - Delay enabling reference clock for SS until it stabilizes (Tim Harvey)
 
   Marvell MVEBU
     - Fix uninitialized variable in mvebu_get_tgt_attr() (Thomas Petazzoni)
 
   NVIDIA Tegra
     - Make sure the PCIe PLL is really reset (Eric Yuen)
     - Add error path tegra_msi_teardown_irq() cleanup (Jisheng Zhang)
     - Fix extended configuration space mapping (Peter Daifuku)
     - Implement resource hierarchy (Thierry Reding)
     - Clear CLKREQ# enable on port disable (Thierry Reding)
     - Add Tegra124 support (Thierry Reding)
 
   ST Microelectronics SPEAr13xx
     - Pass config resource through reg property (Pratyush Anand)
 
   Synopsys DesignWare
     - Use NULL instead of false (Fabio Estevam)
     - Parse bus-range property from devicetree (Lucas Stach)
     - Use pci_create_root_bus() instead of pci_scan_root_bus() (Lucas Stach)
     - Remove pci_assign_unassigned_resources() (Lucas Stach)
     - Check private_data validity in single place (Lucas Stach)
     - Setup and clear exactly one MSI at a time (Lucas Stach)
     - Remove open-coded bitmap operations (Lucas Stach)
     - Fix configuration base address when using 'reg' (Minghuan Lian)
     - Fix IO resource end address calculation (Minghuan Lian)
     - Rename get_msi_data() to get_msi_addr() (Minghuan Lian)
     - Add get_msi_data() to pcie_host_ops (Minghuan Lian)
     - Add support for v3.65 hardware (Murali Karicheri)
     - Fold struct pcie_port_info into struct pcie_port (Pratyush Anand)
 
   TI Keystone
     - Add TI Keystone PCIe driver (Murali Karicheri)
     - Limit MRSS for all downstream devices (Murali Karicheri)
     - Assume controller is already in RC mode (Murali Karicheri)
     - Set device ID based on SoC to support multiple ports (Murali Karicheri)
 
   Xilinx AXI
     - Add Xilinx AXI PCIe driver (Srikanth Thokala)
     - Fix xilinx_pcie_assign_msi() return value test (Dan Carpenter)
 
   Miscellaneous
     - Clean up whitespace (Quentin Lambert)
     - Remove assignments from "if" conditions (Quentin Lambert)
     - Move PCI_VENDOR_ID_VMWARE to pci_ids.h (Francesco Ruggeri)
     - x86: Mark DMI tables as initialization data (Mathias Krause)
     - x86: Move __init annotation to the correct place (Mathias Krause)
     - x86: Mark constants of pci_mmcfg_nvidia_mcp55() as __initconst (Mathias Krause)
     - x86: Constify pci_mmcfg_probes[] array (Mathias Krause)
     - x86: Mark PCI BIOS initialization code as such (Mathias Krause)
     - Parenthesize PCI_DEVID and PCI_VPD_LRDT_ID parameters (Megan Kamiya)
     - Remove unnecessary variable in pci_add_dynid() (Tobias Klauser)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNWmJAAoJEFmIoMA60/r8GncP/3uHRoBrnaF6pv+S1l1p3Fs/
 l1kKH91/IuAAU7VJX8pkNybFqx02topWmiVVXAzqvD01PcRLGCLjPbWl5h+y5/Ja
 CHZH33AwHAmm0kt4BrOSOeHTLJhAigly2zV3P4F8jRIgyaeMoGZ6Ko4tkQUpm21k
 +ohrOd4cxYkmzzCjKwsZZhKnyRNpae8FmTk3VQBPuN8DbhvFPrqo5/+GeAdSZTdS
 HZHpfl2HL4095aY7uBVsZqNkjQyl6SnWwjkjLnuI8q3qA3BLgDZE/Jr8F/MNuW1V
 y01JIjerFWMDFyBIkpg7moYnODy6oP3KvczwYdKGmqsJja+0MQvYhLTwD+R/yTQS
 SewJA0mL3T3EJEfnFYkCiaIX27xIwk/FxHfaKPN91xgx/QM7xCVZNrU2/dXjhoX1
 GqLKxOEaFHhWWTyT5Dj27I0ZcElzFZ3tIwvrHfs8y22oAuAlsAypaUgvUwRfL4CO
 hOj4ITZa0t041sYWqxCoGAA9Fdp8HMzNKKS5F4mhADz4Ad9v6uPCNv/s/RoxVsbm
 jhZOtPYJ0/iCA+kNVX563S8Z3VpfPI+7bBjcj2WKdzW+IlICvOKT+kvwL2Tv/rE7
 w0hrNsbkgGsYbPldMx7LwCavsUtYFuNj0zoU6vkhP2jk6O2Tn5VXDmjrXH0v3iHI
 v03vlUtre0bQ26fzDyLQ
 =4Zv1
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "The interesting things here are:

   - Turn on Config Request Retry Status Software Visibility.  This
     caused hangs last time, but we included a fix this time.
   - Rework PCI device configuration to use _HPP/_HPX more aggressively
   - Allow PCI devices to be put into D3cold during system suspend
   - Add arm64 PCI support
   - Add APM X-Gene host bridge driver
   - Add TI Keystone host bridge driver
   - Add Xilinx AXI host bridge driver

  More detailed summary:

  Enumeration
    - Check Vendor ID only for Config Request Retry Status (Rajat Jain)
    - Enable Config Request Retry Status when supported (Rajat Jain)
    - Add generic domain handling (Catalin Marinas)
    - Generate uppercase hex for modalias interface class (Ricardo Ribalda Delgado)

  Resource management
    - Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources() (Yinghai Lu)
    - Increase IBM ipr SAS Crocodile BARs to at least system page size (Douglas Lehr)

  PCI device hotplug
    - Prevent NULL dereference during pciehp probe (Andreas Noever)
    - Move _HPP & _HPX handling into core (Bjorn Helgaas)
    - Apply _HPP to PCIe devices as well as PCI (Bjorn Helgaas)
    - Apply _HPP/_HPX to display devices (Bjorn Helgaas)
    - Preserve SERR & PARITY settings when applying _HPP/_HPX (Bjorn Helgaas)
    - Preserve MPS and MRRS settings when applying _HPP/_HPX (Bjorn Helgaas)
    - Apply _HPP/_HPX to all devices, not just hot-added ones (Bjorn Helgaas)
    - Fix wait time in pciehp timeout message (Yinghai Lu)
    - Add more pciehp Slot Control debug output (Yinghai Lu)
    - Stop disabling pciehp notifications during init (Yinghai Lu)

  MSI
    - Remove arch_msi_check_device() (Alexander Gordeev)
    - Rename pci_msi_check_device() to pci_msi_supported() (Alexander Gordeev)
    - Move D0 check into pci_msi_check_device() (Alexander Gordeev)
    - Remove unused kobject from struct msi_desc (Yijing Wang)
    - Remove "pos" from the struct msi_desc msi_attrib (Yijing Wang)
    - Add "msi_bus" sysfs MSI/MSI-X control for endpoints (Yijing Wang)
    - Use __get_cached_msi_msg() instead of get_cached_msi_msg() (Yijing Wang)
    - Use __read_msi_msg() instead of read_msi_msg() (Yijing Wang)
    - Use __write_msi_msg() instead of write_msi_msg() (Yijing Wang)

  Power management
    - Drop unused runtime PM support code for PCIe ports (Rafael J.  Wysocki)
    - Allow PCI devices to be put into D3cold during system suspend (Rafael J. Wysocki)

  AER
    - Add additional AER error strings (Gong Chen)
    - Make <linux/aer.h> standalone includable (Thierry Reding)

  Virtualization
    - Add ACS quirk for Solarflare SFC9120 & SFC9140 (Alex Williamson)
    - Add ACS quirk for Intel 10G NICs (Alex Williamson)
    - Add ACS quirk for AMD A88X southbridge (Marti Raudsepp)
    - Remove unused pci_find_upstream_pcie_bridge(), pci_get_dma_source() (Alex Williamson)
    - Add device flag helpers (Ethan Zhao)
    - Assume all Mellanox devices have broken INTx masking (Gavin Shan)

  Generic host bridge driver
    - Fix ioport_map() for !CONFIG_GENERIC_IOMAP (Liviu Dudau)
    - Add pci_register_io_range() and pci_pio_to_address() (Liviu Dudau)
    - Define PCI_IOBASE as the base of virtual PCI IO space (Liviu Dudau)
    - Fix the conversion of IO ranges into IO resources (Liviu Dudau)
    - Add pci_get_new_domain_nr() and of_get_pci_domain_nr() (Liviu Dudau)
    - Add support for parsing PCI host bridge resources from DT (Liviu Dudau)
    - Add pci_remap_iospace() to map bus I/O resources (Liviu Dudau)
    - Add arm64 architectural support for PCI (Liviu Dudau)

  APM X-Gene
    - Add APM X-Gene PCIe driver (Tanmay Inamdar)
    - Add arm64 DT APM X-Gene PCIe device tree nodes (Tanmay Inamdar)

  Freescale i.MX6
    - Probe in module_init(), not fs_initcall() (Lucas Stach)
    - Delay enabling reference clock for SS until it stabilizes (Tim Harvey)

  Marvell MVEBU
    - Fix uninitialized variable in mvebu_get_tgt_attr() (Thomas Petazzoni)

  NVIDIA Tegra
    - Make sure the PCIe PLL is really reset (Eric Yuen)
    - Add error path tegra_msi_teardown_irq() cleanup (Jisheng Zhang)
    - Fix extended configuration space mapping (Peter Daifuku)
    - Implement resource hierarchy (Thierry Reding)
    - Clear CLKREQ# enable on port disable (Thierry Reding)
    - Add Tegra124 support (Thierry Reding)

  ST Microelectronics SPEAr13xx
    - Pass config resource through reg property (Pratyush Anand)

  Synopsys DesignWare
    - Use NULL instead of false (Fabio Estevam)
    - Parse bus-range property from devicetree (Lucas Stach)
    - Use pci_create_root_bus() instead of pci_scan_root_bus() (Lucas Stach)
    - Remove pci_assign_unassigned_resources() (Lucas Stach)
    - Check private_data validity in single place (Lucas Stach)
    - Setup and clear exactly one MSI at a time (Lucas Stach)
    - Remove open-coded bitmap operations (Lucas Stach)
    - Fix configuration base address when using 'reg' (Minghuan Lian)
    - Fix IO resource end address calculation (Minghuan Lian)
    - Rename get_msi_data() to get_msi_addr() (Minghuan Lian)
    - Add get_msi_data() to pcie_host_ops (Minghuan Lian)
    - Add support for v3.65 hardware (Murali Karicheri)
    - Fold struct pcie_port_info into struct pcie_port (Pratyush Anand)

  TI Keystone
    - Add TI Keystone PCIe driver (Murali Karicheri)
    - Limit MRSS for all downstream devices (Murali Karicheri)
    - Assume controller is already in RC mode (Murali Karicheri)
    - Set device ID based on SoC to support multiple ports (Murali Karicheri)

  Xilinx AXI
    - Add Xilinx AXI PCIe driver (Srikanth Thokala)
    - Fix xilinx_pcie_assign_msi() return value test (Dan Carpenter)

  Miscellaneous
    - Clean up whitespace (Quentin Lambert)
    - Remove assignments from "if" conditions (Quentin Lambert)
    - Move PCI_VENDOR_ID_VMWARE to pci_ids.h (Francesco Ruggeri)
    - x86: Mark DMI tables as initialization data (Mathias Krause)
    - x86: Move __init annotation to the correct place (Mathias Krause)
    - x86: Mark constants of pci_mmcfg_nvidia_mcp55() as __initconst (Mathias Krause)
    - x86: Constify pci_mmcfg_probes[] array (Mathias Krause)
    - x86: Mark PCI BIOS initialization code as such (Mathias Krause)
    - Parenthesize PCI_DEVID and PCI_VPD_LRDT_ID parameters (Megan Kamiya)
    - Remove unnecessary variable in pci_add_dynid() (Tobias Klauser)"

* tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (109 commits)
  arm64: dts: Add APM X-Gene PCIe device tree nodes
  PCI: Add ACS quirk for AMD A88X southbridge devices
  PCI: xgene: Add APM X-Gene PCIe driver
  PCI: designware: Remove open-coded bitmap operations
  PCI/MSI: Remove unnecessary temporary variable
  PCI/MSI: Use __write_msi_msg() instead of write_msi_msg()
  MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()
  PCI/MSI: Use __get_cached_msi_msg() instead of get_cached_msi_msg()
  PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints
  PCI/MSI: Remove "pos" from the struct msi_desc msi_attrib
  PCI/MSI: Remove unused kobject from struct msi_desc
  PCI/MSI: Rename pci_msi_check_device() to pci_msi_supported()
  PCI/MSI: Move D0 check into pci_msi_check_device()
  PCI/MSI: Remove arch_msi_check_device()
  irqchip: armada-370-xp: Remove arch_msi_check_device()
  PCI/MSI/PPC: Remove arch_msi_check_device()
  arm64: Add architectural support for PCI
  PCI: Add pci_remap_iospace() to map bus I/O resources
  of/pci: Add support for parsing PCI host bridge resources from DT
  of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()
  ...

Conflicts:
	arch/arm64/boot/dts/apm-storm.dtsi
2014-10-09 15:03:49 -04:00
Linus Torvalds
ea584595fc This is the bulk of GPIO changes for the v3.18 development
cycle:
 
 - Increase the default ARCH_NR_GPIO from 256 to 512. This
   was done to avoid having a custom <asm/gpio.h> header for
   the x86 architecture - GPIO is custom and complicated
   enough as it is already! We want to move to a radix to
   store the descriptors going forward, and finally get rid
   of this fixed array size altogether.
 
 - Endgame patching of the gpio_remove() semantics initiated
   by Abdoulaye Berthe. It is not accepted by the system that
   the removal of a GPIO chip fails during e.g. reboot or
   shutdown, and therefore the return value has now painfully
   been refactored away. For special cases like GPIO expanders
   on a hot-pluggable bus like USB, we may later add some
   gpiochip_try_remove() call, but for the cases we have now,
   return values are moot.
 
 - Some incremental refactoring of the gpiolib core and ACPI
   GPIO library for more descriptor usage.
 
 - Refactor the chained IRQ handler set-up method to handle
   also threaded, nested interrupts and set up the parent IRQ
   correctly. Switch STMPE and TC3589x drivers to use this
   registration method.
 
 - Add a .irq_not_threaded flag to the struct gpio_chip, so
   that also GPIO expanders that block but are still not
   using threaded IRQ handlers.
 
 - New drivers for the ARM64 X-Gene SoC GPIO controller.
 
 - The syscon GPIO driver has been improved to handle the
   "DSP GPIO" found on the TI Keystone 2 SoC:s.
 
 - ADNP driver switched to use gpiolib irqchip helpers.
 
 - Refactor the DWAPB driver to support being instantiated
   from and MFD cell (platform device).
 
 - Incremental feature improvement in the Zynq, MCP23S08,
   DWAPB, OMAP, Xilinx and Crystalcove drivers.
 
 - Various minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUNOr0AAoJEEEQszewGV1z9toP/2ISXRnsi3+jlqVmEGm/y6EA
 PPwJOiYnOhZR2/fTCHIF0PNbIi9pw7xKnzxttYCu4uCz7geHX+FfTwUZ2/KWMfqi
 ZJ9kEoOVVKzKjmL/m2a2tO4IRSBHqJ8dF3yvaNjS3AL7EDfG6F5STErQurdLEynK
 SeJZ2OwM/vRFCac6F7oDlqAUTu3xYGbVD8+zI0H0V/ReocosFlEwcbl2S8ctDWUd
 h98M+gY+A8rxkvVMnmQ/k7rUTme/glDQ3z5xVx+uHbS2/a5M1jSM/71cXE6YnSrR
 it0CK7CHomq2RzHsKf7oH7GD4kFkukMwFKeMoqz75JWz3352VZPTF53chCIqRSgO
 hrgGwZ7WF6pUUUhsn1ZdZsnBPA2Fou2uwslyLSAiE+OYEH2/NSVIOUcorjQcWqU/
 0Kix5yb8X1ZzRMhR+TVrTD5V0jguqp2buXq+0P2XlU6MoO2vy7iNf2eXvPg8sF8C
 anjTCKgmkzy7eyT2uzfDaNZAyfSBKb1TiKiR9zA0SRChJkCi1ErJEXDGeHiptvSA
 +D2k68Ils2LqsvdrnEd2XvVFMllh0iq7b+16o7D+Els0WRbnHpfYCaqfOuF5F4U0
 SmeyI0ruawNDc5e9EBKXstt0/R9AMOetyTcTu29U2ZVo90zGaT1ofT8+R1jJ0kGa
 bPARJZrgecgv1E9Qnnnd
 =8InA
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v3.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO changes from Linus Walleij:
 "This is the bulk of GPIO changes for the v3.18 development cycle:

   - Increase the default ARCH_NR_GPIO from 256 to 512.  This was done
     to avoid having a custom <asm/gpio.h> header for the x86
     architecture - GPIO is custom and complicated enough as it is
     already! We want to move to a radix to store the descriptors going
     forward, and finally get rid of this fixed array size altogether.

   - Endgame patching of the gpio_remove() semantics initiated by
     Abdoulaye Berthe.  It is not accepted by the system that the
     removal of a GPIO chip fails during eg reboot or shutdown, and
     therefore the return value has now painfully been refactored away.
     For special cases like GPIO expanders on a hot-pluggable bus like
     USB, we may later add some gpiochip_try_remove() call, but for the
     cases we have now, return values are moot.

   - Some incremental refactoring of the gpiolib core and ACPI GPIO
     library for more descriptor usage.

   - Refactor the chained IRQ handler set-up method to handle also
     threaded, nested interrupts and set up the parent IRQ correctly.
     Switch STMPE and TC3589x drivers to use this registration method.

   - Add a .irq_not_threaded flag to the struct gpio_chip, so that also
     GPIO expanders that block but are still not using threaded IRQ
     handlers.

   - New drivers for the ARM64 X-Gene SoC GPIO controller.

   - The syscon GPIO driver has been improved to handle the "DSP GPIO"
     found on the TI Keystone 2 SoC:s.

   - ADNP driver switched to use gpiolib irqchip helpers.

   - Refactor the DWAPB driver to support being instantiated from and
     MFD cell (platform device).

   - Incremental feature improvement in the Zynq, MCP23S08, DWAPB, OMAP,
     Xilinx and Crystalcove drivers.

   - Various minor fixes"

* tag 'gpio-v3.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (52 commits)
  gpio: pch: Build context save/restore only for PM
  pinctrl: abx500: get rid of unused variable
  gpio: ks8695: fix 'else should follow close brace '}''
  gpio: stmpe: add verbose debug code
  gpio: stmpe: fix up interrupt enable logic
  gpio: staticize xway_stp_init()
  gpio: handle also nested irqchips in the chained handler set-up
  gpio: set parent irq on chained handlers
  gpiolib: irqchip: use irq_find_mapping while removing irqchip
  gpio: crystalcove: support virtual GPIO
  pinctrl: bcm281xx: make Kconfig dependency more strict
  gpio: kona: enable only on BCM_MOBILE or for compile testing
  gpio, bcm-kona, LLVMLinux: Remove use of __initconst
  gpio: Fix ngpio in gpio-xilinx driver
  gpio: dwapb: fix pointer to integer cast
  gpio: xgene: Remove unneeded #ifdef CONFIG_OF guard
  gpio: xgene: Remove unneeded forward declation for struct xgene_gpio
  gpio: xgene: Fix missing spin_lock_init()
  gpio: ks8695: fix switch case indentation
  gpiolib: add irq_not_threaded flag to gpio_chip
  ...
2014-10-09 14:58:15 -04:00
Linus Torvalds
afa3536be8 Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
 "Main changes:

  - Fix the deadlock reported by Dave Jones et al
  - Clean up and fix nohz_full interaction with arch abilities
  - nohz init code consolidation/cleanup"

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  nohz: nohz full depends on irq work self IPI support
  nohz: Consolidate nohz full init code
  arm64: Tell irq work about self IPI support
  arm: Tell irq work about self IPI support
  x86: Tell irq work about self IPI support
  irq_work: Force raised irq work to run on irq work interrupt
  irq_work: Introduce arch_irq_work_has_interrupt()
  nohz: Move nohz full init call to tick init
2014-10-09 06:30:57 -04:00
Linus Torvalds
35a9ad8af0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Most notable changes in here:

   1) By far the biggest accomplishment, thanks to a large range of
      contributors, is the addition of multi-send for transmit.  This is
      the result of discussions back in Chicago, and the hard work of
      several individuals.

      Now, when the ->ndo_start_xmit() method of a driver sees
      skb->xmit_more as true, it can choose to defer the doorbell
      telling the driver to start processing the new TX queue entires.

      skb->xmit_more means that the generic networking is guaranteed to
      call the driver immediately with another SKB to send.

      There is logic added to the qdisc layer to dequeue multiple
      packets at a time, and the handling mis-predicted offloads in
      software is now done with no locks held.

      Finally, pktgen is extended to have a "burst" parameter that can
      be used to test a multi-send implementation.

      Several drivers have xmit_more support: i40e, igb, ixgbe, mlx4,
      virtio_net

      Adding support is almost trivial, so export more drivers to
      support this optimization soon.

      I want to thank, in no particular or implied order, Jesper
      Dangaard Brouer, Eric Dumazet, Alexander Duyck, Tom Herbert, Jamal
      Hadi Salim, John Fastabend, Florian Westphal, Daniel Borkmann,
      David Tat, Hannes Frederic Sowa, and Rusty Russell.

   2) PTP and timestamping support in bnx2x, from Michal Kalderon.

   3) Allow adjusting the rx_copybreak threshold for a driver via
      ethtool, and add rx_copybreak support to enic driver.  From
      Govindarajulu Varadarajan.

   4) Significant enhancements to the generic PHY layer and the bcm7xxx
      driver in particular (EEE support, auto power down, etc.) from
      Florian Fainelli.

   5) Allow raw buffers to be used for flow dissection, allowing drivers
      to determine the optimal "linear pull" size for devices that DMA
      into pools of pages.  The objective is to get exactly the
      necessary amount of headers into the linear SKB area pre-pulled,
      but no more.  The new interface drivers use is eth_get_headlen().
      From WANG Cong, with driver conversions (several had their own
      by-hand duplicated implementations) by Alexander Duyck and Eric
      Dumazet.

   6) Support checksumming more smoothly and efficiently for
      encapsulations, and add "foo over UDP" facility.  From Tom
      Herbert.

   7) Add Broadcom SF2 switch driver to DSA layer, from Florian
      Fainelli.

   8) eBPF now can load programs via a system call and has an extensive
      testsuite.  Alexei Starovoitov and Daniel Borkmann.

   9) Major overhaul of the packet scheduler to use RCU in several major
      areas such as the classifiers and rate estimators.  From John
      Fastabend.

  10) Add driver for Intel FM10000 Ethernet Switch, from Alexander
      Duyck.

  11) Rearrange TCP_SKB_CB() to reduce cache line misses, from Eric
      Dumazet.

  12) Add Datacenter TCP congestion control algorithm support, From
      Florian Westphal.

  13) Reorganize sk_buff so that __copy_skb_header() is significantly
      faster.  From Eric Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1558 commits)
  netlabel: directly return netlbl_unlabel_genl_init()
  net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers
  net: description of dma_cookie cause make xmldocs warning
  cxgb4: clean up a type issue
  cxgb4: potential shift wrapping bug
  i40e: skb->xmit_more support
  net: fs_enet: Add NAPI TX
  net: fs_enet: Remove non NAPI RX
  r8169:add support for RTL8168EP
  net_sched: copy exts->type in tcf_exts_change()
  wimax: convert printk to pr_foo()
  af_unix: remove 0 assignment on static
  ipv6: Do not warn for informational ICMP messages, regardless of type.
  Update Intel Ethernet Driver maintainers list
  bridge: Save frag_max_size between PRE_ROUTING and POST_ROUTING
  tipc: fix bug in multicast congestion handling
  net: better IFF_XMIT_DST_RELEASE support
  net/mlx4_en: remove NETDEV_TX_BUSY
  3c59x: fix bad split of cpu_to_le32(pci_map_single())
  net: bcmgenet: fix Tx ring priority programming
  ...
2014-10-08 21:40:54 -04:00
Linus Torvalds
e4e65676f2 Fixes and features for 3.18.
Apart from the usual cleanups, here is the summary of new features:
 
 - s390 moves closer towards host large page support
 
 - PowerPC has improved support for debugging (both inside the guest and
   via gdbstub) and support for e6500 processors
 
 - ARM/ARM64 support read-only memory (which is necessary to put firmware
   in emulated NOR flash)
 
 - x86 has the usual emulator fixes and nested virtualization improvements
   (including improved Windows support on Intel and Jailhouse hypervisor
   support on AMD), adaptive PLE which helps overcommitting of huge guests.
   Also included are some patches that make KVM more friendly to memory
   hot-unplug, and fixes for rare caching bugs.
 
 Two patches have trivial mm/ parts that were acked by Rik and Andrew.
 
 Note: I will soon switch to a subkey for signing purposes.  To verify
 future signed pull requests from me, please update my key with
 "gpg --recv-keys 9B4D86F2".  You should see 3 new subkeys---the
 one for signing will be a 2048-bit RSA key, 4E6B09D7.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJUL5sPAAoJEBvWZb6bTYbyfkEP/3MNhSyn6HCjPjtjLNPAl9KL
 WpExZSUFL2+4CztpdGIsek1BeJYHmqv3+c5S+WvaWVA1aqh2R7FT1D1ErBLjgLQq
 lq23IOr+XxmC3dXQUEEk+TlD+283UzypzEG4l4UD3JYg79fE3UrXAz82SeyewJDY
 x7aPYhkZG3RHu+wAyMPasG6E3zS5LySdUtGWbiPwz5BejrhBJoJdeb2WIL/RwnUK
 7ppSLB5EoFj/uMkuyeAAdAbdfSrhHA6faDZxNdxS9k9wGutrhhfUoQ49ONrKG4dV
 sFo1tSPTVgRs8QFYUZ2fJUPBAmUVddsgqh2K9d0NftGTq7b8YszaCsfFrs2/Y4MU
 YxssWEhxsfszerCu12bbAJrv6JBZYQ7TwGvI9L7P0iFU6IVw/djmukU4AkM9/e91
 YS/cue/PN+9Pn2ccXzL9J7xRtZb8FsOuRsCXTCmbOwDkLmrKPDBN2t3RUbeF+Eam
 ABrpWnLKX13kZSo4LKU+/niarzmPMp7odQfHVdr8ea0fiYLp4iN8puA20WaSPIgd
 CLvm+RAvXe5Lm91L4mpFotJ2uFyK6QlIYJV4FsgeWv/0D0qppWQi0Utb/aCNHCgy
 z8MyUMD48y7EpoQrFYr/7cddXIu0/NegnM8I1coVjIPEk4NfeebGUlCJ/V3D8wMG
 BgEfS2x6jRc5zB3hjwDr
 =iEVi
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "Fixes and features for 3.18.

  Apart from the usual cleanups, here is the summary of new features:

   - s390 moves closer towards host large page support

   - PowerPC has improved support for debugging (both inside the guest
     and via gdbstub) and support for e6500 processors

   - ARM/ARM64 support read-only memory (which is necessary to put
     firmware in emulated NOR flash)

   - x86 has the usual emulator fixes and nested virtualization
     improvements (including improved Windows support on Intel and
     Jailhouse hypervisor support on AMD), adaptive PLE which helps
     overcommitting of huge guests.  Also included are some patches that
     make KVM more friendly to memory hot-unplug, and fixes for rare
     caching bugs.

  Two patches have trivial mm/ parts that were acked by Rik and Andrew.

  Note: I will soon switch to a subkey for signing purposes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (157 commits)
  kvm: do not handle APIC access page if in-kernel irqchip is not in use
  KVM: s390: count vcpu wakeups in stat.halt_wakeup
  KVM: s390/facilities: allow TOD-CLOCK steering facility bit
  KVM: PPC: BOOK3S: HV: CMA: Reserve cma region only in hypervisor mode
  arm/arm64: KVM: Report correct FSC for unsupported fault types
  arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc
  kvm: Fix kvm_get_page_retry_io __gup retval check
  arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset
  kvm: x86: Unpin and remove kvm_arch->apic_access_page
  kvm: vmx: Implement set_apic_access_page_addr
  kvm: x86: Add request bit to reload APIC access page address
  kvm: Add arch specific mmu notifier for page invalidation
  kvm: Rename make_all_cpus_request() to kvm_make_all_cpus_request() and make it non-static
  kvm: Fix page ageing bugs
  kvm/x86/mmu: Pass gfn and level to rmapp callback.
  x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-only
  kvm: x86: use macros to compute bank MSRs
  KVM: x86: Remove debug assertion of non-PAE reserved bits
  kvm: don't take vcpu mutex for obviously invalid vcpu ioctls
  kvm: Faults which trigger IO release the mmap_sem
  ...
2014-10-08 05:27:39 -04:00
Ian Munsie
4c6d9acce1 powerpc/mm: Add hooks for cxl
This adds hooks into the core powerpc mm code for cxl.

The core powerpc code sometimes uses local tlbie. Unfortunately this won't
work with the current cxl driver as it relies on snooping tlbie broadcasts.

The cxl hardware can have TLB entries invalidated via MMIO but this is not
currently supported by the driver. In future we can make local tlbie smarter so
that it invalidates cxl contexts via MMIO when it needs to but for now we have
this workaround.

This workaround checks for any active cxl contexts and if so, disables local
tlbie.

This also adds a hook for when SLBs are invalidated. This ensures any
corresponding SLBs in cxl are also invalidated at the same time. This is
required for segment demotion.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:55 +11:00
Ian Munsie
0952173601 powerpc/opal: Add PHB to cxl mode call
This adds the OPAL call to change a PHB into cxl mode.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:52 +11:00
Ian Munsie
a1dca3465a powerpc/mm: Add new hash_page_mm()
This adds a new function hash_page_mm() based on the existing hash_page().
This version allows any struct mm to be passed in, rather than assuming
current. This is useful for servicing co-processor faults which are not in the
context of the current running process.

We need to be careful here as the current hash_page() assumes current in a few
places.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:44 +11:00
Ian Munsie
80c49c7e4a powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts
This adds a number of functions for allocating IRQs under powernv PCIe for cxl.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:44 +11:00
Ian Munsie
fd9a1c26ae powerpc/powernv: Split out set MSI IRQ chip code
Some of the MSI IRQ code in pnv_pci_ioda_msi_setup() is generically useful so
split it out.

This will be used by some of the cxl PCIe code later.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:43 +11:00
Ian Munsie
8ca7a82f7b powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize
Export mmu_kernel_ssize and mmu_linear_psize.  These are needed by the cxl
driver which has it's own MMU.  To setup the MMU cxl needs access to these.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:42 +11:00
Ian Munsie
b0345bbc6d powerpc/msi: Improve IRQ bitmap allocator
Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a
lot of IRQs which can be a scarce resource.

For cxl we may require multiple IRQs for every context that is attached to the
accelerator. There may be 1000s of contexts attached, hence we can easily run
out of IRQs, especially if we are needlessly wasting them.

This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
of IRQs, hence avoiding this wastage. It keeps the natural alignment
requirement though.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:42 +11:00
Ian Munsie
be3ebfe821 powerpc/cell: Make spu_flush_all_slbs() generic
This moves spu_flush_all_slbs() into a generic call copro_flush_all_slbs().

This will be useful when we add cxl which also needs a similar SLB flush call.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:37 +11:00
Ian Munsie
73d16a6e0e powerpc/cell: Move data segment faulting code out of cell platform
__spu_trap_data_seg() currently contains code to determine the VSID and ESID
required for a particular EA and mm struct.

This code is generically useful for other co-processors. This moves the code of
the cell platform so it can be used by other powerpc code. It also adds 1TB
segment handling which Cell didn't support.  The new function is called
copro_calculate_slb().

This also moves the internal struct spu_slb to a generic struct copro_slb which
is now used in the Cell and copro code.  We use this new struct instead of
passing around esid and vsid parameters.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:14:55 +11:00
Ian Munsie
e83d016975 powerpc/cell: Move spu_handle_mm_fault() out of cell platform
Currently spu_handle_mm_fault() is in the cell platform.

This code is generically useful for other non-cell co-processors on powerpc.

This patch moves this function out of the cell platform into arch/powerpc/mm so
that others may use it.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:14:54 +11:00
Linus Torvalds
28596c9722 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull "trivial tree" updates from Jiri Kosina:
 "Usual pile from trivial tree everyone is so eagerly waiting for"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  Remove MN10300_PROC_MN2WS0038
  mei: fix comments
  treewide: Fix typos in Kconfig
  kprobes: update jprobe_example.c for do_fork() change
  Documentation: change "&" to "and" in Documentation/applying-patches.txt
  Documentation: remove obsolete pcmcia-cs from Changes
  Documentation: update links in Changes
  Documentation: Docbook: Fix generated DocBook/kernel-api.xml
  score: Remove GENERIC_HAS_IOMAP
  gpio: fix 'CONFIG_GPIO_IRQCHIP' comments
  tty: doc: Fix grammar in serial/tty
  dma-debug: modify check_for_stack output
  treewide: fix errors in printk
  genirq: fix reference in devm_request_threaded_irq comment
  treewide: fix synchronize_rcu() in comments
  checkstack.pl: port to AArch64
  doc: queue-sysfs: minor fixes
  init/do_mounts: better syntax description
  MIPS: fix comment spelling
  powerpc/simpleboot: fix comment
  ...
2014-10-07 21:16:26 -04:00
Michael Neuling
60666de2da powerpc/pseries: Use new defines when calling H_SET_MODE
Now that we define these in the KVM code, use these defines when we call
H_SET_MODE. No functional change.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-07 22:01:56 +11:00
sukadev@linux.vnet.ibm.com
56f12bee55 powerpc/perf/hv-24x7: Simplify catalog_read()
catalog_read() implements the read interface for the sysfs file

	/sys/bus/event_source/devices/hv_24x7/interface/catalog

It essentially takes a buffer, an offset and count as parameters
to the read() call.  It makes a hypervisor call to read a specific
page from the catalog and copy the required bytes into the given
buffer. Each call to catalog_read() returns at most one 4K page.

Given these requirements, we should be able to simplify the
catalog_read().

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-07 16:57:10 +11:00
Cody P Schafer
48bee8a6c9 powerpc/perf/hv-24x7: use kmem_cache instead of aligned stack allocations
Ian pointed out the use of __aligned(4096) caused rather large stack
consumption in single_24x7_request(), so use the kmem_cache
hv_page_cache (which we've already got set up for other allocations)
insead of allocating locally.

CC: Haren Myneni <hbabu@us.ibm.com>
Reported-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Cody P Schafer <dev@codyps.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-07 16:52:58 +11:00
Benjamin Herrenschmidt
bf7588a085 powerpc/powernv: Fix endian bug in LPC bus debugfs accessors
When reading from the LPC, the OPAL FW calls return the value via pointer
to a uint32_t which is always returned big endian. Our internal inb/outb
implementation byteswaps that fine but our debugfs code is still broken.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-07 16:01:18 +11:00
Michael Ellerman
75d43b2d0a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux.git
Freescale updates from Scott (27 commits):

  "Highlights include DMA32 zone support (SATA, USB, etc now works on 64-bit
   FSL kernels), MSI changes, 8xx optimizations and cleanup, t104x board
   support, and PrPMC PCI enumeration."
2014-10-04 08:59:06 +10:00
Michael Ellerman
d0b7abb2c7 powerpc: Enable CONFIG_CRASH_DUMP=y for ppc64_defconfig
It pulls in more code, including causing us to build a relocatable
kernel, which is good for testing.

The resulting kernel is still usable as a non-crash dump kernel.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 18:03:55 +10:00
Michael Ellerman
edcee77fef powerpc/kdump: crash_dump.c needs to include io.h
For __ioremap().

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 18:03:35 +10:00
Michael Ellerman
d3b94e4b3b powerpc: Don't build powernv for other platform defconfigs
Because powernv arrived after these other platforms, the defconfigs
didn't have PPC_POWERNV disabled, and being default y it gets turned on.

If we're going to bother having defconfigs for the specific platforms
then they should only build the code required for those platforms.

The grab bag of everything config is ppc64_defconfig.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 18:03:11 +10:00
Wei Yang
8abf29f829 powerpc/pci: remove duplicate declaration of pci_bus_find_capability
pci_bus_find_capability() is decleared in pci.h, so it is not necessary to do
it again.

This patch removes it.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 17:27:38 +10:00
Alexey Kardashevskiy
9410e0185e powerpc/iommu/ddw: Fix endianness
rtas_call() accepts and returns values in CPU endianness.
The ddw_query_response and ddw_create_response structs members are
defined and treated as BE but as they are passed to rtas_call() as
(u32 *) and they get byteswapped automatically, the data is CPU-endian.
This fixes ddw_query_response and ddw_create_response definitions and use.

of_read_number() is designed to work with device tree cells - it assumes
the input is big-endian and returns data in CPU-endian. However due
to the ddw_create_response struct fix, create.addr_hi/lo are already
CPU-endian so do not byteswap them.

ddw_avail is a pointer to the "ibm,ddw-applicable" property which contains
3 cells which are big-endian as it is a device tree. rtas_call() accepts
a RTAS token in CPU-endian. This makes use of of_property_read_u32_array
to byte swap and avoid the need for a number of be32_to_cpu calls.

Cc: stable@vger.kernel.org # v3.13+
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: folded Anton's patch with of_property_read_u32_array]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-03 14:22:34 +10:00
Rik van Riel
347abad981 sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
On 32 bit systems cmpxchg cannot handle 64 bit values, so
some additional magic is required to allow a 32 bit system
with CONFIG_VIRT_CPU_ACCOUNTING_GEN=y enabled to build.

Make sure the correct cmpxchg function is used when doing
an atomic swap of a cputime_t.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: umgwanakikbuti@gmail.com
Cc: fweisbec@gmail.com
Cc: srao@redhat.com
Cc: lwoodman@redhat.com
Cc: atheurer@redhat.com
Cc: oleg@redhat.com
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linux390@de.ibm.com
Cc: linux-arch@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/20140930155947.070cdb1f@annuminas.surriel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-03 05:46:55 +02:00
Anton Blanchard
a7696b36c0 powerpc: Add printk levels to powerpc code
Add printk levels to some places in the powerpc port.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:55 +10:00
Anton Blanchard
9a4f5cd0a5 powerpc: Add printk levels to powernv platform code
Add printk levels to powernv platform code, and convert to
pr_err() etc while here.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:55 +10:00
Anton Blanchard
3e47d1474c powerpc: Remove powerpc specific cmd_line
There is no need for yet another copy of the command line, just
use boot_command_line like everyone else.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:55 +10:00
Anton Blanchard
c7d1f6afe0 powerpc: Use pr_fmt in module loader code
Use pr_fmt to give some context to the error messages in the
module code, and convert open coded debug printk to pr_debug.

Use pr_err for error messages.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:33:54 +10:00
Anton Blanchard
9d57472f61 powerpc: Fill in si_addr_lsb siginfo field
Fill in the si_addr_lsb siginfo field so the hwpoison code can
pass to userspace the length of memory that has been corrupted.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:15:17 +10:00
Anton Blanchard
3913fdd7a2 powerpc: Add VM_FAULT_HWPOISON handling to powerpc page fault handler
do_page_fault was missing knowledge of HWPOISON, and we would oops
if userspace tried to access a poisoned page:

kernel BUG at arch/powerpc/mm/fault.c:180!

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:15:13 +10:00
Anton Blanchard
63af52629a powerpc: Simplify do_sigbus
Exit out early for a kernel fault, avoiding indenting of
most of the function.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 17:15:09 +10:00
Anton Blanchard
e35735b9a5 powerpc: Speed up clear_page by unrolling it
Unroll clear_page 8 times. A simple microbenchmark which
allocates and frees a zeroed page:

for (i = 0; i < iterations; i++) {
	unsigned long p = __get_free_page(GFP_KERNEL | __GFP_ZERO);
	free_page(p);
}

improves 20% on POWER8.

This assumes cacheline sizes won't grow beyond 512 bytes or
page sizes wont drop below 1kB, which is unlikely, but we could
add a runtime check during early init if it makes people nervous.

Michael found that some versions of gcc produce quite bad code
(all multiplies), so we give gcc a hand by using shifts and adds.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-02 16:04:21 +10:00
Yijing Wang
1e8f4cc82e MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()
rtas_setup_msi_irqs() already has the struct msi_desc pointer required by
__read_msi_msg(), so call it directly instead of having read_msi_msg() look
it up from the IRQ.

No functional change.

[bhelgaas: changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: linuxppc-dev@lists.ozlabs.org
2014-10-01 12:21:46 -06:00
Alexander Gordeev
6b2fd7efeb PCI/MSI/PPC: Remove arch_msi_check_device()
Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs().
This makes the code more compact and allows removing
arch_msi_check_device() from generic MSI code.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-01 12:21:14 -06:00
Gavin Shan
2013add4ce powerpc/eeh: Show hex prefix for PE state sysfs
As Michael suggested, the hex prefix for the output of EEH PE
state sysfs entry (/sys/bus/pci/devices/xxx/eeh_pe_state) is
always informative to users.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-01 22:23:34 +10:00
Gavin Shan
fe7e85c6f5 powerpc/powernv: Override dma_get_required_mask()
The dma_get_required_mask() function is used by some drivers to
query the platform about what DMA mask is needed to cover all of
memory. This is a bit of a strange semantic when we have to choose
between IOMMU translation or bypass, but essentially what it means
is "what DMA mask will give best performances".

Currently, our IOMMU backend always returns a 32-bit mask here, we
don't do anything special to it when we have bypass available. This
causes some drivers to choose a 32-bit mask, thus losing the ability
to use the bypass window, thinking this is more efficient. The problem
was reported from the driver of following device:

0004:03:00.0 0107: 1000:0087 (rev 05)
0004:03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios \
             Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)

This patch adds an override of that function in order to, instead,
return a 64-bit mask whenever a bypass window is available in order
for drivers to prefer this configuration.

Reported-by: Murali N. Iyer <mniyer@us.ibm.com>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:20 +10:00
Gavin Shan
372fb80db9 powerpc/powernv: Fetch frozen PE on top level
It should have been part of commit 1ad7a72c5 ("powerpc/eeh: Report
frozen parent PE prior to child PE"). There are 2 ways to report
EEH errors: proactively polling because of 0xFF's returned from
PCI config or IO read, or interrupt driven event. We missed to
report and handle parent frozen PE prior to child frozen PE for
the later case on PowerNV platform.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:20 +10:00
Gavin Shan
f2e0be5e76 powerpc/eeh: Dump PCI config space for all child devices
The PEs can be organized as nested. Current implementation doesn't
dump PCI config space for subordinate devices of child PEs. However,
the frozen PE could be caused by those subordinate devices of its
child PEs.

The patch dumps PCI config space for all subordinate devices of the
problematic PE.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:19 +10:00
Gavin Shan
5cfb20b96f powerpc/eeh: Emulate EEH recovery for VFIO devices
When enabling EEH functionality on passed through devices (PE)
with VFIO, the devices in the PE would be removed permanently
from guest side. In that case, the PE remains frozen state.
When returning PE to host, or restarting the guest again, we
had mechanism unfreezing the PE by clearing PESTA/B frozen
bits. However, that's not enough for some adapters, which are
indicated as following "lspci" shows. Those adapters require
hot reset on the parent bus to bring their firmware back to
workable state. Otherwise, those adaptrs won't be operative
and the host (for returning case) or the guest will fail to
load the drivers for those adapters without exception.

0000:01:00.0 Ethernet controller: Emulex Corporation OneConnect \
             10Gb NIC (be3) (rev 02)
0000:01:00.0 0200: 19a2:0710 (rev 02)
0001:03:00.0 Ethernet controller: Emulex Corporation OneConnect \
             NIC (Lancer) (rev 10)
0001:03:00.0 0200: 10df:e220 (rev 10)

The patch adds mechanism to emulate EEH recovery (for hot reset
on parent PCI bus) on 3 gates to fix the issue: open/release one
adapter of the PE, enable EEH functionality on one adapter of the
PE.

Reported-by:  Murilo Fossa Vicentini <muvic@br.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:18 +10:00
Gavin Shan
93e8b36d7b powerpc/eeh: Tag reset state for user owned PE
PE would be owned by userland, which probably request PE reset
done in host side. During the reset, we should drop the PCI
config accesses to the PE with help of flag EEH_PE_RESET.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:17 +10:00
Gavin Shan
d1a85eee35 powerpc/powernv: Sync OpalPciResetScope with firmware
The names of PCI reset scopes aren't sychronized with firmware.
The patch fixes it.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:17 +10:00
Gavin Shan
4ba5a0fc64 powerpc/pseries: Decrease message level on EEH initialization
As Anton suggested, the patch decreases the message level on EEH
initialization to avoid unnecessary messages if required. Also,
we have unified hint if any of needful RTAS calls is missed, and
then we can check /proc/device-tree to figure out the missed RTAS
calls.

Suggested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:16 +10:00
Gavin Shan
9372dddb18 powerpc/eeh: Block PCI config access during reset
Function pcibios_set_pcie_reset_state() can be used to do PCI
reset. PCI config access during the reset usually causes EEH
errors unexpectedly. In order to avoid the EEH error, the patch
blocks PCI config access during reset with the help of flag
EEH_PE_RESET, which is similar to what we did in EEH PE reset
path.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:15 +10:00
Gavin Shan
c9dd014397 powerpc/eeh: Use eeh_unfreeze_pe()
The patch uses eeh_unfreeze_pe() to replace the logic clearing
frozen IO and DMA, in order to simplify the code.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:14 +10:00
Gavin Shan
4eeeff0ebc powerpc/eeh: Unfreeze PE on enabling EEH functionality
When passing through PE to guest, that's possibly in frozen
state. The driver for the pass-through devices on guest side
can't be loaded successfully as reported. We already had one
gate in eeh_dev_open() to clear PE frozen state accordingly,
but that's not enough because the function is only called at
QEMU startup for once.

The patch adds another gate in eeh_pe_set_option() so that the
PE frozen state can be cleared at QEMU restart time.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:14 +10:00
Gavin Shan
4d4f577e4b powerpc/eeh: Fix improper condition in eeh_pci_enable()
The function eeh_pci_enable() is called to apply various requests
to one particular PE: Enabling EEH, Disabling EEH, Enabling IO,
Enabling DMA, Freezing PE. When enabling IO or DMA on one specific
PE, we need check that IO or DMA isn't enabled previously. But
the condition used to do the check isn't completely correct because
one PE would be in DMA frozen state with workable IO path, or vice
versa.

The patch fixes the improper condition.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:13 +10:00
Gavin Shan
22fca17924 powerpc/eeh: Clear frozen device state in time
The problem was reported by Carol: In the scenario of passing mlx4
adapter to guest, EEH error could be recovered successfully. When
returning the device back to host, the driver (mlx4_core.ko)
couldn't be loaded successfully because of error number -5 (-EIO)
returned from mlx4_get_ownership(), which hits offlined PCI device.
The root cause is that we missed to put the affected devices into
normal state on clearing PE isolated state right after PE reset.

The patch fixes above issue by putting the affected devices to
normal state when clearing PE isolated state in eeh_pe_state_clear().

Cc: stable@vger.kernel.org
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:12 +10:00
Gavin Shan
d9df1b5da1 powerpc/powernv: Clear PAPR error injection registers
The frozen state on one specific PE is probably caused by error
injection, which is done with help of PAPR error injection registers.
According to the hardware spec, those registers should be cleared
automatically after one-shot frozen PE. However, that's not always
true, at least on P7IOC of Firebird-L. So we have to clear them
before doing PE reset to avoid recursive EEH errors at recovery
stage.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:11 +10:00
Mike Qiu
7a06278229 powerpc/powernv: Add PCI error injection debugfs entry
The patch adds debugfs file (/sys/kernel/debug/powerpc/PCIxxxx/
err_injct), which accepts following formated string, to support
error injection. It will be used to support userland utility
"errinjct" in future.

  "pe_no:0:function:address:mask" - 32-bits PCI errors
  "pe_no:1:function:address:mask" - 64-bits PCI errors

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:10 +10:00
Gavin Shan
131c123abe powerpc/eeh: Introduce eeh_ops::err_inject
The patch introduces eeh_ops::err_inject(), which allows to inject
specified errors to indicated PE for testing purpose. The functionality
isn't support on pSeries platform. On PowerNV, the functionality
relies on OPAL API opal_pci_err_inject().

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:10 +10:00
Gavin Shan
5b64234081 powerpc/powernv: Sync header with firmware
The patch synchronizes firmware header file (opal.h) for PCI error
injection.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:09 +10:00
Gavin Shan
404079c87e powerpc/eeh: Clear frozen state on passing device
When passing through device, its PE might have been put into frozen
state. One obvious example would be: the passed PE is forced to be
offline because of hitting maximal allowed EEH errors in userland.
In that case, the frozen state won't be cleared and then the PE is
returned back to host, which might not have chance detecting and
recovering from it.

The patch adds more check when passing through device and clear the
PE frozen state if necessary.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:08 +10:00
Gavin Shan
316233ff87 powerpc/eeh: Reenable PCI devices after reset
The PCI devices that have been passed through are enabled before
reset, we need restore to the enabled state after reset. Otherwise,
MMIO access might be issued to disabled devices after reset and
causes exceptional recursive EEH error.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:07 +10:00
Gavin Shan
0d5ee5205e powerpc/eeh: Freeze PE before PE reset
The patch adds one more option (EEH_OPT_FREEZE_PE) to set_option()
method to proactively freeze PE, which will be issued before resetting
pass-throughed PE to drop MMIO access during reset because it's
always contributing to recursive EEH error.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:07 +10:00
Gavin Shan
940376b3a4 powerpc/eeh: Add eeh_pe_state sysfs entry
The patch adds sysfs entry "eeh_pe_state". Reading on it returns
the PE's state while writing to it clears the frozen state. It's
used to check or clear the PE frozen state from userland for
debugging purpose.

The patch also replaces printk(KERN_WARNING ...) with pr_warn() in
eeh_sysfs.c

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:06 +10:00