Commit graph

106826 commits

Author SHA1 Message Date
Ingo Molnar
2e54a5bdba perf/x86/intel/pt: Fix the 32-bit build
On a 32-bit build I got:

  arch/x86/kernel/cpu/perf_event_intel_pt.c:413:5: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
  arch/x86/kernel/cpu/perf_event_intel_bts.c:162:24: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]

Fix it. The code should probably be (re-)tested on 32-bit systems to make
sure all is fine.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kaixu Xia <kaixu.xia@linaro.org>
Cc: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: markus.t.metzger@intel.com
Cc: mathieu.poirier@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:58:45 +02:00
Andi Kleen
cd1f11de69 perf/x86/intel: Avoid rewriting DEBUGCTL with the same value for LBRs
perf with LBRs on has a tendency to rewrite the DEBUGCTL MSR with
the same value. Add a little optimization to skip the unnecessary
write.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1426871484-21285-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:20 +02:00
Andi Kleen
1a78d93750 perf/x86/intel: Streamline LBR MSR handling in PMI
The perf PMI currently does unnecessary MSR accesses when
LBRs are enabled. We use LBR freezing, or when in callstack
mode force the LBRs to only filter on ring 3.

So there is no need to disable the LBRs explicitely in the
PMI handler.

Also we always unnecessarily rewrite LBR_SELECT in the LBR
handler, even though it can never change.

 5)               |  /* write_msr: MSR_LBR_SELECT(1c8), value 0 */
 5)               |  /* read_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */
 5)               |  /* write_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */
 5)               |  /* write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 70000000f */
 5)               |  /* write_msr: MSR_CORE_PERF_GLOBAL_CTRL(38f), value 0 */
 5)               |  /* write_msr: MSR_LBR_SELECT(1c8), value 0 */
 5)               |  /* read_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */
 5)               |  /* write_msr: MSR_IA32_DEBUGCTLMSR(1d9), value 1801 */

This patch:

  - Avoids disabling already frozen LBRs unnecessarily in the PMI
  - Avoids changing LBR_SELECT in the PMI

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1426871484-21285-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:19 +02:00
Andi Kleen
15fde1101a perf/x86: Only dump PEBS register when PEBS has been detected
Technically PEBS_ENABLED is only guaranteed to exist when we
detected PEBS. So add a check for this to the PMU dump function.
I don't think it can happen on a real CPU, but could in a VM.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1425059312-18217-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:17 +02:00
Andi Kleen
da3e606d88 perf/x86: Dump DEBUGCTL in PMU dump
LBRs and LBR freezing are controlled through the DEBUGCTL MSR. So
dump the state of DEBUGCTL too when dumping the PMU state.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1425059312-18217-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:17 +02:00
Andi Kleen
8882edf735 perf/x86/intel: Reset more state in PMU reset
The PMU reset code didn't quite keep up with newer PMU features.
Improve it a bit to really reset a modern PMU:

  - Clear all overflow status
  - Clear LBRs and freezing state
  - Disable fixed counters too

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1425059312-18217-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:16 +02:00
Stephane Eranian
b37609c30e perf/x86/intel: Make the HT bug workaround conditional on HT enabled
This patch disables the PMU HT bug when Hyperthreading (HT)
is disabled. We cannot do this test immediately when perf_events
is initialized. We need to wait until the topology information
is setup properly. As such, we register a later initcall, check
the topology and potentially disable the workaround. To do this,
we need to ensure there is no user of the PMU. At this point of
the boot, the only user is the NMI watchdog, thus we disable
it during the switch and re-enable it right after.

Having the workaround disabled when it is not needed provides
some benefits by limiting the overhead is time and space.
The workaround still ensures correct scheduling of the corrupting
memory events (0xd0, 0xd1, 0xd2) when HT is off. Those events
can only be measured on counters 0-3. Something else the current
kernel did not handle correctly.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Cc: maria.n.dimakopoulou@gmail.com
Link: http://lkml.kernel.org/r/1416251225-17721-13-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:15 +02:00
Stephane Eranian
c02cdbf60b perf/x86/intel: Limit to half counters when the HT workaround is enabled, to avoid exclusive mode starvation
This patch limits the number of counters available to each CPU when
the HT bug workaround is enabled.

This is necessary to avoid situation of counter starvation. Such can
arise from configuration where one HT thread, HT0, is using all 4 counters
with corrupting events which require exclusion the the sibling HT, HT1.

In such case, HT1 would not be able to schedule any event until HT0
is done. To mitigate this problem, this patch artificially limits
the number of counters to 2.

That way, we can gurantee that at least 2 counters are not in exclusive
mode and therefore allow the sibling thread to schedule events of the
same type (system vs. per-thread). The 2 counters are not determined
in advance. We simply set the limit to two events per HT.

This helps mitigate starvation in case of events with specific counter
constraints such a PREC_DIST.

Note that this does not elimintate the starvation is all cases. But
it is better than not having it.

(Solution suggested by Peter Zjilstra.)

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Cc: maria.n.dimakopoulou@gmail.com
Link: http://lkml.kernel.org/r/1416251225-17721-11-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:14 +02:00
Stephane Eranian
a90738c2cb perf/x86/intel: Fix intel_get_event_constraints() for dynamic constraints
With dynamic constraint, we need to restart from the static
constraints each time the intel_get_event_constraints() is called.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1416251225-17721-10-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:14 +02:00
Maria Dimakopoulou
b63b4b459a perf/x86/intel: Enforce HT bug workaround with PEBS for SNB/IVB/HSW
This patch modifies the PEBS constraint tables for SNB/IVB/HSW
such that corrupting events supporting PEBS activate the HT
workaround.

Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1416251225-17721-9-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:13 +02:00
Maria Dimakopoulou
93fcf72cc0 perf/x86/intel: Enforce HT bug workaround for SNB/IVB/HSW
This patches activates the HT bug workaround for the
SNB/IVB/HSW processors. This covers non-PEBS mode.
Activation is done thru the constraint tables.

Both client and server processors needs this workaround.

Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1416251225-17721-8-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:12 +02:00
Maria Dimakopoulou
e979121b1b perf/x86/intel: Implement cross-HT corruption bug workaround
This patch implements a software workaround for a HW erratum
on Intel SandyBridge, IvyBridge and Haswell processors
with Hyperthreading enabled. The errata are documented for
each processor in their respective specification update
documents:

  - SandyBridge: BJ122
  - IvyBridge: BV98
  - Haswell: HSD29

The bug causes silent counter corruption across hyperthreads only
when measuring certain memory events (0xd0, 0xd1, 0xd2, 0xd3).
Counters measuring those events may leak counts to the sibling
counter. For instance, counter 0, thread 0 measuring event 0xd0,
may leak to counter 0, thread 1, regardless of the event measured
there. The size of the leak is not predictible. It all depends on
the workload and the state of each sibling hyper-thread. The
corrupting events do undercount as a consequence of the leak. The
leak is compensated automatically only when the sibling counter measures
the exact same corrupting event AND the workload is on the two threads
is the same. Given, there is no way to guarantee this, a work-around
is necessary. Furthermore, there is a serious problem if the leaked count
is added to a low-occurrence event. In that case the corruption on
the low occurrence event can be very large, e.g., orders of magnitude.

There is no HW or FW workaround for this problem.

The bug is very easy to reproduce on a loaded system.
Here is an example on a Haswell client, where CPU0, CPU4
are siblings. We load the CPUs with a simple triad app
streaming large floating-point vector. We use 0x81d0
corrupting event (MEM_UOPS_RETIRED:ALL_LOADS) and
0x20cc (ROB_MISC_EVENTS:LBR_INSERTS). Given we are not
using the LBR, the 0x20cc event should be zero.

  $ taskset -c 0 triad &
  $ taskset -c 4 triad &
  $ perf stat -a -C 0 -e r81d0 sleep 100 &
  $ perf stat -a -C 4 -r20cc sleep 10
  Performance counter stats for 'system wide':
        139 277 291      r20cc
       10,000969126 seconds time elapsed

In this example, 0x81d0 and r20cc ar eusing sinling counters
on CPU0 and CPU4. 0x81d0 leaks into 0x20cc and corrupts it
from 0 to 139 millions occurrences.

This patch provides a software workaround to this problem by modifying the
way events are scheduled onto counters by the kernel. The patch forces
cross-thread mutual exclusion between counters in case a corrupting event
is measured by one of the hyper-threads. If thread 0, counter 0 is measuring
event 0xd0, then nothing can be measured on counter 0, thread 1. If no corrupting
event is measured on any hyper-thread, event scheduling proceeds as before.

The same example run with the workaround enabled, yield the correct answer:

  $ taskset -c 0 triad &
  $ taskset -c 4 triad &
  $ perf stat -a -C 0 -e r81d0 sleep 100 &
  $ perf stat -a -C 4 -r20cc sleep 10
  Performance counter stats for 'system wide':
        0 r20cc
       10,000969126 seconds time elapsed

The patch does provide correctness for all non-corrupting events. It does not
"repatriate" the leaked counts back to the leaking counter. This is planned
for a second patch series. This patch series makes this repatriation more
easy by guaranteeing the sibling counter is not measuring any useful event.

The patch introduces dynamic constraints for events. That means that events which
did not have constraints, i.e., could be measured on any counters, may now be
constrained to a subset of the counters depending on what is going on the sibling
thread. The algorithm is similar to a cache coherency protocol. We call it XSU
in reference to Exclusive, Shared, Unused, the 3 possible states of a PMU
counter.

As a consequence of the workaround, users may see an increased amount of event
multiplexing, even in situtations where there are fewer events than counters
measured on a CPU.

Patch has been tested on all three impacted processors. Note that when
HT is off, there is no corruption. However, the workaround is still enabled,
yet not costing too much. Adding a dynamic detection of HT on turned out to
be complex are requiring too much to code to be justified.

This patch addresses the issue when PEBS is not used. A subsequent patch
fixes the problem when PEBS is used.

Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
[spinlock_t -> raw_spinlock_t]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1416251225-17721-7-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:12 +02:00
Maria Dimakopoulou
6f6539cad9 perf/x86/intel: Add cross-HT counter exclusion infrastructure
This patch adds a new shared_regs style structure to the
per-cpu x86 state (cpuc). It is used to coordinate access
between counters which must be used with exclusion across
HyperThreads on Intel processors. This new struct is not
needed on each PMU, thus is is allocated on demand.

Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
[peterz: spinlock_t -> raw_spinlock_t]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1416251225-17721-6-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:11 +02:00
Stephane Eranian
79cba82244 perf/x86: Add 'index' param to get_event_constraint() callback
This patch adds an index parameter to the get_event_constraint()
x86_pmu callback. It is expected to represent the index of the
event in the cpuc->event_list[] array. When the callback is used
for fake_cpuc (evnet validation), then the index must be -1. The
motivation for passing the index is to use it to index into another
cpuc array.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Cc: maria.n.dimakopoulou@gmail.com
Link: http://lkml.kernel.org/r/1416251225-17721-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:10 +02:00
Maria Dimakopoulou
c5362c0c37 perf/x86: Add 3 new scheduling callbacks
This patch adds 3 new PMU model specific callbacks
during the event scheduling done by x86_schedule_events().

  ->start_scheduling():  invoked when entering the schedule routine.
  ->stop_scheduling():   invoked at the end of the schedule routine
  ->commit_scheduling(): invoked for each committed event

To be used optionally by model-specific code.

Signed-off-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1416251225-17721-4-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:09 +02:00
Stephane Eranian
9041346431 perf/x86: Vectorize cpuc->kfree_on_online
Make the cpuc->kfree_on_online a vector to accommodate
more than one entry and add the second entry to be
used by a later patch.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1416251225-17721-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:08 +02:00
Stephane Eranian
9a5e3fb52a perf/x86: Rename x86_pmu::er_flags to 'flags'
Because it will be used for more than just tracking the
presence of extra registers.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: bp@alien8.de
Cc: jolsa@redhat.com
Cc: kan.liang@intel.com
Cc: maria.n.dimakopoulou@gmail.com
Link: http://lkml.kernel.org/r/1416251225-17721-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:33:08 +02:00
Ingo Molnar
c2b078e78a Merge branch 'perf/urgent' into perf/core, before applying dependent patches
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:17:46 +02:00
Alexander Shishkin
8062382c8d perf/x86/intel/bts: Add BTS PMU driver
Add support for Branch Trace Store (BTS) via kernel perf event infrastructure.
The difference with the existing implementation of BTS support is that this
one is a separate PMU that exports events' trace buffers to userspace by means
of AUX area of the perf buffer, which is zero-copy mapped into userspace.

The immediate benefit is that the buffer size can be much bigger, resulting in
fewer interrupts and no kernel side copying is involved and little to no trace
data loss. Also, kernel code can be traced with this driver.

The old way of collecting BTS traces still works.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kaixu Xia <kaixu.xia@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: markus.t.metzger@intel.com
Cc: mathieu.poirier@linaro.org
Link: http://lkml.kernel.org/r/1422614435-114702-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:14:21 +02:00
Alexander Shishkin
52ca9ced3f perf/x86/intel/pt: Add Intel PT PMU driver
Add support for Intel Processor Trace (PT) to kernel's perf events.
PT is an extension of Intel Architecture that collects information about
software execuction such as control flow, execution modes and timings and
formats it into highly compressed binary packets. Even being compressed,
these packets are generated at hundreds of megabytes per second per core,
which makes it impractical to decode them on the fly in the kernel.

This driver exports trace data by through AUX space in the perf ring
buffer, which is zero-copy mapped into userspace for faster data retrieval.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kaixu Xia <kaixu.xia@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: markus.t.metzger@intel.com
Cc: mathieu.poirier@linaro.org
Link: http://lkml.kernel.org/r/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:14:20 +02:00
Alexander Shishkin
4807034248 perf/x86: Mark Intel PT and LBR/BTS as mutually exclusive
Intel PT cannot be used at the same time as LBR or BTS and will cause a
general protection fault if they are used together. In order to avoid
fixing up GPs in the fast path, instead we disallow creating LBR/BTS
events when PT events are present and vice versa.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kaixu Xia <kaixu.xia@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: markus.t.metzger@intel.com
Cc: mathieu.poirier@linaro.org
Link: http://lkml.kernel.org/r/1421237903-181015-12-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:14:19 +02:00
Alexander Shishkin
ed69628b3b x86: Add Intel Processor Trace (INTEL_PT) cpu feature detection
Intel Processor Trace is an architecture extension that allows for program
flow tracing.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kaixu Xia <kaixu.xia@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robert Richter <rric@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: kan.liang@intel.com
Cc: markus.t.metzger@intel.com
Cc: mathieu.poirier@linaro.org
Link: http://lkml.kernel.org/r/1421237903-181015-11-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:14:18 +02:00
Andi Kleen
c420f19b9c perf/x86/intel: Fix Haswell CYCLE_ACTIVITY.* counter constraints
Some of the CYCLE_ACTIVITY.* events can only be scheduled on
counter 2.  Due to a typo Haswell matched those with
INTEL_EVENT_CONSTRAINT, which lead to the events never
matching as the comparison does not expect anything
in the umask too. Fix the typo.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1425925222-32361-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:07:43 +02:00
Kan Liang
687805e4a6 perf/x86/intel: Filter branches for PEBS event
For supporting Intel LBR branches filtering, Intel LBR sharing logic
mechanism is introduced from commit b36817e886 ("perf/x86: Add Intel
LBR sharing logic"). It modifies __intel_shared_reg_get_constraints() to
config lbr_sel, which is finally used to set LBR_SELECT.

However, the intel_shared_regs_constraints() function is called after
intel_pebs_constraints(). The PEBS event will return immediately after
intel_pebs_constraints(). So it's impossible to filter branches for PEBS
events.

This patch moves intel_shared_regs_constraints() ahead of
intel_pebs_constraints().

We can safely do that because the intel_shared_regs_constraints() function
only returns empty constraint if its rejecting the event, otherwise it
returns NULL such that we continue calling intel_pebs_constraints() and
x86_get_event_constraint().

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1427467105-9260-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-02 17:07:42 +02:00
Linus Torvalds
08f41f7c35 ARM: SoC fixes
The latest and greatest fixes for ARM platform code. Worth pointing out are:
 
 - Lines-wise, largest is a PXA fix for dealing with interrupts on DT that was
   quite broken. It's still newish code so while we could have held this off,
   it seemed appropriate to include now
 - Some GPIO fixes for OMAP platforms added a few lines. This was also fixes for
   code recently added (this release).
 - Small OMAP timer fix to behave better with partially upstreamed platforms,
   which is quite welcome.
 - Allwinner fixes about operating point control, reducing overclocking in some
   cases for better stability.
 
 + a handful of other smaller fixes across the map.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVGGpHAAoJEIwa5zzehBx3vi8QAKRlHuJzKlEAGOglrmaYLNwr
 /eSx6wQQlMa+DFr6YVgypYvImCpvbOVPRGMFpaAyriKDrWcu1HiEF4nvq+3Iq1Qt
 yqPRbTh2xU1qZ9kmidcY5rhSX7IO7Bgeo0fV4q1Dyt5rgUKfIRhE6BnYO5C9TbY5
 CoFri3sFSyeAwIUYcbvt7OElVTB2ro6xmqfc6ZspPqPLYnbc9wSSWSKLV8EFM+OS
 6DD3UFiTPN1o0nb/Vo/fuOsryCE3oR2wfomWAr7XMeTbB2DOugocre0ttyxhhbBt
 Dkc0u3/a75AtisIPkoHPGQZX3671SnUKmge9VYGeMWBHpGCvXJMMvPonbkr2yema
 gxH5/KPBwmv4c/asL/nfKiQnHowxLGdCDVCQ+NKfhyvEKZ7WzwVzXekrWn3L8u1W
 xUy3O0XU3YQJVXX/S2ec5RwsGe9K/5OanKTkjeJ8q5zLtJA0ax0fJGyQ+/iEcjFy
 CnlW857EL1ChXQ2WItKvaLyxe/mea4HGonHe5nZSRVZtTVTYTSFASrJchmRR/eu2
 NXnZPBtvAEn8MqTuNyVVjqwifojK8jdVox+mlNoQUoehLhyAYc4IHbC6+/2c5HU7
 dW5tFMGPOlQNmYbXqe+oKSUaaSWcxUaPTzmD6IRUrozrhabyv/lb79GXDLb2c6QL
 Vz7WtY1qTD7TOwPv+Xyd
 =OclH
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "The latest and greatest fixes for ARM platform code.  Worth pointing
  out are:

   - Lines-wise, largest is a PXA fix for dealing with interrupts on DT
     that was quite broken.  It's still newish code so while we could
     have held this off, it seemed appropriate to include now

   - Some GPIO fixes for OMAP platforms added a few lines.  This was
     also fixes for code recently added (this release).

   - Small OMAP timer fix to behave better with partially upstreamed
     platforms, which is quite welcome.

   - Allwinner fixes about operating point control, reducing
     overclocking in some cases for better stability.

  plus a handful of other smaller fixes across the map"

* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: juno: Fix misleading name of UART reference clock
  ARM: dts: sunxi: Remove overclocked/overvoltaged OPP
  ARM: dts: sun4i: a10-lime: Override and remove 1008MHz OPP setting
  ARM: socfpga: dts: fix spi1 interrupt
  ARM: dts: Fix gpio interrupts for dm816x
  ARM: dts: dra7: remove ti,hwmod property from pcie phy
  ARM: OMAP: dmtimer: disable pm runtime on remove
  ARM: OMAP: dmtimer: check for pm_runtime_get_sync() failure
  ARM: OMAP2+: Fix socbus family info for AM33xx devices
  ARM: dts: omap3: Add missing dmas for crypto
  ARM: dts: rockchip: disable gmac by default in rk3288.dtsi
  MAINTAINERS: add rockchip regexp to the ARM/Rockchip entry
  ARM: pxa: fix pxa interrupts handling in DT
  ARM: pxa: Fix typo in zeus.c
  ARM: sunxi: Have ARCH_SUNXI select RESET_CONTROLLER for clock driver usage
2015-03-29 15:09:31 -07:00
Olof Johansson
4550bdb0bd Allwinner fixes for 4.0
There's a few fixes to merge for 4.0, one to add a select in the machine
 Kconfig option to fix a potential build failure, and two fixing cpufreq related
 issues.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVEyVYAAoJEBx+YmzsjxAg54AQAIIZvr69IYXoGL4njoVRhKPq
 Tm8RPb4B74mKHuAfRSblP7BV2cr3hCNBrLmyDnPUAw5KmYdNwfZVAEHxitBqin33
 JQphd5NGD3gtI0YYJQ4DrPlydSDMCV+FvH/4ASgEXyCrimAvcvsASgGoTSZ4PsN+
 /I7FmIVdGsYJ3V6uJv9mPlgAW6PiGkGFkJ/9ZpTbN2sRCLiE/SAIoSR4C9QvMbhI
 khlmMlP8fF1TpOK9m7sXlTHvaf1kH5GK3XN6GxgvVPR66R2kEYKTBLSpxbB3zQmH
 PPAx8UAQbIR6T+dBar5a7rDaSPVOQ2d+f+ytXFKeY5Sonb0iShmjD3EMRFj8PYUv
 /JUdr/8o9HDBWBHNw7JeVx2texk/Gk5AGSGMuPjGsyJYkEoNK0vZS2mwB5doQggo
 nAADHBPhJptPyUqA4ubnjdgc82To4eh7z5dYgNYCH8XlvFuIRHsHPlp9QIdaTzTJ
 eHBMQx6pgr66QNjF940s0z2rH/UtGmWT60WpWToQA7EJau0E2VosQAnjLctiQogO
 ZUhgMa5V3zCiK3831bwDG3we3ictudkZgvIFWeRkbI/StRbt/n8A2NRK7QWhqqh7
 SXgNP5kisAq29X9grn5ekku6O4gJ1xobbjwDsdYw3z36uBjFkYUe85j3AJjhW+sG
 v2chTh9PHTZ6hu/uyK1L
 =e6O1
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-fixes-for-4.0' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes

Allwinner fixes for 4.0

There's a few fixes to merge for 4.0, one to add a select in the machine
Kconfig option to fix a potential build failure, and two fixing cpufreq related
issues.

* tag 'sunxi-fixes-for-4.0' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  ARM: dts: sunxi: Remove overclocked/overvoltaged OPP
  ARM: dts: sun4i: a10-lime: Override and remove 1008MHz OPP setting
  ARM: sunxi: Have ARCH_SUNXI select RESET_CONTROLLER for clock driver usage

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-03-29 14:01:02 -07:00
Olof Johansson
b1dae3d8b0 Fixes for omaps for the -rc cycle:
- Fix a device tree based booting vs legacy booting regression for
   omap3 crypto hardware by adding the missing DMA channels.
 
 - Fix /sys/bus/soc/devices/soc0/family for am33xx devices.
 
 - Fix two timer issues that can cause hangs if the timer related
   hwmod data is missing like it often initially is for new SoCs.
 
 - Remove pcie hwmods entry from dts as that causes runtime PM to
   fail for the PHYs.
 
 - A paper bag type dts configuration fix for dm816x GPIO
   interrupts that I just noticed. This is most of the changes
   diffstat wise, but as it's a basic feature for connecting
   devices and things work otherwise, it should be fixed.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVCzVOAAoJEBvUPslcq6VzJKMP/ioZMQuxKWjMZDQhc4PsSZzF
 3qtijSrN4aPnYphih23/7XuE0dws0pWdpH1toYPPlzB78gN18MuFOpHRHSKxy23e
 dTRvgrvfDfVFyHYpvOTstaqNai90HbwpQnWx2IdEI9xhP0N3RNQQDTdE8mrISoVE
 2GyT8hF++MxbBk3/dhx1tIoqSkWvSi2TH8rAQ6I/UR2O07HRz8tdcciHq0goYvA9
 n16Ou7a3WpRiaGQZvN4wAvwRCklKL3vM/7ZpbRlJAW4M1Uzc1OGKSGHrqr9kbG77
 U/QgJ7lQoHOn6zDRYZqCY5lHR5U8wW8WVCH5WNV6GZaVHzbOSDoV/Ygt/CB9eBzh
 i8RAcKyvVOWqYACjQ3ogfaC5AD7Of7DONFDQVBr/4xaUOLVquYx8V4r89dfDTeyY
 i95ay06NaS6esX3Y0lsgBSlvPObRxoVH4nFUUjUH+gLqhqHpqojWF8CMUl/zs8Cm
 xHIb/4VYR/dnlhqZvhn9NGo9MlAo4D6ftNE5BooON8cKPuEnl25pb5KWmNBu1piM
 knNIV7QA/aawQ3jnH9tXyCWcprcxt/2INwwp41DE/mxpn6xiw+5UwzINu6q5NAzs
 DxEWYv08mM6Qrt0YsIe7cYDPO03my547LWhI3tUWf/KQvqc98hP5ZODy/QDQJmbA
 IpP36kjKB5dJldTv2sWb
 =eCH/
 -----END PGP SIGNATURE-----

Merge tag 'fixes-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Fixes for omaps for the -rc cycle:

- Fix a device tree based booting vs legacy booting regression for
  omap3 crypto hardware by adding the missing DMA channels.

- Fix /sys/bus/soc/devices/soc0/family for am33xx devices.

- Fix two timer issues that can cause hangs if the timer related
  hwmod data is missing like it often initially is for new SoCs.

- Remove pcie hwmods entry from dts as that causes runtime PM to
  fail for the PHYs.

- A paper bag type dts configuration fix for dm816x GPIO
  interrupts that I just noticed. This is most of the changes
  diffstat wise, but as it's a basic feature for connecting
  devices and things work otherwise, it should be fixed.

* tag 'fixes-v4.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: Fix gpio interrupts for dm816x
  ARM: dts: dra7: remove ti,hwmod property from pcie phy
  ARM: OMAP: dmtimer: disable pm runtime on remove
  ARM: OMAP: dmtimer: check for pm_runtime_get_sync() failure
  ARM: OMAP2+: Fix socbus family info for AM33xx devices
  ARM: dts: omap3: Add missing dmas for crypto

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-03-29 13:59:16 -07:00
Olof Johansson
ebc0aa8fd5 Late fix for v4.0 on the SoCFPGA platform:
- Fix interrupt number for SPI1 interface
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJVCxZzAAoJEBmUBAuBoyj0V3YQAJN24cPBrxeyAG1atuHN2T9r
 8QtJ2eVOU32IwzNoOl1PHITGaf1b3Ks03NC+GSk3WMAzMJ/fJMY2p6ABpCCrriBZ
 xRZfkqqlJlB6/BNb62BkUK45yy0chgxoPoaDiX9CR0KqBeVciZM0Ag4Tw82oQDDF
 mEBKJxSO7GIzRj5uHzQHaTh6Fi6QMRYLYDMyoRNom1tpdOLm7ufyWI++q6FLk2/W
 YAxI8tj+AN48cgTB9oO+K83RjxVuR7YM4xQPUJ3K21AAAc1AHhXgOyzMd2FZhIhJ
 qC5xODFXVcG5+m6/7vYxkBnLo9sexHF8EdD7ByRE2KMoRrsWU2s304ErUBfzp635
 9LbSHZB2HD3rM7KEz/QiRGYkijACgVVBbIImHAaAbd4H13hz4FevKARPNvFz9u62
 Bu3jYj94wuGJYHuiIa9p4QZcMaQVSThBsnWUfq18l1UljIoeRzK4jt0P/DCowRJO
 J3gtmH1E/N/NKGbw1RqNwqSydUqdbeyDdOua/2oLrEaOthJATCj3eFCTpIK5lrQD
 Ttc3EsdL7r2OVD7zOXZsbH9mZXxgDnIu0E4Ocm3F3Xs22ERRMVPiDb0xvwXbtq5+
 uN0HdFN2P7wb02dv2r3NWcTpQEBVqvPe4VRIv+U4hgXXJJrYeEZqWYQ8Vd2d4bsb
 ZkvK7CHsv9FiU3aCnT7B
 =NEia
 -----END PGP SIGNATURE-----

Merge tag 'socfpga_fix_for_v4.0_2' of git://git.rocketboards.org/linux-socfpga-next into fixes

Late fix for v4.0 on the SoCFPGA platform:
- Fix interrupt number for SPI1 interface

* tag 'socfpga_fix_for_v4.0_2' of git://git.rocketboards.org/linux-socfpga-next:
  ARM: socfpga: dts: fix spi1 interrupt

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-03-29 13:58:14 -07:00
Dave Martin
78d84bc373 arm64: juno: Fix misleading name of UART reference clock
The UART reference clock speed is 7273.8 kHz, not 72738 kHz.

Dots aren't usually used in node names even though ePAPR permits
them.  However, this can easily be avoided by expressing the
frequency in Hz, not kHz.

This patch changes the name to refclk7273800hz, reflecting the
actual clock speed.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2015-03-29 13:56:08 -07:00
Olof Johansson
53b1a66398 arm: pxa: fixes for v4.0-rc5
There are only 2 fixes, one for the zeus board about the regulator changes,
 where a typo prevented the zeus board from having a working can regulator,
 and one regression triggered by the interrupts IRQ shift of 16 affecting all
 boards.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVBzpKAAoJEAP2et0duMsSVngP+wTxRklD/kyjRrB4edOGvnKj
 yCJdSbw6dK4IQd4K5BbnJKRSxUTZs9TrPA11IB9IznWcoRjHwKkSr6uk5h/bg+vl
 FwbH60iu7DJ4wyKjIzLCstZpONfFYB/bVOZti3lg+9u0A//YMg0/2PcfcfsUzfgi
 /SAaYuJe+A0XZNZYGwtBEDyYo9hkDzJPbYMRneQnL8XwjLJ9BvksoqBZ/4aIQXlb
 gTlUhlCo9wMngfS3Fbhpd+stw/zO1mt0xUXRXwWJYnUSS2x9mz0K/9zz8RWmodtx
 R4NNz3ZhQzMLHxjj3NpUDcId7faiZNVY0yvSVJU7O5/TLgXluNdH+vRTNiOrpVGu
 +s/+nxg0HKuK16Tz/0mlQH0uqWvtjJvtNTLS3yDgJKM3civ2Vl+i4DKYK0Ht8ynN
 ZSBDmARc6Tq8M9JHgzZFAI8ypKACYuIHJzWNF919BuodJ1N2gcexwKo3m6kPoobG
 WFDTzT+TDiFB0Z4VLOWtWKU9V/RrZqZnrF5rqqkpJ1NyOPdotj+9n2Q63JEdzhH7
 eQKSK9P4FcLKtEmlMo/iRI9livkN81nIRRUTsxxrSTjHA8MJAQOBAU7JDs/ic3YX
 j1vyC/Brd0fPbCqnePfS2BO9G14fwHNb2AxoFgM9JI9zb8sIVxscM4s8uczFCIOs
 rd6bZsQ6mhKjrObS1iSe
 =r3W0
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-v4.0-rc5' of https://github.com/rjarzmik/linux into fixes

arm: pxa: fixes for v4.0-rc5

There are only 2 fixes, one for the zeus board about the regulator changes,
where a typo prevented the zeus board from having a working can regulator,
and one regression triggered by the interrupts IRQ shift of 16 affecting all
boards.

* tag 'fixes-for-v4.0-rc5' of https://github.com/rjarzmik/linux:
  ARM: pxa: fix pxa interrupts handling in DT
  ARM: pxa: Fix typo in zeus.c

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-03-29 13:47:37 -07:00
Linus Torvalds
7fc377ecf4 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "Fix x86 syscall exit code bug that resulted in spurious non-execution
  of TIF-driven user-return worklets, causing big trouble for things
  like KVM that rely on user notifiers for correctness of their vcpu
  model, causing crashes like double faults"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm/entry: Check for syscall exit work with IRQs disabled
2015-03-28 11:25:04 -07:00
Linus Torvalds
713d25dba9 Merge branch 'parisc-4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parsic fixes from Helge Deller:
 "One patch from Mikulas fixes a bug on parisc by artifically
  incrementing the counter in pmd_free when the kernel tries to free
  the preallocated pmd.

  Other than that we now prevent that syscalls gets added without
  incrementing __NR_Linux_syscalls and fix the initial pmd setup code
  if a default page size greater than 4k has been selected"

* 'parisc-4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix pmd code to depend on PT_NLEVELS value, not on CONFIG_64BIT
  parisc: mm: don't count preallocated pmds
  parisc: Add compile-time check when adding new syscalls
2015-03-28 10:58:53 -07:00
Linus Torvalds
22824c5369 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm ppc bugfixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Fix instruction emulation
  KVM: PPC: Book3S HV: Endian fix for accessing VPA yield count
  KVM: PPC: Book3S HV: Fix spinlock/mutex ordering issue in kvmppc_set_lpcr()
2015-03-28 10:54:59 -07:00
Linus Torvalds
af0c11ca80 ARC signal handling related fixes uncovered during recent testing of NPTL tools
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVFTw+AAoJEGnX8d3iisJeojUP/12ANSkctP29kOjZFkZrav6v
 M9qaSR0NqtuxqQ7o8SY4FeA61/vKvy0Sfg5aqfLq1b4YU7FJginyOxZv+d3NPNto
 HCS4XVJwFiEmqYx6EgIrCnYmxNQsRTseJEI24/ImHmzvnjufx/JQ9wr0wQVcw/S3
 ewxIyf0uyyVJ2UAOMjn2F3ukJYIpteItyT/59ndWluB/+JTHhhJUsIYowP2irU29
 EsvEiExNv5d+n35gtxY7GdKEsX5bQc8bTT2UiladcWYIWdb9OoV4+L8MgFh7JHii
 vV71D4kdH5bhJsD1D98nm9jFpP6GfPWCY2BMTIhWIYmg+2Zl+cfiaYgIIuBYNG+x
 +ahrqB5Nbm+Ux4R7KnIYlXMgDiWMPetkAptUmbod293M/pXPyyQtFhiaB2X9l6Bl
 48IUvWYjSp5p8mfhwBXcMItTazrv46HqarHfl3NG0sljRc9BwdPqlumhB1xxsQXg
 iauvvD6s6GHRIGapYlINRww9mCaNu5bTcY9cD6AHEc++6Ov/ZYpWnnipmUINNCmL
 4yLd+pUQ6QoPuvCHbpzdEvb/eb/LDSbo17nB3aPNn/pPGzICgAXnhY4ET6nzZ3LM
 Dc/8FfrYmalYHmjeoWC8CPR/XmrmI7xCkNlF/kre+vCWRLs8Zewl/btxyU48+hho
 81HBp3a0QCzCOcFKvzam
 =zVXd
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.0-fixes-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 "We found some issues with signal handling taking down the system.  I
  know its late, but these are important and all marked for stable.

  ARC signal handling related fixes uncovered during recent testing of
  NPTL tools"

* tag 'arc-4.0-fixes-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: signal handling robustify
  ARC: SA_SIGINFO ucontext regs off-by-one
2015-03-28 10:47:27 -07:00
Peter Zijlstra
34f439278c perf: Add per event clockid support
While thinking on the whole clock discussion it occurred to me we have
two distinct uses of time:

 1) the tracking of event/ctx/cgroup enabled/running/stopped times
    which includes the self-monitoring support in struct
    perf_event_mmap_page.

 2) the actual timestamps visible in the data records.

And we've been conflating them.

The first is all about tracking time deltas, nobody should really care
in what time base that happens, its all relative information, as long
as its internally consistent it works.

The second however is what people are worried about when having to
merge their data with external sources. And here we have the
discussion on MONOTONIC vs MONOTONIC_RAW etc..

Where MONOTONIC is good for correlating between machines (static
offset), MONOTNIC_RAW is required for correlating against a fixed rate
hardware clock.

This means configurability; now 1) makes that hard because it needs to
be internally consistent across groups of unrelated events; which is
why we had to have a global perf_clock().

However, for 2) it doesn't really matter, perf itself doesn't care
what it writes into the buffer.

The below patch makes the distinction between these two cases by
adding perf_event_clock() which is used for the second case. It
further makes this configurable on a per-event basis, but adds a few
sanity checks such that we cannot combine events with different clocks
in confusing ways.

And since we then have per-event configurability we might as well
retain the 'legacy' behaviour as a default.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 10:13:22 +01:00
Ingo Molnar
b381e63b48 Merge branch 'perf/core' into perf/timer, before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 10:10:47 +01:00
Ingo Molnar
4e6d7c2aa9 Merge branch 'timers/core' into perf/timer, to apply dependent patch
An upcoming patch will depend on tai_ns() and NMI-safe ktime_get_raw_fast(),
so merge timers/core here in a separate topic branch until it's all cooked
and timers/core is merged upstream.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 10:09:21 +01:00
David Ahern
9332d250b4 perf/x86: Remove redundant calls to perf_pmu_{dis|en}able()
perf_pmu_disable() is called before pmu->add() and perf_pmu_enable() is called
afterwards. No need to call these inside of x86_pmu_add() as well.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424281543-67335-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 09:49:44 +01:00
Ingo Molnar
936c663aed Merge branch 'perf/x86' into perf/core, because it's ready
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 09:46:19 +01:00
Ingo Molnar
072e5a1cfa Merge branch 'perf/urgent' into perf/core, to pick up fixes and to refresh the tree
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 09:46:03 +01:00
Peter Zijlstra
876e78818d time: Rename timekeeper::tkr to timekeeper::tkr_mono
In preparation of adding another tkr field, rename this one to
tkr_mono. Also rename tk_read_base::base_mono to tk_read_base::base,
since the structure is not specific to CLOCK_MONOTONIC and the mono
name got added to the tk_read_base instance.

Lots of trivial churn.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150319093400.344679419@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 09:45:06 +01:00
Andi Kleen
294fe0f52a perf/x86/intel: Add INST_RETIRED.ALL workarounds
On Broadwell INST_RETIRED.ALL cannot be used with any period
that doesn't have the lowest 6 bits cleared. And the period
should not be smaller than 128.

This is erratum BDM11 and BDM55:

  http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/5th-gen-core-family-spec-update.pdf

BDM11: When using a period < 100; we may get incorrect PEBS/PMI
interrupts and/or an invalid counter state.
BDM55: When bit0-5 of the period are !0 we may get redundant PEBS
records on overflow.

Add a new callback to enforce this, and set it for Broadwell.

How does this handle the case when an app requests a specific
period with some of the bottom bits set?

Short answer:

Any useful instruction sampling period needs to be 4-6 orders
of magnitude larger than 128, as an PMI every 128 instructions
would instantly overwhelm the system and be throttled.
So the +-64 error from this is really small compared to the
period, much smaller than normal system jitter.

Long answer (by Peterz):

IFF we guarantee perf_event_attr::sample_period >= 128.

Suppose we start out with sample_period=192; then we'll set period_left
to 192, we'll end up with left = 128 (we truncate the lower bits). We
get an interrupt, find that period_left = 64 (>0 so we return 0 and
don't get an overflow handler), up that to 128. Then we trigger again,
at n=256. Then we find period_left = -64 (<=0 so we return 1 and do get
an overflow). We increment with sample_period so we get left = 128. We
fire again, at n=384, period_left = 0 (<=0 so we return 1 and get an
overflow). And on and on.

So while the individual interrupts are 'wrong' we get then with
interval=256,128 in exactly the right ratio to average out at 192. And
this works for everything >=128.

So the num_samples*fixed_period thing is still entirely correct +- 127,
which is good enough I'd say, as you already have that error anyhow.

So no need to 'fix' the tools, al we need to do is refuse to create
INST_RETIRED:ALL events with sample_period < 128.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
[ Updated comments and changelog a bit. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424225886-18652-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 09:14:03 +01:00
Andi Kleen
91f1b70582 perf/x86/intel: Add Broadwell core support
Add Broadwell support for Broadwell to perf.

The basic support is very similar to Haswell. We use the new cache
event list added for Haswell earlier. The only differences
are a few bits related to remote nodes. To avoid an extra,
mostly identical, table these are patched up in the initialization code.

The constraint list has one new event that needs to be handled over Haswell.

Includes code and testing from Kan Liang.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424225886-18652-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 09:14:02 +01:00
Andi Kleen
0f1b5ca240 perf/x86/intel: Add new cache events table for Haswell
Haswell offcore events are quite different from Sandy Bridge.
Add a new table to handle Haswell properly.

Note that the offcore bits listed in the SDM are not quite correct
(this is currently being fixed). An uptodate list of bits is
in the patch.

The basic setup is similar to Sandy Bridge. The prefetch columns
have been removed, as prefetch counting is not very reliable
on Haswell. One L1 event that is not in the event list anymore
has been also removed.

- data reads do not include code reads (comparable to earlier Sandy Bridge tables)
- data counts include speculative execution (except L1 write, dtlb, bpu)
- remote node access includes both remote memory, remote cache, remote mmio.
- prefetches are not included in the counts for consistency
  (different from Sandy Bridge, which includes prefetches in the remote node)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
[ Removed the HSM30 comments; we don't have them for SNB/IVB either. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424225886-18652-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-03-27 09:14:01 +01:00
Linus Torvalds
d6702d840c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "A couple of bug fixes for s390.

  The ftrace comile fix is quite large for a -rc6 release, but it would
  be nice to have it in 4.0"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/smp: reenable smt after resume
  s390/mm: limit STACK_RND_MASK for compat tasks
  s390/ftrace: fix compile error if CONFIG_KPROBES is disabled
  s390/cpum_sf: add diagnostic sampling event only if it is authorized
2015-03-26 14:11:17 -07:00
Vineet Gupta
e4140819da ARC: signal handling robustify
A malicious signal handler / restorer can DOS the system by fudging the
user regs saved on stack, causing weird things such as sigreturn returning
to user mode PC but cpu state still being kernel mode....

Ensure that in sigreturn path status32 always has U bit; any other bogosity
(gargbage PC etc) will be taken care of by normal user mode exceptions mechanisms.

Reproducer signal handler:

    void handle_sig(int signo, siginfo_t *info, void *context)
    {
	ucontext_t *uc = context;
	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);

	regs->scratch.status32 = 0;
    }

Before the fix, kernel would go off to weeds like below:

    --------->8-----------
    [ARCLinux]$ ./signal-test
    Path: /signal-test
    CPU: 0 PID: 61 Comm: signal-test Not tainted 4.0.0-rc5+ #65
    task: 8f177880 ti: 5ffe6000 task.ti: 8f15c000

    [ECR   ]: 0x00220200 => Invalid Write @ 0x00000010 by insn @ 0x00010698
    [EFA   ]: 0x00000010
    [BLINK ]: 0x2007c1ee
    [ERET  ]: 0x10698
    [STAT32]: 0x00000000 :                                   <--------
    BTA: 0x00010680	 SP: 0x5ffe7e48	 FP: 0x00000000
    LPS: 0x20003c6c	LPE: 0x20003c70	LPC: 0x00000000
    ...
    --------->8-----------

Reported-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-03-26 11:19:36 +05:30
Vineet Gupta
6914e1e3f6 ARC: SA_SIGINFO ucontext regs off-by-one
The regfile provided to SA_SIGINFO signal handler as ucontext was off by
one due to pt_regs gutter cleanups in 2013.

Before handling signal, user pt_regs are copied onto user_regs_struct and copied
back later. Both structs are binary compatible. This was all fine until
commit 2fa919045b (ARC: pt_regs update #2) which removed the empty stack slot
at top of pt_regs (corresponding to first pad) and made the corresponding
fixup in struct user_regs_struct (the pad in there was moved out of
@scratch - not removed altogether as it is part of ptrace ABI)

 struct user_regs_struct {
+       long pad;
        struct {
-               long pad;
                long bta, lp_start, lp_end,....
        } scratch;
 ...
 }

This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and
signal code needs to user_regs_struct.scratch to reflect it as pt_regs,
which is what this commit does.

This problem was hidden for 2 years, because both save/restore, despite
using wrong location, were using the same location. Only an interim
inspection (reproducer below) exposed the issue.

     void handle_segv(int signo, siginfo_t *info, void *context)
     {
 	ucontext_t *uc = context;
	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);

	printf("regs %x %x\n",               <=== prints 7 8 (vs. 8 9)
               regs->scratch.r8, regs->scratch.r9);
     }

     int main()
     {
	struct sigaction sa;

	sa.sa_sigaction = handle_segv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, NULL);

	asm volatile(
	"mov	r7, 7	\n"
	"mov	r8, 8	\n"
	"mov	r9, 9	\n"
	"mov	r10, 10	\n"
	:::"r7","r8","r9","r10");

	*((unsigned int*)0x10) = 0;
     }

Fixes: 2fa919045b "ARC: pt_regs update #2: Remove unused gutter at start of pt_regs"
CC: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-03-26 09:38:00 +05:30
Linus Torvalds
4c4fe4c247 Another metag architecture fix for v4.0
This is another single fix, for an include dependency problem when using
 ioremap_wc() from asm/io.h without also including asm/pgtable.h.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJVEXtTAAoJEGwLaZPeOHZ62GQP/39zCHTQ4ntXrzUFHueNqB+t
 o3F205AL3HwHbygD+lpiYBMMPbh/5IOG9cFKxj798GRd6khfDX/hGha1OWtJT5hh
 0vkhqGhVZPySfz+a7jzSPbmgrN4P/uvcavdNCmEJkDY6Gj/RVsfHpuSIrswAI0by
 LIGAregj9MeQ2M4snggbkneHYey1xReGH4BQCid/j4tByJi2+AijRh+GSxLZ163U
 WIcPPvEh7Rg3rTi39aHyjeIqgNIt4d+MLTGvvrQ7hZsCgrULFlPjCfi6mG398dqh
 qg75oJYNoc42pDrX2Q6mLDgwjbdKEJhH5AHHsnY6Zhd7rSCUK46k2Vmq6sfAP0BK
 /VXl9nQnKfm6We5zbPCDoC5mUxtQyLRXxli0vJGTDOI18kV92B92McWgaIN5fzRN
 uD942EmEz13IdCGb26p5AhfhhmFsBB9i11wyZ4oKOl3Xnh0eAo8tM2ibhK5lWmog
 BtaxrLO78rpGOwa99DtyNEorTICLYBQUPWHKrGa6PN5hUNwUoaJdZy/prqTfsn/9
 6JI0NCQKLfSLx+K3T3fCKAowN1UsIQyjk57qjkTS0WHmnZqiLAelOD0z+lFWiPBH
 zbGGcVwsQA+NiEcbsI3LDADIkhu7KWg3v+rPVpfn8gDJ2/PMsDSsJ1vGGws+jyna
 Pg2vnJdUA6wf4xV7DmYg
 =gsz4
 -----END PGP SIGNATURE-----

Merge tag 'metag-fixes-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull arch/metag fix from James Hogan:
 "Another metag architecture fix for v4.0

  This is another single fix, for an include dependency problem when
  using ioremap_wc() from asm/io.h without also including asm/pgtable.h"

* tag 'metag-fixes-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  metag: Fix ioremap_wc/ioremap_cached build errors
2015-03-25 16:52:53 -07:00
Marcelo Tosatti
27bfc6cfda Patch queue for 4.0 - 2015-03-25
A few bug fixes for Book3S HV KVM:
 
   - Fix spinlock ordering
   - Fix idle guests on LE hosts
   - Fix instruction emulation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJVEy9HAAoJECszeR4D/txgPxEQAKMglNb5wHkm0QaQrXZ0sYs0
 w/N6QUM/UWG4x6kFb1JUBwt+Piboxriial9xVUdYwnIZbvWfN6X6+HEs599R4dHm
 m9at/dvOo4/Zd5TRVlV3CJUIkiWtFYAgmBU8oy03bbyiPyT+qk8RPwusH1iTy+iW
 1ul9cuKCZ1EL3zIDe0pkVsF8Z7cB2QO/1ACbuM1LQdn74FoZen9VoKepDV+jG01n
 Dw8zwbdCmCb4aMtYCu42jjvcNlf3qNNNzm31vDXl085lXOOdwVSppUhMIlciNrxK
 MuJ0NhT7zLL2BSLBD9R7Zaiify0Zl/x8ja2g+FIKQRufVFKZkcSBXpUV7uFuJMNA
 BdIZkpKAwLNcUpmOG1eJ1xRbSzhDa3DazbInV2BBaySUgG1cDtWOCVa6rFHA4f5X
 Kgcug1aeB62jgvx69JjM3EOnjwvEzTwMMeCELAsjXgRIUKZj6ietJ8Zz3StrDNRj
 HRsu/yvS/56qOXA4vcMXcqx0Ziztpwv0Ttrk9aqOkwfkTdg5+sFrMqFWbIA+opzX
 Zuw0HF+CpbLzdCqiIalA56WhVfExZq4uApzfKhdPFu2lAznILYbMq+1M+8F6KjXe
 hkUCdXE9J/C2bnrRLR5NlKa/IPJTQcqWttLtphO3+jeZzevN26t268xxO/5IZuZ6
 QUhZ6XGXgbabxYqLIepw
 =fAF8
 -----END PGP SIGNATURE-----

Merge tag 'signed-for-4.0' of git://github.com/agraf/linux-2.6

Patch queue for 4.0 - 2015-03-25

A few bug fixes for Book3S HV KVM:

  - Fix spinlock ordering
  - Fix idle guests on LE hosts
  - Fix instruction emulation
2015-03-25 20:20:31 -03:00
Heiko Carstens
1833c9f647 s390/smp: reenable smt after resume
After a suspend/resume cycle we missed to enable smt again, which leads
to all sorts of bugs, since the kernel assumes smt is enabled, while the
hardware thinks it is not.

Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reported-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-03-25 11:47:07 +01:00