Commit graph

965 commits

Author SHA1 Message Date
Sean Christopherson
18c16cef81 perf: Protect perf_guest_cbs with RCU
commit ff083a2d97 upstream.

Protect perf_guest_cbs with RCU to fix multiple possible errors.  Luckily,
all paths that read perf_guest_cbs already require RCU protection, e.g. to
protect the callback chains, so only the direct perf_guest_cbs touchpoints
need to be modified.

Bug #1 is a simple lack of WRITE_ONCE/READ_ONCE behavior to ensure
perf_guest_cbs isn't reloaded between a !NULL check and a dereference.
Fixed via the READ_ONCE() in rcu_dereference().

Bug #2 is that on weakly-ordered architectures, updates to the callbacks
themselves are not guaranteed to be visible before the pointer is made
visible to readers.  Fixed by the smp_store_release() in
rcu_assign_pointer() when the new pointer is non-NULL.

Bug #3 is that, because the callbacks are global, it's possible for
readers to run in parallel with an unregisters, and thus a module
implementing the callbacks can be unloaded while readers are in flight,
resulting in a use-after-free.  Fixed by a synchronize_rcu() call when
unregistering callbacks.

Bug #1 escaped notice because it's extremely unlikely a compiler will
reload perf_guest_cbs in this sequence.  perf_guest_cbs does get reloaded
for future derefs, e.g. for ->is_user_mode(), but the ->is_in_guest()
guard all but guarantees the consumer will win the race, e.g. to nullify
perf_guest_cbs, KVM has to completely exit the guest and teardown down
all VMs before KVM start its module unload / unregister sequence.  This
also makes it all but impossible to encounter bug #3.

Bug #2 has not been a problem because all architectures that register
callbacks are strongly ordered and/or have a static set of callbacks.

But with help, unloading kvm_intel can trigger bug #1 e.g. wrapping
perf_guest_cbs with READ_ONCE in perf_misc_flags() while spamming
kvm_intel module load/unload leads to:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP
  CPU: 6 PID: 1825 Comm: stress Not tainted 5.14.0-rc2+ #459
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:perf_misc_flags+0x1c/0x70
  Call Trace:
   perf_prepare_sample+0x53/0x6b0
   perf_event_output_forward+0x67/0x160
   __perf_event_overflow+0x52/0xf0
   handle_pmi_common+0x207/0x300
   intel_pmu_handle_irq+0xcf/0x410
   perf_event_nmi_handler+0x28/0x50
   nmi_handle+0xc7/0x260
   default_do_nmi+0x6b/0x170
   exc_nmi+0x103/0x130
   asm_exc_nmi+0x76/0xbf

Fixes: 39447b386c ("perf: Enhance perf to allow for guest statistic collection from host")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211111020738.2512932-2-seanjc@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-20 09:13:14 +01:00
Alexander Antonov
737143025c perf/x86/intel/uncore: Fix IIO event constraints for Snowridge
[ Upstream commit bdc0feee05 ]

According to the latest uncore document, DATA_REQ_OF_CPU (0x83),
DATA_REQ_BY_CPU (0xc0) and COMP_BUF_OCCUPANCY (0xd5) events have
constraints. Add uncore IIO constraints for Snowridge.

Fixes: 210cc5f9db ("perf/x86/intel/uncore: Add uncore support for Snow Ridge server")
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20211115090334.3789-4-alexander.antonov@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25 09:48:41 +01:00
Alexander Antonov
d55aa2391d perf/x86/intel/uncore: Fix IIO event constraints for Skylake Server
[ Upstream commit 3866ae319c ]

According to the latest uncore document, COMP_BUF_OCCUPANCY (0xd5) event
can be collected on 2-3 counters. Update uncore IIO event constraints for
Skylake Server.

Fixes: cd34cd97b7 ("perf/x86/intel/uncore: Add Skylake server uncore support")
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20211115090334.3789-3-alexander.antonov@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25 09:48:41 +01:00
Alexander Antonov
7955e4aca7 perf/x86/intel/uncore: Fix filter_tid mask for CHA events on Skylake Server
[ Upstream commit e324234e0a ]

According Uncore Reference Manual: any of the CHA events may be filtered
by Thread/Core-ID by using tid modifier in CHA Filter 0 Register.
Update skx_cha_hw_config() to follow Uncore Guide.

Fixes: cd34cd97b7 ("perf/x86/intel/uncore: Add Skylake server uncore support")
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20211115090334.3789-2-alexander.antonov@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25 09:48:41 +01:00
Like Xu
eda355db53 perf/x86/vlbr: Add c->flags to vlbr event constraints
[ Upstream commit 5863702561 ]

Just like what we do in the x86_get_event_constraints(), the
PERF_X86_EVENT_LBR_SELECT flag should also be propagated
to event->hw.flags so that the host lbr driver can save/restore
MSR_LBR_SELECT for the special vlbr event created by KVM or BPF.

Fixes: 097e4311cd ("perf/x86: Add constraint to create guest LBR event without hw counter")
Reported-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Wanpeng Li <wanpengli@tencent.com>
Link: https://lore.kernel.org/r/20211103091716.59906-1-likexu@tencent.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-25 09:48:32 +01:00
Kan Liang
848df133cc perf/x86/intel/uncore: Fix Intel SPR M3UPI event constraints
[ Upstream commit 4034fb207e ]

SPR M3UPI have the exact same event constraints as ICX, so add the
constraints.

Fixes: 2a8e51eae7 ("perf/x86/intel/uncore: Add Sapphire Rapids server M3UPI support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1629991963-102621-8-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:22 +01:00
Kan Liang
a4a2da864e perf/x86/intel/uncore: Fix Intel SPR M2PCIE event constraints
[ Upstream commit f01d7d558e ]

Similar to the ICX M2PCIE  events, some of the SPR M2PCIE events also
have constraints. Add the constraints for SPR M2PCIE.

Fixes: f85ef898f8 ("perf/x86/intel/uncore: Add Sapphire Rapids server M2PCIe support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1629991963-102621-7-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:22 +01:00
Kan Liang
ced11c1b40 perf/x86/intel/uncore: Fix Intel SPR IIO event constraints
[ Upstream commit 67c5d44384 ]

SPR IIO events have the exact same event constraints as ICX, so add the
constraints.

Fixes: 3ba7095bea ("perf/x86/intel/uncore: Add Sapphire Rapids server IIO support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1629991963-102621-6-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:22 +01:00
Kan Liang
cdba93ecbb perf/x86/intel/uncore: Fix Intel SPR CHA event constraints
[ Upstream commit 9d756e408e ]

SPR CHA events have the exact same event constraints as SKX, so add the
constraints.

Fixes: 949b11381f ("perf/x86/intel/uncore: Add Sapphire Rapids server CHA support")
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1629991963-102621-5-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:22 +01:00
Stephane Eranian
ec6ceae80a perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings
[ Upstream commit 2de71ee153 ]

This patch fixes the encoding for INST_RETIRED.PREC_DIST as published by Intel
(download.01.org/perfmon/) for Icelake. The official encoding
is event code 0x00 umask 0x1, a change from Skylake where it was code 0xc0
umask 0x1.

With this patch applied it is possible to run:
$ perf record -a -e cpu/event=0x00,umask=0x1/pp .....

Whereas before this would fail.

To avoid problems with tools which may use the old code, we maintain the old
encoding for Icelake.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211014001214.2680534-1-eranian@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:21 +01:00
Kan Liang
8c0abfa673 perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
commit f42e8a603c upstream.

According to the latest uncore document, both NUM_OUTSTANDING_REQ_OF_CPU
(0x88) event and COMP_BUF_OCCUPANCY(0xd5) event also have constraints. Add
them into the event constraints table.

Fixes: 2b3b76b5ec ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1629991963-102621-4-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18 19:16:00 +01:00
Kan Liang
54cabed24b perf/x86/intel/uncore: Fix invalid unit check
commit e2bb9fab08 upstream.

The uncore unit with the type ID 0 and the unit ID 0 is missed.

The table3 of the uncore unit maybe 0. The
uncore_discovery_invalid_unit() mistakenly treated it as an invalid
value.

Remove the !unit.table3 check.

Fixes: edae1f06c2 ("perf/x86/intel/uncore: Parse uncore discovery tables")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1629991963-102621-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18 19:16:00 +01:00
Kan Liang
398318fb96 perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
commit 496a18f093 upstream.

There are three channels on a Ice Lake server, but only two channels
will ever be active. Current perf only enables two channels.

Support the extra IMC channel, which may be activated on some Ice Lake
machines. For a non-activated channel, the SW can still access it. The
write will be ignored by the HW. 0 is always returned for the reading.

Fixes: 2b3b76b5ec ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1629991963-102621-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18 19:16:00 +01:00
Kan Liang
71920ea97d perf/x86/msr: Add Sapphire Rapids CPU support
SMI_COUNT MSR is supported on Sapphire Rapids CPU.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1633551137-192083-1-git-send-email-kan.liang@linux.intel.com
2021-10-15 11:25:26 +02:00
Kan Liang
ecc2123e09 perf/x86/intel: Update event constraints for ICX
According to the latest event list, the event encoding 0xEF is only
available on the first 4 counters. Add it into the event constraints
table.

Fixes: 6017608936 ("perf/x86/intel: Add Icelake support")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1632842343-25862-1-git-send-email-kan.liang@linux.intel.com
2021-10-01 13:57:54 +02:00
Anand K Mistry
02d029a41d perf/x86: Reset destroy callback on event init failure
perf_init_event tries multiple init callbacks and does not reset the
event state between tries. When x86_pmu_event_init runs, it
unconditionally sets the destroy callback to hw_perf_event_destroy. On
the next init attempt after x86_pmu_event_init, in perf_try_init_event,
if the pmu's capabilities includes PERF_PMU_CAP_NO_EXCLUDE, the destroy
callback will be run. However, if the next init didn't set the destroy
callback, hw_perf_event_destroy will be run (since the callback wasn't
reset).

Looking at other pmu init functions, the common pattern is to only set
the destroy callback on a successful init. Resetting the callback on
failure tries to replicate that pattern.

This was discovered after commit f11dd0d805 ("perf/x86/amd/ibs: Extend
PERF_PMU_CAP_NO_EXCLUDE to IBS Op") when the second (and only second)
run of the perf tool after a reboot results in 0 samples being
generated. The extra run of hw_perf_event_destroy results in
active_events having an extra decrement on each perf run. The second run
has active_events == 0 and every subsequent run has active_events < 0.
When active_events == 0, the NMI handler will early-out and not record
any samples.

Signed-off-by: Anand K Mistry <amistry@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210929170405.1.I078b98ee7727f9ae9d6df8262bad7e325e40faf0@changeid
2021-10-01 13:57:54 +02:00
Kim Phillips
6a371bafe6 perf/x86/amd/ibs: Add bitfield definitions in new <asm/amd-ibs.h> header
Add <asm/amd-ibs.h> with bitfield definitions for IBS MSRs,
and demonstrate usage within the driver.

Also move 'struct perf_ibs_data' where it can be shared with
the perf tool that will soon be using it.

No functional changes.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-9-kim.phillips@amd.com
2021-08-26 09:14:36 +02:00
Kim Phillips
05485745ad perf/amd/uncore: Allow the driver to be built as a module
Add support to build the AMD uncore driver as a module.

This is in order to facilitate development without having
to reboot the kernel in most cases.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-8-kim.phillips@amd.com
2021-08-26 09:14:36 +02:00
Kim Phillips
9164d9493a x86/cpu: Add get_llc_id() helper function
Factor out a helper function rather than export cpu_llc_id, which is
needed in order to be able to build the AMD uncore driver as a module.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-7-kim.phillips@amd.com
2021-08-26 09:14:36 +02:00
Kim Phillips
0a0b53e0c3 perf/amd/uncore: Clean up header use, use <linux/ include paths instead of <asm/
Found by checkpatch.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-6-kim.phillips@amd.com
2021-08-26 09:14:36 +02:00
Kim Phillips
6cf295b216 perf/amd/uncore: Simplify code, use free_percpu()'s built-in check for NULL
free_percpu() has its own check for NULL, no need to open-code it.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-5-kim.phillips@amd.com
2021-08-26 09:14:36 +02:00
Sebastian Andrzej Siewior
eda8a2c599 perf/x86/intel: Replace deprecated CPU-hotplug functions
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210803141621.780504-11-bigeasy@linutronix.de
2021-08-26 09:14:36 +02:00
Colin Ian King
4f32da76a1 perf/x86: Remove unused assignment to pointer 'e'
The pointer 'e' is being assigned a value that is never read, the assignment
is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210804115710.109608-1-colin.king@canonical.com
2021-08-26 09:14:36 +02:00
Ingo Molnar
46466ae3a1 Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-08-26 09:14:05 +02:00
Kim Phillips
ccf2648341 perf/x86/amd/power: Assign pmu.module
Assign pmu.module so the driver can't be unloaded whilst in use.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-4-kim.phillips@amd.com
2021-08-26 09:12:57 +02:00
Kim Phillips
f11dd0d805 perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op
Commit:

   2ff4025069 ("perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs")

neglected to do so.

Fixes: 2ff4025069 ("perf/core, arch/x86: Use PERF_PMU_CAP_NO_EXCLUDE for exclusion incapable PMUs")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210817221048.88063-2-kim.phillips@amd.com
2021-08-26 08:58:02 +02:00
Kim Phillips
26db2e0c51 perf/x86/amd/ibs: Work around erratum #1197
Erratum #1197 "IBS (Instruction Based Sampling) Register State May be
Incorrect After Restore From CC6" is published in a document:

  "Revision Guide for AMD Family 19h Models 00h-0Fh Processors" 56683 Rev. 1.04 July 2021

  https://bugzilla.kernel.org/show_bug.cgi?id=206537

Implement the erratum's suggested workaround and ignore IBS samples if
MSRC001_1031 == 0.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210817221048.88063-3-kim.phillips@amd.com
2021-08-26 08:58:02 +02:00
Colin Ian King
0b3a8738b7 perf/x86/intel/uncore: Fix integer overflow on 23 bit left shift of a u32
The u32 variable pci_dword is being masked with 0x1fffffff and then left
shifted 23 places. The shift is a u32 operation,so a value of 0x200 or
more in pci_dword will overflow the u32 and only the bottow 32 bits
are assigned to addr. I don't believe this was the original intent.
Fix this by casting pci_dword to a resource_size_t to ensure no
overflow occurs.

Note that the mask and 12 bit left shift operation does not need this
because the mask SNR_IMC_MMIO_MEM0_MASK and shift is always a 32 bit
value.

Fixes: ee49532b38 ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge")
Addresses-Coverity: ("Unintentional integer overflow")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20210706114553.28249-1-colin.king@canonical.com
2021-08-26 08:58:02 +02:00
Xiaoyao Li
c53c6b7409 perf/x86/intel/pt: Fix mask of num_address_ranges
Per SDM, bit 2:0 of CPUID(0x14,1).EAX[2:0] reports the number of
configurable address ranges for filtering, not bit 1:0.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Link: https://lkml.kernel.org/r/20210824040622.4081502-1-xiaoyao.li@intel.com
2021-08-25 15:42:31 +02:00
Kan Liang
acade63799 perf/x86/intel: Apply mid ACK for small core
A warning as below may be occasionally triggered in an ADL machine when
these conditions occur:

 - Two perf record commands run one by one. Both record a PEBS event.
 - Both runs on small cores.
 - They have different adaptive PEBS configuration (PEBS_DATA_CFG).

  [ ] WARNING: CPU: 4 PID: 9874 at arch/x86/events/intel/ds.c:1743 setup_pebs_adaptive_sample_data+0x55e/0x5b0
  [ ] RIP: 0010:setup_pebs_adaptive_sample_data+0x55e/0x5b0
  [ ] Call Trace:
  [ ]  <NMI>
  [ ]  intel_pmu_drain_pebs_icl+0x48b/0x810
  [ ]  perf_event_nmi_handler+0x41/0x80
  [ ]  </NMI>
  [ ]  __perf_event_task_sched_in+0x2c2/0x3a0

Different from the big core, the small core requires the ACK right
before re-enabling counters in the NMI handler, otherwise a stale PEBS
record may be dumped into the later NMI handler, which trigger the
warning.

Add a new mid_ack flag to track the case. Add all PMI handler bits in
the struct x86_hybrid_pmu to track the bits for different types of
PMUs.  Apply mid ACK for the small cores on an Alder Lake machine.

The existing hybrid() macro has a compile error when taking address of
a bit-field variable. Add a new macro hybrid_bit() to get the
bit-field value of a given PMU.

Fixes: f83d2f91d2 ("perf/x86/intel: Add Alder Lake Hybrid support")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Ammy Yi <ammy.yi@intel.com>
Link: https://lkml.kernel.org/r/1627997128-57891-1-git-send-email-kan.liang@linux.intel.com
2021-08-06 14:25:15 +02:00
Like Xu
df51fe7ea1 perf/x86/amd: Don't touch the AMD64_EVENTSEL_HOSTONLY bit inside the guest
If we use "perf record" in an AMD Milan guest, dmesg reports a #GP
warning from an unchecked MSR access error on MSR_F15H_PERF_CTLx:

  [] unchecked MSR access error: WRMSR to 0xc0010200 (tried to write 0x0000020000110076) at rIP: 0xffffffff8106ddb4 (native_write_msr+0x4/0x20)
  [] Call Trace:
  []  amd_pmu_disable_event+0x22/0x90
  []  x86_pmu_stop+0x4c/0xa0
  []  x86_pmu_del+0x3a/0x140

The AMD64_EVENTSEL_HOSTONLY bit is defined and used on the host,
while the guest perf driver should avoid such use.

Fixes: 1018faa6cf ("perf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM disabled")
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://lkml.kernel.org/r/20210802070850.35295-1-likexu@tencent.com
2021-08-04 15:16:34 +02:00
Peter Zijlstra
f4b4b45652 perf/x86: Fix out of bound MSR access
On Wed, Jul 28, 2021 at 12:49:43PM -0400, Vince Weaver wrote:
> [32694.087403] unchecked MSR access error: WRMSR to 0x318 (tried to write 0x0000000000000000) at rIP: 0xffffffff8106f854 (native_write_msr+0x4/0x20)
> [32694.101374] Call Trace:
> [32694.103974]  perf_clear_dirty_counters+0x86/0x100

The problem being that it doesn't filter out all fake counters, in
specific the above (erroneously) tries to use FIXED_BTS. Limit the
fixed counters indexes to the hardware supplied number.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Like Xu <likexu@tencent.com>
Link: https://lkml.kernel.org/r/YQJxka3dxgdIdebG@hirez.programming.kicks-ass.net
2021-08-04 15:16:33 +02:00
Alexander Antonov
3f2cbe3810 perf/x86/intel/uncore: Fix IIO cleanup mapping procedure for SNR/ICX
skx_iio_cleanup_mapping() is re-used for snr and icx, but in those
cases it fails to use the appropriate XXX_iio_mapping_group and as
such fails to free previously allocated resources, leading to memory
leaks.

Fixes: 10337e95e0 ("perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICX")
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
[peterz: Changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210706090723.41850-1-alexander.antonov@linux.intel.com
2021-07-16 18:46:48 +02:00
Linus Torvalds
936b664fb2 A fix and a hardware-enablement addition:
- Robustify uncore_snbep's skx_iio_set_mapping()'s error cleanup
  - Add cstate event support for Intel ICELAKE_X and ICELAKE_D
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDq8TwRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ixEg//e3lbiMEJSqTv+MCZYZWn1PH/oOaHzcVD
 dJ/Mb9Glwbanj9vEfF7BszVIyovnPcnhO9Hg0sLEg4TJkYUcWi7uaJWJJYTVjlys
 yXNV8wOK7IkJHbBjJMyv4gn5rrnSVfjsCMiSctTs9sNCsXyNmmgDpTp43wmQIphb
 ROnUUbDhGqDahT3Q+Oj1VG5u66N1j24PD2Yf6/gdoyoPvjYbgEB5baJ011bletbM
 +tfMOlSK63DoeBQyhMadG+1kdVG27YesLn7g7Metb7g9Y6InBkr9XRpMwmi3V+43
 tjX3bDJ+5DSGsSH2GzawsUlGrWZuek7l8s5u25+w1QxAPvUQvgUFf3R1h0lt/Mke
 uJi5eSG1Z6A0AQs1h8L8lPmyF0YeVjhZCmxZVSGZpAAz+vqoiNWVdUS3MevqxB2S
 SJJ4A1Yj/mEBDxbE9WCU2RS7QVgVlra/sT0aadRpJQyjIrMooXa0LxvkLBm6/YPW
 43gJysUNFt5l1fi2opn6PlOYuqTH6FJNnOveQ/cp3fCO4NDtFd+pohfKwBq0PPMh
 xhiFe6o0bJ0/sfEftwecf1v7qdvnWGsjkyvt3jVk8NBtI03k6qXkHTHdkEEJ53YK
 Ge88tosDTdEZnJHTCUzguTyX+Dns/iEGwdtH2OrghCLxPUmAcVmShbB0MlvaYjjB
 DDockwrBh88=
 =+Ond
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "A fix and a hardware-enablement addition:

   - Robustify uncore_snbep's skx_iio_set_mapping()'s error cleanup

   - Add cstate event support for Intel ICELAKE_X and ICELAKE_D"

* tag 'perf-urgent-2021-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Clean up error handling path of iio mapping
  perf/x86/cstate: Add ICELAKE_X and ICELAKE_D support
2021-07-11 11:10:48 -07:00
Linus Torvalds
1423e2660c Fixes and improvements for FPU handling on x86:
- Prevent sigaltstack out of bounds writes. The kernel unconditionally
     writes the FPU state to the alternate stack without checking whether
     the stack is large enough to accomodate it.
 
     Check the alternate stack size before doing so and in case it's too
     small force a SIGSEGV instead of silently corrupting user space data.
 
   - MINSIGSTKZ and SIGSTKSZ are constants in signal.h and have never been
     updated despite the fact that the FPU state which is stored on the
     signal stack has grown over time which causes trouble in the field
     when AVX512 is available on a CPU. The kernel does not expose the
     minimum requirements for the alternate stack size depending on the
     available and enabled CPU features.
 
     ARM already added an aux vector AT_MINSIGSTKSZ for the same reason.
     Add it to x86 as well
 
   - A major cleanup of the x86 FPU code. The recent discoveries of XSTATE
     related issues unearthed quite some inconsistencies, duplicated code
     and other issues.
 
     The fine granular overhaul addresses this, makes the code more robust
     and maintainable, which allows to integrate upcoming XSTATE related
     features in sane ways.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmDlcpETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoeP5D/4i+AgYYeiMLgGb+NS7iaKPfoWo6LIz
 y3qdTSA0DQaIYbYivWwRO/g0GYdDMXDWeZalFi7eGnVI8O3eOog+22Zrf/y0UINB
 KJHdYd4ApWHhs401022y5hexrWQvnV8w1yQCuj/zLm6eC+AVhdwt2AY+IBoRrdUj
 wqY97B/4rJNsBvvqTDn9EeDrJA2y0y0Suc7AhIp2BGMI+dpIdxys8RJDamXNWyDL
 gJf0YRgUoiIn3AHKb+fgv60AoxfC175NSg/5/y/scFNXqVlW0Up4YCb7pqG9o2Ga
 f3XvtWfbw1N5PmUYjFkALwEkzGUbM3v0RA3xLY2j2WlWm9fBPPy59dt+i/h/VKyA
 GrA7i7lcIqX8dfVH6XkrReZBkRDSB6t9SZTvV54jAz5fcIZO2Rg++UFUvI/R6GKK
 XCcxukYaArwo+IG62iqDszS3gfLGhcor/cviOeULRC5zMUIO4Jah+IhDnifmShtC
 M5s9QzrwIRD/XMewGRQmvkiN4kBfE7jFoBQr1J9leCXJKrM+2JQmMzVInuubTQIq
 SdlKOaAIn7xtekz+6XdFG9Gmhck0PCLMJMOLNvQkKWI3KqGLRZ+dAWKK0vsCizAx
 0BA7ZeB9w9lFT+D8mQCX77JvW9+VNwyfwIOLIrJRHk3VqVpS5qvoiFTLGJJBdZx4
 /TbbRZu7nXDN2w==
 =Mq1m
 -----END PGP SIGNATURE-----

Merge tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fpu updates from Thomas Gleixner:
 "Fixes and improvements for FPU handling on x86:

   - Prevent sigaltstack out of bounds writes.

     The kernel unconditionally writes the FPU state to the alternate
     stack without checking whether the stack is large enough to
     accomodate it.

     Check the alternate stack size before doing so and in case it's too
     small force a SIGSEGV instead of silently corrupting user space
     data.

   - MINSIGSTKZ and SIGSTKSZ are constants in signal.h and have never
     been updated despite the fact that the FPU state which is stored on
     the signal stack has grown over time which causes trouble in the
     field when AVX512 is available on a CPU. The kernel does not expose
     the minimum requirements for the alternate stack size depending on
     the available and enabled CPU features.

     ARM already added an aux vector AT_MINSIGSTKSZ for the same reason.
     Add it to x86 as well.

   - A major cleanup of the x86 FPU code. The recent discoveries of
     XSTATE related issues unearthed quite some inconsistencies,
     duplicated code and other issues.

     The fine granular overhaul addresses this, makes the code more
     robust and maintainable, which allows to integrate upcoming XSTATE
     related features in sane ways"

* tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
  x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again
  x86/fpu/signal: Let xrstor handle the features to init
  x86/fpu/signal: Handle #PF in the direct restore path
  x86/fpu: Return proper error codes from user access functions
  x86/fpu/signal: Split out the direct restore code
  x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing()
  x86/fpu/signal: Sanitize the xstate check on sigframe
  x86/fpu/signal: Remove the legacy alignment check
  x86/fpu/signal: Move initial checks into fpu__restore_sig()
  x86/fpu: Mark init_fpstate __ro_after_init
  x86/pkru: Remove xstate fiddling from write_pkru()
  x86/fpu: Don't store PKRU in xstate in fpu_reset_fpstate()
  x86/fpu: Remove PKRU handling from switch_fpu_finish()
  x86/fpu: Mask PKRU from kernel XRSTOR[S] operations
  x86/fpu: Hook up PKRU into ptrace()
  x86/fpu: Add PKRU storage outside of task XSAVE buffer
  x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
  x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()
  x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs()
  x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()
  ...
2021-07-07 11:12:01 -07:00
Kan Liang
c76826a65f perf/x86/intel/uncore: Support IMC free-running counters on Sapphire Rapids server
Several free-running counters for IMC uncore blocks are supported on
Sapphire Rapids server.

They are not enumerated in the discovery tables. The number of the
free-running counter boxes is calculated from the number of
corresponding standard boxes.

The snbep_pci2phy_map_init() is invoked to setup the mapping from a PCI
BUS to a Die ID, which is used to locate the corresponding MC device of
a IMC uncore unit in the spr_uncore_imc_freerunning_init_box().

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-16-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:42 +02:00
Kan Liang
0378c93a92 perf/x86/intel/uncore: Support IIO free-running counters on Sapphire Rapids server
Several free-running counters for IIO uncore blocks are supported on
Sapphire Rapids server.

They are not enumerated in the discovery tables. Extend
generic_init_uncores() to support extra uncore types. The uncore types
for the free-running counters is inserted right after the uncore types
retrieved from the discovery table.

The number of the free-running counter boxes is calculated from the max
number of the corresponding standard boxes.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-15-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:41 +02:00
Kan Liang
1583971b5c perf/x86/intel/uncore: Factor out snr_uncore_mmio_map()
The IMC free-running counters on Sapphire Rapids server are also
accessed by MMIO, which is similar to the previous platforms, SNR and
ICX. The only difference is the device ID of the device which contains
BAR address.

Factor out snr_uncore_mmio_map() which ioremap the MMIO space. It can be
reused in the following patch for SPR.

Use the snr_uncore_mmio_map() in the icx_uncore_imc_freerunning_init_box().
There is no box_ctl for the free-running counters.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-14-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:41 +02:00
Kan Liang
8053f2d752 perf/x86/intel/uncore: Add alias PMU name
A perf PMU may have two PMU names. For example, Intel Sapphire Rapids
server supports the discovery mechanism. Without the platform-specific
support, an uncore PMU is named by a type ID plus a box ID, e.g.,
uncore_type_0_0, because the real name of the uncore PMU cannot be
retrieved from the discovery table. With the platform-specific support
later, perf has the mapping information from a type ID to a specific
uncore unit. Just like the previous platforms, the uncore PMU is named
by the real PMU name, e.g., uncore_cha_0. The user scripts which work
well with the old numeric name may not work anymore.

Add a new attribute "alias" to indicate the old numeric name. The
following userspace perf tool patch will handle both names. The user
scripts should work properly with the updated perf tool.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-13-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:40 +02:00
Kan Liang
0d771caf72 perf/x86/intel/uncore: Add Sapphire Rapids server MDF support
The MDF subsystem is a new IP built to support the new Intel Xeon
architecture that bridges multiple dies with a embedded bridge system.

The layout of the control registers for a MDF uncore unit is similar to
a IRP uncore unit.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-12-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:40 +02:00
Kan Liang
2a8e51eae7 perf/x86/intel/uncore: Add Sapphire Rapids server M3UPI support
M3 Intel UPI is the interface between the mesh and the Intel UPI link
layer. It is responsible for translating between the mesh protocol
packets and the flits that are used for transmitting data across the
Intel UPI interface.

The layout of the control registers for a M3UPI uncore unit is similar
to a UPI uncore unit.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-11-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:40 +02:00
Kan Liang
da5a9156cd perf/x86/intel/uncore: Add Sapphire Rapids server UPI support
Sapphire Rapids uses a coherent interconnect for scaling to multiple
sockets known as Intel UPI. Intel UPI technology provides a cache
coherent socket to socket external communication interface between
processors.

The layout of the control registers for a UPI uncore unit is similar to
a M2M uncore unit.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-10-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:39 +02:00
Kan Liang
f57191edaa perf/x86/intel/uncore: Add Sapphire Rapids server M2M support
The M2M blocks manage the interface between the mesh (operating on both
the mesh and the SMI3 protocol) and the memory controllers.

The layout of the control registers for a M2M uncore unit is a little
 bit different from the generic one. So a specific format and ops are
required. Expose the common PCI ops which can be reused.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-9-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:39 +02:00
Kan Liang
85f2e30f98 perf/x86/intel/uncore: Add Sapphire Rapids server IMC support
The Sapphire Rapids IMC provides the interface to the DRAM and
communicates to the rest of the uncore through the M2M block.

The layout of the control registers for a IMC uncore unit is a little
bit different from the generic one. There is a fixed counter for IMC.
So a specific format and ops are required. Expose the common MMIO ops
which can be reused.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-8-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:39 +02:00
Kan Liang
0654dfdc7e perf/x86/intel/uncore: Add Sapphire Rapids server PCU support
The PCU is the primary power controller for the Sapphire Rapids.

Except the name, all the information can be retrieved from the discovery
tables.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-7-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:38 +02:00
Kan Liang
f85ef898f8 perf/x86/intel/uncore: Add Sapphire Rapids server M2PCIe support
M2PCIe* blocks manage the interface between the mesh and each IIO stack.

The layout of the control registers for a M2PCIe uncore unit is similar
to a IRP uncore unit.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-6-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:38 +02:00
Kan Liang
e199eb5131 perf/x86/intel/uncore: Add Sapphire Rapids server IRP support
The IRP is responsible for maintaining coherency for the IIO traffic
targeting coherent memory.

The layout of the control registers for a IRP uncore unit is a little
bit different from the generic one.

Factor out SPR_UNCORE_COMMON_FORMAT, which can be reused by the
following uncore units.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-5-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:37 +02:00
Kan Liang
3ba7095bea perf/x86/intel/uncore: Add Sapphire Rapids server IIO support
The IIO stacks are responsible for managing the traffic between the PCI
Express* (PCIe*) domain and the mesh domain. The IIO PMON block is
situated near the IIO stacks traffic controller capturing the traffic
controller as well as the PCIe* root port information.

The layout of the control registers for a IIO uncore unit is a little
bit different from the generic one.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-4-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:37 +02:00
Kan Liang
949b11381f perf/x86/intel/uncore: Add Sapphire Rapids server CHA support
CHA merges the caching agent and Home Agent (HA) responsibilities of the
chip into a single block. It's one of the Sapphire Rapids server uncore
units.

The layout of the control registers for a CHA uncore unit is a little
bit different from the generic one. The CHA uncore unit also supports a
filter register for TID. So a specific format and ops are required.
Expose the common MSR ops which can be reused.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-3-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:37 +02:00
Kan Liang
c54c53d992 perf/x86/intel/uncore: Add Sapphire Rapids server framework
Intel Sapphire Rapids supports a discovery mechanism, that allows an
uncore driver to discover the different components ("boxes") of the
chip.

All the generic information of the uncore boxes should be retrieved from
the discovery tables. This has been enabled with the commit edae1f06c2
("perf/x86/intel/uncore: Parse uncore discovery tables"). Add
use_discovery to indicate the case. The uncore driver doesn't need to
hard code the generic information for each uncore box.
But we still need to enable various functionality that cannot be
directly discovered.

To support these functionalities, the Sapphire Rapids server framework
is introduced here. Each specific uncore unit will be added into the
framework in the following patches.

Add use_discovery to indicate that the discovery mechanism is required
for the platform. Currently, Intel Sapphire Rapids is one of the
platforms.

The box ID from the discovery table is the accurate index. Use it if
applicable.

All the undiscovered platform-specific features will be hard code in the
spr_uncores[]. Add uncore_type_customized_copy(), instead of the memcpy,
to only overwrite these features.

The specific uncore unit hasn't been added here. From user's
perspective, there is nothing changed for now.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1625087320-194204-2-git-send-email-kan.liang@linux.intel.com
2021-07-02 15:58:36 +02:00