Merge remote-tracking branch 'tip/perf/core' into perf/urgent

To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo 2019-07-08 13:06:57 -03:00
commit e3b22a6534
250 changed files with 10235 additions and 2434 deletions


@ -538,3 +538,26 @@ Description: Intel Energy and Performance Bias Hint (EPB)
This attribute is present for all online CPUs supporting the
Intel EPB feature.
What: /sys/devices/system/cpu/umwait_control
/sys/devices/system/cpu/umwait_control/enable_c02
/sys/devices/system/cpu/umwait_control/max_time
Date: May 2019
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Umwait control
enable_c02: Read/write interface to control umwait C0.2 state
Read returns C0.2 state status:
0: C0.2 is disabled
1: C0.2 is enabled
Write 'y' or '1' or 'on' to enable C0.2 state.
Write 'n' or '0' or 'off' to disable C0.2 state.
The interface is case insensitive.
max_time: Read/write interface to control umwait maximum time
in TSC-quanta that the CPU can reside in either C0.1
or C0.2 state. The time is an unsigned 32-bit number.
Note that a value of zero means there is no limit.
Low order two bits must be zero.
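
For reference, a minimal userspace sketch of driving this interface (paths as documented above; error handling is elided and the values are illustrative only):

	#include <stdio.h>

	int main(void)
	{
		FILE *f = fopen("/sys/devices/system/cpu/umwait_control/enable_c02", "w");
		if (f) {
			fputs("0", f);		/* '0', 'n' or 'off' disables C0.2 */
			fclose(f);
		}
		f = fopen("/sys/devices/system/cpu/umwait_control/max_time", "w");
		if (f) {
			fputs("100000", f);	/* TSC quanta; low two bits must be zero */
			fclose(f);
		}
		return 0;
	}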


@ -12,6 +12,12 @@ physical_package_id:
socket number, but the actual value is architecture and platform
dependent.
die_id:
the CPU die ID of cpuX. Typically it is the hardware platform's
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.
core_id:
the CPU core ID of cpuX. Typically it is the hardware platform's
@ -30,25 +36,33 @@ drawer_id:
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.
thread_siblings:
core_cpus:
internal kernel map of cpuX's hardware threads within the same
core as cpuX.
internal kernel map of CPUs within the same core.
(deprecated name: "thread_siblings")
thread_siblings_list:
core_cpus_list:
human-readable list of cpuX's hardware threads within the same
core as cpuX.
human-readable list of CPUs within the same core.
(deprecated name: "thread_siblings_list");
core_siblings:
package_cpus:
internal kernel map of cpuX's hardware threads within the same
physical_package_id.
internal kernel map of the CPUs sharing the same physical_package_id.
(deprecated name: "core_siblings")
core_siblings_list:
package_cpus_list:
human-readable list of cpuX's hardware threads within the same
physical_package_id.
human-readable list of CPUs sharing the same physical_package_id.
(deprecated name: "core_siblings_list")
die_cpus:
internal kernel map of CPUs within the same die.
die_cpus_list:
human-readable list of CPUs within the same die.
book_siblings:
@ -81,11 +95,13 @@ For an architecture to support this feature, it must define some of
these macros in include/asm-XXX/topology.h::
#define topology_physical_package_id(cpu)
#define topology_die_id(cpu)
#define topology_core_id(cpu)
#define topology_book_id(cpu)
#define topology_drawer_id(cpu)
#define topology_sibling_cpumask(cpu)
#define topology_core_cpumask(cpu)
#define topology_die_cpumask(cpu)
#define topology_book_cpumask(cpu)
#define topology_drawer_cpumask(cpu)
@ -99,9 +115,11 @@ provides default definitions for any of the above macros that are
not defined by include/asm-XXX/topology.h:
1) topology_physical_package_id: -1
2) topology_core_id: 0
3) topology_sibling_cpumask: just the given CPU
4) topology_core_cpumask: just the given CPU
2) topology_die_id: -1
3) topology_core_id: 0
4) topology_sibling_cpumask: just the given CPU
5) topology_core_cpumask: just the given CPU
6) topology_die_cpumask: just the given CPU
For architectures that don't support books (CONFIG_SCHED_BOOK) there are no
default definitions for topology_book_id() and topology_book_cpumask().
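
As a sketch of what those fallbacks look like (the canonical versions live in include/linux/topology.h; this rendition is illustrative, not authoritative):

	#ifndef topology_physical_package_id
	#define topology_physical_package_id(cpu)	((void)(cpu), -1)
	#endif
	#ifndef topology_die_id
	#define topology_die_id(cpu)			((void)(cpu), -1)
	#endif
	#ifndef topology_core_id
	#define topology_core_id(cpu)			((void)(cpu), 0)
	#endif
	#ifndef topology_sibling_cpumask
	#define topology_sibling_cpumask(cpu)		cpumask_of(cpu)
	#endif
	#ifndef topology_die_cpumask
	#define topology_die_cpumask(cpu)		cpumask_of(cpu)
	#endif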


@ -31,7 +31,7 @@ you probably needn't concern yourself with isdn4k-utils.
====================== =============== ========================================
GNU C 4.6 gcc --version
GNU make 3.81 make --version
binutils 2.20 ld -v
binutils 2.21 ld -v
flex 2.5.35 flex --version
bison 2.0 bison --version
util-linux 2.10o fdformat --version
@ -77,9 +77,7 @@ You will need GNU make 3.81 or later to build the kernel.
Binutils
--------
The build system has, as of 4.13, switched to using thin archives (`ar T`)
rather than incremental linking (`ld -r`) for built-in.a intermediate steps.
This requires binutils 2.20 or newer.
Binutils 2.21 or newer is needed to build the kernel.
pkg-config
----------


@ -49,6 +49,10 @@ Package-related topology information in the kernel:
The number of cores in a package. This information is retrieved via CPUID.
- cpuinfo_x86.x86_max_dies:
The number of dies in a package. This information is retrieved via CPUID.
- cpuinfo_x86.phys_proc_id:
The physical ID of the package. This information is retrieved via CPUID.


@ -7810,7 +7810,7 @@ INGENIC JZ4780 NAND DRIVER
M: Harvey Hunt <harveyhuntnexus@gmail.com>
L: linux-mtd@lists.infradead.org
S: Maintained
F: drivers/mtd/nand/raw/jz4780_*
F: drivers/mtd/nand/raw/ingenic/
INOTIFY
M: Jan Kara <jack@suse.cz>
@ -17496,6 +17496,12 @@ Q: https://patchwork.linuxtv.org/project/linux-media/list/
S: Maintained
F: drivers/media/dvb-frontends/zd1301_demod*
ZHAOXIN PROCESSOR SUPPORT
M: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
L: linux-kernel@vger.kernel.org
S: Maintained
F: arch/x86/kernel/cpu/zhaoxin.c
ZPOOL COMPRESSED PAGE STORAGE API
M: Dan Streetman <ddstreet@ieee.org>
L: linux-mm@kvack.org


@ -2,7 +2,7 @@
VERSION = 5
PATCHLEVEL = 2
SUBLEVEL = 0
EXTRAVERSION = -rc7
EXTRAVERSION =
NAME = Bobtail Squid
# *DOCUMENTATION*


@ -17,6 +17,7 @@ archscripts: scripts_basic
$(Q)$(MAKE) $(build)=arch/mips/boot/tools relocs
KBUILD_DEFCONFIG := 32r2el_defconfig
KBUILD_DTBS := dtbs
#
# Select the object file format to substitute into the linker script.
@ -384,7 +385,7 @@ quiet_cmd_64 = OBJCOPY $@
vmlinux.64: vmlinux
$(call cmd,64)
all: $(all-y)
all: $(all-y) $(KBUILD_DTBS)
# boot
$(boot-y): $(vmlinux-32) FORCE


@ -78,6 +78,8 @@ OBJCOPYFLAGS_piggy.o := --add-section=.image=$(obj)/vmlinux.bin.z \
$(obj)/piggy.o: $(obj)/dummy.o $(obj)/vmlinux.bin.z FORCE
$(call if_changed,objcopy)
HOSTCFLAGS_calc_vmlinuz_load_addr.o += $(LINUXINCLUDE)
# Calculate the load address of the compressed kernel image
hostprogs-y := calc_vmlinuz_load_addr


@ -9,7 +9,7 @@
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include "../../../../include/linux/sizes.h"
#include <linux/sizes.h>
int main(int argc, char *argv[])
{


@ -24,8 +24,8 @@
#define AR933X_UART_CS_PARITY_S 0
#define AR933X_UART_CS_PARITY_M 0x3
#define AR933X_UART_CS_PARITY_NONE 0
#define AR933X_UART_CS_PARITY_ODD 1
#define AR933X_UART_CS_PARITY_EVEN 2
#define AR933X_UART_CS_PARITY_ODD 2
#define AR933X_UART_CS_PARITY_EVEN 3
#define AR933X_UART_CS_IF_MODE_S 2
#define AR933X_UART_CS_IF_MODE_M 0x3
#define AR933X_UART_CS_IF_MODE_NONE 0
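
The corrected values matter wherever the parity field is composed into the line-control value; the usual pattern looks like this (illustrative, not part of this diff):

	/* Select even parity: field value 3 per the corrected encoding above. */
	u32 cs = (AR933X_UART_CS_PARITY_EVEN & AR933X_UART_CS_PARITY_M)
			<< AR933X_UART_CS_PARITY_S;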


@ -203,7 +203,7 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
bool __virt_addr_valid(const volatile void *kaddr)
{
unsigned long vaddr = (unsigned long)vaddr;
unsigned long vaddr = (unsigned long)kaddr;
if ((vaddr < PAGE_OFFSET) || (vaddr >= MAP_BASE))
return false;


@ -391,6 +391,7 @@ static struct work_registers build_get_work_registers(u32 **p)
static void build_restore_work_registers(u32 **p)
{
if (scratch_reg >= 0) {
uasm_i_ehb(p);
UASM_i_MFC0(p, 1, c0_kscratch(), scratch_reg);
return;
}
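
The uasm_i_ehb() calls added throughout this file follow one pattern: MIPS requires an execution-hazard barrier between a recent mtc0 write to a KScratch register and a later mfc0 read of it, otherwise the read may observe stale data. The emitted pair, mirroring the hunk above:

	/* Clear the MTC0 -> MFC0 execution hazard before reading KScratch back. */
	uasm_i_ehb(p);
	UASM_i_MFC0(p, 1, c0_kscratch(), scratch_reg);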
@ -668,10 +669,12 @@ static void build_restore_pagemask(u32 **p, struct uasm_reloc **r,
uasm_i_mtc0(p, 0, C0_PAGEMASK);
uasm_il_b(p, r, lid);
}
if (scratch_reg >= 0)
if (scratch_reg >= 0) {
uasm_i_ehb(p);
UASM_i_MFC0(p, 1, c0_kscratch(), scratch_reg);
else
} else {
UASM_i_LW(p, 1, scratchpad_offset(0), 0);
}
} else {
/* Reset default page size */
if (PM_DEFAULT_MASK >> 16) {
@ -938,10 +941,12 @@ build_get_pgd_vmalloc64(u32 **p, struct uasm_label **l, struct uasm_reloc **r,
uasm_i_jr(p, ptr);
if (mode == refill_scratch) {
if (scratch_reg >= 0)
if (scratch_reg >= 0) {
uasm_i_ehb(p);
UASM_i_MFC0(p, 1, c0_kscratch(), scratch_reg);
else
} else {
UASM_i_LW(p, 1, scratchpad_offset(0), 0);
}
} else {
uasm_i_nop(p);
}
@ -1258,6 +1263,7 @@ build_fast_tlb_refill_handler (u32 **p, struct uasm_label **l,
UASM_i_MTC0(p, odd, C0_ENTRYLO1); /* load it */
if (c0_scratch_reg >= 0) {
uasm_i_ehb(p);
UASM_i_MFC0(p, scratch, c0_kscratch(), c0_scratch_reg);
build_tlb_write_entry(p, l, r, tlb_random);
uasm_l_leave(l, *p);
@ -1603,15 +1609,17 @@ static void build_setup_pgd(void)
uasm_i_dinsm(&p, a0, 0, 29, 64 - 29);
uasm_l_tlbl_goaround1(&l, p);
UASM_i_SLL(&p, a0, a0, 11);
uasm_i_jr(&p, 31);
UASM_i_MTC0(&p, a0, C0_CONTEXT);
uasm_i_jr(&p, 31);
uasm_i_ehb(&p);
} else {
/* PGD in c0_KScratch */
uasm_i_jr(&p, 31);
if (cpu_has_ldpte)
UASM_i_MTC0(&p, a0, C0_PWBASE);
else
UASM_i_MTC0(&p, a0, c0_kscratch(), pgd_reg);
uasm_i_jr(&p, 31);
uasm_i_ehb(&p);
}
#else
#ifdef CONFIG_SMP
@ -1625,13 +1633,16 @@ static void build_setup_pgd(void)
UASM_i_LA_mostly(&p, a2, pgdc);
UASM_i_SW(&p, a0, uasm_rel_lo(pgdc), a2);
#endif /* SMP */
uasm_i_jr(&p, 31);
/* if pgd_reg is allocated, save PGD also to scratch register */
if (pgd_reg != -1)
if (pgd_reg != -1) {
UASM_i_MTC0(&p, a0, c0_kscratch(), pgd_reg);
else
uasm_i_jr(&p, 31);
uasm_i_ehb(&p);
} else {
uasm_i_jr(&p, 31);
uasm_i_nop(&p);
}
#endif
if (p >= (u32 *)tlbmiss_handler_setup_pgd_end)
panic("tlbmiss_handler_setup_pgd space exceeded");


@ -480,3 +480,16 @@ config CPU_SUP_UMC_32
CPU might render the kernel unbootable.
If unsure, say N.
config CPU_SUP_ZHAOXIN
default y
bool "Support Zhaoxin processors" if PROCESSOR_SELECT
help
This enables detection, tunings and quirks for Zhaoxin processors.
You need this enabled if you want your kernel to run on a
Zhaoxin CPU. Disabling this option on other types of CPUs
makes the kernel a tiny bit smaller. Disabling it on a Zhaoxin
CPU might render the kernel unbootable.
If unsure, say N.


@ -1670,11 +1670,17 @@ nmi_restore:
iretq
END(nmi)
#ifndef CONFIG_IA32_EMULATION
/*
* This handles SYSCALL from 32-bit code. There is no way to program
* MSRs to fully disable 32-bit SYSCALL.
*/
ENTRY(ignore_sysret)
UNWIND_HINT_EMPTY
mov $-ENOSYS, %eax
sysret
END(ignore_sysret)
#endif
ENTRY(rewind_stack_do_exit)
UNWIND_HINT_FUNC


@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-y += core.o
obj-y += core.o probe.o
obj-y += amd/
obj-$(CONFIG_X86_LOCAL_APIC) += msr.o
obj-$(CONFIG_CPU_SUP_INTEL) += intel/


@ -1618,68 +1618,6 @@ static struct attribute_group x86_pmu_format_group __ro_after_init = {
.attrs = NULL,
};
/*
* Remove all undefined events (x86_pmu.event_map(id) == 0)
* out of events_attr attributes.
*/
static void __init filter_events(struct attribute **attrs)
{
struct device_attribute *d;
struct perf_pmu_events_attr *pmu_attr;
int offset = 0;
int i, j;
for (i = 0; attrs[i]; i++) {
d = (struct device_attribute *)attrs[i];
pmu_attr = container_of(d, struct perf_pmu_events_attr, attr);
/* str trumps id */
if (pmu_attr->event_str)
continue;
if (x86_pmu.event_map(i + offset))
continue;
for (j = i; attrs[j]; j++)
attrs[j] = attrs[j + 1];
/* Check the shifted attr. */
i--;
/*
* event_map() is index based, the attrs array is organized
* by increasing event index. If we shift the events, then
* we need to compensate for the event_map(), otherwise
* we are looking up the wrong event in the map
*/
offset++;
}
}
/* Merge two pointer arrays */
__init struct attribute **merge_attr(struct attribute **a, struct attribute **b)
{
struct attribute **new;
int j, i;
for (j = 0; a && a[j]; j++)
;
for (i = 0; b && b[i]; i++)
j++;
j++;
new = kmalloc_array(j, sizeof(struct attribute *), GFP_KERNEL);
if (!new)
return NULL;
j = 0;
for (i = 0; a && a[i]; i++)
new[j++] = a[i];
for (i = 0; b && b[i]; i++)
new[j++] = b[i];
new[j] = NULL;
return new;
}
ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr, char *page)
{
struct perf_pmu_events_attr *pmu_attr = \
@ -1744,9 +1682,24 @@ static struct attribute *events_attr[] = {
NULL,
};
/*
* Remove all undefined events (x86_pmu.event_map(id) == 0)
* out of events_attr attributes.
*/
static umode_t
is_visible(struct kobject *kobj, struct attribute *attr, int idx)
{
struct perf_pmu_events_attr *pmu_attr;
pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr.attr);
/* str trumps id */
return pmu_attr->event_str || x86_pmu.event_map(idx) ? attr->mode : 0;
}
static struct attribute_group x86_pmu_events_group __ro_after_init = {
.name = "events",
.attrs = events_attr,
.is_visible = is_visible,
};
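
The .is_visible hook is what does the filtering now: sysfs calls it once per attribute when the group is instantiated, and returning 0 hides the attribute while attr->mode exposes it unchanged. A minimal sketch of the pattern, where demo_attrs and have_feature are stand-ins rather than names from this commit:

	static umode_t demo_is_visible(struct kobject *kobj,
				       struct attribute *attr, int idx)
	{
		/* 0 hides the attribute; attr->mode exposes it unchanged */
		return have_feature ? attr->mode : 0;
	}

	static const struct attribute_group demo_group = {
		.name		= "events",
		.attrs		= demo_attrs,
		.is_visible	= demo_is_visible,
	};

Deciding visibility at group-creation time is what allows the static filter_events() and merge_attr() machinery above to be deleted.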
ssize_t x86_event_sysfs_show(char *page, u64 config, u64 event)
@ -1842,37 +1795,10 @@ static int __init init_hw_perf_events(void)
x86_pmu_format_group.attrs = x86_pmu.format_attrs;
if (x86_pmu.caps_attrs) {
struct attribute **tmp;
tmp = merge_attr(x86_pmu_caps_group.attrs, x86_pmu.caps_attrs);
if (!WARN_ON(!tmp))
x86_pmu_caps_group.attrs = tmp;
}
if (x86_pmu.event_attrs)
x86_pmu_events_group.attrs = x86_pmu.event_attrs;
if (!x86_pmu.events_sysfs_show)
x86_pmu_events_group.attrs = &empty_attrs;
else
filter_events(x86_pmu_events_group.attrs);
if (x86_pmu.cpu_events) {
struct attribute **tmp;
tmp = merge_attr(x86_pmu_events_group.attrs, x86_pmu.cpu_events);
if (!WARN_ON(!tmp))
x86_pmu_events_group.attrs = tmp;
}
if (x86_pmu.attrs) {
struct attribute **tmp;
tmp = merge_attr(x86_pmu_attr_group.attrs, x86_pmu.attrs);
if (!WARN_ON(!tmp))
x86_pmu_attr_group.attrs = tmp;
}
pmu.attr_update = x86_pmu.attr_update;
pr_info("... version: %d\n", x86_pmu.version);
pr_info("... bit width: %d\n", x86_pmu.cntval_bits);


@ -20,6 +20,7 @@
#include <asm/intel-family.h>
#include <asm/apic.h>
#include <asm/cpu_device_id.h>
#include <asm/hypervisor.h>
#include "../perf_event.h"
@ -3897,8 +3898,6 @@ static __initconst const struct x86_pmu core_pmu = {
.check_period = intel_pmu_check_period,
};
static struct attribute *intel_pmu_attrs[];
static __initconst const struct x86_pmu intel_pmu = {
.name = "Intel",
.handle_irq = intel_pmu_handle_irq,
@ -3930,8 +3929,6 @@ static __initconst const struct x86_pmu intel_pmu = {
.format_attrs = intel_arch3_formats_attr,
.events_sysfs_show = intel_event_sysfs_show,
.attrs = intel_pmu_attrs,
.cpu_prepare = intel_pmu_cpu_prepare,
.cpu_starting = intel_pmu_cpu_starting,
.cpu_dying = intel_pmu_cpu_dying,
@ -4054,6 +4051,13 @@ static bool check_msr(unsigned long msr, u64 mask)
{
u64 val_old, val_new, val_tmp;
/*
* Disable the check for real HW, so we don't
* mess with potentially enabled registers:
*/
if (hypervisor_is_type(X86_HYPER_NATIVE))
return true;
/*
* Read the current value, change it and read it back to see if it
* matches, this is needed to detect certain hardware emulators
@ -4274,13 +4278,6 @@ static struct attribute *icl_tsx_events_attrs[] = {
NULL,
};
static __init struct attribute **get_icl_events_attrs(void)
{
return boot_cpu_has(X86_FEATURE_RTM) ?
merge_attr(icl_events_attrs, icl_tsx_events_attrs) :
icl_events_attrs;
}
static ssize_t freeze_on_smi_show(struct device *cdev,
struct device_attribute *attr,
char *buf)
@ -4402,43 +4399,111 @@ static DEVICE_ATTR(allow_tsx_force_abort, 0644,
static struct attribute *intel_pmu_attrs[] = {
&dev_attr_freeze_on_smi.attr,
NULL, /* &dev_attr_allow_tsx_force_abort.attr.attr */
&dev_attr_allow_tsx_force_abort.attr,
NULL,
};
static __init struct attribute **
get_events_attrs(struct attribute **base,
struct attribute **mem,
struct attribute **tsx)
static umode_t
tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
struct attribute **attrs = base;
struct attribute **old;
if (mem && x86_pmu.pebs)
attrs = merge_attr(attrs, mem);
if (tsx && boot_cpu_has(X86_FEATURE_RTM)) {
old = attrs;
attrs = merge_attr(attrs, tsx);
if (old != base)
kfree(old);
}
return attrs;
return boot_cpu_has(X86_FEATURE_RTM) ? attr->mode : 0;
}
static umode_t
pebs_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
return x86_pmu.pebs ? attr->mode : 0;
}
static umode_t
lbr_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
return x86_pmu.lbr_nr ? attr->mode : 0;
}
static umode_t
exra_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
return x86_pmu.version >= 2 ? attr->mode : 0;
}
static umode_t
default_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
if (attr == &dev_attr_allow_tsx_force_abort.attr)
return x86_pmu.flags & PMU_FL_TFA ? attr->mode : 0;
return attr->mode;
}
static struct attribute_group group_events_td = {
.name = "events",
};
static struct attribute_group group_events_mem = {
.name = "events",
.is_visible = pebs_is_visible,
};
static struct attribute_group group_events_tsx = {
.name = "events",
.is_visible = tsx_is_visible,
};
static struct attribute_group group_caps_gen = {
.name = "caps",
.attrs = intel_pmu_caps_attrs,
};
static struct attribute_group group_caps_lbr = {
.name = "caps",
.attrs = lbr_attrs,
.is_visible = lbr_is_visible,
};
static struct attribute_group group_format_extra = {
.name = "format",
.is_visible = exra_is_visible,
};
static struct attribute_group group_format_extra_skl = {
.name = "format",
.is_visible = exra_is_visible,
};
static struct attribute_group group_default = {
.attrs = intel_pmu_attrs,
.is_visible = default_is_visible,
};
static const struct attribute_group *attr_update[] = {
&group_events_td,
&group_events_mem,
&group_events_tsx,
&group_caps_gen,
&group_caps_lbr,
&group_format_extra,
&group_format_extra_skl,
&group_default,
NULL,
};
static struct attribute *empty_attrs;
__init int intel_pmu_init(void)
{
struct attribute **extra_attr = NULL;
struct attribute **mem_attr = NULL;
struct attribute **tsx_attr = NULL;
struct attribute **to_free = NULL;
struct attribute **extra_skl_attr = &empty_attrs;
struct attribute **extra_attr = &empty_attrs;
struct attribute **td_attr = &empty_attrs;
struct attribute **mem_attr = &empty_attrs;
struct attribute **tsx_attr = &empty_attrs;
union cpuid10_edx edx;
union cpuid10_eax eax;
union cpuid10_ebx ebx;
struct event_constraint *c;
unsigned int unused;
struct extra_reg *er;
bool pmem = false;
int version, i;
char *name;
@ -4596,7 +4661,7 @@ __init int intel_pmu_init(void)
x86_pmu.pebs_constraints = intel_slm_pebs_event_constraints;
x86_pmu.extra_regs = intel_slm_extra_regs;
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.cpu_events = slm_events_attrs;
td_attr = slm_events_attrs;
extra_attr = slm_format_attr;
pr_cont("Silvermont events, ");
name = "silvermont";
@ -4624,7 +4689,7 @@ __init int intel_pmu_init(void)
x86_pmu.pebs_prec_dist = true;
x86_pmu.lbr_pt_coexist = true;
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.cpu_events = glm_events_attrs;
td_attr = glm_events_attrs;
extra_attr = slm_format_attr;
pr_cont("Goldmont events, ");
name = "goldmont";
@ -4651,7 +4716,7 @@ __init int intel_pmu_init(void)
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_PEBS_ALL;
x86_pmu.get_event_constraints = glp_get_event_constraints;
x86_pmu.cpu_events = glm_events_attrs;
td_attr = glm_events_attrs;
/* Goldmont Plus has 4-wide pipeline */
event_attr_td_total_slots_scale_glm.event_str = "4";
extra_attr = slm_format_attr;
@ -4740,7 +4805,7 @@ __init int intel_pmu_init(void)
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
x86_pmu.cpu_events = snb_events_attrs;
td_attr = snb_events_attrs;
mem_attr = snb_mem_events_attrs;
/* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
@ -4781,7 +4846,7 @@ __init int intel_pmu_init(void)
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
x86_pmu.cpu_events = snb_events_attrs;
td_attr = snb_events_attrs;
mem_attr = snb_mem_events_attrs;
/* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
@ -4818,10 +4883,10 @@ __init int intel_pmu_init(void)
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = hsw_get_event_constraints;
x86_pmu.cpu_events = hsw_events_attrs;
x86_pmu.lbr_double_abort = true;
extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
hsw_format_attr : nhm_format_attr;
td_attr = hsw_events_attrs;
mem_attr = hsw_mem_events_attrs;
tsx_attr = hsw_tsx_events_attrs;
pr_cont("Haswell events, ");
@ -4860,10 +4925,10 @@ __init int intel_pmu_init(void)
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = hsw_get_event_constraints;
x86_pmu.cpu_events = hsw_events_attrs;
x86_pmu.limit_period = bdw_limit_period;
extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
hsw_format_attr : nhm_format_attr;
td_attr = hsw_events_attrs;
mem_attr = hsw_mem_events_attrs;
tsx_attr = hsw_tsx_events_attrs;
pr_cont("Broadwell events, ");
@ -4890,9 +4955,10 @@ __init int intel_pmu_init(void)
name = "knights-landing";
break;
case INTEL_FAM6_SKYLAKE_X:
pmem = true;
case INTEL_FAM6_SKYLAKE_MOBILE:
case INTEL_FAM6_SKYLAKE_DESKTOP:
case INTEL_FAM6_SKYLAKE_X:
case INTEL_FAM6_KABYLAKE_MOBILE:
case INTEL_FAM6_KABYLAKE_DESKTOP:
x86_add_quirk(intel_pebs_isolation_quirk);
@ -4920,27 +4986,28 @@ __init int intel_pmu_init(void)
x86_pmu.get_event_constraints = hsw_get_event_constraints;
extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
hsw_format_attr : nhm_format_attr;
extra_attr = merge_attr(extra_attr, skl_format_attr);
to_free = extra_attr;
x86_pmu.cpu_events = hsw_events_attrs;
extra_skl_attr = skl_format_attr;
td_attr = hsw_events_attrs;
mem_attr = hsw_mem_events_attrs;
tsx_attr = hsw_tsx_events_attrs;
intel_pmu_pebs_data_source_skl(
boot_cpu_data.x86_model == INTEL_FAM6_SKYLAKE_X);
intel_pmu_pebs_data_source_skl(pmem);
if (boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
x86_pmu.flags |= PMU_FL_TFA;
x86_pmu.get_event_constraints = tfa_get_event_constraints;
x86_pmu.enable_all = intel_tfa_pmu_enable_all;
x86_pmu.commit_scheduling = intel_tfa_commit_scheduling;
intel_pmu_attrs[1] = &dev_attr_allow_tsx_force_abort.attr;
}
pr_cont("Skylake events, ");
name = "skylake";
break;
case INTEL_FAM6_ICELAKE_X:
case INTEL_FAM6_ICELAKE_XEON_D:
pmem = true;
case INTEL_FAM6_ICELAKE_MOBILE:
case INTEL_FAM6_ICELAKE_DESKTOP:
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, skl_hw_cache_event_ids, sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, skl_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
@ -4959,11 +5026,12 @@ __init int intel_pmu_init(void)
x86_pmu.get_event_constraints = icl_get_event_constraints;
extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
hsw_format_attr : nhm_format_attr;
extra_attr = merge_attr(extra_attr, skl_format_attr);
x86_pmu.cpu_events = get_icl_events_attrs();
extra_skl_attr = skl_format_attr;
mem_attr = icl_events_attrs;
tsx_attr = icl_tsx_events_attrs;
x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xca, .umask=0x02);
x86_pmu.lbr_pt_coexist = true;
intel_pmu_pebs_data_source_skl(false);
intel_pmu_pebs_data_source_skl(pmem);
pr_cont("Icelake events, ");
name = "icelake";
break;
@ -4988,14 +5056,14 @@ __init int intel_pmu_init(void)
snprintf(pmu_name_str, sizeof(pmu_name_str), "%s", name);
if (version >= 2 && extra_attr) {
x86_pmu.format_attrs = merge_attr(intel_arch3_formats_attr,
extra_attr);
WARN_ON(!x86_pmu.format_attrs);
}
x86_pmu.cpu_events = get_events_attrs(x86_pmu.cpu_events,
mem_attr, tsx_attr);
group_events_td.attrs = td_attr;
group_events_mem.attrs = mem_attr;
group_events_tsx.attrs = tsx_attr;
group_format_extra.attrs = extra_attr;
group_format_extra_skl.attrs = extra_skl_attr;
x86_pmu.attr_update = attr_update;
if (x86_pmu.num_counters > INTEL_PMC_MAX_GENERIC) {
WARN(1, KERN_ERR "hw perf events %d > max(%d), clipping!",
@ -5043,12 +5111,8 @@ __init int intel_pmu_init(void)
x86_pmu.lbr_nr = 0;
}
x86_pmu.caps_attrs = intel_pmu_caps_attrs;
if (x86_pmu.lbr_nr) {
x86_pmu.caps_attrs = merge_attr(x86_pmu.caps_attrs, lbr_attrs);
if (x86_pmu.lbr_nr)
pr_cont("%d-deep LBR, ", x86_pmu.lbr_nr);
}
/*
* Access extra MSR may cause #GP under certain circumstances.
@ -5078,7 +5142,6 @@ __init int intel_pmu_init(void)
if (x86_pmu.counter_freezing)
x86_pmu.handle_irq = intel_pmu_handle_irq_v4;
kfree(to_free);
return 0;
}


@ -96,6 +96,7 @@
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
#include "../perf_event.h"
#include "../probe.h"
MODULE_LICENSE("GPL");
@ -144,25 +145,42 @@ enum perf_cstate_core_events {
PERF_CSTATE_CORE_EVENT_MAX,
};
PMU_EVENT_ATTR_STRING(c1-residency, evattr_cstate_core_c1, "event=0x00");
PMU_EVENT_ATTR_STRING(c3-residency, evattr_cstate_core_c3, "event=0x01");
PMU_EVENT_ATTR_STRING(c6-residency, evattr_cstate_core_c6, "event=0x02");
PMU_EVENT_ATTR_STRING(c7-residency, evattr_cstate_core_c7, "event=0x03");
PMU_EVENT_ATTR_STRING(c1-residency, attr_cstate_core_c1, "event=0x00");
PMU_EVENT_ATTR_STRING(c3-residency, attr_cstate_core_c3, "event=0x01");
PMU_EVENT_ATTR_STRING(c6-residency, attr_cstate_core_c6, "event=0x02");
PMU_EVENT_ATTR_STRING(c7-residency, attr_cstate_core_c7, "event=0x03");
static struct perf_cstate_msr core_msr[] = {
[PERF_CSTATE_CORE_C1_RES] = { MSR_CORE_C1_RES, &evattr_cstate_core_c1 },
[PERF_CSTATE_CORE_C3_RES] = { MSR_CORE_C3_RESIDENCY, &evattr_cstate_core_c3 },
[PERF_CSTATE_CORE_C6_RES] = { MSR_CORE_C6_RESIDENCY, &evattr_cstate_core_c6 },
[PERF_CSTATE_CORE_C7_RES] = { MSR_CORE_C7_RESIDENCY, &evattr_cstate_core_c7 },
static unsigned long core_msr_mask;
PMU_EVENT_GROUP(events, cstate_core_c1);
PMU_EVENT_GROUP(events, cstate_core_c3);
PMU_EVENT_GROUP(events, cstate_core_c6);
PMU_EVENT_GROUP(events, cstate_core_c7);
static bool test_msr(int idx, void *data)
{
return test_bit(idx, (unsigned long *) data);
}
static struct perf_msr core_msr[] = {
[PERF_CSTATE_CORE_C1_RES] = { MSR_CORE_C1_RES, &group_cstate_core_c1, test_msr },
[PERF_CSTATE_CORE_C3_RES] = { MSR_CORE_C3_RESIDENCY, &group_cstate_core_c3, test_msr },
[PERF_CSTATE_CORE_C6_RES] = { MSR_CORE_C6_RESIDENCY, &group_cstate_core_c6, test_msr },
[PERF_CSTATE_CORE_C7_RES] = { MSR_CORE_C7_RESIDENCY, &group_cstate_core_c7, test_msr },
};
static struct attribute *core_events_attrs[PERF_CSTATE_CORE_EVENT_MAX + 1] = {
static struct attribute *attrs_empty[] = {
NULL,
};
/*
* There are no default events, but we need to create
* "events" group (with empty attrs) before updating
* it with detected events.
*/
static struct attribute_group core_events_attr_group = {
.name = "events",
.attrs = core_events_attrs,
.attrs = attrs_empty,
};
DEFINE_CSTATE_FORMAT_ATTR(core_event, event, "config:0-63");
@ -211,31 +229,37 @@ enum perf_cstate_pkg_events {
PERF_CSTATE_PKG_EVENT_MAX,
};
PMU_EVENT_ATTR_STRING(c2-residency, evattr_cstate_pkg_c2, "event=0x00");
PMU_EVENT_ATTR_STRING(c3-residency, evattr_cstate_pkg_c3, "event=0x01");
PMU_EVENT_ATTR_STRING(c6-residency, evattr_cstate_pkg_c6, "event=0x02");
PMU_EVENT_ATTR_STRING(c7-residency, evattr_cstate_pkg_c7, "event=0x03");
PMU_EVENT_ATTR_STRING(c8-residency, evattr_cstate_pkg_c8, "event=0x04");
PMU_EVENT_ATTR_STRING(c9-residency, evattr_cstate_pkg_c9, "event=0x05");
PMU_EVENT_ATTR_STRING(c10-residency, evattr_cstate_pkg_c10, "event=0x06");
PMU_EVENT_ATTR_STRING(c2-residency, attr_cstate_pkg_c2, "event=0x00");
PMU_EVENT_ATTR_STRING(c3-residency, attr_cstate_pkg_c3, "event=0x01");
PMU_EVENT_ATTR_STRING(c6-residency, attr_cstate_pkg_c6, "event=0x02");
PMU_EVENT_ATTR_STRING(c7-residency, attr_cstate_pkg_c7, "event=0x03");
PMU_EVENT_ATTR_STRING(c8-residency, attr_cstate_pkg_c8, "event=0x04");
PMU_EVENT_ATTR_STRING(c9-residency, attr_cstate_pkg_c9, "event=0x05");
PMU_EVENT_ATTR_STRING(c10-residency, attr_cstate_pkg_c10, "event=0x06");
static struct perf_cstate_msr pkg_msr[] = {
[PERF_CSTATE_PKG_C2_RES] = { MSR_PKG_C2_RESIDENCY, &evattr_cstate_pkg_c2 },
[PERF_CSTATE_PKG_C3_RES] = { MSR_PKG_C3_RESIDENCY, &evattr_cstate_pkg_c3 },
[PERF_CSTATE_PKG_C6_RES] = { MSR_PKG_C6_RESIDENCY, &evattr_cstate_pkg_c6 },
[PERF_CSTATE_PKG_C7_RES] = { MSR_PKG_C7_RESIDENCY, &evattr_cstate_pkg_c7 },
[PERF_CSTATE_PKG_C8_RES] = { MSR_PKG_C8_RESIDENCY, &evattr_cstate_pkg_c8 },
[PERF_CSTATE_PKG_C9_RES] = { MSR_PKG_C9_RESIDENCY, &evattr_cstate_pkg_c9 },
[PERF_CSTATE_PKG_C10_RES] = { MSR_PKG_C10_RESIDENCY, &evattr_cstate_pkg_c10 },
};
static unsigned long pkg_msr_mask;
static struct attribute *pkg_events_attrs[PERF_CSTATE_PKG_EVENT_MAX + 1] = {
NULL,
PMU_EVENT_GROUP(events, cstate_pkg_c2);
PMU_EVENT_GROUP(events, cstate_pkg_c3);
PMU_EVENT_GROUP(events, cstate_pkg_c6);
PMU_EVENT_GROUP(events, cstate_pkg_c7);
PMU_EVENT_GROUP(events, cstate_pkg_c8);
PMU_EVENT_GROUP(events, cstate_pkg_c9);
PMU_EVENT_GROUP(events, cstate_pkg_c10);
static struct perf_msr pkg_msr[] = {
[PERF_CSTATE_PKG_C2_RES] = { MSR_PKG_C2_RESIDENCY, &group_cstate_pkg_c2, test_msr },
[PERF_CSTATE_PKG_C3_RES] = { MSR_PKG_C3_RESIDENCY, &group_cstate_pkg_c3, test_msr },
[PERF_CSTATE_PKG_C6_RES] = { MSR_PKG_C6_RESIDENCY, &group_cstate_pkg_c6, test_msr },
[PERF_CSTATE_PKG_C7_RES] = { MSR_PKG_C7_RESIDENCY, &group_cstate_pkg_c7, test_msr },
[PERF_CSTATE_PKG_C8_RES] = { MSR_PKG_C8_RESIDENCY, &group_cstate_pkg_c8, test_msr },
[PERF_CSTATE_PKG_C9_RES] = { MSR_PKG_C9_RESIDENCY, &group_cstate_pkg_c9, test_msr },
[PERF_CSTATE_PKG_C10_RES] = { MSR_PKG_C10_RESIDENCY, &group_cstate_pkg_c10, test_msr },
};
static struct attribute_group pkg_events_attr_group = {
.name = "events",
.attrs = pkg_events_attrs,
.attrs = attrs_empty,
};
DEFINE_CSTATE_FORMAT_ATTR(pkg_event, event, "config:0-63");
@ -289,7 +313,8 @@ static int cstate_pmu_event_init(struct perf_event *event)
if (event->pmu == &cstate_core_pmu) {
if (cfg >= PERF_CSTATE_CORE_EVENT_MAX)
return -EINVAL;
if (!core_msr[cfg].attr)
cfg = array_index_nospec((unsigned long)cfg, PERF_CSTATE_CORE_EVENT_MAX);
if (!(core_msr_mask & (1 << cfg)))
return -EINVAL;
event->hw.event_base = core_msr[cfg].msr;
cpu = cpumask_any_and(&cstate_core_cpu_mask,
@ -298,11 +323,11 @@ static int cstate_pmu_event_init(struct perf_event *event)
if (cfg >= PERF_CSTATE_PKG_EVENT_MAX)
return -EINVAL;
cfg = array_index_nospec((unsigned long)cfg, PERF_CSTATE_PKG_EVENT_MAX);
if (!pkg_msr[cfg].attr)
if (!(pkg_msr_mask & (1 << cfg)))
return -EINVAL;
event->hw.event_base = pkg_msr[cfg].msr;
cpu = cpumask_any_and(&cstate_pkg_cpu_mask,
topology_core_cpumask(event->cpu));
topology_die_cpumask(event->cpu));
} else {
return -ENOENT;
}
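
Both branches now sanitize the user-controlled config value with array_index_nospec() before indexing: a bounds check alone does not stop a mispredicted branch from speculatively reading out of range (Spectre v1), so the helper clamps the value to [0, size) even on the speculative path. The pattern in isolation:

	if (cfg >= PERF_CSTATE_CORE_EVENT_MAX)
		return -EINVAL;
	/* clamp cfg even under branch misprediction (Spectre v1) */
	cfg = array_index_nospec((unsigned long)cfg, PERF_CSTATE_CORE_EVENT_MAX);
	event->hw.event_base = core_msr[cfg].msr;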
@ -385,7 +410,7 @@ static int cstate_cpu_exit(unsigned int cpu)
if (has_cstate_pkg &&
cpumask_test_and_clear_cpu(cpu, &cstate_pkg_cpu_mask)) {
target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
target = cpumask_any_but(topology_die_cpumask(cpu), cpu);
/* Migrate events if there is a valid target */
if (target < nr_cpu_ids) {
cpumask_set_cpu(target, &cstate_pkg_cpu_mask);
@ -414,15 +439,35 @@ static int cstate_cpu_init(unsigned int cpu)
* in the package cpu mask as the designated reader.
*/
target = cpumask_any_and(&cstate_pkg_cpu_mask,
topology_core_cpumask(cpu));
topology_die_cpumask(cpu));
if (has_cstate_pkg && target >= nr_cpu_ids)
cpumask_set_cpu(cpu, &cstate_pkg_cpu_mask);
return 0;
}
const struct attribute_group *core_attr_update[] = {
&group_cstate_core_c1,
&group_cstate_core_c3,
&group_cstate_core_c6,
&group_cstate_core_c7,
NULL,
};
const struct attribute_group *pkg_attr_update[] = {
&group_cstate_pkg_c2,
&group_cstate_pkg_c3,
&group_cstate_pkg_c6,
&group_cstate_pkg_c7,
&group_cstate_pkg_c8,
&group_cstate_pkg_c9,
&group_cstate_pkg_c10,
NULL,
};
static struct pmu cstate_core_pmu = {
.attr_groups = core_attr_groups,
.attr_update = core_attr_update,
.name = "cstate_core",
.task_ctx_nr = perf_invalid_context,
.event_init = cstate_pmu_event_init,
@ -437,6 +482,7 @@ static struct pmu cstate_core_pmu = {
static struct pmu cstate_pkg_pmu = {
.attr_groups = pkg_attr_groups,
.attr_update = pkg_attr_update,
.name = "cstate_pkg",
.task_ctx_nr = perf_invalid_context,
.event_init = cstate_pmu_event_init,
@ -580,35 +626,11 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_PLUS, glm_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_MOBILE, snb_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_DESKTOP, snb_cstates),
{ },
};
MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match);
/*
* Probe the cstate events and insert the available one into sysfs attrs
* Return false if there are no available events.
*/
static bool __init cstate_probe_msr(const unsigned long evmsk, int max,
struct perf_cstate_msr *msr,
struct attribute **attrs)
{
bool found = false;
unsigned int bit;
u64 val;
for (bit = 0; bit < max; bit++) {
if (test_bit(bit, &evmsk) && !rdmsrl_safe(msr[bit].msr, &val)) {
*attrs++ = &msr[bit].attr->attr.attr;
found = true;
} else {
msr[bit].attr = NULL;
}
}
*attrs = NULL;
return found;
}
static int __init cstate_probe(const struct cstate_model *cm)
{
/* SLM has different MSR for PKG C6 */
@ -620,13 +642,14 @@ static int __init cstate_probe(const struct cstate_model *cm)
pkg_msr[PERF_CSTATE_CORE_C6_RES].msr = MSR_KNL_CORE_C6_RESIDENCY;
has_cstate_core = cstate_probe_msr(cm->core_events,
PERF_CSTATE_CORE_EVENT_MAX,
core_msr, core_events_attrs);
core_msr_mask = perf_msr_probe(core_msr, PERF_CSTATE_CORE_EVENT_MAX,
true, (void *) &cm->core_events);
has_cstate_pkg = cstate_probe_msr(cm->pkg_events,
PERF_CSTATE_PKG_EVENT_MAX,
pkg_msr, pkg_events_attrs);
pkg_msr_mask = perf_msr_probe(pkg_msr, PERF_CSTATE_PKG_EVENT_MAX,
true, (void *) &cm->pkg_events);
has_cstate_core = !!core_msr_mask;
has_cstate_pkg = !!pkg_msr_mask;
return (has_cstate_core || has_cstate_pkg) ? 0 : -ENODEV;
}
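
perf_msr_probe() lives in the new arch/x86/events/probe.c, which this view does not include. As used here, its contract is: walk the perf_msr table, consult each entry's .test() hook (test_msr() checks the model's event bitmap), confirm the MSR actually reads via rdmsrl_safe(), and return a bitmask of usable entries; the boolean says whether an MSR that reads back as zero still counts as present (cstate passes true, so it does). A simplified sketch of that loop, from memory (the real helper also toggles each entry's attribute group visibility):

	unsigned long demo_msr_probe(struct perf_msr *msr, int cnt, bool zero, void *data)
	{
		unsigned long avail = 0;
		u64 val;
		int bit;

		for (bit = 0; bit < cnt; bit++) {
			if (msr[bit].test && !msr[bit].test(bit, data))
				continue;
			if (rdmsrl_safe(msr[bit].msr, &val))
				continue;	/* MSR faulted: not present */
			if (!zero && !val)
				continue;	/* reject zero counters if asked */
			avail |= BIT(bit);
		}
		return avail;
	}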
@ -663,7 +686,13 @@ static int __init cstate_init(void)
}
if (has_cstate_pkg) {
err = perf_pmu_register(&cstate_pkg_pmu, cstate_pkg_pmu.name, -1);
if (topology_max_die_per_package() > 1) {
err = perf_pmu_register(&cstate_pkg_pmu,
"cstate_die", -1);
} else {
err = perf_pmu_register(&cstate_pkg_pmu,
cstate_pkg_pmu.name, -1);
}
if (err) {
has_cstate_pkg = false;
pr_info("Failed to register cstate pkg pmu\n");


@ -55,27 +55,28 @@
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/perf_event.h>
#include <linux/nospec.h>
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
#include "../perf_event.h"
#include "../probe.h"
MODULE_LICENSE("GPL");
/*
* RAPL energy status counters
*/
#define RAPL_IDX_PP0_NRG_STAT 0 /* all cores */
#define INTEL_RAPL_PP0 0x1 /* pseudo-encoding */
#define RAPL_IDX_PKG_NRG_STAT 1 /* entire package */
#define INTEL_RAPL_PKG 0x2 /* pseudo-encoding */
#define RAPL_IDX_RAM_NRG_STAT 2 /* DRAM */
#define INTEL_RAPL_RAM 0x3 /* pseudo-encoding */
#define RAPL_IDX_PP1_NRG_STAT 3 /* gpu */
#define INTEL_RAPL_PP1 0x4 /* pseudo-encoding */
#define RAPL_IDX_PSYS_NRG_STAT 4 /* psys */
#define INTEL_RAPL_PSYS 0x5 /* pseudo-encoding */
enum perf_rapl_events {
PERF_RAPL_PP0 = 0, /* all cores */
PERF_RAPL_PKG, /* entire package */
PERF_RAPL_RAM, /* DRAM */
PERF_RAPL_PP1, /* gpu */
PERF_RAPL_PSYS, /* psys */
PERF_RAPL_MAX,
NR_RAPL_DOMAINS = PERF_RAPL_MAX,
};
#define NR_RAPL_DOMAINS 0x5
static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
"pp0-core",
"package",
@ -84,33 +85,6 @@ static const char *const rapl_domain_names[NR_RAPL_DOMAINS] __initconst = {
"psys",
};
/* Clients have PP0, PKG */
#define RAPL_IDX_CLN (1<<RAPL_IDX_PP0_NRG_STAT|\
1<<RAPL_IDX_PKG_NRG_STAT|\
1<<RAPL_IDX_PP1_NRG_STAT)
/* Servers have PP0, PKG, RAM */
#define RAPL_IDX_SRV (1<<RAPL_IDX_PP0_NRG_STAT|\
1<<RAPL_IDX_PKG_NRG_STAT|\
1<<RAPL_IDX_RAM_NRG_STAT)
/* Servers have PP0, PKG, RAM, PP1 */
#define RAPL_IDX_HSW (1<<RAPL_IDX_PP0_NRG_STAT|\
1<<RAPL_IDX_PKG_NRG_STAT|\
1<<RAPL_IDX_RAM_NRG_STAT|\
1<<RAPL_IDX_PP1_NRG_STAT)
/* SKL clients have PP0, PKG, RAM, PP1, PSYS */
#define RAPL_IDX_SKL_CLN (1<<RAPL_IDX_PP0_NRG_STAT|\
1<<RAPL_IDX_PKG_NRG_STAT|\
1<<RAPL_IDX_RAM_NRG_STAT|\
1<<RAPL_IDX_PP1_NRG_STAT|\
1<<RAPL_IDX_PSYS_NRG_STAT)
/* Knights Landing has PKG, RAM */
#define RAPL_IDX_KNL (1<<RAPL_IDX_PKG_NRG_STAT|\
1<<RAPL_IDX_RAM_NRG_STAT)
/*
* event code: LSB 8 bits, passed in attr->config
* any other bit is reserved
@ -149,26 +123,32 @@ struct rapl_pmu {
struct rapl_pmus {
struct pmu pmu;
unsigned int maxpkg;
unsigned int maxdie;
struct rapl_pmu *pmus[];
};
struct rapl_model {
unsigned long events;
bool apply_quirk;
};
/* 1/2^hw_unit Joule */
static int rapl_hw_unit[NR_RAPL_DOMAINS] __read_mostly;
static struct rapl_pmus *rapl_pmus;
static cpumask_t rapl_cpu_mask;
static unsigned int rapl_cntr_mask;
static u64 rapl_timer_ms;
static struct perf_msr rapl_msrs[];
static inline struct rapl_pmu *cpu_to_rapl_pmu(unsigned int cpu)
{
unsigned int pkgid = topology_logical_package_id(cpu);
unsigned int dieid = topology_logical_die_id(cpu);
/*
* The unsigned check also catches the '-1' return value for non
* existent mappings in the topology map.
*/
return pkgid < rapl_pmus->maxpkg ? rapl_pmus->pmus[pkgid] : NULL;
return dieid < rapl_pmus->maxdie ? rapl_pmus->pmus[dieid] : NULL;
}
static inline u64 rapl_read_counter(struct perf_event *event)
@ -350,7 +330,7 @@ static void rapl_pmu_event_del(struct perf_event *event, int flags)
static int rapl_pmu_event_init(struct perf_event *event)
{
u64 cfg = event->attr.config & RAPL_EVENT_MASK;
int bit, msr, ret = 0;
int bit, ret = 0;
struct rapl_pmu *pmu;
/* only look at RAPL events */
@ -366,33 +346,12 @@ static int rapl_pmu_event_init(struct perf_event *event)
event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
/*
* check event is known (determines counter)
*/
switch (cfg) {
case INTEL_RAPL_PP0:
bit = RAPL_IDX_PP0_NRG_STAT;
msr = MSR_PP0_ENERGY_STATUS;
break;
case INTEL_RAPL_PKG:
bit = RAPL_IDX_PKG_NRG_STAT;
msr = MSR_PKG_ENERGY_STATUS;
break;
case INTEL_RAPL_RAM:
bit = RAPL_IDX_RAM_NRG_STAT;
msr = MSR_DRAM_ENERGY_STATUS;
break;
case INTEL_RAPL_PP1:
bit = RAPL_IDX_PP1_NRG_STAT;
msr = MSR_PP1_ENERGY_STATUS;
break;
case INTEL_RAPL_PSYS:
bit = RAPL_IDX_PSYS_NRG_STAT;
msr = MSR_PLATFORM_ENERGY_STATUS;
break;
default:
if (!cfg || cfg >= NR_RAPL_DOMAINS + 1)
return -EINVAL;
}
cfg = array_index_nospec((long)cfg, NR_RAPL_DOMAINS + 1);
bit = cfg - 1;
/* check event supported */
if (!(rapl_cntr_mask & (1 << bit)))
return -EINVAL;
@ -407,7 +366,7 @@ static int rapl_pmu_event_init(struct perf_event *event)
return -EINVAL;
event->cpu = pmu->cpu;
event->pmu_private = pmu;
event->hw.event_base = msr;
event->hw.event_base = rapl_msrs[bit].msr;
event->hw.config = cfg;
event->hw.idx = bit;
@ -457,90 +416,18 @@ RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_ram_scale, "2.3283064365386962890
RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_gpu_scale, "2.3283064365386962890625e-10");
RAPL_EVENT_ATTR_STR(energy-psys.scale, rapl_psys_scale, "2.3283064365386962890625e-10");
static struct attribute *rapl_events_srv_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
EVENT_PTR(rapl_ram),
EVENT_PTR(rapl_cores_unit),
EVENT_PTR(rapl_pkg_unit),
EVENT_PTR(rapl_ram_unit),
EVENT_PTR(rapl_cores_scale),
EVENT_PTR(rapl_pkg_scale),
EVENT_PTR(rapl_ram_scale),
NULL,
};
static struct attribute *rapl_events_cln_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
EVENT_PTR(rapl_gpu),
EVENT_PTR(rapl_cores_unit),
EVENT_PTR(rapl_pkg_unit),
EVENT_PTR(rapl_gpu_unit),
EVENT_PTR(rapl_cores_scale),
EVENT_PTR(rapl_pkg_scale),
EVENT_PTR(rapl_gpu_scale),
NULL,
};
static struct attribute *rapl_events_hsw_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
EVENT_PTR(rapl_gpu),
EVENT_PTR(rapl_ram),
EVENT_PTR(rapl_cores_unit),
EVENT_PTR(rapl_pkg_unit),
EVENT_PTR(rapl_gpu_unit),
EVENT_PTR(rapl_ram_unit),
EVENT_PTR(rapl_cores_scale),
EVENT_PTR(rapl_pkg_scale),
EVENT_PTR(rapl_gpu_scale),
EVENT_PTR(rapl_ram_scale),
NULL,
};
static struct attribute *rapl_events_skl_attr[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_pkg),
EVENT_PTR(rapl_gpu),
EVENT_PTR(rapl_ram),
EVENT_PTR(rapl_psys),
EVENT_PTR(rapl_cores_unit),
EVENT_PTR(rapl_pkg_unit),
EVENT_PTR(rapl_gpu_unit),
EVENT_PTR(rapl_ram_unit),
EVENT_PTR(rapl_psys_unit),
EVENT_PTR(rapl_cores_scale),
EVENT_PTR(rapl_pkg_scale),
EVENT_PTR(rapl_gpu_scale),
EVENT_PTR(rapl_ram_scale),
EVENT_PTR(rapl_psys_scale),
NULL,
};
static struct attribute *rapl_events_knl_attr[] = {
EVENT_PTR(rapl_pkg),
EVENT_PTR(rapl_ram),
EVENT_PTR(rapl_pkg_unit),
EVENT_PTR(rapl_ram_unit),
EVENT_PTR(rapl_pkg_scale),
EVENT_PTR(rapl_ram_scale),
/*
* There are no default events, but we need to create
* "events" group (with empty attrs) before updating
* it with detected events.
*/
static struct attribute *attrs_empty[] = {
NULL,
};
static struct attribute_group rapl_pmu_events_group = {
.name = "events",
.attrs = NULL, /* patched at runtime */
.attrs = attrs_empty,
};
DEFINE_RAPL_FORMAT_ATTR(event, event, "config:0-7");
@ -561,6 +448,79 @@ static const struct attribute_group *rapl_attr_groups[] = {
NULL,
};
static struct attribute *rapl_events_cores[] = {
EVENT_PTR(rapl_cores),
EVENT_PTR(rapl_cores_unit),
EVENT_PTR(rapl_cores_scale),
NULL,
};
static struct attribute_group rapl_events_cores_group = {
.name = "events",
.attrs = rapl_events_cores,
};
static struct attribute *rapl_events_pkg[] = {
EVENT_PTR(rapl_pkg),
EVENT_PTR(rapl_pkg_unit),
EVENT_PTR(rapl_pkg_scale),
NULL,
};
static struct attribute_group rapl_events_pkg_group = {
.name = "events",
.attrs = rapl_events_pkg,
};
static struct attribute *rapl_events_ram[] = {
EVENT_PTR(rapl_ram),
EVENT_PTR(rapl_ram_unit),
EVENT_PTR(rapl_ram_scale),
NULL,
};
static struct attribute_group rapl_events_ram_group = {
.name = "events",
.attrs = rapl_events_ram,
};
static struct attribute *rapl_events_gpu[] = {
EVENT_PTR(rapl_gpu),
EVENT_PTR(rapl_gpu_unit),
EVENT_PTR(rapl_gpu_scale),
NULL,
};
static struct attribute_group rapl_events_gpu_group = {
.name = "events",
.attrs = rapl_events_gpu,
};
static struct attribute *rapl_events_psys[] = {
EVENT_PTR(rapl_psys),
EVENT_PTR(rapl_psys_unit),
EVENT_PTR(rapl_psys_scale),
NULL,
};
static struct attribute_group rapl_events_psys_group = {
.name = "events",
.attrs = rapl_events_psys,
};
static bool test_msr(int idx, void *data)
{
return test_bit(idx, (unsigned long *) data);
}
static struct perf_msr rapl_msrs[] = {
[PERF_RAPL_PP0] = { MSR_PP0_ENERGY_STATUS, &rapl_events_cores_group, test_msr },
[PERF_RAPL_PKG] = { MSR_PKG_ENERGY_STATUS, &rapl_events_pkg_group, test_msr },
[PERF_RAPL_RAM] = { MSR_DRAM_ENERGY_STATUS, &rapl_events_ram_group, test_msr },
[PERF_RAPL_PP1] = { MSR_PP1_ENERGY_STATUS, &rapl_events_gpu_group, test_msr },
[PERF_RAPL_PSYS] = { MSR_PLATFORM_ENERGY_STATUS, &rapl_events_psys_group, test_msr },
};
static int rapl_cpu_offline(unsigned int cpu)
{
struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
@ -572,7 +532,7 @@ static int rapl_cpu_offline(unsigned int cpu)
pmu->cpu = -1;
/* Find a new cpu to collect rapl events */
target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
target = cpumask_any_but(topology_die_cpumask(cpu), cpu);
/* Migrate rapl events to the new target */
if (target < nr_cpu_ids) {
@ -599,14 +559,14 @@ static int rapl_cpu_online(unsigned int cpu)
pmu->timer_interval = ms_to_ktime(rapl_timer_ms);
rapl_hrtimer_init(pmu);
rapl_pmus->pmus[topology_logical_package_id(cpu)] = pmu;
rapl_pmus->pmus[topology_logical_die_id(cpu)] = pmu;
}
/*
* Check if there is an online cpu in the package which collects rapl
* events already.
*/
target = cpumask_any_and(&rapl_cpu_mask, topology_core_cpumask(cpu));
target = cpumask_any_and(&rapl_cpu_mask, topology_die_cpumask(cpu));
if (target < nr_cpu_ids)
return 0;
@ -633,7 +593,7 @@ static int rapl_check_hw_unit(bool apply_quirk)
* of 2. Datasheet, September 2014, Reference Number: 330784-001 "
*/
if (apply_quirk)
rapl_hw_unit[RAPL_IDX_RAM_NRG_STAT] = 16;
rapl_hw_unit[PERF_RAPL_RAM] = 16;
/*
* Calculate the timer rate:
@ -669,23 +629,33 @@ static void cleanup_rapl_pmus(void)
{
int i;
for (i = 0; i < rapl_pmus->maxpkg; i++)
for (i = 0; i < rapl_pmus->maxdie; i++)
kfree(rapl_pmus->pmus[i]);
kfree(rapl_pmus);
}
const struct attribute_group *rapl_attr_update[] = {
&rapl_events_cores_group,
&rapl_events_pkg_group,
&rapl_events_ram_group,
&rapl_events_gpu_group,
&rapl_events_gpu_group,
NULL,
};
static int __init init_rapl_pmus(void)
{
int maxpkg = topology_max_packages();
int maxdie = topology_max_packages() * topology_max_die_per_package();
size_t size;
size = sizeof(*rapl_pmus) + maxpkg * sizeof(struct rapl_pmu *);
size = sizeof(*rapl_pmus) + maxdie * sizeof(struct rapl_pmu *);
rapl_pmus = kzalloc(size, GFP_KERNEL);
if (!rapl_pmus)
return -ENOMEM;
rapl_pmus->maxpkg = maxpkg;
rapl_pmus->maxdie = maxdie;
rapl_pmus->pmu.attr_groups = rapl_attr_groups;
rapl_pmus->pmu.attr_update = rapl_attr_update;
rapl_pmus->pmu.task_ctx_nr = perf_invalid_context;
rapl_pmus->pmu.event_init = rapl_pmu_event_init;
rapl_pmus->pmu.add = rapl_pmu_event_add;
@ -701,105 +671,96 @@ static int __init init_rapl_pmus(void)
#define X86_RAPL_MODEL_MATCH(model, init) \
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&init }
struct intel_rapl_init_fun {
bool apply_quirk;
int cntr_mask;
struct attribute **attrs;
static struct rapl_model model_snb = {
.events = BIT(PERF_RAPL_PP0) |
BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_PP1),
.apply_quirk = false,
};
static const struct intel_rapl_init_fun snb_rapl_init __initconst = {
.apply_quirk = false,
.cntr_mask = RAPL_IDX_CLN,
.attrs = rapl_events_cln_attr,
static struct rapl_model model_snbep = {
.events = BIT(PERF_RAPL_PP0) |
BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_RAM),
.apply_quirk = false,
};
static const struct intel_rapl_init_fun hsx_rapl_init __initconst = {
.apply_quirk = true,
.cntr_mask = RAPL_IDX_SRV,
.attrs = rapl_events_srv_attr,
static struct rapl_model model_hsw = {
.events = BIT(PERF_RAPL_PP0) |
BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_RAM) |
BIT(PERF_RAPL_PP1),
.apply_quirk = false,
};
static const struct intel_rapl_init_fun hsw_rapl_init __initconst = {
.apply_quirk = false,
.cntr_mask = RAPL_IDX_HSW,
.attrs = rapl_events_hsw_attr,
static struct rapl_model model_hsx = {
.events = BIT(PERF_RAPL_PP0) |
BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_RAM),
.apply_quirk = true,
};
static const struct intel_rapl_init_fun snbep_rapl_init __initconst = {
.apply_quirk = false,
.cntr_mask = RAPL_IDX_SRV,
.attrs = rapl_events_srv_attr,
static struct rapl_model model_knl = {
.events = BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_RAM),
.apply_quirk = true,
};
static const struct intel_rapl_init_fun knl_rapl_init __initconst = {
.apply_quirk = true,
.cntr_mask = RAPL_IDX_KNL,
.attrs = rapl_events_knl_attr,
static struct rapl_model model_skl = {
.events = BIT(PERF_RAPL_PP0) |
BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_RAM) |
BIT(PERF_RAPL_PP1) |
BIT(PERF_RAPL_PSYS),
.apply_quirk = false,
};
static const struct intel_rapl_init_fun skl_rapl_init __initconst = {
.apply_quirk = false,
.cntr_mask = RAPL_IDX_SKL_CLN,
.attrs = rapl_events_skl_attr,
};
static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE, snb_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE_X, snbep_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_IVYBRIDGE, snb_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_IVYBRIDGE_X, snbep_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_CORE, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_X, hsx_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_ULT, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_GT3E, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_CORE, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_GT3E, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_X, hsx_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_XEON_D, hsx_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_XEON_PHI_KNL, knl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_XEON_PHI_KNM, knl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X, hsx_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_KABYLAKE_MOBILE, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_KABYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_CANNONLAKE_MOBILE, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT_X, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT_PLUS, hsw_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ICELAKE_MOBILE, skl_rapl_init),
static const struct x86_cpu_id rapl_model_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE, model_snb),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SANDYBRIDGE_X, model_snbep),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_IVYBRIDGE, model_snb),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_IVYBRIDGE_X, model_snbep),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_CORE, model_hsw),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_X, model_hsx),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_ULT, model_hsw),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_GT3E, model_hsw),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_CORE, model_hsw),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_GT3E, model_hsw),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_X, model_hsx),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_XEON_D, model_hsx),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_XEON_PHI_KNL, model_knl),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_XEON_PHI_KNM, model_knl),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE, model_skl),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, model_skl),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X, model_hsx),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_KABYLAKE_MOBILE, model_skl),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_KABYLAKE_DESKTOP, model_skl),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_CANNONLAKE_MOBILE, model_skl),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, model_hsw),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT_X, model_hsw),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT_PLUS, model_hsw),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ICELAKE_MOBILE, model_skl),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ICELAKE_DESKTOP, model_skl),
{},
};
MODULE_DEVICE_TABLE(x86cpu, rapl_cpu_match);
MODULE_DEVICE_TABLE(x86cpu, rapl_model_match);
static int __init rapl_pmu_init(void)
{
const struct x86_cpu_id *id;
struct intel_rapl_init_fun *rapl_init;
bool apply_quirk;
struct rapl_model *rm;
int ret;
id = x86_match_cpu(rapl_cpu_match);
id = x86_match_cpu(rapl_model_match);
if (!id)
return -ENODEV;
rapl_init = (struct intel_rapl_init_fun *)id->driver_data;
apply_quirk = rapl_init->apply_quirk;
rapl_cntr_mask = rapl_init->cntr_mask;
rapl_pmu_events_group.attrs = rapl_init->attrs;
rm = (struct rapl_model *) id->driver_data;
rapl_cntr_mask = perf_msr_probe(rapl_msrs, PERF_RAPL_MAX,
false, (void *) &rm->events);
ret = rapl_check_hw_unit(apply_quirk);
ret = rapl_check_hw_unit(rm->apply_quirk);
if (ret)
return ret;


@ -8,6 +8,7 @@
static struct intel_uncore_type *empty_uncore[] = { NULL, };
struct intel_uncore_type **uncore_msr_uncores = empty_uncore;
struct intel_uncore_type **uncore_pci_uncores = empty_uncore;
struct intel_uncore_type **uncore_mmio_uncores = empty_uncore;
static bool pcidrv_registered;
struct pci_driver *uncore_pci_driver;
@ -15,7 +16,7 @@ struct pci_driver *uncore_pci_driver;
DEFINE_RAW_SPINLOCK(pci2phy_map_lock);
struct list_head pci2phy_map_head = LIST_HEAD_INIT(pci2phy_map_head);
struct pci_extra_dev *uncore_extra_pci_dev;
static int max_packages;
static int max_dies;
/* mask of cpus that collect uncore events */
static cpumask_t uncore_cpu_mask;
@ -28,7 +29,7 @@ struct event_constraint uncore_constraint_empty =
MODULE_LICENSE("GPL");
static int uncore_pcibus_to_physid(struct pci_bus *bus)
int uncore_pcibus_to_physid(struct pci_bus *bus)
{
struct pci2phy_map *map;
int phys_id = -1;
@ -101,13 +102,13 @@ ssize_t uncore_event_show(struct kobject *kobj,
struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu)
{
unsigned int pkgid = topology_logical_package_id(cpu);
unsigned int dieid = topology_logical_die_id(cpu);
/*
* The unsigned check also catches the '-1' return value for non
* existent mappings in the topology map.
*/
return pkgid < max_packages ? pmu->boxes[pkgid] : NULL;
return dieid < max_dies ? pmu->boxes[dieid] : NULL;
}
u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event)
@ -119,6 +120,21 @@ u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *eve
return count;
}
void uncore_mmio_exit_box(struct intel_uncore_box *box)
{
if (box->io_addr)
iounmap(box->io_addr);
}
u64 uncore_mmio_read_counter(struct intel_uncore_box *box,
struct perf_event *event)
{
if (!box->io_addr)
return 0;
return readq(box->io_addr + event->hw.event_base);
}
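
These helpers assume a box-type init hook has already ioremap()ed the counter block; that counterpart is not part of this hunk and varies per uncore type. A hypothetical sketch of its shape:

	/* Hypothetical init hook: map the MMIO counter block so readq() works. */
	static void demo_uncore_mmio_init_box(struct intel_uncore_box *box)
	{
		/* base/size discovery (PCI BAR, MSR, ...) is type-specific */
		box->io_addr = ioremap(demo_mmio_base(box), DEMO_MMIO_SIZE);
	}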
/*
* generic get constraint function for shared match/mask registers.
*/
@ -312,7 +328,7 @@ static struct intel_uncore_box *uncore_alloc_box(struct intel_uncore_type *type,
uncore_pmu_init_hrtimer(box);
box->cpu = -1;
box->pci_phys_id = -1;
box->pkgid = -1;
box->dieid = -1;
/* set default hrtimer timeout */
box->hrtimer_duration = UNCORE_PMU_HRTIMER_INTERVAL;
@ -827,10 +843,10 @@ static void uncore_pmu_unregister(struct intel_uncore_pmu *pmu)
static void uncore_free_boxes(struct intel_uncore_pmu *pmu)
{
int pkg;
int die;
for (pkg = 0; pkg < max_packages; pkg++)
kfree(pmu->boxes[pkg]);
for (die = 0; die < max_dies; die++)
kfree(pmu->boxes[die]);
kfree(pmu->boxes);
}
@ -867,7 +883,7 @@ static int __init uncore_type_init(struct intel_uncore_type *type, bool setid)
if (!pmus)
return -ENOMEM;
size = max_packages * sizeof(struct intel_uncore_box *);
size = max_dies * sizeof(struct intel_uncore_box *);
for (i = 0; i < type->num_boxes; i++) {
pmus[i].func_id = setid ? i : -1;
@ -937,20 +953,21 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
struct intel_uncore_type *type;
struct intel_uncore_pmu *pmu = NULL;
struct intel_uncore_box *box;
int phys_id, pkg, ret;
int phys_id, die, ret;
phys_id = uncore_pcibus_to_physid(pdev->bus);
if (phys_id < 0)
return -ENODEV;
pkg = topology_phys_to_logical_pkg(phys_id);
if (pkg < 0)
die = (topology_max_die_per_package() > 1) ? phys_id :
topology_phys_to_logical_pkg(phys_id);
if (die < 0)
return -EINVAL;
if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) {
int idx = UNCORE_PCI_DEV_IDX(id->driver_data);
uncore_extra_pci_dev[pkg].dev[idx] = pdev;
uncore_extra_pci_dev[die].dev[idx] = pdev;
pci_set_drvdata(pdev, NULL);
return 0;
}
@ -989,7 +1006,7 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
pmu = &type->pmus[UNCORE_PCI_DEV_IDX(id->driver_data)];
}
if (WARN_ON_ONCE(pmu->boxes[pkg] != NULL))
if (WARN_ON_ONCE(pmu->boxes[die] != NULL))
return -EINVAL;
box = uncore_alloc_box(type, NUMA_NO_NODE);
@ -1003,13 +1020,13 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
atomic_inc(&box->refcnt);
box->pci_phys_id = phys_id;
box->pkgid = pkg;
box->dieid = die;
box->pci_dev = pdev;
box->pmu = pmu;
uncore_box_init(box);
pci_set_drvdata(pdev, box);
pmu->boxes[pkg] = box;
pmu->boxes[die] = box;
if (atomic_inc_return(&pmu->activeboxes) > 1)
return 0;
@ -1017,7 +1034,7 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
ret = uncore_pmu_register(pmu);
if (ret) {
pci_set_drvdata(pdev, NULL);
pmu->boxes[pkg] = NULL;
pmu->boxes[die] = NULL;
uncore_box_exit(box);
kfree(box);
}
@ -1028,16 +1045,17 @@ static void uncore_pci_remove(struct pci_dev *pdev)
{
struct intel_uncore_box *box;
struct intel_uncore_pmu *pmu;
int i, phys_id, pkg;
int i, phys_id, die;
phys_id = uncore_pcibus_to_physid(pdev->bus);
box = pci_get_drvdata(pdev);
if (!box) {
pkg = topology_phys_to_logical_pkg(phys_id);
die = (topology_max_die_per_package() > 1) ? phys_id :
topology_phys_to_logical_pkg(phys_id);
for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
uncore_extra_pci_dev[pkg].dev[i] = NULL;
if (uncore_extra_pci_dev[die].dev[i] == pdev) {
uncore_extra_pci_dev[die].dev[i] = NULL;
break;
}
}
@ -1050,7 +1068,7 @@ static void uncore_pci_remove(struct pci_dev *pdev)
return;
pci_set_drvdata(pdev, NULL);
pmu->boxes[box->pkgid] = NULL;
pmu->boxes[box->dieid] = NULL;
if (atomic_dec_return(&pmu->activeboxes) == 0)
uncore_pmu_unregister(pmu);
uncore_box_exit(box);
@ -1062,7 +1080,7 @@ static int __init uncore_pci_init(void)
size_t size;
int ret;
size = max_packages * sizeof(struct pci_extra_dev);
size = max_dies * sizeof(struct pci_extra_dev);
uncore_extra_pci_dev = kzalloc(size, GFP_KERNEL);
if (!uncore_extra_pci_dev) {
ret = -ENOMEM;
@ -1109,11 +1127,11 @@ static void uncore_change_type_ctx(struct intel_uncore_type *type, int old_cpu,
{
struct intel_uncore_pmu *pmu = type->pmus;
struct intel_uncore_box *box;
int i, pkg;
int i, die;
pkg = topology_logical_package_id(old_cpu < 0 ? new_cpu : old_cpu);
die = topology_logical_die_id(old_cpu < 0 ? new_cpu : old_cpu);
for (i = 0; i < type->num_boxes; i++, pmu++) {
box = pmu->boxes[pkg];
box = pmu->boxes[die];
if (!box)
continue;
@ -1141,18 +1159,33 @@ static void uncore_change_context(struct intel_uncore_type **uncores,
uncore_change_type_ctx(*uncores, old_cpu, new_cpu);
}
static int uncore_event_cpu_offline(unsigned int cpu)
static void uncore_box_unref(struct intel_uncore_type **types, int id)
{
struct intel_uncore_type *type, **types = uncore_msr_uncores;
struct intel_uncore_type *type;
struct intel_uncore_pmu *pmu;
struct intel_uncore_box *box;
int i, pkg, target;
int i;
for (; *types; types++) {
type = *types;
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
box = pmu->boxes[id];
if (box && atomic_dec_return(&box->refcnt) == 0)
uncore_box_exit(box);
}
}
}
static int uncore_event_cpu_offline(unsigned int cpu)
{
int die, target;
/* Check if exiting cpu is used for collecting uncore events */
if (!cpumask_test_and_clear_cpu(cpu, &uncore_cpu_mask))
goto unref;
/* Find a new cpu to collect uncore events */
target = cpumask_any_but(topology_core_cpumask(cpu), cpu);
target = cpumask_any_but(topology_die_cpumask(cpu), cpu);
/* Migrate uncore events to the new target */
if (target < nr_cpu_ids)
@ -1161,25 +1194,19 @@ static int uncore_event_cpu_offline(unsigned int cpu)
target = -1;
uncore_change_context(uncore_msr_uncores, cpu, target);
uncore_change_context(uncore_mmio_uncores, cpu, target);
uncore_change_context(uncore_pci_uncores, cpu, target);
unref:
/* Clear the references */
pkg = topology_logical_package_id(cpu);
for (; *types; types++) {
type = *types;
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
box = pmu->boxes[pkg];
if (box && atomic_dec_return(&box->refcnt) == 0)
uncore_box_exit(box);
}
}
die = topology_logical_die_id(cpu);
uncore_box_unref(uncore_msr_uncores, die);
uncore_box_unref(uncore_mmio_uncores, die);
return 0;
}
static int allocate_boxes(struct intel_uncore_type **types,
unsigned int pkg, unsigned int cpu)
unsigned int die, unsigned int cpu)
{
struct intel_uncore_box *box, *tmp;
struct intel_uncore_type *type;
@ -1192,20 +1219,20 @@ static int allocate_boxes(struct intel_uncore_type **types,
type = *types;
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
if (pmu->boxes[pkg])
if (pmu->boxes[die])
continue;
box = uncore_alloc_box(type, cpu_to_node(cpu));
if (!box)
goto cleanup;
box->pmu = pmu;
box->pkgid = pkg;
box->dieid = die;
list_add(&box->active_list, &allocated);
}
}
/* Install them in the pmus */
list_for_each_entry_safe(box, tmp, &allocated, active_list) {
list_del_init(&box->active_list);
box->pmu->boxes[pkg] = box;
box->pmu->boxes[die] = box;
}
return 0;
@ -1217,15 +1244,15 @@ static int allocate_boxes(struct intel_uncore_type **types,
return -ENOMEM;
}
static int uncore_event_cpu_online(unsigned int cpu)
static int uncore_box_ref(struct intel_uncore_type **types,
int id, unsigned int cpu)
{
struct intel_uncore_type *type, **types = uncore_msr_uncores;
struct intel_uncore_type *type;
struct intel_uncore_pmu *pmu;
struct intel_uncore_box *box;
int i, ret, pkg, target;
int i, ret;
pkg = topology_logical_package_id(cpu);
ret = allocate_boxes(types, pkg, cpu);
ret = allocate_boxes(types, id, cpu);
if (ret)
return ret;
@ -1233,23 +1260,38 @@ static int uncore_event_cpu_online(unsigned int cpu)
type = *types;
pmu = type->pmus;
for (i = 0; i < type->num_boxes; i++, pmu++) {
box = pmu->boxes[pkg];
box = pmu->boxes[id];
if (box && atomic_inc_return(&box->refcnt) == 1)
uncore_box_init(box);
}
}
return 0;
}
static int uncore_event_cpu_online(unsigned int cpu)
{
int die, target, msr_ret, mmio_ret;
die = topology_logical_die_id(cpu);
msr_ret = uncore_box_ref(uncore_msr_uncores, die, cpu);
mmio_ret = uncore_box_ref(uncore_mmio_uncores, die, cpu);
if (msr_ret && mmio_ret)
return -ENOMEM;
/*
* Check if there is an online cpu in the package
* which collects uncore events already.
*/
target = cpumask_any_and(&uncore_cpu_mask, topology_core_cpumask(cpu));
target = cpumask_any_and(&uncore_cpu_mask, topology_die_cpumask(cpu));
if (target < nr_cpu_ids)
return 0;
cpumask_set_cpu(cpu, &uncore_cpu_mask);
uncore_change_context(uncore_msr_uncores, -1, cpu);
if (!msr_ret)
uncore_change_context(uncore_msr_uncores, -1, cpu);
if (!mmio_ret)
uncore_change_context(uncore_mmio_uncores, -1, cpu);
uncore_change_context(uncore_pci_uncores, -1, cpu);
return 0;
}
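/*
 * Note: the MSR and MMIO uncore types come up independently here; the
 * online callback returns -ENOMEM only when both uncore_box_ref() calls
 * fail, and uncore_change_context() runs only for the types that did
 * come up.
 */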
@ -1297,12 +1339,35 @@ static int __init uncore_cpu_init(void)
return ret;
}
static int __init uncore_mmio_init(void)
{
struct intel_uncore_type **types = uncore_mmio_uncores;
int ret;
ret = uncore_types_init(types, true);
if (ret)
goto err;
for (; *types; types++) {
ret = type_pmu_register(*types);
if (ret)
goto err;
}
return 0;
err:
uncore_types_exit(uncore_mmio_uncores);
uncore_mmio_uncores = empty_uncore;
return ret;
}
#define X86_UNCORE_MODEL_MATCH(model, init) \
{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY, (unsigned long)&init }
struct intel_uncore_init_fun {
void (*cpu_init)(void);
int (*pci_init)(void);
void (*mmio_init)(void);
};
static const struct intel_uncore_init_fun nhm_uncore_init __initconst = {
@ -1373,6 +1438,12 @@ static const struct intel_uncore_init_fun icl_uncore_init __initconst = {
.pci_init = skl_uncore_pci_init,
};
static const struct intel_uncore_init_fun snr_uncore_init __initconst = {
.cpu_init = snr_uncore_cpu_init,
.pci_init = snr_uncore_pci_init,
.mmio_init = snr_uncore_mmio_init,
};
static const struct x86_cpu_id intel_uncore_match[] __initconst = {
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_NEHALEM_EP, nhm_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_NEHALEM, nhm_uncore_init),
@ -1400,6 +1471,9 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = {
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_KABYLAKE_MOBILE, skl_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_KABYLAKE_DESKTOP, skl_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_ICELAKE_MOBILE, icl_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_ICELAKE_NNPI, icl_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_ICELAKE_DESKTOP, icl_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_ATOM_TREMONT_X, snr_uncore_init),
{},
};
@ -1409,7 +1483,7 @@ static int __init intel_uncore_init(void)
{
const struct x86_cpu_id *id;
struct intel_uncore_init_fun *uncore_init;
int pret = 0, cret = 0, ret;
int pret = 0, cret = 0, mret = 0, ret;
id = x86_match_cpu(intel_uncore_match);
if (!id)
@ -1418,7 +1492,7 @@ static int __init intel_uncore_init(void)
if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
return -ENODEV;
max_packages = topology_max_packages();
max_dies = topology_max_packages() * topology_max_die_per_package();
uncore_init = (struct intel_uncore_init_fun *)id->driver_data;
if (uncore_init->pci_init) {
@ -1432,7 +1506,12 @@ static int __init intel_uncore_init(void)
cret = uncore_cpu_init();
}
if (cret && pret)
if (uncore_init->mmio_init) {
uncore_init->mmio_init();
mret = uncore_mmio_init();
}
if (cret && pret && mret)
return -ENODEV;
/* Install hotplug callbacks to set up the targets for each package */
@ -1446,6 +1525,7 @@ static int __init intel_uncore_init(void)
err:
uncore_types_exit(uncore_msr_uncores);
uncore_types_exit(uncore_mmio_uncores);
uncore_pci_exit();
return ret;
}
@ -1455,6 +1535,7 @@ static void __exit intel_uncore_exit(void)
{
cpuhp_remove_state(CPUHP_AP_PERF_X86_UNCORE_ONLINE);
uncore_types_exit(uncore_msr_uncores);
uncore_types_exit(uncore_mmio_uncores);
uncore_pci_exit();
}
module_exit(intel_uncore_exit);

View file

@ -2,6 +2,7 @@
#include <linux/slab.h>
#include <linux/pci.h>
#include <asm/apicdef.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/perf_event.h>
#include "../perf_event.h"
@ -56,7 +57,10 @@ struct intel_uncore_type {
unsigned fixed_ctr;
unsigned fixed_ctl;
unsigned box_ctl;
unsigned msr_offset;
union {
unsigned msr_offset;
unsigned mmio_offset;
};
unsigned num_shared_regs:8;
unsigned single_fixed:1;
unsigned pair_ctr_ctl:1;
@ -108,7 +112,7 @@ struct intel_uncore_extra_reg {
struct intel_uncore_box {
int pci_phys_id;
int pkgid; /* Logical package ID */
int dieid; /* Logical die ID */
int n_active; /* number of active events */
int n_events;
int cpu; /* cpu to collect events */
@ -125,7 +129,7 @@ struct intel_uncore_box {
struct hrtimer hrtimer;
struct list_head list;
struct list_head active_list;
void *io_addr;
void __iomem *io_addr;
struct intel_uncore_extra_reg shared_regs[0];
};
@ -159,6 +163,7 @@ struct pci2phy_map {
};
struct pci2phy_map *__find_pci2phy_map(int segment);
int uncore_pcibus_to_physid(struct pci_bus *bus);
ssize_t uncore_event_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf);
@ -190,6 +195,13 @@ static inline bool uncore_pmc_freerunning(int idx)
return idx == UNCORE_PMC_IDX_FREERUNNING;
}
static inline
unsigned int uncore_mmio_box_ctl(struct intel_uncore_box *box)
{
return box->pmu->type->box_ctl +
box->pmu->type->mmio_offset * box->pmu->pmu_idx;
}
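/*
 * Example: for the Snow Ridge IMC (box_ctl 0x22800, mmio_offset 0x4000)
 * this yields a control offset of 0x22800 for pmu_idx 0 and 0x26800 for
 * pmu_idx 1, i.e. one 16K MMIO window per IMC box.
 */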
static inline unsigned uncore_pci_box_ctl(struct intel_uncore_box *box)
{
return box->pmu->type->box_ctl;
@ -330,7 +342,7 @@ unsigned uncore_msr_perf_ctr(struct intel_uncore_box *box, int idx)
static inline
unsigned uncore_fixed_ctl(struct intel_uncore_box *box)
{
if (box->pci_dev)
if (box->pci_dev || box->io_addr)
return uncore_pci_fixed_ctl(box);
else
return uncore_msr_fixed_ctl(box);
@ -339,7 +351,7 @@ unsigned uncore_fixed_ctl(struct intel_uncore_box *box)
static inline
unsigned uncore_fixed_ctr(struct intel_uncore_box *box)
{
if (box->pci_dev)
if (box->pci_dev || box->io_addr)
return uncore_pci_fixed_ctr(box);
else
return uncore_msr_fixed_ctr(box);
@ -348,7 +360,7 @@ unsigned uncore_fixed_ctr(struct intel_uncore_box *box)
static inline
unsigned uncore_event_ctl(struct intel_uncore_box *box, int idx)
{
if (box->pci_dev)
if (box->pci_dev || box->io_addr)
return uncore_pci_event_ctl(box, idx);
else
return uncore_msr_event_ctl(box, idx);
@ -357,7 +369,7 @@ unsigned uncore_event_ctl(struct intel_uncore_box *box, int idx)
static inline
unsigned uncore_perf_ctr(struct intel_uncore_box *box, int idx)
{
if (box->pci_dev)
if (box->pci_dev || box->io_addr)
return uncore_pci_perf_ctr(box, idx);
else
return uncore_msr_perf_ctr(box, idx);
@ -419,6 +431,16 @@ static inline bool is_freerunning_event(struct perf_event *event)
(((cfg >> 8) & 0xff) >= UNCORE_FREERUNNING_UMASK_START);
}
/* Check and reject invalid config */
static inline int uncore_freerunning_hw_config(struct intel_uncore_box *box,
struct perf_event *event)
{
if (is_freerunning_event(event))
return 0;
return -EINVAL;
}
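/*
 * Free-running counters have no programmable state, so hw_config only
 * has to accept the encodings that is_freerunning_event() recognizes
 * and reject everything else.
 */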
static inline void uncore_disable_box(struct intel_uncore_box *box)
{
if (box->pmu->type->ops->disable_box)
@ -467,7 +489,7 @@ static inline void uncore_box_exit(struct intel_uncore_box *box)
static inline bool uncore_box_is_fake(struct intel_uncore_box *box)
{
return (box->pkgid < 0);
return (box->dieid < 0);
}
static inline struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event)
@ -482,6 +504,9 @@ static inline struct intel_uncore_box *uncore_event_to_box(struct perf_event *ev
struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu);
u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event);
void uncore_mmio_exit_box(struct intel_uncore_box *box);
u64 uncore_mmio_read_counter(struct intel_uncore_box *box,
struct perf_event *event);
void uncore_pmu_start_hrtimer(struct intel_uncore_box *box);
void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box);
void uncore_pmu_event_start(struct perf_event *event, int flags);
@ -497,6 +522,7 @@ u64 uncore_shared_reg_config(struct intel_uncore_box *box, int idx);
extern struct intel_uncore_type **uncore_msr_uncores;
extern struct intel_uncore_type **uncore_pci_uncores;
extern struct intel_uncore_type **uncore_mmio_uncores;
extern struct pci_driver *uncore_pci_driver;
extern raw_spinlock_t pci2phy_map_lock;
extern struct list_head pci2phy_map_head;
@ -528,6 +554,9 @@ int knl_uncore_pci_init(void);
void knl_uncore_cpu_init(void);
int skx_uncore_pci_init(void);
void skx_uncore_cpu_init(void);
int snr_uncore_pci_init(void);
void snr_uncore_cpu_init(void);
void snr_uncore_mmio_init(void);
/* uncore_nhmex.c */
void nhmex_uncore_cpu_init(void);

View file

@ -3,27 +3,29 @@
#include "uncore.h"
/* Uncore IMC PCI IDs */
#define PCI_DEVICE_ID_INTEL_SNB_IMC 0x0100
#define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154
#define PCI_DEVICE_ID_INTEL_IVB_E3_IMC 0x0150
#define PCI_DEVICE_ID_INTEL_HSW_IMC 0x0c00
#define PCI_DEVICE_ID_INTEL_HSW_U_IMC 0x0a04
#define PCI_DEVICE_ID_INTEL_BDW_IMC 0x1604
#define PCI_DEVICE_ID_INTEL_SKL_U_IMC 0x1904
#define PCI_DEVICE_ID_INTEL_SKL_Y_IMC 0x190c
#define PCI_DEVICE_ID_INTEL_SKL_HD_IMC 0x1900
#define PCI_DEVICE_ID_INTEL_SKL_HQ_IMC 0x1910
#define PCI_DEVICE_ID_INTEL_SKL_SD_IMC 0x190f
#define PCI_DEVICE_ID_INTEL_SKL_SQ_IMC 0x191f
#define PCI_DEVICE_ID_INTEL_KBL_Y_IMC 0x590c
#define PCI_DEVICE_ID_INTEL_KBL_U_IMC 0x5904
#define PCI_DEVICE_ID_INTEL_KBL_UQ_IMC 0x5914
#define PCI_DEVICE_ID_INTEL_KBL_SD_IMC 0x590f
#define PCI_DEVICE_ID_INTEL_KBL_SQ_IMC 0x591f
#define PCI_DEVICE_ID_INTEL_KBL_HQ_IMC 0x5910
#define PCI_DEVICE_ID_INTEL_KBL_WQ_IMC 0x5918
#define PCI_DEVICE_ID_INTEL_CFL_2U_IMC 0x3ecc
#define PCI_DEVICE_ID_INTEL_CFL_4U_IMC 0x3ed0
#define PCI_DEVICE_ID_INTEL_CFL_4H_IMC 0x3e10
#define PCI_DEVICE_ID_INTEL_CFL_6H_IMC 0x3ec4
#define PCI_DEVICE_ID_INTEL_CFL_2S_D_IMC 0x3e0f
#define PCI_DEVICE_ID_INTEL_CFL_4S_D_IMC 0x3e1f
#define PCI_DEVICE_ID_INTEL_CFL_6S_D_IMC 0x3ec2
@ -34,9 +36,15 @@
#define PCI_DEVICE_ID_INTEL_CFL_4S_S_IMC 0x3e33
#define PCI_DEVICE_ID_INTEL_CFL_6S_S_IMC 0x3eca
#define PCI_DEVICE_ID_INTEL_CFL_8S_S_IMC 0x3e32
#define PCI_DEVICE_ID_INTEL_AML_YD_IMC 0x590c
#define PCI_DEVICE_ID_INTEL_AML_YQ_IMC 0x590d
#define PCI_DEVICE_ID_INTEL_WHL_UQ_IMC 0x3ed0
#define PCI_DEVICE_ID_INTEL_WHL_4_UQ_IMC 0x3e34
#define PCI_DEVICE_ID_INTEL_WHL_UD_IMC 0x3e35
#define PCI_DEVICE_ID_INTEL_ICL_U_IMC 0x8a02
#define PCI_DEVICE_ID_INTEL_ICL_U2_IMC 0x8a12
/* SNB event control */
#define SNB_UNC_CTL_EV_SEL_MASK 0x000000ff
#define SNB_UNC_CTL_UMASK_MASK 0x0000ff00
@ -420,11 +428,6 @@ static void snb_uncore_imc_init_box(struct intel_uncore_box *box)
box->hrtimer_duration = UNCORE_SNB_IMC_HRTIMER_INTERVAL;
}
static void snb_uncore_imc_exit_box(struct intel_uncore_box *box)
{
iounmap(box->io_addr);
}
static void snb_uncore_imc_enable_box(struct intel_uncore_box *box)
{}
@ -437,13 +440,6 @@ static void snb_uncore_imc_enable_event(struct intel_uncore_box *box, struct per
static void snb_uncore_imc_disable_event(struct intel_uncore_box *box, struct perf_event *event)
{}
static u64 snb_uncore_imc_read_counter(struct intel_uncore_box *box, struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
return (u64)*(unsigned int *)(box->io_addr + hwc->event_base);
}
/*
* Keep the custom event_init() function compatible with old event
* encoding for free running counters.
@ -570,13 +566,13 @@ static struct pmu snb_uncore_imc_pmu = {
static struct intel_uncore_ops snb_uncore_imc_ops = {
.init_box = snb_uncore_imc_init_box,
.exit_box = snb_uncore_imc_exit_box,
.exit_box = uncore_mmio_exit_box,
.enable_box = snb_uncore_imc_enable_box,
.disable_box = snb_uncore_imc_disable_box,
.disable_event = snb_uncore_imc_disable_event,
.enable_event = snb_uncore_imc_enable_event,
.hw_config = snb_uncore_imc_hw_config,
.read_counter = snb_uncore_imc_read_counter,
.read_counter = uncore_mmio_read_counter,
};
static struct intel_uncore_type snb_uncore_imc = {
@ -681,6 +677,14 @@ static const struct pci_device_id skl_uncore_pci_ids[] = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_SQ_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_HQ_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_KBL_WQ_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_2U_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
@ -737,6 +741,26 @@ static const struct pci_device_id skl_uncore_pci_ids[] = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_CFL_8S_S_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_AML_YD_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_AML_YQ_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_WHL_UQ_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_WHL_4_UQ_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* IMC */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_WHL_UD_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
{ /* end: all zeroes */ },
};
@ -807,6 +831,8 @@ static const struct imc_uncore_pci_dev desktop_imc_pci_ids[] = {
IMC_DEV(KBL_UQ_IMC, &skl_uncore_pci_driver), /* 7th Gen Core U Quad Core */
IMC_DEV(KBL_SD_IMC, &skl_uncore_pci_driver), /* 7th Gen Core S Dual Core */
IMC_DEV(KBL_SQ_IMC, &skl_uncore_pci_driver), /* 7th Gen Core S Quad Core */
IMC_DEV(KBL_HQ_IMC, &skl_uncore_pci_driver), /* 7th Gen Core H Quad Core */
IMC_DEV(KBL_WQ_IMC, &skl_uncore_pci_driver), /* 7th Gen Core S 4 cores Work Station */
IMC_DEV(CFL_2U_IMC, &skl_uncore_pci_driver), /* 8th Gen Core U 2 Cores */
IMC_DEV(CFL_4U_IMC, &skl_uncore_pci_driver), /* 8th Gen Core U 4 Cores */
IMC_DEV(CFL_4H_IMC, &skl_uncore_pci_driver), /* 8th Gen Core H 4 Cores */
@ -821,6 +847,11 @@ static const struct imc_uncore_pci_dev desktop_imc_pci_ids[] = {
IMC_DEV(CFL_4S_S_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 4 Cores Server */
IMC_DEV(CFL_6S_S_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 6 Cores Server */
IMC_DEV(CFL_8S_S_IMC, &skl_uncore_pci_driver), /* 8th Gen Core S 8 Cores Server */
IMC_DEV(AML_YD_IMC, &skl_uncore_pci_driver), /* 8th Gen Core Y Mobile Dual Core */
IMC_DEV(AML_YQ_IMC, &skl_uncore_pci_driver), /* 8th Gen Core Y Mobile Quad Core */
IMC_DEV(WHL_UQ_IMC, &skl_uncore_pci_driver), /* 8th Gen Core U Mobile Quad Core */
IMC_DEV(WHL_4_UQ_IMC, &skl_uncore_pci_driver), /* 8th Gen Core U Mobile Quad Core */
IMC_DEV(WHL_UD_IMC, &skl_uncore_pci_driver), /* 8th Gen Core U Mobile Dual Core */
IMC_DEV(ICL_U_IMC, &icl_uncore_pci_driver), /* 10th Gen Core Mobile */
IMC_DEV(ICL_U2_IMC, &icl_uncore_pci_driver), /* 10th Gen Core Mobile */
{ /* end marker */ }

View file

@ -324,12 +324,77 @@
#define SKX_M2M_PCI_PMON_CTR0 0x200
#define SKX_M2M_PCI_PMON_BOX_CTL 0x258
/* SNR Ubox */
#define SNR_U_MSR_PMON_CTR0 0x1f98
#define SNR_U_MSR_PMON_CTL0 0x1f91
#define SNR_U_MSR_PMON_UCLK_FIXED_CTL 0x1f93
#define SNR_U_MSR_PMON_UCLK_FIXED_CTR 0x1f94
/* SNR CHA */
#define SNR_CHA_RAW_EVENT_MASK_EXT 0x3ffffff
#define SNR_CHA_MSR_PMON_CTL0 0x1c01
#define SNR_CHA_MSR_PMON_CTR0 0x1c08
#define SNR_CHA_MSR_PMON_BOX_CTL 0x1c00
#define SNR_C0_MSR_PMON_BOX_FILTER0 0x1c05
/* SNR IIO */
#define SNR_IIO_MSR_PMON_CTL0 0x1e08
#define SNR_IIO_MSR_PMON_CTR0 0x1e01
#define SNR_IIO_MSR_PMON_BOX_CTL 0x1e00
#define SNR_IIO_MSR_OFFSET 0x10
#define SNR_IIO_PMON_RAW_EVENT_MASK_EXT 0x7ffff
/* SNR IRP */
#define SNR_IRP0_MSR_PMON_CTL0 0x1ea8
#define SNR_IRP0_MSR_PMON_CTR0 0x1ea1
#define SNR_IRP0_MSR_PMON_BOX_CTL 0x1ea0
#define SNR_IRP_MSR_OFFSET 0x10
/* SNR M2PCIE */
#define SNR_M2PCIE_MSR_PMON_CTL0 0x1e58
#define SNR_M2PCIE_MSR_PMON_CTR0 0x1e51
#define SNR_M2PCIE_MSR_PMON_BOX_CTL 0x1e50
#define SNR_M2PCIE_MSR_OFFSET 0x10
/* SNR PCU */
#define SNR_PCU_MSR_PMON_CTL0 0x1ef1
#define SNR_PCU_MSR_PMON_CTR0 0x1ef8
#define SNR_PCU_MSR_PMON_BOX_CTL 0x1ef0
#define SNR_PCU_MSR_PMON_BOX_FILTER 0x1efc
/* SNR M2M */
#define SNR_M2M_PCI_PMON_CTL0 0x468
#define SNR_M2M_PCI_PMON_CTR0 0x440
#define SNR_M2M_PCI_PMON_BOX_CTL 0x438
#define SNR_M2M_PCI_PMON_UMASK_EXT 0xff
/* SNR PCIE3 */
#define SNR_PCIE3_PCI_PMON_CTL0 0x508
#define SNR_PCIE3_PCI_PMON_CTR0 0x4e8
#define SNR_PCIE3_PCI_PMON_BOX_CTL 0x4e4
/* SNR IMC */
#define SNR_IMC_MMIO_PMON_FIXED_CTL 0x54
#define SNR_IMC_MMIO_PMON_FIXED_CTR 0x38
#define SNR_IMC_MMIO_PMON_CTL0 0x40
#define SNR_IMC_MMIO_PMON_CTR0 0x8
#define SNR_IMC_MMIO_PMON_BOX_CTL 0x22800
#define SNR_IMC_MMIO_OFFSET 0x4000
#define SNR_IMC_MMIO_SIZE 0x4000
#define SNR_IMC_MMIO_BASE_OFFSET 0xd0
#define SNR_IMC_MMIO_BASE_MASK 0x1FFFFFFF
#define SNR_IMC_MMIO_MEM0_OFFSET 0xd8
#define SNR_IMC_MMIO_MEM0_MASK 0x7FF
DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
DEFINE_UNCORE_FORMAT_ATTR(event2, event, "config:0-6");
DEFINE_UNCORE_FORMAT_ATTR(event_ext, event, "config:0-7,21");
DEFINE_UNCORE_FORMAT_ATTR(use_occ_ctr, use_occ_ctr, "config:7");
DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15");
DEFINE_UNCORE_FORMAT_ATTR(umask_ext, umask, "config:8-15,32-43,45-55");
DEFINE_UNCORE_FORMAT_ATTR(umask_ext2, umask, "config:8-15,32-57");
DEFINE_UNCORE_FORMAT_ATTR(umask_ext3, umask, "config:8-15,32-39");
DEFINE_UNCORE_FORMAT_ATTR(qor, qor, "config:16");
DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
DEFINE_UNCORE_FORMAT_ATTR(tid_en, tid_en, "config:19");
@ -343,11 +408,14 @@ DEFINE_UNCORE_FORMAT_ATTR(occ_invert, occ_invert, "config:30");
DEFINE_UNCORE_FORMAT_ATTR(occ_edge, occ_edge, "config:14-51");
DEFINE_UNCORE_FORMAT_ATTR(occ_edge_det, occ_edge_det, "config:31");
DEFINE_UNCORE_FORMAT_ATTR(ch_mask, ch_mask, "config:36-43");
DEFINE_UNCORE_FORMAT_ATTR(ch_mask2, ch_mask, "config:36-47");
DEFINE_UNCORE_FORMAT_ATTR(fc_mask, fc_mask, "config:44-46");
DEFINE_UNCORE_FORMAT_ATTR(fc_mask2, fc_mask, "config:48-50");
DEFINE_UNCORE_FORMAT_ATTR(filter_tid, filter_tid, "config1:0-4");
DEFINE_UNCORE_FORMAT_ATTR(filter_tid2, filter_tid, "config1:0");
DEFINE_UNCORE_FORMAT_ATTR(filter_tid3, filter_tid, "config1:0-5");
DEFINE_UNCORE_FORMAT_ATTR(filter_tid4, filter_tid, "config1:0-8");
DEFINE_UNCORE_FORMAT_ATTR(filter_tid5, filter_tid, "config1:0-9");
DEFINE_UNCORE_FORMAT_ATTR(filter_cid, filter_cid, "config1:5");
DEFINE_UNCORE_FORMAT_ATTR(filter_link, filter_link, "config1:5-8");
DEFINE_UNCORE_FORMAT_ATTR(filter_link2, filter_link, "config1:6-8");
@ -1058,8 +1126,8 @@ static void snbep_qpi_enable_event(struct intel_uncore_box *box, struct perf_eve
if (reg1->idx != EXTRA_REG_NONE) {
int idx = box->pmu->pmu_idx + SNBEP_PCI_QPI_PORT0_FILTER;
int pkg = box->pkgid;
struct pci_dev *filter_pdev = uncore_extra_pci_dev[pkg].dev[idx];
int die = box->dieid;
struct pci_dev *filter_pdev = uncore_extra_pci_dev[die].dev[idx];
if (filter_pdev) {
pci_write_config_dword(filter_pdev, reg1->reg,
@ -3585,6 +3653,7 @@ static struct uncore_event_desc skx_uncore_iio_freerunning_events[] = {
static struct intel_uncore_ops skx_uncore_iio_freerunning_ops = {
.read_counter = uncore_msr_read_counter,
.hw_config = uncore_freerunning_hw_config,
};
static struct attribute *skx_uncore_iio_freerunning_formats_attr[] = {
@ -3967,3 +4036,535 @@ int skx_uncore_pci_init(void)
}
/* end of SKX uncore support */
/* SNR uncore support */
static struct intel_uncore_type snr_uncore_ubox = {
.name = "ubox",
.num_counters = 2,
.num_boxes = 1,
.perf_ctr_bits = 48,
.fixed_ctr_bits = 48,
.perf_ctr = SNR_U_MSR_PMON_CTR0,
.event_ctl = SNR_U_MSR_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.fixed_ctr = SNR_U_MSR_PMON_UCLK_FIXED_CTR,
.fixed_ctl = SNR_U_MSR_PMON_UCLK_FIXED_CTL,
.ops = &ivbep_uncore_msr_ops,
.format_group = &ivbep_uncore_format_group,
};
static struct attribute *snr_uncore_cha_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask_ext2.attr,
&format_attr_edge.attr,
&format_attr_tid_en.attr,
&format_attr_inv.attr,
&format_attr_thresh8.attr,
&format_attr_filter_tid5.attr,
NULL,
};
static const struct attribute_group snr_uncore_chabox_format_group = {
.name = "format",
.attrs = snr_uncore_cha_formats_attr,
};
static int snr_cha_hw_config(struct intel_uncore_box *box, struct perf_event *event)
{
struct hw_perf_event_extra *reg1 = &event->hw.extra_reg;
reg1->reg = SNR_C0_MSR_PMON_BOX_FILTER0 +
box->pmu->type->msr_offset * box->pmu->pmu_idx;
reg1->config = event->attr.config1 & SKX_CHA_MSR_PMON_BOX_FILTER_TID;
reg1->idx = 0;
return 0;
}
static void snr_cha_enable_event(struct intel_uncore_box *box,
struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
struct hw_perf_event_extra *reg1 = &hwc->extra_reg;
if (reg1->idx != EXTRA_REG_NONE)
wrmsrl(reg1->reg, reg1->config);
wrmsrl(hwc->config_base, hwc->config | SNBEP_PMON_CTL_EN);
}
static struct intel_uncore_ops snr_uncore_chabox_ops = {
.init_box = ivbep_uncore_msr_init_box,
.disable_box = snbep_uncore_msr_disable_box,
.enable_box = snbep_uncore_msr_enable_box,
.disable_event = snbep_uncore_msr_disable_event,
.enable_event = snr_cha_enable_event,
.read_counter = uncore_msr_read_counter,
.hw_config = snr_cha_hw_config,
};
static struct intel_uncore_type snr_uncore_chabox = {
.name = "cha",
.num_counters = 4,
.num_boxes = 6,
.perf_ctr_bits = 48,
.event_ctl = SNR_CHA_MSR_PMON_CTL0,
.perf_ctr = SNR_CHA_MSR_PMON_CTR0,
.box_ctl = SNR_CHA_MSR_PMON_BOX_CTL,
.msr_offset = HSWEP_CBO_MSR_OFFSET,
.event_mask = HSWEP_S_MSR_PMON_RAW_EVENT_MASK,
.event_mask_ext = SNR_CHA_RAW_EVENT_MASK_EXT,
.ops = &snr_uncore_chabox_ops,
.format_group = &snr_uncore_chabox_format_group,
};
static struct attribute *snr_uncore_iio_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh9.attr,
&format_attr_ch_mask2.attr,
&format_attr_fc_mask2.attr,
NULL,
};
static const struct attribute_group snr_uncore_iio_format_group = {
.name = "format",
.attrs = snr_uncore_iio_formats_attr,
};
static struct intel_uncore_type snr_uncore_iio = {
.name = "iio",
.num_counters = 4,
.num_boxes = 5,
.perf_ctr_bits = 48,
.event_ctl = SNR_IIO_MSR_PMON_CTL0,
.perf_ctr = SNR_IIO_MSR_PMON_CTR0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.event_mask_ext = SNR_IIO_PMON_RAW_EVENT_MASK_EXT,
.box_ctl = SNR_IIO_MSR_PMON_BOX_CTL,
.msr_offset = SNR_IIO_MSR_OFFSET,
.ops = &ivbep_uncore_msr_ops,
.format_group = &snr_uncore_iio_format_group,
};
static struct intel_uncore_type snr_uncore_irp = {
.name = "irp",
.num_counters = 2,
.num_boxes = 5,
.perf_ctr_bits = 48,
.event_ctl = SNR_IRP0_MSR_PMON_CTL0,
.perf_ctr = SNR_IRP0_MSR_PMON_CTR0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SNR_IRP0_MSR_PMON_BOX_CTL,
.msr_offset = SNR_IRP_MSR_OFFSET,
.ops = &ivbep_uncore_msr_ops,
.format_group = &ivbep_uncore_format_group,
};
static struct intel_uncore_type snr_uncore_m2pcie = {
.name = "m2pcie",
.num_counters = 4,
.num_boxes = 5,
.perf_ctr_bits = 48,
.event_ctl = SNR_M2PCIE_MSR_PMON_CTL0,
.perf_ctr = SNR_M2PCIE_MSR_PMON_CTR0,
.box_ctl = SNR_M2PCIE_MSR_PMON_BOX_CTL,
.msr_offset = SNR_M2PCIE_MSR_OFFSET,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.ops = &ivbep_uncore_msr_ops,
.format_group = &ivbep_uncore_format_group,
};
static int snr_pcu_hw_config(struct intel_uncore_box *box, struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
struct hw_perf_event_extra *reg1 = &hwc->extra_reg;
int ev_sel = hwc->config & SNBEP_PMON_CTL_EV_SEL_MASK;
if (ev_sel >= 0xb && ev_sel <= 0xe) {
reg1->reg = SNR_PCU_MSR_PMON_BOX_FILTER;
reg1->idx = ev_sel - 0xb;
reg1->config = event->attr.config1 & (0xff << reg1->idx);
}
return 0;
}
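/*
 * Example: an event select of 0xc maps to filter index 1, so only bits
 * (0xff << 1) of attr.config1 are kept for SNR_PCU_MSR_PMON_BOX_FILTER.
 */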
static struct intel_uncore_ops snr_uncore_pcu_ops = {
IVBEP_UNCORE_MSR_OPS_COMMON_INIT(),
.hw_config = snr_pcu_hw_config,
.get_constraint = snbep_pcu_get_constraint,
.put_constraint = snbep_pcu_put_constraint,
};
static struct intel_uncore_type snr_uncore_pcu = {
.name = "pcu",
.num_counters = 4,
.num_boxes = 1,
.perf_ctr_bits = 48,
.perf_ctr = SNR_PCU_MSR_PMON_CTR0,
.event_ctl = SNR_PCU_MSR_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SNR_PCU_MSR_PMON_BOX_CTL,
.num_shared_regs = 1,
.ops = &snr_uncore_pcu_ops,
.format_group = &skx_uncore_pcu_format_group,
};
enum perf_uncore_snr_iio_freerunning_type_id {
SNR_IIO_MSR_IOCLK,
SNR_IIO_MSR_BW_IN,
SNR_IIO_FREERUNNING_TYPE_MAX,
};
static struct freerunning_counters snr_iio_freerunning[] = {
[SNR_IIO_MSR_IOCLK] = { 0x1eac, 0x1, 0x10, 1, 48 },
[SNR_IIO_MSR_BW_IN] = { 0x1f00, 0x1, 0x10, 8, 48 },
};
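/*
 * Assuming the usual struct freerunning_counters layout of
 * { counter_base, counter_offset, box_offset, num_counters, bits },
 * BW_IN exposes eight 48-bit counters starting at MSR 0x1f00, with the
 * per-box blocks 0x10 apart.
 */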
static struct uncore_event_desc snr_uncore_iio_freerunning_events[] = {
/* Free-Running IIO CLOCKS Counter */
INTEL_UNCORE_EVENT_DESC(ioclk, "event=0xff,umask=0x10"),
/* Free-Running IIO BANDWIDTH IN Counters */
INTEL_UNCORE_EVENT_DESC(bw_in_port0, "event=0xff,umask=0x20"),
INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1, "event=0xff,umask=0x21"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2, "event=0xff,umask=0x22"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3, "event=0xff,umask=0x23"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port4, "event=0xff,umask=0x24"),
INTEL_UNCORE_EVENT_DESC(bw_in_port4.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port4.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port5, "event=0xff,umask=0x25"),
INTEL_UNCORE_EVENT_DESC(bw_in_port5.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port5.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port6, "event=0xff,umask=0x26"),
INTEL_UNCORE_EVENT_DESC(bw_in_port6.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port6.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port7, "event=0xff,umask=0x27"),
INTEL_UNCORE_EVENT_DESC(bw_in_port7.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port7.unit, "MiB"),
{ /* end: all zeroes */ },
};
static struct intel_uncore_type snr_uncore_iio_free_running = {
.name = "iio_free_running",
.num_counters = 9,
.num_boxes = 5,
.num_freerunning_types = SNR_IIO_FREERUNNING_TYPE_MAX,
.freerunning = snr_iio_freerunning,
.ops = &skx_uncore_iio_freerunning_ops,
.event_descs = snr_uncore_iio_freerunning_events,
.format_group = &skx_uncore_iio_freerunning_format_group,
};
static struct intel_uncore_type *snr_msr_uncores[] = {
&snr_uncore_ubox,
&snr_uncore_chabox,
&snr_uncore_iio,
&snr_uncore_irp,
&snr_uncore_m2pcie,
&snr_uncore_pcu,
&snr_uncore_iio_free_running,
NULL,
};
void snr_uncore_cpu_init(void)
{
uncore_msr_uncores = snr_msr_uncores;
}
static void snr_m2m_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
int box_ctl = uncore_pci_box_ctl(box);
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
pci_write_config_dword(pdev, box_ctl, IVBEP_PMON_BOX_CTL_INT);
}
static struct intel_uncore_ops snr_m2m_uncore_pci_ops = {
.init_box = snr_m2m_uncore_pci_init_box,
.disable_box = snbep_uncore_pci_disable_box,
.enable_box = snbep_uncore_pci_enable_box,
.disable_event = snbep_uncore_pci_disable_event,
.enable_event = snbep_uncore_pci_enable_event,
.read_counter = snbep_uncore_pci_read_counter,
};
static struct attribute *snr_m2m_uncore_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask_ext3.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh8.attr,
NULL,
};
static const struct attribute_group snr_m2m_uncore_format_group = {
.name = "format",
.attrs = snr_m2m_uncore_formats_attr,
};
static struct intel_uncore_type snr_uncore_m2m = {
.name = "m2m",
.num_counters = 4,
.num_boxes = 1,
.perf_ctr_bits = 48,
.perf_ctr = SNR_M2M_PCI_PMON_CTR0,
.event_ctl = SNR_M2M_PCI_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.event_mask_ext = SNR_M2M_PCI_PMON_UMASK_EXT,
.box_ctl = SNR_M2M_PCI_PMON_BOX_CTL,
.ops = &snr_m2m_uncore_pci_ops,
.format_group = &snr_m2m_uncore_format_group,
};
static struct intel_uncore_type snr_uncore_pcie3 = {
.name = "pcie3",
.num_counters = 4,
.num_boxes = 1,
.perf_ctr_bits = 48,
.perf_ctr = SNR_PCIE3_PCI_PMON_CTR0,
.event_ctl = SNR_PCIE3_PCI_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SNR_PCIE3_PCI_PMON_BOX_CTL,
.ops = &ivbep_uncore_pci_ops,
.format_group = &ivbep_uncore_format_group,
};
enum {
SNR_PCI_UNCORE_M2M,
SNR_PCI_UNCORE_PCIE3,
};
static struct intel_uncore_type *snr_pci_uncores[] = {
[SNR_PCI_UNCORE_M2M] = &snr_uncore_m2m,
[SNR_PCI_UNCORE_PCIE3] = &snr_uncore_pcie3,
NULL,
};
static const struct pci_device_id snr_uncore_pci_ids[] = {
{ /* M2M */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x344a),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(12, 0, SNR_PCI_UNCORE_M2M, 0),
},
{ /* PCIe3 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x334a),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(4, 0, SNR_PCI_UNCORE_PCIE3, 0),
},
{ /* end: all zeroes */ }
};
static struct pci_driver snr_uncore_pci_driver = {
.name = "snr_uncore",
.id_table = snr_uncore_pci_ids,
};
int snr_uncore_pci_init(void)
{
/* SNR UBOX DID */
int ret = snbep_pci2phy_map_init(0x3460, SKX_CPUNODEID,
SKX_GIDNIDMAP, true);
if (ret)
return ret;
uncore_pci_uncores = snr_pci_uncores;
uncore_pci_driver = &snr_uncore_pci_driver;
return 0;
}
static struct pci_dev *snr_uncore_get_mc_dev(int id)
{
struct pci_dev *mc_dev = NULL;
int phys_id, pkg;
while (1) {
mc_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x3451, mc_dev);
if (!mc_dev)
break;
phys_id = uncore_pcibus_to_physid(mc_dev->bus);
if (phys_id < 0)
continue;
pkg = topology_phys_to_logical_pkg(phys_id);
if (pkg < 0)
continue;
else if (pkg == id)
break;
}
return mc_dev;
}
static void snr_uncore_mmio_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = snr_uncore_get_mc_dev(box->dieid);
unsigned int box_ctl = uncore_mmio_box_ctl(box);
resource_size_t addr;
u32 pci_dword;
if (!pdev)
return;
pci_read_config_dword(pdev, SNR_IMC_MMIO_BASE_OFFSET, &pci_dword);
addr = (pci_dword & SNR_IMC_MMIO_BASE_MASK) << 23;
pci_read_config_dword(pdev, SNR_IMC_MMIO_MEM0_OFFSET, &pci_dword);
addr |= (pci_dword & SNR_IMC_MMIO_MEM0_MASK) << 12;
addr += box_ctl;
box->io_addr = ioremap(addr, SNR_IMC_MMIO_SIZE);
if (!box->io_addr)
return;
writel(IVBEP_PMON_BOX_CTL_INT, box->io_addr);
}
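/*
 * Worked example with illustrative register values: if the dword at
 * config offset 0xd0 reads 0x10, addr = (0x10 & 0x1FFFFFFF) << 23 =
 * 0x08000000; if offset 0xd8 reads 0x4, addr |= (0x4 & 0x7FF) << 12 =
 * 0x4000; adding box_ctl 0x22800 for box 0 maps 0x08026800.
 */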
static void snr_uncore_mmio_disable_box(struct intel_uncore_box *box)
{
u32 config;
if (!box->io_addr)
return;
config = readl(box->io_addr);
config |= SNBEP_PMON_BOX_CTL_FRZ;
writel(config, box->io_addr);
}
static void snr_uncore_mmio_enable_box(struct intel_uncore_box *box)
{
u32 config;
if (!box->io_addr)
return;
config = readl(box->io_addr);
config &= ~SNBEP_PMON_BOX_CTL_FRZ;
writel(config, box->io_addr);
}
static void snr_uncore_mmio_enable_event(struct intel_uncore_box *box,
struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
if (!box->io_addr)
return;
writel(hwc->config | SNBEP_PMON_CTL_EN,
box->io_addr + hwc->config_base);
}
static void snr_uncore_mmio_disable_event(struct intel_uncore_box *box,
struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
if (!box->io_addr)
return;
writel(hwc->config, box->io_addr + hwc->config_base);
}
static struct intel_uncore_ops snr_uncore_mmio_ops = {
.init_box = snr_uncore_mmio_init_box,
.exit_box = uncore_mmio_exit_box,
.disable_box = snr_uncore_mmio_disable_box,
.enable_box = snr_uncore_mmio_enable_box,
.disable_event = snr_uncore_mmio_disable_event,
.enable_event = snr_uncore_mmio_enable_event,
.read_counter = uncore_mmio_read_counter,
};
static struct uncore_event_desc snr_uncore_imc_events[] = {
INTEL_UNCORE_EVENT_DESC(clockticks, "event=0x00,umask=0x00"),
INTEL_UNCORE_EVENT_DESC(cas_count_read, "event=0x04,umask=0x0f"),
INTEL_UNCORE_EVENT_DESC(cas_count_read.scale, "6.103515625e-5"),
INTEL_UNCORE_EVENT_DESC(cas_count_read.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(cas_count_write, "event=0x04,umask=0x30"),
INTEL_UNCORE_EVENT_DESC(cas_count_write.scale, "6.103515625e-5"),
INTEL_UNCORE_EVENT_DESC(cas_count_write.unit, "MiB"),
{ /* end: all zeroes */ },
};
static struct intel_uncore_type snr_uncore_imc = {
.name = "imc",
.num_counters = 4,
.num_boxes = 2,
.perf_ctr_bits = 48,
.fixed_ctr_bits = 48,
.fixed_ctr = SNR_IMC_MMIO_PMON_FIXED_CTR,
.fixed_ctl = SNR_IMC_MMIO_PMON_FIXED_CTL,
.event_descs = snr_uncore_imc_events,
.perf_ctr = SNR_IMC_MMIO_PMON_CTR0,
.event_ctl = SNR_IMC_MMIO_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SNR_IMC_MMIO_PMON_BOX_CTL,
.mmio_offset = SNR_IMC_MMIO_OFFSET,
.ops = &snr_uncore_mmio_ops,
.format_group = &skx_uncore_format_group,
};
enum perf_uncore_snr_imc_freerunning_type_id {
SNR_IMC_DCLK,
SNR_IMC_DDR,
SNR_IMC_FREERUNNING_TYPE_MAX,
};
static struct freerunning_counters snr_imc_freerunning[] = {
[SNR_IMC_DCLK] = { 0x22b0, 0x0, 0, 1, 48 },
[SNR_IMC_DDR] = { 0x2290, 0x8, 0, 2, 48 },
};
static struct uncore_event_desc snr_uncore_imc_freerunning_events[] = {
INTEL_UNCORE_EVENT_DESC(dclk, "event=0xff,umask=0x10"),
INTEL_UNCORE_EVENT_DESC(read, "event=0xff,umask=0x20"),
INTEL_UNCORE_EVENT_DESC(read.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(read.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(write, "event=0xff,umask=0x21"),
INTEL_UNCORE_EVENT_DESC(write.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(write.unit, "MiB"),
	{ /* end: all zeroes */ },
};
static struct intel_uncore_ops snr_uncore_imc_freerunning_ops = {
.init_box = snr_uncore_mmio_init_box,
.exit_box = uncore_mmio_exit_box,
.read_counter = uncore_mmio_read_counter,
.hw_config = uncore_freerunning_hw_config,
};
static struct intel_uncore_type snr_uncore_imc_free_running = {
.name = "imc_free_running",
.num_counters = 3,
.num_boxes = 1,
.num_freerunning_types = SNR_IMC_FREERUNNING_TYPE_MAX,
.freerunning = snr_imc_freerunning,
.ops = &snr_uncore_imc_freerunning_ops,
.event_descs = snr_uncore_imc_freerunning_events,
.format_group = &skx_uncore_iio_freerunning_format_group,
};
static struct intel_uncore_type *snr_mmio_uncores[] = {
&snr_uncore_imc,
&snr_uncore_imc_free_running,
NULL,
};
void snr_uncore_mmio_init(void)
{
uncore_mmio_uncores = snr_mmio_uncores;
}
/* end of SNR uncore support */

View file

@ -1,7 +1,9 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/perf_event.h>
#include <linux/sysfs.h>
#include <linux/nospec.h>
#include <asm/intel-family.h>
#include "probe.h"
enum perf_msr_id {
PERF_MSR_TSC = 0,
@ -12,32 +14,30 @@ enum perf_msr_id {
PERF_MSR_PTSC = 5,
PERF_MSR_IRPERF = 6,
PERF_MSR_THERM = 7,
PERF_MSR_THERM_SNAP = 8,
PERF_MSR_THERM_UNIT = 9,
PERF_MSR_EVENT_MAX,
};
static bool test_aperfmperf(int idx)
static bool test_aperfmperf(int idx, void *data)
{
return boot_cpu_has(X86_FEATURE_APERFMPERF);
}
static bool test_ptsc(int idx)
static bool test_ptsc(int idx, void *data)
{
return boot_cpu_has(X86_FEATURE_PTSC);
}
static bool test_irperf(int idx)
static bool test_irperf(int idx, void *data)
{
return boot_cpu_has(X86_FEATURE_IRPERF);
}
static bool test_therm_status(int idx)
static bool test_therm_status(int idx, void *data)
{
return boot_cpu_has(X86_FEATURE_DTHERM);
}
static bool test_intel(int idx)
static bool test_intel(int idx, void *data)
{
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
boot_cpu_data.x86 != 6)
@ -98,37 +98,51 @@ static bool test_intel(int idx)
return false;
}
struct perf_msr {
u64 msr;
struct perf_pmu_events_attr *attr;
bool (*test)(int idx);
PMU_EVENT_ATTR_STRING(tsc, attr_tsc, "event=0x00" );
PMU_EVENT_ATTR_STRING(aperf, attr_aperf, "event=0x01" );
PMU_EVENT_ATTR_STRING(mperf, attr_mperf, "event=0x02" );
PMU_EVENT_ATTR_STRING(pperf, attr_pperf, "event=0x03" );
PMU_EVENT_ATTR_STRING(smi, attr_smi, "event=0x04" );
PMU_EVENT_ATTR_STRING(ptsc, attr_ptsc, "event=0x05" );
PMU_EVENT_ATTR_STRING(irperf, attr_irperf, "event=0x06" );
PMU_EVENT_ATTR_STRING(cpu_thermal_margin, attr_therm, "event=0x07" );
PMU_EVENT_ATTR_STRING(cpu_thermal_margin.snapshot, attr_therm_snap, "1" );
PMU_EVENT_ATTR_STRING(cpu_thermal_margin.unit, attr_therm_unit, "C" );
static unsigned long msr_mask;
PMU_EVENT_GROUP(events, aperf);
PMU_EVENT_GROUP(events, mperf);
PMU_EVENT_GROUP(events, pperf);
PMU_EVENT_GROUP(events, smi);
PMU_EVENT_GROUP(events, ptsc);
PMU_EVENT_GROUP(events, irperf);
static struct attribute *attrs_therm[] = {
&attr_therm.attr.attr,
&attr_therm_snap.attr.attr,
&attr_therm_unit.attr.attr,
NULL,
};
PMU_EVENT_ATTR_STRING(tsc, evattr_tsc, "event=0x00" );
PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x01" );
PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x02" );
PMU_EVENT_ATTR_STRING(pperf, evattr_pperf, "event=0x03" );
PMU_EVENT_ATTR_STRING(smi, evattr_smi, "event=0x04" );
PMU_EVENT_ATTR_STRING(ptsc, evattr_ptsc, "event=0x05" );
PMU_EVENT_ATTR_STRING(irperf, evattr_irperf, "event=0x06" );
PMU_EVENT_ATTR_STRING(cpu_thermal_margin, evattr_therm, "event=0x07" );
PMU_EVENT_ATTR_STRING(cpu_thermal_margin.snapshot, evattr_therm_snap, "1" );
PMU_EVENT_ATTR_STRING(cpu_thermal_margin.unit, evattr_therm_unit, "C" );
static struct attribute_group group_therm = {
.name = "events",
.attrs = attrs_therm,
};
static struct perf_msr msr[] = {
[PERF_MSR_TSC] = { 0, &evattr_tsc, NULL, },
[PERF_MSR_APERF] = { MSR_IA32_APERF, &evattr_aperf, test_aperfmperf, },
[PERF_MSR_MPERF] = { MSR_IA32_MPERF, &evattr_mperf, test_aperfmperf, },
[PERF_MSR_PPERF] = { MSR_PPERF, &evattr_pperf, test_intel, },
[PERF_MSR_SMI] = { MSR_SMI_COUNT, &evattr_smi, test_intel, },
[PERF_MSR_PTSC] = { MSR_F15H_PTSC, &evattr_ptsc, test_ptsc, },
[PERF_MSR_IRPERF] = { MSR_F17H_IRPERF, &evattr_irperf, test_irperf, },
[PERF_MSR_THERM] = { MSR_IA32_THERM_STATUS, &evattr_therm, test_therm_status, },
[PERF_MSR_THERM_SNAP] = { MSR_IA32_THERM_STATUS, &evattr_therm_snap, test_therm_status, },
[PERF_MSR_THERM_UNIT] = { MSR_IA32_THERM_STATUS, &evattr_therm_unit, test_therm_status, },
[PERF_MSR_TSC] = { .no_check = true, },
[PERF_MSR_APERF] = { MSR_IA32_APERF, &group_aperf, test_aperfmperf, },
[PERF_MSR_MPERF] = { MSR_IA32_MPERF, &group_mperf, test_aperfmperf, },
[PERF_MSR_PPERF] = { MSR_PPERF, &group_pperf, test_intel, },
[PERF_MSR_SMI] = { MSR_SMI_COUNT, &group_smi, test_intel, },
[PERF_MSR_PTSC] = { MSR_F15H_PTSC, &group_ptsc, test_ptsc, },
[PERF_MSR_IRPERF] = { MSR_F17H_IRPERF, &group_irperf, test_irperf, },
[PERF_MSR_THERM] = { MSR_IA32_THERM_STATUS, &group_therm, test_therm_status, },
};
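/*
 * The TSC entry carries no MSR or attribute group: .no_check makes
 * perf_msr_probe() skip the rdmsrl_safe() availability check for it,
 * and the tsc event stays unconditionally visible via events_attrs[].
 */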
static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = {
static struct attribute *events_attrs[] = {
&attr_tsc.attr.attr,
NULL,
};
@ -153,6 +167,17 @@ static const struct attribute_group *attr_groups[] = {
NULL,
};
const struct attribute_group *attr_update[] = {
&group_aperf,
&group_mperf,
&group_pperf,
&group_smi,
&group_ptsc,
&group_irperf,
&group_therm,
NULL,
};
static int msr_event_init(struct perf_event *event)
{
u64 cfg = event->attr.config;
@ -169,7 +194,7 @@ static int msr_event_init(struct perf_event *event)
cfg = array_index_nospec((unsigned long)cfg, PERF_MSR_EVENT_MAX);
if (!msr[cfg].attr)
if (!(msr_mask & (1 << cfg)))
return -EINVAL;
event->hw.idx = -1;
@ -252,32 +277,17 @@ static struct pmu pmu_msr = {
.stop = msr_event_stop,
.read = msr_event_update,
.capabilities = PERF_PMU_CAP_NO_INTERRUPT | PERF_PMU_CAP_NO_EXCLUDE,
.attr_update = attr_update,
};
static int __init msr_init(void)
{
int i, j = 0;
if (!boot_cpu_has(X86_FEATURE_TSC)) {
pr_cont("no MSR PMU driver.\n");
return 0;
}
/* Probe the MSRs. */
for (i = PERF_MSR_TSC + 1; i < PERF_MSR_EVENT_MAX; i++) {
u64 val;
/* Virt sucks; you cannot tell if a R/O MSR is present :/ */
if (!msr[i].test(i) || rdmsrl_safe(msr[i].msr, &val))
msr[i].attr = NULL;
}
/* List remaining MSRs in the sysfs attrs. */
for (i = 0; i < PERF_MSR_EVENT_MAX; i++) {
if (msr[i].attr)
events_attrs[j++] = &msr[i].attr->attr.attr;
}
events_attrs[j] = NULL;
msr_mask = perf_msr_probe(msr, PERF_MSR_EVENT_MAX, true, NULL);
perf_pmu_register(&pmu_msr, "msr", -1);

View file

@ -613,14 +613,11 @@ struct x86_pmu {
int attr_rdpmc_broken;
int attr_rdpmc;
struct attribute **format_attrs;
struct attribute **event_attrs;
struct attribute **caps_attrs;
ssize_t (*events_sysfs_show)(char *page, u64 config);
struct attribute **cpu_events;
const struct attribute_group **attr_update;
unsigned long attr_freeze_on_smi;
struct attribute **attrs;
/*
* CPU Hotplug hooks
@ -886,8 +883,6 @@ static inline void set_linear_ip(struct pt_regs *regs, unsigned long ip)
ssize_t x86_event_sysfs_show(char *page, u64 config, u64 event);
ssize_t intel_event_sysfs_show(char *page, u64 config);
struct attribute **merge_attr(struct attribute **a, struct attribute **b);
ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);
ssize_t events_ht_sysfs_show(struct device *dev, struct device_attribute *attr,

arch/x86/events/probe.c Normal file
View file

@ -0,0 +1,45 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/export.h>
#include <linux/types.h>
#include <linux/bits.h>
#include "probe.h"
static umode_t
not_visible(struct kobject *kobj, struct attribute *attr, int i)
{
return 0;
}
unsigned long
perf_msr_probe(struct perf_msr *msr, int cnt, bool zero, void *data)
{
unsigned long avail = 0;
unsigned int bit;
u64 val;
if (cnt >= BITS_PER_LONG)
return 0;
for (bit = 0; bit < cnt; bit++) {
if (!msr[bit].no_check) {
struct attribute_group *grp = msr[bit].grp;
grp->is_visible = not_visible;
if (msr[bit].test && !msr[bit].test(bit, data))
continue;
/* Virt sucks; you cannot tell if a R/O MSR is present :/ */
if (rdmsrl_safe(msr[bit].msr, &val))
continue;
/* Disable zero counters if requested. */
if (!zero && !val)
continue;
grp->is_visible = NULL;
}
avail |= BIT(bit);
}
return avail;
}
EXPORT_SYMBOL_GPL(perf_msr_probe);
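/*
 * Typical use, as in msr.c above:
 *
 *	msr_mask = perf_msr_probe(msr, PERF_MSR_EVENT_MAX, true, NULL);
 *
 * The caller keeps the returned bitmask to reject events whose bit is
 * clear, while the is_visible hook installed here hides the sysfs
 * event groups of the counters that failed the probe.
 */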

arch/x86/events/probe.h Normal file
View file

@ -0,0 +1,29 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __ARCH_X86_EVENTS_PROBE_H__
#define __ARCH_X86_EVENTS_PROBE_H__
#include <linux/sysfs.h>
struct perf_msr {
u64 msr;
struct attribute_group *grp;
bool (*test)(int idx, void *data);
bool no_check;
};
unsigned long
perf_msr_probe(struct perf_msr *msr, int cnt, bool no_zero, void *data);
#define __PMU_EVENT_GROUP(_name) \
static struct attribute *attrs_##_name[] = { \
&attr_##_name.attr.attr, \
NULL, \
}
#define PMU_EVENT_GROUP(_grp, _name) \
__PMU_EVENT_GROUP(_name); \
static struct attribute_group group_##_name = { \
.name = #_grp, \
.attrs = attrs_##_name, \
}
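/*
 * For example, PMU_EVENT_GROUP(events, aperf) expands to:
 *
 *	static struct attribute *attrs_aperf[] = {
 *		&attr_aperf.attr.attr,
 *		NULL,
 *	};
 *	static struct attribute_group group_aperf = {
 *		.name = "events",
 *		.attrs = attrs_aperf,
 *	};
 *
 * which is what the msr PMU collects in its attr_update[] array.
 */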
#endif /* __ARCH_X86_EVENTS_PROBE_H__ */

View file

@ -22,8 +22,8 @@ enum cpuid_leafs
CPUID_LNX_3,
CPUID_7_0_EBX,
CPUID_D_1_EAX,
CPUID_F_0_EDX,
CPUID_F_1_EDX,
CPUID_LNX_4,
CPUID_7_1_EAX,
CPUID_8000_0008_EBX,
CPUID_6_EAX,
CPUID_8000_000A_EDX,

View file

@ -239,12 +239,14 @@
#define X86_FEATURE_BMI1 ( 9*32+ 3) /* 1st group bit manipulation extensions */
#define X86_FEATURE_HLE ( 9*32+ 4) /* Hardware Lock Elision */
#define X86_FEATURE_AVX2 ( 9*32+ 5) /* AVX2 instructions */
#define X86_FEATURE_FDP_EXCPTN_ONLY ( 9*32+ 6) /* "" FPU data pointer updated only on x87 exceptions */
#define X86_FEATURE_SMEP ( 9*32+ 7) /* Supervisor Mode Execution Protection */
#define X86_FEATURE_BMI2 ( 9*32+ 8) /* 2nd group bit manipulation extensions */
#define X86_FEATURE_ERMS ( 9*32+ 9) /* Enhanced REP MOVSB/STOSB instructions */
#define X86_FEATURE_INVPCID ( 9*32+10) /* Invalidate Processor Context ID */
#define X86_FEATURE_RTM ( 9*32+11) /* Restricted Transactional Memory */
#define X86_FEATURE_CQM ( 9*32+12) /* Cache QoS Monitoring */
#define X86_FEATURE_ZERO_FCS_FDS ( 9*32+13) /* "" Zero out FPU CS and FPU DS */
#define X86_FEATURE_MPX ( 9*32+14) /* Memory Protection Extension */
#define X86_FEATURE_RDT_A ( 9*32+15) /* Resource Director Technology Allocation */
#define X86_FEATURE_AVX512F ( 9*32+16) /* AVX-512 Foundation */
@ -269,13 +271,19 @@
#define X86_FEATURE_XGETBV1 (10*32+ 2) /* XGETBV with ECX = 1 instruction */
#define X86_FEATURE_XSAVES (10*32+ 3) /* XSAVES/XRSTORS instructions */
/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:0 (EDX), word 11 */
#define X86_FEATURE_CQM_LLC (11*32+ 1) /* LLC QoS if 1 */
/*
* Extended auxiliary flags: Linux defined - for features scattered in various
* CPUID levels like 0xf, etc.
*
* Reuse free bits when adding new feature flags!
*/
#define X86_FEATURE_CQM_LLC (11*32+ 0) /* LLC QoS if 1 */
#define X86_FEATURE_CQM_OCCUP_LLC (11*32+ 1) /* LLC occupancy monitoring */
#define X86_FEATURE_CQM_MBM_TOTAL (11*32+ 2) /* LLC Total MBM monitoring */
#define X86_FEATURE_CQM_MBM_LOCAL (11*32+ 3) /* LLC Local MBM monitoring */
/* Intel-defined CPU QoS Sub-leaf, CPUID level 0x0000000F:1 (EDX), word 12 */
#define X86_FEATURE_CQM_OCCUP_LLC (12*32+ 0) /* LLC occupancy monitoring */
#define X86_FEATURE_CQM_MBM_TOTAL (12*32+ 1) /* LLC Total MBM monitoring */
#define X86_FEATURE_CQM_MBM_LOCAL (12*32+ 2) /* LLC Local MBM monitoring */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */
/* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
#define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */
@ -322,6 +330,7 @@
#define X86_FEATURE_UMIP (16*32+ 2) /* User Mode Instruction Protection */
#define X86_FEATURE_PKU (16*32+ 3) /* Protection Keys for Userspace */
#define X86_FEATURE_OSPKE (16*32+ 4) /* OS Protection Keys Enable */
#define X86_FEATURE_WAITPKG (16*32+ 5) /* UMONITOR/UMWAIT/TPAUSE Instructions */
#define X86_FEATURE_AVX512_VBMI2 (16*32+ 6) /* Additional AVX512 Vector Bit Manipulation Instructions */
#define X86_FEATURE_GFNI (16*32+ 8) /* Galois Field New Instructions */
#define X86_FEATURE_VAES (16*32+ 9) /* Vector AES */

View file

@ -56,6 +56,7 @@
#define INTEL_FAM6_ICELAKE_XEON_D 0x6C
#define INTEL_FAM6_ICELAKE_DESKTOP 0x7D
#define INTEL_FAM6_ICELAKE_MOBILE 0x7E
#define INTEL_FAM6_ICELAKE_NNPI 0x9D
/* "Small Core" Processors (Atom) */
@ -76,6 +77,7 @@
#define INTEL_FAM6_ATOM_GOLDMONT 0x5C /* Apollo Lake */
#define INTEL_FAM6_ATOM_GOLDMONT_X 0x5F /* Denverton */
#define INTEL_FAM6_ATOM_GOLDMONT_PLUS 0x7A /* Gemini Lake */
#define INTEL_FAM6_ATOM_TREMONT_X 0x86 /* Jacobsville */
/* Xeon Phi */

View file

@ -61,6 +61,15 @@
#define MSR_PLATFORM_INFO_CPUID_FAULT_BIT 31
#define MSR_PLATFORM_INFO_CPUID_FAULT BIT_ULL(MSR_PLATFORM_INFO_CPUID_FAULT_BIT)
#define MSR_IA32_UMWAIT_CONTROL 0xe1
#define MSR_IA32_UMWAIT_CONTROL_C02_DISABLE BIT(0)
#define MSR_IA32_UMWAIT_CONTROL_RESERVED BIT(1)
/*
* The time field occupies bits [31:2]; it represents a 32-bit value
* whose bits [1:0] are always zero.
*/
#define MSR_IA32_UMWAIT_CONTROL_TIME_MASK (~0x03U)
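/*
 * Sketch (an assumption about how umwait.c composes the value, not
 * taken from that file):
 *
 *	u32 ctrl = (max_time & MSR_IA32_UMWAIT_CONTROL_TIME_MASK) |
 *		   (c02_enabled ? 0 : MSR_IA32_UMWAIT_CONTROL_C02_DISABLE);
 *
 * Bit 1 stays zero, being reserved.
 */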
#define MSR_PKG_CST_CONFIG_CONTROL 0x000000e2
#define NHM_C3_AUTO_DEMOTE (1UL << 25)
#define NHM_C1_AUTO_DEMOTE (1UL << 26)

View file

@ -105,7 +105,7 @@ struct cpuinfo_x86 {
int x86_power;
unsigned long loops_per_jiffy;
/* cpuid returned max cores value: */
u16 x86_max_cores;
u16 apicid;
u16 initial_apicid;
u16 x86_clflush_size;
@ -117,6 +117,8 @@ struct cpuinfo_x86 {
u16 logical_proc_id;
/* Core id: */
u16 cpu_core_id;
u16 cpu_die_id;
u16 logical_die_id;
/* Index into per_cpu list: */
u16 cpu_index;
u32 microcode;
@ -144,7 +146,8 @@ enum cpuid_regs_idx {
#define X86_VENDOR_TRANSMETA 7
#define X86_VENDOR_NSC 8
#define X86_VENDOR_HYGON 9
#define X86_VENDOR_NUM 10
#define X86_VENDOR_ZHAOXIN 10
#define X86_VENDOR_NUM 11
#define X86_VENDOR_UNKNOWN 0xff

View file

@ -23,6 +23,7 @@ extern unsigned int num_processors;
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
/* cpus sharing the last level cache: */
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
DECLARE_PER_CPU_READ_MOSTLY(u16, cpu_llc_id);

View file

@ -106,15 +106,25 @@ extern const struct cpumask *cpu_coregroup_mask(int cpu);
#define topology_logical_package_id(cpu) (cpu_data(cpu).logical_proc_id)
#define topology_physical_package_id(cpu) (cpu_data(cpu).phys_proc_id)
#define topology_logical_die_id(cpu) (cpu_data(cpu).logical_die_id)
#define topology_die_id(cpu) (cpu_data(cpu).cpu_die_id)
#define topology_core_id(cpu) (cpu_data(cpu).cpu_core_id)
#ifdef CONFIG_SMP
#define topology_die_cpumask(cpu) (per_cpu(cpu_die_map, cpu))
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
extern unsigned int __max_logical_packages;
#define topology_max_packages() (__max_logical_packages)
extern unsigned int __max_die_per_package;
static inline int topology_max_die_per_package(void)
{
return __max_die_per_package;
}
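/*
 * Callers typically size per-die storage as topology_max_packages() *
 * topology_max_die_per_package(), which is how the uncore driver
 * computes max_dies in this series.
 */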
extern int __max_smt_threads;
static inline int topology_max_smt_threads(void)
@ -123,14 +133,21 @@ static inline int topology_max_smt_threads(void)
}
int topology_update_package_map(unsigned int apicid, unsigned int cpu);
int topology_update_die_map(unsigned int dieid, unsigned int cpu);
int topology_phys_to_logical_pkg(unsigned int pkg);
int topology_phys_to_logical_die(unsigned int die, unsigned int cpu);
bool topology_is_primary_thread(unsigned int cpu);
bool topology_smt_supported(void);
#else
#define topology_max_packages() (1)
static inline int
topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
static inline int
topology_update_die_map(unsigned int dieid, unsigned int cpu) { return 0; }
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
static inline int topology_phys_to_logical_die(unsigned int die,
unsigned int cpu) { return 0; }
static inline int topology_max_die_per_package(void) { return 1; }
static inline int topology_max_smt_threads(void) { return 1; }
static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
static inline bool topology_smt_supported(void) { return false; }

View file

@ -64,6 +64,21 @@ void acpi_processor_power_init_bm_check(struct acpi_processor_flags *flags,
c->x86_stepping >= 0x0e))
flags->bm_check = 1;
}
if (c->x86_vendor == X86_VENDOR_ZHAOXIN) {
/*
* All Zhaoxin CPUs that support C3 share cache, and caches
* should not be flushed by software while entering a C3-type
* state.
*/
flags->bm_check = 1;
/*
* On all recent Zhaoxin platforms, ARB_DISABLE is a nop.
* So, set bm_control to zero to indicate that ARB_DISABLE
* is not required while entering C3 type state.
*/
flags->bm_control = 0;
}
}
EXPORT_SYMBOL(acpi_processor_power_init_bm_check);

View file

@ -24,6 +24,7 @@ obj-y += match.o
obj-y += bugs.o
obj-y += aperfmperf.o
obj-y += cpuid-deps.o
obj-y += umwait.o
obj-$(CONFIG_PROC_FS) += proc.o
obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o
@ -38,6 +39,7 @@ obj-$(CONFIG_CPU_SUP_CYRIX_32) += cyrix.o
obj-$(CONFIG_CPU_SUP_CENTAUR) += centaur.o
obj-$(CONFIG_CPU_SUP_TRANSMETA_32) += transmeta.o
obj-$(CONFIG_CPU_SUP_UMC_32) += umc.o
obj-$(CONFIG_CPU_SUP_ZHAOXIN) += zhaoxin.o
obj-$(CONFIG_X86_MCE) += mce/
obj-$(CONFIG_MTRR) += mtrr/

View file

@ -13,6 +13,7 @@
#include <linux/percpu.h>
#include <linux/cpufreq.h>
#include <linux/smp.h>
#include <linux/sched/isolation.h>
#include "cpu.h"
@ -85,6 +86,9 @@ unsigned int aperfmperf_get_khz(int cpu)
if (!boot_cpu_has(X86_FEATURE_APERFMPERF))
return 0;
if (!housekeeping_cpu(cpu, HK_FLAG_MISC))
return 0;
aperfmperf_snapshot_cpu(cpu, ktime_get(), true);
return per_cpu(samples.khz, cpu);
}
@ -101,9 +105,12 @@ void arch_freq_prepare_all(void)
if (!boot_cpu_has(X86_FEATURE_APERFMPERF))
return;
for_each_online_cpu(cpu)
for_each_online_cpu(cpu) {
if (!housekeeping_cpu(cpu, HK_FLAG_MISC))
continue;
if (!aperfmperf_snapshot_cpu(cpu, now, false))
wait = true;
}
if (wait)
msleep(APERFMPERF_REFRESH_DELAY_MS);
@ -117,6 +124,9 @@ unsigned int arch_freq_get_on_cpu(int cpu)
if (!boot_cpu_has(X86_FEATURE_APERFMPERF))
return 0;
if (!housekeeping_cpu(cpu, HK_FLAG_MISC))
return 0;
if (aperfmperf_snapshot_cpu(cpu, ktime_get(), true))
return per_cpu(samples.khz, cpu);

View file

@ -658,8 +658,7 @@ void cacheinfo_amd_init_llc_id(struct cpuinfo_x86 *c, int cpu, u8 node_id)
if (c->x86 < 0x17) {
/* LLC is at the node level. */
per_cpu(cpu_llc_id, cpu) = node_id;
} else if (c->x86 == 0x17 &&
c->x86_model >= 0 && c->x86_model <= 0x1F) {
} else if (c->x86 == 0x17 && c->x86_model <= 0x1F) {
/*
* LLC is at the core complex level.
* Core complex ID is ApicId[3] for these processors.

View file

@ -801,6 +801,30 @@ static void init_speculation_control(struct cpuinfo_x86 *c)
}
}
static void init_cqm(struct cpuinfo_x86 *c)
{
if (!cpu_has(c, X86_FEATURE_CQM_LLC)) {
c->x86_cache_max_rmid = -1;
c->x86_cache_occ_scale = -1;
return;
}
/* will be overridden if occupancy monitoring exists */
c->x86_cache_max_rmid = cpuid_ebx(0xf);
if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) ||
cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) {
u32 eax, ebx, ecx, edx;
/* QoS sub-leaf, EAX=0Fh, ECX=1 */
cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
c->x86_cache_max_rmid = ecx;
c->x86_cache_occ_scale = ebx;
}
}
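/*
 * The CQM bits no longer map 1:1 onto leaf 0xf EDX words: all four now
 * live in the Linux-defined scattered word 11, word 12 has become the
 * CPUID.7:1 EAX word, and cpuid-deps.c makes the three monitoring bits
 * depend on X86_FEATURE_CQM_LLC.
 */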
void get_cpu_cap(struct cpuinfo_x86 *c)
{
u32 eax, ebx, ecx, edx;
@ -823,6 +847,12 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_capability[CPUID_7_0_EBX] = ebx;
c->x86_capability[CPUID_7_ECX] = ecx;
c->x86_capability[CPUID_7_EDX] = edx;
/* Check valid sub-leaf index before accessing it */
if (eax >= 1) {
cpuid_count(0x00000007, 1, &eax, &ebx, &ecx, &edx);
c->x86_capability[CPUID_7_1_EAX] = eax;
}
}
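
For context, leaf 7 sub-leaf 1 is where newer feature bits such as
AVX512_BF16 (EAX bit 5) live. A hedged userspace sketch using GCC's
<cpuid.h>, mirroring the sub-leaf validity check above:

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* Sub-leaf 1 is only valid if sub-leaf 0 reports EAX >= 1 */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) || eax < 1)
		return 1;
	__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx);
	printf("AVX512_BF16: %s\n", (eax >> 5) & 1 ? "yes" : "no");
	return 0;
}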
/* Extended state features: level 0x0000000d */
@ -832,33 +862,6 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_capability[CPUID_D_1_EAX] = eax;
}
/* Additional Intel-defined flags: level 0x0000000F */
if (c->cpuid_level >= 0x0000000F) {
/* QoS sub-leaf, EAX=0Fh, ECX=0 */
cpuid_count(0x0000000F, 0, &eax, &ebx, &ecx, &edx);
c->x86_capability[CPUID_F_0_EDX] = edx;
if (cpu_has(c, X86_FEATURE_CQM_LLC)) {
/* will be overridden if occupancy monitoring exists */
c->x86_cache_max_rmid = ebx;
/* QoS sub-leaf, EAX=0Fh, ECX=1 */
cpuid_count(0x0000000F, 1, &eax, &ebx, &ecx, &edx);
c->x86_capability[CPUID_F_1_EDX] = edx;
if ((cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC)) ||
((cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL)) ||
(cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)))) {
c->x86_cache_max_rmid = ecx;
c->x86_cache_occ_scale = ebx;
}
} else {
c->x86_cache_max_rmid = -1;
c->x86_cache_occ_scale = -1;
}
}
/* AMD-defined flags: level 0x80000001 */
eax = cpuid_eax(0x80000000);
c->extended_cpuid_level = eax;
@ -889,6 +892,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
init_scattered_cpuid_features(c);
init_speculation_control(c);
init_cqm(c);
/*
* Clear/Set all flags overridden by options, after probe.
@ -1299,6 +1303,7 @@ static void validate_apic_and_package_id(struct cpuinfo_x86 *c)
cpu, apicid, c->initial_apicid);
}
BUG_ON(topology_update_package_map(c->phys_proc_id, cpu));
BUG_ON(topology_update_die_map(c->cpu_die_id, cpu));
#else
c->logical_proc_id = 0;
#endif

View file

@ -59,6 +59,10 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_AVX512_4VNNIW, X86_FEATURE_AVX512F },
{ X86_FEATURE_AVX512_4FMAPS, X86_FEATURE_AVX512F },
{ X86_FEATURE_AVX512_VPOPCNTDQ, X86_FEATURE_AVX512F },
{ X86_FEATURE_CQM_OCCUP_LLC, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
{}
};

View file

@ -66,6 +66,32 @@ void check_mpx_erratum(struct cpuinfo_x86 *c)
}
}
/*
* Processors with self-snooping capability can handle conflicting
* memory types across CPUs by snooping their own caches. However, there
* are CPU models in which having conflicting memory types still leads to
* unpredictable behavior, machine check errors, or hangs. Clear this
* feature to prevent its use on machines with known errata.
*/
static void check_memory_type_self_snoop_errata(struct cpuinfo_x86 *c)
{
switch (c->x86_model) {
case INTEL_FAM6_CORE_YONAH:
case INTEL_FAM6_CORE2_MEROM:
case INTEL_FAM6_CORE2_MEROM_L:
case INTEL_FAM6_CORE2_PENRYN:
case INTEL_FAM6_CORE2_DUNNINGTON:
case INTEL_FAM6_NEHALEM:
case INTEL_FAM6_NEHALEM_G:
case INTEL_FAM6_NEHALEM_EP:
case INTEL_FAM6_NEHALEM_EX:
case INTEL_FAM6_WESTMERE:
case INTEL_FAM6_WESTMERE_EP:
case INTEL_FAM6_SANDYBRIDGE:
setup_clear_cpu_cap(X86_FEATURE_SELFSNOOP);
}
}
static bool ring3mwait_disabled __read_mostly;
static int __init ring3mwait_disable(char *__unused)
@ -304,6 +330,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
}
check_mpx_erratum(c);
check_memory_type_self_snoop_errata(c);
/*
* Get the number of SMT siblings early from the extended topology

View file

@ -743,7 +743,15 @@ static void prepare_set(void) __acquires(set_atomicity_lock)
/* Enter the no-fill (CD=1, NW=0) cache mode and flush caches. */
cr0 = read_cr0() | X86_CR0_CD;
write_cr0(cr0);
wbinvd();
/*
* Cache flushing is the most time-consuming step when programming
* the MTRRs. Fortunately, as per the Intel Software Developer's
* Manual, we can skip it if the processor supports cache self-
* snooping.
*/
if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
wbinvd();
/* Save value of CR4 and clear Page Global Enable (bit 7) */
if (boot_cpu_has(X86_FEATURE_PGE)) {
@ -760,7 +768,10 @@ static void prepare_set(void) __acquires(set_atomicity_lock)
/* Disable MTRRs, and set the default type to uncached */
mtrr_wrmsr(MSR_MTRRdefType, deftype_lo & ~0xcff, deftype_hi);
wbinvd();
/* Again, only flush caches if we have to. */
if (!static_cpu_has(X86_FEATURE_SELFSNOOP))
wbinvd();
}
static void post_set(void) __releases(set_atomicity_lock)

View file

@ -26,6 +26,10 @@ struct cpuid_bit {
static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 },
{ X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
{ X86_FEATURE_CQM_LLC, CPUID_EDX, 1, 0x0000000f, 0 },
{ X86_FEATURE_CQM_OCCUP_LLC, CPUID_EDX, 0, 0x0000000f, 1 },
{ X86_FEATURE_CQM_MBM_TOTAL, CPUID_EDX, 1, 0x0000000f, 1 },
{ X86_FEATURE_CQM_MBM_LOCAL, CPUID_EDX, 2, 0x0000000f, 1 },
{ X86_FEATURE_CAT_L3, CPUID_EBX, 1, 0x00000010, 0 },
{ X86_FEATURE_CAT_L2, CPUID_EBX, 2, 0x00000010, 0 },
{ X86_FEATURE_CDP_L3, CPUID_ECX, 2, 0x00000010, 1 },

View file

@ -15,33 +15,66 @@
/* leaf 0xb SMT level */
#define SMT_LEVEL 0
/* leaf 0xb sub-leaf types */
/* extended topology sub-leaf types */
#define INVALID_TYPE 0
#define SMT_TYPE 1
#define CORE_TYPE 2
#define DIE_TYPE 5
#define LEAFB_SUBTYPE(ecx) (((ecx) >> 8) & 0xff)
#define BITS_SHIFT_NEXT_LEVEL(eax) ((eax) & 0x1f)
#define LEVEL_MAX_SIBLINGS(ebx) ((ebx) & 0xffff)
#ifdef CONFIG_SMP
unsigned int __max_die_per_package __read_mostly = 1;
EXPORT_SYMBOL(__max_die_per_package);
/*
* Check if the given CPUID extended topology "leaf" is implemented
*/
static int check_extended_topology_leaf(int leaf)
{
unsigned int eax, ebx, ecx, edx;
cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
return -1;
return 0;
}
/*
* Return the best supported CPUID Extended Topology Leaf
*/
static int detect_extended_topology_leaf(struct cpuinfo_x86 *c)
{
if (c->cpuid_level >= 0x1f) {
if (check_extended_topology_leaf(0x1f) == 0)
return 0x1f;
}
if (c->cpuid_level >= 0xb) {
if (check_extended_topology_leaf(0xb) == 0)
return 0xb;
}
return -1;
}
#endif
int detect_extended_topology_early(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_SMP
unsigned int eax, ebx, ecx, edx;
int leaf;
if (c->cpuid_level < 0xb)
return -1;
cpuid_count(0xb, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
/*
* check if the cpuid leaf 0xb is actually implemented.
*/
if (ebx == 0 || (LEAFB_SUBTYPE(ecx) != SMT_TYPE))
leaf = detect_extended_topology_leaf(c);
if (leaf < 0)
return -1;
set_cpu_cap(c, X86_FEATURE_XTOPOLOGY);
cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
/*
* initial apic id, which also represents 32-bit extended x2apic id.
*/
@ -52,7 +85,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c)
}
/*
* Check for extended topology enumeration cpuid leaf 0xb and if it
* Check for extended topology enumeration cpuid leaf, and if it
* exists, use it for populating initial_apicid and cpu topology
* detection.
*/
@ -60,22 +93,28 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
{
#ifdef CONFIG_SMP
unsigned int eax, ebx, ecx, edx, sub_index;
unsigned int ht_mask_width, core_plus_mask_width;
unsigned int ht_mask_width, core_plus_mask_width, die_plus_mask_width;
unsigned int core_select_mask, core_level_siblings;
unsigned int die_select_mask, die_level_siblings;
int leaf;
if (detect_extended_topology_early(c) < 0)
leaf = detect_extended_topology_leaf(c);
if (leaf < 0)
return -1;
/*
* Populate HT related information from sub-leaf level 0.
*/
cpuid_count(0xb, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx);
c->initial_apicid = edx;
core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
sub_index = 1;
do {
cpuid_count(0xb, sub_index, &eax, &ebx, &ecx, &edx);
cpuid_count(leaf, sub_index, &eax, &ebx, &ecx, &edx);
/*
* Check for the Core type in the implemented sub leaves.
@ -83,23 +122,34 @@ int detect_extended_topology(struct cpuinfo_x86 *c)
if (LEAFB_SUBTYPE(ecx) == CORE_TYPE) {
core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
core_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
break;
die_level_siblings = core_level_siblings;
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
if (LEAFB_SUBTYPE(ecx) == DIE_TYPE) {
die_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
}
sub_index++;
} while (LEAFB_SUBTYPE(ecx) != INVALID_TYPE);
core_select_mask = (~(-1 << core_plus_mask_width)) >> ht_mask_width;
die_select_mask = (~(-1 << die_plus_mask_width)) >>
core_plus_mask_width;
c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid, ht_mask_width)
& core_select_mask;
c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid, core_plus_mask_width);
c->cpu_core_id = apic->phys_pkg_id(c->initial_apicid,
ht_mask_width) & core_select_mask;
c->cpu_die_id = apic->phys_pkg_id(c->initial_apicid,
core_plus_mask_width) & die_select_mask;
c->phys_proc_id = apic->phys_pkg_id(c->initial_apicid,
die_plus_mask_width);
/*
* Reinit the apicid, now that we have extended initial_apicid.
*/
c->apicid = apic->phys_pkg_id(c->initial_apicid, 0);
c->x86_max_cores = (core_level_siblings / smp_num_siblings);
__max_die_per_package = (die_level_siblings / core_level_siblings);
#endif
return 0;
}
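
A worked example may help with the mask arithmetic above; the widths below
are assumptions chosen for illustration, not taken from any particular CPU:

/*
 * Assume ht_mask_width = 1, core_plus_mask_width = 5 and
 * die_plus_mask_width = 7. Then:
 *   core_select_mask = (~(-1 << 5)) >> 1 = 0x0f
 *   die_select_mask  = (~(-1 << 7)) >> 5 = 0x03
 * and for initial_apicid = 0x6b (0110 1011b):
 *   smt id       =  0x6b       & 0x01 = 1
 *   cpu_core_id  = (0x6b >> 1) & 0x0f = 5
 *   cpu_die_id   = (0x6b >> 5) & 0x03 = 3
 *   phys_proc_id =  0x6b >> 7         = 0
 */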

View file

@ -0,0 +1,200 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/syscore_ops.h>
#include <linux/suspend.h>
#include <linux/cpu.h>
#include <asm/msr.h>
#define UMWAIT_C02_ENABLE 0
#define UMWAIT_CTRL_VAL(max_time, c02_disable) \
(((max_time) & MSR_IA32_UMWAIT_CONTROL_TIME_MASK) | \
((c02_disable) & MSR_IA32_UMWAIT_CONTROL_C02_DISABLE))
/*
* Cache IA32_UMWAIT_CONTROL MSR. This is a systemwide control. By default,
* umwait max time is 100000 in TSC-quanta and C0.2 is enabled.
*/
static u32 umwait_control_cached = UMWAIT_CTRL_VAL(100000, UMWAIT_C02_ENABLE);
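/* For reference: 100000 is a multiple of 4, so the default encodes
 * max_time = 100000 with the C0.2-disable bit (bit 0) clear. */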
/*
* Serialize access to umwait_control_cached and IA32_UMWAIT_CONTROL MSR in
* the sysfs write functions.
*/
static DEFINE_MUTEX(umwait_lock);
static void umwait_update_control_msr(void *unused)
{
lockdep_assert_irqs_disabled();
wrmsr(MSR_IA32_UMWAIT_CONTROL, READ_ONCE(umwait_control_cached), 0);
}
/*
* The CPU hotplug callback sets the control MSR to the global control
* value.
*
* Disable interrupts so the read of umwait_control_cached and the WRMSR
* are protected against a concurrent sysfs write. Otherwise the sysfs
* write could update the cached value after it had been read on this CPU
* and issue the IPI before the old value had been written. The IPI would
* interrupt, write the new value and after return from IPI the previous
* value would be written by this CPU.
*
* With interrupts disabled the upcoming CPU either sees the new control
* value or the IPI is updating this CPU to the new control value after
* interrupts have been reenabled.
*/
static int umwait_cpu_online(unsigned int cpu)
{
local_irq_disable();
umwait_update_control_msr(NULL);
local_irq_enable();
return 0;
}
/*
* On resume, restore IA32_UMWAIT_CONTROL MSR on the boot processor which
* is the only active CPU at this time. The MSR is set up on the APs via the
* CPU hotplug callback.
*
* This function is invoked on resume from suspend and hibernation. On
* resume from suspend the restore should not be required, but we neither
* trust the firmware nor does it matter if the same value is written
* again.
*/
static void umwait_syscore_resume(void)
{
umwait_update_control_msr(NULL);
}
static struct syscore_ops umwait_syscore_ops = {
.resume = umwait_syscore_resume,
};
/* sysfs interface */
/*
* When bit 0 in IA32_UMWAIT_CONTROL MSR is 1, C0.2 is disabled.
* Otherwise, C0.2 is enabled.
*/
static inline bool umwait_ctrl_c02_enabled(u32 ctrl)
{
return !(ctrl & MSR_IA32_UMWAIT_CONTROL_C02_DISABLE);
}
static inline u32 umwait_ctrl_max_time(u32 ctrl)
{
return ctrl & MSR_IA32_UMWAIT_CONTROL_TIME_MASK;
}
static inline void umwait_update_control(u32 maxtime, bool c02_enable)
{
u32 ctrl = maxtime & MSR_IA32_UMWAIT_CONTROL_TIME_MASK;
if (!c02_enable)
ctrl |= MSR_IA32_UMWAIT_CONTROL_C02_DISABLE;
WRITE_ONCE(umwait_control_cached, ctrl);
/* Propagate to all CPUs */
on_each_cpu(umwait_update_control_msr, NULL, 1);
}
static ssize_t
enable_c02_show(struct device *dev, struct device_attribute *attr, char *buf)
{
u32 ctrl = READ_ONCE(umwait_control_cached);
return sprintf(buf, "%d\n", umwait_ctrl_c02_enabled(ctrl));
}
static ssize_t enable_c02_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
{
bool c02_enable;
u32 ctrl;
int ret;
ret = kstrtobool(buf, &c02_enable);
if (ret)
return ret;
mutex_lock(&umwait_lock);
ctrl = READ_ONCE(umwait_control_cached);
if (c02_enable != umwait_ctrl_c02_enabled(ctrl))
umwait_update_control(ctrl, c02_enable);
mutex_unlock(&umwait_lock);
return count;
}
static DEVICE_ATTR_RW(enable_c02);
static ssize_t
max_time_show(struct device *kobj, struct device_attribute *attr, char *buf)
{
u32 ctrl = READ_ONCE(umwait_control_cached);
return sprintf(buf, "%u\n", umwait_ctrl_max_time(ctrl));
}
static ssize_t max_time_store(struct device *kobj,
struct device_attribute *attr,
const char *buf, size_t count)
{
u32 max_time, ctrl;
int ret;
ret = kstrtou32(buf, 0, &max_time);
if (ret)
return ret;
/* bits[1:0] must be zero */
if (max_time & ~MSR_IA32_UMWAIT_CONTROL_TIME_MASK)
return -EINVAL;
mutex_lock(&umwait_lock);
ctrl = READ_ONCE(umwait_control_cached);
if (max_time != umwait_ctrl_max_time(ctrl))
umwait_update_control(max_time, umwait_ctrl_c02_enabled(ctrl));
mutex_unlock(&umwait_lock);
return count;
}
static DEVICE_ATTR_RW(max_time);
static struct attribute *umwait_attrs[] = {
&dev_attr_enable_c02.attr,
&dev_attr_max_time.attr,
NULL
};
static struct attribute_group umwait_attr_group = {
.attrs = umwait_attrs,
.name = "umwait_control",
};
static int __init umwait_init(void)
{
struct device *dev;
int ret;
if (!boot_cpu_has(X86_FEATURE_WAITPKG))
return -ENODEV;
ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "umwait:online",
umwait_cpu_online, NULL);
register_syscore_ops(&umwait_syscore_ops);
/*
* Add umwait control interface. Ignore failure, so at least the
* default values are set up in case the machine manages to boot.
*/
dev = cpu_subsys.dev_root;
return sysfs_create_group(&dev->kobj, &umwait_attr_group);
}
device_initcall(umwait_init);
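
A hedged userspace sketch of the resulting interface; the paths follow from
the "umwait_control" attribute group registered on the cpu subsystem above:

#include <stdio.h>

int main(void)
{
	FILE *f;

	/* Disable C0.2 */
	f = fopen("/sys/devices/system/cpu/umwait_control/enable_c02", "w");
	if (!f)
		return 1;
	fputs("0\n", f);
	fclose(f);

	/* Cap residency at 4096 TSC quanta; bits [1:0] must be zero */
	f = fopen("/sys/devices/system/cpu/umwait_control/max_time", "w");
	if (!f)
		return 1;
	fputs("4096\n", f);
	fclose(f);
	return 0;
}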

View file

@ -0,0 +1,167 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/sched.h>
#include <linux/sched/clock.h>
#include <asm/cpufeature.h>
#include "cpu.h"
#define MSR_ZHAOXIN_FCR57 0x00001257
#define ACE_PRESENT (1 << 6)
#define ACE_ENABLED (1 << 7)
#define ACE_FCR (1 << 7) /* MSR_ZHAOXIN_FCR */
#define RNG_PRESENT (1 << 2)
#define RNG_ENABLED (1 << 3)
#define RNG_ENABLE (1 << 8) /* MSR_ZHAOXIN_RNG */
#define X86_VMX_FEATURE_PROC_CTLS_TPR_SHADOW 0x00200000
#define X86_VMX_FEATURE_PROC_CTLS_VNMI 0x00400000
#define X86_VMX_FEATURE_PROC_CTLS_2ND_CTLS 0x80000000
#define X86_VMX_FEATURE_PROC_CTLS2_VIRT_APIC 0x00000001
#define X86_VMX_FEATURE_PROC_CTLS2_EPT 0x00000002
#define X86_VMX_FEATURE_PROC_CTLS2_VPID 0x00000020
static void init_zhaoxin_cap(struct cpuinfo_x86 *c)
{
u32 lo, hi;
/* Test for Extended Feature Flags presence */
if (cpuid_eax(0xC0000000) >= 0xC0000001) {
u32 tmp = cpuid_edx(0xC0000001);
/* Enable ACE unit, if present and disabled */
if ((tmp & (ACE_PRESENT | ACE_ENABLED)) == ACE_PRESENT) {
rdmsr(MSR_ZHAOXIN_FCR57, lo, hi);
/* Enable ACE unit */
lo |= ACE_FCR;
wrmsr(MSR_ZHAOXIN_FCR57, lo, hi);
pr_info("CPU: Enabled ACE h/w crypto\n");
}
/* Enable RNG unit, if present and disabled */
if ((tmp & (RNG_PRESENT | RNG_ENABLED)) == RNG_PRESENT) {
rdmsr(MSR_ZHAOXIN_FCR57, lo, hi);
/* Enable RNG unit */
lo |= RNG_ENABLE;
wrmsr(MSR_ZHAOXIN_FCR57, lo, hi);
pr_info("CPU: Enabled h/w RNG\n");
}
/*
* Store Extended Feature Flags as word 5 of the CPU
* capability bit array
*/
c->x86_capability[CPUID_C000_0001_EDX] = cpuid_edx(0xC0000001);
}
if (c->x86 >= 0x6)
set_cpu_cap(c, X86_FEATURE_REP_GOOD);
cpu_detect_cache_sizes(c);
}
static void early_init_zhaoxin(struct cpuinfo_x86 *c)
{
if (c->x86 >= 0x6)
set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
#ifdef CONFIG_X86_64
set_cpu_cap(c, X86_FEATURE_SYSENTER32);
#endif
if (c->x86_power & (1 << 8)) {
set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
}
if (c->cpuid_level >= 0x00000001) {
u32 eax, ebx, ecx, edx;
cpuid(0x00000001, &eax, &ebx, &ecx, &edx);
/*
* If HTT (EDX[28]) is set EBX[16:23] contain the number of
* apicids which are reserved per package. Store the resulting
* shift value for the package management code.
*/
if (edx & (1U << 28))
c->x86_coreid_bits = get_count_order((ebx >> 16) & 0xff);
}
}
static void zhaoxin_detect_vmx_virtcap(struct cpuinfo_x86 *c)
{
u32 vmx_msr_low, vmx_msr_high, msr_ctl, msr_ctl2;
rdmsr(MSR_IA32_VMX_PROCBASED_CTLS, vmx_msr_low, vmx_msr_high);
msr_ctl = vmx_msr_high | vmx_msr_low;
if (msr_ctl & X86_VMX_FEATURE_PROC_CTLS_TPR_SHADOW)
set_cpu_cap(c, X86_FEATURE_TPR_SHADOW);
if (msr_ctl & X86_VMX_FEATURE_PROC_CTLS_VNMI)
set_cpu_cap(c, X86_FEATURE_VNMI);
if (msr_ctl & X86_VMX_FEATURE_PROC_CTLS_2ND_CTLS) {
rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
vmx_msr_low, vmx_msr_high);
msr_ctl2 = vmx_msr_high | vmx_msr_low;
if ((msr_ctl2 & X86_VMX_FEATURE_PROC_CTLS2_VIRT_APIC) &&
(msr_ctl & X86_VMX_FEATURE_PROC_CTLS_TPR_SHADOW))
set_cpu_cap(c, X86_FEATURE_FLEXPRIORITY);
if (msr_ctl2 & X86_VMX_FEATURE_PROC_CTLS2_EPT)
set_cpu_cap(c, X86_FEATURE_EPT);
if (msr_ctl2 & X86_VMX_FEATURE_PROC_CTLS2_VPID)
set_cpu_cap(c, X86_FEATURE_VPID);
}
}
static void init_zhaoxin(struct cpuinfo_x86 *c)
{
early_init_zhaoxin(c);
init_intel_cacheinfo(c);
detect_num_cpu_cores(c);
#ifdef CONFIG_X86_32
detect_ht(c);
#endif
if (c->cpuid_level > 9) {
unsigned int eax = cpuid_eax(10);
/*
* Check the version and the number of counters:
* the version (eax[7:0]) can't be 0, and the number of
* counters (eax[15:8]) should be greater than 1.
*/
if ((eax & 0xff) && (((eax >> 8) & 0xff) > 1))
set_cpu_cap(c, X86_FEATURE_ARCH_PERFMON);
}
if (c->x86 >= 0x6)
init_zhaoxin_cap(c);
#ifdef CONFIG_X86_64
set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
#endif
if (cpu_has(c, X86_FEATURE_VMX))
zhaoxin_detect_vmx_virtcap(c);
}
#ifdef CONFIG_X86_32
static unsigned int
zhaoxin_size_cache(struct cpuinfo_x86 *c, unsigned int size)
{
return size;
}
#endif
static const struct cpu_dev zhaoxin_cpu_dev = {
.c_vendor = "zhaoxin",
.c_ident = { " Shanghai " },
.c_early_init = early_init_zhaoxin,
.c_init = init_zhaoxin,
#ifdef CONFIG_X86_32
.legacy_cache_size = zhaoxin_size_cache,
#endif
.c_x86_vendor = X86_VENDOR_ZHAOXIN,
};
cpu_dev_register(zhaoxin_cpu_dev);

View file

@ -397,22 +397,12 @@ static int putreg(struct task_struct *child,
case offsetof(struct user_regs_struct,fs_base):
if (value >= TASK_SIZE_MAX)
return -EIO;
/*
* When changing the FS base, use do_arch_prctl_64()
* to set the index to zero and to set the base
* as requested.
*/
if (child->thread.fsbase != value)
return do_arch_prctl_64(child, ARCH_SET_FS, value);
x86_fsbase_write_task(child, value);
return 0;
case offsetof(struct user_regs_struct,gs_base):
/*
* Exactly the same here as the %fs handling above.
*/
if (value >= TASK_SIZE_MAX)
return -EIO;
if (child->thread.gsbase != value)
return do_arch_prctl_64(child, ARCH_SET_GS, value);
x86_gsbase_write_task(child, value);
return 0;
#endif
}

View file

@ -89,6 +89,10 @@ EXPORT_PER_CPU_SYMBOL(cpu_sibling_map);
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
EXPORT_PER_CPU_SYMBOL(cpu_core_map);
/* representing HT, core, and die siblings of each logical CPU */
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
EXPORT_PER_CPU_SYMBOL(cpu_die_map);
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
/* Per CPU bogomips and other parameters */
@ -99,6 +103,7 @@ EXPORT_PER_CPU_SYMBOL(cpu_info);
unsigned int __max_logical_packages __read_mostly;
EXPORT_SYMBOL(__max_logical_packages);
static unsigned int logical_packages __read_mostly;
static unsigned int logical_die __read_mostly;
/* Maximum number of SMT threads on any online core */
int __read_mostly __max_smt_threads = 1;
@ -300,6 +305,26 @@ int topology_phys_to_logical_pkg(unsigned int phys_pkg)
return -1;
}
EXPORT_SYMBOL(topology_phys_to_logical_pkg);
/**
* topology_phys_to_logical_die - Map a physical die id to logical
* @die_id: The physical die id as retrieved via CPUID
* @cur_cpu: A CPU in the package containing the die
*
* Returns logical die id or -1 if not found
*/
int topology_phys_to_logical_die(unsigned int die_id, unsigned int cur_cpu)
{
int cpu;
int proc_id = cpu_data(cur_cpu).phys_proc_id;
for_each_possible_cpu(cpu) {
struct cpuinfo_x86 *c = &cpu_data(cpu);
if (c->initialized && c->cpu_die_id == die_id &&
c->phys_proc_id == proc_id)
return c->logical_die_id;
}
return -1;
}
EXPORT_SYMBOL(topology_phys_to_logical_die);
/**
* topology_update_package_map - Update the physical to logical package map
@ -324,6 +349,29 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
cpu_data(cpu).logical_proc_id = new;
return 0;
}
/**
* topology_update_die_map - Update the physical to logical die map
* @die: The die id as retrieved via CPUID
* @cpu: The cpu for which this is updated
*/
int topology_update_die_map(unsigned int die, unsigned int cpu)
{
int new;
/* Already available somewhere? */
new = topology_phys_to_logical_die(die, cpu);
if (new >= 0)
goto found;
new = logical_die++;
if (new != die) {
pr_info("CPU %u Converting physical %u to logical die %u\n",
cpu, die, new);
}
found:
cpu_data(cpu).logical_die_id = new;
return 0;
}
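/* Editor's example: if firmware numbers the dies of a second package
 * 8 and 9, the first CPU brought up on die 8 allocates logical die 2
 * (triggering the pr_info above); later CPUs on the same physical die
 * then reuse that id via topology_phys_to_logical_die(). */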
void __init smp_store_boot_cpu_info(void)
{
@ -333,6 +381,7 @@ void __init smp_store_boot_cpu_info(void)
*c = boot_cpu_data;
c->cpu_index = id;
topology_update_package_map(c->phys_proc_id, id);
topology_update_die_map(c->cpu_die_id, id);
c->initialized = true;
}
@ -387,6 +436,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
if (c->phys_proc_id == o->phys_proc_id &&
c->cpu_die_id == o->cpu_die_id &&
per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2)) {
if (c->cpu_core_id == o->cpu_core_id)
return topology_sane(c, o, "smt");
@ -398,6 +448,7 @@ static bool match_smt(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
}
} else if (c->phys_proc_id == o->phys_proc_id &&
c->cpu_die_id == o->cpu_die_id &&
c->cpu_core_id == o->cpu_core_id) {
return topology_sane(c, o, "smt");
}
@ -460,6 +511,15 @@ static bool match_pkg(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
return false;
}
static bool match_die(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
{
if ((c->phys_proc_id == o->phys_proc_id) &&
(c->cpu_die_id == o->cpu_die_id))
return true;
return false;
}
#if defined(CONFIG_SCHED_SMT) || defined(CONFIG_SCHED_MC)
static inline int x86_sched_itmt_flags(void)
{
@ -522,6 +582,7 @@ void set_cpu_sibling_map(int cpu)
cpumask_set_cpu(cpu, topology_sibling_cpumask(cpu));
cpumask_set_cpu(cpu, cpu_llc_shared_mask(cpu));
cpumask_set_cpu(cpu, topology_core_cpumask(cpu));
cpumask_set_cpu(cpu, topology_die_cpumask(cpu));
c->booted_cores = 1;
return;
}
@ -570,6 +631,9 @@ void set_cpu_sibling_map(int cpu)
}
if (match_pkg(c, o) && !topology_same_node(c, o))
x86_has_numa_in_package = true;
if ((i == cpu) || (has_mp && match_die(c, o)))
link_mask(topology_die_cpumask, cpu, i);
}
threads = cpumask_weight(topology_sibling_cpumask(cpu));
@ -1174,6 +1238,7 @@ static __init void disable_smp(void)
physid_set_mask_of_physid(0, &phys_cpu_present_map);
cpumask_set_cpu(0, topology_sibling_cpumask(0));
cpumask_set_cpu(0, topology_core_cpumask(0));
cpumask_set_cpu(0, topology_die_cpumask(0));
}
/*
@ -1269,6 +1334,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_die_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
@ -1489,6 +1555,8 @@ static void remove_siblinginfo(int cpu)
cpu_data(sibling).booted_cores--;
}
for_each_cpu(sibling, topology_die_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_die_cpumask(sibling));
for_each_cpu(sibling, topology_sibling_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
@ -1496,6 +1564,7 @@ static void remove_siblinginfo(int cpu)
cpumask_clear(cpu_llc_shared_mask(cpu));
cpumask_clear(topology_sibling_cpumask(cpu));
cpumask_clear(topology_core_cpumask(cpu));
cpumask_clear(topology_die_cpumask(cpu));
c->cpu_core_id = 0;
c->booted_cores = 0;
cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);

View file

@ -47,8 +47,6 @@ static const struct cpuid_reg reverse_cpuid[] = {
[CPUID_8000_0001_ECX] = {0x80000001, 0, CPUID_ECX},
[CPUID_7_0_EBX] = { 7, 0, CPUID_EBX},
[CPUID_D_1_EAX] = { 0xd, 1, CPUID_EAX},
[CPUID_F_0_EDX] = { 0xf, 0, CPUID_EDX},
[CPUID_F_1_EDX] = { 0xf, 1, CPUID_EDX},
[CPUID_8000_0008_EBX] = {0x80000008, 0, CPUID_EBX},
[CPUID_6_EAX] = { 6, 0, CPUID_EAX},
[CPUID_8000_000A_EDX] = {0x8000000a, 0, CPUID_EDX},

View file

@ -251,6 +251,7 @@ static void __init xen_pv_smp_prepare_cpus(unsigned int max_cpus)
for_each_possible_cpu(i) {
zalloc_cpumask_var(&per_cpu(cpu_sibling_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_core_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_die_map, i), GFP_KERNEL);
zalloc_cpumask_var(&per_cpu(cpu_llc_shared_map, i), GFP_KERNEL);
}
set_cpu_sibling_map(0);

View file

@ -934,6 +934,13 @@ void blk_mq_debugfs_register_sched(struct request_queue *q)
{
struct elevator_type *e = q->elevator->type;
/*
* If the parent directory has not been created yet, return; we will be
* called again later on and the directory/files will be created then.
*/
if (!q->debugfs_dir)
return;
if (!e->queue_debugfs_attrs)
return;

View file

@ -64,6 +64,7 @@ static void power_saving_mwait_init(void)
case X86_VENDOR_HYGON:
case X86_VENDOR_AMD:
case X86_VENDOR_INTEL:
case X86_VENDOR_ZHAOXIN:
/*
* AMD Fam10h TSC will tick in all
* C/P/S0/S1 states when this bit is set.

View file

@ -196,6 +196,7 @@ static void tsc_check_state(int state)
case X86_VENDOR_AMD:
case X86_VENDOR_INTEL:
case X86_VENDOR_CENTAUR:
case X86_VENDOR_ZHAOXIN:
/*
* AMD Fam10h TSC will tick in all
* C/P/S0/S1 states when this bit is set.

View file

@ -43,6 +43,9 @@ static ssize_t name##_list_show(struct device *dev, \
define_id_show_func(physical_package_id);
static DEVICE_ATTR_RO(physical_package_id);
define_id_show_func(die_id);
static DEVICE_ATTR_RO(die_id);
define_id_show_func(core_id);
static DEVICE_ATTR_RO(core_id);
@ -50,10 +53,22 @@ define_siblings_show_func(thread_siblings, sibling_cpumask);
static DEVICE_ATTR_RO(thread_siblings);
static DEVICE_ATTR_RO(thread_siblings_list);
define_siblings_show_func(core_cpus, sibling_cpumask);
static DEVICE_ATTR_RO(core_cpus);
static DEVICE_ATTR_RO(core_cpus_list);
define_siblings_show_func(core_siblings, core_cpumask);
static DEVICE_ATTR_RO(core_siblings);
static DEVICE_ATTR_RO(core_siblings_list);
define_siblings_show_func(die_cpus, die_cpumask);
static DEVICE_ATTR_RO(die_cpus);
static DEVICE_ATTR_RO(die_cpus_list);
define_siblings_show_func(package_cpus, core_cpumask);
static DEVICE_ATTR_RO(package_cpus);
static DEVICE_ATTR_RO(package_cpus_list);
#ifdef CONFIG_SCHED_BOOK
define_id_show_func(book_id);
static DEVICE_ATTR_RO(book_id);
@ -72,11 +87,18 @@ static DEVICE_ATTR_RO(drawer_siblings_list);
static struct attribute *default_attrs[] = {
&dev_attr_physical_package_id.attr,
&dev_attr_die_id.attr,
&dev_attr_core_id.attr,
&dev_attr_thread_siblings.attr,
&dev_attr_thread_siblings_list.attr,
&dev_attr_core_cpus.attr,
&dev_attr_core_cpus_list.attr,
&dev_attr_core_siblings.attr,
&dev_attr_core_siblings_list.attr,
&dev_attr_die_cpus.attr,
&dev_attr_die_cpus_list.attr,
&dev_attr_package_cpus.attr,
&dev_attr_package_cpus_list.attr,
#ifdef CONFIG_SCHED_BOOK
&dev_attr_book_id.attr,
&dev_attr_book_siblings.attr,

View file

@ -718,12 +718,13 @@ static irqreturn_t jz4780_dma_irq_handler(int irq, void *data)
{
struct jz4780_dma_dev *jzdma = data;
unsigned int nb_channels = jzdma->soc_data->nb_channels;
uint32_t pending, dmac;
unsigned long pending;
uint32_t dmac;
int i;
pending = jz4780_dma_ctrl_readl(jzdma, JZ_DMA_REG_DIRQP);
for_each_set_bit(i, (unsigned long *)&pending, nb_channels) {
for_each_set_bit(i, &pending, nb_channels) {
if (jz4780_dma_chan_irq(jzdma, &jzdma->chan[i]))
pending &= ~BIT(i);
}
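/* Editor's note on the type change: for_each_set_bit() walks an
 * unsigned long bitmap, so casting the address of a 32-bit variable
 * to (unsigned long *) over-reads on 64-bit kernels; making 'pending'
 * an unsigned long avoids that. */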

View file

@ -703,7 +703,7 @@ static int sdma_load_script(struct sdma_engine *sdma, void *buf, int size,
spin_lock_irqsave(&sdma->channel_0_lock, flags);
bd0->mode.command = C0_SETPM;
bd0->mode.status = BD_DONE | BD_INTR | BD_WRAP | BD_EXTD;
bd0->mode.status = BD_DONE | BD_WRAP | BD_EXTD;
bd0->mode.count = size / 2;
bd0->buffer_addr = buf_phys;
bd0->ext_buffer_addr = address;
@ -1025,7 +1025,7 @@ static int sdma_load_context(struct sdma_channel *sdmac)
context->gReg[7] = sdmac->watermark_level;
bd0->mode.command = C0_SETDM;
bd0->mode.status = BD_DONE | BD_INTR | BD_WRAP | BD_EXTD;
bd0->mode.status = BD_DONE | BD_WRAP | BD_EXTD;
bd0->mode.count = sizeof(*context) / 4;
bd0->buffer_addr = sdma->context_phys;
bd0->ext_buffer_addr = 2048 + (sizeof(*context) / 4) * channel;
@ -2096,27 +2096,6 @@ static int sdma_probe(struct platform_device *pdev)
if (pdata && pdata->script_addrs)
sdma_add_scripts(sdma, pdata->script_addrs);
if (pdata) {
ret = sdma_get_firmware(sdma, pdata->fw_name);
if (ret)
dev_warn(&pdev->dev, "failed to get firmware from platform data\n");
} else {
/*
* Because that device tree does not encode ROM script address,
* the RAM script in firmware is mandatory for device tree
* probe, otherwise it fails.
*/
ret = of_property_read_string(np, "fsl,sdma-ram-script-name",
&fw_name);
if (ret)
dev_warn(&pdev->dev, "failed to get firmware name\n");
else {
ret = sdma_get_firmware(sdma, fw_name);
if (ret)
dev_warn(&pdev->dev, "failed to get firmware from device tree\n");
}
}
sdma->dma_device.dev = &pdev->dev;
sdma->dma_device.device_alloc_chan_resources = sdma_alloc_chan_resources;
@ -2161,6 +2140,33 @@ static int sdma_probe(struct platform_device *pdev)
of_node_put(spba_bus);
}
/*
* Kick off firmware loading as the very last step:
* attempt to load firmware only if we're not on the error path, because
* the firmware callback requires a fully functional and allocated sdma
* instance.
*/
if (pdata) {
ret = sdma_get_firmware(sdma, pdata->fw_name);
if (ret)
dev_warn(&pdev->dev, "failed to get firmware from platform data\n");
} else {
/*
* Because the device tree does not encode the ROM script address,
* the RAM script in the firmware is mandatory for device tree
* probe; otherwise the probe fails.
*/
ret = of_property_read_string(np, "fsl,sdma-ram-script-name",
&fw_name);
if (ret) {
dev_warn(&pdev->dev, "failed to get firmware name\n");
} else {
ret = sdma_get_firmware(sdma, fw_name);
if (ret)
dev_warn(&pdev->dev, "failed to get firmware from device tree\n");
}
}
return 0;
err_register:

View file

@ -799,6 +799,9 @@ static u32 process_channel_irqs(struct bam_device *bdev)
/* Number of bytes available to read */
avail = CIRC_CNT(offset, bchan->head, MAX_DESCRIPTORS + 1);
if (offset < bchan->head)
avail--;
list_for_each_entry_safe(async_desc, tmp,
&bchan->desc_list, desc_node) {
/* Not enough data to read */

View file

@ -96,10 +96,10 @@ struct platform_data {
struct device_attribute name_attr;
};
/* Keep track of how many package pointers we allocated in init() */
static int max_packages __read_mostly;
/* Array of package pointers. Serialized by cpu hotplug lock */
static struct platform_device **pkg_devices;
/* Keep track of how many zone pointers we allocated in init() */
static int max_zones __read_mostly;
/* Array of zone pointers. Serialized by cpu hotplug lock */
static struct platform_device **zone_devices;
static ssize_t show_label(struct device *dev,
struct device_attribute *devattr, char *buf)
@ -422,10 +422,10 @@ static int chk_ucode_version(unsigned int cpu)
static struct platform_device *coretemp_get_pdev(unsigned int cpu)
{
int pkgid = topology_logical_package_id(cpu);
int id = topology_logical_die_id(cpu);
if (pkgid >= 0 && pkgid < max_packages)
return pkg_devices[pkgid];
if (id >= 0 && id < max_zones)
return zone_devices[id];
return NULL;
}
@ -531,7 +531,7 @@ static int coretemp_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct platform_data *pdata;
/* Initialize the per-package data structures */
/* Initialize the per-zone data structures */
pdata = devm_kzalloc(dev, sizeof(struct platform_data), GFP_KERNEL);
if (!pdata)
return -ENOMEM;
@ -566,13 +566,13 @@ static struct platform_driver coretemp_driver = {
static struct platform_device *coretemp_device_add(unsigned int cpu)
{
int err, pkgid = topology_logical_package_id(cpu);
int err, zoneid = topology_logical_die_id(cpu);
struct platform_device *pdev;
if (pkgid < 0)
if (zoneid < 0)
return ERR_PTR(-ENOMEM);
pdev = platform_device_alloc(DRVNAME, pkgid);
pdev = platform_device_alloc(DRVNAME, zoneid);
if (!pdev)
return ERR_PTR(-ENOMEM);
@ -582,7 +582,7 @@ static struct platform_device *coretemp_device_add(unsigned int cpu)
return ERR_PTR(err);
}
pkg_devices[pkgid] = pdev;
zone_devices[zoneid] = pdev;
return pdev;
}
@ -690,7 +690,7 @@ static int coretemp_cpu_offline(unsigned int cpu)
* the rest.
*/
if (cpumask_empty(&pd->cpumask)) {
pkg_devices[topology_logical_package_id(cpu)] = NULL;
zone_devices[topology_logical_die_id(cpu)] = NULL;
platform_device_unregister(pdev);
return 0;
}
@ -728,10 +728,10 @@ static int __init coretemp_init(void)
if (!x86_match_cpu(coretemp_ids))
return -ENODEV;
max_packages = topology_max_packages();
pkg_devices = kcalloc(max_packages, sizeof(struct platform_device *),
max_zones = topology_max_packages() * topology_max_die_per_package();
zone_devices = kcalloc(max_zones, sizeof(struct platform_device *),
GFP_KERNEL);
if (!pkg_devices)
if (!zone_devices)
return -ENOMEM;
err = platform_driver_register(&coretemp_driver);
@ -747,7 +747,7 @@ static int __init coretemp_init(void)
outdrv:
platform_driver_unregister(&coretemp_driver);
kfree(pkg_devices);
kfree(zone_devices);
return err;
}
module_init(coretemp_init)
@ -756,7 +756,7 @@ static void __exit coretemp_exit(void)
{
cpuhp_remove_state(coretemp_hp_online);
platform_driver_unregister(&coretemp_driver);
kfree(pkg_devices);
kfree(zone_devices);
}
module_exit(coretemp_exit)

View file

@ -166,12 +166,15 @@ struct rapl_domain {
#define power_zone_to_rapl_domain(_zone) \
container_of(_zone, struct rapl_domain, power_zone)
/* maximum rapl package domain name: package-%d-die-%d */
#define PACKAGE_DOMAIN_NAME_LENGTH 30
/* Each physical package contains multiple domains, these are the common
/* Each RAPL package contains multiple domains; these are the common
* data across RAPL domains within a package.
*/
struct rapl_package {
unsigned int id; /* physical package/socket id */
unsigned int id; /* logical die id; equals the physical die id on 1-die systems */
unsigned int nr_domains;
unsigned long domain_map; /* bit map of active domains */
unsigned int power_unit;
@ -186,6 +189,7 @@ struct rapl_package {
int lead_cpu; /* one active cpu per package for access */
/* Track active cpus */
struct cpumask cpumask;
char name[PACKAGE_DOMAIN_NAME_LENGTH];
};
struct rapl_defaults {
@ -252,8 +256,9 @@ static struct powercap_control_type *control_type; /* PowerCap Controller */
static struct rapl_domain *platform_rapl_domain; /* Platform (PSys) domain */
/* caller to ensure CPU hotplug lock is held */
static struct rapl_package *find_package_by_id(int id)
static struct rapl_package *rapl_find_package_domain(int cpu)
{
int id = topology_logical_die_id(cpu);
struct rapl_package *rp;
list_for_each_entry(rp, &rapl_packages, plist) {
@ -913,8 +918,8 @@ static int rapl_check_unit_core(struct rapl_package *rp, int cpu)
value = (msr_val & TIME_UNIT_MASK) >> TIME_UNIT_OFFSET;
rp->time_unit = 1000000 / (1 << value);
pr_debug("Core CPU package %d energy=%dpJ, time=%dus, power=%duW\n",
rp->id, rp->energy_unit, rp->time_unit, rp->power_unit);
pr_debug("Core CPU %s energy=%dpJ, time=%dus, power=%duW\n",
rp->name, rp->energy_unit, rp->time_unit, rp->power_unit);
return 0;
}
@ -938,8 +943,8 @@ static int rapl_check_unit_atom(struct rapl_package *rp, int cpu)
value = (msr_val & TIME_UNIT_MASK) >> TIME_UNIT_OFFSET;
rp->time_unit = 1000000 / (1 << value);
pr_debug("Atom package %d energy=%dpJ, time=%dus, power=%duW\n",
rp->id, rp->energy_unit, rp->time_unit, rp->power_unit);
pr_debug("Atom %s energy=%dpJ, time=%dus, power=%duW\n",
rp->name, rp->energy_unit, rp->time_unit, rp->power_unit);
return 0;
}
@ -1168,7 +1173,7 @@ static void rapl_update_domain_data(struct rapl_package *rp)
u64 val;
for (dmn = 0; dmn < rp->nr_domains; dmn++) {
pr_debug("update package %d domain %s data\n", rp->id,
pr_debug("update %s domain %s data\n", rp->name,
rp->domains[dmn].name);
/* exclude non-raw primitives */
for (prim = 0; prim < NR_RAW_PRIMITIVES; prim++) {
@ -1193,7 +1198,6 @@ static void rapl_unregister_powercap(void)
static int rapl_package_register_powercap(struct rapl_package *rp)
{
struct rapl_domain *rd;
char dev_name[17]; /* max domain name = 7 + 1 + 8 for int + 1 for null*/
struct powercap_zone *power_zone = NULL;
int nr_pl, ret;
@ -1204,20 +1208,16 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
for (rd = rp->domains; rd < rp->domains + rp->nr_domains; rd++) {
if (rd->id == RAPL_DOMAIN_PACKAGE) {
nr_pl = find_nr_power_limit(rd);
pr_debug("register socket %d package domain %s\n",
rp->id, rd->name);
memset(dev_name, 0, sizeof(dev_name));
snprintf(dev_name, sizeof(dev_name), "%s-%d",
rd->name, rp->id);
pr_debug("register package domain %s\n", rp->name);
power_zone = powercap_register_zone(&rd->power_zone,
control_type,
dev_name, NULL,
rp->name, NULL,
&zone_ops[rd->id],
nr_pl,
&constraint_ops);
if (IS_ERR(power_zone)) {
pr_debug("failed to register package, %d\n",
rp->id);
pr_debug("failed to register power zone %s\n",
rp->name);
return PTR_ERR(power_zone);
}
/* track parent zone in per package/socket data */
@ -1243,8 +1243,8 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
&constraint_ops);
if (IS_ERR(power_zone)) {
pr_debug("failed to register power_zone, %d:%s:%s\n",
rp->id, rd->name, dev_name);
pr_debug("failed to register power_zone, %s:%s\n",
rp->name, rd->name);
ret = PTR_ERR(power_zone);
goto err_cleanup;
}
@ -1257,7 +1257,7 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
* failed after the first domain setup.
*/
while (--rd >= rp->domains) {
pr_debug("unregister package %d domain %s\n", rp->id, rd->name);
pr_debug("unregister %s domain %s\n", rp->name, rd->name);
powercap_unregister_zone(control_type, &rd->power_zone);
}
@ -1288,7 +1288,7 @@ static int __init rapl_register_psys(void)
rd->rpl[0].name = pl1_name;
rd->rpl[1].prim_id = PL2_ENABLE;
rd->rpl[1].name = pl2_name;
rd->rp = find_package_by_id(0);
rd->rp = rapl_find_package_domain(0);
power_zone = powercap_register_zone(&rd->power_zone, control_type,
"psys", NULL,
@ -1367,8 +1367,8 @@ static void rapl_detect_powerlimit(struct rapl_domain *rd)
/* check if the domain is locked by BIOS, ignore if MSR doesn't exist */
if (!rapl_read_data_raw(rd, FW_LOCK, false, &val64)) {
if (val64) {
pr_info("RAPL package %d domain %s locked by BIOS\n",
rd->rp->id, rd->name);
pr_info("RAPL %s domain %s locked by BIOS\n",
rd->rp->name, rd->name);
rd->state |= DOMAIN_STATE_BIOS_LOCKED;
}
}
@ -1397,10 +1397,10 @@ static int rapl_detect_domains(struct rapl_package *rp, int cpu)
}
rp->nr_domains = bitmap_weight(&rp->domain_map, RAPL_DOMAIN_MAX);
if (!rp->nr_domains) {
pr_debug("no valid rapl domains found in package %d\n", rp->id);
pr_debug("no valid rapl domains found in %s\n", rp->name);
return -ENODEV;
}
pr_debug("found %d domains on package %d\n", rp->nr_domains, rp->id);
pr_debug("found %d domains on %s\n", rp->nr_domains, rp->name);
rp->domains = kcalloc(rp->nr_domains + 1, sizeof(struct rapl_domain),
GFP_KERNEL);
@ -1433,8 +1433,8 @@ static void rapl_remove_package(struct rapl_package *rp)
rd_package = rd;
continue;
}
pr_debug("remove package, undo power limit on %d: %s\n",
rp->id, rd->name);
pr_debug("remove package, undo power limit on %s: %s\n",
rp->name, rd->name);
powercap_unregister_zone(control_type, &rd->power_zone);
}
/* do parent zone last */
@ -1444,9 +1444,11 @@ static void rapl_remove_package(struct rapl_package *rp)
}
/* called from CPU hotplug notifier, hotplug lock held */
static struct rapl_package *rapl_add_package(int cpu, int pkgid)
static struct rapl_package *rapl_add_package(int cpu)
{
int id = topology_logical_die_id(cpu);
struct rapl_package *rp;
struct cpuinfo_x86 *c = &cpu_data(cpu);
int ret;
rp = kzalloc(sizeof(struct rapl_package), GFP_KERNEL);
@ -1454,9 +1456,16 @@ static struct rapl_package *rapl_add_package(int cpu, int pkgid)
return ERR_PTR(-ENOMEM);
/* add the new package to the list */
rp->id = pkgid;
rp->id = id;
rp->lead_cpu = cpu;
if (topology_max_die_per_package() > 1)
snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH,
"package-%d-die-%d", c->phys_proc_id, c->cpu_die_id);
else
snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d",
c->phys_proc_id);
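/* Editor's example: on a two-package part with two dies each, this
 * yields zones "package-0-die-0" through "package-1-die-1"; single-die
 * parts keep the old user-visible "package-0" naming. */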
/* check if the package contains valid domains */
if (rapl_detect_domains(rp, cpu) ||
rapl_defaults->check_unit(rp, cpu)) {
@ -1485,12 +1494,11 @@ static struct rapl_package *rapl_add_package(int cpu, int pkgid)
*/
static int rapl_cpu_online(unsigned int cpu)
{
int pkgid = topology_physical_package_id(cpu);
struct rapl_package *rp;
rp = find_package_by_id(pkgid);
rp = rapl_find_package_domain(cpu);
if (!rp) {
rp = rapl_add_package(cpu, pkgid);
rp = rapl_add_package(cpu);
if (IS_ERR(rp))
return PTR_ERR(rp);
}
@ -1500,11 +1508,10 @@ static int rapl_cpu_online(unsigned int cpu)
static int rapl_cpu_down_prep(unsigned int cpu)
{
int pkgid = topology_physical_package_id(cpu);
struct rapl_package *rp;
int lead_cpu;
rp = find_package_by_id(pkgid);
rp = rapl_find_package_domain(cpu);
if (!rp)
return 0;

View file

@ -81,6 +81,12 @@ static int chap_check_algorithm(const char *a_str)
return CHAP_DIGEST_UNKNOWN;
}
static void chap_close(struct iscsi_conn *conn)
{
kfree(conn->auth_protocol);
conn->auth_protocol = NULL;
}
static struct iscsi_chap *chap_server_open(
struct iscsi_conn *conn,
struct iscsi_node_auth *auth,
@ -118,7 +124,7 @@ static struct iscsi_chap *chap_server_open(
case CHAP_DIGEST_UNKNOWN:
default:
pr_err("Unsupported CHAP_A value\n");
kfree(conn->auth_protocol);
chap_close(conn);
return NULL;
}
@ -133,19 +139,13 @@ static struct iscsi_chap *chap_server_open(
* Generate Challenge.
*/
if (chap_gen_challenge(conn, 1, aic_str, aic_len) < 0) {
kfree(conn->auth_protocol);
chap_close(conn);
return NULL;
}
return chap;
}
static void chap_close(struct iscsi_conn *conn)
{
kfree(conn->auth_protocol);
conn->auth_protocol = NULL;
}
static int chap_server_compute_md5(
struct iscsi_conn *conn,
struct iscsi_node_auth *auth,

View file

@ -502,7 +502,7 @@ iblock_execute_write_same(struct se_cmd *cmd)
/* Always in 512 byte units for Linux/Block */
block_lba += sg->length >> SECTOR_SHIFT;
sectors -= 1;
sectors -= sg->length >> SECTOR_SHIFT;
}
iblock_submit_bios(&list);
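/* Editor's example: with a 4 KiB scatterlist entry, sg->length >>
 * SECTOR_SHIFT is 8, so the remaining-sector count must also drop by
 * 8 per bio; decrementing by 1 made the loop write past the requested
 * range. */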

View file

@ -43,7 +43,7 @@ MODULE_PARM_DESC(notify_delay_ms,
*/
#define MAX_NUMBER_OF_TRIPS 2
struct pkg_device {
struct zone_device {
int cpu;
bool work_scheduled;
u32 tj_max;
@ -58,10 +58,10 @@ static struct thermal_zone_params pkg_temp_tz_params = {
.no_hwmon = true,
};
/* Keep track of how many package pointers we allocated in init() */
static int max_packages __read_mostly;
/* Array of package pointers */
static struct pkg_device **packages;
/* Keep track of how many zone pointers we allocated in init() */
static int max_id __read_mostly;
/* Array of zone pointers */
static struct zone_device **zones;
/* Serializes interrupt notification, work and hotplug */
static DEFINE_SPINLOCK(pkg_temp_lock);
/* Protects zone operation in the work function against hotplug removal */
@ -108,12 +108,12 @@ static int pkg_temp_debugfs_init(void)
*
* - Other callsites: Must hold pkg_temp_lock
*/
static struct pkg_device *pkg_temp_thermal_get_dev(unsigned int cpu)
static struct zone_device *pkg_temp_thermal_get_dev(unsigned int cpu)
{
int pkgid = topology_logical_package_id(cpu);
int id = topology_logical_die_id(cpu);
if (pkgid >= 0 && pkgid < max_packages)
return packages[pkgid];
if (id >= 0 && id < max_id)
return zones[id];
return NULL;
}
@ -138,12 +138,13 @@ static int get_tj_max(int cpu, u32 *tj_max)
static int sys_get_curr_temp(struct thermal_zone_device *tzd, int *temp)
{
struct pkg_device *pkgdev = tzd->devdata;
struct zone_device *zonedev = tzd->devdata;
u32 eax, edx;
rdmsr_on_cpu(pkgdev->cpu, MSR_IA32_PACKAGE_THERM_STATUS, &eax, &edx);
rdmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_STATUS,
&eax, &edx);
if (eax & 0x80000000) {
*temp = pkgdev->tj_max - ((eax >> 16) & 0x7f) * 1000;
*temp = zonedev->tj_max - ((eax >> 16) & 0x7f) * 1000;
pr_debug("sys_get_curr_temp %d\n", *temp);
return 0;
}
@ -153,7 +154,7 @@ static int sys_get_curr_temp(struct thermal_zone_device *tzd, int *temp)
static int sys_get_trip_temp(struct thermal_zone_device *tzd,
int trip, int *temp)
{
struct pkg_device *pkgdev = tzd->devdata;
struct zone_device *zonedev = tzd->devdata;
unsigned long thres_reg_value;
u32 mask, shift, eax, edx;
int ret;
@ -169,14 +170,14 @@ static int sys_get_trip_temp(struct thermal_zone_device *tzd,
shift = THERM_SHIFT_THRESHOLD0;
}
ret = rdmsr_on_cpu(pkgdev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
ret = rdmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
&eax, &edx);
if (ret < 0)
return ret;
thres_reg_value = (eax & mask) >> shift;
if (thres_reg_value)
*temp = pkgdev->tj_max - thres_reg_value * 1000;
*temp = zonedev->tj_max - thres_reg_value * 1000;
else
*temp = 0;
pr_debug("sys_get_trip_temp %d\n", *temp);
@ -187,14 +188,14 @@ static int sys_get_trip_temp(struct thermal_zone_device *tzd,
static int
sys_set_trip_temp(struct thermal_zone_device *tzd, int trip, int temp)
{
struct pkg_device *pkgdev = tzd->devdata;
struct zone_device *zonedev = tzd->devdata;
u32 l, h, mask, shift, intr;
int ret;
if (trip >= MAX_NUMBER_OF_TRIPS || temp >= pkgdev->tj_max)
if (trip >= MAX_NUMBER_OF_TRIPS || temp >= zonedev->tj_max)
return -EINVAL;
ret = rdmsr_on_cpu(pkgdev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
ret = rdmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
&l, &h);
if (ret < 0)
return ret;
@ -216,11 +217,12 @@ sys_set_trip_temp(struct thermal_zone_device *tzd, int trip, int temp)
if (!temp) {
l &= ~intr;
} else {
l |= (pkgdev->tj_max - temp)/1000 << shift;
l |= (zonedev->tj_max - temp)/1000 << shift;
l |= intr;
}
return wrmsr_on_cpu(pkgdev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT, l, h);
return wrmsr_on_cpu(zonedev->cpu, MSR_IA32_PACKAGE_THERM_INTERRUPT,
l, h);
}
static int sys_get_trip_type(struct thermal_zone_device *thermal, int trip,
@ -275,26 +277,26 @@ static void pkg_temp_thermal_threshold_work_fn(struct work_struct *work)
{
struct thermal_zone_device *tzone = NULL;
int cpu = smp_processor_id();
struct pkg_device *pkgdev;
struct zone_device *zonedev;
u64 msr_val, wr_val;
mutex_lock(&thermal_zone_mutex);
spin_lock_irq(&pkg_temp_lock);
++pkg_work_cnt;
pkgdev = pkg_temp_thermal_get_dev(cpu);
if (!pkgdev) {
zonedev = pkg_temp_thermal_get_dev(cpu);
if (!zonedev) {
spin_unlock_irq(&pkg_temp_lock);
mutex_unlock(&thermal_zone_mutex);
return;
}
pkgdev->work_scheduled = false;
zonedev->work_scheduled = false;
rdmsrl(MSR_IA32_PACKAGE_THERM_STATUS, msr_val);
wr_val = msr_val & ~(THERM_LOG_THRESHOLD0 | THERM_LOG_THRESHOLD1);
if (wr_val != msr_val) {
wrmsrl(MSR_IA32_PACKAGE_THERM_STATUS, wr_val);
tzone = pkgdev->tzone;
tzone = zonedev->tzone;
}
enable_pkg_thres_interrupt();
@ -320,7 +322,7 @@ static void pkg_thermal_schedule_work(int cpu, struct delayed_work *work)
static int pkg_thermal_notify(u64 msr_val)
{
int cpu = smp_processor_id();
struct pkg_device *pkgdev;
struct zone_device *zonedev;
unsigned long flags;
spin_lock_irqsave(&pkg_temp_lock, flags);
@ -329,10 +331,10 @@ static int pkg_thermal_notify(u64 msr_val)
disable_pkg_thres_interrupt();
/* Work is per package, so scheduling it once is enough. */
pkgdev = pkg_temp_thermal_get_dev(cpu);
if (pkgdev && !pkgdev->work_scheduled) {
pkgdev->work_scheduled = true;
pkg_thermal_schedule_work(pkgdev->cpu, &pkgdev->work);
zonedev = pkg_temp_thermal_get_dev(cpu);
if (zonedev && !zonedev->work_scheduled) {
zonedev->work_scheduled = true;
pkg_thermal_schedule_work(zonedev->cpu, &zonedev->work);
}
spin_unlock_irqrestore(&pkg_temp_lock, flags);
@ -341,12 +343,12 @@ static int pkg_thermal_notify(u64 msr_val)
static int pkg_temp_thermal_device_add(unsigned int cpu)
{
int pkgid = topology_logical_package_id(cpu);
int id = topology_logical_die_id(cpu);
u32 tj_max, eax, ebx, ecx, edx;
struct pkg_device *pkgdev;
struct zone_device *zonedev;
int thres_count, err;
if (pkgid >= max_packages)
if (id >= max_id)
return -ENOMEM;
cpuid(6, &eax, &ebx, &ecx, &edx);
@ -360,51 +362,51 @@ static int pkg_temp_thermal_device_add(unsigned int cpu)
if (err)
return err;
pkgdev = kzalloc(sizeof(*pkgdev), GFP_KERNEL);
if (!pkgdev)
zonedev = kzalloc(sizeof(*zonedev), GFP_KERNEL);
if (!zonedev)
return -ENOMEM;
INIT_DELAYED_WORK(&pkgdev->work, pkg_temp_thermal_threshold_work_fn);
pkgdev->cpu = cpu;
pkgdev->tj_max = tj_max;
pkgdev->tzone = thermal_zone_device_register("x86_pkg_temp",
INIT_DELAYED_WORK(&zonedev->work, pkg_temp_thermal_threshold_work_fn);
zonedev->cpu = cpu;
zonedev->tj_max = tj_max;
zonedev->tzone = thermal_zone_device_register("x86_pkg_temp",
thres_count,
(thres_count == MAX_NUMBER_OF_TRIPS) ? 0x03 : 0x01,
pkgdev, &tzone_ops, &pkg_temp_tz_params, 0, 0);
if (IS_ERR(pkgdev->tzone)) {
err = PTR_ERR(pkgdev->tzone);
kfree(pkgdev);
zonedev, &tzone_ops, &pkg_temp_tz_params, 0, 0);
if (IS_ERR(zonedev->tzone)) {
err = PTR_ERR(zonedev->tzone);
kfree(zonedev);
return err;
}
/* Store MSR value for package thermal interrupt, to restore at exit */
rdmsr(MSR_IA32_PACKAGE_THERM_INTERRUPT, pkgdev->msr_pkg_therm_low,
pkgdev->msr_pkg_therm_high);
rdmsr(MSR_IA32_PACKAGE_THERM_INTERRUPT, zonedev->msr_pkg_therm_low,
zonedev->msr_pkg_therm_high);
cpumask_set_cpu(cpu, &pkgdev->cpumask);
cpumask_set_cpu(cpu, &zonedev->cpumask);
spin_lock_irq(&pkg_temp_lock);
packages[pkgid] = pkgdev;
zones[id] = zonedev;
spin_unlock_irq(&pkg_temp_lock);
return 0;
}
static int pkg_thermal_cpu_offline(unsigned int cpu)
{
struct pkg_device *pkgdev = pkg_temp_thermal_get_dev(cpu);
struct zone_device *zonedev = pkg_temp_thermal_get_dev(cpu);
bool lastcpu, was_target;
int target;
if (!pkgdev)
if (!zonedev)
return 0;
target = cpumask_any_but(&pkgdev->cpumask, cpu);
cpumask_clear_cpu(cpu, &pkgdev->cpumask);
target = cpumask_any_but(&zonedev->cpumask, cpu);
cpumask_clear_cpu(cpu, &zonedev->cpumask);
lastcpu = target >= nr_cpu_ids;
/*
* If this is the last cpu in the package, remove the sysfs files
* before doing further cleanups.
*/
if (lastcpu) {
struct thermal_zone_device *tzone = pkgdev->tzone;
struct thermal_zone_device *tzone = zonedev->tzone;
/*
* We must protect against a work function calling
@ -413,7 +415,7 @@ static int pkg_thermal_cpu_offline(unsigned int cpu)
* won't try to call.
*/
mutex_lock(&thermal_zone_mutex);
pkgdev->tzone = NULL;
zonedev->tzone = NULL;
mutex_unlock(&thermal_zone_mutex);
thermal_zone_device_unregister(tzone);
@ -427,8 +429,8 @@ static int pkg_thermal_cpu_offline(unsigned int cpu)
* one. When we drop the lock, then the interrupt notify function
* will see the new target.
*/
was_target = pkgdev->cpu == cpu;
pkgdev->cpu = target;
was_target = zonedev->cpu == cpu;
zonedev->cpu = target;
/*
* If this is the last CPU in the package remove the package
@ -437,23 +439,23 @@ static int pkg_thermal_cpu_offline(unsigned int cpu)
* worker will see the package anymore.
*/
if (lastcpu) {
packages[topology_logical_package_id(cpu)] = NULL;
zones[topology_logical_die_id(cpu)] = NULL;
/* After this point nothing touches the MSR anymore. */
wrmsr(MSR_IA32_PACKAGE_THERM_INTERRUPT,
pkgdev->msr_pkg_therm_low, pkgdev->msr_pkg_therm_high);
zonedev->msr_pkg_therm_low, zonedev->msr_pkg_therm_high);
}
/*
* Check whether there is work scheduled and whether the work is
* targeted at the outgoing CPU.
*/
if (pkgdev->work_scheduled && was_target) {
if (zonedev->work_scheduled && was_target) {
/*
* To cancel the work we need to drop the lock, otherwise
* we might deadlock if the work needs to be flushed.
*/
spin_unlock_irq(&pkg_temp_lock);
cancel_delayed_work_sync(&pkgdev->work);
cancel_delayed_work_sync(&zonedev->work);
spin_lock_irq(&pkg_temp_lock);
/*
* If this is not the last cpu in the package and the work
@ -461,21 +463,21 @@ static int pkg_thermal_cpu_offline(unsigned int cpu)
* need to reschedule the work, otherwise the interrupt
* stays disabled forever.
*/
if (!lastcpu && pkgdev->work_scheduled)
pkg_thermal_schedule_work(target, &pkgdev->work);
if (!lastcpu && zonedev->work_scheduled)
pkg_thermal_schedule_work(target, &zonedev->work);
}
spin_unlock_irq(&pkg_temp_lock);
/* Final cleanup if this is the last cpu */
if (lastcpu)
kfree(pkgdev);
kfree(zonedev);
return 0;
}
static int pkg_thermal_cpu_online(unsigned int cpu)
{
struct pkg_device *pkgdev = pkg_temp_thermal_get_dev(cpu);
struct zone_device *zonedev = pkg_temp_thermal_get_dev(cpu);
struct cpuinfo_x86 *c = &cpu_data(cpu);
/* Paranoia check */
@ -483,8 +485,8 @@ static int pkg_thermal_cpu_online(unsigned int cpu)
return -ENODEV;
/* If the package exists, nothing to do */
if (pkgdev) {
cpumask_set_cpu(cpu, &pkgdev->cpumask);
if (zonedev) {
cpumask_set_cpu(cpu, &zonedev->cpumask);
return 0;
}
return pkg_temp_thermal_device_add(cpu);
@ -503,10 +505,10 @@ static int __init pkg_temp_thermal_init(void)
if (!x86_match_cpu(pkg_temp_thermal_ids))
return -ENODEV;
max_packages = topology_max_packages();
packages = kcalloc(max_packages, sizeof(struct pkg_device *),
max_id = topology_max_packages() * topology_max_die_per_package();
zones = kcalloc(max_id, sizeof(struct zone_device *),
GFP_KERNEL);
if (!packages)
if (!zones)
return -ENOMEM;
ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "thermal/x86_pkg:online",
@ -525,7 +527,7 @@ static int __init pkg_temp_thermal_init(void)
return 0;
err:
kfree(packages);
kfree(zones);
return ret;
}
module_init(pkg_temp_thermal_init)
@ -537,7 +539,7 @@ static void __exit pkg_temp_thermal_exit(void)
cpuhp_remove_state(pkg_thermal_hp_state);
debugfs_remove_recursive(debugfs);
kfree(packages);
kfree(zones);
}
module_exit(pkg_temp_thermal_exit)

View file

@ -11,7 +11,6 @@ config DCACHE_WORD_ACCESS
config VALIDATE_FS_PARSER
bool "Validate filesystem parameter description"
default y
help
Enable this to perform validation of the parameter description for a
filesystem when it is registered.

View file

@ -175,6 +175,26 @@ int sysfs_create_group(struct kobject *kobj,
}
EXPORT_SYMBOL_GPL(sysfs_create_group);
static int internal_create_groups(struct kobject *kobj, int update,
const struct attribute_group **groups)
{
int error = 0;
int i;
if (!groups)
return 0;
for (i = 0; groups[i]; i++) {
error = internal_create_group(kobj, update, groups[i]);
if (error) {
while (--i >= 0)
sysfs_remove_group(kobj, groups[i]);
break;
}
}
return error;
}
/**
* sysfs_create_groups - given a directory kobject, create a bunch of attribute groups
* @kobj: The kobject to create the group on
@ -191,24 +211,28 @@ EXPORT_SYMBOL_GPL(sysfs_create_group);
int sysfs_create_groups(struct kobject *kobj,
const struct attribute_group **groups)
{
int error = 0;
int i;
if (!groups)
return 0;
for (i = 0; groups[i]; i++) {
error = sysfs_create_group(kobj, groups[i]);
if (error) {
while (--i >= 0)
sysfs_remove_group(kobj, groups[i]);
break;
}
}
return error;
return internal_create_groups(kobj, 0, groups);
}
EXPORT_SYMBOL_GPL(sysfs_create_groups);
/**
* sysfs_update_groups - given a directory kobject, create a bunch of attribute groups
* @kobj: The kobject to update the group on
* @groups: The attribute groups to update, NULL terminated
*
* This function updates a bunch of attribute groups. If an error occurs when
* updating a group, all previously updated groups will be removed together
* with already existing (not updated) attributes.
*
* Returns 0 on success or error code from sysfs_update_group on failure.
*/
int sysfs_update_groups(struct kobject *kobj,
const struct attribute_group **groups)
{
return internal_create_groups(kobj, 1, groups);
}
EXPORT_SYMBOL_GPL(sysfs_update_groups);
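
A hedged usage sketch (driver-side; the group names are hypothetical, only
sysfs_update_groups() itself comes from this patch):

static const struct attribute_group *example_groups[] = {
	&example_events_group,		/* hypothetical group */
	&example_format_group,		/* hypothetical group */
	NULL,
};

static int example_refresh_visibility(struct kobject *kobj)
{
	/* Re-runs each group's is_visible() callback, so attributes can
	 * appear or disappear after a capability change. */
	return sysfs_update_groups(kobj, example_groups);
}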
/**
* sysfs_update_group - given a directory kobject, update an attribute group
* @kobj: The kobject to update the group on

View file

@ -256,6 +256,7 @@ struct pmu {
struct module *module;
struct device *dev;
const struct attribute_group **attr_groups;
const struct attribute_group **attr_update;
const char *name;
int type;
@ -749,6 +750,11 @@ struct perf_event_context {
int nr_stat;
int nr_freq;
int rotate_disable;
/*
* Set when nr_events != nr_active, but tolerant of events that do not
* need to be active due to scheduling constraints, such as cgroups.
*/
int rotate_necessary;
refcount_t refcount;
struct task_struct *task;

View file

@ -268,6 +268,8 @@ int __must_check sysfs_create_group(struct kobject *kobj,
const struct attribute_group *grp);
int __must_check sysfs_create_groups(struct kobject *kobj,
const struct attribute_group **groups);
int __must_check sysfs_update_groups(struct kobject *kobj,
const struct attribute_group **groups);
int sysfs_update_group(struct kobject *kobj,
const struct attribute_group *grp);
void sysfs_remove_group(struct kobject *kobj,
@ -433,6 +435,12 @@ static inline int sysfs_create_groups(struct kobject *kobj,
return 0;
}
static inline int sysfs_update_groups(struct kobject *kobj,
const struct attribute_group **groups)
{
return 0;
}
static inline int sysfs_update_group(struct kobject *kobj,
const struct attribute_group *grp)
{


@ -184,6 +184,9 @@ static inline int cpu_to_mem(int cpu)
#ifndef topology_physical_package_id
#define topology_physical_package_id(cpu) ((void)(cpu), -1)
#endif
#ifndef topology_die_id
#define topology_die_id(cpu) ((void)(cpu), -1)
#endif
#ifndef topology_core_id
#define topology_core_id(cpu) ((void)(cpu), 0)
#endif
@ -193,6 +196,9 @@ static inline int cpu_to_mem(int cpu)
#ifndef topology_core_cpumask
#define topology_core_cpumask(cpu) cpumask_of(cpu)
#endif
#ifndef topology_die_cpumask
#define topology_die_cpumask(cpu) cpumask_of(cpu)
#endif
#ifdef CONFIG_SCHED_SMT
static inline const struct cpumask *cpu_smt_mask(int cpu)


@ -2952,6 +2952,12 @@ static void ctx_sched_out(struct perf_event_context *ctx,
if (!ctx->nr_active || !(is_active & EVENT_ALL))
return;
/*
* If we had been multiplexing, no rotations are necessary now that no
* events are active.
*/
ctx->rotate_necessary = 0;
perf_pmu_disable(ctx->pmu);
if (is_active & EVENT_PINNED) {
list_for_each_entry_safe(event, tmp, &ctx->pinned_active, active_list)
@ -3319,10 +3325,13 @@ static int flexible_sched_in(struct perf_event *event, void *data)
return 0;
if (group_can_go_on(event, sid->cpuctx, sid->can_add_hw)) {
if (!group_sched_in(event, sid->cpuctx, sid->ctx))
list_add_tail(&event->active_list, &sid->ctx->flexible_active);
else
int ret = group_sched_in(event, sid->cpuctx, sid->ctx);
if (ret) {
sid->can_add_hw = 0;
sid->ctx->rotate_necessary = 1;
return 0;
}
list_add_tail(&event->active_list, &sid->ctx->flexible_active);
}
return 0;
@ -3690,24 +3699,17 @@ ctx_first_active(struct perf_event_context *ctx)
static bool perf_rotate_context(struct perf_cpu_context *cpuctx)
{
struct perf_event *cpu_event = NULL, *task_event = NULL;
bool cpu_rotate = false, task_rotate = false;
struct perf_event_context *ctx = NULL;
struct perf_event_context *task_ctx = NULL;
int cpu_rotate, task_rotate;
/*
* Since we run this from IRQ context, nobody can install new
* events, thus the event count values are stable.
*/
if (cpuctx->ctx.nr_events) {
if (cpuctx->ctx.nr_events != cpuctx->ctx.nr_active)
cpu_rotate = true;
}
ctx = cpuctx->task_ctx;
if (ctx && ctx->nr_events) {
if (ctx->nr_events != ctx->nr_active)
task_rotate = true;
}
cpu_rotate = cpuctx->ctx.rotate_necessary;
task_ctx = cpuctx->task_ctx;
task_rotate = task_ctx ? task_ctx->rotate_necessary : 0;
if (!(cpu_rotate || task_rotate))
return false;
@ -3716,7 +3718,7 @@ static bool perf_rotate_context(struct perf_cpu_context *cpuctx)
perf_pmu_disable(cpuctx->ctx.pmu);
if (task_rotate)
task_event = ctx_first_active(ctx);
task_event = ctx_first_active(task_ctx);
if (cpu_rotate)
cpu_event = ctx_first_active(&cpuctx->ctx);
@ -3724,17 +3726,17 @@ static bool perf_rotate_context(struct perf_cpu_context *cpuctx)
* As per the order given at ctx_resched() first 'pop' task flexible
* and then, if needed CPU flexible.
*/
if (task_event || (ctx && cpu_event))
ctx_sched_out(ctx, cpuctx, EVENT_FLEXIBLE);
if (task_event || (task_ctx && cpu_event))
ctx_sched_out(task_ctx, cpuctx, EVENT_FLEXIBLE);
if (cpu_event)
cpu_ctx_sched_out(cpuctx, EVENT_FLEXIBLE);
if (task_event)
rotate_ctx(ctx, task_event);
rotate_ctx(task_ctx, task_event);
if (cpu_event)
rotate_ctx(&cpuctx->ctx, cpu_event);
perf_event_sched_in(cpuctx, ctx, current);
perf_event_sched_in(cpuctx, task_ctx, current);
perf_pmu_enable(cpuctx->ctx.pmu);
perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
@ -8535,9 +8537,9 @@ static int perf_tp_event_match(struct perf_event *event,
if (event->hw.state & PERF_HES_STOPPED)
return 0;
/*
* All tracepoints are from kernel-space.
* If exclude_kernel, only trace user-space tracepoints (uprobes)
*/
if (event->attr.exclude_kernel)
if (event->attr.exclude_kernel && !user_mode(regs))
return 0;
if (!perf_tp_filter_match(event, data))
@ -9877,6 +9879,12 @@ static int pmu_dev_alloc(struct pmu *pmu)
if (ret)
goto del_dev;
if (pmu->attr_update)
ret = sysfs_update_groups(&pmu->dev->kobj, pmu->attr_update);
if (ret)
goto del_dev;
out:
return ret;
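
To show where attr_update fits, a hedged sketch of a PMU driver using it so that
optional events can be hidden via ->is_visible() once hardware capabilities are
known; every my_* name is hypothetical:

#include <linux/perf_event.h>
#include <linux/sysfs.h>

static bool my_hw_has_extra_event;	/* hypothetical capability flag */

static struct attribute *my_extra_event_attrs[] = {
	/* &my_event_attr.attr.attr, ... */
	NULL,
};

static umode_t my_event_visible(struct kobject *kobj,
				struct attribute *attr, int n)
{
	return my_hw_has_extra_event ? attr->mode : 0;
}

static const struct attribute_group my_extra_events = {
	.name		= "events",
	.attrs		= my_extra_event_attrs,
	.is_visible	= my_event_visible,
};

static const struct attribute_group *my_attr_update[] = {
	&my_extra_events,
	NULL,
};

static struct pmu my_pmu = {
	/* .attr_groups = <static groups>, */
	.attr_update	= my_attr_update,
	/* ... */
};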


@ -1336,7 +1336,7 @@ static inline void init_trace_event_call(struct trace_uprobe *tu,
call->event.funcs = &uprobe_funcs;
call->class->define_fields = uprobe_event_define_fields;
call->flags = TRACE_EVENT_FL_UPROBE;
call->flags = TRACE_EVENT_FL_UPROBE | TRACE_EVENT_FL_CAP_ANY;
call->class->reg = trace_uprobe_register;
call->data = tu;
}


@ -53,6 +53,7 @@ FEATURE_TESTS_BASIC := \
libpython \
libpython-version \
libslang \
libslang-include-subdir \
libcrypto \
libunwind \
pthread-attr-setaffinity-np \
@ -114,7 +115,6 @@ FEATURE_DISPLAY ?= \
numa_num_possible_cpus \
libperl \
libpython \
libslang \
libcrypto \
libunwind \
libdw-dwarf-unwind \


@ -31,6 +31,7 @@ FILES= \
test-libpython.bin \
test-libpython-version.bin \
test-libslang.bin \
test-libslang-include-subdir.bin \
test-libcrypto.bin \
test-libunwind.bin \
test-libunwind-debug-frame.bin \
@ -182,7 +183,10 @@ $(OUTPUT)test-libaudit.bin:
$(BUILD) -laudit
$(OUTPUT)test-libslang.bin:
$(BUILD) -I/usr/include/slang -lslang
$(BUILD) -lslang
$(OUTPUT)test-libslang-include-subdir.bin:
$(BUILD) -lslang
$(OUTPUT)test-libcrypto.bin:
$(BUILD) -lcrypto


@ -186,7 +186,7 @@
# include "test-disassembler-four-args.c"
#undef main
#define main main_test_zstd
#define main main_test_libzstd
# include "test-libzstd.c"
#undef main


@ -1,3 +1,4 @@
// SPDX-License-Identifier: GPL-2.0
#include <stdio.h>
int main(void)


@ -1,3 +1,4 @@
// SPDX-License-Identifier: GPL-2.0
#include <stdio.h>
int main(void)


@ -0,0 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include <slang/slang.h>
int main(void)
{
return SLsmg_init_smg();
}


@ -1,3 +1,4 @@
// SPDX-License-Identifier: GPL-2.0
#define _GNU_SOURCE
#include <sched.h>


@ -0,0 +1,75 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_CTYPE_H
#define _LINUX_CTYPE_H
/*
* NOTE! This ctype does not handle EOF like the standard C
* library is required to.
*/
#define _U 0x01 /* upper */
#define _L 0x02 /* lower */
#define _D 0x04 /* digit */
#define _C 0x08 /* cntrl */
#define _P 0x10 /* punct */
#define _S 0x20 /* white space (space/lf/tab) */
#define _X 0x40 /* hex digit */
#define _SP 0x80 /* hard space (0x20) */
extern const unsigned char _ctype[];
#define __ismask(x) (_ctype[(int)(unsigned char)(x)])
#define isalnum(c) ((__ismask(c)&(_U|_L|_D)) != 0)
#define isalpha(c) ((__ismask(c)&(_U|_L)) != 0)
#define iscntrl(c) ((__ismask(c)&(_C)) != 0)
static inline int __isdigit(int c)
{
return '0' <= c && c <= '9';
}
#define isdigit(c) __isdigit(c)
#define isgraph(c) ((__ismask(c)&(_P|_U|_L|_D)) != 0)
#define islower(c) ((__ismask(c)&(_L)) != 0)
#define isprint(c) ((__ismask(c)&(_P|_U|_L|_D|_SP)) != 0)
#define ispunct(c) ((__ismask(c)&(_P)) != 0)
/* Note: isspace() must return false for %NUL-terminator */
#define isspace(c) ((__ismask(c)&(_S)) != 0)
#define isupper(c) ((__ismask(c)&(_U)) != 0)
#define isxdigit(c) ((__ismask(c)&(_D|_X)) != 0)
#define isascii(c) (((unsigned char)(c))<=0x7f)
#define toascii(c) (((unsigned char)(c))&0x7f)
static inline unsigned char __tolower(unsigned char c)
{
if (isupper(c))
c -= 'A'-'a';
return c;
}
static inline unsigned char __toupper(unsigned char c)
{
if (islower(c))
c -= 'a'-'A';
return c;
}
#define tolower(c) __tolower(c)
#define toupper(c) __toupper(c)
/*
* Fast implementation of tolower() for internal usage. Do not use in your
* code.
*/
static inline char _tolower(const char c)
{
return c | 0x20;
}
/* Fast check for octal digit */
static inline int isodigit(const char c)
{
return c >= '0' && c <= '7';
}
#endif
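
A small userspace sketch exercising this tools/ copy of ctype; it assumes the
program links with tools/lib/ctype.c, which supplies the _ctype table:

#include <stdio.h>
#include <linux/ctype.h>	/* tools/include/linux/ctype.h */

int main(void)
{
	const char *s = "Perf 5.3-rc1";
	int digits = 0;

	for (; *s; s++)
		if (isdigit(*s))
			digits++;	/* counts 5, 3 and 1 */
	printf("%d digits; tolower('P') = '%c'\n", digits, tolower('P'));
	return 0;
}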


@ -102,6 +102,7 @@
int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
int scnprintf(char * buf, size_t size, const char * fmt, ...);
int scnprintf_pad(char * buf, size_t size, const char * fmt, ...);
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))


@ -7,6 +7,9 @@
void *memdup(const void *src, size_t len);
char **argv_split(const char *str, int *argcp);
void argv_free(char **argv);
int strtobool(const char *s, bool *res);
/*
@ -19,6 +22,8 @@ extern size_t strlcpy(char *dest, const char *src, size_t size);
char *str_error_r(int errnum, char *buf, size_t buflen);
char *strreplace(char *s, char old, char new);
/**
* strstarts - does @str start with @prefix?
* @str: string to examine
@ -29,4 +34,8 @@ static inline bool strstarts(const char *str, const char *prefix)
return strncmp(str, prefix, strlen(prefix)) == 0;
}
#endif /* _LINUX_STRING_H_ */
extern char * __must_check skip_spaces(const char *);
extern char *strim(char *);
#endif /* _TOOLS_LINUX_STRING_H_ */

tools/lib/argv_split.c (new file, 100 lines)

@ -0,0 +1,100 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Helper function for splitting a string into an argv-like array.
*/
#include <stdlib.h>
#include <linux/kernel.h>
#include <linux/ctype.h>
#include <linux/string.h>
static const char *skip_arg(const char *cp)
{
while (*cp && !isspace(*cp))
cp++;
return cp;
}
static int count_argc(const char *str)
{
int count = 0;
while (*str) {
str = skip_spaces(str);
if (*str) {
count++;
str = skip_arg(str);
}
}
return count;
}
/**
* argv_free - free an argv
* @argv - the argument vector to be freed
*
* Frees an argv and the strings it points to.
*/
void argv_free(char **argv)
{
char **p;
for (p = argv; *p; p++) {
free(*p);
*p = NULL;
}
free(argv);
}
/**
* argv_split - split a string at whitespace, returning an argv
* @str: the string to be split
* @argcp: returned argument count
*
* Returns an array of pointers to strings which are split out from
* @str. This is performed by strictly splitting on white-space; no
* quote processing is performed. Multiple whitespace characters are
* considered to be a single argument separator. The returned array
* is always NULL-terminated. Returns NULL on memory allocation
* failure.
*/
char **argv_split(const char *str, int *argcp)
{
int argc = count_argc(str);
char **argv = calloc(argc + 1, sizeof(*argv));
char **argvp;
if (argv == NULL)
goto out;
if (argcp)
*argcp = argc;
argvp = argv;
while (*str) {
str = skip_spaces(str);
if (*str) {
const char *p = str;
char *t;
str = skip_arg(str);
t = strndup(p, str-p);
if (t == NULL)
goto fail;
*argvp++ = t;
}
}
*argvp = NULL;
out:
return argv;
fail:
argv_free(argv);
return NULL;
}
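
A minimal usage sketch for the helpers above, assuming the program links with
tools/lib/argv_split.c, tools/lib/string.c (for skip_spaces()) and
tools/lib/ctype.c (for the isspace() table):

#include <stdio.h>
#include <linux/string.h>	/* tools/include/linux/string.h */

int main(void)
{
	int i, argc;
	char **argv = argv_split("  perf   record -a ", &argc);

	if (!argv)
		return 1;
	for (i = 0; i < argc; i++)	/* prints perf, record, -a */
		printf("argv[%d] = '%s'\n", i, argv[i]);
	argv_free(argv);
	return 0;
}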

tools/lib/ctype.c (new file, 35 lines)

@ -0,0 +1,35 @@
// SPDX-License-Identifier: GPL-2.0
/*
* linux/lib/ctype.c
*
* Copyright (C) 1991, 1992 Linus Torvalds
*/
#include <linux/ctype.h>
#include <linux/compiler.h>
const unsigned char _ctype[] = {
_C,_C,_C,_C,_C,_C,_C,_C, /* 0-7 */
_C,_C|_S,_C|_S,_C|_S,_C|_S,_C|_S,_C,_C, /* 8-15 */
_C,_C,_C,_C,_C,_C,_C,_C, /* 16-23 */
_C,_C,_C,_C,_C,_C,_C,_C, /* 24-31 */
_S|_SP,_P,_P,_P,_P,_P,_P,_P, /* 32-39 */
_P,_P,_P,_P,_P,_P,_P,_P, /* 40-47 */
_D,_D,_D,_D,_D,_D,_D,_D, /* 48-55 */
_D,_D,_P,_P,_P,_P,_P,_P, /* 56-63 */
_P,_U|_X,_U|_X,_U|_X,_U|_X,_U|_X,_U|_X,_U, /* 64-71 */
_U,_U,_U,_U,_U,_U,_U,_U, /* 72-79 */
_U,_U,_U,_U,_U,_U,_U,_U, /* 80-87 */
_U,_U,_U,_P,_P,_P,_P,_P, /* 88-95 */
_P,_L|_X,_L|_X,_L|_X,_L|_X,_L|_X,_L|_X,_L, /* 96-103 */
_L,_L,_L,_L,_L,_L,_L,_L, /* 104-111 */
_L,_L,_L,_L,_L,_L,_L,_L, /* 112-119 */
_L,_L,_L,_P,_P,_P,_P,_C, /* 120-127 */
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* 128-143 */
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, /* 144-159 */
_S|_SP,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P, /* 160-175 */
_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P, /* 176-191 */
_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U, /* 192-207 */
_U,_U,_U,_U,_U,_U,_U,_P,_U,_U,_U,_U,_U,_U,_U,_L, /* 208-223 */
_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L, /* 224-239 */
_L,_L,_L,_L,_L,_L,_L,_P,_L,_L,_L,_L,_L,_L,_L,_L}; /* 240-255 */


@ -17,6 +17,7 @@
#include <string.h>
#include <errno.h>
#include <linux/string.h>
#include <linux/ctype.h>
#include <linux/compiler.h>
/**
@ -106,3 +107,57 @@ size_t __weak strlcpy(char *dest, const char *src, size_t size)
}
return ret;
}
/**
* skip_spaces - Removes leading whitespace from @str.
* @str: The string to be stripped.
*
* Returns a pointer to the first non-whitespace character in @str.
*/
char *skip_spaces(const char *str)
{
while (isspace(*str))
++str;
return (char *)str;
}
/**
* strim - Removes leading and trailing whitespace from @s.
* @s: The string to be stripped.
*
* Note that the first trailing whitespace is replaced with a %NUL-terminator
* in the given string @s. Returns a pointer to the first non-whitespace
* character in @s.
*/
char *strim(char *s)
{
size_t size;
char *end;
size = strlen(s);
if (!size)
return s;
end = s + size - 1;
while (end >= s && isspace(*end))
end--;
*(end + 1) = '\0';
return skip_spaces(s);
}
/**
* strreplace - Replace all occurrences of character in string.
* @s: The string to operate on.
* @old: The character being replaced.
* @new: The character @old is replaced with.
*
* Returns pointer to the nul byte at the end of @s.
*/
char *strreplace(char *s, char old, char new)
{
for (; *s; ++s)
if (*s == old)
*s = new;
return s;
}
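
A quick hedged demonstration of the three helpers; note that strim() edits its
argument in place, so it needs writable storage, never a string literal:

#include <stdio.h>
#include <linux/string.h>	/* tools/include/linux/string.h */

int main(void)
{
	char buf[] = "  hello world\t ";
	char *s = strim(buf);			/* -> "hello world" */

	strreplace(s, ' ', '_');		/* -> "hello_world" */
	printf("'%s'\n", s);
	printf("'%s'\n", skip_spaces("   x"));	/* -> "x" */
	return 0;
}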


@ -1,5 +1,4 @@
// SPDX-License-Identifier: GPL-2.0
#include <ctype.h>
#include "symbol/kallsyms.h"
#include <stdio.h>
#include <stdlib.h>
@ -16,6 +15,19 @@ bool kallsyms__is_function(char symbol_type)
return symbol_type == 'T' || symbol_type == 'W';
}
/*
* Consume leading hex characters to build *long_val.
* Return the number of characters processed.
*/
int hex2u64(const char *ptr, u64 *long_val)
{
char *p;
*long_val = strtoull(ptr, &p, 16);
return p - ptr;
}
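
Hedged usage sketch: hex2u64() consumes leading hex digits and reports how many
characters it ate, which is how the kallsyms parser steps past the address
field:

#include <stdio.h>
#include "symbol/kallsyms.h"	/* tools/lib/symbol/kallsyms.h */

int main(void)
{
	u64 addr;
	const char *line = "ffffffff81000000 T _text";
	int used = hex2u64(line, &addr);

	/* used == 16; line + used + 1 points at the type field */
	printf("%#llx '%s'\n", (unsigned long long)addr, line + used + 1);
	return 0;
}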
int kallsyms__parse(const char *filename, void *arg,
int (*process_symbol)(void *arg, const char *name,
char type, u64 start))


@ -18,6 +18,8 @@ static inline u8 kallsyms2elf_binding(char type)
return isupper(type) ? STB_GLOBAL : STB_LOCAL;
}
int hex2u64(const char *ptr, u64 *long_val);
u8 kallsyms2elf_type(char type);
bool kallsyms__is_function(char symbol_type);


@ -23,3 +23,22 @@ int scnprintf(char * buf, size_t size, const char * fmt, ...)
return (i >= ssize) ? (ssize - 1) : i;
}
int scnprintf_pad(char * buf, size_t size, const char * fmt, ...)
{
ssize_t ssize = size;
va_list args;
int i;
va_start(args, fmt);
i = vscnprintf(buf, size, fmt, args);
va_end(args);
if (i < (int) size) {
/* pad with spaces, keeping the terminating NUL inside the buffer */
for (; i < (int) size - 1; i++)
buf[i] = ' ';
buf[i] = 0x0;
}
return (i >= ssize) ? (ssize - 1) : i;
}
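
A hedged usage sketch, assuming the declaration from tools/include/linux/kernel.h
and linking with tools/lib/vsprintf.c; the padding keeps fixed-width columns
aligned:

#include <stdio.h>
#include <linux/kernel.h>	/* tools/include/linux/kernel.h */

int main(void)
{
	char col[9];
	int n = scnprintf_pad(col, sizeof(col), "%d", 42);

	/* col holds "42" padded with spaces and NUL-terminated; n == 8 */
	printf("[%s] n=%d\n", col, n);
	return 0;
}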


@ -9,6 +9,7 @@ objtool-y += special.o
objtool-y += objtool.o
objtool-y += libstring.o
objtool-y += libctype.o
objtool-y += str_error_r.o
CFLAGS += -I$(srctree)/tools/lib
@ -17,6 +18,10 @@ $(OUTPUT)libstring.o: ../lib/string.c FORCE
$(call rule_mkdir)
$(call if_changed_dep,cc_o_c)
$(OUTPUT)libctype.o: ../lib/ctype.c FORCE
$(call rule_mkdir)
$(call if_changed_dep,cc_o_c)
$(OUTPUT)str_error_r.o: ../lib/str_error_r.c FORCE
$(call rule_mkdir)
$(call if_changed_dep,cc_o_c)


@ -0,0 +1,41 @@
Database Export
===============
perf tool's python scripting engine:
tools/perf/util/scripting-engines/trace-event-python.c
supports scripts:
tools/perf/scripts/python/export-to-sqlite.py
tools/perf/scripts/python/export-to-postgresql.py
which export data to a SQLite3 or PostgreSQL database.
The export process provides records with unique sequential ids, which allows the
data to be imported directly into a database while preserving the relationships
between tables.
Over time it is possible to continue to expand the export while maintaining
backward and forward compatibility, by following some simple rules:
1. Because of the nature of SQL, existing tables and columns can continue to be
used so long as the names and meanings (and to some extent data types) remain
the same.
2. New tables and columns can be added, without affecting existing SQL queries,
so long as the new names are unique.
3. Scripts that use a database (e.g. exported-sql-viewer.py) can maintain
backward compatibility by testing for the presence of new tables and columns
before using them, e.g. the IsSelectable() function in exported-sql-viewer.py.
4. The export scripts themselves maintain forward compatibility (i.e. an existing
script will continue to work with new versions of perf) by accepting a variable
number of arguments (e.g. def call_return_table(*x)), so that perf can pass more
arguments which old scripts will ignore.
5. The scripting engine tests for the existence of script handler functions
before calling them. The scripting engine can also test for the support of new
or optional features by checking for the existence and value of script global
variables.


@ -88,21 +88,51 @@ smaller.
To represent software control flow, "branches" samples are produced. By default
a branch sample is synthesized for every single branch. To get an idea what
data is available you can use the 'perf script' tool with no parameters, which
will list all the samples.
data is available you can use the 'perf script' tool with all itrace sampling
options, which will list all the samples.
perf record -e intel_pt//u ls
perf script
perf script --itrace=ibxwpe
An interesting field that is not printed by default is 'flags' which can be
displayed as follows:
perf script -Fcomm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,flags
perf script --itrace=ibxwpe -F+flags
The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
in transaction, respectively.
Another interesting field that is not printed by default is 'ipc' which can be
displayed as follows:
perf script --itrace=be -F+ipc
There are two ways that instructions-per-cycle (IPC) can be calculated depending
on the recording.
If the 'cyc' config term (see config terms section below) was used, then IPC is
calculated using the cycle count from CYC packets, otherwise MTC packets are
used - refer to the 'mtc' config term. When MTC is used, however, the values
are less accurate because the timing is less accurate.
Because Intel PT does not update the cycle count on every branch or instruction,
the values will often be zero. When there are values, they will be the number
of instructions and number of cycles since the last update, and thus represent
the average IPC since the last update for that event type. For example, if an
update reports 600 instructions and 200 cycles since the previous update, the
displayed IPC is 3.00. Note that IPC for "branches" events is calculated
separately from IPC for "instructions" events.
Also note that the IPC instruction count may or may not include the current
instruction. If the cycle count is associated with an asynchronous branch
(e.g. page fault or interrupt), then the instruction count does not include the
current instruction, otherwise it does. That is consistent with whether or not
that instruction has retired when the cycle count is updated.
Also note that, in the case of "branches" events, non-taken branches are not
presently sampled, so IPC values for them do not appear, e.g. for a CYC packet
followed by a TNT packet that starts with a non-taken branch. To see every
possible IPC value, "instructions" events can be used instead, e.g. --itrace=i0ns
While it is possible to create scripts to analyze the data, an alternative
approach is available to export the data to a sqlite or postgresql database.
Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
@ -713,7 +743,7 @@ Having no option is the same as
which, in turn, is the same as
--itrace=ibxwpe
--itrace=cepwx
The letters are:


@ -564,9 +564,12 @@ llvm.*::
llvm.clang-bpf-cmd-template::
Cmdline template. Below lines show its default value. Environment
variable is used to pass options.
"$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS \
-Wno-unused-value -Wno-pointer-sign -working-directory \
$WORKING_DIR -c $CLANG_SOURCE -target bpf -O2 -o -"
"$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\
"-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \
"$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
"-Wno-unused-value -Wno-pointer-sign " \
"-working-directory $WORKING_DIR " \
"-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
llvm.clang-opt::
Options passed to clang.


@ -90,9 +90,10 @@ OPTIONS
-c::
--compute::
Differential computation selection - delta, ratio, wdiff, delta-abs
(default is delta-abs). Default can be changed using diff.compute
config option. See COMPARISON METHODS section for more info.
Differential computation selection - delta, ratio, wdiff, cycles,
delta-abs (default is delta-abs). Default can be changed using
diff.compute config option. See COMPARISON METHODS section for
more info.
-p::
--period::
@ -142,12 +143,14 @@ OPTIONS
perf diff --time 0%-10%,30%-40%
It also supports analyzing samples within a given time window
<start>,<stop>. Times have the format seconds.microseconds. If 'start'
is not given (i.e., time string is ',x.y') then analysis starts at
the beginning of the file. If stop time is not given (i.e, time
string is 'x.y,') then analysis goes to the end of the file. Time string is
'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for different
perf.data files.
<start>,<stop>. Times have the format seconds.nanoseconds. If 'start'
is not given (i.e. time string is ',x.y') then analysis starts at
the beginning of the file. If stop time is not given (i.e. time
string is 'x.y,') then analysis goes to the end of the file.
Multiple ranges can be separated by spaces, which requires the argument
to be quoted e.g. --time "1234.567,1234.789 1235,"
Time string is 'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps
for different perf.data files.
For example, we get the timestamp information from 'perf script'.
@ -278,6 +281,16 @@ If specified the 'Weighted diff' column is displayed with value 'd' computed as:
- WEIGHT-A being the weight of the data file
- WEIGHT-B being the weight of the baseline data file
cycles
~~~~~~
If specified the '[Program Block Range] Cycles Diff' column is displayed.
It displays the cycles difference of the same program basic block between
two perf.data files. A program basic block is the code between two branches.
'[Program Block Range]' indicates the range of a program basic block.
The source line is reported if it can be found, otherwise symbol+offset is
used instead.
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1]


@ -490,6 +490,17 @@ Configure all used events to run in kernel space.
--all-user::
Configure all used events to run in user space.
--kernel-callchains::
Collect callchains only from kernel space. I.e. this option sets
perf_event_attr.exclude_callchain_user to 1.
--user-callchains::
Collect callchains only from user space. I.e. this option sets
perf_event_attr.exclude_callchain_kernel to 1.
Don't use both --kernel-callchains and --user-callchains at the same time or no
callchains will be collected.
--timestamp-filename
Append timestamp to output file name.


@ -89,7 +89,7 @@ OPTIONS
- socket: processor socket number the task ran at the time of sample
- srcline: filename and line number executed at the time of sample. The
DWARF debugging info must be provided.
- srcfile: file name of the source file of the same. Requires dwarf
- srcfile: file name of the source file of the samples. Requires dwarf
information.
- weight: Event specific weight, e.g. memory latency or transaction
abort cost. This is the global weight.
@ -412,12 +412,13 @@ OPTIONS
--time::
Only analyze samples within given time window: <start>,<stop>. Times
have the format seconds.microseconds. If start is not given (i.e., time
have the format seconds.nanoseconds. If start is not given (i.e. time
string is ',x.y') then analysis starts at the beginning of the file. If
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
stop time is not given (i.e. time string is 'x.y,') then analysis goes
to end of file. Multiple ranges can be separated by spaces, which
requires the argument to be quoted e.g. --time "1234.567,1234.789 1235,"
Also support time percent with multiple time range. Time string is
Also support time percent with multiple time ranges. Time string is
'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
For example:


@ -117,7 +117,7 @@ OPTIONS
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode.
brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc, srccode, ipc.
Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
e.g., -F sw:comm,tid,time,ip,sym and -F trace:time,cpu,trace
@ -203,6 +203,9 @@ OPTIONS
The synth field is used by synthesized events which may be created during
Instruction Trace decoding.
The ipc (instructions per cycle) field is synthesized and may have a value during
Instruction Trace decoding.
Finally, a user may not set fields to none for all event types.
i.e., -F "" is not allowed.
@ -313,6 +316,9 @@ OPTIONS
--show-round-events
Display finished round events i.e. events of type PERF_RECORD_FINISHED_ROUND.
--show-bpf-events
Display bpf events i.e. events of type PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT.
--demangle::
Demangle symbol names to human readable form. It's enabled by default,
disable with --no-demangle.
@ -355,12 +361,13 @@ include::itrace.txt[]
--time::
Only analyze samples within given time window: <start>,<stop>. Times
have the format seconds.microseconds. If start is not given (i.e., time
have the format seconds.nanoseconds. If start is not given (i.e. time
string is ',x.y') then analysis starts at the beginning of the file. If
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
stop time is not given (i.e. time string is 'x.y,') then analysis goes
to end of file. Multiple ranges can be separated by spaces, which
requires the argument to be quoted e.g. --time "1234.567,1234.789 1235,"
Also support time percent with multipe time range. Time string is
Also support time percent with multiple time ranges. Time string is
'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
For example:


@ -200,6 +200,13 @@ use --per-socket in addition to -a. (system-wide). The output includes the
socket number and the number of online processors on that socket. This is
useful to gauge the amount of aggregation.
--per-die::
Aggregate counts per processor die for system-wide mode measurements. This
is a useful mode to detect imbalance between dies. To enable this mode,
use --per-die in addition to -a. (system-wide). The output includes the
die number and the number of online processors on that die. This is
useful to gauge the amount of aggregation.
--per-core::
Aggregate counts per physical processor for system-wide mode measurements. This
is a useful mode to detect imbalance between physical cores. To enable this mode,
@ -239,6 +246,9 @@ Input file name.
--per-socket::
Aggregate counts per processor socket for system-wide mode measurements.
--per-die::
Aggregate counts per processor die for system-wide mode measurements.
--per-core::
Aggregate counts per physical processor for system-wide mode measurements.


Default is to monitor all CPUs.
The number of threads to run when synthesizing events for existing processes.
By default, the number of threads equals the number of online CPUs.
--namespaces::
Record events of type PERF_RECORD_NAMESPACES and display them with the
'cgroup_id' sort key.
INTERACTIVE PROMPTING KEYS
--------------------------


@ -151,25 +151,45 @@ struct {
HEADER_CPU_TOPOLOGY = 13,
String lists defining the core and CPU threads topology.
The string lists are followed by a variable length array
which contains core_id and socket_id of each cpu.
The number of entries can be determined by the size of the
section minus the sizes of both string lists.
struct {
/*
* First revision of HEADER_CPU_TOPOLOGY
*
* See 'struct perf_header_string_list' definition earlier
* in this file.
*/
struct perf_header_string_list cores; /* Variable length */
struct perf_header_string_list threads; /* Variable length */
/*
* Second revision of HEADER_CPU_TOPOLOGY, older tools
* will not consider what comes next
*/
struct {
uint32_t core_id;
uint32_t socket_id;
} cpus[nr]; /* Variable length records */
/* 'nr' comes from previously processed HEADER_NR_CPUS's nr_cpu_avail */
/*
* Third revision of HEADER_CPU_TOPOLOGY, older tools
* will not consider what comes next
*/
struct perf_header_string_list dies; /* Variable length */
uint32_t die_id[nr_cpus_avail]; /* from previously processed HEADER_NR_CPUS, VLA */
};
Example:
sibling cores : 0-3
sibling sockets : 0-8
sibling dies : 0-3
sibling dies : 4-7
sibling threads : 0-1
sibling threads : 2-3
sibling threads : 4-5
sibling threads : 6-7
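
A hedged reader sketch for the third-revision tail: per-CPU die ids follow the
dies string list, and a pre-revision-3 file simply ends early. Here do_read_u32()
stands in for perf's internal header reader and struct feat_fd for its read
cursor; treat both, and the function itself, as assumptions:

#include <linux/types.h>

static int read_die_ids(struct feat_fd *ff, u32 nr_cpus_avail, u32 *die_id)
{
	u32 cpu;

	for (cpu = 0; cpu < nr_cpus_avail; cpu++)
		if (do_read_u32(ff, &die_id[cpu]))
			return -1;	/* short read: pre-rev-3 file */
	return 0;
}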
HEADER_NUMA_TOPOLOGY = 14,
@ -272,6 +292,69 @@ struct {
Two uint64_t for the time of first sample and the time of last sample.
HEADER_SAMPLE_TOPOLOGY = 22,
Physical memory map and its node assignments.
The format of data in MEM_TOPOLOGY is as follows:
0 - version | for future changes
8 - block_size_bytes | /sys/devices/system/memory/block_size_bytes
16 - count | number of nodes
For each node we store map of physical indexes:
32 - node id | node index
40 - size | size of bitmap
48 - bitmap | bitmap of memory indexes that belong to the node
| /sys/devices/system/node/node<NODE>/memory<INDEX>
The MEM_TOPOLOGY can be displayed with the following command:
$ perf report --header-only -I
...
# memory nodes (nr 1, block size 0x8000000):
# 0 [7G]: 0-23,32-69
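
For orientation, a hedged C view of the record layout described by the offsets
above; struct and field names are illustrative, not taken from the perf sources:

#include <linux/types.h>

struct mem_topology_hdr {
	__u64 version;			/*  0: format version, for future changes */
	__u64 block_size_bytes;		/*  8: .../memory/block_size_bytes         */
	__u64 count;			/* 16: number of node records              */
};

struct mem_topology_node {		/* repeated 'count' times */
	__u64 id;			/* node index                              */
	__u64 size;			/* size of the following bitmap            */
	__u64 bitmap[];			/* memory<INDEX> membership for the node   */
};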
HEADER_CLOCKID = 23,
One uint64_t for the clockid frequency, specified, for instance, via 'perf
record -k' (see clock_gettime()), to enable timestamps derived metrics
conversion into wall clock time on the reporting stage.
HEADER_DIR_FORMAT = 24,
The data files layout is described by the HEADER_DIR_FORMAT feature. Currently it
holds only a version number:
uint64_t version;
Version 1 means that data files:
- Follow the 'data.*' name format.
- Contain raw events data in standard perf format as read from kernel (and need
to be sorted)
Future versions are expected to describe different data files layout according
to special needs.
HEADER_BPF_PROG_INFO = 25,
struct bpf_prog_info_linear, which contains detailed information about
a BPF program, including type, id, tag, jited/xlated instructions, etc.
HEADER_BPF_BTF = 26,
Contains BPF Type Format (BTF). For more information about BTF, please
refer to Documentation/bpf/btf.rst.
struct {
u32 id;
u32 data_size;
char data[];
};
HEADER_COMPRESSED = 27,
struct {


@ -38,6 +38,6 @@ To report cacheline events from previous recording: perf c2c report
To browse sample contexts use perf report --sample 10 and select in context menu
To separate samples by time use perf report --sort time,overhead,sym
To set sample time separation other than 100ms with --sort time use --time-quantum
Add -I to perf report to sample register values visible in perf report context.
Add -I to perf record to sample register values, which will be visible in perf report sample context.
To show IPC for sampling periods use perf record -e '{cycles,instructions}:S' and then browse context
To show context switches in perf report sample context add --switch-events to perf record.


@ -7,6 +7,8 @@ tools/lib/traceevent
tools/lib/api
tools/lib/bpf
tools/lib/subcmd
tools/lib/argv_split.c
tools/lib/ctype.c
tools/lib/hweight.c
tools/lib/rbtree.c
tools/lib/string.c


@ -417,6 +417,9 @@ ifdef CORESIGHT
$(call feature_check,libopencsd)
ifeq ($(feature-libopencsd), 1)
CFLAGS += -DHAVE_CSTRACE_SUPPORT $(LIBOPENCSD_CFLAGS)
ifeq ($(feature-reallocarray), 0)
CFLAGS += -DCOMPAT_NEED_REALLOCARRAY
endif
LDFLAGS += $(LIBOPENCSD_LDFLAGS)
EXTLIBS += $(OPENCSDLIBS)
$(call detected,CONFIG_LIBOPENCSD)
@ -641,11 +644,15 @@ endif
ifndef NO_SLANG
ifneq ($(feature-libslang), 1)
msg := $(warning slang not found, disables TUI support. Please install slang-devel, libslang-dev or libslang2-dev);
NO_SLANG := 1
else
ifneq ($(feature-libslang-include-subdir), 1)
msg := $(warning slang not found, disables TUI support. Please install slang-devel, libslang-dev or libslang2-dev);
NO_SLANG := 1
else
CFLAGS += -DHAVE_SLANG_INCLUDE_SUBDIR
endif
endif
ifndef NO_SLANG
# Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h
CFLAGS += -I/usr/include/slang
CFLAGS += -DHAVE_SLANG_SUPPORT
EXTLIBS += -lslang
$(call detected,CONFIG_SLANG)


@ -420,6 +420,24 @@ fadvise_advice_tbl := $(srctree)/tools/perf/trace/beauty/fadvise.sh
$(fadvise_advice_array): $(linux_uapi_dir)/in.h $(fadvise_advice_tbl)
$(Q)$(SHELL) '$(fadvise_advice_tbl)' $(linux_uapi_dir) > $@
fsmount_arrays := $(beauty_outdir)/fsmount_arrays.c
fsmount_tbls := $(srctree)/tools/perf/trace/beauty/fsmount.sh
$(fsmount_arrays): $(linux_uapi_dir)/fs.h $(fsmount_tbls)
$(Q)$(SHELL) '$(fsmount_tbls)' $(linux_uapi_dir) > $@
fspick_arrays := $(beauty_outdir)/fspick_arrays.c
fspick_tbls := $(srctree)/tools/perf/trace/beauty/fspick.sh
$(fspick_arrays): $(linux_uapi_dir)/fs.h $(fspick_tbls)
$(Q)$(SHELL) '$(fspick_tbls)' $(linux_uapi_dir) > $@
fsconfig_arrays := $(beauty_outdir)/fsconfig_arrays.c
fsconfig_tbls := $(srctree)/tools/perf/trace/beauty/fsconfig.sh
$(fsconfig_arrays): $(linux_uapi_dir)/fs.h $(fsconfig_tbls)
$(Q)$(SHELL) '$(fsconfig_tbls)' $(linux_uapi_dir) > $@
pkey_alloc_access_rights_array := $(beauty_outdir)/pkey_alloc_access_rights_array.c
asm_generic_hdr_dir := $(srctree)/tools/include/uapi/asm-generic/
pkey_alloc_access_rights_tbl := $(srctree)/tools/perf/trace/beauty/pkey_alloc_access_rights.sh
@ -494,6 +512,12 @@ mount_flags_tbl := $(srctree)/tools/perf/trace/beauty/mount_flags.sh
$(mount_flags_array): $(linux_uapi_dir)/fs.h $(mount_flags_tbl)
$(Q)$(SHELL) '$(mount_flags_tbl)' $(linux_uapi_dir) > $@
move_mount_flags_array := $(beauty_outdir)/move_mount_flags_array.c
move_mount_flags_tbl := $(srctree)/tools/perf/trace/beauty/move_mount_flags.sh
$(move_mount_flags_array): $(linux_uapi_dir)/fs.h $(move_mount_flags_tbl)
$(Q)$(SHELL) '$(move_mount_flags_tbl)' $(linux_uapi_dir) > $@
prctl_option_array := $(beauty_outdir)/prctl_option_array.c
prctl_hdr_dir := $(srctree)/tools/include/uapi/linux/
prctl_option_tbl := $(srctree)/tools/perf/trace/beauty/prctl_option.sh
@ -526,6 +550,12 @@ arch_errno_tbl := $(srctree)/tools/perf/trace/beauty/arch_errno_names.sh
$(arch_errno_name_array): $(arch_errno_tbl)
$(Q)$(SHELL) '$(arch_errno_tbl)' $(CC) $(arch_errno_hdr_dir) > $@
sync_file_range_arrays := $(beauty_outdir)/sync_file_range_arrays.c
sync_file_range_tbls := $(srctree)/tools/perf/trace/beauty/sync_file_range.sh
$(sync_file_range_arrays): $(linux_uapi_dir)/fs.h $(sync_file_range_tbls)
$(Q)$(SHELL) '$(sync_file_range_tbls)' $(linux_uapi_dir) > $@
all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
# Create python binding output directory if not already present
@ -629,6 +659,9 @@ build-dir = $(if $(__build-dir),$(__build-dir),.)
prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioctl_array) \
$(fadvise_advice_array) \
$(fsconfig_arrays) \
$(fsmount_arrays) \
$(fspick_arrays) \
$(pkey_alloc_access_rights_array) \
$(sndrv_pcm_ioctl_array) \
$(sndrv_ctl_ioctl_array) \
@ -639,12 +672,14 @@ prepare: $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)common-cmds.h archheaders $(drm_ioc
$(madvise_behavior_array) \
$(mmap_flags_array) \
$(mount_flags_array) \
$(move_mount_flags_array) \
$(perf_ioctl_array) \
$(prctl_option_array) \
$(usbdevfs_ioctl_array) \
$(x86_arch_prctl_code_array) \
$(rename_flags_array) \
$(arch_errno_name_array)
$(arch_errno_name_array) \
$(sync_file_range_arrays)
$(OUTPUT)%.o: %.c prepare FORCE
$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
@ -923,9 +958,13 @@ clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clea
$(OUTPUT)tests/llvm-src-{base,kbuild,prologue,relocation}.c \
$(OUTPUT)pmu-events/pmu-events.c \
$(OUTPUT)$(fadvise_advice_array) \
$(OUTPUT)$(fsconfig_arrays) \
$(OUTPUT)$(fsmount_arrays) \
$(OUTPUT)$(fspick_arrays) \
$(OUTPUT)$(madvise_behavior_array) \
$(OUTPUT)$(mmap_flags_array) \
$(OUTPUT)$(mount_flags_array) \
$(OUTPUT)$(move_mount_flags_array) \
$(OUTPUT)$(drm_ioctl_array) \
$(OUTPUT)$(pkey_alloc_access_rights_array) \
$(OUTPUT)$(sndrv_ctl_ioctl_array) \
@ -939,7 +978,8 @@ clean:: $(LIBTRACEEVENT)-clean $(LIBAPI)-clean $(LIBBPF)-clean $(LIBSUBCMD)-clea
$(OUTPUT)$(usbdevfs_ioctl_array) \
$(OUTPUT)$(x86_arch_prctl_code_array) \
$(OUTPUT)$(rename_flags_array) \
$(OUTPUT)$(arch_errno_name_array)
$(OUTPUT)$(arch_errno_name_array) \
$(OUTPUT)$(sync_file_range_arrays)
$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
#

Some files were not shown because too many files have changed in this diff.