linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-10-28 15:20:41 +00:00

Author	SHA1	Message	Date
Yicong Yang	1b0e3ea930	perf/smmuv3: Add MODULE_ALIAS for module auto loading On my ACPI based arm64 server, if the SMMUv3 PMU is configured as module it won't be loaded automatically after booting even if the device has already been scanned and added. It's because the module lacks a platform alias, the uevent mechanism and userspace tools like udevd make use of this to find the target driver module of the device. This patch adds the missing platform alias of the module, then module will be loaded automatically if device exists. Before this patch: [root@localhost tmp]# modinfo arm_smmuv3_pmu \| grep alias alias: of:NTCarm,smmu-v3-pmcgC* alias: of:NTCarm,smmu-v3-pmcg After this patch: [root@localhost tmp]# modinfo arm_smmuv3_pmu \| grep alias alias: platform:arm-smmu-v3-pmcg alias: of:NTCarm,smmu-v3-pmcgC* alias: of:NTCarm,smmu-v3-pmcg Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20230814131642.65263-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-08-15 12:53:04 +01:00
Yicong Yang	0242737dc4	perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09 Some HiSilicon SMMU PMCG suffers the erratum 162001900 that the PMU disable control sometimes fail to disable the counters. This will lead to error or inaccurate data since before we enable the counters the counter's still counting for the event used in last perf session. This patch tries to fix this by hardening the global disable process. Before disable the PMU, writing an invalid event type (0xffff) to focibly stop the counters. Correspondingly restore each events on pmu::pmu_enable(). Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20230814124012.58013-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-08-15 12:50:53 +01:00
Anshuman Khandual	90d6867722	perf: pmuv3: Remove comments from armv8pmu_[enable\|disable]_event() The comments in armv8pmu_[enable\|disable]_event() are blindingly obvious, and does not contribute in making things any better. Let's drop them off. Functional change is not intended. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20230802090853.1190391-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-08-04 17:23:57 +01:00
Robin Murphy	ac18ea1a89	perf/arm-cmn: Add CMN-700 r3 support CMN-700 r3 has a special configuration option for a so-called "Super Home Node", which is a superset of the standard HN-F that also manages remote-chip coherency for multi-chip setups. As such it has a similar but expanded set of PMU events compared to HN-F, with some additional filtering options to boot. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/49153b72253f6af0e625cb55b9e1b825b110c49c.1688746690.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-07-28 14:31:06 +01:00
Robin Murphy	b1b7dc38e4	perf/arm-cmn: Refactor HN-F event selector macros Refactor the macros for defining HN-F events with additional selectors, so they can be shared with another upcoming similar-but-distinct HN type. No functional change intended. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/0f05327941e06c665dbfd47e03fad29276b9e63c.1688746690.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-07-28 14:31:06 +01:00
Robin Murphy	00df90934c	perf/arm-cmn: Remove spurious event aliases As the name suggests, the "partial DAT flit" event is only counted for the DAT channel, and furthermore is only applicable to device ports, not mesh links (strictly it's only device ports with CHI-A requesters connected, but detecting that degree of detail is more bother than it's worth). Stop generating spurious event aliases for other combinations which aren't meaningful. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/b01a58e3ff05c322547fbfd015f6dbfedf555ed3.1688746690.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-07-28 14:31:05 +01:00
Rob Herring	918dc87b74	drivers/perf: Explicitly include correct DT includes The DT of_device.h and of_platform.h date back to the separate of_platform_bus_type before it as merged into the regular platform bus. As part of that merge prepping Arm DT support 13 years ago, they "temporarily" include each other. They also include platform_device.h and of.h. As a result, there's a pretty much random mix of those include files used throughout the tree. In order to detangle these headers and replace the implicit includes with struct declarations, users need to explicitly include the correct includes. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230714174832.4061752-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-07-27 13:02:23 +01:00
Rob Herring	989567fc0f	perf: pmuv3: Add Cortex A520, A715, A720, X3 and X4 PMUs Add support for the Arm Cortex-A520, Cortex-A715, Cortex-A720, Cortex-X3, and Cortex-X4 CPU PMUs. They are straight-forward additions with just new compatible strings. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230706205505.308523-2-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-07-27 13:01:19 +01:00
Vincent Whitchurch	7c3f204e54	perf/smmuv3: Remove build dependency on ACPI This driver supports working without ACPI since commit `3f7be43561` ("perf/smmuv3: Add devicetree support"), so remove the build dependency. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20230706-smmuv3-pmu-noacpi-v1-1-7083ef189158@axis.com Signed-off-by: Will Deacon <will@kernel.org>	2023-07-27 13:00:49 +01:00
Yangtao Li	c47ea342d8	perf: xgene_pmu: Convert to devm_platform_ioremap_resource() Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <frank.li@vivo.com> Link: https://lore.kernel.org/r/20230704093556.17926-1-frank.li@vivo.com Signed-off-by: Will Deacon <will@kernel.org>	2023-07-27 13:00:10 +01:00
Jing Zhang	cbbc6fdd85	driver/perf: Add identifier sysfs file for Yitian 710 DDR To allow userspace to identify the specific implementation of the device, add an "identifier" sysfs file. The perf tool can match the Yitian 710 DDR metric through the identifier. Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com> Acked-by: Ian Rogers <irogers@google.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/1687245156-61215-2-git-send-email-renyu.zj@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>	2023-07-27 12:57:04 +01:00
James Clark	80391d8c38	arm_pmu: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability Since commit `bd27568117` ("perf: Rewrite core context handling") the relationship between perf_event_context and PMUs has changed so that the error scenario that PERF_PMU_CAP_HETEROGENEOUS_CPUS originally silenced no longer exists. Remove the capability and associated comment to avoid confusion that it actually influences any perf core behavior. This change should be a no-op. Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230724134500.970496-4-james.clark@arm.com	2023-07-26 12:28:47 +02:00
James Clark	5c81672865	arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability This capability gives us the ability to open PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a specific PMU for free. All the implementation is contained in the Perf core and tool code so no change to the Arm PMU driver is needed. The following basic use case now results in Perf opening the event on all PMUs rather than picking only one in an unpredictable way: $ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2 Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2': 963279620 armv8_cortex_a57/cycles/ (99.19%) 752745657 armv8_cortex_a53/cycles/ (94.80%) Fixes: `55bcf6ef31` ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE") Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20230724134500.970496-2-james.clark@arm.com	2023-07-26 12:28:46 +02:00
Eric Lin	66843b14fb	perf: RISC-V: Remove PERF_HES_STOPPED flag checking in riscv_pmu_start() Since commit `096b52fd2b` ("perf: RISC-V: throttle perf events") the perf_sample_event_took() function was added to report time spent in overflow interrupts. If the interrupt takes too long, the perf framework will lower the sysctl_perf_event_sample_rate and max_samples_per_tick. When hwc->interrupts is larger than max_samples_per_tick, the hwc->interrupts will be set to MAX_INTERRUPTS, and events will be throttled within the __perf_event_account_interrupt() function. However, the RISC-V PMU driver doesn't call riscv_pmu_stop() to update the PERF_HES_STOPPED flag after perf_event_overflow() in pmu_sbi_ovf_handler() function to avoid throttling. When the perf framework unthrottled the event in the timer interrupt handler, it triggers riscv_pmu_start() function and causes a WARN_ON_ONCE() warning, as shown below: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 240 at drivers/perf/riscv_pmu.c:184 riscv_pmu_start+0x7c/0x8e Modules linked in: CPU: 0 PID: 240 Comm: ls Not tainted 6.4-rc4-g19d0788e9ef2 #1 Hardware name: SiFive (DT) epc : riscv_pmu_start+0x7c/0x8e ra : riscv_pmu_start+0x28/0x8e epc : ffffffff80aef864 ra : ffffffff80aef810 sp : ffff8f80004db6f0 gp : ffffffff81c83750 tp : ffffaf80069f9bc0 t0 : ffff8f80004db6c0 t1 : 0000000000000000 t2 : 000000000000001f s0 : ffff8f80004db720 s1 : ffffaf8008ca1068 a0 : 0000ffffffffffff a1 : 0000000000000000 a2 : 0000000000000001 a3 : 0000000000000870 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000840 a7 : 0000000000000030 s2 : 0000000000000000 s3 : ffffaf8005165800 s4 : ffffaf800424da00 s5 : ffffffffffffffff s6 : ffffffff81cc7590 s7 : 0000000000000000 s8 : 0000000000000006 s9 : 0000000000000001 s10: ffffaf807efbc340 s11: ffffaf807efbbf00 t3 : ffffaf8006a16028 t4 : 00000000dbfbb796 t5 : 0000000700000000 t6 : ffffaf8005269870 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff80aef864>] riscv_pmu_start+0x7c/0x8e [<ffffffff80185b56>] perf_adjust_freq_unthr_context+0x15e/0x174 [<ffffffff80188642>] perf_event_task_tick+0x88/0x9c [<ffffffff800626a8>] scheduler_tick+0xfe/0x27c [<ffffffff800b5640>] update_process_times+0x9a/0xba [<ffffffff800c5bd4>] tick_sched_handle+0x32/0x66 [<ffffffff800c5e0c>] tick_sched_timer+0x64/0xb0 [<ffffffff800b5e50>] __hrtimer_run_queues+0x156/0x2f4 [<ffffffff800b6bdc>] hrtimer_interrupt+0xe2/0x1fe [<ffffffff80acc9e8>] riscv_timer_interrupt+0x38/0x42 [<ffffffff80090a16>] handle_percpu_devid_irq+0x90/0x1d2 [<ffffffff8008a9f4>] generic_handle_domain_irq+0x28/0x36 After referring other PMU drivers like Arm, Loongarch, Csky, and Mips, they don't call _pmu_stop() to update with PERF_HES_STOPPED flag after perf_event_overflow() function nor do they add PERF_HES_STOPPED flag checking in _pmu_start() which don't cause this warning. Thus, it's recommended to remove this unnecessary check in riscv_pmu_start() function to prevent this warning. Signed-off-by: Eric Lin <eric.lin@sifive.com> Link: https://lore.kernel.org/r/20230710154328.19574-1-eric.lin@sifive.com Fixes: `096b52fd2b` ("perf: RISC-V: throttle perf events") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-07-12 07:41:23 -07:00
Linus Torvalds	d25f002575	cxl for v6.5 - Add infrastructure for supporting background commands along with support for device sanitization and firmware update - Introduce a CXL performance monitoring unit driver based on the common definition in the specification. - Land some preparatory cleanup and refactoring for the anticipated arrival of CXL type-2 (accelerator devices) and CXL RCH (CXL-v1.1 topology) error handling. - Rework CPU cache management with respect to region configuration (device hotplug or other dynamic changes to memory interleaving) - Fix region reconfiguration vs CXL decoder ordering rules. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZJ9fkQAKCRDfioYZHlFs ZyWcAP9THJ6ZzX1mbAfHhPz9r+oxsrE3l1jQpNjNbh7MNW29MAEA36dmTE62JaHK lTPDgHxqBt1vrHPktYWOM9ZPHE2tLwA= =3fFL -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL updates from Dan Williams: "The highlights in terms of new functionality are support for the standard CXL Performance Monitor definition that appeared in CXL 3.0, support for device sanitization (wiping all data from a device), secure-erase (re-keying encryption of user data), and support for firmware update. The firmware update support is notable as it reuses the simple sysfs_upload interface to just cat(1) a blob to a sysfs file and pipe that to the device. Additionally there are a substantial number of cleanups and reorganizations to get ready for RCH error handling (RCH == Restricted CXL Host == current shipping hardware generation / pre CXL-2.0 topologies) and type-2 (accelerator / vendor specific) devices. For vendor specific devices they implement a subset of what the generic type-3 (generic memory expander) driver expects. As a result the rework decouples optional infrastructure from the core driver context. For RCH topologies, where the specification working group did not want to confuse pre-CXL-aware operating systems, many of the standard registers are hidden which makes support standard bus features like AER (PCIe Advanced Error Reporting) difficult. The rework arranges for the driver to help the PCI-AER core. Bjorn is on board with this direction but a late regression disocvery means the completion of this functionality needs to cook a bit longer, so it is code reorganizations only for now. Summary: - Add infrastructure for supporting background commands along with support for device sanitization and firmware update - Introduce a CXL performance monitoring unit driver based on the common definition in the specification. - Land some preparatory cleanup and refactoring for the anticipated arrival of CXL type-2 (accelerator devices) and CXL RCH (CXL-v1.1 topology) error handling. - Rework CPU cache management with respect to region configuration (device hotplug or other dynamic changes to memory interleaving) - Fix region reconfiguration vs CXL decoder ordering rules" * tag 'cxl-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (51 commits) cxl: Fix one kernel-doc comment cxl/pci: Use correct flag for sanitize polling docs: perf: Minimal introduction the the CXL PMU device and driver perf: CXL Performance Monitoring Unit driver tools/testing/cxl: add firmware update emulation to CXL memdevs tools/testing/cxl: Use named effects for the Command Effect Log tools/testing/cxl: Fix command effects for inject/clear poison cxl: add a firmware update mechanism using the sysfs firmware loader cxl/test: Add Secure Erase opcode support cxl/mem: Support Secure Erase cxl/test: Add Sanitize opcode support cxl/mem: Wire up Sanitization support cxl/mbox: Add sanitization handling machinery cxl/mem: Introduce security state sysfs file cxl/mbox: Allow for IRQ_NONE case in the isr Revert "cxl/port: Enable the HDM decoder capability for switch ports" cxl/memdev: Formalize endpoint port linkage cxl/pci: Unconditionally unmask 256B Flit errors cxl/region: Manage decoder target_type at decoder-attach time cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM ...	2023-07-01 08:58:41 -07:00
Linus Torvalds	533925cb76	RISC-V Patches for the 6.5 Merge Window, Part 1 * Support for ACPI. * Various cleanups to the ISA string parsing, including making them case-insensitive * Support for the vector extension. * Support for independent irq/softirq stacks. * Our CPU DT binding now has "unevaluatedProperties: false" -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmSe70ATHHBhbG1lckBk YWJiZWx0LmNvbQAKCRAuExnzX7sYiWNPD/0ZfSdQ0A/gMVOzAD4zFKPEqQ6ffW2V Zy6Jo7UDNqKsiai7QA4XB1uyYIv/y1yUKJ0oeBVcA9Nzyq+TW9QDcApDBTabxAUI agY19YKw6VVZ+p7I9sMsf6EbdJdkNfSAzcQACPxb4ScEoaf9X+oAK5qgXuRuWluh qQuVkkJlgWc/t1cuUkrRdJmHQYvjP3zL7z4o344q2IVpXJkNNu0GeP+HbF8BYKcA +I/TTA5JY3kCIaxkpF2rU6pE6T5T9xrPmRYZ7bZoPUPnbL+M8As/jx3ym52Y4WGp kf8pgkxixOjU64kVJOH66CA8GaOiaAH/ptjQb0ZmCaGrHhr7aOT9HrkX4rU1lS8T stPphfM4gGPcCoPgRqSl+mEhBzjII8maOBLtbricAoQi6efRq8fzoOGaif/QpCbc 6n0LGS4nQPGVyD3rAPfHxxfrlGJR+SsgyDvjZoDhqauFglims14GnK+eBeO8zrui Aj/uuAS63VIYprJWC1NOBJlU2WKZiOGhCANpZ6W6SH21PYn2WjsVILqaGh+WN8ZO KOHxZNaN8fQag0Yg7oNAUb7l6S0DHYtJIksFnFW2Rf2+VT58RAMYRQbpbhr7Tqr+ jLgIR8PkFrBERHE49IqLGhAxGDnNzAUysMRw9pIk7WIre2Jt4wPqUdl+ee+5ErIX jiYfSFZw9q28UA== =Fpq8 -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-6.5-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for ACPI - Various cleanups to the ISA string parsing, including making them case-insensitive - Support for the vector extension - Support for independent irq/softirq stacks - Our CPU DT binding now has "unevaluatedProperties: false" * tag 'riscv-for-linus-6.5-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (78 commits) riscv: hibernate: remove WARN_ON in save_processor_state dt-bindings: riscv: cpus: switch to unevaluatedProperties: false dt-bindings: riscv: cpus: add a ref the common cpu schema riscv: stack: Add config of thread stack size riscv: stack: Support HAVE_SOFTIRQ_ON_OWN_STACK riscv: stack: Support HAVE_IRQ_EXIT_ON_IRQ_STACK RISC-V: always report presence of extensions formerly part of the base ISA dt-bindings: riscv: explicitly mention assumption of Zicntr & Zihpm support RISC-V: remove decrement/increment dance in ISA string parser RISC-V: rework comments in ISA string parser RISC-V: validate riscv,isa at boot, not during ISA string parsing RISC-V: split early & late of_node to hartid mapping RISC-V: simplify register width check in ISA string parsing perf: RISC-V: Limit the number of counters returned from SBI riscv: replace deprecated scall with ecall riscv: uprobes: Restore thread.bad_cause riscv: mm: try VMA lock-based page fault handling first riscv: mm: Pre-allocate PGD entries for vmalloc/modules area RISC-V: hwprobe: Expose Zba, Zbb, and Zbs RISC-V: Track ISA extensions per hart ...	2023-06-30 09:37:26 -07:00
Linus Torvalds	77b1a7f7a0	- Arnd Bergmann has fixed a bunch of -Wmissing-prototypes in top-level directories. - Douglas Anderson has added a new "buddy" mode to the hardlockup detector. It permits the detector to work on architectures which cannot provide the required interrupts, by having CPUs periodically perform checks on other CPUs. - Zhen Lei has enhanced kexec's ability to support two crash regions. - Petr Mladek has done a lot of cleanup on the hard lockup detector's Kconfig entries. - And the usual bunch of singleton patches in various places. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJelTAAKCRDdBJ7gKXxA juDkAP0VXWynzkXoojdS/8e/hhi+htedmQ3v2dLZD+vBrctLhAEA7rcH58zAVoWa 2ejqO6wDrRGUC7JQcO9VEjT0nv73UwU= =F293 -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2023-06-24-19-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-mm updates from Andrew Morton: - Arnd Bergmann has fixed a bunch of -Wmissing-prototypes in top-level directories - Douglas Anderson has added a new "buddy" mode to the hardlockup detector. It permits the detector to work on architectures which cannot provide the required interrupts, by having CPUs periodically perform checks on other CPUs - Zhen Lei has enhanced kexec's ability to support two crash regions - Petr Mladek has done a lot of cleanup on the hard lockup detector's Kconfig entries - And the usual bunch of singleton patches in various places * tag 'mm-nonmm-stable-2023-06-24-19-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits) kernel/time/posix-stubs.c: remove duplicated include ocfs2: remove redundant assignment to variable bit_off watchdog/hardlockup: fix typo in config HARDLOCKUP_DETECTOR_PREFER_BUDDY powerpc: move arch_trigger_cpumask_backtrace from nmi.h to irq.h devres: show which resource was invalid in __devm_ioremap_resource() watchdog/hardlockup: define HARDLOCKUP_DETECTOR_ARCH watchdog/sparc64: define HARDLOCKUP_DETECTOR_SPARC64 watchdog/hardlockup: make HAVE_NMI_WATCHDOG sparc64-specific watchdog/hardlockup: declare arch_touch_nmi_watchdog() only in linux/nmi.h watchdog/hardlockup: make the config checks more straightforward watchdog/hardlockup: sort hardlockup detector related config values a logical way watchdog/hardlockup: move SMP barriers from common code to buddy code watchdog/buddy: simplify the dependency for HARDLOCKUP_DETECTOR_PREFER_BUDDY watchdog/buddy: don't copy the cpumask in watchdog_next_cpu() watchdog/buddy: cleanup how watchdog_buddy_check_hardlockup() is called watchdog/hardlockup: remove softlockup comment in touch_nmi_watchdog() watchdog/hardlockup: in watchdog_hardlockup_check() use cpumask_copy() watchdog/hardlockup: don't use raw_cpu_ptr() in watchdog_hardlockup_kick() watchdog/hardlockup: HAVE_NMI_WATCHDOG must implement watchdog_hardlockup_probe() watchdog/hardlockup: keep kernel.nmi_watchdog sysctl as 0444 if probe fails ...	2023-06-28 10:59:38 -07:00
Linus Torvalds	2605e80d34	arm64 updates for 6.5: - Support for the Armv8.9 Permission Indirection Extensions. While this feature doesn't add new functionality, it enables future support for Guarded Control Stacks (GCS) and Permission Overlays. - User-space support for the Armv8.8 memcpy/memset instructions. - arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and cleanups. - Removal of superfluous ISBs on context switch (following retrospective architecture tightening). - Decode the ISS2 register during faults for additional information to help with debugging. - KPTI clean-up/simplification of the trampoline exit code. - Addressing several -Wmissing-prototype warnings. - Kselftest improvements for signal handling and ptrace. - Fix TPIDR2_EL0 restoring on sigreturn - Clean-up, robustness improvements of the module allocation code. - More sysreg conversions to the automatic register/bitfields generation. - CPU capabilities handling cleanup. - Arm documentation updates: ACPI, ptdump. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmSZyXwACgkQa9axLQDI XvEM3BAAkMzHGTDhNVNGLSO07PVmdzTiuoNFlfX7bktdIb+El76VhGXhHeEywTje wAq9JIYBf/Src2HbgZLwuly8Fn2vCrhyp++bRJW82o9SiBnx91+0mH7zLf+XHiQ4 FHKZxvaE6PaDc9o8WXr+IeucPRb5W2HgH37mktxh7ShMLsxorwS94V1oL29A2mV9 t4XQY7/tdmrDKMKMuQnIr1DurNXBhJ1OKvDnSN/Zzm96JOU/QQ32N2wEE7Y0aHOh bBzClksx2mguQqV515mySGFe5yy9NqaAfx2hTAciq+1rwbiCSjqQQmEswoUH8WLX JNLylxADWT2qXThFe8W6uyFzEshSAoI1yKxlCGuOsQpu4sFJtR8oh8dDj5669g4Y j0jR87r9rWm0iyYI5I+XDMxFVyuh2eFInvjtynRbj+mtS3f/SkO8fXG6Uya+I76C UGLlBUKnLr/zHuIGN0LE/V4dYTqsi9EtHoc2Am2xCZsS9jqkxKJG8C93Zsm4GlJC OcUtBSjW0rYJq+tLk0yhR6hbh59QbiRh05KnZsPpOKi8purlKSL9ZNPRi7TndLdm HjHUY+vQwNIpPIb6pyK4aYZuTdGEQIsQykQ8CULiIGlHi7kc4g9029ouLc5bBAeU mU8D62I2ztzPoYljYWNtO7K6g/Dq8c4lpsaMAJ+1Wp2iq2xBJjo= =rNBK -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Notable features are user-space support for the memcpy/memset instructions and the permission indirection extension. - Support for the Armv8.9 Permission Indirection Extensions. While this feature doesn't add new functionality, it enables future support for Guarded Control Stacks (GCS) and Permission Overlays - User-space support for the Armv8.8 memcpy/memset instructions - arm64 perf: support the HiSilicon SoC uncore PMU, Arm CMN sysfs identifier, support for the NXP i.MX9 SoC DDRC PMU, fixes and cleanups - Removal of superfluous ISBs on context switch (following retrospective architecture tightening) - Decode the ISS2 register during faults for additional information to help with debugging - KPTI clean-up/simplification of the trampoline exit code - Addressing several -Wmissing-prototype warnings - Kselftest improvements for signal handling and ptrace - Fix TPIDR2_EL0 restoring on sigreturn - Clean-up, robustness improvements of the module allocation code - More sysreg conversions to the automatic register/bitfields generation - CPU capabilities handling cleanup - Arm documentation updates: ACPI, ptdump" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (124 commits) kselftest/arm64: Add a test case for TPIDR2 restore arm64/signal: Restore TPIDR2 register rather than memory state arm64: alternatives: make clean_dcache_range_nopatch() noinstr-safe Documentation/arm64: Add ptdump documentation arm64: hibernate: remove WARN_ON in save_processor_state kselftest/arm64: Log signal code and address for unexpected signals docs: perf: Fix warning from 'make htmldocs' in hisi-pmu.rst arm64/fpsimd: Exit streaming mode when flushing tasks docs: perf: Add new description for HiSilicon UC PMU drivers/perf: hisi: Add support for HiSilicon UC PMU driver drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE perf/arm-cmn: Add sysfs identifier perf/arm-cmn: Revamp model detection perf/arm_dmc620: Add cpumask arm64: mm: fix VA-range sanity check arm64/mm: remove now-superfluous ISBs from TTBR writes Documentation/arm64: Update ACPI tables from BBR Documentation/arm64: Update references in arm-acpi Documentation/arm64: Update ARM and arch reference ...	2023-06-26 17:11:53 -07:00
Jonathan Cameron	5d7107c727	perf: CXL Performance Monitoring Unit driver CXL rev 3.0 introduces a standard performance monitoring hardware block to CXL. Instances are discovered using CXL Register Locator DVSEC entries. Each CXL component may have multiple PMUs. This initial driver supports a subset of types of counter. It supports counters that are either fixed or configurable, but requires that they support the ability to freeze and write value whilst frozen. Development done with QEMU model which will be posted shortly. Example: $ perf stat -a -e cxl_pmu_mem0.0/h2d_req_snpcur/ -e cxl_pmu_mem0.0/h2d_req_snpdata/ -e cxl_pmu_mem0.0/clock_ticks/ sleep 1 Performance counter stats for 'system wide': 96,757,023,244,321 cxl_pmu_mem0.0/h2d_req_snpcur/ 96,757,023,244,365 cxl_pmu_mem0.0/h2d_req_snpdata/ 193,514,046,488,653 cxl_pmu_mem0.0/clock_ticks/ 1.090539600 seconds time elapsed Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230526095824.16336-5-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 17:47:09 -07:00
Viacheslav Mitrofanov	ee95b88d71	perf: RISC-V: Limit the number of counters returned from SBI Perf gets the number of supported counters from SBI. If it happens that the number of returned counters more than RISCV_MAX_COUNTERS the code trusts it. It does not lead to an immediate problem but can potentially lead to it. Prevent getting more than RISCV_MAX_COUNTERS from SBI. Signed-off-by: Viacheslav Mitrofanov <v.v.mitrofanov@yadro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20230505072058.1049732-1-v.v.mitrofanov@yadro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-06-20 12:55:24 -07:00
Sunil V L	ca7473cb83	RISC-V/perf: Use standard interface to get INTC domain Currently the PMU driver is using DT based lookup to find the INTC node for sscofpmf extension. This will not work for ACPI based systems causing the driver to fail to register the PMU overflow interrupt handler. Hence, change the code to use the standard interface to find the INTC node which works irrespective of DT or ACPI. Signed-off-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20230607112417.782085-3-sunilvl@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-06-19 09:27:59 -07:00
Junhao He	312eca95e2	drivers/perf: hisi: Add support for HiSilicon UC PMU driver On HiSilicon Hip09 platform, there are 4 UC (unified cache) modules on each chip CCL (CPU Cluster). UC is a cache that provides coherence between NUMA and UMA domains. It is located between L2 and Memory System. Many PMU events are supported. Let's support the UC PMU driver using the HiSilicon uncore PMU framework. * rd_req_en : rd_req_en is the abbreviation of read request tracetag enable and allows user to count only read operations. Details are listed in the hisi-pmu document at Documentation/admin-guide/perf/hisi-pmu.rst * srcid_en & srcid: Allows users to filter statistical information based on specific CPU/ICL by srcid. srcid_en depends on rd_req_en being enabled. * uring_channel: Allows users to filter statistical information based on the specified tx request uring channel. uring_channel only supported events: [0x47 ~ 0x59]. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-16 12:27:38 +01:00
Junhao He	1a51688474	drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver Compared to the original PA device, H60PA offers higher bandwidth. The H60PA is a new device and we use HID to differentiate them. The events supported by PAv3 and PAv2 are different. The PAv3 PMU removed some events which are supported by PAv2 PMU. The older PA PMU driver will probe v3 as v2. Therefore PA events displayed by "perf list" cannot work properly. We add the HISI0275 HID for PAv3 PMU to distinguish different. For each H60PA PMU, except for the overflow interrupt register, other functions of the H60PA PMU are the same as the original PA PMU module. It has 8-programable counters and each counter is free-running. Interrupt is supported to handle counter (64-bits) overflow. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-16 12:27:37 +01:00
Ilkka Koskinen	7e51d05e43	perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE Add missing MODULE_DEVICE_TABLE definition to generate modalias, which enables module autoloading. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20230615232630.304870-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-16 10:32:32 +01:00
Robin Murphy	a1c45d3ebd	perf/arm-cmn: Add sysfs identifier Expose a sysfs identifier encapsulating the CMN part number and revision so that jevents can narrow down a fundamental set of possible events for calculating metrics. Configuration-dependent aspects - such as whether a given node type is present, and/or a given node ID is valid - are still not covered, and in general it's hard to see how userspace could handle them, so we won't be removing any data or validation logic from the driver any time soon, but at least it's a step in a useful direction. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Tested-by: Jing Zhang <renyu.zj@linux.alibaba.com> Link: https://lore.kernel.org/r/b8a14c14fcdf028939ebf57849863e8ae01743de.1686588640.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-16 10:28:21 +01:00
Robin Murphy	7819e05a0d	perf/arm-cmn: Revamp model detection CMN implements a set of CoreSight-format peripheral ID registers which in principle we should be able to use to identify the hardware. However so far we have avoided trying to use the part number field since the TRMs have all described it as "configuration dependent". It turns out, though, that this is a quirk of the documentation generation process, and in fact the part number should always be a stable well-defined field which we can trust. To that end, revamp our model detection to rely less on ACPI/DT, and pave the way towards further using the hardware information as an identifier for userspace jevent metrics. This includes renaming the revision constants to maximise readability. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/3c791eaae814b0126f9adbd5419bfb4a600dade7.1686588640.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-16 10:28:21 +01:00
Xin Yang	95f5819738	perf/arm_dmc620: Add cpumask Add a cpumask for the DMC620 PMU. As it is an uncore PMU, perf userspace tool only needs to open a single counter on the CPU specified by the CPU mask for each event on a given DMC620 device. Signed-off-by: Xin Yang <xin.yang@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20230613013423.2078397-1-xin.yang@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-16 10:24:12 +01:00
Douglas Anderson	d7a0fe9ef6	arm64: enable perf events based hard lockup detector With the recent feature added to enable perf events to use pseudo NMIs as interrupts on platforms which support GICv3 or later, its now been possible to enable hard lockup detector (or NMI watchdog) on arm64 platforms. So enable corresponding support. One thing to note here is that normally lockup detector is initialized just after the early initcalls but PMU on arm64 comes up much later as device_initcall(). To cope with that, override arch_perf_nmi_is_available() to let the watchdog framework know PMU not ready, and inform the framework to re-initialize lockup detection once PMU has been initialized. [dianders@chromium.org: only HAVE_HARDLOCKUP_DETECTOR_PERF if the PMU config is enabled] Link: https://lkml.kernel.org/r/20230523073952.1.I60217a63acc35621e13f10be16c0cd7c363caf8c@changeid Link: https://lkml.kernel.org/r/20230519101840.v5.18.Ia44852044cdcb074f387e80df6b45e892965d4a1@changeid Co-developed-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Co-developed-by: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Colin Cross <ccross@android.com> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guenter Roeck <groeck@chromium.org> Cc: Ian Rogers <irogers@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masayoshi Mizuma <msys.mizuma@gmail.com> Cc: Matthias Kaehlcke <mka@chromium.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Tzung-Bi Shih <tzungbi@chromium.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-06-09 17:44:22 -07:00
Xu Yang	55691f99d4	drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver Add ddr performance monitor support for i.MX93. There are 11 counters for ddr performance events. - Counter 0 is a 64-bit counter that counts only clock cycles. - Counter 1-10 are 32-bit counters that can monitor counter-specific events in addition to counting reference events. For example: perf stat -a -e imx9_ddr0/ddrc_pm_1,counter=1/,imx9_ddr0/ddrc_pm_2,counter=2/ ls Besides, this ddr pmu support AXI filter capability. It's implemented as counter-specific events. It now supports read transaction, write transaction and read beat events which corresponding respecitively to counter 2, 3 and 4. axi_mask and axi_id need to be as event parameters. For example: perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_trans_filt,counter=2,axi_mask=ID_MASK,axi_id=ID/ perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_wr_trans_filt,counter=3,axi_mask=ID_MASK,axi_id=ID/ perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt,counter=4,axi_mask=ID_MASK,axi_id=ID/ Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230418102910.2065651-1-xu.yang_2@nxp.com [will: Remove redundant error message on platform_get_irq() failure] Signed-off-by: Will Deacon <will@kernel.org>	2023-06-09 12:01:10 +01:00
Robin Murphy	d2e3bb5128	perf/arm_cspmu: Decouple APMT dependency The functional paths of the driver need not care about ACPI, so abstract the property of atomic doubleword access as its own flag (repacking the structure for a better fit). We also do not need to go poking directly at the APMT for standard resources which the ACPI layer has already dealt with, so deal with the optional MMIO page and interrupt in the normal firmware-agnostic manner. The few remaining portions of probing that are APMT-specific can still easily retrieve the APMT pointer as needed without us having to carry a duplicate copy around everywhere. Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/88f97268603e1aa6016d178982a1dc2861f6770d.1685983270.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-09 11:26:47 +01:00
Robin Murphy	f9bd34e375	perf/arm_cspmu: Clean up ACPI dependency Build-wise, the ACPI dependency consists of only a couple of things which could probably stand being factored out into ACPI helpers anyway. However for the immediate concern of working towards Devicetree support here, it's easy enough to make a few tweaks to contain the affected code locally, such that we can relax the Kconfig dependency. Reviewed-and-Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/9d126711c7498b199b3e6f5cf48ca60ffb9df54c.1685983270.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-09 11:26:47 +01:00
Robin Murphy	71e0cb32d5	perf/arm_cspmu: Fix event attribute type ARM_CSPMU_EVENT_ATTR() defines a struct perf_pmu_events_attr, so arm_cspmu_sysfs_event_show() should not be interpreting it as struct dev_ext_attribute. Fixes: `e37dfd6573` ("perf: arm_cspmu: Add support for ARM CoreSight PMU driver") Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/27c0804af64007b2400abbc40278f642ee6a0a29.1685983270.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-09 11:26:47 +01:00
Ilkka Koskinen	225d757012	perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is used Don't try to set irq affinity if PMU doesn't have an overflow interrupt. Fixes: `e37dfd6573` ("perf: arm_cspmu: Add support for ARM CoreSight PMU driver") Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20230608203742.3503486-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-09 11:25:56 +01:00
Junhao He	7a6a9f1c5a	drivers/perf: hisi: Don't migrate perf to the CPU going to teardown The driver needs to migrate the perf context if the current using CPU going to teardown. By the time calling the cpuhp::teardown() callback the cpu_online_mask() hasn't updated yet and still includes the CPU going to teardown. In current driver's implementation we may migrate the context to the teardown CPU and leads to the below calltrace: ... [ 368.104662][ T932] task:cpuhp/0 state:D stack: 0 pid: 15 ppid: 2 flags:0x00000008 [ 368.113699][ T932] Call trace: [ 368.116834][ T932] __switch_to+0x7c/0xbc [ 368.120924][ T932] __schedule+0x338/0x6f0 [ 368.125098][ T932] schedule+0x50/0xe0 [ 368.128926][ T932] schedule_preempt_disabled+0x18/0x24 [ 368.134229][ T932] __mutex_lock.constprop.0+0x1d4/0x5dc [ 368.139617][ T932] __mutex_lock_slowpath+0x1c/0x30 [ 368.144573][ T932] mutex_lock+0x50/0x60 [ 368.148579][ T932] perf_pmu_migrate_context+0x84/0x2b0 [ 368.153884][ T932] hisi_pcie_pmu_offline_cpu+0x90/0xe0 [hisi_pcie_pmu] [ 368.160579][ T932] cpuhp_invoke_callback+0x2a0/0x650 [ 368.165707][ T932] cpuhp_thread_fun+0xe4/0x190 [ 368.170316][ T932] smpboot_thread_fn+0x15c/0x1a0 [ 368.175099][ T932] kthread+0x108/0x13c [ 368.179012][ T932] ret_from_fork+0x10/0x18 ... Use function cpumask_any_but() to find one correct active cpu to fixes this issue. Fixes: `8404b0fbc7` ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230608114326.27649-1-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-09 11:24:09 +01:00
Marc Zyngier	8be3593b9e	drivers/perf: apple_m1: Force 63bit counters for M2 CPUs Sidharth reports that on M2, the PMU never generates any interrupt when using 'perf record', which is a annoying as you get no sample. I'm temped to say "no sample, no problem", but others may have a different opinion. Upon investigation, it appears that the counters on M2 are significantly different from the ones on M1, as they count on 64 bits instead of 48. Which of course, in the fine M1 tradition, means that we can only use 63 bits, as the top bit is used to signal the interrupt... This results in having to introduce yet another flag to indicate yet another odd counter width. Who knows what the next crazy implementation will do... With this, perf can work out the correct offset, and 'perf record' works as intended. Tested on M2 and M2-Pro CPUs. Cc: Janne Grunau <j@jannau.net> Cc: Hector Martin <marcan@marcan.st> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Fixes: `7d0bfb7c99` ("drivers/perf: apple_m1: Add Apple M2 support") Reported-by: Sidharth Kshatriya <sid.kshatriya@gmail.com> Tested-by: Sidharth Kshatriya <sid.kshatriya@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230528080205.288446-1-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-06-05 15:39:59 +01:00
Robin Murphy	71746c995c	perf/arm-cmn: Fix DTC reset It turns out that my naive DTC reset logic fails to work as intended, since, after checking with the hardware designers, the PMU actually needs to be fully enabled in order to correctly clear any pending overflows. Therefore, invert the sequence to start with turning on both enables so that we can reliably get the DTCs into a known state, then moving to our normal counters-stopped state from there. Since all the DTM counters have already been unpaired during the initial discovery pass, we just need to additionally reset the cycle counters to ensure that no other unexpected overflows occur during this period. Fixes: `0ba64770a2` ("perf: Add Arm CMN-600 PMU driver") Reported-by: Geoff Blake <blakgeof@amazon.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/0ea4559261ea394f827c9aee5168c77a60aaee03.1684946389.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-06-05 15:35:52 +01:00
Christophe JAILLET	7bd42f122c	perf: qcom_l2_pmu: Make l2_cache_pmu_probe_cluster() more robust If an error occurs after calling list_add(), the &l2cache_pmu->clusters list will reference some memory that will be freed when the managed resources will be released. Move the list_add() at the end of the function when everything is in fine. This is harmless because if l2_cache_pmu_probe_cluster() fails, then l2_cache_pmu_probe() will fail as well and 'l2cache_pmu' will be released as well. But it looks cleaner and could silence static checker warning. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/6a0f5bdb6b7b2ed4ef194fc49693e902ad5b95ea.1684397879.git.christophe.jaillet@wanadoo.fr Signed-off-by: Will Deacon <will@kernel.org>	2023-06-05 15:01:27 +01:00
Christophe JAILLET	f818947a06	perf/arm-cci: Slightly optimize cci_pmu_sync_counters() When the 'mask' bitmap is cleared, it is better to use its full maximum size instead of only the needed size. This lets the compiler optimize it because the size is now known at compile time. HW_CNTRS_MAX is small (i.e. currently 9), so a call to memset() is saved. Also, as 'mask' is local to the function, the non-atomic __set_bit() can also safely be used here. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/88d4e20d595f771396e9d558c1587eb4494057db.1682022422.git.christophe.jaillet@wanadoo.fr Signed-off-by: Will Deacon <will@kernel.org>	2023-06-05 14:53:04 +01:00
Reiji Watanabe	0c2f9acf6a	KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded Currently, with VHE, KVM sets ER, CR, SW and EN bits of PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores the register value for the host on vcpu_load() and vcpu_put(). If the value of those bits are cleared on a pCPU with a vCPU loaded (armv8pmu_start() would do that when PMU counters are programmed for the guest), PMU access from the guest EL0 might be trapped to the guest EL1 directly regardless of the current PMUSERENR_EL0 value of the vCPU. Fix this by not letting armv8pmu_start() overwrite PMUSERENR_EL0 on the pCPU where PMUSERENR_EL0 for the guest is loaded, and instead updating the saved shadow register value for the host so that the value can be restored on vcpu_put() later. While vcpu_{put,load}() are manipulating PMUSERENR_EL0, disable IRQs to prevent a race condition between these processes and IPIs that attempt to update PMUSERENR_EL0 for the host EL0. Suggested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Marc Zyngier <maz@kernel.org> Fixes: `83a7a4d643` ("arm64: perf: Enable PMU counter userspace access for perf event") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230603025035.3781797-3-reijiw@google.com	2023-06-04 17:19:36 +01:00
Andrew Jones	41cad8284d	RISC-V: Align SBI probe implementation with spec sbi_probe_extension() is specified with "Returns 0 if the given SBI extension ID (EID) is not available, or 1 if it is available unless defined as any other non-zero value by the implementation." Additionally, sbiret.value is a long. Fix the implementation to ensure any nonzero long value is considered a success, rather than only positive int values. Fixes: `b9dcd9e415` ("RISC-V: Add basic support for SBI v0.2") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230427163626.101042-1-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-04-29 13:04:50 -07:00
Linus Torvalds	df45da57cb	arm64 updates for 6.4 ACPI: * Improve error reporting when failing to manage SDEI on AGDI device removal Assembly routines: * Improve register constraints so that the compiler can make use of the zero register instead of moving an immediate #0 into a GPR * Allow the compiler to allocate the registers used for CAS instructions CPU features and system registers: * Cleanups to the way in which CPU features are identified from the ID register fields * Extend system register definition generation to handle Enum types when defining shared register fields * Generate definitions for new _EL2 registers and add new fields for ID_AA64PFR1_EL1 * Allow SVE to be disabled separately from SME on the kernel command-line Tracing: * Support for "direct calls" in ftrace, which enables BPF tracing for arm64 Kdump: * Don't bother unmapping the crashkernel from the linear mapping, which then allows us to use huge (block) mappings and reduce TLB pressure when a crashkernel is loaded. Memory management: * Try again to remove data cache invalidation from the coherent DMA allocation path * Simplify the fixmap code by mapping at page granularity * Allow the kfence pool to be allocated early, preventing the rest of the linear mapping from being forced to page granularity Perf and PMU: * Move CPU PMU code out to drivers/perf/ where it can be reused by the 32-bit ARM architecture when running on ARMv8 CPUs * Fix race between CPU PMU probing and pKVM host de-privilege * Add support for Apple M2 CPU PMU * Adjust the generic PERF_COUNT_HW_BRANCH_INSTRUCTIONS event dynamically, depending on what the CPU actually supports * Minor fixes and cleanups to system PMU drivers Stack tracing: * Use the XPACLRI instruction to strip PAC from pointers, rather than rolling our own function in C * Remove redundant PAC removal for toolchains that handle this in their builtins * Make backtracing more resilient in the face of instrumentation Miscellaneous: * Fix single-step with KGDB * Remove harmless warning when 'nokaslr' is passed on the kernel command-line * Minor fixes and cleanups across the board -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmRChcwQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNCgBCADFvkYY9ESztSnd3EpiMbbAzgRCQBiA5H7U F2Wc+hIWgeAeUEttSH22+F16r6Jb0gbaDvsuhtN2W/rwQhKNbCU0MaUME05MPmg2 AOp+RZb2vdT5i5S5dC6ZM6G3T6u9O78LBWv2JWBdd6RIybamEn+RL00ep2WAduH7 n1FgTbsKgnbScD2qd4K1ejZ1W/BQMwYulkNpyTsmCIijXM12lkzFlxWnMtky3uhR POpawcIZzXvWI02QAX+SIdynGChQV3VP+dh9GuFbt7ASigDEhgunvfUYhZNSaqf4 +/q0O8toCtmQJBUhF0DEDSB5T8SOz5v9CKxKuwfaX6Trq0ixFQpZ =78L9 -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "ACPI: - Improve error reporting when failing to manage SDEI on AGDI device removal Assembly routines: - Improve register constraints so that the compiler can make use of the zero register instead of moving an immediate #0 into a GPR - Allow the compiler to allocate the registers used for CAS instructions CPU features and system registers: - Cleanups to the way in which CPU features are identified from the ID register fields - Extend system register definition generation to handle Enum types when defining shared register fields - Generate definitions for new _EL2 registers and add new fields for ID_AA64PFR1_EL1 - Allow SVE to be disabled separately from SME on the kernel command-line Tracing: - Support for "direct calls" in ftrace, which enables BPF tracing for arm64 Kdump: - Don't bother unmapping the crashkernel from the linear mapping, which then allows us to use huge (block) mappings and reduce TLB pressure when a crashkernel is loaded. Memory management: - Try again to remove data cache invalidation from the coherent DMA allocation path - Simplify the fixmap code by mapping at page granularity - Allow the kfence pool to be allocated early, preventing the rest of the linear mapping from being forced to page granularity Perf and PMU: - Move CPU PMU code out to drivers/perf/ where it can be reused by the 32-bit ARM architecture when running on ARMv8 CPUs - Fix race between CPU PMU probing and pKVM host de-privilege - Add support for Apple M2 CPU PMU - Adjust the generic PERF_COUNT_HW_BRANCH_INSTRUCTIONS event dynamically, depending on what the CPU actually supports - Minor fixes and cleanups to system PMU drivers Stack tracing: - Use the XPACLRI instruction to strip PAC from pointers, rather than rolling our own function in C - Remove redundant PAC removal for toolchains that handle this in their builtins - Make backtracing more resilient in the face of instrumentation Miscellaneous: - Fix single-step with KGDB - Remove harmless warning when 'nokaslr' is passed on the kernel command-line - Minor fixes and cleanups across the board" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (72 commits) KVM: arm64: Ensure CPU PMU probes before pKVM host de-privilege arm64: kexec: include reboot.h arm64: delete dead code in this_cpu_set_vectors() arm64/cpufeature: Use helper macro to specify ID register for capabilites drivers/perf: hisi: add NULL check for name drivers/perf: hisi: Remove redundant initialized of pmu->name arm64/cpufeature: Consistently use symbolic constants for min_field_value arm64/cpufeature: Pull out helper for CPUID register definitions arm64/sysreg: Convert HFGITR_EL2 to automatic generation ACPI: AGDI: Improve error reporting for problems during .remove() arm64: kernel: Fix kernel warning when nokaslr is passed to commandline perf/arm-cmn: Fix port detection for CMN-700 arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step arm64: move PAC masks to <asm/pointer_auth.h> arm64: use XPACLRI to strip PAC arm64: avoid redundant PAC stripping in __builtin_return_address() arm64/sme: Fix some comments of ARM SME arm64/signal: Alloc tpidr2 sigframe after checking system_supports_tpidr2() arm64/signal: Use system_supports_tpidr2() to check TPIDR2 arm64/idreg: Don't disable SME when disabling SVE ...	2023-04-25 12:39:01 -07:00
Junhao He	257aedb72e	drivers/perf: hisi: add NULL check for name When allocations fails that can be NULL now. If the name provided is NULL, then the initialization process of the PMU type and dev will be skipped in function perf_pmu_register(). Consequently, the PMU will not be able to register into the kernel. Moreover, in the case of unregister the PMU, the function device_del() will need to handle NULL pointers, which potentially can cause issues. So move this allocation above the cpuhp_state_add_instance() and directly return if it does fail. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230403081423.62460-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-04-17 13:14:10 +01:00
Junhao He	25d8c25025	drivers/perf: hisi: Remove redundant initialized of pmu->name "pmu->name" is initialized by perf_pmu_register() function, so remove the redundant initialized in hisi_pmu_init(). Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230403081423.62460-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-04-17 13:14:10 +01:00
Robin Murphy	2ad91e44e6	perf/arm-cmn: Fix port detection for CMN-700 When the "extra device ports" configuration was first added, the additional mxp_device_port_connect_info registers were added around the existing mxp_mesh_port_connect_info registers. What I missed about CMN-700 is that it shuffled them around to remove this discontinuity. As such, tweak the definitions and factor out a helper for reading these registers so we can deal with this discrepancy easily, which does at least allow nicely tidying up the callsites. With this we can then also do the nice thing and skip accesses completely rather than relying on RES0 behaviour where we know the extra registers aren't defined. Fixes: `23760a0144` ("perf/arm-cmn: Add CMN-700 support") Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/71d129241d4d7923cde72a0e5b4c8d2f6084525f.1681295193.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-04-14 13:45:00 +01:00
Stephane Eranian	a30b87e6bd	arm64: pmuv3: dynamically map PERF_COUNT_HW_BRANCH_INSTRUCTIONS The mapping of perf_events generic hardware events to actual PMU events on ARM PMUv3 may not always be correct. This is in particular true for the PERF_COUNT_HW_BRANCH_INSTRUCTIONS event. Although the mapping points to an architected event, it may not always be available. This can be seen with a simple: $ perf stat -e branches sleep 0 Performance counter stats for 'sleep 0': <not supported> branches 0.001401081 seconds time elapsed Yet the hardware does have an event that could be used for branches. Dynamically check for a supported hardware event which can be used for PERF_COUNT_HW_BRANCH_INSTRUCTIONS at mapping time. And with that: $ perf stat -e branches sleep 0 Performance counter stats for 'sleep 0': 166,739 branches 0.000832163 seconds time elapsed Co-developed-by: Stephane Eranian <eranian@google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Co-developed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Co-developed-by: Peter Newman <peternewman@google.com> Signed-off-by: Peter Newman <peternewman@google.com> Link: https://lore.kernel.org/all/YvunKCJHSXKz%2FkZB@FVFF77S0Q05N Link: https://lore.kernel.org/r/20230411093809.657501-1-peternewman@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-04-11 12:35:20 +01:00
Robin Murphy	23b2fd8394	perf/arm-cmn: Validate cycles events fully DTC cycle count events don't have anything to validate or initialise in themselves, but we should not forget to still validate their whole group context. Otherwise, we may fail to correctly reject a contrived group containing an impossible number of cycles events. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/3124e8c276a1f513c1a415dc839ca4181b3c8bc8.1680522545.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-04-06 16:07:51 +01:00
Marc Gonzalez	f9d323e7c1	perf/amlogic: adjust register offsets Commit "perf/amlogic: resolve conflict between canvas & pmu" changed the base address. Fixes: `2016e2113d` ("perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver") Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20230327120932.2158389-4-mgonzalez@freebox.fr Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>	2023-03-27 17:31:22 +02:00
Janne Grunau	7d0bfb7c99	drivers/perf: apple_m1: Add Apple M2 support The PMU itself is compatible with the one found on M1. We still know next to nothing about the counters so keep using CPU uarch specific compatibles/PMU names. Signed-off-by: Janne Grunau <j@jannau.net> Acked-by: Mark Rutland <mark.rutland@arm.com. Reviewed-by: Hector Martin <marcan@marcan.st> Link: https://lore.kernel.org/r/20230214-apple_m2_pmu-v1-2-9c9213ab9b63@jannau.net Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 15:15:14 +01:00
Besar Wicaksono	16e1583465	perf: arm_cspmu: Fix variable dereference warning Fix warning message from smatch tool: \| smatch warnings: \| drivers/perf/arm_cspmu/arm_cspmu.c:1075 arm_cspmu_find_cpu_container() \| warn: variable dereferenced before check 'cpu_dev' (see line 1073) Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202302191227.kc0V8fM7-lkp@intel.com/ Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20230302205701.35323-1-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 15:12:58 +01:00
Jiucheng Xu	c61e5720f2	perf/amlogic: Fix config1/config2 parsing issue The 3th argument of for_each_set_bit is incorrect, fix them. Fixes: `2016e2113d` ("perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver") Signed-off-by: Jiucheng Xu <jiucheng.xu@amlogic.com> Link: https://lore.kernel.org/r/20230209115403.521868-1-jiucheng.xu@amlogic.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 15:12:05 +01:00
Yang Li	bc12f344d5	drivers/perf: Use devm_platform_get_and_ioremap_resource() Convert platform_get_resource(), devm_ioremap_resource() to a single call to devm_platform_get_and_ioremap_resource(), as this is exactly what this function does. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/20230216063403.9753-1-yang.lee@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 15:09:09 +01:00
Nick Alcock	a64021d372	kbuild, drivers/perf: remove MODULE_LICENSE in non-modules Since commit `8b41fc4454` ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, uses of the macro in non-modules will cause modprobe to misidentify their containing object file as a module when it is not (false positives), and modprobe might succeed rather than failing with a suitable error message. So remove it in the files in this commit, none of which can be built as modules. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: linux-modules@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230217141059.392471-9-nick.alcock@oracle.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 15:08:17 +01:00
Yang Li	8540504c51	perf: qcom: Use devm_platform_get_and_ioremap_resource() According to commit `890cc39a87` ("drivers: provide devm_platform_get_and_ioremap_resource()"), convert platform_get_resource(), devm_ioremap_resource() to a single call to devm_platform_get_and_ioremap_resource(), as this is exactly what this function does. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230315023108.36953-1-yang.lee@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 15:03:01 +01:00
Yang Li	d7f4679dc8	perf: arm: Use devm_platform_get_and_ioremap_resource() According to commit `890cc39a87` ("drivers: provide devm_platform_get_and_ioremap_resource()"), convert platform_get_resource(), devm_ioremap_resource() to a single call to devm_platform_get_and_ioremap_resource(), as this is exactly what this function does. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230315023017.35789-1-yang.lee@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 15:02:32 +01:00
Ilkka Koskinen	f87e9114b5	perf/arm-cmn: Move overlapping wp_combine field As eventid field was expanded to support new mesh versions, it started to overlap with wp_combine field. Move wp_combine to fix the issue. Fixes: `23760a0144` ("perf/arm-cmn: Add CMN-700 support") Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20230301175540.19891-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 15:00:59 +01:00
Marc Zyngier	009d6dc87a	ARM: perf: Allow the use of the PMUv3 driver on 32bit ARM The only thing stopping the PMUv3 driver from compiling on 32bit is the lack of defined system registers names and the handful of required helpers. This is easily solved by providing the sysreg accessors and updating the Kconfig entry. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Co-developed-by: Zaid Al-Bassam <zalbassam@google.com> Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-8-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 14:01:18 +01:00
Zaid Al-Bassam	b3a070869f	perf: pmuv3: Change GENMASK to GENMASK_ULL GENMASK macro uses "unsigned long" (32-bit wide on arm and 64-bit on arm64), This causes build issues when enabling PMUv3 on arm as it tries to access bits > 31. This patch switches the GENMASK to GENMASK_ULL, which uses "unsigned long long" (64-bit on both arm and arm64). Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-6-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 14:01:18 +01:00
Zaid Al-Bassam	11fba29a8a	perf: pmuv3: Move inclusion of kvm_host.h to the arch-specific helper KVM host support is available only on arm64. By moving the inclusion of kvm_host.h to an arm64-specific file, the 32bit architecture will be able to implement dummy helpers. Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-5-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 14:01:18 +01:00
Zaid Al-Bassam	711432770f	perf: pmuv3: Abstract PMU version checks The current PMU version definitions are available for arm64 only, As we want to add PMUv3 support to arm (32-bit), abstracts these definitions by using arch-specific helpers. Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-4-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 14:01:18 +01:00
Marc Zyngier	df29ddf4f0	arm64: perf: Abstract system register accesses away As we want to enable 32bit support, we need to distanciate the PMUv3 driver from the AArch64 system register names. This patch moves all system register accesses to an architecture specific include file, allowing the 32bit counterpart to be slotted in at a later time. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Co-developed-by: Zaid Al-Bassam <zalbassam@google.com> Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-3-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 14:01:18 +01:00
Marc Zyngier	7755cec63a	arm64: perf: Move PMUv3 driver to drivers/perf Having the ARM PMUv3 driver sitting in arch/arm64/kernel is getting in the way of being able to use perf on ARMv8 cores running a 32bit kernel, such as 32bit KVM guests. This patch moves it into drivers/perf/arm_pmuv3.c, with an include file in include/linux/perf/arm_pmuv3.h. The only thing left in arch/arm64 is some mundane perf stuff. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Zaid Al-Bassam <zalbassam@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230317195027.3746949-2-zalbassam@google.com Signed-off-by: Will Deacon <will@kernel.org>	2023-03-27 14:01:18 +01:00
Linus Torvalds	bf1a1bad82	RISC-V Patches for the 6.3 Merge Window, Part 2 * Some cleanups and fixes for the Zbb-optimized string routines. * Support for custom (vendor or implementation defined) perf events. * COMMAND_LINE_SIZE has been increased to 1024. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmQBuDMTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRAuExnzX7sYie2nEACDnj2w35BYDZDrbOiodXw1bA60IpJG 5Q3bZEIWimaoDluD52UxrLhAZgJ3imGHF2nLLevqqsqAWKT2aRxsiEtgqARN0GzW Am2+ZBdMuo1oCOHhIzCI9nk1xMG1fjva1ZAhGcVK876W3NuLUDvb1+LeDcwZB5bX 8KUu1haILIlngd8xOZE88xJOw83uQaF1Oc1NLqpQabbCiWxFBAIzI3O/7ZyuVk4M LrdD5zH7YBUpksol0CxWaJ/jI59t23na2421NKNm34zALsvVFSKc2CZ0+r5DFy3D 4bzMdzXmtDjWfSCTGicGk03acrxUwgpAkFH/y2eJzhQl1WWrO/FiUzlI2uKoHou9 MO+fxPHcOamnWH5OF8jzP9+v5fkKIX2eYWJl5E3Jge6KU8D1Y2cPbE/k75cOq70Y BVF+BUnQ26UdqYH0POFqikizF7dvt6G+rNpUnFrufc6xs+BALxfyIHRQnTw8dQu4 FbZRLX9YNn8/TpwcfnzIzFk1CH35n8QPBvcumIiV/344Gs7nt5vyDO7b//ExS4IL mrENDNdaQ/u8T6xkhGSWPPMwNAj3OFpLRi00BInYvRXx9/0WArD1Z5N9uMl42dqd kB060nUH1Ao7aBQJztqZVDOYwyFhREGM+xLRBLE+CiD8EnVvm30Fhovq2FrFfbif GPHt+ouKJtlkTw== =68ZK -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-6.3-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: - Some cleanups and fixes for the Zbb-optimized string routines - Support for custom (vendor or implementation defined) perf events - COMMAND_LINE_SIZE has been increased to 1024 * tag 'riscv-for-linus-6.3-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Bump COMMAND_LINE_SIZE value to 1024 drivers/perf: RISC-V: Allow programming custom firmware events riscv, lib: Fix Zbb strncmp RISC-V: improve string-function assembly	2023-03-03 09:32:51 -08:00
Mayuresh Chitale	9f828bc3fb	drivers/perf: RISC-V: Allow programming custom firmware events Applications need to be able to program the SBI implementation specific or custom firmware events in addition to the standard firmware events. Remove a check in the driver that prohibits the programming of the custom firmware events. Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20230208074314.3661406-1-mchitale@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2023-03-01 11:16:22 -08:00
Linus Torvalds	49d5759268	ARM: - Provide a virtual cache topology to the guest to avoid inconsistencies with migration on heterogenous systems. Non secure software has no practical need to traverse the caches by set/way in the first place. - Add support for taking stage-2 access faults in parallel. This was an accidental omission in the original parallel faults implementation, but should provide a marginal improvement to machines w/o FEAT_HAFDBS (such as hardware from the fruit company). - A preamble to adding support for nested virtualization to KVM, including vEL2 register state, rudimentary nested exception handling and masking unsupported features for nested guests. - Fixes to the PSCI relay that avoid an unexpected host SVE trap when resuming a CPU when running pKVM. - VGIC maintenance interrupt support for the AIC - Improvements to the arch timer emulation, primarily aimed at reducing the trap overhead of running nested. - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the interest of CI systems. - Avoid VM-wide stop-the-world operations when a vCPU accesses its own redistributor. - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions in the host. - Aesthetic and comment/kerneldoc fixes - Drop the vestiges of the old Columbia mailing list and add [Oliver] as co-maintainer This also drags in arm64's 'for-next/sme2' branch, because both it and the PSCI relay changes touch the EL2 initialization code. RISC-V: - Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE - Correctly place the guest in S-mode after redirecting a trap to the guest - Redirect illegal instruction traps to guest - SBI PMU support for guest s390: - Two patches sorting out confusion between virtual and physical addresses, which currently are the same on s390. - A new ioctl that performs cmpxchg on guest memory - A few fixes x86: - Change tdp_mmu to a read-only parameter - Separate TDP and shadow MMU page fault paths - Enable Hyper-V invariant TSC control - Fix a variety of APICv and AVIC bugs, some of them real-world, some of them affecting architecurally legal but unlikely to happen in practice - Mark APIC timer as expired if its in one-shot mode and the count underflows while the vCPU task was being migrated - Advertise support for Intel's new fast REP string features - Fix a double-shootdown issue in the emergency reboot code - Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM similar treatment to VMX - Update Xen's TSC info CPUID sub-leaves as appropriate - Add support for Hyper-V's extended hypercalls, where "support" at this point is just forwarding the hypercalls to userspace - Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and MSR filters - One-off fixes and cleanups - Fix and cleanup the range-based TLB flushing code, used when KVM is running on Hyper-V - Add support for filtering PMU events using a mask. If userspace wants to restrict heavily what events the guest can use, it can now do so without needing an absurd number of filter entries - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU support is disabled - Add PEBS support for Intel Sapphire Rapids - Fix a mostly benign overflow bug in SEV's send\|receive_update_data() - Move several SVM-specific flags into vcpu_svm x86 Intel: - Handle NMI VM-Exits before leaving the noinstr region - A few trivial cleanups in the VM-Enter flows - Stop enabling VMFUNC for L1 purely to document that KVM doesn't support EPTP switching (or any other VM function) for L1 - Fix a crash when using eVMCS's enlighted MSR bitmaps Generic: - Clean up the hardware enable and initialization flow, which was scattered around multiple arch-specific hooks. Instead, just let the arch code call into generic code. Both x86 and ARM should benefit from not having to fight common KVM code's notion of how to do initialization. - Account allocations in generic kvm_arch_alloc_vm() - Fix a memory leak if coalesced MMIO unregistration fails selftests: - On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit the correct hypercall instruction instead of relying on KVM to patch in VMMCALL - Use TAP interface for kvm_binary_stats_test and tsc_msrs_test -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmP2YA0UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPg/Qf+J6nT+TkIa+8Ei+fN1oMTDp4YuIOx mXvJ9mRK9sQ+tAUVwvDz3qN/fK5mjsYbRHIDlVc5p2Q3bCrVGDDqXPFfCcLx1u+O 9U9xjkO4JxD2LS9pc70FYOyzVNeJ8VMGOBbC2b0lkdYZ4KnUc6e/WWFKJs96bK+H duo+RIVyaMthnvbTwSv1K3qQb61n6lSJXplywS8KWFK6NZAmBiEFDAWGRYQE9lLs VcVcG0iDJNL/BQJ5InKCcvXVGskcCm9erDszPo7w4Bypa4S9AMS42DHUaRZrBJwV /WqdH7ckIz7+OSV0W1j+bKTHAFVTCjXYOM7wQykgjawjICzMSnnG9Gpskw== =goe1 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM: - Provide a virtual cache topology to the guest to avoid inconsistencies with migration on heterogenous systems. Non secure software has no practical need to traverse the caches by set/way in the first place - Add support for taking stage-2 access faults in parallel. This was an accidental omission in the original parallel faults implementation, but should provide a marginal improvement to machines w/o FEAT_HAFDBS (such as hardware from the fruit company) - A preamble to adding support for nested virtualization to KVM, including vEL2 register state, rudimentary nested exception handling and masking unsupported features for nested guests - Fixes to the PSCI relay that avoid an unexpected host SVE trap when resuming a CPU when running pKVM - VGIC maintenance interrupt support for the AIC - Improvements to the arch timer emulation, primarily aimed at reducing the trap overhead of running nested - Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the interest of CI systems - Avoid VM-wide stop-the-world operations when a vCPU accesses its own redistributor - Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions in the host - Aesthetic and comment/kerneldoc fixes - Drop the vestiges of the old Columbia mailing list and add [Oliver] as co-maintainer RISC-V: - Fix wrong usage of PGDIR_SIZE instead of PUD_SIZE - Correctly place the guest in S-mode after redirecting a trap to the guest - Redirect illegal instruction traps to guest - SBI PMU support for guest s390: - Sort out confusion between virtual and physical addresses, which currently are the same on s390 - A new ioctl that performs cmpxchg on guest memory - A few fixes x86: - Change tdp_mmu to a read-only parameter - Separate TDP and shadow MMU page fault paths - Enable Hyper-V invariant TSC control - Fix a variety of APICv and AVIC bugs, some of them real-world, some of them affecting architecurally legal but unlikely to happen in practice - Mark APIC timer as expired if its in one-shot mode and the count underflows while the vCPU task was being migrated - Advertise support for Intel's new fast REP string features - Fix a double-shootdown issue in the emergency reboot code - Ensure GIF=1 and disable SVM during an emergency reboot, i.e. give SVM similar treatment to VMX - Update Xen's TSC info CPUID sub-leaves as appropriate - Add support for Hyper-V's extended hypercalls, where "support" at this point is just forwarding the hypercalls to userspace - Clean up the kvm->lock vs. kvm->srcu sequences when updating the PMU and MSR filters - One-off fixes and cleanups - Fix and cleanup the range-based TLB flushing code, used when KVM is running on Hyper-V - Add support for filtering PMU events using a mask. If userspace wants to restrict heavily what events the guest can use, it can now do so without needing an absurd number of filter entries - Clean up KVM's handling of "PMU MSRs to save", especially when vPMU support is disabled - Add PEBS support for Intel Sapphire Rapids - Fix a mostly benign overflow bug in SEV's send\|receive_update_data() - Move several SVM-specific flags into vcpu_svm x86 Intel: - Handle NMI VM-Exits before leaving the noinstr region - A few trivial cleanups in the VM-Enter flows - Stop enabling VMFUNC for L1 purely to document that KVM doesn't support EPTP switching (or any other VM function) for L1 - Fix a crash when using eVMCS's enlighted MSR bitmaps Generic: - Clean up the hardware enable and initialization flow, which was scattered around multiple arch-specific hooks. Instead, just let the arch code call into generic code. Both x86 and ARM should benefit from not having to fight common KVM code's notion of how to do initialization - Account allocations in generic kvm_arch_alloc_vm() - Fix a memory leak if coalesced MMIO unregistration fails selftests: - On x86, cache the CPU vendor (AMD vs. Intel) and use the info to emit the correct hypercall instruction instead of relying on KVM to patch in VMMCALL - Use TAP interface for kvm_binary_stats_test and tsc_msrs_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (325 commits) KVM: SVM: hyper-v: placate modpost section mismatch error KVM: x86/mmu: Make tdp_mmu_allowed static KVM: arm64: nv: Use reg_to_encoding() to get sysreg ID KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes KVM: arm64: nv: Filter out unsupported features from ID regs KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2 KVM: arm64: nv: Allow a sysreg to be hidden from userspace only KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2 KVM: arm64: nv: Handle SMCs taken from virtual EL2 KVM: arm64: nv: Handle trapped ERET from virtual EL2 KVM: arm64: nv: Inject HVC exceptions to the virtual EL2 KVM: arm64: nv: Support virtual EL2 exceptions KVM: arm64: nv: Handle HCR_EL2.NV system register traps KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state KVM: arm64: nv: Add EL2 system registers to vcpu context KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set KVM: arm64: nv: Introduce nested virtualization VCPU feature KVM: arm64: Use the S2 MMU context to iterate over S2 table ...	2023-02-25 11:30:21 -08:00
Linus Torvalds	8bf1a529cd	arm64 updates for 6.3: - Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit architectural register (ZT0, for the look-up table feature) that Linux needs to save/restore. - Include TPIDR2 in the signal context and add the corresponding kselftests. - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG (ARM CMN) at probe time. - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64. - Permit EFI boot with MMU and caches on. Instead of cleaning the entire loaded kernel image to the PoC and disabling the MMU and caches before branching to the kernel bare metal entry point, leave the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of all of system RAM to populate the initial page tables. - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64 kernel (the arm32 kernel only defines the values). - Harden the arm64 shadow call stack pointer handling: stash the shadow stack pointer in the task struct on interrupt, load it directly from this structure. - Signal handling cleanups to remove redundant validation of size information and avoid reading the same data from userspace twice. - Refactor the hwcap macros to make use of the automatically generated ID registers. It should make new hwcaps writing less error prone. - Further arm64 sysreg conversion and some fixes. - arm64 kselftest fixes and improvements. - Pointer authentication cleanups: don't sign leaf functions, unify asm-arch manipulation. - Pseudo-NMI code generation optimisations. - Minor fixes for SME and TPIDR2 handling. - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable, replace strtobool() to kstrtobool() in the cpufeature.c code, apply dynamic shadow call stack in two passes, intercept pfn changes in set_pte_at() without the required break-before-make sequence, attempt to dump all instructions on unhandled kernel faults. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmP0/QsACgkQa9axLQDI XvG+gA/+JDVEH9wRzAIZvbp9hSuohPc48xgAmIMP1eiVB0/5qeRjYAJwS33H0rXS BPC2kj9IBy/eQeM9ICg0nFd0zYznSVacITqe6NrqeJ1F+ftS4rrHdfxd+J7kIoCs V2L8e+BJvmHdhmNV2qMAgJdGlfxfQBA7fv2cy52HKYcouoOh1AUVR/x+yXVXAsCd qJP3+dlUKccgm/oc5unEC1eZ49u8O+EoasqOyfG6K5udMgzhEX3K6imT9J3hw0WT UjstYkx5uGS/prUrRCQAX96VCHoZmzEDKtQuHkHvQXEYXsYPF3ldbR2CziNJnHe7 QfSkjJlt8HAtExA+BkwEe9i0MQO/2VF5qsa2e4fA6l7uqGu3LOtS/jJd23C9n9fR Id8aBMeN6S8+MjqRA9L2uf4t6e4ISEHoG9ZRdc4WOwloxEEiJoIeun+7bHdOSZLj AFdHFCz4NXiiwC0UP0xPDI2YeCLqt5np7HmnrUqwzRpVO8UUagiJD8TIpcBSjBN9 J68eidenHUW7/SlIeaMKE2lmo8AUEAJs9AorDSugF19/ThJcQdx7vT2UAZjeVB3j 1dbbwajnlDOk/w8PQC4thFp5/MDlfst0htS3WRwa+vgkweE2EAdTU4hUZ8qEP7FQ smhYtlT1xUSTYDTqoaG/U2OWR6/UU79wP0jgcOsHXTuyYrtPI/Q= =VmXL -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit architectural register (ZT0, for the look-up table feature) that Linux needs to save/restore - Include TPIDR2 in the signal context and add the corresponding kselftests - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG (ARM CMN) at probe time - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64 - Permit EFI boot with MMU and caches on. Instead of cleaning the entire loaded kernel image to the PoC and disabling the MMU and caches before branching to the kernel bare metal entry point, leave the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of all of system RAM to populate the initial page tables - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64 kernel (the arm32 kernel only defines the values) - Harden the arm64 shadow call stack pointer handling: stash the shadow stack pointer in the task struct on interrupt, load it directly from this structure - Signal handling cleanups to remove redundant validation of size information and avoid reading the same data from userspace twice - Refactor the hwcap macros to make use of the automatically generated ID registers. It should make new hwcaps writing less error prone - Further arm64 sysreg conversion and some fixes - arm64 kselftest fixes and improvements - Pointer authentication cleanups: don't sign leaf functions, unify asm-arch manipulation - Pseudo-NMI code generation optimisations - Minor fixes for SME and TPIDR2 handling - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable, replace strtobool() to kstrtobool() in the cpufeature.c code, apply dynamic shadow call stack in two passes, intercept pfn changes in set_pte_at() without the required break-before-make sequence, attempt to dump all instructions on unhandled kernel faults * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (130 commits) arm64: fix .idmap.text assertion for large kernels kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests kselftest/arm64: Copy whole EXTRA context arm64: kprobes: Drop ID map text from kprobes blacklist perf: arm_spe: Print the version of SPE detected perf: arm_spe: Add support for SPEv1.2 inverted event filtering perf: Add perf_event_attr::config3 arm64/sme: Fix __finalise_el2 SMEver check drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable arm64/signal: Only read new data when parsing the ZT context arm64/signal: Only read new data when parsing the ZA context arm64/signal: Only read new data when parsing the SVE context arm64/signal: Avoid rereading context frame sizes arm64/signal: Make interface for restore_fpsimd_context() consistent arm64/signal: Remove redundant size validation from parse_user_sigframe() arm64/signal: Don't redundantly verify FPSIMD magic arm64/cpufeature: Use helper macros to specify hwcaps arm64/cpufeature: Always use symbolic name for feature value in hwcaps arm64/sysreg: Initial unsigned annotations for ID registers arm64/sysreg: Initial annotation of signed ID registers ...	2023-02-21 15:27:48 -08:00
Linus Torvalds	1f2d9ffc7a	Scheduler updates in this cycle are: - Improve the scalability of the CFS bandwidth unthrottling logic with large number of CPUs. - Fix & rework various cpuidle routines, simplify interaction with the generic scheduler code. Add __cpuidle methods as noinstr to objtool's noinstr detection and fix boatloads of cpuidle bugs & quirks. - Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS, to query previously issued registrations. - Limit scheduler slice duration to the sysctl_sched_latency period, to improve scheduling granularity with a large number of SCHED_IDLE tasks. - Debuggability enhancement on sys_exit(): warn about disabled IRQs, but also enable them to prevent a cascade of followup problems and repeat warnings. - Fix the rescheduling logic in prio_changed_dl(). - Micro-optimize cpufreq and sched-util methods. - Micro-optimize ttwu_runnable() - Micro-optimize the idle-scanning in update_numa_stats(), select_idle_capacity() and steal_cookie_task(). - Update the RSEQ code & self-tests - Constify various scheduler methods - Remove unused methods - Refine __init tags - Documentation updates - ... Misc other cleanups, fixes Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmPzbJwRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1iIvA//ZcEaB8Z6ChLRQjM+bsaudKJu3pdLQbPK iYbP8Da+LsAfxbEfYuGV3m+jIp0LlBOtsI/EezxQrXV+V7FvNyAX9Y00eEu/zlj8 7Jn3LMy/DBYTwH7LwVdcU0MyIVI8ZPc6WNnkx0LOtGZn8n+qfHPSDzcP3CW+a5AV UvllPYpYyEmsX0Eby7CF4Ue8mSmbViw/xR3rNr8ZSve0c25XzKabw8O9kE3jiHxP d/zERJoAYeDyYUEuZqhfn5dTlB4an4IjNEkAfRE5SQ09RA8Gkxsa5Ar8gob9e9M1 eQsdd4/bdhnrkM8L5qDZczqmgCTZ2bukQrxkBXhRDhLgoFxwAn77b+2ZjmIW3Lae AyGqRcDSg1q2oxaYm5ZiuO/t26aDOZu9vPHyHRDGt95EGbZlrp+GgeePyfCigJYz UmPdZAAcHdSymnnnlcvdG37WVvaVkpgWZzd8LbtBi23QR+Zc4WQ2IlgnUS5WKNNf VOBcAcP6E1IslDotZDQCc2dPFFQoQQEssVooyUc5oMytm7BsvxXLOeHG+Ncu/8uc H+U8Qn8jnqTxJbC5hkWQIJlhVKCq2FJrHxxySYTKROfUNcDgCmxboFeAcXTCIU1K T0S+sdoTS/CvtLklRkG0j6B8N4N98mOd9cFwUV3tX+/gMLMep3hCQs5L76JagvC5 skkQXoONNaM= =l1nN -----END PGP SIGNATURE----- Merge tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Improve the scalability of the CFS bandwidth unthrottling logic with large number of CPUs. - Fix & rework various cpuidle routines, simplify interaction with the generic scheduler code. Add __cpuidle methods as noinstr to objtool's noinstr detection and fix boatloads of cpuidle bugs & quirks. - Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS, to query previously issued registrations. - Limit scheduler slice duration to the sysctl_sched_latency period, to improve scheduling granularity with a large number of SCHED_IDLE tasks. - Debuggability enhancement on sys_exit(): warn about disabled IRQs, but also enable them to prevent a cascade of followup problems and repeat warnings. - Fix the rescheduling logic in prio_changed_dl(). - Micro-optimize cpufreq and sched-util methods. - Micro-optimize ttwu_runnable() - Micro-optimize the idle-scanning in update_numa_stats(), select_idle_capacity() and steal_cookie_task(). - Update the RSEQ code & self-tests - Constify various scheduler methods - Remove unused methods - Refine __init tags - Documentation updates - Misc other cleanups, fixes * tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits) sched/rt: pick_next_rt_entity(): check list_entry sched/deadline: Add more reschedule cases to prio_changed_dl() sched/fair: sanitize vruntime of entity being placed sched/fair: Remove capacity inversion detection sched/fair: unlink misfit task from cpu overutilized objtool: mem() are not uaccess safe cpuidle: Fix poll_idle() noinstr annotation sched/clock: Make local_clock() noinstr sched/clock/x86: Mark sched_clock() noinstr x86/pvclock: Improve atomic update of last_value in pvclock_clocksource_read() x86/atomics: Always inline arch_atomic64() cpuidle: tracing, preempt: Squash _rcuidle tracing cpuidle: tracing: Warn about !rcu_is_watching() cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG cpuidle: drivers: firmware: psci: Dont instrument suspend code KVM: selftests: Fix build of rseq test exit: Detect and fix irq disabled state in oops cpuidle, arm64: Fix the ARM64 cpuidle logic cpuidle: mvebu: Fix duplicate flags assignment sched/fair: Limit sched slice duration ...	2023-02-20 17:41:08 -08:00
Mark Rutland	61d0386273	arm_pmu: fix event CPU filtering Janne reports that perf has been broken on Apple M1 as of commit: `bd27568117` ("perf: Rewrite core context handling") That commit replaced the pmu::filter_match() callback with pmu::filter(), whose return value has the opposite polarity, with true implying events should be ignored rather than scheduled. While an attempt was made to update the logic in armv8pmu_filter() and armpmu_filter() accordingly, the return value remains inverted in a couple of cases: * If the arm_pmu does not have an arm_pmu::filter() callback, armpmu_filter() will always return whether the CPU is supported rather than whether the CPU is not supported. As a result, the perf core will not schedule events on supported CPUs, resulting in a loss of events. Additionally, the perf core will attempt to schedule events on unsupported CPUs, but this will be rejected by armpmu_add(), which may result in a loss of events from other PMUs on those unsupported CPUs. * If the arm_pmu does have an arm_pmu::filter() callback, and armpmu_filter() is called on a CPU which is not supported by the arm_pmu, armpmu_filter() will return false rather than true. As a result, the perf core will attempt to schedule events on unsupported CPUs, but this will be rejected by armpmu_add(), which may result in a loss of events from other PMUs on those unsupported CPUs. This means a loss of events can be seen with any arm_pmu driver, but with the ARMv8 PMUv3 driver (which is the only arm_pmu driver with an arm_pmu::filter() callback) the event loss will be more limited and may go unnoticed, which is how this issue evaded testing so far. Fix the CPU filtering by performing this consistently in armpmu_filter(), and remove the redundant arm_pmu::filter() callback and armv8pmu_filter() implementation. Commit `bd27568117` also silently removed the CHAIN event filtering from armv8pmu_filter(), which will be addressed by a separate patch without using the filter callback. Fixes: `bd27568117` ("perf: Rewrite core context handling") Reported-by: Janne Grunau <j@jannau.net> Link: https://lore.kernel.org/asahi/20230215-arm_pmu_m1_regression-v1-1-f5a266577c8d@jannau.net/ Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Asahi Lina <lina@asahilina.net> Cc: Eric Curtin <ecurtin@redhat.com> Tested-by: Janne Grunau <j@jannau.net> Link: https://lore.kernel.org/r/20230216141240.3833272-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-02-16 21:23:52 +00:00
Atish Patra	8929283a68	perf: RISC-V: Improve privilege mode filtering for perf Currently, the host driver doesn't have any method to identify if the requested perf event is from kvm or bare metal. As KVM runs in HS mode, there are no separate hypervisor privilege mode to distinguish between the attributes for guest/host. Improve the privilege mode filtering by using the event specific config1 field. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-02-07 20:35:33 +05:30
Atish Patra	585e351ff3	perf: RISC-V: Define helper functions expose hpm counter width and count KVM module needs to know how many hardware counters and the counter width that the platform supports. Otherwise, it will not be able to show optimal value of virtual counters to the guest. The virtual hardware counters also need to have the same width as the logical hardware counters for simplicity. However, there shouldn't be mapping between virtual hardware counters and logical hardware counters. As we don't support hetergeneous harts or counters with different width as of now, the implementation relies on the counter width of the first available programmable counter. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-02-07 20:35:31 +05:30
Rob Herring	e8a709dc2a	perf: arm_spe: Print the version of SPE detected There's up to 4 versions of SPE now. Let's add the version that's been detected to the driver's informational print out. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230206204746.1452942-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-02-07 11:52:21 +00:00
Rob Herring	8d9190f00a	perf: arm_spe: Add support for SPEv1.2 inverted event filtering Arm SPEv1.2 (Arm v8.7/v9.2) adds a new feature called Inverted Event Filter which excludes samples matching the event filter. The feature mirrors the existing event filter in PMSEVFR_EL1 adding a new register, PMSNEVFR_EL1, which has the same event bit assignments. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-8-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-02-07 11:52:00 +00:00
Sascha Hauer	7f49b03739	drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable active_events is set but not used, remove it. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Link: https://lore.kernel.org/r/20230203121509.3580245-1-s.hauer@pengutronix.de Signed-off-by: Will Deacon <will@kernel.org>	2023-02-03 13:04:22 +00:00
Ingo Molnar	57a30218fa	Linux 6.2-rc6 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh SDc/Y/c= =zpaD -----END PGP SIGNATURE----- Merge tag 'v6.2-rc6' into sched/core, to pick up fixes Pick up fixes before merging another batch of cpuidle updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2023-01-31 15:01:20 +01:00
Robin Murphy	a428eb4b99	Partially revert "perf/arm-cmn: Optimise DTC counter accesses" It turns out the optimisation implemented by commit `4f2c3872dd` is totally broken, since all the places that consume hw->dtcs_used for events other than cycle count are still not expecting it to be sparsely populated, and fail to read all the relevant DTC counters correctly if so. If implemented correctly, the optimisation potentially saves up to 3 register reads per event update, which is reasonably significant for events targeting a single node, but still not worth a massive amount of additional code complexity overall. Getting it right within the current design looks a fair bit more involved than it was ever intended to be, so let's just make a functional revert which restores the old behaviour while still backporting easily. Fixes: `4f2c3872dd` ("perf/arm-cmn: Optimise DTC counter accesses") Reported-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/b41bb4ed7283c3d8400ce5cf5e6ec94915e6750f.1674498637.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-01-26 13:55:38 +00:00
Rob Herring	4998897b1e	perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' event Arm SPEv1.2 (Armv8.7/v9.2) adds a new event, 'not taken', in bit 6 of the PMSEVFR_EL1 register. Update arm_spe_pmsevfr_res0() to support the additional event. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-6-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:23 +00:00
Rob Herring	05e4c88e2b	perf: arm_spe: Use new PMSIDR_EL1 register enums Now that the SPE register definitions include enums for some PMSIDR_EL1 fields, use them in the driver in place of magic values. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-5-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:23 +00:00
Rob Herring	2d347ac233	perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessors Now that generated sysregs are in place, update the register field accesses. The use of BIT() is no longer needed with the new defines. Use FIELD_GET and FIELD_PREP instead of open coding masking and shifting. No functional change. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-4-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:22 +00:00
Rob Herring	c759ec850d	arm64: Drop SYS_ from SPE register defines We currently have a non-standard SYS_ prefix in the constants generated for the SPE register bitfields. Drop this in preparation for automatic register definition generation. The SPE mask defines were unshifted, and the SPE register field enumerations were shifted. The autogenerated defines are the opposite, so make the necessary adjustments. No functional changes. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-2-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:22 +00:00
Rob Herring	e080477a05	perf: arm_spe: Use feature numbering for PMSEVFR_EL1 defines Similar to commit `121a8fc088` ("arm64/sysreg: Use feature numbering for PMU and SPE revisions") use feature numbering instead of architecture versions for the PMSEVFR_EL1 Res0 defines. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-1-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:22 +00:00
Gowthami Thiagarajan	093cf1f62f	perf/marvell: Add ACPI support to TAD uncore driver Add support for ACPI based device registration so that the driver can be also enabled through ACPI table. While at that change the DT specific API's to device_* API's so that both DT based and ACPI based probing works. Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com> Link: https://lore.kernel.org/r/20221209053715.3930071-1-gthiagarajan@marvell.com Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:22 +00:00
Gowthami Thiagarajan	e85930f06f	perf/marvell: Add ACPI support to DDR uncore driver Add support for ACPI based device registration so that the driver can be also enabled through ACPI table. Signed-off-by: Gowthami Thiagarajan <gthiagarajan@marvell.com> Link: https://lore.kernel.org/r/20221209053607.3929964-1-gthiagarajan@marvell.com Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:21 +00:00
Robin Murphy	bb21ef19a3	perf/arm-cmn: Reset DTM_PMU_CONFIG at probe Although we treat the DTM counters as free-running such that we're not too concerned about the initial DTM state, it's possible for a previous user to have left DTM counters enabled and paired with DTC counters. Thus if the first events are scheduled using some, but not all, DTMs, the as-yet-unused ones could end up adding spurious increments to the event counts at the DTC. Make sure we sync our initial DTM_PMU_CONFIG state to all the DTMs at probe time to avoid that possibility. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/ba5f38b3dc733cd06bfb5e659b697e76d18c2183.1670269572.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:21 +00:00
Junhao He	e126f6f42f	drivers/perf: hisi: Extract initialization of "cpa_pmu->pmu" Use hisi_pmu_init() function to simplify initialization of "cpa_pmu->pmu". Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-4-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:21 +00:00
Junhao He	053b5579da	drivers/perf: hisi: Simplify the parameters of hisi_pmu_init() Use "hisi_pmu" to simplify the parameter list for the hisi_pmu_init() function. Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:21 +00:00
Junhao He	7f95da9d2d	drivers/perf: hisi: Advertise the PERF_PMU_CAP_NO_EXCLUDE capability Missed initialization the variable of pmu::capabilities when extract the initialization code of hisi_pmu->pmu into a function. HISI UNCORE PMU drivers counters that not support context exclusion. So we have to advertise the PERF_PMU_CAP_NO_EXCLUDE capability. This ensures that perf will prevent us from handling events where any exclusion flags are set. Signed-off-by: Junhao He <hejunhao3@huawei.com> Link: https://lore.kernel.org/r/20230119100307.3660-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2023-01-19 18:30:20 +00:00
Peter Zijlstra	1c38b0615f	arm64, riscv, perf: Remove RCU_NONIDLE() usage The PM notifiers should no longer be ran with RCU disabled (per the previous patches), as such this hack is no longer required either. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195542.151174682@infradead.org	2023-01-13 11:48:17 +01:00
Linus Torvalds	eb67d239f3	RISC-V Patches for the 6.2 Merge Window, Part 1 * Support for the T-Head PMU via the perf subsystem. * ftrace support for rv32. * Support for non-volatile memory devices. * Various fixes and cleanups. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmOZ6WsTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRAuExnzX7sYiWGcD/wLGiHq3ekQhl5D+CaA1WlJ5XzQFfY2 bv1ZCZGdjuiv66jiMlmEsbpfUCk3bSAIjCO3MHQNDmTuPJztCHVJXOHbZFWItzzO soW4nXHKW1sGHa7hDLGQUPkltA48OdPoyqEDvlnpyEWFT+2xHwdFEURWE85FXGeq ZzFSKUQqX/V52n9TS4M4QtmNnQatR3TgIs8ttzD4JqwWFBbp4/iBfIGt6n3W24XH 9lKWikO4YOYUPl0KVIakM4d8NmX7g+7vhCKWavLke1fF/IQOlyWwA0eM8ryj33OG L1nFkqfF3mCw9i72WHftlc0rAgVqcYS8ntnQkPNpt2zPp3xFjDwEy+XiZrRE+sAp m5Ma2Tkw7G3ueBtXwP1yo+EKa7PrVFbCRD/rEpLJAC6+9ktvc7cYs39E08O+wrwT qkYThDolovqMOqfOq6afEGy5lfIa5U00vxK+3MXiE3eLEjHSJhwTXadUbwyMjJWE zOwA6p5NfDFzklESSNTtIBY85Zlh/g2q6GWCy7yBQnlaSdbpDxcnAlSZipq66Iqm 9ytdZiHid4BIRQxr5qyXTB184BvFnWNRs9NGhCj38uLEnuxwSChzwoh/WPDxLNte U9ouvwJO5U2qAZsMGJhY8W2s/9WvWpSqRhSMA/nnNV1Hh+URFz8rFXAln6kNn//v j+cYGCyjLnO1hg== =4Ak2 -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the T-Head PMU via the perf subsystem - ftrace support for rv32 - Support for non-volatile memory devices - Various fixes and cleanups * tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits) Documentation: RISC-V: patch-acceptance: s/implementor/implementer Documentation: RISC-V: Mention the UEFI Standards Documentation: RISC-V: Allow patches for non-standard behavior Documentation: RISC-V: Fix a typo in patch-acceptance riscv: Fixup compile error with !MMU riscv: Fix P4D_SHIFT definition for 3-level page table mode riscv: Apply a static assert to riscv_isa_ext_id RISC-V: Add some comments about the shadow and overflow stacks RISC-V: Align the shadow stack RISC-V: Ensure Zicbom has a valid block size RISC-V: Introduce riscv_isa_extension_check RISC-V: Improve use of isa2hwcap[] riscv: Don't duplicate _ALTERNATIVE_CFG* macros riscv: alternatives: Drop the underscores from the assembly macro names riscv: alternatives: Don't name unused macro parameters riscv: Don't duplicate __ALTERNATIVE_CFG in __ALTERNATIVE_CFG_2 riscv: mm: call best_map_size many times during linear-mapping riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a riscv: Fix crash during early errata patching riscv: boot: add zstd support ...	2022-12-14 15:23:49 -08:00
Linus Torvalds	add7695957	Perf events updates for v6.2: - Thoroughly rewrite the data structures that implement perf task context handling, with the goal of fixing various quirks and unfeatures both in already merged, and in upcoming proposed code. The old data structure is the per task and per cpu perf_event_contexts: task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context ^ \| ^ \| ^ `---------------------------------' \| `--> pmu ---' v ^ perf_event ------' In this new design this is replaced with a single task context and a single CPU context, plus intermediate data-structures: task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context ^ \| ^ ^ `---------------------------' \| \| \| \| perf_cpu_pmu_context <--. \| `----. ^ \| \| \| \| \| \| v v \| \| ,--> perf_event_pmu_context \| \| \| \| \| \| \| v v \| perf_event ---> pmu ----------------' [ See commit `bd27568117` for more details. ] This rewrite was developed by Peter Zijlstra and Ravi Bangoria. - Optimize perf_tp_event() - Update the Intel uncore PMU driver, extending it with UPI topology discovery on various hardware models. - Misc fixes & cleanups Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmOXjuURHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1j+VhAAknimsLwenTHCGQp7yqsWSKfBr9KI2UgD ZgtQuuwRwSzwqAEwC5Mt6zcIkxRNhU1ookFPqQbpY3XA0W4aNakUk8bDF8QIEKW0 MFWxn7PtReWqKcUay2oEGGurqZ5OtfpljJGxigQh5oVeMGc+itIwHF2JefeyoRnu pq7R2qDgOBb7Np4lWTdqXGmKufzp04/nely2IZQBO8x80cGRZiKQIrGrch6vLUf7 3iEz9rwmvPyz0aczYSpa/duEZDMLm4lWNK4oMUEXuUWC8gU7CUzBJsJ3AS5NgxAu yGBXe/s7GHqwtc/F30l5gK/J5WAyK83IF7sckxTj0dBUpyC6wQwwYPm8BaCAMoqN X6mU7Ve938Siih1TyOBZfZsrtDDILhV2N/nku2erb3iqes26u0RcT25rWtu9Yqvn hm4Gm6cmkHWq4EOHSBvAdC7l7lDZ3fyVI5+8nN9ly9Qv867HjG70dvIr9iEEolpX rhFAz8r/NwTXhDY0AmFZcOkrnNV3IuHtibJ/9wJlgJNqDPqN12Wxqdzy0Nj3HH6G EsukBO05cWaDS0gB8MpaO6Q6YtqAr87ZY+afHDBwcfkME50/CyBLr5rd47dTR+Ip B+zreYKcaNHdEMd1A9KULRTTDnEjlXYMwjVVJiPRV0jcmA3dHmM46HN5Ae9NdO6+ R2BAWv9XR6M= =KNaI -----END PGP SIGNATURE----- Merge tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Thoroughly rewrite the data structures that implement perf task context handling, with the goal of fixing various quirks and unfeatures both in already merged, and in upcoming proposed code. The old data structure is the per task and per cpu perf_event_contexts: task_struct::perf_events_ctxp[] <-> perf_event_context <-> perf_cpu_context ^ \| ^ \| ^ `---------------------------------' \| `--> pmu ---' v ^ perf_event ------' In this new design this is replaced with a single task context and a single CPU context, plus intermediate data-structures: task_struct::perf_event_ctxp -> perf_event_context <- perf_cpu_context ^ \| ^ ^ `---------------------------' \| \| \| \| perf_cpu_pmu_context <--. \| `----. ^ \| \| \| \| \| \| v v \| \| ,--> perf_event_pmu_context \| \| \| \| \| \| \| v v \| perf_event ---> pmu ----------------' [ See commit `bd27568117` for more details. ] This rewrite was developed by Peter Zijlstra and Ravi Bangoria. - Optimize perf_tp_event() - Update the Intel uncore PMU driver, extending it with UPI topology discovery on various hardware models. - Misc fixes & cleanups * tag 'perf-core-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) perf/x86/intel/uncore: Fix reference count leak in __uncore_imc_init_box() perf/x86/intel/uncore: Fix reference count leak in snr_uncore_mmio_map() perf/x86/intel/uncore: Fix reference count leak in hswep_has_limit_sbox() perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology() perf/x86/intel/uncore: Make set_mapping() procedure void perf/x86/intel/uncore: Update sysfs-devices-mapping file perf/x86/intel/uncore: Enable UPI topology discovery for Sapphire Rapids perf/x86/intel/uncore: Enable UPI topology discovery for Icelake Server perf/x86/intel/uncore: Get UPI NodeID and GroupID perf/x86/intel/uncore: Enable UPI topology discovery for Skylake Server perf/x86/intel/uncore: Generalize get_topology() for SKX PMUs perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D perf/x86/intel/uncore: Clear attr_update properly perf/x86/intel/uncore: Introduce UPI topology type perf/x86/intel/uncore: Generalize IIO topology support perf/core: Don't allow grouping events from different hw pmus perf/amd/ibs: Make IBS a core pmu perf: Fix function pointer case perf/x86/amd: Remove the repeated declaration perf: Fix possible memleak in pmu_dev_alloc() ...	2022-12-12 15:19:38 -08:00
Linus Torvalds	9d33edb20f	Updates for the interrupt core and driver subsystem: - Core: The bulk is the rework of the MSI subsystem to support per device MSI interrupt domains. This solves conceptual problems of the current PCI/MSI design which are in the way of providing support for PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device. IMS (Interrupt Message Store] is a new specification which allows device manufactures to provide implementation defined storage for MSI messages contrary to the uniform and specification defined storage mechanisms for PCI/MSI and PCI/MSI-X. IMS not only allows to overcome the size limitations of the MSI-X table, but also gives the device manufacturer the freedom to store the message in arbitrary places, even in host memory which is shared with the device. There have been several attempts to glue this into the current MSI code, but after lengthy discussions it turned out that there is a fundamental design problem in the current PCI/MSI-X implementation. This needs some historical background. When PCI/MSI[-X] support was added around 2003, interrupt management was completely different from what we have today in the actively developed architectures. Interrupt management was completely architecture specific and while there were attempts to create common infrastructure the commonalities were rudimentary and just providing shared data structures and interfaces so that drivers could be written in an architecture agnostic way. The initial PCI/MSI[-X] support obviously plugged into this model which resulted in some basic shared infrastructure in the PCI core code for setting up MSI descriptors, which are a pure software construct for holding data relevant for a particular MSI interrupt, but the actual association to Linux interrupts was completely architecture specific. This model is still supported today to keep museum architectures and notorious stranglers alive. In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel, which was creating yet another architecture specific mechanism and resulted in an unholy mess on top of the existing horrors of x86 interrupt handling. The x86 interrupt management code was already an incomprehensible maze of indirections between the CPU vector management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X] implementation. At roughly the same time ARM struggled with the ever growing SoC specific extensions which were glued on top of the architected GIC interrupt controller. This resulted in a fundamental redesign of interrupt management and provided the today prevailing concept of hierarchical interrupt domains. This allowed to disentangle the interactions between x86 vector domain and interrupt remapping and also allowed ARM to handle the zoo of SoC specific interrupt components in a sane way. The concept of hierarchical interrupt domains aims to encapsulate the functionality of particular IP blocks which are involved in interrupt delivery so that they become extensible and pluggable. The X86 encapsulation looks like this: \|--- device 1 [Vector]---[Remapping]---[PCI/MSI]--\|... \|--- device N where the remapping domain is an optional component and in case that it is not available the PCI/MSI[-X] domains have the vector domain as their parent. This reduced the required interaction between the domains pretty much to the initialization phase where it is obviously required to establish the proper parent relation ship in the components of the hierarchy. While in most cases the model is strictly representing the chain of IP blocks and abstracting them so they can be plugged together to form a hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware it's clear that the actual PCI/MSI[-X] interrupt controller is not a global entity, but strict a per PCI device entity. Here we took a short cut on the hierarchical model and went for the easy solution of providing "global" PCI/MSI domains which was possible because the PCI/MSI[-X] handling is uniform across the devices. This also allowed to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in turn made it simple to keep the existing architecture specific management alive. A similar problem was created in the ARM world with support for IP block specific message storage. Instead of going all the way to stack a IP block specific domain on top of the generic MSI domain this ended in a construct which provides a "global" platform MSI domain which allows overriding the irq_write_msi_msg() callback per allocation. In course of the lengthy discussions we identified other abuse of the MSI infrastructure in wireless drivers, NTB etc. where support for implementation specific message storage was just mindlessly glued into the existing infrastructure. Some of this just works by chance on particular platforms but will fail in hard to diagnose ways when the driver is used on platforms where the underlying MSI interrupt management code does not expect the creative abuse. Another shortcoming of today's PCI/MSI-X support is the inability to allocate or free individual vectors after the initial enablement of MSI-X. This results in an works by chance implementation of VFIO (PCI pass-through) where interrupts on the host side are not set up upfront to avoid resource exhaustion. They are expanded at run-time when the guest actually tries to use them. The way how this is implemented is that the host disables MSI-X and then re-enables it with a larger number of vectors again. That works by chance because most device drivers set up all interrupts before the device actually will utilize them. But that's not universally true because some drivers allocate a large enough number of vectors but do not utilize them until it's actually required, e.g. for acceleration support. But at that point other interrupts of the device might be in active use and the MSI-X disable/enable dance can just result in losing interrupts and therefore hard to diagnose subtle problems. Last but not least the "global" PCI/MSI-X domain approach prevents to utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS is not longer providing a uniform storage and configuration model. The solution to this is to implement the missing step and switch from global PCI/MSI domains to per device PCI/MSI domains. The resulting hierarchy then looks like this: \|--- [PCI/MSI] device 1 [Vector]---[Remapping]---\|... \|--- [PCI/MSI] device N which in turn allows to provide support for multiple domains per device: \|--- [PCI/MSI] device 1 \|--- [PCI/IMS] device 1 [Vector]---[Remapping]---\|... \|--- [PCI/MSI] device N \|--- [PCI/IMS] device N This work converts the MSI and PCI/MSI core and the x86 interrupt domains to the new model, provides new interfaces for post-enable allocation/free of MSI-X interrupts and the base framework for PCI/IMS. PCI/IMS has been verified with the work in progress IDXD driver. There is work in progress to convert ARM over which will replace the platform MSI train-wreck. The cleanup of VFIO, NTB and other creative "solutions" are in the works as well. - Drivers: - Updates for the LoongArch interrupt chip drivers - Support for MTK CIRQv2 - The usual small fixes and updates all over the place -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUsygTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoYXiD/40tXKzCzf0qFIqUlZLia1N3RRrwrNC DVTixuLtR9MrjwE+jWLQILa85SHInV8syXHSd35SzhsGDxkURFGi+HBgVWmysODf br9VSh3Gi+kt7iXtIwAg8WNWviGNmS3kPksxCko54F0YnJhMY5r5bhQVUBQkwFG2 wES1C9Uzd4pdV2bl24Z+WKL85cSmZ+pHunyKw1n401lBABXnTF9c4f13zC14jd+y wDxNrmOxeL3mEH4Pg6VyrDuTOURSf3TjJjeEq3EYqvUo0FyLt9I/cKX0AELcZQX7 fkRjrQQAvXNj39RJfeSkojDfllEPUHp7XSluhdBu5aIovSamdYGCDnuEoZ+l4MJ+ CojIErp3Dwj/uSaf5c7C3OaDAqH2CpOFWIcrUebShJE60hVKLEpUwd6W8juplaoT gxyXRb1Y+BeJvO8VhMN4i7f3232+sj8wuj+HTRTTbqMhkElnin94tAx8rgwR1sgR BiOGMJi4K2Y8s9Rqqp0Dvs01CW4guIYvSR4YY+WDbbi1xgiev89OYs6zZTJCJe4Y NUwwpqYSyP1brmtdDdBOZLqegjQm+TwUb6oOaasFem4vT1swgawgLcDnPOx45bk5 /FWt3EmnZxMz99x9jdDn1+BCqAZsKyEbEY1avvhPVMTwoVIuSX2ceTBMLseGq+jM 03JfvdxnueM3gw== =9erA -----END PGP SIGNATURE----- Merge tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates for the interrupt core and driver subsystem: The bulk is the rework of the MSI subsystem to support per device MSI interrupt domains. This solves conceptual problems of the current PCI/MSI design which are in the way of providing support for PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device. IMS (Interrupt Message Store] is a new specification which allows device manufactures to provide implementation defined storage for MSI messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified message store which is uniform accross all devices). The PCI/MSI[-X] uniformity allowed us to get away with "global" PCI/MSI domains. IMS not only allows to overcome the size limitations of the MSI-X table, but also gives the device manufacturer the freedom to store the message in arbitrary places, even in host memory which is shared with the device. There have been several attempts to glue this into the current MSI code, but after lengthy discussions it turned out that there is a fundamental design problem in the current PCI/MSI-X implementation. This needs some historical background. When PCI/MSI[-X] support was added around 2003, interrupt management was completely different from what we have today in the actively developed architectures. Interrupt management was completely architecture specific and while there were attempts to create common infrastructure the commonalities were rudimentary and just providing shared data structures and interfaces so that drivers could be written in an architecture agnostic way. The initial PCI/MSI[-X] support obviously plugged into this model which resulted in some basic shared infrastructure in the PCI core code for setting up MSI descriptors, which are a pure software construct for holding data relevant for a particular MSI interrupt, but the actual association to Linux interrupts was completely architecture specific. This model is still supported today to keep museum architectures and notorious stragglers alive. In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel, which was creating yet another architecture specific mechanism and resulted in an unholy mess on top of the existing horrors of x86 interrupt handling. The x86 interrupt management code was already an incomprehensible maze of indirections between the CPU vector management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X] implementation. At roughly the same time ARM struggled with the ever growing SoC specific extensions which were glued on top of the architected GIC interrupt controller. This resulted in a fundamental redesign of interrupt management and provided the today prevailing concept of hierarchical interrupt domains. This allowed to disentangle the interactions between x86 vector domain and interrupt remapping and also allowed ARM to handle the zoo of SoC specific interrupt components in a sane way. The concept of hierarchical interrupt domains aims to encapsulate the functionality of particular IP blocks which are involved in interrupt delivery so that they become extensible and pluggable. The X86 encapsulation looks like this: \|--- device 1 [Vector]---[Remapping]---[PCI/MSI]--\|... \|--- device N where the remapping domain is an optional component and in case that it is not available the PCI/MSI[-X] domains have the vector domain as their parent. This reduced the required interaction between the domains pretty much to the initialization phase where it is obviously required to establish the proper parent relation ship in the components of the hierarchy. While in most cases the model is strictly representing the chain of IP blocks and abstracting them so they can be plugged together to form a hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware it's clear that the actual PCI/MSI[-X] interrupt controller is not a global entity, but strict a per PCI device entity. Here we took a short cut on the hierarchical model and went for the easy solution of providing "global" PCI/MSI domains which was possible because the PCI/MSI[-X] handling is uniform across the devices. This also allowed to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in turn made it simple to keep the existing architecture specific management alive. A similar problem was created in the ARM world with support for IP block specific message storage. Instead of going all the way to stack a IP block specific domain on top of the generic MSI domain this ended in a construct which provides a "global" platform MSI domain which allows overriding the irq_write_msi_msg() callback per allocation. In course of the lengthy discussions we identified other abuse of the MSI infrastructure in wireless drivers, NTB etc. where support for implementation specific message storage was just mindlessly glued into the existing infrastructure. Some of this just works by chance on particular platforms but will fail in hard to diagnose ways when the driver is used on platforms where the underlying MSI interrupt management code does not expect the creative abuse. Another shortcoming of today's PCI/MSI-X support is the inability to allocate or free individual vectors after the initial enablement of MSI-X. This results in an works by chance implementation of VFIO (PCI pass-through) where interrupts on the host side are not set up upfront to avoid resource exhaustion. They are expanded at run-time when the guest actually tries to use them. The way how this is implemented is that the host disables MSI-X and then re-enables it with a larger number of vectors again. That works by chance because most device drivers set up all interrupts before the device actually will utilize them. But that's not universally true because some drivers allocate a large enough number of vectors but do not utilize them until it's actually required, e.g. for acceleration support. But at that point other interrupts of the device might be in active use and the MSI-X disable/enable dance can just result in losing interrupts and therefore hard to diagnose subtle problems. Last but not least the "global" PCI/MSI-X domain approach prevents to utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS is not longer providing a uniform storage and configuration model. The solution to this is to implement the missing step and switch from global PCI/MSI domains to per device PCI/MSI domains. The resulting hierarchy then looks like this: \|--- [PCI/MSI] device 1 [Vector]---[Remapping]---\|... \|--- [PCI/MSI] device N which in turn allows to provide support for multiple domains per device: \|--- [PCI/MSI] device 1 \|--- [PCI/IMS] device 1 [Vector]---[Remapping]---\|... \|--- [PCI/MSI] device N \|--- [PCI/IMS] device N This work converts the MSI and PCI/MSI core and the x86 interrupt domains to the new model, provides new interfaces for post-enable allocation/free of MSI-X interrupts and the base framework for PCI/IMS. PCI/IMS has been verified with the work in progress IDXD driver. There is work in progress to convert ARM over which will replace the platform MSI train-wreck. The cleanup of VFIO, NTB and other creative "solutions" are in the works as well. Drivers: - Updates for the LoongArch interrupt chip drivers - Support for MTK CIRQv2 - The usual small fixes and updates all over the place" * tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits) irqchip/ti-sci-inta: Fix kernel doc irqchip/gic-v2m: Mark a few functions __init irqchip/gic-v2m: Include arm-gic-common.h irqchip/irq-mvebu-icu: Fix works by chance pointer assignment iommu/amd: Enable PCI/IMS iommu/vt-d: Enable PCI/IMS x86/apic/msi: Enable PCI/IMS PCI/MSI: Provide pci_ims_alloc/free_irq() PCI/MSI: Provide IMS (Interrupt Message Store) support genirq/msi: Provide constants for PCI/IMS support x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X PCI/MSI: Provide prepare_desc() MSI domain op PCI/MSI: Split MSI-X descriptor setup genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN genirq/msi: Provide msi_domain_alloc_irq_at() genirq/msi: Provide msi_domain_ops:: Prepare_desc() genirq/msi: Provide msi_desc:: Msi_data genirq/msi: Provide struct msi_map x86/apic/msi: Remove arch_create_remap_msi_irq_domain() ...	2022-12-12 11:21:29 -08:00
Will Deacon	10162e78ea	Merge branch 'for-next/perf' into for-next/core * for-next/perf: (21 commits) arm_pmu: Drop redundant armpmu->map_event() in armpmu_event_init() drivers/perf: hisi: Add TLP filter support Documentation: perf: Indent filter options list of hisi-pcie-pmu docs: perf: Fix PMU instance name of hisi-pcie-pmu drivers/perf: hisi: Fix some event id for hisi-pcie-pmu arm64/perf: Replace PMU version number '0' with ID_AA64DFR0_EL1_PMUVer_NI perf/amlogic: Remove unused header inclusions of <linux/version.h> perf/amlogic: Fix build error for x86_64 allmodconfig dt-binding: perf: Add Amlogic DDR PMU docs/perf: Add documentation for the Amlogic G12 DDR PMU perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver MAINTAINERS: Update HiSilicon PMU maintainers perf: arm_cspmu: Fix module cyclic dependency perf: arm_cspmu: Fix build failure on x86_64 perf: arm_cspmu: Fix modular builds due to missing MODULE_LICENSE()s perf: arm_cspmu: Add support for NVIDIA SCF and MCF attribute perf: arm_cspmu: Add support for ARM CoreSight PMU driver perf/smmuv3: Fix hotplug callback leak in arm_smmu_pmu_init() perf/arm_dmc620: Fix hotplug callback leak in dmc620_pmu_init() drivers: perf: marvell_cn10k: Fix hotplug callback leak in tad_pmu_init() ...	2022-12-06 11:22:48 +00:00
Anshuman Khandual	4361251cef	arm_pmu: Drop redundant armpmu->map_event() in armpmu_event_init() __hw_perf_event_init() already calls armpmu->map_event() callback, and also returns its error code including -ENOENT, along with a debug callout. Hence an additional armpmu->map_event() check for -ENOENT is redundant. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20221202015611.338499-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2022-12-02 10:15:03 +00:00
Yicong Yang	17d573984d	drivers/perf: hisi: Add TLP filter support The PMU support to filter the TLP when counting the bandwidth with below options: - only count the TLP headers - only count the TLP payloads - count both TLP headers and payloads In the current driver it's default to count the TLP payloads only, which will have an implicity side effects that on the traffic only have header only TLPs, we'll get no data. Make this user configuration through "len_mode" parameter and make it default to count both TLP headers and payloads when user not specified. Also update the documentation for it. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20221117084136.53572-5-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2022-11-29 14:30:55 +00:00
Yicong Yang	6b4bb4f38d	drivers/perf: hisi: Fix some event id for hisi-pcie-pmu Some event id of hisi-pcie-pmu is incorrect, fix them. Fixes: `8404b0fbc7` ("drivers/perf: hisi: Add driver for HiSilicon PCIe PMU") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20221117084136.53572-2-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2022-11-29 14:30:54 +00:00
Jiapeng Chong	ce00d127a6	perf/amlogic: Remove unused header inclusions of <linux/version.h> According to the "Abaci Robot": \| ./drivers/perf/amlogic/meson_g12_ddr_pmu.c:15 linux/version.h not needed. \| ./drivers/perf/amlogic/meson_ddr_pmu_core.c: 19 linux/version.h not needed. So drop the unnecessary #include directives. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3280 Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3282 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20221129032108.119661-1-jiapeng.chong@linux.alibaba.com Link: https://lore.kernel.org/r/20221129032108.119661-2-jiapeng.chong@linux.alibaba.com [will: Squashed patches together, filled out commit message a bit more] Signed-off-by: Will Deacon <will@kernel.org>	2022-11-29 13:32:16 +00:00
Jiucheng Xu	7299fdc1cf	perf/amlogic: Fix build error for x86_64 allmodconfig The driver misses including <linux/io.h>, which causes a compilation error with x86_64 'allmodconfig': drivers/perf/amlogic/meson_g12_ddr_pmu.c: In function 'dmc_g12_get_freq_quick': drivers/perf/amlogic/meson_g12_ddr_pmu.c:135:15: error: implicit declaration of function 'readl' [-Werror=implicit-function-declaration] 135 \| val = readl(info->pll_reg); \| ^~~~~ drivers/perf/amlogic/meson_g12_ddr_pmu.c: In function 'dmc_g12_counter_enable': drivers/perf/amlogic/meson_g12_ddr_pmu.c:204:9: error: implicit declaration of function 'writel' [-Werror=implicit-function-declaration] 204 \| writel(clock_count, info->ddr_reg[0] + DMC_MON_G12_TIMER); \| ^~~~~~ Add the missing header to fix the build. Fixes: `2016e2113d` ("perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jiucheng Xu <jiucheng.xu@amlogic.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20221122084028.572494-1-jiucheng.xu@amlogic.com Signed-off-by: Will Deacon <will@kernel.org>	2022-11-22 10:29:49 +00:00
Jiucheng Xu	2016e2113d	perf/amlogic: Add support for Amlogic meson G12 SoC DDR PMU driver Add support for Amlogic Meson G12 Series SOC - DDR bandwidth PMU driver framework and interfaces. The PMU can not only monitor the total DDR bandwidth, but also individual IP module bandwidth. Signed-off-by: Jiucheng Xu <jiucheng.xu@amlogic.com> Tested-by: Chris Healy <healych@amazon.com> Link: https://lore.kernel.org/r/20221121021602.3306998-1-jiucheng.xu@amlogic.com Signed-off-by: Will Deacon <will@kernel.org>	2022-11-21 18:28:45 +00:00
Besar Wicaksono	a91bbd5c99	perf: arm_cspmu: Fix module cyclic dependency Build on arm64 allmodconfig failed with: \| depmod: ERROR: Cycle detected: arm_cspmu -> nvidia_cspmu -> arm_cspmu \| depmod: ERROR: Found 2 modules in dependency cycles! The arm_cspmu.c provides standard functions to operate the PMU and the vendor code provides vendor specific attributes. Both need to be built as single kernel module. Update the makefile to compile sources under arm_cspmu into one module. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-and-Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20221116203952.34168-1-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2022-11-18 13:32:11 +00:00
Besar Wicaksono	e72dbf9085	perf: arm_cspmu: Fix build failure on x86_64 Building on x86_64 allmodconfig failed: \| drivers/perf/arm_cspmu/arm_cspmu.c:1114:29: error: implicit \| declaration of function 'get_acpi_id_for_cpu' get_acpi_id_for_cpu is a helper function from ARM64. Fix by adding ARM64 dependency. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20221116190455.55651-1-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>	2022-11-18 13:31:54 +00:00
Thomas Gleixner	13e7accb81	genirq: Get rid of GENERIC_MSI_IRQ_DOMAIN Adjust to reality and remove another layer of pointless Kconfig indirection. CONFIG_GENERIC_MSI_IRQ is good enough to serve all purposes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20221111122014.524842979@linutronix.de	2022-11-17 15:15:20 +01:00
Will Deacon	1830902eb8	perf: arm_cspmu: Fix modular builds due to missing MODULE_LICENSE()s Building an arm64 allmodconfig target results in the following failure from modpost: \| ERROR: modpost: missing MODULE_LICENSE() in drivers/perf/arm_cspmu/arm_cspmu.o \| ERROR: modpost: missing MODULE_LICENSE() in drivers/perf/arm_cspmu/nvidia_cspmu.o \| make[1]: * [scripts/Makefile.modpost:126: Module.symvers] Error 1 \| make: * [Makefile:1944: modpost] Error 2 Add the missing MODULE_LICENSE() macros, following the license of the source files and symbol exports. Signed-off-by: Will Deacon <will@kernel.org>	2022-11-15 18:24:03 +00:00

1 2 3 4 5 ...

556 commits