Commit graph

36 commits

Author SHA1 Message Date
Geert Uytterhoeven
1d8e926a04 perf: MARVELL_CN10K_DDR_PMU should depend on ARCH_THUNDER
The Marvell CN10K DRAM Subsystem (DSS) performance monitor is only
present on Marvell CN10K SoCs.  Hence add a dependency on ARCH_THUNDER,
to prevent asking the user about this driver when configuring a kernel
without Cavium Thunder (incl. Marvell CN10K) SoC support,

Fixes: 68fa55f0e0 ("perf/marvell: cn10k DDR perf event core ownership")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/18bfd6e1bcf67db7ea656d684a8bbb68261eeb54.1648559364.git.geert+renesas@glider.be
Signed-off-by: Will Deacon <will@kernel.org>
2022-04-04 10:51:20 +01:00
Linus Torvalds
aa5b537b0e RISC-V Patches for the 5.18 Merge Window, Part 1
* Support for Sv57-based virtual memory.
 * Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.
 * An improved memmove() implementation.
 * Support for the new Ssconfpmf and SBI PMU extensions, which allows for
   a much more useful perf implementation on RISC-V systems.
 * Support for restartable sequences.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmI96FcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQBFD/425+6xmoOru6Wiki3Ja0fqQToNrQyW
 IbmE/8AxUP7UxMvJSNzvQm8deXgklzvmegXCtnjwZZins971vMzzDSI83k/zn8I7
 m5thVC9z01BjodV+pvIp/44hS6FesolOLzkVHksX0Zh6h0iidrc34Qf5HrqvvNfN
 CZ/4K1+E9ig5r9qZp4WdvocCXj+FzwF/30GjKoW9vwA599CEG/dCo+TNN9GKD6XS
 k+xiUGwlIRA+kCLSPFCi7ev9XPr1tCmQB7uB8Igcvr7Y3mWl8HKfajQVXBnXNRC3
 ifbDxpx1elJiLPyf7Rza8jIDwDhLQdxBiwPgDgP9h9R4x0uF4efq8PzLzFlFmaE+
 9Z9thfykBb5dXYDFDje9bAOXvKnGk7Iqxdsz0qWo/ChEQawX1+11bJb0TNN8QTT9
 YvlQfUXgb1dmEcj5yG2uVE1Y8L7YNLRMsZU3W3FbmPJZoavSOuU4x0yCGeLyv597
 76af3nuBJ5v80Db97gu6St+HIACeevKflsZUf/8GS/p7d1DlvmrWzQUMEycxPTG9
 UZpZak58jh7AqQ9JbLnavhwmeacY50vpZOw6QHGAHSN+8daCPlOHDG7Ver7Z+kNj
 +srJ7iKMvLnnaEjGNgavfxdqTOme1gv4LWs/JdHYMkpphqVN92xBDJnhXTPRVZiQ
 0x39vK86qtB46A==
 =Omc6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for Sv57-based virtual memory.

 - Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.

 - An improved memmove() implementation.

 - Support for the new Ssconfpmf and SBI PMU extensions, which allows
   for a much more useful perf implementation on RISC-V systems.

 - Support for restartable sequences.

* tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits)
  rseq/selftests: Add support for RISC-V
  RISC-V: Add support for restartable sequence
  MAINTAINERS: Add entry for RISC-V PMU drivers
  Documentation: riscv: Remove the old documentation
  RISC-V: Add sscofpmf extension support
  RISC-V: Add perf platform driver based on SBI PMU extension
  RISC-V: Add RISC-V SBI PMU extension definitions
  RISC-V: Add a simple platform driver for RISC-V legacy perf
  RISC-V: Add a perf core library for pmu drivers
  RISC-V: Add CSR encodings for all HPMCOUNTERS
  RISC-V: Remove the current perf implementation
  RISC-V: Improve /proc/cpuinfo output for ISA extensions
  RISC-V: Do no continue isa string parsing without correct XLEN
  RISC-V: Implement multi-letter ISA extension probing framework
  RISC-V: Extract multi-letter extension names from "riscv, isa"
  RISC-V: Minimal parser for "riscv, isa" strings
  RISC-V: Correctly print supported extensions
  riscv: Fixed misaligned memory access. Fixed pointer comparison.
  MAINTAINERS: update riscv/microchip entry
  riscv: dts: microchip: add new peripherals to icicle kit device tree
  ...
2022-03-25 10:11:38 -07:00
Atish Patra
e999143459
RISC-V: Add perf platform driver based on SBI PMU extension
RISC-V SBI specification added a PMU extension that allows to configure
start/stop any pmu counter. The RISC-V perf can use most of the generic
perf features except interrupt overflow and event filtering based on
privilege mode which will be added in future.

It also allows to monitor a handful of firmware counters that can provide
insights into firmware activity during a performance analysis.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21 14:58:33 -07:00
Atish Patra
9b3e150e31
RISC-V: Add a simple platform driver for RISC-V legacy perf
The old RISC-V perf implementation allowed counting of only
cycle/instruction counters using perf. Restore that feature by implementing
a simple platform driver under a separate config to provide backward
compatibility. Any existing software stack will continue to work as it is.
However, it provides an easy way out in future where we can remove the
legacy driver.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21 14:58:25 -07:00
Atish Patra
f5bfa23f57
RISC-V: Add a perf core library for pmu drivers
Implement a perf core library that can support all the essential perf
features in future. It can also accommodate any type of PMU implementation
in future. Currently, both SBI based perf driver and legacy driver
implemented uses the library. Most of the common perf functionalities
are kept in this core library wile PMU specific driver can implement PMU
specific features. For example, the SBI specific functionality will be
implemented in the SBI specific driver.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-21 14:58:21 -07:00
Will Deacon
0162052214 Merge branch 'for-next/perf-m1' into for-next/perf
Support for the CPU PMUs on the Apple M1.

* for-next/perf-m1:
  drivers/perf: Add Apple icestorm/firestorm CPU PMU driver
  drivers/perf: arm_pmu: Handle 47 bit counters
  irqchip/apple-aic: Move PMU-specific registers to their own include file
  arm64: dts: apple: Add t8303 PMU nodes
  arm64: dts: apple: Add t8103 PMU interrupt affinities
  irqchip/apple-aic: Wire PMU interrupts
  irqchip/apple-aic: Parse FIQ affinities from device-tree
  dt-bindings: apple,aic: Add affinity description for per-cpu pseudo-interrupts
  dt-bindings: apple,aic: Add CPU PMU per-cpu pseudo-interrupts
  dt-bindings: arm-pmu: Document Apple PMU compatible strings
2022-03-08 13:33:34 +00:00
Marc Zyngier
a639027a1b drivers/perf: Add Apple icestorm/firestorm CPU PMU driver
Add a new, weird and wonderful driver for the equally weird Apple
PMU HW. Although the PMU itself is functional, we don't know much
about the events yet, so this can be considered as yet another
random number generator...

Nonetheless, it can reliably count at least cycles and instructions
in the usually wonky big-little way. For anything else, it of course
supports raw event numbers.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 13:32:48 +00:00
Bharat Bhushan
68fa55f0e0 perf/marvell: cn10k DDR perf event core ownership
As DDR perf event counters are not per core, so they should be accessed
only by one core at a time. Select new core when previously owning core
is going offline.

Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20220211045346.17894-5-bbhushan2@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-08 11:17:37 +00:00
Geert Uytterhoeven
e564518b07 perf: MARVELL_CN10K_TAD_PMU should depend on ARCH_THUNDER
The Marvell CN10K Last-Level cache Tag-and-data Units (LLC-TAD)
performance monitor is only present on Marvell CN10K SoCs.  Hence add a
dependency on ARCH_THUNDER, to prevent asking the user about this driver
when configuring a kernel without Cavium Thunder (incl. Marvell CN10K)
SoC support.

Fixes: 036a7584be ("drivers: perf: Add LLC-TAD perf counter support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/b4662a2c767d04cca19417e0c845edea2da262ad.1641995941.git.geert+renesas@glider.be
Signed-off-by: Will Deacon <will@kernel.org>
2022-02-08 14:54:18 +00:00
Will Deacon
e73bc4fd78 Merge branch 'for-next/perf-cn10k' into for-next/perf
* for-next/perf-cn10k:
  dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings
  drivers: perf: Add LLC-TAD perf counter support
2021-12-14 13:41:58 +00:00
Bhaskara Budiredla
036a7584be drivers: perf: Add LLC-TAD perf counter support
This driver adds support for Last-level cache tag-and-data unit
(LLC-TAD) PMU that is featured in some of the Marvell's CN10K
infrastructure silicons.

The LLC is divided into 2N slices distributed across N Mesh tiles
in a single-socket configuration. The driver always configures the
same counter for all of the TADs. The user would end up effectively
reserving one of eight counters in every TAD to look across all TADs.
The occurrences of events are aggregated and presented to the user
at the end of an application run. The driver does not provide a way
for the user to partition TADs so that different TADs are used for
different applications.

The event counters are zeroed to start event counting to avoid any
rollover issues. TAD perf counters are 64-bit, so it's not currently
possible to overflow event counters at current mesh and core
frequencies.

To measure tad pmu events use perf tool stat command. For instance:

perf stat -e tad_dat_msh_in_dss,tad_req_msh_out_any <application>
perf stat -e tad_alloc_any,tad_hit_any,tad_tag_rd <application>

Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Link: https://lore.kernel.org/r/20211115043506.6679-2-bbudiredla@marvell.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:23:01 +00:00
Robin Murphy
82d8ea4b45 perf/arm-cmn: Drop compile-test restriction
Although CMN is currently (and overwhelmingly likely to remain) deployed
in arm64-only (modulo userspace) systems, the 64-bit "dependency" for
compile-testing was just laziness due to heavy reliance on readq/writeq
accessors. Since we only need one extra include for robustness in that
regard, let's pull that in, widen the compile-test coverage, and fix up
the smattering of type laziness that that brings to light.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/baee9ee0d0bdad8aaeb70f5a4b98d8fd4b1f5786.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-12-14 12:09:27 +00:00
John Garry
e656972b69 drivers/perf: Improve build test coverage
Improve build test cover by allowing some drivers to build under
COMPILE_TEST where possible.

Some notes:
- Mostly a dependency on CONFIG_ACPI is not really required for only
  building (but left untouched), but is required for TX2 which uses ACPI
  functions which have no stubs
- XGENE required 64b dependency as it relies on some unsigned long perf
  struct fields being 64b
- I don't see why TX2 requires NUMA to build, but left untouched
- Added an explicit dependency on GENERIC_MSI_IRQ_DOMAIN for
  ARM_SMMU_V3_PMU, which is required for platform MSI functions

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633085326-156653-3-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-04 13:13:11 +01:00
John Garry
34eb9359c1 driver/perf: Remove ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3
The ARM_SMMU_V3_PMU dependency on ARM_SMMU_V3_PMU was added with the idea
that a SMMUv3 PMCG would only exist on a system with an associated SMMUv3.

However it is not the job of Kconfig to make these sorts of decisions (even
if it were true), so remove the dependency.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/1612175042-56866-1-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-02-01 12:39:40 +00:00
Tuan Phan
53c218da22 driver/perf: Add PMU driver for the ARM DMC-620 memory controller
DMC-620 PMU supports total 10 counters which each is
independently programmable to different events and can
be started and stopped individually.

Currently, it only supports ACPI. Other platforms feel free to test and add
support for device tree.

Usage example:
  #perf stat -e arm_dmc620_10008c000/clk_cycle_count/ -C 0
  Get perf event for clk_cycle_count counter.

  #perf stat -e arm_dmc620_10008c000/clkdiv2_allocate,mask=0x1f,match=0x2f,
  incr=2,invert=1/ -C 0
  The above example shows how to specify mask, match, incr,
  invert parameters for clkdiv2_allocate event.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Link: https://lore.kernel.org/r/1604518246-6198-1-git-send-email-tuanphan@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-25 14:51:21 +00:00
Robin Murphy
0ba64770a2 perf: Add Arm CMN-600 PMU driver
Initial driver for PMU event counting on the Arm CMN-600 interconnect.
CMN sports an obnoxiously complex distributed PMU system as part of
its debug and trace features, which can do all manner of things like
sampling, cross-triggering and generating CoreSight trace. This driver
covers the PMU functionality, plus the relevant aspects of watchpoints
for simply counting matching flits.

Tested-by: Tsahi Zidenberg <tsahee@amazon.com>
Tested-by: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28 18:50:20 +01:00
Ilia Lin
6d0efeb14b soc: qcom: Separate kryo l2 accessors from PMU driver
The driver provides kernel level API for other drivers
to access the MSM8996 L2 cache registers.
Separating the L2 access code from the PMU driver and
making it public to allow other drivers use it.
The accesses must be separated with a single spinlock,
maintained in this driver.

Signed-off-by: Ilia Lin <ilialin@codeaurora.org>
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/1593766185-16346-2-git-send-email-loic.poulain@linaro.org
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-07-10 17:08:55 -07:00
Zhou Wang
97807325a0 drivers/perf: hisi: Permit modular builds of HiSilicon uncore drivers
This patch lets HiSilicon uncore PMU driver can be built as modules.
A common module and three specific uncore PMU driver modules will be built.

Export necessary functions in hisi_uncore_pmu module, and change
irq_set_affinity to irq_set_affinity_hint to pass compile.

Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Tested-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1588820305-174479-1-git-send-email-wangzhou1@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-18 18:18:39 +01:00
Frank Li
9a66d36cc7 drivers/perf: imx_ddr: Add DDR performance counter support to perf
Add DDR performance monitor support for iMX8QXP. The PMU consists of 3
programmable event counters and a single dedicated cycle counter.

Example usage:

 $ perf stat -a -e \
   imx8_ddr0/read-cycles/,imx8_ddr0/write-cycles/,imx8_ddr0/precharge/ ls

- or -

 $ perf stat -a -e \
   imx8_ddr0/cycles/,imx8_ddr0/read-access/,imx8_ddr0/write-access/ ls

Other events are supported, and advertised via perf list.

Reviewed-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
[will: rewrote commit message/kconfig and used #defines for dev/cpuhp names]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-13 11:07:57 +01:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Neil Leeder
7d839b4b9e perf/smmuv3: Add arm64 smmuv3 pmu driver
Adds a new driver to support the SMMUv3 PMU and add it into the
perf events framework.

Each SMMU node may have multiple PMUs associated with it, each of
which may support different events.

SMMUv3 PMCG devices are named as smmuv3_pmcg_<phys_addr_page> where
<phys_addr_page> is the physical page address of the SMMU PMCG
wrapped to 4K boundary. For example, the PMCG at 0xff88840000 is
named smmuv3_pmcg_ff88840

Filtering by stream id is done by specifying filtering parameters
with the event. options are:
   filter_enable    - 0 = no filtering, 1 = filtering enabled
   filter_span      - 0 = exact match, 1 = pattern match
   filter_stream_id - pattern to filter against

Example: perf stat -e smmuv3_pmcg_ff88840/transaction,filter_enable=1,
                       filter_span=1,filter_stream_id=0x42/ -a netperf

Applies filter pattern 0x42 to transaction events, which means events
matching stream ids 0x42 & 0x43 are counted as only upper StreamID
bits are required to match the given filter. Further filtering
information is available in the SMMU documentation.

SMMU events are not attributable to a CPU, so task mode and sampling
are not supported.

Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
[will: fold in review feedback from Robin]
[will: rewrite Kconfig text and allow building as a module]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-04 16:49:21 +01:00
Kulkarni, Ganapatrao
69c32972d5 drivers/perf: Add Cavium ThunderX2 SoC UNCORE PMU driver
This patch adds a perf driver for the PMU UNCORE devices DDR4 Memory
Controller(DMC) and Level 3 Cache(L3C). Each PMU supports up to 4
counters. All counters lack overflow interrupt and are
sampled periodically.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
[will: consistent enum cpuhp_state naming]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 13:03:17 +00:00
John Garry
b89205bd50 drivers/perf: Remove ARM_SPE_PMU explicit PERF_EVENTS dependency
Since commit bddb9b68d3 ("drivers/perf: commonise PERF_EVENTS
dependency"), all perf drivers depend on PERF_EVENTS config under a
common menu.

Config ARM_SPE_PMU still declares explicitly a dependency on
PERF_EVENTS, which is unneeded, so remove it.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-22 17:11:12 +01:00
Robin Murphy
8b0c93c20e perf/arm-cci: Allow building as a module
Fill in the few extra bits and annotations needed to make the driver
work properly as a module, and jiggle the Kconfig to expose the
driver-level ARM_CCI_PMU option.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-21 18:12:54 +01:00
Robin Murphy
3de6be7a3d drivers/bus: Split Arm CCI driver
The arm-cci driver is really two entirely separate drivers; one for MCPM
port control and the other for the performance monitors. Since they are
already relatively self-contained, let's take the plunge and move the
PMU parts out to drivers/perf where they belong these days. For non-MCPM
systems this leaves a small dependency on the remaining "bus" stub for
initial probing and discovery, but we end up with something that still
fits the general pattern of its fellow system PMU drivers to ease future
maintenance.

Moving code to a new file also offers a perfect excuse to modernise the
license/copyright headers and clean up some funky linewraps on the way.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Suzuki Poulose <suzuki.poulose@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06 17:26:17 +01:00
Robin Murphy
1888d3ddc3 drivers/bus: Move Arm CCN PMU driver
The arm-ccn driver is purely a perf driver for the CCN PMU, not a bus
driver in the sense of the other residents of drivers/bus/, so let's
move it to the appropriate place for SoC PMU drivers. Not to mention
moving the documentation accordingly as well.

Acked-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-06 17:26:15 +01:00
Suzuki K Poulose
7520fa9924 perf: ARM DynamIQ Shared Unit PMU support
Add support for the Cluster PMU part of the ARM DynamIQ Shared Unit (DSU).
The DSU integrates one or more cores with an L3 memory system, control
logic, and external interfaces to form a multicore cluster. The PMU
allows counting the various events related to L3, SCU etc, along with
providing a cycle counter.

The PMU can be accessed via system registers, which are common
to the cores in the same cluster. The PMU registers follow the
semantics of the ARMv8 PMU, mostly, with the exception that
the counters record the cluster wide events.

This driver is mostly based on the ARMv8 and CCI PMU drivers.
The driver only supports ARM64 at the moment. It can be extended
to support ARM32 by providing register accessors like we do in
arch/arm64/include/arm_dsu_pmu.h.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-01-02 16:43:12 +00:00
Shaokun Zhang
6ce4ef9419 perf: hisi: Add support for HiSilicon SoC uncore PMU driver
This patch adds support HiSilicon SoC uncore PMU driver framework and
interfaces.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Anurup M <anurup.m@huawei.com>
[will: Fix leader accounting in uncore group validation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-19 17:06:34 +01:00
Will Deacon
d5d9696b03 drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension
The ARMv8.2 architecture introduces the optional Statistical Profiling
Extension (SPE).

SPE can be used to profile a population of operations in the CPU pipeline
after instruction decode. These are either architected instructions (i.e.
a dynamic instruction trace) or CPU-specific uops and the choice is fixed
statically in the hardware and advertised to userspace via caps/. Sampling
is controlled using a sampling interval, similar to a regular PMU counter,
but also with an optional random perturbation to avoid falling into patterns
where you continuously profile the same instruction in a hot loop.

After each operation is decoded, the interval counter is decremented. When
it hits zero, an operation is chosen for profiling and tracked within the
pipeline until it retires. Along the way, information such as TLB lookups,
cache misses, time spent to issue etc is captured in the form of a sample.
The sample is then filtered according to certain criteria (e.g. load
latency) that can be specified in the event config (described under
format/) and, if the sample satisfies the filter, it is written out to
memory as a record, otherwise it is discarded. Only one operation can
be sampled at a time.

The in-memory buffer is linear and virtually addressed, raising an
interrupt when it fills up. The PMU driver handles these interrupts to
give the appearance of a ring buffer, as expected by the AUX code.

The in-memory trace-like format is self-describing (though not parseable
in reverse) and written as a series of records, with each record
corresponding to a sample and consisting of a sequence of packets. These
packets are defined by the architecture, although some have CPU-specific
fields for recording information specific to the microarchitecture.

As a simple example, a record generated for a branch instruction may
consist of the following packets:

  0 (Address) : Virtual PC of the branch instruction
  1 (Type)    : Conditional direct branch
  2 (Counter) : Number of cycles taken from Dispatch to Issue
  3 (Address) : Virtual branch target + condition flags
  4 (Counter) : Number of cycles taken from Dispatch to Complete
  5 (Events)  : Mispredicted as not-taken
  6 (END)     : End of record

It is also possible to toggle properties such as timestamp packets in
each record.

This patch adds support for SPE in the form of a new perf driver.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-10-18 12:53:34 +01:00
Mark Rutland
bddb9b68d3 drivers/perf: commonise PERF_EVENTS dependency
All PMU drivers are going to depend on PERF_EVENTS, so let's make this
dependency common and simplify the individual Kconfig entries.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-06-15 11:10:33 +01:00
Mark Rutland
45736a72fb drivers/perf: arm_pmu: add ACPI framework
This patch adds framework code to handle parsing PMU data out of the
MADT, sanity checking this, and managing the association of CPUs (and
their interrupts) with appropriate logical PMUs.

For the time being, we expect that only one PMU driver (PMUv3) will make
use of this, and we simply pass in a single probe function.

This is based on an earlier patch from Jeremy Linton.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-11 16:29:54 +01:00
Agustin Vega-Frias
3071f13d75 perf: qcom: Add L3 cache PMU driver
This adds a new dynamic PMU to the Perf Events framework to program
and control the L3 cache PMUs in some Qualcomm Technologies SOCs.

The driver supports a distributed cache architecture where the overall
cache for a socket is comprised of multiple slices each with its own PMU.
Access to each individual PMU is provided even though all CPUs share all
the slices. User space needs to aggregate to individual counts to provide
a global picture.

The driver exports formatting and event information to sysfs so it can
be used by the perf user space tools with the syntaxes:
   perf stat -a -e l3cache_0_0/read-miss/
   perf stat -a -e l3cache_0_0/event=0x21/

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
[will: fixed sparse issues]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-04-03 18:53:50 +01:00
Neil Leeder
21bdbb7102 perf: add qcom l2 cache perf events driver
Adds perf events support for L2 cache PMU.

The L2 cache PMU driver is named 'l2cache_0' and can be used
with perf events to profile L2 events such as cache hits
and misses on Qualcomm Technologies processors.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Neil Leeder <nleeder@codeaurora.org>
[will: minimise nesting in l2_cache_associate_cpu_with_cluster]
[will: use kstrtoul for unsigned long, remove redunant .owner setting]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2017-02-08 19:32:24 +00:00
Tai Nguyen
832c927d11 perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver
This patch adds a driver for the SoC-wide (AKA uncore) PMU hardware
found in APM X-Gene SoCs.

Signed-off-by: Tai Nguyen <ttnguyen@apm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
2016-09-15 11:20:55 -07:00
Mark Rutland
6475b2d846 arm64: perf: move to shared arm_pmu framework
Now that the arm_pmu framework has been factored out to drivers/perf we
can make use of it for arm64, gaining support for heterogeneous PMUs
and unifying the two codebases before they diverge further.

The as yet unused PMU name for PMUv3 is changed to armv8_pmuv3, matching
the style previously applied to the 32-bit PMUs.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-10-07 14:24:48 +01:00
Mark Rutland
fa8ad7889d arm: perf: factor arm_pmu core out to drivers
To enable sharing of the arm_pmu code with arm64, this patch factors it
out to drivers/perf/. A new drivers/perf directory is added for
performance monitor drivers to live under.

MAINTAINERS is updated accordingly. Files added previously without a
corresponsing MAINTAINERS update (perf_regs.c, perf_callchain.c, and
perf_event.h) are also added.

Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[will: augmented Kconfig help slightly]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-31 15:01:14 +01:00