From 43cd3ddbff3c1635d0e09fe5b09af48d39dbb9d7 Mon Sep 17 00:00:00 2001 From: Douglas Anderson Date: Mon, 15 May 2023 13:13:50 -0700 Subject: [PATCH 1/6] dt-bindings: interrupt-controller: arm,gic-v3: Add quirk for Mediatek SoCs w/ broken FW When trying to turn on the "pseudo NMI" kernel feature in Linux, it was discovered that all Mediatek-based Chromebooks that ever shipped (at least ones with GICv3) had a firmware bug where they wouldn't save certain GIC "GICR" registers properly. If a processor ever entered a suspend/idle mode where the GICR registers lost state then they'd be reset to their default state. As a result of the bug, if you try to enable "pseudo NMIs" on the affected devices then certain interrupts will unexpectedly get promoted to be "pseudo NMIs" and cause crashes / freezes / general mayhem. ChromeOS is looking to start turning on "pseudo NMIs" in production to make crash reports more actionable. To do so, we will release firmware updates for at least some of the affected Mediatek Chromebooks. However, even when we update the firmware of a Chromebook it's always possible that a user will end up booting with old firmware. We need to be able to detect when we're running with firmware that will crash and burn if pseudo NMIs are enabled. The current plan is: * Update the device trees of all affected Chromebooks to include the 'mediatek,broken-save-restore-fw' property. The kernel can use this to know not to enable certain features like "pseudo NMI". NOTE: device trees for Chromebooks are never baked into the firmware but are bundled with the kernel. A kernel will never be configured to use "pseudo NMIs" and be bundled with an old device tree. * When we get a fixed firmware for one of these Chromebooks, it will patch the device tree to remove this property. For some details, you can also see the public bug Reviewed-by: Julius Werner Signed-off-by: Douglas Anderson Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20230515131353.v2.1.Iabe67a827e206496efec6beb5616d5a3b99c1e65@changeid --- .../bindings/interrupt-controller/arm,gic-v3.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml index 92117261e1e1..39e64c7f6360 100644 --- a/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml +++ b/Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml @@ -166,6 +166,12 @@ properties: resets: maxItems: 1 + mediatek,broken-save-restore-fw: + type: boolean + description: + Asserts that the firmware on this device has issues saving and restoring + GICR registers when the GIC redistributors are powered off. + dependencies: mbi-ranges: [ msi-controller ] msi-controller: [ mbi-ranges ] From 44bd78dd2b8897f59b7e3963f088caadb7e4f047 Mon Sep 17 00:00:00 2001 From: Douglas Anderson Date: Mon, 15 May 2023 13:13:51 -0700 Subject: [PATCH 2/6] irqchip/gic-v3: Disable pseudo NMIs on Mediatek devices w/ firmware issues Some Chromebooks with Mediatek SoCs have a problem where the firmware doesn't properly save/restore certain GICR registers. Newer Chromebooks should fix this issue and we may be able to do firmware updates for old Chromebooks. At the moment, the only known issue with these Chromebooks is that we can't enable "pseudo NMIs" since the priority register can be lost. Enabling "pseudo NMIs" on Chromebooks with the problematic firmware causes crashes and freezes. Let's detect devices with this problem and then disable "pseudo NMIs" on them. We'll detect the problem by looking for the presence of the "mediatek,broken-save-restore-fw" property in the GIC device tree node. Any devices with fixed firmware will not have this property. Our detection plan works because we never bake a Chromebook's device tree into firmware. Instead, device trees are always bundled with the kernel. We'll update the device trees of all affected Chromebooks and then we'll never enable "pseudo NMI" on a kernel that is bundled with old device trees. When a firmware update is shipped that fixes this issue it will know to patch the device tree to remove the property. In order to make this work, the quick detection mechanism of the GICv3 code is extended to be able to look for properties in addition to looking at "compatible". Reviewed-by: Julius Werner Signed-off-by: Douglas Anderson Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20230515131353.v2.2.I88dc0a0eb1d9d537de61604cd8994ecc55c0cac1@changeid --- drivers/irqchip/irq-gic-common.c | 8 ++++++-- drivers/irqchip/irq-gic-common.h | 1 + drivers/irqchip/irq-gic-v3.c | 20 ++++++++++++++++++++ 3 files changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c index a610821c8ff2..de47b51cdadb 100644 --- a/drivers/irqchip/irq-gic-common.c +++ b/drivers/irqchip/irq-gic-common.c @@ -16,7 +16,11 @@ void gic_enable_of_quirks(const struct device_node *np, const struct gic_quirk *quirks, void *data) { for (; quirks->desc; quirks++) { - if (!of_device_is_compatible(np, quirks->compatible)) + if (quirks->compatible && + !of_device_is_compatible(np, quirks->compatible)) + continue; + if (quirks->property && + !of_property_read_bool(np, quirks->property)) continue; if (quirks->init(data)) pr_info("GIC: enabling workaround for %s\n", @@ -28,7 +32,7 @@ void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks, void *data) { for (; quirks->desc; quirks++) { - if (quirks->compatible) + if (quirks->compatible || quirks->property) continue; if (quirks->iidr != (quirks->mask & iidr)) continue; diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h index 27e3d4ed4f32..3db4592cda1c 100644 --- a/drivers/irqchip/irq-gic-common.h +++ b/drivers/irqchip/irq-gic-common.h @@ -13,6 +13,7 @@ struct gic_quirk { const char *desc; const char *compatible; + const char *property; bool (*init)(void *data); u32 iidr; u32 mask; diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index 6fcee221f201..a605aa79435a 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -39,6 +39,7 @@ #define FLAGS_WORKAROUND_GICR_WAKER_MSM8996 (1ULL << 0) #define FLAGS_WORKAROUND_CAVIUM_ERRATUM_38539 (1ULL << 1) +#define FLAGS_WORKAROUND_MTK_GICR_SAVE (1ULL << 2) #define GIC_IRQ_TYPE_PARTITION (GIC_IRQ_TYPE_LPI + 1) @@ -1720,6 +1721,15 @@ static bool gic_enable_quirk_msm8996(void *data) return true; } +static bool gic_enable_quirk_mtk_gicr(void *data) +{ + struct gic_chip_data *d = data; + + d->flags |= FLAGS_WORKAROUND_MTK_GICR_SAVE; + + return true; +} + static bool gic_enable_quirk_cavium_38539(void *data) { struct gic_chip_data *d = data; @@ -1792,6 +1802,11 @@ static const struct gic_quirk gic_quirks[] = { .compatible = "qcom,msm8996-gic-v3", .init = gic_enable_quirk_msm8996, }, + { + .desc = "GICv3: Mediatek Chromebook GICR save problem", + .property = "mediatek,broken-save-restore-fw", + .init = gic_enable_quirk_mtk_gicr, + }, { .desc = "GICv3: HIP06 erratum 161010803", .iidr = 0x0204043b, @@ -1834,6 +1849,11 @@ static void gic_enable_nmi_support(void) if (!gic_prio_masking_enabled()) return; + if (gic_data.flags & FLAGS_WORKAROUND_MTK_GICR_SAVE) { + pr_warn("Skipping NMI enable due to firmware issues\n"); + return; + } + ppi_nmi_refs = kcalloc(gic_data.ppi_nr, sizeof(*ppi_nmi_refs), GFP_KERNEL); if (!ppi_nmi_refs) return; From 2c6c9c049510163090b979ea5f92a68ae8d93c45 Mon Sep 17 00:00:00 2001 From: Jiaxun Yang Date: Mon, 24 Apr 2023 11:31:55 +0100 Subject: [PATCH 3/6] irqchip/mips-gic: Don't touch vl_map if a local interrupt is not routable When a GIC local interrupt is not routable, it's vl_map will be used to control some internal states for core (providing IPTI, IPPCI, IPFDC input signal for core). Overriding it will interfere core's intetrupt controller. Do not touch vl_map if a local interrupt is not routable, we are not going to remap it. Before dd098a0e0319 (" irqchip/mips-gic: Get rid of the reliance on irq_cpu_online()"), if a local interrupt is not routable, then it won't be requested from GIC Local domain, and thus gic_all_vpes_irq_cpu_online won't be called for that particular interrupt. Fixes: dd098a0e0319 (" irqchip/mips-gic: Get rid of the reliance on irq_cpu_online()") Cc: stable@vger.kernel.org Signed-off-by: Jiaxun Yang Reviewed-by: Serge Semin Tested-by: Serge Semin Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20230424103156.66753-2-jiaxun.yang@flygoat.com --- drivers/irqchip/irq-mips-gic.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c index 046c355e120b..b568d55ef7c5 100644 --- a/drivers/irqchip/irq-mips-gic.c +++ b/drivers/irqchip/irq-mips-gic.c @@ -399,6 +399,8 @@ static void gic_all_vpes_irq_cpu_online(void) unsigned int intr = local_intrs[i]; struct gic_all_vpes_chip_data *cd; + if (!gic_local_irq_is_routable(intr)) + continue; cd = &gic_all_vpes_chip_data[intr]; write_gic_vl_map(mips_gic_vx_map_reg(intr), cd->map); if (cd->mask) From 3d6a0e4197c04599d75d85a608c8bb16a630a38c Mon Sep 17 00:00:00 2001 From: Jiaxun Yang Date: Mon, 24 Apr 2023 11:31:56 +0100 Subject: [PATCH 4/6] irqchip/mips-gic: Use raw spinlock for gic_lock Since we may hold gic_lock in hardirq context, use raw spinlock makes more sense given that it is for low-level interrupt handling routine and the critical section is small. Fixes BUG: [ 0.426106] ============================= [ 0.426257] [ BUG: Invalid wait context ] [ 0.426422] 6.3.0-rc7-next-20230421-dirty #54 Not tainted [ 0.426638] ----------------------------- [ 0.426766] swapper/0/1 is trying to lock: [ 0.426954] ffffffff8104e7b8 (gic_lock){....}-{3:3}, at: gic_set_type+0x30/08 Fixes: 95150ae8b330 ("irqchip: mips-gic: Implement irq_set_type callback") Cc: stable@vger.kernel.org Signed-off-by: Jiaxun Yang Reviewed-by: Serge Semin Tested-by: Serge Semin Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20230424103156.66753-3-jiaxun.yang@flygoat.com --- drivers/irqchip/irq-mips-gic.c | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c index b568d55ef7c5..6d5ecc10a870 100644 --- a/drivers/irqchip/irq-mips-gic.c +++ b/drivers/irqchip/irq-mips-gic.c @@ -50,7 +50,7 @@ void __iomem *mips_gic_base; static DEFINE_PER_CPU_READ_MOSTLY(unsigned long[GIC_MAX_LONGS], pcpu_masks); -static DEFINE_SPINLOCK(gic_lock); +static DEFINE_RAW_SPINLOCK(gic_lock); static struct irq_domain *gic_irq_domain; static int gic_shared_intrs; static unsigned int gic_cpu_pin; @@ -210,7 +210,7 @@ static int gic_set_type(struct irq_data *d, unsigned int type) irq = GIC_HWIRQ_TO_SHARED(d->hwirq); - spin_lock_irqsave(&gic_lock, flags); + raw_spin_lock_irqsave(&gic_lock, flags); switch (type & IRQ_TYPE_SENSE_MASK) { case IRQ_TYPE_EDGE_FALLING: pol = GIC_POL_FALLING_EDGE; @@ -250,7 +250,7 @@ static int gic_set_type(struct irq_data *d, unsigned int type) else irq_set_chip_handler_name_locked(d, &gic_level_irq_controller, handle_level_irq, NULL); - spin_unlock_irqrestore(&gic_lock, flags); + raw_spin_unlock_irqrestore(&gic_lock, flags); return 0; } @@ -268,7 +268,7 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *cpumask, return -EINVAL; /* Assumption : cpumask refers to a single CPU */ - spin_lock_irqsave(&gic_lock, flags); + raw_spin_lock_irqsave(&gic_lock, flags); /* Re-route this IRQ */ write_gic_map_vp(irq, BIT(mips_cm_vp_id(cpu))); @@ -279,7 +279,7 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *cpumask, set_bit(irq, per_cpu_ptr(pcpu_masks, cpu)); irq_data_update_effective_affinity(d, cpumask_of(cpu)); - spin_unlock_irqrestore(&gic_lock, flags); + raw_spin_unlock_irqrestore(&gic_lock, flags); return IRQ_SET_MASK_OK; } @@ -357,12 +357,12 @@ static void gic_mask_local_irq_all_vpes(struct irq_data *d) cd = irq_data_get_irq_chip_data(d); cd->mask = false; - spin_lock_irqsave(&gic_lock, flags); + raw_spin_lock_irqsave(&gic_lock, flags); for_each_online_cpu(cpu) { write_gic_vl_other(mips_cm_vp_id(cpu)); write_gic_vo_rmask(BIT(intr)); } - spin_unlock_irqrestore(&gic_lock, flags); + raw_spin_unlock_irqrestore(&gic_lock, flags); } static void gic_unmask_local_irq_all_vpes(struct irq_data *d) @@ -375,12 +375,12 @@ static void gic_unmask_local_irq_all_vpes(struct irq_data *d) cd = irq_data_get_irq_chip_data(d); cd->mask = true; - spin_lock_irqsave(&gic_lock, flags); + raw_spin_lock_irqsave(&gic_lock, flags); for_each_online_cpu(cpu) { write_gic_vl_other(mips_cm_vp_id(cpu)); write_gic_vo_smask(BIT(intr)); } - spin_unlock_irqrestore(&gic_lock, flags); + raw_spin_unlock_irqrestore(&gic_lock, flags); } static void gic_all_vpes_irq_cpu_online(void) @@ -393,7 +393,7 @@ static void gic_all_vpes_irq_cpu_online(void) unsigned long flags; int i; - spin_lock_irqsave(&gic_lock, flags); + raw_spin_lock_irqsave(&gic_lock, flags); for (i = 0; i < ARRAY_SIZE(local_intrs); i++) { unsigned int intr = local_intrs[i]; @@ -407,7 +407,7 @@ static void gic_all_vpes_irq_cpu_online(void) write_gic_vl_smask(BIT(intr)); } - spin_unlock_irqrestore(&gic_lock, flags); + raw_spin_unlock_irqrestore(&gic_lock, flags); } static struct irq_chip gic_all_vpes_local_irq_controller = { @@ -437,11 +437,11 @@ static int gic_shared_irq_domain_map(struct irq_domain *d, unsigned int virq, data = irq_get_irq_data(virq); - spin_lock_irqsave(&gic_lock, flags); + raw_spin_lock_irqsave(&gic_lock, flags); write_gic_map_pin(intr, GIC_MAP_PIN_MAP_TO_PIN | gic_cpu_pin); write_gic_map_vp(intr, BIT(mips_cm_vp_id(cpu))); irq_data_update_effective_affinity(data, cpumask_of(cpu)); - spin_unlock_irqrestore(&gic_lock, flags); + raw_spin_unlock_irqrestore(&gic_lock, flags); return 0; } @@ -533,12 +533,12 @@ static int gic_irq_domain_map(struct irq_domain *d, unsigned int virq, if (!gic_local_irq_is_routable(intr)) return -EPERM; - spin_lock_irqsave(&gic_lock, flags); + raw_spin_lock_irqsave(&gic_lock, flags); for_each_online_cpu(cpu) { write_gic_vl_other(mips_cm_vp_id(cpu)); write_gic_vo_map(mips_gic_vx_map_reg(intr), map); } - spin_unlock_irqrestore(&gic_lock, flags); + raw_spin_unlock_irqrestore(&gic_lock, flags); return 0; } From 14130211be5366a91ec07c3284c183b75d8fba17 Mon Sep 17 00:00:00 2001 From: Krzysztof Kozlowski Date: Fri, 12 May 2023 18:45:06 +0200 Subject: [PATCH 5/6] irqchip/meson-gpio: Mark OF related data as maybe unused MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The driver can be compile tested with !CONFIG_OF making certain data unused: drivers/irqchip/irq-meson-gpio.c:153:34: error: ‘meson_irq_gpio_matches’ defined but not used [-Werror=unused-const-variable=] Acked-by: Martin Blumenstingl Signed-off-by: Krzysztof Kozlowski Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20230512164506.212267-1-krzysztof.kozlowski@linaro.org --- drivers/irqchip/irq-meson-gpio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/irqchip/irq-meson-gpio.c b/drivers/irqchip/irq-meson-gpio.c index 2aaa9aad3e87..7da18ef95211 100644 --- a/drivers/irqchip/irq-meson-gpio.c +++ b/drivers/irqchip/irq-meson-gpio.c @@ -150,7 +150,7 @@ static const struct meson_gpio_irq_params s4_params = { INIT_MESON_S4_COMMON_DATA(82) }; -static const struct of_device_id meson_irq_gpio_matches[] = { +static const struct of_device_id meson_irq_gpio_matches[] __maybe_unused = { { .compatible = "amlogic,meson8-gpio-intc", .data = &meson8_params }, { .compatible = "amlogic,meson8b-gpio-intc", .data = &meson8b_params }, { .compatible = "amlogic,meson-gxbb-gpio-intc", .data = &gxbb_params }, From cddb536a73ef2c82ce04059dc61c65e85a355561 Mon Sep 17 00:00:00 2001 From: Kefeng Wang Date: Fri, 5 May 2023 17:06:54 +0800 Subject: [PATCH 6/6] irqchip/mbigen: Unify the error handling in mbigen_of_create_domain() Dan Carpenter reported that commit fea087fc291b "irqchip/mbigen: move to use bus_get_dev_root()" leads to the following Smatch static checker warning: drivers/irqchip/irq-mbigen.c:258 mbigen_of_create_domain() error: potentially dereferencing uninitialized 'child'. It should not cause a problem on real hardware, but better to fix the warning, let's move the bus_get_dev_root() out of the loop, and unify the error handling to silence it. Reported-by: Dan Carpenter Signed-off-by: Kefeng Wang Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20230505090654.12793-1-wangkefeng.wang@huawei.com --- drivers/irqchip/irq-mbigen.c | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/drivers/irqchip/irq-mbigen.c b/drivers/irqchip/irq-mbigen.c index eada5e0e3eb9..5101a3fb11df 100644 --- a/drivers/irqchip/irq-mbigen.c +++ b/drivers/irqchip/irq-mbigen.c @@ -240,26 +240,27 @@ static int mbigen_of_create_domain(struct platform_device *pdev, struct irq_domain *domain; struct device_node *np; u32 num_pins; + int ret = 0; + + parent = bus_get_dev_root(&platform_bus_type); + if (!parent) + return -ENODEV; for_each_child_of_node(pdev->dev.of_node, np) { if (!of_property_read_bool(np, "interrupt-controller")) continue; - parent = bus_get_dev_root(&platform_bus_type); - if (parent) { - child = of_platform_device_create(np, NULL, parent); - put_device(parent); - if (!child) { - of_node_put(np); - return -ENOMEM; - } + child = of_platform_device_create(np, NULL, parent); + if (!child) { + ret = -ENOMEM; + break; } if (of_property_read_u32(child->dev.of_node, "num-pins", &num_pins) < 0) { dev_err(&pdev->dev, "No num-pins property\n"); - of_node_put(np); - return -EINVAL; + ret = -EINVAL; + break; } domain = platform_msi_create_device_domain(&child->dev, num_pins, @@ -267,12 +268,16 @@ static int mbigen_of_create_domain(struct platform_device *pdev, &mbigen_domain_ops, mgn_chip); if (!domain) { - of_node_put(np); - return -ENOMEM; + ret = -ENOMEM; + break; } } - return 0; + put_device(parent); + if (ret) + of_node_put(np); + + return ret; } #ifdef CONFIG_ACPI