From 4ab2bb3c311a45d80d31f2f189606871669ed792 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Filipe=20La=C3=ADns?= Date: Sat, 11 Jan 2020 19:24:19 +0000 Subject: [PATCH 001/400] HID: logitech-hidpp: BatteryVoltage: only read chargeStatus if extPower is active MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In the HID++ 2.0 function getBatteryInfo() from the BatteryVoltage (0x1001) feature, chargeStatus is only valid if extPower is active. Previously we were ignoring extPower, which resulted in wrong values. Example: With an unplugged mouse $ cat /sys/class/power_supply/hidpp_battery_0/status Charging This patch fixes that, it also renames charge_sts to flags as charge_sts can be confused with chargeStatus from the spec. Spec: +--------+-------------------------------------------------------------------------+ | byte | 2 | +--------+--------------+------------+------------+----------+----------+----------+ | bit | 0..2 | 3 | 4 | 5 | 6 | 7 | +--------+--------------+------------+------------+----------+----------+----------+ | buffer | chargeStatus | fastCharge | slowCharge | critical | (unused) | extPower | +--------+--------------+------------+------------+----------+----------+----------+ Table 1 - battery voltage (0x1001), getBatteryInfo() (ASE 0), 3rd byte +-------+--------------------------------------+ | value | meaning | +-------+--------------------------------------+ | 0 | Charging | +-------+--------------------------------------+ | 1 | End of charge (100% charged) | +-------+--------------------------------------+ | 2 | Charge stopped (any "normal" reason) | +-------+--------------------------------------+ | 7 | Hardware error | +-------+--------------------------------------+ Table 2 - chargeStatus value Signed-off-by: Filipe Laíns Tested-by: Pedro Vanzella Reviewed-by: Pedro Vanzella Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-logitech-hidpp.c | 43 ++++++++++++++++---------------- 1 file changed, 21 insertions(+), 22 deletions(-) diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c index 70e1cb928bf0..094f4f1b6555 100644 --- a/drivers/hid/hid-logitech-hidpp.c +++ b/drivers/hid/hid-logitech-hidpp.c @@ -1256,36 +1256,35 @@ static int hidpp20_battery_map_status_voltage(u8 data[3], int *voltage, { int status; - long charge_sts = (long)data[2]; + long flags = (long) data[2]; - *level = POWER_SUPPLY_CAPACITY_LEVEL_UNKNOWN; - switch (data[2] & 0xe0) { - case 0x00: - status = POWER_SUPPLY_STATUS_CHARGING; - break; - case 0x20: - status = POWER_SUPPLY_STATUS_FULL; - *level = POWER_SUPPLY_CAPACITY_LEVEL_FULL; - break; - case 0x40: + if (flags & 0x80) + switch (flags & 0x07) { + case 0: + status = POWER_SUPPLY_STATUS_CHARGING; + break; + case 1: + status = POWER_SUPPLY_STATUS_FULL; + *level = POWER_SUPPLY_CAPACITY_LEVEL_FULL; + break; + case 2: + status = POWER_SUPPLY_STATUS_NOT_CHARGING; + break; + default: + status = POWER_SUPPLY_STATUS_UNKNOWN; + break; + } + else status = POWER_SUPPLY_STATUS_DISCHARGING; - break; - case 0xe0: - status = POWER_SUPPLY_STATUS_NOT_CHARGING; - break; - default: - status = POWER_SUPPLY_STATUS_UNKNOWN; - } *charge_type = POWER_SUPPLY_CHARGE_TYPE_STANDARD; - if (test_bit(3, &charge_sts)) { + if (test_bit(3, &flags)) { *charge_type = POWER_SUPPLY_CHARGE_TYPE_FAST; } - if (test_bit(4, &charge_sts)) { + if (test_bit(4, &flags)) { *charge_type = POWER_SUPPLY_CHARGE_TYPE_TRICKLE; } - - if (test_bit(5, &charge_sts)) { + if (test_bit(5, &flags)) { *level = POWER_SUPPLY_CAPACITY_LEVEL_CRITICAL; } From a5b0cda136f4f420a8e24e50d19dfcef2f81df2e Mon Sep 17 00:00:00 2001 From: Petr Vorel Date: Tue, 28 Jan 2020 00:14:39 +0100 Subject: [PATCH 002/400] regulator: qcom_spmi: Fix docs for PM8004 Fixes: 2e36e140b8b8 ("regulator: qcom_spmi: Add support for PM8004 regulators") Signed-off-by: Petr Vorel Link: https://lore.kernel.org/r/20200127231439.3562452-1-petr.vorel@gmail.com Signed-off-by: Mark Brown --- .../devicetree/bindings/regulator/qcom,spmi-regulator.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/regulator/qcom,spmi-regulator.txt b/Documentation/devicetree/bindings/regulator/qcom,spmi-regulator.txt index f5cdac8b2847..8b005192f6e8 100644 --- a/Documentation/devicetree/bindings/regulator/qcom,spmi-regulator.txt +++ b/Documentation/devicetree/bindings/regulator/qcom,spmi-regulator.txt @@ -161,7 +161,7 @@ The regulator node houses sub-nodes for each regulator within the device. Each sub-node is identified using the node's name, with valid values listed for each of the PMICs below. -pm8005: +pm8004: s2, s5 pm8005: From beae56192a2570578ae45050e73c5ff9254f63e6 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Sat, 1 Feb 2020 12:56:48 +0100 Subject: [PATCH 003/400] HID: ite: Only bind to keyboard USB interface on Acer SW5-012 keyboard dock MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit 8f18eca9ebc5 ("HID: ite: Add USB id match for Acer SW5-012 keyboard dock") added the USB id for the Acer SW5-012's keyboard dock to the hid-ite driver to fix the rfkill driver not working. Most keyboard docks with an ITE 8595 keyboard/touchpad controller have the "Wireless Radio Control" bits which need the special hid-ite driver on the second USB interface (the mouse interface) and their touchpad only supports mouse emulation, so using generic hid-input handling for anything but the "Wireless Radio Control" bits is fine. On these devices we simply bind to all USB interfaces. But unlike other ITE8595 using keyboard docks, the Acer Aspire Switch 10 (SW5-012)'s touchpad not only does mouse emulation it also supports HID-multitouch and all the keys including the "Wireless Radio Control" bits have been moved to the first USB interface (the keyboard intf). So we need hid-ite to handle the first (keyboard) USB interface and have it NOT bind to the second (mouse) USB interface so that that can be handled by hid-multitouch.c and we get proper multi-touch support. This commit changes the hid_device_id for the SW5-012 keyboard dock to only match on hid devices from the HID_GROUP_GENERIC group, this way hid-ite will not bind the the mouse/multi-touch interface which has HID_GROUP_MULTITOUCH_WIN_8 as group. This fixes the regression to mouse-emulation mode introduced by adding the keyboard dock USB id. Cc: stable@vger.kernel.org Fixes: 8f18eca9ebc5 ("HID: ite: Add USB id match for Acer SW5-012 keyboard dock") Reported-by: Zdeněk Rampas Signed-off-by: Hans de Goede Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-ite.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/hid/hid-ite.c b/drivers/hid/hid-ite.c index c436e12feb23..6c55682c5974 100644 --- a/drivers/hid/hid-ite.c +++ b/drivers/hid/hid-ite.c @@ -41,8 +41,9 @@ static const struct hid_device_id ite_devices[] = { { HID_USB_DEVICE(USB_VENDOR_ID_ITE, USB_DEVICE_ID_ITE8595) }, { HID_USB_DEVICE(USB_VENDOR_ID_258A, USB_DEVICE_ID_258A_6A88) }, /* ITE8595 USB kbd ctlr, with Synaptics touchpad connected to it. */ - { HID_USB_DEVICE(USB_VENDOR_ID_SYNAPTICS, - USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012) }, + { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC, + USB_VENDOR_ID_SYNAPTICS, + USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012) }, { } }; MODULE_DEVICE_TABLE(hid, ite_devices); From 51b2569402a38e206d26728b0099eb059ab315b5 Mon Sep 17 00:00:00 2001 From: Jeremy Cline Date: Wed, 5 Feb 2020 08:41:46 -0500 Subject: [PATCH 004/400] KVM: arm/arm64: Fix up includes for trace.h Fedora kernel builds on armv7hl began failing recently because kvm_arm_exception_type and kvm_arm_exception_class were undeclared in trace.h. Add the missing include. Fixes: 0e20f5e25556 ("KVM: arm/arm64: Cleanup MMIO handling") Signed-off-by: Jeremy Cline Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20200205134146.82678-1-jcline@redhat.com --- virt/kvm/arm/trace.h | 1 + 1 file changed, 1 insertion(+) diff --git a/virt/kvm/arm/trace.h b/virt/kvm/arm/trace.h index 204d210d01c2..cc94ccc68821 100644 --- a/virt/kvm/arm/trace.h +++ b/virt/kvm/arm/trace.h @@ -4,6 +4,7 @@ #include #include +#include #undef TRACE_SYSTEM #define TRACE_SYSTEM kvm From e4e8276a4f652be2c7bb783a0155d4adb85f5d7d Mon Sep 17 00:00:00 2001 From: Vignesh Raghavendra Date: Tue, 4 Feb 2020 18:18:15 +0530 Subject: [PATCH 005/400] spi: spi-omap2-mcspi: Handle DMA size restriction on AM65x On AM654, McSPI can only support 4K - 1 bytes per transfer when DMA is enabled. Therefore populate master->max_transfer_size callback to inform client drivers of this restriction when DMA channels are available. Signed-off-by: Vignesh Raghavendra Link: https://lore.kernel.org/r/20200204124816.16735-2-vigneshr@ti.com Signed-off-by: Mark Brown --- drivers/spi/spi-omap2-mcspi.c | 26 +++++++++++++++++++ include/linux/platform_data/spi-omap2-mcspi.h | 1 + 2 files changed, 27 insertions(+) diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c index 7e2292c11d12..e9bc9cf984d6 100644 --- a/drivers/spi/spi-omap2-mcspi.c +++ b/drivers/spi/spi-omap2-mcspi.c @@ -130,6 +130,7 @@ struct omap2_mcspi { int fifo_depth; bool slave_aborted; unsigned int pin_dir:1; + size_t max_xfer_len; }; struct omap2_mcspi_cs { @@ -1305,6 +1306,18 @@ static bool omap2_mcspi_can_dma(struct spi_master *master, return (xfer->len >= DMA_MIN_BYTES); } +static size_t omap2_mcspi_max_xfer_size(struct spi_device *spi) +{ + struct omap2_mcspi *mcspi = spi_master_get_devdata(spi->master); + struct omap2_mcspi_dma *mcspi_dma = + &mcspi->dma_channels[spi->chip_select]; + + if (mcspi->max_xfer_len && mcspi_dma->dma_rx) + return mcspi->max_xfer_len; + + return SIZE_MAX; +} + static int omap2_mcspi_controller_setup(struct omap2_mcspi *mcspi) { struct spi_master *master = mcspi->master; @@ -1373,6 +1386,11 @@ static struct omap2_mcspi_platform_config omap4_pdata = { .regs_offset = OMAP4_MCSPI_REG_OFFSET, }; +static struct omap2_mcspi_platform_config am654_pdata = { + .regs_offset = OMAP4_MCSPI_REG_OFFSET, + .max_xfer_len = SZ_4K - 1, +}; + static const struct of_device_id omap_mcspi_of_match[] = { { .compatible = "ti,omap2-mcspi", @@ -1382,6 +1400,10 @@ static const struct of_device_id omap_mcspi_of_match[] = { .compatible = "ti,omap4-mcspi", .data = &omap4_pdata, }, + { + .compatible = "ti,am654-mcspi", + .data = &am654_pdata, + }, { }, }; MODULE_DEVICE_TABLE(of, omap_mcspi_of_match); @@ -1439,6 +1461,10 @@ static int omap2_mcspi_probe(struct platform_device *pdev) mcspi->pin_dir = pdata->pin_dir; } regs_offset = pdata->regs_offset; + if (pdata->max_xfer_len) { + mcspi->max_xfer_len = pdata->max_xfer_len; + master->max_transfer_size = omap2_mcspi_max_xfer_size; + } r = platform_get_resource(pdev, IORESOURCE_MEM, 0); mcspi->base = devm_ioremap_resource(&pdev->dev, r); diff --git a/include/linux/platform_data/spi-omap2-mcspi.h b/include/linux/platform_data/spi-omap2-mcspi.h index 0bf9fddb8306..3b400b1919a9 100644 --- a/include/linux/platform_data/spi-omap2-mcspi.h +++ b/include/linux/platform_data/spi-omap2-mcspi.h @@ -11,6 +11,7 @@ struct omap2_mcspi_platform_config { unsigned short num_cs; unsigned int regs_offset; unsigned int pin_dir:1; + size_t max_xfer_len; }; struct omap2_mcspi_device_config { From 32f2fc5dc3992b4b60cc6b1a6a31be605cc9c3a2 Mon Sep 17 00:00:00 2001 From: Vignesh Raghavendra Date: Tue, 4 Feb 2020 18:18:16 +0530 Subject: [PATCH 006/400] spi: spi-omap2-mcspi: Support probe deferral for DMA channels dma_request_channel() can return -EPROBE_DEFER, if DMA driver is not ready. Currently driver just falls back to PIO mode on probe deferral. Fix this by requesting all required channels during probe and propagating EPROBE_DEFER error code. Signed-off-by: Vignesh Raghavendra Link: https://lore.kernel.org/r/20200204124816.16735-3-vigneshr@ti.com Signed-off-by: Mark Brown --- drivers/spi/spi-omap2-mcspi.c | 77 +++++++++++++++++------------------ 1 file changed, 38 insertions(+), 39 deletions(-) diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c index e9bc9cf984d6..e9e256718ef4 100644 --- a/drivers/spi/spi-omap2-mcspi.c +++ b/drivers/spi/spi-omap2-mcspi.c @@ -975,20 +975,12 @@ static int omap2_mcspi_setup_transfer(struct spi_device *spi, * Note that we currently allow DMA only if we get a channel * for both rx and tx. Otherwise we'll do PIO for both rx and tx. */ -static int omap2_mcspi_request_dma(struct spi_device *spi) +static int omap2_mcspi_request_dma(struct omap2_mcspi *mcspi, + struct omap2_mcspi_dma *mcspi_dma) { - struct spi_master *master = spi->master; - struct omap2_mcspi *mcspi; - struct omap2_mcspi_dma *mcspi_dma; int ret = 0; - mcspi = spi_master_get_devdata(master); - mcspi_dma = mcspi->dma_channels + spi->chip_select; - - init_completion(&mcspi_dma->dma_rx_completion); - init_completion(&mcspi_dma->dma_tx_completion); - - mcspi_dma->dma_rx = dma_request_chan(&master->dev, + mcspi_dma->dma_rx = dma_request_chan(mcspi->dev, mcspi_dma->dma_rx_ch_name); if (IS_ERR(mcspi_dma->dma_rx)) { ret = PTR_ERR(mcspi_dma->dma_rx); @@ -996,7 +988,7 @@ static int omap2_mcspi_request_dma(struct spi_device *spi) goto no_dma; } - mcspi_dma->dma_tx = dma_request_chan(&master->dev, + mcspi_dma->dma_tx = dma_request_chan(mcspi->dev, mcspi_dma->dma_tx_ch_name); if (IS_ERR(mcspi_dma->dma_tx)) { ret = PTR_ERR(mcspi_dma->dma_tx); @@ -1005,20 +997,40 @@ static int omap2_mcspi_request_dma(struct spi_device *spi) mcspi_dma->dma_rx = NULL; } + init_completion(&mcspi_dma->dma_rx_completion); + init_completion(&mcspi_dma->dma_tx_completion); + no_dma: return ret; } +static void omap2_mcspi_release_dma(struct spi_master *master) +{ + struct omap2_mcspi *mcspi = spi_master_get_devdata(master); + struct omap2_mcspi_dma *mcspi_dma; + int i; + + for (i = 0; i < master->num_chipselect; i++) { + mcspi_dma = &mcspi->dma_channels[i]; + + if (mcspi_dma->dma_rx) { + dma_release_channel(mcspi_dma->dma_rx); + mcspi_dma->dma_rx = NULL; + } + if (mcspi_dma->dma_tx) { + dma_release_channel(mcspi_dma->dma_tx); + mcspi_dma->dma_tx = NULL; + } + } +} + static int omap2_mcspi_setup(struct spi_device *spi) { int ret; struct omap2_mcspi *mcspi = spi_master_get_devdata(spi->master); struct omap2_mcspi_regs *ctx = &mcspi->ctx; - struct omap2_mcspi_dma *mcspi_dma; struct omap2_mcspi_cs *cs = spi->controller_state; - mcspi_dma = &mcspi->dma_channels[spi->chip_select]; - if (!cs) { cs = kzalloc(sizeof *cs, GFP_KERNEL); if (!cs) @@ -1043,13 +1055,6 @@ static int omap2_mcspi_setup(struct spi_device *spi) } } - if (!mcspi_dma->dma_rx || !mcspi_dma->dma_tx) { - ret = omap2_mcspi_request_dma(spi); - if (ret) - dev_warn(&spi->dev, "not using DMA for McSPI (%d)\n", - ret); - } - ret = pm_runtime_get_sync(mcspi->dev); if (ret < 0) { pm_runtime_put_noidle(mcspi->dev); @@ -1066,12 +1071,8 @@ static int omap2_mcspi_setup(struct spi_device *spi) static void omap2_mcspi_cleanup(struct spi_device *spi) { - struct omap2_mcspi *mcspi; - struct omap2_mcspi_dma *mcspi_dma; struct omap2_mcspi_cs *cs; - mcspi = spi_master_get_devdata(spi->master); - if (spi->controller_state) { /* Unlink controller state from context save list */ cs = spi->controller_state; @@ -1080,19 +1081,6 @@ static void omap2_mcspi_cleanup(struct spi_device *spi) kfree(cs); } - if (spi->chip_select < spi->master->num_chipselect) { - mcspi_dma = &mcspi->dma_channels[spi->chip_select]; - - if (mcspi_dma->dma_rx) { - dma_release_channel(mcspi_dma->dma_rx); - mcspi_dma->dma_rx = NULL; - } - if (mcspi_dma->dma_tx) { - dma_release_channel(mcspi_dma->dma_tx); - mcspi_dma->dma_tx = NULL; - } - } - if (gpio_is_valid(spi->cs_gpio)) gpio_free(spi->cs_gpio); } @@ -1303,6 +1291,9 @@ static bool omap2_mcspi_can_dma(struct spi_master *master, if (spi_controller_is_slave(master)) return true; + master->dma_rx = mcspi_dma->dma_rx; + master->dma_tx = mcspi_dma->dma_tx; + return (xfer->len >= DMA_MIN_BYTES); } @@ -1490,6 +1481,11 @@ static int omap2_mcspi_probe(struct platform_device *pdev) for (i = 0; i < master->num_chipselect; i++) { sprintf(mcspi->dma_channels[i].dma_rx_ch_name, "rx%d", i); sprintf(mcspi->dma_channels[i].dma_tx_ch_name, "tx%d", i); + + status = omap2_mcspi_request_dma(mcspi, + &mcspi->dma_channels[i]); + if (status == -EPROBE_DEFER) + goto free_master; } status = platform_get_irq(pdev, 0); @@ -1527,6 +1523,7 @@ disable_pm: pm_runtime_put_sync(&pdev->dev); pm_runtime_disable(&pdev->dev); free_master: + omap2_mcspi_release_dma(master); spi_master_put(master); return status; } @@ -1536,6 +1533,8 @@ static int omap2_mcspi_remove(struct platform_device *pdev) struct spi_master *master = platform_get_drvdata(pdev); struct omap2_mcspi *mcspi = spi_master_get_devdata(master); + omap2_mcspi_release_dma(master); + pm_runtime_dont_use_autosuspend(mcspi->dev); pm_runtime_put_sync(mcspi->dev); pm_runtime_disable(&pdev->dev); From 3f9e12e0df012c4a9a7fd7eb0d3ae69b459d6b2c Mon Sep 17 00:00:00 2001 From: Jean Delvare Date: Thu, 6 Feb 2020 16:58:45 +0100 Subject: [PATCH 007/400] ACPI: watchdog: Allow disabling WDAT at boot In case the WDAT interface is broken, give the user an option to ignore it to let a native driver bind to the watchdog device instead. Signed-off-by: Jean Delvare Acked-by: Mika Westerberg Signed-off-by: Rafael J. Wysocki --- Documentation/admin-guide/kernel-parameters.txt | 4 ++++ drivers/acpi/acpi_watchdog.c | 12 +++++++++++- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index dbc22d684627..c07815d230bc 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -136,6 +136,10 @@ dynamic table installation which will install SSDT tables to /sys/firmware/acpi/tables/dynamic. + acpi_no_watchdog [HW,ACPI,WDT] + Ignore the ACPI-based watchdog interface (WDAT) and let + a native driver control the watchdog device instead. + acpi_rsdp= [ACPI,EFI,KEXEC] Pass the RSDP address to the kernel, mostly used on machines running EFI runtime service to boot the diff --git a/drivers/acpi/acpi_watchdog.c b/drivers/acpi/acpi_watchdog.c index b5516b04ffc0..ab6e434b4cee 100644 --- a/drivers/acpi/acpi_watchdog.c +++ b/drivers/acpi/acpi_watchdog.c @@ -55,12 +55,14 @@ static bool acpi_watchdog_uses_rtc(const struct acpi_table_wdat *wdat) } #endif +static bool acpi_no_watchdog; + static const struct acpi_table_wdat *acpi_watchdog_get_wdat(void) { const struct acpi_table_wdat *wdat = NULL; acpi_status status; - if (acpi_disabled) + if (acpi_disabled || acpi_no_watchdog) return NULL; status = acpi_get_table(ACPI_SIG_WDAT, 0, @@ -88,6 +90,14 @@ bool acpi_has_watchdog(void) } EXPORT_SYMBOL_GPL(acpi_has_watchdog); +/* ACPI watchdog can be disabled on boot command line */ +static int __init disable_acpi_watchdog(char *str) +{ + acpi_no_watchdog = true; + return 1; +} +__setup("acpi_no_watchdog", disable_acpi_watchdog); + void __init acpi_watchdog_init(void) { const struct acpi_wdat_entry *entries; From e20d8e81a0e06c672e964c9f01100f07a64b1ce6 Mon Sep 17 00:00:00 2001 From: Brendan Higgins Date: Fri, 31 Jan 2020 16:01:02 -0800 Subject: [PATCH 008/400] Documentation: kunit: fixed sphinx error in code block Fix a missing newline in a code block that was causing a warning: Documentation/dev-tools/kunit/usage.rst:553: WARNING: Error in "code-block" directive: maximum 1 argument(s) allowed, 3 supplied. .. code-block:: bash modprobe example-test Signed-off-by: Brendan Higgins Reviewed-by: Alan Maguire Signed-off-by: Shuah Khan --- Documentation/dev-tools/kunit/usage.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst index 7cd56a1993b1..607758a66a99 100644 --- a/Documentation/dev-tools/kunit/usage.rst +++ b/Documentation/dev-tools/kunit/usage.rst @@ -551,6 +551,7 @@ options to your ``.config``: Once the kernel is built and installed, a simple .. code-block:: bash + modprobe example-test ...will run the tests. From 318caac7c81cdf5806df30c3d72385659a5f0f53 Mon Sep 17 00:00:00 2001 From: Evan Benn Date: Fri, 7 Feb 2020 15:23:51 +1100 Subject: [PATCH 009/400] drm/mediatek: Find the cursor plane instead of hard coding it The cursor and primary planes were hard coded. Now search for them for passing to drm_crtc_init_with_planes Signed-off-by: Evan Benn Reviewed-by: Sean Paul Signed-off-by: CK Hu --- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c index 0dfcd1787e65..5ee74d7ce35c 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c @@ -636,10 +636,18 @@ static const struct drm_crtc_helper_funcs mtk_crtc_helper_funcs = { static int mtk_drm_crtc_init(struct drm_device *drm, struct mtk_drm_crtc *mtk_crtc, - struct drm_plane *primary, - struct drm_plane *cursor, unsigned int pipe) + unsigned int pipe) { - int ret; + struct drm_plane *primary = NULL; + struct drm_plane *cursor = NULL; + int i, ret; + + for (i = 0; i < mtk_crtc->layer_nr; i++) { + if (mtk_crtc->planes[i].type == DRM_PLANE_TYPE_PRIMARY) + primary = &mtk_crtc->planes[i]; + else if (mtk_crtc->planes[i].type == DRM_PLANE_TYPE_CURSOR) + cursor = &mtk_crtc->planes[i]; + } ret = drm_crtc_init_with_planes(drm, &mtk_crtc->base, primary, cursor, &mtk_crtc_funcs, NULL); @@ -807,9 +815,7 @@ int mtk_drm_crtc_create(struct drm_device *drm_dev, return ret; } - ret = mtk_drm_crtc_init(drm_dev, mtk_crtc, &mtk_crtc->planes[0], - mtk_crtc->layer_nr > 1 ? &mtk_crtc->planes[1] : - NULL, pipe); + ret = mtk_drm_crtc_init(drm_dev, mtk_crtc, pipe); if (ret < 0) return ret; From 26d696192aa5f4fe9119d6d23f90ed535053abca Mon Sep 17 00:00:00 2001 From: Sean Paul Date: Thu, 30 Jan 2020 14:24:55 -0500 Subject: [PATCH 010/400] drm/mediatek: Ensure the cursor plane is on top of other overlays Currently the cursor is placed on the first overlay plane, which means it will be at the bottom of the stack when the hw does the compositing with anything other than primary plane. Since mtk doesn't support plane zpos, change the cursor location to the top-most plane. Signed-off-by: Sean Paul Signed-off-by: CK Hu --- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c index 5ee74d7ce35c..01e37742dcea 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c @@ -697,11 +697,12 @@ static int mtk_drm_crtc_num_comp_planes(struct mtk_drm_crtc *mtk_crtc, } static inline -enum drm_plane_type mtk_drm_crtc_plane_type(unsigned int plane_idx) +enum drm_plane_type mtk_drm_crtc_plane_type(unsigned int plane_idx, + unsigned int num_planes) { if (plane_idx == 0) return DRM_PLANE_TYPE_PRIMARY; - else if (plane_idx == 1) + else if (plane_idx == (num_planes - 1)) return DRM_PLANE_TYPE_CURSOR; else return DRM_PLANE_TYPE_OVERLAY; @@ -720,7 +721,8 @@ static int mtk_drm_crtc_init_comp_planes(struct drm_device *drm_dev, ret = mtk_plane_init(drm_dev, &mtk_crtc->planes[mtk_crtc->layer_nr], BIT(pipe), - mtk_drm_crtc_plane_type(mtk_crtc->layer_nr), + mtk_drm_crtc_plane_type(mtk_crtc->layer_nr, + num_planes), mtk_ddp_comp_supported_rotations(comp)); if (ret) return ret; From 0cbb4f9c69827decf56519c2f63918f16904ede5 Mon Sep 17 00:00:00 2001 From: Stephen Boyd Date: Mon, 3 Feb 2020 09:46:19 -0800 Subject: [PATCH 011/400] platform/chrome: wilco_ec: Include asm/unaligned instead of linux/ path It seems that we shouldn't try to include the include/linux/ path to unaligned functions. Just include asm/unaligned.h instead so that we don't run into compilation warnings like below. In file included from drivers/platform/chrome/wilco_ec/properties.c:8:0: include/linux/unaligned/le_memmove.h:7:19: error: redefinition of 'get_unaligned_le16' static inline u16 get_unaligned_le16(const void *p) ^~~~~~~~~~~~~~~~~~ In file included from arch/ia64/include/asm/unaligned.h:5:0, from arch/ia64/include/asm/io.h:23, from arch/ia64/include/asm/smp.h:21, from include/linux/smp.h:68, from include/linux/percpu.h:7, from include/linux/arch_topology.h:9, from include/linux/topology.h:30, from include/linux/gfp.h:9, from include/linux/xarray.h:14, from include/linux/radix-tree.h:18, from include/linux/idr.h:15, from include/linux/kernfs.h:13, from include/linux/sysfs.h:16, from include/linux/kobject.h:20, from include/linux/device.h:16, from include/linux/platform_data/wilco-ec.h:11, from drivers/platform/chrome/wilco_ec/properties.c:6: include/linux/unaligned/le_struct.h:7:19: note: previous definition of 'get_unaligned_le16' was here static inline u16 get_unaligned_le16(const void *p) ^~~~~~~~~~~~~~~~~~ Reported-by: kbuild test robot Fixes: 60fb8a8e93ca ("platform/chrome: wilco_ec: Allow wilco to be compiled in COMPILE_TEST") Signed-off-by: Stephen Boyd Signed-off-by: Enric Balletbo i Serra --- drivers/platform/chrome/wilco_ec/properties.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/platform/chrome/wilco_ec/properties.c b/drivers/platform/chrome/wilco_ec/properties.c index e69682c95ea2..62f27610dd33 100644 --- a/drivers/platform/chrome/wilco_ec/properties.c +++ b/drivers/platform/chrome/wilco_ec/properties.c @@ -5,7 +5,7 @@ #include #include -#include +#include /* Operation code; what the EC should do with the property */ enum ec_property_op { From e433be929e63265b7412478eb7ff271467aee2d7 Mon Sep 17 00:00:00 2001 From: Mansour Behabadi Date: Wed, 29 Jan 2020 17:26:31 +1100 Subject: [PATCH 012/400] HID: apple: Add support for recent firmware on Magic Keyboards Magic Keyboards with more recent firmware (0x0100) report Fn key differently. Without this patch, Fn key may not behave as expected and may not be configurable via hid_apple fnmode module parameter. Signed-off-by: Mansour Behabadi Signed-off-by: Jiri Kosina --- drivers/hid/hid-apple.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/hid/hid-apple.c b/drivers/hid/hid-apple.c index 6ac8becc2372..d732d1d10caf 100644 --- a/drivers/hid/hid-apple.c +++ b/drivers/hid/hid-apple.c @@ -340,7 +340,8 @@ static int apple_input_mapping(struct hid_device *hdev, struct hid_input *hi, unsigned long **bit, int *max) { if (usage->hid == (HID_UP_CUSTOM | 0x0003) || - usage->hid == (HID_UP_MSVENDOR | 0x0003)) { + usage->hid == (HID_UP_MSVENDOR | 0x0003) || + usage->hid == (HID_UP_HPVENDOR2 | 0x0003)) { /* The fn key on Apple USB keyboards */ set_bit(EV_REP, hi->input->evbit); hid_map_usage_clear(hi, usage, bit, max, EV_KEY, KEY_FN); From 5ebdffd25098898aff1249ae2f7dbfddd76d8f8f Mon Sep 17 00:00:00 2001 From: Johan Korsnes Date: Fri, 17 Jan 2020 13:08:35 +0100 Subject: [PATCH 013/400] HID: core: fix off-by-one memset in hid_report_raw_event() In case a report is greater than HID_MAX_BUFFER_SIZE, it is truncated, but the report-number byte is not correctly handled. This results in a off-by-one in the following memset, causing a kernel Oops and ensuing system crash. Note: With commit 8ec321e96e05 ("HID: Fix slab-out-of-bounds read in hid_field_extract") I no longer hit the kernel Oops as we instead fail "controlled" at probe if there is a report too long in the HID report-descriptor. hid_report_raw_event() is an exported symbol, so presumabely we cannot always rely on this being the case. Fixes: 966922f26c7f ("HID: fix a crash in hid_report_raw_event() function.") Signed-off-by: Johan Korsnes Cc: Armando Visconti Cc: Jiri Kosina Cc: Alan Stern Signed-off-by: Jiri Kosina --- drivers/hid/hid-core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c index 851fe54ea59e..359616e3efbb 100644 --- a/drivers/hid/hid-core.c +++ b/drivers/hid/hid-core.c @@ -1741,7 +1741,9 @@ int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size, rsize = ((report->size - 1) >> 3) + 1; - if (rsize > HID_MAX_BUFFER_SIZE) + if (report_enum->numbered && rsize >= HID_MAX_BUFFER_SIZE) + rsize = HID_MAX_BUFFER_SIZE - 1; + else if (rsize > HID_MAX_BUFFER_SIZE) rsize = HID_MAX_BUFFER_SIZE; if (csize < rsize) { From 84a4062632462c4320704fcdf8e99e89e94c0aba Mon Sep 17 00:00:00 2001 From: Johan Korsnes Date: Fri, 17 Jan 2020 13:08:36 +0100 Subject: [PATCH 014/400] HID: core: increase HID report buffer size to 8KiB We have a HID touch device that reports its opens and shorts test results in HID buffers of size 8184 bytes. The maximum size of the HID buffer is currently set to 4096 bytes, causing probe of this device to fail. With this patch we increase the maximum size of the HID buffer to 8192 bytes, making device probe and acquisition of said buffers succeed. Signed-off-by: Johan Korsnes Cc: Alan Stern Cc: Armando Visconti Cc: Jiri Kosina Signed-off-by: Jiri Kosina --- include/linux/hid.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/hid.h b/include/linux/hid.h index cd41f209043f..875f71132b14 100644 --- a/include/linux/hid.h +++ b/include/linux/hid.h @@ -492,7 +492,7 @@ struct hid_report_enum { }; #define HID_MIN_BUFFER_SIZE 64 /* make sure there is at least a packet size of space */ -#define HID_MAX_BUFFER_SIZE 4096 /* 4kb */ +#define HID_MAX_BUFFER_SIZE 8192 /* 8kb */ #define HID_CONTROL_FIFO_SIZE 256 /* to init devices with >100 reports */ #define HID_OUTPUT_FIFO_SIZE 64 From 5c02c447eaeda29d3da121a2e17b97ccaf579b51 Mon Sep 17 00:00:00 2001 From: "dan.carpenter@oracle.com" Date: Wed, 15 Jan 2020 20:46:28 +0300 Subject: [PATCH 015/400] HID: hiddev: Fix race in in hiddev_disconnect() Syzbot reports that "hiddev" is used after it's free in hiddev_disconnect(). The hiddev_disconnect() function sets "hiddev->exist = 0;" so hiddev_release() can free it as soon as we drop the "existancelock" lock. This patch moves the mutex_unlock(&hiddev->existancelock) until after we have finished using it. Reported-by: syzbot+784ccb935f9900cc7c9e@syzkaller.appspotmail.com Fixes: 7f77897ef2b6 ("HID: hiddev: fix potential use-after-free") Suggested-by: Alan Stern Signed-off-by: Dan Carpenter Signed-off-by: Jiri Kosina --- drivers/hid/usbhid/hiddev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c index a970b809d778..4140dea693e9 100644 --- a/drivers/hid/usbhid/hiddev.c +++ b/drivers/hid/usbhid/hiddev.c @@ -932,9 +932,9 @@ void hiddev_disconnect(struct hid_device *hid) hiddev->exist = 0; if (hiddev->open) { - mutex_unlock(&hiddev->existancelock); hid_hw_close(hiddev->hid); wake_up_interruptible(&hiddev->wait); + mutex_unlock(&hiddev->existancelock); } else { mutex_unlock(&hiddev->existancelock); kfree(hiddev); From 8d2e77b39b8fecb794e19cd006a12f90b14dd077 Mon Sep 17 00:00:00 2001 From: Christophe JAILLET Date: Wed, 4 Dec 2019 04:35:25 +0100 Subject: [PATCH 016/400] HID: alps: Fix an error handling path in 'alps_input_configured()' They are issues: - if 'input_allocate_device()' fails and return NULL, there is no need to free anything and 'input_free_device()' call is a no-op. It can be axed. - 'ret' is known to be 0 at this point, so we must set it to a meaningful value before returning Fixes: 2562756dde55 ("HID: add Alps I2C HID Touchpad-Stick support") Signed-off-by: Christophe JAILLET Signed-off-by: Jiri Kosina --- drivers/hid/hid-alps.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hid/hid-alps.c b/drivers/hid/hid-alps.c index ae79a7c66737..fa704153cb00 100644 --- a/drivers/hid/hid-alps.c +++ b/drivers/hid/hid-alps.c @@ -730,7 +730,7 @@ static int alps_input_configured(struct hid_device *hdev, struct hid_input *hi) if (data->has_sp) { input2 = input_allocate_device(); if (!input2) { - input_free_device(input2); + ret = -ENOMEM; goto exit; } From 9e661cedcc0a072d91a32cb88e0515ea26e35711 Mon Sep 17 00:00:00 2001 From: Wolfram Sang Date: Wed, 12 Feb 2020 10:35:30 +0100 Subject: [PATCH 017/400] i2c: jz4780: silence log flood on txabrt The printout for txabrt is way too talkative and is highly annoying with scanning programs like 'i2cdetect'. Reduce it to the minimum, the rest can be gained by I2C core debugging and datasheet information. Also, make it a debug printout, it won't help the regular user. Fixes: ba92222ed63a ("i2c: jz4780: Add i2c bus controller driver for Ingenic JZ4780") Reported-by: H. Nikolaus Schaller Tested-by: H. Nikolaus Schaller Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-jz4780.c | 36 ++------------------------------- 1 file changed, 2 insertions(+), 34 deletions(-) diff --git a/drivers/i2c/busses/i2c-jz4780.c b/drivers/i2c/busses/i2c-jz4780.c index 16a67a64284a..b426fc956938 100644 --- a/drivers/i2c/busses/i2c-jz4780.c +++ b/drivers/i2c/busses/i2c-jz4780.c @@ -78,25 +78,6 @@ #define X1000_I2C_DC_STOP BIT(9) -static const char * const jz4780_i2c_abrt_src[] = { - "ABRT_7B_ADDR_NOACK", - "ABRT_10ADDR1_NOACK", - "ABRT_10ADDR2_NOACK", - "ABRT_XDATA_NOACK", - "ABRT_GCALL_NOACK", - "ABRT_GCALL_READ", - "ABRT_HS_ACKD", - "SBYTE_ACKDET", - "ABRT_HS_NORSTRT", - "SBYTE_NORSTRT", - "ABRT_10B_RD_NORSTRT", - "ABRT_MASTER_DIS", - "ARB_LOST", - "SLVFLUSH_TXFIFO", - "SLV_ARBLOST", - "SLVRD_INTX", -}; - #define JZ4780_I2C_INTST_IGC BIT(11) #define JZ4780_I2C_INTST_ISTT BIT(10) #define JZ4780_I2C_INTST_ISTP BIT(9) @@ -576,21 +557,8 @@ done: static void jz4780_i2c_txabrt(struct jz4780_i2c *i2c, int src) { - int i; - - dev_err(&i2c->adap.dev, "txabrt: 0x%08x\n", src); - dev_err(&i2c->adap.dev, "device addr=%x\n", - jz4780_i2c_readw(i2c, JZ4780_I2C_TAR)); - dev_err(&i2c->adap.dev, "send cmd count:%d %d\n", - i2c->cmd, i2c->cmd_buf[i2c->cmd]); - dev_err(&i2c->adap.dev, "receive data count:%d %d\n", - i2c->cmd, i2c->data_buf[i2c->cmd]); - - for (i = 0; i < 16; i++) { - if (src & BIT(i)) - dev_dbg(&i2c->adap.dev, "I2C TXABRT[%d]=%s\n", - i, jz4780_i2c_abrt_src[i]); - } + dev_dbg(&i2c->adap.dev, "txabrt: 0x%08x, cmd: %d, send: %d, recv: %d\n", + src, i2c->cmd, i2c->cmd_buf[i2c->cmd], i2c->data_buf[i2c->cmd]); } static inline int jz4780_i2c_xfer_read(struct jz4780_i2c *i2c, From 54498e8070e19e74498a72c7331348143e7e1f8c Mon Sep 17 00:00:00 2001 From: "Gustavo A. R. Silva" Date: Tue, 11 Feb 2020 08:47:04 -0600 Subject: [PATCH 018/400] i2c: altera: Fix potential integer overflow Factor out 100 from the equation and do 32-bit arithmetic (3 * clk_mhz / 10) instead of 64-bit. Notice that clk_mhz is MHz, so the multiplication will never wrap 32 bits and there is no need for div_u64(). Addresses-Coverity: 1458369 ("Unintentional integer overflow") Fixes: 0560ad576268 ("i2c: altera: Add Altera I2C Controller driver") Suggested-by: David Laight Signed-off-by: Gustavo A. R. Silva Reviewed-by: Thor Thayer Signed-off-by: Wolfram Sang --- drivers/i2c/busses/i2c-altera.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/i2c/busses/i2c-altera.c b/drivers/i2c/busses/i2c-altera.c index 5255d3755411..1de23b4f3809 100644 --- a/drivers/i2c/busses/i2c-altera.c +++ b/drivers/i2c/busses/i2c-altera.c @@ -171,7 +171,7 @@ static void altr_i2c_init(struct altr_i2c_dev *idev) /* SCL Low Time */ writel(t_low, idev->base + ALTR_I2C_SCL_LOW); /* SDA Hold Time, 300ns */ - writel(div_u64(300 * clk_mhz, 1000), idev->base + ALTR_I2C_SDA_HOLD); + writel(3 * clk_mhz / 10, idev->base + ALTR_I2C_SDA_HOLD); /* Mask all master interrupt bits */ altr_i2c_int_enable(idev, ALTR_I2C_ALL_IRQ, false); From 61b5865d56bbb09dd5887b197dd324d1d4333d47 Mon Sep 17 00:00:00 2001 From: Dave Jiang Date: Tue, 11 Feb 2020 10:42:47 -0700 Subject: [PATCH 019/400] dmaengine: idxd: fix runaway module ref count on device driver bind idxd_config_bus_probe() calls try_module_get() but never calls module_put() when it fails. Thus with every failed attempt, the ref count goes up. Add module_put() in failure paths. Fixes: c52ca478233c ("dmaengine: idxd: add configuration component of driver") Reported-by: Jerry Chen Signed-off-by: Dave Jiang Link: https://lore.kernel.org/r/158144296730.41381.12134210685456322434.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul --- drivers/dma/idxd/sysfs.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c index 6d907fe150aa..298855ca934f 100644 --- a/drivers/dma/idxd/sysfs.c +++ b/drivers/dma/idxd/sysfs.c @@ -124,6 +124,7 @@ static int idxd_config_bus_probe(struct device *dev) rc = idxd_device_config(idxd); if (rc < 0) { spin_unlock_irqrestore(&idxd->dev_lock, flags); + module_put(THIS_MODULE); dev_warn(dev, "Device config failed: %d\n", rc); return rc; } @@ -132,6 +133,7 @@ static int idxd_config_bus_probe(struct device *dev) rc = idxd_device_enable(idxd); if (rc < 0) { spin_unlock_irqrestore(&idxd->dev_lock, flags); + module_put(THIS_MODULE); dev_warn(dev, "Device enable failed: %d\n", rc); return rc; } @@ -142,6 +144,7 @@ static int idxd_config_bus_probe(struct device *dev) rc = idxd_register_dma_device(idxd); if (rc < 0) { spin_unlock_irqrestore(&idxd->dev_lock, flags); + module_put(THIS_MODULE); dev_dbg(dev, "Failed to register dmaengine device\n"); return rc; } From 83c49f734463980802d6dba376f69e787fb07a8e Mon Sep 17 00:00:00 2001 From: Changbin Du Date: Tue, 4 Feb 2020 20:51:15 +0800 Subject: [PATCH 020/400] dmaengine: doc: fix warnings/issues of client.rst This fixed some warnings/issues of client.rst. o Need a blank line for enumerated lists. o Do not create section in enumerated list. o Remove suffix ':' from section title. Signed-off-by: Changbin Du Link: https://lore.kernel.org/r/20200204125115.12128-1-changbin.du@gmail.com Signed-off-by: Vinod Koul --- Documentation/driver-api/dmaengine/client.rst | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/Documentation/driver-api/dmaengine/client.rst b/Documentation/driver-api/dmaengine/client.rst index e5953e7e4bf4..2104830a99ae 100644 --- a/Documentation/driver-api/dmaengine/client.rst +++ b/Documentation/driver-api/dmaengine/client.rst @@ -151,8 +151,8 @@ The details of these operations are: Note that callbacks will always be invoked from the DMA engines tasklet, never from interrupt context. -Optional: per descriptor metadata ---------------------------------- + **Optional: per descriptor metadata** + DMAengine provides two ways for metadata support. DESC_METADATA_CLIENT @@ -199,12 +199,15 @@ Optional: per descriptor metadata DESC_METADATA_CLIENT - DMA_MEM_TO_DEV / DEV_MEM_TO_MEM: + 1. prepare the descriptor (dmaengine_prep_*) construct the metadata in the client's buffer 2. use dmaengine_desc_attach_metadata() to attach the buffer to the descriptor 3. submit the transfer + - DMA_DEV_TO_MEM: + 1. prepare the descriptor (dmaengine_prep_*) 2. use dmaengine_desc_attach_metadata() to attach the buffer to the descriptor @@ -215,6 +218,7 @@ Optional: per descriptor metadata DESC_METADATA_ENGINE - DMA_MEM_TO_DEV / DEV_MEM_TO_MEM: + 1. prepare the descriptor (dmaengine_prep_*) 2. use dmaengine_desc_get_metadata_ptr() to get the pointer to the engine's metadata area @@ -222,7 +226,9 @@ Optional: per descriptor metadata 4. use dmaengine_desc_set_metadata_len() to tell the DMA engine the amount of data the client has placed into the metadata buffer 5. submit the transfer + - DMA_DEV_TO_MEM: + 1. prepare the descriptor (dmaengine_prep_*) 2. submit the transfer 3. on transfer completion, use dmaengine_desc_get_metadata_ptr() to get @@ -278,8 +284,8 @@ Optional: per descriptor metadata void dma_async_issue_pending(struct dma_chan *chan); -Further APIs: -------------- +Further APIs +------------ 1. Terminate APIs From 2227ab4216cd4d56619733bcfaee7e6943cdc9aa Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Wed, 5 Feb 2020 15:32:48 +0300 Subject: [PATCH 021/400] dmaengine: idxd: Fix error handling in idxd_wq_cdev_dev_setup() We can't call kfree(dev) after calling device_register(dev). The "dev" pointer has to be freed using put_device(). Fixes: 42d279f9137a ("dmaengine: idxd: add char driver to expose submission portal to userland") Signed-off-by: Dan Carpenter Acked-by: Dave Jiang Link: https://lore.kernel.org/r/20200205123248.hmtog7qa2eiqaagh@kili.mountain Signed-off-by: Vinod Koul --- drivers/dma/idxd/cdev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c index 1d7347825b95..df47be612ebb 100644 --- a/drivers/dma/idxd/cdev.c +++ b/drivers/dma/idxd/cdev.c @@ -204,6 +204,7 @@ static int idxd_wq_cdev_dev_setup(struct idxd_wq *wq) minor = ida_simple_get(&cdev_ctx->minor_ida, 0, MINORMASK, GFP_KERNEL); if (minor < 0) { rc = minor; + kfree(dev); goto ida_err; } @@ -212,7 +213,6 @@ static int idxd_wq_cdev_dev_setup(struct idxd_wq *wq) rc = device_register(dev); if (rc < 0) { dev_err(&idxd->pdev->dev, "device register failed\n"); - put_device(dev); goto dev_reg_err; } idxd_cdev->minor = minor; @@ -221,8 +221,8 @@ static int idxd_wq_cdev_dev_setup(struct idxd_wq *wq) dev_reg_err: ida_simple_remove(&cdev_ctx->minor_ida, MINOR(dev->devt)); + put_device(dev); ida_err: - kfree(dev); idxd_cdev->dev = NULL; return rc; } From 1dade3a7048ccfc675650cd2cf13d578b095e5fb Mon Sep 17 00:00:00 2001 From: Mika Westerberg Date: Wed, 12 Feb 2020 17:59:39 +0300 Subject: [PATCH 022/400] ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro Sometimes it is useful to find the access_width field value in bytes and not in bits so add a helper that can be used for this purpose. Suggested-by: Jean Delvare Signed-off-by: Mika Westerberg Reviewed-by: Jean Delvare Cc: 4.16+ # 4.16+ Signed-off-by: Rafael J. Wysocki --- include/acpi/actypes.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/acpi/actypes.h b/include/acpi/actypes.h index a2583c2bc054..4defed58ea33 100644 --- a/include/acpi/actypes.h +++ b/include/acpi/actypes.h @@ -532,11 +532,12 @@ typedef u64 acpi_integer; strnlen (a, ACPI_NAMESEG_SIZE) == ACPI_NAMESEG_SIZE) /* - * Algorithm to obtain access bit width. + * Algorithm to obtain access bit or byte width. * Can be used with access_width of struct acpi_generic_address and access_size of * struct acpi_resource_generic_register. */ #define ACPI_ACCESS_BIT_WIDTH(size) (1 << ((size) + 2)) +#define ACPI_ACCESS_BYTE_WIDTH(size) (1 << ((size) - 1)) /******************************************************************************* * From 2ba33a4e9e22ac4dda928d3e9b5978a3a2ded4e0 Mon Sep 17 00:00:00 2001 From: Mika Westerberg Date: Wed, 12 Feb 2020 17:59:40 +0300 Subject: [PATCH 023/400] ACPI: watchdog: Fix gas->access_width usage ACPI Generic Address Structure (GAS) access_width field is not in bytes as the driver seems to expect in few places so fix this by using the newly introduced macro ACPI_ACCESS_BYTE_WIDTH(). Fixes: b1abf6fc4982 ("ACPI / watchdog: Fix off-by-one error at resource assignment") Fixes: 058dfc767008 ("ACPI / watchdog: Add support for WDAT hardware watchdog") Reported-by: Jean Delvare Signed-off-by: Mika Westerberg Reviewed-by: Jean Delvare Cc: 4.16+ # 4.16+ Signed-off-by: Rafael J. Wysocki --- drivers/acpi/acpi_watchdog.c | 3 +-- drivers/watchdog/wdat_wdt.c | 2 +- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/acpi_watchdog.c b/drivers/acpi/acpi_watchdog.c index ab6e434b4cee..6e9ec6e3fe47 100644 --- a/drivers/acpi/acpi_watchdog.c +++ b/drivers/acpi/acpi_watchdog.c @@ -136,12 +136,11 @@ void __init acpi_watchdog_init(void) gas = &entries[i].register_region; res.start = gas->address; + res.end = res.start + ACPI_ACCESS_BYTE_WIDTH(gas->access_width) - 1; if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) { res.flags = IORESOURCE_MEM; - res.end = res.start + ALIGN(gas->access_width, 4) - 1; } else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO) { res.flags = IORESOURCE_IO; - res.end = res.start + gas->access_width - 1; } else { pr_warn("Unsupported address space: %u\n", gas->space_id); diff --git a/drivers/watchdog/wdat_wdt.c b/drivers/watchdog/wdat_wdt.c index b069349b52f5..e1b1fcfc02af 100644 --- a/drivers/watchdog/wdat_wdt.c +++ b/drivers/watchdog/wdat_wdt.c @@ -389,7 +389,7 @@ static int wdat_wdt_probe(struct platform_device *pdev) memset(&r, 0, sizeof(r)); r.start = gas->address; - r.end = r.start + gas->access_width - 1; + r.end = r.start + ACPI_ACCESS_BYTE_WIDTH(gas->access_width) - 1; if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) { r.flags = IORESOURCE_MEM; } else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO) { From cabe17d0173ab04bd3f87b8199ae75f43f1ea473 Mon Sep 17 00:00:00 2001 From: Mika Westerberg Date: Wed, 12 Feb 2020 17:59:41 +0300 Subject: [PATCH 024/400] ACPI: watchdog: Set default timeout in probe If the BIOS default timeout for the watchdog is too small userspace may not have enough time to configure new timeout after opening the device before the system is already reset. For this reason program default timeout of 30 seconds in the driver probe and allow userspace to change this from command line or through module parameter (wdat_wdt.timeout). Reported-by: Jean Delvare Signed-off-by: Mika Westerberg Reviewed-by: Jean Delvare Signed-off-by: Rafael J. Wysocki --- drivers/watchdog/wdat_wdt.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/drivers/watchdog/wdat_wdt.c b/drivers/watchdog/wdat_wdt.c index e1b1fcfc02af..3065dd670a18 100644 --- a/drivers/watchdog/wdat_wdt.c +++ b/drivers/watchdog/wdat_wdt.c @@ -54,6 +54,13 @@ module_param(nowayout, bool, 0); MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default=" __MODULE_STRING(WATCHDOG_NOWAYOUT) ")"); +#define WDAT_DEFAULT_TIMEOUT 30 + +static int timeout = WDAT_DEFAULT_TIMEOUT; +module_param(timeout, int, 0); +MODULE_PARM_DESC(timeout, "Watchdog timeout in seconds (default=" + __MODULE_STRING(WDAT_DEFAULT_TIMEOUT) ")"); + static int wdat_wdt_read(struct wdat_wdt *wdat, const struct wdat_instruction *instr, u32 *value) { @@ -438,6 +445,22 @@ static int wdat_wdt_probe(struct platform_device *pdev) platform_set_drvdata(pdev, wdat); + /* + * Set initial timeout so that userspace has time to configure the + * watchdog properly after it has opened the device. In some cases + * the BIOS default is too short and causes immediate reboot. + */ + if (timeout * 1000 < wdat->wdd.min_hw_heartbeat_ms || + timeout * 1000 > wdat->wdd.max_hw_heartbeat_ms) { + dev_warn(dev, "Invalid timeout %d given, using %d\n", + timeout, WDAT_DEFAULT_TIMEOUT); + timeout = WDAT_DEFAULT_TIMEOUT; + } + + ret = wdat_wdt_set_timeout(&wdat->wdd, timeout); + if (ret) + return ret; + watchdog_set_nowayout(&wdat->wdd, nowayout); return devm_watchdog_register_device(dev, &wdat->wdd); } From be0aba826c4a6ba5929def1962a90d6127871969 Mon Sep 17 00:00:00 2001 From: Kai-Heng Feng Date: Fri, 14 Feb 2020 14:53:07 +0800 Subject: [PATCH 025/400] HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override The Surfbook E11B uses the SIPODEV SP1064 touchpad, which does not supply descriptors, so it has to be added to the override list. BugLink: https://bugs.launchpad.net/bugs/1858299 Signed-off-by: Kai-Heng Feng Reviewed-by: Hans de Goede Signed-off-by: Benjamin Tissoires --- drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c index d31ea82b84c1..a66f08041a1a 100644 --- a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c +++ b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c @@ -341,6 +341,14 @@ static const struct dmi_system_id i2c_hid_dmi_desc_override_table[] = { }, .driver_data = (void *)&sipodev_desc }, + { + .ident = "Trekstor SURFBOOK E11B", + .matches = { + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "TREKSTOR"), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "SURFBOOK E11B"), + }, + .driver_data = (void *)&sipodev_desc + }, { .ident = "Direkt-Tek DTLAPY116-2", .matches = { From b3f15ec3d809ccf2e171ca4e272a220d3c1a3e05 Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Mon, 10 Feb 2020 11:47:57 +0000 Subject: [PATCH 026/400] kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe() With VHE, running a vCPU always requires the sequence: 1. kvm_arm_vhe_guest_enter(); 2. kvm_vcpu_run_vhe(); 3. kvm_arm_vhe_guest_exit() ... and as we invoke this from the shared arm/arm64 KVM code, 32-bit arm has to provide stubs for all three functions. To simplify the common code, and make it easier to make further modifications to the arm64-specific portions in the near future, let's fold kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() into kvm_vcpu_run_vhe(). The 32-bit stubs for kvm_arm_vhe_guest_enter() and kvm_arm_vhe_guest_exit() are removed, as they are no longer used. The 32-bit stub for kvm_vcpu_run_vhe() is left as-is. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20200210114757.2889-1-mark.rutland@arm.com --- arch/arm/include/asm/kvm_host.h | 3 --- arch/arm64/include/asm/kvm_host.h | 32 ------------------------- arch/arm64/kvm/hyp/switch.c | 39 +++++++++++++++++++++++++++++-- virt/kvm/arm/arm.c | 2 -- 4 files changed, 37 insertions(+), 39 deletions(-) diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h index bd2233805d99..cbd26ae95e7e 100644 --- a/arch/arm/include/asm/kvm_host.h +++ b/arch/arm/include/asm/kvm_host.h @@ -394,9 +394,6 @@ static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {} static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {} static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {} -static inline void kvm_arm_vhe_guest_enter(void) {} -static inline void kvm_arm_vhe_guest_exit(void) {} - #define KVM_BP_HARDEN_UNKNOWN -1 #define KVM_BP_HARDEN_WA_NEEDED 0 #define KVM_BP_HARDEN_NOT_REQUIRED 1 diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index f6a77ddab956..d740ec00ecd3 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -628,38 +628,6 @@ static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {} static inline void kvm_clr_pmu_events(u32 clr) {} #endif -static inline void kvm_arm_vhe_guest_enter(void) -{ - local_daif_mask(); - - /* - * Having IRQs masked via PMR when entering the guest means the GIC - * will not signal the CPU of interrupts of lower priority, and the - * only way to get out will be via guest exceptions. - * Naturally, we want to avoid this. - * - * local_daif_mask() already sets GIC_PRIO_PSR_I_SET, we just need a - * dsb to ensure the redistributor is forwards EL2 IRQs to the CPU. - */ - pmr_sync(); -} - -static inline void kvm_arm_vhe_guest_exit(void) -{ - /* - * local_daif_restore() takes care to properly restore PSTATE.DAIF - * and the GIC PMR if the host is using IRQ priorities. - */ - local_daif_restore(DAIF_PROCCTX_NOIRQ); - - /* - * When we exit from the guest we change a number of CPU configuration - * parameters, such as traps. Make sure these changes take effect - * before running the host or additional guests. - */ - isb(); -} - #define KVM_BP_HARDEN_UNKNOWN -1 #define KVM_BP_HARDEN_WA_NEEDED 0 #define KVM_BP_HARDEN_NOT_REQUIRED 1 diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 72fbbd86eb5e..457067706b75 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -617,7 +617,7 @@ static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt) } /* Switch to the guest for VHE systems running in EL2 */ -int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu) +static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu) { struct kvm_cpu_context *host_ctxt; struct kvm_cpu_context *guest_ctxt; @@ -670,7 +670,42 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu) return exit_code; } -NOKPROBE_SYMBOL(kvm_vcpu_run_vhe); +NOKPROBE_SYMBOL(__kvm_vcpu_run_vhe); + +int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu) +{ + int ret; + + local_daif_mask(); + + /* + * Having IRQs masked via PMR when entering the guest means the GIC + * will not signal the CPU of interrupts of lower priority, and the + * only way to get out will be via guest exceptions. + * Naturally, we want to avoid this. + * + * local_daif_mask() already sets GIC_PRIO_PSR_I_SET, we just need a + * dsb to ensure the redistributor is forwards EL2 IRQs to the CPU. + */ + pmr_sync(); + + ret = __kvm_vcpu_run_vhe(vcpu); + + /* + * local_daif_restore() takes care to properly restore PSTATE.DAIF + * and the GIC PMR if the host is using IRQ priorities. + */ + local_daif_restore(DAIF_PROCCTX_NOIRQ); + + /* + * When we exit from the guest we change a number of CPU configuration + * parameters, such as traps. Make sure these changes take effect + * before running the host or additional guests. + */ + isb(); + + return ret; +} /* Switch to the guest for legacy non-VHE systems */ int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu) diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c index efda376ab3c5..560d6f258297 100644 --- a/virt/kvm/arm/arm.c +++ b/virt/kvm/arm/arm.c @@ -797,9 +797,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) guest_enter_irqoff(); if (has_vhe()) { - kvm_arm_vhe_guest_enter(); ret = kvm_vcpu_run_vhe(vcpu); - kvm_arm_vhe_guest_exit(); } else { ret = kvm_call_hyp_ret(__kvm_vcpu_run_nvhe, vcpu); } From 551c5f5574759802b2549709b92bfdc7ddf36972 Mon Sep 17 00:00:00 2001 From: Bibby Hsieh Date: Thu, 13 Feb 2020 09:23:52 +0800 Subject: [PATCH 027/400] drm/mediatek: Add plane check in async_check function MTK do rotation checking and transferring in layer check function, but we do not check that in atomic_check, so add back in atomic_check function. Fixes: 920fffcc8912 ("drm/mediatek: update cursors by using async atomic update") Signed-off-by: Bibby Hsieh Signed-off-by: CK Hu --- drivers/gpu/drm/mediatek/mtk_drm_plane.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.c b/drivers/gpu/drm/mediatek/mtk_drm_plane.c index 914cc7619cd7..61cb7ddc117d 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_plane.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.c @@ -80,6 +80,7 @@ static int mtk_plane_atomic_async_check(struct drm_plane *plane, struct drm_plane_state *state) { struct drm_crtc_state *crtc_state; + int ret; if (plane != state->crtc->cursor) return -EINVAL; @@ -90,6 +91,11 @@ static int mtk_plane_atomic_async_check(struct drm_plane *plane, if (!plane->state->fb) return -EINVAL; + ret = mtk_drm_crtc_plane_check(state->crtc, plane, + to_mtk_plane_state(state)); + if (ret) + return ret; + if (state->state) crtc_state = drm_atomic_get_existing_crtc_state(state->state, state->crtc); From c12b59adf21329b97609a9e23519055d6877c7b1 Mon Sep 17 00:00:00 2001 From: Bibby Hsieh Date: Thu, 13 Feb 2020 09:23:53 +0800 Subject: [PATCH 028/400] drm/mediatek: Add fb swap in async_update Besides x, y position, width and height, fb also need updating in async update. Fixes: 920fffcc8912 ("drm/mediatek: update cursors by using async atomic update") Signed-off-by: Bibby Hsieh Tested-by: Enric Balletbo i Serra Signed-off-by: CK Hu --- drivers/gpu/drm/mediatek/mtk_drm_plane.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.c b/drivers/gpu/drm/mediatek/mtk_drm_plane.c index 61cb7ddc117d..c2bd683a87c8 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_plane.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.c @@ -121,6 +121,7 @@ static void mtk_plane_atomic_async_update(struct drm_plane *plane, plane->state->src_y = new_state->src_y; plane->state->src_h = new_state->src_h; plane->state->src_w = new_state->src_w; + swap(plane->state->fb, new_state->fb); state->pending.async_dirty = true; mtk_drm_crtc_async_update(new_state->crtc, plane, new_state); From 60fa8c13ab1a33b8b958efb1510ec2fd8a064bcc Mon Sep 17 00:00:00 2001 From: Bibby Hsieh Date: Mon, 17 Feb 2020 17:10:20 +0800 Subject: [PATCH 029/400] drm/mediatek: Move gce event property to mutex device node According mtk hardware design, stream_done0 and stream_done1 are generated by mutex, so we move gce event property to mutex device mode. Signed-off-by: Bibby Hsieh Signed-off-by: CK Hu --- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c index 01e37742dcea..debcd8e5658a 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c @@ -836,7 +836,8 @@ int mtk_drm_crtc_create(struct drm_device *drm_dev, drm_crtc_index(&mtk_crtc->base)); mtk_crtc->cmdq_client = NULL; } - ret = of_property_read_u32_index(dev->of_node, "mediatek,gce-events", + ret = of_property_read_u32_index(priv->mutex_node, + "mediatek,gce-events", drm_crtc_index(&mtk_crtc->base), &mtk_crtc->cmdq_event); if (ret) From 839cbf0531428f3f9535077a461b8631359c1165 Mon Sep 17 00:00:00 2001 From: Bibby Hsieh Date: Mon, 17 Feb 2020 17:10:19 +0800 Subject: [PATCH 030/400] drm/mediatek: Make sure previous message done or be aborted before send Mediatek CMDQ driver removed atomic parameter and implementation related to atomic. DRM driver need to make sure previous message done or be aborted before we send next message. If previous message is still waiting for event, it means the setting hasn't been updated into display hardware register, we can abort the message and send next message to update the newest setting into display hardware. If previous message already started, we have to wait it until transmission has been completed. So we flush mbox client before we send new message to controller driver. Signed-off-by: Bibby Hsieh Signed-off-by: CK Hu --- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c index debcd8e5658a..fe85e487e477 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c @@ -486,6 +486,7 @@ static void mtk_drm_crtc_hw_config(struct mtk_drm_crtc *mtk_crtc) } #if IS_REACHABLE(CONFIG_MTK_CMDQ) if (mtk_crtc->cmdq_client) { + mbox_flush(mtk_crtc->cmdq_client->chan, 2000); cmdq_handle = cmdq_pkt_create(mtk_crtc->cmdq_client, PAGE_SIZE); cmdq_pkt_clear_event(cmdq_handle, mtk_crtc->cmdq_event); cmdq_pkt_wfe(cmdq_handle, mtk_crtc->cmdq_event); From 3b573bf318d894b4290e194c4d7dbcba8c1f6ead Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Fri, 14 Feb 2020 16:21:40 -0300 Subject: [PATCH 031/400] perf bpf: Remove bpf/ subdir from bpf.h headers used to build bpf events The bpf.h file needed gets installed in /usr/lib/include/perf/bpf/bpf.h, and /usr/lib/include/perf/ is added to the include path passed to clang to build the eBPF bytecode, so just remove "bpf/", its directly in the path passed already. This was working by accident, fix it. I.e. now this is back working: # cat /home/acme/git/perf/tools/perf/examples/bpf/hello.c #include int syscall_enter(openat)(void *args) { puts("Hello, world\n"); return 0; } license(GPL); # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/hello.c 0.000 pickup/21493 __bpf_stdout__(Hello, world) 56.462 sh/13539 __bpf_stdout__(Hello, world) 56.536 sh/13539 __bpf_stdout__(Hello, world) 56.673 sh/13539 __bpf_stdout__(Hello, world) 56.781 sh/13539 __bpf_stdout__(Hello, world) 56.707 perf/13182 __bpf_stdout__(Hello, world) 56.849 perf/13182 __bpf_stdout__(Hello, world) ^C # Cc: Adrian Hunter Cc: Jiri Olsa Cc: Namhyung Kim Link: https://lkml.kernel.org/n/tip-d9myswhgo8gfi3vmehdqpxa7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/include/bpf/pid_filter.h | 2 +- tools/perf/include/bpf/stdio.h | 2 +- tools/perf/include/bpf/unistd.h | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/tools/perf/include/bpf/pid_filter.h b/tools/perf/include/bpf/pid_filter.h index 607189a315b2..6e61c4bdf548 100644 --- a/tools/perf/include/bpf/pid_filter.h +++ b/tools/perf/include/bpf/pid_filter.h @@ -3,7 +3,7 @@ #ifndef _PERF_BPF_PID_FILTER_ #define _PERF_BPF_PID_FILTER_ -#include +#include #define pid_filter(name) pid_map(name, bool) diff --git a/tools/perf/include/bpf/stdio.h b/tools/perf/include/bpf/stdio.h index 7ca6fa5463ee..316af5b2ff35 100644 --- a/tools/perf/include/bpf/stdio.h +++ b/tools/perf/include/bpf/stdio.h @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 -#include +#include struct bpf_map SEC("maps") __bpf_stdout__ = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, diff --git a/tools/perf/include/bpf/unistd.h b/tools/perf/include/bpf/unistd.h index d1a35b6c649d..ca7877f9a976 100644 --- a/tools/perf/include/bpf/unistd.h +++ b/tools/perf/include/bpf/unistd.h @@ -1,6 +1,6 @@ // SPDX-License-Identifier: LGPL-2.1 -#include +#include static int (*bpf_get_current_pid_tgid)(void) = (void *)BPF_FUNC_get_current_pid_tgid; From 2bbc83537614517730e9f2811195004b712de207 Mon Sep 17 00:00:00 2001 From: Thomas Richter Date: Mon, 17 Feb 2020 11:21:11 +0100 Subject: [PATCH 032/400] perf test: Fix test trace+probe_vfs_getname.sh on s390 This test places a kprobe to function getname_flags() in the kernel which has the following prototype: struct filename *getname_flags(const char __user *filename, int flags, int *empty) The 'filename' argument points to a filename located in user space memory. Looking at commit 88903c464321c ("tracing/probe: Add ustring type for user-space string") the kprobe should indicate that user space memory is accessed. Output before: [root@m35lp76 perf]# ./perf test 66 67 66: Use vfs_getname probe to get syscall args filenames : FAILED! 67: Check open filename arg using perf trace + vfs_getname: FAILED! [root@m35lp76 perf]# Output after: [root@m35lp76 perf]# ./perf test 66 67 66: Use vfs_getname probe to get syscall args filenames : Ok 67: Check open filename arg using perf trace + vfs_getname: Ok [root@m35lp76 perf]# Comments from Masami Hiramatsu: This bug doesn't happen on x86 or other archs on which user address space and kernel address space is the same. On some arches (ppc64 in this case?) user address space is partially or completely the same as kernel address space. (Yes, they switch the world when running into the kernel) In this case, we need to use different data access functions for each space. That is why I introduced the "ustring" type for kprobe events. As far as I can see, Thomas's patch is sane. Thomas, could you show us your result on your test environment? Comments from Thomas Richter: Test results for s/390 included above. Signed-off-by: Thomas Richter Acked-by: Masami Hiramatsu Tested-by: Arnaldo Carvalho de Melo Cc: Heiko Carstens Cc: Sumanth Korikkar Cc: Vasily Gorbik Link: http://lore.kernel.org/lkml/20200217102111.61137-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/shell/lib/probe_vfs_getname.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/tests/shell/lib/probe_vfs_getname.sh b/tools/perf/tests/shell/lib/probe_vfs_getname.sh index 7cb99b433888..c2cc42daf924 100644 --- a/tools/perf/tests/shell/lib/probe_vfs_getname.sh +++ b/tools/perf/tests/shell/lib/probe_vfs_getname.sh @@ -14,7 +14,7 @@ add_probe_vfs_getname() { if [ $had_vfs_getname -eq 1 ] ; then line=$(perf probe -L getname_flags 2>&1 | egrep 'result.*=.*filename;' | sed -r 's/[[:space:]]+([[:digit:]]+)[[:space:]]+result->uptr.*/\1/') perf probe -q "vfs_getname=getname_flags:${line} pathname=result->name:string" || \ - perf probe $verbose "vfs_getname=getname_flags:${line} pathname=filename:string" + perf probe $verbose "vfs_getname=getname_flags:${line} pathname=filename:ustring" fi } From 2da4dd3d6973ffdfba4fa07f53240fda7ab22929 Mon Sep 17 00:00:00 2001 From: Wei Li Date: Fri, 14 Feb 2020 15:26:50 +0200 Subject: [PATCH 033/400] perf intel-pt: Fix endless record after being terminated In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will be set and the event list will be disabled by evlist__disable() once. While in auxtrace_record.read_finish(), the related events will be enabled again, if they are continuous, the recording seems to be endless. If the intel_pt event is disabled, we don't enable it again here. Before the patch: huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \ intel_pt//u -p 46803 ^C^C^C^C^C^C After the patch: huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \ intel_pt//u -p 48591 ^C[ perf record: Woken up 0 times to write data ] Warning: AUX data lost 504 times out of 4816! [ perf record: Captured and wrote 2024.405 MB perf.data ] Signed-off-by: Wei Li Cc: Jiri Olsa Cc: Tan Xiaojun Cc: stable@vger.kernel.org # 5.4+ Link: http://lore.kernel.org/lkml/20200214132654.20395-2-adrian.hunter@intel.com [ ahunter: removed redundant 'else' after 'return' ] Signed-off-by: Adrian Hunter Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/x86/util/intel-pt.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c index 20df442fdf36..be07d6886256 100644 --- a/tools/perf/arch/x86/util/intel-pt.c +++ b/tools/perf/arch/x86/util/intel-pt.c @@ -1173,9 +1173,12 @@ static int intel_pt_read_finish(struct auxtrace_record *itr, int idx) struct evsel *evsel; evlist__for_each_entry(ptr->evlist, evsel) { - if (evsel->core.attr.type == ptr->intel_pt_pmu->type) + if (evsel->core.attr.type == ptr->intel_pt_pmu->type) { + if (evsel->disabled) + return 0; return perf_evlist__enable_event_idx(ptr->evlist, evsel, idx); + } } return -EINVAL; } From 783fed2f35e2a6771c8dc6ee29b8c4b9930783ce Mon Sep 17 00:00:00 2001 From: Wei Li Date: Fri, 14 Feb 2020 15:26:51 +0200 Subject: [PATCH 034/400] perf intel-bts: Fix endless record after being terminated In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will be set and the event list will be disabled by evlist__disable() once. While in auxtrace_record.read_finish(), the related events will be enabled again, if they are continuous, the recording seems to be endless. If the intel_bts event is disabled, we don't enable it again here. Note: This patch is NOT tested since i don't have such a machine with intel_bts feature, but the code seems buggy same as arm-spe and intel-pt. Signed-off-by: Wei Li Cc: Jiri Olsa Cc: Tan Xiaojun Cc: stable@vger.kernel.org # 5.4+ Link: http://lore.kernel.org/lkml/20200214132654.20395-3-adrian.hunter@intel.com [ahunter: removed redundant 'else' after 'return'] Signed-off-by: Adrian Hunter Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/x86/util/intel-bts.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c index 27d9e214d068..39e363151ad7 100644 --- a/tools/perf/arch/x86/util/intel-bts.c +++ b/tools/perf/arch/x86/util/intel-bts.c @@ -420,9 +420,12 @@ static int intel_bts_read_finish(struct auxtrace_record *itr, int idx) struct evsel *evsel; evlist__for_each_entry(btsr->evlist, evsel) { - if (evsel->core.attr.type == btsr->intel_bts_pmu->type) + if (evsel->core.attr.type == btsr->intel_bts_pmu->type) { + if (evsel->disabled) + return 0; return perf_evlist__enable_event_idx(btsr->evlist, evsel, idx); + } } return -EINVAL; } From c9f2833cb472cf9e0a49b7bcdc210a96017a7bfd Mon Sep 17 00:00:00 2001 From: Wei Li Date: Fri, 14 Feb 2020 15:26:52 +0200 Subject: [PATCH 035/400] perf cs-etm: Fix endless record after being terminated In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will be set and the event list will be disabled by evlist__disable() once. While in auxtrace_record.read_finish(), the related events will be enabled again, if they are continuous, the recording seems to be endless. If the cs_etm event is disabled, we don't enable it again here. Note: This patch is NOT tested since i don't have such a machine with coresight feature, but the code seems buggy same as arm-spe and intel-pt. Tester notes: Thanks for looping, Adrian. Applied this patch and tested with CoreSight on juno board, it works well. Signed-off-by: Wei Li Reviewed-by: Leo Yan Reviewed-by: Mathieu Poirier Tested-by: Leo Yan Cc: Jiri Olsa Cc: Tan Xiaojun Cc: stable@vger.kernel.org # 5.4+ Link: http://lore.kernel.org/lkml/20200214132654.20395-4-adrian.hunter@intel.com [ahunter: removed redundant 'else' after 'return'] Signed-off-by: Adrian Hunter Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/arm/util/cs-etm.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 2898cfdf8fe1..60141c3007a9 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -865,9 +865,12 @@ static int cs_etm_read_finish(struct auxtrace_record *itr, int idx) struct evsel *evsel; evlist__for_each_entry(ptr->evlist, evsel) { - if (evsel->core.attr.type == ptr->cs_etm_pmu->type) + if (evsel->core.attr.type == ptr->cs_etm_pmu->type) { + if (evsel->disabled) + return 0; return perf_evlist__enable_event_idx(ptr->evlist, evsel, idx); + } } return -EINVAL; From d6bc34c5ec18c3544c4b0d85963768dfbcd24184 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Fri, 14 Feb 2020 15:26:53 +0200 Subject: [PATCH 036/400] perf arm-spe: Fix endless record after being terminated In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will be set and the event list will be disabled by evlist__disable() once. While in auxtrace_record.read_finish(), the related events will be enabled again, if they are continuous, the recording seems to be endless. If the event is disabled, don't enable it again here. Based-on-patch-by: Wei Li Signed-off-by: Adrian Hunter Cc: Jiri Olsa Cc: Tan Xiaojun Cc: stable@vger.kernel.org # 5.4+ Link: http://lore.kernel.org/lkml/20200214132654.20395-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/arm64/util/arm-spe.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index eba6541ec0f1..1d993c27242b 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -165,9 +165,12 @@ static int arm_spe_read_finish(struct auxtrace_record *itr, int idx) struct evsel *evsel; evlist__for_each_entry(sper->evlist, evsel) { - if (evsel->core.attr.type == sper->arm_spe_pmu->type) + if (evsel->core.attr.type == sper->arm_spe_pmu->type) { + if (evsel->disabled) + return 0; return perf_evlist__enable_event_idx(sper->evlist, evsel, idx); + } } return -EINVAL; } From ad60ba0c2e6da6ff573c5ac57708fbc443bbb473 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 17 Feb 2020 10:23:00 +0200 Subject: [PATCH 037/400] perf auxtrace: Add auxtrace_record__read_finish() All ->read_finish() implementations are doing the same thing. Add a helper function so that they can share the same implementation. Signed-off-by: Adrian Hunter Reviewed-by: Leo Yan Tested-by: Leo Yan Reviewed-by: Mathieu Poirier Cc: Jiri Olsa Cc: Kim Phillips Cc: Wei Li Link: http://lore.kernel.org/lkml/20200217082300.6301-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/arm/util/cs-etm.c | 21 ++------------------- tools/perf/arch/arm64/util/arm-spe.c | 20 ++------------------ tools/perf/arch/x86/util/intel-bts.c | 20 ++------------------ tools/perf/arch/x86/util/intel-pt.c | 20 ++------------------ tools/perf/util/auxtrace.c | 22 +++++++++++++++++++++- tools/perf/util/auxtrace.h | 6 ++++++ 6 files changed, 35 insertions(+), 74 deletions(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 60141c3007a9..941f814820b8 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -858,24 +858,6 @@ static void cs_etm_recording_free(struct auxtrace_record *itr) free(ptr); } -static int cs_etm_read_finish(struct auxtrace_record *itr, int idx) -{ - struct cs_etm_recording *ptr = - container_of(itr, struct cs_etm_recording, itr); - struct evsel *evsel; - - evlist__for_each_entry(ptr->evlist, evsel) { - if (evsel->core.attr.type == ptr->cs_etm_pmu->type) { - if (evsel->disabled) - return 0; - return perf_evlist__enable_event_idx(ptr->evlist, - evsel, idx); - } - } - - return -EINVAL; -} - struct auxtrace_record *cs_etm_record_init(int *err) { struct perf_pmu *cs_etm_pmu; @@ -895,6 +877,7 @@ struct auxtrace_record *cs_etm_record_init(int *err) } ptr->cs_etm_pmu = cs_etm_pmu; + ptr->itr.pmu = cs_etm_pmu; ptr->itr.parse_snapshot_options = cs_etm_parse_snapshot_options; ptr->itr.recording_options = cs_etm_recording_options; ptr->itr.info_priv_size = cs_etm_info_priv_size; @@ -904,7 +887,7 @@ struct auxtrace_record *cs_etm_record_init(int *err) ptr->itr.snapshot_finish = cs_etm_snapshot_finish; ptr->itr.reference = cs_etm_reference; ptr->itr.free = cs_etm_recording_free; - ptr->itr.read_finish = cs_etm_read_finish; + ptr->itr.read_finish = auxtrace_record__read_finish; *err = 0; return &ptr->itr; diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c index 1d993c27242b..8d6821d9c3f6 100644 --- a/tools/perf/arch/arm64/util/arm-spe.c +++ b/tools/perf/arch/arm64/util/arm-spe.c @@ -158,23 +158,6 @@ static void arm_spe_recording_free(struct auxtrace_record *itr) free(sper); } -static int arm_spe_read_finish(struct auxtrace_record *itr, int idx) -{ - struct arm_spe_recording *sper = - container_of(itr, struct arm_spe_recording, itr); - struct evsel *evsel; - - evlist__for_each_entry(sper->evlist, evsel) { - if (evsel->core.attr.type == sper->arm_spe_pmu->type) { - if (evsel->disabled) - return 0; - return perf_evlist__enable_event_idx(sper->evlist, - evsel, idx); - } - } - return -EINVAL; -} - struct auxtrace_record *arm_spe_recording_init(int *err, struct perf_pmu *arm_spe_pmu) { @@ -192,12 +175,13 @@ struct auxtrace_record *arm_spe_recording_init(int *err, } sper->arm_spe_pmu = arm_spe_pmu; + sper->itr.pmu = arm_spe_pmu; sper->itr.recording_options = arm_spe_recording_options; sper->itr.info_priv_size = arm_spe_info_priv_size; sper->itr.info_fill = arm_spe_info_fill; sper->itr.free = arm_spe_recording_free; sper->itr.reference = arm_spe_reference; - sper->itr.read_finish = arm_spe_read_finish; + sper->itr.read_finish = auxtrace_record__read_finish; sper->itr.alignment = 0; *err = 0; diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c index 39e363151ad7..26cee1052179 100644 --- a/tools/perf/arch/x86/util/intel-bts.c +++ b/tools/perf/arch/x86/util/intel-bts.c @@ -413,23 +413,6 @@ out_err: return err; } -static int intel_bts_read_finish(struct auxtrace_record *itr, int idx) -{ - struct intel_bts_recording *btsr = - container_of(itr, struct intel_bts_recording, itr); - struct evsel *evsel; - - evlist__for_each_entry(btsr->evlist, evsel) { - if (evsel->core.attr.type == btsr->intel_bts_pmu->type) { - if (evsel->disabled) - return 0; - return perf_evlist__enable_event_idx(btsr->evlist, - evsel, idx); - } - } - return -EINVAL; -} - struct auxtrace_record *intel_bts_recording_init(int *err) { struct perf_pmu *intel_bts_pmu = perf_pmu__find(INTEL_BTS_PMU_NAME); @@ -450,6 +433,7 @@ struct auxtrace_record *intel_bts_recording_init(int *err) } btsr->intel_bts_pmu = intel_bts_pmu; + btsr->itr.pmu = intel_bts_pmu; btsr->itr.recording_options = intel_bts_recording_options; btsr->itr.info_priv_size = intel_bts_info_priv_size; btsr->itr.info_fill = intel_bts_info_fill; @@ -459,7 +443,7 @@ struct auxtrace_record *intel_bts_recording_init(int *err) btsr->itr.find_snapshot = intel_bts_find_snapshot; btsr->itr.parse_snapshot_options = intel_bts_parse_snapshot_options; btsr->itr.reference = intel_bts_reference; - btsr->itr.read_finish = intel_bts_read_finish; + btsr->itr.read_finish = auxtrace_record__read_finish; btsr->itr.alignment = sizeof(struct branch); return &btsr->itr; } diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c index be07d6886256..7eea4fd7ce58 100644 --- a/tools/perf/arch/x86/util/intel-pt.c +++ b/tools/perf/arch/x86/util/intel-pt.c @@ -1166,23 +1166,6 @@ static u64 intel_pt_reference(struct auxtrace_record *itr __maybe_unused) return rdtsc(); } -static int intel_pt_read_finish(struct auxtrace_record *itr, int idx) -{ - struct intel_pt_recording *ptr = - container_of(itr, struct intel_pt_recording, itr); - struct evsel *evsel; - - evlist__for_each_entry(ptr->evlist, evsel) { - if (evsel->core.attr.type == ptr->intel_pt_pmu->type) { - if (evsel->disabled) - return 0; - return perf_evlist__enable_event_idx(ptr->evlist, evsel, - idx); - } - } - return -EINVAL; -} - struct auxtrace_record *intel_pt_recording_init(int *err) { struct perf_pmu *intel_pt_pmu = perf_pmu__find(INTEL_PT_PMU_NAME); @@ -1203,6 +1186,7 @@ struct auxtrace_record *intel_pt_recording_init(int *err) } ptr->intel_pt_pmu = intel_pt_pmu; + ptr->itr.pmu = intel_pt_pmu; ptr->itr.recording_options = intel_pt_recording_options; ptr->itr.info_priv_size = intel_pt_info_priv_size; ptr->itr.info_fill = intel_pt_info_fill; @@ -1212,7 +1196,7 @@ struct auxtrace_record *intel_pt_recording_init(int *err) ptr->itr.find_snapshot = intel_pt_find_snapshot; ptr->itr.parse_snapshot_options = intel_pt_parse_snapshot_options; ptr->itr.reference = intel_pt_reference; - ptr->itr.read_finish = intel_pt_read_finish; + ptr->itr.read_finish = auxtrace_record__read_finish; /* * Decoding starts at a PSB packet. Minimum PSB period is 2K so 4K * should give at least 1 PSB per sample. diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index eb087e7df6f4..3571ce72ca28 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -629,8 +629,10 @@ int auxtrace_record__options(struct auxtrace_record *itr, struct evlist *evlist, struct record_opts *opts) { - if (itr) + if (itr) { + itr->evlist = evlist; return itr->recording_options(itr, evlist, opts); + } return 0; } @@ -664,6 +666,24 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr, return -EINVAL; } +int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx) +{ + struct evsel *evsel; + + if (!itr->evlist || !itr->pmu) + return -EINVAL; + + evlist__for_each_entry(itr->evlist, evsel) { + if (evsel->core.attr.type == itr->pmu->type) { + if (evsel->disabled) + return 0; + return perf_evlist__enable_event_idx(itr->evlist, evsel, + idx); + } + } + return -EINVAL; +} + /* * Event record size is 16-bit which results in a maximum size of about 64KiB. * Allow about 4KiB for the rest of the sample record, to give a maximum diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index 749d72cd9c7b..e58ef160b599 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -29,6 +29,7 @@ struct record_opts; struct perf_record_auxtrace_error; struct perf_record_auxtrace_info; struct events_stats; +struct perf_pmu; enum auxtrace_error_type { PERF_AUXTRACE_ERROR_ITRACE = 1, @@ -322,6 +323,8 @@ struct auxtrace_mmap_params { * @read_finish: called after reading from an auxtrace mmap * @alignment: alignment (if any) for AUX area data * @default_aux_sample_size: default sample size for --aux sample option + * @pmu: associated pmu + * @evlist: selected events list */ struct auxtrace_record { int (*recording_options)(struct auxtrace_record *itr, @@ -346,6 +349,8 @@ struct auxtrace_record { int (*read_finish)(struct auxtrace_record *itr, int idx); unsigned int alignment; unsigned int default_aux_sample_size; + struct perf_pmu *pmu; + struct evlist *evlist; }; /** @@ -537,6 +542,7 @@ int auxtrace_record__find_snapshot(struct auxtrace_record *itr, int idx, struct auxtrace_mmap *mm, unsigned char *data, u64 *head, u64 *old); u64 auxtrace_record__reference(struct auxtrace_record *itr); +int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx); int auxtrace_index__auxtrace_event(struct list_head *head, union perf_event *event, off_t file_offset); From 789a2c250340666220fa74bc6c8f58497e3863b3 Mon Sep 17 00:00:00 2001 From: Hanno Zulla Date: Tue, 18 Feb 2020 12:37:47 +0100 Subject: [PATCH 038/400] HID: hid-bigbenff: fix general protection fault caused by double kfree The struct *bigben was allocated via devm_kzalloc() and then used as a parameter in input_ff_create_memless(). This caused a double kfree during removal of the device, since both the managed resource API and ml_ff_destroy() in drivers/input/ff-memless.c would call kfree() on it. Signed-off-by: Hanno Zulla Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-bigbenff.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c index 3f6abd190df4..f7e85bacb688 100644 --- a/drivers/hid/hid-bigbenff.c +++ b/drivers/hid/hid-bigbenff.c @@ -220,10 +220,16 @@ static void bigben_worker(struct work_struct *work) static int hid_bigben_play_effect(struct input_dev *dev, void *data, struct ff_effect *effect) { - struct bigben_device *bigben = data; + struct hid_device *hid = input_get_drvdata(dev); + struct bigben_device *bigben = hid_get_drvdata(hid); u8 right_motor_on; u8 left_motor_force; + if (!bigben) { + hid_err(hid, "no device data\n"); + return 0; + } + if (effect->type != FF_RUMBLE) return 0; @@ -341,7 +347,7 @@ static int bigben_probe(struct hid_device *hid, INIT_WORK(&bigben->worker, bigben_worker); - error = input_ff_create_memless(hidinput->input, bigben, + error = input_ff_create_memless(hidinput->input, NULL, hid_bigben_play_effect); if (error) return error; From 976a54d0f4202cb412a3b1fc7f117e1d97db35f3 Mon Sep 17 00:00:00 2001 From: Hanno Zulla Date: Tue, 18 Feb 2020 12:38:34 +0100 Subject: [PATCH 039/400] HID: hid-bigbenff: call hid_hw_stop() in case of error It's required to call hid_hw_stop() once hid_hw_start() was called previously, so error cases need to handle this. Also, hid_hw_close() is not necessary during removal. Signed-off-by: Hanno Zulla Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-bigbenff.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c index f7e85bacb688..f8c552b64a89 100644 --- a/drivers/hid/hid-bigbenff.c +++ b/drivers/hid/hid-bigbenff.c @@ -305,7 +305,6 @@ static void bigben_remove(struct hid_device *hid) struct bigben_device *bigben = hid_get_drvdata(hid); cancel_work_sync(&bigben->worker); - hid_hw_close(hid); hid_hw_stop(hid); } @@ -350,7 +349,7 @@ static int bigben_probe(struct hid_device *hid, error = input_ff_create_memless(hidinput->input, NULL, hid_bigben_play_effect); if (error) - return error; + goto error_hw_stop; name_sz = strlen(dev_name(&hid->dev)) + strlen(":red:bigben#") + 1; @@ -360,8 +359,10 @@ static int bigben_probe(struct hid_device *hid, sizeof(struct led_classdev) + name_sz, GFP_KERNEL ); - if (!led) - return -ENOMEM; + if (!led) { + error = -ENOMEM; + goto error_hw_stop; + } name = (void *)(&led[1]); snprintf(name, name_sz, "%s:red:bigben%d", @@ -375,7 +376,7 @@ static int bigben_probe(struct hid_device *hid, bigben->leds[n] = led; error = devm_led_classdev_register(&hid->dev, led); if (error) - return error; + goto error_hw_stop; } /* initial state: LED1 is on, no rumble effect */ @@ -389,6 +390,10 @@ static int bigben_probe(struct hid_device *hid, hid_info(hid, "LED and force feedback support for BigBen gamepad\n"); return 0; + +error_hw_stop: + hid_hw_stop(hid); + return error; } static __u8 *bigben_report_fixup(struct hid_device *hid, __u8 *rdesc, From 4eb1b01de5b9d8596d6c103efcf1a15cfc1bedf7 Mon Sep 17 00:00:00 2001 From: Hanno Zulla Date: Tue, 18 Feb 2020 12:39:31 +0100 Subject: [PATCH 040/400] HID: hid-bigbenff: fix race condition for scheduled work during removal It's possible that there is scheduled work left while the device is already being removed, which can cause a kernel crash. Adding a flag will avoid this. Signed-off-by: Hanno Zulla Signed-off-by: Benjamin Tissoires --- drivers/hid/hid-bigbenff.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c index f8c552b64a89..db6da21ade06 100644 --- a/drivers/hid/hid-bigbenff.c +++ b/drivers/hid/hid-bigbenff.c @@ -174,6 +174,7 @@ static __u8 pid0902_rdesc_fixed[] = { struct bigben_device { struct hid_device *hid; struct hid_report *report; + bool removed; u8 led_state; /* LED1 = 1 .. LED4 = 8 */ u8 right_motor_on; /* right motor off/on 0/1 */ u8 left_motor_force; /* left motor force 0-255 */ @@ -190,6 +191,9 @@ static void bigben_worker(struct work_struct *work) struct bigben_device, worker); struct hid_field *report_field = bigben->report->field[0]; + if (bigben->removed) + return; + if (bigben->work_led) { bigben->work_led = false; report_field->value[0] = 0x01; /* 1 = led message */ @@ -304,6 +308,7 @@ static void bigben_remove(struct hid_device *hid) { struct bigben_device *bigben = hid_get_drvdata(hid); + bigben->removed = true; cancel_work_sync(&bigben->worker); hid_hw_stop(hid); } @@ -324,6 +329,7 @@ static int bigben_probe(struct hid_device *hid, return -ENOMEM; hid_set_drvdata(hid, bigben); bigben->hid = hid; + bigben->removed = false; error = hid_parse(hid); if (error) { From b103de53e09f20d645eb313477f52d1993347605 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Tue, 18 Feb 2020 10:28:52 -0300 Subject: [PATCH 041/400] perf arch powerpc: Sync powerpc syscall.tbl with the kernel sources Copy over powerpc syscall.tbl to grab changes from the below commits fddb5d430ad9 ("open: introduce openat2(2) syscall") 9a2cef09c801 ("arch: wire up pidfd_getfd syscall") Now 'perf trace' on powerpc will be able to map from those syscall strings to the right syscall numbers, i.e. perf trace -e pidfd* Will include 'pidfd_getfd' as well as: perf trace open* Will cover all 'open' variants. Reported-by: Stephen Rothwell Reviewed-by: Ravi Bangoria Cc: Adrian Hunter Cc: Aleksa Sarai Cc: Al Viro Cc: Christian Brauner Cc: Jiri Olsa Cc: Namhyung Kim Cc: Naveen N. Rao Cc: Nicholas Piggin Cc: Sargun Dhillon Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/powerpc/entry/syscalls/syscall.tbl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl index 43f736ed47f2..35b61bfc1b1a 100644 --- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl @@ -517,3 +517,5 @@ 433 common fspick sys_fspick 434 common pidfd_open sys_pidfd_open 435 nospu clone3 ppc_clone3 +437 common openat2 sys_openat2 +438 common pidfd_getfd sys_pidfd_getfd From 310006cab991fa895cc3a42c53609778b0d50d46 Mon Sep 17 00:00:00 2001 From: Dan Murphy Date: Tue, 18 Feb 2020 12:52:52 -0600 Subject: [PATCH 042/400] ASoC: tas2562: Return invalid for when bitwidth is invalid If the bitwidth passed in to the set_bitwidth function is not supported then return an error. Fixes: 29b74236bd57 ("ASoC: tas2562: Introduce the TAS2562 amplifier") Signed-off-by: Dan Murphy Link: https://lore.kernel.org/r/20200218185252.26290-1-dmurphy@ti.com Signed-off-by: Mark Brown --- sound/soc/codecs/tas2562.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sound/soc/codecs/tas2562.c b/sound/soc/codecs/tas2562.c index 729acd874c48..5b4803a76719 100644 --- a/sound/soc/codecs/tas2562.c +++ b/sound/soc/codecs/tas2562.c @@ -215,7 +215,8 @@ static int tas2562_set_bitwidth(struct tas2562_data *tas2562, int bitwidth) break; default: - dev_info(tas2562->dev, "Not supported params format\n"); + dev_info(tas2562->dev, "Unsupported bitwidth format\n"); + return -EINVAL; } ret = snd_soc_component_update_bits(tas2562->component, From 1c83767c9d417c4cc45d95f09b3a6e6c6b5417b5 Mon Sep 17 00:00:00 2001 From: Vignesh Raghavendra Date: Fri, 14 Feb 2020 11:14:36 +0200 Subject: [PATCH 043/400] dmaengine: ti: k3-udma: Use ktime/usleep_range based TX completion check In some cases (McSPI for example) the jiffie and delayed_work based workaround can cause big throughput drop. Switch to use ktime/usleep_range based implementation to be able to sustain speed for PDMA based peripherals. Signed-off-by: Vignesh Raghavendra Signed-off-by: Peter Ujfalusi Link: https://lore.kernel.org/r/20200214091441.27535-2-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul --- drivers/dma/ti/k3-udma.c | 80 ++++++++++++++++++++++++++-------------- 1 file changed, 53 insertions(+), 27 deletions(-) diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index ea79c2df28e0..fb59c869a6a7 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -5,6 +5,7 @@ */ #include +#include #include #include #include @@ -169,7 +170,7 @@ enum udma_chan_state { struct udma_tx_drain { struct delayed_work work; - unsigned long jiffie; + ktime_t tstamp; u32 residue; }; @@ -946,9 +947,10 @@ static bool udma_is_desc_really_done(struct udma_chan *uc, struct udma_desc *d) peer_bcnt = udma_tchanrt_read(uc->tchan, UDMA_TCHAN_RT_PEER_BCNT_REG); bcnt = udma_tchanrt_read(uc->tchan, UDMA_TCHAN_RT_BCNT_REG); + /* Transfer is incomplete, store current residue and time stamp */ if (peer_bcnt < bcnt) { uc->tx_drain.residue = bcnt - peer_bcnt; - uc->tx_drain.jiffie = jiffies; + uc->tx_drain.tstamp = ktime_get(); return false; } @@ -961,35 +963,59 @@ static void udma_check_tx_completion(struct work_struct *work) tx_drain.work.work); bool desc_done = true; u32 residue_diff; - unsigned long jiffie_diff, delay; + ktime_t time_diff; + unsigned long delay; - if (uc->desc) { - residue_diff = uc->tx_drain.residue; - jiffie_diff = uc->tx_drain.jiffie; - desc_done = udma_is_desc_really_done(uc, uc->desc); - } - - if (!desc_done) { - jiffie_diff = uc->tx_drain.jiffie - jiffie_diff; - residue_diff -= uc->tx_drain.residue; - if (residue_diff) { - /* Try to guess when we should check next time */ - residue_diff /= jiffie_diff; - delay = uc->tx_drain.residue / residue_diff / 3; - if (jiffies_to_msecs(delay) < 5) - delay = 0; - } else { - /* No progress, check again in 1 second */ - delay = HZ; + while (1) { + if (uc->desc) { + /* Get previous residue and time stamp */ + residue_diff = uc->tx_drain.residue; + time_diff = uc->tx_drain.tstamp; + /* + * Get current residue and time stamp or see if + * transfer is complete + */ + desc_done = udma_is_desc_really_done(uc, uc->desc); } - schedule_delayed_work(&uc->tx_drain.work, delay); - } else if (uc->desc) { - struct udma_desc *d = uc->desc; + if (!desc_done) { + /* + * Find the time delta and residue delta w.r.t + * previous poll + */ + time_diff = ktime_sub(uc->tx_drain.tstamp, + time_diff) + 1; + residue_diff -= uc->tx_drain.residue; + if (residue_diff) { + /* + * Try to guess when we should check + * next time by calculating rate at + * which data is being drained at the + * peer device + */ + delay = (time_diff / residue_diff) * + uc->tx_drain.residue; + } else { + /* No progress, check again in 1 second */ + schedule_delayed_work(&uc->tx_drain.work, HZ); + break; + } - uc->bcnt += d->residue; - udma_start(uc); - vchan_cookie_complete(&d->vd); + usleep_range(ktime_to_us(delay), + ktime_to_us(delay) + 10); + continue; + } + + if (uc->desc) { + struct udma_desc *d = uc->desc; + + uc->bcnt += d->residue; + udma_start(uc); + vchan_cookie_complete(&d->vd); + break; + } + + break; } } From 16cd3c670183d788f5a0dfe41415d3386ba92ed9 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Fri, 14 Feb 2020 11:14:37 +0200 Subject: [PATCH 044/400] dmaengine: ti: k3-udma: Workaround for RX teardown with stale data in peer When a channel is asked to be stopped (teardown) and we do not have active descriptor to receive stale data buffered on the remote side then the teardown will not complete as UDMA needs a descriptor to be able to flush out the DMA pipe. The peer is trying to push the data to UDMA in teardown, but UDMA is pushing back because it has no descriptor which would allow it to drain the data. The workaround is to create 1K 'trashcan' to receive the discarded data and set up descriptors for packet and TR mode channels. When a channel is stopped and there is no active descriptor then a descriptor is pushed to the ring for UDMA before the teardown is initiated. Signed-off-by: Peter Ujfalusi Link: https://lore.kernel.org/r/20200214091441.27535-3-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul --- drivers/dma/ti/k3-udma.c | 168 +++++++++++++++++++++++++++++++++++---- 1 file changed, 151 insertions(+), 17 deletions(-) diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index fb59c869a6a7..cb9259e104b4 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -97,6 +97,24 @@ struct udma_match_data { u32 level_start_idx[]; }; +struct udma_hwdesc { + size_t cppi5_desc_size; + void *cppi5_desc_vaddr; + dma_addr_t cppi5_desc_paddr; + + /* TR descriptor internal pointers */ + void *tr_req_base; + struct cppi5_tr_resp_t *tr_resp_base; +}; + +struct udma_rx_flush { + struct udma_hwdesc hwdescs[2]; + + size_t buffer_size; + void *buffer_vaddr; + dma_addr_t buffer_paddr; +}; + struct udma_dev { struct dma_device ddev; struct device *dev; @@ -113,6 +131,8 @@ struct udma_dev { struct list_head desc_to_purge; spinlock_t lock; + struct udma_rx_flush rx_flush; + int tchan_cnt; int echan_cnt; int rchan_cnt; @@ -131,16 +151,6 @@ struct udma_dev { u32 psil_base; }; -struct udma_hwdesc { - size_t cppi5_desc_size; - void *cppi5_desc_vaddr; - dma_addr_t cppi5_desc_paddr; - - /* TR descriptor internal pointers */ - void *tr_req_base; - struct cppi5_tr_resp_t *tr_resp_base; -}; - struct udma_desc { struct virt_dma_desc vd; @@ -552,12 +562,17 @@ static void udma_sync_for_device(struct udma_chan *uc, int idx) } } +static inline dma_addr_t udma_get_rx_flush_hwdesc_paddr(struct udma_chan *uc) +{ + return uc->ud->rx_flush.hwdescs[uc->config.pkt_mode].cppi5_desc_paddr; +} + static int udma_push_to_ring(struct udma_chan *uc, int idx) { struct udma_desc *d = uc->desc; - struct k3_ring *ring = NULL; - int ret = -EINVAL; + dma_addr_t paddr; + int ret; switch (uc->config.dir) { case DMA_DEV_TO_MEM: @@ -568,21 +583,37 @@ static int udma_push_to_ring(struct udma_chan *uc, int idx) ring = uc->tchan->t_ring; break; default: - break; + return -EINVAL; } - if (ring) { - dma_addr_t desc_addr = udma_curr_cppi5_desc_paddr(d, idx); + /* RX flush packet: idx == -1 is only passed in case of DEV_TO_MEM */ + if (idx == -1) { + paddr = udma_get_rx_flush_hwdesc_paddr(uc); + } else { + paddr = udma_curr_cppi5_desc_paddr(d, idx); wmb(); /* Ensure that writes are not moved over this point */ udma_sync_for_device(uc, idx); - ret = k3_ringacc_ring_push(ring, &desc_addr); - uc->in_ring_cnt++; } + ret = k3_ringacc_ring_push(ring, &paddr); + if (!ret) + uc->in_ring_cnt++; + return ret; } +static bool udma_desc_is_rx_flush(struct udma_chan *uc, dma_addr_t addr) +{ + if (uc->config.dir != DMA_DEV_TO_MEM) + return false; + + if (addr == udma_get_rx_flush_hwdesc_paddr(uc)) + return true; + + return false; +} + static int udma_pop_from_ring(struct udma_chan *uc, dma_addr_t *addr) { struct k3_ring *ring = NULL; @@ -611,6 +642,10 @@ static int udma_pop_from_ring(struct udma_chan *uc, dma_addr_t *addr) if (cppi5_desc_is_tdcm(*addr)) return ret; + /* Check for flush descriptor */ + if (udma_desc_is_rx_flush(uc, *addr)) + return -ENOENT; + d = udma_udma_desc_from_paddr(uc, *addr); if (d) @@ -891,6 +926,9 @@ static int udma_stop(struct udma_chan *uc) switch (uc->config.dir) { case DMA_DEV_TO_MEM: + if (!uc->cyclic && !uc->desc) + udma_push_to_ring(uc, -1); + udma_rchanrt_write(uc->rchan, UDMA_RCHAN_RT_PEER_RT_EN_REG, UDMA_PEER_RT_EN_ENABLE | UDMA_PEER_RT_EN_TEARDOWN); @@ -3274,6 +3312,98 @@ static int udma_setup_resources(struct udma_dev *ud) return ch_count; } +static int udma_setup_rx_flush(struct udma_dev *ud) +{ + struct udma_rx_flush *rx_flush = &ud->rx_flush; + struct cppi5_desc_hdr_t *tr_desc; + struct cppi5_tr_type1_t *tr_req; + struct cppi5_host_desc_t *desc; + struct device *dev = ud->dev; + struct udma_hwdesc *hwdesc; + size_t tr_size; + + /* Allocate 1K buffer for discarded data on RX channel teardown */ + rx_flush->buffer_size = SZ_1K; + rx_flush->buffer_vaddr = devm_kzalloc(dev, rx_flush->buffer_size, + GFP_KERNEL); + if (!rx_flush->buffer_vaddr) + return -ENOMEM; + + rx_flush->buffer_paddr = dma_map_single(dev, rx_flush->buffer_vaddr, + rx_flush->buffer_size, + DMA_TO_DEVICE); + if (dma_mapping_error(dev, rx_flush->buffer_paddr)) + return -ENOMEM; + + /* Set up descriptor to be used for TR mode */ + hwdesc = &rx_flush->hwdescs[0]; + tr_size = sizeof(struct cppi5_tr_type1_t); + hwdesc->cppi5_desc_size = cppi5_trdesc_calc_size(tr_size, 1); + hwdesc->cppi5_desc_size = ALIGN(hwdesc->cppi5_desc_size, + ud->desc_align); + + hwdesc->cppi5_desc_vaddr = devm_kzalloc(dev, hwdesc->cppi5_desc_size, + GFP_KERNEL); + if (!hwdesc->cppi5_desc_vaddr) + return -ENOMEM; + + hwdesc->cppi5_desc_paddr = dma_map_single(dev, hwdesc->cppi5_desc_vaddr, + hwdesc->cppi5_desc_size, + DMA_TO_DEVICE); + if (dma_mapping_error(dev, hwdesc->cppi5_desc_paddr)) + return -ENOMEM; + + /* Start of the TR req records */ + hwdesc->tr_req_base = hwdesc->cppi5_desc_vaddr + tr_size; + /* Start address of the TR response array */ + hwdesc->tr_resp_base = hwdesc->tr_req_base + tr_size; + + tr_desc = hwdesc->cppi5_desc_vaddr; + cppi5_trdesc_init(tr_desc, 1, tr_size, 0, 0); + cppi5_desc_set_pktids(tr_desc, 0, CPPI5_INFO1_DESC_FLOWID_DEFAULT); + cppi5_desc_set_retpolicy(tr_desc, 0, 0); + + tr_req = hwdesc->tr_req_base; + cppi5_tr_init(&tr_req->flags, CPPI5_TR_TYPE1, false, false, + CPPI5_TR_EVENT_SIZE_COMPLETION, 0); + cppi5_tr_csf_set(&tr_req->flags, CPPI5_TR_CSF_SUPR_EVT); + + tr_req->addr = rx_flush->buffer_paddr; + tr_req->icnt0 = rx_flush->buffer_size; + tr_req->icnt1 = 1; + + /* Set up descriptor to be used for packet mode */ + hwdesc = &rx_flush->hwdescs[1]; + hwdesc->cppi5_desc_size = ALIGN(sizeof(struct cppi5_host_desc_t) + + CPPI5_INFO0_HDESC_EPIB_SIZE + + CPPI5_INFO0_HDESC_PSDATA_MAX_SIZE, + ud->desc_align); + + hwdesc->cppi5_desc_vaddr = devm_kzalloc(dev, hwdesc->cppi5_desc_size, + GFP_KERNEL); + if (!hwdesc->cppi5_desc_vaddr) + return -ENOMEM; + + hwdesc->cppi5_desc_paddr = dma_map_single(dev, hwdesc->cppi5_desc_vaddr, + hwdesc->cppi5_desc_size, + DMA_TO_DEVICE); + if (dma_mapping_error(dev, hwdesc->cppi5_desc_paddr)) + return -ENOMEM; + + desc = hwdesc->cppi5_desc_vaddr; + cppi5_hdesc_init(desc, 0, 0); + cppi5_desc_set_pktids(&desc->hdr, 0, CPPI5_INFO1_DESC_FLOWID_DEFAULT); + cppi5_desc_set_retpolicy(&desc->hdr, 0, 0); + + cppi5_hdesc_attach_buf(desc, + rx_flush->buffer_paddr, rx_flush->buffer_size, + rx_flush->buffer_paddr, rx_flush->buffer_size); + + dma_sync_single_for_device(dev, hwdesc->cppi5_desc_paddr, + hwdesc->cppi5_desc_size, DMA_TO_DEVICE); + return 0; +} + #define TI_UDMAC_BUSWIDTHS (BIT(DMA_SLAVE_BUSWIDTH_1_BYTE) | \ BIT(DMA_SLAVE_BUSWIDTH_2_BYTES) | \ BIT(DMA_SLAVE_BUSWIDTH_3_BYTES) | \ @@ -3387,6 +3517,10 @@ static int udma_probe(struct platform_device *pdev) if (ud->desc_align < dma_get_cache_alignment()) ud->desc_align = dma_get_cache_alignment(); + ret = udma_setup_rx_flush(ud); + if (ret) + return ret; + for (i = 0; i < ud->tchan_cnt; i++) { struct udma_tchan *tchan = &ud->tchans[i]; From a97934071fc3b0a5e52c89e06da68bcac0481de3 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Fri, 14 Feb 2020 11:14:38 +0200 Subject: [PATCH 045/400] dmaengine: ti: k3-udma: Move the TR counter calculation to helper function Move the TR counter parameter configuration code out from the prep_memcpy callback to a helper function to allow a generic re-usable code for other TR based transfers. Signed-off-by: Peter Ujfalusi Link: https://lore.kernel.org/r/20200214091441.27535-4-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul --- drivers/dma/ti/k3-udma.c | 74 +++++++++++++++++++++++++++------------- 1 file changed, 51 insertions(+), 23 deletions(-) diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index cb9259e104b4..9b00013d6f63 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -2029,6 +2029,51 @@ static struct udma_desc *udma_alloc_tr_desc(struct udma_chan *uc, return d; } +/** + * udma_get_tr_counters - calculate TR counters for a given length + * @len: Length of the trasnfer + * @align_to: Preferred alignment + * @tr0_cnt0: First TR icnt0 + * @tr0_cnt1: First TR icnt1 + * @tr1_cnt0: Second (if used) TR icnt0 + * + * For len < SZ_64K only one TR is enough, tr1_cnt0 is not updated + * For len >= SZ_64K two TRs are used in a simple way: + * First TR: SZ_64K-alignment blocks (tr0_cnt0, tr0_cnt1) + * Second TR: the remaining length (tr1_cnt0) + * + * Returns the number of TRs the length needs (1 or 2) + * -EINVAL if the length can not be supported + */ +static int udma_get_tr_counters(size_t len, unsigned long align_to, + u16 *tr0_cnt0, u16 *tr0_cnt1, u16 *tr1_cnt0) +{ + if (len < SZ_64K) { + *tr0_cnt0 = len; + *tr0_cnt1 = 1; + + return 1; + } + + if (align_to > 3) + align_to = 3; + +realign: + *tr0_cnt0 = SZ_64K - BIT(align_to); + if (len / *tr0_cnt0 >= SZ_64K) { + if (align_to) { + align_to--; + goto realign; + } + return -EINVAL; + } + + *tr0_cnt1 = len / *tr0_cnt0; + *tr1_cnt0 = len % *tr0_cnt0; + + return 2; +} + static struct udma_desc * udma_prep_slave_sg_tr(struct udma_chan *uc, struct scatterlist *sgl, unsigned int sglen, enum dma_transfer_direction dir, @@ -2581,29 +2626,12 @@ udma_prep_dma_memcpy(struct dma_chan *chan, dma_addr_t dest, dma_addr_t src, return NULL; } - if (len < SZ_64K) { - num_tr = 1; - tr0_cnt0 = len; - tr0_cnt1 = 1; - } else { - unsigned long align_to = __ffs(src | dest); - - if (align_to > 3) - align_to = 3; - /* - * Keep simple: tr0: SZ_64K-alignment blocks, - * tr1: the remaining - */ - num_tr = 2; - tr0_cnt0 = (SZ_64K - BIT(align_to)); - if (len / tr0_cnt0 >= SZ_64K) { - dev_err(uc->ud->dev, "size %zu is not supported\n", - len); - return NULL; - } - - tr0_cnt1 = len / tr0_cnt0; - tr1_cnt0 = len % tr0_cnt0; + num_tr = udma_get_tr_counters(len, __ffs(src | dest), &tr0_cnt0, + &tr0_cnt1, &tr1_cnt0); + if (num_tr < 0) { + dev_err(uc->ud->dev, "size %zu is not supported\n", + len); + return NULL; } d = udma_alloc_tr_desc(uc, tr_size, num_tr, DMA_MEM_TO_MEM); From 6cf668a4ef829b9a11d74a4954f02a1767403246 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Fri, 14 Feb 2020 11:14:39 +0200 Subject: [PATCH 046/400] dmaengine: ti: k3-udma: Use the TR counter helper for slave_sg and cyclic Use the generic TR setup function to get the TR counters for both cyclic and slave_sg transfers. This way the period_size for cyclic and sg_dma_len() for slave_sg can be as large as (SZ_64K - 1) * (SZ_64K - 1) and we can handle cases when the length is >SZ_64K and a prime number. Signed-off-by: Peter Ujfalusi Link: https://lore.kernel.org/r/20200214091441.27535-5-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul --- drivers/dma/ti/k3-udma.c | 130 ++++++++++++++++++++++++++------------- 1 file changed, 88 insertions(+), 42 deletions(-) diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index 9b00013d6f63..1dba47c662c4 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -2079,31 +2079,31 @@ udma_prep_slave_sg_tr(struct udma_chan *uc, struct scatterlist *sgl, unsigned int sglen, enum dma_transfer_direction dir, unsigned long tx_flags, void *context) { - enum dma_slave_buswidth dev_width; struct scatterlist *sgent; struct udma_desc *d; - size_t tr_size; struct cppi5_tr_type1_t *tr_req = NULL; + u16 tr0_cnt0, tr0_cnt1, tr1_cnt0; unsigned int i; - u32 burst; + size_t tr_size; + int num_tr = 0; + int tr_idx = 0; - if (dir == DMA_DEV_TO_MEM) { - dev_width = uc->cfg.src_addr_width; - burst = uc->cfg.src_maxburst; - } else if (dir == DMA_MEM_TO_DEV) { - dev_width = uc->cfg.dst_addr_width; - burst = uc->cfg.dst_maxburst; - } else { - dev_err(uc->ud->dev, "%s: bad direction?\n", __func__); + if (!is_slave_direction(dir)) { + dev_err(uc->ud->dev, "Only slave cyclic is supported\n"); return NULL; } - if (!burst) - burst = 1; + /* estimate the number of TRs we will need */ + for_each_sg(sgl, sgent, sglen, i) { + if (sg_dma_len(sgent) < SZ_64K) + num_tr++; + else + num_tr += 2; + } /* Now allocate and setup the descriptor. */ tr_size = sizeof(struct cppi5_tr_type1_t); - d = udma_alloc_tr_desc(uc, tr_size, sglen, dir); + d = udma_alloc_tr_desc(uc, tr_size, num_tr, dir); if (!d) return NULL; @@ -2111,19 +2111,46 @@ udma_prep_slave_sg_tr(struct udma_chan *uc, struct scatterlist *sgl, tr_req = d->hwdesc[0].tr_req_base; for_each_sg(sgl, sgent, sglen, i) { - d->residue += sg_dma_len(sgent); + dma_addr_t sg_addr = sg_dma_address(sgent); + + num_tr = udma_get_tr_counters(sg_dma_len(sgent), __ffs(sg_addr), + &tr0_cnt0, &tr0_cnt1, &tr1_cnt0); + if (num_tr < 0) { + dev_err(uc->ud->dev, "size %u is not supported\n", + sg_dma_len(sgent)); + udma_free_hwdesc(uc, d); + kfree(d); + return NULL; + } cppi5_tr_init(&tr_req[i].flags, CPPI5_TR_TYPE1, false, false, CPPI5_TR_EVENT_SIZE_COMPLETION, 0); cppi5_tr_csf_set(&tr_req[i].flags, CPPI5_TR_CSF_SUPR_EVT); - tr_req[i].addr = sg_dma_address(sgent); - tr_req[i].icnt0 = burst * dev_width; - tr_req[i].dim1 = burst * dev_width; - tr_req[i].icnt1 = sg_dma_len(sgent) / tr_req[i].icnt0; + tr_req[tr_idx].addr = sg_addr; + tr_req[tr_idx].icnt0 = tr0_cnt0; + tr_req[tr_idx].icnt1 = tr0_cnt1; + tr_req[tr_idx].dim1 = tr0_cnt0; + tr_idx++; + + if (num_tr == 2) { + cppi5_tr_init(&tr_req[tr_idx].flags, CPPI5_TR_TYPE1, + false, false, + CPPI5_TR_EVENT_SIZE_COMPLETION, 0); + cppi5_tr_csf_set(&tr_req[tr_idx].flags, + CPPI5_TR_CSF_SUPR_EVT); + + tr_req[tr_idx].addr = sg_addr + tr0_cnt1 * tr0_cnt0; + tr_req[tr_idx].icnt0 = tr1_cnt0; + tr_req[tr_idx].icnt1 = 1; + tr_req[tr_idx].dim1 = tr1_cnt0; + tr_idx++; + } + + d->residue += sg_dma_len(sgent); } - cppi5_tr_csf_set(&tr_req[i - 1].flags, CPPI5_TR_CSF_EOP); + cppi5_tr_csf_set(&tr_req[tr_idx - 1].flags, CPPI5_TR_CSF_EOP); return d; } @@ -2428,47 +2455,66 @@ udma_prep_dma_cyclic_tr(struct udma_chan *uc, dma_addr_t buf_addr, size_t buf_len, size_t period_len, enum dma_transfer_direction dir, unsigned long flags) { - enum dma_slave_buswidth dev_width; struct udma_desc *d; - size_t tr_size; + size_t tr_size, period_addr; struct cppi5_tr_type1_t *tr_req; - unsigned int i; unsigned int periods = buf_len / period_len; - u32 burst; + u16 tr0_cnt0, tr0_cnt1, tr1_cnt0; + unsigned int i; + int num_tr; - if (dir == DMA_DEV_TO_MEM) { - dev_width = uc->cfg.src_addr_width; - burst = uc->cfg.src_maxburst; - } else if (dir == DMA_MEM_TO_DEV) { - dev_width = uc->cfg.dst_addr_width; - burst = uc->cfg.dst_maxburst; - } else { - dev_err(uc->ud->dev, "%s: bad direction?\n", __func__); + if (!is_slave_direction(dir)) { + dev_err(uc->ud->dev, "Only slave cyclic is supported\n"); return NULL; } - if (!burst) - burst = 1; + num_tr = udma_get_tr_counters(period_len, __ffs(buf_addr), &tr0_cnt0, + &tr0_cnt1, &tr1_cnt0); + if (num_tr < 0) { + dev_err(uc->ud->dev, "size %zu is not supported\n", + period_len); + return NULL; + } /* Now allocate and setup the descriptor. */ tr_size = sizeof(struct cppi5_tr_type1_t); - d = udma_alloc_tr_desc(uc, tr_size, periods, dir); + d = udma_alloc_tr_desc(uc, tr_size, periods * num_tr, dir); if (!d) return NULL; tr_req = d->hwdesc[0].tr_req_base; + period_addr = buf_addr; for (i = 0; i < periods; i++) { - cppi5_tr_init(&tr_req[i].flags, CPPI5_TR_TYPE1, false, false, - CPPI5_TR_EVENT_SIZE_COMPLETION, 0); + int tr_idx = i * num_tr; - tr_req[i].addr = buf_addr + period_len * i; - tr_req[i].icnt0 = dev_width; - tr_req[i].icnt1 = period_len / dev_width; - tr_req[i].dim1 = dev_width; + cppi5_tr_init(&tr_req[tr_idx].flags, CPPI5_TR_TYPE1, false, + false, CPPI5_TR_EVENT_SIZE_COMPLETION, 0); + + tr_req[tr_idx].addr = period_addr; + tr_req[tr_idx].icnt0 = tr0_cnt0; + tr_req[tr_idx].icnt1 = tr0_cnt1; + tr_req[tr_idx].dim1 = tr0_cnt0; + + if (num_tr == 2) { + cppi5_tr_csf_set(&tr_req[tr_idx].flags, + CPPI5_TR_CSF_SUPR_EVT); + tr_idx++; + + cppi5_tr_init(&tr_req[tr_idx].flags, CPPI5_TR_TYPE1, + false, false, + CPPI5_TR_EVENT_SIZE_COMPLETION, 0); + + tr_req[tr_idx].addr = period_addr + tr0_cnt1 * tr0_cnt0; + tr_req[tr_idx].icnt0 = tr1_cnt0; + tr_req[tr_idx].icnt1 = 1; + tr_req[tr_idx].dim1 = tr1_cnt0; + } if (!(flags & DMA_PREP_INTERRUPT)) - cppi5_tr_csf_set(&tr_req[i].flags, + cppi5_tr_csf_set(&tr_req[tr_idx].flags, CPPI5_TR_CSF_SUPR_EVT); + + period_addr += period_len; } return d; From c7450bb211f3ded7cb6e4d305855683ed7d04e60 Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Fri, 14 Feb 2020 11:14:40 +0200 Subject: [PATCH 047/400] dmaengine: ti: k3-udma: Use the channel direction in pause/resume functions It should be possible to pause, resume and check the pause state of a channel even if we do not have active transfer. udma_is_chan_paused() can trigger NULL pointer reference in it's current form when the status is checked while uc->desc is NULL. Fixes: 25dcb5dd7b7ce ("dmaengine: ti: New driver for K3 UDMA") Signed-off-by: Peter Ujfalusi Link: https://lore.kernel.org/r/20200214091441.27535-6-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul --- drivers/dma/ti/k3-udma.c | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index 1dba47c662c4..9b4e1e5fa849 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -513,7 +513,7 @@ static bool udma_is_chan_paused(struct udma_chan *uc) { u32 val, pause_mask; - switch (uc->desc->dir) { + switch (uc->config.dir) { case DMA_DEV_TO_MEM: val = udma_rchanrt_read(uc->rchan, UDMA_RCHAN_RT_PEER_RT_EN_REG); @@ -2835,11 +2835,8 @@ static int udma_pause(struct dma_chan *chan) { struct udma_chan *uc = to_udma_chan(chan); - if (!uc->desc) - return -EINVAL; - /* pause the channel */ - switch (uc->desc->dir) { + switch (uc->config.dir) { case DMA_DEV_TO_MEM: udma_rchanrt_update_bits(uc->rchan, UDMA_RCHAN_RT_PEER_RT_EN_REG, @@ -2868,11 +2865,8 @@ static int udma_resume(struct dma_chan *chan) { struct udma_chan *uc = to_udma_chan(chan); - if (!uc->desc) - return -EINVAL; - /* resume the channel */ - switch (uc->desc->dir) { + switch (uc->config.dir) { case DMA_DEV_TO_MEM: udma_rchanrt_update_bits(uc->rchan, UDMA_RCHAN_RT_PEER_RT_EN_REG, From 8390318c04bb24cc4d41ac03e009b0748882f99f Mon Sep 17 00:00:00 2001 From: Peter Ujfalusi Date: Fri, 14 Feb 2020 11:14:41 +0200 Subject: [PATCH 048/400] dmaengine: ti: k3-udma: Fix terminated transfer handling When we receive back the descriptor of the terminated transfer the cookie must be marked as completed to make sure that the accounting is correct. In udma_tx_status() the status should be marked as completed if the channel is no longer running (it can only happen if the channel is not yet started for the first time, or after a channel termination). Fixes: 25dcb5dd7b7ce ("dmaengine: ti: New driver for K3 UDMA") Signed-off-by: Peter Ujfalusi Link: https://lore.kernel.org/r/20200214091441.27535-7-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul --- drivers/dma/ti/k3-udma.c | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/dma/ti/k3-udma.c b/drivers/dma/ti/k3-udma.c index 9b4e1e5fa849..0536866a58ce 100644 --- a/drivers/dma/ti/k3-udma.c +++ b/drivers/dma/ti/k3-udma.c @@ -1097,29 +1097,27 @@ static irqreturn_t udma_ring_irq_handler(int irq, void *data) goto out; } - if (uc->cyclic) { - /* push the descriptor back to the ring */ - if (d == uc->desc) { + if (d == uc->desc) { + /* active descriptor */ + if (uc->cyclic) { udma_cyclic_packet_elapsed(uc); vchan_cyclic_callback(&d->vd); - } - } else { - bool desc_done = false; - - if (d == uc->desc) { - desc_done = udma_is_desc_really_done(uc, d); - - if (desc_done) { + } else { + if (udma_is_desc_really_done(uc, d)) { uc->bcnt += d->residue; udma_start(uc); + vchan_cookie_complete(&d->vd); } else { schedule_delayed_work(&uc->tx_drain.work, 0); } } - - if (desc_done) - vchan_cookie_complete(&d->vd); + } else { + /* + * terminated descriptor, mark the descriptor as + * completed to update the channel's cookie marker + */ + dma_cookie_complete(&d->vd.tx); } } out: @@ -2769,6 +2767,9 @@ static enum dma_status udma_tx_status(struct dma_chan *chan, ret = dma_cookie_status(chan, cookie, txstate); + if (!udma_is_chan_running(uc)) + ret = DMA_COMPLETE; + if (ret == DMA_IN_PROGRESS && udma_is_chan_paused(uc)) ret = DMA_PAUSED; From c37c0ab029569a75fd180edb03d411e7a28a936f Mon Sep 17 00:00:00 2001 From: Hui Wang Date: Wed, 19 Feb 2020 13:23:06 +0800 Subject: [PATCH 049/400] ALSA: hda/realtek - Fix a regression for mute led on Lenovo Carbon X1 Need to chain the THINKPAD_ACPI, otherwise the mute led will not work. Fixes: d2cd795c4ece ("ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen") Cc: Signed-off-by: Hui Wang Link: https://lore.kernel.org/r/20200219052306.24935-1-hui.wang@canonical.com Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 477589e7ec1d..31bee0512334 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -6684,6 +6684,8 @@ static const struct hda_fixup alc269_fixups[] = { [ALC285_FIXUP_SPEAKER2_TO_DAC1] = { .type = HDA_FIXUP_FUNC, .v.func = alc285_fixup_speaker2_to_dac1, + .chained = true, + .chain_id = ALC269_FIXUP_THINKPAD_ACPI }, [ALC256_FIXUP_DELL_INSPIRON_7559_SUBWOOFER] = { .type = HDA_FIXUP_PINS, From 2d0b1919457ad78036f24169968cadc6f55d37ec Mon Sep 17 00:00:00 2001 From: Dave Jiang Date: Tue, 18 Feb 2020 09:51:58 -0700 Subject: [PATCH 050/400] dmaengine: idxd: correct reserved token calculation The calcuation for limit of reserved token did not take into account the change the user wanted vs the current group reserved token. This causes changing of the reserved token to be possible only after we set the value of the reserved token back to 0. Fix calculation so we can set a value that is non zero for reserved token. Fixes: c52ca478233c ("dmaengine: idxd: add configuration component of driver") Reported-by: Jerry Chen Signed-off-by: Dave Jiang Link: https://lore.kernel.org/r/158204471889.37789.7749177228265869168.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul --- drivers/dma/idxd/sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c index 298855ca934f..edbfe83325eb 100644 --- a/drivers/dma/idxd/sysfs.c +++ b/drivers/dma/idxd/sysfs.c @@ -519,7 +519,7 @@ static ssize_t group_tokens_reserved_store(struct device *dev, if (val > idxd->max_tokens) return -EINVAL; - if (val > idxd->nr_tokens) + if (val > idxd->nr_tokens + group->tokens_reserved) return -EINVAL; group->tokens_reserved = val; From 64bbacc5f08c01954890981c63de744df1f29a30 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 18 Feb 2020 12:17:35 +0100 Subject: [PATCH 051/400] ASoC: intel: skl: Fix pin debug prints skl_print_pins() loops over all given pins but it overwrites the text at the very same position while increasing the returned length. Fix this to show the all pin contents properly. Fixes: d14700a01f91 ("ASoC: Intel: Skylake: Debugfs facility to dump module config") Signed-off-by: Takashi Iwai Acked-by: Cezary Rojewski Link: https://lore.kernel.org/r/20200218111737.14193-2-tiwai@suse.de Signed-off-by: Mark Brown --- sound/soc/intel/skylake/skl-debug.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/sound/soc/intel/skylake/skl-debug.c b/sound/soc/intel/skylake/skl-debug.c index 3466675f2678..4c1703da1a6d 100644 --- a/sound/soc/intel/skylake/skl-debug.c +++ b/sound/soc/intel/skylake/skl-debug.c @@ -34,7 +34,7 @@ static ssize_t skl_print_pins(struct skl_module_pin *m_pin, char *buf, int i; ssize_t ret = 0; - for (i = 0; i < max_pin; i++) + for (i = 0; i < max_pin; i++) { ret += snprintf(buf + size, MOD_BUF - size, "%s %d\n\tModule %d\n\tInstance %d\n\t" "In-used %s\n\tType %s\n" @@ -45,6 +45,8 @@ static ssize_t skl_print_pins(struct skl_module_pin *m_pin, char *buf, m_pin[i].in_use ? "Used" : "Unused", m_pin[i].is_dynamic ? "Dynamic" : "Static", m_pin[i].pin_state, i); + size += ret; + } return ret; } From 549cd0ba04dcfe340c349cd983bd440480fae8ee Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 18 Feb 2020 12:17:36 +0100 Subject: [PATCH 052/400] ASoC: intel: skl: Fix possible buffer overflow in debug outputs The debugfs output of intel skl driver writes strings with multiple snprintf() calls with the fixed size. This was supposed to avoid the buffer overflow but actually it still would, because snprintf() returns the expected size to be output, not the actual output size. Fix it by replacing snprintf() calls with scnprintf(). Fixes: d14700a01f91 ("ASoC: Intel: Skylake: Debugfs facility to dump module config") Signed-off-by: Takashi Iwai Acked-by: Cezary Rojewski Link: https://lore.kernel.org/r/20200218111737.14193-3-tiwai@suse.de Signed-off-by: Mark Brown --- sound/soc/intel/skylake/skl-debug.c | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/sound/soc/intel/skylake/skl-debug.c b/sound/soc/intel/skylake/skl-debug.c index 4c1703da1a6d..a15aa2ffa681 100644 --- a/sound/soc/intel/skylake/skl-debug.c +++ b/sound/soc/intel/skylake/skl-debug.c @@ -35,7 +35,7 @@ static ssize_t skl_print_pins(struct skl_module_pin *m_pin, char *buf, ssize_t ret = 0; for (i = 0; i < max_pin; i++) { - ret += snprintf(buf + size, MOD_BUF - size, + ret += scnprintf(buf + size, MOD_BUF - size, "%s %d\n\tModule %d\n\tInstance %d\n\t" "In-used %s\n\tType %s\n" "\tState %d\n\tIndex %d\n", @@ -53,7 +53,7 @@ static ssize_t skl_print_pins(struct skl_module_pin *m_pin, char *buf, static ssize_t skl_print_fmt(struct skl_module_fmt *fmt, char *buf, ssize_t size, bool direction) { - return snprintf(buf + size, MOD_BUF - size, + return scnprintf(buf + size, MOD_BUF - size, "%s\n\tCh %d\n\tFreq %d\n\tBit depth %d\n\t" "Valid bit depth %d\n\tCh config %#x\n\tInterleaving %d\n\t" "Sample Type %d\n\tCh Map %#x\n", @@ -77,16 +77,16 @@ static ssize_t module_read(struct file *file, char __user *user_buf, if (!buf) return -ENOMEM; - ret = snprintf(buf, MOD_BUF, "Module:\n\tUUID %pUL\n\tModule id %d\n" + ret = scnprintf(buf, MOD_BUF, "Module:\n\tUUID %pUL\n\tModule id %d\n" "\tInstance id %d\n\tPvt_id %d\n", mconfig->guid, mconfig->id.module_id, mconfig->id.instance_id, mconfig->id.pvt_id); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "Resources:\n\tCPC %#x\n\tIBS %#x\n\tOBS %#x\t\n", res->cpc, res->ibs, res->obs); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "Module data:\n\tCore %d\n\tIn queue %d\n\t" "Out queue %d\n\tType %s\n", mconfig->core_id, mconfig->max_in_queue, @@ -96,38 +96,38 @@ static ssize_t module_read(struct file *file, char __user *user_buf, ret += skl_print_fmt(mconfig->in_fmt, buf, ret, true); ret += skl_print_fmt(mconfig->out_fmt, buf, ret, false); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "Fixup:\n\tParams %#x\n\tConverter %#x\n", mconfig->params_fixup, mconfig->converter); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "Module Gateway:\n\tType %#x\n\tVbus %#x\n\tHW conn %#x\n\tSlot %#x\n", mconfig->dev_type, mconfig->vbus_id, mconfig->hw_conn_type, mconfig->time_slot); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "Pipeline:\n\tID %d\n\tPriority %d\n\tConn Type %d\n\t" "Pages %#x\n", mconfig->pipe->ppl_id, mconfig->pipe->pipe_priority, mconfig->pipe->conn_type, mconfig->pipe->memory_pages); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "\tParams:\n\t\tHost DMA %d\n\t\tLink DMA %d\n", mconfig->pipe->p_params->host_dma_id, mconfig->pipe->p_params->link_dma_id); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "\tPCM params:\n\t\tCh %d\n\t\tFreq %d\n\t\tFormat %d\n", mconfig->pipe->p_params->ch, mconfig->pipe->p_params->s_freq, mconfig->pipe->p_params->s_fmt); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "\tLink %#x\n\tStream %#x\n", mconfig->pipe->p_params->linktype, mconfig->pipe->p_params->stream); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "\tState %d\n\tPassthru %s\n", mconfig->pipe->state, mconfig->pipe->passthru ? "true" : "false"); @@ -137,7 +137,7 @@ static ssize_t module_read(struct file *file, char __user *user_buf, ret += skl_print_pins(mconfig->m_out_pin, buf, mconfig->max_out_queue, ret, false); - ret += snprintf(buf + ret, MOD_BUF - ret, + ret += scnprintf(buf + ret, MOD_BUF - ret, "Other:\n\tDomain %d\n\tHomogeneous Input %s\n\t" "Homogeneous Output %s\n\tIn Queue Mask %d\n\t" "Out Queue Mask %d\n\tDMA ID %d\n\tMem Pages %d\n\t" @@ -193,7 +193,7 @@ static ssize_t fw_softreg_read(struct file *file, char __user *user_buf, __ioread32_copy(d->fw_read_buff, fw_reg_addr, w0_stat_sz >> 2); for (offset = 0; offset < FW_REG_SIZE; offset += 16) { - ret += snprintf(tmp + ret, FW_REG_BUF - ret, "%#.4x: ", offset); + ret += scnprintf(tmp + ret, FW_REG_BUF - ret, "%#.4x: ", offset); hex_dump_to_buffer(d->fw_read_buff + offset, 16, 16, 4, tmp + ret, FW_REG_BUF - ret, 0); ret += strlen(tmp + ret); From 6c89ffea60aa3b2a33ae7987de1e84bfb89e4c9e Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 18 Feb 2020 12:17:37 +0100 Subject: [PATCH 053/400] ASoC: pcm: Fix possible buffer overflow in dpcm state sysfs output dpcm_show_state() invokes multiple snprintf() calls to concatenate formatted strings on the fixed size buffer. The usage of snprintf() is supposed for avoiding the buffer overflow, but it doesn't work as expected because snprintf() doesn't return the actual output size but the size to be written. Fix this bug by replacing all snprintf() calls with scnprintf() calls. Fixes: f86dcef87b77 ("ASoC: dpcm: Add debugFS support for DPCM") Signed-off-by: Takashi Iwai Acked-by: Cezary Rojewski Link: https://lore.kernel.org/r/20200218111737.14193-4-tiwai@suse.de Signed-off-by: Mark Brown --- sound/soc/soc-pcm.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c index e66ac9ce321b..42fb03a99cb1 100644 --- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -3171,16 +3171,16 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe, unsigned long flags; /* FE state */ - offset += snprintf(buf + offset, size - offset, + offset += scnprintf(buf + offset, size - offset, "[%s - %s]\n", fe->dai_link->name, stream ? "Capture" : "Playback"); - offset += snprintf(buf + offset, size - offset, "State: %s\n", + offset += scnprintf(buf + offset, size - offset, "State: %s\n", dpcm_state_string(fe->dpcm[stream].state)); if ((fe->dpcm[stream].state >= SND_SOC_DPCM_STATE_HW_PARAMS) && (fe->dpcm[stream].state <= SND_SOC_DPCM_STATE_STOP)) - offset += snprintf(buf + offset, size - offset, + offset += scnprintf(buf + offset, size - offset, "Hardware Params: " "Format = %s, Channels = %d, Rate = %d\n", snd_pcm_format_name(params_format(params)), @@ -3188,10 +3188,10 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe, params_rate(params)); /* BEs state */ - offset += snprintf(buf + offset, size - offset, "Backends:\n"); + offset += scnprintf(buf + offset, size - offset, "Backends:\n"); if (list_empty(&fe->dpcm[stream].be_clients)) { - offset += snprintf(buf + offset, size - offset, + offset += scnprintf(buf + offset, size - offset, " No active DSP links\n"); goto out; } @@ -3201,16 +3201,16 @@ static ssize_t dpcm_show_state(struct snd_soc_pcm_runtime *fe, struct snd_soc_pcm_runtime *be = dpcm->be; params = &dpcm->hw_params; - offset += snprintf(buf + offset, size - offset, + offset += scnprintf(buf + offset, size - offset, "- %s\n", be->dai_link->name); - offset += snprintf(buf + offset, size - offset, + offset += scnprintf(buf + offset, size - offset, " State: %s\n", dpcm_state_string(be->dpcm[stream].state)); if ((be->dpcm[stream].state >= SND_SOC_DPCM_STATE_HW_PARAMS) && (be->dpcm[stream].state <= SND_SOC_DPCM_STATE_STOP)) - offset += snprintf(buf + offset, size - offset, + offset += scnprintf(buf + offset, size - offset, " Hardware Params: " "Format = %s, Channels = %d, Rate = %d\n", snd_pcm_format_name(params_format(params)), From e4103312d7b7afb8a3a7a842a33ef2b1856b2c0f Mon Sep 17 00:00:00 2001 From: Parav Pandit Date: Wed, 12 Feb 2020 09:26:29 +0200 Subject: [PATCH 054/400] Revert "RDMA/cma: Simplify rdma_resolve_addr() error flow" This reverts commit 219d2e9dfda9431b808c28d5efc74b404b95b638. The call chain below requires the cm_id_priv's destination address to be setup before performing rdma_bind_addr(). Otherwise source port allocation fails as cma_port_is_unique() no longer sees the correct tuple to allow duplicate users of the source port. rdma_resolve_addr() cma_bind_addr() rdma_bind_addr() cma_get_port() cma_alloc_any_port() cma_port_is_unique() <- compared with zero daddr This can result in false failures to connect, particularly if the source port range is restricted. Fixes: 219d2e9dfda9 ("RDMA/cma: Simplify rdma_resolve_addr() error flow") Link: https://lore.kernel.org/r/20200212072635.682689-4-leon@kernel.org Signed-off-by: Parav Pandit Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/cma.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c index 72f032160c4b..2dec3a02ab9f 100644 --- a/drivers/infiniband/core/cma.c +++ b/drivers/infiniband/core/cma.c @@ -3212,19 +3212,26 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr, int ret; id_priv = container_of(id, struct rdma_id_private, id); + memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr)); if (id_priv->state == RDMA_CM_IDLE) { ret = cma_bind_addr(id, src_addr, dst_addr); - if (ret) + if (ret) { + memset(cma_dst_addr(id_priv), 0, + rdma_addr_size(dst_addr)); return ret; + } } - if (cma_family(id_priv) != dst_addr->sa_family) + if (cma_family(id_priv) != dst_addr->sa_family) { + memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr)); return -EINVAL; + } - if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY)) + if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY)) { + memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr)); return -EINVAL; + } - memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr)); if (cma_any_addr(dst_addr)) { ret = cma_resolve_loopback(id_priv); } else { From 2b2d5c4db732c027a14987cfccf767dac1b45170 Mon Sep 17 00:00:00 2001 From: Dragos Tarcatu Date: Fri, 7 Feb 2020 20:53:24 +0200 Subject: [PATCH 055/400] ASoC: topology: Fix memleak in soc_tplg_link_elems_load() If soc_tplg_link_config() fails, _link needs to be freed in case of topology ABI version mismatch. However the current code is returning directly and ends up leaking memory in this case. This patch fixes that. Fixes: 593d9e52f9bb ("ASoC: topology: Add support to configure existing physical DAI links") Signed-off-by: Dragos Tarcatu Link: https://lore.kernel.org/r/20200207185325.22320-2-dragos_tarcatu@mentor.com Signed-off-by: Mark Brown --- sound/soc/soc-topology.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c index 107ba3ed5440..953517a73298 100644 --- a/sound/soc/soc-topology.c +++ b/sound/soc/soc-topology.c @@ -2376,8 +2376,11 @@ static int soc_tplg_link_elems_load(struct soc_tplg *tplg, } ret = soc_tplg_link_config(tplg, _link); - if (ret < 0) + if (ret < 0) { + if (!abi_match) + kfree(_link); return ret; + } /* offset by version-specific struct size and * real priv data size From 242c46c023610dbc0213fc8fb6b71eb836bc5d95 Mon Sep 17 00:00:00 2001 From: Dragos Tarcatu Date: Fri, 7 Feb 2020 20:53:25 +0200 Subject: [PATCH 056/400] ASoC: topology: Fix memleak in soc_tplg_manifest_load() In case of ABI version mismatch, _manifest needs to be freed as it is just a copy of the original topology manifest. However, if a driver manifest handler is defined, that would get executed and the cleanup is never reached. Fix that by getting the return status of manifest() instead of returning directly. Fixes: 583958fa2e52 ("ASoC: topology: Make manifest backward compatible from ABI v4") Signed-off-by: Dragos Tarcatu Link: https://lore.kernel.org/r/20200207185325.22320-3-dragos_tarcatu@mentor.com Signed-off-by: Mark Brown --- sound/soc/soc-topology.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c index 953517a73298..22c38df40d5a 100644 --- a/sound/soc/soc-topology.c +++ b/sound/soc/soc-topology.c @@ -2544,7 +2544,7 @@ static int soc_tplg_manifest_load(struct soc_tplg *tplg, { struct snd_soc_tplg_manifest *manifest, *_manifest; bool abi_match; - int err; + int ret = 0; if (tplg->pass != SOC_TPLG_PASS_MANIFEST) return 0; @@ -2557,19 +2557,19 @@ static int soc_tplg_manifest_load(struct soc_tplg *tplg, _manifest = manifest; } else { abi_match = false; - err = manifest_new_ver(tplg, manifest, &_manifest); - if (err < 0) - return err; + ret = manifest_new_ver(tplg, manifest, &_manifest); + if (ret < 0) + return ret; } /* pass control to component driver for optional further init */ if (tplg->comp && tplg->ops && tplg->ops->manifest) - return tplg->ops->manifest(tplg->comp, tplg->index, _manifest); + ret = tplg->ops->manifest(tplg->comp, tplg->index, _manifest); if (!abi_match) /* free the duplicated one */ kfree(_manifest); - return 0; + return ret; } /* validate header magic, size and type */ From 4ca501d6aaf21de31541deac35128bbea8427aa6 Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Mon, 17 Feb 2020 13:43:18 -0700 Subject: [PATCH 057/400] RDMA/core: Fix use of logical OR in get_new_pps Clang warns: ../drivers/infiniband/core/security.c:351:41: warning: converting the enum constant to a boolean [-Wint-in-bool-context] if (!(qp_attr_mask & (IB_QP_PKEY_INDEX || IB_QP_PORT)) && qp_pps) { ^ 1 warning generated. A bitwise OR should have been used instead. Fixes: 1dd017882e01 ("RDMA/core: Fix protection fault in get_pkey_idx_qp_list") Link: https://lore.kernel.org/r/20200217204318.13609-1-natechancellor@gmail.com Link: https://github.com/ClangBuiltLinux/linux/issues/889 Reported-by: Dan Carpenter Signed-off-by: Nathan Chancellor Reviewed-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/security.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c index 2b4d80393bd0..b9a36ea244d4 100644 --- a/drivers/infiniband/core/security.c +++ b/drivers/infiniband/core/security.c @@ -348,7 +348,7 @@ static struct ib_ports_pkeys *get_new_pps(const struct ib_qp *qp, if ((qp_attr_mask & IB_QP_PKEY_INDEX) && (qp_attr_mask & IB_QP_PORT)) new_pps->main.state = IB_PORT_PKEY_VALID; - if (!(qp_attr_mask & (IB_QP_PKEY_INDEX || IB_QP_PORT)) && qp_pps) { + if (!(qp_attr_mask & (IB_QP_PKEY_INDEX | IB_QP_PORT)) && qp_pps) { new_pps->main.port_num = qp_pps->main.port_num; new_pps->main.pkey_index = qp_pps->main.pkey_index; if (qp_pps->main.state != IB_PORT_PKEY_NOT_VALID) From dde54b9492a8ba46bcd7e7e26172adf2bfcea817 Mon Sep 17 00:00:00 2001 From: Heidi Fahim Date: Tue, 26 Nov 2019 14:36:16 -0800 Subject: [PATCH 058/400] kunit: test: Improve error messages for kunit_tool when kunitconfig is invalid Previous error message for invalid kunitconfig was vague. Added to it so that it lists invalid fields and prompts for them to be removed. Added validate_config function returning whether or not this kconfig is valid. Signed-off-by: Heidi Fahim Reviewed-by: Brendan Higgins Tested-by: Brendan Higgins Signed-off-by: Shuah Khan --- tools/testing/kunit/kunit_kernel.py | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py index cc5d844ecca1..d99ae75ef72f 100644 --- a/tools/testing/kunit/kunit_kernel.py +++ b/tools/testing/kunit/kunit_kernel.py @@ -93,6 +93,20 @@ class LinuxSourceTree(object): return False return True + def validate_config(self, build_dir): + kconfig_path = get_kconfig_path(build_dir) + validated_kconfig = kunit_config.Kconfig() + validated_kconfig.read_from_file(kconfig_path) + if not self._kconfig.is_subset_of(validated_kconfig): + invalid = self._kconfig.entries() - validated_kconfig.entries() + message = 'Provided Kconfig is not contained in validated .config. Following fields found in kunitconfig, ' \ + 'but not in .config: %s' % ( + ', '.join([str(e) for e in invalid]) + ) + logging.error(message) + return False + return True + def build_config(self, build_dir): kconfig_path = get_kconfig_path(build_dir) if build_dir and not os.path.exists(build_dir): @@ -103,12 +117,7 @@ class LinuxSourceTree(object): except ConfigError as e: logging.error(e) return False - validated_kconfig = kunit_config.Kconfig() - validated_kconfig.read_from_file(kconfig_path) - if not self._kconfig.is_subset_of(validated_kconfig): - logging.error('Provided Kconfig is not contained in validated .config!') - return False - return True + return self.validate_config(build_dir) def build_reconfig(self, build_dir): """Creates a new .config if it is not a subset of the .kunitconfig.""" @@ -133,12 +142,7 @@ class LinuxSourceTree(object): except (ConfigError, BuildError) as e: logging.error(e) return False - used_kconfig = kunit_config.Kconfig() - used_kconfig.read_from_file(get_kconfig_path(build_dir)) - if not self._kconfig.is_subset_of(used_kconfig): - logging.error('Provided Kconfig is not contained in final config!') - return False - return True + return self.validate_config(build_dir) def run_kernel(self, args=[], timeout=None, build_dir=''): args.extend(['mem=256M']) From be886ba90cce2fb2f5a4dbcda8f3be3fd1b2f484 Mon Sep 17 00:00:00 2001 From: Heidi Fahim Date: Tue, 18 Feb 2020 14:19:16 -0800 Subject: [PATCH 059/400] kunit: run kunit_tool from any directory Implemented small fix so that the script changes work directories to the root of the linux kernel source tree from which kunit.py is run. This enables the user to run kunit from any working directory. Originally considered using os.path.join but this is more error prone as we would have to find all file path usages and modify them accordingly. Using os.chdir ensures that the entire script is run within /linux. Signed-off-by: Heidi Fahim Reviewed-by: Brendan Higgins Signed-off-by: Shuah Khan --- tools/testing/kunit/kunit.py | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py index e59eb9e7f923..180ad1e1b04f 100755 --- a/tools/testing/kunit/kunit.py +++ b/tools/testing/kunit/kunit.py @@ -24,6 +24,8 @@ KunitResult = namedtuple('KunitResult', ['status','result']) KunitRequest = namedtuple('KunitRequest', ['raw_output','timeout', 'jobs', 'build_dir', 'defconfig']) +KernelDirectoryPath = sys.argv[0].split('tools/testing/kunit/')[0] + class KunitStatus(Enum): SUCCESS = auto() CONFIG_FAILURE = auto() @@ -35,6 +37,13 @@ def create_default_kunitconfig(): shutil.copyfile('arch/um/configs/kunit_defconfig', kunit_kernel.kunitconfig_path) +def get_kernel_root_path(): + parts = sys.argv[0] if not __file__ else __file__ + parts = os.path.realpath(parts).split('tools/testing/kunit') + if len(parts) != 2: + sys.exit(1) + return parts[0] + def run_tests(linux: kunit_kernel.LinuxSourceTree, request: KunitRequest) -> KunitResult: config_start = time.time() @@ -114,6 +123,9 @@ def main(argv, linux=None): cli_args = parser.parse_args(argv) if cli_args.subcommand == 'run': + if get_kernel_root_path(): + os.chdir(get_kernel_root_path()) + if cli_args.build_dir: if not os.path.exists(cli_args.build_dir): os.mkdir(cli_args.build_dir) From ae99fb8baafc881b35aa0b79d7ac0178a7c40c89 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Sun, 16 Feb 2020 20:42:36 -0800 Subject: [PATCH 060/400] Documentation/admin-guide/acpi: fix fan_performance_states.rst warnings Fix Sphinx format warnings in fan_performace_states.rst by adding indentation. Documentation/admin-guide/acpi/fan_performance_states.rst:21: WARNING: Literal block ends without a blank line; unexpected unindent. Documentation/admin-guide/acpi/fan_performance_states.rst:41: WARNING: Literal block expected; none found. Signed-off-by: Randy Dunlap Signed-off-by: Rafael J. Wysocki --- Documentation/admin-guide/acpi/fan_performance_states.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/admin-guide/acpi/fan_performance_states.rst b/Documentation/admin-guide/acpi/fan_performance_states.rst index 21d233ca50d8..98fe5c333121 100644 --- a/Documentation/admin-guide/acpi/fan_performance_states.rst +++ b/Documentation/admin-guide/acpi/fan_performance_states.rst @@ -18,7 +18,7 @@ may look as follows:: $ ls -l /sys/bus/acpi/devices/INT3404:00/ total 0 -... + ... -r--r--r-- 1 root root 4096 Dec 13 20:38 state0 -r--r--r-- 1 root root 4096 Dec 13 20:38 state1 -r--r--r-- 1 root root 4096 Dec 13 20:38 state10 @@ -38,7 +38,7 @@ where each of the "state*" files represents one performance state of the fan and contains a colon-separated list of 5 integer numbers (fields) with the following interpretation:: -control_percent:trip_point_index:speed_rpm:noise_level_mdb:power_mw + control_percent:trip_point_index:speed_rpm:noise_level_mdb:power_mw * ``control_percent``: The percent value to be used to set the fan speed to a specific level using the _FSL object (0-100). From 14ba91c74782f4d470f4d7c7cf585e29d4761035 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jonathan=20Neusch=C3=A4fer?= Date: Tue, 18 Feb 2020 15:58:18 +0100 Subject: [PATCH 061/400] Documentation: power: Drop reference to interface.rst MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It has been merged into sleep-states.rst. Fixes: c21502efdaed ("Documentation: admin-guide: PM: Update sleep states documentation") Signed-off-by: Jonathan Neuschäfer Signed-off-by: Rafael J. Wysocki --- Documentation/power/index.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/Documentation/power/index.rst b/Documentation/power/index.rst index 002e42745263..ced8a8007434 100644 --- a/Documentation/power/index.rst +++ b/Documentation/power/index.rst @@ -13,7 +13,6 @@ Power Management drivers-testing energy-model freezing-of-tasks - interface opp pci pm_qos_interface From b0c609ab2057d0953fa05e7566f0c0e8a28fa9e1 Mon Sep 17 00:00:00 2001 From: Alexandre Belloni Date: Fri, 14 Feb 2020 15:06:21 +0100 Subject: [PATCH 062/400] PM / hibernate: fix typo "reserverd_size" -> "reserved_size" Fix a mistake in a variable name in a comment. Signed-off-by: Alexandre Belloni Signed-off-by: Rafael J. Wysocki --- kernel/power/snapshot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index ddade80ad276..d82b7b88d616 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -1681,7 +1681,7 @@ static unsigned long minimum_image_size(unsigned long saveable) * hibernation for allocations made while saving the image and for device * drivers, in case they need to allocate memory from their hibernation * callbacks (these two numbers are given by PAGES_FOR_IO (which is a rough - * estimate) and reserverd_size divided by PAGE_SIZE (which is tunable through + * estimate) and reserved_size divided by PAGE_SIZE (which is tunable through * /sys/power/reserved_size, respectively). To make this happen, we compute the * total number of available page frames and allocate at least * From 63d68382f5fb5f71772357e31841c19c4a133182 Mon Sep 17 00:00:00 2001 From: Pierre-Louis Bossart Date: Wed, 19 Feb 2020 16:21:30 -0600 Subject: [PATCH 063/400] ASoC: soc-core: fix for_rtd_codec_dai_rollback() macro The use of parentheses to protect the argument is fine for (i)++ but not for (--i). Fixes: 83f94a2e293d61 ("ASoC: soc-core: add snd_soc_close_delayed_work()") Signed-off-by: Pierre-Louis Bossart Acked-by: Kuninori Morimoto Cc: Kuninori Morimoto Link: https://lore.kernel.org/r/20200219222130.29933-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown --- include/sound/soc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/sound/soc.h b/include/sound/soc.h index f0e4f36f83bf..8a2266676b2d 100644 --- a/include/sound/soc.h +++ b/include/sound/soc.h @@ -1157,7 +1157,7 @@ struct snd_soc_pcm_runtime { ((i) < rtd->num_codecs) && ((dai) = rtd->codec_dais[i]); \ (i)++) #define for_each_rtd_codec_dai_rollback(rtd, i, dai) \ - for (; ((--i) >= 0) && ((dai) = rtd->codec_dais[i]);) + for (; (--(i) >= 0) && ((dai) = rtd->codec_dais[i]);) void snd_soc_close_delayed_work(struct snd_soc_pcm_runtime *rtd); From 68ca0fd272dac9a9f78fbf14b5e65de34f12c1b4 Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Thu, 6 Feb 2020 08:11:39 +0000 Subject: [PATCH 064/400] selftest/lkdtm: Don't pollute 'git status' Commit 46d1a0f03d66 ("selftests/lkdtm: Add tests for LKDTM targets") added generation of lkdtm test scripts. Ignore those generated scripts when performing 'git status' Fixes: 46d1a0f03d66 ("selftests/lkdtm: Add tests for LKDTM targets") Signed-off-by: Christophe Leroy Signed-off-by: Shuah Khan --- .gitignore | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/.gitignore b/.gitignore index 72ef86a5570d..2763fce8766c 100644 --- a/.gitignore +++ b/.gitignore @@ -100,6 +100,10 @@ modules.order /include/ksym/ /arch/*/include/generated/ +# Generated lkdtm tests +/tools/testing/selftests/lkdtm/*.sh +!/tools/testing/selftests/lkdtm/run.sh + # stgit generated dirs patches-* From b9167c8078c3527de6da241c8a1a75a9224ed90a Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Thu, 20 Feb 2020 15:42:41 +1100 Subject: [PATCH 065/400] selftests: Install settings files to fix TIMEOUT failures Commit 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test") added a 45 second timeout for tests, and also added a way for tests to customise the timeout via a settings file. For example the ftrace tests take multiple minutes to run, so they were given longer in commit b43e78f65b1d ("tracing/selftests: Turn off timeout setting"). This works when the tests are run from the source tree. However if the tests are installed with "make -C tools/testing/selftests install", the settings files are not copied into the install directory. When the tests are then run from the install directory the longer timeouts are not applied and the tests timeout incorrectly. So add the settings files to TEST_FILES of the appropriate Makefiles to cause the settings files to be installed using the existing install logic. Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test") Signed-off-by: Michael Ellerman Signed-off-by: Shuah Khan --- tools/testing/selftests/ftrace/Makefile | 2 +- tools/testing/selftests/livepatch/Makefile | 2 ++ tools/testing/selftests/net/mptcp/Makefile | 2 ++ tools/testing/selftests/rseq/Makefile | 2 ++ tools/testing/selftests/rtc/Makefile | 2 ++ 5 files changed, 9 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile index cd1f5b3a7774..d6e106fbce11 100644 --- a/tools/testing/selftests/ftrace/Makefile +++ b/tools/testing/selftests/ftrace/Makefile @@ -2,7 +2,7 @@ all: TEST_PROGS := ftracetest -TEST_FILES := test.d +TEST_FILES := test.d settings EXTRA_CLEAN := $(OUTPUT)/logs/* include ../lib.mk diff --git a/tools/testing/selftests/livepatch/Makefile b/tools/testing/selftests/livepatch/Makefile index 3876d8d62494..1acc9e1fa3fb 100644 --- a/tools/testing/selftests/livepatch/Makefile +++ b/tools/testing/selftests/livepatch/Makefile @@ -8,4 +8,6 @@ TEST_PROGS := \ test-state.sh \ test-ftrace.sh +TEST_FILES := settings + include ../lib.mk diff --git a/tools/testing/selftests/net/mptcp/Makefile b/tools/testing/selftests/net/mptcp/Makefile index 93de52016dde..ba450e62dc5b 100644 --- a/tools/testing/selftests/net/mptcp/Makefile +++ b/tools/testing/selftests/net/mptcp/Makefile @@ -8,6 +8,8 @@ TEST_PROGS := mptcp_connect.sh TEST_GEN_FILES = mptcp_connect +TEST_FILES := settings + EXTRA_CLEAN := *.pcap include ../../lib.mk diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile index d6469535630a..f1053630bb6f 100644 --- a/tools/testing/selftests/rseq/Makefile +++ b/tools/testing/selftests/rseq/Makefile @@ -19,6 +19,8 @@ TEST_GEN_PROGS_EXTENDED = librseq.so TEST_PROGS = run_param_test.sh +TEST_FILES := settings + include ../lib.mk $(OUTPUT)/librseq.so: rseq.c rseq.h rseq-*.h diff --git a/tools/testing/selftests/rtc/Makefile b/tools/testing/selftests/rtc/Makefile index 2d93d65723c9..55198ecc04db 100644 --- a/tools/testing/selftests/rtc/Makefile +++ b/tools/testing/selftests/rtc/Makefile @@ -6,4 +6,6 @@ TEST_GEN_PROGS = rtctest TEST_GEN_PROGS_EXTENDED = setdate +TEST_FILES := settings + include ../lib.mk From ef89d0545132d685f73da6f58b7e7fe002536f91 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Thu, 20 Feb 2020 22:37:48 +1100 Subject: [PATCH 066/400] selftests/rseq: Fix out-of-tree compilation Currently if you build with O=... the rseq tests don't build: $ make O=$PWD/output -C tools/testing/selftests/ TARGETS=rseq make: Entering directory '/linux/tools/testing/selftests' ... make[1]: Entering directory '/linux/tools/testing/selftests/rseq' gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ -shared -fPIC rseq.c -lpthread -o /linux/output/rseq/librseq.so gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ basic_test.c -lpthread -lrseq -o /linux/output/rseq/basic_test /usr/bin/ld: cannot find -lrseq collect2: error: ld returned 1 exit status This is because the library search path points to the source directory, not the output. We can fix it by changing the library search path to $(OUTPUT). Signed-off-by: Michael Ellerman Signed-off-by: Shuah Khan --- tools/testing/selftests/rseq/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile index f1053630bb6f..2af9d39a9716 100644 --- a/tools/testing/selftests/rseq/Makefile +++ b/tools/testing/selftests/rseq/Makefile @@ -4,7 +4,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),) CLANG_FLAGS += -no-integrated-as endif -CFLAGS += -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ \ +CFLAGS += -O2 -Wall -g -I./ -I../../../../usr/include/ -L$(OUTPUT) -Wl,-rpath=./ \ $(CLANG_FLAGS) LDLIBS += -lpthread From 6affca140cbea01f497c4f4e16f1e2be7f74bd04 Mon Sep 17 00:00:00 2001 From: Max Gurtovoy Date: Thu, 20 Feb 2020 12:08:18 +0200 Subject: [PATCH 067/400] RDMA/rw: Fix error flow during RDMA context initialization In case the SGL was mapped for P2P DMA operation, we must unmap it using pci_p2pdma_unmap_sg during the error unwind of rdma_rw_ctx_init() Fixes: 7f73eac3a713 ("PCI/P2PDMA: Introduce pci_p2pdma_unmap_sg()") Link: https://lore.kernel.org/r/20200220100819.41860-1-maxg@mellanox.com Signed-off-by: Max Gurtovoy Reviewed-by: Leon Romanovsky Reviewed-by: Logan Gunthorpe Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/rw.c | 31 ++++++++++++++++++++----------- 1 file changed, 20 insertions(+), 11 deletions(-) diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c index 4fad732f9b3c..06e5b6787443 100644 --- a/drivers/infiniband/core/rw.c +++ b/drivers/infiniband/core/rw.c @@ -273,6 +273,23 @@ static int rdma_rw_init_single_wr(struct rdma_rw_ctx *ctx, struct ib_qp *qp, return 1; } +static void rdma_rw_unmap_sg(struct ib_device *dev, struct scatterlist *sg, + u32 sg_cnt, enum dma_data_direction dir) +{ + if (is_pci_p2pdma_page(sg_page(sg))) + pci_p2pdma_unmap_sg(dev->dma_device, sg, sg_cnt, dir); + else + ib_dma_unmap_sg(dev, sg, sg_cnt, dir); +} + +static int rdma_rw_map_sg(struct ib_device *dev, struct scatterlist *sg, + u32 sg_cnt, enum dma_data_direction dir) +{ + if (is_pci_p2pdma_page(sg_page(sg))) + return pci_p2pdma_map_sg(dev->dma_device, sg, sg_cnt, dir); + return ib_dma_map_sg(dev, sg, sg_cnt, dir); +} + /** * rdma_rw_ctx_init - initialize a RDMA READ/WRITE context * @ctx: context to initialize @@ -295,11 +312,7 @@ int rdma_rw_ctx_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u8 port_num, struct ib_device *dev = qp->pd->device; int ret; - if (is_pci_p2pdma_page(sg_page(sg))) - ret = pci_p2pdma_map_sg(dev->dma_device, sg, sg_cnt, dir); - else - ret = ib_dma_map_sg(dev, sg, sg_cnt, dir); - + ret = rdma_rw_map_sg(dev, sg, sg_cnt, dir); if (!ret) return -ENOMEM; sg_cnt = ret; @@ -338,7 +351,7 @@ int rdma_rw_ctx_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u8 port_num, return ret; out_unmap_sg: - ib_dma_unmap_sg(dev, sg, sg_cnt, dir); + rdma_rw_unmap_sg(dev, sg, sg_cnt, dir); return ret; } EXPORT_SYMBOL(rdma_rw_ctx_init); @@ -588,11 +601,7 @@ void rdma_rw_ctx_destroy(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u8 port_num, break; } - if (is_pci_p2pdma_page(sg_page(sg))) - pci_p2pdma_unmap_sg(qp->pd->device->dma_device, sg, - sg_cnt, dir); - else - ib_dma_unmap_sg(qp->pd->device, sg, sg_cnt, dir); + rdma_rw_unmap_sg(qp->pd->device, sg, sg_cnt, dir); } EXPORT_SYMBOL(rdma_rw_ctx_destroy); From 279eef0531928a6669230879a6eed081513ad5a3 Mon Sep 17 00:00:00 2001 From: Tom Zanussi Date: Fri, 14 Feb 2020 16:56:38 -0600 Subject: [PATCH 068/400] tracing: Make sure synth_event_trace() example always uses u64 synth_event_trace() is the varargs version of synth_event_trace_array(), which takes an array of u64, as do synth_event_add_val() et al. To not only be consistent with those, but also to address the fact that synth_event_trace() expects every arg to be of the same type since it doesn't also pass in e.g. a format string, the caller needs to make sure all args are of the same type, u64. u64 is used because it needs to accomodate the largest type available in synthetic events, which is u64. This fixes the bug reported by the kernel test robot/Rong Chen. Link: https://lore.kernel.org/lkml/20200212113444.GS12867@shao2-debian/ Link: http://lkml.kernel.org/r/894c4e955558b521210ee0642ba194a9e603354c.1581720155.git.zanussi@kernel.org Fixes: 9fe41efaca084 ("tracing: Add synth event generation test module") Reported-by: kernel test robot Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/synth_event_gen_test.c | 34 ++++++++++++++--------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/kernel/trace/synth_event_gen_test.c b/kernel/trace/synth_event_gen_test.c index 4aefe003cb7c..6866280a9b10 100644 --- a/kernel/trace/synth_event_gen_test.c +++ b/kernel/trace/synth_event_gen_test.c @@ -111,11 +111,11 @@ static int __init test_gen_synth_cmd(void) /* Create some bogus values just for testing */ vals[0] = 777; /* next_pid_field */ - vals[1] = (u64)"hula hoops"; /* next_comm_field */ + vals[1] = (u64)(long)"hula hoops"; /* next_comm_field */ vals[2] = 1000000; /* ts_ns */ vals[3] = 1000; /* ts_ms */ vals[4] = smp_processor_id(); /* cpu */ - vals[5] = (u64)"thneed"; /* my_string_field */ + vals[5] = (u64)(long)"thneed"; /* my_string_field */ vals[6] = 598; /* my_int_field */ /* Now generate a gen_synth_test event */ @@ -218,11 +218,11 @@ static int __init test_empty_synth_event(void) /* Create some bogus values just for testing */ vals[0] = 777; /* next_pid_field */ - vals[1] = (u64)"tiddlywinks"; /* next_comm_field */ + vals[1] = (u64)(long)"tiddlywinks"; /* next_comm_field */ vals[2] = 1000000; /* ts_ns */ vals[3] = 1000; /* ts_ms */ vals[4] = smp_processor_id(); /* cpu */ - vals[5] = (u64)"thneed_2.0"; /* my_string_field */ + vals[5] = (u64)(long)"thneed_2.0"; /* my_string_field */ vals[6] = 399; /* my_int_field */ /* Now trace an empty_synth_test event */ @@ -290,11 +290,11 @@ static int __init test_create_synth_event(void) /* Create some bogus values just for testing */ vals[0] = 777; /* next_pid_field */ - vals[1] = (u64)"tiddlywinks"; /* next_comm_field */ + vals[1] = (u64)(long)"tiddlywinks"; /* next_comm_field */ vals[2] = 1000000; /* ts_ns */ vals[3] = 1000; /* ts_ms */ vals[4] = smp_processor_id(); /* cpu */ - vals[5] = (u64)"thneed"; /* my_string_field */ + vals[5] = (u64)(long)"thneed"; /* my_string_field */ vals[6] = 398; /* my_int_field */ /* Now generate a create_synth_test event */ @@ -330,7 +330,7 @@ static int __init test_add_next_synth_val(void) goto out; /* next_comm_field */ - ret = synth_event_add_next_val((u64)"slinky", &trace_state); + ret = synth_event_add_next_val((u64)(long)"slinky", &trace_state); if (ret) goto out; @@ -350,7 +350,7 @@ static int __init test_add_next_synth_val(void) goto out; /* my_string_field */ - ret = synth_event_add_next_val((u64)"thneed_2.01", &trace_state); + ret = synth_event_add_next_val((u64)(long)"thneed_2.01", &trace_state); if (ret) goto out; @@ -396,12 +396,12 @@ static int __init test_add_synth_val(void) if (ret) goto out; - ret = synth_event_add_val("next_comm_field", (u64)"silly putty", + ret = synth_event_add_val("next_comm_field", (u64)(long)"silly putty", &trace_state); if (ret) goto out; - ret = synth_event_add_val("my_string_field", (u64)"thneed_9", + ret = synth_event_add_val("my_string_field", (u64)(long)"thneed_9", &trace_state); if (ret) goto out; @@ -423,13 +423,13 @@ static int __init test_trace_synth_event(void) /* Trace some bogus values just for testing */ ret = synth_event_trace(create_synth_test, 7, /* number of values */ - 444, /* next_pid_field */ - (u64)"clackers", /* next_comm_field */ - 1000000, /* ts_ns */ - 1000, /* ts_ms */ - smp_processor_id(), /* cpu */ - (u64)"Thneed", /* my_string_field */ - 999); /* my_int_field */ + (u64)444, /* next_pid_field */ + (u64)(long)"clackers", /* next_comm_field */ + (u64)1000000, /* ts_ns */ + (u64)1000, /* ts_ms */ + (u64)smp_processor_id(),/* cpu */ + (u64)(long)"Thneed", /* my_string_field */ + (u64)999); /* my_int_field */ return ret; } From 1d9d4c90194a8c3b2f7da9f4bf3f8ba2ed810656 Mon Sep 17 00:00:00 2001 From: Tom Zanussi Date: Fri, 14 Feb 2020 16:56:39 -0600 Subject: [PATCH 069/400] tracing: Make synth_event trace functions endian-correct synth_event_trace(), synth_event_trace_array() and __synth_event_add_val() write directly into the trace buffer and need to take endianness into account, like trace_event_raw_event_synth() does. Link: http://lkml.kernel.org/r/2011354355e405af9c9d28abba430d1f5ff7771a.1581720155.git.zanussi@kernel.org Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace_events_hist.c | 62 +++++++++++++++++++++++++++++--- 1 file changed, 58 insertions(+), 4 deletions(-) diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index 65b54d6a1422..6a380fb83864 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -1891,7 +1891,25 @@ int synth_event_trace(struct trace_event_file *file, unsigned int n_vals, ...) strscpy(str_field, str_val, STR_VAR_LEN_MAX); n_u64 += STR_VAR_LEN_MAX / sizeof(u64); } else { - state.entry->fields[n_u64] = val; + struct synth_field *field = state.event->fields[i]; + + switch (field->size) { + case 1: + *(u8 *)&state.entry->fields[n_u64] = (u8)val; + break; + + case 2: + *(u16 *)&state.entry->fields[n_u64] = (u16)val; + break; + + case 4: + *(u32 *)&state.entry->fields[n_u64] = (u32)val; + break; + + default: + state.entry->fields[n_u64] = val; + break; + } n_u64++; } } @@ -1943,7 +1961,26 @@ int synth_event_trace_array(struct trace_event_file *file, u64 *vals, strscpy(str_field, str_val, STR_VAR_LEN_MAX); n_u64 += STR_VAR_LEN_MAX / sizeof(u64); } else { - state.entry->fields[n_u64] = vals[i]; + struct synth_field *field = state.event->fields[i]; + u64 val = vals[i]; + + switch (field->size) { + case 1: + *(u8 *)&state.entry->fields[n_u64] = (u8)val; + break; + + case 2: + *(u16 *)&state.entry->fields[n_u64] = (u16)val; + break; + + case 4: + *(u32 *)&state.entry->fields[n_u64] = (u32)val; + break; + + default: + state.entry->fields[n_u64] = val; + break; + } n_u64++; } } @@ -2062,8 +2099,25 @@ static int __synth_event_add_val(const char *field_name, u64 val, str_field = (char *)&entry->fields[field->offset]; strscpy(str_field, str_val, STR_VAR_LEN_MAX); - } else - entry->fields[field->offset] = val; + } else { + switch (field->size) { + case 1: + *(u8 *)&trace_state->entry->fields[field->offset] = (u8)val; + break; + + case 2: + *(u16 *)&trace_state->entry->fields[field->offset] = (u16)val; + break; + + case 4: + *(u32 *)&trace_state->entry->fields[field->offset] = (u32)val; + break; + + default: + trace_state->entry->fields[field->offset] = val; + break; + } + } out: return ret; } From 3843083772dc2afde790a6d7160658b00a808da1 Mon Sep 17 00:00:00 2001 From: Tom Zanussi Date: Fri, 14 Feb 2020 16:56:40 -0600 Subject: [PATCH 070/400] tracing: Check that number of vals matches number of synth event fields Commit 7276531d4036('tracing: Consolidate trace() functions') inadvertently dropped the synth_event_trace() and synth_event_trace_array() checks that verify the number of values passed in matches the number of fields in the synthetic event being traced, so add them back. Link: http://lkml.kernel.org/r/32819cac708714693669e0dfe10fe9d935e94a16.1581720155.git.zanussi@kernel.org Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace_events_hist.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index 6a380fb83864..45622194a34d 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -1878,6 +1878,11 @@ int synth_event_trace(struct trace_event_file *file, unsigned int n_vals, ...) return ret; } + if (n_vals != state.event->n_fields) { + ret = -EINVAL; + goto out; + } + va_start(args, n_vals); for (i = 0, n_u64 = 0; i < state.event->n_fields; i++) { u64 val; @@ -1914,7 +1919,7 @@ int synth_event_trace(struct trace_event_file *file, unsigned int n_vals, ...) } } va_end(args); - +out: __synth_event_trace_end(&state); return ret; @@ -1953,6 +1958,11 @@ int synth_event_trace_array(struct trace_event_file *file, u64 *vals, return ret; } + if (n_vals != state.event->n_fields) { + ret = -EINVAL; + goto out; + } + for (i = 0, n_u64 = 0; i < state.event->n_fields; i++) { if (state.event->fields[i]->is_string) { char *str_val = (char *)(long)vals[i]; @@ -1984,7 +1994,7 @@ int synth_event_trace_array(struct trace_event_file *file, u64 *vals, n_u64++; } } - +out: __synth_event_trace_end(&state); return ret; From 784bd0847eda032ed2f3522f87250655a18c0190 Mon Sep 17 00:00:00 2001 From: Tom Zanussi Date: Fri, 14 Feb 2020 16:56:41 -0600 Subject: [PATCH 071/400] tracing: Fix number printing bug in print_synth_event() Fix a varargs-related bug in print_synth_event() which resulted in strange output and oopses on 32-bit x86 systems. The problem is that trace_seq_printf() expects the varargs to match the format string, but print_synth_event() was always passing u64 values regardless. This results in unspecified behavior when unpacking with va_arg() in trace_seq_printf(). Add a function that takes the size into account when calling trace_seq_printf(). Before: modprobe-1731 [003] .... 919.039758: gen_synth_test: next_pid_field=777(null)next_comm_field=hula hoops ts_ns=1000000 ts_ms=1000 cpu=3(null)my_string_field=thneed my_int_field=598(null) After: insmod-1136 [001] .... 36.634590: gen_synth_test: next_pid_field=777 next_comm_field=hula hoops ts_ns=1000000 ts_ms=1000 cpu=1 my_string_field=thneed my_int_field=598 Link: http://lkml.kernel.org/r/a9b59eb515dbbd7d4abe53b347dccf7a8e285657.1581720155.git.zanussi@kernel.org Reported-by: Steven Rostedt (VMware) Signed-off-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace_events_hist.c | 32 +++++++++++++++++++++++++++++--- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index 45622194a34d..f068d55bd37f 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -820,6 +820,29 @@ static const char *synth_field_fmt(char *type) return fmt; } +static void print_synth_event_num_val(struct trace_seq *s, + char *print_fmt, char *name, + int size, u64 val, char *space) +{ + switch (size) { + case 1: + trace_seq_printf(s, print_fmt, name, (u8)val, space); + break; + + case 2: + trace_seq_printf(s, print_fmt, name, (u16)val, space); + break; + + case 4: + trace_seq_printf(s, print_fmt, name, (u32)val, space); + break; + + default: + trace_seq_printf(s, print_fmt, name, val, space); + break; + } +} + static enum print_line_t print_synth_event(struct trace_iterator *iter, int flags, struct trace_event *event) @@ -858,10 +881,13 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter, } else { struct trace_print_flags __flags[] = { __def_gfpflag_names, {-1, NULL} }; + char *space = (i == se->n_fields - 1 ? "" : " "); - trace_seq_printf(s, print_fmt, se->fields[i]->name, - entry->fields[n_u64], - i == se->n_fields - 1 ? "" : " "); + print_synth_event_num_val(s, print_fmt, + se->fields[i]->name, + se->fields[i]->size, + entry->fields[n_u64], + space); if (strcmp(se->fields[i]->type, "gfp_t") == 0) { trace_seq_puts(s, " ("); From ac0a68997935c4acb92eaae5ad8982e0bb432d56 Mon Sep 17 00:00:00 2001 From: Matthias Reichl Date: Thu, 20 Feb 2020 21:29:56 +0100 Subject: [PATCH 072/400] ASoC: pcm512x: Fix unbalanced regulator enable call in probe error path When we get a clock error during probe we have to call regulator_bulk_disable before bailing out, otherwise we trigger a warning in regulator_put. Fix this by using "goto err" like in the error cases above. Fixes: 5a3af1293194d ("ASoC: pcm512x: Add PCM512x driver") Signed-off-by: Matthias Reichl Reviewed-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20200220202956.29233-1-hias@horus.com Signed-off-by: Mark Brown --- sound/soc/codecs/pcm512x.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/sound/soc/codecs/pcm512x.c b/sound/soc/codecs/pcm512x.c index 861210f6bf4f..4cbef9affffd 100644 --- a/sound/soc/codecs/pcm512x.c +++ b/sound/soc/codecs/pcm512x.c @@ -1564,13 +1564,15 @@ int pcm512x_probe(struct device *dev, struct regmap *regmap) } pcm512x->sclk = devm_clk_get(dev, NULL); - if (PTR_ERR(pcm512x->sclk) == -EPROBE_DEFER) - return -EPROBE_DEFER; + if (PTR_ERR(pcm512x->sclk) == -EPROBE_DEFER) { + ret = -EPROBE_DEFER; + goto err; + } if (!IS_ERR(pcm512x->sclk)) { ret = clk_prepare_enable(pcm512x->sclk); if (ret != 0) { dev_err(dev, "Failed to enable SCLK: %d\n", ret); - return ret; + goto err; } } From 3c18a9be7c9d4f53239795282c5d927f73f534b3 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" Date: Thu, 20 Feb 2020 16:29:50 -0500 Subject: [PATCH 073/400] tracing: Have synthetic event test use raw_smp_processor_id() The test code that tests synthetic event creation pushes in as one of its test fields the current CPU using "smp_processor_id()". As this is just something to see if the value is correctly passed in, and the actual CPU used does not matter, use raw_smp_processor_id(), otherwise with debug preemption enabled, a warning happens as the smp_processor_id() is called without preemption enabled. Link: http://lkml.kernel.org/r/20200220162950.35162579@gandalf.local.home Reviewed-by: Tom Zanussi Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/synth_event_gen_test.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/kernel/trace/synth_event_gen_test.c b/kernel/trace/synth_event_gen_test.c index 6866280a9b10..7d56d621ffea 100644 --- a/kernel/trace/synth_event_gen_test.c +++ b/kernel/trace/synth_event_gen_test.c @@ -114,7 +114,7 @@ static int __init test_gen_synth_cmd(void) vals[1] = (u64)(long)"hula hoops"; /* next_comm_field */ vals[2] = 1000000; /* ts_ns */ vals[3] = 1000; /* ts_ms */ - vals[4] = smp_processor_id(); /* cpu */ + vals[4] = raw_smp_processor_id(); /* cpu */ vals[5] = (u64)(long)"thneed"; /* my_string_field */ vals[6] = 598; /* my_int_field */ @@ -221,7 +221,7 @@ static int __init test_empty_synth_event(void) vals[1] = (u64)(long)"tiddlywinks"; /* next_comm_field */ vals[2] = 1000000; /* ts_ns */ vals[3] = 1000; /* ts_ms */ - vals[4] = smp_processor_id(); /* cpu */ + vals[4] = raw_smp_processor_id(); /* cpu */ vals[5] = (u64)(long)"thneed_2.0"; /* my_string_field */ vals[6] = 399; /* my_int_field */ @@ -293,7 +293,7 @@ static int __init test_create_synth_event(void) vals[1] = (u64)(long)"tiddlywinks"; /* next_comm_field */ vals[2] = 1000000; /* ts_ns */ vals[3] = 1000; /* ts_ms */ - vals[4] = smp_processor_id(); /* cpu */ + vals[4] = raw_smp_processor_id(); /* cpu */ vals[5] = (u64)(long)"thneed"; /* my_string_field */ vals[6] = 398; /* my_int_field */ @@ -345,7 +345,7 @@ static int __init test_add_next_synth_val(void) goto out; /* cpu */ - ret = synth_event_add_next_val(smp_processor_id(), &trace_state); + ret = synth_event_add_next_val(raw_smp_processor_id(), &trace_state); if (ret) goto out; @@ -388,7 +388,7 @@ static int __init test_add_synth_val(void) if (ret) goto out; - ret = synth_event_add_val("cpu", smp_processor_id(), &trace_state); + ret = synth_event_add_val("cpu", raw_smp_processor_id(), &trace_state); if (ret) goto out; @@ -427,7 +427,7 @@ static int __init test_trace_synth_event(void) (u64)(long)"clackers", /* next_comm_field */ (u64)1000000, /* ts_ns */ (u64)1000, /* ts_ms */ - (u64)smp_processor_id(),/* cpu */ + (u64)raw_smp_processor_id(), /* cpu */ (u64)(long)"Thneed", /* my_string_field */ (u64)999); /* my_int_field */ return ret; From 78041c0c9e935d9ce4086feeff6c569ed88ddfd4 Mon Sep 17 00:00:00 2001 From: "Steven Rostedt (VMware)" Date: Thu, 20 Feb 2020 15:38:01 -0500 Subject: [PATCH 074/400] tracing: Disable trace_printk() on post poned tests The tracing seftests checks various aspects of the tracing infrastructure, and one is filtering. If trace_printk() is active during a self test, it can cause the filtering to fail, which will disable that part of the trace. To keep the selftests from failing because of trace_printk() calls, trace_printk() checks the variable tracing_selftest_running, and if set, it does not write to the tracing buffer. As some tracers were registered earlier in boot, the selftest they triggered would fail because not all the infrastructure was set up for the full selftest. Thus, some of the tests were post poned to when their infrastructure was ready (namely file system code). The postpone code did not set the tracing_seftest_running variable, and could fail if a trace_printk() was added and executed during their run. Cc: stable@vger.kernel.org Fixes: 9afecfbb95198 ("tracing: Postpone tracer start-up tests till the system is more robust") Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 183b031a3828..a89c562ffb8f 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1837,6 +1837,7 @@ static __init int init_trace_selftests(void) pr_info("Running postponed tracer tests:\n"); + tracing_selftest_running = true; list_for_each_entry_safe(p, n, &postponed_selftests, list) { /* This loop can take minutes when sanitizers are enabled, so * lets make sure we allow RCU processing. @@ -1859,6 +1860,7 @@ static __init int init_trace_selftests(void) list_del(&p->list); kfree(p); } + tracing_selftest_running = false; out: mutex_unlock(&trace_types_lock); From 08d9e686426f7557d3f1cda219ff907397c89d53 Mon Sep 17 00:00:00 2001 From: Qiujun Huang Date: Sun, 16 Feb 2020 19:28:31 +0800 Subject: [PATCH 075/400] bootconfig: Mark boot_config_checksum() static In fact, this function is only used in this file, so mark it with 'static'. Link: http://lkml.kernel.org/r/1581852511-14163-1-git-send-email-hqjagain@gmail.com Acked-by: Masami Hiramatsu Signed-off-by: Qiujun Huang Signed-off-by: Steven Rostedt (VMware) --- init/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/init/main.c b/init/main.c index 59248717c925..48c87f47a444 100644 --- a/init/main.c +++ b/init/main.c @@ -335,7 +335,7 @@ static char * __init xbc_make_cmdline(const char *key) return new_cmdline; } -u32 boot_config_checksum(unsigned char *p, u32 size) +static u32 boot_config_checksum(unsigned char *p, u32 size) { u32 ret = 0; From 7ab215f22d04067094de8c81c20ba4c565ff8dd4 Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Mon, 17 Feb 2020 18:52:39 +0900 Subject: [PATCH 076/400] tracing: Clear trace_state when starting trace Clear trace_state data structure when starting trace in __synth_event_trace_start() internal function. Currently trace_state is initialized only in the synth_event_trace_start() API, but the trace_state in synth_event_trace() and synth_event_trace_array() are on the stack without initialization. This means those APIs will see wrong parameters and wil skip closing process in __synth_event_trace_end() because trace_state->disabled may be !0. Link: http://lkml.kernel.org/r/158193315899.8868.1781259176894639952.stgit@devnote2 Reviewed-by: Tom Zanussi Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace_events_hist.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index f068d55bd37f..9d87aa1f0b79 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -1824,6 +1824,8 @@ __synth_event_trace_start(struct trace_event_file *file, int entry_size, fields_size = 0; int ret = 0; + memset(trace_state, '\0', sizeof(*trace_state)); + /* * Normal event tracing doesn't get called at all unless the * ENABLED bit is set (which attaches the probe thus allowing @@ -2063,8 +2065,6 @@ int synth_event_trace_start(struct trace_event_file *file, if (!trace_state) return -EINVAL; - memset(trace_state, '\0', sizeof(*trace_state)); - ret = __synth_event_trace_start(file, trace_state); if (ret == -ENOENT) ret = 0; /* just disabled, not really an error */ From d8a953ddde5ec30a36810d0a892c3949b50849e9 Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Thu, 20 Feb 2020 21:18:33 +0900 Subject: [PATCH 077/400] bootconfig: Set CONFIG_BOOT_CONFIG=n by default Set CONFIG_BOOT_CONFIG=n by default. This also warns user if CONFIG_BOOT_CONFIG=n but "bootconfig" is given in the kernel command line. Link: http://lkml.kernel.org/r/158220111291.26565.9036889083940367969.stgit@devnote2 Suggested-by: Steven Rostedt Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- init/Kconfig | 1 - init/main.c | 8 ++++++++ kernel/trace/Kconfig | 3 ++- 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/init/Kconfig b/init/Kconfig index 4a672c6629d0..f586878410d2 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1218,7 +1218,6 @@ endif config BOOT_CONFIG bool "Boot config support" depends on BLK_DEV_INITRD - default y help Extra boot config allows system admin to pass a config file as complemental extension of kernel cmdline when booting. diff --git a/init/main.c b/init/main.c index 48c87f47a444..d96cc5f65022 100644 --- a/init/main.c +++ b/init/main.c @@ -418,6 +418,14 @@ not_found: } #else #define setup_boot_config(cmdline) do { } while (0) + +static int __init warn_bootconfig(char *str) +{ + pr_warn("WARNING: 'bootconfig' found on the kernel command line but CONFIG_BOOTCONFIG is not set.\n"); + return 0; +} +early_param("bootconfig", warn_bootconfig); + #endif /* Change NUL term back to "=", to make "param" the whole string. */ diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 91e885194dbc..795c3e02d3f1 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -143,7 +143,8 @@ if FTRACE config BOOTTIME_TRACING bool "Boot-time Tracing support" - depends on BOOT_CONFIG && TRACING + depends on TRACING + select BOOT_CONFIG default y help Enable developer to setup ftrace subsystem via supplemental From 85c46b78da58398be1c5166f55063c0512decd39 Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Thu, 20 Feb 2020 21:18:42 +0900 Subject: [PATCH 078/400] bootconfig: Add bootconfig magic word for indicating bootconfig explicitly Add bootconfig magic word to the end of bootconfig on initrd image for indicating explicitly the bootconfig is there. Also tools/bootconfig treats wrong size or wrong checksum or parse error as an error, because if there is a bootconfig magic word, there must be a bootconfig. The bootconfig magic word is "#BOOTCONFIG\n", 12 bytes word. Thus the block image of the initrd file with bootconfig is as follows. [Initrd][bootconfig][size][csum][#BOOTCONFIG\n] Link: http://lkml.kernel.org/r/158220112263.26565.3944814205960612841.stgit@devnote2 Suggested-by: Steven Rostedt Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- Documentation/admin-guide/bootconfig.rst | 10 ++++-- include/linux/bootconfig.h | 3 ++ init/Kconfig | 2 +- init/main.c | 6 +++- tools/bootconfig/main.c | 43 ++++++++++++++++++------ tools/bootconfig/test-bootconfig.sh | 2 +- 6 files changed, 49 insertions(+), 17 deletions(-) diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst index b342a6796392..5e7609936507 100644 --- a/Documentation/admin-guide/bootconfig.rst +++ b/Documentation/admin-guide/bootconfig.rst @@ -102,9 +102,13 @@ Boot Kernel With a Boot Config ============================== Since the boot configuration file is loaded with initrd, it will be added -to the end of the initrd (initramfs) image file. The Linux kernel decodes -the last part of the initrd image in memory to get the boot configuration -data. +to the end of the initrd (initramfs) image file with size, checksum and +12-byte magic word as below. + +[initrd][bootconfig][size(u32)][checksum(u32)][#BOOTCONFIG\n] + +The Linux kernel decodes the last part of the initrd image in memory to +get the boot configuration data. Because of this "piggyback" method, there is no need to change or update the boot loader and the kernel image itself. diff --git a/include/linux/bootconfig.h b/include/linux/bootconfig.h index 7e18c939663e..d11e183fcb54 100644 --- a/include/linux/bootconfig.h +++ b/include/linux/bootconfig.h @@ -10,6 +10,9 @@ #include #include +#define BOOTCONFIG_MAGIC "#BOOTCONFIG\n" +#define BOOTCONFIG_MAGIC_LEN 12 + /* XBC tree node */ struct xbc_node { u16 next; diff --git a/init/Kconfig b/init/Kconfig index f586878410d2..a84e7aa89a29 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1222,7 +1222,7 @@ config BOOT_CONFIG Extra boot config allows system admin to pass a config file as complemental extension of kernel cmdline when booting. The boot config file must be attached at the end of initramfs - with checksum and size. + with checksum, size and magic word. See for details. If unsure, say Y. diff --git a/init/main.c b/init/main.c index d96cc5f65022..2fe8dec93e68 100644 --- a/init/main.c +++ b/init/main.c @@ -374,7 +374,11 @@ static void __init setup_boot_config(const char *cmdline) if (!initrd_end) goto not_found; - hdr = (u32 *)(initrd_end - 8); + data = (char *)initrd_end - BOOTCONFIG_MAGIC_LEN; + if (memcmp(data, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN)) + goto not_found; + + hdr = (u32 *)(data - 8); size = hdr[0]; csum = hdr[1]; diff --git a/tools/bootconfig/main.c b/tools/bootconfig/main.c index e18eeb070562..742271f019a9 100644 --- a/tools/bootconfig/main.c +++ b/tools/bootconfig/main.c @@ -131,15 +131,26 @@ int load_xbc_from_initrd(int fd, char **buf) struct stat stat; int ret; u32 size = 0, csum = 0, rcsum; + char magic[BOOTCONFIG_MAGIC_LEN]; ret = fstat(fd, &stat); if (ret < 0) return -errno; - if (stat.st_size < 8) + if (stat.st_size < 8 + BOOTCONFIG_MAGIC_LEN) return 0; - if (lseek(fd, -8, SEEK_END) < 0) { + if (lseek(fd, -BOOTCONFIG_MAGIC_LEN, SEEK_END) < 0) { + pr_err("Failed to lseek: %d\n", -errno); + return -errno; + } + if (read(fd, magic, BOOTCONFIG_MAGIC_LEN) < 0) + return -errno; + /* Check the bootconfig magic bytes */ + if (memcmp(magic, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN) != 0) + return 0; + + if (lseek(fd, -(8 + BOOTCONFIG_MAGIC_LEN), SEEK_END) < 0) { pr_err("Failed to lseek: %d\n", -errno); return -errno; } @@ -150,11 +161,14 @@ int load_xbc_from_initrd(int fd, char **buf) if (read(fd, &csum, sizeof(u32)) < 0) return -errno; - /* Wrong size, maybe no boot config here */ - if (stat.st_size < size + 8) - return 0; + /* Wrong size error */ + if (stat.st_size < size + 8 + BOOTCONFIG_MAGIC_LEN) { + pr_err("bootconfig size is too big\n"); + return -E2BIG; + } - if (lseek(fd, stat.st_size - 8 - size, SEEK_SET) < 0) { + if (lseek(fd, stat.st_size - (size + 8 + BOOTCONFIG_MAGIC_LEN), + SEEK_SET) < 0) { pr_err("Failed to lseek: %d\n", -errno); return -errno; } @@ -163,17 +177,17 @@ int load_xbc_from_initrd(int fd, char **buf) if (ret < 0) return ret; - /* Wrong Checksum, maybe no boot config here */ + /* Wrong Checksum */ rcsum = checksum((unsigned char *)*buf, size); if (csum != rcsum) { pr_err("checksum error: %d != %d\n", csum, rcsum); - return 0; + return -EINVAL; } ret = xbc_init(*buf); - /* Wrong data, maybe no boot config here */ + /* Wrong data */ if (ret < 0) - return 0; + return ret; return size; } @@ -226,7 +240,8 @@ int delete_xbc(const char *path) } else if (size > 0) { ret = fstat(fd, &stat); if (!ret) - ret = ftruncate(fd, stat.st_size - size - 8); + ret = ftruncate(fd, stat.st_size + - size - 8 - BOOTCONFIG_MAGIC_LEN); if (ret) ret = -errno; } /* Ignore if there is no boot config in initrd */ @@ -295,6 +310,12 @@ int apply_xbc(const char *path, const char *xbc_path) pr_err("Failed to apply a boot config: %d\n", ret); return ret; } + /* Write a magic word of the bootconfig */ + ret = write(fd, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN); + if (ret < 0) { + pr_err("Failed to apply a boot config magic: %d\n", ret); + return ret; + } close(fd); free(data); diff --git a/tools/bootconfig/test-bootconfig.sh b/tools/bootconfig/test-bootconfig.sh index 1de06de328e2..adafb7c50940 100755 --- a/tools/bootconfig/test-bootconfig.sh +++ b/tools/bootconfig/test-bootconfig.sh @@ -49,7 +49,7 @@ xpass $BOOTCONF -a $TEMPCONF $INITRD new_size=$(stat -c %s $INITRD) echo "File size check" -xpass test $new_size -eq $(expr $bconf_size + $initrd_size + 9) +xpass test $new_size -eq $(expr $bconf_size + $initrd_size + 9 + 12) echo "Apply command repeat test" xpass $BOOTCONF -a $TEMPCONF $INITRD From 15e95037b45f24f9ab6d4f0bd101d4df0be24c1d Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Thu, 20 Feb 2020 21:18:52 +0900 Subject: [PATCH 079/400] tools/bootconfig: Remove unneeded error message silencer Remove error message silent knob, we don't need it anymore because we can check if there is a bootconfig by checking the magic word. If there is a magic word, but failed to load a bootconfig from initrd, there is a real problem. Link: http://lkml.kernel.org/r/158220113256.26565.14264598654427773104.stgit@devnote2 Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- tools/bootconfig/include/linux/printk.h | 5 +---- tools/bootconfig/main.c | 8 -------- 2 files changed, 1 insertion(+), 12 deletions(-) diff --git a/tools/bootconfig/include/linux/printk.h b/tools/bootconfig/include/linux/printk.h index e978a63d3222..036e667596eb 100644 --- a/tools/bootconfig/include/linux/printk.h +++ b/tools/bootconfig/include/linux/printk.h @@ -4,10 +4,7 @@ #include -/* controllable printf */ -extern int pr_output; -#define printk(fmt, ...) \ - (pr_output ? printf(fmt, ##__VA_ARGS__) : 0) +#define printk(fmt, ...) printf(fmt, ##__VA_ARGS__) #define pr_err printk #define pr_warn printk diff --git a/tools/bootconfig/main.c b/tools/bootconfig/main.c index 742271f019a9..a9b97814d1a9 100644 --- a/tools/bootconfig/main.c +++ b/tools/bootconfig/main.c @@ -14,8 +14,6 @@ #include #include -int pr_output = 1; - static int xbc_show_array(struct xbc_node *node) { const char *val; @@ -227,13 +225,7 @@ int delete_xbc(const char *path) return -errno; } - /* - * Suppress error messages in xbc_init() because it can be just a - * data which concidentally matches the size and checksum footer. - */ - pr_output = 0; size = load_xbc_from_initrd(fd, &buf); - pr_output = 1; if (size < 0) { ret = size; pr_err("Failed to load a boot config from initrd: %d\n", ret); From a24d286f36104ed45108a5a36f3868938434772f Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Thu, 20 Feb 2020 21:19:12 +0900 Subject: [PATCH 080/400] bootconfig: Reject subkey and value on same parent key Reject if a value node is mixed with subkey node on same parent key node. A value node can not co-exist with subkey node under some key node, e.g. key = value key.subkey = another-value This is not be allowed because bootconfig API is not designed to handle such case. Link: http://lkml.kernel.org/r/158220115232.26565.7792340045009731803.stgit@devnote2 Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- Documentation/admin-guide/bootconfig.rst | 7 +++++++ lib/bootconfig.c | 16 ++++++++++++---- tools/bootconfig/samples/bad-mixed-kv1.bconf | 3 +++ tools/bootconfig/samples/bad-mixed-kv2.bconf | 3 +++ 4 files changed, 25 insertions(+), 4 deletions(-) create mode 100644 tools/bootconfig/samples/bad-mixed-kv1.bconf create mode 100644 tools/bootconfig/samples/bad-mixed-kv2.bconf diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst index 5e7609936507..dfeffa73dca3 100644 --- a/Documentation/admin-guide/bootconfig.rst +++ b/Documentation/admin-guide/bootconfig.rst @@ -62,6 +62,13 @@ Or more shorter, written as following:: In both styles, same key words are automatically merged when parsing it at boot time. So you can append similar trees or key-values. +Note that a sub-key and a value can not co-exist under a parent key. +For example, following config is NOT allowed.:: + + foo = value1 + foo.bar = value2 # !ERROR! subkey "bar" and value "value1" can NOT co-exist + + Comments -------- diff --git a/lib/bootconfig.c b/lib/bootconfig.c index 3ea601a2eba5..54ac623ca781 100644 --- a/lib/bootconfig.c +++ b/lib/bootconfig.c @@ -533,7 +533,7 @@ struct xbc_node *find_match_node(struct xbc_node *node, char *k) static int __init __xbc_add_key(char *k) { - struct xbc_node *node; + struct xbc_node *node, *child; if (!xbc_valid_keyword(k)) return xbc_parse_error("Invalid keyword", k); @@ -543,8 +543,12 @@ static int __init __xbc_add_key(char *k) if (!last_parent) /* the first level */ node = find_match_node(xbc_nodes, k); - else - node = find_match_node(xbc_node_get_child(last_parent), k); + else { + child = xbc_node_get_child(last_parent); + if (child && xbc_node_is_value(child)) + return xbc_parse_error("Subkey is mixed with value", k); + node = find_match_node(child, k); + } if (node) last_parent = node; @@ -577,7 +581,7 @@ static int __init __xbc_parse_keys(char *k) static int __init xbc_parse_kv(char **k, char *v) { struct xbc_node *prev_parent = last_parent; - struct xbc_node *node; + struct xbc_node *node, *child; char *next; int c, ret; @@ -585,6 +589,10 @@ static int __init xbc_parse_kv(char **k, char *v) if (ret) return ret; + child = xbc_node_get_child(last_parent); + if (child && xbc_node_is_key(child)) + return xbc_parse_error("Value is mixed with subkey", v); + c = __xbc_parse_value(&v, &next); if (c < 0) return c; diff --git a/tools/bootconfig/samples/bad-mixed-kv1.bconf b/tools/bootconfig/samples/bad-mixed-kv1.bconf new file mode 100644 index 000000000000..1761547dd05c --- /dev/null +++ b/tools/bootconfig/samples/bad-mixed-kv1.bconf @@ -0,0 +1,3 @@ +# value -> subkey pattern +key = value +key.subkey = another-value diff --git a/tools/bootconfig/samples/bad-mixed-kv2.bconf b/tools/bootconfig/samples/bad-mixed-kv2.bconf new file mode 100644 index 000000000000..6b32e0c3878c --- /dev/null +++ b/tools/bootconfig/samples/bad-mixed-kv2.bconf @@ -0,0 +1,3 @@ +# subkey -> value pattern +key.subkey = value +key = another-value From 88b913718db94697497028b85acbec8b180a4333 Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Thu, 20 Feb 2020 21:19:42 +0900 Subject: [PATCH 081/400] bootconfig: Print array as multiple commands for legacy command line Print arraied values as multiple same options for legacy kernel command line. With this rule, if the "kernel.*" and "init.*" array entries in bootconfig are printed out as multiple same options, e.g. kernel { console = "ttyS0,115200" console += "tty0" } will be correctly converted to console="ttyS0,115200" console="tty0" in the kernel command line. Link: http://lkml.kernel.org/r/158220118213.26565.8163300497009463916.stgit@devnote2 Reported-by: Borislav Petkov Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- init/main.c | 26 ++++++++++---------------- 1 file changed, 10 insertions(+), 16 deletions(-) diff --git a/init/main.c b/init/main.c index 2fe8dec93e68..c9b1ee6bbb8d 100644 --- a/init/main.c +++ b/init/main.c @@ -268,7 +268,6 @@ static int __init xbc_snprint_cmdline(char *buf, size_t size, { struct xbc_node *knode, *vnode; char *end = buf + size; - char c = '\"'; const char *val; int ret; @@ -279,25 +278,20 @@ static int __init xbc_snprint_cmdline(char *buf, size_t size, return ret; vnode = xbc_node_get_child(knode); - ret = snprintf(buf, rest(buf, end), "%s%c", xbc_namebuf, - vnode ? '=' : ' '); - if (ret < 0) - return ret; - buf += ret; - if (!vnode) - continue; - - c = '\"'; - xbc_array_for_each_value(vnode, val) { - ret = snprintf(buf, rest(buf, end), "%c%s", c, val); + if (!vnode) { + ret = snprintf(buf, rest(buf, end), "%s ", xbc_namebuf); + if (ret < 0) + return ret; + buf += ret; + continue; + } + xbc_array_for_each_value(vnode, val) { + ret = snprintf(buf, rest(buf, end), "%s=\"%s\" ", + xbc_namebuf, val); if (ret < 0) return ret; buf += ret; - c = ','; } - if (rest(buf, end) > 2) - strcpy(buf, "\" "); - buf += 2; } return buf - (end - size); From 9951ebfcdf2b97dbb28a5d930458424341e61aa2 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Fri, 21 Feb 2020 10:41:43 +0100 Subject: [PATCH 082/400] nl80211: fix potential leak in AP start If nl80211_parse_he_obss_pd() fails, we leak the previously allocated ACL memory. Free it in this case. Fixes: 796e90f42b7e ("cfg80211: add support for parsing OBBS_PD attributes") Signed-off-by: Johannes Berg Link: https://lore.kernel.org/r/20200221104142.835aba4cdd14.I1923b55ba9989c57e13978f91f40bfdc45e60cbd@changeid Signed-off-by: Johannes Berg --- net/wireless/nl80211.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index cedf17d4933f..46be40e19e7f 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -4800,8 +4800,7 @@ static int nl80211_start_ap(struct sk_buff *skb, struct genl_info *info) err = nl80211_parse_he_obss_pd( info->attrs[NL80211_ATTR_HE_OBSS_PD], ¶ms.he_obss_pd); - if (err) - return err; + goto out; } nl80211_calculate_ap_params(¶ms); @@ -4823,6 +4822,7 @@ static int nl80211_start_ap(struct sk_buff *skb, struct genl_info *info) } wdev_unlock(wdev); +out: kfree(params.acl); return err; From a7ee7d44b57c9ae174088e53a668852b7f4f452d Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Fri, 21 Feb 2020 10:44:50 +0100 Subject: [PATCH 083/400] cfg80211: check reg_rule for NULL in handle_channel_custom() We may end up with a NULL reg_rule after the loop in handle_channel_custom() if the bandwidth didn't fit, check if this is the case and bail out if so. Signed-off-by: Johannes Berg Link: https://lore.kernel.org/r/20200221104449.3b558a50201c.I4ad3725c4dacaefd2d18d3cc65ba6d18acd5dbfe@changeid Signed-off-by: Johannes Berg --- net/wireless/reg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/wireless/reg.c b/net/wireless/reg.c index fff9a74891fc..1a8218f1bbe0 100644 --- a/net/wireless/reg.c +++ b/net/wireless/reg.c @@ -2276,7 +2276,7 @@ static void handle_channel_custom(struct wiphy *wiphy, break; } - if (IS_ERR(reg_rule)) { + if (IS_ERR_OR_NULL(reg_rule)) { pr_debug("Disabling freq %d MHz as custom regd has no rule that fits it\n", chan->center_freq); if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) { From 0daa63ed4c6c4302790ce67b7a90c0997ceb7514 Mon Sep 17 00:00:00 2001 From: Andrei Otcheretianski Date: Fri, 21 Feb 2020 10:47:20 +0100 Subject: [PATCH 084/400] mac80211: Remove a redundant mutex unlock The below-mentioned commit changed the code to unlock *inside* the function, but previously the unlock was *outside*. It failed to remove the outer unlock, however, leading to double unlock. Fix this. Fixes: 33483a6b88e4 ("mac80211: fix missing unlock on error in ieee80211_mark_sta_auth()") Signed-off-by: Andrei Otcheretianski Link: https://lore.kernel.org/r/20200221104719.cce4741cf6eb.I671567b185c8a4c2409377e483fd149ce590f56d@changeid [rewrite commit message to better explain what happened] Signed-off-by: Johannes Berg --- net/mac80211/mlme.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index e041af2f021a..88d7a692a965 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -2959,7 +2959,7 @@ static void ieee80211_rx_mgmt_auth(struct ieee80211_sub_if_data *sdata, (auth_transaction == 2 && ifmgd->auth_data->expected_transaction == 2)) { if (!ieee80211_mark_sta_auth(sdata, bssid)) - goto out_err; + return; /* ignore frame -- wait for timeout */ } else if (ifmgd->auth_data->algorithm == WLAN_AUTH_SAE && auth_transaction == 2) { sdata_info(sdata, "SAE peer confirmed\n"); @@ -2967,10 +2967,6 @@ static void ieee80211_rx_mgmt_auth(struct ieee80211_sub_if_data *sdata, } cfg80211_rx_mlme_mgmt(sdata->dev, (u8 *)mgmt, len); - return; - out_err: - mutex_unlock(&sdata->local->sta_mtx); - /* ignore frame -- wait for timeout */ } #define case_WLAN(type) \ From 22946f37557e27697aabc8e4f62642bfe4a17fd8 Mon Sep 17 00:00:00 2001 From: Jerome Brunet Date: Fri, 21 Feb 2020 13:11:46 +0100 Subject: [PATCH 085/400] ASoC: meson: g12a: add tohdmitx reset Reset the g12a hdmi codec glue on probe. This ensure a sane startup state. Signed-off-by: Jerome Brunet Link: https://lore.kernel.org/r/20200221121146.1498427-1-jbrunet@baylibre.com Signed-off-by: Mark Brown --- sound/soc/meson/g12a-tohdmitx.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/sound/soc/meson/g12a-tohdmitx.c b/sound/soc/meson/g12a-tohdmitx.c index 9cfbd343a00c..8a0db28a6a40 100644 --- a/sound/soc/meson/g12a-tohdmitx.c +++ b/sound/soc/meson/g12a-tohdmitx.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -378,6 +379,11 @@ static int g12a_tohdmitx_probe(struct platform_device *pdev) struct device *dev = &pdev->dev; void __iomem *regs; struct regmap *map; + int ret; + + ret = device_reset(dev); + if (ret) + return ret; regs = devm_platform_ioremap_resource(pdev, 0); if (IS_ERR(regs)) From 136b5cd2e2f97581ae560cff0db2a3b5369112da Mon Sep 17 00:00:00 2001 From: Yuji Sasaki Date: Fri, 14 Feb 2020 13:13:40 +0530 Subject: [PATCH 086/400] spi: qup: call spi_qup_pm_resume_runtime before suspending spi_qup_suspend() will cause synchronous external abort when runtime suspend is enabled and applied, as it tries to access SPI controller register while clock is already disabled in spi_qup_pm_suspend_runtime(). Signed-off-by: Yuji sasaki Signed-off-by: Vinod Koul Link: https://lore.kernel.org/r/20200214074340.2286170-1-vkoul@kernel.org Signed-off-by: Mark Brown --- drivers/spi/spi-qup.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/spi/spi-qup.c b/drivers/spi/spi-qup.c index dd3434a407ea..a364b99497e2 100644 --- a/drivers/spi/spi-qup.c +++ b/drivers/spi/spi-qup.c @@ -1217,6 +1217,11 @@ static int spi_qup_suspend(struct device *device) struct spi_qup *controller = spi_master_get_devdata(master); int ret; + if (pm_runtime_suspended(device)) { + ret = spi_qup_pm_resume_runtime(device); + if (ret) + return ret; + } ret = spi_master_suspend(master); if (ret) return ret; @@ -1225,10 +1230,8 @@ static int spi_qup_suspend(struct device *device) if (ret) return ret; - if (!pm_runtime_suspended(device)) { - clk_disable_unprepare(controller->cclk); - clk_disable_unprepare(controller->iclk); - } + clk_disable_unprepare(controller->cclk); + clk_disable_unprepare(controller->iclk); return 0; } From 138c9c32f090894614899eca15e0bb7279f59865 Mon Sep 17 00:00:00 2001 From: Lukas Wunner Date: Tue, 18 Feb 2020 13:08:00 +0100 Subject: [PATCH 087/400] spi: spidev: Fix CS polarity if GPIO descriptors are used Commit f3186dd87669 ("spi: Optionally use GPIO descriptors for CS GPIOs") amended of_spi_parse_dt() to always set SPI_CS_HIGH for SPI slaves whose Chip Select is defined by a "cs-gpios" devicetree property. This change broke userspace applications which issue an SPI_IOC_WR_MODE ioctl() to an spidev: Chip Select polarity will be incorrect unless the application is changed to set SPI_CS_HIGH. And once changed, it will be incompatible with kernels not containing the commit. Fix by setting SPI_CS_HIGH in spidev_ioctl() (under the same conditions as in of_spi_parse_dt()). Fixes: f3186dd87669 ("spi: Optionally use GPIO descriptors for CS GPIOs") Reported-by: Simon Han Signed-off-by: Lukas Wunner Reviewed-by: Linus Walleij Link: https://lore.kernel.org/r/fca3ba7cdc930cd36854666ceac4fbcf01b89028.1582027457.git.lukas@wunner.de Signed-off-by: Mark Brown Cc: stable@vger.kernel.org # v5.1+ --- drivers/spi/spidev.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c index 1e217e3e9486..2ab6e782f14c 100644 --- a/drivers/spi/spidev.c +++ b/drivers/spi/spidev.c @@ -396,6 +396,7 @@ spidev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) else retval = get_user(tmp, (u32 __user *)arg); if (retval == 0) { + struct spi_controller *ctlr = spi->controller; u32 save = spi->mode; if (tmp & ~SPI_MODE_MASK) { @@ -403,6 +404,10 @@ spidev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) break; } + if (ctlr->use_gpio_descriptors && ctlr->cs_gpiods && + ctlr->cs_gpiods[spi->chip_select]) + tmp |= SPI_CS_HIGH; + tmp |= spi->mode & ~SPI_MODE_MASK; spi->mode = (u16)tmp; retval = spi_setup(spi); From 4e4694d8729f7cd6381f6691e8f83e378fce3160 Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Fri, 21 Feb 2020 17:13:42 +0900 Subject: [PATCH 088/400] bootconfig: Prohibit re-defining value on same key Currently, bootconfig adds a new value on the existing key to the tail of an array. But this looks a bit confusing because an admin can easily rewrite the original value in the same config file. This rejects the following value re-definition. key = value1 ... key = value2 You should rewrite value1 to value2 in this case. Link: http://lkml.kernel.org/r/158227282199.12842.10110929876059658601.stgit@devnote2 Suggested-by: Steven Rostedt (VMware) Signed-off-by: Masami Hiramatsu [ Fixed spelling of arraies to arrays ] Signed-off-by: Steven Rostedt (VMware) --- Documentation/admin-guide/bootconfig.rst | 11 ++++++++++- lib/bootconfig.c | 13 ++++++++----- tools/bootconfig/samples/bad-samekey.bconf | 6 ++++++ 3 files changed, 24 insertions(+), 6 deletions(-) create mode 100644 tools/bootconfig/samples/bad-samekey.bconf diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst index dfeffa73dca3..57119fb69d36 100644 --- a/Documentation/admin-guide/bootconfig.rst +++ b/Documentation/admin-guide/bootconfig.rst @@ -62,7 +62,16 @@ Or more shorter, written as following:: In both styles, same key words are automatically merged when parsing it at boot time. So you can append similar trees or key-values. -Note that a sub-key and a value can not co-exist under a parent key. +Same-key Values +--------------- + +It is prohibited that two or more values or arrays share a same-key. +For example,:: + + foo = bar, baz + foo = qux # !ERROR! we can not re-define same key + +Also, a sub-key and a value can not co-exist under a parent key. For example, following config is NOT allowed.:: foo = value1 diff --git a/lib/bootconfig.c b/lib/bootconfig.c index 54ac623ca781..2ef304db31f2 100644 --- a/lib/bootconfig.c +++ b/lib/bootconfig.c @@ -581,7 +581,7 @@ static int __init __xbc_parse_keys(char *k) static int __init xbc_parse_kv(char **k, char *v) { struct xbc_node *prev_parent = last_parent; - struct xbc_node *node, *child; + struct xbc_node *child; char *next; int c, ret; @@ -590,15 +590,18 @@ static int __init xbc_parse_kv(char **k, char *v) return ret; child = xbc_node_get_child(last_parent); - if (child && xbc_node_is_key(child)) - return xbc_parse_error("Value is mixed with subkey", v); + if (child) { + if (xbc_node_is_key(child)) + return xbc_parse_error("Value is mixed with subkey", v); + else + return xbc_parse_error("Value is redefined", v); + } c = __xbc_parse_value(&v, &next); if (c < 0) return c; - node = xbc_add_sibling(v, XBC_VALUE); - if (!node) + if (!xbc_add_sibling(v, XBC_VALUE)) return -ENOMEM; if (c == ',') { /* Array */ diff --git a/tools/bootconfig/samples/bad-samekey.bconf b/tools/bootconfig/samples/bad-samekey.bconf new file mode 100644 index 000000000000..e8d983a4563c --- /dev/null +++ b/tools/bootconfig/samples/bad-samekey.bconf @@ -0,0 +1,6 @@ +# Same key value is not allowed +key { + foo = value + bar = value2 +} +key.foo = value From 5f811c57c99205e048926293bb812c750a6ea562 Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Fri, 21 Feb 2020 17:13:52 +0900 Subject: [PATCH 089/400] bootconfig: Add append value operator support Add append value operator "+=" support to bootconfig syntax. With this operator, user can add new value to the key as an entry of array instead of overwriting. For example, foo = bar ... foo += baz Then the key "foo" has "bar" and "baz" values as an array. Link: http://lkml.kernel.org/r/158227283195.12842.8310503105963275584.stgit@devnote2 Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- Documentation/admin-guide/bootconfig.rst | 10 +++++++++- lib/bootconfig.c | 15 +++++++++++---- tools/bootconfig/test-bootconfig.sh | 16 ++++++++++++++-- 3 files changed, 34 insertions(+), 7 deletions(-) diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst index 57119fb69d36..cf2edcd09183 100644 --- a/Documentation/admin-guide/bootconfig.rst +++ b/Documentation/admin-guide/bootconfig.rst @@ -71,7 +71,15 @@ For example,:: foo = bar, baz foo = qux # !ERROR! we can not re-define same key -Also, a sub-key and a value can not co-exist under a parent key. +If you want to append the value to existing key as an array member, +you can use ``+=`` operator. For example:: + + foo = bar, baz + foo += qux + +In this case, the key ``foo`` has ``bar``, ``baz`` and ``qux``. + +However, a sub-key and a value can not co-exist under a parent key. For example, following config is NOT allowed.:: foo = value1 diff --git a/lib/bootconfig.c b/lib/bootconfig.c index 2ef304db31f2..ec3ce7fd299f 100644 --- a/lib/bootconfig.c +++ b/lib/bootconfig.c @@ -578,7 +578,7 @@ static int __init __xbc_parse_keys(char *k) return __xbc_add_key(k); } -static int __init xbc_parse_kv(char **k, char *v) +static int __init xbc_parse_kv(char **k, char *v, int op) { struct xbc_node *prev_parent = last_parent; struct xbc_node *child; @@ -593,7 +593,7 @@ static int __init xbc_parse_kv(char **k, char *v) if (child) { if (xbc_node_is_key(child)) return xbc_parse_error("Value is mixed with subkey", v); - else + else if (op == '=') return xbc_parse_error("Value is redefined", v); } @@ -774,7 +774,7 @@ int __init xbc_init(char *buf) p = buf; do { - q = strpbrk(p, "{}=;\n#"); + q = strpbrk(p, "{}=+;\n#"); if (!q) { p = skip_spaces(p); if (*p != '\0') @@ -785,8 +785,15 @@ int __init xbc_init(char *buf) c = *q; *q++ = '\0'; switch (c) { + case '+': + if (*q++ != '=') { + ret = xbc_parse_error("Wrong '+' operator", + q - 2); + break; + } + /* Fall through */ case '=': - ret = xbc_parse_kv(&p, q); + ret = xbc_parse_kv(&p, q, c); break; case '{': ret = xbc_open_brace(&p, q); diff --git a/tools/bootconfig/test-bootconfig.sh b/tools/bootconfig/test-bootconfig.sh index adafb7c50940..1411f4c3454f 100755 --- a/tools/bootconfig/test-bootconfig.sh +++ b/tools/bootconfig/test-bootconfig.sh @@ -9,7 +9,7 @@ TEMPCONF=`mktemp temp-XXXX.bconf` NG=0 cleanup() { - rm -f $INITRD $TEMPCONF + rm -f $INITRD $TEMPCONF $OUTFILE exit $NG } @@ -71,7 +71,6 @@ printf " \0\0\0 \0\0\0" >> $INITRD $BOOTCONF -a $TEMPCONF $INITRD > $OUTFILE 2>&1 xfail grep -i "failed" $OUTFILE xfail grep -i "error" $OUTFILE -rm $OUTFILE echo "Max node number check" @@ -96,6 +95,19 @@ truncate -s 32764 $TEMPCONF echo "\"" >> $TEMPCONF # add 2 bytes + terminal ('\"\n\0') xpass $BOOTCONF -a $TEMPCONF $INITRD +echo "Adding same-key values" +cat > $TEMPCONF << EOF +key = bar, baz +key += qux +EOF +echo > $INITRD + +xpass $BOOTCONF -a $TEMPCONF $INITRD +$BOOTCONF $INITRD > $OUTFILE +xpass grep -q "bar" $OUTFILE +xpass grep -q "baz" $OUTFILE +xpass grep -q "qux" $OUTFILE + echo "=== expected failure cases ===" for i in samples/bad-* ; do xfail $BOOTCONF -a $i $INITRD From ff6993bb79b9f99bdac0b5378169052931b65432 Mon Sep 17 00:00:00 2001 From: Igor Druzhinin Date: Tue, 14 Jan 2020 14:43:19 +0000 Subject: [PATCH 090/400] scsi: libfc: free response frame from GPN_ID fc_disc_gpn_id_resp() should be the last function using it so free it here to avoid memory leak. Link: https://lore.kernel.org/r/1579013000-14570-2-git-send-email-igor.druzhinin@citrix.com Reviewed-by: Hannes Reinecke Signed-off-by: Igor Druzhinin Signed-off-by: Martin K. Petersen --- drivers/scsi/libfc/fc_disc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/scsi/libfc/fc_disc.c b/drivers/scsi/libfc/fc_disc.c index 9c5f7c9178c6..2b865c6423e2 100644 --- a/drivers/scsi/libfc/fc_disc.c +++ b/drivers/scsi/libfc/fc_disc.c @@ -628,6 +628,8 @@ redisc: } out: kref_put(&rdata->kref, fc_rport_destroy); + if (!IS_ERR(fp)) + fc_frame_free(fp); } /** From f66ee0410b1c3481ee75e5db9b34547b4d582465 Mon Sep 17 00:00:00 2001 From: Jozsef Kadlecsik Date: Tue, 11 Feb 2020 23:20:43 +0100 Subject: [PATCH 091/400] netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports In the case of huge hash:* types of sets, due to the single spinlock of a set the processing of the whole set under spinlock protection could take too long. There were four places where the whole hash table of the set was processed from bucket to bucket under holding the spinlock: - During resizing a set, the original set was locked to exclude kernel side add/del element operations (userspace add/del is excluded by the nfnetlink mutex). The original set is actually just read during the resize, so the spinlocking is replaced with rcu locking of regions. However, thus there can be parallel kernel side add/del of entries. In order not to loose those operations a backlog is added and replayed after the successful resize. - Garbage collection of timed out entries was also protected by the spinlock. In order not to lock too long, region locking is introduced and a single region is processed in one gc go. Also, the simple timer based gc running is replaced with a workqueue based solution. The internal book-keeping (number of elements, size of extensions) is moved to region level due to the region locking. - Adding elements: when the max number of the elements is reached, the gc was called to evict the timed out entries. The new approach is that the gc is called just for the matching region, assuming that if the region (proportionally) seems to be full, then the whole set does. We could scan the other regions to check every entry under rcu locking, but for huge sets it'd mean a slowdown at adding elements. - Listing the set header data: when the set was defined with timeout support, the garbage collector was called to clean up timed out entries to get the correct element numbers and set size values. Now the set is scanned to check non-timed out entries, without actually calling the gc for the whole set. Thanks to Florian Westphal for helping me to solve the SOFTIRQ-safe -> SOFTIRQ-unsafe lock order issues during working on the patch. Reported-by: syzbot+4b0e9d4ff3cf117837e5@syzkaller.appspotmail.com Reported-by: syzbot+c27b8d5010f45c666ed1@syzkaller.appspotmail.com Reported-by: syzbot+68a806795ac89df3aa1c@syzkaller.appspotmail.com Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7") Signed-off-by: Jozsef Kadlecsik --- include/linux/netfilter/ipset/ip_set.h | 11 +- net/netfilter/ipset/ip_set_core.c | 34 +- net/netfilter/ipset/ip_set_hash_gen.h | 629 +++++++++++++++++-------- 3 files changed, 470 insertions(+), 204 deletions(-) diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h index 908d38dbcb91..5448c8b443db 100644 --- a/include/linux/netfilter/ipset/ip_set.h +++ b/include/linux/netfilter/ipset/ip_set.h @@ -121,6 +121,7 @@ struct ip_set_ext { u32 timeout; u8 packets_op; u8 bytes_op; + bool target; }; struct ip_set; @@ -187,6 +188,14 @@ struct ip_set_type_variant { /* Return true if "b" set is the same as "a" * according to the create set parameters */ bool (*same_set)(const struct ip_set *a, const struct ip_set *b); + /* Region-locking is used */ + bool region_lock; +}; + +struct ip_set_region { + spinlock_t lock; /* Region lock */ + size_t ext_size; /* Size of the dynamic extensions */ + u32 elements; /* Number of elements vs timeout */ }; /* The core set type structure */ @@ -501,7 +510,7 @@ ip_set_init_skbinfo(struct ip_set_skbinfo *skbinfo, } #define IP_SET_INIT_KEXT(skb, opt, set) \ - { .bytes = (skb)->len, .packets = 1, \ + { .bytes = (skb)->len, .packets = 1, .target = true,\ .timeout = ip_set_adt_opt_timeout(opt, set) } #define IP_SET_INIT_UEXT(set) \ diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c index 69c107f9ba8d..8dd17589217d 100644 --- a/net/netfilter/ipset/ip_set_core.c +++ b/net/netfilter/ipset/ip_set_core.c @@ -723,6 +723,20 @@ ip_set_rcu_get(struct net *net, ip_set_id_t index) return set; } +static inline void +ip_set_lock(struct ip_set *set) +{ + if (!set->variant->region_lock) + spin_lock_bh(&set->lock); +} + +static inline void +ip_set_unlock(struct ip_set *set) +{ + if (!set->variant->region_lock) + spin_unlock_bh(&set->lock); +} + int ip_set_test(ip_set_id_t index, const struct sk_buff *skb, const struct xt_action_param *par, struct ip_set_adt_opt *opt) @@ -744,9 +758,9 @@ ip_set_test(ip_set_id_t index, const struct sk_buff *skb, if (ret == -EAGAIN) { /* Type requests element to be completed */ pr_debug("element must be completed, ADD is triggered\n"); - spin_lock_bh(&set->lock); + ip_set_lock(set); set->variant->kadt(set, skb, par, IPSET_ADD, opt); - spin_unlock_bh(&set->lock); + ip_set_unlock(set); ret = 1; } else { /* --return-nomatch: invert matched element */ @@ -775,9 +789,9 @@ ip_set_add(ip_set_id_t index, const struct sk_buff *skb, !(opt->family == set->family || set->family == NFPROTO_UNSPEC)) return -IPSET_ERR_TYPE_MISMATCH; - spin_lock_bh(&set->lock); + ip_set_lock(set); ret = set->variant->kadt(set, skb, par, IPSET_ADD, opt); - spin_unlock_bh(&set->lock); + ip_set_unlock(set); return ret; } @@ -797,9 +811,9 @@ ip_set_del(ip_set_id_t index, const struct sk_buff *skb, !(opt->family == set->family || set->family == NFPROTO_UNSPEC)) return -IPSET_ERR_TYPE_MISMATCH; - spin_lock_bh(&set->lock); + ip_set_lock(set); ret = set->variant->kadt(set, skb, par, IPSET_DEL, opt); - spin_unlock_bh(&set->lock); + ip_set_unlock(set); return ret; } @@ -1264,9 +1278,9 @@ ip_set_flush_set(struct ip_set *set) { pr_debug("set: %s\n", set->name); - spin_lock_bh(&set->lock); + ip_set_lock(set); set->variant->flush(set); - spin_unlock_bh(&set->lock); + ip_set_unlock(set); } static int ip_set_flush(struct net *net, struct sock *ctnl, struct sk_buff *skb, @@ -1713,9 +1727,9 @@ call_ad(struct sock *ctnl, struct sk_buff *skb, struct ip_set *set, bool eexist = flags & IPSET_FLAG_EXIST, retried = false; do { - spin_lock_bh(&set->lock); + ip_set_lock(set); ret = set->variant->uadt(set, tb, adt, &lineno, flags, retried); - spin_unlock_bh(&set->lock); + ip_set_unlock(set); retried = true; } while (ret == -EAGAIN && set->variant->resize && diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 7480ce55b5c8..71e93eac0831 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -7,13 +7,21 @@ #include #include #include +#include #include -#define __ipset_dereference_protected(p, c) rcu_dereference_protected(p, c) -#define ipset_dereference_protected(p, set) \ - __ipset_dereference_protected(p, lockdep_is_held(&(set)->lock)) - -#define rcu_dereference_bh_nfnl(p) rcu_dereference_bh_check(p, 1) +#define __ipset_dereference(p) \ + rcu_dereference_protected(p, 1) +#define ipset_dereference_nfnl(p) \ + rcu_dereference_protected(p, \ + lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET)) +#define ipset_dereference_set(p, set) \ + rcu_dereference_protected(p, \ + lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET) || \ + lockdep_is_held(&(set)->lock)) +#define ipset_dereference_bh_nfnl(p) \ + rcu_dereference_bh_check(p, \ + lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET)) /* Hashing which uses arrays to resolve clashing. The hash table is resized * (doubled) when searching becomes too long. @@ -72,11 +80,35 @@ struct hbucket { __aligned(__alignof__(u64)); }; +/* Region size for locking == 2^HTABLE_REGION_BITS */ +#define HTABLE_REGION_BITS 10 +#define ahash_numof_locks(htable_bits) \ + ((htable_bits) < HTABLE_REGION_BITS ? 1 \ + : jhash_size((htable_bits) - HTABLE_REGION_BITS)) +#define ahash_sizeof_regions(htable_bits) \ + (ahash_numof_locks(htable_bits) * sizeof(struct ip_set_region)) +#define ahash_region(n, htable_bits) \ + ((n) % ahash_numof_locks(htable_bits)) +#define ahash_bucket_start(h, htable_bits) \ + ((htable_bits) < HTABLE_REGION_BITS ? 0 \ + : (h) * jhash_size(HTABLE_REGION_BITS)) +#define ahash_bucket_end(h, htable_bits) \ + ((htable_bits) < HTABLE_REGION_BITS ? jhash_size(htable_bits) \ + : ((h) + 1) * jhash_size(HTABLE_REGION_BITS)) + +struct htable_gc { + struct delayed_work dwork; + struct ip_set *set; /* Set the gc belongs to */ + u32 region; /* Last gc run position */ +}; + /* The hash table: the table size stored here in order to make resizing easy */ struct htable { atomic_t ref; /* References for resizing */ - atomic_t uref; /* References for dumping */ + atomic_t uref; /* References for dumping and gc */ u8 htable_bits; /* size of hash table == 2^htable_bits */ + u32 maxelem; /* Maxelem per region */ + struct ip_set_region *hregion; /* Region locks and ext sizes */ struct hbucket __rcu *bucket[0]; /* hashtable buckets */ }; @@ -162,6 +194,10 @@ htable_bits(u32 hashsize) #define NLEN 0 #endif /* IP_SET_HASH_WITH_NETS */ +#define SET_ELEM_EXPIRED(set, d) \ + (SET_WITH_TIMEOUT(set) && \ + ip_set_timeout_expired(ext_timeout(d, set))) + #endif /* _IP_SET_HASH_GEN_H */ #ifndef MTYPE @@ -205,10 +241,12 @@ htable_bits(u32 hashsize) #undef mtype_test_cidrs #undef mtype_test #undef mtype_uref -#undef mtype_expire #undef mtype_resize +#undef mtype_ext_size +#undef mtype_resize_ad #undef mtype_head #undef mtype_list +#undef mtype_gc_do #undef mtype_gc #undef mtype_gc_init #undef mtype_variant @@ -247,10 +285,12 @@ htable_bits(u32 hashsize) #define mtype_test_cidrs IPSET_TOKEN(MTYPE, _test_cidrs) #define mtype_test IPSET_TOKEN(MTYPE, _test) #define mtype_uref IPSET_TOKEN(MTYPE, _uref) -#define mtype_expire IPSET_TOKEN(MTYPE, _expire) #define mtype_resize IPSET_TOKEN(MTYPE, _resize) +#define mtype_ext_size IPSET_TOKEN(MTYPE, _ext_size) +#define mtype_resize_ad IPSET_TOKEN(MTYPE, _resize_ad) #define mtype_head IPSET_TOKEN(MTYPE, _head) #define mtype_list IPSET_TOKEN(MTYPE, _list) +#define mtype_gc_do IPSET_TOKEN(MTYPE, _gc_do) #define mtype_gc IPSET_TOKEN(MTYPE, _gc) #define mtype_gc_init IPSET_TOKEN(MTYPE, _gc_init) #define mtype_variant IPSET_TOKEN(MTYPE, _variant) @@ -275,8 +315,7 @@ htable_bits(u32 hashsize) /* The generic hash structure */ struct htype { struct htable __rcu *table; /* the hash table */ - struct timer_list gc; /* garbage collection when timeout enabled */ - struct ip_set *set; /* attached to this ip_set */ + struct htable_gc gc; /* gc workqueue */ u32 maxelem; /* max elements in the hash */ u32 initval; /* random jhash init value */ #ifdef IP_SET_HASH_WITH_MARKMASK @@ -288,21 +327,33 @@ struct htype { #ifdef IP_SET_HASH_WITH_NETMASK u8 netmask; /* netmask value for subnets to store */ #endif + struct list_head ad; /* Resize add|del backlist */ struct mtype_elem next; /* temporary storage for uadd */ #ifdef IP_SET_HASH_WITH_NETS struct net_prefixes nets[NLEN]; /* book-keeping of prefixes */ #endif }; +/* ADD|DEL entries saved during resize */ +struct mtype_resize_ad { + struct list_head list; + enum ipset_adt ad; /* ADD|DEL element */ + struct mtype_elem d; /* Element value */ + struct ip_set_ext ext; /* Extensions for ADD */ + struct ip_set_ext mext; /* Target extensions for ADD */ + u32 flags; /* Flags for ADD */ +}; + #ifdef IP_SET_HASH_WITH_NETS /* Network cidr size book keeping when the hash stores different * sized networks. cidr == real cidr + 1 to support /0. */ static void -mtype_add_cidr(struct htype *h, u8 cidr, u8 n) +mtype_add_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n) { int i, j; + spin_lock_bh(&set->lock); /* Add in increasing prefix order, so larger cidr first */ for (i = 0, j = -1; i < NLEN && h->nets[i].cidr[n]; i++) { if (j != -1) { @@ -311,7 +362,7 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 n) j = i; } else if (h->nets[i].cidr[n] == cidr) { h->nets[CIDR_POS(cidr)].nets[n]++; - return; + goto unlock; } } if (j != -1) { @@ -320,24 +371,29 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 n) } h->nets[i].cidr[n] = cidr; h->nets[CIDR_POS(cidr)].nets[n] = 1; +unlock: + spin_unlock_bh(&set->lock); } static void -mtype_del_cidr(struct htype *h, u8 cidr, u8 n) +mtype_del_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n) { u8 i, j, net_end = NLEN - 1; + spin_lock_bh(&set->lock); for (i = 0; i < NLEN; i++) { if (h->nets[i].cidr[n] != cidr) continue; h->nets[CIDR_POS(cidr)].nets[n]--; if (h->nets[CIDR_POS(cidr)].nets[n] > 0) - return; + goto unlock; for (j = i; j < net_end && h->nets[j].cidr[n]; j++) h->nets[j].cidr[n] = h->nets[j + 1].cidr[n]; h->nets[j].cidr[n] = 0; - return; + goto unlock; } +unlock: + spin_unlock_bh(&set->lock); } #endif @@ -345,7 +401,7 @@ mtype_del_cidr(struct htype *h, u8 cidr, u8 n) static size_t mtype_ahash_memsize(const struct htype *h, const struct htable *t) { - return sizeof(*h) + sizeof(*t); + return sizeof(*h) + sizeof(*t) + ahash_sizeof_regions(t->htable_bits); } /* Get the ith element from the array block n */ @@ -369,24 +425,29 @@ mtype_flush(struct ip_set *set) struct htype *h = set->data; struct htable *t; struct hbucket *n; - u32 i; + u32 r, i; - t = ipset_dereference_protected(h->table, set); - for (i = 0; i < jhash_size(t->htable_bits); i++) { - n = __ipset_dereference_protected(hbucket(t, i), 1); - if (!n) - continue; - if (set->extensions & IPSET_EXT_DESTROY) - mtype_ext_cleanup(set, n); - /* FIXME: use slab cache */ - rcu_assign_pointer(hbucket(t, i), NULL); - kfree_rcu(n, rcu); + t = ipset_dereference_nfnl(h->table); + for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) { + spin_lock_bh(&t->hregion[r].lock); + for (i = ahash_bucket_start(r, t->htable_bits); + i < ahash_bucket_end(r, t->htable_bits); i++) { + n = __ipset_dereference(hbucket(t, i)); + if (!n) + continue; + if (set->extensions & IPSET_EXT_DESTROY) + mtype_ext_cleanup(set, n); + /* FIXME: use slab cache */ + rcu_assign_pointer(hbucket(t, i), NULL); + kfree_rcu(n, rcu); + } + t->hregion[r].ext_size = 0; + t->hregion[r].elements = 0; + spin_unlock_bh(&t->hregion[r].lock); } #ifdef IP_SET_HASH_WITH_NETS memset(h->nets, 0, sizeof(h->nets)); #endif - set->elements = 0; - set->ext_size = 0; } /* Destroy the hashtable part of the set */ @@ -397,7 +458,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy) u32 i; for (i = 0; i < jhash_size(t->htable_bits); i++) { - n = __ipset_dereference_protected(hbucket(t, i), 1); + n = __ipset_dereference(hbucket(t, i)); if (!n) continue; if (set->extensions & IPSET_EXT_DESTROY && ext_destroy) @@ -406,6 +467,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy) kfree(n); } + ip_set_free(t->hregion); ip_set_free(t); } @@ -414,28 +476,21 @@ static void mtype_destroy(struct ip_set *set) { struct htype *h = set->data; + struct list_head *l, *lt; if (SET_WITH_TIMEOUT(set)) - del_timer_sync(&h->gc); + cancel_delayed_work_sync(&h->gc.dwork); - mtype_ahash_destroy(set, - __ipset_dereference_protected(h->table, 1), true); + mtype_ahash_destroy(set, ipset_dereference_nfnl(h->table), true); + list_for_each_safe(l, lt, &h->ad) { + list_del(l); + kfree(l); + } kfree(h); set->data = NULL; } -static void -mtype_gc_init(struct ip_set *set, void (*gc)(struct timer_list *t)) -{ - struct htype *h = set->data; - - timer_setup(&h->gc, gc, 0); - mod_timer(&h->gc, jiffies + IPSET_GC_PERIOD(set->timeout) * HZ); - pr_debug("gc initialized, run in every %u\n", - IPSET_GC_PERIOD(set->timeout)); -} - static bool mtype_same_set(const struct ip_set *a, const struct ip_set *b) { @@ -454,11 +509,9 @@ mtype_same_set(const struct ip_set *a, const struct ip_set *b) a->extensions == b->extensions; } -/* Delete expired elements from the hashtable */ static void -mtype_expire(struct ip_set *set, struct htype *h) +mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r) { - struct htable *t; struct hbucket *n, *tmp; struct mtype_elem *data; u32 i, j, d; @@ -466,10 +519,12 @@ mtype_expire(struct ip_set *set, struct htype *h) #ifdef IP_SET_HASH_WITH_NETS u8 k; #endif + u8 htable_bits = t->htable_bits; - t = ipset_dereference_protected(h->table, set); - for (i = 0; i < jhash_size(t->htable_bits); i++) { - n = __ipset_dereference_protected(hbucket(t, i), 1); + spin_lock_bh(&t->hregion[r].lock); + for (i = ahash_bucket_start(r, htable_bits); + i < ahash_bucket_end(r, htable_bits); i++) { + n = __ipset_dereference(hbucket(t, i)); if (!n) continue; for (j = 0, d = 0; j < n->pos; j++) { @@ -485,58 +540,100 @@ mtype_expire(struct ip_set *set, struct htype *h) smp_mb__after_atomic(); #ifdef IP_SET_HASH_WITH_NETS for (k = 0; k < IPSET_NET_COUNT; k++) - mtype_del_cidr(h, + mtype_del_cidr(set, h, NCIDR_PUT(DCIDR_GET(data->cidr, k)), k); #endif + t->hregion[r].elements--; ip_set_ext_destroy(set, data); - set->elements--; d++; } if (d >= AHASH_INIT_SIZE) { if (d >= n->size) { + t->hregion[r].ext_size -= + ext_size(n->size, dsize); rcu_assign_pointer(hbucket(t, i), NULL); kfree_rcu(n, rcu); continue; } tmp = kzalloc(sizeof(*tmp) + - (n->size - AHASH_INIT_SIZE) * dsize, - GFP_ATOMIC); + (n->size - AHASH_INIT_SIZE) * dsize, + GFP_ATOMIC); if (!tmp) - /* Still try to delete expired elements */ + /* Still try to delete expired elements. */ continue; tmp->size = n->size - AHASH_INIT_SIZE; for (j = 0, d = 0; j < n->pos; j++) { if (!test_bit(j, n->used)) continue; data = ahash_data(n, j, dsize); - memcpy(tmp->value + d * dsize, data, dsize); + memcpy(tmp->value + d * dsize, + data, dsize); set_bit(d, tmp->used); d++; } tmp->pos = d; - set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize); + t->hregion[r].ext_size -= + ext_size(AHASH_INIT_SIZE, dsize); rcu_assign_pointer(hbucket(t, i), tmp); kfree_rcu(n, rcu); } } + spin_unlock_bh(&t->hregion[r].lock); } static void -mtype_gc(struct timer_list *t) +mtype_gc(struct work_struct *work) { - struct htype *h = from_timer(h, t, gc); - struct ip_set *set = h->set; + struct htable_gc *gc; + struct ip_set *set; + struct htype *h; + struct htable *t; + u32 r, numof_locks; + unsigned int next_run; + + gc = container_of(work, struct htable_gc, dwork.work); + set = gc->set; + h = set->data; - pr_debug("called\n"); spin_lock_bh(&set->lock); - mtype_expire(set, h); + t = ipset_dereference_set(h->table, set); + atomic_inc(&t->uref); + numof_locks = ahash_numof_locks(t->htable_bits); + r = gc->region++; + if (r >= numof_locks) { + r = gc->region = 0; + } + next_run = (IPSET_GC_PERIOD(set->timeout) * HZ) / numof_locks; + if (next_run < HZ/10) + next_run = HZ/10; spin_unlock_bh(&set->lock); - h->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ; - add_timer(&h->gc); + mtype_gc_do(set, h, t, r); + + if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) { + pr_debug("Table destroy after resize by expire: %p\n", t); + mtype_ahash_destroy(set, t, false); + } + + queue_delayed_work(system_power_efficient_wq, &gc->dwork, next_run); + } +static void +mtype_gc_init(struct htable_gc *gc) +{ + INIT_DEFERRABLE_WORK(&gc->dwork, mtype_gc); + queue_delayed_work(system_power_efficient_wq, &gc->dwork, HZ); +} + +static int +mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, + struct ip_set_ext *mext, u32 flags); +static int +mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, + struct ip_set_ext *mext, u32 flags); + /* Resize a hash: create a new hash table with doubling the hashsize * and inserting the elements to it. Repeat until we succeed or * fail due to memory pressures. @@ -547,7 +644,7 @@ mtype_resize(struct ip_set *set, bool retried) struct htype *h = set->data; struct htable *t, *orig; u8 htable_bits; - size_t extsize, dsize = set->dsize; + size_t dsize = set->dsize; #ifdef IP_SET_HASH_WITH_NETS u8 flags; struct mtype_elem *tmp; @@ -555,7 +652,9 @@ mtype_resize(struct ip_set *set, bool retried) struct mtype_elem *data; struct mtype_elem *d; struct hbucket *n, *m; - u32 i, j, key; + struct list_head *l, *lt; + struct mtype_resize_ad *x; + u32 i, j, r, nr, key; int ret; #ifdef IP_SET_HASH_WITH_NETS @@ -563,10 +662,8 @@ mtype_resize(struct ip_set *set, bool retried) if (!tmp) return -ENOMEM; #endif - rcu_read_lock_bh(); - orig = rcu_dereference_bh_nfnl(h->table); + orig = ipset_dereference_bh_nfnl(h->table); htable_bits = orig->htable_bits; - rcu_read_unlock_bh(); retry: ret = 0; @@ -583,88 +680,124 @@ retry: ret = -ENOMEM; goto out; } + t->hregion = ip_set_alloc(ahash_sizeof_regions(htable_bits)); + if (!t->hregion) { + kfree(t); + ret = -ENOMEM; + goto out; + } t->htable_bits = htable_bits; + t->maxelem = h->maxelem / ahash_numof_locks(htable_bits); + for (i = 0; i < ahash_numof_locks(htable_bits); i++) + spin_lock_init(&t->hregion[i].lock); - spin_lock_bh(&set->lock); - orig = __ipset_dereference_protected(h->table, 1); - /* There can't be another parallel resizing, but dumping is possible */ + /* There can't be another parallel resizing, + * but dumping, gc, kernel side add/del are possible + */ + orig = ipset_dereference_bh_nfnl(h->table); atomic_set(&orig->ref, 1); atomic_inc(&orig->uref); - extsize = 0; pr_debug("attempt to resize set %s from %u to %u, t %p\n", set->name, orig->htable_bits, htable_bits, orig); - for (i = 0; i < jhash_size(orig->htable_bits); i++) { - n = __ipset_dereference_protected(hbucket(orig, i), 1); - if (!n) - continue; - for (j = 0; j < n->pos; j++) { - if (!test_bit(j, n->used)) + for (r = 0; r < ahash_numof_locks(orig->htable_bits); r++) { + /* Expire may replace a hbucket with another one */ + rcu_read_lock_bh(); + for (i = ahash_bucket_start(r, orig->htable_bits); + i < ahash_bucket_end(r, orig->htable_bits); i++) { + n = __ipset_dereference(hbucket(orig, i)); + if (!n) continue; - data = ahash_data(n, j, dsize); + for (j = 0; j < n->pos; j++) { + if (!test_bit(j, n->used)) + continue; + data = ahash_data(n, j, dsize); + if (SET_ELEM_EXPIRED(set, data)) + continue; #ifdef IP_SET_HASH_WITH_NETS - /* We have readers running parallel with us, - * so the live data cannot be modified. - */ - flags = 0; - memcpy(tmp, data, dsize); - data = tmp; - mtype_data_reset_flags(data, &flags); + /* We have readers running parallel with us, + * so the live data cannot be modified. + */ + flags = 0; + memcpy(tmp, data, dsize); + data = tmp; + mtype_data_reset_flags(data, &flags); #endif - key = HKEY(data, h->initval, htable_bits); - m = __ipset_dereference_protected(hbucket(t, key), 1); - if (!m) { - m = kzalloc(sizeof(*m) + + key = HKEY(data, h->initval, htable_bits); + m = __ipset_dereference(hbucket(t, key)); + nr = ahash_region(key, htable_bits); + if (!m) { + m = kzalloc(sizeof(*m) + AHASH_INIT_SIZE * dsize, GFP_ATOMIC); - if (!m) { - ret = -ENOMEM; - goto cleanup; - } - m->size = AHASH_INIT_SIZE; - extsize += ext_size(AHASH_INIT_SIZE, dsize); - RCU_INIT_POINTER(hbucket(t, key), m); - } else if (m->pos >= m->size) { - struct hbucket *ht; + if (!m) { + ret = -ENOMEM; + goto cleanup; + } + m->size = AHASH_INIT_SIZE; + t->hregion[nr].ext_size += + ext_size(AHASH_INIT_SIZE, + dsize); + RCU_INIT_POINTER(hbucket(t, key), m); + } else if (m->pos >= m->size) { + struct hbucket *ht; - if (m->size >= AHASH_MAX(h)) { - ret = -EAGAIN; - } else { - ht = kzalloc(sizeof(*ht) + + if (m->size >= AHASH_MAX(h)) { + ret = -EAGAIN; + } else { + ht = kzalloc(sizeof(*ht) + (m->size + AHASH_INIT_SIZE) * dsize, GFP_ATOMIC); - if (!ht) - ret = -ENOMEM; + if (!ht) + ret = -ENOMEM; + } + if (ret < 0) + goto cleanup; + memcpy(ht, m, sizeof(struct hbucket) + + m->size * dsize); + ht->size = m->size + AHASH_INIT_SIZE; + t->hregion[nr].ext_size += + ext_size(AHASH_INIT_SIZE, + dsize); + kfree(m); + m = ht; + RCU_INIT_POINTER(hbucket(t, key), ht); } - if (ret < 0) - goto cleanup; - memcpy(ht, m, sizeof(struct hbucket) + - m->size * dsize); - ht->size = m->size + AHASH_INIT_SIZE; - extsize += ext_size(AHASH_INIT_SIZE, dsize); - kfree(m); - m = ht; - RCU_INIT_POINTER(hbucket(t, key), ht); - } - d = ahash_data(m, m->pos, dsize); - memcpy(d, data, dsize); - set_bit(m->pos++, m->used); + d = ahash_data(m, m->pos, dsize); + memcpy(d, data, dsize); + set_bit(m->pos++, m->used); + t->hregion[nr].elements++; #ifdef IP_SET_HASH_WITH_NETS - mtype_data_reset_flags(d, &flags); + mtype_data_reset_flags(d, &flags); #endif + } } + rcu_read_unlock_bh(); } - rcu_assign_pointer(h->table, t); - set->ext_size = extsize; - spin_unlock_bh(&set->lock); + /* There can't be any other writer. */ + rcu_assign_pointer(h->table, t); /* Give time to other readers of the set */ synchronize_rcu(); pr_debug("set %s resized from %u (%p) to %u (%p)\n", set->name, orig->htable_bits, orig, t->htable_bits, t); - /* If there's nobody else dumping the table, destroy it */ + /* Add/delete elements processed by the SET target during resize. + * Kernel-side add cannot trigger a resize and userspace actions + * are serialized by the mutex. + */ + list_for_each_safe(l, lt, &h->ad) { + x = list_entry(l, struct mtype_resize_ad, list); + if (x->ad == IPSET_ADD) { + mtype_add(set, &x->d, &x->ext, &x->mext, x->flags); + } else { + mtype_del(set, &x->d, NULL, NULL, 0); + } + list_del(l); + kfree(l); + } + /* If there's nobody else using the table, destroy it */ if (atomic_dec_and_test(&orig->uref)) { pr_debug("Table destroy by resize %p\n", orig); mtype_ahash_destroy(set, orig, false); @@ -677,15 +810,44 @@ out: return ret; cleanup: + rcu_read_unlock_bh(); atomic_set(&orig->ref, 0); atomic_dec(&orig->uref); - spin_unlock_bh(&set->lock); mtype_ahash_destroy(set, t, false); if (ret == -EAGAIN) goto retry; goto out; } +/* Get the current number of elements and ext_size in the set */ +static void +mtype_ext_size(struct ip_set *set, u32 *elements, size_t *ext_size) +{ + struct htype *h = set->data; + const struct htable *t; + u32 i, j, r; + struct hbucket *n; + struct mtype_elem *data; + + t = rcu_dereference_bh(h->table); + for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) { + for (i = ahash_bucket_start(r, t->htable_bits); + i < ahash_bucket_end(r, t->htable_bits); i++) { + n = rcu_dereference_bh(hbucket(t, i)); + if (!n) + continue; + for (j = 0; j < n->pos; j++) { + if (!test_bit(j, n->used)) + continue; + data = ahash_data(n, j, set->dsize); + if (!SET_ELEM_EXPIRED(set, data)) + (*elements)++; + } + } + *ext_size += t->hregion[r].ext_size; + } +} + /* Add an element to a hash and update the internal counters when succeeded, * otherwise report the proper error code. */ @@ -698,32 +860,49 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, const struct mtype_elem *d = value; struct mtype_elem *data; struct hbucket *n, *old = ERR_PTR(-ENOENT); - int i, j = -1; + int i, j = -1, ret; bool flag_exist = flags & IPSET_FLAG_EXIST; bool deleted = false, forceadd = false, reuse = false; - u32 key, multi = 0; + u32 r, key, multi = 0, elements, maxelem; - if (set->elements >= h->maxelem) { - if (SET_WITH_TIMEOUT(set)) - /* FIXME: when set is full, we slow down here */ - mtype_expire(set, h); - if (set->elements >= h->maxelem && SET_WITH_FORCEADD(set)) + rcu_read_lock_bh(); + t = rcu_dereference_bh(h->table); + key = HKEY(value, h->initval, t->htable_bits); + r = ahash_region(key, t->htable_bits); + atomic_inc(&t->uref); + elements = t->hregion[r].elements; + maxelem = t->maxelem; + if (elements >= maxelem) { + u32 e; + if (SET_WITH_TIMEOUT(set)) { + rcu_read_unlock_bh(); + mtype_gc_do(set, h, t, r); + rcu_read_lock_bh(); + } + maxelem = h->maxelem; + elements = 0; + for (e = 0; e < ahash_numof_locks(t->htable_bits); e++) + elements += t->hregion[e].elements; + if (elements >= maxelem && SET_WITH_FORCEADD(set)) forceadd = true; } + rcu_read_unlock_bh(); - t = ipset_dereference_protected(h->table, set); - key = HKEY(value, h->initval, t->htable_bits); - n = __ipset_dereference_protected(hbucket(t, key), 1); + spin_lock_bh(&t->hregion[r].lock); + n = rcu_dereference_bh(hbucket(t, key)); if (!n) { - if (forceadd || set->elements >= h->maxelem) + if (forceadd || elements >= maxelem) goto set_full; old = NULL; n = kzalloc(sizeof(*n) + AHASH_INIT_SIZE * set->dsize, GFP_ATOMIC); - if (!n) - return -ENOMEM; + if (!n) { + ret = -ENOMEM; + goto unlock; + } n->size = AHASH_INIT_SIZE; - set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize); + t->hregion[r].ext_size += + ext_size(AHASH_INIT_SIZE, set->dsize); goto copy_elem; } for (i = 0; i < n->pos; i++) { @@ -737,19 +916,16 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, } data = ahash_data(n, i, set->dsize); if (mtype_data_equal(data, d, &multi)) { - if (flag_exist || - (SET_WITH_TIMEOUT(set) && - ip_set_timeout_expired(ext_timeout(data, set)))) { + if (flag_exist || SET_ELEM_EXPIRED(set, data)) { /* Just the extensions could be overwritten */ j = i; goto overwrite_extensions; } - return -IPSET_ERR_EXIST; + ret = -IPSET_ERR_EXIST; + goto unlock; } /* Reuse first timed out entry */ - if (SET_WITH_TIMEOUT(set) && - ip_set_timeout_expired(ext_timeout(data, set)) && - j == -1) { + if (SET_ELEM_EXPIRED(set, data) && j == -1) { j = i; reuse = true; } @@ -759,16 +935,16 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, if (!deleted) { #ifdef IP_SET_HASH_WITH_NETS for (i = 0; i < IPSET_NET_COUNT; i++) - mtype_del_cidr(h, + mtype_del_cidr(set, h, NCIDR_PUT(DCIDR_GET(data->cidr, i)), i); #endif ip_set_ext_destroy(set, data); - set->elements--; + t->hregion[r].elements--; } goto copy_data; } - if (set->elements >= h->maxelem) + if (elements >= maxelem) goto set_full; /* Create a new slot */ if (n->pos >= n->size) { @@ -776,28 +952,32 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, if (n->size >= AHASH_MAX(h)) { /* Trigger rehashing */ mtype_data_next(&h->next, d); - return -EAGAIN; + ret = -EAGAIN; + goto resize; } old = n; n = kzalloc(sizeof(*n) + (old->size + AHASH_INIT_SIZE) * set->dsize, GFP_ATOMIC); - if (!n) - return -ENOMEM; + if (!n) { + ret = -ENOMEM; + goto unlock; + } memcpy(n, old, sizeof(struct hbucket) + old->size * set->dsize); n->size = old->size + AHASH_INIT_SIZE; - set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize); + t->hregion[r].ext_size += + ext_size(AHASH_INIT_SIZE, set->dsize); } copy_elem: j = n->pos++; data = ahash_data(n, j, set->dsize); copy_data: - set->elements++; + t->hregion[r].elements++; #ifdef IP_SET_HASH_WITH_NETS for (i = 0; i < IPSET_NET_COUNT; i++) - mtype_add_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i); + mtype_add_cidr(set, h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i); #endif memcpy(data, d, sizeof(struct mtype_elem)); overwrite_extensions: @@ -820,13 +1000,41 @@ overwrite_extensions: if (old) kfree_rcu(old, rcu); } + ret = 0; +resize: + spin_unlock_bh(&t->hregion[r].lock); + if (atomic_read(&t->ref) && ext->target) { + /* Resize is in process and kernel side add, save values */ + struct mtype_resize_ad *x; + + x = kzalloc(sizeof(struct mtype_resize_ad), GFP_ATOMIC); + if (!x) + /* Don't bother */ + goto out; + x->ad = IPSET_ADD; + memcpy(&x->d, value, sizeof(struct mtype_elem)); + memcpy(&x->ext, ext, sizeof(struct ip_set_ext)); + memcpy(&x->mext, mext, sizeof(struct ip_set_ext)); + x->flags = flags; + spin_lock_bh(&set->lock); + list_add_tail(&x->list, &h->ad); + spin_unlock_bh(&set->lock); + } + goto out; - return 0; set_full: if (net_ratelimit()) pr_warn("Set %s is full, maxelem %u reached\n", - set->name, h->maxelem); - return -IPSET_ERR_HASH_FULL; + set->name, maxelem); + ret = -IPSET_ERR_HASH_FULL; +unlock: + spin_unlock_bh(&t->hregion[r].lock); +out: + if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) { + pr_debug("Table destroy after resize by add: %p\n", t); + mtype_ahash_destroy(set, t, false); + } + return ret; } /* Delete an element from the hash and free up space if possible. @@ -840,13 +1048,23 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, const struct mtype_elem *d = value; struct mtype_elem *data; struct hbucket *n; - int i, j, k, ret = -IPSET_ERR_EXIST; + struct mtype_resize_ad *x = NULL; + int i, j, k, r, ret = -IPSET_ERR_EXIST; u32 key, multi = 0; size_t dsize = set->dsize; - t = ipset_dereference_protected(h->table, set); + /* Userspace add and resize is excluded by the mutex. + * Kernespace add does not trigger resize. + */ + rcu_read_lock_bh(); + t = rcu_dereference_bh(h->table); key = HKEY(value, h->initval, t->htable_bits); - n = __ipset_dereference_protected(hbucket(t, key), 1); + r = ahash_region(key, t->htable_bits); + atomic_inc(&t->uref); + rcu_read_unlock_bh(); + + spin_lock_bh(&t->hregion[r].lock); + n = rcu_dereference_bh(hbucket(t, key)); if (!n) goto out; for (i = 0, k = 0; i < n->pos; i++) { @@ -857,8 +1075,7 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, data = ahash_data(n, i, dsize); if (!mtype_data_equal(data, d, &multi)) continue; - if (SET_WITH_TIMEOUT(set) && - ip_set_timeout_expired(ext_timeout(data, set))) + if (SET_ELEM_EXPIRED(set, data)) goto out; ret = 0; @@ -866,20 +1083,33 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, smp_mb__after_atomic(); if (i + 1 == n->pos) n->pos--; - set->elements--; + t->hregion[r].elements--; #ifdef IP_SET_HASH_WITH_NETS for (j = 0; j < IPSET_NET_COUNT; j++) - mtype_del_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, j)), - j); + mtype_del_cidr(set, h, + NCIDR_PUT(DCIDR_GET(d->cidr, j)), j); #endif ip_set_ext_destroy(set, data); + if (atomic_read(&t->ref) && ext->target) { + /* Resize is in process and kernel side del, + * save values + */ + x = kzalloc(sizeof(struct mtype_resize_ad), + GFP_ATOMIC); + if (x) { + x->ad = IPSET_DEL; + memcpy(&x->d, value, + sizeof(struct mtype_elem)); + x->flags = flags; + } + } for (; i < n->pos; i++) { if (!test_bit(i, n->used)) k++; } if (n->pos == 0 && k == 0) { - set->ext_size -= ext_size(n->size, dsize); + t->hregion[r].ext_size -= ext_size(n->size, dsize); rcu_assign_pointer(hbucket(t, key), NULL); kfree_rcu(n, rcu); } else if (k >= AHASH_INIT_SIZE) { @@ -898,7 +1128,8 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, k++; } tmp->pos = k; - set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize); + t->hregion[r].ext_size -= + ext_size(AHASH_INIT_SIZE, dsize); rcu_assign_pointer(hbucket(t, key), tmp); kfree_rcu(n, rcu); } @@ -906,6 +1137,16 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext, } out: + spin_unlock_bh(&t->hregion[r].lock); + if (x) { + spin_lock_bh(&set->lock); + list_add(&x->list, &h->ad); + spin_unlock_bh(&set->lock); + } + if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) { + pr_debug("Table destroy after resize by del: %p\n", t); + mtype_ahash_destroy(set, t, false); + } return ret; } @@ -991,6 +1232,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext, int i, ret = 0; u32 key, multi = 0; + rcu_read_lock_bh(); t = rcu_dereference_bh(h->table); #ifdef IP_SET_HASH_WITH_NETS /* If we test an IP address and not a network address, @@ -1022,6 +1264,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext, goto out; } out: + rcu_read_unlock_bh(); return ret; } @@ -1033,23 +1276,14 @@ mtype_head(struct ip_set *set, struct sk_buff *skb) const struct htable *t; struct nlattr *nested; size_t memsize; + u32 elements = 0; + size_t ext_size = 0; u8 htable_bits; - /* If any members have expired, set->elements will be wrong - * mytype_expire function will update it with the right count. - * we do not hold set->lock here, so grab it first. - * set->elements can still be incorrect in the case of a huge set, - * because elements might time out during the listing. - */ - if (SET_WITH_TIMEOUT(set)) { - spin_lock_bh(&set->lock); - mtype_expire(set, h); - spin_unlock_bh(&set->lock); - } - rcu_read_lock_bh(); - t = rcu_dereference_bh_nfnl(h->table); - memsize = mtype_ahash_memsize(h, t) + set->ext_size; + t = rcu_dereference_bh(h->table); + mtype_ext_size(set, &elements, &ext_size); + memsize = mtype_ahash_memsize(h, t) + ext_size + set->ext_size; htable_bits = t->htable_bits; rcu_read_unlock_bh(); @@ -1071,7 +1305,7 @@ mtype_head(struct ip_set *set, struct sk_buff *skb) #endif if (nla_put_net32(skb, IPSET_ATTR_REFERENCES, htonl(set->ref)) || nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)) || - nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(set->elements))) + nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(elements))) goto nla_put_failure; if (unlikely(ip_set_put_flags(skb, set))) goto nla_put_failure; @@ -1091,15 +1325,15 @@ mtype_uref(struct ip_set *set, struct netlink_callback *cb, bool start) if (start) { rcu_read_lock_bh(); - t = rcu_dereference_bh_nfnl(h->table); + t = ipset_dereference_bh_nfnl(h->table); atomic_inc(&t->uref); cb->args[IPSET_CB_PRIVATE] = (unsigned long)t; rcu_read_unlock_bh(); } else if (cb->args[IPSET_CB_PRIVATE]) { t = (struct htable *)cb->args[IPSET_CB_PRIVATE]; if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) { - /* Resizing didn't destroy the hash table */ - pr_debug("Table destroy by dump: %p\n", t); + pr_debug("Table destroy after resize " + " by dump: %p\n", t); mtype_ahash_destroy(set, t, false); } cb->args[IPSET_CB_PRIVATE] = 0; @@ -1141,8 +1375,7 @@ mtype_list(const struct ip_set *set, if (!test_bit(i, n->used)) continue; e = ahash_data(n, i, set->dsize); - if (SET_WITH_TIMEOUT(set) && - ip_set_timeout_expired(ext_timeout(e, set))) + if (SET_ELEM_EXPIRED(set, e)) continue; pr_debug("list hash %lu hbucket %p i %u, data %p\n", cb->args[IPSET_CB_ARG0], n, i, e); @@ -1208,6 +1441,7 @@ static const struct ip_set_type_variant mtype_variant = { .uref = mtype_uref, .resize = mtype_resize, .same_set = mtype_same_set, + .region_lock = true, }; #ifdef IP_SET_EMIT_CREATE @@ -1226,6 +1460,7 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set, size_t hsize; struct htype *h; struct htable *t; + u32 i; pr_debug("Create set %s with family %s\n", set->name, set->family == NFPROTO_IPV4 ? "inet" : "inet6"); @@ -1294,6 +1529,15 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set, kfree(h); return -ENOMEM; } + t->hregion = ip_set_alloc(ahash_sizeof_regions(hbits)); + if (!t->hregion) { + kfree(t); + kfree(h); + return -ENOMEM; + } + h->gc.set = set; + for (i = 0; i < ahash_numof_locks(hbits); i++) + spin_lock_init(&t->hregion[i].lock); h->maxelem = maxelem; #ifdef IP_SET_HASH_WITH_NETMASK h->netmask = netmask; @@ -1304,9 +1548,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set, get_random_bytes(&h->initval, sizeof(h->initval)); t->htable_bits = hbits; + t->maxelem = h->maxelem / ahash_numof_locks(hbits); RCU_INIT_POINTER(h->table, t); - h->set = set; + INIT_LIST_HEAD(&h->ad); set->data = h; #ifndef IP_SET_PROTO_UNDEF if (set->family == NFPROTO_IPV4) { @@ -1329,12 +1574,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set, #ifndef IP_SET_PROTO_UNDEF if (set->family == NFPROTO_IPV4) #endif - IPSET_TOKEN(HTYPE, 4_gc_init)(set, - IPSET_TOKEN(HTYPE, 4_gc)); + IPSET_TOKEN(HTYPE, 4_gc_init)(&h->gc); #ifndef IP_SET_PROTO_UNDEF else - IPSET_TOKEN(HTYPE, 6_gc_init)(set, - IPSET_TOKEN(HTYPE, 6_gc)); + IPSET_TOKEN(HTYPE, 6_gc_init)(&h->gc); #endif } pr_debug("create %s hashsize %u (%u) maxelem %u: %p(%p)\n", From 5c37f1ae1c335800d16b207cb578009c695dcd39 Mon Sep 17 00:00:00 2001 From: James Morse Date: Thu, 20 Feb 2020 16:58:37 +0000 Subject: [PATCH 092/400] KVM: arm64: Ask the compiler to __always_inline functions used at HYP On non VHE CPUs, KVM's __hyp_text contains code run at EL2 while the rest of the kernel runs at EL1. This code lives in its own section with start and end markers so we can map it to EL2. The compiler may decide not to inline static-inline functions from the header file. It may also decide not to put these out-of-line functions in the same section, meaning they aren't mapped when called at EL2. Clang-9 does exactly this with __kern_hyp_va() and a few others when x18 is reserved for the shadow call stack. Add the additional __always_ hint to all the static-inlines that are called from a hyp file. Signed-off-by: James Morse Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20200220165839.256881-2-james.morse@arm.com ---- kvm_get_hyp_vector() pulls in all the regular per-cpu accessors and this_cpu_has_cap(), fortunately its only called for VHE. --- arch/arm64/include/asm/arch_gicv3.h | 2 +- arch/arm64/include/asm/cpufeature.h | 2 +- arch/arm64/include/asm/kvm_emulate.h | 48 ++++++++++++++-------------- arch/arm64/include/asm/kvm_mmu.h | 3 +- arch/arm64/include/asm/virt.h | 2 +- 5 files changed, 29 insertions(+), 28 deletions(-) diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h index 89e4c8b79349..07597028bb00 100644 --- a/arch/arm64/include/asm/arch_gicv3.h +++ b/arch/arm64/include/asm/arch_gicv3.h @@ -32,7 +32,7 @@ static inline void gic_write_eoir(u32 irq) isb(); } -static inline void gic_write_dir(u32 irq) +static __always_inline void gic_write_dir(u32 irq) { write_sysreg_s(irq, SYS_ICC_DIR_EL1); isb(); diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 4261d55e8506..0e6d03c7e368 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -581,7 +581,7 @@ static inline bool system_supports_sve(void) cpus_have_const_cap(ARM64_SVE); } -static inline bool system_supports_cnp(void) +static __always_inline bool system_supports_cnp(void) { return IS_ENABLED(CONFIG_ARM64_CNP) && cpus_have_const_cap(ARM64_HAS_CNP); diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 688c63412cc2..f658dda12364 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -36,7 +36,7 @@ void kvm_inject_undef32(struct kvm_vcpu *vcpu); void kvm_inject_dabt32(struct kvm_vcpu *vcpu, unsigned long addr); void kvm_inject_pabt32(struct kvm_vcpu *vcpu, unsigned long addr); -static inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu) +static __always_inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu) { return !(vcpu->arch.hcr_el2 & HCR_RW); } @@ -127,7 +127,7 @@ static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr) vcpu->arch.vsesr_el2 = vsesr; } -static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) +static __always_inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu) { return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc; } @@ -153,17 +153,17 @@ static inline void vcpu_write_elr_el1(const struct kvm_vcpu *vcpu, unsigned long *__vcpu_elr_el1(vcpu) = v; } -static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu) +static __always_inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu) { return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pstate; } -static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu) +static __always_inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu) { return !!(*vcpu_cpsr(vcpu) & PSR_MODE32_BIT); } -static inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu) +static __always_inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu) { if (vcpu_mode_is_32bit(vcpu)) return kvm_condition_valid32(vcpu); @@ -181,13 +181,13 @@ static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu) * coming from a read of ESR_EL2. Otherwise, it may give the wrong result on * AArch32 with banked registers. */ -static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu, +static __always_inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu, u8 reg_num) { return (reg_num == 31) ? 0 : vcpu_gp_regs(vcpu)->regs.regs[reg_num]; } -static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num, +static __always_inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num, unsigned long val) { if (reg_num != 31) @@ -264,12 +264,12 @@ static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu) return mode != PSR_MODE_EL0t; } -static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu) +static __always_inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu) { return vcpu->arch.fault.esr_el2; } -static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu) +static __always_inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu) { u32 esr = kvm_vcpu_get_hsr(vcpu); @@ -279,12 +279,12 @@ static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu) return -1; } -static inline unsigned long kvm_vcpu_get_hfar(const struct kvm_vcpu *vcpu) +static __always_inline unsigned long kvm_vcpu_get_hfar(const struct kvm_vcpu *vcpu) { return vcpu->arch.fault.far_el2; } -static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu) +static __always_inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu) { return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8; } @@ -299,7 +299,7 @@ static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu) return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK; } -static inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu) +static __always_inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu) { return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_ISV); } @@ -319,17 +319,17 @@ static inline bool kvm_vcpu_dabt_issf(const struct kvm_vcpu *vcpu) return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SF); } -static inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu) +static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu) { return (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT; } -static inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu) +static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu) { return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_S1PTW); } -static inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu) +static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu) { return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WNR) || kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */ @@ -340,18 +340,18 @@ static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu) return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_CM); } -static inline unsigned int kvm_vcpu_dabt_get_as(const struct kvm_vcpu *vcpu) +static __always_inline unsigned int kvm_vcpu_dabt_get_as(const struct kvm_vcpu *vcpu) { return 1 << ((kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SAS) >> ESR_ELx_SAS_SHIFT); } /* This one is not specific to Data Abort */ -static inline bool kvm_vcpu_trap_il_is32bit(const struct kvm_vcpu *vcpu) +static __always_inline bool kvm_vcpu_trap_il_is32bit(const struct kvm_vcpu *vcpu) { return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_IL); } -static inline u8 kvm_vcpu_trap_get_class(const struct kvm_vcpu *vcpu) +static __always_inline u8 kvm_vcpu_trap_get_class(const struct kvm_vcpu *vcpu) { return ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu)); } @@ -361,17 +361,17 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu) return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW; } -static inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu) +static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu) { return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_FSC; } -static inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu) +static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu) { return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_FSC_TYPE; } -static inline bool kvm_vcpu_dabt_isextabt(const struct kvm_vcpu *vcpu) +static __always_inline bool kvm_vcpu_dabt_isextabt(const struct kvm_vcpu *vcpu) { switch (kvm_vcpu_trap_get_fault(vcpu)) { case FSC_SEA: @@ -390,7 +390,7 @@ static inline bool kvm_vcpu_dabt_isextabt(const struct kvm_vcpu *vcpu) } } -static inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu) +static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu) { u32 esr = kvm_vcpu_get_hsr(vcpu); return ESR_ELx_SYS64_ISS_RT(esr); @@ -504,7 +504,7 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu, return data; /* Leave LE untouched */ } -static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr) +static __always_inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr) { if (vcpu_mode_is_32bit(vcpu)) kvm_skip_instr32(vcpu, is_wide_instr); @@ -519,7 +519,7 @@ static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr) * Skip an instruction which has been emulated at hyp while most guest sysregs * are live. */ -static inline void __hyp_text __kvm_skip_instr(struct kvm_vcpu *vcpu) +static __always_inline void __hyp_text __kvm_skip_instr(struct kvm_vcpu *vcpu) { *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR); vcpu->arch.ctxt.gp_regs.regs.pstate = read_sysreg_el2(SYS_SPSR); diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 53d846f1bfe7..785762860c63 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -93,7 +93,7 @@ void kvm_update_va_mask(struct alt_instr *alt, __le32 *origptr, __le32 *updptr, int nr_inst); void kvm_compute_layout(void); -static inline unsigned long __kern_hyp_va(unsigned long v) +static __always_inline unsigned long __kern_hyp_va(unsigned long v) { asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n" "ror %0, %0, #1\n" @@ -473,6 +473,7 @@ static inline int kvm_write_guest_lock(struct kvm *kvm, gpa_t gpa, extern void *__kvm_bp_vect_base; extern int __kvm_harden_el2_vector_slot; +/* This is only called on a VHE system */ static inline void *kvm_get_hyp_vector(void) { struct bp_hardening_data *data = arm64_get_bp_hardening_data(); diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h index 0958ed6191aa..61fd26752adc 100644 --- a/arch/arm64/include/asm/virt.h +++ b/arch/arm64/include/asm/virt.h @@ -83,7 +83,7 @@ static inline bool is_kernel_in_hyp_mode(void) return read_sysreg(CurrentEL) == CurrentEL_EL2; } -static inline bool has_vhe(void) +static __always_inline bool has_vhe(void) { if (cpus_have_const_cap(ARM64_HAS_VIRT_HOST_EXTN)) return true; From 8c2d146ee7a2e0782eea4dd70fddc1c837140136 Mon Sep 17 00:00:00 2001 From: James Morse Date: Thu, 20 Feb 2020 16:58:38 +0000 Subject: [PATCH 093/400] KVM: arm64: Define our own swab32() to avoid a uapi static inline KVM uses swab32() when mediating GIC MMIO accesses if the GICV is badly aligned, and the host and guest differ in endianness. arm64 doesn't provide a __arch_swab32(), so __fswab32() is always backed by the macro implementation that the compiler reduces to a single instruction. But the static-inline causes problems for KVM if the compiler chooses not to inline this function, it may not be located in the __hyp_text where __vgic_v2_perform_cpuif_access() needs it. Create our own __kvm_swab32() macro that calls ___constant_swab32() directly. This way we know it will always be inlined. Signed-off-by: James Morse Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20200220165839.256881-3-james.morse@arm.com --- arch/arm64/include/asm/kvm_hyp.h | 7 +++++++ arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c | 4 ++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index 97f21cc66657..5fde137b5150 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -47,6 +47,13 @@ #define read_sysreg_el2(r) read_sysreg_elx(r, _EL2, _EL1) #define write_sysreg_el2(v,r) write_sysreg_elx(v, r, _EL2, _EL1) +/* + * Without an __arch_swab32(), we fall back to ___constant_swab32(), but the + * static inline can allow the compiler to out-of-line this. KVM always wants + * the macro version as its always inlined. + */ +#define __kvm_swab32(x) ___constant_swab32(x) + int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu); void __vgic_v3_save_state(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c b/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c index 29ee1feba4eb..4f3a087e36d5 100644 --- a/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c +++ b/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c @@ -69,14 +69,14 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu) u32 data = vcpu_get_reg(vcpu, rd); if (__is_be(vcpu)) { /* guest pre-swabbed data, undo this for writel() */ - data = swab32(data); + data = __kvm_swab32(data); } writel_relaxed(data, addr); } else { u32 data = readl_relaxed(addr); if (__is_be(vcpu)) { /* guest expects swabbed data */ - data = swab32(data); + data = __kvm_swab32(data); } vcpu_set_reg(vcpu, rd, data); } From e43f1331e2ef913b8c566920c9af75e0ccdd1d3f Mon Sep 17 00:00:00 2001 From: James Morse Date: Thu, 20 Feb 2020 16:58:39 +0000 Subject: [PATCH 094/400] arm64: Ask the compiler to __always_inline functions used by KVM at HYP KVM uses some of the static-inline helpers like icache_is_vipt() from its HYP code. This assumes the function is inlined so that the code is mapped to EL2. The compiler may decide not to inline these, and the out-of-line version may not be in the __hyp_text section. Add the additional __always_ hint to these static-inlines that are used by KVM. Signed-off-by: James Morse Signed-off-by: Marc Zyngier Acked-by: Will Deacon Link: https://lore.kernel.org/r/20200220165839.256881-4-james.morse@arm.com --- arch/arm64/include/asm/cache.h | 2 +- arch/arm64/include/asm/cacheflush.h | 2 +- arch/arm64/include/asm/cpufeature.h | 8 ++++---- arch/arm64/include/asm/io.h | 4 ++-- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index 806e9dc2a852..a4d1b5f771f6 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -69,7 +69,7 @@ static inline int icache_is_aliasing(void) return test_bit(ICACHEF_ALIASING, &__icache_flags); } -static inline int icache_is_vpipt(void) +static __always_inline int icache_is_vpipt(void) { return test_bit(ICACHEF_VPIPT, &__icache_flags); } diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h index 665c78e0665a..e6cca3d4acf7 100644 --- a/arch/arm64/include/asm/cacheflush.h +++ b/arch/arm64/include/asm/cacheflush.h @@ -145,7 +145,7 @@ extern void copy_to_user_page(struct vm_area_struct *, struct page *, #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1 extern void flush_dcache_page(struct page *); -static inline void __flush_icache_all(void) +static __always_inline void __flush_icache_all(void) { if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) return; diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 0e6d03c7e368..be078699ac4b 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -435,13 +435,13 @@ cpuid_feature_extract_signed_field(u64 features, int field) return cpuid_feature_extract_signed_field_width(features, field, 4); } -static inline unsigned int __attribute_const__ +static __always_inline unsigned int __attribute_const__ cpuid_feature_extract_unsigned_field_width(u64 features, int field, int width) { return (u64)(features << (64 - width - field)) >> (64 - width); } -static inline unsigned int __attribute_const__ +static __always_inline unsigned int __attribute_const__ cpuid_feature_extract_unsigned_field(u64 features, int field) { return cpuid_feature_extract_unsigned_field_width(features, field, 4); @@ -564,7 +564,7 @@ static inline bool system_supports_mixed_endian(void) return val == 0x1; } -static inline bool system_supports_fpsimd(void) +static __always_inline bool system_supports_fpsimd(void) { return !cpus_have_const_cap(ARM64_HAS_NO_FPSIMD); } @@ -575,7 +575,7 @@ static inline bool system_uses_ttbr0_pan(void) !cpus_have_const_cap(ARM64_HAS_PAN); } -static inline bool system_supports_sve(void) +static __always_inline bool system_supports_sve(void) { return IS_ENABLED(CONFIG_ARM64_SVE) && cpus_have_const_cap(ARM64_SVE); diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h index 4e531f57147d..6facd1308e7c 100644 --- a/arch/arm64/include/asm/io.h +++ b/arch/arm64/include/asm/io.h @@ -34,7 +34,7 @@ static inline void __raw_writew(u16 val, volatile void __iomem *addr) } #define __raw_writel __raw_writel -static inline void __raw_writel(u32 val, volatile void __iomem *addr) +static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr) { asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr)); } @@ -69,7 +69,7 @@ static inline u16 __raw_readw(const volatile void __iomem *addr) } #define __raw_readl __raw_readl -static inline u32 __raw_readl(const volatile void __iomem *addr) +static __always_inline u32 __raw_readl(const volatile void __iomem *addr) { u32 val; asm volatile(ALTERNATIVE("ldr %w0, [%1]", From 8af1c6fbd9239877998c7f5a591cb2c88d41fb66 Mon Sep 17 00:00:00 2001 From: Jozsef Kadlecsik Date: Sat, 22 Feb 2020 12:01:43 +0100 Subject: [PATCH 095/400] netfilter: ipset: Fix forceadd evaluation path When the forceadd option is enabled, the hash:* types should find and replace the first entry in the bucket with the new one if there are no reuseable (deleted or timed out) entries. However, the position index was just not set to zero and remained the invalid -1 if there were no reuseable entries. Reported-by: syzbot+6a86565c74ebe30aea18@syzkaller.appspotmail.com Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7") Signed-off-by: Jozsef Kadlecsik --- net/netfilter/ipset/ip_set_hash_gen.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h index 71e93eac0831..e52d7b7597a0 100644 --- a/net/netfilter/ipset/ip_set_hash_gen.h +++ b/net/netfilter/ipset/ip_set_hash_gen.h @@ -931,6 +931,8 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext, } } if (reuse || forceadd) { + if (j == -1) + j = 0; data = ahash_data(n, j, set->dsize); if (!deleted) { #ifdef IP_SET_HASH_WITH_NETS From 2ad3e17ebf94b7b7f3f64c050ff168f9915345eb Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Sat, 22 Feb 2020 20:36:47 -0500 Subject: [PATCH 096/400] audit: fix error handling in audit_data_to_entry() Commit 219ca39427bf ("audit: use union for audit_field values since they are mutually exclusive") combined a number of separate fields in the audit_field struct into a single union. Generally this worked just fine because they are generally mutually exclusive. Unfortunately in audit_data_to_entry() the overlap can be a problem when a specific error case is triggered that causes the error path code to attempt to cleanup an audit_field struct and the cleanup involves attempting to free a stored LSM string (the lsm_str field). Currently the code always has a non-NULL value in the audit_field.lsm_str field as the top of the for-loop transfers a value into audit_field.val (both .lsm_str and .val are part of the same union); if audit_data_to_entry() fails and the audit_field struct is specified to contain a LSM string, but the audit_field.lsm_str has not yet been properly set, the error handling code will attempt to free the bogus audit_field.lsm_str value that was set with audit_field.val at the top of the for-loop. This patch corrects this by ensuring that the audit_field.val is only set when needed (it is cleared when the audit_field struct is allocated with kcalloc()). It also corrects a few other issues to ensure that in case of error the proper error code is returned. Cc: stable@vger.kernel.org Fixes: 219ca39427bf ("audit: use union for audit_field values since they are mutually exclusive") Reported-by: syzbot+1f4d90ead370d72e450b@syzkaller.appspotmail.com Signed-off-by: Paul Moore --- kernel/auditfilter.c | 81 ++++++++++++++++++++++++-------------------- 1 file changed, 44 insertions(+), 37 deletions(-) diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c index b0126e9c0743..026e34da4ace 100644 --- a/kernel/auditfilter.c +++ b/kernel/auditfilter.c @@ -456,6 +456,7 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data, bufp = data->buf; for (i = 0; i < data->field_count; i++) { struct audit_field *f = &entry->rule.fields[i]; + u32 f_val; err = -EINVAL; @@ -464,12 +465,12 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data, goto exit_free; f->type = data->fields[i]; - f->val = data->values[i]; + f_val = data->values[i]; /* Support legacy tests for a valid loginuid */ - if ((f->type == AUDIT_LOGINUID) && (f->val == AUDIT_UID_UNSET)) { + if ((f->type == AUDIT_LOGINUID) && (f_val == AUDIT_UID_UNSET)) { f->type = AUDIT_LOGINUID_SET; - f->val = 0; + f_val = 0; entry->rule.pflags |= AUDIT_LOGINUID_LEGACY; } @@ -485,7 +486,7 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data, case AUDIT_SUID: case AUDIT_FSUID: case AUDIT_OBJ_UID: - f->uid = make_kuid(current_user_ns(), f->val); + f->uid = make_kuid(current_user_ns(), f_val); if (!uid_valid(f->uid)) goto exit_free; break; @@ -494,11 +495,12 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data, case AUDIT_SGID: case AUDIT_FSGID: case AUDIT_OBJ_GID: - f->gid = make_kgid(current_user_ns(), f->val); + f->gid = make_kgid(current_user_ns(), f_val); if (!gid_valid(f->gid)) goto exit_free; break; case AUDIT_ARCH: + f->val = f_val; entry->rule.arch_f = f; break; case AUDIT_SUBJ_USER: @@ -511,11 +513,13 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data, case AUDIT_OBJ_TYPE: case AUDIT_OBJ_LEV_LOW: case AUDIT_OBJ_LEV_HIGH: - str = audit_unpack_string(&bufp, &remain, f->val); - if (IS_ERR(str)) + str = audit_unpack_string(&bufp, &remain, f_val); + if (IS_ERR(str)) { + err = PTR_ERR(str); goto exit_free; - entry->rule.buflen += f->val; - + } + entry->rule.buflen += f_val; + f->lsm_str = str; err = security_audit_rule_init(f->type, f->op, str, (void **)&f->lsm_rule); /* Keep currently invalid fields around in case they @@ -524,68 +528,71 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data, pr_warn("audit rule for LSM \'%s\' is invalid\n", str); err = 0; - } - if (err) { - kfree(str); + } else if (err) goto exit_free; - } else - f->lsm_str = str; break; case AUDIT_WATCH: - str = audit_unpack_string(&bufp, &remain, f->val); - if (IS_ERR(str)) + str = audit_unpack_string(&bufp, &remain, f_val); + if (IS_ERR(str)) { + err = PTR_ERR(str); goto exit_free; - entry->rule.buflen += f->val; - - err = audit_to_watch(&entry->rule, str, f->val, f->op); + } + err = audit_to_watch(&entry->rule, str, f_val, f->op); if (err) { kfree(str); goto exit_free; } + entry->rule.buflen += f_val; break; case AUDIT_DIR: - str = audit_unpack_string(&bufp, &remain, f->val); - if (IS_ERR(str)) + str = audit_unpack_string(&bufp, &remain, f_val); + if (IS_ERR(str)) { + err = PTR_ERR(str); goto exit_free; - entry->rule.buflen += f->val; - + } err = audit_make_tree(&entry->rule, str, f->op); kfree(str); if (err) goto exit_free; + entry->rule.buflen += f_val; break; case AUDIT_INODE: + f->val = f_val; err = audit_to_inode(&entry->rule, f); if (err) goto exit_free; break; case AUDIT_FILTERKEY: - if (entry->rule.filterkey || f->val > AUDIT_MAX_KEY_LEN) + if (entry->rule.filterkey || f_val > AUDIT_MAX_KEY_LEN) goto exit_free; - str = audit_unpack_string(&bufp, &remain, f->val); - if (IS_ERR(str)) - goto exit_free; - entry->rule.buflen += f->val; - entry->rule.filterkey = str; - break; - case AUDIT_EXE: - if (entry->rule.exe || f->val > PATH_MAX) - goto exit_free; - str = audit_unpack_string(&bufp, &remain, f->val); + str = audit_unpack_string(&bufp, &remain, f_val); if (IS_ERR(str)) { err = PTR_ERR(str); goto exit_free; } - entry->rule.buflen += f->val; - - audit_mark = audit_alloc_mark(&entry->rule, str, f->val); + entry->rule.buflen += f_val; + entry->rule.filterkey = str; + break; + case AUDIT_EXE: + if (entry->rule.exe || f_val > PATH_MAX) + goto exit_free; + str = audit_unpack_string(&bufp, &remain, f_val); + if (IS_ERR(str)) { + err = PTR_ERR(str); + goto exit_free; + } + audit_mark = audit_alloc_mark(&entry->rule, str, f_val); if (IS_ERR(audit_mark)) { kfree(str); err = PTR_ERR(audit_mark); goto exit_free; } + entry->rule.buflen += f_val; entry->rule.exe = audit_mark; break; + default: + f->val = f_val; + break; } } From 42d84c8490f9f0931786f1623191fcab397c3d64 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= Date: Fri, 21 Feb 2020 12:06:56 +0100 Subject: [PATCH 097/400] vhost: Check docket sk_family instead of call getname MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Doing so, we save one call to get data we already have in the struct. Also, since there is no guarantee that getname use sockaddr_ll parameter beyond its size, we add a little bit of security here. It should do not do beyond MAX_ADDR_LEN, but syzbot found that ax25_getname writes more (72 bytes, the size of full_sockaddr_ax25, versus 20 + 32 bytes of sockaddr_ll + MAX_ADDR_LEN in syzbot repro). Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server") Reported-by: syzbot+f2a62d07a5198c819c7b@syzkaller.appspotmail.com Signed-off-by: Eugenio Pérez Acked-by: Michael S. Tsirkin Signed-off-by: David S. Miller --- drivers/vhost/net.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index e158159671fa..18e205eeb9af 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1414,10 +1414,6 @@ static int vhost_net_release(struct inode *inode, struct file *f) static struct socket *get_raw_socket(int fd) { - struct { - struct sockaddr_ll sa; - char buf[MAX_ADDR_LEN]; - } uaddr; int r; struct socket *sock = sockfd_lookup(fd, &r); @@ -1430,11 +1426,7 @@ static struct socket *get_raw_socket(int fd) goto err; } - r = sock->ops->getname(sock, (struct sockaddr *)&uaddr.sa, 0); - if (r < 0) - goto err; - - if (uaddr.sa.sll_family != AF_PACKET) { + if (sock->sk->sk_family != AF_PACKET) { r = -EPFNOSUPPORT; goto err; } From 3e72dfdf8227b052393f71d820ec7599909dddc2 Mon Sep 17 00:00:00 2001 From: Matteo Croce Date: Fri, 21 Feb 2020 12:28:38 +0100 Subject: [PATCH 098/400] ipv4: ensure rcu_read_lock() in cipso_v4_error() Similarly to commit c543cb4a5f07 ("ipv4: ensure rcu_read_lock() in ipv4_link_failure()"), __ip_options_compile() must be called under rcu protection. Fixes: 3da1ed7ac398 ("net: avoid use IPCB in cipso_v4_error") Suggested-by: Guillaume Nault Signed-off-by: Matteo Croce Acked-by: Paul Moore Signed-off-by: David S. Miller --- net/ipv4/cipso_ipv4.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c index 376882215919..0bd10a1f477f 100644 --- a/net/ipv4/cipso_ipv4.c +++ b/net/ipv4/cipso_ipv4.c @@ -1724,6 +1724,7 @@ void cipso_v4_error(struct sk_buff *skb, int error, u32 gateway) { unsigned char optbuf[sizeof(struct ip_options) + 40]; struct ip_options *opt = (struct ip_options *)optbuf; + int res; if (ip_hdr(skb)->protocol == IPPROTO_ICMP || error != -EACCES) return; @@ -1735,7 +1736,11 @@ void cipso_v4_error(struct sk_buff *skb, int error, u32 gateway) memset(opt, 0, sizeof(struct ip_options)); opt->optlen = ip_hdr(skb)->ihl*4 - sizeof(struct iphdr); - if (__ip_options_compile(dev_net(skb->dev), opt, skb, NULL)) + rcu_read_lock(); + res = __ip_options_compile(dev_net(skb->dev), opt, skb, NULL); + rcu_read_unlock(); + + if (res) return; if (gateway) From 39f3b41aa7cae917f928ef9f31d09da28188e5ed Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Fri, 21 Feb 2020 19:42:13 +0100 Subject: [PATCH 099/400] net: genetlink: return the error code when attribute parsing fails. Currently if attribute parsing fails and the genl family does not support parallel operation, the error code returned by __nlmsg_parse() is discarded by genl_family_rcv_msg_attrs_parse(). Be sure to report the error for all genl families. Fixes: c10e6cf85e7d ("net: genetlink: push attrbuf allocation and parsing to a separate function") Fixes: ab5b526da048 ("net: genetlink: always allocate separate attrs for dumpit ops") Signed-off-by: Paolo Abeni Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller --- net/netlink/genetlink.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index 0522b2b1fd95..9f357aa22b94 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -497,8 +497,9 @@ genl_family_rcv_msg_attrs_parse(const struct genl_family *family, err = __nlmsg_parse(nlh, hdrlen, attrbuf, family->maxattr, family->policy, validate, extack); - if (err && parallel) { - kfree(attrbuf); + if (err) { + if (parallel) + kfree(attrbuf); return ERR_PTR(err); } return attrbuf; From eae7172f8141eb98e64e6e81acc9e9d5b2add127 Mon Sep 17 00:00:00 2001 From: Daniele Palmas Date: Fri, 21 Feb 2020 14:17:05 +0100 Subject: [PATCH 100/400] net: usb: qmi_wwan: restore mtu min/max values after raw_ip switch MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit usbnet creates network interfaces with min_mtu = 0 and max_mtu = ETH_MAX_MTU. These values are not modified by qmi_wwan when the network interface is created initially, allowing, for example, to set mtu greater than 1500. When a raw_ip switch is done (raw_ip set to 'Y', then set to 'N') the mtu values for the network interface are set through ether_setup, with min_mtu = ETH_MIN_MTU and max_mtu = ETH_DATA_LEN, not allowing anymore to set mtu greater than 1500 (error: mtu greater than device maximum). The patch restores the original min/max mtu values set by usbnet after a raw_ip switch. Signed-off-by: Daniele Palmas Acked-by: Bjørn Mork Signed-off-by: David S. Miller --- drivers/net/usb/qmi_wwan.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 3b7a3b8a5e06..5754bb6ca0ee 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -337,6 +337,9 @@ static void qmi_wwan_netdev_setup(struct net_device *net) netdev_dbg(net, "mode: raw IP\n"); } else if (!net->header_ops) { /* don't bother if already set */ ether_setup(net); + /* Restoring min/max mtu values set originally by usbnet */ + net->min_mtu = 0; + net->max_mtu = ETH_MAX_MTU; clear_bit(EVENT_NO_IP_ALIGN, &dev->flags); netdev_dbg(net, "mode: Ethernet\n"); } From e08658a657f974590809290c62e889f0fd420200 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Sat, 22 Feb 2020 13:50:49 +0530 Subject: [PATCH 101/400] powerpc/watchpoint: Don't call dar_within_range() for Book3S DAR is set to the first byte of overlap between actual access and watched range at DSI on Book3S processor. But actual access range might or might not be within user asked range. So for Book3S, it must not call dar_within_range(). This revert portion of commit 39413ae00967 ("powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size."). Before patch: # ./tools/testing/selftests/powerpc/ptrace/perf-hwbreak ... TESTED: No overlap FAILED: Partial overlap: 0 != 2 TESTED: Partial overlap TESTED: No overlap FAILED: Full overlap: 0 != 2 failure: perf_hwbreak After patch: TESTED: No overlap TESTED: Partial overlap TESTED: Partial overlap TESTED: No overlap TESTED: Full overlap success: perf_hwbreak Fixes: 39413ae00967 ("powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size.") Reported-by: Michael Ellerman Signed-off-by: Ravi Bangoria Reviewed-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200222082049.330435-1-ravi.bangoria@linux.ibm.com --- arch/powerpc/kernel/hw_breakpoint.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c index 2462cd7c565c..d0854320bb50 100644 --- a/arch/powerpc/kernel/hw_breakpoint.c +++ b/arch/powerpc/kernel/hw_breakpoint.c @@ -331,11 +331,13 @@ int hw_breakpoint_handler(struct die_args *args) } info->type &= ~HW_BRK_TYPE_EXTRANEOUS_IRQ; - if (!dar_within_range(regs->dar, info)) - info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ; - - if (!IS_ENABLED(CONFIG_PPC_8xx) && !stepping_handler(regs, bp, info)) - goto out; + if (IS_ENABLED(CONFIG_PPC_8xx)) { + if (!dar_within_range(regs->dar, info)) + info->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ; + } else { + if (!stepping_handler(regs, bp, info)) + goto out; + } /* * As a policy, the callback is invoked in a 'trigger-after-execute' From f6f13c125e05603f68f5bf31f045b95e6d493598 Mon Sep 17 00:00:00 2001 From: Haiyang Zhang Date: Fri, 21 Feb 2020 08:32:18 -0800 Subject: [PATCH 102/400] hv_netvsc: Fix unwanted wakeup in netvsc_attach() When netvsc_attach() is called by operations like changing MTU, etc., an extra wakeup may happen while netvsc_attach() calling rndis_filter_device_add() which sends rndis messages when queue is stopped in netvsc_detach(). The completion message will wake up queue 0. We can reproduce the issue by changing MTU etc., then the wake_queue counter from "ethtool -S" will increase beyond stop_queue counter: stop_queue: 0 wake_queue: 1 The issue causes queue wake up, and counter increment, no other ill effects in current code. So we didn't see any network problem for now. To fix this, initialize tx_disable to true, and set it to false when the NIC is ready to be attached or registered. Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic") Signed-off-by: Haiyang Zhang Signed-off-by: David S. Miller --- drivers/net/hyperv/netvsc.c | 2 +- drivers/net/hyperv/netvsc_drv.c | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index ae3f3084c2ed..1b320bcf150a 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -99,7 +99,7 @@ static struct netvsc_device *alloc_net_device(void) init_waitqueue_head(&net_device->wait_drain); net_device->destroy = false; - net_device->tx_disable = false; + net_device->tx_disable = true; net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT; net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT; diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index 65e12cb07f45..2c0a24c606fc 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -1068,6 +1068,7 @@ static int netvsc_attach(struct net_device *ndev, } /* In any case device is now ready */ + nvdev->tx_disable = false; netif_device_attach(ndev); /* Note: enable and attach happen when sub-channels setup */ @@ -2476,6 +2477,8 @@ static int netvsc_probe(struct hv_device *dev, else net->max_mtu = ETH_DATA_LEN; + nvdev->tx_disable = false; + ret = register_netdevice(net); if (ret != 0) { pr_err("Unable to register netdev.\n"); From dad8cea7add96a353fa1898b5ccefbb72da66f29 Mon Sep 17 00:00:00 2001 From: Neal Cardwell Date: Sat, 22 Feb 2020 11:21:15 -0500 Subject: [PATCH 103/400] tcp: fix TFO SYNACK undo to avoid double-timestamp-undo In a rare corner case the new logic for undo of SYNACK RTO could result in triggering the warning in tcp_fastretrans_alert() that says: WARN_ON(tp->retrans_out != 0); The warning looked like: WARNING: CPU: 1 PID: 1 at net/ipv4/tcp_input.c:2818 tcp_ack+0x13e0/0x3270 The sequence that tickles this bug is: - Fast Open server receives TFO SYN with data, sends SYNACK - (client receives SYNACK and sends ACK, but ACK is lost) - server app sends some data packets - (N of the first data packets are lost) - server receives client ACK that has a TS ECR matching first SYNACK, and also SACKs suggesting the first N data packets were lost - server performs TS undo of SYNACK RTO, then immediately enters recovery - buggy behavior then performed a *second* undo that caused the connection to be in CA_Open with retrans_out != 0 Basically, the incoming ACK packet with SACK blocks causes us to first undo the cwnd reduction from the SYNACK RTO, but then immediately enters fast recovery, which then makes us eligible for undo again. And then tcp_rcv_synrecv_state_fastopen() accidentally performs an undo using a "mash-up" of state from two different loss recovery phases: it uses the timestamp info from the ACK of the original SYNACK, and the undo_marker from the fast recovery. This fix refines the logic to only invoke the tcp_try_undo_loss() inside tcp_rcv_synrecv_state_fastopen() if the connection is still in CA_Loss. If peer SACKs triggered fast recovery, then tcp_rcv_synrecv_state_fastopen() can't safely undo. Fixes: 794200d66273 ("tcp: undo cwnd on Fast Open spurious SYNACK retransmit") Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller --- net/ipv4/tcp_input.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 316ebdf8151d..6b6b57000dad 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6124,7 +6124,11 @@ static void tcp_rcv_synrecv_state_fastopen(struct sock *sk) { struct request_sock *req; - tcp_try_undo_loss(sk, false); + /* If we are still handling the SYNACK RTO, see if timestamp ECR allows + * undo. If peer SACKs triggered fast recovery, we can't undo here. + */ + if (inet_csk(sk)->icsk_ca_state == TCP_CA_Loss) + tcp_try_undo_loss(sk, false); /* Reset rtx states to prevent spurious retransmits_timed_out() */ tcp_sk(sk)->retrans_stamp = 0; From 66d0e797bf095d407479c89952d42b1d96ef0a7f Mon Sep 17 00:00:00 2001 From: Orson Zhai Date: Fri, 21 Feb 2020 01:37:04 +0800 Subject: [PATCH 104/400] Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs" This reverts commit 4585fbcb5331fc910b7e553ad3efd0dd7b320d14. The name changing as devfreq(X) breaks some user space applications, such as Android HAL from Unisoc and Hikey [1]. The device name will be changed unexpectly after every boot depending on module init sequence. It will make trouble to setup some system configuration like selinux for Android. So we'd like to revert it back to old naming rule before any better way being found. [1] https://lkml.org/lkml/2018/5/8/1042 Cc: John Stultz Cc: Greg Kroah-Hartman Cc: stable@vger.kernel.org Signed-off-by: Orson Zhai Acked-by: Greg Kroah-Hartman Signed-off-by: Chanwoo Choi --- drivers/devfreq/devfreq.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c index cceee8bc3c2f..7dcf2093e531 100644 --- a/drivers/devfreq/devfreq.c +++ b/drivers/devfreq/devfreq.c @@ -738,7 +738,6 @@ struct devfreq *devfreq_add_device(struct device *dev, { struct devfreq *devfreq; struct devfreq_governor *governor; - static atomic_t devfreq_no = ATOMIC_INIT(-1); int err = 0; if (!dev || !profile || !governor_name) { @@ -800,8 +799,7 @@ struct devfreq *devfreq_add_device(struct device *dev, devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev); atomic_set(&devfreq->suspend_count, 0); - dev_set_name(&devfreq->dev, "devfreq%d", - atomic_inc_return(&devfreq_no)); + dev_set_name(&devfreq->dev, "%s", dev_name(dev)); err = device_register(&devfreq->dev); if (err) { mutex_unlock(&devfreq->lock); From 193155c8c9429f57400daf1f2ef0075016767112 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Sat, 22 Feb 2020 23:22:19 -0700 Subject: [PATCH 105/400] io_uring: handle multiple personalities in link chains If we have a chain of requests and they don't all use the same credentials, then the head of the chain will be issued with the credentails of the tail of the chain. Ensure __io_queue_sqe() overrides the credentials, if they are different. Once we do that, we can clean up the creds handling as well, by only having io_submit_sqe() do the lookup of a personality. It doesn't need to assign it, since __io_queue_sqe() now always does the right thing. Fixes: 75c6a03904e0 ("io_uring: support using a registered personality for commands") Reported-by: Pavel Begunkov Signed-off-by: Jens Axboe --- fs/io_uring.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index de650df9ac53..7d0be264527d 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -4705,11 +4705,21 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct io_kiocb *linked_timeout; struct io_kiocb *nxt = NULL; + const struct cred *old_creds = NULL; int ret; again: linked_timeout = io_prep_linked_timeout(req); + if (req->work.creds && req->work.creds != current_cred()) { + if (old_creds) + revert_creds(old_creds); + if (old_creds == req->work.creds) + old_creds = NULL; /* restored original creds */ + else + old_creds = override_creds(req->work.creds); + } + ret = io_issue_sqe(req, sqe, &nxt, true); /* @@ -4759,6 +4769,8 @@ done_req: goto punt; goto again; } + if (old_creds) + revert_creds(old_creds); } static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe) @@ -4803,7 +4815,6 @@ static inline void io_queue_link_head(struct io_kiocb *req) static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe, struct io_submit_state *state, struct io_kiocb **link) { - const struct cred *old_creds = NULL; struct io_ring_ctx *ctx = req->ctx; unsigned int sqe_flags; int ret, id; @@ -4818,14 +4829,12 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe, id = READ_ONCE(sqe->personality); if (id) { - const struct cred *personality_creds; - - personality_creds = idr_find(&ctx->personality_idr, id); - if (unlikely(!personality_creds)) { + req->work.creds = idr_find(&ctx->personality_idr, id); + if (unlikely(!req->work.creds)) { ret = -EINVAL; goto err_req; } - old_creds = override_creds(personality_creds); + get_cred(req->work.creds); } /* same numerical values with corresponding REQ_F_*, safe to copy */ @@ -4837,8 +4846,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe, err_req: io_cqring_add_event(req, ret); io_double_put_req(req); - if (old_creds) - revert_creds(old_creds); return false; } @@ -4899,8 +4906,6 @@ err_req: } } - if (old_creds) - revert_creds(old_creds); return true; } From c77ec025346f122bb74719725331c6386bbb86b7 Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Sat, 22 Feb 2020 10:00:04 +0100 Subject: [PATCH 106/400] docs: adm1177: fix a broken reference This reference was missing the .rst extension. This would be OK if it were using the :doc: directive. So, switch to it. As a side effect, this will create cross-reference links at html output. Signed-off-by: Mauro Carvalho Chehab Link: https://lore.kernel.org/r/8d37f465888656224855a21f5bb01edb1ca66cf3.1582361738.git.mchehab+huawei@kernel.org Signed-off-by: Guenter Roeck --- Documentation/hwmon/adm1177.rst | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/Documentation/hwmon/adm1177.rst b/Documentation/hwmon/adm1177.rst index c81e0b4abd28..471be1e98d6f 100644 --- a/Documentation/hwmon/adm1177.rst +++ b/Documentation/hwmon/adm1177.rst @@ -20,8 +20,7 @@ Usage Notes ----------- This driver does not auto-detect devices. You will have to instantiate the -devices explicitly. Please see Documentation/i2c/instantiating-devices for -details. +devices explicitly. Please see :doc:`/i2c/instantiating-devices` for details. Sysfs entries From 52df1e564eb0470b2ecd1481e457932f09f49f5d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jonathan=20Neusch=C3=A4fer?= Date: Sun, 23 Feb 2020 18:46:31 +0100 Subject: [PATCH 107/400] docs: networking: phy: Rephrase paragraph for clarity MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Let's make it a little easier to read. Signed-off-by: Jonathan Neuschäfer Signed-off-by: David S. Miller --- Documentation/networking/phy.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst index 1e4735cc0553..256106054c8c 100644 --- a/Documentation/networking/phy.rst +++ b/Documentation/networking/phy.rst @@ -487,8 +487,9 @@ phy_register_fixup_for_id():: The stubs set one of the two matching criteria, and set the other one to match anything. -When phy_register_fixup() or \*_for_uid()/\*_for_id() is called at module, -unregister fixup and free allocate memory are required. +When phy_register_fixup() or \*_for_uid()/\*_for_id() is called at module load +time, the module needs to unregister the fixup and free allocated memory when +it's unloaded. Call one of following function before unloading module:: From 44343418d0f2f623cb9da6f5000df793131cbe3b Mon Sep 17 00:00:00 2001 From: Marek Vasut Date: Sun, 23 Feb 2020 14:38:40 +0100 Subject: [PATCH 108/400] net: ks8851-ml: Fix IRQ handling and locking The KS8851 requires that packet RX and TX are mutually exclusive. Currently, the driver hopes to achieve this by disabling interrupt from the card by writing the card registers and by disabling the interrupt on the interrupt controller. This however is racy on SMP. Replace this approach by expanding the spinlock used around the ks_start_xmit() TX path to ks_irq() RX path to assure true mutual exclusion and remove the interrupt enabling/disabling, which is now not needed anymore. Furthermore, disable interrupts also in ks_net_stop(), which was missing before. Note that a massive improvement here would be to re-use the KS8851 driver approach, which is to move the TX path into a worker thread, interrupt handling to threaded interrupt, and synchronize everything with mutexes, but that would be a much bigger rework, for a separate patch. Signed-off-by: Marek Vasut Cc: David S. Miller Cc: Lukas Wunner Cc: Petr Stetiar Cc: YueHaibing Signed-off-by: David S. Miller --- drivers/net/ethernet/micrel/ks8851_mll.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/micrel/ks8851_mll.c b/drivers/net/ethernet/micrel/ks8851_mll.c index 1c9e70c8cc30..58579baf3f7a 100644 --- a/drivers/net/ethernet/micrel/ks8851_mll.c +++ b/drivers/net/ethernet/micrel/ks8851_mll.c @@ -513,14 +513,17 @@ static irqreturn_t ks_irq(int irq, void *pw) { struct net_device *netdev = pw; struct ks_net *ks = netdev_priv(netdev); + unsigned long flags; u16 status; + spin_lock_irqsave(&ks->statelock, flags); /*this should be the first in IRQ handler */ ks_save_cmd_reg(ks); status = ks_rdreg16(ks, KS_ISR); if (unlikely(!status)) { ks_restore_cmd_reg(ks); + spin_unlock_irqrestore(&ks->statelock, flags); return IRQ_NONE; } @@ -546,6 +549,7 @@ static irqreturn_t ks_irq(int irq, void *pw) ks->netdev->stats.rx_over_errors++; /* this should be the last in IRQ handler*/ ks_restore_cmd_reg(ks); + spin_unlock_irqrestore(&ks->statelock, flags); return IRQ_HANDLED; } @@ -615,6 +619,7 @@ static int ks_net_stop(struct net_device *netdev) /* shutdown RX/TX QMU */ ks_disable_qmu(ks); + ks_disable_int(ks); /* set powermode to soft power down to save power */ ks_set_powermode(ks, PMECR_PM_SOFTDOWN); @@ -671,10 +676,9 @@ static netdev_tx_t ks_start_xmit(struct sk_buff *skb, struct net_device *netdev) { netdev_tx_t retv = NETDEV_TX_OK; struct ks_net *ks = netdev_priv(netdev); + unsigned long flags; - disable_irq(netdev->irq); - ks_disable_int(ks); - spin_lock(&ks->statelock); + spin_lock_irqsave(&ks->statelock, flags); /* Extra space are required: * 4 byte for alignment, 4 for status/length, 4 for CRC @@ -688,9 +692,7 @@ static netdev_tx_t ks_start_xmit(struct sk_buff *skb, struct net_device *netdev) dev_kfree_skb(skb); } else retv = NETDEV_TX_BUSY; - spin_unlock(&ks->statelock); - ks_enable_int(ks); - enable_irq(netdev->irq); + spin_unlock_irqrestore(&ks->statelock, flags); return retv; } From 503ba7c6961034ff0047707685644cad9287c226 Mon Sep 17 00:00:00 2001 From: Florian Fainelli Date: Thu, 20 Feb 2020 15:34:53 -0800 Subject: [PATCH 109/400] net: phy: Avoid multiple suspends It is currently possible for a PHY device to be suspended as part of a network device driver's suspend call while it is still being attached to that net_device, either via phy_suspend() or implicitly via phy_stop(). Later on, when the MDIO bus controller get suspended, we would attempt to suspend again the PHY because it is still attached to a network device. This is both a waste of time and creates an opportunity for improper clock/power management bugs to creep in. Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller --- drivers/net/phy/phy_device.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 6a5056e0ae77..6131aca79823 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -247,7 +247,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev) * MDIO bus driver and clock gated at this point. */ if (!netdev) - return !phydev->suspended; + goto out; if (netdev->wol_enabled) return false; @@ -267,7 +267,8 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev) if (device_may_wakeup(&netdev->dev)) return false; - return true; +out: + return !phydev->suspended; } static int mdio_bus_phy_suspend(struct device *dev) From 6132c1d9033d158bd0464e90bc46544fbe0bd6bc Mon Sep 17 00:00:00 2001 From: Madhuparna Bhowmik Date: Sun, 23 Feb 2020 16:52:33 +0530 Subject: [PATCH 110/400] net: core: devlink.c: Hold devlink->lock from the beginning of devlink_dpipe_table_register() devlink_dpipe_table_find() should be called under either rcu_read_lock() or devlink->lock. devlink_dpipe_table_register() calls devlink_dpipe_table_find() without holding the lock and acquires it later. Therefore hold the devlink->lock from the beginning of devlink_dpipe_table_register(). Suggested-by: Jiri Pirko Signed-off-by: Madhuparna Bhowmik Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller --- net/core/devlink.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/net/core/devlink.c b/net/core/devlink.c index 549ee56b7a21..8d0b558be942 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -6878,26 +6878,33 @@ int devlink_dpipe_table_register(struct devlink *devlink, void *priv, bool counter_control_extern) { struct devlink_dpipe_table *table; - - if (devlink_dpipe_table_find(&devlink->dpipe_table_list, table_name)) - return -EEXIST; + int err = 0; if (WARN_ON(!table_ops->size_get)) return -EINVAL; + mutex_lock(&devlink->lock); + + if (devlink_dpipe_table_find(&devlink->dpipe_table_list, table_name)) { + err = -EEXIST; + goto unlock; + } + table = kzalloc(sizeof(*table), GFP_KERNEL); - if (!table) - return -ENOMEM; + if (!table) { + err = -ENOMEM; + goto unlock; + } table->name = table_name; table->table_ops = table_ops; table->priv = priv; table->counter_control_extern = counter_control_extern; - mutex_lock(&devlink->lock); list_add_tail_rcu(&table->list, &devlink->dpipe_table_list); +unlock: mutex_unlock(&devlink->lock); - return 0; + return err; } EXPORT_SYMBOL_GPL(devlink_dpipe_table_register); From e3ae39edbce6dc933fb1393490d1b5d76d3edb90 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Mon, 24 Feb 2020 09:38:15 +0100 Subject: [PATCH 111/400] nl80211: explicitly include if_vlan.h We use that here, and do seem to get it through some recursive include, but better include it explicitly. Signed-off-by: Johannes Berg Link: https://lore.kernel.org/r/20200224093814.1b9c258fec67.I45ac150d4e11c72eb263abec9f1f0c7add9bef2b@changeid Signed-off-by: Johannes Berg --- net/wireless/nl80211.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c index 46be40e19e7f..5b19e9fac4aa 100644 --- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include From 253216ffb2a002a682c6f68bd3adff5b98b71de8 Mon Sep 17 00:00:00 2001 From: Madhuparna Bhowmik Date: Sun, 23 Feb 2020 20:03:02 +0530 Subject: [PATCH 112/400] mac80211: rx: avoid RCU list traversal under mutex local->sta_mtx is held in __ieee80211_check_fast_rx_iface(). No need to use list_for_each_entry_rcu() as it also requires a cond argument to avoid false lockdep warnings when not used in RCU read-side section (with CONFIG_PROVE_RCU_LIST). Therefore use list_for_each_entry(); Signed-off-by: Madhuparna Bhowmik Link: https://lore.kernel.org/r/20200223143302.15390-1-madhuparnabhowmik10@gmail.com Signed-off-by: Johannes Berg --- net/mac80211/rx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index 0e05ff037672..0ba98ad9bc85 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -4114,7 +4114,7 @@ void __ieee80211_check_fast_rx_iface(struct ieee80211_sub_if_data *sdata) lockdep_assert_held(&local->sta_mtx); - list_for_each_entry_rcu(sta, &local->sta_list, list) { + list_for_each_entry(sta, &local->sta_list, list) { if (sdata != sta->sdata && (!sta->sdata->bss || sta->sdata->bss != sdata->bss)) continue; From 3eb55e6f753a379e293395de8d5f3be28351a7f8 Mon Sep 17 00:00:00 2001 From: Tina Zhang Date: Fri, 21 Feb 2020 10:32:34 +0800 Subject: [PATCH 113/400] drm/i915/gvt: Separate display reset from ALL_ENGINES reset ALL_ENGINES reset doesn't clobber display with the current gvt-g supported platforms. Thus ALL_ENGINES reset shouldn't reset the display engine registers emulated by gvt-g. This fixes guest warning like [ 14.622026] [drm] Initialized i915 1.6.0 20200114 for 0000:00:03.0 on minor 0 [ 14.967917] fbcon: i915drmfb (fb0) is primary device [ 25.100188] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] E RROR [CRTC:51:pipe A] flip_done timed out [ 25.100860] -----------[ cut here ]----------- [ 25.100861] pll on state mismatch (expected 0, found 1) [ 25.101024] WARNING: CPU: 1 PID: 30 at drivers/gpu/drm/i915/display/intel_dis play.c:14382 verify_single_dpll_state.isra.115+0x28f/0x320 [i915] [ 25.101025] Modules linked in: intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915 aesni_intel cr ypto_simd cryptd glue_helper cec rc_core video drm_kms_helper joydev drm input_l eds i2c_algo_bit serio_raw fb_sys_fops syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 e1000 psmouse i2c_piix4 pata_acpi floppy [ 25.101052] CPU: 1 PID: 30 Comm: kworker/u4:1 Not tainted 5.5.0+ #1 [ 25.101053] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1 .12.1-0-ga5cab58 04/01/2014 [ 25.101055] Workqueue: events_unbound async_run_entry_fn [ 25.101092] RIP: 0010:verify_single_dpll_state.isra.115+0x28f/0x320 [i915] [ 25.101093] Code: e0 d9 ff e9 a3 fe ff ff 80 3d e9 c2 11 00 00 44 89 f6 48 c7 c7 c0 9d 88 c0 75 3b e8 eb df d9 ff e9 c7 fe ff ff e8 d1 e0 ae c4 <0f> 0b e9 7a fe ff ff 80 3d c0 c2 11 00 00 8d 71 41 89 c2 48 c7 c7 [ 25.101093] RSP: 0018:ffffb1de80107878 EFLAGS: 00010286 [ 25.101094] RAX: 0000000000000000 RBX: ffffb1de80107884 RCX: 0000000000000007 [ 25.101095] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff94fdfdd19740 [ 25.101095] RBP: ffffb1de80107938 R08: 0000000d6bfdc7b4 R09: 000000000000002b [ 25.101096] R10: ffff94fdf82dc000 R11: 0000000000000225 R12: 00000000000001f8 [ 25.101096] R13: ffff94fdb3ca6a90 R14: ffff94fdb3ca0000 R15: 0000000000000000 [ 25.101097] FS: 0000000000000000(0000) GS:ffff94fdfdd00000(0000) knlGS:00000 00000000000 [ 25.101098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 25.101098] CR2: 00007fbc3e2be9c8 CR3: 000000003339a003 CR4: 0000000000360ee0 [ 25.101101] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 25.101101] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 25.101102] Call Trace: [ 25.101139] intel_atomic_commit_tail+0xde4/0x1520 [i915] [ 25.101141] ? flush_workqueue_prep_pwqs+0xfa/0x130 [ 25.101142] ? flush_workqueue+0x198/0x3c0 [ 25.101174] intel_atomic_commit+0x2ad/0x320 [i915] [ 25.101209] drm_atomic_commit+0x4a/0x50 [drm] [ 25.101220] drm_client_modeset_commit_atomic+0x1c4/0x200 [drm] [ 25.101231] drm_client_modeset_commit_force+0x47/0x170 [drm] [ 25.101250] drm_fb_helper_restore_fbdev_mode_unlocked+0x4e/0xa0 [drm_kms_hel per] [ 25.101255] drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper] [ 25.101287] intel_fbdev_set_par+0x1a/0x40 [i915] [ 25.101289] ? con_is_visible+0x2e/0x60 [ 25.101290] fbcon_init+0x378/0x600 [ 25.101292] visual_init+0xd5/0x130 [ 25.101296] do_bind_con_driver+0x217/0x430 [ 25.101297] do_take_over_console+0x7d/0x1b0 [ 25.101298] do_fbcon_takeover+0x5c/0xb0 [ 25.101299] fbcon_fb_registered+0x199/0x1a0 [ 25.101301] register_framebuffer+0x22c/0x330 [ 25.101306] __drm_fb_helper_initial_config_and_unlock+0x31a/0x520 [drm_kms_h elper] [ 25.101311] drm_fb_helper_initial_config+0x35/0x40 [drm_kms_helper] [ 25.101341] intel_fbdev_initial_config+0x18/0x30 [i915] [ 25.101342] async_run_entry_fn+0x3c/0x150 [ 25.101343] process_one_work+0x1fd/0x3f0 [ 25.101344] worker_thread+0x34/0x410 [ 25.101346] kthread+0x121/0x140 [ 25.101346] ? process_one_work+0x3f0/0x3f0 [ 25.101347] ? kthread_park+0x90/0x90 [ 25.101350] ret_from_fork+0x35/0x40 [ 25.101351] --[ end trace b5b47d44cd998ba1 ]-- Fixes: 6294b61ba769 ("drm/i915/gvt: add missing display part reset for vGPU reset") Signed-off-by: Tina Zhang Reviewed-by: Zhenyu Wang Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20200221023234.28635-1-tina.zhang@intel.com --- drivers/gpu/drm/i915/gvt/vgpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c index 85bd9bf4f6ee..487af6ea9972 100644 --- a/drivers/gpu/drm/i915/gvt/vgpu.c +++ b/drivers/gpu/drm/i915/gvt/vgpu.c @@ -560,9 +560,9 @@ void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr, intel_vgpu_reset_mmio(vgpu, dmlr); populate_pvinfo_page(vgpu); - intel_vgpu_reset_display(vgpu); if (dmlr) { + intel_vgpu_reset_display(vgpu); intel_vgpu_reset_cfg_space(vgpu); /* only reset the failsafe mode when dmlr reset */ vgpu->failsafe = false; From cb0cc635c7a9fa8a3a0f75d4d896721819c63add Mon Sep 17 00:00:00 2001 From: "Naveen N. Rao" Date: Thu, 20 Feb 2020 17:01:32 +0530 Subject: [PATCH 114/400] powerpc: Include .BTF section Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld: ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in section `.BTF' Include .BTF section in vmlinux explicitly to fix the same. Signed-off-by: Naveen N. Rao Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200220113132.857132-1-naveen.n.rao@linux.vnet.ibm.com --- arch/powerpc/kernel/vmlinux.lds.S | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S index b4c89a1acebb..a32d478a7f41 100644 --- a/arch/powerpc/kernel/vmlinux.lds.S +++ b/arch/powerpc/kernel/vmlinux.lds.S @@ -303,6 +303,12 @@ SECTIONS *(.branch_lt) } +#ifdef CONFIG_DEBUG_INFO_BTF + .BTF : AT(ADDR(.BTF) - LOAD_OFFSET) { + *(.BTF) + } +#endif + .opd : AT(ADDR(.opd) - LOAD_OFFSET) { __start_opd = .; KEEP(*(.opd)) From 34a818882e2f84704059dead1f02eb8943e222c3 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Fri, 24 Jan 2020 12:53:32 +0100 Subject: [PATCH 115/400] media: pulse8-cec: INIT_DELAYED_WORK was called too late If earlier in the connect() an error occurred, then pulse8_cec_adap_free was called by cec_delete_adapter, and that free function tried to cancel the ping_eeprom_work workqueue, but that workqueue hasn't been initialized yet, resulting in a kernel warning. Move the initialization of that workqueue up to where the other workqueues are initialized. Signed-off-by: Hans Verkuil Fixes: 601282d65b96 ("media: pulse8-cec: use adap_free callback") Signed-off-by: Mauro Carvalho Chehab --- drivers/media/usb/pulse8-cec/pulse8-cec.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/media/usb/pulse8-cec/pulse8-cec.c b/drivers/media/usb/pulse8-cec/pulse8-cec.c index afda438d4e0a..8d61bcec14bb 100644 --- a/drivers/media/usb/pulse8-cec/pulse8-cec.c +++ b/drivers/media/usb/pulse8-cec/pulse8-cec.c @@ -840,6 +840,8 @@ static int pulse8_connect(struct serio *serio, struct serio_driver *drv) serio_set_drvdata(serio, pulse8); INIT_WORK(&pulse8->irq_work, pulse8_irq_work_handler); INIT_WORK(&pulse8->tx_work, pulse8_tx_work_handler); + INIT_DELAYED_WORK(&pulse8->ping_eeprom_work, + pulse8_ping_eeprom_work_handler); mutex_init(&pulse8->lock); spin_lock_init(&pulse8->msg_lock); pulse8->config_pending = false; @@ -865,8 +867,6 @@ static int pulse8_connect(struct serio *serio, struct serio_driver *drv) pulse8->restoring_config = true; } - INIT_DELAYED_WORK(&pulse8->ping_eeprom_work, - pulse8_ping_eeprom_work_handler); schedule_delayed_work(&pulse8->ping_eeprom_work, PING_PERIOD); return 0; From aa9eda76129c9f44c4dd7e233b04bc70c0f56e12 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Fri, 24 Jan 2020 15:52:38 +0100 Subject: [PATCH 116/400] media: pulse8-cec: close serio in disconnect, not adap_free The serio_close() call was moved to pulse8_cec_adap_free(), but that can be too late if that is called after the serio core pulled down the serio already, in which case you get a kernel oops. Keep it in the disconnect(). Signed-off-by: Hans Verkuil Fixes: 601282d65b96 ("media: pulse8-cec: use adap_free callback") Signed-off-by: Mauro Carvalho Chehab --- drivers/media/usb/pulse8-cec/pulse8-cec.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/media/usb/pulse8-cec/pulse8-cec.c b/drivers/media/usb/pulse8-cec/pulse8-cec.c index 8d61bcec14bb..0655aa9ecf28 100644 --- a/drivers/media/usb/pulse8-cec/pulse8-cec.c +++ b/drivers/media/usb/pulse8-cec/pulse8-cec.c @@ -635,8 +635,6 @@ static void pulse8_cec_adap_free(struct cec_adapter *adap) cancel_delayed_work_sync(&pulse8->ping_eeprom_work); cancel_work_sync(&pulse8->irq_work); cancel_work_sync(&pulse8->tx_work); - serio_close(pulse8->serio); - serio_set_drvdata(pulse8->serio, NULL); kfree(pulse8); } @@ -652,6 +650,9 @@ static void pulse8_disconnect(struct serio *serio) struct pulse8 *pulse8 = serio_get_drvdata(serio); cec_unregister_adapter(pulse8->adap); + pulse8->serio = NULL; + serio_set_drvdata(serio, NULL); + serio_close(serio); } static int pulse8_setup(struct pulse8 *pulse8, struct serio *serio, @@ -872,10 +873,11 @@ static int pulse8_connect(struct serio *serio, struct serio_driver *drv) return 0; close_serio: + pulse8->serio = NULL; + serio_set_drvdata(serio, NULL); serio_close(serio); delete_adap: cec_delete_adapter(pulse8->adap); - serio_set_drvdata(serio, NULL); free_device: kfree(pulse8); return err; From 49a56266f96f2c6608373464af8755b431ef1513 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Tue, 4 Feb 2020 13:45:04 +0100 Subject: [PATCH 117/400] media: vicodec: process all 4 components for RGB32 formats Only ARGB32-type pixelformat were assumed to have 4 components, which is wrong since RGB32-type pixelformats may have an alpha channel, so they should also assume 4 color components. The XRGB32-type pixelformats really have only 3 color components, but this complicated matters since that creates strides that are sometimes width * 3 and sometimes width * 4, and in fact this can result in buffer overflows. Keep things simple by just always processing all 4 color components. In the future we might want to optimize this again for the XRGB32-type pixelformats, but for now keep it simple and robust. Signed-off-by: Hans Verkuil Cc: # for v5.4 and up Signed-off-by: Mauro Carvalho Chehab --- .../media/platform/vicodec/codec-v4l2-fwht.c | 34 +++++-------------- 1 file changed, 9 insertions(+), 25 deletions(-) diff --git a/drivers/media/platform/vicodec/codec-v4l2-fwht.c b/drivers/media/platform/vicodec/codec-v4l2-fwht.c index 3c93d9232c3c..b6e39fbd8ad5 100644 --- a/drivers/media/platform/vicodec/codec-v4l2-fwht.c +++ b/drivers/media/platform/vicodec/codec-v4l2-fwht.c @@ -27,17 +27,17 @@ static const struct v4l2_fwht_pixfmt_info v4l2_fwht_pixfmts[] = { { V4L2_PIX_FMT_BGR24, 3, 3, 1, 3, 3, 1, 1, 3, 1, FWHT_FL_PIXENC_RGB}, { V4L2_PIX_FMT_RGB24, 3, 3, 1, 3, 3, 1, 1, 3, 1, FWHT_FL_PIXENC_RGB}, { V4L2_PIX_FMT_HSV24, 3, 3, 1, 3, 3, 1, 1, 3, 1, FWHT_FL_PIXENC_HSV}, - { V4L2_PIX_FMT_BGR32, 4, 4, 1, 4, 4, 1, 1, 3, 1, FWHT_FL_PIXENC_RGB}, - { V4L2_PIX_FMT_XBGR32, 4, 4, 1, 4, 4, 1, 1, 3, 1, FWHT_FL_PIXENC_RGB}, + { V4L2_PIX_FMT_BGR32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, + { V4L2_PIX_FMT_XBGR32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, { V4L2_PIX_FMT_ABGR32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, - { V4L2_PIX_FMT_RGB32, 4, 4, 1, 4, 4, 1, 1, 3, 1, FWHT_FL_PIXENC_RGB}, - { V4L2_PIX_FMT_XRGB32, 4, 4, 1, 4, 4, 1, 1, 3, 1, FWHT_FL_PIXENC_RGB}, + { V4L2_PIX_FMT_RGB32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, + { V4L2_PIX_FMT_XRGB32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, { V4L2_PIX_FMT_ARGB32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, - { V4L2_PIX_FMT_BGRX32, 4, 4, 1, 4, 4, 1, 1, 3, 1, FWHT_FL_PIXENC_RGB}, + { V4L2_PIX_FMT_BGRX32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, { V4L2_PIX_FMT_BGRA32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, - { V4L2_PIX_FMT_RGBX32, 4, 4, 1, 4, 4, 1, 1, 3, 1, FWHT_FL_PIXENC_RGB}, + { V4L2_PIX_FMT_RGBX32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, { V4L2_PIX_FMT_RGBA32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_RGB}, - { V4L2_PIX_FMT_HSV32, 4, 4, 1, 4, 4, 1, 1, 3, 1, FWHT_FL_PIXENC_HSV}, + { V4L2_PIX_FMT_HSV32, 4, 4, 1, 4, 4, 1, 1, 4, 1, FWHT_FL_PIXENC_HSV}, { V4L2_PIX_FMT_GREY, 1, 1, 1, 1, 0, 1, 1, 1, 1, FWHT_FL_PIXENC_RGB}, }; @@ -175,22 +175,14 @@ static int prepare_raw_frame(struct fwht_raw_frame *rf, case V4L2_PIX_FMT_RGB32: case V4L2_PIX_FMT_XRGB32: case V4L2_PIX_FMT_HSV32: - rf->cr = rf->luma + 1; - rf->cb = rf->cr + 2; - rf->luma += 2; - break; - case V4L2_PIX_FMT_BGR32: - case V4L2_PIX_FMT_XBGR32: - rf->cb = rf->luma; - rf->cr = rf->cb + 2; - rf->luma++; - break; case V4L2_PIX_FMT_ARGB32: rf->alpha = rf->luma; rf->cr = rf->luma + 1; rf->cb = rf->cr + 2; rf->luma += 2; break; + case V4L2_PIX_FMT_BGR32: + case V4L2_PIX_FMT_XBGR32: case V4L2_PIX_FMT_ABGR32: rf->cb = rf->luma; rf->cr = rf->cb + 2; @@ -198,10 +190,6 @@ static int prepare_raw_frame(struct fwht_raw_frame *rf, rf->alpha = rf->cr + 1; break; case V4L2_PIX_FMT_BGRX32: - rf->cb = rf->luma + 1; - rf->cr = rf->cb + 2; - rf->luma += 2; - break; case V4L2_PIX_FMT_BGRA32: rf->alpha = rf->luma; rf->cb = rf->luma + 1; @@ -209,10 +197,6 @@ static int prepare_raw_frame(struct fwht_raw_frame *rf, rf->luma += 2; break; case V4L2_PIX_FMT_RGBX32: - rf->cr = rf->luma; - rf->cb = rf->cr + 2; - rf->luma++; - break; case V4L2_PIX_FMT_RGBA32: rf->alpha = rf->luma + 3; rf->cr = rf->luma; From 316e730f1d8bb029fe6cec2468fb2a50424485b3 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Tue, 4 Feb 2020 19:13:06 +0100 Subject: [PATCH 118/400] media: v4l2-mem2mem.c: fix broken links The topology that v4l2_m2m_register_media_controller() creates for a processing block actually created a source-to-source link and a sink-to-sink link instead of two source-to-sink links. Unfortunately v4l2-compliance never checked for such bad links, so this went unreported for quite some time. Signed-off-by: Hans Verkuil Reported-by: Nicolas Dufresne Cc: # for v4.19 and up Signed-off-by: Mauro Carvalho Chehab --- drivers/media/v4l2-core/v4l2-mem2mem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c index 1afd9c6ad908..cc34c5ab7009 100644 --- a/drivers/media/v4l2-core/v4l2-mem2mem.c +++ b/drivers/media/v4l2-core/v4l2-mem2mem.c @@ -880,12 +880,12 @@ int v4l2_m2m_register_media_controller(struct v4l2_m2m_dev *m2m_dev, goto err_rel_entity1; /* Connect the three entities */ - ret = media_create_pad_link(m2m_dev->source, 0, &m2m_dev->proc, 1, + ret = media_create_pad_link(m2m_dev->source, 0, &m2m_dev->proc, 0, MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED); if (ret) goto err_rel_entity2; - ret = media_create_pad_link(&m2m_dev->proc, 0, &m2m_dev->sink, 0, + ret = media_create_pad_link(&m2m_dev->proc, 1, &m2m_dev->sink, 0, MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED); if (ret) goto err_rm_links0; From 044041cd5227ec9ccf969f4bf1cc08bffe13b9d3 Mon Sep 17 00:00:00 2001 From: Hans Verkuil Date: Tue, 4 Feb 2020 19:19:22 +0100 Subject: [PATCH 119/400] media: mc-entity.c: use & to check pad flags, not == These are bits so to test if a pad is a sink you use & but not ==. It looks like the only reason this hasn't caused problems before is that media_get_pad_index() is currently only used with pads that do not set the MEDIA_PAD_FL_MUST_CONNECT flag. So a pad really had only the SINK or SOURCE flag set and nothing else. Signed-off-by: Hans Verkuil Cc: # for v5.3 and up Signed-off-by: Mauro Carvalho Chehab --- drivers/media/mc/mc-entity.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/media/mc/mc-entity.c b/drivers/media/mc/mc-entity.c index 7c429ce98bae..668770e9f609 100644 --- a/drivers/media/mc/mc-entity.c +++ b/drivers/media/mc/mc-entity.c @@ -639,9 +639,9 @@ int media_get_pad_index(struct media_entity *entity, bool is_sink, return -EINVAL; for (i = 0; i < entity->num_pads; i++) { - if (entity->pads[i].flags == MEDIA_PAD_FL_SINK) + if (entity->pads[i].flags & MEDIA_PAD_FL_SINK) pad_is_sink = true; - else if (entity->pads[i].flags == MEDIA_PAD_FL_SOURCE) + else if (entity->pads[i].flags & MEDIA_PAD_FL_SOURCE) pad_is_sink = false; else continue; /* This is an error! */ From d171c45da874e3858a83e6377e00280a507fe2f2 Mon Sep 17 00:00:00 2001 From: Ezequiel Garcia Date: Tue, 4 Feb 2020 20:38:37 +0100 Subject: [PATCH 120/400] media: hantro: Fix broken media controller links The driver currently creates a broken topology, with a source-to-source link and a sink-to-sink link instead of two source-to-sink links. Reported-by: Nicolas Dufresne Cc: # for v5.3 and up Signed-off-by: Ezequiel Garcia Tested-by: Nicolas Dufresne Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab --- drivers/staging/media/hantro/hantro_drv.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c index 97c615a2f057..c98835326135 100644 --- a/drivers/staging/media/hantro/hantro_drv.c +++ b/drivers/staging/media/hantro/hantro_drv.c @@ -558,13 +558,13 @@ static int hantro_attach_func(struct hantro_dev *vpu, goto err_rel_entity1; /* Connect the three entities */ - ret = media_create_pad_link(&func->vdev.entity, 0, &func->proc, 1, + ret = media_create_pad_link(&func->vdev.entity, 0, &func->proc, 0, MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED); if (ret) goto err_rel_entity2; - ret = media_create_pad_link(&func->proc, 0, &func->sink, 0, + ret = media_create_pad_link(&func->proc, 1, &func->sink, 0, MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED); if (ret) From fbb30168c7395b9cfeb9e6f7b0c0bca854a6552d Mon Sep 17 00:00:00 2001 From: John Bates Date: Thu, 20 Feb 2020 14:53:19 -0800 Subject: [PATCH 121/400] drm/virtio: fix resource id creation race The previous code was not thread safe and caused undefined behavior from spurious duplicate resource IDs. In this patch, an atomic_t is used instead. We no longer see any duplicate IDs in tests with this change. Fixes: 16065fcdd19d ("drm/virtio: do NOT reuse resource ids") Signed-off-by: John Bates Reviewed-by: Chia-I Wu Link: http://patchwork.freedesktop.org/patch/msgid/20200220225319.45621-1-jbates@chromium.org Signed-off-by: Gerd Hoffmann --- drivers/gpu/drm/virtio/virtgpu_object.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c index 017a9e0fc3bb..890121a45625 100644 --- a/drivers/gpu/drm/virtio/virtgpu_object.c +++ b/drivers/gpu/drm/virtio/virtgpu_object.c @@ -42,8 +42,8 @@ static int virtio_gpu_resource_id_get(struct virtio_gpu_device *vgdev, * "f91a9dd35715 Fix unlinking resources from hash * table." (Feb 2019) fixes the bug. */ - static int handle; - handle++; + static atomic_t seqno = ATOMIC_INIT(0); + int handle = atomic_inc_return(&seqno); *resid = handle + 1; } else { int handle = ida_alloc(&vgdev->resource_ida, GFP_KERNEL); From 41726c9a50e7464beca7112d0aebf3a0090c62d2 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Sun, 23 Feb 2020 13:11:42 -0700 Subject: [PATCH 122/400] io_uring: fix personality idr leak We somehow never free the idr, even though we init it for every ctx. Free it when the rest of the ring data is freed. Fixes: 071698e13ac6 ("io_uring: allow registering credentials") Reviewed-by: Stefano Garzarella Signed-off-by: Jens Axboe --- fs/io_uring.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/io_uring.c b/fs/io_uring.c index 7d0be264527d..d961945cb332 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6339,6 +6339,7 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx) io_sqe_buffer_unregister(ctx); io_sqe_files_unregister(ctx); io_eventfd_unregister(ctx); + idr_destroy(&ctx->personality_idr); #if defined(CONFIG_UNIX) if (ctx->ring_sock) { From 36d5d22090d13fd3a7a8c9663a711cbe6970aac8 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Mon, 17 Feb 2020 17:40:50 +0300 Subject: [PATCH 123/400] dmaengine: coh901318: Fix a double lock bug in dma_tc_handle() The caller is already holding the lock so this will deadlock. Fixes: 0b58828c923e ("DMAENGINE: COH 901 318 remove irq counting") Signed-off-by: Dan Carpenter Link: https://lore.kernel.org/r/20200217144050.3i4ymbytogod4ijn@kili.mountain Signed-off-by: Vinod Koul --- drivers/dma/coh901318.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/dma/coh901318.c b/drivers/dma/coh901318.c index e51d836afcc7..1092d4ce723e 100644 --- a/drivers/dma/coh901318.c +++ b/drivers/dma/coh901318.c @@ -1947,8 +1947,6 @@ static void dma_tc_handle(struct coh901318_chan *cohc) return; } - spin_lock(&cohc->lock); - /* * When we reach this point, at least one queue item * should have been moved over from cohc->queue to @@ -1969,8 +1967,6 @@ static void dma_tc_handle(struct coh901318_chan *cohc) if (coh901318_queue_start(cohc) == NULL) cohc->busy = 0; - spin_unlock(&cohc->lock); - /* * This tasklet will remove items from cohc->active * and thus terminates them. From 88402c5b1ba7498217027c8a54e8df61d030500c Mon Sep 17 00:00:00 2001 From: Dave Jiang Date: Wed, 19 Feb 2020 10:24:08 -0700 Subject: [PATCH 124/400] dmaengine: idxd: sysfs input of wq incorrect wq type should return error Currently when inputing an unrecognized wq type, we set the wq type to "none". It really should return error and not change the existing wq type that's in the kernel. Fixes: c52ca478233c ("dmaengine: idxd: add configuration component of driver") Reported-by: Yixin Zhang Signed-off-by: Dave Jiang Link: https://lore.kernel.org/r/158213304803.2290.13336343633425868211.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul --- drivers/dma/idxd/sysfs.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c index edbfe83325eb..b4a7885e79ed 100644 --- a/drivers/dma/idxd/sysfs.c +++ b/drivers/dma/idxd/sysfs.c @@ -1002,12 +1002,14 @@ static ssize_t wq_type_store(struct device *dev, return -EPERM; old_type = wq->type; - if (sysfs_streq(buf, idxd_wq_type_names[IDXD_WQT_KERNEL])) + if (sysfs_streq(buf, idxd_wq_type_names[IDXD_WQT_NONE])) + wq->type = IDXD_WQT_NONE; + else if (sysfs_streq(buf, idxd_wq_type_names[IDXD_WQT_KERNEL])) wq->type = IDXD_WQT_KERNEL; else if (sysfs_streq(buf, idxd_wq_type_names[IDXD_WQT_USER])) wq->type = IDXD_WQT_USER; else - wq->type = IDXD_WQT_NONE; + return -EINVAL; /* If we are changing queue type, clear the name */ if (wq->type != old_type) From 50e7e7f6f2d040dd16a636f408eab9184abc63f8 Mon Sep 17 00:00:00 2001 From: Dave Jiang Date: Wed, 19 Feb 2020 10:24:56 -0700 Subject: [PATCH 125/400] dmaengine: idxd: wq size configuration needs to check global max size The current size_store() function for idxd sysfs does not check the total wq size. This allows configuration of all wqs with total wq size. Add check to make sure the wq sysfs attribute rejects storing of size over the total wq size. Fixes: c52ca478233c ("dmaengine: idxd: add configuration component of driver") Reported-by: Jerry Chen Signed-off-by: Dave Jiang Link: https://lore.kernel.org/r/158213309629.2509.3583411832507185041.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul --- drivers/dma/idxd/sysfs.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c index b4a7885e79ed..6ca6e520a2fa 100644 --- a/drivers/dma/idxd/sysfs.c +++ b/drivers/dma/idxd/sysfs.c @@ -904,6 +904,20 @@ static ssize_t wq_size_show(struct device *dev, struct device_attribute *attr, return sprintf(buf, "%u\n", wq->size); } +static int total_claimed_wq_size(struct idxd_device *idxd) +{ + int i; + int wq_size = 0; + + for (i = 0; i < idxd->max_wqs; i++) { + struct idxd_wq *wq = &idxd->wqs[i]; + + wq_size += wq->size; + } + + return wq_size; +} + static ssize_t wq_size_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) @@ -923,7 +937,7 @@ static ssize_t wq_size_store(struct device *dev, if (wq->state != IDXD_WQ_DISABLED) return -EPERM; - if (size > idxd->max_wq_size) + if (size + total_claimed_wq_size(idxd) - wq->size > idxd->max_wq_size) return -EINVAL; wq->size = size; From 3104abd1161bd9cfcad91a29fea72b87d813cf48 Mon Sep 17 00:00:00 2001 From: Lukas Bulwahn Date: Sun, 23 Feb 2020 10:09:50 +0100 Subject: [PATCH 126/400] MAINTAINERS: clean up PCIE DRIVER FOR CAVIUM THUNDERX Commit e1ac611f57c9 ("dt-bindings: PCI: Convert generic host binding to DT schema") combines all information from pci-thunder-{pem,ecam}.txt into host-generic-pci.yaml, and deleted the two files in Documentation/devicetree/bindings/pci/. Since then, ./scripts/get_maintainer.pl --self-test complains: no file matches F: Documentation/devicetree/bindings/pci/pci-thunder-* As the PCIE DRIVER FOR CAVIUM THUNDERX-relevant information is only a small part of the host-generic-pci.yaml, do not add this file to the PCIE DRIVER FOR CAVIUM THUNDERX entry, and only drop the reference to the removed files. Signed-off-by: Lukas Bulwahn Acked-by: Robert Richter Signed-off-by: Rob Herring --- MAINTAINERS | 1 - 1 file changed, 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 38fe2f3f7b6f..155611d1a4b2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -12953,7 +12953,6 @@ M: Robert Richter L: linux-pci@vger.kernel.org L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers) S: Supported -F: Documentation/devicetree/bindings/pci/pci-thunder-* F: drivers/pci/controller/pci-thunder-* PCIE DRIVER FOR HISILICON From d288bddd8374e0a043ac9dde64a1ae6a09411d74 Mon Sep 17 00:00:00 2001 From: Martin Fuzzey Date: Wed, 29 Jan 2020 14:40:06 +0100 Subject: [PATCH 127/400] dmaengine: imx-sdma: fix context cache There is a DMA problem with the serial ports on i.MX6. When the following sequence is performed: 1) Open a port 2) Write some data 3) Close the port 4) Open a *different* port 5) Write some data 6) Close the port The second write sends nothing and the second close hangs. If the first close() is omitted it works. Adding logs to the the UART driver shows that the DMA is being setup but the callback is never invoked for the second write. This used to work in 4.19. Git bisect leads to: ad0d92d: "dmaengine: imx-sdma: refine to load context only once" This commit adds a "context_loaded" flag used to avoid unnecessary context setups. However the flag is only reset in sdma_channel_terminate_work(), which is only invoked in a worker triggered by sdma_terminate_all() IF there is an active descriptor. So, if no active descriptor remains when the channel is terminated, the flag is not reset and, when the channel is later reused the old context is used. Fix the problem by always resetting the flag in sdma_free_chan_resources(). Cc: stable@vger.kernel.org Signed-off-by: Martin Fuzzey Fixes: ad0d92d7ba6a ("dmaengine: imx-sdma: refine to load context only once") Reviewed-by: Fabio Estevam Link: https://lore.kernel.org/r/1580305274-27274-1-git-send-email-martin.fuzzey@flowbird.group Signed-off-by: Vinod Koul --- drivers/dma/imx-sdma.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c index 066b21a32232..332ca5034504 100644 --- a/drivers/dma/imx-sdma.c +++ b/drivers/dma/imx-sdma.c @@ -1338,6 +1338,7 @@ static void sdma_free_chan_resources(struct dma_chan *chan) sdmac->event_id0 = 0; sdmac->event_id1 = 0; + sdmac->context_loaded = false; sdma_set_channel_priority(sdmac, 0); From 51fdaa0490241e8cd41b40cbf43a336d1a014460 Mon Sep 17 00:00:00 2001 From: Damien Le Moal Date: Wed, 19 Feb 2020 15:38:00 +0900 Subject: [PATCH 128/400] scsi: sd_sbc: Fix sd_zbc_report_zones() The block layer generic blk_revalidate_disk_zones() checks the validity of zone descriptors reported by a disk using the blk_revalidate_zone_cb() callback function executed for each zone descriptor. If a ZBC disk reports invalid zone descriptors, blk_revalidate_disk_zones() returns an error and sd_zbc_read_zones() changes the disk capacity to 0, which in turn results in the gendisk structure capacity to be set to 0. This all works well for the first revalidate pass on a disk and the block layer detects the capactiy change. On the second revalidate pass, blk_revalidate_disk_zones() is called again and sd_zbc_report_zones() executed to check the zones a second time. However, for this second pass, the gendisk capacity is now 0, which results in sd_zbc_report_zones() to do nothing and to report success and no zones. blk_revalidate_disk_zones() in turn returns success and sets the disk queue chunk_sectors limit with zero as no zones were checked, causing a oops to trigger on the BUG_ON(!is_power_of_2(chunk_sectors)) in blk_queue_chunk_sectors(). Fix this by using the sdkp capacity field rather than the gendisk capacity for the report zones loop in sd_zbc_report_zones(). Also add a check to return immediately an error if the sdkp capacity is 0. With this fix, invalid/buggy ZBC disk scan does not trigger a oops and are exposed with a 0 capacity. This change also preserve the chance for the disk to be correctly revalidated on the second revalidate pass as the scsi disk structure capacity field is always set to the disk reported value when sd_zbc_report_zones() is called. Link: https://lore.kernel.org/r/20200219063800.880834-1-damien.lemoal@wdc.com Fixes: d41003513e61 ("block: rework zone reporting") Cc: Cc: # v5.5 Reviewed-by: Christoph Hellwig Reviewed-by: Johannes Thumshirn Signed-off-by: Damien Le Moal Signed-off-by: Martin K. Petersen --- drivers/scsi/sd_zbc.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index e4282bce5834..f45c22b09726 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -161,6 +161,7 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, unsigned int nr_zones, report_zones_cb cb, void *data) { struct scsi_disk *sdkp = scsi_disk(disk); + sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity); unsigned int nr, i; unsigned char *buf; size_t offset, buflen = 0; @@ -171,11 +172,15 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, /* Not a zoned device */ return -EOPNOTSUPP; + if (!capacity) + /* Device gone or invalid */ + return -ENODEV; + buf = sd_zbc_alloc_report_buffer(sdkp, nr_zones, &buflen); if (!buf) return -ENOMEM; - while (zone_idx < nr_zones && sector < get_capacity(disk)) { + while (zone_idx < nr_zones && sector < capacity) { ret = sd_zbc_do_report_zones(sdkp, buf, buflen, sectors_to_logical(sdkp->device, sector), true); if (ret) From a3fd4bfe85fbb67cf4ec1232d0af625ece3c508b Mon Sep 17 00:00:00 2001 From: Benjamin Block Date: Wed, 19 Feb 2020 16:09:25 +0100 Subject: [PATCH 129/400] scsi: zfcp: fix wrong data and display format of SFP+ temperature MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When implementing support for retrieval of local diagnostic data from the FCP channel, the wrong data format was assumed for the temperature of the local SFP+ connector. The Fibre Channel Link Services (FC-LS-3) specification is not clear on the format of the stored integer, and only after consulting the SNIA specification SFF-8472 did we realize it is stored as two's complement. Thus, the used data and display format is wrong, and highly misleading for users when the temperature should drop below 0°C (however unlikely that may be). To fix this, change the data format in `struct fsf_qtcb_bottom_port` from unsigned to signed, and change the printf format string used to generate `zfcp_sysfs_adapter_diag_sfp_temperature_show()` from `%hu` to `%hd`. Link: https://lore.kernel.org/r/d6e3be5428da5c9490cfff4df7cae868bc9f1a7e.1582039501.git.bblock@linux.ibm.com Fixes: a10a61e807b0 ("scsi: zfcp: support retrieval of SFP Data via Exchange Port Data") Fixes: 6028f7c4cd87 ("scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver") Cc: # 5.5+ Reviewed-by: Jens Remus Reviewed-by: Fedor Loshakov Reviewed-by: Steffen Maier Signed-off-by: Benjamin Block Signed-off-by: Martin K. Petersen --- drivers/s390/scsi/zfcp_fsf.h | 2 +- drivers/s390/scsi/zfcp_sysfs.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/s390/scsi/zfcp_fsf.h b/drivers/s390/scsi/zfcp_fsf.h index 2b1e4da1944f..4bfb79f20588 100644 --- a/drivers/s390/scsi/zfcp_fsf.h +++ b/drivers/s390/scsi/zfcp_fsf.h @@ -410,7 +410,7 @@ struct fsf_qtcb_bottom_port { u8 cb_util; u8 a_util; u8 res2; - u16 temperature; + s16 temperature; u16 vcc; u16 tx_bias; u16 tx_power; diff --git a/drivers/s390/scsi/zfcp_sysfs.c b/drivers/s390/scsi/zfcp_sysfs.c index 494b9fe9cc94..a711a0d15100 100644 --- a/drivers/s390/scsi/zfcp_sysfs.c +++ b/drivers/s390/scsi/zfcp_sysfs.c @@ -800,7 +800,7 @@ static ZFCP_DEV_ATTR(adapter_diag, b2b_credit, 0400, static ZFCP_DEV_ATTR(adapter_diag_sfp, _name, 0400, \ zfcp_sysfs_adapter_diag_sfp_##_name##_show, NULL) -ZFCP_DEFINE_DIAG_SFP_ATTR(temperature, temperature, 5, "%hu"); +ZFCP_DEFINE_DIAG_SFP_ATTR(temperature, temperature, 6, "%hd"); ZFCP_DEFINE_DIAG_SFP_ATTR(vcc, vcc, 5, "%hu"); ZFCP_DEFINE_DIAG_SFP_ATTR(tx_bias, tx_bias, 5, "%hu"); ZFCP_DEFINE_DIAG_SFP_ATTR(tx_power, tx_power, 5, "%hu"); From 54b3719d82e0816271e35579a9967b8a3b42296d Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Sat, 22 Feb 2020 10:00:02 +0100 Subject: [PATCH 130/400] docs: dt: fix several broken references due to renames Several DT references got broken due to txt->yaml conversion. Those are auto-fixed by running: scripts/documentation-file-ref-check --fix Signed-off-by: Mauro Carvalho Chehab Acked-by: Andrew Jeffery Reviewed-by: Dan Murphy Reviewed-by: Amit Kucheria Signed-off-by: Rob Herring --- Documentation/devicetree/bindings/arm/arm,scmi.txt | 2 +- Documentation/devicetree/bindings/arm/arm,scpi.txt | 2 +- .../devicetree/bindings/arm/bcm/brcm,bcm63138.txt | 2 +- .../devicetree/bindings/arm/hisilicon/hi3519-sysctrl.txt | 2 +- .../devicetree/bindings/arm/msm/qcom,idle-state.txt | 2 +- Documentation/devicetree/bindings/arm/omap/mpu.txt | 2 +- Documentation/devicetree/bindings/arm/psci.yaml | 2 +- .../devicetree/bindings/clock/qcom,gcc-apq8064.yaml | 2 +- .../devicetree/bindings/display/tilcdc/tilcdc.txt | 2 +- Documentation/devicetree/bindings/leds/common.yaml | 2 +- .../devicetree/bindings/leds/register-bit-led.txt | 2 +- .../devicetree/bindings/memory-controllers/ti/emif.txt | 2 +- Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt | 2 +- .../bindings/pinctrl/aspeed,ast2400-pinctrl.yaml | 2 +- .../bindings/pinctrl/aspeed,ast2500-pinctrl.yaml | 2 +- .../bindings/pinctrl/aspeed,ast2600-pinctrl.yaml | 2 +- .../devicetree/bindings/power/amlogic,meson-ee-pwrc.yaml | 2 +- .../devicetree/bindings/reset/st,stm32mp1-rcc.txt | 2 +- .../devicetree/bindings/thermal/brcm,avs-ro-thermal.yaml | 2 +- MAINTAINERS | 8 ++++---- 20 files changed, 23 insertions(+), 23 deletions(-) diff --git a/Documentation/devicetree/bindings/arm/arm,scmi.txt b/Documentation/devicetree/bindings/arm/arm,scmi.txt index f493d69e6194..dc102c4e4a78 100644 --- a/Documentation/devicetree/bindings/arm/arm,scmi.txt +++ b/Documentation/devicetree/bindings/arm/arm,scmi.txt @@ -102,7 +102,7 @@ Required sub-node properties: [1] Documentation/devicetree/bindings/clock/clock-bindings.txt [2] Documentation/devicetree/bindings/power/power-domain.yaml [3] Documentation/devicetree/bindings/thermal/thermal.txt -[4] Documentation/devicetree/bindings/sram/sram.txt +[4] Documentation/devicetree/bindings/sram/sram.yaml [5] Documentation/devicetree/bindings/reset/reset.txt Example: diff --git a/Documentation/devicetree/bindings/arm/arm,scpi.txt b/Documentation/devicetree/bindings/arm/arm,scpi.txt index 7b83ef43b418..dd04d9d9a1b8 100644 --- a/Documentation/devicetree/bindings/arm/arm,scpi.txt +++ b/Documentation/devicetree/bindings/arm/arm,scpi.txt @@ -109,7 +109,7 @@ Required properties: [0] http://infocenter.arm.com/help/topic/com.arm.doc.dui0922b/index.html [1] Documentation/devicetree/bindings/clock/clock-bindings.txt [2] Documentation/devicetree/bindings/thermal/thermal.txt -[3] Documentation/devicetree/bindings/sram/sram.txt +[3] Documentation/devicetree/bindings/sram/sram.yaml [4] Documentation/devicetree/bindings/power/power-domain.yaml Example: diff --git a/Documentation/devicetree/bindings/arm/bcm/brcm,bcm63138.txt b/Documentation/devicetree/bindings/arm/bcm/brcm,bcm63138.txt index b82b6a0ae6f7..8c7a4908a849 100644 --- a/Documentation/devicetree/bindings/arm/bcm/brcm,bcm63138.txt +++ b/Documentation/devicetree/bindings/arm/bcm/brcm,bcm63138.txt @@ -62,7 +62,7 @@ Timer node: Syscon reboot node: -See Documentation/devicetree/bindings/power/reset/syscon-reboot.txt for the +See Documentation/devicetree/bindings/power/reset/syscon-reboot.yaml for the detailed list of properties, the two values defined below are specific to the BCM6328-style timer: diff --git a/Documentation/devicetree/bindings/arm/hisilicon/hi3519-sysctrl.txt b/Documentation/devicetree/bindings/arm/hisilicon/hi3519-sysctrl.txt index 115c5be0bd0b..8defacc44dd5 100644 --- a/Documentation/devicetree/bindings/arm/hisilicon/hi3519-sysctrl.txt +++ b/Documentation/devicetree/bindings/arm/hisilicon/hi3519-sysctrl.txt @@ -1,7 +1,7 @@ * Hisilicon Hi3519 System Controller Block This bindings use the following binding: -Documentation/devicetree/bindings/mfd/syscon.txt +Documentation/devicetree/bindings/mfd/syscon.yaml Required properties: - compatible: "hisilicon,hi3519-sysctrl". diff --git a/Documentation/devicetree/bindings/arm/msm/qcom,idle-state.txt b/Documentation/devicetree/bindings/arm/msm/qcom,idle-state.txt index 06df04cc827a..6ce0b212ec6d 100644 --- a/Documentation/devicetree/bindings/arm/msm/qcom,idle-state.txt +++ b/Documentation/devicetree/bindings/arm/msm/qcom,idle-state.txt @@ -81,4 +81,4 @@ Example: }; }; -[1]. Documentation/devicetree/bindings/arm/idle-states.txt +[1]. Documentation/devicetree/bindings/arm/idle-states.yaml diff --git a/Documentation/devicetree/bindings/arm/omap/mpu.txt b/Documentation/devicetree/bindings/arm/omap/mpu.txt index f301e636fd52..e41490e6979c 100644 --- a/Documentation/devicetree/bindings/arm/omap/mpu.txt +++ b/Documentation/devicetree/bindings/arm/omap/mpu.txt @@ -17,7 +17,7 @@ am335x and am437x only: - pm-sram: Phandles to ocmcram nodes to be used for power management. First should be type 'protect-exec' for the driver to use to copy and run PM functions, second should be regular pool to be used for - data region for code. See Documentation/devicetree/bindings/sram/sram.txt + data region for code. See Documentation/devicetree/bindings/sram/sram.yaml for more details. Examples: diff --git a/Documentation/devicetree/bindings/arm/psci.yaml b/Documentation/devicetree/bindings/arm/psci.yaml index 8ef85420b2ab..f8218e60e3e2 100644 --- a/Documentation/devicetree/bindings/arm/psci.yaml +++ b/Documentation/devicetree/bindings/arm/psci.yaml @@ -100,7 +100,7 @@ properties: bindings in [1]) must specify this property. [1] Kernel documentation - ARM idle states bindings - Documentation/devicetree/bindings/arm/idle-states.txt + Documentation/devicetree/bindings/arm/idle-states.yaml "#power-domain-cells": description: diff --git a/Documentation/devicetree/bindings/clock/qcom,gcc-apq8064.yaml b/Documentation/devicetree/bindings/clock/qcom,gcc-apq8064.yaml index 17f87178f6b8..3647007f82ca 100644 --- a/Documentation/devicetree/bindings/clock/qcom,gcc-apq8064.yaml +++ b/Documentation/devicetree/bindings/clock/qcom,gcc-apq8064.yaml @@ -42,7 +42,7 @@ properties: be part of GCC and hence the TSENS properties can also be part of the GCC/clock-controller node. For more details on the TSENS properties please refer - Documentation/devicetree/bindings/thermal/qcom-tsens.txt + Documentation/devicetree/bindings/thermal/qcom-tsens.yaml nvmem-cell-names: minItems: 1 diff --git a/Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt b/Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt index 7bf1bb444812..aac617acb64f 100644 --- a/Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt +++ b/Documentation/devicetree/bindings/display/tilcdc/tilcdc.txt @@ -37,7 +37,7 @@ Optional nodes: supports a single port with a single endpoint. - See also Documentation/devicetree/bindings/display/tilcdc/panel.txt and - Documentation/devicetree/bindings/display/tilcdc/tfp410.txt for connecting + Documentation/devicetree/bindings/display/bridge/ti,tfp410.txt for connecting tfp410 DVI encoder or lcd panel to lcdc [1] There is an errata about AM335x color wiring. For 16-bit color mode diff --git a/Documentation/devicetree/bindings/leds/common.yaml b/Documentation/devicetree/bindings/leds/common.yaml index d97d099b87e5..c60b994fe116 100644 --- a/Documentation/devicetree/bindings/leds/common.yaml +++ b/Documentation/devicetree/bindings/leds/common.yaml @@ -85,7 +85,7 @@ properties: # LED will act as a back-light, controlled by the framebuffer system - backlight # LED will turn on (but for leds-gpio see "default-state" property in - # Documentation/devicetree/bindings/leds/leds-gpio.txt) + # Documentation/devicetree/bindings/leds/leds-gpio.yaml) - default-on # LED "double" flashes at a load average based rate - heartbeat diff --git a/Documentation/devicetree/bindings/leds/register-bit-led.txt b/Documentation/devicetree/bindings/leds/register-bit-led.txt index cf1ea403ba7a..c7af6f70a97b 100644 --- a/Documentation/devicetree/bindings/leds/register-bit-led.txt +++ b/Documentation/devicetree/bindings/leds/register-bit-led.txt @@ -5,7 +5,7 @@ where single bits in a certain register can turn on/off a single LED. The register bit LEDs appear as children to the syscon device, with the proper compatible string. For the syscon bindings see: -Documentation/devicetree/bindings/mfd/syscon.txt +Documentation/devicetree/bindings/mfd/syscon.yaml Each LED is represented as a sub-node of the syscon device. Each node's name represents the name of the corresponding LED. diff --git a/Documentation/devicetree/bindings/memory-controllers/ti/emif.txt b/Documentation/devicetree/bindings/memory-controllers/ti/emif.txt index 44d71469c914..63f674ffeb4f 100644 --- a/Documentation/devicetree/bindings/memory-controllers/ti/emif.txt +++ b/Documentation/devicetree/bindings/memory-controllers/ti/emif.txt @@ -32,7 +32,7 @@ Required only for "ti,emif-am3352" and "ti,emif-am4372": - sram : Phandles for generic sram driver nodes, first should be type 'protect-exec' for the driver to use to copy and run PM functions, second should be regular pool to be used for - data region for code. See Documentation/devicetree/bindings/sram/sram.txt + data region for code. See Documentation/devicetree/bindings/sram/sram.yaml for more details. Optional properties: diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt index bb7e896cb644..9134e9bcca56 100644 --- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt +++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt @@ -26,7 +26,7 @@ For generic IOMMU bindings, see Documentation/devicetree/bindings/iommu/iommu.txt. For arm-smmu binding, see: -Documentation/devicetree/bindings/iommu/arm,smmu.txt. +Documentation/devicetree/bindings/iommu/arm,smmu.yaml. Required properties: diff --git a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2400-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2400-pinctrl.yaml index bb690e20c368..135c7dfbc180 100644 --- a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2400-pinctrl.yaml +++ b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2400-pinctrl.yaml @@ -17,7 +17,7 @@ description: |+ "aspeed,ast2400-scu", "syscon", "simple-mfd" Refer to the the bindings described in - Documentation/devicetree/bindings/mfd/syscon.txt + Documentation/devicetree/bindings/mfd/syscon.yaml properties: compatible: diff --git a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml index f7f5d57f2c9a..824f7fd1d51b 100644 --- a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml +++ b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2500-pinctrl.yaml @@ -18,7 +18,7 @@ description: |+ "aspeed,g5-scu", "syscon", "simple-mfd" Refer to the the bindings described in - Documentation/devicetree/bindings/mfd/syscon.txt + Documentation/devicetree/bindings/mfd/syscon.yaml properties: compatible: diff --git a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2600-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2600-pinctrl.yaml index 3749fa233e87..ac8d1c30a8ed 100644 --- a/Documentation/devicetree/bindings/pinctrl/aspeed,ast2600-pinctrl.yaml +++ b/Documentation/devicetree/bindings/pinctrl/aspeed,ast2600-pinctrl.yaml @@ -17,7 +17,7 @@ description: |+ "aspeed,ast2600-scu", "syscon", "simple-mfd" Refer to the the bindings described in - Documentation/devicetree/bindings/mfd/syscon.txt + Documentation/devicetree/bindings/mfd/syscon.yaml properties: compatible: diff --git a/Documentation/devicetree/bindings/power/amlogic,meson-ee-pwrc.yaml b/Documentation/devicetree/bindings/power/amlogic,meson-ee-pwrc.yaml index aab70e8b681e..d3098c924b25 100644 --- a/Documentation/devicetree/bindings/power/amlogic,meson-ee-pwrc.yaml +++ b/Documentation/devicetree/bindings/power/amlogic,meson-ee-pwrc.yaml @@ -18,7 +18,7 @@ description: |+ "amlogic,meson-gx-hhi-sysctrl", "simple-mfd", "syscon" Refer to the the bindings described in - Documentation/devicetree/bindings/mfd/syscon.txt + Documentation/devicetree/bindings/mfd/syscon.yaml properties: compatible: diff --git a/Documentation/devicetree/bindings/reset/st,stm32mp1-rcc.txt b/Documentation/devicetree/bindings/reset/st,stm32mp1-rcc.txt index b4edaf7c7ff3..2880d5dda95e 100644 --- a/Documentation/devicetree/bindings/reset/st,stm32mp1-rcc.txt +++ b/Documentation/devicetree/bindings/reset/st,stm32mp1-rcc.txt @@ -3,4 +3,4 @@ STMicroelectronics STM32MP1 Peripheral Reset Controller The RCC IP is both a reset and a clock controller. -Please see Documentation/devicetree/bindings/clock/st,stm32mp1-rcc.txt +Please see Documentation/devicetree/bindings/clock/st,stm32mp1-rcc.yaml diff --git a/Documentation/devicetree/bindings/thermal/brcm,avs-ro-thermal.yaml b/Documentation/devicetree/bindings/thermal/brcm,avs-ro-thermal.yaml index d9fdf4809a49..f3e68ed03abf 100644 --- a/Documentation/devicetree/bindings/thermal/brcm,avs-ro-thermal.yaml +++ b/Documentation/devicetree/bindings/thermal/brcm,avs-ro-thermal.yaml @@ -17,7 +17,7 @@ description: |+ "brcm,bcm2711-avs-monitor", "syscon", "simple-mfd" Refer to the the bindings described in - Documentation/devicetree/bindings/mfd/syscon.txt + Documentation/devicetree/bindings/mfd/syscon.yaml properties: compatible: diff --git a/MAINTAINERS b/MAINTAINERS index 155611d1a4b2..7eb087d24519 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4016,7 +4016,7 @@ M: Cheng-Yi Chiang S: Maintained R: Enric Balletbo i Serra R: Guenter Roeck -F: Documentation/devicetree/bindings/sound/google,cros-ec-codec.txt +F: Documentation/devicetree/bindings/sound/google,cros-ec-codec.yaml F: sound/soc/codecs/cros_ec_codec.* CIRRUS LOGIC AUDIO CODEC DRIVERS @@ -5667,7 +5667,7 @@ L: dri-devel@lists.freedesktop.org T: git git://anongit.freedesktop.org/drm/drm-misc S: Maintained F: drivers/gpu/drm/stm -F: Documentation/devicetree/bindings/display/st,stm32-ltdc.txt +F: Documentation/devicetree/bindings/display/st,stm32-ltdc.yaml DRM DRIVERS FOR TI LCDC M: Jyri Sarha @@ -10163,7 +10163,7 @@ MAXBOTIX ULTRASONIC RANGER IIO DRIVER M: Andreas Klinger L: linux-iio@vger.kernel.org S: Maintained -F: Documentation/devicetree/bindings/iio/proximity/maxbotix,mb1232.txt +F: Documentation/devicetree/bindings/iio/proximity/maxbotix,mb1232.yaml F: drivers/iio/proximity/mb1232.c MAXIM MAX77650 PMIC MFD DRIVER @@ -10466,7 +10466,7 @@ M: Hugues Fruchet L: linux-media@vger.kernel.org T: git git://linuxtv.org/media_tree.git S: Supported -F: Documentation/devicetree/bindings/media/st,stm32-dcmi.txt +F: Documentation/devicetree/bindings/media/st,stm32-dcmi.yaml F: drivers/media/platform/stm32/stm32-dcmi.c MEDIA DRIVERS FOR NVIDIA TEGRA - VDE From a40df28c5640f13405f5617aff3e985f7e72cde7 Mon Sep 17 00:00:00 2001 From: Mauro Carvalho Chehab Date: Sun, 23 Feb 2020 09:59:53 +0100 Subject: [PATCH 131/400] docs: dt: fix several broken doc references MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit There are several DT doc references that require manual fixes. I found 3 cases fixed on this patch: - directory named "binding/" instead of "bindings/"; - .txt to .yaml renames; - file renames (still on txt format); Signed-off-by: Mauro Carvalho Chehab Reviewed-by: Miquel Raynal Reviewed-by: Jérôme Pouiller Acked-by: Mark Brown Signed-off-by: Rob Herring --- .../devicetree/bindings/mtd/cadence-nand-controller.txt | 2 +- .../devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt | 2 +- Documentation/devicetree/bindings/sound/st,stm32-sai.txt | 2 +- Documentation/devicetree/bindings/sound/st,stm32-spdifrx.txt | 2 +- Documentation/devicetree/bindings/spi/st,stm32-spi.yaml | 2 +- MAINTAINERS | 4 ++-- .../devicetree/bindings/net/wireless/siliabs,wfx.txt | 2 +- 7 files changed, 8 insertions(+), 8 deletions(-) diff --git a/Documentation/devicetree/bindings/mtd/cadence-nand-controller.txt b/Documentation/devicetree/bindings/mtd/cadence-nand-controller.txt index f3893c4d3c6a..d2eada5044b2 100644 --- a/Documentation/devicetree/bindings/mtd/cadence-nand-controller.txt +++ b/Documentation/devicetree/bindings/mtd/cadence-nand-controller.txt @@ -27,7 +27,7 @@ Required properties of NAND chips: - reg: shall contain the native Chip Select ids from 0 to max supported by the cadence nand flash controller -See Documentation/devicetree/bindings/mtd/nand.txt for more details on +See Documentation/devicetree/bindings/mtd/nand-controller.yaml for more details on generic bindings. Example: diff --git a/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt b/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt index 48a7f916c5e4..88b57b0ca1f4 100644 --- a/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt +++ b/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt @@ -45,7 +45,7 @@ Optional properties: switch queue - resets: a single phandle and reset identifier pair. See - Documentation/devicetree/binding/reset/reset.txt for details. + Documentation/devicetree/bindings/reset/reset.txt for details. - reset-names: If the "reset" property is specified, this property should have the value "switch" to denote the switch reset line. diff --git a/Documentation/devicetree/bindings/sound/st,stm32-sai.txt b/Documentation/devicetree/bindings/sound/st,stm32-sai.txt index 944743dd9212..c42b91e525fa 100644 --- a/Documentation/devicetree/bindings/sound/st,stm32-sai.txt +++ b/Documentation/devicetree/bindings/sound/st,stm32-sai.txt @@ -36,7 +36,7 @@ SAI subnodes required properties: - clock-names: Must contain "sai_ck". Must also contain "MCLK", if SAI shares a master clock, with a SAI set as MCLK clock provider. - - dmas: see Documentation/devicetree/bindings/dma/stm32-dma.txt + - dmas: see Documentation/devicetree/bindings/dma/st,stm32-dma.yaml - dma-names: identifier string for each DMA request line "tx": if sai sub-block is configured as playback DAI "rx": if sai sub-block is configured as capture DAI diff --git a/Documentation/devicetree/bindings/sound/st,stm32-spdifrx.txt b/Documentation/devicetree/bindings/sound/st,stm32-spdifrx.txt index 33826f2459fa..ca9101777c44 100644 --- a/Documentation/devicetree/bindings/sound/st,stm32-spdifrx.txt +++ b/Documentation/devicetree/bindings/sound/st,stm32-spdifrx.txt @@ -10,7 +10,7 @@ Required properties: - clock-names: must contain "kclk" - interrupts: cpu DAI interrupt line - dmas: DMA specifiers for audio data DMA and iec control flow DMA - See STM32 DMA bindings, Documentation/devicetree/bindings/dma/stm32-dma.txt + See STM32 DMA bindings, Documentation/devicetree/bindings/dma/st,stm32-dma.yaml - dma-names: two dmas have to be defined, "rx" and "rx-ctrl" Optional properties: diff --git a/Documentation/devicetree/bindings/spi/st,stm32-spi.yaml b/Documentation/devicetree/bindings/spi/st,stm32-spi.yaml index f0d979664f07..e49ecbf715ba 100644 --- a/Documentation/devicetree/bindings/spi/st,stm32-spi.yaml +++ b/Documentation/devicetree/bindings/spi/st,stm32-spi.yaml @@ -49,7 +49,7 @@ properties: dmas: description: | DMA specifiers for tx and rx dma. DMA fifo mode must be used. See - the STM32 DMA bindings Documentation/devicetree/bindings/dma/stm32-dma.txt. + the STM32 DMA bindings Documentation/devicetree/bindings/dma/st,stm32-dma.yaml. items: - description: rx DMA channel - description: tx DMA channel diff --git a/MAINTAINERS b/MAINTAINERS index 7eb087d24519..35e9b133c424 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4474,7 +4474,7 @@ L: linux-media@vger.kernel.org T: git git://linuxtv.org/media_tree.git S: Maintained F: drivers/media/platform/sunxi/sun6i-csi/ -F: Documentation/devicetree/bindings/media/sun6i-csi.txt +F: Documentation/devicetree/bindings/media/allwinner,sun6i-a31-csi.yaml CW1200 WLAN driver M: Solomon Peachy @@ -15922,7 +15922,7 @@ F: drivers/*/stm32-*timer* F: drivers/pwm/pwm-stm32* F: include/linux/*/stm32-*tim* F: Documentation/ABI/testing/*timer-stm32 -F: Documentation/devicetree/bindings/*/stm32-*timer* +F: Documentation/devicetree/bindings/*/*stm32-*timer* F: Documentation/devicetree/bindings/pwm/pwm-stm32* STMMAC ETHERNET DRIVER diff --git a/drivers/staging/wfx/Documentation/devicetree/bindings/net/wireless/siliabs,wfx.txt b/drivers/staging/wfx/Documentation/devicetree/bindings/net/wireless/siliabs,wfx.txt index 26de6762b942..081d58abd5ac 100644 --- a/drivers/staging/wfx/Documentation/devicetree/bindings/net/wireless/siliabs,wfx.txt +++ b/drivers/staging/wfx/Documentation/devicetree/bindings/net/wireless/siliabs,wfx.txt @@ -93,5 +93,5 @@ Some properties are recognized either by SPI and SDIO versions: Must contains 64 hexadecimal digits. Not supported in current version. WFx driver also supports `mac-address` and `local-mac-address` as described in -Documentation/devicetree/binding/net/ethernet.txt +Documentation/devicetree/bindings/net/ethernet.txt From 84823ff80f7403752b59e00bb198724100dc611c Mon Sep 17 00:00:00 2001 From: Esben Haabendal Date: Fri, 21 Feb 2020 07:47:21 +0100 Subject: [PATCH 132/400] net: ll_temac: Fix race condition causing TX hang It is possible that the interrupt handler fires and frees up space in the TX ring in between checking for sufficient TX ring space and stopping the TX queue in temac_start_xmit. If this happens, the queue wake from the interrupt handler will occur before the queue is stopped, causing a lost wakeup and the adapter's transmit hanging. To avoid this, after stopping the queue, check again whether there is sufficient space in the TX ring. If so, wake up the queue again. This is a port of the similar fix in axienet driver, commit 7de44285c1f6 ("net: axienet: Fix race condition causing TX hang"). Fixes: 23ecc4bde21f ("net: ll_temac: fix checksum offload logic") Signed-off-by: Esben Haabendal Signed-off-by: David S. Miller --- drivers/net/ethernet/xilinx/ll_temac_main.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c index 6f11f52c9a9e..996004ef8bd4 100644 --- a/drivers/net/ethernet/xilinx/ll_temac_main.c +++ b/drivers/net/ethernet/xilinx/ll_temac_main.c @@ -788,6 +788,9 @@ static void temac_start_xmit_done(struct net_device *ndev) stat = be32_to_cpu(cur_p->app0); } + /* Matches barrier in temac_start_xmit */ + smp_mb(); + netif_wake_queue(ndev); } @@ -830,9 +833,19 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev) cur_p = &lp->tx_bd_v[lp->tx_bd_tail]; if (temac_check_tx_bd_space(lp, num_frag + 1)) { - if (!netif_queue_stopped(ndev)) - netif_stop_queue(ndev); - return NETDEV_TX_BUSY; + if (netif_queue_stopped(ndev)) + return NETDEV_TX_BUSY; + + netif_stop_queue(ndev); + + /* Matches barrier in temac_start_xmit_done */ + smp_mb(); + + /* Space might have just been freed - check again */ + if (temac_check_tx_bd_space(lp, num_frag)) + return NETDEV_TX_BUSY; + + netif_wake_queue(ndev); } cur_p->app0 = 0; From d07c849cd2b97d6809430dfb7e738ad31088037a Mon Sep 17 00:00:00 2001 From: Esben Haabendal Date: Fri, 21 Feb 2020 07:47:33 +0100 Subject: [PATCH 133/400] net: ll_temac: Add more error handling of dma_map_single() calls This adds error handling to the remaining dma_map_single() calls, so that behavior is well defined if/when we run out of DMA memory. Fixes: 92744989533c ("net: add Xilinx ll_temac device driver") Signed-off-by: Esben Haabendal Signed-off-by: David S. Miller --- drivers/net/ethernet/xilinx/ll_temac_main.c | 26 +++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c index 996004ef8bd4..c368c3914bda 100644 --- a/drivers/net/ethernet/xilinx/ll_temac_main.c +++ b/drivers/net/ethernet/xilinx/ll_temac_main.c @@ -367,6 +367,8 @@ static int temac_dma_bd_init(struct net_device *ndev) skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data, XTE_MAX_JUMBO_FRAME_SIZE, DMA_FROM_DEVICE); + if (dma_mapping_error(ndev->dev.parent, skb_dma_addr)) + goto out; lp->rx_bd_v[i].phys = cpu_to_be32(skb_dma_addr); lp->rx_bd_v[i].len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE); lp->rx_bd_v[i].app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND); @@ -863,12 +865,13 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev) skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data, skb_headlen(skb), DMA_TO_DEVICE); cur_p->len = cpu_to_be32(skb_headlen(skb)); + if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent, skb_dma_addr))) + return NETDEV_TX_BUSY; cur_p->phys = cpu_to_be32(skb_dma_addr); ptr_to_txbd((void *)skb, cur_p); for (ii = 0; ii < num_frag; ii++) { - lp->tx_bd_tail++; - if (lp->tx_bd_tail >= TX_BD_NUM) + if (++lp->tx_bd_tail >= TX_BD_NUM) lp->tx_bd_tail = 0; cur_p = &lp->tx_bd_v[lp->tx_bd_tail]; @@ -876,6 +879,25 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev) skb_frag_address(frag), skb_frag_size(frag), DMA_TO_DEVICE); + if (dma_mapping_error(ndev->dev.parent, skb_dma_addr)) { + if (--lp->tx_bd_tail < 0) + lp->tx_bd_tail = TX_BD_NUM - 1; + cur_p = &lp->tx_bd_v[lp->tx_bd_tail]; + while (--ii >= 0) { + --frag; + dma_unmap_single(ndev->dev.parent, + be32_to_cpu(cur_p->phys), + skb_frag_size(frag), + DMA_TO_DEVICE); + if (--lp->tx_bd_tail < 0) + lp->tx_bd_tail = TX_BD_NUM - 1; + cur_p = &lp->tx_bd_v[lp->tx_bd_tail]; + } + dma_unmap_single(ndev->dev.parent, + be32_to_cpu(cur_p->phys), + skb_headlen(skb), DMA_TO_DEVICE); + return NETDEV_TX_BUSY; + } cur_p->phys = cpu_to_be32(skb_dma_addr); cur_p->len = cpu_to_be32(skb_frag_size(frag)); cur_p->app0 = 0; From 770d9c67974c4c71af4beb786dc43162ad2a15ba Mon Sep 17 00:00:00 2001 From: Esben Haabendal Date: Fri, 21 Feb 2020 07:47:45 +0100 Subject: [PATCH 134/400] net: ll_temac: Fix RX buffer descriptor handling on GFP_ATOMIC pressure Failures caused by GFP_ATOMIC memory pressure have been observed, and due to the missing error handling, results in kernel crash such as [1876998.350133] kernel BUG at mm/slub.c:3952! [1876998.350141] invalid opcode: 0000 [#1] PREEMPT SMP PTI [1876998.350147] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.3.0-scnxt #1 [1876998.350150] Hardware name: N/A N/A/COMe-bIP2, BIOS CCR2R920 03/01/2017 [1876998.350160] RIP: 0010:kfree+0x1ca/0x220 [1876998.350164] Code: 85 db 74 49 48 8b 95 68 01 00 00 48 31 c2 48 89 10 e9 d7 fe ff ff 49 8b 04 24 a9 00 00 01 00 75 0b 49 8b 44 24 08 a8 01 75 02 <0f> 0b 49 8b 04 24 31 f6 a9 00 00 01 00 74 06 41 0f b6 74 24 5b [1876998.350172] RSP: 0018:ffffc900000f0df0 EFLAGS: 00010246 [1876998.350177] RAX: ffffea00027f0708 RBX: ffff888008d78000 RCX: 0000000000391372 [1876998.350181] RDX: 0000000000000000 RSI: ffffe8ffffd01400 RDI: ffff888008d78000 [1876998.350185] RBP: ffff8881185a5d00 R08: ffffc90000087dd8 R09: 000000000000280a [1876998.350189] R10: 0000000000000002 R11: 0000000000000000 R12: ffffea0000235e00 [1876998.350193] R13: ffff8881185438a0 R14: 0000000000000000 R15: ffff888118543870 [1876998.350198] FS: 0000000000000000(0000) GS:ffff88811f300000(0000) knlGS:0000000000000000 [1876998.350203] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 s#1 Part1 [1876998.350206] CR2: 00007f8dac7b09f0 CR3: 000000011e20a006 CR4: 00000000001606e0 [1876998.350210] Call Trace: [1876998.350215] [1876998.350224] ? __netif_receive_skb_core+0x70a/0x920 [1876998.350229] kfree_skb+0x32/0xb0 [1876998.350234] __netif_receive_skb_core+0x70a/0x920 [1876998.350240] __netif_receive_skb_one_core+0x36/0x80 [1876998.350245] process_backlog+0x8b/0x150 [1876998.350250] net_rx_action+0xf7/0x340 [1876998.350255] __do_softirq+0x10f/0x353 [1876998.350262] irq_exit+0xb2/0xc0 [1876998.350265] do_IRQ+0x77/0xd0 [1876998.350271] common_interrupt+0xf/0xf [1876998.350274] In order to handle such failures more graceful, this change splits the receive loop into one for consuming the received buffers, and one for allocating new buffers. When GFP_ATOMIC allocations fail, the receive will continue with the buffers that is still there, and with the expectation that the allocations will succeed in a later call to receive. Fixes: 92744989533c ("net: add Xilinx ll_temac device driver") Signed-off-by: Esben Haabendal Signed-off-by: David S. Miller --- drivers/net/ethernet/xilinx/ll_temac.h | 1 + drivers/net/ethernet/xilinx/ll_temac_main.c | 110 ++++++++++++++------ 2 files changed, 81 insertions(+), 30 deletions(-) diff --git a/drivers/net/ethernet/xilinx/ll_temac.h b/drivers/net/ethernet/xilinx/ll_temac.h index 276292bca334..99fe059e5c7f 100644 --- a/drivers/net/ethernet/xilinx/ll_temac.h +++ b/drivers/net/ethernet/xilinx/ll_temac.h @@ -375,6 +375,7 @@ struct temac_local { int tx_bd_next; int tx_bd_tail; int rx_bd_ci; + int rx_bd_tail; /* DMA channel control setup */ u32 tx_chnl_ctrl; diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c index c368c3914bda..255207f2fd27 100644 --- a/drivers/net/ethernet/xilinx/ll_temac_main.c +++ b/drivers/net/ethernet/xilinx/ll_temac_main.c @@ -389,12 +389,13 @@ static int temac_dma_bd_init(struct net_device *ndev) lp->tx_bd_next = 0; lp->tx_bd_tail = 0; lp->rx_bd_ci = 0; + lp->rx_bd_tail = RX_BD_NUM - 1; /* Enable RX DMA transfers */ wmb(); lp->dma_out(lp, RX_CURDESC_PTR, lp->rx_bd_p); lp->dma_out(lp, RX_TAILDESC_PTR, - lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * (RX_BD_NUM - 1))); + lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * lp->rx_bd_tail)); /* Prepare for TX DMA transfer */ lp->dma_out(lp, TX_CURDESC_PTR, lp->tx_bd_p); @@ -923,27 +924,41 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev) static void ll_temac_recv(struct net_device *ndev) { struct temac_local *lp = netdev_priv(ndev); - struct sk_buff *skb, *new_skb; - unsigned int bdstat; - struct cdmac_bd *cur_p; - dma_addr_t tail_p, skb_dma_addr; - int length; unsigned long flags; + int rx_bd; + bool update_tail = false; spin_lock_irqsave(&lp->rx_lock, flags); - tail_p = lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_ci; - cur_p = &lp->rx_bd_v[lp->rx_bd_ci]; + /* Process all received buffers, passing them on network + * stack. After this, the buffer descriptors will be in an + * un-allocated stage, where no skb is allocated for it, and + * they are therefore not available for TEMAC/DMA. + */ + do { + struct cdmac_bd *bd = &lp->rx_bd_v[lp->rx_bd_ci]; + struct sk_buff *skb = lp->rx_skb[lp->rx_bd_ci]; + unsigned int bdstat = be32_to_cpu(bd->app0); + int length; - bdstat = be32_to_cpu(cur_p->app0); - while ((bdstat & STS_CTRL_APP0_CMPLT)) { + /* While this should not normally happen, we can end + * here when GFP_ATOMIC allocations fail, and we + * therefore have un-allocated buffers. + */ + if (!skb) + break; - skb = lp->rx_skb[lp->rx_bd_ci]; - length = be32_to_cpu(cur_p->app4) & 0x3FFF; + /* Loop over all completed buffer descriptors */ + if (!(bdstat & STS_CTRL_APP0_CMPLT)) + break; - dma_unmap_single(ndev->dev.parent, be32_to_cpu(cur_p->phys), + dma_unmap_single(ndev->dev.parent, be32_to_cpu(bd->phys), XTE_MAX_JUMBO_FRAME_SIZE, DMA_FROM_DEVICE); + /* The buffer is not valid for DMA anymore */ + bd->phys = 0; + bd->len = 0; + length = be32_to_cpu(bd->app4) & 0x3FFF; skb_put(skb, length); skb->protocol = eth_type_trans(skb, ndev); skb_checksum_none_assert(skb); @@ -958,39 +973,74 @@ static void ll_temac_recv(struct net_device *ndev) * (back) for proper IP checksum byte order * (be16). */ - skb->csum = htons(be32_to_cpu(cur_p->app3) & 0xFFFF); + skb->csum = htons(be32_to_cpu(bd->app3) & 0xFFFF); skb->ip_summed = CHECKSUM_COMPLETE; } if (!skb_defer_rx_timestamp(skb)) netif_rx(skb); + /* The skb buffer is now owned by network stack above */ + lp->rx_skb[lp->rx_bd_ci] = NULL; ndev->stats.rx_packets++; ndev->stats.rx_bytes += length; - new_skb = netdev_alloc_skb_ip_align(ndev, - XTE_MAX_JUMBO_FRAME_SIZE); - if (!new_skb) { - spin_unlock_irqrestore(&lp->rx_lock, flags); - return; + rx_bd = lp->rx_bd_ci; + if (++lp->rx_bd_ci >= RX_BD_NUM) + lp->rx_bd_ci = 0; + } while (rx_bd != lp->rx_bd_tail); + + /* Allocate new buffers for those buffer descriptors that were + * passed to network stack. Note that GFP_ATOMIC allocations + * can fail (e.g. when a larger burst of GFP_ATOMIC + * allocations occurs), so while we try to allocate all + * buffers in the same interrupt where they were processed, we + * continue with what we could get in case of allocation + * failure. Allocation of remaining buffers will be retried + * in following calls. + */ + while (1) { + struct sk_buff *skb; + struct cdmac_bd *bd; + dma_addr_t skb_dma_addr; + + rx_bd = lp->rx_bd_tail + 1; + if (rx_bd >= RX_BD_NUM) + rx_bd = 0; + bd = &lp->rx_bd_v[rx_bd]; + + if (bd->phys) + break; /* All skb's allocated */ + + skb = netdev_alloc_skb_ip_align(ndev, XTE_MAX_JUMBO_FRAME_SIZE); + if (!skb) { + dev_warn(&ndev->dev, "skb alloc failed\n"); + break; } - cur_p->app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND); - skb_dma_addr = dma_map_single(ndev->dev.parent, new_skb->data, + skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data, XTE_MAX_JUMBO_FRAME_SIZE, DMA_FROM_DEVICE); - cur_p->phys = cpu_to_be32(skb_dma_addr); - cur_p->len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE); - lp->rx_skb[lp->rx_bd_ci] = new_skb; + if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent, + skb_dma_addr))) { + dev_kfree_skb_any(skb); + break; + } - lp->rx_bd_ci++; - if (lp->rx_bd_ci >= RX_BD_NUM) - lp->rx_bd_ci = 0; + bd->phys = cpu_to_be32(skb_dma_addr); + bd->len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE); + bd->app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND); + lp->rx_skb[rx_bd] = skb; - cur_p = &lp->rx_bd_v[lp->rx_bd_ci]; - bdstat = be32_to_cpu(cur_p->app0); + lp->rx_bd_tail = rx_bd; + update_tail = true; + } + + /* Move tail pointer when buffers have been allocated */ + if (update_tail) { + lp->dma_out(lp, RX_TAILDESC_PTR, + lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_tail); } - lp->dma_out(lp, RX_TAILDESC_PTR, tail_p); spin_unlock_irqrestore(&lp->rx_lock, flags); } From 1d63b8d66d146deaaedbe16c80de105f685ea012 Mon Sep 17 00:00:00 2001 From: Esben Haabendal Date: Fri, 21 Feb 2020 07:47:58 +0100 Subject: [PATCH 135/400] net: ll_temac: Handle DMA halt condition caused by buffer underrun The SDMA engine used by TEMAC halts operation when it has finished processing of the last buffer descriptor in the buffer ring. Unfortunately, no interrupt event is generated when this happens, so we need to setup another mechanism to make sure DMA operation is restarted when enough buffers have been added to the ring. Fixes: 92744989533c ("net: add Xilinx ll_temac device driver") Signed-off-by: Esben Haabendal Signed-off-by: David S. Miller --- drivers/net/ethernet/xilinx/ll_temac.h | 3 ++ drivers/net/ethernet/xilinx/ll_temac_main.c | 58 +++++++++++++++++++-- 2 files changed, 56 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/xilinx/ll_temac.h b/drivers/net/ethernet/xilinx/ll_temac.h index 99fe059e5c7f..53fb8141f1a6 100644 --- a/drivers/net/ethernet/xilinx/ll_temac.h +++ b/drivers/net/ethernet/xilinx/ll_temac.h @@ -380,6 +380,9 @@ struct temac_local { /* DMA channel control setup */ u32 tx_chnl_ctrl; u32 rx_chnl_ctrl; + u8 coalesce_count_rx; + + struct delayed_work restart_work; }; /* Wrappers for temac_ior()/temac_iow() function pointers above */ diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c index 255207f2fd27..9461acec6f70 100644 --- a/drivers/net/ethernet/xilinx/ll_temac_main.c +++ b/drivers/net/ethernet/xilinx/ll_temac_main.c @@ -51,6 +51,7 @@ #include #include #include +#include #include #include #include @@ -866,8 +867,11 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev) skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data, skb_headlen(skb), DMA_TO_DEVICE); cur_p->len = cpu_to_be32(skb_headlen(skb)); - if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent, skb_dma_addr))) - return NETDEV_TX_BUSY; + if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent, skb_dma_addr))) { + dev_kfree_skb_any(skb); + ndev->stats.tx_dropped++; + return NETDEV_TX_OK; + } cur_p->phys = cpu_to_be32(skb_dma_addr); ptr_to_txbd((void *)skb, cur_p); @@ -897,7 +901,9 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev) dma_unmap_single(ndev->dev.parent, be32_to_cpu(cur_p->phys), skb_headlen(skb), DMA_TO_DEVICE); - return NETDEV_TX_BUSY; + dev_kfree_skb_any(skb); + ndev->stats.tx_dropped++; + return NETDEV_TX_OK; } cur_p->phys = cpu_to_be32(skb_dma_addr); cur_p->len = cpu_to_be32(skb_frag_size(frag)); @@ -920,6 +926,17 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev) return NETDEV_TX_OK; } +static int ll_temac_recv_buffers_available(struct temac_local *lp) +{ + int available; + + if (!lp->rx_skb[lp->rx_bd_ci]) + return 0; + available = 1 + lp->rx_bd_tail - lp->rx_bd_ci; + if (available <= 0) + available += RX_BD_NUM; + return available; +} static void ll_temac_recv(struct net_device *ndev) { @@ -990,6 +1007,18 @@ static void ll_temac_recv(struct net_device *ndev) lp->rx_bd_ci = 0; } while (rx_bd != lp->rx_bd_tail); + /* DMA operations will halt when the last buffer descriptor is + * processed (ie. the one pointed to by RX_TAILDESC_PTR). + * When that happens, no more interrupt events will be + * generated. No IRQ_COAL or IRQ_DLY, and not even an + * IRQ_ERR. To avoid stalling, we schedule a delayed work + * when there is a potential risk of that happening. The work + * will call this function, and thus re-schedule itself until + * enough buffers are available again. + */ + if (ll_temac_recv_buffers_available(lp) < lp->coalesce_count_rx) + schedule_delayed_work(&lp->restart_work, HZ / 1000); + /* Allocate new buffers for those buffer descriptors that were * passed to network stack. Note that GFP_ATOMIC allocations * can fail (e.g. when a larger burst of GFP_ATOMIC @@ -1045,6 +1074,18 @@ static void ll_temac_recv(struct net_device *ndev) spin_unlock_irqrestore(&lp->rx_lock, flags); } +/* Function scheduled to ensure a restart in case of DMA halt + * condition caused by running out of buffer descriptors. + */ +static void ll_temac_restart_work_func(struct work_struct *work) +{ + struct temac_local *lp = container_of(work, struct temac_local, + restart_work.work); + struct net_device *ndev = lp->ndev; + + ll_temac_recv(ndev); +} + static irqreturn_t ll_temac_tx_irq(int irq, void *_ndev) { struct net_device *ndev = _ndev; @@ -1137,6 +1178,8 @@ static int temac_stop(struct net_device *ndev) dev_dbg(&ndev->dev, "temac_close()\n"); + cancel_delayed_work_sync(&lp->restart_work); + free_irq(lp->tx_irq, ndev); free_irq(lp->rx_irq, ndev); @@ -1258,6 +1301,7 @@ static int temac_probe(struct platform_device *pdev) lp->dev = &pdev->dev; lp->options = XTE_OPTION_DEFAULTS; spin_lock_init(&lp->rx_lock); + INIT_DELAYED_WORK(&lp->restart_work, ll_temac_restart_work_func); /* Setup mutex for synchronization of indirect register access */ if (pdata) { @@ -1364,6 +1408,7 @@ static int temac_probe(struct platform_device *pdev) */ lp->tx_chnl_ctrl = 0x10220000; lp->rx_chnl_ctrl = 0xff070000; + lp->coalesce_count_rx = 0x07; /* Finished with the DMA node; drop the reference */ of_node_put(dma_np); @@ -1395,11 +1440,14 @@ static int temac_probe(struct platform_device *pdev) (pdata->tx_irq_count << 16); else lp->tx_chnl_ctrl = 0x10220000; - if (pdata->rx_irq_timeout || pdata->rx_irq_count) + if (pdata->rx_irq_timeout || pdata->rx_irq_count) { lp->rx_chnl_ctrl = (pdata->rx_irq_timeout << 24) | (pdata->rx_irq_count << 16); - else + lp->coalesce_count_rx = pdata->rx_irq_count; + } else { lp->rx_chnl_ctrl = 0xff070000; + lp->coalesce_count_rx = 0x07; + } } /* Error handle returned DMA RX and TX interrupts */ From 823d81b0fa2cd83a640734e74caee338b5d3c093 Mon Sep 17 00:00:00 2001 From: Nikolay Aleksandrov Date: Mon, 24 Feb 2020 18:46:22 +0200 Subject: [PATCH 136/400] net: bridge: fix stale eth hdr pointer in br_dev_xmit In br_dev_xmit() we perform vlan filtering in br_allowed_ingress() but if the packet has the vlan header inside (e.g. bridge with disabled tx-vlan-offload) then the vlan filtering code will use skb_vlan_untag() to extract the vid before filtering which in turn calls pskb_may_pull() and we may end up with a stale eth pointer. Moreover the cached eth header pointer will generally be wrong after that operation. Remove the eth header caching and just use eth_hdr() directly, the compiler does the right thing and calculates it only once so we don't lose anything. Fixes: 057658cb33fb ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports") Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller --- net/bridge/br_device.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index dc3d2c1dd9d5..0e3dbc5f3c34 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -34,7 +34,6 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev) const struct nf_br_ops *nf_ops; u8 state = BR_STATE_FORWARDING; const unsigned char *dest; - struct ethhdr *eth; u16 vid = 0; rcu_read_lock(); @@ -54,15 +53,14 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev) BR_INPUT_SKB_CB(skb)->frag_max_size = 0; skb_reset_mac_header(skb); - eth = eth_hdr(skb); skb_pull(skb, ETH_HLEN); if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid, &state)) goto out; if (IS_ENABLED(CONFIG_INET) && - (eth->h_proto == htons(ETH_P_ARP) || - eth->h_proto == htons(ETH_P_RARP)) && + (eth_hdr(skb)->h_proto == htons(ETH_P_ARP) || + eth_hdr(skb)->h_proto == htons(ETH_P_RARP)) && br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED)) { br_do_proxy_suppress_arp(skb, br, vid, NULL); } else if (IS_ENABLED(CONFIG_IPV6) && From 03264ddde2453f6877a7d637d84068079632a3c5 Mon Sep 17 00:00:00 2001 From: Adam Williamson Date: Wed, 19 Feb 2020 17:50:07 +0100 Subject: [PATCH 137/400] scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places Arnd Bergmann inadvertently typoed these in d320a9551e394 and 64cbfa96551a; they seem to be the cause of https://bugzilla.redhat.com/show_bug.cgi?id=1801353 , invalid SCSI commands when udev tries to query a DVD drive. [arnd] Found another instance of the same bug, also introduced in my compat_ioctl series. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1801353 Link: https://lore.kernel.org/r/20200219165139.3467320-1-arnd@arndb.de Fixes: c103d6ee69f9 ("compat_ioctl: ide: floppy: add handler") Fixes: 64cbfa96551a ("compat_ioctl: move cdrom commands into cdrom.c") Fixes: d320a9551e39 ("compat_ioctl: scsi: move ioctl handling into drivers") Bisected-by: Chris Murphy Signed-off-by: Arnd Bergmann Signed-off-by: Adam Williamson Signed-off-by: Martin K. Petersen --- drivers/block/paride/pcd.c | 2 +- drivers/cdrom/gdrom.c | 2 +- drivers/ide/ide-gd.c | 2 +- drivers/scsi/sr.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/block/paride/pcd.c b/drivers/block/paride/pcd.c index 117cfc8cd05a..cda5cf917e9a 100644 --- a/drivers/block/paride/pcd.c +++ b/drivers/block/paride/pcd.c @@ -276,7 +276,7 @@ static const struct block_device_operations pcd_bdops = { .release = pcd_block_release, .ioctl = pcd_block_ioctl, #ifdef CONFIG_COMPAT - .ioctl = blkdev_compat_ptr_ioctl, + .compat_ioctl = blkdev_compat_ptr_ioctl, #endif .check_events = pcd_block_check_events, }; diff --git a/drivers/cdrom/gdrom.c b/drivers/cdrom/gdrom.c index 886b2638c730..c51292c2a131 100644 --- a/drivers/cdrom/gdrom.c +++ b/drivers/cdrom/gdrom.c @@ -519,7 +519,7 @@ static const struct block_device_operations gdrom_bdops = { .check_events = gdrom_bdops_check_events, .ioctl = gdrom_bdops_ioctl, #ifdef CONFIG_COMPAT - .ioctl = blkdev_compat_ptr_ioctl, + .compat_ioctl = blkdev_compat_ptr_ioctl, #endif }; diff --git a/drivers/ide/ide-gd.c b/drivers/ide/ide-gd.c index 1bb99b556393..05c26986637b 100644 --- a/drivers/ide/ide-gd.c +++ b/drivers/ide/ide-gd.c @@ -361,7 +361,7 @@ static const struct block_device_operations ide_gd_ops = { .release = ide_gd_release, .ioctl = ide_gd_ioctl, #ifdef CONFIG_COMPAT - .ioctl = ide_gd_compat_ioctl, + .compat_ioctl = ide_gd_compat_ioctl, #endif .getgeo = ide_gd_getgeo, .check_events = ide_gd_check_events, diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c index 0fbb8fe6e521..e4240e4ae8bb 100644 --- a/drivers/scsi/sr.c +++ b/drivers/scsi/sr.c @@ -688,7 +688,7 @@ static const struct block_device_operations sr_bdops = .release = sr_block_release, .ioctl = sr_block_ioctl, #ifdef CONFIG_COMPAT - .ioctl = sr_block_compat_ioctl, + .compat_ioctl = sr_block_compat_ioctl, #endif .check_events = sr_block_check_events, .revalidate_disk = sr_block_revalidate_disk, From fc513fac56e1b626ae48a74d7551d9c35c50129e Mon Sep 17 00:00:00 2001 From: Ronnie Sahlberg Date: Wed, 19 Feb 2020 06:01:03 +1000 Subject: [PATCH 138/400] cifs: don't leak -EAGAIN for stat() during reconnect If from cifs_revalidate_dentry_attr() the SMB2/QUERY_INFO call fails with an error, such as STATUS_SESSION_EXPIRED, causing the session to be reconnected it is possible we will leak -EAGAIN back to the application even for system calls such as stat() where this is not a valid error. Fix this by re-trying the operation from within cifs_revalidate_dentry_attr() if cifs_get_inode_info*() returns -EAGAIN. This fixes stat() and possibly also other system calls that uses cifs_revalidate_dentry*(). Signed-off-by: Ronnie Sahlberg Signed-off-by: Steve French Reviewed-by: Pavel Shilovsky Reviewed-by: Aurelien Aptel CC: Stable --- fs/cifs/inode.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index b5e6635c578e..1c6f659110d0 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -2073,6 +2073,7 @@ int cifs_revalidate_dentry_attr(struct dentry *dentry) struct inode *inode = d_inode(dentry); struct super_block *sb = dentry->d_sb; char *full_path = NULL; + int count = 0; if (inode == NULL) return -ENOENT; @@ -2094,15 +2095,18 @@ int cifs_revalidate_dentry_attr(struct dentry *dentry) full_path, inode, inode->i_count.counter, dentry, cifs_get_time(dentry), jiffies); +again: if (cifs_sb_master_tcon(CIFS_SB(sb))->unix_ext) rc = cifs_get_inode_info_unix(&inode, full_path, sb, xid); else rc = cifs_get_inode_info(&inode, full_path, NULL, sb, xid, NULL); - + if (rc == -EAGAIN && count++ < 10) + goto again; out: kfree(full_path); free_xid(xid); + return rc; } From 154255233830e1e4dd0d99ac929a5dce588c0b81 Mon Sep 17 00:00:00 2001 From: "Paulo Alcantara (SUSE)" Date: Thu, 20 Feb 2020 19:49:35 -0300 Subject: [PATCH 139/400] cifs: fix potential mismatch of UNC paths Ensure that full_path is an UNC path that contains '\\' as delimiter, which is required by cifs_build_devname(). The build_path_from_dentry_optional_prefix() function may return a path with '/' as delimiter when using SMB1 UNIX extensions, for example. Signed-off-by: Paulo Alcantara (SUSE) Signed-off-by: Steve French Acked-by: Ronnie Sahlberg --- fs/cifs/cifs_dfs_ref.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c index 606f26d862dc..cc3ada12848d 100644 --- a/fs/cifs/cifs_dfs_ref.c +++ b/fs/cifs/cifs_dfs_ref.c @@ -324,6 +324,8 @@ static struct vfsmount *cifs_dfs_do_automount(struct dentry *mntpt) if (full_path == NULL) goto cdda_exit; + convert_delimiter(full_path, '\\'); + cifs_dbg(FYI, "%s: full_path: %s\n", __func__, full_path); if (!cifs_sb_master_tlink(cifs_sb)) { From ec57010acd03428a749d2600bf09bd537eaae993 Mon Sep 17 00:00:00 2001 From: Steve French Date: Wed, 19 Feb 2020 23:59:32 -0600 Subject: [PATCH 140/400] cifs: add missing mount option to /proc/mounts We were not displaying the mount option "signloosely" in /proc/mounts for cifs mounts which some users found confusing recently Signed-off-by: Steve French Reviewed-by: Aurelien Aptel --- fs/cifs/cifsfs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index 46ebaf3f0824..fa77fe5258b0 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -530,6 +530,8 @@ cifs_show_options(struct seq_file *s, struct dentry *root) if (tcon->seal) seq_puts(s, ",seal"); + else if (tcon->ses->server->ignore_signature) + seq_puts(s, ",signloosely"); if (tcon->nocase) seq_puts(s, ",nocase"); if (tcon->local_lease) From 86f740f2aed5ea7fe1aa86dc2df0fb4ab0f71088 Mon Sep 17 00:00:00 2001 From: Aurelien Aptel Date: Fri, 21 Feb 2020 11:19:06 +0100 Subject: [PATCH 141/400] cifs: fix rename() by ensuring source handle opened with DELETE bit To rename a file in SMB2 we open it with the DELETE access and do a special SetInfo on it. If the handle is missing the DELETE bit the server will fail the SetInfo with STATUS_ACCESS_DENIED. We currently try to reuse any existing opened handle we have with cifs_get_writable_path(). That function looks for handles with WRITE access but doesn't check for DELETE, making rename() fail if it finds a handle to reuse. Simple reproducer below. To select handles with the DELETE bit, this patch adds a flag argument to cifs_get_writable_path() and find_writable_file() and the existing 'bool fsuid_only' argument is converted to a flag. The cifsFileInfo struct only stores the UNIX open mode but not the original SMB access flags. Since the DELETE bit is not mapped in that mode, this patch stores the access mask in cifs_fid on file open, which is accessible from cifsFileInfo. Simple reproducer: #include #include #include #include #include #include #define E(s) perror(s), exit(1) int main(int argc, char *argv[]) { int fd, ret; if (argc != 3) { fprintf(stderr, "Usage: %s A B\n" "create&open A in write mode, " "rename A to B, close A\n", argv[0]); return 0; } fd = openat(AT_FDCWD, argv[1], O_WRONLY|O_CREAT|O_SYNC, 0666); if (fd == -1) E("openat()"); ret = rename(argv[1], argv[2]); if (ret) E("rename()"); ret = close(fd); if (ret) E("close()"); return ret; } $ gcc -o bugrename bugrename.c $ ./bugrename /mnt/a /mnt/b rename(): Permission denied Fixes: 8de9e86c67ba ("cifs: create a helper to find a writeable handle by path name") CC: Stable Signed-off-by: Aurelien Aptel Signed-off-by: Steve French Reviewed-by: Pavel Shilovsky Reviewed-by: Paulo Alcantara (SUSE) --- fs/cifs/cifsglob.h | 7 +++++++ fs/cifs/cifsproto.h | 5 +++-- fs/cifs/cifssmb.c | 3 ++- fs/cifs/file.c | 19 ++++++++++++------- fs/cifs/inode.c | 6 +++--- fs/cifs/smb1ops.c | 2 +- fs/cifs/smb2inode.c | 4 ++-- fs/cifs/smb2ops.c | 3 ++- fs/cifs/smb2pdu.c | 1 + 9 files changed, 33 insertions(+), 17 deletions(-) diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index de82cfa44b1a..0d956360e984 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -1281,6 +1281,7 @@ struct cifs_fid { __u64 volatile_fid; /* volatile file id for smb2 */ __u8 lease_key[SMB2_LEASE_KEY_SIZE]; /* lease key for smb2 */ __u8 create_guid[16]; + __u32 access; struct cifs_pending_open *pending_open; unsigned int epoch; #ifdef CONFIG_CIFS_DEBUG2 @@ -1741,6 +1742,12 @@ static inline bool is_retryable_error(int error) return false; } + +/* cifs_get_writable_file() flags */ +#define FIND_WR_ANY 0 +#define FIND_WR_FSUID_ONLY 1 +#define FIND_WR_WITH_DELETE 2 + #define MID_FREE 0 #define MID_REQUEST_ALLOCATED 1 #define MID_REQUEST_SUBMITTED 2 diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index 89eaaf46d1ca..e5cb681ec138 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -134,11 +134,12 @@ extern bool backup_cred(struct cifs_sb_info *); extern bool is_size_safe_to_change(struct cifsInodeInfo *, __u64 eof); extern void cifs_update_eof(struct cifsInodeInfo *cifsi, loff_t offset, unsigned int bytes_written); -extern struct cifsFileInfo *find_writable_file(struct cifsInodeInfo *, bool); +extern struct cifsFileInfo *find_writable_file(struct cifsInodeInfo *, int); extern int cifs_get_writable_file(struct cifsInodeInfo *cifs_inode, - bool fsuid_only, + int flags, struct cifsFileInfo **ret_file); extern int cifs_get_writable_path(struct cifs_tcon *tcon, const char *name, + int flags, struct cifsFileInfo **ret_file); extern struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *, bool); extern int cifs_get_readable_path(struct cifs_tcon *tcon, const char *name, diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 3c89569e7210..6f6fb3606a5d 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -1492,6 +1492,7 @@ openRetry: *oplock = rsp->OplockLevel; /* cifs fid stays in le */ oparms->fid->netfid = rsp->Fid; + oparms->fid->access = desired_access; /* Let caller know file was created so we can set the mode. */ /* Do we care about the CreateAction in any other cases? */ @@ -2115,7 +2116,7 @@ cifs_writev_requeue(struct cifs_writedata *wdata) wdata2->tailsz = tailsz; wdata2->bytes = cur_len; - rc = cifs_get_writable_file(CIFS_I(inode), false, + rc = cifs_get_writable_file(CIFS_I(inode), FIND_WR_ANY, &wdata2->cfile); if (!wdata2->cfile) { cifs_dbg(VFS, "No writable handle to retry writepages rc=%d\n", diff --git a/fs/cifs/file.c b/fs/cifs/file.c index bc9516ab4b34..3b942ecdd4be 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -1958,7 +1958,7 @@ struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *cifs_inode, /* Return -EBADF if no handle is found and general rc otherwise */ int -cifs_get_writable_file(struct cifsInodeInfo *cifs_inode, bool fsuid_only, +cifs_get_writable_file(struct cifsInodeInfo *cifs_inode, int flags, struct cifsFileInfo **ret_file) { struct cifsFileInfo *open_file, *inv_file = NULL; @@ -1966,7 +1966,8 @@ cifs_get_writable_file(struct cifsInodeInfo *cifs_inode, bool fsuid_only, bool any_available = false; int rc = -EBADF; unsigned int refind = 0; - + bool fsuid_only = flags & FIND_WR_FSUID_ONLY; + bool with_delete = flags & FIND_WR_WITH_DELETE; *ret_file = NULL; /* @@ -1998,6 +1999,8 @@ refind_writable: continue; if (fsuid_only && !uid_eq(open_file->uid, current_fsuid())) continue; + if (with_delete && !(open_file->fid.access & DELETE)) + continue; if (OPEN_FMODE(open_file->f_flags) & FMODE_WRITE) { if (!open_file->invalidHandle) { /* found a good writable file */ @@ -2045,12 +2048,12 @@ refind_writable: } struct cifsFileInfo * -find_writable_file(struct cifsInodeInfo *cifs_inode, bool fsuid_only) +find_writable_file(struct cifsInodeInfo *cifs_inode, int flags) { struct cifsFileInfo *cfile; int rc; - rc = cifs_get_writable_file(cifs_inode, fsuid_only, &cfile); + rc = cifs_get_writable_file(cifs_inode, flags, &cfile); if (rc) cifs_dbg(FYI, "couldn't find writable handle rc=%d", rc); @@ -2059,6 +2062,7 @@ find_writable_file(struct cifsInodeInfo *cifs_inode, bool fsuid_only) int cifs_get_writable_path(struct cifs_tcon *tcon, const char *name, + int flags, struct cifsFileInfo **ret_file) { struct list_head *tmp; @@ -2085,7 +2089,7 @@ cifs_get_writable_path(struct cifs_tcon *tcon, const char *name, kfree(full_path); cinode = CIFS_I(d_inode(cfile->dentry)); spin_unlock(&tcon->open_file_lock); - return cifs_get_writable_file(cinode, 0, ret_file); + return cifs_get_writable_file(cinode, flags, ret_file); } spin_unlock(&tcon->open_file_lock); @@ -2162,7 +2166,8 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to) if (mapping->host->i_size - offset < (loff_t)to) to = (unsigned)(mapping->host->i_size - offset); - rc = cifs_get_writable_file(CIFS_I(mapping->host), false, &open_file); + rc = cifs_get_writable_file(CIFS_I(mapping->host), FIND_WR_ANY, + &open_file); if (!rc) { bytes_written = cifs_write(open_file, open_file->pid, write_data, to - from, &offset); @@ -2355,7 +2360,7 @@ retry: if (cfile) cifsFileInfo_put(cfile); - rc = cifs_get_writable_file(CIFS_I(inode), false, &cfile); + rc = cifs_get_writable_file(CIFS_I(inode), FIND_WR_ANY, &cfile); /* in case of an error store it to return later */ if (rc) diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index 1c6f659110d0..49dbf11e2c3f 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -2282,7 +2282,7 @@ cifs_set_file_size(struct inode *inode, struct iattr *attrs, * writebehind data than the SMB timeout for the SetPathInfo * request would allow */ - open_file = find_writable_file(cifsInode, true); + open_file = find_writable_file(cifsInode, FIND_WR_FSUID_ONLY); if (open_file) { tcon = tlink_tcon(open_file->tlink); server = tcon->ses->server; @@ -2432,7 +2432,7 @@ cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs) args->ctime = NO_CHANGE_64; args->device = 0; - open_file = find_writable_file(cifsInode, true); + open_file = find_writable_file(cifsInode, FIND_WR_FSUID_ONLY); if (open_file) { u16 nfid = open_file->fid.netfid; u32 npid = open_file->pid; @@ -2535,7 +2535,7 @@ cifs_setattr_nounix(struct dentry *direntry, struct iattr *attrs) rc = 0; if (attrs->ia_valid & ATTR_MTIME) { - rc = cifs_get_writable_file(cifsInode, false, &wfile); + rc = cifs_get_writable_file(cifsInode, FIND_WR_ANY, &wfile); if (!rc) { tcon = tlink_tcon(wfile->tlink); rc = tcon->ses->server->ops->flush(xid, tcon, &wfile->fid); diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c index eb994e313c6a..b130efaf8feb 100644 --- a/fs/cifs/smb1ops.c +++ b/fs/cifs/smb1ops.c @@ -766,7 +766,7 @@ smb_set_file_info(struct inode *inode, const char *full_path, struct cifs_tcon *tcon; /* if the file is already open for write, just use that fileid */ - open_file = find_writable_file(cinode, true); + open_file = find_writable_file(cinode, FIND_WR_FSUID_ONLY); if (open_file) { fid.netfid = open_file->fid.netfid; netpid = open_file->pid; diff --git a/fs/cifs/smb2inode.c b/fs/cifs/smb2inode.c index 1cf207564ff9..a8c301ae00ed 100644 --- a/fs/cifs/smb2inode.c +++ b/fs/cifs/smb2inode.c @@ -521,7 +521,7 @@ smb2_mkdir_setinfo(struct inode *inode, const char *name, cifs_i = CIFS_I(inode); dosattrs = cifs_i->cifsAttrs | ATTR_READONLY; data.Attributes = cpu_to_le32(dosattrs); - cifs_get_writable_path(tcon, name, &cfile); + cifs_get_writable_path(tcon, name, FIND_WR_ANY, &cfile); tmprc = smb2_compound_op(xid, tcon, cifs_sb, name, FILE_WRITE_ATTRIBUTES, FILE_CREATE, CREATE_NOT_FILE, ACL_NO_MODE, @@ -577,7 +577,7 @@ smb2_rename_path(const unsigned int xid, struct cifs_tcon *tcon, { struct cifsFileInfo *cfile; - cifs_get_writable_path(tcon, from_name, &cfile); + cifs_get_writable_path(tcon, from_name, FIND_WR_WITH_DELETE, &cfile); return smb2_set_path_attr(xid, tcon, from_name, to_name, cifs_sb, DELETE, SMB2_OP_RENAME, cfile); diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index e47190cae163..c31e84ee3c39 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -1364,6 +1364,7 @@ smb2_set_fid(struct cifsFileInfo *cfile, struct cifs_fid *fid, __u32 oplock) cfile->fid.persistent_fid = fid->persistent_fid; cfile->fid.volatile_fid = fid->volatile_fid; + cfile->fid.access = fid->access; #ifdef CONFIG_CIFS_DEBUG2 cfile->fid.mid = fid->mid; #endif /* CIFS_DEBUG2 */ @@ -3327,7 +3328,7 @@ static loff_t smb3_llseek(struct file *file, struct cifs_tcon *tcon, loff_t offs * some servers (Windows2016) will not reflect recent writes in * QUERY_ALLOCATED_RANGES until SMB2_flush is called. */ - wrcfile = find_writable_file(cifsi, false); + wrcfile = find_writable_file(cifsi, FIND_WR_ANY); if (wrcfile) { filemap_write_and_wait(inode->i_mapping); smb2_flush_file(xid, tcon, &wrcfile->fid); diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index 1234f9ccab03..28c0be5e69b7 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -2771,6 +2771,7 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path, atomic_inc(&tcon->num_remote_opens); oparms->fid->persistent_fid = rsp->PersistentFileId; oparms->fid->volatile_fid = rsp->VolatileFileId; + oparms->fid->access = oparms->desired_access; #ifdef CONFIG_CIFS_DEBUG2 oparms->fid->mid = le64_to_cpu(rsp->sync_hdr.MessageId); #endif /* CIFS_DEBUG2 */ From fb4b5f13464c468a9c10ae1ab8ba9aa352d0256a Mon Sep 17 00:00:00 2001 From: Joe Perches Date: Fri, 21 Feb 2020 05:20:45 -0800 Subject: [PATCH 142/400] cifs: Use #define in cifs_dbg All other uses of cifs_dbg use defines so change this one. Signed-off-by: Joe Perches Reviewed-by: Aurelien Aptel Signed-off-by: Steve French --- fs/cifs/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c index 49dbf11e2c3f..1e8a4b1579db 100644 --- a/fs/cifs/inode.c +++ b/fs/cifs/inode.c @@ -653,8 +653,8 @@ cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info, */ if ((fattr->cf_nlink < 1) && !tcon->unix_ext && !info->DeletePending) { - cifs_dbg(1, "bogus file nlink value %u\n", - fattr->cf_nlink); + cifs_dbg(VFS, "bogus file nlink value %u\n", + fattr->cf_nlink); fattr->cf_flags |= CIFS_FATTR_UNKNOWN_NLINK; } } From 1c5312308c96556ae209e8eb1423c38d4a113759 Mon Sep 17 00:00:00 2001 From: Kuninori Morimoto Date: Fri, 21 Feb 2020 10:25:18 +0900 Subject: [PATCH 143/400] ASoC: soc-pcm/soc-compress: don't use snd_soc_dapm_stream_stop() commit b0edff42360ab4 ("ASoC: soc-pcm/soc-compress: use snd_soc_dapm_stream_stop() for SND_SOC_DAPM_STREAM_STOP") uses snd_soc_dapm_stream_stop() for soc_compr_free_fe() and dpcm_fe_dai_shutdown() because it didn't care about pmdown_time. But, it didn't need to care. This patch rollback to original code. Some system will wait unneeded timed-out without this patch. Special Thanks for reporting to Chris Gorman. ... intel_sst_acpi 808622A8:00: Wait timed-out condition:0x0, msg_id:0x1 fw_state 0x3 intel_sst_acpi 808622A8:00: fw returned err -16 sst-mfld-platform sst-mfld-platform: ASoC: PRE_PMD: pcm0_in event failed: -16 ... Fixes: commit b0edff42360ab4 ("ASoC: soc-pcm/soc-compress: use snd_soc_dapm_stream_stop() for SND_SOC_DAPM_STREAM_STOP") Reported-by: Chris Gorman Signed-off-by: Kuninori Morimoto Link: https://lore.kernel.org/r/87lfowspeb.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown --- sound/soc/soc-compress.c | 2 +- sound/soc/soc-pcm.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/sound/soc/soc-compress.c b/sound/soc/soc-compress.c index 223cd045719e..392a1c5b15d3 100644 --- a/sound/soc/soc-compress.c +++ b/sound/soc/soc-compress.c @@ -299,7 +299,7 @@ static int soc_compr_free_fe(struct snd_compr_stream *cstream) for_each_dpcm_be(fe, stream, dpcm) dpcm->state = SND_SOC_DPCM_LINK_STATE_FREE; - snd_soc_dapm_stream_stop(fe, stream); + dpcm_dapm_stream_event(fe, stream, SND_SOC_DAPM_STREAM_STOP); fe->dpcm[stream].state = SND_SOC_DPCM_STATE_CLOSE; fe->dpcm[stream].runtime_update = SND_SOC_DPCM_UPDATE_NO; diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c index 42fb03a99cb1..6d400bf02ca1 100644 --- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -2006,7 +2006,7 @@ static int dpcm_fe_dai_shutdown(struct snd_pcm_substream *substream) soc_pcm_close(substream); /* run the stream event for each BE */ - snd_soc_dapm_stream_stop(fe, stream); + dpcm_dapm_stream_event(fe, stream, SND_SOC_DAPM_STREAM_STOP); fe->dpcm[stream].state = SND_SOC_DPCM_STATE_CLOSE; dpcm_set_fe_update_state(fe, stream, SND_SOC_DPCM_UPDATE_NO); From 8308a09e87d2cb51adb186dc7d5a5c1913fb0758 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Amadeusz=20S=C5=82awi=C5=84ski?= Date: Mon, 24 Feb 2020 07:52:02 -0500 Subject: [PATCH 144/400] ASoC: Intel: Skylake: Fix available clock counter incrementation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Incrementation of avail_clk_cnt was incorrectly moved to error path. Put it back to success path. Fixes: 6ee927f2f01466 ('ASoC: Intel: Skylake: Fix NULL ptr dereference when unloading clk dev') Signed-off-by: Amadeusz Sławiński Reviewed-by: Cezary Rojewski Reviewed-by: Pierre-Louis Bossart Link: https://lore.kernel.org/r/20200224125202.13784-1-amadeuszx.slawinski@linux.intel.com Signed-off-by: Mark Brown --- sound/soc/intel/skylake/skl-ssp-clk.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/sound/soc/intel/skylake/skl-ssp-clk.c b/sound/soc/intel/skylake/skl-ssp-clk.c index 1c0e5226cb5b..bd43885f3805 100644 --- a/sound/soc/intel/skylake/skl-ssp-clk.c +++ b/sound/soc/intel/skylake/skl-ssp-clk.c @@ -384,9 +384,11 @@ static int skl_clk_dev_probe(struct platform_device *pdev) &clks[i], clk_pdata, i); if (IS_ERR(data->clk[data->avail_clk_cnt])) { - ret = PTR_ERR(data->clk[data->avail_clk_cnt++]); + ret = PTR_ERR(data->clk[data->avail_clk_cnt]); goto err_unreg_skl_clk; } + + data->avail_clk_cnt++; } platform_set_drvdata(pdev, data); From 756125289285f6e55a03861bf4b6257aa3d19a93 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Mon, 24 Feb 2020 16:38:57 -0500 Subject: [PATCH 145/400] audit: always check the netlink payload length in audit_receive_msg() This patch ensures that we always check the netlink payload length in audit_receive_msg() before we take any action on the payload itself. Cc: stable@vger.kernel.org Reported-by: syzbot+399c44bf1f43b8747403@syzkaller.appspotmail.com Reported-by: syzbot+e4b12d8d202701f08b6d@syzkaller.appspotmail.com Signed-off-by: Paul Moore --- kernel/audit.c | 40 +++++++++++++++++++++------------------- 1 file changed, 21 insertions(+), 19 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 17b0d523afb3..9ddfe2aa6671 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1101,13 +1101,11 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature audit_log_end(ab); } -static int audit_set_feature(struct sk_buff *skb) +static int audit_set_feature(struct audit_features *uaf) { - struct audit_features *uaf; int i; BUILD_BUG_ON(AUDIT_LAST_FEATURE + 1 > ARRAY_SIZE(audit_feature_names)); - uaf = nlmsg_data(nlmsg_hdr(skb)); /* if there is ever a version 2 we should handle that here */ @@ -1175,6 +1173,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) { u32 seq; void *data; + int data_len; int err; struct audit_buffer *ab; u16 msg_type = nlh->nlmsg_type; @@ -1188,6 +1187,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) seq = nlh->nlmsg_seq; data = nlmsg_data(nlh); + data_len = nlmsg_len(nlh); switch (msg_type) { case AUDIT_GET: { @@ -1211,7 +1211,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) struct audit_status s; memset(&s, 0, sizeof(s)); /* guard against past and future API changes */ - memcpy(&s, data, min_t(size_t, sizeof(s), nlmsg_len(nlh))); + memcpy(&s, data, min_t(size_t, sizeof(s), data_len)); if (s.mask & AUDIT_STATUS_ENABLED) { err = audit_set_enabled(s.enabled); if (err < 0) @@ -1315,7 +1315,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) return err; break; case AUDIT_SET_FEATURE: - err = audit_set_feature(skb); + if (data_len < sizeof(struct audit_features)) + return -EINVAL; + err = audit_set_feature(data); if (err) return err; break; @@ -1327,6 +1329,8 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) err = audit_filter(msg_type, AUDIT_FILTER_USER); if (err == 1) { /* match or error */ + char *str = data; + err = 0; if (msg_type == AUDIT_USER_TTY) { err = tty_audit_push(); @@ -1334,26 +1338,24 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) break; } audit_log_user_recv_msg(&ab, msg_type); - if (msg_type != AUDIT_USER_TTY) + if (msg_type != AUDIT_USER_TTY) { + /* ensure NULL termination */ + str[data_len - 1] = '\0'; audit_log_format(ab, " msg='%.*s'", AUDIT_MESSAGE_TEXT_MAX, - (char *)data); - else { - int size; - + str); + } else { audit_log_format(ab, " data="); - size = nlmsg_len(nlh); - if (size > 0 && - ((unsigned char *)data)[size - 1] == '\0') - size--; - audit_log_n_untrustedstring(ab, data, size); + if (data_len > 0 && str[data_len - 1] == '\0') + data_len--; + audit_log_n_untrustedstring(ab, str, data_len); } audit_log_end(ab); } break; case AUDIT_ADD_RULE: case AUDIT_DEL_RULE: - if (nlmsg_len(nlh) < sizeof(struct audit_rule_data)) + if (data_len < sizeof(struct audit_rule_data)) return -EINVAL; if (audit_enabled == AUDIT_LOCKED) { audit_log_common_recv_msg(audit_context(), &ab, @@ -1365,7 +1367,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) audit_log_end(ab); return -EPERM; } - err = audit_rule_change(msg_type, seq, data, nlmsg_len(nlh)); + err = audit_rule_change(msg_type, seq, data, data_len); break; case AUDIT_LIST_RULES: err = audit_list_rules_send(skb, seq); @@ -1380,7 +1382,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) case AUDIT_MAKE_EQUIV: { void *bufp = data; u32 sizes[2]; - size_t msglen = nlmsg_len(nlh); + size_t msglen = data_len; char *old, *new; err = -EINVAL; @@ -1456,7 +1458,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) memset(&s, 0, sizeof(s)); /* guard against past and future API changes */ - memcpy(&s, data, min_t(size_t, sizeof(s), nlmsg_len(nlh))); + memcpy(&s, data, min_t(size_t, sizeof(s), data_len)); /* check if new data is valid */ if ((s.enabled != 0 && s.enabled != 1) || (s.log_passwd != 0 && s.log_passwd != 1)) From 01e99aeca3979600302913cef3f89076786f32c8 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Tue, 25 Feb 2020 09:04:32 +0800 Subject: [PATCH 146/400] blk-mq: insert passthrough request into hctx->dispatch directly For some reason, device may be in one situation which can't handle FS request, so STS_RESOURCE is always returned and the FS request will be added to hctx->dispatch. However passthrough request may be required at that time for fixing the problem. If passthrough request is added to scheduler queue, there isn't any chance for blk-mq to dispatch it given we prioritize requests in hctx->dispatch. Then the FS IO request may never be completed, and IO hang is caused. So passthrough request has to be added to hctx->dispatch directly for fixing the IO hang. Fix this issue by inserting passthrough request into hctx->dispatch directly together withing adding FS request to the tail of hctx->dispatch in blk_mq_dispatch_rq_list(). Actually we add FS request to tail of hctx->dispatch at default, see blk_mq_request_bypass_insert(). Then it becomes consistent with original legacy IO request path, in which passthrough request is always added to q->queue_head. Cc: Dongli Zhang Cc: Christoph Hellwig Cc: Ewan D. Milne Signed-off-by: Ming Lei Signed-off-by: Jens Axboe --- block/blk-flush.c | 2 +- block/blk-mq-sched.c | 22 +++++++++++++++------- block/blk-mq.c | 18 +++++++++++------- block/blk-mq.h | 3 ++- 4 files changed, 29 insertions(+), 16 deletions(-) diff --git a/block/blk-flush.c b/block/blk-flush.c index 3f977c517960..5cc775bdb06a 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -412,7 +412,7 @@ void blk_insert_flush(struct request *rq) */ if ((policy & REQ_FSEQ_DATA) && !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) { - blk_mq_request_bypass_insert(rq, false); + blk_mq_request_bypass_insert(rq, false, false); return; } diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c index ca22afd47b3d..856356b1619e 100644 --- a/block/blk-mq-sched.c +++ b/block/blk-mq-sched.c @@ -361,13 +361,19 @@ static bool blk_mq_sched_bypass_insert(struct blk_mq_hw_ctx *hctx, bool has_sched, struct request *rq) { - /* dispatch flush rq directly */ - if (rq->rq_flags & RQF_FLUSH_SEQ) { - spin_lock(&hctx->lock); - list_add(&rq->queuelist, &hctx->dispatch); - spin_unlock(&hctx->lock); + /* + * dispatch flush and passthrough rq directly + * + * passthrough request has to be added to hctx->dispatch directly. + * For some reason, device may be in one situation which can't + * handle FS request, so STS_RESOURCE is always returned and the + * FS request will be added to hctx->dispatch. However passthrough + * request may be required at that time for fixing the problem. If + * passthrough request is added to scheduler queue, there isn't any + * chance to dispatch it given we prioritize requests in hctx->dispatch. + */ + if ((rq->rq_flags & RQF_FLUSH_SEQ) || blk_rq_is_passthrough(rq)) return true; - } if (has_sched) rq->rq_flags |= RQF_SORTED; @@ -391,8 +397,10 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head, WARN_ON(e && (rq->tag != -1)); - if (blk_mq_sched_bypass_insert(hctx, !!e, rq)) + if (blk_mq_sched_bypass_insert(hctx, !!e, rq)) { + blk_mq_request_bypass_insert(rq, at_head, false); goto run; + } if (e && e->type->ops.insert_requests) { LIST_HEAD(list); diff --git a/block/blk-mq.c b/block/blk-mq.c index a12b1763508d..5e1e4151cb51 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -735,7 +735,7 @@ static void blk_mq_requeue_work(struct work_struct *work) * merge. */ if (rq->rq_flags & RQF_DONTPREP) - blk_mq_request_bypass_insert(rq, false); + blk_mq_request_bypass_insert(rq, false, false); else blk_mq_sched_insert_request(rq, true, false, false); } @@ -1286,7 +1286,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list, q->mq_ops->commit_rqs(hctx); spin_lock(&hctx->lock); - list_splice_init(list, &hctx->dispatch); + list_splice_tail_init(list, &hctx->dispatch); spin_unlock(&hctx->lock); /* @@ -1677,12 +1677,16 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, * Should only be used carefully, when the caller knows we want to * bypass a potential IO scheduler on the target device. */ -void blk_mq_request_bypass_insert(struct request *rq, bool run_queue) +void blk_mq_request_bypass_insert(struct request *rq, bool at_head, + bool run_queue) { struct blk_mq_hw_ctx *hctx = rq->mq_hctx; spin_lock(&hctx->lock); - list_add_tail(&rq->queuelist, &hctx->dispatch); + if (at_head) + list_add(&rq->queuelist, &hctx->dispatch); + else + list_add_tail(&rq->queuelist, &hctx->dispatch); spin_unlock(&hctx->lock); if (run_queue) @@ -1849,7 +1853,7 @@ insert: if (bypass_insert) return BLK_STS_RESOURCE; - blk_mq_request_bypass_insert(rq, run_queue); + blk_mq_request_bypass_insert(rq, false, run_queue); return BLK_STS_OK; } @@ -1876,7 +1880,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx, ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true); if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) - blk_mq_request_bypass_insert(rq, true); + blk_mq_request_bypass_insert(rq, false, true); else if (ret != BLK_STS_OK) blk_mq_end_request(rq, ret); @@ -1910,7 +1914,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx, if (ret != BLK_STS_OK) { if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) { - blk_mq_request_bypass_insert(rq, + blk_mq_request_bypass_insert(rq, false, list_empty(list)); break; } diff --git a/block/blk-mq.h b/block/blk-mq.h index eaaca8fc1c28..c0fa34378eb2 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -66,7 +66,8 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags, */ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, bool at_head); -void blk_mq_request_bypass_insert(struct request *rq, bool run_queue); +void blk_mq_request_bypass_insert(struct request *rq, bool at_head, + bool run_queue); void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx, struct list_head *list); From 3d2ed431b8f39483477bc3c3a2aefbc9778ffe12 Mon Sep 17 00:00:00 2001 From: Phong LE Date: Wed, 19 Feb 2020 15:13:24 +0100 Subject: [PATCH 147/400] drm/mediatek: Handle component type MTK_DISP_OVL_2L correctly The larb device remains NULL if the type is MTK_DISP_OVL_2L. A kernel panic is raised when a crtc uses mtk_smi_larb_get or mtk_smi_larb_put. Fixes: b17bdd0d7a73 ("drm/mediatek: add component OVL_2L0") Signed-off-by: Phong LE Signed-off-by: CK Hu --- drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c index 1f5a112bb034..57c88de9a329 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c @@ -471,6 +471,7 @@ int mtk_ddp_comp_init(struct device *dev, struct device_node *node, /* Only DMA capable components need the LARB property */ comp->larb_dev = NULL; if (type != MTK_DISP_OVL && + type != MTK_DISP_OVL_2L && type != MTK_DISP_RDMA && type != MTK_DISP_WDMA) return 0; From 94788af4ed039476ff3527b0e6a12c1dc42cb022 Mon Sep 17 00:00:00 2001 From: Dmitry Osipenko Date: Sun, 9 Feb 2020 19:33:38 +0300 Subject: [PATCH 148/400] dmaengine: tegra-apb: Fix use-after-free I was doing some experiments with I2C and noticed that Tegra APB DMA driver crashes sometime after I2C DMA transfer termination. The crash happens because tegra_dma_terminate_all() bails out immediately if pending list is empty, and thus, it doesn't release the half-completed descriptors which are getting re-used before ISR tasklet kicks-in. tegra-i2c 7000c400.i2c: DMA transfer timeout elants_i2c 0-0010: elants_i2c_irq: failed to read data: -110 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 142 at lib/list_debug.c:45 __list_del_entry_valid+0x45/0xac list_del corruption, ddbaac44->next is LIST_POISON1 (00000100) Modules linked in: CPU: 0 PID: 142 Comm: kworker/0:2 Not tainted 5.5.0-rc2-next-20191220-00175-gc3605715758d-dirty #538 Hardware name: NVIDIA Tegra SoC (Flattened Device Tree) Workqueue: events_freezable_power_ thermal_zone_device_check [] (unwind_backtrace) from [] (show_stack+0x11/0x14) [] (show_stack) from [] (dump_stack+0x85/0x94) [] (dump_stack) from [] (__warn+0xc1/0xc4) [] (__warn) from [] (warn_slowpath_fmt+0x61/0x78) [] (warn_slowpath_fmt) from [] (__list_del_entry_valid+0x45/0xac) [] (__list_del_entry_valid) from [] (tegra_dma_tasklet+0x5b/0x154) [] (tegra_dma_tasklet) from [] (tasklet_action_common.constprop.0+0x41/0x7c) [] (tasklet_action_common.constprop.0) from [] (__do_softirq+0xd3/0x2a8) [] (__do_softirq) from [] (irq_exit+0x7b/0x98) [] (irq_exit) from [] (__handle_domain_irq+0x45/0x80) [] (__handle_domain_irq) from [] (gic_handle_irq+0x45/0x7c) [] (gic_handle_irq) from [] (__irq_svc+0x65/0x94) Exception stack(0xde2ebb90 to 0xde2ebbd8) Signed-off-by: Dmitry Osipenko Acked-by: Jon Hunter Cc: Link: https://lore.kernel.org/r/20200209163356.6439-2-digetx@gmail.com Signed-off-by: Vinod Koul --- drivers/dma/tegra20-apb-dma.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c index 3a45079d11ec..319f31d27014 100644 --- a/drivers/dma/tegra20-apb-dma.c +++ b/drivers/dma/tegra20-apb-dma.c @@ -756,10 +756,6 @@ static int tegra_dma_terminate_all(struct dma_chan *dc) bool was_busy; spin_lock_irqsave(&tdc->lock, flags); - if (list_empty(&tdc->pending_sg_req)) { - spin_unlock_irqrestore(&tdc->lock, flags); - return 0; - } if (!tdc->busy) goto skip_dma_stop; From c33ee1301c393a241d6424e36eff1071811b1064 Mon Sep 17 00:00:00 2001 From: Dmitry Osipenko Date: Sun, 9 Feb 2020 19:33:39 +0300 Subject: [PATCH 149/400] dmaengine: tegra-apb: Prevent race conditions of tasklet vs free list The interrupt handler puts a half-completed DMA descriptor on a free list and then schedules tasklet to process bottom half of the descriptor that executes client's callback, this creates possibility to pick up the busy descriptor from the free list. Thus, let's disallow descriptor's re-use until it is fully processed. Signed-off-by: Dmitry Osipenko Acked-by: Jon Hunter Cc: Link: https://lore.kernel.org/r/20200209163356.6439-3-digetx@gmail.com Signed-off-by: Vinod Koul --- drivers/dma/tegra20-apb-dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c index 319f31d27014..4a750e29bfb5 100644 --- a/drivers/dma/tegra20-apb-dma.c +++ b/drivers/dma/tegra20-apb-dma.c @@ -281,7 +281,7 @@ static struct tegra_dma_desc *tegra_dma_desc_get( /* Do not allocate if desc are waiting for ack */ list_for_each_entry(dma_desc, &tdc->free_dma_desc, node) { - if (async_tx_test_ack(&dma_desc->txd)) { + if (async_tx_test_ack(&dma_desc->txd) && !dma_desc->cb_count) { list_del(&dma_desc->node); spin_unlock_irqrestore(&tdc->lock, flags); dma_desc->txd.flags = 0; From b549c252b1292aea959cd9b83537fcb9384a6112 Mon Sep 17 00:00:00 2001 From: Tina Zhang Date: Tue, 25 Feb 2020 13:35:27 +0800 Subject: [PATCH 150/400] drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime Deleting dmabuf item's list head after releasing its container can lead to KASAN-reported issue: BUG: KASAN: use-after-free in __list_del_entry_valid+0x15/0xf0 Read of size 8 at addr ffff88818a4598a8 by task kworker/u8:3/13119 So fix this issue by puting deleting dmabuf_objs ahead of releasing its container. Fixes: dfb6ae4e14bd6 ("drm/i915/gvt: Handle orphan dmabuf_objs") Signed-off-by: Tina Zhang Reviewed-by: Zhenyu Wang Signed-off-by: Zhenyu Wang Link: http://patchwork.freedesktop.org/patch/msgid/20200225053527.8336-2-tina.zhang@intel.com --- drivers/gpu/drm/i915/gvt/dmabuf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gvt/dmabuf.c b/drivers/gpu/drm/i915/gvt/dmabuf.c index 2477a1e5a166..ae139f0877ae 100644 --- a/drivers/gpu/drm/i915/gvt/dmabuf.c +++ b/drivers/gpu/drm/i915/gvt/dmabuf.c @@ -151,12 +151,12 @@ static void dmabuf_gem_object_free(struct kref *kref) dmabuf_obj = container_of(pos, struct intel_vgpu_dmabuf_obj, list); if (dmabuf_obj == obj) { + list_del(pos); intel_gvt_hypervisor_put_vfio_device(vgpu); idr_remove(&vgpu->object_idr, dmabuf_obj->dmabuf_id); kfree(dmabuf_obj->info); kfree(dmabuf_obj); - list_del(pos); break; } } From 25962e1a7f1d522f1b57ead2f266fab570042a70 Mon Sep 17 00:00:00 2001 From: Frieder Schrempf Date: Tue, 25 Feb 2020 08:23:20 +0000 Subject: [PATCH 151/400] dmaengine: imx-sdma: Fix the event id check to include RX event for UART6 On i.MX6UL/ULL and i.MX6SX the DMA event id for the RX channel of UART6 is '0'. To fix the broken DMA support for UART6, we change the check for event_id0 to include '0' as a valid id. Fixes: 1ec1e82f2510 ("dmaengine: Add Freescale i.MX SDMA support") Signed-off-by: Frieder Schrempf Reviewed-by: Fabio Estevam Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200225082139.7646-1-frieder.schrempf@kontron.de Signed-off-by: Vinod Koul --- drivers/dma/imx-sdma.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c index 332ca5034504..4d4477df4ede 100644 --- a/drivers/dma/imx-sdma.c +++ b/drivers/dma/imx-sdma.c @@ -1331,7 +1331,7 @@ static void sdma_free_chan_resources(struct dma_chan *chan) sdma_channel_synchronize(chan); - if (sdmac->event_id0) + if (sdmac->event_id0 >= 0) sdma_event_disable(sdmac, sdmac->event_id0); if (sdmac->event_id1) sdma_event_disable(sdmac, sdmac->event_id1); @@ -1632,7 +1632,7 @@ static int sdma_config(struct dma_chan *chan, memcpy(&sdmac->slave_config, dmaengine_cfg, sizeof(*dmaengine_cfg)); /* Set ENBLn earlier to make sure dma request triggered after that */ - if (sdmac->event_id0) { + if (sdmac->event_id0 >= 0) { if (sdmac->event_id0 >= sdmac->sdma->drvdata->num_events) return -EINVAL; sdma_event_enable(sdmac, sdmac->event_id0); From 53ace1195263b30fd593677dd67559e879ed9aa2 Mon Sep 17 00:00:00 2001 From: Stephen Kitt Date: Fri, 21 Feb 2020 21:57:33 +0100 Subject: [PATCH 152/400] docs: remove MPX from the x86 toc MPX was removed in commit 45fc24e89b7c ("x86/mpx: remove MPX from arch/x86"), this removes the corresponding entry in the x86 toc. This was suggested by a Sphinx warning. Signed-off-by: Stephen Kitt Fixes: 45fc24e89b7cc ("x86/mpx: remove MPX from arch/x86") Acked-by: Dave Hansen Signed-off-by: Jonathan Corbet --- Documentation/x86/index.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst index a8de2fbc1caa..265d9e9a093b 100644 --- a/Documentation/x86/index.rst +++ b/Documentation/x86/index.rst @@ -19,7 +19,6 @@ x86-specific Documentation tlb mtrr pat - intel_mpx intel-iommu intel_txt amd-memory-encryption From adc10f5b0a03606e30c704cff1f0283a696d0260 Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Fri, 21 Feb 2020 16:02:39 -0800 Subject: [PATCH 153/400] docs: Fix empty parallelism argument When there was no parallelism (no top-level -j arg and a pre-1.7 sphinx-build), the argument passed would be empty ("") instead of just being missing, which would (understandably) badly confuse sphinx-build. Fix this by removing the quotes. Reported-by: Rafael J. Wysocki Fixes: 51e46c7a4007 ("docs, parallelism: Rearrange how jobserver reservations are made") Cc: stable@vger.kernel.org # v5.5 only Signed-off-by: Kees Cook Signed-off-by: Jonathan Corbet --- Documentation/sphinx/parallel-wrapper.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/sphinx/parallel-wrapper.sh b/Documentation/sphinx/parallel-wrapper.sh index 7daf5133bdd3..e54c44ce117d 100644 --- a/Documentation/sphinx/parallel-wrapper.sh +++ b/Documentation/sphinx/parallel-wrapper.sh @@ -30,4 +30,4 @@ if [ -n "$parallel" ] ; then parallel="-j$parallel" fi -exec "$sphinx" "$parallel" "$@" +exec "$sphinx" $parallel "$@" From d0820556507bd7aef4f3a615b1b6eb66eb9785fe Mon Sep 17 00:00:00 2001 From: Stefano Brivio Date: Fri, 21 Feb 2020 03:11:56 +0100 Subject: [PATCH 154/400] selftests: nft_concat_range: Move option for 'list ruleset' before command Before nftables commit fb9cea50e8b3 ("main: enforce options before commands"), 'nft list ruleset -a' happened to work, but it's wrong and won't work anymore. Replace it by 'nft -a list ruleset'. Reported-by: Chen Yi Fixes: 611973c1e06f ("selftests: netfilter: Introduce tests for sets with range concatenation") Signed-off-by: Stefano Brivio Signed-off-by: Pablo Neira Ayuso --- .../testing/selftests/netfilter/nft_concat_range.sh | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/netfilter/nft_concat_range.sh b/tools/testing/selftests/netfilter/nft_concat_range.sh index aca21dde102a..5c1033ee1b39 100755 --- a/tools/testing/selftests/netfilter/nft_concat_range.sh +++ b/tools/testing/selftests/netfilter/nft_concat_range.sh @@ -1025,7 +1025,7 @@ format_noconcat() { add() { if ! nft add element inet filter test "${1}"; then err "Failed to add ${1} given ruleset:" - err "$(nft list ruleset -a)" + err "$(nft -a list ruleset)" return 1 fi } @@ -1045,7 +1045,7 @@ add_perf() { add_perf_norange() { if ! nft add element netdev perf norange "${1}"; then err "Failed to add ${1} given ruleset:" - err "$(nft list ruleset -a)" + err "$(nft -a list ruleset)" return 1 fi } @@ -1054,7 +1054,7 @@ add_perf_norange() { add_perf_noconcat() { if ! nft add element netdev perf noconcat "${1}"; then err "Failed to add ${1} given ruleset:" - err "$(nft list ruleset -a)" + err "$(nft -a list ruleset)" return 1 fi } @@ -1063,7 +1063,7 @@ add_perf_noconcat() { del() { if ! nft delete element inet filter test "${1}"; then err "Failed to delete ${1} given ruleset:" - err "$(nft list ruleset -a)" + err "$(nft -a list ruleset)" return 1 fi } @@ -1134,7 +1134,7 @@ send_match() { err " $(for f in ${src}; do eval format_\$f "${2}"; printf ' '; done)" err "should have matched ruleset:" - err "$(nft list ruleset -a)" + err "$(nft -a list ruleset)" return 1 fi nft reset counter inet filter test >/dev/null @@ -1160,7 +1160,7 @@ send_nomatch() { err " $(for f in ${src}; do eval format_\$f "${2}"; printf ' '; done)" err "should not have matched ruleset:" - err "$(nft list ruleset -a)" + err "$(nft -a list ruleset)" return 1 fi } From c780e86dd48ef6467a1146cf7d0fe1e05a635039 Mon Sep 17 00:00:00 2001 From: Jan Kara Date: Thu, 6 Feb 2020 15:28:12 +0100 Subject: [PATCH 155/400] blktrace: Protect q->blk_trace with RCU KASAN is reporting that __blk_add_trace() has a use-after-free issue when accessing q->blk_trace. Indeed the switching of block tracing (and thus eventual freeing of q->blk_trace) is completely unsynchronized with the currently running tracing and thus it can happen that the blk_trace structure is being freed just while __blk_add_trace() works on it. Protect accesses to q->blk_trace by RCU during tracing and make sure we wait for the end of RCU grace period when shutting down tracing. Luckily that is rare enough event that we can afford that. Note that postponing the freeing of blk_trace to an RCU callback should better be avoided as it could have unexpected user visible side-effects as debugfs files would be still existing for a short while block tracing has been shut down. Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711 CC: stable@vger.kernel.org Reviewed-by: Chaitanya Kulkarni Reviewed-by: Ming Lei Tested-by: Ming Lei Reviewed-by: Bart Van Assche Reported-by: Tristan Madani Signed-off-by: Jan Kara Signed-off-by: Jens Axboe --- include/linux/blkdev.h | 2 +- include/linux/blktrace_api.h | 18 ++++-- kernel/trace/blktrace.c | 114 +++++++++++++++++++++++++---------- 3 files changed, 97 insertions(+), 37 deletions(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 053ea4b51988..10455b2bbbb4 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -524,7 +524,7 @@ struct request_queue { unsigned int sg_reserved_size; int node; #ifdef CONFIG_BLK_DEV_IO_TRACE - struct blk_trace *blk_trace; + struct blk_trace __rcu *blk_trace; struct mutex blk_trace_mutex; #endif /* diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h index 7bb2d8de9f30..3b6ff5902edc 100644 --- a/include/linux/blktrace_api.h +++ b/include/linux/blktrace_api.h @@ -51,9 +51,13 @@ void __trace_note_message(struct blk_trace *, struct blkcg *blkcg, const char *f **/ #define blk_add_cgroup_trace_msg(q, cg, fmt, ...) \ do { \ - struct blk_trace *bt = (q)->blk_trace; \ + struct blk_trace *bt; \ + \ + rcu_read_lock(); \ + bt = rcu_dereference((q)->blk_trace); \ if (unlikely(bt)) \ __trace_note_message(bt, cg, fmt, ##__VA_ARGS__);\ + rcu_read_unlock(); \ } while (0) #define blk_add_trace_msg(q, fmt, ...) \ blk_add_cgroup_trace_msg(q, NULL, fmt, ##__VA_ARGS__) @@ -61,10 +65,14 @@ void __trace_note_message(struct blk_trace *, struct blkcg *blkcg, const char *f static inline bool blk_trace_note_message_enabled(struct request_queue *q) { - struct blk_trace *bt = q->blk_trace; - if (likely(!bt)) - return false; - return bt->act_mask & BLK_TC_NOTIFY; + struct blk_trace *bt; + bool ret; + + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + ret = bt && (bt->act_mask & BLK_TC_NOTIFY); + rcu_read_unlock(); + return ret; } extern void blk_add_driver_data(struct request_queue *q, struct request *rq, diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index 0735ae8545d8..4560878f0bac 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -335,6 +335,7 @@ static void put_probe_ref(void) static void blk_trace_cleanup(struct blk_trace *bt) { + synchronize_rcu(); blk_trace_free(bt); put_probe_ref(); } @@ -629,8 +630,10 @@ static int compat_blk_trace_setup(struct request_queue *q, char *name, static int __blk_trace_startstop(struct request_queue *q, int start) { int ret; - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; + bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (bt == NULL) return -EINVAL; @@ -740,8 +743,8 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg) void blk_trace_shutdown(struct request_queue *q) { mutex_lock(&q->blk_trace_mutex); - - if (q->blk_trace) { + if (rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex))) { __blk_trace_startstop(q, 0); __blk_trace_remove(q); } @@ -752,8 +755,10 @@ void blk_trace_shutdown(struct request_queue *q) #ifdef CONFIG_BLK_CGROUP static u64 blk_trace_bio_get_cgid(struct request_queue *q, struct bio *bio) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; + /* We don't use the 'bt' value here except as an optimization... */ + bt = rcu_dereference_protected(q->blk_trace, 1); if (!bt || !(blk_tracer_flags.val & TRACE_BLK_OPT_CGROUP)) return 0; @@ -796,10 +801,14 @@ blk_trace_request_get_cgid(struct request_queue *q, struct request *rq) static void blk_add_trace_rq(struct request *rq, int error, unsigned int nr_bytes, u32 what, u64 cgid) { - struct blk_trace *bt = rq->q->blk_trace; + struct blk_trace *bt; - if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(rq->q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + } if (blk_rq_is_passthrough(rq)) what |= BLK_TC_ACT(BLK_TC_PC); @@ -808,6 +817,7 @@ static void blk_add_trace_rq(struct request *rq, int error, __blk_add_trace(bt, blk_rq_trace_sector(rq), nr_bytes, req_op(rq), rq->cmd_flags, what, error, 0, NULL, cgid); + rcu_read_unlock(); } static void blk_add_trace_rq_insert(void *ignore, @@ -853,14 +863,19 @@ static void blk_add_trace_rq_complete(void *ignore, struct request *rq, static void blk_add_trace_bio(struct request_queue *q, struct bio *bio, u32 what, int error) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; - if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + } __blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size, bio_op(bio), bio->bi_opf, what, error, 0, NULL, blk_trace_bio_get_cgid(q, bio)); + rcu_read_unlock(); } static void blk_add_trace_bio_bounce(void *ignore, @@ -905,11 +920,14 @@ static void blk_add_trace_getrq(void *ignore, if (bio) blk_add_trace_bio(q, bio, BLK_TA_GETRQ, 0); else { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_GETRQ, 0, 0, NULL, 0); + rcu_read_unlock(); } } @@ -921,27 +939,35 @@ static void blk_add_trace_sleeprq(void *ignore, if (bio) blk_add_trace_bio(q, bio, BLK_TA_SLEEPRQ, 0); else { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_SLEEPRQ, 0, 0, NULL, 0); + rcu_read_unlock(); } } static void blk_add_trace_plug(void *ignore, struct request_queue *q) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) __blk_add_trace(bt, 0, 0, 0, 0, BLK_TA_PLUG, 0, 0, NULL, 0); + rcu_read_unlock(); } static void blk_add_trace_unplug(void *ignore, struct request_queue *q, unsigned int depth, bool explicit) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) { __be64 rpdu = cpu_to_be64(depth); u32 what; @@ -953,14 +979,17 @@ static void blk_add_trace_unplug(void *ignore, struct request_queue *q, __blk_add_trace(bt, 0, 0, 0, 0, what, 0, sizeof(rpdu), &rpdu, 0); } + rcu_read_unlock(); } static void blk_add_trace_split(void *ignore, struct request_queue *q, struct bio *bio, unsigned int pdu) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); if (bt) { __be64 rpdu = cpu_to_be64(pdu); @@ -969,6 +998,7 @@ static void blk_add_trace_split(void *ignore, BLK_TA_SPLIT, bio->bi_status, sizeof(rpdu), &rpdu, blk_trace_bio_get_cgid(q, bio)); } + rcu_read_unlock(); } /** @@ -988,11 +1018,15 @@ static void blk_add_trace_bio_remap(void *ignore, struct request_queue *q, struct bio *bio, dev_t dev, sector_t from) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; struct blk_io_trace_remap r; - if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + } r.device_from = cpu_to_be32(dev); r.device_to = cpu_to_be32(bio_dev(bio)); @@ -1001,6 +1035,7 @@ static void blk_add_trace_bio_remap(void *ignore, __blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size, bio_op(bio), bio->bi_opf, BLK_TA_REMAP, bio->bi_status, sizeof(r), &r, blk_trace_bio_get_cgid(q, bio)); + rcu_read_unlock(); } /** @@ -1021,11 +1056,15 @@ static void blk_add_trace_rq_remap(void *ignore, struct request *rq, dev_t dev, sector_t from) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; struct blk_io_trace_remap r; - if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + } r.device_from = cpu_to_be32(dev); r.device_to = cpu_to_be32(disk_devt(rq->rq_disk)); @@ -1034,6 +1073,7 @@ static void blk_add_trace_rq_remap(void *ignore, __blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq), rq_data_dir(rq), 0, BLK_TA_REMAP, 0, sizeof(r), &r, blk_trace_request_get_cgid(q, rq)); + rcu_read_unlock(); } /** @@ -1051,14 +1091,19 @@ void blk_add_driver_data(struct request_queue *q, struct request *rq, void *data, size_t len) { - struct blk_trace *bt = q->blk_trace; + struct blk_trace *bt; - if (likely(!bt)) + rcu_read_lock(); + bt = rcu_dereference(q->blk_trace); + if (likely(!bt)) { + rcu_read_unlock(); return; + } __blk_add_trace(bt, blk_rq_trace_sector(rq), blk_rq_bytes(rq), 0, 0, BLK_TA_DRV_DATA, 0, len, data, blk_trace_request_get_cgid(q, rq)); + rcu_read_unlock(); } EXPORT_SYMBOL_GPL(blk_add_driver_data); @@ -1597,6 +1642,7 @@ static int blk_trace_remove_queue(struct request_queue *q) return -EINVAL; put_probe_ref(); + synchronize_rcu(); blk_trace_free(bt); return 0; } @@ -1758,6 +1804,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev, struct hd_struct *p = dev_to_part(dev); struct request_queue *q; struct block_device *bdev; + struct blk_trace *bt; ssize_t ret = -ENXIO; bdev = bdget(part_devt(p)); @@ -1770,21 +1817,23 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev, mutex_lock(&q->blk_trace_mutex); + bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (attr == &dev_attr_enable) { - ret = sprintf(buf, "%u\n", !!q->blk_trace); + ret = sprintf(buf, "%u\n", !!bt); goto out_unlock_bdev; } - if (q->blk_trace == NULL) + if (bt == NULL) ret = sprintf(buf, "disabled\n"); else if (attr == &dev_attr_act_mask) - ret = blk_trace_mask2str(buf, q->blk_trace->act_mask); + ret = blk_trace_mask2str(buf, bt->act_mask); else if (attr == &dev_attr_pid) - ret = sprintf(buf, "%u\n", q->blk_trace->pid); + ret = sprintf(buf, "%u\n", bt->pid); else if (attr == &dev_attr_start_lba) - ret = sprintf(buf, "%llu\n", q->blk_trace->start_lba); + ret = sprintf(buf, "%llu\n", bt->start_lba); else if (attr == &dev_attr_end_lba) - ret = sprintf(buf, "%llu\n", q->blk_trace->end_lba); + ret = sprintf(buf, "%llu\n", bt->end_lba); out_unlock_bdev: mutex_unlock(&q->blk_trace_mutex); @@ -1801,6 +1850,7 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, struct block_device *bdev; struct request_queue *q; struct hd_struct *p; + struct blk_trace *bt; u64 value; ssize_t ret = -EINVAL; @@ -1831,8 +1881,10 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, mutex_lock(&q->blk_trace_mutex); + bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); if (attr == &dev_attr_enable) { - if (!!value == !!q->blk_trace) { + if (!!value == !!bt) { ret = 0; goto out_unlock_bdev; } @@ -1844,18 +1896,18 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, } ret = 0; - if (q->blk_trace == NULL) + if (bt == NULL) ret = blk_trace_setup_queue(q, bdev); if (ret == 0) { if (attr == &dev_attr_act_mask) - q->blk_trace->act_mask = value; + bt->act_mask = value; else if (attr == &dev_attr_pid) - q->blk_trace->pid = value; + bt->pid = value; else if (attr == &dev_attr_start_lba) - q->blk_trace->start_lba = value; + bt->start_lba = value; else if (attr == &dev_attr_end_lba) - q->blk_trace->end_lba = value; + bt->end_lba = value; } out_unlock_bdev: From bdcd3eab2a9ae0ac93f27275b6895dd95e5bf360 Mon Sep 17 00:00:00 2001 From: Xiaoguang Wang Date: Tue, 25 Feb 2020 22:12:08 +0800 Subject: [PATCH 156/400] io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL After making ext4 support iopoll method: let ext4_file_operations's iopoll method be iomap_dio_iopoll(), we found fio can easily hang in fio_ioring_getevents() with below fio job: rm -f testfile; sync; sudo fio -name=fiotest -filename=testfile -iodepth=128 -thread -rw=write -ioengine=io_uring -hipri=1 -sqthread_poll=1 -direct=1 -bs=4k -size=10G -numjobs=8 -runtime=2000 -group_reporting with IORING_SETUP_SQPOLL and IORING_SETUP_IOPOLL enabled. There are two issues that results in this hang, one reason is that when IORING_SETUP_SQPOLL and IORING_SETUP_IOPOLL are enabled, fio does not use io_uring_enter to get completed events, it relies on kernel io_sq_thread to poll for completed events. Another reason is that there is a race: when io_submit_sqes() in io_sq_thread() submits a batch of sqes, variable 'inflight' will record the number of submitted reqs, then io_sq_thread will poll for reqs which have been added to poll_list. But note, if some previous reqs have been punted to io worker, these reqs will won't be in poll_list timely. io_sq_thread() will only poll for a part of previous submitted reqs, and then find poll_list is empty, reset variable 'inflight' to be zero. If app just waits these deferred reqs and does not wake up io_sq_thread again, then hang happens. For app that entirely relies on io_sq_thread to poll completed requests, let io_iopoll_req_issued() wake up io_sq_thread properly when adding new element to poll_list, and when io_sq_thread prepares to sleep, check whether poll_list is empty again, if not empty, continue to poll. Signed-off-by: Xiaoguang Wang Signed-off-by: Jens Axboe --- fs/io_uring.c | 59 +++++++++++++++++++++++---------------------------- 1 file changed, 27 insertions(+), 32 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index d961945cb332..ffd9bfa84d86 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1821,6 +1821,10 @@ static void io_iopoll_req_issued(struct io_kiocb *req) list_add(&req->list, &ctx->poll_list); else list_add_tail(&req->list, &ctx->poll_list); + + if ((ctx->flags & IORING_SETUP_SQPOLL) && + wq_has_sleeper(&ctx->sqo_wait)) + wake_up(&ctx->sqo_wait); } static void io_file_put(struct io_submit_state *state) @@ -5086,9 +5090,8 @@ static int io_sq_thread(void *data) const struct cred *old_cred; mm_segment_t old_fs; DEFINE_WAIT(wait); - unsigned inflight; unsigned long timeout; - int ret; + int ret = 0; complete(&ctx->completions[1]); @@ -5096,39 +5099,19 @@ static int io_sq_thread(void *data) set_fs(USER_DS); old_cred = override_creds(ctx->creds); - ret = timeout = inflight = 0; + timeout = jiffies + ctx->sq_thread_idle; while (!kthread_should_park()) { unsigned int to_submit; - if (inflight) { + if (!list_empty(&ctx->poll_list)) { unsigned nr_events = 0; - if (ctx->flags & IORING_SETUP_IOPOLL) { - /* - * inflight is the count of the maximum possible - * entries we submitted, but it can be smaller - * if we dropped some of them. If we don't have - * poll entries available, then we know that we - * have nothing left to poll for. Reset the - * inflight count to zero in that case. - */ - mutex_lock(&ctx->uring_lock); - if (!list_empty(&ctx->poll_list)) - io_iopoll_getevents(ctx, &nr_events, 0); - else - inflight = 0; - mutex_unlock(&ctx->uring_lock); - } else { - /* - * Normal IO, just pretend everything completed. - * We don't have to poll completions for that. - */ - nr_events = inflight; - } - - inflight -= nr_events; - if (!inflight) + mutex_lock(&ctx->uring_lock); + if (!list_empty(&ctx->poll_list)) + io_iopoll_getevents(ctx, &nr_events, 0); + else timeout = jiffies + ctx->sq_thread_idle; + mutex_unlock(&ctx->uring_lock); } to_submit = io_sqring_entries(ctx); @@ -5157,7 +5140,7 @@ static int io_sq_thread(void *data) * more IO, we should wait for the application to * reap events and wake us up. */ - if (inflight || + if (!list_empty(&ctx->poll_list) || (!time_after(jiffies, timeout) && ret != -EBUSY && !percpu_ref_is_dying(&ctx->refs))) { cond_resched(); @@ -5167,6 +5150,19 @@ static int io_sq_thread(void *data) prepare_to_wait(&ctx->sqo_wait, &wait, TASK_INTERRUPTIBLE); + /* + * While doing polled IO, before going to sleep, we need + * to check if there are new reqs added to poll_list, it + * is because reqs may have been punted to io worker and + * will be added to poll_list later, hence check the + * poll_list again. + */ + if ((ctx->flags & IORING_SETUP_IOPOLL) && + !list_empty_careful(&ctx->poll_list)) { + finish_wait(&ctx->sqo_wait, &wait); + continue; + } + /* Tell userspace we may need a wakeup call */ ctx->rings->sq_flags |= IORING_SQ_NEED_WAKEUP; /* make sure to read SQ tail after writing flags */ @@ -5194,8 +5190,7 @@ static int io_sq_thread(void *data) mutex_lock(&ctx->uring_lock); ret = io_submit_sqes(ctx, to_submit, NULL, -1, &cur_mm, true); mutex_unlock(&ctx->uring_lock); - if (ret > 0) - inflight += ret; + timeout = jiffies + ctx->sq_thread_idle; } set_fs(old_fs); From 3030fd4cb783377eca0e2a3eee63724a5c66ee15 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Tue, 25 Feb 2020 08:47:30 -0700 Subject: [PATCH 157/400] io-wq: remove spin-for-work optimization Andres reports that buffered IO seems to suck up more cycles than we would like, and he narrowed it down to the fact that the io-wq workers will briefly spin for more work on completion of a work item. This was a win on the networking side, but apparently some other cases take a hit because of it. Remove the optimization to avoid burning more CPU than we have to for disk IO. Reported-by: Andres Freund Signed-off-by: Jens Axboe --- fs/io-wq.c | 19 ------------------- 1 file changed, 19 deletions(-) diff --git a/fs/io-wq.c b/fs/io-wq.c index 0a5ab1a8f69a..bf8ed1b0b90a 100644 --- a/fs/io-wq.c +++ b/fs/io-wq.c @@ -535,42 +535,23 @@ next: } while (1); } -static inline void io_worker_spin_for_work(struct io_wqe *wqe) -{ - int i = 0; - - while (++i < 1000) { - if (io_wqe_run_queue(wqe)) - break; - if (need_resched()) - break; - cpu_relax(); - } -} - static int io_wqe_worker(void *data) { struct io_worker *worker = data; struct io_wqe *wqe = worker->wqe; struct io_wq *wq = wqe->wq; - bool did_work; io_worker_start(wqe, worker); - did_work = false; while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) { set_current_state(TASK_INTERRUPTIBLE); loop: - if (did_work) - io_worker_spin_for_work(wqe); spin_lock_irq(&wqe->lock); if (io_wqe_run_queue(wqe)) { __set_current_state(TASK_RUNNING); io_worker_handle_work(worker); - did_work = true; goto loop; } - did_work = false; /* drops the lock on success, retry */ if (__io_worker_idle(wqe, worker)) { __release(&wqe->lock); From 4829f89855f1d3a3d8014e74cceab51b421503db Mon Sep 17 00:00:00 2001 From: Monk Liu Date: Sat, 8 Feb 2020 19:01:21 +0800 Subject: [PATCH 158/400] drm/amdgpu: fix memory leak during TDR test(v2) fix system memory leak v2: fix coding style Signed-off-by: Monk Liu Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c index b06c057a9002..c9e5ce135fd4 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c @@ -978,8 +978,12 @@ int smu_v11_0_init_max_sustainable_clocks(struct smu_context *smu) struct smu_11_0_max_sustainable_clocks *max_sustainable_clocks; int ret = 0; - max_sustainable_clocks = kzalloc(sizeof(struct smu_11_0_max_sustainable_clocks), + if (!smu->smu_table.max_sustainable_clocks) + max_sustainable_clocks = kzalloc(sizeof(struct smu_11_0_max_sustainable_clocks), GFP_KERNEL); + else + max_sustainable_clocks = smu->smu_table.max_sustainable_clocks; + smu->smu_table.max_sustainable_clocks = (void *)max_sustainable_clocks; max_sustainable_clocks->uclock = smu->smu_table.boot_values.uclk / 100; From a3ed353cf8015ba84a0407a5dc3ffee038166ab0 Mon Sep 17 00:00:00 2001 From: Shirish S Date: Mon, 27 Jan 2020 16:35:24 +0530 Subject: [PATCH 159/400] amdgpu/gmc_v9: save/restore sdpif regs during S3 fixes S3 issue with IOMMU + S/G enabled @ 64M VRAM. Suggested-by: Alex Deucher Signed-off-by: Shirish S Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 1 + drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 37 ++++++++++++++++++- .../include/asic_reg/dce/dce_12_0_offset.h | 2 + 3 files changed, 39 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h index d3c27a3c43f6..7546da0cc70c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h @@ -195,6 +195,7 @@ struct amdgpu_gmc { uint32_t srbm_soft_reset; bool prt_warning; uint64_t stolen_size; + uint32_t sdpif_register; /* apertures */ u64 shared_aperture_start; u64 shared_aperture_end; diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 90216abf14a4..cc0c273a86f9 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1271,6 +1271,19 @@ static void gmc_v9_0_init_golden_registers(struct amdgpu_device *adev) } } +/** + * gmc_v9_0_restore_registers - restores regs + * + * @adev: amdgpu_device pointer + * + * This restores register values, saved at suspend. + */ +static void gmc_v9_0_restore_registers(struct amdgpu_device *adev) +{ + if (adev->asic_type == CHIP_RAVEN) + WREG32(mmDCHUBBUB_SDPIF_MMIO_CNTRL_0, adev->gmc.sdpif_register); +} + /** * gmc_v9_0_gart_enable - gart enable * @@ -1376,6 +1389,20 @@ static int gmc_v9_0_hw_init(void *handle) return r; } +/** + * gmc_v9_0_save_registers - saves regs + * + * @adev: amdgpu_device pointer + * + * This saves potential register values that should be + * restored upon resume + */ +static void gmc_v9_0_save_registers(struct amdgpu_device *adev) +{ + if (adev->asic_type == CHIP_RAVEN) + adev->gmc.sdpif_register = RREG32(mmDCHUBBUB_SDPIF_MMIO_CNTRL_0); +} + /** * gmc_v9_0_gart_disable - gart disable * @@ -1412,9 +1439,16 @@ static int gmc_v9_0_hw_fini(void *handle) static int gmc_v9_0_suspend(void *handle) { + int r; struct amdgpu_device *adev = (struct amdgpu_device *)handle; - return gmc_v9_0_hw_fini(adev); + r = gmc_v9_0_hw_fini(adev); + if (r) + return r; + + gmc_v9_0_save_registers(adev); + + return 0; } static int gmc_v9_0_resume(void *handle) @@ -1422,6 +1456,7 @@ static int gmc_v9_0_resume(void *handle) int r; struct amdgpu_device *adev = (struct amdgpu_device *)handle; + gmc_v9_0_restore_registers(adev); r = gmc_v9_0_hw_init(adev); if (r) return r; diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h index b6f74bf4af02..27bb8c1ab858 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h @@ -7376,6 +7376,8 @@ #define mmCRTC4_CRTC_DRR_CONTROL 0x0f3e #define mmCRTC4_CRTC_DRR_CONTROL_BASE_IDX 2 +#define mmDCHUBBUB_SDPIF_MMIO_CNTRL_0 0x395d +#define mmDCHUBBUB_SDPIF_MMIO_CNTRL_0_BASE_IDX 2 // addressBlock: dce_dc_fmt4_dispdec // base address: 0x2000 From 93d7c3185893b185e7f4347f8986b9b521254a6e Mon Sep 17 00:00:00 2001 From: Dongli Zhang Date: Mon, 24 Feb 2020 10:39:11 -0800 Subject: [PATCH 160/400] null_blk: remove unused fields in 'nullb_cmd' 'list', 'll_list' and 'csd' are no longer used. The 'list' is not used since it was introduced by commit f2298c0403b0 ("null_blk: multi queue aware block test driver"). The 'll_list' is no longer used since commit 3c395a969acc ("null_blk: set a separate timer for each command"). The 'csd' is no longer used since commit ce2c350b2cfe ("null_blk: use blk_complete_request and blk_mq_complete_request"). Reviewed-by: Bart Van Assche Signed-off-by: Dongli Zhang Signed-off-by: Jens Axboe --- drivers/block/null_blk.h | 3 --- drivers/block/null_blk_main.c | 2 -- 2 files changed, 5 deletions(-) diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk.h index bc837862b767..62b660821dbc 100644 --- a/drivers/block/null_blk.h +++ b/drivers/block/null_blk.h @@ -14,9 +14,6 @@ #include struct nullb_cmd { - struct list_head list; - struct llist_node ll_list; - struct __call_single_data csd; struct request *rq; struct bio *bio; unsigned int tag; diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c index 16510795e377..133060431dbd 100644 --- a/drivers/block/null_blk_main.c +++ b/drivers/block/null_blk_main.c @@ -1518,8 +1518,6 @@ static int setup_commands(struct nullb_queue *nq) for (i = 0; i < nq->queue_depth; i++) { cmd = &nq->cmds[i]; - INIT_LIST_HEAD(&cmd->list); - cmd->ll_list.next = NULL; cmd->tag = -1U; } From d5bdf66108419cdb39da361b58ded661c29ff66e Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Fri, 7 Feb 2020 11:42:30 -0500 Subject: [PATCH 161/400] dm integrity: fix recalculation when moving from journal mode to bitmap mode If we resume a device in bitmap mode and the on-disk format is in journal mode, we must recalculate anything above ic->sb->recalc_sector. Otherwise, there would be non-recalculated blocks which would cause I/O errors. Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-integrity.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c index b225b3e445fa..166727a47cef 100644 --- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -2888,17 +2888,24 @@ static void dm_integrity_resume(struct dm_target *ti) } else { replay_journal(ic); if (ic->mode == 'B') { - int mode; ic->sb->flags |= cpu_to_le32(SB_FLAG_DIRTY_BITMAP); ic->sb->log2_blocks_per_bitmap_bit = ic->log2_blocks_per_bitmap_bit; r = sync_rw_sb(ic, REQ_OP_WRITE, REQ_FUA); if (unlikely(r)) dm_integrity_io_error(ic, "writing superblock", r); - mode = ic->recalculate_flag ? BITMAP_OP_SET : BITMAP_OP_CLEAR; - block_bitmap_op(ic, ic->journal, 0, ic->provided_data_sectors, mode); - block_bitmap_op(ic, ic->recalc_bitmap, 0, ic->provided_data_sectors, mode); - block_bitmap_op(ic, ic->may_write_bitmap, 0, ic->provided_data_sectors, mode); + block_bitmap_op(ic, ic->journal, 0, ic->provided_data_sectors, BITMAP_OP_CLEAR); + block_bitmap_op(ic, ic->recalc_bitmap, 0, ic->provided_data_sectors, BITMAP_OP_CLEAR); + block_bitmap_op(ic, ic->may_write_bitmap, 0, ic->provided_data_sectors, BITMAP_OP_CLEAR); + if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING) && + le64_to_cpu(ic->sb->recalc_sector) < ic->provided_data_sectors) { + block_bitmap_op(ic, ic->journal, le64_to_cpu(ic->sb->recalc_sector), + ic->provided_data_sectors - le64_to_cpu(ic->sb->recalc_sector), BITMAP_OP_SET); + block_bitmap_op(ic, ic->recalc_bitmap, le64_to_cpu(ic->sb->recalc_sector), + ic->provided_data_sectors - le64_to_cpu(ic->sb->recalc_sector), BITMAP_OP_SET); + block_bitmap_op(ic, ic->may_write_bitmap, le64_to_cpu(ic->sb->recalc_sector), + ic->provided_data_sectors - le64_to_cpu(ic->sb->recalc_sector), BITMAP_OP_SET); + } rw_journal_sectors(ic, REQ_OP_WRITE, REQ_FUA | REQ_SYNC, 0, ic->n_bitmap_blocks * (BITMAP_BLOCK_SIZE >> SECTOR_SHIFT), NULL); } From 53770f0ec5fd417429775ba006bc4abe14002335 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Mon, 17 Feb 2020 07:43:03 -0500 Subject: [PATCH 162/400] dm integrity: fix a deadlock due to offloading to an incorrect workqueue If we need to perform synchronous I/O in dm_integrity_map_continue(), we must make sure that we are not in the map function - in order to avoid the deadlock due to bio queuing in generic_make_request. To avoid the deadlock, we offload the request to metadata_wq. However, metadata_wq also processes metadata updates for write requests. If there are too many requests that get offloaded to metadata_wq at the beginning of dm_integrity_map_continue, the workqueue metadata_wq becomes clogged and the system is incapable of processing any metadata updates. This causes a deadlock because all the requests that need to do metadata updates wait for metadata_wq to proceed and metadata_wq waits inside wait_and_add_new_range until some existing request releases its range lock (which doesn't happen because the range lock is released after metadata update). In order to fix the deadlock, we create a new workqueue offload_wq and offload requests to it - so that processing of offload_wq is independent from processing of metadata_wq. Fixes: 7eada909bfd7 ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Reported-by: Heinz Mauelshagen Tested-by: Heinz Mauelshagen Signed-off-by: Heinz Mauelshagen Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-integrity.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c index 166727a47cef..6b6c3e1deaa8 100644 --- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -212,6 +212,7 @@ struct dm_integrity_c { struct list_head wait_list; wait_queue_head_t endio_wait; struct workqueue_struct *wait_wq; + struct workqueue_struct *offload_wq; unsigned char commit_seq; commit_id_t commit_ids[N_COMMIT_IDS]; @@ -1439,7 +1440,7 @@ static void dec_in_flight(struct dm_integrity_io *dio) dio->range.logical_sector += dio->range.n_sectors; bio_advance(bio, dio->range.n_sectors << SECTOR_SHIFT); INIT_WORK(&dio->work, integrity_bio_wait); - queue_work(ic->wait_wq, &dio->work); + queue_work(ic->offload_wq, &dio->work); return; } do_endio_flush(ic, dio); @@ -1865,7 +1866,7 @@ static void dm_integrity_map_continue(struct dm_integrity_io *dio, bool from_map if (need_sync_io && from_map) { INIT_WORK(&dio->work, integrity_bio_wait); - queue_work(ic->metadata_wq, &dio->work); + queue_work(ic->offload_wq, &dio->work); return; } @@ -2501,7 +2502,7 @@ static void bitmap_block_work(struct work_struct *w) dio->range.n_sectors, BITMAP_OP_TEST_ALL_SET)) { remove_range(ic, &dio->range); INIT_WORK(&dio->work, integrity_bio_wait); - queue_work(ic->wait_wq, &dio->work); + queue_work(ic->offload_wq, &dio->work); } else { block_bitmap_op(ic, ic->journal, dio->range.logical_sector, dio->range.n_sectors, BITMAP_OP_SET); @@ -2524,7 +2525,7 @@ static void bitmap_block_work(struct work_struct *w) remove_range(ic, &dio->range); INIT_WORK(&dio->work, integrity_bio_wait); - queue_work(ic->wait_wq, &dio->work); + queue_work(ic->offload_wq, &dio->work); } queue_delayed_work(ic->commit_wq, &ic->bitmap_flush_work, ic->bitmap_flush_interval); @@ -3843,6 +3844,14 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv) goto bad; } + ic->offload_wq = alloc_workqueue("dm-integrity-offload", WQ_MEM_RECLAIM, + METADATA_WORKQUEUE_MAX_ACTIVE); + if (!ic->offload_wq) { + ti->error = "Cannot allocate workqueue"; + r = -ENOMEM; + goto bad; + } + ic->commit_wq = alloc_workqueue("dm-integrity-commit", WQ_MEM_RECLAIM, 1); if (!ic->commit_wq) { ti->error = "Cannot allocate workqueue"; @@ -4147,6 +4156,8 @@ static void dm_integrity_dtr(struct dm_target *ti) destroy_workqueue(ic->metadata_wq); if (ic->wait_wq) destroy_workqueue(ic->wait_wq); + if (ic->offload_wq) + destroy_workqueue(ic->offload_wq); if (ic->commit_wq) destroy_workqueue(ic->commit_wq); if (ic->writer_wq) From 7fc2e47f40dd77ab1fcbda6db89614a0173d89c7 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Mon, 17 Feb 2020 08:11:35 -0500 Subject: [PATCH 163/400] dm integrity: fix invalid table returned due to argument count mismatch If the flag SB_FLAG_RECALCULATE is present in the superblock, but it was not specified on the command line (i.e. ic->recalculate_flag is false), dm-integrity would return invalid table line - the reported number of arguments would not match the real number. Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Ondrej Kozina Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-integrity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c index 6b6c3e1deaa8..2f98e88399ec 100644 --- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -2975,7 +2975,7 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type, DMEMIT(" meta_device:%s", ic->meta_dev->name); if (ic->sectors_per_block != 1) DMEMIT(" block_size:%u", ic->sectors_per_block << SECTOR_SHIFT); - if (ic->recalculate_flag) + if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING)) DMEMIT(" recalculate"); DMEMIT(" journal_sectors:%u", ic->initial_sectors - SB_SECTORS); DMEMIT(" interleave_sectors:%u", 1U << ic->sb->log2_interleave_sectors); From a8e41f6033a0c5633d55d6e35993c9e2005d872f Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Tue, 25 Feb 2020 18:05:35 +0800 Subject: [PATCH 164/400] icmp: allow icmpv6_ndo_send to work with CONFIG_IPV6=n The icmpv6_send function has long had a static inline implementation with an empty body for CONFIG_IPV6=n, so that code calling it doesn't need to be ifdef'd. The new icmpv6_ndo_send function, which is intended for drivers as a drop-in replacement with an identical function signature, should follow the same pattern. Without this patch, drivers that used to work with CONFIG_IPV6=n now result in a linker error. Cc: Chen Zhou Reported-by: Hulk Robot Fixes: 0b41713b6066 ("icmp: introduce helper for nat'd source address in network device context") Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller --- include/linux/icmpv6.h | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/include/linux/icmpv6.h b/include/linux/icmpv6.h index 93338fd54af8..33d379602314 100644 --- a/include/linux/icmpv6.h +++ b/include/linux/icmpv6.h @@ -22,19 +22,23 @@ extern int inet6_unregister_icmp_sender(ip6_icmp_send_t *fn); int ip6_err_gen_icmpv6_unreach(struct sk_buff *skb, int nhs, int type, unsigned int data_len); +#if IS_ENABLED(CONFIG_NF_NAT) +void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info); +#else +#define icmpv6_ndo_send icmpv6_send +#endif + #else static inline void icmpv6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info) { - } -#endif -#if IS_ENABLED(CONFIG_NF_NAT) -void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info); -#else -#define icmpv6_ndo_send icmpv6_send +static inline void icmpv6_ndo_send(struct sk_buff *skb, + u8 type, u8 code, __u32 info) +{ +} #endif extern int icmpv6_init(void); From eb9d8ddbc107d02e489681f9dcbf93949e1a99a4 Mon Sep 17 00:00:00 2001 From: Tomeu Vizoso Date: Wed, 12 Feb 2020 14:22:36 -0600 Subject: [PATCH 165/400] drm/panfrost: Don't try to map on error faults If the exception type isn't a translation fault, don't try to map and instead go straight to a terminal fault. Otherwise, we can get flooded by kernel warnings and further faults. Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations") Signed-off-by: Rob Herring Signed-off-by: Tomeu Vizoso Reviewed-by: Steven Price Reviewed-by: Tomeu Vizoso Acked-by: Alyssa Rosenzweig Link: https://patchwork.freedesktop.org/patch/msgid/20200212202236.13095-1-robh@kernel.org --- drivers/gpu/drm/panfrost/panfrost_mmu.c | 44 +++++++++++-------------- 1 file changed, 19 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index 3107b0738e40..5d75f8cf6477 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -601,33 +601,27 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) source_id = (fault_status >> 16); /* Page fault only */ - if ((status & mask) == BIT(i)) { - WARN_ON(exception_type < 0xC1 || exception_type > 0xC4); - + ret = -1; + if ((status & mask) == BIT(i) && (exception_type & 0xF8) == 0xC0) ret = panfrost_mmu_map_fault_addr(pfdev, i, addr); - if (!ret) { - mmu_write(pfdev, MMU_INT_CLEAR, BIT(i)); - status &= ~mask; - continue; - } - } - /* terminal fault, print info about the fault */ - dev_err(pfdev->dev, - "Unhandled Page fault in AS%d at VA 0x%016llX\n" - "Reason: %s\n" - "raw fault status: 0x%X\n" - "decoded fault status: %s\n" - "exception type 0x%X: %s\n" - "access type 0x%X: %s\n" - "source id 0x%X\n", - i, addr, - "TODO", - fault_status, - (fault_status & (1 << 10) ? "DECODER FAULT" : "SLAVE FAULT"), - exception_type, panfrost_exception_name(pfdev, exception_type), - access_type, access_type_name(pfdev, fault_status), - source_id); + if (ret) + /* terminal fault, print info about the fault */ + dev_err(pfdev->dev, + "Unhandled Page fault in AS%d at VA 0x%016llX\n" + "Reason: %s\n" + "raw fault status: 0x%X\n" + "decoded fault status: %s\n" + "exception type 0x%X: %s\n" + "access type 0x%X: %s\n" + "source id 0x%X\n", + i, addr, + "TODO", + fault_status, + (fault_status & (1 << 10) ? "DECODER FAULT" : "SLAVE FAULT"), + exception_type, panfrost_exception_name(pfdev, exception_type), + access_type, access_type_name(pfdev, fault_status), + source_id); mmu_write(pfdev, MMU_INT_CLEAR, mask); From 2d141dd2caa78fbaf87b57c27769bdc14975ab3d Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Tue, 25 Feb 2020 11:52:56 -0700 Subject: [PATCH 166/400] io-wq: ensure work->task_pid is cleared on init We use ->task_pid for exit cancellation, but we need to ensure it's cleared to zero for io_req_work_grab_env() to do the right thing. Take a suggestion from Bart and clear the whole thing, just setting the function passed in. This makes it more future proof as well. Fixes: 36282881a795 ("io-wq: add io_wq_cancel_pid() to cancel based on a specific pid") Signed-off-by: Jens Axboe --- fs/io-wq.h | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/fs/io-wq.h b/fs/io-wq.h index ccc7d84af57d..33baba4370c5 100644 --- a/fs/io-wq.h +++ b/fs/io-wq.h @@ -79,16 +79,10 @@ struct io_wq_work { pid_t task_pid; }; -#define INIT_IO_WORK(work, _func) \ - do { \ - (work)->list.next = NULL; \ - (work)->func = _func; \ - (work)->files = NULL; \ - (work)->mm = NULL; \ - (work)->creds = NULL; \ - (work)->fs = NULL; \ - (work)->flags = 0; \ - } while (0) \ +#define INIT_IO_WORK(work, _func) \ + do { \ + *(work) = (struct io_wq_work){ .func = _func }; \ + } while (0) \ typedef void (get_work_fn)(struct io_wq_work *); typedef void (put_work_fn)(struct io_wq_work *); From deddc9e8c0e05c2a770d3c9e6683c7ae216741ca Mon Sep 17 00:00:00 2001 From: Vadim Pasternak Date: Tue, 25 Feb 2020 00:52:02 +0200 Subject: [PATCH 167/400] hwmon: (pmbus/xdpe12284) Add callback for vout limits conversion Provide read_word_data() callback for overvoltage and undervoltage output readouts conversion. These registers are presented in 'slinear11' format, while default conversion for 'vout' class for the devices is 'vid'. It is resulted in wrong conversion in pmbus_reg2data() for in{3-4}_lcrit and in{3-4}_crit attributes. ) Fixes: aaafb7c8eb1c ("hwmon: (pmbus) Add support for Infineon Multi-phase xdpe122 family controllers") Signed-off-by: Vadim Pasternak Link: https://lore.kernel.org/r/20200224225202.19576-1-vadimp@mellanox.com [gropeck: Adjusted to mainline PMBus API] Signed-off-by: Guenter Roeck --- drivers/hwmon/pmbus/xdpe12284.c | 54 +++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) diff --git a/drivers/hwmon/pmbus/xdpe12284.c b/drivers/hwmon/pmbus/xdpe12284.c index ecd9b65627ec..660556b89e9f 100644 --- a/drivers/hwmon/pmbus/xdpe12284.c +++ b/drivers/hwmon/pmbus/xdpe12284.c @@ -18,6 +18,59 @@ #define XDPE122_AMD_625MV 0x10 /* AMD mode 6.25mV */ #define XDPE122_PAGE_NUM 2 +static int xdpe122_read_word_data(struct i2c_client *client, int page, int reg) +{ + const struct pmbus_driver_info *info = pmbus_get_driver_info(client); + long val; + s16 exponent; + s32 mantissa; + int ret; + + switch (reg) { + case PMBUS_VOUT_OV_FAULT_LIMIT: + case PMBUS_VOUT_UV_FAULT_LIMIT: + ret = pmbus_read_word_data(client, page, reg); + if (ret < 0) + return ret; + + /* Convert register value to LINEAR11 data. */ + exponent = ((s16)ret) >> 11; + mantissa = ((s16)((ret & GENMASK(10, 0)) << 5)) >> 5; + val = mantissa * 1000L; + if (exponent >= 0) + val <<= exponent; + else + val >>= -exponent; + + /* Convert data to VID register. */ + switch (info->vrm_version[page]) { + case vr13: + if (val >= 500) + return 1 + DIV_ROUND_CLOSEST(val - 500, 10); + return 0; + case vr12: + if (val >= 250) + return 1 + DIV_ROUND_CLOSEST(val - 250, 5); + return 0; + case imvp9: + if (val >= 200) + return 1 + DIV_ROUND_CLOSEST(val - 200, 10); + return 0; + case amd625mv: + if (val >= 200 && val <= 1550) + return DIV_ROUND_CLOSEST((1550 - val) * 100, + 625); + return 0; + default: + return -EINVAL; + } + default: + return -ENODATA; + } + + return 0; +} + static int xdpe122_identify(struct i2c_client *client, struct pmbus_driver_info *info) { @@ -70,6 +123,7 @@ static struct pmbus_driver_info xdpe122_info = { PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP | PMBUS_HAVE_POUT | PMBUS_HAVE_PIN | PMBUS_HAVE_STATUS_INPUT, .identify = xdpe122_identify, + .read_word_data = xdpe122_read_word_data, }; static int xdpe122_probe(struct i2c_client *client, From 2910b5aa6f545c044173a5cab3dbb7f43e23916d Mon Sep 17 00:00:00 2001 From: Masami Hiramatsu Date: Tue, 25 Feb 2020 23:36:41 +0900 Subject: [PATCH 168/400] bootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue Since commit d8a953ddde5e ("bootconfig: Set CONFIG_BOOT_CONFIG=n by default") also changed the CONFIG_BOOTTIME_TRACING to select CONFIG_BOOT_CONFIG to show the boot-time tracing on the menu, it introduced wrong dependencies with BLK_DEV_INITRD as below. WARNING: unmet direct dependencies detected for BOOT_CONFIG Depends on [n]: BLK_DEV_INITRD [=n] Selected by [y]: - BOOTTIME_TRACING [=y] && TRACING_SUPPORT [=y] && FTRACE [=y] && TRACING [=y] This makes the CONFIG_BOOT_CONFIG selects CONFIG_BLK_DEV_INITRD to fix this error and make CONFIG_BOOTTIME_TRACING=n by default, so that both boot-time tracing and boot configuration off but those appear on the menu list. Link: http://lkml.kernel.org/r/158264140162.23842.11237423518607465535.stgit@devnote2 Fixes: d8a953ddde5e ("bootconfig: Set CONFIG_BOOT_CONFIG=n by default") Reported-by: Randy Dunlap Compiled-tested-by: Randy Dunlap Signed-off-by: Masami Hiramatsu Signed-off-by: Steven Rostedt (VMware) --- init/Kconfig | 2 +- kernel/trace/Kconfig | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/init/Kconfig b/init/Kconfig index a84e7aa89a29..8b4c3e8c05ea 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1217,7 +1217,7 @@ endif config BOOT_CONFIG bool "Boot config support" - depends on BLK_DEV_INITRD + select BLK_DEV_INITRD help Extra boot config allows system admin to pass a config file as complemental extension of kernel cmdline when booting. diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 795c3e02d3f1..402eef84c859 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -145,7 +145,6 @@ config BOOTTIME_TRACING bool "Boot-time Tracing support" depends on TRACING select BOOT_CONFIG - default y help Enable developer to setup ftrace subsystem via supplemental kernel cmdline at boot time for debugging (tracing) driver From 7c69eb84d98a28c428f902318c20c53cf29c9084 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Fri, 21 Feb 2020 06:37:23 -0800 Subject: [PATCH 169/400] zonefs: fix IOCB_NOWAIT handling IOCB_NOWAIT can't just be ignored as it breaks applications expecting it not to block. Just refuse the operation as applications must handle that (e.g. by falling back to a thread pool). Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system") Signed-off-by: Christoph Hellwig Signed-off-by: Damien Le Moal --- fs/zonefs/super.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index 8bc6ef82d693..69aee3dfb660 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -601,13 +601,13 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) ssize_t ret; /* - * For async direct IOs to sequential zone files, ignore IOCB_NOWAIT + * For async direct IOs to sequential zone files, refuse IOCB_NOWAIT * as this can cause write reordering (e.g. the first aio gets EAGAIN * on the inode lock but the second goes through but is now unaligned). */ - if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb) - && (iocb->ki_flags & IOCB_NOWAIT)) - iocb->ki_flags &= ~IOCB_NOWAIT; + if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb) && + (iocb->ki_flags & IOCB_NOWAIT)) + return -EOPNOTSUPP; if (iocb->ki_flags & IOCB_NOWAIT) { if (!inode_trylock(inode)) From 0dda2ddb7ded1395189e95d43106469687c07795 Mon Sep 17 00:00:00 2001 From: Johannes Thumshirn Date: Tue, 25 Feb 2020 22:03:33 +0100 Subject: [PATCH 170/400] zonefs: select FS_IOMAP Zonefs makes use of iomap internally, so it should also select iomap in Kconfig. Signed-off-by: Johannes Thumshirn Signed-off-by: Damien Le Moal --- fs/zonefs/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/zonefs/Kconfig b/fs/zonefs/Kconfig index fb87ad372e29..ef2697b78820 100644 --- a/fs/zonefs/Kconfig +++ b/fs/zonefs/Kconfig @@ -2,6 +2,7 @@ config ZONEFS_FS tristate "zonefs filesystem support" depends on BLOCK depends on BLK_DEV_ZONED + select FS_IOMAP help zonefs is a simple file system which exposes zones of a zoned block device (e.g. host-managed or host-aware SMR disk drives) as files. From b5dacc8fb52c690e2cdf7df3ae36bd1cf20e63dd Mon Sep 17 00:00:00 2001 From: Jani Nikula Date: Fri, 21 Feb 2020 12:54:14 +0200 Subject: [PATCH 171/400] drm/i915: fix header test with GCOV $(CC) with $(CFLAGS_GCOV) assumes the output filename with .gcno suffix appended is writable. This is not the case when the output filename is /dev/null: HDRTEST drivers/gpu/drm/i915/display/intel_frontbuffer.h /dev/null:1:0: error: cannot open /dev/null.gcno HDRTEST drivers/gpu/drm/i915/display/intel_ddi.h /dev/null:1:0: error: cannot open /dev/null.gcno make[5]: *** [../drivers/gpu/drm/i915/Makefile:307: drivers/gpu/drm/i915/display/intel_ddi.hdrtest] Error 1 make[5]: *** Waiting for unfinished jobs.... make[5]: *** [../drivers/gpu/drm/i915/Makefile:307: drivers/gpu/drm/i915/display/intel_frontbuffer.hdrtest] Error 1 Filter out $(CFLAGS_GVOC) from the header test $(c_flags) as they don't make sense here anyway. References: http://lore.kernel.org/r/d8112767-4089-4c58-d7d3-2ce03139858a@infradead.org Reported-by: Randy Dunlap Fixes: c6d4a099a240 ("drm/i915: reimplement header test feature") Cc: Masahiro Yamada Acked-by: Randy Dunlap Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/20200221105414.14358-1-jani.nikula@intel.com (cherry picked from commit 408c1b3253dab93da175690dc0e21dd8bccf3371) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index b8c5f8934dbd..a1f2411aa21b 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -294,7 +294,7 @@ extra-$(CONFIG_DRM_I915_WERROR) += \ $(shell cd $(srctree)/$(src) && find * -name '*.h'))) quiet_cmd_hdrtest = HDRTEST $(patsubst %.hdrtest,%.h,$@) - cmd_hdrtest = $(CC) $(c_flags) -S -o /dev/null -x c /dev/null -include $<; touch $@ + cmd_hdrtest = $(CC) $(filter-out $(CFLAGS_GCOV), $(c_flags)) -S -o /dev/null -x c /dev/null -include $<; touch $@ $(obj)/%.hdrtest: $(src)/%.h FORCE $(call if_changed_dep,hdrtest) From eee18939e5767dbe3a98b3ea172e7fd7ba7d403c Mon Sep 17 00:00:00 2001 From: Chris Wilson Date: Mon, 24 Feb 2020 10:11:20 +0000 Subject: [PATCH 172/400] drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt Full-ppgtt on gen7 is proving to be highly unstable and not robust. Closes: https://gitlab.freedesktop.org/drm/intel/issues/694 Fixes: 3cd6e8860ecd ("drm/i915/gen7: Re-enable full-ppgtt for ivb & hsw") Signed-off-by: Chris Wilson Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: Jani Nikula Cc: Dave Airlie Acked-by: Rodrigo Vivi Link: https://patchwork.freedesktop.org/patch/msgid/20200224101120.4024481-1-chris@chris-wilson.co.uk (cherry picked from commit 4fbe112a569526e46fa2accb5763c069f78cb431) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/i915_pci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 83f01401b8b5..f631f6d21127 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -437,7 +437,7 @@ static const struct intel_device_info snb_m_gt2_info = { .has_rc6 = 1, \ .has_rc6p = 1, \ .has_rps = true, \ - .ppgtt_type = INTEL_PPGTT_FULL, \ + .ppgtt_type = INTEL_PPGTT_ALIASING, \ .ppgtt_size = 31, \ IVB_PIPE_OFFSETS, \ IVB_CURSOR_OFFSETS, \ @@ -494,7 +494,7 @@ static const struct intel_device_info vlv_info = { .has_rps = true, .display.has_gmch = 1, .display.has_hotplug = 1, - .ppgtt_type = INTEL_PPGTT_FULL, + .ppgtt_type = INTEL_PPGTT_ALIASING, .ppgtt_size = 31, .has_snoop = true, .has_coherent_ggtt = false, From 19ee5e8da6129d8d828201a12264ab3d09153ec4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Winiarski?= Date: Wed, 19 Feb 2020 17:18:21 +0100 Subject: [PATCH 173/400] drm/i915/pmu: Avoid using globals for CPU hotplug state MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Attempting to bind / unbind module from devices where we have both integrated and discreete GPU handled by i915 can lead to leaks and warnings from cpuhp: Error: Removing state XXX which has instances left. Let's move the state to i915_pmu. Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs") Signed-off-by: Michał Winiarski Cc: Chris Wilson Cc: Michal Wajdeczko Cc: Tvrtko Ursulin Reviewed-by: Chris Wilson Signed-off-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20200219161822.24592-1-michal.winiarski@intel.com (cherry picked from commit f5a179d4687d4e7bfadd7cbda7ee5d0bad76761f) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/i915_pmu.c | 18 +++++++++--------- drivers/gpu/drm/i915/i915_pmu.h | 7 +++++-- 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index ec0299490dd4..84301004d5c0 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -1042,7 +1042,7 @@ static void free_event_attributes(struct i915_pmu *pmu) static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node) { - struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node); + struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node); GEM_BUG_ON(!pmu->base.event_init); @@ -1055,7 +1055,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node) static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node) { - struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node); + struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node); unsigned int target; GEM_BUG_ON(!pmu->base.event_init); @@ -1072,8 +1072,6 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node) return 0; } -static enum cpuhp_state cpuhp_slot = CPUHP_INVALID; - static int i915_pmu_register_cpuhp_state(struct i915_pmu *pmu) { enum cpuhp_state slot; @@ -1087,21 +1085,22 @@ static int i915_pmu_register_cpuhp_state(struct i915_pmu *pmu) return ret; slot = ret; - ret = cpuhp_state_add_instance(slot, &pmu->node); + ret = cpuhp_state_add_instance(slot, &pmu->cpuhp.node); if (ret) { cpuhp_remove_multi_state(slot); return ret; } - cpuhp_slot = slot; + pmu->cpuhp.slot = slot; return 0; } static void i915_pmu_unregister_cpuhp_state(struct i915_pmu *pmu) { - WARN_ON(cpuhp_slot == CPUHP_INVALID); - WARN_ON(cpuhp_state_remove_instance(cpuhp_slot, &pmu->node)); - cpuhp_remove_multi_state(cpuhp_slot); + WARN_ON(pmu->cpuhp.slot == CPUHP_INVALID); + WARN_ON(cpuhp_state_remove_instance(pmu->cpuhp.slot, &pmu->cpuhp.node)); + cpuhp_remove_multi_state(pmu->cpuhp.slot); + pmu->cpuhp.slot = CPUHP_INVALID; } static bool is_igp(struct drm_i915_private *i915) @@ -1128,6 +1127,7 @@ void i915_pmu_register(struct drm_i915_private *i915) spin_lock_init(&pmu->lock); hrtimer_init(&pmu->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); pmu->timer.function = i915_sample; + pmu->cpuhp.slot = CPUHP_INVALID; if (!is_igp(i915)) { pmu->name = kasprintf(GFP_KERNEL, diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h index 6c1647c5daf2..207058391cec 100644 --- a/drivers/gpu/drm/i915/i915_pmu.h +++ b/drivers/gpu/drm/i915/i915_pmu.h @@ -39,9 +39,12 @@ struct i915_pmu_sample { struct i915_pmu { /** - * @node: List node for CPU hotplug handling. + * @cpuhp: Struct used for CPU hotplug handling. */ - struct hlist_node node; + struct { + struct hlist_node node; + enum cpuhp_state slot; + } cpuhp; /** * @base: PMU base. */ From 2de0147d77168d6a227c00eb9c5a49374e1582a3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20Winiarski?= Date: Wed, 19 Feb 2020 17:18:22 +0100 Subject: [PATCH 174/400] drm/i915/pmu: Avoid using globals for PMU events MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Attempting to bind / unbind module from devices where we have both integrated and discreete GPU handled by i915, will cause us to try and double free the global state, hitting null ptr deref in free_event_attributes. Let's move it to i915_pmu. Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs") Signed-off-by: Michał Winiarski Cc: Chris Wilson Cc: Michal Wajdeczko Cc: Tvrtko Ursulin Reviewed-by: Chris Wilson Signed-off-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20200219161822.24592-2-michal.winiarski@intel.com (cherry picked from commit 46129dc10f47c5c2b51c93a82b7b2aca46574ae0) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/i915_pmu.c | 41 ++++++++++++++++++--------------- drivers/gpu/drm/i915/i915_pmu.h | 4 ++++ 2 files changed, 26 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 84301004d5c0..aa729d04abe2 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -822,11 +822,6 @@ static ssize_t i915_pmu_event_show(struct device *dev, return sprintf(buf, "config=0x%lx\n", eattr->val); } -static struct attribute_group i915_pmu_events_attr_group = { - .name = "events", - /* Patch in attrs at runtime. */ -}; - static ssize_t i915_pmu_get_attr_cpumask(struct device *dev, struct device_attribute *attr, @@ -846,13 +841,6 @@ static const struct attribute_group i915_pmu_cpumask_attr_group = { .attrs = i915_cpumask_attrs, }; -static const struct attribute_group *i915_pmu_attr_groups[] = { - &i915_pmu_format_attr_group, - &i915_pmu_events_attr_group, - &i915_pmu_cpumask_attr_group, - NULL -}; - #define __event(__config, __name, __unit) \ { \ .config = (__config), \ @@ -1026,16 +1014,16 @@ err_alloc: static void free_event_attributes(struct i915_pmu *pmu) { - struct attribute **attr_iter = i915_pmu_events_attr_group.attrs; + struct attribute **attr_iter = pmu->events_attr_group.attrs; for (; *attr_iter; attr_iter++) kfree((*attr_iter)->name); - kfree(i915_pmu_events_attr_group.attrs); + kfree(pmu->events_attr_group.attrs); kfree(pmu->i915_attr); kfree(pmu->pmu_attr); - i915_pmu_events_attr_group.attrs = NULL; + pmu->events_attr_group.attrs = NULL; pmu->i915_attr = NULL; pmu->pmu_attr = NULL; } @@ -1117,6 +1105,13 @@ static bool is_igp(struct drm_i915_private *i915) void i915_pmu_register(struct drm_i915_private *i915) { struct i915_pmu *pmu = &i915->pmu; + const struct attribute_group *attr_groups[] = { + &i915_pmu_format_attr_group, + &pmu->events_attr_group, + &i915_pmu_cpumask_attr_group, + NULL + }; + int ret = -ENOMEM; if (INTEL_GEN(i915) <= 2) { @@ -1143,11 +1138,16 @@ void i915_pmu_register(struct drm_i915_private *i915) if (!pmu->name) goto err; - i915_pmu_events_attr_group.attrs = create_event_attributes(pmu); - if (!i915_pmu_events_attr_group.attrs) + pmu->events_attr_group.name = "events"; + pmu->events_attr_group.attrs = create_event_attributes(pmu); + if (!pmu->events_attr_group.attrs) goto err_name; - pmu->base.attr_groups = i915_pmu_attr_groups; + pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups), + GFP_KERNEL); + if (!pmu->base.attr_groups) + goto err_attr; + pmu->base.task_ctx_nr = perf_invalid_context; pmu->base.event_init = i915_pmu_event_init; pmu->base.add = i915_pmu_event_add; @@ -1159,7 +1159,7 @@ void i915_pmu_register(struct drm_i915_private *i915) ret = perf_pmu_register(&pmu->base, pmu->name, -1); if (ret) - goto err_attr; + goto err_groups; ret = i915_pmu_register_cpuhp_state(pmu); if (ret) @@ -1169,6 +1169,8 @@ void i915_pmu_register(struct drm_i915_private *i915) err_unreg: perf_pmu_unregister(&pmu->base); +err_groups: + kfree(pmu->base.attr_groups); err_attr: pmu->base.event_init = NULL; free_event_attributes(pmu); @@ -1194,6 +1196,7 @@ void i915_pmu_unregister(struct drm_i915_private *i915) perf_pmu_unregister(&pmu->base); pmu->base.event_init = NULL; + kfree(pmu->base.attr_groups); if (!is_igp(i915)) kfree(pmu->name); free_event_attributes(pmu); diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h index 207058391cec..f1d6cad0d7d5 100644 --- a/drivers/gpu/drm/i915/i915_pmu.h +++ b/drivers/gpu/drm/i915/i915_pmu.h @@ -107,6 +107,10 @@ struct i915_pmu { * @sleep_last: Last time GT parked for RC6 estimation. */ ktime_t sleep_last; + /** + * @events_attr_group: Device events attribute group. + */ + struct attribute_group events_attr_group; /** * @i915_attr: Memory block holding device attributes. */ From 238734262142075056653b4de091458e0ca858f2 Mon Sep 17 00:00:00 2001 From: Chris Wilson Date: Fri, 21 Feb 2020 22:18:18 +0000 Subject: [PATCH 175/400] drm/i915: Avoid recursing onto active vma from the shrinker We mark the vma as active while binding it in order to protect outselves from being shrunk under mempressure. This only works if we are strict in not attempting to shrink active objects. <6> [472.618968] Workqueue: events_unbound fence_work [i915] <4> [472.618970] Call Trace: <4> [472.618974] ? __schedule+0x2e5/0x810 <4> [472.618978] schedule+0x37/0xe0 <4> [472.618982] schedule_preempt_disabled+0xf/0x20 <4> [472.618984] __mutex_lock+0x281/0x9c0 <4> [472.618987] ? mark_held_locks+0x49/0x70 <4> [472.618989] ? _raw_spin_unlock_irqrestore+0x47/0x60 <4> [472.619038] ? i915_vma_unbind+0xae/0x110 [i915] <4> [472.619084] ? i915_vma_unbind+0xae/0x110 [i915] <4> [472.619122] i915_vma_unbind+0xae/0x110 [i915] <4> [472.619165] i915_gem_object_unbind+0x1dc/0x400 [i915] <4> [472.619208] i915_gem_shrink+0x328/0x660 [i915] <4> [472.619250] ? i915_gem_shrink_all+0x38/0x60 [i915] <4> [472.619282] i915_gem_shrink_all+0x38/0x60 [i915] <4> [472.619325] vm_alloc_page.constprop.25+0x1aa/0x240 [i915] <4> [472.619330] ? rcu_read_lock_sched_held+0x4d/0x80 <4> [472.619363] ? __alloc_pd+0xb/0x30 [i915] <4> [472.619366] ? module_assert_mutex_or_preempt+0xf/0x30 <4> [472.619368] ? __module_address+0x23/0xe0 <4> [472.619371] ? is_module_address+0x26/0x40 <4> [472.619374] ? static_obj+0x34/0x50 <4> [472.619376] ? lockdep_init_map+0x4d/0x1e0 <4> [472.619407] setup_page_dma+0xd/0x90 [i915] <4> [472.619437] alloc_pd+0x29/0x50 [i915] <4> [472.619470] __gen8_ppgtt_alloc+0x443/0x6b0 [i915] <4> [472.619503] gen8_ppgtt_alloc+0xd7/0x300 [i915] <4> [472.619535] ppgtt_bind_vma+0x2a/0xe0 [i915] <4> [472.619577] __vma_bind+0x26/0x40 [i915] <4> [472.619611] fence_work+0x1c/0x90 [i915] <4> [472.619617] process_one_work+0x26a/0x620 Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin Link: https://patchwork.freedesktop.org/patch/msgid/20200221221818.2861432-1-chris@chris-wilson.co.uk (cherry picked from commit 6f24e41022f28061368776ea1514db0a6e67a9b1) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index f7e4b39c734f..59b387ade49c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -256,8 +256,7 @@ unsigned long i915_gem_shrink_all(struct drm_i915_private *i915) with_intel_runtime_pm(&i915->runtime_pm, wakeref) { freed = i915_gem_shrink(i915, -1UL, NULL, I915_SHRINK_BOUND | - I915_SHRINK_UNBOUND | - I915_SHRINK_ACTIVE); + I915_SHRINK_UNBOUND); } return freed; @@ -336,7 +335,6 @@ i915_gem_shrinker_oom(struct notifier_block *nb, unsigned long event, void *ptr) freed_pages = 0; with_intel_runtime_pm(&i915->runtime_pm, wakeref) freed_pages += i915_gem_shrink(i915, -1UL, NULL, - I915_SHRINK_ACTIVE | I915_SHRINK_BOUND | I915_SHRINK_UNBOUND | I915_SHRINK_WRITEBACK); From 8c8c06207bcfc5a7e5918fc0a0f7f7b9a2e196d6 Mon Sep 17 00:00:00 2001 From: Ahzo Date: Tue, 25 Feb 2020 13:56:14 -0500 Subject: [PATCH 176/400] drm/ttm: fix leaking fences via ttm_buffer_object_transfer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Set the drm_device to NULL, so that the newly created buffer object doesn't appear to use the embedded gem object. This is necessary, because otherwise no corresponding dma_resv_fini for the dma_resv_init is called, resulting in a memory leak. The dma_resv_fini in ttm_bo_release_list is only called if the embedded gem object is not used, which is determined by checking if the drm_device is NULL. Bug: https://gitlab.freedesktop.org/drm/amd/issues/958 Fixes: 1e053b10ba60 ("drm/ttm: use gem reservation object") Reviewed-by: Christian König Signed-off-by: Ahzo Signed-off-by: Alex Deucher Signed-off-by: Christian König Link: https://patchwork.freedesktop.org/patch/355089/ --- drivers/gpu/drm/ttm/ttm_bo_util.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index 49ed55779128..953c82a4f573 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -515,6 +515,7 @@ static int ttm_buffer_object_transfer(struct ttm_buffer_object *bo, fbo->base.base.resv = &fbo->base.base._resv; dma_resv_init(&fbo->base.base._resv); + fbo->base.base.dev = NULL; ret = dma_resv_trylock(&fbo->base.base._resv); WARN_ON(!ret); From 212d58c106fd0f2704664be2bb173e14cb4e86d3 Mon Sep 17 00:00:00 2001 From: Stefano Brivio Date: Fri, 21 Feb 2020 03:04:21 +0100 Subject: [PATCH 177/400] nft_set_pipapo: Actually fetch key data in nft_pipapo_remove() Phil reports that adding elements, flushing and re-adding them right away: nft add table t '{ set s { type ipv4_addr . inet_service; flags interval; }; }' nft add element t s '{ 10.0.0.1 . 22-25, 10.0.0.1 . 10-20 }' nft flush set t s nft add element t s '{ 10.0.0.1 . 10-20, 10.0.0.1 . 22-25 }' triggers, almost reliably, a crash like this one: [ 71.319848] general protection fault, probably for non-canonical address 0x6f6b6e696c2e756e: 0000 [#1] PREEMPT SMP PTI [ 71.321540] CPU: 3 PID: 1201 Comm: kworker/3:2 Not tainted 5.6.0-rc1-00377-g2bb07f4e1d861 #192 [ 71.322746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190711_202441-buildvm-armv7-10.arm.fedoraproject.org-2.fc31 04/01/2014 [ 71.324430] Workqueue: events nf_tables_trans_destroy_work [nf_tables] [ 71.325387] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables] [ 71.326164] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b [ 71.328423] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282 [ 71.329225] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0 [ 71.330365] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a [ 71.331473] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000 [ 71.332627] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2 [ 71.333615] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0 [ 71.334596] FS: 0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000 [ 71.335780] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 71.336577] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0 [ 71.337533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 71.338557] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 71.339718] Call Trace: [ 71.340093] nft_pipapo_destroy+0x7a/0x170 [nf_tables_set] [ 71.340973] nft_set_destroy+0x20/0x50 [nf_tables] [ 71.341879] nf_tables_trans_destroy_work+0x246/0x260 [nf_tables] [ 71.342916] process_one_work+0x1d5/0x3c0 [ 71.343601] worker_thread+0x4a/0x3c0 [ 71.344229] kthread+0xfb/0x130 [ 71.344780] ? process_one_work+0x3c0/0x3c0 [ 71.345477] ? kthread_park+0x90/0x90 [ 71.346129] ret_from_fork+0x35/0x40 [ 71.346748] Modules linked in: nf_tables_set nf_tables nfnetlink 8021q [last unloaded: nfnetlink] [ 71.348153] ---[ end trace 2eaa8149ca759bcc ]--- [ 71.349066] RIP: 0010:nft_set_elem_destroy+0xa5/0x110 [nf_tables] [ 71.350016] Code: 89 d4 84 c0 74 0e 8b 77 44 0f b6 f8 48 01 df e8 41 ff ff ff 45 84 e4 74 36 44 0f b6 63 08 45 84 e4 74 2c 49 01 dc 49 8b 04 24 <48> 8b 40 38 48 85 c0 74 4f 48 89 e7 4c 8b [ 71.350017] RSP: 0018:ffffc9000226fd90 EFLAGS: 00010282 [ 71.350019] RAX: 6f6b6e696c2e756e RBX: ffff88813ab79f60 RCX: ffff88813931b5a0 [ 71.350019] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88813ab79f9a [ 71.350020] RBP: ffff88813ab79f60 R08: 0000000000000008 R09: 0000000000000000 [ 71.350021] R10: 000000000000021c R11: 0000000000000000 R12: ffff88813ab79fc2 [ 71.350022] R13: ffff88813b3adf50 R14: dead000000000100 R15: ffff88813931b8a0 [ 71.350025] FS: 0000000000000000(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000 [ 71.350026] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 71.350027] CR2: 000055ac683710f0 CR3: 000000013a222003 CR4: 0000000000360ee0 [ 71.350028] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 71.350028] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 71.350030] Kernel panic - not syncing: Fatal exception [ 71.350412] Kernel Offset: disabled [ 71.365922] ---[ end Kernel panic - not syncing: Fatal exception ]--- which is caused by dangling elements that have been deactivated, but never removed. On a flush operation, nft_pipapo_walk() walks through all the elements in the mapping table, which are then deactivated by nft_flush_set(), one by one, and added to the commit list for removal. Element data is then freed. On transaction commit, nft_pipapo_remove() is called, and failed to remove these elements, leading to the stale references in the mapping. The first symptom of this, revealed by KASan, is a one-byte use-after-free in subsequent calls to nft_pipapo_walk(), which is usually not enough to trigger a panic. When stale elements are used more heavily, though, such as double-free via nft_pipapo_destroy() as in Phil's case, the problem becomes more noticeable. The issue comes from that fact that, on a flush operation, nft_pipapo_remove() won't get the actual key data via elem->key, elements to be deleted upon commit won't be found by the lookup via pipapo_get(), and removal will be skipped. Key data should be fetched via nft_set_ext_key(), instead. Reported-by: Phil Sutter Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Stefano Brivio Signed-off-by: Pablo Neira Ayuso --- net/netfilter/nft_set_pipapo.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index feac8553f6d9..4fc0c924ed5d 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -1766,11 +1766,13 @@ static bool pipapo_match_field(struct nft_pipapo_field *f, static void nft_pipapo_remove(const struct net *net, const struct nft_set *set, const struct nft_set_elem *elem) { - const u8 *data = (const u8 *)elem->key.val.data; struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *m = priv->clone; + struct nft_pipapo_elem *e = elem->priv; int rules_f0, first_rule = 0; - struct nft_pipapo_elem *e; + const u8 *data; + + data = (const u8 *)nft_set_ext_key(&e->ext); e = pipapo_get(net, set, data, 0); if (IS_ERR(e)) From 0954df70fba743d8cdaa09ccf6ba8e4ad09628de Mon Sep 17 00:00:00 2001 From: Stefano Brivio Date: Fri, 21 Feb 2020 03:04:22 +0100 Subject: [PATCH 178/400] selftests: nft_concat_range: Add test for reported add/flush/add issue Add a specific test for the crash reported by Phil Sutter and addressed in the previous patch. The test cases that, in my intention, should have covered these cases, that is, the ones from the 'concurrency' section, don't run these sequences tightly enough and spectacularly failed to catch this. While at it, define a convenient way to add these kind of tests, by adding a "reported issues" test section. It's more convenient, for this particular test, to execute the set setup in its own function. However, future test cases like this one might need to call setup functions, and will typically need no tools other than nft, so allow for this in check_tools(). The original form of the reproducer used here was provided by Phil. Reported-by: Phil Sutter Signed-off-by: Stefano Brivio Signed-off-by: Pablo Neira Ayuso --- .../selftests/netfilter/nft_concat_range.sh | 43 +++++++++++++++++-- 1 file changed, 39 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/netfilter/nft_concat_range.sh b/tools/testing/selftests/netfilter/nft_concat_range.sh index 5c1033ee1b39..5a4938d6dcf2 100755 --- a/tools/testing/selftests/netfilter/nft_concat_range.sh +++ b/tools/testing/selftests/netfilter/nft_concat_range.sh @@ -13,11 +13,12 @@ KSELFTEST_SKIP=4 # Available test groups: +# - reported_issues: check for issues that were reported in the past # - correctness: check that packets match given entries, and only those # - concurrency: attempt races between insertion, deletion and lookup # - timeout: check that packets match entries until they expire # - performance: estimate matching rate, compare with rbtree and hash baselines -TESTS="correctness concurrency timeout" +TESTS="reported_issues correctness concurrency timeout" [ "${quicktest}" != "1" ] && TESTS="${TESTS} performance" # Set types, defined by TYPE_ variables below @@ -25,6 +26,9 @@ TYPES="net_port port_net net6_port port_proto net6_port_mac net6_port_mac_proto net_port_net net_mac net_mac_icmp net6_mac_icmp net6_port_net6_port net_port_mac_proto_net" +# Reported bugs, also described by TYPE_ variables below +BUGS="flush_remove_add" + # List of possible paths to pktgen script from kernel tree for performance tests PKTGEN_SCRIPT_PATHS=" ../../../samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh @@ -327,6 +331,12 @@ flood_spec ip daddr . tcp dport . meta l4proto . ip saddr perf_duration 0 " +# Definition of tests for bugs reported in the past: +# display display text for test report +TYPE_flush_remove_add=" +display Add two elements, flush, re-add +" + # Set template for all tests, types and rules are filled in depending on test set_template=' flush ruleset @@ -440,6 +450,8 @@ setup_set() { # Check that at least one of the needed tools is available check_tools() { + [ -z "${tools}" ] && return 0 + __tools= for tool in ${tools}; do if [ "${tool}" = "nc" ] && [ "${proto}" = "udp6" ] && \ @@ -1430,6 +1442,23 @@ test_performance() { kill "${perf_pid}" } +test_bug_flush_remove_add() { + set_cmd='{ set s { type ipv4_addr . inet_service; flags interval; }; }' + elem1='{ 10.0.0.1 . 22-25, 10.0.0.1 . 10-20 }' + elem2='{ 10.0.0.1 . 10-20, 10.0.0.1 . 22-25 }' + for i in `seq 1 100`; do + nft add table t ${set_cmd} || return ${KSELFTEST_SKIP} + nft add element t s ${elem1} 2>/dev/null || return 1 + nft flush set t s 2>/dev/null || return 1 + nft add element t s ${elem2} 2>/dev/null || return 1 + done + nft flush ruleset +} + +test_reported_issues() { + eval test_bug_"${subtest}" +} + # Run everything in a separate network namespace [ "${1}" != "run" ] && { unshare -n "${0}" run; exit $?; } tmp="$(mktemp)" @@ -1438,9 +1467,15 @@ trap cleanup EXIT # Entry point for test runs passed=0 for name in ${TESTS}; do - printf "TEST: %s\n" "${name}" - for type in ${TYPES}; do - eval desc=\$TYPE_"${type}" + printf "TEST: %s\n" "$(echo ${name} | tr '_' ' ')" + if [ "${name}" = "reported_issues" ]; then + SUBTESTS="${BUGS}" + else + SUBTESTS="${TYPES}" + fi + + for subtest in ${SUBTESTS}; do + eval desc=\$TYPE_"${subtest}" IFS=' ' for __line in ${desc}; do From 2a44f46781617c5040372b59da33553a02b1f46d Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Tue, 25 Feb 2020 13:25:41 -0700 Subject: [PATCH 179/400] io_uring: pick up link work on submit reference drop If work completes inline, then we should pick up a dependent link item in __io_queue_sqe() as well. If we don't do so, we're forced to go async with that item, which is suboptimal. This also fixes an issue with io_put_req_find_next(), which always looks up the next work item. That should only be done if we're dropping the last reference to the request, to prevent multiple lookups of the same work item. Outside of being a fix, this also enables a good cleanup series for 5.7, where we never have to pass 'nxt' around or into the work handlers. Reviewed-by: Pavel Begunkov Signed-off-by: Jens Axboe --- fs/io_uring.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index ffd9bfa84d86..f79ca494bb56 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1483,10 +1483,10 @@ static void io_free_req(struct io_kiocb *req) __attribute__((nonnull)) static void io_put_req_find_next(struct io_kiocb *req, struct io_kiocb **nxtptr) { - io_req_find_next(req, nxtptr); - - if (refcount_dec_and_test(&req->refs)) + if (refcount_dec_and_test(&req->refs)) { + io_req_find_next(req, nxtptr); __io_free_req(req); + } } static void io_put_req(struct io_kiocb *req) @@ -4749,7 +4749,7 @@ punt: err: /* drop submission reference */ - io_put_req(req); + io_put_req_find_next(req, &nxt); if (linked_timeout) { if (!ret) From 3a9015988b3d41027cda61f4fdeaaeee73be8b24 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Tue, 25 Feb 2020 17:48:55 -0700 Subject: [PATCH 180/400] io_uring: import_single_range() returns 0/-ERROR Unlike the other core import helpers, import_single_range() returns 0 on success, not the length imported. This means that links that depend on the result of non-vec based IORING_OP_{READ,WRITE} that were added for 5.5 get errored when they should not be. Fixes: 3a6820f2bb8a ("io_uring: add non-vectored read/write commands") Signed-off-by: Jens Axboe --- fs/io_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index f79ca494bb56..36917c0101fd 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2075,7 +2075,7 @@ static ssize_t io_import_iovec(int rw, struct io_kiocb *req, ssize_t ret; ret = import_single_range(rw, buf, sqe_len, *iovec, iter); *iovec = NULL; - return ret; + return ret < 0 ? ret : sqe_len; } if (req->io) { From 63056e8b5ebf41d52170e9f5ba1fc83d1855278c Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Fri, 21 Feb 2020 09:48:46 +0100 Subject: [PATCH 181/400] efi/x86: Align GUIDs to their size in the mixed mode runtime wrapper Hans reports that his mixed mode systems running v5.6-rc1 kernels hit the WARN_ON() in virt_to_phys_or_null_size(), caused by the fact that efi_guid_t objects on the vmap'ed stack happen to be misaligned with respect to their sizes. As a quick (i.e., backportable) fix, copy GUID pointer arguments to the local stack into a buffer that is naturally aligned to its size, so that it is guaranteed to cover only one physical page. Note that on x86, we cannot rely on the stack pointer being aligned the way the compiler expects, so we need to allocate an 8-byte aligned buffer of sufficient size, and copy the GUID into that buffer at an offset that is aligned to 16 bytes. Fixes: f6697df36bdf0bf7 ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y") Reported-by: Hans de Goede Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Tested-by: Hans de Goede Cc: linux-efi@vger.kernel.org Cc: Ingo Molnar Cc: Thomas Gleixner Link: https://lore.kernel.org/r/20200221084849.26878-2-ardb@kernel.org --- arch/x86/platform/efi/efi_64.c | 25 +++++++++++++++++++++---- 1 file changed, 21 insertions(+), 4 deletions(-) diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c index fa8506e76bbe..543edfdcd1b9 100644 --- a/arch/x86/platform/efi/efi_64.c +++ b/arch/x86/platform/efi/efi_64.c @@ -658,6 +658,8 @@ static efi_status_t efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor, u32 *attr, unsigned long *data_size, void *data) { + u8 buf[24] __aligned(8); + efi_guid_t *vnd = PTR_ALIGN((efi_guid_t *)buf, sizeof(*vnd)); efi_status_t status; u32 phys_name, phys_vendor, phys_attr; u32 phys_data_size, phys_data; @@ -665,8 +667,10 @@ efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor, spin_lock_irqsave(&efi_runtime_lock, flags); + *vnd = *vendor; + phys_data_size = virt_to_phys_or_null(data_size); - phys_vendor = virt_to_phys_or_null(vendor); + phys_vendor = virt_to_phys_or_null(vnd); phys_name = virt_to_phys_or_null_size(name, efi_name_size(name)); phys_attr = virt_to_phys_or_null(attr); phys_data = virt_to_phys_or_null_size(data, *data_size); @@ -683,14 +687,18 @@ static efi_status_t efi_thunk_set_variable(efi_char16_t *name, efi_guid_t *vendor, u32 attr, unsigned long data_size, void *data) { + u8 buf[24] __aligned(8); + efi_guid_t *vnd = PTR_ALIGN((efi_guid_t *)buf, sizeof(*vnd)); u32 phys_name, phys_vendor, phys_data; efi_status_t status; unsigned long flags; spin_lock_irqsave(&efi_runtime_lock, flags); + *vnd = *vendor; + phys_name = virt_to_phys_or_null_size(name, efi_name_size(name)); - phys_vendor = virt_to_phys_or_null(vendor); + phys_vendor = virt_to_phys_or_null(vnd); phys_data = virt_to_phys_or_null_size(data, data_size); /* If data_size is > sizeof(u32) we've got problems */ @@ -707,6 +715,8 @@ efi_thunk_set_variable_nonblocking(efi_char16_t *name, efi_guid_t *vendor, u32 attr, unsigned long data_size, void *data) { + u8 buf[24] __aligned(8); + efi_guid_t *vnd = PTR_ALIGN((efi_guid_t *)buf, sizeof(*vnd)); u32 phys_name, phys_vendor, phys_data; efi_status_t status; unsigned long flags; @@ -714,8 +724,10 @@ efi_thunk_set_variable_nonblocking(efi_char16_t *name, efi_guid_t *vendor, if (!spin_trylock_irqsave(&efi_runtime_lock, flags)) return EFI_NOT_READY; + *vnd = *vendor; + phys_name = virt_to_phys_or_null_size(name, efi_name_size(name)); - phys_vendor = virt_to_phys_or_null(vendor); + phys_vendor = virt_to_phys_or_null(vnd); phys_data = virt_to_phys_or_null_size(data, data_size); /* If data_size is > sizeof(u32) we've got problems */ @@ -732,14 +744,18 @@ efi_thunk_get_next_variable(unsigned long *name_size, efi_char16_t *name, efi_guid_t *vendor) { + u8 buf[24] __aligned(8); + efi_guid_t *vnd = PTR_ALIGN((efi_guid_t *)buf, sizeof(*vnd)); efi_status_t status; u32 phys_name_size, phys_name, phys_vendor; unsigned long flags; spin_lock_irqsave(&efi_runtime_lock, flags); + *vnd = *vendor; + phys_name_size = virt_to_phys_or_null(name_size); - phys_vendor = virt_to_phys_or_null(vendor); + phys_vendor = virt_to_phys_or_null(vnd); phys_name = virt_to_phys_or_null_size(name, *name_size); status = efi_thunk(get_next_variable, phys_name_size, @@ -747,6 +763,7 @@ efi_thunk_get_next_variable(unsigned long *name_size, spin_unlock_irqrestore(&efi_runtime_lock, flags); + *vendor = *vnd; return status; } From f80c9f6476db6c0802545aaa44eb9a38e751786a Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Fri, 21 Feb 2020 09:48:47 +0100 Subject: [PATCH 182/400] efi/x86: Remove support for EFI time and counter services in mixed mode Mixed mode calls at runtime are rather tricky with vmap'ed stacks, as we can no longer assume that data passed in by the callers of the EFI runtime wrapper routines is contiguous in physical memory. We need to fix this, but before we do, let's drop the implementations of routines that we know are never used on x86, i.e., the RTC related ones. Given that UEFI rev2.8 permits any runtime service to return EFI_UNSUPPORTED at runtime, let's return that instead. As get_next_high_mono_count() is never used at all, even on other architectures, let's make that return EFI_UNSUPPORTED too. Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Cc: linux-efi@vger.kernel.org Cc: Ingo Molnar Cc: Thomas Gleixner Link: https://lore.kernel.org/r/20200221084849.26878-3-ardb@kernel.org --- arch/x86/platform/efi/efi_64.c | 81 +++------------------------------- 1 file changed, 5 insertions(+), 76 deletions(-) diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c index 543edfdcd1b9..ae398587f264 100644 --- a/arch/x86/platform/efi/efi_64.c +++ b/arch/x86/platform/efi/efi_64.c @@ -568,85 +568,25 @@ efi_thunk_set_virtual_address_map(unsigned long memory_map_size, static efi_status_t efi_thunk_get_time(efi_time_t *tm, efi_time_cap_t *tc) { - efi_status_t status; - u32 phys_tm, phys_tc; - unsigned long flags; - - spin_lock(&rtc_lock); - spin_lock_irqsave(&efi_runtime_lock, flags); - - phys_tm = virt_to_phys_or_null(tm); - phys_tc = virt_to_phys_or_null(tc); - - status = efi_thunk(get_time, phys_tm, phys_tc); - - spin_unlock_irqrestore(&efi_runtime_lock, flags); - spin_unlock(&rtc_lock); - - return status; + return EFI_UNSUPPORTED; } static efi_status_t efi_thunk_set_time(efi_time_t *tm) { - efi_status_t status; - u32 phys_tm; - unsigned long flags; - - spin_lock(&rtc_lock); - spin_lock_irqsave(&efi_runtime_lock, flags); - - phys_tm = virt_to_phys_or_null(tm); - - status = efi_thunk(set_time, phys_tm); - - spin_unlock_irqrestore(&efi_runtime_lock, flags); - spin_unlock(&rtc_lock); - - return status; + return EFI_UNSUPPORTED; } static efi_status_t efi_thunk_get_wakeup_time(efi_bool_t *enabled, efi_bool_t *pending, efi_time_t *tm) { - efi_status_t status; - u32 phys_enabled, phys_pending, phys_tm; - unsigned long flags; - - spin_lock(&rtc_lock); - spin_lock_irqsave(&efi_runtime_lock, flags); - - phys_enabled = virt_to_phys_or_null(enabled); - phys_pending = virt_to_phys_or_null(pending); - phys_tm = virt_to_phys_or_null(tm); - - status = efi_thunk(get_wakeup_time, phys_enabled, - phys_pending, phys_tm); - - spin_unlock_irqrestore(&efi_runtime_lock, flags); - spin_unlock(&rtc_lock); - - return status; + return EFI_UNSUPPORTED; } static efi_status_t efi_thunk_set_wakeup_time(efi_bool_t enabled, efi_time_t *tm) { - efi_status_t status; - u32 phys_tm; - unsigned long flags; - - spin_lock(&rtc_lock); - spin_lock_irqsave(&efi_runtime_lock, flags); - - phys_tm = virt_to_phys_or_null(tm); - - status = efi_thunk(set_wakeup_time, enabled, phys_tm); - - spin_unlock_irqrestore(&efi_runtime_lock, flags); - spin_unlock(&rtc_lock); - - return status; + return EFI_UNSUPPORTED; } static unsigned long efi_name_size(efi_char16_t *name) @@ -770,18 +710,7 @@ efi_thunk_get_next_variable(unsigned long *name_size, static efi_status_t efi_thunk_get_next_high_mono_count(u32 *count) { - efi_status_t status; - u32 phys_count; - unsigned long flags; - - spin_lock_irqsave(&efi_runtime_lock, flags); - - phys_count = virt_to_phys_or_null(count); - status = efi_thunk(get_next_high_mono_count, phys_count); - - spin_unlock_irqrestore(&efi_runtime_lock, flags); - - return status; + return EFI_UNSUPPORTED; } static void From 8319e9d5ad98ffccd19f35664382c73cea216193 Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Fri, 21 Feb 2020 09:48:48 +0100 Subject: [PATCH 183/400] efi/x86: Handle by-ref arguments covering multiple pages in mixed mode The mixed mode runtime wrappers are fragile when it comes to how the memory referred to by its pointer arguments are laid out in memory, due to the fact that it translates these addresses to physical addresses that the runtime services can dereference when running in 1:1 mode. Since vmalloc'ed pages (including the vmap'ed stack) are not contiguous in the physical address space, this scheme only works if the referenced memory objects do not cross page boundaries. Currently, the mixed mode runtime service wrappers require that all by-ref arguments that live in the vmalloc space have a size that is a power of 2, and are aligned to that same value. While this is a sensible way to construct an object that is guaranteed not to cross a page boundary, it is overly strict when it comes to checking whether a given object violates this requirement, as we can simply take the physical address of the first and the last byte, and verify that they point into the same physical page. When this check fails, we emit a WARN(), but then simply proceed with the call, which could cause data corruption if the next physical page belongs to a mapping that is entirely unrelated. Given that with vmap'ed stacks, this condition is much more likely to trigger, let's relax the condition a bit, but fail the runtime service call if it does trigger. Fixes: f6697df36bdf0bf7 ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y") Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Cc: linux-efi@vger.kernel.org Cc: Ingo Molnar Cc: Thomas Gleixner Link: https://lore.kernel.org/r/20200221084849.26878-4-ardb@kernel.org --- arch/x86/platform/efi/efi_64.c | 45 ++++++++++++++++++++-------------- 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c index ae398587f264..d19a2edd63cb 100644 --- a/arch/x86/platform/efi/efi_64.c +++ b/arch/x86/platform/efi/efi_64.c @@ -180,7 +180,7 @@ void efi_sync_low_kernel_mappings(void) static inline phys_addr_t virt_to_phys_or_null_size(void *va, unsigned long size) { - bool bad_size; + phys_addr_t pa; if (!va) return 0; @@ -188,16 +188,13 @@ virt_to_phys_or_null_size(void *va, unsigned long size) if (virt_addr_valid(va)) return virt_to_phys(va); - /* - * A fully aligned variable on the stack is guaranteed not to - * cross a page bounary. Try to catch strings on the stack by - * checking that 'size' is a power of two. - */ - bad_size = size > PAGE_SIZE || !is_power_of_2(size); + pa = slow_virt_to_phys(va); - WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size); + /* check if the object crosses a page boundary */ + if (WARN_ON((pa ^ (pa + size - 1)) & PAGE_MASK)) + return 0; - return slow_virt_to_phys(va); + return pa; } #define virt_to_phys_or_null(addr) \ @@ -615,8 +612,11 @@ efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor, phys_attr = virt_to_phys_or_null(attr); phys_data = virt_to_phys_or_null_size(data, *data_size); - status = efi_thunk(get_variable, phys_name, phys_vendor, - phys_attr, phys_data_size, phys_data); + if (!phys_name || (data && !phys_data)) + status = EFI_INVALID_PARAMETER; + else + status = efi_thunk(get_variable, phys_name, phys_vendor, + phys_attr, phys_data_size, phys_data); spin_unlock_irqrestore(&efi_runtime_lock, flags); @@ -641,9 +641,11 @@ efi_thunk_set_variable(efi_char16_t *name, efi_guid_t *vendor, phys_vendor = virt_to_phys_or_null(vnd); phys_data = virt_to_phys_or_null_size(data, data_size); - /* If data_size is > sizeof(u32) we've got problems */ - status = efi_thunk(set_variable, phys_name, phys_vendor, - attr, data_size, phys_data); + if (!phys_name || !phys_data) + status = EFI_INVALID_PARAMETER; + else + status = efi_thunk(set_variable, phys_name, phys_vendor, + attr, data_size, phys_data); spin_unlock_irqrestore(&efi_runtime_lock, flags); @@ -670,9 +672,11 @@ efi_thunk_set_variable_nonblocking(efi_char16_t *name, efi_guid_t *vendor, phys_vendor = virt_to_phys_or_null(vnd); phys_data = virt_to_phys_or_null_size(data, data_size); - /* If data_size is > sizeof(u32) we've got problems */ - status = efi_thunk(set_variable, phys_name, phys_vendor, - attr, data_size, phys_data); + if (!phys_name || !phys_data) + status = EFI_INVALID_PARAMETER; + else + status = efi_thunk(set_variable, phys_name, phys_vendor, + attr, data_size, phys_data); spin_unlock_irqrestore(&efi_runtime_lock, flags); @@ -698,8 +702,11 @@ efi_thunk_get_next_variable(unsigned long *name_size, phys_vendor = virt_to_phys_or_null(vnd); phys_name = virt_to_phys_or_null_size(name, *name_size); - status = efi_thunk(get_next_variable, phys_name_size, - phys_name, phys_vendor); + if (!phys_name) + status = EFI_INVALID_PARAMETER; + else + status = efi_thunk(get_next_variable, phys_name_size, + phys_name, phys_vendor); spin_unlock_irqrestore(&efi_runtime_lock, flags); From be36f9e7517e17810ec369626a128d7948942259 Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Fri, 21 Feb 2020 09:48:49 +0100 Subject: [PATCH 184/400] efi: READ_ONCE rng seed size before munmap This function is consistent with using size instead of seed->size (except for one place that this patch fixes), but it reads seed->size without using READ_ONCE, which means the compiler might still do something unwanted. So, this commit simply adds the READ_ONCE wrapper. Fixes: 636259880a7e ("efi: Add support for seeding the RNG from a UEFI ...") Signed-off-by: Jason A. Donenfeld Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Cc: linux-efi@vger.kernel.org Cc: Ingo Molnar Cc: Thomas Gleixner Link: https://lore.kernel.org/r/20200217123354.21140-1-Jason@zx2c4.com Link: https://lore.kernel.org/r/20200221084849.26878-5-ardb@kernel.org --- drivers/firmware/efi/efi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index 621220ab3d0e..21ea99f65113 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -552,7 +552,7 @@ int __init efi_config_parse_tables(void *config_tables, int count, int sz, seed = early_memremap(efi.rng_seed, sizeof(*seed)); if (seed != NULL) { - size = seed->size; + size = READ_ONCE(seed->size); early_memunmap(seed, sizeof(*seed)); } else { pr_err("Could not map UEFI random seed!\n"); @@ -562,7 +562,7 @@ int __init efi_config_parse_tables(void *config_tables, int count, int sz, sizeof(*seed) + size); if (seed != NULL) { pr_notice("seeding entropy pool\n"); - add_bootloader_randomness(seed->bits, seed->size); + add_bootloader_randomness(seed->bits, size); early_memunmap(seed, sizeof(*seed) + size); } else { pr_err("Could not map UEFI random seed!\n"); From 38b6a714941a285319734b9ccc30d34e107d5d7e Mon Sep 17 00:00:00 2001 From: Dan Murphy Date: Wed, 26 Feb 2020 07:03:05 -0600 Subject: [PATCH 185/400] ASoC: tas2562: Fix sample rate error message Fix error message for setting the sample rate. It says bitwidth but should say sample rate. Signed-off-by: Dan Murphy Link: https://lore.kernel.org/r/20200226130305.12043-3-dmurphy@ti.com Signed-off-by: Mark Brown --- sound/soc/codecs/tas2562.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/tas2562.c b/sound/soc/codecs/tas2562.c index 5b4803a76719..be52886a5edb 100644 --- a/sound/soc/codecs/tas2562.c +++ b/sound/soc/codecs/tas2562.c @@ -252,7 +252,7 @@ static int tas2562_hw_params(struct snd_pcm_substream *substream, ret = tas2562_set_samplerate(tas2562, params_rate(params)); if (ret) - dev_err(tas2562->dev, "set bitwidth failed, %d\n", ret); + dev_err(tas2562->dev, "set sample rate failed, %d\n", ret); return ret; } From 505b12b3861bc79d1b81c815faaf4910469a7006 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Wed, 12 Feb 2020 20:40:57 -0800 Subject: [PATCH 186/400] kbuild: add comment for V=2 mode Complete the comments for valid values of KBUILD_VERBOSE, specifically for KBUILD_VERBOSE=2. Signed-off-by: Randy Dunlap Signed-off-by: Masahiro Yamada --- Makefile | 1 + 1 file changed, 1 insertion(+) diff --git a/Makefile b/Makefile index 0914049d2929..2afa692b39ff 100644 --- a/Makefile +++ b/Makefile @@ -68,6 +68,7 @@ unexport GREP_OPTIONS # # If KBUILD_VERBOSE equals 0 then the above command will be hidden. # If KBUILD_VERBOSE equals 1 then the above command is displayed. +# If KBUILD_VERBOSE equals 2 then give the reason why each target is rebuilt. # # To put more focus on warnings, be less verbose as default # Use 'make V=1' to see the full commands From eccbde4f6c2b6ebc52b3e9103e6f2f73f5a9f79a Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Wed, 19 Feb 2020 10:15:19 +0900 Subject: [PATCH 187/400] kbuild: remove wrong documentation about mandatory-y This sentence does not make sense in the section about mandatory-y. This seems to be a copy-paste mistake of commit fcc8487d477a ("uapi: export all headers under uapi directories"). The correct description would be "The convention is to list one mandatory-y per line ...". I just removed it instead of fixing it. If such information is needed, it could be commented in include/asm-generic/Kbuild and include/uapi/asm-generic/Kbuild. Signed-off-by: Masahiro Yamada --- Documentation/kbuild/makefiles.rst | 3 --- 1 file changed, 3 deletions(-) diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst index 0e0eb2c8da7d..4018ad7c7a11 100644 --- a/Documentation/kbuild/makefiles.rst +++ b/Documentation/kbuild/makefiles.rst @@ -1379,9 +1379,6 @@ See subsequent chapter for the syntax of the Kbuild file. in arch/$(ARCH)/include/(uapi/)/asm, Kbuild will automatically generate a wrapper of the asm-generic one. - The convention is to list one subdir per line and - preferably in alphabetic order. - 8 Kbuild Variables ================== From 7a04960560640ac5b0b89461f7757322b57d0c7a Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Sun, 23 Feb 2020 04:04:31 +0900 Subject: [PATCH 188/400] kbuild: fix DT binding schema rule to detect command line changes This if_change_rule is not working properly; it cannot detect any command line change. The reason is because cmd-check in scripts/Kbuild.include compares $(cmd_$@) and $(cmd_$1), but cmd_dtc_dt_yaml does not exist here. For if_change_rule to work properly, the stem part of cmd_* and rule_* must match. Because this cmd_and_fixdep invokes cmd_dtc, this rule must be named rule_dtc. Fixes: 4f0e3a57d6eb ("kbuild: Add support for DT binding schema checks") Signed-off-by: Masahiro Yamada Acked-by: Rob Herring --- scripts/Makefile.lib | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index bae62549e3d2..64b938c10039 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -302,13 +302,13 @@ DT_TMP_SCHEMA := $(objtree)/$(DT_BINDING_DIR)/processed-schema.yaml quiet_cmd_dtb_check = CHECK $@ cmd_dtb_check = $(DT_CHECKER) -u $(srctree)/$(DT_BINDING_DIR) -p $(DT_TMP_SCHEMA) $@ ; -define rule_dtc_dt_yaml +define rule_dtc $(call cmd_and_fixdep,dtc,yaml) $(call cmd,dtb_check) endef $(obj)/%.dt.yaml: $(src)/%.dts $(DTC) $(DT_TMP_SCHEMA) FORCE - $(call if_changed_rule,dtc_dt_yaml) + $(call if_changed_rule,dtc) dtc-tmp = $(subst $(comma),_,$(dot-target).dts.tmp) From fd63fab48f143f73b534821408a303241ed174f9 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Sun, 23 Feb 2020 04:04:32 +0900 Subject: [PATCH 189/400] kbuild: remove unneeded semicolon at the end of cmd_dtb_check This trailing semicolon is unneeded. Signed-off-by: Masahiro Yamada Acked-by: Rob Herring --- scripts/Makefile.lib | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 64b938c10039..752ff0a225a9 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -300,7 +300,7 @@ DT_BINDING_DIR := Documentation/devicetree/bindings DT_TMP_SCHEMA := $(objtree)/$(DT_BINDING_DIR)/processed-schema.yaml quiet_cmd_dtb_check = CHECK $@ - cmd_dtb_check = $(DT_CHECKER) -u $(srctree)/$(DT_BINDING_DIR) -p $(DT_TMP_SCHEMA) $@ ; + cmd_dtb_check = $(DT_CHECKER) -u $(srctree)/$(DT_BINDING_DIR) -p $(DT_TMP_SCHEMA) $@ define rule_dtc $(call cmd_and_fixdep,dtc,yaml) From 964a596db8db8c77c9903dd05655696696e6b3ad Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Sun, 23 Feb 2020 04:04:33 +0900 Subject: [PATCH 190/400] kbuild: add dtbs_check to PHONY The dtbs_check should be a phony target, but currently it is not specified so. 'make dtbs_check' works even if a file named 'dtbs_check' exists because it depends on another phony target, scripts_dtc, but we should not rely on it. Add dtbs_check to PHONY. Signed-off-by: Masahiro Yamada Acked-by: Rob Herring --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 2afa692b39ff..51df13599211 100644 --- a/Makefile +++ b/Makefile @@ -1239,7 +1239,7 @@ ifneq ($(dtstree),) %.dtb: include/config/kernel.release scripts_dtc $(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@ -PHONY += dtbs dtbs_install dt_binding_check +PHONY += dtbs dtbs_install dtbs_check dt_binding_check dtbs dtbs_check: include/config/kernel.release scripts_dtc $(Q)$(MAKE) $(build)=$(dtstree) From c473a8d03ea8e03ca9d649b0b6d72fbcf6212c05 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Sun, 23 Feb 2020 04:04:34 +0900 Subject: [PATCH 191/400] kbuild: add dt_binding_check to PHONY in a correct place The dt_binding_check is added to PHONY, but it is invisible when $(dtstree) is empty. So, it is not specified as phony for ARCH=x86 etc. Add it to PHONY outside the ifneq ... endif block. Signed-off-by: Masahiro Yamada Acked-by: Rob Herring --- Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 51df13599211..1a1a0d271697 100644 --- a/Makefile +++ b/Makefile @@ -1239,7 +1239,7 @@ ifneq ($(dtstree),) %.dtb: include/config/kernel.release scripts_dtc $(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@ -PHONY += dtbs dtbs_install dtbs_check dt_binding_check +PHONY += dtbs dtbs_install dtbs_check dtbs dtbs_check: include/config/kernel.release scripts_dtc $(Q)$(MAKE) $(build)=$(dtstree) @@ -1259,6 +1259,7 @@ PHONY += scripts_dtc scripts_dtc: scripts_basic $(Q)$(MAKE) $(build)=scripts/dtc +PHONY += dt_binding_check dt_binding_check: scripts_dtc $(Q)$(MAKE) $(build)=Documentation/devicetree/bindings From cae740a04b4d6d5166f19ee5faf04ea2a1f34b3d Mon Sep 17 00:00:00 2001 From: John Garry Date: Wed, 26 Feb 2020 20:10:15 +0800 Subject: [PATCH 192/400] blk-mq: Remove some unused function arguments The struct blk_mq_hw_ctx pointer argument in blk_mq_put_tag(), blk_mq_poll_nsecs(), and blk_mq_poll_hybrid_sleep() is unused, so remove it. Overall obj code size shows a minor reduction, before: text data bss dec hex filename 27306 1312 0 28618 6fca block/blk-mq.o 4303 272 0 4575 11df block/blk-mq-tag.o after: 27282 1312 0 28594 6fb2 block/blk-mq.o 4311 272 0 4583 11e7 block/blk-mq-tag.o Reviewed-by: Johannes Thumshirn Reviewed-by: Hannes Reinecke Signed-off-by: John Garry -- This minor patch had been carried as part of the blk-mq shared tags RFC, I'd rather not carry it anymore as it required rebasing, so now or never.. Signed-off-by: Jens Axboe --- block/blk-mq-tag.c | 4 ++-- block/blk-mq-tag.h | 4 ++-- block/blk-mq.c | 10 ++++------ block/blk-mq.h | 2 +- 4 files changed, 9 insertions(+), 11 deletions(-) diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index fbacde454718..586c9d6e904a 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -183,8 +183,8 @@ found_tag: return tag + tag_offset; } -void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags, - struct blk_mq_ctx *ctx, unsigned int tag) +void blk_mq_put_tag(struct blk_mq_tags *tags, struct blk_mq_ctx *ctx, + unsigned int tag) { if (!blk_mq_tag_is_reserved(tags, tag)) { const int real_tag = tag - tags->nr_reserved_tags; diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h index 15bc74acb57e..2b8321efb682 100644 --- a/block/blk-mq-tag.h +++ b/block/blk-mq-tag.h @@ -26,8 +26,8 @@ extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags, unsigned int r extern void blk_mq_free_tags(struct blk_mq_tags *tags); extern unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data); -extern void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags, - struct blk_mq_ctx *ctx, unsigned int tag); +extern void blk_mq_put_tag(struct blk_mq_tags *tags, struct blk_mq_ctx *ctx, + unsigned int tag); extern int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags **tags, unsigned int depth, bool can_grow); diff --git a/block/blk-mq.c b/block/blk-mq.c index 5e1e4151cb51..d92088dec6c3 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -477,9 +477,9 @@ static void __blk_mq_free_request(struct request *rq) blk_pm_mark_last_busy(rq); rq->mq_hctx = NULL; if (rq->tag != -1) - blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag); + blk_mq_put_tag(hctx->tags, ctx, rq->tag); if (sched_tag != -1) - blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag); + blk_mq_put_tag(hctx->sched_tags, ctx, sched_tag); blk_mq_sched_restart(hctx); blk_queue_exit(q); } @@ -3402,7 +3402,6 @@ static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb) } static unsigned long blk_mq_poll_nsecs(struct request_queue *q, - struct blk_mq_hw_ctx *hctx, struct request *rq) { unsigned long ret = 0; @@ -3435,7 +3434,6 @@ static unsigned long blk_mq_poll_nsecs(struct request_queue *q, } static bool blk_mq_poll_hybrid_sleep(struct request_queue *q, - struct blk_mq_hw_ctx *hctx, struct request *rq) { struct hrtimer_sleeper hs; @@ -3455,7 +3453,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q, if (q->poll_nsec > 0) nsecs = q->poll_nsec; else - nsecs = blk_mq_poll_nsecs(q, hctx, rq); + nsecs = blk_mq_poll_nsecs(q, rq); if (!nsecs) return false; @@ -3510,7 +3508,7 @@ static bool blk_mq_poll_hybrid(struct request_queue *q, return false; } - return blk_mq_poll_hybrid_sleep(q, hctx, rq); + return blk_mq_poll_hybrid_sleep(q, rq); } /** diff --git a/block/blk-mq.h b/block/blk-mq.h index c0fa34378eb2..10bfdfb494fa 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -200,7 +200,7 @@ static inline bool blk_mq_get_dispatch_budget(struct blk_mq_hw_ctx *hctx) static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *rq) { - blk_mq_put_tag(hctx, hctx->tags, rq->mq_ctx, rq->tag); + blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag); rq->tag = -1; if (rq->rq_flags & RQF_MQ_INFLIGHT) { From dd3db2a34cff14e152da7c8e320297719a35abf9 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Wed, 26 Feb 2020 10:23:43 -0700 Subject: [PATCH 193/400] io_uring: drop file set ref put/get on switch Dan reports that he triggered a warning on ring exit doing some testing: percpu ref (io_file_data_ref_zero) <= 0 (0) after switching to atomic WARNING: CPU: 3 PID: 0 at lib/percpu-refcount.c:160 percpu_ref_switch_to_atomic_rcu+0xe8/0xf0 Modules linked in: CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.0-rc3+ #5648 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 RIP: 0010:percpu_ref_switch_to_atomic_rcu+0xe8/0xf0 Code: e7 ff 55 e8 eb d2 80 3d bd 02 d2 00 00 75 8b 48 8b 55 d8 48 c7 c7 e8 70 e6 81 c6 05 a9 02 d2 00 01 48 8b 75 e8 e8 3a d0 c5 ff <0f> 0b e9 69 ff ff ff 90 55 48 89 fd 53 48 89 f3 48 83 ec 28 48 83 RSP: 0018:ffffc90000110ef8 EFLAGS: 00010292 RAX: 0000000000000045 RBX: 7fffffffffffffff RCX: 0000000000000000 RDX: 0000000000000045 RSI: ffffffff825be7a5 RDI: ffffffff825bc32c RBP: ffff8881b75eac38 R08: 000000042364b941 R09: 0000000000000045 R10: ffffffff825beb40 R11: ffffffff825be78a R12: 0000607e46005aa0 R13: ffff888107dcdd00 R14: 0000000000000000 R15: 0000000000000009 FS: 0000000000000000(0000) GS:ffff8881b9d80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f49e6a5ea20 CR3: 00000001b747c004 CR4: 00000000001606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rcu_core+0x1e4/0x4d0 __do_softirq+0xdb/0x2f1 irq_exit+0xa0/0xb0 smp_apic_timer_interrupt+0x60/0x140 apic_timer_interrupt+0xf/0x20 RIP: 0010:default_idle+0x23/0x170 Code: ff eb ab cc cc cc cc 0f 1f 44 00 00 41 54 55 53 65 8b 2d 10 96 92 7e 0f 1f 44 00 00 e9 07 00 00 00 0f 00 2d 21 d0 51 00 fb f4 <65> 8b 2d f6 95 92 7e 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 e5 95 Turns out that this is due to percpu_ref_switch_to_atomic() only grabbing a reference to the percpu refcount if it's not already in atomic mode. io_uring drops a ref and re-gets it when switching back to percpu mode. We attempt to protect against this with the FFD_F_ATOMIC bit, but that isn't reliable. We don't actually need to juggle these refcounts between atomic and percpu switch, we can just do them when we've switched to atomic mode. This removes the need for FFD_F_ATOMIC, which wasn't reliable. Fixes: 05f3fb3c5397 ("io_uring: avoid ring quiesce for fixed file set unregister and update") Reported-by: Dan Melnic Signed-off-by: Jens Axboe --- fs/io_uring.c | 23 ++++++++--------------- 1 file changed, 8 insertions(+), 15 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 36917c0101fd..e412a1761d93 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -183,17 +183,12 @@ struct fixed_file_table { struct file **files; }; -enum { - FFD_F_ATOMIC, -}; - struct fixed_file_data { struct fixed_file_table *table; struct io_ring_ctx *ctx; struct percpu_ref refs; struct llist_head put_llist; - unsigned long state; struct work_struct ref_work; struct completion done; }; @@ -5595,7 +5590,6 @@ static void io_ring_file_ref_switch(struct work_struct *work) data = container_of(work, struct fixed_file_data, ref_work); io_ring_file_ref_flush(data); - percpu_ref_get(&data->refs); percpu_ref_switch_to_percpu(&data->refs); } @@ -5771,8 +5765,13 @@ static void io_atomic_switch(struct percpu_ref *ref) { struct fixed_file_data *data; + /* + * Juggle reference to ensure we hit zero, if needed, so we can + * switch back to percpu mode + */ data = container_of(ref, struct fixed_file_data, refs); - clear_bit(FFD_F_ATOMIC, &data->state); + percpu_ref_put(&data->refs); + percpu_ref_get(&data->refs); } static bool io_queue_file_removal(struct fixed_file_data *data, @@ -5795,11 +5794,7 @@ static bool io_queue_file_removal(struct fixed_file_data *data, llist_add(&pfile->llist, &data->put_llist); if (pfile == &pfile_stack) { - if (!test_and_set_bit(FFD_F_ATOMIC, &data->state)) { - percpu_ref_put(&data->refs); - percpu_ref_switch_to_atomic(&data->refs, - io_atomic_switch); - } + percpu_ref_switch_to_atomic(&data->refs, io_atomic_switch); wait_for_completion(&done); flush_work(&data->ref_work); return false; @@ -5873,10 +5868,8 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx, up->offset++; } - if (ref_switch && !test_and_set_bit(FFD_F_ATOMIC, &data->state)) { - percpu_ref_put(&data->refs); + if (ref_switch) percpu_ref_switch_to_atomic(&data->refs, io_atomic_switch); - } return done ? done : err; } From fda31c50292a5062332fa0343c084bd9f46604d9 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Mon, 24 Feb 2020 12:47:14 -0800 Subject: [PATCH 194/400] signal: avoid double atomic counter increments for user accounting When queueing a signal, we increment both the users count of pending signals (for RLIMIT_SIGPENDING tracking) and we increment the refcount of the user struct itself (because we keep a reference to the user in the signal structure in order to correctly account for it when freeing). That turns out to be fairly expensive, because both of them are atomic updates, and particularly under extreme signal handling pressure on big machines, you can get a lot of cache contention on the user struct. That can then cause horrid cacheline ping-pong when you do these multiple accesses. So change the reference counting to only pin the user for the _first_ pending signal, and to unpin it when the last pending signal is dequeued. That means that when a user sees a lot of concurrent signal queuing - which is the only situation when this matters - the only atomic access needed is generally the 'sigpending' count update. This was noticed because of a particularly odd timing artifact on a dual-socket 96C/192T Cascade Lake platform: when you get into bad contention, on that machine for some reason seems to be much worse when the contention happens in the upper 32-byte half of the cacheline. As a result, the kernel test robot will-it-scale 'signal1' benchmark had an odd performance regression simply due to random alignment of the 'struct user_struct' (and pointed to a completely unrelated and apparently nonsensical commit for the regression). Avoiding the double increments (and decrements on the dequeueing side, of course) makes for much less contention and hugely improved performance on that will-it-scale microbenchmark. Quoting Feng Tang: "It makes a big difference, that the performance score is tripled! bump from original 17000 to 54000. Also the gap between 5.0-rc6 and 5.0-rc6+Jiri's patch is reduced to around 2%" [ The "2% gap" is the odd cacheline placement difference on that platform: under the extreme contention case, the effect of which half of the cacheline was hot was 5%, so with the reduced contention the odd timing artifact is reduced too ] It does help in the non-contended case too, but is not nearly as noticeable. Reported-and-tested-by: Feng Tang Cc: Eric W. Biederman Cc: Huang, Ying Cc: Philip Li Cc: Andi Kleen Cc: Jiri Olsa Cc: Peter Zijlstra Signed-off-by: Linus Torvalds --- kernel/signal.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/kernel/signal.c b/kernel/signal.c index 9ad8dea93dbb..5b2396350dd1 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -413,27 +413,32 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimi { struct sigqueue *q = NULL; struct user_struct *user; + int sigpending; /* * Protect access to @t credentials. This can go away when all * callers hold rcu read lock. + * + * NOTE! A pending signal will hold on to the user refcount, + * and we get/put the refcount only when the sigpending count + * changes from/to zero. */ rcu_read_lock(); - user = get_uid(__task_cred(t)->user); - atomic_inc(&user->sigpending); + user = __task_cred(t)->user; + sigpending = atomic_inc_return(&user->sigpending); + if (sigpending == 1) + get_uid(user); rcu_read_unlock(); - if (override_rlimit || - atomic_read(&user->sigpending) <= - task_rlimit(t, RLIMIT_SIGPENDING)) { + if (override_rlimit || likely(sigpending <= task_rlimit(t, RLIMIT_SIGPENDING))) { q = kmem_cache_alloc(sigqueue_cachep, flags); } else { print_dropped_signal(sig); } if (unlikely(q == NULL)) { - atomic_dec(&user->sigpending); - free_uid(user); + if (atomic_dec_and_test(&user->sigpending)) + free_uid(user); } else { INIT_LIST_HEAD(&q->list); q->flags = 0; @@ -447,8 +452,8 @@ static void __sigqueue_free(struct sigqueue *q) { if (q->flags & SIGQUEUE_PREALLOC) return; - atomic_dec(&q->user->sigpending); - free_uid(q->user); + if (atomic_dec_and_test(&q->user->sigpending)) + free_uid(q->user); kmem_cache_free(sigqueue_cachep, q); } From cfe2ce49b9da3959015e94b08f7494ade3ee0c49 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Wed, 26 Feb 2020 07:39:29 -0800 Subject: [PATCH 195/400] Revert "KVM: x86: enable -Werror" MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This reverts commit ead68df94d248c80fdbae220ae5425eb5af2e753. Using the -Werror flag breaks the build for me due to mostly harmless KASAN or similar warnings: arch/x86/kvm/x86.c: In function ‘kvm_timer_init’: arch/x86/kvm/x86.c:7209:1: error: the frame size of 1112 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] Feel free to add a CONFIG_WERROR if you care strong enough, but don't break peoples builds for absolutely no good reason. Signed-off-by: Christoph Hellwig Signed-off-by: Linus Torvalds --- arch/x86/kvm/Makefile | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index 4654e97a05cc..b19ef421084d 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -1,7 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 ccflags-y += -Iarch/x86/kvm -ccflags-y += -Werror KVM := ../../../virt/kvm From 683f65ded66a9a7ff01ed7280804d2132ebfdf7e Mon Sep 17 00:00:00 2001 From: Evan Green Date: Tue, 11 Feb 2020 14:37:00 -0800 Subject: [PATCH 196/400] spi: pxa2xx: Add CS control clock quirk In some circumstances on Intel LPSS controllers, toggling the LPSS CS control register doesn't actually cause the CS line to toggle. This seems to be failure of dynamic clock gating that occurs after going through a suspend/resume transition, where the controller is sent through a reset transition. This ruins SPI transactions that either rely on delay_usecs, or toggle the CS line without sending data. Whenever CS is toggled, momentarily set the clock gating register to "Force On" to poke the controller into acting on CS. Signed-off-by: Rajat Jain Signed-off-by: Evan Green Link: https://lore.kernel.org/r/20200211223700.110252-1-rajatja@google.com Signed-off-by: Mark Brown --- drivers/spi/spi-pxa2xx.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/drivers/spi/spi-pxa2xx.c b/drivers/spi/spi-pxa2xx.c index 9071333ebdd8..cabd1a85d71e 100644 --- a/drivers/spi/spi-pxa2xx.c +++ b/drivers/spi/spi-pxa2xx.c @@ -70,6 +70,10 @@ MODULE_ALIAS("platform:pxa2xx-spi"); #define LPSS_CAPS_CS_EN_SHIFT 9 #define LPSS_CAPS_CS_EN_MASK (0xf << LPSS_CAPS_CS_EN_SHIFT) +#define LPSS_PRIV_CLOCK_GATE 0x38 +#define LPSS_PRIV_CLOCK_GATE_CLK_CTL_MASK 0x3 +#define LPSS_PRIV_CLOCK_GATE_CLK_CTL_FORCE_ON 0x3 + struct lpss_config { /* LPSS offset from drv_data->ioaddr */ unsigned offset; @@ -86,6 +90,8 @@ struct lpss_config { unsigned cs_sel_shift; unsigned cs_sel_mask; unsigned cs_num; + /* Quirks */ + unsigned cs_clk_stays_gated : 1; }; /* Keep these sorted with enum pxa_ssp_type */ @@ -156,6 +162,7 @@ static const struct lpss_config lpss_platforms[] = { .tx_threshold_hi = 56, .cs_sel_shift = 8, .cs_sel_mask = 3 << 8, + .cs_clk_stays_gated = true, }, }; @@ -383,6 +390,22 @@ static void lpss_ssp_cs_control(struct spi_device *spi, bool enable) else value |= LPSS_CS_CONTROL_CS_HIGH; __lpss_ssp_write_priv(drv_data, config->reg_cs_ctrl, value); + if (config->cs_clk_stays_gated) { + u32 clkgate; + + /* + * Changing CS alone when dynamic clock gating is on won't + * actually flip CS at that time. This ruins SPI transfers + * that specify delays, or have no data. Toggle the clock mode + * to force on briefly to poke the CS pin to move. + */ + clkgate = __lpss_ssp_read_priv(drv_data, LPSS_PRIV_CLOCK_GATE); + value = (clkgate & ~LPSS_PRIV_CLOCK_GATE_CLK_CTL_MASK) | + LPSS_PRIV_CLOCK_GATE_CLK_CTL_FORCE_ON; + + __lpss_ssp_write_priv(drv_data, LPSS_PRIV_CLOCK_GATE, value); + __lpss_ssp_write_priv(drv_data, LPSS_PRIV_CLOCK_GATE, clkgate); + } } static void cs_assert(struct spi_device *spi) From 8a3bddf67ce88b96531fb22c5a75d7f4dc41d155 Mon Sep 17 00:00:00 2001 From: Daniel Vetter Date: Sat, 22 Feb 2020 18:54:31 +0100 Subject: [PATCH 197/400] drm/amdgpu: Drop DRIVER_USE_AGP MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This doesn't do anything except auto-init drm_agp support when you call drm_get_pci_dev(). Which amdgpu stopped doing with commit b58c11314a1706bf094c489ef5cb28f76478c704 Author: Alex Deucher Date: Fri Jun 2 17:16:31 2017 -0400 drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev No idea whether this was intentional or accidental breakage, but I guess anyone who manages to boot a this modern gpu behind an agp bridge deserves a price. A price I never expect anyone to ever collect :-) Cc: Alex Deucher Cc: "Christian König" Cc: Hawking Zhang Cc: Xiaojie Yuan Cc: Evan Quan Cc: "Tianci.Yin" Cc: "Marek Olšák" Cc: Hans de Goede Reviewed-by: Emil Velikov Reviewed-by: Alex Deucher Signed-off-by: Daniel Vetter Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 94e2fd758e01..42f4febe24c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1389,7 +1389,7 @@ amdgpu_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe, static struct drm_driver kms_driver = { .driver_features = - DRIVER_USE_AGP | DRIVER_ATOMIC | + DRIVER_ATOMIC | DRIVER_GEM | DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE, From eb12c957735b582607e5842a06d1f4c62e185c1d Mon Sep 17 00:00:00 2001 From: Daniel Vetter Date: Sat, 22 Feb 2020 18:54:32 +0100 Subject: [PATCH 198/400] drm/radeon: Inline drm_get_pci_dev MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit It's the last user, and more importantly, it's the last non-legacy user of anything in drm_pci.c. The only tricky bit is the agp initialization. But a close look shows that radeon does not use the drm_agp midlayer (the main use of that is drm_bufs for legacy drivers), and instead could use the agp subsystem directly (like nouveau does already). Hence we can just pull this in too. A further step would be to entirely drop the use of drm_device->agp, but feels like too much churn just for this patch. Signed-off-by: Daniel Vetter Cc: Alex Deucher Cc: "Christian König" Cc: "David (ChunMing) Zhou" Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Alex Deucher Reviewed-by: Emil Velikov Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/radeon/radeon_drv.c | 43 +++++++++++++++++++++++++++-- drivers/gpu/drm/radeon/radeon_kms.c | 6 ++++ 2 files changed, 47 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index fd74e2611185..8696af1ee14d 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -37,6 +37,7 @@ #include #include +#include #include #include #include @@ -325,6 +326,7 @@ static int radeon_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { unsigned long flags = 0; + struct drm_device *dev; int ret; if (!ent) @@ -365,7 +367,44 @@ static int radeon_pci_probe(struct pci_dev *pdev, if (ret) return ret; - return drm_get_pci_dev(pdev, ent, &kms_driver); + dev = drm_dev_alloc(&kms_driver, &pdev->dev); + if (IS_ERR(dev)) + return PTR_ERR(dev); + + ret = pci_enable_device(pdev); + if (ret) + goto err_free; + + dev->pdev = pdev; +#ifdef __alpha__ + dev->hose = pdev->sysdata; +#endif + + pci_set_drvdata(pdev, dev); + + if (pci_find_capability(dev->pdev, PCI_CAP_ID_AGP)) + dev->agp = drm_agp_init(dev); + if (dev->agp) { + dev->agp->agp_mtrr = arch_phys_wc_add( + dev->agp->agp_info.aper_base, + dev->agp->agp_info.aper_size * + 1024 * 1024); + } + + ret = drm_dev_register(dev, ent->driver_data); + if (ret) + goto err_agp; + + return 0; + +err_agp: + if (dev->agp) + arch_phys_wc_del(dev->agp->agp_mtrr); + kfree(dev->agp); + pci_disable_device(pdev); +err_free: + drm_dev_put(dev); + return ret; } static void @@ -575,7 +614,7 @@ radeon_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe, static struct drm_driver kms_driver = { .driver_features = - DRIVER_USE_AGP | DRIVER_GEM | DRIVER_RENDER, + DRIVER_GEM | DRIVER_RENDER, .load = radeon_driver_load_kms, .open = radeon_driver_open_kms, .postclose = radeon_driver_postclose_kms, diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c index d24f23a81656..dd2f19b8022b 100644 --- a/drivers/gpu/drm/radeon/radeon_kms.c +++ b/drivers/gpu/drm/radeon/radeon_kms.c @@ -32,6 +32,7 @@ #include #include +#include #include #include #include @@ -77,6 +78,11 @@ void radeon_driver_unload_kms(struct drm_device *dev) radeon_modeset_fini(rdev); radeon_device_fini(rdev); + if (dev->agp) + arch_phys_wc_del(dev->agp->agp_mtrr); + kfree(dev->agp); + dev->agp = NULL; + done_free: kfree(rdev); dev->dev_private = NULL; From 6e11d1578fba8d09d03a286740ffcf336d53928c Mon Sep 17 00:00:00 2001 From: Amritha Nambiar Date: Mon, 24 Feb 2020 10:56:00 -0800 Subject: [PATCH 199/400] net: Fix Tx hash bound checking Fixes the lower and upper bounds when there are multiple TCs and traffic is on the the same TC on the same device. The lower bound is represented by 'qoffset' and the upper limit for hash value is 'qcount + qoffset'. This gives a clean Rx to Tx queue mapping when there are multiple TCs, as the queue indices for upper TCs will be offset by 'qoffset'. v2: Fixed commit description based on comments. Fixes: 1b837d489e06 ("net: Revoke export for __skb_tx_hash, update it to just be static skb_tx_hash") Fixes: eadec877ce9c ("net: Add support for subordinate traffic classes to netdev_pick_tx") Signed-off-by: Amritha Nambiar Reviewed-by: Alexander Duyck Reviewed-by: Sridhar Samudrala Signed-off-by: David S. Miller --- net/core/dev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/core/dev.c b/net/core/dev.c index e10bd680dc03..c6c985fe7b1b 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -3076,6 +3076,8 @@ static u16 skb_tx_hash(const struct net_device *dev, if (skb_rx_queue_recorded(skb)) { hash = skb_get_rx_queue(skb); + if (hash >= qoffset) + hash -= qoffset; while (unlikely(hash >= qcount)) hash -= qcount; return hash + qoffset; From e34f1753eebc428c312527662eb1b529cf260240 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Mon, 24 Feb 2020 20:42:12 +0100 Subject: [PATCH 200/400] ethtool: limit bitset size Syzbot reported that ethnl_compact_sanity_checks() can be tricked into reading past the end of ETHTOOL_A_BITSET_VALUE and ETHTOOL_A_BITSET_MASK attributes and even the message by passing a value between (u32)(-31) and (u32)(-1) as ETHTOOL_A_BITSET_SIZE. The problem is that DIV_ROUND_UP(attr_nbits, 32) is 0 for such values so that zero length ETHTOOL_A_BITSET_VALUE will pass the length check but ethnl_bitmap32_not_zero() check would try to access up to 512 MB of attribute "payload". Prevent this overflow byt limiting the bitset size. Technically, compact bitset format would allow bitset sizes up to almost 2^18 (so that the nest size does not exceed U16_MAX) but bitsets used by ethtool are much shorter. S16_MAX, the largest value which can be directly used as an upper limit in policy, should be a reasonable compromise. Fixes: 10b518d4e6dd ("ethtool: netlink bitset handling") Reported-by: syzbot+7fd4ed5b4234ab1fdccd@syzkaller.appspotmail.com Reported-by: syzbot+709b7a64d57978247e44@syzkaller.appspotmail.com Reported-by: syzbot+983cb8fb2d17a7af549d@syzkaller.appspotmail.com Signed-off-by: Michal Kubecek Signed-off-by: David S. Miller --- net/ethtool/bitset.c | 3 ++- net/ethtool/bitset.h | 2 ++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/net/ethtool/bitset.c b/net/ethtool/bitset.c index 8977fe1f3946..ef9197541cb3 100644 --- a/net/ethtool/bitset.c +++ b/net/ethtool/bitset.c @@ -305,7 +305,8 @@ nla_put_failure: static const struct nla_policy bitset_policy[ETHTOOL_A_BITSET_MAX + 1] = { [ETHTOOL_A_BITSET_UNSPEC] = { .type = NLA_REJECT }, [ETHTOOL_A_BITSET_NOMASK] = { .type = NLA_FLAG }, - [ETHTOOL_A_BITSET_SIZE] = { .type = NLA_U32 }, + [ETHTOOL_A_BITSET_SIZE] = NLA_POLICY_MAX(NLA_U32, + ETHNL_MAX_BITSET_SIZE), [ETHTOOL_A_BITSET_BITS] = { .type = NLA_NESTED }, [ETHTOOL_A_BITSET_VALUE] = { .type = NLA_BINARY }, [ETHTOOL_A_BITSET_MASK] = { .type = NLA_BINARY }, diff --git a/net/ethtool/bitset.h b/net/ethtool/bitset.h index b8247e34109d..b849f9d19676 100644 --- a/net/ethtool/bitset.h +++ b/net/ethtool/bitset.h @@ -3,6 +3,8 @@ #ifndef _NET_ETHTOOL_BITSET_H #define _NET_ETHTOOL_BITSET_H +#define ETHNL_MAX_BITSET_SIZE S16_MAX + typedef const char (*const ethnl_string_array_t)[ETH_GSTRING_LEN]; int ethnl_bitset_is_compact(const struct nlattr *bitset, bool *compact); From 99b79c3900d4627672c85d9f344b5b0f06bc2a4d Mon Sep 17 00:00:00 2001 From: Cong Wang Date: Wed, 12 Feb 2020 22:53:52 -0800 Subject: [PATCH 201/400] netfilter: xt_hashlimit: unregister proc file before releasing mutex Before releasing the global mutex, we only unlink the hashtable from the hash list, its proc file is still not unregistered at this point. So syzbot could trigger a race condition where a parallel htable_create() could register the same file immediately after the mutex is released. Move htable_remove_proc_entry() back to mutex protection to fix this. And, fold htable_destroy() into htable_put() to make the code slightly easier to understand. Reported-and-tested-by: syzbot+d195fd3b9a364ddd6731@syzkaller.appspotmail.com Fixes: c4a3922d2d20 ("netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()") Signed-off-by: Cong Wang Signed-off-by: Pablo Neira Ayuso --- net/netfilter/xt_hashlimit.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c index 7a2c4b8408c4..8c835ad63729 100644 --- a/net/netfilter/xt_hashlimit.c +++ b/net/netfilter/xt_hashlimit.c @@ -402,15 +402,6 @@ static void htable_remove_proc_entry(struct xt_hashlimit_htable *hinfo) remove_proc_entry(hinfo->name, parent); } -static void htable_destroy(struct xt_hashlimit_htable *hinfo) -{ - cancel_delayed_work_sync(&hinfo->gc_work); - htable_remove_proc_entry(hinfo); - htable_selective_cleanup(hinfo, true); - kfree(hinfo->name); - vfree(hinfo); -} - static struct xt_hashlimit_htable *htable_find_get(struct net *net, const char *name, u_int8_t family) @@ -432,8 +423,13 @@ static void htable_put(struct xt_hashlimit_htable *hinfo) { if (refcount_dec_and_mutex_lock(&hinfo->use, &hashlimit_mutex)) { hlist_del(&hinfo->node); + htable_remove_proc_entry(hinfo); mutex_unlock(&hashlimit_mutex); - htable_destroy(hinfo); + + cancel_delayed_work_sync(&hinfo->gc_work); + htable_selective_cleanup(hinfo, true); + kfree(hinfo->name); + vfree(hinfo); } } From 9a005c3898aa07cd5cdca77b7096814e6c478c92 Mon Sep 17 00:00:00 2001 From: Jonathan Lemon Date: Mon, 24 Feb 2020 15:29:09 -0800 Subject: [PATCH 202/400] bnxt_en: add newline to netdev_*() format strings Add missing newlines to netdev_* format strings so the lines aren't buffered by the printk subsystem. Nitpicked-by: Jakub Kicinski Signed-off-by: Jonathan Lemon Acked-by: Michael Chan Signed-off-by: David S. Miller --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +- .../net/ethernet/broadcom/bnxt/bnxt_devlink.c | 10 ++-- .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 4 +- drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 48 +++++++++---------- drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c | 10 ++-- 5 files changed, 38 insertions(+), 38 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index fd6e0e48cd51..f9a8151f092c 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -11252,7 +11252,7 @@ static void bnxt_cfg_ntp_filters(struct bnxt *bp) } } if (test_and_clear_bit(BNXT_HWRM_PF_UNLOAD_SP_EVENT, &bp->sp_event)) - netdev_info(bp->dev, "Receive PF driver unload event!"); + netdev_info(bp->dev, "Receive PF driver unload event!\n"); } #else @@ -11759,7 +11759,7 @@ static int bnxt_pcie_dsn_get(struct bnxt *bp, u8 dsn[]) u32 dw; if (!pos) { - netdev_info(bp->dev, "Unable do read adapter's DSN"); + netdev_info(bp->dev, "Unable do read adapter's DSN\n"); return -EOPNOTSUPP; } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c index eec0168330b7..d3c93ccee86a 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c @@ -641,14 +641,14 @@ static int bnxt_dl_params_register(struct bnxt *bp) rc = devlink_params_register(bp->dl, bnxt_dl_params, ARRAY_SIZE(bnxt_dl_params)); if (rc) { - netdev_warn(bp->dev, "devlink_params_register failed. rc=%d", + netdev_warn(bp->dev, "devlink_params_register failed. rc=%d\n", rc); return rc; } rc = devlink_port_params_register(&bp->dl_port, bnxt_dl_port_params, ARRAY_SIZE(bnxt_dl_port_params)); if (rc) { - netdev_err(bp->dev, "devlink_port_params_register failed"); + netdev_err(bp->dev, "devlink_port_params_register failed\n"); devlink_params_unregister(bp->dl, bnxt_dl_params, ARRAY_SIZE(bnxt_dl_params)); return rc; @@ -679,7 +679,7 @@ int bnxt_dl_register(struct bnxt *bp) else dl = devlink_alloc(&bnxt_vf_dl_ops, sizeof(struct bnxt_dl)); if (!dl) { - netdev_warn(bp->dev, "devlink_alloc failed"); + netdev_warn(bp->dev, "devlink_alloc failed\n"); return -ENOMEM; } @@ -692,7 +692,7 @@ int bnxt_dl_register(struct bnxt *bp) rc = devlink_register(dl, &bp->pdev->dev); if (rc) { - netdev_warn(bp->dev, "devlink_register failed. rc=%d", rc); + netdev_warn(bp->dev, "devlink_register failed. rc=%d\n", rc); goto err_dl_free; } @@ -704,7 +704,7 @@ int bnxt_dl_register(struct bnxt *bp) sizeof(bp->dsn)); rc = devlink_port_register(dl, &bp->dl_port, bp->pf.port_id); if (rc) { - netdev_err(bp->dev, "devlink_port_register failed"); + netdev_err(bp->dev, "devlink_port_register failed\n"); goto err_dl_unreg; } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index 6171fa8b3677..e8fc1671c581 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -2028,7 +2028,7 @@ int bnxt_flash_package_from_file(struct net_device *dev, const char *filename, } if (fw->size > item_len) { - netdev_err(dev, "PKG insufficient update area in nvram: %lu", + netdev_err(dev, "PKG insufficient update area in nvram: %lu\n", (unsigned long)fw->size); rc = -EFBIG; } else { @@ -3338,7 +3338,7 @@ err: kfree(coredump.data); *dump_len += sizeof(struct bnxt_coredump_record); if (rc == -ENOBUFS) - netdev_err(bp->dev, "Firmware returned large coredump buffer"); + netdev_err(bp->dev, "Firmware returned large coredump buffer\n"); return rc; } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c index 0cc6ec51f45f..9bec256b0934 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c @@ -50,7 +50,7 @@ static u16 bnxt_flow_get_dst_fid(struct bnxt *pf_bp, struct net_device *dev) /* check if dev belongs to the same switch */ if (!netdev_port_same_parent_id(pf_bp->dev, dev)) { - netdev_info(pf_bp->dev, "dev(ifindex=%d) not on same switch", + netdev_info(pf_bp->dev, "dev(ifindex=%d) not on same switch\n", dev->ifindex); return BNXT_FID_INVALID; } @@ -70,7 +70,7 @@ static int bnxt_tc_parse_redir(struct bnxt *bp, struct net_device *dev = act->dev; if (!dev) { - netdev_info(bp->dev, "no dev in mirred action"); + netdev_info(bp->dev, "no dev in mirred action\n"); return -EINVAL; } @@ -106,7 +106,7 @@ static int bnxt_tc_parse_tunnel_set(struct bnxt *bp, const struct ip_tunnel_key *tun_key = &tun_info->key; if (ip_tunnel_info_af(tun_info) != AF_INET) { - netdev_info(bp->dev, "only IPv4 tunnel-encap is supported"); + netdev_info(bp->dev, "only IPv4 tunnel-encap is supported\n"); return -EOPNOTSUPP; } @@ -295,7 +295,7 @@ static int bnxt_tc_parse_actions(struct bnxt *bp, int i, rc; if (!flow_action_has_entries(flow_action)) { - netdev_info(bp->dev, "no actions"); + netdev_info(bp->dev, "no actions\n"); return -EINVAL; } @@ -370,7 +370,7 @@ static int bnxt_tc_parse_flow(struct bnxt *bp, /* KEY_CONTROL and KEY_BASIC are needed for forming a meaningful key */ if ((dissector->used_keys & BIT(FLOW_DISSECTOR_KEY_CONTROL)) == 0 || (dissector->used_keys & BIT(FLOW_DISSECTOR_KEY_BASIC)) == 0) { - netdev_info(bp->dev, "cannot form TC key: used_keys = 0x%x", + netdev_info(bp->dev, "cannot form TC key: used_keys = 0x%x\n", dissector->used_keys); return -EOPNOTSUPP; } @@ -508,7 +508,7 @@ static int bnxt_hwrm_cfa_flow_free(struct bnxt *bp, rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT); if (rc) - netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc); + netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc); return rc; } @@ -841,7 +841,7 @@ static int hwrm_cfa_decap_filter_alloc(struct bnxt *bp, resp = bnxt_get_hwrm_resp_addr(bp, &req); *decap_filter_handle = resp->decap_filter_id; } else { - netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc); + netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc); } mutex_unlock(&bp->hwrm_cmd_lock); @@ -859,7 +859,7 @@ static int hwrm_cfa_decap_filter_free(struct bnxt *bp, rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT); if (rc) - netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc); + netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc); return rc; } @@ -906,7 +906,7 @@ static int hwrm_cfa_encap_record_alloc(struct bnxt *bp, resp = bnxt_get_hwrm_resp_addr(bp, &req); *encap_record_handle = resp->encap_record_id; } else { - netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc); + netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc); } mutex_unlock(&bp->hwrm_cmd_lock); @@ -924,7 +924,7 @@ static int hwrm_cfa_encap_record_free(struct bnxt *bp, rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT); if (rc) - netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc); + netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc); return rc; } @@ -943,7 +943,7 @@ static int bnxt_tc_put_l2_node(struct bnxt *bp, tc_info->l2_ht_params); if (rc) netdev_err(bp->dev, - "Error: %s: rhashtable_remove_fast: %d", + "Error: %s: rhashtable_remove_fast: %d\n", __func__, rc); kfree_rcu(l2_node, rcu); } @@ -972,7 +972,7 @@ bnxt_tc_get_l2_node(struct bnxt *bp, struct rhashtable *l2_table, if (rc) { kfree_rcu(l2_node, rcu); netdev_err(bp->dev, - "Error: %s: rhashtable_insert_fast: %d", + "Error: %s: rhashtable_insert_fast: %d\n", __func__, rc); return NULL; } @@ -1031,7 +1031,7 @@ static bool bnxt_tc_can_offload(struct bnxt *bp, struct bnxt_tc_flow *flow) if ((flow->flags & BNXT_TC_FLOW_FLAGS_PORTS) && (flow->l4_key.ip_proto != IPPROTO_TCP && flow->l4_key.ip_proto != IPPROTO_UDP)) { - netdev_info(bp->dev, "Cannot offload non-TCP/UDP (%d) ports", + netdev_info(bp->dev, "Cannot offload non-TCP/UDP (%d) ports\n", flow->l4_key.ip_proto); return false; } @@ -1088,7 +1088,7 @@ static int bnxt_tc_put_tunnel_node(struct bnxt *bp, rc = rhashtable_remove_fast(tunnel_table, &tunnel_node->node, *ht_params); if (rc) { - netdev_err(bp->dev, "rhashtable_remove_fast rc=%d", rc); + netdev_err(bp->dev, "rhashtable_remove_fast rc=%d\n", rc); rc = -1; } kfree_rcu(tunnel_node, rcu); @@ -1129,7 +1129,7 @@ bnxt_tc_get_tunnel_node(struct bnxt *bp, struct rhashtable *tunnel_table, tunnel_node->refcount++; return tunnel_node; err: - netdev_info(bp->dev, "error rc=%d", rc); + netdev_info(bp->dev, "error rc=%d\n", rc); return NULL; } @@ -1187,7 +1187,7 @@ static void bnxt_tc_put_decap_l2_node(struct bnxt *bp, &decap_l2_node->node, tc_info->decap_l2_ht_params); if (rc) - netdev_err(bp->dev, "rhashtable_remove_fast rc=%d", rc); + netdev_err(bp->dev, "rhashtable_remove_fast rc=%d\n", rc); kfree_rcu(decap_l2_node, rcu); } } @@ -1227,7 +1227,7 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp, rt = ip_route_output_key(dev_net(real_dst_dev), &flow); if (IS_ERR(rt)) { - netdev_info(bp->dev, "no route to %pI4b", &flow.daddr); + netdev_info(bp->dev, "no route to %pI4b\n", &flow.daddr); return -EOPNOTSUPP; } @@ -1241,7 +1241,7 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp, if (vlan->real_dev != real_dst_dev) { netdev_info(bp->dev, - "dst_dev(%s) doesn't use PF-if(%s)", + "dst_dev(%s) doesn't use PF-if(%s)\n", netdev_name(dst_dev), netdev_name(real_dst_dev)); rc = -EOPNOTSUPP; @@ -1253,7 +1253,7 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp, #endif } else if (dst_dev != real_dst_dev) { netdev_info(bp->dev, - "dst_dev(%s) for %pI4b is not PF-if(%s)", + "dst_dev(%s) for %pI4b is not PF-if(%s)\n", netdev_name(dst_dev), &flow.daddr, netdev_name(real_dst_dev)); rc = -EOPNOTSUPP; @@ -1262,7 +1262,7 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp, nbr = dst_neigh_lookup(&rt->dst, &flow.daddr); if (!nbr) { - netdev_info(bp->dev, "can't lookup neighbor for %pI4b", + netdev_info(bp->dev, "can't lookup neighbor for %pI4b\n", &flow.daddr); rc = -EOPNOTSUPP; goto put_rt; @@ -1472,7 +1472,7 @@ static int __bnxt_tc_del_flow(struct bnxt *bp, rc = rhashtable_remove_fast(&tc_info->flow_table, &flow_node->node, tc_info->flow_ht_params); if (rc) - netdev_err(bp->dev, "Error: %s: rhashtable_remove_fast rc=%d", + netdev_err(bp->dev, "Error: %s: rhashtable_remove_fast rc=%d\n", __func__, rc); kfree_rcu(flow_node, rcu); @@ -1587,7 +1587,7 @@ unlock: free_node: kfree_rcu(new_node, rcu); done: - netdev_err(bp->dev, "Error: %s: cookie=0x%lx error=%d", + netdev_err(bp->dev, "Error: %s: cookie=0x%lx error=%d\n", __func__, tc_flow_cmd->cookie, rc); return rc; } @@ -1700,7 +1700,7 @@ bnxt_hwrm_cfa_flow_stats_get(struct bnxt *bp, int num_flows, le64_to_cpu(resp_bytes[i]); } } else { - netdev_info(bp->dev, "error rc=%d", rc); + netdev_info(bp->dev, "error rc=%d\n", rc); } mutex_unlock(&bp->hwrm_cmd_lock); @@ -1970,7 +1970,7 @@ static int bnxt_tc_indr_block_event(struct notifier_block *nb, bp); if (rc) netdev_info(bp->dev, - "Failed to register indirect blk: dev: %s", + "Failed to register indirect blk: dev: %s\n", netdev->name); break; case NETDEV_UNREGISTER: diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c index b010b34cdaf8..6f2faf81c1ae 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c @@ -43,7 +43,7 @@ static int hwrm_cfa_vfr_alloc(struct bnxt *bp, u16 vf_idx, netdev_dbg(bp->dev, "tx_cfa_action=0x%x, rx_cfa_code=0x%x", *tx_cfa_action, *rx_cfa_code); } else { - netdev_info(bp->dev, "%s error rc=%d", __func__, rc); + netdev_info(bp->dev, "%s error rc=%d\n", __func__, rc); } mutex_unlock(&bp->hwrm_cmd_lock); @@ -60,7 +60,7 @@ static int hwrm_cfa_vfr_free(struct bnxt *bp, u16 vf_idx) rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT); if (rc) - netdev_info(bp->dev, "%s error rc=%d", __func__, rc); + netdev_info(bp->dev, "%s error rc=%d\n", __func__, rc); return rc; } @@ -465,7 +465,7 @@ static int bnxt_vf_reps_create(struct bnxt *bp) return 0; err: - netdev_info(bp->dev, "%s error=%d", __func__, rc); + netdev_info(bp->dev, "%s error=%d\n", __func__, rc); kfree(cfa_code_map); __bnxt_vf_reps_destroy(bp); return rc; @@ -488,7 +488,7 @@ int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode, mutex_lock(&bp->sriov_lock); if (bp->eswitch_mode == mode) { - netdev_info(bp->dev, "already in %s eswitch mode", + netdev_info(bp->dev, "already in %s eswitch mode\n", mode == DEVLINK_ESWITCH_MODE_LEGACY ? "legacy" : "switchdev"); rc = -EINVAL; @@ -508,7 +508,7 @@ int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode, } if (pci_num_vf(bp->pdev) == 0) { - netdev_info(bp->dev, "Enable VFs before setting switchdev mode"); + netdev_info(bp->dev, "Enable VFs before setting switchdev mode\n"); rc = -EPERM; goto done; } From 98c5f7d44fef309e692c24c6d71131ee0f0871fb Mon Sep 17 00:00:00 2001 From: Florian Fainelli Date: Mon, 24 Feb 2020 15:56:32 -0800 Subject: [PATCH 203/400] net: dsa: bcm_sf2: Forcibly configure IMP port for 1Gb/sec We are still experiencing some packet loss with the existing advanced congestion buffering (ACB) settings with the IMP port configured for 2Gb/sec, so revert to conservative link speeds that do not produce packet loss until this is resolved. Fixes: 8f1880cbe8d0 ("net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec") Fixes: de34d7084edd ("net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port") Signed-off-by: Florian Fainelli Reviewed-by: Vivien Didelot Signed-off-by: David S. Miller --- drivers/net/dsa/bcm_sf2.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c index d1955543acd1..b0f5280a83cb 100644 --- a/drivers/net/dsa/bcm_sf2.c +++ b/drivers/net/dsa/bcm_sf2.c @@ -69,8 +69,7 @@ static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port) /* Force link status for IMP port */ reg = core_readl(priv, offset); reg |= (MII_SW_OR | LINK_STS); - if (priv->type == BCM7278_DEVICE_ID) - reg |= GMII_SPEED_UP_2G; + reg &= ~GMII_SPEED_UP_2G; core_writel(priv, reg, offset); /* Enable Broadcast, Multicast, Unicast forwarding to IMP port */ From 2eb51c75dcb354f8aef03d7648318b24630632e1 Mon Sep 17 00:00:00 2001 From: Madhuparna Bhowmik Date: Tue, 25 Feb 2020 17:57:45 +0530 Subject: [PATCH 204/400] net: core: devlink.c: Use built-in RCU list checking list_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled. The devlink->lock is held when devlink_dpipe_table_find() is called in non RCU read side section. Therefore, pass struct devlink to devlink_dpipe_table_find() for lockdep checking. Signed-off-by: Madhuparna Bhowmik Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller --- net/core/devlink.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/net/core/devlink.c b/net/core/devlink.c index 8d0b558be942..5e220809844c 100644 --- a/net/core/devlink.c +++ b/net/core/devlink.c @@ -2103,11 +2103,11 @@ err_action_values_put: static struct devlink_dpipe_table * devlink_dpipe_table_find(struct list_head *dpipe_tables, - const char *table_name) + const char *table_name, struct devlink *devlink) { struct devlink_dpipe_table *table; - - list_for_each_entry_rcu(table, dpipe_tables, list) { + list_for_each_entry_rcu(table, dpipe_tables, list, + lockdep_is_held(&devlink->lock)) { if (!strcmp(table->name, table_name)) return table; } @@ -2226,7 +2226,7 @@ static int devlink_nl_cmd_dpipe_entries_get(struct sk_buff *skb, table_name = nla_data(info->attrs[DEVLINK_ATTR_DPIPE_TABLE_NAME]); table = devlink_dpipe_table_find(&devlink->dpipe_table_list, - table_name); + table_name, devlink); if (!table) return -EINVAL; @@ -2382,7 +2382,7 @@ static int devlink_dpipe_table_counters_set(struct devlink *devlink, struct devlink_dpipe_table *table; table = devlink_dpipe_table_find(&devlink->dpipe_table_list, - table_name); + table_name, devlink); if (!table) return -EINVAL; @@ -6854,7 +6854,7 @@ bool devlink_dpipe_table_counter_enabled(struct devlink *devlink, rcu_read_lock(); table = devlink_dpipe_table_find(&devlink->dpipe_table_list, - table_name); + table_name, devlink); enabled = false; if (table) enabled = table->counters_enabled; @@ -6885,7 +6885,8 @@ int devlink_dpipe_table_register(struct devlink *devlink, mutex_lock(&devlink->lock); - if (devlink_dpipe_table_find(&devlink->dpipe_table_list, table_name)) { + if (devlink_dpipe_table_find(&devlink->dpipe_table_list, table_name, + devlink)) { err = -EEXIST; goto unlock; } @@ -6921,7 +6922,7 @@ void devlink_dpipe_table_unregister(struct devlink *devlink, mutex_lock(&devlink->lock); table = devlink_dpipe_table_find(&devlink->dpipe_table_list, - table_name); + table_name, devlink); if (!table) goto unlock; list_del_rcu(&table->list); @@ -7078,7 +7079,7 @@ int devlink_dpipe_table_resource_set(struct devlink *devlink, mutex_lock(&devlink->lock); table = devlink_dpipe_table_find(&devlink->dpipe_table_list, - table_name); + table_name, devlink); if (!table) { err = -EINVAL; goto out; From eabc8bcb292fb9a5757b0c8ab7751f41b0a104f8 Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Thu, 27 Feb 2020 02:44:58 +0900 Subject: [PATCH 205/400] kbuild: get rid of trailing slash from subdir- example obj-* needs a trailing slash for a directory, but subdir-* does not. Signed-off-by: Masahiro Yamada --- Documentation/kbuild/makefiles.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst index 4018ad7c7a11..6bc126a14b3d 100644 --- a/Documentation/kbuild/makefiles.rst +++ b/Documentation/kbuild/makefiles.rst @@ -765,7 +765,7 @@ is not sufficient this sometimes needs to be explicit. Example:: #arch/x86/boot/Makefile - subdir- := compressed/ + subdir- := compressed The above assignment instructs kbuild to descend down in the directory compressed/ when "make clean" is executed. From 1521a67e6016664941f0917d50cb20053a8826a2 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Tue, 25 Feb 2020 13:54:12 +0100 Subject: [PATCH 206/400] sched: act: count in the size of action flags bitfield The put of the flags was added by the commit referenced in fixes tag, however the size of the message was not extended accordingly. Fix this by adding size of the flags bitfield to the message size. Fixes: e38226786022 ("net: sched: update action implementations to support flags") Signed-off-by: Jiri Pirko Signed-off-by: David S. Miller --- net/sched/act_api.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 90a31b15585f..8c466a712cda 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -186,6 +186,7 @@ static size_t tcf_action_shared_attrs_size(const struct tc_action *act) + nla_total_size(IFNAMSIZ) /* TCA_ACT_KIND */ + cookie_len /* TCA_ACT_COOKIE */ + nla_total_size(0) /* TCA_ACT_STATS nested */ + + nla_total_size(sizeof(struct nla_bitfield32)) /* TCA_ACT_FLAGS */ /* TCA_STATS_BASIC */ + nla_total_size_64bit(sizeof(struct gnet_stats_basic)) /* TCA_STATS_PKT64 */ From 402482a6a78e5c61d8a2ec6311fc5b4aca392cd6 Mon Sep 17 00:00:00 2001 From: Nicolas Saenz Julienne Date: Tue, 25 Feb 2020 14:11:59 +0100 Subject: [PATCH 207/400] net: bcmgenet: Clear ID_MODE_DIS in EXT_RGMII_OOB_CTRL when not needed Outdated Raspberry Pi 4 firmware might configure the external PHY as rgmii although the kernel currently sets it as rgmii-rxid. This makes connections unreliable as ID_MODE_DIS is left enabled. To avoid this, explicitly clear that bit whenever we don't need it. Fixes: da38802211cc ("net: bcmgenet: Add RGMII_RXID support") Signed-off-by: Nicolas Saenz Julienne Acked-by: Florian Fainelli Signed-off-by: David S. Miller --- drivers/net/ethernet/broadcom/genet/bcmmii.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c index 6392a2530183..10244941a7a6 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmmii.c +++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c @@ -294,6 +294,7 @@ int bcmgenet_mii_config(struct net_device *dev, bool init) */ if (priv->ext_phy) { reg = bcmgenet_ext_readl(priv, EXT_RGMII_OOB_CTRL); + reg &= ~ID_MODE_DIS; reg |= id_mode_dis; if (GENET_IS_V1(priv) || GENET_IS_V2(priv) || GENET_IS_V3(priv)) reg |= RGMII_MODE_EN_V123; From 51e3dfa8906ace90c809235b3d3afebc166b6433 Mon Sep 17 00:00:00 2001 From: Ursula Braun Date: Tue, 25 Feb 2020 16:34:36 +0100 Subject: [PATCH 208/400] net/smc: fix cleanup for linkgroup setup failures If an SMC connection to a certain peer is setup the first time, a new linkgroup is created. In case of setup failures, such a linkgroup is unusable and should disappear. As a first step the linkgroup is removed from the linkgroup list in smc_lgr_forget(). There are 2 problems: smc_listen_decline() might be called before linkgroup creation resulting in a crash due to calling smc_lgr_forget() with parameter NULL. If a setup failure occurs after linkgroup creation, the connection is never unregistered from the linkgroup, preventing linkgroup freeing. This patch introduces an enhanced smc_lgr_cleanup_early() function which * contains a linkgroup check for early smc_listen_decline() invocations * invokes smc_conn_free() to guarantee unregistering of the connection. * schedules fast linkgroup removal of the unusable linkgroup And the unused function smcd_conn_free() is removed from smc_core.h. Fixes: 3b2dec2603d5b ("net/smc: restructure client and server code in af_smc") Fixes: 2a0674fffb6bc ("net/smc: improve abnormal termination of link groups") Signed-off-by: Ursula Braun Signed-off-by: Karsten Graul Signed-off-by: David S. Miller --- net/smc/af_smc.c | 25 +++++++++++++++---------- net/smc/smc_core.c | 12 ++++++++++++ net/smc/smc_core.h | 2 +- 3 files changed, 28 insertions(+), 11 deletions(-) diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 90988a511cd5..6fd44bdb0fc3 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -512,15 +512,18 @@ static int smc_connect_decline_fallback(struct smc_sock *smc, int reason_code) static int smc_connect_abort(struct smc_sock *smc, int reason_code, int local_contact) { + bool is_smcd = smc->conn.lgr->is_smcd; + if (local_contact == SMC_FIRST_CONTACT) - smc_lgr_forget(smc->conn.lgr); - if (smc->conn.lgr->is_smcd) + smc_lgr_cleanup_early(&smc->conn); + else + smc_conn_free(&smc->conn); + if (is_smcd) /* there is only one lgr role for SMC-D; use server lock */ mutex_unlock(&smc_server_lgr_pending); else mutex_unlock(&smc_client_lgr_pending); - smc_conn_free(&smc->conn); smc->connect_nonblock = 0; return reason_code; } @@ -1091,7 +1094,6 @@ static void smc_listen_out_err(struct smc_sock *new_smc) if (newsmcsk->sk_state == SMC_INIT) sock_put(&new_smc->sk); /* passive closing */ newsmcsk->sk_state = SMC_CLOSED; - smc_conn_free(&new_smc->conn); smc_listen_out(new_smc); } @@ -1102,12 +1104,13 @@ static void smc_listen_decline(struct smc_sock *new_smc, int reason_code, { /* RDMA setup failed, switch back to TCP */ if (local_contact == SMC_FIRST_CONTACT) - smc_lgr_forget(new_smc->conn.lgr); + smc_lgr_cleanup_early(&new_smc->conn); + else + smc_conn_free(&new_smc->conn); if (reason_code < 0) { /* error, no fallback possible */ smc_listen_out_err(new_smc); return; } - smc_conn_free(&new_smc->conn); smc_switch_to_fallback(new_smc); new_smc->fallback_rsn = reason_code; if (reason_code && reason_code != SMC_CLC_DECL_PEERDECL) { @@ -1170,16 +1173,18 @@ static int smc_listen_ism_init(struct smc_sock *new_smc, new_smc->conn.lgr->vlan_id, new_smc->conn.lgr->smcd)) { if (ini->cln_first_contact == SMC_FIRST_CONTACT) - smc_lgr_forget(new_smc->conn.lgr); - smc_conn_free(&new_smc->conn); + smc_lgr_cleanup_early(&new_smc->conn); + else + smc_conn_free(&new_smc->conn); return SMC_CLC_DECL_SMCDNOTALK; } /* Create send and receive buffers */ if (smc_buf_create(new_smc, true)) { if (ini->cln_first_contact == SMC_FIRST_CONTACT) - smc_lgr_forget(new_smc->conn.lgr); - smc_conn_free(&new_smc->conn); + smc_lgr_cleanup_early(&new_smc->conn); + else + smc_conn_free(&new_smc->conn); return SMC_CLC_DECL_MEM; } diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index 2249de5379ee..5b085efa3bce 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -162,6 +162,18 @@ static void smc_lgr_unregister_conn(struct smc_connection *conn) conn->lgr = NULL; } +void smc_lgr_cleanup_early(struct smc_connection *conn) +{ + struct smc_link_group *lgr = conn->lgr; + + if (!lgr) + return; + + smc_conn_free(conn); + smc_lgr_forget(lgr); + smc_lgr_schedule_free_work_fast(lgr); +} + /* Send delete link, either as client to request the initiation * of the DELETE LINK sequence from server; or as server to * initiate the delete processing. See smc_llc_rx_delete_link(). diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h index c472e12951d1..234ae25f0025 100644 --- a/net/smc/smc_core.h +++ b/net/smc/smc_core.h @@ -296,6 +296,7 @@ struct smc_clc_msg_accept_confirm; struct smc_clc_msg_local; void smc_lgr_forget(struct smc_link_group *lgr); +void smc_lgr_cleanup_early(struct smc_connection *conn); void smc_lgr_terminate(struct smc_link_group *lgr, bool soft); void smc_port_terminate(struct smc_ib_device *smcibdev, u8 ibport); void smc_smcd_terminate(struct smcd_dev *dev, u64 peer_gid, @@ -316,7 +317,6 @@ int smc_vlan_by_tcpsk(struct socket *clcsock, struct smc_init_info *ini); void smc_conn_free(struct smc_connection *conn); int smc_conn_create(struct smc_sock *smc, struct smc_init_info *ini); -void smcd_conn_free(struct smc_connection *conn); void smc_lgr_schedule_free_work_fast(struct smc_link_group *lgr); int smc_core_init(void); void smc_core_exit(void); From b6f6118901d1e867ac9177bbff3b00b185bd4fdc Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Tue, 25 Feb 2020 11:52:29 -0800 Subject: [PATCH 209/400] ipv6: restrict IPV6_ADDRFORM operation IPV6_ADDRFORM is able to transform IPv6 socket to IPv4 one. While this operation sounds illogical, we have to support it. One of the things it does for TCP socket is to switch sk->sk_prot to tcp_prot. We now have other layers playing with sk->sk_prot, so we should make sure to not interfere with them. This patch makes sure sk_prot is the default pointer for TCP IPv6 socket. syzbot reported : BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD a0113067 P4D a0113067 PUD a8771067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 10686 Comm: syz-executor.0 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246 RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40 RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5 R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098 R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000 FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: inet_release+0x165/0x1c0 net/ipv4/af_inet.c:427 __sock_release net/socket.c:605 [inline] sock_close+0xe1/0x260 net/socket.c:1283 __fput+0x2e4/0x740 fs/file_table.c:280 ____fput+0x15/0x20 fs/file_table.c:313 task_work_run+0x176/0x1b0 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_usermode_loop arch/x86/entry/common.c:164 [inline] prepare_exit_to_usermode+0x480/0x5b0 arch/x86/entry/common.c:195 syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:278 do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:304 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x45c429 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2ae75dac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: 0000000000000000 RBX: 00007f2ae75db6d4 RCX: 000000000045c429 RDX: 0000000000000001 RSI: 000000000000011a RDI: 0000000000000004 RBP: 000000000076bf20 R08: 0000000000000038 R09: 0000000000000000 R10: 0000000020000180 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000a9d R14: 00000000004ccfb4 R15: 000000000076bf2c Modules linked in: CR2: 0000000000000000 ---[ end trace 82567b5207e87bae ]--- RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc9000281fce0 EFLAGS: 00010246 RAX: 1ffffffff15f48ac RBX: ffffffff8afa4560 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8880a69a8f40 RBP: ffffc9000281fd10 R08: ffffffff86ed9b0c R09: ffffed1014d351f5 R10: ffffed1014d351f5 R11: 0000000000000000 R12: ffff8880920d3098 R13: 1ffff1101241a613 R14: ffff8880a69a8f40 R15: 0000000000000000 FS: 00007f2ae75db700(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a3b85000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Eric Dumazet Reported-by: syzbot+1938db17e275e85dc328@syzkaller.appspotmail.com Cc: Daniel Borkmann Signed-off-by: David S. Miller --- net/ipv6/ipv6_sockglue.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index 79fc012dd2ca..debdaeba5d8c 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -183,9 +183,15 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, retv = -EBUSY; break; } - } else if (sk->sk_protocol != IPPROTO_TCP) + } else if (sk->sk_protocol == IPPROTO_TCP) { + if (sk->sk_prot != &tcpv6_prot) { + retv = -EBUSY; + break; + } break; - + } else { + break; + } if (sk->sk_state != TCP_ESTABLISHED) { retv = -ENOTCONN; break; From f596c87005f7b1baeb7d62d9a9e25d68c3dfae10 Mon Sep 17 00:00:00 2001 From: yangerkun Date: Wed, 26 Feb 2020 11:54:35 +0800 Subject: [PATCH 210/400] slip: not call free_netdev before rtnl_unlock in slip_open As the description before netdev_run_todo, we cannot call free_netdev before rtnl_unlock, fix it by reorder the code. Signed-off-by: yangerkun Reviewed-by: Oliver Hartkopp Signed-off-by: David S. Miller --- drivers/net/slip/slip.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c index 6f4d7ba8b109..babb01888b78 100644 --- a/drivers/net/slip/slip.c +++ b/drivers/net/slip/slip.c @@ -863,7 +863,10 @@ err_free_chan: tty->disc_data = NULL; clear_bit(SLF_INUSE, &sl->flags); sl_free_netdev(sl->dev); + /* do not call free_netdev before rtnl_unlock */ + rtnl_unlock(); free_netdev(sl->dev); + return err; err_exit: rtnl_unlock(); From 4f31c532ad400a34dbdd836c204ed964d1ec2da5 Mon Sep 17 00:00:00 2001 From: Sudheesh Mavila Date: Wed, 26 Feb 2020 12:40:45 +0530 Subject: [PATCH 211/400] net: phy: corrected the return value for genphy_check_and_restart_aneg and genphy_c45_check_and_restart_aneg When auto-negotiation is not required, return value should be zero. Changes v1->v2: - improved comments and code as Andrew Lunn and Heiner Kallweit suggestion - fixed issue in genphy_c45_check_and_restart_aneg as Russell King suggestion. Fixes: 2a10ab043ac5 ("net: phy: add genphy_check_and_restart_aneg()") Fixes: 1af9f16840e9 ("net: phy: add genphy_c45_check_and_restart_aneg()") Signed-off-by: Sudheesh Mavila Reviewed-by: Heiner Kallweit Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- drivers/net/phy/phy-c45.c | 6 +++--- drivers/net/phy/phy_device.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c index a1caeee12236..dd2e23fb67c0 100644 --- a/drivers/net/phy/phy-c45.c +++ b/drivers/net/phy/phy-c45.c @@ -167,7 +167,7 @@ EXPORT_SYMBOL_GPL(genphy_c45_restart_aneg); */ int genphy_c45_check_and_restart_aneg(struct phy_device *phydev, bool restart) { - int ret = 0; + int ret; if (!restart) { /* Configure and restart aneg if it wasn't set before */ @@ -180,9 +180,9 @@ int genphy_c45_check_and_restart_aneg(struct phy_device *phydev, bool restart) } if (restart) - ret = genphy_c45_restart_aneg(phydev); + return genphy_c45_restart_aneg(phydev); - return ret; + return 0; } EXPORT_SYMBOL_GPL(genphy_c45_check_and_restart_aneg); diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index 6131aca79823..c8b0c34030d3 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -1793,7 +1793,7 @@ EXPORT_SYMBOL(genphy_restart_aneg); */ int genphy_check_and_restart_aneg(struct phy_device *phydev, bool restart) { - int ret = 0; + int ret; if (!restart) { /* Advertisement hasn't changed, but maybe aneg was never on to @@ -1808,9 +1808,9 @@ int genphy_check_and_restart_aneg(struct phy_device *phydev, bool restart) } if (restart) - ret = genphy_restart_aneg(phydev); + return genphy_restart_aneg(phydev); - return ret; + return 0; } EXPORT_SYMBOL(genphy_check_and_restart_aneg); From dc24f8b4ecd3d6c4153a1ec1bc2006ab32a41b8d Mon Sep 17 00:00:00 2001 From: Paolo Abeni Date: Wed, 26 Feb 2020 12:19:03 +0100 Subject: [PATCH 212/400] mptcp: add dummy icsk_sync_mss() syzbot noted that the master MPTCP socket lacks the icsk_sync_mss callback, and was able to trigger a null pointer dereference: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 8e171067 P4D 8e171067 PUD 93fa2067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 0 PID: 8984 Comm: syz-executor066 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x0 Code: Bad RIP value. RSP: 0018:ffffc900020b7b80 EFLAGS: 00010246 RAX: 1ffff110124ba600 RBX: 0000000000000000 RCX: ffff88809fefa600 RDX: ffff8880994cdb18 RSI: 0000000000000000 RDI: ffff8880925d3140 RBP: ffffc900020b7bd8 R08: ffffffff870225be R09: fffffbfff140652a R10: fffffbfff140652a R11: 0000000000000000 R12: ffff8880925d35d0 R13: ffff8880925d3140 R14: dffffc0000000000 R15: 1ffff110124ba6ba FS: 0000000001a0b880(0000) GS:ffff8880aea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000000a6d6f000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: cipso_v4_sock_setattr+0x34b/0x470 net/ipv4/cipso_ipv4.c:1888 netlbl_sock_setattr+0x2a7/0x310 net/netlabel/netlabel_kapi.c:989 smack_netlabel security/smack/smack_lsm.c:2425 [inline] smack_inode_setsecurity+0x3da/0x4a0 security/smack/smack_lsm.c:2716 security_inode_setsecurity+0xb2/0x140 security/security.c:1364 __vfs_setxattr_noperm+0x16f/0x3e0 fs/xattr.c:197 vfs_setxattr fs/xattr.c:224 [inline] setxattr+0x335/0x430 fs/xattr.c:451 __do_sys_fsetxattr fs/xattr.c:506 [inline] __se_sys_fsetxattr+0x130/0x1b0 fs/xattr.c:495 __x64_sys_fsetxattr+0xbf/0xd0 fs/xattr.c:495 do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440199 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffcadc19e48 EFLAGS: 00000246 ORIG_RAX: 00000000000000be RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440199 RDX: 0000000020000200 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000003 R09: 00000000004002c8 R10: 0000000000000009 R11: 0000000000000246 R12: 0000000000401a20 R13: 0000000000401ab0 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: CR2: 0000000000000000 Address the issue adding a dummy icsk_sync_mss callback. To properly sync the subflows mss and options list we need some additional infrastructure, which will land to net-next. Reported-by: syzbot+f4dfece964792d80b139@syzkaller.appspotmail.com Fixes: 2303f994b3e1 ("mptcp: Associate MPTCP context with TCP socket") Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau Signed-off-by: David S. Miller --- net/mptcp/protocol.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e9aa6807b5be..3c19a8efdcea 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -543,6 +543,11 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, } } +static unsigned int mptcp_sync_mss(struct sock *sk, u32 pmtu) +{ + return 0; +} + static int __mptcp_init_sock(struct sock *sk) { struct mptcp_sock *msk = mptcp_sk(sk); @@ -551,6 +556,7 @@ static int __mptcp_init_sock(struct sock *sk) __set_bit(MPTCP_SEND_SPACE, &msk->flags); msk->first = NULL; + inet_csk(sk)->icsk_sync_mss = mptcp_sync_mss; return 0; } From c87a9d6fc6d555e4981f2ded77d9a8cce950743e Mon Sep 17 00:00:00 2001 From: Antoine Tenart Date: Wed, 26 Feb 2020 16:26:50 +0100 Subject: [PATCH 213/400] net: phy: mscc: fix firmware paths The firmware paths for the VSC8584 PHYs not not contain the leading 'microchip/' directory, as used in linux-firmware, resulting in an error when probing the driver. This patch fixes it. Fixes: a5afc1678044 ("net: phy: mscc: add support for VSC8584 PHY") Signed-off-by: Antoine Tenart Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller --- drivers/net/phy/mscc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/phy/mscc.c b/drivers/net/phy/mscc.c index 937ac7da2789..f686f40f6bdc 100644 --- a/drivers/net/phy/mscc.c +++ b/drivers/net/phy/mscc.c @@ -345,11 +345,11 @@ enum macsec_bank { BIT(VSC8531_FORCE_LED_OFF) | \ BIT(VSC8531_FORCE_LED_ON)) -#define MSCC_VSC8584_REVB_INT8051_FW "mscc_vsc8584_revb_int8051_fb48.bin" +#define MSCC_VSC8584_REVB_INT8051_FW "microchip/mscc_vsc8584_revb_int8051_fb48.bin" #define MSCC_VSC8584_REVB_INT8051_FW_START_ADDR 0xe800 #define MSCC_VSC8584_REVB_INT8051_FW_CRC 0xfb48 -#define MSCC_VSC8574_REVB_INT8051_FW "mscc_vsc8574_revb_int8051_29e8.bin" +#define MSCC_VSC8574_REVB_INT8051_FW "microchip/mscc_vsc8574_revb_int8051_29e8.bin" #define MSCC_VSC8574_REVB_INT8051_FW_START_ADDR 0x4000 #define MSCC_VSC8574_REVB_INT8051_FW_CRC 0x29e8 From 474a31e13a4e9749fb3ee55794d69d0f17ee0998 Mon Sep 17 00:00:00 2001 From: Aaro Koskinen Date: Wed, 26 Feb 2020 18:49:01 +0200 Subject: [PATCH 214/400] net: stmmac: fix notifier registration We cannot register the same netdev notifier multiple times when probing stmmac devices. Register the notifier only once in module init, and also make debugfs creation/deletion safe against simultaneous notifier call. Fixes: 481a7d154cbb ("stmmac: debugfs entry name is not be changed when udev rename device name.") Signed-off-by: Aaro Koskinen Signed-off-by: David S. Miller --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 5836b21edd7e..7da18c9afa01 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -4405,6 +4405,8 @@ static void stmmac_init_fs(struct net_device *dev) { struct stmmac_priv *priv = netdev_priv(dev); + rtnl_lock(); + /* Create per netdev entries */ priv->dbgfs_dir = debugfs_create_dir(dev->name, stmmac_fs_dir); @@ -4416,14 +4418,13 @@ static void stmmac_init_fs(struct net_device *dev) debugfs_create_file("dma_cap", 0444, priv->dbgfs_dir, dev, &stmmac_dma_cap_fops); - register_netdevice_notifier(&stmmac_notifier); + rtnl_unlock(); } static void stmmac_exit_fs(struct net_device *dev) { struct stmmac_priv *priv = netdev_priv(dev); - unregister_netdevice_notifier(&stmmac_notifier); debugfs_remove_recursive(priv->dbgfs_dir); } #endif /* CONFIG_DEBUG_FS */ @@ -4940,14 +4941,14 @@ int stmmac_dvr_remove(struct device *dev) netdev_info(priv->dev, "%s: removing driver", __func__); -#ifdef CONFIG_DEBUG_FS - stmmac_exit_fs(ndev); -#endif stmmac_stop_all_dma(priv); stmmac_mac_set(priv, priv->ioaddr, false); netif_carrier_off(ndev); unregister_netdev(ndev); +#ifdef CONFIG_DEBUG_FS + stmmac_exit_fs(ndev); +#endif phylink_destroy(priv->phylink); if (priv->plat->stmmac_rst) reset_control_assert(priv->plat->stmmac_rst); @@ -5166,6 +5167,7 @@ static int __init stmmac_init(void) /* Create debugfs main directory if it doesn't exist yet */ if (!stmmac_fs_dir) stmmac_fs_dir = debugfs_create_dir(STMMAC_RESOURCE_NAME, NULL); + register_netdevice_notifier(&stmmac_notifier); #endif return 0; @@ -5174,6 +5176,7 @@ static int __init stmmac_init(void) static void __exit stmmac_exit(void) { #ifdef CONFIG_DEBUG_FS + unregister_netdevice_notifier(&stmmac_notifier); debugfs_remove_recursive(stmmac_fs_dir); #endif } From a2f2ef4a54c0d97aa6a8386f4ff23f36ebb488cf Mon Sep 17 00:00:00 2001 From: Karsten Graul Date: Wed, 26 Feb 2020 17:52:46 +0100 Subject: [PATCH 215/400] net/smc: check for valid ib_client_data In smc_ib_remove_dev() check if the provided ib device was actually initialized for SMC before. Reported-by: syzbot+84484ccebdd4e5451d91@syzkaller.appspotmail.com Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client") Signed-off-by: Karsten Graul Signed-off-by: David S. Miller --- net/smc/smc_ib.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c index 548632621f4b..d6ba186f67e2 100644 --- a/net/smc/smc_ib.c +++ b/net/smc/smc_ib.c @@ -573,6 +573,8 @@ static void smc_ib_remove_dev(struct ib_device *ibdev, void *client_data) struct smc_ib_device *smcibdev; smcibdev = ib_get_client_data(ibdev, &smc_ib_client); + if (!smcibdev || smcibdev->ibdev != ibdev) + return; ib_set_client_data(ibdev, &smc_ib_client, NULL); spin_lock(&smc_ib_devices.lock); list_del_init(&smcibdev->list); /* remove from smc_ib_devices */ From f5739cb0b56590d68d8df8a44659893b6d0084c3 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Wed, 26 Feb 2020 22:39:27 +0100 Subject: [PATCH 216/400] cpufreq: Fix policy initialization for internal governor drivers Before commit 1e4f63aecb53 ("cpufreq: Avoid creating excessively large stack frames") the initial value of the policy field in struct cpufreq_policy set by the driver's ->init() callback was implicitly passed from cpufreq_init_policy() to cpufreq_set_policy() if the default governor was neither "performance" nor "powersave". After that commit, however, cpufreq_init_policy() must take that case into consideration explicitly and handle it as appropriate, so make that happen. Fixes: 1e4f63aecb53 ("cpufreq: Avoid creating excessively large stack frames") Link: https://lore.kernel.org/linux-pm/39fb762880c27da110086741315ca8b111d781cd.camel@gmail.com/ Reported-by: Artem Bityutskiy Cc: 5.4+ # 5.4+ Signed-off-by: Rafael J. Wysocki Acked-by: Viresh Kumar --- drivers/cpufreq/cpufreq.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index cbe6c94bf158..808874bccf4a 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1076,9 +1076,17 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy) pol = policy->last_policy; } else if (def_gov) { pol = cpufreq_parse_policy(def_gov->name); - } else { - return -ENODATA; + /* + * In case the default governor is neiter "performance" + * nor "powersave", fall back to the initial policy + * value set by the driver. + */ + if (pol == CPUFREQ_POLICY_UNKNOWN) + pol = policy->policy; } + if (pol != CPUFREQ_POLICY_PERFORMANCE && + pol != CPUFREQ_POLICY_POWERSAVE) + return -ENODATA; } return cpufreq_set_policy(policy, gov, pol); From 289de35984815576793f579ec27248609e75976e Mon Sep 17 00:00:00 2001 From: Vincent Guittot Date: Tue, 18 Feb 2020 15:45:34 +0100 Subject: [PATCH 217/400] sched/fair: Fix statistics for find_idlest_group() sgs->group_weight is not set while gathering statistics in update_sg_wakeup_stats(). This means that a group can be classified as fully busy with 0 running tasks if utilization is high enough. This path is mainly used for fork and exec. Fixes: 57abff067a08 ("sched/fair: Rework find_idlest_group()") Signed-off-by: Vincent Guittot Signed-off-by: Ingo Molnar Acked-by: Peter Zijlstra Acked-by: Mel Gorman Link: https://lore.kernel.org/r/20200218144534.4564-1-vincent.guittot@linaro.org --- kernel/sched/fair.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 3c8a379c357e..c1217bfe5e81 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8337,6 +8337,8 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd, sgs->group_capacity = group->sgc->capacity; + sgs->group_weight = group->group_weight; + sgs->group_type = group_classify(sd->imbalance_pct, group, sgs); /* From 2be30d34a387b8d97cc1b4be1223bfe0b75a0812 Mon Sep 17 00:00:00 2001 From: Icenowy Zheng Date: Sat, 22 Feb 2020 00:51:27 +0800 Subject: [PATCH 218/400] drm/bridge: analogix-anx6345: fix set of link bandwidth Current code tries to store the link rate (in bps, which is a big number) in a u8, which surely overflow. Then it's converted back to bandwidth code (which is thus 0) and written to the chip. The code sometimes works because the chip will automatically fallback to the lowest possible DP link rate (1.62Gbps) when get the invalid value. However, on the eDP panel of Olimex TERES-I, which wants 2.7Gbps link, it failed. As we had already read the link bandwidth as bandwidth code in earlier code (to check whether it is supported), use it when setting bandwidth, instead of converting it to link rate and then converting back. Fixes: e1cff82c1097 ("drm/bridge: fix anx6345 compilation for v5.5") Signed-off-by: Icenowy Zheng Reviewed-by: Torsten Duwe Cc: Maxime Ripard Cc: Torsten Duwe Cc: Sam Ravnborg Cc: Linus Walleij Cc: Thomas Zimmermann Cc: Icenowy Zheng Cc: Stephen Rothwell Signed-off-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20200221165127.813325-1-icenowy@aosc.io --- drivers/gpu/drm/bridge/analogix/analogix-anx6345.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix-anx6345.c b/drivers/gpu/drm/bridge/analogix/analogix-anx6345.c index 56f55c53abfd..2dfa2fd2a23b 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix-anx6345.c +++ b/drivers/gpu/drm/bridge/analogix/analogix-anx6345.c @@ -210,8 +210,7 @@ static int anx6345_dp_link_training(struct anx6345 *anx6345) if (err) return err; - dpcd[0] = drm_dp_max_link_rate(anx6345->dpcd); - dpcd[0] = drm_dp_link_rate_to_bw_code(dpcd[0]); + dpcd[0] = dp_bw; err = regmap_write(anx6345->map[I2C_IDX_DPTX], SP_DP_MAIN_LINK_BW_SET_REG, dpcd[0]); if (err) From d1f37226431f5d9657aa144a40f2383adbcf27e1 Mon Sep 17 00:00:00 2001 From: Cong Wang Date: Thu, 26 Dec 2019 22:32:04 -0800 Subject: [PATCH 219/400] dma-buf: free dmabuf->name in dma_buf_release() dma-buf name can be set via DMA_BUF_SET_NAME ioctl, but once set it never gets freed. Free it in dma_buf_release(). Fixes: bb2bb9030425 ("dma-buf: add DMA_BUF_SET_NAME ioctls") Reported-by: syzbot+b2098bc44728a4efb3e9@syzkaller.appspotmail.com Cc: Greg Hackmann Cc: Chenbo Feng Cc: Sumit Semwal Signed-off-by: Cong Wang Acked-by: Chenbo Feng Signed-off-by: Sumit Semwal Link: https://patchwork.freedesktop.org/patch/msgid/20191227063204.5813-1-xiyou.wangcong@gmail.com --- drivers/dma-buf/dma-buf.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index d4097856c86b..c343c7c10b4c 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -108,6 +108,7 @@ static int dma_buf_release(struct inode *inode, struct file *file) dma_resv_fini(dmabuf->resv); module_put(dmabuf->owner); + kfree(dmabuf->name); kfree(dmabuf); return 0; } From 5dd8304981ecffa77bb72b1c57c4be5dfe6cfae9 Mon Sep 17 00:00:00 2001 From: Thommy Jakobsson Date: Mon, 24 Feb 2020 17:26:43 +0100 Subject: [PATCH 220/400] spi/zynqmp: remove entry that causes a cs glitch In the public interface for chipselect, there is always an entry commented as "Dummy generic FIFO entry" pushed down to the fifo right after the activate/deactivate command. The dummy entry is 0x0, irregardless if the intention was to activate or deactive the cs. This causes the cs line to glitch rather than beeing activated in the case when there was an activate command. This has been observed on oscilloscope, and have caused problems for at least one specific flash device type connected to the qspi port. After the change the glitch is gone and cs goes active when intended. The reason why this worked before (except for the glitch) was because when sending the actual data, the CS bits are once again set. Since most flashes uses mode 0, there is always a half clk period anyway for cs to clk active setup time. If someone would rely on timing from a chip_select call to a transfer_one, it would fail though. It is unknown why the dummy entry was there in the first place, git log seems to be of no help in this case. The reference manual gives no indication of the necessity of this. In fact the lower 8 bits are a setup (or hold in case of deactivate) time expressed in cycles. So this should not be needed to fulfill any setup/hold timings. Signed-off-by: Thommy Jakobsson Reviewed-by: Naga Sureshkumar Relli Link: https://lore.kernel.org/r/20200224162643.29102-1-thommyj@gmail.com Signed-off-by: Mark Brown --- drivers/spi/spi-zynqmp-gqspi.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c index 60c4de4e4485..7412a3042a8d 100644 --- a/drivers/spi/spi-zynqmp-gqspi.c +++ b/drivers/spi/spi-zynqmp-gqspi.c @@ -401,9 +401,6 @@ static void zynqmp_qspi_chipselect(struct spi_device *qspi, bool is_high) zynqmp_gqspi_write(xqspi, GQSPI_GEN_FIFO_OFST, genfifoentry); - /* Dummy generic FIFO entry */ - zynqmp_gqspi_write(xqspi, GQSPI_GEN_FIFO_OFST, 0x0); - /* Manually start the generic FIFO command */ zynqmp_gqspi_write(xqspi, GQSPI_CONFIG_OFST, zynqmp_gqspi_read(xqspi, GQSPI_CONFIG_OFST) | From d8e3ee2e2b4ef36d7be3dd8a8fb6e136f2661203 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 27 Feb 2020 09:23:35 -0300 Subject: [PATCH 221/400] tools arch x86: Sync the msr-index.h copy with the kernel sources To pick up the changes from these csets: 21b5ee59ef18 ("x86/cpu/amd: Enable the fixed Instructions Retired counter IRPERF") $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ git diff diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index ebe1685e92dd..d5e517d1c3dd 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -512,6 +512,8 @@ #define MSR_K7_HWCR 0xc0010015 #define MSR_K7_HWCR_SMMLOCK_BIT 0 #define MSR_K7_HWCR_SMMLOCK BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT) +#define MSR_K7_HWCR_IRPERF_EN_BIT 30 +#define MSR_K7_HWCR_IRPERF_EN BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT) #define MSR_K7_FID_VID_CTL 0xc0010041 #define MSR_K7_FID_VID_STATUS 0xc0010042 $ That don't result in any change in tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after $ To silence this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Cc: Adrian Hunter Cc: Borislav Petkov Cc: Jiri Olsa Cc: Kim Phillips Cc: Namhyung Kim Signed-off-by: Arnaldo Carvalho de Melo --- tools/arch/x86/include/asm/msr-index.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index ebe1685e92dd..d5e517d1c3dd 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -512,6 +512,8 @@ #define MSR_K7_HWCR 0xc0010015 #define MSR_K7_HWCR_SMMLOCK_BIT 0 #define MSR_K7_HWCR_SMMLOCK BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT) +#define MSR_K7_HWCR_IRPERF_EN_BIT 30 +#define MSR_K7_HWCR_IRPERF_EN BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT) #define MSR_K7_FID_VID_CTL 0xc0010041 #define MSR_K7_FID_VID_STATUS 0xc0010042 From 0d6f94fd498a7f9d15c5cbf64567727361fd35c0 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 27 Feb 2020 09:51:30 -0300 Subject: [PATCH 222/400] tools headers UAPI: Update tools's copy of kvm.h headers Picking the changes from: 5ef8acbdd687 ("KVM: nVMX: Emulate MTF when performing instruction emulation") Silencing this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h' diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h No change in tooling ensues, just the x86 kvm tooling gets rebuilt as those headers are included in its build: $ cp arch/x86/include/uapi/asm/kvm.h tools/arch/x86/include/uapi/asm/kvm.h $ make -C tools/perf make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j12' parallel build Auto-detecting system features: ... dwarf: [ on ] ... disassembler-four-args: [ on ] DESCEND plugins CC /tmp/build/perf/arch/x86/util/kvm-stat.o LD /tmp/build/perf/arch/x86/util/perf-in.o LD /tmp/build/perf/arch/x86/perf-in.o LD /tmp/build/perf/arch/perf-in.o LD /tmp/build/perf/perf-in.o LINK /tmp/build/perf/perf $ As it doesn't seem to be used there: $ grep STATE tools/perf/arch/x86/util/kvm-stat.c $ And the 'perf trace' beautifier table generator isn't interested in these things: $ grep regex= tools/perf/trace/beauty/kvm_ioctl.sh regex='^#[[:space:]]*define[[:space:]]+KVM_(\w+)[[:space:]]+_IO[RW]*\([[:space:]]*KVMIO[[:space:]]*,[[:space:]]*(0x[[:xdigit:]]+).*' $ Cc: Adrian Hunter Cc: Jiri Olsa Cc: Namhyung Kim Cc: Oliver Upton Cc: Paolo Bonzini Signed-off-by: Arnaldo Carvalho de Melo --- tools/arch/x86/include/uapi/asm/kvm.h | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index 503d3f42da16..3f3f780c8c65 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -390,6 +390,7 @@ struct kvm_sync_regs { #define KVM_STATE_NESTED_GUEST_MODE 0x00000001 #define KVM_STATE_NESTED_RUN_PENDING 0x00000002 #define KVM_STATE_NESTED_EVMCS 0x00000004 +#define KVM_STATE_NESTED_MTF_PENDING 0x00000008 #define KVM_STATE_NESTED_SMM_GUEST_MODE 0x00000001 #define KVM_STATE_NESTED_SMM_VMXON 0x00000002 From 1cad629257e76025bcbf490c58de550fb67d4d0e Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann Date: Wed, 26 Feb 2020 16:47:50 +0100 Subject: [PATCH 223/400] drm/shmem: add support for per object caching flags. Add map_cached bool to drm_gem_shmem_object, to request cached mappings on a per-object base. Check the flag before adding writecombine to pgprot bits. Cc: stable@vger.kernel.org Signed-off-by: Gerd Hoffmann Reviewed-by: Gurchetan Singh Tested-by: Guillaume Gardet Link: http://patchwork.freedesktop.org/patch/msgid/20200226154752.24328-2-kraxel@redhat.com --- drivers/gpu/drm/drm_gem_shmem_helper.c | 15 +++++++++++---- include/drm/drm_gem_shmem_helper.h | 5 +++++ 2 files changed, 16 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c index a421a2eed48a..aad9324dcf4f 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -254,11 +254,16 @@ static void *drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem) if (ret) goto err_zero_use; - if (obj->import_attach) + if (obj->import_attach) { shmem->vaddr = dma_buf_vmap(obj->import_attach->dmabuf); - else + } else { + pgprot_t prot = PAGE_KERNEL; + + if (!shmem->map_cached) + prot = pgprot_writecombine(prot); shmem->vaddr = vmap(shmem->pages, obj->size >> PAGE_SHIFT, - VM_MAP, pgprot_writecombine(PAGE_KERNEL)); + VM_MAP, prot); + } if (!shmem->vaddr) { DRM_DEBUG_KMS("Failed to vmap pages\n"); @@ -540,7 +545,9 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) } vma->vm_flags |= VM_MIXEDMAP | VM_DONTEXPAND; - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); + vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); + if (!shmem->map_cached) + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot); vma->vm_ops = &drm_gem_shmem_vm_ops; diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h index e34a7b7f848a..294b2931c4cc 100644 --- a/include/drm/drm_gem_shmem_helper.h +++ b/include/drm/drm_gem_shmem_helper.h @@ -96,6 +96,11 @@ struct drm_gem_shmem_object { * The address are un-mapped when the count reaches zero. */ unsigned int vmap_use_count; + + /** + * @map_cached: map object cached (instead of using writecombine). + */ + bool map_cached; }; #define to_drm_gem_shmem_obj(obj) \ From 6be7e07335486f5731cab748d80c68f20896581f Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann Date: Wed, 26 Feb 2020 16:47:51 +0100 Subject: [PATCH 224/400] drm/virtio: fix mmap page attributes virtio-gpu uses cached mappings, set drm_gem_shmem_object.map_cached accordingly. Cc: stable@vger.kernel.org Fixes: c66df701e783 ("drm/virtio: switch from ttm to gem shmem helpers") Reported-by: Gurchetan Singh Reported-by: Guillaume Gardet Signed-off-by: Gerd Hoffmann Reviewed-by: Gurchetan Singh Tested-by: Guillaume Gardet Link: http://patchwork.freedesktop.org/patch/msgid/20200226154752.24328-3-kraxel@redhat.com --- drivers/gpu/drm/virtio/virtgpu_object.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c b/drivers/gpu/drm/virtio/virtgpu_object.c index 890121a45625..3af7ec80c7da 100644 --- a/drivers/gpu/drm/virtio/virtgpu_object.c +++ b/drivers/gpu/drm/virtio/virtgpu_object.c @@ -99,6 +99,7 @@ struct drm_gem_object *virtio_gpu_create_object(struct drm_device *dev, return NULL; bo->base.base.funcs = &virtio_gpu_gem_funcs; + bo->base.map_cached = true; return &bo->base.base; } From 54cf752cfb75602c256e94db6fdfd3de9dfbbef1 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Feb 2020 12:12:59 +0530 Subject: [PATCH 225/400] perf annotate/tui: Re-render title bar after switching back from script browser The 'perf annotate' TUI browser provides a 'r' hot key to switch to a script browser. But the annotate browser title bar becomes hidden while switching back from script browser. Fix it. Signed-off-by: Ravi Bangoria Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Alexey Budankov Cc: Changbin Du Cc: Ian Rogers Cc: Jin Yao Cc: Jiri Olsa Cc: Leo Yan Cc: Namhyung Kim Cc: Song Liu Cc: Taeung Song Cc: Thomas Richter Cc: Yisheng Xie Link: http://lore.kernel.org/lkml/20200213064306.160480-2-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/ui/browsers/annotate.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index badbddbb30f8..0dbbf35e6ed1 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -754,10 +754,9 @@ static int annotate_browser__run(struct annotate_browser *browser, "? Search string backwards\n"); continue; case 'r': - { - script_browse(NULL, NULL); - continue; - } + script_browse(NULL, NULL); + annotate_browser__show(&browser->b, title, help); + continue; case 'k': notes->options->show_linenr = !notes->options->show_linenr; break; From 68aac855b643e1540012cbefa0dee06207c3dc64 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Feb 2020 12:13:00 +0530 Subject: [PATCH 226/400] perf annotate: Fix --show-total-period for tui/stdio2 perf annotate --show-total-period does not really show total period. The reason is we have two separate variables for the same purpose. One is in symbol_conf.show_total_period and another is annotation_options.show_total_period. We save command line option in symbol_conf.show_total_period but uses annotation_option.show_total_period while rendering tui/stdio2 browser. Though, we copy symbol_conf.show_total_period to annotation__default_options.show_total_period but that is not really effective as we don't use annotation__default_options once we copy default options to dynamic variable annotate.opts in cmd_annotate(). Instead of all these complication, keep only one variable and use it all over. symbol_conf.show_total_period is used by perf report/top as well. So let's kill annotation_options.show_total_period. On a side note, I've kept annotation_options.show_total_period definition because it's still used by perf-config code. Follow up patch to fix perf-config for annotate will remove annotation_options.show_total_period. Signed-off-by: Ravi Bangoria Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Alexey Budankov Cc: Changbin Du Cc: Ian Rogers Cc: Jin Yao Cc: Jiri Olsa Cc: Leo Yan Cc: Namhyung Kim Cc: Song Liu Cc: Taeung Song Cc: Thomas Richter Cc: Yisheng Xie Link: http://lore.kernel.org/lkml/20200213064306.160480-3-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/ui/browsers/annotate.c | 6 +++--- tools/perf/util/annotate.c | 5 ++--- tools/perf/util/annotate.h | 2 +- 3 files changed, 6 insertions(+), 7 deletions(-) diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index 0dbbf35e6ed1..7e5b44becb5c 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -833,13 +833,13 @@ show_sup_ins: map_symbol__annotation_dump(ms, evsel, browser->opts); continue; case 't': - if (notes->options->show_total_period) { - notes->options->show_total_period = false; + if (symbol_conf.show_total_period) { + symbol_conf.show_total_period = false; notes->options->show_nr_samples = true; } else if (notes->options->show_nr_samples) notes->options->show_nr_samples = false; else - notes->options->show_total_period = true; + symbol_conf.show_total_period = true; annotation__update_column_widths(notes); continue; case 'c': diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index ca73fb74ad03..fe4b44d4ffab 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -2915,7 +2915,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati percent = annotation_data__percent(&al->data[i], percent_type); obj__set_percent_color(obj, percent, current_entry); - if (notes->options->show_total_period) { + if (symbol_conf.show_total_period) { obj__printf(obj, "%11" PRIu64 " ", al->data[i].he.period); } else if (notes->options->show_nr_samples) { obj__printf(obj, "%6" PRIu64 " ", @@ -2931,7 +2931,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati obj__printf(obj, "%-*s", pcnt_width, " "); else { obj__printf(obj, "%-*s", pcnt_width, - notes->options->show_total_period ? "Period" : + symbol_conf.show_total_period ? "Period" : notes->options->show_nr_samples ? "Samples" : "Percent"); } } @@ -3155,7 +3155,6 @@ void annotation_config__init(void) { perf_config(annotation__config, NULL); - annotation__default_options.show_total_period = symbol_conf.show_total_period; annotation__default_options.show_nr_samples = symbol_conf.show_nr_samples; } diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 455403e8fede..6237c2cc582d 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -309,7 +309,7 @@ static inline int annotation__cycles_width(struct annotation *notes) static inline int annotation__pcnt_width(struct annotation *notes) { - return (notes->options->show_total_period ? 12 : 7) * notes->nr_events; + return (symbol_conf.show_total_period ? 12 : 7) * notes->nr_events; } static inline bool annotation_line__filter(struct annotation_line *al, struct annotation *notes) From 46ccb44269665bba6a9bf0f77fe776421fc2304c Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Feb 2020 12:13:01 +0530 Subject: [PATCH 227/400] perf annotate: Fix --show-nr-samples for tui/stdio2 perf annotate --show-nr-samples does not really show number of samples. The reason is we have two separate variables for the same purpose. One is in symbol_conf.show_nr_samples and another is annotation_options.show_nr_samples. We save command line option in symbol_conf.show_nr_samples but uses annotation_option.show_nr_samples while rendering tui/stdio2 browser. Though, we copy symbol_conf.show_nr_samples to annotation__default_options.show_nr_samples but that is not really effective as we don't use annotation__default_options once we copy default options to dynamic variable annotate.opts in cmd_annotate(). Instead of all these complication, keep only one variable and use it all over. symbol_conf.show_nr_samples is used by perf report/top as well. So let's kill annotation_options.show_nr_samples. On a side note, I've kept annotation_options.show_nr_samples definition because it's still used by perf-config code. Follow up patch to fix perf-config for annotate will remove annotation_options.show_nr_samples. Signed-off-by: Ravi Bangoria Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Alexey Budankov Cc: Changbin Du Cc: Ian Rogers Cc: Jin Yao Cc: Jiri Olsa Cc: Leo Yan Cc: Namhyung Kim Cc: Song Liu Cc: Taeung Song Cc: Thomas Richter Cc: Yisheng Xie Link: http://lore.kernel.org/lkml/20200213064306.160480-4-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/ui/browsers/annotate.c | 6 +++--- tools/perf/util/annotate.c | 6 ++---- 2 files changed, 5 insertions(+), 7 deletions(-) diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index 7e5b44becb5c..9023267e5643 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -835,9 +835,9 @@ show_sup_ins: case 't': if (symbol_conf.show_total_period) { symbol_conf.show_total_period = false; - notes->options->show_nr_samples = true; - } else if (notes->options->show_nr_samples) - notes->options->show_nr_samples = false; + symbol_conf.show_nr_samples = true; + } else if (symbol_conf.show_nr_samples) + symbol_conf.show_nr_samples = false; else symbol_conf.show_total_period = true; annotation__update_column_widths(notes); diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index fe4b44d4ffab..f0741daf94ef 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -2917,7 +2917,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati obj__set_percent_color(obj, percent, current_entry); if (symbol_conf.show_total_period) { obj__printf(obj, "%11" PRIu64 " ", al->data[i].he.period); - } else if (notes->options->show_nr_samples) { + } else if (symbol_conf.show_nr_samples) { obj__printf(obj, "%6" PRIu64 " ", al->data[i].he.nr_samples); } else { @@ -2932,7 +2932,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati else { obj__printf(obj, "%-*s", pcnt_width, symbol_conf.show_total_period ? "Period" : - notes->options->show_nr_samples ? "Samples" : "Percent"); + symbol_conf.show_nr_samples ? "Samples" : "Percent"); } } @@ -3154,8 +3154,6 @@ static int annotation__config(const char *var, const char *value, void annotation_config__init(void) { perf_config(annotation__config, NULL); - - annotation__default_options.show_nr_samples = symbol_conf.show_nr_samples; } static unsigned int parse_percent_type(char *str1, char *str2) From 7b43b6970474757da68e89efb76e892219dea9d8 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Feb 2020 12:13:02 +0530 Subject: [PATCH 228/400] perf config: Introduce perf_config_u8() Introduce perf_config_u8() utility function to convert char * input into u8 destination. We will utilize it in followup patch. Signed-off-by: Ravi Bangoria Cc: Adrian Hunter Cc: Alexey Budankov Cc: Changbin Du Cc: Ian Rogers Cc: Jin Yao Cc: Jiri Olsa Cc: Leo Yan Cc: Namhyung Kim Cc: Song Liu Cc: Taeung Song Cc: Thomas Richter Cc: Yisheng Xie Link: http://lore.kernel.org/lkml/20200213064306.160480-5-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/config.c | 12 ++++++++++++ tools/perf/util/config.h | 1 + 2 files changed, 13 insertions(+) diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c index 0bc9c4d7fdc5..ef38eba56ed0 100644 --- a/tools/perf/util/config.c +++ b/tools/perf/util/config.c @@ -374,6 +374,18 @@ int perf_config_int(int *dest, const char *name, const char *value) return 0; } +int perf_config_u8(u8 *dest, const char *name, const char *value) +{ + long ret = 0; + + if (!perf_parse_long(value, &ret)) { + bad_config(name); + return -1; + } + *dest = ret; + return 0; +} + static int perf_config_bool_or_int(const char *name, const char *value, int *is_bool) { int ret; diff --git a/tools/perf/util/config.h b/tools/perf/util/config.h index bd0a5897c76a..c10b66dde2f3 100644 --- a/tools/perf/util/config.h +++ b/tools/perf/util/config.h @@ -29,6 +29,7 @@ typedef int (*config_fn_t)(const char *, const char *, void *); int perf_default_config(const char *, const char *, void *); int perf_config(config_fn_t fn, void *); int perf_config_int(int *dest, const char *, const char *); +int perf_config_u8(u8 *dest, const char *name, const char *value); int perf_config_u64(u64 *dest, const char *, const char *); int perf_config_bool(const char *, const char *); int config_error_nonbool(const char *); From 7384083ba616092e62df7bfb4f2034730e631e40 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Feb 2020 12:13:03 +0530 Subject: [PATCH 229/400] perf annotate: Make perf config effective MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit perf default config set by user in [annotate] section is totally ignored by annotate code. Fix it. Before: $ ./perf config annotate.hide_src_code=true annotate.show_nr_jumps=true annotate.show_nr_samples=true $ ./perf annotate shash │ unsigned h = 0; │ movl $0x0,-0xc(%rbp) │ while (*s) │ ↓ jmp 44 │ h = 65599 * h + *s++; 11.33 │24: mov -0xc(%rbp),%eax 43.50 │ imul $0x1003f,%eax,%ecx │ mov -0x18(%rbp),%rax After: │ movl $0x0,-0xc(%rbp) │ ↓ jmp 44 1 │1 24: mov -0xc(%rbp),%eax 4 │ imul $0x1003f,%eax,%ecx │ mov -0x18(%rbp),%rax Note that we have removed show_nr_samples and show_total_period from annotation_options because they are not used. Instead of them we use symbol_conf.show_nr_samples and symbol_conf.show_total_period. Committer testing: Using 'perf annotate --stdio2' to use the TUI rendering but emitting the output to stdio: # perf config # # perf config annotate.hide_src_code=true # perf config annotate.hide_src_code=true # # perf config annotate.show_nr_jumps=true # perf config annotate.show_nr_samples=true # perf config annotate.hide_src_code=true annotate.show_nr_jumps=true annotate.show_nr_samples=true # # Before: # perf annotate --stdio2 ObjectInstance::weak_pointer_was_finalized Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period] ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0 Percent 00000000000609f0 : endbr64 cmpq $0x0,0x20(%rdi) ↓ je 10 xor %eax,%eax ← retq xchg %ax,%ax 100.00 10: push %rbp cmpq $0x0,0x18(%rdi) mov %rdi,%rbp ↓ jne 20 1b: xor %eax,%eax pop %rbp ← retq nop 20: lea 0x18(%rdi),%rdi → callq JS_UpdateWeakPointerAfterGC(JS::Heap /dev/null Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period] ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0 Samples endbr64 cmpq $0x0,0x20(%rdi) ↓ je 10 xor %eax,%eax ← retq xchg %ax,%ax 1 1 10: push %rbp cmpq $0x0,0x18(%rdi) mov %rdi,%rbp ↓ jne 20 1 1b: xor %eax,%eax pop %rbp ← retq nop 1 20: lea 0x18(%rdi),%rdi → callq JS_UpdateWeakPointerAfterGC(JS::Heap /dev/null Samples: 1 of event 'cycles', 4000 Hz, Event count (approx.): 830873, [percent: local period] ObjectInstance::weak_pointer_was_finalized() /usr/lib64/libgjs.so.0.0.0 Samples endbr64 cmpq $0x0,0x20(%rdi) ↓ je 10 xor %eax,%eax ← retq xchg %ax,%ax 1 10: push %rbp cmpq $0x0,0x18(%rdi) mov %rdi,%rbp ↓ jne 20 1b: xor %eax,%eax pop %rbp ← retq nop 20: lea 0x18(%rdi),%rdi → callq JS_UpdateWeakPointerAfterGC(JS::Heap Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Alexey Budankov Cc: Changbin Du Cc: Ian Rogers Cc: Jin Yao Cc: Jiri Olsa Cc: Leo Yan Cc: Namhyung Kim Cc: Song Liu Cc: Taeung Song Cc: Thomas Richter Cc: Yisheng Xie Link: http://lore.kernel.org/lkml/20200213064306.160480-6-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-annotate.c | 2 +- tools/perf/builtin-report.c | 2 +- tools/perf/builtin-top.c | 2 +- tools/perf/util/annotate.c | 76 +++++++++++++---------------------- tools/perf/util/annotate.h | 4 +- 5 files changed, 32 insertions(+), 54 deletions(-) diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c index ff61795a4d13..ea89077bb8e0 100644 --- a/tools/perf/builtin-annotate.c +++ b/tools/perf/builtin-annotate.c @@ -605,7 +605,7 @@ int cmd_annotate(int argc, const char **argv) if (ret < 0) goto out_delete; - annotation_config__init(); + annotation_config__init(&annotate.opts); symbol_conf.try_vmlinux_path = true; diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index 9483b3f0cae3..72a12b69f120 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -1507,7 +1507,7 @@ repeat: symbol_conf.priv_size += sizeof(u32); symbol_conf.sort_by_name = true; } - annotation_config__init(); + annotation_config__init(&report.annotation_opts); } if (symbol__init(&session->header.env) < 0) diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index 8affcab75604..cc26aeab6a66 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -1683,7 +1683,7 @@ int cmd_top(int argc, const char **argv) if (status < 0) goto out_delete_evlist; - annotation_config__init(); + annotation_config__init(&top.annotation_opts); symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL); status = symbol__init(NULL); diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index f0741daf94ef..3b79da595db6 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -3094,66 +3094,46 @@ out_free_offsets: return err; } -#define ANNOTATION__CFG(n) \ - { .name = #n, .value = &annotation__default_options.n, } - -/* - * Keep the entries sorted, they are bsearch'ed - */ -static struct annotation_config { - const char *name; - void *value; -} annotation__configs[] = { - ANNOTATION__CFG(hide_src_code), - ANNOTATION__CFG(jump_arrows), - ANNOTATION__CFG(offset_level), - ANNOTATION__CFG(show_linenr), - ANNOTATION__CFG(show_nr_jumps), - ANNOTATION__CFG(show_nr_samples), - ANNOTATION__CFG(show_total_period), - ANNOTATION__CFG(use_offset), -}; - -#undef ANNOTATION__CFG - -static int annotation_config__cmp(const void *name, const void *cfgp) +static int annotation__config(const char *var, const char *value, void *data) { - const struct annotation_config *cfg = cfgp; - - return strcmp(name, cfg->name); -} - -static int annotation__config(const char *var, const char *value, - void *data __maybe_unused) -{ - struct annotation_config *cfg; - const char *name; + struct annotation_options *opt = data; if (!strstarts(var, "annotate.")) return 0; - name = var + 9; - cfg = bsearch(name, annotation__configs, ARRAY_SIZE(annotation__configs), - sizeof(struct annotation_config), annotation_config__cmp); + if (!strcmp(var, "annotate.offset_level")) { + perf_config_u8(&opt->offset_level, "offset_level", value); - if (cfg == NULL) - pr_debug("%s variable unknown, ignoring...", var); - else if (strcmp(var, "annotate.offset_level") == 0) { - perf_config_int(cfg->value, name, value); - - if (*(int *)cfg->value > ANNOTATION__MAX_OFFSET_LEVEL) - *(int *)cfg->value = ANNOTATION__MAX_OFFSET_LEVEL; - else if (*(int *)cfg->value < ANNOTATION__MIN_OFFSET_LEVEL) - *(int *)cfg->value = ANNOTATION__MIN_OFFSET_LEVEL; + if (opt->offset_level > ANNOTATION__MAX_OFFSET_LEVEL) + opt->offset_level = ANNOTATION__MAX_OFFSET_LEVEL; + else if (opt->offset_level < ANNOTATION__MIN_OFFSET_LEVEL) + opt->offset_level = ANNOTATION__MIN_OFFSET_LEVEL; + } else if (!strcmp(var, "annotate.hide_src_code")) { + opt->hide_src_code = perf_config_bool("hide_src_code", value); + } else if (!strcmp(var, "annotate.jump_arrows")) { + opt->jump_arrows = perf_config_bool("jump_arrows", value); + } else if (!strcmp(var, "annotate.show_linenr")) { + opt->show_linenr = perf_config_bool("show_linenr", value); + } else if (!strcmp(var, "annotate.show_nr_jumps")) { + opt->show_nr_jumps = perf_config_bool("show_nr_jumps", value); + } else if (!strcmp(var, "annotate.show_nr_samples")) { + symbol_conf.show_nr_samples = perf_config_bool("show_nr_samples", + value); + } else if (!strcmp(var, "annotate.show_total_period")) { + symbol_conf.show_total_period = perf_config_bool("show_total_period", + value); + } else if (!strcmp(var, "annotate.use_offset")) { + opt->use_offset = perf_config_bool("use_offset", value); } else { - *(bool *)cfg->value = perf_config_bool(name, value); + pr_debug("%s variable unknown, ignoring...", var); } + return 0; } -void annotation_config__init(void) +void annotation_config__init(struct annotation_options *opt) { - perf_config(annotation__config, NULL); + perf_config(annotation__config, opt); } static unsigned int parse_percent_type(char *str1, char *str2) diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 6237c2cc582d..8e54184b43dc 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -83,8 +83,6 @@ struct annotation_options { full_path, show_linenr, show_nr_jumps, - show_nr_samples, - show_total_period, show_minmax_cycle, show_asm_raw, annotate_src; @@ -413,7 +411,7 @@ static inline int symbol__tui_annotate(struct map_symbol *ms __maybe_unused, } #endif -void annotation_config__init(void); +void annotation_config__init(struct annotation_options *opt); int annotate_parse_percent_type(const struct option *opt, const char *_str, int unset); From 812b0f528240ab0e6c58911fcfcb61f4ed811ca2 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Feb 2020 12:13:04 +0530 Subject: [PATCH 230/400] perf annotate: Prefer cmdline option over default config MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit For all the perf-config options that can also be set from command line option, the preference is given to command line version in case of any conflict. But that's opposite in case of perf annotate. i.e. the more preference is given to default option rather than command line option. Fix it. Before: $ ./perf config annotate.show_nr_samples=false $ ./perf annotate shash --show-nr-samples Percent│ │24: mov -0xc(%rbp),%eax 49.19 │ imul $0x1003f,%eax,%ecx │ mov -0x18(%rbp),%rax After: Samples│ │24: mov -0xc(%rbp),%eax 1 │ imul $0x1003f,%eax,%ecx │ mov -0x18(%rbp),%rax Signed-off-by: Ravi Bangoria Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Alexey Budankov Cc: Changbin Du Cc: Ian Rogers Cc: Jin Yao Cc: Jiri Olsa Cc: Leo Yan Cc: Namhyung Kim Cc: Song Liu Cc: Taeung Song Cc: Thomas Richter Cc: Yisheng Xie Link: http://lore.kernel.org/lkml/20200213064306.160480-7-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-annotate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c index ea89077bb8e0..6c0a0412502e 100644 --- a/tools/perf/builtin-annotate.c +++ b/tools/perf/builtin-annotate.c @@ -566,6 +566,8 @@ int cmd_annotate(int argc, const char **argv) if (ret < 0) return ret; + annotation_config__init(&annotate.opts); + argc = parse_options(argc, argv, options, annotate_usage, 0); if (argc) { /* @@ -605,8 +607,6 @@ int cmd_annotate(int argc, const char **argv) if (ret < 0) goto out_delete; - annotation_config__init(&annotate.opts); - symbol_conf.try_vmlinux_path = true; ret = symbol__init(&annotate.session->header.env); From cd0a9c518db123e2097e03eae374e822d82493fd Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Feb 2020 12:13:05 +0530 Subject: [PATCH 231/400] perf annotate: Fix perf config option description perf config annotate options says it works only with TUI, which is wrong. Most of the TUI options are applicable to stdio2 as well. So remove that generic line and add individual line with each option stating which browsers supports that option. Also, annotate.show_nr_samples config is missing in Documentation. Describe it. Signed-off-by: Ravi Bangoria Cc: Adrian Hunter Cc: Alexey Budankov Cc: Changbin Du Cc: Ian Rogers Cc: Jin Yao Cc: Jiri Olsa Cc: Leo Yan Cc: Namhyung Kim Cc: Song Liu Cc: Taeung Song Cc: Thomas Richter Cc: Yisheng Xie Link: http://lore.kernel.org/lkml/20200213064306.160480-8-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Documentation/perf-config.txt | 30 +++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt index c4dd23c4b478..9dae0df3ab7e 100644 --- a/tools/perf/Documentation/perf-config.txt +++ b/tools/perf/Documentation/perf-config.txt @@ -239,7 +239,6 @@ buildid.*:: set buildid.dir to /dev/null. The default is $HOME/.debug annotate.*:: - These options work only for TUI. These are in control of addresses, jump function, source code in lines of assembly code from a specific program. @@ -269,6 +268,8 @@ annotate.*:: │ mov (%rdi),%rdx │ return n; + This option works with tui, stdio2 browsers. + annotate.use_offset:: Basing on a first address of a loaded function, offset can be used. Instead of using original addresses of assembly code, @@ -287,6 +288,8 @@ annotate.*:: 368:│ mov 0x8(%r14),%rdi + This option works with tui, stdio2 browsers. + annotate.jump_arrows:: There can be jump instruction among assembly code. Depending on a boolean value of jump_arrows, @@ -306,6 +309,8 @@ annotate.*:: │1330: mov %r15,%r10 │1333: cmp %r15,%r14 + This option works with tui browser. + annotate.show_linenr:: When showing source code if this option is 'true', line numbers are printed as below. @@ -325,6 +330,8 @@ annotate.*:: │ array++; │ } + This option works with tui, stdio2 browsers. + annotate.show_nr_jumps:: Let's see a part of assembly code. @@ -335,6 +342,8 @@ annotate.*:: │1 1382: movb $0x1,-0x270(%rbp) + This option works with tui, stdio2 browsers. + annotate.show_total_period:: To compare two records on an instruction base, with this option provided, display total number of samples that belong to a line @@ -348,11 +357,30 @@ annotate.*:: 99.93 │ mov %eax,%eax + This option works with tui, stdio2, stdio browsers. + + annotate.show_nr_samples:: + By default perf annotate shows percentage of samples. This option + can be used to print absolute number of samples. Ex, when set as + false: + + Percent│ + 74.03 │ mov %fs:0x28,%rax + + When set as true: + + Samples│ + 6 │ mov %fs:0x28,%rax + + This option works with tui, stdio2, stdio browsers. + annotate.offset_level:: Default is '1', meaning just jump targets will have offsets show right beside the instruction. When set to '2' 'call' instructions will also have its offsets shown, 3 or higher will show offsets for all instructions. + This option works with tui, stdio2 browsers. + hist.*:: hist.percentage:: This option control the way to calculate overhead of filtered entries - From b0aaf4c8f31feb21de59df723231c286df2d6be3 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Thu, 13 Feb 2020 12:13:06 +0530 Subject: [PATCH 232/400] perf config: Document missing config options While documenting annotate.show_nr_samples config option, I found many other config options missing in perf-config documentation. Add them. Signed-off-by: Ravi Bangoria Cc: Adrian Hunter Cc: Alexey Budankov Cc: Changbin Du Cc: Ian Rogers Cc: Jin Yao Cc: Jiri Olsa Cc: Leo Yan Cc: Namhyung Kim Cc: Song Liu Cc: Taeung Song Cc: Thomas Richter Cc: Yisheng Xie Link: http://lore.kernel.org/lkml/20200213064306.160480-9-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Documentation/perf-config.txt | 44 ++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt index 9dae0df3ab7e..8ead55593984 100644 --- a/tools/perf/Documentation/perf-config.txt +++ b/tools/perf/Documentation/perf-config.txt @@ -518,6 +518,12 @@ top.*:: column by default. The default is 'true'. + top.call-graph:: + This is identical to 'call-graph.record-mode', except it is + applicable only for 'top' subcommand. This option ONLY setup + the unwind method. To enable 'perf top' to actually use it, + the command line option -g must be specified. + man.*:: man.viewer:: This option can assign a tool to view manual pages when 'help' @@ -545,6 +551,16 @@ record.*:: But if this option is 'no-cache', it will not update the build-id cache. 'skip' skips post-processing and does not update the cache. + record.call-graph:: + This is identical to 'call-graph.record-mode', except it is + applicable only for 'record' subcommand. This option ONLY setup + the unwind method. To enable 'perf record' to actually use it, + the command line option -g must be specified. + + record.aio:: + Use 'n' control blocks in asynchronous (Posix AIO) trace writing + mode ('n' default: 1, max: 4). + diff.*:: diff.order:: This option sets the number of columns to sort the result. @@ -594,6 +610,11 @@ trace.*:: "libbeauty", the default, to use the same argument beautifiers used in the strace-like sys_enter+sys_exit lines. +ftrace.*:: + ftrace.tracer:: + Can be used to select the default tracer. Possible values are + 'function' and 'function_graph'. + llvm.*:: llvm.clang-path:: Path to clang. If omit, search it from $PATH. @@ -638,6 +659,29 @@ scripts.*:: The script gets the same options passed as a full perf script, in particular -i perfdata file, --cpu, --tid +convert.*:: + + convert.queue-size:: + Limit the size of ordered_events queue, so we could control + allocation size of perf data files without proper finished + round events. + +intel-pt.*:: + + intel-pt.cache-divisor:: + + intel-pt.mispred-all:: + If set, Intel PT decoder will set the mispred flag on all + branches. + +auxtrace.*:: + + auxtrace.dumpdir:: + s390 only. The directory to save the auxiliary trace buffer + can be changed using this option. Ex, auxtrace.dumpdir=/tmp. + If the directory does not exist or has the wrong file type, + the current directory is used. + SEE ALSO -------- linkperf:perf[1] From bebdb65e077267957f48e43d205d4a16cc7b8161 Mon Sep 17 00:00:00 2001 From: Tobias Klauser Date: Wed, 26 Feb 2020 18:38:32 +0100 Subject: [PATCH 233/400] io_uring: define and set show_fdinfo only if procfs is enabled Follow the pattern used with other *_show_fdinfo functions and only define and use io_uring_show_fdinfo and its helper functions if CONFIG_PROC_FS is set. Fixes: 87ce955b24c9 ("io_uring: add ->show_fdinfo() for the io_uring file descriptor") Signed-off-by: Tobias Klauser Signed-off-by: Jens Axboe --- fs/io_uring.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/io_uring.c b/fs/io_uring.c index e412a1761d93..05eea06f5421 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6641,6 +6641,7 @@ out_fput: return submitted ? submitted : ret; } +#ifdef CONFIG_PROC_FS static int io_uring_show_cred(int id, void *p, void *data) { const struct cred *cred = p; @@ -6714,6 +6715,7 @@ static void io_uring_show_fdinfo(struct seq_file *m, struct file *f) percpu_ref_put(&ctx->refs); } } +#endif static const struct file_operations io_uring_fops = { .release = io_uring_release, @@ -6725,7 +6727,9 @@ static const struct file_operations io_uring_fops = { #endif .poll = io_uring_poll, .fasync = io_uring_fasync, +#ifdef CONFIG_PROC_FS .show_fdinfo = io_uring_show_fdinfo, +#endif }; static int io_allocate_scq_urings(struct io_ring_ctx *ctx, From bd862b1d839221322b2e38eb8a06861604804b5e Mon Sep 17 00:00:00 2001 From: He Zhe Date: Wed, 26 Feb 2020 22:30:04 +0800 Subject: [PATCH 234/400] perf probe: Check return value of strlist__add() for -ENOMEM strlist__add() may fail with -ENOMEM. Check it and give debugging hint in advance. Signed-off-by: He Zhe Acked-by: Jiri Olsa Cc: Alexander Shishkin Cc: Kate Stewart Cc: Mark Rutland Cc: Masami Hiramatsu Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lore.kernel.org/lkml/1582727404-180095-1-git-send-email-zhe.he@windriver.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-probe.c | 6 ++++-- tools/perf/util/probe-file.c | 28 ++++++++++++++++++++++++---- 2 files changed, 28 insertions(+), 6 deletions(-) diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c index 26bc5923e6b5..70548df2abb9 100644 --- a/tools/perf/builtin-probe.c +++ b/tools/perf/builtin-probe.c @@ -449,7 +449,8 @@ static int perf_del_probe_events(struct strfilter *filter) ret = probe_file__del_strlist(kfd, klist); if (ret < 0) goto error; - } + } else if (ret == -ENOMEM) + goto error; ret2 = probe_file__get_events(ufd, filter, ulist); if (ret2 == 0) { @@ -459,7 +460,8 @@ static int perf_del_probe_events(struct strfilter *filter) ret2 = probe_file__del_strlist(ufd, ulist); if (ret2 < 0) goto error; - } + } else if (ret2 == -ENOMEM) + goto error; if (ret == -ENOENT && ret2 == -ENOENT) pr_warning("\"%s\" does not hit any event.\n", str); diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c index 5003ba403345..0f5fda11675f 100644 --- a/tools/perf/util/probe-file.c +++ b/tools/perf/util/probe-file.c @@ -301,10 +301,15 @@ int probe_file__get_events(int fd, struct strfilter *filter, p = strchr(ent->s, ':'); if ((p && strfilter__compare(filter, p + 1)) || strfilter__compare(filter, ent->s)) { - strlist__add(plist, ent->s); + ret = strlist__add(plist, ent->s); + if (ret == -ENOMEM) { + pr_err("strlist__add failed with -ENOMEM\n"); + goto out; + } ret = 0; } } +out: strlist__delete(namelist); return ret; @@ -511,7 +516,11 @@ static int probe_cache__load(struct probe_cache *pcache) ret = -EINVAL; goto out; } - strlist__add(entry->tevlist, buf); + ret = strlist__add(entry->tevlist, buf); + if (ret == -ENOMEM) { + pr_err("strlist__add failed with -ENOMEM\n"); + goto out; + } } } out: @@ -672,7 +681,12 @@ int probe_cache__add_entry(struct probe_cache *pcache, command = synthesize_probe_trace_command(&tevs[i]); if (!command) goto out_err; - strlist__add(entry->tevlist, command); + ret = strlist__add(entry->tevlist, command); + if (ret == -ENOMEM) { + pr_err("strlist__add failed with -ENOMEM\n"); + goto out_err; + } + free(command); } list_add_tail(&entry->node, &pcache->entries); @@ -853,9 +867,15 @@ int probe_cache__scan_sdt(struct probe_cache *pcache, const char *pathname) break; } - strlist__add(entry->tevlist, buf); + ret = strlist__add(entry->tevlist, buf); + free(buf); entry = NULL; + + if (ret == -ENOMEM) { + pr_err("strlist__add failed with -ENOMEM\n"); + break; + } } if (entry) { list_del_init(&entry->node); From e0ad4d68548005adb54cc7c35fd9abf760a2a050 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Tue, 4 Feb 2020 10:22:28 +0530 Subject: [PATCH 235/400] perf annotate: Remove privsize from symbol__annotate() args privsize is passed as 0 from all the symbol__annotate() callers. Remove it from argument list. Signed-off-by: Ravi Bangoria Acked-by: Jiri Olsa Cc: Ian Rogers Cc: Jin Yao Cc: Namhyung Kim Cc: Song Liu Link: http://lore.kernel.org/lkml/20200204045233.474937-2-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-top.c | 2 +- tools/perf/ui/gtk/annotate.c | 2 +- tools/perf/util/annotate.c | 7 ++++--- tools/perf/util/annotate.h | 2 +- 4 files changed, 7 insertions(+), 6 deletions(-) diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index cc26aeab6a66..f6dd1a63f159 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -143,7 +143,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he) return err; } - err = symbol__annotate(&he->ms, evsel, 0, &top->annotation_opts, NULL); + err = symbol__annotate(&he->ms, evsel, &top->annotation_opts, NULL); if (err == 0) { top->sym_filter_entry = he; } else { diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c index 22cc240f7371..35f9641bf670 100644 --- a/tools/perf/ui/gtk/annotate.c +++ b/tools/perf/ui/gtk/annotate.c @@ -174,7 +174,7 @@ static int symbol__gtk_annotate(struct map_symbol *ms, struct evsel *evsel, if (ms->map->dso->annotate_warned) return -1; - err = symbol__annotate(ms, evsel, 0, &annotation__default_options, NULL); + err = symbol__annotate(ms, evsel, &annotation__default_options, NULL); if (err) { char msg[BUFSIZ]; symbol__strerror_disassemble(ms, err, msg, sizeof(msg)); diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 3b79da595db6..a76309fcf381 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -2149,9 +2149,10 @@ void symbol__calc_percent(struct symbol *sym, struct evsel *evsel) annotation__calc_percent(notes, evsel, symbol__size(sym)); } -int symbol__annotate(struct map_symbol *ms, struct evsel *evsel, size_t privsize, +int symbol__annotate(struct map_symbol *ms, struct evsel *evsel, struct annotation_options *options, struct arch **parch) { + size_t privsize = 0; struct symbol *sym = ms->sym; struct annotation *notes = symbol__annotation(sym); struct annotate_args args = { @@ -2790,7 +2791,7 @@ int symbol__tty_annotate(struct map_symbol *ms, struct evsel *evsel, struct symbol *sym = ms->sym; struct rb_root source_line = RB_ROOT; - if (symbol__annotate(ms, evsel, 0, opts, NULL) < 0) + if (symbol__annotate(ms, evsel, opts, NULL) < 0) return -1; symbol__calc_percent(sym, evsel); @@ -3070,7 +3071,7 @@ int symbol__annotate2(struct map_symbol *ms, struct evsel *evsel, if (perf_evsel__is_group_event(evsel)) nr_pcnt = evsel->core.nr_members; - err = symbol__annotate(ms, evsel, 0, options, parch); + err = symbol__annotate(ms, evsel, options, parch); if (err) goto out_free_offsets; diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 8e54184b43dc..7bc60988e478 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -350,7 +350,7 @@ struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists); void symbol__annotate_zero_histograms(struct symbol *sym); int symbol__annotate(struct map_symbol *ms, - struct evsel *evsel, size_t privsize, + struct evsel *evsel, struct annotation_options *options, struct arch **parch); int symbol__annotate2(struct map_symbol *ms, From 73a7a271b3eee7b83f29b13866163776f1cbef89 Mon Sep 17 00:00:00 2001 From: Marek Szyprowski Date: Thu, 27 Feb 2020 12:51:46 +0100 Subject: [PATCH 236/400] PCI: brcmstb: Fix build on 32bit ARM platforms with older compilers Some older compilers have no implementation for the helper for 64-bit unsigned division/modulo, so linking pcie-brcmstb driver causes the "undefined reference to `__aeabi_uldivmod'" error. *rc_bar2_size is always a power of two, because it is calculated as: "1ULL << fls64(entry->res->end - entry->res->start)", so the modulo operation in the subsequent check can be replaced by a simple logical AND with a proper mask. Link: https://lore.kernel.org/r/20200227115146.24515-1-m.szyprowski@samsung.com Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver") Signed-off-by: Marek Szyprowski Signed-off-by: Bjorn Helgaas Acked-by: Nicolas Saenz Julienne Acked-by: Lorenzo Pieralisi --- drivers/pci/controller/pcie-brcmstb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c index d20aabc26273..3a10e678c7f4 100644 --- a/drivers/pci/controller/pcie-brcmstb.c +++ b/drivers/pci/controller/pcie-brcmstb.c @@ -670,7 +670,7 @@ static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie, * outbound memory @ 3GB). So instead it will start at the 1x * multiple of its size */ - if (!*rc_bar2_size || *rc_bar2_offset % *rc_bar2_size || + if (!*rc_bar2_size || (*rc_bar2_offset & (*rc_bar2_size - 1)) || (*rc_bar2_offset < SZ_4G && *rc_bar2_offset > SZ_2G)) { dev_err(dev, "Invalid rc_bar2_offset/size: size 0x%llx, off 0x%llx\n", *rc_bar2_size, *rc_bar2_offset); From 2316f861ae9ca640708f9529ae40a6f0bd7ae048 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Tue, 4 Feb 2020 10:22:29 +0530 Subject: [PATCH 237/400] perf annotate: Simplify disasm_line allocation and freeing code We are allocating disasm_line object in annotation_line__new() instead of disasm_line__new(). Similarly annotation_line__delete() is actually freeing disasm_line object as well. This complexity is because of privsize. But we don't need privsize anymore so get rid of privsize and simplify disasm_line allocation and freeing code. Signed-off-by: Ravi Bangoria Acked-by: Jiri Olsa Cc: Ian Rogers Cc: Jin Yao Cc: Namhyung Kim Cc: Song Liu Link: http://lore.kernel.org/lkml/20200204045233.474937-3-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/annotate.c | 92 ++++++++++++++------------------------ tools/perf/util/annotate.h | 1 - 2 files changed, 34 insertions(+), 59 deletions(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index a76309fcf381..f11031a40290 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1143,7 +1143,6 @@ out: } struct annotate_args { - size_t privsize; struct arch *arch; struct map_symbol ms; struct evsel *evsel; @@ -1153,83 +1152,61 @@ struct annotate_args { int line_nr; }; -static void annotation_line__delete(struct annotation_line *al) +static void annotation_line__init(struct annotation_line *al, + struct annotate_args *args, + int nr) { - void *ptr = (void *) al - al->privsize; - - free_srcline(al->path); - zfree(&al->line); - free(ptr); + al->offset = args->offset; + al->line = strdup(args->line); + al->line_nr = args->line_nr; + al->data_nr = nr; } -/* - * Allocating the annotation line data with following - * structure: - * - * -------------------------------------- - * private space | struct annotation_line - * -------------------------------------- - * - * Size of the private space is stored in 'struct annotation_line'. - * - */ -static struct annotation_line * -annotation_line__new(struct annotate_args *args, size_t privsize) +static void annotation_line__exit(struct annotation_line *al) +{ + free_srcline(al->path); + zfree(&al->line); +} + +static size_t disasm_line_size(int nr) { struct annotation_line *al; - struct evsel *evsel = args->evsel; - size_t size = privsize + sizeof(*al); - int nr = 1; - if (perf_evsel__is_group_event(evsel)) - nr = evsel->core.nr_members; - - size += sizeof(al->data[0]) * nr; - - al = zalloc(size); - if (al) { - al = (void *) al + privsize; - al->privsize = privsize; - al->offset = args->offset; - al->line = strdup(args->line); - al->line_nr = args->line_nr; - al->data_nr = nr; - } - - return al; + return (sizeof(struct disasm_line) + (sizeof(al->data[0]) * nr)); } /* * Allocating the disasm annotation line data with * following structure: * - * ------------------------------------------------------------ - * privsize space | struct disasm_line | struct annotation_line - * ------------------------------------------------------------ + * ------------------------------------------- + * struct disasm_line | struct annotation_line + * ------------------------------------------- * * We have 'struct annotation_line' member as last member * of 'struct disasm_line' to have an easy access. - * */ static struct disasm_line *disasm_line__new(struct annotate_args *args) { struct disasm_line *dl = NULL; - struct annotation_line *al; - size_t privsize = args->privsize + offsetof(struct disasm_line, al); + int nr = 1; - al = annotation_line__new(args, privsize); - if (al != NULL) { - dl = disasm_line(al); + if (perf_evsel__is_group_event(args->evsel)) + nr = args->evsel->core.nr_members; - if (dl->al.line == NULL) - goto out_delete; + dl = zalloc(disasm_line_size(nr)); + if (!dl) + return NULL; - if (args->offset != -1) { - if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0) - goto out_free_line; + annotation_line__init(&dl->al, args, nr); + if (dl->al.line == NULL) + goto out_delete; - disasm_line__init_ins(dl, args->arch, &args->ms); - } + if (args->offset != -1) { + if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0) + goto out_free_line; + + disasm_line__init_ins(dl, args->arch, &args->ms); } return dl; @@ -1248,7 +1225,8 @@ void disasm_line__free(struct disasm_line *dl) else ins__delete(&dl->ops); zfree(&dl->ins.name); - annotation_line__delete(&dl->al); + annotation_line__exit(&dl->al); + free(dl); } int disasm_line__scnprintf(struct disasm_line *dl, char *bf, size_t size, bool raw, int max_ins_name) @@ -2152,11 +2130,9 @@ void symbol__calc_percent(struct symbol *sym, struct evsel *evsel) int symbol__annotate(struct map_symbol *ms, struct evsel *evsel, struct annotation_options *options, struct arch **parch) { - size_t privsize = 0; struct symbol *sym = ms->sym; struct annotation *notes = symbol__annotation(sym); struct annotate_args args = { - .privsize = privsize, .evsel = evsel, .options = options, }; diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index 7bc60988e478..001258601a37 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -139,7 +139,6 @@ struct annotation_line { u64 cycles; u64 cycles_max; u64 cycles_min; - size_t privsize; char *path; u32 idx; int idx_asm; From d3c03147bf8019bda821334428e0ba31ce4fb425 Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Tue, 4 Feb 2020 10:22:30 +0530 Subject: [PATCH 238/400] perf annotate: Align struct annotate_args Align fields of struct annotate_args. Signed-off-by: Ravi Bangoria Acked-by: Jiri Olsa Cc: Ian Rogers Cc: Jin Yao Cc: Namhyung Kim Cc: Song Liu Link: http://lore.kernel.org/lkml/20200204045233.474937-4-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/annotate.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index f11031a40290..c816e5840166 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1143,13 +1143,13 @@ out: } struct annotate_args { - struct arch *arch; - struct map_symbol ms; - struct evsel *evsel; + struct arch *arch; + struct map_symbol ms; + struct evsel *evsel; struct annotation_options *options; - s64 offset; - char *line; - int line_nr; + s64 offset; + char *line; + int line_nr; }; static void annotation_line__init(struct annotation_line *al, From e0560ba6d92f06dbe13e9d11c921a60c07ea6fcc Mon Sep 17 00:00:00 2001 From: Ravi Bangoria Date: Tue, 4 Feb 2020 10:22:31 +0530 Subject: [PATCH 239/400] perf annotate: Fix segfault with source toggle While rendering annotate browser from perf report tui, we keep track of total number of lines(asm + source) in annotation->nr_entries and total number of asm lines in annotation->nr_asm_entries. But we don't reset them before starting. Thus if user annotates same function multiple times, we restart incrementing these fields with old values. This causes a segfault when user tries to toggle source code after annotating same function multiple times. Fix it. Signed-off-by: Ravi Bangoria Tested-by: Arnaldo Carvalho de Melo Acked-by: Jiri Olsa Cc: Ian Rogers Cc: Jin Yao Cc: Namhyung Kim Cc: Song Liu Link: http://lore.kernel.org/lkml/20200204045233.474937-5-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/annotate.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index c816e5840166..0ea95be84b3b 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -2621,6 +2621,8 @@ void annotation__set_offsets(struct annotation *notes, s64 size) struct annotation_line *al; notes->max_line_len = 0; + notes->nr_entries = 0; + notes->nr_asm_entries = 0; list_for_each_entry(al, ¬es->src->source, node) { size_t line_len = strlen(al->line); From 9515743bfb39c61aaf3d4f3219a645c8d1fe9a0e Mon Sep 17 00:00:00 2001 From: Bijan Mottahedeh Date: Wed, 26 Feb 2020 18:53:43 -0800 Subject: [PATCH 240/400] nvme-pci: Hold cq_poll_lock while completing CQEs Completions need to consumed in the same order the controller submitted them, otherwise future completion entries may overwrite ones we haven't handled yet. Hold the nvme queue's poll lock while completing new CQEs to prevent another thread from freeing command tags for reuse out-of-order. Fixes: dabcefab45d3 ("nvme: provide optimized poll function for separate poll queues") Signed-off-by: Bijan Mottahedeh Reviewed-by: Sagi Grimberg Reviewed-by: Jens Axboe Signed-off-by: Keith Busch --- drivers/nvme/host/pci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index ace4dd9e953c..d3f23d6254e4 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1078,9 +1078,9 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx) spin_lock(&nvmeq->cq_poll_lock); found = nvme_process_cq(nvmeq, &start, &end, -1); + nvme_complete_cqes(nvmeq, start, end); spin_unlock(&nvmeq->cq_poll_lock); - nvme_complete_cqes(nvmeq, start, end); return found; } From 7cdf6a0aae1cccf5167f3f04ecddcf648b78e289 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Wed, 19 Feb 2020 10:25:45 -0500 Subject: [PATCH 241/400] dm cache: fix a crash due to incorrect work item cancelling The crash can be reproduced by running the lvm2 testsuite test lvconvert-thin-external-cache.sh for several minutes, e.g.: while :; do make check T=shell/lvconvert-thin-external-cache.sh; done The crash happens in this call chain: do_waker -> policy_tick -> smq_tick -> end_hotspot_period -> clear_bitset -> memset -> __memset -- which accesses an invalid pointer in the vmalloc area. The work entry on the workqueue is executed even after the bitmap was freed. The problem is that cancel_delayed_work doesn't wait for the running work item to finish, so the work item can continue running and re-submitting itself even after cache_postsuspend. In order to make sure that the work item won't be running, we must use cancel_delayed_work_sync. Also, change flush_workqueue to drain_workqueue, so that if some work item submits itself or another work item, we are properly waiting for both of them. Fixes: c6b4fcbad044 ("dm: add cache target") Cc: stable@vger.kernel.org # v3.9 Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-cache-target.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c index 2d32821b3a5b..f4be63671233 100644 --- a/drivers/md/dm-cache-target.c +++ b/drivers/md/dm-cache-target.c @@ -2846,8 +2846,8 @@ static void cache_postsuspend(struct dm_target *ti) prevent_background_work(cache); BUG_ON(atomic_read(&cache->nr_io_migrations)); - cancel_delayed_work(&cache->waker); - flush_workqueue(cache->wq); + cancel_delayed_work_sync(&cache->waker); + drain_workqueue(cache->wq); WARN_ON(cache->tracker.in_flight); /* From 3918e0667bbac99400b44fa5aef3f8be2eeada4a Mon Sep 17 00:00:00 2001 From: Theodore Ts'o Date: Sun, 23 Feb 2020 14:54:58 -0500 Subject: [PATCH 242/400] dm thin metadata: fix lockdep complaint [ 3934.173244] ====================================================== [ 3934.179572] WARNING: possible circular locking dependency detected [ 3934.185884] 5.4.21-xfstests #1 Not tainted [ 3934.190151] ------------------------------------------------------ [ 3934.196673] dmsetup/8897 is trying to acquire lock: [ 3934.201688] ffffffffbce82b18 (shrinker_rwsem){++++}, at: unregister_shrinker+0x22/0x80 [ 3934.210268] but task is already holding lock: [ 3934.216489] ffff92a10cc5e1d0 (&pmd->root_lock){++++}, at: dm_pool_metadata_close+0xba/0x120 [ 3934.225083] which lock already depends on the new lock. [ 3934.564165] Chain exists of: shrinker_rwsem --> &journal->j_checkpoint_mutex --> &pmd->root_lock For a more detailed lockdep report, please see: https://lore.kernel.org/r/20200220234519.GA620489@mit.edu We shouldn't need to hold the lock while are just tearing down and freeing the whole metadata pool structure. Fixes: 44d8ebf436399a4 ("dm thin metadata: use pool locking at end of dm_pool_metadata_close") Signed-off-by: Theodore Ts'o Signed-off-by: Mike Snitzer --- drivers/md/dm-thin-metadata.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c index fc9947d6210c..76b6b323bf4b 100644 --- a/drivers/md/dm-thin-metadata.c +++ b/drivers/md/dm-thin-metadata.c @@ -960,9 +960,9 @@ int dm_pool_metadata_close(struct dm_pool_metadata *pmd) DMWARN("%s: __commit_transaction() failed, error = %d", __func__, r); } + pmd_write_unlock(pmd); if (!pmd->fail_io) __destroy_persistent_data_objects(pmd); - pmd_write_unlock(pmd); kfree(pmd); return 0; From 735a6dd02222d8d070c7bb748f25895239ca8c92 Mon Sep 17 00:00:00 2001 From: Sean Christopherson Date: Wed, 26 Feb 2020 15:16:15 -0800 Subject: [PATCH 243/400] x86/pkeys: Manually set X86_FEATURE_OSPKE to preserve existing changes Explicitly set X86_FEATURE_OSPKE via set_cpu_cap() instead of calling get_cpu_cap() to pull the feature bit from CPUID after enabling CR4.PKE. Invoking get_cpu_cap() effectively wipes out any {set,clear}_cpu_cap() changes that were made between this_cpu->c_init() and setup_pku(), as all non-synthetic feature words are reinitialized from the CPU's CPUID values. Blasting away capability updates manifests most visibility when running on a VMX capable CPU, but with VMX disabled by BIOS. To indicate that VMX is disabled, init_ia32_feat_ctl() clears X86_FEATURE_VMX, using clear_cpu_cap() instead of setup_clear_cpu_cap() so that KVM can report which CPU is misconfigured (KVM needs to probe every CPU anyways). Restoring X86_FEATURE_VMX from CPUID causes KVM to think VMX is enabled, ultimately leading to an unexpected #GP when KVM attempts to do VMXON. Arguably, init_ia32_feat_ctl() should use setup_clear_cpu_cap() and let KVM figure out a different way to report the misconfigured CPU, but VMX is not the only feature bit that is affected, i.e. there is precedent that tweaking feature bits via {set,clear}_cpu_cap() after ->c_init() is expected to work. Most notably, x86_init_rdrand()'s clearing of X86_FEATURE_RDRAND when RDRAND malfunctions is also overwritten. Fixes: 0697694564c8 ("x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU") Reported-by: Jacob Keller Signed-off-by: Sean Christopherson Signed-off-by: Borislav Petkov Acked-by: Dave Hansen Tested-by: Jacob Keller Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200226231615.13664-1-sean.j.christopherson@intel.com --- arch/x86/kernel/cpu/common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 52c9bfbbdb2a..4cdb123ff66a 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -445,7 +445,7 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c) * cpuid bit to be set. We need to ensure that we * update that bit in this CPU's "cpu_info". */ - get_cpu_cap(c); + set_cpu_cap(c, X86_FEATURE_OSPKE); } #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS From 0bff777bd0cba73ad4cd0145696ad284d7e6a99f Mon Sep 17 00:00:00 2001 From: Luo bin Date: Thu, 27 Feb 2020 06:34:42 +0000 Subject: [PATCH 244/400] hinic: fix a irq affinity bug can not use a local variable as an input parameter of irq_set_affinity_hint Signed-off-by: Luo bin Signed-off-by: David S. Miller --- drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h | 1 + drivers/net/ethernet/huawei/hinic/hinic_rx.c | 5 ++--- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h b/drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h index f4a339b10b10..79091e131418 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h +++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h @@ -94,6 +94,7 @@ struct hinic_rq { struct hinic_wq *wq; + struct cpumask affinity_mask; u32 irq; u16 msix_entry; diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c b/drivers/net/ethernet/huawei/hinic/hinic_rx.c index 56ea6d692f1c..2695ad69fca6 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c @@ -475,7 +475,6 @@ static int rx_request_irq(struct hinic_rxq *rxq) struct hinic_hwdev *hwdev = nic_dev->hwdev; struct hinic_rq *rq = rxq->rq; struct hinic_qp *qp; - struct cpumask mask; int err; rx_add_napi(rxq); @@ -492,8 +491,8 @@ static int rx_request_irq(struct hinic_rxq *rxq) } qp = container_of(rq, struct hinic_qp, rq); - cpumask_set_cpu(qp->q_id % num_online_cpus(), &mask); - return irq_set_affinity_hint(rq->irq, &mask); + cpumask_set_cpu(qp->q_id % num_online_cpus(), &rq->affinity_mask); + return irq_set_affinity_hint(rq->irq, &rq->affinity_mask); } static void rx_free_irq(struct hinic_rxq *rxq) From d2ed69ce9ed3477e2a9527e6b89fe4689d99510e Mon Sep 17 00:00:00 2001 From: Luo bin Date: Thu, 27 Feb 2020 06:34:43 +0000 Subject: [PATCH 245/400] hinic: fix a bug of setting hw_ioctxt a reserved field is used to signify prime physical function index in the latest firmware version, so we must assign a value to it correctly Signed-off-by: Luo bin Signed-off-by: David S. Miller --- drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c | 1 + drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h | 2 +- drivers/net/ethernet/huawei/hinic/hinic_hw_if.h | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c index 6f2cf569a283..79b3d53f2fbf 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c @@ -297,6 +297,7 @@ static int set_hw_ioctxt(struct hinic_hwdev *hwdev, unsigned int rq_depth, } hw_ioctxt.func_idx = HINIC_HWIF_FUNC_IDX(hwif); + hw_ioctxt.ppf_idx = HINIC_HWIF_PPF_IDX(hwif); hw_ioctxt.set_cmdq_depth = HW_IOCTXT_SET_CMDQ_DEPTH_DEFAULT; hw_ioctxt.cmdq_depth = 0; diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h index b069045de416..66fd2340d447 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h +++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h @@ -151,8 +151,8 @@ struct hinic_cmd_hw_ioctxt { u8 lro_en; u8 rsvd3; + u8 ppf_idx; u8 rsvd4; - u8 rsvd5; u16 rq_depth; u16 rx_buf_sz_idx; diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_if.h b/drivers/net/ethernet/huawei/hinic/hinic_hw_if.h index 517794509eb2..c7bb9ceca72c 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_hw_if.h +++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_if.h @@ -137,6 +137,7 @@ #define HINIC_HWIF_FUNC_IDX(hwif) ((hwif)->attr.func_idx) #define HINIC_HWIF_PCI_INTF(hwif) ((hwif)->attr.pci_intf_idx) #define HINIC_HWIF_PF_IDX(hwif) ((hwif)->attr.pf_idx) +#define HINIC_HWIF_PPF_IDX(hwif) ((hwif)->attr.ppf_idx) #define HINIC_FUNC_TYPE(hwif) ((hwif)->attr.func_type) #define HINIC_IS_PF(hwif) (HINIC_FUNC_TYPE(hwif) == HINIC_PF) From 386d4716fd91869e07c731657f2cde5a33086516 Mon Sep 17 00:00:00 2001 From: Luo bin Date: Thu, 27 Feb 2020 06:34:44 +0000 Subject: [PATCH 246/400] hinic: fix a bug of rss configuration should use real receive queue number to configure hw rss indirect table rather than maximal queue number Signed-off-by: Luo bin Signed-off-by: David S. Miller --- drivers/net/ethernet/huawei/hinic/hinic_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c index 02a14f5e7fe3..13560975c103 100644 --- a/drivers/net/ethernet/huawei/hinic/hinic_main.c +++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c @@ -356,7 +356,8 @@ static void hinic_enable_rss(struct hinic_dev *nic_dev) if (!num_cpus) num_cpus = num_online_cpus(); - nic_dev->num_qps = min_t(u16, nic_dev->max_qps, num_cpus); + nic_dev->num_qps = hinic_hwdev_num_qps(hwdev); + nic_dev->num_qps = min_t(u16, nic_dev->num_qps, num_cpus); nic_dev->rss_limit = nic_dev->num_qps; nic_dev->num_rss = nic_dev->num_qps; From 3a12500ed5dd21a63da779ac73503f11085bbc1c Mon Sep 17 00:00:00 2001 From: Tobias Klauser Date: Wed, 26 Feb 2020 18:29:53 +0100 Subject: [PATCH 247/400] unix: define and set show_fdinfo only if procfs is enabled Follow the pattern used with other *_show_fdinfo functions and only define unix_show_fdinfo and set it in proto_ops if CONFIG_PROCFS is set. Fixes: 3c32da19a858 ("unix: Show number of pending scm files of receive queue in fdinfo") Signed-off-by: Tobias Klauser Reviewed-by: Kirill Tkhai Signed-off-by: David S. Miller --- net/unix/af_unix.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 62c12cb5763e..aa6e2530e1ec 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -682,6 +682,7 @@ static int unix_set_peek_off(struct sock *sk, int val) return 0; } +#ifdef CONFIG_PROCFS static void unix_show_fdinfo(struct seq_file *m, struct socket *sock) { struct sock *sk = sock->sk; @@ -692,6 +693,9 @@ static void unix_show_fdinfo(struct seq_file *m, struct socket *sock) seq_printf(m, "scm_fds: %u\n", READ_ONCE(u->scm_stat.nr_fds)); } } +#else +#define unix_show_fdinfo NULL +#endif static const struct proto_ops unix_stream_ops = { .family = PF_UNIX, From e387f7d5fccf95299135c88b799184c3bef6a705 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Thu, 27 Feb 2020 08:22:10 +0100 Subject: [PATCH 248/400] mlx5: register lag notifier for init network namespace only The current code causes problems when the unregistering netdevice could be different then the registering one. Since the check in mlx5_lag_netdev_event() does not allow any other network namespace anyway, fix this by registerting the lag notifier per init network namespace only. Fixes: d48834f9d4b4 ("mlx5: Use dev_net netdevice notifier registrations") Signed-off-by: Jiri Pirko Tested-by: Aya Levin Acked-by: Saeed Mahameed Signed-off-by: David S. Miller --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +-- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 3 +-- drivers/net/ethernet/mellanox/mlx5/core/lag.c | 11 +++-------- drivers/net/ethernet/mellanox/mlx5/core/lag.h | 1 - drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h | 2 +- 5 files changed, 6 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 966983674663..21de4764d4c0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -5147,7 +5147,6 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv) static void mlx5e_nic_disable(struct mlx5e_priv *priv) { - struct net_device *netdev = priv->netdev; struct mlx5_core_dev *mdev = priv->mdev; #ifdef CONFIG_MLX5_CORE_EN_DCB @@ -5168,7 +5167,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv) mlx5e_monitor_counter_cleanup(priv); mlx5e_disable_async_events(priv); - mlx5_lag_remove(mdev, netdev); + mlx5_lag_remove(mdev); } int mlx5e_update_nic_rx(struct mlx5e_priv *priv) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 7b48ccacebe2..6ed307d7f191 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -1861,7 +1861,6 @@ static void mlx5e_uplink_rep_enable(struct mlx5e_priv *priv) static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv) { - struct net_device *netdev = priv->netdev; struct mlx5_core_dev *mdev = priv->mdev; struct mlx5e_rep_priv *rpriv = priv->ppriv; @@ -1870,7 +1869,7 @@ static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv) #endif mlx5_notifier_unregister(mdev, &priv->events_nb); cancel_work_sync(&rpriv->uplink_priv.reoffload_flows_work); - mlx5_lag_remove(mdev, netdev); + mlx5_lag_remove(mdev); } static MLX5E_DEFINE_STATS_GRP(sw_rep, 0); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c index b91eabc09fbc..8e19f6ab8393 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c @@ -464,9 +464,6 @@ static int mlx5_lag_netdev_event(struct notifier_block *this, struct mlx5_lag *ldev; int changed = 0; - if (!net_eq(dev_net(ndev), &init_net)) - return NOTIFY_DONE; - if ((event != NETDEV_CHANGEUPPER) && (event != NETDEV_CHANGELOWERSTATE)) return NOTIFY_DONE; @@ -586,8 +583,7 @@ void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev) if (!ldev->nb.notifier_call) { ldev->nb.notifier_call = mlx5_lag_netdev_event; - if (register_netdevice_notifier_dev_net(netdev, &ldev->nb, - &ldev->nn)) { + if (register_netdevice_notifier_net(&init_net, &ldev->nb)) { ldev->nb.notifier_call = NULL; mlx5_core_err(dev, "Failed to register LAG netdev notifier\n"); } @@ -600,7 +596,7 @@ void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev) } /* Must be called with intf_mutex held */ -void mlx5_lag_remove(struct mlx5_core_dev *dev, struct net_device *netdev) +void mlx5_lag_remove(struct mlx5_core_dev *dev) { struct mlx5_lag *ldev; int i; @@ -620,8 +616,7 @@ void mlx5_lag_remove(struct mlx5_core_dev *dev, struct net_device *netdev) if (i == MLX5_MAX_PORTS) { if (ldev->nb.notifier_call) - unregister_netdevice_notifier_dev_net(netdev, &ldev->nb, - &ldev->nn); + unregister_netdevice_notifier_net(&init_net, &ldev->nb); mlx5_lag_mp_cleanup(ldev); cancel_delayed_work_sync(&ldev->bond_work); mlx5_lag_dev_free(ldev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h index 316ab09e2664..f1068aac6406 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h @@ -44,7 +44,6 @@ struct mlx5_lag { struct workqueue_struct *wq; struct delayed_work bond_work; struct notifier_block nb; - struct netdev_net_notifier nn; struct lag_mp lag_mp; }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h index fcce9e0fc82c..da67b28d6e23 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h @@ -157,7 +157,7 @@ int mlx5_query_qcam_reg(struct mlx5_core_dev *mdev, u32 *qcam, u8 feature_group, u8 access_reg_group); void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev); -void mlx5_lag_remove(struct mlx5_core_dev *dev, struct net_device *netdev); +void mlx5_lag_remove(struct mlx5_core_dev *dev); int mlx5_irq_table_init(struct mlx5_core_dev *dev); void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev); From b82cf17ff1957ec35eaee7dc519c365ecd06ba38 Mon Sep 17 00:00:00 2001 From: Russell King Date: Thu, 27 Feb 2020 09:44:49 +0000 Subject: [PATCH 249/400] net: phy: marvell: don't interpret PHY status unless resolved Don't attempt to interpret the PHY specific status register unless the PHY is indicating that the resolution is valid. Reviewed-by: Andrew Lunn Signed-off-by: Russell King Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- drivers/net/phy/marvell.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c index 28e33ece4ce1..9a8badafea8a 100644 --- a/drivers/net/phy/marvell.c +++ b/drivers/net/phy/marvell.c @@ -1306,6 +1306,9 @@ static int marvell_read_status_page_an(struct phy_device *phydev, } } + if (!(status & MII_M1011_PHY_STATUS_RESOLVED)) + return 0; + if (status & MII_M1011_PHY_STATUS_FULLDUPLEX) phydev->duplex = DUPLEX_FULL; else @@ -1365,6 +1368,8 @@ static int marvell_read_status_page(struct phy_device *phydev, int page) linkmode_zero(phydev->lp_advertising); phydev->pause = 0; phydev->asym_pause = 0; + phydev->speed = SPEED_UNKNOWN; + phydev->duplex = DUPLEX_UNKNOWN; if (phydev->autoneg == AUTONEG_ENABLE) err = marvell_read_status_page_an(phydev, fiber, status); From 93b5cbfa9636d385126f211dca9efa7e3f683202 Mon Sep 17 00:00:00 2001 From: Taehee Yoo Date: Thu, 27 Feb 2020 12:23:52 +0000 Subject: [PATCH 250/400] net: rmnet: fix NULL pointer dereference in rmnet_newlink() rmnet registers IFLA_LINK interface as a lower interface. But, IFLA_LINK could be NULL. In the current code, rmnet doesn't check IFLA_LINK. So, panic would occur. Test commands: modprobe rmnet ip link add rmnet0 type rmnet mux_id 1 Splat looks like: [ 36.826109][ T1115] general protection fault, probably for non-canonical address 0xdffffc0000000000I [ 36.838817][ T1115] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 36.839908][ T1115] CPU: 1 PID: 1115 Comm: ip Not tainted 5.6.0-rc1+ #447 [ 36.840569][ T1115] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 36.841408][ T1115] RIP: 0010:rmnet_newlink+0x54/0x510 [rmnet] [ 36.841986][ T1115] Code: 83 ec 18 48 c1 e9 03 80 3c 01 00 0f 85 d4 03 00 00 48 8b 6a 28 48 b8 00 00 00 00 00 c [ 36.843923][ T1115] RSP: 0018:ffff8880b7e0f1c0 EFLAGS: 00010247 [ 36.844756][ T1115] RAX: dffffc0000000000 RBX: ffff8880d14cca00 RCX: 1ffff11016fc1e99 [ 36.845859][ T1115] RDX: 0000000000000000 RSI: ffff8880c3d04000 RDI: 0000000000000004 [ 36.846961][ T1115] RBP: 0000000000000000 R08: ffff8880b7e0f8b0 R09: ffff8880b6ac2d90 [ 36.848020][ T1115] R10: ffffffffc0589a40 R11: ffffed1016d585b7 R12: ffffffff88ceaf80 [ 36.848788][ T1115] R13: ffff8880c3d04000 R14: ffff8880b7e0f8b0 R15: ffff8880c3d04000 [ 36.849546][ T1115] FS: 00007f50ab3360c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000 [ 36.851784][ T1115] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 36.852422][ T1115] CR2: 000055871afe5ab0 CR3: 00000000ae246001 CR4: 00000000000606e0 [ 36.853181][ T1115] Call Trace: [ 36.853514][ T1115] __rtnl_newlink+0xbdb/0x1270 [ 36.853967][ T1115] ? lock_downgrade+0x6e0/0x6e0 [ 36.854420][ T1115] ? rtnl_link_unregister+0x220/0x220 [ 36.854936][ T1115] ? lock_acquire+0x164/0x3b0 [ 36.855376][ T1115] ? is_bpf_image_address+0xff/0x1d0 [ 36.855884][ T1115] ? rtnl_newlink+0x4c/0x90 [ 36.856304][ T1115] ? kernel_text_address+0x111/0x140 [ 36.856857][ T1115] ? __kernel_text_address+0xe/0x30 [ 36.857440][ T1115] ? unwind_get_return_address+0x5f/0xa0 [ 36.858063][ T1115] ? create_prof_cpu_mask+0x20/0x20 [ 36.858644][ T1115] ? arch_stack_walk+0x83/0xb0 [ 36.859171][ T1115] ? stack_trace_save+0x82/0xb0 [ 36.859710][ T1115] ? stack_trace_consume_entry+0x160/0x160 [ 36.860357][ T1115] ? deactivate_slab.isra.78+0x2c5/0x800 [ 36.860928][ T1115] ? kasan_unpoison_shadow+0x30/0x40 [ 36.861520][ T1115] ? kmem_cache_alloc_trace+0x135/0x350 [ 36.862125][ T1115] ? rtnl_newlink+0x4c/0x90 [ 36.864073][ T1115] rtnl_newlink+0x65/0x90 [ ... ] Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller --- drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index 06de59521fc4..471e3b2a1403 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -135,6 +135,11 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev, int err = 0; u16 mux_id; + if (!tb[IFLA_LINK]) { + NL_SET_ERR_MSG_MOD(extack, "link not specified"); + return -EINVAL; + } + real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK])); if (!real_dev || !dev) return -ENODEV; From 1eb1f43a6e37282348a41e3d68f5e9a6a4359212 Mon Sep 17 00:00:00 2001 From: Taehee Yoo Date: Thu, 27 Feb 2020 12:24:26 +0000 Subject: [PATCH 251/400] net: rmnet: fix NULL pointer dereference in rmnet_changelink() In the rmnet_changelink(), it uses IFLA_LINK without checking NULL pointer. tb[IFLA_LINK] could be NULL pointer. So, NULL-ptr-deref could occur. rmnet already has a lower interface (real_dev). So, after this patch, rmnet_changelink() does not use IFLA_LINK anymore. Test commands: modprobe rmnet ip link add dummy0 type dummy ip link add rmnet0 link dummy0 type rmnet mux_id 1 ip link set rmnet0 type rmnet mux_id 2 Splat looks like: [ 90.578726][ T1131] general protection fault, probably for non-canonical address 0xdffffc0000000000I [ 90.581121][ T1131] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 90.582380][ T1131] CPU: 2 PID: 1131 Comm: ip Not tainted 5.6.0-rc1+ #447 [ 90.584285][ T1131] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 90.587506][ T1131] RIP: 0010:rmnet_changelink+0x5a/0x8a0 [rmnet] [ 90.588546][ T1131] Code: 83 ec 20 48 c1 ea 03 80 3c 02 00 0f 85 6f 07 00 00 48 8b 5e 28 48 b8 00 00 00 00 00 0 [ 90.591447][ T1131] RSP: 0018:ffff8880ce78f1b8 EFLAGS: 00010247 [ 90.592329][ T1131] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff8880ce78f8b0 [ 90.593253][ T1131] RDX: 0000000000000000 RSI: ffff8880ce78f4a0 RDI: 0000000000000004 [ 90.594058][ T1131] RBP: ffff8880cf543e00 R08: 0000000000000002 R09: 0000000000000002 [ 90.594859][ T1131] R10: ffffffffc0586a40 R11: 0000000000000000 R12: ffff8880ca47c000 [ 90.595690][ T1131] R13: ffff8880ca47c000 R14: ffff8880cf545000 R15: 0000000000000000 [ 90.596553][ T1131] FS: 00007f21f6c7e0c0(0000) GS:ffff8880da400000(0000) knlGS:0000000000000000 [ 90.597504][ T1131] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 90.599418][ T1131] CR2: 0000556e413db458 CR3: 00000000c917a002 CR4: 00000000000606e0 [ 90.600289][ T1131] Call Trace: [ 90.600631][ T1131] __rtnl_newlink+0x922/0x1270 [ 90.601194][ T1131] ? lock_downgrade+0x6e0/0x6e0 [ 90.601724][ T1131] ? rtnl_link_unregister+0x220/0x220 [ 90.602309][ T1131] ? lock_acquire+0x164/0x3b0 [ 90.602784][ T1131] ? is_bpf_image_address+0xff/0x1d0 [ 90.603331][ T1131] ? rtnl_newlink+0x4c/0x90 [ 90.603810][ T1131] ? kernel_text_address+0x111/0x140 [ 90.604419][ T1131] ? __kernel_text_address+0xe/0x30 [ 90.604981][ T1131] ? unwind_get_return_address+0x5f/0xa0 [ 90.605616][ T1131] ? create_prof_cpu_mask+0x20/0x20 [ 90.606304][ T1131] ? arch_stack_walk+0x83/0xb0 [ 90.606985][ T1131] ? stack_trace_save+0x82/0xb0 [ 90.607656][ T1131] ? stack_trace_consume_entry+0x160/0x160 [ 90.608503][ T1131] ? deactivate_slab.isra.78+0x2c5/0x800 [ 90.609336][ T1131] ? kasan_unpoison_shadow+0x30/0x40 [ 90.610096][ T1131] ? kmem_cache_alloc_trace+0x135/0x350 [ 90.610889][ T1131] ? rtnl_newlink+0x4c/0x90 [ 90.611512][ T1131] rtnl_newlink+0x65/0x90 [ ... ] Fixes: 23790ef12082 ("net: qualcomm: rmnet: Allow to configure flags for existing devices") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller --- drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index 471e3b2a1403..ac58f584190b 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -300,10 +300,8 @@ static int rmnet_changelink(struct net_device *dev, struct nlattr *tb[], if (!dev) return -ENODEV; - real_dev = __dev_get_by_index(dev_net(dev), - nla_get_u32(tb[IFLA_LINK])); - - if (!real_dev || !rmnet_is_real_dev_registered(real_dev)) + real_dev = priv->real_dev; + if (!rmnet_is_real_dev_registered(real_dev)) return -ENODEV; port = rmnet_get_port_rtnl(real_dev); From 102210f7664442d8c0ce332c006ea90626df745b Mon Sep 17 00:00:00 2001 From: Taehee Yoo Date: Thu, 27 Feb 2020 12:24:45 +0000 Subject: [PATCH 252/400] net: rmnet: fix suspicious RCU usage rmnet_get_port() internally calls rcu_dereference_rtnl(), which checks RTNL. But rmnet_get_port() could be called by packet path. The packet path is not protected by RTNL. So, the suspicious RCU usage problem occurs. Test commands: modprobe rmnet ip netns add nst ip link add veth0 type veth peer name veth1 ip link set veth1 netns nst ip link add rmnet0 link veth0 type rmnet mux_id 1 ip netns exec nst ip link add rmnet1 link veth1 type rmnet mux_id 1 ip netns exec nst ip link set veth1 up ip netns exec nst ip link set rmnet1 up ip netns exec nst ip a a 192.168.100.2/24 dev rmnet1 ip link set veth0 up ip link set rmnet0 up ip a a 192.168.100.1/24 dev rmnet0 ping 192.168.100.2 Splat looks like: [ 146.630958][ T1174] WARNING: suspicious RCU usage [ 146.631735][ T1174] 5.6.0-rc1+ #447 Not tainted [ 146.632387][ T1174] ----------------------------- [ 146.633151][ T1174] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:386 suspicious rcu_dereference_check() ! [ 146.634742][ T1174] [ 146.634742][ T1174] other info that might help us debug this: [ 146.634742][ T1174] [ 146.645992][ T1174] [ 146.645992][ T1174] rcu_scheduler_active = 2, debug_locks = 1 [ 146.646937][ T1174] 5 locks held by ping/1174: [ 146.647609][ T1174] #0: ffff8880c31dea70 (sk_lock-AF_INET){+.+.}, at: raw_sendmsg+0xab8/0x2980 [ 146.662463][ T1174] #1: ffffffff93925660 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x243/0x2150 [ 146.671696][ T1174] #2: ffffffff93925660 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x213/0x2940 [ 146.673064][ T1174] #3: ffff8880c19ecd58 (&dev->qdisc_running_key#7){+...}, at: ip_finish_output2+0x714/0x2150 [ 146.690358][ T1174] #4: ffff8880c5796898 (&dev->qdisc_xmit_lock_key#3){+.-.}, at: sch_direct_xmit+0x1e2/0x1020 [ 146.699875][ T1174] [ 146.699875][ T1174] stack backtrace: [ 146.701091][ T1174] CPU: 0 PID: 1174 Comm: ping Not tainted 5.6.0-rc1+ #447 [ 146.705215][ T1174] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 146.706565][ T1174] Call Trace: [ 146.707102][ T1174] dump_stack+0x96/0xdb [ 146.708007][ T1174] rmnet_get_port.part.9+0x76/0x80 [rmnet] [ 146.709233][ T1174] rmnet_egress_handler+0x107/0x420 [rmnet] [ 146.710492][ T1174] ? sch_direct_xmit+0x1e2/0x1020 [ 146.716193][ T1174] rmnet_vnd_start_xmit+0x3d/0xa0 [rmnet] [ 146.717012][ T1174] dev_hard_start_xmit+0x160/0x740 [ 146.717854][ T1174] sch_direct_xmit+0x265/0x1020 [ 146.718577][ T1174] ? register_lock_class+0x14d0/0x14d0 [ 146.719429][ T1174] ? dev_watchdog+0xac0/0xac0 [ 146.723738][ T1174] ? __dev_queue_xmit+0x15fd/0x2940 [ 146.724469][ T1174] ? lock_acquire+0x164/0x3b0 [ 146.725172][ T1174] __dev_queue_xmit+0x20c7/0x2940 [ ... ] Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller --- drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 13 ++++++------- drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h | 2 +- .../net/ethernet/qualcomm/rmnet/rmnet_handlers.c | 4 ++-- 3 files changed, 9 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index ac58f584190b..fc68ecdd804b 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -382,11 +382,10 @@ struct rtnl_link_ops rmnet_link_ops __read_mostly = { .fill_info = rmnet_fill_info, }; -/* Needs either rcu_read_lock() or rtnl lock */ -struct rmnet_port *rmnet_get_port(struct net_device *real_dev) +struct rmnet_port *rmnet_get_port_rcu(struct net_device *real_dev) { if (rmnet_is_real_dev_registered(real_dev)) - return rcu_dereference_rtnl(real_dev->rx_handler_data); + return rcu_dereference_bh(real_dev->rx_handler_data); else return NULL; } @@ -412,7 +411,7 @@ int rmnet_add_bridge(struct net_device *rmnet_dev, struct rmnet_port *port, *slave_port; int err; - port = rmnet_get_port(real_dev); + port = rmnet_get_port_rtnl(real_dev); /* If there is more than one rmnet dev attached, its probably being * used for muxing. Skip the briding in that case @@ -427,7 +426,7 @@ int rmnet_add_bridge(struct net_device *rmnet_dev, if (err) return -EBUSY; - slave_port = rmnet_get_port(slave_dev); + slave_port = rmnet_get_port_rtnl(slave_dev); slave_port->rmnet_mode = RMNET_EPMODE_BRIDGE; slave_port->bridge_ep = real_dev; @@ -445,11 +444,11 @@ int rmnet_del_bridge(struct net_device *rmnet_dev, struct net_device *real_dev = priv->real_dev; struct rmnet_port *port, *slave_port; - port = rmnet_get_port(real_dev); + port = rmnet_get_port_rtnl(real_dev); port->rmnet_mode = RMNET_EPMODE_VND; port->bridge_ep = NULL; - slave_port = rmnet_get_port(slave_dev); + slave_port = rmnet_get_port_rtnl(slave_dev); rmnet_unregister_real_device(slave_dev, slave_port); netdev_dbg(slave_dev, "removed from rmnet as slave\n"); diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h index cd0a6bcbe74a..0d568dcfd65a 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h @@ -65,7 +65,7 @@ struct rmnet_priv { struct rmnet_priv_stats stats; }; -struct rmnet_port *rmnet_get_port(struct net_device *real_dev); +struct rmnet_port *rmnet_get_port_rcu(struct net_device *real_dev); struct rmnet_endpoint *rmnet_get_endpoint(struct rmnet_port *port, u8 mux_id); int rmnet_add_bridge(struct net_device *rmnet_dev, struct net_device *slave_dev, diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c index 1b74bc160402..074a8b326c30 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c @@ -184,7 +184,7 @@ rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb) return RX_HANDLER_PASS; dev = skb->dev; - port = rmnet_get_port(dev); + port = rmnet_get_port_rcu(dev); switch (port->rmnet_mode) { case RMNET_EPMODE_VND: @@ -217,7 +217,7 @@ void rmnet_egress_handler(struct sk_buff *skb) skb->dev = priv->real_dev; mux_id = priv->mux_id; - port = rmnet_get_port(skb->dev); + port = rmnet_get_port_rcu(skb->dev); if (!port) goto drop; From c026d970102e9af9958edefb4a015702c6aab636 Mon Sep 17 00:00:00 2001 From: Taehee Yoo Date: Thu, 27 Feb 2020 12:25:05 +0000 Subject: [PATCH 253/400] net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device() The notifier_call() of the slave interface removes rmnet interface with unregister_netdevice_queue(). But, before calling unregister_netdevice_queue(), it acquires rcu readlock. In the RCU critical section, sleeping isn't be allowed. But, unregister_netdevice_queue() internally calls synchronize_net(), which would sleep. So, suspicious RCU usage warning occurs. Test commands: modprobe rmnet ip link add dummy0 type dummy ip link add dummy1 type dummy ip link add rmnet0 link dummy0 type rmnet mux_id 1 ip link set dummy1 master rmnet0 ip link del dummy0 Splat looks like: [ 79.639245][ T1195] ============================= [ 79.640134][ T1195] WARNING: suspicious RCU usage [ 79.640852][ T1195] 5.6.0-rc1+ #447 Not tainted [ 79.641657][ T1195] ----------------------------- [ 79.642472][ T1195] ./include/linux/rcupdate.h:273 Illegal context switch in RCU read-side critical section! [ 79.644043][ T1195] [ 79.644043][ T1195] other info that might help us debug this: [ 79.644043][ T1195] [ 79.645682][ T1195] [ 79.645682][ T1195] rcu_scheduler_active = 2, debug_locks = 1 [ 79.646980][ T1195] 2 locks held by ip/1195: [ 79.647629][ T1195] #0: ffffffffa3cf64f0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x457/0x890 [ 79.649312][ T1195] #1: ffffffffa39256c0 (rcu_read_lock){....}, at: rmnet_config_notify_cb+0xf0/0x590 [rmnet] [ 79.651717][ T1195] [ 79.651717][ T1195] stack backtrace: [ 79.652650][ T1195] CPU: 3 PID: 1195 Comm: ip Not tainted 5.6.0-rc1+ #447 [ 79.653702][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 79.655037][ T1195] Call Trace: [ 79.655560][ T1195] dump_stack+0x96/0xdb [ 79.656252][ T1195] ___might_sleep+0x345/0x440 [ 79.656994][ T1195] synchronize_net+0x18/0x30 [ 79.661132][ T1195] netdev_rx_handler_unregister+0x40/0xb0 [ 79.666266][ T1195] rmnet_unregister_real_device+0x42/0xb0 [rmnet] [ 79.667211][ T1195] rmnet_config_notify_cb+0x1f7/0x590 [rmnet] [ 79.668121][ T1195] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet] [ 79.669166][ T1195] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet] [ 79.670286][ T1195] ? __module_text_address+0x13/0x140 [ 79.671139][ T1195] notifier_call_chain+0x90/0x160 [ 79.671973][ T1195] rollback_registered_many+0x660/0xcf0 [ 79.672893][ T1195] ? netif_set_real_num_tx_queues+0x780/0x780 [ 79.675091][ T1195] ? __lock_acquire+0xdfe/0x3de0 [ 79.675825][ T1195] ? memset+0x1f/0x40 [ 79.676367][ T1195] ? __nla_validate_parse+0x98/0x1ab0 [ 79.677290][ T1195] unregister_netdevice_many.part.133+0x13/0x1b0 [ 79.678163][ T1195] rtnl_delete_link+0xbc/0x100 [ ... ] Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller --- drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index fc68ecdd804b..0ad64aa66592 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -230,7 +230,6 @@ static void rmnet_force_unassociate_device(struct net_device *dev) port = rmnet_get_port_rtnl(dev); - rcu_read_lock(); rmnet_unregister_bridge(dev, port); hash_for_each_safe(port->muxed_ep, bkt_ep, tmp_ep, ep, hlnode) { @@ -241,7 +240,6 @@ static void rmnet_force_unassociate_device(struct net_device *dev) kfree(ep); } - rcu_read_unlock(); unregister_netdevice_many(&list); rmnet_unregister_real_device(real_dev, port); From 1dc49e9d164cd7e11c81279c83db84a147e14740 Mon Sep 17 00:00:00 2001 From: Taehee Yoo Date: Thu, 27 Feb 2020 12:25:19 +0000 Subject: [PATCH 254/400] net: rmnet: do not allow to change mux id if mux id is duplicated Basically, duplicate mux id isn't be allowed. So, the creation of rmnet will be failed if there is duplicate mux id is existing. But, changelink routine doesn't check duplicate mux id. Test commands: modprobe rmnet ip link add dummy0 type dummy ip link add rmnet0 link dummy0 type rmnet mux_id 1 ip link add rmnet1 link dummy0 type rmnet mux_id 2 ip link set rmnet1 type rmnet mux_id 1 Fixes: 23790ef12082 ("net: qualcomm: rmnet: Allow to configure flags for existing devices") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller --- drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index 0ad64aa66592..3c0e6d24d083 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -306,6 +306,10 @@ static int rmnet_changelink(struct net_device *dev, struct nlattr *tb[], if (data[IFLA_RMNET_MUX_ID]) { mux_id = nla_get_u16(data[IFLA_RMNET_MUX_ID]); + if (rmnet_get_endpoint(port, mux_id)) { + NL_SET_ERR_MSG_MOD(extack, "MUX ID already exists"); + return -EINVAL; + } ep = rmnet_get_endpoint(port, priv->mux_id); if (!ep) return -ENODEV; From 037f9cdf72fb8a7ff9ec2b5dd05336ec1492bdf1 Mon Sep 17 00:00:00 2001 From: Taehee Yoo Date: Thu, 27 Feb 2020 12:25:43 +0000 Subject: [PATCH 255/400] net: rmnet: use upper/lower device infrastructure netdev_upper_dev_link() is useful to manage lower/upper interfaces. And this function internally validates looping, maximum depth. All or most virtual interfaces that could have a real interface (e.g. macsec, macvlan, ipvlan etc.) use lower/upper infrastructure. Test commands: modprobe rmnet ip link add dummy0 type dummy ip link add rmnet1 link dummy0 type rmnet mux_id 1 for i in {2..100} do let A=$i-1 ip link add rmnet$i link rmnet$A type rmnet mux_id $i done ip link del dummy0 The purpose of the test commands is to make stack overflow. Splat looks like: [ 52.411438][ T1395] BUG: KASAN: slab-out-of-bounds in find_busiest_group+0x27e/0x2c00 [ 52.413218][ T1395] Write of size 64 at addr ffff8880c774bde0 by task ip/1395 [ 52.414841][ T1395] [ 52.430720][ T1395] CPU: 1 PID: 1395 Comm: ip Not tainted 5.6.0-rc1+ #447 [ 52.496511][ T1395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 52.513597][ T1395] Call Trace: [ 52.546516][ T1395] [ 52.558773][ T1395] Allocated by task 3171537984: [ 52.588290][ T1395] BUG: unable to handle page fault for address: ffffffffb999e260 [ 52.589311][ T1395] #PF: supervisor read access in kernel mode [ 52.590529][ T1395] #PF: error_code(0x0000) - not-present page [ 52.591374][ T1395] PGD d6818067 P4D d6818067 PUD d6819063 PMD 0 [ 52.592288][ T1395] Thread overran stack, or stack corrupted [ 52.604980][ T1395] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 52.605856][ T1395] CPU: 1 PID: 1395 Comm: ip Not tainted 5.6.0-rc1+ #447 [ 52.611764][ T1395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 52.621520][ T1395] RIP: 0010:stack_depot_fetch+0x10/0x30 [ 52.622296][ T1395] Code: ff e9 f9 fe ff ff 48 89 df e8 9c 1d 91 ff e9 ca fe ff ff cc cc cc cc cc cc cc 89 f8 0 [ 52.627887][ T1395] RSP: 0018:ffff8880c774bb60 EFLAGS: 00010006 [ 52.628735][ T1395] RAX: 00000000001f8880 RBX: ffff8880c774d140 RCX: 0000000000000000 [ 52.631773][ T1395] RDX: 000000000000001d RSI: ffff8880c774bb68 RDI: 0000000000003ff0 [ 52.649584][ T1395] RBP: ffffea00031dd200 R08: ffffed101b43e403 R09: ffffed101b43e403 [ 52.674857][ T1395] R10: 0000000000000001 R11: ffffed101b43e402 R12: ffff8880d900e5c0 [ 52.678257][ T1395] R13: ffff8880c774c000 R14: 0000000000000000 R15: dffffc0000000000 [ 52.694541][ T1395] FS: 00007fe867f6e0c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000 [ 52.764039][ T1395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 52.815008][ T1395] CR2: ffffffffb999e260 CR3: 00000000c26aa005 CR4: 00000000000606e0 [ 52.862312][ T1395] Call Trace: [ 52.887133][ T1395] Modules linked in: dummy rmnet veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_dex [ 52.936749][ T1395] CR2: ffffffffb999e260 [ 52.965695][ T1395] ---[ end trace 7e32ca99482dbb31 ]--- [ 52.966556][ T1395] RIP: 0010:stack_depot_fetch+0x10/0x30 [ 52.971083][ T1395] Code: ff e9 f9 fe ff ff 48 89 df e8 9c 1d 91 ff e9 ca fe ff ff cc cc cc cc cc cc cc 89 f8 0 [ 53.003650][ T1395] RSP: 0018:ffff8880c774bb60 EFLAGS: 00010006 [ 53.043183][ T1395] RAX: 00000000001f8880 RBX: ffff8880c774d140 RCX: 0000000000000000 [ 53.076480][ T1395] RDX: 000000000000001d RSI: ffff8880c774bb68 RDI: 0000000000003ff0 [ 53.093858][ T1395] RBP: ffffea00031dd200 R08: ffffed101b43e403 R09: ffffed101b43e403 [ 53.112795][ T1395] R10: 0000000000000001 R11: ffffed101b43e402 R12: ffff8880d900e5c0 [ 53.139837][ T1395] R13: ffff8880c774c000 R14: 0000000000000000 R15: dffffc0000000000 [ 53.141500][ T1395] FS: 00007fe867f6e0c0(0000) GS:ffff8880da000000(0000) knlGS:0000000000000000 [ 53.143343][ T1395] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 53.152007][ T1395] CR2: ffffffffb999e260 CR3: 00000000c26aa005 CR4: 00000000000606e0 [ 53.156459][ T1395] Kernel panic - not syncing: Fatal exception [ 54.213570][ T1395] Shutting down cpus with NMI [ 54.354112][ T1395] Kernel Offset: 0x33000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0x) [ 54.355687][ T1395] Rebooting in 5 seconds.. Fixes: b37f78f234bf ("net: qualcomm: rmnet: Fix crash on real dev unregistration") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller --- .../ethernet/qualcomm/rmnet/rmnet_config.c | 35 +++++++++---------- 1 file changed, 16 insertions(+), 19 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index 3c0e6d24d083..e3fbf2331b96 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -61,9 +61,6 @@ static int rmnet_unregister_real_device(struct net_device *real_dev, kfree(port); - /* release reference on real_dev */ - dev_put(real_dev); - netdev_dbg(real_dev, "Removed from rmnet\n"); return 0; } @@ -89,9 +86,6 @@ static int rmnet_register_real_device(struct net_device *real_dev) return -EBUSY; } - /* hold on to real dev for MAP data */ - dev_hold(real_dev); - for (entry = 0; entry < RMNET_MAX_LOGICAL_EP; entry++) INIT_HLIST_HEAD(&port->muxed_ep[entry]); @@ -162,6 +156,10 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev, if (err) goto err1; + err = netdev_upper_dev_link(real_dev, dev, extack); + if (err < 0) + goto err2; + port->rmnet_mode = mode; hlist_add_head_rcu(&ep->hlnode, &port->muxed_ep[mux_id]); @@ -178,6 +176,8 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev, return 0; +err2: + unregister_netdevice(dev); err1: rmnet_unregister_real_device(real_dev, port); err0: @@ -209,33 +209,30 @@ static void rmnet_dellink(struct net_device *dev, struct list_head *head) rmnet_vnd_dellink(mux_id, port, ep); kfree(ep); } + netdev_upper_dev_unlink(real_dev, dev); rmnet_unregister_real_device(real_dev, port); unregister_netdevice_queue(dev, head); } -static void rmnet_force_unassociate_device(struct net_device *dev) +static void rmnet_force_unassociate_device(struct net_device *real_dev) { - struct net_device *real_dev = dev; struct hlist_node *tmp_ep; struct rmnet_endpoint *ep; struct rmnet_port *port; unsigned long bkt_ep; LIST_HEAD(list); - if (!rmnet_is_real_dev_registered(real_dev)) - return; - ASSERT_RTNL(); - port = rmnet_get_port_rtnl(dev); + port = rmnet_get_port_rtnl(real_dev); - rmnet_unregister_bridge(dev, port); + rmnet_unregister_bridge(real_dev, port); hash_for_each_safe(port->muxed_ep, bkt_ep, tmp_ep, ep, hlnode) { + netdev_upper_dev_unlink(real_dev, ep->egress_dev); unregister_netdevice_queue(ep->egress_dev, &list); rmnet_vnd_dellink(ep->mux_id, port, ep); - hlist_del_init_rcu(&ep->hlnode); kfree(ep); } @@ -248,15 +245,15 @@ static void rmnet_force_unassociate_device(struct net_device *dev) static int rmnet_config_notify_cb(struct notifier_block *nb, unsigned long event, void *data) { - struct net_device *dev = netdev_notifier_info_to_dev(data); + struct net_device *real_dev = netdev_notifier_info_to_dev(data); - if (!dev) + if (!rmnet_is_real_dev_registered(real_dev)) return NOTIFY_DONE; switch (event) { case NETDEV_UNREGISTER: - netdev_dbg(dev, "Kernel unregister\n"); - rmnet_force_unassociate_device(dev); + netdev_dbg(real_dev, "Kernel unregister\n"); + rmnet_force_unassociate_device(real_dev); break; default: @@ -477,8 +474,8 @@ static int __init rmnet_init(void) static void __exit rmnet_exit(void) { - unregister_netdevice_notifier(&rmnet_dev_notifier); rtnl_link_unregister(&rmnet_link_ops); + unregister_netdevice_notifier(&rmnet_dev_notifier); } module_init(rmnet_init) From d939b6d30bea1a2322bc536b12be0a7c4c2bccd7 Mon Sep 17 00:00:00 2001 From: Taehee Yoo Date: Thu, 27 Feb 2020 12:26:02 +0000 Subject: [PATCH 256/400] net: rmnet: fix bridge mode bugs In order to attach a bridge interface to the rmnet interface, "master" operation is used. (e.g. ip link set dummy1 master rmnet0) But, in the rmnet_add_bridge(), which is a callback of ->ndo_add_slave() doesn't register lower interface. So, ->ndo_del_slave() doesn't work. There are other problems too. 1. It couldn't detect circular upper/lower interface relationship. 2. It couldn't prevent stack overflow because of too deep depth of upper/lower interface 3. It doesn't check the number of lower interfaces. 4. Panics because of several reasons. The root problem of these issues is actually the same. So, in this patch, these all problems will be fixed. Test commands: modprobe rmnet ip link add dummy0 type dummy ip link add rmnet0 link dummy0 type rmnet mux_id 1 ip link add dummy1 master rmnet0 type dummy ip link add dummy2 master rmnet0 type dummy ip link del rmnet0 ip link del dummy2 ip link del dummy1 Splat looks like: [ 41.867595][ T1164] general protection fault, probably for non-canonical address 0xdffffc0000000101I [ 41.869993][ T1164] KASAN: null-ptr-deref in range [0x0000000000000808-0x000000000000080f] [ 41.872950][ T1164] CPU: 0 PID: 1164 Comm: ip Not tainted 5.6.0-rc1+ #447 [ 41.873915][ T1164] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 41.875161][ T1164] RIP: 0010:rmnet_unregister_bridge.isra.6+0x71/0xf0 [rmnet] [ 41.876178][ T1164] Code: 48 89 ef 48 89 c6 5b 5d e9 fc fe ff ff e8 f7 f3 ff ff 48 8d b8 08 08 00 00 48 ba 00 7 [ 41.878925][ T1164] RSP: 0018:ffff8880c4d0f188 EFLAGS: 00010202 [ 41.879774][ T1164] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000101 [ 41.887689][ T1164] RDX: dffffc0000000000 RSI: ffffffffb8cf64f0 RDI: 0000000000000808 [ 41.888727][ T1164] RBP: ffff8880c40e4000 R08: ffffed101b3c0e3c R09: 0000000000000001 [ 41.889749][ T1164] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 1ffff110189a1e3c [ 41.890783][ T1164] R13: ffff8880c4d0f200 R14: ffffffffb8d56160 R15: ffff8880ccc2c000 [ 41.891794][ T1164] FS: 00007f4300edc0c0(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000 [ 41.892953][ T1164] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.893800][ T1164] CR2: 00007f43003bc8c0 CR3: 00000000ca53e001 CR4: 00000000000606f0 [ 41.894824][ T1164] Call Trace: [ 41.895274][ T1164] ? rcu_is_watching+0x2c/0x80 [ 41.895895][ T1164] rmnet_config_notify_cb+0x1f7/0x590 [rmnet] [ 41.896687][ T1164] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet] [ 41.897611][ T1164] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet] [ 41.898508][ T1164] ? __module_text_address+0x13/0x140 [ 41.899162][ T1164] notifier_call_chain+0x90/0x160 [ 41.899814][ T1164] rollback_registered_many+0x660/0xcf0 [ 41.900544][ T1164] ? netif_set_real_num_tx_queues+0x780/0x780 [ 41.901316][ T1164] ? __lock_acquire+0xdfe/0x3de0 [ 41.901958][ T1164] ? memset+0x1f/0x40 [ 41.902468][ T1164] ? __nla_validate_parse+0x98/0x1ab0 [ 41.903166][ T1164] unregister_netdevice_many.part.133+0x13/0x1b0 [ 41.903988][ T1164] rtnl_delete_link+0xbc/0x100 [ ... ] Fixes: 60d58f971c10 ("net: qualcomm: rmnet: Implement bridge mode") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller --- .../ethernet/qualcomm/rmnet/rmnet_config.c | 131 +++++++++--------- .../ethernet/qualcomm/rmnet/rmnet_config.h | 1 + .../net/ethernet/qualcomm/rmnet/rmnet_vnd.c | 8 -- .../net/ethernet/qualcomm/rmnet/rmnet_vnd.h | 1 - 4 files changed, 64 insertions(+), 77 deletions(-) diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c index e3fbf2331b96..fbf4cbcf1a65 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c @@ -13,25 +13,6 @@ #include "rmnet_vnd.h" #include "rmnet_private.h" -/* Locking scheme - - * The shared resource which needs to be protected is realdev->rx_handler_data. - * For the writer path, this is using rtnl_lock(). The writer paths are - * rmnet_newlink(), rmnet_dellink() and rmnet_force_unassociate_device(). These - * paths are already called with rtnl_lock() acquired in. There is also an - * ASSERT_RTNL() to ensure that we are calling with rtnl acquired. For - * dereference here, we will need to use rtnl_dereference(). Dev list writing - * needs to happen with rtnl_lock() acquired for netdev_master_upper_dev_link(). - * For the reader path, the real_dev->rx_handler_data is called in the TX / RX - * path. We only need rcu_read_lock() for these scenarios. In these cases, - * the rcu_read_lock() is held in __dev_queue_xmit() and - * netif_receive_skb_internal(), so readers need to use rcu_dereference_rtnl() - * to get the relevant information. For dev list reading, we again acquire - * rcu_read_lock() in rmnet_dellink() for netdev_master_upper_dev_get_rcu(). - * We also use unregister_netdevice_many() to free all rmnet devices in - * rmnet_force_unassociate_device() so we dont lose the rtnl_lock() and free in - * same context. - */ - /* Local Definitions and Declarations */ static const struct nla_policy rmnet_policy[IFLA_RMNET_MAX + 1] = { @@ -51,9 +32,10 @@ rmnet_get_port_rtnl(const struct net_device *real_dev) return rtnl_dereference(real_dev->rx_handler_data); } -static int rmnet_unregister_real_device(struct net_device *real_dev, - struct rmnet_port *port) +static int rmnet_unregister_real_device(struct net_device *real_dev) { + struct rmnet_port *port = rmnet_get_port_rtnl(real_dev); + if (port->nr_rmnet_devs) return -EINVAL; @@ -93,28 +75,33 @@ static int rmnet_register_real_device(struct net_device *real_dev) return 0; } -static void rmnet_unregister_bridge(struct net_device *dev, - struct rmnet_port *port) +static void rmnet_unregister_bridge(struct rmnet_port *port) { - struct rmnet_port *bridge_port; - struct net_device *bridge_dev; + struct net_device *bridge_dev, *real_dev, *rmnet_dev; + struct rmnet_port *real_port; if (port->rmnet_mode != RMNET_EPMODE_BRIDGE) return; - /* bridge slave handling */ + rmnet_dev = port->rmnet_dev; if (!port->nr_rmnet_devs) { - bridge_dev = port->bridge_ep; + /* bridge device */ + real_dev = port->bridge_ep; + bridge_dev = port->dev; - bridge_port = rmnet_get_port_rtnl(bridge_dev); - bridge_port->bridge_ep = NULL; - bridge_port->rmnet_mode = RMNET_EPMODE_VND; + real_port = rmnet_get_port_rtnl(real_dev); + real_port->bridge_ep = NULL; + real_port->rmnet_mode = RMNET_EPMODE_VND; } else { + /* real device */ bridge_dev = port->bridge_ep; - bridge_port = rmnet_get_port_rtnl(bridge_dev); - rmnet_unregister_real_device(bridge_dev, bridge_port); + port->bridge_ep = NULL; + port->rmnet_mode = RMNET_EPMODE_VND; } + + netdev_upper_dev_unlink(bridge_dev, rmnet_dev); + rmnet_unregister_real_device(bridge_dev); } static int rmnet_newlink(struct net *src_net, struct net_device *dev, @@ -161,6 +148,7 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev, goto err2; port->rmnet_mode = mode; + port->rmnet_dev = dev; hlist_add_head_rcu(&ep->hlnode, &port->muxed_ep[mux_id]); @@ -178,8 +166,9 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev, err2: unregister_netdevice(dev); + rmnet_vnd_dellink(mux_id, port, ep); err1: - rmnet_unregister_real_device(real_dev, port); + rmnet_unregister_real_device(real_dev); err0: kfree(ep); return err; @@ -188,30 +177,32 @@ err0: static void rmnet_dellink(struct net_device *dev, struct list_head *head) { struct rmnet_priv *priv = netdev_priv(dev); - struct net_device *real_dev; + struct net_device *real_dev, *bridge_dev; + struct rmnet_port *real_port, *bridge_port; struct rmnet_endpoint *ep; - struct rmnet_port *port; - u8 mux_id; + u8 mux_id = priv->mux_id; real_dev = priv->real_dev; - if (!real_dev || !rmnet_is_real_dev_registered(real_dev)) + if (!rmnet_is_real_dev_registered(real_dev)) return; - port = rmnet_get_port_rtnl(real_dev); + real_port = rmnet_get_port_rtnl(real_dev); + bridge_dev = real_port->bridge_ep; + if (bridge_dev) { + bridge_port = rmnet_get_port_rtnl(bridge_dev); + rmnet_unregister_bridge(bridge_port); + } - mux_id = rmnet_vnd_get_mux(dev); - - ep = rmnet_get_endpoint(port, mux_id); + ep = rmnet_get_endpoint(real_port, mux_id); if (ep) { hlist_del_init_rcu(&ep->hlnode); - rmnet_unregister_bridge(dev, port); - rmnet_vnd_dellink(mux_id, port, ep); + rmnet_vnd_dellink(mux_id, real_port, ep); kfree(ep); } - netdev_upper_dev_unlink(real_dev, dev); - rmnet_unregister_real_device(real_dev, port); + netdev_upper_dev_unlink(real_dev, dev); + rmnet_unregister_real_device(real_dev); unregister_netdevice_queue(dev, head); } @@ -223,23 +214,23 @@ static void rmnet_force_unassociate_device(struct net_device *real_dev) unsigned long bkt_ep; LIST_HEAD(list); - ASSERT_RTNL(); - port = rmnet_get_port_rtnl(real_dev); - rmnet_unregister_bridge(real_dev, port); - - hash_for_each_safe(port->muxed_ep, bkt_ep, tmp_ep, ep, hlnode) { - netdev_upper_dev_unlink(real_dev, ep->egress_dev); - unregister_netdevice_queue(ep->egress_dev, &list); - rmnet_vnd_dellink(ep->mux_id, port, ep); - hlist_del_init_rcu(&ep->hlnode); - kfree(ep); + if (port->nr_rmnet_devs) { + /* real device */ + rmnet_unregister_bridge(port); + hash_for_each_safe(port->muxed_ep, bkt_ep, tmp_ep, ep, hlnode) { + unregister_netdevice_queue(ep->egress_dev, &list); + netdev_upper_dev_unlink(real_dev, ep->egress_dev); + rmnet_vnd_dellink(ep->mux_id, port, ep); + hlist_del_init_rcu(&ep->hlnode); + kfree(ep); + } + rmnet_unregister_real_device(real_dev); + unregister_netdevice_many(&list); + } else { + rmnet_unregister_bridge(port); } - - unregister_netdevice_many(&list); - - rmnet_unregister_real_device(real_dev, port); } static int rmnet_config_notify_cb(struct notifier_block *nb, @@ -418,6 +409,9 @@ int rmnet_add_bridge(struct net_device *rmnet_dev, if (port->nr_rmnet_devs > 1) return -EINVAL; + if (port->rmnet_mode != RMNET_EPMODE_VND) + return -EINVAL; + if (rmnet_is_real_dev_registered(slave_dev)) return -EBUSY; @@ -425,9 +419,17 @@ int rmnet_add_bridge(struct net_device *rmnet_dev, if (err) return -EBUSY; + err = netdev_master_upper_dev_link(slave_dev, rmnet_dev, NULL, NULL, + extack); + if (err) { + rmnet_unregister_real_device(slave_dev); + return err; + } + slave_port = rmnet_get_port_rtnl(slave_dev); slave_port->rmnet_mode = RMNET_EPMODE_BRIDGE; slave_port->bridge_ep = real_dev; + slave_port->rmnet_dev = rmnet_dev; port->rmnet_mode = RMNET_EPMODE_BRIDGE; port->bridge_ep = slave_dev; @@ -439,16 +441,9 @@ int rmnet_add_bridge(struct net_device *rmnet_dev, int rmnet_del_bridge(struct net_device *rmnet_dev, struct net_device *slave_dev) { - struct rmnet_priv *priv = netdev_priv(rmnet_dev); - struct net_device *real_dev = priv->real_dev; - struct rmnet_port *port, *slave_port; + struct rmnet_port *port = rmnet_get_port_rtnl(slave_dev); - port = rmnet_get_port_rtnl(real_dev); - port->rmnet_mode = RMNET_EPMODE_VND; - port->bridge_ep = NULL; - - slave_port = rmnet_get_port_rtnl(slave_dev); - rmnet_unregister_real_device(slave_dev, slave_port); + rmnet_unregister_bridge(port); netdev_dbg(slave_dev, "removed from rmnet as slave\n"); return 0; diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h index 0d568dcfd65a..be515982d628 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h @@ -28,6 +28,7 @@ struct rmnet_port { u8 rmnet_mode; struct hlist_head muxed_ep[RMNET_MAX_LOGICAL_EP]; struct net_device *bridge_ep; + struct net_device *rmnet_dev; }; extern struct rtnl_link_ops rmnet_link_ops; diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c index 509dfc895a33..26ad40f19c64 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c @@ -266,14 +266,6 @@ int rmnet_vnd_dellink(u8 id, struct rmnet_port *port, return 0; } -u8 rmnet_vnd_get_mux(struct net_device *rmnet_dev) -{ - struct rmnet_priv *priv; - - priv = netdev_priv(rmnet_dev); - return priv->mux_id; -} - int rmnet_vnd_do_flow_control(struct net_device *rmnet_dev, int enable) { netdev_dbg(rmnet_dev, "Setting VND TX queue state to %d\n", enable); diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h index 54cbaf3c3bc4..14d77c709d4a 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h @@ -16,6 +16,5 @@ int rmnet_vnd_dellink(u8 id, struct rmnet_port *port, struct rmnet_endpoint *ep); void rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev); void rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev); -u8 rmnet_vnd_get_mux(struct net_device *rmnet_dev); void rmnet_vnd_setup(struct net_device *dev); #endif /* _RMNET_VND_H_ */ From ad3cc31b599ea80f06b29ebdc18b3a39878a48d6 Mon Sep 17 00:00:00 2001 From: Taehee Yoo Date: Thu, 27 Feb 2020 12:26:15 +0000 Subject: [PATCH 257/400] net: rmnet: fix packet forwarding in rmnet bridge mode Packet forwarding is not working in rmnet bridge mode. Because when a packet is forwarded, skb_push() for an ethernet header is needed. But it doesn't call skb_push(). So, the ethernet header will be lost. Test commands: modprobe rmnet ip netns add nst ip netns add nst2 ip link add veth0 type veth peer name veth1 ip link add veth2 type veth peer name veth3 ip link set veth1 netns nst ip link set veth3 netns nst2 ip link add rmnet0 link veth0 type rmnet mux_id 1 ip link set veth2 master rmnet0 ip link set veth0 up ip link set veth2 up ip link set rmnet0 up ip a a 192.168.100.1/24 dev rmnet0 ip netns exec nst ip link set veth1 up ip netns exec nst ip a a 192.168.100.2/24 dev veth1 ip netns exec nst2 ip link set veth3 up ip netns exec nst2 ip a a 192.168.100.3/24 dev veth3 ip netns exec nst2 ping 192.168.100.2 Fixes: 60d58f971c10 ("net: qualcomm: rmnet: Implement bridge mode") Signed-off-by: Taehee Yoo Signed-off-by: David S. Miller --- drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c index 074a8b326c30..29a7bfa2584d 100644 --- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c +++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c @@ -159,6 +159,9 @@ static int rmnet_map_egress_handler(struct sk_buff *skb, static void rmnet_bridge_handler(struct sk_buff *skb, struct net_device *bridge_dev) { + if (skb_mac_header_was_set(skb)) + skb_push(skb, skb->mac_len); + if (bridge_dev) { skb->dev = bridge_dev; dev_queue_xmit(skb); From 5c05a164d441a1792791175e4959ea9df12f7e2b Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 27 Feb 2020 11:52:35 -0800 Subject: [PATCH 258/400] unix: It's CONFIG_PROC_FS not CONFIG_PROCFS Fixes: 3a12500ed5dd ("unix: define and set show_fdinfo only if procfs is enabled") Signed-off-by: David S. Miller --- net/unix/af_unix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index aa6e2530e1ec..68debcb28fa4 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -682,7 +682,7 @@ static int unix_set_peek_off(struct sock *sk, int val) return 0; } -#ifdef CONFIG_PROCFS +#ifdef CONFIG_PROC_FS static void unix_show_fdinfo(struct seq_file *m, struct socket *sock) { struct sock *sk = sock->sk; From 3f74957fcbeab703297ed0f135430414ed7e0dd0 Mon Sep 17 00:00:00 2001 From: Stefano Garzarella Date: Wed, 26 Feb 2020 11:58:18 +0100 Subject: [PATCH 259/400] vsock: fix potential deadlock in transport->release() Some transports (hyperv, virtio) acquire the sock lock during the .release() callback. In the vsock_stream_connect() we call vsock_assign_transport(); if the socket was previously assigned to another transport, the vsk->transport->release() is called, but the sock lock is already held in the vsock_stream_connect(), causing a deadlock reported by syzbot: INFO: task syz-executor280:9768 blocked for more than 143 seconds. Not tainted 5.6.0-rc1-syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syz-executor280 D27912 9768 9766 0x00000000 Call Trace: context_switch kernel/sched/core.c:3386 [inline] __schedule+0x934/0x1f90 kernel/sched/core.c:4082 schedule+0xdc/0x2b0 kernel/sched/core.c:4156 __lock_sock+0x165/0x290 net/core/sock.c:2413 lock_sock_nested+0xfe/0x120 net/core/sock.c:2938 virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832 vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454 vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288 __sys_connect_file+0x161/0x1c0 net/socket.c:1857 __sys_connect+0x174/0x1b0 net/socket.c:1874 __do_sys_connect net/socket.c:1885 [inline] __se_sys_connect net/socket.c:1882 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1882 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe To avoid this issue, this patch remove the lock acquiring in the .release() callback of hyperv and virtio transports, and it holds the lock when we call vsk->transport->release() in the vsock core. Reported-by: syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com Fixes: 408624af4c89 ("vsock: use local transport when it is loaded") Signed-off-by: Stefano Garzarella Reviewed-by: Stefan Hajnoczi Signed-off-by: David S. Miller --- net/vmw_vsock/af_vsock.c | 20 ++++++++++++-------- net/vmw_vsock/hyperv_transport.c | 3 --- net/vmw_vsock/virtio_transport_common.c | 2 -- 3 files changed, 12 insertions(+), 13 deletions(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 9c5b2a91baad..a5f28708e0e7 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -451,6 +451,12 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk) if (vsk->transport == new_transport) return 0; + /* transport->release() must be called with sock lock acquired. + * This path can only be taken during vsock_stream_connect(), + * where we have already held the sock lock. + * In the other cases, this function is called on a new socket + * which is not assigned to any transport. + */ vsk->transport->release(vsk); vsock_deassign_transport(vsk); } @@ -753,20 +759,18 @@ static void __vsock_release(struct sock *sk, int level) vsk = vsock_sk(sk); pending = NULL; /* Compiler warning. */ - /* The release call is supposed to use lock_sock_nested() - * rather than lock_sock(), if a sock lock should be acquired. - */ - if (vsk->transport) - vsk->transport->release(vsk); - else if (sk->sk_type == SOCK_STREAM) - vsock_remove_sock(vsk); - /* When "level" is SINGLE_DEPTH_NESTING, use the nested * version to avoid the warning "possible recursive locking * detected". When "level" is 0, lock_sock_nested(sk, level) * is the same as lock_sock(sk). */ lock_sock_nested(sk, level); + + if (vsk->transport) + vsk->transport->release(vsk); + else if (sk->sk_type == SOCK_STREAM) + vsock_remove_sock(vsk); + sock_orphan(sk); sk->sk_shutdown = SHUTDOWN_MASK; diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c index 3492c021925f..630b851f8150 100644 --- a/net/vmw_vsock/hyperv_transport.c +++ b/net/vmw_vsock/hyperv_transport.c @@ -526,12 +526,9 @@ static bool hvs_close_lock_held(struct vsock_sock *vsk) static void hvs_release(struct vsock_sock *vsk) { - struct sock *sk = sk_vsock(vsk); bool remove_sock; - lock_sock_nested(sk, SINGLE_DEPTH_NESTING); remove_sock = hvs_close_lock_held(vsk); - release_sock(sk); if (remove_sock) vsock_remove_sock(vsk); } diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index d9f0c9c5425a..f3c4bab2f737 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -829,7 +829,6 @@ void virtio_transport_release(struct vsock_sock *vsk) struct sock *sk = &vsk->sk; bool remove_sock = true; - lock_sock_nested(sk, SINGLE_DEPTH_NESTING); if (sk->sk_type == SOCK_STREAM) remove_sock = virtio_transport_close(vsk); @@ -837,7 +836,6 @@ void virtio_transport_release(struct vsock_sock *vsk) list_del(&pkt->list); virtio_transport_free_pkt(pkt); } - release_sock(sk); if (remove_sock) vsock_remove_sock(vsk); From 23797b98909f34b75fd130369bde86f760db69d0 Mon Sep 17 00:00:00 2001 From: "Alex Maftei (amaftei)" Date: Wed, 26 Feb 2020 17:33:19 +0000 Subject: [PATCH 260/400] sfc: fix timestamp reconstruction at 16-bit rollover points We can't just use the top bits of the last sync event as they could be off-by-one every 65,536 seconds, giving an error in reconstruction of 65,536 seconds. This patch uses the difference in the bottom 16 bits (mod 2^16) to calculate an offset that needs to be applied to the last sync event to get to the current time. Signed-off-by: Alexandru-Mihai Maftei Acked-by: Martin Habets Signed-off-by: David S. Miller --- drivers/net/ethernet/sfc/ptp.c | 38 +++++++++++++++++++++++++++++++--- 1 file changed, 35 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c index af15a737c675..59b4f16896a8 100644 --- a/drivers/net/ethernet/sfc/ptp.c +++ b/drivers/net/ethernet/sfc/ptp.c @@ -560,13 +560,45 @@ efx_ptp_mac_nic_to_ktime_correction(struct efx_nic *efx, u32 nic_major, u32 nic_minor, s32 correction) { + u32 sync_timestamp; ktime_t kt = { 0 }; + s16 delta; if (!(nic_major & 0x80000000)) { WARN_ON_ONCE(nic_major >> 16); - /* Use the top bits from the latest sync event. */ - nic_major &= 0xffff; - nic_major |= (last_sync_timestamp_major(efx) & 0xffff0000); + + /* Medford provides 48 bits of timestamp, so we must get the top + * 16 bits from the timesync event state. + * + * We only have the lower 16 bits of the time now, but we do + * have a full resolution timestamp at some point in past. As + * long as the difference between the (real) now and the sync + * is less than 2^15, then we can reconstruct the difference + * between those two numbers using only the lower 16 bits of + * each. + * + * Put another way + * + * a - b = ((a mod k) - b) mod k + * + * when -k/2 < (a-b) < k/2. In our case k is 2^16. We know + * (a mod k) and b, so can calculate the delta, a - b. + * + */ + sync_timestamp = last_sync_timestamp_major(efx); + + /* Because delta is s16 this does an implicit mask down to + * 16 bits which is what we need, assuming + * MEDFORD_TX_SECS_EVENT_BITS is 16. delta is signed so that + * we can deal with the (unlikely) case of sync timestamps + * arriving from the future. + */ + delta = nic_major - sync_timestamp; + + /* Recover the fully specified time now, by applying the offset + * to the (fully specified) sync time. + */ + nic_major = sync_timestamp + delta; kt = ptp->nic_to_kernel_time(nic_major, nic_minor, correction); From ac004e84164e27d69017731a97b11402a69d854b Mon Sep 17 00:00:00 2001 From: Amit Cohen Date: Thu, 27 Feb 2020 21:07:53 +0100 Subject: [PATCH 261/400] mlxsw: pci: Wait longer before accessing the device after reset During initialization the driver issues a reset to the device and waits for 100ms before checking if the firmware is ready. The waiting is necessary because before that the device is irresponsive and the first read can result in a completion timeout. While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is insufficient for Spectrum-3. Fix this by increasing the timeout to 200ms. Fixes: da382875c616 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC") Signed-off-by: Amit Cohen Signed-off-by: Ido Schimmel Signed-off-by: Jiri Pirko Signed-off-by: David S. Miller --- drivers/net/ethernet/mellanox/mlxsw/pci_hw.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h index e0d7d2d9a0c8..43fa8c85b5d9 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h +++ b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h @@ -28,7 +28,7 @@ #define MLXSW_PCI_SW_RESET 0xF0010 #define MLXSW_PCI_SW_RESET_RST_BIT BIT(0) #define MLXSW_PCI_SW_RESET_TIMEOUT_MSECS 900000 -#define MLXSW_PCI_SW_RESET_WAIT_MSECS 100 +#define MLXSW_PCI_SW_RESET_WAIT_MSECS 200 #define MLXSW_PCI_FW_READY 0xA1844 #define MLXSW_PCI_FW_READY_MASK 0xFFFF #define MLXSW_PCI_FW_READY_MAGIC 0x5E From 3ee339eb28959629db33aaa2b8cde4c63c6289eb Mon Sep 17 00:00:00 2001 From: Andrew Lunn Date: Thu, 27 Feb 2020 21:20:49 +0100 Subject: [PATCH 262/400] net: dsa: mv88e6xxx: Fix masking of egress port Add missing ~ to the usage of the mask. Reported-by: Kevin Benson Reported-by: Chris Healy Fixes: 5c74c54ce6ff ("net: dsa: mv88e6xxx: Split monitor port configuration") Signed-off-by: Andrew Lunn Signed-off-by: David S. Miller --- drivers/net/dsa/mv88e6xxx/global1.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/global1.c b/drivers/net/dsa/mv88e6xxx/global1.c index b016cc205f81..ca3a7a7a73c3 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.c +++ b/drivers/net/dsa/mv88e6xxx/global1.c @@ -278,13 +278,13 @@ int mv88e6095_g1_set_egress_port(struct mv88e6xxx_chip *chip, switch (direction) { case MV88E6XXX_EGRESS_DIR_INGRESS: dest_port_chip = &chip->ingress_dest_port; - reg &= MV88E6185_G1_MONITOR_CTL_INGRESS_DEST_MASK; + reg &= ~MV88E6185_G1_MONITOR_CTL_INGRESS_DEST_MASK; reg |= port << __bf_shf(MV88E6185_G1_MONITOR_CTL_INGRESS_DEST_MASK); break; case MV88E6XXX_EGRESS_DIR_EGRESS: dest_port_chip = &chip->egress_dest_port; - reg &= MV88E6185_G1_MONITOR_CTL_EGRESS_DEST_MASK; + reg &= ~MV88E6185_G1_MONITOR_CTL_EGRESS_DEST_MASK; reg |= port << __bf_shf(MV88E6185_G1_MONITOR_CTL_EGRESS_DEST_MASK); break; From c14dfddbd869bf0c2bafb7ef260c41d9cebbcfec Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Fri, 21 Feb 2020 15:20:26 +0000 Subject: [PATCH 263/400] RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen() The algorithm pre-allocates a cm_id since allocation cannot be done while holding the cm.lock spinlock, however it doesn't free it on one error path, leading to a memory leak. Fixes: 067b171b8679 ("IB/cm: Share listening CM IDs") Link: https://lore.kernel.org/r/20200221152023.GA8680@ziepe.ca Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/cm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index 68cc1b2d6824..15e99a888427 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -1191,6 +1191,7 @@ struct ib_cm_id *ib_cm_insert_listen(struct ib_device *device, /* Sharing an ib_cm_id with different handlers is not * supported */ spin_unlock_irqrestore(&cm.lock, flags); + ib_destroy_cm_id(cm_id); return ERR_PTR(-EINVAL); } refcount_inc(&cm_id_priv->refcount); From d876836204897b6d7d911f942084f69a1e9d5c4d Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Thu, 27 Feb 2020 14:17:49 -0700 Subject: [PATCH 264/400] io_uring: fix 32-bit compatability with sendmsg/recvmsg We must set MSG_CMSG_COMPAT if we're in compatability mode, otherwise the iovec import for these commands will not do the right thing and fail the command with -EINVAL. Found by running the test suite compiled as 32-bit. Cc: stable@vger.kernel.org Fixes: aa1fa28fc73e ("io_uring: add support for recvmsg()") Fixes: 0fa03c624d8f ("io_uring: add support for sendmsg()") Signed-off-by: Jens Axboe --- fs/io_uring.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/fs/io_uring.c b/fs/io_uring.c index 05eea06f5421..6a595c13e108 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -3001,6 +3001,11 @@ static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) sr->msg = u64_to_user_ptr(READ_ONCE(sqe->addr)); sr->len = READ_ONCE(sqe->len); +#ifdef CONFIG_COMPAT + if (req->ctx->compat) + sr->msg_flags |= MSG_CMSG_COMPAT; +#endif + if (!io || req->opcode == IORING_OP_SEND) return 0; /* iovec is already imported */ @@ -3153,6 +3158,11 @@ static int io_recvmsg_prep(struct io_kiocb *req, sr->msg = u64_to_user_ptr(READ_ONCE(sqe->addr)); sr->len = READ_ONCE(sqe->len); +#ifdef CONFIG_COMPAT + if (req->ctx->compat) + sr->msg_flags |= MSG_CMSG_COMPAT; +#endif + if (!io || req->opcode == IORING_OP_RECV) return 0; /* iovec is already imported */ From adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Mon, 24 Feb 2020 10:20:28 +0100 Subject: [PATCH 265/400] dm: report suspended device during destroy The function dm_suspended returns true if the target is suspended. However, when the target is being suspended during unload, it returns false. An example where this is a problem: the test "!dm_suspended(wc->ti)" in writecache_writeback is not sufficient, because dm_suspended returns zero while writecache_suspend is in progress. As is, without an enhanced dm_suspended, simply switching from flush_workqueue to drain_workqueue still emits warnings: workqueue writecache-writeback: drain_workqueue() isn't complete after 10 tries workqueue writecache-writeback: drain_workqueue() isn't complete after 100 tries workqueue writecache-writeback: drain_workqueue() isn't complete after 200 tries workqueue writecache-writeback: drain_workqueue() isn't complete after 300 tries workqueue writecache-writeback: drain_workqueue() isn't complete after 400 tries writecache_suspend calls flush_workqueue(wc->writeback_wq) - this function flushes the current work. However, the workqueue may re-queue itself and flush_workqueue doesn't wait for re-queued works to finish. Because of this - the function writecache_writeback continues execution after the device was suspended and then concurrently with writecache_dtr, causing a crash in writecache_writeback. We must use drain_workqueue - that waits until the work and all re-queued works finish. As a prereq for switching to drain_workqueue, this commit fixes dm_suspended to return true after the presuspend hook and before the postsuspend hook - just like during a normal suspend. It allows simplifying the dm-integrity and dm-writecache targets so that they don't have to maintain suspended flags on their own. With this change use of drain_workqueue() can be used effectively. This change was tested with the lvm2 testsuite and cryptsetup testsuite and the are no regressions. Fixes: 48debafe4f2f ("dm: add writecache target") Cc: stable@vger.kernel.org # 4.18+ Reported-by: Corey Marthaler Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-integrity.c | 12 +++++------- drivers/md/dm-writecache.c | 2 +- drivers/md/dm.c | 1 + 3 files changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c index 2f98e88399ec..e1ad0b53f681 100644 --- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -201,12 +201,13 @@ struct dm_integrity_c { __u8 log2_blocks_per_bitmap_bit; unsigned char mode; - int suspending; int failed; struct crypto_shash *internal_hash; + struct dm_target *ti; + /* these variables are locked with endio_wait.lock */ struct rb_root in_progress; struct list_head wait_list; @@ -2316,7 +2317,7 @@ static void integrity_writer(struct work_struct *w) unsigned prev_free_sectors; /* the following test is not needed, but it tests the replay code */ - if (READ_ONCE(ic->suspending) && !ic->meta_dev) + if (unlikely(dm_suspended(ic->ti)) && !ic->meta_dev) return; spin_lock_irq(&ic->endio_wait.lock); @@ -2377,7 +2378,7 @@ static void integrity_recalc(struct work_struct *w) next_chunk: - if (unlikely(READ_ONCE(ic->suspending))) + if (unlikely(dm_suspended(ic->ti))) goto unlock_ret; range.logical_sector = le64_to_cpu(ic->sb->recalc_sector); @@ -2805,8 +2806,6 @@ static void dm_integrity_postsuspend(struct dm_target *ti) del_timer_sync(&ic->autocommit_timer); - WRITE_ONCE(ic->suspending, 1); - if (ic->recalc_wq) drain_workqueue(ic->recalc_wq); @@ -2835,8 +2834,6 @@ static void dm_integrity_postsuspend(struct dm_target *ti) #endif } - WRITE_ONCE(ic->suspending, 0); - BUG_ON(!RB_EMPTY_ROOT(&ic->in_progress)); ic->journal_uptodate = true; @@ -3631,6 +3628,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv) } ti->private = ic; ti->per_io_data_size = sizeof(struct dm_integrity_io); + ic->ti = ti; ic->in_progress = RB_ROOT; INIT_LIST_HEAD(&ic->wait_list); diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index b9e27e37a943..8d45c848cbe4 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -842,7 +842,7 @@ static void writecache_suspend(struct dm_target *ti) } wc_unlock(wc); - flush_workqueue(wc->writeback_wq); + drain_workqueue(wc->writeback_wq); wc_lock(wc); if (flush_on_suspend) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index b89f07ee2eff..d01ad6a35dd0 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2368,6 +2368,7 @@ static void __dm_destroy(struct mapped_device *md, bool wait) map = dm_get_live_table(md, &srcu_idx); if (!dm_suspended_md(md)) { dm_table_presuspend_targets(map); + set_bit(DMF_SUSPENDED, &md->flags); dm_table_postsuspend_targets(map); } /* dm_put_live_table must be before msleep, otherwise deadlock is possible */ From 41c526c5af46d4c4dab7f72c99000b7fac0b9702 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Mon, 24 Feb 2020 10:20:30 +0100 Subject: [PATCH 266/400] dm writecache: verify watermark during resume Verify the watermark upon resume - so that if the target is reloaded with lower watermark, it will start the cleanup process immediately. Fixes: 48debafe4f2f ("dm: add writecache target") Cc: stable@vger.kernel.org # 4.18+ Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-writecache.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index 8d45c848cbe4..d692d2a00745 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -625,6 +625,12 @@ static void writecache_add_to_freelist(struct dm_writecache *wc, struct wc_entry wc->freelist_size++; } +static inline void writecache_verify_watermark(struct dm_writecache *wc) +{ + if (unlikely(wc->freelist_size + wc->writeback_size <= wc->freelist_high_watermark)) + queue_work(wc->writeback_wq, &wc->writeback_work); +} + static struct wc_entry *writecache_pop_from_freelist(struct dm_writecache *wc, sector_t expected_sector) { struct wc_entry *e; @@ -650,8 +656,8 @@ static struct wc_entry *writecache_pop_from_freelist(struct dm_writecache *wc, s list_del(&e->lru); } wc->freelist_size--; - if (unlikely(wc->freelist_size + wc->writeback_size <= wc->freelist_high_watermark)) - queue_work(wc->writeback_wq, &wc->writeback_work); + + writecache_verify_watermark(wc); return e; } @@ -965,6 +971,8 @@ erase_this: writecache_commit_flushed(wc, false); } + writecache_verify_watermark(wc); + wc_unlock(wc); } From ee63634bae02e13c8c0df1209a6a0ca5326f3189 Mon Sep 17 00:00:00 2001 From: Shin'ichiro Kawasaki Date: Thu, 27 Feb 2020 09:18:52 +0900 Subject: [PATCH 267/400] dm zoned: Fix reference counter initial value of chunk works Dm-zoned initializes reference counters of new chunk works with zero value and refcount_inc() is called to increment the counter. However, the refcount_inc() function handles the addition to zero value as an error and triggers the warning as follows: refcount_t: addition on 0; use-after-free. WARNING: CPU: 7 PID: 1506 at lib/refcount.c:25 refcount_warn_saturate+0x68/0xf0 ... CPU: 7 PID: 1506 Comm: systemd-udevd Not tainted 5.4.0+ #134 ... Call Trace: dmz_map+0x2d2/0x350 [dm_zoned] __map_bio+0x42/0x1a0 __split_and_process_non_flush+0x14a/0x1b0 __split_and_process_bio+0x83/0x240 ? kmem_cache_alloc+0x165/0x220 dm_process_bio+0x90/0x230 ? generic_make_request_checks+0x2e7/0x680 dm_make_request+0x3e/0xb0 generic_make_request+0xcf/0x320 ? memcg_drain_all_list_lrus+0x1c0/0x1c0 submit_bio+0x3c/0x160 ? guard_bio_eod+0x2c/0x130 mpage_readpages+0x182/0x1d0 ? bdev_evict_inode+0xf0/0xf0 read_pages+0x6b/0x1b0 __do_page_cache_readahead+0x1ba/0x1d0 force_page_cache_readahead+0x93/0x100 generic_file_read_iter+0x83a/0xe40 ? __seccomp_filter+0x7b/0x670 new_sync_read+0x12a/0x1c0 vfs_read+0x9d/0x150 ksys_read+0x5f/0xe0 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ... After this warning, following refcount API calls for the counter all fail to change the counter value. Fix this by setting the initial reference counter value not zero but one for the new chunk works. Instead, do not call refcount_inc() via dmz_get_chunk_work() for the new chunks works. The failure was observed with linux version 5.4 with CONFIG_REFCOUNT_FULL enabled. Refcount rework was merged to linux version 5.5 by the commit 168829ad09ca ("Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"). After this commit, CONFIG_REFCOUNT_FULL was removed and the failure was observed regardless of kernel configuration. Linux version 4.20 merged the commit 092b5648760a ("dm zoned: target: use refcount_t for dm zoned reference counters"). Before this commit, dm zoned used atomic_t APIs which does not check addition to zero, then this fix is not necessary. Fixes: 092b5648760a ("dm zoned: target: use refcount_t for dm zoned reference counters") Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Shin'ichiro Kawasaki Reviewed-by: Damien Le Moal Signed-off-by: Mike Snitzer --- drivers/md/dm-zoned-target.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c index 70a1063161c0..b1e64cd31647 100644 --- a/drivers/md/dm-zoned-target.c +++ b/drivers/md/dm-zoned-target.c @@ -533,8 +533,9 @@ static int dmz_queue_chunk_work(struct dmz_target *dmz, struct bio *bio) /* Get the BIO chunk work. If one is not active yet, create one */ cw = radix_tree_lookup(&dmz->chunk_rxtree, chunk); - if (!cw) { - + if (cw) { + dmz_get_chunk_work(cw); + } else { /* Create a new chunk work */ cw = kmalloc(sizeof(struct dm_chunk_work), GFP_NOIO); if (unlikely(!cw)) { @@ -543,7 +544,7 @@ static int dmz_queue_chunk_work(struct dmz_target *dmz, struct bio *bio) } INIT_WORK(&cw->work, dmz_chunk_work); - refcount_set(&cw->refcount, 0); + refcount_set(&cw->refcount, 1); cw->target = dmz; cw->chunk = chunk; bio_list_init(&cw->bio_list); @@ -556,7 +557,6 @@ static int dmz_queue_chunk_work(struct dmz_target *dmz, struct bio *bio) } bio_list_add(&cw->bio_list, bio); - dmz_get_chunk_work(cw); dmz_reclaim_bio_acc(dmz->reclaim); if (queue_work(dmz->chunk_wq, &cw->work)) From 5901b51f3e5d9129da3e59b10cc76e4cc983e940 Mon Sep 17 00:00:00 2001 From: Lukas Bulwahn Date: Fri, 21 Feb 2020 19:54:02 +0100 Subject: [PATCH 268/400] MAINTAINERS: Correct Cadence PCI driver path de80f95ccb9c ("PCI: cadence: Move all files to per-device cadence directory") moved files of the PCI cadence drivers, but did not update the MAINTAINERS entry. Since then, ./scripts/get_maintainer.pl --self-test complains: warning: no file matches F: drivers/pci/controller/pcie-cadence* Repair the MAINTAINERS entry. Link: https://lore.kernel.org/r/20200221185402.4703-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn Signed-off-by: Bjorn Helgaas --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 38fe2f3f7b6f..8dd7ae98c574 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -12740,7 +12740,7 @@ M: Tom Joseph L: linux-pci@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/pci/cdns,*.txt -F: drivers/pci/controller/pcie-cadence* +F: drivers/pci/controller/cadence/ PCI DRIVER FOR FREESCALE LAYERSCAPE M: Minghuan Lian From fc37a1632d40c80c067eb1bc235139f5867a2667 Mon Sep 17 00:00:00 2001 From: "Desnes A. Nunes do Rosario" Date: Thu, 27 Feb 2020 10:47:15 -0300 Subject: [PATCH 269/400] powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems PowerVM systems running compatibility mode on a few Power8 revisions are still vulnerable to the hardware defect that loses PMU exceptions arriving prior to a context switch. The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG cpu_feature bit, nevertheless this bit also needs to be set for PowerVM compatibility mode systems. Fixes: 68f2f0d431d9ea4 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG") Signed-off-by: Desnes A. Nunes do Rosario Reviewed-by: Leonardo Bras Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200227134715.9715-1-desnesn@linux.ibm.com --- arch/powerpc/kernel/cputable.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c index e745abc5457a..245be4fafe13 100644 --- a/arch/powerpc/kernel/cputable.c +++ b/arch/powerpc/kernel/cputable.c @@ -2193,11 +2193,13 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset, * oprofile_cpu_type already has a value, then we are * possibly overriding a real PVR with a logical one, * and, in that case, keep the current value for - * oprofile_cpu_type. + * oprofile_cpu_type. Futhermore, let's ensure that the + * fix for the PMAO bug is enabled on compatibility mode. */ if (old.oprofile_cpu_type != NULL) { t->oprofile_cpu_type = old.oprofile_cpu_type; t->oprofile_type = old.oprofile_type; + t->cpu_features |= old.cpu_features & CPU_FTR_PMAO_BUG; } } From 7943f4acea3caf0b6d5b6cdfce7d5a2b4a9aa608 Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Tue, 25 Feb 2020 08:54:26 +0100 Subject: [PATCH 270/400] KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter Even if APICv is disabled at startup, the backing page and ir_list need to be initialized in case they are needed later. The only case in which this can be skipped is for userspace irqchip, and that must be done because avic_init_backing_page dereferences vcpu->arch.apic (which is NULL for userspace irqchip). Tested-by: rmuncrief@humanavance.com Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206579 Reviewed-by: Miaohe Lin Signed-off-by: Paolo Bonzini --- arch/x86/kvm/svm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index ad3f5b178a03..bd02526300ab 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -2194,8 +2194,9 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) static int avic_init_vcpu(struct vcpu_svm *svm) { int ret; + struct kvm_vcpu *vcpu = &svm->vcpu; - if (!kvm_vcpu_apicv_active(&svm->vcpu)) + if (!avic || !irqchip_in_kernel(vcpu->kvm)) return 0; ret = avic_init_backing_page(&svm->vcpu); From fcd07f9adc7dacc2532695cf9dd2284d49e716ff Mon Sep 17 00:00:00 2001 From: Christian Borntraeger Date: Fri, 28 Feb 2020 09:49:41 +0100 Subject: [PATCH 271/400] KVM: let declaration of kvm_get_running_vcpus match implementation Sparse notices that declaration and implementation do not match: arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: warning: incorrect type in return expression (different address spaces) arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: expected struct kvm_vcpu [noderef] ** arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: got struct kvm_vcpu *[noderef] * Signed-off-by: Christian Borntraeger Signed-off-by: Paolo Bonzini --- include/linux/kvm_host.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 7944ad6ac10b..bcb9b2ac0791 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1344,7 +1344,7 @@ static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val) #endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */ struct kvm_vcpu *kvm_get_running_vcpu(void); -struct kvm_vcpu __percpu **kvm_get_running_vcpus(void); +struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void); #ifdef CONFIG_HAVE_KVM_IRQ_BYPASS bool kvm_arch_has_irq_bypass(void); From a262bca3aba03f0696995beb223c610e47533db3 Mon Sep 17 00:00:00 2001 From: Wanpeng Li Date: Tue, 18 Feb 2020 09:08:23 +0800 Subject: [PATCH 272/400] KVM: Introduce pv check helpers Introduce some pv check helpers for consistency. Suggested-by: Vitaly Kuznetsov Reviewed-by: Konrad Rzeszutek Wilk Signed-off-by: Wanpeng Li Signed-off-by: Paolo Bonzini --- arch/x86/kernel/kvm.c | 34 ++++++++++++++++++++++++---------- 1 file changed, 24 insertions(+), 10 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index d817f255aed8..7bc0fff3f8e6 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -425,7 +425,27 @@ static void __init sev_map_percpu_data(void) } } +static bool pv_tlb_flush_supported(void) +{ + return (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) && + !kvm_para_has_hint(KVM_HINTS_REALTIME) && + kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)); +} + #ifdef CONFIG_SMP + +static bool pv_ipi_supported(void) +{ + return kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI); +} + +static bool pv_sched_yield_supported(void) +{ + return (kvm_para_has_feature(KVM_FEATURE_PV_SCHED_YIELD) && + !kvm_para_has_hint(KVM_HINTS_REALTIME) && + kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)); +} + #define KVM_IPI_CLUSTER_SIZE (2 * BITS_PER_LONG) static void __send_ipi_mask(const struct cpumask *mask, int vector) @@ -619,9 +639,7 @@ static void __init kvm_guest_init(void) pv_ops.time.steal_clock = kvm_steal_clock; } - if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) && - !kvm_para_has_hint(KVM_HINTS_REALTIME) && - kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) { + if (pv_tlb_flush_supported()) { pv_ops.mmu.flush_tlb_others = kvm_flush_tlb_others; pv_ops.mmu.tlb_remove_table = tlb_remove_table; } @@ -632,9 +650,7 @@ static void __init kvm_guest_init(void) #ifdef CONFIG_SMP smp_ops.smp_prepare_cpus = kvm_smp_prepare_cpus; smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu; - if (kvm_para_has_feature(KVM_FEATURE_PV_SCHED_YIELD) && - !kvm_para_has_hint(KVM_HINTS_REALTIME) && - kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) { + if (pv_sched_yield_supported()) { smp_ops.send_call_func_ipi = kvm_smp_send_call_func_ipi; pr_info("KVM setup pv sched yield\n"); } @@ -700,7 +716,7 @@ static uint32_t __init kvm_detect(void) static void __init kvm_apic_init(void) { #if defined(CONFIG_SMP) - if (kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI)) + if (pv_ipi_supported()) kvm_setup_pv_ipi(); #endif } @@ -739,9 +755,7 @@ static __init int kvm_setup_pv_tlb_flush(void) if (!kvm_para_available() || nopv) return 0; - if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) && - !kvm_para_has_hint(KVM_HINTS_REALTIME) && - kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) { + if (pv_tlb_flush_supported()) { for_each_possible_cpu(cpu) { zalloc_cpumask_var_node(per_cpu_ptr(&__pv_tlb_mask, cpu), GFP_KERNEL, cpu_to_node(cpu)); From 8a9442f49c72bde43f982e53b74526ac37d3565b Mon Sep 17 00:00:00 2001 From: Wanpeng Li Date: Tue, 18 Feb 2020 09:08:24 +0800 Subject: [PATCH 273/400] KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis Nick Desaulniers Reported: When building with: $ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000 The following warning is observed: arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=] static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector) ^ Debugging with: https://github.com/ClangBuiltLinux/frame-larger-than via: $ python3 frame_larger_than.py arch/x86/kernel/kvm.o \ kvm_send_ipi_mask_allbutself points to the stack allocated `struct cpumask newmask` in `kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as 8192, making a single instance of a `struct cpumask` 1024 B. This patch fixes it by pre-allocate 1 cpumask variable per cpu and use it for both pv tlb and pv ipis.. Reported-by: Nick Desaulniers Acked-by: Nick Desaulniers Reviewed-by: Vitaly Kuznetsov Cc: Peter Zijlstra Cc: Nick Desaulniers Signed-off-by: Wanpeng Li Signed-off-by: Paolo Bonzini --- arch/x86/kernel/kvm.c | 33 +++++++++++++++++++++------------ 1 file changed, 21 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 7bc0fff3f8e6..6efe0410fb72 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -432,6 +432,8 @@ static bool pv_tlb_flush_supported(void) kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)); } +static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask); + #ifdef CONFIG_SMP static bool pv_ipi_supported(void) @@ -510,12 +512,12 @@ static void kvm_send_ipi_mask(const struct cpumask *mask, int vector) static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector) { unsigned int this_cpu = smp_processor_id(); - struct cpumask new_mask; + struct cpumask *new_mask = this_cpu_cpumask_var_ptr(__pv_cpu_mask); const struct cpumask *local_mask; - cpumask_copy(&new_mask, mask); - cpumask_clear_cpu(this_cpu, &new_mask); - local_mask = &new_mask; + cpumask_copy(new_mask, mask); + cpumask_clear_cpu(this_cpu, new_mask); + local_mask = new_mask; __send_ipi_mask(local_mask, vector); } @@ -595,7 +597,6 @@ static void __init kvm_apf_trap_init(void) update_intr_gate(X86_TRAP_PF, async_page_fault); } -static DEFINE_PER_CPU(cpumask_var_t, __pv_tlb_mask); static void kvm_flush_tlb_others(const struct cpumask *cpumask, const struct flush_tlb_info *info) @@ -603,7 +604,7 @@ static void kvm_flush_tlb_others(const struct cpumask *cpumask, u8 state; int cpu; struct kvm_steal_time *src; - struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_tlb_mask); + struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask); cpumask_copy(flushmask, cpumask); /* @@ -642,6 +643,7 @@ static void __init kvm_guest_init(void) if (pv_tlb_flush_supported()) { pv_ops.mmu.flush_tlb_others = kvm_flush_tlb_others; pv_ops.mmu.tlb_remove_table = tlb_remove_table; + pr_info("KVM setup pv remote TLB flush\n"); } if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) @@ -748,24 +750,31 @@ static __init int activate_jump_labels(void) } arch_initcall(activate_jump_labels); -static __init int kvm_setup_pv_tlb_flush(void) +static __init int kvm_alloc_cpumask(void) { int cpu; + bool alloc = false; if (!kvm_para_available() || nopv) return 0; - if (pv_tlb_flush_supported()) { + if (pv_tlb_flush_supported()) + alloc = true; + +#if defined(CONFIG_SMP) + if (pv_ipi_supported()) + alloc = true; +#endif + + if (alloc) for_each_possible_cpu(cpu) { - zalloc_cpumask_var_node(per_cpu_ptr(&__pv_tlb_mask, cpu), + zalloc_cpumask_var_node(per_cpu_ptr(&__pv_cpu_mask, cpu), GFP_KERNEL, cpu_to_node(cpu)); } - pr_info("KVM setup pv remote TLB flush\n"); - } return 0; } -arch_initcall(kvm_setup_pv_tlb_flush); +arch_initcall(kvm_alloc_cpumask); #ifdef CONFIG_PARAVIRT_SPINLOCKS From 575b255c1663c8fccc41fe965dcac281e3113c65 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Valdis=20Kl=C4=93tnieks?= Date: Thu, 27 Feb 2020 21:49:52 -0500 Subject: [PATCH 274/400] KVM: x86: allow compiling as non-module with W=1 Compile error with CONFIG_KVM_INTEL=y and W=1: CC arch/x86/kvm/vmx/vmx.o arch/x86/kvm/vmx/vmx.c:68:32: error: 'vmx_cpu_id' defined but not used [-Werror=unused-const-variable=] 68 | static const struct x86_cpu_id vmx_cpu_id[] = { | ^~~~~~~~~~ cc1: all warnings being treated as errors When building with =y, the MODULE_DEVICE_TABLE macro doesn't generate a reference to the structure (or any code at all). This makes W=1 compiles unhappy. Wrap both in a #ifdef to avoid the issue. Signed-off-by: Valdis Kletnieks [Do the same for CONFIG_KVM_AMD. - Paolo] Signed-off-by: Paolo Bonzini --- arch/x86/kvm/svm.c | 2 ++ arch/x86/kvm/vmx/vmx.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index bd02526300ab..24c0b2ba8fb9 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -57,11 +57,13 @@ MODULE_AUTHOR("Qumranet"); MODULE_LICENSE("GPL"); +#ifdef MODULE static const struct x86_cpu_id svm_cpu_id[] = { X86_FEATURE_MATCH(X86_FEATURE_SVM), {} }; MODULE_DEVICE_TABLE(x86cpu, svm_cpu_id); +#endif #define IOPM_ALLOC_ORDER 2 #define MSRPM_ALLOC_ORDER 1 diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 63aaf44edd1f..ce70a71037ed 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -64,11 +64,13 @@ MODULE_AUTHOR("Qumranet"); MODULE_LICENSE("GPL"); +#ifdef MODULE static const struct x86_cpu_id vmx_cpu_id[] = { X86_FEATURE_MATCH(X86_FEATURE_VMX), {} }; MODULE_DEVICE_TABLE(x86cpu, vmx_cpu_id); +#endif bool __read_mostly enable_vpid = 1; module_param_named(vpid, enable_vpid, bool, 0444); From 4f337faf1c55e55bdc49df13fcb3a3c45655899e Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Fri, 28 Feb 2020 10:42:31 +0100 Subject: [PATCH 275/400] KVM: allow disabling -Werror Restrict -Werror to well-tested configurations and allow disabling it via Kconfig. Reported-by: Christoph Hellwig Signed-off-by: Paolo Bonzini --- arch/x86/kvm/Kconfig | 13 +++++++++++++ arch/x86/kvm/Makefile | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 991019d5eee1..1bb4927030af 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -59,6 +59,19 @@ config KVM If unsure, say N. +config KVM_WERROR + bool "Compile KVM with -Werror" + # KASAN may cause the build to fail due to larger frames + default y if X86_64 && !KASAN + # We use the dependency on !COMPILE_TEST to not be enabled + # blindly in allmodconfig or allyesconfig configurations + depends on (X86_64 && !KASAN) || !COMPILE_TEST + depends on EXPERT + help + Add -Werror to the build flags for (and only for) i915.ko. + + If in doubt, say "N". + config KVM_INTEL tristate "KVM for Intel (and compatible) processors support" depends on KVM && IA32_FEAT_CTL diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index 4654e97a05cc..e553f0fdd87d 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 ccflags-y += -Iarch/x86/kvm -ccflags-y += -Werror +ccflags-$(CONFIG_KVM_WERROR) += -Werror KVM := ../../../virt/kvm From aaec7c03de92c35a96966631989950e6e27662db Mon Sep 17 00:00:00 2001 From: Paolo Bonzini Date: Fri, 28 Feb 2020 10:49:10 +0100 Subject: [PATCH 276/400] KVM: x86: avoid useless copy of cpufreq policy struct cpufreq_policy is quite big and it is not a good idea to allocate one on the stack. Just use cpufreq_cpu_get and cpufreq_cpu_put which is even simpler. Reported-by: Christoph Hellwig Signed-off-by: Paolo Bonzini --- arch/x86/kvm/x86.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 359fcd395132..bcb6b676608b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7190,15 +7190,15 @@ static void kvm_timer_init(void) if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) { #ifdef CONFIG_CPU_FREQ - struct cpufreq_policy policy; + struct cpufreq_policy *policy; int cpu; - memset(&policy, 0, sizeof(policy)); cpu = get_cpu(); - cpufreq_get_policy(&policy, cpu); - if (policy.cpuinfo.max_freq) - max_tsc_khz = policy.cpuinfo.max_freq; + policy = cpufreq_cpu_get(cpu); + if (policy && policy->cpuinfo.max_freq) + max_tsc_khz = policy->cpuinfo.max_freq; put_cpu(); + cpufreq_cpu_put(policy); #endif cpufreq_register_notifier(&kvmclock_cpufreq_notifier_block, CPUFREQ_TRANSITION_NOTIFIER); From ef935c25fd648a17c27af5d1738b1884f78c5b75 Mon Sep 17 00:00:00 2001 From: Erwan Velu Date: Thu, 27 Feb 2020 19:00:46 +0100 Subject: [PATCH 277/400] kvm: x86: Limit the number of "kvm: disabled by bios" messages In older version of systemd(219), at boot time, udevadm is called with : /usr/bin/udevadm trigger --type=devices --action=add" This program generates an echo "add" in /sys/devices/system/cpu/cpu/uevent, leading to the "kvm: disabled by bios" message in case of your Bios disabled the virtualization extensions. On a modern system running up to 256 CPU threads, this pollutes the Kernel logs. This patch offers to ratelimit this message to avoid any userspace program triggering this uevent printing this message too often. This patch is only a workaround but greatly reduce the pollution without breaking the current behavior of printing a message if some try to instantiate KVM on a system that doesn't support it. Note that recent versions of systemd (>239) do not have trigger this behavior. This patch will be useful at least for some using older systemd with recent Kernels. Signed-off-by: Erwan Velu Signed-off-by: Paolo Bonzini --- arch/x86/kvm/x86.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index bcb6b676608b..5de200663f51 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7308,12 +7308,12 @@ int kvm_arch_init(void *opaque) } if (!ops->cpu_has_kvm_support()) { - printk(KERN_ERR "kvm: no hardware support\n"); + pr_err_ratelimited("kvm: no hardware support\n"); r = -EOPNOTSUPP; goto out; } if (ops->disabled_by_bios()) { - printk(KERN_ERR "kvm: disabled by bios\n"); + pr_err_ratelimited("kvm: disabled by bios\n"); r = -EOPNOTSUPP; goto out; } From 0c282b068eb26db0e85e2ab4ec6d1e932acda841 Mon Sep 17 00:00:00 2001 From: Madhuparna Bhowmik Date: Mon, 27 Jan 2020 23:28:21 +0530 Subject: [PATCH 278/400] fork: Use RCU_INIT_POINTER() instead of rcu_access_pointer() Use RCU_INIT_POINTER() instead of rcu_access_pointer() in copy_sighand(). Suggested-by: Oleg Nesterov Signed-off-by: Madhuparna Bhowmik Acked-by: Oleg Nesterov Acked-by: Christian Brauner [christian.brauner@ubuntu.com: edit commit message] Link: https://lore.kernel.org/r/20200127175821.10833-1-madhuparnabhowmik10@gmail.com Signed-off-by: Christian Brauner --- kernel/fork.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/fork.c b/kernel/fork.c index 60a1295f4384..86425305cd4a 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1508,7 +1508,7 @@ static int copy_sighand(unsigned long clone_flags, struct task_struct *tsk) return 0; } sig = kmem_cache_alloc(sighand_cachep, GFP_KERNEL); - rcu_assign_pointer(tsk->sighand, sig); + RCU_INIT_POINTER(tsk->sighand, sig); if (!sig) return -ENOMEM; From 22a34c6fe0ffc1d92ee26a25913fadf347258fd6 Mon Sep 17 00:00:00 2001 From: Madhuparna Bhowmik Date: Thu, 30 Jan 2020 11:50:28 +0530 Subject: [PATCH 279/400] exit: Fix Sparse errors and warnings This patch fixes the following sparse error: kernel/exit.c:627:25: error: incompatible types in comparison expression And the following warning: kernel/exit.c:626:40: warning: incorrect type in assignment Signed-off-by: Madhuparna Bhowmik Acked-by: Oleg Nesterov Acked-by: Christian Brauner [christian.brauner@ubuntu.com: edit commit message] Link: https://lore.kernel.org/r/20200130062028.4870-1-madhuparnabhowmik10@gmail.com Signed-off-by: Christian Brauner --- kernel/exit.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/exit.c b/kernel/exit.c index 2833ffb0c211..0b81b26a872a 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -619,8 +619,8 @@ static void forget_original_parent(struct task_struct *father, reaper = find_new_reaper(father, reaper); list_for_each_entry(p, &father->children, sibling) { for_each_thread(p, t) { - t->real_parent = reaper; - BUG_ON((!t->ptrace) != (t->parent == father)); + RCU_INIT_POINTER(t->real_parent, reaper); + BUG_ON((!t->ptrace) != (rcu_access_pointer(t->parent) == father)); if (likely(!t->ptrace)) t->parent = t->real_parent; if (t->pdeath_signal) From 186e28a18aeb0fec99cc586fda337e6b23190791 Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Fri, 28 Feb 2020 00:00:08 +0000 Subject: [PATCH 280/400] selftests: pidfd: Add pidfd_fdinfo_test in .gitignore The commit identified below added pidfd_fdinfo_test but failed to add it to .gitignore Fixes: 2def297ec7fb ("pidfd: add tests for NSpid info in fdinfo") Signed-off-by: Christophe Leroy Acked-by: Christian Brauner Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/966567c7dbaa26a06730d796354f8a086c0ee288.1582847778.git.christophe.leroy@c-s.fr Signed-off-by: Christian Brauner --- tools/testing/selftests/pidfd/.gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/pidfd/.gitignore b/tools/testing/selftests/pidfd/.gitignore index 3a779c084d96..39559d723c41 100644 --- a/tools/testing/selftests/pidfd/.gitignore +++ b/tools/testing/selftests/pidfd/.gitignore @@ -2,4 +2,5 @@ pidfd_open_test pidfd_poll_test pidfd_test pidfd_wait +pidfd_fdinfo_test pidfd_getfd_test From 801b67f3eaafd3f2ec8b65d93142d4ffedba85df Mon Sep 17 00:00:00 2001 From: Maor Gottlieb Date: Thu, 27 Feb 2020 14:57:28 +0200 Subject: [PATCH 281/400] RDMA/core: Fix pkey and port assignment in get_new_pps When port is part of the modify mask, then we should take it from the qp_attr and not from the old pps. Same for PKEY. Otherwise there are panics in some configurations: RIP: 0010:get_pkey_idx_qp_list+0x50/0x80 [ib_core] Code: c7 18 e8 13 04 30 ef 0f b6 43 06 48 69 c0 b8 00 00 00 48 03 85 a0 04 00 00 48 8b 50 20 48 8d 48 20 48 39 ca 74 1a 0f b7 73 04 <66> 39 72 10 75 08 eb 10 66 39 72 10 74 0a 48 8b 12 48 39 ca 75 f2 RSP: 0018:ffffafb3480932f0 EFLAGS: 00010203 RAX: ffff98059ababa10 RBX: ffff980d926e8cc0 RCX: ffff98059ababa30 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff98059ababa28 RBP: ffff98059b940000 R08: 00000000000310c0 R09: ffff97fe47c07480 R10: 0000000000000036 R11: 0000000000000200 R12: 0000000000000071 R13: ffff98059b940000 R14: ffff980d87f948a0 R15: 0000000000000000 FS: 00007f88deb31740(0000) GS:ffff98059f600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000000853e26001 CR4: 00000000001606e0 Call Trace: port_pkey_list_insert+0x3d/0x1b0 [ib_core] ? kmem_cache_alloc_trace+0x215/0x220 ib_security_modify_qp+0x226/0x3a0 [ib_core] _ib_modify_qp+0xcf/0x390 [ib_core] ipoib_init_qp+0x7f/0x200 [ib_ipoib] ? rvt_modify_port+0xd0/0xd0 [rdmavt] ? ib_find_pkey+0x99/0xf0 [ib_core] ipoib_ib_dev_open_default+0x1a/0x200 [ib_ipoib] ipoib_ib_dev_open+0x96/0x130 [ib_ipoib] ipoib_open+0x44/0x130 [ib_ipoib] __dev_open+0xd1/0x160 __dev_change_flags+0x1ab/0x1f0 dev_change_flags+0x23/0x60 do_setlink+0x328/0xe30 ? __nla_validate_parse+0x54/0x900 __rtnl_newlink+0x54e/0x810 ? __alloc_pages_nodemask+0x17d/0x320 ? page_fault+0x30/0x50 ? _cond_resched+0x15/0x30 ? kmem_cache_alloc_trace+0x1c8/0x220 rtnl_newlink+0x43/0x60 rtnetlink_rcv_msg+0x28f/0x350 ? kmem_cache_alloc+0x1fb/0x200 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x24d/0x2d0 ? rtnl_calcit.isra.31+0x120/0x120 netlink_rcv_skb+0xcb/0x100 netlink_unicast+0x1e0/0x340 netlink_sendmsg+0x317/0x480 ? __check_object_size+0x48/0x1d0 sock_sendmsg+0x65/0x80 ____sys_sendmsg+0x223/0x260 ? copy_msghdr_from_user+0xdc/0x140 ___sys_sendmsg+0x7c/0xc0 ? skb_dequeue+0x57/0x70 ? __inode_wait_for_writeback+0x75/0xe0 ? fsnotify_grab_connector+0x45/0x80 ? __dentry_kill+0x12c/0x180 __sys_sendmsg+0x58/0xa0 do_syscall_64+0x5b/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f88de467f10 Link: https://lore.kernel.org/r/20200227125728.100551-1-leon@kernel.org Cc: Fixes: 1dd017882e01 ("RDMA/core: Fix protection fault in get_pkey_idx_qp_list") Signed-off-by: Maor Gottlieb Signed-off-by: Leon Romanovsky Tested-by: Mike Marciniszyn Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/security.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c index b9a36ea244d4..2d5608315dc8 100644 --- a/drivers/infiniband/core/security.c +++ b/drivers/infiniband/core/security.c @@ -340,11 +340,15 @@ static struct ib_ports_pkeys *get_new_pps(const struct ib_qp *qp, return NULL; if (qp_attr_mask & IB_QP_PORT) - new_pps->main.port_num = - (qp_pps) ? qp_pps->main.port_num : qp_attr->port_num; + new_pps->main.port_num = qp_attr->port_num; + else if (qp_pps) + new_pps->main.port_num = qp_pps->main.port_num; + if (qp_attr_mask & IB_QP_PKEY_INDEX) - new_pps->main.pkey_index = (qp_pps) ? qp_pps->main.pkey_index : - qp_attr->pkey_index; + new_pps->main.pkey_index = qp_attr->pkey_index; + else if (qp_pps) + new_pps->main.pkey_index = qp_pps->main.pkey_index; + if ((qp_attr_mask & IB_QP_PKEY_INDEX) && (qp_attr_mask & IB_QP_PORT)) new_pps->main.state = IB_PORT_PKEY_VALID; From 9b3193089e77d3b59b045146ff1c770dd899acb1 Mon Sep 17 00:00:00 2001 From: Charles Keepax Date: Fri, 28 Feb 2020 15:31:45 +0000 Subject: [PATCH 282/400] ASoC: dapm: Correct DAPM handling of active widgets during shutdown commit c2caa4da46a4 ("ASoC: Fix widget powerdown on shutdown") added a set of the power state during snd_soc_dapm_shutdown to ensure the widgets powered off. However, when commit 39eb5fd13dff ("ASoC: dapm: Delay w->power update until the changes are written") added the new_power member of the widget structure, to differentiate between the current power state and the target power state, it did not update the shutdown to use the new_power member. As new_power has not updated it will be left in the state set by the last DAPM sequence, ie. 1 for active widgets. So as the DAPM sequence for the shutdown proceeds it will turn the widgets on (despite them already being on) rather than turning them off. Fixes: 39eb5fd13dff ("ASoC: dapm: Delay w->power update until the changes are written") Signed-off-by: Charles Keepax Link: https://lore.kernel.org/r/20200228153145.21013-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown --- sound/soc/soc-dapm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index 9b130561d562..9fb54e6fe254 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -4772,7 +4772,7 @@ static void soc_dapm_shutdown_dapm(struct snd_soc_dapm_context *dapm) continue; if (w->power) { dapm_seq_insert(w, &down_list, false); - w->power = 0; + w->new_power = 0; powerdown = 1; } } From f1861a7c58ba1ba43c7adff6909d9a920338e4a8 Mon Sep 17 00:00:00 2001 From: Kuninori Morimoto Date: Fri, 28 Feb 2020 10:48:35 +0900 Subject: [PATCH 283/400] ASoC: soc-component: tidyup snd_soc_pcm_component_sync_stop() commit 1e5ddb6ba73894 ("ASoC: component: Add sync_stop PCM ops") added snd_soc_pcm_component_sync_stop(), but it is checking ioctrl instead of sync_stop. This is bug. This patch fixup it. Fixes: commit 1e5ddb6ba73894 ("ASoC: component: Add sync_stop PCM ops") Signed-off-by: Kuninori Morimoto Cc: Takashi Iwai Link: https://lore.kernel.org/r/8736av7a8c.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown --- sound/soc/soc-component.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/soc-component.c b/sound/soc/soc-component.c index 14e175cdeeb8..785a0385cc7f 100644 --- a/sound/soc/soc-component.c +++ b/sound/soc/soc-component.c @@ -451,7 +451,7 @@ int snd_soc_pcm_component_sync_stop(struct snd_pcm_substream *substream) int i, ret; for_each_rtd_components(rtd, i, component) { - if (component->driver->ioctl) { + if (component->driver->sync_stop) { ret = component->driver->sync_stop(component, substream); if (ret < 0) From 8e093ea4d3593379be46b845b9e823179558047e Mon Sep 17 00:00:00 2001 From: Tudor Ambarus Date: Fri, 28 Feb 2020 15:55:32 +0000 Subject: [PATCH 284/400] spi: atmel-quadspi: fix possible MMIO window size overrun The QSPI controller memory space is limited to 128MB: 0x9000_00000-0x9800_00000/0XD000_0000--0XD800_0000. There are nor flashes that are bigger in size than the memory size supported by the controller: Micron MT25QL02G (256 MB). Check if the address exceeds the MMIO window size. An improvement would be to add support for regular SPI mode and fall back to it when the flash memories overrun the controller's memory space. Fixes: 0e6aae08e9ae ("spi: Add QuadSPI driver for Atmel SAMA5D2") Signed-off-by: Tudor Ambarus Link: https://lore.kernel.org/r/20200228155437.1558219-1-tudor.ambarus@microchip.com Signed-off-by: Mark Brown --- drivers/spi/atmel-quadspi.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/spi/atmel-quadspi.c b/drivers/spi/atmel-quadspi.c index fd8007ebb145..13def7f78b9e 100644 --- a/drivers/spi/atmel-quadspi.c +++ b/drivers/spi/atmel-quadspi.c @@ -149,6 +149,7 @@ struct atmel_qspi { struct clk *qspick; struct platform_device *pdev; const struct atmel_qspi_caps *caps; + resource_size_t mmap_size; u32 pending; u32 mr; u32 scr; @@ -329,6 +330,14 @@ static int atmel_qspi_exec_op(struct spi_mem *mem, const struct spi_mem_op *op) u32 sr, offset; int err; + /* + * Check if the address exceeds the MMIO window size. An improvement + * would be to add support for regular SPI mode and fall back to it + * when the flash memories overrun the controller's memory space. + */ + if (op->addr.val + op->data.nbytes > aq->mmap_size) + return -ENOTSUPP; + err = atmel_qspi_set_cfg(aq, op, &offset); if (err) return err; @@ -480,6 +489,8 @@ static int atmel_qspi_probe(struct platform_device *pdev) goto exit; } + aq->mmap_size = resource_size(res); + /* Get the peripheral clock */ aq->pclk = devm_clk_get(&pdev->dev, "pclk"); if (IS_ERR(aq->pclk)) From 51a21e0e7baf279fbd0b57f3e376f8762df8bb7d Mon Sep 17 00:00:00 2001 From: Rob Herring Date: Fri, 21 Feb 2020 16:27:10 -0600 Subject: [PATCH 285/400] dt-bindings: Fix dtc warnings in examples Fix all the warnings in the DT binding schema examples when built with 'W=1'. This is in preparation to make that the default for examples. Reviewed-by: Linus Walleij Acked-by: Stephen Boyd Acked-by: Sam Ravnborg Acked-by: Vinod Koul Acked-by: Lee Jones Acked-by: Srinivas Kandagatla Acked-by: Mark Brown Acked-by: Bartosz Golaszewski Cc: Daniel Lezcano Cc: Kishon Vijay Abraham I Cc: Ulf Hansson Cc: Dmitry Torokhov Cc: Krzysztof Kozlowski Cc: Kukjin Kim Cc: Jonathan Cameron Cc: Thierry Reding Cc: Chen-Yu Tsai Cc: Maxime Ripard Cc: Alexandre Torgue Cc: Maxime Coquelin Signed-off-by: Rob Herring --- .../devicetree/bindings/arm/stm32/st,mlahb.yaml | 2 +- .../clock/allwinner,sun4i-a10-osc-clk.yaml | 2 +- .../bindings/clock/allwinner,sun9i-a80-gt-clk.yaml | 2 +- .../display/allwinner,sun4i-a10-tv-encoder.yaml | 6 +----- .../bindings/display/bridge/anx6345.yaml | 10 ++-------- .../display/panel/leadtek,ltk500hd1829.yaml | 2 ++ .../bindings/display/panel/xinpeng,xpp055c272.yaml | 2 ++ .../bindings/display/simple-framebuffer.yaml | 6 +----- .../devicetree/bindings/dma/ti/k3-udma.yaml | 14 +------------- .../devicetree/bindings/gpu/arm,mali-bifrost.yaml | 14 +++++++------- .../devicetree/bindings/gpu/arm,mali-midgard.yaml | 14 +++++++------- .../bindings/iio/adc/samsung,exynos-adc.yaml | 2 +- .../bindings/input/touchscreen/goodix.yaml | 2 +- .../devicetree/bindings/media/ti,cal.yaml | 2 +- .../devicetree/bindings/mfd/max77650.yaml | 4 ++-- .../devicetree/bindings/mmc/mmc-controller.yaml | 1 + Documentation/devicetree/bindings/nvmem/nvmem.yaml | 2 ++ .../bindings/phy/allwinner,sun4i-a10-usb-phy.yaml | 2 +- .../bindings/pinctrl/st,stm32-pinctrl.yaml | 2 +- .../devicetree/bindings/regulator/regulator.yaml | 2 +- .../sram/allwinner,sun4i-a10-system-control.yaml | 2 +- .../bindings/timer/allwinner,sun4i-a10-timer.yaml | 2 +- 22 files changed, 39 insertions(+), 58 deletions(-) diff --git a/Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml b/Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml index 68917bb7c7e8..55f7938c4826 100644 --- a/Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml +++ b/Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml @@ -52,7 +52,7 @@ required: examples: - | - mlahb: ahb { + mlahb: ahb@38000000 { compatible = "st,mlahb", "simple-bus"; #address-cells = <1>; #size-cells = <1>; diff --git a/Documentation/devicetree/bindings/clock/allwinner,sun4i-a10-osc-clk.yaml b/Documentation/devicetree/bindings/clock/allwinner,sun4i-a10-osc-clk.yaml index 69cfa4a3d562..c604822cda07 100644 --- a/Documentation/devicetree/bindings/clock/allwinner,sun4i-a10-osc-clk.yaml +++ b/Documentation/devicetree/bindings/clock/allwinner,sun4i-a10-osc-clk.yaml @@ -40,7 +40,7 @@ additionalProperties: false examples: - | - osc24M: clk@01c20050 { + osc24M: clk@1c20050 { #clock-cells = <0>; compatible = "allwinner,sun4i-a10-osc-clk"; reg = <0x01c20050 0x4>; diff --git a/Documentation/devicetree/bindings/clock/allwinner,sun9i-a80-gt-clk.yaml b/Documentation/devicetree/bindings/clock/allwinner,sun9i-a80-gt-clk.yaml index 07f38def7dc3..43963c3062c8 100644 --- a/Documentation/devicetree/bindings/clock/allwinner,sun9i-a80-gt-clk.yaml +++ b/Documentation/devicetree/bindings/clock/allwinner,sun9i-a80-gt-clk.yaml @@ -41,7 +41,7 @@ additionalProperties: false examples: - | - clk@0600005c { + clk@600005c { #clock-cells = <0>; compatible = "allwinner,sun9i-a80-gt-clk"; reg = <0x0600005c 0x4>; diff --git a/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tv-encoder.yaml b/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tv-encoder.yaml index 5d5d39665119..6009324be967 100644 --- a/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tv-encoder.yaml +++ b/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tv-encoder.yaml @@ -49,11 +49,7 @@ examples: resets = <&tcon_ch0_clk 0>; port { - #address-cells = <1>; - #size-cells = <0>; - - tve0_in_tcon0: endpoint@0 { - reg = <0>; + tve0_in_tcon0: endpoint { remote-endpoint = <&tcon0_out_tve0>; }; }; diff --git a/Documentation/devicetree/bindings/display/bridge/anx6345.yaml b/Documentation/devicetree/bindings/display/bridge/anx6345.yaml index 6d72b3d11fbc..c21103869923 100644 --- a/Documentation/devicetree/bindings/display/bridge/anx6345.yaml +++ b/Documentation/devicetree/bindings/display/bridge/anx6345.yaml @@ -79,21 +79,15 @@ examples: #size-cells = <0>; anx6345_in: port@0 { - #address-cells = <1>; - #size-cells = <0>; reg = <0>; - anx6345_in_tcon0: endpoint@0 { - reg = <0>; + anx6345_in_tcon0: endpoint { remote-endpoint = <&tcon0_out_anx6345>; }; }; anx6345_out: port@1 { - #address-cells = <1>; - #size-cells = <0>; reg = <1>; - anx6345_out_panel: endpoint@0 { - reg = <0>; + anx6345_out_panel: endpoint { remote-endpoint = <&panel_in_edp>; }; }; diff --git a/Documentation/devicetree/bindings/display/panel/leadtek,ltk500hd1829.yaml b/Documentation/devicetree/bindings/display/panel/leadtek,ltk500hd1829.yaml index 4ebcea7d0c63..a614644c9849 100644 --- a/Documentation/devicetree/bindings/display/panel/leadtek,ltk500hd1829.yaml +++ b/Documentation/devicetree/bindings/display/panel/leadtek,ltk500hd1829.yaml @@ -37,6 +37,8 @@ examples: dsi@ff450000 { #address-cells = <1>; #size-cells = <0>; + reg = <0xff450000 0x1000>; + panel@0 { compatible = "leadtek,ltk500hd1829"; reg = <0>; diff --git a/Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml b/Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml index 186e5e1c8fa3..22c91beb0541 100644 --- a/Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml +++ b/Documentation/devicetree/bindings/display/panel/xinpeng,xpp055c272.yaml @@ -37,6 +37,8 @@ examples: dsi@ff450000 { #address-cells = <1>; #size-cells = <0>; + reg = <0xff450000 0x1000>; + panel@0 { compatible = "xinpeng,xpp055c272"; reg = <0>; diff --git a/Documentation/devicetree/bindings/display/simple-framebuffer.yaml b/Documentation/devicetree/bindings/display/simple-framebuffer.yaml index 678776b6012a..1db608c9eef5 100644 --- a/Documentation/devicetree/bindings/display/simple-framebuffer.yaml +++ b/Documentation/devicetree/bindings/display/simple-framebuffer.yaml @@ -174,10 +174,6 @@ examples: }; }; - soc@1c00000 { - lcdc0: lcdc@1c0c000 { - compatible = "allwinner,sun4i-a10-lcdc"; - }; - }; + lcdc0: lcdc { }; ... diff --git a/Documentation/devicetree/bindings/dma/ti/k3-udma.yaml b/Documentation/devicetree/bindings/dma/ti/k3-udma.yaml index 8b5c346f23f6..34780d7535b8 100644 --- a/Documentation/devicetree/bindings/dma/ti/k3-udma.yaml +++ b/Documentation/devicetree/bindings/dma/ti/k3-udma.yaml @@ -143,7 +143,7 @@ examples: #size-cells = <2>; dma-coherent; dma-ranges; - ranges; + ranges = <0x0 0x30800000 0x0 0x30800000 0x0 0x05000000>; ti,sci-dev-id = <118>; @@ -169,16 +169,4 @@ examples: ti,sci-rm-range-rflow = <0x6>; /* GP RFLOW */ }; }; - - mcasp0: mcasp@02B00000 { - dmas = <&main_udmap 0xc400>, <&main_udmap 0x4400>; - dma-names = "tx", "rx"; - }; - - crypto: crypto@4E00000 { - compatible = "ti,sa2ul-crypto"; - - dmas = <&main_udmap 0xc000>, <&main_udmap 0x4000>, <&main_udmap 0x4001>; - dma-names = "tx", "rx1", "rx2"; - }; }; diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-bifrost.yaml b/Documentation/devicetree/bindings/gpu/arm,mali-bifrost.yaml index 4ea6a8789699..e8b99adcb1bd 100644 --- a/Documentation/devicetree/bindings/gpu/arm,mali-bifrost.yaml +++ b/Documentation/devicetree/bindings/gpu/arm,mali-bifrost.yaml @@ -84,31 +84,31 @@ examples: gpu_opp_table: opp_table0 { compatible = "operating-points-v2"; - opp@533000000 { + opp-533000000 { opp-hz = /bits/ 64 <533000000>; opp-microvolt = <1250000>; }; - opp@450000000 { + opp-450000000 { opp-hz = /bits/ 64 <450000000>; opp-microvolt = <1150000>; }; - opp@400000000 { + opp-400000000 { opp-hz = /bits/ 64 <400000000>; opp-microvolt = <1125000>; }; - opp@350000000 { + opp-350000000 { opp-hz = /bits/ 64 <350000000>; opp-microvolt = <1075000>; }; - opp@266000000 { + opp-266000000 { opp-hz = /bits/ 64 <266000000>; opp-microvolt = <1025000>; }; - opp@160000000 { + opp-160000000 { opp-hz = /bits/ 64 <160000000>; opp-microvolt = <925000>; }; - opp@100000000 { + opp-100000000 { opp-hz = /bits/ 64 <100000000>; opp-microvolt = <912500>; }; diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-midgard.yaml b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.yaml index 36f59b3ade71..8d966f3ff3db 100644 --- a/Documentation/devicetree/bindings/gpu/arm,mali-midgard.yaml +++ b/Documentation/devicetree/bindings/gpu/arm,mali-midgard.yaml @@ -138,31 +138,31 @@ examples: gpu_opp_table: opp_table0 { compatible = "operating-points-v2"; - opp@533000000 { + opp-533000000 { opp-hz = /bits/ 64 <533000000>; opp-microvolt = <1250000>; }; - opp@450000000 { + opp-450000000 { opp-hz = /bits/ 64 <450000000>; opp-microvolt = <1150000>; }; - opp@400000000 { + opp-400000000 { opp-hz = /bits/ 64 <400000000>; opp-microvolt = <1125000>; }; - opp@350000000 { + opp-350000000 { opp-hz = /bits/ 64 <350000000>; opp-microvolt = <1075000>; }; - opp@266000000 { + opp-266000000 { opp-hz = /bits/ 64 <266000000>; opp-microvolt = <1025000>; }; - opp@160000000 { + opp-160000000 { opp-hz = /bits/ 64 <160000000>; opp-microvolt = <925000>; }; - opp@100000000 { + opp-100000000 { opp-hz = /bits/ 64 <100000000>; opp-microvolt = <912500>; }; diff --git a/Documentation/devicetree/bindings/iio/adc/samsung,exynos-adc.yaml b/Documentation/devicetree/bindings/iio/adc/samsung,exynos-adc.yaml index f46de17c0878..cc3c8ea6a894 100644 --- a/Documentation/devicetree/bindings/iio/adc/samsung,exynos-adc.yaml +++ b/Documentation/devicetree/bindings/iio/adc/samsung,exynos-adc.yaml @@ -123,7 +123,7 @@ examples: samsung,syscon-phandle = <&pmu_system_controller>; /* NTC thermistor is a hwmon device */ - ncp15wb473@0 { + ncp15wb473 { compatible = "murata,ncp15wb473"; pullup-uv = <1800000>; pullup-ohm = <47000>; diff --git a/Documentation/devicetree/bindings/input/touchscreen/goodix.yaml b/Documentation/devicetree/bindings/input/touchscreen/goodix.yaml index d7c3262b2494..c99ed3934d7e 100644 --- a/Documentation/devicetree/bindings/input/touchscreen/goodix.yaml +++ b/Documentation/devicetree/bindings/input/touchscreen/goodix.yaml @@ -62,7 +62,7 @@ required: examples: - | - i2c@00000000 { + i2c { #address-cells = <1>; #size-cells = <0>; gt928@5d { diff --git a/Documentation/devicetree/bindings/media/ti,cal.yaml b/Documentation/devicetree/bindings/media/ti,cal.yaml index 1ea784179536..5e066629287d 100644 --- a/Documentation/devicetree/bindings/media/ti,cal.yaml +++ b/Documentation/devicetree/bindings/media/ti,cal.yaml @@ -177,7 +177,7 @@ examples: }; }; - i2c5: i2c@4807c000 { + i2c { clock-frequency = <400000>; #address-cells = <1>; #size-cells = <0>; diff --git a/Documentation/devicetree/bindings/mfd/max77650.yaml b/Documentation/devicetree/bindings/mfd/max77650.yaml index 4a70f875a6eb..480385789394 100644 --- a/Documentation/devicetree/bindings/mfd/max77650.yaml +++ b/Documentation/devicetree/bindings/mfd/max77650.yaml @@ -97,14 +97,14 @@ examples: regulators { compatible = "maxim,max77650-regulator"; - max77650_ldo: regulator@0 { + max77650_ldo: regulator-ldo { regulator-compatible = "ldo"; regulator-name = "max77650-ldo"; regulator-min-microvolt = <1350000>; regulator-max-microvolt = <2937500>; }; - max77650_sbb0: regulator@1 { + max77650_sbb0: regulator-sbb0 { regulator-compatible = "sbb0"; regulator-name = "max77650-sbb0"; regulator-min-microvolt = <800000>; diff --git a/Documentation/devicetree/bindings/mmc/mmc-controller.yaml b/Documentation/devicetree/bindings/mmc/mmc-controller.yaml index 3c0df4016a12..8fded83c519a 100644 --- a/Documentation/devicetree/bindings/mmc/mmc-controller.yaml +++ b/Documentation/devicetree/bindings/mmc/mmc-controller.yaml @@ -370,6 +370,7 @@ examples: mmc3: mmc@1c12000 { #address-cells = <1>; #size-cells = <0>; + reg = <0x1c12000 0x200>; pinctrl-names = "default"; pinctrl-0 = <&mmc3_pins_a>; vmmc-supply = <®_vmmc3>; diff --git a/Documentation/devicetree/bindings/nvmem/nvmem.yaml b/Documentation/devicetree/bindings/nvmem/nvmem.yaml index b43c6c65294e..65980224d550 100644 --- a/Documentation/devicetree/bindings/nvmem/nvmem.yaml +++ b/Documentation/devicetree/bindings/nvmem/nvmem.yaml @@ -76,6 +76,8 @@ examples: qfprom: eeprom@700000 { #address-cells = <1>; #size-cells = <1>; + reg = <0x00700000 0x100000>; + wp-gpios = <&gpio1 3 GPIO_ACTIVE_HIGH>; /* ... */ diff --git a/Documentation/devicetree/bindings/phy/allwinner,sun4i-a10-usb-phy.yaml b/Documentation/devicetree/bindings/phy/allwinner,sun4i-a10-usb-phy.yaml index 020ef9e4c411..94ac23687b7e 100644 --- a/Documentation/devicetree/bindings/phy/allwinner,sun4i-a10-usb-phy.yaml +++ b/Documentation/devicetree/bindings/phy/allwinner,sun4i-a10-usb-phy.yaml @@ -86,7 +86,7 @@ examples: #include #include - usbphy: phy@01c13400 { + usbphy: phy@1c13400 { #phy-cells = <1>; compatible = "allwinner,sun4i-a10-usb-phy"; reg = <0x01c13400 0x10>, <0x01c14800 0x4>, <0x01c1c800 0x4>; diff --git a/Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.yaml index 754ea7ab040a..ef4de32cb17c 100644 --- a/Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.yaml +++ b/Documentation/devicetree/bindings/pinctrl/st,stm32-pinctrl.yaml @@ -248,7 +248,7 @@ examples: }; //Example 3 pin groups - pinctrl@60020000 { + pinctrl { usart1_pins_a: usart1-0 { pins1 { pinmux = ; diff --git a/Documentation/devicetree/bindings/regulator/regulator.yaml b/Documentation/devicetree/bindings/regulator/regulator.yaml index 92ff2e8ad572..91a39a33000b 100644 --- a/Documentation/devicetree/bindings/regulator/regulator.yaml +++ b/Documentation/devicetree/bindings/regulator/regulator.yaml @@ -191,7 +191,7 @@ patternProperties: examples: - | - xyzreg: regulator@0 { + xyzreg: regulator { regulator-min-microvolt = <1000000>; regulator-max-microvolt = <2500000>; regulator-always-on; diff --git a/Documentation/devicetree/bindings/sram/allwinner,sun4i-a10-system-control.yaml b/Documentation/devicetree/bindings/sram/allwinner,sun4i-a10-system-control.yaml index 80bac7a182d5..4b5509436588 100644 --- a/Documentation/devicetree/bindings/sram/allwinner,sun4i-a10-system-control.yaml +++ b/Documentation/devicetree/bindings/sram/allwinner,sun4i-a10-system-control.yaml @@ -125,7 +125,7 @@ examples: #size-cells = <1>; ranges; - sram_a: sram@00000000 { + sram_a: sram@0 { compatible = "mmio-sram"; reg = <0x00000000 0xc000>; #address-cells = <1>; diff --git a/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml b/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml index 23e989e09766..d918cee100ac 100644 --- a/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml +++ b/Documentation/devicetree/bindings/timer/allwinner,sun4i-a10-timer.yaml @@ -87,7 +87,7 @@ additionalProperties: false examples: - | - timer { + timer@1c20c00 { compatible = "allwinner,sun4i-a10-timer"; reg = <0x01c20c00 0x400>; interrupts = <22>, From 99bcd4a6e5b8ba201fdd252f1054689884899fee Mon Sep 17 00:00:00 2001 From: Juergen Gross Date: Tue, 18 Feb 2020 16:47:12 +0100 Subject: [PATCH 286/400] x86/ioperm: Add new paravirt function update_io_bitmap() Commit 111e7b15cf10f6 ("x86/ioperm: Extend IOPL config to control ioperm() as well") reworked the iopl syscall to use I/O bitmaps. Unfortunately this broke Xen PV domains using that syscall as there is currently no I/O bitmap support in PV domains. Add I/O bitmap support via a new paravirt function update_io_bitmap which Xen PV domains can use to update their I/O bitmaps via a hypercall. Fixes: 111e7b15cf10f6 ("x86/ioperm: Extend IOPL config to control ioperm() as well") Reported-by: Jan Beulich Signed-off-by: Juergen Gross Signed-off-by: Thomas Gleixner Tested-by: Jan Beulich Reviewed-by: Jan Beulich Cc: # 5.5 Link: https://lkml.kernel.org/r/20200218154712.25490-1-jgross@suse.com --- arch/x86/include/asm/io_bitmap.h | 9 ++++++++- arch/x86/include/asm/paravirt.h | 7 +++++++ arch/x86/include/asm/paravirt_types.h | 4 ++++ arch/x86/kernel/paravirt.c | 5 +++++ arch/x86/kernel/process.c | 2 +- arch/x86/xen/enlighten_pv.c | 25 +++++++++++++++++++++++++ 6 files changed, 50 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/io_bitmap.h b/arch/x86/include/asm/io_bitmap.h index 02c6ef8f7667..07344d82e88e 100644 --- a/arch/x86/include/asm/io_bitmap.h +++ b/arch/x86/include/asm/io_bitmap.h @@ -19,7 +19,14 @@ struct task_struct; void io_bitmap_share(struct task_struct *tsk); void io_bitmap_exit(void); -void tss_update_io_bitmap(void); +void native_tss_update_io_bitmap(void); + +#ifdef CONFIG_PARAVIRT_XXL +#include +#else +#define tss_update_io_bitmap native_tss_update_io_bitmap +#endif + #else static inline void io_bitmap_share(struct task_struct *tsk) { } static inline void io_bitmap_exit(void) { } diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 86e7317eb31f..694d8daf4983 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -295,6 +295,13 @@ static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g) PVOP_VCALL3(cpu.write_idt_entry, dt, entry, g); } +#ifdef CONFIG_X86_IOPL_IOPERM +static inline void tss_update_io_bitmap(void) +{ + PVOP_VCALL0(cpu.update_io_bitmap); +} +#endif + static inline void paravirt_activate_mm(struct mm_struct *prev, struct mm_struct *next) { diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index 84812964d3dd..732f62e04ddb 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -140,6 +140,10 @@ struct pv_cpu_ops { void (*load_sp0)(unsigned long sp0); +#ifdef CONFIG_X86_IOPL_IOPERM + void (*update_io_bitmap)(void); +#endif + void (*wbinvd)(void); /* cpuid emulation, mostly so that caps bits can be disabled */ diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c index 789f5e4f89de..c131ba4e70ef 100644 --- a/arch/x86/kernel/paravirt.c +++ b/arch/x86/kernel/paravirt.c @@ -30,6 +30,7 @@ #include #include #include +#include /* * nop stub, which must not clobber anything *including the stack* to @@ -341,6 +342,10 @@ struct paravirt_patch_template pv_ops = { .cpu.iret = native_iret, .cpu.swapgs = native_swapgs, +#ifdef CONFIG_X86_IOPL_IOPERM + .cpu.update_io_bitmap = native_tss_update_io_bitmap, +#endif + .cpu.start_context_switch = paravirt_nop, .cpu.end_context_switch = paravirt_nop, diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 839b5244e3b7..3053c85e0e42 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -374,7 +374,7 @@ static void tss_copy_io_bitmap(struct tss_struct *tss, struct io_bitmap *iobm) /** * tss_update_io_bitmap - Update I/O bitmap before exiting to usermode */ -void tss_update_io_bitmap(void) +void native_tss_update_io_bitmap(void) { struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); struct thread_struct *t = ¤t->thread; diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 79409120a603..507f4fb88fa7 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -72,6 +72,9 @@ #include #include #include +#ifdef CONFIG_X86_IOPL_IOPERM +#include +#endif #ifdef CONFIG_ACPI #include @@ -837,6 +840,25 @@ static void xen_load_sp0(unsigned long sp0) this_cpu_write(cpu_tss_rw.x86_tss.sp0, sp0); } +#ifdef CONFIG_X86_IOPL_IOPERM +static void xen_update_io_bitmap(void) +{ + struct physdev_set_iobitmap iobitmap; + struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); + + native_tss_update_io_bitmap(); + + iobitmap.bitmap = (uint8_t *)(&tss->x86_tss) + + tss->x86_tss.io_bitmap_base; + if (tss->x86_tss.io_bitmap_base == IO_BITMAP_OFFSET_INVALID) + iobitmap.nr_ports = 0; + else + iobitmap.nr_ports = IO_BITMAP_BITS; + + HYPERVISOR_physdev_op(PHYSDEVOP_set_iobitmap, &iobitmap); +} +#endif + static void xen_io_delay(void) { } @@ -1047,6 +1069,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = { .write_idt_entry = xen_write_idt_entry, .load_sp0 = xen_load_sp0, +#ifdef CONFIG_X86_IOPL_IOPERM + .update_io_bitmap = xen_update_io_bitmap, +#endif .io_delay = xen_io_delay, /* Xen takes care of %gs when switching to usermode for us */ From bba42affa732d6fd5bd5c9678e6deacde2de1547 Mon Sep 17 00:00:00 2001 From: Juergen Gross Date: Fri, 21 Feb 2020 11:38:51 +0100 Subject: [PATCH 287/400] x86/mm: Fix dump_pagetables with Xen PV Commit 2ae27137b2db89 ("x86: mm: convert dump_pagetables to use walk_page_range") broke Xen PV guests as the hypervisor reserved hole in the memory map was not taken into account. Fix that by starting the kernel range only at GUARD_HOLE_END_ADDR. Fixes: 2ae27137b2db89 ("x86: mm: convert dump_pagetables to use walk_page_range") Reported-by: Julien Grall Signed-off-by: Juergen Gross Signed-off-by: Thomas Gleixner Tested-by: Julien Grall Link: https://lkml.kernel.org/r/20200221103851.7855-1-jgross@suse.com --- arch/x86/mm/dump_pagetables.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c index 64229dad7eab..69309cd56fdf 100644 --- a/arch/x86/mm/dump_pagetables.c +++ b/arch/x86/mm/dump_pagetables.c @@ -363,13 +363,8 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, { const struct ptdump_range ptdump_ranges[] = { #ifdef CONFIG_X86_64 - -#define normalize_addr_shift (64 - (__VIRTUAL_MASK_SHIFT + 1)) -#define normalize_addr(u) ((signed long)((u) << normalize_addr_shift) >> \ - normalize_addr_shift) - {0, PTRS_PER_PGD * PGD_LEVEL_MULT / 2}, - {normalize_addr(PTRS_PER_PGD * PGD_LEVEL_MULT / 2), ~0UL}, + {GUARD_HOLE_END_ADDR, ~0UL}, #else {0, ~0UL}, #endif From 6c5d911249290f41f7b50b43344a7520605b1acb Mon Sep 17 00:00:00 2001 From: Qian Cai Date: Fri, 21 Feb 2020 23:31:11 -0500 Subject: [PATCH 288/400] jbd2: fix data races at struct journal_head journal_head::b_transaction and journal_head::b_next_transaction could be accessed concurrently as noticed by KCSAN, LTP: starting fsync04 /dev/zero: Can't open blockdev EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) ================================================================== BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2] write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70: __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2] __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569 jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2] (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034 kjournald2+0x13b/0x450 [jbd2] kthread+0x1cd/0x1f0 ret_from_fork+0x27/0x50 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68: jbd2_write_access_granted+0x1b2/0x250 [jbd2] jbd2_write_access_granted at fs/jbd2/transaction.c:1155 jbd2_journal_get_write_access+0x2c/0x60 [jbd2] __ext4_journal_get_write_access+0x50/0x90 [ext4] ext4_mb_mark_diskspace_used+0x158/0x620 [ext4] ext4_mb_new_blocks+0x54f/0xca0 [ext4] ext4_ind_map_blocks+0xc79/0x1b40 [ext4] ext4_map_blocks+0x3b4/0x950 [ext4] _ext4_get_block+0xfc/0x270 [ext4] ext4_get_block+0x3b/0x50 [ext4] __block_write_begin_int+0x22e/0xae0 __block_write_begin+0x39/0x50 ext4_write_begin+0x388/0xb50 [ext4] generic_perform_write+0x15d/0x290 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb05 entry_SYSCALL_64_after_hwframe+0x49/0xbe 5 locks held by fsync04/25724: #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260 #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4] #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2] #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4] #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2] irq event stamp: 1407125 hardirqs last enabled at (1407125): [] __find_get_block+0x107/0x790 hardirqs last disabled at (1407124): [] __find_get_block+0x49/0x790 softirqs last enabled at (1405528): [] __do_softirq+0x34c/0x57c softirqs last disabled at (1405521): [] irq_exit+0xa2/0xc0 Reported by Kernel Concurrency Sanitizer on: CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 The plain reads are outside of jh->b_state_lock critical section which result in data races. Fix them by adding pairs of READ|WRITE_ONCE(). Reviewed-by: Jan Kara Signed-off-by: Qian Cai Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw Signed-off-by: Theodore Ts'o --- fs/jbd2/transaction.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index d181948c0390..3dccc23cf010 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -1150,8 +1150,8 @@ static bool jbd2_write_access_granted(handle_t *handle, struct buffer_head *bh, /* For undo access buffer must have data copied */ if (undo && !jh->b_committed_data) goto out; - if (jh->b_transaction != handle->h_transaction && - jh->b_next_transaction != handle->h_transaction) + if (READ_ONCE(jh->b_transaction) != handle->h_transaction && + READ_ONCE(jh->b_next_transaction) != handle->h_transaction) goto out; /* * There are two reasons for the barrier here: @@ -2569,8 +2569,8 @@ bool __jbd2_journal_refile_buffer(struct journal_head *jh) * our jh reference and thus __jbd2_journal_file_buffer() must not * take a new one. */ - jh->b_transaction = jh->b_next_transaction; - jh->b_next_transaction = NULL; + WRITE_ONCE(jh->b_transaction, jh->b_next_transaction); + WRITE_ONCE(jh->b_next_transaction, NULL); if (buffer_freed(bh)) jlist = BJ_Forget; else if (jh->b_modified) From 38b17afb0ebb9ecd41418d3c08bcf9198af4349d Mon Sep 17 00:00:00 2001 From: Wolfram Sang Date: Tue, 25 Feb 2020 15:12:29 +0100 Subject: [PATCH 289/400] macintosh: therm_windtunnel: fix regression when instantiating devices Removing attach_adapter from this driver caused a regression for at least some machines. Those machines had the sensors described in their DT, too, so they didn't need manual creation of the sensor devices. The old code worked, though, because manual creation came first. Creation of DT devices then failed later and caused error logs, but the sensors worked nonetheless because of the manually created devices. When removing attach_adaper, manual creation now comes later and loses the race. The sensor devices were already registered via DT, yet with another binding, so the driver could not be bound to it. This fix refactors the code to remove the race and only manually creates devices if there are no DT nodes present. Also, the DT binding is updated to match both, the DT and manually created devices. Because we don't know which device creation will be used at runtime, the code to start the kthread is moved to do_probe() which will be called by both methods. Fixes: 3e7bed52719d ("macintosh: therm_windtunnel: drop using attach_adapter") Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723 Reported-by: Erhard Furtner Tested-by: Erhard Furtner Acked-by: Michael Ellerman (powerpc) Signed-off-by: Wolfram Sang Cc: stable@kernel.org # v4.19+ --- drivers/macintosh/therm_windtunnel.c | 56 ++++++++++++++++------------ 1 file changed, 33 insertions(+), 23 deletions(-) diff --git a/drivers/macintosh/therm_windtunnel.c b/drivers/macintosh/therm_windtunnel.c index 8c744578122a..a0d87ed9da69 100644 --- a/drivers/macintosh/therm_windtunnel.c +++ b/drivers/macintosh/therm_windtunnel.c @@ -300,9 +300,11 @@ static int control_loop(void *dummy) /* i2c probing and setup */ /************************************************************************/ -static int -do_attach( struct i2c_adapter *adapter ) +static void do_attach(struct i2c_adapter *adapter) { + struct i2c_board_info info = { }; + struct device_node *np; + /* scan 0x48-0x4f (DS1775) and 0x2c-2x2f (ADM1030) */ static const unsigned short scan_ds1775[] = { 0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f, @@ -313,25 +315,24 @@ do_attach( struct i2c_adapter *adapter ) I2C_CLIENT_END }; - if( strncmp(adapter->name, "uni-n", 5) ) - return 0; + if (x.running || strncmp(adapter->name, "uni-n", 5)) + return; - if( !x.running ) { - struct i2c_board_info info; - - memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "therm_ds1775", I2C_NAME_SIZE); + np = of_find_compatible_node(adapter->dev.of_node, NULL, "MAC,ds1775"); + if (np) { + of_node_put(np); + } else { + strlcpy(info.type, "MAC,ds1775", I2C_NAME_SIZE); i2c_new_probed_device(adapter, &info, scan_ds1775, NULL); - - strlcpy(info.type, "therm_adm1030", I2C_NAME_SIZE); - i2c_new_probed_device(adapter, &info, scan_adm1030, NULL); - - if( x.thermostat && x.fan ) { - x.running = 1; - x.poll_task = kthread_run(control_loop, NULL, "g4fand"); - } } - return 0; + + np = of_find_compatible_node(adapter->dev.of_node, NULL, "MAC,adm1030"); + if (np) { + of_node_put(np); + } else { + strlcpy(info.type, "MAC,adm1030", I2C_NAME_SIZE); + i2c_new_probed_device(adapter, &info, scan_adm1030, NULL); + } } static int @@ -404,8 +405,8 @@ out: enum chip { ds1775, adm1030 }; static const struct i2c_device_id therm_windtunnel_id[] = { - { "therm_ds1775", ds1775 }, - { "therm_adm1030", adm1030 }, + { "MAC,ds1775", ds1775 }, + { "MAC,adm1030", adm1030 }, { } }; MODULE_DEVICE_TABLE(i2c, therm_windtunnel_id); @@ -414,6 +415,7 @@ static int do_probe(struct i2c_client *cl, const struct i2c_device_id *id) { struct i2c_adapter *adapter = cl->adapter; + int ret = 0; if( !i2c_check_functionality(adapter, I2C_FUNC_SMBUS_WORD_DATA | I2C_FUNC_SMBUS_WRITE_BYTE) ) @@ -421,11 +423,19 @@ do_probe(struct i2c_client *cl, const struct i2c_device_id *id) switch (id->driver_data) { case adm1030: - return attach_fan( cl ); + ret = attach_fan(cl); + break; case ds1775: - return attach_thermostat(cl); + ret = attach_thermostat(cl); + break; } - return 0; + + if (!x.running && x.thermostat && x.fan) { + x.running = 1; + x.poll_task = kthread_run(control_loop, NULL, "g4fand"); + } + + return ret; } static struct i2c_driver g4fan_driver = { From 37b0b6b8b99c0e1c1f11abbe7cf49b6d03795b3f Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 28 Feb 2020 12:22:56 +0300 Subject: [PATCH 290/400] ext4: potential crash on allocation error in ext4_alloc_flex_bg_array() If sbi->s_flex_groups_allocated is zero and the first allocation fails then this code will crash. The problem is that "i--" will set "i" to -1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1 is type promoted to unsigned and becomes UINT_MAX. Since UINT_MAX is more than zero, the condition is true so we call kvfree(new_groups[-1]). The loop will carry on freeing invalid memory until it crashes. Fixes: 7c990728b99e ("ext4: fix potential race between s_flex_groups online resizing and access") Reviewed-by: Suraj Jitindar Singh Signed-off-by: Dan Carpenter Cc: stable@kernel.org Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountain Signed-off-by: Theodore Ts'o --- fs/ext4/super.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index ff1b764b0c0e..0c7c4adb664e 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -2391,7 +2391,7 @@ int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup) { struct ext4_sb_info *sbi = EXT4_SB(sb); struct flex_groups **old_groups, **new_groups; - int size, i; + int size, i, j; if (!sbi->s_log_groups_per_flex) return 0; @@ -2412,8 +2412,8 @@ int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup) sizeof(struct flex_groups)), GFP_KERNEL); if (!new_groups[i]) { - for (i--; i >= sbi->s_flex_groups_allocated; i--) - kvfree(new_groups[i]); + for (j = sbi->s_flex_groups_allocated; j < i; j++) + kvfree(new_groups[j]); kvfree(new_groups); ext4_msg(sb, KERN_ERR, "not enough memory for %d flex groups", size); From 86f7e90ce840aa1db407d3ea6e9b3a52b2ce923c Mon Sep 17 00:00:00 2001 From: Oliver Upton Date: Sat, 29 Feb 2020 11:30:14 -0800 Subject: [PATCH 291/400] KVM: VMX: check descriptor table exits on instruction emulation KVM emulates UMIP on hardware that doesn't support it by setting the 'descriptor table exiting' VM-execution control and performing instruction emulation. When running nested, this emulation is broken as KVM refuses to emulate L2 instructions by default. Correct this regression by allowing the emulation of descriptor table instructions if L1 hasn't requested 'descriptor table exiting'. Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Reported-by: Jan Kiszka Cc: stable@vger.kernel.org Cc: Paolo Bonzini Cc: Jim Mattson Signed-off-by: Oliver Upton Signed-off-by: Paolo Bonzini --- arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index ce70a71037ed..40b1e6138cd5 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7177,6 +7177,7 @@ static int vmx_check_intercept_io(struct kvm_vcpu *vcpu, else intercept = nested_vmx_check_io_bitmaps(vcpu, port, size); + /* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED. */ return intercept ? X86EMUL_UNHANDLEABLE : X86EMUL_CONTINUE; } @@ -7206,6 +7207,20 @@ static int vmx_check_intercept(struct kvm_vcpu *vcpu, case x86_intercept_outs: return vmx_check_intercept_io(vcpu, info); + case x86_intercept_lgdt: + case x86_intercept_lidt: + case x86_intercept_lldt: + case x86_intercept_ltr: + case x86_intercept_sgdt: + case x86_intercept_sidt: + case x86_intercept_sldt: + case x86_intercept_str: + if (!nested_cpu_has2(vmcs12, SECONDARY_EXEC_DESC)) + return X86EMUL_CONTINUE; + + /* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED. */ + break; + /* TODO: check more intercepts... */ default: break; From 98d54f81e36ba3bf92172791eba5ca5bd813989b Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 1 Mar 2020 16:38:46 -0600 Subject: [PATCH 292/400] Linux 5.6-rc4 --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 1a1a0d271697..86035d866f2c 100644 --- a/Makefile +++ b/Makefile @@ -2,7 +2,7 @@ VERSION = 5 PATCHLEVEL = 6 SUBLEVEL = 0 -EXTRAVERSION = -rc3 +EXTRAVERSION = -rc4 NAME = Kleptomaniac Octopus # *DOCUMENTATION* From 0a9d1e3f3f038785ebc72d53f1c409d07f6b4ff5 Mon Sep 17 00:00:00 2001 From: Marek Szyprowski Date: Thu, 27 Feb 2020 08:06:37 +0100 Subject: [PATCH 293/400] drm/exynos: dsi: propagate error value and silence meaningless warning Properly propagate error value from devm_regulator_bulk_get() and don't confuse user with meaningless warning about failure in getting regulators in case of deferred probe. Signed-off-by: Marek Szyprowski Reviewed-by: Krzysztof Kozlowski Signed-off-by: Inki Dae --- drivers/gpu/drm/exynos/exynos_drm_dsi.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c index 33628d85edad..3f6fcd453d33 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c @@ -1773,8 +1773,9 @@ static int exynos_dsi_probe(struct platform_device *pdev) ret = devm_regulator_bulk_get(dev, ARRAY_SIZE(dsi->supplies), dsi->supplies); if (ret) { - dev_info(dev, "failed to get regulators: %d\n", ret); - return -EPROBE_DEFER; + if (ret != -EPROBE_DEFER) + dev_info(dev, "failed to get regulators: %d\n", ret); + return ret; } dsi->clks = devm_kcalloc(dev, From c0fd99d659ba5582e09625c7a985d63fc2ca74b5 Mon Sep 17 00:00:00 2001 From: Marek Szyprowski Date: Thu, 20 Feb 2020 13:30:12 +0100 Subject: [PATCH 294/400] drm/exynos: dsi: fix workaround for the legacy clock name Writing to the built-in strings arrays doesn't work if driver is loaded as kernel module. This is also considered as a bad pattern. Fix this by adding a call to clk_get() with legacy clock name. This fixes following kernel oops if driver is loaded as module: Unable to handle kernel paging request at virtual address bf047978 pgd = (ptrval) [bf047978] *pgd=59344811, *pte=5903c6df, *ppte=5903c65f Internal error: Oops: 80f [#1] SMP ARM Modules linked in: mc exynosdrm(+) analogix_dp rtc_s3c exynos_ppmu i2c_gpio CPU: 1 PID: 212 Comm: systemd-udevd Not tainted 5.6.0-rc2-next-20200219 #326 videodev: Linux video capture interface: v2.00 Hardware name: Samsung Exynos (Flattened Device Tree) PC is at exynos_dsi_probe+0x1f0/0x384 [exynosdrm] LR is at exynos_dsi_probe+0x1dc/0x384 [exynosdrm] ... Process systemd-udevd (pid: 212, stack limit = 0x(ptrval)) ... [] (exynos_dsi_probe [exynosdrm]) from [] (platform_drv_probe+0x6c/0xa4) [] (platform_drv_probe) from [] (really_probe+0x210/0x350) [] (really_probe) from [] (driver_probe_device+0x60/0x1a0) [] (driver_probe_device) from [] (device_driver_attach+0x58/0x60) [] (device_driver_attach) from [] (__driver_attach+0x80/0xbc) [] (__driver_attach) from [] (bus_for_each_dev+0x68/0xb4) [] (bus_for_each_dev) from [] (bus_add_driver+0x130/0x1e8) [] (bus_add_driver) from [] (driver_register+0x78/0x110) [] (driver_register) from [] (exynos_drm_init+0xe8/0x11c [exynosdrm]) [] (exynos_drm_init [exynosdrm]) from [] (do_one_initcall+0x50/0x220) [] (do_one_initcall) from [] (do_init_module+0x60/0x210) [] (do_init_module) from [] (load_module+0x1c0c/0x2310) [] (load_module) from [] (sys_finit_module+0xac/0xbc) [] (sys_finit_module) from [] (ret_fast_syscall+0x0/0x54) Exception stack(0xd979bfa8 to 0xd979bff0) ... ---[ end trace db16efe05faab470 ]--- Signed-off-by: Marek Szyprowski Reviewed-by: Andrzej Hajda Signed-off-by: Inki Dae --- drivers/gpu/drm/exynos/exynos_drm_dsi.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c index 3f6fcd453d33..a85365c56d4d 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c @@ -1788,9 +1788,10 @@ static int exynos_dsi_probe(struct platform_device *pdev) dsi->clks[i] = devm_clk_get(dev, clk_names[i]); if (IS_ERR(dsi->clks[i])) { if (strcmp(clk_names[i], "sclk_mipi") == 0) { - strcpy(clk_names[i], OLD_SCLK_MIPI_CLK_NAME); - i--; - continue; + dsi->clks[i] = devm_clk_get(dev, + OLD_SCLK_MIPI_CLK_NAME); + if (!IS_ERR(dsi->clks[i])) + continue; } dev_info(dev, "failed to get the clock: %s\n", From 3b6a9b19ab652efac7ad4c392add6f1235019568 Mon Sep 17 00:00:00 2001 From: Marek Szyprowski Date: Thu, 20 Feb 2020 13:57:26 +0100 Subject: [PATCH 295/400] drm/exynos: hdmi: don't leak enable HDMI_EN regulator if probe fails Move enabling and disabling HDMI_EN optional regulator to probe() function to keep track on the regulator status. This fixes following warning if probe() fails (for example when I2C DDC adapter cannot be yet gathered due to the missing driver). This fixes following warning observed on Arndale5250 board with multi_v7_defconfig: [drm] Failed to get ddc i2c adapter by node ------------[ cut here ]------------ WARNING: CPU: 0 PID: 214 at drivers/regulator/core.c:2051 _regulator_put+0x16c/0x184 Modules linked in: ... CPU: 0 PID: 214 Comm: systemd-udevd Not tainted 5.6.0-rc2-next-20200219-00040-g38af1dfafdbb #7570 Hardware name: Samsung Exynos (Flattened Device Tree) [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (dump_stack+0xcc/0xe0) [] (dump_stack) from [] (__warn+0xe0/0xf8) [] (__warn) from [] (warn_slowpath_fmt+0xb0/0xb8) [] (warn_slowpath_fmt) from [] (_regulator_put+0x16c/0x184) [] (_regulator_put) from [] (regulator_put+0x1c/0x2c) [] (regulator_put) from [] (release_nodes+0x17c/0x200) [] (release_nodes) from [] (really_probe+0x10c/0x350) [] (really_probe) from [] (driver_probe_device+0x60/0x1a0) [] (driver_probe_device) from [] (device_driver_attach+0x58/0x60) [] (device_driver_attach) from [] (__driver_attach+0x80/0xbc) [] (__driver_attach) from [] (bus_for_each_dev+0x68/0xb4) [] (bus_for_each_dev) from [] (bus_add_driver+0x130/0x1e8) [] (bus_add_driver) from [] (driver_register+0x78/0x110) [] (driver_register) from [] (exynos_drm_init+0xe8/0x11c [exynosdrm]) [] (exynos_drm_init [exynosdrm]) from [] (do_one_initcall+0x50/0x220) [] (do_one_initcall) from [] (do_init_module+0x60/0x210) [] (do_init_module) from [] (load_module+0x1c0c/0x2310) [] (load_module) from [] (sys_finit_module+0xac/0xbc) [] (sys_finit_module) from [] (ret_fast_syscall+0x0/0x54) Exception stack(0xecca3fa8 to 0xecca3ff0) ... ---[ end trace 276c91214635905c ]--- Signed-off-by: Marek Szyprowski Reviewed-by: Andrzej Hajda Signed-off-by: Inki Dae --- drivers/gpu/drm/exynos/exynos_hdmi.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_hdmi.c b/drivers/gpu/drm/exynos/exynos_hdmi.c index 9ff921f43a93..f141916eade6 100644 --- a/drivers/gpu/drm/exynos/exynos_hdmi.c +++ b/drivers/gpu/drm/exynos/exynos_hdmi.c @@ -1805,18 +1805,10 @@ static int hdmi_resources_init(struct hdmi_context *hdata) hdata->reg_hdmi_en = devm_regulator_get_optional(dev, "hdmi-en"); - if (PTR_ERR(hdata->reg_hdmi_en) != -ENODEV) { + if (PTR_ERR(hdata->reg_hdmi_en) != -ENODEV) if (IS_ERR(hdata->reg_hdmi_en)) return PTR_ERR(hdata->reg_hdmi_en); - ret = regulator_enable(hdata->reg_hdmi_en); - if (ret) { - DRM_DEV_ERROR(dev, - "failed to enable hdmi-en regulator\n"); - return ret; - } - } - return hdmi_bridge_init(hdata); } @@ -2023,6 +2015,15 @@ static int hdmi_probe(struct platform_device *pdev) } } + if (!IS_ERR(hdata->reg_hdmi_en)) { + ret = regulator_enable(hdata->reg_hdmi_en); + if (ret) { + DRM_DEV_ERROR(dev, + "failed to enable hdmi-en regulator\n"); + goto err_hdmiphy; + } + } + pm_runtime_enable(dev); audio_infoframe = &hdata->audio.infoframe; @@ -2047,7 +2048,8 @@ err_unregister_audio: err_rpm_disable: pm_runtime_disable(dev); - + if (!IS_ERR(hdata->reg_hdmi_en)) + regulator_disable(hdata->reg_hdmi_en); err_hdmiphy: if (hdata->hdmiphy_port) put_device(&hdata->hdmiphy_port->dev); From 852d7655ea4395a1deb7070abe37962a7d0662e4 Mon Sep 17 00:00:00 2001 From: Gerd Hoffmann Date: Fri, 28 Feb 2020 11:47:23 +0100 Subject: [PATCH 296/400] drm/shmem: drop pgprot_decrypted() Was added by commit 95cf9264d5f3 ("x86, drm, fbdev: Do not specify encrypted memory for video mappings"), then it was kept through various changes. While vram actually needs decrypted mappings this is not correct for shmem gem objects which live in main memory not io memory, so remove the call. Signed-off-by: Gerd Hoffmann Reviewed-by: Thomas Hellstrom Link: http://patchwork.freedesktop.org/patch/msgid/20200228104723.18757-1-kraxel@redhat.com --- drivers/gpu/drm/drm_gem_shmem_helper.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c index aad9324dcf4f..df31e5782eed 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -548,7 +548,6 @@ int drm_gem_shmem_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); if (!shmem->map_cached) vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); - vma->vm_page_prot = pgprot_decrypted(vma->vm_page_prot); vma->vm_ops = &drm_gem_shmem_vm_ops; return 0; From bb699a793110fc29664e80c4ebb158a922151d52 Mon Sep 17 00:00:00 2001 From: Chris Wilson Date: Fri, 21 Feb 2020 10:09:53 +0000 Subject: [PATCH 297/400] drm/i915/gem: Break up long lists of object reclaim Call cond_resched() between each freed object in case we have a really, really long list, and we don't want to block normal processes. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld Link: https://patchwork.freedesktop.org/patch/msgid/20200221100953.2587176-1-chris@chris-wilson.co.uk (cherry picked from commit deeee411a97559096523f97655ff16da34cf0573) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 35985218bd85..5da9f9e534b9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -225,6 +225,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915, /* But keep the pointer alive for RCU-protected lookups */ call_rcu(&obj->rcu, __i915_gem_free_object_rcu); + cond_resched(); } intel_runtime_pm_put(&i915->runtime_pm, wakeref); } From 33e059a2e4df454359f642f2235af39de9d3e914 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jos=C3=A9=20Roberto=20de=20Souza?= Date: Thu, 27 Feb 2020 12:55:40 -0800 Subject: [PATCH 298/400] drm/i915/psr: Force PSR probe only after full initialization MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit 60c6a14b489b ("drm/i915/display: Force the state compute phase once to enable PSR") was forcing the state compute too earlier causing errors because not everything was initialized, so here moving to the end of i915_driver_modeset_probe() when the display is all initialized. Also fixing the place where it disarm the force probe as during the atomic check phase errors could happen like the ones due locking and it would cause PSR to never be enabled if that happens. Leaving the disarm to the atomic commit phase, intel_psr_enable() or intel_psr_update() will be called even if the current state do not allow PSR to be enabled. v2: Check if intel_dp is null in intel_psr_force_mode_changed_set() v3: Check intel_dp before get dev_priv v4: - renamed intel_psr_force_mode_changed_set() to intel_psr_set_force_mode_changed() - removed the set parameter from intel_psr_set_force_mode_changed() - not calling intel_psr_set_force_mode_changed() from intel_psr_enable/update(), directly setting it after the same checks that intel_psr_set_force_mode_changed() does - moved intel_psr_set_force_mode_changed() arm call to i915_driver_modeset_probe() as it is a better for a PSR call, all the functions calls happening between the old and the new function call will cause issue [backported to v5.6-rc3] Fixes: 60c6a14b489b ("drm/i915/display: Force the state compute phase once to enable PSR") Closes: https://gitlab.freedesktop.org/drm/intel/issues/1151 Tested-by: Ross Zwisler Reported-by: Ross Zwisler Cc: Gwan-gyeong Mun Cc: Jani Nikula Cc: Anshuman Gupta Reviewed-by: Gwan-gyeong Mun Signed-off-by: José Roberto de Souza Link: https://patchwork.freedesktop.org/patch/msgid/20200221212635.11614-1-jose.souza@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20200227205540.126135-1-jose.souza@intel.com (cherry picked from commit df1a5bfc16f3275a74f77d73375e69bc62c45c4b) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/display/intel_psr.c | 25 ++++++++++++++++++++---- drivers/gpu/drm/i915/display/intel_psr.h | 1 + drivers/gpu/drm/i915/i915_drv.c | 3 +++ drivers/gpu/drm/i915/i915_drv.h | 2 +- 4 files changed, 26 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index 89c9cf5f38d2..83025052c965 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -852,10 +852,12 @@ void intel_psr_enable(struct intel_dp *intel_dp, { struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); - if (!crtc_state->has_psr) + if (!CAN_PSR(dev_priv) || dev_priv->psr.dp != intel_dp) return; - if (WARN_ON(!CAN_PSR(dev_priv))) + dev_priv->psr.force_mode_changed = false; + + if (!crtc_state->has_psr) return; WARN_ON(dev_priv->drrs.dp); @@ -1009,6 +1011,8 @@ void intel_psr_update(struct intel_dp *intel_dp, if (!CAN_PSR(dev_priv) || READ_ONCE(psr->dp) != intel_dp) return; + dev_priv->psr.force_mode_changed = false; + mutex_lock(&dev_priv->psr.lock); enable = crtc_state->has_psr && psr_global_enabled(psr->debug); @@ -1534,7 +1538,7 @@ void intel_psr_atomic_check(struct drm_connector *connector, struct drm_crtc_state *crtc_state; if (!CAN_PSR(dev_priv) || !new_state->crtc || - dev_priv->psr.initially_probed) + !dev_priv->psr.force_mode_changed) return; intel_connector = to_intel_connector(connector); @@ -1545,5 +1549,18 @@ void intel_psr_atomic_check(struct drm_connector *connector, crtc_state = drm_atomic_get_new_crtc_state(new_state->state, new_state->crtc); crtc_state->mode_changed = true; - dev_priv->psr.initially_probed = true; +} + +void intel_psr_set_force_mode_changed(struct intel_dp *intel_dp) +{ + struct drm_i915_private *dev_priv; + + if (!intel_dp) + return; + + dev_priv = dp_to_i915(intel_dp); + if (!CAN_PSR(dev_priv) || intel_dp != dev_priv->psr.dp) + return; + + dev_priv->psr.force_mode_changed = true; } diff --git a/drivers/gpu/drm/i915/display/intel_psr.h b/drivers/gpu/drm/i915/display/intel_psr.h index c58a1d438808..274fc6bb6221 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.h +++ b/drivers/gpu/drm/i915/display/intel_psr.h @@ -40,5 +40,6 @@ bool intel_psr_enabled(struct intel_dp *intel_dp); void intel_psr_atomic_check(struct drm_connector *connector, struct drm_connector_state *old_state, struct drm_connector_state *new_state); +void intel_psr_set_force_mode_changed(struct intel_dp *intel_dp); #endif /* __INTEL_PSR_H__ */ diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index f7385abdd74b..8410330ce4f0 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -56,6 +56,7 @@ #include "display/intel_hotplug.h" #include "display/intel_overlay.h" #include "display/intel_pipe_crc.h" +#include "display/intel_psr.h" #include "display/intel_sprite.h" #include "display/intel_vga.h" @@ -330,6 +331,8 @@ static int i915_driver_modeset_probe(struct drm_i915_private *i915) intel_init_ipc(i915); + intel_psr_set_force_mode_changed(i915->psr.dp); + return 0; cleanup_gem: diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 077af22b8340..810e3ccd56ec 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -505,7 +505,7 @@ struct i915_psr { bool dc3co_enabled; u32 dc3co_exit_delay; struct delayed_work idle_work; - bool initially_probed; + bool force_mode_changed; }; #define QUIRK_LVDS_SSC_DISABLE (1<<1) From c725161924f9a5872a3e53b73345a6026a5c170e Mon Sep 17 00:00:00 2001 From: Matt Roper Date: Thu, 27 Feb 2020 16:43:19 -0800 Subject: [PATCH 299/400] drm/i915: Program MBUS with rmw during initialization It wasn't terribly clear from the bspec's wording, but after discussion with the hardware folks, it turns out that we need to preserve the pre-existing contents of the MBUS ABOX control register when initializing a few specific bits. Bspec: 49213 Bspec: 50096 Fixes: 4cb4585e5a7f ("drm/i915/icl: initialize MBus during display init") Cc: Stanislav Lisovskiy Signed-off-by: Matt Roper Link: https://patchwork.freedesktop.org/patch/msgid/20200204011032.582737-1-matthew.d.roper@intel.com Reviewed-by: Matt Atwood (cherry picked from commit 837b63e6087838d0f1e612d448405419199d8033) Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/20200228004320.127142-1-matthew.d.roper@intel.com --- .../gpu/drm/i915/display/intel_display_power.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 21561acfa3ac..ae532e45e36b 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -4466,13 +4466,19 @@ static void icl_dbuf_disable(struct drm_i915_private *dev_priv) static void icl_mbus_init(struct drm_i915_private *dev_priv) { - u32 val; + u32 mask, val; - val = MBUS_ABOX_BT_CREDIT_POOL1(16) | - MBUS_ABOX_BT_CREDIT_POOL2(16) | - MBUS_ABOX_B_CREDIT(1) | - MBUS_ABOX_BW_CREDIT(1); + mask = MBUS_ABOX_BT_CREDIT_POOL1_MASK | + MBUS_ABOX_BT_CREDIT_POOL2_MASK | + MBUS_ABOX_B_CREDIT_MASK | + MBUS_ABOX_BW_CREDIT_MASK; + val = I915_READ(MBUS_ABOX_CTL); + val &= ~mask; + val |= MBUS_ABOX_BT_CREDIT_POOL1(16) | + MBUS_ABOX_BT_CREDIT_POOL2(16) | + MBUS_ABOX_B_CREDIT(1) | + MBUS_ABOX_BW_CREDIT(1); I915_WRITE(MBUS_ABOX_CTL, val); } From 4c116e1ae43955a0a38555dfd4d136a222a8996b Mon Sep 17 00:00:00 2001 From: Matt Roper Date: Thu, 27 Feb 2020 16:43:20 -0800 Subject: [PATCH 300/400] drm/i915/tgl: Add Wa_22010178259:tgl MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We need to explicitly set the TLB Request Timer initial value in the BW_BUDDY registers to 0x8 rather than relying on the hardware default. v2: Apply missing REG_FIELD_PREP to ensure 0x8 is placed in the correct bits during the rmw. (Jose) Bspec: 52890 Bspec: 50044 Fixes: 3fa01d642fa7 ("drm/i915/tgl: Program BW_BUDDY registers during display init") Cc: Stanislav Lisovskiy Cc: José Roberto de Souza Signed-off-by: Matt Roper Link: https://patchwork.freedesktop.org/patch/msgid/20200219215655.2923650-1-matthew.d.roper@intel.com Reviewed-by: José Roberto de Souza (cherry picked from commit 87e04f75928bb5d357ef7df4eedc1a7e2761a833) Signed-off-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/20200228004320.127142-2-matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_display_power.c | 13 +++++++++++++ drivers/gpu/drm/i915/i915_reg.h | 1 + 2 files changed, 14 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index ae532e45e36b..46c40db992dd 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -4974,8 +4974,21 @@ static void tgl_bw_buddy_init(struct drm_i915_private *dev_priv) I915_WRITE(BW_BUDDY1_CTL, BW_BUDDY_DISABLE); I915_WRITE(BW_BUDDY2_CTL, BW_BUDDY_DISABLE); } else { + u32 val; + I915_WRITE(BW_BUDDY1_PAGE_MASK, table[i].page_mask); I915_WRITE(BW_BUDDY2_PAGE_MASK, table[i].page_mask); + + /* Wa_22010178259:tgl */ + val = I915_READ(BW_BUDDY1_CTL); + val &= ~BW_BUDDY_TLB_REQ_TIMER_MASK; + val |= REG_FIELD_PREP(BW_BUDDY_TLB_REQ_TIMER_MASK, 0x8); + I915_WRITE(BW_BUDDY1_CTL, val); + + val = I915_READ(BW_BUDDY2_CTL); + val &= ~BW_BUDDY_TLB_REQ_TIMER_MASK; + val |= REG_FIELD_PREP(BW_BUDDY_TLB_REQ_TIMER_MASK, 0x8); + I915_WRITE(BW_BUDDY2_CTL, val); } } diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 6cc55c103f67..3575fd30756b 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -7757,6 +7757,7 @@ enum { #define BW_BUDDY1_CTL _MMIO(0x45140) #define BW_BUDDY2_CTL _MMIO(0x45150) #define BW_BUDDY_DISABLE REG_BIT(31) +#define BW_BUDDY_TLB_REQ_TIMER_MASK REG_GENMASK(21, 16) #define BW_BUDDY1_PAGE_MASK _MMIO(0x45144) #define BW_BUDDY2_PAGE_MASK _MMIO(0x45154) From eddf309a8ed42eb3312b17a6934686b018189cd3 Mon Sep 17 00:00:00 2001 From: Lucas De Marchi Date: Mon, 24 Feb 2020 11:12:58 -0800 Subject: [PATCH 301/400] drm/i915/tgl: Add Wa_1608008084 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Wa_1608008084 is an additional WA that applies to writes on FF_MODE2 register. We can't read it back either from CPU or GPU. Since the other bits should be 0, recommendation to handle Wa_1604555607 is to actually just write the timer value. Do a write only and don't try to read it, neither before or after the WA is applied. Fixes: ff690b2111ba ("drm/i915/tgl: Implement Wa_1604555607") Signed-off-by: Lucas De Marchi Reviewed-by: José Roberto de Souza Link: https://patchwork.freedesktop.org/patch/msgid/20200224191258.15668-1-lucas.demarchi@intel.com (cherry picked from commit e94bda14325ccf1a519ffb516738d1201457f97f) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 4e292d4bf7b9..173a7f2d109f 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -575,24 +575,19 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, static void tgl_ctx_workarounds_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { - u32 val; - /* Wa_1409142259:tgl */ WA_SET_BIT_MASKED(GEN11_COMMON_SLICE_CHICKEN3, GEN12_DISABLE_CPS_AWARE_COLOR_PIPE); - /* Wa_1604555607:tgl */ - val = intel_uncore_read(engine->uncore, FF_MODE2); - val &= ~FF_MODE2_TDS_TIMER_MASK; - val |= FF_MODE2_TDS_TIMER_128; /* - * FIXME: FF_MODE2 register is not readable till TGL B0. We can - * enable verification of WA from the later steppings, which enables - * the read of FF_MODE2. + * Wa_1604555607:gen12 and Wa_1608008084:gen12 + * FF_MODE2 register will return the wrong value when read. The default + * value for this register is zero for all fields and there are no bit + * masks. So instead of doing a RMW we should just write the TDS timer + * value for Wa_1604555607. */ - wa_add(wal, FF_MODE2, FF_MODE2_TDS_TIMER_MASK, val, - IS_TGL_REVID(engine->i915, TGL_REVID_A0, TGL_REVID_A0) ? 0 : - FF_MODE2_TDS_TIMER_MASK); + wa_add(wal, FF_MODE2, FF_MODE2_TDS_TIMER_MASK, + FF_MODE2_TDS_TIMER_128, 0); } static void From 0b1570b7ffe68dfefa07cb092a0723f898bb8184 Mon Sep 17 00:00:00 2001 From: Chris Wilson Date: Thu, 27 Feb 2020 08:57:14 +0000 Subject: [PATCH 302/400] drm/i915: Protect i915_request_await_start from early waits We need to be extremely careful inside i915_request_await_start() as it needs to walk the list of requests in the foreign timeline with very little protection. As we hold our own timeline mutex, we can not nest inside the signaler's timeline mutex, so all that remains is our RCU protection. However, to be safe we need to tell the compiler that we may be traversing the list only under RCU protection, and furthermore we need to start declaring requests as elements of the timeline from their construction. Fixes: 9ddc8ec027a3 ("drm/i915: Eliminate the trylock for awaiting an earlier request") Fixes: 6a79d848403d ("drm/i915: Lock signaler timeline while navigating") Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-11-chris@chris-wilson.co.uk (cherry picked from commit d22d2d073ef859b346bc32cb25299262e3973769) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/i915_request.c | 41 ++++++++++++++++++++--------- 1 file changed, 28 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index f56b046a32de..dcaa85a91090 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -275,7 +275,7 @@ bool i915_request_retire(struct i915_request *rq) spin_unlock_irq(&rq->lock); remove_from_client(rq); - list_del(&rq->link); + list_del_rcu(&rq->link); intel_context_exit(rq->context); intel_context_unpin(rq->context); @@ -721,6 +721,8 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp) rq->infix = rq->ring->emit; /* end of header; start of user payload */ intel_context_mark_active(ce); + list_add_tail_rcu(&rq->link, &tl->requests); + return rq; err_unwind: @@ -777,13 +779,23 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal) GEM_BUG_ON(i915_request_timeline(rq) == rcu_access_pointer(signal->timeline)); + if (i915_request_started(signal)) + return 0; + fence = NULL; rcu_read_lock(); spin_lock_irq(&signal->lock); - if (!i915_request_started(signal) && - !list_is_first(&signal->link, - &rcu_dereference(signal->timeline)->requests)) { - struct i915_request *prev = list_prev_entry(signal, link); + do { + struct list_head *pos = READ_ONCE(signal->link.prev); + struct i915_request *prev; + + /* Confirm signal has not been retired, the link is valid */ + if (unlikely(i915_request_started(signal))) + break; + + /* Is signal the earliest request on its timeline? */ + if (pos == &rcu_dereference(signal->timeline)->requests) + break; /* * Peek at the request before us in the timeline. That @@ -791,13 +803,18 @@ i915_request_await_start(struct i915_request *rq, struct i915_request *signal) * after acquiring a reference to it, confirm that it is * still part of the signaler's timeline. */ - if (i915_request_get_rcu(prev)) { - if (list_next_entry(prev, link) == signal) - fence = &prev->fence; - else - i915_request_put(prev); + prev = list_entry(pos, typeof(*prev), link); + if (!i915_request_get_rcu(prev)) + break; + + /* After the strong barrier, confirm prev is still attached */ + if (unlikely(READ_ONCE(prev->link.next) != &signal->link)) { + i915_request_put(prev); + break; } - } + + fence = &prev->fence; + } while (0); spin_unlock_irq(&signal->lock); rcu_read_unlock(); if (!fence) @@ -1242,8 +1259,6 @@ __i915_request_add_to_timeline(struct i915_request *rq) 0); } - list_add_tail(&rq->link, &timeline->requests); - /* * Make sure that no request gazumped us - if it was allocated after * our i915_request_alloc() and called __i915_request_add() before From f4aaa44e8b20f7e0d4ea68d3bca4968b6ae5aaff Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Fri, 28 Feb 2020 17:14:13 +0300 Subject: [PATCH 303/400] drm/i915/selftests: Fix return in assert_mmap_offset() The assert_mmap_offset() returns type bool so if we return an error pointer that is "return true;" or success. If we have an error, then we should return false. Fixes: 3d81d589d6e3 ("drm/i915: Test exhaustion of the mmap space") Signed-off-by: Dan Carpenter Reviewed-by: Chris Wilson Signed-off-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20200228141413.qfjf4abr323drlo4@kili.mountain (cherry picked from commit efbf928824820f2738f41271934f6ec2c6ebd587) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c index ef7c74cff28a..43912e9b683d 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -570,7 +570,7 @@ static bool assert_mmap_offset(struct drm_i915_private *i915, obj = i915_gem_object_create_internal(i915, size); if (IS_ERR(obj)) - return PTR_ERR(obj); + return false; mmo = mmap_offset_attach(obj, I915_MMAP_OFFSET_GTT, NULL); i915_gem_object_put(obj); From 049d919168458ac54e7fad27cd156a958b042d2f Mon Sep 17 00:00:00 2001 From: Joakim Zhang Date: Tue, 25 Feb 2020 20:56:43 +0800 Subject: [PATCH 304/400] drivers/perf: fsl_imx8_ddr: Correct the CLEAR bit definition When disabling a counter from ddr_perf_event_stop(), the counter value is reset to 0 at the same time. Preserve the counter value by performing a read-modify-write of the PMU register and clearing only the enable bit. Signed-off-by: Joakim Zhang Signed-off-by: Will Deacon --- drivers/perf/fsl_imx8_ddr_perf.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c index 95dca2cb5265..90884d14f95f 100644 --- a/drivers/perf/fsl_imx8_ddr_perf.c +++ b/drivers/perf/fsl_imx8_ddr_perf.c @@ -388,9 +388,10 @@ static void ddr_perf_counter_enable(struct ddr_pmu *pmu, int config, if (enable) { /* - * must disable first, then enable again - * otherwise, cycle counter will not work - * if previous state is enabled. + * cycle counter is special which should firstly write 0 then + * write 1 into CLEAR bit to clear it. Other counters only + * need write 0 into CLEAR bit and it turns out to be 1 by + * hardware. Below enable flow is harmless for all counters. */ writel(0, pmu->base + reg); val = CNTL_EN | CNTL_CLEAR; @@ -398,7 +399,8 @@ static void ddr_perf_counter_enable(struct ddr_pmu *pmu, int config, writel(val, pmu->base + reg); } else { /* Disable counter */ - writel(0, pmu->base + reg); + val = readl_relaxed(pmu->base + reg) & CNTL_EN_MASK; + writel(val, pmu->base + reg); } } From 3ba52ad55b533760a1f65836aa0ec9d35e36bb4f Mon Sep 17 00:00:00 2001 From: luanshi Date: Wed, 26 Feb 2020 13:45:10 +0800 Subject: [PATCH 305/400] drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer Fix bogus NULL checks on the return value of acpi_cpu_get_madt_gicc() by checking for a 0 'gicc->performance_interrupt' value instead. Signed-off-by: Liguang Zhang Signed-off-by: Will Deacon --- drivers/perf/arm_pmu_acpi.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c index acce8781c456..f5c7a845cd7b 100644 --- a/drivers/perf/arm_pmu_acpi.c +++ b/drivers/perf/arm_pmu_acpi.c @@ -24,8 +24,6 @@ static int arm_pmu_acpi_register_irq(int cpu) int gsi, trigger; gicc = acpi_cpu_get_madt_gicc(cpu); - if (WARN_ON(!gicc)) - return -EINVAL; gsi = gicc->performance_interrupt; @@ -64,11 +62,10 @@ static void arm_pmu_acpi_unregister_irq(int cpu) int gsi; gicc = acpi_cpu_get_madt_gicc(cpu); - if (!gicc) - return; gsi = gicc->performance_interrupt; - acpi_unregister_gsi(gsi); + if (gsi) + acpi_unregister_gsi(gsi); } #if IS_ENABLED(CONFIG_ARM_SPE_PMU) From 9abd515a6e4a5c58c6eb4d04110430325eb5f5ac Mon Sep 17 00:00:00 2001 From: Jean-Philippe Brucker Date: Thu, 27 Feb 2020 09:34:47 +0100 Subject: [PATCH 306/400] arm64: context: Fix ASID limit in boot messages Since commit f88f42f853a8 ("arm64: context: Free up kernel ASIDs if KPTI is not in use"), the NUM_USER_ASIDS macro doesn't correspond to the effective number of ASIDs when KPTI is enabled. Get an accurate number of available ASIDs in an arch_initcall, once we've discovered all CPUs' capabilities and know if we still need to halve the ASID space for KPTI. Fixes: f88f42f853a8 ("arm64: context: Free up kernel ASIDs if KPTI is not in use") Reviewed-by: Vladimir Murzin Signed-off-by: Jean-Philippe Brucker Signed-off-by: Will Deacon --- arch/arm64/mm/context.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c index 8ef73e89d514..d89bb22589f6 100644 --- a/arch/arm64/mm/context.c +++ b/arch/arm64/mm/context.c @@ -260,14 +260,26 @@ asmlinkage void post_ttbr_update_workaround(void) CONFIG_CAVIUM_ERRATUM_27456)); } -static int asids_init(void) +static int asids_update_limit(void) { - asid_bits = get_cpu_asid_bits(); + unsigned long num_available_asids = NUM_USER_ASIDS; + + if (arm64_kernel_unmapped_at_el0()) + num_available_asids /= 2; /* * Expect allocation after rollover to fail if we don't have at least * one more ASID than CPUs. ASID #0 is reserved for init_mm. */ - WARN_ON(NUM_USER_ASIDS - 1 <= num_possible_cpus()); + WARN_ON(num_available_asids - 1 <= num_possible_cpus()); + pr_info("ASID allocator initialised with %lu entries\n", + num_available_asids); + return 0; +} +arch_initcall(asids_update_limit); + +static int asids_init(void) +{ + asid_bits = get_cpu_asid_bits(); atomic64_set(&asid_generation, ASID_FIRST_VERSION); asid_map = kcalloc(BITS_TO_LONGS(NUM_USER_ASIDS), sizeof(*asid_map), GFP_KERNEL); @@ -282,8 +294,6 @@ static int asids_init(void) */ if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0)) set_kpti_asid_bits(); - - pr_info("ASID allocator initialised with %lu entries\n", NUM_USER_ASIDS); return 0; } early_initcall(asids_init); From d237851d5d9dff5973160737197e825f05715ba3 Mon Sep 17 00:00:00 2001 From: Jack Yu Date: Mon, 2 Mar 2020 09:54:24 +0800 Subject: [PATCH 307/400] ASoC: rt1015: add operation callback function for rt1015_dai[] Add operation callback function for rt1015_dai[]. Signed-off-by: Jack Yu Link: https://lore.kernel.org/r/20200302015424.9075-1-jack.yu@realtek.com Signed-off-by: Mark Brown --- sound/soc/codecs/rt1015.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/soc/codecs/rt1015.c b/sound/soc/codecs/rt1015.c index 6d490e2dbc25..9f151c7c3d2d 100644 --- a/sound/soc/codecs/rt1015.c +++ b/sound/soc/codecs/rt1015.c @@ -857,6 +857,7 @@ struct snd_soc_dai_driver rt1015_dai[] = { .rates = RT1015_STEREO_RATES, .formats = RT1015_FORMATS, }, + .ops = &rt1015_aif_dai_ops, } }; From e959e5405f34aa92d71d0dd162b969c21742061d Mon Sep 17 00:00:00 2001 From: Daniel Wagner Date: Mon, 2 Mar 2020 14:24:08 +0100 Subject: [PATCH 308/400] block: Remove used kblockd_schedule_work_on() Commit ee63cfa7fc19 ("block: add kblockd_schedule_work_on()") introduced the helper in 2016. Remove it because since then no caller was added. Cc: Jens Axboe Signed-off-by: Daniel Wagner Signed-off-by: Jens Axboe --- block/blk-core.c | 6 ------ include/linux/blkdev.h | 1 - 2 files changed, 7 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 089e890ab208..60dc9552ef8d 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1663,12 +1663,6 @@ int kblockd_schedule_work(struct work_struct *work) } EXPORT_SYMBOL(kblockd_schedule_work); -int kblockd_schedule_work_on(int cpu, struct work_struct *work) -{ - return queue_work_on(cpu, kblockd_workqueue, work); -} -EXPORT_SYMBOL(kblockd_schedule_work_on); - int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long delay) { diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 10455b2bbbb4..f629d40c645c 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1494,7 +1494,6 @@ static inline void put_dev_sector(Sector p) } int kblockd_schedule_work(struct work_struct *work); -int kblockd_schedule_work_on(int cpu, struct work_struct *work); int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long delay); #define MODULE_ALIAS_BLOCKDEV(major,minor) \ From fc04c39bae01a607454f7619665309870c60937a Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Sun, 1 Mar 2020 19:18:19 +0300 Subject: [PATCH 309/400] io-wq: fix IO_WQ_WORK_NO_CANCEL cancellation To cancel a work, io-wq sets IO_WQ_WORK_CANCEL and executes the callback. However, IO_WQ_WORK_NO_CANCEL works will just execute and may return next work, which will be ignored and lost. Cancel the whole link. Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe --- fs/io-wq.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/fs/io-wq.c b/fs/io-wq.c index bf8ed1b0b90a..9a7aacc96d84 100644 --- a/fs/io-wq.c +++ b/fs/io-wq.c @@ -747,6 +747,17 @@ static bool io_wq_can_queue(struct io_wqe *wqe, struct io_wqe_acct *acct, return true; } +static void io_run_cancel(struct io_wq_work *work) +{ + do { + struct io_wq_work *old_work = work; + + work->flags |= IO_WQ_WORK_CANCEL; + work->func(&work); + work = (work == old_work) ? NULL : work; + } while (work); +} + static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work) { struct io_wqe_acct *acct = io_work_get_acct(wqe, work); @@ -760,8 +771,7 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work) * It's close enough to not be an issue, fork() has the same delay. */ if (unlikely(!io_wq_can_queue(wqe, acct, work))) { - work->flags |= IO_WQ_WORK_CANCEL; - work->func(&work); + io_run_cancel(work); return; } @@ -900,8 +910,7 @@ static enum io_wq_cancel io_wqe_cancel_cb_work(struct io_wqe *wqe, spin_unlock_irqrestore(&wqe->lock, flags); if (found) { - work->flags |= IO_WQ_WORK_CANCEL; - work->func(&work); + io_run_cancel(work); return IO_WQ_CANCEL_OK; } @@ -976,8 +985,7 @@ static enum io_wq_cancel io_wqe_cancel_work(struct io_wqe *wqe, spin_unlock_irqrestore(&wqe->lock, flags); if (found) { - work->flags |= IO_WQ_WORK_CANCEL; - work->func(&work); + io_run_cancel(work); return IO_WQ_CANCEL_OK; } From 51bddd4501bc414b8b1e8f4d096b4a5304068169 Mon Sep 17 00:00:00 2001 From: Christophe JAILLET Date: Fri, 28 Feb 2020 22:38:38 +0100 Subject: [PATCH 310/400] spi: bcm63xx-hsspi: Really keep pll clk enabled The purpose of commit 0fd85869c2a9 ("spi/bcm63xx-hsspi: keep pll clk enabled") was to keep the pll clk enabled through the lifetime of the device. In order to do that, some 'clk_prepare_enable()'/'clk_disable_unprepare()' calls have been added in the error handling path of the probe function, in the remove function and in the suspend and resume functions. However, a 'clk_disable_unprepare()' call has been unfortunately left in the probe function. So the commit seems to be more or less a no-op. Axe it now, so that the pll clk is left enabled through the lifetime of the device, as described in the commit. Fixes: 0fd85869c2a9 ("spi/bcm63xx-hsspi: keep pll clk enabled") Signed-off-by: Christophe JAILLET Acked-by: Jonas Gorski Link: https://lore.kernel.org/r/20200228213838.7124-1-christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown --- drivers/spi/spi-bcm63xx-hsspi.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/spi/spi-bcm63xx-hsspi.c b/drivers/spi/spi-bcm63xx-hsspi.c index 7327309ea3d5..6c235306c0e4 100644 --- a/drivers/spi/spi-bcm63xx-hsspi.c +++ b/drivers/spi/spi-bcm63xx-hsspi.c @@ -366,7 +366,6 @@ static int bcm63xx_hsspi_probe(struct platform_device *pdev) goto out_disable_clk; rate = clk_get_rate(pll_clk); - clk_disable_unprepare(pll_clk); if (!rate) { ret = -EINVAL; goto out_disable_pll_clk; From 817a68a6584aa08e323c64283fec5ded7be84759 Mon Sep 17 00:00:00 2001 From: Dennis Dalessandro Date: Tue, 25 Feb 2020 14:54:45 -0500 Subject: [PATCH 311/400] IB/hfi1, qib: Ensure RCU is locked when accessing list The packet handling function, specifically the iteration of the qp list for mad packet processing misses locking RCU before running through the list. Not only is this incorrect, but the list_for_each_entry_rcu() call can not be called with a conditional check for lock dependency. Remedy this by invoking the rcu lock and unlock around the critical section. This brings MAD packet processing in line with what is done for non-MAD packets. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20200225195445.140896.41873.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn Signed-off-by: Dennis Dalessandro Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/hfi1/verbs.c | 4 +++- drivers/infiniband/hw/qib/qib_verbs.c | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c index 089e201d7550..2f6323ad9c59 100644 --- a/drivers/infiniband/hw/hfi1/verbs.c +++ b/drivers/infiniband/hw/hfi1/verbs.c @@ -515,10 +515,11 @@ static inline void hfi1_handle_packet(struct hfi1_packet *packet, opa_get_lid(packet->dlid, 9B)); if (!mcast) goto drop; + rcu_read_lock(); list_for_each_entry_rcu(p, &mcast->qp_list, list) { packet->qp = p->qp; if (hfi1_do_pkey_check(packet)) - goto drop; + goto unlock_drop; spin_lock_irqsave(&packet->qp->r_lock, flags); packet_handler = qp_ok(packet); if (likely(packet_handler)) @@ -527,6 +528,7 @@ static inline void hfi1_handle_packet(struct hfi1_packet *packet, ibp->rvp.n_pkt_drops++; spin_unlock_irqrestore(&packet->qp->r_lock, flags); } + rcu_read_unlock(); /* * Notify rvt_multicast_detach() if it is waiting for us * to finish. diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c index 33778d451b82..5ef93f8f17a1 100644 --- a/drivers/infiniband/hw/qib/qib_verbs.c +++ b/drivers/infiniband/hw/qib/qib_verbs.c @@ -329,8 +329,10 @@ void qib_ib_rcv(struct qib_ctxtdata *rcd, void *rhdr, void *data, u32 tlen) if (mcast == NULL) goto drop; this_cpu_inc(ibp->pmastats->n_multicast_rcv); + rcu_read_lock(); list_for_each_entry_rcu(p, &mcast->qp_list, list) qib_qp_rcv(rcd, hdr, 1, data, tlen, p->qp); + rcu_read_unlock(); /* * Notify rvt_multicast_detach() if it is waiting for us * to finish. From f3a60268f5cec7dae0e9713f5fc65aecc3734c09 Mon Sep 17 00:00:00 2001 From: Christophe Leroy Date: Thu, 27 Feb 2020 14:07:10 +0000 Subject: [PATCH 312/400] selftest/lkdtm: Use local .gitignore Commit 68ca0fd272da ("selftest/lkdtm: Don't pollute 'git status'") introduced patterns for git to ignore files generated in tools/testing/selftests/lkdtm/ Use local .gitignore file instead of using the root one. Fixes: 68ca0fd272da ("selftest/lkdtm: Don't pollute 'git status'") Signed-off-by: Christophe Leroy Acked-by: Kees Cook Signed-off-by: Shuah Khan --- .gitignore | 4 ---- tools/testing/selftests/lkdtm/.gitignore | 2 ++ 2 files changed, 2 insertions(+), 4 deletions(-) create mode 100644 tools/testing/selftests/lkdtm/.gitignore diff --git a/.gitignore b/.gitignore index 2763fce8766c..72ef86a5570d 100644 --- a/.gitignore +++ b/.gitignore @@ -100,10 +100,6 @@ modules.order /include/ksym/ /arch/*/include/generated/ -# Generated lkdtm tests -/tools/testing/selftests/lkdtm/*.sh -!/tools/testing/selftests/lkdtm/run.sh - # stgit generated dirs patches-* diff --git a/tools/testing/selftests/lkdtm/.gitignore b/tools/testing/selftests/lkdtm/.gitignore new file mode 100644 index 000000000000..f26212605b6b --- /dev/null +++ b/tools/testing/selftests/lkdtm/.gitignore @@ -0,0 +1,2 @@ +*.sh +!run.sh From 80ad894382bf1d73eb688c29714fa10c0afcf2e7 Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Mon, 2 Mar 2020 23:46:10 +0300 Subject: [PATCH 313/400] io-wq: remove io_wq_flush and IO_WQ_WORK_INTERNAL io_wq_flush() is buggy, during cancelation of a flush, the associated work may be passed to the caller's (i.e. io_uring) @match callback. That callback is expecting it to be embedded in struct io_kiocb. Cancelation of internal work probably doesn't make a lot of sense to begin with. As the flush helper is no longer used, just delete it and the associated work flag. Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe --- fs/io-wq.c | 38 +------------------------------------- fs/io-wq.h | 2 -- 2 files changed, 1 insertion(+), 39 deletions(-) diff --git a/fs/io-wq.c b/fs/io-wq.c index 9a7aacc96d84..5cef075c0b37 100644 --- a/fs/io-wq.c +++ b/fs/io-wq.c @@ -502,7 +502,7 @@ next: if (worker->mm) work->flags |= IO_WQ_WORK_HAS_MM; - if (wq->get_work && !(work->flags & IO_WQ_WORK_INTERNAL)) { + if (wq->get_work) { put_work = work; wq->get_work(work); } @@ -1057,42 +1057,6 @@ enum io_wq_cancel io_wq_cancel_pid(struct io_wq *wq, pid_t pid) return ret; } -struct io_wq_flush_data { - struct io_wq_work work; - struct completion done; -}; - -static void io_wq_flush_func(struct io_wq_work **workptr) -{ - struct io_wq_work *work = *workptr; - struct io_wq_flush_data *data; - - data = container_of(work, struct io_wq_flush_data, work); - complete(&data->done); -} - -/* - * Doesn't wait for previously queued work to finish. When this completes, - * it just means that previously queued work was started. - */ -void io_wq_flush(struct io_wq *wq) -{ - struct io_wq_flush_data data; - int node; - - for_each_node(node) { - struct io_wqe *wqe = wq->wqes[node]; - - if (!node_online(node)) - continue; - init_completion(&data.done); - INIT_IO_WORK(&data.work, io_wq_flush_func); - data.work.flags |= IO_WQ_WORK_INTERNAL; - io_wqe_enqueue(wqe, &data.work); - wait_for_completion(&data.done); - } -} - struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data) { int ret = -ENOMEM, node; diff --git a/fs/io-wq.h b/fs/io-wq.h index 33baba4370c5..e5e15f2c93ec 100644 --- a/fs/io-wq.h +++ b/fs/io-wq.h @@ -8,7 +8,6 @@ enum { IO_WQ_WORK_HAS_MM = 2, IO_WQ_WORK_HASHED = 4, IO_WQ_WORK_UNBOUND = 32, - IO_WQ_WORK_INTERNAL = 64, IO_WQ_WORK_CB = 128, IO_WQ_WORK_NO_CANCEL = 256, IO_WQ_WORK_CONCURRENT = 512, @@ -100,7 +99,6 @@ void io_wq_destroy(struct io_wq *wq); void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work); void io_wq_enqueue_hashed(struct io_wq *wq, struct io_wq_work *work, void *val); -void io_wq_flush(struct io_wq *wq); void io_wq_cancel_all(struct io_wq *wq); enum io_wq_cancel io_wq_cancel_work(struct io_wq *wq, struct io_wq_work *cwork); From ab4562f4dd92c455f6b313717af5e7d72a55d7b4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?S=C3=A9bastien=20Szymanski?= Date: Tue, 25 Feb 2020 13:39:04 +0100 Subject: [PATCH 314/400] dt-bindings: arm: fsl: fix APF6Dev compatible MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit APF6 Dev compatible is armadeus,imx6dl-apf6dev and not armadeus,imx6dl-apf6dldev. Fixes: 3d735471d066 ("dt-bindings: arm: Document Armadeus SoM and Dev boards devicetree binding") Signed-off-by: Sébastien Szymanski Signed-off-by: Rob Herring --- Documentation/devicetree/bindings/arm/fsl.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/arm/fsl.yaml b/Documentation/devicetree/bindings/arm/fsl.yaml index a8e0b4a813ed..0e17e1f6fb80 100644 --- a/Documentation/devicetree/bindings/arm/fsl.yaml +++ b/Documentation/devicetree/bindings/arm/fsl.yaml @@ -160,7 +160,7 @@ properties: items: - enum: - armadeus,imx6dl-apf6 # APF6 (Solo) SoM - - armadeus,imx6dl-apf6dldev # APF6 (Solo) SoM on APF6Dev board + - armadeus,imx6dl-apf6dev # APF6 (Solo) SoM on APF6Dev board - eckelmann,imx6dl-ci4x10 - emtrion,emcon-mx6 # emCON-MX6S or emCON-MX6DL SoM - emtrion,emcon-mx6-avari # emCON-MX6S or emCON-MX6DL SoM on Avari Base From 764b53b26c9897b0693c934797e898d6cd883a26 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Mon, 2 Mar 2020 20:01:32 -0700 Subject: [PATCH 315/400] Revert "bcache: ignore pending signals when creating gc and allocator thread" This reverts commit 0b96da639a4874311e9b5156405f69ef9fc3bef8. We can't just go flushing random signals, under the assumption that the OOM killer will just do something else. It's not safe from the OOM perspective, and it could also cause other signals to get randomly lost. Signed-off-by: Jens Axboe --- drivers/md/bcache/alloc.c | 18 ++---------------- drivers/md/bcache/btree.c | 13 ------------- 2 files changed, 2 insertions(+), 29 deletions(-) diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c index 8bc1faf71ff2..a1df0d95151c 100644 --- a/drivers/md/bcache/alloc.c +++ b/drivers/md/bcache/alloc.c @@ -67,7 +67,6 @@ #include #include #include -#include #include #define MAX_OPEN_BUCKETS 128 @@ -734,21 +733,8 @@ int bch_open_buckets_alloc(struct cache_set *c) int bch_cache_allocator_start(struct cache *ca) { - struct task_struct *k; - - /* - * In case previous btree check operation occupies too many - * system memory for bcache btree node cache, and the - * registering process is selected by OOM killer. Here just - * ignore the SIGKILL sent by OOM killer if there is, to - * avoid kthread_run() being failed by pending signals. The - * bcache registering process will exit after the registration - * done. - */ - if (signal_pending(current)) - flush_signals(current); - - k = kthread_run(bch_allocator_thread, ca, "bcache_allocator"); + struct task_struct *k = kthread_run(bch_allocator_thread, + ca, "bcache_allocator"); if (IS_ERR(k)) return PTR_ERR(k); diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index b12186c87f52..fa872df4e770 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -34,7 +34,6 @@ #include #include #include -#include #include #include #include @@ -1914,18 +1913,6 @@ static int bch_gc_thread(void *arg) int bch_gc_thread_start(struct cache_set *c) { - /* - * In case previous btree check operation occupies too many - * system memory for bcache btree node cache, and the - * registering process is selected by OOM killer. Here just - * ignore the SIGKILL sent by OOM killer if there is, to - * avoid kthread_run() being failed by pending signals. The - * bcache registering process will exit after the registration - * done. - */ - if (signal_pending(current)) - flush_signals(current); - c->gc_thread = kthread_run(bch_gc_thread, c, "bcache_gc"); return PTR_ERR_OR_ZERO(c->gc_thread); } From 4b01618b624736fc7d74f150f623d11b17b7b288 Mon Sep 17 00:00:00 2001 From: Jack Yu Date: Tue, 3 Mar 2020 10:59:13 +0800 Subject: [PATCH 316/400] ASoC: rt1015: modify pre-divider for sysclk Modify pre-divider for system clock. Signed-off-by: Jack Yu Link: https://lore.kernel.org/r/20200303025913.24499-1-jack.yu@realtek.com Signed-off-by: Mark Brown --- sound/soc/codecs/rt1015.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/rt1015.c b/sound/soc/codecs/rt1015.c index 9f151c7c3d2d..66eb55b4ffd4 100644 --- a/sound/soc/codecs/rt1015.c +++ b/sound/soc/codecs/rt1015.c @@ -664,7 +664,7 @@ static int rt1015_hw_params(struct snd_pcm_substream *substream, snd_soc_component_update_bits(component, RT1015_TDM_MASTER, RT1015_I2S_DL_MASK, val_len); snd_soc_component_update_bits(component, RT1015_CLK2, - RT1015_FS_PD_MASK, pre_div); + RT1015_FS_PD_MASK, pre_div << RT1015_FS_PD_SFT); return 0; } From 613cea5935e83cb5a7d182ee3f98d54620e102e2 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Tue, 3 Mar 2020 13:18:58 +0300 Subject: [PATCH 317/400] ASoC: SOF: Fix snd_sof_ipc_stream_posn() We're passing "&posn" instead of "posn" so it ends up corrupting memory instead of doing something useful. Fixes: 53e0c72d98ba ("ASoC: SOF: Add support for IPC IO between DSP and Host") Signed-off-by: Dan Carpenter Reviewed-by: Kai Vehmanen Link: https://lore.kernel.org/r/20200303101858.ytehbrivocyp3cnf@kili.mountain Signed-off-by: Mark Brown --- sound/soc/sof/ipc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/sof/ipc.c b/sound/soc/sof/ipc.c index b63fc529b456..78aa1da7c7a9 100644 --- a/sound/soc/sof/ipc.c +++ b/sound/soc/sof/ipc.c @@ -499,7 +499,7 @@ int snd_sof_ipc_stream_posn(struct snd_soc_component *scomp, /* send IPC to the DSP */ err = sof_ipc_tx_message(sdev->ipc, - stream.hdr.cmd, &stream, sizeof(stream), &posn, + stream.hdr.cmd, &stream, sizeof(stream), posn, sizeof(*posn)); if (err < 0) { dev_err(sdev->dev, "error: failed to get stream %d position\n", From e7a04894c766daa4248cb736efee93550f2d5872 Mon Sep 17 00:00:00 2001 From: Omar Sandoval Date: Mon, 2 Mar 2020 14:02:49 -0800 Subject: [PATCH 318/400] btrfs: fix RAID direct I/O reads with alternate csums btrfs_lookup_and_bind_dio_csum() does pointer arithmetic which assumes 32-bit checksums. If using a larger checksum, this leads to spurious failures when a direct I/O read crosses a stripe. This is easy to reproduce: # mkfs.btrfs -f --checksum blake2 -d raid0 /dev/vdc /dev/vdd ... # mount /dev/vdc /mnt # cd /mnt # dd if=/dev/urandom of=foo bs=1M count=1 status=none # dd if=foo of=/dev/null bs=1M iflag=direct status=none dd: error reading 'foo': Input/output error # dmesg | tail -1 [ 135.821568] BTRFS warning (device vdc): csum failed root 5 ino 257 off 421888 ... Fix it by using the actual checksum size. Fixes: 1e25a2e3ca0d ("btrfs: don't assume ordered sums to be 4 bytes") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Johannes Thumshirn Signed-off-by: Omar Sandoval Reviewed-by: David Sterba Signed-off-by: David Sterba --- fs/btrfs/inode.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 1ccb3f8d528d..27076ebadb36 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7783,6 +7783,7 @@ static inline blk_status_t btrfs_lookup_and_bind_dio_csum(struct inode *inode, { struct btrfs_io_bio *io_bio = btrfs_io_bio(bio); struct btrfs_io_bio *orig_io_bio = btrfs_io_bio(dip->orig_bio); + u16 csum_size; blk_status_t ret; /* @@ -7802,7 +7803,8 @@ static inline blk_status_t btrfs_lookup_and_bind_dio_csum(struct inode *inode, file_offset -= dip->logical_offset; file_offset >>= inode->i_sb->s_blocksize_bits; - io_bio->csum = (u8 *)(((u32 *)orig_io_bio->csum) + file_offset); + csum_size = btrfs_super_csum_size(btrfs_sb(inode->i_sb)->super_copy); + io_bio->csum = orig_io_bio->csum + csum_size * file_offset; return 0; } From 1b17159e52bb31f982f82a6278acd7fab1d3f67b Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Fri, 28 Feb 2020 18:00:53 -0500 Subject: [PATCH 319/400] dm bio record: save/restore bi_end_io and bi_integrity Also, save/restore __bi_remaining in case the bio was used in a BIO_CHAIN (e.g. due to blk_queue_split). Suggested-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-bio-record.h | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/md/dm-bio-record.h b/drivers/md/dm-bio-record.h index c82578af56a5..2ea0360108e1 100644 --- a/drivers/md/dm-bio-record.h +++ b/drivers/md/dm-bio-record.h @@ -20,8 +20,13 @@ struct dm_bio_details { struct gendisk *bi_disk; u8 bi_partno; + int __bi_remaining; unsigned long bi_flags; struct bvec_iter bi_iter; + bio_end_io_t *bi_end_io; +#if defined(CONFIG_BLK_DEV_INTEGRITY) + struct bio_integrity_payload *bi_integrity; +#endif }; static inline void dm_bio_record(struct dm_bio_details *bd, struct bio *bio) @@ -30,6 +35,11 @@ static inline void dm_bio_record(struct dm_bio_details *bd, struct bio *bio) bd->bi_partno = bio->bi_partno; bd->bi_flags = bio->bi_flags; bd->bi_iter = bio->bi_iter; + bd->__bi_remaining = atomic_read(&bio->__bi_remaining); + bd->bi_end_io = bio->bi_end_io; +#if defined(CONFIG_BLK_DEV_INTEGRITY) + bd->bi_integrity = bio_integrity(bio); +#endif } static inline void dm_bio_restore(struct dm_bio_details *bd, struct bio *bio) @@ -38,6 +48,11 @@ static inline void dm_bio_restore(struct dm_bio_details *bd, struct bio *bio) bio->bi_partno = bd->bi_partno; bio->bi_flags = bd->bi_flags; bio->bi_iter = bd->bi_iter; + atomic_set(&bio->__bi_remaining, bd->__bi_remaining); + bio->bi_end_io = bd->bi_end_io; +#if defined(CONFIG_BLK_DEV_INTEGRITY) + bio->bi_integrity = bd->bi_integrity; +#endif } #endif From 248aa2645aa7fc9175d1107c2593cc90d4af5a4e Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Fri, 28 Feb 2020 18:11:53 -0500 Subject: [PATCH 320/400] dm integrity: use dm_bio_record and dm_bio_restore MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit In cases where dec_in_flight() has to requeue the integrity_bio_wait work to transfer the rest of the data, the bio's __bi_remaining might already have been decremented to 0, e.g.: if bio passed to underlying data device was split via blk_queue_split(). Use dm_bio_{record,restore} rather than effectively open-coding them in dm-integrity -- these methods now manage __bi_remaining too. Depends-on: f7f0b057a9c1 ("dm bio record: save/restore bi_end_io and bi_integrity") Reported-by: Daniel Glöckner Suggested-by: Mikulas Patocka Signed-off-by: Mike Snitzer --- drivers/md/dm-integrity.c | 32 +++++++++----------------------- 1 file changed, 9 insertions(+), 23 deletions(-) diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c index e1ad0b53f681..a82a9c257744 100644 --- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -6,6 +6,8 @@ * This file is released under the GPL. */ +#include "dm-bio-record.h" + #include #include #include @@ -295,11 +297,7 @@ struct dm_integrity_io { struct completion *completion; - struct gendisk *orig_bi_disk; - u8 orig_bi_partno; - bio_end_io_t *orig_bi_end_io; - struct bio_integrity_payload *orig_bi_integrity; - struct bvec_iter orig_bi_iter; + struct dm_bio_details bio_details; }; struct journal_completion { @@ -1452,14 +1450,9 @@ static void integrity_end_io(struct bio *bio) { struct dm_integrity_io *dio = dm_per_bio_data(bio, sizeof(struct dm_integrity_io)); - bio->bi_iter = dio->orig_bi_iter; - bio->bi_disk = dio->orig_bi_disk; - bio->bi_partno = dio->orig_bi_partno; - if (dio->orig_bi_integrity) { - bio->bi_integrity = dio->orig_bi_integrity; + dm_bio_restore(&dio->bio_details, bio); + if (bio->bi_integrity) bio->bi_opf |= REQ_INTEGRITY; - } - bio->bi_end_io = dio->orig_bi_end_io; if (dio->completion) complete(dio->completion); @@ -1544,7 +1537,7 @@ static void integrity_metadata(struct work_struct *w) } } - __bio_for_each_segment(bv, bio, iter, dio->orig_bi_iter) { + __bio_for_each_segment(bv, bio, iter, dio->bio_details.bi_iter) { unsigned pos; char *mem, *checksums_ptr; @@ -1588,7 +1581,7 @@ again: if (likely(checksums != checksums_onstack)) kfree(checksums); } else { - struct bio_integrity_payload *bip = dio->orig_bi_integrity; + struct bio_integrity_payload *bip = dio->bio_details.bi_integrity; if (bip) { struct bio_vec biv; @@ -2007,20 +2000,13 @@ offload_to_thread: } else dio->completion = NULL; - dio->orig_bi_iter = bio->bi_iter; - - dio->orig_bi_disk = bio->bi_disk; - dio->orig_bi_partno = bio->bi_partno; + dm_bio_record(&dio->bio_details, bio); bio_set_dev(bio, ic->dev->bdev); - - dio->orig_bi_integrity = bio_integrity(bio); bio->bi_integrity = NULL; bio->bi_opf &= ~REQ_INTEGRITY; - - dio->orig_bi_end_io = bio->bi_end_io; bio->bi_end_io = integrity_end_io; - bio->bi_iter.bi_size = dio->range.n_sectors << SECTOR_SHIFT; + generic_make_request(bio); if (need_sync_io) { From 0a68ff5e2e7cf2263674b7f0418b31e10b2a497f Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Wed, 19 Feb 2020 22:22:43 -0800 Subject: [PATCH 321/400] fcntl: Distribute switch variables for initialization MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Variables declared in a switch statement before any case statements cannot be automatically initialized with compiler instrumentation (as they are not part of any execution flow). With GCC's proposed automatic stack variable initialization feature, this triggers a warning (and they don't get initialized). Clang's automatic stack variable initialization (via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also doesn't initialize such variables[1]. Note that these warnings (or silent skipping) happen before the dead-store elimination optimization phase, so even when the automatic initializations are later elided in favor of direct initializations, the warnings remain. To avoid these problems, move such variables into the "case" where they're used or lift them up into the main function body. fs/fcntl.c: In function ‘send_sigio_to_task’: fs/fcntl.c:738:20: warning: statement will never be executed [-Wswitch-unreachable] 738 | kernel_siginfo_t si; | ^~ [1] https://bugs.llvm.org/show_bug.cgi?id=44916 Signed-off-by: Kees Cook Signed-off-by: Jeff Layton --- fs/fcntl.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/fcntl.c b/fs/fcntl.c index 9bc167562ee8..2e4c0fa2074b 100644 --- a/fs/fcntl.c +++ b/fs/fcntl.c @@ -735,8 +735,9 @@ static void send_sigio_to_task(struct task_struct *p, return; switch (signum) { - kernel_siginfo_t si; - default: + default: { + kernel_siginfo_t si; + /* Queue a rt signal with the appropriate fd as its value. We use SI_SIGIO as the source, not SI_KERNEL, since kernel signals always get @@ -769,6 +770,7 @@ static void send_sigio_to_task(struct task_struct *p, si.si_fd = fd; if (!do_send_sig_info(signum, &si, p, type)) break; + } /* fall-through - fall back on the old plain SIGIO signal */ case 0: do_send_sig_info(SIGIO, SEND_SIG_PRIV, p, type); From 974f51e8633f0f3f33e8f86bbb5ae66758aa63c7 Mon Sep 17 00:00:00 2001 From: Hou Tao Date: Tue, 3 Mar 2020 16:45:01 +0800 Subject: [PATCH 322/400] dm: fix congested_fn for request-based device We neither assign congested_fn for requested-based blk-mq device nor implement it correctly. So fix both. Also, remove incorrect comment from dm_init_normal_md_queue and rename it to dm_init_congested_fn. Fixes: 4aa9c692e052 ("bdi: separate out congested state into a separate struct") Cc: stable@vger.kernel.org Signed-off-by: Hou Tao Signed-off-by: Mike Snitzer --- drivers/md/dm.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index d01ad6a35dd0..0413018c8305 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1788,7 +1788,8 @@ static int dm_any_congested(void *congested_data, int bdi_bits) * With request-based DM we only need to check the * top-level queue for congestion. */ - r = md->queue->backing_dev_info->wb.state & bdi_bits; + struct backing_dev_info *bdi = md->queue->backing_dev_info; + r = bdi->wb.congested->state & bdi_bits; } else { map = dm_get_live_table_fast(md); if (map) @@ -1854,15 +1855,6 @@ static const struct dax_operations dm_dax_ops; static void dm_wq_work(struct work_struct *work); -static void dm_init_normal_md_queue(struct mapped_device *md) -{ - /* - * Initialize aspects of queue that aren't relevant for blk-mq - */ - md->queue->backing_dev_info->congested_data = md; - md->queue->backing_dev_info->congested_fn = dm_any_congested; -} - static void cleanup_mapped_device(struct mapped_device *md) { if (md->wq) @@ -2249,6 +2241,12 @@ struct queue_limits *dm_get_queue_limits(struct mapped_device *md) } EXPORT_SYMBOL_GPL(dm_get_queue_limits); +static void dm_init_congested_fn(struct mapped_device *md) +{ + md->queue->backing_dev_info->congested_data = md; + md->queue->backing_dev_info->congested_fn = dm_any_congested; +} + /* * Setup the DM device's queue based on md's type */ @@ -2265,11 +2263,12 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t) DMERR("Cannot initialize queue for request-based dm-mq mapped device"); return r; } + dm_init_congested_fn(md); break; case DM_TYPE_BIO_BASED: case DM_TYPE_DAX_BIO_BASED: case DM_TYPE_NVME_BIO_BASED: - dm_init_normal_md_queue(md); + dm_init_congested_fn(md); break; case DM_TYPE_NONE: WARN_ON_ONCE(true); From 636be4241bdd88fec273b38723e44bad4e1c4fae Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Thu, 27 Feb 2020 14:25:31 -0500 Subject: [PATCH 323/400] dm: bump version of core and various targets Changes made during the 5.6 cycle warrant bumping the version number for DM core and the targets modified by this commit. It should be noted that dm-thin, dm-crypt and dm-raid already had their target version bumped during the 5.6 merge window. Signed-off-by; Mike Snitzer --- drivers/md/dm-cache-target.c | 2 +- drivers/md/dm-integrity.c | 2 +- drivers/md/dm-mpath.c | 2 +- drivers/md/dm-verity-target.c | 2 +- drivers/md/dm-writecache.c | 2 +- drivers/md/dm-zoned-target.c | 2 +- include/uapi/linux/dm-ioctl.h | 4 ++-- 7 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c index f4be63671233..d3bb355819a4 100644 --- a/drivers/md/dm-cache-target.c +++ b/drivers/md/dm-cache-target.c @@ -3492,7 +3492,7 @@ static void cache_io_hints(struct dm_target *ti, struct queue_limits *limits) static struct target_type cache_target = { .name = "cache", - .version = {2, 1, 0}, + .version = {2, 2, 0}, .module = THIS_MODULE, .ctr = cache_ctr, .dtr = cache_dtr, diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c index a82a9c257744..2f03fecd312d 100644 --- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -4202,7 +4202,7 @@ static void dm_integrity_dtr(struct dm_target *ti) static struct target_type integrity_target = { .name = "integrity", - .version = {1, 4, 0}, + .version = {1, 5, 0}, .module = THIS_MODULE, .features = DM_TARGET_SINGLETON | DM_TARGET_INTEGRITY, .ctr = dm_integrity_ctr, diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c index 2bc18c9c3abc..58fd137b6ae1 100644 --- a/drivers/md/dm-mpath.c +++ b/drivers/md/dm-mpath.c @@ -2053,7 +2053,7 @@ static int multipath_busy(struct dm_target *ti) *---------------------------------------------------------------*/ static struct target_type multipath_target = { .name = "multipath", - .version = {1, 13, 0}, + .version = {1, 14, 0}, .features = DM_TARGET_SINGLETON | DM_TARGET_IMMUTABLE | DM_TARGET_PASSES_INTEGRITY, .module = THIS_MODULE, diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c index 0d61e9c67986..eec9f252e935 100644 --- a/drivers/md/dm-verity-target.c +++ b/drivers/md/dm-verity-target.c @@ -1221,7 +1221,7 @@ bad: static struct target_type verity_target = { .name = "verity", - .version = {1, 5, 0}, + .version = {1, 6, 0}, .module = THIS_MODULE, .ctr = verity_ctr, .dtr = verity_dtr, diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c index d692d2a00745..a09bdc000e64 100644 --- a/drivers/md/dm-writecache.c +++ b/drivers/md/dm-writecache.c @@ -2320,7 +2320,7 @@ static void writecache_status(struct dm_target *ti, status_type_t type, static struct target_type writecache_target = { .name = "writecache", - .version = {1, 1, 1}, + .version = {1, 2, 0}, .module = THIS_MODULE, .ctr = writecache_ctr, .dtr = writecache_dtr, diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c index b1e64cd31647..f4f83d39b3dc 100644 --- a/drivers/md/dm-zoned-target.c +++ b/drivers/md/dm-zoned-target.c @@ -967,7 +967,7 @@ static int dmz_iterate_devices(struct dm_target *ti, static struct target_type dmz_type = { .name = "zoned", - .version = {1, 0, 0}, + .version = {1, 1, 0}, .features = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM, .module = THIS_MODULE, .ctr = dmz_ctr, diff --git a/include/uapi/linux/dm-ioctl.h b/include/uapi/linux/dm-ioctl.h index 2df8ceca1f9b..6622912c2342 100644 --- a/include/uapi/linux/dm-ioctl.h +++ b/include/uapi/linux/dm-ioctl.h @@ -272,9 +272,9 @@ enum { #define DM_DEV_SET_GEOMETRY _IOWR(DM_IOCTL, DM_DEV_SET_GEOMETRY_CMD, struct dm_ioctl) #define DM_VERSION_MAJOR 4 -#define DM_VERSION_MINOR 41 +#define DM_VERSION_MINOR 42 #define DM_VERSION_PATCHLEVEL 0 -#define DM_VERSION_EXTRA "-ioctl (2019-09-16)" +#define DM_VERSION_EXTRA "-ioctl (2020-02-27)" /* Status bits */ #define DM_READONLY_FLAG (1 << 0) /* In/Out */ From 0cff8bff7af886af0923d5c91776cd51603e531f Mon Sep 17 00:00:00 2001 From: Vincent Chen Date: Fri, 21 Feb 2020 10:47:54 +0800 Subject: [PATCH 324/400] riscv: avoid the PIC offset of static percpu data in module beyond 2G limits The compiler uses the PIC-relative method to access static variables instead of GOT when the code model is PIC. Therefore, the limitation of the access range from the instruction to the symbol address is +-2GB. Under this circumstance, the kernel cannot load a kernel module if this module has static per-CPU symbols declared by DEFINE_PER_CPU(). The reason is that kernel relocates the .data..percpu section of the kernel module to the end of kernel's .data..percpu. Hence, the distance between the per-CPU symbols and the instruction will exceed the 2GB limits. To solve this problem, the kernel should place the loaded module in the memory area [&_end-2G, VMALLOC_END]. Signed-off-by: Vincent Chen Suggested-by: Alexandre Ghiti Suggested-by: Anup Patel Tested-by: Alexandre Ghiti Tested-by: Carlos de Paula Signed-off-by: Palmer Dabbelt --- arch/riscv/kernel/module.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c index b7401858d872..8bbe5dbe1341 100644 --- a/arch/riscv/kernel/module.c +++ b/arch/riscv/kernel/module.c @@ -8,6 +8,10 @@ #include #include #include +#include +#include +#include +#include static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v) { @@ -386,3 +390,15 @@ int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab, return 0; } + +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT) +#define VMALLOC_MODULE_START \ + max(PFN_ALIGN((unsigned long)&_end - SZ_2G), VMALLOC_START) +void *module_alloc(unsigned long size) +{ + return __vmalloc_node_range(size, 1, VMALLOC_MODULE_START, + VMALLOC_END, GFP_KERNEL, + PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE, + __builtin_return_address(0)); +} +#endif From aad15bc85c189261b0554a7dc8e053641dd4025c Mon Sep 17 00:00:00 2001 From: Vincent Chen Date: Fri, 21 Feb 2020 10:47:55 +0800 Subject: [PATCH 325/400] riscv: Change code model of module to medany to improve data accessing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit All the loaded module locates in the region [&_end-2G,VMALLOC_END] at runtime, so the distance from the module start to the end of the kernel image does not exceed 2GB. Hence, the code model of the kernel module can be changed to medany to improve the performance data access. Signed-off-by: Vincent Chen Signed-off-by: Palmer Dabbelt --- arch/riscv/Makefile | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile index b9009a2fbaf5..259cb53d7f20 100644 --- a/arch/riscv/Makefile +++ b/arch/riscv/Makefile @@ -13,8 +13,10 @@ LDFLAGS_vmlinux := ifeq ($(CONFIG_DYNAMIC_FTRACE),y) LDFLAGS_vmlinux := --no-relax endif -KBUILD_AFLAGS_MODULE += -fPIC -KBUILD_CFLAGS_MODULE += -fPIC + +ifeq ($(CONFIG_64BIT)$(CONFIG_CMODEL_MEDLOW),yy) +KBUILD_CFLAGS_MODULE += -mcmodel=medany +endif export BITS ifeq ($(CONFIG_ARCH_RV64I),y) From 44f2f882909fedfc3a56e4b90026910456019743 Mon Sep 17 00:00:00 2001 From: Dan Carpenter Date: Tue, 3 Mar 2020 13:16:08 +0300 Subject: [PATCH 326/400] hwmon: (adt7462) Fix an error return in ADT7462_REG_VOLT() This is only called from adt7462_update_device(). The caller expects it to return zero on error. I fixed a similar issue earlier in commit a4bf06d58f21 ("hwmon: (adt7462) ADT7462_REG_VOLT_MAX() should return 0") but I missed this one. Fixes: c0b4e3ab0c76 ("adt7462: new hwmon driver") Signed-off-by: Dan Carpenter Reviewed-by: Darrick J. Wong Link: https://lore.kernel.org/r/20200303101608.kqjwfcazu2ylhi2a@kili.mountain Signed-off-by: Guenter Roeck --- drivers/hwmon/adt7462.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/adt7462.c b/drivers/hwmon/adt7462.c index 9632e2e3c4bb..319a0519ebdb 100644 --- a/drivers/hwmon/adt7462.c +++ b/drivers/hwmon/adt7462.c @@ -413,7 +413,7 @@ static int ADT7462_REG_VOLT(struct adt7462_data *data, int which) return 0x95; break; } - return -ENODEV; + return 0; } /* Provide labels for sysfs */ From a4769905f0ae32cae4f096f646ab03b8b4794c74 Mon Sep 17 00:00:00 2001 From: Jernej Skrabec Date: Mon, 24 Feb 2020 18:38:55 +0100 Subject: [PATCH 327/400] drm/sun4i: de2/de3: Remove unsupported VI layer formats YUV444 and YVU444 are planar formats, but HW format RGB888 is packed. This means that those two mappings were never correct. Remove them. Fixes: 60a3dcf96aa8 ("drm/sun4i: Add DE2 definitions for YUV formats") Acked-by: Maxime Ripard Signed-off-by: Jernej Skrabec Link: https://patchwork.freedesktop.org/patch/msgid/20200224173901.174016-2-jernej.skrabec@siol.net --- drivers/gpu/drm/sun4i/sun8i_mixer.c | 12 ------------ drivers/gpu/drm/sun4i/sun8i_vi_layer.c | 2 -- 2 files changed, 14 deletions(-) diff --git a/drivers/gpu/drm/sun4i/sun8i_mixer.c b/drivers/gpu/drm/sun4i/sun8i_mixer.c index 7c24f8f832a5..3a78dbbceb8a 100644 --- a/drivers/gpu/drm/sun4i/sun8i_mixer.c +++ b/drivers/gpu/drm/sun4i/sun8i_mixer.c @@ -196,12 +196,6 @@ static const struct de2_fmt_info de2_formats[] = { .rgb = false, .csc = SUN8I_CSC_MODE_YUV2RGB, }, - { - .drm_fmt = DRM_FORMAT_YUV444, - .de2_fmt = SUN8I_MIXER_FBFMT_RGB888, - .rgb = true, - .csc = SUN8I_CSC_MODE_YUV2RGB, - }, { .drm_fmt = DRM_FORMAT_YUV422, .de2_fmt = SUN8I_MIXER_FBFMT_YUV422, @@ -220,12 +214,6 @@ static const struct de2_fmt_info de2_formats[] = { .rgb = false, .csc = SUN8I_CSC_MODE_YUV2RGB, }, - { - .drm_fmt = DRM_FORMAT_YVU444, - .de2_fmt = SUN8I_MIXER_FBFMT_RGB888, - .rgb = true, - .csc = SUN8I_CSC_MODE_YVU2RGB, - }, { .drm_fmt = DRM_FORMAT_YVU422, .de2_fmt = SUN8I_MIXER_FBFMT_YUV422, diff --git a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c index 42d445d23773..6a244d6fafd9 100644 --- a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c +++ b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c @@ -431,11 +431,9 @@ static const u32 sun8i_vi_layer_formats[] = { DRM_FORMAT_YUV411, DRM_FORMAT_YUV420, DRM_FORMAT_YUV422, - DRM_FORMAT_YUV444, DRM_FORMAT_YVU411, DRM_FORMAT_YVU420, DRM_FORMAT_YVU422, - DRM_FORMAT_YVU444, }; struct sun8i_vi_layer *sun8i_vi_layer_init_one(struct drm_device *drm, From 169ca4b38932112e8b2ee8baef9cea44678625b3 Mon Sep 17 00:00:00 2001 From: Jernej Skrabec Date: Mon, 24 Feb 2020 18:38:56 +0100 Subject: [PATCH 328/400] drm/sun4i: Add separate DE3 VI layer formats DE3 VI layers support alpha blending, but DE2 VI layers do not. Additionally, DE3 VI layers support 10-bit RGB and YUV formats. Make a separate list for DE3. Fixes: c50519e6db4d ("drm/sun4i: Add basic support for DE3") Acked-by: Maxime Ripard Signed-off-by: Jernej Skrabec Link: https://patchwork.freedesktop.org/patch/msgid/20200224173901.174016-3-jernej.skrabec@siol.net --- drivers/gpu/drm/sun4i/sun8i_mixer.c | 36 ++++++++++++++++ drivers/gpu/drm/sun4i/sun8i_mixer.h | 11 +++++ drivers/gpu/drm/sun4i/sun8i_vi_layer.c | 58 ++++++++++++++++++++++++-- 3 files changed, 102 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/sun4i/sun8i_mixer.c b/drivers/gpu/drm/sun4i/sun8i_mixer.c index 3a78dbbceb8a..655445bfe64a 100644 --- a/drivers/gpu/drm/sun4i/sun8i_mixer.c +++ b/drivers/gpu/drm/sun4i/sun8i_mixer.c @@ -148,6 +148,30 @@ static const struct de2_fmt_info de2_formats[] = { .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + .drm_fmt = DRM_FORMAT_ARGB2101010, + .de2_fmt = SUN8I_MIXER_FBFMT_ARGB2101010, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, + { + .drm_fmt = DRM_FORMAT_ABGR2101010, + .de2_fmt = SUN8I_MIXER_FBFMT_ABGR2101010, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, + { + .drm_fmt = DRM_FORMAT_RGBA1010102, + .de2_fmt = SUN8I_MIXER_FBFMT_RGBA1010102, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, + { + .drm_fmt = DRM_FORMAT_BGRA1010102, + .de2_fmt = SUN8I_MIXER_FBFMT_BGRA1010102, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_UYVY, .de2_fmt = SUN8I_MIXER_FBFMT_UYVY, @@ -232,6 +256,18 @@ static const struct de2_fmt_info de2_formats[] = { .rgb = false, .csc = SUN8I_CSC_MODE_YVU2RGB, }, + { + .drm_fmt = DRM_FORMAT_P010, + .de2_fmt = SUN8I_MIXER_FBFMT_P010_YUV, + .rgb = false, + .csc = SUN8I_CSC_MODE_YUV2RGB, + }, + { + .drm_fmt = DRM_FORMAT_P210, + .de2_fmt = SUN8I_MIXER_FBFMT_P210_YUV, + .rgb = false, + .csc = SUN8I_CSC_MODE_YUV2RGB, + }, }; const struct de2_fmt_info *sun8i_mixer_format_info(u32 format) diff --git a/drivers/gpu/drm/sun4i/sun8i_mixer.h b/drivers/gpu/drm/sun4i/sun8i_mixer.h index c6cc94057faf..345b28b0a80a 100644 --- a/drivers/gpu/drm/sun4i/sun8i_mixer.h +++ b/drivers/gpu/drm/sun4i/sun8i_mixer.h @@ -93,6 +93,10 @@ #define SUN8I_MIXER_FBFMT_ABGR1555 17 #define SUN8I_MIXER_FBFMT_RGBA5551 18 #define SUN8I_MIXER_FBFMT_BGRA5551 19 +#define SUN8I_MIXER_FBFMT_ARGB2101010 20 +#define SUN8I_MIXER_FBFMT_ABGR2101010 21 +#define SUN8I_MIXER_FBFMT_RGBA1010102 22 +#define SUN8I_MIXER_FBFMT_BGRA1010102 23 #define SUN8I_MIXER_FBFMT_YUYV 0 #define SUN8I_MIXER_FBFMT_UYVY 1 @@ -109,6 +113,13 @@ /* format 12 is semi-planar YUV411 UVUV */ /* format 13 is semi-planar YUV411 VUVU */ #define SUN8I_MIXER_FBFMT_YUV411 14 +/* format 15 doesn't exist */ +/* format 16 is P010 YVU */ +#define SUN8I_MIXER_FBFMT_P010_YUV 17 +/* format 18 is P210 YVU */ +#define SUN8I_MIXER_FBFMT_P210_YUV 19 +/* format 20 is packed YVU444 10-bit */ +/* format 21 is packed YUV444 10-bit */ /* * Sub-engines listed bellow are unused for now. The EN registers are here only diff --git a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c index 6a244d6fafd9..6c0084a3c3d7 100644 --- a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c +++ b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c @@ -436,24 +436,76 @@ static const u32 sun8i_vi_layer_formats[] = { DRM_FORMAT_YVU422, }; +static const u32 sun8i_vi_layer_de3_formats[] = { + DRM_FORMAT_ABGR1555, + DRM_FORMAT_ABGR2101010, + DRM_FORMAT_ABGR4444, + DRM_FORMAT_ABGR8888, + DRM_FORMAT_ARGB1555, + DRM_FORMAT_ARGB2101010, + DRM_FORMAT_ARGB4444, + DRM_FORMAT_ARGB8888, + DRM_FORMAT_BGR565, + DRM_FORMAT_BGR888, + DRM_FORMAT_BGRA1010102, + DRM_FORMAT_BGRA5551, + DRM_FORMAT_BGRA4444, + DRM_FORMAT_BGRA8888, + DRM_FORMAT_BGRX8888, + DRM_FORMAT_RGB565, + DRM_FORMAT_RGB888, + DRM_FORMAT_RGBA1010102, + DRM_FORMAT_RGBA4444, + DRM_FORMAT_RGBA5551, + DRM_FORMAT_RGBA8888, + DRM_FORMAT_RGBX8888, + DRM_FORMAT_XBGR8888, + DRM_FORMAT_XRGB8888, + + DRM_FORMAT_NV16, + DRM_FORMAT_NV12, + DRM_FORMAT_NV21, + DRM_FORMAT_NV61, + DRM_FORMAT_P010, + DRM_FORMAT_P210, + DRM_FORMAT_UYVY, + DRM_FORMAT_VYUY, + DRM_FORMAT_YUYV, + DRM_FORMAT_YVYU, + DRM_FORMAT_YUV411, + DRM_FORMAT_YUV420, + DRM_FORMAT_YUV422, + DRM_FORMAT_YVU411, + DRM_FORMAT_YVU420, + DRM_FORMAT_YVU422, +}; + struct sun8i_vi_layer *sun8i_vi_layer_init_one(struct drm_device *drm, struct sun8i_mixer *mixer, int index) { u32 supported_encodings, supported_ranges; + unsigned int plane_cnt, format_count; struct sun8i_vi_layer *layer; - unsigned int plane_cnt; + const u32 *formats; int ret; layer = devm_kzalloc(drm->dev, sizeof(*layer), GFP_KERNEL); if (!layer) return ERR_PTR(-ENOMEM); + if (mixer->cfg->is_de3) { + formats = sun8i_vi_layer_de3_formats; + format_count = ARRAY_SIZE(sun8i_vi_layer_de3_formats); + } else { + formats = sun8i_vi_layer_formats; + format_count = ARRAY_SIZE(sun8i_vi_layer_formats); + } + /* possible crtcs are set later */ ret = drm_universal_plane_init(drm, &layer->plane, 0, &sun8i_vi_layer_funcs, - sun8i_vi_layer_formats, - ARRAY_SIZE(sun8i_vi_layer_formats), + formats, format_count, NULL, DRM_PLANE_TYPE_OVERLAY, NULL); if (ret) { dev_err(drm->dev, "Couldn't initialize layer\n"); From 20896ef137340e9426cf322606f764452f5eb960 Mon Sep 17 00:00:00 2001 From: Jernej Skrabec Date: Mon, 24 Feb 2020 18:38:57 +0100 Subject: [PATCH 329/400] drm/sun4i: Fix DE2 VI layer format support DE2 VI layer doesn't support blending which means alpha channel is ignored. Replace all formats with alpha with "don't care" (X) channel. Fixes: 7480ba4d7571 ("drm/sun4i: Add support for DE2 VI planes") Acked-by: Maxime Ripard Signed-off-by: Jernej Skrabec Link: https://patchwork.freedesktop.org/patch/msgid/20200224173901.174016-4-jernej.skrabec@siol.net --- drivers/gpu/drm/sun4i/sun8i_mixer.c | 56 ++++++++++++++++++++++++++ drivers/gpu/drm/sun4i/sun8i_vi_layer.c | 22 +++++----- 2 files changed, 67 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/sun4i/sun8i_mixer.c b/drivers/gpu/drm/sun4i/sun8i_mixer.c index 655445bfe64a..4a64f7ae437a 100644 --- a/drivers/gpu/drm/sun4i/sun8i_mixer.c +++ b/drivers/gpu/drm/sun4i/sun8i_mixer.c @@ -106,48 +106,104 @@ static const struct de2_fmt_info de2_formats[] = { .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + /* for DE2 VI layer which ignores alpha */ + .drm_fmt = DRM_FORMAT_XRGB4444, + .de2_fmt = SUN8I_MIXER_FBFMT_ARGB4444, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_ABGR4444, .de2_fmt = SUN8I_MIXER_FBFMT_ABGR4444, .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + /* for DE2 VI layer which ignores alpha */ + .drm_fmt = DRM_FORMAT_XBGR4444, + .de2_fmt = SUN8I_MIXER_FBFMT_ABGR4444, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_RGBA4444, .de2_fmt = SUN8I_MIXER_FBFMT_RGBA4444, .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + /* for DE2 VI layer which ignores alpha */ + .drm_fmt = DRM_FORMAT_RGBX4444, + .de2_fmt = SUN8I_MIXER_FBFMT_RGBA4444, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_BGRA4444, .de2_fmt = SUN8I_MIXER_FBFMT_BGRA4444, .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + /* for DE2 VI layer which ignores alpha */ + .drm_fmt = DRM_FORMAT_BGRX4444, + .de2_fmt = SUN8I_MIXER_FBFMT_BGRA4444, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_ARGB1555, .de2_fmt = SUN8I_MIXER_FBFMT_ARGB1555, .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + /* for DE2 VI layer which ignores alpha */ + .drm_fmt = DRM_FORMAT_XRGB1555, + .de2_fmt = SUN8I_MIXER_FBFMT_ARGB1555, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_ABGR1555, .de2_fmt = SUN8I_MIXER_FBFMT_ABGR1555, .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + /* for DE2 VI layer which ignores alpha */ + .drm_fmt = DRM_FORMAT_XBGR1555, + .de2_fmt = SUN8I_MIXER_FBFMT_ABGR1555, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_RGBA5551, .de2_fmt = SUN8I_MIXER_FBFMT_RGBA5551, .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + /* for DE2 VI layer which ignores alpha */ + .drm_fmt = DRM_FORMAT_RGBX5551, + .de2_fmt = SUN8I_MIXER_FBFMT_RGBA5551, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_BGRA5551, .de2_fmt = SUN8I_MIXER_FBFMT_BGRA5551, .rgb = true, .csc = SUN8I_CSC_MODE_OFF, }, + { + /* for DE2 VI layer which ignores alpha */ + .drm_fmt = DRM_FORMAT_BGRX5551, + .de2_fmt = SUN8I_MIXER_FBFMT_BGRA5551, + .rgb = true, + .csc = SUN8I_CSC_MODE_OFF, + }, { .drm_fmt = DRM_FORMAT_ARGB2101010, .de2_fmt = SUN8I_MIXER_FBFMT_ARGB2101010, diff --git a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c index 6c0084a3c3d7..b8398ca18b0f 100644 --- a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c +++ b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c @@ -398,26 +398,26 @@ static const struct drm_plane_funcs sun8i_vi_layer_funcs = { }; /* - * While all RGB formats are supported, VI planes don't support - * alpha blending, so there is no point having formats with alpha - * channel if their opaque analog exist. + * While DE2 VI layer supports same RGB formats as UI layer, alpha + * channel is ignored. This structure lists all unique variants + * where alpha channel is replaced with "don't care" (X) channel. */ static const u32 sun8i_vi_layer_formats[] = { - DRM_FORMAT_ABGR1555, - DRM_FORMAT_ABGR4444, - DRM_FORMAT_ARGB1555, - DRM_FORMAT_ARGB4444, DRM_FORMAT_BGR565, DRM_FORMAT_BGR888, - DRM_FORMAT_BGRA5551, - DRM_FORMAT_BGRA4444, + DRM_FORMAT_BGRX4444, + DRM_FORMAT_BGRX5551, DRM_FORMAT_BGRX8888, DRM_FORMAT_RGB565, DRM_FORMAT_RGB888, - DRM_FORMAT_RGBA4444, - DRM_FORMAT_RGBA5551, + DRM_FORMAT_RGBX4444, + DRM_FORMAT_RGBX5551, DRM_FORMAT_RGBX8888, + DRM_FORMAT_XBGR1555, + DRM_FORMAT_XBGR4444, DRM_FORMAT_XBGR8888, + DRM_FORMAT_XRGB1555, + DRM_FORMAT_XRGB4444, DRM_FORMAT_XRGB8888, DRM_FORMAT_NV16, From b94858a7eae1aeabf6d910ded3b6fe66d06e8a1e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jonathan=20Neusch=C3=A4fer?= Date: Thu, 27 Feb 2020 16:55:00 +0100 Subject: [PATCH 330/400] dt-bindings: mfd: zii,rave-sp: Fix a typo ("onborad") MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Jonathan Neuschäfer Signed-off-by: Rob Herring --- Documentation/devicetree/bindings/mfd/zii,rave-sp.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/mfd/zii,rave-sp.txt b/Documentation/devicetree/bindings/mfd/zii,rave-sp.txt index 088eff9ddb78..e0f901edc063 100644 --- a/Documentation/devicetree/bindings/mfd/zii,rave-sp.txt +++ b/Documentation/devicetree/bindings/mfd/zii,rave-sp.txt @@ -20,7 +20,7 @@ RAVE SP consists of the following sub-devices: Device Description ------ ----------- rave-sp-wdt : Watchdog -rave-sp-nvmem : Interface to onborad EEPROM +rave-sp-nvmem : Interface to onboard EEPROM rave-sp-backlight : Display backlight rave-sp-hwmon : Interface to onboard hardware sensors rave-sp-leds : Interface to onboard LEDs From 8c6687efcfd2f849a1a66bc6d54ead4f60d9b5e4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jonathan=20Neusch=C3=A4fer?= Date: Thu, 27 Feb 2020 17:05:21 +0100 Subject: [PATCH 331/400] dt-bindings: mfd: tps65910: Improve grammar MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Jonathan Neuschäfer Signed-off-by: Rob Herring --- Documentation/devicetree/bindings/mfd/tps65910.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/mfd/tps65910.txt b/Documentation/devicetree/bindings/mfd/tps65910.txt index 4f62143afd24..a5ced46bbde9 100644 --- a/Documentation/devicetree/bindings/mfd/tps65910.txt +++ b/Documentation/devicetree/bindings/mfd/tps65910.txt @@ -26,8 +26,8 @@ Required properties: ldo6, ldo7, ldo8 - xxx-supply: Input voltage supply regulator. - These entries are require if regulators are enabled for a device. Missing of these - properties can cause the regulator registration fails. + These entries are required if regulators are enabled for a device. Missing these + properties can cause the regulator registration to fail. If some of input supply is powered through battery or always-on supply then also it is require to have these parameters with proper node handle of always on power supply. From 50bbd62ce7a153051209049db708b8f5f3c395b8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jonathan=20Neusch=C3=A4fer?= Date: Thu, 27 Feb 2020 18:07:01 +0100 Subject: [PATCH 332/400] dt-bindings: mfd: Fix typo in file name of twl-familly.txt MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Jonathan Neuschäfer Signed-off-by: Rob Herring --- Documentation/devicetree/bindings/input/twl4030-pwrbutton.txt | 2 +- .../devicetree/bindings/mfd/{twl-familly.txt => twl-family.txt} | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename Documentation/devicetree/bindings/mfd/{twl-familly.txt => twl-family.txt} (100%) diff --git a/Documentation/devicetree/bindings/input/twl4030-pwrbutton.txt b/Documentation/devicetree/bindings/input/twl4030-pwrbutton.txt index c864a46cddcf..f5021214edec 100644 --- a/Documentation/devicetree/bindings/input/twl4030-pwrbutton.txt +++ b/Documentation/devicetree/bindings/input/twl4030-pwrbutton.txt @@ -1,7 +1,7 @@ Texas Instruments TWL family (twl4030) pwrbutton module This module is part of the TWL4030. For more details about the whole -chip see Documentation/devicetree/bindings/mfd/twl-familly.txt. +chip see Documentation/devicetree/bindings/mfd/twl-family.txt. This module provides a simple power button event via an Interrupt. diff --git a/Documentation/devicetree/bindings/mfd/twl-familly.txt b/Documentation/devicetree/bindings/mfd/twl-family.txt similarity index 100% rename from Documentation/devicetree/bindings/mfd/twl-familly.txt rename to Documentation/devicetree/bindings/mfd/twl-family.txt From 582b4e55403e053d8a48ff687a05174da9cc3fb0 Mon Sep 17 00:00:00 2001 From: Gerald Schaefer Date: Thu, 27 Feb 2020 12:56:42 +0100 Subject: [PATCH 333/400] s390/mm: fix panic in gup_fast on large pud On s390 there currently is no implementation of pud_write(). That was ok as long as we had our own implementation of get_user_pages_fast() which checked for pud protection by testing the bit directly w/o using pud_write(). The other callers of pud_write() are not reachable on s390. After commit 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code") we use the generic get_user_pages_fast(), which does call pud_write() in pud_access_permitted() for FOLL_WRITE access on a large pud. Without an s390 specific pud_write(), the generic version is called, which contains a BUG() statement to remind us that we don't have a proper implementation. This results in a kernel panic. Fix this by providing an implementation of pud_write(). Cc: # 5.2+ Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code") Signed-off-by: Gerald Schaefer Reviewed-by: Heiko Carstens Signed-off-by: Vasily Gorbik --- arch/s390/include/asm/pgtable.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h index 137a3920ca36..6d7c3b7e9281 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -752,6 +752,12 @@ static inline int pmd_write(pmd_t pmd) return (pmd_val(pmd) & _SEGMENT_ENTRY_WRITE) != 0; } +#define pud_write pud_write +static inline int pud_write(pud_t pud) +{ + return (pud_val(pud) & _REGION3_ENTRY_WRITE) != 0; +} + static inline int pmd_dirty(pmd_t pmd) { return (pmd_val(pmd) & _SEGMENT_ENTRY_DIRTY) != 0; From df057c914a9c219ac8b8ed22caf7da2f80c1fe26 Mon Sep 17 00:00:00 2001 From: Niklas Schnelle Date: Thu, 27 Feb 2020 12:17:18 +0100 Subject: [PATCH 334/400] s390/pci: Fix unexpected write combine on resource In the initial MIO support introduced in commit 71ba41c9b1d9 ("s390/pci: provide support for MIO instructions") zpci_map_resource() and zpci_setup_resources() default to using the mio_wb address as the resource's start address. This means users of the mapping, which includes most drivers, will get write combining on PCI Stores. This may lead to problems when drivers expect write through behavior when not using an explicit ioremap_wc(). Cc: stable@vger.kernel.org Fixes: 71ba41c9b1d9 ("s390/pci: provide support for MIO instructions") Signed-off-by: Niklas Schnelle Reviewed-by: Pierre Morel Signed-off-by: Vasily Gorbik --- arch/s390/pci/pci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index bc61ea18e88d..60716d18ce5a 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -424,7 +424,7 @@ static void zpci_map_resources(struct pci_dev *pdev) if (zpci_use_mio(zdev)) pdev->resource[i].start = - (resource_size_t __force) zdev->bars[i].mio_wb; + (resource_size_t __force) zdev->bars[i].mio_wt; else pdev->resource[i].start = (resource_size_t __force) pci_iomap_range_fh(pdev, i, 0, 0); @@ -531,7 +531,7 @@ static int zpci_setup_bus_resources(struct zpci_dev *zdev, flags |= IORESOURCE_MEM_64; if (zpci_use_mio(zdev)) - addr = (unsigned long) zdev->bars[i].mio_wb; + addr = (unsigned long) zdev->bars[i].mio_wt; else addr = ZPCI_ADDR(entry); size = 1UL << zdev->bars[i].size; From 08f56f8f3799b2ed1c5ac7eed6d86a4926289655 Mon Sep 17 00:00:00 2001 From: Chris Wilson Date: Mon, 2 Mar 2020 08:57:57 +0000 Subject: [PATCH 335/400] drm/i915/perf: Reintroduce wait on OA configuration completion We still need to wait for the initial OA configuration to happen before we enable OA report writes to the OA buffer. Reported-by: Lionel Landwerlin Fixes: 15d0ace1f876 ("drm/i915/perf: execute OA configuration from command stream") Closes: https://gitlab.freedesktop.org/drm/intel/issues/1356 Testcase: igt/perf/stream-open-close Signed-off-by: Chris Wilson Cc: Lionel Landwerlin Reviewed-by: Lionel Landwerlin Link: https://patchwork.freedesktop.org/patch/msgid/20200302085812.4172450-7-chris@chris-wilson.co.uk (cherry picked from commit 4b4e973d5eb89244b67d3223b60f752d0479f253) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/i915_perf.c | 58 ++++++++++++++++++-------- drivers/gpu/drm/i915/i915_perf_types.h | 3 +- 2 files changed, 43 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 0f556d80ba36..3b6b913bd27a 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1954,9 +1954,10 @@ out: return i915_vma_get(oa_bo->vma); } -static int emit_oa_config(struct i915_perf_stream *stream, - struct i915_oa_config *oa_config, - struct intel_context *ce) +static struct i915_request * +emit_oa_config(struct i915_perf_stream *stream, + struct i915_oa_config *oa_config, + struct intel_context *ce) { struct i915_request *rq; struct i915_vma *vma; @@ -1964,7 +1965,7 @@ static int emit_oa_config(struct i915_perf_stream *stream, vma = get_oa_vma(stream, oa_config); if (IS_ERR(vma)) - return PTR_ERR(vma); + return ERR_CAST(vma); err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH); if (err) @@ -1989,13 +1990,17 @@ static int emit_oa_config(struct i915_perf_stream *stream, err = rq->engine->emit_bb_start(rq, vma->node.start, 0, I915_DISPATCH_SECURE); + if (err) + goto err_add_request; + + i915_request_get(rq); err_add_request: i915_request_add(rq); err_vma_unpin: i915_vma_unpin(vma); err_vma_put: i915_vma_put(vma); - return err; + return err ? ERR_PTR(err) : rq; } static struct intel_context *oa_context(struct i915_perf_stream *stream) @@ -2003,7 +2008,8 @@ static struct intel_context *oa_context(struct i915_perf_stream *stream) return stream->pinned_ctx ?: stream->engine->kernel_context; } -static int hsw_enable_metric_set(struct i915_perf_stream *stream) +static struct i915_request * +hsw_enable_metric_set(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; @@ -2406,7 +2412,8 @@ static int lrc_configure_all_contexts(struct i915_perf_stream *stream, return oa_configure_all_contexts(stream, regs, ARRAY_SIZE(regs)); } -static int gen8_enable_metric_set(struct i915_perf_stream *stream) +static struct i915_request * +gen8_enable_metric_set(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; struct i915_oa_config *oa_config = stream->oa_config; @@ -2448,7 +2455,7 @@ static int gen8_enable_metric_set(struct i915_perf_stream *stream) */ ret = lrc_configure_all_contexts(stream, oa_config); if (ret) - return ret; + return ERR_PTR(ret); return emit_oa_config(stream, oa_config, oa_context(stream)); } @@ -2460,7 +2467,8 @@ static u32 oag_report_ctx_switches(const struct i915_perf_stream *stream) 0 : GEN12_OAG_OA_DEBUG_DISABLE_CTX_SWITCH_REPORTS); } -static int gen12_enable_metric_set(struct i915_perf_stream *stream) +static struct i915_request * +gen12_enable_metric_set(struct i915_perf_stream *stream) { struct intel_uncore *uncore = stream->uncore; struct i915_oa_config *oa_config = stream->oa_config; @@ -2491,7 +2499,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream) */ ret = gen12_configure_all_contexts(stream, oa_config); if (ret) - return ret; + return ERR_PTR(ret); /* * For Gen12, performance counters are context @@ -2501,7 +2509,7 @@ static int gen12_enable_metric_set(struct i915_perf_stream *stream) if (stream->ctx) { ret = gen12_configure_oar_context(stream, true); if (ret) - return ret; + return ERR_PTR(ret); } return emit_oa_config(stream, oa_config, oa_context(stream)); @@ -2696,6 +2704,20 @@ static const struct i915_perf_stream_ops i915_oa_stream_ops = { .read = i915_oa_read, }; +static int i915_perf_stream_enable_sync(struct i915_perf_stream *stream) +{ + struct i915_request *rq; + + rq = stream->perf->ops.enable_metric_set(stream); + if (IS_ERR(rq)) + return PTR_ERR(rq); + + i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT); + i915_request_put(rq); + + return 0; +} + /** * i915_oa_stream_init - validate combined props for OA stream and init * @stream: An i915 perf stream @@ -2829,7 +2851,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, stream->ops = &i915_oa_stream_ops; perf->exclusive_stream = stream; - ret = perf->ops.enable_metric_set(stream); + ret = i915_perf_stream_enable_sync(stream); if (ret) { DRM_DEBUG("Unable to enable metric set\n"); goto err_enable; @@ -3147,7 +3169,7 @@ static long i915_perf_config_locked(struct i915_perf_stream *stream, return -EINVAL; if (config != stream->oa_config) { - int err; + struct i915_request *rq; /* * If OA is bound to a specific context, emit the @@ -3158,11 +3180,13 @@ static long i915_perf_config_locked(struct i915_perf_stream *stream, * When set globally, we use a low priority kernel context, * so it will effectively take effect when idle. */ - err = emit_oa_config(stream, config, oa_context(stream)); - if (err == 0) + rq = emit_oa_config(stream, config, oa_context(stream)); + if (!IS_ERR(rq)) { config = xchg(&stream->oa_config, config); - else - ret = err; + i915_request_put(rq); + } else { + ret = PTR_ERR(rq); + } } i915_oa_config_put(config); diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index 45e581455f5d..a0e22f00f6cf 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -339,7 +339,8 @@ struct i915_oa_ops { * counter reports being sampled. May apply system constraints such as * disabling EU clock gating as required. */ - int (*enable_metric_set)(struct i915_perf_stream *stream); + struct i915_request * + (*enable_metric_set)(struct i915_perf_stream *stream); /** * @disable_metric_set: Remove system constraints associated with using From 169c0aa4bc17d37370f55188d9327b99d60fd9d7 Mon Sep 17 00:00:00 2001 From: Chris Wilson Date: Tue, 3 Mar 2020 14:00:09 +0000 Subject: [PATCH 336/400] drm/i915/gt: Drop the timeline->mutex as we wait for retirement As we have pinned the timeline (using tl->active_count), we can safely drop the tl->mutex as we wait for what we believe to be the final request on that timeline. This is useful for ensuring that we do not block the engine heartbeat by hogging the kernel_context's timeline on a dead GPU. References: https://gitlab.freedesktop.org/drm/intel/issues/1364 Fixes: 058179e72e09 ("drm/i915/gt: Replace hangcheck by heartbeats") Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request") Signed-off-by: Chris Wilson Cc: Mika Kuoppala Reviewed-by: Mika Kuoppala Link: https://patchwork.freedesktop.org/patch/msgid/20200303140009.1494819-1-chris@chris-wilson.co.uk (cherry picked from commit 82126e596d8519baac416aee83cad938f1d23cf8) Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/gt/intel_gt_requests.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c index 8a5054f21bf8..24c99d0838af 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c @@ -147,24 +147,32 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout) fence = i915_active_fence_get(&tl->last_request); if (fence) { + mutex_unlock(&tl->mutex); + timeout = dma_fence_wait_timeout(fence, interruptible, timeout); dma_fence_put(fence); + + /* Retirement is best effort */ + if (!mutex_trylock(&tl->mutex)) { + active_count++; + goto out_active; + } } } if (!retire_requests(tl) || flush_submission(gt)) active_count++; + mutex_unlock(&tl->mutex); - spin_lock(&timelines->lock); +out_active: spin_lock(&timelines->lock); - /* Resume iteration after dropping lock */ + /* Resume list iteration after reacquiring spinlock */ list_safe_reset_next(tl, tn, link); if (atomic_dec_and_test(&tl->active_count)) list_del(&tl->link); - mutex_unlock(&tl->mutex); /* Defer the final release to after the spinlock */ if (refcount_dec_and_test(&tl->kref.refcount)) { From 0d6defc7e0e437a9fd53622f7fd85740f38d5693 Mon Sep 17 00:00:00 2001 From: Olivier Moysan Date: Wed, 4 Mar 2020 11:24:06 +0100 Subject: [PATCH 337/400] ASoC: stm32: sai: manage rebind issue The commit e894efef9ac7 ("ASoC: core: add support to card rebind") allows to rebind the sound card after a rebind of one of its component. With this commit, the sound card is actually rebound, but may be no more functional. The following problems have been seen with STM32 SAI driver. 1) DMA channel is not requested: With the sound card rebind the simplified call sequence is: stm32_sai_sub_probe snd_soc_register_component snd_soc_try_rebind_card snd_soc_instantiate_card devm_snd_dmaengine_pcm_register The problem occurs because the pcm must be registered, before snd_soc_instantiate_card() is called. Modify SAI driver, to change the call sequence as follows: stm32_sai_sub_probe devm_snd_dmaengine_pcm_register snd_soc_register_component snd_soc_try_rebind_card 2) DMA channel is not released: dma_release_channel() is not called when devm_dmaengine_pcm_release() is executed. This occurs because SND_DMAENGINE_PCM_DRV_NAME component, has already been released through devm_component_release(). devm_dmaengine_pcm_release() should be called before devm_component_release() to avoid this problem. Call snd_dmaengine_pcm_unregister() and snd_soc_unregister_component() explicitly from SAI driver, to have the right sequence. Signed-off-by: Olivier Moysan Message-Id: <20200304102406.8093-1-olivier.moysan@st.com> Signed-off-by: Mark Brown --- sound/soc/stm/stm32_sai_sub.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/sound/soc/stm/stm32_sai_sub.c b/sound/soc/stm/stm32_sai_sub.c index 30bcd5d3a32a..10eb4b8e8e7e 100644 --- a/sound/soc/stm/stm32_sai_sub.c +++ b/sound/soc/stm/stm32_sai_sub.c @@ -1543,20 +1543,20 @@ static int stm32_sai_sub_probe(struct platform_device *pdev) return ret; } - ret = devm_snd_soc_register_component(&pdev->dev, &stm32_component, - &sai->cpu_dai_drv, 1); + ret = snd_dmaengine_pcm_register(&pdev->dev, conf, 0); + if (ret) { + dev_err(&pdev->dev, "Could not register pcm dma\n"); + return ret; + } + + ret = snd_soc_register_component(&pdev->dev, &stm32_component, + &sai->cpu_dai_drv, 1); if (ret) return ret; if (STM_SAI_PROTOCOL_IS_SPDIF(sai)) conf = &stm32_sai_pcm_config_spdif; - ret = devm_snd_dmaengine_pcm_register(&pdev->dev, conf, 0); - if (ret) { - dev_err(&pdev->dev, "Could not register pcm dma\n"); - return ret; - } - return 0; } @@ -1565,6 +1565,8 @@ static int stm32_sai_sub_remove(struct platform_device *pdev) struct stm32_sai_sub_data *sai = dev_get_drvdata(&pdev->dev); clk_unprepare(sai->pdata->pclk); + snd_dmaengine_pcm_unregister(&pdev->dev); + snd_soc_unregister_component(&pdev->dev); return 0; } From 1b79cfd99ff5127e6a143767b51694a527b3ea38 Mon Sep 17 00:00:00 2001 From: John Stultz Date: Tue, 3 Mar 2020 16:32:28 +0000 Subject: [PATCH 338/400] drm: kirin: Revert "Fix for hikey620 display offset problem" This reverts commit ff57c6513820efe945b61863cf4a51b79f18b592. With the commit ff57c6513820 ("drm: kirin: Fix for hikey620 display offset problem") we added support for handling LDI overflows by resetting the hardware. However, its been observed that when we do hit the LDI overflow condition, the irq seems to be screaming, and we do nothing but stream: [drm:ade_irq_handler [kirin_drm]] *ERROR* LDI underflow! over and over to the screen I've tried a few appraoches to avoid this, but none has yet been successful and the cure here is worse then the original disease, so revert this for now. Cc: Xinliang Liu Cc: Rongrong Zou Cc: Xinwei Kong Cc: Chen Feng Cc: Sam Ravnborg Cc: David Airlie Cc: Daniel Vetter Cc: dri-devel Fixes: ff57c6513820 ("drm: kirin: Fix for hikey620 display offset problem") Signed-off-by: John Stultz Acked-by: Xinliang Liu Signed-off-by: Xinliang Liu Link: https://patchwork.freedesktop.org/patch/msgid/20200303163228.52741-1-john.stultz@linaro.org --- .../gpu/drm/hisilicon/kirin/kirin_ade_reg.h | 1 - .../gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 20 ------------------- 2 files changed, 21 deletions(-) diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h b/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h index 0da860200410..e2ac09894a6d 100644 --- a/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h +++ b/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h @@ -83,7 +83,6 @@ #define VSIZE_OFST 20 #define LDI_INT_EN 0x741C #define FRAME_END_INT_EN_OFST 1 -#define UNDERFLOW_INT_EN_OFST 2 #define LDI_CTRL 0x7420 #define BPP_OFST 3 #define DATA_GATE_EN BIT(2) diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c index 73cd28a6ea07..86000127d4ee 100644 --- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c +++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c @@ -46,7 +46,6 @@ struct ade_hw_ctx { struct clk *media_noc_clk; struct clk *ade_pix_clk; struct reset_control *reset; - struct work_struct display_reset_wq; bool power_on; int irq; @@ -136,7 +135,6 @@ static void ade_init(struct ade_hw_ctx *ctx) */ ade_update_bits(base + ADE_CTRL, FRM_END_START_OFST, FRM_END_START_MASK, REG_EFFECTIVE_IN_ADEEN_FRMEND); - ade_update_bits(base + LDI_INT_EN, UNDERFLOW_INT_EN_OFST, MASK(1), 1); } static bool ade_crtc_mode_fixup(struct drm_crtc *crtc, @@ -304,17 +302,6 @@ static void ade_crtc_disable_vblank(struct drm_crtc *crtc) MASK(1), 0); } -static void drm_underflow_wq(struct work_struct *work) -{ - struct ade_hw_ctx *ctx = container_of(work, struct ade_hw_ctx, - display_reset_wq); - struct drm_device *drm_dev = ctx->crtc->dev; - struct drm_atomic_state *state; - - state = drm_atomic_helper_suspend(drm_dev); - drm_atomic_helper_resume(drm_dev, state); -} - static irqreturn_t ade_irq_handler(int irq, void *data) { struct ade_hw_ctx *ctx = data; @@ -331,12 +318,6 @@ static irqreturn_t ade_irq_handler(int irq, void *data) MASK(1), 1); drm_crtc_handle_vblank(crtc); } - if (status & BIT(UNDERFLOW_INT_EN_OFST)) { - ade_update_bits(base + LDI_INT_CLR, UNDERFLOW_INT_EN_OFST, - MASK(1), 1); - DRM_ERROR("LDI underflow!"); - schedule_work(&ctx->display_reset_wq); - } return IRQ_HANDLED; } @@ -919,7 +900,6 @@ static void *ade_hw_ctx_alloc(struct platform_device *pdev, if (ret) return ERR_PTR(-EIO); - INIT_WORK(&ctx->display_reset_wq, drm_underflow_wq); ctx->crtc = crtc; return ctx; From 02fbabd5f4ed182d2c616e49309f5a3efd9ec671 Mon Sep 17 00:00:00 2001 From: Fabrice Gasnier Date: Wed, 4 Mar 2020 09:55:32 +0100 Subject: [PATCH 339/400] regulator: stm32-vrefbuf: fix a possible overshoot when re-enabling There maybe an overshoot, when disabling, then re-enabling vrefbuf too quickly. VREFBUF is used by ADC/DAC on some boards. When re-enabling too quickly, an overshoot on the reference voltage make the conversions inaccurate for a short period of time. - Don't put the VREFBUF in HiZ when disabling, to force an active discharge. - Enforce a 1ms OFF/ON delay Fixes: 0cdbf481e927 ("regulator: Add support for stm32-vrefbuf") Signed-off-by: Fabrice Gasnier Message-Id: <1583312132-20932-1-git-send-email-fabrice.gasnier@st.com> Signed-off-by: Mark Brown --- drivers/regulator/stm32-vrefbuf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/regulator/stm32-vrefbuf.c b/drivers/regulator/stm32-vrefbuf.c index bdfaf7edb75a..992bc18101ef 100644 --- a/drivers/regulator/stm32-vrefbuf.c +++ b/drivers/regulator/stm32-vrefbuf.c @@ -88,7 +88,7 @@ static int stm32_vrefbuf_disable(struct regulator_dev *rdev) } val = readl_relaxed(priv->base + STM32_VREFBUF_CSR); - val = (val & ~STM32_ENVR) | STM32_HIZ; + val &= ~STM32_ENVR; writel_relaxed(val, priv->base + STM32_VREFBUF_CSR); pm_runtime_mark_last_busy(priv->dev); @@ -175,6 +175,7 @@ static const struct regulator_desc stm32_vrefbuf_regu = { .volt_table = stm32_vrefbuf_voltages, .n_voltages = ARRAY_SIZE(stm32_vrefbuf_voltages), .ops = &stm32_vrefbuf_volt_ops, + .off_on_delay = 1000, .type = REGULATOR_VOLTAGE, .owner = THIS_MODULE, }; From f9981d4f50b475d7dbb70f3022b87a3c8bba9fd6 Mon Sep 17 00:00:00 2001 From: Aaro Koskinen Date: Wed, 4 Mar 2020 13:17:40 +0200 Subject: [PATCH 340/400] spi: spi_register_controller(): free bus id on error paths Some error paths leave the bus id allocated. As a result the IDR allocation will fail after a deferred probe. Fix by freeing the bus id always on error. Signed-off-by: Aaro Koskinen Message-Id: <20200304111740.27915-1-aaro.koskinen@nokia.com> Signed-off-by: Mark Brown --- drivers/spi/spi.c | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c index 197c9e0ac2a6..94145b25f446 100644 --- a/drivers/spi/spi.c +++ b/drivers/spi/spi.c @@ -2645,7 +2645,7 @@ int spi_register_controller(struct spi_controller *ctlr) if (ctlr->use_gpio_descriptors) { status = spi_get_gpio_descs(ctlr); if (status) - return status; + goto free_bus_id; /* * A controller using GPIO descriptors always * supports SPI_CS_HIGH if need be. @@ -2655,7 +2655,7 @@ int spi_register_controller(struct spi_controller *ctlr) /* Legacy code path for GPIOs from DT */ status = of_spi_get_gpio_numbers(ctlr); if (status) - return status; + goto free_bus_id; } } @@ -2663,17 +2663,14 @@ int spi_register_controller(struct spi_controller *ctlr) * Even if it's just one always-selected device, there must * be at least one chipselect. */ - if (!ctlr->num_chipselect) - return -EINVAL; + if (!ctlr->num_chipselect) { + status = -EINVAL; + goto free_bus_id; + } status = device_add(&ctlr->dev); - if (status < 0) { - /* free bus id */ - mutex_lock(&board_lock); - idr_remove(&spi_master_idr, ctlr->bus_num); - mutex_unlock(&board_lock); - goto done; - } + if (status < 0) + goto free_bus_id; dev_dbg(dev, "registered %s %s\n", spi_controller_is_slave(ctlr) ? "slave" : "master", dev_name(&ctlr->dev)); @@ -2689,11 +2686,7 @@ int spi_register_controller(struct spi_controller *ctlr) status = spi_controller_initialize_queue(ctlr); if (status) { device_del(&ctlr->dev); - /* free bus id */ - mutex_lock(&board_lock); - idr_remove(&spi_master_idr, ctlr->bus_num); - mutex_unlock(&board_lock); - goto done; + goto free_bus_id; } } /* add statistics */ @@ -2708,7 +2701,12 @@ int spi_register_controller(struct spi_controller *ctlr) /* Register devices from the device tree and ACPI */ of_register_spi_devices(ctlr); acpi_register_spi_devices(ctlr); -done: + return status; + +free_bus_id: + mutex_lock(&board_lock); + idr_remove(&spi_master_idr, ctlr->bus_num); + mutex_unlock(&board_lock); return status; } EXPORT_SYMBOL_GPL(spi_register_controller); From 8d62d9c4bc05b3d6731f3fa083eea6400871822c Mon Sep 17 00:00:00 2001 From: Ulf Hansson Date: Tue, 3 Mar 2020 16:07:43 +0100 Subject: [PATCH 341/400] dt-bindings: arm: Correct links to idle states definitions The arm,idle-state DT bindings recently got converted to the json-schema, but some links are still pointing to the old, non-existing, txt file. Let's update the links to fix this. Fixes: baac82fe06db ("dt-bindings: arm: Convert arm,idle-state binding to DT schema") Signed-off-by: Ulf Hansson Signed-off-by: Rob Herring --- Documentation/devicetree/bindings/arm/cpus.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/arm/cpus.yaml b/Documentation/devicetree/bindings/arm/cpus.yaml index 7a9c3ce2dbef..0d5b61056b10 100644 --- a/Documentation/devicetree/bindings/arm/cpus.yaml +++ b/Documentation/devicetree/bindings/arm/cpus.yaml @@ -216,7 +216,7 @@ properties: $ref: '/schemas/types.yaml#/definitions/phandle-array' description: | List of phandles to idle state nodes supported - by this cpu (see ./idle-states.txt). + by this cpu (see ./idle-states.yaml). capacity-dmips-mhz: $ref: '/schemas/types.yaml#/definitions/uint32' From ac9686a936a194fd3d4b8797944bdaaecf0adfda Mon Sep 17 00:00:00 2001 From: Ulf Hansson Date: Tue, 3 Mar 2020 16:07:44 +0100 Subject: [PATCH 342/400] dt-bindings: arm: Fix cpu compatibles in the hierarchical example for PSCI Fixes: a3f048b5424e ("dt: psci: Update DT bindings to support hierarchical PSCI states") Signed-off-by: Ulf Hansson Signed-off-by: Rob Herring --- Documentation/devicetree/bindings/arm/psci.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/devicetree/bindings/arm/psci.yaml b/Documentation/devicetree/bindings/arm/psci.yaml index f8218e60e3e2..540211a080d4 100644 --- a/Documentation/devicetree/bindings/arm/psci.yaml +++ b/Documentation/devicetree/bindings/arm/psci.yaml @@ -199,7 +199,7 @@ examples: CPU0: cpu@0 { device_type = "cpu"; - compatible = "arm,cortex-a53", "arm,armv8"; + compatible = "arm,cortex-a53"; reg = <0x0>; enable-method = "psci"; power-domains = <&CPU_PD0>; @@ -208,7 +208,7 @@ examples: CPU1: cpu@1 { device_type = "cpu"; - compatible = "arm,cortex-a57", "arm,armv8"; + compatible = "arm,cortex-a53"; reg = <0x100>; enable-method = "psci"; power-domains = <&CPU_PD1>; From 3261227d136d83bcb4df4e9292385fb9e007fae7 Mon Sep 17 00:00:00 2001 From: Ulf Hansson Date: Tue, 3 Mar 2020 16:07:45 +0100 Subject: [PATCH 343/400] dt-bindings: power: Convert domain-idle-states bindings to json-schema While converting to the json-schema, let's also take the opportunity to further specify/clarify some more details about the DT binding. For example, let's define the label where to put the states nodes, set a pattern for nodename of the state nodes and finally add an example. Fixes: a3f048b5424e ("dt: psci: Update DT bindings to support hierarchical PSCI states") Signed-off-by: Ulf Hansson [robh: drop type refs from standard unit properties] Signed-off-by: Rob Herring --- .../devicetree/bindings/arm/psci.yaml | 2 +- .../bindings/power/domain-idle-state.txt | 33 ---------- .../bindings/power/domain-idle-state.yaml | 64 +++++++++++++++++++ .../bindings/power/power-domain.yaml | 22 +++---- .../bindings/power/power_domain.txt | 2 +- 5 files changed, 76 insertions(+), 47 deletions(-) delete mode 100644 Documentation/devicetree/bindings/power/domain-idle-state.txt create mode 100644 Documentation/devicetree/bindings/power/domain-idle-state.yaml diff --git a/Documentation/devicetree/bindings/arm/psci.yaml b/Documentation/devicetree/bindings/arm/psci.yaml index 540211a080d4..0bc3c43a525a 100644 --- a/Documentation/devicetree/bindings/arm/psci.yaml +++ b/Documentation/devicetree/bindings/arm/psci.yaml @@ -123,7 +123,7 @@ properties: to mandate it. [3] Documentation/devicetree/bindings/power/power_domain.txt - [4] Documentation/devicetree/bindings/power/domain-idle-state.txt + [4] Documentation/devicetree/bindings/power/domain-idle-state.yaml power-domains: $ref: '/schemas/types.yaml#/definitions/phandle-array' diff --git a/Documentation/devicetree/bindings/power/domain-idle-state.txt b/Documentation/devicetree/bindings/power/domain-idle-state.txt deleted file mode 100644 index eefc7ed22ca2..000000000000 --- a/Documentation/devicetree/bindings/power/domain-idle-state.txt +++ /dev/null @@ -1,33 +0,0 @@ -PM Domain Idle State Node: - -A domain idle state node represents the state parameters that will be used to -select the state when there are no active components in the domain. - -The state node has the following parameters - - -- compatible: - Usage: Required - Value type: - Definition: Must be "domain-idle-state". - -- entry-latency-us - Usage: Required - Value type: - Definition: u32 value representing worst case latency in - microseconds required to enter the idle state. - The exit-latency-us duration may be guaranteed - only after entry-latency-us has passed. - -- exit-latency-us - Usage: Required - Value type: - Definition: u32 value representing worst case latency - in microseconds required to exit the idle state. - -- min-residency-us - Usage: Required - Value type: - Definition: u32 value representing minimum residency duration - in microseconds after which the idle state will yield - power benefits after overcoming the overhead in entering -i the idle state. diff --git a/Documentation/devicetree/bindings/power/domain-idle-state.yaml b/Documentation/devicetree/bindings/power/domain-idle-state.yaml new file mode 100644 index 000000000000..dfba1af9abe5 --- /dev/null +++ b/Documentation/devicetree/bindings/power/domain-idle-state.yaml @@ -0,0 +1,64 @@ +# SPDX-License-Identifier: GPL-2.0 +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/power/domain-idle-state.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: PM Domain Idle States binding description + +maintainers: + - Ulf Hansson + +description: + A domain idle state node represents the state parameters that will be used to + select the state when there are no active components in the PM domain. + +properties: + $nodename: + const: domain-idle-states + +patternProperties: + "^(cpu|cluster|domain)-": + type: object + description: + Each state node represents a domain idle state description. + + properties: + compatible: + const: domain-idle-state + + entry-latency-us: + description: + The worst case latency in microseconds required to enter the idle + state. Note that, the exit-latency-us duration may be guaranteed only + after the entry-latency-us has passed. + + exit-latency-us: + description: + The worst case latency in microseconds required to exit the idle + state. + + min-residency-us: + description: + The minimum residency duration in microseconds after which the idle + state will yield power benefits, after overcoming the overhead while + entering the idle state. + + required: + - compatible + - entry-latency-us + - exit-latency-us + - min-residency-us + +examples: + - | + + domain-idle-states { + domain_retention: domain-retention { + compatible = "domain-idle-state"; + entry-latency-us = <20>; + exit-latency-us = <40>; + min-residency-us = <80>; + }; + }; +... diff --git a/Documentation/devicetree/bindings/power/power-domain.yaml b/Documentation/devicetree/bindings/power/power-domain.yaml index 455b573293ae..207e63ae10f9 100644 --- a/Documentation/devicetree/bindings/power/power-domain.yaml +++ b/Documentation/devicetree/bindings/power/power-domain.yaml @@ -29,18 +29,16 @@ properties: domain-idle-states: $ref: /schemas/types.yaml#/definitions/phandle-array - description: - A phandle of an idle-state that shall be soaked into a generic domain - power state. The idle state definitions are compatible with - domain-idle-state specified in - Documentation/devicetree/bindings/power/domain-idle-state.txt - phandles that are not compatible with domain-idle-state will be ignored. - The domain-idle-state property reflects the idle state of this PM domain - and not the idle states of the devices or sub-domains in the PM domain. - Devices and sub-domains have their own idle-states independent - of the parent domain's idle states. In the absence of this property, - the domain would be considered as capable of being powered-on - or powered-off. + description: | + Phandles of idle states that defines the available states for the + power-domain provider. The idle state definitions are compatible with the + domain-idle-state bindings, specified in ./domain-idle-state.yaml. + + Note that, the domain-idle-state property reflects the idle states of this + PM domain and not the idle states of the devices or sub-domains in the PM + domain. Devices and sub-domains have their own idle states independent of + the parent domain's idle states. In the absence of this property, the + domain would be considered as capable of being powered-on or powered-off. operating-points-v2: $ref: /schemas/types.yaml#/definitions/phandle-array diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt index 5b09b2deb483..08497ef26c7a 100644 --- a/Documentation/devicetree/bindings/power/power_domain.txt +++ b/Documentation/devicetree/bindings/power/power_domain.txt @@ -109,4 +109,4 @@ Example: required-opps = <&domain1_opp_1>; }; -[1]. Documentation/devicetree/bindings/power/domain-idle-state.txt +[1]. Documentation/devicetree/bindings/power/domain-idle-state.yaml From de5ed007a03d71daaa505f5daa4d3666530c7090 Mon Sep 17 00:00:00 2001 From: Artemy Kovalyov Date: Thu, 27 Feb 2020 13:39:18 +0200 Subject: [PATCH 344/400] IB/mlx5: Fix implicit ODP race Following race may occur because of the call_srcu and the placement of the synchronize_srcu vs the xa_erase. CPU0 CPU1 mlx5_ib_free_implicit_mr: destroy_unused_implicit_child_mr: xa_erase(odp_mkeys) synchronize_srcu() xa_lock(implicit_children) if (still in xarray) atomic_inc() call_srcu() xa_unlock(implicit_children) xa_erase(implicit_children): xa_lock(implicit_children) __xa_erase() xa_unlock(implicit_children) flush_workqueue() [..] free_implicit_child_mr_rcu: (via call_srcu) queue_work() WARN_ON(atomic_read()) [..] free_implicit_child_mr_work: (via wq) free_implicit_child_mr() mlx5_mr_cache_invalidate() mlx5_ib_update_xlt() <-- UMR QP fail atomic_dec() The wait_event() solves the race because it blocks until free_implicit_child_mr_work() completes. Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy") Link: https://lore.kernel.org/r/20200227113918.94432-1-leon@kernel.org Signed-off-by: Artemy Kovalyov Reviewed-by: Jason Gunthorpe Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 + drivers/infiniband/hw/mlx5/odp.c | 17 +++++++---------- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index d9bffcc93587..bb78142bca5e 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -636,6 +636,7 @@ struct mlx5_ib_mr { /* For ODP and implicit */ atomic_t num_deferred_work; + wait_queue_head_t q_deferred_work; struct xarray implicit_children; union { struct rcu_head rcu; diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 4216814ba871..bf50cd91f472 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -235,7 +235,8 @@ static void free_implicit_child_mr(struct mlx5_ib_mr *mr, bool need_imr_xlt) mr->parent = NULL; mlx5_mr_cache_free(mr->dev, mr); ib_umem_odp_release(odp); - atomic_dec(&imr->num_deferred_work); + if (atomic_dec_and_test(&imr->num_deferred_work)) + wake_up(&imr->q_deferred_work); } static void free_implicit_child_mr_work(struct work_struct *work) @@ -554,6 +555,7 @@ struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, imr->umem = &umem_odp->umem; imr->is_odp_implicit = true; atomic_set(&imr->num_deferred_work, 0); + init_waitqueue_head(&imr->q_deferred_work); xa_init(&imr->implicit_children); err = mlx5_ib_update_xlt(imr, 0, @@ -611,10 +613,7 @@ void mlx5_ib_free_implicit_mr(struct mlx5_ib_mr *imr) * under xa_lock while the child is in the xarray. Thus at this point * it is only decreasing, and all work holding it is now on the wq. */ - if (atomic_read(&imr->num_deferred_work)) { - flush_workqueue(system_unbound_wq); - WARN_ON(atomic_read(&imr->num_deferred_work)); - } + wait_event(imr->q_deferred_work, !atomic_read(&imr->num_deferred_work)); /* * Fence the imr before we destroy the children. This allows us to @@ -645,10 +644,7 @@ void mlx5_ib_fence_odp_mr(struct mlx5_ib_mr *mr) /* Wait for all running page-fault handlers to finish. */ synchronize_srcu(&mr->dev->odp_srcu); - if (atomic_read(&mr->num_deferred_work)) { - flush_workqueue(system_unbound_wq); - WARN_ON(atomic_read(&mr->num_deferred_work)); - } + wait_event(mr->q_deferred_work, !atomic_read(&mr->num_deferred_work)); dma_fence_odp_mr(mr); } @@ -1720,7 +1716,8 @@ static void destroy_prefetch_work(struct prefetch_mr_work *work) u32 i; for (i = 0; i < work->num_sge; ++i) - atomic_dec(&work->frags[i].mr->num_deferred_work); + if (atomic_dec_and_test(&work->frags[i].mr->num_deferred_work)) + wake_up(&work->frags[i].mr->q_deferred_work); kvfree(work); } From e38b55ea0443da35a50a3eb2079ad3612cf763b9 Mon Sep 17 00:00:00 2001 From: Maor Gottlieb Date: Thu, 27 Feb 2020 13:27:08 +0200 Subject: [PATCH 345/400] RDMA/core: Fix protection fault in ib_mr_pool_destroy Fix NULL pointer dereference in the error flow of ib_create_qp_user when accessing to uninitialized list pointers - rdma_mrs and sig_mrs. The following crash from syzkaller revealed it. kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI CPU: 1 PID: 23167 Comm: syz-executor.1 Not tainted 5.5.0-rc5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:ib_mr_pool_destroy+0x81/0x1f0 Code: 00 00 fc ff df 49 c1 ec 03 4d 01 fc e8 a8 ea 72 fe 41 80 3c 24 00 0f 85 62 01 00 00 48 8b 13 48 89 d6 4c 8d 6a c8 48 c1 ee 03 <42> 80 3c 3e 00 0f 85 34 01 00 00 48 8d 7a 08 4c 8b 02 48 89 fe 48 RSP: 0018:ffffc9000951f8b0 EFLAGS: 00010046 RAX: 0000000000040000 RBX: ffff88810f268038 RCX: ffffffff82c41628 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc9000951f850 RBP: ffff88810f268020 R08: 0000000000000004 R09: fffff520012a3f0a R10: 0000000000000001 R11: fffff520012a3f0a R12: ffffed1021e4d007 R13: ffffffffffffffc8 R14: 0000000000000246 R15: dffffc0000000000 FS: 00007f54bc788700(0000) GS:ffff88811b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000116920002 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rdma_rw_cleanup_mrs+0x15/0x30 ib_destroy_qp_user+0x674/0x7d0 ib_create_qp_user+0xb01/0x11c0 create_qp+0x1517/0x2130 ib_uverbs_create_qp+0x13e/0x190 ib_uverbs_write+0xaa5/0xdf0 __vfs_write+0x7c/0x100 vfs_write+0x168/0x4a0 ksys_write+0xc8/0x200 do_syscall_64+0x9c/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x465b49 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f54bc787c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000465b49 RDX: 0000000000000040 RSI: 0000000020000540 RDI: 0000000000000003 RBP: 00007f54bc787c70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f54bc7886bc R13: 00000000004ca2ec R14: 000000000070ded0 R15: 0000000000000005 Fixes: a060b5629ab0 ("IB/core: generic RDMA READ/WRITE API") Link: https://lore.kernel.org/r/20200227112708.93023-1-leon@kernel.org Signed-off-by: Maor Gottlieb Signed-off-by: Leon Romanovsky Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/core_priv.h | 14 ++++++++++++++ drivers/infiniband/core/uverbs_cmd.c | 9 --------- drivers/infiniband/core/verbs.c | 10 ---------- 3 files changed, 14 insertions(+), 19 deletions(-) diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h index b1457b3464d3..cf42acca4a3a 100644 --- a/drivers/infiniband/core/core_priv.h +++ b/drivers/infiniband/core/core_priv.h @@ -338,6 +338,20 @@ static inline struct ib_qp *_ib_create_qp(struct ib_device *dev, qp->pd = pd; qp->uobject = uobj; qp->real_qp = qp; + + qp->qp_type = attr->qp_type; + qp->rwq_ind_tbl = attr->rwq_ind_tbl; + qp->send_cq = attr->send_cq; + qp->recv_cq = attr->recv_cq; + qp->srq = attr->srq; + qp->rwq_ind_tbl = attr->rwq_ind_tbl; + qp->event_handler = attr->event_handler; + + atomic_set(&qp->usecnt, 0); + spin_lock_init(&qp->mr_lock); + INIT_LIST_HEAD(&qp->rdma_mrs); + INIT_LIST_HEAD(&qp->sig_mrs); + /* * We don't track XRC QPs for now, because they don't have PD * and more importantly they are created internaly by driver, diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index 025933752e1d..060b4ebbd2ba 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -1445,16 +1445,7 @@ static int create_qp(struct uverbs_attr_bundle *attrs, if (ret) goto err_cb; - qp->pd = pd; - qp->send_cq = attr.send_cq; - qp->recv_cq = attr.recv_cq; - qp->srq = attr.srq; - qp->rwq_ind_tbl = ind_tbl; - qp->event_handler = attr.event_handler; - qp->qp_type = attr.qp_type; - atomic_set(&qp->usecnt, 0); atomic_inc(&pd->usecnt); - qp->port = 0; if (attr.send_cq) atomic_inc(&attr.send_cq->usecnt); if (attr.recv_cq) diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c index 3ebae3b65c28..e62c9dfc7837 100644 --- a/drivers/infiniband/core/verbs.c +++ b/drivers/infiniband/core/verbs.c @@ -1185,16 +1185,6 @@ struct ib_qp *ib_create_qp_user(struct ib_pd *pd, if (ret) goto err; - qp->qp_type = qp_init_attr->qp_type; - qp->rwq_ind_tbl = qp_init_attr->rwq_ind_tbl; - - atomic_set(&qp->usecnt, 0); - qp->mrs_used = 0; - spin_lock_init(&qp->mr_lock); - INIT_LIST_HEAD(&qp->rdma_mrs); - INIT_LIST_HEAD(&qp->sig_mrs); - qp->port = 0; - if (qp_init_attr->qp_type == IB_QPT_XRC_TGT) { struct ib_qp *xrc_qp = create_xrc_qp_user(qp, qp_init_attr, udata); From a4e63bce1414df7ab6eb82ca9feb8494ce13e554 Mon Sep 17 00:00:00 2001 From: Jason Gunthorpe Date: Thu, 27 Feb 2020 13:41:18 +0200 Subject: [PATCH 346/400] RDMA/odp: Ensure the mm is still alive before creating an implicit child Registration of a mmu_notifier requires the caller to hold a mmget() on the mm as registration is not permitted to race with exit_mmap(). There is a BUG_ON inside the mmu_notifier to guard against this. Normally creating a umem is done against current which implicitly holds the mmget(), however an implicit ODP child is created from a pagefault work queue and is not guaranteed to have a mmget(). Call mmget() around this registration and abort faulting if the MM has gone to exit_mmap(). Before the patch below the notifier was registered when the implicit ODP parent was created, so there was no chance to register a notifier outside of current. Fixes: c571feca2dc9 ("RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'") Link: https://lore.kernel.org/r/20200227114118.94736-1-leon@kernel.org Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/umem_odp.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index b8c657b28380..cd656ad4953b 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -181,14 +181,28 @@ ib_umem_odp_alloc_child(struct ib_umem_odp *root, unsigned long addr, odp_data->page_shift = PAGE_SHIFT; odp_data->notifier.ops = ops; + /* + * A mmget must be held when registering a notifier, the owming_mm only + * has a mm_grab at this point. + */ + if (!mmget_not_zero(umem->owning_mm)) { + ret = -EFAULT; + goto out_free; + } + odp_data->tgid = get_pid(root->tgid); ret = ib_init_umem_odp(odp_data, ops); - if (ret) { - put_pid(odp_data->tgid); - kfree(odp_data); - return ERR_PTR(ret); - } + if (ret) + goto out_tgid; + mmput(umem->owning_mm); return odp_data; + +out_tgid: + put_pid(odp_data->tgid); + mmput(umem->owning_mm); +out_free: + kfree(odp_data); + return ERR_PTR(ret); } EXPORT_SYMBOL(ib_umem_odp_alloc_child); From 78f34a16c28654cb47791257006f90d0948f2f0c Mon Sep 17 00:00:00 2001 From: Mark Zhang Date: Thu, 27 Feb 2020 14:51:11 +0200 Subject: [PATCH 347/400] RDMA/nldev: Fix crash when set a QP to a new counter but QPN is missing This fixes the kernel crash when a RDMA_NLDEV_CMD_STAT_SET command is received, but the QP number parameter is not available. iwpm_register_pid: Unable to send a nlmsg (client = 2) infiniband syz1: RDMA CMA: cma_listen_on_dev, error -98 general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 0 PID: 9754 Comm: syz-executor069 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:nla_get_u32 include/net/netlink.h:1474 [inline] RIP: 0010:nldev_stat_set_doit+0x63c/0xb70 drivers/infiniband/core/nldev.c:1760 Code: fc 01 0f 84 58 03 00 00 e8 41 83 bf fb 4c 8b a3 58 fd ff ff 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 6d RSP: 0018:ffffc900068bf350 EFLAGS: 00010247 RAX: dffffc0000000000 RBX: ffffc900068bf728 RCX: ffffffff85b60470 RDX: 0000000000000000 RSI: ffffffff85b6047f RDI: 0000000000000004 RBP: ffffc900068bf750 R08: ffff88808c3ee140 R09: ffff8880a25e6010 R10: ffffed10144bcddc R11: ffff8880a25e6ee3 R12: 0000000000000000 R13: ffff88809acb0000 R14: ffff888092a42c80 R15: 000000009ef2e29a FS: 0000000001ff0880(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4733e34000 CR3: 00000000a9b27000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rdma_nl_rcv_msg drivers/infiniband/core/netlink.c:195 [inline] rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x5d9/0x980 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:672 ____sys_sendmsg+0x753/0x880 net/socket.c:2343 ___sys_sendmsg+0x100/0x170 net/socket.c:2397 __sys_sendmsg+0x105/0x1d0 net/socket.c:2430 __do_sys_sendmsg net/socket.c:2439 [inline] __se_sys_sendmsg net/socket.c:2437 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4403d9 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffc0efbc5c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004403d9 RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000004 RBP: 00000000006ca018 R08: 0000000000000008 R09: 00000000004002c8 R10: 000000000000004a R11: 0000000000000246 R12: 0000000000401c60 R13: 0000000000401cf0 R14: 0000000000000000 R15: 0000000000000000 Fixes: b389327df905 ("RDMA/nldev: Allow counter manual mode configration through RDMA netlink") Link: https://lore.kernel.org/r/20200227125111.99142-1-leon@kernel.org Reported-by: syzbot+bd4af81bc51ee0283445@syzkaller.appspotmail.com Signed-off-by: Mark Zhang Signed-off-by: Leon Romanovsky Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/nldev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c index 37b433aa7306..e0b0a91da696 100644 --- a/drivers/infiniband/core/nldev.c +++ b/drivers/infiniband/core/nldev.c @@ -1757,6 +1757,8 @@ static int nldev_stat_set_doit(struct sk_buff *skb, struct nlmsghdr *nlh, if (ret) goto err_msg; } else { + if (!tb[RDMA_NLDEV_ATTR_RES_LQPN]) + goto err_msg; qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]); if (tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]) { cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]); From 12e5eef0f4d8087ea7b559f6630be08ffea2d851 Mon Sep 17 00:00:00 2001 From: Bernard Metzler Date: Mon, 2 Mar 2020 16:58:14 +0100 Subject: [PATCH 348/400] RDMA/siw: Fix failure handling during device creation A failing call to ib_device_set_netdev() during device creation caused system crash due to xa_destroy of uninitialized xarray hit by device deallocation. Fixed by moving xarray initialization before potential device deallocation. Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Link: https://lore.kernel.org/r/20200302155814.9896-1-bmt@zurich.ibm.com Reported-by: syzbot+2e80962bedd9559fe0b3@syzkaller.appspotmail.com Signed-off-by: Bernard Metzler Signed-off-by: Jason Gunthorpe --- drivers/infiniband/sw/siw/siw_main.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/sw/siw/siw_main.c b/drivers/infiniband/sw/siw/siw_main.c index 96ed349c0939..5cd40fb9e20c 100644 --- a/drivers/infiniband/sw/siw/siw_main.c +++ b/drivers/infiniband/sw/siw/siw_main.c @@ -388,6 +388,9 @@ static struct siw_device *siw_device_create(struct net_device *netdev) { .max_segment_size = SZ_2G }; base_dev->num_comp_vectors = num_possible_cpus(); + xa_init_flags(&sdev->qp_xa, XA_FLAGS_ALLOC1); + xa_init_flags(&sdev->mem_xa, XA_FLAGS_ALLOC1); + ib_set_device_ops(base_dev, &siw_device_ops); rv = ib_device_set_netdev(base_dev, netdev, 1); if (rv) @@ -415,9 +418,6 @@ static struct siw_device *siw_device_create(struct net_device *netdev) sdev->attrs.max_srq_wr = SIW_MAX_SRQ_WR; sdev->attrs.max_srq_sge = SIW_MAX_SGE; - xa_init_flags(&sdev->qp_xa, XA_FLAGS_ALLOC1); - xa_init_flags(&sdev->mem_xa, XA_FLAGS_ALLOC1); - INIT_LIST_HEAD(&sdev->cep_list); INIT_LIST_HEAD(&sdev->qp_list); From 810dbc69087b08fd53e1cdd6c709f385bc2921ad Mon Sep 17 00:00:00 2001 From: Bernard Metzler Date: Mon, 2 Mar 2020 19:16:14 +0100 Subject: [PATCH 349/400] RDMA/iwcm: Fix iwcm work deallocation The dealloc_work_entries() function must update the work_free_list pointer while freeing its entries, since potentially called again on same list. A second iteration of the work list caused system crash. This happens, if work allocation fails during cma_iw_listen() and free_cm_id() tries to free the list again during cleanup. Fixes: 922a8e9fb2e0 ("RDMA: iWARP Connection Manager.") Link: https://lore.kernel.org/r/20200302181614.17042-1-bmt@zurich.ibm.com Reported-by: syzbot+cb0c054eabfba4342146@syzkaller.appspotmail.com Signed-off-by: Bernard Metzler Reviewed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe --- drivers/infiniband/core/iwcm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c index ade71823370f..da8adadf4755 100644 --- a/drivers/infiniband/core/iwcm.c +++ b/drivers/infiniband/core/iwcm.c @@ -159,8 +159,10 @@ static void dealloc_work_entries(struct iwcm_id_private *cm_id_priv) { struct list_head *e, *tmp; - list_for_each_safe(e, tmp, &cm_id_priv->work_free_list) + list_for_each_safe(e, tmp, &cm_id_priv->work_free_list) { + list_del(e); kfree(list_entry(e, struct iwcm_work, free_list)); + } } static int alloc_work_entries(struct iwcm_id_private *cm_id_priv, int count) From aa2734202acc506d09c8e641db4da161f902df27 Mon Sep 17 00:00:00 2001 From: Damien Le Moal Date: Wed, 12 Feb 2020 19:34:24 +0900 Subject: [PATCH 350/400] riscv: Force flat memory model with no-mmu Compilation errors trigger if ARCH_SPARSEMEM_ENABLE is enabled for a nommu kernel. Since the sparsemem model does not make sense anyway for the nommu case, do not allow selecting this option to always use the flatmem model. Signed-off-by: Damien Le Moal Reviewed-by: Anup Patel Reviewed-by: Palmer Dabbelt Signed-off-by: Palmer Dabbelt --- arch/riscv/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 73f029eae0cc..1a3b5a5276be 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -121,6 +121,7 @@ config ARCH_FLATMEM_ENABLE config ARCH_SPARSEMEM_ENABLE def_bool y + depends on MMU select SPARSEMEM_VMEMMAP_ENABLE config ARCH_SELECT_MEMORY_MODEL From 07f5ae220b36214cd9be489cf36ffe92d9c08944 Mon Sep 17 00:00:00 2001 From: Rob Herring Date: Mon, 2 Mar 2020 11:36:20 -0600 Subject: [PATCH 351/400] dt-bindings: bus: Drop empty compatible string in example In preparation to add generic checks of compatible strings, drop the compatible as '...' is not a valid compatible string. Cc: Maxime Ripard Cc: Chen-Yu Tsai Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Rob Herring --- .../devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml | 1 - 1 file changed, 1 deletion(-) diff --git a/Documentation/devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml b/Documentation/devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml index 9fe11ceecdba..80973619342d 100644 --- a/Documentation/devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml +++ b/Documentation/devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml @@ -70,7 +70,6 @@ examples: #size-cells = <0>; pmic@3e3 { - compatible = "..."; reg = <0x3e3>; /* ... */ From 7589238a8cf37331607c3222a64ac3140b29532d Mon Sep 17 00:00:00 2001 From: Brendan Higgins Date: Thu, 27 Feb 2020 16:00:01 -0800 Subject: [PATCH 352/400] Revert "software node: Simplify software_node_release() function" This reverts commit 3df85a1ae51f6b256982fe9d17c2dc5bfb4cc402. The reverted commit says "It's possible to release the node ID immediately when fwnode_remove_software_node() is called, no need to wait for software_node_release() with that." However, releasing the node ID before waiting for software_node_release() to be called causes the node ID to be released before the kobject and the underlying sysfs entry; this means there is a period of time where a sysfs entry exists that is associated with an unallocated node ID. Once consequence of this is that there is a race condition where it is possible to call fwnode_create_software_node() with no parent node specified (NULL) and have it fail with -EEXIST because the node ID that was assigned is still associated with a stale sysfs entry that hasn't been cleaned up yet. Although it is difficult to reproduce this race condition under normal conditions, it can be deterministically reproduced with the following minconfig on UML: CONFIG_KUNIT_DRIVER_PE_TEST=y CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_OBJECTS=y CONFIG_DEBUG_OBJECTS_TIMERS=y CONFIG_DEBUG_KOBJECT_RELEASE=y CONFIG_KUNIT=y Running the tests with this configuration causes the following failure: kobject: 'node0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 400) ok 1 - pe_test_uints sysfs: cannot create duplicate filename '/kernel/software_nodes/node0' CPU: 0 PID: 28 Comm: kunit_try_catch Not tainted 5.6.0-rc3-next-20200227 #14 kobject_add_internal failed for node0 with -EEXIST, don't try to register things with the same name in the same directory. kobject: 'node0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 100) # pe_test_uint_arrays: ASSERTION FAILED at drivers/base/test/property-entry-test.c:123 Expected node is not error, but is: -17 not ok 2 - pe_test_uint_arrays Reported-by: Heidi Fahim Signed-off-by: Brendan Higgins Reviewed-by: Heikki Krogerus Cc: 5.3+ # 5.3+ Signed-off-by: Rafael J. Wysocki --- drivers/base/swnode.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/base/swnode.c b/drivers/base/swnode.c index 0b081dee1e95..de8d3543e8fe 100644 --- a/drivers/base/swnode.c +++ b/drivers/base/swnode.c @@ -608,6 +608,13 @@ static void software_node_release(struct kobject *kobj) { struct swnode *swnode = kobj_to_swnode(kobj); + if (swnode->parent) { + ida_simple_remove(&swnode->parent->child_ids, swnode->id); + list_del(&swnode->entry); + } else { + ida_simple_remove(&swnode_root_ids, swnode->id); + } + if (swnode->allocated) { property_entries_free(swnode->node->properties); kfree(swnode->node); @@ -773,13 +780,6 @@ void fwnode_remove_software_node(struct fwnode_handle *fwnode) if (!swnode) return; - if (swnode->parent) { - ida_simple_remove(&swnode->parent->child_ids, swnode->id); - list_del(&swnode->entry); - } else { - ida_simple_remove(&swnode_root_ids, swnode->id); - } - kobject_put(&swnode->kobj); } EXPORT_SYMBOL_GPL(fwnode_remove_software_node); From a160eed4b783d7b250a32f7e5787c9867abc5686 Mon Sep 17 00:00:00 2001 From: Alexandre Ghiti Date: Mon, 17 Feb 2020 00:28:47 -0500 Subject: [PATCH 353/400] riscv: Fix range looking for kernel image memblock When looking for the memblock where the kernel lives, we should check that the memory range associated to the memblock entirely comprises the kernel image and not only intersects with it. Signed-off-by: Alexandre Ghiti Reviewed-by: Anup Patel Signed-off-by: Palmer Dabbelt --- arch/riscv/mm/init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 965a8cf4829c..fab855963c73 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -131,7 +131,7 @@ void __init setup_bootmem(void) for_each_memblock(memory, reg) { phys_addr_t end = reg->base + reg->size; - if (reg->base <= vmlinux_end && vmlinux_end <= end) { + if (reg->base <= vmlinux_start && vmlinux_end <= end) { mem_size = min(reg->size, (phys_addr_t)-PAGE_OFFSET); /* From 2ab7e274b86739f4ceed5d94b6879f2d07b2802f Mon Sep 17 00:00:00 2001 From: Yintian Tao Date: Fri, 28 Feb 2020 14:24:42 +0800 Subject: [PATCH 354/400] drm/amdgpu: clean wptr on wb when gpu recovery MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The TDR will be randomly failed due to compute ring test failure. If the compute ring wptr & 0x7ff(ring_buf_mask) is 0x100 then after map mqd the compute ring rptr will be synced with 0x100. And the ring test packet size is also 0x100. Then after invocation of amdgpu_ring_commit, the cp will not really handle the packet on the ring buffer because rptr is equal to wptr. Signed-off-by: Yintian Tao Acked-by: Christian König Reviewed-by: Monk Liu Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 1 + drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 22bbb36c768e..ced29790217c 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -3513,6 +3513,7 @@ static int gfx_v10_0_kcq_init_queue(struct amdgpu_ring *ring) /* reset ring buffer */ ring->wptr = 0; + atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], 0); amdgpu_ring_clear_ring(ring); } else { amdgpu_ring_clear_ring(ring); diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index 3afdbbd6aaad..889154a78c4a 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -3663,6 +3663,7 @@ static int gfx_v9_0_kcq_init_queue(struct amdgpu_ring *ring) /* reset ring buffer */ ring->wptr = 0; + atomic64_set((atomic64_t *)&adev->wb.wb[ring->wptr_offs], 0); amdgpu_ring_clear_ring(ring); } else { amdgpu_ring_clear_ring(ring); From 59bee45b9712c759ea4d3dcc4eff1752f3a66558 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Tue, 3 Mar 2020 23:28:47 +1100 Subject: [PATCH 355/400] powerpc/mm: Fix missing KUAP disable in flush_coherent_icache() Stefan reported a strange kernel fault which turned out to be due to a missing KUAP disable in flush_coherent_icache() called from flush_icache_range(). The fault looks like: Kernel attempted to access user page (7fffc30d9c00) - exploit attempt? (uid: 1009) BUG: Unable to handle kernel data access on read at 0x7fffc30d9c00 Faulting instruction address: 0xc00000000007232c Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV CPU: 35 PID: 5886 Comm: sigtramp Not tainted 5.6.0-rc2-gcc-8.2.0-00003-gfc37a1632d40 #79 NIP: c00000000007232c LR: c00000000003b7fc CTR: 0000000000000000 REGS: c000001e11093940 TRAP: 0300 Not tainted (5.6.0-rc2-gcc-8.2.0-00003-gfc37a1632d40) MSR: 900000000280b033 CR: 28000884 XER: 00000000 CFAR: c0000000000722fc DAR: 00007fffc30d9c00 DSISR: 08000000 IRQMASK: 0 GPR00: c00000000003b7fc c000001e11093bd0 c0000000023ac200 00007fffc30d9c00 GPR04: 00007fffc30d9c18 0000000000000000 c000001e11093bd4 0000000000000000 GPR08: 0000000000000000 0000000000000001 0000000000000000 c000001e1104ed80 GPR12: 0000000000000000 c000001fff6ab380 c0000000016be2d0 4000000000000000 GPR16: c000000000000000 bfffffffffffffff 0000000000000000 0000000000000000 GPR20: 00007fffc30d9c00 00007fffc30d8f58 00007fffc30d9c18 00007fffc30d9c20 GPR24: 00007fffc30d9c18 0000000000000000 c000001e11093d90 c000001e1104ed80 GPR28: c000001e11093e90 0000000000000000 c0000000023d9d18 00007fffc30d9c00 NIP flush_icache_range+0x5c/0x80 LR handle_rt_signal64+0x95c/0xc2c Call Trace: 0xc000001e11093d90 (unreliable) handle_rt_signal64+0x93c/0xc2c do_notify_resume+0x310/0x430 ret_from_except_lite+0x70/0x74 Instruction dump: 409e002c 7c0802a6 3c62ff31 3863f6a0 f8010080 48195fed 60000000 48fe4c8d 60000000 e8010080 7c0803a6 7c0004ac <7c00ffac> 7c0004ac 4c00012c 38210070 This path through handle_rt_signal64() to setup_trampoline() and flush_icache_range() is only triggered by 64-bit processes that have unmapped their VDSO, which is rare. flush_icache_range() takes a range of addresses to flush. In flush_coherent_icache() we implement an optimisation for CPUs where we know we don't actually have to flush the whole range, we just need to do a single icbi. However we still execute the icbi on the user address of the start of the range we're flushing. On CPUs that also implement KUAP (Power9) that leads to the spurious fault above. We should be able to pass any address, including a kernel address, to the icbi on these CPUs, which would avoid any interaction with KUAP. But I don't want to make that change in a bug fix, just in case it surfaces some strange behaviour on some CPU. So for now just disable KUAP around the icbi. Note the icbi is treated as a load, so we allow read access, not write as you'd expect. Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Stefan Berger Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20200303235708.26004-1-mpe@ellerman.id.au --- arch/powerpc/mm/mem.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index ef7b1119b2e2..1c07d5a3f543 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -373,7 +373,9 @@ static inline bool flush_coherent_icache(unsigned long addr) */ if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) { mb(); /* sync */ + allow_read_from_user((const void __user *)addr, L1_CACHE_BYTES); icbi((void *)addr); + prevent_read_from_user((const void __user *)addr, L1_CACHE_BYTES); mb(); /* sync */ isync(); return true; From 3fb83cbee1de58fcd5d22f1db89460bb7c08b6e8 Mon Sep 17 00:00:00 2001 From: Axel Lin Date: Wed, 4 Mar 2020 22:02:41 +0800 Subject: [PATCH 356/400] ASoC: wm8741: Fix typo in Kconfig prompt Fix trivial copy-n-paste mistake. Signed-off-by: Axel Lin Link: https://lore.kernel.org/r/20200304140241.340-1-axel.lin@ingics.com Signed-off-by: Mark Brown --- sound/soc/codecs/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig index 7e90f5d83097..ea912439e446 100644 --- a/sound/soc/codecs/Kconfig +++ b/sound/soc/codecs/Kconfig @@ -1406,7 +1406,7 @@ config SND_SOC_WM8737 depends on SND_SOC_I2C_AND_SPI config SND_SOC_WM8741 - tristate "Wolfson Microelectronics WM8737 DAC" + tristate "Wolfson Microelectronics WM8741 DAC" depends on SND_SOC_I2C_AND_SPI config SND_SOC_WM8750 From 6198adeaf21536f426a79f3d490651e52fd76d60 Mon Sep 17 00:00:00 2001 From: Lukas Bulwahn Date: Wed, 4 Mar 2020 22:26:00 +0100 Subject: [PATCH 357/400] MAINTAINERS: update ALLWINNER CPUFREQ DRIVER entry Commit b30d8cf5e171 ("dt-bindings: opp: Convert Allwinner H6 OPP to a schema") converted in Documentation/devicetree/bindings/opp/ the file sun50i-nvmem-cpufreq.txt to allwinner,sun50i-h6-operating-points.yaml. Since then, ./scripts/get_maintainer.pl --self-test complains: warning: no file matches \ F: Documentation/devicetree/bindings/opp/sun50i-nvmem-cpufreq.txt Adjust the file pattern in the ALLWINNER CPUFREQ DRIVER entry. Signed-off-by: Lukas Bulwahn Signed-off-by: Rob Herring --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 35e9b133c424..74a6d080ef6e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -693,7 +693,7 @@ ALLWINNER CPUFREQ DRIVER M: Yangtao Li L: linux-pm@vger.kernel.org S: Maintained -F: Documentation/devicetree/bindings/opp/sun50i-nvmem-cpufreq.txt +F: Documentation/devicetree/bindings/opp/allwinner,sun50i-h6-operating-points.yaml F: drivers/cpufreq/sun50i-cpufreq-nvmem.c ALLWINNER CRYPTO DRIVERS From acb4d372a0311cb1c2b03e471007708b2a50c5da Mon Sep 17 00:00:00 2001 From: Sasha Levin Date: Wed, 5 Feb 2020 16:32:42 -0500 Subject: [PATCH 358/400] Hyper-V: Drop Sasha Levin from the Hyper-V maintainers Signed-off-by: Sasha Levin Signed-off-by: Wei Liu --- MAINTAINERS | 1 - 1 file changed, 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 6158a143a13e..75c907763482 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7738,7 +7738,6 @@ Hyper-V CORE AND DRIVERS M: "K. Y. Srinivasan" M: Haiyang Zhang M: Stephen Hemminger -M: Sasha Levin T: git git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git L: linux-hyperv@vger.kernel.org S: Supported From 8c1b0767ae0c4b18cd967556aa6ddc7aab5bef0d Mon Sep 17 00:00:00 2001 From: Wei Liu Date: Wed, 4 Mar 2020 14:47:09 +0000 Subject: [PATCH 359/400] Hyper-V: add myself as a maintainer Signed-off-by: Wei Liu Acked-by: K. Y. Srinivasan --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 75c907763482..68eebf3650ac 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7738,6 +7738,7 @@ Hyper-V CORE AND DRIVERS M: "K. Y. Srinivasan" M: Haiyang Zhang M: Stephen Hemminger +M: Wei Liu T: git git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git L: linux-hyperv@vger.kernel.org S: Supported From 5313b2a58ef02e2b077d7ae8088043609e3155b0 Mon Sep 17 00:00:00 2001 From: Lucas Tanure Date: Sat, 29 Feb 2020 17:30:07 +0000 Subject: [PATCH 360/400] HID: hyperv: NULL check before some freeing functions is not needed. Fix below warnings reported by coccicheck: drivers/hid/hid-hyperv.c:197:2-7: WARNING: NULL check before some freeing functions is not needed. drivers/hid/hid-hyperv.c:211:2-7: WARNING: NULL check before some freeing functions is not needed. Signed-off-by: Lucas Tanure Reviewed-by: Michael Kelley Reviewed-by: Wei Liu Acked-by: Benjamin Tissoires Signed-off-by: Wei Liu --- drivers/hid/hid-hyperv.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/hid/hid-hyperv.c b/drivers/hid/hid-hyperv.c index dddfca555df9..0b6ee1dee625 100644 --- a/drivers/hid/hid-hyperv.c +++ b/drivers/hid/hid-hyperv.c @@ -193,8 +193,7 @@ static void mousevsc_on_receive_device_info(struct mousevsc_dev *input_device, goto cleanup; /* The pointer is not NULL when we resume from hibernation */ - if (input_device->hid_desc != NULL) - kfree(input_device->hid_desc); + kfree(input_device->hid_desc); input_device->hid_desc = kmemdup(desc, desc->bLength, GFP_ATOMIC); if (!input_device->hid_desc) @@ -207,8 +206,7 @@ static void mousevsc_on_receive_device_info(struct mousevsc_dev *input_device, } /* The pointer is not NULL when we resume from hibernation */ - if (input_device->report_desc != NULL) - kfree(input_device->report_desc); + kfree(input_device->report_desc); input_device->report_desc = kzalloc(input_device->report_desc_size, GFP_ATOMIC); From 78def224f59c05d00e815be946ec229719ccf377 Mon Sep 17 00:00:00 2001 From: Kailang Yang Date: Thu, 20 Feb 2020 15:21:54 +0800 Subject: [PATCH 361/400] ALSA: hda/realtek - Add Headset Mic supported Dell desktop platform supported headset Mic. Add pin verb to enable headset Mic. This platform only support fixed type headset for Iphone type. Signed-off-by: Kailang Yang Cc: Link: https://lore.kernel.org/r/b9da28d772ef43088791b0f3675929e7@realtek.com Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 31bee0512334..128c7e9df6de 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -7117,6 +7117,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1028, 0x0935, "Dell", ALC274_FIXUP_DELL_AIO_LINEOUT_VERB), SND_PCI_QUIRK(0x1028, 0x097e, "Dell Precision", ALC289_FIXUP_DUAL_SPK), SND_PCI_QUIRK(0x1028, 0x097d, "Dell Precision", ALC289_FIXUP_DUAL_SPK), + SND_PCI_QUIRK(0x1028, 0x098d, "Dell Precision", ALC233_FIXUP_ASUS_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1028, 0x09bf, "Dell Precision", ALC233_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2), From 76f7dec08fd64e9e3ad0810a1a8a60b0a846d348 Mon Sep 17 00:00:00 2001 From: Kailang Yang Date: Mon, 10 Feb 2020 16:30:26 +0800 Subject: [PATCH 362/400] ALSA: hda/realtek - Add Headset Button supported for ThinkPad X1 ThinkPad want to support Headset Button control. This patch will enable it. Signed-off-by: Kailang Yang Cc: Link: https://lore.kernel.org/r/7f0b7128f40f41f6b5582ff610adc33d@realtek.com Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 128c7e9df6de..a8d66c1e4991 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5920,7 +5920,7 @@ enum { ALC289_FIXUP_DUAL_SPK, ALC294_FIXUP_SPK2_TO_DAC1, ALC294_FIXUP_ASUS_DUAL_SPK, - + ALC285_FIXUP_THINKPAD_HEADSET_JACK, }; static const struct hda_fixup alc269_fixups[] = { @@ -7042,7 +7042,12 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC294_FIXUP_SPK2_TO_DAC1 }, - + [ALC285_FIXUP_THINKPAD_HEADSET_JACK] = { + .type = HDA_FIXUP_FUNC, + .v.func = alc_fixup_headset_jack, + .chained = true, + .chain_id = ALC285_FIXUP_SPEAKER2_TO_DAC1 + }, }; static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -7278,8 +7283,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x17aa, 0x224c, "Thinkpad", ALC298_FIXUP_TPT470_DOCK), SND_PCI_QUIRK(0x17aa, 0x224d, "Thinkpad", ALC298_FIXUP_TPT470_DOCK), SND_PCI_QUIRK(0x17aa, 0x225d, "Thinkpad T480", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), - SND_PCI_QUIRK(0x17aa, 0x2292, "Thinkpad X1 Yoga 7th", ALC285_FIXUP_SPEAKER2_TO_DAC1), - SND_PCI_QUIRK(0x17aa, 0x2293, "Thinkpad X1 Carbon 7th", ALC285_FIXUP_SPEAKER2_TO_DAC1), + SND_PCI_QUIRK(0x17aa, 0x2292, "Thinkpad X1 Yoga 7th", ALC285_FIXUP_THINKPAD_HEADSET_JACK), + SND_PCI_QUIRK(0x17aa, 0x2293, "Thinkpad X1 Carbon 7th", ALC285_FIXUP_THINKPAD_HEADSET_JACK), SND_PCI_QUIRK(0x17aa, 0x30bb, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY), SND_PCI_QUIRK(0x17aa, 0x30e2, "ThinkCentre AIO", ALC233_FIXUP_LENOVO_LINE2_MIC_HOTKEY), SND_PCI_QUIRK(0x17aa, 0x310c, "ThinkCentre Station", ALC294_FIXUP_LENOVO_MIC_LOCATION), From 0d45e86d2267d5bdf7bbb631499788da1c27ceb2 Mon Sep 17 00:00:00 2001 From: Christian Lachner Date: Sun, 23 Feb 2020 10:24:16 +0100 Subject: [PATCH 363/400] ALSA: hda/realtek - Fix silent output on Gigabyte X570 Aorus Master The Gigabyte X570 Aorus Master motherboard with ALC1220 codec requires a similar workaround for Clevo laptops to enforce the DAC/mixer connection path. Set up a quirk entry for that. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=205275 Signed-off-by: Christian Lachner Cc: Link: https://lore.kernel.org/r/20200223092416.15016-2-gladiac@gmail.com Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index a8d66c1e4991..508d2be3d43d 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2447,6 +2447,7 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = { SND_PCI_QUIRK(0x1071, 0x8258, "Evesham Voyaeger", ALC882_FIXUP_EAPD), SND_PCI_QUIRK(0x1458, 0xa002, "Gigabyte EP45-DS3/Z87X-UD3H", ALC889_FIXUP_FRONT_HP_NO_PRESENCE), SND_PCI_QUIRK(0x1458, 0xa0b8, "Gigabyte AZ370-Gaming", ALC1220_FIXUP_GB_DUAL_CODECS), + SND_PCI_QUIRK(0x1458, 0xa0cd, "Gigabyte X570 Aorus Master", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1462, 0x1228, "MSI-GP63", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1462, 0x1276, "MSI-GL73", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1462, 0x1293, "MSI-GP65", ALC1220_FIXUP_CLEVO_P950), From 194bcf35bce4a236059816bc41b3db9c9c92a1bb Mon Sep 17 00:00:00 2001 From: "Tianci.Yin" Date: Fri, 28 Feb 2020 17:10:21 +0800 Subject: [PATCH 364/400] drm/amdgpu: disable 3D pipe 1 on Navi1x [why] CP firmware decide to skip setting the state for 3D pipe 1 for Navi1x as there is no use case. [how] Disable 3D pipe 1 on Navi1x. Reviewed-by: Feifei Xu Reviewed-by: Monk Liu Signed-off-by: Tianci.Yin Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 93 ++++++++++++++------------ 1 file changed, 49 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index ced29790217c..02702597ddeb 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -52,7 +52,7 @@ * 1. Primary ring * 2. Async ring */ -#define GFX10_NUM_GFX_RINGS 2 +#define GFX10_NUM_GFX_RINGS_NV1X 1 #define GFX10_MEC_HPD_SIZE 2048 #define F32_CE_PROGRAM_RAM_SIZE 65536 @@ -1304,7 +1304,7 @@ static int gfx_v10_0_sw_init(void *handle) case CHIP_NAVI14: case CHIP_NAVI12: adev->gfx.me.num_me = 1; - adev->gfx.me.num_pipe_per_me = 2; + adev->gfx.me.num_pipe_per_me = 1; adev->gfx.me.num_queue_per_pipe = 1; adev->gfx.mec.num_mec = 2; adev->gfx.mec.num_pipe_per_mec = 4; @@ -2710,18 +2710,20 @@ static int gfx_v10_0_cp_gfx_start(struct amdgpu_device *adev) amdgpu_ring_commit(ring); /* submit cs packet to copy state 0 to next available state */ - ring = &adev->gfx.gfx_ring[1]; - r = amdgpu_ring_alloc(ring, 2); - if (r) { - DRM_ERROR("amdgpu: cp failed to lock ring (%d).\n", r); - return r; + if (adev->gfx.num_gfx_rings > 1) { + /* maximum supported gfx ring is 2 */ + ring = &adev->gfx.gfx_ring[1]; + r = amdgpu_ring_alloc(ring, 2); + if (r) { + DRM_ERROR("amdgpu: cp failed to lock ring (%d).\n", r); + return r; + } + + amdgpu_ring_write(ring, PACKET3(PACKET3_CLEAR_STATE, 0)); + amdgpu_ring_write(ring, 0); + + amdgpu_ring_commit(ring); } - - amdgpu_ring_write(ring, PACKET3(PACKET3_CLEAR_STATE, 0)); - amdgpu_ring_write(ring, 0); - - amdgpu_ring_commit(ring); - return 0; } @@ -2818,39 +2820,41 @@ static int gfx_v10_0_cp_gfx_resume(struct amdgpu_device *adev) mutex_unlock(&adev->srbm_mutex); /* Init gfx ring 1 for pipe 1 */ - mutex_lock(&adev->srbm_mutex); - gfx_v10_0_cp_gfx_switch_pipe(adev, PIPE_ID1); - ring = &adev->gfx.gfx_ring[1]; - rb_bufsz = order_base_2(ring->ring_size / 8); - tmp = REG_SET_FIELD(0, CP_RB1_CNTL, RB_BUFSZ, rb_bufsz); - tmp = REG_SET_FIELD(tmp, CP_RB1_CNTL, RB_BLKSZ, rb_bufsz - 2); - WREG32_SOC15(GC, 0, mmCP_RB1_CNTL, tmp); - /* Initialize the ring buffer's write pointers */ - ring->wptr = 0; - WREG32_SOC15(GC, 0, mmCP_RB1_WPTR, lower_32_bits(ring->wptr)); - WREG32_SOC15(GC, 0, mmCP_RB1_WPTR_HI, upper_32_bits(ring->wptr)); - /* Set the wb address wether it's enabled or not */ - rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4); - WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR, lower_32_bits(rptr_addr)); - WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & - CP_RB1_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK); - wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4); - WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO, - lower_32_bits(wptr_gpu_addr)); - WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI, - upper_32_bits(wptr_gpu_addr)); + if (adev->gfx.num_gfx_rings > 1) { + mutex_lock(&adev->srbm_mutex); + gfx_v10_0_cp_gfx_switch_pipe(adev, PIPE_ID1); + /* maximum supported gfx ring is 2 */ + ring = &adev->gfx.gfx_ring[1]; + rb_bufsz = order_base_2(ring->ring_size / 8); + tmp = REG_SET_FIELD(0, CP_RB1_CNTL, RB_BUFSZ, rb_bufsz); + tmp = REG_SET_FIELD(tmp, CP_RB1_CNTL, RB_BLKSZ, rb_bufsz - 2); + WREG32_SOC15(GC, 0, mmCP_RB1_CNTL, tmp); + /* Initialize the ring buffer's write pointers */ + ring->wptr = 0; + WREG32_SOC15(GC, 0, mmCP_RB1_WPTR, lower_32_bits(ring->wptr)); + WREG32_SOC15(GC, 0, mmCP_RB1_WPTR_HI, upper_32_bits(ring->wptr)); + /* Set the wb address wether it's enabled or not */ + rptr_addr = adev->wb.gpu_addr + (ring->rptr_offs * 4); + WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR, lower_32_bits(rptr_addr)); + WREG32_SOC15(GC, 0, mmCP_RB1_RPTR_ADDR_HI, upper_32_bits(rptr_addr) & + CP_RB1_RPTR_ADDR_HI__RB_RPTR_ADDR_HI_MASK); + wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4); + WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_LO, + lower_32_bits(wptr_gpu_addr)); + WREG32_SOC15(GC, 0, mmCP_RB_WPTR_POLL_ADDR_HI, + upper_32_bits(wptr_gpu_addr)); - mdelay(1); - WREG32_SOC15(GC, 0, mmCP_RB1_CNTL, tmp); + mdelay(1); + WREG32_SOC15(GC, 0, mmCP_RB1_CNTL, tmp); - rb_addr = ring->gpu_addr >> 8; - WREG32_SOC15(GC, 0, mmCP_RB1_BASE, rb_addr); - WREG32_SOC15(GC, 0, mmCP_RB1_BASE_HI, upper_32_bits(rb_addr)); - WREG32_SOC15(GC, 0, mmCP_RB1_ACTIVE, 1); - - gfx_v10_0_cp_gfx_set_doorbell(adev, ring); - mutex_unlock(&adev->srbm_mutex); + rb_addr = ring->gpu_addr >> 8; + WREG32_SOC15(GC, 0, mmCP_RB1_BASE, rb_addr); + WREG32_SOC15(GC, 0, mmCP_RB1_BASE_HI, upper_32_bits(rb_addr)); + WREG32_SOC15(GC, 0, mmCP_RB1_ACTIVE, 1); + gfx_v10_0_cp_gfx_set_doorbell(adev, ring); + mutex_unlock(&adev->srbm_mutex); + } /* Switch to pipe 0 */ mutex_lock(&adev->srbm_mutex); gfx_v10_0_cp_gfx_switch_pipe(adev, PIPE_ID0); @@ -3967,7 +3971,8 @@ static int gfx_v10_0_early_init(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle; - adev->gfx.num_gfx_rings = GFX10_NUM_GFX_RINGS; + adev->gfx.num_gfx_rings = GFX10_NUM_GFX_RINGS_NV1X; + adev->gfx.num_compute_rings = AMDGPU_MAX_COMPUTE_RINGS; gfx_v10_0_set_kiq_pm4_funcs(adev); From 5ac7fd2f597b88ee81f4748ee50cab06192a8dc3 Mon Sep 17 00:00:00 2001 From: Bhawanpreet Lakha Date: Thu, 20 Feb 2020 11:16:14 -0500 Subject: [PATCH 365/400] drm/amd/display: Clear link settings on MST disable connector [Why] If we have a single MST display and we disconnect it, we dont disable that link. This causes the old link settings to still exist Now on a replug for MST we think its a link loss and will try to reallocate mst payload which will fail, throwing warning below. [ 129.374192] [drm] Failed to updateMST allocation table forpipe idx:0 [ 129.374206] ------------[ cut here ]------------ [ 129.374284] WARNING: CPU: 14 PID: 1710 at drivers/gpu/drm/amd/amdgpu/../dal-dev/dc/core/dc_link.c:3153 dc_link_allocate_mst_payload+0x1f7/0x220 [amdgpu] [ 129.374285] Modules linked in: amdgpu(OE) amd_iommu_v2 gpu_sched ttm drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt binfmt_misc nls_iso8859_1 edac_mce_amd snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio kvm snd_hda_codec_hdmi snd_hda_intel snd_intel_nhlt snd_hda_codec irqbypass snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul snd_seq crc32_pclmul ghash_clmulni_intel snd_seq_device snd_timer snd aesni_intel eeepc_wmi crypto_simd asus_wmi joydev cryptd sparse_keymap input_leds soundcore video glue_helper wmi_bmof mxm_wmi k10temp ccp mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid igb i2c_algo_bit ahci dca i2c_piix4 libahci gpio_amdpt wmi gpio_generic [ 129.374318] CPU: 14 PID: 1710 Comm: kworker/14:2 Tainted: G W OE 5.4.0-rc7bhawan+ #480 [ 129.374318] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 0515 03/30/2017 [ 129.374397] Workqueue: events dm_irq_work_func [amdgpu] [ 129.374468] RIP: 0010:dc_link_allocate_mst_payload+0x1f7/0x220 [amdgpu] [ 129.374470] Code: 52 20 e8 1c 63 ad f4 48 8b 5d d0 65 48 33 1c 25 28 00 00 00 b8 01 00 00 00 75 16 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b e9 fa fe ff ff e8 ed 5b d6 f3 41 0f b6 b6 c4 02 00 00 48 c7 [ 129.374471] RSP: 0018:ffff9f9141e7fcc0 EFLAGS: 00010246 [ 129.374472] RAX: 0000000000000000 RBX: ffff91ef0762f800 RCX: 0000000000000000 [ 129.374473] RDX: 0000000000000005 RSI: ffffffffc0c4a988 RDI: 0000000000000004 [ 129.374474] RBP: ffff9f9141e7fd10 R08: 0000000000000005 R09: 0000000000000000 [ 129.374475] R10: 0000000000000002 R11: 0000000000000001 R12: ffff91eebd510c00 [ 129.374475] R13: ffff91eebd510e58 R14: ffff91ef052c01b8 R15: 0000000000000006 [ 129.374476] FS: 0000000000000000(0000) GS:ffff91ef0ef80000(0000) knlGS:0000000000000000 [ 129.374477] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 129.374478] CR2: 000055623ea01d50 CR3: 0000000408a8c000 CR4: 00000000003406e0 [ 129.374479] Call Trace: [ 129.374550] dc_link_reallocate_mst_payload+0x12e/0x150 [amdgpu] [ 129.374617] dc_link_handle_hpd_rx_irq+0x6d4/0x6e0 [amdgpu] [ 129.374693] handle_hpd_rx_irq+0x77/0x310 [amdgpu] [ 129.374768] dm_irq_work_func+0x53/0x70 [amdgpu] [ 129.374774] process_one_work+0x1fd/0x3f0 [ 129.374776] worker_thread+0x255/0x410 [ 129.374778] kthread+0x121/0x140 [ 129.374780] ? process_one_work+0x3f0/0x3f0 [ 129.374781] ? kthread_park+0x90/0x90 [ 129.374785] ret_from_fork+0x22/0x40 [How] when we disable MST we should clear the cur link settings (lane_count=0 is good enough). This will cause us to not reallocate payloads earlier than expected and not throw the warning Signed-off-by: Bhawanpreet Lakha Reviewed-by: Hersen Wu Acked-by: Rodrigo Siqueira Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c index 5672f7765919..da73161043d5 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -451,6 +451,7 @@ static void dm_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr, aconnector->dc_sink); dc_sink_release(aconnector->dc_sink); aconnector->dc_sink = NULL; + aconnector->dc_link->cur_link_settings.lane_count = 0; } drm_connector_unregister(connector); From a0275dfc82c9034eefbeffd556cca6dd239d7925 Mon Sep 17 00:00:00 2001 From: Josip Pavic Date: Fri, 21 Feb 2020 12:26:19 -0500 Subject: [PATCH 366/400] drm/amd/display: fix dcc swath size calculations on dcn1 [Why] Swath sizes are being calculated incorrectly. The horizontal swath size should be the product of block height, viewport width, and bytes per element, but the calculation uses viewport height instead of width. The vertical swath size is similarly incorrectly calculated. The effect of this is that we report the wrong DCC caps. [How] Use viewport width in the horizontal swath size calculation and viewport height in the vertical swath size calculation. Signed-off-by: Josip Pavic Reviewed-by: Aric Cyr Acked-by: Rodrigo Siqueira Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c index f36a0d8cedfe..446ba0a7a4b3 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c @@ -840,8 +840,8 @@ static void hubbub1_det_request_size( hubbub1_get_blk256_size(&blk256_width, &blk256_height, bpe); - swath_bytes_horz_wc = height * blk256_height * bpe; - swath_bytes_vert_wc = width * blk256_width * bpe; + swath_bytes_horz_wc = width * blk256_height * bpe; + swath_bytes_vert_wc = height * blk256_width * bpe; *req128_horz_wc = (2 * swath_bytes_horz_wc <= detile_buf_size) ? false : /* full 256B request */ From 80381d40c9bf5218db06a7d7246c5478c95987ee Mon Sep 17 00:00:00 2001 From: Prike Liang Date: Mon, 2 Mar 2020 09:36:15 +0800 Subject: [PATCH 367/400] drm/amd/powerplay: fix pre-check condition for setting clock range This fix will handle some MP1 FW issue like as mclk dpm table in renoir has a reverse dpm clock layout and a zero frequency dpm level as following case. cat pp_dpm_mclk 0: 1200Mhz 1: 1200Mhz 2: 800Mhz 3: 0Mhz Signed-off-by: Prike Liang Reviewed-by: Evan Quan Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 2 +- drivers/gpu/drm/amd/powerplay/smu_v12_0.c | 3 --- 2 files changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c index 99ad4ddbe12f..ad8e9b5628e4 100644 --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c @@ -222,7 +222,7 @@ int smu_set_soft_freq_range(struct smu_context *smu, enum smu_clk_type clk_type, { int ret = 0; - if (min <= 0 && max <= 0) + if (min < 0 && max < 0) return -EINVAL; if (!smu_clk_dpm_is_enabled(smu, clk_type)) diff --git a/drivers/gpu/drm/amd/powerplay/smu_v12_0.c b/drivers/gpu/drm/amd/powerplay/smu_v12_0.c index 870e6db2907e..518e6597bf2d 100644 --- a/drivers/gpu/drm/amd/powerplay/smu_v12_0.c +++ b/drivers/gpu/drm/amd/powerplay/smu_v12_0.c @@ -458,9 +458,6 @@ int smu_v12_0_set_soft_freq_limited_range(struct smu_context *smu, enum smu_clk_ { int ret = 0; - if (max < min) - return -EINVAL; - switch (clk_type) { case SMU_GFXCLK: case SMU_SCLK: From ab65a371dd5f5cba6bd9a58a1a6d4115a71cc5c9 Mon Sep 17 00:00:00 2001 From: Prike Liang Date: Wed, 4 Mar 2020 10:36:21 +0800 Subject: [PATCH 368/400] drm/amd/powerplay: map mclk to fclk for COMBINATIONAL_BYPASS case When hit COMBINATIONAL_BYPASS the mclk will be bypass and can export fclk frequency to user usage. Signed-off-by: Prike Liang Reviewed-by: Evan Quan Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org --- drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c index 861e6410363b..568c041c2206 100644 --- a/drivers/gpu/drm/amd/powerplay/renoir_ppt.c +++ b/drivers/gpu/drm/amd/powerplay/renoir_ppt.c @@ -111,8 +111,8 @@ static struct smu_12_0_cmn2aisc_mapping renoir_clk_map[SMU_CLK_COUNT] = { CLK_MAP(GFXCLK, CLOCK_GFXCLK), CLK_MAP(SCLK, CLOCK_GFXCLK), CLK_MAP(SOCCLK, CLOCK_SOCCLK), - CLK_MAP(UCLK, CLOCK_UMCCLK), - CLK_MAP(MCLK, CLOCK_UMCCLK), + CLK_MAP(UCLK, CLOCK_FCLK), + CLK_MAP(MCLK, CLOCK_FCLK), }; static struct smu_12_0_cmn2aisc_mapping renoir_table_map[SMU_TABLE_COUNT] = { @@ -280,7 +280,7 @@ static int renoir_print_clk_levels(struct smu_context *smu, break; case SMU_MCLK: count = NUM_MEMCLK_DPM_LEVELS; - cur_value = metrics.ClockFrequency[CLOCK_UMCCLK]; + cur_value = metrics.ClockFrequency[CLOCK_FCLK]; break; case SMU_DCEFCLK: count = NUM_DCFCLK_DPM_LEVELS; From 09ed6ba43e659474878b22d40b141a01d09ec857 Mon Sep 17 00:00:00 2001 From: Hersen Wu Date: Thu, 13 Feb 2020 10:50:13 -0500 Subject: [PATCH 369/400] drm/amdgpu/display: navi1x copy dcn watermark clock settings to smu resume from s3 (v2) This interface is for dGPU Navi1x. Linux dc-pplib interface depends on window driver dc implementation. For Navi1x, clock settings of dcn watermarks are fixed. the settings should be passed to smu during boot up and resume from s3. boot up: dc calculate dcn watermark clock settings within dc_create, dcn20_resource_construct, then call pplib functions below to pass the settings to smu: smu_set_watermarks_for_clock_ranges smu_set_watermarks_table navi10_set_watermarks_table smu_write_watermarks_table For Renoir, clock settings of dcn watermark are also fixed values. dc has implemented different flow for window driver: dc_hardware_init / dc_set_power_state dcn10_init_hw notify_wm_ranges set_wm_ranges For Linux smu_set_watermarks_for_clock_ranges renoir_set_watermarks_table smu_write_watermarks_table dc_hardware_init -> amdgpu_dm_init dc_set_power_state --> dm_resume therefore, linux dc-pplib interface of navi10/12/14 is different from that of Renoir. v2: add missing unlock in error case Signed-off-by: Hersen Wu Reviewed-by: Evan Quan Signed-off-by: Alex Deucher --- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 69 +++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index e8f66fbf399e..e997251a8b57 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1422,6 +1422,73 @@ static void s3_handle_mst(struct drm_device *dev, bool suspend) drm_kms_helper_hotplug_event(dev); } +static int amdgpu_dm_smu_write_watermarks_table(struct amdgpu_device *adev) +{ + struct smu_context *smu = &adev->smu; + int ret = 0; + + if (!is_support_sw_smu(adev)) + return 0; + + /* This interface is for dGPU Navi1x.Linux dc-pplib interface depends + * on window driver dc implementation. + * For Navi1x, clock settings of dcn watermarks are fixed. the settings + * should be passed to smu during boot up and resume from s3. + * boot up: dc calculate dcn watermark clock settings within dc_create, + * dcn20_resource_construct + * then call pplib functions below to pass the settings to smu: + * smu_set_watermarks_for_clock_ranges + * smu_set_watermarks_table + * navi10_set_watermarks_table + * smu_write_watermarks_table + * + * For Renoir, clock settings of dcn watermark are also fixed values. + * dc has implemented different flow for window driver: + * dc_hardware_init / dc_set_power_state + * dcn10_init_hw + * notify_wm_ranges + * set_wm_ranges + * -- Linux + * smu_set_watermarks_for_clock_ranges + * renoir_set_watermarks_table + * smu_write_watermarks_table + * + * For Linux, + * dc_hardware_init -> amdgpu_dm_init + * dc_set_power_state --> dm_resume + * + * therefore, this function apply to navi10/12/14 but not Renoir + * * + */ + switch(adev->asic_type) { + case CHIP_NAVI10: + case CHIP_NAVI14: + case CHIP_NAVI12: + break; + default: + return 0; + } + + mutex_lock(&smu->mutex); + + /* pass data to smu controller */ + if ((smu->watermarks_bitmap & WATERMARKS_EXIST) && + !(smu->watermarks_bitmap & WATERMARKS_LOADED)) { + ret = smu_write_watermarks_table(smu); + + if (ret) { + mutex_unlock(&smu->mutex); + DRM_ERROR("Failed to update WMTABLE!\n"); + return ret; + } + smu->watermarks_bitmap |= WATERMARKS_LOADED; + } + + mutex_unlock(&smu->mutex); + + return 0; +} + /** * dm_hw_init() - Initialize DC device * @handle: The base driver device containing the amdgpu_dm device. @@ -1700,6 +1767,8 @@ static int dm_resume(void *handle) amdgpu_dm_irq_resume_late(adev); + amdgpu_dm_smu_write_watermarks_table(adev); + return 0; } From 8b33a134a9cc2a501f8fc731d91caef39237d495 Mon Sep 17 00:00:00 2001 From: Jian-Hong Pan Date: Tue, 25 Feb 2020 15:29:21 +0800 Subject: [PATCH 370/400] ALSA: hda/realtek - Enable the headset of ASUS B9450FA with ALC294 A headset on the laptop like ASUS B9450FA does not work, until quirk ALC294_FIXUP_ASUS_HPE is applied. Signed-off-by: Jian-Hong Pan Signed-off-by: Kailang Yang Cc: Link: https://lore.kernel.org/r/20200225072920.109199-1-jian-hong@endlessm.com Signed-off-by: Takashi Iwai --- sound/pci/hda/patch_realtek.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 508d2be3d43d..0ac06ff1a17c 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5922,6 +5922,7 @@ enum { ALC294_FIXUP_SPK2_TO_DAC1, ALC294_FIXUP_ASUS_DUAL_SPK, ALC285_FIXUP_THINKPAD_HEADSET_JACK, + ALC294_FIXUP_ASUS_HPE, }; static const struct hda_fixup alc269_fixups[] = { @@ -7049,6 +7050,17 @@ static const struct hda_fixup alc269_fixups[] = { .chained = true, .chain_id = ALC285_FIXUP_SPEAKER2_TO_DAC1 }, + [ALC294_FIXUP_ASUS_HPE] = { + .type = HDA_FIXUP_VERBS, + .v.verbs = (const struct hda_verb[]) { + /* Set EAPD high */ + { 0x20, AC_VERB_SET_COEF_INDEX, 0x0f }, + { 0x20, AC_VERB_SET_PROC_COEF, 0x7774 }, + { } + }, + .chained = true, + .chain_id = ALC294_FIXUP_ASUS_HEADSET_MIC + }, }; static const struct snd_pci_quirk alc269_fixup_tbl[] = { @@ -7214,6 +7226,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x16e3, "ASUS UX50", ALC269_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x1043, 0x17d1, "ASUS UX431FL", ALC294_FIXUP_ASUS_DUAL_SPK), SND_PCI_QUIRK(0x1043, 0x18b1, "Asus MJ401TA", ALC256_FIXUP_ASUS_HEADSET_MIC), + SND_PCI_QUIRK(0x1043, 0x19ce, "ASUS B9450FA", ALC294_FIXUP_ASUS_HPE), SND_PCI_QUIRK(0x1043, 0x1a13, "Asus G73Jw", ALC269_FIXUP_ASUS_G73JW), SND_PCI_QUIRK(0x1043, 0x1a30, "ASUS X705UD", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x1b13, "Asus U41SV", ALC269_FIXUP_INV_DMIC), From e8dc73c9f9ea554b36093dea23e4ca3b586105d7 Mon Sep 17 00:00:00 2001 From: "Gustavo A. R. Silva" Date: Wed, 26 Feb 2020 15:26:12 -0600 Subject: [PATCH 371/400] xen: Replace zero-length array with flexible-array member The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva Link: https://lore.kernel.org/r/20200226212612.GA4663@embeddedor Reviewed-by: Juergen Gross Signed-off-by: Boris Ostrovsky --- drivers/xen/xen-pciback/pciback.h | 2 +- include/xen/interface/io/tpmif.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/xen/xen-pciback/pciback.h b/drivers/xen/xen-pciback/pciback.h index ce1077e32466..7c95516a860f 100644 --- a/drivers/xen/xen-pciback/pciback.h +++ b/drivers/xen/xen-pciback/pciback.h @@ -52,7 +52,7 @@ struct xen_pcibk_dev_data { unsigned int ack_intr:1; /* .. and ACK-ing */ unsigned long handled; unsigned int irq; /* Saved in case device transitions to MSI/MSI-X */ - char irq_name[0]; /* xen-pcibk[000:04:00.0] */ + char irq_name[]; /* xen-pcibk[000:04:00.0] */ }; /* Used by XenBus and xen_pcibk_ops.c */ diff --git a/include/xen/interface/io/tpmif.h b/include/xen/interface/io/tpmif.h index 28e7dcd75e82..f8aa8bac5196 100644 --- a/include/xen/interface/io/tpmif.h +++ b/include/xen/interface/io/tpmif.h @@ -46,7 +46,7 @@ struct vtpm_shared_page { uint8_t pad; uint8_t nr_extra_pages; /* extra pages for long packets; may be zero */ - uint32_t extra_pages[0]; /* grant IDs; length in nr_extra_pages */ + uint32_t extra_pages[]; /* grant IDs; length in nr_extra_pages */ }; #endif From 1b6a51e86cce38cf4d48ce9c242120283ae2f603 Mon Sep 17 00:00:00 2001 From: Dongli Zhang Date: Tue, 3 Mar 2020 14:14:22 -0800 Subject: [PATCH 372/400] xenbus: req->body should be updated before req->state The req->body should be updated before req->state is updated and the order should be guaranteed by a barrier. Otherwise, read_reply() might return req->body = NULL. Below is sample callstack when the issue is reproduced on purpose by reordering the updates of req->body and req->state and adding delay in code between updates of req->state and req->body. [ 22.356105] general protection fault: 0000 [#1] SMP PTI [ 22.361185] CPU: 2 PID: 52 Comm: xenwatch Not tainted 5.5.0xen+ #6 [ 22.366727] Hardware name: Xen HVM domU, BIOS ... [ 22.372245] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60 ... ... [ 22.392163] RSP: 0018:ffffb2d64023fdf0 EFLAGS: 00010246 [ 22.395933] RAX: 0000000000000000 RBX: 75746e7562755f6d RCX: 0000000000000000 [ 22.400871] RDX: 0000000000000000 RSI: ffffb2d64023fdfc RDI: 75746e7562755f6d [ 22.405874] RBP: 0000000000000000 R08: 00000000000001e8 R09: 0000000000cdcdcd [ 22.410945] R10: ffffb2d6402ffe00 R11: ffff9d95395eaeb0 R12: ffff9d9535935000 [ 22.417613] R13: ffff9d9526d4a000 R14: ffff9d9526f4f340 R15: ffff9d9537654000 [ 22.423726] FS: 0000000000000000(0000) GS:ffff9d953bc80000(0000) knlGS:0000000000000000 [ 22.429898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 22.434342] CR2: 000000c4206a9000 CR3: 00000001ea3fc002 CR4: 00000000001606e0 [ 22.439645] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 22.444941] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 22.450342] Call Trace: [ 22.452509] simple_strtoull+0x27/0x70 [ 22.455572] xenbus_transaction_start+0x31/0x50 [ 22.459104] netback_changed+0x76c/0xcc1 [xen_netfront] [ 22.463279] ? find_watch+0x40/0x40 [ 22.466156] xenwatch_thread+0xb4/0x150 [ 22.469309] ? wait_woken+0x80/0x80 [ 22.472198] kthread+0x10e/0x130 [ 22.474925] ? kthread_park+0x80/0x80 [ 22.477946] ret_from_fork+0x35/0x40 [ 22.480968] Modules linked in: xen_kbdfront xen_fbfront(+) xen_netfront xen_blkfront [ 22.486783] ---[ end trace a9222030a747c3f7 ]--- [ 22.490424] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60 The virt_rmb() is added in the 'true' path of test_reply(). The "while" is changed to "do while" so that test_reply() is used as a read memory barrier. Signed-off-by: Dongli Zhang Link: https://lore.kernel.org/r/20200303221423.21962-1-dongli.zhang@oracle.com Reviewed-by: Julien Grall Signed-off-by: Boris Ostrovsky --- drivers/xen/xenbus/xenbus_comms.c | 2 ++ drivers/xen/xenbus/xenbus_xs.c | 9 ++++++--- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/xen/xenbus/xenbus_comms.c b/drivers/xen/xenbus/xenbus_comms.c index d239fc3c5e3d..852ed161fc2a 100644 --- a/drivers/xen/xenbus/xenbus_comms.c +++ b/drivers/xen/xenbus/xenbus_comms.c @@ -313,6 +313,8 @@ static int process_msg(void) req->msg.type = state.msg.type; req->msg.len = state.msg.len; req->body = state.body; + /* write body, then update state */ + virt_wmb(); req->state = xb_req_state_got_reply; req->cb(req); } else diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c index ddc18da61834..3a06eb699f33 100644 --- a/drivers/xen/xenbus/xenbus_xs.c +++ b/drivers/xen/xenbus/xenbus_xs.c @@ -191,8 +191,11 @@ static bool xenbus_ok(void) static bool test_reply(struct xb_req_data *req) { - if (req->state == xb_req_state_got_reply || !xenbus_ok()) + if (req->state == xb_req_state_got_reply || !xenbus_ok()) { + /* read req->state before all other fields */ + virt_rmb(); return true; + } /* Make sure to reread req->state each time. */ barrier(); @@ -202,7 +205,7 @@ static bool test_reply(struct xb_req_data *req) static void *read_reply(struct xb_req_data *req) { - while (req->state != xb_req_state_got_reply) { + do { wait_event(req->wq, test_reply(req)); if (!xenbus_ok()) @@ -216,7 +219,7 @@ static void *read_reply(struct xb_req_data *req) if (req->err) return ERR_PTR(req->err); - } + } while (req->state != xb_req_state_got_reply); return req->body; } From 8130b9d5b5abf26f9927b487c15319a187775f34 Mon Sep 17 00:00:00 2001 From: Dongli Zhang Date: Tue, 3 Mar 2020 14:14:23 -0800 Subject: [PATCH 373/400] xenbus: req->err should be updated before req->state This patch adds the barrier to guarantee that req->err is always updated before req->state. Otherwise, read_reply() would not return ERR_PTR(req->err) but req->body, when process_writes()->xb_write() is failed. Signed-off-by: Dongli Zhang Link: https://lore.kernel.org/r/20200303221423.21962-2-dongli.zhang@oracle.com Reviewed-by: Julien Grall Signed-off-by: Boris Ostrovsky --- drivers/xen/xenbus/xenbus_comms.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/xen/xenbus/xenbus_comms.c b/drivers/xen/xenbus/xenbus_comms.c index 852ed161fc2a..eb5151fc8efa 100644 --- a/drivers/xen/xenbus/xenbus_comms.c +++ b/drivers/xen/xenbus/xenbus_comms.c @@ -397,6 +397,8 @@ static int process_writes(void) if (state.req->state == xb_req_state_aborted) kfree(state.req); else { + /* write err, then update state */ + virt_wmb(); state.req->state = xb_req_state_got_reply; wake_up(&state.req->wq); } From 2f69a110e7bba3ec6bc089a2f736ca0941d887ed Mon Sep 17 00:00:00 2001 From: Juergen Gross Date: Thu, 5 Mar 2020 11:03:23 +0100 Subject: [PATCH 374/400] xen/xenbus: fix locking Commit 060eabe8fbe726 ("xenbus/backend: Protect xenbus callback with lock") introduced a bug by holding a lock while calling a function which might schedule. Fix that by using a semaphore instead. Fixes: 060eabe8fbe726 ("xenbus/backend: Protect xenbus callback with lock") Signed-off-by: Juergen Gross Link: https://lore.kernel.org/r/20200305100323.16736-1-jgross@suse.com Reviewed-by: Boris Ostrovsky Signed-off-by: Boris Ostrovsky --- drivers/xen/xenbus/xenbus_probe.c | 10 +++++----- drivers/xen/xenbus/xenbus_probe_backend.c | 5 +++-- include/xen/xenbus.h | 3 ++- 3 files changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c index 66975da4f3b6..8c4d05b687b7 100644 --- a/drivers/xen/xenbus/xenbus_probe.c +++ b/drivers/xen/xenbus/xenbus_probe.c @@ -239,9 +239,9 @@ int xenbus_dev_probe(struct device *_dev) goto fail; } - spin_lock(&dev->reclaim_lock); + down(&dev->reclaim_sem); err = drv->probe(dev, id); - spin_unlock(&dev->reclaim_lock); + up(&dev->reclaim_sem); if (err) goto fail_put; @@ -271,9 +271,9 @@ int xenbus_dev_remove(struct device *_dev) free_otherend_watch(dev); if (drv->remove) { - spin_lock(&dev->reclaim_lock); + down(&dev->reclaim_sem); drv->remove(dev); - spin_unlock(&dev->reclaim_lock); + up(&dev->reclaim_sem); } module_put(drv->driver.owner); @@ -473,7 +473,7 @@ int xenbus_probe_node(struct xen_bus_type *bus, goto fail; dev_set_name(&xendev->dev, "%s", devname); - spin_lock_init(&xendev->reclaim_lock); + sema_init(&xendev->reclaim_sem, 1); /* Register with generic device framework. */ err = device_register(&xendev->dev); diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c index 791f6fe01e91..9b2fbe69bccc 100644 --- a/drivers/xen/xenbus/xenbus_probe_backend.c +++ b/drivers/xen/xenbus/xenbus_probe_backend.c @@ -45,6 +45,7 @@ #include #include #include +#include #include #include @@ -257,10 +258,10 @@ static int backend_reclaim_memory(struct device *dev, void *data) drv = to_xenbus_driver(dev->driver); if (drv && drv->reclaim_memory) { xdev = to_xenbus_device(dev); - if (!spin_trylock(&xdev->reclaim_lock)) + if (down_trylock(&xdev->reclaim_sem)) return 0; drv->reclaim_memory(xdev); - spin_unlock(&xdev->reclaim_lock); + up(&xdev->reclaim_sem); } return 0; } diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h index 89a889585ba0..850a43bd69d3 100644 --- a/include/xen/xenbus.h +++ b/include/xen/xenbus.h @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include @@ -76,7 +77,7 @@ struct xenbus_device { enum xenbus_state state; struct completion down; struct work_struct work; - spinlock_t reclaim_lock; + struct semaphore reclaim_sem; }; static inline struct xenbus_device *to_xenbus_device(struct device *dev) From 4ab50af63d2eb5da5c1571f8518948514f535782 Mon Sep 17 00:00:00 2001 From: Juergen Gross Date: Thu, 5 Mar 2020 16:51:29 +0100 Subject: [PATCH 375/400] xen/blkfront: fix ring info addressing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit 0265d6e8ddb890 ("xen/blkfront: limit allocated memory size to actual use case") made struct blkfront_ring_info size dynamic. This is fine when running with only one queue, but with multiple queues the addressing of the single queues has to be adapted as the structs are allocated in an array. Fixes: 0265d6e8ddb890 ("xen/blkfront: limit allocated memory size to actual use case") Reported-by: Sander Eikelenboom Tested-by: Sander Eikelenboom Signed-off-by: Juergen Gross Acked-by: Roger Pau Monné Link: https://lore.kernel.org/r/20200305155129.28326-1-jgross@suse.com Signed-off-by: Boris Ostrovsky --- drivers/block/xen-blkfront.c | 80 +++++++++++++++++++----------------- 1 file changed, 42 insertions(+), 38 deletions(-) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index e2ad6bba2281..9df516a56bb2 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -213,6 +213,7 @@ struct blkfront_info struct blk_mq_tag_set tag_set; struct blkfront_ring_info *rinfo; unsigned int nr_rings; + unsigned int rinfo_size; /* Save uncomplete reqs and bios for migration. */ struct list_head requests; struct bio_list bio_list; @@ -259,6 +260,18 @@ static int blkfront_setup_indirect(struct blkfront_ring_info *rinfo); static void blkfront_gather_backend_features(struct blkfront_info *info); static int negotiate_mq(struct blkfront_info *info); +#define for_each_rinfo(info, ptr, idx) \ + for ((ptr) = (info)->rinfo, (idx) = 0; \ + (idx) < (info)->nr_rings; \ + (idx)++, (ptr) = (void *)(ptr) + (info)->rinfo_size) + +static inline struct blkfront_ring_info * +get_rinfo(const struct blkfront_info *info, unsigned int i) +{ + BUG_ON(i >= info->nr_rings); + return (void *)info->rinfo + i * info->rinfo_size; +} + static int get_id_from_freelist(struct blkfront_ring_info *rinfo) { unsigned long free = rinfo->shadow_free; @@ -883,8 +896,7 @@ static blk_status_t blkif_queue_rq(struct blk_mq_hw_ctx *hctx, struct blkfront_info *info = hctx->queue->queuedata; struct blkfront_ring_info *rinfo = NULL; - BUG_ON(info->nr_rings <= qid); - rinfo = &info->rinfo[qid]; + rinfo = get_rinfo(info, qid); blk_mq_start_request(qd->rq); spin_lock_irqsave(&rinfo->ring_lock, flags); if (RING_FULL(&rinfo->ring)) @@ -1181,6 +1193,7 @@ static int xlvbd_alloc_gendisk(blkif_sector_t capacity, static void xlvbd_release_gendisk(struct blkfront_info *info) { unsigned int minor, nr_minors, i; + struct blkfront_ring_info *rinfo; if (info->rq == NULL) return; @@ -1188,9 +1201,7 @@ static void xlvbd_release_gendisk(struct blkfront_info *info) /* No more blkif_request(). */ blk_mq_stop_hw_queues(info->rq); - for (i = 0; i < info->nr_rings; i++) { - struct blkfront_ring_info *rinfo = &info->rinfo[i]; - + for_each_rinfo(info, rinfo, i) { /* No more gnttab callback work. */ gnttab_cancel_free_callback(&rinfo->callback); @@ -1339,6 +1350,7 @@ free_shadow: static void blkif_free(struct blkfront_info *info, int suspend) { unsigned int i; + struct blkfront_ring_info *rinfo; /* Prevent new requests being issued until we fix things up. */ info->connected = suspend ? @@ -1347,8 +1359,8 @@ static void blkif_free(struct blkfront_info *info, int suspend) if (info->rq) blk_mq_stop_hw_queues(info->rq); - for (i = 0; i < info->nr_rings; i++) - blkif_free_ring(&info->rinfo[i]); + for_each_rinfo(info, rinfo, i) + blkif_free_ring(rinfo); kvfree(info->rinfo); info->rinfo = NULL; @@ -1775,6 +1787,7 @@ static int talk_to_blkback(struct xenbus_device *dev, int err; unsigned int i, max_page_order; unsigned int ring_page_order; + struct blkfront_ring_info *rinfo; if (!info) return -ENODEV; @@ -1788,9 +1801,7 @@ static int talk_to_blkback(struct xenbus_device *dev, if (err) goto destroy_blkring; - for (i = 0; i < info->nr_rings; i++) { - struct blkfront_ring_info *rinfo = &info->rinfo[i]; - + for_each_rinfo(info, rinfo, i) { /* Create shared ring, alloc event channel. */ err = setup_blkring(dev, rinfo); if (err) @@ -1815,7 +1826,7 @@ again: /* We already got the number of queues/rings in _probe */ if (info->nr_rings == 1) { - err = write_per_ring_nodes(xbt, &info->rinfo[0], dev->nodename); + err = write_per_ring_nodes(xbt, info->rinfo, dev->nodename); if (err) goto destroy_blkring; } else { @@ -1837,10 +1848,10 @@ again: goto abort_transaction; } - for (i = 0; i < info->nr_rings; i++) { + for_each_rinfo(info, rinfo, i) { memset(path, 0, pathsize); snprintf(path, pathsize, "%s/queue-%u", dev->nodename, i); - err = write_per_ring_nodes(xbt, &info->rinfo[i], path); + err = write_per_ring_nodes(xbt, rinfo, path); if (err) { kfree(path); goto destroy_blkring; @@ -1868,9 +1879,8 @@ again: goto destroy_blkring; } - for (i = 0; i < info->nr_rings; i++) { + for_each_rinfo(info, rinfo, i) { unsigned int j; - struct blkfront_ring_info *rinfo = &info->rinfo[i]; for (j = 0; j < BLK_RING_SIZE(info); j++) rinfo->shadow[j].req.u.rw.id = j + 1; @@ -1900,6 +1910,7 @@ static int negotiate_mq(struct blkfront_info *info) { unsigned int backend_max_queues; unsigned int i; + struct blkfront_ring_info *rinfo; BUG_ON(info->nr_rings); @@ -1911,20 +1922,16 @@ static int negotiate_mq(struct blkfront_info *info) if (!info->nr_rings) info->nr_rings = 1; - info->rinfo = kvcalloc(info->nr_rings, - struct_size(info->rinfo, shadow, - BLK_RING_SIZE(info)), - GFP_KERNEL); + info->rinfo_size = struct_size(info->rinfo, shadow, + BLK_RING_SIZE(info)); + info->rinfo = kvcalloc(info->nr_rings, info->rinfo_size, GFP_KERNEL); if (!info->rinfo) { xenbus_dev_fatal(info->xbdev, -ENOMEM, "allocating ring_info structure"); info->nr_rings = 0; return -ENOMEM; } - for (i = 0; i < info->nr_rings; i++) { - struct blkfront_ring_info *rinfo; - - rinfo = &info->rinfo[i]; + for_each_rinfo(info, rinfo, i) { INIT_LIST_HEAD(&rinfo->indirect_pages); INIT_LIST_HEAD(&rinfo->grants); rinfo->dev_info = info; @@ -2017,6 +2024,7 @@ static int blkif_recover(struct blkfront_info *info) int rc; struct bio *bio; unsigned int segs; + struct blkfront_ring_info *rinfo; blkfront_gather_backend_features(info); /* Reset limits changed by blk_mq_update_nr_hw_queues(). */ @@ -2024,9 +2032,7 @@ static int blkif_recover(struct blkfront_info *info) segs = info->max_indirect_segments ? : BLKIF_MAX_SEGMENTS_PER_REQUEST; blk_queue_max_segments(info->rq, segs / GRANTS_PER_PSEG); - for (r_index = 0; r_index < info->nr_rings; r_index++) { - struct blkfront_ring_info *rinfo = &info->rinfo[r_index]; - + for_each_rinfo(info, rinfo, r_index) { rc = blkfront_setup_indirect(rinfo); if (rc) return rc; @@ -2036,10 +2042,7 @@ static int blkif_recover(struct blkfront_info *info) /* Now safe for us to use the shared ring */ info->connected = BLKIF_STATE_CONNECTED; - for (r_index = 0; r_index < info->nr_rings; r_index++) { - struct blkfront_ring_info *rinfo; - - rinfo = &info->rinfo[r_index]; + for_each_rinfo(info, rinfo, r_index) { /* Kick any other new requests queued since we resumed */ kick_pending_request_queues(rinfo); } @@ -2072,13 +2075,13 @@ static int blkfront_resume(struct xenbus_device *dev) struct blkfront_info *info = dev_get_drvdata(&dev->dev); int err = 0; unsigned int i, j; + struct blkfront_ring_info *rinfo; dev_dbg(&dev->dev, "blkfront_resume: %s\n", dev->nodename); bio_list_init(&info->bio_list); INIT_LIST_HEAD(&info->requests); - for (i = 0; i < info->nr_rings; i++) { - struct blkfront_ring_info *rinfo = &info->rinfo[i]; + for_each_rinfo(info, rinfo, i) { struct bio_list merge_bio; struct blk_shadow *shadow = rinfo->shadow; @@ -2337,6 +2340,7 @@ static void blkfront_connect(struct blkfront_info *info) unsigned int binfo; char *envp[] = { "RESIZE=1", NULL }; int err, i; + struct blkfront_ring_info *rinfo; switch (info->connected) { case BLKIF_STATE_CONNECTED: @@ -2394,8 +2398,8 @@ static void blkfront_connect(struct blkfront_info *info) "physical-sector-size", sector_size); blkfront_gather_backend_features(info); - for (i = 0; i < info->nr_rings; i++) { - err = blkfront_setup_indirect(&info->rinfo[i]); + for_each_rinfo(info, rinfo, i) { + err = blkfront_setup_indirect(rinfo); if (err) { xenbus_dev_fatal(info->xbdev, err, "setup_indirect at %s", info->xbdev->otherend); @@ -2416,8 +2420,8 @@ static void blkfront_connect(struct blkfront_info *info) /* Kick pending requests. */ info->connected = BLKIF_STATE_CONNECTED; - for (i = 0; i < info->nr_rings; i++) - kick_pending_request_queues(&info->rinfo[i]); + for_each_rinfo(info, rinfo, i) + kick_pending_request_queues(rinfo); device_add_disk(&info->xbdev->dev, info->gd, NULL); @@ -2652,9 +2656,9 @@ static void purge_persistent_grants(struct blkfront_info *info) { unsigned int i; unsigned long flags; + struct blkfront_ring_info *rinfo; - for (i = 0; i < info->nr_rings; i++) { - struct blkfront_ring_info *rinfo = &info->rinfo[i]; + for_each_rinfo(info, rinfo, i) { struct grant *gnt_list_entry, *tmp; spin_lock_irqsave(&rinfo->ring_lock, flags); From 759bdc168181abeff61399d0f7ecec2852cc3e61 Mon Sep 17 00:00:00 2001 From: Anup Patel Date: Tue, 3 Dec 2019 03:49:31 +0000 Subject: [PATCH 376/400] RISC-V: Add kconfig option for QEMU virt machine We add kconfig option for QEMU virt machine and select all required VIRTIO drivers using this kconfig option. Signed-off-by: Anup Patel Reviewed-by: Atish Patra Reviewed-by: Palmer Dabbelt Reviewed-by: Alistair Francis Signed-off-by: Palmer Dabbelt --- arch/riscv/Kconfig.socs | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/arch/riscv/Kconfig.socs b/arch/riscv/Kconfig.socs index d325b67d00df..414db54d7dbf 100644 --- a/arch/riscv/Kconfig.socs +++ b/arch/riscv/Kconfig.socs @@ -10,4 +10,24 @@ config SOC_SIFIVE help This enables support for SiFive SoC platform hardware. +config SOC_VIRT + bool "QEMU Virt Machine" + select VIRTIO_PCI + select VIRTIO_BALLOON + select VIRTIO_MMIO + select VIRTIO_CONSOLE + select VIRTIO_NET + select NET_9P_VIRTIO + select VIRTIO_BLK + select SCSI_VIRTIO + select DRM_VIRTIO_GPU + select HW_RANDOM_VIRTIO + select RPMSG_CHAR + select RPMSG_VIRTIO + select CRYPTO_DEV_VIRTIO + select VIRTIO_INPUT + select SIFIVE_PLIC + help + This enables support for QEMU Virt Machine. + endmenu From a4485398b6b86334aa26dff5088b3e7e8a87682d Mon Sep 17 00:00:00 2001 From: Anup Patel Date: Tue, 3 Dec 2019 03:49:34 +0000 Subject: [PATCH 377/400] RISC-V: Enable QEMU virt machine support in defconfigs We have kconfig option for QEMU virt machine so let's enable it in RV32 and RV64 defconfigs. Also, we remove various VIRTIO configs from RV32 and RV64 defconfigs because these are now selected by QEMU virt machine kconfig option. Signed-off-by: Anup Patel Reviewed-by: Atish Patra Reviewed-by: Palmer Dabbelt Reviewed-by: Alistair Francis Signed-off-by: Palmer Dabbelt --- arch/riscv/configs/defconfig | 15 +-------------- arch/riscv/configs/rv32_defconfig | 16 +--------------- 2 files changed, 2 insertions(+), 29 deletions(-) diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig index e2ff95cb3390..189c97c0d2e6 100644 --- a/arch/riscv/configs/defconfig +++ b/arch/riscv/configs/defconfig @@ -15,6 +15,7 @@ CONFIG_BLK_DEV_INITRD=y CONFIG_EXPERT=y CONFIG_BPF_SYSCALL=y CONFIG_SOC_SIFIVE=y +CONFIG_SOC_VIRT=y CONFIG_SMP=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y @@ -30,7 +31,6 @@ CONFIG_IP_PNP_BOOTP=y CONFIG_IP_PNP_RARP=y CONFIG_NETLINK_DIAG=y CONFIG_NET_9P=y -CONFIG_NET_9P_VIRTIO=y CONFIG_PCI=y CONFIG_PCIEPORTBUS=y CONFIG_PCI_HOST_GENERIC=y @@ -38,15 +38,12 @@ CONFIG_PCIE_XILINX=y CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y CONFIG_BLK_DEV_LOOP=y -CONFIG_VIRTIO_BLK=y CONFIG_BLK_DEV_SD=y CONFIG_BLK_DEV_SR=y -CONFIG_SCSI_VIRTIO=y CONFIG_ATA=y CONFIG_SATA_AHCI=y CONFIG_SATA_AHCI_PLATFORM=y CONFIG_NETDEVICES=y -CONFIG_VIRTIO_NET=y CONFIG_MACB=y CONFIG_E1000E=y CONFIG_R8169=y @@ -57,15 +54,12 @@ CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_OF_PLATFORM=y CONFIG_SERIAL_EARLYCON_RISCV_SBI=y CONFIG_HVC_RISCV_SBI=y -CONFIG_VIRTIO_CONSOLE=y CONFIG_HW_RANDOM=y -CONFIG_HW_RANDOM_VIRTIO=y CONFIG_SPI=y CONFIG_SPI_SIFIVE=y # CONFIG_PTP_1588_CLOCK is not set CONFIG_DRM=y CONFIG_DRM_RADEON=y -CONFIG_DRM_VIRTIO_GPU=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_USB=y CONFIG_USB_XHCI_HCD=y @@ -78,12 +72,6 @@ CONFIG_USB_STORAGE=y CONFIG_USB_UAS=y CONFIG_MMC=y CONFIG_MMC_SPI=y -CONFIG_VIRTIO_PCI=y -CONFIG_VIRTIO_BALLOON=y -CONFIG_VIRTIO_INPUT=y -CONFIG_VIRTIO_MMIO=y -CONFIG_RPMSG_CHAR=y -CONFIG_RPMSG_VIRTIO=y CONFIG_EXT4_FS=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_AUTOFS4_FS=y @@ -98,7 +86,6 @@ CONFIG_NFS_V4_2=y CONFIG_ROOT_NFS=y CONFIG_9P_FS=y CONFIG_CRYPTO_USER_API_HASH=y -CONFIG_CRYPTO_DEV_VIRTIO=y CONFIG_PRINTK_TIME=y CONFIG_DEBUG_FS=y CONFIG_DEBUG_PAGEALLOC=y diff --git a/arch/riscv/configs/rv32_defconfig b/arch/riscv/configs/rv32_defconfig index eb519407c841..417cfc4ee469 100644 --- a/arch/riscv/configs/rv32_defconfig +++ b/arch/riscv/configs/rv32_defconfig @@ -14,6 +14,7 @@ CONFIG_CHECKPOINT_RESTORE=y CONFIG_BLK_DEV_INITRD=y CONFIG_EXPERT=y CONFIG_BPF_SYSCALL=y +CONFIG_SOC_VIRT=y CONFIG_ARCH_RV32I=y CONFIG_SMP=y CONFIG_MODULES=y @@ -30,7 +31,6 @@ CONFIG_IP_PNP_BOOTP=y CONFIG_IP_PNP_RARP=y CONFIG_NETLINK_DIAG=y CONFIG_NET_9P=y -CONFIG_NET_9P_VIRTIO=y CONFIG_PCI=y CONFIG_PCIEPORTBUS=y CONFIG_PCI_HOST_GENERIC=y @@ -38,15 +38,12 @@ CONFIG_PCIE_XILINX=y CONFIG_DEVTMPFS=y CONFIG_DEVTMPFS_MOUNT=y CONFIG_BLK_DEV_LOOP=y -CONFIG_VIRTIO_BLK=y CONFIG_BLK_DEV_SD=y CONFIG_BLK_DEV_SR=y -CONFIG_SCSI_VIRTIO=y CONFIG_ATA=y CONFIG_SATA_AHCI=y CONFIG_SATA_AHCI_PLATFORM=y CONFIG_NETDEVICES=y -CONFIG_VIRTIO_NET=y CONFIG_MACB=y CONFIG_E1000E=y CONFIG_R8169=y @@ -57,13 +54,10 @@ CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_OF_PLATFORM=y CONFIG_SERIAL_EARLYCON_RISCV_SBI=y CONFIG_HVC_RISCV_SBI=y -CONFIG_VIRTIO_CONSOLE=y CONFIG_HW_RANDOM=y -CONFIG_HW_RANDOM_VIRTIO=y # CONFIG_PTP_1588_CLOCK is not set CONFIG_DRM=y CONFIG_DRM_RADEON=y -CONFIG_DRM_VIRTIO_GPU=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_USB=y CONFIG_USB_XHCI_HCD=y @@ -74,13 +68,6 @@ CONFIG_USB_OHCI_HCD=y CONFIG_USB_OHCI_HCD_PLATFORM=y CONFIG_USB_STORAGE=y CONFIG_USB_UAS=y -CONFIG_VIRTIO_PCI=y -CONFIG_VIRTIO_BALLOON=y -CONFIG_VIRTIO_INPUT=y -CONFIG_VIRTIO_MMIO=y -CONFIG_RPMSG_CHAR=y -CONFIG_RPMSG_VIRTIO=y -CONFIG_SIFIVE_PLIC=y CONFIG_EXT4_FS=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_AUTOFS4_FS=y @@ -95,7 +82,6 @@ CONFIG_NFS_V4_2=y CONFIG_ROOT_NFS=y CONFIG_9P_FS=y CONFIG_CRYPTO_USER_API_HASH=y -CONFIG_CRYPTO_DEV_VIRTIO=y CONFIG_PRINTK_TIME=y CONFIG_DEBUG_FS=y CONFIG_DEBUG_PAGEALLOC=y From 81e2d3c52c0ef819d2fe68ebe2e167045938929e Mon Sep 17 00:00:00 2001 From: Anup Patel Date: Tue, 3 Dec 2019 03:49:37 +0000 Subject: [PATCH 378/400] RISC-V: Select SYSCON Reboot and Poweroff for QEMU virt machine The SYSCON Reboot and Poweroff drivers can be used on QEMU virt machine to reboot or poweroff the system hence we select these drivers using QEMU virt machine kconfig option. Signed-off-by: Anup Patel Reviewed-by: Palmer Dabbelt Reviewed-by: Alistair Francis Signed-off-by: Palmer Dabbelt --- arch/riscv/Kconfig.socs | 2 ++ arch/riscv/configs/defconfig | 1 + arch/riscv/configs/rv32_defconfig | 1 + 3 files changed, 4 insertions(+) diff --git a/arch/riscv/Kconfig.socs b/arch/riscv/Kconfig.socs index 414db54d7dbf..9f6f9a063bc4 100644 --- a/arch/riscv/Kconfig.socs +++ b/arch/riscv/Kconfig.socs @@ -26,6 +26,8 @@ config SOC_VIRT select RPMSG_VIRTIO select CRYPTO_DEV_VIRTIO select VIRTIO_INPUT + select POWER_RESET_SYSCON + select POWER_RESET_SYSCON_POWEROFF select SIFIVE_PLIC help This enables support for QEMU Virt Machine. diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig index 189c97c0d2e6..b15fc2c71d8b 100644 --- a/arch/riscv/configs/defconfig +++ b/arch/riscv/configs/defconfig @@ -58,6 +58,7 @@ CONFIG_HW_RANDOM=y CONFIG_SPI=y CONFIG_SPI_SIFIVE=y # CONFIG_PTP_1588_CLOCK is not set +CONFIG_POWER_RESET=y CONFIG_DRM=y CONFIG_DRM_RADEON=y CONFIG_FRAMEBUFFER_CONSOLE=y diff --git a/arch/riscv/configs/rv32_defconfig b/arch/riscv/configs/rv32_defconfig index 417cfc4ee469..a0880110fe58 100644 --- a/arch/riscv/configs/rv32_defconfig +++ b/arch/riscv/configs/rv32_defconfig @@ -56,6 +56,7 @@ CONFIG_SERIAL_EARLYCON_RISCV_SBI=y CONFIG_HVC_RISCV_SBI=y CONFIG_HW_RANDOM=y # CONFIG_PTP_1588_CLOCK is not set +CONFIG_POWER_RESET=y CONFIG_DRM=y CONFIG_DRM_RADEON=y CONFIG_FRAMEBUFFER_CONSOLE=y From d2047aba2e68f02119fa28904364070b98d92cd8 Mon Sep 17 00:00:00 2001 From: Anup Patel Date: Tue, 3 Dec 2019 03:49:39 +0000 Subject: [PATCH 379/400] RISC-V: Select Goldfish RTC driver for QEMU virt machine We select Goldfish RTC driver using QEMU virt machine kconfig option to access RTC device on QEMU virt machine. Signed-off-by: Anup Patel Reviewed-by: Atish Patra Reviewed-by: Palmer Dabbelt Reviewed-by: Alistair Francis Signed-off-by: Palmer Dabbelt --- arch/riscv/Kconfig.socs | 2 ++ arch/riscv/configs/defconfig | 1 + arch/riscv/configs/rv32_defconfig | 1 + 3 files changed, 4 insertions(+) diff --git a/arch/riscv/Kconfig.socs b/arch/riscv/Kconfig.socs index 9f6f9a063bc4..3078b2de0b2d 100644 --- a/arch/riscv/Kconfig.socs +++ b/arch/riscv/Kconfig.socs @@ -28,6 +28,8 @@ config SOC_VIRT select VIRTIO_INPUT select POWER_RESET_SYSCON select POWER_RESET_SYSCON_POWEROFF + select GOLDFISH + select RTC_DRV_GOLDFISH select SIFIVE_PLIC help This enables support for QEMU Virt Machine. diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig index b15fc2c71d8b..c8f084203067 100644 --- a/arch/riscv/configs/defconfig +++ b/arch/riscv/configs/defconfig @@ -73,6 +73,7 @@ CONFIG_USB_STORAGE=y CONFIG_USB_UAS=y CONFIG_MMC=y CONFIG_MMC_SPI=y +CONFIG_RTC_CLASS=y CONFIG_EXT4_FS=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_AUTOFS4_FS=y diff --git a/arch/riscv/configs/rv32_defconfig b/arch/riscv/configs/rv32_defconfig index a0880110fe58..a844920a261f 100644 --- a/arch/riscv/configs/rv32_defconfig +++ b/arch/riscv/configs/rv32_defconfig @@ -69,6 +69,7 @@ CONFIG_USB_OHCI_HCD=y CONFIG_USB_OHCI_HCD_PLATFORM=y CONFIG_USB_STORAGE=y CONFIG_USB_UAS=y +CONFIG_RTC_CLASS=y CONFIG_EXT4_FS=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_AUTOFS4_FS=y From 0a91330b2af9f71ceeeed483f92774182b58f6d9 Mon Sep 17 00:00:00 2001 From: Yash Shah Date: Wed, 19 Feb 2020 09:19:07 +0530 Subject: [PATCH 380/400] riscv: dts: Add GPIO reboot method to HiFive Unleashed DTS file Add the ability to reboot the HiFive Unleashed board via GPIO. Signed-off-by: Yash Shah Reviewed-by: Anup Patel Signed-off-by: Palmer Dabbelt --- arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dts | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dts b/arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dts index 609198cb1163..4a2729f5ca3f 100644 --- a/arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dts +++ b/arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dts @@ -2,6 +2,7 @@ /* Copyright (c) 2018-2019 SiFive, Inc */ #include "fu540-c000.dtsi" +#include /* Clock frequency (in Hz) of the PCB crystal for rtcclk */ #define RTCCLK_FREQ 1000000 @@ -41,6 +42,10 @@ clock-frequency = ; clock-output-names = "rtcclk"; }; + gpio-restart { + compatible = "gpio-restart"; + gpios = <&gpio 10 GPIO_ACTIVE_LOW>; + }; }; &uart0 { From 153031a301bb07194e9c37466cfce8eacb977621 Mon Sep 17 00:00:00 2001 From: Cengiz Can Date: Wed, 4 Mar 2020 13:58:19 +0300 Subject: [PATCH 381/400] blktrace: fix dereference after null check There was a recent change in blktrace.c that added a RCU protection to `q->blk_trace` in order to fix a use-after-free issue during access. However the change missed an edge case that can lead to dereferencing of `bt` pointer even when it's NULL: Coverity static analyzer marked this as a FORWARD_NULL issue with CID 1460458. ``` /kernel/trace/blktrace.c: 1904 in sysfs_blk_trace_attr_store() 1898 ret = 0; 1899 if (bt == NULL) 1900 ret = blk_trace_setup_queue(q, bdev); 1901 1902 if (ret == 0) { 1903 if (attr == &dev_attr_act_mask) >>> CID 1460458: Null pointer dereferences (FORWARD_NULL) >>> Dereferencing null pointer "bt". 1904 bt->act_mask = value; 1905 else if (attr == &dev_attr_pid) 1906 bt->pid = value; 1907 else if (attr == &dev_attr_start_lba) 1908 bt->start_lba = value; 1909 else if (attr == &dev_attr_end_lba) ``` Added a reassignment with RCU annotation to fix the issue. Fixes: c780e86dd48 ("blktrace: Protect q->blk_trace with RCU") Cc: stable@vger.kernel.org Reviewed-by: Ming Lei Reviewed-by: Bob Liu Reviewed-by: Steven Rostedt (VMware) Signed-off-by: Cengiz Can Signed-off-by: Jens Axboe --- kernel/trace/blktrace.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index 4560878f0bac..ca39dc3230cb 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -1896,8 +1896,11 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev, } ret = 0; - if (bt == NULL) + if (bt == NULL) { ret = blk_trace_setup_queue(q, bdev); + bt = rcu_dereference_protected(q->blk_trace, + lockdep_is_held(&q->blk_trace_mutex)); + } if (ret == 0) { if (attr == &dev_attr_act_mask) From af33d2433b03d63ed31fcfda842f46676a5e1afc Mon Sep 17 00:00:00 2001 From: Tycho Andersen Date: Sat, 8 Feb 2020 08:18:17 -0700 Subject: [PATCH 382/400] riscv: fix seccomp reject syscall code path If secure_computing() rejected a system call, we were previously setting the system call number to -1, to indicate to later code that the syscall failed. However, if something (e.g. a user notification) was sleeping, and received a signal, we may set a0 to -ERESTARTSYS and re-try the system call again. In this case, seccomp "denies" the syscall (because of the signal), and we would set a7 to -1, thus losing the value of the system call we want to restart. Instead, let's return -1 from do_syscall_trace_enter() to indicate that the syscall was rejected, so we don't clobber the value in case of -ERESTARTSYS or whatever. This commit fixes the user_notification_signal seccomp selftest on riscv to no longer hang. That test expects the system call to be re-issued after the signal, and it wasn't due to the above bug. Now that it is, everything works normally. Note that in the ptrace (tracer) case, the tracer can set the register values to whatever they want, so we still need to keep the code that handles out-of-bounds syscalls. However, we can drop the comment. We can also drop syscall_set_nr(), since it is no longer used anywhere, and the code that re-loads the value in a7 because of it. Reported in: https://lore.kernel.org/bpf/CAEn-LTp=ss0Dfv6J00=rCAy+N78U2AmhqJNjfqjr2FDpPYjxEQ@mail.gmail.com/ Reported-by: David Abdurachmanov Signed-off-by: Tycho Andersen Reviewed-by: Kees Cook Signed-off-by: Palmer Dabbelt --- arch/riscv/include/asm/syscall.h | 7 ------- arch/riscv/kernel/entry.S | 11 +++-------- arch/riscv/kernel/ptrace.c | 11 +++++------ 3 files changed, 8 insertions(+), 21 deletions(-) diff --git a/arch/riscv/include/asm/syscall.h b/arch/riscv/include/asm/syscall.h index 42347d0981e7..49350c8bd7b0 100644 --- a/arch/riscv/include/asm/syscall.h +++ b/arch/riscv/include/asm/syscall.h @@ -28,13 +28,6 @@ static inline int syscall_get_nr(struct task_struct *task, return regs->a7; } -static inline void syscall_set_nr(struct task_struct *task, - struct pt_regs *regs, - int sysno) -{ - regs->a7 = sysno; -} - static inline void syscall_rollback(struct task_struct *task, struct pt_regs *regs) { diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index bad4d85b5e91..208702d8c18e 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -228,20 +228,13 @@ check_syscall_nr: /* Check to make sure we don't jump to a bogus syscall number. */ li t0, __NR_syscalls la s0, sys_ni_syscall - /* - * The tracer can change syscall number to valid/invalid value. - * We use syscall_set_nr helper in syscall_trace_enter thus we - * cannot trust the current value in a7 and have to reload from - * the current task pt_regs. - */ - REG_L a7, PT_A7(sp) /* * Syscall number held in a7. * If syscall number is above allowed value, redirect to ni_syscall. */ bge a7, t0, 1f /* - * Check if syscall is rejected by tracer or seccomp, i.e., a7 == -1. + * Check if syscall is rejected by tracer, i.e., a7 == -1. * If yes, we pretend it was executed. */ li t1, -1 @@ -334,6 +327,7 @@ work_resched: handle_syscall_trace_enter: move a0, sp call do_syscall_trace_enter + move t0, a0 REG_L a0, PT_A0(sp) REG_L a1, PT_A1(sp) REG_L a2, PT_A2(sp) @@ -342,6 +336,7 @@ handle_syscall_trace_enter: REG_L a5, PT_A5(sp) REG_L a6, PT_A6(sp) REG_L a7, PT_A7(sp) + bnez t0, ret_from_syscall_rejected j check_syscall_nr handle_syscall_trace_exit: move a0, sp diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c index 407464201b91..444dc7b0fd78 100644 --- a/arch/riscv/kernel/ptrace.c +++ b/arch/riscv/kernel/ptrace.c @@ -148,21 +148,19 @@ long arch_ptrace(struct task_struct *child, long request, * Allows PTRACE_SYSCALL to work. These are called from entry.S in * {handle,ret_from}_syscall. */ -__visible void do_syscall_trace_enter(struct pt_regs *regs) +__visible int do_syscall_trace_enter(struct pt_regs *regs) { if (test_thread_flag(TIF_SYSCALL_TRACE)) if (tracehook_report_syscall_entry(regs)) - syscall_set_nr(current, regs, -1); + return -1; /* * Do the secure computing after ptrace; failures should be fast. * If this fails we might have return value in a0 from seccomp * (via SECCOMP_RET_ERRNO/TRACE). */ - if (secure_computing() == -1) { - syscall_set_nr(current, regs, -1); - return; - } + if (secure_computing() == -1) + return -1; #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS if (test_thread_flag(TIF_SYSCALL_TRACEPOINT)) @@ -170,6 +168,7 @@ __visible void do_syscall_trace_enter(struct pt_regs *regs) #endif audit_syscall_entry(regs->a7, regs->a0, regs->a1, regs->a2, regs->a3); + return 0; } __visible void do_syscall_trace_exit(struct pt_regs *regs) From 95dbf14b236f3147f716cd159bd29461916c610e Mon Sep 17 00:00:00 2001 From: Thomas Bogendoerfer Date: Fri, 6 Mar 2020 11:58:37 +0100 Subject: [PATCH 383/400] ALSA: sgio2audio: Remove usage of dropped hw_params/hw_free functions Commit ee88f4ebe575 ("ALSA: mips: Use managed buffer allocation") removed superfluous hw_params/hw_free callbacks, but forgot to remove them where they were used. Fixes: ee88f4ebe575 ("ALSA: mips: Use managed buffer allocation") Signed-off-by: Thomas Bogendoerfer Link: https://lore.kernel.org/r/20200306105837.31523-1-tsbogend@alpha.franken.de Signed-off-by: Takashi Iwai --- sound/mips/sgio2audio.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/sound/mips/sgio2audio.c b/sound/mips/sgio2audio.c index 9f60a5037f8b..5bf1ea150f26 100644 --- a/sound/mips/sgio2audio.c +++ b/sound/mips/sgio2audio.c @@ -649,8 +649,6 @@ snd_sgio2audio_pcm_pointer(struct snd_pcm_substream *substream) static const struct snd_pcm_ops snd_sgio2audio_playback1_ops = { .open = snd_sgio2audio_playback1_open, .close = snd_sgio2audio_pcm_close, - .hw_params = snd_sgio2audio_pcm_hw_params, - .hw_free = snd_sgio2audio_pcm_hw_free, .prepare = snd_sgio2audio_pcm_prepare, .trigger = snd_sgio2audio_pcm_trigger, .pointer = snd_sgio2audio_pcm_pointer, @@ -659,8 +657,6 @@ static const struct snd_pcm_ops snd_sgio2audio_playback1_ops = { static const struct snd_pcm_ops snd_sgio2audio_playback2_ops = { .open = snd_sgio2audio_playback2_open, .close = snd_sgio2audio_pcm_close, - .hw_params = snd_sgio2audio_pcm_hw_params, - .hw_free = snd_sgio2audio_pcm_hw_free, .prepare = snd_sgio2audio_pcm_prepare, .trigger = snd_sgio2audio_pcm_trigger, .pointer = snd_sgio2audio_pcm_pointer, @@ -669,8 +665,6 @@ static const struct snd_pcm_ops snd_sgio2audio_playback2_ops = { static const struct snd_pcm_ops snd_sgio2audio_capture_ops = { .open = snd_sgio2audio_capture_open, .close = snd_sgio2audio_pcm_close, - .hw_params = snd_sgio2audio_pcm_hw_params, - .hw_free = snd_sgio2audio_pcm_hw_free, .prepare = snd_sgio2audio_pcm_prepare, .trigger = snd_sgio2audio_pcm_trigger, .pointer = snd_sgio2audio_pcm_pointer, From 8b272b3cbbb50a6a8e62d8a15affd473a788e184 Mon Sep 17 00:00:00 2001 From: Mel Gorman Date: Thu, 5 Mar 2020 22:28:26 -0800 Subject: [PATCH 384/400] mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa : A user reported a bug against a distribution kernel while running a : proprietary workload described as "memory intensive that is not swapping" : that is expected to apply to mainline kernels. The workload is : read/write/modifying ranges of memory and checking the contents. They : reported that within a few hours that a bad PMD would be reported followed : by a memory corruption where expected data was all zeros. A partial : report of the bad PMD looked like : : [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2) : [ 5195.341184] ------------[ cut here ]------------ : [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35! : .... : [ 5195.410033] Call Trace: : [ 5195.410471] [] change_protection_range+0x7dd/0x930 : [ 5195.410716] [] change_prot_numa+0x18/0x30 : [ 5195.410918] [] task_numa_work+0x1fe/0x310 : [ 5195.411200] [] task_work_run+0x72/0x90 : [ 5195.411246] [] exit_to_usermode_loop+0x91/0xc2 : [ 5195.411494] [] prepare_exit_to_usermode+0x31/0x40 : [ 5195.411739] [] retint_user+0x8/0x10 : : Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD : was a false detection. The bug does not trigger if automatic NUMA : balancing or transparent huge pages is disabled. : : The bug is due a race in change_pmd_range between a pmd_trans_huge and : pmd_nond_or_clear_bad check without any locks held. During the : pmd_trans_huge check, a parallel protection update under lock can have : cleared the PMD and filled it with a prot_numa entry between the transhuge : check and the pmd_none_or_clear_bad check. : : While this could be fixed with heavy locking, it's only necessary to make : a copy of the PMD on the stack during change_pmd_range and avoid races. A : new helper is created for this as the check if quite subtle and the : existing similar helpful is not suitable. This passed 154 hours of : testing (usually triggers between 20 minutes and 24 hours) without : detecting bad PMDs or corruption. A basic test of an autonuma-intensive : workload showed no significant change in behaviour. Although Mel withdrew the patch on the face of LKML comment https://lkml.org/lkml/2017/4/10/922 the race window aforementioned is still open, and we have reports of Linpack test reporting bad residuals after the bad PMD warning is observed. In addition to that, bad rss-counter and non-zero pgtables assertions are triggered on mm teardown for the task hitting the bad PMD. host kernel: mm/pgtable-generic.c:40: bad pmd 00000000b3152f68(8000000d2d2008e7) .... host kernel: BUG: Bad rss-counter state mm:00000000b583043d idx:1 val:512 host kernel: BUG: non-zero pgtables_bytes on freeing mm: 4096 The issue is observed on a v4.18-based distribution kernel, but the race window is expected to be applicable to mainline kernels, as well. [akpm@linux-foundation.org: fix comment typo, per Rafael] Signed-off-by: Andrew Morton Signed-off-by: Rafael Aquini Signed-off-by: Mel Gorman Cc: Cc: Zi Yan Cc: "Kirill A. Shutemov" Cc: Vlastimil Babka Cc: Michal Hocko Link: http://lkml.kernel.org/r/20200216191800.22423-1-aquini@redhat.com Signed-off-by: Linus Torvalds --- mm/mprotect.c | 38 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-) diff --git a/mm/mprotect.c b/mm/mprotect.c index 7a8e84f86831..311c0dadf71c 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -161,6 +161,31 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd, return pages; } +/* + * Used when setting automatic NUMA hinting protection where it is + * critical that a numa hinting PMD is not confused with a bad PMD. + */ +static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd) +{ + pmd_t pmdval = pmd_read_atomic(pmd); + + /* See pmd_none_or_trans_huge_or_clear_bad for info on barrier */ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + barrier(); +#endif + + if (pmd_none(pmdval)) + return 1; + if (pmd_trans_huge(pmdval)) + return 0; + if (unlikely(pmd_bad(pmdval))) { + pmd_clear_bad(pmd); + return 1; + } + + return 0; +} + static inline unsigned long change_pmd_range(struct vm_area_struct *vma, pud_t *pud, unsigned long addr, unsigned long end, pgprot_t newprot, int dirty_accountable, int prot_numa) @@ -178,8 +203,17 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma, unsigned long this_pages; next = pmd_addr_end(addr, end); - if (!is_swap_pmd(*pmd) && !pmd_trans_huge(*pmd) && !pmd_devmap(*pmd) - && pmd_none_or_clear_bad(pmd)) + + /* + * Automatic NUMA balancing walks the tables with mmap_sem + * held for read. It's possible a parallel update to occur + * between pmd_trans_huge() and a pmd_none_or_clear_bad() + * check leading to a false positive and clearing. + * Hence, it's necessary to atomically read the PMD value + * for all the checks. + */ + if (!is_swap_pmd(*pmd) && !pmd_devmap(*pmd) && + pmd_none_or_clear_bad_unless_trans_huge(pmd)) goto next; /* invoke the mmu notifier if the pmd is populated */ From 8a8683ad9ba48b4b52a57f013513d1635c1ca5c4 Mon Sep 17 00:00:00 2001 From: Huang Ying Date: Thu, 5 Mar 2020 22:28:29 -0800 Subject: [PATCH 385/400] mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() In set_pmd_migration_entry(), pmdp_invalidate() is used to change PMD atomically. But the PMD is read before that with an ordinary memory reading. If the THP (transparent huge page) is written between the PMD reading and pmdp_invalidate(), the PMD dirty bit may be lost, and cause data corruption. The race window is quite small, but still possible in theory, so need to be fixed. The race is fixed via using the return value of pmdp_invalidate() to get the original content of PMD, which is a read/modify/write atomic operation. So no THP writing can occur in between. The race has been introduced when the THP migration support is added in the commit 616b8371539a ("mm: thp: enable thp migration in generic path"). But this fix depends on the commit d52605d7cb30 ("mm: do not lose dirty and accessed bits in pmdp_invalidate()"). So it's easy to be backported after v4.16. But the race window is really small, so it may be fine not to backport the fix at all. Signed-off-by: Andrew Morton Signed-off-by: "Huang, Ying" Reviewed-by: Zi Yan Reviewed-by: William Kucharski Acked-by: Kirill A. Shutemov Cc: Cc: Vlastimil Babka Cc: Michal Hocko Cc: Andrea Arcangeli Link: http://lkml.kernel.org/r/20200220075220.2327056-1-ying.huang@intel.com Signed-off-by: Linus Torvalds --- mm/huge_memory.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index b08b199f9a11..24ad53b4dfc0 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3043,8 +3043,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw, return; flush_cache_range(vma, address, address + HPAGE_PMD_SIZE); - pmdval = *pvmw->pmd; - pmdp_invalidate(vma, address, pvmw->pmd); + pmdval = pmdp_invalidate(vma, address, pvmw->pmd); if (pmd_dirty(pmdval)) set_page_dirty(page); entry = make_migration_entry(page, pmd_write(pmdval)); From c3e5ea6ee574ae5e845a40ac8198de1fb63bb3ab Mon Sep 17 00:00:00 2001 From: "Kirill A. Shutemov" Date: Thu, 5 Mar 2020 22:28:32 -0800 Subject: [PATCH 386/400] mm: avoid data corruption on CoW fault into PFN-mapped VMA Jeff Moyer has reported that one of xfstests triggers a warning when run on DAX-enabled filesystem: WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50 ... wp_page_copy+0x98c/0xd50 (unreliable) do_wp_page+0xd8/0xad0 __handle_mm_fault+0x748/0x1b90 handle_mm_fault+0x120/0x1f0 __do_page_fault+0x240/0xd70 do_page_fault+0x38/0xd0 handle_page_fault+0x10/0x30 The warning happens on failed __copy_from_user_inatomic() which tries to copy data into a CoW page. This happens because of race between MADV_DONTNEED and CoW page fault: CPU0 CPU1 handle_mm_fault() do_wp_page() wp_page_copy() do_wp_page() madvise(MADV_DONTNEED) zap_page_range() zap_pte_range() ptep_get_and_clear_full() __copy_from_user_inatomic() sees empty PTE and fails WARN_ON_ONCE(1) clear_page() The solution is to re-try __copy_from_user_inatomic() under PTL after checking that PTE is matches the orig_pte. The second copy attempt can still fail, like due to non-readable PTE, but there's nothing reasonable we can do about, except clearing the CoW page. Reported-by: Jeff Moyer Signed-off-by: Andrew Morton Signed-off-by: Kirill A. Shutemov Tested-by: Jeff Moyer Cc: Cc: Justin He Cc: Dan Williams Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@linux.intel.com Signed-off-by: Linus Torvalds --- mm/memory.c | 35 +++++++++++++++++++++++++++-------- 1 file changed, 27 insertions(+), 8 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 0bccc622e482..e8bfdf0d9d1d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2257,7 +2257,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src, bool ret; void *kaddr; void __user *uaddr; - bool force_mkyoung; + bool locked = false; struct vm_area_struct *vma = vmf->vma; struct mm_struct *mm = vma->vm_mm; unsigned long addr = vmf->address; @@ -2282,11 +2282,11 @@ static inline bool cow_user_page(struct page *dst, struct page *src, * On architectures with software "accessed" bits, we would * take a double page fault, so mark it accessed here. */ - force_mkyoung = arch_faults_on_old_pte() && !pte_young(vmf->orig_pte); - if (force_mkyoung) { + if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) { pte_t entry; vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); + locked = true; if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) { /* * Other thread has already handled the fault @@ -2310,18 +2310,37 @@ static inline bool cow_user_page(struct page *dst, struct page *src, * zeroes. */ if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) { + if (locked) + goto warn; + + /* Re-validate under PTL if the page is still mapped */ + vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl); + locked = true; + if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) { + /* The PTE changed under us. Retry page fault. */ + ret = false; + goto pte_unlock; + } + /* - * Give a warn in case there can be some obscure - * use-case + * The same page can be mapped back since last copy attampt. + * Try to copy again under PTL. */ - WARN_ON_ONCE(1); - clear_page(kaddr); + if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE)) { + /* + * Give a warn in case there can be some obscure + * use-case + */ +warn: + WARN_ON_ONCE(1); + clear_page(kaddr); + } } ret = true; pte_unlock: - if (force_mkyoung) + if (locked) pte_unmap_unlock(vmf->pte, vmf->ptl); kunmap_atomic(kaddr); flush_dcache_page(dst); From bc87302a093f0eab45cd4e250c2021299f712ec6 Mon Sep 17 00:00:00 2001 From: OGAWA Hirofumi Date: Thu, 5 Mar 2020 22:28:36 -0800 Subject: [PATCH 387/400] fat: fix uninit-memory access for partial initialized inode When get an error in the middle of reading an inode, some fields in the inode might be still not initialized. And then the evict_inode path may access those fields via iput(). To fix, this makes sure that inode fields are initialized. Reported-by: syzbot+9d82b8de2992579da5d0@syzkaller.appspotmail.com Signed-off-by: Andrew Morton Signed-off-by: OGAWA Hirofumi Cc: Link: http://lkml.kernel.org/r/871rqnreqx.fsf@mail.parknet.co.jp Signed-off-by: Linus Torvalds --- fs/fat/inode.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/fs/fat/inode.c b/fs/fat/inode.c index 594b05ae16c9..71946da84388 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -750,6 +750,13 @@ static struct inode *fat_alloc_inode(struct super_block *sb) return NULL; init_rwsem(&ei->truncate_lock); + /* Zeroing to allow iput() even if partial initialized inode. */ + ei->mmu_private = 0; + ei->i_start = 0; + ei->i_logstart = 0; + ei->i_attrs = 0; + ei->i_pos = 0; + return &ei->vfs_inode; } @@ -1374,16 +1381,6 @@ out: return 0; } -static void fat_dummy_inode_init(struct inode *inode) -{ - /* Initialize this dummy inode to work as no-op. */ - MSDOS_I(inode)->mmu_private = 0; - MSDOS_I(inode)->i_start = 0; - MSDOS_I(inode)->i_logstart = 0; - MSDOS_I(inode)->i_attrs = 0; - MSDOS_I(inode)->i_pos = 0; -} - static int fat_read_root(struct inode *inode) { struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb); @@ -1844,13 +1841,11 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, fat_inode = new_inode(sb); if (!fat_inode) goto out_fail; - fat_dummy_inode_init(fat_inode); sbi->fat_inode = fat_inode; fsinfo_inode = new_inode(sb); if (!fsinfo_inode) goto out_fail; - fat_dummy_inode_init(fsinfo_inode); fsinfo_inode->i_ino = MSDOS_FSINFO_INO; sbi->fsinfo_inode = fsinfo_inode; insert_inode_hash(fsinfo_inode); From a8198fedd94590ba28c1537440cdb260718ac13b Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior Date: Thu, 5 Mar 2020 22:28:39 -0800 Subject: [PATCH 388/400] mm/z3fold.c: do not include rwlock.h directly rwlock.h should not be included directly. Instead linux/splinlock.h should be included. One thing it does is to break the RT build. Signed-off-by: Andrew Morton Signed-off-by: Sebastian Andrzej Siewior Cc: Peter Zijlstra Cc: Vitaly Wool Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20200224133631.1510569-1-bigeasy@linutronix.de Signed-off-by: Linus Torvalds --- mm/z3fold.c | 1 - 1 file changed, 1 deletion(-) diff --git a/mm/z3fold.c b/mm/z3fold.c index 43754d8ebce8..42f31c4b53ad 100644 --- a/mm/z3fold.c +++ b/mm/z3fold.c @@ -41,7 +41,6 @@ #include #include #include -#include #include #include From c87cbc1f007c4b46165f05ceca04e1973cda0b9c Mon Sep 17 00:00:00 2001 From: Vlastimil Babka Date: Thu, 5 Mar 2020 22:28:42 -0800 Subject: [PATCH 389/400] mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled Commit cd02cf1aceea ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC") fixed memory hotplug with debug_pagealloc enabled, where onlining a page goes through page freeing, which removes the direct mapping. Some arches don't like when the page is not mapped in the first place, so generic_online_page() maps it first. This is somewhat wasteful, but better than special casing page freeing fast paths. The commit however missed that DEBUG_PAGEALLOC configured doesn't mean it's actually enabled. One has to test debug_pagealloc_enabled() since 031bc5743f15 ("mm/debug-pagealloc: make debug-pagealloc boottime configurable"), or alternatively debug_pagealloc_enabled_static() since 8e57f8acbbd1 ("mm, debug_pagealloc: don't rely on static keys too early"), but this is not done. As a result, a s390 kernel with DEBUG_PAGEALLOC configured but not enabled will crash: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:0000001ece13400b R2:000003fff7fd000b R3:000003fff7fcc007 S:000003fff7fd7000 P:000000000000013d Oops: 0004 ilc:2 [#1] SMP CPU: 1 PID: 26015 Comm: chmem Kdump: loaded Tainted: GX 5.3.18-5-default #1 SLE15-SP2 (unreleased) Krnl PSW : 0704e00180000000 0000001ecd281b9e (__kernel_map_pages+0x166/0x188) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0000000000000000 0000000000000800 0000400b00000000 0000000000000100 0000000000000001 0000000000000000 0000000000000002 0000000000000100 0000001ece139230 0000001ecdd98d40 0000400b00000100 0000000000000000 000003ffa17e4000 001fffe0114f7d08 0000001ecd4d93ea 001fffe0114f7b20 Krnl Code: 0000001ecd281b8e: ec17ffff00d8 ahik %r1,%r7,-1 0000001ecd281b94: ec111dbc0355 risbg %r1,%r1,29,188,3 >0000001ecd281b9e: 94fb5006 ni 6(%r5),251 0000001ecd281ba2: 41505008 la %r5,8(%r5) 0000001ecd281ba6: ec51fffc6064 cgrj %r5,%r1,6,1ecd281b9e 0000001ecd281bac: 1a07 ar %r0,%r7 0000001ecd281bae: ec03ff584076 crj %r0,%r3,4,1ecd281a5e Call Trace: [<0000001ecd281b9e>] __kernel_map_pages+0x166/0x188 [<0000001ecd4d9516>] online_pages_range+0xf6/0x128 [<0000001ecd2a8186>] walk_system_ram_range+0x7e/0xd8 [<0000001ecda28aae>] online_pages+0x2fe/0x3f0 [<0000001ecd7d02a6>] memory_subsys_online+0x8e/0xc0 [<0000001ecd7add42>] device_online+0x5a/0xc8 [<0000001ecd7d0430>] state_store+0x88/0x118 [<0000001ecd5b9f62>] kernfs_fop_write+0xc2/0x200 [<0000001ecd5064b6>] vfs_write+0x176/0x1e0 [<0000001ecd50676a>] ksys_write+0xa2/0x100 [<0000001ecda315d4>] system_call+0xd8/0x2c8 Fix this by checking debug_pagealloc_enabled_static() before calling kernel_map_pages(). Backports for kernel before 5.5 should use debug_pagealloc_enabled() instead. Also add comments. Fixes: cd02cf1aceea ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC") Reported-by: Gerald Schaefer Signed-off-by: Andrew Morton Signed-off-by: Vlastimil Babka Reviewed-by: David Hildenbrand Cc: Cc: Joonsoo Kim Cc: Qian Cai Link: http://lkml.kernel.org/r/20200224094651.18257-1-vbabka@suse.cz Signed-off-by: Linus Torvalds --- include/linux/mm.h | 4 ++++ mm/memory_hotplug.c | 8 +++++++- 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 52269e56c514..c54fb96cb1e6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2715,6 +2715,10 @@ static inline bool debug_pagealloc_enabled_static(void) #if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_ARCH_HAS_SET_DIRECT_MAP) extern void __kernel_map_pages(struct page *page, int numpages, int enable); +/* + * When called in DEBUG_PAGEALLOC context, the call should most likely be + * guarded by debug_pagealloc_enabled() or debug_pagealloc_enabled_static() + */ static inline void kernel_map_pages(struct page *page, int numpages, int enable) { diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 0a54ffac8c68..19389cdc16a5 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -574,7 +574,13 @@ EXPORT_SYMBOL_GPL(restore_online_page_callback); void generic_online_page(struct page *page, unsigned int order) { - kernel_map_pages(page, 1 << order, 1); + /* + * Freeing the page with debug_pagealloc enabled will try to unmap it, + * so we should map it first. This is better than introducing a special + * case in page freeing fast path. + */ + if (debug_pagealloc_enabled_static()) + kernel_map_pages(page, 1 << order, 1); __free_pages_core(page, order); totalram_pages_add(1UL << order); #ifdef CONFIG_HIGHMEM From 140d7e88bb2ac4af7b0db1fd6302179440f3c4be Mon Sep 17 00:00:00 2001 From: Miroslav Benes Date: Thu, 5 Mar 2020 22:28:45 -0800 Subject: [PATCH 390/400] arch/Kconfig: update HAVE_RELIABLE_STACKTRACE description save_stack_trace_tsk_reliable() is not the only function providing the reliable stack traces anymore. Architecture might define ARCH_STACKWALK which provides a newer stack walking interface and has arch_stack_walk_reliable() function. Update the description accordingly. Signed-off-by: Andrew Morton Signed-off-by: Miroslav Benes Acked-by: Josh Poimboeuf Link: http://lkml.kernel.org/r/20200120154042.9934-1-mbenes@suse.cz Signed-off-by: Linus Torvalds --- arch/Kconfig | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index 98de654b79b3..17fe351cdde0 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -738,8 +738,9 @@ config HAVE_STACK_VALIDATION config HAVE_RELIABLE_STACKTRACE bool help - Architecture has a save_stack_trace_tsk_reliable() function which - only returns a stack trace if it can guarantee the trace is reliable. + Architecture has either save_stack_trace_tsk_reliable() or + arch_stack_walk_reliable() function which only returns a stack trace + if it can guarantee the trace is reliable. config HAVE_ARCH_HASH bool From 14afc59361976c0ba39e3a9589c3eaa43ebc7e1d Mon Sep 17 00:00:00 2001 From: Carlo Nonato Date: Fri, 6 Mar 2020 13:27:31 +0100 Subject: [PATCH 391/400] block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group() The bfq_find_set_group() function takes as input a blkcg (which represents a cgroup) and retrieves the corresponding bfq_group, then it updates the bfq internal group hierarchy (see comments inside the function for why this is needed) and finally it returns the bfq_group. In the hierarchy update cycle, the pointer holding the correct bfq_group that has to be returned is mistakenly used to traverse the hierarchy bottom to top, meaning that in each iteration it gets overwritten with the parent of the current group. Since the update cycle stops at root's children (depth = 2), the overwrite becomes a problem only if the blkcg describes a cgroup at a hierarchy level deeper than that (depth > 2). In this case the root's child that happens to be also an ancestor of the correct bfq_group is returned. The main consequence is that processes contained in a cgroup at depth greater than 2 are wrongly placed in the group described above by BFQ. This commits fixes this problem by using a different bfq_group pointer in the update cycle in order to avoid the overwrite of the variable holding the original group reference. Reported-by: Kwon Je Oh Signed-off-by: Carlo Nonato Signed-off-by: Paolo Valente Signed-off-by: Jens Axboe --- block/bfq-cgroup.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index 09b69a3ed490..f0ff6654af28 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -610,12 +610,13 @@ struct bfq_group *bfq_find_set_group(struct bfq_data *bfqd, */ entity = &bfqg->entity; for_each_entity(entity) { - bfqg = container_of(entity, struct bfq_group, entity); - if (bfqg != bfqd->root_group) { - parent = bfqg_parent(bfqg); + struct bfq_group *curr_bfqg = container_of(entity, + struct bfq_group, entity); + if (curr_bfqg != bfqd->root_group) { + parent = bfqg_parent(curr_bfqg); if (!parent) parent = bfqd->root_group; - bfq_group_set_parent(bfqg, parent); + bfq_group_set_parent(curr_bfqg, parent); } } From 6d390e4b5d48ec03bb87e63cf0a2bff5f4e116da Mon Sep 17 00:00:00 2001 From: yangerkun Date: Wed, 4 Mar 2020 15:25:56 +0800 Subject: [PATCH 392/400] locks: fix a potential use-after-free problem when wakeup a waiter MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit '16306a61d3b7 ("fs/locks: always delete_block after waiting.")' add the logic to check waiter->fl_blocker without blocked_lock_lock. And it will trigger a UAF when we try to wakeup some waiter: Thread 1 has create a write flock a on file, and now thread 2 try to unlock and delete flock a, thread 3 try to add flock b on the same file. Thread2 Thread3 flock syscall(create flock b) ...flock_lock_inode_wait flock_lock_inode(will insert our fl_blocked_member list to flock a's fl_blocked_requests) sleep flock syscall(unlock) ...flock_lock_inode_wait locks_delete_lock_ctx ...__locks_wake_up_blocks __locks_delete_blocks( b->fl_blocker = NULL) ... break by a signal locks_delete_block b->fl_blocker == NULL && list_empty(&b->fl_blocked_requests) success, return directly locks_free_lock b wake_up(&b->fl_waiter) trigger UAF Fix it by remove this logic, and this patch may also fix CVE-2019-19769. Cc: stable@vger.kernel.org Fixes: 16306a61d3b7 ("fs/locks: always delete_block after waiting.") Signed-off-by: yangerkun Signed-off-by: Jeff Layton --- fs/locks.c | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 44b6da032842..426b55d333d5 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -753,20 +753,6 @@ int locks_delete_block(struct file_lock *waiter) { int status = -ENOENT; - /* - * If fl_blocker is NULL, it won't be set again as this thread - * "owns" the lock and is the only one that might try to claim - * the lock. So it is safe to test fl_blocker locklessly. - * Also if fl_blocker is NULL, this waiter is not listed on - * fl_blocked_requests for some lock, so no other request can - * be added to the list of fl_blocked_requests for this - * request. So if fl_blocker is NULL, it is safe to - * locklessly check if fl_blocked_requests is empty. If both - * of these checks succeed, there is no need to take the lock. - */ - if (waiter->fl_blocker == NULL && - list_empty(&waiter->fl_blocked_requests)) - return status; spin_lock(&blocked_lock_lock); if (waiter->fl_blocker) status = 0; From c1e2148f8ecb26863b899d402a823dab8e26efd1 Mon Sep 17 00:00:00 2001 From: Jens Axboe Date: Wed, 4 Mar 2020 07:25:50 -0700 Subject: [PATCH 393/400] io_uring: free fixed_file_data after RCU grace period The percpu refcount protects this structure, and we can have an atomic switch in progress when exiting. This makes it unsafe to just free the struct normally, and can trigger the following KASAN warning: BUG: KASAN: use-after-free in percpu_ref_switch_to_atomic_rcu+0xfa/0x1b0 Read of size 1 at addr ffff888181a19a30 by task swapper/0/0 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc4+ #5747 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: dump_stack+0x76/0xa0 print_address_description.constprop.0+0x3b/0x60 ? percpu_ref_switch_to_atomic_rcu+0xfa/0x1b0 ? percpu_ref_switch_to_atomic_rcu+0xfa/0x1b0 __kasan_report.cold+0x1a/0x3d ? percpu_ref_switch_to_atomic_rcu+0xfa/0x1b0 percpu_ref_switch_to_atomic_rcu+0xfa/0x1b0 rcu_core+0x370/0x830 ? percpu_ref_exit+0x50/0x50 ? rcu_note_context_switch+0x7b0/0x7b0 ? run_rebalance_domains+0x11d/0x140 __do_softirq+0x10a/0x3e9 irq_exit+0xd5/0xe0 smp_apic_timer_interrupt+0x86/0x200 apic_timer_interrupt+0xf/0x20 RIP: 0010:default_idle+0x26/0x1f0 Fix this by punting the final exit and free of the struct to RCU, then we know that it's safe to do so. Jann suggested the approach of using a double rcu callback to achieve this. It's important that we do a nested call_rcu() callback, as otherwise the free could be ordered before the atomic switch, even if the latter was already queued. Reported-by: syzbot+e017e49c39ab484ac87a@syzkaller.appspotmail.com Suggested-by: Jann Horn Reviewed-by: Paul E. McKenney Signed-off-by: Jens Axboe --- fs/io_uring.c | 24 ++++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 6a595c13e108..68050b61ad0e 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -191,6 +191,7 @@ struct fixed_file_data { struct llist_head put_llist; struct work_struct ref_work; struct completion done; + struct rcu_head rcu; }; struct io_ring_ctx { @@ -5329,6 +5330,26 @@ static void io_file_ref_kill(struct percpu_ref *ref) complete(&data->done); } +static void __io_file_ref_exit_and_free(struct rcu_head *rcu) +{ + struct fixed_file_data *data = container_of(rcu, struct fixed_file_data, + rcu); + percpu_ref_exit(&data->refs); + kfree(data); +} + +static void io_file_ref_exit_and_free(struct rcu_head *rcu) +{ + /* + * We need to order our exit+free call against the potentially + * existing call_rcu() for switching to atomic. One way to do that + * is to have this rcu callback queue the final put and free, as we + * could otherwise have a pre-existing atomic switch complete _after_ + * the free callback we queued. + */ + call_rcu(rcu, __io_file_ref_exit_and_free); +} + static int io_sqe_files_unregister(struct io_ring_ctx *ctx) { struct fixed_file_data *data = ctx->file_data; @@ -5341,14 +5362,13 @@ static int io_sqe_files_unregister(struct io_ring_ctx *ctx) flush_work(&data->ref_work); wait_for_completion(&data->done); io_ring_file_ref_flush(data); - percpu_ref_exit(&data->refs); __io_sqe_files_unregister(ctx); nr_tables = DIV_ROUND_UP(ctx->nr_user_files, IORING_MAX_FILES_TABLE); for (i = 0; i < nr_tables; i++) kfree(data->table[i].files); kfree(data->table); - kfree(data); + call_rcu(&data->rcu, io_file_ref_exit_and_free); ctx->file_data = NULL; ctx->nr_user_files = 0; return 0; From 14ee09a05ed5b8b9121ca80958a06fdfc8c85d93 Mon Sep 17 00:00:00 2001 From: Ulf Hansson Date: Tue, 3 Mar 2020 16:07:46 +0100 Subject: [PATCH 394/400] dt-bindings: power: Extend nodename pattern for power-domain providers The existing binding requires the nodename to have a '@', which is a bit limiting for the wider use case. Therefore, let's extend the pattern to allow either '@' or '-'. Fixes: a3f048b5424e ("dt: psci: Update DT bindings to support hierarchical PSCI states") Signed-off-by: Ulf Hansson [robh: drop example change] Signed-off-by: Rob Herring --- Documentation/devicetree/bindings/power/power-domain.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/power/power-domain.yaml b/Documentation/devicetree/bindings/power/power-domain.yaml index 207e63ae10f9..6047aacd7766 100644 --- a/Documentation/devicetree/bindings/power/power-domain.yaml +++ b/Documentation/devicetree/bindings/power/power-domain.yaml @@ -25,7 +25,7 @@ description: |+ properties: $nodename: - pattern: "^(power-controller|power-domain)(@.*)?$" + pattern: "^(power-controller|power-domain)([@-].*)?$" domain-idle-states: $ref: /schemas/types.yaml#/definitions/phandle-array From d2334a91a3b01dce4f290b4536fcfa4b9e923a3d Mon Sep 17 00:00:00 2001 From: Ulf Hansson Date: Tue, 3 Mar 2020 16:07:47 +0100 Subject: [PATCH 395/400] dt-bindings: arm: Fixup the DT bindings for hierarchical PSCI states The hierarchical topology with power-domain should be described through child nodes, rather than as currently described in the PSCI root node. Fix this by adding a patternProperties with a corresponding reference to the power-domain DT binding. Additionally, update the example to conform to the new pattern, but also to the adjusted domain-idle-state DT binding. Fixes: a3f048b5424e ("dt: psci: Update DT bindings to support hierarchical PSCI states") Signed-off-by: Ulf Hansson [robh: Add missing allOf, tweak power-domain node name] Signed-off-by: Rob Herring --- .../devicetree/bindings/arm/psci.yaml | 28 +++++++++---------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/Documentation/devicetree/bindings/arm/psci.yaml b/Documentation/devicetree/bindings/arm/psci.yaml index 0bc3c43a525a..5e66934455bb 100644 --- a/Documentation/devicetree/bindings/arm/psci.yaml +++ b/Documentation/devicetree/bindings/arm/psci.yaml @@ -102,11 +102,12 @@ properties: [1] Kernel documentation - ARM idle states bindings Documentation/devicetree/bindings/arm/idle-states.yaml - "#power-domain-cells": - description: - The number of cells in a PM domain specifier as per binding in [3]. - Must be 0 as to represent a single PM domain. - +patternProperties: + "^power-domain-": + allOf: + - $ref: "../power/power-domain.yaml#" + type: object + description: | ARM systems can have multiple cores, sometimes in an hierarchical arrangement. This often, but not always, maps directly to the processor power topology of the system. Individual nodes in a topology have their @@ -122,15 +123,9 @@ properties: helps to implement support for OSI mode and OS implementations may choose to mandate it. - [3] Documentation/devicetree/bindings/power/power_domain.txt + [3] Documentation/devicetree/bindings/power/power-domain.yaml [4] Documentation/devicetree/bindings/power/domain-idle-state.yaml - power-domains: - $ref: '/schemas/types.yaml#/definitions/phandle-array' - description: - List of phandles and PM domain specifiers, as defined by bindings of the - PM domain provider. - required: - compatible - method @@ -224,6 +219,9 @@ examples: exit-latency-us = <10>; min-residency-us = <100>; }; + }; + + domain-idle-states { CLUSTER_RET: cluster-retention { compatible = "domain-idle-state"; @@ -247,19 +245,19 @@ examples: compatible = "arm,psci-1.0"; method = "smc"; - CPU_PD0: cpu-pd0 { + CPU_PD0: power-domain-cpu0 { #power-domain-cells = <0>; domain-idle-states = <&CPU_PWRDN>; power-domains = <&CLUSTER_PD>; }; - CPU_PD1: cpu-pd1 { + CPU_PD1: power-domain-cpu1 { #power-domain-cells = <0>; domain-idle-states = <&CPU_PWRDN>; power-domains = <&CLUSTER_PD>; }; - CLUSTER_PD: cluster-pd { + CLUSTER_PD: power-domain-cluster { #power-domain-cells = <0>; domain-idle-states = <&CLUSTER_RET>, <&CLUSTER_PWRDN>; }; From 513dc792d6060d5ef572e43852683097a8420f56 Mon Sep 17 00:00:00 2001 From: Zhang Xiaoxu Date: Wed, 4 Mar 2020 10:24:29 +0800 Subject: [PATCH 396/400] vgacon: Fix a UAF in vgacon_invert_region When syzkaller tests, there is a UAF: BUG: KASan: use after free in vgacon_invert_region+0x9d/0x110 at addr ffff880000100000 Read of size 2 by task syz-executor.1/16489 page:ffffea0000004000 count:0 mapcount:-127 mapping: (null) index:0x0 page flags: 0xfffff00000000() page dumped because: kasan: bad access detected CPU: 1 PID: 16489 Comm: syz-executor.1 Not tainted Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 Call Trace: [] dump_stack+0x1e/0x20 [] kasan_report+0x577/0x950 [] __asan_load2+0x62/0x80 [] vgacon_invert_region+0x9d/0x110 [] invert_screen+0xe5/0x470 [] set_selection+0x44b/0x12f0 [] tioclinux+0xee/0x490 [] vt_ioctl+0xff4/0x2670 [] tty_ioctl+0x46a/0x1a10 [] do_vfs_ioctl+0x5bd/0xc40 [] SyS_ioctl+0x132/0x170 [] system_call_fastpath+0x22/0x27 Memory state around the buggy address: ffff8800000fff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8800000fff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff880000100000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff It can be reproduce in the linux mainline by the program: #include #include #include #include #include #include #include #include struct tiocl_selection { unsigned short xs; /* X start */ unsigned short ys; /* Y start */ unsigned short xe; /* X end */ unsigned short ye; /* Y end */ unsigned short sel_mode; /* selection mode */ }; #define TIOCL_SETSEL 2 struct tiocl { unsigned char type; unsigned char pad; struct tiocl_selection sel; }; int main() { int fd = 0; const char *dev = "/dev/char/4:1"; struct vt_consize v = {0}; struct tiocl tioc = {0}; fd = open(dev, O_RDWR, 0); v.v_rows = 3346; ioctl(fd, VT_RESIZEX, &v); tioc.type = TIOCL_SETSEL; ioctl(fd, TIOCLINUX, &tioc); return 0; } When resize the screen, update the 'vc->vc_size_row' to the new_row_size, but when 'set_origin' in 'vgacon_set_origin', vgacon use 'vga_vram_base' for 'vc_origin' and 'vc_visible_origin', not 'vc_screenbuf'. It maybe smaller than 'vc_screenbuf'. When TIOCLINUX, use the new_row_size to calc the offset, it maybe larger than the vga_vram_size in vgacon driver, then bad access. Also, if set an larger screenbuf firstly, then set an more larger screenbuf, when copy old_origin to new_origin, a bad access may happen. So, If the screen size larger than vga_vram, resize screen should be failed. This alse fix CVE-2020-8649 and CVE-2020-8647. Linus pointed out that overflow checking seems absent. We're saved by the existing bounds checks in vc_do_resize() with rather strict limits: if (cols > VC_RESIZE_MAXCOL || lines > VC_RESIZE_MAXROW) return -EINVAL; Fixes: 0aec4867dca14 ("[PATCH] SVGATextMode fix") Reference: CVE-2020-8647 and CVE-2020-8649 Reported-by: Hulk Robot Signed-off-by: Zhang Xiaoxu [danvet: augment commit message to point out overflow safety] Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20200304022429.37738-1-zhangxiaoxu5@huawei.com --- drivers/video/console/vgacon.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/video/console/vgacon.c b/drivers/video/console/vgacon.c index de7b8382aba9..998b0de1812f 100644 --- a/drivers/video/console/vgacon.c +++ b/drivers/video/console/vgacon.c @@ -1316,6 +1316,9 @@ static int vgacon_font_get(struct vc_data *c, struct console_font *font) static int vgacon_resize(struct vc_data *c, unsigned int width, unsigned int height, unsigned int user) { + if ((width << 1) * height > vga_vram_size) + return -EINVAL; + if (width % 2 || width > screen_info.orig_video_cols || height > (screen_info.orig_video_lines * vga_default_font_height)/ c->vc_font.height) From 611d61f9ac99dc9e1494473fb90117a960a89dfa Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Jonathan=20Neusch=C3=A4fer?= Date: Fri, 6 Mar 2020 23:13:11 +0100 Subject: [PATCH 397/400] parse-maintainers: Mark as executable MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This makes the script more convenient to run. Signed-off-by: Jonathan Neuschäfer Signed-off-by: Linus Torvalds --- scripts/parse-maintainers.pl | 0 1 file changed, 0 insertions(+), 0 deletions(-) mode change 100644 => 100755 scripts/parse-maintainers.pl diff --git a/scripts/parse-maintainers.pl b/scripts/parse-maintainers.pl old mode 100644 new mode 100755 From f0e20b8943509d81200cef5e30af2adfddba0f5c Mon Sep 17 00:00:00 2001 From: Pavel Begunkov Date: Sat, 7 Mar 2020 01:15:22 +0300 Subject: [PATCH 398/400] io_uring: fix lockup with timeouts There is a recipe to deadlock the kernel: submit a timeout sqe with a linked_timeout (e.g. test_single_link_timeout_ception() from liburing), and SIGKILL the process. Then, io_kill_timeouts() takes @ctx->completion_lock, but the timeout isn't flagged with REQ_F_COMP_LOCKED, and will try to double grab it during io_put_free() to cancel the linked timeout. Probably, the same can happen with another io_kill_timeout() call site, that is io_commit_cqring(). Signed-off-by: Pavel Begunkov Signed-off-by: Jens Axboe --- fs/io_uring.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/io_uring.c b/fs/io_uring.c index 68050b61ad0e..c06082bb039a 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -1000,6 +1000,7 @@ static void io_kill_timeout(struct io_kiocb *req) if (ret != -1) { atomic_inc(&req->ctx->cq_timeouts); list_del_init(&req->list); + req->flags |= REQ_F_COMP_LOCKED; io_cqring_fill_event(req, 0); io_put_req(req); } From 286d3250c9d6437340203fb64938bea344729a0e Mon Sep 17 00:00:00 2001 From: Vladis Dronov Date: Sun, 8 Mar 2020 09:08:54 +0100 Subject: [PATCH 399/400] efi: Fix a race and a buffer overflow while reading efivars via sysfs There is a race and a buffer overflow corrupting a kernel memory while reading an EFI variable with a size more than 1024 bytes via the older sysfs method. This happens because accessing struct efi_variable in efivar_{attr,size,data}_read() and friends is not protected from a concurrent access leading to a kernel memory corruption and, at best, to a crash. The race scenario is the following: CPU0: CPU1: efivar_attr_read() var->DataSize = 1024; efivar_entry_get(... &var->DataSize) down_interruptible(&efivars_lock) efivar_attr_read() // same EFI var var->DataSize = 1024; efivar_entry_get(... &var->DataSize) down_interruptible(&efivars_lock) virt_efi_get_variable() // returns EFI_BUFFER_TOO_SMALL but // var->DataSize is set to a real // var size more than 1024 bytes up(&efivars_lock) virt_efi_get_variable() // called with var->DataSize set // to a real var size, returns // successfully and overwrites // a 1024-bytes kernel buffer up(&efivars_lock) This can be reproduced by concurrent reading of an EFI variable which size is more than 1024 bytes: ts# for cpu in $(seq 0 $(nproc --ignore=1)); do ( taskset -c $cpu \ cat /sys/firmware/efi/vars/KEKDefault*/size & ) ; done Fix this by using a local variable for a var's data buffer size so it does not get overwritten. Fixes: e14ab23dde12b80d ("efivars: efivar_entry API") Reported-by: Bob Sanders and the LTP testsuite Signed-off-by: Vladis Dronov Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Cc: Link: https://lore.kernel.org/r/20200305084041.24053-2-vdronov@redhat.com Link: https://lore.kernel.org/r/20200308080859.21568-24-ardb@kernel.org --- drivers/firmware/efi/efivars.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/drivers/firmware/efi/efivars.c b/drivers/firmware/efi/efivars.c index 7576450c8254..69f13bc4b931 100644 --- a/drivers/firmware/efi/efivars.c +++ b/drivers/firmware/efi/efivars.c @@ -83,13 +83,16 @@ static ssize_t efivar_attr_read(struct efivar_entry *entry, char *buf) { struct efi_variable *var = &entry->var; + unsigned long size = sizeof(var->Data); char *str = buf; + int ret; if (!entry || !buf) return -EINVAL; - var->DataSize = 1024; - if (efivar_entry_get(entry, &var->Attributes, &var->DataSize, var->Data)) + ret = efivar_entry_get(entry, &var->Attributes, &size, var->Data); + var->DataSize = size; + if (ret) return -EIO; if (var->Attributes & EFI_VARIABLE_NON_VOLATILE) @@ -116,13 +119,16 @@ static ssize_t efivar_size_read(struct efivar_entry *entry, char *buf) { struct efi_variable *var = &entry->var; + unsigned long size = sizeof(var->Data); char *str = buf; + int ret; if (!entry || !buf) return -EINVAL; - var->DataSize = 1024; - if (efivar_entry_get(entry, &var->Attributes, &var->DataSize, var->Data)) + ret = efivar_entry_get(entry, &var->Attributes, &size, var->Data); + var->DataSize = size; + if (ret) return -EIO; str += sprintf(str, "0x%lx\n", var->DataSize); @@ -133,12 +139,15 @@ static ssize_t efivar_data_read(struct efivar_entry *entry, char *buf) { struct efi_variable *var = &entry->var; + unsigned long size = sizeof(var->Data); + int ret; if (!entry || !buf) return -EINVAL; - var->DataSize = 1024; - if (efivar_entry_get(entry, &var->Attributes, &var->DataSize, var->Data)) + ret = efivar_entry_get(entry, &var->Attributes, &size, var->Data); + var->DataSize = size; + if (ret) return -EIO; memcpy(buf, var->Data, var->DataSize); @@ -250,14 +259,16 @@ efivar_show_raw(struct efivar_entry *entry, char *buf) { struct efi_variable *var = &entry->var; struct compat_efi_variable *compat; + unsigned long datasize = sizeof(var->Data); size_t size; + int ret; if (!entry || !buf) return 0; - var->DataSize = 1024; - if (efivar_entry_get(entry, &entry->var.Attributes, - &entry->var.DataSize, entry->var.Data)) + ret = efivar_entry_get(entry, &var->Attributes, &datasize, var->Data); + var->DataSize = datasize; + if (ret) return -EIO; if (in_compat_syscall()) { From d6c066fda90d578aacdf19771a027ed484a79825 Mon Sep 17 00:00:00 2001 From: Vladis Dronov Date: Sun, 8 Mar 2020 09:08:55 +0100 Subject: [PATCH 400/400] efi: Add a sanity check to efivar_store_raw() Add a sanity check to efivar_store_raw() the same way efivar_{attr,size,data}_read() and efivar_show_raw() have it. Signed-off-by: Vladis Dronov Signed-off-by: Ard Biesheuvel Signed-off-by: Ingo Molnar Cc: Link: https://lore.kernel.org/r/20200305084041.24053-3-vdronov@redhat.com Link: https://lore.kernel.org/r/20200308080859.21568-25-ardb@kernel.org --- drivers/firmware/efi/efivars.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/firmware/efi/efivars.c b/drivers/firmware/efi/efivars.c index 69f13bc4b931..aff3dfb4d7ba 100644 --- a/drivers/firmware/efi/efivars.c +++ b/drivers/firmware/efi/efivars.c @@ -208,6 +208,9 @@ efivar_store_raw(struct efivar_entry *entry, const char *buf, size_t count) u8 *data; int err; + if (!entry || !buf) + return -EINVAL; + if (in_compat_syscall()) { struct compat_efi_variable *compat;