linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2024-09-30 06:10:56 +00:00

Author	SHA1	Message	Date
Jassi Brar	f78f4107ea	dt-bindings: net: Add DT bindings for Socionext Netsec This patch adds documentation for Device-Tree bindings for the Socionext NetSec Controller driver. Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-10 14:50:29 -05:00
David S. Miller	c215dae430	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2018-01-09 This series contains updates to ixgbe and ixgbevf only. Emil fixes an issue with "wake on LAN"(WoL) where we need to ensure we enable the reception of multicast packets so that WoL works for IPv6 magic packets. Cleaned up code no longer needed with the update to adaptive ITR. Paul update the driver to advertise the highest capable link speed when a module gets inserted. Also extended the displaying of firmware version to include the iSCSI and OEM block in the EEPROM to better identify firmware versions/images. Tonghao Zhang cleans up a code comment that no longer applies since InterruptThrottleRate has been removed from the driver. Alex fixes SR-IOV and MACVLAN offload interaction, where the MACVLAN offload was incorrectly configuring several filters with the wrong pool value which resulted in MACLVAN interfaces not being able to receive traffic that had to pass over the physical interface. Fixed transmit hangs and dropped receive frames when the number of VFs changed. Added support for RSS on MACVLAN pools for X550 devices. Fixed up the MACVLAN limitations so we can now support 63 offloaded devices. Cleaned up MACVLAN code that is no longer needed with the recent changes and fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-10 14:38:06 -05:00
Linus Torvalds	5f615b97cd	sound fixes for 4.15-rc8 A collection of the last-minute small PCM fixes: - A workaround for the recent regression wrt PulseAudio - Removal of spurious WARN_ON() that is triggered by syzkaller - Fixes for aloop, hardening racy accesses - Fixes in PCM OSS emulation wrt the unabortable loops that may cause RCU stall -----BEGIN PGP SIGNATURE----- iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAlpUeWoOHHRpd2FpQHN1 c2UuZGUACgkQLtJE4w1nLE8rMxAAy+XJNWigvkWHd79ttKeAmndia/u9d+T6Ge/I VI/SJSy8vhnGO0YNf/AHEs6vtad73XnXP76x1H3TkCsrDxykfhKogCvp0Aat/Ji7 LQFkhQKsaEdACm2TlPxmxpO64sYB8UjvcZBFS82tCmNCldMkwi8T+DDDHocP0A0D pOQogjffqPBZdk7X1hJxoVKOm95GI1ms09+JPrLl47aa6mLIvNxa81RGnrVK5blE +kYZQAblweGN8RsMVWqyrnxgRatF59UbV6JIKui/8KD2AXl3Hya/Dn2aFWtMqqH8 p8siLsUI+tACPucNk7tMt9UjHEy7yGK02hClhYVZG6vZ81nSoJsJFTdwXBMKjrfy Fa1bBb8quM6WfBEHXB7YISulUrrc2nftkPhB/zIa5E9arkHWY4FL7jhdUTEAjkgr D0Ka3Q/PtdXxmK+NBqUpoiqDHoOQeA5HG+njsz5L0xSbxoxMy8guyxSaoeF4BOnW KbrVbzcJzSxDWPYGbKmeLEYHW8P3FOKNv9SI/WZErmyjQkeMiq7AuP93yYACFEyj LhSAxBZ00sStl6IgM4Unw6p4Gi0SOawQfADDG4Arfr/fRA52l9wmpaUwU3uJ1RMn gLvLfJkBbs/MwBjD5BPxfdjKIuREvMdUBwl/hZk1zp5d2ay0lAl6toNcee//MuHf DKd1t3I= =Fz+p -----END PGP SIGNATURE----- Merge tag 'sound-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of the last-minute small PCM fixes: - A workaround for the recent regression wrt PulseAudio - Removal of spurious WARN_ON() that is triggered by syzkaller - Fixes for aloop, hardening racy accesses - Fixes in PCM OSS emulation wrt the unabortable loops that may cause RCU stall" * tag 'sound-4.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: pcm: Allow aborting mutex lock at OSS read/write loops ALSA: pcm: Abort properly at pending signal in OSS read/write loops ALSA: aloop: Fix racy hw constraints adjustment ALSA: aloop: Fix inconsistent format due to incomplete rule ALSA: aloop: Release cable upon open error path ALSA: pcm: Workaround for weird PulseAudio behavior on rewind error ALSA: pcm: Add missing error checks in OSS emulation plugin builder ALSA: pcm: Remove incorrect snd_BUG_ON() usages	2018-01-10 11:18:31 -08:00
Lukas Wunner	ff8759609d	Bluetooth: btbcm: Fix sleep mode struct ordering According to the documentation for Laird SD40 radio modules (which use the BCM4329 chipset), the order of the Enable_BREAK_To_Host and Pulsed_HOST_WAKE parameters in the sleep mode struct is reversed vis-à-vis our struct declaration. See page 46 of this PDF: http://cdn.lairdtech.com/home/brandworld/files/Application%20Note%20-%2040%20Series%20Bluetooth.pdf The documentation is dated Oct 2015, so fairly recent, making it appear more likely that the documentation is correct and our code is wrong. Amend our code to be in congruence with the documentation. Cc: Sue White <sue.white@lairdtech.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:14 +01:00
Lukas Wunner	e4b9e5b861	Bluetooth: hci_bcm: Sleep instead of spinning The driver calls mdelay(15) in the ->suspend, ->resume, ->runtime_suspend and ->runtime_resume hook, however spinning for such a long period of time is discouraged as per Documentation/timers/timers-howto.txt. The use of mdelay() seems unnecessary, it is allowed to sleep in the system sleep and runtime PM hooks (with the exception of ->suspend_noirq and ->resume_noirq) and the driver itself also does not rely on a non-sleeping ->runtime_resume as the only place where a synchronous resume is performed, in bcm_dequeue(), is called from a work item in hci_ldisc.c and hci_serdev.c. So replace the mdelay(15) with msleep(15). Note that the delay is inserted after asserting or deasserting the device wake pin, but in bcm_gpio_set_power() that pin is asserted or deasserted without observing a delay. It is thus unclear if the delay is necessary at all. It is likewise unclear why it is exactly 15 ms, the commit introducing it, `118612fb91` ("Bluetooth: hci_bcm: Add suspend/resume PM functions"), does not provide a rationale. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Suggested-and-reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:14 +01:00
Lukas Wunner	5954cdf179	Bluetooth: hci_bcm: Silence IRQ printk The host wake IRQ is optional, but if none is found, "BCM irq: -22" is logged which may irritate users. This is really a debug message, so use dev_dbg() instead of dev_info(). If users are interested in the IRQ, they can always consult /proc/interrupts. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:13 +01:00
Lukas Wunner	4c33162c1a	Bluetooth: hci_bcm: Support Apple GPIO handling Enable Bluetooth on the following Macs which provide custom ACPI methods to toggle the GPIOs for device wake and shutdown instead of accessing the pins directly: MacBook8,1 2015 12" MacBook9,1 2016 12" MacBook10,1 2017 12" MacBookPro13,1 2016 13" MacBookPro13,2 2016 13" with Touch Bar MacBookPro13,3 2016 15" with Touch Bar MacBookPro14,1 2017 13" MacBookPro14,2 2017 13" with Touch Bar MacBookPro14,3 2017 15" with Touch Bar On the MacBook8,1 Bluetooth is muxed with a second device (a debug port on the SSD) under the control of PCH GPIO 36. Because serdev cannot deal with multiple slaves yet, it is currently necessary to patch the DSDT and remove the SSDC device. The custom ACPI methods are called: BTLP (Low Power) takes one argument, toggles device wake GPIO BTPU (Power Up) tells SMC to drive shutdown GPIO high BTPD (Power Down) tells SMC to drive shutdown GPIO low BTRS (Reset) calls BTPD followed by BTPU BTRB unknown, not present on all MacBooks Search for the BTLP, BTPU and BTPD methods on ->probe and cache them in struct bcm_device if the machine is a Mac. Additionally, set the init_speed based on a custom device property provided by Apple in lieu of _CRS resources. The Broadcom UART's speed is fixed on Apple Macs: Any attempt to change it results in Bluetooth status code 0x0c and bcm_set_baudrate() thus always returns -EBUSY. By setting only the init_speed and leaving oper_speed at zero, we can achieve that the host UART's speed is adjusted but the Broadcom UART's speed is left as is. The host wake pin goes into the SMC which handles it independently of the OS, so there's no IRQ for it. Thanks to Ronald Tschalär who did extensive debugging and testing of this patch and contributed fixes. ACPI snippet containing the custom methods and device properties (taken from a MacBook8,1): Method (BTLP, 1, Serialized) { If (LEqual (Arg0, 0x00)) { Store (0x01, GD54) /* set PCH GPIO 54 direction to input / } If (LEqual (Arg0, 0x01)) { Store (0x00, GD54) / set PCH GPIO 54 direction to output / Store (0x00, GP54) / set PCH GPIO 54 value to low */ } } Method (BTPU, 0, Serialized) { Store (0x01, \_SB.PCI0.LPCB.EC.BTPC) Sleep (0x0A) } Method (BTPD, 0, Serialized) { Store (0x00, \_SB.PCI0.LPCB.EC.BTPC) Sleep (0x0A) } Method (BTRS, 0, Serialized) { BTPD () BTPU () } Method (_DSM, 4, NotSerialized) // _DSM: Device-Specific Method { If (LEqual (Arg0, ToUUID ("a0b5b7c6-1318-441c-b0c9-fe695eaf949b"))) { Store (Package (0x08) { "baud", Buffer (0x08) { 0xC0, 0xC6, 0x2D, 0x00, 0x00, 0x00, 0x00, 0x00 }, "parity", Buffer (0x08) { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, "dataBits", Buffer (0x08) { 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }, "stopBits", Buffer (0x08) { 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 } }, Local0) DTGP (Arg0, Arg1, Arg2, Arg3, RefOf (Local0)) Return (Local0) } Return (0x00) } Link: https://github.com/Dunedan/mbp-2016-linux/issues/29 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=110901 Reported-by: Leif Liddy <leif.liddy@gmail.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Cc: Loic Poulain <loic.poulain@linaro.org> Cc: Hans de Goede <hdegoede@redhat.com> Tested-by: Max Shavrick <mxms@me.com> [MacBook8,1] Tested-by: Leif Liddy <leif.liddy@gmail.com> [MacBook9,1] Tested-by: Daniel Roschka <danielroschka@phoenitydawn.de> [MacBookPro13,2] Tested-by: Ronald Tschalär <ronald@innovation.ch> [MacBookPro13,3] Tested-by: Peter Y. Chuang <peteryuchuang@gmail.com> [MacBookPro14,1] Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ronald Tschalär <ronald@innovation.ch> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:13 +01:00
Lukas Wunner	8bfa7e1e03	Bluetooth: hci_bcm: Handle errors properly A significant portion of this driver lacks error handling. As a first step, add error paths to bcm_gpio_set_power(), bcm_open(), bcm_close(), bcm_suspend_device(), bcm_resume_device(), bcm_resume(), bcm_probe() and bcm_serdev_probe(). (I've also scrutinized bcm_suspend() but think it's fine as is.) Those are all the functions accessing the device wake and shutdown GPIO. On Apple Macs the pins are accessed through ACPI methods, which may fail for various reasons, hence proper error handling is necessary. Non-Macs access the pins directly, which may fail as well but the GPIO core does not yet pass back errors to consumers. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Cc: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:13 +01:00
Lukas Wunner	8353b4a636	Bluetooth: hci_bcm: Add callbacks to toggle GPIOs MacBooks provides custom ACPI methods to toggle the GPIOs for device wake and shutdown instead of accessing the pins directly. Prepare for their support by adding callbacks to toggle the GPIOs, which on non-Macs do nothing more but call gpiod_set_value(). No functional change intended. Suggested-and-reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:13 +01:00
Lukas Wunner	b7c2abac14	Bluetooth: hci_bcm: Document struct bcm_device Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:13 +01:00
Lukas Wunner	4dc273306c	Bluetooth: hci_bcm: Invalidate IRQ on request failure If devm_request_irq() fails, the driver bails out of bcm_request_irq() but continues to ->setup the device (because the IRQ is optional). The driver subsequently calls devm_free_irq(), enable_irq_wake() and disable_irq_wake() on the IRQ even though requesting it failed. Avoid by invalidating the IRQ on request failure. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:13 +01:00
Lukas Wunner	f4cf6b7e3b	Bluetooth: hci_bcm: Fix unbalanced pm_runtime_disable() On ->setup, pm_runtime_enable() is only called if a valid IRQ was found, but on ->close(), pm_runtime_disable() is called unconditionally. Disablement of runtime PM is recorded in a counter, so every pm_runtime_disable() needs to be balanced. Fix it. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reported-and-reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:13 +01:00
Lukas Wunner	54ba69f9e7	Bluetooth: hci_bcm: Fix race on close Upon ->close, the driver powers the Bluetooth controller down, deasserts the device wake pin, updates the runtime PM status to "suspended" and finally frees the IRQ. Because the IRQ is freed last, a runtime resume can take place after the controller was powered down. The impact is not grave, the worst thing that can happen is that the device wake pin is reasserted (should have no effect while the regulator is off) and that setting the runtime PM status to "suspended" does not reflect reality. Still, it's wrong, so free the IRQ first. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:13 +01:00
Lukas Wunner	6d83f1ee88	Bluetooth: hci_bcm: Clean up unnecessary #ifdef pm_runtime_disable() and pm_runtime_set_suspended() are replaced with empty inlines if CONFIG_PM is disabled, so there's no need to #ifdef them. device_init_wakeup() is likewise replaced with an inline, though it's not empty, but it and devm_free_irq() can be made conditional on IS_ENABLED(CONFIG_PM), which is preferable to #ifdef as per section 20 of Documentation/process/coding-style.rst. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:12 +01:00
Ronald Tschalär	4a59f1fab9	Bluetooth: hci_bcm: Validate IRQ before using it The ->close, ->suspend and ->resume hooks assume presence of a valid IRQ if the device is wakeup capable. However it's entirely possible that wakeup was enabled by some other entity besides this driver and in this case the user will get a WARN splat if no valid IRQ was found. Avoid by checking if the IRQ is valid, i.e. > 0. Case in point: On recent MacBook Pros, the Bluetooth device lacks an IRQ (because host wakeup is handled by the SMC, independently of the operating system), but it does possess a _PRW method (which specifies the SMC's GPE as wake event). The ACPI core therefore automatically marks the physical Bluetooth device wakeup capable upon binding it to its ACPI companion: device_set_wakeup_capable+0x96/0xb0 acpi_bind_one+0x28a/0x310 acpi_platform_notify+0x20/0xa0 device_add+0x215/0x690 serdev_device_add+0x57/0xf0 acpi_serdev_add_device+0xc9/0x110 acpi_ns_walk_namespace+0x131/0x280 acpi_walk_namespace+0xf5/0x13d serdev_controller_add+0x6f/0x110 serdev_tty_port_register+0x98/0xf0 tty_port_register_device_attr_serdev+0x3a/0x70 uart_add_one_port+0x268/0x500 serial8250_register_8250_port+0x32e/0x490 dw8250_probe+0x46c/0x720 platform_drv_probe+0x35/0x90 driver_probe_device+0x300/0x450 bus_for_each_drv+0x67/0xb0 __device_attach+0xde/0x160 bus_probe_device+0x9c/0xb0 device_add+0x448/0x690 platform_device_add+0x10e/0x260 mfd_add_device+0x392/0x4c0 mfd_add_devices+0xb1/0x110 intel_lpss_probe+0x2a9/0x610 [intel_lpss] intel_lpss_pci_probe+0x7a/0xa8 [intel_lpss_pci] Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Ronald Tschalär <ronald@innovation.ch> [lukas: fix up ->suspend and ->resume as well, add commit message] Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:12 +01:00
Lukas Wunner	3e81a4ca51	Bluetooth: hci_bcm: Mandate presence of shutdown and device wake GPIO Commit `0395ffc1ee` ("Bluetooth: hci_bcm: Add PM for BCM devices") amended this driver to request a shutdown and device wake GPIO on probe, but mandated that only one of them need to be present: /* Make sure at-least one of the GPIO is defined and that * a name is specified for this instance / if ((!dev->device_wakeup && !dev->shutdown) \|\| !dev->name) { dev_err(&pdev->dev, "invalid platform data\n"); return -EINVAL; } However the same commit added a call to bcm_gpio_set_power() to the ->probe hook, which unconditionally accesses both* GPIOs. Luckily, the resulting NULL pointer deref was never reported, suggesting there's no machine where either GPIO is missing. Commit `8a92056837` ("Bluetooth: hci_bcm: Add (runtime)pm support to the serdev driver") removed the check whether at least one of the GPIOs is present without specifying a reason. Because commit `62aaefa7d0` ("Bluetooth: hci_bcm: improve use of gpios API") refactored the driver to use devm_gpiod_get_optional() instead of devm_gpiod_get(), one is now tempted to believe that the driver doesn't require any of the two GPIOs. Which is wrong, the driver still requires both GPIOs to avoid a NULL pointer deref. To this end, establish the status quo ante and request the GPIOs with devm_gpiod_get() again. Bail out of ->probe if either of them is missing. Oddly enough, whereas bcm_gpio_set_power() accesses the device wake pin unconditionally, bcm_suspend_device() and bcm_resume_device() do check for its presence before accessing it. Those checks are superfluous, so remove them. Cc: Frédéric Danis <frederic.danis.oss@gmail.com> Cc: Loic Poulain <loic.poulain@linaro.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2018-01-10 19:00:12 +01:00
David S. Miller	661e4e33a9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-01-09 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Prevent out-of-bounds speculation in BPF maps by masking the index after bounds checks in order to fix spectre v1, and add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for removing the BPF interpreter from the kernel in favor of JIT-only mode to make spectre v2 harder, from Alexei. 2) Remove false sharing of map refcount with max_entries which was used in spectre v1, from Daniel. 3) Add a missing NULL psock check in sockmap in order to fix a race, from John. 4) Fix test_align BPF selftest case since a recent change in verifier rejects the bit-wise arithmetic on pointers earlier but test_align update was missing, from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-10 11:17:21 -05:00
Geert Uytterhoeven	1e77fc8211	gpio: Add missing open drain/source handling to gpiod_set_value_cansleep() Since commit `f11a04464a` ("i2c: gpio: Enable working over slow can_sleep GPIOs"), probing the i2c RTC connected to an i2c-gpio bus on r8a7740/armadillo fails with: rtc-s35390a 0-0030: error resetting chip rtc-s35390a: probe of 0-0030 failed with error -5 More debug code reveals: i2c i2c-0: master_xfer[0] R, addr=0x30, len=1 i2c i2c-0: NAK from device addr 0x30 msg #0 s35390a_get_reg: ret = -6 Commit `02e479808b` ("gpio: Alter semantics of raw operations to actually be raw") moved open drain/source handling from gpiod_set_raw_value_commit() to gpiod_set_value(), but forgot to take into account that gpiod_set_value_cansleep() also needs this handling. The i2c protocol mandates that i2c signals are open drain, hence i2c communication fails. Fix this by adding the missing handling to gpiod_set_value_cansleep(), using a new common helper gpiod_set_value_nocheck(). Fixes: `02e479808b` ("gpio: Alter semantics of raw operations to actually be raw") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> [removed underscore syntax, added kerneldoc] Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2018-01-10 14:17:17 +01:00
Daniel Borkmann	632130ed3b	Merge branch 'bpf-nfp-misc-improvements' Jakub Kicinski says: ==================== This series starts with a fix to Jesper's recent work, somehow I forgot about control rings during review. Second patch is cleaning up a vNIC header, in kdoc we should not use @ for #define constants. Aligning of the top of the stack as well as bottom (last bytes will be unused) helps the performance. We should check offload datapath's max MTU when program is loaded and we can allow TC hw offload flag to be changed freely while XDP offload is active. Next group of patches adds more fully featured relocation support. Due to limited amount of code space we only load the image to NIC's memory when program is attached. Since we can't predict which programs are loaded later, we should translate as if image was to be loaded at offset zero and only apply relocations at load time. Many more advanced features (eg. tail class, subprograms, dynamic allocation of program space and sharing it between ports) will depend on this. Nic adds support for signed comparison instructions. Quentin makes use of the verifier log in our driver, the verifier print function (verbose()) has to be renamed and exported. v2: - replace #define by function aliasing for verbose() in patch 13 ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:37 +01:00
Quentin Monnet	ff627e3d07	nfp: bpf: reuse verifier log for debug messages Now that `bpf_verifier_log_write()` is exported from the verifier and makes it possible to reuse the verifier log to print messages to the standard output, use this instead of the kernel logs in the nfp driver for printing error messages occurring at verification time. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Quentin Monnet	430e68d10b	bpf: export function to write into verifier log buffer Rename the BPF verifier `verbose()` to `bpf_verifier_log_write()` and export it, so that other components (in particular, drivers for BPF offload) can reuse the user buffer log to dump error messages at verification time. Renaming `verbose()` was necessary in order to avoid a name so generic to be exported to the global namespace. However to prevent too much pain for backports, the calls to `verbose()` in the kernel BPF verifier were not changed. Instead, use function aliasing to make `verbose` point to `bpf_verifier_log_write`. Another solution could consist in making a wrapper around `verbose()`, but since it is a variadic function, I don't see a clean way without creating two identical wrappers, one for the verifier and one to export. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Nic Viljoen	c087aa8bbf	nfp: bpf: add signed jump insns This patch adds signed jump instructions (jsgt, jsge, jslt, jsle) to the nfp jit. As well as adding the additional required raw assembler branch mask to nfp_asm.h Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Jakub Kicinski	af93d15ac6	nfp: hand over to BPF offload app at coarser granularity Instead of having an app callback per message type hand off all offload-related handling to apps with one "rest of ndo_bpf" callback. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Jakub Kicinski	e84797fe15	nfp: bpf: use a large constant in unresolved branches To make absolute relocated branches (branches which will be completely rewritten with br_set_offset()) distinguishable in user space dumps from normal jumps add a large offset to them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	44a12ecc1c	nfp: bpf: don't depend on high order allocations for program image The translator pre-allocates a buffer of maximal program size. Due to HW/FW limitations the program buffer can't currently be longer than 128Kb, so we used to kmalloc() it, and then map for DMA directly. Now that the late branch resolution is copying the program image anyway, we can just kvmalloc() the buffer. While at it, after translation reallocate the buffer to save space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	2314fe9ed0	nfp: bpf: relocate jump targets just before the load Don't translate the program assuming it will be loaded at a given address. This will be required for sharing programs between ports of the same NIC, tail calls and subprograms. It will also make the jump targets easier to understand when dumping the program to user space. Translate the program as if it was going to be loaded at address zero. When load happens add the load offset in and set addresses of special branches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	488feeaf6d	nfp: bpf: add helpers for modifying branch addresses In preparation for better handling of relocations move existing helper for setting branch offset to nfp_asm.c and add two more. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	1549921da3	nfp: bpf: move jump resolution to jit.c Jump target resolution should be in jit.c not offload.c. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	a0f30c97ac	nfp: bpf: allow disabling TC offloads when XDP active TC BPF offload was added first, so we used to assume that the ethtool TC HW offload flag cannot be touched whenever any BPF program is loaded on the NIC. This unncessarily limits changes to the TC flag when offloaded program is XDP. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	ccbdc596f4	nfp: bpf: don't allow changing MTU above BPF offload limit when active When BPF offload is active we need may need to restrict the MTU changes more than just to the limitation of the kernel XDP datapath. Allow the BPF code to veto a MTU change. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	c4f7730be5	nfp: bpf: round up the size of the stack Kernel enforces the alignment of the bottom of the stack, NFP deals with positive offsets better so we should align the top of the stack. Round the stack size to NFP word size (4B). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	8c6a6d9804	nfp: fix incumbent kdoc warnings We should use % instead of @ for documenting preprocessor defines. Add missing documentation of __NFP_REPR_TYPE_MAX. This gets rid of all remaining kdoc warnings in the driver. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	a9c324be72	nfp: don't try to register XDP rxq structures on control queues Some RX rings are used for control messages, those will not have a netdev pointer in dp. Skip XDP rxq handling on those rings. Fixes: `7f1c684a89` ("nfp: setup xdp_rxq_info") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Daniel Borkmann	148989d8ba	Merge branch 'bpf-xdp-rxq-fixes' Jakub Kicinski says: ==================== Two more trivial fixes to the recent XDP RXQ series. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 12:06:18 +01:00
Jakub Kicinski	82aaff2f63	net: free RX queue structures Looks like commit `e817f85652` ("xdp: generic XDP handling of xdp_rxq_info") replaced kvfree(dev->_rx) in free_netdev() with a call to netif_free_rx_queues() which doesn't actually free the rings? While at it remove the unnecessary temporary variable. Fixes: `e817f85652` ("xdp: generic XDP handling of xdp_rxq_info") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 12:06:17 +01:00
Jakub Kicinski	141b52a98a	net: use the right variant of kfree kvzalloc'ed memory should be kvfree'd. Fixes: `e817f85652` ("xdp: generic XDP handling of xdp_rxq_info") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 12:06:17 +01:00
Linus Torvalds	cf1fb15823	RISC-V changes for 4.15-rc8 This contains what I hope are the last RISC-V changes to go into 4.15. I know it's a bit last minute, but I think they're all fairly small changes: * SR_* constants have been renamed to match the latest ISA specification. * Some CONFIG_MMU #ifdef cruft has been removed. We've never supported !CONFIG_MMU. * __NR_riscv_flush_icache is now visible to userspace. We were hoping to avoid making this public in order to force userspace to call the vDSO entry, but it looks like QEMU's user-mode emulation doesn't want to emulate a vDSO. In order to allow glibc to fall back to a system call when the vDSO entry doesn't exist we're just * Our defconfig is no long empty. This is another one that just slipped through the cracks. The defconfig isn't perfect, but it's at least close to what users will want for the first RISC-V development board. Getting closer is kind of splitting hairs here: none of the RISC-V specific drivers are in yet, so it's not like things will boot out of the box. The only one that's strictly necessary is the __NR_riscv_flush_icache change, as I want that to be part of the public API starting from our first kernel so nobody has to worry about it. The others are nice to haves, but they seem sane for 4.15 to me. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlpU/UcTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRDvTKFQLMurQdy5D/9fcTwXTk98U2gSoR4Dv25tztqbNMhw +Lae5EeIqAaPI4xfyLGldJe0BWAJaouZWIY5xkB5JWzsdYPx/jYgC+SbwI/3aGVy VjcU0d4haZtz2kdm0Y0ZKIGg91vDlULoVvcxrM8Jff0gDmyKoT1OjwKpt3esyhmN Vc+iC0FxtJow/xIaFlnPa42qh/pFkcLDmmY/Im6N8IEcyHBT6vCDnD3CgCFY/hdu 9vcWJDvFBj4SFwL8y+ajspQ4tPzDt4Ko+3NLxtEv+19y3NEgLm+shbxv/J8AVO8O BvBr51QfggM2rAqGzCa4nEZZR7Roxgg9bJVQARXyzX1tUhtBEz9+eUArJ0tzMtbx GyXYY5NwyupDJ/MA9yn+GqYlLNnS2yL2y0zIBJehi/37+KpAFtH/cRnA58sXViqw IKGhKW7JCGU3/xyW+RtuY3N5urU18+qE4CZRLtI5QN0QRcTWLhqqQvQRud86HqqD g4KPo6g9Z6Ak9Xu81n/liIExp3Vp2kpQUts1lCF1D+4WYRwpb4Mqy4HiOCSf/OO2 wOuX5HY+tbS8yvupgYjszTXaYDn35RoGkcjK9o1Lkq9RgI5kzHDyaQrSK/c/oAzn A7cJ2z7dBaV0W4O7R+2SJ2k9DHw1db/WVf19pKVjSi5osSoUds5w1YxHK25cSBUz +47LVCgkQI/Scw== =PhUK -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-4.15-rc8_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux Pull RISC-V updates from Palmer Dabbelt: "This contains what I hope are the last RISC-V changes to go into 4.15. I know it's a bit last minute, but I think they're all fairly small changes: - SR_* constants have been renamed to match the latest ISA specification. - Some CONFIG_MMU #ifdef cruft has been removed. We've never supported !CONFIG_MMU. - __NR_riscv_flush_icache is now visible to userspace. We were hoping to avoid making this public in order to force userspace to call the vDSO entry, but it looks like QEMU's user-mode emulation doesn't want to emulate a vDSO. In order to allow glibc to fall back to a system call when the vDSO entry doesn't exist we're just - Our defconfig is no long empty. This is another one that just slipped through the cracks. The defconfig isn't perfect, but it's at least close to what users will want for the first RISC-V development board. Getting closer is kind of splitting hairs here: none of the RISC-V specific drivers are in yet, so it's not like things will boot out of the box. The only one that's strictly necessary is the __NR_riscv_flush_icache change, as I want that to be part of the public API starting from our first kernel so nobody has to worry about it. The others are nice to haves, but they seem sane for 4.15 to me" * tag 'riscv-for-linus-4.15-rc8_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: riscv: rename SR_* constants to match the spec riscv: remove CONFIG_MMU ifdefs RISC-V: Make __NR_riscv_flush_icache visible to userspace RISC-V: Add a basic defconfig	2018-01-09 15:45:06 -08:00
Linus Torvalds	44cae9b209	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS fixes from Ralf Baechle: "Another round of MIPS fixes for 4.15. - Maciej Rozycki found another series of FP issues which requires a seven part series to restructure and fix. - James fixes a warning about .set mt which gas doesn't like when building for R1 processors" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: Validate PR_SET_FP_MODE prctl(2) requests against the ABI of the task MIPS: Disallow outsized PTRACE_SETREGSET NT_PRFPREG regset accesses MIPS: Also verify sizeof `elf_fpreg_t' with PTRACE_SETREGSET MIPS: Fix an FCSR access API regression with NT_PRFPREG and MSA MIPS: Consistently handle buffer counter with PTRACE_SETREGSET MIPS: Guard against any partial write attempt with PTRACE_SETREGSET MIPS: Factor out NT_PRFPREG regset access helpers MIPS: CPS: Fix r1 .set mt assembler warning	2018-01-09 15:43:13 -08:00
Alexei Starovoitov	290af86629	bpf: introduce BPF_JIT_ALWAYS_ON config The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-09 22:25:26 +01:00
Linus Torvalds	d476c5334f	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A set of fixes that should go into this release. This contains: - An NVMe pull request from Christoph, with a few critical fixes for NVMe. - A block drain queue fix from Ming. - The concurrent lo_open/release fix for loop" * 'for-linus' of git://git.kernel.dk/linux-block: loop: fix concurrent lo_open/lo_release block: drain queue before waiting for q_usage_counter becoming zero nvme-fcloop: avoid possible uninitialized variable warning nvme-mpath: fix last path removal during traffic nvme-rdma: fix concurrent reset and reconnect nvme: fix sector units when going between formats nvme-pci: move use_sgl initialization to nvme_init_iod()	2018-01-09 11:20:55 -08:00
Daniel Borkmann	be95a845cc	bpf: avoid false sharing of map refcount with max_entries In addition to commit `b2157399cc` ("bpf: prevent out-of-bounds speculation") also change the layout of struct bpf_map such that false sharing of fast-path members like max_entries is avoided when the maps reference counter is altered. Therefore enforce them to be placed into separate cachelines. pahole dump after change: struct bpf_map { const struct bpf_map_ops * ops; /* 0 8 / struct bpf_map inner_map_meta; /* 8 8 / void security; /* 16 8 / enum bpf_map_type map_type; / 24 4 / u32 key_size; / 28 4 / u32 value_size; / 32 4 / u32 max_entries; / 36 4 / u32 map_flags; / 40 4 / u32 pages; / 44 4 / u32 id; / 48 4 / int numa_node; / 52 4 / bool unpriv_array; / 56 1 / / XXX 7 bytes hole, try to pack / / --- cacheline 1 boundary (64 bytes) --- / struct user_struct user; /* 64 8 / atomic_t refcnt; / 72 4 / atomic_t usercnt; / 76 4 / struct work_struct work; / 80 32 / char name[16]; / 112 16 / / --- cacheline 2 boundary (128 bytes) --- / / size: 128, cachelines: 2, members: 17 / / sum members: 121, holes: 1, sum holes: 7 */ }; Now all entries in the first cacheline are read only throughout the life time of the map, set up once during map creation. Overall struct size and number of cachelines doesn't change from the reordering. struct bpf_map is usually first member and embedded in map structs in specific map implementations, so also avoid those members to sit at the end where it could potentially share the cacheline with first map values e.g. in the array since remote CPUs could trigger map updates just as well for those (easily dirtying members like max_entries intentionally as well) while having subsequent values in cache. Quoting from Google's Project Zero blog [1]: Additionally, at least on the Intel machine on which this was tested, bouncing modified cache lines between cores is slow, apparently because the MESI protocol is used for cache coherence [8]. Changing the reference counter of an eBPF array on one physical CPU core causes the cache line containing the reference counter to be bounced over to that CPU core, making reads of the reference counter on all other CPU cores slow until the changed reference counter has been written back to memory. Because the length and the reference counter of an eBPF array are stored in the same cache line, this also means that changing the reference counter on one physical CPU core causes reads of the eBPF array's length to be slow on other physical CPU cores (intentional false sharing). While this doesn't 'control' the out-of-bounds speculation through masking the index as in commit `b2157399cc`, triggering a manipulation of the map's reference counter is really trivial, so lets not allow to easily affect max_entries from it. Splitting to separate cachelines also generally makes sense from a performance perspective anyway in that fast-path won't have a cache miss if the map gets pinned, reused in other progs, etc out of control path, thus also avoids unintentional false sharing. [1] https://googleprojectzero.blogspot.ch/2018/01/reading-privileged-memory-with-side.html Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-09 10:07:30 -08:00
David S. Miller	61ad64080e	Merge branch 'r8169-improve-runtime-pm' Heiner Kallweit says: ==================== r8169: improve runtime pm On my system with two network ports I found that runtime PM didn't suspend the unused port. Therefore I checked runtime pm in this driver in somewhat more detail and this series improves runtime pm in general and solves the mentioned issue. Tested on a system with RTL8168evl (MAC version 34). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:38:57 -05:00
Heiner Kallweit	a92a08499b	r8169: improve runtime pm in general and suspend unused ports So far rpm doesn't cover cases like unused ports which are never brought up. If they are active at probe time they remain in this state. Included in this patch: - Let the idle notification check whether we can suspend and let it schedule the suspend. This way we don't need to have calls to pm_schedule_suspend in different places. - At the end of rtl_open and rtl_init_one send an idle notification to allow suspending if the link is down. If a cable is plugged in aneg is finished before the suspend timer expires and the suspend request is cancelled. - Change rtl8169_runtime_suspend to power down the chip if the interface is down. Successfully tested on a RTL8168evl (mac version 34). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:38:56 -05:00
Heiner Kallweit	ef4d5fcceb	r8169: improve runtime pm in rtl8169_check_link_status This patch partially reverts commit `e4fbce740f` "r8169: Fix runtime power management" from 2010. At that time the suspend delay was 100ms and therefore suspending happened during initial aneg. Currently suspend delay is 5s, so suspend starts after aneg and the issue doesn't exist any longer. On my system aneg takes almost 3s, to be on the safe side let's increase the suspend delay to 10s. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:38:56 -05:00
Heiner Kallweit	b9aa1c75e6	r8169: remove unneeded rpm ops in rtl_shutdown This patch reverts commit `2a15cd2ff4` "r8169: runtime resume before shutdown" from 2012. Few months after this change the underlying issue was solved in the PCI core with commit `3ff2de9ba1` "PCI/PM: Resume device before shutdown". Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:38:56 -05:00
David S. Miller	fdb533c304	Merge branch 'tipc-improvements-to-group-messaging' Jon Maloy says: ==================== tipc: improvements to group messaging We make a number of simplifications and improvements to the group messaging service. They aim at readability/maintainability of the code as well as scalability. The series is based on commit `f9c935db80` ("tipc: fix problems with multipoint-to-point flow control) which has been applied to 'net' but not yet to 'net-next'. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:35:59 -05:00
Jon Maloy	eb929a91b2	tipc: improve poll() for group member socket The current criteria for returning POLLOUT from a group member socket is too simplistic. It basically returns POLLOUT as soon as the group has external destinations, something obviously leading to a lot of spinning during destination congestion situations. At the same time, the internal congestion handling is unnecessarily complex. We now change this as follows. - We introduce an 'open' flag in struct tipc_group. This flag is used only to help poll() get the setting of POLLOUT right, and not for congeston handling as such. This means that a user can choose to ignore an EAGAIN for a destination and go on sending messages to other destinations in the group if he wants to. - The flag is set to false every time we return EAGAIN on a send call. - The flag is set to true every time any member, i.e., not necessarily the member that caused EAGAIN, is removed from the small_win list. - We remove the group member 'usr_pending' flag. The size of the send window and presence in the 'small_win' list is sufficient criteria for recognizing congestion. This solution seems to be a reasonable compromise between 'anycast', which is normally not waiting for POLLOUT for a specific destination, and the other three send modes, which are. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:35:58 -05:00
Jon Maloy	232d07b74a	tipc: improve groupcast scope handling When a member joins a group, it also indicates a binding scope. This makes it possible to create both node local groups, invisible to other nodes, as well as cluster global groups, visible everywhere. In order to avoid that different members end up having permanently differing views of group size and memberhip, we must inhibit locally and globally bound members from joining the same group. We do this by using the binding scope as an additional separator between groups. I.e., a member must ignore all membership events from sockets using a different scope than itself, and all lookups for message destinations must require an exact match between the message's lookup scope and the potential target's binding scope. Apart from making it possible to create local groups using the same identity on different nodes, a side effect of this is that it now also becomes possible to create a cluster global group with the same identity across the same nodes, without interfering with the local groups. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:35:58 -05:00
Jon Maloy	8348500f80	tipc: add option to suppress PUBLISH events for pre-existing publications Currently, when a user is subscribing for binding table publications, he will receive a PUBLISH event for all already existing matching items in the binding table. However, a group socket making a subscriptions doesn't need this initial status update from the binding table, because it has already scanned it during the join operation. Worse, the multiplicatory effect of issuing mutual events for dozens or hundreds group members within a short time frame put a heavy load on the topology server, with the end result that scale out operations on a big group tend to take much longer than needed. We now add a new filter option, TIPC_SUB_NO_STATUS, for topology server subscriptions, so that this initial avalanche of events is suppressed. This change, along with the previous commit, significantly improves the range and speed of group scale out operations. We keep the new option internal for the tipc driver, at least for now. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:35:58 -05:00
Jon Maloy	d12d2e12ce	tipc: send out join messages as soon as new member is discovered When a socket is joining a group, we look up in the binding table to find if there are already other members of the group present. This is used for being able to return EAGAIN instead of EHOSTUNREACH if the user proceeds directly to a send attempt. However, the information in the binding table can be used to directly set the created member in state MBR_PUBLISHED and send a JOIN message to the peer, instead of waiting for a topology PUBLISH event to do this. When there are many members in a group, the propagation time for such events can be significant, and we can save time during the join operation if we use the initial lookup result fully. In this commit, we eliminate the member state MBR_DISCOVERED which has been the result of the initial lookup, and do instead go directly to MBR_PUBLISHED, which initiates the setup. After this change, the tipc_member FSM looks as follows: +-----------+ ---->\| PUBLISHED \|-----------------------------------------------+ PUB- +-----------+ LEAVE/WITHRAW \| LISH \|JOIN \| \| +-------------------------------------------+ \| \| \| LEAVE/WITHDRAW \| \| \| \| +------------+ \| \| \| \| +----------->\| PENDING \|---------+ \| \| \| \| \|msg/maxactv +-+---+------+ LEAVE/ \| \| \| \| \| \| \| \| WITHDRAW \| \| \| \| \| \| +----------+ \| \| \| \| \| \| \| \|revert/maxactv\| \| \| \| \| \| \| V V V V V \| +----------+ msg +------------+ +-----------+ +-->\| JOINED \|------>\| ACTIVE \|------>\| LEAVING \|---> \| +----------+ +--- -+------+ LEAVE/+-----------+DOWN \| A A \| WITHDRAW A A A EVT \| \| \| \|RECLAIM \| \| \| \| \| \|REMIT V \| \| \| \| \| \|== adv +------------+ \| \| \| \| \| +---------\| RECLAIMING \|--------+ \| \| \| \| +-----+------+ LEAVE/ \| \| \| \| \|REMIT WITHDRAW \| \| \| \| \|< adv \| \| \| \|msg/ V LEAVE/ \| \| \| \|adv==ADV_IDLE+------------+ WITHDRAW \| \| \| +-------------\| REMITTED \|------------+ \| \| +------------+ \| \|PUBLISH \| JOIN +-----------+ LEAVE/WITHDRAW \| ---->\| JOINING \|-----------------------------------------------+ +-----------+ Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-09 12:35:58 -05:00

... 2 3 4 5 6 ...

725368 commits