linux-stable/drivers/base
Luis R. Rodriguez 90d41e74a9 firmware: fix batched requests - send wake up on failure on direct lookups
Fix batched requests from waiting forever on failure.

The firmware API batched requests feature has been broken since the API call
request_firmware_direct() was introduced on commit bba3a87e98 ("firmware:
Introduce request_firmware_direct()"), added on v3.14 *iff* the firmware
being requested was not present in *certain kernel builds* [0].

When no firmware is found the worker which goes on to finish never informs
waiters queued up of this, so any batched request will stall in what seems
to be forever (MAX_SCHEDULE_TIMEOUT). Sadly, a reboot will also stall, as
the reboot notifier was only designed to kill custom fallback workers. The
issue seems to the user as a type of soft lockup, what *actually* happens
underneath the hood is a wait call which never completes as we failed to
issue a completion on error.

For device drivers with optional firmware schemes (ie, Intel iwlwifi, or
Netronome -- even though it uses request_firmware() and not
request_firmware_direct()), this could mean that when you boot a system with
multiple cards the firmware will seem to never load on the system, or that
the card is just not responsive even the driver initialization. Due to
differences in scheduling possible this should not always trigger --
one would need to to ensure that multiple requests are in place at the
right time for this to work, also release_firmware() must not be called
prior to any other incoming request. The complexity may not be worth
supporting batched requests in the future given the wait mechanism is
only used also for the fallback mechanism. We'll keep it for now and
just fix it.

Its reported that at least with the Intel WiFi cards on one system this
issue was creeping up 50% of the boots [0].

Before this commit batched requests testing revealed:
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=y

Most common Linux distribution setup.

API-type                               no-firmware-found   firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL                OK
request_firmware_direct()              FAIL                OK
request_firmware_nowait(uevent=true)   FAIL                OK
request_firmware_nowait(uevent=false)  FAIL                OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=n

Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.

API-type                               no-firmware-found   firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL                OK
request_firmware_direct()              FAIL                OK
request_firmware_nowait(uevent=true)   FAIL                OK
request_firmware_nowait(uevent=false)  FAIL                OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
CONFIG_FW_LOADER_USER_HELPER=y

Google Android setup.

API-type                               no-firmware-found   firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                  OK
request_firmware_direct()              FAIL                OK
request_firmware_nowait(uevent=true)   OK                  OK
request_firmware_nowait(uevent=false)  OK                  OK
============================================================================

Ater this commit batched testing results:
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=y

Most common Linux distribution setup.

API-type                               no-firmware-found   firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                  OK
request_firmware_direct()              OK                  OK
request_firmware_nowait(uevent=true)   OK                  OK
request_firmware_nowait(uevent=false)  OK                  OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=n

Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.

API-type                               no-firmware-found   firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                  OK
request_firmware_direct()              OK                  OK
request_firmware_nowait(uevent=true)   OK                  OK
request_firmware_nowait(uevent=false)  OK                  OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
CONFIG_FW_LOADER_USER_HELPER=y

Google Android setup.

API-type                               no-firmware-found   firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                  OK
request_firmware_direct()              OK                  OK
request_firmware_nowait(uevent=true)   OK                  OK
request_firmware_nowait(uevent=false)  OK                  OK
============================================================================

[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477

Cc: stable <stable@vger.kernel.org> # v3.14
Fixes: bba3a87e98 ("firmware: Introduce request_firmware_direct()"
Reported-by: Nicolas <nbroeking@me.com>
Reported-by: John Ewalt  <jewalt@lgsinnovations.com>
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-10 13:54:16 -07:00
..
power Merge branches 'pm-qos' and 'pm-devfreq' 2017-07-14 13:16:31 +02:00
regmap Merge remote-tracking branches 'regmap/topic/1wire', 'regmap/topic/irq' and 'regmap/topic/lzo' into regmap-next 2017-07-03 16:20:28 +01:00
test driver core: test_async: fix up typo found by 0-day 2016-11-29 22:06:42 +01:00
arch_topology.c arm,arm64,drivers: add a prefix to drivers arch_topology interfaces 2017-06-03 19:10:09 +09:00
attribute_container.c attribute_container: Fix typo 2016-08-31 15:13:56 +02:00
base.h Revert "driver core: Add deferred_probe attribute to devices in sysfs" 2017-01-14 14:09:03 +01:00
bus.c driver-core: remove struct bus_type.dev_attrs 2017-06-12 16:18:37 +02:00
cacheinfo.c Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-12-22 09:25:45 -08:00
class.c driver core: remove class_attrs from struct class 2017-06-09 10:41:00 +02:00
component.c Merge 4.5-rc4 into driver-core-next 2016-02-14 14:29:55 -08:00
container.c
core.c Add "shutdown" to "struct class". 2017-07-07 09:49:24 +10:00
cpu.c CPU / PM: expose pm_qos_resume_latency for CPUs 2017-01-30 11:05:29 +01:00
dd.c of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices 2017-04-20 16:31:06 +02:00
devcoredump.c driver core: devcoredump: convert to use class_groups 2016-11-29 21:12:12 +01:00
devres.c devres: add devm_alloc_percpu() 2016-11-15 22:34:25 -05:00
devtmpfs.c statx: Add a system call to make enhanced file info available 2017-03-02 20:51:15 -05:00
dma-coherent.c drivers: dma-coherent: Introduce default DMA pool 2017-06-28 06:55:03 -07:00
dma-contiguous.c cma: Store a name in the cma structure 2017-04-18 20:41:12 +02:00
dma-mapping.c This is the first pull request for the new dma-mapping subsystem 2017-07-06 19:20:54 -07:00
driver.c
firmware.c
firmware_class.c firmware: fix batched requests - send wake up on failure on direct lookups 2017-08-10 13:54:16 -07:00
hypervisor.c
init.c
isa.c isa: Call isa_bus_init before dependent ISA bus drivers register 2016-06-17 20:47:11 -07:00
Kconfig arm, arm64: factorize common cpu capacity default code 2017-06-03 19:10:09 +09:00
Makefile arm, arm64: factorize common cpu capacity default code 2017-06-03 19:10:09 +09:00
map.c
memory.c mm, memory_hotplug: do not assume ZONE_NORMAL is default kernel zone 2017-07-06 16:24:32 -07:00
module.c base: make module_create_drivers_dir race-free 2016-06-15 19:21:31 -07:00
node.c mm: drop useless local parameters of __register_one_node() 2017-07-10 16:32:32 -07:00
pinctrl.c driver core: fix automatic pinctrl management 2017-06-13 11:07:32 +02:00
platform-msi.c irqchip/MSI: Use irq_domain_update_bus_token instead of an open coded access 2017-06-22 18:29:17 +02:00
platform.c This is the bulk of GPIO changes for the v4.13 series: 2017-07-07 12:40:27 -07:00
property.c device property: Introduce fwnode_call_bool_op() for ops that return bool 2017-07-12 13:32:46 +02:00
soc.c base: soc: Allow early registration of a single SoC device 2017-03-29 21:43:26 +02:00
syscore.c
topology.c drivers base/topology: Convert to hotplug state machine 2016-11-09 23:45:29 +01:00
transport_class.c